The Definition of “Pure Function”

“When a function is pure,
we say that ‘output depends (only) on input.’”

From the book, Becoming Functional
(with the word “only” added by me)

Goals

This lesson has two goals:

  1. Properly define the term “pure function.”
  2. Show a few examples of pure functions.

It also tries to simplify the pure function definition, and shares a tip on how to easily identify many impure functions.

Introduction

As I mentioned in the “What is Functional Programming?” chapter, I define functional programming (FP) like this:

Functional programming is a way of writing software applications using only pure functions and immutable values.

Because that definition uses the term “pure functions,” it’s important to understand what a pure function is. I gave a partial pure function definition in that chapter, and now I’ll provide a more complete definition.

Definition of “pure function”

As with the term functional programming, different people will give you different definitions of a pure function. I provide links to some of those at the end of this lesson; for now, here is how Wikipedia defines a pure function:

  1. The function always evaluates to the same result value given the same argument value(s). It cannot depend on any hidden state or value, and it cannot depend on any I/O.
  2. Evaluation of the result does not cause any semantically observable side effect or output, such as mutation of mutable objects or output to I/O devices.

That’s good, but I prefer to reorganize those statements like this:

  1. A pure function depends only on (a) its declared input parameters and (b) its algorithm to produce its result. A pure function has no “back doors,” which means:
    1. Its result can’t depend on reading any hidden value outside of the function scope, such as another field in the same class or global variables.
    2. It cannot modify any hidden fields outside of the function scope, such as other mutable fields in the same class or global variables.
    3. It cannot depend on any external I/O: it can’t rely on input from files, databases, web services, UIs, etc., and it can’t produce output, such as writing to a file, database, web service, or screen.
  2. A pure function does not modify its input parameters.

This can be summed up concisely with this definition:

A pure function is a function that depends only on its declared input parameters and its algorithm to produce its output. It does not read any other values from “the outside world” — the world outside of the function’s scope — and it does not modify any values in the outside world.
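As a minimal sketch of the “hidden input” back door, compare the two functions below. The names (TaxExample, taxPercent, impureTotal, pureTotal) are hypothetical and chosen only for illustration. The first function reads a mutable field outside its scope, so the same argument can produce different results; the second receives everything it needs as declared parameters.

```scala
object TaxExample {
  // Hidden, mutable state outside the function scope (hypothetical name).
  var taxPercent: Int = 10

  // Impure: the result depends on `taxPercent`, which is not a declared parameter.
  def impureTotal(priceCents: Int): Int =
    priceCents + priceCents * taxPercent / 100

  // Pure: everything the result depends on is passed in explicitly.
  def pureTotal(priceCents: Int, percent: Int): Int =
    priceCents + priceCents * percent / 100
}
```

If some other code changes taxPercent between two calls, impureTotal(1000) returns a different value, while pureTotal(1000, 10) always returns the same value.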

A mantra for writing pure functions

Now that you’ve seen a formal pure function definition, here is the short mantra I prefer:

Output depends only on input.

I like that because it’s short and easy to remember, but technically it isn’t 100% accurate because it doesn’t address side effects. A more accurate way of saying this is:

  1. Output depends only on input
  2. No side effects

You can represent that as an image like this:

[Image: The Pure Function equation]

or more simply, like this:

[Image: The Pure Function equation - variables only]

In this book I’ll generally either write, “Output depends on input,” or show one of these images.
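To see why the second rule (“No side effects”) matters, here is a small sketch with hypothetical names (MantraExample, doubleAndLog, timesTwo). Both functions compute their result from their input alone, but only the second one is pure, because the first also writes to the console.

```scala
object MantraExample {
  // Output depends only on input, but printing is a side effect,
  // so this function is still impure.
  def doubleAndLog(i: Int): Int = {
    println(s"doubling $i")
    i * 2
  }

  // Satisfies both rules: output depends only on input, and no side effects.
  def timesTwo(i: Int): Int = i * 2
}
```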

Examples of pure and impure functions

Given the definition of pure functions and these simpler mantras, let’s look at some examples of pure and impure functions.

Examples of pure functions

Mathematical functions are great examples of pure functions because it’s pretty obvious that “output depends only on input.” Methods like these in scala.math._ are all pure functions:

  • abs
  • ceil
  • max
  • min

I refer to these as “methods” because they are defined using def in the package object math. However, these methods work just like functions, so I also refer to them as pure functions.
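A short sketch of those four methods in use (the object and value names are just for illustration). Each call can be repeated any number of times with the same arguments and will produce the same result.

```scala
object MathExample {
  import scala.math.{abs, ceil, max, min}

  val absolute: Int = abs(-5)      // 5
  val ceiling: Double = ceil(2.1)  // 3.0
  val larger: Int = max(3, 7)      // 7
  val smaller: Int = min(3, 7)     // 3
}
```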

Because a Scala String is immutable, its methods can’t change it, so methods like these are pure functions:

  • charAt
  • isEmpty
  • length
  • substring
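For instance, in this small sketch (names are illustrative only) none of the calls changes the original string, and each call yields the same result every time:

```scala
object StringExample {
  val s: String = "functional"

  val first: Char = s.charAt(0)        // 'f'
  val empty: Boolean = s.isEmpty       // false
  val len: Int = s.length              // 10
  val sub: String = s.substring(0, 4)  // "func"
}
```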

Many methods that are available on Scala’s collections classes fit the definition of a pure function, including the common ones:

  • drop
  • filter
  • map
  • reduce
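Here is a brief sketch of those methods on an immutable List (the names are illustrative). One caveat worth stating: the result is only pure if the function you pass in, such as the argument to map or filter, is itself pure.

```scala
object CollectionsExample {
  val xs: List[Int] = List(1, 2, 3, 4, 5)

  val dropped: List[Int] = xs.drop(2)           // List(3, 4, 5)
  val evens: List[Int] = xs.filter(_ % 2 == 0)  // List(2, 4)
  val doubled: List[Int] = xs.map(_ * 2)        // List(2, 4, 6, 8, 10)
  val sum: Int = xs.reduce(_ + _)               // 15
}
```

After all four calls, xs is still List(1, 2, 3, 4, 5).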

Examples of impure functions

Conversely, the following functions are impure.

Going right back to the collections classes, the foreach method is impure. foreach is used only for its side effects, which you can tell from its Unit return type in its signature on the Seq class:

def foreach(f: (A) => Unit): Unit 
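The following sketch (with hypothetical names) contrasts foreach with map. Because foreach returns Unit, the only way it can accomplish anything is through a side effect, which here is appending to a mutable buffer defined outside the function literal.

```scala
object ForeachExample {
  import scala.collection.mutable.ArrayBuffer

  val xs: List[Int] = List(1, 2, 3)

  // A mutable buffer that lives outside the function passed to foreach.
  val seen: ArrayBuffer[Int] = ArrayBuffer.empty[Int]

  // foreach yields Unit; its only observable effect is the mutation of `seen`.
  val result: Unit = xs.foreach(x => seen += x)

  // map, by contrast, returns a new collection and changes nothing else.
  val doubled: List[Int] = xs.map(_ * 2)
}
```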

Date- and time-related methods like getDayOfWeek, getHour, and getMinute are impure when they report the current date and time, because their output then depends on something other than their inputs. Their results rely on a form of hidden I/O: reading the system clock.
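As a sketch of the difference, assuming the java.time API and two hypothetical helper names (currentHour and hourOf): the first function reads the system clock behind the scenes, while the second takes the date and time as a declared input.

```scala
object TimeExample {
  import java.time.LocalDateTime

  // Impure: no input parameters; the result comes from the system clock.
  def currentHour(): Int = LocalDateTime.now().getHour

  // Pure: the date and time is a declared input, so the same argument
  // always produces the same result.
  def hourOf(dateTime: LocalDateTime): Int = dateTime.getHour
}
```

Calling currentHour() at different times of day gives different answers, whereas hourOf gives the same answer whenever it is given the same LocalDateTime.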

Methods on the scala.util.Random class like nextInt are also impure because their output depends on something other than their inputs: the generator’s hidden internal state, which each call also updates.
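A small sketch of this, using hypothetical helper names (roll and rollFrom). The first function draws from the shared generator, so repeated calls can return different values. The second makes the seed a declared input; this is only an illustration of “output depends only on input,” not a recommendation for how to generate random numbers in FP code.

```scala
object RandomExample {
  import scala.util.Random

  // Impure: no declared input; the result depends on (and advances)
  // the hidden state of the shared generator.
  def roll(): Int = Random.nextInt(6) + 1

  // Deterministic: a fresh generator is built from the seed on every call,
  // so the same seed always yields the same value.
  def rollFrom(seed: Long): Int = new Random(seed).nextInt(6) + 1
}
```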

In general, impure functions do one or more of these things:

  • Read hidden inputs (variables not explicitly passed in as function input parameters)
  • Write hidden outputs
  • Mutate the parameters they are given
  • Perform some sort of I/O with the outside world
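To illustrate the “mutate the parameters they are given” case, here is a sketch with hypothetical names (addZeroInPlace and addZero). The first function changes the buffer its caller passed in; the second returns a new list and leaves its input alone.

```scala
object MutationExample {
  import scala.collection.mutable.ArrayBuffer

  // Impure: mutates the buffer it was given, so the caller's data changes.
  def addZeroInPlace(xs: ArrayBuffer[Int]): ArrayBuffer[Int] = {
    xs += 0
    xs
  }

  // Pure: builds and returns a new list; the input list is unchanged.
  def addZero(xs: List[Int]): List[Int] = xs :+ 0
}
```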

Tip: Telltale signs of impure functions

Looking only at function signatures, you can identify many impure functions in two ways:

  • They don’t have any input parameters
  • They don’t return anything (or they return Unit in Scala, which is the same thing)

For example, here’s the signature for the println method of the Scala Predef object:

def println(x: Any): Unit

Because println is such a commonly used method, you already know that it writes information to the outside world, but if you didn’t know that, its Unit return type would be a terrific hint of that behavior.

Similarly, when you look at the “read*” methods that were formerly in Predef (and are now in scala.io.StdIn), you’ll see that a method like readLine takes no input parameters, which is also a giveaway that it is impure:

def readLine(): String

Because it takes no input parameters, the mantra “Output depends only on input” clearly can’t apply to it.

Simply stated:

  • If a function has no input parameters, how can its output depend on its input?
  • If a function has no result, then to do anything useful it must have side effects: mutating variables, or performing some sort of I/O.

While this is an easy way to spot many impure functions, a method can have both (a) input parameters and (b) a non-Unit return type and still be impure, because it reads variables outside of its scope, mutates variables outside of its scope, or performs I/O.
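Here is a sketch of such a method, with hypothetical names (SignatureExample, labelImpure, label). Its signature, String => String, gives no hint of a problem, yet it reads and mutates a counter outside its scope, so two calls with the same argument return different results.

```scala
object SignatureExample {
  // Hidden mutable state outside the function scope.
  private var calls: Int = 0

  // Has an input parameter and a non-Unit return type, but is impure:
  // it both mutates and reads `calls`.
  def labelImpure(name: String): String = {
    calls += 1
    s"$name-$calls"
  }

  // Pure: the number is a declared input rather than hidden state.
  def label(name: String, n: Int): String = s"$name-$n"
}
```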

Summary

As you saw in this lesson, this is my formal definition of a pure function:

A pure function is a function that depends only on its declared inputs and its internal algorithm to produce its output. It does not read any other values from “the outside world” — the world outside of the function’s scope — and it does not modify any values in the outside world.

Once you understand the complete definition, you can use the short mantra I prefer:

Output depends only on input.

or this statement:

  1. Output depends only on input
  2. No side effects

That statement can be represented like this:

[Image: The Pure Function equation]

or this:

[Image: The Pure Function equation - variables only]

What’s next

Now that you’ve seen the definition of a pure function, I’ll show some problems that arise from using impure functions, and then summarize the benefits of using pure functions.

See also