The Benefits of Pure Functions

When asked, “What are the advantages of writing in a language without side effects?”, Simon Peyton Jones, co-creator of Haskell, replied, “You only have to reason about values and not about state. If you give a function the same input, it’ll give you the same output, every time. This has implications for reasoning, for compiling, for parallelism.”

From the book, Masterminds of Programming

The goal of this lesson is simple: to list and explain the benefits of writing pure functions.

Benefits of pure functions

My favorite benefits of pure functions are:

  • They’re easier to reason about
  • They’re easier to combine
  • They’re easier to test
  • They’re easier to debug
  • They’re easier to parallelize

Functional programming developers talk about other benefits of writing pure functions. For instance, Venkat Subramaniam adds these benefits:

  • They are idempotent
  • They offer referential transparency
  • They are memoizable
  • They can be lazy

In this lesson I’ll examine each of these benefits.

Pure functions are easier to reason about

Pure functions are easier to reason about than impure functions, and I cover this in detail in the lesson, “Pure Function Signatures Tell All.” The key point is that because a pure function has no side effects or hidden I/O, you can get a terrific idea of what it does just by looking at its signature.
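As a small sketch of that idea (sum here is my own illustration, not a function from the book), the signature alone tells you that a List[Int] goes in and an Int comes out, and because the function is pure, that’s the whole story:

// the signature says it all: a List[Int] in, an Int out,
// with no hidden inputs or side effects involved
def sum(xs: List[Int]): Int = xs.foldLeft(0)(_ + _)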

Pure functions are easier to combine

Because “output depends only on input,” pure functions are easy to combine into simple solutions. For example, you’ll often see FP code written as a chain of function calls, like this:

val x = doThis(a).thenThis(b)
                 .andThenThis(c)
                 .doThisToo(d)
                 .andFinallyThis(e)

This capability is referred to as functional composition. I’ll demonstrate more examples of it throughout this book.
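For a concrete, runnable illustration (my own sketch using standard Scala collection methods rather than the hypothetical functions above), each call is a pure transformation of the previous result:

// each method call is a pure transformation of the previous value
val result = List(1, 2, 3, 4, 5)
  .map(_ * 2)       // List(2, 4, 6, 8, 10)
  .filter(_ > 4)    // List(6, 8, 10)
  .reduce(_ + _)    // 24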

As you’ll see in the “FP is Like Unix Pipelines” lesson, Unix pipelines work extremely well because most Unix commands are like pure functions: they read input and produce transformed output based only on the inputs and the algorithm you supply.

Pure functions are easier to test

As I showed in the “Benefits of Functional Programming” chapter and in the unit test in the previous lesson, pure functions are easier to test than impure functions. I expand on this in several other lessons in this book, including the lesson on property-based testing.
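As a minimal sketch of why (add is a hypothetical pure function, and these are plain assertions rather than a test framework), there’s no setup, mocking, or teardown involved, just inputs and expected outputs:

def add(a: Int, b: Int): Int = a + b

// no mocks or fixtures needed; same inputs always give same outputs
assert(add(1, 2) == 3)
assert(add(-1, 1) == 0)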

Pure functions are easier to debug

In the “Benefits of Functional Programming” chapter I wrote that at a large scale, FP applications are easier to debug. At a small scale, pure functions are also easier to debug than their impure counterparts: because the output of a pure function depends only on its input parameters and your algorithm, you don’t need to look outside the function’s scope to debug it.

Contrast that with having to debug the makeCookies function in the previous lesson. Because love is a hidden input, you have to look outside the function’s scope to determine what love’s state was at the time makeCookies was called, and how that state was set.
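To sketch that problem in code (this is my own reconstruction of the idea, not the exact code from that lesson):

var love = true   // a hidden, mutable input outside the function

def makeCookies(numCookies: Int): String =
  if (love) s"$numCookies tasty cookies"
  else s"$numCookies bland cookies"

makeCookies(12)   // output depends on the current state of love
love = false
makeCookies(12)   // same input, different output: harder to debug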

Pure functions are easier to parallelize

I wrote in the “Benefits of Functional Programming” chapter that it’s easier to write parallel/concurrent applications with FP. Because all of those same reasons apply here, I won’t repeat them, but I will show one example of how a compiler can optimize code within a pure function.

I’m not a compiler writer, so I’ll begin with this statement from the “pure functions” section of the Wikipedia functional programming page:

“If there is no data dependency between two pure expressions, then their order can be reversed, or they can be performed in parallel and they cannot interfere with one another (in other terms, the evaluation of any pure expression is thread-safe).”

As an example of what that means, in this code:

val x = f(a)
val y = g(b)
val z = h(c)
val result = x + y + z

there are no data dependencies between the first three expressions, so they can be executed in any order. The only thing that matters is that they are executed before the assignment to result. If the compiler/interpreter wants to run those expressions in parallel, it can do that and then merge their values in the final expression. This can happen because (a) the functions are pure, and (b) there are no dependencies between the expressions.
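To make that concrete, here’s a hedged sketch using Scala futures, where f, g, and h are hypothetical pure functions; a compiler or runtime could do something similar automatically:

import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

def f(a: Int): Int = a + 1   // hypothetical pure functions
def g(b: Int): Int = b * 2
def h(c: Int): Int = c - 3

// because f, g, and h are pure and independent, the three calls
// can safely run in parallel
val xF = Future(f(1))
val yF = Future(g(2))
val zF = Future(h(3))

// merge the three independent results in the final expression
val result = Await.result(
  for { x <- xF; y <- yF; z <- zF } yield x + y + z,
  1.second
)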

That same Wikipedia page also states:

“If the entire language does not allow side-effects, then any evaluation strategy can be used; this gives the compiler freedom to reorder or combine the evaluation of expressions in a program (for example, using deforestation).”

Pure functions are idempotent

I don’t use the word “idempotent” very often, so I’ll quote from Venkat Subramaniam’s explanation of the benefit of idempotence with regard to pure functions (with a few minor edits by me):

The word idempotent has a few different meanings ... a function or operation is idempotent if the result of executing it multiple times for a given input is the same as executing it only once for the same input. If we know that an operation is idempotent, we can run it as many times as we like ... it's safe to retry.

In a related definition, Mary Rose Cook states in “A practical introduction to functional programming”:

A process is deterministic if repetitions yield the same result every time.

Both definitions are consistent with a favorite phrase of mine: you can call a pure function an infinite number of times and always get the same result.
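A quick sketch of that idea (square is my own illustration):

def square(n: Int): Int = n * n

// run it as many times as you like; the result never changes
val results = (1 to 1000).map(_ => square(7))
assert(results.forall(_ == 49))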

Pure functions offer referential transparency

Referential transparency (RT) is another technical term that you’ll hear in the FP world. It’s related to idempotence, and refers to what you (and a compiler) can do because your functions are pure.

If you like algebra, you’ll like RT. It’s said that an expression is referentially transparent if it can be replaced by its resulting value without changing the behavior of the program.

For instance, assume that x and y are immutable values within some scope of an application, and within that scope they’re used to form this expression:

x + y

Then you can assign this expression to a third variable z:

val z = x + y

Now, throughout the given scope of your program, anywhere the expression x + y is used, it can be replaced by z without affecting the result of the program (and vice versa).

Note that although I state that x and y are immutable values, they can also be the result of pure functions. For instance, "hello".length + "world".length will always be 10. This result could be assigned to z, and then z could be used everywhere instead of this expression. In Scala it looks like this:

val x = "hello".length   // 5
val y = "world".length   // 5
val z = x + y            // 10

Because all of those values are immutable, you can use z anywhere you might use x + y, and in fact, in this example you can replace z with 10 anywhere, and your program will run exactly the same.

In FP we say things like, “10 cannot be reduced any more.” (More on this later.)

Conversely, if x or y came from an impure function, such as a “get the current time” function, z would not be a reliable replacement for x + y at different points in the application.
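Here’s a hedged sketch of that impure case, using System.currentTimeMillis as the “get the current time” function:

val t1 = System.currentTimeMillis   // one instant in time
Thread.sleep(10)
val t2 = System.currentTimeMillis   // a later instant

// the "same" expression yields different values at different times,
// so no single cached value can replace it
assert(t1 != t2)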

Pure functions are memoizable

Because a pure function always returns the same result when given the same inputs, a compiler (or your application) can also use caching optimizations, such as memoization.

Wikipedia defines memoization like this:

“Memoization is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again.”

For example, I previously noted that my Android football game has this function call:

val possiblePlays = OffensiveCoordinator.determinePossiblePlays(gameState)

The determinePossiblePlays function currently has several thousand lines of pure functions behind it, and over time it’s only going to get more complicated. Although this function doesn’t currently use memoization, it would be fairly simple to create a cache for it, so that each time it received the same gameState it would return the same result.

The cache could be implemented as a Map, with a type of Map[GameState, Seq[OffensivePlay]]. Then when determinePossiblePlays receives a GameState instance, it could perform a fast lookup in this cache.

While those statements are true, I don’t want to oversimplify this too much. determinePossiblePlays makes decisions based on many GameState factors, including two important ones: (a) the game score and (b) the time remaining. Those two variables would have to be part of any cache key.
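As a minimal sketch of the technique (my own generic wrapper, not the game’s actual code), any pure function of one argument can be memoized with a simple cache:

import scala.collection.mutable

// wrap a pure function with a cache; getOrElseUpdate computes the
// value on the first call and returns the cached value afterward
def memoize[A, B](f: A => B): A => B = {
  val cache = mutable.Map.empty[A, B]
  a => cache.getOrElseUpdate(a, f(a))
}

// a hypothetical expensive pure function
def slowSquare(n: Int): Int = { Thread.sleep(100); n * n }

val fastSquare = memoize(slowSquare _)
fastSquare(6)   // computed on the first call (about 100 ms)
fastSquare(6)   // returned from the cache immediately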

Pure functions can be lazy

Laziness is a major feature of the Haskell language, where everything is lazy by default. In Scala I primarily use laziness with large data sets and streams, so I haven’t personally taken much advantage of this benefit of purity yet.
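Still, a minimal sketch shows the idea (my own illustration, using LazyList, which was called Stream before Scala 2.13). Deferring or skipping evaluation is only safe because the function passed to map is pure:

val naturals = LazyList.from(1)          // an infinite, lazy sequence
val squares  = naturals.map(n => n * n)  // nothing computed yet

// only the first five squares are ever evaluated
println(squares.take(5).toList)          // List(1, 4, 9, 16, 25)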

Summary

In this lesson I wrote about the benefits of pure functions. My favorite benefits are:

  • They’re easier to reason about
  • They’re easier to combine
  • They’re easier to test
  • They’re easier to debug
  • They’re easier to parallelize

Other FP developers write about these benefits of pure functions:

  • They are idempotent
  • They offer referential transparency
  • They are memoizable
  • They can be lazy
