How to walk through a Scala collection with ‘reduce’ and ‘fold’

This is an excerpt from the 1st Edition of the Scala Cookbook (partially modified for the internet). This is Recipe 10.20, “How to Walk Through a Scala Collection with the reduce and fold Methods”

Problem

You want to walk through all of the elements in a Scala sequence, comparing two neighboring elements as you walk through the collection.

Solution

Use the reduceLeft, foldLeft, reduceRight, and foldRight methods to walk through the elements in a sequence, applying your function to neighboring elements to yield a new result, which is then compared to the next element in the sequence to yield a new result.

Related methods, such as scanLeft and scanRight, are also shown in the Discussion.

For example, use reduceLeft to walk through a sequence from left to right (from the first element to the last). reduceLeft starts by comparing the first two elements in the collection with your algorithm, and returns a result. That result is compared with the third element, and that comparison yields a new result. That result is compared to the fourth element to yield a new result, and so on.

If you’ve never used these methods before, you’ll see that they give you a surprising amount of power. The best way to show this is with some examples. First, create a sample collection to experiment with:

scala> val a = Array(12, 6, 15, 2, 20, 9)
a: Array[Int] = Array(12, 6, 15, 2, 20, 9)

Given that sequence, use reduceLeft to determine different properties about the collection. The following example shows how to get the sum of all the elements in the sequence:

scala> a.reduceLeft(_ + _)
res0: Int = 64

Don’t let the underscores throw you for a loop; they just stand for the two parameters that are passed into your function. You can write that code like this, if you prefer:

a.reduceLeft((x,y) => x + y)

The following examples show how to use reduceLeft to get the product of all elements in the sequence, the smallest value in the sequence, and the largest value:

scala> a.reduceLeft(_ * _)
res1: Int = 388800

scala> a.reduceLeft(_ min _)
res2: Int = 2

scala> a.reduceLeft(_ max _)
res3: Int = 20

Show each step in the process

You can demonstrate how reduceLeft works by creating a larger function. The following function does a “max” comparison like the last example, but has some extra debugging code so you can see how reduceLeft works as it marches through the sequence.

Here’s the function:

// returns the max of the two elements
val findMax = (x: Int, y: Int) => {
    val winner = x max y
    println(s"compared $x to $y, $winner was larger")
    winner
}

Now call reduceLeft again on the array, this time giving it the findMax function:

scala> a.reduceLeft(findMax)
compared 12 to 6, 12 was larger
compared 12 to 15, 15 was larger
compared 15 to 2, 15 was larger
compared 15 to 20, 20 was larger
compared 20 to 9, 20 was larger
res0: Int = 20

The output shows how reduceLeft marches through the elements in the sequence, and how it called the function at each step. Here’s how the process works:

  • reduceLeft starts by calling findMax to test the first two elements in the array, 12 and 6. findMax returned 12, because 12 is larger than 6.
  • reduceLeft takes that result (12), and calls findMax(12, 15). 12 is the result of the first comparison, and 15 is the next element in the collection. 15 is larger, so it becomes the new result.
  • reduceLeft keeps taking the result from the function and comparing it to the next element in the collection, until it marches through all the elements in the collection, ending up with the result, 20.

The code that reduceLeft uses under the hood looks like this:

// you provide the sequence 'seq' and the function 'f'
var result = seq(0)

for (i <- 1 until seq.length) {
    val next = seq(i)
    result = f(result, next)
}

Feeding different algorithms into this loop lets you extract different types of information from your sequence. Wrapping the algorithm in a method also makes for very concise code.

One subtle but important note about reduceLeft: the function (or method) you supply must return the same data type that’s stored in the collection. This is necessary so reduceLeft can compare the result of your function to the next element in the collection.

Working with other sequences and types

As you can imagine, the type contained in the sequence can be anything you need. For instance, determining the longest or shortest string in a sequence of strings is a matter of walking through the elements in the sequence with a function to compare the lengths of two strings:

scala> val peeps = Vector("al", "hannah", "emily", "christina", "aleka")
peeps: scala.collection.immutable.Vector[java.lang.String] =
    Vector(al, hannah, emily, christina, aleka)

// longest
scala> peeps.reduceLeft((x,y) => if (x.length > y.length) x else y)
res0: String = christina

// shortest
scala> peeps.reduceLeft((x,y) => if (x.length < y.length) x else y)
res1: String = al

If this had been a collection of Person instances, you could run a similar algorithm on each person’s name to get the longest and shortest names.

foldLeft, reduceRight, and foldRight

The foldLeft method works just like reduceLeft, but it lets you set a seed value to be used for the first element. The following examples demonstrate a “sum” algorithm, first with reduceLeft and then with foldLeft, to demonstrate the difference:

scala> val a = Array(1, 2, 3)
a: Array[Int] = Array(1, 2, 3)

scala> a.reduceLeft(_ + _)
res0: Int = 6

scala> a.foldLeft(20)(_ + _)
res1: Int = 26

scala> a.foldLeft(100)(_ + _)
res2: Int = 106

In the last two examples, foldLeft uses 20 and then 100 for its first element, which affects the resulting sum as shown.

If you haven’t seen syntax like that before, foldLeft takes two parameter lists. The first parameter list takes one field, the seed value. The second parameter list is the block of code you want to run (your algorithm).

Recipe 3.18, “Creating Your Own Control Structures”, demonstrates the use of multiple parameter lists.

The reduceRight and foldRight methods work the same as reduceLeft and foldLeft, respectively, but they begin at the end of the collection and work from right to left, i.e., from the end of the collection back to the beginning.

The difference between reduceLeft and reduceRight

In many algorithms, it may not matter if you call reduceLeft or reduceRight. In that case, you can call reduce instead. The reduce Scaladoc states, “The order in which operations are performed on elements is unspecified and may be nondeterministic.”

But some algorithms will yield a big difference. For example, given this divide function:

val divide = (x: Double, y: Double) => {
    val result = x / y
    println(s"divided $x by $y to yield $result")
    result
}

and this array:

val a = Array(1.0, 2.0, 3.0)

reduceLeft and reduceRight yield a significantly different result:

scala> a.reduceLeft(divide)
divided 1.0 by 2.0 to yield 0.5
divided 0.5 by 3.0 to yield 0.16666666666666666
res0: Double = 0.16666666666666666

scala> a.reduceRight(divide)
divided 2.0 by 3.0 to yield 0.6666666666666666
divided 1.0 by 0.6666666666666666 to yield 1.5
res1: Double = 1.5

scanLeft and scanRight

Two methods named scanLeft and scanRight walk through a sequence in a manner similar to reduceLeft and reduceRight, but they return a sequence instead of a single value.

For instance, scanLeft “Produces a collection containing cumulative results of applying the operator going left to right.” To understand how it works, create another function with a little debug code in it:

val product = (x: Int, y: Int) => {
    val result = x * y
    println(s"multiplied $x by $y to yield $result")
    result
}

Here’s what scanLeft looks like when it’s used with that function and a seed value:

scala> val a = Array(1, 2, 3)
a: Array[Int] = Array(1, 2, 3)

scala> a.scanLeft(10)(product)
multiplied 10 by 1 to yield 10
multiplied 10 by 2 to yield 20
multiplied 20 by 3 to yield 60
res0: Array[Int] = Array(10, 10, 20, 60)

As you can see, scanLeft returns a new sequence, rather than a single value. The scanRight method works the same way, but marches through the collection from right to left.

There are a few more related methods, including reduceLeftOption, reduceRightOption, and reduce, which was mentioned earlier.

If you’re curious about the statement in the reduce method Scaladoc that, “The order in which operations are performed on elements is unspecified and may be nondeterministic,” run this code in the REPL:

val findMax = (x: Int, y: Int) => {
    Thread.sleep(10)
    val winner = x max y
    println(s"compared $x to $y, $winner was larger")
    winner
}

val a = Array.range(0,50)
a.par.reduce(findMax)

You’ll see that the elements in the sequence are indeed compared in a nondeterministic order.