How to merge Scala sequential collections (List, Vector, ArrayBuffer, Array, Seq)

This is an excerpt from the Scala Cookbook (partially modified for the internet). This is Recipe 10.22, “How to Merge Scala Sequential Collections.”

Problem

You want to join two Scala sequences into one sequence, either keeping all of the original elements, finding the elements that are common to both collections, or finding the difference between the two sequences.

Solution

There are a variety of solutions to this problem, depending on your needs:

  • Use the ++= method to merge a sequence into a mutable sequence
  • Use the ++ method to merge two mutable or immutable sequences
  • Use collection methods like union, diff, and intersect

The ++= method

Use the ++= method to merge a sequence (any TraversableOnce, which Scala 2.13 and later call IterableOnce) into a mutable collection like an ArrayBuffer:

scala> val a = collection.mutable.ArrayBuffer(1,2,3)
a: scala.collection.mutable.ArrayBuffer[Int] = ArrayBuffer(1, 2, 3)

scala> a ++= Seq(4,5,6)
res0: a.type = ArrayBuffer(1, 2, 3, 4, 5, 6)
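
Because ++= modifies the buffer in place and returns the buffer itself, it works even when the buffer is declared as a val, and the right-hand side does not have to be a Seq. A minimal sketch of that behavior follows; the object and value names are illustrative and not part of the original recipe:

```scala
import scala.collection.mutable.ArrayBuffer

object AppendDemo {
  // The buffer is a val; ++= changes its contents rather than reassigning it.
  val buf = ArrayBuffer(1, 2, 3)

  buf ++= List(4, 5)       // append an immutable List
  buf ++= Vector(6)        // append a Vector
  buf ++= Iterator(7, 8)   // an Iterator can be traversed only once, which is enough here

  // buf now holds 1 through 8 in insertion order
}
```

Referencing AppendDemo.buf after the object initializes shows all eight elements in the same buffer instance.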

The ++ method

Use the ++ method to merge two mutable or immutable collections while assigning the result to a new variable:

scala> val a = Array(1,2,3)
a: Array[Int] = Array(1, 2, 3)

scala> val b = Array(4,5,6)
b: Array[Int] = Array(4, 5, 6)

scala> val c = a ++ b
c: Array[Int] = Array(1, 2, 3, 4, 5, 6)
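
Unlike ++=, the ++ method leaves both operands untouched and returns a new collection. The following sketch, using illustrative names, combines an immutable List with a mutable ArrayBuffer in both directions:

```scala
import scala.collection.mutable.ArrayBuffer

object ConcatDemo {
  val xs = List(1, 2, 3)
  val ys = ArrayBuffer(4, 5, 6)

  // ++ builds a new collection; here the result type follows the left operand
  val listFirst: List[Int] = xs ++ ys
  val bufferFirst: ArrayBuffer[Int] = ys ++ xs

  // xs and ys still hold their original three elements
}
```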

union, intersect

You can also use methods like union and intersect to combine two sequences into a new sequence:

scala> val a = Array(1,2,3,4,5)
a: Array[Int] = Array(1, 2, 3, 4, 5)

scala> val b = Array(4,5,6,7,8)
b: Array[Int] = Array(4, 5, 6, 7, 8)

// elements that are in both collections
scala> val c = a.intersect(b)
c: Array[Int] = Array(4, 5)

// all elements from both collections
scala> val c = a.union(b)
c: Array[Int] = Array(1, 2, 3, 4, 5, 4, 5, 6, 7, 8)

// distinct elements from both collections
scala> val c = a.union(b).distinct
c: Array[Int] = Array(1, 2, 3, 4, 5, 6, 7, 8)
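
On sequences these methods work on occurrences rather than on unique values, which matters when the inputs contain duplicates. The sketch below uses illustrative List values, and uses ++ in place of union, since union on a sequence is a plain concatenation:

```scala
object IntersectDemo {
  val left  = List(1, 1, 2, 3)
  val right = List(1, 2, 2, 4)

  // Each element of `right` can match at most one element of `left`,
  // so only one 1 and one 2 survive.
  val common = left.intersect(right)      // List(1, 2)

  // Concatenate, then drop repeats; the first occurrence of each value is kept.
  val merged = (left ++ right).distinct   // List(1, 2, 3, 4)
}
```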

diff

The result of the diff method depends on which sequence it’s called on. Using the same a and b arrays as in the previous example:

scala> val c = a diff b
c: Array[Int] = Array(1, 2, 3)

scala> val c = b diff a
c: Array[Int] = Array(6, 7, 8)

The Scaladoc for the diff method states that it returns, “a new list which contains all elements of this list except some of occurrences of elements that also appear in that. If an element value x appears n times in that, then the first n occurrences of x will not form part of the result, but any following occurrences will.”
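
The “first n occurrences” rule in that description is easiest to see with repeated values. A small sketch, with illustrative names and data:

```scala
object DiffDemo {
  val xs = List(1, 2, 2, 2, 3)

  // `that` contains 2 once, so only the first 2 is removed
  val removeOne = xs.diff(List(2))        // List(1, 2, 2, 3)

  // `that` contains 2 twice, so the first two 2s are removed
  val removeTwo = xs.diff(List(2, 2))     // List(1, 2, 3)

  // Elements of `that` that never appear in xs have no effect
  val removeMissing = xs.diff(List(9))    // List(1, 2, 2, 2, 3)
}
```

This is why diff on a sequence is not the same as a set difference: one occurrence in the argument removes one occurrence from the receiver, not all of them.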

concat

The companion objects of most collections also have a concat method. Using the same a and b arrays:

scala> Array.concat(a, b)
res0: Array[Int] = Array(1, 2, 3, 4, 5, 4, 5, 6, 7, 8)

::: with List

If you happen to be working with a List, the ::: method prepends the elements of one list to another list:

scala> val a = List(1,2,3,4)
a: List[Int] = List(1, 2, 3, 4)

scala> val b = List(4,5,6,7)
b: List[Int] = List(4, 5, 6, 7)

scala> val c = a ::: b
c: List[Int] = List(1, 2, 3, 4, 4, 5, 6, 7)
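
Because the method name ends in a colon, ::: is invoked on its right operand, which is why it is described as prepending. The sketch below, with illustrative names, spells that out and checks that chained calls give the same result as ++:

```scala
object ListConcatDemo {
  val a = List(1, 2)
  val b = List(3, 4)
  val c = List(5, 6)

  // Operators ending in ':' are called on the right operand: a ::: b is b.:::(a)
  val viaOperator = a ::: b
  val viaMethod   = b.:::(a)

  // Chains group from the right, a ::: (b ::: c), and match ++ for lists
  val chained     = a ::: b ::: c
  val viaPlusPlus = a ++ b ++ c
}
```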

Discussion

You can also use the diff method to get the relative complement of two sets. The relative complement of a set A with respect to a set B is the set of elements in B that are not in A.

On a recent project, I needed to find the elements in one list that weren’t in another list. I did this by converting the lists to sets, and then using the diff method to compare the two sets. For instance, given these two arrays:

val a = Array(1,2,3,11,4,12,4,5)
val b = Array(6,7,4,5)

you can find the relative complement of each array by first converting them to sets (to eliminate the duplicate values), and then comparing them with the diff method:

// the elements in a that are not in b
scala> val c = a.toSet diff b.toSet
c: scala.collection.immutable.Set[Int] = Set(1, 2, 12, 3, 11)

// the elements in b that are not in a
scala> val d = b.toSet diff a.toSet
d: scala.collection.immutable.Set[Int] = Set(6, 7)

If desired, you can then combine those two results with ++ to get the symmetric difference: the set of elements that are in either the first set or the second set, but not in both:

scala> val complement = c ++ d
complement: scala.collection.immutable.Set[Int] = Set(1, 6, 2, 12, 7, 3, 11)

This works because diff returns a set that contains the elements in the current set (this) that are not in the other set (that).
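
If you need this combination often, it can be wrapped in a small helper. The following is a sketch rather than a library method; symmetricDiff is a hypothetical name, and the second value shows an equivalent formulation (union minus intersection) for comparison:

```scala
object SymmetricDiffDemo {
  // Hypothetical helper: elements that are in exactly one of the two sets
  def symmetricDiff[A](x: Set[A], y: Set[A]): Set[A] =
    (x diff y) ++ (y diff x)

  val aSet: Set[Int] = Array(1, 2, 3, 11, 4, 12, 4, 5).toSet
  val bSet: Set[Int] = Array(6, 7, 4, 5).toSet

  // Contains 1, 2, 3, 6, 7, 11, 12 (iteration order is not guaranteed)
  val result = symmetricDiff(aSet, bSet)

  // Same elements, computed as "everything, minus what both sets share"
  val alternative = (aSet ++ bSet) -- (aSet intersect bSet)
}
```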

You can also use the -- method to get the same result:

scala> val c = a.toSet -- b.toSet
c: scala.collection.immutable.Set[Int] = Set(1, 2, 12, 3, 11)

scala> val d = b.toSet -- a.toSet
d: scala.collection.immutable.Set[Int] = Set(6, 7)

Subtracting the intersection of the two sets also yields the same result:

scala> val i = a.intersect(b)
i: Array[Int] = Array(4, 5)

scala> val c = a.toSet -- i.toSet
c: scala.collection.immutable.Set[Int] = Set(1, 2, 12, 3, 11)

scala> val d = b.toSet -- i.toSet
d: scala.collection.immutable.Set[Int] = Set(6, 7)
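
The set-based approaches above discard both the ordering and the duplicates of the original collections. If those matter, one alternative (not part of the original recipe) is to keep the sequence as it is and filter it against a set built from the other collection. The sketch below uses lists with the same values as the arrays above, and also shows how the sequence version of diff differs on this data:

```scala
object OrderedDiffDemo {
  // Same values as the a and b arrays in the discussion
  val a = List(1, 2, 3, 11, 4, 12, 4, 5)
  val b = List(6, 7, 4, 5)

  // Build the lookup set once so each membership test is cheap
  val bSet: Set[Int] = b.toSet

  // Elements of a that are not in b, in their original order
  val onlyInA = a.filterNot(bSet.contains)   // List(1, 2, 3, 11, 12)

  // Sequence diff removes one 4 and one 5, so the second 4 in a survives
  val seqDiff = a.diff(b)                    // List(1, 2, 3, 11, 12, 4)
}
```

The leftover 4 in seqDiff is the reason for converting to sets (or filtering against a set) when the goal is “values that do not appear in the other collection at all.”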