How to sort a Scala Map by key or value (sortBy, sortWith)

This is an excerpt from the 1st Edition of the Scala Cookbook (#ad) (partially modified for the internet). This is Recipe 11.23, “How to Sort an Existing Scala (#ad) Map by Key or Value”

Problem

You have an unsorted Scala Map and want to sort the elements in the map by the key or value.

Solution

Given a basic, immutable Map:

scala> val grades = Map("Kim" -> 90,
     |     "Al" -> 85,
     |     "Melissa" -> 95,
     |     "Emily" -> 91,
     |     "Hannah" -> 92
     | )
grades: scala.collection.immutable.Map[String,Int] = Map(Hannah -> 92, Melissa -> 95, Kim -> 90, Emily -> 91, Al -> 85)

You can sort the map by key, from low to high, using sortBy:

scala> import scala.collection.immutable.ListMap
import scala.collection.immutable.ListMap

scala> ListMap(grades.toSeq.sortBy(_._1):_*)
res0: scala.collection.immutable.ListMap[String,Int] = Map(Al -> 85, Emily -> 91, Hannah -> 92, Kim -> 90, Melissa -> 95)

You can also sort the keys in ascending or descending order using sortWith:

// low to high
scala> ListMap(grades.toSeq.sortWith(_._1 < _._1):_*)
res0: scala.collection.immutable.ListMap[String,Int] = Map(Al -> 85, Emily -> 91, Hannah -> 92, Kim -> 90, Melissa -> 95)

// high to low
scala> ListMap(grades.toSeq.sortWith(_._1 > _._1):_*)
res1: scala.collection.immutable.ListMap[String,Int] = Map(Melissa -> 95, Kim -> 90, Hannah -> 92, Emily -> 91, Al -> 85)

You can sort the map by value using sortBy:

scala> ListMap(grades.toSeq.sortBy(_._2):_*)
res0: scala.collection.immutable.ListMap[String,Int] = Map(Al -> 85, Kim -> 90, Emily -> 91, Hannah -> 92, Melissa -> 95)

You can also sort by value in ascending or descending order using sortWith:

// low to high
scala> ListMap(grades.toSeq.sortWith(_._2 < _._2):_*)
res0: scala.collection.immutable.ListMap[String,Int] = Map(Al -> 85, Kim -> 90, Emily -> 91, Hannah -> 92, Melissa -> 95)

// high to low
scala> ListMap(grades.toSeq.sortWith(_._2 > _._2):_*)
res1: scala.collection.immutable.ListMap[String,Int] = Map(Melissa -> 95, Hannah -> 92, Emily -> 91, Kim -> 90, Al -> 85)

In all of these examples, you’re not sorting the existing map; the sort methods result in a new sorted map, so the output of the result needs to be assigned to a new variable.

Also, you can use either a ListMap or a LinkedHashMap in these recipes. This example shows how to use a LinkedHashMap and assign the result to a new variable:

scala> val x = collection.mutable.LinkedHashMap(grades.toSeq.sortBy(_._1):_*)
x: scala.collection.mutable.LinkedHashMap[String,Int] =
    Map(Al -> 85, Emily -> 91, Hannah -> 92, Kim -> 90, Melissa -> 95)

scala> x.foreach(println)
(Al,85)
(Emily,91)
(Hannah,92)
(Kim,90)
(Melissa,95)

Discussion

To understand these solutions, it’s helpful to break them down into smaller pieces. First, start with the basic immutable Map:

scala> val grades = Map("Kim" -> 90,
     |   "Al" -> 85,
     |   "Melissa" -> 95,
     |   "Emily" -> 91,
     |   "Hannah" -> 92
     | )
grades: scala.collection.immutable.Map[String,Int] = Map(Hannah -> 92, Melissa -> 95, Kim -> 90, Emily -> 91, Al -> 85)

Next, this is what grades.toSeq looks like:

scala> grades.toSeq
res0: Seq[(String, Int)] = ArrayBuffer((Hannah,92), (Melissa,95), (Kim,90), (Emily,91), (Al,85))

You make the conversion to a Seq because it has sorting methods you can use:

scala> grades.toSeq.sortBy(_._1)
res0: Seq[(String, Int)] = ArrayBuffer((Al,85), (Emily,91), (Hannah,92), (Kim,90), (Melissa,95))

scala> grades.toSeq.sortWith(_._1 < _._1)
res1: Seq[(String, Int)] = ArrayBuffer((Al,85), (Emily,91), (Hannah,92), (Kim,90), (Melissa,95))

Once you have the map data sorted as desired, store it in a ListMap to retain the sort order:

scala> ListMap(grades.toSeq.sortBy(_._1):_*)
res0: scala.collection.immutable.ListMap[String,Int] = Map(Al -> 85, Emily -> 91, Hannah -> 92, Kim -> 90, Melissa -> 95)

The LinkedHashMap also retains the sort order of its elements, so it can be used in all of the examples as well:

scala> import scala.collection.mutable.LinkedHashMap
import scala.collection.mutable.LinkedHashMap

scala> LinkedHashMap(grades.toSeq.sortBy(_._1):_*)
res0: scala.collection.mutable.LinkedHashMap[String,Int] = Map(Al -> 85, Emily -> 91, Hannah -> 92, Kim -> 90, Melissa -> 95)

There are both mutable and immutable versions of a ListMap, but LinkedHashMap is only available as a mutable class. Use whichever is best for your situation.

About that _*

The _* portion of the code takes a little getting used to. It’s used to convert the data so it will be passed as multiple parameters to the ListMap or LinkedHashMap. You can see this a little more easily by again breaking down the code into separate lines. The sortBy method returns a Seq[(String, Int)], i.e., a sequence of tuples:

scala> val x = grades.toSeq.sortBy(_._1)
x: Seq[(String, Int)] = ArrayBuffer((Al,85), (Emily,91), (Hannah,92), (Kim,90), (Melissa,95))

You can’t directly construct a ListMap with a sequence of tuples, but because the apply method in the ListMap companion object accepts a Tuple2 varargs parameter, you can adapt x to work with it, i.e., giving it what it wants:

scala> ListMap(x: _*)
res0: scala.collection.immutable.ListMap[String,Int] = Map(Al -> 85, Emily -> 91, Hannah -> 92, Kim -> 90, Melissa -> 95)

Attempting to create the ListMap without using this approach results in an error:

scala> ListMap(x)
<console>:16: error: type mismatch;
 found   : Seq[(String, Int)]
 required: (?, ?)
              ListMap(x)
                      ^

Another way to see how _* works is to define your own method that takes a varargs parameter. The following printAll method takes one parameter, a varargs field of type String:

def printAll(strings: String*) {
    strings.foreach(println)
}

If you then create a List like this:

// a sequence of strings
val fruits = List("apple", "banana", "cherry")
you won’t be able to pass that `List` into printAll; it will fail like the previous example:

scala> printAll(fruits)
<console>:20: error: type mismatch;
 found   : List[String]
 required: String
              printAll(fruits)
                       ^

But you can use _* to adapt the List to work with printAll, like this:

// this works
printAll(fruits: _*)

If you come from a Unix background, it may be helpful to think of _* as a “splat” operator. This operator tells the compiler to pass each element of the sequence to printAll as a separate argument, instead of passing fruits as a single List argument.