How to sort a Scala Map by key or value (sortBy, sortWith)

This is an excerpt from the 1st Edition of the Scala Cookbook (partially modified for the internet). This is Recipe 11.23, “How to Sort an Existing Scala Map by Key or Value”

Problem

You have an unsorted Scala Map and want to sort the elements in the map by the key or value.

Solution

Given a basic, immutable Map:

scala> val grades = Map("Kim" -> 90,
     |     "Al" -> 85,
     |     "Melissa" -> 95,
     |     "Emily" -> 91,
     |     "Hannah" -> 92
     | )
grades: scala.collection.immutable.Map[String,Int] = Map(Hannah -> 92, Melissa -> 95, Kim -> 90, Emily -> 91, Al -> 85)

You can sort the map by key, from low to high, using sortBy:

scala> import scala.collection.immutable.ListMap
import scala.collection.immutable.ListMap

scala> ListMap(grades.toSeq.sortBy(_._1):_*)
res0: scala.collection.immutable.ListMap[String,Int] = Map(Al -> 85, Emily -> 91, Hannah -> 92, Kim -> 90, Melissa -> 95)

You can also sort the keys in ascending or descending order using sortWith:

// low to high
scala> ListMap(grades.toSeq.sortWith(_._1 < _._1):_*)
res0: scala.collection.immutable.ListMap[String,Int] = Map(Al -> 85, Emily -> 91, Hannah -> 92, Kim -> 90, Melissa -> 95)

// high to low
scala> ListMap(grades.toSeq.sortWith(_._1 > _._1):_*)
res1: scala.collection.immutable.ListMap[String,Int] = Map(Melissa -> 95, Kim -> 90, Hannah -> 92, Emily -> 91, Al -> 85)

You can sort the map by value using sortBy:

scala> ListMap(grades.toSeq.sortBy(_._2):_*)
res0: scala.collection.immutable.ListMap[String,Int] = Map(Al -> 85, Kim -> 90, Emily -> 91, Hannah -> 92, Melissa -> 95)

You can also sort by value in ascending or descending order using sortWith:

// low to high
scala> ListMap(grades.toSeq.sortWith(_._2 < _._2):_*)
res0: scala.collection.immutable.ListMap[String,Int] = Map(Al -> 85, Kim -> 90, Emily -> 91, Hannah -> 92, Melissa -> 95)

// high to low
scala> ListMap(grades.toSeq.sortWith(_._2 > _._2):_*)
res1: scala.collection.immutable.ListMap[String,Int] = Map(Melissa -> 95, Hannah -> 92, Emily -> 91, Kim -> 90, Al -> 85)

In all of these examples, you’re not sorting the existing map; the sort methods result in a new sorted map, so the output of the result needs to be assigned to a new variable.

Also, you can use either a ListMap or a LinkedHashMap in these recipes. This example shows how to use a LinkedHashMap and assign the result to a new variable:

scala> val x = collection.mutable.LinkedHashMap(grades.toSeq.sortBy(_._1):_*)
x: scala.collection.mutable.LinkedHashMap[String,Int] =
    Map(Al -> 85, Emily -> 91, Hannah -> 92, Kim -> 90, Melissa -> 95)

scala> x.foreach(println)
(Al,85)
(Emily,91)
(Hannah,92)
(Kim,90)
(Melissa,95)

Discussion

To understand these solutions, it’s helpful to break them down into smaller pieces. First, start with the basic immutable Map:

scala> val grades = Map("Kim" -> 90,
     |   "Al" -> 85,
     |   "Melissa" -> 95,
     |   "Emily" -> 91,
     |   "Hannah" -> 92
     | )
grades: scala.collection.immutable.Map[String,Int] = Map(Hannah -> 92, Melissa -> 95, Kim -> 90, Emily -> 91, Al -> 85)

Next, this is what grades.toSeq looks like:

scala> grades.toSeq
res0: Seq[(String, Int)] = ArrayBuffer((Hannah,92), (Melissa,95), (Kim,90), (Emily,91), (Al,85))

You make the conversion to a Seq because it has sorting methods you can use:

scala> grades.toSeq.sortBy(_._1)
res0: Seq[(String, Int)] = ArrayBuffer((Al,85), (Emily,91), (Hannah,92), (Kim,90), (Melissa,95))

scala> grades.toSeq.sortWith(_._1 < _._1)
res1: Seq[(String, Int)] = ArrayBuffer((Al,85), (Emily,91), (Hannah,92), (Kim,90), (Melissa,95))

Once you have the map data sorted as desired, store it in a ListMap to retain the sort order:

scala> ListMap(grades.toSeq.sortBy(_._1):_*)
res0: scala.collection.immutable.ListMap[String,Int] = Map(Al -> 85, Emily -> 91, Hannah -> 92, Kim -> 90, Melissa -> 95)

The LinkedHashMap also retains the sort order of its elements, so it can be used in all of the examples as well:

scala> import scala.collection.mutable.LinkedHashMap
import scala.collection.mutable.LinkedHashMap

scala> LinkedHashMap(grades.toSeq.sortBy(_._1):_*)
res0: scala.collection.mutable.LinkedHashMap[String,Int] = Map(Al -> 85, Emily -> 91, Hannah -> 92, Kim -> 90, Melissa -> 95)

There are both mutable and immutable versions of a ListMap, but LinkedHashMap is only available as a mutable class. Use whichever is best for your situation.

About that _*

The _* portion of the code takes a little getting used to. It’s used to convert the data so it will be passed as multiple parameters to the ListMap or LinkedHashMap. You can see this a little more easily by again breaking down the code into separate lines. The sortBy method returns a Seq[(String, Int)], i.e., a sequence of tuples:

scala> val x = grades.toSeq.sortBy(_._1)
x: Seq[(String, Int)] = ArrayBuffer((Al,85), (Emily,91), (Hannah,92), (Kim,90), (Melissa,95))

You can’t directly construct a ListMap with a sequence of tuples, but because the apply method in the ListMap companion object accepts a Tuple2 varargs parameter, you can adapt x to work with it, i.e., giving it what it wants:

scala> ListMap(x: _*)
res0: scala.collection.immutable.ListMap[String,Int] = Map(Al -> 85, Emily -> 91, Hannah -> 92, Kim -> 90, Melissa -> 95)

Attempting to create the ListMap without using this approach results in an error:

scala> ListMap(x)
<console>:16: error: type mismatch;
 found   : Seq[(String, Int)]
 required: (?, ?)
              ListMap(x)
                      ^

Another way to see how _* works is to define your own method that takes a varargs parameter. The following printAll method takes one parameter, a varargs field of type String:

def printAll(strings: String*) {
    strings.foreach(println)
}

If you then create a List like this:

// a sequence of strings
val fruits = List("apple", "banana", "cherry")
you won’t be able to pass that `List` into printAll; it will fail like the previous example:

scala> printAll(fruits)
<console>:20: error: type mismatch;
 found   : List[String]
 required: String
              printAll(fruits)
                       ^

But you can use _* to adapt the List to work with printAll, like this:

// this works
printAll(fruits: _*)

If you come from a Unix background, it may be helpful to think of _* as a “splat” operator. This operator tells the compiler to pass each element of the sequence to printAll as a separate argument, instead of passing fruits as a single List argument.