Scala: How to extract a column from a list of strings (like awk/print)

On Twitter this morning I saw this post by Pablo Fco. Pérez where he compared some Bash commands to Scala. In particular he noted that this awk command:

awk '{print $1}'

is roughly equivalent to this Scala expression, applied to a list of lines:

map(_.split(" ").head)

I had to think about that for a moment before I thought, “Hey, he’s right.” Here’s a quick demonstration.

Extract a column from a list of strings (like awk)

First create a List with three strings:

scala> val x = List.fill(3)("foo bar baz")
x: List[String] = List(foo bar baz, foo bar baz, foo bar baz)

Then run his map/split/head command combo on the List[String]:

scala> x.map(_.split(" ").head)
res0: List[String] = List(foo, foo, foo)

Pretty cool: it returns the first column from a list of strings whose fields are separated by single spaces.
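
One caveat: awk splits on runs of whitespace and ignores leading whitespace, while split(" ") splits on every single space, so a line like "  a b" produces an empty first field. Here's a minimal sketch of a more awk-like version that can also pull out any column. The column helper and its 1-based n parameter are hypothetical names used for illustration, not something from the tweet:

```scala
// Hypothetical helper (not from the original post): return the n-th
// whitespace-separated field of each line, 1-based like awk's $n.
def column(lines: List[String], n: Int): List[String] =
  lines.map { line =>
    // trim plus the "\\s+" regex approximates awk's default field splitting
    val fields = line.trim.split("\\s+")
    // like awk, yield an empty string when the line has no n-th field
    if (n >= 1 && n <= fields.length) fields(n - 1) else ""
  }

val firstColumn = column(List("foo bar baz", "  a   b  c"), 1)
val thirdColumn = column(List("foo bar baz", "  a   b  c"), 3)
```

With those inputs, firstColumn is List(foo, a) and thirdColumn is List(baz, c). This is only an approximation of awk's behavior under the assumption of default field separators.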

Breaking it down

If that command doesn’t make sense, try to break it down into smaller pieces. First, notice what split(" ") does on one String:

scala> "foo bar baz".split(" ")
res1: Array[String] = Array(foo, bar, baz)

Then notice what happens when you add head to that:

scala> "foo bar baz".split(" ").head
res2: String = foo
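
One more thing to keep in mind when breaking it down: head throws an exception when split returns an empty Array, which happens for a line that contains nothing but the separator (for example, a single space). A small sketch of a safer variation, assuming you would rather skip such lines than fail, uses headOption instead. The sample lines here are made up for the demonstration:

```scala
// Made-up sample data; the middle line is a single space,
// for which split(" ") returns an empty Array.
val lines = List("foo bar baz", " ", "qux")

// headOption yields None instead of throwing on an empty Array
val firsts: List[Option[String]] = lines.map(_.split(" ").headOption)

// flatMap keeps only the lines that actually had a first field
val present: List[String] = lines.flatMap(_.split(" ").headOption)
```

Here firsts is List(Some(foo), None, Some(qux)) and present is List(foo, qux).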