Scala String FAQ: How do I split a String in Scala based on a field separator, such as a String I get from a comma-separated value (CSV) file or pipe-delimited file?
Solution
Use one of the split methods that are available on Scala/Java String objects. For instance, this example in the Scala REPL shows how to split a string based on a blank space:
scala> "hello world".split(" ")
res0: Array[java.lang.String] = Array(hello, world)
Notice that the split method returns an Array of String elements (Array[String]), which you can then work with as a normal Scala Array, such as printing the array elements in this example:
scala> "hello world".split(" ").foreach(println)
hello
world
Real-world split-string example
Here’s a real-world example that shows how to split a URL-encoded string you might receive in a web application. Here I split this string by specifying the & character as the field separator or delimiter:
scala> val result = "oauth_token=FOO&oauth_token_secret=BAR&oauth_expires_in=3600"
result: java.lang.String = oauth_token=FOO&oauth_token_secret=BAR&oauth_expires_in=3600

scala> val nameValuePairs = result.split("&")
nameValuePairs: Array[java.lang.String] = Array(oauth_token=FOO, oauth_token_secret=BAR, oauth_expires_in=3600)
In the second statement I call the split method, telling it to use the & character to split my string into multiple strings. As the REPL output shows, the variable nameValuePairs is an Array of String type, and in this case, these are the name/value pairs I wanted.
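If you want to go one step further and look values up by name, each pair can be split again on the = character. The following is only a sketch: the object name NameValuePairs and the helper parsePairs are hypothetical names made up for this illustration, and the code assumes the values are already URL-decoded.

```scala
object NameValuePairs {
  // Hypothetical helper: turns "a=1&b=2" into Map("a" -> "1", "b" -> "2").
  // Assumes the values need no URL decoding.
  def parsePairs(s: String): Map[String, String] =
    s.split("&")                       // split into "name=value" strings
      .map(_.split("=", 2))            // limit 2: split only on the first '='
      .collect { case Array(name, value) => name -> value } // skip malformed pairs
      .toMap
}
```

Calling NameValuePairs.parsePairs(result) on the string shown above should give a Map whose "oauth_token" key maps to "FOO". Pieces that contain no = character are silently dropped, which may or may not be what you want in a real application.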
Splitting a CSV string
Here’s an example that shows how to split a CSV string into a string array:
scala> val s = "eggs, milk, butter, Coco Puffs"
s: java.lang.String = eggs, milk, butter, Coco Puffs

// 1st attempt
scala> s.split(",")
res0: Array[java.lang.String] = Array(eggs, " milk", " butter", " Coco Puffs")
Note that when using this approach it’s best to trim each string. This is shown in the following code, where I use the map method to call trim on each string before returning the array:
// 2nd attempt, cleaned up
scala> s.split(",").map(_.trim)
res1: Array[java.lang.String] = Array(eggs, milk, butter, Coco Puffs)
NOTE: This isn’t a perfect solution because CSV rows can have additional commas inside of quotes. I’m just trying to show how this approach generally works, and how it works for simple CSV files.
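To make that limitation concrete, here is a small sketch with a made-up row whose first field contains a comma inside quotes. The object name CsvSplitLimits and the sample data are assumptions for illustration only.

```scala
object CsvSplitLimits {
  // Made-up CSV row: the first field is the quoted text "Puffs, Coco"
  val row = "\"Puffs, Coco\",milk,eggs"

  // The naive split has no notion of quoting, so the quoted field
  // is cut in two at its inner comma
  val fields: Array[String] = row.split(",").map(_.trim)
}
```

The row has three logical fields, but fields ends up with four elements because the quoted field is split at the comma inside it. For data like this, a dedicated CSV parser is probably a better choice than split.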
Splitting a String using regular expressions
You can also split a string based on a regular expression (regex). This example shows how to split a string on whitespace characters:
scala> "hello world, this is Al".split("\\s+")
res0: Array[java.lang.String] = Array(hello, world,, this, is, Al)
For more examples of regular expressions, see the Java Pattern class, or see my common Java regular expression examples.
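Because the String argument to split is always treated as a regex, a pipe-delimited string needs some care: the | character means "or" in a regular expression, so it has to be escaped. The sketch below shows two ways to do that. The object name PipeSplit and the sample line are made up for this example.

```scala
import java.util.regex.Pattern

object PipeSplit {
  // Made-up pipe-delimited record
  val line = "eggs|milk|butter"

  // Escape the pipe so the regex matches a literal '|'
  val escaped: Array[String] = line.split("\\|")

  // Pattern.quote builds a literal-matching regex from any delimiter string
  val quoted: Array[String] = line.split(Pattern.quote("|"))
}
```

Both calls split the line into its three fields. Pattern.quote is handy when the delimiter comes from configuration or user input and you don't know in advance whether it contains regex metacharacters.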
Details: Where the ‘split’ method comes from
The split method is overloaded, with some versions of the method coming from the Java String class and some coming from the Scala StringLike class. For instance, if you call split with a Char argument instead of a String argument, you’re using the split method from StringLike:
// split with a String argument
scala> "hello world".split(" ")
res0: Array[java.lang.String] = Array(hello, world)

// split with a Char argument
scala> "hello world".split(' ')
res1: Array[String] = Array(hello, world)
The subtle difference in that output (Array[java.lang.String] versus Array[String]) is a hint that something is different, but as a practical matter, this isn’t important. Also, with the Scala IDE project integrated into Eclipse, you can see where each method comes from when the Eclipse “code assist” dialog is displayed. (IntelliJ IDEA and NetBeans may show similar information.)
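The difference in the reported type is harmless, but the two overloads can behave differently when the delimiter happens to be a regex metacharacter, because the String version treats its argument as a regular expression while the Char version treats it as a literal character. A minimal sketch, using a made-up object name CharVersusString:

```scala
object CharVersusString {
  val s = "a.b.c"

  // Char argument: '.' is taken literally
  val byChar: Array[String] = s.split('.')

  // String argument: "." is the regex for "any character", so every
  // character is a delimiter and the result is an empty array
  val byString: Array[String] = s.split(".")
}
```

Here byChar holds the three strings a, b, and c, while byString is empty. When the delimiter is a single character, the Char version lets you avoid escaping altogether.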
Note: The actual Scala class that contains the split method may change over time. At the time of this writing the split method is in the StringLike class, but as the Scala libraries are reorganized this may change. In Scala 2.13 and later, for example, the Char version is defined in the StringOps class.