Scala FAQ: How can I iterate through each character in a Scala String, performing an operation on each character as I traverse the string?
Solution
Depending on your needs and preferences, you can use the Scala map or foreach methods, a for loop, or other approaches.
The ‘map’ method
Here’s a simple example of how to create an uppercase string from an input string, using the map method that’s available on all Scala sequential collections:
scala> val upper = "hello, world".map(c => c.toUpper)
upper: String = HELLO, WORLD
As you see in many examples in the Scala Cookbook, you can shorten that code using the magic of Scala’s underscore character:
scala> val upper = "hello, world".map(_.toUpper)
upper: String = HELLO, WORLD
With any Scala collection (such as a sequence of characters in a string) you can also chain collection methods together to achieve a desired result. In the following example, the filter method is called on the original String to create a new String with all occurrences of the lowercase letter “L” removed. That String is then used as input to the map method to convert the remaining characters to uppercase:
scala> val upper = "hello, world".filter(_ != 'l').map(_.toUpper)
upper: String = HEO, WORD
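The same chaining style extends to other collection methods. As a minimal sketch (the object and method names here are made up for illustration), this keeps only the letters, uppercases them, and reverses the result:

```scala
object ChainingSketch {
  // Keep only letters, convert them to uppercase, then reverse the string.
  def lettersUpperReversed(s: String): String =
    s.filter(_.isLetter).map(_.toUpper).reverse

  def main(args: Array[String]): Unit = {
    println(lettersUpperReversed("hello, world"))  // DLROWOLLEH
  }
}
```

Each step returns a new String, so the next method in the chain always operates on the result of the previous one.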
The ‘for’ loop
When you first start with Scala, you may not be comfortable with the map method, in which case you can use Scala’s for loop to iterate over the characters instead. This example prints each character on its own line:
scala> for (c <- "hello") println(c)
h
e
l
l
o
To write a for loop to work like a map method, add a yield statement to the end of the loop. This for/yield loop is equivalent to the first two map examples:
scala> val upper = for (c <- "hello, world") yield c.toUpper
upper: String = HELLO, WORLD
Adding yield to a for loop essentially places the result from each loop iteration into a temporary holding area. When the loop completes, all of the elements in the holding area are returned as a single collection.
This for/yield loop achieves the same result as the third map example:
val result = for {
    c <- "hello, world"
    if c != 'l'
} yield c.toUpper
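To make that equivalence concrete, here is a small sketch that puts the two forms side by side. The object and method names are hypothetical and chosen only for this illustration:

```scala
object ForYieldSketch {
  // for/yield with a guard
  def viaFor(s: String): String =
    for {
      c <- s
      if c != 'l'
    } yield c.toUpper

  // the same logic written as chained method calls
  def viaMethods(s: String): String =
    s.filter(_ != 'l').map(_.toUpper)

  def main(args: Array[String]): Unit = {
    println(viaFor("hello, world"))      // HEO, WORD
    println(viaMethods("hello, world"))  // HEO, WORD
  }
}
```

Which form reads better is largely a matter of taste. The for/yield form tends to scale more comfortably once there are several guards or several lines of processing.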
The ‘foreach’ method
Whereas the map
or for/yield approaches are used to transform one collection into another, the foreach
method is typically used to operate on each element without returning a result. This is useful for situations like printing:
scala> "hello".foreach(println) h e l l o
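Because foreach returns nothing useful (its result type is Unit), any work it does has to happen through a side effect. The following sketch, with hypothetical names, tallies vowels by updating a local var inside the foreach block:

```scala
object ForeachSketch {
  def countVowels(s: String): Int = {
    var count = 0
    // foreach returns Unit, so the result is built up by mutating count
    s.foreach { c =>
      if ("aeiou".indexOf(c.toLower) >= 0) count += 1
    }
    count
  }

  def main(args: Array[String]): Unit = {
    println(countVowels("hello, world"))  // 3
  }
}
```

For a simple tally like this, the count method on a String would be the more idiomatic choice. The point here is only to show the side-effecting style that foreach implies.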
Note: Having used Scala for a few years now, I can say that using map is the most common Scala idiom for use cases like this. Using for/yield is also common when there are multiple lines of processing to perform, and in my experience, the foreach method isn’t used that often in Scala. That being said, feel free to use whatever you're comfortable using.
Discussion
Because Scala treats a string as a sequence of characters, and because of Scala’s background as both an object-oriented and functional programming language, you can iterate over the characters in a string with the approaches shown. Compare those examples with a common Java approach:
String s = "Hello";
StringBuilder sb = new StringBuilder();
for (int i = 0; i < s.length(); i++) {
    char c = s.charAt(i);
    // do something with the character ...
    // sb.append ...
}
String result = sb.toString();
You’ll see that the Scala approach is more concise, but still very readable. This combination of conciseness and readability lets you focus on solving the problem at hand. Once you get comfortable with Scala, it feels like the imperative code in the Java example obscures your business logic (IMHO).
Imperative programming
Wikipedia describes imperative programming like this:
“Imperative programming is a programming paradigm that describes computation in terms of statements that change a program state ... imperative programs define sequences of commands for the computer to perform.”
This is shown in the Java example, which defines a series of explicit statements that tell a computer how to achieve a desired result.
Understanding how ‘map’ works
Depending on your coding preferences, you can pass large blocks of code to a map method. These two examples demonstrate the syntax for passing an algorithm to a map method:
// first example
"HELLO".map(c => (c.toByte+32).toChar)

// second example
"HELLO".map{ c =>
    (c.toByte+32).toChar
}
Notice that the algorithm operates on one Char at a time. This is because the map method in this example is called on a String, and map treats a String as a sequential collection of Char elements. The map method has an implicit loop, and in that loop, it passes one Char at a time to the algorithm it’s given.
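One consequence of map working on one Char at a time is that the type of the result depends on what the algorithm returns for each Char. The sketch below uses hypothetical names: a Char-to-Char function gives back a String, while a Char-to-Int function gives back a sequence of Int values:

```scala
object MapTypesSketch {
  // Char => Char: the result is still a String.
  def shiftByOne(s: String): String = s.map(c => (c + 1).toChar)

  // Char => Int: the result is a sequence of Int values, not a String.
  def charCodes(s: String): Seq[Int] = s.map(c => c.toInt)

  def main(args: Array[String]): Unit = {
    println(shiftByOne("HAL"))              // IBM
    println(charCodes("abc").mkString(",")) // 97,98,99
  }
}
```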
Although this algorithm is still short, imagine for a moment that it is longer. In this case, to keep your code clear, you might want to write it as a method (or function) that you can pass into the map method.
To write a method that you can pass into map to operate on the characters in a String, define it to take a single Char as input, then perform the logic on that Char inside the method. When the logic is complete, return whatever it is that your algorithm returns.
Though the following algorithm is still short, it demonstrates how to create a custom method and pass that method into map:
// write your own method that operates on a character
scala> def toLower(c: Char): Char = (c.toByte+32).toChar
toLower: (c: Char)Char

// use that method with map
scala> "HELLO".map(toLower)
res0: String = hello
As an added benefit, the same method also works with the for/yield approach:
scala> val s = "HELLO"
s: java.lang.String = HELLO

scala> for (c <- s) yield toLower(c)
res1: String = hello
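One caveat about the toLower method: it adds 32 to every character, so it only gives sensible results for the uppercase ASCII letters A through Z. As a hedged sketch, a variant (named toLowerAscii here purely for illustration) can leave every other character untouched:

```scala
object SafeToLowerSketch {
  // Shift only 'A' to 'Z'; any other character is returned unchanged.
  def toLowerAscii(c: Char): Char =
    if (c >= 'A' && c <= 'Z') (c + 32).toChar else c

  // Works with for/yield just like the original toLower method.
  def lower(s: String): String =
    for (c <- s) yield toLowerAscii(c)

  def main(args: Array[String]): Unit = {
    println(lower("Hello, World"))  // hello, world
  }
}
```

In real code you would normally reach for the built-in toLower method on Char (or toLowerCase on String). The hand-written version is only here to show how a custom method plugs into map and for/yield.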
Scala methods vs functions
I’ve used the word “method” in this discussion, but you can also use functions here instead of methods. What’s the difference between a method and a function?
Here’s a quick look at a function that’s equivalent to the toLower method shown:
val toLower = (c: Char) => (c.toByte+32).toChar
This function can be passed into map in the same way the previous toLower method was used:
scala> "HELLO".map(toLower)
res0: String = hello
For more information on functions and the differences between methods and functions, see Chapter 9 of the Scala Cookbook, Functional Programming.
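As a brief illustration of one practical difference (a sketch with hypothetical names, not a full treatment): a function is a value, so it can be stored in a val and has methods of its own, such as andThen, while a method has to be converted into a function before it can be used that way.

```scala
object MethodVsFunctionSketch {
  // A method, like the toLower example (only valid for uppercase ASCII input).
  def toLowerMethod(c: Char): Char = (c.toByte + 32).toChar

  // A function value created from the method; the type annotation
  // tells the compiler to perform the conversion.
  val toLowerFn: Char => Char = toLowerMethod

  // Because a function is an object, it has methods such as andThen.
  val lowerThenCode: Char => Int = toLowerFn.andThen(c => c.toInt)

  def main(args: Array[String]): Unit = {
    println("HELLO".map(toLowerFn))  // hello
    println(lowerThenCode('A'))      // 97
  }
}
```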
A complete example
The following example demonstrates how to call the getBytes method on a String, and then pass a block of code into a foreach method to help calculate an Adler-32 checksum value on a String:
package tests

/**
 * Calculate the Adler-32 checksum using Scala.
 * @see http://en.wikipedia.org/wiki/Adler-32
 */
object Adler32Checksum {

    val MOD_ADLER = 65521

    def main(args: Array[String]): Unit = {
        val sum = adler32sum("Wikipedia")
        printf("checksum (int) = %d\n", sum)
        printf("checksum (hex) = %s\n", sum.toHexString)
    }

    def adler32sum(s: String): Int = {
        var a = 1
        var b = 0
        s.getBytes.foreach { byte =>
            // mask with 0xff so bytes above 127 are treated as unsigned
            a = ((byte & 0xff) + a) % MOD_ADLER
            b = (b + a) % MOD_ADLER
        }
        // note: Int is 32 bits, which this requires
        b * 65536 + a     // or (b << 16) + a
    }

}
The getBytes method returns a sequential collection of bytes from a String, as shown here:
scala> "hello".getBytes
res0: Array[Byte] = Array(104, 101, 108, 108, 111)
Adding the foreach method call after getBytes lets you operate on each Byte value:
scala> "hello".getBytes.foreach(println)
104
101
108
108
111
You use foreach in this example instead of map, because the goal is to loop over each Byte in the String, and do something with each Byte, but you don’t want to return anything from the loop.
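If you want to sanity-check a hand-written checksum like this, one option is to compare it with the Adler32 class that ships with the JDK in java.util.zip. The sketch below is one way to do that, not the only one. The object and method names are made up for illustration, and it assumes UTF-8 encoding and returns a Long rather than an Int so the value always prints as a non-negative number:

```scala
import java.nio.charset.StandardCharsets
import java.util.zip.Adler32

object Adler32Check {
  val ModAdler = 65521

  // Hand-rolled variant of the foreach-based algorithm.
  def adler32sum(bytes: Array[Byte]): Long = {
    var a = 1
    var b = 0
    bytes.foreach { byte =>
      a = (a + (byte & 0xff)) % ModAdler  // treat each byte as unsigned
      b = (b + a) % ModAdler
    }
    (b.toLong << 16) | a
  }

  // Reference value computed by the JDK implementation.
  def jdkAdler32(bytes: Array[Byte]): Long = {
    val checksum = new Adler32()
    checksum.update(bytes)
    checksum.getValue
  }

  def main(args: Array[String]): Unit = {
    val bytes = "Wikipedia".getBytes(StandardCharsets.UTF_8)
    println(adler32sum(bytes).toHexString)  // 11e60398
    println(jdkAdler32(bytes).toHexString)  // 11e60398
  }
}
```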
See Also
- Under the covers, the Scala compiler translates a for loop into a foreach method call. This gets more complicated if the loop has one or more if statements (guards) or a yield expression. This is discussed in detail in Recipe 3.1 in the Scala Cookbook, “Looping with for and foreach.” The full details are presented in “For Comprehensions and For Loops” in Section 6.19 of the current Scala Language Specification.
- The Adler-32 checksum algorithm