This is an excerpt from the 1st Edition of the Scala Cookbook (partially modified for the internet). This is Recipe 1.7, “Finding Patterns in Scala Strings.”
Problem
You need to determine whether a Scala String contains a regular expression pattern.
Solution
Create a Regex object by invoking the .r method on a String, and then use that pattern with findFirstIn when you’re looking for one match, and findAllIn when looking for all matches.
To demonstrate this, first create a Regex for the pattern you want to search for, in this case a sequence of one or more numeric characters:
scala> val numPattern = "[0-9]+".r
numPattern: scala.util.matching.Regex = [0-9]+
Next, create a sample String you can search:
scala> val address = "123 Main Street Suite 101"
address: java.lang.String = 123 Main Street Suite 101
The findFirstIn method finds the first match:
scala> val match1 = numPattern.findFirstIn(address)
match1: Option[String] = Some(123)
(Notice that this method returns an Option[String]. I’ll dig into that in the Discussion.)
When looking for multiple matches, use the findAllIn method:
scala> val matches = numPattern.findAllIn(address)
matches: scala.util.matching.Regex.MatchIterator = non-empty iterator
As you can see, findAllIn returns an iterator, which lets you loop over the results:
scala> matches.foreach(println)
123
101
If findAllIn doesn’t find any results, it returns an empty iterator, so you can still write your code just like that. You don’t need to check whether the result is null.
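As a minimal sketch of that behavior (the object name FindAllInDemo and its field names are assumptions made for illustration), the following shows that a search with no matches produces an empty result rather than null. It also shows a related point worth keeping in mind: the iterator is single-pass, so traversing the same iterator a second time yields nothing.

```scala
object FindAllInDemo {
  val numPattern = "[0-9]+".r

  // No digits in this string, so findAllIn returns an empty iterator, not null.
  val noMatches: List[String] = numPattern.findAllIn("No address given").toList

  // An iterator can only be traversed once.
  val matches = numPattern.findAllIn("123 Main Street Suite 101")
  val firstPass: List[String] = matches.toList   // consumes the iterator
  val secondPass: List[String] = matches.toList  // already exhausted
}
```

If you need to walk the results more than once, either call findAllIn again or convert the iterator to a collection first.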
If you’d rather have the results as an Array, add the toArray method after the findAllIn call:
scala> val matches = numPattern.findAllIn(address).toArray
matches: Array[String] = Array(123, 101)
If there are no matches, this approach yields an empty Array. Other methods like toList, toSeq, and toVector are also available.
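The other conversions work the same way. The sketch below uses an illustrative object name (ConvertMatchesDemo) and also shows one possible follow-up step: the matches come back as strings, so converting them to numbers is a separate step that you add yourself.

```scala
object ConvertMatchesDemo {
  val numPattern = "[0-9]+".r
  val address = "123 Main Street Suite 101"

  // Each call to findAllIn creates a fresh iterator over the same input.
  val asList: List[String] = numPattern.findAllIn(address).toList
  val asVector: Vector[String] = numPattern.findAllIn(address).toVector

  // The matches are still strings; convert them if you need numbers.
  // (toInt assumes each match fits in an Int, which holds for this input.)
  val asInts: List[Int] = asList.map(_.toInt)
}
```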
Discussion
Using the .r method on a String is the easiest way to create a Regex object. Another approach is to import the Regex class, create a Regex instance, and then use the instance in the same way:
scala> import scala.util.matching.Regex
import scala.util.matching.Regex
scala> val numPattern = new Regex("[0-9]+")
numPattern: scala.util.matching.Regex = [0-9]+
scala> val address = "123 Main Street Suite 101"
address: java.lang.String = 123 Main Street Suite 101
scala> val match1 = numPattern.findFirstIn(address)
match1: Option[String] = Some(123)
Although this is a bit more work, it’s also more obvious. I’ve found that it can be easy to overlook the .r at the end of a String (and then spend a few minutes wondering how the code I saw could possibly work).
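One practical note that goes slightly beyond the recipe’s own examples: when a pattern contains a backslash, a triple-quoted string lets you write it without escaping. The sketch below assumes that \d+ is an acceptable stand-in for [0-9]+ on plain ASCII input like this address. The object and field names are illustrative only.

```scala
import scala.util.matching.Regex

object RegexCreationDemo {
  // In a regular string literal the backslash must be escaped.
  val escaped: Regex = "\\d+".r

  // In a triple-quoted string it can be written as-is.
  val raw: Regex = """\d+""".r

  val address = "123 Main Street Suite 101"
  val fromEscaped: Option[String] = escaped.findFirstIn(address)
  val fromRaw: Option[String] = raw.findFirstIn(address)
}
```

Both literals produce the same pattern text, so the two Regex instances behave identically.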
Handling the Option returned by findFirstIn
As mentioned in the Solution, the findFirstIn method finds the first match in the String and returns an Option[String]:
scala> val match1 = numPattern.findFirstIn(address)
match1: Option[String] = Some(123)
The Option/Some/None pattern is discussed in detail in Recipe 20.6, but the simple way to think about an Option is that it’s a container that holds either zero or one values. In the case of findFirstIn, if it succeeds, it returns the string “123” as a Some(123), as shown in this example. However, if it fails to find the pattern in the string it’s searching, it will return a None, as shown here:
scala> val address = "No address given"
address: String = No address given
scala> val match1 = numPattern.findFirstIn(address)
match1: Option[String] = None
To summarize, a method defined to return an Option[String] will either return a Some(String), or a None.
The normal way to work with an Option is to use one of these approaches:
- Use the Option in a match expression
- Use the Option in a foreach loop
- Call getOrElse on the value
Recipe 20.6 describes those approaches in detail, but they’re demonstrated here for your convenience.
A match expression provides a very readable solution to the problem, and is generally the preferred solution, especially by functional programmers, who routinely take advantage of pattern-matching:
match1 match {
    case Some(s) => println(s"Found: $s")
    case None =>   // no match, so there is nothing to do
}
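Because match is an expression, it can also produce a value instead of printing. Here is a minimal sketch, where describe is a hypothetical helper introduced only for this illustration:

```scala
object MatchExpressionDemo {
  val numPattern = "[0-9]+".r

  // Hypothetical helper: turn the Option from findFirstIn into a message.
  def describe(input: String): String =
    numPattern.findFirstIn(input) match {
      case Some(s) => s"Found: $s"
      case None    => "No match"
    }
}
```

Calling MatchExpressionDemo.describe("123 Main Street Suite 101") returns “Found: 123”, while an input without digits returns “No match”.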
Because an Option is a collection of zero or one elements, an experienced Scala developer will also use a foreach loop in this situation:
numPattern.findFirstIn(address).foreach { e =>
    // perform the next step in your algorithm,
    // operating on the value 'e'
}
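To make the “zero or one” behavior concrete, the sketch below records every value the foreach body receives. The buffer named seen and the object name are assumptions for illustration. The body runs once for a Some and not at all for a None.

```scala
import scala.collection.mutable.ListBuffer

object OptionForeachDemo {
  val numPattern = "[0-9]+".r
  val seen = ListBuffer.empty[String]

  // Some("123"): the body runs exactly once.
  numPattern.findFirstIn("123 Main Street Suite 101").foreach(e => seen += e)

  // None: the body is skipped entirely.
  numPattern.findFirstIn("No address given").foreach(e => seen += e)
}
```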
With the getOrElse approach you attempt to “get” the result, while also specifying a default value that should be used if the method failed. This example assumes address once again holds the original “123 Main Street Suite 101” value:
scala> val result = numPattern.findFirstIn(address).getOrElse("no match")
result: String = 123
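The sketch below exercises both outcomes and shows one common variation: applying map to transform the value before falling back to a default. The helper names firstNumberOr and firstNumberAsInt are hypothetical, and the fallback of 0 is an arbitrary choice for the example.

```scala
object GetOrElseDemo {
  val numPattern = "[0-9]+".r

  // Hypothetical helper: the first number in the input, or a fallback string.
  def firstNumberOr(input: String, default: String): String =
    numPattern.findFirstIn(input).getOrElse(default)

  // map transforms the value inside a Some and leaves a None untouched.
  // (toInt assumes the matched digits fit in an Int.)
  def firstNumberAsInt(input: String): Int =
    numPattern.findFirstIn(input).map(_.toInt).getOrElse(0)
}
```

With the original address, firstNumberOr returns “123”. With “No address given”, it returns whatever default you passed in.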
See Recipe 20.6 for more information.
Summary
To summarize this approach, the following REPL example shows the complete process of creating a Regex, searching a String with findFirstIn, and then using a foreach loop on the resulting match:
scala> val numPattern = "[0-9]+".r
numPattern: scala.util.matching.Regex = [0-9]+
scala> val address = "123 Main Street Suite 101"
address: String = 123 Main Street Suite 101
scala> val match1 = numPattern.findFirstIn(address)
match1: Option[String] = Some(123)
scala> match1.foreach { e =>
| println(s"Found a match: $e")
| }
Found a match: 123
See Also
- The Scala StringOps class
- The Scala Regex class
- Recipe 20.6, “Using Scala’s Option/Some/None Pattern,” provides more information on Option, which was shown in the findFirstIn example