This is an excerpt from the 1st Edition of the Scala Cookbook (partially modified for the internet). This is Recipe 1.7, “Finding Patterns in Scala Strings.”
Problem
You need to determine whether a Scala String contains a regular expression pattern.
Solution
Create a Regex object by invoking the .r method on a String, and then use that pattern with findFirstIn when you’re looking for one match, and findAllIn when looking for all matches.
To demonstrate this, first create a Regex for the pattern you want to search for, in this case a sequence of one or more numeric characters:
scala> val numPattern = "[0-9]+".r
numPattern: scala.util.matching.Regex = [0-9]+
Next, create a sample String you can search:
scala> val address = "123 Main Street Suite 101"
address: java.lang.String = 123 Main Street Suite 101
The findFirstIn method finds the first match:
scala> val match1 = numPattern.findFirstIn(address)
match1: Option[String] = Some(123)
(Notice that this method returns an Option[String]. I’ll dig into that in the Discussion.)
When looking for multiple matches, use the findAllIn method:
scala> val matches = numPattern.findAllIn(address)
matches: scala.util.matching.Regex.MatchIterator = non-empty iterator
As you can see, findAllIn returns an iterator, which lets you loop over the results:
scala> matches.foreach(println)
123
101
If findAllIn doesn’t find any results, it returns an empty iterator, so you can still write your code just like that; you don’t need to check whether the result is null.
If you’d rather have the results as an Array, add the toArray method after the findAllIn call:
scala> val matches = numPattern.findAllIn(address).toArray
matches: Array[String] = Array(123, 101)
If there are no matches, this approach yields an empty Array. Other methods like toList, toSeq, and toVector are also available.
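As a minimal sketch of the conversions just described, the following wraps findAllIn in a small helper and shows both the matching and the non-matching case. The object name FindAllInSketch and the helper numbersIn are hypothetical names chosen for illustration; they are not part of the recipe.

```scala
object FindAllInSketch {
  private val numPattern = "[0-9]+".r

  // Hypothetical helper: returns every run of digits in `text` as a List.
  // When nothing matches, the iterator is empty and so is the List.
  def numbersIn(text: String): List[String] =
    numPattern.findAllIn(text).toList

  val found: List[String] = numbersIn("123 Main Street Suite 101")
  val notFound: List[String] = numbersIn("No address given")

  // The same kind of iterator can be converted to other collection types.
  val asVector: Vector[String] =
    numPattern.findAllIn("123 Main Street Suite 101").toVector
}
```

Converting to a List or Vector is also a reasonable choice when you need to traverse the results more than once, since a collection can be reused freely.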
Discussion
Using the .r method on a String is the easiest way to create a Regex object. Another approach is to import the Regex class, create a Regex instance, and then use the instance in the same way:
scala> import scala.util.matching.Regex
import scala.util.matching.Regex
scala> val numPattern = new Regex("[0-9]+")
numPattern: scala.util.matching.Regex = [0-9]+
scala> val address = "123 Main Street Suite 101"
address: java.lang.String = 123 Main Street Suite 101
scala> val match1 = numPattern.findFirstIn(address)
match1: Option[String] = Some(123)
Although this is a bit more work, it’s also more obvious. I’ve found that it can be easy to overlook the .r at the end of a String (and then spend a few minutes wondering how the code I saw could possibly work).
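To make the equivalence of the two approaches concrete, here is a small sketch that builds the same pattern both ways and compares the results. The object and value names (RegexCreationSketch, viaR, viaConstructor) are assumptions made for this illustration.

```scala
import scala.util.matching.Regex

object RegexCreationSketch {
  // The same pattern, created with .r and with the Regex constructor.
  val viaR: Regex = "[0-9]+".r
  val viaConstructor: Regex = new Regex("[0-9]+")

  val address = "123 Main Street Suite 101"

  // Both instances hold the same pattern string...
  val samePattern: Boolean = viaR.regex == viaConstructor.regex

  // ...and both produce the same search result.
  val sameResult: Boolean =
    viaR.findFirstIn(address) == viaConstructor.findFirstIn(address)
}
```

Adding an explicit type annotation such as `val viaR: Regex` is one way to make the .r version harder to overlook when reading code.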
Handling the Option returned by findFirstIn
As mentioned in the Solution, the findFirstIn method finds the first match in the String and returns an Option[String]:
scala> val match1 = numPattern.findFirstIn(address)
match1: Option[String] = Some(123)
The Option/Some/None pattern is discussed in detail in Recipe 20.6, but the simple way to think about an Option is that it’s a container that holds either zero or one values. In the case of findFirstIn, if it succeeds, it returns the string “123” as a Some(123), as shown in this example. However, if it fails to find the pattern in the string it’s searching, it will return a None, as shown here:
scala> val address = "No address given"
address: String = No address given
scala> val match1 = numPattern.findFirstIn(address)
match1: Option[String] = None
To summarize, a method defined to return an Option[String] will return either a Some that contains a String, or a None.
The normal way to work with an Option is to use one of these approaches:
- Use the Option in a match expression
- Use the Option in a foreach loop
- Call getOrElse on the value
Recipe 20.6 describes those approaches in detail, but they’re demonstrated here for your convenience.
A match expression provides a very readable solution to the problem, and is generally the preferred solution, especially by functional programmers, who routinely take advantage of pattern-matching:
match1 match {
    case Some(s) => println(s"Found: $s")
    case None =>  // no match was found, so do nothing
}
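A match expression can also return a value instead of printing, which makes both branches explicit. The following is a sketch under that assumption; the object MatchExpressionSketch and the helper describeFirstNumber are hypothetical names, and the "No match" message is an arbitrary choice for the None case.

```scala
object MatchExpressionSketch {
  private val numPattern = "[0-9]+".r

  // Hypothetical helper: turns the Option from findFirstIn into a message.
  def describeFirstNumber(text: String): String =
    numPattern.findFirstIn(text) match {
      case Some(s) => s"Found: $s" // a match exists; s is the matched text
      case None    => "No match"   // the pattern does not occur in text
    }
}
```

For example, `MatchExpressionSketch.describeFirstNumber("123 Main Street Suite 101")` yields "Found: 123", while passing "No address given" yields "No match".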
Because an Option is a collection of zero or one elements, an experienced Scala developer will also use a foreach loop in this situation:
numPattern.findFirstIn(address).foreach { e =>
    // perform the next step in your algorithm,
    // operating on the value 'e'
}
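The sketch below illustrates how foreach behaves for both Some and None, and adds map, a related Option method that the recipe does not cover, for the case where you want a transformed value back rather than a side effect. The object and helper names are hypothetical.

```scala
import scala.collection.mutable.ListBuffer

object OptionForeachSketch {
  private val numPattern = "[0-9]+".r

  // foreach runs its body once for a Some and not at all for a None.
  // Hypothetical helper: records the first number, if there is one.
  def recordFirstNumber(text: String): List[String] = {
    val seen = ListBuffer.empty[String]
    numPattern.findFirstIn(text).foreach(e => seen += e)
    seen.toList
  }

  // map transforms the value inside a Some and leaves a None unchanged.
  // Note: toInt can throw for a digit run too large to fit in an Int.
  def firstNumberAsInt(text: String): Option[Int] =
    numPattern.findFirstIn(text).map(_.toInt)
}
```

Because the body of foreach is skipped for a None, no explicit emptiness check is needed in either helper.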
With the getOrElse approach you attempt to “get” the result, while also specifying a default value to use if no match is found. Assuming address is again "123 Main Street Suite 101", the first match is returned rather than the default:
scala> val result = numPattern.findFirstIn(address).getOrElse("no match")
result: String = 123
See Recipe 20.6 for more information.
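To show both outcomes of getOrElse side by side, here is a brief sketch. The object GetOrElseSketch and the helper firstNumberOrDefault are hypothetical names; the default string "no match" is the one used in the example above.

```scala
object GetOrElseSketch {
  private val numPattern = "[0-9]+".r

  // Hypothetical helper: returns the first number in `text`,
  // or the default string when the pattern is not found.
  def firstNumberOrDefault(text: String): String =
    numPattern.findFirstIn(text).getOrElse("no match")

  val withNumber: String = firstNumberOrDefault("123 Main Street Suite 101")
  val withoutNumber: String = firstNumberOrDefault("No address given")
}
```

Here withNumber is "123" and withoutNumber is "no match". One design point to keep in mind is that the default must have a type compatible with the value inside the Option, which is why a String is used here.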
Summary
To summarize this approach, the following REPL example shows the complete process of creating a Regex, searching a String with findFirstIn, and then using a foreach loop on the resulting match:
scala> val numPattern = "[0-9]+".r
numPattern: scala.util.matching.Regex = [0-9]+
scala> val address = "123 Main Street Suite 101"
address: String = 123 Main Street Suite 101
scala> val match1 = numPattern.findFirstIn(address)
match1: Option[String] = Some(123)
scala> match1.foreach { e =>
| println(s"Found a match: $e")
| }
Found a match: 123
See Also
- The Scala StringOps class
- The Scala Regex class
- Recipe 20.6, “Using Scala’s Option/Some/None Pattern,” provides more information on Option, which was shown in the findFirstIn example