Scala regex FAQ: How can I match a regex pattern against an entire string in Scala?
This morning I needed to write a little Scala code to make sure a String completely matches a regex pattern. I started off by creating a Scala Regex instance, and then realized that the Regex class, at least in the Scala version I was using, doesn't have a simple method that returns true or false depending on whether a String completely matches a pattern.
In my case I wanted to make sure the String I was given contained only four characters, and those characters could be letters and numbers only, i.e., a four-character alphanumeric string.
Cutting to the chase, the easy solution to the problem is to remember that a Scala String is a Java String, so you can just use the matches method of the Java String class, which returns true only when the entire string matches the pattern. The following examples in the Scala REPL show my approach:
scala> "!123".matches("[a-zA-Z0-9]{4}")
res0: Boolean = false

scala> "!1 ".matches("[a-zA-Z0-9]{4}")
res1: Boolean = false

scala> "ab12".matches("[a-zA-Z0-9]{4}")
res2: Boolean = true

scala> "34Az".matches("[a-zA-Z0-9]{4}")
res3: Boolean = true

scala> "123".matches("[a-zA-Z0-9]{4}")
res4: Boolean = false
As you can see, with this regex pattern the String must consist only of letters and digits, and it must be exactly four characters long. Anything shorter, longer, or containing another character returns false. Problem solved.
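If you run this check in many places, one option is to wrap it in a small helper. The following is only a sketch: the object name FourCharCheck and the method name isFourCharAlphanumeric are names I made up for illustration, and the null check is my own assumption about how you would want to treat missing input. It precompiles the pattern with java.util.regex.Pattern, because String.matches recompiles the regex on every call.

```scala
import java.util.regex.Pattern

// Hypothetical helper; the names here are illustrative, not from any library.
object FourCharCheck {
  // Same pattern as in the REPL session, compiled once and reused.
  private val FourAlphanumerics: Pattern = Pattern.compile("[a-zA-Z0-9]{4}")

  // Returns true only if the whole string is exactly four letters/digits.
  // Assumption: a null input is treated as "does not match".
  def isFourCharAlphanumeric(s: String): Boolean =
    s != null && FourAlphanumerics.matcher(s).matches()
}
```

Calling FourCharCheck.isFourCharAlphanumeric("ab12") should give the same answer as "ab12".matches("[a-zA-Z0-9]{4}").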
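Note: You can also use Regex methods like findFirstIn and findPrefixOf, but they look for a match anywhere in the String (or at its start) rather than requiring the whole String to match, and they return an Option (Some/None), which requires more work in this case. More recent Scala releases (the 2.13 series and later, as far as I know) add a matches method to Regex itself, so check the Regex documentation for the version you are using.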
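For comparison, here is a rough sketch of what staying with the Scala Regex class can look like. The object and method names (RegexFullMatch, viaExtractor, viaFindFirstIn) are hypothetical names of my own. The first method relies on the fact that a Regex used as an extractor in a match expression has to match the entire input; the second shows why findFirstIn on its own is not a whole-string test.

```scala
import scala.util.matching.Regex

// Hypothetical names, for illustration only.
object RegexFullMatch {
  val FourAlnum: Regex = "[a-zA-Z0-9]{4}".r

  // A Regex extractor succeeds only when the whole input matches the pattern.
  def viaExtractor(s: String): Boolean = s match {
    case FourAlnum() => true
    case _           => false
  }

  // findFirstIn succeeds if the pattern occurs anywhere in the input,
  // so longer strings that merely contain four alphanumerics also pass.
  def viaFindFirstIn(s: String): Boolean =
    FourAlnum.findFirstIn(s).isDefined
}
```

With these definitions, viaExtractor("ab123") is false because the String has five characters, while viaFindFirstIn("ab123") is true because "ab12" is found inside it.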
Another example: A regex pattern to match URI patterns
As another example of this technique, here’s a regex pattern I just created and I’m testing to match against URI patterns, which I plan to use for a blogging application I’m creating:
scala> val r = "[0-9a-zA-Z-_]*"
r: String = [0-9a-zA-Z-_]*

scala> "foo-bar".matches(r)
res0: Boolean = true

scala> "foo- bar".matches(r)
res1: Boolean = false

scala> "foo-123".matches(r)
res2: Boolean = true

scala> "/blog/foo-123".matches(r)
res3: Boolean = false

scala> "/blog/foo/123 ".matches(r)
res4: Boolean = false

scala> "/blog/foo/123".matches(r)
res5: Boolean = false

The first line shows the regex pattern I created ("[0-9a-zA-Z-_]*"), and the rest of the lines show the tests I'm looking at. In my world, URIs should only contain the characters shown in the pattern: 0-9, a-z, A-Z, - and _. Because neither the space character nor / is in that character class, any String containing one of them fails to match. To accept full paths such as /blog/foo-123, the / character would have to be added to the class.
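Two details of that pattern are worth keeping in mind: the * quantifier also accepts the empty String, and the character class contains no /. The sketch below shows one possible way to tighten it up. The names UriPatterns, isSlug, and isPath are hypothetical, and the choice of + (at least one character) is my own assumption about what a blogging application would want. The hyphen is placed last in each class so that it is unambiguously a literal character rather than part of a range.

```scala
// Hypothetical helper object; names and rules are illustrative assumptions.
object UriPatterns {
  // One or more of 0-9, a-z, A-Z, underscore, hyphen (hyphen last = literal).
  val SlugPattern = "[0-9a-zA-Z_-]+"

  // Same characters plus '/', for whole paths such as /blog/foo-123.
  val PathPattern = "[0-9a-zA-Z_/-]+"

  // True only if the entire string is a non-empty slug.
  def isSlug(s: String): Boolean = s.matches(SlugPattern)

  // True only if the entire string is a non-empty path of allowed characters.
  def isPath(s: String): Boolean = s.matches(PathPattern)
}
```

Here isSlug("foo-123") and isPath("/blog/foo-123") both return true, while isSlug("/blog/foo-123"), isSlug(""), and isPath("/blog/foo/123 ") return false.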