Scala: How to match a regex pattern against an entire String

Scala regex FAQ: How can I match a regex pattern against an entire string in Scala?

This morning I needed to write a little Scala code to make sure a String completely matches a regex pattern. I started off by creating a Scala Regex instance, and then realized the Regex class didn’t have a simple method to determine whether a String completely matches a pattern. (Scala 2.13 later added a matches method to Regex; the approach below works on any version.)

In my case I wanted to make sure the String I was given contained only four characters, and those characters could be letters and numbers only, i.e., a four-character alphanumeric string.

Cutting to the chase, the easy solution to the problem is to remember that Scala is built on top of Java, so just use the matches method of the Java String class. The following examples in the Scala REPL show my approach:

scala> "!123".matches("[a-zA-Z0-9]{4}")
res0: Boolean = false

scala> "!1  ".matches("[a-zA-Z0-9]{4}")
res1: Boolean = false

scala> "ab12".matches("[a-zA-Z0-9]{4}")
res2: Boolean = true

scala> "34Az".matches("[a-zA-Z0-9]{4}")
res3: Boolean = true

scala> "123".matches("[a-zA-Z0-9]{4}")
res4: Boolean = false

As you can see, with this regex pattern the String must be exactly four characters long and contain only letters and digits. Problem solved.
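If the check runs often, it may be worth compiling the pattern once, because String.matches compiles its regex argument on every call. The following is a minimal sketch of that idea, not the only way to do it; the names FourCharCheck and isFourCharAlnum are made up for illustration:

```scala
import java.util.regex.Pattern

object FourCharCheck {
  // Compiled once and reused; the name FourAlnum is hypothetical.
  private val FourAlnum: Pattern = Pattern.compile("[a-zA-Z0-9]{4}")

  // True only when the entire string matches, the same
  // whole-string semantics as String.matches.
  def isFourCharAlnum(s: String): Boolean =
    FourAlnum.matcher(s).matches()
}
```

Calling FourCharCheck.isFourCharAlnum("ab12") gives the same answer as "ab12".matches("[a-zA-Z0-9]{4}").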

Note: You can also use Regex methods like findFirstIn and findPrefixOf, but they look for a match anywhere in the String (or at its start) rather than against the whole String, and they return an Option (Some/None), which requires more work in this case.
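To make the difference concrete, here is a small sketch that compares findFirstIn with a whole-string check on the same Regex. The object and method names (PartialVsFull, firstHit, fullMatch) are assumptions made for this example:

```scala
import scala.util.matching.Regex

object PartialVsFull {
  val FourAlnum: Regex = "[a-zA-Z0-9]{4}".r

  // findFirstIn searches for the pattern anywhere in the input
  // and wraps the first hit in an Option.
  def firstHit(s: String): Option[String] = FourAlnum.findFirstIn(s)

  // Whole-string check through the java.util.regex.Pattern
  // that every Scala Regex exposes as `pattern`.
  def fullMatch(s: String): Boolean =
    FourAlnum.pattern.matcher(s).matches()
}
```

With the input "ab12!!", firstHit finds the "ab12" prefix and returns a Some, while fullMatch returns false because the two trailing characters are not part of the match.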

Another example: A regex pattern to match URI patterns

As another example of this technique, here’s a regex pattern I just created and am testing against URI patterns for a blogging application I’m building:

scala> val r = "[0-9a-zA-Z-_]*"
r: String = [0-9a-zA-Z-_]*

scala> "foo-bar".matches(r)
res0: Boolean = true

scala> "foo- bar".matches(r)
res1: Boolean = false

scala> "foo-123".matches(r)
res2: Boolean = true

scala> "/blog/foo-123".matches(r)
res3: Boolean = false

scala> "/blog/foo/123 ".matches(r)
res4: Boolean = false

scala> "/blog/foo/123".matches(r)
res5: Boolean = false

The first line shows the regex pattern I created, "[0-9a-zA-Z-_]*", and the rest of the lines show the tests I’m looking at. In my world, URIs should only contain the characters shown in the pattern: 0-9, a-z, A-Z, - and _. Because / is not one of those characters, any String containing a slash fails the match, so this pattern works for a single path segment such as "foo-123" but not for a full path.
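The same check can also be written with a Scala Regex used as an extractor, since a Regex in a pattern match is anchored by default and must match the whole String. This is only a sketch under a couple of assumptions: the names SlugCheck, Slug, and isSlug are hypothetical, and the hyphen is moved to the end of the character class so it cannot be misread as the start of a range:

```scala
import scala.util.matching.Regex

object SlugCheck {
  // Same character set as the pattern above, with '-' placed last.
  val Slug: Regex = "[0-9a-zA-Z_-]*".r

  def isSlug(s: String): Boolean = s match {
    case Slug() => true   // succeeds only if the whole string matches
    case _      => false
  }
}
```

Note that because the pattern uses *, the empty String also matches; switching to + would require at least one character.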
