Why isn't the Java matches method working? (Java pattern matching)

Java Matcher problem: You're using the matches method of the Java Matcher class to test a string against a regular expression (regex) you have defined, the match fails, and you don't know why.

Solution: The important thing to remember about the Java matches method is that your regular expression must match the entire input string, not just part of it. Specifically, a regex pattern like the following one will not work with the matches method when the input is a longer line of text:

" year "

However, by modifying this pattern to look like this:

".*year.*"

it will work with the matches method.

Two possible solutions

There are actually two possible solutions to this problem:

  1. Modify your regex pattern to match the entire String, and keep using the matches method.
  2. Modify your Java program to use the find method of the Matcher class instead of using the matches method.

Here's some source code for a complete Java class that demonstrates the matches method, showing both (a) the wrong way and (b) the correct way to define a regex pattern for use with the matches method:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Demonstrates a common error when attempting to use the
 * "matches" method.
 * 
 * The matches() method attempts to match the entire input 
 * sequence against the pattern, so if you're going to try
 * to use it, your pattern must match the entire string.
 * 
 */
public class PatternMatcherMatchesError
{
  public static void main(String[] args)
  {
    // the string we want to search
    String stringToSearch = "Four score and seven years ago our fathers ...";
    
    // search for this simple pattern
    String patternToSearchFor = " year ";          // doesn't match the entire string
    //String patternToSearchFor = ".*year.*";      // does match the entire string

    // set everything up
    Pattern p = Pattern.compile(patternToSearchFor, Pattern.CASE_INSENSITIVE);
    Matcher m = p.matcher(stringToSearch);
    
    // now see if we find a match
    if (m.matches())
      System.out.println("Found a match");
    else
      System.out.println("Did not find a match");
  }
}

With the first pattern (the one that is not commented out), this program prints:

Did not find a match

If you switch to the second pattern, ".*year.*", the program prints "Found a match" instead.

matches() method - Discussion

The Matcher class Javadoc states, "The matches method attempts to match the entire input sequence against the pattern." Therefore, your pattern must account for every character of the input, from the first to the last.
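
Here is a minimal sketch of what that means in practice. The class name MatchesWholeInputDemo and the helper method matchesWhole are made up for this illustration. Only Pattern.compile, matcher, and matches come from the JDK:

```java
import java.util.regex.Pattern;

public class MatchesWholeInputDemo
{
  // hypothetical helper: true only if the regex matches the ENTIRE input
  static boolean matchesWhole(String regex, String input)
  {
    return Pattern.compile(regex).matcher(input).matches();
  }

  public static void main(String[] args)
  {
    // the pattern covers the whole input
    System.out.println(matchesWhole("year", "year"));                  // true

    // the pattern occurs in the input, but does not cover all of it
    System.out.println(matchesWhole("year", "seven years ago"));       // false

    // the leading and trailing .* absorb the rest of the input
    System.out.println(matchesWhole(".*year.*", "seven years ago"));   // true
  }
}
```

Note that by default the dot does not match line terminators, so the ".*year.*" approach assumes single-line input.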

The find() method solution

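As mentioned earlier, another approach is to use the find method of the Matcher class. The find method looks for the pattern anywhere in the input, so it does not require your pattern to match the entire String. One caveat with the sample program above: the pattern " year " includes a leading and a trailing space, and the sample text contains "years", so that exact pattern does not occur anywhere in the input and find will not match it either. Use "year" without the surrounding spaces as the pattern, and then replace this line of code above: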

if (m.matches())

with this line of code:

if (m.find())
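
For reference, here is a sketch of the find approach as a complete class. It reuses the sample string from the program above. The class name PatternMatcherFindExample and the helper method indexOfFirstMatch are assumptions made for this example, not part of the original program:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternMatcherFindExample
{
  // hypothetical helper: index where the first match starts, or -1 if none
  static int indexOfFirstMatch(String regex, String input)
  {
    Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(input);
    // find() scans the input for the next subsequence that matches the pattern
    return m.find() ? m.start() : -1;
  }

  public static void main(String[] args)
  {
    // the string we want to search
    String stringToSearch = "Four score and seven years ago our fathers ...";

    // no leading or trailing .* needed when using find()
    int index = indexOfFirstMatch("year", stringToSearch);

    if (index >= 0)
      System.out.println("Found a match at index " + index);
    else
      System.out.println("Did not find a match");
  }
}
```

With the pattern "year" this prints "Found a match at index 21". With the original pattern " year " (spaces included) the helper returns -1, because the text contains "years" rather than " year ".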

Comments

Thanks, this was very helpful, and I appreciate the source code for a complete class I can test with.

Coming from a background with Perl, this implementation of regular expressions by the Java development team just seems plain wrong to me. I would say that in most cases the developer would NOT want to match an entire string. Indeed, almost all the examples of where I've used this in Java have required partial matches.

Knowing a little about C regex processing (which the runtime is most probably using), then the regex "^.*abc.*$" (which is what Java forces you to use) is less efficient than the regex "abc".

Was there any discussion of this with the Java community before Sun decided to use this particular implementation?

Thanks for the comment. Unfortunately I don't know much about their discussion process ... I really don't know if this functionality came out of their JSR process (which is open), or if it preceded that.

I do know that for several years there were several competing third-party regex tools. I can't think of their names, but I believe "ORO" was one of them. I'd hope that from that competition they took what everyone thought was the best approach, but really, I don't know.

The very first sentence of your solution resolved all my confusion!
