Java: How to extract a group from a String that matches a regex pattern

Java pattern problem: In a Java program, you want to determine whether a String contains text that matches a regular expression (regex) pattern, and then you want to extract the group of characters from the String that matched your pattern.

Solution: Use the Java Pattern and Matcher classes. Compile your regular expression (regex) with Pattern.compile, create a Matcher for your String with the matcher method, call the find method of the Matcher class to see if there is a match, and then call the group method to extract the actual group of characters from the String that matches your regular expression.

This is demonstrated in the following Java program:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternMatcherGroup1
{
  public static void main(String[] args)
  {
    String stringToSearch = "Four score and seven years ago our fathers ...";

    Pattern p = Pattern.compile(" (\\S+or\\S+) ");   // the pattern to search for
    Matcher m = p.matcher(stringToSearch);

    // if we find a match, get the group 
    if (m.find())
    {
      // we're only looking for one group, so get it
      String theGroup = m.group(1);
      
      // print the group out for verification
      System.out.format("'%s'\n", theGroup);
    }

  }
}

The output from this program is:

'score'

Discussion

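In this example, the regex " (\\S+or\\S+) " was intended to match the string of characters score in the input string, which it did. (The backslashes are doubled only because the pattern is written as a Java String literal; the regex itself is \S+or\S+ wrapped in parentheses, with a blank space on each side.) If you're not familiar with regular expressions, that pattern can be read as, "Find a blank space, followed by one or more non-whitespace characters, followed by 'or', followed by one or more non-whitespace characters, followed by one more blank space. Also, remember everything between the parentheses as a 'group'". Because the two blank spaces sit outside the parentheses, they are part of the match but not part of the group, which is why the output is 'score' with no surrounding spaces.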
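To make that reading of the pattern more concrete, here is a small sketch that runs the same regex against a few short inputs. The class name, the firstGroup helper, and the test strings are my own additions for illustration; they are not part of the original program.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternReadingSketch
{
  // hypothetical helper: returns the first captured group, or null if the pattern is not found
  static String firstGroup(String input)
  {
    Matcher m = Pattern.compile(" (\\S+or\\S+) ").matcher(input);
    return m.find() ? m.group(1) : null;
  }

  public static void main(String[] args)
  {
    // "score" has characters on both sides of "or" and a space on both sides of the word
    System.out.println(firstGroup("Four score and seven"));   // score

    // "story" also fits: "st" + "or" + "y"
    System.out.println(firstGroup("a story for you"));        // story

    // "for" has nothing after the "or", so the second \S+ cannot match
    System.out.println(firstGroup("this is for you"));        // null

    // "score" at the very start of the String has no blank space in front of it
    System.out.println(firstGroup("score and seven"));        // null
  }
}
```

The last two cases show that the pattern is stricter than "any word containing 'or'": the word needs at least one character on each side of "or", and it needs a blank space both before and after it.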

The parentheses are special characters that provide this grouping functionality for us. Everything enclosed between the parentheses is treated as one group, which we can later extract from our String using the group method. Groups are numbered starting at 1, so group(1) returns the text matched by the first (and here, only) pair of parentheses.
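As a rough illustration of what the parentheses change, the following sketch prints both the whole match and the group for the same pattern and input. It relies on the fact that group(0) on a Matcher returns the entire matched text; the class name and the wholeMatchAndGroup helper are hypothetical names I made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupNumbersSketch
{
  // hypothetical helper: returns { whole match, group 1 }, or null when there is no match
  static String[] wholeMatchAndGroup(String input)
  {
    Matcher m = Pattern.compile(" (\\S+or\\S+) ").matcher(input);
    if (!m.find())
    {
      return null;
    }
    // group(0) is everything the pattern matched; group(1) is only what the parentheses captured
    return new String[] { m.group(0), m.group(1) };
  }

  public static void main(String[] args)
  {
    String[] parts = wholeMatchAndGroup("Four score and seven years ago our fathers ...");
    System.out.format("whole match: '%s'%n", parts[0]);   // ' score '
    System.out.format("group 1:     '%s'%n", parts[1]);   // 'score'
  }
}
```

The whole match includes the two blank spaces, while group 1 does not, because the spaces are outside the parentheses in the pattern.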

As one final note, in a more complicated example, our pattern could easily have matched our input String several times. Each call to find only moves to the next match, so this simple code, which calls find once, would have to be modified to call it repeatedly. I'll demonstrate that technique in other tutorials on this blog.
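Until then, here is a minimal sketch of one way to handle several matches: call find in a while loop and collect group(1) each time. The class name, the allGroups helper, and the longer input sentence are assumptions I've added for illustration, so treat this as one possible approach rather than the definitive one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternMatcherAllGroupsSketch
{
  // hypothetical helper: collects group 1 from every match in the input
  static List<String> allGroups(String input)
  {
    Matcher m = Pattern.compile(" (\\S+or\\S+) ").matcher(input);
    List<String> groups = new ArrayList<>();

    // each call to find() looks for the next match after the previous one
    while (m.find())
    {
      groups.add(m.group(1));
    }
    return groups;
  }

  public static void main(String[] args)
  {
    String stringToSearch =
        "Four score and seven years ago our fathers brought forth on this continent ...";

    System.out.println(allGroups(stringToSearch));   // [score, forth]
  }
}
```

One caveat with this particular pattern: each match consumes the blank space on both sides of the word, so two matching words that sit right next to each other share a single space, and the second one would be skipped. In the sentence above the two matches are far apart, so that doesn't come up.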
