Extract multiple groups from a Java String using Pattern and Matcher

Problem: In a Java program, you need a way to extract multiple groups (substrings matched by parts of a regular expression) from a given String.

Solution: Use the Java Pattern and Matcher classes, and define the regular expression (regex) you need when creating your Pattern object. Put each part of the regex you want to extract inside grouping parentheses, so you can retrieve the actual text that matches each group from the String.

Example: How to extract multiple regex patterns from a String

In the following source code example I demonstrate how to extract two groups from a given String:

import java.util.regex.Matcher;
import java.util.regex.Pattern;


/**
 * Demonstrates how to extract multiple "groups" from a given string
 * using regular expressions and the Pattern and Matcher classes.
 * 
 * Note: "\\S" means "A non-whitespace character".
 * @see http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/Pattern.html
 */
public class PatternMatcherGroupMultiple
{
  public static void main(String[] args)
  {
    String stringToSearch = "Four score and seven years ago our fathers ...";

    // specify that we want to search for two groups in the string
    Pattern p = Pattern.compile(" (\\S+or\\S+) .* (\\S+the\\S+).*");
    Matcher m = p.matcher(stringToSearch);

    // if our pattern matches the string, we can try to extract our groups
    if (m.find())
    {
      // get the two groups we were looking for
      String group1 = m.group(1);
      String group2 = m.group(2);
      
      // print the groups, with a wee bit of formatting
      System.out.format("'%s', '%s'\n", group1, group2);
    }

  }
}

With these two groups in the regex, the output from this program is:

'score', 'fathers'

Discussion

The first group, (\\S+or\\S+), matches the word "score", and the second group, (\\S+the\\S+), matches the word "fathers". These two groups are extracted from the input String with these lines of code:

String group1 = m.group(1);
String group2 = m.group(2);

and are then printed with System.out.format.
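
When a pattern contains more than two groups, the same calls generalize to a loop. The following is a minimal sketch, not part of the original program: the class name GroupLoopDemo and the helper method allGroups are hypothetical names introduced for illustration. It relies on Matcher.groupCount(), which returns the number of capturing groups in the pattern, and on the fact that group numbering starts at 1 (group 0 is the entire match).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, used only to illustrate looping over groups.
class GroupLoopDemo
{
  // Returns every captured group of the first match, or an empty list if there is no match.
  static List<String> allGroups(String regex, String input)
  {
    List<String> groups = new ArrayList<>();
    Matcher m = Pattern.compile(regex).matcher(input);
    if (m.find())
    {
      // groups are numbered from 1; group(0) would be the whole match
      for (int i = 1; i <= m.groupCount(); i++)
      {
        groups.add(m.group(i));
      }
    }
    return groups;
  }

  public static void main(String[] args)
  {
    String stringToSearch = "Four score and seven years ago our fathers ...";
    // prints the list of captured groups
    System.out.println(allGroups(" (\\S+or\\S+) .* (\\S+the\\S+).*", stringToSearch));
  }
}
```

With the same pattern and input as above, the returned list holds "score" followed by "fathers".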

It’s also important to note that the find method will only succeed if the whole pattern, including both groups, is found in the String. Because the two groups are part of a single regex, find will return false if only one of them can be matched. You can test this on your own system by changing one of the groups to intentionally cause it to fail.
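
One way to try this is sketched below. It runs the original pattern and a deliberately broken variant against the same String. The class name FindFailureDemo and the helper method extractGroups are hypothetical names introduced here; the broken variant replaces "the" with "xyz", a sequence chosen only because it does not occur in the input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: shows find() failing when one group cannot match.
class FindFailureDemo
{
  // Returns the two captured groups, or null if the pattern as a whole is not found.
  static String[] extractGroups(String regex, String input)
  {
    Matcher m = Pattern.compile(regex).matcher(input);
    if (!m.find())
    {
      return null;
    }
    return new String[] { m.group(1), m.group(2) };
  }

  // Formats the result the same way the original example prints it.
  static String describe(String[] groups)
  {
    return groups == null ? "no match" : String.format("'%s', '%s'", groups[0], groups[1]);
  }

  public static void main(String[] args)
  {
    String stringToSearch = "Four score and seven years ago our fathers ...";

    // the original pattern: both groups can be matched
    System.out.println(describe(extractGroups(" (\\S+or\\S+) .* (\\S+the\\S+).*", stringToSearch)));

    // the second group now looks for "xyz", which is not in the string,
    // so find() fails even though the first group could still match "score"
    System.out.println(describe(extractGroups(" (\\S+or\\S+) .* (\\S+xyz\\S+).*", stringToSearch)));
  }
}
```

The first call prints 'score', 'fathers' and the second prints no match, because the matcher never reports a partial result for just one of the groups.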
