Sometimes you need to know more than just whether an RE matched an
input string. In editors and many other tools, you will want to know
exactly what characters were matched. Remember that with multipliers
such as *, the length of the text that was matched may have no
relationship to the length of the pattern that matched it. Do not
underestimate the mighty .*, which will happily match thousands or
millions of characters if allowed to. As you can see from looking at
the API, you can find out whether a given match succeeds just by using
match( ), as we’ve done up to now. But it may be more useful to get a
description of what it matched by using one of the getParen( ) methods.
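To make the point about pattern length versus match length concrete, here is a minimal sketch. It assumes the standard java.util.regex package rather than the RE class used in this recipe, and the class and method names (GreedyLength, matchedText) are made up for illustration. Matcher.find( ) is used because it looks for a match anywhere in the input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; uses java.util.regex, not the RE class from the recipe.
class GreedyLength {
    // Returns the text matched by the whole pattern, or null if there is no match.
    static String matchedText(String patt, String input) {
        Matcher m = Pattern.compile(patt).matcher(input);
        return m.find() ? m.group(0) : null;
    }

    public static void main(String[] args) {
        String patt = "<.*>";                  // only four characters of pattern
        String input = "a <b>bold</b> move";
        String hit = matchedText(patt, input); // greedy .* runs to the last '>'
        System.out.println(patt.length() + " pattern chars matched "
            + hit.length() + " input chars: " + hit);
        // prints: 4 pattern chars matched 11 input chars: <b>bold</b>
    }
}
```

The greedy .* extends the match to the last > on the line, so the matched text is nearly three times as long as the pattern itself.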
The notion of parentheses is central to RE processing. REs may be
nested to any level of complexity. The getParen( )
methods let you retrieve whatever matched at a given parenthesis
level. If you haven’t used any explicit parens, you can just
treat whatever matched as “level zero.” For example:
// Part of REmatch.java
String patt = "Q[^u]\\d+\\.";
RE r = new RE(patt);
String line = "Order QT300. Now!";
if (r.match(line)) {
    // Level zero is the text matched by the entire pattern
    System.out.println(patt + " matches '" +
        r.getParen(0) +
        "' in '" + line + "'");
}
When run, this prints:
Q[^u]\d+\. matches 'QT300.' in 'Order QT300. Now!'
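The idea of parenthesis levels may be easier to see with nested groups. The following is a sketch only, again assuming java.util.regex instead of the RE class. The extra parentheses added to the pattern and the names NestedGroups and groups are illustrative assumptions, not part of the recipe. In java.util.regex, groups are numbered by the position of their opening parenthesis, and group 0 is the whole match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; uses java.util.regex, not the RE class from the recipe.
class NestedGroups {
    // Returns every group of the first match, index 0 being the whole match.
    // Returns an empty array if the pattern does not match.
    static String[] groups(String patt, String input) {
        Matcher m = Pattern.compile(patt).matcher(input);
        if (!m.find()) {
            return new String[0];
        }
        String[] out = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            out[i] = m.group(i);
        }
        return out;
    }

    public static void main(String[] args) {
        // Group 1 wraps groups 2 and 3; the trailing \. is outside all groups.
        String patt = "(Q([^u])(\\d+))\\.";
        String[] g = groups(patt, "Order QT300. Now!");
        for (int i = 0; i < g.length; i++) {
            System.out.println("group " + i + " = '" + g[i] + "'");
        }
        // prints:
        // group 0 = 'QT300.'
        // group 1 = 'QT300'
        // group 2 = 'T'
        // group 3 = '300'
    }
}
```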
It is also possible to get the starting and ending indexes and the
length of the text that the pattern matched (remember that \d+ can
match any number of digits in the input). You can use these in
conjunction with the String.substring( ) methods as follows:
// Part of REsubstr.java -- Prints exactly the same as REmatch.java
if (r.match(line)) {
    // Cut the matched text out of the input using its start and end indexes
    System.out.println(patt + " matches '" +
        line.substring(r.getParenStart(0), r.getParenEnd(0)) +
        "' in '" + line + "'");
}
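For comparison, the same index arithmetic can be sketched with java.util.regex, which is an assumption here since the recipe itself uses the RE class. Matcher.start( ) and Matcher.end( ) give the offsets of the match, with the end offset exclusive, so the length is simply their difference. The names MatchSpan and span are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; uses java.util.regex, not the RE class from the recipe.
class MatchSpan {
    // Returns {start, end} of the first match (end is exclusive), or null if none.
    static int[] span(String patt, String input) {
        Matcher m = Pattern.compile(patt).matcher(input);
        return m.find() ? new int[] { m.start(), m.end() } : null;
    }

    public static void main(String[] args) {
        String patt = "Q[^u]\\d+\\.";
        String line = "Order QT300. Now!";
        int[] s = span(patt, line);
        if (s != null) {
            System.out.println("start=" + s[0] + " end=" + s[1]
                + " length=" + (s[1] - s[0]));
            System.out.println("text='" + line.substring(s[0], s[1]) + "'");
        }
        // prints:
        // start=6 end=12 length=6
        // text='QT300.'
    }
}
```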
Suppose you need to extract several items from a string. If the input is:
Smith, John
Adams, John Quincy
and you want to get out:
John Smith
John Quincy Adams
just use:
// from REmatchTwoFields.java
// Construct an RE with parens to "grab" both field1 and field2
RE r = new RE("(.*), (.*)");
if (!r.match(inputLine))
    throw new IllegalArgumentException("Bad input: " + inputLine);
System.out.println(r.getParen(2) + ' ' + r.getParen(1));
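The fragment above handles one input line. A self-contained sketch that processes both sample lines might look like the following. It assumes java.util.regex in place of the RE class, and the names SwapNames and swap are invented for this example. Matcher.matches( ) is used so that the pattern must account for the entire line.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; uses java.util.regex, not the RE class from the recipe.
class SwapNames {
    // Group 1 is the text before ", " (last name); group 2 is the rest.
    private static final Pattern NAME = Pattern.compile("(.*), (.*)");

    // Turns "Last, First" into "First Last"; rejects lines without ", ".
    static String swap(String inputLine) {
        Matcher m = NAME.matcher(inputLine);
        if (!m.matches()) {
            throw new IllegalArgumentException("Bad input: " + inputLine);
        }
        return m.group(2) + ' ' + m.group(1);
    }

    public static void main(String[] args) {
        String[] lines = { "Smith, John", "Adams, John Quincy" };
        for (String line : lines) {
            System.out.println(swap(line));
        }
        // prints:
        // John Smith
        // John Quincy Adams
    }
}
```

Because the first .* is greedy, a line containing more than one ", " would be split at the last one. Whether that is the right behavior depends on the data.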