Predefined shorthand character classes

As the preceding examples show, certain character classes, such as digits [0-9] or word characters [0-9A-Za-z_], appear in most regex patterns. Java, like most regular expression flavors, provides convenient predefined shorthands for these character classes. Here is the list:

Shorthand Class    Meaning                                          Character Class
\d                 A digit, 0-9                                     [0-9]
\D                 A non-digit                                      [^\d]
\w                 A word character                                 [a-zA-Z0-9_]
\W                 A non-word character                             [^\w]
\s                 A whitespace character, including line breaks    [ \t\r\n\f\x0B]
\S                 A non-whitespace character                       [^\s]
\h                 A horizontal whitespace character                [ \t\xA0\u1680\u180e\u2000-\u200a\u202f\u205f\u3000]
\H                 A character that is not horizontal whitespace    [^\h]
...
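
To make the table concrete, here is a minimal sketch that exercises several of these shorthand classes through the standard String methods matches, replaceAll, and split. The class name ShorthandDemo and its helper methods are hypothetical, chosen only for illustration. Note that each backslash is doubled because the pattern is written as a Java string literal.

```java
import java.util.Arrays;

// Hypothetical demo class; the names below are illustrative only.
class ShorthandDemo {

    // \d+ : true when the whole string consists of one or more digits
    static boolean isAllDigits(String s) {
        return s.matches("\\d+");
    }

    // \w+ : true when the whole string consists of letters, digits, or underscores
    static boolean isWord(String s) {
        return s.matches("\\w+");
    }

    // \s+ : replace every run of whitespace (line breaks included) with one space
    static String collapseWhitespace(String s) {
        return s.replaceAll("\\s+", " ");
    }

    // \D : remove every character that is not a digit
    static String stripNonDigits(String s) {
        return s.replaceAll("\\D", "");
    }

    // \h+ : split on runs of horizontal whitespace only; line breaks are kept
    static String[] splitOnHorizontalWhitespace(String s) {
        return s.split("\\h+");
    }

    public static void main(String[] args) {
        System.out.println(isAllDigits("2024"));
        System.out.println(isWord("user_42"));
        System.out.println(collapseWhitespace("a \t\n b"));
        System.out.println(stripNonDigits("(555) 010-9999"));
        System.out.println(Arrays.toString(splitOnHorizontalWhitespace("a \tb\u00A0c")));
    }
}
```

The last helper shows where \h differs from \s: a no-break space (\u00A0) counts as horizontal whitespace, while a line feed does not, so a string such as "a\nb" would come back from the split as a single element.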
