2.1. Match Literal Text
Problem
Create a regular expression to exactly match this gloriously contrived sentence:
The punctuation characters in the ASCII table are: !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
This is intended to show which characters have special meaning in regular expressions, and which characters always match themselves literally.
Solution
This regular expression matches the sentence stated in the problem:
The●punctuation●characters●in●the●ASCII●table●are:●↵
!"#\$%&'\(\)\*\+,-\./:;<=>\?@\[\\]\^_`\{\|}~
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
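As a quick check, the solution can be tried in Python, one of the flavors listed. This is a minimal sketch, not part of the recipe itself. It assumes that ● in the printed regex stands for a literal space and that ↵ marks a print-only line wrap, so the regex is a single line. The names `pattern` and `sentence` are illustrative.

```python
import re
import string

# Assumption: ● is a literal space and ↵ is only a line wrap on the page,
# so the two printed lines form one regex with a single space after "are:".
pattern = re.compile(
    r"""The punctuation characters in the ASCII table are: """
    r"""!"#\$%&'\(\)\*\+,-\./:;<=>\?@\[\\]\^_`\{\|}~"""
)

# string.punctuation holds the same 32 ASCII punctuation characters,
# in the same order as the problem statement.
sentence = (
    "The punctuation characters in the ASCII table are: "
    + string.punctuation
)

# fullmatch succeeds only if the regex consumes the entire sentence.
print(pattern.fullmatch(sentence) is not None)
```

Using `fullmatch` rather than `search` mirrors the "exactly match" wording of the problem: a sentence with a character missing or altered is rejected.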
Discussion
Any regular expression that does not include any of the dozen characters $()*+.?[\^{| simply matches itself. To find whether Mary had a little lamb in the text you’re editing, simply search for ‹Mary●had●a●little●lamb›. It doesn’t matter whether the “regular expression” checkbox is turned on in your text editor.
The 12 punctuation characters that make regular expressions work their magic are called metacharacters. If you want your regex to match them literally, you need to escape them by placing a backslash in front of them. Thus, the regex ‹\$\(\)\*\+\.\?\[\\\^\{\|› matches the text $()*+.?[\^{|.
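The escaping rule is mechanical enough to sketch in code. The Python example below builds the escaped regex by prefixing each of the twelve metacharacters with a backslash, then compares the result with Python's `re.escape` helper. The names `METACHARACTERS` and `escaped` are illustrative. Note that `re.escape` also escapes some characters beyond these twelve, so it agrees with the hand-built version only for input made up of the twelve.

```python
import re

# The twelve metacharacters from the discussion: $ ( ) * + . ? [ \ ^ { |
METACHARACTERS = "$()*+.?[\\^{|"

# Escape each character by placing a backslash in front of it.
escaped = "".join("\\" + ch for ch in METACHARACTERS)

# The hand-built regex is the one shown in the text.
print(escaped == r"\$\(\)\*\+\.\?\[\\\^\{\|")

# The escaped regex matches the twelve characters literally.
print(re.fullmatch(escaped, METACHARACTERS) is not None)

# For these twelve characters, re.escape produces the same result.
print(re.escape(METACHARACTERS) == escaped)
```

In practice a helper like `re.escape` is the safer choice when the literal text comes from user input, since it saves you from maintaining the list by hand.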
Notably absent from the list are the closing square bracket ], the hyphen -, and the closing curly bracket }. The first two become metacharacters only after an unescaped [, and the } only after an unescaped {. There’s no need to ever escape }. Metacharacter rules for the blocks that appear between ...
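The behavior of the closing bracket, the hyphen, and the closing brace can be checked with a few small patterns. This is a sketch in Python only. The other flavors listed in the recipe may differ in details, and the variable names are illustrative.

```python
import re

# With no unescaped [ or { before them, these three match themselves.
plain = [re.fullmatch(ch, ch) is not None for ch in "]-}"]

# After an unescaped [, the hyphen builds a range and ] closes the class.
range_hit = re.fullmatch(r"[a-c]", "b") is not None        # b lies in a..c
literal_hyphen = re.fullmatch(r"[a\-c]", "-") is not None  # class is a, -, c
range_gone = re.fullmatch(r"[a\-c]", "b") is None          # no range any more

# After an unescaped {, the } ends a quantifier. Escaping only the {
# makes both braces literal, so the } itself needs no backslash.
quantifier = re.fullmatch(r"a{2}", "aa") is not None
literal_braces = re.fullmatch(r"a\{2}", "a{2}") is not None

print(plain, range_hit, literal_hyphen, range_gone, quantifier, literal_braces)
```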