You want to validate times in various traditional time formats, such as hh:mm and hh:mm:ss in both 12-hour and 24-hour formats.
Hours and minutes, 12-hour clock:
^(1[0-2]|0?[1-9]):([0-5]?[0-9])$
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
Hours and minutes, 24-hour clock:
^(2[0-3]|[01]?[0-9]):([0-5]?[0-9])$
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
Hours, minutes and seconds, 12-hour clock:
^(1[0-2]|0?[1-9]):([0-5]?[0-9]):([0-5]?[0-9])$
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
Hours, minutes and seconds, 24-hour clock:
^(2[0-3]|[01]?[0-9]):([0-5]?[0-9]):([0-5]?[0-9])$
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
The question marks in all of the preceding regular expressions make leading zeros optional. Remove the question marks to make leading zeros mandatory.
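As a rough illustration, the four regexes can be tried out in Python. The names TIME_PATTERNS and is_valid_time, and the dictionary keys, are hypothetical and chosen only for this sketch. One assumption to keep in mind: in Python, ‹$› also matches just before a trailing line break, so the sketch expects input that has already been stripped.

```python
import re

# The four regexes from this recipe, keyed by hypothetical labels.
TIME_PATTERNS = {
    "12h hh:mm": r"^(1[0-2]|0?[1-9]):([0-5]?[0-9])$",
    "24h hh:mm": r"^(2[0-3]|[01]?[0-9]):([0-5]?[0-9])$",
    "12h hh:mm:ss": r"^(1[0-2]|0?[1-9]):([0-5]?[0-9]):([0-5]?[0-9])$",
    "24h hh:mm:ss": r"^(2[0-3]|[01]?[0-9]):([0-5]?[0-9]):([0-5]?[0-9])$",
}


def is_valid_time(text, fmt):
    """Return True if text matches the regex registered under fmt."""
    return re.search(TIME_PATTERNS[fmt], text) is not None
```

With these definitions, is_valid_time("13:00", "24h hh:mm") is true, while the same input is rejected under "12h hh:mm".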
Validating times is considerably easier than validating dates.
Every hour has 60 minutes, and every minute has 60 seconds. This means
we don’t need any complicated alternations in the regex. For the
minutes and seconds, we don’t use alternation at all. ‹[0-5]?[0-9]›
matches a digit between 0 and 5, followed by a digit between 0 and 9.
This correctly matches any number between 0 and 59. The question mark
after the first character class makes it optional. This way, a single
digit between 0 and 9 is also accepted as a valid minute or second.
Remove the question mark if the first 10 minutes and seconds should be
written as 00 to 09. See Recipe 2.3 and Recipe 2.12 for details on
character classes and quantifiers such as the question mark.
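The effect of the question mark can be checked in isolation. The following Python sketch anchors the minutes-and-seconds part on its own; the names OPTIONAL_ZERO, MANDATORY_ZERO, and accepts are hypothetical and exist only for this illustration.

```python
import re

# Minutes/seconds part on its own, anchored so it must match the whole input.
OPTIONAL_ZERO = re.compile(r"^[0-5]?[0-9]$")   # leading zero optional
MANDATORY_ZERO = re.compile(r"^[0-5][0-9]$")   # question mark removed


def accepts(pattern, text):
    """Return True if the compiled pattern matches text."""
    return pattern.search(text) is not None
```

Under these assumptions, "7" is accepted only by OPTIONAL_ZERO, "07" and "59" are accepted by both, and "60" is rejected by both.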
For the hours, we do use alternation (see Recipe 2.8). The second digit
allows different ranges, depending on the first digit. On a 12-hour
clock, if the first digit is 0, the second digit must be between 1 and
9, but if the first digit is 1, the second digit must be 0, 1, or 2. In
a regular expression, we write this as ‹1[0-2]|0?[1-9]›. On a 24-hour
clock, if the first digit is 0 or 1, the second digit allows all 10
digits, but if the first digit is 2, the second digit must be between 0
and 3. In regex syntax, this can be expressed as ‹2[0-3]|[01]?[0-9]›.
Again, the question mark allows the single-digit hours to be written
without a leading zero. Remove the question mark to require two digits.
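To see which hours each alternation accepts, the hour part can be anchored on its own and tested against a range of numbers. This is only a sketch: the names HOUR_12, HOUR_24, and matching_hours are hypothetical, and the noncapturing group ‹(?:…)› is an addition needed so that the anchors apply to both alternatives.

```python
import re

# Hour alternations from the recipe, wrapped in a noncapturing group so the
# anchors apply to both alternatives.
HOUR_12 = re.compile(r"^(?:1[0-2]|0?[1-9])$")
HOUR_24 = re.compile(r"^(?:2[0-3]|[01]?[0-9])$")


def matching_hours(pattern):
    """Return the integers 0..29 whose unpadded decimal form matches."""
    return [n for n in range(30) if pattern.search(str(n))]
```

Here matching_hours(HOUR_12) yields 1 through 12 and matching_hours(HOUR_24) yields 0 through 23.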
We put parentheses around the parts of the regex that match the hours, minutes, and seconds. That makes it easy to retrieve the digits for the hours, minutes, and seconds without the colons. Recipe 2.9 explains how parentheses create capturing groups. Recipe 3.9 explains how you can retrieve the text matched by those capturing groups in procedural code.
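As a minimal sketch of retrieving the captured digits in Python, the following uses the 24-hour regex with seconds. The names TIME_24H_SECONDS and split_time are hypothetical, and converting the groups to integers is a choice made for this example rather than something the recipe prescribes.

```python
import re

TIME_24H_SECONDS = re.compile(
    r"^(2[0-3]|[01]?[0-9]):([0-5]?[0-9]):([0-5]?[0-9])$"
)


def split_time(text):
    """Return (hours, minutes, seconds) as integers, or None if invalid."""
    match = TIME_24H_SECONDS.search(text)
    if match is None:
        return None
    # groups() returns the text captured by groups 1, 2, and 3, in order.
    return tuple(int(part) for part in match.groups())
```

For example, split_time("16:08:42") returns (16, 8, 42), and split_time("25:00:00") returns None.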
The parentheses around the hour part keep the two alternatives for the hour together. If you remove those parentheses, the regex won’t work correctly, because the alternation then splits the whole regex in two rather than just the hour. Removing the parentheses around the minutes and seconds has no effect, other than making it impossible to retrieve their digits separately.
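The following Python sketch shows what goes wrong when the hour parentheses are dropped from the 24-hour regex. The names WITH_GROUP, WITHOUT_GROUP, and matched_text are hypothetical.

```python
import re

WITH_GROUP = re.compile(r"^(2[0-3]|[01]?[0-9]):([0-5]?[0-9])$")
# Hour parentheses removed: the alternation now splits the whole regex into
# "^2[0-3]" and "[01]?[0-9]:([0-5]?[0-9])$".
WITHOUT_GROUP = re.compile(r"^2[0-3]|[01]?[0-9]:([0-5]?[0-9])$")


def matched_text(pattern, text):
    """Return the overall match as a string, or None if there is no match."""
    match = pattern.search(text)
    return match.group(0) if match else None
```

With the group in place, "23:99" and "x9:30" are both rejected. Without it, the first alternative matches just 23 at the start of "23:99", and the second alternative matches 9:30 at the end of "x9:30".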
If you want to search for times in larger bodies of text instead
of checking whether the input as a whole is a time, you cannot use the
anchors ‹^› and ‹$›. Merely removing the anchors from the regular
expression is not the right solution. That would allow the hour and
minute regexes to match 12:12 within 9912:1299, for instance. Instead of
anchoring the regex match to the start and end of the subject, you
have to specify that the time cannot be part of longer sequences of
digits.
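The problem with simply dropping the anchors can be reproduced in a few lines of Python. The names UNANCHORED and first_match are hypothetical, used only for this sketch.

```python
import re

# 24-hour hours-and-minutes regex with both anchors removed.
UNANCHORED = re.compile(r"(2[0-3]|[01]?[0-9]):([0-5]?[0-9])")


def first_match(text):
    """Return the first match found anywhere in text, or None."""
    match = UNANCHORED.search(text)
    return match.group(0) if match else None
```

Calling first_match("9912:1299") returns "12:12", which is a false positive: the digits are part of two longer numbers.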
This is easily done with a pair of word boundaries. In regular expressions, digits are treated as characters
that can be part of words. Replace both ‹^› and ‹$› with ‹\b›. As an example:
\b(2[0-3]|[01]?[0-9]):([0-5]?[0-9])\b
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
Word boundaries don’t disallow everything; they only disallow
letters, digits, and underscores. The regex just shown, which matches
hours and minutes on a 24-hour clock, matches 16:08 within the subject
text The time is 16:08:42 sharp. The space is not a word character,
whereas the 1 is, so the word boundary matches between them. The 8 is a
word character, whereas the colon isn’t, so ‹\b› also matches between
those two.
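A short Python sketch of the word-boundary version, run against the subject texts discussed here. The names WORD_BOUNDED and find_times are hypothetical.

```python
import re

WORD_BOUNDED = re.compile(r"\b(2[0-3]|[01]?[0-9]):([0-5]?[0-9])\b")


def find_times(text):
    """Return the full text of every nonoverlapping match in text."""
    return [match.group(0) for match in WORD_BOUNDED.finditer(text)]
```

The word boundaries reject 9912:1299 entirely, but find_times("The time is 16:08:42 sharp") still returns ["16:08"], because the colon after the 8 does not count as a word character.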
If you want to disallow colons as well as word characters, you
need to use lookaround (see Recipe 2.16).
The following regex will not match any part of The time is 16:08:42
sharp. It only works with flavors that support lookbehind:
(?<![:\w])(2[0-3]|[01]?[0-9]):([0-5]?[0-9])(?![:\w])
Regex options: None
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby 1.9
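The lookaround version can be sketched in Python, which supports lookbehind as long as the lookbehind has a fixed length, as the single character class here does. The names ISOLATED and find_isolated_times are hypothetical.

```python
import re

# Negative lookbehind and lookahead reject a colon or word character
# immediately before or after the hh:mm match.
ISOLATED = re.compile(
    r"(?<![:\w])(2[0-3]|[01]?[0-9]):([0-5]?[0-9])(?![:\w])"
)


def find_isolated_times(text):
    """Return every hh:mm match not adjacent to a colon or word character."""
    return [match.group(0) for match in ISOLATED.finditer(text)]
```

With this regex, find_isolated_times("The time is 16:08:42 sharp") returns an empty list, while "Meet at 16:08 sharp" still yields ["16:08"].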