You want to validate times in various traditional time formats, such as hh:mm and hh:mm:ss in both 12-hour and 24-hour formats.
Hours and minutes, 12-hour clock:
^(1[0-2]|0?[1-9]):([0-5]?[0-9])(●?[AP]M)?$
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
Hours and minutes, 24-hour clock:
^(2[0-3]|[01]?[0-9]):([0-5]?[0-9])$
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
Hours, minutes, and seconds, 12-hour clock:
^(1[0-2]|0?[1-9]):([0-5]?[0-9]):([0-5]?[0-9])(●?[AP]M)?$
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
Hours, minutes, and seconds, 24-hour clock:
^(2[0-3]|[01]?[0-9]):([0-5]?[0-9]):([0-5]?[0-9])$
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
The question marks that follow the digit character classes in the preceding regular expressions make leading zeros optional. Remove those question marks to make leading zeros mandatory.
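The four regexes above can be tried out with a short Python sketch. The dictionary keys and the `is_valid_time` helper are names invented for this illustration, not part of the recipe, and the ● symbol is written as a literal space.

```python
import re

# The four regexes from this recipe; the ● in the text stands for a literal space.
# The keys ("hm12", "hm24", ...) are hypothetical labels chosen for this sketch.
TIME_PATTERNS = {
    "hm12": re.compile(r"^(1[0-2]|0?[1-9]):([0-5]?[0-9])( ?[AP]M)?$"),
    "hm24": re.compile(r"^(2[0-3]|[01]?[0-9]):([0-5]?[0-9])$"),
    "hms12": re.compile(r"^(1[0-2]|0?[1-9]):([0-5]?[0-9]):([0-5]?[0-9])( ?[AP]M)?$"),
    "hms24": re.compile(r"^(2[0-3]|[01]?[0-9]):([0-5]?[0-9]):([0-5]?[0-9])$"),
}


def is_valid_time(text, kind):
    """Return True if text matches the recipe regex labeled by kind."""
    return TIME_PATTERNS[kind].match(text) is not None


print(is_valid_time("9:05 PM", "hm12"))   # valid 12-hour time
print(is_valid_time("24:00", "hm24"))     # hour 24 is out of range
```

One Python-specific caveat: without the multiline option, `$` also matches just before a single trailing newline, so `"9:05\n"` would pass. If that matters, strip the input first or use `fullmatch` with the anchors removed.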
Validating times is considerably easier than validating dates.
Every hour has 60 minutes, and every minute has 60 seconds. This means
we don’t need any complicated alternations in the regex. For the minutes
and seconds, we don’t use alternation at all. ‹[0-5]?[0-9]› matches a digit between 0 and 5,
followed by a digit between 0 and 9. This correctly matches any number
between 0 and 59. The question mark after the first character class
makes it optional. This way, a single digit between 0 and 9 is also
accepted as a valid minute or second. Remove the question mark if the
first 10 minutes and seconds should be written as 00 to 09. See Recipes
2.3 and 2.12 for details on character classes and quantifiers such as the question
mark.
For the hours, we do need to use alternation (see Recipe 2.8). The second digit allows different
ranges, depending on the first digit. On a 12-hour clock, if the first
digit is 0, the second digit can be any digit from 1 to 9 (there is no hour 00), but if the first
digit is 1, the second digit must be 0, 1, or 2. In a regular
expression, we write this as ‹1[0-2]|0?[1-9]›. On a 24-hour clock, if the first
digit is 0 or 1, the second digit allows all 10 digits, but if the first
digit is 2, the second digit must be between 0 and 3. In regex syntax,
this can be expressed as ‹2[0-3]|[01]?[0-9]›. Again, the question mark
allows the first 10 hours to be written with a single digit. Whether
you’re working with a 12- or 24-hour clock, remove the question mark to
require two digits.
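To see which values the minute and hour subpatterns accept, the following Python sketch tests them in isolation against a list of candidate strings. The `accepted` helper and the variable names are made up for this illustration, and `fullmatch` is used so each candidate must match in its entirety.

```python
import re

# Subpatterns taken from the regexes in this recipe.
MINUTE = re.compile(r"[0-5]?[0-9]")
HOUR_12 = re.compile(r"1[0-2]|0?[1-9]")
HOUR_24 = re.compile(r"2[0-3]|[01]?[0-9]")


def accepted(pattern, candidates):
    """Return the candidates that the pattern matches in full (hypothetical helper)."""
    return [s for s in candidates if pattern.fullmatch(s)]


# Candidates: 0..29 without padding, plus 00..09 with a leading zero.
hour_candidates = [str(n) for n in range(30)] + [f"{n:02d}" for n in range(10)]

minutes = accepted(MINUTE, [str(n) for n in range(100)])
hours_12 = accepted(HOUR_12, hour_candidates)
hours_24 = accepted(HOUR_24, hour_candidates)

print(minutes[-1])                    # highest accepted minute
print("00" in hours_12, "00" in hours_24)
```

Under these assumptions, the 12-hour pattern rejects both `0` and `00`, while the 24-hour pattern accepts them.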
We put parentheses around the parts of the regex that match the hours, minutes, and seconds. That makes it easy to retrieve the digits for the hours, minutes, and seconds, without the colons. Recipe 2.9 explains how parentheses create capturing groups. Recipe 3.9 explains how you can retrieve the text matched by those capturing groups in procedural code.
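As a rough illustration of retrieving the captured digits, here is a Python sketch built on the 24-hour hh:mm:ss regex. The `parse_time` function is a hypothetical helper for this example; Recipe 3.9 covers the equivalent steps in other languages.

```python
import re

# Hours, minutes, and seconds on a 24-hour clock, as given in this recipe.
TIME_24 = re.compile(r"^(2[0-3]|[01]?[0-9]):([0-5]?[0-9]):([0-5]?[0-9])$")


def parse_time(text):
    """Return (hours, minutes, seconds) as integers, or None if text is not a valid time."""
    match = TIME_24.match(text)
    if match is None:
        return None
    # Groups 1, 2, and 3 hold the hour, minute, and second digits without the colons.
    return tuple(int(group) for group in match.groups())


print(parse_time("16:08:42"))
print(parse_time("24:00:00"))
```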
The parentheses around the hour part keep the two alternatives for the hour together. If you remove those parentheses, the alternation operator splits the whole regex in two, so it no longer works correctly. Removing the parentheses around the minutes and seconds has no effect, other than making it impossible to retrieve their digits separately.
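The effect of dropping the hour parentheses can be checked with a small Python sketch. The sample strings and variable names are invented for this illustration. Without the group, the regex reads as "‹^2[0-3]›, or ‹[01]?[0-9]:([0-5]?[0-9])$›", so each half carries only one of the anchors.

```python
import re

# The recipe's 24-hour regex, and the same regex with the hour parentheses removed.
GROUPED = re.compile(r"^(2[0-3]|[01]?[0-9]):([0-5]?[0-9])$")
UNGROUPED = re.compile(r"^2[0-3]|[01]?[0-9]:([0-5]?[0-9])$")

# None of these sample strings is a valid time.
samples = ["23 skidoo", "x9:30", "23:99"]

grouped_hits = [s for s in samples if GROUPED.search(s)]
ungrouped_hits = [s for s in samples if UNGROUPED.search(s)]

print(grouped_hits)     # the grouped regex rejects all three
print(ungrouped_hits)   # the ungrouped regex wrongly accepts all three
```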
On a 12-hour clock, we allow the time to be followed by AM or PM.
We also allow a space between the time and the AM/PM indicator. ‹[AP]M› matches AM or PM. ‹●?› matches an optional space. ‹(●?[AP]M)?› groups the space and the
indicator, and makes them optional as one unit. We don’t use ‹●?([AP]M)?› because that would allow a
space even when the indicator is omitted.
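The difference between the two groupings shows up only on input that ends in a stray space. A minimal Python sketch, with pattern names made up for this example and ● written as a literal space:

```python
import re

# Space and indicator optional as one unit (the form used in this recipe).
SUFFIX_GROUPED = re.compile(r"^(1[0-2]|0?[1-9]):([0-5]?[0-9])( ?[AP]M)?$")
# Space and indicator optional independently (the form the recipe avoids).
SUFFIX_LOOSE = re.compile(r"^(1[0-2]|0?[1-9]):([0-5]?[0-9]) ?([AP]M)?$")

for text in ["9:30", "9:30PM", "9:30 PM", "9:30 "]:
    print(repr(text),
          SUFFIX_GROUPED.match(text) is not None,
          SUFFIX_LOOSE.match(text) is not None)
```

Both patterns agree on the first three inputs. Only the loose variant accepts `"9:30 "` with its trailing space.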
If you want to search for times in larger bodies of text instead
of checking whether the input as a whole is a time, you cannot use the anchors ‹^› and ‹$›. Merely removing the anchors from the regular
expression is not the right solution. That would allow the hour and
minute regexes to match 12:12 within 9912:1299, for instance. Instead of
anchoring the regex match to the start and end of the subject, you have
to specify that the time cannot be part of longer sequences of
digits.
This is easily done with a pair of word boundaries. In regular
expressions, digits are treated as characters that can be part of words.
Replace both ‹^› and ‹$› with ‹\b›. As an example:
\b(2[0-3]|[01]?[0-9]):([0-5]?[0-9])\b
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
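The 9912:1299 case can be reproduced with a short Python sketch that compares the regex with its anchors simply removed against the word-boundary version. The `find_times` helper and the second sample sentence are assumptions made for this illustration.

```python
import re

# The 24-hour regex with the anchors simply removed, and with word boundaries instead.
UNANCHORED = re.compile(r"(2[0-3]|[01]?[0-9]):([0-5]?[0-9])")
BOUNDED = re.compile(r"\b(2[0-3]|[01]?[0-9]):([0-5]?[0-9])\b")


def find_times(pattern, text):
    """Return every substring of text that the pattern matches (hypothetical helper)."""
    return [match.group(0) for match in pattern.finditer(text)]


print(find_times(UNANCHORED, "9912:1299"))   # finds 12:12 inside the digits
print(find_times(BOUNDED, "9912:1299"))      # finds nothing
print(find_times(BOUNDED, "Meet at 9:30 or 17:45."))
```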
Word boundaries don’t disallow everything; they only disallow
letters, digits and underscores. The regex just shown, which matches
hours and minutes on a 24-hour clock, matches 16:08 within the subject text The time is 16:08:42 sharp. The space is not a word character, whereas the 1 is, so the
word boundary matches between them. The 8 is a word character, whereas the colon
isn’t, so ‹\b› also matches between those two.
If you want to disallow colons as well as word characters, you
need to use lookaround (see Recipe 2.16), as
shown in the following regex. Unlike before, this regex will not match
any part of The time is 16:08:42 sharp. It only works with flavors that support
lookbehind:
(?<![:\w])(2[0-3]|[01]?[0-9]):([0-5]?[0-9])(?![:\w])
Regex options: None
Regex flavors: .NET, Java, PCRE, Perl, Python, Ruby 1.9
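A Python sketch can contrast the word-boundary regex with the lookaround regex on the subject text used above. The `find_times` helper and the second sample sentence are invented for this example. Python accepts this lookbehind because ‹[:\w]› always matches exactly one character.

```python
import re

# Word-boundary version and lookaround version of the 24-hour hh:mm regex.
BOUNDED = re.compile(r"\b(2[0-3]|[01]?[0-9]):([0-5]?[0-9])\b")
STRICT = re.compile(r"(?<![:\w])(2[0-3]|[01]?[0-9]):([0-5]?[0-9])(?![:\w])")


def find_times(pattern, text):
    """Return every substring of text that the pattern matches (hypothetical helper)."""
    return [match.group(0) for match in pattern.finditer(text)]


subject = "The time is 16:08:42 sharp"
print(find_times(BOUNDED, subject))   # matches 16:08 inside 16:08:42
print(find_times(STRICT, subject))    # matches nothing
print(find_times(STRICT, "Lunch at 12:30, back by 13:15."))
```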
This chapter has several other recipes for matching dates and times. Recipes 4.4 and 4.5 show how to validate traditional date formats. Recipe 4.7 shows how to validate date and time formats according to the ISO 8601 standard.
Techniques used in the regular expressions in this recipe are discussed in Chapter 2. Recipe 2.3 explains character classes. Recipe 2.5 explains anchors. Recipe 2.6 explains word boundaries. Recipe 2.8 explains alternation. Recipe 2.9 explains grouping. Recipe 2.12 explains repetition. Recipe 2.16 explains lookaround.