You want to validate dates in the traditional formats mm/dd/yy, mm/dd/yyyy, dd/mm/yy, and dd/mm/yyyy. You want to use a simple regex that only checks whether the input looks like a date, without trying to weed out things such as February 31st.
Solution 1: Match any of these date formats, allowing leading zeros to be omitted:
^[0-3]?[0-9]/[0-3]?[0-9]/(?:[0-9]{2})?[0-9]{2}$
Regex options: None |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
Solution 2: Match any of these date formats, requiring leading zeros:
^[0-3][0-9]/[0-3][0-9]/(?:[0-9][0-9])?[0-9][0-9]$
Regex options: None |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
Solution 3: Match m/d/yy and mm/dd/yyyy, allowing any combination of one or two digits for the day and month, and two or four digits for the year:
^(1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])/(?:[0-9]{2})?[0-9]{2}$
Regex options: None |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
Solution 4: Match mm/dd/yyyy, requiring leading zeros:
^(1[0-2]|0[1-9])/(3[01]|[12][0-9]|0[1-9])/[0-9]{4}$
Regex options: None |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
Solution 5: Match d/m/yy and dd/mm/yyyy, allowing any combination of one or two digits for the day and month, and two or four digits for the year:
^(3[01]|[12][0-9]|0?[1-9])/(1[0-2]|0?[1-9])/(?:[0-9]{2})?[0-9]{2}$
Regex options: None |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
Solution 6: Match dd/mm/yyyy, requiring leading zeros:
^(3[01]|[12][0-9]|0[1-9])/(1[0-2]|0[1-9])/[0-9]{4}$
Regex options: None |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
Solution 7: Match any of these date formats with greater accuracy, allowing leading zeros to be omitted:
^(?:(1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])|(3[01]|[12][0-9]|0?[1-9])/(1[0-2]|0?[1-9]))/(?:[0-9]{2})?[0-9]{2}$
Regex options: None |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
We can use the free-spacing option to make this regular expression easier to read:
^(?:
  # m/d or mm/dd
  (1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])
|
  # d/m or dd/mm
  (3[01]|[12][0-9]|0?[1-9])/(1[0-2]|0?[1-9])
)
# /yy or /yyyy
/(?:[0-9]{2})?[0-9]{2}$
Regex options: Free-spacing |
Regex flavors: .NET, Java, XRegExp, PCRE, Perl, Python, Ruby |
Solution 8: Match any of these date formats with greater accuracy, requiring leading zeros:
^(?:(1[0-2]|0[1-9])/(3[01]|[12][0-9]|0[1-9])|(3[01]|[12][0-9]|0[1-9])/(1[0-2]|0[1-9]))/[0-9]{4}$
Regex options: None |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
The same solution using the free-spacing option to make it easier to read:
^(?:
  # mm/dd
  (1[0-2]|0[1-9])/(3[01]|[12][0-9]|0[1-9])
|
  # dd/mm
  (3[01]|[12][0-9]|0[1-9])/(1[0-2]|0[1-9])
)
# /yyyy
/[0-9]{4}$
Regex options: Free-spacing |
Regex flavors: .NET, Java, XRegExp, PCRE, Perl, Python, Ruby |
You might think that something as conceptually trivial as a date should be an easy job for a regular expression. But it isn’t, for two reasons. Because dates are such an everyday thing, humans are very sloppy with them. 4/1 may be April Fools’ Day to you. To somebody else, it may be the first working day of the year, if New Year’s Day is on a Friday.
The other issue is that regular expressions don’t deal directly with numbers. You can’t tell a regular expression to “match a number between 1 and 31”, for instance. Regular expressions work character by character. We use ‹3[01]|[12][0-9]|0?[1-9]› to match 3 followed by 0 or 1, or to match 1 or 2 followed by any digit, or to match an optional 0 followed by 1 to 9. In character classes, we can use ranges for single digits, such as ‹[1-9]›. That’s because the characters for the digits 0 through 9 occupy consecutive positions in the ASCII and Unicode character tables. See Chapter 6 for more details on matching all kinds of numbers with regular expressions.
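As a rough check of the day pattern described above, the following Python sketch runs it against every integer from 0 to 99 and collects the ones it accepts. The names `DAY` and `accepted` are ours, and the pattern is wrapped in anchors and a noncapturing group so that it has to match the whole string.

```python
import re

# Day pattern from the text, anchored so the whole string must match.
DAY = re.compile(r"^(?:3[01]|[12][0-9]|0?[1-9])$")

# Collect every integer from 0 to 99 whose decimal form the pattern accepts.
accepted = [n for n in range(100) if DAY.match(str(n))]

print(accepted == list(range(1, 32)))  # True: exactly 1 through 31

# A leading zero is allowed for single digits, but 00 and 32 are rejected.
print(bool(DAY.match("07")), bool(DAY.match("00")), bool(DAY.match("32")))
# True False False
```

The regex never sees a number, only characters, so each alternative spells out one group of digit combinations that together cover 1 through 31.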
Because of this, you have to choose how simple or how accurate you want your regular expression to be. If you already know your subject text doesn’t contain any invalid dates, you could use a trivial regex such as ‹\d{2}/\d{2}/\d{4}›. The fact that this matches things like 99/99/9999 is irrelevant if those don’t occur in the subject text.
The first two solutions for this recipe are quick and simple, too, and they also match invalid dates, such as 0/0/00 and 31/31/2008. They only use literal characters for the date delimiters, character classes (see Recipe 2.3) for the digits, and the question mark (see Recipe 2.12) to make certain digits optional. ‹(?:[0-9]{2})?[0-9]{2}› allows the year to consist of two or four digits. ‹[0-9]{2}› matches exactly two digits. ‹(?:[0-9]{2})?› matches zero or two digits. The noncapturing group (see Recipe 2.9) is required, because the question mark needs to apply to the character class and the quantifier ‹{2}› combined. ‹[0-9]{2}?› matches exactly two digits, just like ‹[0-9]{2}›. Without the group, the question mark makes the quantifier lazy, which has no effect because ‹{2}› cannot repeat more than two times or fewer than two times.
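To see both points in practice, here is a small Python sketch. It first shows that Solution 1 accepts invalid dates, then compares the year part with and without the noncapturing group. The variable names `LOOSE`, `grouped`, and `ungrouped` are ours and are used for illustration only.

```python
import re

# Solution 1 from this recipe: leading zeros may be omitted.
LOOSE = re.compile(r"^[0-3]?[0-9]/[0-3]?[0-9]/(?:[0-9]{2})?[0-9]{2}$")

for text in ["4/1/08", "0/0/00", "31/31/2008", "4/1/008"]:
    print(text, bool(LOOSE.match(text)))
# 4/1/08 True
# 0/0/00 True
# 31/31/2008 True
# 4/1/008 False  (three-digit year)

# The question mark applies to the whole group: two or four digits.
grouped = re.compile(r"^(?:[0-9]{2})?[0-9]{2}$")
# Without the group, the question mark only makes {2} lazy: always four digits.
ungrouped = re.compile(r"^[0-9]{2}?[0-9]{2}$")

for year in ["08", "2008", "208"]:
    print(year, bool(grouped.match(year)), bool(ungrouped.match(year)))
# 08 True False
# 2008 True True
# 208 False False
```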
Solutions 3 through 6 restrict the month to numbers between 1 and 12, and the day to numbers between 1 and 31. We use alternation (see Recipe 2.8) inside a group to match various pairs of digits to form a range of two-digit numbers. We use capturing groups here because you’ll probably want to capture the day and month numbers anyway.
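The capturing groups can be put to work as in the following Python sketch, which uses Solution 3 to pull out the month and day. The helper `month_and_day` is a hypothetical name introduced here, not something from the recipe.

```python
import re

# Solution 3 from this recipe: m/d/yy through mm/dd/yyyy.
MDY = re.compile(
    r"^(1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])/(?:[0-9]{2})?[0-9]{2}$"
)

def month_and_day(text):
    """Return (month, day) as integers, or None if the regex does not match."""
    m = MDY.match(text)
    if m is None:
        return None
    # Group 1 is the month and group 2 is the day; the year is not captured.
    return int(m.group(1)), int(m.group(2))

print(month_and_day("2/28/08"))     # (2, 28)
print(month_and_day("12/31/2008"))  # (12, 31)
print(month_and_day("13/01/2008"))  # None: 13 is not a valid month
print(month_and_day("02/31/2008"))  # (2, 31): month lengths are not checked
```

The last call shows the limit of this recipe: the regex only restricts the ranges of the month and day, so February 31st still gets through.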
The final two solutions are a little more complex, so we’re presenting these in both condensed and free-spacing form. The only difference between the two forms is readability. JavaScript does not support free-spacing natively, which is why the free-spacing versions list XRegExp instead. The final two solutions allow all of the date formats, just like the first two examples. The difference is that the last two use an extra level of alternation to restrict the dates to at most 12/31 or 31/12, disallowing dates with an invalid month, such as 31/31.
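In Python, the free-spacing form of Solution 7 can be used with the `re.VERBOSE` flag, as in this sketch. The name `DATE` is ours. Which pair of capturing groups participates in the match tells you which alternative matched.

```python
import re

# Solution 7 in free-spacing form; re.VERBOSE ignores whitespace and # comments.
DATE = re.compile(r"""
    ^(?:
        # m/d or mm/dd
        (1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])
      |
        # d/m or dd/mm
        (3[01]|[12][0-9]|0?[1-9])/(1[0-2]|0?[1-9])
    )
    # /yy or /yyyy
    /(?:[0-9]{2})?[0-9]{2}$
""", re.VERBOSE)

for text in ["12/31/2008", "31/12/2008", "31/31/2008", "5/6/08"]:
    m = DATE.match(text)
    print(text, m.groups() if m else None)
# 12/31/2008 ('12', '31', None, None)
# 31/12/2008 (None, None, '31', '12')
# 31/31/2008 None
# 5/6/08 ('5', '6', None, None)
```

Note that an ambiguous input such as 5/6/08 is always matched by the first alternative, so the regex alone cannot tell you whether the writer meant May 6th or June 5th.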
If you want to search for dates in larger bodies of text instead of checking whether the input as a whole is a date, you cannot use the anchors ‹^› and ‹$›. Merely removing the anchors from the regular expression is not the right solution. That would allow any of these regexes to match 12/12/2001 within 9912/12/200199, for example. Instead of anchoring the regex match to the start and end of the subject, you have to specify that the date cannot be part of longer sequences of digits.
This is easily done with a pair of word boundaries. In regular expressions, digits are treated as characters that can be part of words. Replace both ‹^› and ‹$› with ‹\b›. As an example:
\b(1[0-2]|0[1-9])/(3[01]|[12][0-9]|0[1-9])/[0-9]{4}\b
Regex options: None |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
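The following Python sketch compares the word-boundary version with a version that has the anchors simply removed. The sample sentence and the names `BOUNDED`, `UNANCHORED`, and `text` are made up for illustration.

```python
import re

# Solution 4 with word boundaries instead of anchors.
BOUNDED = re.compile(r"\b(1[0-2]|0[1-9])/(3[01]|[12][0-9]|0[1-9])/[0-9]{4}\b")
# The same regex with the anchors merely removed.
UNANCHORED = re.compile(r"(1[0-2]|0[1-9])/(3[01]|[12][0-9]|0[1-9])/[0-9]{4}")

text = "Shipped 12/31/2008, invoice 9912/12/200199, due 01/15/2009."

print([m.group(0) for m in BOUNDED.finditer(text)])
# ['12/31/2008', '01/15/2009']

print([m.group(0) for m in UNANCHORED.finditer(text)])
# ['12/31/2008', '12/12/2001', '01/15/2009']
```

Only the unanchored regex finds a date inside the longer digit sequence 9912/12/200199.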
This chapter has several other recipes for matching dates and times. Recipe 4.5 shows how to validate traditional date formats more accurately. Recipe 4.6 shows how to validate traditional time formats. Recipe 4.7 shows how to validate date and time formats according to the ISO 8601 standard.
Recipe 6.7 explains how you can create a regular expression to match a number in a given range of numbers.
Techniques used in the regular expressions in this recipe are discussed in Chapter 2. Recipe 2.3 explains character classes. Recipe 2.5 explains anchors. Recipe 2.8 explains alternation. Recipe 2.9 explains grouping. Recipe 2.12 explains repetition.