You’re tasked with ensuring that any passwords chosen by your website users meet your organization’s minimum complexity requirements.
The following regular expressions check many individual conditions, and can be mixed and matched as necessary to meet your business requirements. At the end of this section, we’ve included several JavaScript code examples that show how you can tie these regular expressions together as part of a password security validation routine.
^.{8,32}$
Regex options: Dot matches line breaks (“^ and $ match at line breaks” must not be set) |
Regex flavors: .NET, Java, XRegExp, PCRE, Perl, Python, Ruby |
Standard JavaScript doesn’t have a “dot matches line breaks” option. Use ‹[\s\S]› instead of a dot in JavaScript to ensure that the regex works correctly even for crazy passwords that include line breaks:
^[\s\S]{8,32}$
Regex options: None (“^ and $ match at line breaks” must not be set) |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
If this next regex matches a password, you can be sure it includes only the characters A–Z, a–z, 0–9, space, and ASCII punctuation. No control characters, line breaks, or characters outside of the ASCII table are allowed:
^[\x20-\x7E]+$
Regex options: None (“^ and $ match at line breaks” must not be set) |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
If you want to additionally prevent the use of spaces, use ‹^[\x21-\x7E]+$› instead.
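As a minimal sketch of how the length and ASCII checks above might be applied in JavaScript, the following reports each result separately. The name checkBasics and the shape of its return value are assumptions made for illustration; only the three regexes come from this recipe.

```javascript
// Regexes taken from the listings above.
var lengthOk = /^[\s\S]{8,32}$/;      // 8 to 32 characters, line breaks included
var asciiOnly = /^[\x20-\x7E]+$/;     // visible ASCII plus space
var asciiNoSpace = /^[\x21-\x7E]+$/;  // visible ASCII, no space

// Hypothetical helper: returns one boolean per rule.
function checkBasics(password) {
    return {
        length: lengthOk.test(password),
        ascii: asciiOnly.test(password),
        asciiNoSpace: asciiNoSpace.test(password)
    };
}

console.log(checkBasics("correct horse"));
// -> { length: true, ascii: true, asciiNoSpace: false }
console.log(checkBasics("short"));
// -> { length: false, ascii: true, asciiNoSpace: true }
```

Keeping the results separate makes it possible to tell the user which rule was broken rather than only that the password was rejected.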
[A-Z]
Regex options: None (“case insensitive” must not be set) |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
Any Unicode uppercase letter:
\p{Lu}
Regex options: None (“case insensitive” must not be set) |
Regex flavors: .NET, Java, PCRE, Perl, Ruby 1.9 |
If you want to check for the presence of any letter character (not limited to uppercase), enable the “case insensitive” option or use ‹[A-Za-z]›. For the Unicode case, you can use ‹\p{L}›, which matches any kind of letter from any language.
[a-z]
Regex options: None (“case insensitive” must not be set) |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
\p{Ll}
Regex options: None (“case insensitive” must not be set) |
Regex flavors: .NET, Java, PCRE, Perl, Ruby 1.9 |
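The flavor lists for ‹\p{Lu}› and ‹\p{Ll}› above do not include JavaScript. Engines that implement ECMAScript 2018 accept Unicode property escapes when the u flag is set, so a sketch along the following lines should work there. Older browsers reject these regex literals with a syntax error, so treat this as an assumption about your target environment. The variable names are illustrative.

```javascript
// Requires ES2018 Unicode property escapes; the "u" flag is mandatory.
var unicodeUpper = /\p{Lu}/u;   // any Unicode uppercase letter
var unicodeLower = /\p{Ll}/u;   // any Unicode lowercase letter
var unicodeLetter = /\p{L}/u;   // any kind of letter from any language

console.log(unicodeUpper.test("\u00C4rger"));  // "Ärger" -> true
console.log(unicodeUpper.test("\u00E4rger"));  // "ärger" -> false
console.log(unicodeLetter.test("12345"));      // -> false
```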
ASCII punctuation and spaces only:
[ !"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~]
Regex options: None |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
Anything other than ASCII letters and numbers:
[^A-Za-z0-9]
Regex options: None |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
This next regex is intended to rule out passwords like 111111. It works in the opposite way of the others in this recipe. If it matches, the password doesn’t meet the condition. In other words, the regex only matches strings that repeat a character three times in a row.
(.)\1\1
Regex options: Dot matches line breaks |
Regex flavors: .NET, Java, XRegExp, PCRE, Perl, Python, Ruby |
([\s\S])\1\1
Regex options: None |
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby |
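Because this regex signals a failure when it matches, the test has to be negated in code. A small sketch, assuming a hypothetical helper named hasNoTripleRepeat:

```javascript
// Matches any character repeated three times in a row.
var tripleRepeat = /([\s\S])\1\1/;

// Hypothetical helper: true when the password passes the rule,
// that is, when the regex does NOT match.
function hasNoTripleRepeat(password) {
    return !tripleRepeat.test(password);
}

console.log(hasNoTripleRepeat("111111"));      // -> false
console.log(hasNoTripleRepeat("aabbcc"));      // -> true
console.log(hasNoTripleRepeat("paassswword")); // -> false ("sss")
```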
The following code combines five password requirements:
Length between 8 and 32 characters.
One or more uppercase letters.
One or more lowercase letters.
One or more numbers.
One or more special characters (ASCII punctuation or space characters).
function validate(password) {
    var minMaxLength = /^[\s\S]{8,32}$/,
        upper = /[A-Z]/,
        lower = /[a-z]/,
        number = /[0-9]/,
        special = /[ !"#$%&'()*+,\-./:;<=>?@[\\\]^_`{|}~]/;

    if (minMaxLength.test(password) &&
        upper.test(password) &&
        lower.test(password) &&
        number.test(password) &&
        special.test(password)
    ) {
        return true;
    }

    return false;
}
The validate function just shown returns true if the provided string meets the password requirements. Otherwise, false is returned.
This next example enforces a minimum and maximum password length (8–32 characters), and additionally requires that at least three of the following four character types are present:
One or more uppercase letters.
One or more lowercase letters.
One or more numbers.
One or more special characters (anything other than ASCII letters and numbers).
function validate(password) {
    var minMaxLength = /^[\s\S]{8,32}$/,
        upper = /[A-Z]/,
        lower = /[a-z]/,
        number = /[0-9]/,
        special = /[^A-Za-z0-9]/,
        count = 0;

    if (minMaxLength.test(password)) {
        // Only need 3 out of 4 of these to match
        if (upper.test(password)) count++;
        if (lower.test(password)) count++;
        if (number.test(password)) count++;
        if (special.test(password)) count++;
    }

    return count >= 3;
}
As before, this modified validate function returns true if the provided password meets the overall requirements. If not, it returns false.
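Since each rule lives in its own regex, a variation on these functions can report which rules a password fails instead of returning a bare true or false. The following is only a sketch: the rules array, its labels, and the failedRules name are assumptions for illustration, and all five rules are treated as required.

```javascript
// Each entry pairs a human-readable label with one regex from this recipe.
var rules = [
    { label: "Length between 8 and 32 characters", regex: /^[\s\S]{8,32}$/ },
    { label: "One or more uppercase letters", regex: /[A-Z]/ },
    { label: "One or more lowercase letters", regex: /[a-z]/ },
    { label: "One or more numbers", regex: /[0-9]/ },
    { label: "One or more special characters", regex: /[^A-Za-z0-9]/ }
];

// Hypothetical helper: returns the labels of every rule that did not match.
function failedRules(password) {
    var failed = [];
    for (var i = 0; i < rules.length; i++) {
        if (!rules[i].regex.test(password)) {
            failed.push(rules[i].label);
        }
    }
    return failed;
}

console.log(failedRules("password1"));
// -> ["One or more uppercase letters", "One or more special characters"]
console.log(failedRules("Passw0rd!")); // -> []
```

An empty array means the password passed every rule, so the result can be used both as a validity check and as the source of an error message.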
This final code example is the most complicated of the bunch. It assigns a positive or negative score to various conditions, and uses the regexes we’ve been looking at to help calculate an overall score for the provided password. The rankPassword function returns a number from 0–4 that corresponds to the password rankings “Too Short,” “Weak,” “Medium,” “Strong,” and “Very Strong”:
var rank = {
    TOO_SHORT: 0,
    WEAK: 1,
    MEDIUM: 2,
    STRONG: 3,
    VERY_STRONG: 4
};

function rankPassword(password) {
    var upper = /[A-Z]/,
        lower = /[a-z]/,
        number = /[0-9]/,
        special = /[^A-Za-z0-9]/,
        minLength = 8,
        score = 0;

    if (password.length < minLength) {
        return rank.TOO_SHORT; // End early
    }

    // Increment the score for each of these conditions
    if (upper.test(password)) score++;
    if (lower.test(password)) score++;
    if (number.test(password)) score++;
    if (special.test(password)) score++;

    // Penalize if there aren't at least three char types
    if (score < 3) score--;

    if (password.length > minLength) {
        // Increment the score for every 2 chars longer than the minimum
        score += Math.floor((password.length - minLength) / 2);
    }

    // Return a ranking based on the calculated score
    if (score < 3) return rank.WEAK; // score is 2 or lower
    if (score < 4) return rank.MEDIUM; // score is 3
    if (score < 6) return rank.STRONG; // score is 4 or 5
    return rank.VERY_STRONG; // score is 6 or higher
}

// Test it...
var result = rankPassword("password1"),
    labels = ["Too Short", "Weak", "Medium", "Strong", "Very Strong"];

alert(labels[result]); // -> Weak
Because of how this password ranking algorithm is designed, it can serve two purposes equally well. First, it can be used to give users guidance about the quality of their password while they’re still typing it. Second, it lets you easily reject passwords that don’t rank at whatever you choose as your minimum security threshold. For example, the condition if(result <= rank.MEDIUM) can be used to reject any password that isn’t ranked as “Strong” or “Very Strong.”
Users are notorious for choosing simple or common passwords that are easy to remember. But easy to remember doesn’t necessarily translate into something that keeps their account and your company’s information safe. It’s therefore typically necessary to protect users from themselves by enforcing minimum password complexity rules. However, the exact rules to use can vary widely between businesses and systems, which is why this recipe includes numerous regexes that serve as the raw ingredients to help you cook up whatever combination of validation rules you choose.
Limiting each regex to a specific rule brings the additional benefit of simplicity. As a result, all of the regexes shown thus far are fairly straightforward. Following are a few additional notes on each of them:
- Length between 8 and 32 characters
To require a different minimum or maximum length, change the numbers used as the upper and lower bounds for the quantifier ‹{8,32}›. If you don’t want to specify a maximum, use ‹{8,}›, or remove the ‹$› anchor and change the quantifier to ‹{8}›.
All of the programming languages covered by this book provide a simple and efficient way to determine the length of a string. However, using a regex allows you to test both the minimum and maximum length at the same time, and makes it easier to mix and match password complexity rules by choosing from a list of regexes.
- ASCII visible and space characters only
As mentioned earlier, this regex allows the characters A–Z, a–z, 0–9, space, and ASCII punctuation only. To be more specific about the allowed punctuation characters, they are !, ", #, $, %, &, ', (, ), *, +, -, ., /, :, ;, <, =, >, ?, @, [, \, ], ^, _, `, {, |, }, ~, and comma. In other words, all the punctuation you can type using a standard U.S. keyboard.
Limiting passwords to these characters can help avoid character encoding related issues, but keep in mind that it also limits the potential complexity of your passwords.
- Uppercase letters
To check whether the password contains two or more uppercase letters, use ‹[A-Z].*[A-Z]›. For three or more, use ‹[A-Z].*[A-Z].*[A-Z]› or ‹(?:[A-Z].*){3}›. If you’re allowing any Unicode uppercase letters, just change each ‹[A-Z]› in the preceding examples to ‹\p{Lu}›. In JavaScript, replace the dots with ‹[\s\S]›.
- Lowercase letters
As with the “uppercase letters” regex, you can check whether the password contains at least two lowercase letters using ‹[a-z].*[a-z]›. For three or more, use ‹[a-z].*[a-z].*[a-z]› or ‹(?:[a-z].*){3}›. If you’re allowing any Unicode lowercase letters, change each ‹[a-z]› to ‹\p{Ll}›. In JavaScript, replace the dots with ‹[\s\S]›.
- Numbers
You can check whether the password contains two or more numbers using ‹[0-9].*[0-9]›, and ‹[0-9].*[0-9].*[0-9]› or ‹(?:[0-9].*){3}› for three or more. In JavaScript, replace the dots with ‹[\s\S]›.
We didn’t include a listing for matching any Unicode decimal digit (‹\p{Nd}›), because it’s uncommon to treat characters other than 0–9 as numbers (although readers who speak Arabic or Hindi might disagree!).
- Special characters
Use the same principles shown for letters and numbers if you want to require more than one special character. For instance, using ‹[^A-Za-z0-9].*[^A-Za-z0-9]› would require the password to contain at least two special characters.
Note that ‹[^A-Za-z0-9]› is different than ‹\W› (the negated version of the ‹\w› shorthand for word characters). ‹\W› goes beyond ‹[^A-Za-z0-9]› by additionally excluding the underscore, which we don’t want to do here. In some regex flavors, ‹\W› also excludes any Unicode letter or decimal digit from any language.
- Disallow three or more sequential identical characters
This regex matches repeated characters using backreferences to a previously matched character. Recipe 2.10 explains how backreferences work. If you want to disallow any use of repeated characters, change the regex to ‹(.)\1›. To allow up to three repeated characters but not four, use ‹(.)\1\1\1› or ‹(.)\1{3}›.
Remember that you need to check whether this regular expression doesn’t match your subject text. A match would indicate that repeated characters are present.
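The “two or more” and “three or more” patterns from these notes might look as follows in JavaScript, with each dot replaced by ‹[\s\S]› as suggested. The variable names are illustrative only.

```javascript
// At least two uppercase letters, with anything (or nothing) in between.
var twoUpper = /[A-Z][\s\S]*[A-Z]/;

// At least three digits, using the grouped form with a {3} quantifier.
var threeDigits = /(?:[0-9][\s\S]*){3}/;

// At least two special characters (the underscore counts as special here).
var twoSpecial = /[^A-Za-z0-9][\s\S]*[^A-Za-z0-9]/;

console.log(twoUpper.test("PassWord"));  // -> true
console.log(twoUpper.test("Password"));  // -> false
console.log(threeDigits.test("a1b2c3")); // -> true
console.log(twoSpecial.test("a_b-c"));   // -> true
```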
The three blocks of JavaScript example code each use this recipe’s regular expressions a bit differently.
The first example requires all conditions to be met or else the password fails. In the second example, acing the password test requires three out of four conditional requirements to be met. The third example is probably the most interesting. It includes a function called rankPassword that does what it says on the tin and ranks passwords by how secure they are. It can thus help provide a more user-friendly experience and encourage users to choose strong passwords.
The rankPassword function’s password ranking algorithm increments and decrements an internal password score based on multiple conditions. If the password’s length is less than the specified minimum of eight characters, the function returns early with the numeric equivalent of “Too Short.” Not including at least three character types incurs a one-point penalty, but this can be balanced out because every two additional characters after the minimum of eight adds a point to the running score.
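To see how the penalty and the length bonus interact, the arithmetic can be traced with a small helper. scoreBreakdown is a hypothetical name, not part of the recipe’s code; this sketch mirrors the scoring steps described above and returns the intermediate values, or null for passwords shorter than the minimum.

```javascript
// Hypothetical helper that exposes the intermediate values of the score.
function scoreBreakdown(password) {
    var minLength = 8,
        types = 0,
        penalty,
        lengthBonus;

    if (password.length < minLength) {
        return null; // "Too Short": no score is calculated
    }

    // One point per character type present
    if (/[A-Z]/.test(password)) types++;
    if (/[a-z]/.test(password)) types++;
    if (/[0-9]/.test(password)) types++;
    if (/[^A-Za-z0-9]/.test(password)) types++;

    // One-point penalty for fewer than three character types
    penalty = types < 3 ? 1 : 0;

    // One point for every two characters beyond the minimum length
    lengthBonus = Math.floor((password.length - minLength) / 2);

    return {
        types: types,
        penalty: penalty,
        lengthBonus: lengthBonus,
        score: types - penalty + lengthBonus
    };
}

console.log(scoreBreakdown("password1"));
// -> { types: 2, penalty: 1, lengthBonus: 0, score: 1 }  (Weak)
console.log(scoreBreakdown("correcthorsebattery"));
// -> { types: 1, penalty: 1, lengthBonus: 5, score: 5 }  (Strong)
```

The second password uses only lowercase letters, yet its 11 extra characters add five points, which is how length can balance out the penalty.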
The code can of course be customized to further improve it or to meet your particular requirements. However, it works quite well as-is, regardless of what you throw at it. As a sanity check, we ran it against several hundred of the known most common (and therefore most insecure) user passwords. All came out ranked as either “Too Short” or “Weak,” which is exactly what we were hoping for.
Up to this point, we’ve split password validation into discrete rules that can be tested using simple regexes. That’s usually the best approach. It keeps the regexes readable, and makes it easier to provide error messages that identify why a password isn’t up to code. It can even help you rank a password’s complexity, as we’ve seen. However, there may be times when you don’t care about all that, or when one regex is all you can use. In any case, it’s common for people to want to validate multiple password rules using a single regex, so let’s take a look at how it can be done. We’ll use the following requirements:
Length between 8 and 32 characters.
One or more uppercase letters.
One or more lowercase letters.
One or more numbers.
Here’s a regex that pulls it all off:
^(?=.{8,32}$)(?=.*[A-Z])(?=.*[a-z])(?=.*[0-9]).*
Regex options: Dot matches line breaks (“^ and $ match at line breaks” must not be set) |
Regex flavors: .NET, Java, XRegExp, PCRE, Perl, Python, Ruby |
This regex can be used with standard JavaScript (which doesn’t have a “dot matches line breaks” option) if you replace each of the five dots with ‹[\s\S]›. Otherwise, you might fail to match some valid passwords that contain line breaks. Either way, though, the regex won’t match any invalid passwords.
Notice how this regular expression puts each validation rule into its own lookahead group at the beginning of the regex. Because lookahead does not consume any characters as part of a match (see Recipe 2.16), each lookahead test runs from the very beginning of the string. When a lookahead succeeds, the regex moves along to test the next one, starting from the same position. Any lookahead that fails to find a match causes the overall match to fail.
The first lookahead, ‹(?=.{8,32}$)›, ensures that any match is between 8 and 32 characters long. Make sure to keep the ‹$› anchor after ‹{8,32}›, otherwise the match will succeed even when there are more than 32 characters. The next three lookaheads search one by one for an uppercase letter, lowercase letter, and digit. Because each lookahead searches from the beginning of the string, they use ‹.*› before their respective character classes. This allows other characters to appear before the character type that they’re searching for.
By following the approach shown here, it’s possible to add as many lookahead-based password tests as you want to a single regex, so long as all of the conditions are always required.
The ‹.*› at the very end of this regex is not actually required. Without it, though, the regex would return a zero-length empty string when it successfully matches. The trailing ‹.*› lets the regex include the password itself in successful match results.
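For standard JavaScript, the single-regex version with each of the five dots replaced by ‹[\s\S]› could be sketched like this. The variable name combined is an assumption for illustration.

```javascript
// Length 8-32, plus at least one uppercase letter, lowercase letter, and digit.
var combined = /^(?=[\s\S]{8,32}$)(?=[\s\S]*[A-Z])(?=[\s\S]*[a-z])(?=[\s\S]*[0-9])[\s\S]*/;

console.log(combined.test("Passw0rd"));   // -> true
console.log(combined.test("password1"));  // -> false (no uppercase letter)
console.log(combined.test("Pw0"));        // -> false (too short)

// The trailing [\s\S]* makes the match result contain the password itself.
var match = combined.exec("Passw0rdExample");
console.log(match[0]);                    // -> "Passw0rdExample"
```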
Caution
It’s equally valid to write this regex as ‹^(?=.*[A-Z])(?=.*[a-z])(?=.*[0-9]).{8,32}$›, with the length test coming after the lookaheads. Unfortunately, writing it this way triggers a bug in Internet Explorer 5.5–8 that prevents it from working correctly. Microsoft fixed the bug in the new regex engine included in IE9.
Techniques used in the regular expressions in this recipe are discussed in Chapter 2. Recipe 2.2 explains how to match nonprinting characters. Recipe 2.3 explains character classes. Recipe 2.4 explains that the dot matches any character. Recipe 2.5 explains anchors. Recipe 2.7 explains how to match Unicode characters. Recipe 2.9 explains grouping. Recipe 2.10 explains backreferences. Recipe 2.12 explains repetition. Recipe 2.16 explains lookaround.