A look at password security, Part II: Web Sites
In part I, we took a look at the design of password authentication systems for old-school multiuser systems. While timesharing is mostly gone, most of us continue to use multiuser systems; we just call them Web sites. In this post, I’ll be covering some of the problems of Web authentication using passwords.
As I discussed previously, the strength of passwords depends to a great extent on how fast the attacker can try candidate passwords. The nature of a Web application inherently limits the velocity at which you can try passwords quite a bit. Even ignoring limits on the rate which you can transmit stuff over the network, real systems — at least well managed ones — have all kinds of monitoring software which is designed to detect large numbers of login attempts, so just trying millions of candidate passwords is not very effective. This doesn’t mean that remote attacks aren’t possible: you can of course try to log in with some of the obvious passwords and hope you get lucky, and if you have a good idea of a candidate password, you can try that (see below), but this kind of attack is inherently somewhat limited.
Remote compromise and password cracking
Of course, this kind of limitation in the number of login attempts you could make also applied to the old multiuser systems and the way you attack Web sites is the same: get a copy of the password file and remotely crack it.
The way this plays out is that somehow the attacker exploits a vulnerability in the server’s system to compromise the password database.1 They can then crack it offline and try to recover people’s passwords. Once they’ve done that, they can then use those passwords to log into the site themselves. If a site’s password database is stolen, their strongest defense is to reset everyone’s password, which is obviously really inconvenient, harms the site’s brand, and runs the risk of user attrition, and so doesn’t always happen.
To make matters worse, many users use the same password on multiple sites, so once you have broken someone’s password on one site, you can then try to login as them on other sites with the same password, even if the user’s password was reset on the site which was originally compromised. Even though this is an online attack, it’s still very effective, because password reuse is so common (this is one reason why it’s a bad idea to reuse passwords).
Password database disclosure is unfortunately quite a common occurrence, so much so that there are services such as Firefox Monitor and Have I been pwned? devoted to letting users know when some service they have an account on has been compromised.
Assuming a site is already following best practices (long passwords, slow password hashing algorithms, salting, etc.) then the next step is to either make it harder to steal the password hash or to make the password hash less useful. A good example here is the Facebook system described in this talk by Alec Muffett (famous for, among other things, the Crack password cracker). The system uses multiple layers of hashing, one of which is a keyed hash [technically, HMAC-SHA256] performed on a separate, hardened, machine. Even if you compromise the password hash database, it’s not useful without the key, which means you would also have to compromise that machine as well.2
Another defense is to use one-time password systems (often also called two-factor authentication systems). I’ll cover those in a future post.
Phishing
Leaked passwords aren’t the only threat to password authentication on Web sites. The other big issue is what’s called phishing. In the basic phishing attack, the attacker sends you an e-mail inviting you to log into your account. Often this will be phrased in some scary way like telling you your account will be deleted if you don’t log in immediately. The e-mail will helpfully contain a link to use to log in, but of course this link will go not to the real site but to the attacker’s site, which will usually look just like the real site and may even have a similar domain name (e.g., mozi11a.com
instead of mozilla.com
). When the user clicks on the link and logs in, the attacker captures their username and password and can then log into the real site. Note that having users use good passwords totally doesn’t help here because the user gives the site their whole password.
Preventing phishing has proven to be a really stubborn challenge because, well, people are not as suspicious as they should be and it’s actually fairly hard on casual examination to determine whether you are on the right site. Most modern browsers try to warn users if they are going to known phishing sites (Firefox uses the Google Safe Browsing service for this). In addition, if you use a password manager, then it shouldn’t automatically fill in your password on a phishing site because password managers key off of the domain name and just looking similar isn’t good enough. Of course, both of these defenses are imperfect: the lists of phishing sites can be incomplete and if users don’t use password managers or are willing to manually cut and paste their passwords, then phishing attacks are still possible.3
Beyond Passwords
The good news is that we now have standards and technologies which are better than simple passwords and are more resistant to these kinds of attacks. I’ll be talking about them in the next post.
- A more fatal security issue occurs when application developers mistakenly write plaintext user passwords to debug logs. This allows the attacker to target the logging system and get immediate access to passwords without having to do any sort of computational work. ↩
- The Facebook system is actually pretty ornate. At least as of 2014 they had four separate layers: MD5, HMAC-SHA1 (with a public salt), HMAC-SHA256(with a secret key), and Scrypt, and then HMAC-SHA256 (with public salt) again, Muffet’s talk and this post do a good job of providing the detail, but this design is due to a combination of technical requirements. In particular, the reason for the MD5 stage is that an older system just had MD5-hashed passwords and because Facebook doesn’t know the original password they can’t convert them to some other algorithm; it’s easiest to just layer another hash on. ↩
- This is an example of a situation in which the difficulty of implementing a good password manager makes the problem much worse. Sites vary a lot in how they present their password dialogs and so password managers have trouble finding the right place to fill in the password. This means that users sometimes have to type the password in themselves even if there is actually a stored password, teaching them bad habits which phishers can then exploit. ↩