August 7, 2009

Making Chroma-Hash Less Leaky

Prologue

Recently, Jakob Nielsen yelled at everyone that password masking is a usability problem. When that man yells, people listen, and so were planted the seeds for some interesting experiments in providing password hints. The sexiest of these so far is Mattt Thompson's Chroma-Hash.

Some valid security concerns were raised over this widget. Mattt has solved several of these already with his recent improvements. I'd like to examine one of the remaining issues and suggest a solution. You can view my fork on GitHub for the source code.

The problem

The scenario goes something like this: a user takes and shares a screenshot or screencast of their login screen with password typed in. Someone malicious views this and can garner information about the hashed password from the color bars. From here, I'm going to assume that you understand the basics of how MD5 is a one-way function and why that's important.
[Screenshot: Chroma-Hash password box]
In the standard operating mode, Chroma-Hash is pulling number values right from the MD5 hash. We can get the colors with an eyedropper, and look - they match up (in reverse order) to the first part of the hash of the salted password.

$ echo -n "hooray12:7be82b35cb0199120eea35a4507c9acf" | md5sum
4ea16c514a6697bce642ee2250aa92f6 -

If we were using five color bars, we would have disclosed almost the whole hash.
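To make the leak concrete, here is a minimal Python sketch of reading color bars straight out of a hex digest. This is not Chroma-Hash's actual code (the widget is JavaScript). The function name `color_bars` and the choice of taking consecutive six-character chunks from the start of the digest are my assumptions for illustration, and the on-screen order of the bars may differ.

```python
import hashlib

def color_bars(hex_digest: str, bars: int = 3) -> list:
    """Split the first bars*6 hex characters into #rrggbb colors (assumed layout)."""
    return ["#" + hex_digest[i * 6:(i + 1) * 6] for i in range(bars)]

# Digest quoted in the md5sum example above.
digest = "4ea16c514a6697bce642ee2250aa92f6"
print(color_bars(digest))      # ['#4ea16c', '#514a66', '#97bce6']
print(color_bars(digest, 5))   # five bars cover 30 of the 32 hex characters

# Same layout applied to a freshly computed MD5. The "password:salt" form
# follows the example above; the values here are placeholders.
fresh = hashlib.md5(b"password:salt").hexdigest()
print(color_bars(fresh))
```

Every hex character that ends up in a color is four bits of the hash handed to anyone with an eyedropper.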

People keep bringing up the fact that MD5 is not considered a secure hash function anymore. These concerns are misplaced. MD5 is considered broken because it's too easy to find collisions: distinct inputs that hash to the same MD5 sum. This is useful indeed if you want to forge a digitally signed certificate or tamper with transferred data. But unless the authentication server is using the exact same salt and hash algorithm as Chroma-Hash, creating a collision with someone's color bar hash is useless - you'll be able to get the same colors, but you won't be able to log in.

The real concern here is this: we've allowed an attacker to move the computational load onto their own hardware. When you control the password oracle, it's easy to limit the rate at which login attempts may be made. This makes a brute force attack or even a dictionary attack infeasible. The attacker can't try passwords fast enough to have a reasonable chance of guessing the right one within years. But when an attacker has a hashed result of your password, they can run a dictionary attack as fast as their hardware allows, and a matching MD5 from a dictionary attack is likely to be the right password, because let's face it, people in general don't choose secure passwords.
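Here is a rough sketch of what that offline attack could look like, assuming the attacker knows the salt and has read the leading hex characters of the hash off a screenshot. The helper names, the tiny word list, and the salt are all hypothetical. The "password:salt" format is borrowed from the md5sum example above.

```python
import hashlib

def leaked_prefix(password: str, salt: str, hex_chars: int) -> str:
    """First hex_chars characters of MD5("password:salt") (assumed salting format)."""
    return hashlib.md5(f"{password}:{salt}".encode()).hexdigest()[:hex_chars]

def offline_candidates(words, salt, observed_prefix):
    """Return every word whose hash prefix matches the colors from a screenshot."""
    n = len(observed_prefix)
    return [w for w in words if leaked_prefix(w, salt, n) == observed_prefix]

# Hypothetical data: a tiny word list and a made-up salt.
words = ["password", "letmein", "hooray12", "qwerty"]
salt = "example-salt"

# Three exact color bars expose 18 hex characters (72 bits) of the hash.
observed = leaked_prefix("hooray12", salt, 18)

# This loop never touches the server, so no rate limit applies to it.
print(offline_candidates(words, salt, observed))
```

The only cost to the attacker is one MD5 per guess, and that runs as fast as their hardware allows.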

An aside: at the leading edge of server-side security, the equivalent threat of a stolen database is dealt with by bcrypt, a hashing scheme that can be tuned to be computationally intensive. So maybe the password check takes a tenth of a second instead of a thousandth - it's no big deal in the course of regular business, but it will significantly slow down an attacker trying to test a lot of passwords against stolen hashes. This strikes me as impractical for our purposes, and not only because we would need to implement bcrypt in JavaScript. Tune it too strong, and a user running on slow hardware could suffer a bad performance hit when trying to type in their password. Tune it too weak, and an attacker with a couple of dedicated cores could crank through at a fair clip.

My solution

In this case, I say collisions are actually our friends. If we can limit the information available to an attacker, we can leave them with a very large set of possible matches that they can only check by attempting to log in to the server. The point here is to make them verify against the server, rather than doing it at their own pace.

This is where another convenient fact comes into play. In his blog entry, Mattt points out that over-the-shoulder attacks won't be effective against Chroma-Hash. He writes:
As a color expressed in Hex, there are 16,777,215 possible colors for each bar. Eye-balling it wouldn’t be enough to get an exact color value—the difference between #952A08 and #952A09 is nearly imperceptible...
Those millions of possible colors come from the 24 bits used to represent each color, which means 24 bits of our hash are leaked for every color bar. If we don't leak the information in some of those bits, our attacker cannot be as precise about identifying matches. And since humans cannot really differentiate all those colors anyway, we're losing almost nothing by eliminating some of the possibilities.

The best way to do this is to redact the low-order bits, so that we keep the entire color range and lose only the fine distinctions between shades. You can think of this like counting in multiples. Instead of every number being an option, we round to the nearest even number, or multiple of 16, or whatever we like. The more we round, the more information we can withhold from an attacker.
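The rounding can be sketched in a few lines of Python. This is an illustration rather than the code in my fork: the function names are made up, and I'm assuming each 8-bit channel is rounded down to a multiple of the step size. The sample color 97bce6 is one of the six-character chunks of the hash shown earlier.

```python
def quantize_channel(value: int, step: int) -> int:
    """Round an 8-bit channel down to the nearest multiple of step."""
    return value - (value % step)

def quantize_color(hex_color: str, step: int) -> str:
    """Apply the same rounding to each of the three channels of an rrggbb string."""
    channels = [int(hex_color[i:i + 2], 16) for i in (0, 2, 4)]
    return "".join(f"{quantize_channel(c, step):02x}" for c in channels)

print(quantize_color("97bce6", 2))    # 96bce6
print(quantize_color("97bce6", 16))   # 90b0e0
print(quantize_color("97bce6", 64))   # 8080c0
```

Rounding down keeps every result inside the 0 to 255 range. Rounding to the nearest multiple would need extra care near 255.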

Let's see it in action.
[Screenshot: Chroma-Hash password box]
In this version, rgbStepSize is 2. You can see that the color values are very close to the original, but each 2-character hex number is even (0x96 = 150, 0xbc = 188, 0xe6 = 230, and so on). And since we're rounding, the attacker cannot know if the original hash contained "96" or "97", "bc" or "bd", etc.
[Screenshot: Chroma-Hash password box]
In this one rgbStepSize is 16. Looking at the color values, you can see that the second character of each pair is 0. We've eliminated half of the bits leaked by Chroma-Hash, and the colors are still remarkably close to the exact values as far as the human eye is concerned. In fact, quick experimentation shows that we can go with a step size of 64 or so without affecting user experience too drastically.

Did it work?

Now, how much does this help us? I'm a little out of my depth here, so I can only provide some back-of-the-napkin estimates. The small version of Openwall's word lists, which consists of various words and word combinations, has about 300,000 entries. For a six-character password consisting of lowercase letters and numbers, there are about 2 billion total possibilities. A 64-bit hash can have about 18 quintillion different values, so if a dictionary attack finds a match against all bits, it's almost certainly the true password.

Let's say we're showing three color bars with a step size of 64. This means that 6 of each 24 bits per color are leaked. So an attacker is working with 18 bits, a space of about 260,000. Assuming the distribution through this space is even (it should be), each possible combination of these 18 bits will match up with roughly 8,300 possibilities in the six-character password space (about 2.2 billion divided by 262,144). This helps, as the attacker has to push thousands of guesses through a rate-limited server instead of confirming one. However, working with the small word list, we can expect an almost one-to-one correspondence, which is not good. If we were to drop to two color bars, we could expect 73 matches per colorset. If we were to keep three bars but use a step size of 128 instead of 64, we could bring it up to about 586 matches per colorset. If we did both of these, 4,688 (but at some point, usability drops off).
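For anyone who wants to check the napkin, here is a small Python sketch of the arithmetic. It assumes the step size is a power of two, that hash values are spread evenly, and that the word list holds a round 300,000 entries. The function names are mine.

```python
import math

def leaked_bits(step: int, bars: int) -> int:
    """Bits visible to an attacker: log2(256/step) per channel, 3 channels per bar."""
    return int(math.log2(256 // step)) * 3 * bars

def matches_per_colorset(space: int, step: int, bars: int) -> float:
    """Expected candidates sharing one colorset, assuming a uniform hash."""
    return space / 2 ** leaked_bits(step, bars)

WORD_LIST = 300_000   # approximate size of the small word list
SIX_CHAR = 36 ** 6    # six characters of lowercase letters and digits

print("step bars bits wordlist six-char")
for step, bars in [(64, 3), (64, 2), (128, 3), (128, 2)]:
    print(step, bars, leaked_bits(step, bars),
          round(matches_per_colorset(WORD_LIST, step, bars)),
          round(matches_per_colorset(SIX_CHAR, step, bars)))
```

Swapping in a different word-list size or password alphabet shows how quickly the number of indistinguishable candidates grows as bits are withheld.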

Regaining perspective

By dead reckoning, I would guess that most passwords used in a reasonably computer-literate community are stronger than the small dictionary list, containing non-words, numbers and hopefully capital letters or even symbols. But humans do like phonetic constructions and show a strong aversion to random combinations of letters and symbols. And a not-inconsequential number of people are still using dangerously weak passwords, unaware of the risks they are taking.

So, is it worth it? Assess the risks. A user must leak their password information through a screenshot or similarly exact reproduction. This must either be initiated by the user or social-engineered out of them - someone with direct access to the user's computer could just install a keylogger instead. Additionally, that user must have a weak password. An attacker must take the time to launch a dictionary attack against the gathered information, then test all resulting possibilities against the server until one works. Unlikely, but not implausible. Put this in context with the more mundane but oh-so-effective threats like phishing, email password reset, compromise from another site, and general password carelessness. And finally weigh your perceived threat against the usability benefits Chroma-Hash offers.

Is it worth it? That's up to you.