Last week I dumped a bunch of information about the sorry state of passwords and the internet, mostly from Light Blue Touchpaper. As usual, I soon ran across more information that should be included. It turns out that Gawker had another problem. Why should we think they are alone?
Read on if you’re interested.
Light Blue Touchpaper » Blog Archive » Another Gawker bug: handling non-ASCII characters in passwords
A few weeks ago I detailed how Gawker lost a million of their users' passwords. Soon after this I found an interesting vulnerability in Gawker's password deployment involving the handling of non-ASCII characters. Specifically, they didn't handle them at all until two weeks ago; instead they were mapping all non-ASCII characters to the ASCII ? prior to hashing them. This not only greatly limited the theoretical space of passwords, but meant that passwords consisting of any n non-ASCII characters were equivalent to ?^n. Native Telugu or Korean speakers with passwords like రహస్య సంకేత పదం or 비밀번호 were vulnerable to an attacker simply guessing a string of question marks. An attacker may in fact know in advance that some users are from non-Latin countries (for example by looking at their email addresses), potentially making this more easily exploitable.
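To make the collision concrete, here is a minimal Python sketch of the kind of mapping the post describes. It is an illustration only, not Gawker's actual code: the function names lossy_hash and utf8_hash are made up for this example, and SHA-256 is just a convenient stand-in for whatever hash the site really used (a real password store would use a salted, slow password hash).

```python
import hashlib


def lossy_hash(password: str) -> str:
    # Hypothetical stand-in for the flawed pipeline: every character
    # that cannot be encoded as ASCII is replaced by "?" before hashing.
    ascii_bytes = password.encode("ascii", errors="replace")
    return hashlib.sha256(ascii_bytes).hexdigest()


def utf8_hash(password: str) -> str:
    # Hashing the UTF-8 bytes keeps distinct passwords distinct.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


korean = "비밀번호"
guess = "?" * len(korean)  # one question mark per character

print(lossy_hash(korean) == lossy_hash(guess))  # True: the guess collides
print(utf8_hash(korean) == utf8_hash(guess))    # False: no collision
```

With the lossy version, the attacker only has to guess the length of the password, which is exactly the weakness described above.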
We users of ASCII English have it easy, and hard in a way. I have had to deal with related issues in recent years, primarily because plain string comparison in C and C++ does not account for non-ASCII characters when sorting unless you take special steps; by default it simply compares raw character values. That causes ordering and uniqueness issues as soon as you run into data with accented characters.
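The same two problems are easy to reproduce in a few lines. The sketch below uses Python rather than C or C++ purely to keep it short, and the fold helper is a hypothetical name for a deliberately crude sort key. It strips accents, which is not correct collation for every language; real code would more likely lean on locale-aware collation (strcoll in C, for instance) or a library such as ICU.

```python
import unicodedata


def fold(text: str) -> str:
    # Crude sort key (an assumption for this sketch): decompose the
    # string, drop combining accent marks, then ignore case.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


words = ["zebra", "éclair", "apple"]
print(sorted(words))            # ['apple', 'zebra', 'éclair']
print(sorted(words, key=fold))  # ['apple', 'éclair', 'zebra']

# Uniqueness: the same visible character can be stored two ways.
composed = "\u00e9"     # é as a single code point
decomposed = "e\u0301"  # e followed by a combining acute accent
print(composed == decomposed)  # False
print(unicodedata.normalize("NFC", composed)
      == unicodedata.normalize("NFC", decomposed))  # True
```

The default sort puts éclair after zebra because it compares code points, and the two spellings of é only compare equal once both are normalized to the same form.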