Light Blue Touchpaper is a blog run by leading international security researchers at the Computer Laboratory, University of Cambridge. In recent posts, researcher Steven Murdoch writes that Touchpaper, which is based on the same WordPress blogging software I use, was breached around the same time as Identityblog (described here).
Steven explains that the attack was the result of several problems in WordPress – a SQL injection vulnerability plus a basic misuse in the way password hashes are stored and used in cookies. The latter problem remains even after release 2.3.1. He writes:
It is disappointing to see that people are still getting this type of thing wrong. In their 1978 summary, Morris and Thompson describe the importance of one way hashing and password salting (neither of which WordPress does properly).
I also pointed this problem out to several people when first experimenting with how to integrate Information Cards into WordPress a couple of years ago. The comments may not have made their way back to people who could fix the problems…
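To make the Morris and Thompson point concrete, here is a minimal sketch of what salting plus a slow one-way hash can look like. It is an illustration only, not WordPress code and not a proposed patch: the function names, the 16-byte salt, and the iteration count are all assumptions chosen for the example. It uses PBKDF2 from Python's standard library.

```python
import hashlib
import hmac
import os

# Illustrative parameters (assumptions, not taken from WordPress).
ITERATIONS = 200_000
SALT_BYTES = 16


def unsalted_md5(password: str) -> str:
    """The weak scheme: the same password always yields the same hash."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def hash_password(password: str):
    """Return (salt, digest) using a fresh random salt and PBKDF2-HMAC-SHA256."""
    salt = os.urandom(SALT_BYTES)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, expected)
```

Because each account gets its own random salt, two users with the same password end up with different stored values, so a hash lifted from the database cannot simply be looked up in a precomputed table or a search engine.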
Steven has another recent post that describes more, equally surprising, uses of hashing, and discusses the interplay between hashes and search engines:
One of the steps used by the attacker who compromised Light Blue Touchpaper a few weeks ago was to create an account (which he promoted to administrator; more on that in a future post). I quickly disabled the account, but while doing forensics, I thought it would be interesting to find out the account password. WordPress stores raw MD5 hashes in the user database (despite my recommendation to use salting). As with any respectable hash function, it is believed to be computationally infeasible to discover the input of MD5 from an output. Instead, someone would have to try out all possible inputs until the correct output is discovered.
So, I wrote a trivial Python script which hashed all dictionary words, but that didn’t find the target (I also tried adding numbers to the end). Then, I switched to a Russian dictionary (because the comments in the shell code installed were in Russian) but that didn’t work either. I could have found or written a better password cracker, which varies the case of letters, and does common substitutions (e.g. o → 0, a → 4) but that would have taken more time than I wanted to spend. I could also improve efficiency with a rainbow table, but this needs a large database which I didn’t have.
Instead, I asked Google. I found, for example, a genealogy page listing people with the surname “Anthony”, and an advert for a house, signing off “Please Call for showing. Thank you, Anthony”. And indeed, the MD5 hash of “Anthony” was the database entry for the attacker. I had discovered his password.
In both the webpages, the target hash was in a URL. This makes a lot of sense — I’ve even written code which does the same. When I needed to store a file, indexed by a key, a simple option is to make the filename the key’s MD5 hash. This avoids the need to escape any potentially dangerous user input and is very resistant to accidental collisions. If there are too many entries to store in a single directory, by creating directories for each prefix, there will be an even distribution of files. MD5 is quite fast, and while it’s unlikely to be the best option in all cases, it is an easy solution which works pretty well.
Because of this technique, Google is acting as a hash pre-image finder, and more importantly finding hashes of things that people have hashed before. Google is doing what it does best — storing large databases and searching them. I doubt, however, that they envisaged this use though.
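For readers curious what the kind of "trivial Python script" Steven mentions might look like, here is a rough sketch of a dictionary search against an unsalted MD5 hash. It is not his script: the function names and the particular variations tried (lower case, capitalised, one trailing digit) are assumptions made for illustration.

```python
import hashlib
from typing import Iterable, Iterator, Optional


def md5_hex(text: str) -> str:
    """Hex MD5 of a string, as stored by the unsalted scheme described above."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def candidates(word: str) -> Iterator[str]:
    """Yield simple variations of a word: case changes and one trailing digit."""
    for base in (word, word.lower(), word.capitalize()):
        yield base
        for digit in range(10):
            yield f"{base}{digit}"


def crack(target_hash: str, words: Iterable[str]) -> Optional[str]:
    """Return the first candidate whose MD5 matches target_hash, else None."""
    target = target_hash.lower()
    for word in words:
        for guess in candidates(word.strip()):
            if md5_hex(guess) == target:
                return guess
    return None
```

Calling `crack(md5_hex("Anthony"), ["alice", "anthony"])` finds the match through the capitalised variant. The search only works because the hash is unsalted: every site that has ever hashed the same string produces the same output, which is also why a search engine can stand in for the word list.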
They say misery loves company. And if I had wanted company while my blog was being breached, the Cambridge Computer Laboratory would have been about as good company as I could get. But I'm sure they, like me, draw one conclusion above all others: build systems on the basis they will be breached, in order to reduce the consequences to the absolute minimum.
[Thanks to Hans Van Es for pinging me about this.]