I just came across Ian Brown's proposal for doing random audits while avoiding data breaches like Britain's terrible HMRC Identity Chernobyl:
It is clear from correspondence between the National Audit Office and Her Majesty's Revenue & Customs over the lost files fiasco that this data should never have been requested, nor supplied.
NAO wanted to choose a random sample of child benefit recipients to audit. Understandably, it did not want HMRC to select that sample “randomly”. However, HMRC could have used an extremely simple bit-commitment protocol to give NAO a way to choose recipients themselves without revealing any of the data related to those not chosen:
- For each recipient, HMRC should have calculated a cryptographic hash of all of the recipient's data and then given NAO a set of index numbers and this hash data.
- NAO could then select a sample of these records to audit. They would inform HMRC of the index values of the records in that sample.
- HMRC would finally supply only those records. NAO could verify the records had not been changed by comparing their hashes to those in the original data received from HMRC.
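The three steps above can be sketched in a few lines of Python. This is a rough illustration, not what HMRC or NAO would actually have run: the record fields, the JSON serialisation, and the choice of SHA-256 are all assumptions made for the example. It also adds one thing the quoted proposal does not mention, a random per-record salt, because records like these are low-entropy enough that a bare hash could plausibly be guessed by brute force. The salt is disclosed only alongside the records that are audited.

```python
import hashlib
import hmac
import json
import secrets


def commit(record: dict) -> tuple:
    """Return (salt, hex digest) committing to one record.

    The 32-byte random salt is an assumption added for this sketch; it keeps
    a low-entropy record from being guessed from its hash alone.
    """
    salt = secrets.token_bytes(32)
    # sort_keys gives a stable serialisation, so the same record always
    # produces the same bytes when it is re-hashed for verification.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return salt, hashlib.sha256(salt + payload).hexdigest()


def verify(record: dict, salt: bytes, expected_digest: str) -> bool:
    """Recompute the hash of a disclosed record and compare it to the commitment."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(salt + payload).hexdigest()
    return hmac.compare_digest(digest, expected_digest)


# Step 1 (data holder): hash every record, hand over only index -> hash.
# These records are made-up placeholders, not real data.
records = [
    {"name": "Recipient A", "reference": "0001"},
    {"name": "Recipient B", "reference": "0002"},
    {"name": "Recipient C", "reference": "0003"},
    {"name": "Recipient D", "reference": "0004"},
]
salts = []
published = {}
for index, record in enumerate(records):
    salt, digest = commit(record)
    salts.append(salt)          # kept private by the data holder
    published[index] = digest   # given to the auditor

# Step 2 (auditor): pick the sample using the auditor's own randomness.
sample = sorted(secrets.SystemRandom().sample(range(len(records)), k=2))

# Step 3 (data holder): disclose only the sampled records and their salts.
disclosed = {i: (records[i], salts[i]) for i in sample}

# Auditor: check each disclosed record against the hash received in step 1.
all_ok = all(
    verify(record, salt, published[i]) for i, (record, salt) in disclosed.items()
)
print("sample:", sample, "verified:", all_ok)
```

The auditor never sees the unsampled records, only their hashes, and any record altered after step 1 would fail the final check.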
This is not cryptographic rocket science. Any competent computer science graduate could have designed this scheme and implemented it in about an hour using an open source cryptographic library like OpenSSL.
Ben Laurie notes that the redacted correspondence itself demonstrates a lack of basic security awareness. I hope those carrying out the security review of the ContactPoint database are better informed.