The more I learn from Alex Stoianov about the advantages of Biometric Encryption, the more I understand how dangerous the use of conventional biometric templates really is. I had not understood that the templates were a reliable unique identifier reusable across databases and even across template schemes without a fresh biometric sample. People have to be stark, raving mad to use conventional biometrics to improve the efficiency of a children's lunch line.
Alex begins by driving home how easy template matching across databases really is:
Yes, that’s true: conventional biometric templates can be easily correlated across databases. Most biometric matching algorithms work this way: a fresh biometric sample is acquired and processed; a fresh template is extracted from it; and this template is matched against a previously enrolled template.
If the biometric templates are stored in the databases, you don’t need a fresh biometric sample for the offline match – the templates contain all the information required.
Moreover, this search is extremely fast: 1,000,000 matches per second is easily achievable. In our example, it would take only 10 sec to search a database of 10,000,000 records (we may disregard for now the issue of false acceptance – the accuracy is constantly improving). The biometric industry is actively developing standards, so that very soon all the databases will have standardized templates, i.e. will become fully interoperable.
BE, on the other hand, operates in a “blind” mode and, therefore, is inherently a one-to-one algorithm. Our estimate of 11.5 days for just one search makes it infeasible at present to do data mining across BE databases. If the computational power grows according to Kim’s estimates, i.e. without saturation, then in 10 – 20 years the data mining may indeed become common.
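The two figures quoted above can be checked with a few lines of arithmetic. This is only a sketch: it assumes the 11.5-day estimate refers to the same 10,000,000-record database, which the text does not state, and the helper name `search_seconds` is made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def search_seconds(records: int, matches_per_second: float) -> float:
    """Time to compare one probe against every record in a database."""
    return records / matches_per_second


records = 10_000_000

# Conventional templates: the quoted rate of 1,000,000 matches per second.
conventional = search_seconds(records, 1_000_000)

# BE: back out the per-record cost implied by the 11.5-day estimate.
# Assumption (not in the text): the estimate covers the same 10,000,000 records.
be_total = 11.5 * SECONDS_PER_DAY
be_per_record = be_total / records

print(f"conventional search: {conventional:.0f} s")
print(f"BE search: {be_total:,.0f} s, about {be_per_record:.3f} s per record")
print(f"BE is roughly {be_total / conventional:,.0f} times slower")
```

Under that assumption a single BE comparison costs on the order of a tenth of a second, which is negligible for one-to-one verification and prohibitive for a one-to-many sweep.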
Kim already suggested a solution – just make the BE matching process slower! In fact, the use of one-way slowdown functions (known in cryptography) for BE was considered before. The research in this area has not been active because this is not a top priority problem for BE at present. In the future, as computing power grows, a slower function will be applied every time the user gets re-enrolled, to keep the matching time at the same level, for example, 1 sec.
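The re-tuning idea can be sketched as simple bookkeeping. Everything here is hypothetical: the hardware speeds are invented, and `iterations_for_target` and `full_search_days` are illustrative names, not part of any BE scheme.

```python
SECONDS_PER_DAY = 86_400


def iterations_for_target(ops_per_second: float, target_seconds: float = 1.0) -> int:
    """Slowdown iterations needed so one match takes about target_seconds."""
    return max(1, round(ops_per_second * target_seconds))


def full_search_days(records: int, target_seconds: float = 1.0) -> float:
    """Days needed to try one probe against every record, one match at a time."""
    return records * target_seconds / SECONDS_PER_DAY


# Hypothetical hardware speeds (slowdown-function evaluations per second)
# at three successive re-enrollments.
for year, ops_per_second in [(0, 1e6), (2, 2e6), (4, 4e6)]:
    print(f"year {year}: {iterations_for_target(ops_per_second):,} iterations")

# With matching pinned at 1 second, a 10,000,000-record sweep stays slow.
print(f"{full_search_days(10_000_000):.1f} days per full search")
```

The point of the sketch is that the iteration count rises with the hardware while the per-match time, and therefore the cost of a full sweep on a single processor, stays fixed.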
Other points to consider:
- BE is primarily intended for use in a distributed environment, i.e. without central databases;
- data mining between databases is much easier still with users’ names – you wouldn’t even need biometrics for that. We are basically talking about anonymous biometric databases – an application that does not exist at present;
- if a BE database custodian obtains and retains a fresh biometric sample just to do data mining, it would be a violation of his own policy. In contrast, if you give away your templates in conventional biometrics, the custodian is technically free to do any offline search.
These arguments are beyond compelling, and I very much appreciate the time Alex and Ann have taken to explain the issues.
It's understandable that BE researchers would be concentrating on more challenging aspects of the problem, but I strongly support the idea of building in a “slowdown function” from day one. The BE computations Alex describes lend themselves perfectly to parallel processing, so Moore's law will be operating in two, not one, dimensions. Maybe this issue could be addressed directly in one of the prototypes. For 1:1 applications it doesn't seem like reduced efficiency would be an issue.
Why couldn't the complexity of the calculation be a tunable characteristic of the system – sort of like the number of hash iterations in password based encryption (PBE)?
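For comparison, this is roughly what the tunable knob looks like in password-based encryption, using PBKDF2 from Python's standard library. It is a sketch of the analogy only: the `secret` bytes are a stand-in, and how such a function would actually be wired into a BE scheme is not described here.

```python
import hashlib
import hmac
import os


def slow_derive(secret: bytes, salt: bytes, iterations: int) -> bytes:
    """Derive a 32-byte value; the cost grows linearly with iterations."""
    return hashlib.pbkdf2_hmac("sha256", secret, salt, iterations)


def enroll(secret: bytes, iterations: int) -> dict:
    """Store the salt and iteration count next to the digest so they can be raised later."""
    salt = os.urandom(16)
    return {
        "salt": salt,
        "iterations": iterations,
        "digest": slow_derive(secret, salt, iterations),
    }


def verify(candidate: bytes, record: dict) -> bool:
    """Recompute with the stored parameters and compare in constant time."""
    digest = slow_derive(candidate, record["salt"], record["iterations"])
    return hmac.compare_digest(digest, record["digest"])


# Stand-in for whatever key material a successful BE match would release.
secret = b"hypothetical key material"
record = enroll(secret, iterations=100_000)
print(verify(secret, record), verify(b"something else", record))
```

Because the iteration count is stored with each record, it can be raised at the next re-enrollment without changing the verification code.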
After finally figuring out that the “glass slipper” was a reference to Cinderella, who was identified by the prince when he went door to door to fit the glass slipper to each girl’s foot, I am still puzzled by this Biometric Encryption idea. Sure, it has advantages over traditional template or bitmap-based schemes. But I think its case is overstated.
Would it prevent profiling? Let’s picture two databases, each holding personal information on me. While both applications use BE to regularly authenticate users, the owners of the systems must have access to all their data. My records would not be encrypted with anything unique to me, or the store owners could not use my data to create their services.
The primary keys will be different. The authentication attributes (based on my encrypted biometric information) will be different. But someone with access to the databases will be able to match most records based on other data fields: street address, birthday, credit card number, whatever is available in both directories. Once he has matched two directories, his profiles have grown and matching info from a third directory will be even more effective. Remember, when AOL published a large number of search strings without any identifying information, some searchers could still be identified from what they had been searching for.
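The kind of matching described above needs no biometrics at all. Here is a minimal sketch, with invented records and hypothetical field names, that links two directories on birthday and street address alone.

```python
def normalize(value: str) -> str:
    """Lower-case and collapse whitespace so formatting differences do not block a match."""
    return " ".join(value.lower().split())


def link_records(db_a, db_b, keys=("birthday", "street_address")):
    """Return (id_a, id_b) pairs whose shared fields agree after normalization."""
    index = {}
    for rec in db_b:
        index.setdefault(tuple(normalize(rec[k]) for k in keys), []).append(rec["id"])
    return [
        (rec["id"], other_id)
        for rec in db_a
        for other_id in index.get(tuple(normalize(rec[k]) for k in keys), [])
    ]


# Invented sample data: the primary keys differ and no biometric field is present.
shop = [
    {"id": "shop-17", "birthday": "1970-01-31", "street_address": "12 Example Street"},
    {"id": "shop-42", "birthday": "1985-06-02", "street_address": "9 Sample Road"},
]
clinic = [
    {"id": "clinic-A3", "birthday": "1970-01-31", "street_address": "12  example street"},
    {"id": "clinic-B8", "birthday": "1990-11-11", "street_address": "9 Sample Road"},
]

links = link_records(shop, clinic)
print(links)
```

In this toy data only the first pair links, because the second pair shares an address but not a birthday. Each extra shared field makes such a join more selective.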
To prevent profiling, we would need a very different architecture. If I order a book from Amazon, instead of my street address I would give them a one-time, personalized token. With that token, they could ask UPS to deliver my book, without ever knowing where it is going. That token could be structured such that only the local delivery branch would get access to my detailed address; higher up in the UPS network they would know only my country and town. Similarly, I would give them a one-time token that only Amazon could use at the credit card company to get my money. Next time I buy a book, they get new tokens for my address and bank. Even when they store that data, there is nothing to match with other directories to build my profile. Only my bank would know that I bought two somethings at Amazon and presumably they would have to know where I live. Well, death, taxes and banks.
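A toy version of the token idea might look like the following. The `TokenBroker` class and the party names are hypothetical, and the sketch leaves out the tiered disclosure (country and town for the wider network, full address for the local branch) as well as any cryptographic protection of the token itself.

```python
import secrets


class TokenBroker:
    """Hypothetical party that maps one-time tokens to a value and the one party allowed to read it."""

    def __init__(self):
        self._vault = {}

    def issue(self, value: str, redeemer: str) -> str:
        """Create a fresh random token that only `redeemer` may exchange for `value`."""
        token = secrets.token_urlsafe(16)
        self._vault[token] = (value, redeemer)
        return token

    def redeem(self, token: str, party: str) -> str:
        """Hand the value to the named redeemer once, then forget the token."""
        if token not in self._vault:
            raise KeyError("unknown or already used token")
        value, redeemer = self._vault[token]
        if party != redeemer:
            raise PermissionError(f"{party} may not redeem this token")
        del self._vault[token]
        return value


broker = TokenBroker()
# Two orders produce two unrelated tokens for the same address.
first = broker.issue("12 Example Street", redeemer="local-delivery-branch")
second = broker.issue("12 Example Street", redeemer="local-delivery-branch")
print(first != second)
print(broker.redeem(first, "local-delivery-branch"))
```

The merchant only ever stores `first` and `second`, which share nothing that could be joined against another directory. The broker itself still sees everything, much like the bank in the scenario above.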