All about Phorm

The Law of User Control is hard at work in a growing controversy about interception of people's web traffic in the United Kingdom.  At the center of the storm is the “patent-pending” technology of a new company called Phorm.  Its web site advises:

Leading UK ISPs BT, Virgin Media and TalkTalk, along with advertisers, agencies, publishers and ad networks, work with Phorm to make online advertising more relevant, rewarding and valuable. (View press release.)

Phorm's proprietary ad serving technology uses anonymised ISP data to deliver the right ad to the right person at the right time – the right number of times. Our platform gives consumers advertising that's tailored to their interests – in real time – with irrelevant ads replaced in the process.

What makes the technology behind OIX and Webwise truly groundbreaking is that it takes consumer privacy protection to a new level. Our technology doesn't store any personally identifiable information or IP addresses, and we don't retain information on user browsing behaviour. So we never know – and can't record – who's browsing, or where they've browsed.

It is counterintuitive to see claims of increased privacy posited as the outcome of a tracking system.  But even if that happened to be true, it seems like the system is being laid on the population as a fait accompli by the big powerful ISPs.  It doesn't seem that users will be able to avoid having their traffic redirected and inspected.  And early tests of the system were branded “illegal” by Nicholas Bohm of the Foundation for Information Policy Research (FIPR). 

Is Phorm completely wrong?  Probably not.  Respected and wise privacy activist Simon Davies has done an Interim Privacy Impact Assessment that argues (in part):

In our view, Phorm has successfully implemented privacy as a key design component in the development of its Phorm Technology system. In contrast to the design of other targeting systems, careful choices have been made to ensure that privacy is preserved to the greatest possible extent. In particular, Phorm has quite consciously avoided the processing of personally identifiable information.

Simon seems to be suggesting we consider Phorm in relation to the current alternatives – which may be worse.

To make a judgment we need to really understand how Phorm's system works.  Dr. Richard Clayton, a computer security researcher at the University of Cambridge and a participant in Light Blue Touchpaper, has published a succinct ten-page explanation that is a must-read for anyone who is a protocol head.

Richard says his technical analysis of the Phorm online advertising system has reinforced his view that it is “illegal”, breaking laws designed to limit unwarranted interception of data.

The British Information Commissioner's Office confirmed to the BBC that BT is planning a large-scale trial of the technology “involving around 10,000 broadband users later this month”.  The ICO said: “We have spoken to BT about this trial and they have made clear that unless customers positively opt in to the trial their web browsing will not be monitored in order to deliver adverts.”

Having quickly read Richard's description of the actual protocol, it isn't yet clear to me that opting out prevents your web traffic from being examined and redirected.  But there is worse. I have to admit to a sense of horror when I realized the system rewards ISPs for abusing their trusted role in the Internet by improperly posing as other people's domains in order to create fraudulent cookies and place them on users' machines.  Is there a worse precedent?  How come ISPs can do this kind of thing and others can't?  Or perhaps now they can…

To accord with the Laws of Identity, no ISP would examine or redirect packets to a Phorm-related server unless a user explicitly opted in to such a service.  Opting in should involve explicitly accepting Phorm as a justifiable witness to all web interactions, and agreeing to be categorized by the Phorm systems.

The system is devised to aggregate across contexts, and thus runs counter to the Fourth Law of Identity.  It claims to mitigate this by reducing profiling to categorization information.  However, I don't buy that.  Categorization, practiced at a grand enough scale and over a sufficient period of time, potentially becomes more privacy invasive than a regularly truncated audit trail.    Thus there must be mechanisms for introducing amnesia into the categorization itself.

Phorm would therefore require clearly defined mechanisms for deprecating and deleting profile information over time, and these should be made clear during the opt-in process.
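To make that concrete, here is a minimal sketch of what categorization amnesia could look like, assuming a simple in-memory store with a fixed retention window (the names and the 90-day figure are illustrative only, not anything Phorm has specified):

    import time

    RETENTION_SECONDS = 90 * 24 * 3600  # illustrative 90-day retention window

    class DecayingProfile:
        """Stores interest categories with timestamps and forgets stale ones."""

        def __init__(self):
            self._categories = {}  # category -> last-seen timestamp

        def observe(self, category: str) -> None:
            self._categories[category] = time.time()

        def current_categories(self) -> list[str]:
            """Return only categories seen within the retention window,
            purging anything older (the 'amnesia' step)."""
            cutoff = time.time() - RETENTION_SECONDS
            self._categories = {
                c: t for c, t in self._categories.items() if t >= cutoff
            }
            return sorted(self._categories)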

I also have trouble with the notion that in Phorm identities are “anonymized”.  As I understand it, each user is given a persistent random ID.  Whenever the user accesses the ISP, the ISP can see the link between the random ID and the user's natural identity.  I understand that ISPs will prevent Phorm from knowing the user's natural identity.  That is certainly better than many other systems.  But I still wouldn't claim the system is based on anonymity.  It is based on controlling the release of information.
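To see why a persistent random ID is pseudonymous rather than anonymous, consider a toy model (entirely hypothetical, just to show the linkage the ISP keeps):

    import secrets

    # The ISP's table: it can always reconnect the "random" ID to the subscriber.
    pseudonyms: dict[str, str] = {}

    def pseudonym_for(subscriber: str) -> str:
        """Issue a persistent random ID per subscriber.  The ID reveals
        nothing by itself, but the ISP's table links it straight back to
        a natural identity – controlled release, not anonymity."""
        if subscriber not in pseudonyms:
            pseudonyms[subscriber] = secrets.token_hex(8)
        return pseudonyms[subscriber]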

[Podcasts are available here]

Chaos Computer Club gives us the German phish finger

If you missed this article in The Register, you missed the most instructive story to date about applied biometrics:  

A hacker club has published what it says is the fingerprint of Wolfgang Schauble, Germany's interior minister and a staunch supporter of the collection of citizens’ unique physical characteristics as a means of preventing terrorism.

In the most recent issue of Die Datenschleuder, the Chaos Computer Club printed the image on a plastic foil that leaves fingerprints when it is pressed against biometric readers…

[Image: last two pages of the magazine issue, showing the article and the plastic film containing Schauble's fingerprint]

“The whole research has always been inspired by showing how insecure biometrics are, especially a biometric that you leave all over the place,” said Karsten Nohl, a colleague of an amateur researcher going by the moniker Starbug, who engineered the hack. “It's basically like leaving the password to your computer everywhere you go without you being able to control it anymore.” … 

A water glass 

Schauble's fingerprint was captured off a water glass he used last summer while participating in a discussion celebrating the opening of a religious studies department at the University of Humboldt in Berlin. The print came from an index finger, most likely the right one, Starbug believes, because Schauble is right-handed.

The print is included in more than 4,000 copies of the latest issue of the magazine, which is published by the CCC. The image is printed two ways: one using traditional ink on paper, and the other on a film of flexible rubber that contains partially dried glue. The latter medium can be covertly affixed to a person's finger and used to leave an individual's prints on doors, telephones or biometric readers…

Schauble is a big proponent of using fingerprints and other unique characteristics to identify individuals.

“Each individual’s fingerprints are unique,” he is quoted as saying in this official interior department press release announcing a new electronic passport that stores individuals’ fingerprints on an RFID chip. “This technology will help us keep one step ahead of criminals. With the new passport, it is possible to conduct biometric checks, which will also prevent authentic passports from being misused by unauthorized persons who happen to look like the person in the passport photo.”

The magazine is calling on readers to collect the prints of other German officials, including Chancellor Angela Merkel, Bavarian Prime Minister Guenther Beckstein and BKA President Joerg Ziercke.

“The thing I like a lot is the political activism of the hack,” said Bruce Schneier, who is chief security technology officer for BT and an expert on online authentication. Fingerprint readers were long ago shown to be faulty, largely because designers opt to make the devices err on the side of false positives rather than on the side of false negatives…

[Read the full article here]

How to safely deliver information to auditors

I just came across Ian Brown's proposal for doing random audits while avoiding data breaches like Britain's terrible HMRC Identity Chernobyl: 

It is clear from correspondence between the National Audit Office and Her Majesty's Revenue & Customs over the lost files fiasco that this data should never have been requested, nor supplied.

NAO wanted to choose a random sample of child benefit recipients to audit. Understandably, it did not want HMRC to select that sample “randomly”. However, HMRC could have used an extremely simple bit-commitment protocol to give NAO a way to choose recipients themselves without revealing any of the data related to those not chosen:

  1. For each recipient, HMRC should have calculated a cryptographic hash of all of the recipient's data and then given NAO a set of index numbers and this hash data.
  2. NAO could then select a sample of these records to audit. They would inform HMRC of the index values of the records in that sample.
  3. HMRC would finally supply only those records. NAO could verify the records had not been changed by comparing their hashes to those in the original data received from HMRC.

This is not cryptographic rocket science. Any competent computer science graduate could have designed this scheme and implemented it in about an hour using an open source cryptographic library like OpenSSL.
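For the record, here is a minimal sketch of the three steps in Python, using the standard hashlib in place of OpenSSL (the record layout and function names are invented for the example, and in a real deployment each record's hash should also cover a random nonce so that low-entropy fields can't be brute-forced from the hashes):

    import hashlib
    import json

    def commit(records: dict[int, dict]) -> dict[int, str]:
        """Step 1 (HMRC): give NAO only an index number and a hash per record."""
        return {
            idx: hashlib.sha256(json.dumps(rec, sort_keys=True).encode()).hexdigest()
            for idx, rec in records.items()
        }

    # Step 2 (NAO): pick index values at random and send just those to HMRC.

    def verify(sample: dict[int, dict], commitments: dict[int, str]) -> bool:
        """Step 3 (NAO): check the supplied records against the earlier hashes."""
        return all(
            hashlib.sha256(json.dumps(rec, sort_keys=True).encode()).hexdigest()
            == commitments[idx]
            for idx, rec in sample.items()
        )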

Ben Laurie notes that the redacted correspondence itself demonstrates a lack of basic security awareness. I hope those carrying out the security review of the ContactPoint database are better informed.

Know your need

Here's a great comment from the smart and witty Paul Madsen.   He really hits the nail on the head with his “Know your Need” corollary:

In announcing Microsoft's purchase of the Credentica patents (and hiring of Stefan's core team), Kim uses the ‘need to know’ analogy.

That danger can be addressed by adopting a need-to-know approach to the Internet.

(For the life of me, I just cannot get Sgt Shultz's ‘I know nothing’ out of my head.)

Credentica's U-prove technology promises to close off a (depending on the deployment environment, potentially big) ‘knowledge leak’ – if the IDP doesn't need to know what/where/why/when/who the user does with the assertions it creates, then the principle of minimal ‘need to know’ means that it shouldn't.

Cardspace seems a great application for U-Prove to prove itself. As Stefan points out, ‘it's a good thing’ to influence/control both client and server.

Separately, I see the flip side of ‘need to know’ as ‘know your need’, i.e. entities involved in identity transactions must be able to assess and assert their needs for identity attributes. This is the CARML piece of the Identity Governance Framework. Put another way, before a decision is made as to whether or not some entity ‘needs to know’, it'd be nice to know why they are asking.

I agree that it is sometimes a positive and useful thing for a claims provider to know the user's “what, where, why, when and who”.  So everything is a matter of minimization – but within the requirements of the scenario.

I don't actually buy the “influence/control both client and server” phraseology.  I'm fine with influence, but see control as an elusive and worthless goal.  That's not how the world works.  It works through synergy and energy radiating from everywhere, and those of us who are on this odyssey must tap into that.

Scotland's eCare wins award

Scotland's eCare has been recognised at an international awards ceremony on good practice in data protection.  On Tuesday, 11 December, the Data Protection Agency of the Region of Madrid awarded the eCare framework one of two “special mention” awards.  The aim of the annual prize is to expand the awareness of best practices in data protection by government bodies across Europe.

I'm really pleased to see the authors of eCare recognized. They have created a system for sharing health information that concretely embodies the kind of thinking set out in the Laws of Identity.

A Scottish Executive publication describes eCare this way:

The system is designed with a central multi-agency eCare store in a ‘demilitarised’ zone (hanging off NHS net), which links to the multiple back office legacy systems operating locally in the partner agencies. This means that each locality will have its own locally defined and unique approach.

All data shared is subject to consent by the client. The system users are authenticated through their local systems, and are only entitled to view the data of their clients. Clients can change their consent status, and as soon as this is logged on the local system the records cannot be viewed by the partner practitioners.

Benefits that the programme will deliver

The direct benefit to the citizen will be through improved experience of care. Single Shared Assessment, through electronic information sharing, will reduce the volume of questions repeatedly asked by professionals, as data will only have to be collected from the client once, then shared through the technology.

The Children's Services stream will focus on the delivery of an electronic Personal Care Record, an Integrated Children’s Services Record, and a Single Assessment Framework for sharing, to benefit both Scotland’s children and care practitioners. Across the streams’ care groups, practitioners will save time, because core data will be shared, rather than gathered by multiple agencies. This will reduce the possibility of duplicated or inappropriate care. A more holistic picture of the client will be created, which will help to ensure services that more accurately meet people's needs.

The principal deliverables of the Learning Disability stream are the development of integrated local service records, which will help planning across a range of services, and the piloting of a national anonymised database, which will enable the Scottish Executive to monitor implementation of ‘The same as you?’ initiative.
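To make the consent mechanism concrete, here is a toy sketch of consent-gated record access (the class and method names are hypothetical; the real framework links the partners' legacy back-office systems rather than a single store):

    class CareRecordStore:
        """Toy model: records are visible only while the client's consent stands."""

        def __init__(self):
            self._records = {}   # client_id -> record
            self._consent = {}   # client_id -> bool

        def add_record(self, client_id: str, record: dict, consent: bool) -> None:
            self._records[client_id] = record
            self._consent[client_id] = consent

        def withdraw_consent(self, client_id: str) -> None:
            # As soon as withdrawal is logged, partner practitioners lose access.
            self._consent[client_id] = False

        def view(self, client_id: str, practitioners_clients: set[str]) -> dict:
            """Practitioners may view only their own clients, and only with consent."""
            if client_id not in practitioners_clients:
                raise PermissionError("not this practitioner's client")
            if not self._consent.get(client_id, False):
                raise PermissionError("client has withdrawn or not given consent")
            return self._records[client_id]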

Ken Macdonald, Assistant Commissioner at the Information Commissioner’s Office (which provided a note of support for the eCare application), has commented:

It is wonderful to see UK expertise in data protection being officially recognised in Europe for the second year running.  Recent events have highlighted the need to comply with the principles of the Data Protection Act and I am delighted to see the eCare Framework and the Scottish Government setting such a fine example to others not just in the UK but throughout Europe.

I hope the work is published more broadly.  From seeing presentations on the system, it partitions information for safety.  It employs encrypted data, not simply network encryption.  It favors local administration, and leaves information control close to those responsible for it.  It puts information sharing under the control of the data subjects.  It consistently enforces “need to know” as well as user consent prior to information release.  In fact it strikes me as being everything you would expect from a system built after wide consultation with citizens and thought leaders – as happened in this case.  And not surprisingly with such a quality project, it uses innovative new technologies and approaches to achieve its goals.

NAO's “redaction” adds fuel to the flames

Google's Ben Laurie has a revealing link to correspondence published by the National Audit Office relating to HMRC's recent identity disaster.

He also explains that the practice of publishing “redacted texts” is itself outmoded in light of the kinds of statistical attacks that can now be mounted.  He concludes that, “those who are entrusted with our data have absolutely no idea of the threats it faces, nor the countermeasures one should take to avoid those threats.”

In the wake of the HMRC disaster (nicely summarised by Kim Cameron), the National Audit Office has published scans of correspondence relating to the lost data.

First of all, it's notable that everyone concerned seems to be far more concerned about cost than about privacy. But an interesting question arises in relation to the redactions made to protect the “innocent”. Once more, NAO and HMRC have shown their lack of competence in these matters…

A few years ago it was a popular pastime to recover redacted data from such documents, using a variety of techniques, from the hilarious cut'n'paste attacks (where the redacted data had not been removed, merely covered over with black graphics) to the much more interesting typography related attacks. The way these work is by working backwards from the way that computers typeset. For each font, there are lookup tables that show exactly how wide each character is, and also modifications for particular pairs of characters (for example, “fe” often has less of a gap between the characters than would be indicated by the widths of the two letters alone). This means that if you can accurately measure the width of some text it is possible to deduce which characters must have made up the text (and often what order those characters must appear in). Obviously this isn't guaranteed to give a single result, but often gives a very small number of possibilities, which can then be further reduced by other evidence (such as grammar or spelling).

It seems HMRC and NAO are entirely ignorant of these attacks, since they have left themselves wide open to them. For example, on page 5 of the PDF, take the first line “From: redacted (Benefits and Credits)”. We can easily measure the gap between “:” and “(“, which must span a space, one or more words (presumably names) and another space. From this measurement we can probably make a good shortlist of possible names.

Even more promising is line 3, “cc: redacted@…”. In this case the space between the : and the @ must be filled by characters that make a legal email address and contain no spaces. Another target is the second line of the letter itself “redacted has passed this over to me for my views”. Here we can measure the gap between the left hand margin and the first character of “has” – and fit into that space a capital letter and some other letters, no spaces. Should be pretty easy to recover that name.

And so on.

This clearly demonstrates that those who are entrusted with our data have absolutely no idea of the threats it faces, nor the countermeasures one should take to avoid those threats.
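To see how little magic is involved, here is a toy version of the width-matching attack, with made-up per-character widths and a candidate list (a real attack would use the actual font metric tables and kerning pairs):

    # Hypothetical advance widths, in arbitrary units (real attacks read
    # these from the font's metric tables).
    WIDTHS = {c: 10 for c in "abcdefghijklmnopqrstuvwxyz"}
    WIDTHS.update({"i": 4, "l": 4, "j": 5, "m": 15, "w": 14, " ": 5})

    def text_width(s: str) -> int:
        return sum(WIDTHS.get(c, 10) for c in s.lower())

    def shortlist(candidates: list[str], measured: int, tolerance: int = 2) -> list[str]:
        """Keep only the candidate names whose typeset width fits the redacted gap."""
        return [c for c in candidates if abs(text_width(c) - measured) <= tolerance]

    # shortlist(["Alice Smith", "Bob Jones"], measured=92) -> ["Alice Smith"]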

Britain's HMRC Identity Chernobyl

The recent British Identity Chernobyl demands our close examination.

Consider:

  • the size of the breach – loss of one person's identity information is cause for concern, but HMRC lost the information on 25 million people (7.5 million families)
  • the actual information “lost” – unencrypted records containing not only personal but also banking and national insurance details (a three-for-one…)
  • the narrative – every British family with a child under sixteen years of age made vulnerable to fraud and identity theft

According to Bloomberg News,

Political analysts said the data loss, which prompted the resignation of the head of the tax authority, could badly damage the government.

“I think it’s just a colossal error that I think could really rebound on the government’s popularity”, said Lancaster University politics Professor David Denver.

“What people think about governments these days is not so much about ideology, but about competence, and here we have truly massive incompetence.”

Even British Chancellor Alistair Darling said,

“Of course it shakes confidence, because you have a situation where millions of people give you information and expect it to be protected.”

Systemic Failure

Meanwhile, in parliament, Prime Minister Gordon Brown explained that security measures had been breached when the information was downloaded and sent by courier to the National Audit Office, although there had been no “systemic failure”.

This is really the crux of the matter. Because, from a technology point of view, the failure was systemic.

We are living in an age where systems dealing with our identity must be designed from the bottom up not to leak information in spite of being breached.  Perhaps I should say, “redesigned from the bottom up”, because today's systems rarely meet the bar.  It's not that data protection wasn't considered when devising them.  It is simply that the profound risks were not yet evident, and guaranteeing protection was not seen to be as fundamental as meeting other design goals – like making sure the transactions balanced or abusers were caught.

Isn't it incredible that “a junior official” could simply “download” detailed personal and financial information on 25 million people?  Why would a system be designed this way? 

To me this is the equivalent of assembling a vast pile of dynamite in the middle of a city on the assumption that excellent procedures would therefore be put in place, so no one would ever set it off.  

There is no need to store all of society's dynamite in one place, and no need to run the risk of the colossal explosion that an error in procedure might produce.

Similarly, the information that is the subject of HMRC's identity catastrophe should have been partitioned – broken up both in terms of the number of records and the information components.

In addition, it should have been encrypted – even rights protected from beginning to end.  And no official (a.k.a. insider) should ever have been able to get at enough of it that a significant breach could occur.
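A minimal sketch of the partition-and-encrypt idea, using the open-source Python cryptography library (the hundred-partition scheme and record layout are illustrative assumptions, not a description of any HMRC system):

    from cryptography.fernet import Fernet

    # One key per partition, held by different custodians, so that no single
    # insider – and no single lost disc – can expose the whole population.
    partition_keys = {p: Fernet.generate_key() for p in range(100)}

    def store(record_id: int, payload: bytes) -> tuple[int, bytes]:
        """Encrypt each record under its partition's key before it is stored
        or shipped anywhere; a breach of one partition stays a small breach."""
        partition = record_id % 100
        return partition, Fernet(partition_keys[partition]).encrypt(payload)

    def load(partition: int, token: bytes) -> bytes:
        return Fernet(partition_keys[partition]).decrypt(token)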

Gordon Brown, like other political leaders, deserves technical advisors savvy enough to explain the advantages of adopting new approaches to these problems.  Information technology is important enough to the lives of citizens that political leaders really ought to understand the implications of different technology strategies.  Governments need CTOs that are responsible for national technical systems in much the same ways that chancellors and the like are responsible for finances.

Rather than being advised to apologize for systems that are fundamentally flawed, leaders should be advised to inform the population that the government has inherited antiquated systems that are not up to the privacy requirements of the digital age, and put in place solutions based on breach-resistance and privacy-enhancing technologies. 

The British information commissioner, Richard Thomas, is conducting a broad inquiry on government data privacy.  He is quoted by the Guardian as saying he was demanding more powers to enter government offices without warning for spot-checks.

He said he wanted new criminal penalties for reckless disregard of procedures. He also disclosed that only last week he had sought assurances from the Home Office on limiting information to be stored on ID cards.

“This could not be more serious and has to be a serious wake-up call to the whole of government. We have been warning about these dangers for more than a year.”

I have never understood why any politician in his (or her) right mind wouldn't want to be on the privacy-enhancing and future-facing side of this problem.

Hunky-Dory

 Paul Madsen at ConnectID writes:

Kim defends CardSpace on the issue of the Display Token.

Personally, I think it's a UI issue. The concern would be mitigated if the identity selector were to simply preface the display token with a caveat:

The following attributes are what the IDP claims to be sending. If you do not trust your IdP, do not click on “Send”.

If the UI doesn't misrepresent the reality of what the DisplayToken is (and isn't), then we're hunky-dory.

And of course, CardSpace is not the only WS-Trust based identity selector in town. The other selectors are presumably under no constraints to deal with DisplayToken in the same way as does CardSpace? 

Paul has a good point and I buy the “general idea”.  I guess my question would be, should this warning be presented each time an Information Card is used, or just when making the initial decision to depend on a new card? 

I think the answer should come from “user studies”:  let's find out what approach is more effective.  I hear a lot of user interface experts telling us to reduce user communication to what is essential at any specific point in time so that what is communicated is effectively conveyed.

Despite this notion, identity providers should be held accountable for ensuring that the contents of information tokens correspond to the contents of their associated display tokens.  This should be mandated in the digital world.
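In code terms the audit is not complicated.  Here is a toy check, assuming the claims in each token can be parsed into name/value pairs (real tokens are signed XML or similar structures, so this only illustrates the comparison an auditor would mandate):

    def audit_display_token(information_claims: dict[str, str],
                            display_claims: dict[str, str]) -> list[str]:
        """Return the claims where what the user was shown in the display
        token diverges from what the information token actually carried."""
        keys = set(information_claims) | set(display_claims)
        return [k for k in keys
                if information_claims.get(k) != display_claims.get(k)]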

By the way, I love Paul's recollection of the word “Hunky-Dory”.  He gives a nice reference.  Funny – I always thought it referred to a “certain beverage”.

EPIC opposes Google / Doubleclick merger

Last week the Electronic Privacy Information Center (EPIC) made an agenda-setting intervention on the newest dangers in digital privacy.  EPIC, perhaps the world’s most influential privacy advocacy group, presented its brief to a US Senate hearing looking into Google’s proposed acquisition of Doubleclick.

According to USA Today,

“The Federal Trade Commission is already reviewing whether the Google-DoubleClick combination would violate antitrust law.  Consumer groups are pressing the agency to also scrutinize Google's privacy practices.  Marc Rotenberg, executive director of the Electronic Privacy Information Center, told the Senate committee that Google should be required to strengthen its privacy practices as a condition of the acquisition.”


We need a spectrum

Stefan Brands runs off in the wrong direction in his recent treatise on OpenID.  Who really needs a “shock and awe” attempt to bonk the new OpenID “cryptographic primitives” back into the stone age?

It's not that you shouldn't read the piece; even though subtlety is not its strong suit, it brings together a lot of information about a threat model for OpenID.

The main problem is simply that it misses the whole point about why OpenID is of interest.