Information Card user education resources

Keith Brown points out that we need some permanent web resources that would teach people about Information Cards, how they work, how to use them, and so on:

As I add support for information cards to Pluralsight, I'm rather surprised that I'm having trouble finding official landing pages for consumers. For example, on our logon page, there will be a button to click to log in using an information card, kind of like what you see on Kim's login page. For people who don't know what an information card is, this might be confusing, so of course we'll want a link that points to some documentation. But right now it seems as though everyone is creating their own descriptions for this. Here's Kim's what is an information card page, for example.

It seems as though it would help adoption if there were some centralized descriptions of this stuff. Do these pages exist and I'm just missing them? Or is it that Microsoft only wants to talk about CardSpace, which is their implementation of the selector? I note that when Kim wants to tell you how to install an identity selector, he points to a WordPress blog called the Pamela Project, which doesn't seem too helpful, but might be interesting for someone wanting to add support for information cards to their WordPress blog.

It seems to me that if the industry really wants consumers to start adopting information cards, somebody's going to have to explain this stuff in terms my mother can understand, and it would help to have a common place where those explanations live.

In another post, discussing the issue with Richard Turner, he adds:

In my opinion, somebody (Microsoft?) needs to break this holding pattern fast. I agree that things aren't going to take off until there are more relying parties. But as a guy who is busy doing just that (adding support for infocard to pluralsight.com), it doesn't make me feel very comfortable that those consumer landing pages I talked about in my post don't already exist on the web. I happen to be very committed to this technology, so I'm going to implement a relying party no matter what. Other websites might not be so inclined.

I agree this would help – and simplify our lives.  When I InfoCard-enabled my site, I had to cobble stuff up from scratch.  It was tedious since none of the materials existed.   Maybe that's why my help screens are a bit, as Keith is too polite to tell you, crude.

It sure would be neat to have a PERMANENT location everyone's Information Card help links can point to.  That would provide consistency, and let us get some really good resources together, including videos. 

I'll bounce this idea around with others here at Microsoft and see how we can play, as Keith says, a leadership role in making this happen.

What does the identity provider know?

I appreciate the correction by Irving Reid in a posting called What did the identity provider know?

…And when did it know it?

Kim Cameron sums up the reasons why we need to understand the technical possibilities for how digital identity information can affect privacy; in short, we can’t make good policy if we don’t know how this stuff actually works.

But I want to call out one assertion he (and he’s not the only one) makes:

 First, part of what becomes evident is that with browser-based technologies like Liberty, WS-Federation and OpenID,  NO collusion is actually necessary for the identity provider to “see everything”.

The identity provider most certainly does not “see everything”. The IP sees which RPs you initiate sessions with and, depending on configuration, has some indication of how long those sessions last. Granted, that is *a lot* of information, but it’s far from “everything”. The IP must collude with the RPs to get any information about what you did at the RP during the session.

Completely right. I'll try to make this clearer as I go on. Without collusion, the IP doesn't know how the user actually behaved while at the RP.  I was too focussed on the “identity channel”, thinking about the fact that the IP knows times, what RPs were visited, and what claims were released for each particular user to each RP.
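To pin down what the identity provider does and doesn't see, here is a toy sketch in Python.  The field names are invented for the example – this isn't any particular product's log format – but it captures the asymmetry Irving describes:

```python
import datetime

# What a redirect-based identity provider can log on its own: who asked
# for a token, for which relying party, when, and which claims were sent.
idp_log = [
    {"user": "alice",
     "rp": "https://store.example",
     "time": datetime.datetime(2007, 6, 20, 9, 15),
     "claims_released": ["age-over-18", "country"]},
]

# What the relying party logs: what the visitor actually did there.
# The IP never sees this side unless the RP hands it over.
rp_log = [
    {"session": "s-4711",
     "pages": ["/catalog", "/checkout"],
     "purchase": "garden gnome"},
]

# The two views share no common field: "seeing everything" on the
# identity channel still reveals nothing about behaviour at the RP.
print(set(idp_log[0]) & set(rp_log[0]))   # -> set()
```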

Collusion takes effort; how much?

Eric Norman, from the University of Wisconsin, has a new blog called Fun with Metaphors and an independent spirit that is attractive and informed.  He weighs in on our recent discussion with Collusion takes effort:

Now don't get me wrong here. I'm all for protection of privacy. In fact, I have been credited by some as raising consciousness about 8 years ago (pre-Shibboleth) in the Internet2 community to the effect that privacy concerns need to be dealt with in the beginning and at a fundamental level instead of being grafted on later as an afterthought.

There have been recent discussions in the blogosphere about various parties colluding to invade someone's privacy. What I would like to see during such discussions is a more ecological and risk-assessing approach. I'll try to elaborate.

The other day, Kim Cameron analyzed sundry combinations of colluding parties and identity systems to find out what collusion is possible and what isn't. That's all well and good and useful. It answers questions about what's possible in a techno- and crypto- sense. However, I think there's more to the story.

The essence of the rest of the story is that collusion takes effort and motivation on the part of the conspirators. Such effort would act as a deterrent to the formation of such conspiracies and might even make them not worthwhile.

Just the fact that privacy violations would take collusion might be enough to inhibit them in some cases. This is a lightweight version of separation of duty — the nuclear launch scenario; make sure the decision to take action can't be unilateral.

In some of the cases, not much is said about how the parties that are involved in such a conspiracy would find each other. In the case of RPs colluding with each other, how would one of the RPs even know that there's another RP to conspire with and who the other RP is? That would involve a search and I don't think they could just consult Google. It would take effort.

Just today, Kaliya reported another example. A court has held that email is subject to protection under the Fourth Amendment and therefore a subpoena is required for collusion. That takes a lot of effort.

Anyway, the message here is that it is indeed useful to focus on just the technical and cryptographic possibilities. However, all that gets you is a yes/no answer about what's possible and what's not. Don't forget to also include the effort it would actually take to make such collusions happen.

First of all, I agree that the technical and crypto possibilities are not the whole story of linkability.  But they are a part of the story we do need to understand a lot more objectively than is currently the case.  Clearly this applies to technical people, but I think the same goes for policy makers.  Let's get to the point where the characteristics of the systems can be discussed without emotion or the bias of any one technology.

Now let's turn to one of Eric's main points: the effort required for conspirators to collude would act as a deterrent to the formation of such conspiracies.

First, part of what becomes evident is that with browser-based technologies like Liberty, WS-Federation and OpenID,  NO collusion is actually necessary for the identity provider to “see everything” – in the sense of all aspects of the identity exchange.  That in itself may limit use cases.   It also underlines the level of trust the user MUST place in such an IP.  At the very minimum, all the users of the system need to be made aware of how this works.  I'm not sure that has been happening…

Secondly, even if you blind the IP as to the identity of the RP, you clearly can't prevent the inverse, since the RP needs to know who has made the claims!  Even so,  I agree that this blinding represents something akin to “separation of duty”, making collusion a lot harder to get away with on a large scale.

So I really am trying to set up this continuum to allow for “risk assessment” and concrete understanding of different use cases and benefits.  In this regard Eric and I are in total agreement.

As a concrete example of such risk assessment, people responsible for privacy in government have pointed out to me that their systems are tightly connected, and are often run by entities who provide services across multiple departments.  They worry that in this case, collusion is very easy.  Put another way, the separation of duties is too fragile.

Assemble the audit logs and you collude.  No more to it than that.  This is why they see it as prudent to put in place a system with properties that make routine creation of super-dossiers more difficult.  And why we need to understand our continuum. 

Kafka would have been proud

Here, via MSNBC, is a message in a bottle from some dimension I would not otherwise believe existed: 

VIENNA, Va. – A rule against physical contact at a Fairfax County middle school is so strict that students can be sent to the principal's office for hugging, holding hands or even high-fiving.

Unlike some schools in the Washington area, which ban fighting or inappropriate touching, Kilmer Middle School in Vienna bans all touching — and that has some parents lobbying for a change.

Hugging was Hal Beaulieu's crime when he sat next to his girlfriend at lunch a few months ago and put his arm around her shoulder. He was given a warning, but told that repeat missteps could lead to detention.

“I think hugging is a good thing,” said Hal, a seventh-grader. “I put my arm around her. It was like for 15 seconds. I didn't think it would be a big deal.”

But at a school of 1,100 students that was meant to accommodate 850, school officials think some touching can turn into a big deal. They've seen pokes lead to fights, gang signs in the form of handshakes or girls who are uncomfortable being hugged but embarrassed to say anything.

“You get into shades of gray,” Kilmer Principal Deborah Hernandez said. “The kids say, ‘If he can high-five, then I can do this.’ ”

Hernandez said the no-touching rule is meant to ensure that all students are comfortable and crowded hallways and lunchrooms stay safe. She said school officials are allowed to use their judgment in enforcing the rule. Typically, only repeat offenders are reprimanded.

‘Making out goes too far’

But such a strict policy doesn't seem necessary to 13-year-old Hal and his parents, who have written a letter to the county school board asking for a review of the rule. Hugging is encouraged in their home, and their son has been taught to greet someone with a handshake.

Hal said he feels he knows what's appropriate and what's not.

“I think you should be able to shake hands, high-five and maybe a quick hug,” he said. “Making out goes too far.”

His parents said they agree that teenagers need to have clear limits but don't want their son to be taught that physical contact is bad.

“How do kids learn what's right and what's wrong?” Henri Beaulieu asked. “They are all smart kids, and they can draw lines. If they cross them, they can get in trouble. But I don't think it would happen too often.”

I can't help thinking of Kafka's ironic question, “If judges are putting to death the mentally retarded, why is this judge still alive?” 

Long live minimal disclosure tokens!

Stefan Brands has a nice new piece called, Anonymous Credentials? No, Minimal Disclosure Certificates!  I think he's right about the need to stay away from the moniker “anonymous credentials”.  I adopted it – in spite of the confusion it creates – but I hereby give it up.  If I use it again, slap me around:

“Kim Cameron is in the midst of blogging an excellent series of posts on the important topic of unlinkability; see here and here, for instance. As I had expected from past experience, several commenters on Kim’s post (such as here and here) wrongly equate unlinkability with anonymity. Of course, an unfortunate choice of terminology (‘anonymous credentials’) does not help at all in this respect… 

“In short, ‘anonymous credentials’ are not all about anonymity. They are about the ability to disclose the absolute minimum that is required when presenting an identity claim. Similarly, ‘unlinkability’, ‘untraceability’, and ‘selective disclosure’ are not about anonymity per se.

“Anonymity is just an extreme point on the privacy ‘spectrum’ that can be achieved, all depending on what attribute information is encoded in certificates and what of that is disclosed at presentation time. Currently prevalent technologies, such as standard digital signatures and PKI/X.509 certificates, are poor technologies for protecting identity claims, since they inescapably leak a lot of identifying information when presenting protected identity claims; in particular, they disclose universally unique identifiers (correlation handles) that can be used to unambiguously link their presentation to their issuance.

I hope people will think hard about the difference between privacy and anonymity to which Stefan calls our attention.  Both are important, but people in the privacy community consider it crucial not to conflate them.  I'll try to find pointers to some of the detailed analysis that has been done by people like Simon Davies of Privacy International and Ann Cavoukian, Privacy Commissioner of Ontario, in this area – not to mention a host of other advocates and policy specialists.

So I'm going to STOP saying anonymous credentials, and even fix my very time-consuming graphic!  (Is that true dedication or what??) 

But I hope Stefan will allow me to say “Minimal Disclosure Tokens” rather than “Minimal Disclosure Certificates”.  I know the word “token” can possibly be confused with a hardware second factor, but its usage has become widely accepted in the world of web services and distributed computing.  Further, I want to get away from the connotations of X.509 “certificates” and forge new ground.  Finally, the word “tokens” ties in to thinking about claims, and allows us to envisage not only “hiding” as a means of minimization, but less draconian mechanisms as well.
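To make the claims-oriented way of thinking concrete, here is a toy sketch in Python.  It shows only the claims flow – none of Stefan's actual cryptography, which is what makes a real minimal disclosure token unforgeable – and every name in it is invented for the example:

```python
import datetime

# Everything the issuer knows about the subject.
full_claims = {
    "name": "Alice Example",
    "date_of_birth": datetime.date(1980, 3, 14),
    "country": "CA",
    "member_number": "889-220-417",
}

def x509_style_presentation(claims):
    """An X.509-style certificate discloses every encoded attribute,
    plus the same unique correlation handle, on every use."""
    return dict(claims, serial_number="0x51f3a9")

def minimal_disclosure_presentation(claims, requested):
    """A minimal disclosure token releases only what the relying party
    actually needs - here a derived predicate, not the raw birthdate."""
    disclosed = {}
    if "age-over-18" in requested:
        days_old = (datetime.date.today() - claims["date_of_birth"]).days
        disclosed["age-over-18"] = days_old >= 18 * 365
    return disclosed

print(x509_style_presentation(full_claims))              # leaks everything
print(minimal_disclosure_presentation(full_claims,
                                      ["age-over-18"]))  # {'age-over-18': True}
```

The second function is the point: “hiding” is only one kind of minimization – deriving a predicate from an attribute is another, less draconian one.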

Revealing patterns when there is no need to do so

Irving Reid of Controlled Flight into Terrain has come up with exactly the kind of use case I wanted to see when I was thinking about Paul Madsen's points:

Kim Cameron responds to Paul Madsen responding to Kim Cameron, and I wonder what it is about Canadians and identity…

But I have to admit that I have not personally been that interested in the use case of presenting “managed assertions” to amnesiac web sites.  In other words, I think the cases where you would want a managed identity provider for completely amnesiac interactions are fairly few and far between.  (If someone wants to turn me around in this regard I’m wide open.)

Shibboleth, in particular, has a very clear requirement for this use case. FERPA requires that educational institutions disclose the least possible information about students, staff and faculty to their partners. The example I heard, back in the early days of SAML, was of an institution that had a contract with an on-line case law research provider such that anyone affiliated with the law school at that institution could look up cases.

In this case, the “managed identity provider” (representing the educational institution) needs to assert that the person visiting right now is affiliated with the law school. However, the provider has no need to know anything more than that, and therefore the institution has a responsibility under FERPA to not give the provider any extra information. “The person looking up Case X right now is the same person who looked up Case Y last week” is one of the pieces of information the institution shouldn’t share with the provider.

Put this way it is obvious that it breaks the law of minimal disclosure to reveal that “the person looking up Case X right now is the same person who looked up Case Y last week” when there is no need to do so.
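As a sketch of what the institution's identity provider has to do here – hypothetical Python, not Shibboleth's actual code – each visit gets a freshly minted handle, so the case-law provider learns the affiliation and nothing it could use to join one visit to the next:

```python
import secrets

def issue_assertion():
    """Mint a fresh, single-use assertion for each visit.  The only
    durable claim is the affiliation the contract requires."""
    return {
        "handle": secrets.token_hex(16),      # transient, never repeats
        "affiliation": "law-school-member",   # all FERPA lets us disclose
    }

looked_up_case_x = issue_assertion()   # last week's visit
looked_up_case_y = issue_assertion()   # today's visit

# The provider holds no key on which to join the two visits:
assert looked_up_case_x["handle"] != looked_up_case_y["handle"]
```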

I initially didn't see that a pseudonymous link between Case X and Case Y would leak very much information.  But on reflection, in the competitive world of academic research, these linkages could benefit an observer by revealing patterns the observer would not otherwise be aware of.  He might not know whose research he was observing, but might nonetheless cobble a paper together faster than the original researcher, beating him in terms of publication date.

I'll include this example in discussing some of the collusion issues raised by various identity technologies.

Colluding with yourself

Further to Dave Kearns' article, here is the complete text of Paul Madsen's comment:

Kim Cameron introduces a nice diagram into his series exploring linkability & correlation in different identity systems.

Kim categorizes correlation as either ‘IP sees all’, ‘RP/RP collusion’, or ‘RP/IP collusion’, depending on which two entities can ‘talk’ about the user.

A meaningful distinction for RP/RP collusion that Kim omits (at least in the diagram and in his discussion of X.509) is ‘temporal self-correlation’, i.e. that in which the same RP is able to correlate the same user's visits occurring over time.

Were an IDP to use transient (as opposed to persistent pseudonymous) identifiers within a SAML assertion each time it asserted to an RP, then not only would RPs be unable to collude with each other (based on that identifier), they'd be unable to collude with themselves (the past or future themselves).

I was working on a diagram comparable to Kim's, but got lost in the additional axis for representing time (e.g. ‘what the provider knows and when they learned it’ when considering collusion potential).

Separately, Kim will surely acknowledge at some point (or already has) that these identity systems, with their varying degrees of inhibiting correlation & subsequent collusion, will all be deployed in an environment that, by default, does not support the same degree of obfuscation. Not to say that designing identity systems to inhibit correlation isn't important & valuable for privacy, just that there is little point in deploying such a system without addressing the other vulnerabilities (like a masked bank robber writing his ‘hand over the money’ note on a monogrammed pad).

First, I love Paul's comment that he “got lost in the additional axis”, since there are many potential axes – some of which have taken me to the steps of purgatory.  Perhaps we can collect them into a joint set of diagrams since the various axes are interesting in different ways.

Second, I want everyone to understand that I do not see correlation as being something which is in itself bad.  It depends on the context, on what we are trying to achieve.  When writing my blog, I want everyone to know it is “me again”, for better or for worse.  But as I always say, I would like to be able to use my search engine and read my newspaper without fear that some profile of me, the real-world Kim Cameron, would be assembled and shared.

The one statement Paul makes that I don't agree with is this: 

Were an IDP to use transient (as opposed to persistent pseudonymous) identifiers within a SAML assertion each time it asserted to an RP, then not only would RPs be unable to collude with each other (based on that identifier), they'd be unable to collude with themselves (the past or future themselves).

I've been through this thinking myself.

Suppose we got rid of the user identifier completely, and just kept the assertion ID that identifies a given SAML token (it must be unique across time and space – totally transient).  If the relying party received such a token and colluded with the identity provider, the assertion ID could be used to tie the profile at the relying party to the person who authenticated and got the token in the first place.  So it doesn't really prevent linking once you try to handle the problem of collusion.
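A toy sketch makes the problem plain (the logs are invented, and no particular protocol is implied): even a totally transient assertion ID leaves each side holding half of a join key, and colluding is just performing the join:

```python
# The identity provider remembers which user each assertion was minted for.
idp_records = {
    "assertion-a9f2c301": "kim",
    "assertion-77c40b02": "paul",
}

# The relying party remembers what was done under each assertion.
rp_records = {
    "assertion-a9f2c301": ["read Case X", "read Case Y"],
}

# Collusion: the RP shares its log and the IP joins on the assertion ID.
for assertion_id, activity in rp_records.items():
    user = idp_records.get(assertion_id)
    if user:
        print(user, "did", activity)   # the transient ID didn't help
```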

No masks in the grocery store

Dave Kearns discusses the first part of my examination of the relation between identity technologies and linking, beginning with a reference to Paul Madsen:

Paul Madsen comments on Kim Cameron's first post in a series he's about to do on privacy and collusion in on-line identity-based transactions. He notes:

A meaningful distinction for RP/RP collusion that Kim omits (at least in the diagram and in his discussion of X.509) is ‘temporal self-correlation’, i.e. that in which the same RP is able to correlate the same user's visits occurring over time.

and concludes:

Not to say that designing identity systems to inhibit correlation isn't important & valuable for privacy, just that there is little point in deploying such a system without addressing the other vulnerabilities (like a masked bank robber writing his ‘hand over the money’ note on a monogrammed pad).

Paul makes some good points.  Rereading my post, I tweaked it slightly to make it somewhat clearer that correlating the same user's visits occurring over time is one possible aspect of linking. 

But I have to admit that I have not personally been that interested in the use case of presenting “managed assertions” to amnesiac web sites.  In other words, I think the cases where you would want a managed identity provider for completely amnesiac interactions are fairly few and far between.  (If someone wants to turn me around in this regard I'm wide open.)  To me the interesting use cases have been those of pseudonymous identity – sites that respond to you over time, but are not linked to a natural person.  This isn't to say that whatever architecture we come out with can simply ignore use cases people think are important.

Dave continues:

I'd like to add that Kim's posting seems to fall into what I call on-line fallacy #1 – the on-line experience must be better in some way than the “real world” experience, as defined by some non-consumer “expert”. This first surfaced for me in discussions about electronic voting (see Rock the Net Vote), where I concluded “The bottom line is that computerized voting machines – even those running Microsoft operating systems [Dave, mais vous êtes trop méchant! – Kim]- are more secure and more reliable than any other ‘secret ballot’ vote tabulation method we've used in the past.”

When I re-visit a store, I expect to be recognized. I hope that the clerk will remember me and my preferences (and not have to ask “plastic or paper?” every single blasted time!). Customers like to be recognized when they return to the store. We appreciate it when we go to the saloon where “everybody knows your name” and the bartender presents you with a glass of “the usual” without you having to ask. And there is nothing wrong with that! It's what most people want. Fallacy #2 is that most Jeremiahs (those weeping, wailing, and tooth-gnashing doomsayers who wish to stop technology in its tracks) think that what they want is what everyone should want, and would want if the hoi polloi were only educated enough. (And people think I'm elitist! 🙂)

I do wish that all those “anonymity advocates” would start trying to anonymize themselves in the physical world, too. So here's a test – next time you need to visit your bank, wear a mask. Be anonymous. But tell your lawyer to stand by the phone…

Dave, I think you are really bringing up an important issue here.  But beyond the following brief comment, I would like to refrain from the discussion until I finish the technical exploration.  I ask you to go with me on the idea that there are cases where you want to be treated like you are in your local pub, and there are cases where you don't.  The whole world is not a pub – as much as that might have some advantages, like beer.

In the physical world we do leave impressions of the kind you describe.  But in the digital world they can all be assembled and integrated automatically and communicated intercontinentally to forces unknown to you in a way that is just impossible in the physical world.  There is absolutely no precedent for digital physics.  We need to temper your proposed fallacies with this reality.

I'm trying to do a dispassionate examination of how the different identity technologies relate to linking, without making value judgements about use cases.

That done, let's see if we can agree on some of the digital physics versus physical reality issues.

Evolving technology for better privacy

Let's continue to explore linking and how it relates to CardSpace, identity protocols, token formats and cryptography.

I've summarized a number of thoughts in the following diagram, which contrasts the linking threats posed by a number of technology combinations. The diagram presents these technologies on an ordinal scale ranging from the most dangerous to the least – along the axis of linkage prevention.

X.509 with OCSP

Let's begin with Public Key Infrastructure (PKI) technology employed with X.509 user certificates.

Here the user has a key only she can “exercise”, and some Certificate Authority (CA) mints a long-lived certificate binding her key to her name, organization, country and the like. When the user visits a relying party who trusts the CA, she presents the certificate and exercises the key – typically by using it to sign a challenge created by the relying party.

In many cases, in addition to binding the key to attributes,  the CA exists with the explicit mission of linking it to a “natural person” (in the sense of an identifiable person in the physical world).  However, for now we'll leave the meaning of assertions aside and look only at how the technology itself impacts privacy.

Since the user presents the same long-lived certificate to every relying party who trusts the CA, the certificate and the information in it link the user across sessions on one web site, and between one web site and another. Any two web sites obtaining this kind of certificate can compare notes and determine that the same user has visited each of them. This allows them to link their profiles into a super-dossier (possibly one tied to a natural person).
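To see how little effort this correlation takes, here is an illustrative sketch in Python – not any real TLS stack, just the arithmetic of the attack.  Each relying party hashes the certificate bytes it was shown, and identical fingerprints link the visitor across sites without any help from the CA:

```python
import hashlib

# Stand-in for the DER bytes of the one long-lived certificate
# the user presents everywhere.
alice_cert = b"...the same certificate bytes at every site..."

def fingerprint(cert_bytes):
    """Each site can compute this locally from what it was shown."""
    return hashlib.sha1(cert_bytes).hexdigest()

recorded_at_site_a = fingerprint(alice_cert)
recorded_at_site_b = fingerprint(alice_cert)

# When the two sites compare notes, the match links their dossiers:
print(recorded_at_site_a == recorded_at_site_b)   # True: same visitor
```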

What is good about X.509 is that if a relying party does not collude, the CA has no visibility into the fact that a given user has visited it (we will see that in some other systems such visibility is unavoidable). But a relying party could at any point decide to collude with the CA (assuming the CA actually accepts such information, which may be a breach of policy).  This might result in the transfer of information in either direction beyond that contained in the certificate itself.

So in the diagram, I express this through two risks of collusion. The first is between any two relying parties who receive the same certificate. The second is between any relying party and the certificate authority. In essence, then, every participating party can collude with any other party, so this represents one of the worst possible technology alternatives if privacy is your goal.

In light of this it makes sense that X.509 has been successful as a technology for public entities like corporate web sites, where correlation is actually a good thing, but not for individual identification where privacy is part of the equation.

(Continues tomorrow…).

News on the Australian “Access Card”

Here is a report from The Australian about the issues surrounding Australia's Human Services Access Card.  Some of the key points: 

“By this time next year, the federal Government hopes to be interviewing and photographing 35,000 Australians each day to create the nation's first ID databank. Biometric photos, matched with names, addresses, dates of birth, signatures, sex, social security status and children's details, would be loaded into a new centralised database. Welfare bureaucrats, ASIO, the Australian Federal Police and possibly even the Australian Taxation Office would have some form of access to the unprecedented collection of identity data.

“Within three years, all Australians seeking benefits such as Medicare, pensions, childcare subsidies, family payments, unemployment or disability allowances – about 16.5 million people – would have joined the databank. They would be given a photographic access card to prove who they are and show their eligibility for social security.

“This week, however, the billion-dollar project hit a bump when Human Services Minister Chris Ellison revealed that legislation due to go before federal Parliament this month had been delayed…

“How will Australians’ privacy be protected? How will the database and cards be kept secure? Who can see information on the card? What identity documents will Australians need to acquire a card, and what will happen to the estimated 600,000 people without a birth certificate, passport or driver's licence?

“The Government's mantra is that this is not an ID card because it does not have to be carried, but users will have to show it to prove their identity when claiming welfare benefits…

“The Government claims the new system will stem between $1.6 billion and $3 billion in welfare fraud over the next decade…

“A key Government adviser, Allan Fels – a former chairman of the Australian Competition and Consumer Commission and now head of the Government's Access Card Consumer and Privacy Taskforce – is at loggerheads with Medicare, Centrelink and the AFP, who all want the new card to display the user's identification number, photograph and signature…

“The photo would be stored in a central database, as well as in a microchip that could be read by 50,000 terminals in government offices, doctors’ surgeries and pharmacies…

“Despite his official role as the citizens’ watchdog, Fels still has not seen the draft bill…

“‘The law should be specific about what is on the card, in the chip and in the database,’ he says. ‘If anyone in future wants to change that they would have to do nothing less than get an act of parliament through. We don't want a situation where, just by administrative decisions, changes can be made…’

“‘There will be no mega-database created that will record a customer's dealings with different agencies,’ the minister [Ellison] told the conference…

“Cardholders may be able to include sensitive personal information – such as their blood type, emergency contacts, allergies or illnesses such as AIDS or epilepsy – in the one-third of the microchip space that will be reserved for personal use. It is not yet clear who would have access to this private zone.

“Hansard transcripts of Senate committee hearings into the access card legislation reveal that police, spies and perhaps even the taxman will be able to glean details from the new database. The Department of Human Services admits the AFP will be able to obtain and use information from the databank and card chip to respond to threats of killing or injury, to identify disaster victims, investigate missing persons, or to ‘enforce criminal law or for the protection of the public revenue’.

“Australia's super-secretive spy agency, the Defence Signals Directorate, will test security for the new access card system…

“The Australian Privacy Foundation's no-ID-card campaign director, Anna Johnston, fears future governments could “misuse and abuse” the biometric databank…

(Full story…)

ID cards can be deployed in ways that increase, rather than decrease, the privacy of citizens, while still achieving the goals of fraud reduction.  It's a matter of taking advantage of new card and crypto technologies.  My view is that politicians would be well advised to fund such technologies rather than massive centralized databases.

As for the Defence Signals Directorate's access to identity data, what has this got to do with databases offering generalized access to every curious official?  You would think they were without other means.