Identity Crisis Podcast

Identity Crisis If you haven't read Jim Harper's book, Identity Crisis: How Identification Is Overused and Missunderstood I urge you to do so as soon as you can.

I was initially a bit skeptical about this book because – I hope my more politically inclined friends will forgive me – it was published by what I assume is a political “think tank”.  I worried it might reflect some kind of ideology, rather than being a dispassionate examination of reality.

But in this case I was wrong, wrong, wrong. 

Jim Harper really understands identification.  And he is better than anyone at explaining what identification systems won't do for us – or our institutions. He carefully explains why many of the proposed uses of identification are irrational – delivering results that are quite unrelated to what they are purported to do.  In my view, getting this message out is just as important as explaining what identity will do.  In fact it is a prerequisite for the identity big-bang.  There are two sides to this equation an we need to understand them both.

He directly takes on the myth that if only we knew what peoples’ identifiers were, “we would be safe”.  Metaphorically, he is asking what kind of plane we would rather fly in – one where the passengers’ identifiers have been checked against a database or one where they and their luggage have been screened for explosives and guns? 

I think he will convey to “lay people” why a so-called “blacklist” is one of the weakest forms of protection, showing that all you have to do is impersonate anyone not on it to sneak through the cracks.

The book is full of important discussions.  It has chapters like “Use identification less” and “Use authorization more.”  I have only one criticism of the book.  I would like to see us separate the notion of identity, on the one hand, and individual identification (or identifiers) on the other.  We need return to the original meaning of identity: the fact of being who or what a person or thing is.

As a simple example, suppose I'm a service provider building a chat room for children, and want to limit participation to children who are between 12 and 15.  Let me contrast two ways of doing this. 

In the first, all the children are given an identifier.  To get into the room, they present their identifier and prove they are the person to whom that identifier was given.  Then the chatroom system does a lookup in some public system linking identifier and age to make the access control decision.

In the second, the children are given a “digital claim” that they are of some age, and a way to prove they are the person to whom that “claim” was given.  The chatroom system just queries the claim to see if it meets its criteria.  There is no reference to any public or even private identifier.

My point is that the first mechanism involves use of an identifier.  The second still involves identity – in the sense of being what a person is – but the identification, so rightly put into question by Jim's book, has been put into the trashcan where it belongs.

The use of an identifier in our first example breaks the second Law of Identity (Data Minimization – release no more data than necessary). It breaks the third Law too (Fewest Parties – since it discloses use of information to a central database unnecessary to the transaction).   Finally, it breaks the Fourth Law (using an omnidirectional identifier when none is required).

The book was written before “claims-based thinking” began to gain mindshare, and so it's missing as a category in Jim's discussion of advanced identity technologies.  But we've talked extensively about these issues and we have concluded that we have no theoretical difference – in fact the alignment between his work and the Laws of Identity struck us both as remarkable given that we come at these issues from such different starting points. 

Jim's book is wonderful reading.  It should help newcomers better understand the Laws of Identity.  And this week the Cato Institute in Washington held an event at which Jim spoke, along with James Lewis, Director and Senior Fellow, Technology and Public Policy Program Center for Strategic and International Studies; and Jay Stanley, Public Education Director, Technology and Liberty Project American Civil Liberties Union.

Download the podcast or watch the video here.

 

State of the market or chance to get things right?

Eric Norlin of Digital Identity World comments on my concerns (note:  concerns are not allegations) about the need for client-side anti-spoofing components: 

Every now and then a technical disagreement betrays the state of a marketplace. That phenomenon is currently happening in the user-centric identity trenches.

The players are Kim Cameron (InfoCards/CardSpace) of Microsoft on one side and Dick Hardt (OpenID) of Sxip Identity on the other.  The issue: Kim's recent allegations that OpenID will make identity *less* secure and possibly result in security breaches that will set the user-centric identity work back in the minds of users.

The debate highlights where we are with user-centric identity.

The technical details all focus around the need (or lack of need) for client-side identity selectors with Kim arguing that its necessary to prevent spoofing, and Dick arguing that the spoofing security threat is acknowledged and defensible via OpenID. But the technical details (and argument) are not the most interesting thing.

Arguments like this, as all engineers know, are common in the world of the engineering. The reason is simple: the “engineer's mind” (versus the “marketer's mind”) naturally seeks the “perfect solution.” That's the blessing of the engineer's mind. It is, of course, also the curse.

As any student of technology history knows, the “perfect solution” has rarely won the battle of the marketplace. Instead, the solution that solved the problem set using “the principle of good enough”, and *also* attained a critical mass of adoption has won. Does that result in further problems to be solved? Of course it does! That, my friends, is the cycle of innovation.

The current debate between Kim and Dick actually serves to show us where the user-centric identity market actually is. Several years ago, two groups were competing around federation standards (the Liberty Alliance and Microsoft/IBM's WS-* standards). For what seemed like forever, they held obscure debates about the details of the standards. Eventually, the market moved forward (seemingly without either group's help), and now today we find ourselves witnessing a new Liberty Alliance President saying that the “gloves are off” and they'd like to find ways to converge with the WS-* standards.

That simple, recent analogy shows us where we are with user-centric identity. We're on the verge of the market beginning to really adopt some technology. These conversations don't reach this level unless those involved see this potential.

In the meantime, the engineers will continue to debate the details, and that's good for all of us.

I want people to understand I'm not against OpenID, and I don't see this as something that should turn into a war, marketing or other.  We should do everything we can to make OpenID as secure as possible, and that includes integrating it with InfoCards wherever this is possible.

 

 

Superpat and the third way

Pat Patterson leaps through the firmament to punctuate my recent discussion of minimal disclosure with this gotcha: 

But, but, but… how does the relying party know not to ask for givenname, surname and emailaddress the second (and subsequent) time round? It doesn't know that it's already collected those claims for that user, since it doesn't know who the user is yet…

In the case described by Pat, the site really does use a “registration” model like the one from BestBuy shown here. 

When registering you hand over your identity information, and subsequently you only “authenticate”. 

This is really the current model for how identity is handled by most web sites.  In other words the “Registration process” is completely separated from the “Returning user” process.

So the obvious answer to Pat's question is that when you press “create an account” above, you invoke an object tag that asks for the four attributes discussed earlier.  And if you press “Sign in”, you invoke an object tag that only asks for PPID and then associates with your stored information.  

In other words, there is no new problem and no new framework is required.

This doesn't prevent Pat from serving up a little irony:

If only there were some specification (perhaps part of some sort of framework) that, given a token from an authentication, allowed you to get the data you needed, subject, of course, to the user's permission. 

I guess it bothered Pat that I didn't include use of backend protocols as one of the options for reducing disclosure. 

I want to set this right.  I've said since the beginning that as I saw it, the PPID (or other authenticated identifier) delivered by an InfoCard could also be used to animate a back-end protocol such as he's refering to.  That's one of the reasons I thought everyone should be able to rally behind these proposals.

The third option

So let me add a third alternative to the two I gave yesterday (storing locally or asking the user to resubmit through infocard).  The relying party could authenticate the user using InfoCard and then contact the identity provider with the user's PPID and ask it for the information the user has already agreed should be released to it.  This could be done using the protocols referred to by Pat. 

My uberpoint is simple.  InfoCards are intended to be as neutral as possible in their technical assumptions (e.g. to be an identity platform) and can be used in many ways that make sense in different environments and use cases.

I don't personally agree that the back-end protocol route for obtaining attributes is either simpler or more secure than delivering the claims directly on an as-needed basis in the authentication token, but it is certainly possible and I'm sure it has its use cases.  I wonder if Pat's implementation of Information Cards, should there be one, will take this approach?  Interesting.

 

Resending of personal data with InfoCards

Eric Schultz writes with this question: 

I've been investigating CardSpace and the practicality of it's use for login on a new social networking site.

I have a question regarding the method through which data is transferred. I see that you can require certain claims from an InfoCard such as email, first and last name, zip code etc. When I look at the login code I see that the same claims are required again.

Does this mean that each time an InfoCard is sent all the personal data is resent? Isn't this dangerous for security/privacy? The potential for a server failure (malicious or not) caused by a buffer overflow, a coding mistake that outputs the details of session variables etc. seems rather risky in this scenario.

Perhaps I am being alarmist?

This is an area in which being “an alarmist” – perhaps I will rephrase it as being thoroughly pessimistic about what can go wrong – is the best starting point.  You questions are ones everyone should think about.

InfoCard and Minimal Disclosure

The simple answer is that there is nothing built into InfoCard concepts that requires a “relying party” to ask for attributes every time a user comes to its site.  Let's first look at the mechanics. 

The relying party controls what attributes it asks for by putting an OBJECT tag in the HTML page where the user opts to use an infocard.

The example shown here will bring up the infocard dialog and illuminate any cards that offer all four claims so the user can select one. 

If, next time, the relying party doesn't want to receive these claims, it just doesn't ask for them.  If it has stored them, it should be able to retrieve them when necessary by using  “privatepersonalidentifier” as a handle.  This identifier is just a random pairwise number meaningless to any other site, and so there is no identity risk in using it.

No theoretical bias 

In other words, the InfoCard system has no theoretical bias about what information should be asked for when.  Through the Laws of Identity we have tried to help people understand that they should only ask for what they need to complete a transaction and should only keep it for the length of time they absolutely must. 

In particular, there should be no hoarding of rainy-day information – information that “might come in handy” some day – but which is more likely to turn into a liability than into a benefit.

Do your risk analysis 

You'll need to do the conventional risk analysis and think about whether it is more dangerous to store the information or just ask for it on an “as-needed” basis and then forget it.  My personal sense is that it is more dangerous to store it than to use an on-demand approach. 

A central machine with the stored information that animates a successful internet business is a honeypot.  It could well be subject to insider attacks, and certainly, since it lives on the internet, will be subject to many attacks on the information it stores.  Why not avoid these problems completely?

Certainly, the on-demand approach has benefits in convincing customers and legal practitioners that, having held no identity information, you cannot be seen as being responsible for an identity meltdown.  To me this is very attractive, and something that has not been possible until now.

Conclusion

The examples Eric gives of things that can go wrong seem to me to apply even more strongly if you have stored information locally than if you ask for it on demand.

But as I said earlier, this just expresses my thinking – there is lots more to be written by Eric and hundreds of others as they develop applications. 

Meanwhile, InfoCard has no built-in assumptions around this and can be used in whatever way is appropriate to a given situation.

 

Separating apples from oranges

Dick Hardt of Sxip posted a reply to my recent comments on the fears I have about attacks on OpenID:

My good friend Kim Cameron posted the following misinformation on OpenID:

InfoCards change the current model in which the user can be controlled by an evil site. OpenID doesn’t. 

If a user ends up at an evil site today, it can pose as a good site known to the user by scooping the good site’s skin so the user is fooled into entering her username and password.

An evil site proxying a web based OpenID Provider is a known security threat in OpenID. There are a number of ways of thwarting this attack, and given that the user has a close relationship with their OP, not difficult to deploy. Similar to the advantages that CardSpace has with a client side implementation, Sxipper is a Firefox plug-in that provides an OpenID user with the same client side protections as CardSpace. OpenID is a protocol, similar at a high level to WS-* — so this is somewhat of an apples an oranges comparison. CardSpace could support OpenID and protect the user.

I’d like to see OpenID and InfoCard technologies come together more. I’ll be presenting a plan for that over the next little while.

I’m looking forward to seeing your thoughts Kim! Perhaps supporting OpenID in Cardspace?

I'm not just talking about (realtime) proxying.  I'm talking about social engineering attacks in general.  Where is the misinformation in my description of these attacks?  

Dick reassures us that use of the protocol to help convince uers to reveal their credential is a known attack.  Should this make me feel better?  I don't think so. 

I also don't think the mitigations I've heard about are too convincing – with a couple of exceptions. 

One of those is the mitigation devised by Dick himself – called Sxipper.  This is a plugin for the browser which, as I understand it, take control away from the evil party.  If all OpenID implementations were to add this feature it would be a big step forward!

But in that case, if we want to separate apples from oranges, let's not say that InfoCards require a client component and OpenID doesn't.  Let's say that to offer reasonable protection, InfoCards require a client component and so does OpenID.  That would, as Dick proposes, properly separate discussion of protocols from the overall delivery of an identity solution.

Use of a hardware token at the OpenID site also mitigates the social engineering attack, though not the man in the middle attack.

More later about how Cardspace and OpenID can converge further.

As simple as possible – but no simpler

Gunnar Peterson recently showed me a posting by Andy Jaquith (securitymetrics.org and JSPWiki) that combines coding, wiki, identity, Cardspace, cybervandals, and OpenID all into one post – a pretty interesting set of memes.  Like me, he finally actually shut off anonymous commentary because he couldn't stand the spam workload, and is looking for answers:

Last week's shutoff of this website's self-registration system was something I did with deep misgivings. I've always been a fan of keeping the Web as open as possible. I cannot stand soul-sucking, personally invasive registration processes like the New York Times website. However, my experience with a particularly persistent Italian vandal was instructive, and it got me thinking about the relationship between accountability and identity.

Some background. When you self-register on securitymetrics.org you supply a desired “wiki name”, a full name, a desired login id, and optionally a e-mail address for password resets. We require the identifying information to associate specific activities (page edits, attachment uploads, login/log out events) with particular members. We do not verify the information, and we trust that the user is telling truth. Our Italian vandal decided to abuse our trust in him by attaching pr0n links to the front page. Cute.

The software we use here on securitymetrics.org has decent audit logs. It was a simple matter of identifying the offending user account. I know which user put the porn spam on the website, and I know when he did it. I also know when he logged in, what IP address he came from, and what he claimed his “real” name was. But although I've got a decent amount of forensic information available to me, what I don't have any idea of whether the person who did it supplied real information when he registered.

And therein lies the paradox. I don't want to keep personal information about members — but at the same time, I want to have some degree of assurance that people who choose to become members are real people who have serious intent. But there's no way to get any level of assurance about intent. After alll, as the New Yorker cartoon aptly put it, on the Internet no-one knows if you're a dog. Or just a jackass.

During the holiday break, I did a bit of thinking and exploration about how to address (note I do not say “solve”) issues of identity and accountability, in the context of website registration. Because I am a co-author of the software we use to run this website (JSPWiki), I have a fair amount of freedom in coming up with possible enhancements.

One obvious way to address self-registration identity issue is to introduce a vetting system into the registration process. That is, when someone registers, it triggers a little workflow that requires me to do some investigation on the person. I already do this for the mailing list, so it would be a logical extension to do it for the wiki, too. This would solve the identity issue — successfully vetting someone would enable the administrator to have much higher confidence in their identity claims, albeit with some sacrifice

There's just one problem with this — I hate vetting people. It takes time to do, and I am always far, far behind.

A second approach is to not do anything special for registration, but moderate page changes. This, too, requires workflow. On the JSPWiki developer mailing lists, we've been discussing this option quite a bit, in combination with blacklists and anti-spam heuristics. This would help solve the accountability problem.

A third approach would be to accept third-party identities that you have a reasonable level of assurance in. Classic PKI (digital certificates) are a good example of third-party identities that you can inspect and choose to trust or not. But client-side digital certificates have deployment shortcomings. Very few people use them.

A promising alternative to client-side certificates is the new breed of digital identity architectures, many of which do not require a huge, monolithic corporate infrastructure to issue. I'm thinking mostly of OpenID and Microsoft's CardSpace specs. I really like what Kim Cameron has done with CardSpace; it takes a lot of the things that I like about Apple's Keychain (self-management, portability, simple user metaphors, ease-of-use) and applies it specifically to the issue of identity. CardSpace‘s InfoCards (I have always felt they should be called IdentityCards) are kind of like credit cards in your wallet. When you want to express a claim about your identity, you pick a card (any card!) and present it to the person who's asking.

What's nice about InfoCards is that, in theory, these are things you can create for yourself at a registrar (identity provider) of your choice. InfoCards also have good privacy controls — if you don't want a relying party (e.g., securitymetrics.org) to see your e-mail identity attribute, you don't have to release that information.

So, InfoCards have promise. But they use the WS-* XML standards for communication (think: big, hairy, complicated), and they require a client-side supplicant that allows users to navigate their InfoCards and present them when asked. It's nice to see that there's a Firefox InfoCard client, but there isn't one for Safari, and older versions of Windows are still left out in the cold. CardSpace will make its mark in time, but it is still early, methinks.

OpenID holds more promise for me. There are loads more implementations available (and several choices for Java libraries), and the mechanism that identity providers use to communicate with relying parties is simple and comprehensible by humans. It doesn't require special software because it relies on HTTP redirects to work. And best of all, the thing the identity is based on is something “my kind of people” all have: a website URL. Identity, essentially, boils down to an assertion of ownership over a URL. I like this because it's something I can verify easily. And by visiting your website, I can usually tell whether the person who owns that URL is my kind of people.

OpenID is cool. I got far enough into the evaluation process to do some reasonably serious interoperability testing with the SXIP and JanRain libraries. I mocked up a web server and got it to sucessfully accept identities from the Technorati and VeriSign OpenID services. But I hit a few snags.

Recall that the point of all of this fooling around is to figure out a way to balance privacy and authenticity. By “privacy”, I mean that I do not want to ask users to disgorge too much personal information to me when they register. And correspondingly, I do not want the custodial obligation of having to store and safeguard any information they give me. The ideal implementation, therefore, would accept an OpenID identity when presented, dyamically collect the attributes we want (really, just the full name and websute URL) and pull them into our in-memory session, and flush them at the end of the session. In other words, the integrity of the attributes presented, combined with transience yields privacy. It's kind of like the front-desk guard I used to see when I consulted to the Massachussetts Department of Mental Health. He was a rehabilitated patient, but his years of illness and heavy treatment left him with no memory for faces at all. Despite the fact I'd visited DMH on dozens of occasions, every time I signed in he would ask “Have you been here before? Do you know where you are going?” Put another way, integrity of identity + dynamic attribute exchange protocols + enforced amnesia = privacy.

By “authenticity” I mean having reasonable assurance that the person on my website is not just who they say they are, but that I can also get some idea about their intentions (or what they might have been). OpenID meets both of these criteria… if I want to know something more about the person registering or posting on my website, I can just go and check ’em out by visiting their URL.

But, in my experiments I found that the attribute-exchange process needs work… I could not get VeriSign's or Technorati's identity provider to release to my relying website the attributes I wanted, namely my identity's full name and e-mail addresses. I determined that this was because neither of these identity providers support what the OpenID people call the “Simple Registration” profile aka SREG.

More on this later. Needless to say, I am encouraged by my progress so far. And regardless of the outcome of my investigations into InfoCard and OpenID, my JSPWiki workflow library development continues at a torrid pace.

Bottom line: once we have a decent workflow system in place, I'll open registrations back up. And longer term, we will have more some open identity system choices.

Hmmm.  Interesting thoughts that I want to explore more over the next while.

Before I get to the nitty-gritty, please note that Cardspace and InfoCards do NOT require a client-side wiki or web site to use WS-* protocols

The system supports WS-*, which gives it the ability to handle upper-end scenarios, but doesn't require it and can operate in a RESTful mode! 

So the actual effort required to implement the client side is on the same order of magnitude as for OpenID.  But I agree there are not very many open-source options out there for doing this yet – requiring more creativity on the part of the implementor.  I'm trying to help with this.

It's also true that InfoCards require client software (although there are ways around this if someone is enterprising: you could build an infocard selector that “lives in the cloud”).

But the advantages of InfoCard speak clearly too.  Andy Jaquist would find that release of user information is built right in, and that “what you see is what you get” – user control.  Further, the model doesn't expose the user to the risk that personal information will become public by being posted on the web.  This makes it useful in a number of applications which OpenID can't handle without a lot of complexity.

But what's the core issue? 

InfoCards change the current model in which the user can be controlled by an evil site.  OpenID doesn't.

if a user ends up at an evil site today, it can pose as a good site known to the user by scooping the good site's skin so the user is fooled into entering her username and passord.

But think of what we unleash with OpenID…

It's way easier for the evil site to scoop the skin of a user's OpenID service because – are you ready? – the user helps out by entering her honeypot's URL!

By playing back her OpenID skin the evil site can trick the user into revealing her creds.  But these are magic creds,  the keys to her whole kingdom!  The result is a world more perilous than the one we live in now.

If that isn't enough, evil doers armed with identifiers and ill-gotten creds can then crawl the web to see where the URL they have absconded with is in play, and break into those locations too.

The attacks on OpenID all lend themselves to automation…

One can say all this doesn't matter because these are low-value identities, but I think it is a question of setting off on the wrong foot unless we build the evolution of OpenID into it.

It will really be a shame if all the interest in new identity technology leads to security breaches worse than those that currently exist, and brings about a further demoralization of the user.

I'd like to see OpenID and InfoCard technologies come together more.  I'll be presenting a plan for that over the next little while.

 

Back in action

My day job has conspired with the holidays to play havoc with my blog over the last while. 

What can I say?  Maybe something good will come out of it.  At least those of you who subscribe to my feed got a bit of peace and quiet!  And I feel rested and relatively renewed.  I missed writing.

At the same time, there were many exciting identity-related developments that came to my attention but which I wasn't able to pass on.  Sorry about that.  There was simply no way to “do everything simultaneously all at once”.

But on the positive side of the balance sheet, I was able to complete some work on how Cardspace actually behaves over the wire. 

I've put together a PHP implementation of the Identity Provider end of things which I hope will help better convey, in a cross-platform fashion, what is possible with the identity provider paradigm and how Cardspace actually uses the WS protocols.  I hope this, in conjunction with some important new documentation by Arun Nanda, will aid in the development of other compatible InfoCard implementations.

All that remains is to write about all this stuff.  So, here we go…