Kim Cameron's Identity Weblog – Page 47 – Digital Identity, Privacy, and the Internet's Missing Identity Layer

Scary phishing video from gnucitizen

Here's a must-see that punctuates our current conversation with – are you ready? – drama and numerous other arts.

Beyond the fact that it's just plain cool – and scary – as a production, it underlines one of the main points we need to get across. The evolution of the virtual world will be blocked until we can make it as safe as the physical one.

As Pam says, “this video really captures the panic involved in being hacked…”. That's for sure.

Integrating OpenID and Infocard – Part 1

Let's start by taking a step-by-step look at the basic OpenID protocol to see how the phishing attack works.

The system consists of three parties – the relying party (or RP), which wants an ID in order to provide services to the user; the user, running a browser; and the Identity Provider. (OpenID aficionados call it an OP – presumably because the phrase Open Identity Identity Provider smacks of the Department of Redundancy Department. Nonetheless, I'll stick with the term IP since I want to discuss this in a broader context.)

OpenID can employ a few possible messages and patterns, but I'll just deal with the one which is of concern to me. An interaction starts with the user telling the RP what her URL is (1). The RP consults the URL content to determine where the user's IP is located (not shown). Then it redirects the user to her IP to pick up an authentication token, as shown in (2) and (3). To do the authentication, the IP has to be sure that it's the user who is making the request. So it presents her with an authentication screen, typically asking for a username and password in (4). If they are entered correctly, the IP mints a token to send to the RP as shown in (5) and (6). If the IP and RP already know each other, this is the end of the authentication part of the protocol. If not, the back channel is used as well.
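The discovery step that the RP performs before the redirect can be sketched roughly as follows. This is a minimal Python illustration, assuming the page at the user's URL advertises her IP with a `<link rel="openid.server">` tag in the style of OpenID 1.x; the sample page and the `idp.example` address are made up for the example.

```python
from html.parser import HTMLParser


class ProviderLinkParser(HTMLParser):
    """Collects the href of the first <link rel="openid.server"> tag."""

    def __init__(self):
        super().__init__()
        self.endpoint = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.endpoint is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "openid.server" in rel_values:
            self.endpoint = attributes.get("href")


def discover_provider(page_html):
    """Return the provider endpoint advertised in the page, or None."""
    parser = ProviderLinkParser()
    parser.feed(page_html)
    return parser.endpoint


# Hypothetical content served at the user's URL.
SAMPLE_PAGE = """
<html><head>
  <title>Alice's page</title>
  <link rel="openid.server" href="https://idp.example/auth">
</head><body>Hello</body></html>
"""

print(discover_provider(SAMPLE_PAGE))
```

Note that it is the RP, not the user, that learns this address and issues the redirect in steps (2) and (3).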

The attack works as shown in the next diagram. The user unwittingly goes to an evil site (through conventional phishing or even by following a search engine). The user sends the evil RP her URL (1) and it consults the URL's content to determine the location of her IP (not shown). But instead of redirecting the user to the legitimate IP, it redirects her to the Evil Scooper site as shown in (2) and (3). The Evil Scooper contacts the legitimate IP and pulls down an exact replica of its login experience (it can even simply become a “man in the middle”) as shown in (4). Convinced she is talking to her IP, the user posts her credentials (username and password) which can now be used by the Evil Scooper to get tokens from the legitimate IP. These tokens can then be used to gain access to any legitimate RP (not shown – too gory).

The problem here is that redirection to the home site is under the control of the evil party, and the user gives that party enough information to sink her. Further, the whole process can be fully automated.

We can eliminate this attack if the user employs Cardspace (or some other identity selector) to log in to the Identity Provider. One way to do this is through use of a self-issued card. Let's look at what this does to the attacker.

Everything looks the same until step (4), where the user would normally enter her username and password. With self-issued cards, username and password aren't used and can't be revealed no matter how much the user is tricked. There is nothing to steal. The central “honeypot credentials” cannot be pried out of the user. The system employs public key cryptography and generates different keys for every site the user visits. So an Evil Scooper can scoop as much as it wants but nothing of value will be revealed to it.
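To make the "different keys for every site" idea concrete, here is a small Python sketch. It is not the actual CardSpace algorithm, and it uses a keyed hash rather than public key cryptography: it only assumes that a selector holds a per-card master secret and derives site-specific key material from it. The function name and the two site addresses are hypothetical.

```python
import hashlib
import hmac
import secrets


def derive_site_secret(card_master_secret, site_identity):
    """Derive key material that is unique to one card and one site."""
    return hmac.new(
        card_master_secret, site_identity.encode("utf-8"), hashlib.sha256
    ).digest()


# Hypothetical master secret that never leaves the identity selector.
card_master_secret = secrets.token_bytes(32)

legitimate = derive_site_secret(card_master_secret, "https://idp.example")
scooper = derive_site_secret(card_master_secret, "https://evil-scooper.example")

# The scooper only ever sees material bound to its own site identity...
print(legitimate != scooper)
# ...while the legitimate IP sees the same value on every visit.
print(legitimate == derive_site_secret(card_master_secret, "https://idp.example"))
```

In a real selector the site-specific material would back a key pair, and the site identity would come from something the attacker cannot forge, such as the site's certificate, rather than from a plain string.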

I'll point out that this is a lot stronger as a solution than just configuring a web browser to know the IP's address. I won't go into the many potential attacks on the web browser, although I wish people would start thinking about those, too. What I am saying is that the solution I am proposing benefits from cryptography, and that is a good thing, not a bad thing.

There are other advantages as well. Not the least of these is that the user comes to see authentication as being a consistent experience whether going to an OpenID identity provider or to an identity provider using some other technology.

So is this just like saying, “you can fix OpenID if you replace it with Cardspace”? Absolutely not. In this proposal, the relying parties continue to use OpenID in its current form, so we have a very nice lightweight solution. Meanwhile Cardspace is used at the identity provider to keep credentials from being stolen. So the best aspects of OpenID are retained.

How hard would it be for OpenID producers to go in this direction?

Trivial. OpenID software providers would just have to hook support for self-issued cards into their “OP” authentication. More and more software is coming out that will make this easy, and if anyone has trouble just let me know.

Clearly not everyone will use Infocards on day one. But if OpenID embraces the alternative I am proposing, people who want to use selectors will have the option to protect themselves. It will give those of us really concerned about phishing and security the opportunity to work with people so they can understand the benefits of Information Cards – especially when they want, as they inevitably will, to start protecting things of greater value.

So my ask is simple. Build Infocard compatibility into OpenID identity providers. This would help promote Infocards on the one hand, and result in enhanced safety for OpenID on the other. How can that be anything other than a WIN/WIN? I know there are already a number of people in the milieux who want to do this.

I think it would really help and is eminently doable.

This said, I have another proposal as well. I'll get to it over the next few days.

Gabe hits the nail on the head

This post by Gabe Wachob at Digital Identity and Beyond is golden:

There's been a lot of discussion about the fact that OpenID protocol has a special exposure to phishing/pharming and that the OpenID community needs to address these issues, either technically or through pressure on various parties to address phishing/pharming more broadly. There are a lot of proposals – in particular, we are all waiting to hear from Kim Cameron about OpenID and Cardspace (though applying Cardspace at the OpenID Provider seems like a straightforward solution).

If you ask me, things are happening EXACTLY how I would have wanted and expected them to. Why? Because OpenID is a platform for innovation in authentication. People who want to innovate in authentication methods (Mozilla/Firefox, Cardspace, VxVsolutions, etc) do NOT have to be the same people who innovate in offering services on the web (any one of a million folks running mediawiki, drupal, etc). That “delinking” of authentication innovation and service innovation is what is valuable in OpenID.

No, OpenID doesn't solve all problems, and maybe today it only solves a very narrow set of problems with an acceptable risk profile. But to me, that's not the point – it's the unleashing of creativity and the power to let developers and architects focus on what they are interested in and good at. Security and identity nuts can focus on authentication and let the social networking, wiki-touting, web 2.0-heads do what they do best! OpenID is an abstraction, a key middle ground for these folks to meet and leverage each other's work – that OpenID is deployed for use in a fairly narrow set of use cases TODAY should not mean that it will not be very important in the very near future…

Very interesting thinking and way to put things.

Ben Laurie and the “Kittens” phishing attack.

Here's a post about potential OpenID phishing problems by Ben Laurie, long-time security advocate who played an important role in getting SSL into open source. He's now at Google. Don't misinterpret his intentions despite his characteristically colorful introductory sentence – in a subsequent piece he makes it clear that he too wants to find solutions to these problems.

OpenID announced the release of a new draft of OpenID Authentication 2.0 today. I’m reluctantly forced to come to the conclusion that the OpenID people don’t care about phishing, since they’ve defined a standard that has to be the worst I’ve ever seen from a phishing point of view.

OK, so what’s the problem? If I’m a phisher my goal is to be able to log in to some website, the Real Website, as you, the Innocent Victim. In order to do this, I persuade you to go to a website I control that looks like the Real Website. When you log in, thinking it is the Real Website, I get your username and password, and I can then proceed to empty your Paypal account, write myself cheques from your bank account, or whatever fiendish plan I have today.

So, why does OpenID make this worse? Because in the standard case, I (the phisher) have to make my website look like the Real Website and persuade you to go to it somehow – i.e. con you into thinking I am the real Paypal, and your account really has been frozen (or is that phrozen?) and you really do need to log in to unphreeze it.

But in the OpenID case I just persuade you to go anywhere at all, say my lovely site of kitten photos, and get you to log in using your OpenID. Following the protocol, I find out where your provider is (i.e. the site you log in to to prove you really own that OpenID), but instead of sending you there (because, yes, OpenID works by having the site you’re logging in to send you to your provider) I send you to my fake provider, which then just proxies the real provider, stealing your login as it does. I don’t have to persuade you that I’m anything special, just someone who wants you to use OpenID, as the designers hope will become commonplace, and I don’t have to know your provider in advance.

So, I can steal login credentials on a massive basis without any tailoring or pretence at all! All I need is good photos of kittens.

I had hoped that by constantly bringing this up the OpenID people might take some step to deal with the issue, but they continue to insist on punting on it entirely:

The manner in which the end user authenticates to their OP [OpenID provider] and any policies surrounding such authentication is out of scope for this document.

which means, in practice, people will authenticate using passwords in forms, as usual. Which means, in turn, that phishing will be trivial.

Like me, Ben was struck with how readily the system currently lends itself to automation of phishing attacks. His second post on the subject is also interesting.

We need severe crypto

Someone just pointed out this super strong message from the Energy Field that is Marc Canter (founder of Macromedia and Broadband Mechanics):

Kim Cameron sets the record straight: State of the market or chance to get things right? And he has nothing against OpenID. But Kim is the god head and groks this shit better than any of us – so please listen to him! ID is a hell of a lot more than SSO or authentication and if we’re to stop phishing, and spoofing and ID theft – we need severe crypto, locked down, secure ID systems.

No one can say Marc doesn't speak in thunderbolts. This is a better summary of what I've been trying to say than I can manage (I would have elided the “god head” part!).

Dmitry Shechtman's Undevelopment Blog

So much is happening in the identity discussion it's hard to keep up with it. Through the miracles of ping-back I came across The Undevelopment Blog by Dmitry Shechtman, and this posting on a new proposal called Identity Manager:

It seems like the OpenID community is currently bothered with the following two questions:

OpenID facilitates phishing. What can be done about this?

FireFox 3.0 will have CardSpace and OpenID support. What does that mean?

I addressed the OpenID phishing problem even before it became wildly discussed. Unfortunately, the method wasn’t foolproof, to say the least. Several other suggestions have been brought up, but none seemed to solve the problem without making OpenID unusable.

Kim Cameron of Microsoft has been repeatedly promising to elaborate on how CardSpace and OpenID could converge. Although he has yet to keep his promise, we can make an educated guess. We recently saw the FireFox extension Identity Selector act as an in-browser OpenID-to-InfoCard bridge. That is definitely something CardSpace folks would love to see as a standard browser feature, since it would effectively turn an OpenID into nothing more than a fairly insecure InfoCard.

Of course, OpenID could simply dismiss CardSpace (I was trying to get into the average kool-aid drinker’s shoes). Or it could very well learn from it. The CardSpace UI seems very intuitive:

A Sign In button on a website

An identity selection dialog

Seamless secure login

This is exactly what OpenID needs in order to become both widely used and insusceptible to phishing. And since CardSpace planned support is now a reality, why shouldn’t OpenID be integrated? This is no trivial requirement, but one that can be met with some additions to the browser logic.

The combination of UI and business logic outlined in this proposal is dubbed Identity Manager. The proposal uses informal language (should, must, be and do are used interchangeably); handle with care.

Whenever a web page presents an OpenID sign in option, the OpenID field and the Sign In button are replaced by a single OpenID Sign In button. Moreover, separate OpenID Sign In and CardSpace Sign In buttons are replaced with a Secure Sign In button.

Once such a button is pushed, an Identity Manager window is presented with a list of the user’s identities – OpenIDs, InfoCards or both, depending on what the relying party accepts. The user must be able to decline; we treat this case as trivial. The user must be able to make a persistent selection (e.g. a checkbox with the text Always use this ID for example.com).

(Dmitry's piece continues here…)

I would never characterize OpenID as “nothing more than a fairly insecure infocard”. It is a system where the root of trust is defined to be control over the content at a URL. Folks, this is innovative. I like it as what I call an “underlying identity system” that should live within the identity metasystem. Given its theoretical starting point in terms of trust, OpenID has the security characteristics, good and bad, of the Internet which it harnesses in the name of identity. That makes it very exciting, especially for bottom-up use cases involving public personas.

But “exciting” doesn't mean “good for every purpose.” OpenID won't replace all other forms of digital identity!

Is it necessary to explain further?

I'm fine with blog comments being associated with my URL. But I don't want access to my bank account to be gated by nothing more than the ability to set the header in what a system thinks is https://www.identityblog.com (I'm thinking here about all the potential attacks on DNS as well as the ways in which third parties could gain unauthorized access to my page).

My site is hosted by the good people at http://www.textdrive.com. As administrators of the shared systems there, they could certainly, for example, gain access to my pages.

Are their employees bonded? Do they practice strict separation of duties for access to web pages? Do they have HR practices that will protect them from organized crime? I don't think so! And if they did, wouldn't they turn into the world's most bureaucratic mess as a web hosting service? Their flexibility and personal touch is what makes them so good. I like them just as they are, thank you very much.

So it all comes back to the Laws of Identity. There will be a pluralism of providers and technologies, optimal in different use cases. And, as the potential phishing attacks demonstrate, there remains the requirement of giving users a consistent and controlled experience across these multiple systems.

My conclusion?

Combine CardSpace (insert your favorite replacement identity selector here) with OpenID and you have the best of both worlds. You have the web-based identity system. You have a consistent anti-phishing user experience. And you have continuity between OpenID and other underlying systems in a metasystem. Wouldn't we all want this?

As Dmitry reports, I have promised to share my own technical ideas about how to move forward but haven't come through on my promise yet. So I'm going to do that now. One idea is very simple (and effective) – I'll start with that. The second is in many ways more interesting (at least to me) but I need to explain a bit more about managed cards before I get to it.

Identity Crisis Podcast

If you haven't read Jim Harper's book, Identity Crisis: How Identification Is Overused and Misunderstood, I urge you to do so as soon as you can.

I was initially a bit skeptical about this book because – I hope my more politically inclined friends will forgive me – it was published by what I assume is a political “think tank”. I worried it might reflect some kind of ideology, rather than being a dispassionate examination of reality.

But in this case I was wrong, wrong, wrong.

Jim Harper really understands identification. And he is better than anyone at explaining what identification systems won't do for us – or our institutions. He carefully explains why many of the proposed uses of identification are irrational – delivering results that are quite unrelated to what they are purported to do. In my view, getting this message out is just as important as explaining what identity will do. In fact it is a prerequisite for the identity big-bang. There are two sides to this equation and we need to understand them both.

He directly takes on the myth that if only we knew what people’s identifiers were, “we would be safe”. Metaphorically, he is asking what kind of plane we would rather fly in – one where the passengers’ identifiers have been checked against a database or one where they and their luggage have been screened for explosives and guns?

I think he will convey to “lay people” why a so-called “blacklist” is one of the weakest forms of protection, showing that all you have to do is impersonate anyone not on it to sneak through the cracks.

The book is full of important discussions. It has chapters like “Use identification less” and “Use authorization more.” I have only one criticism of the book. I would like to see us separate the notion of identity, on the one hand, and individual identification (or identifiers) on the other. We need to return to the original meaning of identity: the fact of being who or what a person or thing is.

As a simple example, suppose I'm a service provider building a chat room for children, and want to limit participation to children who are between 12 and 15. Let me contrast two ways of doing this.

In the first, all the children are given an identifier. To get into the room, they present their identifier and prove they are the person to whom that identifier was given. Then the chatroom system does a lookup in some public system linking identifier and age to make the access control decision.

In the second, the children are given a “digital claim” that they are of some age, and a way to prove they are the person to whom that “claim” was given. The chatroom system just queries the claim to see if it meets its criteria. There is no reference to any public or even private identifier.

My point is that the first mechanism involves use of an identifier. The second still involves identity – in the sense of being what a person is – but the identification, so rightly put into question by Jim's book, has been put into the trashcan where it belongs.
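The contrast between the two mechanisms can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the registry contents, the `AgeClaim` type and both function names are invented, and the step where the child proves that the identifier or claim really belongs to them is left out.

```python
from dataclasses import dataclass

MIN_AGE, MAX_AGE = 12, 15

# Mechanism 1: a public registry links each identifier to an age (made-up data).
PUBLIC_AGE_REGISTRY = {"child-0001": 13, "child-0002": 17}


def admit_by_identifier(identifier):
    """Look the identifier up in the registry; the registry sees every lookup."""
    age = PUBLIC_AGE_REGISTRY.get(identifier)
    return age is not None and MIN_AGE <= age <= MAX_AGE


@dataclass(frozen=True)
class AgeClaim:
    """Mechanism 2: a claim carrying only what the chat room needs to know."""
    age: int


def admit_by_claim(claim):
    """Check the presented claim directly; no identifier and no lookup."""
    return MIN_AGE <= claim.age <= MAX_AGE


print(admit_by_identifier("child-0001"), admit_by_claim(AgeClaim(age=13)))
```

Both functions reach the same access decision, but only the first one requires an identifier and a third-party database to do it.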

The use of an identifier in our first example breaks the second Law of Identity (Data Minimization – release no more data than necessary). It breaks the third Law too (Fewest Parties – since it discloses use of information to a central database unnecessary to the transaction). Finally, it breaks the Fourth Law (using an omnidirectional identifier when none is required).

The book was written before “claims-based thinking” began to gain mindshare, and so it's missing as a category in Jim's discussion of advanced identity technologies. But we've talked extensively about these issues and we have concluded that we have no theoretical difference – in fact the alignment between his work and the Laws of Identity struck us both as remarkable given that we come at these issues from such different starting points.

Jim's book is wonderful reading. It should help newcomers better understand the Laws of Identity. And this week the Cato Institute in Washington held an event at which Jim spoke, along with James Lewis, Director and Senior Fellow, Technology and Public Policy Program, Center for Strategic and International Studies; and Jay Stanley, Public Education Director, Technology and Liberty Project, American Civil Liberties Union.

Download the podcast or watch the video here.

State of the market or chance to get things right?

Eric Norlin of Digital Identity World comments on my concerns (note: concerns are not allegations) about the need for client-side anti-spoofing components:

Every now and then a technical disagreement betrays the state of a marketplace. That phenomenon is currently happening in the user-centric identity trenches.

The players are Kim Cameron (InfoCards/CardSpace) of Microsoft on one side and Dick Hardt (OpenID) of Sxip Identity on the other. The issue: Kim's recent allegations that OpenID will make identity *less* secure and possibly result in security breaches that will set the user-centric identity work back in the minds of users.

The debate highlights where we are with user-centric identity.

The technical details all focus around the need (or lack of need) for client-side identity selectors with Kim arguing that it's necessary to prevent spoofing, and Dick arguing that the spoofing security threat is acknowledged and defensible via OpenID. But the technical details (and argument) are not the most interesting thing.

Arguments like this, as all engineers know, are common in the world of engineering. The reason is simple: the “engineer's mind” (versus the “marketer's mind”) naturally seeks the “perfect solution.” That's the blessing of the engineer's mind. It is, of course, also the curse.

As any student of technology history knows, the “perfect solution” has rarely won the battle of the marketplace. Instead, the solution that solved the problem set using “the principle of good enough”, and *also* attained a critical mass of adoption has won. Does that result in further problems to be solved? Of course it does! That, my friends, is the cycle of innovation.

The current debate between Kim and Dick actually serves to show us where the user-centric identity market actually is. Several years ago, two groups were competing around federation standards (the Liberty Alliance and Microsoft/IBM's WS-* standards). For what seemed like forever, they held obscure debates about the details of the standards. Eventually, the market moved forward (seemingly without either group's help), and now today we find ourselves witnessing a new Liberty Alliance President saying that the “gloves are off” and they'd like to find ways to converge with the WS-* standards.

That simple, recent analogy shows us where we are with user-centric identity. We're on the verge of the market beginning to really adopt some technology. These conversations don't reach this level unless those involved see this potential.

In the meantime, the engineers will continue to debate the details, and that's good for all of us.

I want people to understand I'm not against OpenID, and I don't see this as something that should turn into a war, marketing or other. We should do everything we can to make OpenID as secure as possible, and that includes integrating it with InfoCards wherever this is possible.

Superpat and the third way

Pat Patterson leaps through the firmament to punctuate my recent discussion of minimal disclosure with this gotcha:

But, but, but… how does the relying party know not to ask for givenname, surname and emailaddress the second (and subsequent) time round? It doesn't know that it's already collected those claims for that user, since it doesn't know who the user is yet…

In the case described by Pat, the site really does use a “registration” model like the one from BestBuy shown here.

When registering you hand over your identity information, and subsequently you only “authenticate”.

This is really the current model for how identity is handled by most web sites. In other words the “Registration process” is completely separated from the “Returning user” process.

So the obvious answer to Pat's question is that when you press “create an account” above, you invoke an object tag that asks for the four attributes discussed earlier. And if you press “Sign in”, you invoke an object tag that only asks for PPID and then associates with your stored information.
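The split between the two buttons can be sketched as follows. This Python fragment is a rough illustration, assuming the four attributes are given name, surname, email address and PPID; the short claim names, the in-memory `accounts` store and the `handle_token` function are hypothetical, and token validation is omitted entirely.

```python
# Claims requested by each button (short names; the real claim URIs are longer).
CLAIMS_BY_ACTION = {
    "create_account": ["givenname", "surname", "emailaddress", "privatepersonalidentifier"],
    "sign_in": ["privatepersonalidentifier"],
}

# Hypothetical in-memory account store: PPID -> attributes kept at registration.
accounts = {}


def handle_token(action, claims):
    """Process the claims delivered after the user pressed one of the two buttons."""
    requested = CLAIMS_BY_ACTION[action]
    missing = [name for name in requested if name not in claims]
    if missing:
        raise ValueError(f"token is missing claims: {missing}")
    ppid = claims["privatepersonalidentifier"]
    if action == "create_account":
        accounts[ppid] = {
            name: claims[name] for name in requested if name != "privatepersonalidentifier"
        }
    # For "sign_in" the PPID alone finds the stored record (None if unknown).
    return accounts.get(ppid)


registered = handle_token("create_account", {
    "givenname": "Alice",
    "surname": "Example",
    "emailaddress": "alice@example.com",
    "privatepersonalidentifier": "ppid-for-this-site",
})
returning = handle_token("sign_in", {"privatepersonalidentifier": "ppid-for-this-site"})
print(returning == registered)
```

The returning user discloses nothing but the PPID, which is exactly why the site does not need to know in advance who is about to sign in.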

In other words, there is no new problem and no new framework is required.

This doesn't prevent Pat from serving up a little irony:

If only there were some specification (perhaps part of some sort of framework) that, given a token from an authentication, allowed you to get the data you needed, subject, of course, to the user's permission.

I guess it bothered Pat that I didn't include use of backend protocols as one of the options for reducing disclosure.

I want to set this right. I've said since the beginning that as I saw it, the PPID (or other authenticated identifier) delivered by an InfoCard could also be used to animate a back-end protocol such as he's referring to. That's one of the reasons I thought everyone should be able to rally behind these proposals.

The third option

So let me add a third alternative to the two I gave yesterday (storing locally or asking the user to resubmit through infocard). The relying party could authenticate the user using InfoCard and then contact the identity provider with the user's PPID and ask it for the information the user has already agreed should be released to it. This could be done using the protocols referred to by Pat.
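A toy model of this third alternative, in Python, might look like the sketch below. It does not implement any of the real back-end protocols; the `IdentityProvider` class, its methods and the sample data are all hypothetical, and the only point illustrated is that the IP filters what it returns by what the user has approved for that relying party.

```python
class IdentityProvider:
    """Hypothetical IP that releases only attributes the user has approved."""

    def __init__(self):
        self._attributes = {}  # (relying_party, ppid) -> attributes held
        self._consent = {}     # (relying_party, ppid) -> approved attribute names

    def enroll(self, relying_party, ppid, attributes, approved):
        """Record the user's attributes and what this relying party may see."""
        self._attributes[(relying_party, ppid)] = dict(attributes)
        self._consent[(relying_party, ppid)] = set(approved)

    def fetch(self, relying_party, ppid):
        """Back-end call made by the relying party after authentication."""
        key = (relying_party, ppid)
        approved = self._consent.get(key, set())
        held = self._attributes.get(key, {})
        return {name: value for name, value in held.items() if name in approved}


idp = IdentityProvider()
idp.enroll(
    "shop.example",
    "ppid-123",
    {"givenname": "Alice", "surname": "Example", "emailaddress": "alice@example.com"},
    approved={"givenname", "emailaddress"},
)
print(idp.fetch("shop.example", "ppid-123"))
```

Because the PPID is pairwise, the lookup key includes the relying party; a different site presenting the same string gets nothing back.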

My uberpoint is simple. InfoCards are intended to be as neutral as possible in their technical assumptions (e.g. to be an identity platform) and can be used in many ways that make sense in different environments and use cases.

I don't personally agree that the back-end protocol route for obtaining attributes is either simpler or more secure than delivering the claims directly on an as-needed basis in the authentication token, but it is certainly possible and I'm sure it has its use cases. I wonder if Pat's implementation of Information Cards, should there be one, will take this approach? Interesting.

Resending of personal data with InfoCards

Eric Schultz writes with this question:

I've been investigating CardSpace and the practicality of its use for login on a new social networking site.

I have a question regarding the method through which data is transferred. I see that you can require certain claims from an InfoCard such as email, first and last name, zip code etc. When I look at the login code I see that the same claims are required again.

Does this mean that each time an InfoCard is sent all the personal data is resent? Isn't this dangerous for security/privacy? The potential for a server failure (malicious or not) caused by a buffer overflow, a coding mistake that outputs the details of session variables etc. seems rather risky in this scenario.

Perhaps I am being alarmist?

This is an area in which being “an alarmist” – perhaps I will rephrase it as being thoroughly pessimistic about what can go wrong – is the best starting point. Your questions are ones everyone should think about.

InfoCard and Minimal Disclosure

The simple answer is that there is nothing built into InfoCard concepts that requires a “relying party” to ask for attributes every time a user comes to its site. Let's first look at the mechanics.

The relying party controls what attributes it asks for by putting an OBJECT tag in the HTML page where the user opts to use an infocard.

The example shown here will bring up the infocard dialog and illuminate any cards that offer all four claims so the user can select one.
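The original example is not reproduced on this page, so the following Python sketch only generates a tag of the general kind described. It assumes the `application/x-informationcard` object type, a `requiredClaims` parameter and the `http://schemas.xmlsoap.org/ws/2005/05/identity/claims/` namespace used for self-issued card claims; the function name and the choice of four claims are illustrative.

```python
from html import escape

# Namespace assumed for the standard self-issued card claims.
CLAIM_NAMESPACE = "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/"


def information_card_object_tag(claim_names, name="xmlToken"):
    """Return an OBJECT tag asking the identity selector for the given claims."""
    required = " ".join(CLAIM_NAMESPACE + claim for claim in claim_names)
    return (
        f'<object type="application/x-informationcard" name="{escape(name)}">\n'
        f'  <param name="requiredClaims" value="{escape(required)}">\n'
        "</object>"
    )


# Tag for a page that asks for all four claims.
full_tag = information_card_object_tag(
    ["givenname", "surname", "emailaddress", "privatepersonalidentifier"]
)
# Tag for a page that asks only for the pairwise identifier.
ppid_only_tag = information_card_object_tag(["privatepersonalidentifier"])

print(full_tag)
```

Asking for less is simply a matter of emitting a shorter claim list, as `ppid_only_tag` shows.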

If, next time, the relying party doesn't want to receive these claims, it just doesn't ask for them. If it has stored them, it should be able to retrieve them when necessary by using “privatepersonalidentifier” as a handle. This identifier is just a random pairwise number meaningless to any other site, and so there is no identity risk in using it.

No theoretical bias

In other words, the InfoCard system has no theoretical bias about what information should be asked for when. Through the Laws of Identity we have tried to help people understand that they should only ask for what they need to complete a transaction and should only keep it for the length of time they absolutely must.

In particular, there should be no hoarding of rainy-day information – information that “might come in handy” some day – but which is more likely to turn into a liability than into a benefit.

Do your risk analysis

You'll need to do the conventional risk analysis and think about whether it is more dangerous to store the information or just ask for it on an “as-needed” basis and then forget it. My personal sense is that it is more dangerous to store it than to use an on-demand approach.

A central machine with the stored information that animates a successful internet business is a honeypot. It could well be subject to insider attacks, and certainly, since it lives on the internet, will be subject to many attacks on the information it stores. Why not avoid these problems completely?

Certainly, the on-demand approach has benefits in convincing customers and legal practitioners that, having held no identity information, you cannot be seen as being responsible for an identity meltdown. To me this is very attractive, and something that has not been possible until now.

Conclusion

The examples Eric gives of things that can go wrong seem to me to apply even more strongly if you have stored information locally than if you ask for it on demand.

But as I said earlier, this just expresses my thinking – there is lots more to be written by Eric and hundreds of others as they develop applications.

Meanwhile, InfoCard has no built-in assumptions around this and can be used in whatever way is appropriate to a given situation.