Students enlist readers’ assistance in CardSpace “breach”

Students at Ruhr-Universität Bochum in Germany have published an account this week describing an attack on the use of CardSpace within Internet Explorer.  They claim to “confirm the practicability of the attack by presenting a proof of concept implementation”.

I’ve spent a fair amount of time reproducing and analyzing the attack.  The students were not actually able to compromise my safety except by asking me to go through elaborate measures to poison my own computer (I show how complicated this is in a video I will post next).  For the attack to succeed, the user has to bring full administrative power to bear against her own system.  It seems obvious that if people go to the trouble to manually circumvent all their defenses they become vulnerable to the attacks those defenses were intended to resist.  In my view, the students did not compromise CardSpace.

DNS must be undermined through a separate (unspecified) attack

To succeed, the students first require a compromise of a computer’s Domain Name System (DNS).  They ask their readers to reconfigure their computers and point to an evil DNS site they have constructed.  Once we help them out with this, they attempt to exploit the fact that poisoned DNS allows a rogue site and a legitimate site to appear to have the same internet “domain name” (e.g. www.goodsite.com).  Code in browser frames animated by one domain can interact with code from other frames animated by the same domain.  So once DNS is compromised, code supplied by the rogue site can interfere with the code supplied by the legitimate site.  The students want to use this capability to hijack the legitimate site’s CardSpace token.

However, the potential problems of DNS are well understood.  Computers protect themselves from attacks of this kind by using cryptographic certificates that guarantee a given site REALLY DOES legitimately own a DNS name.  Use of certificates prevents the kind of attack proposed by the students.
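
To make the defense concrete, here is a minimal Python sketch of the kind of check every modern TLS client performs.  The hostname is just the illustrative one from above, and nothing here is specific to CardSpace:

```python
import socket
import ssl

def verify_site(hostname: str, port: int = 443) -> None:
    """Open a TLS connection and let certificate validation do its work."""
    context = ssl.create_default_context()  # loads the trusted root certificates
    with socket.create_connection((hostname, port)) as sock:
        # If poisoned DNS sends us to a rogue server, this handshake fails:
        # the rogue server cannot present a certificate for `hostname`
        # that chains up to a root in the trusted store.
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            print("Certificate verified for", hostname)

# verify_site("www.goodsite.com")  # raises ssl.SSLCertVerificationError if spoofed
```

A rogue server reached through poisoned DNS cannot complete this handshake unless the certificate store has also been subverted – which is exactly why the students need the second compromise described below.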

The certificate store must also “somehow be compromised”

But this is no problem as far as the students are concerned.  They simply ask us to TURN OFF this defense as well.  In other words, we have to assist them by poisoning all of the safeguards that have been put in place to thwart their attack.  

Note that both safeguards need to be compromised at the same time.  Could such a compromise occur in the wild?  It is theoretically possible that through a rootkit or equivalent, an attacker could completely take over the user’s computer.  However, if this is the case, the attacker can control the web browser, see and alter everything on the user’s screen and on the computer as a whole, so there is no need to obtain the CardSpace token.

I think it is amazing that the Ruhr students describe their attack as successful when it does NOT provide a method for compromising EITHER DNS or the certificate store.  They say DNS might be taken over through a drive-by attack on a badly installed wireless home network.  But they provide no indication of how to simultaneously compromise the Root Certificate Store. 

In summary, the students’ attack is theoretical.  They have not demonstrated the simultaneous compromise of the systems necessary for the attack to succeed.

The user experience

Because of the difficulty of compromising the root certificate store, let’s look at what would happen if only DNS were attacked.

Internet Explorer does a good job of informing the user that she is in danger and of advising her not to proceed. 

First the user encounters the following screen, and has to select “Continue to the website (not recommended)”:

[Screenshot: Internet Explorer certificate warning page with the “Continue to the website (not recommended)” option]
If recalcitrant, the user next sees an ominous red band warning within the address bar and an unnaturally long delay:

The combined attacks require a malware delivery mechanism that is separate from, yet coordinated with, the visit to the phishing site.  In other words, having to accomplish two or more attacks simultaneously greatly reduces the likelihood of success.

The students’ paper proposes adding a false root certificate that will suppress the Internet Explorer warnings.  As is shown in the video, this requires clearing an even higher bar.  The user must be tricked into importing a “root certificate”.  By default this doesn’t work – the system protects the user once again by installing the false certificate in a store that will not deceive the browser.  Altering this behavior requires a complex manual override.

However, should all the planets involved in the attack align, the contents of the token are never visible to the attacker.  They are encrypted for the legitimate party, and no personally identifying information is disclosed by the system.  This is not made clear by the students’ paper.
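
As a rough illustration of why an intercepted token is useless, here is a hedged Python sketch using the cryptography package.  It is not CardSpace’s actual token format – just the underlying principle of encrypting claims to the legitimate relying party’s public key:

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# Hypothetical relying party key pair; in practice the public key would
# come from the relying party's certificate.
rp_private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
rp_public_key = rp_private_key.public_key()

oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

token = b"claims intended only for the legitimate relying party"

# An attacker who hijacks the ciphertext learns nothing about its contents.
ciphertext = rp_public_key.encrypt(token, oaep)

# Only the holder of the private key can recover the claims.
assert rp_private_key.decrypt(ciphertext, oaep) == token
```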

What the attempt proves 

The demonstration shows that if you are willing to compromise enough parts of your system using elevated access, you can render your system attackable.  This aspect of the students’ attack is not noteworthy. 

There is, however, one interesting aspect to their attack.  It doesn’t concern CardSpace, but rather the way intermittent web site behavior can be combined with DNS to confuse the browser.  The students’ paper proposes implementing a stronger “Same Origin Policy” to deal with this (and other) possible attacks.  I wish they had concentrated on this positive contribution rather than making claims that require suspension of disbelief. 

The students propose a mechanism for associating Information Card tokens with a given SSL channel.   This idea would likely harden Information Card systems and is worth evaluating.

However, the students propose equipping browsers with end user certificates so the browsers would be authenticated, rather than the sites they are visiting.  This represents a significant privacy problem in that a single tracking key would be used at all the sites the user visits.  It also doesn’t solve the problem of knowing whether I am at a “good” site or not.  The problem here is that if duped, I might provide an illegitimate site with information which seriously damages me.

One of the most important observations that must be made is that security isn’t binary – there is no simple dichotomy between vulnerable and not-vulnerable.  Security derives from concentric circles of defense that act cumulatively and in such a way as to reinforce one another.  The title of the students’ report misses this essential point.  We need to design our systems in light of the fact that any system is breachable.  That’s what we’ve attempted to do with CardSpace.  And that’s why there is an entire array of defenses which act together to provide a substantial and practical barrier against the kind of attack the students have attempted to achieve.

More on distributed query

Dave Kearns responded to my post on the Identity Bus with Getting More Violent All the Time (note to the Rating Board: he's talking about violent agreement… which is really rough):

What Kim fails to note… is that a well designed virtual directory (see Radiant Logic's offering, for example) will allow you to do a SQL query to the virtual tables! You get the best of both: up to date data (today's new hires and purchases included) with the speed of an SQL join. And all without having to replicate or synchronize the data. I'm happy, the application is happy – and Kim should be happy too. We are in violent agreement about what the process should look like at the 40,000 foot level and only disagree about the size and shape of the paths – or, more likely, whether they should be concrete or asphalt.

Neil Macehiter answers by making an important distinction that I didn't emphasize enough:

But the issue is not with the language you use to perform the query: it's where the data is located. If you have data in separate physical databases then it's necessary to pull the data from the separate sources and join them locally. So, in Kim's example, if you have 5000 employees and have sold 10000 computers then you need to pull down the 15000 records over the network and perform the join locally (unless you have an incredibly smart distributed query optimiser which works across heterogeneous data stores). This is going to be more expensive than if the computer order and employee data are colocated.

Clayton Donley, who is the Senior Director of Development for Oracle Identity Management, understands exactly what I'm trying to get at and puts it well in this piece:

Dave Kearns has followed up on Kim Cameron's posting from Friday.

  1. Kim says that sometimes you need to copy data in order to join it with other data
  2. Dave says the same thing, except indicates that you wouldn't copy the data but just use “certain virtual directory functionality”

Actually, in #2, that functionality would likely be persistent cache, which if you look under the covers is exactly the same as a meta-directory in that it will copy data locally. In fact, the data may even be stored (again!) in a relational database (SQLServer in the Radiant Logic example he provides).

Let's use laser focus and only look at Kim's example of joining purchase orders with user identity.

Let's face it. Most applications aren't designed to go to one database when you're dealing solely with transactional data and another database when you're dealing with a combination of transactional data and identities.

If we model this through the virtual directory and indicate that every time an application joins purchase orders and identities that it does so (even via SQL instead of LDAP) through the virtual directory, you've now said the following:

  1. You're okay with re-modelling all of these data relationships in a virtual directory — even those representing purchase order information.
  2. You're okay with moving a lot of identity AND transactional information into a virtual directory's local database.
  3. You're okay with making this environment scalable and available for those applications.

Unfortunately, this doesn't really hold up. There are a lot more issues, but even after just these first three (or even the first one) you begin to realize that while virtual directory makes sense for identity, it may not make sense as the ONLY way to get identity. I think the same thing goes for an identity hub that ONLY thinks in terms of virtualization.

The real solution here is a combination of virtualization with more standardized publish/subscribe for delivery of changes. This gets us away from this ad-hoc change discovery that makes meta-directories miserable, while ensuring that the data gets where it needs to go for transactions within an application.

I discourage people from thinking that metadirectory implies “ad-hoc change discovery”.  That's a defect of various metadirectory implementations, not a characteristic of the technology or architecture.  As soon as applications understand they are PART OF a wider distributed fabric, they could propagate changes using a publication pattern that retains the closed-loop verification of self-converging metadirectory.  
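
Here is a minimal sketch of what I mean, in Python with hypothetical class and attribute names: the authoritative source publishes each change, and every subscriber reports back the version it now holds, so the publisher can verify that the fabric has converged instead of relying on ad-hoc change discovery.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class ChangeEvent:
    object_id: str
    attribute: str
    value: str
    version: int   # lets subscribers detect missed or out-of-order changes

@dataclass
class AttributePublisher:
    """Authoritative source that publishes changes rather than waiting to be polled."""
    subscribers: List[Callable[[ChangeEvent], int]] = field(default_factory=list)

    def subscribe(self, handler: Callable[[ChangeEvent], int]) -> None:
        self.subscribers.append(handler)

    def publish(self, event: ChangeEvent) -> None:
        for handler in self.subscribers:
            applied_version = handler(event)
            # Closed-loop verification: each subscriber reports the version it
            # now holds, so divergence is detected immediately.
            assert applied_version == event.version, "subscriber out of sync"

class DirectoryReplica:
    """A connected application keeping a local, query-optimized copy."""
    def __init__(self) -> None:
        self.store: Dict[str, Dict[str, str]] = {}

    def apply(self, event: ChangeEvent) -> int:
        self.store.setdefault(event.object_id, {})[event.attribute] = event.value
        return event.version

hr = AttributePublisher()
replica = DirectoryReplica()
hr.subscribe(replica.apply)
hr.publish(ChangeEvent("employee:42", "department", "Finance", version=7))
```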

Internet as extension of mind

Ryan Janssen at drstarcat.com  published an interview recently that led me to think back over the various phases of my work on identity.  I generally fear boring people with the details, but Ryan explored some things that are very important to me, and I appreciate it. 

After talking about some of the identity problems of the enterprise, he puts forward a description of metadirectory that I found interesting because it starts from current concepts like claims rather than the vocabulary of X.500: 

…. Kim and the ZOOMIT team came up with the concept of a “metadirectory”. Metadirectory software essentially tries to find correlation handles (like a name or email) across the many heterogeneous software environments in an enterprise, so network admins can determine who has access to what. Once this is done, it then takes the heterogeneous claims and transforms them into a kind of claim the metadirectory can understand. The network admin can then use the metadirectory to assign and remove access from a single place. 

Zoomit released their commercial metadirectory software (called “VIA”) in 1996 and proceeded to clean the clock of larger competitors like IBM for the next few years until Microsoft acquired the company in the summer of 1999. Now anyone who is currently involved in the modern identity movement and the issues of “data portability” that surround it has to be feeling a sense of deja vu because these are EXACTLY the same problems that we are now trying to solve on the internet—only THIS time we are trying to take control of our OWN claims that are spread across innumerable heterogeneous systems that have no way to communicate with each other. Kim’s been working on this problem for SIXTEEN years—take note!

Yikes.  Time flies when you're having fun.

When I asked Kim what his single biggest realization about Identity in the 16 years since he started working on it was, he was slow to answer, but definitive when he did—privacy. You see, Kim is a philosopher as well as a technologist. He sees information technology (and the Internet in particular) as a social extension of the human mind. He also understands that the decisions we make as technologists have unintended as well as intended consequences. Now creating technology that enables a network administrator to understand who we are across all of a company’s systems is one thing, but creating technology that allows someone to understand who we are across the internet, particularly as more and more of who we are as humans is stored there, and particularly if that someone isn’t US or someone we WANT to have that complete view, is an entirely other problem.

Kim has consistently been one the strongest advocates for obscuring ANY correlation handles that would allow ANY Identity Provider or Relying Party to have a more complete view of us than we explicitly give them. Some have criticized his concerns as overly cautious in a world where “privacy is dead”. When you think of your virtual self as an extension of your personal self though, and you realize that the line between the two is becoming increasingly obscured, you realize that if we lose privacy on the internet, we, in a very real sense, lose something that is essentially human. I’m not talking about the ability to hide our pasts or to pretend to be something we’re not (though we certainly will lose that). What we lose is that private space that makes each of us unique. It’s the space where we create. It’s the space that continues to ensure that we don’t all collapse into one.

Yes, it is the space on which and through which Civilization has been built.

Out-manned and out-gunned

Jeff Bohren draws our attention to this article on Cyber Offence research being done by the US Air Force Cyber Command (AFCYBER).  The article says:

…Williamson makes a pretty decent case for the military botnet; his points are especially strong when he describes the inevitable failure of a purely defensive posture. Williamson argues that, like every fortress down through history that has eventually fallen to a determined invader, America’s cyber defenses can never be strong enough to ward off all attacks.

And here, Williamson is on solid infosec ground – it’s a truism in security circles that any electronic “fortress” that you build, whether it’s intended to protect media files from unauthorized viewers or financial data from thieves, can eventually be breached with enough collective effort.

Given that cyber defenses are doomed to failure, Williamson argues that we need a credible cyber offensive capability to act as a deterrent against foreign attackers. I have a hard time disagreeing with this, but I’m still very uncomfortable with it, partly because it involves using civilian infrastructure for military ends…

Jeff then comments:

The idea (as I understand it) is to use military owned computers to launch a botnet attack as a retaliation against an attack by an enemy.

In this field of battle I fear the AFCYBER is both out-manned and out-gunned. The AF are the go-to guys if you absolutely, positively need something blown up tomorrow. But a DDoS attack? Without compromising civilian hardware, the AF likely couldn’t muster enough machines. Additionally the network locations of the machines they could muster could be easily predicted before the start of any cyber war.

There is an interesting alternative if anyone from AFCYBER is reading this. How about a volunteer botnet force? Civilians could volunteer to download an application that would allow their computer to be used in an AFCYBER controlled botnet in time of a cyber war. Obviously securing this so that it couldn’t be hijacked is a formidable technical challenge, but it’s not insurmountable.

If the reason for having a botnet is because we should assume every system can be compromised, don't we HAVE TO assume the botnet can be compromised too?  Once we say “the problem is not insurmountable” we have turned our back on the presuppositions that led to the botnet in the first place.  

Talking about the Identity Bus

During the Second European Identity Conference, Kuppinger-Cole did a number of interviews with conference speakers. You can see these on the Kuppingercole channel at YouTube.

Dave Kearns, Jackson Shaw, Dale Olds and I had a good old time talking with Felix Gaehtgens about the “identity bus”.  I had a real “aha” during the interview while I was talking with Dave about why synchronization and replication are an important part of the bus.  I realized part of the disconnect we've been having derives from the differing “big problems” each of us find ourselves confronted with.

As infrastructure people one of our main goals is to get over our “information chaos” headaches…  These have become even worse as the requirements of audit and compliance have matured.  Storing information in one authoritative place (and one only) seems to be a way to get around these problems.  We can then retrieve the information through web service queries and drastically reduce complexity…

What does this worldview make of application developers who don't want to make their queries across the network?   Well, there must be something wrong with them…  They aren't hip to good computing practices…  Eventually they will understand the error of their ways and “come around”…

But the truth is that the world of query looks different from the point of view of an application developer. 

Let's suppose an application wants to know the name corresponding to an email address.  It can issue a query to a remote web service or LDAP directory and get an answer back immediately.  All is well and accords with our ideal view.
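
For instance, here is a hedged sketch of that lookup using the ldap3 package, with a hypothetical directory host and base DN:

```python
from ldap3 import ALL, Connection, Server

def name_for_email(email: str) -> str | None:
    """Return the display name for a mail address, or None if not found."""
    server = Server("ldap.example.com", get_info=ALL)   # hypothetical directory host
    with Connection(server, auto_bind=True) as conn:    # anonymous bind for simplicity
        conn.search(
            search_base="dc=example,dc=com",            # hypothetical base DN
            search_filter=f"(mail={email})",
            attributes=["displayName"],
        )
        return str(conn.entries[0].displayName) if conn.entries else None
```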

But the questions application developers want to answer aren't always of the simple “do a remote search in one place” variety.

Sometimes an application needs to do complex searches involving information “mastered” in multiple locations.   I'll make up a very simple “two location” example to demonstrate the issue:  

“What purchases of computers were made by employees who have been at the company for less than two years?”

Here we have to query “all the purchases of computers” from the purchasing system, and “all employees hired within the last two years” from the HR system, and find the intersection.

Although the intersection might only represent a few records,  performing this query remotely and bringing down each result set is very expensive.   No doubt many computers have been purchased in a large company, and a lot of people are likely to have been hired in the last two years.  If an application has to perform this type of  query with great efficiency and within a controlled response time,  the remote query approach of retrieving all the information from many systems and working out the intersection may be totally impractical.   
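
To make the cost visible, here is a hedged sketch of the remote approach, with hypothetical web-service endpoints standing in for the purchasing and HR systems – both complete result sets cross the network before the intersection is computed locally:

```python
import requests

# Hypothetical endpoints exposed by the purchasing and HR systems.
PURCHASING_URL = "https://purchasing.example.com/api/orders?item=computer"
HR_URL = "https://hr.example.com/api/employees?hired_within_years=2"

def recent_hires_who_bought_computers() -> list[dict]:
    # Pull down EVERY computer purchase and EVERY recent hire...
    purchases = requests.get(PURCHASING_URL, timeout=30).json()   # possibly tens of thousands of rows
    recent_hires = requests.get(HR_URL, timeout=30).json()        # thousands more
    # ...then intersect locally, discarding most of what was transferred.
    hire_ids = {employee["employee_id"] for employee in recent_hires}
    return [p for p in purchases if p["employee_id"] in hire_ids]
```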

Compare this to what happens if all the information necessary to respond to a query is present locally in a single database.  I just do a “join” across the tables, and the SQL engine understands exactly how to optimize the query so the result involves little computing power and “even less time”.  Indexes are used and distributions of values well understood: many thousands of really smart people have been working on these optimizations in many companies for the last 40 years.
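
Here is the same query as a local join, sketched with SQLite and illustrative table and column names – the engine uses its indexes and only the small intersection ever materializes:

```python
import sqlite3

# A local database kept current through synchronization and business rules.
conn = sqlite3.connect("local_copy.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS employees (employee_id INTEGER PRIMARY KEY, hire_date TEXT);
CREATE TABLE IF NOT EXISTS purchases (purchase_id INTEGER PRIMARY KEY,
                                      employee_id INTEGER, item TEXT, purchased_on TEXT);
CREATE INDEX IF NOT EXISTS idx_purchases_employee ON purchases(employee_id);
""")

# The optimizer chooses the join order and uses the indexes; only the
# matching rows are ever produced.
rows = conn.execute("""
    SELECT p.purchase_id, e.employee_id, p.purchased_on
      FROM purchases AS p
      JOIN employees AS e ON e.employee_id = p.employee_id
     WHERE p.item = 'computer'
       AND e.hire_date >= date('now', '-2 years')
""").fetchall()
```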

So, to summarize, distributed databases (or queries done through distributed services) are not appropriate for all purposes. Doing certain queries in a distributed fashion works, while in other cases it leads to unacceptable performance.

The result is that many application developers “don't want to go there” – at least some of the time.  Yet their applications must be part of the identity fabric.  That is why the identity metasystem has to include application databases populated through synchronization and business rules.

On another note, I recommend the interview with Dave Kearns on the importance of context to identity. 

Satisfaction Guaranteed?

Francois Paget, an investigator at McAfee Avert Labs, has posted a detailed report on a site that gives us insight into the emerging international market for identity information.   He writes:

Last Friday morning in France, my investigations led me to visit a site proposing top-quality data for a higher price than usual. But when we look at this data we understand that as everywhere, you have to pay for quality. The first offer concerned bank logons. As you can see in the following screenshot, pricing depends on available balance, bank organization and country. Additional information such as PIN and Transfer Passphrase are also given when necessary:

[Screenshot: bank logon price list, varying by available balance, bank and country]

For such prices, the seller offers some guarantees. For example, the purchase is covered by replacement if you are unable – within 24 hours – to log into the account using the provided details.

The selling site also proposes US, Austrian and Spanish credit cards with full information…

It is also possible to purchase skimmers (for ATM machines) and “dump tracks” to create fake credit cards. Here too, the cost is in line with the quality:

[Screenshot: skimmer and dump track price list]

Many other offers are available like shop administrative area accesses (back end of an online store where all the customer details are stored – from Name, SSN, DOB, Address, Phone number to CC) or UK or Swiss Passport information:

[Screenshot: offers for shop administrative area access and UK or Swiss passport information]

Read the rest of Francois’ story here.  Beyond that, it's well worth keeping up with the Avert Labs blog, where every post reminds us that the future of the Internet depends on fundamentally increasing its security and privacy.   [Note:  I slightly condensed Francois’ graphics…]

Fingerprint charade

I got a new Toshiba Portege a few weeks ago, the first machine I've owned that came with a fingerprint sensor.   At first the system seemed to have been designed in a sensible way.  The fingerprint template is encrypted and stays local.  It is never released or stored in a remote database.  I decided to try it out – to experience what it “felt like”.

A couple of days later, I was at a conference and on stage under pretty bright lights.  Glancing down at my shiny new computer, I saw what looked unmistakably like a fingerprint on my laptop's right mouse button.  Then it occurred to me that the fingerprint sensor was only a quarter of an inch from what seemed to be a perfect image of my fingerprint.  How secure is that?

A while later I ran into  Dale Olds from Novell.  Since Dale's an amazing photographer, I asked if he would photograph the laptop to see if the fingerprint was actually usable.  Within a few seconds he took the picture above. 

When Dale actually sent me the photo, he said,

I have attached a slightly edited version of the photo that showed your fingerprint most clearly. In fact, it is so clear I am wondering whether you want to publish it. The original photos were in Olympus raw format. Please let me know if this version works for you.

Eee Gads.  I opened up the photo in Paint and saw something along these lines:

The gold blotch wasn't actually there.  I added it as a kind of fig-leaf before posting it here, since it covers the very clearest part of the fingerprint. 

The net of all of this was to drive home, yet again, just how silly it is to use a “public” secret as a proof of identity.  The fact that I can somehow “demonstrate knowledge” of a given fingerprint means nothing.  Identification is only possible by physically verifying that my finger embodies the fingerprint.  Without physical verification, what kind of a lock does the fingerprint reader provide?  A lock which conveniently offers every thief the key.

At first my mind boggled at the fact that Toshiba would supply mouse buttons that were such excellent fingerprint collection devices.  But then I realized that even if the fingerprint weren't conveniently stored on the mouse button, it would be easy to find it somewhere on the laptop's surface.

It hit me that in the age of digital photography, a properly motivated photographer could probably find fingerprints on all kinds of surfaces, and capture them as expertly as Dale did.  I realized it was no longer necessary to use special powder or inks or tape or whatever.  Fingerprints have become a thing of “sousveillance”.