June 2007 – Page 3 – Kim Cameron's Identity Weblog

Evolving technology for better privacy

Let's continue to explore linking, and how it relates to CardSpace, identity protocols, token formats and cryptography.

I've summarized a number of thoughts in the following diagram, which contrasts the linking threats posed by a number of technology combinations. The diagram presents these technologies on an ordinal scale ranging from the most dangerous to the least – along the axis of linkage prevention.

X.509 with OCSP

Let's begin with Public Key Infrastructure (PKI) technology employed with X.509 user certificates.

Here the user has a key only she can “exercise”, and some Certificate Authority (CA) mints a long-lived certificate binding her key to her name, organization, country and the like. When the user visits a relying party who trusts the CA, she presents the certificate and exercises the key – typically by using it to sign a challenge created by the relying party.

In many cases, in addition to binding the key to attributes, the CA exists with the explicit mission of linking it to a “natural person” (in the sense of an identifiable person in the physical world). However, for now we'll leave the meaning of assertions aside and look only at how the technology itself impacts privacy.

Since the user presents the same long-lived certificate to every relying party who trusts the CA, the certificate and the information in it link the user across sessions on one web site, and between one web site and another. Any two web sites obtaining this kind of certificate can compare notes and determine that the same user has visited each of them. This allows linkage of their profiles into a super-dossier (possibly including a super-dossier of a natural person).
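To make the "compare notes" step concrete, here is a minimal sketch of how two relying parties could join their profiles once they both hold the same certificate. Everything in it is an assumption made for illustration: the certificate bytes are stand-ins rather than real DER, and the site names, profile fields and the `link_profiles` helper are hypothetical.

```python
import hashlib

def fingerprint(cert_bytes: bytes) -> str:
    """SHA-256 digest of the presented certificate bytes, as hex."""
    return hashlib.sha256(cert_bytes).hexdigest()

# Stand-in bytes for two users' long-lived certificates (hypothetical, not real DER).
alice_cert = b"CN=Alice Example, O=Example Corp, C=CA | public-key-bytes"
bob_cert = b"CN=Bob Example, O=Example Corp, C=CA | other-public-key-bytes"

# Each site keeps its own profile store, keyed by certificate fingerprint.
bookstore = {
    fingerprint(alice_cert): {"purchases": ["privacy law text"]},
    fingerprint(bob_cert): {"purchases": ["cookbook"]},
}
pharmacy = {
    fingerprint(alice_cert): {"prescriptions": ["allergy medication"]},
}

def link_profiles(site_a: dict, site_b: dict) -> dict:
    """Merge the profiles of every certificate seen at both sites."""
    shared = site_a.keys() & site_b.keys()
    return {fp: {**site_a[fp], **site_b[fp]} for fp in shared}

# The same certificate at both sites is all that is needed to build a joint dossier.
dossier = link_profiles(bookstore, pharmacy)
```

Nothing in the join depends on the attributes inside the certificate; the repeated certificate alone is a sufficient correlation handle.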

What is good about X.509 is that if a relying party does not collude, the CA has no visibility onto the fact that a given user has visited it (we will see that in some other systems such visibility is unavoidable). But a relying party could at any point decide to collude with the CA (assuming the CA actually accepts such information, which may be a breach of policy). This might result in the transfer of information in either direction beyond that contained in the certificate itself.

So in the diagram, I express this through two risks of collusion. The first is between any two relying parties who receive the same certificate. The second is between any relying party and the certificate authority. In essence, then, all participating parties can collude with any other party, so this represents one of the worst possible technology alternatives if privacy is your goal.

In light of this it makes sense that X.509 has been successful as a technology for public entities like corporate web sites, where correlation is actually a good thing, but not for individual identification where privacy is part of the equation.

(Continues tomorrow…).

News on the Australian “Access Card”

Here is a report from The Australian about the issues surrounding Australia's Human Services Access Card. Some of the key points:

“By this time next year, the federal Government hopes to be interviewing and photographing 35,000 Australians each day to create the nation's first ID databank. Biometric photos, matched with names, addresses, dates of birth, signatures, sex, social security status and children's details, would be loaded into a new centralised database. Welfare bureaucrats, ASIO, the Australian Federal Police and possibly even the Australian Taxation Office would have some form of access to the unprecedented collection of identity data.

“Within three years, all Australians seeking benefits such as Medicare, pensions, childcare subsidies, family payments, unemployment or disability allowances – about 16.5 million people – would have joined the databank. They would be given a photographic access card to prove who they are and show their eligibility for social security.

“This week, however, the billion-dollar project hit a bump when Human Services Minister Chris Ellison revealed that legislation due to go before federal Parliament this month had been delayed…

“How will Australians’ privacy be protected? How will the database and cards be kept secure? Who can see information on the card? What identity documents will Australians need to acquire a card, and what will happen to the estimated 600,000 people without a birth certificate, passport or driver's licence?

“The Government's mantra is that this is not an ID card because it does not have to be carried, but users will have to show it to prove their identity when claiming welfare benefits…

“The Government claims the new system will stem between $1.6 billion and $3 billion in welfare fraud over the next decade…

“A key Government adviser, Allan Fels – a former chairman of the Australian Competition and Consumer Commission and now head of the Government's Access Card Consumer and Privacy Taskforce – is at loggerheads with Medicare, Centrelink and the AFP, who all want the new card to display the user's identification number, photograph and signature…

“The photo would be stored in a central database, as well as in a microchip that could be read by 50,000 terminals in government offices, doctors’ surgeries and pharmacies…

“Despite his official role as the citizens’ watchdog, Fels still has not seen the draft bill…

“‘The law should be specific about what is on the card, in the chip and in the database,’ he says. ‘If anyone in future wants to change that they would have to do nothing less than get an act of parliament through. We don't want a situation where, just by administrative decisions, changes can be made…’

“‘There will be no mega-database created that will record a customer's dealings with different agencies,’ the minister [Ellison] told the conference…

“Cardholders may be able to include sensitive personal information – such as their blood type, emergency contacts, allergies or illnesses such as AIDS or epilepsy – in the one-third of the microchip space that will be reserved for personal use. It is not yet clear who would have access to this private zone.

“Hansard transcripts of Senate committee hearings into the access card legislation reveal that police, spies and perhaps even the taxman will be able to glean details from the new database. The Department of Human Services admits the AFP will be able to obtain and use information from the databank and card chip to respond to threats of killing or injury, to identify disaster victims, investigate missing persons, or to ‘enforce criminal law or for the protection of the public revenue’.

“Australia's super-secretive spy agency, the Defence Signals Directorate, will test security for the new access card system…

“The Australian Privacy Foundation's no-ID-card campaign director, Anna Johnston, fears future governments could “misuse and abuse” the biometric databank…

(Full story…)

ID Cards can be deployed in ways that increase, rather than decrease, the privacy of citizens, while still achieving the goals of fraud reduction. It's a matter of taking advantage of new card and crypto technologies. My view is that politicians would be well advised to fund such products rather than massive centralized databases.

As for the Defence Signals Directorate's access to identity data, what has this got to do with databases offering generalized access to every curious official? You would think they were without other means.

More on the iTunes approach to privacy

Reading more about Apple's decision to insert users' names and email addresses in the songs they download from iTunes, I stumbled across a Macworld article on iTunes 6.0.2 where Rob Griffiths described the store's approach to capturing user preferences as “spyware”.

I blogged about Rob's piece, but it turns out it was 18 months old, and Apple had quickly published a fix to the “phone-home without user permission” issue.

Since I don't want to beat a dead horse, and Apple showed the right spirit in fixing things, I took that post down within a couple of hours (leaving it here for anyone who wonders what it said).

So now, with a better understanding of the context, I can get on with thinking about what it means for Apple to insert our names and email addresses into the music files we download – again without telling us.

First I have to thank David Waite for pointing out that the original profiling issue had been resolved:

Kim, [the Macworld] article is almost 18 months old. Apple quickly released a newer version of iTunes which ‘fixed’ this issue – the mini store is disabled by default, and today when you select to ‘Show MiniStore’ it displays:

“The iTunes MiniStore helps you discover new music and video right from your iTunes Library. As you select tracks or videos in your Library, information about your selections are sent to Apple and the MiniStore will display related songs, artists, or videos. Apple does not keep any information related to the contents of your iTunes Library.

Would you like to turn on the MiniStore now?”

The interesting thing about the more recent debacle about Apple including your name and email address in the songs you buy from their store is that they have done this since Day 1. Its only after people thought Apple selling music with no DRM was too good to be true that the current stink over it started.

It's interesting to understand the history here. I would have thought that in light of their previous experience, Apple would have been very up front about the fact that they are embedding your name and email address in the files they give you. After all, it is PII, and I would think it would require your knowledge and approval.

I wonder what the Europeans will make of this?

More on the iTunes approach to privacy

Reading more about Apple's decision to insert users' names and email addresses in the songs they download from iTunes, I came across some related information in an excellent Macworld article by Rob Griffiths:

Yesterday, Apple’s iTunes 6.0.2 update was released, and offered these features, according to the Read Me:

iTunes 6.0.2 includes stability and performance improvements over iTunes 6.0.1.

What it also offered, but didn’t bother to disclose, was the addition of a bit of potential spyware to the iTunes interface. As reported originally on since1968.com, and then followed-up on boingboing and other sites, the new iTunes MiniStore, which appears directly below the song list area in the main iTunes window, watches what you click on in iTunes and sends that information across the Web to a remote server. When you double-click a song to play in your Library or playlists, the display in the mini-store changes to reflect ‘matches’ based on what’s been selected, as seen below.

In order to do this, the music store must obviously know what you’re listening to. It learns this information via a packet of information sent each time you play a song via a double-click. This data is sent without your explicit permission, and as far as I can tell, there are no Apple privacy policies that cover that transfer of information. It’s also unclear exactly what data is being sent. (Is it just song and title? Or does it include your Apple music store ID, which would tie the song info directly to your personal data?) And although Apple now assures us that the data is not collected, that information is not made clear to users when they begin using iTunes.

The MiniStore can be easily disabled – just hit Shift-Command-M, or choose Edit: Hide MiniStore, and it’s gone. Once hidden, no more data is transmitted, as confirmed by Kirk McElhearn using the Unix program tcpdump, which watches traffic sent over your network connection. Disable the MiniStore, and your private listening habits will stay just that – private.

However, this isn’t about the MiniStore itself. It’s about Apple’s attitude in rolling this change out to the millions of iTunes users, without as much as a peep about what’s going on behind the scenes. Consider, for example, if Microsoft had done such a thing with a minor Office update – say they started collecting data on the names of the files you were editing, in the hopes of selling you preformatted templates to help with future similar projects. If they did this in a minor update, and without telling anyone that the data were being transmitted, there would be universal outrage over this potential attack on our privacy. And now Apple’s gone and done basically the exact same thing.

Personally, I am quite upset with Apple’s decision-making in this case, and I hope others are as well.

No company, even one I admire as much as Apple (I did spend nearly five years of my life working there), should start transmitting personal data over the Internet without my explicit permission and a clear explanation of how it’s being used. In addition, if a company is collecting this information, I have a right to know exactly what’s being collected, and what the company plans on doing with my personal information.

The good news is, Apple tells us that the information is not actually being collected. The data sent is used to update the MiniStore and then discarded. If you think about it, this makes sense – imagine the size of the data files they would accumulate with millions of users and what must be hundreds of millions of songs played each day. But Apple should tell us as much, so that we can all relax a bit about sharing our listening habits with Apple.

Apple should amend iTunes to clearly disclose what data the program is transmitting and how it’s being used. There should be a dialog box that pops up the first time iTunes runs, explaining exactly how the MiniStore works. If Apple had just included that yesterday – or even some information in the Read Me, then I wouldn’t have even raised this as an issue. A little transparency and openness can go a long way to easing privacy fears.

As interesting as the article are the 166 comments on it. About half seem to think it's fine for Apple to collect the information without consent. Oops. I shouldn't have said “collect” – or at least that's Apple's spin on this. It seems that even though the information is sent in (through a third party), Apple doesn't actually “collect” it, since it discards the information after “processing it”. So “collect” seems to mean “retain in raw form.” The iTunes supporters make it clear they “don't think” Apple would use the information to create a profile of their tastes. Customer loyalty is a beautiful thing. This is the stuff that great ads are made of.

iTunes and Identity-Based Digital Rights Management

A fascinating posting by Randy Picker at the University of Chicago Law School Faculty Blog:

Over the last week, it has been become clear that Apple is embedding some identifying information in songs purchased from iTunes, including the name of the customer and his or her e-mail address. This has raised the ire of consumer advocates, including the Electronic Frontier Foundation which addressed this again yesterday.

Last year, I published a paper entitled Mistrust-Based Digital Rights Management (online preprint available here). In that paper, I argued that as we switched from content products such as CDs and DVDs to content services such as iTunes, Google Video and YouTube, we would embrace identity-based digital rights management. This is exactly what we are seeing from iTunes. How should we assess identity-based DRM?

Take a step backwards. As long as I keep my songs to myself and don’t share them, the embedded information shouldn’t matter. The information may facilitate interactions between Apple and its customers and might make it easier to verify whether a particular song was purchased from iTunes, but this doesn’t seem to be the central point of embedding identity in the songs.

Instead, identity matters if I share the song with someone else. Identity travels with the content. If I know that and care, I will be less likely to share the content indiscriminately over p2p networks. Why should I care? It depends on what happens with the embedded information. One use would make it possible for Apple to identify who was sharing content on p2p networks. Having traced content to its purchaser, Apple might choose to drop that person as a customer.

But Apple could do this without embedding the information in the clear. As Fred von Lohmann asked in his post on the EFF blog, why embed identity in the clear rather than as encrypted data? After all, if Apple intends to scour p2p networks, it could do so just as easily looking for encrypted identities.

Apple might have a different strategy, one that relies on third-party sanctions, and that strategy would require actual identities. Suppose Apple posted the following notice on iTunes:

“Songs downloaded from iTunes are not to be shared with strangers. We have embedded your name and email address into the songs. Our best guess is that if you share iTunes songs on p2p networks, your name and email will be harvested from those songs and you will receive an extra 10 spam emails per day from third parties.”

Encrypted information works if Apple is doing all of the detection. It would even work, as I suggested in my paper, if Apple relied on third parties to do the detection by turning in p2p uploaders to Apple. We could run that system with encrypted information. All that is required is that the rat knows that he is turning in someone; he doesn’t need to know who that person is exactly.

But a third-party punishment strategy would probably be implemented using actual identity. The spammer who harvests the email address inflicts the penalty for uploading, not Apple itself. For Apple to drop out of the punishment business, it needs to hand off identity. Obviously, extra spam is just one possible cost for disclosing names and emails; other costs would further reduce the incentive to upload.

Disclosing identity is a clumsy tool. It doesn’t scale very well. It will work most powerfully against the casual uploader. It offers no (marginal) deterrence against someone who would upload lots of songs anyway. My mistrust-based scheme (described in the paper) might work better in those circumstances.

So far, Apple doesn’t seem to be saying much about what it is doing. It needs to be careful. As the Sony BMG fiasco – also discussed in the paper – emphasizes, content owners may not get that many opportunities to establish technological protection schemes. Each one they get wrong makes it that much harder to try another scheme later, given the adverse public relations fallout. As I suggest above, Apple may have a legitimate strategy for disclosing identity in the clear. It will be interesting to see what Apple says next.

I haven't read Randy's paper yet but will do so now.

Keys, signatures and linkability

Stefan Brands is contributing to the discussion of traceability, linkability and selective disclosure with a series of posts over at identity corner. He is one of the world's key innovators in the cryptography of unlinkability, so his participation is especially interesting.

Consider a user who self-generates several identity claims at different occasions, say “I am 25 years of age”, “I am male”, and “I am a citizen of Canada”. The user’s software packages these assertions into identity claims by means of attribute type/value pairs; for instance, claim 1 is encoded as “age = 25”, claim 2 is “gender = 0”, and claim 3 is “citizenship = 1”. Clearly, relying parties that receive these identity claims cannot trace them to their user’s identity (whether that be represented in the form of a birth name, an SSN, or another identifier) by analyzing the presented claims; self-generated claims are untraceable. Similarly, they cannot decide whether or not different claims are presented by the same or by different users; self-generated claims are unlinkable.

Note that these two privacy properties (which are different but, as we will see in the next paragraph, complementary) hold “unconditionally;” no amount of computing power will enable relying parties to trace or link by analyzing incoming identity-data flows, not even if relying parties collude (indeed, they may be the same entity).

Now, consider the same self-generated identity claims, but this time their user “self-protects” them by means of a self-generated cryptographic key pair (e.g., a random RSA private key and its corresponding public key). The user digitally signs the identity claims with his private key; for example, claim 1 as presented to a relying party looks like “age = 25; PublicKey = 37AC986B…; Signature = 21A4A5B6…”. Clearly, these self-protected claims are as untraceable as their unprotected cousins in the previous paragraph. Are they unlinkable? Well, that depends:

If the user applies the same key pair to all claims, then the public key that is present in the presented messages will be the same; thus, all presented identity claims are linkable. As a result, a relying party that receives all three claims over time knows that it is dealing with a 25-year old Canadian male. As the user over time presents more linkable claims, this may indirectly lead to traceability; for example, the relying party may be able to infer the user’s birth name once the user presents a linkable identity claim that states the postal code of his home address.

If the user applies a different self-generated key pair to each identity claim, the three presented claims are as unlinkable and untraceable as in the example where no cryptographic data was appended. Note that this solution does not force unlinkability and untraceability: in cases where the user should be identified, the user can simply provide a claim that specifies his name: “name=Jon Smith” or “SSN-identifier=945278476”, for instance. Similarly, to make self-generated identity claims linkable, an additional common attribute value can be encoded
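Stefan's two cases can be sketched in a few lines from the relying party's point of view. This is only an illustration under loud assumptions: no real key pairs or signatures are produced, the "public key" is simulated by a random hex string, and the helper names (`new_public_key`, `present`, `link_by_key`) are hypothetical.

```python
import secrets
from collections import defaultdict

def new_public_key() -> str:
    """Stand-in for generating a key pair; returns only a random 'public key' string."""
    return secrets.token_hex(16)

def present(claims, reuse_key: bool):
    """Attach a public key to each claim: one shared key, or a fresh key per claim."""
    shared = new_public_key()
    return [(claim, shared if reuse_key else new_public_key()) for claim in claims]

def link_by_key(presented):
    """What a relying party can do: group the claims it received by public key."""
    groups = defaultdict(list)
    for claim, key in presented:
        groups[key].append(claim)
    return list(groups.values())

claims = ["age = 25", "gender = 0", "citizenship = 1"]

# Same key pair for every claim: the relying party sees one group of three.
same_key = link_by_key(present(claims, reuse_key=True))

# Fresh key pair per claim: three unrelated groups of one claim each.
fresh_keys = link_by_key(present(claims, reuse_key=False))
```

With a shared key the relying party ends up with a single group describing a 25-year-old Canadian male; with fresh keys the key material gives it nothing to group on.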

This is a clear way to introduce the notion of how keys and signatures affect traceability and linkability of claims. However there is more to consider. Even if the user applies a different self-generated key pair for each of the three attributes discussed above, if the three attributes are transferred in a single transaction, they are still linked. The transaction itself links the attribute assertions. Conveyance of multiple claims is a very common case.

Similarly, if Stefan's three attributes are released during what can be considered to be the same session, they are linked, again regardless of the cryptography. And if they are released within a given time window from the same transport (IP) address, they should be considered linked too.
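The point that context links claims regardless of the cryptography can be expressed as a simple predicate. The sketch below is illustrative only: the `Release` record, its field names, the sample addresses and the five-minute window are all assumptions, not values taken from any protocol.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Release:
    claim: str
    public_key: str
    session_id: str
    ip: str
    timestamp: float  # seconds since some reference point

WINDOW_SECONDS = 300  # hypothetical threshold for "a given time window"

def linked(a: Release, b: Release) -> bool:
    """Two releases are linked by a shared key, a shared session,
    or the same transport address within the time window."""
    if a.public_key == b.public_key:
        return True
    if a.session_id == b.session_id:
        return True
    return a.ip == b.ip and abs(a.timestamp - b.timestamp) <= WINDOW_SECONDS

# Every release uses a different key, yet context still ties the first three together.
r1 = Release("age = 25", "key-A", "sess-1", "203.0.113.7", 0.0)
r2 = Release("gender = 0", "key-B", "sess-1", "203.0.113.7", 20.0)
r3 = Release("citizenship = 1", "key-C", "sess-2", "203.0.113.7", 120.0)
r4 = Release("age = 31", "key-D", "sess-3", "198.51.100.9", 130.0)
```

Here `r1` and `r2` are linked by session, `r2` and `r3` by address and timing, and only `r4` stands apart, even though all four carry different keys.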

While cryptography is one factor contributing to linkability, we need to look at the protocol patterns and visibility they render possible as well. I'll be starting to do that in my next posting.

Neil Macehiter on the identity metasystem

Here is some recent commentary from Neil Macehiter at macehiterward-dutton Blog on IT Business Alignment:

It's perhaps unsurprising, given all the brouhaha surrounding Microsoft's claims that open source software infringes on 235 of its patents (which incidentally I take to be largely ‘sabre rattling’ from Redmond in the face of the implications of the GPLv3 for its deal with Novell, as discussed in the Risk Factors of the latter's recent 10-K filing), that some recent news regarding the Redmond company's very positive collaboration with the open source community has not received the attention it deserves.

The news in question concerns a series of announcements the company made at last week's Interop conference in Las Vegas. These announcements, as the title of the post suggest, all revolve around Microsoft's vision for an Internet-scale, interoperable identity metasystem and range from additions to the Open Specification Promise (OSP) through to support for OpenLDAP with Microsoft's Identity Lifecycle Manager.

So, what did they announce? First, Microsoft is

making the Identity Selector Interoperability Profile available under the OSP to enhance interoperability in the identity metasystem for client computers using any platform. An individual open source software developer or a commercial software developer can build its identity selector software and pay no licensing fees to Microsoft, nor will it need to worry about future patent concerns related to the covered specifications for that technology

In other words, third parties are free to build the equivalent of Microsoft's CardSpace, following the likes of the Higgins project, Ian Brown's Apple Safari Plug-In and Chuck Mortimore's Firefox Identity Selector. This is important not only because it extends the reach of CardSpace-like capabilities beyond Windows but also because it facilitates the consistent user experience (I know because I have used CardSpace, the Safari Plug-In and the Firefox Identity Selector) which helps to reduce errors and misunderstanding by users.

Second, Microsoft

is starting four open source projects that will help Web developers support information cards, the primary mechanism for representing user identities in the identity metasystem. These projects will implement software for specifying the Web site’s security policy and accepting information cards in Java for Sun Java System Web Servers or Apache Tomcat or IBM’s WebSphere Application Server, Ruby on Rails, and PHP for the Apache Web server. An additional project will implement a C Library that may be used generically for any Web site or service. These implementations will complement the existing ability to support information cards on the Microsoft® Windows® platform using the Microsoft Visual Studio® development environment.

Or, to put it another way, doing for back end servers what the first announcement is doing for the front-end: enabling web sites and enterprises running a wide variety of web server infrastructure to support authentication using CardSpace and the other identity selectors.

The cynical amongst you might be forgiven for thinking that these two announcements are just Microsoft paying lip service to interoperability. This post should help to allay your concerns: at the Internet Identity Workshop earlier in May the Open Source Identity Selector (OSIS) group demonstrated interoperability amongst 5 identity selectors, 11 relying parties (the party relying on authentication to prove an identity), 7 identity providers (the party asserting the identity), 4 types of identity token (the mechanism for conveying the identity assertion), and 2 authentication mechanisms. Also, on the same day as the Microsoft press release, Internet2 announced plans to extend Shibboleth, a federated web single sign-on solution based on SAML that is widely used amongst educational institutions, to support CardSpace and compatible identity selectors.

The third piece of news from Redmond last week, concerned the new Identity Lifecycle Manager product and is thus primarily focussed behind the firewall. Microsoft is going to be working with KERNEL Networks and Oxford Computer Group to enable bi-directional synchronisation of identity data between OpenLDAP, an open source implementation of the ubiquitous directory standard, and Microsoft's Active Directory. Identity Lifecycle Manager already supports a wide range of the commonly-deployed identity data repositories so I think this move is primarily in the “playing well with open source” category – but valuable nonetheless.

These announcements are further evidence that the likes of Kim Cameron, Microsoft's chief identity architect, and Mike Jones, the company's Director of Identity Partnerships, have been working hard to foster the relationships and commitment (both from Microsoft and third parties) required to help make the identity metasystem a reality. That reality is too important for the results of those efforts to be diluted by political shenanigans around patents and GPLv3.

I'm glad to hear that Neil has tried CardSpace and its sister implementations on different platforms.

Linkage and identification

Inspired by some of Ben Laurie's recent postings, I want to continue exploring the issues of privacy and linkability (see related pieces here and here).

I have explained that CardSpace is a way of selecting and transferring a relevant digital identity – not a crypto system; and that the privacy characteristics involved depend on the nature of the transaction and the identity provider being used within CardSpace – not on CardSpace itself. I ended my last piece this way:

The question now becomes that of how identity providers behave. Given that suddenly they have no visibility onto the relying party, is linkability still possible?

But before zeroing in on specific technologies, I want to drill into two issues. First is the meaning of “identification”; and second, the meaning of “linkability” and its related concept of “traceability”.

Having done this will allow us to describe different types of linkage, and set up our look at how different cryptographic approaches and transactional architectures relate to them.

Identification

There has been much discussion of identification (which, for those new to this world, is not at all the same as digital identity). I would like to take up the definitions used in the EU Data Protection Directive, which have been nicely summarized here, but add a few precisions. First, we need to broaden the definition of “indirect identification” by dropping the requirement for unique attributes – as long as you end up with unambiguous identification. Second, we need to distinguish between identification as a technical phenomenon and personal identification.

This leads to the following taxonomy:

Personal data:
- any piece of information regarding an identified or identifiable natural person.
Direct Personal Identification:
- establishing that an entity is a specific natural person through use of basic personal data (e.g., name, address, etc.), plus a personal number, a widely known pseudo-identity, a biometric characteristic such as a fingerprint, PD, etc.
Indirect Personal Identification:
- establishing that an entity is a specific natural person through other characteristics or attributes or a combination of both – in other words, to assemble “sufficiently identifying” information
Personal Non-Identification:
- assumed if the amount and the nature of the indirectly identifying data are such that identification of the individual as a natural person is only possible with the application of disproportionate effort, or through the assistance of a third party outside the power and authority of the person responsible…

Translating to the vocabulary we often use in the software industry, direct personal identification is done through a unique personal identifier assigned to a natural person. Indirect personal identification occurs when enough claims are released – unique or not – that linkage to a natural person can be accomplished. If linkage to a natural person is not possible, you have personal non-identification. We have added the word “personal” to each of these definitions so we could withstand the paradox that when pseudonyms are used, unique identifiers may in fact lead to personal non-identification…

The notion of “disproportionate effort” is an important one. The basic idea is useful, with the proviso that when one controls computerized systems end-to-end one may accomplish very complicated tasks, computations and correlations very easily – and this does not in itself constitute “disproportionate effort”.
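Indirect personal identification can be pictured as a shrinking set of candidates: each released claim, unique or not, filters the population until one person remains. The sketch below uses a tiny invented population; the names, attributes and the `candidates` helper are hypothetical and exist only to show the mechanism.

```python
# A tiny, invented population; none of these records describe real people.
population = [
    {"name": "P1", "age": 25, "gender": "M", "citizenship": "CA", "postal_code": "K1A"},
    {"name": "P2", "age": 25, "gender": "M", "citizenship": "CA", "postal_code": "V6B"},
    {"name": "P3", "age": 25, "gender": "F", "citizenship": "CA", "postal_code": "K1A"},
    {"name": "P4", "age": 40, "gender": "M", "citizenship": "US", "postal_code": "K1A"},
]

def candidates(released: dict) -> list:
    """Names of everyone consistent with all the claims released so far."""
    return [p["name"] for p in population
            if all(p[key] == value for key, value in released.items())]

released = {}
sizes = []  # how many people still match after each additional claim
for attribute, value in [("age", 25), ("gender", "M"),
                         ("citizenship", "CA"), ("postal_code", "K1A")]:
    released[attribute] = value
    sizes.append(len(candidates(released)))

# With this data, sizes is [3, 2, 2, 1]: no single claim is unique,
# yet together the four claims single out one person.
```

The filter is a single comprehension over a database, which is why correlation of this kind should not be mistaken for “disproportionate effort” when one party controls the systems end-to-end.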

Linkability

If you search for “linkability”, you will find that about half the hits refer to the characteristics that make people want to link to your web site. That's NOT what's being discussed here.

Instead, we're talking about being able to link one transaction to another.

The first time I heard the word used this way was in reference to the E-Cash systems of the eighties. With physical cash, you can walk into a store and buy something with one coin, later buy something else with another coin, and be assured there is no linkage between the two transactions that is caused by the coins themselves.

This quality is hard to achieve with electronic payments. Think of how a credit card or debit card or bank account works. Use the same credit card for two transactions and you create an electronic trail that connects them together.

E-Cash was proposed as a means of getting characteristics similar to those of the physical world when dealing with electronic transactions. Non-linkability was the concept introduced to describe this. Over time it has become a key concept of privacy research, which models all identity transactions as involving similar basic issues.

Linkability is closely related to traceability. By traceability, people mean being able to follow a transaction through all its phases by collecting transaction information and having some way of identifying the transaction payload as it moves through the system.

Traceability is often explicitly sought. For example, with credit card purchases, there is a transaction identifier which ties the same event together across the computer systems of the participating banks, clearing house and merchant. This is certainly considered “a feature.” There are other, subtler, sometimes unintended, ways of achieving traceability (timestamps and the like).

Once you can link two transactions, many different outcomes may result. Two transactions conveying direct personal identification might be linked. Or, a transaction initially characterized by personal non-identification may suddenly become subject to indirect personal identification.

To further facilitate the discussion, I think we should distinguish various types of linking:

Intra-transaction linking is the product of traceability, and provides visibility between the claims issuer, the user presenting the claims, and the relying party (for example, credit card transaction number).
Single-site transaction linking associates a number of transactions at a single site with a data subject. The phrase “data subject” is used to clarify that no linking is implied between the transactions and any “natural person”.
Multi-site transaction linking associates linked transactions at one site with those at another site.
Natural person linking associates a data subject with a natural person.
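The difference between single-site and multi-site linking can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a model of any particular protocol: the `Transaction` record, the site names, the identifiers and both helper functions are hypothetical. It contrasts a user who presents the same identifier everywhere with one who presents a different identifier to each site.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set


class Transaction(NamedTuple):
    site: str        # where the transaction took place
    identifier: str  # whatever identifier the site saw for the data subject
    item: str        # what the transaction was about


def single_site_links(log: List[Transaction], site: str) -> Dict[str, List[str]]:
    """Group the transactions seen at one site by the identifier presented."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for t in log:
        if t.site == site:
            groups[t.identifier].append(t.item)
    return dict(groups)


def multi_site_links(log: List[Transaction]) -> Dict[str, Set[str]]:
    """Return identifiers seen at more than one site, with the sites involved."""
    sites: Dict[str, Set[str]] = defaultdict(set)
    for t in log:
        sites[t.identifier].add(t.site)
    return {ident: seen for ident, seen in sites.items() if len(seen) > 1}


# Hypothetical logs: one identifier reused everywhere versus one identifier per site.
shared_identifier = [
    Transaction("shop.example", "id-42", "book"),
    Transaction("shop.example", "id-42", "lamp"),
    Transaction("news.example", "id-42", "subscription"),
]
per_site_identifier = [
    Transaction("shop.example", "pseudonym-a", "book"),
    Transaction("shop.example", "pseudonym-a", "lamp"),
    Transaction("news.example", "pseudonym-b", "subscription"),
]

shared_single = single_site_links(shared_identifier, "shop.example")
shared_multi = multi_site_links(shared_identifier)
per_site_single = single_site_links(per_site_identifier, "shop.example")
per_site_multi = multi_site_links(per_site_identifier)
```

In both logs the shop can still tie the two purchases to one data subject, which is single-site linking. Only the shared identifier lets the two sites compare notes. Nothing here touches natural person linking, which would require a further mapping from an identifier to a person in the physical world.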

Next time I will use these ideas to help explain how specific crypto systems and protocol approaches impact privacy.

Notes from IIW 2007a

Over at self-issued, Mike Jones picked up on the OSIS Wiki Page reporting on the recent Information Card Connect-a-thon. Maybe the most encouraging thing was to see new players show up with working bits:

The OSIS group sponsored an Information Card interoperability connect-a-thon on May 15, 2007 as part of the Internet Identity Workshop 2007 A in Mountain View, California. Participants collaborated to work through combinations of Identity Provider, Identity Agent, and Relying Party scenarios, in order to identify and workshop problems with interoperability. The following representatives were present and participated:

5 Information Card Selectors

Ian Brown's Safari Plugin

XMLDAP

Windows CardSpace

Higgins IdA Native

Higgins IdA Java

11 Relying Parties

Bandit (basic wiki authentication)

Bandit (elevated privileges)

PamelaWare

CA

XMLDAP

Windows Live RP (used to obtain a managed card)

Windows Live/single-issuer (where you can use the managed card)

Oracle RP

Identityblog RP (based on Rob Richards' library)

Identityblog helloworld token RP

UW/Shibboleth

7 Identity Providers

Higgins

Bandit

XMLDAP

UW/Shibboleth

LiveLabs

HumanPresent

Identityblog HelloWorld IdP

4 Token Types

SAML 1.0

SAML 1.1

helloworld

username token

2 Authentication Mechanisms

username/password

self-issued (personal) card

Many combinations interoperated as expected; several issues were identified and are being fixed in preparation for the coming Information Card Interop event to be held at the Burton Group Catalyst Conference in San Francisco (June 25-29).

Socket and Ecosystem Days

David Coder comments on my recent reflection on “novel auth” technology:

Kim, I very much agree with everything you wrote.

But there is another thing I don't understand lately. CardSpace is shipping now for almost half a year in its RTM version. And yet, I have never come across a production site (except this one) that uses it. You post all these fantastic announcements of new groups that will support this, but out there on the web, very little adoption seems to take place. And in particular, there seems to be not a single Microsoft site that uses it. Why? Contrary, the one huge MS group where I would have thought they might use it (Windows Live ID and all the sites that use it) seems to be even implementing their own identity selector.

Quite frankly, right now my impression is that what is needed most is some highly visible commitment from MS itself to this idea and to implement it widespread on its platform. I am just quite sceptical that anyone else will use this widespread, unless you do the first step.

Make no mistake: you will see deep Microsoft support. But you need to give us time to roll it out, just as we need to give others in the industry time to do the same.

Using your example of Windows Live ID, it is a huge production system handling a billion authentications a day. There are strict requirements for introducing new software. In fact, some of them arose through input from policy makers. Much more is involved than “wanting to do something” and coming up with “bits” suitable for use on such an enormous site. There is Process.

The same is true in terms of integrating the new technology into our federation product, Active Directory Federation Services (ADFS). There is a whole team working on CardSpace support, so administrators will be able to give their Active Directory (AD) users Information Cards at the flick of a switch. But we want to do it as well as we can, and in the most secure way possible, and we can't do that overnight.

My colleagues and I wanted to see CardSpace bits get into circulation as early as possible – even if service offerings weren't ready yet. Why?

Socket and Ecosystem Days

The problem with identity is getting the infrastructure in place. Some great talent – I don't know who – pointed this out when he said, “The Public Key Infrastructure (PKI) is great except for one thing: the public has no keys”…

CardSpace eliminates the need to “give the people keys”. But the bits still have to “get out there” before it will work. We are still in “Socket and Ecosystem Days”, when sockets start to appear on desktops and people running web sites can move past “but nobody has information cards” and get to “hey, everyone is going to have them”.

Our first job was to ship CardSpace V1.0 so Information Cards became “real”. Now we need to distribute bits. And finally we need to lead in adoption, just as you say.

CardSpace can't succeed without its sister implementations on other platforms. It also needs relying party software in a dozen languages to run on all platforms. And identity provider software.

These are just starting to emerge. But all this is happening in a methodical and persistent way. I think of it as “ecosystem time”.

I'll post the report that appeared on the OSIS wiki describing the Connect-a-thon held at a recent IIW. You will see the degree to which the ecosystem is growing.

Meanwhile, Windows Live ID plans to introduce Information Card support this summer. At that point, all the Microsoft properties will be enabled. The integration will grow progressively stronger over time.