Four laws in one blow

I've been meaning to draw people's attention to this story (via Identity Woman) by David Lazarus at sfgate.com:

“The University of California has suffered yet another potential data breach, this one involving the names and Social Security numbers of about 7,000 students, faculty and staff at the San Francisco campus.

“For Sen. Dianne Feinstein, D-Calif., enough is enough. She told me Tuesday that she'll introduce federal legislation within the next few days requiring encryption of all data stored for commercial purposes.

“This latest incident involving UCSF follows news that UC Berkeley lost control of personal info for nearly 100,000 grad students, alumni and applicants last month when a laptop computer was stolen from an unlocked campus office.

“It also follows a flurry of other security lapses, including San Francisco's Wells Fargo, the nation's fourth-largest bank, experiencing no fewer than three data breaches due to stolen computers over the past year and a half.”

Senator Feinstein said, “What this shows is that there is enormous sloppy handling of personal data.”

It seems to me the question of whether the personal information was handled sloppily or tidily is just part of the problem. I'm equally bothered by the information being there in the first place. How and why did it get there? Did the identified subjects agree to this usage of the information? Why were public identifiers (social security numbers) kept for private individuals? Once these questions are answered, we can turn to operational issues: why the information appeared on a test machine, and why a test machine was deployed with no firewall.

All of this is so far a disturbing mystery. There should be a public investigation of the circumstances through which this breach (and all like it) came about. We need to understand what was going on in the heads of the people who put the data on the compromised machine. The best practice is not to store unnecessary information, and not to store it in unnecessary places. What were these guys thinking about? We need to build people's understanding of the underlying issues.

I expect this information disaster came about by breaking four identity laws at once. What a run!

  • Were users in control of what their information was being employed for? Were they told where and how it was being used (law of user control)?
  • Was there really a need to store social security numbers rather than some local or derived identifier (law of minimal information, law of directed identity)?
  • Would the identified subjects see a “test machine” as a legitimate party to their identity relationship with the university (law of fewest parties)?
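
The second question can be made concrete. What follows is only a rough sketch of what a "derived identifier" might look like, not a description of anything UCSF ran: a system that needs a stable per-person key stores a keyed hash of the social security number instead of the number itself. The function name `derive_local_id`, the key, the context labels and the sample number are all hypothetical.

```python
import hashlib
import hmac


def derive_local_id(ssn: str, secret_key: bytes, context: str) -> str:
    """Return an opaque identifier derived from an SSN for one context."""
    # Normalize so "000-12-3456" and "000123456" give the same identifier.
    digits = "".join(ch for ch in ssn if ch.isdigit())
    # Mixing in the context gives each system its own identifier, so records
    # cannot be joined across systems just by comparing identifiers.
    message = f"{context}:{digits}".encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()[:16]


# Hypothetical key and placeholder number (area 000 is never issued).
key = b"hypothetical-secret-kept-off-the-test-machine"
payroll_id = derive_local_id("000-12-3456", key, "payroll")
test_id = derive_local_id("000123456", key, "test")
```

The same person gets a different identifier in each context, and neither identifier reveals the number. Because there are so few possible social security numbers, this only helps if the key is kept well away from machines like the one that was compromised.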

Encryption is a good idea, but on its own it will probably lead to a false sense of confidence and further breaches. We need a more holistic solution.

One final comment. We should give UCSF's forensic staff credit for detecting the breach:

In UCSF's case, campus techies noticed in late February that a server used in part by the university's accounting and personnel departments was generating an unusually high level of network activity.

I'm willing to bet things like this are happening almost everywhere and almost every day – but that most institutions don't have the mechanisms in place to detect what is going on.

From identity to identifiers – Law of Control

I am really fascinated by work Drummond Reed has started on his blog in which he uses the laws of identity to structure a discussion on identifiers. I look forward to seeing where this goes, since Drummond has thought incredibly deeply about identifiers (he is the technical chair of the OASIS Extensible Resource Identifier – XRI – Technical Committee; not to mention his work on XDI…). I know from conversations with my friends at NAC (the Network Applications Consortium) that identifiers are becoming a super-hot pragmatic issue.

Drummond explains what he's doing this way:

When Kim published his Fourth Law (the Law of Directed Identity), it was the first (and only) law that touched directly on identifiers. I knew his Laws had gained quite a following when I quickly received several email messages asking if XRIs (Extensible Resource Identifiers), the new OASIS specifications for abstract identifiers, conformed to the 4th Law.

In discussing this with other members of the XRI TC, as well as with Kim, we realized that each of his “Laws of Identity” has a “Corollary For Identifiers”. In particular, these corollaries would apply to any universal identifier metasystem that aspired to be the addressing scheme for the “mega momma backplane” (as Kim, Marc Canter, and Craig Burton put it.)

That, of course, is precisely the goal of the OASIS XRI effort dating back to 2003 (and previously to the XNS work dating back to 1999.) Given that the XRI 2.0 specifications are currently in public review in advance of a full OASIS vote, now seems like a good time to follow Kim’s lead and publish “The Seven Corollaries of Identifiers”.

The idea that each of the laws has its own ‘identifier corollary’ makes perfect sense. And I'm struck by the way in which the laws provide a conceptual handle through which the issues of identification can be understood by an audience wider than those who wake up, have a coffee, and think about identifiers all day long.

So let's look at the first corollary:

1. The Law of Control

Technical identity systems MUST only reveal information identifying a user with the user’s consent.

1a. The Corollary of Identifier Control.

The identifiers in a universal identifier metasystem MUST only reveal information identifying a user with the user’s consent.

Funny how intuitive it seems when you put it this way. A user’s online identifier should not force the user to reveal any more information than they wish. And yet one of the online identifiers most frequently requested from users squarely violates this principle: an email address. Websites who require an email address to register – and many have no choice because it is often the only easy, universal way to perform basic user authentication – force individuals into revealing information that in many cases they would rather not.

So half the Web breaks this corollary before we’re even out of the starting gate. But it gets worse. Look at one of the current bulwarks of online identification: DNS. A standard requirement for most DNS name registries is accurate, current contact data for the registrant that is published publicly as “Whois” data. Although many registrars now offer proxy registration services to preserve registrant privacy and prevent spam, there’s no escaping that a major component of our current Internet identifier infrastructure breaks the First Corollary squarely in two.

So can XRIs fix this problem? Yes. The first principle of XRI architecture is that XRIs are abstract – the association between an XRI and the real-world resource it represents is entirely under the control of its XRI authority (the person or organization registering the XRI, at any level of delegation). So nothing in an XRI need reveal anything about the authority’s identity or messaging address.

So how can the identifier be authenticated, i.e., what’s the XRI equivalent of the simple email address verification test that websites use every day? The ISSO (I-Name Single Sign-On) protocol, which combines XRI 2.0 resolution with SAML 2.0 authentication assertion exchange. It’s easier, faster, and much more secure than email authentication – and still does not require revealing any other information identifying the user.

So that fixes the first problem. What about the second – the DNS “Whois” problem? What registrant data is required when registering an XRI? Here I can only speak for the XRI global registry services to be offered by XDI.ORG. Based on its Global Services Specifications (GSS) that have been in public review since December, the answer is: none. Following XDI.ORG’s Minimum Information Policy, a cornerstone of its Data Protection Policies, the XDI.ORG global registries will store only registered XRIs, resolution values, and authentication credentials. There is no public (or private) “Whois” service. (There is a Public Trustee Service that provides an alternate means of authenticating a registrant to XDI.ORG if they lose their registration credential, but that data is entirely private.)

So what provides accountability for global registrations? Dispute Notification Service. Every global XRI registrar is required to provide a means of forwarding authenticated dispute notifications to a registrant. This accomplishes the same goal as DNS Whois service but without revealing registrant identifying data or exposing registrants to spam.
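
To make the forwarding idea in that last paragraph a little more concrete, here is a toy sketch. It is not XDI.ORG's actual service or API: the `Registrar` class, its method names, and the placeholder identifier are hypothetical, and it assumes the registrar privately holds some way of reaching each registrant. It also leaves out the step of authenticating the dispute notice.

```python
from dataclasses import dataclass, field


@dataclass
class Registrar:
    # Private mapping from identifier to contact address; never published.
    _contacts: dict = field(default_factory=dict)
    # Stand-in for real message delivery: contact address -> list of notices.
    outbox: dict = field(default_factory=dict)

    def register(self, identifier: str, contact: str) -> None:
        self._contacts[identifier] = contact

    def whois(self, identifier: str) -> None:
        # Deliberately returns nothing: there is no public lookup.
        return None

    def forward_dispute(self, identifier: str, notice: str) -> bool:
        """Deliver a notice to the registrant without revealing who they are."""
        contact = self._contacts.get(identifier)
        if contact is None:
            return False
        self.outbox.setdefault(contact, []).append(notice)
        return True


registrar = Registrar()
registrar.register("=example.name", "registrant@example.org")
delivered = registrar.forward_dispute("=example.name", "Trademark dispute filed.")
```

The caller learns only whether the notice was delivered, never the address it went to.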

This really helps me understand what XRI is all about. And we're just at law 1.

Identity Reform

Chris Ceppi has gone further in explaining his ideas around ‘Identity Reform’. And now I understand the interesting point he is making. We are talking about technological reform.

In an earlier post I referenced the work of Frank Luntz, a Republican pollster and wordsmith who has, regrettably since I often find myself at odds with his positions, been very successful at promoting legislative initiatives by correctly determining the most compelling words to use to promote them. Luntz has done loads of research showing the dramatic effect using different words can have on how the same idea is received. A few notable examples from politics in the last few years include:

  • Eliminating the “Estate Tax” is much less popular than eliminating the “Death Tax” – same legislation, broader appeal since everyone dies, but not everyone has an estate worth worrying about.
  • “Welfare Cuts” raised fears and were not popular, “Welfare Reform” (including cuts) passed with broad support under Clinton.
  • Social Security “Phase out” is a non starter, “Private Accounts” are less unpopular but still better than “Privatization”.

The connotations triggered by word choice can ultimately determine whether an idea flies in the mainstream or not – this is why Luntz makes a good living helping Republican politicians craft the language they use to market less than popular initiatives. Given the high degree of suspicion of new identity technology (see ACLU Pizza, attitudes toward Microsoft, etc.) in the general public, I think it is important for those of us developing new technology in this space to be very conscious of the language we use to frame our work.

My view is that the technical innovation surrounding identity is, in fact, part of an ad hoc reform effort. The technical systems, business practices, and regulatory regimes that currently touch identity are primitive and badly broken – these systems and practices need to be upgraded to better serve the interests of important stakeholders.

So what is the most compelling way to communicate the need for technical innovation in the current climate of mistrust and borderline paranoia about identity? Emphasizing the sorry state of the status quo and calling for ‘Identity Reform’ is my current best guess.

These days I'm really focussed on the need to develop a cross-platform system embracing technical alternatives that allow users to select specific variants which ‘work best’ for them. We need to think in terms of an “identity bus” that allows individuals and organizations to “plug in” such alternatives. I see the emergence of these alternatives as being the essential vehicle by which all the relevant parties can posit and influence our digital identity future.

Doing this could indeed be called a reform of the current chaotic and primitive status quo.

Empire and Communications sleuths, we thank you!

The good news is that Empire and Communications sleuth Janet R located a relatively inexpensive copy here. The bad news is that I bought it.

The good news is that Mark P found a “print to order” copy here. The bad news is that it's…

…still out of my range at $74. The author is listed as Harold Innis, rather than Harold Innes, by the way. First edition, $100 here. Soft cover edition, $61.95 here. Hope these help someone.

Mark is right that I misspelled Harold's name – I have fixed the posting and apologized to Harold.

The good news is that when I receive the copy I just ordered, I will make it available for readers of this blog to borrow (I have my own copy, currently on loan). I've been thinking of getting the book its own I-name to make this easier. I wonder if Drummond has a domain for books? Maybe he will cut me a special deal. Can a blogger be a lending library?