Who is harmed by a “Real Names” policy?

Skud at Geek Feminism Blog has created a wiki documenting work she and her colleagues are doing to “draft a comprehensive list” of those who would be harmed by a policy banning pseudonymity and requiring “real names”.

The result is impressive.  The rigour Skud and colleagues have applied to their quest has produced an information payload that is both illuminating and touching.

Those of us working on identity technology have to internalize the lessons here.  Over-identification is ALWAYS wrong.  But beyond that, there are people who are especially vulnerable to it.  They have to be treated as first class citizens with clear rights and we need to figure out how to protect them.  This goes beyond what we conventionally think of as privacy concerns (although perhaps it sheds light on the true nature of what privacy is – I'm still learning).

Often people argue in favor of “Real Names” in order to achieve accountability.  The fact is that technology offers us other ways to achieve accountability.  By leveraging the properties of minimal disclosure technology, we can allow people to remain anonymous and yet bar them from given environments if their behavior gets sufficiently anti-social.
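To make that concrete, here is a toy sketch (in TypeScript, with every name hypothetical) of the simplest version of the idea: an identity provider derives a different pseudonym for each site from a per-user secret.  A site can recognize and ban a repeat offender under its pseudonym, but two sites cannot link their pseudonyms to each other, let alone to a legal name.  Real minimal disclosure systems use much stronger cryptography than an HMAC, so treat this purely as an illustration of the accountability-without-identification pattern:

    import { createHmac } from "node:crypto";

    // Hypothetical identity provider: derives a per-site pseudonym from a
    // per-user secret. The same user always gets the same pseudonym at the
    // same site, but pseudonyms at different sites are unlinkable.
    function directedPseudonym(userSecret: Buffer, siteId: string): string {
      return createHmac("sha256", userSecret).update(siteId).digest("hex");
    }

    // A site keeps a ban list of pseudonyms -- no real name ever needed.
    const banned = new Set<string>();

    function handlePost(pseudonym: string, post: string): string {
      if (banned.has(pseudonym)) return "rejected: banned";
      if (isAbusive(post)) {          // the site's own moderation policy
        banned.add(pseudonym);
        return "rejected and banned";
      }
      return "accepted";
    }

    function isAbusive(post: string): boolean {
      return post.includes("abuse");  // stand-in for real moderation logic
    }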

But enough editorializing.  Here's Skud's intro.  Just remember that in this case the real enlightenment is in the details, not the summary.

This page lists groups of people who are disadvantaged by any policy which bans Pseudonymity and requires so-called “Real names” (more properly, legal names).

This is an attempt to create a comprehensive list of groups of people who are affected by such policies.

The cost to these people can be vast, including:

  • harassment, both online and offline
  • discrimination in employment, provision of services, etc.
  • actual physical danger of bullying, hate crime, etc.
  • arrest, imprisonment, or execution in some jurisdictions
  • economic harm such as job loss, loss of professional reputation, etc.
  • social costs of not being able to interact with friends and colleagues
  • possible (temporary) loss of access to their data if their account is suspended or terminated

The groups of people who use pseudonyms, or want to use pseudonyms, are not a small minority (some of the classes of people who can benefit from pseudonyms constitute up to 50% of the total population, and many of the others are classes of people that almost everyone knows). However, their needs are often ignored by the relatively privileged designers and policy-makers who want people to use their real/legal names.

Wait a minute.  Just got a note from the I Can't Stop Editorializing Department: the very wiki page that brings us Skud's analysis contains a Facebook “Like” button.  It might be worth removing, given that Facebook requires “Real Names”, and any page with a “Like” button transmits its URL to Facebook so it can be associated with the user's “Real Name” – whether or not the user clicks the button or is even logged into Facebook.

Head over to the Office of Inadequate Security

First of all, I have to refer readers to the Office of Inadequate Security, apparently operated by databreaches.net. I suggest heading over there pretty quickly too – the office is undoubtedly going to be so busy you'll have to line up as time goes on.

So far it looks like the go-to place for info on breaches – it even has a twitter feed for breach junkies.

Recently the Office published an account that raises a lot of questions:

I just read a breach disclosure to the New Hampshire Attorney General’s Office with accompanying notification letters to those affected that impressed me favorably. But first, to the breach itself:

StudentCity.com, a site that allows students to book trips for school vacation breaks, suffered a breach in their system that they learned about on June 9 after they started getting reports of credit card fraud from customers. An FAQ about the breach, posted on www.myidexperts.com explains:

StudentCity first became concerned there could be an issue on June 9, 2011, when we received reports of customers travelling together who had reported issues with their credit and debit cards. Because this seemed to be with 2011 groups, we initially thought it was a hotel or vendor used in conjunction with 2011 tours. We then became aware of an account that was 2012 passengers on the same day who were all impacted. This is when we became highly concerned. Although our processing company could find no issue, we immediately notified customers about the incident via email, contacted federal authorities and immediately began a forensic investigation.

According to the report to New Hampshire, where 266 residents were affected, the compromised data included students’ credit card numbers, passport numbers, and names. The FAQ, however, indicates that dates of birth were also involved.

Frustratingly for StudentCity, the credit card data had been encrypted but their investigation revealed that the encryption had broken in some cases. In the FAQ, they explain:

The credit card information was encrypted, but the encryption appears to have been decoded by the hackers. It appears they were able to write a script to decode some information for some customers and most or all for others.

The letter to the NH AG’s office, written by their lawyers on July 1, is wonderfully plain and clear in terms of what happened and what steps StudentCity promptly took to address the breach and prevent future breaches, but it was the tailored letters sent to those affected on July 8 that really impressed me for their plain language, recognition of concerns, active encouragement of the recipients to take immediate steps to protect themselves, and for the utterly human tone of the correspondence.

Kudos to StudentCity.com and their law firm, Nelson Mullins Riley & Scarborough, LLP, for providing an exemplar of a good notification.

It would be great if StudentCity would bring in some security experts to audit the way encryption was done and report on what went wrong.  I don't say this to be punitive; I agree that StudentCity deserves credit for at least attempting to employ encryption.  But the outcome points to the fact that we need programming frameworks that make it easy to get truly robust encryption and key protection – and to deploy them in a minimal disclosure architecture that keeps secrets offline.  If StudentCity goes the extra mile in helping others learn from its unfortunate experience, I'll certainly be a supporter.
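To gesture at what such a framework should make the default, here is a minimal sketch (TypeScript, using Node's built-in AES-256-GCM) of authenticated encryption with the key supplied from outside the application – in real life from a hardware module or a separate key service, never stored beside the data.  This is illustrative only, not a reconstruction of what StudentCity did:

    import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

    // The key must live outside the app and database (HSM, key service, etc.).
    // Here we just generate one to keep the sketch self-contained.
    const key = randomBytes(32); // 256-bit key for AES-256-GCM

    function encryptCardNumber(pan: string): Buffer {
      const iv = randomBytes(12);                 // fresh nonce per record
      const cipher = createCipheriv("aes-256-gcm", key, iv);
      const ciphertext = Buffer.concat([cipher.update(pan, "utf8"), cipher.final()]);
      // Store IV + auth tag + ciphertext together; tampering is detectable.
      return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]);
    }

    function decryptCardNumber(blob: Buffer): string {
      const iv = blob.subarray(0, 12);
      const tag = blob.subarray(12, 28);
      const ciphertext = blob.subarray(28);
      const decipher = createDecipheriv("aes-256-gcm", key, iv);
      decipher.setAuthTag(tag);                   // fails loudly if data was modified
      return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
    }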

There is a fundamental problem here

Joe Mansfield at Peccavi has done a very cogent post where, though he agrees with my concerns, he criticizes me for picking almost exclusively on Google when there are lots of others who have been doing the same thing.  He's right – I have been too narrowly focused. 

Let me be clear:  I have great respect for Google and many of its accomplishments.   I have a disagreement with a particular Google team.

I find the Google Street View team's abuse of identifiers especially worrisome because they have been collecting not only information about WiFi access points, but also the MAC addresses of people's personal devices (laptops and phones).

This bothers me because I see it as dangerous.  It's like going over to visit a neighbor and finding out he's been building a nuclear reactor in his basement. 

I'm not an expert on the geolocation industry, and I have no knowledge of whether this kind of end-user-device snooping is commonplace.  If it is, let me know.  Everything I have said about Google applies equally to any similar practitioners.

But let's get to Peccavi, which makes the point better than I do:

I’ve been following Kim Cameron’s increasingly critical analysis of Google’s StreetView WiFi mapping data privacy debacle with some interest of late.

Some background might be in order for those interested in reading where he’s been coming from – start here and work forward. He’s been quite vocal and directed in his criticism and I have been surprised that his focus has been almost entirely on Google rather than on the underlying technical root cause. My initial view on the issue was that it was a stupid over-reaction to something that everyone has been doing for years, and that at least Google were being open about having logged too much data. I’m still of the opinion that the targeting of Google specifically is off base here, although I think Kim is right that there is a fundamental problem here.

Kim is probably the pre-eminent proponent and defender of strong authentication and privacy on the net at the moment. His Laws of Identity should be mandatory reading for anyone working with user data in any sort of context but especially for anyone working with online systems. He’s a hugely influential thought leader for doing the right thing and as a key technical leader within Microsoft he’s doing more than almost anyone else to lay the groundwork for a move away from our current reliance on insecure, privacy leaking methods of authentication. Let’s just say that I’m a fan.

For obvious reasons he has spotted the huge privacy problems associated with the practice of gathering WiFi SSID and MAC addresses and using them to create large scale geo-location databases. There are serious privacy issues here and despite my initial cynicism about this perhaps it’s a good thing that there has been a huge furore over what Google were doing.

Note that there were two issues in play here – the intentional data (the SSID’s, MAC addresses and geo-location info) and the unintentional data (actual user payloads). I’m only going to talk about the intentionally harvested data right now because that is the much trickier problem – few people would argue that having Google (or anyone) logging actual WiFi traffic from their homes is OK.

The problem that I see with Kim’s general position on this and the focus on Google’s activities alone is that he’s not seeing the wood for the trees. The problem of companies or individuals harvesting this data is minor compared to the problem that enables it. The technical standards that we all use to connect wirelessly with the endless array of devices that we all now have in our homes, use at work and carry on our person every day are promiscuous communicators of identifiers that can be easily and extensively misused. Even if Google are prevented by law from doing it, if the standards aren’t changed then someone else will…

I agree with almost every point made except, “The problem of companies or individuals harvesting this data is minor compared to the problem that enables it.”  I would put it differently.  I would say, “There are two problems.  Both are bad.”

We're technologists, so we immediately look to technology to prevent abuse.  This is the right instinct for us to have.  But society can use disincentives too.  I've come to believe that technology must belong to society as a whole, and we need a combination of technical solutions and the disincentives society can impose.

I actually think I see at least some of the woods as well as the trees.  That is what the Fourth Law is all about.  Of course I want to change the underlying technology as fast as we can. 

But I don't think that will happen unless there is a MUCH greater understanding of the issues, and I've been trying with this set of posts to get them onto the table.    

[More Peccavi here.]

 

How to prevent wirelesstapping

Responding to “What harm can possibly come from a MAC address“, Hal Berenson writes:

“The real problem here is technological not legal. You could ban collecting SSIDs and MAC addresses and why would it matter? Your sexual predator scenario wouldn’t be prevented (as (s)he is already committing a far more heinous crime it just isn’t going to deter them). The real problem is that WIFI (a) still doesn’t encrypt properly and (b) nearly all public hotspots avoid encryption altogether. I’ll almost leave (b) alone because it is so obvious, yet despite that we have companies like AT&T pushing us (by eliminating unlimited data plans) to use hotspots rather than their (better) protected 3G access.

“Sure my iPad connects nicely via WIFI when I’m in the United Red Carpet Club, but it also leaves much of my communications easily intercepted (3G may be vulnerable, but it does take some expertise and special equipment to set up my own cell). But what the *&#$#&*^$ is going on with encrypted WIFI not encrypting the MAC addresses? If something needs to be exposed it should be a locally unique address, not a globally unique one! I seem to recall that when I first looked at cryptography in the early 70s I read articles about how traffic analysis on encrypted data was nearly as useful as being able to decrypt the data itself. There were all kinds of examples of tracking troop movements, launch orders, etc. using traffic analysis. It is almost 40 years later and we still haven’t learned our lesson.”

I assume Hal is using “*&#$#&*^$” as a form of encryption.  Anyway, I totally agree with the technical points being made.  Wireless networks adopted the static MAC concept they inherited from wired systems in order to facilitate interoperability with them.  Designers didn't think the fact that MAC addresses would be visible to eavesdroppers was very important – the payload was all they cared about.  As I said in the Fourth Law of Identity:

Bluetooth and other wireless technologies have not so far conformed to the fourth law. They use public beacons for private entities.

I'd love to figure out how we would get agreement on “fixing” the wireless infrastructure.  But one thing is for sure:  it is really hard and would take a while!  I don't think, in the meantime, we should simply allow our private space to be invaded.  Just because technology allows theft of the identifiers doesn't mean society should.
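To make the shape of a fix concrete: one obvious direction is for devices to stop broadcasting their burned-in, globally unique MAC and instead present a random, locally administered address – ideally a fresh one per network.  A toy sketch (TypeScript, illustrative only) of how such an address is formed:

    import { randomBytes } from "node:crypto";

    // Generate a random MAC with the "locally administered" bit set and the
    // multicast bit cleared, so it can never collide with a vendor-assigned
    // (globally unique) address.
    function randomLocalMac(): string {
      const bytes = randomBytes(6);
      bytes[0] = (bytes[0] | 0x02) & 0xfe; // set local bit, clear multicast bit
      return [...bytes].map(b => b.toString(16).padStart(2, "0")).join(":");
    }

    console.log(randomLocalMac()); // e.g. "a6:3f:09:11:c2:5d"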

Similarly, in reference to the predator scenario, the fact that laws don't prevent crime has never meant there shouldn't be laws.  Regulation of “wirelesstapping” would make the emergence of this new kind of crime less likely.

 

What harm can possibly come from a MAC address?

If you are new to doing privacy threat analysis, I should explain that to do it, you need to be thoroughly pessimistic.  A privacy threat analysis is in this sense no different from any other security threat analysis.  

In our pessimistic frame of mind we then need to brainstorm from two different vantage points.  The first is the vantage point of the party being attacked.  How can people in various situations potentially be endangered by the new technology?  The second is the vantage point of the attacker.  How can people with different motivations misuse the technology?  This is a long and complicated job, and should be done in great detail by those who propose a technology.  The results should be published and vetted.

I haven't seen such publication or vetting by the proponents of world-wide WiFi packet collection and giant central databases of device identifiers.  Perhaps the Street View team or someone else has such a study at hand – it would be great for them to share it.

In the meantime I'm just going to throw out a few simple initial ideas – things that are pretty obvious – by constructing a few scenarios.

SCENARIO 1:  Collecting MAC Addresses is Legal and Morally Acceptable

In this scenario it is legal and socially acceptable to drive up and down the streets recording people's MAC addresses and other network traffic.   

It is also fine for anyone to use a geolocation service to build his own database of MAC addresses and street addresses. 

How could a collector possibly get the software to do this?  No problem.  In this scenario, since the activity is legal and there is demand, the software is freely available.  In fact, it is widely advertised on various Internet sites.

The collector builds his collection in the evenings, when people are at home with their WiFi-enabled phones and computers.  It doesn't take very long to assemble a really detailed map of all the devices used by the people who live in an entire neighborhood – perhaps a rich neighborhood.

Note that it would not matter whether people in the neighborhood have their WiFi encryption turned on or off – the drive by collector would be able to map their devices, since WiFi encryption does not hide the MAC address.
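The reason is structural: WPA and WEP encrypt only the frame body, while the 802.11 header that carries the addresses always travels in the clear.  A minimal sketch (TypeScript, illustrative only; offsets per the standard 802.11 data-frame header) of reading the transmitter address out of a captured frame:

    // 802.11 data frame header: frameControl(2) duration(2) addr1(6) addr2(6) ...
    // Everything here precedes the (possibly encrypted) frame body, so the
    // transmitter's MAC is readable no matter what WiFi encryption is in use.
    function transmitterMac(frame: Buffer): string {
      const addr2 = frame.subarray(10, 16); // bytes 10-15 = transmitter address
      return [...addr2].map(b => b.toString(16).padStart(2, "0")).join(":");
    }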

SCENARIO 2 – Collector is a sexual predator

In Scenario 1, anyone can be “a MAC collector”.  In this scenario, the collector is a sexual predator.

When children pass him in the park, they have their phones and WiFi turned on, and their MAC addresses are discernible by his laptop software.  Normally the MAC addresses would be meaningless random numbers, but the collector has a complete database of which MAC addresses are associated with a given house address.  It is therefore simple for the collection software on his laptop to automatically convert the WiFi packets emitted from the children's phones into the street addresses where the children live, showing the locations on a map.

There is thus no need for the collector to go up to the children and ask them where they live.  And it won't matter that their parents have taught them never to reveal that to a stranger.  Their devices will have revealed it for them.

I can easily understand that some people might have problems with this example simply because so many questionable things have been justified through reference to predators.  That's not a bandwagon I'm trying to get on. 

I chose the example not only because I think it's real and exposes a threat, but because it reveals two important things germane to a threat analysis:

  • The motivations people have to abuse the technical mechanisms we put in place are pretty much unlimited. 
  • We need to be able to empathize with people who are vulnerable – like children – rather than taking a “people deserve what they get” attitude.   

Finally, I hope it is obvious I am not arguing that Google is doing anything remotely on a par with this example.  I'm talking about something different: whether we want WiFi snooping to be something our society condones, and what kind of software might come into being if we do.

 

Don't take identities from our homes without our consent

Joerg Resch of Kuppinger Cole in Germany wrote recently about the importance of identity management to the Smart Grid – by which he means the emerging energy infrastructure based on intelligent, distributed renewable resources:

In 10-12 years from now, the whole utilities and energy market will look dramatically different. Decentralization of energy production with consumers converting to prosumers pumping solar energy into the grid and offering  their electric car batteries as storage facilities, spot markets for the masses offering electricity on demand with a fully transparent price setting (energy in a defined region at a defined time can be cheaper, if the sun is shining or the wind is blowing strong), and smart meters in each home being able to automatically contract such energy from spot markets and then tell the washing machine to start working as soon as electricity price falls under a defined line. And – if we think a bit further and apply Google-like business models to the energy market, we can get an idea of the incredible size this market will develop into.

These are just a few examples, which might give you an idea of how the “post-fossil energy market” will work. The drivers leading the way into this new age are clear: energy production from oil and gas will become more and more expensive, because pollution is not free and the resources will not last forever. And the transparency gained from making the grid smarter will make electricity cheaper than it is now.

The drivers are getting stronger every day. Therefore, we will soon see many large scale smart grid initiatives, and we will see questions arising such as who has control over the information collected by the smart meter in my home. Is it my energy provider? How would Kim Cameron's 7 laws of Identity work in a smart grid? What would a “grid perimeter” look like which keeps information on the usage of whatever electric devices within my 4 walls? By now, we all know what cybercrimes are and how they can affect each of us. But what are the risks of “smart grid hacking”? How might we be affected by “grid crimes”?
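Joerg's washing-machine example is easy to picture in code.  Here is a toy sketch (TypeScript; the spot-market feed, threshold, and appliance hook are all hypothetical) of the contract-when-cheap behavior he describes – and note that every such decision is also a record of what is happening inside your four walls:

    // Entirely hypothetical smart-meter logic: poll a regional spot market and
    // start a deferred appliance when electricity drops below the user's limit.
    const PRICE_LIMIT_EUR_PER_KWH = 0.12;  // user-defined threshold

    async function pollAndRun(getSpotPrice: () => Promise<number>,
                              startWashingMachine: () => void): Promise<void> {
      const price = await getSpotPrice();  // cheap when wind/solar is strong
      if (price < PRICE_LIMIT_EUR_PER_KWH) startWashingMachine();
    }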

In fact at Blackhat 2009, security consultant Mike Davis demonstrated successful hacker attacks on commercially available smart meters.  He told the conference,

“Many of the security vulnerabilities we found are pretty frightening and most smart meters don't even use encryption or ask for authentication before carrying out sensitive functions like running software updates and severing customers from the power grid.”

Privacy Commissioner Ann Cavoukian of Ontario has insisted that industry turn its attention to the security and privacy of these devices:

“The best response is to ensure that privacy is proactively embedded into the design of the Smart Grid, from end to end. The Smart Grid is presently in its infancy worldwide – I’m confident that many jurisdictions will look to our work being done in Ontario as the privacy standard to be met. We are creating the necessary framework with which to address this issue.”

Until recently, no one talked about drive-by mapping of our home devices.  But from now on we will.  When we think about home devices, we need to reach into the future and come to terms with the huge stakes that are up for grabs here.

The smart home and the smart grid alert us to just how important the identity and privacy of our devices really is.  We can use technical mechanisms like encryption to protect some information from eavesdroppers.   But not the patterns of our communication or the identities of our devices…  To do that we need a regulatory framework that ensures commercial interests don't enter our “device space” without our consent.

Google's recent Street View WiFi boondoggle is a watershed event in drawing our attention to these matters.

Enterprise lockdown versus consumer applications

My friend Cameron Westland, who has worked on some cool applications for the iPhone, wrote me to complain that I linked to iPhone Privacy:

I understand the implications of what you are trying to say, but how is this any different from Mac OS X applications accessing the address book or Windows applications accessing contacts? (I'm not sure about Windows, but I know it's possible on a Mac).

Also, the article touches on storing patient information on an iPhone. I believe Seriot is guilty of a major oversight in simply correlating the fact that spy phone has access to contacts with it also being able to do so in a secured enterprise.

If the iPhone is deployed in the enterprise, the corporate administrators can control exactly which applications get installed. In the situations where patient information is stored on the phone, they should be using their own security review process to verify that all applications installed meet the HIPAA certification requirements. Apple makes no claim that applications meet the stringent needs of certain industries – that's why they give control to administrators to encrypt phones, restrict specific application installs, and do remote wipes.

Also, Seriot did no research on the behavior of a phone connected to a company's Active Directory versus just a plain old address book… This is cargo cult science at best, and I'm really surprised you linked to it!

I buy Cameron's point that the controls available to enterprises mitigate a number of the attacks presented by Seriot – and agree this is important.  How do these controls work?  Corporate administrators can set policies specifying the digital signatures of applications that can be installed.  They can use their own processes to decide what applications these will be.

None of this depends on App Store verification, sandboxing, or Apple's control of platform content.  In fact it is no different from the universally available ability to use a combination of enterprise policy and digital signatures to protect enterprise desktop and server systems.  Other features, like the ability for an operator to wipe information, are also pretty much universal.
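As a sketch of the general mechanism (TypeScript, using Node's Ed25519 verification; this is not Apple's actual management interface, just the pattern): the administrator publishes the identifiers of approved applications together with the public keys that must have signed them, and the device refuses everything else.

    import { verify, KeyObject } from "node:crypto";

    // Hypothetical enterprise policy: app id -> public key that must have
    // signed the app's manifest. Unlisted apps are rejected outright.
    const approvedApps = new Map<string, KeyObject>(); // populated by the admin

    function mayInstall(appId: string, manifest: Buffer, signature: Buffer): boolean {
      const key = approvedApps.get(appId);
      if (!key) return false;                        // not on the allowlist
      return verify(null, manifest, key, signature); // Ed25519 signature check
    }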

If the iPhone can be locked down in enterprises, why is Seriot's paper still worth reading?  Because many companies and even governments are interested in developing customer applications that run on phones.  They can't dictate to customers what applications to install, and so lock-down solutions are of little interest.  They turn to Apple's own claims about security, and find statements like this one, taken from the otherwise quite interesting iPhone security overview.

Runtime Protection

Applications on the device are “sandboxed” so they cannot access data stored by other applications. In addition, system files, resources, and the kernel are shielded from the user’s application space. If an application needs to access data from another application, it can only do so using the APIs and services provided by iPhone OS. Code generation is also prevented.

Seriot shows that taking this claim at face value would be risky.  As he says in an eWeek interview:

“In late 2009, I was involved in discussions with the Swiss private banking industry regarding the confidentiality of iPhone personal data,” Seriot told eWEEK. “Bankers wanted to know how safe their information [stores] were, which ones are exactly at risk and which ones are not. In brief, I showed that an application downloaded from the App Store to a standard iPhone could technically harvest a significant quantity of personal data … [including] the full name, the e-mail addresses, the phone number, the keyboard cache entries, the Wi-Fi connection logs and the most recent GPS location.” 

It is worth noting that Seriot's demonstration is very easy to replicate, and doesn't depend on silly assumptions like convincing the user to disable their security settings and ignore all warnings.

The points made about banking applications apply even more to medical applications.  Doctors are effectively customers from the point of view of the information management services they use.  Those services won't be able to dictate the applications their customers deploy.  I know for sure that my doctor, bless his soul,  doesn't have an IT department that sets policies limiting his ability to play games or buy stocks.  If he starts using his phone for patient-related activities, he should be aware of the potential issues, and that's what MedPage was talking about.

Neither MedPage, nor CNET, nor eWeek nor Seriot nor I are trying to trash the iPhone – it's just that application isolation is one of the hardest problems of computer science.  We are pointing out that the iPhone is a computing device like all the others and subject to the same laws of digital physics, despite dangerous mythology to the contrary.  On this point I don't think Cameron Westland and I disagree.

 

SpyPhone for iPhone

The MedPage Today blog recently wrote about “iPhone Security Risks and How to Protect Your Data — A Must-Read for Medical Professionals.”  The story begins: 

Many healthcare providers feel comfortable with the iPhone because of its fluid operating system, and the extra functionality it offers, in the form of games and a variety of other apps.  This added functionality is missing with more enterprise-based smart phones, such as the Blackberry platform.  However, this added functionality comes with a price, and exposes the iPhone to security risks. 

Nicolas Seriot, a researcher from the Swiss University of Applied Sciences, has found some alarming design flaws in the iPhone operating system that allow rogue apps to access sensitive information on your phone.

MedPage quotes a CNET article where Elinor Mills reports:

Lax security screening at Apple's App Store and a design flaw are putting iPhone users at risk of downloading malicious applications that could steal data and spy on them, a Swiss researcher warns.

Apple's iPhone app review process is inadequate to stop malicious apps from getting distributed to millions of users, according to Nicolas Seriot, a software engineer and scientific collaborator at the Swiss University of Applied Sciences (HEIG-VD). Once they are downloaded, iPhone apps have unfettered access to a wide range of privacy-invasive information about the user's device, location, activities, interests, and friends, he said in an interview Tuesday…

In addition, a sandboxing technique limits access to other applications’ data but leaves exposed data in the iPhone file system, including some personal information, he said.

To make his point, Seriot has created open-source proof-of-concept spyware dubbed “SpyPhone” that can access the 20 most recent Safari searches, YouTube history, and e-mail account parameters like username, e-mail address, host, and login, as well as detailed information on the phone itself that can be used to track users, even when they change devices.

Following the link to Seriot's paper, called iPhone Privacy, here is the abstract:

It is a little known fact that, despite Apple's claims, any applications downloaded from the App Store to a standard iPhone can access a significant quantity of personal data.

This paper explains what data are at risk and how to get them programmatically without the user's knowledge. These data include the phone number, email accounts settings (except passwords), keyboard cache entries, Safari searches and the most recent GPS location.

This paper shows how malicious applications could pass the mandatory App Store review unnoticed and harvest data through officially sanctioned Apple APIs. Some attack scenarios and recommendations are also presented.

 

In light of Seriot's paper, MedPage concludes:

These security risks are substantial for everyday users, but become heightened if your phone contains sensitive data, in the form of patient information, and when your phone is used for patient care.   Over at iMedicalApps.com, we are not fans of medical apps that enable you to input patient data, and there are several out there.  But we also have peers who have patient contact information stored on their phones, patient information in their calendars, or are accessible to their patients via e-mail.  You can even e-prescribe using your iPhone. 

I don't want to even think about e-prescribing using an iPhone right now, thank you.

Anyone who knows anything about security has known all along that the iPhone – like all devices – is vulnerable to some set of attacks.  For them, iPhone Privacy will be surprising not because it reveals possible attacks, but because of how amazingly elementary they are (the paper is a must-read from this point of view).  

On a positive note, the paper might awaken some of those sent into a deep sleep by proselytizers convinced that Apple's App Store censorship program is reasonable because it protects them from rogue applications.

Evidently Apple's App Store staff take their mandate to protect us from people like award-winning Mad Magazine cartoonist Tom Richmond pretty seriously (see Apple bans Nancy Pelosi bobble head).  If their approach to “protecting” the underlying platform has any merit at all, perhaps a few of them could be reassigned to work part time on preventing trivial and obvious hacker exploits.

But I don't personally think a closed platform with a censorship board is either the right approach or one that can possibly work as attackers get more serious (in fact computer science has long known that this approach is baloney).  The real answer will lie in hard, unfashionable and (dare I say it?) expensive R&D into application isolation and related technologies. I hope this will be an outcome:  first, for the sake of building a secure infrastructure;  second, because one of my phones is an iPhone and I like to explore downloaded applications too.

[Heads Up: Khaja Ahmed]

More unintended consequences of browser leakage

Joerg Resch at Kuppinger Cole points us to new research showing how social networks can be used in conjunction with browser leakage to accurately identify users who think they are browsing anonymously.

Joerg writes:

Thorsten Holz, Gilbert Wondracek, Engin Kirda and Christopher Kruegel from the Isec Laboratory for IT Security found a simple and very effective way to identify the person behind a website visitor without asking for any kind of authentication. Identify in this case means: full name, address, phone numbers and so on. What they do is simply exploit the browser history to find out which social networks the user is a member of and to which groups he or she has subscribed within that social network.

The Practical Attack to De-Anonymize Social Network Users begins with what is known as “history stealing”.  

Browsers don’t allow web sites to read the user’s “history” of visited sites directly.  But we all know that browsers render links to sites we have visited in a different color than links to sites we have not.  This is available programmatically through JavaScript by examining the a:visited style.  So a malicious site can walk through a list of URLs and examine the a:visited style of each to determine whether it has been visited – all without the user being aware of it.
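Here is a sketch of the probe as it worked at the time (TypeScript, running inside a malicious page; the exact :visited color is browser-dependent, and mainstream browsers now lie about :visited styles precisely to block this attack):

    // History stealing, circa 2010: render a link and ask the browser what
    // color it drew it in. Visited links come back in the :visited color.
    function wasVisited(url: string): boolean {
      const link = document.createElement("a");
      link.href = url;
      document.body.appendChild(link);
      const color = getComputedStyle(link).color;
      document.body.removeChild(link);
      return color === "rgb(85, 26, 139)"; // default visited purple (an assumption)
    }

    // Probe a list of social-network group URLs without the user noticing.
    const groupUrls = ["https://example-network.test/group/1"]; // hypothetical
    const memberships = groupUrls.filter(wasVisited);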

This attack has been known for some time, but what is novel is its use.  The authors claim the groups in all major social networks are represented through URLs, so history stealing can be translated into “group membership stealing”.  This brings us to the core of this new work.  The authors have developed a model for the identification characteristics of group memberships – a model that will outlast this particular attack, as dramatic as it is.

The researchers have created a demonstration site that works with the European social network Xing.  Joerg tried it out and, as you can see from the table at left, it identified him uniquely – although he had done nothing to authenticate himself.  He says,

“Here is a screenshot from the self-test I did with the de-anonymizer described in my last post. I'm a member in 5 groups at Xing, but only active in just 2 of them. This is already enough to successfully de-anonymize me, at least if I use the Google Chrome Browser. Using Microsoft Internet Explorer did not lead to a result, as the default security settings (I use them in both browsers) seem to be stronger. That's weird!”

Since I’m not a user of Xing I can’t explore this first hand.

Joerg goes on to ask: is history-stealing a crime?  If it's not, how mainstream is this kind of analysis going to become?  What is the right legal framework for considering these issues?  One thing is for sure: this kind of demonstration, as it becomes widely understood, risks profoundly changing the way people look at the Internet.

To return to the idea of minimal disclosure for the browser: why do sites we visit need to be able to read the a:visited attribute at all?  This should again be thought of as “fingerprinting”, and before a site is able to retrieve the fingerprint, the user must be made aware that it opens the possibility of being uniquely identified without authentication.

New prototype could really help OpenID

I've sometimes been of two minds about OpenID.  I've always seen it as alluring because of its simplicity and openness.  It seemed perfect for simple web applications.

But in my darker moments, I worried about some of the system's usability and security issues.  In particular, I was concerned about how easy it would be for an “evil site” to trick users into going to a web site that looks identical to their OpenID provider, convincing them to log in, and then stealing their credentials.  If this were to happen, everything that is good about OpenID would turn into something negative.

OpenID has become a key part of the Identity Metasystem

I think many of us involved with the OpenID community came to the same conclusions, but felt that if we kept trying to move adoption forward, we'd be able to figure out how to solve the problems.  In the last year, OpenID has without doubt become the most widely adopted system for reusable internet identity.  Adoption by destination sites continues to grow dramatically: approximately 50,000 sites as of July 1, 2009.  The big Internet properties like Google, Yahoo, AOL, MySpace, and Windows Live have become (or are becoming) OpenID Providers.   As a result, the vast majority of the online US population has an account that can be used to log in at the growing number of destination sites. 

Maybe even more important, some of these sites are of the kind that can quickly change perception and behavior. 

Most notable is Facebook, which took a huge step forward when it started accepting OpenIDs for login – blowing away the old saw that “no one wants to be a relying party”. 

Now, the US Government has decided to adopt OpenID as one of the identity protocols for citizen interaction – again, as Relying Party, not Identity Provider.

Sea Change

There is a sea-change here.  I strongly believe the right thing to do is get behind OpenID as part of the Identity Metasystem, help promote adoption, and work with the community to make it safer and easier to use.  What is encouraging is that the community has repeatedly shown its ability to evolve as it deploys, and has been able to rapidly extend the standard from the inside.   It has now become widely recognized in the industry that active client software (also called an “Identity Selector”) for OpenID could solve most of its problems, given some minor revisions or additions to the protocol.  By remembering the identities you use, this kind of software can address two sets of issues (see the sketch after this list):

  • Usability:  Lets you bring your identities with you to the site, rather than the site having to guess what identities you have
  • Security:  Protects you from being sent to a malicious site impersonating a real site that would steal your password
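Here is a minimal sketch of the remembering logic at the heart of such a selector (TypeScript; all structures hypothetical, not the prototype's actual code).  The security property falls out of the bookkeeping: the selector keys its memory by the exact site origin, so a look-alike phishing site shows up with no remembered OpenIDs at all.

    // Hypothetical identity selector state: which OpenIDs were used where.
    const history = new Map<string, string[]>(); // site origin -> OpenIDs used

    function openIdsFor(siteOrigin: string): string[] {
      // A fake "yahoo.com" look-alike has a different origin, so it gets an
      // empty list -- the user sees "never used here" instead of a login form.
      return history.get(siteOrigin) ?? [];
    }

    function recordUse(siteOrigin: string, openId: string): void {
      const used = history.get(siteOrigin) ?? [];
      // Most recently used first, so it becomes the default selection.
      history.set(siteOrigin, [openId, ...used.filter(id => id !== openId)]);
    }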

New prototype at IIW

Yesterday at the OpenID Summit hosted by Yahoo, Microsoft's Mike Jones and Ariel Gordon showed some of the work their team has been doing to help figure out how this kind of capability could work.  What's cool is that the client they were showing is completely optional – without it, OpenID continues to work as it currently does.  But with it, the experience improves and the dangers are greatly reduced.  I agree with them that demand for a better and safer OpenID user experience will drive selector adoption, which will in turn enable scenarios at higher levels of assurance than are possible with OpenID today.

Ariel Gordon, the main UX designer, told me, “I see it as a starting point for joint work with others in the community – definitely not a finished solution or product.”

It is consistent with the Information Card metaphor:

  • Your OpenIDs are shown as visual cards
  • You select an OpenID by clicking
  • The OpenID last used at the site is the default selection

New OpenIDs can be added on the fly, by picking one from a list suggested by the site, or by typing the provider’s URL.

Mike made a good point about what this means for people who use smaller OpenID providers:  “The cool thing is that it remembers the OpenIDs you’ve used and where you used them […] With a web-based Nascar user interface, Arizona State University users will never get the same user experience that Google.com users get […]”

Good Tweets

Unfortunately I couldn't attend the meeting in person but remained wired to the tweets.  Summit host Allen Tom from Yahoo said, “Showing already-used OpenIDs is a great protection against phishing: if a rogue RP tries to send the user to ‘fake yahoo.com’, a regular Yahoo user will click on his Yahoo button in the selector and won’t even see the fake yahoo link.”

He added, “The prototype selector goes in the right direction by offering a better experience when present, while not preventing users from accessing their favorite sites from any computer.”

Google's Eric Sachs saw value too. “…And a fake yahoo tile would say “never used here” so that’s even more information to help protect the user.”

Bringing our perceptions together from different organizations with different missions and vantage points is what can make all of this succeed.  The partnering is the key.

So one of the best things about the prototype, in my view, is that it has already demonstrated collaboration between a whole set of really experienced community members:

  • Relying Parties: JanRain, Plaxo, Deutsche Telekom
  • OpenID Providers: Yahoo, Google, JanRain
  • Identity Selectors: Microsoft, Deutsche Telekom
  • Enhancing Specifications: Microsoft, Facebook, Yahoo. 

Today, the same prototype was presented to the influential Internet Identity Workshop.  I'll add to my growing list of IOUs a promise to do a screen capture of how the prototype works so everyone can take a look.