Southworks seeds open source claims transformer

Reading Matias Woloski's blog I see that Southworks has put its work bridging OpenID and WS-Federation into an open source project (download here). This is a great move. He also shows some screenshots that give a good feel for what was involved in the Medtronic proof of concept described here. Matias writes:

A year ago I wrote a blog post about how to use the Windows Identity Foundation with OpenID. Essentially the idea was writing an STS that can speak both protocol WS-Federation and OpenID, so your apps can keep using WIF as the claims framework, no matter what your Identity Provider is. WS-Fed == enterprise, OpenID == consumer…

Fast forward to May this year, I’m happy to disclose the proof of concept we did with the Microsoft Federated Identity Interop group (represented by Mike Jones), Medtronic and PayPal. The official post from the Interoperability blog includes a video about it, and Mike also did a great write-up.

The business scenario brought by Medtronic is around an insulin pump trial. In order to register to this trial, users would login with PayPal, which represents a trusted authority for authentication and attributes like shipping address and age for them. Below are some screenshots of the actual proof of concept:

[Screenshots of the proof of concept]

While there are different ways to solve a scenario like this, we chose to create an intermediary Security Token Service that understands the OpenID protocol (used by PayPal), WS-Federation protocol and SAML 1.1 tokens (used by Medtronic apps). This intermediary STS enables SSO between the web applications, avoiding re-authentication with the original identity provider (PayPal).
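As a rough illustration of the translation step such an intermediary STS performs, the sketch below maps attributes received over OpenID onto the claim types a WS-Federation relying party might expect. The mapping table and the name `to_claims` are assumptions made for illustration; they are not taken from the Southworks code.

```python
# Hypothetical mapping from OpenID Attribute Exchange type URIs to
# WS-Federation-style claim type URIs. Illustrative only.
AX_TO_CLAIM = {
    "http://axschema.org/namePerson":
        "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/name",
    "http://axschema.org/birthDate":
        "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/dateofbirth",
    "http://axschema.org/contact/postalAddress/home":
        "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/streetaddress",
}


def to_claims(openid_attributes):
    """Translate OpenID attributes into (claim_type, value) pairs.

    Attributes without a known mapping are dropped rather than passed
    through, so the relying party only sees claim types it understands.
    """
    return [
        (AX_TO_CLAIM[ax_type], value)
        for ax_type, value in openid_attributes.items()
        if ax_type in AX_TO_CLAIM
    ]
```

In this sketch the relying party never sees the OpenID exchange itself; it receives only the translated claims, which a real STS would then package into a signed token.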

Also, we had to integrate with a PHP web application and we chose the simpleSAMLphp library. We had to adjust here and there to make it compatible with ADFS/WIF implementation of the standards. No big changes though.

We decided together with the Microsoft Federated Identity Interop team to make the implementation of this STS available under open source using the Microsoft Public License.

Not only that, we also went a step further and added a multi-protocol capability to this claims provider. That is, it’s extensible to support not only OpenID but also OAuth and even a proprietary authentication method like Windows Live.


DISCLAIMER: This code is provided as-is under the Ms-PL license. It has not been tested in production environments and it has not gone through threats and countermeasures analysis. Use it at your own risk.

Project Home page
http://github.com/southworks/protocol-bridge-claims-provider

Download
http://github.com/southworks/protocol-bridge-claims-provider/downloads

Docs
http://southworks.github.com/protocol-bridge-claims-provider

If you are interested and would like to contribute, ping us through the github page, twitter @woloski or email matias at southworks dot net

This endeavor would not have been possible without the professionalism of my colleagues: Juan Pablo Garcia, who was the main developer behind this project; Tim Osborn, for his support and focus on the customer; Johnny Halife, who helped shape the demo in the early stages in HTML :); and Sebastian Iacomuzzi, who helped us with the packaging. Finally, Madhu Lakshmikanthan, who was key in the project management to align stakeholders, and Mike, who was crucial in making all this happen.

Happy federation!

Trusting Mobile Technology

Jacques Bus recently shared a communication he has circulated about the mobile technology issues I've been exploring.  To European readers he will need no introduction:  as Head of Unit for the European Commission's Information and Communication Technologies (ICT) Research Programme he oversaw and gave consistency to the programs shaping Europe's ICT research investment.  Thoroughly expert and equally committed to results, Jacques’ influence on ICT policy thinking is clearly visible in Europe.   Jacques is now an independent consultant on ICT issues.

On June 20, Kim Cameron [KC] posted a piece on this blog titled: Harvesting phone and laptop fingerprints for its database – Google says the user’s device sends a request to its location server with a list of all MAC addresses currently visible to it. Does that include yours?

It was the start of a series of communications that reads like a thriller. Unfortunately the victim is not imaginary, but it is me and you.

He started with an example of someone attending a conference while subscribed to a geo-location service. “I [KC] argued that the subscriber’s cell phone would pick up all the MAC addresses (which serve as digital fingerprints) of nearby phones and laptops and send them in to the centralized database service, which would look them up and potentially use the harvested addresses to further increase its knowledge of people’s behavior – for example, generating a list of those attending the conference.”

He then explained how Google says its location database works, showing that “certainly the MAC addresses of all nearby phones and laptops are sent in to the geo-location server – not simply the MAC addresses of wireless access points that are broadcasting SSIDs.”

His first post was followed by others, including a reference to an excellent piece by Niraj Chokshi in The Atlantic, demonstrating that Google's messages in its application descriptions are, to say the least, not in line with its PR messages to Chokshi.

On 2 July a discussion of Apple iTunes follows in KC's post: Update to iTunes comes with privacy fibs, whose main message is: As the personal phone evolves it will become increasingly obvious that groups within some of our best tech companies have built businesses based on consciously crafted privacy fibs.

The new iTunes policy says: By using this software in connection with an iTunes Store account, you agree to the latest iTunes Store Terms of Service, which you may access and review from the home page of the iTunes Store. So iTunes says: Our privacy policy is that you need to read another privacy policy. This other policy states:

We also collect non-personal information – data in a form that does not permit direct association with any specific individual. We may collect, use, transfer, and disclose non-personal information for any purpose. The following are some examples of non-personal information that we collect and how we may use it:

  • We may collect information such as occupation, language, zip code, area code, unique device identifier, location, and the time zone where an Apple product is used so that we can better understand customer behavior and improve our products, services, and advertising.

I think KC rightly asks the question: What does downloading a song have to do with giving away your location???

Clearly Apple would call its unique device identifier – and its location – “non-personal data”. However, in Europe personal data means any information relating to an identified or identifiable natural person. Even Google CEO Eric Schmidt would presumably disagree with Apple under this EU definition, given his statement in a recent speech quoted by KC: Google is making the Android phone, we have the Kindle, of course, and we have the iPad. Each of these form factors with the tablet represent in many ways your future….: they’re personal. They’re personal in a really fundamental way. They know who you are. So imagine that the next version of a news reader will not only know who you are, but it’ll know what you’ve read…and it’ll be more interactive. And it’ll have more video. And it’ll be more real-time. Because of this principle of “now.”

We could go on with the post of 3 July: The current abuse of personal device identifiers by Google and Apple is at least as significant as the problems I discussed long ago with Passport. He is referring to a story by Todd Bishop at TechFlash – here I refer readers to the original thriller rather than trying to summarize it for them.

What is absolutely clear from the above is how dependent we all are on mobile technology. It is also clear that to enjoy the personal and location services we request, data on the person and his location must be combined. However, I am convinced that in the complex society we live in, we will eventually only accept services and infrastructure if we can trust them to work as we expect, including the handling of our personal data. But trust can only be given if the services and infrastructure are trustworthy. O'Hara and Hall describe trust on the Web very well, based on fundamental principles. They decompose trust into local trust (personal experience through high-bandwidth interactions) and global trust (outsourcing our trust decisions to trusted institutions, like accepted roles through training, witnessing, or certification). Reputation is usually a mix of the two.

For trust to be built up the transparency and accountability of the data collectors and processors is essential. As local trust is particularly difficult in global transactions over the Web, we need stronger global trust through a-priori assurances on compliance with legal obligations on privacy protection, transparency, auditing, and effective law enforcement and redress. These are basic principles on which our free and developed societies are built, and which are necessary to guarantee creativity, social stability, economic activity and growth.

One can conclude from KC's posts that few of these essential elements are present in the current mobile world.

I agree that the legal solutions he proposes are small steps in the right direction and should be pursued. However, essential action at the level of the legislators is urgently needed. Data Protection authorities in Europe are well aware of that as is demonstrated in The Future of Privacy. Unfortunately these solutions are slow to implement, whilst commercial developments are very fast.

Technology solutions, such as WiFi protocols that appropriately randomize MAC addresses and also protect other personal data, are also urgently needed to enable the development of trustworthy solutions that are competitive, and methods should be sought to standardize such results quickly.
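To make the idea of MAC address randomization concrete, here is a minimal sketch of how a device might generate a throwaway address instead of exposing its factory-assigned one. The function name `random_mac` is hypothetical, and the sketch says nothing about when or how often a real protocol should rotate addresses.

```python
import secrets


def random_mac():
    """Return a random, locally administered, unicast MAC address."""
    octets = bytearray(secrets.token_bytes(6))
    # Set the "locally administered" bit (0x02) so the address cannot be
    # mistaken for a vendor-assigned one, and clear the multicast bit
    # (0x01) so it remains a valid unicast source address.
    octets[0] = (octets[0] | 0x02) & 0xFE
    return ":".join(f"{b:02x}" for b in octets)
```

Because each call draws fresh random bytes, an observer collecting addresses on different occasions has no stable fingerprint to correlate.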

However, the gigantic global centralization of data collection and the possibilities of massive correlation are frightening and may make DP Commissioners, even acting as a group in Europe, look helpless. The data is already out there and usable.

What I wonder: is all this data available for law enforcers under warrant and accepted as legal proof in court? And if not, how can it be possible that private companies can collect it? Don't we need some large legal test cases?

And let’s not forget one thing: any government action must be as global as possible given the broad international presence of the most important companies in this field, hence the proposed standards of the joint international DP authorities in their Madrid Declaration.

Smart questions and conclusions.

 

Using Consumer Identities for Business Interactions

Mike Jones writes about an “identity mashup” that drives home a really important lesson:  the organizational and technical walls that used to stand in the way of Internet business are dissolving before our very eyes.  The change agent is the power of claims.  The mashup Mike describes crosses boundaries in many dimensions at once:

  • between industries (medical, financial, technical)
  • between organizations (Medtronic, PayPal, Southworks, Microsoft)
  • between protocols (OpenID and SAML)
  • between computing platforms (Windows and Linux)
  • between software products (Windows Identity Foundation, DotNetOpenAuth, SimpleSAMLphp)
  • between identity requirements (ranging from strong identity verification to anonymous comment)

This is a super-concrete demonstration of the progress being made on the “Identity Metasystem” so many of us in the industry have been working on.   My favorite word in Mike's piece is “quickly”, to which I have taken the liberty of adding my own emphasis:

Medtronic, PayPal, Southworks, and Microsoft recently worked together to demonstrate the ability for people to use their PayPal identities for participating in a Medtronic medical device trial, rather than having to create yet another username and password. Furthermore, the demo showed the use of verified claims, where the name, address, birth date, and gender claims provided by PayPal are relied upon by Medtronic and its partners as being sufficiently authoritative to sign people up for the trial and ship them the equipment. I showed this to many of you at the most recent Internet Identity Workshop.

From a technology point of view, this was a multi-protocol federation using OpenID and WS-Federation – OpenID for the PayPal identities and WS-Federation between Medtronic and two relying parties (one for ordering the equipment and one for anonymously recording opinions about the trial). It was also multi-platform, with the Medtronic STS running on Windows and using the Windows Identity Foundation (WIF) and DotNetOpenAuth, the equipment ordering site running on Linux and using simpleSAMLphp, and the opinions site running on Windows and also using WIF. A diagram of the scenario flows is as follows:

[Diagram: Identity Mash-Up]

We called the demo an “identity mash-up” because Medtronic constructed an identity for the user containing both claims that came from the original PayPal identity and claims it added (“mashed-up”) to form a new, composite identity. And yet, access to this new identity was always through the PayPal identity. You can read more about the demo on the Interoperability @ Microsoft blog, including viewing a video of the demo. Southworks also made the documentation and code for the multi-protocol STS available.
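The “mash-up” can be pictured as a simple data structure in which every claim remembers who asserted it. The sketch below is only an illustration of that idea: the `Claim` type, the `mash_up` function, and the example claim names are all hypothetical and do not come from the demo code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    type: str
    value: str
    issuer: str  # who vouches for this claim


def mash_up(upstream_claims, added, issuer):
    """Build a composite identity.

    Upstream claims are carried over unchanged, keeping their original
    issuer, and the claims in `added` are attributed to `issuer`.
    """
    composite = list(upstream_claims)
    composite.extend(Claim(t, v, issuer) for t, v in added.items())
    return composite


# Example with made-up values: one verified claim from the identity
# provider plus one claim added by the intermediary.
upstream = [Claim("birthdate", "1970-01-01", "PayPal")]
identity = mash_up(upstream, {"trial_participant_id": "T-001"}, "Medtronic")
```

Keeping the issuer on each claim lets a relying party decide, claim by claim, whose word it is relying on.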

I’ll close by thanking the teams at PayPal, Medtronic, and Southworks for coming together to produce this demo. They were all enthusiastic about using consumer identities for Medtronic’s business scenario and pitched in together to quickly make it happen.

 

How to anger your most loyal supporters

The gaming world is seething after what is seen as an egregious assault on privacy by World of Warcraft (WoW), one of the most successful multiplayer role-playing games yet devised.  The issue?  Whereas players used to know each other through their WoW “handles”, the company is now introducing a system called “RealID” that forces players to reveal their offline identities within the game's fantasy context.  Commentators think the company wanted to turn its user base into a new social network.  Judging from the massive hullabaloo amongst even its most loyal supporters, the concept may be doomed.

To get an idea of the dimensions of the backlash just type “WoW RealID” into a search engine.  You'll hit paydirt:

The RealID feature is probably the kookiest example yet of breaking the Fourth Law of Identity – the law of Directed Identity.   This law articulates the requirement to scope digital identifiers to the context in which they are used.  In particular, it explains why universal identifiers should not be used where a person's relationship is to a specific context.  The law arises from the need for “contextual separation” – the right of individuals to participate in multiple contexts without those contexts being linkable unless the individual wants them to be.
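One common way to get this kind of contextual separation is to derive a different identifier for each context from a secret only the user's identity agent holds. The sketch below uses an HMAC for that; it is an illustration of the principle under that assumption, not a description of any particular product, and `directed_identifier` is a hypothetical name.

```python
import hashlib
import hmac


def directed_identifier(user_secret: bytes, context: str) -> str:
    """Derive a stable identifier that is specific to one context.

    The same secret and context always yield the same identifier, while
    identifiers for different contexts cannot be linked to one another
    without knowing the secret.
    """
    return hmac.new(user_secret, context.encode("utf-8"), hashlib.sha256).hexdigest()
```

A game and a social network would each see a consistent identifier for the player, yet comparing notes would not reveal that the two identifiers belong to the same person.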

The company seems to have initially inflicted Real ID onto everyone, and then backed off by describing the lack of “opt-in” as a “security flaw”, according to this official post on wow.com:

To be clear, everyone who does not have a parentally controlled account has in fact opted into Real ID, due to a security flaw. Addons have access to the name on your account right now. So you need to be very careful about what addons you download — make sure they are reputable. In order to actually opt out, you need to set up parental controls on your account. This is not an easy task. Previous to the Battle.net merge, you could just go to a page and set them up. Done. Now, you must set up an account as one that is under parental control. Once your account is that of a child's (a several-step process), your settings default to Real ID-disabled. Any Real ID friends you have will no longer be friends. In order to enable it, you need to check the Enable Real ID box.

Clearly there are security problems that emerge from squishing identifiers together and breaking cross-context separation.  Mary Landsman has a great post on her Antivirus Software Blog called “WoW Real ID: A Really Bad Idea”:

Here are a couple of snippets about the new Battle.net Real ID program:

“…when you click on one of your Real ID friends, you will be able to see the names of his or her other Real ID friends, even if you are not Real ID friends with those players yourself.”

“…your mutual Real ID friends, as well as their Real ID friends, will be able to see your first and last name (the name registered to the Battle.net account).”

“…Real ID friends will see detailed Rich Presence information (what character the Real ID friend is playing, what they are doing within that game, etc.) and will be able to view and send Broadcast messages to other Real ID friends.”

And this is all cross-game, cross-realm, and cross-alts. Just what already heavily targeted players need, right? A merge of WoW/Battle.net/StarCraft with Facebook-style social networking? Facepalm might have been a better term to describe Real ID given its potential for scams. Especially since Blizzard rolled out the change without any provision to protect minors whatsoever:

Will parents be able to manage whether their children are able to use Real ID?
We plan to update our Parental Controls with tools that will allow parents to manage their children's use of Real ID. We'll have more details to share in the future.

Nice. So some time in the future, Blizzard might start looking at considering security seriously. In the meantime, the unmanaged Real ID program makes it even easier for scammers to socially engineer players AND it adds potential stalking to the list of concerns. With no provision to protect minors whatsoever.

Thanks, Blizz…Not!

And Kyth has a must-read post at stratfu called Deeply Disappointed with the ‘RealID’ System where he explains how RealID should have been done.  His ideas are a great implementation of the Fourth Law.

Using an alias would be fine, especially if the games are integrated in such a way that you could pull up a list of a single Battle.net account's WoW/D3 characters and SC2 profiles. Here is how the system should work:

  • You have a Battle.net account. The overall account has a RealID Handle. This Handle defaults to being your real name, but you can easily change it (talking single-click retard easy here) to anything you desire. Mine would be [WGA]Kazanir, just like my Steam handle is.
  • Each of your games is attached to your Battle.net account and thereby to your RealID. Your RealID friends can see you when you are online in any of those games and message you cross-game, as well as seeing a list of your characters or individual game profiles. Your displayed RealID is the handle described above.
  • Each game contains either a profile (SC2) or a list of characters. A list of any profiles or characters attached to your Battle.net account would be easily accessible from your account management screen. Any of these characters can be “opted out” of your RealID by unchecking them from the list. Thus, my list might look like this:
    X Kazanir.wga – SC2 Profile
    X Kazanir – WoW – 80 Druid Mal'ganis
    X Gidgiddoni – WoW – 60 Warrior Mal'ganis
    _ Kazbank – WoW – 2 Hunter Mal'ganis
    X Kazabarb – D3 – 97 Barbarian US East
    _ Kazahidden – D3 – 45 Monk US West

    In this way I can play on characters (such as a bank alt or a secret D3 character with my e-girlfriend) without forcibly having their identity broadcast to my friends.

    When I am online on any of the characters I have unchecked, my RealID friends will be able to message me but those characters will not be visible even to RealID friends. The messages will merely appear to come from my RealID and the “which character is he on” information will not be available.

  • Finally, the RealID messenger implementation in every game should be able to hide my presence from view just like any instant messenger application can right now. I shouldn't be forced to be present with my RealID just because I am playing a game — there should be a universal “pretend to not be online” button available in every Battle.net enabled game.

These are the most basic functionality requirements that should be implemented by anyone with an IQ over 80 who designs a system like this.
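Kyth's requirements translate fairly directly into a small data model. The following is a sketch of one way to read them; the class and field names are invented here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Character:
    name: str
    game: str
    shared: bool = True  # checked in the list = visible to RealID friends


@dataclass
class Account:
    handle: str  # user-chosen RealID handle, not necessarily a real name
    characters: list = field(default_factory=list)
    appear_offline: bool = False  # the universal "pretend to not be online" switch

    def presence_for_friend(self, active):
        """Return what a RealID friend sees while `active` is being played."""
        if self.appear_offline:
            return None  # friend sees nothing at all
        if active.shared:
            return {"handle": self.handle, "character": active.name, "game": active.game}
        # Opted-out character: reachable by handle, character not revealed.
        return {"handle": self.handle}
```

The key design point is that the default view exposes only the handle the player chose, and everything beyond that is something the player has explicitly left checked.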

Check out the comments in response to his post.  I would have to call his really sensible and informed proposal “wildly popular”.  It will be really interesting to see how this terrible blunder by such a creative company will end up.

 [Thanks to Joe Long for heads up]

“Microsoft Accuses Apple, Google of Attempted Privacy Murder”

Ms. Smith at Network World made it to the home page of digg.com yesterday when she reported on my concerns about the collection and release of information related to people's movements and location. 

I want to set the record straight about one thing: the headline.  It's not that I object to the term “attempted privacy murder” – it pretty much sums things up. The issue is just that I speak as Kim Cameron – a person, not a corporation.  I'm not in marketing or public relations – I'm a technologist who has come to understand that we must all work together to ensure people are able to trust their digital environment.  The ideas I present here are the same ones I apply liberally in my day job, but this is a personal blog.

Ms. Smith is as precise as she is concise:

A Microsoft identity guru bit Apple and smacked Google over mobile privacy policies. Once upon a time, before working for Microsoft, this same man took MS to task for breaking the Laws of Identity.

Kim Cameron, Microsoft's Chief Identity Architect in the Identity and Security Division, said of Apple, “If privacy isn’t dead, Apple is now amongst those trying to bury it alive.”

What prompted this was when Cameron visited the Apple App store to download a new iPhone application. When he discovered Apple had updated its privacy policy, he read all 45 pages on his iPhone. Page 37 lets Apple users know:

Collection and Use of Non-Personal Information

We also collect non-personal information – data in a form that does not permit direct association with any specific individual. We may collect, use, transfer, and disclose non-personal information for any purpose. The following are some examples of non-personal information that we collect and how we may use it:

· We may collect information such as occupation, language, zip code, area code, unique device identifier, location, and the time zone where an Apple product is used so that we can better understand customer behavior and improve our products, services, and advertising.

The MS identity guru put the smack down not only on Apple, but also on Google, writing in his blog, “Maintaining that a personal device fingerprint has ‘no direct association with any specific individual’ is unbelievably specious in 2010 – and even more ludicrous than it used to be now that Google and others have collected the information to build giant centralized databases linking phone MAC addresses to house addresses. And – big surprise – my iPhone, at least, came bundled with Google’s location service.”

MAC in this case refers to Media Access Control addresses associated with specific devices and one of the types that Google collected. Google admits to collecting MAC addresses of WiFi routers, but denies snagging MAC addresses of laptops or phones. Google is under mass investigation for its WiFi blunder.

Apple's new policy is also under fire from two Congressmen who gave Apple until July 12th to respond. Reps. Edward J. Markey (D-Mass.) and Joe Barton (R-Texas) sent a letter to Apple CEO Steve Jobs asking for answers about Apple gathering location information on its customers.

As far as Cameron goes, Microsoft's Chief Identity Architect seems to call out anyone who violates privacy. That includes Microsoft. According to Wikipedia's article on Microsoft Passport:

“A prominent critic was Kim Cameron, the author of the Laws of Identity, who questioned Microsoft Passport in its violations of those laws. He has since become Microsoft's Chief Identity Architect and helped address those violations in the design of the Windows Live ID identity meta-system. As a consequence, Windows Live ID is not positioned as the single sign-on service for all web commerce, but as one choice of many among identity systems.”

Cameron seems to believe location based identifiers and these changes of privacy policies may open the eyes of some people to the, “new world-wide databases linking device identifiers and home addresses.”

 

Doing it right: Touch2Id

And now for something refreshingly different:  an innovative company that is doing identity right. 

I'm talking about a British outfit called Touch2Id.  Their concept is really simple.  They offer young people a smart card that can be used to prove they are old enough to drink alcohol.  The technology is now well beyond the “proof of concept” phase – in fact its use in Wiltshire, England is being expanded based on its initial success.

  • To register, people present their ID documents and, once verified, a template of their fingerprint is stored on a Touch2Id card that is immediately given to them. 
  • When they go to a bar, they wave their card over a machine similar to a credit card reader, and press their finger on the machine.  If their finger matches the template on their card, the lights come on and they can walk on in.

   What's great here is:

  • Merchants don't have to worry about making mistakes.  The age vetting process is stringent and fake IDs are weeded out by experts.
  • Young people don't have to worry about being discriminated against (or being embarrassed) just because they “look young”
  • No identifying information is released to the merchant.  No name, age or photo appears on (or is stored on) the card.
  • The movements of the young person are not tracked.
  • There is no central database assembled that contains the fingerprints of innocent people
  • The fingerprint template remains the property of the person with the fingerprint – there is no privacy issue or security honeypot.
  • Kids cannot lend their card to a friend – the friend's finger would not match the fingerprint template.
  • If the card is lost or stolen, it won't work any more
  • The templates on the card are digitally signed and can't be tampered with
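The properties listed above can be sketched in a few lines: the reader checks that the template on the card has not been tampered with, compares it with the live finger, and reveals nothing but a yes or no. This is only a sketch under loud assumptions. I have no knowledge of Touch2Id's actual implementation; an HMAC stands in here for a real digital signature, and the fingerprint `matcher` is passed in as a parameter because real biometric matching is fuzzy rather than an exact comparison.

```python
import hashlib
import hmac


def issue_card(template: bytes, issuer_key: bytes) -> dict:
    """Create a card holding only a template and an integrity tag.

    No name, age, or photo is stored. The HMAC is a stand-in for a
    digital signature made by the issuer.
    """
    tag = hmac.new(issuer_key, template, hashlib.sha256).digest()
    return {"template": template, "tag": tag}


def admit(card: dict, live_sample: bytes, issuer_key: bytes, matcher) -> bool:
    """Return True only if the card is untampered and the finger matches."""
    expected = hmac.new(issuer_key, card["template"], hashlib.sha256).digest()
    if not hmac.compare_digest(expected, card["tag"]):
        return False  # template was altered or card was not issued by us
    # The comparison happens locally; no central database is consulted.
    return matcher(card["template"], live_sample)
```

Note that `admit` returns a bare boolean, which mirrors the point that the merchant learns nothing about the person beyond "old enough".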

I met the man behind Touch2Id, Giles Sergant, at the recent EEMA meeting in London.

Being versed in the (mis)use of biometrics in identity – especially the fingerprinting of our kids – I was initially more than skeptical.

But Giles has done his homework (even auditing the course given by privacy experts Gus Hosein and Simon Davies at the London School of Economics).  The better I understood the approach he has taken, the more impressed I was.

Eventually I even agreed to enroll so as to get a feeling for what the experience was like.  The verdict:  amazing.  It's a lovely piece of minimalistic engineering, with no unnecessary moving parts or ugly underbelly.    If I look strangely euphoric in the photo that was taken it is because I was thoroughly surprised by seeing something so good.

Since then, Giles has already added an alternate form factor – an NFC sticker people can put on their mobile phone so they don't actually need to carry around an additional artifact.  It will be fascinating to watch how young people respond to this initiative, which Giles is trying to grow from the bottom up.  More info on the Facebook page.

Microsoft identity guru questions Apple, Google on mobile privacy

Todd Bishop at TechFlash published a comprehensive story this week on device fingerprints and location services: 

Kim Cameron is an expert in digital identity and privacy, so when his iPhone recently prompted him to read and accept Apple's revised terms and conditions before downloading a new app, he was perhaps more inclined than the rest of us to read the entire privacy policy — all 45 pages of tiny text on his mobile screen.

It's important to note that apart from writing his own blog on identity issues — where he told this story — Cameron is Microsoft's chief identity architect and one of its distinguished engineers. So he's not a disinterested industry observer in the broader sense. But he does have extensive expertise.

And he is publicly acknowledging his use of an iPhone, after all, which should earn him at least a few points for neutrality…

At this point I'll butt in and editorialize a little.  I'd like to amplify on Todd's point for the benefit of readers who don't know me very well:  I'm not critical of Street View WiFi because I am anti-Google.  I'm not against anyone who does good technology.  My critique stems from my work as a computer scientist specializing in identity, not as a person playing a role in a particular company.  In short, Google's Street View WiFi is bad technology, and if the company persists in it, it will be one of the identity catastrophes of our time.

When I figured out the Laws of Identity and understood that Microsoft had broken them, I was just as hard on Microsoft as I am on Google today.  In fact, someone recently pointed out the following reference in Wikipedia's article on Microsoft's Passport:

“A prominent critic was Kim Cameron, the author of the Laws of Identity, who questioned Microsoft Passport in its violations of those laws. He has since become Microsoft's Chief Identity Architect and helped address those violations in the design of the Windows Live ID identity meta-system. As a consequence, Windows Live ID is not positioned as the single sign-on service for all web commerce, but as one choice of many among identity systems.”

I hope this has earned me some right to comment on the current abuse of personal device identifiers by Google and Apple – which, if their FAQs and privacy policies represent what is actually going on, is at least as significant as the problems I discussed long ago with Passport.  

But back to Todd: 

At any rate, as Cameron explained on his IdentityBlog over the weekend, his epic mobile reading adventure uncovered something troubling on Page 37 of Apple's revised privacy policy, under the heading of “Collection and Use of Non-Personal Information.” Here's an excerpt from Apple's policy, Cameron's emphasis in bold.

We also collect non-personal information — data in a form that does not permit direct association with any specific individual. We may collect, use, transfer, and disclose non-personal information for any purpose. The following are some examples of non-personal information that we collect and how we may use it:

We may collect information such as occupation, language, zip code, area code, unique device identifier, location, and the time zone where an Apple product is used so that we can better understand customer behavior and improve our products, services, and advertising.

Here's what Cameron had to say about that.

Maintaining that a personal device fingerprint has “no direct association with any specific individual” is unbelievably specious in 2010 — and even more ludicrous than it used to be now that Google and others have collected the information to build giant centralized databases linking phone MAC addresses to house addresses. And — big surprise — my iPhone, at least, came bundled with Google’s location service.

The irony here is a bit fantastic. I was, after all, using an “iPhone”. I assume Apple’s lawyers are aware there is an ‘I’ in the word “iPhone”. We’re not talking here about a piece of shared communal property that might be picked up by anyone in the village. An iPhone is carried around by its owner. If a link is established between the owner’s natural identity and the device (as Google’s databases have done), its “unique device identifier” becomes a digital fingerprint for the person using it.

MAC in this context refers to Media Access Control addresses associated with specific devices, one type of data that Google has acknowledged collecting. However, in a response to an Atlantic magazine piece that quoted an earlier Cameron blog post, Google says that it hasn't gone as far as Cameron is suggesting. The company says it has collected only the MAC addresses of WiFi routers, not of laptops or phones.

The distinction is important because it speaks to how far the companies could go in linking together a specific device with a specific person in a particular location.

Google's FAQ, for the record, says its location-based services (such as Google Maps for Mobile) figure out the location of a device when that device “sends a request to the Google location server with a list of MAC addresses which are currently visible to the device” — not distinguishing between MAC addresses from phones or computers and those from wireless routers.

Here's what Cameron said when I asked about that topic via email.

I have suggested that the author ask Google if it will therefore correct its FAQ, since the portion of the FAQ on “how the system works” continues to say it behaves in the way I described. If Google does correct its FAQ then it is likely that data protection authorities will ask Google to demonstrate that its shipped software behaves in the way described in the correction.

I would of course feel better about things if Google’s FAQ is changed to say something like, “The user’s device sends a request to the Google location server with the list of MAC addresses found in Beacon Frames announcing a Network Access Point SSID and excluding the addresses of end user devices.”

However, I would still worry that the commercially irresistible feature of tracking end user devices could be turned on at any second by Google or others. Is that to be prevented? If so, how?

So a statement from Google that its FAQ was incorrect would be good news – and I would welcome it – but not the end of the problem for the industry as a whole.

The privacy statement for Microsoft's Location Finder service, for the record, is more specific in saying that the service uses MAC addresses from wireless access points, making no reference to those from individual devices.

In any event, the basic question about Apple is whether its new privacy policy is ultimately correct in saying that the company is only collecting “data in a form that does not permit direct association with any specific individual” — if that data includes such information as the phone's unique device identifier and location.

Cameron isn't the only one raising questions.

The Consumerist blog picked up on this issue last week, citing a separate portion of the revised privacy policy that says Apple and its partners and licensees “may collect, use, and share precise location data, including the real-time geographic location of your Apple computer or device.” The policy adds, “This location data is collected anonymously in a form that does not personally identify you and is used by Apple and our partners and licensees to provide and improve location-based products and services.”

The Consumerist called the language “creepy” and said it didn't find Apple's assurances about the lack of personal identification particularly comforting. Cameron, in a follow-up post, agreed with that sentiment.

SF Weekly and the Hypebot music technology blog also noted the new location-tracking language, and the fact that users must agree to the new privacy policy if they want to use the service.

“Though Apple states that the data is anonymous and does not enable the personal identification of users, they are left with little choice but to agree if they want to continue buying from iTunes,” Hypebot wrote.

We've left messages with Apple and Google to comment on any of this, and we'll update this post depending on the response.

And for the record, there is an option to email the Apple privacy policy from the phone to a computer for reading, and it's also available here, so you don't necessarily need to duplicate Cameron's feat by reading it all on your phone.

Update to iTunes comes with privacy fibs

A few days ago I reported that from now on, to get into the iPhone App store you must allow Apple to share your phone or tablet device fingerprints and detailed, dynamic location information with anyone it pleases.  No chance to vet the purposes for which your location data is being used.  No way to know who it is going to. 

As incredible as it sounds in 2010, no user control.  Not even  transparency.  Just one thing is for sure.  If privacy isn't dead, Apple is now amongst those trying to bury it alive.

Then today, just when I thought Apple had gone as far as it could go in this particular direction, a new version of iTunes wanted to install itself on my laptop.  What do you know?  It had a new privacy policy too… 

The new iTunes policy was snappier than the iPhone policy – it came to the point – sort of – in the 5th paragraph rather than the 37th page!

5. iTunes Store and other Services.  This software enables access to Apple's iTunes Store which offers downloads of music for sale and other services (collectively and individually, “Services”). Use of the Services requires Internet access and use of certain Services requires you to accept additional terms of service which will be presented to you before you can use such Services.

By using this software in connection with an iTunes Store account, you agree to the latest iTunes Store Terms of Service, which you may access and review from the home page of the iTunes Store.

I shuddered.  Mind bend!  A level of indirection in a privacy policy! 

Imagine:  “Our privacy policy is that you need to read another privacy policy.”  This makes it much more likely that people will figure out what they're getting into, don't you think?  Besides, it is a really novel application of the proposition that all problems of computer science can be solved through a level of indirection!  Bravo!

But then – the coup de grace.  The privacy policy to which Apple redirects you is… are you ready… the same one we came across a few days ago at the App Store!  So once again you need to get to the equivalent of page 37 of 45 to read:

Collection and Use of Non-Personal Information

We also collect non-personal information – data in a form that does not permit direct association with any specific individual. We may collect, use, transfer, and disclose non-personal information for any purpose. The following are some examples of non-personal information that we collect and how we may use it:

  • We may collect information such as occupation, language, zip code, area code, unique device identifier, location, and the time zone where an Apple product is used so that we can better understand customer behavior and improve our products, services, and advertising.

The mind bogggggles.  What does downloading a song have to do with giving away your location???

Some may remember my surprise that the Lords of The iPhone would call its unique device identifier – and its location – “non-personal data”.  Non-personal implies there is no strong relationship to the person who is using it.  I wrote:

The irony here is a bit fantastic. I was, after all, using an “iPhone”. I assume Apple’s lawyers are aware there is an “I” in the word “iPhone”. We’re not talking here about a piece of shared communal property that might be picked up by anyone in the village. An iPhone is carried around by its owner. If a link is established between the owner’s natural identity and the device (as Google’s databases have done), its “unique device identifier” becomes a digital fingerprint for the person using it.

Anybody who thinks about identity understands that a “personal device” is associated with (even an extension of) the person who uses it.  But most people – including technical people – don't give these matters the slightest thought.  

A parade of tech companies have figured out how to use people’s ignorance about digital identity to get away with practices letting them track what we do from morning to night in the physical world.  But of course, they never track people, they only track their personal devices!  Those unruly devices really have a mind of their own – you definitely need central databases to keep tabs on where they're going.

I was therefore really happy to read some of  Google CEO Eric Schmidt’s recent speech to the American Society of News Editors.  Talking about mobility he made a number of statements that begin to explain the ABCs of what mobile devices are about:

Google is making the Android phone, we have the Kindle, of course, and we have the iPad. Each of these form factors with the tablet represent in many ways your future….: they’re personal. They’re personal in a really fundamental way. They know who you are. So imagine that the next version of a news reader will not only know who you are, but it’ll know what you’ve read…and it’ll be more interactive. And it’ll have more video. And it’ll be more real-time. Because of this principle of “now.”

It is good to see Eric sharing the actual truth about personal devices with a group of key influencers.  This stands in stark contrast to the silly fibs about phones and laptops being non-personal that are being handed down in the iTunes Store, the iPhone App Store, and in the “Refresher FAQ” Fantasyland Google created in response to its Street View WiFi shenanigans. 

As the personal phone evolves it will become increasingly obvious  that groups within some of our best tech companies have built businesses based on consciously crafted privacy fibs.  I'm amazed at the short-sightedness involved:  folks, we're talking about a “BP moment”.  History teaches us that “There is no vice that doth so cover a man with shame as to be found false and perfidious.” [Francis Bacon]  And statements that your personal device doesn't identify you and that location is not personal information are precisely “false and perfidious.”

 

What Could Google Do With the Data It's Collected?

Niraj Chokshi has published a piece in The Atlantic where he grapples admirably with the issues related to Google's collection and use of device fingerprints (technically called MAC Addresses).  It is important and encouraging to have journalists like Niraj taking the time to explore these complex issues.  

But I have to say that such an exploration is really hard right now. 

Whether on purpose or by accident, the Google PR machine is still handing out contradictory messages.  In particular, the description in Google's Refresher FAQ titled “How does this location database work?” is currently completely different from (read: the opposite of) what its public relations people are telling journalists like Niraj.  I think reestablishing credibility around location services requires the messages to be made consistent so they can be verified by data protection authorities.

Here are some excerpts from the piece – annotated with some comments by me.  [Read the whole article here.] 

The Wi-Fi data Google collected in over 30 countries could be more revealing than initially thought…

Google's CEO Eric Schmidt has said the information was hardly useful and that the company had done nothing with it. The search giant has also been ordered (or sought) to destroy the data. According to their own blog post, Google logged three things from wireless networks within range of their vans: snippets of unencrypted data; the names of available wireless networks; and a unique identifier associated with devices like wireless routers. Google blamed the collection on a rogue bit of code that was never removed after it had been inserted by an engineer during testing.

[The statement about rogue code is an example of the PR ambiguity Niraj and other journalists must deal with.  Google blogs don't actually blame the collection of unique identifiers on rogue code, although they seem crafted to leave people with that impression.  Spokesmen only blame rogue code for the collection of unencrypted data content (e.g. email messages). – Kim]

Each of the three types of data Google recorded has its uses, but it's that last one, the unique identifier, that could be valuable to a company of Google's scale. That ID is known as the media access control (MAC) address and it is included — unencrypted, by design — in any transfer, blogger Joe Mansfield explains.

Google says it only downloaded unencrypted data packets, which could contain information about the sites users visited. Those packets also include the MAC address of both the sending and receiving devices — the laptop and router, for example.

[Another contradiction: Google PR says it “only” collected unencrypted data packets, but Google's GStumbler report  says its cars did collect and record the MAC addresses from encrypted data frames as well. – Kim]

A company as large as Google could develop profiles of individuals based on their mobile device MAC addresses, argues Mansfield:

Get enough data points over a couple of months or years and the database will certainly contain many repeat detections of mobile MAC addresses at many different locations, with a decent chance of being able to identify a home or work address to go with it.
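Mansfield's point can be made concrete with a few lines of code. The following is only a sketch under stated assumptions: the sighting records, the place labels and the function names are all hypothetical, and nothing here describes software that Google or anyone else is known to run. It simply shows how little work is needed to turn repeated detections of one MAC address into a guess at where a device spends most of its time.

```python
from collections import Counter, defaultdict


def build_profiles(sightings):
    """Group (mac, place) sightings into per-device counts of places.

    `sightings` is a hypothetical list of (mac_address, place_label) pairs.
    """
    profiles = defaultdict(Counter)
    for mac, place in sightings:
        # Normalise case so the same address always lands in one profile.
        profiles[mac.lower()][place] += 1
    return profiles


def most_frequent_place(profiles, mac):
    """Return the place where a device was seen most often, or None."""
    counts = profiles.get(mac.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Invented example data: one phone seen repeatedly over several days.
example_sightings = [
    ("AA:BB:CC:00:00:01", "Elm Street"),
    ("aa:bb:cc:00:00:01", "Elm Street"),
    ("aa:bb:cc:00:00:01", "Main Street office"),
    ("aa:bb:cc:00:00:01", "Elm Street"),
    ("aa:bb:cc:00:00:02", "Main Street office"),
]

example_profiles = build_profiles(example_sightings)
likely_home = most_frequent_place(example_profiles, "AA:BB:CC:00:00:01")
```

With more sightings and timestamps, the same grouping would separate a likely home address from a likely work address, which is exactly the concern raised in the quote.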

Now, to be fair, we don't know whether Google actually scrubbed the packets it collected for MAC addresses and the company's statements indicate they did not. [Yet the GStumbler report says ALL MAC addresses were recorded – Kim].  The search giant even said it “cannot identify an individual from the location data Google collects via its Street View cars.”  Add a step, however, and Google could deduce an individual from the location data, argues Avi Bar-Zeev, an employee of Microsoft, a Google competitor.

[Google] could (opposite of cannot) yield your identity if you've used Google's services or otherwise revealed it to them in association with your IP address (which would be the public IP of your router in most cases, visible to web servers during routine queries like HTTP GET). If Google remembered that connection (and why not, if they remember your search history?), they now have your likely home address and identity at the same time. Whether they actually do this or not is unclear to me, since they say they can't do A but surely they could do B if they wanted to.

Theoretically, Google could use the MAC address for a mobile device — an iPod, a laptop, etc. — to build profiles of an individual's activity. (It's unclear whether they did and Google has indicated that they have not.) But there's also value in the MAC addresses of wireless routers.

Once a router has been associated with a real-world location, it becomes useful as a reference point. The Boston company Skyhook Wireless, for example, has long maintained a database of MAC addresses, collected in a (slightly) less-intrusive way. Skyhook is the primary wireless positioning system used by Apple's iPhone and iPod Touch. (See a map of their U.S. coverage here.) When your iPod Touch wants to retrieve the current location, it shares the MAC addresses of nearby routers with Skyhook which pings its database to figure out where you are.
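The lookup described above can be sketched in a few lines. This is an illustration only, not Skyhook's actual method: the database contents, the coordinates and the function name are invented, and a simple average of known router positions stands in for whatever positioning algorithm a real service uses.

```python
# Hypothetical reference database: router MAC address -> (latitude, longitude).
ROUTER_LOCATIONS = {
    "00:11:22:33:44:55": (42.3601, -71.0589),
    "66:77:88:99:aa:bb": (42.3611, -71.0579),
}


def estimate_location(visible_macs, database=ROUTER_LOCATIONS):
    """Estimate a device position from the router MACs it can currently see.

    Returns the average (latitude, longitude) of the routers found in the
    database, or None when none of the visible addresses is known.
    """
    known = [database[m.lower()] for m in visible_macs if m.lower() in database]
    if not known:
        return None
    lat = sum(point[0] for point in known) / len(known)
    lon = sum(point[1] for point in known) / len(known)
    return (lat, lon)


# The device reports what it sees; unknown addresses are simply ignored.
position = estimate_location(
    ["00:11:22:33:44:55", "66:77:88:99:AA:BB", "de:ad:be:ef:00:01"]
)
```

Note that the service only needs router addresses to answer the question. Any other MAC address in the request adds nothing to the position estimate.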

Google Latitude, which lets users share their current location, has at least 3 million active users and works in a similar way. When a user decides to share his location with any Google service on a non-GPS device, he sends all visible MAC addresses in the vicinity to the search giant, according to the company's own description of how its location services work.

[Update: Google's own “refresher FAQ” states that a user of its geo-location services, such as Latitude, sends all MAC addresses “currently visible to the device” to Google, but a spokesman said the service only collects the MAC addresses of routers. That FAQ statement is the basis of the following argument.]

This is disturbing, argues blogger Kim Cameron (also a Microsoft employee), because it could mean the company is getting not only router addresses, but also the MAC addresses of devices such as laptops and iPods. If you are sitting next to a Google Latitude user who shares his location, Google could know the address and location of your device even though you didn't opt in. That could then be compared with all other logged instances of your MAC address to develop a profile of where the device is and has been.

Google denies using the information it collected and, if the company is telling the truth, then only data from unencrypted networks was intercepted anyway, so you have less to worry about if your home wireless network is password-protected. (It's still not totally clear whether only router MAC addresses were collected. Google said it collected the information for devices “like a WiFi router.”) Whether it did or did not collect or use this information isn't clear, but Google, like many of its competitors, has a strong incentive to get this kind of location data.

[Again, and I really do feel for Niraj, the PR leaves the impression that if you have passwords and encryption turned on you have nothing to worry about, but Google’s GStumbler report says that passwords and encryption did not prevent the collection of the MAC addresses of phones and laptops from homes and businesses. – Kim]

I really tuned in to these contradictory messages when a reader first alerted me to Niraj's article.   It looked like this:

My comments earned their strike-throughs when a Google spokesman assured the Atlantic “the Service only collects the MAC addresses of routers.”  I pointed out that my statement was actually based on Google's own FAQ, and it was their FAQ (“How does this location database work?”) – rather than my comments – that deserved to be corrected.  After verifying that this was true, Niraj agreed to remove the strikethrough.

How can anyone be expected to get this story right given the contradictions in what Google says it has done?

In light of this, I would like to see Google issue a revision to its “Refresher FAQ” that currently reads:

The “list of MAC addresses which are currently visible to the device” would include the addresses of nearby phones and laptops.  Since Google PR has assured Niraj that “the service only collects the MAC addresses of routers”, the right thing to do would be to correct the FAQ so it reads:

  • “The user’s device sends a request to the Google location server with the list of MAC addresses found in Beacon Frames announcing a Network Access Point SSID and excluding the addresses of end user devices like WiFi enabled phones and laptops.”

This would at least reassure us that Google has not delivered software with the ability to track non-subscribers and this could be verified by data protection authorities.  We could then limit our concerns to what we need to do to ensure that no such software is ever deployed in the future.
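To show that the corrected behaviour is easy to implement and to check, here is a minimal sketch of a client that keeps only addresses taken from Beacon Frames before it builds a location request. The `ObservedFrame` record and the function name are hypothetical, and the frames are assumed to have been parsed already. The type and subtype numbers are the standard 802.11 values for a beacon (management type 0, subtype 8).

```python
from dataclasses import dataclass

MGMT_TYPE = 0       # 802.11 management frame type
BEACON_SUBTYPE = 8  # 802.11 beacon subtype


@dataclass(frozen=True)
class ObservedFrame:
    """Hypothetical record for one already-parsed 802.11 frame."""
    frame_type: int
    subtype: int
    transmitter_mac: str
    ssid: str = ""


def access_point_macs(frames):
    """Return the sorted, de-duplicated transmitter MACs of beacon frames only.

    Data frames and other management frames (such as probe requests sent by
    phones and laptops) are dropped, so their addresses never leave the device.
    """
    return sorted({
        frame.transmitter_mac.lower()
        for frame in frames
        if frame.frame_type == MGMT_TYPE and frame.subtype == BEACON_SUBTYPE
    })


# Invented capture: one access point, one laptop, one phone.
observed = [
    ObservedFrame(0, 8, "00:11:22:33:44:55", ssid="CoffeeShop"),  # beacon
    ObservedFrame(2, 0, "AA:BB:CC:00:00:01"),                     # laptop data frame
    ObservedFrame(0, 4, "AA:BB:CC:00:00:02"),                     # phone probe request
    ObservedFrame(0, 8, "00:11:22:33:44:55", ssid="CoffeeShop"),  # repeated beacon
]

request_macs = access_point_macs(observed)
```

A filter of this kind is something a data protection authority could verify by inspecting the requests a device actually sends.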

 

National Strategy for Trusted Identities in Cyberspace

Friday saw what I think is a historic post by Howard Schmidt on the White House Blog:

“Today, I am pleased to announce the latest step in moving our Nation forward in securing our cyberspace with the release of the draft National Strategy for Trusted Identities in Cyberspace (NSTIC).  This first draft of NSTIC was developed in collaboration with key government agencies, business leaders and privacy advocates. What has emerged is a blueprint to reduce cybersecurity vulnerabilities and improve online privacy protections through the use of trusted digital identities.”

I say the current draft is historic because of the grasp of identity issues it achieves.

At the core of the document is a recognition that we need a solution supporting privacy-enhancing technologies and built by harnessing a user-centric Identity Ecosystem offering citizens and private enterprise plenty of choice.  

Finally we have before us a proposal that can move society forward in  protecting individual privacy and simultaneously create a secure and trustworthy infrastructure with enough protections to be resistant to insider attacks.  

Further, the work appears to have support from multiple government agencies – the Department of Homeland Security was a key partner in its creation. 

Here are the guiding principles (beginning page 8):

  • Identity solutions will be secure and resilient
  • Identity solutions will be interoperable
  • Identity solutions will be privacy enhancing and voluntary for the public
  • Identity solutions will be cost-effective and easy to use

Let's start with the final “s” on the word “solutions” – a major achievement.  The authors understand society needs a spectrum of approaches suitable for different use cases but fitting within a common interoperable framework – what I and others have called an identity metasystem. 

The report embraces the need for anonymous access as well as that for strong identification.  It stands firmly in favor of minimal disclosure.  The authors call out the requirement that solutions be privacy enhancing and voluntary for the public, rather than attempting to ram something bureaucratic down people’s throats.  And they are fully cognisant of the practicality and usability requirements for the initiative to be successful.  A few years ago I would not have believed this kind of progress would be possible.

Nor is the report just a theoretical treatment devoid of concrete proposals.  The section on “Commitment to Action” includes:

  • Designate a federal agency to lead the public/private sector efforts to advance the vision
  • Develop a shared, comprehensive public/private sector implementation plan
  • Accelerate the expansion of government services, pilots and policies that align with the identity ecosystem
  • Work to implement enhanced privacy protections
  • Coordinate the development and refinement of risk management and interoperability standards
  • Address liability concerns of service providers and individuals
  • Perform outreach and awareness across all stakeholders
  • Continue collaborating in international efforts
  • Identify other means to drive adoption

Readers should dive into the report – it is in a draft stage and “Public ideas and recommendations to further refine this Strategy are encouraged.”  

A number of people and organizations in the identity world have participated in getting this right, working closely with policy thinkers and those leading this initiative in government.  I don't hesitate to say that congratulations are due all round for getting this effort off to such a good start.

We can expect suggestions to be made strengthening various aspects of the report – mainly in terms of making it more internally consistent.  

For example, the report contains good vignettes about minimal disclosure and the use of claims to gain access to resources.  Yet it also retains the traditional notion that authentication is dependent on identification.  What is meant by identification?  Many will assume it means “unique identification” in the old-fashioned sense of associating someone with an identifier.  That doesn't jibe with the notion of minimal disclosure present throughout the report.  Why? For many purposes association with an identifier is over-identification or unhelpful, and a simple proof of some set of claims would suffice to control access.

But these refinements can be made fairly easily.  The real challenge will be to actually live up to the guiding principles as we move from high level statements to a widely deployed system – making it truly secure, resilient and privacy enhancing.  These are guiding principles we can use to measure our success and help select between alternatives.