Clarke: Appropriating home network identifiers is the real issue

Here is some background on the Google Street View WiFi issue by Roger Clarke, a well known Australian privacy expert.  Roger points out that Peter Schaar, Germany's Federal Commissioner for Freedom of Information, was concerned about misuse of network identifiers from the very beginning. 

I agree that the identifiers of users’ devices is the real issue.

And your invocation of “It reminds me of an old skit by “Beyond the Fringe” where a police inspector points out that “Once you have identified the criminal's face, the criminal's body is likely to be close by” does hit the spot very nicely!

You ask why the payload is getting all the attention.  After all, it was the device-addresses that Peter Schaar first drew attention to.  As I wrote here,

The third mistake came to light on 22 April 2010, when The Register reported that “[Google's] Street View service is under fire [from the German Data Protection Commissioner, Peter Schaar] for scanning private WLAN networks, and recording users’ unique [device] addresses, as the car trundles along”.

As soon as Peter Fleischer [Google's European privacy advisor – Kim]  published his document of 27 April, I wrote to Schaar, saying:

“Fleischer's document doesn't say anything about whether the surveillance apparatus in the vehicle detects other messages from the router, and messages from other devices…

“In relation to messages other than beacons, on the surface of it, Fleischer might seem to be making an unequivocal statement that Google does *not* collect and store MAC addresses.


  1. If Google's surveillance apparatus is in a Wifi zone, how does it avoid ‘collecting’ the data?  [Other statements make clear that it does in fact collect that data]
  2. [In the statement “Google does not collect or store payload data”,] the term ‘payload data’ would most sensibly be interpreted as meaning the content, but not including the headers.
  3. The MAC-addresses are in the headers.
  4. So Fleischer's statement is open to the interpretation that header data of messages other than beacons *is* collected, and *is* stored.

“Google has failed to make the statement that connected-device MAC-addresses are *not* collected and stored.

“Because Google has had ample opportunity to make such a statement, and has avoided doing so, I therefore make the conservative assumption that Google *does* collect and store MAC addresses of any devices on networks, not just of routers.”

The document sent to the Commissioners added fuel to the fire, by saying “The equipment is able to receive data from all broadcast frames [i.e. not only beacons are intercepted; any traffic may be intercepted.] This includes, from the header data, SSID and MAC addresses [i.e. consistent with the analysis above, the MAC-addresses of all devices are available to Google's surveillance apparatus.] However, all data payload from data frames are discarded, so Google never collects the content of any communications.

Subsequently, on 14 May, investigations by Hamburg Commissioner Caspar led to the unavoidable conclusion that Fleischer's post on April 27 had been incorrect in a key respect. As Eustace put it, “It's now clear that we have been mistakenly collecting samples of payload data [i.e. message content] from open (i.e. non-password-protected) WiFi networks”.

So I think there are a couple of reasons why the payload aspect is getting most of the press:

  1. The significance of identifiers isn't readily apparent to most people, whereas ‘payload’, like people's Internet Banking passwords, is easier to visualise. (Leave aside that only highly insecure services send authenticators unencrypted. Low-tech reporters have to (over-) simplify stories to communicate to low-tech readers
  2. A corporation appeared to have been caught telling fibs, constructively misleading the public and the media, and regulators
  3. That's what catapulted it into the news, and reporters feed off one another's work, so it's the payload they all focus on
  4. A final factor is that breaches of telecommunications laws may be easier to prove in the case of content than of device-identifiers.

The Australian Privacy Foundation (APF) stepped up the pressure in Australia late this week.

Firstly, we directly requested Google not to delete the data, and gave them notice that we were considering using a little-known part of the TIAA to launch an action.  That was promptly followed by the NYT's report of the Oz Privacy Commissioner saying that the Australian data is in the USA.  (The first useful utterance she's made on the topic – a month after this story broke, there's no mention of the matter on her web-site).

Secondly, we wrote to the relevant regulators, and requested them to contact Google to ensure that the data is not deleted, and to investigate whether Google's actions breached Australian laws.


Don't take identities from our homes without our consent

Joerg Resch of Kuppinger Cole in Germany wrote recently about the importance of identity management to the Smart Grid – by which he means the emerging energy infrastructure based on intelligent, distributed renewable resources:

In 10-12 years from now, the whole utilities and energy market will look dramatically different. Decentralization of energy production with consumers converting to prosumers pumping solar energy into the grid and offering  their electric car batteries as storage facilities, spot markets for the masses offering electricity on demand with a fully transparent price setting (energy in a defined region at a defined time can be cheaper, if the sun is shining or the wind is blowing strong), and smart meters in each home being able to automatically contract such energy from spot markets and then tell the washing machine to start working as soon as electricity price falls under a defined line. And – if we think a bit further and apply Google-like business models to the energy market, we can get an idea of the incredible size this market will develop into.

These are just a few examples, which might give you an idea on how the “post fossile energy market” will work. The drivers leading the way into this new age are clear: energy production from oil and gas will become more and more expensive, because pollution is not for free and the resources will not last forever. And the transparency gain from making the grid smarter will make electricity cheaper than it is now.

The drivers are getting stronger every day. Therefore, we will soon see many large scale smart grid initiatives, and we will see questions rising such as who has control over the information collected by the smart meter in my home. Is it my energy provider? How would Kim Cameron´s 7 laws of Identity work in a smart grid? What would a “grid perimeter” look like which keeps information on the usage of whatever electric devices within my 4 walls? By now, we all know what cybercrimes are and how they can affect each of us. But what are the risks of “smart grid hacking”? How might we be affected by “grid crimes”?

In fact at Blackhat 2009, security consultant Mike Davis demonstrated successful hacker attacks on commercially available smart meters.  He told the conference,

“Many of the security vulnerabilities we found are pretty frightening and most smart meters don't even use encryption or ask for authentication before carrying out sensitive functions like running software updates and severing customers from the power grid.”

Privacy commission Ann Cavoukian of Ontario has insisted that industry turn its attention to the security and privacy of these devices:

“The best response is to ensure that privacy is proactively embedded into the design of the Smart Grid, from end to end. The Smart Grid is presently in its infancy worldwide – I’m confident that many jurisdictions will look to our work being done in Ontario as the privacy standard to be met. We are creating the necessary framework with which to address this issue.”

Until recently, no one has talked about drive-by mapping of our home devices.  But from now on we will.  When we think about home devices, we need to reach into the future and come to terms with the huge stakes that are up for grabs here.  

The smart home and the smart grid alert us to just how important the identity and privacy of our devices really is.  We can use technical mechanisms like encryption to protect some information from eavesdroppers.   But not the patterns of our communication or the identities of our devices…  To do that we need a regulatory framework that ensures commercial interests don't enter our “device space” without our consent.

Google's recent Street View WiFi boondoggle is a watershed event in drawing our attention to these matters.

Misuse of network identifiers was done on purpose

Ben Adida has a list of achievements as long as my arm – many of which are related to privacy and security.  His latest post concerns what he calls, “privacy advocacy theater… a problem that my friends and colleagues are guilty of, and I’m sure I’m guilty of it at times, too.  Privacy Advocacy Theater is the act of extreme criticism for an accidental data breach rather than a systemic privacy design flaw. Example: if you’re up in arms over the Google Street View privacy “fiasco” of the last few days, you’re guilty of Privacy Advocacy Theater.”

Ben then proceeds take me to task for this piece:

I also have to be harsh with people I respect deeply, like Kim Cameron who says that Google broke two of his very nicely crafted Laws of Identity. Come on, Kim, this was accidental data collection by code that the Google Street View folks didn’t even realize was running. (I’m giving them the benefit of the doubt. If they are lying, that’s a different problem, but no one’s claiming they’re lying, as far as I know.) The Laws of Identity apply predominantly to the systems that individuals choose to use to manage their data. If anyone is breaking the Laws of Identity, it’s the WiFi access points that don’t actively nudge users towards encrypting their WiFi network.

But let's hold on a minute.  My argument wasn't about the payload data that was collected accidently.  It was about the device identification data that was collected on purpose.  As Google's Alan Eustace put it: 

We said that while Google did collect publicly broadcast SSID information (the WiFi network name) and MAC addresses (the unique number given to a device like a WiFi router) using Street View cars, we did not collect payload data (information sent over the network). But it’s now clear that we have been mistakenly collecting samples of payload data…

Device identifiers were collected on purpose

SSID and MAC addresses are the identifiers of your devices.  They are transmitted as part of the WiFi traffic just like the payload data is.  And they are not “publically broadcast” any more than the payload data is. 

Yet Google consciously decided to abscond with, tabulate and monetize the identities of our personal, business and home devices.  The identifiers are persistent and last for the lifetime of the devices.  Their collection, cataloging and use is, in my view, more dangerous than the payload data that was collected. Why? The payload data, though deeply personal, is transient and represents a single instant.  The identifiers are persistent, and the Street View WiFi plan was to use them for years.  

Let's be clear:  Identity has as much to do with devices, software, services and organizations as with individuals.  And equally important, identity is about the relationships between these things.  In fact identity can only be adequately expressed through the relationships (some call it context).

When Google says, “MAC addresses are a simple hardware ID assigned by the manufacturer” and “We cannot identify an individual” using those “simple hardware IDs”,  it sounds like the devices found in your home and briefcase and pocket have nothing to do with you as a flesh and blood person.  Give me a break!  It reminds me of an old skit by “Beyond the Fringe” where a police inspector points out that “Once you have identified the criminal's face, the criminal's body is likely to be close by…”  Our identities and the identities of our devices are related, and understanding this relationship is essential to getting identity and privacy right.

One great thing about blogging is you find out when you haven't been clear enough.  I hope I'm making progress in expressing the real issues here:  the collection of device identifiers was purposeful, and this represents precisely the kind of “systemic privacy design flaw” to which Ben refers.  

It bothers me that this disturbing systemic privacy design flaw – for which there has been no apology – is being obscured through the widely publicized apology for a completely separate and apparently accidental sin.  

In contemporary networks, the hardware ID of the device is NOT intended to be a “universal identifier”.  It is intended to be a “unidirectional identifier” (see The Fourth Law) employed purely to map between a physical machine and a transient, local logical address.  Many people who read this blog understand why networking works this way.  In Street View WiFi, Google was consciously misusing this unidirectional identifier as a universal identifier, and misappropriating it by insinuating itself, as eavesdropper, into our network conversations.

Ben says, “The Laws of Identity apply predominantly to the systems that individuals choose to use to manage their data.”  But I hope he rethinks this in the context of what identity really is, its use in devices and systems, and the fact that human, device and service identities are tied together in what one day should be a trustworthy system.  I also hope to see Google apologize for its misuse of our device identities, and assure us they will not be used in any of their systems.

Finally, despite Ben's need to rethink this matter,  I do love his blog, and strongly agree with his comments on  Opera Mini, discussed in the same piece.


EPIC on Google WiFi eavesdropping

Readers have drawn our attention to a recent letter from EPIC's Marc Rotenberg to  FCC Chairman, Julius Genachowski.

In the detailed letter, Marc Rotenberg specifically calls attention to the mapping of private device identifiers, saying, “We understand that Google also downloaded and recorded a unique device ID, the MAC address, for wireless access devices as well as the SSID assigned by users.”

He argues:

The capture of Wi-Fi data in this manner by Google Street View could easily constitute a violation of Title III of the Omnibus Crime Control and Safe Streets Act of 1968, also known as the Wiretap Act, as amended by the Electronic Communications Privacy Act (ECPA) of 1986 to include electronic communications. Courts most oten define “interception” under ECPA as “acquisitions contemporaneous with transmission.” The Wiretap Act provides for civil liability and criminal penalties against any person who “intentionally intercepts, endeavors to intercept, or procures any other person to intercept or endeavor to intercept any… electronic communication [except as provided in the statute].”

The Wiretap Act imposes identical liability on any person who “intentionally discloses … to any other person the contents of any… electronic communication, knowing or having reason to know that the information was obtained through the interception of a[n] … electronic communication in violation of
this subsection,” or “intentionally uses … the contents of any… electronic communication, knowing or having reason to know that the information was obtained through the interception of a[n]… electronic communication in violation of this subsection.”

Full text (including many footnotes elided in the quote above) is available in pdf and Word format.  See also The Hill's technology blog.

The Laws of Identity smack Google

Alan Eustace, Google's Senior VP of Engineering & Research, blogged recently about Google's collection of Wi-Fi data using its Street View cars:

The engineering team at Google works hard to earn your trust—and we are acutely aware that we failed badly here. We are profoundly sorry for this error and are determined to learn all the lessons we can from our mistake.  

I think the idea of learning all the lessons he can from Google's mistake is a really good one, and I accept that Alan really is sorry.  But what constituted the mistake?

Last month Google was good enough to provide us with a “refresher FAQ” that dealt with the subject in a particularly specious way, even though it was remarkable in its condescension:

“What do you mean when you talk about WiFi network information?
“WiFi networks broadcast information that identifies the network and how that network operates. That includes SSID data (i.e. the network name) and MAC address (a unique number given to a device like a WiFi router).

“Networks also send information to other computers that are using the network, called payload data, but Google does not collect or store payload data.*

“But doesn’t this information identify people?
“MAC addresses are a simple hardware ID assigned by the manufacturer. And SSIDs are often just the name of the router manufacturer or ISP with numbers and letters added, though some people do also personalize them.

“However, we do not collect any information about householders, we cannot identify an individual from the location data Google collects via its Street View cars.

“Is it, as the German DPA states, illegal to collect WiFi network information?
“We do not believe it is illegal–this is all publicly broadcast information which is accessible to anyone with a WiFi-enabled device…

Let's start with the last point. Is information that can be collected using a WiFi device actually being “broadcast”?  Or is it being transmitted for a specific purpose and private use?  If everything is deemed to be “broadcast” simply by virtue of being a signal that can be received, then surely payload data – people's surfing behavior, emails and chat – is also being “broadcast”.  Once the notion of “broadcast” is accepted, the FAQ implies there can be no possible objection to collecting it.

But Alan's recent post says, “it’s now clear that we have been mistakenly collecting samples of payload data from open (i.e. non-password-protected) WiFi networks.”  He adds, “We want to delete this data as soon as possible…”  What is the mistake?  Does Alan mean Google has now accepted that WiFi information is not by definition being “broadcast” for its use?  Or does Alan see the mistake as being the fact they created a PR disaster?  I think “learning everything we can” means learning that the initial premises of the Street View WiFi system were wrong (and the behavior perhaps even illegal) because the system collected WiFi information that was intended to be used for private purposes and not intended to include Google.  

The FAQ claims – and this is disturbing – that the information collected about network identifiers “doesn't identify people”.  The fact is that it identifies devices that are closely associated with people – including their personal computers and phones.  MAC addresses are persistent, remaining constant over the lifetime of the device.  They are identifiers that are extremely reliable in establishing identity by virtue of being in peoples’ pockets or briefcases.

As a result, Google breaks two Laws of Identity in one go with their Street View boondoggle, 

Google breaks Law 3, the Law of  Justifiable Parties.

Digital identity systems must limit disclosure of identifying information to parties having a necessary and justifiable place in a given identity relationship

Google is not part of the transactions between my network devices and is not justified in intervening or recording the details of their use and relationship. 

Google also breaks Law 4, Directed Identity:

A universal identity metasystem must support both “omnidirectional” identifiers for use by public entities and “unidirectional” identifiers for private entities, thus facilitating discovery while preventing unnecessary release of correlation handles.

My network devices are private entities intended for use in the contexts for which I authorize them.  My home network is a part of my home, and Google (or any other company) has not been invited to employ that network for its own purposes.  The identifiers in use there are contextually specific, not public, and not intended to be shared across all contexts.  They are more private than the IP addresses used in TCP/IP, since they are not shared across end-points in different networks.  The same applies to SSIDs.

One can stand in the street, point a directional microphone at a window and record the conversations inside.  This doesn't make them public or give anyone the right to use the conversations for commercial purposes.  The same applies to recording the information we exchange using digital media – including our identifiers, SSIDs and MAC addresses.  It is particularly disingenuous to argue that because information is not encrypted it doesn't belong to anyone and there are no rights associated with it.  If lack of encryption meant information is fair game a lot of Google's own intellectual property would be up for grabs,

Google's justification for collecting MAC addresses was that if a stranger walked down your street, the MAC addresses of your computers and routers could be used provide his systems (or Googles’?)  with information on where he was.  The idea that Google would, without our consent, employ our home networks for its own commercial purposes betrays a problem of ethics and a lack of control.  Let's hope this is what Alan means when he says,

“Given the concerns raised, we have decided that it’s best to stop our Street View cars collecting WiFi network data entirely.”

I know there are many people inside Google who will recognize that these problems represent more than a “mistake” – there is clearly the need for a much deeper understanding of identity and privacy within the engineering and business staff.   I hope this will be the outcome.  The Laws of Identity are a harsh teacher, and it's sad to see the Street View technology sullied by privacy catastrophes.

Meanwhile, there is one more lesson for the rest of us.  We tend to be cavalier in pooh poohing the idea that commercial interests would actually abuse our networks and digital privacy in fundamental ways.  This episode demonstrates how naive that is.  We need to strengthen the networking infrastructure, and protect it from misuse by commercial interests as well as criminals.  We need clear legislation that serves as a disincentive to commercial interests contemplating privacy-invasive use of technology.  And on a technical note, we need to fix the problems of static MAC addresses precisely because they are strong personal identifiers that ultimately will be used to target individuals physically as criminals begin to understand their possible uses.