Alan Eustace, Google's Senior VP of Engineering & Research, blogged recently about Google's collection of Wi-Fi data using its Street View cars:
The engineering team at Google works hard to earn your trust—and we are acutely aware that we failed badly here. We are profoundly sorry for this error and are determined to learn all the lessons we can from our mistake.
I think the idea of learning all the lessons he can from Google's mistake is a really good one, and I accept that Alan really is sorry. But what constituted the mistake?
Last month Google was good enough to provide us with a “refresher FAQ” that dealt with the subject in a particularly specious way, and that was remarkable in its condescension:
“What do you mean when you talk about WiFi network information?
“WiFi networks broadcast information that identifies the network and how that network operates. That includes SSID data (i.e. the network name) and MAC address (a unique number given to a device like a WiFi router).
“Networks also send information to other computers that are using the network, called payload data, but Google does not collect or store payload data.*
“But doesn’t this information identify people?
“MAC addresses are a simple hardware ID assigned by the manufacturer. And SSIDs are often just the name of the router manufacturer or ISP with numbers and letters added, though some people do also personalize them.
“However, we do not collect any information about householders, we cannot identify an individual from the location data Google collects via its Street View cars.
“Is it, as the German DPA states, illegal to collect WiFi network information?
“We do not believe it is illegal–this is all publicly broadcast information which is accessible to anyone with a WiFi-enabled device…”
Let's start with the last point. Is information that can be collected using a WiFi device actually being “broadcast”? Or is it being transmitted for a specific purpose and private use? If everything is deemed to be “broadcast” simply by virtue of being a signal that can be received, then surely payload data – people's surfing behavior, emails and chat – is also being “broadcast”. Once the notion of “broadcast” is accepted, the FAQ implies there can be no possible objection to collecting it.
But Alan's recent post says, “it’s now clear that we have been mistakenly collecting samples of payload data from open (i.e. non-password-protected) WiFi networks.” He adds, “We want to delete this data as soon as possible…” What is the mistake? Does Alan mean Google has now accepted that WiFi information is not by definition being “broadcast” for its use? Or does Alan see the mistake as being the fact that they created a PR disaster? I think learning “all the lessons we can” means learning that the initial premises of the Street View WiFi system were wrong (and the behavior perhaps even illegal) because the system collected WiFi information that was intended for private use, not for use by Google.
The FAQ claims – and this is disturbing – that the information collected about network identifiers “doesn't identify people”. The fact is that it identifies devices that are closely associated with people – including their personal computers and phones. MAC addresses are persistent, remaining constant over the lifetime of the device. They are identifiers that are extremely reliable in establishing identity by virtue of being in people’s pockets or briefcases.
As a result, Google breaks two Laws of Identity in one go with its Street View boondoggle.
Google breaks Law 3, the Law of Justifiable Parties:
Digital identity systems must limit disclosure of identifying information to parties having a necessary and justifiable place in a given identity relationship.
Google is not part of the transactions between my network devices and is not justified in intervening or recording the details of their use and relationship.
Google also breaks Law 4, Directed Identity:
A universal identity metasystem must support both “omnidirectional” identifiers for use by public entities and “unidirectional” identifiers for private entities, thus facilitating discovery while preventing unnecessary release of correlation handles.
My network devices are private entities intended for use in the contexts for which I authorize them. My home network is a part of my home, and Google (or any other company) has not been invited to employ that network for its own purposes. The identifiers in use there are contextually specific, not public, and not intended to be shared across all contexts. They are more private than the IP addresses used in TCP/IP, since they are not shared across end-points in different networks. The same applies to SSIDs.
One can stand in the street, point a directional microphone at a window and record the conversations inside. This doesn't make them public or give anyone the right to use the conversations for commercial purposes. The same applies to recording the information we exchange using digital media – including our identifiers, SSIDs and MAC addresses. It is particularly disingenuous to argue that because information is not encrypted it doesn't belong to anyone and there are no rights associated with it. If lack of encryption meant information was fair game, a lot of Google's own intellectual property would be up for grabs.
Google's justification for collecting MAC addresses was that if a stranger walked down your street, the MAC addresses of your computers and routers could be used to provide his systems (or Google's?) with information on where he was. The idea that Google would, without our consent, employ our home networks for its own commercial purposes betrays a problem of ethics and a lack of control. Let's hope this is what Alan means when he says,
“Given the concerns raised, we have decided that it’s best to stop our Street View cars collecting WiFi network data entirely.”
I know there are many people inside Google who will recognize that these problems represent more than a “mistake” – there is clearly the need for a much deeper understanding of identity and privacy within the engineering and business staff. I hope this will be the outcome. The Laws of Identity are a harsh teacher, and it's sad to see the Street View technology sullied by privacy catastrophes.
Meanwhile, there is one more lesson for the rest of us. We tend to be cavalier in pooh-poohing the idea that commercial interests would actually abuse our networks and digital privacy in fundamental ways. This episode demonstrates how naive that is. We need to strengthen the networking infrastructure, and protect it from misuse by commercial interests as well as criminals. We need clear legislation that serves as a disincentive to commercial interests contemplating privacy-invasive use of technology. And on a technical note, we need to fix the problems of static MAC addresses precisely because they are strong personal identifiers that ultimately will be used to target individuals physically as criminals begin to understand their possible uses.
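To make the static MAC address point a little more concrete, here is a minimal sketch of one possible mitigation: having a device present a randomly generated address instead of the one assigned by the manufacturer. It relies only on the standard IEEE 802 convention that the two lowest bits of the first octet mark an address as multicast and as locally administered. The function names are hypothetical, chosen for illustration, and the sketch only generates and inspects address strings; actually applying an address to a network interface is operating-system specific and is not shown.

```python
import secrets


def random_private_mac() -> str:
    """Return a random unicast, locally administered MAC address (hypothetical helper)."""
    octets = bytearray(secrets.token_bytes(6))
    # First octet: set bit 1 (locally administered), clear bit 0 (unicast, not multicast).
    octets[0] = (octets[0] | 0x02) & 0xFE
    return ":".join(f"{b:02x}" for b in octets)


def is_locally_administered(mac: str) -> bool:
    """True if the address is marked as locally assigned rather than manufacturer assigned."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x02)


def is_unicast(mac: str) -> bool:
    """True if the address identifies a single station rather than a multicast group."""
    first_octet = int(mac.split(":")[0], 16)
    return not (first_octet & 0x01)


if __name__ == "__main__":
    # A fresh address per network (or per session) gives an observer nothing
    # stable to correlate across locations.
    print(random_private_mac())
```

An address produced this way carries no manufacturer prefix and, if regenerated for each network, stops being a correlation handle that follows the device from place to place.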