Trip down memory lane

Joe Mansfield's comment that Bluetooth “doesn’t appear to be all that bad from a privacy leakage perspective” left me rummaging through memory lane – awakening memories that may help explain why I now believe that world-wide databases of MAC addresses constitute a central socio-technical problem of our time.

I was taken back to an unforgettable experience I had in 2005 while working on the Laws of Identity.  I had finished the Fourth Law and understood theoretically why technical systems should use “unidirectional identifiers” (meaning identifiers limited to a defined context) rather than “universal identifiers” (things like social security numbers) unless the goal was to be completely public.  But there is a difference between understanding something theoretically and right in the gut.

Rather than retell the story, here is what I wrote on my blog in Just a few scanning machines on Tuesday 6 September 2005:

Since I seem to be on the subject of Bluetooth again, I want to tell you about an experience I had recently that put a gnarly visceral edge on my opposition to technologies that serve as tracking beacons for us as private individuals.

I was having lunch in San Diego with Paul Trevithick, Stefan Brands and Mary Rundle. Everyone knows Paul for his work with Social Physics and the Berkman identity wiki; Stefan is a tremendously innovative privacy cryptographer; and Mary is pushing the envelope on cyber law with Berkman and Stanford.

Suddenly Mary recalled the closing plenary at the Computers, Freedom and PrivacyPanopticon Conference” in Seattle.

She referred off-handedly to “the presentation where they flashed a slide tracking your whereabouts throughout the conference using your Bluetooth phone.”

Essentially I was flabbergasted. I had missed the final plenary, and had no idea this had happened.

MAC Name Room Time Talk
Kim Cameron Mobile
00:09:2D:02:9A:68
Grand I (G1) Wed 09:32 09:32 ????
Grand Crescent (gc) Wed 09:35 09:35 Adware and Privacy: Finding a Common Ground
Grand I (G1) Wed 09:37 09:37 ????
Grand Crescent (gc) Wed 09:41 09:42 Adware and Privacy: Finding a Common Ground
Grand I (G1) Wed 09:46 09:47 ????
Grand III (g3) Wed 10:18 10:30 Intelligent Video Surveillance
Baker (ol) Wed 10:33 10:42 Reforming E-mail and Digital Telephonic Privacy
Grand III (g3) Wed 10:47 10:48 Intelligent Video Surveillance
Grand Crescent (gc) Wed 11:25 11:26 Adware and Privacy: Finding a Common Ground
Grand III (g3) Wed 11:46 12:22 Intelligent Video Surveillance
5th Avenue (5a) Wed 12:33 12:55 ????
Grand III (g3) Wed 13:08 14:34 Plenary: Government CPOs: Are they worth fighting for?

Of course, to some extent I'm a public figure when it comes to identity matters, and tracking my participation at a privacy conference is, I suspect, fair game. Or at any rate, it's good theatre, and drives home the message of the Fourth Law, which makes the point that private individuals must not be subjected – without their knowledge or against their will – to technologies that create tracking beacons.

A picture named kim_cameron.JPGLater Mary introduced me to Paul Holman from The Shmoo Group. He was the person who had put this presentation together, and given our mutual friends I don't doubt his motives. In fact, I look forward to meeting him in person.

He told me:

“I take it you missed our quick presentation, but essentially, we just put Bluetooth scanning machines in a few of the conference rooms and had them log the devices they saw. This was a pretty unsophisticated exercise, showing only devices in discoverable mode. To get them all would be a lot more work. You could do the same kind of thing just monitoring for cell phones or WiFi devices or whatever. We were trying to illustrate a crude version of what will be possible with RFIDs.”

The Bluetooth tracking was tied in to the conference session titles, and by clicking on a link you could see the information represented graphically – including my escape to a conference center window so I could take a phone call.

Anyway, I think I have had a foretaste of how people will feel when networks of billboards and posters start tracking their locations and behaviors. They won't like it one bit. They'll push back.

A foretaste indeed

One of my readers wrote to say I should turn my Bluetooth broadcast off, and I responded:

You’re right, and I have turned it off. Which bothers me. Because I like some of the convenience I used to enjoy.

So I write about this because I’d rather leave my Bluetooth phone enabled, interacting only with devices run by entities I’ve told it to cooperate with.

We have a lot of work to do to get things to this point. I see our work on identity as being directed to that end, at least in part.

We need to be able to easily express and select the relationships we want to participate in – and avoid – as cyberspace progressively penetrates the world of physical things.

The problems of Bluetooth all exist in current Wifi too. My portable computer broadcasts another tracking beacon. I’m not picking on Bluetooth versus other technologies. Incredibly, they all need to be fixed. They’re all misdesigned.

If anything has shocked me while working on the Laws of Identity, it has been the discovery of how naive we’ve been in the design of these systems to date – a product of our failure to understand the Fourth Law of Identity. The potential for abuse of these systems is collosal – enterprises like the UK’s Filter are just the most benign tip of an ugly iceberg.

For everyone’s sake I try to refrain from filling in what the underside of this iceberg might look like

Google's Street View group, which has been assembling a massive central registry of WiFi MAC addresses, has definitely crawled out from under this iceberg, and the project is more sinister than any I imagined only a few years ago.

But so as not to leave everyone feeling completely depressed, all the dreams of Billboards that recognize you from your Bluetooth phone have now been abandoned by Bluetooth manufacturers, and the specification has been greatly improved in light of the criticism it received.  Let's hope that geo-location providers, and Google in particular, see the same light, and assure us they will no longer collect or store the MAC address of any device unless that collection is approved by the subscriber.

Does the non-content trump the content?

In my previous post I referred to an interesting Wired story in which former U.S. federal prosecutor Paul Ohm says Google “likely” breached a U.S. federal criminal statute by intercepting the metadata and address information on residential and business WiFi networks.  The statute refers to a “pen register” – an electronic device that records all numbers dialed from a particular telephone line.  Wikipedia tells us the term has come to include any device or program that performs similar functions to an original pen register, including programs monitoring Internet communications.”  The story continues:

“I think it’s likely they committed a criminal misdemeanor of the Pen Register and Trap and Traces Device Act,” said Ohm, a prosecutor from 2001 to 2005 in the Justice Department’s Computer Crime and Intellectual Property Section. “For every packet they intercepted, not only did they get the content, they also have your IP address and destination IP address that they intercepted. The e-mail message from you to somebody else, the ‘to’ and ‘from’ line is also intercepted.”

“This is a huge irony, that this might come down to the non-content they acquired,” (.pdf) said Ohm, a professor at the University of Colorado School of Law.

I understand how people unacquainted with the emerging role of identity in the Internet can see this as an irony – a kind of side-effect – whereas in reality Google's plan to establish a vast centralized database of device identifiers has much longer-term consequences than the misappropriation of content.  Metadata is no less important than other data –  and “addresses” being referred to are really device identifiers clearly associated with individual users, much like the telephone numbers to which the statute applies.  Given the similarity to issues that arose with pre-Internet communication, we should perhaps not be surprised that there may already be regulation in place that prevents “registering” of the identifiers.

The Wired article continues:

Google said it was a coding error that led it to sniff as much as 600 gigabytes of data across dozens of countries as it was snapping photos for its Street View project. The data likely included webpages users visited and pieces of e-mail, video and document files…

The pen register act described by Ohm, which he said is rarely prosecuted, is usually thought of in terms of preventing unauthorized monitoring of outbound and inbound telephone numbers.

Violations are a misdemeanor and cannot be prosecuted by private lawyers in civil court, Ohm said. He said the act requires that Google “knew, or should have known” of the activity in question.

Google denies any wrongdoing.

In fact, Google knew about the collection of MAC addresses, and has never said otherwise or stated that their collection of these addresses was done accidently.  In fact they have been careful to never state explicitly that their collection was limited to Wireless Access Points.  The Gstumbler report makes it clear they were parsing and recording both the source and destination MAC addresses in all the WiFi frames they intercepted. 

The Wired article explains:

As far as a criminal court goes, it is not considered wiretapping “to intercept or access an electronic communication made through an electronic communication system that is configured so that such electronic communication is readily accessible to the general public.”

It is not known how many non-password-protected Wi-Fi networks there are in the United States.

What makes this especially interesting is the fact that it is not possible to configure a WiFi network so that the MAC addresses are hidden.  Use of passwords protects the communication content carried by the network, but does not protect the MAC addresses.  Configuring the WIreless Access Point not to broadcast an SSID does not prevent eavesdropping on MAC addresses either.   Yet we can hardly say the metadata is readily accessible to the general public, since it cannot be detected except acquiring and using very specialized programs. 

Wired draws the conclusion that,  “The U.S. courts have not clearly addressed the issue involved in the Google flap.”

 

Title 18 – Part II – Chapter 206 – § 3121

Former federal prosecutor Paul Ohm says Google “likely” breached a U.S. federal criminal statute in connection with its accidental Wi-Fi sniffing — but not for siphoning private data from internet surfers using unsecured networks.

Instead, Mr. Ohm  thinks Google might have breached the Pen Register and Trap and Traces Device Act for intercepting the metadata and address information alongside the content.

According to Wikipedia, a “pen register is an electronic device that records all numbers dialed from a particular telephone line. The term has come to include any device or program that performs similar functions to an original pen register, including programs monitoring Internet communications.”

I'll expand on the identity implications in my next post, but to prepare the discussion, here is the statute to which Mr. Ohm is referring:

Title 18 – Part II – Chapter 206 – § 3121

  1. In General.— Except as provided in this section, no person may install or use a pen register or a trap and trace device without first obtaining a court order under section 3123 of this title or under the Foreign Intelligence Surveillance Act of 1978 (50 U.S.C. 1801 et seq.).
  2. Exception.— The prohibition of subsection (a) does not apply with respect to the use of a pen register or a trap and trace device by a provider of electronic or wire communication service—
    1. relating to the operation, maintenance, and testing of a wire or electronic communication service or to the protection of the rights or property of such provider, or to the protection of users of that service from abuse of service or unlawful use of service; or
    2. to record the fact that a wire or electronic communication was initiated or completed in order to protect such provider, another provider furnishing service toward the completion of the wire communication, or a user of that service, from fraudulent, unlawful or abusive use of service; or
    3. where the consent of the user of that service has been obtained.
  3. Limitation.— A government agency authorized to install and use a pen register or trap and trace device under this chapter or under State law shall use technology reasonably available to it that restricts the recording or decoding of electronic or other impulses to the dialing, routing, addressing, and signaling information utilized in the processing and transmitting of wire or electronic communications so as not to include the contents of any wire or electronic communications.
  4. Penalty.— Whoever knowingly violates subsection (a) shall be fined under this title or imprisoned not more than one year, or both.

Conor changes his mind

Conor Cahill has taken a look at the Gstumbler report.  His conclusion is:

Given this new information I would have to agree that Google has clearly stepped into the arena of doing something that could be detrimental to the user's privacy.

Conor explains that, “the information in the report is quite different than the information that had been published at the time I expressed my opinions on the events at hand.”

He argues:

  1. “We had been led to believe that Google had only captured data on open wireless networks (networks that broadcast their SSIDs and/or were unencrypted). The analysis of the software shows that to be incorrect — Google captured data on every network regardless of the state of openness. So no matter what the user did to try to protect their network, Google captured data that the underlying protocols required to be transmitted in the clear.
  2. “We had been led to believe that Google had only captured data from wireless access points (APs). Again the analysis shows that this was incorrect — Google captured data on any device for which it was able to capture the wireless traffic for (AP or user device). So portable devices that were currently transmitting as the Street View vehicle passed would have their data captured.”

Anyone who knows Conor knows he is a gentlemanly model of how people should behave towards each other in our industry.  I understand his position fully, and respect it.  He says:

[Kim] seems to have a particular fondness for the phrase “wrong,” “completely wrong,” and “wishful thinking” when referring to my comments on the topic.  In my defense, I will say that there was no “wishful thinking” going on in my mind. I was just examining the published information rather than jumping to conclusions — something that I will always advocate. In this case, after examining the published report, it does appear that those who jumped to conclusions happened to be closer to the mark, but I still think they were wrong to jump to those conclusions until the actual facts had been published.

I can't disagree that Google's public relations messages may well have been crafted to leave the impression that their wireless eavesdropping was only directed at network access points.  But if you read them extremely carefully you see they refrain from making any such claims. 

At any rate, Conor needs no defense and I accept his point.  People who took the view that Google couldn't possibly have been doing what I claimed were acting based on the messages the company conveyed.  Sadly, if people of Conor's undisputed technical sophistication are misled by this kind of public relations campaign, the crafting of the information might also be considered suspect.

[More of Conor's post here]

 

Why location services have to be done right…

uTest describes itself as the world's largest marketplace for software testing services. Recently it held a Bug Battle to test the web and mobile applications of the leading “check-in” location services. A Bug Battle is a quarterly app testing competition, where “software professionals from around the world compete to find bugs and rank today's popular applications” (previous Bug Battles have focused on browsers, search engines, social networking sites, etc.

When evaluating location-based check-in services, testers were given ten days in May to report the most interesting and severe bugs, and to rank these applications based on

  • geo-location accuracy,
  • social media integration,
  • friend connectivity,
  • status recognition features and
  • ease-of use

uTest offered nearly $4,000 in prize money to those who submitted the best bugs for feedback.
The results of the battle, which rated Foursquare, Gowalla and Brightkite, are detailed here.

The report includes comments by people who clearly love the service. For example:

“The Gowalla app and web interface themselves are easy on the eyes, and venues get their own snazzy icon depicting what type of establishment it is. I feed my Gowalla check-ins to Facebook, and having an image that catches attention in a cluttered news feed matters. The user can see everyone who has checked in at a particular venue and how many times.”

But one clear outcome is that many testers reported serious bugs related to privacy and security – a category not present in the original list:

“The impact of check-in services on personal privacy and security took on a prominent role in this study. 80% of respondents responded “Yes” when asked if they were concerned about how location-based check-in services could impact their personal privacy and safety. Nearly half of respondents (49%) chose “privacy and security concerns” as the top reason they do not use check-in services more often.”

VentureBeat, which wrote about the report, concludes:

In addition to appreciating easy-to-use services and bemoaning the lack of Frappuccino deals, the testers seem to be concerned about the privacy and security implications of check-in services in general. 49% of testers said privacy and security concerns were the top reason they don’t use check-in services more often. This is something the check-in services need to address if they want to avoid privacy flames like the ones Facebook is constantly fighting.

These services can be built so as to respect and enhance privacy.  Things like giant world databases linking our devices to our home locations don't help convince anyone that we are doing that.

Claims based identity tops IT security concerns

John Fontana has crossed the floor, moving from the critics box to the stage.  

Many people will remember John's articles at Network World, where he served as one of the most talented and informed journalists in the world writing about network infrastructure for as long as I can remember. 

While it is sad to lose him in his role as journalist, I really look forward to what he will do at his new digs: Ping.  

I know his ability to explain identity and its issues and technology in words people can actually understand will benefit the whole identity ecosystem.  Kudos to Andre Durand and Ping for having the wisdom to bring him over.

Meanwhile, there's exciting news in this post from the new John: 

If I could say it better than my former Network World colleague Ellen Messmer I would, but I can’t so I’m just going to link to her story on Gartner’s survey that shows identity management projects rank first in the top five priorities for IT's security spending.
The results of the survey come from interviews with IT professionals at 308 companies.

But let me highlight two paragraphs from Ellen's story:

“Identity management appears to be taking the lead as a top priority as businesses look to deploy some of the more advanced federated identity technologies both within the enterprise for single sign-on and as a way to potentially extend identity-based access control into cloud-computing environments.”

And this one:

“But in terms of firewalls as a priority, [Gartner] notes that there's a movement to install next-generation firewalls.”

On that last point, check this link to From Firewall to IdentityWall.

[John is on Twitter where he also puts together a Tweet list]

Rethink things in light of Google's Gstumbler report

A number of technical people have given Google the benefit of the doubt in the Street View Wifi case and as a result published information that Google's new “Gstumbler” report shows is completely incorrect.  It is important that people re-evaluate what they are saying in light of this report. 

I'll pick on Conor's recent posting on our discussion as an example – it contains a number of statements and implies a number of things explicitly contradicted by Google's new report.  Once he reads the report and applies the logic he has put forward, logic will require Conor to change his conclusions.

Conor begins with a bunch of statements that are true:

  • MAC addresses typically are persistent identifiers that by the definition of the protocols used in wireless APs can't be hidden from snoopers, even if you turn on encryption.
  • By themselves, MAC addresses are not all that useful except to communicate with a local network entity (so you need to be nearby on the same local network to use them.
  • When you combine MAC addresses with other information (locality, user identity, etc.) you can be creating worrisome data aggregations that when exposed publicly could have a detrimental impact on a user's privacy.
  • SSIDs have some of these properties as well, though the protocol clearly gives the user control over whether or not to broadcast (publicize) their SSID. The choice of the SSID value can have a substantial impact on it's use as a privacy invading value — a generic value such as “home” or “linksys” is much less likely to be a privacy issue than “ConorCahillsHomeAP”.

Wishful thinking and completely wrong

 These are followed by a statement that is just plain wishful thinking.  Conor continues:

  • Google purposely collected SSID and MAC Addresses from APs which were configured in SSID broadcast mode and inadvertently collected some network traffic data from those same APs. Google did not collect information from APs configured to not broadcast SSIDs.

Google's report says Conor is wrong about this, explicitly saying in paragraph 26, “Kismet can also detect the existence of networks with non-broadcast SSIDs, and will capture, parse, and record data from such networks“.   Conor continues:

  • Google associated the SSID and MAC information with some location information (probably the GPS vehicle location at the time the AP signal was strongest).

This is true, but it is important to indicate that this was not limited to access points.  Google's report says that it recorded the association between the MAC address and geographic location of all the active devices on the network.  When it did this, the MAC addresses became, according to Conor's own earlier definition, “worrisome data aggregations”.

  • There is no AP protocol defined means to differentiate between open wireless hotspots and closed hotspots which broadcast their SSIDs. 

This is true, but Google's report indicates this would not have mattered – it collected MACs regardless of whether SSIDs were broadcast.

  • I have not found out if Google used the encryption status of the APs in its decision about recording the SSID/MAC information for the AP.

Google's report indicates it did not.  It only used that status to decide whether or not to record the payload – and only recorded the payload of unencrypted frames…

I like Conor's logic that, “When you combine MAC addresses with other information (locality, user identity, etc.) you can be creating worrisome data aggregations that when exposed publicly could have a detrimental impact on a user's privacy.”   I urge Conor to read the Gstumbler report.  Once he knows what was actually happening, I hope he'll tell the world about it.

 

Gstumbler tells all

The third party commissioned by Google to review the software used in its Street View WiFi cars has completed its report, called Source Code Analysis of ‘Gstumbler’.  I will resist commenting on the name, since Google did the right thing in publishing the report:  there will no longer be any ambiguity about what was being collected. 

As we have discussed over the last week, two issues are of importance – collection of device identity data, and collection of payload data.  One thing I like about te report is that it has a begins with a a number of technical “descriptions and definitions”.  For example, in paragraph 7 it explains enveloping:

“Each packet is comprised of a packet header which contains network administrative information and the addressing information (or “envelope” information) necessary to transmit the data packet from one device to another along the path to its final destination.  Each packet also contains a “payload” which is a fragment of the “content” of the communication or data transmission sent or received over the internet…”

It explains that in 802.11 packets are encapsulated in frames, describes the types of frames and presents the standard diagram showing how a frame is structured.

Readers should understand that when network encryption is turned on, it is only the Frame Body (Payload) of data frames that is encrypted.

In paragraph 19, the report provides an overview of its findings:

“While running in memory, the program parses frame header information, such as frame type, MAC addresses, and other network administrative data from each of the captured frames.  The parsing separates the information into discreet fields for easier analysis… All available MAC addresses contained in a frame are also parsed.  All of this parsed header information is written to disk for frames transmitted over both encrypted and unencrypted wireless networks [emphasis mine – Kim].”

In paragraph 20, the report explains that the software discards the content of encrypted bodies (which of course it can't analyse anyway) whereas unencrypted bodies are also written to disk.  I have not discussed the issue of collecting the frame bodies in these pages – there is no need to do so since it is intuitively easy for people to understand what it means to collect payloads.

In paragraph 22 the report concludes that “all wireless frame data was recorded except for the bodies of 802.11 Data frames from encypted networks.” 

All device identifiers were recorded

As a result, there is no longer any question.  The MAC addresses of all the WiFi laptops and phones in the homes, businesses, enterprises and government buildings were recorded by the driveby mapping cars, as were the wireless access points, and this regardless of the use of encryption. 

My one quibble with the otherwise excellent report is that it calls the MAC addresses “network administrative data”.  In fact they are the device identifiers of the network devices – both of the network access point and the devices connecting to that access point – phones and laptops.

It is also worth, given some of the previous conversations about supposed “broadcasting”, drawing attention to paragraph 26, which explains,

“Kismet captures wireless frames using wireless network interface cards set to monitoring mode.  The use of monitoring mode means that Kismet directs the wireless hardware to listen for and process all wireless traffic regardless of its intended destination… Through the use of passive packet sniffing, Kismet can also detect the existence of netwrks with non-broadcast SSIDs, and will capture, parse, and record data from such networks.”

 

The core of the matter at hand

We've explored many of the basic issues of WiFi snooping.  I would now like to go directly to the core of the matter: why do large centralized databases of MAC addresses linked to our street addresses have really serious consequences for peoples’ privacy?  I'd like to approach this through an example:

Consider the case of someone attending a conference at which people are using laptops and phones over a wireless network.  We picture the devices within range of a given attendee in Figure 1:

The green dot represents the WiFi access point through which conference attendees gain access to the Internet.  For now, let's assume this is a permanent WiFi network.  Let's therefore assume its MAC address and location are present within the linking database that also contains our residential MAC to street address mapping.

Now suppose one or more people at the conference have opted into a geo-location service that makes use of the database.  And let's assume that the way this service works is to listen for nearby MAC addresses (all the little circles in the figure) and submit them to the geo-location system for analysis.

The geo-location system will learn that the opted-in user (let's call him Red) is near the given WiFi point, and thus will know Red is at a given location.  If the geo-location system is also capable of searching the web (as one would expect that Google's could), it will also be able to infer that Red is in a given hotel, and that the hotel is hosting a conference C on the date in question. 

If Red stays in the same location for some time, and is surrounded by a number of other people who are in the same location (discernable because their MAC addresses continue to be near by), the smart service will be able to infer that Red is attending conference C being held in the hotel. 

So far, there's nothing wrong with this, since Red has opted in to the geo-location service, and presumably been told that's how it works.

However, note that the geo-location system also learns about the MAC addresses of all the attendees within range who have NOT opted into the system (Green).  And if they remain within range over time, it can also deduce that they too are present at conference C.  Further, it can look up their MAC addresses in the database to discover their street addresses.  This in turn can be used to make many inferences about who the attendees at the conference are, since a lot of information is keyed to their street addresses.  That can itself become further profile information.

Opting out doesn't help

The problem here is this:  The geo-location system is perfectly capable of tracking your location and associating it with your home street address whether you opt in or not.  Home address is a key to many aspects of your identity.  Presto – you have linked many aspects of your identity to your location, and this becomes intellectual property that the geo-location can service sell and benefit from in a myriad of ways.

Is this the way any particular geo-location services would actually work?  I have no idea.  But that's not the point.  The point is that this is the capability one enables by building the giant central database of laptop and phone MAC addresses linked to street addresses.

Commercial interest will naturally tend towards maximal use of these capabilities and the information at hand. 

That is why we need to fully understand the implications of wirelesstapping on a massive scale and figure out if and where we want to draw the line.  How does the collection of MAC addresses using WiFi trucks relate to the regulations involving data collection, proportionality and consent?  Are there limits on the usage of this data? 

One thing for sure.  Breaking the Fourth Law, and turning a unidirectional identifier into a universal identifier is like the story of the Sorcerer's Apprentice.  All the brooms have started dancing.  I wonder if Mickey will get out of this one?

 

More input and points of view

Dave Nikolejsin, CIO of British Columbia and a man who sees identity as the key to efficient government, writes:

“I agree with your comments and focus on the MAC layer data collection going on with Google. One observation I would have re all the “other” similar type activity would be that no others have Google’s resources and thus no others are doing systematic sweep of the western world on such a data gathering mission. As we all know the value of data increases in an N-squared manner and the “N” once Google is done will be a big number.”

Dave goes on to compare wirelesstapping with Facebook's privacy problems and makes what I think is a very insightful comment:

At least (for all its warts) we actually willingly give our info to Facebook!

Heavy duty SOA architect Gunnar Peterson (an expert on Service-Oriented Security) condenses our discussion to date and comes out strongly in favor of the arguments I've been making with regards to the wirelesstapping of MAC addresses: 

Google's Macondo Street View team cannot seem to get the right combination of top kill or cap to fit on its MAC spillage. Your MAC is not like a house number (which everyone knows and are used for many purposes), MAC address is scoped to one use. There's no harm in collecting MACs, the hell you say, there's a number of evil emergent cocktails that can come out of this. Its not so much the MAC itself, its the association of the MAC and the gelocation and time – combining something unique like MAC with geolocation.

This looked like a rogue team (or as Google put it last week a “rogue software engineer“) until this shocking announcement that Google is patenting (emphasis added) – “The invention pertains to location approximation of devices, e.g., wireless access points and client devices in a wireless network.”

It seems pretty obvious that any number of permutations of problems  will result by combining private client data and geolocation. Maybe Google books should scan a copy of J.C. Cannon's book “Privacy: What Developers and IT Professionals Should Know” and Stefan Brands Primer on User Identification.  In both works you see the risks of promiscuously mixing identification cocktails and the unexpected leakages that result. In addition, what does benefit to the user who is being spied upon does all this spying create…?

[More here]

It's true that J.C. and Stefan both do a great job of helping explain the issues at play here, and I advise people who want to understand the issues better to check out their work.

Hal Berenson got back to me after my response to him, saying,

One thing I would suggest is that you write language attempting to ban what you want to ban and let the rest of us poke holes in it  – meaning, show all the legitimate scenarios you would make illegal or accidental criminals you would create… 🙂

I have total sympathy with this concern of course.  We want the minimum intervention necessary.  But our society has come to realize there are many instances where consumers and citizens need be protected in various ways.  In introducing these protections, lawmakers had to deal with exactly the same difficult concerns about balancing rights.  The good news here:  our legal system seems to handle this just fine.  In fact, that's what it is about.   So I will leave the crafting of the appropriate disincentives to professionals.

Ted Howard, who has broad experience including in the Games industry, commented,

Regarding the issues of Google's collection of MAC addresses and wireless SSIDs:

“If I leave my blinds open so that I can get sun into my home, that means I have no problem with anyone walking past my house pausing to watch me in my home. When I talk with a friend while walking in the park, that means I cannot be bothered by a stranger walking alongside us listening to every word. In both cases, am I “publicly broadcasting” something or is the broadcast just a side effect of my activities? The analogy should be clear.

 

“Do a survey to see how many wireless router owners understand what MAC and SSID are. I suspect that very few (< 5%) people know. If they don't know what these are, then how can anyone claim that these people have been intentionally publicly broadcasting these with an understanding that the broadcast has become publicly-available knowledge? Government regulation exists to, among other reasons, protect the public when the public needs protection and would otherwise be unprotected. This seems to me like a good case for protecting the truly ignorant public.

Journalist Mary Branscombe comments:

For me there are 3 big questions.

  1. How much info did or could someone capture; and by the way Google is in the data capture and data mining/machine learning business and that data *has* been used because if they didn't know it was there they didn't know they had to exclude it.
  2. How personally identifiable or anonymised that information is (i don't think my phone has a bunch of captured MAC addresses and if it does I don't think it has any pii about them but I may be wrong.
  3. How much people care. What is public or private about my SSID or about my MAC address? (I honestly don't know is how much you can find out about me from my MAC address but I'm assuming not much; if I'm wrong that's a data point! I know in 2007 Skyhook told me they were confident they had privacy cracked but they would and they're not the only people and they're the good guys…)

Location services is a *huge* business and people are oversharing location information for trivial rewards (see Foursquare). 8000 apps use Skyhook location data. It's Facebook with co-ordinates. I don't think we can not do this – but I think we can regulate and do it more safely.

In answer to Mary's question two, systems like the Street View Wifi system exist to map peoples’ device MAC addresses to their residential address, as described in the Google patent.  Her point about big business is a BIG POINT.  But to me it just increases the urgency to give people the geo-location capabilities they want without creating a privacy chernoble that will explode down the line.

Jan and Susan Huffman write,

“Standing naked in my front yard is like broadcasting my MAC address.  If I don’t want people to look at me naked in my front yard, I wear clothes. I don’t ask that the law punish those who might take a look through the bushes.

My response:  your MAC address is visible in your wireless packets no matter what you do.  Turning on encryption doesn't help.  So there are no clothes to put on.  The analogy with clothes and bushes therefore just doesn't stand up.  Furthermore, in most neighborhoods, if you spend your time trawling the neighborhood and peeking through bushes you end up with people giving you a pretty hard time…  Maybe that's what's happening here.