Apple giving out your iPhone fingerprints and location

I went to the Apple App store a few days ago to download a new iPhone application.  I expected that this would be as straightforward as it had been in the past: choose a title, click on pay, and presto – a new application becomes available.

No such luck.  Apple had changed it's privacy policy, and I was taken to the screen at right,  To proceed I had to “read and accept the new Terms and Conditions”.  I pressed OK and up came page 1 of a new 45 page “privacy” policy.

I would assume “normal people” would say “uncle” and “click approve” around page 3.  But in light of what is happening in the industry around location services I kept reading the tiny, unsearchable, unzoomable print.

And there – on page 37 – you come to “the news”.  Apple's new “privacy” policy reveals that if you use Apple products Apple can disclose your device fingerprints and location to whomever it chooses and for whatever purpose:

Collection and Use of Non-Personal Information

We also collect non-personal information – data in a form that does not permit direct association with any specific individual. We may collect, use, transfer, and disclose non-personal information for any purpose. The following are some examples of non-personal information that we collect and how we may use it:

  • We may collect information such as occupation, language, zip code, area code, unique device identifier, location, and the time zone where an Apple product is used so that we can better understand customer behavior and improve our products, services, and advertising.

No “direct association with any specific individual…”

Maintaining that a personal device fingerprint has “no direct association with any specific individual” is unbelievably specious in 2010 – and even more ludicrous than it used to be now that Google and others have collected the information to build giant centralized databases linking phone MAC addresses to house addresses.  And – big surprise – my iPhone, at least, came bundled with Google's location service.

The irony here is a bit fantastic.  I was, after all, using an “iPhone”.  I assume Apple's lawyers are aware there is an “I” in the word “iPhone”.  We're not talking here about a piece of shared communal property that might be picked up by anyone in the village.  An iPhone is carried around by its owner.  If a link is established between the owner's natural identity and the device (as Google's databases have done), its “unique device identifier” becomes a digital fingerprint for the person using it. 

Apple's statements constitute more disappointing doubletalk that is suspiciously well-aligned with the statements in Google's now-infamous WiFi FAQ.  Checking with the “Wayback machine” (which is of course not guaranteed to be accurate or up to date) the last change recorded in Apple's privacy policy seems to have been made in April 2008.  It contained no reference to device identifiers or location services. 

 

Harvesting phone and laptop fingerprints for its database

In The core of the matter at hand I gave the example of someone attending a conference while subscribed to a geo-location service.  I argued that the subscriber's cell phone would pick up all the MAC addresses (which serve as digital fingerprints) of nearby phones and laptops and send them in to the centralized database service, which would look them up and potentially use the harvested addresses to further increase its knowledge of people's behavior – for example, generating a list of those attending the conference.

A reader wrote to express disbelief that the MAC addresses of non-subscribers would be collected by a company like Google.  So I close this series on WiFi device identifiers with this quote from what Google calls its “refresher FAQ” (emphasis in the quote below is mine):  

How does this location database work?

Google location based services using WiFi access point data work as follows:

  • The user’s device sends a request to the Google location server with a list of MAC addresses which are currently visible to the device;
  • The location server compares the MAC addresses seen by the user’s device with its list of known MAC addresses, and identifies associated geocoded locations (i.e. latitude / longitude);
  • The location server then uses the geocoded locations associated with visible MAC address to triangulate the approximate location of the user;
  • and this approximate location is geocoded and sent back to the user’s device.

So certainly the MAC addresses of all nearby phones and laptops are sent in to the geo-location server – not simply the MAC addresses of wireless access points that are broadcasting SSIDs.  And this is significant from a technical point of view.

Why not edit out the MAC addresses you don't need prior to transmission, reducing transmission size, cost and the amount of work that the central database server must do? Clearly, it was considered useful to collect all the phone fingerprints – including those of non-subscribers.  Of course Google's  WiFi cars also collect the same fingerprints – while driving past peoples’ homes.  So it is clearly possible for their system to match the fingerprints of non-subscribers to their home locations, and thus to their natural identities. 

Is this matching of non-subscribers being done today?  I have no idea.  But Google has put in place all the machinery to do it and pays a premium to operate its geolocation service so as to gather this information.  Further, if allowed to mature, the market for the extra intelligence collected about our behaviors will be immense.

So there is nothing unlikely about the scenario I describe.   I have now examined all the issues I wanted to bring to light and I'll move on to other matters for a while.

 

Trip down memory lane

Joe Mansfield's comment that Bluetooth “doesn’t appear to be all that bad from a privacy leakage perspective” left me rummaging through memory lane – awakening memories that may help explain why I now believe that world-wide databases of MAC addresses constitute a central socio-technical problem of our time.

I was taken back to an unforgettable experience I had in 2005 while working on the Laws of Identity.  I had finished the Fourth Law and understood theoretically why technical systems should use “unidirectional identifiers” (meaning identifiers limited to a defined context) rather than “universal identifiers” (things like social security numbers) unless the goal was to be completely public.  But there is a difference between understanding something theoretically and right in the gut.

Rather than retell the story, here is what I wrote on my blog in Just a few scanning machines on Tuesday 6 September 2005:

Since I seem to be on the subject of Bluetooth again, I want to tell you about an experience I had recently that put a gnarly visceral edge on my opposition to technologies that serve as tracking beacons for us as private individuals.

I was having lunch in San Diego with Paul Trevithick, Stefan Brands and Mary Rundle. Everyone knows Paul for his work with Social Physics and the Berkman identity wiki; Stefan is a tremendously innovative privacy cryptographer; and Mary is pushing the envelope on cyber law with Berkman and Stanford.

Suddenly Mary recalled the closing plenary at the Computers, Freedom and PrivacyPanopticon Conference” in Seattle.

She referred off-handedly to “the presentation where they flashed a slide tracking your whereabouts throughout the conference using your Bluetooth phone.”

Essentially I was flabbergasted. I had missed the final plenary, and had no idea this had happened.

MAC Name Room Time Talk
Kim Cameron Mobile
00:09:2D:02:9A:68
Grand I (G1) Wed 09:32 09:32 ????
Grand Crescent (gc) Wed 09:35 09:35 Adware and Privacy: Finding a Common Ground
Grand I (G1) Wed 09:37 09:37 ????
Grand Crescent (gc) Wed 09:41 09:42 Adware and Privacy: Finding a Common Ground
Grand I (G1) Wed 09:46 09:47 ????
Grand III (g3) Wed 10:18 10:30 Intelligent Video Surveillance
Baker (ol) Wed 10:33 10:42 Reforming E-mail and Digital Telephonic Privacy
Grand III (g3) Wed 10:47 10:48 Intelligent Video Surveillance
Grand Crescent (gc) Wed 11:25 11:26 Adware and Privacy: Finding a Common Ground
Grand III (g3) Wed 11:46 12:22 Intelligent Video Surveillance
5th Avenue (5a) Wed 12:33 12:55 ????
Grand III (g3) Wed 13:08 14:34 Plenary: Government CPOs: Are they worth fighting for?

Of course, to some extent I'm a public figure when it comes to identity matters, and tracking my participation at a privacy conference is, I suspect, fair game. Or at any rate, it's good theatre, and drives home the message of the Fourth Law, which makes the point that private individuals must not be subjected – without their knowledge or against their will – to technologies that create tracking beacons.

A picture named kim_cameron.JPGLater Mary introduced me to Paul Holman from The Shmoo Group. He was the person who had put this presentation together, and given our mutual friends I don't doubt his motives. In fact, I look forward to meeting him in person.

He told me:

“I take it you missed our quick presentation, but essentially, we just put Bluetooth scanning machines in a few of the conference rooms and had them log the devices they saw. This was a pretty unsophisticated exercise, showing only devices in discoverable mode. To get them all would be a lot more work. You could do the same kind of thing just monitoring for cell phones or WiFi devices or whatever. We were trying to illustrate a crude version of what will be possible with RFIDs.”

The Bluetooth tracking was tied in to the conference session titles, and by clicking on a link you could see the information represented graphically – including my escape to a conference center window so I could take a phone call.

Anyway, I think I have had a foretaste of how people will feel when networks of billboards and posters start tracking their locations and behaviors. They won't like it one bit. They'll push back.

A foretaste indeed

One of my readers wrote to say I should turn my Bluetooth broadcast off, and I responded:

You’re right, and I have turned it off. Which bothers me. Because I like some of the convenience I used to enjoy.

So I write about this because I’d rather leave my Bluetooth phone enabled, interacting only with devices run by entities I’ve told it to cooperate with.

We have a lot of work to do to get things to this point. I see our work on identity as being directed to that end, at least in part.

We need to be able to easily express and select the relationships we want to participate in – and avoid – as cyberspace progressively penetrates the world of physical things.

The problems of Bluetooth all exist in current Wifi too. My portable computer broadcasts another tracking beacon. I’m not picking on Bluetooth versus other technologies. Incredibly, they all need to be fixed. They’re all misdesigned.

If anything has shocked me while working on the Laws of Identity, it has been the discovery of how naive we’ve been in the design of these systems to date – a product of our failure to understand the Fourth Law of Identity. The potential for abuse of these systems is collosal – enterprises like the UK’s Filter are just the most benign tip of an ugly iceberg.

For everyone’s sake I try to refrain from filling in what the underside of this iceberg might look like

Google's Street View group, which has been assembling a massive central registry of WiFi MAC addresses, has definitely crawled out from under this iceberg, and the project is more sinister than any I imagined only a few years ago.

But so as not to leave everyone feeling completely depressed, all the dreams of Billboards that recognize you from your Bluetooth phone have now been abandoned by Bluetooth manufacturers, and the specification has been greatly improved in light of the criticism it received.  Let's hope that geo-location providers, and Google in particular, see the same light, and assure us they will no longer collect or store the MAC address of any device unless that collection is approved by the subscriber.

What does a MAC address tell you?

Joe Mansfield at Peccavi has published a nice, clear and abridged explanation of the issues I've been discussing over the last few weeks.  

But before doing that he makes an important and novel point about why regulation may be useful even if it can't “prevent all abuses”:

I’d discounted the payload snooping issue as a distraction because I’d believed (and still do) that it was almost certainly an unfortunate error. I’d then made the point that a legal barrier to a technical problem was insufficient to prevent the bad guys doing bad things but I used that as an excuse to ignore the problem – small scale abuses of this sort of thing are not good but systematic large scale abuses “benefit” from network scaling effects. You might not be able to prevent small scale\illegal abuse through legal means but just because you can’t does not mean that you can’t control large scale abuses this way. The benefits and dangers inherent in this data become exponentially worse as the scale of the database that contains it increases. Large scale means companies and companies react to regulation by being much more careful about what they do. If a technology that is already out there has major privacy issues the regulatory approach is the only way to keep a lid on the problem while the technologists argue about how to fix the bits. Even if we assume that the law was OK about companies creating Geo-location databases using WiFi SSID\MAC mapping, effective regulation would have made the additional mistake made by Google (assuming it was a mistake) much less likely.

Next he explains how WiFi works as a layered protocol in which MAC addresses are exposed despite encryption and SSID suppression:

Now the obvious question is should scanning for identifiers that are broadcast openly by all WiFi radio signals be acceptable and legal?

802.11 WiFi signals are pretty complex things – Wikipedia has a brief overview here for those who want to see the alphabet soup of standards involved. Despite the range of encoding\modulation schemes and the number of frequency bands and channels almost all 802.11 devices revert to a couple of basic communication modes. This makes it easy for devices to connect to each other, and it’s what makes public WiFi hotspots practical. However it also makes configuring a device to monitor WiFi traffic trivially easy – the hardware does all the heavy lifting and the standards don’t really do anything to stop it happening. An important feature of WiFi is that, even though the payload encryption standards can now be pretty robust, the data link layer is not protected from snooping. This means that the content (my Google searches, the video clip I’m streaming down from Youtube etc) can be pretty well kept away from prying eyes but, at what the Ethernet folks call layer 2, the logical structures called frames that carry your encrypted data transmit some control data in the open.

So even with WPA2’s thorough key management and AES encryption your WiFi traffic still contains quite a bit of chatter that isn’t hidden away. The really critical thing for me is that the layer 2 addresses, the Media Access Control (MAC) addresses, of the sender and receiver (generally your PC\Phone’s WiFi adaptor and your Access Point) for each frame are always visible. And remember that MAC addresses are globally unique identifiers by design. Individual WiFi networks are defined by another identifier, the Service Set Identifier or SSID – when you set up your home WiFi AP and call the network “MyWLAN” you are choosing an SSID. SSID’s are very important, you can’t connect to a wireless LAN without knowing the relevant SSID, but they are not secure even though they can be sort of hidden they are never protected and can always be seen by someone just watching your wireless traffic. Interestingly SSID’s are not globally unique – there’s generally no real issue so long as my chosen SSID doesn’t match that of another network that’s relatively close by.

So SSID’s are possibly visible but MAC addresses are definitely visible, and MAC addresses are unique. While driving along a street or sitting in a coffee shop, hotel lobby or conference room your WiFi adaptor will see dozens if not hundreds of WiFi packets all of which will contain globally unique MAC addresses. It is possible to hack some WiFi hardware to change the MAC address but that practice is rare. Your PC has a couple (one for the wired Ethernet adaptor which isn’t important here, and usually one for WiFi these days), your Wii\PS3\XBox-360 has one, so does your Nintendo DS, iPhone, PSP … you get the picture. Another feature of MAC addresses is that it is very easy to differentiate between the MAC address of a Linksys Access Point, an iPhone and a Nintendo DS – Network protocol analyzers have been doing that trick for decades.

So the systematic scanners out there (Google, Navizon, Skyhook and the rest) can drive around or recruit volunteers and gather location data and build databases of unique identifiers, device types, timestamps, signal strengths and possibly other data. The simplest (and most) benign use of that would be to pull out the ID’s of devices that are known to be fixed to one place (Access Points say) and use that for enabling Geo-location.

Joe then looks at what it means to start collecting and analyzing the MAC addresses of mobile devices.

It’s not a big leap to also track the MAC addresses that are more mobile. Get enough data points over a couple of months or years and the database will certainly contain many repeat detections of mobile MAC addresses at many different locations, with a decent chance of being able to identify a home or work address to go with it. Kim Cameron describes the start of this cascade effect in his most recent post, mapping the attendees at a conference to home addresses even when they’ve never consented to any such tracking is not going to be hard if you’ve gone to the trouble of scanning every street in every city in the country. With a minor bit of further analysis the same techniques could be used to get a good idea of the travel or shopping habits of almost everyone sitting in an airport departure lounge or the home addresses of everyone participating in a Stop The War protest.

And remember that even though you can only effectively use WiFi to send and receive data over a range of a few 10’s to maybe a 100m you can detect and read WiFi signals easily from 100’s to 1000’s of metres away without any special equipment.

The plans to blanket London with “Free WiFi” start to sound quite disturbing when you think about those possibilities.

To answer my own title question – MAC addresses can tell far more about you than you think and keeping databases of where and when they’ve been seen can be extremely dangerous in terms of privacy.

Finally, he compares WiFi to Bluetooth:

Bluetooth is a slightly different animal. It’s also a short range radio standard for data communications but it was developed from the ground up to replace wires and the folks building the standard got a lot of stuff right. It doesn’t appear to be all that bad from a privacy leakage perspective – when implemented correctly nothing is sent in clear text (the entire frame is encoded, not just the payload) and the frequency hopping RF behaviour makes it much harder to casually snoop on specific conversations. Bluetooth devices have a Bluetooth Device ID that is very like a MAC address (48 bits), with a manufacturer ID that enables broad classification of devices if the ID can be discovered but most Bluetooth devices keep that hidden most of the time by defaulting to a “not visible” mode even when Bluetooth is enabled. When actively communicating (paired) all data is encrypted so the device ID’s are not visible to a third party. Almost all modern Bluetooth devices only allow themselves to remain openly visible in this way for a short period of time before they revert to a safer non broadcasting mode. The main weakness is that when devices are set to “visible” the unique identifiers and other data can be scanned remotely and used in just the same way as scanned WiFi MAC addresses. That’s not to say that Bluetooth doesn’t have its share of security problems but they made an attempt to get some of the fundamentals right. It does also show that there is a practical way to approach the wireless privacy challenge which is good to see.

All in all a very nice explanation of the issues involved here.   The only thing I would add is that the early versions of Bluetooth had few of the privacy-respecting behaviors present in the recent specifications.  The consortium has really worked to clean up its act and we should all congratulate it.  This came about because privacy concerns came to be perceived as an adoption blocker. 

Does the non-content trump the content?

In my previous post I referred to an interesting Wired story in which former U.S. federal prosecutor Paul Ohm says Google “likely” breached a U.S. federal criminal statute by intercepting the metadata and address information on residential and business WiFi networks.  The statute refers to a “pen register” – an electronic device that records all numbers dialed from a particular telephone line.  Wikipedia tells us the term has come to include any device or program that performs similar functions to an original pen register, including programs monitoring Internet communications.”  The story continues:

“I think it’s likely they committed a criminal misdemeanor of the Pen Register and Trap and Traces Device Act,” said Ohm, a prosecutor from 2001 to 2005 in the Justice Department’s Computer Crime and Intellectual Property Section. “For every packet they intercepted, not only did they get the content, they also have your IP address and destination IP address that they intercepted. The e-mail message from you to somebody else, the ‘to’ and ‘from’ line is also intercepted.”

“This is a huge irony, that this might come down to the non-content they acquired,” (.pdf) said Ohm, a professor at the University of Colorado School of Law.

I understand how people unacquainted with the emerging role of identity in the Internet can see this as an irony – a kind of side-effect – whereas in reality Google's plan to establish a vast centralized database of device identifiers has much longer-term consequences than the misappropriation of content.  Metadata is no less important than other data –  and “addresses” being referred to are really device identifiers clearly associated with individual users, much like the telephone numbers to which the statute applies.  Given the similarity to issues that arose with pre-Internet communication, we should perhaps not be surprised that there may already be regulation in place that prevents “registering” of the identifiers.

The Wired article continues:

Google said it was a coding error that led it to sniff as much as 600 gigabytes of data across dozens of countries as it was snapping photos for its Street View project. The data likely included webpages users visited and pieces of e-mail, video and document files…

The pen register act described by Ohm, which he said is rarely prosecuted, is usually thought of in terms of preventing unauthorized monitoring of outbound and inbound telephone numbers.

Violations are a misdemeanor and cannot be prosecuted by private lawyers in civil court, Ohm said. He said the act requires that Google “knew, or should have known” of the activity in question.

Google denies any wrongdoing.

In fact, Google knew about the collection of MAC addresses, and has never said otherwise or stated that their collection of these addresses was done accidently.  In fact they have been careful to never state explicitly that their collection was limited to Wireless Access Points.  The Gstumbler report makes it clear they were parsing and recording both the source and destination MAC addresses in all the WiFi frames they intercepted. 

The Wired article explains:

As far as a criminal court goes, it is not considered wiretapping “to intercept or access an electronic communication made through an electronic communication system that is configured so that such electronic communication is readily accessible to the general public.”

It is not known how many non-password-protected Wi-Fi networks there are in the United States.

What makes this especially interesting is the fact that it is not possible to configure a WiFi network so that the MAC addresses are hidden.  Use of passwords protects the communication content carried by the network, but does not protect the MAC addresses.  Configuring the WIreless Access Point not to broadcast an SSID does not prevent eavesdropping on MAC addresses either.   Yet we can hardly say the metadata is readily accessible to the general public, since it cannot be detected except acquiring and using very specialized programs. 

Wired draws the conclusion that,  “The U.S. courts have not clearly addressed the issue involved in the Google flap.”

 

Title 18 – Part II – Chapter 206 – § 3121

Former federal prosecutor Paul Ohm says Google “likely” breached a U.S. federal criminal statute in connection with its accidental Wi-Fi sniffing — but not for siphoning private data from internet surfers using unsecured networks.

Instead, Mr. Ohm  thinks Google might have breached the Pen Register and Trap and Traces Device Act for intercepting the metadata and address information alongside the content.

According to Wikipedia, a “pen register is an electronic device that records all numbers dialed from a particular telephone line. The term has come to include any device or program that performs similar functions to an original pen register, including programs monitoring Internet communications.”

I'll expand on the identity implications in my next post, but to prepare the discussion, here is the statute to which Mr. Ohm is referring:

Title 18 – Part II – Chapter 206 – § 3121

  1. In General.— Except as provided in this section, no person may install or use a pen register or a trap and trace device without first obtaining a court order under section 3123 of this title or under the Foreign Intelligence Surveillance Act of 1978 (50 U.S.C. 1801 et seq.).
  2. Exception.— The prohibition of subsection (a) does not apply with respect to the use of a pen register or a trap and trace device by a provider of electronic or wire communication service—
    1. relating to the operation, maintenance, and testing of a wire or electronic communication service or to the protection of the rights or property of such provider, or to the protection of users of that service from abuse of service or unlawful use of service; or
    2. to record the fact that a wire or electronic communication was initiated or completed in order to protect such provider, another provider furnishing service toward the completion of the wire communication, or a user of that service, from fraudulent, unlawful or abusive use of service; or
    3. where the consent of the user of that service has been obtained.
  3. Limitation.— A government agency authorized to install and use a pen register or trap and trace device under this chapter or under State law shall use technology reasonably available to it that restricts the recording or decoding of electronic or other impulses to the dialing, routing, addressing, and signaling information utilized in the processing and transmitting of wire or electronic communications so as not to include the contents of any wire or electronic communications.
  4. Penalty.— Whoever knowingly violates subsection (a) shall be fined under this title or imprisoned not more than one year, or both.

Conor changes his mind

Conor Cahill has taken a look at the Gstumbler report.  His conclusion is:

Given this new information I would have to agree that Google has clearly stepped into the arena of doing something that could be detrimental to the user's privacy.

Conor explains that, “the information in the report is quite different than the information that had been published at the time I expressed my opinions on the events at hand.”

He argues:

  1. “We had been led to believe that Google had only captured data on open wireless networks (networks that broadcast their SSIDs and/or were unencrypted). The analysis of the software shows that to be incorrect — Google captured data on every network regardless of the state of openness. So no matter what the user did to try to protect their network, Google captured data that the underlying protocols required to be transmitted in the clear.
  2. “We had been led to believe that Google had only captured data from wireless access points (APs). Again the analysis shows that this was incorrect — Google captured data on any device for which it was able to capture the wireless traffic for (AP or user device). So portable devices that were currently transmitting as the Street View vehicle passed would have their data captured.”

Anyone who knows Conor knows he is a gentlemanly model of how people should behave towards each other in our industry.  I understand his position fully, and respect it.  He says:

[Kim] seems to have a particular fondness for the phrase “wrong,” “completely wrong,” and “wishful thinking” when referring to my comments on the topic.  In my defense, I will say that there was no “wishful thinking” going on in my mind. I was just examining the published information rather than jumping to conclusions — something that I will always advocate. In this case, after examining the published report, it does appear that those who jumped to conclusions happened to be closer to the mark, but I still think they were wrong to jump to those conclusions until the actual facts had been published.

I can't disagree that Google's public relations messages may well have been crafted to leave the impression that their wireless eavesdropping was only directed at network access points.  But if you read them extremely carefully you see they refrain from making any such claims. 

At any rate, Conor needs no defense and I accept his point.  People who took the view that Google couldn't possibly have been doing what I claimed were acting based on the messages the company conveyed.  Sadly, if people of Conor's undisputed technical sophistication are misled by this kind of public relations campaign, the crafting of the information might also be considered suspect.

[More of Conor's post here]

 

Why location services have to be done right…

uTest describes itself as the world's largest marketplace for software testing services. Recently it held a Bug Battle to test the web and mobile applications of the leading “check-in” location services. A Bug Battle is a quarterly app testing competition, where “software professionals from around the world compete to find bugs and rank today's popular applications” (previous Bug Battles have focused on browsers, search engines, social networking sites, etc.

When evaluating location-based check-in services, testers were given ten days in May to report the most interesting and severe bugs, and to rank these applications based on

  • geo-location accuracy,
  • social media integration,
  • friend connectivity,
  • status recognition features and
  • ease-of use

uTest offered nearly $4,000 in prize money to those who submitted the best bugs for feedback.
The results of the battle, which rated Foursquare, Gowalla and Brightkite, are detailed here.

The report includes comments by people who clearly love the service. For example:

“The Gowalla app and web interface themselves are easy on the eyes, and venues get their own snazzy icon depicting what type of establishment it is. I feed my Gowalla check-ins to Facebook, and having an image that catches attention in a cluttered news feed matters. The user can see everyone who has checked in at a particular venue and how many times.”

But one clear outcome is that many testers reported serious bugs related to privacy and security – a category not present in the original list:

“The impact of check-in services on personal privacy and security took on a prominent role in this study. 80% of respondents responded “Yes” when asked if they were concerned about how location-based check-in services could impact their personal privacy and safety. Nearly half of respondents (49%) chose “privacy and security concerns” as the top reason they do not use check-in services more often.”

VentureBeat, which wrote about the report, concludes:

In addition to appreciating easy-to-use services and bemoaning the lack of Frappuccino deals, the testers seem to be concerned about the privacy and security implications of check-in services in general. 49% of testers said privacy and security concerns were the top reason they don’t use check-in services more often. This is something the check-in services need to address if they want to avoid privacy flames like the ones Facebook is constantly fighting.

These services can be built so as to respect and enhance privacy.  Things like giant world databases linking our devices to our home locations don't help convince anyone that we are doing that.

Latitude privacy policy doesn't fess up to what Google stores

Never one to mince words, Jackson Shaw asks, “To the Google privacy core – Is it rotten?”  He writes,

“I read Kim’s post and immediately decided to turn off Google’s Latitude service on my phone but, as Kim illustrates, it probably won’t make any difference…

“I took a few minutes to check out Google’s privacy policy around Latitude and found out this much:

“If you choose to ‘Hide your location’, you can hide from your Latitude friends all at once, so they won't be able to see your location. If you hide in Latitude, we don't store your location.

“I’m not worried about hiding in Latitude. I wish I could hide from Google!”

The funny thing here is that Google already stores our residential locations through association with our devices, as indicated by its Gstumbler report, contradicting the Latitude privacy policy.

Jackson then directs us to a Wired article that is tremendously germane to this discussion – partly because of what it says about the current legal environment in the US, and partly because it reflects the very real problem that, in general, neither technologists nor policy makers understand that tapping of device identifiers is as serious as theft of content. 

See:  “Former Prosecutor: Google Wi-Fi Snafu ‘Likely’ Illegal ” – I'll discuss it next.  

Rethink things in light of Google's Gstumbler report

A number of technical people have given Google the benefit of the doubt in the Street View Wifi case and as a result published information that Google's new “Gstumbler” report shows is completely incorrect.  It is important that people re-evaluate what they are saying in light of this report. 

I'll pick on Conor's recent posting on our discussion as an example – it contains a number of statements and implies a number of things explicitly contradicted by Google's new report.  Once he reads the report and applies the logic he has put forward, logic will require Conor to change his conclusions.

Conor begins with a bunch of statements that are true:

  • MAC addresses typically are persistent identifiers that by the definition of the protocols used in wireless APs can't be hidden from snoopers, even if you turn on encryption.
  • By themselves, MAC addresses are not all that useful except to communicate with a local network entity (so you need to be nearby on the same local network to use them.
  • When you combine MAC addresses with other information (locality, user identity, etc.) you can be creating worrisome data aggregations that when exposed publicly could have a detrimental impact on a user's privacy.
  • SSIDs have some of these properties as well, though the protocol clearly gives the user control over whether or not to broadcast (publicize) their SSID. The choice of the SSID value can have a substantial impact on it's use as a privacy invading value — a generic value such as “home” or “linksys” is much less likely to be a privacy issue than “ConorCahillsHomeAP”.

Wishful thinking and completely wrong

 These are followed by a statement that is just plain wishful thinking.  Conor continues:

  • Google purposely collected SSID and MAC Addresses from APs which were configured in SSID broadcast mode and inadvertently collected some network traffic data from those same APs. Google did not collect information from APs configured to not broadcast SSIDs.

Google's report says Conor is wrong about this, explicitly saying in paragraph 26, “Kismet can also detect the existence of networks with non-broadcast SSIDs, and will capture, parse, and record data from such networks“.   Conor continues:

  • Google associated the SSID and MAC information with some location information (probably the GPS vehicle location at the time the AP signal was strongest).

This is true, but it is important to indicate that this was not limited to access points.  Google's report says that it recorded the association between the MAC address and geographic location of all the active devices on the network.  When it did this, the MAC addresses became, according to Conor's own earlier definition, “worrisome data aggregations”.

  • There is no AP protocol defined means to differentiate between open wireless hotspots and closed hotspots which broadcast their SSIDs. 

This is true, but Google's report indicates this would not have mattered – it collected MACs regardless of whether SSIDs were broadcast.

  • I have not found out if Google used the encryption status of the APs in its decision about recording the SSID/MAC information for the AP.

Google's report indicates it did not.  It only used that status to decide whether or not to record the payload – and only recorded the payload of unencrypted frames…

I like Conor's logic that, “When you combine MAC addresses with other information (locality, user identity, etc.) you can be creating worrisome data aggregations that when exposed publicly could have a detrimental impact on a user's privacy.”   I urge Conor to read the Gstumbler report.  Once he knows what was actually happening, I hope he'll tell the world about it.