What does a MAC address tell you?

Joe Mansfield at Peccavi has published a nice, clear and abridged explanation of the issues I've been discussing over the last few weeks.  

But before doing that he makes an important and novel point about why regulation may be useful even if it can't “prevent all abuses”:

I’d discounted the payload snooping issue as a distraction because I’d believed (and still do) that it was almost certainly an unfortunate error. I’d then made the point that a legal barrier to a technical problem was insufficient to prevent the bad guys doing bad things but I used that as an excuse to ignore the problem – small scale abuses of this sort of thing are not good but systematic large scale abuses “benefit” from network scaling effects. You might not be able to prevent small scale\illegal abuse through legal means but just because you can’t does not mean that you can’t control large scale abuses this way. The benefits and dangers inherent in this data become exponentially worse as the scale of the database that contains it increases. Large scale means companies and companies react to regulation by being much more careful about what they do. If a technology that is already out there has major privacy issues the regulatory approach is the only way to keep a lid on the problem while the technologists argue about how to fix the bits. Even if we assume that the law was OK about companies creating Geo-location databases using WiFi SSID\MAC mapping, effective regulation would have made the additional mistake made by Google (assuming it was a mistake) much less likely.

Next he explains how WiFi works as a layered protocol in which MAC addresses are exposed despite encryption and SSID suppression:

Now the obvious question is should scanning for identifiers that are broadcast openly by all WiFi radio signals be acceptable and legal?

802.11 WiFi signals are pretty complex things – Wikipedia has a brief overview here for those who want to see the alphabet soup of standards involved. Despite the range of encoding\modulation schemes and the number of frequency bands and channels almost all 802.11 devices revert to a couple of basic communication modes. This makes it easy for devices to connect to each other, and it’s what makes public WiFi hotspots practical. However it also makes configuring a device to monitor WiFi traffic trivially easy – the hardware does all the heavy lifting and the standards don’t really do anything to stop it happening. An important feature of WiFi is that, even though the payload encryption standards can now be pretty robust, the data link layer is not protected from snooping. This means that the content (my Google searches, the video clip I’m streaming down from Youtube etc) can be pretty well kept away from prying eyes but, at what the Ethernet folks call layer 2, the logical structures called frames that carry your encrypted data transmit some control data in the open.

So even with WPA2’s thorough key management and AES encryption your WiFi traffic still contains quite a bit of chatter that isn’t hidden away. The really critical thing for me is that the layer 2 addresses, the Media Access Control (MAC) addresses, of the sender and receiver (generally your PC\Phone’s WiFi adaptor and your Access Point) for each frame are always visible. And remember that MAC addresses are globally unique identifiers by design. Individual WiFi networks are defined by another identifier, the Service Set Identifier or SSID – when you set up your home WiFi AP and call the network “MyWLAN” you are choosing an SSID. SSID’s are very important, you can’t connect to a wireless LAN without knowing the relevant SSID, but they are not secure even though they can be sort of hidden they are never protected and can always be seen by someone just watching your wireless traffic. Interestingly SSID’s are not globally unique – there’s generally no real issue so long as my chosen SSID doesn’t match that of another network that’s relatively close by.

So SSID’s are possibly visible but MAC addresses are definitely visible, and MAC addresses are unique. While driving along a street or sitting in a coffee shop, hotel lobby or conference room your WiFi adaptor will see dozens if not hundreds of WiFi packets all of which will contain globally unique MAC addresses. It is possible to hack some WiFi hardware to change the MAC address but that practice is rare. Your PC has a couple (one for the wired Ethernet adaptor which isn’t important here, and usually one for WiFi these days), your Wii\PS3\XBox-360 has one, so does your Nintendo DS, iPhone, PSP … you get the picture. Another feature of MAC addresses is that it is very easy to differentiate between the MAC address of a Linksys Access Point, an iPhone and a Nintendo DS – Network protocol analyzers have been doing that trick for decades.

So the systematic scanners out there (Google, Navizon, Skyhook and the rest) can drive around or recruit volunteers and gather location data and build databases of unique identifiers, device types, timestamps, signal strengths and possibly other data. The simplest (and most) benign use of that would be to pull out the ID’s of devices that are known to be fixed to one place (Access Points say) and use that for enabling Geo-location.

Joe then looks at what it means to start collecting and analyzing the MAC addresses of mobile devices.

It’s not a big leap to also track the MAC addresses that are more mobile. Get enough data points over a couple of months or years and the database will certainly contain many repeat detections of mobile MAC addresses at many different locations, with a decent chance of being able to identify a home or work address to go with it. Kim Cameron describes the start of this cascade effect in his most recent post, mapping the attendees at a conference to home addresses even when they’ve never consented to any such tracking is not going to be hard if you’ve gone to the trouble of scanning every street in every city in the country. With a minor bit of further analysis the same techniques could be used to get a good idea of the travel or shopping habits of almost everyone sitting in an airport departure lounge or the home addresses of everyone participating in a Stop The War protest.

And remember that even though you can only effectively use WiFi to send and receive data over a range of a few 10’s to maybe a 100m you can detect and read WiFi signals easily from 100’s to 1000’s of metres away without any special equipment.

The plans to blanket London with “Free WiFi” start to sound quite disturbing when you think about those possibilities.

To answer my own title question – MAC addresses can tell far more about you than you think and keeping databases of where and when they’ve been seen can be extremely dangerous in terms of privacy.

Finally, he compares WiFi to Bluetooth:

Bluetooth is a slightly different animal. It’s also a short range radio standard for data communications but it was developed from the ground up to replace wires and the folks building the standard got a lot of stuff right. It doesn’t appear to be all that bad from a privacy leakage perspective – when implemented correctly nothing is sent in clear text (the entire frame is encoded, not just the payload) and the frequency hopping RF behaviour makes it much harder to casually snoop on specific conversations. Bluetooth devices have a Bluetooth Device ID that is very like a MAC address (48 bits), with a manufacturer ID that enables broad classification of devices if the ID can be discovered but most Bluetooth devices keep that hidden most of the time by defaulting to a “not visible” mode even when Bluetooth is enabled. When actively communicating (paired) all data is encrypted so the device ID’s are not visible to a third party. Almost all modern Bluetooth devices only allow themselves to remain openly visible in this way for a short period of time before they revert to a safer non broadcasting mode. The main weakness is that when devices are set to “visible” the unique identifiers and other data can be scanned remotely and used in just the same way as scanned WiFi MAC addresses. That’s not to say that Bluetooth doesn’t have its share of security problems but they made an attempt to get some of the fundamentals right. It does also show that there is a practical way to approach the wireless privacy challenge which is good to see.

All in all a very nice explanation of the issues involved here.   The only thing I would add is that the early versions of Bluetooth had few of the privacy-respecting behaviors present in the recent specifications.  The consortium has really worked to clean up its act and we should all congratulate it.  This came about because privacy concerns came to be perceived as an adoption blocker. 

Title 18 – Part II – Chapter 206 – § 3121

Former federal prosecutor Paul Ohm says Google “likely” breached a U.S. federal criminal statute in connection with its accidental Wi-Fi sniffing — but not for siphoning private data from internet surfers using unsecured networks.

Instead, Mr. Ohm  thinks Google might have breached the Pen Register and Trap and Traces Device Act for intercepting the metadata and address information alongside the content.

According to Wikipedia, a “pen register is an electronic device that records all numbers dialed from a particular telephone line. The term has come to include any device or program that performs similar functions to an original pen register, including programs monitoring Internet communications.”

I'll expand on the identity implications in my next post, but to prepare the discussion, here is the statute to which Mr. Ohm is referring:

Title 18 – Part II – Chapter 206 – § 3121

  1. In General.— Except as provided in this section, no person may install or use a pen register or a trap and trace device without first obtaining a court order under section 3123 of this title or under the Foreign Intelligence Surveillance Act of 1978 (50 U.S.C. 1801 et seq.).
  2. Exception.— The prohibition of subsection (a) does not apply with respect to the use of a pen register or a trap and trace device by a provider of electronic or wire communication service—
    1. relating to the operation, maintenance, and testing of a wire or electronic communication service or to the protection of the rights or property of such provider, or to the protection of users of that service from abuse of service or unlawful use of service; or
    2. to record the fact that a wire or electronic communication was initiated or completed in order to protect such provider, another provider furnishing service toward the completion of the wire communication, or a user of that service, from fraudulent, unlawful or abusive use of service; or
    3. where the consent of the user of that service has been obtained.
  3. Limitation.— A government agency authorized to install and use a pen register or trap and trace device under this chapter or under State law shall use technology reasonably available to it that restricts the recording or decoding of electronic or other impulses to the dialing, routing, addressing, and signaling information utilized in the processing and transmitting of wire or electronic communications so as not to include the contents of any wire or electronic communications.
  4. Penalty.— Whoever knowingly violates subsection (a) shall be fined under this title or imprisoned not more than one year, or both.

The core of the matter at hand

We've explored many of the basic issues of WiFi snooping.  I would now like to go directly to the core of the matter: why do large centralized databases of MAC addresses linked to our street addresses have really serious consequences for peoples’ privacy?  I'd like to approach this through an example:

Consider the case of someone attending a conference at which people are using laptops and phones over a wireless network.  We picture the devices within range of a given attendee in Figure 1:

The green dot represents the WiFi access point through which conference attendees gain access to the Internet.  For now, let's assume this is a permanent WiFi network.  Let's therefore assume its MAC address and location are present within the linking database that also contains our residential MAC to street address mapping.

Now suppose one or more people at the conference have opted into a geo-location service that makes use of the database.  And let's assume that the way this service works is to listen for nearby MAC addresses (all the little circles in the figure) and submit them to the geo-location system for analysis.

The geo-location system will learn that the opted-in user (let's call him Red) is near the given WiFi point, and thus will know Red is at a given location.  If the geo-location system is also capable of searching the web (as one would expect that Google's could), it will also be able to infer that Red is in a given hotel, and that the hotel is hosting a conference C on the date in question. 

If Red stays in the same location for some time, and is surrounded by a number of other people who are in the same location (discernable because their MAC addresses continue to be near by), the smart service will be able to infer that Red is attending conference C being held in the hotel. 

So far, there's nothing wrong with this, since Red has opted in to the geo-location service, and presumably been told that's how it works.

However, note that the geo-location system also learns about the MAC addresses of all the attendees within range who have NOT opted into the system (Green).  And if they remain within range over time, it can also deduce that they too are present at conference C.  Further, it can look up their MAC addresses in the database to discover their street addresses.  This in turn can be used to make many inferences about who the attendees at the conference are, since a lot of information is keyed to their street addresses.  That can itself become further profile information.

Opting out doesn't help

The problem here is this:  The geo-location system is perfectly capable of tracking your location and associating it with your home street address whether you opt in or not.  Home address is a key to many aspects of your identity.  Presto – you have linked many aspects of your identity to your location, and this becomes intellectual property that the geo-location can service sell and benefit from in a myriad of ways.

Is this the way any particular geo-location services would actually work?  I have no idea.  But that's not the point.  The point is that this is the capability one enables by building the giant central database of laptop and phone MAC addresses linked to street addresses.

Commercial interest will naturally tend towards maximal use of these capabilities and the information at hand. 

That is why we need to fully understand the implications of wirelesstapping on a massive scale and figure out if and where we want to draw the line.  How does the collection of MAC addresses using WiFi trucks relate to the regulations involving data collection, proportionality and consent?  Are there limits on the usage of this data? 

One thing for sure.  Breaking the Fourth Law, and turning a unidirectional identifier into a universal identifier is like the story of the Sorcerer's Apprentice.  All the brooms have started dancing.  I wonder if Mickey will get out of this one?

 

What harm can possibly come from a MAC address?

If you are new to doing privacy threat analysis, I should explain that to do it, you need to be thoroughly pessimistic.  A privacy threat analysis is in this sense no different from any other security threat analysis.  

In our pessimistic frame of mind we then need to brainstorm from two different vantage points.  The first is the vantage point of the party being attacked.  How can people in various situations potentially be endangered by the new technology?  The second is the vantage point of the attacker.  How can people with different motivations misuse the technology?  This is a long and complicated job, and should be done in great detail by those who propose a technology.  The results should be published and vetted.

I haven't seen such publication or vetting by the proponents of world-wide WiFi packet collection and giant central databases of device identifiers.  Perhaps the Street View team or someone else has such a study at hand – it would be great for them to share it.

In the meantime I'm just going to throw out a few simple initial ideas – things that are pretty obvious, by constructing a few scenerios.

SCENARIO:  Collecting MAC Addresses is Legal and Morally Acceptable

In this scenario it is legal and socially acceptable to drive up and down the streets recording people's MAC addresses and other network traffic.   

It is also fine for anyone to use a geolocation service to build his own database of MAC addresses and street addresses. 

How could a collector could possibly get the software to do this?  No problem.  In this scenario, since the activity is legal and there is a demand, the software is freely available.  In fact it is widely advertised on various Internet sites.

The collector builds his collection in the evenings, when people are at home with their WiFi enabled phones and computers.  It doesn't take very long to assemble a really detailed map of all the devices used by the people who live in an entire neighborhood  – perhaps a rich neighborhood

Note that it would not matter whether people in the neighborhood have their WiFi encryption turned on or off – the drive by collector would be able to map their devices, since WiFi encryption does not hide the MAC address.

SCENARIO 2 – Collector is a sexual predator

In Scenario 1, anyone can be “a MAC collector”.  In this scenario, the collector is a sexual predator.

When children pass him in the park, they have their phones and WiFi turned on and their MAC addresses are discernable by his laptop software.  Normally the MAC addresses would be meaningless random numbers, but the collector has a complete database of what MAC addresses are associated with a given house address.  It is therefore simple for the collection software on his laptop to automatically convert the WiFi packets emitted from the childrens’ phones into the street addresses where the children live, showing the locations on a map.

There is thus no need for the collector to go up to the children and ask them where they live.  And it won't matter that their parents have taught them never to reveal that to a stranger.  Their devices will have revealed it for them.

I can easily understand that some people might have problems with this example simply because so many questionable things have been justified through reference to predators.  That's not a bandwagon I'm trying to get on. 

I chose the example not only because I think it's real and exposes a threat, but because it reveals two important things germane to a threat analysis:

  • The motivations people have to abuse the technical mechanisms we put in place are pretty much unlimited. 
  • We need to be able to empathize with people who are vulnerable – like children – rather than taking a “people deserve what they get” attitude.   

Finally, I hope it is obvious I am not arguing Google is doing anything remotely on a par this example,  I'm talking about something different: the matter of whether we want WiFi snooping to be something our society condones, and what some of the software that might come into being if we do.

 

Are SSIDs and MAC addresses like house numbers?

Architect Conor Cahill writes:

Kim's assertion that Google was wrong to do so is based upon two primary factors:

  • Google intended to capture the SSID and MAC address of the access points
  • SSIDs and MAC addresses are persistent identifiers

And it seems that this has at least gotten Ben re-thinking his assertion that this was all about privacy theater and even him giving Kim a get-out-of-jail-free card.

While I agree that Kim's asserted facts are true, I disagree with his conclusion.

  • I don't believe Google did anything wrong in collecting SSIDs and MAC addresses (capturing data, perhaps). The SSIDs were configured to *broadcast* (to make something known widely). However, SSIDs and MAC addresses are local identifiers more like house numbers. They identify entities within the local wireless network and are generally not re-transmitted beyond that wireless network.
  • I don't believe that what they did had an impact on the user's privacy. As I pointed out above, it's like capturing house numbers and associating them with a location. That, in itself, has little to do with the user's privacy unless something else associates the location with the user…

Let's think about this.  Are SSIDs and MAC addresses like house numbers?

Your house number is used – by anyone in the world who wants to find it – to get to your house.  Your house was given a number for that purpose.  The people who live in the houses like this.  They actually run out and buy little house number things, and nail them up on the side of their houses, to advertise clearly what number they are.

So let's see:

  1. Are SSIDS and MAC addresses used by anyone in the world to get through to your network?  No.  A DNS name would be used for that.  In residential neighborhoods, you employ a SSID for only one reason – to make it easier to get wireless working for members of your family and their visitors.  Your intent is for the wireless access point's MAC address to be used only by your family's devices, and the MACs of their devices only by the other devices in the house.
  2. Were SSIDS and MAC addressed invented to allow anyone in the world to find the devices in your house?   No, nothing like that.  The MAC is used only within the confines of the local network segment.
  3. Do people consciously try to advertise their SSIDs and MAC addresses to the world by running to the store, buying them, and nailing them to their metaphorical porches?  Nope again.  Zero analogy.

So what is similar?  Nothing. 

That's because house addresses are what, in Law Four of the Laws of Identity, were called “universal identifiers”, while SSIDs and MAC addresses are what were called “unidirectional identifiers” – meaning that they were intended to be constrained to use in a single context. 

Keeping “unidirectional identifiers” private to their context is essential for privacy.  And let me be clear: I'm not refering only to the privacy of individuals, but also that of enterprises, governments and organizations.  Protecting unidirectional identifiers is essential for building a secure and trustworthy Internet.

 

Ben Adida releases me from the theatre

When I published Misuse of network identifiers was done on purposeBen Adida  twittered that “Kim Cameron answers my latest post with some good points I need to think about…”.  And he came through on that promise, even offering me a “Get out of theatre free” card:

“A few days ago, I wrote about Privacy Advocacy Theater and lamented how some folks, including EPIC and Kim Cameron, are attacking Google in a needlessly harsh way for what was an accidental collection of data.  Kim Cameron responded, and he is right to point out that my argument, in the Google case, missed an important issue.

“Kim points out that two issues got confused in the flurry of press activity: the accidental collection of payload data, i.e. the URLs and web content you browsed on unsecured wifi at the moment the Google Street View car was driving by, and the intentional collection of device identifiers, i.e. the network hardware identifiers and network names of public wifi access points.  Kim thinks the network identifiers are inherently more problematic than the payload, because they last for quite a bit of time, while payload data, collected for a few randomly chosen milliseconds, are quite ephemeral and unlikely to be problematic.  [Just for the record, I didn't actually say “unlikely to be problematic” – Kim]

“Kim’s right on both points. Discussion of device identifiers, which I missed in my first post, is necessary, because the data collection, in this case, was intentional, and apparently was not disclosed, as documented in EPIC’s letter to the FCC. If Google is collecting public wifi data, they should at least disclose it. In their blog post on this topic, Google does not clarify that issue.

“So, Google, please tell us how long you’ve been collecting network identifiers, and how long you failed to disclose it. It may have been an oversight, but, given how much other data you’re collecting, it would really improve the public’s trust in you to be very precise here.”

Ben also says my initial post seems “to weave back and forth between both issues”.  In fact I see payload and header being two parts of the same WiFi packet.  Google “accidently” collected one part of the packet but collected the other part on purpose.  I think it is really bizarre that a lot of technical people consider one part of the packet (emails and instant messages) to be private, and then for some irrational reason assume the other part of the same packet (the MAC address) is public.  This makes no sense and as an architect it drives me nuts.  Stealing one part of the WiFi packet is as bad as stealing another.

Ben also says,

“I agree that device privacy can be a big deal, especially when many people are walking around with RFIDs in their passports, pants, and with bluetooth headsets. But, in this particular case, is it a problem? If Google really only did collect the SSIDs of open, public networks that effectively invite anyone to connect to them and thus discover network name and device identifier, is that a violation of privacy, or of the Laws of Identity? I’m having trouble seeing the harm or the questionable act. Once again, these are public/open WiFi networks.”

Let me be clear:  If Google or any other operator only collected the SSIDs of “open, public networks that invite anyone to connect to them” there would be zero problem from the point of view of the Laws of Identity.  They would, in the terminology of Law Four, be collecting “universal identifiers”. 

But when you drive down a street, the vast majority of networks you encounter are NOT public, and are NOT inviting just anyone to connect to them.  The routers emit packets so the designated users of the network can connect to them, not so others can connect to them, hack them, map them or use them for commercial purposes.  If one is to talk about intent, the intent is for private, unidirectional identifiers to be used within a constrained scope.

In other words, as much as I wish I didn't have to do so, I must strongly dispute Ben's assertion that “Once again, these are public/open WiFi networks” and insist that private identifiers are being misappropriated.

In matters of eavesdropping I subscribe to EPIC's argument that proving harm is not essential – it is the eavesdropping itself which is problematic.  However, in my next post I'll talk about harm, and the problems of a vast world-wide system capable of inference based on use of device identifiers.

  

Clarke: Appropriating home network identifiers is the real issue

Here is some background on the Google Street View WiFi issue by Roger Clarke, a well known Australian privacy expert.  Roger points out that Peter Schaar, Germany's Federal Commissioner for Freedom of Information, was concerned about misuse of network identifiers from the very beginning. 

I agree that the identifiers of users’ devices is the real issue.

And your invocation of “It reminds me of an old skit by “Beyond the Fringe” where a police inspector points out that “Once you have identified the criminal's face, the criminal's body is likely to be close by” does hit the spot very nicely!

You ask why the payload is getting all the attention.  After all, it was the device-addresses that Peter Schaar first drew attention to.  As I wrote here,

The third mistake came to light on 22 April 2010, when The Register reported that “[Google's] Street View service is under fire [from the German Data Protection Commissioner, Peter Schaar] for scanning private WLAN networks, and recording users’ unique [device] addresses, as the car trundles along”.

As soon as Peter Fleischer [Google's European privacy advisor – Kim]  published his document of 27 April, I wrote to Schaar, saying:

“Fleischer's document doesn't say anything about whether the surveillance apparatus in the vehicle detects other messages from the router, and messages from other devices…

“In relation to messages other than beacons, on the surface of it, Fleischer might seem to be making an unequivocal statement that Google does *not* collect and store MAC addresses.

“But:

  1. If Google's surveillance apparatus is in a Wifi zone, how does it avoid ‘collecting’ the data?  [Other statements make clear that it does in fact collect that data]
  2. [In the statement “Google does not collect or store payload data”,] the term ‘payload data’ would most sensibly be interpreted as meaning the content, but not including the headers.
  3. The MAC-addresses are in the headers.
  4. So Fleischer's statement is open to the interpretation that header data of messages other than beacons *is* collected, and *is* stored.

“Google has failed to make the statement that connected-device MAC-addresses are *not* collected and stored.

“Because Google has had ample opportunity to make such a statement, and has avoided doing so, I therefore make the conservative assumption that Google *does* collect and store MAC addresses of any devices on networks, not just of routers.”

The document sent to the Commissioners added fuel to the fire, by saying “The equipment is able to receive data from all broadcast frames [i.e. not only beacons are intercepted; any traffic may be intercepted.] This includes, from the header data, SSID and MAC addresses [i.e. consistent with the analysis above, the MAC-addresses of all devices are available to Google's surveillance apparatus.] However, all data payload from data frames are discarded, so Google never collects the content of any communications.

Subsequently, on 14 May, investigations by Hamburg Commissioner Caspar led to the unavoidable conclusion that Fleischer's post on April 27 had been incorrect in a key respect. As Eustace put it, “It's now clear that we have been mistakenly collecting samples of payload data [i.e. message content] from open (i.e. non-password-protected) WiFi networks”.

So I think there are a couple of reasons why the payload aspect is getting most of the press:

  1. The significance of identifiers isn't readily apparent to most people, whereas ‘payload’, like people's Internet Banking passwords, is easier to visualise. (Leave aside that only highly insecure services send authenticators unencrypted. Low-tech reporters have to (over-) simplify stories to communicate to low-tech readers
  2. A corporation appeared to have been caught telling fibs, constructively misleading the public and the media, and regulators
  3. That's what catapulted it into the news, and reporters feed off one another's work, so it's the payload they all focus on
  4. A final factor is that breaches of telecommunications laws may be easier to prove in the case of content than of device-identifiers.

The Australian Privacy Foundation (APF) stepped up the pressure in Australia late this week.

Firstly, we directly requested Google not to delete the data, and gave them notice that we were considering using a little-known part of the TIAA to launch an action.  That was promptly followed by the NYT's report of the Oz Privacy Commissioner saying that the Australian data is in the USA.  (The first useful utterance she's made on the topic – a month after this story broke, there's no mention of the matter on her web-site).

Secondly, we wrote to the relevant regulators, and requested them to contact Google to ensure that the data is not deleted, and to investigate whether Google's actions breached Australian laws.

 

Don't take identities from our homes without our consent

Joerg Resch of Kuppinger Cole in Germany wrote recently about the importance of identity management to the Smart Grid – by which he means the emerging energy infrastructure based on intelligent, distributed renewable resources:

In 10-12 years from now, the whole utilities and energy market will look dramatically different. Decentralization of energy production with consumers converting to prosumers pumping solar energy into the grid and offering  their electric car batteries as storage facilities, spot markets for the masses offering electricity on demand with a fully transparent price setting (energy in a defined region at a defined time can be cheaper, if the sun is shining or the wind is blowing strong), and smart meters in each home being able to automatically contract such energy from spot markets and then tell the washing machine to start working as soon as electricity price falls under a defined line. And – if we think a bit further and apply Google-like business models to the energy market, we can get an idea of the incredible size this market will develop into.

These are just a few examples, which might give you an idea on how the “post fossile energy market” will work. The drivers leading the way into this new age are clear: energy production from oil and gas will become more and more expensive, because pollution is not for free and the resources will not last forever. And the transparency gain from making the grid smarter will make electricity cheaper than it is now.

The drivers are getting stronger every day. Therefore, we will soon see many large scale smart grid initiatives, and we will see questions rising such as who has control over the information collected by the smart meter in my home. Is it my energy provider? How would Kim Cameron´s 7 laws of Identity work in a smart grid? What would a “grid perimeter” look like which keeps information on the usage of whatever electric devices within my 4 walls? By now, we all know what cybercrimes are and how they can affect each of us. But what are the risks of “smart grid hacking”? How might we be affected by “grid crimes”?

In fact at Blackhat 2009, security consultant Mike Davis demonstrated successful hacker attacks on commercially available smart meters.  He told the conference,

“Many of the security vulnerabilities we found are pretty frightening and most smart meters don't even use encryption or ask for authentication before carrying out sensitive functions like running software updates and severing customers from the power grid.”

Privacy commission Ann Cavoukian of Ontario has insisted that industry turn its attention to the security and privacy of these devices:

“The best response is to ensure that privacy is proactively embedded into the design of the Smart Grid, from end to end. The Smart Grid is presently in its infancy worldwide – I’m confident that many jurisdictions will look to our work being done in Ontario as the privacy standard to be met. We are creating the necessary framework with which to address this issue.”

Until recently, no one has talked about drive-by mapping of our home devices.  But from now on we will.  When we think about home devices, we need to reach into the future and come to terms with the huge stakes that are up for grabs here.  

The smart home and the smart grid alert us to just how important the identity and privacy of our devices really is.  We can use technical mechanisms like encryption to protect some information from eavesdroppers.   But not the patterns of our communication or the identities of our devices…  To do that we need a regulatory framework that ensures commercial interests don't enter our “device space” without our consent.

Google's recent Street View WiFi boondoggle is a watershed event in drawing our attention to these matters.

Misuse of network identifiers was done on purpose

Ben Adida has a list of achievements as long as my arm – many of which are related to privacy and security.  His latest post concerns what he calls, “privacy advocacy theater… a problem that my friends and colleagues are guilty of, and I’m sure I’m guilty of it at times, too.  Privacy Advocacy Theater is the act of extreme criticism for an accidental data breach rather than a systemic privacy design flaw. Example: if you’re up in arms over the Google Street View privacy “fiasco” of the last few days, you’re guilty of Privacy Advocacy Theater.”

Ben then proceeds take me to task for this piece:

I also have to be harsh with people I respect deeply, like Kim Cameron who says that Google broke two of his very nicely crafted Laws of Identity. Come on, Kim, this was accidental data collection by code that the Google Street View folks didn’t even realize was running. (I’m giving them the benefit of the doubt. If they are lying, that’s a different problem, but no one’s claiming they’re lying, as far as I know.) The Laws of Identity apply predominantly to the systems that individuals choose to use to manage their data. If anyone is breaking the Laws of Identity, it’s the WiFi access points that don’t actively nudge users towards encrypting their WiFi network.

But let's hold on a minute.  My argument wasn't about the payload data that was collected accidently.  It was about the device identification data that was collected on purpose.  As Google's Alan Eustace put it: 

We said that while Google did collect publicly broadcast SSID information (the WiFi network name) and MAC addresses (the unique number given to a device like a WiFi router) using Street View cars, we did not collect payload data (information sent over the network). But it’s now clear that we have been mistakenly collecting samples of payload data…

Device identifiers were collected on purpose

SSID and MAC addresses are the identifiers of your devices.  They are transmitted as part of the WiFi traffic just like the payload data is.  And they are not “publically broadcast” any more than the payload data is. 

Yet Google consciously decided to abscond with, tabulate and monetize the identities of our personal, business and home devices.  The identifiers are persistent and last for the lifetime of the devices.  Their collection, cataloging and use is, in my view, more dangerous than the payload data that was collected. Why? The payload data, though deeply personal, is transient and represents a single instant.  The identifiers are persistent, and the Street View WiFi plan was to use them for years.  

Let's be clear:  Identity has as much to do with devices, software, services and organizations as with individuals.  And equally important, identity is about the relationships between these things.  In fact identity can only be adequately expressed through the relationships (some call it context).

When Google says, “MAC addresses are a simple hardware ID assigned by the manufacturer” and “We cannot identify an individual” using those “simple hardware IDs”,  it sounds like the devices found in your home and briefcase and pocket have nothing to do with you as a flesh and blood person.  Give me a break!  It reminds me of an old skit by “Beyond the Fringe” where a police inspector points out that “Once you have identified the criminal's face, the criminal's body is likely to be close by…”  Our identities and the identities of our devices are related, and understanding this relationship is essential to getting identity and privacy right.

One great thing about blogging is you find out when you haven't been clear enough.  I hope I'm making progress in expressing the real issues here:  the collection of device identifiers was purposeful, and this represents precisely the kind of “systemic privacy design flaw” to which Ben refers.  

It bothers me that this disturbing systemic privacy design flaw – for which there has been no apology – is being obscured through the widely publicized apology for a completely separate and apparently accidental sin.  

In contemporary networks, the hardware ID of the device is NOT intended to be a “universal identifier”.  It is intended to be a “unidirectional identifier” (see The Fourth Law) employed purely to map between a physical machine and a transient, local logical address.  Many people who read this blog understand why networking works this way.  In Street View WiFi, Google was consciously misusing this unidirectional identifier as a universal identifier, and misappropriating it by insinuating itself, as eavesdropper, into our network conversations.

Ben says, “The Laws of Identity apply predominantly to the systems that individuals choose to use to manage their data.”  But I hope he rethinks this in the context of what identity really is, its use in devices and systems, and the fact that human, device and service identities are tied together in what one day should be a trustworthy system.  I also hope to see Google apologize for its misuse of our device identities, and assure us they will not be used in any of their systems.

Finally, despite Ben's need to rethink this matter,  I do love his blog, and strongly agree with his comments on  Opera Mini, discussed in the same piece.

 

EPIC on Google WiFi eavesdropping

Readers have drawn our attention to a recent letter from EPIC's Marc Rotenberg to  FCC Chairman, Julius Genachowski.

In the detailed letter, Marc Rotenberg specifically calls attention to the mapping of private device identifiers, saying, “We understand that Google also downloaded and recorded a unique device ID, the MAC address, for wireless access devices as well as the SSID assigned by users.”

He argues:

The capture of Wi-Fi data in this manner by Google Street View could easily constitute a violation of Title III of the Omnibus Crime Control and Safe Streets Act of 1968, also known as the Wiretap Act, as amended by the Electronic Communications Privacy Act (ECPA) of 1986 to include electronic communications. Courts most oten define “interception” under ECPA as “acquisitions contemporaneous with transmission.” The Wiretap Act provides for civil liability and criminal penalties against any person who “intentionally intercepts, endeavors to intercept, or procures any other person to intercept or endeavor to intercept any… electronic communication [except as provided in the statute].”

The Wiretap Act imposes identical liability on any person who “intentionally discloses … to any other person the contents of any… electronic communication, knowing or having reason to know that the information was obtained through the interception of a[n] … electronic communication in violation of
this subsection,” or “intentionally uses … the contents of any… electronic communication, knowing or having reason to know that the information was obtained through the interception of a[n]… electronic communication in violation of this subsection.”

Full text (including many footnotes elided in the quote above) is available in pdf and Word format.  See also The Hill's technology blog.