Are SSIDs and MAC addresses like house numbers?

Architect Conor Cahill writes:

Kim's assertion that Google was wrong to do so is based upon two primary factors:

  • Google intended to capture the SSID and MAC address of the access points
  • SSIDs and MAC addresses are persistent identifiers

And it seems that this has at least gotten Ben re-thinking his assertion that this was all about privacy theater and even him giving Kim a get-out-of-jail-free card.

While I agree that Kim's asserted facts are true, I disagree with his conclusion.

  • I don't believe Google did anything wrong in collecting SSIDs and MAC addresses (capturing data, perhaps). The SSIDs were configured to *broadcast* (to make something known widely). However, SSIDs and MAC addresses are local identifiers more like house numbers. They identify entities within the local wireless network and are generally not re-transmitted beyond that wireless network.
  • I don't believe that what they did had an impact on the user's privacy. As I pointed out above, it's like capturing house numbers and associating them with a location. That, in itself, has little to do with the user's privacy unless something else associates the location with the user…

Let's think about this.  Are SSIDs and MAC addresses like house numbers?

Your house number is used – by anyone in the world who wants to find it – to get to your house.  Your house was given a number for that purpose.  The people who live in the houses like this.  They actually run out and buy little house number things, and nail them up on the side of their houses, to advertise clearly what number they are.

So let's see:

  1. Are SSIDs and MAC addresses used by anyone in the world to get through to your network?  No.  A DNS name would be used for that.  In residential neighborhoods, you employ an SSID for only one reason – to make it easier to get wireless working for members of your family and their visitors.  Your intent is for the wireless access point's MAC address to be used only by your family's devices, and the MACs of their devices only by the other devices in the house.
  2. Were SSIDs and MAC addresses invented to allow anyone in the world to find the devices in your house?   No, nothing like that.  The MAC is used only within the confines of the local network segment.
  3. Do people consciously try to advertise their SSIDs and MAC addresses to the world by running to the store, buying them, and nailing them to their metaphorical porches?  Nope again.  Zero analogy.

So what is similar?  Nothing. 

That's because house addresses are what, in Law Four of the Laws of Identity, were called “omni-directional identifiers”, while SSIDs and MAC addresses are what were called “unidirectional identifiers” – meaning that they were intended to be constrained to use in a single context.

Keeping “unidirectional identifiers” private to their context is essential for privacy.  And let me be clear: I'm not referring only to the privacy of individuals, but also that of enterprises, governments and organizations.  Protecting unidirectional identifiers is essential for building a secure and trustworthy Internet.
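
To make “constrained to use in a single context” a little more concrete, here is a minimal sketch of one way an identifier can be tied to a context.  It is only an illustration: SSIDs and MAC addresses are not assigned this way, and the function name, the secret and the context labels are all assumptions made for the example.

```python
import hashlib
import hmac


def contextual_identifier(master_secret: bytes, context: str) -> str:
    """Derive an identifier that is only meaningful within `context`.

    Hypothetical helper: the same secret yields unrelated-looking
    identifiers in different contexts, so they cannot be joined up.
    """
    digest = hmac.new(master_secret, context.encode(), hashlib.sha256)
    return digest.hexdigest()[:16]


secret = b"device-local secret (hypothetical)"
home = contextual_identifier(secret, "home-network")
cafe = contextual_identifier(secret, "cafe-network")

# The identifier is stable inside one context but differs across contexts.
print(home == contextual_identifier(secret, "home-network"))
print(home != cafe)
```

The point of the sketch is only the shape of the idea: an observer who collects the identifier in one context learns nothing that helps recognize the same party in another.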

 

Electronic Eternity

From the Useful Spam Department:  I got an advertisement from a robot at “complianceonline.com” that works for a business addressing the problem of data retention on the web from the corporate point of view.

We've all read plenty about the dangers of teenagers publishing their party revels only to find themselves rejected by a university snooping on their Facebook account.  But it's important to remember that the same issues affect business and government as well, as the complianceonline robot points out:

“Avoid Documentation ‘Time Bombs’

“Your own communications and documents can be used against you.

“Lab books, project and design history files, correspondence including e-mails, websites, and marketing literature may all contain information that can compromise a company and its regulatory compliance. Major problems with the U.S. FDA and/or in lawsuits have resulted from careless or inappropriate comments or even inaccurate opinions being “voiced” by employees in controlled or retained documents. Opinionated or accusatory E-mails have been written and sent, where even if deleted, still remain in the public domain where they can effectively “last forever”.

“In this electronic age of My Space, Face Book, Linked In, Twitter, Blogs and similar instant communication, derogatory information about a company and its products can be published worldwide, and “go viral”, whether based on fact or not. Today one's ‘opinion’ carries the same weight as ‘fact’.”

This is all pretty predictable and even banal, but then we get to the gem:  the company offers a webinar on “Electronic Eternity”.  I like the rubric.  I think “Electronic Eternity” is one of the things we should question.  Do we really need to accept that it is inevitable?  Whose interest does it serve?  I can't see any stakeholder who benefits except, perhaps, the archeologist. 

Perhaps everything should have a half-life unless a good argument can be made for preserving it.

 

Definitions for a Common Identity Framework

The Proposal for a Common Identity Framework begins by explaining the terminology it uses.  This wasn't intended to open up old wounds or provoke ontological debate.  We just wanted to reduce ambiguity about what we actually mean to say in the rest of the paper.  To do this, we did think very carefully about what we were going to call things, and tried to be very precise about our use of terms.

The paper presents its definitions in alphabetical order to facilitate lookup while reading the proposal, but I'll group them differently here to facilitate discussion.

Let's start with the series of definitions pertaining to claims.  It is key to the document that claims are assertions by one subject about another subject that are “in doubt”.  This is a fundamental notion since it leads to an understanding that one of the basic services of a multi-party model must be “Claims Approval”.  The simple assumption by systems that assertions are true – in other words the failure to factor out “approval” as a separate service – has led to conflation and insularity in earlier systems.

  • Claim:  an assertion made by one subject about itself or another subject that a relying party considers to be “in doubt” until it passes “Claims Approval”
  • Claims Approval: The process of evaluating a set of claims associated with a security presentation to produce claims trusted in a specific environment so they can be used for automated decision making and/or mapped to an application specific identifier.
  • Claims Selector:  A software component that gives the user control over the production and release of sets of claims issued by claims providers. 
  • Security Token:  A set of claims.
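
As a rough illustration of these definitions, the sketch below models a claim as an assertion that stays “in doubt” until a relying party's approval step accepts it.  The class names, the `trusted_issuers` set and the `approve_claims` function are assumptions made for this example; the proposal does not prescribe any particular data structures, and real approval would weigh far more than the issuer's name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """An assertion made by one subject (the issuer) about a subject."""
    issuer: str    # who makes the assertion
    subject: str   # who the assertion is about
    name: str      # e.g. "student"
    value: object  # the asserted value


@dataclass(frozen=True)
class SecurityToken:
    """Per the definition above, a security token is a set of claims."""
    claims: frozenset


def approve_claims(token, trusted_issuers):
    """Hypothetical Claims Approval step.

    Every claim is treated as "in doubt"; only claims from issuers
    trusted in this environment come out the other side.
    """
    return [c for c in token.claims if c.issuer in trusted_issuers]


token = SecurityToken(frozenset({
    Claim("university.example", "alice", "student", True),
    Claim("unknown.example", "alice", "admin", True),
}))

approved = approve_claims(token, trusted_issuers={"university.example"})
print([c.name for c in approved])
```

Keeping `approve_claims` separate from the code that consumes the result mirrors the point made above: approval is factored out as its own service rather than assumed.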

The concept of claims provider is presented in relation to “registration” of subjects.  Then claims are divided into two broad categories:  primordial and substantive…

  • Registration:  The process through which a primordial claim is associated with a subject so that a claims provider can subsequently issue a set of claims about that subject.
  • Claims Provider:  An individual, organization or service that:
  1. Registers subjects and associates them with primordial claims, with the goal of subsequently exchanging their primordial claims for a set of substantive claims about the subject that can be presented at a relying party; or
  2. Interprets one set of substantive claims and produces a second set (this specialization of a claims provider is called a claims transformer).  A claims set produced by a claims provider is not a primordial claim.
  • Claims Transformer:  A claims provider that produces one set of substantive claims from another set.
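
A claims transformer can be pictured as a function from one set of substantive claims to another.  The sketch below is a minimal illustration under my own assumptions: the claim names (`birthdate`, `age_over_18`) and the dictionary representation are hypothetical, not something the proposal specifies.

```python
from datetime import date


def transform_claims(claims, today):
    """Hypothetical claims transformer.

    Takes substantive claims from an upstream provider and produces a
    second, less revealing set of substantive claims.
    """
    born = claims["birthdate"]
    # Whole years elapsed, minus one if this year's birthday has not happened yet.
    before_birthday = (today.month, today.day) < (born.month, born.day)
    age = today.year - born.year - before_birthday
    return {"subject": claims["subject"], "age_over_18": age >= 18}


incoming = {"subject": "alice", "birthdate": date(2000, 6, 15)}
outgoing = transform_claims(incoming, today=date(2018, 6, 14))
print(outgoing)
```

Note that the output set contains no birthdate at all: the relying party sees only the derived claim.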

To understand this better let's look at what we mean by “primordial” and “substantive” claims.  The word “primordial” may seem strange at first, but its use will be seen to be rewardingly precise:  “Constituting the beginning or starting point, from which something else is derived or developed, or on which something else depends” (OED).

As will become clear, the claims-based model works through the use of “Claims Providers”.  In the most basic case, subjects prove to a claims provider that they are an entity it has registered, and then the claims provider makes “substantive” claims about them.  The subject proves that it is the registered entity by using a “primordial” claim – one which is thus the beginning or starting point, and from which the provider's substantive claims are derived.  So our definitions are the following: 

  • Primordial Claim: A proof – based on secret(s) and/or biometrics – that only a single subject is able to present to a specific claims provider for the purpose of being recognized and obtaining a set of substantive claims.
  • Substantive claim:  A claim produced by a claims provider – as opposed to a primordial claim.

Passwords and secret keys are therefore examples of “primordial” claims, whereas SAML tokens and X.509 certificates (with DNs and the like) are examples of substantive claims. 
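
The exchange of a primordial claim for substantive claims can be sketched in a few lines.  This is a toy under stated assumptions: the `ClaimsProvider` class, its method names and the attribute dictionary are hypothetical, a password stands in for the primordial claim, and the resulting claims are returned as a plain dictionary where a real provider would issue something signed, such as a SAML token.

```python
import hashlib
import hmac
import os


class ClaimsProvider:
    """Hypothetical provider: registers subjects, then exchanges a
    primordial claim (here, proof of a secret) for substantive claims."""

    def __init__(self, name):
        self.name = name
        self._registry = {}  # subject -> (salt, secret digest, attributes)

    @staticmethod
    def _digest(secret, salt):
        # Store only a salted, stretched digest, never the secret itself.
        return hashlib.pbkdf2_hmac("sha256", secret.encode(), salt, 100_000)

    def register(self, subject, secret, attributes):
        """Registration: associate a primordial claim with a subject."""
        salt = os.urandom(16)
        self._registry[subject] = (salt, self._digest(secret, salt), attributes)

    def issue(self, subject, secret):
        """Return substantive claims if the primordial claim checks out."""
        entry = self._registry.get(subject)
        if entry is None:
            return None
        salt, expected, attributes = entry
        if not hmac.compare_digest(self._digest(secret, salt), expected):
            return None
        return {"issuer": self.name, "subject": subject, **attributes}


provider = ClaimsProvider("university.example")
provider.register("alice", "correct horse", {"student": True})

print(provider.issue("alice", "correct horse"))  # substantive claims
print(provider.issue("alice", "wrong guess"))    # None
```

The secret never leaves the conversation between the subject and its provider; what travels onward to a relying party is only the substantive claim set.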

Some will say, “Why don't you just use the word ‘credential’?”   The answer is simple.  We avoided “credential” precisely because people use it to mean both the primordial claim (e.g. a secret key) and the substantive claim (e.g. a certificate or signed statement).   This conflation makes it unsuitable for expressing the distinction between primordial and substantive, and this distinction is essential to properly factoring the services in the model.

There are a number of definitions pertaining to subjects, persons and identity itself:

  • Identity:  The fact of being what a person or a thing is, and the characteristics determining this.

This definition of identity is quite different from the definition that conflates identity and “identifier” (e.g. kim@foo.bar being called an identity).  Without clearing up this confusion, nothing can be understood.   Claims are the way of communicating what a person or thing is – different from being that person or thing.  An identifier is one possible claim content.

We also distinguish between a “natural person”, a “person”, and a “persona”, taking into account input from the legal and policy community:

  • Natural person:  A human being…
  • Person:  an entity recognized by the legal system.  In the context of eID, a person who can be digitally identified.
  • Persona:  A character deliberately assumed by a natural person

A “subject” is much broader, including things like services:

  • Subject:  The consumer of a digital service (a digital representation of a natural or juristic person, persona, group, organization, software service or device) described through claims.

And what about user?

  • User:  a natural person who is represented by a subject.

The entities that depend on identity are called relying parties:

  • Relying party:  An individual, organization or service that depends on claims issued by a claims provider about a subject to control access to and personalization of a service.
  • Service:  A digital entity comprising software, hardware and/or communications channels that interacts with subjects.

Concrete services that interact with subjects (e.g. digital entities) are not to be confused with the abstract services that constitute our model:

  • Abstract services:  Architectural components that deliver useful services and can be described through high level goals, structures and behaviors.  In practice, these abstract services are refined into concrete service definitions and instantiations.

Concrete digital services, including both relying parties and claims providers, operate on behalf of some “person” (in the sense used here of legal persons including organizations).  This implies operations and administration:

  • Administrative authority:  An organization responsible for the management of an administrative domain.
  • Administrative domain:  A boundary for the management of all business and technical aspects related to:
  1. A claims provider;
  2. A relying party; or
  3. A relying party that serves as its own claims provider 

There are several definitions that are necessary to understand how different pieces of the model fit together:

  • ID-data base:  A collection of application specific identifiers used with automatic claims approval
  • Application Specific Identifier (ASID):  An identifier that is used in an application to link a specific subject to data in the application.
  • Security presentation:  A set consisting of elements like knowledge of secrets, possession of security devices or aspects of administration which are associated with automated claims approval.  These elements derive from technical policy and legal contracts of a chain of administrative domains.
  • Technical Policy:  A set of technical parameters constraining the behavior of a digital service and limited to the present tense.
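
To show how an ID-data base and an ASID might fit together, here is a small sketch.  Everything in it is an assumption for illustration: the dictionary layout, the `lookup_asid` function and the sample identifiers are mine, and the claims passed in are presumed to have already passed claims approval.

```python
# Hypothetical ID-data base: maps an approved (issuer, subject) pair to the
# application specific identifier (ASID) used inside one application.
id_database = {
    ("university.example", "alice"): "asid-0001",
}

# Application data is keyed by ASID, not by anything the claims provider issued.
application_data = {
    "asid-0001": {"library_loans": 3},
}


def lookup_asid(approved_claims):
    """Return the ASID for an approved claim set, or None if unknown."""
    key = (approved_claims["issuer"], approved_claims["subject"])
    return id_database.get(key)


asid = lookup_asid({"issuer": "university.example", "subject": "alice"})
print(asid, application_data.get(asid))
```

Because the application only ever sees its own ASID, the same subject can be known by unrelated identifiers in different applications.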

And finally, there is the definition of what we mean by user-centric.  Several colleagues have pointed out that the word “user-centric” has been used recently to justify all kinds of schemes that usurp the autonomy of the user.  So we want to be very precise about what we mean in this paper:

  • User-centric:  Structured so as to allow users to conceptualize, enumerate and control their relationships with other parties, including the flow of information.