Keys, signatures and linkability

Stefan Brands is contributing to the discussion of traceability, linkability and selective disclosure with a series of posts over at identity corner. He is one of the world's key innovators in the cryptography of unlinkability, so his participation is especially interesting.

Consider a user who self-generates several identity claims at different occasions, say “I am 25 years of age”, “I am male”, and “I am a citizen of Canada”. The user’s software packages these assertions into identity claims by means of attribute type/value pairs; for instance, claim 1 is encoded as “age = 25”, claim 2 is “gender = 0”, and claim 3 is “citizenship = 1”. Clearly, relying parties that receive these identity claims cannot trace them to their user’s identity (whether that be represented in the form of a birth name, an SSN, or another identifier) by analyzing the presented claims; self-generated claims are untraceable. Similarly, they cannot decide whether or not different claims are presented by the same or by different users; self-generated claims are unlinkable.

Note that these two privacy properties (which are different but, as we will see in the next paragraph, complementary) hold “unconditionally;” no amount of computing power will enable relying parties to trace or link by analyzing incoming identity-data flows, not even if relying parties collude (indeed, they may be the same entity).

Now, consider the same self-generated identity claims, but this time their user “self-protects” them by means of a self-generated cryptographic key pair (e.g., a random RSA private key and its corresponding public key). The user digitally signs the identity claims with his private key; for example, claim 1 as presented to a relying party looks like “age = 25; PublicKey = 37AC986B…; Signature = 21A4A5B6…”. Clearly, these self-protected claims are as untraceable as their unprotected cousins in the previous paragraph. Are they unlinkable? Well, that depends:

  • If the user applies the same key pair to all claims, then the public key that is present in the presented messages will be the same; thus, all presented identity claims are linkable. As a result, a relying party that receives all three claims over time knows that it is dealing with a 25-year-old Canadian male. As the user over time presents more linkable claims, this may indirectly lead to traceability; for example, the relying party may be able to infer the user’s birth name once the user presents a linkable identity claim that states the postal code of his home address.
  • If the user applies a different self-generated key pair to each identity claim, the three presented claims are as unlinkable and untraceable as in the example where no cryptographic data was appended. Note that this solution does not force unlinkability and untraceability: in cases where the user should be identified, the user can simply provide a claim that specifies his name: “name=Jon Smith” or “SSN-identifier=945278476”, for instance. Similarly, to make self-generated identity claims linkable, an additional common attribute value can be encoded.
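
To make the two cases concrete, here is a minimal sketch of what a relying party can do with the public key alone. It is an illustration under stated assumptions, not Stefan's construction: the "public key" is just a random hex string standing in for a real key, signatures are omitted because they play no part in the linking, and the names `new_public_key`, `present` and `link_by_key` are hypothetical.

```python
import secrets
from collections import defaultdict


def new_public_key() -> str:
    # Assumption: a random 128-bit hex string stands in for a real public key.
    return secrets.token_hex(16)


def present(claim: str, public_key: str) -> dict:
    # A presented claim as the relying party sees it (signature omitted).
    return {"claim": claim, "public_key": public_key}


def link_by_key(messages):
    # The relying party groups incoming claims by the public key they carry.
    groups = defaultdict(list)
    for message in messages:
        groups[message["public_key"]].append(message["claim"])
    return list(groups.values())


claims = ["age = 25", "gender = 0", "citizenship = 1"]

# Case 1: one key pair reused for every claim.
shared_key = new_public_key()
same_key = [present(c, shared_key) for c in claims]

# Case 2: a fresh key pair for each claim.
fresh_keys = [present(c, new_public_key()) for c in claims]

print(link_by_key(same_key))    # a single group holding all three claims
print(link_by_key(fresh_keys))  # three separate single-claim groups
```

With a shared key the three claims collapse into one profile; with fresh keys the key field gives the relying party nothing to join on.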

Stefan's explanation is a clear way to introduce how keys and signatures affect the traceability and linkability of claims. However, there is more to consider. Even if the user applies a different self-generated key pair to each of the three attributes discussed above, if the three attributes are transferred in a single transaction, they are still linked. The transaction itself links the attribute assertions. Conveyance of multiple claims in one transaction is a very common case.

Similarly, if Stefan's three attributes are released during what can be considered to be the same session, they are linked, again regardless of the cryptography.  And if they are released within a given time window from the same transport (IP) address, they should be considered linked too.
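
A rough sketch of that last point: the relying party below ignores the keys entirely and links claims that arrive from the same address within a time window. Everything here is an assumption made for illustration, including the `Observation` record, the `link_by_context` function, the 60-second window, and the documentation-range IP addresses.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    claim: str
    public_key: str   # different for every claim, and never consulted below
    ip: str
    timestamp: float  # seconds since some arbitrary start


def link_by_context(observations, window_seconds=60.0):
    """Group claims seen from the same IP within window_seconds of the previous one."""
    groups = []
    last_seen = {}  # ip -> (timestamp of latest claim, group it was placed in)
    for ob in sorted(observations, key=lambda o: o.timestamp):
        entry = last_seen.get(ob.ip)
        if entry is not None and ob.timestamp - entry[0] <= window_seconds:
            group = entry[1]
        else:
            group = []
            groups.append(group)
        group.append(ob.claim)
        last_seen[ob.ip] = (ob.timestamp, group)
    return groups


observations = [
    Observation("age = 25", "key-A", "192.0.2.10", 0.0),
    Observation("gender = 0", "key-B", "192.0.2.10", 12.0),
    Observation("citizenship = 1", "key-C", "192.0.2.10", 30.0),
    Observation("age = 40", "key-D", "198.51.100.7", 15.0),
]

print(link_by_context(observations))
# [['age = 25', 'gender = 0', 'citizenship = 1'], ['age = 40']]
```

The three claims signed with three different keys end up in one group anyway, because the address and timing do the linking.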

While cryptography is one factor contributing to linkability, we need to look at the protocol patterns and visibility they render possible as well.  I'll be starting to do that in my next posting.

Published by

Kim Cameron

Work on identity.

6 thoughts on “Keys, signatures and linkability”

  1. What Stefan always ignores is that self-generated key pairs are not guaranteed to be unique. It is possible (although highly unlikely) for a user to believe they are generating a unique key pair when, in fact, they end up with a second usage of a previous one. More likely (although still rare), two separate entities could self-generate the same key pair, allowing a linkage which is, in fact, false – but could potentially be very damaging.

  2. The odds of generating colliding key pairs using secure cryptographic constructs are negligible. How is this different from two IdPs assigning the same identifier to two users?

  3. Stefan gives a different, stronger definition of traceability than you did in your recent post on the subject.

    As I understood your post, your distinction between linkability and traceability was that the former links separate transactions and the latter links parts of a single transaction as it moves through one or more systems.

    Stefan says that traceability is the ability to determine the user's identity based on analysis of his claims.

    Perhaps what accounts for the difference is that you wrote about traceability of transactions whereas he wrote about traceability of claims, but I think that for the sake of the coherency of the conversation, it would help a lot to have a clear, consistent definition of the term.

    “Being able to follow a transaction through all its phases by collecting transaction information and having some way of identifying the transaction payload as it moves through the system” doesn't necessarily mean that I can also identify the person who originated the transaction. I might want the first kind of traceability while not wanting the second. Can we have separate terms, please?

  4. Kim, your definition of linkability talks about linking transactions. Beyond that, by using referrers, SAML IdP log files, portals and reverse proxies, users might be linked to services (URLs). This is obviously less useful than linked transaction data, but it is still a privacy issue. I would call this referability, but maybe someone has already coined a term.
