What you have versus what you are

 Ralf Bendrath sees biometrics as being about “what you have” (had?) rather than “what you are”.

Kim Cameron at Identityblog picked up on Jerry Fishenden's post on the problems of biometrics (by the way: Jerry will speak at our privacy workshop in Athens, see below). He again brings up the story from Malaysia, where brutal car thieves cut off the index finger of a Mercedes owner in order to circumvent the car's biometric immobiliser. First of all, the thieves could have made things much easier for themselves, without having to carry around a rotting finger: with a bit more high-tech, in the future they could perhaps just read the fingerprint out of the car owner's passport.

But more importantly, this case shows the problems with identity and how hard it is to prove to a machine who you are. Authentication is often based on the classic trinity of factors: something you have (a key, a USB dongle, a chipcard), something you know (a password, a PIN, your mother's maiden name), or something you are (your fingerprint, your retina). There are of course other possible authentication factors, but these are the most common.

This story makes clear that “what you have” is much clearer than “what you are”. I would rather say “I have ten fingers” than “I am ten fingers”. “What I am” relates more directly to my personality and identity than “what I have” or “what I know”. It is a story, a flowing, amorphous thing, changing from context to context and over time. Of course, you can break it down to some extent into single pieces of data (address, date of birth, employer, email, favourite mp3s, …) – but none of this is much good for authentication purposes, as most of it is not really secret. “What I know” can be secret, and as Jerry Fishenden points out in his post, it could be linked to “what I have” in order to have multi-factor authentication. But again, that is not the same as “what I am”.
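As an aside, the “multi-factor” idea is simple enough to sketch in a few lines – a toy illustration with invented names, requiring any two of the three classic factors:

    # Toy illustration of combining the three classic authentication factors.
    # All names are hypothetical; a real system verifies each factor properly.
    def authenticate(has_key: bool, knows_pin: bool, fingerprint_ok: bool) -> bool:
        # something you have / something you know / something you are
        return sum([has_key, knows_pin, fingerprint_ok]) >= 2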

Biometrics is therefore more about what I have than what I am. The only difference is that it can't be stolen as easily as a car key or a passport. Fingers can be cut off, but faces? OK, Hollywood was always ahead of us.

Last open question: Can “what you have” also be said about the way you walk? Probably not. But is that really what you are?

OpenSSL vulnerability

Ben Laurie and Matasano Chargen describe a significant attack on RSA signature implementations (not the algorithm itself).  Quoting from Matasano:

Bell Labs crypto shaolin Daniel Bleichenbacher disclosed a freaky attack on RSA signature implementations that may, under some common circumstances, break SSL/TLS.

Do I have your attention? Good, because this involves PKCS padding, which is not exciting. Bear with me.

RSA signatures use a public signing keypair with a message digest (like SHA1) to stamp a document or protocol message. In RSA, signing and verification are analogs of decryption and encryption. To sign a message, you:

  1. Hash it
  2. Expand the (short) hash to the (long) size of the RSA modulus
  3. Convert that expansion to a number
  4. RSA decrypt the number
  5. Convert that to a series of bytes, which is your signature.
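To make those steps concrete, here is a minimal Python sketch of “textbook” PKCS#1 v1.5 signing – an illustration, not production code (the modulus n and private exponent d are assumed to come from an existing key; real code should use a vetted crypto library):

    import hashlib

    # Standard ASN.1 DigestInfo prefix for SHA-1, from PKCS#1
    SHA1_PREFIX = bytes.fromhex("3021300906052b0e03021a05000414")

    def sign(message: bytes, n: int, d: int) -> bytes:
        k = (n.bit_length() + 7) // 8              # modulus length in bytes
        digest = hashlib.sha1(message).digest()    # step 1: hash it
        t = SHA1_PREFIX + digest                   # the DigestInfo blob
        em = b"\x00\x01" + b"\xff" * (k - len(t) - 3) + b"\x00" + t  # step 2: pad
        m = int.from_bytes(em, "big")              # step 3: convert to a number
        s = pow(m, d, n)                           # step 4: "RSA decrypt" (private-key op)
        return s.to_bytes(k, "big")                # step 5: bytes out = the signature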

It’s step 2 we care about here. It’s called “padding”. You need it because RSA wants to encrypt or decrypt something that is the same size as the RSA key modulus.

There are a bunch of different ways to pad data before operating on it, but the one everyone uses is called PKCS#1 v1.5 (technically, EMSA-PKCS1-V1_5-ENCODE). It involves tacking a bunch of data in front of your hash, enough to pad it out to the size of the modulus:

[Figure pkcs-s.png: the padded block – “00 01”, a run of padding bytes, a “00” terminator, then the ASN.1 DigestInfo carrying the hash]

Note the order and placement of those boxes. They’re all variable length. Let’s call that out:

[Figure pkcs-b.png: the same layout, with the boxes called out as variable length]

The two big goals of a padding scheme are (a) expanding messages to the modulus length and (b) making it easy to correctly recover the message bytes later, regardless of what they are. This padding scheme is designed to make that straightforward. The padding bytes are clearly marked (“00 01”, which tells you that PKCS#1 v1.5 is in use), terminated (with a 00, which cannot occur in the padding), and followed by a blob of data with a length field. This whole bundle of data is what RSA works on.

The problem is, despite all the flexibility PKCS#1 v1.5 gives you, nobody expects you to ever use any of it. In fact, a lot of software apparently depends on data being laid out basically like the picture above. But all the metadata in that message gives you other options. For instance:

[Figure pkcs-a.png: a block with the padding cut short, the DigestInfo and hash pushed toward the front, and leftover attacker-chosen bytes filling out the rest of the modulus]

For some RSA implementations, this could be an acceptable message layout. It’s semantically invalid, but can appear syntactically valid. And if you don’t completely unpack and check all the metadata in the signature, well, this can happen:

  1. Determine the padding from the fixed header bytes.
  2. Scan until the terminator.
  3. Scoop out the hash information.
  4. Use the hash to confirm you’re looking at the same message that the “signer” signed.
  5. Use the signature to confirm that a real “signer” signed it.

The problem is that this attack breaks the connection between (4) and (5). The hash, now “centered” instead of “right-justified”, doesn’t really mean anything, because the signature covers a bunch more bits.
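In code, the sloppy pattern looks roughly like this – a hypothetical, simplified sketch, not any particular library's implementation (the 15-byte skip stands in for the ASN.1 DigestInfo header that broken verifiers step over instead of validating):

    import hashlib

    def broken_verify(message: bytes, signature: bytes, n: int, e: int) -> bool:
        k = (n.bit_length() + 7) // 8
        em = pow(int.from_bytes(signature, "big"), e, n).to_bytes(k, "big")
        if em[:2] != b"\x00\x01":                 # (1) fixed header bytes
            return False
        i = em.index(b"\x00", 2) + 1              # (2) scan until the terminator
        digest = em[i + 15 : i + 35]              # (3) scoop out the 20 hash bytes
        # (4)/(5) compare hashes -- but nothing checks that the hash ends
        # exactly at the end of the buffer, so trailing "evil" bytes are ignored
        return digest == hashlib.sha1(message).digest()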

This is all trivia without some value for “evil” that lets you substitute an arbitrary message hash (and thus an arbitrary message) into an otherwise valid-looking signature. Enter Bleichenbacher’s attack.

RSA takes parameters, one of which is a “public exponent”, which is part of the public key. If that exponent is “3”, which it often is, an attacker can exploit broken signature validation code to forge messages. The math here, which Bleichenbacher claims is simple enough to do with a pencil and paper, gets a bit hairy for me (I lose it at polynomials). Dino explains it better than I do. The long and the short of it is, you validate an RSA signature by computing:

s ^ e = m (mod n)

(where “e” is the public exponent, “n” is the public modulus, and “s” is the signature) and verifying that you get the same result as applying steps (1) and (2) from the signature process yourself. But:

  1. If the public exponent is “3”, and
  2. you inject the right “evil” bits into the PKCS data to make it a perfect cube, then
  3. you can create something that a broken RSA implementation will validate.
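Here's a rough Python sketch of that idea – illustrative only, and it assumes a modulus big enough that the garbage region covers roughly the last two-thirds of the block (it pairs with the broken_verify sketch above when e is 3):

    import hashlib

    def icbrt(x: int) -> int:
        """Integer cube root (floor) by binary search."""
        lo, hi = 0, 1 << (x.bit_length() // 3 + 2)
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if mid ** 3 <= x:
                lo = mid
            else:
                hi = mid - 1
        return lo

    def forge(message: bytes, n: int) -> bytes:
        k = (n.bit_length() + 7) // 8
        prefix = (b"\x00\x01\xff\x00"
                  + bytes.fromhex("3021300906052b0e03021a05000414")  # SHA-1 DigestInfo
                  + hashlib.sha1(message).digest())
        # Valid-looking part in front, zeros behind; round up to a perfect
        # cube. Cubing the root reproduces the prefix and only disturbs the
        # trailing bytes a sloppy verifier never looks at.
        target = int.from_bytes(prefix.ljust(k, b"\x00"), "big")
        root = icbrt(target)
        if root ** 3 < target:
            root += 1
        return root.to_bytes(k, "big")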

Good technical details in Hal Finney’s OpenPGP post (OpenPGP is not vulnerable). And a security advisory for OpenSSL (OpenSSL is vulnerable, through 0.9.8b).

Two things have gone wrong here:

  1. Implementations misparse signatures, assuming that a syntactically valid-looking hash is semantically operative.
  2. For the common RSA exponent parameter “3”, there’s a straightforward way to manipulate a bogus signature to make it look valid.

My understanding is, the demonstrated attack sticks the evil bits outside of the DigestInfo (the hash data). The way I see the bug being described, broken implementations are just scanning the buffer without “fully decoding it”, implying that if they just validated the ASN.1 metadata against the signature buffer they’d be OK. That may be true, but it seems possible that a blind “memcmp” on an over-long SHA1 octet string, which would be ASN.1-valid and leave the Digest at the end of the buffer, could also trigger the same bug.
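One robust approach – my sketch, not necessarily what any given library does – is “encode, then compare”: rebuild the entire expected block yourself and compare it byte-for-byte, so no part of the buffer goes unchecked:

    import hashlib, hmac

    def strict_verify(message: bytes, signature: bytes, n: int, e: int) -> bool:
        k = (n.bit_length() + 7) // 8
        em = pow(int.from_bytes(signature, "big"), e, n).to_bytes(k, "big")
        t = (bytes.fromhex("3021300906052b0e03021a05000414")  # SHA-1 DigestInfo
             + hashlib.sha1(message).digest())
        expected = b"\x00\x01" + b"\xff" * (k - len(t) - 3) + b"\x00" + t
        return hmac.compare_digest(em, expected)  # whole-buffer comparison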

This is only a problem if:

  1. You’re running broken code
  2. You’re relying on certificate validation
  3. The certificates you’re validating use an exponent of “3”

Unfortunately, although the default for OpenSSL is “65537” (no practical attack is known for that), “3” is a common alternative: it's faster, especially in embedded environments. Ferguson and Schneier recommend it in Practical Cryptography. Checking the CA bundle for “curl”:

cat curl-ca-bundle.crt | grep Exponent | grep ": 3" | wc -l

gives 6 hits, from Digital Signature Trust Co. and Entrust.net. Those same certs are in Firefox. Firefox doesn’t use OpenSSL; it uses NSS, and I don’t know if NSS is vulnerable. Here’s the code. I see it extracting a SHA1 hash’s worth of data, and don’t see it checking to make sure that data exhausts the buffer, but I don’t know the code well. Safari also depends on OpenSSL.

Correct me on any of these points and I’ll amend the post. I haven’t tested this (although Google Security did a proof of concept against OpenSSL that precipitated the advisory).

You should upgrade to the most recent OpenSSL. Bleichenbacher also recommended people stop using signing keys with a “3” exponent.

WordPress InfoCard integration code

Update:  There are now excellent community-based and commercial implementations of Information Card code for WordPress, php, ruby, “C” and other languages.  I've left this zip here for documentary and pedagogical purposes only.

 I've been wanting to share my experiences adding Information Card support to identityblog for quite a while now.  I just haven't had the time.

I started by publishing my work on building the necessary code for handling secure identity tokens.  But then I got interrupted with the necessities of life – like shipping Cardspace.

Anyway, now I'm ready to present my integration code.  Very little of it is unique to WordPress – it is really code that would in general apply just as much to any other piece of software.  Someone could easily factor my code so the interface is a little cleaner than is currently the case. 

When I had to actually alter WordPress files (only 3 of them), I just show the changes that are necessary.  You'll have to download the original files from WordPress (version 2.0.4) to see what I'm talking about in context (usually not necessary unless you are making the changes in your own version).

Download my contribution here.  My assumption is that the root of this download is the same as the root of the wordpress directory. 

[WARNING: DO NOT INSTALL THE WORDPRESS FILES FROM MY ZIP INTO YOUR OPERATIONAL WORDPRESS DIRECTORY!  IF YOU WANTED TO USE THIS CODE, YOU WOULD NEED TO MANUALLY INTEGRATE THE CHANGES I HAVE MADE TO MY VERSION OF THE WORDPRESS FILES INTO YOUR VERSION OF THE SAME FILES.  THIS NO LONGER MAKES SENSE SINCE THERE ARE EXCELLENT (SUPPORTED!!) VERSIONS AVAILABLE.]

The files all begin with “infocard” so they're easy to delete if you want to.

I'll be publishing a number of pieces explaining why I took the approaches I did.  I hope this will get some good, concrete conversation going.  The first in this series is uncharacteristically WordPress-specific – don't get discouraged if you're looking for something more general.  It talks about how I approached changing the wp-login page.  I'm pretty sure that even people thinking about InfoCard-enabling other products will find some ideas here that help them out.

Like my previous work, you can use this code in whatever way you want.  My goal is to help as many people as possible understand, use and deploy information cards.

UPDATE:  Thanks to Samuel Rinnetmäki for pointing out the need to warn readers not to install “as is” in an operational directory – it had never occurred to me they might do this…  I've edited the ZIP to make this impossible (09-02-2008).

Giving identity thieves the finger

Jerry Fishenden has been posting about biometrics recently, and I'll comment on the issues over the next little while. But before we get there, just to put everything in perspective, here's a piece from the BBC, quoted by Jerry, that I missed when it first came out.

Police in Malaysia are hunting for members of a violent gang who chopped off a car owner's finger to get round the vehicle's hi-tech security system.

The car, a Mercedes S-class, was protected by a fingerprint recognition system.

Accountant K Kumaran's ordeal began when he was run down by four men in a small car as he was about to get into his Mercedes in a Kuala Lumpur suburb.

The gang, armed with long machetes, demanded the keys to his car. It is worth around $75,000 second-hand on the local market, where prices are high because of import duties.

Stripped naked

The attackers forced Mr Kumaran to put his finger on the security panel to start the vehicle, bundled him into the back seat and drove off.

But having stripped the car, the thieves became frustrated when they wanted to restart it. They found they again could not bypass the immobiliser, which needs the owner's fingerprint to disarm it.

They stripped Mr Kumaran naked and left him by the side of the road – but not before cutting off the end of his index finger with a machete.

Police believe the gang is responsible for a series of thefts in the area.

Note to self:  don't purchase technology based on retinal scans.

Future discussion:  not only “things you are” but “things you know” can ultimately expose you to harm.

P.S.  Who would ever buy an S-Class?


Dynamic detection of client dialect requirements

It seems I might not have found quite the magic recipe yet in my attempt to dynamically recognize whether you are coming from a July CTP or release candidate client.  “Close, probably, but no cigar.”

If you have any kind of problem logging in with an Information Card, please email me the output of this diagnostic.

“Funny, it worked on MY machines.” (From Programming Yarns, Volume 1, Chapter 1). 

Sorry for having been a little optimistic about my initial success.  A bunch of people had reported that things worked – and I prematurely took that as meaning that they didn't NOT work.

I'm still trying to sort out why some people are having problems.  So if you don't mind trying out and mailing in the diagnostic, I'd really appreciate it.


Upcoming DIDW

I hope everyone's going to Digital ID World (DIDW) next week. We'll start on Monday with an Identity Open Space Unconference (don't worry, Virgos, they're unstructured, but not without shape and self-revealing purpose). Once this gives rise to the main event, there are a number of sessions that look fascinating for identity aficionados – like “What Do the Internet's Largest Sites Think About Identity?”, a panel moderated by Dan Farber and featuring representatives of the large sites, and a new presentation by Dick Hardt. There will also be an OSIS meeting – and of course, the endless hallway conversation.

I'm pairing up with Patrick Harding (from Ping Identity) on a Wednesday session called “Understanding InfoCards in an Enterprise Setting”. It will include a demo that I think will really help show the concrete benefits of InfoCards inside the enterprise. What can you expect?

First, you'll see the latest version of Ping's InfoCard server, now featuring both Managed IdP and Service Provider capabilities. Ping's goal is to show how to seamlessly chain passive and active federation – allowing for on-the-fly privacy context switching.  They'll use real-world use cases where passive federation gives way to active and vice versa.

According to Andre Durand, Ping Identity's CEO:

“The Digital ID World demo will show two scenarios to depict how passive federation (via SAML 2.0 Web SSO Profiles or WS-Federation) and active federation (via CardSpace) can both play a role in enabling a seamless user experience for accessing outsourced applications. The plan is to demonstrate how passive and active federation work together to enable a myriad of different business use cases when chained together in different situations.

“Scenario 1:

“An enterprise employee leverages her internal employee portal to access applications that are hosted externally. In the first case we show how SAML 2.0 Web SSO (passive federation) is used to enable seamless access into the SF.com web site. The user accepts this as part of her employment contract – the employer has deemed that the use of SF.com is critical to their business and they want no friction for their sales force in entering information for forecasting purposes.

“In the second case we'll show how CardSpace is used to ‘optionally’ enable seamless access into the employee's Employee Benefits web site. As the Employee Benefits web site is made up of a mixture of personal and corporate information (i.e. 401k, health and payroll) the employee is given the choice of whether to enable SSO via the use of CardSpace. The Employee Benefits web site is enabled with CardSpace. After the user clicks on the ‘Benefits’ link in their corporate portal, she is prompted with different Cards (Employer and Benefits) which she can then choose between for accessing the Benefits web site. If she chooses ‘Employer’ then she will be enabled with SSO from the Corporate Portal in future interactions.”

By the way, Andre, please tell me there's some way for her to change her mind later!

“Scenario 2:

“An enterprise employee is traveling and loses her cell phone. She uses her laptop to access her corporate cell phone provider in an effort to have the phone replaced immediately. The employee would normally access this web site via SSO from her corporate portal. The cell phone provider web site is enabled with CardSpace to simplify the IdP discovery and selection process. The employee is prompted to use her Employer card to authenticate to her employer's authentication service. The cell phone provider web site leverages CardSpace to handle IdP Selection rather than having to discover this themselves. Once the user has authenticated to her employer the returned security token contains the relevant information to service the employee's request for a new cell phone.”

It all sounds very interesting – amongst the first examples of what it means to have a full palette of identity options.  Ping is emblematic of an emerging ecology – many of us, across the industry, moving towards the Identity Big Bang.

Doc Searls will be doing the closing Keynote.  I'm really looking forward to that and to seeing you in Santa Clara.

Can namespaces survive name changes?

Arcadian Vision, an interesting place created by a person (I'm not sure who…) with deep knowledge of Ruby, thinks the namespace change problem I explained earlier today could have been avoided if we were using namespace schemes with a “little more indirection”. His thinking seems to spontaneously head in the same direction as Drummond Reed's.

Kim Cameron writes about namespace changes relating to Microsoft’s Cardspace initiative. The explanations offered sound good, but it’s hard to not be somewhat annoyed if you’re the one patching your code as a result of this change. This also reminds me of a few unconnected experiences that revolve, at least somewhat, around the permanence of URIs. URIs used to denote namespaces often (typically?) aren’t actually valid URLs. They specify a transfer protocol, but they’re not actually meant to be used with that protocol (e.g. they don’t link to documentation about that namespace). It seems to me that this is doubling the burden on a mechanism that isn’t necessarily appropriate. I suppose the argument goes that you control your domain, so you can split that resource among its various responsibilities. Sounds shaky to me, but let’s see where it leads us.

He reaches the conclusion:

So when I put it all together, I’m using my domain name to identify namespaces that are potentially distinct from the content served up via HTTP from that domain. I’m also using my domain name to locate information that isn’t intrinsically related to my domain. I think there’s a blog in there, too. Personally, I’m going to closely watch Google Base to see if it catches on. I could host my own data but have a unique Google Base identifier for it that I can edit to reflect changes in where I’m keeping my data. So how about rather than using a URI to identify my namespace, I identify it as this, which is a unique identifier, can be annotated with relevant metadata (like a link to documentation), and won’t screw anyone else up if I change the URL of my website.

I find it interesting that someone would think of using Google Base as a kind of XRI.  That's pretty far out of the box.  I can hear schema-addicts writhing in pain, but no one can argue with the simplicity of Arcadian's scheme.

Regardless, I think the case of whether to put InfoCard claims under “xmlsoap.org” or “microsoft.com” turns on a different set of issues.  I think the move makes a statement – one that is part of the essence of the InfoCard system – about the cross-industry character of the technology.  In other words, the semantics of the work are becoming richer as a result of the move.

In terms of using Google Base and names like http://base.google.com/base/a/1354745/D5640690229463248432, doesn't that have a fixed root too?  Arcadian ends up still being tied to a domain-based system, and the more he goes down this path, the more he will find himself becoming dependent on the domain.  If his approach were to become popular, everyone would be making themselves progressively more dependent on a single namespace with a commercial purpose and future – a course one shouldn't adopt without careful thought.

Arcadian should look at Drummond Reed's work before adopting conventional search engines for this particular purpose.  It introduces a framework of persistent identifiers that sit behind transient namespaces, and provides a mapping service with, as I understand it, no central commercial owner.  In other words, the indirection is offered through a new commons.  You can get an intro here and here.


Namespace change in Cardspace release candidate

Via Steve Linehan, a pointer to Vittorio Bertocci's blog, Vibro.NET:

In RC1 (.NET Framework 3.0, IE7.0 and/or Vista: for once, we have all nicely aligned) we discontinued the namespace http://schemas.microsoft.com/ws/2005/05/identity, replaced by http://schemas.xmlsoap.org/ws/2005/05/identity. That holds both for the claims in the self-issued cards (s-i-c) and for the qname of the issuer associated with s-i-c. If you browse a pre-RC RP site from an RC1 machine, you may experience weird effects – for example, the Identity Selector claiming that the website is asking for a managed card from the issuer http://schemas.microsoft.com/ws/2005/05/identity/issuer/self, which is no longer recognized as the s-i-c special issuer. Note that it is often not a good idea to explicitly ask for a specific issuer 🙂

 If you want to see a sample of this, check out the updated version of the sandbox.

Why this change? As you may know, relying parties specify the claims they want the identity provider to supply (for example, “lastname” or “givenname”) using URIs.

Everyone will agree that the benefit of this is that the system is very flexible – anyone can make up their own URIs, get relying parties to ask for them, and then supply them through their own identity provider. 

But a lot of synergy accrues if we can agree on sets of basic URIs – much like we did with LDAP attribute names and definitions.  

Given that a number of players are implementing systems that interoperate with our self-asserted identity provider, it made sense to change the namespace of the claims from microsoft.com to xmlsoap.org.  In fact this is an early outcome of our collaboration with the Open Source Identity Selector (OSIS) members.  Now that there are a bunch of people who want to support the same set of claims, it makes total sense to move them into a “neutral” namespace.

While this is therefore a “good and proper” refinement, it can pose a problem for people trying out the new software:  if you are using an early version of Cardspace with self-issued cards that respond to the “microsoft.com” namespace, it won't match new-fangled claims requested by a web site using the “xmlsoap.org” namespace.  And vice versa.  Further, the “card illumination” logic from one version won't recognize the claims from the other namespace.  Cardspace will think the relying party is looking for specialized claims supplied by a “managed card” provider (e.g. a third party).  Thus the confusing message.

After getting some complaints, I fixed this problem at identityblog: now I detect the version of cardspace a client is running and then dynamically request claims in either an old dialect or the new one.  I would say people would do well to build this capability into their implementation from day one.  My sample code is here.
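My sample is for identityblog (PHP/WordPress), but the gist is easy to sketch in a few lines of Python (the claim names are the standard self-issued ones; how you detect the client version – user agent, a diagnostic probe – is site-specific and simply assumed here):

    # Serve claim URIs in whichever namespace the detected client understands.
    OLD_NS = "http://schemas.microsoft.com/ws/2005/05/identity/claims/"
    NEW_NS = "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/"

    def required_claims(client_is_release_candidate: bool) -> str:
        """Build the space-separated claim list for the InfoCard object tag."""
        ns = NEW_NS if client_is_release_candidate else OLD_NS
        return " ".join(ns + c for c in ("givenname", "surname", "emailaddress"))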

Kim Cameron and DRM

Ben Laurie thinks I was damning digital rights technology when I complained about not being able to burn some of the Modern Times songs I had paid for and downloaded. He writes:

“Kim’s got all steamed up over iTunes’ DRM.

“Perhaps a better target for his vitriol would be his own company’s DRM, which will not only prevent you from burning stuff to CD, it’ll even remove your right to play it after you’ve purchased it.”

Why?  The parties to a transaction may feel fine about a contract limiting the number of times content can be burned or played.  I have nothing against that.  Let a thousand flowers bloom.  I'm not against technological capabilities, if they are reliable and people want to use them.

But I went to iTunes for two reasons.  First, it had the album that I wanted to burn to CD.  Second, its policy said you can burn your downloaded songs onto CD seven times.

If iTunes had announced more draconian rules, I just wouldn't have gone there.

The problem is that some of the songs were not covered by the announced policy. 

Some have argued the tracks in question aren't songs, they are “videos”. 

I still think they're songs even though you can see Dylan's mouth moving.  At any rate, for four titles – that I also paid for – the sound of Dylan and his band is now caged up inside  iTunes’ proprietary environment.  I can't burn them.  And I can't hear them in my car, on my stereo, or on my television.  I have to use the iTunes application.

Selling me songs and then saying they're not songs and that they're bonus items is really the pits.  iTunes songs cost $1.00 each.  There are 10 songs on the album that can be burned to CD (cost of that is $10.00).  But I paid $14.00.  So the extra songs are not a bonus – they're charged at the same rate as all the other songs – but can't be burned.

All of this is what leads Cory Doctorow to ask if there is a two-tiered music distribution system emerging, and I think it's a very good question.

Music that can only be played on a television.

Julian Bond takes Cory's “new business model” thinking even further:

Cory has been talking about: Kim Cameron having trouble with missing tracks from Bob Dylan's CD “Modern Times”

There's another way to get Modern Times and burn it to a CD: you can buy it from AllOfMP3.com

But go here and you'll find AllOfMP3 only has the 10 tracks off the audio CD – not the 4 tracks off the DVD.

I think we're going to see more and more of this. A CD packaged with a DVD containing videos of additional tracks. Ripping the audio from the DVD's videos is considerably harder than ripping a CD to MP3. And it opens up an avenue for the record companies (and by implication iTMS) to change the rules.

To a certain extent I admire this. It's a way of making the physical object worth more than the digital download. But it can also be seen as yet another example of DRM – in this case, the stronger DRM present on a DVD compared with the unprotected audio CD. The big downside of course is that the DVD is only playable on a DVD player, which for many will mean no playing in the car, on a portable CD player or on the average stereo. That seems quite a strange idea. Music that can only be played on a television.