Yes to SCIM. Yes to Graph.

Today Alex Simons, Director of Program Management for Active Directory, posted the links to the Developer Preview of Windows Azure Active Directory.  Another milestone.

I'll write about the release in my next post.  Today, since the Developer Preview focuses a lot of attention on our Graph API, I thought it would be a good idea to respond first to the discussion that has been taking place on Twitter about the relationship between the Graph API and SCIM (Simple Cloud Identity Management).

Since the River of Tweets flows without beginning or end, I'll share some of the conversation for those who had other things to do:

@NishantK: @travisspencer IMO, @johnshew’s posts talk about SaaS connecting to WAAD using Graph API (read, not prov) @IdentityMonk @JohnFontana

@travisspencer: @NishantK Check out @vibronet’s TechEd Europe talk on @ch9. It really sounded like provisioning /cc @johnshew @IdentityMonk @JohnFontana

@travisspencer: @NishantK But if it’s SaaS reading and/or writing, then I agree, it’s not provisioning /cc @johnshew @IdentityMonk @JohnFontana

@travisspencer: @NishantK But even read/write access by SaaS *could* be done w/ SCIM if it did everything MS needs /cc @johnshew @IdentityMonk @JohnFontana

@NishantK: @travisspencer That part I agree with. I previously asked about conflict/overlap of Graph API with SCIM @johnshew @IdentityMonk @JohnFontana

@IdentityMonk: @travisspencer @NishantK @johnshew @JohnFontana check slide 33 of SIA322 it is really creating new users

@IdentityMonk: @NishantK @travisspencer @johnshew @JohnFontana it is JSON vs XML over HTTP… as often, MS is doing the same as standards with its own

@travisspencer: @IdentityMonk They had to ship, so it’s NP. Now, bring those ideas & reqs to IETF & let’s get 1 std for all @NishantK @johnshew @JohnFontana

@NishantK: @IdentityMonk But isn’t that slide talking about creating users in WAAD (not prov to SF or Webex)? @travisspencer @johnshew @JohnFontana

@IdentityMonk: @NishantK @travisspencer @johnshew @JohnFontana indeed. But its like they re one step of 2nd phase. What are your partners position on that?

@IdentityMonk: @travisspencer @NishantK @johnshew @JohnFontana I hope SCIM will not face a #LetTheWookieWin situation

@NishantK: @johnshew @IdentityMonk @travisspencer @JohnFontana Not assuming anything about WAAD. Wondering about overlap between SCIM & Open Graph API

Given these concerns, let me explain what I see as the relationship between SCIM and the Graph API.

What is SCIM?

All the SCIM documents begin with a commendably unambiguous statement of what it is:

The Simple Cloud Identity Management (SCIM) specification is designed to make managing user identity in cloud based applications and services easier. The specification suite seeks to build upon experience with existing schemas and deployments, placing specific emphasis on simplicity of development and integration, while applying existing authentication, authorization and privacy models. Its intent is to reduce the cost and complexity of user management operations by providing a common user schema and extension model, as well as binding documents to provide patterns of exchanging this schema using standard protocols. In essence, make it fast, cheap and easy to move users in to, out of and around the cloud. [Kim: emphasis is mine]

I support this goal. Further, I like the concept of spec writers being crisp about the essence of what they are doing: “Make it fast, cheap and easy to move users in to, out of and around the cloud”.  For this type of spec to be useful we need it to be as widely adopted as possible, and that means keeping it constrained, focussed and simple enough that everyone chooses to implement it.

I think the SCIM authors have done important work to date.  I have no comments on the specifics of the protocol or schema at this point – I assume those will continue to be worked out in accordance with the spec's “essence statement” and be vetted by a broad group of players now that SCIM is on a track towards standardization.  Microsoft will try to help move this forward:  Tony Nadalin will be attending the next SCIM meeting in Vancouver on our behalf.
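
To make the “essence statement” concrete, here is a minimal sketch of what provisioning a user looks like on the wire under the SCIM 1.1 draft.  The host and user data below are invented for illustration, and the details may still shift as the spec moves through standardization:

POST /v1/Users HTTP/1.1
Host: scim.example.com
Content-Type: application/json

{
  "schemas": ["urn:scim:schemas:core:1.0"],
  "userName": "bjensen",
  "name": { "givenName": "Barbara", "familyName": "Jensen" },
  "emails": [ { "value": "bjensen@example.com" } ]
}

The service answers with 201 Created and the new resource, including a server-assigned "id"; a later DELETE to /v1/Users/{id} de-provisions the user.  That really is the whole provisioning story – fast, cheap and easy.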

Meanwhile, what is “the Graph”? 

Given that SCIM's role is clear, let's turn to the question of how it relates to a “Graph API”.  

Why does our thinking focus on a Graph API in addition to a provisioning protocol like SCIM?  There are two answers.

Let's start with the theoretical one.  It is because of the central importance of graph technology to managing connectedness – something at the core of the digital universe.  Treating the world as a graph gives us a unified approach to querying and manipulating interconnected objects of many different kinds that exist in many different relationships to each other.

But theory only appeals to some… So let's add a second answer that is more… practical.  A directory has emerged that by August is projected to contain one billion users. True, it's only one directory in a world with many directories (most agree too many).  But beyond the importance it achieves through its scale, it fundamentally changes what it means to be a directory:  it is a directory that surfaces a multi-dimensional network.  

This network isn't simply a network of devices or people.  It's a network of people and the actions they perform, the things they use and create, the things that are important to them and the places they go.  It's a network of relationships between many meaningful things.  And the challenge is now for all directories, in all domains, to meet a new bar it has set.    

Readers who come out of a computer science background are no doubt familiar with what a graph is.  But I recommend taking the time to come up to speed on the current work on connectedness, much of which is summarized in Networks, Crowds and Markets: Reasoning About a Highly Connected World (by Easley and Kleinberg).  The thesis is straightforward:  the world of technology is one where everything is connected with everything else in a great many dimensions, and by refocusing on the graph in all its diversity we can begin to grasp it. 

In early directories we had objects that represented “organizations”, “people”, “groups” and so on.  We saw organizations as “containing” people, and saw groups as “containing” people and other groups in a hierarchical and recursive fashion.  The hierarchy was a particularly rigid kind of network or graph that modeled the rigid social structures (governments, companies) that technology was describing at the time.

But in today's flatter, more interconnected world, the things we called “objects” in the days of X.500 and LDAP are better expressed as “nodes” with different kinds of “edges” leading to many possible kinds of other “nodes”.  Those who know my work from around 2000 may remember I used to call this polyarchy and contrast it with the hierarchical limitations of LDAP directory technology.

From a graph perspective we can see “person nodes” having “membership edges” to “group nodes”.  Or “person nodes” having “friend edges” to other “person nodes”.  Or “person nodes” having “service edges” to a “mail service node”.  In other words, the edges are typed relationships between nodes, and may themselves carry additional properties.  Starting from a given node we can “navigate the graph” across different relationships (I think of them as dimensions), and reason in many new ways.

For example, we can reason about the strength of the relationships between nodes, and perform analysis, understand why things cluster together in different dimensions, and so on.

From this vantage point, directory is a repository of nodes that serve as points of entry into a vast graph – some of its nodes are present in the same repository, while others can only be reached by following edges that point to resources in different repositories.  We already have forerunners of this in today's directories – for example, if the URL of my blog is contained in my directory entry it represents an edge leading to another object.  But with conventional technology, there is a veil over that distant part of the graph (my blog).  We can read it in a browser but not access the entities it contains as structured objects.  The graph paradigm invites us to take off the veil, making it possible to navigate nodes across many dimensions.

The real power of directory in this kind of interconnected world is its ability to serve as the launch pad for getting from one node to a myriad of others by virtue of different relationships.

This requires a Graph Protocol

To achieve this we need a simple, RESTful protocol that allows use of these launch pads to enter a multitude of different dimensions.

We already know we can build a graph with just HTTP REST operations.  After all, the web started as a graph of pages…  The pages contained URLs (edges) to other pages.  It is a pretty simple graph but that's what made it so powerful.

With JSON (or XML) the web can return objects.  And those objects can also contain URLs.  So with just JSON and HTTP you can have a graph of things.  The things can be of different kinds.  It's all very simple and very profound.
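
To make that concrete, here is a hypothetical person node rendered as JSON – every name and URL below is invented for illustration, not taken from any actual API:

{
  "objectType": "Person",
  "displayName": "Alice",
  "memberOf":    { "uri": "https://example.com/groups/identity-architects" },
  "friends":     { "uri": "https://example.com/users/alice/friends" },
  "mailService": { "uri": "https://example.com/services/mail" },
  "blog":        { "uri": "https://blog.example.com/alice" }
}

Each property holding a URI is a typed edge in a different dimension, and an HTTP GET on any of them navigates the graph one hop further.  That is the entire programming model.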

No technology ghetto

Here I'm going to put a stake in the ground.  When I was back at ZOOMIT we built the first commercial implementation of LDAP while Tim Howes was still at the University of Michigan.  It was a dramatic simplification relative to X.500 (a huge and complicated standard that ZOOMIT had also implemented) and we were all very excited at how much Tim had simplified things.  Yet in retrospect, I think the origins of LDAP in X.500 condemned directory people to life in a technology ghetto.  Much more dramatic simplifications were coming down the pike all around us in the form of HTML, latter-day SQL and XML.  For every 100 application programmers familiar with these technologies, there might have been – on a good day – one who knew something about LDAP.  I absolutely respect and am proud of all the GOOD that came from LDAP, but I am also convinced that our “technology isolation” was an important factor that kept (and keeps) directory from being used to its potential.

So one of the things that I personally want to see as we reimagine directory is that every application programmer will know how to program to it.  We know this is possible because of the popularity of the Facebook Graph API.  If you haven't seen it close up, and you have enough patience to watch a stream-of-consciousness demo, you will get the idea by watching this little walkthrough of the Facebook Graph Explorer.  Or better still, just go here and try it with your own account data.

You have to agree it is dead simple and yet does a lot of what is necessary to navigate the kind of graph we are talking about.  There are many other similar explorers available out there – including ours.  I chose Facebook's simply because it shows that this approach is already being used at colossal scale.  For this reason it reveals the power of the graph as an easily understood model that will work across pretty much any entity domain – i.e. a model that is not technologically isolated from programming in general.

A pluggable namespace with any kind of entity plugging in

In fact, the Graph API approach taken by Facebook grew out of a series of discussions among people now scattered across the industry, in which the key concept was creating a uniform pluggable namespace with “any” kind of entity plugging in (ideas came from many sources, including the design of the Azure Service Bus).

Nishant and others have posed the question as to whether such a multidimensional protocol could do what SCIM does.  And my intuition is that if it really is multidimensional it should be able to provide the necessary functionality.  Yet I don't think that diminishes in any way the importance of or the need for SCIM as a specialized protocol.  Paradoxically it is the very importance of the multidimensional approach that explains this.

Let's have a thought experiment. 

Let's begin with the assumption that a multidimensional protocol is one of the great requirements of our time.  It then seems inevitable to me that we will continue to see the emergence of a number of different proposals for what it should be.  Human nature and the angels of competition dictate that different players in the cloud will align themselves with different proposals.  Ultimately we will see convergence – but that will take a while.  Question:  How are we to do cloud provisioning in the meantime?  Does everyone have to implement every multidimensional protocol proposal?  Fail!

So pragmatism calls for us to have a widely accepted and extremely focused way of doing provisioning that “makes it fast, cheap and easy to move users in to, out of and around the cloud”.

Meanwhile, allow developers to combine identity information with information about machines, services, web sites, databases, file systems, and line of business applications through multidimensional protocols and APIs like the Facebook and the Windows Azure Active Directory Graph APIs.  For those who are interested, you can begin exploring our Graph API here:  Windows Azure AD Graph Explorer (hosted in Windows Azure) (Select ‘Use Demo Company’ unless you have your own Azure directory and have gone through the steps to give the explorer permission to see it…)

To me, the goals of SCIM and the goals of the Graph API are entirely complementary and the protocols should coexist peacefully.  We can even try to find synergy and ways to make things like schema elements align so as to make it as easy as possible to move between one and the other. 

Diagram 2.0: No hub. No center.

As I wrote here, Mary Jo Foley's interpretation of one of the diagrams in John Shewchuk's second WAAD post made it clear we needed to get a lot visually crisper about what we were trying to show.  So I promised that we'd go back to the drawing board.  John put our next version out on Twitter, got more feedback (see comments below) and ended up with what Mary Jo christened “Diagram 2.0”.  Seriously, getting feedback from so many people who bring such different experiences to bear on something like this is amazing.  I know the result is infinitely clearer than what we started with.

In the last frame of the diagram, any of the directories represented by the blue symbol could be an on-premises AD, a Windows Azure AD, something hybrid, an OpenLDAP directory, an Oracle directory or anything else.  Our view is that having your directory operated in the cloud simplifies a lot.  And we want WAAD to be the best possible cloud directory service, operating directories that are completely under the control of their data owners:  enterprises, organizations, government departments and startups.

Further comments welcome.

Good news and bad news from Delaware Lawmakers

Reading the following SFGate story was a real rollercoaster ride: 

DOVER, Del. (AP) — State lawmakers have given final approval to a bill prohibiting universities and colleges in Delaware from requiring that students or applicants for enrollment provide their social networking login information.

The bill, which unanimously passed the Senate shortly after midnight Saturday, also prohibits schools and universities from requesting that a student or applicant log onto a social networking site so that school officials can access the site profile or account.

The bill includes exemptions for investigations by police agencies or a school's public safety department if criminal activity is suspected.

Lawmakers approved the bill after deleting an amendment that expanded the scope of its privacy protections to elementary and secondary school students.

First of all there was the realization that if lawmakers had to draft this law it meant universities and colleges were already strong-arming students into giving up their social networking credentials.  This descent into hell knocked my breath away. 

But I groped my way back from the burning sulfur since the new bill seemed to show a modicum of common sense. 

Until finally we learn that younger children won't be afforded the same protections…   Can teachers and principals actually bully youngsters to log in to Facebook and access their accounts?  Can they make kids hand over their passwords?  What are we teaching our young people about their identity?

Why oh why oh why oh? 

 

There is no hub. There is no center.

Mary Jo Foley knows her stuff, knows identity and knows Microsoft.  She just published a piece called “With Azure Active Directory, Microsoft wants to be the meta ID hub”.  The fact that she picked up on John Shewchuk's piece despite all the glamorous announcements made in the same timeframe testifies to the fact that she understands a lot about the cloud.  On the other hand, I hope she won't mind if I push back on part of her thesis.  But before I do that, let's hear it:

Summary: A soon-to-be-delivered preview of a Windows Azure Active Directory update will include integration with Google and Facebook identity providers.

Microsoft isn’t just reimagining Windows and reimagining tablets. It’s also reimagining Active Directory in the form of the recently (officially) unveiled Windows Azure Active Directory (WAAD).

In a June 19 blog post that largely got lost among the Microsoft Surface shuffle last week, Microsoft Technical Fellow John Shewchuk delivered the promised Part 2 of Microsoft’s overall vision for WAAD.

WAAD is the cloud complement to Microsoft’s Active Directory directory service. Here’s more about Microsoft’s thinking about WAAD, based on the first of Shewchuk’s posts. It already is being used by Office 365, Windows InTune and Windows Azure. Microsoft’s goal is to convince non-Microsoft businesses and product teams to use WAAD, too.

This is how the identity-management world looks today, in the WAAD team’s view:

And this is the ideal and brave new world they want to see, going forward.


WAAD is the center of the universe in this scenario (something with which some of Microsoft’s competitors unsurprisingly have a problem).

[Read more of the article here]

The diagrams Mary Jo uses are from John's post.  And the second clearly shows the “Active Directory Service” triangle in the center of the picture, so one can understand why Mary Jo (and others) could think we are talking about Active Directory being at the center of the universe.

Yet in describing what we are building, John writes,

“Having a shared directory that enables this integration provides many benefits to developers, administrators, and users.”

“Shared” is not the same as “Central”.  For the Windows Azure AD team the “shared directory” is not “THE hub” or “THE center”.  There is no one center any more in our multi-centered world.  We are not building a monolithic, world-wide directory.  We are instead consciously operating a directory service that contains hundreds of thousands of directories that are actually owned by individual enterprises, startups and government organizations.  These directories are each under the control of their data owner, and are completely independent until their data owner decides to share something with someone else.

The difference may sound subtle, but I don't think it is.  When I think of a hub I think of a standalone entity mediating between a set of claims providers and a set of relying parties.  

But with Azure Active Directory the goal is quite different:  to offer a holistic “Identity Management as a Service” for organizations, whether startups, established enterprises or government organizations – in other words to “operate” on behalf of these organizations.  

One of the things such a service can do is to take care of connecting an organization to all the consumer and corporate claims providers that may be of use to it.  We've actually built that capability, and we'll operate it on a 24/7 basis as something that scales and is robust.  But IdMaaS involves a LOT of other different capabilities as well.  Some organizations will want to use it for authentication, for authorization, for registration, credential management and so on.  The big IdMaaS picture is one of serving the organizations that employ it – quite different from being an independent hub and following a “hub” business model. 

In this era of the cloud, there are many cloud operators.  Martin Kuppinger has pointed out that “the cloud” is too often vendor-speak for “this vendor's cloud”.  In reality there are “clouds” that will each host services that are premium grade and that other services constructed in different clouds will want to consume.  So we will all need the ability to reach across clouds with complete agility, security and privacy and within a single governance framework.  That's what Identity Management as a Service needs to facilitate, and the Active Directory Service triangle in the diagram above is precisely such a service.  There will be others operated by competitors handling the identity needs of other organizations.  Each of us will need to connect enterprises we serve with those served by our competitors.

This said, I really accept the point that to express this in a diagram we could (and should) draw it very differently.  So that's something John and I are going to work on over the next few days.  Then we'll get back to you with a diagram that better expresses our intentions.

 

Disruptive Forces: The Economy and the Cloud

New generations of digital infrastructure get deployed quickly even when they are incompatible with what already exists.  But old infrastructure is incredibly slow to disappear.  The complicated business and legal mechanisms embodied in computer systems are risky and expensive to replace.  But existing systems can't function without the infrastructure that was in place when they were built…  Thus new generations of infrastructure can be easily added, but old and even antique infrastructures survive alongside them to power the applications that have not yet been updated to employ new technologies.

This persistence of infrastructure can be seen as a force likely to slow changes in Identity Management, since it is a key component of digital infrastructure.

Yet global economic and technological trends lead in the opposite direction. The current reality is one of economic contraction where enterprises and governments are under increasing pressure to produce more with less. Analysts and corporate planners don’t see this contraction as being transient or likely to rebound quickly. They see it as a long-term trend in which organizations become leaner, better focused and more fit-to-purpose – competing in an economy where only fit-to-purpose entities survive.

At the same time that these economic imperatives are shaking the enterprise and governments, the introduction of cloud computing enables many of the very efficiencies that are called for.

Cloud computing combines a number of innovations. Some represent new ways of delivering and operating computing and communications power.  But the innovations go far beyond higher density of silicon or new efficiencies in cooling technologies…  The cloud is ushering in a whole new division of labor within information technology.

Accelerating the specialization of functions

The transformational power of the cloud stems above all else from its ability to accelerate the specialization of functions so they are provided by those with the greatest expertise and lowest costs.

I was making this “theoretical” point while addressing the TSCP conference recently, which brings together people from extremely distributed industries such as aeronautics and defense.  Looking out into the audience I was suddenly struck by something that should have been totally obvious to me.  All the industries represented in that room, except for information technology, had an extensive division of labor across a huge number of parties.  Companies like Boeing or Airbus don't manufacture the spokes on the wheels of their planes, so to speak.  They develop specifications and cost-effectively assemble finished products from components manufactured and refined by a whole ecosystem.  They have massively distributed supply chains.  Yet our model in information technology has remained rather pre-industrial, and there are innumerable examples of companies expending their own resources doing things they aren't expert at, rather than employing a supply chain.  And part of the reason is the lack of an infrastructure that supports this diversification.  That infrastructure is just arriving now – in the form of the cloud.

Redistributing processes to be most efficiently performed

So technologically, the cloud is an infrastructure honed for multi-sourcing – refactoring processes and redistributing them to where they are most efficiently performed.

The need to become leaner and more fit-to-purpose will drive continuous change.  Organizations will attempt to take advantage of the emerging cloud ecology to substitute off-the-shelf commoditized systems offered as specialized services. When this is not possible they will construct their newly emerging systems in the cloud using other specialized ecosystem services as building blocks.

Given the fact that the best building blocks for given purposes may well be hosted on different clouds, developers will expect to be able to reach across clouds to integrate with the services of their choice. Cloud platforms that don’t offer this capability will die from synergy deficiency.

Technological innovation will need to take place before services will be able to work securely in this kind of loosely coupled world – constituting a high-value version of what has been called the “API Economy”. The precept of the API economy is to expose all functionality as simple and easily understood services (e.g. based on REST) – and allow them to be consumed at a high level of granularity on a pay-as-you-go basis.

In the organizational world, most of the data that will flow through these APIs will be private data. For enterprises and governments to participate in the API Economy they will require a system of access control in which many different applications run by different administrations in different clouds are able to reuse knowledge of identity and security policy to adequately protect the data they handle.  They will also need shared governance.

Specifically, it must be possible to reliably identify, authenticate, authorize and audit across a graph of services before reuse of specialized services becomes practicable and economical and the motor of cloud economics begins to hum.

 

Making Good on the Promise of IdMaaS

The second part of John Shewchuk's blog on Windows Azure Active Directory has been published here.  John goes into more detail about a number of things, focusing on the way it allows customers to hook their Cloud AD into the API Economy in a controlled and secure way.  

Rather than describe John's blog myself I'm going to parrot the blog post that analyst Craig Burton put up just a few hours ago.  I find it really encouraging to see his excitement:  it's the way I feel too, since I also think this is going to open up so many opportunities for innovation, make developing services simpler and make the services themselves more secure and respectful of privacy.  Here's Craig's post:

As a follow up to Microsoft’s announcement of IdMaaS, the company announced the — to be soon delivered — developer preview for Windows Azure Active Directory (WAzAD). As John Shewchuk puts it:

The developer preview, which will be available soon, builds on capabilities that Windows Azure Active Directory is already providing to customers. These include support for integration with consumer-oriented Internet identity providers such as Google and Facebook, and the ability to support Active Directory in deployments that span the cloud and enterprise through synchronization technology.

Together, the existing and new capabilities mean a developer can easily create applications that offer an experience that is connected with other directory-integrated applications. Users get SSO across third-party and Microsoft applications, and information such as organizational contacts, groups, and roles is shared across the applications. From an administrative perspective, Windows Azure Active Directory provides a foundation to manage the life cycle of identities and policy across applications.

In the Windows Azure Active Directory developer preview, we added a new way for applications to easily connect to the directory through the use of REST/HTTP interfaces.

An authorized application can operate on information in Windows Azure Active Directory through a URL such as:

https://directory.windows.net/contoso.com/Users('Ed@Contoso.com')

Such a URL provides direct access to objects in the directory. For example, an HTTP GET to this URL will provide the following JSON response (abbreviated for readability):

{ "d": {
    "Manager": { "uri": "https://directory.windows.net/contoso.com/Users('User…')/Manager" },
    "MemberOf": { "uri": "https://directory.windows.net/contoso.com/Users('User…')/MemberOf" },
    "ObjectId": "90ef7131-9d01-4177-b5c6-fa2eb873ef19",
    "ObjectReference": "User_90ef7131-9d01-4177-b5c6-fa2eb873ef19",
    "ObjectType": "User",
    "AccountEnabled": true,
    "DisplayName": "Ed Blanton",
    "GivenName": "Ed",
    "Surname": "Blanton",
    "UserPrincipalName": "Ed@contoso.com",
    "Mail": "Ed@contoso.com",
    "JobTitle": "Vice President",
    "Department": "Operations",
    "TelephoneNumber": "4258828080",
    "Mobile": "2069417891",
    "StreetAddress": "One Main Street",
    "PhysicalDeliveryOfficeName": "Building 2",
    "City": "Redmond",
    "State": "WA",
    "Country": "US",
    "PostalCode": "98007"
  }
}

Having a shared directory that enables this integration provides many benefits to developers, administrators, and users. If an application integrates with a shared directory just once—for one corporate customer, for example—in most respects no additional work needs to be done to have that integration apply to other organizations that use Windows Azure Active Directory. For an independent software vendor (ISV), this is a big change from the situation where each time a new customer acquires an application a custom integration needs to be done with the customer’s directory. With the addition of Facebook, Google, and the Microsoft account services, that one integration potentially brings a billion or more identities into the mix. The increase in the scope of applicability is profound. (Highlighting is mine – Craig).

Now that’s What I’m Talking About

There is still a lot to consider in what an IdMaaS system should actually do, but my position is that just the little bit of code reference shown here is a huge leap for usability and simplicity for all of us. I am very encouraged. This would be a major indicator that Microsoft is on the right leadership track to not only providing a specification for an industry design for IdMaaS, but also is well on its way to delivering a product that will show us all how this is supposed to work.

Bravo!

The article goes on to make commitments on support for OAuth, OpenID Connect, and SAML/P. No mention of JSON Path support but I will get back to you about that. My guess is that if Microsoft is supporting JSON, JSON Path is also going to be supported. Otherwise it just wouldn’t make sense.

JSON and JSON Path

The API Economy is being fueled by the huge trend of accessibility of organizations’ core competence through APIs. Almost all of the API development occurring in this trend is based on a RESTful API design with data being encoded in JSON (JavaScript Object Notation). While JSON is not a new specification by any means, it is only in the last 5 years that JSON has emerged as the preferred — in lieu of XML — data format. We see this trend only becoming stronger.

[Craig presents a table comparing XPath to JSON Path – look at it here.]

Summary

As an industry, we are completely underwater in getting our arms around a workable — distributed and multi-centered identity management metasystem — that can even come close to addressing the issues that are already upon us. This includes the Consumerization of IT and its subsequent Identity explosion. Let alone the rise of the API Economy. No other vendor has come close to articulating a vision that can get us out of the predicament we are already in. There is no turning back.

Because of the lack of leadership (the crew that killed off Information Cards) in the past at Microsoft about its future in Identity Management, I had completely written Microsoft off as being relevant. I would have never expected Microsoft to gain its footing, do an about face, and head in the right direction. Clearly the new leadership has a vision that is ambitious and in alignment with what is needed. Shifting with this much spot-on thinking in the time frame we are talking about (a little over 18 months) is tantamount to turning an aircraft carrier 180 degrees in a swimming pool.

I am stunned, pleased and can’t wait to see what happens next.
 

I think it goes without saying that “turning an aircraft carrier 180 degrees in a swimming pool” is a fractal mixed metaphor of colossal and recursive proportions that boggles the mind – yet there is more than a little truth to it.  In fact that's really one of the things the cloud demands of us all.

Craig's question about JSON Path is a good one.  The answer is that JSON Path is essentially a way of navigating and extracting information from a JSON document.  WAzAD's Graph API returns JSON documents, and if they are complex documents we expect programmers will use JSON Path – which they already know – to extract specific information.  It will be part of their local programming environment on whatever device or platform they are issuing a query from.
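
For instance, against the abbreviated JSON response shown earlier, a programmer could extract values locally with expressions like these (standard JSON Path syntax, evaluated client-side by whatever library the platform provides):

$.d.DisplayName     selects "Ed Blanton"
$.d.Manager.uri     selects the URL of the Manager edge
$..uri              selects every edge URI anywhere in the document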

On the other hand, one can imagine supporting JSON Path queries in the RESTful interface itself.  Suppose you have a JSON document with many links to other JSON documents.  Do you then support “chaining” on the server so it follows the links for you and returns the distributed JSON Path result?  The problem with this approach is that a programming model we want to be ultra-simple and transparent for the programmer turns into something opaque that can have many side effects, become unpredictable and exhibit performance issues.  As far as I know, the social network APIs that are most sophisticated in their use of links don't support this.  They just get the programmer to chase the links that are of interest.
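
Concretely, the transparent model is just two requests the programmer issues and fully understands:

GET https://directory.windows.net/contoso.com/Users('Ed@Contoso.com')
GET https://directory.windows.net/contoso.com/Users('Ed@Contoso.com')/Manager

A hypothetical server-side alternative – something like GET …/Users('Ed@Contoso.com')?query=$.Manager.Manager.DisplayName, which I am inventing purely to illustrate the idea – would have the server chase the links itself, with all the opacity and unpredictability described above.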

So for these reasons server support is something we have talked about but don't yet have a position on.  This is exactly the kind of thing we'd like to explore by collaborating with developers and getting their input.  I'd also like to hear what other people have experienced in this regard.

 

Viviane Reding's Speech to the Digital Enlightenment Forum

It was a remarkable day at the annual conference of the Digital Enlightenment Forum in Luxembourg.  The Forum is an organization that has been set up over the last year to animate a dialog about how we evolve a technology that embodies our human values.  It describes its vision this way:

The DIGITAL ENLIGHTENMENT FORUM aims to shed light on today’s rapid technological changes and their impact on society and its governance. The FORUM stimulates debate and provides guidance. By doing so, it takes reference from the Enlightenment period as well as from transformations and evolutions that have taken place since. It examines digital technologies and their application openly with essential societal values in mind. Such values might need to be given novel forms taking advantage of both today’s knowledge and unprecedented access to information.

For the FORUM, Europe’s Age of Enlightenment in the 18th century serves as a metaphor for our current times. The Enlightenment took hold after a scientific and technological revolution that included the invention of book printing, which generated a novel information and communication infrastructure. The elite cultural Enlightenment movement sought to mobilise the power of reason, in order to reform society and advance knowledge. It promoted science and intellectual interchange and opposed superstition, intolerance and abuses by the church and state. (more)

The conference was intended to address four main themes:

  • What can be an effective organisation of governance of ICT infrastructure, including clouds? What is the role of private companies in relation to the political governance in the control and management of infrastructure? How will citizens be empowered in the handling of their personal data and hence in the management of their public and private lives?
  • How do we see the relation between technology and jurisdiction? Can we envisage a techno-legal ecosystem that ensures compliance with law (‘coded law’), and how can sufficient political control be ensured in a democratic society?
  • What are the consequences for privacy, freedom and creativity of the massive data collection on behaviour, location, etc. by private and public organisations and their use through mining and inferencing for profiling and targeted advertising?
  • What needs to be done to ensure open discussion and proper political decision-making to find an appropriate balance between convenience of technology use and social acceptability?

The day was packed with discussions that went beyond the usual easy over-simplifications.  I won't try to describe it here but will post the link to the webcast when it becomes available.

One of the highlights was a speech by Mme Viviane Reding, the Vice President of the European Commission (who also serves as commissioner responsible for Justice, Fundamental Rights and Citizenship) about her new proposed Data Protection legislation.  Speaking later to the press she emphasized that the principle of private data belonging to the individual has applied in the European Union since 1995, and that her new proposals are simply a continuation along three lines.  First, she wants users to understand their rights and get them enforced; second, she is trying to provide clarity for companies and reduce uncertainty about how the data protection laws will be applied; and third, she wants to make everyone understand that there will be sanctions.  She said,

“If you don't have sanctions, who cares about the rules?  Who cares about the law?”

And the sanctions are major: 2% of the worldwide turnover of the company.  Further, they apply to all companies, anywhere in the world, that collect information from Europeans.

I very much recommend that everyone involved with identity and data protection read her speech, “Outdoing Huxley: Forging a high level of data protection for Europe in the brave new digital world”.

In my view, the sanctions Mme Reding proposes will, from the point of view of computer science, be meted out as corrections for breaking the Laws of Identity.  John Fontana asked me about this very dynamic in an article he did recently on the relevance of the Laws of Identity seven years after they were written (2005).

ZDNet: The Laws of Identity predicted that government intervention in identity and privacy would increase, why is that happening now?

Cameron: There are many entities that routinely break various of these identity laws; they use universal identifiers, they collect information and use it for different purposes than were intended, they give it to parties that don’t have rights to it, they do it without user control and consent. You can say that makes the Laws irrelevant. But what I predicted is that if you break those Laws there will be counter forces to correct for that. And I believe when we look at recent developments – government and policy initiatives that go in the direction of regulation – that is what is happening. Those developments are providing the counter force necessary to bring behavior in accordance with the laws. The amount of regulation will depend on how quickly entities (Google, Facebook, etc.) respond to the pressure.

ZDNet: Do we need regulation?

Cameron: It’s not that I am calling for regulation. I am saying it is something people bring upon themselves really. And they bring it on themselves when they break the Laws of Identity.

     

Identity Management before the Cloud (Part 2)

The First Generation Identity Ecosystem Model

The biggest problem of the “domain based model of identity management” was that it assumed each domain was an independent entity whose administrators had complete control over the things that were within it – be they machines, applications or people.

During the computational Iron Age – the earliest days of computing – this assumption worked.

But even before the emergence of the Internet we began to see domains colliding within closed organizational boundaries – as discussed here.  The idea of organizations having an “administrative authority” revealed itself to be far more complicated than anyone initially thought, since enterprises were evolving into multi-centered things with autonomous business units experiencing bottom-up innovation. The old-fashioned bureaucratic models, probably always somewhat fictional, slowly crumbled.

Many of us who worked on IT architecture were therefore already looking for ways to transcend the domain model even before the Internet began to flood the enterprise and wear away its firewalls. Yet the Internet profoundly shook up our thinking. On the one hand, organizations began to understand that it was now possible – and in fact mandatory – to interact with people as individuals and citizens and consumers. And on the other, any organization that rolled up its sleeves and got to work on this soon saw that it needed a model where it could “plug in” to systems run by partners and suppliers in seamless and flexible ways.

With increasing experience, enterprise and Internet architects concluded that standardization of identity architecture and components was the only way to achieve the flexibility essential for business agility, whether inside or outside the firewall. It simply wasn’t viable to recode or “change out” systems every time organizations were realigned or restructured.

Technologists introduced new protocols like SAML that implemented a clear separation of standardized identity provider (IdP) and relying party (RP) roles so components would no longer be hard-wired together. In this model, when users want a service the service provider sends them to an IdP, which authenticates them and then returns identifying information to the service provider (an RP within the model).  All the CRUD is performed by the IdP, which issues credentials that can be understood and trusted by RPs.  It is a formal division of labor – even in scenarios where the same “Administrative Domain” runs both the IdP and the RP.

The increasing need for inter-corporate communications, data-sharing and transactions led these credentials to become increasingly claims-based, which is to say the hard dependencies on internal identifiers and proprietary sauce that only made sense inside one party’s firewall gave way to statements that could be understood by unrelated systems. This provided the possibility of making assertions about users that could be understood in spite of crossing enterprise boundaries. It also allowed strategists to contemplate outsourcing identity roles that are not core to a company’s business (for example, the maintenance of login and password systems for retirees or consumers).
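
As a sketch of the idea – the attribute names below are invented for illustration, not drawn from SAML or any product – a claims-based credential replaces opaque internal identifiers with signed statements that any party can evaluate:

{
  "issuer":   "https://idp.fabrikam.com",
  "audience": "https://app.contoso.com",
  "claims": {
    "email":      "bob@fabrikam.com",
    "role":       "purchasing-agent",
    "costCenter": "4012"
  },
  "expiresAt": "2012-07-20T12:00:00Z",
  "signature": "(issuer's signature over the above)"
}

The RP needs only to trust the issuer's signature; it never has to see Fabrikam's internal account identifiers or touch its password store.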

Many of the largest companies have successfully set up relations with their most important partners based on this model. Others have wisely used it to restructure their internal systems to increase their flexibility in the future. The model has represented a HUGE step forward and a number of excellent interoperable products from a variety of technology companies are being deployed.

Yet in practice, most organizations have found federation hard to do. New technology and ways of doing things had to be mastered, and there was uncertainty about liability issues and legal implications.  These difficulties grow geometrically for organizations that want to establish relationships with a large number of other organizations.  Establishing configuration and achieving secure connectivity is hard enough, but keeping the resultant matrix of connections reliable in an operational sense can be daunting and is therefore seen as a real source of risk.

When it came to using the model for internet-facing consumer registration, service providers observed that individual consumers use many different services and have accounts (or don’t have accounts) with many different web entities. Most concluded that it would be a gamble to switch from registering and managing “their own users” to figuring out how to successfully reuse peoples’ diverse existing identities. Would they confuse their users and lose their customers? Could identity providers be trusted as reliable? Was there a danger of losing their customer base? Few wanted to find out…

As a result, while standardized architecture makes identity management systems much more pluggable and flexible, the emergence of an ecosystem of parties dedicated to specialized roles has been slow. The one notable entity that has gained some momentum is Facebook, although it has not so much replaced internet-facing registration systems as supplemented them with additional information (claims).

[Next in this series: Disruptive Forces: The Economy and the Cloud]

Governance is key

I want to return to Nishant's concerns with the way I've presented IdMaaS:

What I was surprised to find missing from Kim’s and Craig’s discussion about IdMaaS were the governance controls one needs in identity management (and therefore IdMaaS) – like approval workflows, access request and access recertification. In other words, those crucial business tools in identity management that have led many analysts and vendors (including me) to repeat on stage many, many times that “Identity Management is about process, not technology”. And this is the part that makes identity management, and therefore IdMaaS, really hard, as I alluded to in my talk about ‘Access Provisioning in a Services World’ at Catalyst a couple of years ago.

Let me begin by saying I agree completely with Nishant about the importance of governance.  In fact, in my first blog about IdMaaS I highlighted two fundamental aspects of IdMaaS and digital identity as being:

  • confidential auditing; and
  • assurance of compliance.

I also agree with him on the urgent requirement for “approval workflows, access request and access recertification.”  I believe we need identity and access process control.

I'm therefore surprised about the confusion on whether or not I think governance is important, but I'm glad to get this cleared up right at the beginning.

Let me explain what I had in mind as a way to achieve some depth in this discussion.  It seemed to me we need to decompose the overall service capabilities, rather than trying to discuss “everything simultaneously”.  I started by trying to talk about the IdM models that have led us to the current point in time, in order to set the stage for the exploration of the new emerging model of Identity Management as a Service and its capabilities, as illustrated in this graphic:

Composable capabilities of IdMaaS

Now my point here is not to argue that this graphic captures all the needed IdMaaS capabilities – it's very much a work in progress.  It is simply that, when you look at the whole landscape, you see there are a number of areas that warrant real discussion in depth…  My conclusion was that we will only succeed at this by looking at things one at a time.

The point can be made, and perhaps this is what Nishant was saying, that governance applies to everything.  I accept that this is true, but governance can still be factored out for purposes of discussion.  I think we'll achieve more clarity if that's what we do.  For one thing, it means we can dive more deeply into governance itself.

Let me know if this decompositional approach seems wrong-headed and we should just have a free-for-all where we discuss everything as it relates to everything else.  I agree that this can be interesting too.

That said, I want to take up some of the points Nishant makes when talking about governance in the Domain Identity Model.

In…  ‘Identity management before the cloud (part one)’, Kim says “In the domain paradigm identity management was thought to be the CRUD and little more.”  But that is not true. What made identity management so hard and expensive was the need to supplement the CRUD features with a governance layer that included policy and process to manage over the entirety of the identity management infrastructure. The responsibility for this was early on thrust upon the provisioning products like Thor Xellerate and Waveset, and later on spawned more specialized handling in IAG products like Sailpoint and Aveksa. Kim alludes to these when he says “A category of Identity Management integration products arose … often brittle point products and tools that could only be deployed at high cost by skilled specialists”. That’s accurate, but not because they were pointless or overhead or overkill. These products were difficult to deploy and needed customization because it wasn’t well understood how to introduce the controls needed in IAM in a manner that was practical and usable. And it was always assumed that every customer would demand unique business processes, so the approach was a toolkit approach rather than a solution approach.

Reading this, I hold even more strongly than before to the statement that the Domain Model was about CRUD and absolute control by The Domain.  The fact that businesses required governance is historically true but doesn't change the way Domains were conceptualized, built and sold by everyone in the industry.  So I agree with Nishant about the importance of governance but don't think this changes the essence of what domains actually were.

For at least several decades, computer governance was provided as an outcome of security analysts configuring domain-based systems to implement a variety of well-known techniques (physical security, separation of duties, multiple approvers and the like) in order to satisfy business objectives and comply with the normative standards prevalent in their industries and national or geographical jurisdictions.

I'm sure many of us witnessed the calisthenics of colleagues in banks and financial institutions, who, as security officers, figured out how to use mainframes and LANs in both their nascent and more evolved forms to be effective at this.  I know I used to marvel at some of what they accomplished.

We are talking about a time when governance wasn't synonymous with government regulation. Governance was more or less orthogonal to the way products were built by the industry.  Domain products could be used in ways that accorded with asset protection requirements if the right expertise was present to set the systems up to achieve these ends.  And on a pessimistic note, has so much really changed in this regard since then?

Many of the provisioning concepts that appeared in products like Waveset and Xellerate appeared earlier in products like ZOOMIT VIA and Metamerge.  But those, like Waveset, Xellerate and Aveksa, were actually, in my view, “post-domain” products that attempted a holistic solution working across product boundaries.

Still, while being post-domain in some ways (e.g. meta), they continued to require extensive manual intervention by security experts to coax “compliant” behaviors out of them, and this intervention was embodied in detailed configurations and scripts dependent on the behaviors of underlying products.  This meant they were often fragile:  if the underlying products were upgraded, for example, they might no longer be compatible with the framework intended to manage them.

Nishant goes on to say,

And an IdMaaS architecture as alluded to by Kim and illustrated by Craig in this diagram just makes the solving of this problem more difficult and even more critical due to the zero trust environment. Since the identities have not been created and are not controlled by the organization that needs to make the access decisions, approval and review controls become even more important because they’re all the enterprise has. The ability to de-provision access based on events or manual intervention becomes a crucial component of access lifecycle management. These are the safety measures the organization needs to put in place for security and compliance.

I agree the ability to de-provision is key and in fact it is key to what we will be delivering.  On the other hand, Nishant's conclusion that “the [IdMaaS] architecture… must make the solving of this problem more difficult… due to the zero trust environment” is I think absolutely unfounded.  As I will show when we go through the requirements for IdMaaS, Trust Frameworks are a necessity, and I know of few Trust Frameworks that are based on “zero trust”.

There is a bit too much flailing at paper tigers for me to take all of this apart in a single post.  Let's take a deep breath and delve systematically both into requirements and the details of what is being proposed in WAzAD.

     

Freedom of choice != Your choice of captor

I am happy to see that Nishant Kaushik (@NishantK) has responded to the posts I've been doing on IdMaaS.  Nishant has strong ideas, having led product architecture and strategy within the Identity Management & Security Products group at Oracle for many years.  Nowadays he is with a startup called Identropy and writes the blog TalkingIdentity.

Nishant's main concern in his first post was that I've gone as far as I have without discussing the importance of governance controls.  I'm going to save this issue for my next piece, since Nishant also ended up in a spirited conversation with Craig Burton that is really worth following.  He wrote:

Craig Burton thinks that this vision, and the associated work Microsoft is doing on Windows Azure Active Directory (as described in this post by John Shewchuk) is “profoundly innovative”. I’ll be honest, I’m having a little trouble seeing what is so innovative about WAAD itself. How is the fact that becoming an Office 365 customer automatically gives you an AD in the cloud that you can build/attach other Azure applications to that different from Oracle saying that deploying a Fusion Application will include an OUD based identity store that the enterprise can also use for other applications? Apart from being in the cloud and therefore far easier to use in federated identity (SAML, OpenID, OAuth) scenarios. But I’ll wait to hear more before commenting any further (though John Fontana and others have already weighed in).

Craig Burton, as is his trademark, includes a few lightning bolts in his response:

Nishant must not have read my post very carefully. In my explanation of why Microsoft’s vision for IDMaaS is so profound, he failed to notice that I never once mentioned WAAD (Windows Azure Active Directory) or Office 365. There is a reason for that. I am not applauding Microsoft’s — or any other vendor’s — implementation of IDMaaS.

What is so profound about this announcement is that Microsoft is following Kim Cameron’s directives for building a Common Identity Framework for the planet, not just for a vendor.

In 2009 Kim Cameron, Reinhard Posch and Kai Rannenberg wrote Proposal for a Common Identity Framework: A User-Centric Identity Metasystem.

In section 5.4 of that document, the authors spell out the requirement for customer Freedom of Choice.

Freedom of Choice

Freedom of choice for both users and relying parties refers to choice of service operators they may wish to use as well as to the interoperability of the respective systems.

This definition is quite different than the freedom of choice Mr. Kaushik writes about in his blog piece. I posit that the Microsoft vision is so profound because it is built on a definition of Freedom of Choice that fits the above description and not one where the customer is free to choose a particular captor.

And so I state again:

Freedom of Choice != Your Choice of Captor

Microsoft’s vision has changed the playing field. Any vendor building IdMaaS that is not meeting the Freedom of Choice requirements defined here is no longer in the game. That is profoundly innovative because this is truly a vision that benefits everyone — but mostly the customer.

With these remarks Craig starts really getting to the bare bones of what it takes to be trusted to manage identity for enterprises and governments.

It didn't take long before Nishant fired off a second dispatch accepting Craig's points and clarifying what he saw as the real issues:

I want to be clear: I am not questioning the vision that Kim Cameron has started to talk about in his posts about IDMaaS (though I was bringing up a part – the governance controls – that I felt was missing and that I believe has a major impact on the architecture of a Common Identity Framework, as Craig called it). And I am completely in agreement with what Craig described in his original post in the section “Stop Gushing and Lay it Out for Me”.

Craig talks about how Freedom of Choice necessarily includes Freedom from Captor. He then says “This definition is quite different than the freedom of choice Mr. Kaushik writes about in his blog piece”.  I’m not sure why he thinks that, because what I am saying is exactly in line with what Craig and Kim are saying. It is what I have been saying since back in 2006 when I first started talking about the Identity Services Platform, which talks about the framework through which identity-enabled applications (essentially any application) consume identity from standardized services that can plug into any identity system or metasystem.

What I was pointing out was that John Shewchuk’s post about WAAD seemed to indicate a lack of Freedom of Choice in what Microsoft is rolling out, at least right now. Becoming an Office 365 customer would “automatically create a new Windows Azure Active Directory that is associated with the Office 365 account”, forcing you to store and manage your identities in WAAD.  It should simply ask for the domain from which users could use this, and you could simply point to the Google Apps domain of your company, sign up for WAAD if needed, or grant access to contractors/partners using whatever identity they choose (traditional AD environment, Facebook or Twitter accounts, even personal OpenIDs). By the way, the governance controls I was talking about are essential here in order to define the process of granting, managing and taking away access in this deployment model.

When I said “I’m having a little trouble seeing what is so innovative about WAAD itself”, I was pointing out my opinion that the details in John’s post did not seem to match up with the vision being outlined in Kim’s post, representing the kind of disconnect that Craig himself called out as a risk at various times in his post, but most notably in the section titled Caveats. I guess I’m not quite ready to make the leap that Microsoft’s work will line up with Kim’s vision, and was calling out the disconnect I was seeing. And when Craig said “Microsoft is not only doing something innovative – but profoundly innovative”, I assumed he was talking about WAAD and related work, and not just referring to what Kim is talking about.

Nishant goes on to give more examples of how he thinks Office 365 could be implemented.  I won't discuss those at this point since I think we should save our implementation discussions for later.  First we need a more thorough conversation about what IdMaaS actually involves given all the changes that are impacting us.  It is these definitions that must lead to implementation considerations.  I hope Nishant will bear with me on this so we can continue the discussion begun so far.

I also want, in deference to Nishant and others who may have similar concerns, to make a few remarks on what we have rolled out right now.  I want to be really clear that while I think we already do a number of things really well and in a robust way at very high scale, there are all kinds of things we still don't do that form an integral part of our vision for what must be done.  Anyone who says they can do all that is needed just doesn't, in my view, have a vision.

On the other hand, I hope we can steer clear of overly simplified recipes for what complicated offerings like Office 365 require as identity management.  For example, applications like Office need directories and places to store information about people in them, and nowhere is it written in stone that this should be done by sending real-time queries to dozens or thousands of systems.  Enterprise users want directory lookup that is as fast and reliable when served from the cloud as it is on premises.  And so on.  My point here is not to argue for one solution versus another, but to invite Nishant and others who may be interested to zero in on the broad set of requirements before getting overly committed to possible ways of meeting them.