CardSpace boasts a limited degree of unlinkability, based on a weak attack model: for self-signed cards, a user can generate two assertions for two different websites that appear independent. (My colleague Ben from the Google security team disputes even that weak guarantee, arguing that only assertions that cannot be linked even with help from the identity provider qualify as “unlinkable”.)
OpenID gets a bad reputation for allowing linkability, but in fact the specification does not require a universal identifier. An OpenID provider could choose to assert two different “names” for the same user to two different websites. (Of course they are still linkable in the sense that the ID provider knows what is going on, even if the sites cannot put the picture together on their own; sort of, see the next two points.)
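As a rough illustration of how a provider could assert different “names” to different sites, here is a minimal sketch that derives a per-site identifier with a keyed hash. Nothing here comes from the OpenID specification: the function name `pairwise_id`, the secret key and the example site names are all hypothetical, and a real provider might just as well store random identifiers in a table.

```python
import hashlib
import hmac


def pairwise_id(provider_secret: bytes, user_id: str, relying_party: str) -> str:
    """Derive a stable identifier for one (user, site) pair.

    Hypothetical helper: the same user gets a different value at each
    relying party, and sites cannot recompute each other's values
    without the provider's secret key.
    """
    # Length-prefix the user id so ("ab", "c") and ("a", "bc") never collide.
    message = f"{len(user_id)}:{user_id}|{relying_party}".encode("utf-8")
    return hmac.new(provider_secret, message, hashlib.sha256).hexdigest()


# Assumed values, for illustration only.
secret = b"provider-side secret key"
name_at_shop = pairwise_id(secret, "alice", "shop.example")
name_at_forum = pairwise_id(secret, "alice", "forum.example")
```

The two names look unrelated to the sites, but the provider can regenerate both at will, which is exactly the sense in which they remain linkable with the provider's help.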
The problem is that even this weak guarantee of “unlinkable” identities at multiple websites breaks down in the real world, for two reasons.
The first problem is that websites insist on an email address or other unique identifier, and they want it at authentication time. When inherently identifying information (PII) such as an email address is shared, unlinkability of the underlying protocol becomes largely irrelevant, since there is another, even more universal identifier to go by. The same email address would appear even when the user authenticates via two different identity providers: this is linkage across independent providers.
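To make the point concrete, the sketch below shows how two sites that each know the user under a different opaque name can still join their records on the email address. The record layout, the function names and the sample data are assumptions made up for this example.

```python
def normalize(email: str) -> str:
    # Simplification: treats addresses as case-insensitive, as most
    # mail systems do in practice.
    return email.strip().lower()


def link_accounts(site_a, site_b):
    """Return (id_at_a, id_at_b) pairs that share an email address."""
    a_by_email = {normalize(r["email"]): r["local_id"] for r in site_a}
    links = []
    for record in site_b:
        key = normalize(record["email"])
        if key in a_by_email:
            links.append((a_by_email[key], record["local_id"]))
    return links


# Hypothetical records: different pseudonyms, same mailbox.
site_a = [{"local_id": "pq2t45x", "email": "user@example.com"}]
site_b = [{"local_id": "m8k1zz0", "email": "User@Example.com"}]
linked = link_accounts(site_a, site_b)
```

It does not matter which identity provider issued each pseudonym; the join never looks at them.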
Federated ID providers are not in a position to say no: they are trying to convince relying sites to interoperate. Every site already has a proprietary identity management system that requires users to sign up. This registration process collects some basic information, and the availability of that information is firmly embedded in the business logic. Going from a model where the site has an email address to one where it knows the user as “pq2t45x” is not an appealing proposition. Similarly, any time the user shares a global identifier such as address, real name or credit card number, they void any privacy guarantees from the identity model.
As a matter of architecture, authentication systems should strive for minimum disclosure: more identifying information can always be added after the fact, but it is impossible to go back in the direction of greater privacy. Even if the majority of transactions ended up with the user sharing PII at some point (making them very linkable regardless of authentication), it’s fair to argue that underlying protocols need to optimize for the best case of no disclosure and casual browsing. But the reliance on email address in existing scenarios means that redesigning basic protocols to disclose less will be an exercise in rearranging deck chairs.
In many ways email address is the easiest attribute to fix, especially when the ID provider is also the email provider, which is true for the three largest ID providers (Windows Live/Passport, GMail/Google and Yahoo): they could simply fabricate email aliases that forward to the original. Unfortunately that still breaks support scenarios, because when email@example.com calls asking for help, the system has her records filed under the very private firstname.lastname@example.org. Other identifiers have their own private versions depending on the provider. Some credit card companies support issuing one-time card numbers that bill to the original. Mail services allow signing up for a PO box to hide the original physical address (although good luck getting many ecommerce merchants to deliver to one, given the high incidence of fraud associated with PO boxes), and conceivably they could start algorithmically generating those PO box numbers to break linkage.
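The email-alias idea can be sketched as a small registry kept by the provider: one random alias per (mailbox, site) pair, plus a forwarding table. This is only an illustration under assumed names (`AliasRegistry`, the `alias.example` domain and the sample addresses are all invented here), not a description of how any of the providers above work.

```python
import secrets


class AliasRegistry:
    """Hypothetical provider-side table of per-site forwarding aliases."""

    def __init__(self, domain: str = "alias.example"):
        self.domain = domain
        self._forward = {}  # alias -> real mailbox
        self._by_pair = {}  # (real mailbox, site) -> alias

    def alias_for(self, real: str, site: str) -> str:
        # Reuse the existing alias so the site always sees the same address.
        key = (real, site)
        if key not in self._by_pair:
            alias = f"{secrets.token_hex(8)}@{self.domain}"
            self._by_pair[key] = alias
            self._forward[alias] = real
        return self._by_pair[key]

    def forward_target(self, alias: str):
        # Where mail sent to the alias should be delivered, or None.
        return self._forward.get(alias)


registry = AliasRegistry()
shop_alias = registry.alias_for("user@example.com", "shop.example")
forum_alias = registry.alias_for("user@example.com", "forum.example")
```

Only the provider holds the mapping back to the real mailbox, which is why a support agent at the site, given the real address over the phone, has nothing to look up.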
Even if every instance of linkable PII could be replaced by a pairwise unique variant, there is a second problem: linkage between identifiers is possible when the user is authenticated to multiple sites at the same time.