About those strange P3P compact policies (1/2)

There are times when past mistakes come back to haunt the designers and developers of a system in unexpected ways. The implementation of the privacy standard P3P in Internet Explorer is proving to be one such example for this blogger.

First some background: P3P stands for Platform for Privacy Preferences Project. P3P was forged over a decade ago, amidst the great privacy scares of 2000, in what can be seen as a more innocent, idyllic time before September 11, when the greatest threat to online users was evil marketers trying to track them with third-party cookies. Under the charter of the World Wide Web Consortium’s Technology and Society group, P3P was an ambitious effort to introduce greater transparency and user control over the collection of information online. In many ways it was also ahead of its time. In the vein of similar initiatives that attempt to prescribe technological fixes to what are fundamentally economic incentive problems, only a tiny fraction of the ideas found their way into widespread implementation. (It would be another 10 years before W3C would dabble on the policy front again, with Do-Not-Track, instantly getting mired in as much controversy as P3P in its heyday. To think: DNT introduces just one modest HTTP header representing a yes/no decision. P3P is enormously complex by comparison.)

In the original vision, websites express their privacy policies– often couched in legalese and not written with the purpose of informing users– in machine readable XML format. The web browser could then retrieve and compare these policies against the user’s preferences as they navigated to different websites. P3P even proposed a machine-readable standard for expressing user preferences called APPEL, also in XML naturally, which went nowhere. It’s difficult to argue against greater transparency– although several advertising networks managed to do precisely that, out of concern that shining a light into data collection practices could paint an unflattering picture.

Earlier iterations of the protocol also had serious disconnects with the way web browsers operate and their focus on performance. Blocking, synchronous evaluation of privacy policies for every resource associated with a web page, as originally envisioned in the draft spec, would have imposed an enormous speed penalty. With some reality checks to focus on improved efficiency, attention eventually focused on the perceived privacy boogeyman du jour: HTTP cookies. In order to avoid out-of-band retrieval of privacy statements, compact policies were introduced, as a summary of the full XML policy that could be expressed succinctly in HTTP response headers accompanying cookies. Compact policies are derived from the full XML version via a deterministic transformation. This process is lossy and produces a worst-case picture: while the full XML format allows specifying that a particular type of data (say email address) is collected for a specific purpose, with a specific retention period and third-party usage, the compact policy simply lists all categories, all purposes, retention times etc. as a one-dimensional list, collapsing such subtle distinctions. Still, compact policies could be specified in HTTP headers or even in the body of the HTML document, allowing fast decisions about cookies.
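To make the "lossy" part concrete, here is a minimal sketch in Python. It is not the transformation defined in the P3P specification: the Statement class and to_compact function are hypothetical, and the model keeps only three dimensions per statement. The three-letter tokens are borrowed from the P3P compact vocabulary (ONL for online contact information such as an email address, NAV for clickstream data, CUR/ADM/DEV for purposes, STP/IND for retention).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """Hypothetical, simplified policy statement: what data, why, how long."""
    categories: frozenset
    purposes: frozenset
    retention: str


def to_compact(statements):
    """Flatten all statements into one sorted token string.

    The grouping of category/purpose/retention per statement is discarded,
    which is exactly the information loss described above.
    """
    tokens = set()
    for s in statements:
        tokens |= s.categories | s.purposes | {s.retention}
    return " ".join(sorted(tokens))


# Policy A: email kept only for the current transaction,
# clickstream kept indefinitely for administration and development.
policy_a = [
    Statement(frozenset({"ONL"}), frozenset({"CUR"}), "STP"),
    Statement(frozenset({"NAV"}), frozenset({"ADM", "DEV"}), "IND"),
]

# Policy B: the reverse, email kept indefinitely for administration
# and development, clickstream only for the current transaction.
policy_b = [
    Statement(frozenset({"ONL"}), frozenset({"ADM", "DEV"}), "IND"),
    Statement(frozenset({"NAV"}), frozenset({"CUR"}), "STP"),
]

compact_a = to_compact(policy_a)
compact_b = to_compact(policy_b)
print(compact_a)               # ADM CUR DEV IND NAV ONL STP
print(compact_a == compact_b)  # True
```

Two quite different policies collapse to the same token list, so anyone reading only the compact form has to assume the least flattering combination.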

So what was implemented in practice? Internet Explorer ended up being the only web browser supporting P3P, and only a very specific subset at that. (Full disclosure: this blogger was involved in the standards effort and the implementation in IE.) That subset:

  • IE uses compact policies for cookie management.
  • IE does not evaluate full XML policies or otherwise act differently based on the presence/absence of that document. It does not even make an attempt to retrieve the XML or verify its consistency against the compact policy. There is an option under the privacy report to retrieve the policy and render it in natural language, if the user goes out of their way to ask for it. (Not surprisingly many sites only deployed compact policies, never bothering to publish the XML.)
  • IE has no APPEL support or other automatic policy evaluation triggers, for example before submitting a form or logging in to a new service, when a policy would be a useful data point for the user.

Even with this subset, P3P had a significant effect on web sites because of its default settings. Belying the assertion that default settings are just that and easily modified by users who disagree with them, the default choice of “medium” privacy became the de facto standard for websites that depended on cookies. First-party cookies were given a wide berth, not requiring a compact policy and permitting existing usage to continue functioning without any changes, while third-party cookies without an associated satisfactory compact policy were summarily rejected. That means not only must advertising networks implement P3P, they must also have a policy that meets the default settings for IE. Otherwise all of those banner ads and iframes with punch-the-monkey animated Flash ads get stripped of their cookies, losing their capability to accurately track distinct users.
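The shape of that default can be sketched in a few lines of Python. This is an illustration of the rule as described here, not IE's actual logic: the function names are made up, and HYPOTHETICAL_REJECTED_TOKENS is a placeholder for whatever the real "medium" setting considered unsatisfactory. The header format, a P3P response header carrying CP="..." with space-separated tokens, is the one compact policies use.

```python
import re

# Placeholder only: stands in for "tokens the default setting dislikes".
# The real criteria in IE are not reproduced here.
HYPOTHETICAL_REJECTED_TOKENS = {"TEL", "UNR"}


def parse_compact_policy(header_value):
    """Return the set of tokens in a P3P header value like 'CP="NOI DSP"'."""
    match = re.search(r'CP\s*=\s*"([^"]*)"', header_value or "")
    return set(match.group(1).split()) if match else set()


def accept_cookie(is_third_party, p3p_header):
    """Decide whether to keep a cookie under the simplified default rule."""
    if not is_third_party:
        # First-party cookies pass with or without a compact policy.
        return True
    tokens = parse_compact_policy(p3p_header)
    if not tokens:
        # Third-party cookie with no compact policy: rejected.
        return False
    # Third-party cookie with a policy: rejected if it is unsatisfactory.
    return not (tokens & HYPOTHETICAL_REJECTED_TOKENS)


print(accept_cookie(False, None))                    # True
print(accept_cookie(True, None))                     # False
print(accept_cookie(True, 'CP="NOI DSP COR"'))       # True
print(accept_cookie(True, 'CP="NOI DSP COR TEL"'))   # False
```

Note that the decision needs nothing beyond the response headers that arrive with the cookie, which is the performance argument for compact policies in the first place.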

This is a great example of regulation by code, as Lawrence Lessig described it brilliantly in “Code and Other Laws of Cyberspace.” By choosing a particular default configuration in the most popular web browser, MSFT had established a minimum privacy bar for a segment of the online industry. (The irony is inescapable: at the same time that MSFT was trying to discredit Lessig in the antitrust trial, its engineers were busy providing a textbook example of his central thesis around regulation via West Coast Code.)


