Browser in the middle: 25 years after the MSFT antitrust trial

In May 1998 the US Department of Justice and the Attorneys General of 20 states along with the District of Columbia sued Microsoft in federal court, alleging predatory strategies and anticompetitive business practices. At the heart of the lawsuit was the web browser Internet Explorer, and the strong-arm tactics MSFT adopted with business partners to increase the share of IE over the competing Netscape Navigator. Twenty-five years later, in a drastically altered technology landscape, the DOJ is now going after Google for its monopoly power in search and advertising. With the benefit of hindsight, the MSFT experience offers many lessons and useful parallels for the new era of antitrust enforcement, as both sides prepare for the trial in September.

The first browser wars

By all indications, MSFT was late to the Internet disruption. The company badly missed the swift rise of the open web, instead investing in once-promising ideas such as interactive TV or walled-garden online services in the style of early Prodigy. It was not until Gates's 1995 "Internet Tidal Wave" memo that the company began to mobilize its resources. Some of the changes were laughably amateurish— teams hiring for dedicated "Internet program manager" roles. Others proved more strategic, including the decision to build a new browser. In the rush to get something out the door, the first version of Internet Explorer was based on Spyglass Mosaic, a commercial version of NCSA Mosaic, the first popular browser, developed at the University of Illinois. (The team behind Mosaic would go on to create Netscape Navigator.) Even the name itself betrayed the Windows-centric and incrementalist attitude prevalent in Redmond: "explorer" was the name of the Windows GUI or "shell" for browsing content on the local machine. Internet Explorer would be its networked cousin helping users explore the wild wild web.

By the time IE 1.0 shipped in August 1995, Netscape already had a commanding lead in market share, not to mention the better product measured in features and functionality. But by this time MSFT had mobilized its considerable resources, greatly expanding investment in the browser team and replacing Spyglass code with its own proprietary implementation. IE3 was the first credible version to reach some semblance of feature parity with Navigator, having added support for frames, cascading stylesheets, and JavaScript. It was also the first time MSFT went on the offensive, responding with its own proprietary alternatives to technologies introduced by Netscape. Navigator had the Netscape Plugin API (NPAPI) for writing browser extensions; IE introduced ActiveX— completely incompatible with NPAPI and built entirely on other MSFT-centric technologies including COM and OLE. Over the next two years this pattern would repeat as IE and Navigator duked it out for market share by introducing competing technologies. Netscape allowed web pages to run dynamic content with a new scripting language, JavaScript; MSFT would support that in the name of compatibility but also subtly try to undermine JS by pushing VBScript, based on the Visual Basic language so familiar to existing Windows developers.

Bundle of trouble

While competition heated up over functionality— and chasing fads, such as the "push" craze of the late 1990s that resulted in the Channel Definition Format— there was one weapon uniquely available to MSFT for grabbing market share: shipping IE with Windows. Netscape depended on users downloading the software from its website. Quaint as this sounds in 2024, it was a significant barrier to adoption in an age when most of the world had not made the transition to being online. How does one download Navigator from the official Netscape website without a web browser to begin with? MSFT had a well-established channel exempt from this bootstrapping problem: copies of Windows distributed "offline" using decidedly low-tech means such as shrink-wrapped boxes of CDs or preinstalled on PCs. In principle Netscape could seek out similar arrangements with Dell or HP to include its browser instead. Unless of course MSFT made the OEMs an offer they could not refuse.

That became the core of the government accusation of anticompetitive practices: MSFT pushed for exclusive deals, pressuring partners such as PC manufacturers (OEMs, or "original equipment manufacturers" in industry lingo) to not only include a copy of Internet Explorer with prominent desktop placement but also rule out shipping any alternative browsers. Redmond clearly had far more leverage than Mountain View over PC manufacturers: shipping any browser at all was icing on the cake, but a copy of the reigning OS was practically mandatory.

What started out as a sales/marketing strategy rapidly crossed over into the realm of software engineering when later releases of Windows began to integrate Internet Explorer in what MSFT claimed was an inextricable fashion. The government objected to this characterization: IE was an additional piece of software downloaded from the web or installed from CDs at the consumer's discretion. Shipping a copy with Windows out-of-the-box may have been convenient, saving users the effort of jumping through those installation hoops, but surely a version of Windows could also be distributed without this optional component.

When MSFT objected that these versions of Windows could not function properly without IE, the government sought out a parade of expert witnesses to disprove this. What followed was a comedy of errors on both sides. One expert declared the mission accomplished after removing the icon and primary executable, forgetting about all of the shared libraries (dynamic link libraries, or DLLs, in Windows parlance) that provide the majority of browser functionality. IE was designed to be modular, to allow "embedding" the rendering engine or even subsets of functionality such as the HTTP stack into as many applications as possible. The actual "Internet Explorer" icon users clicked on was only the tip of the iceberg. Deleting that was the equivalent of arguing that the electrical system in a car can be safely removed by smashing the headlights and noting that the car still drives fine without lights. Meanwhile MSFT botched its own demonstration of how a more comprehensive removal of all browser components results in broken OS functionality. A key piece of evidence entered by the defense was allegedly a screen recording from a PC showing everything that goes wrong with Windows when IE components are missing. Plaintiffs' lawyers were quick to point out strange discontinuities and changes in the screenshots, eventually forcing MSFT into an embarrassing admission that the demonstration was spliced together from multiple sequences.

Holding back the tide

The next decade of developments would vindicate MSFT, proving that company leadership was fully justified in worrying about the impact of the web. MSFT mobilized to keep Windows relevant, playing the game on two fronts:

  1. Inject Windows dependencies into the web platform, ensuring that even if websites were accessible on any platform in theory, they worked best on Windows viewed in IE. Pushing ActiveX was a good example of this. Instead of pushing to standardize cross-platform APIs, IE added appealing features such as the initial incarnation of XMLHttpRequest as ActiveX controls (see the sketch after this list). Another example was the addition of Windows-specific quirks into the MSFT version of Java. This provoked a lawsuit from Sun for violating the "Java" trademark with an incompatible implementation. MSFT responded by deciding to remove the JVM from every product that previously shipped it.
  2. Stop further investment in the browser once it became clear that IE had won the browser wars. The development of IE4 involved a massive spike of resources. That release also marked the turning of the tide, with IE starting to win out in comparisons against Navigator 4. IE5 was an incremental effort by comparison. By IE6, the team had been reduced to a shadow of its former self, where it would remain for the next ten years until Google Chrome came upon the scene. (Even the "security push" in the early 2000s culminating in XP SP2 focused narrowly on cleaning up the cesspool of vulnerabilities in the IE codebase. It was never about adding features and enhancing functionality for a more capable web.)
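To make the first point concrete, here is a minimal sketch (in TypeScript-flavored JavaScript) of the cross-browser dance web developers performed for years as a direct consequence: the same XMLHttpRequest functionality had to be obtained through a Windows-only ActiveX control on older IE versions. The ProgID strings below are the commonly used ones; treat this as an illustrative sketch rather than an exhaustive compatibility shim.

    // Classic cross-browser XMLHttpRequest construction.
    // IE5/IE6 exposed the object only as an ActiveX control; the vendor-neutral
    // XMLHttpRequest constructor appeared in other browsers first and did not
    // reach IE natively until IE7.
    declare const ActiveXObject: new (progId: string) => any; // IE-only global

    function createXhr(): XMLHttpRequest {
      if (typeof XMLHttpRequest !== "undefined") {
        return new XMLHttpRequest(); // standards-based path
      }
      // Fall back to the Windows-only ActiveX incarnation.
      for (const progId of ["Msxml2.XMLHTTP", "Microsoft.XMLHTTP"]) {
        try {
          return new ActiveXObject(progId);
        } catch {
          // try the next ProgID
        }
      }
      throw new Error("No XMLHttpRequest support available");
    }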

This lack of investment from MSFT had repercussions far beyond the Redmond campus. It effectively put the web platform into deep freeze. HTML and JavaScript evolved very quickly in the 1990s. HTML 2.0 was published as an RFC in 1995. Later the World Wide Web Consortium took up the mantle of standardizing HTML, publishing HTML 3.2 in 1997. It took less than a year for HTML4 to follow as an official W3C "recommendation"— what would be called a standard by any other organization. This was a time of rapid evolution for the web, with Netscape, MSFT and many other companies participating to drive the evolution forward. It would be another 17 years before HTML5 arrived.

Granted MSFT had its own horse in the race with MSN, building out web properties and making key investments such as the acquisition of Hotmail. Some even achieved a modicum of success, such as the travel site Expedia, which was spun out into a public company in 1999. But a clear consensus had emerged inside the company around the nature of software development. Applications accessed through a web browser were fine for "simple" tasks, characterized by limited functionality, with correspondingly low performance expectations: minimalist UI, laggy/unresponsive interface, only accessible with an Internet connection and even then constrained by the limited bandwidth of the era. Anything more required native applications, installed locally and designed to target the Windows API. These were also called "rich clients" in a not-so-subtle dig at the implied inferiority of web applications.

Given that bifurcated mindset, it is no surprise the web browser became an afterthought in the early 2000s. IE had emerged triumphant from the first browser wars, while Netscape disappeared into the corporate bureaucracy of AOL following the acquisition. Mozilla Firefox was just starting to emerge phoenix-like from the open-sourced remains of the Navigator codebase, far from posing any threat to IE's market share. The much-heralded Java applets in the browser that were going to restore parity with native applications failed to materialize. There were no web-based word processors or spreadsheets to compete against Office. In fact there seemed to be hardly any profitable applications on the web, with sites still trying to work out the economics of "free" services funded by increasingly annoying advertising.

Meanwhile MSFT itself had walked away from the antitrust trial mostly unscathed. After losing the initial round in federal court following a badly botched defense, the company handily won at the appellate court. In a scathing ruling the circuit court not only reversed the breakup order but found the trial judge to have engaged in unethical, biased conduct. Facing another trial under a new judge, the DOJ blinked and decided it was no longer seeking a structural remedy. The dramatic antitrust trial of the decade ended with a whimper: the parties agreed to a mild settlement that required MSFT to modify its licensing practices and better document its APIs for third parties to develop interoperable software.

This outcome was widely panned by industry pundits as a minor slap on the wrist, raising concerns that it left the company free to continue engaging in the same pattern of anticompetitive behavior. In hindsight the trial did have an important consequence that was difficult to observe from the outside: it changed the rules of engagement within MSFT. Highly motivated to avoid another extended legal confrontation that would drag down the share price and distract attention, leadership grew more cautious about pushing the envelope around business practices. It may have been too little too late for Netscape, but this shift in mindset meant that when the next credible challenger to IE materialized in the shape of Google Chrome, IE was left to fend for itself, competing strictly on its own merits. There would be no help from the OS monopoly.

Second chances for the web

More than any other company, it was Google that was responsible for revitalizing the web as a capable platform for rich applications. For much of the 2000s, it appeared that the battle for developer mindshare had settled into a stalemate: HTML and JavaScript were good for basic applications (augmented by the ubiquitous Adobe Flash for extra pizzazz when necessary) but any heavy lifting— CPU-intensive computing, fancy graphics, interacting with peripheral devices— required a locally installed desktop application. Posting updates on social life and sharing hot-takes on recent events? Web browsers proved perfectly adequate for that. But if you planned to crunch numbers on a spreadsheet with complex formulas, touch up high-resolution pictures or hold a video conference, the consensus held that you needed "real" software written in a low-level language such as C/C++ and directly interfacing with the operating system API.

Google challenged that orthodoxy, seeking to move more applications to the cloud. It was Google that kept pushing the limits of what existing browsers could do, often with surprising results. Gmail was an eye-opener for its responsive, fast UI as much as for the generous gigabyte of space every user received and the controversial revenue model driven by contextual advertising based on the content of emails. Google Maps— an acquisition, unlike the home-grown Gmail, which had started out as one engineer's side project— and later Street View proved that even high-resolution imagery overlaid with local search results could be delivered over existing browsers with a decent user experience. Google Docs and Spreadsheets (also acquisitions) were even more ambitious undertakings aimed at the enterprise segment cornered by MSFT Office until that point.

These were mere opening moves in the overall strategic plan: every application running in the cloud, accessed through a web browser. Standing in the way of that grand vision was the inadequacy of existing browsers. They were limited in principle by the modest capabilities of the standard HTML and JavaScript APIs defined at the time, without venturing into proprietary, platform-dependent extensions such as Flash, Silverlight and ActiveX. They were hamstrung even further in practice thanks to the mediocre implementation of those capabilities by the dominant browser of the time, namely Internet Explorer. What good would innovative cloud applications do when users had to access them through a buggy, slow browser riddled with software vulnerabilities? (There is no small measure of irony that the 2009 "Aurora" breach of Google by a Chinese APT started out with an IE6 SP2 zero-day vulnerability.)

Google was quick to recognize the web browser as a vital component of its business strategy, in much the same way MSFT had correctly perceived the danger Netscape posed. Initially Google put its weight behind Mozilla Firefox. The search deal to become the default engine for Firefox (realistically, did anyone want Bing?) provided much of the revenue for the fledgling browser early on. While swearing by the benefits of having an open-source alternative to the sclerotic IE, Google would soon realize that a development model driven by democratic consensus came with an undesirable downside: despite being a major source of revenue for Firefox, it could exert only so much influence over the product roadmap. For Google, controlling its own fate made it all but inevitable that it would embark on its own browser project.

Browser wars 2.0

Chrome was the ultimate Trojan horse for advancing the Google strategy: wrapped in the mantle of "open-source" without any of the checks-and-balances of an outside developer community deciding which features are prioritized (a tactic Android would soon come to perfect in the even more cut-throat setting of mobile platforms). That lack of constraints allowed Google to move quickly and decisively on the main objective: advance the web platform. Simply shipping a faster and safer browser would not have been enough to achieve parity with desktop applications. HTML and JavaScript themselves had to evolve.

More than anything else Chrome gave Google a seat at the table for standardization of future web technologies. While work on HTML5 had started in 2004 at the instigation of Mozilla and Opera representatives, it was not until Chrome reignited the browser wars that bits and pieces of the specification began to find their way into working code. Crucially the presence of a viable alternative to IE meant standardization efforts were no longer an academic exercise. The finished output of W3C working groups is called a "recommendation." That is no false modesty in terminology, because at the end of the day the W3C has no authority or even indirect influence to compel browser publishers to implement anything. In a world where most users are running an outdated version of IE (with most desktops stuck on IE6 SP2 or IE7), the W3C can keep cranking out enhancements to HTML5 on paper without delivering any tangible benefit to users. It is difficult enough to incentivize websites to take advantage of new features. The path of least resistance already dictates coding for the least common denominator. Suppose some website crucially depends on a browser feature missing for the 10% of visitors who are running an ancient version of IE. Whether they do not care enough to upgrade, or perhaps cannot upgrade as with enterprise users at the mercy of their IT department for software choices, these users will be shut out of the website, representing a lost revenue opportunity. By contrast a competitor that demands less of its customers' software, or is willing to invest the extra development effort in backwards compatibility, will have no problem monetizing that segment.
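That calculation usually played out in a few lines of capability checks. Below is a minimal sketch of the kind of feature detection sites performed before betting on a "modern web" audience; the properties tested are standard DOM APIs, while the threshold a site picks for falling back is a business decision, not a technical one.

    interface PlatformSupport {
      canvas: boolean;
      webSockets: boolean;
      localStorage: boolean;
      dragAndDrop: boolean;
    }

    // Probe for the HTML5-era capabilities discussed above.
    function detectSupport(): PlatformSupport {
      const div = document.createElement("div");
      return {
        canvas: !!document.createElement("canvas").getContext,
        webSockets: typeof WebSocket !== "undefined",
        localStorage: typeof window.localStorage !== "undefined",
        dragAndDrop: "draggable" in div,
      };
    }

    // A site chasing the last 10% of visitors on ancient IE branches to a
    // degraded experience here; a bolder one simply shows an upgrade banner.
    const support = detectSupport();
    if (!support.canvas || !support.localStorage) {
      console.warn("Legacy browser detected; falling back to the basic HTML UI");
    }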

The credibility of a web browser backed by the might of Google shifted that calculus. The clear trend of Chrome and Firefox capturing market share from IE (and crucially, the declining share of legacy IE versions) made it easier to justify building new applications for a modern web incorporating the latest and greatest from the W3C drawing board: canvas, WebSockets, WebRTC, offline mode, drag & drop, web storage… It no longer seemed like questionable business judgment to bet on that trend and build novel applications assuming a target audience with modern browsers. In 2009 YouTube engineers snuck in a banner threatening to cut off support for IE6, careful to stay under the radar lest their new overlords at Google object to this protest. By 2012 the tide had turned to the point that an Australian retailer began imposing a surcharge on IE7 users to offset the cost of catering to their ancient browser.

While the second round of the browser wars is not quite over, some conclusions are obvious. Google Chrome has a decisive lead over all other browsers, especially in the desktop market. Firefox's share is declining, creating doubts about the future of the only independent open-source web browser that can claim the mantle of representing users as stakeholders. As for MSFT, despite getting its act together and investing in auto-update functionality to avoid another case of the "IE6 installed-base legacy" problem, Internet Explorer versions steadily lost market share during the 2010s. Technology publications cheered on every milestone, such as the demise of IE6 and the "flipping" point when Google Chrome reached 50%. Eventually Redmond gave up and decided to start over with a new browser altogether dubbed "Edge," premised on a fresh start instead of incremental tweaks. That has not fared much better either. After triumphantly unveiling a new HTML rendering engine to replace IE's "Trident," MSFT quickly threw in the towel, announcing that it would adopt Blink— the engine from Chrome. (Whereas the MSFT of the 1990s was irrationally combative in its rejection of technology not invented in Redmond, its current incarnation has no qualms about admitting defeat and making pragmatic business decisions to leverage competing platforms.) Despite Google's multiple legal skirmishes with EU regulators over its ads and browser monopoly, there are no credible challengers to Chrome on the desktop today. When it comes to market power, Google Chrome is the new IE.

The browser disruption in hindsight

Did MSFT overreact to the Netscape Navigator threat and knee-cap itself by inviting a regulatory showdown through its aggressive business tactics? Subsequent history vindicates company leadership in correctly judging the disruption potential, but not necessarily the response. It turned out the browser was indeed a critical piece of software— it literally became the window through which users experience the infinite variety of content and applications beyond the narrow confines of their local device. Platform-agnostic and outside the control of the companies providing the hardware/software powering that local device, it was an escape hatch out of the "Wintel" duopoly. Winning the battle against Netscape defused that immediate threat for MSFT. Windows was not reduced to "a poorly debugged set of device drivers," as Netscape's Marc Andreessen had once quipped.

An expansive take on “operating system”

MSFT was ahead of its time in another respect: browsers are considered an intrinsic component of the operating system, a building block for other applications to leverage. Today a consumer OS shipping without some rudimentary browser out-of-the-box would be an anomaly. To pick two examples comparable to Windows:

  • MacOS has included Safari since the Panther release in 2003.
  • Ubuntu desktop releases come with Firefox as the default browser.

On the mobile front, browser bundling is not only standard but pervasive in its reach:

  • iOS not only ships a mobile version of Safari but the WebKit rendering engine is tightly integrated into the operating system, as the mandatory embedded browser to be leveraged by all other apps that intend to display web content. In fact until recently Apple forbade shipping any alternative browser not built on WebKit. The version of "Chrome" for iOS is nothing more than a glossy paint-job over the same internals powering Safari. Crucially, Apple can enforce this policy. Unlike desktop platforms with their open ecosystem where users are free to source software from anywhere, mobile devices are closed appliances. Apple exerts 100% control over software distribution for iOS.
  • Android releases have included Google Chrome since 2012. Unlike Apple, Google places no restrictions on alternative browsers as independent applications. However embedded web views in Android are still based on the Chrome rendering engine.

During the antitrust trial, some astute observers pointed out that only a few years earlier even the most rudimentary networking functionality— namely the all-important TCP/IP stack— had been an optional component in Windows. Today it is not only a web browser that has become table stakes. Here are three examples of functionality once considered strictly distinct lines of business from providing an operating system:

  1. Productivity suites: MacOS comes with Pages for word processing, Numbers for spreadsheets and Keynote for crafting slide-decks. Similarly many Linux distributions include the LibreOffice suite, which offers open-source replacements for Word, Excel, PowerPoint etc. (This is a line even MSFT did not cross: to this day no version of Windows includes a copy of the "Office suite" understood as a set of native applications.)
  2. Video conferencing and real-time collaboration: Again each vendor has been putting forward their preferred solution, with Google including Meet (previously Hangouts), Apple promoting FaceTime and MSFT pivoting to Teams after giving up on Skype.
  3. Cloud storage: To pick an example where the integration runs much deeper, Apple devices have seamless access to iCloud storage, while Android & ChromeOS are tightly coupled to Google Drive for backups. Once the raison d’être of unicorn startups Dropbox and Box, this functionality has been steadily incorporated into the operating system, casting doubt on the commercial prospects of these public companies. Even MSFT has not shied away from integrating its competing OneDrive service with Windows.

There are multiple reasons why these examples raise few eyebrows from the antitrust camp. In some cases the applications are copycats or also-rans: Apple’s productivity suite can interop with MSFT Office formats (owing in large part to an EU consent decree that forced MSFT to start documenting its proprietary formats) but still remains a pale imitation of the real thing. In other cases, the added functionality is not considered a strategic platform or has little impact on the competitive landscape. FaceTime is strictly a consumer-oriented product that has no bearing on the lucrative enterprise market. While Teams and Meet have commercial aspirations, they face strong headwinds competing against established players Zoom and WebEx specializing in this space. No one is arguing that Zoom is somehow disadvantaged on Android because it has to be installed as a separate application from the Play Store. But even when integration obviously favors an adjacent business unit— as in the case of mobile platforms creating entrenched dependencies on the cloud storage offering from the same company— there is a growing recognition that the definition of an "operating system" is subject to expansion. Actions that once may have been portrayed as leveraging a platform monopoly to take over another market— Apple & Google rendering Dropbox irrelevant— are now seen as the natural outcome of evolving customer expectations.

Safari on iOS may look like a separate application with its own icon, but it is also the underlying software that powers embedded “web views” for all other iOS apps when those apps are displaying web content inside their interface. Google Chrome provides a similar function for Android apps by default. No one in their right mind would resurrect the DOJ argument of the 1990s that a browser is an entirely separate piece of functionality and weaving it into the OS is an arbitrary marketing choice without engineering merits. (Of course that still leaves open the question of whether that built-in component should be swappable and/or extensible. Much like authentication or cryptography capabilities for modern platforms have an extensibility mechanism to replace default, out-of-the-box software with alternatives, it is fair to insist that the platform allow substituting a replacement browser designated by the consumer.) Google turned the whole model upside down with Chromebooks, building an entire operating system around a web browser.

All hail the new browser monopoly

Control over the browser temporarily handed MSFT significant leeway over the future direction of the web platform. If that platform remained surprisingly stagnant afterwards— compared to its frantic pace of innovation during the 1990s— that was mainly because MSFT had neither the urgency nor the vision to take it to the next level. (Witness the smart tags debacle.) Meanwhile the W3C ran around in circles, alternating between incremental tweaks— introducing XHTML, HTML repackaged as well-formed XML— and ambitious visions of a "semantic web." The latter imagined a clean separation of content from style, two distinct layers HTML munged together, making it possible for software to extract information, process it and combine it in novel ways for the benefit of users. Outside the W3C there were few takers. Critics derided it as angle-brackets-everywhere: XSLT, XPath, XQuery, XLink. The semantic web never got the large-scale demonstration it would have needed to test its premise. For a user sitting in front of their browser and accessing websites, it would have been difficult to articulate the immediate benefits. Over time Google and ChatGPT would prove machines were more than adequate at grokking unstructured information on web pages even without the benefit of XML tagging.

Luckily for the web, plenty of startups did have more compelling visions of how the web should work and what future possibilities could be realized— given the right capabilities. This dovetailed nicely with the shift in emphasis from shipping software to operating services. (It certainly helped that the economics were favorable. Instead of selling a piece of software once for a lump sum and hoping the customer upgrades when the next version comes out, what if you could count on a recurring source of revenue from monthly subscriptions?) The common refrain for all of these entrepreneurs: the web browser had become the bottleneck. PCs kept getting faster and even operating systems became more capable over time, but websites could only access a tiny fraction of those resources through HTML and JavaScript APIs, and only through a notoriously buggy, fragile implementation held together by duct-tape: Internet Explorer.

In hindsight it is clear something had to change; there was too much market pressure against a decrepit piece of software guarding an increasingly untenable OS monopoly. Surprisingly that change came in the form of not one but two major developments in the 2010s. One shift had nothing to do with browsers: smart-phones gave developers a compelling new way to reach users. It was a clean slate, with powerful new APIs unconstrained by the web platform. MSFT did not have a credible response to the rise of iOS and Android any more than it did to Chrome. Windows Phone never made much headway with device manufacturers, despite or perhaps because of the Nokia acquisition. It had even less success winning over developers, failing to complete the virtuous cycle between supply & demand that drives platforms. (At one point a desperate MSFT started outright offering money to publishers of popular apps to port their iOS & Android apps to Windows Phone.)

Perhaps the strongest evidence that MSFT judged the risk accurately comes from Google Chrome itself. Where MSFT saw a one-sided threat to the Windows and Office revenue streams, Google perceived a balanced mix of opportunity and risk. The "right" browser could accelerate the shift to replace local software with web applications— such as the Google Apps suite— by closing the perceived functionality gap between them. The "wrong" browser would continue to frustrate that shift or even push the web towards another dead-end proprietary model tightly coupled to one competitor. Continued investment in Chrome is how the odds get tilted towards the first outcome. Having watched MSFT squander its browser monopoly with years of neglect, Google knows better than to rest on its laurels.

CP

The elusive nature of ownership in Web3

A critical take on Read-Write-Own

In the recently published "Read Write Own," Chris Dixon makes the case that blockchains allow consumers to capture more of the returns from the value generated in a network because of strongly enshrined rules of ownership. This is an argument about fairness: the value of networks is derived from the contributions of participants. Whether it is Facebook users sharing updates with their network or Twitter/X influencers opining on the latest trends, it is Metcalfe's law that allows these systems to become so valuable. But as the history of social networks has demonstrated time and again, that value accrues to a handful of employees and investors who control the company. Not only do customers not capture any of those returns (hence the often-used analogy of "sharecroppers" operating on Facebook's land), they are stuck with the negative externalities, including degraded privacy, disinformation and, in the case of Facebook, repercussions that spill out into the real world, including outbreaks of violence.

The linchpin of this argument is that blockchains can guarantee ownership in ways that the two prevailing alternatives ("protocol networks" such as SMTP or HTTP and the better-known "corporate networks" such as Twitter) cannot. Twitter can take away any handle, shadow-ban the account or modify its ranking algorithms to reduce its distribution. By comparison if you own a virtual good such as an NFT issued on a blockchain, no one can interfere with your rightful ownership of that asset. This blog post delves into some counterarguments on why this sense of ownership may prove illusory in most cases. The arguments will run from the least likely and most theoretical to the most probable, in each case demonstrating ways these vaunted property rights fail.

Immutability of blockchains

The first shibboleth that we can dispense with is the idea that blockchains operate according to immutable rules cast in stone. An early dramatic illustration of this came about in 2016, as a result of the DAO attack on Ethereum. The DAO was effectively a joint investment project operated by a smart-contract on the Ethereum chain. Unfortunately that contract had a serious bug, resulting in a critical security vulnerability. An attacker exploited that vulnerability to drain most of the funds, to the tune of $150MM USD notional at the time.

This left the Ethereum project with a difficult choice. They could double down on the doctrine that Code-Is-Law and let the theft stand: argue that the "attacker" did nothing wrong, since they used the contract in exactly the way it was implemented. (Incidentally, that is a mischaracterization of the way Larry Lessig intended the phrase. "Code and Other Laws of Cyberspace," where the phrase originates, was prescient in warning about the dangers of allowing privately developed software, or "West Coast Code" as Lessig termed it, to usurp democratically created laws, or "East Coast Code," in regulating behavior.) Or they could orchestrate a difficult, disruptive hard-fork to change the rules governing the blockchain and rewrite history to pretend the DAO breach never occurred. This option would return the stolen funds to investors.

Without reopening the charged debate around which option was "correct" from an ideological perspective, we note that the Ethereum Foundation emphatically took the second route. From the attacker's perspective, their "ownership" of stolen ether proved very short-lived.

While this episode demonstrated the limits of blockchain immutability, it is also the least relevant to the sense of property rights that most users are concerned about. Despite fears that the DAO rescue could set a precedent and force the Ethereum Foundation to repeatedly bail out vulnerable projects, no such hard-forks followed. Over the years much larger security failures occurred on Ethereum (measured in notional dollar value), with the majority attributed with high confidence to rogue states such as North Korea. None of them merited so much as a serious discussion of whether another hard-fork was justified to undo the theft and restore the funds to rightful owners. If hundreds of millions of dollars in tokens ending up in the coffers of a sanctioned state does not warrant breaking blockchain immutability, it is fair to say the average NFT holder has little reason to fear that some property dispute will result in a blockchain-scale reorganization that takes away their pixelated monkey images.

Smart-contract design: backdoors and compliance “features”

Much more relevant to the threat model of a typical participant is the way virtual assets are managed on-chain: using smart-contracts that are developed by private companies and often subject to private control. Limiting our focus to Ethereum for now, recall that the only "native" asset on chain is ether. All other assets, such as fungible ERC-20 tokens and collectible NFTs, must be defined by smart contracts, in other words software that someone authors. Those contracts govern the operation of the asset: the conditions under which it can be "minted"— in other words, created out of thin air— transferred or destroyed. To take a concrete example: a stablecoin such as Circle's USDC is designed to be pegged 1:1 to the US dollar. More USDC is issued on chain when Circle the company receives fiat deposits from a counterparty requesting virtual assets. Similarly USDC must be taken out of circulation or "burned" when a counterparty returns their virtual dollars and demands ordinary dollars back in a bank account.

None of this is surprising. As long as the contract properly enforces rules around who can invoke those actions on chain, this is exactly how one envisions a stablecoin operating. (There is a separate question around whether the 1:1 backing is maintained, but that can only be resolved by off-chain audits. It is outside the scope of enforcement by blockchain rules.) Less appreciated is the fact that most stablecoin contracts also grant the operator the ability to freeze funds or even seize assets from any participant. This is not a hypothetical capability; issuers have not shied away from using it when necessary. To pick two examples:

While the existence of such a "backdoor" or "God mode" may sound sinister in general, these specific interventions are hardly objectionable. But it serves to illustrate the general point: even if blockchains themselves are immutable and arbitrary hard-forks a relic of the past, virtual assets themselves are governed not by "native" rules ordained by the blockchain, but by independent software authored by the entity originating that asset. That code can include arbitrary logic granting the issuer any right they wish to reserve.
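To make that concrete, here is a purely illustrative sketch (written in TypeScript rather than an actual contract language, and not modeled on any real token's code) of what such issuer privileges look like when written out. The point is that the same logic defining the asset also defines who can freeze or seize it:

    type Address = string;

    class ManagedToken {
      private balances = new Map<Address, bigint>();
      private frozen = new Set<Address>();

      constructor(private readonly issuer: Address) {}

      // Ordinary users can transfer, subject to the issuer's freeze list.
      transfer(from: Address, to: Address, amount: bigint): void {
        if (this.frozen.has(from) || this.frozen.has(to)) {
          throw new Error("address frozen by issuer");
        }
        const balance = this.balances.get(from) ?? 0n;
        if (balance < amount) throw new Error("insufficient balance");
        this.balances.set(from, balance - amount);
        this.balances.set(to, (this.balances.get(to) ?? 0n) + amount);
      }

      // Issuer-only entry points: mint, freeze and outright seizure.
      mint(to: Address, amount: bigint, caller: Address): void {
        this.requireIssuer(caller);
        this.balances.set(to, (this.balances.get(to) ?? 0n) + amount);
      }

      freeze(target: Address, caller: Address): void {
        this.requireIssuer(caller);
        this.frozen.add(target);
      }

      seize(target: Address, caller: Address): void {
        this.requireIssuer(caller);
        const confiscated = this.balances.get(target) ?? 0n;
        this.balances.set(caller, (this.balances.get(caller) ?? 0n) + confiscated);
        this.balances.set(target, 0n);
      }

      private requireIssuer(caller: Address): void {
        if (caller !== this.issuer) throw new Error("issuer only");
      }
    }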

To be clear, that logic will be visible on-chain for anyone to view. Most prominent smart-contracts today have their source code published for inspection. (For example, here is the Circle USD contract.) Even if the contract did not disclose its source code, the logic could be reverse engineered from the low-level EVM bytecode available on chain. In that sense there should be no "surprises" about whether an issuer can seize an NFT or refuse to honor a transfer privately agreed upon by two parties. One could argue that users will not purchase virtual assets from issuers who grant themselves such broad privileges to override property rights by virtue of their contract logic. But that is a question of market power and whether any meaningful alternative exists for consumers who want to vote with their wallet. It may well become the norm that all virtual assets are subject to permanent control by the issuer, something users accept without a second thought, much like the terms-of-use agreements one clicks through without hesitation when registering for advertising-supported services. The precedent with stablecoins is not encouraging: Tether and Circle's USDC are by far the two largest stablecoins by market capitalization. The existence of administrative overrides in their code was no secret. Even multiple invocations of that power have not resulted in a mass exodus of customers to alternative stablecoins.

When ownership rights can be ignored

Let’s posit that popular virtual assets will be managed by “fair” smart-contracts without designed-in backdoors that would enable infringement of ownership rights. This brings us to the most intractable problem: real-world systems are not bound by ownership rights expressed on the blockchain.

Consider the prototypical example of ownership that proponents argue can benefit from blockchains: in-game virtual goods. Suppose your game character has earned a magical sword after significant time spent completing challenges. In most games today, your ownership of that virtual sword is recorded as an entry in the internal database of the game studio, subject to their whims. You may be allowed to trade it, but only on a sanctioned platform most likely affiliated with the same studio. The studio could confiscate that item because you were overdue on payments or unwittingly violated some other rule in the virtual universe. They could even make the item “disappear” one day if they decide there are too many of these swords or they grant an unfair advantage. If that virtual sword was instead represented by an NFT on chain, the argument runs, the game studio would be constrained in these types of capricious actions. You could even take the same item to another gaming universe created by a different publisher.

On the face of it, this argument looks sound, subject to the caveats about the smart-contract not having backdoors. But it is a case of confusing the map with the territory. There is no need for the game publisher to tamper with on-chain state in order to manipulate property rights; nothing prevents the game software from ignoring on-chain state. On-chain state could very well reflect that you are the rightful owner of that sword while in-game logic refuses to render your character holding that object. The game software is not running on the blockchain or in any way constrained by the Ethereum network or even the smart-contract managing virtual goods. It is running on servers controlled by a single company— the game studio. That software may, at its discretion, consult the Ethereum blockchain to check on ownership assignments. That is not the same as being constrained by on-chain state. Just because the blockchain ledger indicates you are the rightful owner of a sword or avatar does not automatically force the game rendering software to depict your character with those attributes in the game universe. In fact the publisher may deliberately depart from on-chain state for good reasons. Suppose an investigation determines that Bob bought that virtual sword from someone who stole it from Alice. Or there have been multiple complaints about a user-designed avatar being offensive and violating community standards. Few would object to the game universe being rendered in a way that is inconsistent with on-chain ownership records under these circumstances. Yet the general principle stands: users are still subject to the judgment of one centralized entity on when it is “fair game” to ignore blockchain state and operate as if that virtual asset did not exist.
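To illustrate the distinction between consulting on-chain state and being bound by it, here is a minimal sketch of what a game backend's ownership check might look like. It assumes an ERC-721 style contract and uses the ethers library (v6-style imports); the contract address, RPC endpoint and ban list are hypothetical placeholders.

    import { Contract, JsonRpcProvider } from "ethers";

    const provider = new JsonRpcProvider("https://rpc.example.com");
    const GAME_ITEMS_ADDRESS = "0x0000000000000000000000000000000000000000"; // placeholder
    const erc721Abi = ["function ownerOf(uint256 tokenId) view returns (address)"];
    const gameItems = new Contract(GAME_ITEMS_ADDRESS, erc721Abi, provider);

    // The studio's own moderation decisions live entirely off-chain.
    const bannedTokenIds = new Set<bigint>([42n]);

    async function shouldRenderItem(tokenId: bigint, playerWallet: string): Promise<boolean> {
      // Step 1: consult on-chain state. The blockchain answers "who owns this?"
      const owner: string = await gameItems.ownerOf(tokenId);
      const ownsItem = owner.toLowerCase() === playerWallet.toLowerCase();

      // Step 2: apply the publisher's off-chain policy. Nothing forces the game
      // server to honor the on-chain answer; it can simply decline to render the item.
      return ownsItem && !bannedTokenIds.has(tokenId);
    }

The blockchain gets a vote in step 1, but step 2 remains entirely at the publisher's discretion.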

Case of the disappearing NFT

An instructive case of "pretend-it-does-not-exist" took place in 2021 when Moxie Marlinspike created a proof-of-concept NFT that renders differently depending on which website it is viewed from. Moxie listed the NFT on OpenSea, at the time the leading marketplace for trading NFTs. While it was intended in good spirit as a humorous demonstration of the mutability and transience of NFTs, OpenSea was not amused. Not only did they take down the listing, but the NFT was removed from the results returned by the OpenSea API. As it turns out, a lot of websites rely on that API for NFT inventories. Once OpenSea dropped the NFT from its API results, it was as if the NFT did not exist. To be clear: OpenSea did not and could not make any changes to blockchain state. The NFT was still there on-chain and Moxie was its rightful owner as far as the Ethereum network was concerned. But once the OpenSea API started returning alternative facts, the NFT vanished from view for every other service relying on that API instead of directly inspecting the blockchain themselves. (It turns out there were a lot of them, further reinforcing Moxie's critique of the extent of centralization.)
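For readers wondering how an NFT can render differently depending on the viewer in the first place: the token's on-chain metadata typically points at an image URL controlled by its creator, and nothing stops the server behind that URL from varying its answer based on who is asking. A minimal sketch of one plausible way to do this (not necessarily Moxie's actual implementation; the hostnames and file names are made up) follows:

    import { createServer } from "node:http";

    // The NFT's metadata points at this endpoint; the returned image depends on
    // which site or wallet fetches it.
    const server = createServer((req, res) => {
      const referer = req.headers["referer"] ?? "";
      const userAgent = req.headers["user-agent"] ?? "";

      let image: string;
      if (referer.includes("opensea.io")) {
        image = "image-for-opensea.png";
      } else if (referer.includes("rarible.com")) {
        image = "image-for-rarible.png";
      } else if (!referer && userAgent.includes("Mozilla")) {
        // Direct or wallet-originated request: show something else entirely.
        image = "image-for-wallet.png";
      } else {
        image = "default.png";
      }

      res.writeHead(302, { Location: `https://nft.example.com/${image}` });
      res.end();
    });

    server.listen(8080);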

Suppose customers disagree with the policy of the game studio. What recourse do they have? Not much within that particular game universe, any more than the average user has leverage with Twitter or Facebook in reversing their trust & safety decisions. Users can certainly try to take the same item to another game, but there are limits to portability. While blockchain state is universal, game universes are not. The magic sword from the medieval setting will not do much good in a Call of Duty title set in WW2.

In that sense, owners of virtual game assets are in a more difficult situation than Moxie with his problematic NFT. OpenSea can disregard that NFT but cannot preclude listing it on competing marketplaces or even arranging private sales to a willing buyer who values it on collectible or artistic merits. It would be the exact same situation if OpenSea for some bizarre reason came to insist that you do not own a bitcoin that you rightfully own on the blockchain. OpenSea persisting in such a delusion would not detract in any way from the value of your bitcoin. Plenty of sensible buyers exist elsewhere who can form an independent judgment about blockchain state and accept that bitcoin in exchange for services. But when the value of a virtual asset is determined primarily by its function within a single ecosystem— namely that of the game universe controlled by a centralized publisher— what those independent observers think about ownership status carries little weight.

CP

We can bill you: antagonistic gadgets and dystopian visions of Philip K Dick

Dystopian visions

Acclaimed science-fiction author Isaac Asimov’s stories on robots involved a set of three rules that all robots were expected to obey:

  1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  2. A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.
  3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

The prioritization leaves no ambiguity in the relationship between robots and their creators. Regardless of their level of artificial intelligence and autonomy, robots were to avoid harm to human beings. Long before sentient robots running amok and turning against their creators became a staple of science-fiction (Mary Shelley's 19th century novel "Frankenstein" could be seen as their predecessor), Asimov was systematically formulating the intended relationship. Ethical implications of artificial intelligence are a recurring theme today. Can our own creations end up undermining humanity after they achieve sentience? But there are far more subtle and less imaginative ways that technology works against people in everyday settings, with no emergent AI to blame. This version too was predicted by science-fiction.

Three decades after Asimov, the dystopian imagination of Philip K Dick produced a more conflicted relationship between man and his creations. In the 1969 novel "Ubik," the protagonist inhabits a world where advanced technology controls basic household functions from kitchen appliances to the locks on the door. But there is a twist: all of these gadgets operate on what today would be called a subscription model. The coffee-maker refuses to brew the morning cup of joe until coins are inserted. (For all the richness and wild imagination of his alternate realities, PKD did not bother devising an alternative payment system for this future world.) When the protagonist runs out of money, he is held hostage at home; his front-door will not open without coins.

Compared to some of the more fanciful alternate universes brought to life in PKD fiction— Germany winning World War II in "The Man in the High Castle" or an omniscient police-state preemptively arresting criminals before they commit crimes as in "The Minority Report"— this level of dystopia is mild, almost benign. But it is also one that bears a striking resemblance to where the tech industry is stridently marching. Consumers are increasingly losing control over their devices— devices which they have fully and rightfully paid for. Not only is the question of ownership being challenged with increasing restrictions on what they can do with hardware they have every right to expect 100% control over, but those devices are actively working against the interests of the consumer, doing the bidding of third parties, be it the manufacturer, the service provider or possibly the government.

License to tinker

There is a long tradition in American culture of hobbyists tinkering with their gadgets. This predates the Internet or the personal computer. Perhaps automobiles and motorcycles were the first technology that lent itself to mass tinkering. Mass production made cars accessible to everyone, and for those with a knack for spending long hours in the garage, they were relatively easy to modify, whether for different aesthetics or better performance under the hood. In one sense the hot-rodders of the 1950s and 1960s were among the cultural predecessors of today's software hobbyists. Cars at the time were relatively low-tech; with carburetors and manual transmission being the norm, spending a few months in high school shop-class would provide adequate background. More importantly the platform was tinkering-friendly. Manufacturers did not go out of their way to prevent buyers from modifying their hardware. That is partly related to technical limitations. It is not as if cars could be equipped with tamper-detection sensors to immobilize the vehicle if the owner installed parts the manufacturer did not approve of. But ease of customization was also itself considered a competitive advantage. In fact some of the most cherished vehicles of the 20th century, including muscle-cars, V-twin motorcycles and air-cooled Volkswagens, owed part of their iconic status to their vibrant aftermarket for mods.

Natural limits existed on how far owners could modify their vehicle. To drive on public roads, it had to be road-legal after all. One could install a different exhaust system to improve engine sound, but not have flames shooting out the back. More subtly, an economic disincentive existed. Owners risked giving up warranty coverage for modified parts, a significant consideration given that Detroit was not exactly known for high-quality, low-defect manufacturing at the time. But even that setback was localized. Replace the stereo or rewire the speakers yourself, and you could no longer complain about electrical system malfunctions. But you would still expect the transmission to operate as advertised and the manufacturer to continue honoring any warranty coverage for the drivetrain. There was no warning sticker declaring that loosening this or that bolt would void the entire warranty on every other part of the vehicle. Crucially consumers were given a meaningful choice: you are free to modify the car for personal expression in exchange for giving up warranty claims against the manufacturer.

From honor code to software enforcement

Cars from the golden-era of hot-rodding were relatively dumb gadgets. Part of the reason manufacturers did not have much of a say in how owners could modify their vehicle is that they had no feasible technology to enforce those restrictions once the proud new owner drove it off the lot. By contrast, software can enforce very specific restrictions on how a particular system operates. In fact it can impose entirely arbitrary limitations to disallow specific uses of the hardware, even when the hardware itself is perfectly capable of performing those functions.

Here is an example. In the early days Windows NT 3.51 had two editions: workstation and server, differentiated by the type of scenario they were intended for. The high-end server SKU supported machines with up to 8 processors while the workstation edition maxed out at 2. If you happened to have more powerful hardware, even if you did not need any of the bells-and-whistles of server, you had to spring for the more expensive product. (Note: there is a significant difference between uniprocessor and multiprocessor kernels; juggling multiple CPUs requires substantial changes, but going from 2 to 8 processors does not.) What was the major difference between those editions? From an economic perspective, $800 measured in 1996 dollars. From a technology perspective, a handful of bytes in a registry key describing which type of installation occurred. As noted in a 1996 article titled Differences Between NT Server and Workstation Are Minimal:

“We have found that NTS and NTW have identical kernels; in fact, NT is a single operating system with two modes. Only two registry settings are needed to switch between these two modes in NT 4.0, and only one setting in NT 3.51. This is extremely significant, and calls into question the related legal limitations and costly upgrades that currently face NTW users.”

There is no intrinsic technical reason why the lower-priced edition could not take advantage of more powerful hardware, or for that matter allow more than 10 concurrent connections to function as a web server— a restriction Microsoft later relented on after customer backlash. These are arbitrary calls made by someone on the sales team who, in their infinite wisdom, concluded that customers with expensive hardware or busy websites ought to pay more for their operating system.

Two tangents are worth exploring about this case. First, the proprietary nature of the software and its licensing model is crucial for enforcing these types of policies. Arbitrary restrictions would not fly with open-source software. If a clueless vendor shipped a version of Linux with an arbitrary limit on the number of CPUs or amount of memory that did not originate from technical limitations, customers could modify the source code to lift that restriction. Second, the ability to enforce draconian restrictions dreamt up by marketing is greatly constrained by platform limitations. That is because the personal computer is an open platform. Even with a proprietary operating system such as Windows, users get full control over their machine. You could edit the registry or tamper with OS logic to trigger an identity crisis between workstation and server. Granted, that would be an almost certain violation of the shrink-wrap license nobody read when installing the OS. MSFT would not look kindly upon this practice if carried out at scale. It is one thing for hobbyists to demonstrate the possibility as a symbolic gesture; it is another level of malicious intent for an enterprise with thousands of Windows licenses to engage in systematic software piracy by giving themselves a free upgrade. So at the end of the day, enforcement still relied on messy social norms and imperfect contractual obligations. Software did not aspire to replace the conscience of the consumer, to stop them from perceived wrongdoing at all costs.

Insert quarters to continue

In fact software licensing in the enterprise has a history of such arbitrary restrictions, enforced through a combination of business logic implemented in proprietary software along with dubious reliance on over-arching "terms of use" that discourage tampering with said logic. To this day copies of Windows Server are sold with client access licenses, dictating the number of concurrent users the server is willing to support. If the system is licensed for 10 clients, the eleventh user attempting to connect will be turned away regardless of how much spare CPU or memory capacity is left. You must purchase more licenses. In other words: insert quarters to continue.
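The enforcement logic itself is trivial. Here is a purely illustrative sketch (not Microsoft's actual implementation) showing how a software-enforced seat cap works: the limit lives in licensing configuration, not in any physical constraint of the hardware.

    // Value purchased from the vendor, not a property of the machine.
    const LICENSED_SEATS = 10;
    const activeSessions = new Set<string>();

    function connect(userId: string): void {
      if (activeSessions.size >= LICENSED_SEATS && !activeSessions.has(userId)) {
        // Plenty of CPU and memory may remain; the refusal is contractual.
        throw new Error("License limit reached: purchase additional client licenses");
      }
      activeSessions.add(userId);
    }

    function disconnect(userId: string): void {
      activeSessions.delete(userId);
    }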

Yet this looks very different from Philip K Dick's dystopian coffeemaker and does not elicit anywhere near the same level of indignation. There are several reasons for that. First, enterprise software has acclimatized to the notion of discriminatory pricing. Vendors extract higher prices from companies who are in a position to pay. The reasoning goes: if you can afford that fancy server with a dozen CPUs and boatloads of memory, surely you can also spring for the high-end edition of Windows Server that will agree to fully utilize the hardware? Second, the complex negotiations around software licensing are rarely surfaced to end-users. It is the responsibility of the IT department to work out how many licenses are required and determine the right mix of hardware/software required to support the business. If an employee is unable to perform her job because she is turned away by a server that has reached its cap on simultaneous users— an arbitrary limit that exists only in the realm of software licensing, it must be noted, not in the absolute resources available in the underlying hardware— she is not expected to solve that problem by taking out her credit card and personally paying for the additional license. Finally this scenario is removed from everyday considerations. Not everyone works in a large enterprise subject to stringent licensing rules, and even for those unlucky enough to run into this situation, the inconvenience created by an uncooperative server is relatively mild, a far cry from the front door that refuses to open and locks its occupants inside.

From open platforms to appliances

One of the more disconcerting trends of the past decade is that what used to be the norm in the enterprise segment is now trickling down into the consumer space. We may not have coffeemakers operating on a subscription model yet. Doors that demand payment for performing their basic function would likely never pass fire-code regulations. But gradually consumer electronics have started imposing greater restrictions on their alleged owners, restrictions that are equally arbitrary, disconnected from the capabilities of the hardware, and chosen unilaterally by their manufacturers. Consider some examples from consumer electronics:

  • Region coding in DVD players. DVD players are designed to play content manufactured only for a specific region, even though in principle there is nothing that prevents the hardware from playing discs purchased anywhere in the world. Why? Because of disparities in purchasing power, DVDs are priced much lower in developing regions than they are in Western countries. If DVD players sold to American consumers could play content from any region, it would suddenly become possible to “arbitrage” this price difference by purchasing cheap DVDs in, say, Taiwan and playing them in the US. Region coding protects the revenue model of content providers, which depends crucially on price discrimination: charging US consumers more than Taiwanese consumers for the same title because they can afford to pay higher prices for movies.
  • Generalizing from the state of DVD players, any Digital Rights Management (or as it has been derisively called, “digital restrictions management”) technology is an attempt to hamper the capabilities of software/hardware platforms to further the interests of content owners. While the rest of the software industry is focused on doing more with existing resources—squeeze more performance out of the CPU, add more features to an application that users will enjoy—those working on DRM are trying to get devices to do less. Information is inherently copyable; DRM tries to stop users from copying bits. By default audio and video signals can be freely sent to any output device; HDCP tries to restrict where they can be routed over HDMI in the name of battling piracy. The restrictions do not even stop at content. Because the PC platform is inherently open, DRM enforcement inevitably takes an expansive view of its mission and begins to monitor the user for perfectly valid activity that could potentially undermine DRM, such as installing unsigned device drivers or enabling kernel-mode debugging on Windows.
  • Many cell phones sold in North America are “locked” to a specific carrier, typically the one the customer bought the phone from. It is not possible to switch to another wireless carrier while keeping the device. Again there is no technical reason for this. Much like the number of processors that an operating system will agree to run on, it is an arbitrary setting. (In fact it takes more work to implement such checks.) The standard excuse is that the cost of the device is heavily subsidized by the carrier, with the subsidy recouped through charges buried in the service contract. But this argument fails basic sanity checks. Presumably the subsidy is paid off after some number of months, yet phones remain locked. Meanwhile customers who bring their own unlocked device are not rewarded with any special discounts, effectively distorting the market. Carriers also charge an early termination fee to customers who walk away from their contract prematurely; surely that fee could be set to cover the lost subsidy?
  • Speaking of cell phones, they are increasingly becoming locked-down appliances, to use the terminology from Zittrain’s “The Future of the Internet,” instead of open computing platforms. Virtually all PCs allow users to replace the operating system. Not a fan of Windows 8? Feel free to wipe the slate clean and install Linux. Today consumers can even purchase PCs preloaded with Linux to escape the dreaded “Microsoft tax,” where the cost of a Windows license is implicitly factored into hardware prices. And if the idea of Linux-on-the-desktop turns out to be wishful thinking yet again, you can repent and install Windows 10 on that PC which came with Ubuntu out of the box. By contrast phones ship with one operating system picked by the all-knowing manufacturer and it is very difficult to change that. On the surface, consumers have plenty of choice because they can pick from thousands of apps written for that operating system. Yet one level below, they are stuck with the operating system as an immutable choice. In fact, some Android devices never receive software updates from the manufacturer or carrier, so they are “immutable” in a very real sense. Users must go out of their way to exploit a security vulnerability in order to jailbreak/root their devices to replace the OS wholesale or even extend its capabilities in ways the manufacturer did not envision. OEMs further exploit this confusion to discourage users from tinkering with their devices, trying to equate such practices with weakening security—as if users are better off sticking to their abandoned “stock” OS with known vulnerabilities that will never get patched.

Unfree at any speed

Automotive technology too is evolving in this direction of locked-down appliances. Cars remained relatively dumb until the 1990s, when microprocessors slowly started making their way into every system, starting with engine management. On the way to becoming more software-driven, effectively computers-on-wheels, something funny happened: the vehicle gained greater capability to sense present conditions and, more importantly, became capable of reacting to those inputs. Initially this looks like an unalloyed good. All of the initial applications were uncontroversial improvements to occupant safety: antilock brakes, airbags and traction control. All depend on software monitoring input from sensors and promptly responding to signals indicating that a dangerous condition is imminent.

The next phase may be less clear-cut, as enterprising companies continue pushing the line between choice and coercion. Insurers such as Geico offer pay-per-mile plans that use gadgets attached to the OBD-II port to collect statistics on how far the vehicle is driven, and presumably on how aggressively the driver attacks corners. While some may consider this an invasion of privacy, at least there is a clear opt-out: do not sign up for that plan. In other cases, opt-out becomes ambiguous. GM found itself in a pickle over the Stingray Corvette recording occupants with a camera in the rearview mirror. This was a feature, not a bug, designed to create YouTube-worthy videos while the car was being put through its paces. But if occupants are not aware that they are being recorded, it is not clear they consented to appearing as extras in a Sebastian-Vettel-role-playing game. At the extreme end of the informed-consent scale is the use of remote immobilizers for vehicles sold to consumers with subprime credit. In these cases the dealers literally get a remote kill-switch for disabling operation of the vehicle if the consumer fails to stay current on payments. (At least that is the idea; NYT reports allegations of mistaken or unwarranted remote shutdowns by unscrupulous lenders.) One imagines the next version of these gadgets will incorporate a credit-card reader to better approximate the PKD dystopia. Insert quarters to continue.

What is at stake here is a question of fairness and rights, but not in the legal sense. Very little has changed about the mechanics of consumer financing: purchasing a car on credit still obligates the borrower to make payments promptly until the balance is paid off. Failure to fulfill that obligation entitles the seller to repossess the vehicle. This is not some new-fangled notion of how to handle loans in default; the right to repossess or foreclose has always existed on the books. In practice, exercising that right often required some dramatic, made-for-TV adventures in tracking down the consumer or vehicle in question. Software has greatly amplified the ability of lenders to enforce their rights and collect on their entitlements under the law.

From outright ownership to permanent tenancy

Closely related is a shift from ownership to subscription models. Software has made it possible to recast what used to be one-time purchases into ongoing subscriptions or pay-per-use models. Powerful social norms exist around which goods are distributed according to one or the other model. No one expects to pay for electricity or cable with a lump-sum payment once and call it a day, receiving free service in perpetuity. If you stop paying for cable, the screen will eventually go blank. By contrast hardware gadgets such as television sets are expected to operate according to a different model: once you bring it home, it is yours. It may have been purchased with borrowed money, with credit extended by the store or your own credit-card issuer. But consumers would be outraged if their bank, BestBuy or TV manufacturer remotely reached out to brick their television in response to late payments. Even under most subscription models, there are strict limitations on how service providers can retaliate against consumers breaking the contract. If you stop paying for water, the utility can shut off future supply. They cannot send vigilantes over to your house to drain the water tank or “take back” water you are presumably no longer entitled to.

Such absurd scenarios can and do happen in software. Perhaps missing the symbolism, Amazon remotely wiped copies of George Orwell’s 1984 from Kindles over copyright problems. (The irony could only be exceeded if Amazon threatened to remove copies of Philip K Dick’s “Ubik” unless customers pay up.) These were not die-hard Orwell fans or DMCA protestors deliberately pirating the novel; they had purchased their copies from the official Amazon store. Yet the company defended its decision, arguing that the publisher who had offered those novels on its marketplace lacked the proper rights. Kindle is a locked-down appliance where Amazon calls the shots and customers have no recourse, no matter how arbitrary those decisions appear.

What about computers? It used to be the case that if you bought a PC, it was yours for the keeping. It would continue running until its hardware failed. In 2006 Microsoft launched FlexGo, a pay-as-you-go model for PC ownership in emerging markets. Echoing the words of a used-car salesman on the benefits bestowed on consumers, while barely suppressing a sense of colonialist contempt, a spokesperson for a partnering bank in Brazil enthuses: “Our lower-income customers are excited to finally buy their first PC with minimal upfront investment, paying for time as they need it, and owning a computer with superior features and genuine software.” (Emphasis on genuine software, since consumers in China or Brazil never had any problem getting their hands on pirated versions of Windows.) MSFT takes a more measured approach in touting the benefits of this alternative: “Customers can get a full featured Windows-enabled PC with low entry costs that they can access using prepaid cards or through a monthly subscription.” Insert quarters to continue.

FlexGo did not crater like “Bob,” Vista or others in the pantheon of MSFT disasters. Instead it faded into obscurity, having bet on the wrong vision of “making computing accessible,” soon rendered irrelevant on both financial and technological grounds. Hardware prices continued to drop. Better access to banking services and consumer credit meant citizens in developing countries gained flexible payment options to buy a regular PC, without an OEM or software vendor in the loop to supervise the loan or tweak the operating system to enforce alternative licensing models. More dramatically, the emergence of smartphones cast into doubt whether everyone in Brazil actually needed that “full-featured Windows-enabled PC” in the first place to cross the digital divide.

FlexGo may have disappeared but the siren song of subscription models still exerts its pull on the technology industry. Economics favor such models on both sides. Compared to the infrequent purchase of big-ticket items, the steady revenue stream from monthly subscribers smooths out seasonal fluctuations in revenue. From the consumer perspective, making “small” monthly payments over time instead of one big lump payment may look more appealing due to cognitive biases.

If anything, the waning of the PC as the dominant platform paves the way for this transformation. Manufacturers can push locked-down “appliances” without the historical baggage associated with the notion of a personal computer. Ideas that would never fly on the PC platform, practices that would provoke widespread consumer outrage and derision—locked boot-loaders, mandatory data-collection, always-on microphones and cameras, remote kill capabilities—can become the new normal for a world of locked-down appliances. In this ecosystem users no longer “own” their devices in the traditional sense, even if the devices were paid for in full and no one can legally show up at the door to demand their return. These gadgets suffer from a serious case of split-personality disorder. On the one hand they are designed to provide some useful service to their presumed “owner;” this is the ostensible purpose they are advertised and purchased for. At the same time the gadget software contains business logic to serve the interests of the device manufacturer, service provider or whoever happens to actually control the bits running there. These two goals are not always aligned. In a hypothetical universe with efficient markets, one would expect strong correlation. If the gadget deliberately sacrificed functionality to protect the manufacturer’s platform or artificially sustain an untenable revenue model, enlightened consumers would flock to an alternative from a competitor not saddled with such baggage. In reality such competitive dynamics operate imperfectly if at all, and the winner-takes-all nature of many market segments means that it is very difficult for a new entrant to make significant gains against entrenched leaders by touting openness or user control as a distinguishing feature. (Case in point: the troubled history of open-source mobile phone projects and their failure to reach mass adoption.)

Going against the grain?

If there are forces counteracting the irresistible pull of locked-down appliances, they will face an uneven playing field. The share of PCs continues to decline among all consumer devices; Android has recently surpassed Windows as the most common operating system on the Internet. Meanwhile the highly fashionable Internet of Things (IoT) notion is predicated on black-box devices which are not programmable or extensible by their ostensible owners. It turns out that in some cases they are not even managed by the manufacturer; just ask the owners of IP cameras whose devices were unwittingly enrolled into the Mirai botnet.

Consumers looking for an alternative face a paradoxical situation. On the one hand, there is a dearth of off-the-shelf solutions designed with user rights in mind. The “market” favors polished solutions such as the Nest thermostat, where hardware, software and cloud services are inextricably bundled together. Suppose you are a fan of the hardware but skeptical about how much private information it is sending to a cloud service provider? Tough luck; there is no cherry-picking allowed. On the other hand, there has never been a better time to be tinkering with hardware: Arduino, Raspberry Pi and a host of other low-cost embedded platforms have made it easier than ever to put together your own custom solution. This is still a case of paying to preserve user rights, except the “payment” takes the form of additional time spent engineering and operating home-brew solutions. More worrisome is that such capabilities are only available to a small number of people, distinguished by their ability to renegotiate the terms service providers attempt to impose on their customer base. While that capability is to be celebrated—it is why every successful jailbreak of a locked-down appliance is applauded in the security community—it is fundamentally undemocratic by virtue of being restricted to a new ruling class of technocrats.

CP

[Update: Edited Feb 27th to correct typo.]

The missing identity layer for DeFi

Bootstrapping permissioned applications

To paraphrase the famous 1993 New Yorker cartoon: “On the blockchain, nobody knows that you are a dog.” All participants are identified by opaque addresses with no connection to their real-world identity. Privacy by default is a virtue, but even those who voluntarily want to link addresses to their identity have few persuasive options. This blog post can lay claim to the address 0xa12Db34D434A073cBEE0162bB99c0A3121698879 on Ethereum, but can readers be certain? (Maybe the Ethereum Name Service or ENS can help.) On the one hand, there is an undeniable egalitarian ethos here: if the only relevant facts about an address are those represented on-chain—its balance in cryptocurrency, holdings of NFTs, track record of participating in DAO governance votes—there is no way to discriminate between addresses based on such “irrelevant” factors as the citizenship or geographic location of the person or entity controlling that address. Yet such discrimination based on real-world identity is exactly what many scenarios call for. To cite a few examples:

  1. Combating illicit financing of sanctioned entities. This is particularly relevant given that rogue states including North Korea have increasingly pivoted to committing digital asset theft as their access to the mainstream financial system is cut off.
  2. Launching a regulated financial service where the target audience must be limited by law, for example to citizens of a particular country only.
  3. Flip-side of the coin: excluding participants from a particular country (for example, the United States) in order to avoid triggering additional regulatory requirements that would come into play when serving customers in that jurisdiction.
  4. Limiting participation in high-risk products to accredited investors only. While this may seem trivial to check by inspecting the balance on-chain, the relevant criterion is the person’s total holdings, which are unlikely to all be concentrated in one address.

As things stand, there are at best some half-baked solutions to the first problem. Blockchain analytics companies such as Chainalysis, TRM Labs and Elliptic surveil public blockchains, tracing the movement of funds associated with known criminal activity as these actors hop from address to address. Customers of these services can in turn receive intelligence about the state of an address or even an application such as a lending pool. Chainalysis even makes this information conveniently accessible on-chain: the company maintains smart-contracts on Ethereum and other EVM-compatible chains containing a list of OFAC-sanctioned addresses. Any other contract can consult this registry to check on the status of an address it is interacting with.
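For a sense of what consulting such a registry looks like from off-chain, here is a minimal web3.py sketch. The oracle address, ABI and the isSanctioned(address) signature are illustrative assumptions rather than the actual Chainalysis contract interface:

```python
# Sketch of querying an on-chain sanctions registry with web3.py.
# Oracle address and function signature are assumptions for illustration only.
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("https://mainnet.example/rpc"))   # placeholder RPC endpoint

ORACLE_ADDRESS = "0x0000000000000000000000000000000000000000"  # placeholder address
ORACLE_ABI = [{
    "name": "isSanctioned",
    "type": "function",
    "stateMutability": "view",
    "inputs": [{"name": "addr", "type": "address"}],
    "outputs": [{"name": "", "type": "bool"}],
}]

oracle = w3.eth.contract(address=ORACLE_ADDRESS, abi=ORACLE_ABI)

def is_sanctioned(address: str) -> bool:
    """Ask the registry whether an address appears on the sanctions list."""
    return oracle.functions.isSanctioned(Web3.to_checksum_address(address)).call()
```

A smart-contract performing the same check on-chain would simply call the registry through an interface exposing the equivalent view function.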

The problem with these services is three-fold:

  1. The classifications are reactive. New addresses are innocent until proven guilty, flagged only after they are involved in illicit activity. At that point, the damage has been done: other participants may have interacted with the address or allowed the address to participate in their decentralized applications. In some cases it may be possible to unwind specific transactions or isolate the impact. In other situations, such as a lending pool where funds from multiple participants are effectively blended together, it is difficult to identify which transactions are now “tainted” by association and which ones are clean.
  2. “Not a terrorist organization” is a low bar to meet. Even if this could be ascertained promptly and 100% accurately, most applications have additional requirements of their participants. Some of the examples alluded to above include location, country of citizenship or accredited-investor status. Excluding the tiny fraction of bad actors in the cross-hairs of FinCEN is useful but insufficient for building the types of regulated dapps that can take DeFi mainstream.
  3. All of these services follow a “blacklist” model: excluding bad actors. In information security, it is a well-known principle that this model is inferior to “whitelisting”—only accepting known good actors. In other words, a blacklist fails open: any address not on the list is assumed clean by default. The onus is on the maintainer of the list to keep up with the thousands of new addresses that crop up, not to mention any sign of nefarious activity by existing addresses previously considered safe. By contrast, whitelists require an affirmative step before addresses are considered trusted. If the maintainer is slow to react, the system fails safe: a good address is considered untrusted because the administrator has not gotten around to including it. (A toy comparison of the two failure modes follows below.)
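To make the fail-open vs fail-safe distinction concrete, here is a toy sketch; the addresses and list contents are made up:

```python
# Toy illustration of blacklist (fail-open) vs whitelist (fail-safe) screening.

BLACKLIST = {"0xknown_bad"}      # addresses the maintainer has flagged so far
WHITELIST = {"0xvetted_good"}    # addresses that completed an affirmative vetting step

def allowed_by_blacklist(addr: str) -> bool:
    # Fails open: anything not yet flagged is allowed by default.
    return addr not in BLACKLIST

def allowed_by_whitelist(addr: str) -> bool:
    # Fails safe: anything not yet vetted is rejected by default.
    return addr in WHITELIST

# A brand-new address nobody has reviewed yet:
assert allowed_by_blacklist("0xbrand_new") is True     # slips through
assert allowed_by_whitelist("0xbrand_new") is False    # held back until vetted
```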

What would an ideal identity verification layer for blockchains look like? Some high-level requirements are:

  • Flexible. Instead of expressing a binary distinction between sanctioned vs not-yet-sanctioned, it must be capable of expressing a range of different identity attributes as required by a wide range of decentralized apps.
  • Opt-in. The decision to go through identity verification for an address must reside strictly with the person or persons controlling that address. While we cannot stop existing analytics companies from continuing to conduct surveillance of all blockchain activity and trying to deanonymize addresses, we must avoid creating additional incentives or pressure for participants to voluntarily surrender their privacy.
  • Universally accepted. The value of an authentication system increases with the number of applications and services accepting that identity. If each system is only useful for onboarding with a handful of dapps, it is difficult for participants to justify the time and cost of jumping through the hoops to get verified. Government identities such as driver’s licenses are valuable precisely because they are accepted everywhere. Imagine an alternative model where every bar had to run its own age-verification process and issue its own permits—not recognized by any other establishment—in order to enforce laws around drinking age.
  • Privacy respecting. The protocols involved in proving identity must limit information disclosed to the minimum required to achieve the objective. Since onboarding requirements vary between dapps, there is a risk of disclosing too much information to prove compliance. For example, if a particular dapp is only open to US residents, that is the only piece of information that must be disclosed, and not, for example, the exact address where the owner resides. Similarly, proof of accredited-investor status does not require disclosing total holdings, and proving that a person is not a minor can be done without revealing the exact date of birth. (This requirement has implications for design. In particular, it rules out simplistic approaches around issuing publicly readable “identity papers” directly on-chain, for example as a “soul-bound” token attached to the address.)

Absent such an identity layer, deploying permissioned DeFi apps is challenging. Aave’s dedicated ARC pool is an instructive example. Restricted to KYCed entities vetted by the custodian Fireblocks, it attracted only a small fraction of the total value locked (TVL) in the main Aave lending pool. While there were many headwinds facing the product due to timing and the general implosion of cryptocurrency markets in 2022 (here is a good post-mortem thread), the difficulty of scaling a market where participants must be hand-picked is one of them. While ARC may have been one of the first and more prominent examples, competing projects are likely to face the same odds when bootstrapping their own identity system. In fact they do not even stand to benefit from the work done by the ARC team: while participants went through rigorous checks to gain access to that walled garden, there is no reusable, portable identity resulting from that process. Absent an open and universally recognized KYC standard, each project is required to engage in a wasteful effort to field its own identity system. In many ways, the situation is worse than the early days of web authentication. Before identity federation standards such as SAML and OAuth emerged to allow interoperability, every website resorted to building its own login solution. Not surprisingly, many of these were poorly designed and riddled with security vulnerabilities. Even in the best case, when each system functioned correctly in isolation, collectively they burdened customers with the challenge of managing dozens of independent usernames and passwords. Yet web authentication is a self-contained problem, much simpler than trying to link online identities to real-world ones.

What about participants’ incentives for jumping through the necessary hoops for onboarding? Putting aside ARC, there is a chicken-and-egg problem to bootstrapping any identity system: without interesting applications that are gated on having that ID, participants have no compelling reason to sign up for one; the value proposition is not there. Meanwhile if few people have onboarded with that ID system, no developer wants to build an application limited to customers with one of those rare IDs—that would be tantamount to choking off your own customer acquisition pipeline. Typically this vicious cycle is only broken in one of two ways:

  1. An existing application with a proprietary identity system, which is already compelling and self-sustaining, sees value in opening up that system such that verified identities can be used elsewhere. (Either because it can monetize those identity services or because of competitive pressure from rival applications offering the same flexibility to their customers for free.) If there are multiple such applications with comparable criteria for vetting customers, this can result in an efficient and competitive outcome. Users are free to take their verified identity anywhere and participate in any permissioned application, instead of being held hostage by the specific provider who happened to perform the initial verification. Meanwhile developers can focus on their core competency—building innovative applications—instead of reinventing the wheel to solve an ancillary problem around limiting access to the right audience.
  2. New regulations are introduced, forcing developers to enforce identity verification for their applications. This will often result in an inefficient scramble for each service provider to field something quickly to avoid the cost of noncompliance, leaving little room for industry-wide cooperation or standards to emerge. Alternatively it may result in a highly centralized outcome. One provider specializing in identity verification may be in the right place at the right time when rules go into effect, poised to become the de facto gatekeeper for all decentralized apps. 

In the case of DeFi, this second outcome is looking increasingly more likely.

CP

Blockchain thefts, retroactive bug-bounties and socially-responsible crime

Or, monetizing stolen cryptocurrency proves non-trivial.

It is not often one hears of bank robbers returning piles of cash after a score because they decided they could not find a way to spend the money. Yet this exact scenario has played out over and over again in the context of cryptocurrency in 2022. Multiple blockchain-based projects were breached, resulting in losses in millions of dollars. That part alone would not have been news, only business as usual. Where the stories take a turn for the bizarre is when the perpetrators strike a bargain with the project administrators to return most of the loot, typically in exchange for a token “bug bounty” to acknowledge the services of the thieves in uncovering a security vulnerability.

To name a handful:

  • August 2021, Poly Network. A generous attacker returns close to 600 million dollars in stolen funds back to the project.
  • Jan 2022, Multichain. Attacker returns 80% of the 1 million dollars stolen, deciding that he/she earned 20% for services rendered.
  • June 2022, Crema Finance. Attacker returns $8 million USD, keeping $1.6 million as “white-hat bounty.” (Narrator: That is not how legitimate white-hat rewards work.)
  • Oct 2022, Transit Swap. Perpetrator returns 16 million dollars (about two-thirds of the total haul).
  • December 2022, Defrost Finance on Avalanche. Again the attacker returned close to 100% of funds.

While bug bounty programs are very common in information security, they are often carefully structured, with rules governing the conduct of both the security researchers and the affected companies. There is a clear distinction between responsible disclosure of a vulnerability and an outright attack. Case in point: the disgraced former Uber CSO was convicted of lying to federal investigators over an incident in which the Uber security team retroactively tried to label an actual breach as a valid bug-bounty submission. It was a clear-cut case of an actual attack: the perpetrators had not merely identified a vulnerability but exploited it to the maximum extent to grab Uber customer data. They even tried to extort Uber for payment in exchange for keeping the incident under wraps—none of this is within the framework of what qualifies as responsible disclosure. To avoid negative PR, Uber took the perpetrators up on their offer, attempting to recharacterize a real breach after the fact as a legitimate report. That did not go over very well with the FTC or the Department of Justice, which prosecuted the former Uber executive and obtained a guilty verdict.

Given that this charade did not work out for Uber, it is strange to see multiple DeFi projects embrace the same deception. It reeks of desperation, of the unique flavor experienced by a company facing an existential crisis. Absent a miracle to reverse the theft (along the lines of the DAO hard-fork the Ethereum foundation orchestrated to bail out an early high-profile project) these projects would be out of business. The stakes are correspondingly much higher than they were for Uber circa 2017: given the number of ethics scandals and privacy debacles Uber experienced on a regular basis, the company could easily have weathered one more security incident. But for fledgling DeFi projects, the abrupt loss of all (or even a substantial part of) customer funds is the end of the road.

On the other hand, it is even more puzzling that the perpetrators—or “vulnerability researchers” if one goes along with the rhetoric—are playing along, giving up the lion’s share of their ill-gotten gains in exchange for… what exactly? While the terms of the negotiation between the perpetrators and project administrators are often kept confidential, there are a few plausible theories:

  • They are legitimate security researchers who discovered a serious vulnerability and decided to stage their own “rescue” operation. There are unique circumstances around vulnerability disclosure on blockchains. Bug collisions happen all the time and at any point someone else—someone less scrupulous than our protagonist—may discover the same vulnerability and choose to exploit it for private gain. This is quite different than, say, finding a critical Windows vulnerability. Blockchains are unique in that anyone in the world can exploit a smart-contract vulnerability; it is as if a single bug were exploitable on every Windows machine at once, regardless of where those targets are located or how well they are defended otherwise. The flip side of the coin is that anyone can play the hero and protect all users of the vulnerable contract. While one cannot “patch” Windows without help from MSFT and whoever owns the machine, here it is possible to protect 100% of customers. The catch is that one must race to exploit the vulnerability and seize all the funds at risk, in the name of safekeeping, before the black-hats can do the same for less noble purposes.
    While it is possible that in at least some of these instances the perpetrators were indeed socially responsible white-hat researchers motivated by nothing more than protecting customers, that seems an unlikely explanation for all of the cases. Among other clues, virtually every incident occurred without any advance notification. One would expect that a responsible researcher would at least make an effort to contact the project before executing a “rescue,” notifying them of their intentions and offering contact information. Instead project administrators were reduced to putting out public-service announcements on Twitter to reach the anonymous attackers, offering to negotiate for the return of missing funds.
  • Immunity from prosecution. If the thieves agree to return the majority of the funds taken, the administrators could agree not to press charges or otherwise pursue legal remedies. While this may sound compelling, it is unlikely the perpetrators could get much comfort from such an assurance. Law enforcement could still treat the incident as a criminal matter even if everyone officially associated with the project claims they have made peace with the perpetrators.
  • The perpetrators came to the sad realization that stealing digital assets is the easy part. Converting those assets into dollars or otherwise usable currency without linking that activity to their real-world identity is far more difficult.

That last possibility would be a remarkable turn-around; conventional wisdom holds that blockchains are the lawless Wild West of finance where criminal activity runs rampant and crooks have an easy time getting rich by taking money from hapless users. The frequency of security breaches suggests the first part of that statement may still be true: thefts are still rampant. But it turns out that when it comes to digital currency, stealing money and being able to spend it are two very different problems.

For all the progress made on enabling payments in cryptocurrency—mainly via the Lightning Network—most transactions still take place in fiat. Executing a heist on a blockchain may be no more difficult than it was in 2017, when coding secure smart-contracts was more art than science. One thing that has certainly changed in the past five years is regulatory scrutiny on the on/off-ramps between cryptocurrency and the fiat world. Criminals still have to convert their stolen bitcoin, ether or more esoteric ERC20 assets into “usable” form. Typically that means money in a bank account; stablecoins such as Tether’s USDT or Circle’s USDC will not do the trick. By and large merchants demand US dollars, not dollar-equivalent digital assets requiring trust in the solvency of private issuers.

That necessity creates a convenient chokepoint for enforcement: cryptocurrency exchanges, which are the on-ramps and off-ramps between fiat money and digital assets. Decentralization makes it impossible to stop someone from exploiting a smart-contract—or what one recently arrested trader called a “highly profitable trading strategy”—by broadcasting a transaction into a distributed network. But there is nothing trustless or distributed about converting the proceeds of that exploit into dollars spendable in the real world. That must go through a centralized exchange. To have any hope of sending/receiving US dollars, that exchange must have some rudimentary compliance program and at least make a token effort at following regulatory obligations, including Know Your Customer (KYC) and anti-money laundering (AML) rules. (Otherwise, the exchange risks the same fate as Bitfinex, which was unceremoniously dropped by its correspondent bank Wells Fargo in 2017, much to the chagrin of Bitfinex executives.) Companies with aspirations of staying in business do not look kindly on having their platform used to launder proceeds from criminal activity. They frequently cooperate with law enforcement in seizing assets as well as providing information leading to the arrest of perpetrators. Binance is a great demonstration of this in action. Once singled out by Reuters as the platform preferred by criminals laundering cryptocurrency, the exchange has responded by ramping up its compliance efforts and participating in several high-profile asset seizures. Lest the irony be lost: a cryptocurrency business proudly declares its commitment to surveilling its own customer base to look for evidence of anyone receiving funds originating with criminal activity. (The company even publishes hagiographic profiles of its compliance team retrieving assets from crooks foolish enough to choose Binance as their off-ramp to fiat land.)

This is not to say that monetizing theft on blockchains has become impossible. Determined actors with resources—such as the rogue state of North Korea—no doubt still retain access to avenues for exiting into fiat. (Even in that case, increased focus on enforcement can help by increasing the “haircut,” or percentage of value lost, when criminals convert digital assets into fiat through increasingly inefficient schemes.) But those complex arrangements are not accessible to a casual vulnerability researcher who stumbles into a serious flaw in a smart-contract or compromises the private keys controlling a large wallet. Put another way: there are far more exploitable vulnerabilities than ways of converting the proceeds of an exploit into usable money. Immature development practices and a gold-rush mentality around rushing poorly designed DeFi applications to market have created a target-rich environment. This is unlikely to change any time soon. On the flip side, increased focus on regulation and the availability of better tools for law enforcement—including dedicated services such as Chainalysis and TRM Labs for tracing funds on chain—make it far more difficult to monetize those attacks in any realistic way. It was a running joke in the information security community that blockchains come with a built-in bug bounty: find a serious security vulnerability and monetary rewards follow automatically, even if the owner of the system never bothered to create an official bounty program. But digital assets that are blacklisted by every reputable business and can never be exchanged for anything else of value are about as valuable as Monopoly money. Given that dilemma, it is no surprise that creative vulnerability researchers would embrace the post hoc “white-hat disclosure” charade, choosing a modest but legitimate payout over holding on to a much larger sum of tainted funny-money they have little hope of being able to spend.

CP

The myth of tainted blockchain addresses [part II]

[continued from part I]

Ethereum and account-based blockchains

The Ethereum network does not have a concept of discrete “spend candidates” or UTXOs. Instead, funds are assigned to unique blockchain addresses. While this is a more natural model for how consumers expect digital assets to behave (and bitcoin wallet software goes out of its way to create the same appearance while juggling UTXOs under the covers) it also complicates the problem of separating clean vs dirty funds.

Consider this example:

  • Alice has a balance of 5 ETH on her Ethereum address.
  • She receives 1 ETH from a sanctioned address. (For simplicity assume 100% of these funds are tainted, for example because they are stolen funds.)
  • She receives another 5 ETH from a clean address.
  • Alice sends 1 ETH to Bob.

If Alice and Bob are concerned about complying with AML rules, they may be asking themselves: are they in possession of tainted ETH that needs to be frozen or otherwise segregated for potential seizure by law enforcement? (Note that in this example their interests are somewhat opposed: Alice would much prefer that the 1 ETH she transferred to Bob “flushed” all the criminal proceeds out of her wallet, while Bob wants to operate under the assumption that he received only clean money and all tainted funds still reside with Alice.)

Commodities parallel

If one were to draw a crude—no pun intended—comparison to commodities, tainted bitcoin behaves like blood diamonds while tainted ether behaves like contraband oil imported from a sanctioned petro-dictatorship. While a UTXO can be partially tainted, it does not “mix” with other UTXOs associated with the same address. Imagine a vault containing precious stones. Some of these turn out to be conflict diamonds, others have a verifiable pedigree. While the vault may contain items of both types, there is no question whether any given sale includes conflict diamonds. In fact, once the owner becomes aware of the situation, they can make a point of putting those stones aside and never selling them to any customer. This is the UTXO model in bitcoin: any given transaction either references a given UTXO (and consumes 100% of the available funds there) or does not reference that UTXO at all. If the wallet owner is careful to never use tainted inputs in constructing their transaction, they can be confident that the outputs are also clean.

Ethereum balances do not behave this way because they are all aggregated together in one address. Stretching the commodity analogy, instead of a vault with boxes of precious gems, imagine an oil storage facility. There is a tank with a thousand barrels of domestic oil, with a side-entry mixer running inside to stir up the contents and keep sludge from settling at the bottom. Some joker dumps a thousand barrels of contraband petrostate oil of identical density and physical characteristics into this tank. Given that the contents are being continuously stirred, it would be difficult to separate the product into its constituent parts. If someone tapped one barrel from that tank and sold it, should that barrel be considered sanctioned, clean or something in between, such as “half sanctioned”?

There are logical arguments that could justify each of these decisions:

  1. One could take the extreme view that even the slightest amount of contraband oil mixed into the tank results in spoilage of the entire contents. This is the obsessive-compulsive school of blockchain hygiene, which holds that even de minimis amounts originating from a sanctioned address irreversibly poison an entire wallet. In this case all 2000 barrels coming out of that tank will be tainted. In fact, if any more oil were added to that tank, it too would get tainted. At this point, one might as well shutter the facility altogether.
  2. A more lenient interpretation holds that there are indeed one thousand sanctioned barrels, but they are in the second thousand barrels coming out of the spout. Since the original thousand barrels were clean, we can tap up to that amount without a problem. This is known as FIFO or first-in-first-out ordering in computer science.
  3. Conversely, one could argue that the first thousand are contraband because those were the most recent additions to the tank, while the next thousand will be clean. That would be LIFO or last-in-first-out ordering.
  4. Finally, one could argue the state of being tainted exists on a continuum. Instead of a simple yes/no, each barrel is assigned a percentage. Given that the tank holds equal parts “righteous” and “nefarious” crude oil, every barrel coming out of it will be 50% tainted according to this logic.

Pre-Victorian legal precedents

While there may not be any physical principles for choosing between these hypotheses, it turns out this problem does come up in legal contexts and there is precedent for adopting a convention. In the paper Bitcoin Redux a group of researchers from the University of Cambridge expound on how an 1816 UK High Court ruling singles out a particular way of tracking stolen funds:

It was established in 1816, when a court had to tackle the problem of mixing after a bank went bust and its obligations relating to one customer account depended on what sums had been deposited and withdrawn in what order before the insolvency. Clayton’s case (as it’s known) sets a simple rule of first-in-first-out (FIFO): withdrawals from an account are deemed to be drawn against the deposits first made to it.

In fact, their work tackles a more complicated scenario where multiple types of taint are tracked, including stolen assets, funds from Iran (OFAC sanctioned) and funds coming out of a mixer. The authors compare the FIFO heuristic against the more radical “poison” approach which corresponds to #1 in our list above, as well as the “haircut” which corresponds to #4, highlighting its advantages:

The poison diagram shows how all outputs are fully tainted by all inputs. In the haircut diagram, the percentages of taint on each output are shown by the extent of the coloured bars. The taint diffuses so widely that the effect of aggressive asset recovery via regulated exchanges might be more akin to a tax on all users.
[…]
With the FIFO algorithm, the taint does not go across in percentages, but to individual components (indeed, individual Satoshis) of each output. Thus the first output has an untainted component, then the stolen component – both from the first input – and then part of the Iranian component from the second input. As the taint does not spread or diffuse, the transaction processes it in a lossless way.

Ethereum revisited

While the Bitcoin Redux paper only considered the Bitcoin network, the FIFO heuristic translates naturally into the Ethereum context, as it corresponds to option #2 in the crude-oil tank example. Going back to the Alice & Bob hypothetical, it vindicates Bob—in fact it means Alice can send another 4 ETH from that address before getting to the tainted portion.

Incidentally the FIFO model has another important operational advantage: it allows the wallet owner to quarantine tainted funds in a fully deterministic, controlled manner. Suppose Alice’s compliance officer advises her to quarantine all tainted funds at a specific address for later disbursement to law enforcement. Recall that the tainted sum of 1 ETH is “sandwiched” chronologically between two chunks of clean ETH in arrival order. But Alice can create a series of transactions to isolate it:

  • First, she must spend the 5 ETH that were present at the address prior to the arrival of the tainted funds. Alice could wait until this happens naturally, as with her outbound transfer to Bob. Any remaining amount can be immediately consumed in a loopback transaction sending funds back to the original address, or she could temporarily shift those funds to another wallet under her control.
  • Now she creates another 1 ETH transaction to move the tainted portion to the quarantine address.

The important point here is that no one else can interfere with this sequence. If instead the LIFO heuristic had been adopted, Alice could receive a deposit between steps #1 and #2, resulting in her outbound transaction in the second step using up a different 1 ETH segment that does not correspond exactly to the portion she wanted to get rid of. This need not even be a malicious donation. For example, charities accepting donations on chain receive deposits from contributors without any prior arrangement. Knowing the donation address is sufficient; there is no need to notify the charity in advance of an upcoming payment. Similarly, cryptocurrency exchanges hand out deposit addresses to customers with the understanding that the customer is free to send funds to that address any time and they will be credited to her account. In these situations, the unexpected deposit would throw off the carefully orchestrated plan to isolate tainted funds but only if LIFO is used—because in that model the “last-in” addition going “first-out” is the surprise deposit.
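Here is a minimal sketch of FIFO taint accounting for an account-based ledger, mirroring the Alice example; the class and field names are made up, and a real compliance tool would also have to handle gas fees and unexpected deposits:

```python
# FIFO taint accounting sketch: withdrawals are drawn against the oldest deposits
# first, per the Clayton's case convention. Amounts are in ETH.
from collections import deque

class FifoTaintLedger:
    def __init__(self) -> None:
        self.chunks: deque = deque()      # [amount, tainted] in arrival order

    def deposit(self, amount: float, tainted: bool) -> None:
        self.chunks.append([amount, tainted])

    def withdraw(self, amount: float) -> float:
        """Consume `amount` from the oldest chunks; return how much of it was tainted."""
        tainted_out = 0.0
        while amount > 1e-12:
            if not self.chunks:
                raise ValueError("insufficient balance")
            chunk = self.chunks[0]
            take = min(amount, chunk[0])
            if chunk[1]:
                tainted_out += take
            chunk[0] -= take
            amount -= take
            if chunk[0] <= 1e-12:
                self.chunks.popleft()
        return tainted_out

alice = FifoTaintLedger()
alice.deposit(5, tainted=False)   # original balance
alice.deposit(1, tainted=True)    # unsolicited transfer from a sanctioned address
alice.deposit(5, tainted=False)   # later clean deposit

print(alice.withdraw(1))          # 0.0 -> the 1 ETH sent to Bob is clean under FIFO
print(alice.withdraw(4))          # 0.0 -> four more clean ETH precede the taint
print(alice.withdraw(1))          # 1.0 -> this transfer carries the tainted portion to quarantine
```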

In conclusion: blockchain addresses are not hopelessly tainted because of one unsolicited transaction sent by someone looking to make a point. Only specific chunks of assets associated with that address carry taint. Using Tornado Cash to permanently poison vast sums of ether holdings remains nothing more than wishful thinking because the affected portion can be reliably separated by those seeking to comply with AML rules, at the cost of some additional complexity in wallet operations.

CP

The myth of tainted blockchain addresses [part I]

[Full disclosure: This blogger is not an attorney and what follows is not legal advice.]

Unsolicited gifts on chain

In the aftermath of the OFAC sanctions against Tornado Cash, it has become an article of faith in the cryptocurrency community that banning blockchain addresses sets a dangerous precedent. Some have argued that blacklisting Tornado addresses and everyone who interacts with them will have dangerous downstream effects due to the interconnectedness of networks. Funds flow from one address to another, the argument goes, often merging with unrelated pools of capital before being split off again. Once we decide one pool of funds is tainted by virtue of being associated with a bad actor or event—a scam, rug-pull or garden-variety theft—that association propagates unchecked and continues to taint funds belonging to innocent bystanders who were not in any way involved with the original crime. As if to illustrate the point, shortly after the ban prohibiting US residents from interacting with the Tornado mixer, some joker decided to use that very mixer to send unsolicited funds to prominent blockchain addresses. These were either addresses with unusually high balances (“whales” in industry parlance) or addresses previously tagged as belonging to celebrities or well-known cryptocurrency businesses such as exchanges. Here is an unsolicited “donation” sent to the Kraken cold-wallet through the Tornado mixer.

That raises the question: are these unwitting recipients also in violation of OFAC sanctions? Are all funds in those wallets now permanently tainted because of an inbound transaction, a transaction they neither asked for nor had any realistic means to prevent given the way blockchains operate? With a few exceptions, anyone can send funds from their own address to any other address on most blockchains; the recipient cannot prevent this. Granted, there are a few special cases where the recipient can limit unwanted transfers. For example, Algorand requires the recipient to opt in to a specific ASA before they can receive assets of that type. But that does not in any way prevent free transfer of the native currency ALGO. Ethereum smart-contracts make it possible to take action on incoming transfers and reject them based on sender identity. Of course, this assumes the recipients have a way to identify “bad” addresses. Often such labels are introduced only after the offending address has been active and transacting. Even if there were a 100% reliable way to flag and reject tainted transfers, requiring it would place an undue burden on every blockchain participant to implement expensive measures (including the use of smart-contracts and integration with live data feeds of currently sanctioned addresses, according to all possible regulators around the world) to defend against a hypothetical scenario few will encounter.

Given that inbound transfers from blacklisted addresses cannot be prevented in any realistic setup, does that mean blacklisting Tornado Cash also incidentally blacklists all of these downstream recipients by association? While compelling on its face, this logic ignores the complexity of how distributed ledgers track balances and adopts one possible convention among many plausible ones for deciding how to track illicit funds in motion. This blog post will argue that there are equally valid conventions that make it easier to isolate funds associated with illicit activity and prevent this type of uncontrolled propagation of “taint” through the network. To make this case, we will start with the simple scenario where tainted funds are clearly isolated from legitimate financial activity and then work our way up to more complex situations where commingling of funds requires choosing a specific convention for separation.

Easy case: simple Bitcoin transactions

UTXO model

The Bitcoin network makes it easy to separate different pools of money within an address, because the blockchain organizes funds into distinct lumps of assets called “unspent transaction outputs” or UTXOs. The concept of an “address balance” does not exist natively in the Bitcoin ledger; it is a synthetic metric created by aggregating all UTXOs that share the same destination address.

Conceptually, one can speculate that this was a consequence of the relentless focus on privacy Satoshi advocated. The Bitcoin whitepaper warns about the dangers of address reuse, urging participants to use each address only once. In this extreme model, it does not make sense to track balances over time, since each address only appears twice on the blockchain: first when funds are deposited at that address, temporarily creating a non-zero balance, and a second and last time when funds are withdrawn, after which point the balance will always be zero. This is not how most bitcoin wallets operate in reality. Address reuse is common and often necessary for improving operational controls around funds movement. Address whitelisting is a very common security feature used to restrict transfers to known, previously defined trusted destinations. That model can only scale if each participant has a handful of fixed blockchain addresses, such that all counterparties interacting with that person can record those entries in their whitelist of “safe” destinations.

For these reasons it is convenient to speak of an address “balance” as a single figure and draw charts depicting how that number varies over time. But it is important to remember that single number is a synthetic creation representing an aggregate over discrete, individual UTXOs. In fact the same balance may behave differently depending on the organization of its spend candidates. Consider these two addresses:

  • The first is composed of a thousand UTXOs, each worth 0.00001 BTC.
  • The second is a single 0.01 BTC UTXO.

On paper, both addresses have a balance of “0.01 bitcoin.” In reality, the second address is far more useful for commercial activity. Recall that each bitcoin transaction pays a mining fee proportional to the size of the transaction in bytes—not proportional to the value transacted, as most payment networks operate. Inputs typically account for the bulk of transaction size due to the presence of cryptographic signatures, even after accounting for the artificial discount introduced by segregated witness. That means scrounging together dozens of inputs is less efficient than supplying the same amount with a single UTXO. In the extreme case of “dust outputs,” the mining fees required to include a UTXO as an input may exceed the amount of funds that input contributes. Including the UTXO would effectively be a net negative. Such a UTXO is economically unusable unless network fees decline.
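A quick back-of-the-envelope check makes the dust problem concrete. The input size and fee rate below are rough assumptions for a typical segwit input; actual figures vary by script type and network conditions:

```python
# Rough "dust" check: a UTXO is economically unusable if spending it costs more
# in mining fees than the value it contributes.

INPUT_VBYTES = 68      # approximate virtual size of one P2WPKH input (assumption)
FEE_RATE = 30          # assumed prevailing fee rate in satoshis per vbyte

def is_dust(utxo_value_sats: int, fee_rate: int = FEE_RATE) -> bool:
    cost_to_spend = INPUT_VBYTES * fee_rate
    return utxo_value_sats <= cost_to_spend

# The thousand 0.00001 BTC (= 1,000 satoshi) UTXOs from the first address:
print(is_dust(1_000))        # True: ~2,040 sats in fees to spend 1,000 sats
# The single 0.01 BTC (= 1,000,000 satoshi) UTXO from the second address:
print(is_dust(1_000_000))    # False
```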

Isolating tainted funds

This organization of funds into such distinct lumps makes it easy to isolate unsolicited contributions. Every UTXO stands on its own. If funds are sent from a sanctioned actor, the result is a distinct UTXO that stands apart on the bitcoin ledger from all the other UTXOs sharing the recipient address. That UTXO should be considered tainted in its entirety, subject to asset freeze/seizure or whatever remedy the powers-that-be deem appropriate for the situation. Everything else is completely free of taint.

One way to implement this is for the wallet software to exclude such UTXOs when calculating available balances or picking candidate UTXOs to prepare a new outbound transaction. The wallet owner effectively acts as if that UTXO did not exist. This model extends naturally to downstream transfers. If the tainted UTXO is used as an input to a subsequent bitcoin transaction (perhaps because the wallet owner did not know it was tainted) it will end up creating another tainted UTXO while leaving everything else untouched.
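A sketch of that wallet-side policy, with illustrative field names rather than any particular wallet’s data model:

```python
# Tainted UTXOs are excluded both from the reported balance and from coin selection.
from dataclasses import dataclass

@dataclass
class Utxo:
    txid: str
    vout: int
    value_sats: int
    tainted: bool = False

def spendable_balance(utxos: list[Utxo]) -> int:
    """Balance shown to the user, ignoring quarantined outputs."""
    return sum(u.value_sats for u in utxos if not u.tainted)

def select_inputs(utxos: list[Utxo], target_sats: int) -> list[Utxo]:
    """Naive largest-first coin selection that never touches tainted outputs."""
    selected, total = [], 0
    for u in sorted(utxos, key=lambda u: u.value_sats, reverse=True):
        if u.tainted:
            continue
        selected.append(u)
        total += u.value_sats
        if total >= target_sats:
            return selected
    raise ValueError("insufficient clean funds")
```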

Mixed bag: transactions with mismatched inputs/outputs

The previous section glossed over an important aspect of bitcoin transactions: they can have multiple inputs and outputs. The more general case is illustrated by transaction C here:

Transaction graph example from the Bitcoin wiki

Consider an example along the lines of transaction C, but using a different arrangement of inputs and outputs:
Inputs:

  • First input of clean 5 BTC
  • Second one worth 1 BTC originating from a sanctioned address

Outputs:

  • First output is designated to receive 3 BTC
  • Second output receives 2.999 BTC (leaving 0.001 BTC in mining fees)

Due to the different amounts, there is no way to map inputs/outputs in one-to-one relationship. There is a total of 1 tainted bitcoin on the input side that must be somehow passed through to outputs. (There is of course the question of where the mining fees came from. It would be convenient to argue that those were paid for using tainted funds to reduce the total. But here we will make the worst-case assumption: tainted funds must be propagated 100% to outputs. Mining fees are assumed to come out of the “clean” portion of inputs.)

A problem of convention

Clearly there are different ways taint can be allocated among the outputs within those constraints. Here are some examples:

  1. Allocate evenly, with 50% assigned to each output. This will result in partial taint of both outputs.
  2. Allocate 100% to the first output. That output is now partially tainted while the remaining output is clean. (More generally, we may need to allocate to multiple outputs until the entire tainted input is fully accounted for. If the first input of 5 BTC had been the one from a sanctioned address, it would have required both outputs to fully cover the amount.)
  3. Same as #2 but select the tainted outputs in a different order. The adjectives “first” and “second” are in reference to the transaction layout on the blockchain, where inputs and outputs are strictly ordered. But for the purposes of tracking tainted funds, we do not have to follow that order. Here are some other reasonable orderings (see the sketch below):
      • FIFO or first-in-first-out. Match inputs and outputs in order. In this case, since the first output can be paid entirely out of the first clean input, it is considered clean. But the second output requires the additional tainted 1 BTC, so it is partially tainted.
      • Highest balance first. To reduce the number of tainted outputs, use the outputs in decreasing order of value until the tainted input is fully consumed.
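
To make the FIFO convention concrete, here is a minimal Python sketch that allocates the tainted amount across outputs in order, using the worst-case assumption above that mining fees come out of the clean portion. Amounts are in satoshis to avoid floating-point noise, and the function name is illustrative.

def allocate_taint_fifo(inputs, outputs):
    """inputs: list of (amount, is_tainted) pairs; outputs: list of amounts.
    Worst-case convention: the full tainted amount must be pushed into outputs,
    so mining fees are assumed to come out of the clean portion of the inputs."""
    tainted_total = sum(amount for amount, tainted in inputs if tainted)
    clean_total = sum(amount for amount, tainted in inputs if not tainted)
    fee = clean_total + tainted_total - sum(outputs)
    clean_available = clean_total - fee  # clean funds left after paying the fee

    taint_per_output = []
    for amount in outputs:
        from_clean = min(amount, clean_available)
        clean_available -= from_clean
        taint_per_output.append(amount - from_clean)  # remainder covered by tainted funds
    return taint_per_output

# The example transaction: 5 BTC clean + 1 BTC tainted in, 3 BTC and 2.999 BTC out
inputs = [(500_000_000, False), (100_000_000, True)]
outputs = [300_000_000, 299_900_000]
print(allocate_taint_fifo(inputs, outputs))
# -> [0, 100000000]: the first output stays clean, the second carries the full tainted 1 BTC
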

Regardless of which convention is adopted, one conclusion stands: No matter how one slices and dices the outputs, there are scenarios where some UTXO will be partially tainted, even if the starting state follows the all-or-none characterization. Previous remedies to quarantine or otherwise exclude that UTXO in its entirety from usable assets are no longer appropriate.

Instead of trying to solve for this special case in bitcoin, we look at how the comparable situation arises in Ethereum, which differs from Bitcoin in a crucial way: it does not have the concept of UTXO. Here the concept of “balance” is native to the blockchain and associated with every address. That means the neat separation of funds into discrete “clean” and “tainted” chunks cannot possibly work in Ethereum, forcing us to confront this problem of commingled funds in a broader context.

[continued]

CP

Remote attestation: from security feature to anticompetitive lock-in

Lessons from the first instant-messaging war

In the late 1990s and early 2000s instant messaging was all the rage. A tiny Israeli startup Mirabilis set the stage with ICQ, but IM quickly became a battleground of tech giants, running counter to the usual dot-com era mythology of small startups disrupting incumbents on the way to heady IPO valuations. AOL Instant Messenger had taken a commanding lead out of the gate while MSFT was living up to its reputation as “fast-follower” (or put less charitably, tail-light chaser) with MSN Messenger. Google had yet to throw its hat into the arena with GChat. These IM networks were completely isolated: an AOL user could only communicate with other AOL users. This resulted in most users having to maintain multiple accounts to participate in different networks, each with their own barrage of notifications and task-bar icons.

While these companies were locked in what they viewed as a zero-sum game for marketshare, the benefits of interoperability to consumers were clear. In fact one software vendor called Trillian even made a multi-network client that effectively aggregated the protocols for all the different IM services. Standards for interoperability such as SIP and XMPP were still a ways off from becoming relevant; everyone invented their own client/server protocol for instant messaging, and expected to provide both sides of the implementation from scratch. But there was a more basic reason why some IM services were resistant to adopting an open standard: it was not necessarily good for the bottom line. Interop is asymmetric: it helps the smaller challenger compete against the incumbent behemoth. If you are MSN Messenger trying to win customers away from AOL, it is a selling point if you can build an IM client that can exchange messages with both MSN and AOL customers. Presto: AOL users can switch to your application and still keep in touch with their existing contacts while becoming part of the MSN ecosystem. Granted the same dynamics operate in the other direction: in principle AOL could have built an IM client that connected its customers with MSN users. But this is where existing market shares matter: AOL had more to lose by allowing such interoperability and opening itself up to competition with MSN, compared to keeping its users locked up in the walled garden.

Not surprisingly, then, they went out of their way to keep each IM service an island unto itself. Interestingly for tech giants, this skirmish was fought in code instead of the more common practice of lawyers exchanging nastygrams. AOL tried to prevent any client other than the official AIM client from connecting to its service. You would think this is an easy problem: after all, they control the software on both sides. They could ship a new IM client that includes a subtle, specific quirk when communicating with the IM server. AOL servers would in turn look for that quirk and reject any “rogue” clients missing it.

White lies for compatibility

This idea runs into several problems. A practical engineering constraint in the early 2000s was the lack of automatic software updates. AOL could ship a new client, but in those Dark Ages of software delivery, “ship” meant uploading the new version to a website— itself a much heralded improvement over the “shrink-wrap” model of actually burning software on CDs and selling them in a retail store. There was no easy way to force-upgrade the entire customer base. If the server insisted on enforcing the new client fingerprint, it would have to turn away a large percentage of customers running legacy versions or make them jump through hoops to download the latest version— and who knows, maybe some of those customers would decide to switch to MSN in frustration. That problem is tractable and was ultimately solved with better software engineering: Windows Update and later Google Chrome made automatic software updates into a feature customers take for granted today. But there is a more fundamental problem with attempting to fingerprint clients: competitors can reverse-engineer the fingerprint and incorporate it in their own software.

This may sound vaguely nefarious but software impersonating other pieces of software is in fact quite common for compatibility. In fact web browsers practically invented that game. Take a look at the user-agent string early versions of Internet Explorer sent every website:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; SLCC2; .NET CLR 2.0.50727; Media Center PC 6.0; .NET CLR 3.5.30729; .NET CLR 3.0.30729)

There is a lot of verbiage here but one word jumps out: Mozilla. That is the codename used by Netscape Navigator in its own user-agent strings. The true identity of this browser is buried in a parenthetical comment—“MSIE 6.0” for Internet Explorer 6— but why is IE trying to pretend to be Netscape? Because of compatibility. Web pages were designed assuming a set of browser features that could not be taken for granted— such as support for images and JavaScript. Given the proliferation of different web browsers and versions at the time— see the point about lack of automatic updates— websites used a heuristic shortcut to determine if visitors were using an appropriate browser. Instead of trying to check for the availability of each feature, they began to check for a specific version of a known browser. “Mozilla/4.0” was a way to signal that the current web browser could be treated as if it were Netscape 4. Instead of turning away users with an unfriendly message along the lines of “Please install Netscape 4 to use this site” the service could assume all the requisite features were present and proceed as usual.

These white lies are ubiquitous on the web because compatibility is in the interests of everyone. Site publishers just want things to work. Amazon wants to sell books. With the exception of a few websites closely affiliated with a browser vendor, they do not care whether customers use Netscape, IE or Lynx to place their orders. There is no reason for websites to be skeptical about user-agent claims or run additional fingerprinting code to determine if a given web browser was really Netscape 4 or simply pretending to be. (Even if they wanted to, such fingerprinting would have been difficult; for example IE often aimed for bug-for-bug compatibility with Netscape, even when that meant diverging from official W3C standards.)

Software discrimination and the bottom line

For reasons noted above, the competitive dynamics of IM were fundamentally different from those of web browsers. Most of the business models built around IM assumed full control over the client stack. For example, MSN Messenger floated ideas of making money by displaying ads in the client. This model runs into problems when customers run a different but interoperable client: Messenger could connect to the AOL network— effectively using resources and generating costs for AOL— while displaying ads chosen by MSN, earning revenue for MSFT.

Not surprisingly this resulted in an escalating arms race. AOL included more and more subtle features in AIM that the server could use for fingerprinting. MSFT attempted to reverse engineer that functionality out of the latest AIM client and incorporate identical behavior into MSN Messenger. It helps that the PC platform was, and to a large extent still is, very much open to tinkering. Owners can inspect binaries running on their machine, inspect network communications originating from a process, or attach a debugger to a running application to understand exactly what that app is doing under specific circumstances. (Intel SGX is an example of a recent hardware development on x86 that breaks that assumption. It allows code to run inside protected “enclaves” shielded from any debugging/inspection capabilities by an outside observer.)

In no small measure of irony, the Messenger team voluntarily threw in the towel on interoperability when AOL escalated the arms race to a point MSFT was unwilling to go: AOL deliberately included a remote code execution vulnerability in AIM, intended for its own servers to exploit. Whenever a client connected, the server would exploit the vulnerability to execute arbitrary code to look around the process and check on the identity of the application. Today such a bug would earn a critical severity rating and associated CVE if it were discovered in an IM client. (Consider that in the 1990s most Internet traffic was not encrypted, so it would have been much easier to exploit that bug; the AIM client had very little assurance that it was communicating with the legitimate AOL servers.) If it were alleged that a software publisher deliberately inserted such a bug into an application used by millions of people, it would be all over the news and possibly result in the responsible executives being dragged in front of Congress for a ritual public flogging. In the 1990s it was business as usual.

Trusted computing and the dream of remote attestation

While the MSN Messenger team may have voluntarily hoisted the white flag in that particular battle with AOL, a far more powerful department within the company was working to make AOL’s wishes come true: a reliable solution for verifying the authenticity of software running on a remote peer, preferably without playing a game of chicken with deliberately introduced security vulnerabilities. This was the Trusted Computing initiative, later associated with the anodyne but awkward acronym NGSCB (“Next Generation Secure Computing Base”) though better remembered by its codename “Palladium.”

The lynchpin of this initiative was a new hardware component called the “Trusted Platform Module,” meant to be included as an additional component on the motherboard. The TPM was an early example of a system-on-a-chip or SoC: it had its own memory, processor and persistent storage, all independent of the PC. That independence meant the TPM could function as a separate root of trust. Even if malware compromises the primary operating system and gets to run arbitrary code in kernel mode— the highest privilege level possible—it still cannot tamper with the TPM or alter security logic embedded in that chip.

Measured boot

While the TPM specification defined a kitchen sink of functionality ranging from key management (generate and store keys on the TPM in non-extractable fashion) to serving as a generic cryptographic co-processor, one feature stood out for use in securing the integrity of the operating system during the boot process: the notion of measured boot. At a high level, the TPM maintains a set of values in RAM dubbed “platform configuration registers” or PCRs. When the TPM is started, these all start out at zero. What distinguishes PCRs is the way they are updated. It is not possible to write an arbitrary value into a PCR. Instead the existing value is combined with the new input and run through a cryptographic hash function such as SHA1; this is called “extending” the PCR in TCG terminology. Similarly it is not possible to reset the values back to zero, short of restarting the TPM chip, which only happens when the machine itself is power-cycled. In this way the final PCR value becomes a concise record of all the inputs that were processed through that PCR. Any slight change to any of the inputs, or even changing the order of inputs, results in a completely different value with no discernible relationship to the original.
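
As a sketch of the extend operation, the following Python snippet mimics what the TPM does internally. It uses SHA-1, as in TPM 1.2, and made-up measurements purely for illustration.

import hashlib

def extend(pcr: bytes, measurement: bytes) -> bytes:
    """PCR extend: new value = Hash(old value || new measurement).
    TPM 1.2 used SHA-1; TPM 2.0 banks also support SHA-256 and others."""
    return hashlib.sha1(pcr + measurement).digest()

# PCRs start out as all zeroes when the TPM is reset
pcr0 = bytes(20)
pcr0 = extend(pcr0, hashlib.sha1(b"BIOS code").digest())
pcr0 = extend(pcr0, hashlib.sha1(b"option ROM").digest())

# Changing any measurement, or even just the order, yields an unrelated final value
print(pcr0.hex())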

This enables what TCG called the “measurement of trust” during the boot process, by updating the PCR with measurements of all code executed. For example, the initial BIOS code that takes control when a machine is first powered on updates PCR #0 with a hash of its own binary. Before passing control to the boot sector on disk, it records the hash of that sector in a different PCR. Similarly the early-stage boot loader first computes a cryptographic hash of the OS boot-loader and updates a PCR with that value, before executing the next stage. In this way, a chain of trust is created for the entire boot process with every link in the chain except the very first one recorded in some PCR before that link is allowed to execute. (Note the measurement must be performed by the predecessor. Otherwise a malicious boot-loader could update the PCR with a bogus hash instead of its own. Components are not allowed to self-certify their code; it must be an earlier piece of code that performs the PCR update before passing control.)

TCG specifications define the conventions for which components are measured into which PCR. These are different between legacy BIOS and the newer UEFI specifications. Suffice it to say that by the time a modern operating system boots, close to a dozen PCRs will have been extended with a record of the different components booted.

So what can be done with this cryptographic record of the boot process? While these values look random, they are entirely deterministic.  Assuming the exact same system is powered on for two different occasions, identical PCR values will result. For that matter, if two different machines have the exact same installation— same firmware, same version of the operating system, same applications installed— it is expected that their PCRs will be identical. These examples hint at two immediate security applications:

  • Comparison over time: verify that a system is still in the same known-good state it was at a given point in the past. For example we can record the state of PCRs after a server is initially provisioned and before it is deployed into production. By comparing those measurements against the current state, it is possible to detect if critical software has been tampered with.
  • Comparison against a reference image: Instead of looking at the same machine over time, we can also compare different machines in a data-center. If we have PCR measurements for a known-good “reference image,” any server in healthy state is expected to have the same measurements in the running configuration.

Interestingly, neither scenario requires knowing what the PCR values should be ahead of time or even the exact details of how PCRs are extended. We are only interested in deltas between two sets of measurements. Since PCRs are deterministic, for a given set of binaries involved in a boot process we can predict ahead of time exactly what PCR values should result. There is a different use case where those exact values matter: ascertaining whether a remote system is running a particular configuration.

Getting better at discrimination

Consider the problem of distinguishing a machine running Windows from one running Linux. These operating systems use a different boot-loader and the hash of that boot-loader gets captured into a specific PCR during measured boot. The value of that PCR will now act as a signal of what operating system is booted. Recall that each step in the boot-chain is responsible for verifying the next link; a Windows boot-loader will not pass control to a Linux kernel image.

This means PCR values can be used to prove to a remote system that you are running Windows or even running it in a particular configuration. There is one more feature required for this: a way to authenticate those PCRs. If clients were allowed to self-certify their own PCR measurements, a Linux machine could masquerade as a Windows box by reporting the “correct” PCR values expected after a Windows boot. The missing piece is called “quoting” in TPM terminology. Each TPM can digitally sign its PCR measurements with a private-key permanently bound to that TPM. This is called the attestation key and it is only used for signing such proofs unique to the TPM. (The other use case is certifying that some key-pair was generated on the TPM, by signing a structure containing the public key.) This prevents the owner from forging bogus quotes by asking the TPM to sign random messages.
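
Conceptually, verifying a quote boils down to two checks: the signature must verify under the AK public key, and the reported PCR values must match the expected configuration. The Python sketch below reduces a quote to a signature over the nonce plus a digest of the selected PCRs; a real TPM2 quote signs a richer TPMS_ATTEST structure with additional fields, so treat this only as an outline of the idea.

import hashlib
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding

def verify_quote(ak_public_key, nonce: bytes, reported_pcrs: list[bytes],
                 expected_pcrs: list[bytes], signature: bytes) -> bool:
    if reported_pcrs != expected_pcrs:
        return False  # PCRs do not match the known-good configuration
    pcr_digest = hashlib.sha256(b"".join(reported_pcrs)).digest()
    try:
        # verify the signature under the attestation key (assumed RSA here)
        ak_public_key.verify(signature, nonce + pcr_digest,
                             padding.PKCS1v15(), hashes.SHA256())
        return True  # quote genuinely produced by the TPM holding this AK
    except InvalidSignature:
        return False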

This shifts the problem into a different plane: verifying the provenance of the “alleged” attestation key, namely that it really belongs to a TPM. After all, anyone can generate a key-pair and sign a bunch of PCR measurements with a worthless key. This is where the protocols get complicated and kludgy, partly because TCG tried hard to placate privacy advocates. If every TPM had a unique, global AK for signing quotes, that key could be used as a global identifier for the device. The TPM2 specification instead creates a level of indirection: there is an endorsement key (EK) and associated X509 certificate baked into the TPM at manufacture time. But the EK is not used to directly sign quotes; instead users generate one or more attestation keys and prove that a specific AK lives on the same TPM as the EK, using a challenge-response protocol. That links the AK to a chain of trust anchored in the manufacturer via the X509 certificate.

The resulting end-to-end protocol provides a higher level of assurance than is possible with software-only approaches such as “health agents.” Health agents are typically pieces of software running inside the operating system that perform various checks (latest software updates applied, firewall enabled, no listening ports etc.) and report the results. The problem is those applications rely on the OS for their security. A privileged attacker with administrator rights can easily subvert the agent by feeding it bogus observations or forging a report. Boot measurements, on the other hand, are implemented by firmware and the TPM, outside the operating system and safe against any interference by OS-level malware regardless of how far it has escalated its privileges.

On the Internet, no one knows you are running Linux?

The previous example underscores a troubling link between measured boot and platform lock-in. Internet applications are commonly defined in terms of a protocol. As long as both sides conform to the protocol, they can play. For example XMPP is an open instant-messaging standard that emerged after the IM wars of the 1990s. Any conformant XMPP client following this protocol can interface with an XMPP server written according to the same specifications. Of course there may be additional restrictions associated with each XMPP server—such as being able to authenticate as a valid user, making payments out-of-band if the service requires one etc. Yet these conditions exist outside of the software implementation. There is no a priori reason an XMPP client running on Mac or Linux could not connect to the same service as long as the same conditions are fulfilled: the customer paid their bill and typed in the correct password.

With measured boot and remote attestation, it is possible for the service to unilaterally dictate new terms such as “you must be running Windows.” There is no provision in the XMPP spec today to convey PCR quotes, but nothing stops MSFT from building an extension to accommodate that. The kicker: that extension can be completely transparent and openly documented. There is no need to rely on security through obscurity and hope no one reverse-engineers the divergence from XMPP. Even with full knowledge of the change, authors of XMPP clients for other operating systems are prevented from creating interoperable clients.

No need to stop with the OS itself. While TCG specs reserve the first few PCRs for use during the boot process, there are many more available. In particular PCRs 8-16 are intended for the operating system itself to record other measurements it cares about. (Linux Integrity Measurement Architecture or IMA does exactly that.) For example the OS can reserve a PCR to measure all device drivers loaded, all installed applications or even the current choice of default web browser. Using Chrome instead of Internet Explorer? Access denied. Assuming attestation keys were set up in advance and the OS itself is in a trusted state, one can provide reliable proof of any of these criteria to a remote service and create a walled garden that only admits consumers running approved software.

The line between security feature and platform lock-in

Granted, none of the scenarios described above have come to pass yet— at least not in the context of general purpose personal computers. Chromebooks come closest with their own notion of remote verification and attempts to create walled gardens that limit access only to applications running on a Chromebook. Smartphones are a different story: starting with the iPhone, they were pitched as closed, blackbox appliances where owners had little hope of tinkering. De facto platform lock-in due to “iOS only” availability of applications is very common for services designed with mobile use in mind. This is the default state of affairs even when the service provider is not making any deliberate attempts to exclude other platforms or use anything heavyweight along the lines of remote attestation.

This raises the question: is there anything wrong with a service provider restricting access based on implementation? The answer depends on the context.

Consider the following examples:

  1. Enterprise case. An IT department wants to enforce that employees only connect to the VPN from a company-issued device (not their own personal laptop)
  2. Historic instant messaging example. AOL wants to limit access to its IM service to users running the official AIM client (not a compatible open-source clone or the MSN Messenger client published by MSFT)
  3. Leveraging online services to achieve browser monopoly. Google launches a new service and wants to restrict access only to consumers running Google Chrome as their choice of web browser

It is difficult to argue with the first one. The company has identified sensitive resources— it could be customer PII, health records, financial information etc.— and is trying to implement reasonable access controls around that system. Given that company-issued devices are often configured to higher security standards than personal devices, it seems entirely reasonable to mandate that access to these sensitive systems only take place from the more trustworthy devices. Remote attestation is a good solution here: it proves that the access is originating from a device in a known configuration. In fact PCR quotes are not the only way to get this effect; there are other ways to leverage the TPM to similar ends. For example, the TPM specification allows generating key-pairs with a policy attached saying the key is only usable when the PCRs are in a specific state. Using such a key as the credential for connecting to the VPN provides an indirect way to verify the state of the device. Suppose employees are expected to be running a particular Linux distribution on their laptop. If they boot that OS, the PCR measurements will be correct and the key will work. If they install Windows on their system and boot that, the PCR measurements will be different and their VPN key will not work. (Caveat: This is glossing over some additional risks. In a more realistic setting, we have to make sure VPN state cannot be exported to another device after authentication or, for that matter, that a random Windows box cannot SSH into the legitimate Linux machine and use its TPM keys for impersonation.)
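
A toy sketch of that idea in Python: the key records a digest of the expected PCR values at creation time and refuses to operate unless the current values produce the same digest. Real TPMs evaluate this through policy sessions inside the chip; the class below only captures the effect, and the "signature" is a stand-in.

import hashlib

def pcr_policy_digest(pcr_values: list[bytes]) -> bytes:
    """Digest of the selected PCR values, recorded in the key's policy at creation time."""
    return hashlib.sha256(b"".join(pcr_values)).digest()

class PolicyBoundKey:
    """Toy model of a TPM key gated on PCR state."""
    def __init__(self, key_material: bytes, policy: bytes):
        self._key = key_material
        self._policy = policy

    def sign(self, current_pcrs: list[bytes], message: bytes) -> bytes:
        if pcr_policy_digest(current_pcrs) != self._policy:
            # boot a different OS -> different PCRs -> the VPN credential stops working
            raise PermissionError("policy check failed: PCRs not in expected state")
        return hashlib.sha256(self._key + message).digest()  # stand-in for a real signature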

By comparison, the second case is motivated by strategic considerations. AOL deems interoperability between IM clients a threat to its business interests. That is not an unreasonable view: interop gives challengers in the market a leg up against entrenched incumbents, by lowering switching costs. At the time AOL was the clear leader, far outpacing MSN and similar competitors in number of subscribers. The point is AOL was not acting to protect its customers’ privacy or save them from harm; AOL was only trying to protect the AOL bottom line. Since IM is offered as a free service, the only potential sources of revenue are:

  • Advertising
  • Selling data obtained by surveilling users
  • Other applications installed with the client

The first one requires absolute control over the client. If an MSN Messenger user connects to the AOL network, that client will be displaying ads selected by Microsoft, not AOL. In principle the second piece still works as long as the customer is using AIM: every message sent is readable by AOL, along with metadata such as usage frequency and IP addresses used to access the service. But a native client can collect far more information by tapping into the local system: hardware profile, other applications installed, even browsing history, depending on how unscrupulous the vendor is. (Given that AOL deliberately planted a critical vulnerability, there is no reason to expect they would stop shy of mining navigation history.) The last option also requires full control over the client. For example if Adobe were to offer AOL 1¢ for distributing Flash with every install of AIM, AOL could only collect this revenue from users installing the official AIM client, not interoperable ones that do not come with Flash bundled. In all cases AOL stands to lose money if people could access the IM service without running the official AOL client.

The final hypothetical is a textbook example of leveraging monopoly in one business—online search for Google— to gain market share in another “adjacent” vertical, by artificially bundling two products. That exact pattern of behavior was at the heart of the DOJ antitrust lawsuit against MSFT in the late 1990s, alleging that the company illegally used its Windows monopoly to handicap Netscape Navigator and gain unfair advantage for Internet Explorer in market share. Except that by comparison the Google example is even more stark. While it was not a popular argument, some rallied to MSFT’s defense by pointing out that the boundaries of an “operating system” are not fixed and web browsers may one day be seen as an integral component, no different than TCP/IP networking. (In a delightful irony, Google itself proved this point later by grafting a lobotomized Linux distribution around the Chrome web browser to create ChromeOS. This was an inversion of the usual hierarchy: instead of being yet another application included with the OS, the browser is now the main attraction that happens to include an operating system as a bonus.) There is no such case to be made about creating a dependency between search engines in the cloud and web browsers used for accessing them. If Google resorted to using technologies such as measured boot to enforce that interdependency— and in fairness, it has not; this remains a hypothetical at the time of writing— the company would be adding to a long rap-sheet of anticompetitive behavior that has placed it in the crosshairs of regulators on both sides of the Atlantic.

CP

An exchange is a mixer, or why few people need Tornado Cash

The OFAC sanctions against the Ethereum mixer Tornado Cash have been widely panned by the cryptocurrency community as an attack on financial privacy. This line of argument claims that Tornado has legitimate uses (never mind that its actual usage appears to be largely laundering the proceeds of criminal activity) for consumers looking to hide their on-chain transactions from prying eyes. The problem with this argument is that the alleged target audience already has access to mixers that work just as well as Tornado Cash for most scenarios and happen to be a lot easier to use. Every major cryptocurrency exchange naturally functions as a mixer— and for the vast majority of consumers, that is a far more logical way to improve their privacy on-chain compared to interacting with a smart-contract.

Lifecycle of a cryptocurrency trade

To better illustrate why a garden-variety exchange functions—inadvertently—as a mixer, let’s look at the lifecycle of a typical trade. Suppose Alice wants to sell 1 bitcoin under her own self-custody wallet for dollars and conversely Bob wants to buy 1 bitcoin for USD. Looking at the on-chain events corresponding to this trade:

  1. Alice sends her 1 bitcoin into the exchange. This is one of the unusual aspects of trading cryptocurrency: there are no prime brokers involved and all trades must be prefunded by delivering the asset to the exchange ahead of time. This is an on-chain transaction, with the bitcoin moving from Alice’s wallet to a new address controlled by the exchange.
  2. Similarly Bob must deliver his funds in fiat, via ACH or wire transfers.
  3. Alice and Bob place orders on the exchange order book. The matching engine pairs those trades and executes the order. This takes place entirely off-chain, only updating the internal balances assigned to each customer.
  4. Bob withdraws the proceeds of the trade. This is an on-chain transaction with 1 bitcoin moving from an exchange-controlled address to one designated by Bob.
  5. Similarly Alice can withdraw her proceeds by requesting an ACH or wire transfer to her own bank account.

Omnibus wallet management

One important question is the relationship between the exchange addresses involved in steps #1 and #4. Alice must send her bitcoin to some address owned by the exchange. In theory an exchange could use the same address to receive funds from all customers. But this would make it very difficult to attribute incoming funds. Recall that an exchange may be receiving deposits from hundreds of customers, originating from any number of bitcoin addresses, at any given moment. Each of those transactions would land at the same address, with no indication of which account to credit. A standard bitcoin transaction does not have a “memo” field where Alice could indicate that a particular deposit was intended for her account. (Strictly speaking, it is possible to inject extra data into signature scripts. However that advanced capability is not widely supported by most wallet applications and in any case would require everyone to agree on conventions for conveying sender information, not just for Bitcoin but for every other blockchain.)

This is where the concept of dedicated deposit addresses comes into play. Typically exchanges assign one or more unique addresses to each customer for deposits. Having distinct deposit addresses provides a clean solution to the attribution problem: any incoming funds to one of Alice’s deposit addresses will always be attributed to her and result in crediting her balance on the internal exchange ledger. This holds true regardless of where the deposit originated. For example, she could share her deposit address with a friend and the friend could send bitcoin payments directly to Alice’s address. Alice does not even have to alert the exchange that she is expecting a payment: any blockchain transfer to that address is automatically credited to Alice.
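
A minimal sketch of that attribution logic, assuming a simple mapping from deposit addresses to account owners (the addresses and account names below are made up, and address generation and blockchain monitoring are abstracted away):

# internal ledger keyed by customer, balances in satoshis
deposit_address_owner = {
    "bc1q-alice-deposit-1": "alice",   # hypothetical deposit addresses
    "bc1q-alice-deposit-2": "alice",
    "bc1q-bob-deposit-1": "bob",
}
balances = {"alice": 0, "bob": 0}

def on_incoming_transfer(address: str, amount_sats: int) -> None:
    """Any transfer to a deposit address is credited to its owner,
    regardless of which blockchain address the funds came from."""
    owner = deposit_address_owner.get(address)
    if owner is None:
        return  # not one of our deposit addresses; ignore
    balances[owner] += amount_sats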

(Aside: Similar attribution problems arise for fiat deposits. ACH attribution is relatively straightforward since it is initiated by the customer through the exchange UI; in other words, it is a “pull” approach. But wire transfers pose a problem since there is no such thing as per-customer bank accounts. All wires are delivered to a single bank account associated with the exchange. Commonly this is solved by having customers provide wire IDs to match incoming wires to the sender.)

Incoming and outgoing

Where things get interesting is when Bob is withdrawing his newly purchased 1 bitcoin balance. While it is tempting to assume that 1 bitcoin must come from Alice’s original deposit address where she sent her funds, this is not necessary. Most exchanges implement a commingled “omnibus” wallet where funds are not segregated per customer on-chain. When Alice executes a trade to sell her bitcoin to Bob, that transaction takes place entirely off-chain. The exchange makes an update to its own internal ledger, crediting and debiting entries in a database recording how much of each asset every customer owns. That trade is not reflected on-chain. Funds are not moved from an “Alice address” into a “Bob address” each time trades execute.

This is motivated by efficiency concerns: blockchains have limited bandwidth and moving funds on-chain costs money in the form of miner fees. Settling every trade on-chain by redistributing funds between addresses would be prohibitively expensive. Instead, the exchange maintains a single logical wallet that holds funds for all its customers. The allocation of funds among all these customers is not visible on chain; it is tracked on an internal database.

A corollary of this is that when a customer requests to withdraw their cryptocurrency, that withdrawal can originate from any address in the omnibus wallet. Exchange addresses are completely fungible. In the example above, while Bob “bought” his bitcoin from Alice—in the sense that his buy order executed against a corresponding sell order from Alice—there is no guarantee that his withdrawal of proceeds will originate from Alice’s address. Depending on the blockchain involved, different strategies can be used to satisfy withdrawal requests in an economical manner. In the case of bitcoin, complex strategies are required to manage “unspent transaction outputs” or UTXOs in an efficient manner. Among other reasons:

  • It is more efficient to supply a single 10 BTC input to serve a 9 BTC withdrawal, instead of assembling nine different inputs of one bitcoin each. (More inputs → larger transaction → higher fees)
  • Due to long confirmation times on bitcoin, exchanges will typically batch withdrawals. That is, if 9 customers each request 1 bitcoin, it is more economical to broadcast a single transaction with a 10 BTC input and 9 outputs each going to one customer, as opposed to nine distinct transactions with one input/output each (see the sketch below).
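
A back-of-the-envelope comparison of the batched and unbatched approaches, using rough per-component size estimates for segwit transactions (the exact numbers vary by script type):

# Rough per-component sizes in vbytes for a P2WPKH transaction
OVERHEAD, INPUT, OUTPUT = 11, 68, 31

def tx_vbytes(n_inputs: int, n_outputs: int) -> int:
    return OVERHEAD + n_inputs * INPUT + n_outputs * OUTPUT

# Nine customers withdrawing 1 BTC each:
# one batched transaction (9 payouts + 1 change output) versus
# nine separate transactions, each spending one input and paying one customer plus change.
batched = tx_vbytes(1, 9 + 1)
separate = 9 * tx_vbytes(1, 2)
print(batched, separate)  # roughly 389 vs 1269 vbytes at these estimates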

In short, there is no relationship between the original address where incoming funds arrive and the final address which appears as the sender of record when those funds are withdrawn after a trade.

Coin mixing by accident

This hypothetical example tracked the life cycle of a bitcoin going through a trade between Alice and Bob. But the same points about omnibus wallet management also apply to a single person. Consider this sequence of events:

  1. Alice deposits 1 bitcoin into the exchange
  2. At some future date she withdraws 1 bitcoin

While the first transaction is going into one of her unique deposit addresses, the second one could be coming out of any address in the exchange omnibus wallet. It looks indistinguishable from all other 1 bitcoin withdrawals occurring around the same time. As long as Alice uses a fresh destination address to withdraw, external observers cannot link the deposit and withdrawal actions. In effect the exchange “mixed” her coins by accepting bitcoin that was known to be associated with Alice and spitting out an identical amount of bitcoin that is not linked to the original source on-chain.

In other words, an exchange with an omnibus wallet also functions as a natural mixer.

Centralized vs decentralized mixers

How favorably that mixer compares to Tornado Cash depends on the threat model. The main selling points of Tornado Cash are trustless operation and open participation.

  • Tornado is implemented as a set of immutable smart-contracts on Ethereum. Those contracts are designed to perform one function and exactly one function: mix coins. There is no leeway in the logic. It cannot abscond with funds or even refuse to perform the designated function. There is no reliance on the honest behavior of a particular counterparty. This stands in stark contrast to using a centralized exchange— those venues have full custody over customer funds. There is no guarantee the exchange will return the funds after they have been deposited. It could experience a security breach resulting in theft of assets. Or it could deliberately choose to freeze customer assets in response to a court order. Those possibilities do not exist for a decentralized system such as Tornado.
  • Closely related is that privacy is provided by all other users taking advantage of the mixer around the same time. The more transactions going through Tornado, the better each transaction is shielded among the crowd. Crucially, there is no single trusted party able to deanonymize all users, regardless of how unpopular the usage. By contrast, a centralized exchange has full visibility into fund flows. It can “connect the dots” between incoming and outgoing transactions.
  • There are no restrictions on who can interact with the Tornado smart contract. Meanwhile centralized exchanges typically have an onboarding flow and may impose restrictions on sign-ups, such as only permitting customers from specific countries or requiring proof of identity to comply with Know-Your-Customer regulations.

Reconciling the threat model

Whether these theoretical advantages translate into a real difference for a given customer depends on the specific threat model. Here is a concrete example from CoinGecko arguing for legitimate uses of Tornado:

“For instance, a software employee paid in cryptocurrency and is unwilling to let their employer know much about their financial transactions can use Tornado Cash for payment. Also, an NFT artist who has recently made a killing and is not ready to draw online attention can use Tornado Cash to improve their on-chain privacy.”

CoinGecko article

The problem with these hypothetical examples is they assume all financial transactions occur in the hermetically sealed ecosystem of cryptocurrency. In reality, very few commercial transactions can be conducted in cryptocurrency today—and those are primarily in Bitcoin using the Lightning Network, where Tornado is of exactly zero value since it operates on the unrelated Ethereum blockchain. The privacy-conscious software developer still needs an off-ramp from Ethereum to a fiat currency such as US dollars. That means an existing relationship with an exchange that allows trading digital assets for old-fashioned fiat. (While it is possible to trade ether for stablecoins such as Tether or USDC using permissionless decentralized exchanges, that still does not help. The landlord and the utility company expect to get paid in real fiat, not fiat equivalents.)

Looked at another way, the vast majority of cryptocurrency holders already have an existing relationship with an exchange because that is where they purchase and custody their cryptocurrency in the first place. For these investors, using one of those exchanges as a mixer to improve privacy is the path of least resistance. While there have been notable failures of exchanges resulting in loss of customer funds—FTX being a prominent example—it is worth noting that the counterparty exposure is much more limited for this usage pattern. Funds are routed through an exchange wallet temporarily, not custodied long term. There is a limited time window when the exchange holds the funds, until they are withdrawn in one or more transactions to new blockchain addresses that are disconnected from the original source. If anything, a major centralized exchange will afford more privacy from external observers due to its large customer base and ease of use, compared to the difficulty of interacting with Tornado contracts through web3 layers such as Metamask. While the customer has no privacy against the exchange, this is not the threat model under consideration: recall the above excerpt refers to a software developer trying to shield their transactions from their employer who pays their salary in cryptocurrency. That employer does not have any more visibility into what goes on inside the exchange than they have into, say, personal ATM or credit-card transactions for their employees. (In an extra-paranoid threat model where we are concerned about, say, Coinbase ratting on its customers, one is always free to choose a different, more trustworthy exchange or better yet mix coins through a cascade of multiple exchanges, requiring collusion among all of them to link inputs and outputs.)

That leaves Tornado Cash as a preferred choice only for a niche group of users: those who are unable to onboard with any reputable exchange (because they are truly toxic customers, e.g. OFAC-sanctioned entities) or those operating under the combination of a truly tin-foil-hat threat model (“no centralized exchange can be trusted, they will all embezzle funds and disclose customer transactions willy-nilly…”) and an abiding belief that all necessary economic transactions can be conducted on a blockchain without ever requiring an off-ramp to fiat currencies.

CP

Immutable NFTs with plain HTTP

Ethereal content

One of the recurring problems with NFT digital art has been the volatility of storage. While the NFT recording ownership of the artwork lives on a blockchain such as Ethereum, the content itself—the actual image or video—is usually too large to keep on chain. Instead there is a URL reference in the NFT pointing to the content. In the early days those were garden-variety web links. That made all kinds of shenanigans possible, some intended, others not:

  • Since websites can go away (for example, because the domain is not renewed) the NFT could disappear for good.
  • Alternatively the website could still be around but its contents can change. There is no rule that says some link such as https://example.com/MyNFT will always return the same content. The buyer of an NFT could find that the artwork they purchased has morphed. It could even be different based on time of day or the person accessing the link. (This last example was demonstrated in a recent stunt arguing that Web3 is not decentralized at all, by returning a deliberately different image when the NFT is accessed through OpenSea.)

IPFS, Arweave and similar systems have been proposed as a solution to this problem. Instead of uploading NFTs to a website which may go out of business or start returning bogus content, they are instead stored on special distributed systems. In this blog post we will describe a proof-of-concept for approximating the same effect using vanilla HTTPS links.

Before diving into the implementation details, we need to distinguish between two different requirements behind the ambiguous goal of “persistence:”


1. Immutability
2. Censorship resistance

The first one states that the content does not change over time. If the image looked a certain way when you purchased the NFT, it will always look that way when you return to view it again. (Unless of course the NFT itself incorporates elements of randomness, such as an image rendered slightly different each time. But even in that scenario, the algorithmic model for generating the image itself is constant.)

The second property states that the content is always accessible. If you were able to view the NFT once, you can do so again in the future. It will not disappear or become unavailable due to a system outage.

This distinction is important because each can be achieved independently of the other. Immutability alone may be sufficient for some use cases. In fact there is an argument to be made that #2 is not a desirable requirement in the absolute sense. Most would agree that beheading videos, CSAM or even copyrighted content should be taken down even if they were minted as an NFT.

To that end we focus on the first objective only: create an NFT that is immutable. There is no assurance that the NFT will be accessible at all times, or that it cannot be permanently taken down if enough people agree. But we can guarantee that if you can view the NFT, it will always be this particular image or that particular movie.

Subresource Integrity

At first it looks like there is already a web-standard that solves this problem out of the box: subresource integrity or SRI for short. With SRI one can link to content such as a Javascript library or a stylesheet hosted by an untrusted third-party. If that third-party attempts to tamper with the appearance and functionality of your website by serving an altered version of the content—for example a back-doored version of the Javascript library that logs keystrokes and steals passwords—it will be detected and blocked from loading. Note that SRI does not guarantee availability: that website may still have an outage or it may outright refuse to serve any content. Both of those events will still interfere with the functioning of the page; but at least the originating site can detect this condition and display an error. From a security perspective that  is a major improvement over continuing to execute logic that has been corrupted (undetected) by a third-party.

Limitations & caveats

While the solution sketched here is based on SRI, there are two problems that preclude a straightforward application:

  • SRI only works inside HTML documents.
  • SRI only applies to link and script elements. Strictly speaking this is not a limitation of the specification, but the practical reality of the extent most web-browsers have implemented the spec.

To make the first limitation more concrete, this is how a website would include a snippet of JS hosted by a third-party:

<script src="https://example.com/third-party-library.js"
        integrity="sha256-xzKeRPLnOjN6inNfYWKfDt4RIa7mMhQhOlafengSDvU="
        crossorigin="anonymous">
</script>

That second attribute is SRI at work. By specifying the expected SHA256 hash of the Javascript code to be included in this page, we are preventing the third-party from serving any other code. Even the slightest alteration to the script returned will be flagged as an error and prevent the code from executing.
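
The integrity value itself is just the base64-encoded SHA-256 digest of the exact bytes the server is expected to serve. A short Python sketch for generating it (the file name is a placeholder):

import base64, hashlib

def sri_sha256(path: str) -> str:
    """Produce the value that goes into an SRI integrity attribute."""
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).digest()
    return "sha256-" + base64.b64encode(digest).decode()

print(sri_sha256("third-party-library.js"))
# -> a string of the same form as the attribute above, e.g. sha256-xzKe...DvU=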

It is tempting to conclude that this one trick is sufficient to create an immutable NFT (according to the modest definition above) but there are two problems.

1. There is no “short-hand” version of SRI that encodes this integrity check in the URL itself. In an ideal world one could craft a third-party link along the lines of:

https://example.com/code.js#/script[@integrity='sha256-xzKeRPLnOjN6inNfYWKfDt4RIa7mMhQhOlafengSDvU=']

This (entirely hypothetical) version is borrowing syntax from XPath, combining URIs with an XML-style query language to “search” for an element that meets a particular criteria, in this case having a given SHA256 hash. But as of this writing, there is no web standard for incorporating integrity checks into the URI this way. (The closest is an RFC for hash-links.) For now we have to content ourselves with specifying the integrity as an out-of-band HTML attribute of the element.

2. As a matter of browser implementations, SRI is only applied to specific types of content; notably, javascript and stylesheets. This is consistent across Chrome, Firefox and Edge. Neither images nor iframes are covered. That means even if we could somehow solve the first problem, we cannot link to an “immutable” image by using an ordinary HTML image tag.

Emulating SRI for images

Working around both of these limitations requires a more complicated solution, where the document is built up in stages. While it is not possible to make a plain HTTPS URL immutable due to limitation #1 in SRI, there is one scheme that supports immutability by default. In fact all URLs of this type are always immutable. This is the “data” scheme where the content is inlined; it is in the URL itself. Since no content is retrieved from an external server, this is immutable by definition. Data URLs can encode an HTML document, which serves as our starting point or stage #1. The URL associated with the NFT on-chain will have this form.

In theory we could encode an entire HTML document, complete with embedded images, this way. But that runs into a more mundane problem: blockchain space is expensive and the NFT URL lives on chain. That calls for minimizing the amount of data stored within the smart-contract, using only the minimal amount of HTML to bootstrap the intended content. In our case, the specific HTML document will follow a simple template:

<!DOCTYPE html>
<html>
  <head>
    <script src="https://example.com/stage2.js"
            integrity="sha256-xzKeRPLnOjN6inNfYWKfDt4RIa7mMhQhOlafengSDvU="
            crossorigin="anonymous">
    </script>
  </head>
</html>

This is just a way of invoking stage #2, which is a chunk of bootstrap JavaScript hosted on an external service and made immutable using SRI. Note that if the hosting service decides to go rogue and start returning different content, the load will fail and the user will be staring at a blank page. But the hosting service cannot successfully cause altered javascript to execute, because of the integrity check enforced by SRI.

Stage #2 itself is also simple. It is a way of invoking stage #3, where the actual content rendering occurs.

var contents = "… contents of stage #3 HTML document …";
document.write(contents);

This replaces the current document by new HTML from the string. The heavy lifting takes place after the third stage has loaded:

  • It will fetch additional javascript libraries, using SRI to guarantee that they cannot be tampered with.
  • In particular, we pull in an existing open-source library from 2017 to emulate SRI for images, since the NFT is an image. This polyfill library supports an alternative syntax for loading images, with the URL and expected SHA256 hash specified as proprietary HTML attributes.
  • Stage #3 also contains a reference to the actual NFT image. But this image is not loaded using the standard <img src=”…”> syntax in HTML; that would not be covered by SRI due to the problem of browser support discussed above.
  • Instead, we wait until the document has rendered and kick off a custom script that invokes the JS library to do a controlled image load, comparing the content retrieved by XmlHttpRequest against the integrity check to make sure the server returned our expected NFT.
  • If the server returned the correct image, it will be rendered. Otherwise a brusque modal dialog appears to inform the viewer that something is wrong.

Putting it all together, here is a data URL encoding an immutable NFT:

data:text/html;charset=utf-8;base64,PCFET0NUWVBFIGh0bWw+CjxodG1sPgogIDxoZWFkPgogICAgPHNjcmlwdCBzcmM9Imh0dHBzOi8vd3d3LmlkZWVzZml4ZXMuaW8vaW1tdXRhYmxlL3N0YWdlMi5qcyIKCSAgICBpbnRlZ3JpdHk9InNoYTI1Ni1YSlF3UkFvZWtUa083eE85Y3ozaExrZFBDSzRxckJINDF5dlNSaXg4MmhVPSIKCSAgICBjcm9zc29yaWdpbj0iYW5vbnltb3VzIj4KICAgIDwvc2NyaXB0PgogIDwvaGVhZD4KPC9odG1sPgo=
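
For reference, a data URL like this can be generated by base64-encoding the stage #1 template. A short Python sketch, reusing the hypothetical example.com URL and hash from the template above:

import base64

stage1_html = """<!DOCTYPE html>
<html>
  <head>
    <script src="https://example.com/stage2.js"
            integrity="sha256-xzKeRPLnOjN6inNfYWKfDt4RIa7mMhQhOlafengSDvU="
            crossorigin="anonymous">
    </script>
  </head>
</html>
"""

data_url = ("data:text/html;charset=utf-8;base64," +
            base64.b64encode(stage1_html.encode("utf-8")).decode())
print(data_url)  # this string is what gets recorded on-chain as the NFT URL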

We can also embed it on other webpages (such as NFT marketplaces and galleries) using an iframe, as in this example:

Embedded NFT viewer

Chrome does not allow navigating the top-level document to a data URL, requiring indirection through the iframe. In this case the viewer itself must be trusted, since it can cheat by pointing the iframe at a bogus URL instead of the correct scheme printed above. But such corruptions are only “local” since other honest viewers will continue to enforce the integrity check.

What happens if the server hosting the image were to replace our hypothetical motorcycle NFT by a different picture?

Linking to the image with a plain HTTPS URL will display the corrupted NFT:

But going through the immutable URL above will detect the tampering attempt and not render the image:

CP