Blockchain thefts, retroactive bug-bounties and socially-responsible crime

Or, monetizing stolen cryptocurrency proves non-trivial.

It is not often one hears of bank robbers returning piles of cash after a score because they decided they could not find a way to spend the money. Yet this exact scenario has played out over and over again in the context of cryptocurrency in 2022. Multiple blockchain-based projects were breached, resulting in losses in the millions of dollars. That part alone would not have been news, only business as usual. Where the stories take a turn for the bizarre is when the perpetrators strike a bargain with the project administrators to return most of the loot, typically in exchange for a token “bug bounty” to acknowledge the services of the thieves in uncovering a security vulnerability.

To name a handful:

  • August 2021, Poly Network. A generous attacker returns close to $600 million in stolen funds to the project.
  • January 2022, Multichain. Attacker returns 80% of the $1 million stolen, deciding that he/she earned the remaining 20% for services rendered.
  • June 2022, Crema Finance. Attacker returns $8 million, keeping $1.6 million as “white-hat bounty.” (Narrator: That is not how legitimate white-hat rewards work.)
  • October 2022, Transit Swap. Perpetrator returns $16 million (about two-thirds of the total haul).
  • December 2022, Defrost Finance on Avalanche. Again the attacker returned close to 100% of the funds.

While bug bounty programs are very common in information security, they are often carefully structured with rules governing the conduct of both the security researchers and affected companies. There is a clear distinction between a responsible disclosure of a vulnerability and an outright attack. Case in point: the disgraced former Uber CSO was convicted of lying to federal investigators over an incident in which the Uber security team retroactively tried to label an actual breach as a valid bug-bounty submission. It was a clear-cut case of an actual attack: the perpetrators had not merely identified a vulnerability but exploited it to the maximum extent to grab Uber customer data. They even tried to extort Uber for payment in exchange for keeping the incident under wraps—none of this is within the framework of what qualifies as responsible disclosure. To avoid negative PR, Uber took up the perpetrators on their offer, attempting to recharacterize a real breach after the fact as a legitimate report. That did not go over very well with the FTC or the Department of Justice, which prosecuted the former Uber executive and obtained a guilty verdict.

Given that this charade did not work out for Uber, it is strange to see multiple DeFi projects embrace the same deception. It reeks of desperation, of the unique flavor experienced by a company facing an existential crisis. Absent a miracle to reverse the theft (along the lines of the DAO hard-fork the Ethereum Foundation orchestrated to bail out an early high-profile project) these projects would be out of business. The stakes are correspondingly much higher than they were for Uber circa 2017: given the number of ethics scandals and privacy debacles Uber experienced on a regular basis, the company could easily have weathered one more security incident. But for fledgling DeFi projects, the abrupt loss of all (or even a substantial part of) customer funds is the end of the road.

On the other hand, it is even more puzzling that the perpetrators—or “vulnerability researchers” if one goes along with the rhetoric—are playing along, giving up the lion’s share of their ill-gotten gains in exchange for… what exactly? While the terms of the negotiation between the perpetrators and project administrators are often kept confidential, there are a few plausible theories:

  • They are legitimate security researchers who discovered a serious vulnerability and decided to stage their own “rescue” operation. There are unique circumstances around vulnerability disclosure on blockchains. Bug collisions happen all the time and at any point, someone else— someone less scrupulous than our protagonist—may discover the same vulnerability and choose to exploit it for private gain. (This is quite different from, say, finding a critical Windows vulnerability. It would be as if you could exploit that bug on all Windows machines at the same time, regardless of where those targets are located in the world and how well they are defended otherwise.) Blockchains are unique in this regard: anyone in the world can exploit a smart-contract vulnerability. The flip side of the coin is that anyone can also play the hero, protecting all users of the vulnerable contract. Going back to our example, while one cannot “patch” Windows without help from MSFT and whoever owns the machine, on a blockchain it is possible to protect 100% of customers. The catch is that one must race to exploit the vulnerability and seize all the funds at risk, in the name of safekeeping, before the black-hats can do the same for less noble purposes.
    While it is possible that in at least some of these instances the perpetrators were indeed socially-responsible whitehat researchers motivated by nothing more than protecting customers, that seems an unlikely explanation for all of the cases. Among other clues, virtually every incident occurred without any advance notification. One would expect that a responsible researcher would at least make an effort to contact the project in advance of executing a “rescue,” notifying them of their intentions and offering contact information. Instead project administrators were reduced to putting out public-service announcements on Twitter to reach out to the anonymous attackers, offering to negotiate for the return of missing funds. There is nothing about that sequence of events that resembles responsible disclosure.
  • Immunity from prosecution. If the thieves agree to return the majority of the funds taken, the administrators could agree not to press charges or otherwise pursue legal remedies. While this may sound compelling, it is unlikely the perpetrators could get much comfort from such an assurance. Law enforcement could still treat the incident as a criminal matter even if everyone officially associated with the project claims they have made peace with the perpetrators.
  • The perpetrators came to the sad realization that stealing digital assets is the easy part. Converting those assets into dollars or otherwise usable currency without linking that activity to their real-world identity is far more difficult.

That last possibility would be a remarkable turn-around; conventional wisdom holds that blockchains are the lawless Wild West of finance where criminal activity runs rampant and crooks have an easy time getting rich by taking money from hapless users. The frequency of security breaches suggests the first part of that statement may still be true: thefts are still rampant. But it turns out that when it comes to digital currency, stealing money and being able to spend it are two very different problems.

For all the progress made on enabling payments in cryptocurrency—mainly via the Lightning Network—most transactions still take place in fiat. Executing a heist on blockchain may be no more difficult than it was in 2017, when coding secure smart-contracts was more art than science. One thing that has certainly changed in the past five years is regulatory scrutiny on the on/off-ramps from cryptocurrency into the fiat world. Criminals still have to convert their stolen bitcoin, ether or more esoteric ERC20 assets into “usable” form. Typically, that means money in a bank account; stablecoins such as Tether or Circle’s USDC will not do the trick. By and large merchants demand US dollars, not dollar-equivalent digital assets requiring trust in the solvency of private issuers.

That necessity creates a convenient chokepoint for enforcement: cryptocurrency exchanges, which are the on-ramps and off-ramps between fiat money and digital assets. Decentralization makes it impossible to stop someone from exploiting a smart-contract—or what one recently arrested trader called a “highly profitable trading strategy”—by broadcasting a transaction into a distributed network. But there is nothing trustless or distributed about converting the proceeds of that exploit into dollars spendable in the real world. That must go through a centralized exchange. To have any hope of sending/receiving US dollars, that exchange must have some rudimentary compliance program and at least make a token effort at following regulatory obligations, including Know Your Customer (KYC) and anti-money laundering (AML) rules. (Otherwise, the exchange risks experiencing the same fate as Bitfinex, which was unceremoniously dropped by its correspondent bank Wells Fargo in 2017, much to the chagrin of Bitfinex executives.) Companies with aspirations of staying in business do not look kindly on having their platform used to launder proceeds from criminal activity. They frequently cooperate with law enforcement in seizing assets as well as providing information leading to the arrest of perpetrators. Binance is a great demonstration of this in action. Once singled out by Reuters as the platform preferred by criminals laundering cryptocurrency, the exchange has responded by ramping up its compliance efforts and participating in several high-profile asset seizures. Lest the irony be lost: a cryptocurrency business proudly declares its commitment to surveilling its own customer base to look for evidence of anyone receiving funds originating with criminal activity. (The company even publishes hagiographic profiles of its compliance team retrieving assets from crooks foolish enough to choose Binance as their off-ramp to fiat land.)

This is not to say that monetizing theft on blockchains has become impossible. Determined actors with resources—such as the rogue state of North Korea—no doubt still retain access to avenues for exiting into fiat. (Even in that case, increased focus on enforcement can help by increasing the “haircut,” or percentage of value lost by criminals when they convert digital assets into fiat through increasingly inefficient schemes.) But those complex arrangements are not accessible to a casual vulnerability researcher who stumbles into a serious flaw in a smart-contract or compromises the private keys controlling a large wallet. Put another way: there are far more exploitable vulnerabilities than ways of converting proceeds from that exploit into usable money. Immature development practices and a gold-rush mentality around rushing poorly designed DeFi applications to market have created a target-rich environment. This is unlikely to change any time soon. On the flip side, increased focus on regulation and the availability of better tools for law enforcement—including dedicated services such as Chainalysis and TRM Labs for tracing funds on chain—make it far more difficult to monetize those attacks in any realistic way. It was a running joke in the information security community that blockchains come with a built-in bug bounty: find a serious security vulnerability and monetary rewards shall follow automatically—even if the owner of the system never bothered to create an official bounty program. But digital assets that are blacklisted by every reputable business and can never be exchanged for anything else of value are about as valuable as Monopoly money. Given that dilemma, it is no surprise that creative vulnerability researchers would embrace the post hoc “white-hat disclosure” charade, choosing a modest but legitimate payout over holding on to a much larger sum of tainted funny-money they have little hope of ever being able to spend.

CP

The myth of tainted blockchain addresses [part II]

[continued from part I]

Ethereum and account-based blockchains

The Ethereum network does not have a concept of discrete “spend candidates” or UTXOs. Instead, funds are assigned to unique blockchain addresses. While this is a more natural model for how consumers expect digital assets to behave (and bitcoin wallet software goes out of its way to create the same appearance while juggling UTXOs under the covers) it also complicates the problem of separating clean vs dirty funds.

Consider this example:

  • Alice has a balance of 5 ETH on her Ethereum address
  • She receives 1 ETH from a sanctioned address (For simplicity assume 100% of these funds are tainted, for example because they represent stolen assets.)
  • She receives another 5 ETH from a clean address.
  • Alice sends 1 ETH to Bob.

If Alice and Bob are concerned about complying with AML rules, they may be asking themselves: are they in possession of tainted ETH that needs to be frozen or otherwise segregated for potential seizure by law enforcement? (Note in this example their interests are somewhat opposed: Alice would much prefer that the 1 ETH she transferred to Bob “flushed” all the criminal proceeds out of her wallet, while Bob wants to operate under the assumption that he received all clean money and all tainted funds still reside with Alice.)

Commodities parallel

If one were to draw a crude—no pun intended—comparison to commodities, tainted Bitcoin behaves like blood diamonds while tainted Ethereum behaves like contraband oil imported from a sanctioned petro-dictatorship. While a UTXO can be partially tainted, it does not “mix” with other UTXOs associated with the same address. Imagine a precious-stones vault containing diamonds. Some of these turn out to be conflict diamonds, others have a verifiable pedigree. While the vault may contain items of both types, there is no ambiguity about whether any given sale includes conflict diamonds. In fact, once the owner becomes aware of the situation, they can make a point of putting those stones aside and never selling them to any customer. This is the UTXO model in bitcoin: any given transaction either references a given UTXO (and consumes 100% of the available funds there) or does not reference that UTXO at all. If the wallet owner is careful to never use tainted inputs in constructing their transaction, they can be confident that the outputs are also clean.

Ethereum balances do not behave this way because they are all aggregated together in one address. Stretching the commodity example, instead of a vault with boxes of precious gems, imagine an oil storage facility. There is a tank with a thousand barrels of domestic oil with side-entry mixer running inside to stir up the contents and avoid sludge settling at the bottom. Some joker dumps a thousand barrels of contraband petrostate oil of identical density and physical characteristics into this tank. Given that the contents are being continuously stirred, it would be difficult to separate out the product into its constituent parts. If someone tapped one barrel from that tank and sold it, should that barrel be considered sanctioned, clean or something in between such as “half sanctioned”?

There are logical arguments that could justify each of these decisions:

  1. One could take the extreme view that even the slightest amount of contraband oil mixed into the tank results in spoilage of the entire contents. This is the obsessive-compulsive school of blockchain hygiene, which holds that even de minimis amounts originating from a sanctioned address irreversibly poison an entire wallet. In this case all 2000 barrels coming out of that tank will be tainted. In fact, if any more oil were added to that tank, it too would get tainted. At this point, one might as well shutter that facility altogether.
  2. A more lenient interpretation holds that there are indeed one thousand sanctioned barrels, but those are in the second batch of one thousand barrels coming out of the spout. Since the first thousand original barrels were clean, we can tap up to that amount without a problem. This is known as FIFO or first-in-first-out ordering in computer science.
  3. Conversely, one could argue that the first thousand are contraband because those were the most recent additions to the tank, while the next thousand will be clean. That would be LIFO or last-in-first-out ordering.
  4. Finally, one could argue the state of being tainted exists on a continuum. Instead of a simple yes/no, each barrel is assigned a percentage. Given that the tank holds equal parts “righteous” and “nefarious” crude oil, every barrel coming out of it will be 50% tainted according to this logic.
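
To make these conventions concrete, here is a minimal Python sketch—purely illustrative, with made-up numbers—that computes how much taint attaches to a withdrawal from the commingled tank under each of the four rules above.

# Hypothetical sketch: how much taint attaches to a withdrawal from a
# commingled pool under the four conventions above. Deposits are listed in
# arrival order as (amount, tainted) pairs.

def withdraw_taint(deposits, already_withdrawn, amount, rule):
    """Return the tainted fraction of a withdrawal of `amount` units,
    given prior withdrawals totaling `already_withdrawn`."""
    total = sum(a for a, _ in deposits)
    dirty = sum(a for a, t in deposits if t)
    if rule == "poison":             # any taint spoils the entire pool
        return 1.0 if dirty > 0 else 0.0
    if rule == "haircut":            # taint becomes a pro-rata percentage
        return dirty / total
    order = deposits if rule == "fifo" else list(reversed(deposits))   # "lifo"
    # Walk the deposits in order, skipping what earlier withdrawals consumed,
    # and count how much of this withdrawal comes from tainted chunks.
    skip, take, tainted_part = already_withdrawn, amount, 0.0
    for chunk, is_tainted in order:
        usable = max(0.0, chunk - skip)
        skip = max(0.0, skip - chunk)
        used = min(usable, take)
        take -= used
        if is_tainted:
            tainted_part += used
        if take <= 0:
            break
    return tainted_part / amount

# The oil-tank example: 1000 clean barrels deposited first, then 1000 sanctioned ones.
tank = [(1000, False), (1000, True)]
for rule in ("poison", "fifo", "lifo", "haircut"):
    print(rule, withdraw_taint(tank, already_withdrawn=0, amount=1, rule=rule))
# poison -> 1.0, fifo -> 0.0, lifo -> 1.0, haircut -> 0.5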

Pre-Victorian legal precedents

While there may not be any physical principles for choosing between these hypotheses, it turns out this problem does come up in legal contexts and there is precedent for adopting a convention. In the paper Bitcoin Redux a group of researchers from the University of Cambridge expound on how an 1816 UK High Court ruling singles out a particular way of tracking stolen funds:

It was established in 1816, when a court had to tackle the problem of mixing after a bank went bust and its obligations relating to one customer account depended on what sums had been deposited and withdrawn in what order before the insolvency. Clayton’s case (as it’s known) sets a simple rule of first-in-first-out (FIFO): withdrawals from an account are deemed to be drawn against the deposits first made to it.

In fact, their work tackles a more complicated scenario where multiple types of taint are tracked, including stolen assets, funds from Iran (OFAC sanctioned) and funds coming out of a mixer. The authors compare the FIFO heuristic against the more radical “poison” approach, which corresponds to #1 in our list above, as well as the “haircut,” which corresponds to #4, highlighting the advantages of FIFO:

The poison diagram shows how all outputs are fully tainted by all inputs. In the haircut diagram, the percentages of taint on each output are shown by the extent of the coloured bars. The taint diffuses so widely that the effect of aggressive asset recovery via regulated exchanges might be more akin to a tax on all users.
[…]
With the FIFO algorithm, the taint does not go across in percentages, but to individual components (indeed, individual Satoshis) of each output. Thus the first output has an untainted component, then the stolen component – both from the first input – and then part of the Iranian component from the second input. As the taint does not spread or diffuse, the transaction processes it in a lossless way.

Ethereum revisited

While the Bitcoin Redux paper only considered the Bitcoin network, the FIFO heuristic translates naturally into the Ethereum context as it corresponds to option #2 in the crude-oil tank example. Going back to the Alice & Bob hypothetical, it vindicates Bob—in fact it means Alice can send another 4 ETH from that address before getting to the tainted portion.

Incidentally the FIFO model has another important operational advantage: it allows the wallet owner to quarantine tainted funds in a fully deterministic, controlled manner. Suppose Alice’s compliance officer advises her to quarantine all tainted funds at a specific address for later disbursement to law enforcement. Recall that the tainted sum of 1 ETH is “sandwiched” chronologically between two chunks of clean ETH in arrival order. But Alice can create a series of transactions to isolate it:

  • First she needs to spend the 5 ETH that were present at the address prior to the arrival of the tainted funds. Alice could wait until this happens naturally, as in her outbound transfer to Bob. Any remaining amount can be immediately consumed in a loopback transaction sending funds back to the original address, or she could temporarily shift those funds to another wallet under her control.
  • Now she creates another 1 ETH transaction to move the tainted portion to the quarantine address.

The important point here is that no one else can interfere with this sequence. If instead the LIFO heuristic had been adopted, Alice could receive a deposit between the two steps, resulting in her outbound transaction in the second step using up a different 1 ETH segment that does not correspond exactly to the portion she wanted to get rid of. This need not even be a malicious donation. For example, charities accepting donations on chain receive deposits from contributors without any prior arrangement. Knowing the donation address is sufficient; there is no need to notify the charity in advance of an upcoming payment. Similarly, cryptocurrency exchanges hand out deposit addresses to customers with the understanding that the customer is free to send funds to that address at any time and they will be credited to her account. In these situations, the unexpected deposit would throw off the carefully orchestrated plan to isolate tainted funds, but only if LIFO is used—because in that model the “last-in” addition going “first-out” is the surprise deposit.
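
For those who prefer code to prose, here is a minimal sketch of FIFO accounting inside a wallet, using Alice’s hypothetical history. The chunk amounts and their ordering come from the example above; the class and method names are made up for illustration.

from collections import deque

# Hypothetical sketch of FIFO accounting inside a wallet. Each chunk is
# [amount_in_eth, tainted] in arrival order; outbound transfers consume
# from the oldest chunk first.

class FifoLedger:
    def __init__(self):
        self.chunks = deque()

    def receive(self, amount, tainted=False):
        self.chunks.append([amount, tainted])

    def send(self, amount):
        """Consume `amount` from the oldest chunks first; return the
        (clean, tainted) split of what was just sent."""
        assert amount <= sum(c[0] for c in self.chunks), "insufficient balance"
        clean = tainted = 0.0
        while amount > 0:
            chunk = self.chunks[0]
            used = min(chunk[0], amount)
            chunk[0] -= used
            amount -= used
            if chunk[1]:
                tainted += used
            else:
                clean += used
            if chunk[0] == 0:
                self.chunks.popleft()
        return clean, tainted

# Alice's history: 5 ETH clean, 1 ETH from a sanctioned address, 5 ETH clean.
alice = FifoLedger()
alice.receive(5)
alice.receive(1, tainted=True)
alice.receive(5)

print(alice.send(1))   # transfer to Bob: (1.0, 0.0) -- entirely clean under FIFO
print(alice.send(4))   # loopback/self-transfer to exhaust the clean head: (4.0, 0.0)
print(alice.send(1))   # transfer to the quarantine address: (0.0, 1.0) -- exactly the tainted ETH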

In conclusion: blockchain addresses are not hopelessly tainted because of one unsolicited transaction sent by someone looking to make a point. Only specific chunks of assets associated with that address carry taint. Using Tornado Cash to permanently poison vast sums of ether holdings remains nothing more than wishful thinking because the affected portion can be reliably separated by those seeking to comply with AML rules, at the cost of some additional complexity in wallet operations.

CP

The myth of tainted blockchain addresses [part I]

[Full disclosure: This blogger is not an attorney and what follows is not legal advice.]

Unsolicited gifts on chain

In the aftermath of the OFAC sanctions against Tornado Cash, it has become an article of faith in the cryptocurrency community that banning blockchain addresses sets a dangerous precedent. Some have argued that blacklisting Tornado addresses and everyone who interacts with them will have dangerous downstream effects due to the interconnectedness of networks. Funds flow from one address to another, the argument goes, often merging with unrelated pools of capital before being split off again. Once we decide one pool of funds is tainted by virtue of being associated with a bad actor or event—a scam, rug-pull or garden-variety theft—that association propagates unchecked and continues to taint funds belonging to innocent bystanders who were not in any way involved with the original crime. As if to illustrate the point, shortly after the ban prohibiting US residents from interacting with the Tornado mixer took effect, some joker decided to use that very mixer to send unsolicited funds to prominent blockchain addresses. These were either addresses with unusually high balances (e.g. “whales” in industry parlance) or addresses previously tagged as belonging to celebrities or well-known cryptocurrency businesses such as exchanges. Here is an unsolicited “donation” sent to the Kraken cold-wallet through the Tornado mixer.

That raises the question: are these unwitting recipients also in violation of OFAC sanctions? Are all funds in those wallets now permanently tainted because of an inbound transaction, a transaction they neither asked for nor had any realistic means to prevent given the way blockchains operate? With a few exceptions, anyone can send funds from their own address to any other address on most blockchains; the recipient cannot prevent this. Granted, there are a few special cases where the recipient can limit unwanted transfers. For example, Algorand requires the recipient to opt in to supporting a specific ASA before they can receive assets of that type. But that does not in any way prevent free transfer of the native currency ALGO. Ethereum smart-contracts make it possible to take action on incoming transfers and reject them based on sender identity. Of course, this assumes the recipients have a way to identify “bad” addresses. Often such labels are introduced after the offending address has been active and transacting. Even if there were a 100% reliable way to flag and reject tainted transfers, requiring that would place an undue burden on every blockchain participant to implement expensive measures (including the use of smart-contracts and integrating with live data feeds of currently sanctioned addresses, according to all possible regulators around the world) to defend against a hypothetical scenario that few will encounter.

Given that inbound transfers from blacklisted addresses cannot be prevented in any realistic setup, does that mean blacklisting Tornado Cash also incidentally blacklists all of these downstream recipients by association? While compelling on its face, this logic ignores the complexity of how distributed ledgers track balances and adopts one possible convention among many plausible ones for deciding how to track illicit funds in motion. This blog post will argue that there are equally valid conventions that make it easier to isolate funds associated with illicit activity and prevent this type of uncontrolled propagation of “taint” through the network. To make this case, we will start with the simple scenario where tainted funds are clearly isolated from legitimate financial activity and then work our way up to more complex situations where commingling of funds requires choosing a specific convention for separation.

Easy case: simple Bitcoin transactions

UTXO model

The Bitcoin network makes it easy to separate different pools of money within an address, because the blockchain organizes funds into distinct lumps of assets called “unspent transaction outputs” or UTXO. The concept “balance of address” does not exist natively in the Bitcoin ledger; it is a synthetic metric created by aggregating UTXO that all share the same destination address.

Conceptually, one can speculate whether this was a consequence of the relentless focus on privacy Satoshi advocated. The Bitcoin whitepaper warns about the dangers of address reuse, urging participants to only use each address once. In this extreme model, it does not make sense to track balances over time, since each address only appears twice on the blockchain: first when funds are deposited at that address, temporarily creating a non-zero balance, and a second and last time when funds are withdrawn, after which point the balance will always be zero. This is not how most bitcoin wallets operate in reality. Address reuse is common and often necessary for improving operational controls around funds movement. Address whitelisting is a very common security feature used to restrict transfers to known, previously defined trusted destinations. That model can only scale if each participant has a handful of fixed blockchain addresses such that all counterparties interacting with that person can record those entries in their whitelist of “safe” destinations.

For these reasons it is convenient to speak of an address “balance” as a single figure and draw charts depicting how that number varies over time. But it is important to remember that single number is a synthetic creation representing an aggregate over discrete, individual UTXOs. In fact the same balance may behave differently depending on the organization of its spend candidates. Consider these two addresses:

  • The first is comprised of a thousand UTXOs, each worth 0.00001 BTC
  • The second is a single 0.01 BTC UTXO.

On paper, both addresses have a balance of “0.01 bitcoin.” In reality, the second address is far more useful for commercial activity. Recall that each bitcoin transaction pays a mining fee proportional to the size of the transaction in bytes—not proportional to the value transacted, as is the case with most payment networks. Inputs typically account for the bulk of transaction size due to the presence of cryptographic signatures, even after accounting for the artificial discount introduced by segregated witness. That means scrounging together dozens of small inputs is less efficient than supplying the same amount with a single large UTXO. In the extreme case of “dust outputs,” the mining fees required to include a UTXO as input may exceed the amount of funds that input contributes. Including the UTXO would effectively be a net negative. Such a UTXO is economically unusable unless network fees decline.
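
The arithmetic is easy to sketch. The byte sizes below are rough figures for a legacy pay-to-pubkey-hash transaction and the fee rate is an arbitrary assumption; the point is the comparison, not the exact numbers.

# Rough, illustrative fee arithmetic for spending UTXOs. Sizes are ballpark
# figures for a legacy P2PKH transaction; they are assumptions, not constants.

INPUT_VBYTES = 148      # approximate size of one P2PKH input (signature + pubkey)
OUTPUT_VBYTES = 34      # approximate size of one P2PKH output
OVERHEAD_VBYTES = 10    # version, locktime, input/output counts

def spend_fee(num_inputs, fee_rate, num_outputs=2):
    """Mining fee in satoshi to spend `num_inputs` UTXOs at `fee_rate` sat/vbyte."""
    size = OVERHEAD_VBYTES + num_inputs * INPUT_VBYTES + num_outputs * OUTPUT_VBYTES
    return size * fee_rate

FEE_RATE = 20   # sat/vbyte, an assumed prevailing fee rate

# Address 1: a thousand UTXOs of 0.00001 BTC (1,000 sats) each.
balance = 1000 * 1_000
fee = spend_fee(1000, FEE_RATE)
print(f"thousand small UTXOs: balance={balance:,} sats, fee={fee:,} sats, net={balance - fee:,}")

# Address 2: a single UTXO of 0.01 BTC (1,000,000 sats).
fee = spend_fee(1, FEE_RATE)
print(f"single UTXO:          balance=1,000,000 sats, fee={fee:,} sats, net={1_000_000 - fee:,}")

# A UTXO is economically "dust" when the fee to reference it as an input
# exceeds the value it contributes:
print("is a 1,000-sat UTXO dust at this fee rate?", INPUT_VBYTES * FEE_RATE > 1_000)

At this assumed fee rate the first address is worth less than nothing: the fee to sweep all one thousand inputs exceeds the entire balance, while the single-UTXO address loses only a fraction of a percent to fees.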

Isolating tainted funds

This organization of funds into such distinct lumps makes it easy to isolate unsolicited contributions. Every UTXO stands on its own. If funds are sent from a sanctioned actor, the result is a distinct UTXO that stands apart on the bitcoin ledger from all the other UTXOs sharing the recipient address. That UTXO should be considered tainted in its entirety, subject to asset freeze/seizure or whatever remedy the powers-that-be deem appropriate for the situation. Everything else is completely free of taint.

One way to implement this is for the wallet software to exclude such UTXOs when calculating available balances or picking available UTXOs to prepare a new outbound transaction. The wallet owner effectively acts as if that UTXO did not exist. This model extends naturally to downstream transfers. If the tainted UTXO is used as input into another subsequent bitcoin transaction (perhaps because the wallet owner did not know it was tainted) it will end up creating another tainted UTXO while leaving everything else untouched.
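
A minimal sketch of what that looks like in wallet code—the field names and the greedy selection strategy are illustrative, not any particular wallet’s API:

# Minimal sketch of coin selection that ignores quarantined UTXOs.
# A UTXO is identified by (txid, output index); amounts are in satoshi.

utxos = [
    {"txid": "aaa...", "vout": 0, "amount": 40_000},
    {"txid": "bbb...", "vout": 1, "amount": 25_000},
    {"txid": "ccc...", "vout": 0, "amount": 90_000},   # received from a sanctioned address
]
tainted = {("ccc...", 0)}   # flagged after consulting a sanctions list or tracing feed

def spendable_balance(utxos, tainted):
    return sum(u["amount"] for u in utxos if (u["txid"], u["vout"]) not in tainted)

def select_inputs(utxos, tainted, target):
    """Greedy largest-first selection over clean UTXOs only."""
    clean = sorted(
        (u for u in utxos if (u["txid"], u["vout"]) not in tainted),
        key=lambda u: u["amount"], reverse=True)
    picked, total = [], 0
    for u in clean:
        picked.append(u)
        total += u["amount"]
        if total >= target:
            return picked
    raise ValueError("insufficient clean funds")

print(spendable_balance(utxos, tainted))       # 65000 -- the tainted 90,000 sats never count
print(select_inputs(utxos, tainted, 50_000))   # picks the 40,000 and 25,000 sat UTXOs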

Mixed bag: transactions with mismatched inputs/outputs

The previous section glossed over an important aspect of bitcoin transactions: they can have multiple inputs and outputs. The more general case is illustrated by transaction C here:

Transaction graph example from the Bitcoin wiki

Consider an example along the lines of transaction C, but using a different arrangement of inputs and outputs:
Inputs:

  • First input of clean 5 BTC
  • Second one worth 1 BTC originating from a sanctioned address

Outputs:

  • First output is designated to receive 3 BTC
  • Second output receives 2.999 BTC (leaving 0.001 BTC in mining fees)

Due to the different amounts, there is no way to map inputs to outputs in a one-to-one relationship. There is a total of 1 tainted bitcoin on the input side that must somehow be passed through to the outputs. (There is of course the question of where the mining fees came from. It would be convenient to argue that those were paid using tainted funds, to reduce the total. But here we will make the worst-case assumption: tainted funds must be propagated 100% to outputs. Mining fees are assumed to come out of the “clean” portion of inputs.)

A problem of convention

Clearly there are different ways taint can be allocated among the outputs within those constraints. Here are some examples:

  1. Allocate evenly, with 50% assigned to each output. This will result in partial taint of both outputs.
  2. Allocate 100% to the first output. That output is now partially tainted while the remaining output is clean. (More generally, we may need to allocate to multiple outputs until the entire tainted input is fully accounted for. If the first input of 5 BTC had been the one from a sanctioned address, it would have required both outputs to fully cover the amount.)
  3. Same as #2 but select the tainted outputs in a different order. The adjectives “first” and “second” are in reference to the transaction layout on the blockchain, where inputs and outputs are strictly ordered. But for the purposes of tracking tainted funds, we do not have to follow the same order. Here are some other reasonable criteria:
  4. FIFO or first-in-first-out. Match inputs and outputs in order. In this case, since the first output can be paid out of the first clean input entirely, it is considered clean. But the second output requires the additional tainted 1 BTC, so it is partially tainted.
  5. Highest balance first. To reduce the number of tainted outputs, allocate taint to the outputs in decreasing order of value until the tainted input is fully consumed.
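
To make the FIFO variant (#4) concrete, here is a small sketch applying it to the example transaction above, with the worst-case assumption that the mining fee is charged against the clean input so the full tainted amount lands on the outputs. The function name and structure are made up for illustration; amounts are in satoshi to keep the arithmetic exact.

# Sketch of the FIFO convention applied to the example transaction:
# inputs [5 BTC clean, 1 BTC tainted], outputs [3 BTC, 2.999 BTC], 0.001 BTC fee.

def fifo_output_taint(inputs, outputs, fee):
    """inputs: list of (amount_sats, tainted) in transaction order.
    Returns the tainted satoshi amount assigned to each output."""
    # Charge the fee against clean inputs first (worst-case assumption).
    queue, fee_left = [], fee
    for amount, tainted in inputs:
        if not tainted:
            deduct = min(amount, fee_left)
            amount, fee_left = amount - deduct, fee_left - deduct
        if amount:
            queue.append([amount, tainted])
    # Match the remaining input value to outputs in order: first in, first out.
    taint_per_output = []
    for out_amount in outputs:
        tainted_here = 0
        while out_amount:
            chunk = queue[0]
            used = min(chunk[0], out_amount)
            chunk[0] -= used
            out_amount -= used
            if chunk[1]:
                tainted_here += used
            if chunk[0] == 0:
                queue.pop(0)
        taint_per_output.append(tainted_here)
    return taint_per_output

BTC = 100_000_000
print(fifo_output_taint([(5 * BTC, False), (1 * BTC, True)],
                        [3 * BTC, 299_900_000], 100_000))
# [0, 100000000]: the first output is clean, the second carries the full 1 BTC of taint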

Regardless of which convention is adopted, one conclusion stands: No matter how one slices and dices the outputs, there are scenarios where some UTXO will be partially tainted, even if the starting state follows the all-or-none characterization. Previous remedies to quarantine or otherwise exclude that UTXO in its entirety from usable assets are no longer appropriate.

Instead of trying to solve for this special case in bitcoin, we look at how the comparable situation arises in Ethereum, which differs from Bitcoin in a crucial way: it does not have the concept of UTXO. Here the concept of “balance” is native to the blockchain and associated with every address. That means the neat separation of funds into discrete “clean” and “tainted” chunks cannot possibly work in Ethereum, forcing us to confront this problem of commingled funds in a broader context.

[continued]

CP

Remote attestation: from security feature to anticompetitive lock-in

Lessons from the first instant-messaging war

In the late 1990s and early 2000s instant messaging was all the rage. A tiny Israeli startup, Mirabilis, set the stage with ICQ, but IM quickly became a battleground of tech giants, running counter to the usual dot-com era mythology of small startups disrupting incumbents on the way to heady IPO valuations. AOL Instant Messenger had taken a commanding lead out of the gate while MSFT was living up to its reputation as “fast-follower” (or put less charitably, tail-light chaser) with MSN Messenger. Google had yet to throw its hat into the arena with GChat. These IM networks were completely isolated: an AOL user could only communicate with other AOL users. This resulted in most users having to maintain multiple accounts to participate in different networks, each with its own barrage of notifications and taskbar icons.

While these companies were locked in what they viewed as a zero-sum game for marketshare, the benefits of interoperability to consumers were clear. In fact one software vendor called Trillian even made a multiple-network client that effectively aggregated the protocols for all the different IM services. Standards for interoperability such as SIP and XMPP were still a ways off from becoming relevant; everyone invented their own client/server protocol for instant messaging, and expected to provide both sides of the implementation from scratch. But there was a more basic reason why some IM services were resistant to adopting an open standard: it was not necessarily good for the bottom line. Interop is asymmetric: it helps the smaller challenger compete against the incumbent behemoth. If you are MSN Messenger trying to win customers away from AOL, it is a selling point if you can build an IM client that can exchange messages with both MSN and AOL customers. Presto: AOL users can switch to your application and still keep in touch with their existing contacts while becoming part of the MSN ecosystem. Granted the same dynamics operate in the other direction: in principle AOL could have built an IM client that connected its customers with MSN users. But this is where existing market shares matter: AOL had more to lose by allowing such interoperability and opening itself up to competition with MSN than by keeping its users locked up in its walled garden.

Not surprisingly, they went out of their way to keep each IM service an island unto itself. Interestingly for tech giants, this skirmish was fought in code instead of the more common practice of lawyers exchanging nastygrams. AOL tried to prevent any client other than the official AIM client from connecting to its service. You would think this is an easy problem: after all, they control the software on both sides. They could ship a new IM client that includes a subtle, specific quirk when communicating with the IM server. AOL servers would in turn look for that quirk and reject any “rogue” clients missing it.

White lies for compatibility

This idea runs into several problems. A practical engineering constraint in the early 2000s was the lack of automatic software updates. AOL could ship a new client but in those Dark Ages of software delivery, “ship” meant uploading the new version to a website— itself a much heralded improvement from the “shrink-wrap” model of actually burning software on CDs and selling them in a retail store. There was no easy way to force-upgrade the entire customer base. If the server insisted on enforcing the new client fingerprint, they would have to turn away a large percentage of customers running legacy versions or make them jump through hoops to download the latest version— and who knows, maybe some of those customers would decide to switch to MSN in frustration. That problem is tractable and was ultimately solved with better software engineering. Windows Update and later Google Chrome made automatic software updates into a feature customers take for granted today. But there is a more fundamental problem with attempting to fingerprint clients: competitors can reverse-engineer the fingerprint and incorporate it in their own software.

This may sound vaguely nefarious, but software impersonating other pieces of software is in fact quite common for compatibility. In fact web browsers practically invented that game. Take a look at the user-agent string early versions of Internet Explorer sent to every website:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; SLCC2; .NET CLR 2.0.50727; Media Center PC 6.0; .NET CLR 3.5.30729; .NET CLR 3.0.30729)

There is a lot of verbiage here but one word jumps out: Mozilla. That is the codename used by Netscape Navigator in its own user agent strings. The true identity of this browser is buried in a parenthetical comment—“MSIE 6.0” for Internet Explorer 6— but why is IE trying to pretend to be Netscape? Because of compatibility. Web pages were designed assuming a set of browser features that could not be taken for granted— such as support for images and JavaScript. Given the proliferation of different web browsers and versions at the time— see the earlier point about the lack of automatic updates— websites used a heuristic shortcut to determine if visitors were using an appropriate browser. Instead of trying to check for the availability of each feature, they began to check for a specific version of a known browser. “Mozilla/4.0” was a way to signal that the current web browser could be treated as if it were Netscape 4. Instead of turning away users with an unfriendly message along the lines of “Please install Netscape 4 to use this site,” the service could assume all the requisite features were present and proceed as usual.

These white-lies are ubiquitous on the web because compatibility is in the interest of everyone. Site publishers just want things to work. Amazon wants to sell books. With the exception of a few websites closely affiliated with a browser vendor, they do not care whether customers use Netscape, IE or Lynx to place their orders. There is no reason for websites to be skeptical about user-agent claims or run additional fingerprinting code to determine if a given web browser was really Netscape 4 or simply pretending to be. (Even if they wanted to, such fingerprinting would have been difficult; for example IE often aimed for bug-for-bug compatibility with Netscape, even when that meant diverging from official W3C standards.)

Software discrimination and the bottom line

For reasons noted above, the competitive dynamics of IM were fundamentally different from those of web browsers. Most of the business models built around IM assumed full control over the client stack. For example, MSN Messenger floated ideas of making money by displaying ads in the client. This model runs into problems when customers run a different but interoperable client: Messenger could connect to the AOL network— effectively using resources and generating costs for AOL— while displaying ads chosen by MSN, earning revenue for MSFT.

Not surprisingly this resulted in an escalating arms-race. AOL included more and more subtle features into AIM that the server could use for fingerprinting. MSFT attempted to reverse engineer that functionality out of the latest AIM client and incorporate identical behavior into MSN Messenger. It helped that the PC platform was, and to a large extent still is, very much open to tinkering. Owners can inspect binaries running on their machine, inspect network communications originating from a process or attach a debugger to a running application to understand exactly what that app is doing under specific circumstances. (Intel SGX is an example of a recent hardware development on x86 that breaks that assumption. It allows code to run inside protected “enclaves” shielded from any debugging or inspection by an outside observer.)

In no small measure of irony, the Messenger team voluntarily threw in the towel on interoperability when AOL escalated the arms-race to a point MSFT was unwilling to follow: AOL deliberately included a remote code execution vulnerability in AIM, intended for its own servers to exploit. Whenever a client connected, the server would exploit the vulnerability to execute arbitrary code that looked around the process and checked the identity of the application. Today such a bug would earn a critical severity rating and an associated CVE if it were discovered in an IM client. (Consider that in the 1990s most Internet traffic was not encrypted, so it would have been much easier for third parties to exploit that bug; the AIM client had very little assurance that it was communicating with the legitimate AOL servers.) If it were alleged today that a software publisher deliberately inserted such a bug into an application used by millions of people, it would be all over the news and possibly result in the responsible executives being dragged in front of Congress for a ritual public flogging. In the 1990s it was business-as-usual.

Trusted computing and the dream of remote attestation

While the MSN Messenger team may have voluntarily hoisted the white flag on that particular battle with AOL, a far more powerful department within the company was working to make AOL’s wishes come true: a reliable solution for verifying the authenticity of software running on a remote peer, preferably without playing a game of chicken with deliberately introduced security vulnerabilities. This was the Trusted Computing initiative, later associated with the anodyne but awkward acronym NGSCB (“Next Generation Secure Computing Base”) though better remembered by its codename “Palladium.”

The lynchpin of this initiative was a new hardware component called the “Trusted Platform Module,” meant to be included as an additional component on the motherboard. The TPM was an early example of a system-on-a-chip or SoC: it had its own memory, processor and persistent storage, all independent of the PC. That independence meant the TPM could function as a separate root of trust. Even if malware compromises the primary operating system and gets to run arbitrary code in kernel mode— the highest privilege level possible—it still cannot tamper with the TPM or alter security logic embedded in that chip.

Measured boot

While the TPM specification defined a kitchen sink of functionality ranging from key management (generate and store keys on-board the TPM in non-extractable fashion) to serving as a generic cryptographic co-processor, one feature stood out for use in securing the integrity of the operating system during the boot process: the notion of measured boot. At a high level, the TPM maintains a set of values in RAM dubbed “platform configuration registers” or PCRs. When the TPM is started, these all start out at zero. What distinguishes PCRs is the way they are updated. It is not possible to write an arbitrary value into a PCR. Instead the existing value is combined with the new input and run through a cryptographic hash function such as SHA-1; this is called “extending” the PCR in TCG terminology. Similarly it is not possible to reset the values back to zero, short of restarting the TPM chip, which only happens when the machine itself is power-cycled. In this way the final PCR value becomes a concise record of all the inputs that were processed through that PCR. Any slight change to any of the inputs, or even changing the order of inputs, results in a completely different value with no discernible relationship to the original.

This enables what TCG called the “measurement of trust” during the boot process, by updating the PCR with measurements of all code executed. For example, the initial BIOS code that takes control when a machine is first powered on updates PCR #0 with a hash of its own binary. Before passing control to the boot sector on disk, it records the hash of that sector in a different PCR. Similarly the early-stage boot loader first computes a cryptographic hash of the OS boot-loader and updates a PCR with that value, before executing the next stage. In this way, a chain of trust is created for the entire boot process with every link in the chain except the very first one recorded in some PCR before that link is allowed to execute. (Note the measurement must be performed by the predecessor. Otherwise a malicious boot-loader could update the PCR with a bogus hash instead of its own. Components are not allowed to self-certify their code; it must be an earlier piece of code that performs the PCR update before passing control.)
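
The extend operation is easy to model in a few lines. A minimal sketch follows, using SHA-256 for the digests; real PCR banks and event-log formats are more involved, and the component contents here are obviously stand-ins.

import hashlib

# Minimal model of a PCR and the extend operation: the register can only be
# updated as PCR = H(PCR || measurement), never written directly.

class PCR:
    def __init__(self, algo="sha256"):
        self.algo = algo
        self.value = bytes(hashlib.new(algo).digest_size)   # starts at all zeroes

    def extend(self, measurement):
        self.value = hashlib.new(self.algo, self.value + measurement).digest()
        return self.value

def measure(component):
    """Each link in the boot chain hashes the *next* component before running it."""
    return hashlib.sha256(component).digest()

# Simulated boot chain: firmware measures the boot sector, which measures the
# OS loader, which measures the kernel.
boot_chain = [b"BIOS image", b"boot sector", b"OS boot-loader", b"kernel"]

pcr = PCR()
for component in boot_chain:
    pcr.extend(measure(component))
print(pcr.value.hex())

# The final value is deterministic: the same components in the same order
# always reproduce it, so it can be predicted ahead of time and compared
# against a golden reference. Change any component, or swap the order, and
# the resulting digest bears no resemblance to the original.
pcr2 = PCR()
for component in reversed(boot_chain):
    pcr2.extend(measure(component))
print(pcr2.value.hex())   # completely different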

TCG specifications define the conventions for what components are measured into which PCR. These are different between legacy BIOS and the newer UEFI specifications. Suffice it to say that by the time a modern operating system boots, close to a dozen PCRs will have been extended with a record of the different components booted.

So what can be done with this cryptographic record of the boot process? While these values look random, they are entirely deterministic. Assuming the exact same system is powered on on two different occasions, identical PCR values will result. For that matter, if two different machines have the exact same installation— same firmware, same version of the operating system, same applications installed— it is expected that their PCRs will be identical. These examples hint at two immediate security applications:

  • Comparison over time: verify that a system is still in the same known-good state it was at a given point in the past. For example we can record the state of PCRs after a server is initially provisioned and before it is deployed into production. By comparing those measurements against the current state, it is possible to detect if critical software has been tampered with.
  • Comparison against a reference image: Instead of looking at the same machine over time, we can also compare different machines in a data-center. If we have PCR measurements for a known-good “reference image,” any server in healthy state is expected to have the same measurements in the running configuration.

Interestingly neither scenario requires knowing what the PCRs are ahead of time or even the exact details of how PCRs are extended. We are only interested in deltas between two sets of measurements. Since PCRs are deterministic, for a given set of binaries involved in a boot process we can predict ahead of time exactly what PCR values should result. There is a different use-case where those exact values matter: ascertaining whether a remote system is running a particular configuration.

Getting better at discrimination

Consider the problem of distinguishing a machine running Windows from one running Linux. These operating systems use a different boot-loader and the hash of that boot-loader gets captured into a specific PCR during measured boot. The value of that PCR will now act as a signal of what operating system is booted. Recall that each step in the boot-chain is responsible for verifying the next link; a Windows boot-loader will not pass control to a Linux kernel image.

This means PCR values can be used to prove to a remote system that you are running Windows or even running it in a particular configuration. There is one more feature required for this: a way to authenticate those PCRs. If clients were allowed to self-certify their own PCR measurements, a Linux machine could masquerade as a Windows box by reporting the “correct” PCR values expected after a Windows boot. The missing piece is called “quoting” in TPM terminology. Each TPM can digitally sign its PCR measurements with a private-key permanently bound to that TPM. This is called the attestation key and it is only used for signing such proofs unique to the TPM. (The other use case is certifying that some key-pair was generated on the TPM, by signing a structure containing the public key.) This prevents the owner from forging bogus quotes by asking the TPM to sign random messages.

This shifts the problem into a different plane: verifying the provenance of the “alleged” attestation, namely that it really belongs to a TPM. After all anyone can generate a key-pair and sign a bunch of PCR measurements with a worthless key. This is where the protocols get complicated and kludgy, partly because TCG tried hard to placate privacy advocates. If every TPM had a unique, global AK for signing quotes, that key could be used as a global identifier for the device. The TPM2 specification instead creates a level of indirection: there is an endorsement key (EK) and associated X509 certificate baked into the TPM at manufacture time. But the EK is not used to directly sign quotes; instead users generate one or more attestation keys and prove that a specific AK lives on the same TPM as the EK, using a challenge-response protocol. That links the AK to a chain of trust anchored in the manufacturer via the X509 certificate.
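
Stripped of the genuine TPM2 data structures and the EK ceremony, the signature flow amounts to the sketch below. An ordinary ECDSA key from the Python cryptography package stands in for the attestation key; this is a simplified model for illustration, not the real quote format.

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.exceptions import InvalidSignature
import hashlib, os

# Simplified model of a PCR quote -- NOT the actual TPM2 wire format. A plain
# ECDSA key stands in for the TPM-resident attestation key; in reality the
# private half never leaves the chip and its provenance is established via the
# EK certificate chain described above.

ak_private = ec.generate_private_key(ec.SECP256R1())
ak_public = ak_private.public_key()

def quote(pcr_values, nonce):
    """Sign a digest over the selected PCR values plus a verifier-chosen nonce."""
    msg = nonce + b"".join(pcr_values[i] for i in sorted(pcr_values))
    return ak_private.sign(msg, ec.ECDSA(hashes.SHA256()))

def verify_quote(pcr_values, nonce, signature):
    msg = nonce + b"".join(pcr_values[i] for i in sorted(pcr_values))
    try:
        ak_public.verify(signature, msg, ec.ECDSA(hashes.SHA256()))
        return True
    except InvalidSignature:
        return False

# The verifier supplies a fresh nonce so quotes cannot be replayed, then checks
# the reported PCRs against golden values for an approved configuration.
nonce = os.urandom(16)
reported = {0: hashlib.sha256(b"firmware").digest(),
            4: hashlib.sha256(b"windows boot-loader").digest()}
sig = quote(reported, nonce)

print(verify_quote(reported, nonce, sig))    # True: the signature covers these PCRs
forged = {**reported, 4: hashlib.sha256(b"linux boot-loader").digest()}
print(verify_quote(forged, nonce, sig))      # False: cannot claim different PCR values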

The resulting end-to-end protocol provides a higher level of assurance than is possible with software-only approaches such as “health agents.” Health agents are typically pieces of software running inside the operating system that perform various checks (whether the latest software updates have been applied, the firewall is enabled, there are no listening ports etc.) and report the results. The problem is those applications rely on the OS for their security. A privileged attacker with administrator rights can easily subvert the agent by feeding it bogus observations or forging a report. Boot measurements, on the other hand, are implemented by firmware and the TPM, outside the operating system and safe against any interference by OS-level malware regardless of how far it has escalated its privileges.

On the Internet, no one knows you are running Linux?

The previous example underscores a troubling link between measured boot and platform lock-in. Internet applications are commonly defined in terms of a protocol. As long as both sides conform to the protocol, they can play. For example XMPP is an open instant-messaging standard that emerged after the IM wars of the 1990s. Any conformant XMPP client following this protocol can interface with an XMPP server written according to the same specifications. Of course there may be additional restrictions associated with each XMPP server—such as being able to authenticate as a valid user, making payments out-of-band if the service requires them, etc. Yet these conditions exist outside of the software implementation. There is no a priori reason an XMPP client running on Mac or Linux could not connect to the same service as long as the same conditions are fulfilled: the customer paid their bill and typed in the correct password.

With measured boot and remote attestation, it is possible for the service to unilaterally dictate new terms such as “you must be running Windows.” There is no provision in the XMPP spec today to convey PCR quotes, but nothing stops MSFT from building an extension to accommodate that. The kicker: that extension can be completely transparent and openly documented. There is no need to rely on security through obscurity and hope no one reverse-engineers the divergence from XMPP. Even with full knowledge of the change, authors of XMPP clients for other operating systems are prevented from creating interoperable clients.

There is no need to stop with the OS itself. While TCG specs reserve the first few PCRs for use during the boot process, there are many more available. In particular PCRs 8-16 are intended for the operating system itself to record other measurements it cares about. (Linux Integrity Measurement Architecture or IMA does exactly that.) For example the OS can reserve a PCR to measure all device drivers loaded, all installed applications or even the current choice of default web browser. Using Chrome instead of Internet Explorer? Access denied. Assuming attestation keys were set up in advance and the OS itself is in a trusted state, one can provide reliable proof of any of these criteria to a remote service and create a walled-garden that only admits consumers running approved software.

The line between security feature and platform lock-in

Granted, none of the scenarios described above has come to pass yet— at least not in the context of general-purpose personal computers. Chromebooks come closest with their own notion of remote verification and attempt to create walled-gardens that limit access to applications running on a Chromebook. Smart-phones are a different story: starting with the iPhone, they were pitched as closed, blackbox appliances where owners had little hope of tinkering. De facto platform lock-in due to “iOS only” availability of applications is very common for services designed with mobile use in mind. This is the default state of affairs even when the service provider is not making any deliberate attempt to exclude other platforms or use anything heavyweight along the lines of remote attestation.

This raises the question: is there anything wrong with a service provider restricting access based on implementation? The answer depends on the context.

Consider the following examples:

  1. Enterprise case. An IT department wants to enforce that employees only connect to the VPN from a company-issued device (not their own personal laptop).
  2. Historic instant-messaging example. AOL wants to limit access to its IM service to users running the official AIM client (not a compatible open-source clone or the MSN Messenger client published by MSFT).
  3. Leveraging online services to achieve browser monopoly. Google launches a new service and wants to restrict access only to consumers running Google Chrome as their choice of web browser.

It is difficult to argue with the first one. The company has identified sensitive resources— it could be customer PII, health records, financial information etc.— and is trying to implement reasonable access controls around that system. Given that company-issued devices are often configured to higher security standards than personal devices, it seems entirely reasonable to mandate that access to these sensitive systems only take place from the more trustworthy devices. Remote attestation is a good solution here: it proves that the access is originating with a device in a known configuration. In fact PCR quotes are not the only way to get this effect; there are other ways to leverage the TPM to similar ends. For example, the TPM specification allows generating key-pairs with a policy attached saying the key is only usable when the PCRs are in a specific state. Using such a key as the credential for connecting to the VPN provides an indirect way to verify the state of the device. Suppose employees are expected to be running a particular Linux distribution on their laptop. If they boot that OS, the PCR measurements will be correct and the key will work. If they install Windows on their system and boot that, PCR measurements will be different and their VPN key will not work. (Caveat: This is glossing over some additional risks. In a more realistic setting, we have to make sure VPN state cannot be exported to another device after authentication or, for that matter, that a random Windows box cannot SSH into the legitimate Linux machine and use its TPM keys for impersonation.)
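
Conceptually the PCR-bound key behaves like the sketch below: the policy is a digest over an expected PCR state, and the secret is only usable when the live PCRs reproduce that digest. This is a simplification of TPM2 policy sessions for illustration, not the actual API, and every name in it is made up.

import hashlib

# Conceptual model of a PCR-bound key policy -- a simplification of TPM2
# policy sessions, not the actual API or wire format.

def pcr_policy_digest(pcr_values):
    """Digest over the PCR state the key is bound to."""
    return hashlib.sha256(b"".join(pcr_values)).digest()

class PolicyBoundKey:
    def __init__(self, secret, policy):
        self._secret = secret      # in a real TPM, this never leaves the chip
        self.policy = policy

    def use(self, current_pcrs):
        if pcr_policy_digest(current_pcrs) != self.policy:
            raise PermissionError("current PCR state does not satisfy key policy")
        return self._secret        # e.g. sign the VPN handshake with it

# Provisioning: bind the VPN credential to the measurements of the approved Linux image.
approved = [hashlib.sha256(b"approved firmware").digest(),
            hashlib.sha256(b"approved linux boot-loader").digest()]
vpn_key = PolicyBoundKey(secret=b"vpn-credential", policy=pcr_policy_digest(approved))

print(vpn_key.use(approved))       # works: the laptop booted the approved image
windows = [approved[0], hashlib.sha256(b"windows boot-loader").digest()]
try:
    vpn_key.use(windows)
except PermissionError as err:
    print(err)                     # booted something else: the key refuses to cooperate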

By comparison, the second case is motivated by strategic considerations. AOL deems interoperability between IM clients a threat to its business interests. That is not an unreasonable view: interop gives challengers in the market a leg up against entrenched incumbents, by lowering switching costs. At the time AOL was the clear leader, far outpacing MSN and similar competitors in number of subscribers. The point is AOL is not acting to protect its customers’ privacy or save them from harm; AOL is only trying to protect the AOL bottom line. Since IM is offered as a free service, the only potential sources of revenue are:

  • Advertising
  • Selling data obtained by surveilling users
  • Other applications installed with the client

The first one requires absolute control over the client. If an MSN Messenger user connects to the AOL network, that client will be displaying ads selected by Microsoft, not AOL. In principle the second piece still works as long as the customer is using AIM: every message sent is readable by AOL, along with metadata such as usage frequency and IP addresses used to access the service. But a native client can collect far more information by tapping into the local system: hardware profile, other applications installed, even browsing history, depending on how unscrupulous the vendor is. (Given that AOL deliberately planted a critical vulnerability, there is no reason to expect it would stop shy of mining navigation history.) The last option also requires full control over the client. For example if Adobe were to offer AOL 1¢ for distributing Flash with every install of AIM, AOL could only collect this revenue from users installing the official AIM client, not interoperable ones that do not include Flash bundled. In all cases AOL stands to lose money if people can access the IM service without running the official AOL client.

The final hypothetical is a textbook example of leveraging monopoly in one business—online search for Google— to gain market share in another “adjacent” vertical, by artificially bundling two products. That exact pattern of behavior was at the heart of the DOJ antitrust lawsuit against MSFT in the late 1990s, alleging that the company illegally used its Windows monopoly to handicap Netscape Navigator and gain an unfair market-share advantage for Internet Explorer. Except that by comparison the Google example is even more stark. While it was not a popular argument, some rallied to MSFT’s defense by pointing out that the boundaries of an “operating system” are not fixed and web browsers may one day be seen as an integral component, no different than TCP/IP networking. (In a delightful irony, Google itself proved this point later by grafting a lobotomized Linux distribution around the Chrome web-browser to create ChromeOS. This was an inversion of the usual hierarchy: instead of being yet another application included with the OS, the browser is now the main attraction that happens to include an operating system as a bonus.) There is no such case to be made about creating a dependency between search engines in the cloud and web browsers used for accessing them. If Google resorted to using technologies such as measured-boot to enforce that interdependency— and in fairness, it has not, this remains a hypothetical at the time of writing— the company would be adding to a long rap-sheet of anticompetitive behavior that has placed it in the crosshairs of regulators on both sides of the Atlantic.

CP

An exchange is a mixer, or why few people need Tornado Cash

The OFAC sanctions against the Ethereum mixer Tornado Cash have been widely panned by the cryptocurrency community as an attack on financial privacy. This line of argument claims that Tornado has legitimate uses (never mind that its actual usage appears to be largely laundering the proceeds of criminal activity) for consumers looking to hide their on-chain transactions from prying eyes. The problem with this argument is that the alleged target audience already has access to mixers that work just as well as Tornado Cash for most scenarios and happen to be a lot easier to use. Every major cryptocurrency exchange naturally functions as a mixer— and for the vast majority of consumers, that is a far more logical way to improve their privacy on-chain compared to interacting with a smart-contract.

Lifecycle of a cryptocurrency trade

To better illustrate why a garden-variety exchange functions—inadvertently—as a mixer, let’s look at the lifecycle of a typical trade. Suppose Alice wants to sell 1 bitcoin under her own self-custody wallet for dollars and conversely Bob wants to buy 1 bitcoin for USD. Looking at the on-chain events corresponding to this trade:

  1. Alice sends her 1 bitcoin into the exchange. This is one of the unusual aspects of trading cryptocurrency: there are no prime brokers involved and all trades must be prefunded by delivering the asset to the exchange ahead of time. This is an on-chain transaction, with the bitcoin moving from Alice’s wallet to a new address controlled by the exchange.
  2. Similarly Bob must deliver his funds in fiat, via ACH or wire transfers.
  3. Alice and Bob place orders on the exchange order book. The matching engine pairs those trades and executes the order. This takes place entirely off-chain, only updating the internal balances assigned to each customer (see the sketch after this list).
  4. Bob withdraws the proceeds of the trade. This is an on-chain transaction with 1 bitcoin moving from an exchange-controlled address to one designated by Bob.
  5. Similarly Alice can withdraw her proceeds by requesting an ACH or wire transfer to her own bank account.
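
To make step #3 concrete, here is a minimal sketch of how such an internal ledger might be updated. Everything below is hypothetical and purely illustrative: the class, the customer names and the $20,000 price are made up, not any real exchange’s implementation. Steps #4 and #5 would later debit these same balances when withdrawals are processed:

// Minimal sketch of an exchange's internal ledger. Deposits and trades only
// touch this in-memory map; nothing here generates an on-chain transaction.
class InternalLedger {
  constructor() {
    this.balances = new Map();   // key: "customer:asset" -> numeric balance
  }
  credit(customer, asset, amount) {
    const key = `${customer}:${asset}`;
    this.balances.set(key, (this.balances.get(key) || 0) + amount);
  }
  debit(customer, asset, amount) {
    const key = `${customer}:${asset}`;
    const current = this.balances.get(key) || 0;
    if (current < amount) throw new Error('insufficient balance');
    this.balances.set(key, current - amount);
  }
  // Step #3: the matching engine settles a fill entirely off-chain by
  // swapping ledger entries between the two counterparties.
  settleTrade(seller, buyer, asset, quantity, price) {
    this.debit(seller, asset, quantity);
    this.credit(buyer, asset, quantity);
    this.debit(buyer, 'USD', quantity * price);
    this.credit(seller, 'USD', quantity * price);
  }
}

const ledger = new InternalLedger();
ledger.credit('alice', 'BTC', 1);           // step 1: on-chain deposit detected
ledger.credit('bob', 'USD', 20000);         // step 2: fiat deposit arrives
ledger.settleTrade('alice', 'bob', 'BTC', 1, 20000);   // step 3: off-chain settlement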

Omnibus wallet management

One important question is the relationship between the exchange addresses involved in steps #1 and #4. Alice must send her bitcoin to some address owned by the exchange. In theory an exchange could use the same address to receive funds from all customers. But this would make it very difficult to attribute incoming funds. Recall that an exchange may be receiving deposits from hundreds of customers, originating from any number of bitcoin addresses, at any given moment. Each of those transactions must be attributed to the right customer account. A standard bitcoin transaction does not have a “memo” field where Alice could indicate that a particular deposit was intended for her account. (Strictly speaking, it is possible to inject extra data into signature scripts. However that advanced capability is not widely supported by most wallet applications and in any case would require everyone to agree on conventions for conveying sender information, not just for Bitcoin but for every other blockchain.)

This is where the concept of dedicated deposit addresses comes into play. Typically exchanges assign one or more unique addresses to each customer for deposits. Having distinct deposit addresses provides a clean solution to the attribution problem: any incoming funds to one of Alice’s deposit addresses will always be attributed to her and result in crediting her balance on the internal exchange ledger. This holds true regardless of where the deposit originated. For example, she could share her deposit address with a friend and the friend could send bitcoin payments directly to Alice’s address. Alice does not even have to alert the exchange that she is expecting a payment: any blockchain transfer to that address is automatically credited to Alice.
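
Continuing the illustrative sketch above, deposit attribution amounts to a lookup table from deposit addresses to customer accounts. The address strings and the block/transaction shapes below are invented for the example, not taken from any particular wallet backend:

// Every address the exchange has ever handed out is indexed by owner.
const depositAddressOwner = new Map([
  ['bc1q-alice-deposit-1', 'alice'],
  ['bc1q-alice-deposit-2', 'alice'],
  ['bc1q-bob-deposit-1',   'bob'],
]);

// Called for each new block observed on-chain.
function processBlock(block, ledger) {
  for (const tx of block.transactions) {
    for (const output of tx.outputs) {
      const customer = depositAddressOwner.get(output.address);
      if (customer) {
        // No memo field needed: the receiving address alone identifies the account.
        ledger.credit(customer, 'BTC', output.value);
      }
    }
  }
}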

(Aside: Similar attribution problems arise for fiat deposits. ACH attribution is relatively straightforward since it is initiated by the customer through the exchange UI; in other words, it is a “pull” approach. But wire transfers pose a problem since there is no such thing as a per-customer bank account. All wires are delivered to a single bank account associated with the exchange. Commonly this is solved by having customers provide wire IDs to match incoming wires to the sender.)

Incoming and outgoing

Where things get interesting is when Bob is withdrawing his newly purchased 1 bitcoin balance. While it is tempting to assume that 1 bitcoin must come from Alice’s original deposit address where she sent her funds, this is not necessarily the case. Most exchanges implement a commingled “omnibus” wallet where funds are not segregated per customer on-chain. When Alice executes a trade to sell her bitcoin to Bob, that transaction takes place entirely off-chain. The exchange makes an update to its own internal ledger, crediting and debiting entries in a database recording how much of each asset every customer owns. That trade is not reflected on-chain. Funds are not moved from an “Alice address” into a “Bob address” each time trades execute.

This is motivated by efficiency concerns: blockchains have limited bandwidth and moving funds on-chain costs money in the form of miner fees. Settling every trade on-chain by redistributing funds between addresses would be prohibitively expensive. Instead, the exchange maintains a single logical wallet that holds funds for all its customers. The allocation of funds among all these customers is not visible on chain; it is tracked on an internal database.

A corollary of this is that when a customer requests to withdraw their cryptocurrency, that withdrawal can originate from any address in the omnibus wallet. Exchange addresses are completely fungible. In the example above, while Bob “bought” his bitcoin from Alice—in the sense that his buy order executed against a corresponding sell order from Alice—there is no guarantee that his withdrawal of proceeds will originate from Alice’s address. Depending on the blockchain involved, different strategies can be used to satisfy withdrawal requests in an economical manner. In the case of Bitcoin, careful strategies are required to manage “unspent transaction outputs” or UTXOs efficiently. Among other reasons:

  • It is more efficient to supply a single 10BTC input to serve a 9BTC withdrawal, instead of assembling nine different inputs of one bitcoin each. (More inputs → larger transaction → higher fees)
  • Due to long confirmation times on bitcoin, exchanges will typically batch withdrawals. That is, if 9 customers each request 1 bitcoin, it is more economical to broadcast a single transaction with a 10BTC input and 9 outputs each going to one customer, as opposed to nine distinct transactions with one input/output each (see the sketch after this list).
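
Here is a hedged sketch of what such a batched withdrawal might look like, in the same illustrative style as the earlier snippets; the UTXO and request shapes and the change-address helper are invented for the example:

// Placeholder: in reality the change address would come from the exchange's wallet.
const freshChangeAddress = () => 'bc1q-exchange-change-1';

// Serve many withdrawal requests from one large UTXO: fewer inputs means a
// smaller transaction and therefore lower miner fees.
function buildBatchedWithdrawal(utxos, requests, feeEstimate) {
  const total = requests.reduce((sum, r) => sum + r.amount, 0) + feeEstimate;
  const input = utxos.find((u) => u.value >= total);
  if (!input) throw new Error('no single UTXO is large enough; fall back to coin selection');

  const outputs = requests.map((r) => ({ address: r.destination, value: r.amount }));
  const change = input.value - total;
  if (change > 0) outputs.push({ address: freshChangeAddress(), value: change });

  // One unsigned transaction skeleton covering the entire batch of withdrawals.
  return { inputs: [input], outputs };
}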

In short, there is no relationship between the original address where incoming funds arrive and the final address which appears as the sender of record when those funds are withdrawn after a trade.

Coin mixing by accident

This hypothetical example tracked the life cycle of a bitcoin going through a trade between Alice and Bob. But the same points about omnibus wallet management also apply to a single person. Consider this sequence of events:

  1. Alice deposits 1 bitcoin into the exchange
  2. At some future date she withdraws 1 bitcoin

While the first transaction is going into one of her unique deposit addresses, the second one could be coming out of any address in the exchange omnibus wallet. It looks indistinguishable from all other 1 bitcoin withdrawals occurring around the same time. As long as Alice uses a fresh destination address to withdraw, external observers cannot link the deposit and withdrawal actions. In effect the exchange “mixed” her coins by accepting bitcoin that was known to be associated with Alice and spitting out an identical amount of bitcoin that is not linked to the original source on-chain.

In other words, an exchange with an omnibus wallet also functions as a natural mixer.

Centralized vs decentralized mixers

How favorably that mixer compares to Tornado Cash depends on the threat model. The main selling points of Tornado Cash are trustless operation and open participation.

  • Tornado is implemented as a set of immutable smart-contracts on Ethereum. Those contracts are designed to perform one function and exactly one function: mix coins. There is no leeway in the logic. It cannot abscond with funds or even refuse to perform the designated function. There is no reliance on the honest behavior of a particular counterparty. This stands in stark contrast to using a centralized exchange— those venues have full custody over customer funds. There is no guarantee the exchange will return the funds after they have been deposited. It could experience a security breach resulting in theft of assets. Or it could deliberately choose to freeze customer assets in response to a court order. Those possibilities do not exist for a decentralized system such as Tornado.
  • Closely related is that privacy is provided by all other users taking advantage of the mixer around the same time. The more transactions going through Tornado, the better each transaction is shielded among the crowd. Crucially, there is no single trusted party able to deanonymize all users, regardless of how unpopular the usage. By contrast, a centralized exchange has full visibility into fund flows. It can “connect the dots” between incoming and outgoing transactions.
  • There are no restrictions on who can interact with the Tornado smart contract. Meanwhile centralized exchanges typically have an onboarding flow and may impose restrictions on sign-ups, such as only permitting customers from specific countries or requiring proof of identity to comply with Know-Your-Customer regulations.

Reconciling the threat model

Whether these theoretical advantages translate into a real difference for a given customer depends on the specific threat model. Here is a concrete example from CoinGecko defending legitimate uses of Tornado:

“For instance, a software employee paid in cryptocurrency and is unwilling to let their employer know much about their financial transactions can use Tornado Cash for payment. Also, an NFT artist who has recently made a killing and is not ready to draw online attention can use Tornado Cash to improve their on-chain privacy.”

CoinGecko article

The problem with these hypothetical examples is they assume all financial transactions occur in the hermetically sealed ecosystem of cryptocurrency. In reality, very few commercial transactions can be conducted in cryptocurrency today—and those are primarily in Bitcoin using the Lightning Network, where Tornado is of exactly zero value since it operates on the unrelated Ethereum blockchain. The privacy-conscious software developer still needs an off-ramp from Ethereum to a fiat currency such as US dollars. That means an existing relationship with an exchange that allows trading digital assets for old-fashioned fiat. (While it is possible to trade ether for stablecoins such as Tether or USDC using permissionless decentralized exchanges, that still does not help. The landlord and the utility company expect to get paid in real fiat, not fiat equivalents.)

Looked at another way, the vast majority of cryptocurrency holders already have an existing relationship with an exchange because that is where they purchase and custody their cryptocurrency in the first place. For these investors, using one of those exchanges as a mixer to improve privacy is the path of least resistance. While there have been notable failures of exchanges resulting in loss of customer funds—FTX being a prominent example—it is worth noting that the counterparty exposure is much more limited for this usage pattern. Funds are routed through an exchange wallet temporarily, not custodied long term. There is a limited time-window when the exchange holds the funds, until they are withdrawn in one or more transactions to new blockchain addresses that are disconnected from the original source. If anything, a major centralized exchange will afford more privacy from external observers due to its large customer base and ease of use, compared to the difficulty of interacting with Tornado contracts through web3 layers such as Metamask. While the customer has no privacy against the exchange, this is not the threat model under consideration: recall the above excerpt refers to a software developer trying to shield their transactions from their employer who pays their salary in cryptocurrency. That employer does not have any more visibility into what goes on inside the exchange than they have into, say, personal ATM or credit-card transactions for their employees. (In an extra-paranoid threat model where we are concerned about, say, Coinbase ratting on its customers, one is always free to choose a different, more trustworthy exchange or, better yet, mix coins through a cascade of multiple exchanges, requiring collusion among all of them to link inputs and outputs.)

That leaves Tornado Cash as a preferred choice only for a niche group of users: those who are unable to onboard with any reputable exchange (because they are truly toxic customers, e.g. OFAC-sanctioned entities) or those operating under the combination of a truly tin-foil-hat threat model (“no centralized exchange can be trusted, they will all embezzle funds and disclose customer transactions willy-nilly…”) and an abiding belief that all necessary economic transactions can be conducted on a blockchain without ever requiring an off-ramp to fiat currencies.

CP

Immutable NFTs with plain HTTP

Ethereal content

One of the recurring problems with NFT digital art has been the volatility of storage. While the NFT recording ownership of the artwork lives on a blockchain such as Ethereum, the content itself—the actual image or video—is usually too large to keep on chain. Instead there is a URL reference in the NFT pointing to the content. In the early days those were garden-variety web links. That made all kinds of shenanigans possible, some intended, others not:

  • Since websites can go away (for example, because the domain is not renewed) the NFT could disappear for good.
  • Alternatively the website could still be around but its contents can change. There is no rule that says some link such as https://example.com/MyNFT will always return the same content. The buyer of an NFT could find that the artwork they purchased has morphed. It could even be different based on time of day or the person accessing the link. (This last example was demonstrated in a recent stunt arguing that Web3 is not decentralized at all, by returning a deliberately different image when the NFT is accessed through OpenSea.)

IPFS, Arweave and similar systems have been proposed as a solution to this problem. Instead of uploading NFTs to a website which may go out of business or start returning bogus content, they are stored on purpose-built distributed systems. In this blog post we will describe a proof-of-concept for approximating the same effect using vanilla HTTPS links.

Before diving into the implementation details, we need to distinguish between two different requirements behind the ambiguous goal of “persistence:”


  1. Immutability
  2. Censorship resistance

The first one states that the content does not change over time. If the image looked a certain way when you purchased the NFT, it will always look that way when you return to view it again. (Unless of course the NFT itself incorporates elements of randomness, such as an image rendered slightly differently each time. But even in that scenario, the algorithmic model for generating the image itself is constant.)

The second property states that the content is always accessible. If you were able to view the NFT once, you can do so again in the future. It will not disappear or become unavailable due to a system outage.

This distinction is important because each can be achieved independently of the other. Immutability alone may be sufficient for some use cases. In fact there is an argument to be made that #2 is not a desirable requirement in the absolute sense. Most would agree that beheading videos, CSAM or even copyrighted content should be taken down even if they were minted as an NFT.

To that end we focus on the first objective only: create an NFT that is immutable. There is no assurance that the NFT will be accessible at all times, or that it cannot be permanently taken down if enough people agree. But we can guarantee that if you can view the NFT, it will always be this particular image or that particular movie.

Subresource Integrity

At first it looks like there is already a web-standard that solves this problem out of the box: subresource integrity or SRI for short. With SRI one can link to content such as a Javascript library or a stylesheet hosted by an untrusted third-party. If that third-party attempts to tamper with the appearance and functionality of your website by serving an altered version of the content—for example a back-doored version of the Javascript library that logs keystrokes and steals passwords—it will be detected and blocked from loading. Note that SRI does not guarantee availability: that website may still have an outage or it may outright refuse to serve any content. Both of those events will still interfere with the functioning of the page; but at least the originating site can detect this condition and display an error. From a security perspective that  is a major improvement over continuing to execute logic that has been corrupted (undetected) by a third-party.

Limitations & caveats

While the solution sketched here is based on SRI, there are two problems that preclude a straightforward application:

  • SRI only works inside HTML documents.
  • SRI only applies to link and script elements. Strictly speaking this is not a limitation of the specification, but the practical reality of the extent to which most web-browsers have implemented the spec.

To make the first limitation more concrete, this is how a website would include a snippet of JS hosted by a third-party:

<script src="https://example.com/third-party-library.js"
        integrity="sha256-xzKeRPLnOjN6inNfYWKfDt4RIa7mMhQhOlafengSDvU="
        crossorigin="anonymous">
</script>

That second attribute is SRI at work. By specifying the expected SHA256 hash of the Javascript code to be included in this page, we are preventing the third-party from serving any other code. Even the slightest alteration to the script returned will be flagged as an error and prevent the code from executing.
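
For reference, the integrity value is nothing more than a base64-encoded SHA-256 digest of the exact bytes the server returns. A minimal sketch of computing one in the browser with the standard fetch and Web Crypto APIs (assuming the file is served with permissive CORS headers; the function name is just an illustration):

// Compute an SRI value ("sha256-...") for a resource. Can be run from the
// devtools console of any https page.
async function sriDigest(url) {
  const response = await fetch(url);
  const bytes = await response.arrayBuffer();
  const hash = await crypto.subtle.digest('SHA-256', bytes);
  // The digest is 32 bytes; base64-encode it and prepend the algorithm label.
  return 'sha256-' + btoa(String.fromCharCode(...new Uint8Array(hash)));
}

sriDigest('https://example.com/third-party-library.js').then(console.log);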

It is tempting to conclude that this one trick is sufficient to create an immutable NFT (according to the modest definition above) but there are two problems.

1. There is no “short-hand” version of SRI that encodes this integrity check in the URL itself. In an ideal world one could craft a third-party link along the lines of:

https://example.com/code.js#/script[@integrity='sha256-xzKeRPLnOjN6inNfYWKfDt4RIa7mMhQhOlafengSDvU=']

This (entirely hypothetical) version is borrowing syntax from XPath, combining URIs with an XML-style query language to “search” for an element that meets particular criteria, in this case having a given SHA256 hash. But as of this writing, there is no web standard for incorporating integrity checks into the URI this way. (The closest is an RFC for hash-links.) For now we have to content ourselves with specifying the integrity as an out-of-band HTML attribute of the element.

2. As a matter of browser implementations, SRI is only applied to specific types of content; notably, javascript and stylesheets. This is consistent across Chrome, Firefox and Edge. Neither images nor iframes are covered. That means even if we could somehow solve the first problem, we cannot link to an “immutable” image by using an ordinary HTML image tag.

Emulating SRI for images

Working around both of these limitations requires a more complicated solution, where the document is built up in stages. While it is not possible to make a plain HTTPS URL immutable due to limitation #1 in SRI, there is one URL scheme that supports immutability by default; in fact, all URLs of this type are always immutable. This is the “data” scheme, where the content is inlined in the URL itself. Since no content is retrieved from an external server, it is immutable by definition. Data URLs can encode an HTML document, which serves as our starting point or stage #1. The URL associated with the NFT on-chain will have this form.

In theory we could encode an entire HTML document, complete with embedded images, this way. But that runs into a more mundane problem: blockchain space is expensive and the NFT URL lives on chain. That calls for minimizing the amount of data stored within the smart-contract, using only the minimal amount of HTML to bootstrap the intended content. In our case, the specific HTML document will follow a simple template:

<!DOCTYPE html>
<html>
  <head>
    <script src="https://example.com/stage2.js"
            integrity="sha256-xzKeRPLnOjN6inNfYWKfDt4RIa7mMhQhOlafengSDvU="
            crossorigin="anonymous">
    </script>
  </head>
</html>

This is just a way of invoking stage #2, which is a chunk of bootstrap JavaScript hosted on an external service and made immutable using SRI. Note that if the hosting service decides to go rogue and start returning different content, the load will fail and the user will be staring at a blank page. But the hosting service cannot successfully cause altered javascript to execute, because of the integrity check enforced by SRI.

Stage #2 itself is also simple. It is a way of invoking stage #3, where the actual content rendering occurs.

var contents = '… contents of stage #3 HTML document …';
document.write(contents);

This replaces the current document with the new HTML contained in the string. The heavy lifting takes place after the third stage has loaded:

  • It will fetch additional javascript libraries, using SRI to guarantee that they cannot be tampered with.
  • In particular, we pull in an existing open-source library from 2017 to emulate SRI for images, since the NFT is an image. This polyfill library supports an alternative syntax for loading images, with the URL and expected SHA256 hash specified as proprietary HTML attributes.
  • Stage #3 also contains a reference to the actual NFT image. But this image is not loaded using the standard <img src="…"> syntax in HTML; that would not be covered by SRI due to the problem of browser support discussed above.
  • Instead, we wait until the document has rendered and kick off a custom script that invokes the JS library to do a controlled image load, comparing the content retrieved by XMLHttpRequest against the integrity check to make sure the server returned our expected NFT (a minimal sketch of this step follows the list).
  • If the server returned the correct image, it will be rendered. Otherwise a brusque modal dialog appears to inform the viewer that something is wrong.
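
Putting the last two bullets into code: below is a minimal sketch of the controlled image load, written against the standard fetch and Web Crypto APIs rather than the XMLHttpRequest-based polyfill the proof-of-concept actually uses. The function name and error handling are illustrative only:

// Fetch the image bytes, verify them against the expected SRI value, and only
// then render from the verified bytes (never from a second, unverified request).
async function loadVerifiedImage(url, expectedIntegrity, imgElement) {
  const response = await fetch(url, { mode: 'cors' });
  const bytes = await response.arrayBuffer();

  const digest = await crypto.subtle.digest('SHA-256', bytes);
  const actual = 'sha256-' + btoa(String.fromCharCode(...new Uint8Array(digest)));

  if (actual !== expectedIntegrity) {
    // The "brusque modal dialog" mentioned above.
    alert('Image failed its integrity check: the server returned altered content');
    return;
  }
  const type = response.headers.get('Content-Type') || '';
  imgElement.src = URL.createObjectURL(new Blob([bytes], { type }));
}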

Putting it all together, here is a data URL encoding an immutable NFT:

data:text/html;charset=utf-8;base64,PCFET0NUWVBFIGh0bWw+CjxodG1sPgogIDxoZWFkPgogICAgPHNjcmlwdCBzcmM9Imh0dHBzOi8vd3d3LmlkZWVzZml4ZXMuaW8vaW1tdXRhYmxlL3N0YWdlMi5qcyIKCSAgICBpbnRlZ3JpdHk9InNoYTI1Ni1YSlF3UkFvZWtUa083eE85Y3ozaExrZFBDSzRxckJINDF5dlNSaXg4MmhVPSIKCSAgICBjcm9zc29yaWdpbj0iYW5vbnltb3VzIj4KICAgIDwvc2NyaXB0PgogIDwvaGVhZD4KPC9odG1sPgo=
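
For the curious, a data URL like this can be produced from the stage #1 template with a one-liner. The sketch below uses the placeholder host and hash from the earlier template; the URL above embeds the author’s actual stage #2 location and hash, so the base64 output will differ:

// Base64-encode the bootstrap HTML into a data URL. Runs in a browser console;
// the resulting string is what gets recorded on-chain as the NFT's URL.
const stage1Html = `<!DOCTYPE html>
<html>
  <head>
    <script src="https://example.com/stage2.js"
            integrity="sha256-xzKeRPLnOjN6inNfYWKfDt4RIa7mMhQhOlafengSDvU="
            crossorigin="anonymous">
    </script>
  </head>
</html>`;

const nftUrl = 'data:text/html;charset=utf-8;base64,' + btoa(stage1Html);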

We can also embed it on other webpages (such as NFT marketplaces and galleries) using an iframe, as in this example:

Embedded NFT viewer

Chrome does not allow navigating the top-level document to a data URL, requiring indirection through the iframe. In this case the viewer itself must be trusted, since it can cheat by pointing the iframe at a bogus URL instead of the correct data URL printed above. But such corruptions are only “local” since other honest viewers will continue to enforce the integrity check.
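
A gallery page could perform that indirection with a few lines of script. This is a hypothetical sketch, not the code of any actual marketplace:

// Point an iframe at the immutable data URL, since Chrome refuses to navigate
// the top-level document to a data: URL directly.
const frame = document.createElement('iframe');
frame.src = nftUrl;          // the data: URL from the sketch above
frame.width = 512;
frame.height = 512;
document.body.appendChild(frame);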

What happens if the server hosting the image were to replace our hypothetical motorcycle NFT with a different picture? Linking to the image with a plain HTTPS URL will display the corrupted NFT. But going through the immutable URL above will detect the tampering attempt and refuse to render the image.

CP

Notes on the Grayscale ETF rejection

Deja vu for bitcoin spot ETFs

[Full disclosure: This blogger worked for a cryptocurrency exchange associated with a past Bitcoin ETF filing]

Few observers were surprised when the SEC rejected yet another bitcoin spot ETF filing in July, adding to a long line of failed attempts going back to the 2017 Winklevoss ETF decision. (Not to be confused with bitcoin futures ETFs, which have already been approved for trading.) Even the sponsor did not seem particularly optimistic about its odds of victory: Grayscale preemptively retained high-profile legal counsel in the week leading up to the decision, gearing up for a protracted court battle.

One silver lining is that the SEC itself emphasizes the procedural nature of the decision, as opposed to being a judgment on whether bitcoin is a suitable investment:

“The Commission emphasizes that its disapproval of this proposed rule change, as modified by Amendment No. 1, does not rest on an evaluation of the relative investment quality of a product holding spot bitcoin versus a product holding CME bitcoin futures, or an assessment of whether bitcoin, or blockchain technology more generally, has utility or value as an innovation or an investment.”

In other words: this is not a reflection on the suitability of bitcoin as an asset class, or even the relative advantages of holding bitcoin directly compared to holding it indirectly via spot or futures ETFs. The SEC did not jump on the not-your-keys-not-your-bitcoin bandwagon and endorse self-custody. The ruling is strictly concerned with structural issues at play for this one particular proposed ETF. Comforting words for the bitcoin faithful but hardly the resounding endorsement the sponsor was looking for. Grayscale Trust has been under increasing pressure to convert from its current structure into an ETF. The recent reversal of its tracking error only added urgency to this filing, raising the stakes for the SEC decision: while GBTC shares used to trade at a comfortable premium over the value of the underlying bitcoin, they are now trading at a significant discount.

The song remains the same

In an 86-page decision heavy on references, the Commission lays out its rationale for the rejection. A closer look at this document reveals an interesting mix of “recycled content” from past rejections of similar ETFs as well as some unique counter-arguments to claims advanced in this particular application. In case this distinguished heritage is not clear, there is footnote #11. Taking up most of the third page and spilling over into the next, it presents a laundry list of past bitcoin ETF rejections: Winklevoss Bitcoin Trust, USBT, WisdomTree, Valkyrie Bitcoin Fund, Kryptoin, SkyBridge ETF, NYDIG Bitcoin ETF, GlobalX, ARK21, One River Carbon Neutral Bitcoin Trust, SolidX, Granite Shares, VanEck, the list goes on. The Grayscale decision hinges on a similar rationale: the applicant has not met its burden under the Exchange Act to demonstrate that the proposal is consistent with the requirements of section 6(b)(5), specifically that the venues where the underlying product—bitcoin— is traded are “designed to prevent fraudulent and manipulative acts and practices” and “to protect investors and the public interest.”

While the ruling cites prior events going as far back as 2017, it is also surprisingly current. One note cites recent work from Trail Of Bits on centralization of public blockchains. Another cites a letter written in support of the Grayscale application dated the 21st— eight days before the timestamp of this document. The SEC spends the first half of the document disputing two claims by NYSE Arca, the exchange that filed the proposal on behalf of the Grayscale ETF. (Here we will use Grayscale as short hand to refer collectively to the trust and its sponsor, even though NYSE Arca is a distinct entity.)

  1. That NYSE Arca has entered into a comprehensive surveillance sharing agreement with a regulated market of significant size
  2. Other alternative countermeasures for market manipulation are in place due to the unique properties of cryptocurrency

Own goals

Several commenters writing to the SEC in support of the Grayscale application argued that bitcoin markets were somehow inherently resistant to manipulation either due to their scale or some unusual transparency property of blockchains. (A curious argument, considering that trading activity itself is not reflected on chain and takes place within the internal, closed ledgers of centralized exchanges.) In a deft move, the Commission uses words straight out of NYSE Arca itself to refute those arguments. On the subject of whether bitcoin markets can be manipulated:

“NYSE Arca acknowledges in its proposal that “fraud and manipulation may exist and that [b]itcoin trading on any given exchange may be no more uniquely resistant to fraud and manipulation than other commodity markets.” NYSE Arca also states that “[b]itcoin is not itself inherently resistant to fraud and manipulation” and concedes that “the global exchange market for the trading of [b]itcoins” […] also “is not inherently resistant to fraud and manipulation.”

That is not exactly helping the case. To be fair, even without this own-goal the SEC had plenty of ammunition to cast doubt on the premise that spot price is immune to manipulation. Among others:

  • Tether, the gift that keeps on giving
  • Continued allegations of wash-trading on offshore, unregulated exchanges
  • Possibility of 51% attack or “hacking of the Bitcoin network”
    This constant refrain about a hypothetical flood of hash-power colluding to rewrite history seems out of place here. While 51% attacks have always been a theoretical possibility, the security of proof-of-work comes from the difficulty of assembling the required level of resources to carry out such an attack. Blithely asserting that bitcoin is subject to 51% attacks is tantamount to saying a Bond villain with a trillion dollars could corner the market for baseball cards. Similarly the SEC ruling cites a statistic about 100 wallets controlling ~15% of all bitcoin in circulation. As previously discussed here, such statistics from on-chain address distribution cannot be used to estimate the level of inequality in bitcoin ownership. One address does not necessarily equal one person or even one institutional investor, especially among the addresses with the highest balances, which typically represent large omnibus wallets pooling funds.

Speaking of supporting letters, the SEC could not resist the temptation to nitpick some of these. Note #100 points out one letter from the Blockchain Association where the commenters claimed the CFTC had been exercising anti-manipulation and anti-fraud enforcement authority over the bitcoin futures market since 2014— a full three years before any bitcoin futures existed for the CFTC to oversee.

Another example of an own-goal comes from Grayscale assertions concerning the “Index Price” and how the associated methodology for aggregating spot prices from multiple exchanges is resistant to manipulation. The SEC points out conflicting statements from the Registration Statement carrying a litany of caveats and qualifications:

“Moreover, NYSE Arca’s assertions that the Trust’s use of the Index helps make the Shares resistant to manipulation conflict with the Registration Statement. Specifically, the Registration Statement represents, among other things, that the market price of bitcoin may be subject to “[m]anipulative trading activity on bitcoin [trading platforms], which are largely unregulated,” and that, “[d]ue to the unregulated nature and lack of transparency surrounding the operations of bitcoin [trading platforms], they may experience fraud, security failures or operational problems, which may adversely affect the value of [b]itcoin and, consequently, the value of the Shares.”

Voluntary vs mandatory compliance

One interesting point made in the SEC ruling is that any mitigations implemented by “constituent platforms” (the four exchanges used for calculating the ETF reference price, namely Coinbase Pro, Bitstamp, Kraken, and LMAX Digital) against market manipulation are entirely at their own discretion. These are not regulated platforms. They have no obligation to continue policing their order-books against suspicious trading activity and reporting bad actors to law enforcement:

“[…] these measures, unlike the Exchange Act’s requirements for national securities exchanges, are entirely voluntary and therefore have no binding force. The Constituent Platforms, including the platform operated by an affiliate of the Custodian, could change or cease to administer such measures at any time”

One counterpoint is that exchanges have very compelling incentives to maintain orderly markets, since manipulative activity reduces investor confidence in the platform, resulting in loss of customers. On the other hand such profit/loss motivations do not carry the same weight as a binding regulation and may cut both ways. In good times they may well drive further investment in market surveillance to boost investor confidence, in a bid to attract more risk-averse investors standing on the sidelines. But the same incentives could mean a troubled exchange facing crypto-winter will cut spending on compliance.

No exchange an island unto itself

While the Grayscale filing sings the praises of the Constituent Platforms and how robust they are against market manipulation, the SEC turns its attention to the rest of the bitcoin spot market. Rightly so— considering that the majority of spot bitcoin trading takes place on unregulated, off-shore exchanges outside these four platforms:

“NYSE Arca focuses its analysis on the attributes of the Constituent Platforms, as well as the Index methodology that calibrates the pricing input generated by the Constituent Platforms […] What the Exchange ignores, however, is that to the extent that trading on spot bitcoin platforms not directly used to calculate the Index Price affects prices on the Constituent Platforms, the activities on those other platforms—where various kinds of fraud and manipulation from a variety of sources may be present and persist—may affect whether the Index is resistant to manipulation. Importantly, the record does not demonstrate that these possible sources of fraud and manipulation in the broader spot bitcoin market do not affect the Constituent Platforms that represent a slice of the spot bitcoin market.”

This is spot on. Even if a particular index is designed to ignore signals from offshore exchanges on the theory that they are easier to manipulate, trading activity on those platforms can still affect the “Constituent Platforms” as long as overlap exists in market participants. Suppose a market-maker simultaneously operates on one of the Constituent Platforms and one of the off-shore exchanges— this is a very common scenario, since such strategies are typically rooted in exploiting price discrepancies across multiple venues. In that case price manipulation taking place on the sketchy offshore platform will also impact prices on the regulated venue, because market-making algorithms will continue to exploit the price difference until it is arbitraged away or until they run out of liquidity. The contagion of price manipulation cannot be confined to one venue in an efficient, interconnected market. (Ironically, trying to argue against this point by insisting that bitcoin markets are so disconnected from each other as to violate the law of one price is self-defeating. It would mean the bitcoin spot market is too inefficient and immature to serve as the basis for any ETF.)

Tracking errors

The ruling also makes a subtle point about pricing: while the Grayscale Trust may use the reference index price to value its holdings, there is no guarantee this is the same valuation the shares will trade at. (In fact there is another, third price that may drift from these two— the price at which Authorized Participants in the fund buy/sell bitcoin on the open market when they are creating/redeeming baskets. Recall that the reference price is a synthetic creation, not an actual quote from an actual exchange where you can execute trades at that price.) Again the SEC points to the historic track record of GBTC having significant tracking errors, trading as high as 142% (!!) over the underlying value of bitcoin holdings and at other times as low as 21% below the same benchmark. The source of these figures? The Registration Statement and 10-K filings from Grayscale. Converting GBTC into an ETF is expected to reduce these over/under tracking errors, but in making that case for approval, Grayscale unwittingly provided the SEC with even more ammunition to question the defense against price-manipulation.

Back to the futures ETF?

If the intrinsic properties of bitcoin or market-surveillance agreements— to the extent they even exist at all Constituent Platforms— are insufficient to deter price manipulation, what else could work? Here the Grayscale filing pinned its hopes on the existence of previously approved bitcoin futures ETFs such as the ProShares Bitcoin Strategy ETF which tracks CME futures. This argument rests on two premises:

  1. CME futures are traded on highly-regulated venues with well established market-surveillance and information sharing agreements, greatly reducing the risk of market manipulation
  2. Any successful manipulation of the proposed ETF must also manipulate the CME futures market in order to affect the price

That would appear to do the trick. No need for additional “surveillance-sharing agreements” or “market of significant size” within the spot market, when any manipulation attempts will be caught by the parallel mechanisms operating in the futures market.

That second argument did not fare well with the SEC. Leaving aside the relative volumes of the spot and futures markets, the main objection concerns lack of evidence around any causal relationship between prices in the two different markets. Here again Grayscale’s own words come back to haunt them: “… there does not appear to be a significant lead/lag relationship” between CME futures and the spot price. This is significant because if spot leads futures, a dishonest participant does not have to trade in the latter in order to influence the former. (On the contrary, their actions in spot markets will eventually also manipulate the futures market as a downstream effect.) Even the comment letters are not helping the cause, with one acknowledging there is “no clear winner” or even noting a bidirectional link where either market can move the other. In an interesting side-discussion the Commission also clarifies what the expectation of futures “leading” the spot market means. Contrary to the straw-man version presented in another comment letter, the SEC criterion does not mean that futures always move first and the spot market responds in a perfectly predictable way afterwards. It is the existence of a statistical relationship that is sought, not a deterministic recipe for 100% reliable arbitrage between two markets.

False equivalences, take #2

Grayscale tried another variant of this strategy of arguing by comparison to already-approved futures ETFs. Suppose that instead of relying on futures ETFs as a bulwark against price manipulation, the premise is turned on its head and we ask whether the spot ETF is any easier to manipulate. Here Grayscale argued that if someone could manipulate the spot ETF, they could also target the futures ETF using the exact same mechanism, since the two rely on very similar price calculations. Ergo, GBTC is no more susceptible to such activity than an investment product previously approved by the SEC. Taking this one step further, disapproving GBTC while having approved the comparable futures ETF would constitute “arbitrary and capricious administrative action.”

On its face this is a compelling argument but the SEC is having none of it. First the order points out that the Bitcoin Reference Rate (BRR) used by CME futures serves a very different purpose:

“While the BRR is used to value the final cash settlement of CME bitcoin futures contracts, it is not generally used for daily cash settlement of such contracts, nor is it claimed to be used for any intra-day trading of such contracts. In addition, CME bitcoin futures ETFs/ETPs do not hold their CME bitcoin futures contracts to final cash settlement; rather, the contracts are rolled prior to their settlement dates. Moreover, the shares of CME bitcoin futures ETFs/ETPs trade in secondary markets, and there is no evidence in the record for this filing that such intra-day, secondary market trading prices are determined by the BRR.”

Also noted in passing: there are two additional spot exchanges (Gemini and ItBit) incorporated into the BRR that are not present in the GBTC reference price, further casting doubt on the assertion of “almost complete overlap.” Yet these are minor quibbles compared to what the SEC points to as the fundamental flaw in the GBTC filing:

“… the Commission’s consideration (and approval) of proposals to list and trade CME bitcoin futures ETPs, as well as the Commission’s consideration (and thus far, disapproval) of proposals to list and trade spot bitcoin ETPs, does not focus on an assessment of the overall risk of fraud and manipulation in the spot bitcoin or futures markets, or on the extent to which such risks are similar. Rather, the Commission’s focus has been consistently on whether the listing exchange has a comprehensive surveillance-sharing agreement with a regulated market of significant size related to the underlying bitcoin assets of the ETP under consideration, so that it would have the necessary ability to detect and deter manipulative activity.”

Even more telling is note #201: An ETF applicant was never required to demonstrate that cryptocurrency possesses some “unique resistance to manipulation” missing from other assets. Such unique properties could serve as an alternative to the original gold standard that the SEC seeks: “surveillance-sharing agreements with a market of significant size.” It is precisely the lack of such comprehensive market surveillance in the bitcoin spot market that has led GBTC on a wild-goose chase to identify some magic property of blockchain assets rendering them intrinsically safe against manipulation techniques.

CP

Address ≠ person: the elusive Gini coefficient of cryptocurrencies

Estimating the distribution of digital assets from on-chain data is not straightforward

A false sense of transparency

The Gini coefficient of blockchains has long been a point of contention among defenders and detractors of cryptocurrency alike. Critics like to point to extreme levels of inequality based on the observed distribution of wealth among blockchain addresses. Far from having democratized access to finance or created a path for wealth accumulation for average investors, they point to these statistics as evidence that blockchains have only enabled another instance of capital concentration. Defenders downplay the significance of such inequality and hold that such disparities do not indicate any fundamental problems with the economics of cryptocurrency. Without picking sides in that ideological debate, this post outlines a different issue: the measures of alleged inequality calculated from blockchain observations are riddled with systemic errors.

Given the transparency of blockchains as a public ledger of addresses and associated balances, the Gini coefficient is very easy to compute in theory. Anyone can retrieve the list of addresses, sort them by associated balance and crunch the numbers. This methodology is the basis of an often-cited 2014 statistic comparing bitcoin to North Korea and more recent attention-grabbing headlines stating that bitcoin concentration “puts the US dollar to shame.” While blockchain statistics are very appealing in their universal accessibility, there are fundamental problems with attempting to characterize cryptocurrency distribution this way.
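
For concreteness, the naive per-address computation is only a few lines; the sketch below sorts observed balances and applies the textbook formula G = (2 · Σ i·xᵢ) / (n · Σ xᵢ) − (n + 1)/n. It is purely illustrative and deliberately embodies the very “one address = one owner” assumption the rest of this post takes issue with:

// Naive Gini coefficient over a list of address balances.
function gini(balances) {
  const sorted = [...balances].sort((a, b) => a - b);   // ascending order
  const n = sorted.length;
  const total = sorted.reduce((sum, x) => sum + x, 0);
  const weighted = sorted.reduce((sum, x, i) => sum + (i + 1) * x, 0);
  return (2 * weighted) / (n * total) - (n + 1) / n;
}

gini([1, 1, 1, 1]);      // 0: perfectly equal
gini([0, 0, 0, 100]);    // 0.75: one address holds everything (1 - 1/n)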

Address ≠ person: omnibus wallets

The first problem is that the transparency afforded in blockchain data only applies at the level of addresses. All of the purported eye-opening measures of inequality (“0.01% of addresses control 27% of funds”) are based on distribution across addresses as the unit of analysis. But an address is not the same thing as a person.

One obvious problem involves omnibus wallets of cryptocurrency service providers, such as centralized exchanges and payment processors. [Full disclosure: This blogger worked at Gemini, a NYC-based exchange and custodian from 2014-2019] For operational reasons, it is more convenient for these companies to pool together funds belonging to different customers into a handful of addresses. These addresses do not correspond to any one person or even the parent corporate entity. The Binance cold-wallet address does not hold the funds of Binance, the exchange itself. Those assets belong to Binance customers, who are temporarily parking their funds at Binance to take advantage of trading opportunities or simply because they do not want to incur the headache of custodying their own funds.

While the companies responsible for these addresses do not voluntarily disclose them, in many cases they have been deanonymized thanks to voluntary sleuthing by users and labelled on blockchain explorers. A quick peek shows that they are indeed responsible for some of the largest concentrations of capital on chain, including four of the top ten accounts by bitcoin balance and similarly five of the top ten for Ethereum as of this writing.

Address ≠ person: smart-contracts

Ethereum in fact adds another twist that accounts for several other high-value accounts: there are smart-contracts holding funds from multiple sources as part of a decentralized application, or dapp. For example, the number one address by balance currently is the staking contract for Ethereum 2.0. This contract is designed to hold in escrow the 32 ETH required as a surety bond from each participant in the next version of Ethereum validation using proof-of-stake. The second highest balance belongs to another smart-contract, this one for wrapped Ether or wETH, which is a holding vehicle for converting the native currency ether into the ERC20 token format used in decentralized finance (“DeFi”) applications. Others in the top 25 correspond to specific DeFi applications such as the Compound lending protocol or the bridge to the Polygon network. None of these addresses are meaningful indicators of ownership by anyone. As such it is surprising that even recent studies on inequality are making meaningless statements such as: “The account with the highest balance in Ethereum contains over 4.16% of all Ethers.” (Depending on when the snapshot was taken, that would be either the Ethereum 2.0 staking contract— now the highest balance with > 7% of all ETH in existence— or the Wrapped Ether contract.) Spurious inclusion of such addresses in the study obviously inflates the Gini coefficient. But even their very existence distorts the picture in a way that cannot be remedied by merely excluding that data point. After all the funds at that address are real and belong to the thousands of individuals who opted into staking or decided to convert their ether into wrapped-ether for participating in DeFi venues. All of these funds would have to be withdrawn and redistributed back to their original wallets to accurately reflect the ownership information that is currently hidden behind the contract.

Investors: retail, institutional and imaginary

On the other extreme, a single person can have multiple wallets, distributing their funds across multiple addresses. Interestingly enough, this can skew the result in either direction. If a single investor with 1000BTC splits that sum equally among a thousand addresses, counting each one as a unique individual will create the appearance of capital distributed in more egalitarian terms. But it may also go in the other direction. Suppose an investor holding 1 bitcoin splits that balance unevenly across ten addresses: the “primary” wallet gets the lion’s share at 0.90BTC while all others split the remainder. While keeping the total balance constant, this rearrangement has created several phantom “cryptocurrency owners,” each holding a marginal amount of bitcoin consistent with the narrative of a high Gini coefficient.
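
Using the gini() sketch from earlier, the phantom-owner effect in this example is easy to quantify:

gini([1]);                                // 0: one holder, no measurable inequality
gini([0.9, ...Array(9).fill(0.1 / 9)]);   // ≈ 0.8: the same single bitcoin, now "highly unequal"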

A different conceptual problem is that even for addresses with a single owner, that owner may be an institutional investor such as a hedge-fund or asset manager. Once again, the naive assumption “one address equals one person” results in overestimating the Gini coefficient when the address represents ownership by hundreds or thousands of persons. (In the extreme case, once sovereign-wealth funds start allocating to cryptocurrency, a single blockchain address could literally represent millions of citizens of a country as stakeholders.) It’s as if an economist tried to estimate average savings in the US by looking at the balance of every checking account at a bank, without distinguishing whether the account belongs to a multinational corporation or an ordinary citizen.

Getting the full picture

More subtly, looking at each blockchain in isolation does not paint an accurate picture of total cryptocurrency ownership overall. In traditional finance some amount of positive correlation is expected across different asset types. Investors holding stocks are also likely to have bonds as part of a balanced portfolio. But cryptocurrency has sharp ideological divides that may result in negative correlation where it matters most. If bitcoin maximalists frown upon the proliferation of dubious ICOs for unproven applications while Web3 junkies consider bitcoin the MySpace of cryptocurrency, there would be little overlap in ownership. In this hypothetical universe the correlation is negative: an investor holding BTC is less likely to hold ETH. In that scenario Bitcoin and Ethereum may both have high inequality when measured in isolation while the combined holdings of investors across both chains exhibit a more egalitarian distribution. It is possible to aggregate assets within a chain, by taking into account all tokens issued on that chain. For example a single notional balance in US dollars can be calculated for each ethereum address by taking into account all token balances for that address, as maintained in the ERC20 smart-contract responsible for tracking each asset. But this does not work across chains. There is no reason to expect the correlation between different ERC20 holdings— arguably closer in spirit to each other as utility tokens for various definitions of “utility”— to hold between ethereum and bitcoin.

Better data: paging cryptocurrency exchanges

Is there a better way to estimate the Gini coefficient than this naive accounting by address? The short answer is yes but it relies on closed data-sets. Centralized cryptocurrency exchanges such as Binance are in a better position to measure inequality using their internal ledgers. While an omnibus account may appear as a handful of high-balance addresses to external observers, the exchange knows exactly how those totals are allocated to each customer. Most exchanges also perform some type of identity validation on customers to comply with KYC/AML regulations, so they can distinguish between individual and institutional investors. This allows excluding institutional investors but at the risk of introducing a different type of distortion. If high net-worth individuals are investing in cryptocurrency through institutional vehicles such as family-offices and hedge funds, focusing on individual investors will bias the Gini coefficient down by removing outliers from the dataset. Finally, exchanges have a comprehensive view into balances of their customers across all assets simultaneously so they can arrive at an accurate total across chains and even fiat equivalents. (If a customer is holding dollars or euros at a cryptocurrency exchange, should that number be included in their total balance? What if they are holding stable-coins?) These advantages can yield a more precise estimate on exactly how unequal cryptocurrency ownership is, modulo some caveats. If customers subscribe to the “not your keys, not your bitcoin” school of custody and withdraw all cryptocurrency to their own self-hosted wallet after every purchase, the exchange will underestimate their holdings. Similarly, customers holding assets at multiple exchanges— for example holding bitcoin at both Binance and FTX— will result in both providers underestimating the balance. Even with these limitations, getting an independent datapoint from a large-scale exchange would go a long way towards sanity-checking the naive estimates put forward based on raw blockchain data alone. It remains to be seen if any exchange will step up to the plate.

CP

Of Twitter bots, Sybil attacks and verified identities

Seeking a middle-ground for online privacy

The exact prevalence of bots has become the linchpin of Elon Musk’s attempt to bail out on the proposed acquisition of Twitter. Existence of bots is not disputed by either side; the only question is what percent of accounts these constitute. Twitter itself puts the figure around 5%, using a particular metric called “monetizable daily active users” or mDAU for calculating the ratio. Mr. Musk disputes that number and claims it is much higher, without citing any evidence despite having obtained access to raw data from Twitter for carrying out his own research.

Any discussion involving bots and fake accounts naturally leads to the question: why is Twitter not verifying all accounts to make sure they are actual humans? After all the company already has a concept of verified accounts sporting a blue badge, to signal that the account really belongs to the person it is claiming to be. This deceptively simple question leads into a tangle of complex trade-offs around exactly what verification can achieve and whether it would make any difference to the problem Twitter is trying to solve.

First we need to clarify what is meant by bot accounts. Suppose there is a magical way to perform identity verification online. While not 100% reliable, cryptocurrency exchanges and other online financial platforms are already relying on such solutions to stay on the right side of Know Your Customer (KYC) regulations. These include a mix of collecting information from the customer— such as the time-honored abuse of social security numbers for authentication— uploading copies of government-issued identity documents and cross-checking all this against information maintained by data brokers. None of this is free but suppose Twitter is willing to fork over a few dollars per customer on the theory that the resulting ecosystem will be much more friendly to advertisers. Will that eliminate bots?

The answer is clearly no, at least not according to the straightforward definition of bots. Among other things, nothing stops a legitimate person from going through ID verification and then handing control of their account to a bot. There need not be any nefarious intent behind this move. For example, it could be a journalist who sets up the account to tweet links to their articles every time they publish a new one. In fact the definition of “bot” itself is ambiguous. If software is designed to queue up tweets from the author and publish them verbatim at specific future times, is that a bot? What if the software augments or edits human-authored content instead of publishing it as-is? Automation is not the problem per se. Having accounts controlled by software— even software that is generating content automatically without human intervention— may be perfectly benign; a minimal sketch of such automation follows the list below. The real questions are:

  1. Who is really behind this account?
  2. Why are they using automation to generate content?

Motivation is ultimately unknowable from the outside, but the first question can be traced to a name, either a person or a corporate entity. Until such time as we have sentient AI creating its own social-media accounts, there is going to be someone behind the curtain, accountable for all content spewing from that account. Identity verification can point to that person pulling the levers. (For now we disregard the very real possibility of verified accounts being taken over, or even deliberately resold to another actor by the rightful owner.) But that knowledge alone is not particularly useful. What would Twitter do with the information that “nickelbackfan123” is controlled by John Smith of New York, NY? Short of instituting a totalitarian social credit system along the lines of China’s to gate access to social networks, there is no basis for turning away Mr. Smith or treating him differently than any other customer. Even if ID verification revealed that the customer is a known persona non grata to the US government— a fugitive on the FBI most-wanted list or an OFAC-sanctioned oligarch— Twitter has no positive obligation to participate in some collective punishment process by denying them an online presence. Social media presence is not a badge of civic integrity or proof of upstanding character, a conclusion entirely familiar to anyone who has spent time online.

But there is one scenario where Twitter can and should preemptively block account creation. Suppose this is not the first account but the 17th one Mr. Smith is creating, with all the other accounts still active. (If this were a case of starting over, there would be little to object to; after all, in America we stand for second acts and personal reinvention.) When one person simultaneously controls dozens of accounts, the potential for abuse is high— especially when that link is not visible to followers. Looked at another way: there is arguably no issue with a known employee of the Russian intelligence agency GRU registering for a Twitter account and using their presence to push disinformation. The danger comes not from the lone nut-job yelling at the clouds— that is an inevitable part of American politics— but from one person falsely amplifying their message using hundreds of seemingly independent sock-puppet accounts. In the context of information security, this is known as a “Sybil attack:” one actor masquerading as thousands of different actors in order to confuse or mislead systems where equal weight is given to every participant. That makes a compelling case for verified identities online: not stopping bad actors from creating an account, but stopping them from creating the second, third or perhaps the one-hundredth sock-puppet account.
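What such a policy could look like in code: a sketch that keys account creation on the verified identity rather than an email address or IP. The threshold of three accounts is an arbitrary number chosen purely for illustration, not a recommendation.

```python
# Sketch: throttle account creation per verified identity (hypothetical policy).
from collections import defaultdict

MAX_ACCOUNTS_PER_IDENTITY = 3              # arbitrary illustrative threshold
accounts_by_identity = defaultdict(set)    # verified identity -> set of handles

def create_account(verified_identity, handle):
    existing = accounts_by_identity[verified_identity]
    if len(existing) >= MAX_ACCOUNTS_PER_IDENTITY:
        return False        # the Nth sock-puppet stops here
    existing.add(handle)
    return True

assert create_account("John Smith, New York NY", "nickelbackfan123")
assert create_account("John Smith, New York NY", "fido_the_dog")
assert create_account("John Smith, New York NY", "acme_corp_pr")
assert not create_account("John Smith, New York NY", "independent_voice_17")
```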

There is no magic “safe” threshold for duplicate accounts; it varies from scenario to scenario. Insisting on a one-person-one-account policy is too restrictive and does not take into account— no pun intended— the use of social media by companies, where one person may have to represent multiple brands in addition to maintaining their own personal presence. Even when restricting our attention to individuals, many prefer to maintain a separation between work and personal identities, with separate social media accounts for different facets of their life. Pet lovers curate separate accounts for their favorite four-legged companions— accounts that often eclipse their own “real” stream in popularity. If we contain multitudes, it is only fair that Twitter allow a multitude of accounts. In other cases, even two is too many. If someone is booted off the platform for violating terms of service, posting hate speech or threatening other participants, they should not be allowed to rejoin under another account. (Harder question: should all personal accounts associated with that person on the platform be shuttered? Does Fido the dog get to keep posting pictures if his companion just got booted for spreading election conspiracies under a different account?)

Beyond real-names

So far the discussion about verified identity focused only on the relationship between an online service such as Twitter and an individual or corporation registering for an account on that platform. But on social media platforms, the crucial connections run laterally, between different users of the platform as peers. It is one thing for Twitter to have some assurance about the real world identity connected to a user. What about other participants on the platform?

One does not have to look back too far to see a large-scale experiment in answering that question in the affirmative and evaluating how well it turned out. Google Plus, the failed social-networking experiment Google designed to compete against Facebook, is today best remembered as the punchline to jokes— if it is remembered at all. But at the time of its launch, G+ was controversial for insisting on the use of “real names”. Of course the company had no way to enforce this at the time. Very few Google services interacted with real-world identities by requiring payment or involving existing financial institutions. (The use of a credit card suddenly allows for cross-checking names against those already verified by another institution such as a bank. While there is no requirement that the name on a credit card be identical to the one appearing on government-issued ID, it is a good proxy in most cases.) Absent such consistency checks, all Google could do was insist that the same name be used across all services— if you are sending email as “John Smith” then your G+ name shall be John Smith. Given how ineffective this is at stopping users from fabricating names at the outset, there had to be a process for flagging accounts violating the rule. That policing function was naturally crowd-sourced to customers, with the expectation that G+ users would “snitch” on each other by escalating to customer support with a complaint when they spotted users with presumably fake names. While it is unclear whether this half-baked implementation would have prevented G+ from turning into the cesspool of conspiracy theories and disinformation that Facebook evolved into, it certainly resulted in one predictable outcome: haphazard enforcement, with allegations of real-names violations used to harass individuals defending unpopular views. In a sense G+ combined the worst of both worlds: weak, low-quality identity verification by the platform provider coupled with a requirement for consistency between this “verified” identity known to Google and the outward projection visible to other users.

Yet one can also imagine alternative designs that decouple identity verification from the freedom to use pseudonyms or assumed nicknames. Twitter could be 100% confident that the person who signed up is a certain John Smith from New York City in the offline world, while still allowing that customer to operate under a different name as far as all other users are concerned. This affords a reasonable compromise between freedom in expressing identity and discouraging abuse: if Mr. Smith is booted from the platform for threatening speech under a pseudonym, he is not coming back under any other pseudonym. (There is also an additional deterrence factor at play: if the behavior warrants referral to law enforcement, the platform can provide meaningful leads on the identity of the perpetrator, instead of an IP address to chase down.)
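A sketch of that decoupling, with hypothetical field names: the legal identity stays private to the platform and carries any sanctions, while the pseudonym is all that other users ever see.

```python
# Sketch: verified identity is private and carries bans; display name is public.
class IdentityRegistry:
    def __init__(self):
        self.banned = set()     # private: verified legal identities under a ban
        self.profiles = {}      # public: handle -> display name (pseudonym allowed)

    def register(self, verified_identity, handle, display_name):
        if verified_identity in self.banned:
            return False        # no return under a fresh pseudonym
        self.profiles[handle] = display_name
        return True

    def ban(self, verified_identity):
        self.banned.add(verified_identity)

reg = IdentityRegistry()
reg.register("John Smith, New York NY", "nyc_hot_takes", "CryptoSkeptic_NYC")
reg.ban("John Smith, New York NY")
assert not reg.register("John Smith, New York NY", "fresh_start_2023", "BrandNewVoice")
```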

This model still raises some thorny questions. What if John Smith deliberately adopts the name of another person in their online profile to mislead other participants? What if the target of impersonation is a major investor or political figure whose perceived opinions could influence others and impact markets? Even the definition of “impersonation” is unclear. If someone is publishing stock advice under the pseudonym “NotWarrenBuffett,” is that parody or deliberate attempt at market manipulation? But these are well-known problems for existing social media platforms. Twitter has developed the blue checkmark scheme to cope with celebrity impostors: accounts with the blue check have been verified to be accurately stating their identity while those without are… presumably suspect?

That leads to one of the unintended side-effects of ubiquitous identity verification. Discouraging the use of pseudonyms (because participants using a pseudonym are relegated to second-class citizenship on the platform compared to those using their legal name) may have a chilling effect on expression. This is less a consequence of verified identities and more about the impact of making the outcome of that process prominently visible— the blue badge on your profile. Today the majority of Twitter accounts are not verified. While the presence of a blue badge elevates trust in a handful of accounts, its absence is not perceived as casting doubt on the credibility of the speaker. This is not necessarily by design, but an artifact of the difficulty of doing robust verification at scale (just ask cryptocurrency exchanges), especially for a service reliant on advertising revenue, where there is no guarantee the sunk cost can be recouped over the lifetime of the customer. In a world where most users sport the verification badge by agreeing to include their legal name in a public profile, those dynamics get inverted: not disclosing your true identity will be seen as suspect and reduce the initial credibility assigned to the speaker. Given the level of disinformation circulating online, that increased skepticism may not be a bad outcome.

CP

Logical access and the security theater of data-nativism

Data-center address as security guarantee

WSJ recently quoted a spokesman for Binance.US stating that all US customer data is stored on servers located in the US. The subtext of this remark is that, by exclusion, customer information is not stored in China, an attempt to distance the company from concerns around the safety of customer information. Such new-found obsession with “data terroir” is a common interpretation of the data-sovereignty doctrine, which holds that information collected from citizens of a particular country must remain within that country’s borders and subject to its privacy regulations. While the concept predates the Snowden revelations of 2013, it was given renewed urgency after disclosures of US surveillance programs leveraging the massive data collections hoarded by private companies, including Google, MSFT and Yahoo among others named as participants in the mysterious PRISM and “upstream” collection programs. [Full disclosure: this blogger was a member of the Google security team from 2007-2013]

Data-sovereignty is a deceptively simple solution: if Google is forced to store private information of German citizens on servers physically located in Germany, the argument goes, then the NSA— or its counterparts in China, Russia or whichever foreign-policy boogeyman looms large in the imagination on a given day— can not unilaterally seize that data without going through the legal framework mandated by German law. This comforting narrative makes no sense from a technology perspective. (It is debatable whether it ever made sense in other terms, including lawful-access frameworks: the NSA is legally barred from conducting surveillance on US soil, so moving data out of the US into foreign countries amounts to declaring open season on those assets.) To explain why, we need to distinguish between two types of access: physical and logical.

One note about the hypotheticals explored here: the identity of the private companies hoarding sensitive customer information and the alleged boogeyman going after that stash varies according to the geopolitical flavor of the day. After Snowden, US tech giants were cast as either hapless victims or turncoat collaborators depending on your interpretation, while the NSA conveniently assumed the role of the arch-villain. For the purpose of this blog post we will use Binance/US and China as the stand-ins for these actors, with the full expectation that in a few years these examples will appear quite dated.

Physical access vs logical access

Imagine there is a server located inside a data-center in the middle of nowhere, as most datacenters are bound to be for proximity to cheap hydropower and low real-estate costs. This server is storing some important information you need to access. What are your options?

1. You can travel to the datacenter and walk up to the server directly. This is physical access. It unlocks some very convenient options. Server is stuck, not responding to requests? Press the power button and power-cycle it. Typical rack-mounted servers do not have a monitor, keyboard, mouse or any other peripherals attached for ease of use. But when you are standing next to the machine, you can connect anything you want. This allows getting an interactive shell and using the server as a glorified workstation. One could even attach removable storage such as a USB thumb-drive for conveniently copying files. In a pinch, you could crack open the server chassis and pocket one of the disk drives to hoover up its contents. As an added bonus: if you walk out of the datacenter with that drive, the task of reading its contents can be tackled later from the comfort of your office. (Incidentally, the redundancy in most servers these days means that they will continue ticking along as if nothing happened after the removal of the drive, since they are designed to tolerate failure of individual components and “hot-swapping” of storage.) But all of this flexibility comes at a high cost. First you have to travel to the middle of nowhere, which will likely involve a combination of flying and driving, then get past the access controls instituted by the DC operator. For the highest level of security in a tier-4 datacenter, that typically involves both an ID badge and biometrics such as palm scans for access to restricted areas. Incidentally the facility is covered with cameras everywhere, resulting in a permanent visual record of your presence, lest there be any doubt later about what happened.

2. Alternatively you can access the server remotely over a network using a widely deployed protocol such as SSH, RDP or IPMI. This is logical access. For all intents and purposes, the user experience is one of standing next to the machine staring at a console, minus the inconvenience of standing in an uncomfortable, noisy, over-air-conditioned, fluorescent-lit datacenter aisle. Your display shows exactly the same thing you would see if you were logged into the machine with a monitor attached, modulo some lag due to the time it takes for the signal to travel over the network. You can type commands and run applications exactly as if you had jury-rigged that keyboard/mouse/monitor setup with physical access. Less obvious is that many actions we typically associate with physical access can also be done remotely. Need to connect an exotic USB gadget to the remote server? Being thousands of miles away from the USB port may look like a deal-breaker, but it turns out modern operating systems have the ability to virtually “transport” USB devices over a network. USB forwarding has been supported by the Windows Remote Desktop Protocol (RDP) for over a decade, while the usbip package provides a comparable solution on Linux. Need to power on a server that has mysteriously shut down, or reset one that has gotten wedged and is not responding to network requests? There is a protocol for that too: IPMI. (IPMI runs on a separate chip called the “baseboard management controller” or BMC located inside the server, so the server must still be connected to power and have a functioning network connection for its BMC, which happens to be the usual state of affairs in a data-center.) Need to tweak some firmware options or temporarily boot into a different operating system from a removable drive? IPMI makes that possible too.
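To make this concrete, here is a short sketch of those “hands-on” chores driven entirely over the network. The ipmitool and usbip invocations are the standard ones; the hostnames, credentials and USB bus ID are placeholders.

```python
# Sketch: remote "physical" maintenance using standard out-of-band tooling.
# Hostnames, credentials and bus ID below are placeholders.
import subprocess

BMC = ["ipmitool", "-I", "lanplus", "-H", "bmc.example.internal", "-U", "admin", "-P", "hunter2"]

def run(cmd):
    print("$", " ".join(cmd))
    subprocess.run(cmd, check=True)

run(BMC + ["chassis", "power", "status"])   # is the wedged server even powered on?
run(BMC + ["chassis", "power", "cycle"])    # remote equivalent of pressing the power button
run(BMC + ["chassis", "bootdev", "pxe"])    # boot something else on the next restart

# Attach a USB device exported by a remote machine as if it were plugged in locally
run(["usbip", "attach", "-r", "usb-export.example.internal", "-b", "1-1.2"])
```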

The only prerequisite for having all these capabilities at your fingertips from anywhere in the world is the foresight to have configured the system for remote access ahead of time. Logical access controls define which services are available remotely (e.g. SSH vs IPMI), who is allowed to connect, what hoops they must jump through in order to authenticate— there is likely going to be a VPN or Virtual Private Network at the front door— and finally what privileges these individuals attain once authenticated. The company running that server gets to define these rules. They are completely independent of the physical access rules enforced by the datacenter operator, which may or may not even be the same company. Those permitted to remotely access servers over a network could be a completely different set of individuals than those permitted to step onto the datacenter floor and walk up to that same server in real life.

Attack surface of logical access

Logical access is almost as powerful as physical access when it comes to retrieving data, with the convenience of working from anywhere in the world. In some cases it is even more convenient. Let’s revisit the example from the previous section, of walking into a datacenter and physically extracting a couple of disk drives from a server with the intention of reading their contents. (We assume the visitor resorted to this smash-and-grab option because they did not have the necessary credentials to log in to the server and access the same data the easy way, even while standing right next to it.) There are scenarios where that problem is not straightforward, such as when disk encryption is used or the volumes are part of a RAID array that must be reconstructed in a particular manner. Another challenge is retrieving transient data that is only available in memory, never persisted to disk. There are ways to do forensic memory acquisition from live systems, but the outcome is a snapshot that requires painstaking work to locate the proverbial needle in the haystack of a full memory dump. By comparison, if one could log in to the server as a privileged user, with a few commands the running application could be reconfigured to start logging the additional information somewhere for easy retrieval.

There is another reason logical access beats physical access: it is easier to hide. Logical access operates independently of physical access: there is no record of anyone getting on an airplane, driving up to the datacenter gates, pressing their palm on the biometric scanner or appearing on surveillance video wandering the aisles. The only audit trails are those implemented by the software running on those servers, easily subject to tampering once the uninvited visitors have full control over the system.

Data-nativism as security theater

This distinction between physical and logical access explains why the emphasis on datacenter location is a distraction. Situating servers in one location or another may influence physical access patterns but has no bearing on the far more important dimension of logical access. Revisiting the Binance/US example from the introduction to illustrate this, there are three threat models depending on the relationship between the company and alleged threat actor.

  1. Dishonest, outright colluding with the adversary to siphon US customer data
  2. Honest but helpless in the face of coercion from a foreign government to hand-over customer data
  3. Honest but clueless, unaware that an APT associated with a foreign nation has breached its infrastructure to collect customer data in an unauthorized manner

In the first case it is clear that the location of data-centers is irrelevant. Binance/US employees collectively have all necessary physical and logical access to copy whatever customer information is requested and turn it over to the authorities.

The second case is identical from a capability standpoint. Binance/US employees are still in a position to retrieve customer data from any system under their control, regardless of its geographic location. The only difference is a legal norm that such requests be channeled through US authorities under an existing Mutual Legal Assistance Treaty (MLAT). If China seeks information from a US company, the theory goes, it will route the request through the DOJ, which is responsible for applying appropriate safeguards under the Fourth Amendment before forwarding the request to the eventual recipient. This is at best wishful thinking under the assumptions of our scenario— a rogue regime threatening private companies with retaliation if they do not comply with requests for access to customer information. Such threats are likely to bypass official diplomatic channels and be addressed to the target directly. (“It would be unfortunate if our regulators cracked down on your highly profitable cryptocurrency exchange.”) For-profit organizations on the receiving end of such threats will be disinclined to take a stand on principle or argue the nuances of due process. The relevant question is not whether data is hosted in a particular country of concern, but whether the company and/or its employees have significant ties to that country such that they could be coerced into releasing customer information through extra-judicial requests.

A direct attack on Binance infrastructure is the one scenario where geography would most likely come into play. Local jurisdiction certainly makes it easier to stage an all-out assault on a data-center and walk out with any desired piece of hardware. But as the preceding comparison of physical and logical access risks indicates, remote attacks using software exploits are a far more promising avenue than kicking in the door. If the government of China wanted to seize information from Binance, it is extremely unlikely to involve a SWAT-style smash-and-grab raid. Such overt actions are impossible to conceal; data-center facilities are some of the most tightly controlled and carefully monitored locations on the planet. Even if the target is motivated by PR concerns to conceal news of such a raid, even limited knowledge of the incident breaks a cardinal rule of intelligence collection: not letting the adversary realize they are being surveilled. If nothing else, the company may think twice about placing additional infrastructure in the hostile country after the first raid. By comparison, purely digital attacks exploiting logical access can go undetected for a long time, even indefinitely depending on the relative level of sophistication between attacker and defender. With the victim none the wiser, compromised systems continue running unimpeded, providing attackers an uninterrupted stream of intelligence.

Physical to logical escalation: attacker & defender view

This is not to say that location is irrelevant. Putting servers into hostile territory can amplify risks involving logical access. One of the more disturbing allegations from the Snowden disclosures involves Google getting sold out by Level3, the ISP hired to provide network service to Google data-centers. Since Google at the time relied on a very naive model of internal security and traffic inside the perimeter was considered safe to transit without encryption, this would have given the NSA access to confidential information bouncing around the supposedly “trusted” internal network. Presumably a compliant ISP in China would be even more willing to arrange such access to its customers’ private fiber connections than one located overseas. Other examples involve insider risks and more subtle acts of sabotage. For example, the Soviet Union was able to hide listening devices within the structure of the US embassy in Moscow, not to mention backdooring typewriters sent out for repair. Facilities located on foreign soil are more likely to have employees and contractors acting at the behest of local intelligence agencies. These agents need not have any formal role that grants them access; recall the old adage that at 4AM the most privileged user on any computing system is the janitor.

One silver lining here is that risks involving pure physical access have become increasingly manageable with additional security technologies. Full-disk encryption means the janitor can walk away with a bundle of disk drives, but not read their contents. Encryption in transit means attackers tapping network connections will only observe ciphertext instead of the original data. Firmware controls such as secure boot and measured boot make it difficult to install rogue software undetected, while special-purpose hardware such as hardware security modules and TPMs prevent even authorized users from walking away with high-value cryptographic keys.
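The principle behind the first of those mitigations fits in a few lines. This sketch uses the cryptography library’s Fernet construction at the application layer rather than true block-level full-disk encryption, but the effect on the drive-pocketing janitor is the same: without the key, the stolen bytes are noise.

```python
# Sketch: encrypt data before it ever touches persistent storage.
# Requires the third-party "cryptography" package.
from cryptography.fernet import Fernet

key = Fernet.generate_key()       # in practice held in an HSM/TPM, never on the same disk
box = Fernet(key)

plaintext = b"customer ledger: alice holds 3 BTC"
ciphertext = box.encrypt(plaintext)            # only this ever hits the platter

assert box.decrypt(ciphertext) == plaintext    # readable only with the key
```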

Confidential computing takes this model to its extreme conclusion. In this vision customers can run their applications on distant cloud service providers and process sensitive data, all the while being confident that the cloud provider can not peek into that data or tamper with application logic— even when that application is running on servers owned by that provider, with firmware and hypervisors again in the control of the same untrusted party. This was not possible using vanilla infrastructure from providers such as AWS or Azure. Only the introduction of new CPU-level isolation models such as Intel SGX enclaves or AMD SEV virtual machines has made it possible to ask whether trust in the integrity of a server can be decoupled from physical access. Neither has achieved clear market dominance, but both approaches point towards a future where customers can locate servers anywhere in the world— including hostile countries where local authorities are actively seeking to compromise those devices— and still achieve some confidence that software running on those machines continues to follow expected security properties. Incidentally, this is a very challenging threat model. It is no wonder both Intel and AMD have stumbled in their initial attempts. SGX has been riddled with vulnerabilities. (In a potential sign of retreat, Intel is now following in AMD’s footsteps with an SEV competitor called Trust Domain Extensions, or TDX.) Earlier iterations of SEV have not fared any better. Still, it is worth remembering that Intel and AMD are trying to solve a far more challenging security problem than the one faced by companies that operate data-centers in hostile countries, as in the case of Apple and China. Apple is not hosting its services out of some AWS-style service managed by the CCP in a mysterious building. While a recent NYT investigation revealed Apple made accommodations for guanxi, the company retains extensive control over its operational environment. Hardware configured by Apple is located inside a facility operated by Apple, managed by employees hand-picked by Apple, working according to rules laid down by Apple, monitored 24/7 by security systems overseen by Apple. That is a far cry from trying to ascertain whether a black box in a remote Amazon AWS datacenter you can not see or touch— much less have any say over its initial configuration— is working as promised.

Beyond geography

Regulators dictating where citizens’ private information must be stored and companies defending their privacy record by stressing where customer data is not stored both share in the same flawed logic. Equating geography with data security reflects a fundamental misunderstanding of the threat model, focusing on physical access while neglecting the far more complex problems raised by the possibility of remote logical access to the same information from anywhere in the world.

CP