The economic security of blockchains: lessons from the Ethereum Classic attack (part I)

In early January, the Ethereum Classic blockchain experienced double-spend attacks created by deep reorganizations of the blockchain. While the perpetrators’ identity remains a mystery although research from SlowMist suggests that exchanges involved in the incident may have additional information for locating them there is enough information about the attack to draw general conclusions about the modus operandi and implications for other blockchains. This event has upended core assumptions made in previous economic analysis of blockchain security, defined as the cost/benefit calculus for mounting successful attacks against the integrity of the ledger.

Double-spending in the abstract

In one sense, there is nothing surprising or novel about the attack vector. The vulnerability of blockchains to hijacking by overwhelming mining power had always been acknowledged, even in the original Bitcoin paper by Satoshi. Such attacks have even been observed on other alt-coins in the wild, most notably Bitcoin Gold in May 2018. To recap how such a double-spending attack works in the abstract using the standard cast of cryptographic characters:

  1. Alice and Bob agree to trade using cryptocurrency.
  2. Alice pays Bob by broadcasting a transaction on the Ethereum Classic (ETC) blockchain
  3. Bob waits until this transaction has been “mined” and confirmed, getting buried several blocks deep in the chain history
  4. After the transaction is confirmed, Bob in return supplied Alice with a product or service, such as shipping her a painting or even giving her a bundle of USD cash.
  5. Unbeknownst to Bob at this stage, Alice had been secretly mining an alternative history of the ETC blockchain, with more hash-power than all other miners combined. In this parallel universe, the transaction sending ETC from Alice Bob never happened.
  6. Given that Alice commands a majority of the hash rate on the network (so-called “51% attack”) her alternative chain will eventually catch up and overtake the existing chain, measured in terms of the objective metric used by all participating nodes to pick a winner among competing histories.
  7. When the blockchain “snaps” to this alternative version created by Alice, Bob no longer has possession of the ETC funds he believed Alice had sent; that transfer has effectively been erased from history.
  8. Alice meanwhile still has the cash Bob provided in exchange, in addition to her original ETC funds.

There is one technicality here: as stated, the transaction broadcast in step #5 would still be valid on the new, revisionist history orchestrated by Alice. That means Bob could try broadcasting it again, after Alice stops mining new blocks and control of the network reverts back to honest miners. To lock in the theft for good, Alice would include another transaction in her version of events to break compatibility. Even a seeming useless transaction sending the same funds back to herself (Alice Alice) will do the trick; although, the scenario gets more interesting if the same funds are used for another third-party transaction, as we will shortly see. That is the origin of the slightly confusing term double-spending, since the same money is seemingly being used twice. Even though any single version of blockchain history only allows single spending, by creating temporary confusion about what the “true” version of history is, Alice convinces different people at different times that the funds are theirs.

Attacker perspective: ideal targets

Is this attack feasible? It depends on the relative cost of achieving 51% of hash-rate “temporarily,” compared to the expected benefits measured by the value of the goods/services purchased without having to fork over any Ethereum Classic. Hash-rate costs are subject to complex dynamics, involving the specific design of the proof-of-work function used in a blockchain and available spare capacity up for grabs for the higher-bidder on markets such as NiceHash. But assuming an attacker can assemble enough fire-power to revise blockchain history, there is a natural choice of target to go after, a stand-in for the character “Bob:” cryptocurrency exchanges. Alice can deposit ETC on an exchange, trade that to a different cryptocurrency such as Bitcoin (BTC) and withdraw BTC. Note that the 51% attack on ETC chain has no effect on BTC, short of any general market unease caused by an unrelated coin being attacker. (In fairness, there is strong correlation across different cryptocurrencies and past events such as the Ethereum DAO debacle have caused broad declines across the market, even in asset classes having nothing to do with buggy smart-contracts.) The attacker gets to keep the Bitcoin, while the exchange is out of ETC after the deposit is reversed. There is no doubt an interesting legal question here on whether the exchange will eat the loss or if those losses can somehow be passed on to counter-parties involved in the BTC/ETC trade. But that is immaterial for our purposes: the bottom line is that someone has been scammed.

This is the exact playbook observed in the ETC double-spend attack. One of the primary targets has been identified as the exchange Gate.IO, with the attackers converting the proceeds to bitcoin for withdrawal. To the extent there is any element of surprise here, is the fact that such an attack could involve a relatively major currency in the top 20 by market capitalization. It is one thing to prey on Bitcoin Gold or some other thinly-traded altcoin with negligible hash rate. It is another level of capability to amass enough hash-rate to overpower Ethereum Classic, with the unspoken question: which other cryptocurrencies are vulnerable to copy-cat attacks?

Wrong assumptions about costs and benefits

The existence of economic limits to the security of blockchains is not exactly news. It is an accepted risk that Bond villains with unbounded resources can amass sufficient hash-power to overwhelm any blockchain. Without disputing this fact, the standard response to these arguments has been around economic incentives: why would an attacker spend that much money if they can not recoup even greater value by carrying out the attack? Putting aside the fact that this is not an entirely comforting answer some “attackers” may be motivated by ideology, with the standard example being governments willing to sabotage public blockchains for the greater cause of enforcing capital controls, even when that undertaking would be extraordinarily costly it leads down the rabbit hole measuring costs and benefits.

Notable, this attack on ETC debunked two premises that go into these models:

  1. Attacker costs include devaluation of their own holdings. It has been an article of faith that miners would not collude to execute a 51% attack because doing so would lead to decrease of confidence in the currency, resulting in the notional value of their own mining rewards going down. In effect, 51% attacks are treated as short-sighted move that may temporarily boost returns but only at the cost of much greater losses down the road. This was rooted in an implicit assumption that perpetrators of the attack are still wedded to that particular cryptocurrency after all, Alice in the example above is still holding ETC after her double-spend attack has been successfully executed. But the presence of public cryptocurrency exchanges with high liquidity in their order books voids that assumption. Alice can wash her hands clean of ETC, fully cashing out all of her holdings to some other cryptocurrency with nary a care for how far ETC plunges in the aftermath.
  2. Attacker costs also include the depreciation of their specialized mining equipment, which has zero value for any application except mining that specific cryptocurrency. Because the most efficient way to mine Bitcoin involves highly-specialized ASIC hardware that is useless outside that specialized application, the perceived “cost” for an attacker mounting 51% attacks would include both capital expenses to acquire those rigs and more subtly, depreciation for the equipment caused by decline in currency price. Recall that if the price of Bitcoin goes down and mining rigs are good for nothing besides producing more Bitcoin, the expected value provided by a rig over its lifetime also takes a dive. This may have been a reasonable assumption when Bitcoin was the only game in town. Today there are hundreds of alt-coins, including several “families” sharing the same proof-of-work function: mining rigs built for Bitcoin can be diverted to also mine Bitcoin Cash/Gold/Diamond/Tin/Scrap-Metal/etc.**



** The last two remain hypothetical at the time of writing, but can not be ruled out if forks continue to create value out of thin air.

Proof of nothing: why the January proof-of-keys ritual missed the mark

[Full disclosure: this blogger was formerly CSO for a cryptocurrency exchange.]

Imagine a country where confidence in the banking system has reached a nadir among the citizenry: no regulatory oversight, no lending standards, no requirement for public audits and no FDIC insurance to serve as backstop in the event of bank failure. Instead  consumers a coalition of consumers decide to take matters into their own hands with a grassroots campaign to verify the solvency of banks. How? By orchestrating a bank-run. Everyone is instructed to take out all of their money out of banks on the same day and deposit them back at a later time after observing to their satisfaction that all withdrawals have been processed successfully.

This is not an entirely far-fetched analogy for what happened in the cryptocurrency space earlier this month with the Proof of Keys event. There are of course key differences, no pun intended. Traditional banks operate a fractional reserve by design. It is no secret they would experience a liquidity problem if all customers showed up to demand 100% of their deposits at the same time, as epitomized in the bank-run scene from It’s a Wonderful Life. But cryptocurrency businesses operate under a different expectation, namely that they retain full custody of customer deposits at all times. No lending, no proprietary trading, not even parking those funds at an interest-bearing account offered by another financial institution lest it create counter-party risk. So on the face of it, there is some value to this withdraw & redeposit ritual: a custodian successfully satisfying every withdrawal was provably in possession of those customer funds.

But the reasoning is flawed for several reasons.

First, participation is voluntary. Even with viral propagation on social media and decent coverage in press outlets dedicated to cryptocurrency, only a small fraction of users representing an even smaller fraction of total funds will participate. (It is easier for retail investors to participate, compared to an institution such as hedge-fund. The latter typically have more stringent requirements around alternative places to keep funds. An off-the-shelf hardware wallet is an adequate solution for storing personal funds; it is a far cry from enterprise-grade storage with redundancy and multiple users for an institution.) While exact numbers are not available, mining fees and memory pool pressure statistics can be used as a gauge of participation. Assuming only 10% of funds were withdrawn from a given exchange, the campaign has only proven that at least 10% of funds are there.

Screen Shot 2019-01-03 at 6.18.53 PM.png

Zooming on mempool state on January 3rd shows a temporary spike, coinciding with morning hours in PST.

Screen Shot 2019-01-17 at 10.57.22 PM.png

Looking at 30-day view paints a different picture. There is nothing remarkable about the transaction volume or fees on January 3rd.

Second, pre-announcing a bank-run ahead of time somewhat defeats the point. It’s not a pop-quiz if the teacher announces that there will be a pop-quiz next Friday. Given enough advance warning, even custodians who were actively investing their deposits can convert all of the capital back to the expected currency.

Finally there is a more subtle reason why this stress-test for custodians fails: it is first and foremost a test of liquidity instead of solvency. In other words, it is subject to false negatives where even a cryptocurrency custodians with 100% of funds on deposit could fail to produce every last satoshi on demand, for a very good reason. Sound risk management for cryptocurrency dictates storing the majority of funds offline, in cold storage. By definition these systems are disconnected from the Internet and require manual stepssuch as travel to an offsite locationto effect a withdrawal. Only a small fraction of funds are kept “online” in hot wallets, where they are instantly accessible for satisfying withdrawal requests.

This model is similar to how traditional banks manage cash. Even within a single branch, only a fraction of the cash present at that branch is loaded into the ATM. The remainder would be kept in an interior vault within the branch. Banknotes stuffed into the ATM are available 24/7 for customers to withdraw but also come with a downside: they are subject to heightened risk of theft. Blowing up an ATM or towing it away is easier than breaking into a steel vault. By keeping only sufficient money in the ATM to satisfy expected withdrawal volume, the bank manages its exposure while providing high degree of assurance that customers will have access to their funds. As long as projected liquidity requirements are not too far off the actual observed demand from customers in other words, barring unexpected events that result in everyone rushing to the ATM at oncethis is a good trade-off between security and liquidity.

Cryptocurrency custodians employ a similar strategy for managing the exposure of hot-wallets. If deposits exceeds withdrawals and there is a net influx of funds, the wallet may start running too “hot.” Excess risk is trimmed by sending funds to an offline wallet. Since the transfer is initiated from an online wallet accessible over a network, this step is easy. On the other hand, if withdrawals outpace deposits and there is a net outflow, the system risks running into a liquidity problem and must be replenished by moving funds back from an offline wallet. This step is more time consuming. By design, access to offline wallets is available from the same system operating the hot wallet; otherwise they would be subject to same risks as the lower-lower-assurance system.

Returning to the conceptual problem with the DIY proof-of-solvency, if enough customers actually participate in a coordinated effort to withdraw their funds at the same time, hot-wallets will bottom out and cease to provide liquidity until they are replenished from offline wallets. (Granted, knowing when the stress-test will occur creates additional options to prep. For example the custodian can deliberately bias the wallet distribution to maintain higher-than-usual fraction of funds online.) That means events such as January 3rd are not so much a proof of solvency as they are a proof of available hot-wallet liquidity or perhaps time-trial of how fast the custodian can access offline storage systems. The paradoxical part is that in any system with online/offline separation managed according to risk criteria, some delay in processing withdrawals would be fully expected. If anything, it is a bad sign if the custodian can instantly produce 100% of all funds on short notice. It means they are likely keeping 100% of cryptocurrency online in a hot-wallet, where they are most susceptible to theft. Ask Bitfinex how that turned out in 2016.

To be clear: this is not to trivialize security concerns with storage of cryptocurrency. Given that internal workings of exchanges and custodians remain opaque to most customers, it is completely reasonable to demand periodic assurance that funds deposited with are still accounted for. Considering that proof-of-keys suffers from conceptual flaws and indeed accomplished very little this time around (judging by observed withdrawal volume) the question becomes: what is an effective way for custodians to verify that a custodian is still in possession of their cryptocurrency deposits? A future blog post will look at some alternatives.


Payment networks, subset-sums and tracking consumer spending

(No cause for alarm, for the most part)

A recent Bloomberg article featured a glowing portrayal of the retail analytics company Cardlytics. Readers may forgiven for assuming this outfit had found the magical solution for tracking consumer spending from payment data alone, which had eluded all other attempts. Cue in cheers from hungry-advertisers and consternation from privacy advocates? Not exactly. What the article neglects to mention is that the company has no access to line-level purchase data. In other words, there is no new source of information here. Nothing new that is not already visible to Visa or MasterCard, or for that matter the bank that issued the credit-card. The “innovation,” assuming one exists, lies in better ways of crunching the existing stash of information that payment networks & banks are already sitting on. The implication being that wealth of information was sitting under-utilized, either because they have not gotten around to it, are contractually prevented from engaging in such data mining— unlikely, given that privacy has never been a forte in the payments industry— or more likely, because they lack in-house technical expertise for it.

To better explain this distinction, let’s recap how a payment network such as Visa operates. Suppose a consumer uses their Chase Visa card to buy a cup of coffee from the neighborhood deli. There are three critical participants in the loop facilitating that transaction:

  • Issuer: the bank that gave the credit-card to the consumer. For example, for a Chase Visa, this would be Chase bank.
  • Acquirer: the bank where the merchant has an account
  • The payment network, in this case Visa. This is the network, both in the metaphorical sense of being a densely connected graph between issuers and acquirers, as well as an actual communication network over which payment requests and authorizations are routed.

(This is a simplification; there are many participants all vying for a cut of the transaction, with the most notable ones being payment processors who assist merchants in getting setup to accept credit-cards, often as part of a bundle that includes providing the point-of-sale hardware.)

From a privacy perspective what matters is that each transaction originating at the merchant is routed through the Visa network and must be authorized by the issuing bank. After all, it is Chase bank that gets to decide whether customer Bob is authorized to spend $2345 on a new TV at BestBuy. That decision must take into account several factors, starting with whether Bob has sufficient spending limit left on that card. There is also the minor matter of verifying this purchase is indeed initiated by Bob, as opposed to a fraudulent charge resulting from a stolen card for example. That means the issuing bank knows the purchase amount and merchant ID— so does Visa, which is responsible for relaying the payment request to the issuer and ferrying back the thumbs up/down response for payment authorization.

Of course there is nothing new here. This is how payment networks have always operated. A corollary is that they become unwitting witnesses to consumer spending patterns in a global way. They have global visibility. By contract, merchants have local visibility: BestBuy is aware of every transaction a particular customer conducts at every BestBuy location over time. (There is some ambiguity around whether they are contractually permitted to use the payment card itself as a fixed identifier to correlate such purchases; card numbers are static for the lifetime of the card and more importantly include card-holder name which is likely constant across multiple cards even as they are replaced.) But that visibility ends as soon as the consumer steps outside the store. BestBuy has no idea what the same customer does next door over at Home Depot. Chase Bank on the other hand does, since they can observe the same credit-card being used at both locations. A card network such as Visa or MasterCard does Chase even one better: they can see all transactions across all merchants for that card type.

Again there is no mystery that this information is being collected or mined for patterns— in fact, that is how fraudulent transactions are detected. Issuing banks are on the lookout for red flags: such as the teetotaler racking up large tabs at bars or that card-holder who never leaves Poughkeepsie jet-setting around the Caribbean. What is new is that instead of being used in a defensive manner for combatting fraud within the payment ecosystem, this data is now mined to identify new revenue streams by directly working with merchants to market to consumers according to spending profiles.

But there is one major limitation of this model that companies including Cardlytics have yet to overcome: they have no visibility into so-called line-level purchase information. In other words, all they can observe is the total amount at the bottom of the receipt. They have no visibility into the contents of a shopping cart or itemized list of drinks on the bar tab— information that the merchant knows but is never transmitted as part of the payment authorization protocol. Line-level receipt information has been the Holy Grail of companies that harvest and traffic in consumers spending data. Much to the chagrin of eager startups partnering with banks for access to payment network flows, they are still not any closer to that stash, which remains carefully guarded by merchants.

To be clear, this is not to minimize the privacy risks. On the contrary, merchant IDs and amounts alone can be extremely revealing— and damaging. BestBuy may sound like a saccharine example but consider other merchants that accept credit cards: charitable organizations, treatment facilities specializing in rare conditions and controversial advocacy groups (Nickelback fan-club anyone?) In each case, the existence alone of a purchase amounts to metadata about the card-holder hinting at everything from political persuasion to medical conditions. In other cases, amounts and frequency or purchases can be telling: consider transactions involving liquor stores or casinos. Depending on who gets to make judgements based on data, historic patterns can mean the difference between casual interest and dangerous levels of attachment.

Finally there is an interesting edge case that may be already exploited in the wild: in certain cases one can work backwards from the total amount to line-level data. Consider a store that only offers four items:


Widget Price
A $2.48
B $5.49
C $8.75
D $14.99

Given that pricing structure, if the cash-register rings up a customer for $11.23— recall this is the only number Visa, issuing-bank and whoever they are willing to share the data with can observe— there is only one possible combination of widgets the customer could have purchased: A and C. There is no other way to create a basket of goods summing up to that price.

This idea “reverse engineering” shopping cart contents from total amount and individual prices is related to a well-known problem in computer science. It is a variant of the subset-sum problem. In the strict version of subset-sum problem, items can not be repeated. In the retail settings of course customers can buy multiple copies of the same item. It turns out that tweak does not appreciably alter the fundamental difficulty of the problem— and solving subset-sum is very difficult in a well-defined computational sense. It ranks among the group of NP-complete problems for which there are no  known efficient algorithms for solving large instances. Worse it is conjectured that no efficient algorithm exists. The “state of the art” exact solutions are barely faster than exhaustively checking all possible combinations of items, which does not scale to large instances of the problem where the menu contains not four, but dozens of different widgets available for purchase. Efficient approximation algorithms are known for many NP-complete problems but close-enough is not good enough in the context of inferring customer spending. If the algorithms returns an alleged shopping-cart but its contents do not add up to the exact purchase price, it is not the right cart; there is no reason to believe the customer bought any of them. 

Are retail analytics companies working to solve subset sums to recover line-level purchase information? This is unlikely for several reasons, even if they were willing to throw large amounts of CPU time at the problem. This approach can only provide a “unique” solution on a menu with a small number of discrete items, each with unique price. As soon as any of these conditions are violated, there is no unique solution. For example if variable quantities are allowed— as in a grocery store where fruits and vegetables are priced by the pound— any amount can be attributed to say bananas alone. Similarly, if there are two items with the same price, they become indistinguishable. Even if every item has a distinct price individually, because the possible combinations increase exponentially with the number of items, there is no guarantee that they will remain free of “collisions:” two completely distinct basket of goods ends up with same total cost, down to the penny. Finally, assuming the pricing ambiguity could be tamed, there is a practical problem around sourcing accurate information: the data-mining operation must have exact pricing information from thousands of merchants, staying on top of variables such as geographic location (state & local taxes that increase the final sales amount) seasonal fluctuations and daily promotions that one particular outlet may decide to implement. Achieving any semblance of accuracy for that information would require cooperation by the merchant. But if one posits that merchants will be complicit in helping mine customer transactions by sharing information with a third-party, there is no need for solving subset-sum instances any longer.  One may as well ask the merchant “what was in the shopping cart for that customer who bought $84.25 worth of groceries at 17:34EST?”

That scenario could ratchet up the privacy risks an order of magnitude. Until now, merchants have treated purchase data as a highly valuable asset, guarded jealously and only used internally to boost the health of this business. It is not up for grabs by third-party analytics services focusing on identifying global patterns to be monetized elsewhere. Even if that service could do a better job of crunching the data provided by one merchant, the end result may well end up benefiting their competitor instead. If incentives shift to the point that merchants are collectively throwing their own stash of consumer spending patterns into a single pile, it would spell trouble for privacy.


On the limits of decentralized exchanges

[Full disclosure: this author works on security for a cryptocurrency exchange]

Decentralized or non-custodial exchanges are one of the areas of active research and development in cryptocurrency. The key differentiator— no pun intended— is that the exchange does not take temporary possession of customer funds in order to enable trading. By contrast, the prevalent model for exchanges today only supports trading of assets that customers have already deposited on the exchange. To make this more concrete, let’s say Alice wants to buy 1 bitcoin for $6000 USD— using recent prices, which are bound to date this blog post. She wires $6000 to the exchange and places a buy order. Bob meanwhile is looking to liquidate his 1BTC position, coincidentally at the same $6000 price point. He deposits the bitcoin with the exchange and places the corresponding sell order. These orders “cross” on the trading engine, and the exchange updates its internal ledger to swap assets between these customers, minus any fees for the exchange. Critics like point to out these two limitations to this model:

  • Assets are tied up temporarily even when they are not being put to use for trading. If a better opportunity were to emerge for putting that bitcoin to use, it may take some time to recover it. (The situation is worse for fiat; consider the challenge of sending funds during a bank holiday.)
  • Customers must trust the exchange with their funds, which are subject to risks of insider malfeasance and external attacks.

Non-custodial exchanges seek to address these problems by foregoing the requirement to park funds as a prerequisite to trading. Before delving into the comparative strengths and weaknesses of this model, let’s pause to clarify the definition of what counts as an “exchange.”

An exchange is not a glorified bulletin-board

An abstract requirement for an exchange is to enable price discovery. In other words, the state of order books on the exchange must reflect true supply/demand conditions for the underlying asset. If one bitcoin is being offered for $8000 and a seller comes along willing to buy at that price, that trade must execute. If the seller has an option to back out of trades after they are matched, then the posted bids/offers are pieces of fiction that are no longer indicative of market conditions. (This is not to be confused with the ability to cancel orders or even the much maligned practice of high-frequency trading where orders are posted and retracted quickly. The point applies only during the time an order is posted on the book, even if those times are measured in milliseconds. Once another order crosses that, that trade must execute with overwhelming probability, with each side ending up with the assets on offer by their counter-party. That execution can not depend on the willingness of buyer/seller to further cooperate. Otherwise one side will always have an excuse to back-out as soon as market conditions shift, inspiring a bout of buyer’s remorse for not having taken a better deal. (Note this “guarantee” can be an economic incentive. For example, if any party who reneges on a trade is forced to pay significant penalties, that could be a sufficient deterrent for making sure most activity on the exchange settles uneventfully.)

That means services which only pair up buyers and sellers without providing any execution guarantees fall short of this definition of an exchange. Price signals on such a venue may well depart from true market valuations because there is no way to know in general if any assets are indeed changing hands at those prices. Even if the marketplace is continuing to observe participant activity following a trade (for example, by monitoring blockchains to see if funds claimed to be in the possession of the counter-parties were indeed moved) this is providing at best a delayed signal after the fact.

Beyond bidirectional channels

There is another “trivial” sense in which trading can be done without custody, at least for cryptocurrency assets. Instead of depositing funds ahead of time on the exchange and later allocating some fraction of those funds to open orders, the customer can instead create a bidirectional payment channel with the exchange. Suppose Bob has 5 bitcoin and plans to diversify some of that into ether and litecoin. Instead of depositing 5BTC all at once, he creates a channel with the exchange, with the commitment transaction funded entirely by Bob. When he wants to convert 1BTC into ether, he sends 1BTC to the exchange. If he also decides to convert another bitcoin into litecoin, the channel state is updated to now pay the exchange 2BTC, for the combined amount required to back both order. (Transaction fees are neglected in this simplified model.) If he were to change his mind and cancel the BTC/LTC order, the payment channel state is updated by revoking previous transactions and reverting to a new state where the exchange now only gets 1BTC, with the remaining 4 going back to Bob.

In this model, only the assets committed to open orders are in possession of the exchange, as opposed to the entire amount the customer has available for trading. It is in some ways similar to the daily settlement model common in equities trading: the customer opens a channel with the exchange prior to the opening of market, places/cancels any number of orders throughout the day and after close of market, the channel is “closed” to settle the balances. (Granted cryptocurrency trading takes place around the clock, without the sharply delineated “market hours” of traditional markets.)

While this reduces the proportion of assets tied up with an exchange or subject to security risk, it does not eliminate those risks entirely. For example, Bob still must trust the exchange that when his order executes on BTC/LTC,  he will receive the litecoin amount corresponding to his bitcoin offer. The natural question is whether one can design an exchange with stronger guarantees, where Bob parts with the bitcoin if and only if he will receive the stated amount in litecoin. In an ideal setting, that property would be enforced on chain, taking any trusted third-parties out of the equation. That may sound like a tall order, considering blockchains are self contained and have no awareness of the state of other blockchains. (There is an attempt to replicate the state of bitcoin on ethereum but it has not gotten very far.) On its face, that creates a challenge for making transactions conditional on each other, such as creating a dependency between Bob parting with assets on one chain only if he can recoup a precise amount of another asset class managed on a completely independent chain.

Atomic swaps

Atomic swapspreviously discussed here by comparison to the fair-exchange notion from traditional cryptography— provide a useful building block for doing pairwise exchange across chains. Returning to the problem of Alice and Bob executing a BTC/LTC swap:

  1. Alice prepares some bitcoin (in other words, an unspent transaction output or UTXO) with a specific script. This script allows either one of two paths for claiming the funds:
    • Using Alice’s key A after some time elapses, or
    • Using Bob’s key and knowledge of a preimage for the hash H=SHA256(R) where R is a random value Alice picks.
  1. Bob in turn prepares the corresponding amount of litecoin, using a script that mirrors the one Alice created for spending conditions:
    • Using Bob’s key after the same deadline, or
    • Using Alice’s key and knowledge of a preimage for the same SHA256 hash H. (Note that Bob does not know the random R solving that puzzle but he can copy H from the posted bitcoin transaction.)
  2. Alice proceeds to claim Bob’s litecoin by using her key and knowledge of R. In doing so, she will be revealing R.
  3. Bob in turn proceeds to claim Alice’s bitcoin using his key and the now disclosed answer to the preimage puzzle.

Limitations of atomic swaps

This primitive solves the basic problem of a pairwise trading between mutually distrusting parties, but there are a number of challenges to overcome before it can become a realistic alternative to traditional exchanges.  Let’s catalog these now.

Dealing with fiat

One of the glaring problems with atomic swaps is they operate on a blockchain. That means trading against fiat currencies such as the US dollar or euro can not be expressed using these conditions. There are two work-arounds offered for this approach, neither of which are satisfactory from the perspective of eliminating trusted third-parties.

First one introduces oracles to vouch for the fact that some event has occurred outside the blockchain, such as Bob wiring funds to Alice. This can be done generically by creating a multi-signature arrangement where the oracle must also sign the transaction before Bob can claim the bitcoin offered by Alice. Or in the context of ethereum it can be done by making payment conditional on a specific function call from the oracle contract. Either way this is introducing a trusted third-party into the equation and adding even greater delays to the settlement process, meaning that a trade that executes at a specific time will settle at a later point in time after the mediator had enough time to examine the evidence of fiat payment.

The second approach runs even more contrary to the spirit of avoiding trusted third- parties: using a stablecoin designed to have stable exchange rate tracking a specific fiat currency.  For example one could represent USD via Tether or other stablecoins. But the 1:1 correspondence between Tether and the dollar is only guaranteed by a private entity. In other words, by replacing USD in a bank account with tether for the convenience of applying atomic swaps, participants are taking on even greater concentration of risk in the single party backing tethers.

Scaling to an open market

Another challenge with atomic swaps is the pairwise aspect: the protocol has exactly two participants and operates in a carefully choreographed manner. Notice that even the very first step by Alice requires knowing that her counter-party is Bob, because his public-key goes into the construction of the initial UTXO. This is all good and well but it does raise one important question.

How did Alice and Bob find each other in the first place? Presumably this is the role the decentralized exchange plays. But if Alice must know about Bob, there is a chicken-and-egg problem. In the standard setting, Alice would post an order announcing that a certain quantity of bitcoin is available in exchange for a different quantity of ether. Later along comes Bob with an offer to take the other side of the trade, and this particular trade executes. But note that at the time Alice posted her order, there is no Bob in the picture. Also recall that it is not sufficient for our definition of an exchange to simply act as a bulletin-board where seller/buyers post bids and then go offline to attempt settlement  on their own when they paired with a counter-party. In the absence of additional enforcement to prevent participants from backing out of trades, this scheme can not guarantee that activity on the exchange corresponds to assets changing hands at those prices.

Secondly what prevents the participants from backing out of executing the atomic swap?  Recall that even with both transactions ready on chain, it is still up to Alice in step #3 to get the wheels moving. The protocol is only atomic to the extent that if Alice executes step #3, it is guaranteed Bob can also execute step #4. But there is no guarantee that after steps #1 and #2 are completed, Alice will proceed to step #3.

One approach to dealing with this is to have the exchange become a counter-party to every trade. In other words, when constructing the atomic swaps Alice always assumes she will be trading her BTC to the exchange for ETH. Likewise Bob assumes he is sending ETH for BTC when constructing the swap transactions. An atomic swap will work in this model since the identity of the buyer is known ahead of time; it is always the exchange. Meanwhile the exchange does not take on any risk in the event of a cross. Until there is a crossing order from Bob, the exchange will not move to claim BTC from Alice. Only when both orders appears on the book (and there is sufficient buffer left on the time-locks for both, to prevent the owner from reclaiming them) does it make sense to execute both swaps. Since funds from one are used to pay the other, the exchange is net neutral or even slightly ahead due to fees collected from the parties. By making sure the exchange is playing the role of “Alice” in the protocol and choosing the random nonce, the exchange can guarantee that step #3 will always take place and every crossed order will settle properly.

But there is a fatal problem with that approach: while it may be “neutral” in the sense that the exchange does not make directional bets on BTC/ETH, it still has to front funds to pair up with every order. Recall that both sides of an atomic swap must have an on-chain transaction that the counter-party can claim if the swap runs to completion successfully. That means for every order— no matter how unrealistic its chances of being filled given prevailing market conditions— the exchange is forced to set aside capital to guarantee execution. Suppose 1BTC is currently trading for 20ETH but a die-hard bitcoin fan wants to lodge an order to sell for 50ETH. In order to guarantee execution of that order, the exchange must set aside of 50ETH of its own capital for as long as this irrationally exuberant order stays on the book. That is not a viable model.

Partial execution

Theres is one more challenge to address when trying to build an exchange on atomic swaps: partial order execution. Using the previous example, suppose Alice is offering 1 BTC for $8000, while Bob and Carol each want to buy 1/2 BTC for $4000. There is perfect coincidence of wants here: Alice can sell her whole bitcoin and collect the USD she requested, by splitting that order evenly between Bob and Carol. Custodial trading engines handle this case without a problem, since they have full control of assets. More importantly they can pace the execution over time to deal with the far more likely scenario where the counter-parties appear at different times. Imagine Alice places her order, followed quickly by Bob while Carol does not materialize until a few hours have elapsed.

The basic atomic swap can not handle this scenario without additional involvement from the customer. This holds true even if the exchange itself is the counter-party to her swap. When there is only 1/2 BTC demand available, the exchange is in a bind: it can proceed with the swap with Alice for one whole BTC, but then it will be left holding the remaining 1/2 after Bob is paid. Demand for that half may not materialize at the same price, leaving the exchange holding the bag. Alice can always go back and prep a new transaction that sends two half BTC chunks to the exchange. (By using channels between Alice and the exchange, this can even be done offline without significant confirmation delays.) But therein lies the catch: Alice is free to back out of that transaction. On a custodial exchange, unless the order is placed as all-or-none, it will execute against any crossing order including partial lots. With swaps, one must track down Alice to restructure the transaction after every partial execution.

Looking ahead

These limitations do not preclude the possibility that different, more sophisticated protocols and cryptographic techniques could address the current limitations of decentralized exchanges. Instead they point to a scaling challenge with going from a protocol that works great in 1:1 setting to one that must function with large number of participants.  As things stand, non-custodial trading may eliminate the risk of funds storage, but the lack of an execution guarantee means that it greatly reduces confidence in the price signals generated by the market.


Threshold ECDSA, key sharding and multi-signature: a comparison

Threshold cryptography refers to splitting cryptographic keys such that performing an operation—such as decrypting or signing a message— requires an interactive process among multiple shareholders without ever disclosing the secret. The emergence of cryptocurrencies has inspired a cottage industry of research on threshold ECDSA, the digital signature algorithm hard-coded by most cryptocurrencies. After some false-starts in 2014, more progress has been made towards efficient schemes. For example one recent paper at IEEE Security & Privacy 2018 made strides in reducing the the level of interaction required during that computation. But the techniques remain a long ways from being a realistic solution for any system storing cryptocurrency, especially when they have to compete with a much simpler alternative already built into common cryptocurrencies: multi-signature. In this blog post we compare three alternatives to see how they stack up:

  1. Naive secret splitting
  2. Multi-signature
  3. Threshold cryptography

Naive secret-splitting

In this model, the bits making up the secret value are split using a plain threshold-sharing scheme such as Shamir’s secret sharing. Such schemes are parameterized by so-called “M-of-N” settings, where N shares exist of which any subset M < N are sufficient to reconstruct the original value. In the case of cryptocurrencies such as Bitcoin the “secret” private-key is a scalar value that defines the private key, which is effectively a large number. That number can be split into shares and reassembled again to create the original private-key as long as a quorum of at least M shares are present. With the key reconstituted, it can be used to sign Bitcoin transactions

That reassembly step also happens to be the Achilles heel of this model: there is a single private-key at the end of the day and it is being repeatedly reassembled each time that key is used to transfer Bitcoin. (Recall that in cryptocurrencies such as Bitcoin or Ethereum, control over funds is asserted by creating a digital signature, which boils down to using the private-key.) That means there is a single point of failure, a trusted system where reassembly takes place where that secret value is exposed.

Native multi-signature

Around 2012 Bitcoin underwent one of its more ambitious feature expansions, introducing support for a scripting language for describing payment conditions. This language greatly expanded the range of conditions that can be tied to spending funds. Previously pay-to-public-key or pay-to-public-key-hash  addresses could only express a condition along the lines “you can spend these funds if the transaction is signed with this key.” With script one could replace the second part with more complex assertions such as  “… signed with key A or signed with key B and provides solution to a puzzle C.” Since the idea of requiring multiple signatures on a transaction is a very common scenario, the designers introduced native instructions into the virtual machine scripting language to support multi-signatures or “multisig:” instead of depending on a single key to sign the transaction, some quorum of M keys out of a larger set of N is required to sign.

This M-of-N condition may resemble traditional secret-splitting but the similarity ends there In Bitcoin multisig, there are N distinct, independent keys. By contrast, secret splitting is working with exactly one key. As far as rest of the world is concerned, there is one signature to verify, produced by one private key and verified with one public key. How that private key is managed behind the scenes by sharding across multiple systems is opaque to everyone else.  By contrast multisig involves creating M different signatures to authorize a transaction. Each of these signatures must be independently verified and the presence of multiple, distinct keys is visible to other participants in the system.

In a useful example of debunking false dilemmas around security trade-offs, multisig improves both security and redundancy. The security advantage is readily apparent when M > 1. An attacker planning to steal funds has multiple targets to chase after instead of just one. More importantly, those keys can be managed independently. There is no single system that must have access to all keys at the same time. Each private-key signs a transaction without requiring access to any of the other keys. (In fact, the signatures do not even depend on each other.  This is an example of co-signing rather than counter-signing where subsequent signatures would incorporate earlier signatures. For Bitcoin the “message” being signed is identical from the perspective of multiple key-holders, which allows obtaining the signatures in parallel, in arbitrary order.) In fact the security gains from multisig are directly related to the extent these multiple keys have uncorrelated risks. Put all eggs in the same basket as Bitfinex did, and multisig degenerates into the same baseline as single-key control.

The redundancy improvement is more subtle. While secret keys falling into the hands of an adversary is perhaps the most common threat model, there is another risk around simply losing access to the key. For example the device where the key is stored could malfunction or accidentally get discarded— newsworthy examples of that are plenty. This is key “loss” in a different sense: no one has access to the secret any longer and any funds on the blockchain controlled by that private-key are effectively taken out of the money supply. In multisig configurations where the quorum number M is less than the total number of keys N, key loss can be tolerated. Even with some shares permanently lost or even temporarily unavailable, transactions can be authorized.

There is one catch: multi-signature transactions take up a lot more space. A configuration requiring a quorum of M signatures out of a possible group of N must include not only all M signatures but also M public-keys. You might expect some type of space-optimization in the protocol, where future transactions out of the same address do not require enumerating the public-keys again since those are fixed for a given multisig address. But there is no such provision. Whether out of a strident belief that addresses must only be used once to preserve privacy— an ideological stance readily contradicted by evidence from actual blockchain statistics— or a pragmatic bias for avoiding complexity, the designers did not include any mechanism for “caching” key-sets (even though signatures were being stored permanently on the blockchain until segregated-witness.) This is more than a matter of engineering efficiency: when it comes to paying fees, transactions are priced by the byte. This makes multi-signature transactions more costly, with costs scaling linearly with in both parameters M and N.

Threshold cryptography

“If you think cryptography is the solution to your problem, then you don’t understand your problem” — attributed to Peter Neumann

Threshold cryptography promises to offer  the best of both worlds. As far as the rest of the blockchain is concerned, there is a single key used to create one signature;  transactions consume the same amount of space as old-school single key addresses. On the other hand, that key has been sharded into multiple shares, with each share given to a different party. More importantly, those shares are never reassembled. Instead the people controlling individual shares all participate in an complex protocol to create the signature, without any participant gaining more information about the shares held by others. Designing those protocols, reasoning about their properties and proving security is where the cryptographic techniques comes in.

It is also where the complexity lies. Depending on the particular problem, these protocols can be too unwieldy for implementation. For some cryptographic systems there are natural solutions. For example RSA allows additively splitting the private-key and easily combining individual signatures created by each piece. (But the joint generation of an RSA modulus such that no one learns the factorization is quite tricky. For this reason, threshold systems commonly assume “trusted dealer” where the entire secret is allowed to exist on one system temporarily during key generation, from which the shares are distributed.) For  the DSA family of algorithms, including its elliptical curve incarnation, there is no simple solution known. Solutions can fall short effectivness for a number of reasons. First and most important is the amount of interaction required among holders of shares during signature generation. In an ideal solution, no interaction is required. Each person performs a computation using their share working independently and outputs some intermediate result. Finally there is a reassembly phase, where an untrusted system collects the partial answers and reassembles them into a final result.

That model is ideal for offline storage of cryptocurrency, because it allows parallel, incremental construction of a transaction. Typical offline storage schemes involve storing cryptographic keys offline, on systems located across multiple geographically diverse sites. Because these systems are not connected to a network, participants must travel on site to have physical access for interacting with them. A threshold signature protocol with zero interaction allows those accesses to happen in any sequence, visiting one site at a time. By contrast, all known threshold signature protocols for ECDSA involve multiple rounds of interaction among holders of key-shares. That means either visiting each site multiple times, or more realistically having to orchestrate multiple groups visiting all sites simultaneously and being in communication in real time to carry out protocol steps— keeping in mind that these systems were supposed to be offline and disconnected from networks. None of the options are particularly appealing. One could knit together the “air-gapped” systems into a private network that remains (hopefully) isolated from the rest of the internet. But even in that best case scenario where the network is private, there is new attack surface: connected systems can now target each other, weakening one of the assumptions about each site having uncorrelated risks. Or the operators could resort to kludgy “sneaker-net” to exchange inputs/outputs with offline systems while leveraging public networks to transfer those inputs/outputs to other sites. The inefficiency of that alternative increases with the communication complexity of the protocol, and the number of “rounds” involved in assembling the final signature.

A second and more subtle limitation arises from the incompatibility of threshold signing protocols with existing cryptographic hardware, such as bitcoin wallets running on smart cards or HSMs. If a protocol requires manipulating secret material in ways outside the handful of algorithms supported by the underlying platform, it can not be instantiated using these systems. Once again RSA is the poster-child for a well-behaved algorithm. Any device capable of importing RSA private keys and signing with them can participate in an NofN threshold signature scheme (note M=N requirement) because the “partial computation” using a share looks exactly like producing a regular RSA signature. That said, there are implementation subtleties. For example, the hardware must support importing an RSA private-key without the Chinese Remainder Theorem or CRT components. Otherwise if the shares had to include CRT components, every participant could recover the full secret.

By contrast existing threshold ECDSA protocols do not use secret key material in any way resembling an ECDSA signature, or even other common algorithms defined on secret material, such as ECDH computation using the same scalar value. That means such protocols are incompatible with gold standard for managing high value secrets. (It is rare for  cryptographic hardware to permit users to implement their own cryptographic algorithms to run inside the secure environment, since a badly designed one allows key extraction.) That is a significant step backwards in operational security: general-purpose computing platforms such as PCs are not appropriate for managing critical secrets. Hardware wallets are built for exactly that scenario, with defenses against tampering, key-extraction and side-channel attacks. Giving up that assurance to save on transaction fees is a dubious trade-off. That said, this is a practical limitation that could be overcome if hardware wallets decided to add support for the more esoteric primitives required to make these protocol work. But as with all other software development, vendors of cryptographic hardware follow the 80/20 approach to defining requirements: they concentrate their engineering efforts on supporting the 20% of features used by 80% of customers, and avoid the long-tail of seldom used obscure algorithms that only a handful of people care about.

Bottom-line: while future developments towards a more efficient threshold ECDSA implementation could overcome these limitations, at the moment it’s a tough sell to give up on the simplicity of native multi-signature in favor of a complex multi-party computation protocol that is incompatible with preferred hardware for high-value secrets.



On-ramps & off-ramps: impact of Bitcoin payments on the regulatory landscape

Money serves three functions: store of value, unit of account and medium of exchange. Bitcoin has achieved a dubious hat-trick in drawing criticism in each category. It is claimed to be too volatile to act as a measure of economic value—how can merchants risk accepting BTC if the prices fluctuate drastically day to day? For that matter, why would consumers risk giving up an appreciating asset if it is likely to be worth a lot more tomorrow? (Recall the infamous Bitcoin pizza incident.) It has also attracted a chorus of skeptics arguing that it can not succeed as a popular payments, pointing to the intrinsic scaling limits and costs of transacting on the congested network. Blockchain space is scarce, topping out around 7 transactions per second worldwide, dwarfed by the throughput of existing centralized payment rails such as credit-card and PIN debit networks. Moreover transactions compete against each other for inclusion in every block by offering “fees” to miners, who are incentivized to include only those with highest fees into the next block. Counterintuitively, those fees depend not on the monetary value of the transaction but on the amount of space it takes up on the blockchain— akin to charging for check-processing based on thickness of the paper used instead of notional value printed on the check. Depending on demand on the network, this results in drastically varying efficiency. In 2014 it was possible to transfer $80M while paying a few cents—outshining every other system including wires in cost-effectiveness— while the late 2017 frenzy witnessed  users spending tens of dollars in fees to send amounts of the same order of magnitude, which is a staggering level of inefficiency comparable to Western Union for cross-border remittances.

After an initial period of exuberance where merchants raced to outmaneuver each other in claiming bitcoin support and scoring cheap PR points, it is clear that bitcoin is not currently working as a payment mechanism. Ethereum never focused on payments as a use-case, while other altcoins such as monero with aspirations to being a medium of exchange are optimizing for dubious black markets and criminal activity outside the mainstream. Jury is out on whether any of the scaling approaches proposed such as Lightning Network will provide a lasting solution. 

 But this is more than a question about technical merits or real-world feasibility: it will have repercussions for the regulatory landscape around cryptocurrency. Whether Bitcoin or another cryptocurrency succeeds as a widely used medium of exchange will determine which participants in the ecosystem become the focus points for regulatory influence. If conducting meaningful commercial activity requires converting into fiat, regulating cryptocurrency becomes a simple problem focused on existing, already regulated businesses which serve as on-ramps & off-ramps to blockchains for the majority of consumers. (This is true even when that conversion into fiat is hidden from consumers, as in the case of most merchants who claim to accept Bitcoin but are in fact relying on a third-party payment processor who accepts Bitcoin from the consumer while paying out in USD to the merchant, effectively absorbing both the exchange risk and information security challenges associated with storing bitcoin.)

The difference can be illustrated by imagining two alternative scenarios. In the first case, Bitcoin or some other cryptocurrency with better intrinsic scaling properties succeeds in becoming a ubiquitous payment mechanism, complete with a vibrant ecosystem of consumers and merchants rivaling that of existing payment networks. Average citizens purchase goods with Bitcoin from one another with the same ease as swiping a credit-card, with one crucial distinction: there is no middleman moderating the transaction. (More accurately, since blockchains are not made out of magic, the “middleman” has been replaced by a decentralized network of miners.) There is no reason to convert bitcoin into or out of USD— which is what happens today behind the scenes when most merchants rush to announce bitcoin support. Such cross-currency hops are not only unnecessary but they would introduce economic inefficiencies compared to transacting in bitcoin as a first-class, “native” currency.

In the second scenario, Bitcoin is relegated to a different role, serving primarily as store of value (or unit of measure, however implausible that appears from current levels of volatility.) In this word everyday retail transactions are not carried out over the bitcoin network, directly or indirectly through second-layer solutions such as Lightning Network. either because no viable scaling solution has emerged or because network fees are too high compared to existing payment rails, dissuading merchants from switching. BTC can still be competitive for niche scenarios, such as cross-border remittances or infrequent large transfers (such as settlements between commercial banks) where 24/7 reliability and speed is more important than saving a few basis points.

In both scenarios, the censorship resistance of the Bitcoin network is identical. While there are legitimate concerns around the geographic concentrations of mining pools in China and a software “monoculture” of one implementation running every node, it would require extraordinary effort to blacklist addresses or make arbitrary edits to the ledger to undo history or reverse payments. Where these scenarios differ is that such interventions are not required in the second case as a prerequisite to effectively regulating flow of funds.

If commercial transactions require converting cryptocurrency to/from fiat, existing authority to influence movement of that fiat currency also translates into influence on use of virtual currency in any meaningful commercial capacity. If consumers are constantly liquidating bitcoin as a prerequisite for most economic activity, one need only focus on regulating these so-called “on-ramps” and “off-ramps”— points in the ecosystem where bitcoin is exchanged for a native currency accepted in-real-life, such as US dollar. This a tractable problem. Dealing in fiat currency requires banking relationships and maintaining banking relationships involves following existing regulatory regime, which includes a functional compliance program among other components. (One need only look at the instructive example of Bitfinex to see what happens when those connections are severed.) It does not matter that payments within the bitcoin network can continue to flow freely because the balances recorded on that ledger, no matter how significant when converted into notional USD numbers, grant very little purchasing power unless the funds can be exchanged back into USD. If there are only a handful of such off-ramps to exit out of bitcoin and into old-fashioned but universally accepted currency necessary for meaningful commercial activity, there is an effective point to focus enforcement activities.

Consider the example of WikiLeaks. In 2011 the US government pressured credit networkings including Visa, MasterCard and Paypal to stop processing payments for the organization. This forced WikiLeaks to start accepting bitcoin. With the massive run-up in price against USD, WikiLeaks can paint a slightly revisionist picture of that forced move as a blessing in retrospect. But what about the cash-flow side of the equation? Even if all WikiLeaks personnel were volunteers resulting in zero labor costs, no salaries to pay etc. as soon as there are any expenses to account for— data-center hosting, advertising etc.— the question is whether those costs can be reimbursed in bitcoin. If the counter-party is not willing to accept bitcoin, WikiLeaks must find a way to convert their existing stash of bitcoin into old-school currency, interfacing with aspects of the financial system where stringent rules apply around which customers one can conduct business with. (Quite apart from access, this hypothetical scenario also raises the question of whether paying for expenses out of bitcoin—converted or not— was something to brag about in the first place: if the currency appreciated 50,000% since initial donation was made, having spent that on operational expenses to keep the lights on has echoes of the famous million-dollar pizza episode in bitcoin.)

Even overcoming the scaling challenges may not be enough to realize a world where consumers pay for their morning cup of coffee in bitcoin. Volatility may prove even more tricky, not being amenable to simple technological fixes. In the long-term, the belief that price of bitcoin will rise against USD means everyone will hoard their cryptocurrency, with no one willing to spend it on such pedestrian purchases as breakfast. (That brings a whole new meaning to the cliche that skipping a latte every day will compound into significant savings.) Meanwhile significant short-term fluctuation discourage merchants. Pricing goods in bitcoin requires hedging against the exchange risk. The path of least resistance for risk-averse merchants is to cash out to USD immediately after the transaction, which is exactly what most merchants who claim to accept bitcoin are doing: in reality their payment processor collects BTC from the buyer while delivering USD minus transaction fees to the merchant. That approach insulates the merchant from the risk of holding bitcoin long term, while superficially granting them an excuse to display the bitcoin logo on checkout pages and press releases. But bragging rights aside, the popularity of that model for merchants highlights the fragile nature of decentralization. The favored integration model for accepting bitcoin is in fact not using bitcoin as a peer-to-peer currency: there is a payment processor in the loop for converting bitcoin into fiat currency to make it more palatable to the merchant. That middleman is likely a regulated money transmitter and can be legally compelled to stop processing payments for specific clients. If cryptocurrency becomes a first-class, native payment system for online or brick-and-mortar transactions, it will require a fundamentally different regulatory approach focused on trying to impose order on unwieldy peer-to-peer networks. If instead the current trajectory continues, incremental approaches centered around on/off-ramps will be sufficient.



We can decrypt it for you: smart-contracts for outsourced encryption

A small-step for cryptographic agility

Among the features introduced by the Byzantine hard-fork of Ethereum, one of the more under-appreciated ones may be support for RSA operations. Most virtual currencies have a very weak story around cryptographic agility, defined as the ability to easily change the cryptographic primitives used such as signature algorithms, block ciphers and hash functions. For example Bitcoin settled on ECDSA over one elliptic curve— secp256k1— as the signature algorithm and SHA256 for hashing. (There is a long-standing proposal to introduce Schnorr signatures that could introduce some variety but it is yet again limited to that one curve.) Ethereum also uses ECDSA over the same curve, but inexplicably hard-codes a pre-standardization version of SHA3 for hashing.

Arguably these decisions are  short-sighted given that large-scale blockchains are notoriously difficult to evolve once deployed. Convergence of the system depends on on all participants agreeing on a set of ground rules for interpreting transactions appearing on the chain. If one faction invents some new-fangled feature that the rest of the network does not recognize, they will disagree on which transactions are valid and what the true state of the world is. In effect the  Bitcoin is culturally allergic to using hard-forks to introduce novel functionality— witness the acrimonious debates over increasing block size— and historically relied on incremental soft-forks for improvement. Ethereum Foundation has shown no such restraint, shepherding the currency through multiple hard-forks. Some were controversial and did not exactly go as planned; the 2016 hard-fork to bailout DAO instead gave rise to Ethereum Classic. More recent forks have been less contentious, likely because they introduced features without obviously being designed for the benefit of a particular contract or stakeholder.

In one sense, RSA support it is pure syntactic sugar: EVM is Turing complete. Modular exponentiation, the operation underlying RSA, could have been implemented in Solidity or any other language that compiles to EVM byte-code. Granted the same could be said of ECDSA support provided by the built-in ecrecover contract invocation: that could have been left up to every one to implement on their own. Instead EVM designers made a pragmatic decision about providing a native operation to support a feature expected to have widespread usage. This is the same judgment call API designers face constantly: whether to provide a single convenient API that packages commonly used functionality or leave it up to user to stitch that together from existing APIs. In this blog post we evaluate one scenario that is now made possible with RSA support, namely outsourcing key-management and encryption to trusted third-parties.

We can decrypt that for you

Common disk-encryption solutions including MSFT Bitlocker and Apple FileVault have an option to back-up keys to the enterprise. This is achieved by encrypting contents to two or more different keys: one chosen by the user and one chosen by the IT department, usually an RSA public-key. This allows the company to access storage on those devices even if the assigned user is unable or unwilling to assist. Taking this one scenario step further: suppose the IT department does not want to manage that additional key that can decrypt all employee devices. Can they outsource this to a trusted third-party in a verifiable way? The third-party is retained to provide on-demand decryption when the IT department needs to access one of the devices using the secondary key.  Alternatively the third-party could act as “back-up:” even if the IT department itself has a key, they may want additional redundancy.

There is an important question around whether such draconian arrangements are justifiable or consistent with precedent. For now we will put aside these policy questions, which are highly dependent on jurisdiction. US companies typically adapt the stance that employees have no expectation of privacy on company-issued devices. In centrally managed IT environments, it is very common for the company to have root/administrator privileges over all machines. Those privileges allows IT personnel to remotely execute arbitrary commands on user devices and access any stored information. In other words, being able to decrypt storage offline is not necessarily granting a new capabilities that is not already present in a traditional, all-Windows environment with machines are joined to an Active Directory domain.

One important tangent on policy is warranted here before moving on to the implementation. On the surface, outsourcing encryption looks awfully similar to the dangerous and discredited notion of mandatory key-escrow. Starting in the 1990s, US government has requested backdoors in cryptographic protocols, asking that all encryption products effectively use two keys: one for the intended recipient and one for law-enforcement in case future access is required. Following strong backlash from cryptography and civil-liberties communities, these ideas seeming fell out of favor as access to strong encryption increasingly became taken for granted. But the uneasy truce came to an abrupt end in the aftermath of the 2014 San Bernardino shootings which resurrected calls for key-escrow. Compelled key-escrow is problematic because unwitting users are being forced to trust third-parties they have no control over or trust, merely by virtue of purchasing gadgets or software from vendors subject to that jurisdiction. This is different than voluntary escrow taking place in a closed system such as an enterprise IT environment, where the company is making a deliberate risk-benefit decision to outsource key-management to selected third-parties. In addition to the freedom of choosing these providers—or not choosing for that matter, in favor of keeping that function in-house— they can also reduce risk by splitting keys. For example, 3 providers can be selected and the original secret split in a 2-of-3 threshold scheme with each share encrypted to a different provider. That arrangement removes single points of failure, requiring cooperation/malfeasance/compromise of multiple providers to recover the secret.

Design outline

A smart contract running on Ethereum can provide a number of useful properties:

  • Backup service gets paid per decryption in a verifiable manner— payment is made if and only if a successful decryption is provided
  • The backup service can post a fidelity-bond attesting to the security of its key management. These funds are under control of the contract and designed to be released to an attacker upon submission of evidence that the key has been compromised. In effect, this is a self-enforcing bug bounty as well as a canary designed to alert users of a key compromise.  As long as the economic value of the key for an attacker is less than the value of the bounty, a rational attacker would choose to claim the bounty and in the process, alert the entire world that the key has been compromised. If no such evidence emerges after some time, the funds are released to the operator of the service. (Alternatively funds can be returned gradually over time, with some fraction returned every few months of successful operation.)

How would such a system operate?  Since a contract itself can not hold secrets, the actual decryption keys must be held by the service provider. But the provider can publish the public-key which is sufficient for encrypting data. The customer is free to encrypt arbitrary plaintext using that key. But when it is time for decryption, they have to go through the smart contract to reach out to the service provider. This contract has three methods:

  1. Request decryption: Customer submits a request to decrypt a ciphertext encrypted in the RSA key of the backup service. This call is also required to send previously agreed-upon amount in ether to the contract as pre-payment. In practice it would also be subject to an authorization check based on message sender address; only the legitimate customer is allowed to request decryption. This prevents random interlopers who stumble on ciphertext from using the contract as a decryption oracle.
  2. Submit decryption result: Invoked by the operator of the backup service, providing the plaintext for a ciphertext previously uploaded by the customer.
  3. Request refund: Invoked by the user to get their money back if the service was unable to provide service in a timely manner. (Note there is a race-condition here in the event of blockchain congestion or miner censorship. The decryption result becomes visible to the customer as soon as the service provider broadcasts it to the network. But it will not count towards meeting the deadline until it is mined.)

Step #2 is where support for RSA operations come in handy. This is the point where the smart-contract must verify whether a correct decryption was provided, by applying the encryption step to the alleged plaintext. (Note we are assuming raw RSA decryption— this is also required to support “blinding” as explained below. The plaintext itself may have padding such as OAEP but its fully-padded version must be returned by the backup service.) If the provided plaintext is correct, the payment originally sent in step #1 is “captured” for good. Otherwise the funds remain in a provisional state. If no decryption materializes after some time period such as 24 hours, the user can invoke the refund function. This function checks the internal contract state and ascertains whether an outstanding decryption request initiated by the same caller has not been successfully resolved. If that is the case, the initial payment is sent back to the caller.

This construction sounds counter-intuitive in one aspect: everything taking place on public blockchains including Ethereum is visible to the entire world. Any one else can observe the above transactions getting posted, including the decryption result. That may look like a serious problem— everyone else is learning the secret— but there is an easy work around: RSA cryptosystem supports blinding using the multiplicative homomorphism of RSA. Instead of decrypting the primary ciphertext, the customer submits a modified version that is masked by a random factor. The decryption of that masked ciphertext still allows recovering the original plaintext, but only for the original submitter with knowledge of the masking factor. Other people observing the blockchain transactions are effectively witnessing the decryption of a random value that reveals no information about the original.

Improving accountability

What about providing accountability for the service provider? The idea here is to augment the contract with additional methods that an attacker who compromised the key can invoke to prove that fact and receive the value of the fidelity-bond as their reward. There are different ways to do this varying in complexity and level of evidence required. For example, an attacker can submit the factorization of the modulus and the contract verifies that the product of factors is equally to the modulus. This is sufficient but not necessary to prove key compromise. For example, in many situations the private key material is kept on tamper-resistant hardware such as an HSM which is designed to perform operations without surrendering the raw key bits. If an adversary obtains possession of the HSM and has the necessary credentials to operate it, the key is effectively compromised regardless of whether raw bits can be exfiltrated out of the hardware.

We can lower the burden of proof and require attackers to only prove they can decrypt arbitrary ciphertexts. This is tricky because the contract itself is used to arrange for decryption of arbitrary ciphertexts and those results are available on the blockchain for all to see. Not to mention, when the RSA cryptosystem is used without padding, existential forgery of what appears to be “successful” decryptions is easy. Anyone can fabricate plaintext/ciphertext pairs for which they claim to have magic decryption capability. Even in a design where that choice is taken away, the customer of the encryption service can subvert the process: they can request a decryption and leverage that result as “evidence” to prove they have control over the private key. (Use of RSA blinding makes it impossible to ascertain that they were the same ciphertext, since a randomly masked value is getting decrypted each time.) A simple if not entirely satisfactory approach is to implement a challenge/response protocol during which ordinary decryption requests are paused. The “prover” who claims to have control over the private key initiates the process by invoking a contract function. To prevent spam, we can require that such challenges are accompanied by some amount of ETH as commitment to weed out provers with no hope of successfully completing the process. That function returns a challenge, which is based on recent block hash of the chain, message sent by the prover and internal contract state that incorporates all ciphertexts it has been asked to decrypt to date.(For example we can maintain a running hash that is updated with each request received.) Note that contracts can not provide “random numbers” in the usual sense; incorporating past ciphertexts prevents using information disclosed during past decryption operations to answer the current challenge. The prover is given a deadline measured in block-height or time to provide the decryption of that ciphertext. If the valid RSA plaintext emerges during that window, the funds maintained as fidelity bond by the contract are sent to the prover. Otherwise the challenge is considered a failure and the contract resumes decryption services. By creating unavoidable penalties in the event of key compromise, this approach provides financial incentives for the service provider to take adequate measures. (But it does not prevent a corrupt provider from voluntarily decrypting ciphertexts when requested by parties other than the legitimate customer.)