Threshold ECDSA, key sharding and multi-signature: a comparison

Threshold cryptography refers to splitting cryptographic keys such that performing an operation—such as decrypting or signing a message— requires an interactive process among multiple shareholders without ever disclosing the secret. The emergence of cryptocurrencies has inspired a cottage industry of research on threshold ECDSA, the digital signature algorithm hard-coded by most cryptocurrencies. After some false-starts in 2014, more progress has been made towards efficient schemes. For example one recent paper at IEEE Security & Privacy 2018 made strides in reducing the the level of interaction required during that computation. But the techniques remain a long ways from being a realistic solution for any system storing cryptocurrency, especially when they have to compete with a much simpler alternative already built into common cryptocurrencies: multi-signature. In this blog post we compare three alternatives to see how they stack up:

  1. Naive secret splitting
  2. Multi-signature
  3. Threshold cryptography

Naive secret-splitting

In this model, the bits making up the secret value are split using a plain threshold-sharing scheme such as Shamir’s secret sharing. Such schemes are parameterized by so-called “M-of-N” settings, where N shares exist of which any subset M < N are sufficient to reconstruct the original value. In the case of cryptocurrencies such as Bitcoin the “secret” private-key is a scalar value that defines the private key, which is effectively a large number. That number can be split into shares and reassembled again to create the original private-key as long as a quorum of at least M shares are present. With the key reconstituted, it can be used to sign Bitcoin transactions

That reassembly step also happens to be the Achilles heel of this model: there is a single private-key at the end of the day and it is being repeatedly reassembled each time that key is used to transfer Bitcoin. (Recall that in cryptocurrencies such as Bitcoin or Ethereum, control over funds is asserted by creating a digital signature, which boils down to using the private-key.) That means there is a single point of failure, a trusted system where reassembly takes place where that secret value is exposed.

Native multi-signature

Around 2012 Bitcoin underwent one of its more ambitious feature expansions, introducing support for a scripting language for describing payment conditions. This language greatly expanded the range of conditions that can be tied to spending funds. Previously pay-to-public-key or pay-to-public-key-hash  addresses could only express a condition along the lines “you can spend these funds if the transaction is signed with this key.” With script one could replace the second part with more complex assertions such as  “… signed with key A or signed with key B and provides solution to a puzzle C.” Since the idea of requiring multiple signatures on a transaction is a very common scenario, the designers introduced native instructions into the virtual machine scripting language to support multi-signatures or “multisig:” instead of depending on a single key to sign the transaction, some quorum of M keys out of a larger set of N is required to sign.

This M-of-N condition may resemble traditional secret-splitting but the similarity ends there In Bitcoin multisig, there are N distinct, independent keys. By contrast, secret splitting is working with exactly one key. As far as rest of the world is concerned, there is one signature to verify, produced by one private key and verified with one public key. How that private key is managed behind the scenes by sharding across multiple systems is opaque to everyone else.  By contrast multisig involves creating M different signatures to authorize a transaction. Each of these signatures must be independently verified and the presence of multiple, distinct keys is visible to other participants in the system.

In a useful example of debunking false dilemmas around security trade-offs, multisig improves both security and redundancy. The security advantage is readily apparent when M > 1. An attacker planning to steal funds has multiple targets to chase after instead of just one. More importantly, those keys can be managed independently. There is no single system that must have access to all keys at the same time. Each private-key signs a transaction without requiring access to any of the other keys. (In fact, the signatures do not even depend on each other.  This is an example of co-signing rather than counter-signing where subsequent signatures would incorporate earlier signatures. For Bitcoin the “message” being signed is identical from the perspective of multiple key-holders, which allows obtaining the signatures in parallel, in arbitrary order.) In fact the security gains from multisig are directly related to the extent these multiple keys have uncorrelated risks. Put all eggs in the same basket as Bitfinex did, and multisig degenerates into the same baseline as single-key control.

The redundancy improvement is more subtle. While secret keys falling into the hands of an adversary is perhaps the most common threat model, there is another risk around simply losing access to the key. For example the device where the key is stored could malfunction or accidentally get discarded— newsworthy examples of that are plenty. This is key “loss” in a different sense: no one has access to the secret any longer and any funds on the blockchain controlled by that private-key are effectively taken out of the money supply. In multisig configurations where the quorum number M is less than the total number of keys N, key loss can be tolerated. Even with some shares permanently lost or even temporarily unavailable, transactions can be authorized.

There is one catch: multi-signature transactions take up a lot more space. A configuration requiring a quorum of M signatures out of a possible group of N must include not only all M signatures but also M public-keys. You might expect some type of space-optimization in the protocol, where future transactions out of the same address do not require enumerating the public-keys again since those are fixed for a given multisig address. But there is no such provision. Whether out of a strident belief that addresses must only be used once to preserve privacy— an ideological stance readily contradicted by evidence from actual blockchain statistics— or a pragmatic bias for avoiding complexity, the designers did not include any mechanism for “caching” key-sets (even though signatures were being stored permanently on the blockchain until segregated-witness.) This is more than a matter of engineering efficiency: when it comes to paying fees, transactions are priced by the byte. This makes multi-signature transactions more costly, with costs scaling linearly with in both parameters M and N.

Threshold cryptography

“If you think cryptography is the solution to your problem, then you don’t understand your problem” — attributed to Peter Neumann

Threshold cryptography promises to offer  the best of both worlds. As far as the rest of the blockchain is concerned, there is a single key used to create one signature;  transactions consume the same amount of space as old-school single key addresses. On the other hand, that key has been sharded into multiple shares, with each share given to a different party. More importantly, those shares are never reassembled. Instead the people controlling individual shares all participate in an complex protocol to create the signature, without any participant gaining more information about the shares held by others. Designing those protocols, reasoning about their properties and proving security is where the cryptographic techniques comes in.

It is also where the complexity lies. Depending on the particular problem, these protocols can be too unwieldy for implementation. For some cryptographic systems there are natural solutions. For example RSA allows additively splitting the private-key and easily combining individual signatures created by each piece. (But the joint generation of an RSA modulus such that no one learns the factorization is quite tricky. For this reason, threshold systems commonly assume “trusted dealer” where the entire secret is allowed to exist on one system temporarily during key generation, from which the shares are distributed.) For  the DSA family of algorithms, including its elliptical curve incarnation, there is no simple solution known. Solutions can fall short effectivness for a number of reasons. First and most important is the amount of interaction required among holders of shares during signature generation. In an ideal solution, no interaction is required. Each person performs a computation using their share working independently and outputs some intermediate result. Finally there is a reassembly phase, where an untrusted system collects the partial answers and reassembles them into a final result.

That model is ideal for offline storage of cryptocurrency, because it allows parallel, incremental construction of a transaction. Typical offline storage schemes involve storing cryptographic keys offline, on systems located across multiple geographically diverse sites. Because these systems are not connected to a network, participants must travel on site to have physical access for interacting with them. A threshold signature protocol with zero interaction allows those accesses to happen in any sequence, visiting one site at a time. By contrast, all known threshold signature protocols for ECDSA involve multiple rounds of interaction among holders of key-shares. That means either visiting each site multiple times, or more realistically having to orchestrate multiple groups visiting all sites simultaneously and being in communication in real time to carry out protocol steps— keeping in mind that these systems were supposed to be offline and disconnected from networks. None of the options are particularly appealing. One could knit together the “air-gapped” systems into a private network that remains (hopefully) isolated from the rest of the internet. But even in that best case scenario where the network is private, there is new attack surface: connected systems can now target each other, weakening one of the assumptions about each site having uncorrelated risks. Or the operators could resort to kludgy “sneaker-net” to exchange inputs/outputs with offline systems while leveraging public networks to transfer those inputs/outputs to other sites. The inefficiency of that alternative increases with the communication complexity of the protocol, and the number of “rounds” involved in assembling the final signature.

A second and more subtle limitation arises from the incompatibility of threshold signing protocols with existing cryptographic hardware, such as bitcoin wallets running on smart cards or HSMs. If a protocol requires manipulating secret material in ways outside the handful of algorithms supported by the underlying platform, it can not be instantiated using these systems. Once again RSA is the poster-child for a well-behaved algorithm. Any device capable of importing RSA private keys and signing with them can participate in an NofN threshold signature scheme (note M=N requirement) because the “partial computation” using a share looks exactly like producing a regular RSA signature. That said, there are implementation subtleties. For example, the hardware must support importing an RSA private-key without the Chinese Remainder Theorem or CRT components. Otherwise if the shares had to include CRT components, every participant could recover the full secret.

By contrast existing threshold ECDSA protocols do not use secret key material in any way resembling an ECDSA signature, or even other common algorithms defined on secret material, such as ECDH computation using the same scalar value. That means such protocols are incompatible with gold standard for managing high value secrets. (It is rare for  cryptographic hardware to permit users to implement their own cryptographic algorithms to run inside the secure environment, since a badly designed one allows key extraction.) That is a significant step backwards in operational security: general-purpose computing platforms such as PCs are not appropriate for managing critical secrets. Hardware wallets are built for exactly that scenario, with defenses against tampering, key-extraction and side-channel attacks. Giving up that assurance to save on transaction fees is a dubious trade-off. That said, this is a practical limitation that could be overcome if hardware wallets decided to add support for the more esoteric primitives required to make these protocol work. But as with all other software development, vendors of cryptographic hardware follow the 80/20 approach to defining requirements: they concentrate their engineering efforts on supporting the 20% of features used by 80% of customers, and avoid the long-tail of seldom used obscure algorithms that only a handful of people care about.

Bottom-line: while future developments towards a more efficient threshold ECDSA implementation could overcome these limitations, at the moment it’s a tough sell to give up on the simplicity of native multi-signature in favor of a complex multi-party computation protocol that is incompatible with preferred hardware for high-value secrets.



On-ramps & off-ramps: impact of Bitcoin payments on the regulatory landscape

Money serves three functions: store of value, unit of account and medium of exchange. Bitcoin has achieved a dubious hat-trick in drawing criticism in each category. It is claimed to be too volatile to act as a measure of economic value—how can merchants risk accepting BTC if the prices fluctuate drastically day to day? For that matter, why would consumers risk giving up an appreciating asset if it is likely to be worth a lot more tomorrow? (Recall the infamous Bitcoin pizza incident.) It has also attracted a chorus of skeptics arguing that it can not succeed as a popular payments, pointing to the intrinsic scaling limits and costs of transacting on the congested network. Blockchain space is scarce, topping out around 7 transactions per second worldwide, dwarfed by the throughput of existing centralized payment rails such as credit-card and PIN debit networks. Moreover transactions compete against each other for inclusion in every block by offering “fees” to miners, who are incentivized to include only those with highest fees into the next block. Counterintuitively, those fees depend not on the monetary value of the transaction but on the amount of space it takes up on the blockchain— akin to charging for check-processing based on thickness of the paper used instead of notional value printed on the check. Depending on demand on the network, this results in drastically varying efficiency. In 2014 it was possible to transfer $80M while paying a few cents—outshining every other system including wires in cost-effectiveness— while the late 2017 frenzy witnessed  users spending tens of dollars in fees to send amounts of the same order of magnitude, which is a staggering level of inefficiency comparable to Western Union for cross-border remittances.

After an initial period of exuberance where merchants raced to outmaneuver each other in claiming bitcoin support and scoring cheap PR points, it is clear that bitcoin is not currently working as a payment mechanism. Ethereum never focused on payments as a use-case, while other altcoins such as monero with aspirations to being a medium of exchange are optimizing for dubious black markets and criminal activity outside the mainstream. Jury is out on whether any of the scaling approaches proposed such as Lightning Network will provide a lasting solution. 

 But this is more than a question about technical merits or real-world feasibility: it will have repercussions for the regulatory landscape around cryptocurrency. Whether Bitcoin or another cryptocurrency succeeds as a widely used medium of exchange will determine which participants in the ecosystem become the focus points for regulatory influence. If conducting meaningful commercial activity requires converting into fiat, regulating cryptocurrency becomes a simple problem focused on existing, already regulated businesses which serve as on-ramps & off-ramps to blockchains for the majority of consumers. (This is true even when that conversion into fiat is hidden from consumers, as in the case of most merchants who claim to accept Bitcoin but are in fact relying on a third-party payment processor who accepts Bitcoin from the consumer while paying out in USD to the merchant, effectively absorbing both the exchange risk and information security challenges associated with storing bitcoin.)

The difference can be illustrated by imagining two alternative scenarios. In the first case, Bitcoin or some other cryptocurrency with better intrinsic scaling properties succeeds in becoming a ubiquitous payment mechanism, complete with a vibrant ecosystem of consumers and merchants rivaling that of existing payment networks. Average citizens purchase goods with Bitcoin from one another with the same ease as swiping a credit-card, with one crucial distinction: there is no middleman moderating the transaction. (More accurately, since blockchains are not made out of magic, the “middleman” has been replaced by a decentralized network of miners.) There is no reason to convert bitcoin into or out of USD— which is what happens today behind the scenes when most merchants rush to announce bitcoin support. Such cross-currency hops are not only unnecessary but they would introduce economic inefficiencies compared to transacting in bitcoin as a first-class, “native” currency.

In the second scenario, Bitcoin is relegated to a different role, serving primarily as store of value (or unit of measure, however implausible that appears from current levels of volatility.) In this word everyday retail transactions are not carried out over the bitcoin network, directly or indirectly through second-layer solutions such as Lightning Network. either because no viable scaling solution has emerged or because network fees are too high compared to existing payment rails, dissuading merchants from switching. BTC can still be competitive for niche scenarios, such as cross-border remittances or infrequent large transfers (such as settlements between commercial banks) where 24/7 reliability and speed is more important than saving a few basis points.

In both scenarios, the censorship resistance of the Bitcoin network is identical. While there are legitimate concerns around the geographic concentrations of mining pools in China and a software “monoculture” of one implementation running every node, it would require extraordinary effort to blacklist addresses or make arbitrary edits to the ledger to undo history or reverse payments. Where these scenarios differ is that such interventions are not required in the second case as a prerequisite to effectively regulating flow of funds.

If commercial transactions require converting cryptocurrency to/from fiat, existing authority to influence movement of that fiat currency also translates into influence on use of virtual currency in any meaningful commercial capacity. If consumers are constantly liquidating bitcoin as a prerequisite for most economic activity, one need only focus on regulating these so-called “on-ramps” and “off-ramps”— points in the ecosystem where bitcoin is exchanged for a native currency accepted in-real-life, such as US dollar. This a tractable problem. Dealing in fiat currency requires banking relationships and maintaining banking relationships involves following existing regulatory regime, which includes a functional compliance program among other components. (One need only look at the instructive example of Bitfinex to see what happens when those connections are severed.) It does not matter that payments within the bitcoin network can continue to flow freely because the balances recorded on that ledger, no matter how significant when converted into notional USD numbers, grant very little purchasing power unless the funds can be exchanged back into USD. If there are only a handful of such off-ramps to exit out of bitcoin and into old-fashioned but universally accepted currency necessary for meaningful commercial activity, there is an effective point to focus enforcement activities.

Consider the example of WikiLeaks. In 2011 the US government pressured credit networkings including Visa, MasterCard and Paypal to stop processing payments for the organization. This forced WikiLeaks to start accepting bitcoin. With the massive run-up in price against USD, WikiLeaks can paint a slightly revisionist picture of that forced move as a blessing in retrospect. But what about the cash-flow side of the equation? Even if all WikiLeaks personnel were volunteers resulting in zero labor costs, no salaries to pay etc. as soon as there are any expenses to account for— data-center hosting, advertising etc.— the question is whether those costs can be reimbursed in bitcoin. If the counter-party is not willing to accept bitcoin, WikiLeaks must find a way to convert their existing stash of bitcoin into old-school currency, interfacing with aspects of the financial system where stringent rules apply around which customers one can conduct business with. (Quite apart from access, this hypothetical scenario also raises the question of whether paying for expenses out of bitcoin—converted or not— was something to brag about in the first place: if the currency appreciated 50,000% since initial donation was made, having spent that on operational expenses to keep the lights on has echoes of the famous million-dollar pizza episode in bitcoin.)

Even overcoming the scaling challenges may not be enough to realize a world where consumers pay for their morning cup of coffee in bitcoin. Volatility may prove even more tricky, not being amenable to simple technological fixes. In the long-term, the belief that price of bitcoin will rise against USD means everyone will hoard their cryptocurrency, with no one willing to spend it on such pedestrian purchases as breakfast. (That brings a whole new meaning to the cliche that skipping a latte every day will compound into significant savings.) Meanwhile significant short-term fluctuation discourage merchants. Pricing goods in bitcoin requires hedging against the exchange risk. The path of least resistance for risk-averse merchants is to cash out to USD immediately after the transaction, which is exactly what most merchants who claim to accept bitcoin are doing: in reality their payment processor collects BTC from the buyer while delivering USD minus transaction fees to the merchant. That approach insulates the merchant from the risk of holding bitcoin long term, while superficially granting them an excuse to display the bitcoin logo on checkout pages and press releases. But bragging rights aside, the popularity of that model for merchants highlights the fragile nature of decentralization. The favored integration model for accepting bitcoin is in fact not using bitcoin as a peer-to-peer currency: there is a payment processor in the loop for converting bitcoin into fiat currency to make it more palatable to the merchant. That middleman is likely a regulated money transmitter and can be legally compelled to stop processing payments for specific clients. If cryptocurrency becomes a first-class, native payment system for online or brick-and-mortar transactions, it will require a fundamentally different regulatory approach focused on trying to impose order on unwieldy peer-to-peer networks. If instead the current trajectory continues, incremental approaches centered around on/off-ramps will be sufficient.



We can decrypt it for you: smart-contracts for outsourced encryption

A small-step for cryptographic agility

Among the features introduced by the Byzantine hard-fork of Ethereum, one of the more under-appreciated ones may be support for RSA operations. Most virtual currencies have a very weak story around cryptographic agility, defined as the ability to easily change the cryptographic primitives used such as signature algorithms, block ciphers and hash functions. For example Bitcoin settled on ECDSA over one elliptic curve— secp256k1— as the signature algorithm and SHA256 for hashing. (There is a long-standing proposal to introduce Schnorr signatures that could introduce some variety but it is yet again limited to that one curve.) Ethereum also uses ECDSA over the same curve, but inexplicably hard-codes a pre-standardization version of SHA3 for hashing.

Arguably these decisions are  short-sighted given that large-scale blockchains are notoriously difficult to evolve once deployed. Convergence of the system depends on on all participants agreeing on a set of ground rules for interpreting transactions appearing on the chain. If one faction invents some new-fangled feature that the rest of the network does not recognize, they will disagree on which transactions are valid and what the true state of the world is. In effect the  Bitcoin is culturally allergic to using hard-forks to introduce novel functionality— witness the acrimonious debates over increasing block size— and historically relied on incremental soft-forks for improvement. Ethereum Foundation has shown no such restraint, shepherding the currency through multiple hard-forks. Some were controversial and did not exactly go as planned; the 2016 hard-fork to bailout DAO instead gave rise to Ethereum Classic. More recent forks have been less contentious, likely because they introduced features without obviously being designed for the benefit of a particular contract or stakeholder.

In one sense, RSA support it is pure syntactic sugar: EVM is Turing complete. Modular exponentiation, the operation underlying RSA, could have been implemented in Solidity or any other language that compiles to EVM byte-code. Granted the same could be said of ECDSA support provided by the built-in ecrecover contract invocation: that could have been left up to every one to implement on their own. Instead EVM designers made a pragmatic decision about providing a native operation to support a feature expected to have widespread usage. This is the same judgment call API designers face constantly: whether to provide a single convenient API that packages commonly used functionality or leave it up to user to stitch that together from existing APIs. In this blog post we evaluate one scenario that is now made possible with RSA support, namely outsourcing key-management and encryption to trusted third-parties.

We can decrypt that for you

Common disk-encryption solutions including MSFT Bitlocker and Apple FileVault have an option to back-up keys to the enterprise. This is achieved by encrypting contents to two or more different keys: one chosen by the user and one chosen by the IT department, usually an RSA public-key. This allows the company to access storage on those devices even if the assigned user is unable or unwilling to assist. Taking this one scenario step further: suppose the IT department does not want to manage that additional key that can decrypt all employee devices. Can they outsource this to a trusted third-party in a verifiable way? The third-party is retained to provide on-demand decryption when the IT department needs to access one of the devices using the secondary key.  Alternatively the third-party could act as “back-up:” even if the IT department itself has a key, they may want additional redundancy.

There is an important question around whether such draconian arrangements are justifiable or consistent with precedent. For now we will put aside these policy questions, which are highly dependent on jurisdiction. US companies typically adapt the stance that employees have no expectation of privacy on company-issued devices. In centrally managed IT environments, it is very common for the company to have root/administrator privileges over all machines. Those privileges allows IT personnel to remotely execute arbitrary commands on user devices and access any stored information. In other words, being able to decrypt storage offline is not necessarily granting a new capabilities that is not already present in a traditional, all-Windows environment with machines are joined to an Active Directory domain.

One important tangent on policy is warranted here before moving on to the implementation. On the surface, outsourcing encryption looks awfully similar to the dangerous and discredited notion of mandatory key-escrow. Starting in the 1990s, US government has requested backdoors in cryptographic protocols, asking that all encryption products effectively use two keys: one for the intended recipient and one for law-enforcement in case future access is required. Following strong backlash from cryptography and civil-liberties communities, these ideas seeming fell out of favor as access to strong encryption increasingly became taken for granted. But the uneasy truce came to an abrupt end in the aftermath of the 2014 San Bernardino shootings which resurrected calls for key-escrow. Compelled key-escrow is problematic because unwitting users are being forced to trust third-parties they have no control over or trust, merely by virtue of purchasing gadgets or software from vendors subject to that jurisdiction. This is different than voluntary escrow taking place in a closed system such as an enterprise IT environment, where the company is making a deliberate risk-benefit decision to outsource key-management to selected third-parties. In addition to the freedom of choosing these providers—or not choosing for that matter, in favor of keeping that function in-house— they can also reduce risk by splitting keys. For example, 3 providers can be selected and the original secret split in a 2-of-3 threshold scheme with each share encrypted to a different provider. That arrangement removes single points of failure, requiring cooperation/malfeasance/compromise of multiple providers to recover the secret.

Design outline

A smart contract running on Ethereum can provide a number of useful properties:

  • Backup service gets paid per decryption in a verifiable manner— payment is made if and only if a successful decryption is provided
  • The backup service can post a fidelity-bond attesting to the security of its key management. These funds are under control of the contract and designed to be released to an attacker upon submission of evidence that the key has been compromised. In effect, this is a self-enforcing bug bounty as well as a canary designed to alert users of a key compromise.  As long as the economic value of the key for an attacker is less than the value of the bounty, a rational attacker would choose to claim the bounty and in the process, alert the entire world that the key has been compromised. If no such evidence emerges after some time, the funds are released to the operator of the service. (Alternatively funds can be returned gradually over time, with some fraction returned every few months of successful operation.)

How would such a system operate?  Since a contract itself can not hold secrets, the actual decryption keys must be held by the service provider. But the provider can publish the public-key which is sufficient for encrypting data. The customer is free to encrypt arbitrary plaintext using that key. But when it is time for decryption, they have to go through the smart contract to reach out to the service provider. This contract has three methods:

  1. Request decryption: Customer submits a request to decrypt a ciphertext encrypted in the RSA key of the backup service. This call is also required to send previously agreed-upon amount in ether to the contract as pre-payment. In practice it would also be subject to an authorization check based on message sender address; only the legitimate customer is allowed to request decryption. This prevents random interlopers who stumble on ciphertext from using the contract as a decryption oracle.
  2. Submit decryption result: Invoked by the operator of the backup service, providing the plaintext for a ciphertext previously uploaded by the customer.
  3. Request refund: Invoked by the user to get their money back if the service was unable to provide service in a timely manner. (Note there is a race-condition here in the event of blockchain congestion or miner censorship. The decryption result becomes visible to the customer as soon as the service provider broadcasts it to the network. But it will not count towards meeting the deadline until it is mined.)

Step #2 is where support for RSA operations come in handy. This is the point where the smart-contract must verify whether a correct decryption was provided, by applying the encryption step to the alleged plaintext. (Note we are assuming raw RSA decryption— this is also required to support “blinding” as explained below. The plaintext itself may have padding such as OAEP but its fully-padded version must be returned by the backup service.) If the provided plaintext is correct, the payment originally sent in step #1 is “captured” for good. Otherwise the funds remain in a provisional state. If no decryption materializes after some time period such as 24 hours, the user can invoke the refund function. This function checks the internal contract state and ascertains whether an outstanding decryption request initiated by the same caller has not been successfully resolved. If that is the case, the initial payment is sent back to the caller.

This construction sounds counter-intuitive in one aspect: everything taking place on public blockchains including Ethereum is visible to the entire world. Any one else can observe the above transactions getting posted, including the decryption result. That may look like a serious problem— everyone else is learning the secret— but there is an easy work around: RSA cryptosystem supports blinding using the multiplicative homomorphism of RSA. Instead of decrypting the primary ciphertext, the customer submits a modified version that is masked by a random factor. The decryption of that masked ciphertext still allows recovering the original plaintext, but only for the original submitter with knowledge of the masking factor. Other people observing the blockchain transactions are effectively witnessing the decryption of a random value that reveals no information about the original.

Improving accountability

What about providing accountability for the service provider? The idea here is to augment the contract with additional methods that an attacker who compromised the key can invoke to prove that fact and receive the value of the fidelity-bond as their reward. There are different ways to do this varying in complexity and level of evidence required. For example, an attacker can submit the factorization of the modulus and the contract verifies that the product of factors is equally to the modulus. This is sufficient but not necessary to prove key compromise. For example, in many situations the private key material is kept on tamper-resistant hardware such as an HSM which is designed to perform operations without surrendering the raw key bits. If an adversary obtains possession of the HSM and has the necessary credentials to operate it, the key is effectively compromised regardless of whether raw bits can be exfiltrated out of the hardware.

We can lower the burden of proof and require attackers to only prove they can decrypt arbitrary ciphertexts. This is tricky because the contract itself is used to arrange for decryption of arbitrary ciphertexts and those results are available on the blockchain for all to see. Not to mention, when the RSA cryptosystem is used without padding, existential forgery of what appears to be “successful” decryptions is easy. Anyone can fabricate plaintext/ciphertext pairs for which they claim to have magic decryption capability. Even in a design where that choice is taken away, the customer of the encryption service can subvert the process: they can request a decryption and leverage that result as “evidence” to prove they have control over the private key. (Use of RSA blinding makes it impossible to ascertain that they were the same ciphertext, since a randomly masked value is getting decrypted each time.) A simple if not entirely satisfactory approach is to implement a challenge/response protocol during which ordinary decryption requests are paused. The “prover” who claims to have control over the private key initiates the process by invoking a contract function. To prevent spam, we can require that such challenges are accompanied by some amount of ETH as commitment to weed out provers with no hope of successfully completing the process. That function returns a challenge, which is based on recent block hash of the chain, message sent by the prover and internal contract state that incorporates all ciphertexts it has been asked to decrypt to date.(For example we can maintain a running hash that is updated with each request received.) Note that contracts can not provide “random numbers” in the usual sense; incorporating past ciphertexts prevents using information disclosed during past decryption operations to answer the current challenge. The prover is given a deadline measured in block-height or time to provide the decryption of that ciphertext. If the valid RSA plaintext emerges during that window, the funds maintained as fidelity bond by the contract are sent to the prover. Otherwise the challenge is considered a failure and the contract resumes decryption services. By creating unavoidable penalties in the event of key compromise, this approach provides financial incentives for the service provider to take adequate measures. (But it does not prevent a corrupt provider from voluntarily decrypting ciphertexts when requested by parties other than the legitimate customer.)



What is the value of half a secret?

Putting a price tag on cryptographic keys

Cryptocurrencies such as Bitcoin achieve what may be the purest distillation of money into information: control over funds is achieved by having control over associated cryptographic secrets, specifically ECDSA private-keys used to sign transactions. If you can sign with a specific key, you can spend the funds at a specific address associated with that key. Multi-signature is an extension of this model where instead of using a single key, possession is distributed into a quorum of M keys out of N. This improves both security— compromising a single key does not allow an attacker to walk away with any money— and redundancy: losing access to one key, say due to hardware failure, does not result in the legitimate owner losing access to their money. (Assuming M is less than N; otherwise the loss of any key is fatal.)

It also creates something of a conundrum for putting a price-tag on one such key for the purpose of threat modeling. What is a reasonable amount for the defender to spend in securing that key? For that matter, how much effort would a rational attacker spend on going after one key? Suppose 300 BTC is stored stored in a 3-of-5 multisig address. Intuitively one could argue the value for each key is exactly 1/3 the balance or 100 BTC. But this naive conclusion overlooks the “criticality” condition involved: any combination of less than three keys is worthless from the perspective of controlling the associated funds on the blockchain. (Assuming they are not also used to control some other address with different M-of-N configuration.) But once that third key falls into the wrong hands, the entire sum can be claimed.

In more concrete terms: suppose a disgruntled insider offers to sell an attacker one of the keys for 50BTC. If individual keys are valued based on their pro rata share at 100 BTC, that would be a great deal. But if the prospective buyer has no hope of obtaining any of the other four keys— perhaps they are managed by more honest employees— that 50BTC spent on attacking the first key will yield a return of exactly zero. There is an all-or-nothing phenomenon at work for all would be adversaries: 100% of resources dedicated to getting hold of the first M-1 keys are wasted unless that final key to reach quorum can be successfully obtained.

Critical thresholds

This is an unusual property of Bitcoin that is not easily achieved for other commodities it is typically compared to: half a barrel of oil is worth approximately half that of a full barrel. It is not possible to create a configuration of crude oil where three barrels must be consumed at the same time. If a gold vault is compartmentalized into five rooms containing equal amounts of the precious metal, plundering one out of five rooms is still a good outcome for the attacker.

Of course there exist physical objects with supra-linear scaling properties based on quantity: a two-carat diamond is worth more than two individual one-carat diamonds with identical specs. Most physical objects follow this pattern. If a car is chopped in half down the middle, the left and right-sides are not worth half the original value of the vehicle, because neither side amounts to a functioning mode of transportation any longer. But the value of those scattered parts is not zero either. They can be sold for scraps or combined with other interchangeable parts to put together a fully functioning if somewhat unusual vehicle.

It is usually in the mathematical realm that we run into scenarios where objects can be divvied up in a way that creates critical thresholds. For example, it is possible to split a credit-card number into two shares such that either share contains exactly zero information about the original, in a strong information-theoretical sense: even an infinitely powerful computer could not recover the original value from a single share. But the combination of two such “worthless” shares yields the original secret. There is another unusual property of these constructions: not only is combination of shares below the threshold worthless, the only way they can attain any value at all is in conjunction with other shares generated from that particular split. If you have half the components for a bicycle, there is a cottage industry of bike shops and parts suppliers to equip you with necessary missing pieces to assemble a complete bike. But keys in multi-signature configuration can only be combined with other keys in the same group. There is nowhere else to turn.

Multi-signature wallets and accountability

This is more than a philosophical puzzle. At least one multi-signature wallet provider has argued that their company should be exempt from any regulation because they are only providing one out of multiple keys required to control funds, with the remainder controlled by the customer.  This is in effect arguing that one key has zero value, as long as the quorum requires multiple keys. The argument rests on the observation that if the co-signing system goes rogue or experiences a security breach, its customers can not suffer any losses: additional private-keys not controlled by that service exist, gating access to funds on the blockchain.

But this argument is easily debunked by a reductio ad absurdum: suppose a customer creates a 2-of-3 multisig wallet and hires two co-signing providers to help manage those funds. At some future date, both providers are compromised by the same attacker who absconds with all funds in that wallet, since he/she is able to sign with a quorum of 2 different private-keys. Is it reasonable for both providers to disclaim liability by throwing up their hands and saying “not my fault, someone else had another key”? A more plausible interpretation of fiduciary responsibility is that both providers are implicated in the loss. Even the original argument is self-negating on closer reading: if the value of a multisig key is so minimal that one need not be concerned about the security level of a service that specializes in managing those shares, why bother with outsourcing to those providers in the first place? One could also store that share under the mattress or write it on a post-it note. It is contradictory to argue that you are providing an essential service—helping individuals or organizations offload the complexity of managing private keys— while also seeking an exemption from regulatory scrutiny because the secret you are entrusted for safe-keeping has no value, in the sense that it can not cause monetary losses by itself.

RSA 2FA breach as precedent

It will likely take a long time before a regulatory framework can answer these questions. Individual court cases could set precedent earlier. None of the publicized cases so far have involved a failure of the cosigning service. In the 2016 Bitfinex breach, BitGo acted as the co-signer in what was described as an alternative design to the traditional “cold-wallet” concept where majority of funds are stored in an offline system. During this incident BitGo API co-signed the multisig transactions used by perpetrators to take bitcoin out of the Bitfinex system. But BitGo has maintained—and Bitfinex has never disputed this assertion— that its API was working exactly as designedAPI was working exactly as designed: properly authenticated calls were made by Bitfinex, or someone with credentials required to impersonate Bitfinex, requesting co-signing of transaction. That episode raised questions about the effectiveness of co-signing, specifically for “online” configurations where a single point of failure exists: the Bitfinex system held both its own signing-key and credentials necessary to instruct the cosigner to sign with a second key. But it does not raise the question of co-signer liability precisely as long as the service was following instructions from “Bitfinex.”

Other examples from past security breaches suggest that providers of “additional” security features can not easily shirk responsibility. RSA Data Security experienced a breach of their SecurID authentication system in 2011, resulting in undermining the security of anyone who relied on RSA tokens as the second-factor of authentication— a widespread radius of collateral damage that included the defense contractor Lockheed-Martin. Following the co-signer logic, RSA could have argued that they are merely providing a second factor of authentication on top of an existing one such as passwords. As long as passwords are not compromised, failure of 2FA  alone can not result in bypass of authentication. To its credit RSA did not pursue that line of argument. After an initial period of unconvincing PR, the company began offering replacements to affected users, in keeping with the common sense understanding that 2FA providers have a responsibility to maintain a robust solution. It is unlikely that co-signing services will be held to a different standard.


Ethereum, Solidity and integer overflows: programming blockchains like 1970

Role of language design in exploit mitigation

On Wednesday a new vulnerability was disclosed in the implementation of some ERC20 tokens including BEC operating on the Ethereum blockchain. As far as impact goes, the flaw is about as critical as they come: unauthorized “gifting” of tokens out of thin air.  While this may have been the first time such a bug was observed being exploited in the wild on a blockchain, the type of vulnerability is far from novel: an integer overflow, rooted in the fact that most programming languages operate on numbers with limited range. Perform a calculation where the result falls outside that range and the result of that operation is no longer accurate, violating common-sense assumption. For example two very large numbers added together can produce a small number. Or two positive values multiplied may yield a negative product. This class of bugs are very well known in the realm of low-level languages such as C and C++. Combined with weak type-safety, lack of range checking and manual memory management, such flaws often provide a starting point for building a full remote-code execution exploit.

What is unusual is that the Ethereum virtual machine (EVM) design and the Solidity language made a series of bad decisions that resurrected a vulnerability class from low-level systems programming in the context of a wildly different environment intended for decentralized smart-contracts. To better demonstrate these unforced errors on the part of language & VM designers, it is instructive to compare how a low-level language such as C or x86 assembly handles overflows.

Getting by with finite integers

Suppose we are asked to sum two unsigned integers X and Y, and compare them to some other value Z. This is a common pattern in bounds checking, a defensive measure to prevent the application from reading/writing from memory outside the intended regions.

if (X + Y > Z) {
  // Limits exceeded, bail out
} else { 
  // Continue execution…

For extra color, let’s assume one of these values such as “X” can be controlled by an attacker, while Y and Z are constrained by the application itself. If arithmetic in C operated in the Platonic realm of natural numbers, the above code snippet would always run correctly. But these values are not represented by an idealized “integer” type that can take on any value. Instead they are limited to a fixed range, defined by the size of the integral type. This is dependent on the compilation model, which itself is tightly coupled to the hardware architecture code is being targeted at. For example, in the data-model LP64 common for Linux, the C type “unsigned int” has 32-bits while the larger type “unsigned long” can hold 64-bits. (Integral types can also be signed, further reducing the representable magnitude since roughly half the possible values are now allocated to negative numbers.) That “unsigned int” type can represent every number in the range 0 through 4294967295 inclusive. Equally important, it can not represent any value outside that range.

That poses a problem for doing math: it is possible to write expressions such as X + Y where both X and Y start out in the permitted range but the result falls outside. In other words, the operation overflows the result type. So what happens when a C program is asked to run a calculation resulting in overflow? Intuitively the options are:

  1. Return the result in a different, larger integral type such as “long.” While appealing at first glance, this is not a feasible solution for two reasons. First we run out of integral types when the same trick has to be applied for “long” or even “long long,” which a recent addition. As long as the type system only has a finite number of built-in types, there will be one with the highest capacity where we run out of room. But more importantly, playing games with types will only result in converting the out-of-range problem into a type-cast problem. Typically programmers write expressions such as “sum = X + Y” where X, Y and sum all have the same type. Even if the result is magically upgraded into a wider type, it will still get shoe-horned into the original type downstream. Playing conversion games only delays the inevitable.
  2. Treat it as a fatal error. In the same way division by zero results in halting the program— or more precisely, raising a signal/exception which, if unhandled will terminate the program— every overflow could result in triggering an error designed to immediately stop execution flow.
  3. Undefined behavior. This is a favorite cop-out for language designers, effectively stating that the compiler is free to produce any value: the result can be 0, 42, the compiler authors’ year of birth or it may crash. All of them are valid outcomes if the behavior is deemed “undefined” according to the language specification.
  4. Truncate the result in a well-defined manner and keep soldiering on.

It will come as no surprise to those familiar with the minimalist philosophy of C that the language opts for a combination of #3 and #4. (C++ carries on that tradition, for the most part deferring to C on the behavior of primitive types.) Overflow for unsigned types is perfectly legal and truncates the result. Specifically only the least-significant 32-bits of the result are taken. Meanwhile overflow on signed integral types is undefined. But pragmatically speaking, for most compilers and hardware architectures the result ends up being very similar to unsigned behavior: values are truncated to squeeze into the expected type. Looked another way, C programs are not working in the realm of natural numbers with their infinite possibilities. Instead they are doing modular arithmetic in a finite ring where values outside the range “wrap around” to an equivalent one modulo some power of 2.

One level below: x86/x64 hardware behavior

If this property of C looks arbitrary, going one more level down to look at how processors handle arithmetic at the hardware level may vindicate the designers. C is close to the metal, so to speak. Its constructs often map 1:1 into capabilities of an underlying hardware. Consider the common Intel x86 architecture. This architecture sports 32-bit registers and features instructions for operating on them. For example one can add the contents of register EAX to the value existing in register EBX:

add EBX, EAX

Since registers are also limited in the range of numbers they can express, executing this instruction poses the same dilemma: what happens if EAX and EBX hold values that added together would exceed the maximum integer representable in 32-bits? As with most hardware architectures, x86 adopts the convention of doing modular arithmetic: operations are implicitly performed modulo 2**32 (or some other appropriate power of two when working with 16-bit or 8-bit registers.) Nor is that a quirk limited to Intel: most hardware follows this pattern.

Defensive programming: hardware vs software

Given that context, it is hard to fault C for following the same model: it allows using efficient CPU instructions to perform arithmetic without having to double-check or adjust the results spit out by the processor. And while C is rarely accused of being a high-level language, many other languages that followed including those targeting virtual machines such as Java, have also taken the path of least resistance by going along with the hardware. Languages like Python and Scheme where integers have arbitrary precision by default are the exception, not the rule.

That said, there are some crucial differences in how arithmetic works at a low-level which are missing from the abstractions offered in high-level languages:

  • Multiplication has special case behavior of using the “type expansion” idea. Specifically product of 32-bit integers are in fact written into the pair of registers EDX and EAX, to accommodate a full 64-bit product.
  • CPU has a flags register that contains special flags updated based on the result of the last operations. This includes an overflow flag. Even more conveniently, the x86 instruction set contains conditional jumps based on whether that flag is set: JO, short for jump-on-overflow. That makes life easy for a programmer working in assembly language to detect and respond to overflow conditions with just 1 additional instruction— which for all intents and purposes is an unconditional jump, because the branch-predictor will correctly guess that an overflow will not happen in the common scenario.

By contrast checking for overflow in a language such as C or Java is far more kludgy because these low-level signals are missing. There is no facility provided by the language itself for dealing with overflows in an efficient manner. Options are:

  • Perform an extra arithmetic operation first to check if overflow will happen. For example, we can subtract one operands from the maximum representable value and if that difference is larger than the second operand, addition is guaranteed to overflow.
  • Sanity check the result after performing the operation, by comparing the result against one of the original operands. (Leveraging the observation that an overflow condition exists for unsigned addition if and only if the alleged sum is less than either one of the operands. Note that the rules are a lot more complex for signed arithmetic.)

Many libraries exist for doing arithmetic safely with such overflows checks included, but they trade-off safety for kludgy syntax and readability. Instead of writing a straightforward mathematical expression such as “result = X * Y + (Z * scale) + offset” one must replace each binary operator by a functional call, and potentially conditional checks for the overflow status of that particular subexpression. C++ with its support for operator overloading can at least overcome the syntactic problem, since one can design a replacement for every integer type along with  that performs overflow checking. However in the absence of direct compiler support or very diligent use of inline assembly, these user-defined substitutes will be very inefficient compared to native types.

Compiler features to the rescue?

Where the language falls short, compiler authors can sometimes step in. For example GCC has an option called trapv that will raise a trap when operations on signed integers overflow. This is an important distinction because C already considers such behavior to be undefined— crashing the program is just as valid as truncating the result modulo 2**32. But there is no corresponding option to treat overflow of unsigned integral types as a fatal condition: no “correct” compiler can do that because such overflow is perfect legal C and even explicitly relied on to implement functionality such as hash-functions and cryptographic algorithms. (A research paper from 2007 explores the idea of instrumenting existing binaries to fail on unsigned overflow, confronting exactly that problem of having to identify code that deliberately relies on truncation.)

Solving the problem for signed integers may sound good enough except for the fine-print in C conversion rules: in any operation involving a mixture of signed and unsigned operands, the signed input gets typecast to unsigned and arithmetic performed on unsigned values. In other words, introducing any unsigned value into a calculation voids the overflow check.

Solidity and Ethereum: the missing defense

EVM features large 256-bit words internally and has instructions to perform arithmetic operations on these values. Solidity in turns offers integral types up to 256-bits for programmers to use. But as the preceding discussion suggests, one can not “outrun” integer overflow vulnerabilities by adding bits to integer types any more than buffer overflows are fixed by adding more RAM. (Going from 32-bit →  64-bit architecture has expanded addressable memory to 256 terabytes but that did not spell the end of memory corruption bugs.) As long as resources are finite and attackers allowed to sneak inputs into an operation, it is possible to have an arithmetic overflow.

When it comes to defensive measures for avoiding overflow vulnerabilities, it turns out both platforms are worse than their ancient counterparts. In Solidity operations on integral types silently wrap-around by default in the event of an overflow. There is not even a compiler flag (at the time of writing) to treat overflow events as critical error— the equivalent of 10 year old “trapv” flag in GCC. At the processor level, EVM has no efficient mechanism for tracking overflow state or making decisions conditional on whether a recent computation resulted in overflow. There is nothing comparable to the x86 overflow flag and attendant condition branching instruction. Strictly speaking this is not required; a compiler can synthesize overflow checks with existing EVM instructions, patterned after the kludges used for C. But intrinsic EVM support could have made it that much easier.

For historic perspective, the x86 architecture dates back to the 1970s. So does C, designed as a low-level systems programming language with the explicit goal of closely mirroring the underlying hardware. (C++ arrived on the scene in the early 1980s, adding object-oriented paradigms while continuing to pay homage to the efficiency requirements of C.) Ethereum started four decades later with a completely different use-case: programming smart-contracts on a blockchain, which is an extremely inefficient system for computation and where correctness matters far more than shaving a few virtual cycles. Likewise, Solidity as a language for writing smart-contracts started from a clean slate. Unlike C++ it did not inherit the baggage of having to support existing C code with assumptions about how integer arithmetic words. It’s a tribute to the short-term memory of computer science that a modern programming language designed with the singular purpose of building applications on a blockchain platform— where bugs can result in significant monetary losses— still ended up perpetuating an ancient vulnerability class.




Money for nothing: Bitcoin Cash & the economics of hard-forks (part II)

[continued from part I]

Obstacles to hash-rate equilibrium

There are important caveats to the preceding argument. First we made a simplifying assumption in equating “mining rewards” to coinbase rewards. That neglects additional earnings from transaction fees collected on every payment taking place in that block. For much of the history of Bitcoin, this would have been a reasonable assumption: coinbase rewards dominated fees. In fact this dynamic made possible the extremely low fees that inspired many an optimistic pundit to sound the death-knell of other payment systems that charge usuriously high transaction costs in comparison. Over time as blocks became increasingly congested with scaling solutions stuck in a political quagmire, fees have continued creeping up. At some point in early 2017 the network achieved a state of inversion: transaction fees exceeded coinbase rewards in their contribution to overall mining revenue. That dynamic will continue to be exacerbated as the reward decreases periodically through the “halving” events baked into the bitcoin monetary supply.

That said, the decision criteria for choosing a chain is the relative expected reward per block. Imagine one chain has massive congestion and very high fees (consumers racing to out-bid each other to get their transactions through) while another chain has plenty of spare capacity in its blocks, with correspondingly low fees. That second chain is less appealing to miners, all else being equal and may not merit switching even if it appears to be trading at a premium based on coinbase rewards alone. On the other hand, the congested chain is also much less appealing to users: why continue paying exorbitant fees if you can transact on the other other chain faster and at lower cost? (After all the raison d’etre of Bitcoin Cash is ostensible adoption of 2MB blocks that squeeze more payments into one block, with reduced fees expected to follow as natural side-effect as artificial scarcity is eliminated.) It is unclear to what extent lock-in effects exist for cryptocurrencies. But assuming that some users are free to defect, the second blockchain may appreciate in value over time relative to the first. As more people conclude they can reap greater benefits transacting on that chain, they will bid up the value of the currency (and incidentally, start filling up those blocks with more transactions.) From this perspective, a miner is sacrificing short-term revenue by accepting lower transaction fees in exchange for anticipated future increase in value of accumulated reserves, due to better economics/features offered by the forked chain.

Short-term instability

The second caveat is that the analysis focuses on steady state, after block times have adjusted. As pointed out in the hypothetical scenario of the 1% miner, the introduction of new hash power into a network that follows Bitcoin-type rules leads to temporarily increased profitability for all participants. While the system has mechanism to adjust mining difficulty in response to available capacity, those adjustments take effect slowly. They happen approximately once every 2 weeks in the case of Bitcoin, and use an unweighted average over that time. If the sudden influx of hash power took place in the middle of that window, the algorithm will under-correct: it averages out the hash power before and after the arrival of a new participant, instead of giving higher precedence to more recent history. It may take up to two adjustment periods to fully correct for the spike. The good news is introduction of hash power also hastens the arrival of  adjustment, which is measured in number of blocks. Sudden removal of hash power on the other hand leads to a destabilizing cycle: blocks take longer to mine and the adjustment period to correct that itself will take longer to arrive. Because that reduces the profitability of mining, it may inspire additional defections resulting in even longer block times in a vicious cycle.

Bitcoin Cash experienced a relatively mild version of that phenomenon after the hard-fork: because the minority chain split off with the full difficulty level of Bitcoin proper but only a small fraction of its hash rate, blocks were extremely slow to arrive at first. Granted BCH proponents had anticipated this problem and built an emergency adjustment feature into their protocol. But even that proved insufficiently responsive to correct for the deficit initially. Adding insult to injury, the mechanism over-corrected as miners began jumping into BCH mid-August to exploit the arbitrage opportunities. As a result difficulty and amount of hash-power began to experience wild fluctations two weeks after the hard-fork, with BCH exceeding BTC in hash-rate at one point. (So much for the simple rule “the chain with the majority hash-power is Bitcoin” proposed for anointing the winner after a contentious hard-fork.) To summarize: in the short-term there can be significant divergence from predicted equilibrium due to the relatively slow adoption of mining difficulty in comparison to how quickly network capacity can change due to sudden entry or departure of miners.

Animal spirits all the way?

One unresolved question remains: if market value of a cryptocurrency can drive rational miners to participate in supporting that blockchain, what gives rise to that valuation in the first place? Is there causality in the other direction where miner support (expressed in actual hash-rate, not press releases and blog posts) is sufficient to prop up market value? Consider the initial point where Bitcoin Cash split off from Bitcoin, and a coalition of miners accounting for 10% of total rate made a credible pledge to mine BCH regardless of market pricing. It could be that they have a grievance with BTC, they are being otherwise being compensated by participants with an axe to grind or they are simply irrational. Does that information allow investors to price BCH in relation to BCH?  Proportionality based on hash-rate does not make sense. Blockchains involve winner-takes-all dynamics and network effects. A compatible blockchain with 1% the hash-rate of Bitcoin does not have 1% of its security or censorship-resistance: since many existing Bitcoin mining pool have more power than that, any one of them could easily overpower the minority chain to execute 51% attacks.

Similarly a cryptocurrency with 1% of user-base of another one does not have 1% the utility: it would be the equivalent of an alternative social network that only 1% of your friends participate. From these perspectives, BCH started out with a major disadvantage. Very few wallets support it out of the gate; the chances of being able to pay a friend in BCH was much lower compared to BTC. Most exchanges announced they would not open up order-books for BCH; it will be more difficult to convert BCH into fiat or other cryptocurrencies. While merchants have made symbolic gestures of accepting Bitcoin to signal their “forward-thinking,” no one has put out press releases touting BCH acceptance. Given that one BCH was valued at hundreds of dollars in the immediate aftermath of the fork even before the dust settled, its current hash-rate can be explained as consequence of miner incentives. The more troubling question for future hard-forks is: given the long odds of competing against the main chain, how any splinter fork can achieve that market valuation in the first place.


Money for nothing: Bitcoin Cash & the economics of hard-forks (part I)

Pricing a parallel universe

Over a month has passed since Bitcoin Cash has forked from the main Bitcoin chain. Short-lived anomalies observed in the aftermath of the split, such as very slow mining times due to widely miscalculated block difficulty, have mostly been resolved. While most exchanges have not opened order-books for BCH, sufficient liquidity exists and trading continues at non-zero prices. So much for the predictions of “experts” who projected a rapid decay to zero once the blockchain came up to speed, allowing people to move funds in any reasonable time frame. (Deja vu? Back in 2016 on the verge of the Ethereum hard-fork to bailout the DAO, so-called “experts” also predicted the minority chain Ethereum Classic would die out quickly.)

In fairness, the majority of Bitcoin proper trading does not hit the blockchain either, taking place on the internal ledgers of exchanges. That alone should have been a clue that mobility/velocity of fund movement can not be the deciding factor in determining market value of a cryptocurrency. But are there systemic ways of pricing a splinter faction such as Bitcoin Cash relative to its main chain?

BTC itself has significantly appreciated since August, going from below $3K to nearly crossing the $5K milestone before taking a hit in response to news from a crackdown in China. BCH has fluctuated significantly but continues to hold a valuation in the hundreds of dollars. In other words: the price of bitcoin after the fork plus that of bitcoin cash greatly exceeds that of bitcoin before the fork. Consider that every person who owned bitcoin before the fork still owns the same amount of bitcoin, plus a corresponding quantity of bitcoin cash.

On its face, the hard-fork has created value out of nothing.

Creating a new cryptocurrency out of thin air

Not to shabby for an initiative with modest beginnings as a strategic bluff to another threatened soft-fork. To recap: in response to the lukewarm reception by Bitcoin miners to the segregated witness feature, one of the factions in the never-ending scaling debate proposed a user-activated soft fork or UASF. This approach would effectively force the issue with miners, by disregarding blocks which do not include support for their favorite pet-feature. In response, an opposing faction threatened a user-activated hard-fork (UAHF) that would force miners to commit to a different criteria by disregarding chains that do not include support for their favorite pet-feature, namely 2MB blocks. UAHF was more deterrence than a coherent scaling plan: its proponents painstakingly pointed out the plan would only be invoked if segregated witness activation did not take place cleanly. Last minute developments appeared to render that unnecessary: segregated witness and large-blocks were merged into a Franken-design under the controversial “New York agreement.” Enough miners signaled for that version to reach its activation threshold. Crisis averted? Not quite: one die-hard group decided to run with the UAHF banner regardless of existing consensus, proceeding to create a new fork that supports 2MB blocks—Bitcoin Cash.

Before we go on to conclude that hard-forks are an unambiguous boon to the ecosystem and expect to have one every week, consider a few alternative explanations for the outcome:

  • Speculative bubble in the broader cryptocurrency space, a point many commentators have raised warnings about. No one has accused cryptocurrency investors of being too cautious in this day and age of oversubscribed ICOs. Bitcoin Cash may have been perceived as the next gold-rush by aggressive investors looking for another asset class to get in on the ground floor.
  • Creative conspiracy theories: secret cabals committing to a price floor for BCH in order to generate demand from exchanges or motivate miners to commit hash power. This could explain why BCH had a non-zero valuation but it can not account for why BTC itself started on a steady upwards trajectory.
  • Here is a more charitable interpretation for investor psychology: the community breathing a collective sigh of relief after segregate witness activation proceeding relatively “cleanly” by some definition. While the UAHF hard-fork came out of left field, it had incorporated lessons from past times when the network played a game of chicken with hard-forks. There was no risk of the alternative chain collapsing back on the main-chain, thanks to the deliberate introduction of an incompatibility at the fork point. There was replay protection: transactions from the main chain could not be broadcast on the alternative chain. (That was a critical lesson the Ethereum community learned during its DAO debacle. Compatibility between chains resulted in problems managing funds independently: by default sending ETH also resulted in sending ETC because someone else could—and did, as a matter of fact— take the transaction from the original chain and replay it on the Classic chain.) According to this explanation, bitcoin prices were artificially depressed before the fork due to collective investor anxiety and successful completion of the fork simply erased those worries.

Hash-rate follows valuation

Beyond the particular motivations that support the current price structure, the relative value of Bitcoin Cash to Bitcoin makes for an interesting experiment in comparing predictions from economic theory with observed market behavior of the participants. That goes for miners too: while it is common to accuse investors of tulip-mania irrational exuberance, miners are supposed to be rational actors maximizing subject to well-understood economics of mining. While Bitcoin Cash transactions are deliberately incompatible to Bitcoin transactions, the proof-of-work function used to mine blocks is identical on both sides. That means the chain split presents each miner with a continuous decision point: mine BTC, mine BCH or perhaps some mixture of both by dedicating partial capacity.

The relative profitability of mining—defined as rewards to amount of hash-rate expended— for BTC/BCH has fluctuated wildly during the initial few weeks. As profitability of the BCH blockchain soared, more miners were tempted to jump ship. Eventually the pendulum swung too far and BCH had too much competition, a form of the Yogi Berra saying applies: it is too crowded, no one mines there anymore. But the graph shows an unmistakable trend of converging towards 100%, the point where both chains are equally profitable. This same behavior was observed in the aftermath of ETH/ETC split. The price ratio between the chains was approximately equal to their difficulty ratio.

Is there a relationships between hash-rate and price? It is straightforward to argue for causality in one direction when two chains have compatible proof-of-work function. That compatibility is critical because it allows repurposing mining power. Recall that the majority of commercial mining is done using highly specialized hardware that has been painstakingly optimized for a specific proof-of-work function. It can not be used to support mining on a different blockchain unless that blockchain also uses the same proof-of-work. So an Ethereum miner can not easily switch to Litecoin without significant capital expense in acquiring new hardware, but they can turn around on a dime to mine Ethereum Classic using the same fixed assets.

This interoperability creates an opportunity to arbitrage hash power between blockchains. Suppose BTC is trading at $4000 and a mining pool has 1% of the hash-rate. Neglecting transaction fees, this pool would expect to collect on average 1% of the block reward of 12.5 BTC which comes out to $500 in fiat equivalent. Now suppose that BCH has one-tenth the total hash rate of Bitcoin network being a minority chain, but is valued at one-eighth of bitcoin at $500. In other words BCH is trading at a premium compared to its hash rate. In that case our mining pool can improve its profitability by shifting all power to mine BCH. After switching sides, the miner now accounts for roughly ~9% of the newly expanded BCH network capacity. That means on average they can count on collecting 9% of the 12.5 BCH reward per block. Converted to USD that comes out to ~$563. In other words defecting to BCH when it was trading at a premium to its hash-rate has improved the bottom line. (In fact until difficulty adjustment takes place, their actual profitability will be even higher because the network is still operating under the assumption of its original capacity. Blocks will arrive faster, so the reward per unit time is even more lucrative— considering that primary operational expense for mining is power consumption, that is a more relevant metric.)

Yet the same calculus does not work for a different mining pool which starts out with 5% of total hash-rate. That pool is looking at an average $2500 per block in earnings while operating on the main chain. Switching over to BCH, they become the big fish in a small pond, now accounting for one-third of the total BCH rate— and in the process posing a risk to decentralization due to having achieved critical mass for selfish-mining. But their expected earnings in steady state actually drops to less than $1700 USD per block. What went wrong? Shifting that 5% of mining capacity to the greatly under-powered BCH network destroys the condition that gave rise to the arbitrage opportunity in the first place: with the infusion of new hash power, BCH is no longer trading at a premium compared to its hash rate. (Of course this does not mean larger mining pools can not take advantage of the opportunity: it just means they have to shift some fraction of their resources over to BCH instead of their full capacity.)

Bottom line here is that rational miner behavior gives rise to a dynamic where hash-rate is naturally allocated between compatible blockchains in response to market valuation of those currencies. Put succinctly: hash-rate follows valuation.