On-ramps & off-ramps: impact of Bitcoin payments on the regulatory landscape

Money serves three functions: store of value, unit of account and medium of exchange. Bitcoin has achieved a dubious hat-trick in drawing criticism in each category. It is claimed to be too volatile to act as a measure of economic value—how can merchants risk accepting BTC if the prices fluctuate drastically day to day? For that matter, why would consumers risk giving up an appreciating asset if it is likely to be worth a lot more tomorrow? (Recall the infamous Bitcoin pizza incident.) It has also attracted a chorus of skeptics arguing that it can not succeed as a popular payments, pointing to the intrinsic scaling limits and costs of transacting on the congested network. Blockchain space is scarce, topping out around 7 transactions per second worldwide, dwarfed by the throughput of existing centralized payment rails such as credit-card and PIN debit networks. Moreover transactions compete against each other for inclusion in every block by offering “fees” to miners, who are incentivized to include only those with highest fees into the next block. Counterintuitively, those fees depend not on the monetary value of the transaction but on the amount of space it takes up on the blockchain— akin to charging for check-processing based on thickness of the paper used instead of notional value printed on the check. Depending on demand on the network, this results in drastically varying efficiency. In 2014 it was possible to transfer $80M while paying a few cents—outshining every other system including wires in cost-effectiveness— while the late 2017 frenzy witnessed  users spending tens of dollars in fees to send amounts of the same order of magnitude, which is a staggering level of inefficiency comparable to Western Union for cross-border remittances.

After an initial period of exuberance where merchants raced to outmaneuver each other in claiming bitcoin support and scoring cheap PR points, it is clear that bitcoin is not currently working as a payment mechanism. Ethereum never focused on payments as a use-case, while other altcoins such as monero with aspirations to being a medium of exchange are optimizing for dubious black markets and criminal activity outside the mainstream. Jury is out on whether any of the scaling approaches proposed such as Lightning Network will provide a lasting solution. 

 But this is more than a question about technical merits or real-world feasibility: it will have repercussions for the regulatory landscape around cryptocurrency. Whether Bitcoin or another cryptocurrency succeeds as a widely used medium of exchange will determine which participants in the ecosystem become the focus points for regulatory influence. If conducting meaningful commercial activity requires converting into fiat, regulating cryptocurrency becomes a simple problem focused on existing, already regulated businesses which serve as on-ramps & off-ramps to blockchains for the majority of consumers. (This is true even when that conversion into fiat is hidden from consumers, as in the case of most merchants who claim to accept Bitcoin but are in fact relying on a third-party payment processor who accepts Bitcoin from the consumer while paying out in USD to the merchant, effectively absorbing both the exchange risk and information security challenges associated with storing bitcoin.)

The difference can be illustrated by imagining two alternative scenarios. In the first case, Bitcoin or some other cryptocurrency with better intrinsic scaling properties succeeds in becoming a ubiquitous payment mechanism, complete with a vibrant ecosystem of consumers and merchants rivaling that of existing payment networks. Average citizens purchase goods with Bitcoin from one another with the same ease as swiping a credit-card, with one crucial distinction: there is no middleman moderating the transaction. (More accurately, since blockchains are not made out of magic, the “middleman” has been replaced by a decentralized network of miners.) There is no reason to convert bitcoin into or out of USD— which is what happens today behind the scenes when most merchants rush to announce bitcoin support. Such cross-currency hops are not only unnecessary but they would introduce economic inefficiencies compared to transacting in bitcoin as a first-class, “native” currency.

In the second scenario, Bitcoin is relegated to a different role, serving primarily as store of value (or unit of measure, however implausible that appears from current levels of volatility.) In this word everyday retail transactions are not carried out over the bitcoin network, directly or indirectly through second-layer solutions such as Lightning Network. either because no viable scaling solution has emerged or because network fees are too high compared to existing payment rails, dissuading merchants from switching. BTC can still be competitive for niche scenarios, such as cross-border remittances or infrequent large transfers (such as settlements between commercial banks) where 24/7 reliability and speed is more important than saving a few basis points.

In both scenarios, the censorship resistance of the Bitcoin network is identical. While there are legitimate concerns around the geographic concentrations of mining pools in China and a software “monoculture” of one implementation running every node, it would require extraordinary effort to blacklist addresses or make arbitrary edits to the ledger to undo history or reverse payments. Where these scenarios differ is that such interventions are not required in the second case as a prerequisite to effectively regulating flow of funds.

If commercial transactions require converting cryptocurrency to/from fiat, existing authority to influence movement of that fiat currency also translates into influence on use of virtual currency in any meaningful commercial capacity. If consumers are constantly liquidating bitcoin as a prerequisite for most economic activity, one need only focus on regulating these so-called “on-ramps” and “off-ramps”— points in the ecosystem where bitcoin is exchanged for a native currency accepted in-real-life, such as US dollar. This a tractable problem. Dealing in fiat currency requires banking relationships and maintaining banking relationships involves following existing regulatory regime, which includes a functional compliance program among other components. (One need only look at the instructive example of Bitfinex to see what happens when those connections are severed.) It does not matter that payments within the bitcoin network can continue to flow freely because the balances recorded on that ledger, no matter how significant when converted into notional USD numbers, grant very little purchasing power unless the funds can be exchanged back into USD. If there are only a handful of such off-ramps to exit out of bitcoin and into old-fashioned but universally accepted currency necessary for meaningful commercial activity, there is an effective point to focus enforcement activities.

Consider the example of WikiLeaks. In 2011 the US government pressured credit networkings including Visa, MasterCard and Paypal to stop processing payments for the organization. This forced WikiLeaks to start accepting bitcoin. With the massive run-up in price against USD, WikiLeaks can paint a slightly revisionist picture of that forced move as a blessing in retrospect. But what about the cash-flow side of the equation? Even if all WikiLeaks personnel were volunteers resulting in zero labor costs, no salaries to pay etc. as soon as there are any expenses to account for— data-center hosting, advertising etc.— the question is whether those costs can be reimbursed in bitcoin. If the counter-party is not willing to accept bitcoin, WikiLeaks must find a way to convert their existing stash of bitcoin into old-school currency, interfacing with aspects of the financial system where stringent rules apply around which customers one can conduct business with. (Quite apart from access, this hypothetical scenario also raises the question of whether paying for expenses out of bitcoin—converted or not— was something to brag about in the first place: if the currency appreciated 50,000% since initial donation was made, having spent that on operational expenses to keep the lights on has echoes of the famous million-dollar pizza episode in bitcoin.)

Even overcoming the scaling challenges may not be enough to realize a world where consumers pay for their morning cup of coffee in bitcoin. Volatility may prove even more tricky, not being amenable to simple technological fixes. In the long-term, the belief that price of bitcoin will rise against USD means everyone will hoard their cryptocurrency, with no one willing to spend it on such pedestrian purchases as breakfast. (That brings a whole new meaning to the cliche that skipping a latte every day will compound into significant savings.) Meanwhile significant short-term fluctuation discourage merchants. Pricing goods in bitcoin requires hedging against the exchange risk. The path of least resistance for risk-averse merchants is to cash out to USD immediately after the transaction, which is exactly what most merchants who claim to accept bitcoin are doing: in reality their payment processor collects BTC from the buyer while delivering USD minus transaction fees to the merchant. That approach insulates the merchant from the risk of holding bitcoin long term, while superficially granting them an excuse to display the bitcoin logo on checkout pages and press releases. But bragging rights aside, the popularity of that model for merchants highlights the fragile nature of decentralization. The favored integration model for accepting bitcoin is in fact not using bitcoin as a peer-to-peer currency: there is a payment processor in the loop for converting bitcoin into fiat currency to make it more palatable to the merchant. That middleman is likely a regulated money transmitter and can be legally compelled to stop processing payments for specific clients. If cryptocurrency becomes a first-class, native payment system for online or brick-and-mortar transactions, it will require a fundamentally different regulatory approach focused on trying to impose order on unwieldy peer-to-peer networks. If instead the current trajectory continues, incremental approaches centered around on/off-ramps will be sufficient.



We can decrypt it for you: smart-contracts for outsourced encryption

A small-step for cryptographic agility

Among the features introduced by the Byzantine hard-fork of Ethereum, one of the more under-appreciated ones may be support for RSA operations. Most virtual currencies have a very weak story around cryptographic agility, defined as the ability to easily change the cryptographic primitives used such as signature algorithms, block ciphers and hash functions. For example Bitcoin settled on ECDSA over one elliptic curve— secp256k1— as the signature algorithm and SHA256 for hashing. (There is a long-standing proposal to introduce Schnorr signatures that could introduce some variety but it is yet again limited to that one curve.) Ethereum also uses ECDSA over the same curve, but inexplicably hard-codes a pre-standardization version of SHA3 for hashing.

Arguably these decisions are  short-sighted given that large-scale blockchains are notoriously difficult to evolve once deployed. Convergence of the system depends on on all participants agreeing on a set of ground rules for interpreting transactions appearing on the chain. If one faction invents some new-fangled feature that the rest of the network does not recognize, they will disagree on which transactions are valid and what the true state of the world is. In effect the  Bitcoin is culturally allergic to using hard-forks to introduce novel functionality— witness the acrimonious debates over increasing block size— and historically relied on incremental soft-forks for improvement. Ethereum Foundation has shown no such restraint, shepherding the currency through multiple hard-forks. Some were controversial and did not exactly go as planned; the 2016 hard-fork to bailout DAO instead gave rise to Ethereum Classic. More recent forks have been less contentious, likely because they introduced features without obviously being designed for the benefit of a particular contract or stakeholder.

In one sense, RSA support it is pure syntactic sugar: EVM is Turing complete. Modular exponentiation, the operation underlying RSA, could have been implemented in Solidity or any other language that compiles to EVM byte-code. Granted the same could be said of ECDSA support provided by the built-in ecrecover contract invocation: that could have been left up to every one to implement on their own. Instead EVM designers made a pragmatic decision about providing a native operation to support a feature expected to have widespread usage. This is the same judgment call API designers face constantly: whether to provide a single convenient API that packages commonly used functionality or leave it up to user to stitch that together from existing APIs. In this blog post we evaluate one scenario that is now made possible with RSA support, namely outsourcing key-management and encryption to trusted third-parties.

We can decrypt that for you

Common disk-encryption solutions including MSFT Bitlocker and Apple FileVault have an option to back-up keys to the enterprise. This is achieved by encrypting contents to two or more different keys: one chosen by the user and one chosen by the IT department, usually an RSA public-key. This allows the company to access storage on those devices even if the assigned user is unable or unwilling to assist. Taking this one scenario step further: suppose the IT department does not want to manage that additional key that can decrypt all employee devices. Can they outsource this to a trusted third-party in a verifiable way? The third-party is retained to provide on-demand decryption when the IT department needs to access one of the devices using the secondary key.  Alternatively the third-party could act as “back-up:” even if the IT department itself has a key, they may want additional redundancy.

There is an important question around whether such draconian arrangements are justifiable or consistent with precedent. For now we will put aside these policy questions, which are highly dependent on jurisdiction. US companies typically adapt the stance that employees have no expectation of privacy on company-issued devices. In centrally managed IT environments, it is very common for the company to have root/administrator privileges over all machines. Those privileges allows IT personnel to remotely execute arbitrary commands on user devices and access any stored information. In other words, being able to decrypt storage offline is not necessarily granting a new capabilities that is not already present in a traditional, all-Windows environment with machines are joined to an Active Directory domain.

One important tangent on policy is warranted here before moving on to the implementation. On the surface, outsourcing encryption looks awfully similar to the dangerous and discredited notion of mandatory key-escrow. Starting in the 1990s, US government has requested backdoors in cryptographic protocols, asking that all encryption products effectively use two keys: one for the intended recipient and one for law-enforcement in case future access is required. Following strong backlash from cryptography and civil-liberties communities, these ideas seeming fell out of favor as access to strong encryption increasingly became taken for granted. But the uneasy truce came to an abrupt end in the aftermath of the 2014 San Bernardino shootings which resurrected calls for key-escrow. Compelled key-escrow is problematic because unwitting users are being forced to trust third-parties they have no control over or trust, merely by virtue of purchasing gadgets or software from vendors subject to that jurisdiction. This is different than voluntary escrow taking place in a closed system such as an enterprise IT environment, where the company is making a deliberate risk-benefit decision to outsource key-management to selected third-parties. In addition to the freedom of choosing these providers—or not choosing for that matter, in favor of keeping that function in-house— they can also reduce risk by splitting keys. For example, 3 providers can be selected and the original secret split in a 2-of-3 threshold scheme with each share encrypted to a different provider. That arrangement removes single points of failure, requiring cooperation/malfeasance/compromise of multiple providers to recover the secret.

Design outline

A smart contract running on Ethereum can provide a number of useful properties:

  • Backup service gets paid per decryption in a verifiable manner— payment is made if and only if a successful decryption is provided
  • The backup service can post a fidelity-bond attesting to the security of its key management. These funds are under control of the contract and designed to be released to an attacker upon submission of evidence that the key has been compromised. In effect, this is a self-enforcing bug bounty as well as a canary designed to alert users of a key compromise.  As long as the economic value of the key for an attacker is less than the value of the bounty, a rational attacker would choose to claim the bounty and in the process, alert the entire world that the key has been compromised. If no such evidence emerges after some time, the funds are released to the operator of the service. (Alternatively funds can be returned gradually over time, with some fraction returned every few months of successful operation.)

How would such a system operate?  Since a contract itself can not hold secrets, the actual decryption keys must be held by the service provider. But the provider can publish the public-key which is sufficient for encrypting data. The customer is free to encrypt arbitrary plaintext using that key. But when it is time for decryption, they have to go through the smart contract to reach out to the service provider. This contract has three methods:

  1. Request decryption: Customer submits a request to decrypt a ciphertext encrypted in the RSA key of the backup service. This call is also required to send previously agreed-upon amount in ether to the contract as pre-payment. In practice it would also be subject to an authorization check based on message sender address; only the legitimate customer is allowed to request decryption. This prevents random interlopers who stumble on ciphertext from using the contract as a decryption oracle.
  2. Submit decryption result: Invoked by the operator of the backup service, providing the plaintext for a ciphertext previously uploaded by the customer.
  3. Request refund: Invoked by the user to get their money back if the service was unable to provide service in a timely manner. (Note there is a race-condition here in the event of blockchain congestion or miner censorship. The decryption result becomes visible to the customer as soon as the service provider broadcasts it to the network. But it will not count towards meeting the deadline until it is mined.)

Step #2 is where support for RSA operations come in handy. This is the point where the smart-contract must verify whether a correct decryption was provided, by applying the encryption step to the alleged plaintext. (Note we are assuming raw RSA decryption— this is also required to support “blinding” as explained below. The plaintext itself may have padding such as OAEP but its fully-padded version must be returned by the backup service.) If the provided plaintext is correct, the payment originally sent in step #1 is “captured” for good. Otherwise the funds remain in a provisional state. If no decryption materializes after some time period such as 24 hours, the user can invoke the refund function. This function checks the internal contract state and ascertains whether an outstanding decryption request initiated by the same caller has not been successfully resolved. If that is the case, the initial payment is sent back to the caller.

This construction sounds counter-intuitive in one aspect: everything taking place on public blockchains including Ethereum is visible to the entire world. Any one else can observe the above transactions getting posted, including the decryption result. That may look like a serious problem— everyone else is learning the secret— but there is an easy work around: RSA cryptosystem supports blinding using the multiplicative homomorphism of RSA. Instead of decrypting the primary ciphertext, the customer submits a modified version that is masked by a random factor. The decryption of that masked ciphertext still allows recovering the original plaintext, but only for the original submitter with knowledge of the masking factor. Other people observing the blockchain transactions are effectively witnessing the decryption of a random value that reveals no information about the original.

Improving accountability

What about providing accountability for the service provider? The idea here is to augment the contract with additional methods that an attacker who compromised the key can invoke to prove that fact and receive the value of the fidelity-bond as their reward. There are different ways to do this varying in complexity and level of evidence required. For example, an attacker can submit the factorization of the modulus and the contract verifies that the product of factors is equally to the modulus. This is sufficient but not necessary to prove key compromise. For example, in many situations the private key material is kept on tamper-resistant hardware such as an HSM which is designed to perform operations without surrendering the raw key bits. If an adversary obtains possession of the HSM and has the necessary credentials to operate it, the key is effectively compromised regardless of whether raw bits can be exfiltrated out of the hardware.

We can lower the burden of proof and require attackers to only prove they can decrypt arbitrary ciphertexts. This is tricky because the contract itself is used to arrange for decryption of arbitrary ciphertexts and those results are available on the blockchain for all to see. Not to mention, when the RSA cryptosystem is used without padding, existential forgery of what appears to be “successful” decryptions is easy. Anyone can fabricate plaintext/ciphertext pairs for which they claim to have magic decryption capability. Even in a design where that choice is taken away, the customer of the encryption service can subvert the process: they can request a decryption and leverage that result as “evidence” to prove they have control over the private key. (Use of RSA blinding makes it impossible to ascertain that they were the same ciphertext, since a randomly masked value is getting decrypted each time.) A simple if not entirely satisfactory approach is to implement a challenge/response protocol during which ordinary decryption requests are paused. The “prover” who claims to have control over the private key initiates the process by invoking a contract function. To prevent spam, we can require that such challenges are accompanied by some amount of ETH as commitment to weed out provers with no hope of successfully completing the process. That function returns a challenge, which is based on recent block hash of the chain, message sent by the prover and internal contract state that incorporates all ciphertexts it has been asked to decrypt to date.(For example we can maintain a running hash that is updated with each request received.) Note that contracts can not provide “random numbers” in the usual sense; incorporating past ciphertexts prevents using information disclosed during past decryption operations to answer the current challenge. The prover is given a deadline measured in block-height or time to provide the decryption of that ciphertext. If the valid RSA plaintext emerges during that window, the funds maintained as fidelity bond by the contract are sent to the prover. Otherwise the challenge is considered a failure and the contract resumes decryption services. By creating unavoidable penalties in the event of key compromise, this approach provides financial incentives for the service provider to take adequate measures. (But it does not prevent a corrupt provider from voluntarily decrypting ciphertexts when requested by parties other than the legitimate customer.)



What is the value of half a secret?

Putting a price tag on cryptographic keys

Cryptocurrencies such as Bitcoin achieve what may be the purest distillation of money into information: control over funds is achieved by having control over associated cryptographic secrets, specifically ECDSA private-keys used to sign transactions. If you can sign with a specific key, you can spend the funds at a specific address associated with that key. Multi-signature is an extension of this model where instead of using a single key, possession is distributed into a quorum of M keys out of N. This improves both security— compromising a single key does not allow an attacker to walk away with any money— and redundancy: losing access to one key, say due to hardware failure, does not result in the legitimate owner losing access to their money. (Assuming M is less than N; otherwise the loss of any key is fatal.)

It also creates something of a conundrum for putting a price-tag on one such key for the purpose of threat modeling. What is a reasonable amount for the defender to spend in securing that key? For that matter, how much effort would a rational attacker spend on going after one key? Suppose 300 BTC is stored stored in a 3-of-5 multisig address. Intuitively one could argue the value for each key is exactly 1/3 the balance or 100 BTC. But this naive conclusion overlooks the “criticality” condition involved: any combination of less than three keys is worthless from the perspective of controlling the associated funds on the blockchain. (Assuming they are not also used to control some other address with different M-of-N configuration.) But once that third key falls into the wrong hands, the entire sum can be claimed.

In more concrete terms: suppose a disgruntled insider offers to sell an attacker one of the keys for 50BTC. If individual keys are valued based on their pro rata share at 100 BTC, that would be a great deal. But if the prospective buyer has no hope of obtaining any of the other four keys— perhaps they are managed by more honest employees— that 50BTC spent on attacking the first key will yield a return of exactly zero. There is an all-or-nothing phenomenon at work for all would be adversaries: 100% of resources dedicated to getting hold of the first M-1 keys are wasted unless that final key to reach quorum can be successfully obtained.

Critical thresholds

This is an unusual property of Bitcoin that is not easily achieved for other commodities it is typically compared to: half a barrel of oil is worth approximately half that of a full barrel. It is not possible to create a configuration of crude oil where three barrels must be consumed at the same time. If a gold vault is compartmentalized into five rooms containing equal amounts of the precious metal, plundering one out of five rooms is still a good outcome for the attacker.

Of course there exist physical objects with supra-linear scaling properties based on quantity: a two-carat diamond is worth more than two individual one-carat diamonds with identical specs. Most physical objects follow this pattern. If a car is chopped in half down the middle, the left and right-sides are not worth half the original value of the vehicle, because neither side amounts to a functioning mode of transportation any longer. But the value of those scattered parts is not zero either. They can be sold for scraps or combined with other interchangeable parts to put together a fully functioning if somewhat unusual vehicle.

It is usually in the mathematical realm that we run into scenarios where objects can be divvied up in a way that creates critical thresholds. For example, it is possible to split a credit-card number into two shares such that either share contains exactly zero information about the original, in a strong information-theoretical sense: even an infinitely powerful computer could not recover the original value from a single share. But the combination of two such “worthless” shares yields the original secret. There is another unusual property of these constructions: not only is combination of shares below the threshold worthless, the only way they can attain any value at all is in conjunction with other shares generated from that particular split. If you have half the components for a bicycle, there is a cottage industry of bike shops and parts suppliers to equip you with necessary missing pieces to assemble a complete bike. But keys in multi-signature configuration can only be combined with other keys in the same group. There is nowhere else to turn.

Multi-signature wallets and accountability

This is more than a philosophical puzzle. At least one multi-signature wallet provider has argued that their company should be exempt from any regulation because they are only providing one out of multiple keys required to control funds, with the remainder controlled by the customer.  This is in effect arguing that one key has zero value, as long as the quorum requires multiple keys. The argument rests on the observation that if the co-signing system goes rogue or experiences a security breach, its customers can not suffer any losses: additional private-keys not controlled by that service exist, gating access to funds on the blockchain.

But this argument is easily debunked by a reductio ad absurdum: suppose a customer creates a 2-of-3 multisig wallet and hires two co-signing providers to help manage those funds. At some future date, both providers are compromised by the same attacker who absconds with all funds in that wallet, since he/she is able to sign with a quorum of 2 different private-keys. Is it reasonable for both providers to disclaim liability by throwing up their hands and saying “not my fault, someone else had another key”? A more plausible interpretation of fiduciary responsibility is that both providers are implicated in the loss. Even the original argument is self-negating on closer reading: if the value of a multisig key is so minimal that one need not be concerned about the security level of a service that specializes in managing those shares, why bother with outsourcing to those providers in the first place? One could also store that share under the mattress or write it on a post-it note. It is contradictory to argue that you are providing an essential service—helping individuals or organizations offload the complexity of managing private keys— while also seeking an exemption from regulatory scrutiny because the secret you are entrusted for safe-keeping has no value, in the sense that it can not cause monetary losses by itself.

RSA 2FA breach as precedent

It will likely take a long time before a regulatory framework can answer these questions. Individual court cases could set precedent earlier. None of the publicized cases so far have involved a failure of the cosigning service. In the 2016 Bitfinex breach, BitGo acted as the co-signer in what was described as an alternative design to the traditional “cold-wallet” concept where majority of funds are stored in an offline system. During this incident BitGo API co-signed the multisig transactions used by perpetrators to take bitcoin out of the Bitfinex system. But BitGo has maintained—and Bitfinex has never disputed this assertion— that its API was working exactly as designedAPI was working exactly as designed: properly authenticated calls were made by Bitfinex, or someone with credentials required to impersonate Bitfinex, requesting co-signing of transaction. That episode raised questions about the effectiveness of co-signing, specifically for “online” configurations where a single point of failure exists: the Bitfinex system held both its own signing-key and credentials necessary to instruct the cosigner to sign with a second key. But it does not raise the question of co-signer liability precisely as long as the service was following instructions from “Bitfinex.”

Other examples from past security breaches suggest that providers of “additional” security features can not easily shirk responsibility. RSA Data Security experienced a breach of their SecurID authentication system in 2011, resulting in undermining the security of anyone who relied on RSA tokens as the second-factor of authentication— a widespread radius of collateral damage that included the defense contractor Lockheed-Martin. Following the co-signer logic, RSA could have argued that they are merely providing a second factor of authentication on top of an existing one such as passwords. As long as passwords are not compromised, failure of 2FA  alone can not result in bypass of authentication. To its credit RSA did not pursue that line of argument. After an initial period of unconvincing PR, the company began offering replacements to affected users, in keeping with the common sense understanding that 2FA providers have a responsibility to maintain a robust solution. It is unlikely that co-signing services will be held to a different standard.


Ethereum, Solidity and integer overflows: programming blockchains like 1970

Role of language design in exploit mitigation

On Wednesday a new vulnerability was disclosed in the implementation of some ERC20 tokens including BEC operating on the Ethereum blockchain. As far as impact goes, the flaw is about as critical as they come: unauthorized “gifting” of tokens out of thin air.  While this may have been the first time such a bug was observed being exploited in the wild on a blockchain, the type of vulnerability is far from novel: an integer overflow, rooted in the fact that most programming languages operate on numbers with limited range. Perform a calculation where the result falls outside that range and the result of that operation is no longer accurate, violating common-sense assumption. For example two very large numbers added together can produce a small number. Or two positive values multiplied may yield a negative product. This class of bugs are very well known in the realm of low-level languages such as C and C++. Combined with weak type-safety, lack of range checking and manual memory management, such flaws often provide a starting point for building a full remote-code execution exploit.

What is unusual is that the Ethereum virtual machine (EVM) design and the Solidity language made a series of bad decisions that resurrected a vulnerability class from low-level systems programming in the context of a wildly different environment intended for decentralized smart-contracts. To better demonstrate these unforced errors on the part of language & VM designers, it is instructive to compare how a low-level language such as C or x86 assembly handles overflows.

Getting by with finite integers

Suppose we are asked to sum two unsigned integers X and Y, and compare them to some other value Z. This is a common pattern in bounds checking, a defensive measure to prevent the application from reading/writing from memory outside the intended regions.

if (X + Y > Z) {
  // Limits exceeded, bail out
} else { 
  // Continue execution…

For extra color, let’s assume one of these values such as “X” can be controlled by an attacker, while Y and Z are constrained by the application itself. If arithmetic in C operated in the Platonic realm of natural numbers, the above code snippet would always run correctly. But these values are not represented by an idealized “integer” type that can take on any value. Instead they are limited to a fixed range, defined by the size of the integral type. This is dependent on the compilation model, which itself is tightly coupled to the hardware architecture code is being targeted at. For example, in the data-model LP64 common for Linux, the C type “unsigned int” has 32-bits while the larger type “unsigned long” can hold 64-bits. (Integral types can also be signed, further reducing the representable magnitude since roughly half the possible values are now allocated to negative numbers.) That “unsigned int” type can represent every number in the range 0 through 4294967295 inclusive. Equally important, it can not represent any value outside that range.

That poses a problem for doing math: it is possible to write expressions such as X + Y where both X and Y start out in the permitted range but the result falls outside. In other words, the operation overflows the result type. So what happens when a C program is asked to run a calculation resulting in overflow? Intuitively the options are:

  1. Return the result in a different, larger integral type such as “long.” While appealing at first glance, this is not a feasible solution for two reasons. First we run out of integral types when the same trick has to be applied for “long” or even “long long,” which a recent addition. As long as the type system only has a finite number of built-in types, there will be one with the highest capacity where we run out of room. But more importantly, playing games with types will only result in converting the out-of-range problem into a type-cast problem. Typically programmers write expressions such as “sum = X + Y” where X, Y and sum all have the same type. Even if the result is magically upgraded into a wider type, it will still get shoe-horned into the original type downstream. Playing conversion games only delays the inevitable.
  2. Treat it as a fatal error. In the same way division by zero results in halting the program— or more precisely, raising a signal/exception which, if unhandled will terminate the program— every overflow could result in triggering an error designed to immediately stop execution flow.
  3. Undefined behavior. This is a favorite cop-out for language designers, effectively stating that the compiler is free to produce any value: the result can be 0, 42, the compiler authors’ year of birth or it may crash. All of them are valid outcomes if the behavior is deemed “undefined” according to the language specification.
  4. Truncate the result in a well-defined manner and keep soldiering on.

It will come as no surprise to those familiar with the minimalist philosophy of C that the language opts for a combination of #3 and #4. (C++ carries on that tradition, for the most part deferring to C on the behavior of primitive types.) Overflow for unsigned types is perfectly legal and truncates the result. Specifically only the least-significant 32-bits of the result are taken. Meanwhile overflow on signed integral types is undefined. But pragmatically speaking, for most compilers and hardware architectures the result ends up being very similar to unsigned behavior: values are truncated to squeeze into the expected type. Looked another way, C programs are not working in the realm of natural numbers with their infinite possibilities. Instead they are doing modular arithmetic in a finite ring where values outside the range “wrap around” to an equivalent one modulo some power of 2.

One level below: x86/x64 hardware behavior

If this property of C looks arbitrary, going one more level down to look at how processors handle arithmetic at the hardware level may vindicate the designers. C is close to the metal, so to speak. Its constructs often map 1:1 into capabilities of an underlying hardware. Consider the common Intel x86 architecture. This architecture sports 32-bit registers and features instructions for operating on them. For example one can add the contents of register EAX to the value existing in register EBX:

add EBX, EAX

Since registers are also limited in the range of numbers they can express, executing this instruction poses the same dilemma: what happens if EAX and EBX hold values that added together would exceed the maximum integer representable in 32-bits? As with most hardware architectures, x86 adopts the convention of doing modular arithmetic: operations are implicitly performed modulo 2**32 (or some other appropriate power of two when working with 16-bit or 8-bit registers.) Nor is that a quirk limited to Intel: most hardware follows this pattern.

Defensive programming: hardware vs software

Given that context, it is hard to fault C for following the same model: it allows using efficient CPU instructions to perform arithmetic without having to double-check or adjust the results spit out by the processor. And while C is rarely accused of being a high-level language, many other languages that followed including those targeting virtual machines such as Java, have also taken the path of least resistance by going along with the hardware. Languages like Python and Scheme where integers have arbitrary precision by default are the exception, not the rule.

That said, there are some crucial differences in how arithmetic works at a low-level which are missing from the abstractions offered in high-level languages:

  • Multiplication has special case behavior of using the “type expansion” idea. Specifically product of 32-bit integers are in fact written into the pair of registers EDX and EAX, to accommodate a full 64-bit product.
  • CPU has a flags register that contains special flags updated based on the result of the last operations. This includes an overflow flag. Even more conveniently, the x86 instruction set contains conditional jumps based on whether that flag is set: JO, short for jump-on-overflow. That makes life easy for a programmer working in assembly language to detect and respond to overflow conditions with just 1 additional instruction— which for all intents and purposes is an unconditional jump, because the branch-predictor will correctly guess that an overflow will not happen in the common scenario.

By contrast checking for overflow in a language such as C or Java is far more kludgy because these low-level signals are missing. There is no facility provided by the language itself for dealing with overflows in an efficient manner. Options are:

  • Perform an extra arithmetic operation first to check if overflow will happen. For example, we can subtract one operands from the maximum representable value and if that difference is larger than the second operand, addition is guaranteed to overflow.
  • Sanity check the result after performing the operation, by comparing the result against one of the original operands. (Leveraging the observation that an overflow condition exists for unsigned addition if and only if the alleged sum is less than either one of the operands. Note that the rules are a lot more complex for signed arithmetic.)

Many libraries exist for doing arithmetic safely with such overflows checks included, but they trade-off safety for kludgy syntax and readability. Instead of writing a straightforward mathematical expression such as “result = X * Y + (Z * scale) + offset” one must replace each binary operator by a functional call, and potentially conditional checks for the overflow status of that particular subexpression. C++ with its support for operator overloading can at least overcome the syntactic problem, since one can design a replacement for every integer type along with  that performs overflow checking. However in the absence of direct compiler support or very diligent use of inline assembly, these user-defined substitutes will be very inefficient compared to native types.

Compiler features to the rescue?

Where the language falls short, compiler authors can sometimes step in. For example GCC has an option called trapv that will raise a trap when operations on signed integers overflow. This is an important distinction because C already considers such behavior to be undefined— crashing the program is just as valid as truncating the result modulo 2**32. But there is no corresponding option to treat overflow of unsigned integral types as a fatal condition: no “correct” compiler can do that because such overflow is perfect legal C and even explicitly relied on to implement functionality such as hash-functions and cryptographic algorithms. (A research paper from 2007 explores the idea of instrumenting existing binaries to fail on unsigned overflow, confronting exactly that problem of having to identify code that deliberately relies on truncation.)

Solving the problem for signed integers may sound good enough except for the fine-print in C conversion rules: in any operation involving a mixture of signed and unsigned operands, the signed input gets typecast to unsigned and arithmetic performed on unsigned values. In other words, introducing any unsigned value into a calculation voids the overflow check.

Solidity and Ethereum: the missing defense

EVM features large 256-bit words internally and has instructions to perform arithmetic operations on these values. Solidity in turns offers integral types up to 256-bits for programmers to use. But as the preceding discussion suggests, one can not “outrun” integer overflow vulnerabilities by adding bits to integer types any more than buffer overflows are fixed by adding more RAM. (Going from 32-bit →  64-bit architecture has expanded addressable memory to 256 terabytes but that did not spell the end of memory corruption bugs.) As long as resources are finite and attackers allowed to sneak inputs into an operation, it is possible to have an arithmetic overflow.

When it comes to defensive measures for avoiding overflow vulnerabilities, it turns out both platforms are worse than their ancient counterparts. In Solidity operations on integral types silently wrap-around by default in the event of an overflow. There is not even a compiler flag (at the time of writing) to treat overflow events as critical error— the equivalent of 10 year old “trapv” flag in GCC. At the processor level, EVM has no efficient mechanism for tracking overflow state or making decisions conditional on whether a recent computation resulted in overflow. There is nothing comparable to the x86 overflow flag and attendant condition branching instruction. Strictly speaking this is not required; a compiler can synthesize overflow checks with existing EVM instructions, patterned after the kludges used for C. But intrinsic EVM support could have made it that much easier.

For historic perspective, the x86 architecture dates back to the 1970s. So does C, designed as a low-level systems programming language with the explicit goal of closely mirroring the underlying hardware. (C++ arrived on the scene in the early 1980s, adding object-oriented paradigms while continuing to pay homage to the efficiency requirements of C.) Ethereum started four decades later with a completely different use-case: programming smart-contracts on a blockchain, which is an extremely inefficient system for computation and where correctness matters far more than shaving a few virtual cycles. Likewise, Solidity as a language for writing smart-contracts started from a clean slate. Unlike C++ it did not inherit the baggage of having to support existing C code with assumptions about how integer arithmetic words. It’s a tribute to the short-term memory of computer science that a modern programming language designed with the singular purpose of building applications on a blockchain platform— where bugs can result in significant monetary losses— still ended up perpetuating an ancient vulnerability class.




Money for nothing: Bitcoin Cash & the economics of hard-forks (part II)

[continued from part I]

Obstacles to hash-rate equilibrium

There are important caveats to the preceding argument. First we made a simplifying assumption in equating “mining rewards” to coinbase rewards. That neglects additional earnings from transaction fees collected on every payment taking place in that block. For much of the history of Bitcoin, this would have been a reasonable assumption: coinbase rewards dominated fees. In fact this dynamic made possible the extremely low fees that inspired many an optimistic pundit to sound the death-knell of other payment systems that charge usuriously high transaction costs in comparison. Over time as blocks became increasingly congested with scaling solutions stuck in a political quagmire, fees have continued creeping up. At some point in early 2017 the network achieved a state of inversion: transaction fees exceeded coinbase rewards in their contribution to overall mining revenue. That dynamic will continue to be exacerbated as the reward decreases periodically through the “halving” events baked into the bitcoin monetary supply.

That said, the decision criteria for choosing a chain is the relative expected reward per block. Imagine one chain has massive congestion and very high fees (consumers racing to out-bid each other to get their transactions through) while another chain has plenty of spare capacity in its blocks, with correspondingly low fees. That second chain is less appealing to miners, all else being equal and may not merit switching even if it appears to be trading at a premium based on coinbase rewards alone. On the other hand, the congested chain is also much less appealing to users: why continue paying exorbitant fees if you can transact on the other other chain faster and at lower cost? (After all the raison d’etre of Bitcoin Cash is ostensible adoption of 2MB blocks that squeeze more payments into one block, with reduced fees expected to follow as natural side-effect as artificial scarcity is eliminated.) It is unclear to what extent lock-in effects exist for cryptocurrencies. But assuming that some users are free to defect, the second blockchain may appreciate in value over time relative to the first. As more people conclude they can reap greater benefits transacting on that chain, they will bid up the value of the currency (and incidentally, start filling up those blocks with more transactions.) From this perspective, a miner is sacrificing short-term revenue by accepting lower transaction fees in exchange for anticipated future increase in value of accumulated reserves, due to better economics/features offered by the forked chain.

Short-term instability

The second caveat is that the analysis focuses on steady state, after block times have adjusted. As pointed out in the hypothetical scenario of the 1% miner, the introduction of new hash power into a network that follows Bitcoin-type rules leads to temporarily increased profitability for all participants. While the system has mechanism to adjust mining difficulty in response to available capacity, those adjustments take effect slowly. They happen approximately once every 2 weeks in the case of Bitcoin, and use an unweighted average over that time. If the sudden influx of hash power took place in the middle of that window, the algorithm will under-correct: it averages out the hash power before and after the arrival of a new participant, instead of giving higher precedence to more recent history. It may take up to two adjustment periods to fully correct for the spike. The good news is introduction of hash power also hastens the arrival of  adjustment, which is measured in number of blocks. Sudden removal of hash power on the other hand leads to a destabilizing cycle: blocks take longer to mine and the adjustment period to correct that itself will take longer to arrive. Because that reduces the profitability of mining, it may inspire additional defections resulting in even longer block times in a vicious cycle.

Bitcoin Cash experienced a relatively mild version of that phenomenon after the hard-fork: because the minority chain split off with the full difficulty level of Bitcoin proper but only a small fraction of its hash rate, blocks were extremely slow to arrive at first. Granted BCH proponents had anticipated this problem and built an emergency adjustment feature into their protocol. But even that proved insufficiently responsive to correct for the deficit initially. Adding insult to injury, the mechanism over-corrected as miners began jumping into BCH mid-August to exploit the arbitrage opportunities. As a result difficulty and amount of hash-power began to experience wild fluctations two weeks after the hard-fork, with BCH exceeding BTC in hash-rate at one point. (So much for the simple rule “the chain with the majority hash-power is Bitcoin” proposed for anointing the winner after a contentious hard-fork.) To summarize: in the short-term there can be significant divergence from predicted equilibrium due to the relatively slow adoption of mining difficulty in comparison to how quickly network capacity can change due to sudden entry or departure of miners.

Animal spirits all the way?

One unresolved question remains: if market value of a cryptocurrency can drive rational miners to participate in supporting that blockchain, what gives rise to that valuation in the first place? Is there causality in the other direction where miner support (expressed in actual hash-rate, not press releases and blog posts) is sufficient to prop up market value? Consider the initial point where Bitcoin Cash split off from Bitcoin, and a coalition of miners accounting for 10% of total rate made a credible pledge to mine BCH regardless of market pricing. It could be that they have a grievance with BTC, they are being otherwise being compensated by participants with an axe to grind or they are simply irrational. Does that information allow investors to price BCH in relation to BCH?  Proportionality based on hash-rate does not make sense. Blockchains involve winner-takes-all dynamics and network effects. A compatible blockchain with 1% the hash-rate of Bitcoin does not have 1% of its security or censorship-resistance: since many existing Bitcoin mining pool have more power than that, any one of them could easily overpower the minority chain to execute 51% attacks.

Similarly a cryptocurrency with 1% of user-base of another one does not have 1% the utility: it would be the equivalent of an alternative social network that only 1% of your friends participate. From these perspectives, BCH started out with a major disadvantage. Very few wallets support it out of the gate; the chances of being able to pay a friend in BCH was much lower compared to BTC. Most exchanges announced they would not open up order-books for BCH; it will be more difficult to convert BCH into fiat or other cryptocurrencies. While merchants have made symbolic gestures of accepting Bitcoin to signal their “forward-thinking,” no one has put out press releases touting BCH acceptance. Given that one BCH was valued at hundreds of dollars in the immediate aftermath of the fork even before the dust settled, its current hash-rate can be explained as consequence of miner incentives. The more troubling question for future hard-forks is: given the long odds of competing against the main chain, how any splinter fork can achieve that market valuation in the first place.


Money for nothing: Bitcoin Cash & the economics of hard-forks (part I)

Pricing a parallel universe

Over a month has passed since Bitcoin Cash has forked from the main Bitcoin chain. Short-lived anomalies observed in the aftermath of the split, such as very slow mining times due to widely miscalculated block difficulty, have mostly been resolved. While most exchanges have not opened order-books for BCH, sufficient liquidity exists and trading continues at non-zero prices. So much for the predictions of “experts” who projected a rapid decay to zero once the blockchain came up to speed, allowing people to move funds in any reasonable time frame. (Deja vu? Back in 2016 on the verge of the Ethereum hard-fork to bailout the DAO, so-called “experts” also predicted the minority chain Ethereum Classic would die out quickly.)

In fairness, the majority of Bitcoin proper trading does not hit the blockchain either, taking place on the internal ledgers of exchanges. That alone should have been a clue that mobility/velocity of fund movement can not be the deciding factor in determining market value of a cryptocurrency. But are there systemic ways of pricing a splinter faction such as Bitcoin Cash relative to its main chain?

BTC itself has significantly appreciated since August, going from below $3K to nearly crossing the $5K milestone before taking a hit in response to news from a crackdown in China. BCH has fluctuated significantly but continues to hold a valuation in the hundreds of dollars. In other words: the price of bitcoin after the fork plus that of bitcoin cash greatly exceeds that of bitcoin before the fork. Consider that every person who owned bitcoin before the fork still owns the same amount of bitcoin, plus a corresponding quantity of bitcoin cash.

On its face, the hard-fork has created value out of nothing.

Creating a new cryptocurrency out of thin air

Not to shabby for an initiative with modest beginnings as a strategic bluff to another threatened soft-fork. To recap: in response to the lukewarm reception by Bitcoin miners to the segregated witness feature, one of the factions in the never-ending scaling debate proposed a user-activated soft fork or UASF. This approach would effectively force the issue with miners, by disregarding blocks which do not include support for their favorite pet-feature. In response, an opposing faction threatened a user-activated hard-fork (UAHF) that would force miners to commit to a different criteria by disregarding chains that do not include support for their favorite pet-feature, namely 2MB blocks. UAHF was more deterrence than a coherent scaling plan: its proponents painstakingly pointed out the plan would only be invoked if segregated witness activation did not take place cleanly. Last minute developments appeared to render that unnecessary: segregated witness and large-blocks were merged into a Franken-design under the controversial “New York agreement.” Enough miners signaled for that version to reach its activation threshold. Crisis averted? Not quite: one die-hard group decided to run with the UAHF banner regardless of existing consensus, proceeding to create a new fork that supports 2MB blocks—Bitcoin Cash.

Before we go on to conclude that hard-forks are an unambiguous boon to the ecosystem and expect to have one every week, consider a few alternative explanations for the outcome:

  • Speculative bubble in the broader cryptocurrency space, a point many commentators have raised warnings about. No one has accused cryptocurrency investors of being too cautious in this day and age of oversubscribed ICOs. Bitcoin Cash may have been perceived as the next gold-rush by aggressive investors looking for another asset class to get in on the ground floor.
  • Creative conspiracy theories: secret cabals committing to a price floor for BCH in order to generate demand from exchanges or motivate miners to commit hash power. This could explain why BCH had a non-zero valuation but it can not account for why BTC itself started on a steady upwards trajectory.
  • Here is a more charitable interpretation for investor psychology: the community breathing a collective sigh of relief after segregate witness activation proceeding relatively “cleanly” by some definition. While the UAHF hard-fork came out of left field, it had incorporated lessons from past times when the network played a game of chicken with hard-forks. There was no risk of the alternative chain collapsing back on the main-chain, thanks to the deliberate introduction of an incompatibility at the fork point. There was replay protection: transactions from the main chain could not be broadcast on the alternative chain. (That was a critical lesson the Ethereum community learned during its DAO debacle. Compatibility between chains resulted in problems managing funds independently: by default sending ETH also resulted in sending ETC because someone else could—and did, as a matter of fact— take the transaction from the original chain and replay it on the Classic chain.) According to this explanation, bitcoin prices were artificially depressed before the fork due to collective investor anxiety and successful completion of the fork simply erased those worries.

Hash-rate follows valuation

Beyond the particular motivations that support the current price structure, the relative value of Bitcoin Cash to Bitcoin makes for an interesting experiment in comparing predictions from economic theory with observed market behavior of the participants. That goes for miners too: while it is common to accuse investors of tulip-mania irrational exuberance, miners are supposed to be rational actors maximizing subject to well-understood economics of mining. While Bitcoin Cash transactions are deliberately incompatible to Bitcoin transactions, the proof-of-work function used to mine blocks is identical on both sides. That means the chain split presents each miner with a continuous decision point: mine BTC, mine BCH or perhaps some mixture of both by dedicating partial capacity.

The relative profitability of mining—defined as rewards to amount of hash-rate expended— for BTC/BCH has fluctuated wildly during the initial few weeks. As profitability of the BCH blockchain soared, more miners were tempted to jump ship. Eventually the pendulum swung too far and BCH had too much competition, a form of the Yogi Berra saying applies: it is too crowded, no one mines there anymore. But the graph shows an unmistakable trend of converging towards 100%, the point where both chains are equally profitable. This same behavior was observed in the aftermath of ETH/ETC split. The price ratio between the chains was approximately equal to their difficulty ratio.

Is there a relationships between hash-rate and price? It is straightforward to argue for causality in one direction when two chains have compatible proof-of-work function. That compatibility is critical because it allows repurposing mining power. Recall that the majority of commercial mining is done using highly specialized hardware that has been painstakingly optimized for a specific proof-of-work function. It can not be used to support mining on a different blockchain unless that blockchain also uses the same proof-of-work. So an Ethereum miner can not easily switch to Litecoin without significant capital expense in acquiring new hardware, but they can turn around on a dime to mine Ethereum Classic using the same fixed assets.

This interoperability creates an opportunity to arbitrage hash power between blockchains. Suppose BTC is trading at $4000 and a mining pool has 1% of the hash-rate. Neglecting transaction fees, this pool would expect to collect on average 1% of the block reward of 12.5 BTC which comes out to $500 in fiat equivalent. Now suppose that BCH has one-tenth the total hash rate of Bitcoin network being a minority chain, but is valued at one-eighth of bitcoin at $500. In other words BCH is trading at a premium compared to its hash rate. In that case our mining pool can improve its profitability by shifting all power to mine BCH. After switching sides, the miner now accounts for roughly ~9% of the newly expanded BCH network capacity. That means on average they can count on collecting 9% of the 12.5 BCH reward per block. Converted to USD that comes out to ~$563. In other words defecting to BCH when it was trading at a premium to its hash-rate has improved the bottom line. (In fact until difficulty adjustment takes place, their actual profitability will be even higher because the network is still operating under the assumption of its original capacity. Blocks will arrive faster, so the reward per unit time is even more lucrative— considering that primary operational expense for mining is power consumption, that is a more relevant metric.)

Yet the same calculus does not work for a different mining pool which starts out with 5% of total hash-rate. That pool is looking at an average $2500 per block in earnings while operating on the main chain. Switching over to BCH, they become the big fish in a small pond, now accounting for one-third of the total BCH rate— and in the process posing a risk to decentralization due to having achieved critical mass for selfish-mining. But their expected earnings in steady state actually drops to less than $1700 USD per block. What went wrong? Shifting that 5% of mining capacity to the greatly under-powered BCH network destroys the condition that gave rise to the arbitrage opportunity in the first place: with the infusion of new hash power, BCH is no longer trading at a premium compared to its hash rate. (Of course this does not mean larger mining pools can not take advantage of the opportunity: it just means they have to shift some fraction of their resources over to BCH instead of their full capacity.)

Bottom line here is that rational miner behavior gives rise to a dynamic where hash-rate is naturally allocated between compatible blockchains in response to market valuation of those currencies. Put succinctly: hash-rate follows valuation.




Chain-switching attacks and Bitcoin Cash: improving the odds

Incentives for withholding hash power

Can it make economical sense for a Bitcoin miner to strategically withdraw hash-power from the Bitcoin network? It is easy to imagine situations when a miner may choose to power down obsolete hardware that has become too inefficient, when the expected gain from operating those rigs does not justify the electricity costs. But could it make sense to power down perfectly competitive mining rigs suddenly and turn them back on at a different time in the future?

Suppose Bob has 20% of mining power and decides to completely quit mining at the end of the current adjustment period. Assuming that is the only change in the mining landscape—and this is a big leap of faith, considering that other participants may have spare capacity waiting on the sidelines— the result is that for the next adjustment period of 2016 blocks, the remaining miners will experience a slowdown. Difficulty level has not adjusted but only 80% of mining capacity remains, with the result that blocks will arrive 25% slower than average. The adjustment period will take correspondingly longer than the usual two weeks, after which point the difficulty is reduced to 80% of the original. Now if there were no other changes to overall mining capacity, the system would revert to its usual pace of producing blocks, since difficulty and hash-rate have once again achieved parity. But in comes Bob out-of-left field, powering up all of his mining rigs once again. Now difficulty parameter is too low by 20%, or equivalently the network runs a 25% surplus hash-rate over the last estimate. Blocks will arrive once every 8 minutes on average and this state of affairs will continue for a shorter-than-usual adjustment period of ~11 days. At that point difficulty could have adjusted back to observed hash-rate, except Bob is not done: once again he powers down his entire operation, leaving the network with an overestimate of difficulty. This results in a cycle of alternating periods where difficulty is overestimated and underestimated, with blocks arriving too fast or too slow. This is the scenario outlined in a SecureList/Kaspersky blog post, which goes on to argue that  miners have an incentive to game the system by strategically moving hash-power in and out in this fashion.

Economics of timing the (mining) market

Putting aside the question of whether other miners waiting on the sidelines would jump in to compensate if they saw difficulty levels fluctuate, let’s analyze this situation from the economic perspective of the individual miner (more likely, mining pool) we will call Bob. During the 17.5 days when his operation goes quiet, Bob collects no mining rewards at all. This is a lost opportunity cost. Mining has both fixed costs in the specialized rigs and data-center infrastructure, as well as marginal costs in electricity powering all that energy-hungry hardware.  The good news for Bob is that during the next period when he is back to business-as-usual, he still has 20% share of overall network capacity. He collects the exact same proportion of rewards as before. The only difference is that those rewards arrive faster: all of those blocks come in 11.2 days instead of 14. Accordingly his economic competitiveness measured in bitcoins earned per second has improved.

So far, so good. But how does that compare against the idle time? That turns out to be the main drawback. It may look like Bob is screwing over the competition by temporarily withdrawing: everyone else is left to mine at artificially high difficulty for longer than usual. But this is illusory: the remaining players are also collecting higher share of rewards than usual. (To take one example, if hash-rate was equally distributed between 5 mining pools, after one quits each of the remaining pools will see their share increase from 20% 25%.) While the network is producing blocks 25% slower than usual—from the standpoint of miners, their block rewards are correspondingly arriving less frequently— all of the miners left in the game are also collecting 25% more reward when they successfully mine a block. Their economic productivity measured in bitcoins collected per second has not changed at all. That means profitability has not changed either: regardless of what they are paying for electricity, continuing to mine at the artificially higher difficulty produces the same gain/loss per unit of time as before. (Again, assuming everything else is held constant: while block rewards are fixed, mining fees are variable and could go either way. A congested blockchain with slower confirmation times could result in fewer people attempting to transact, putting downward pressure on fees. But conversely it could mean that there is ever more urgency to get mined into the next block, resulting in higher fees.)

Meanwhile Bob receives diddly-squat during the entire time his rigs sit idle. That represents a complete loss of productivity for an even longer period of time; recall that the next adjustment period of artificially high-difficulty lasts longer than 14 days as remaining miners are struggling to produce blocks. That extended stretch of 0 bitcoins per second more than counters any advantage sustained during the easy-mining period. Given only the Bitcoin blockchain, it is difficult to see how strategically timing entry and exit out of the mining operation could result in improved economics for the miner playing that game.

An alternative chain is the playground of idle miners

Unless of course, the miner can find something better to do with all of that idle hardware sitting around? Enter Bitcoin Cash. Given two blockchains with compatible proof-of-work functions (in the broadest sense, meaning that hardware designed to mine for one can be converted to mine for the other without significant loss of efficiency) a new strategy becomes available to miners: instead of idling their rigs, they can simply switch over to the other chain.

The effectiveness of that strategy depends on several variables. Perhaps the most obvious one is the relative profitability of mining the alt-coin compared to Bitcoin itself. In general there is no a priori rule that mining alt-coins must be equally profitable in terms of rewards to hash-rate invested. While cost of hash-rate is relatively predictable in hardware costs and electricity, the rewards are highly volatile given the sharp fluctuations in cryptocurrency markets. More importantly, given that other blockchains such as Ethereum employ a different proof-of-work function than Bitcoin, it is often not possible to simply redirect existing hardware to mine an entirely different blockchain.

Restricting our attention to chains with compatible proof-of-work functions, there is a market dynamic to keep their profitability at approximately the same level in steady state. If a significant disparity exists, miners can jump ship to mine on the other chain. That incentive applies even to miners ideologically committed to a particular currency. If a miner only cares about Bitcoin but sees an opportunity to earn the equivalent of 10% more USD on a different chain with sufficient liquidity. The rational choice is:

  • Mine that alternative currency
  • Immediately cash out to all rewards to USD or other fiat currency
  • Use the proceeds to buy back bitcoin on the open market

Provided the discrepancy exceeds transaction costs of making those trades, that strategy still earns more bitcoin for the adaptable miner. In the case of Bitcoin and Bitcoin Cash, relative profitability has been fluctuating wildly, partly because BCH difficulty adjustments have been very slow to kick-in after the fork. But it is plausible that in steady state—assuming the forked chain, survives that is— profitability will remain in the same neighborhood, if not exactly equal.

The other wildcard is how difficulty adjustments work on the alternative chain. Recall that sudden introduction/departure of hash-power from the network has a transient effect on the overall profitability of that blockchain. This effect is short-lived as long as the difficulty level is adjusted to account for the total mining capacity expected. With some exceptions around edge-cases (such as the rapid, emergency adjustments to deal with the massive drop in the afterwork of the hard-fork) Bitcoin Cash has similar rules to Bitcoin around scheduling: the network aims for 10-minute intervals between blocks and adjusts difficulty after every 2016 blocks to hit that target. However these two chains will not necessarily agree on when each adjustment period starts or ends. Even in steady state, they are likely to appear as two out-of-phase waves with the same frequency rather than perfectly synchronized ones.

Luckily for hypothetical miner Bob engaged in a chain-switching attack, synchronization is not necessary. After withdrawing from mining capacity from BTC, the miner can simply switch to mining BCH starting at some arbitrary point in the middle of BCH adjustment period. (Note at the time of writing, overall BCH hash-rate is much lower compared to BTC. If we continue using the 20% figure as before, that amount of BTC capacity switching sides would completely dominate the minority chain, effectively constituting a 51% attack.) Until that period completes, the network is operating at increased pace, doling out rewards too fast because the presence of the new actor has not been factored into the difficulty calculation yet. Everyone, including Bob responsible for this temporary spike, is collecting BCH at a rate faster than what one would predict given their share of mining power. After the adjustment, target block-time is restored and expected reward per unit time also gets trimmed for everyone. But miner Bob is continuing to earn some reward, albeit in a different cryptocurrency. He can continue doing that until BTC chain readjusts its difficulty to create an environment where jumping back in is appealing again. This improves the economics of chain switching compared to simply idling all rigs for an adjustment period.

But counteracting that is another factor: the act of Bob completely withdrawing from BTC and redirecting all capacity at BCH has lowered the relative profitability of BCH. There is now more hash rate chasing the same amount of block rewards in BCH, while BTC is experiencing a competitive vacuum. (This assumes BCH and BTC prices have not magically moved in the opposite direction to correct for hash-rate. While hash-power naturally follows valuation by miners voting with their rigs, it is unclear what causal mechanism exists for market-price of a cryptocurrency traded on multiple exchanges to adjust to gain/loss of hash-power.) That may in turn create incentives for hash-power to naturally defect from BCH and switch back to BTC, compensating for the original drop.

To summarize, while it appears that each miner can cause network instability & fluctuation in block times by strategically exiting/reentering the market, that behavior is unlikely to be more profitable than the simpler strategy of always mining at the same rate 24/7. The introduction of a second compatible chain however does alter the dynamics, by compensating for lost income during the idle time. But even in that scenario,  it is unclear how other actors would respond because a natural equilibrium exists for the distribution of hash power between compatible chains, given rational miners with freedom to choose based on pricing.  More research is necessary to shed light on these competitive dynamics.