Threshold ECDSA, key sharding and multi-signature: a comparison

Threshold cryptography refers to splitting cryptographic keys such that performing an operation—such as decrypting or signing a message— requires an interactive process among multiple shareholders without ever disclosing the secret. The emergence of cryptocurrencies has inspired a cottage industry of research on threshold ECDSA, the digital signature algorithm hard-coded by most cryptocurrencies. After some false-starts in 2014, more progress has been made towards efficient schemes. For example one recent paper at IEEE Security & Privacy 2018 made strides in reducing the the level of interaction required during that computation. But the techniques remain a long ways from being a realistic solution for any system storing cryptocurrency, especially when they have to compete with a much simpler alternative already built into common cryptocurrencies: multi-signature. In this blog post we compare three alternatives to see how they stack up:

  1. Naive secret splitting
  2. Multi-signature
  3. Threshold cryptography

Naive secret-splitting

In this model, the bits making up the secret value are split using a plain threshold-sharing scheme such as Shamir’s secret sharing. Such schemes are parameterized by so-called “M-of-N” settings, where N shares exist of which any subset M < N are sufficient to reconstruct the original value. In the case of cryptocurrencies such as Bitcoin the “secret” private-key is a scalar value that defines the private key, which is effectively a large number. That number can be split into shares and reassembled again to create the original private-key as long as a quorum of at least M shares are present. With the key reconstituted, it can be used to sign Bitcoin transactions

That reassembly step also happens to be the Achilles heel of this model: there is a single private-key at the end of the day and it is being repeatedly reassembled each time that key is used to transfer Bitcoin. (Recall that in cryptocurrencies such as Bitcoin or Ethereum, control over funds is asserted by creating a digital signature, which boils down to using the private-key.) That means there is a single point of failure, a trusted system where reassembly takes place where that secret value is exposed.

Native multi-signature

Around 2012 Bitcoin underwent one of its more ambitious feature expansions, introducing support for a scripting language for describing payment conditions. This language greatly expanded the range of conditions that can be tied to spending funds. Previously pay-to-public-key or pay-to-public-key-hash  addresses could only express a condition along the lines “you can spend these funds if the transaction is signed with this key.” With script one could replace the second part with more complex assertions such as  “… signed with key A or signed with key B and provides solution to a puzzle C.” Since the idea of requiring multiple signatures on a transaction is a very common scenario, the designers introduced native instructions into the virtual machine scripting language to support multi-signatures or “multisig:” instead of depending on a single key to sign the transaction, some quorum of M keys out of a larger set of N is required to sign.

This M-of-N condition may resemble traditional secret-splitting but the similarity ends there In Bitcoin multisig, there are N distinct, independent keys. By contrast, secret splitting is working with exactly one key. As far as rest of the world is concerned, there is one signature to verify, produced by one private key and verified with one public key. How that private key is managed behind the scenes by sharding across multiple systems is opaque to everyone else.  By contrast multisig involves creating M different signatures to authorize a transaction. Each of these signatures must be independently verified and the presence of multiple, distinct keys is visible to other participants in the system.

In a useful example of debunking false dilemmas around security trade-offs, multisig improves both security and redundancy. The security advantage is readily apparent when M > 1. An attacker planning to steal funds has multiple targets to chase after instead of just one. More importantly, those keys can be managed independently. There is no single system that must have access to all keys at the same time. Each private-key signs a transaction without requiring access to any of the other keys. (In fact, the signatures do not even depend on each other.  This is an example of co-signing rather than counter-signing where subsequent signatures would incorporate earlier signatures. For Bitcoin the “message” being signed is identical from the perspective of multiple key-holders, which allows obtaining the signatures in parallel, in arbitrary order.) In fact the security gains from multisig are directly related to the extent these multiple keys have uncorrelated risks. Put all eggs in the same basket as Bitfinex did, and multisig degenerates into the same baseline as single-key control.

The redundancy improvement is more subtle. While secret keys falling into the hands of an adversary is perhaps the most common threat model, there is another risk around simply losing access to the key. For example the device where the key is stored could malfunction or accidentally get discarded— newsworthy examples of that are plenty. This is key “loss” in a different sense: no one has access to the secret any longer and any funds on the blockchain controlled by that private-key are effectively taken out of the money supply. In multisig configurations where the quorum number M is less than the total number of keys N, key loss can be tolerated. Even with some shares permanently lost or even temporarily unavailable, transactions can be authorized.

There is one catch: multi-signature transactions take up a lot more space. A configuration requiring a quorum of M signatures out of a possible group of N must include not only all M signatures but also M public-keys. You might expect some type of space-optimization in the protocol, where future transactions out of the same address do not require enumerating the public-keys again since those are fixed for a given multisig address. But there is no such provision. Whether out of a strident belief that addresses must only be used once to preserve privacy— an ideological stance readily contradicted by evidence from actual blockchain statistics— or a pragmatic bias for avoiding complexity, the designers did not include any mechanism for “caching” key-sets (even though signatures were being stored permanently on the blockchain until segregated-witness.) This is more than a matter of engineering efficiency: when it comes to paying fees, transactions are priced by the byte. This makes multi-signature transactions more costly, with costs scaling linearly with in both parameters M and N.

Threshold cryptography

“If you think cryptography is the solution to your problem, then you don’t understand your problem” — attributed to Peter Neumann

Threshold cryptography promises to offer  the best of both worlds. As far as the rest of the blockchain is concerned, there is a single key used to create one signature;  transactions consume the same amount of space as old-school single key addresses. On the other hand, that key has been sharded into multiple shares, with each share given to a different party. More importantly, those shares are never reassembled. Instead the people controlling individual shares all participate in an complex protocol to create the signature, without any participant gaining more information about the shares held by others. Designing those protocols, reasoning about their properties and proving security is where the cryptographic techniques comes in.

It is also where the complexity lies. Depending on the particular problem, these protocols can be too unwieldy for implementation. For some cryptographic systems there are natural solutions. For example RSA allows additively splitting the private-key and easily combining individual signatures created by each piece. (But the joint generation of an RSA modulus such that no one learns the factorization is quite tricky. For this reason, threshold systems commonly assume “trusted dealer” where the entire secret is allowed to exist on one system temporarily during key generation, from which the shares are distributed.) For  the DSA family of algorithms, including its elliptical curve incarnation, there is no simple solution known. Solutions can fall short effectivness for a number of reasons. First and most important is the amount of interaction required among holders of shares during signature generation. In an ideal solution, no interaction is required. Each person performs a computation using their share working independently and outputs some intermediate result. Finally there is a reassembly phase, where an untrusted system collects the partial answers and reassembles them into a final result.

That model is ideal for offline storage of cryptocurrency, because it allows parallel, incremental construction of a transaction. Typical offline storage schemes involve storing cryptographic keys offline, on systems located across multiple geographically diverse sites. Because these systems are not connected to a network, participants must travel on site to have physical access for interacting with them. A threshold signature protocol with zero interaction allows those accesses to happen in any sequence, visiting one site at a time. By contrast, all known threshold signature protocols for ECDSA involve multiple rounds of interaction among holders of key-shares. That means either visiting each site multiple times, or more realistically having to orchestrate multiple groups visiting all sites simultaneously and being in communication in real time to carry out protocol steps— keeping in mind that these systems were supposed to be offline and disconnected from networks. None of the options are particularly appealing. One could knit together the “air-gapped” systems into a private network that remains (hopefully) isolated from the rest of the internet. But even in that best case scenario where the network is private, there is new attack surface: connected systems can now target each other, weakening one of the assumptions about each site having uncorrelated risks. Or the operators could resort to kludgy “sneaker-net” to exchange inputs/outputs with offline systems while leveraging public networks to transfer those inputs/outputs to other sites. The inefficiency of that alternative increases with the communication complexity of the protocol, and the number of “rounds” involved in assembling the final signature.

A second and more subtle limitation arises from the incompatibility of threshold signing protocols with existing cryptographic hardware, such as bitcoin wallets running on smart cards or HSMs. If a protocol requires manipulating secret material in ways outside the handful of algorithms supported by the underlying platform, it can not be instantiated using these systems. Once again RSA is the poster-child for a well-behaved algorithm. Any device capable of importing RSA private keys and signing with them can participate in an NofN threshold signature scheme (note M=N requirement) because the “partial computation” using a share looks exactly like producing a regular RSA signature. That said, there are implementation subtleties. For example, the hardware must support importing an RSA private-key without the Chinese Remainder Theorem or CRT components. Otherwise if the shares had to include CRT components, every participant could recover the full secret.

By contrast existing threshold ECDSA protocols do not use secret key material in any way resembling an ECDSA signature, or even other common algorithms defined on secret material, such as ECDH computation using the same scalar value. That means such protocols are incompatible with gold standard for managing high value secrets. (It is rare for  cryptographic hardware to permit users to implement their own cryptographic algorithms to run inside the secure environment, since a badly designed one allows key extraction.) That is a significant step backwards in operational security: general-purpose computing platforms such as PCs are not appropriate for managing critical secrets. Hardware wallets are built for exactly that scenario, with defenses against tampering, key-extraction and side-channel attacks. Giving up that assurance to save on transaction fees is a dubious trade-off. That said, this is a practical limitation that could be overcome if hardware wallets decided to add support for the more esoteric primitives required to make these protocol work. But as with all other software development, vendors of cryptographic hardware follow the 80/20 approach to defining requirements: they concentrate their engineering efforts on supporting the 20% of features used by 80% of customers, and avoid the long-tail of seldom used obscure algorithms that only a handful of people care about.

Bottom-line: while future developments towards a more efficient threshold ECDSA implementation could overcome these limitations, at the moment it’s a tough sell to give up on the simplicity of native multi-signature in favor of a complex multi-party computation protocol that is incompatible with preferred hardware for high-value secrets.



Leave a Reply

Please log in using one of these methods to post your comment: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s