The previous post discussed the potential for blackbox kleptographic implementations of ECDSA to leak private-keys by manipulating random nonces. What about other cryptographic algorithms? Would offline Bitcoin wallets be less susceptible to similar attacks if Satoshi had made different design choices?
As a natural alternative we can look at RSA, the first widely deployed public-key cryptosystem. RSA signatures also involve applying a padding scheme to the raw message hash before feeding it into the trapdoor. Some of these schemes are probabilistic, meaning there is some arbitrariness in the way that the original input is extended from size of the hash function (say 32 bytes for SHA256) to the full size of RSA modulus (2048 bits or higher by contemporary standards) Such freedom in choosing those additional bits create the same problem as ECDSA. But RSA can also be used in conjunction with deterministic padding. One message, exactly one valid signature. While that eliminates a very specific threat involving covert channels from blackbox implementations, it raises other questions. How would RSA stack-up against ECDSA along other aspects- security, performance, efficiency?
Theoretical security: a paucity of provable results
At a very fundamental level, there is no clear winner. Reflecting our general state of ignorance about the complexity of core cryptographic primitives, neither cryptosystem has a formal proof of security. It is widely believed that the difficulty of RSA rests on factoring while that of ECDSA/ECDH rests on the discrete logarithm problem. But even those seemingly obvious facts have eluded proof as of this writing, if we define “proof” to mean a specific type of reduction: showing that an effective attack capable of breaking RSA (ECDSA, respectively) leads to an equally effectively algorithm for factoring (computing discrete logs.) It’s clear that efficient factoring would break RSA and efficient algorithms for computing discrete logs would break elliptic-curve cryptosystems. But it is the converse that matters. Proving or refuting that conjecture remains an open problem: whether there exists a way to decrypt messages or forge signatures without solving either of those problems. In fairness, few cryptosystems have been formally reduced to an existing problem. Rabin cryptosystem is provably equivalent to factoring. A few others rarely used in practice such as McEliece are based on an NP-complete underlying problem, but that does not imply that attacking a random instance is NP-hard.
Both elliptic-curve and RSA signatures vary in security level based on parameters, and there are various guidelines for relating the difficulty of breaking one to another. RSA is simpler because there are relatively few knobs and parameters to tweak when generating keys. Key-size is the main variable, with 2048-bits being the consensus recommendation today. In the past there have been varying opinions about choosing primes of special structure, on the grounds that most efficient factoring algorithms known at the time fared better when order of the group was “smooth” eg P-1 and Q-1 have many small divisors. But the most efficient algorithm known today, the number-field sieve, does not rely on such special structure and operates equally fast (or equally slowly, depending on your perspective) against all RSA moduli. Public exponent is generally chosen to be a small number to allow quick public-key operations, and attempts to optimize performance based on choosing special private-key exponents faces security problems. That leaves modulus-size as the main tunable parameter for determining security of the RSA cryptosystem.
For elliptic curves, the story is more complicated. For starters there are many subtypes of curves: they can be over prime fields or binary fields. Their parameters can be generated according to different criteria. Unlike choosing primes, picking a curve is both tricky and complex. So much that end-users do not generate their own curve, instead pick from a menu of options that are supported by their software. (“openssl ecparam -list_curves” for a glimpse at the options on your system.) This is where competition for curves starts to look like a market for used-cars, with each group pushing the virtues of their own curve. With many options come the possibility of some spectacularly bad ones. Some have been inadvertent, with advances in the field revealing that certain classes of curves were weaker than expected. Others may even be deliberate, clouded by allegations that curve parameters have been “cooked” by a conspiracy— with NSA usually playing the role of the supervillain in these stories— to have a very specific weakness exploitable by the powers-that-be.
Further aggravating matters, Bitcoin lacks cryptographic agility— it has committed to a single curve with no opportunity to increase hardness along this dimension short of a hard-fork. In any case, there is no linear scale as with RSA, where modulus size could be increased gradually one machine word at a time. Bumping up security requires a discontinuity, switching to a different curve with different parameters.
Cryptographic algorithms also fail because incidental weaknesses in an implementation implementing that has nothing to do with the underlying mathematics. How does each choice fare along this dimension? As noted earlier, ECDSA is critically dependent on a source of randomness for each signature operation- and not just during key generation. Even a partial failure of the RNG during signing is fatal. RSA does not have this problem, particularly when used in conjunction with a deterministic padding mode.
Both algorithms are subject to side-channel attacks: in a naive implementation, the sequence of steps taken during the computation has direct relationship to private-key bits. For example the code may take one code path versus another depending on whether a key-bit is 0 or 1, and these different code-paths could have very different profiles in terms of time required for completion, memory references or even aggregate power consumption. Hardened implementations try to obscure these differences against external observation. For elliptic curves it can be more difficult to do this. To take one example, in RSA a source of side-channel leaks is that squaring operation is often distinguishable from side-multiply because it can be implemented faster, for example using the recursive Karatsuba technique or simply noting that only half as many multiplications are required because the same pair of operands appears twice. But this optimization is optional: an implementation can choose to forgo faster squaring and fall back to doing it the hard way.** For elliptic curves, there is no such option. The basic group operation, point addition, is inherently different when adding a point to itself as opposed to adding two different points. That difference can’t be swept aside by falling back to a less efficient version of addition. More subtle techniques such as the Montgomery ladder are required to avoid key-dependent behavior.
At some level there is an irony to worrying about the performance of Bitcoin—an extremely inefficient system predicated on wasting massive amounts of CPU cycles, and consequently energy, to operate without any centralized trust. But suspending disbelief and assuming that speed matters, RSA appears to have a surprising advantage for transaction signing. Surprising, because in general speed is touted as one of the advantages for elliptic-curve cryptography. That part still holds true, and one can get ballpark estimates from (highly outdated) Crypo++ benchmarks or running openssl speed tests locally:
sign verify sign/s verify/s 256 bit ecdsa (nistp256) 0.0000s 0.0001s 26840.5 11228.1
sign verify sign/s verify/s rsa 2048 bits 0.000554s 0.000025s 1805.1 39505.7
That’s an order of magnitude difference. The problem for ECDSA is that both signing and verification are relatively expensive. RSA on the other hand has a very asymmetric performance profile: private-key operations (decryption and signing) are 20x slower than public-key operations (encryption and verification, respectively.) Bitcoin transactions are signed once by the originator but verified thousands if not millions of times by full-nodes validating the ledger. In that usage mode, RSA gets the nod for lowering total cost across the network over the lifetime of the transaction.
Looking at key-generation on the other hand, ECDSA gains a slight edge. RSA keys have mathematical structure as the product of two large primes; not any old number will do. Key generation involves a relatively costly search for primes by testing candidates repeatedly until primes are discovered. For ECDSA any positive integer less than the order of the generator is a valid private-key. Randomly choosing a value in this range is sufficient and very quick; so much that bulk of the time in ECDSA key-generation is spent computing the public-key. That property also makes it easier to implement deterministic key-derivation schemes such as BIP32, where multiple keys are derived from a single root secret. Because any old integer in a particular range is a valid private key, it is easy to map outputs from key-derivation functions to ECDSA private keys in one-to-one manner. While RSA key generation can be made deterministic based on the starting point for prime search, it does not have a similar uniqueness guarantee.
A different measure of efficiency is space: how many bytes are consumed by the signature. Shorter is better and arguably this is more important than speed for Bitcoin. Signature size is a major determinant of transaction size. Given that each block is capped and current 1MB limit is already becoming a limitation, any increase in TX size further constrains how many transactions can be mined per generation. On that dimension ECDSA looks a lot better. The size of an ECDSA signature is twice that of the underlying curve. For secp256k1 that comes out to 64 bytes. (Somewhat puzzling is that Bitcoin sticks with traditional ASN1 encoding of signature, which introduces up to 7 bytes or 10% of additional overhead compared to plain layout.) At first glance, even that inefficient encoding looks like major savings compared to the alternative of using 2048-bit RSA signatures, which represents a whopping 300% increase.
Surprisingly some of the overhead for RSA can be optimized away by merging it into the signature. RSA supports message recovery, where part or entirety of the message is squeezed into the signature block itself. For example with 2048b modulus and SHA256 as the underlying hash function, over 220 bytes of plaintext message can be safely recovered using PSS-R padding. Surprisingly the net overhead than could be lower than ECDSA depending on amount of data merged into the signature. Using a data-point from December 2014 that average Bitcoin transaction size is ~250 bytes, we can estimate ~190 bytes of preimage without the ECDSA signature. That is less than the maximum recoverable message which means that the entire TX would become a single RSA signature, without any appreciable bump in space used.
Unfortunately one detail in Bitcoin signing greatly limits the applicability of message recovery optimizations: every input into a transaction must be signed individually. This holds true even if the inputs are using the same address and in theory a single signature could prove control over all of them. For transactions containing multiple inputs, the hypothetical RSA signature overhead would apply for each input. That means one may run out of “message” to squeeze into the signature, unless we complicate matters further by starting to squeeze signatures into each other. In that case part of the “message” recovered from an RSA signature could be a chunk of a different RSA signature. Such cascades break one of the assumptions about TX construction: inputs can be added independently of each other. (There is a even a signature type that allows anyone to add inputs.) If input signatures incorporate fragments of other inputs, parsing becomes tricky.
One last point on space efficiency of existing Bitcoin signatures: ECDSA is far from being state of the art in optimizing signature size. It has been known for some time that pairing-based schemes enable shorter signatures. For example the Boneh–Lynn–Shacham scheme from 2001 provides a security level comparable to 1024-bit RSA using as few as 20 bytes. But pairing based cryptography remains relatively esoteric with limited implementation support and a potential minefield of intellectual property claims, making it an unlikely candidate for Bitcoin.
Verdict on ECDSA for Bitcoin
In retrospect then ECDSA looks like a reasonable middle-ground for the requirements Satoshi faced. It produces relatively compact signatures compared to RSA without requiring elaborate message-recovery tricks, allows rapid key-generation to promote better privacy for funds and has plenty of implementation support. (Modulo the choice of secp256k1; most of the initial focus on elliptic-curve support has been in response to NIST Suite-B curves. secp256k1 is not among them. For example Windows native cryptography API still does not support it.) On the downside, it has a suboptimal distribution of load between signing/verification, inflating the global cost of verifying blocks.
** To be clear, this only removes the timing leak. Other side-channels exist to still permit distinguishing squaring from multiplication by a different operand. A side-multiply will typically reference different memory regions than multiplying a number by itself. Observations exploiting local channels such as CPU caches as can still infer such key-dependent memory references.