This is the story of an implementation bug discovered while operating an enterprise public-key infrastructure system. It is common in high-security scenarios for private keys to be stored on dedicated cryptographic hardware rather than managed as ordinary files on the commodity operating system. Smart-cards, hardware tokens and TPMs are examples of popular form factors. In this deployment, every employee was issued a USB token designed to connect to old-fashioned USB-A ports. USB tokens have a usability advantage over smart-cards in situations when most employees are using laptops: there is no separate card reader required, eliminating one piece carry lug around. The gadget presents itself to the operating system as a combined card reader with a card always present. Cards on the other hand have an edge for “converged access” scenarios involving both logical and physical access. Dual-interface cards with NFC can also be tapped against badge readers to open doors. (While it is possible to shoe-horn NFC into compact gadgets and this has been done, physical constraints on antenna size all but guarantee poor RF performance. Not to mention one decidedly low-tech but crucial aspect of an identity badge is having enough real-estate for the obligatory photograph and name of the employee spelled out in legible font.)
The first indication of something awry with the type of token used came from a simple utility rejecting the RSA public-key for a team member. That public-key had been part of a pair generated on the token, in keeping with the usual provisioning process that guarantees keys live on the token throughout their entire lifecycle. To recap that sequence:
- Generate a key-pair of the desired characteristics, in this case 2048-bit RSA. This can be surprisingly slow with RSA on the order of 30 seconds, considering the hardware in question is powered by relatively modest SoCs under the hood.
- Sign a certificate signing-request (CSR) containing the public-key. This is commonly done as a single operation at the time of key-generation, due to an implementation quirk: most card standards such as PIV require a certificate present before clients can use the card because they do not know have a way to identify private-keys in isolation.
- Submit that CSR to the enterprise certificate authority to obtain a certificate. In principle certificates can be issued out of thin air. In reality most CA software can only accept a CSR containing the public-key of the subject, signed with the corresponding private key— and they will verify that criteria.
- Load issued certificate on the token. At this point the token is ready for use in any scenario demanding PKI credentials, be it VPN, TLS client authentication in a web-browser, login to the operating system or disk-encryption.
On this particular token, that sequence resulted in a 2047-bit RSA key, one bit off the mark and falling short of the NIST recommendations to boot. A quick glance showed the provisioning process was not at fault. Key generation was executed on Windows using the tried-and-true certreq utility. (Provisioning new credentials is commonly under-specified compared to steady-state usage of existing credentials, and vendors often deign to publish software for Windows only.) That utility takes an INF file as configuration specifying key type to generate. Quick glance at the INF file showed the number 2048 had not bit-rotted into 2047.
Something else lower in the stack was ignoring those instructions or failing to generate keys according to the specifications. Looking through other public-keys in the system showed that this was not in fact an isolated case. The culprit appeared to be the key-generation logic on the card itself.
Recall that when we speak of a “2048-bit RSA key” the counts are referring to the size of the modulus. An RSA modulus is the product of two large primes of comparable size. Generating a 2048-bit RSA key then is done by generating two random primes of half that size at 1024-bits and multiplying those two values together.
There is one catch with this logic: the product of two numbers N-bits in size each is not guaranteed to be 2·N. That intuitive-sounding 2·N result is an upper-bound: the actual product can be 2·N or 2·N-1 bits. Here is an example involving tractable numbers and the more familiar decimal notation. It takes two digits to express the prime numbers 19 and 29. But multiplying them we get 19 * 29 = 551, a number spanning three digits instead of four. By contrast the product of two-digit primes 37 and 59 is 2183, which is four digits as expected.
Informally, we can say not all N-bit numbers are alike. Some are “small,” meaning they are close to the lower bound of 2n-1, the smallest possible N-bit number. At the other end of the spectrum are “large” N-bit numbers, closer to the high end of the permissible range at 2ⁿ. Multiplying large N-bit numbers produces the expected 2N-bit product, while multiplying small ones can fall short of the goal.
RSA implementations commonly correct for this by setting some of the leading bits of the prime to 1, forcing each generated prime to be “large.” In other words, the random primes are not randomly selected from the full interval [2n-1, 2n – 1] but a tighter interval that excludes “small” primes. (Why not roll the dice on the full interval and check the product after the fact? See earlier point about the time-consuming nature of RSA key generation. Starting over from scratch is expensive.) This is effectively an extension of logic that is already present for prime generation, namely setting the most significant bit to one. Otherwise the naive way to “choose random N-bit prime” by considering the entire interval [0, 2n – 1] can result in a much shorter prime, one that begins with an unfortunate run of leading zeroes. That guarantees failure: if one of the factors is strictly less than N bits, the final modulus can never hit the target of 2N bits.
So we know this token has a design flaw in prime generation that occasionally outputs 2047-bit modulus when asked for 2048. How occasional? If it were only setting the MSB to one and otherwise picking primes uniformly in the permissible interval, the error rate can be approximated by the probability that two random variables X and Y selected independently at random from the range [1, 2] have a product less than 2. Visualized geometrically, this is the area under the curve xy < 2 in a square region defined by sides in the same interval. That is a standard calculus problem that can be solved by integration. It predicts about 40% of RSA modulus falling short by one bit. That fraction is not consistent with the observed frequency which is closer to 1 in 10, an unlikely outcome from a Bayesian perspective if that was an accurate model for what the token is doing. (Note that if two leading bits were forced set on both primes, the error case is completely eliminated. From that perspective, the manufacturer was “off-by-one” according to more than one meaning of the phrase.)
So how damaging is this particular quirk to the security of RSA keys? It is certainly an embarrassing by virtue of how easy it should have been to catch this during testing. That does not reflect well on the quality assurance. Yet the difficulty of factoring 2047-bit keys is only marginally lower than that for full 2048-bits— which is to say far outside the range of currently known algorithms and computing power available to anyone outside the NSA. (Looked another way, forcing another bit to 1 when generating the primes also reduces the entropy.) Assuming this is the only bug in RSA key generation, there is no reason to throw away these tokens or lash out at the vendor. Also to be clear: this particular token was not susceptible to the ROCA vulnerability that affected all hardware using Infineon chips. In contrast to missing one from the generated key, Infineon RSA library produced full-size keys that were in fact much weaker than they appeared due to the special structure of the primes. That wreaked havoc on many large-scale systems, including latest generation Yubicrap (after a dubious switch from NXP to Infineon hardware) and the Estonian government electronic ID card system. In fact the irony of ROCA is proving that key length is far from being the only criteria for security. Due to the variable strategies used by Infineon to generate primes at different lengths, customer were better off using shorter RSA keys on the vulnerable hardware:
(Figure 1, taken from the paper “The Return of Coppersmith’s Attack: Practical Factorization of Widely Used RSA Moduli”)
The counter-intuitive nature of ROCA is that the estimated worst-case factorization time (marked by the blue crosses above) does not increase in an orderly manner with key length. Instead there are sharp drops around 1000 and 2000 bits creating a sweet spot for the attack where the cost of recovering keys is drastically lower. Meanwhile the regions shaded yellow and orange correspond to key lengths where the attack is not feasible. To pick one example from the above graph, 1920-bit keys would not have been vulnerable to the factorization scheme described in the paper. Even 1800-bit keys would have been a better choice than the NIST-anointed choice of 2048. While 1800 keys were still susceptible to the attack, it would have required too much computing power—note the Y-axis has logarithmic scale— while 2048-bit keys were well within range of factoring with commodity hardware that can be leased from AWS.
It turns out that sometimes it is better missing the mark by one bit than hitting the target dead-on with the wrong algorithm.