A recent paper at CCS reported problems in the Windows 2000 random-number generator. The story made it to Slashdot and was later amplified in the blogosphere after MSFT confirmed that the same problem applied to XP. One lone voice of reason on Slashdot tried in vain to clear the air, while speculation continued on whether the entire edifice of Windows cryptography had been undermined. MSFT itself did not help its case by taking issue with the definition of “vulnerability” while still announcing a change to the functionality in XP SP3.
This blogger’s two cents’ worth of observations on the subject:
- The most glaring problem with the paper is an unrealistic threat model. The attack requires complete access to the internal state of the random-number generator. In a typical setting the adversary can observe the output of a PRNG but cannot peek inside the black box to see what is going on. As such, this work is closer in spirit to the side-channel attacks against OpenSSL or the x86 shared-cache problem. These have the prerequisite that the adversary has additional visibility into the operation of the system.
- In this case the authors assumed a very powerful adversary, one who has exploited a remote code execution vulnerability to gain complete control of the application. (“Buffer overrun” is used as a proxy for this in the paper, although fewer of these vulnerabilities are exploitable for code execution owing to the proliferation of protective compiler and OS features.) The problem is, once the attacker is running code on the system with the privileges of the application using the PRNG, she has complete control and many options. There may be no reason to attack the PRNG at this point: she can directly read any keys lying around in memory, access plaintext encrypted or decrypted with those keys, and so on. This is equivalent to the observation that once you can break into a house, the fact that the owner did not shred all the documents may be quite irrelevant if the same information in those documents can be obtained elsewhere in the residence.
- Once the internal state of a PRNG is known, predicting its future output is trivial until it is rekeyed or supplied with entropy from a pool. No PRNG is secure against this problem. So the incremental risk presented by the attack applies to the following scenario only: a system is 0wned after it has generated, used and discarded key material using the PRNG but before the PRNG state has been reinitialized. In this case the PRNG state allows recovering a key that would otherwise not have been reachable (this is the “forward security” assumption at stake). Any earlier and the attack is irrelevant, because the keys generated with the PRNG are still around in memory and can be read directly without having to rewind the PRNG state. Any later and the PRNG state is lost irreversibly. That’s a narrow window of opportunity for carrying out this attack, on top of successfully exploiting a remote code execution vulnerability.
- A similar lack of perspective around system security continues into the discussion of isolation boundaries. There is an extended discussion on the benefits of kernel vs user mode, as if that were a meaningful security boundary. Code running as administrator can trivially obtain kernel privileges in all versions of Windows prior to Vista (by programmatically loading a device driver) and read the same PRNG state from the kernel. Similarly, the PRNG can run in user mode but in a different process such as lsass, which is also how key isolation works in Vista for private keys. In fact the user/kernel mode distinction does not even hold in Linux: root can directly read kernel memory.
- For this reason, having separate processes each running their own PRNG can be good for security, contrary to the argument in the paper. Compromising the state of one does not reveal information about any other process. For example, exploiting a buffer overrun in the IIS service does not reveal information about the PRNG state of the process that handles SSL negotiation, which, surprisingly, is not IIS. This is consistent with the isolation between accounts provided by the OS.
- There is an estimate of 600 SSL handshakes required for refreshing client state, and a cavalier assertion to the effect that this number is unlikely to be reached in practice. In fact, for SSL servers under load (the highest-risk case), 600 connections are easily cycled within a matter of minutes. As for clients, a quick peek at SSL usage on the web would show that most large services do not use SSL session resumption, because servers are load balanced and a client could end up going to any one of hundreds of identical servers. So even logging into email once over SSL involves dozens of SSL handshakes from scratch, one for every object accessed (including images and other non-essential data embedded on the page), each exercising the PRNG.
- The authors reverse engineered W2K code and yet keep referring to it as the “world’s most popular RNG” even after citing the statistic that its market share is in the single-digit percentages. Due diligence would have suggested looking at XP, 2003 Server and Vista first before making these claims. Vista in particular has two completely independent cryptography implementations: CAPI, which has existed since the earliest versions of NT, and Crypto Next Generation or CNG, new in Vista. Not only do they not share code for underlying primitives, but even the respective interfaces are incompatible. In the end W2K3 and Vista proved not to be vulnerable.
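To make the “narrow window” argument concrete, here is a minimal Python sketch. It is a toy and emphatically not the Windows 2000 algorithm: `ToyCounterPRNG`, `output_for_state`, the seed string and the +1 state update are all made up for illustration. It assumes a generator whose state update is invertible, and shows the three cases discussed above: an attacker holding the state can rewind to an already-discarded key, can predict future output, and loses both abilities once fresh entropy is mixed in.

```python
import hashlib
import os

MODULUS = 1 << 256


class ToyCounterPRNG:
    """Toy generator with an invertible state update (illustration only)."""

    def __init__(self, seed: bytes):
        self.state = int.from_bytes(hashlib.sha256(seed).digest(), "big")

    def next_block(self) -> bytes:
        # Output is a hash of the state; the state then advances by +1,
        # a step that anyone holding the state can undo.
        out = hashlib.sha256(self.state.to_bytes(32, "big")).digest()
        self.state = (self.state + 1) % MODULUS
        return out

    def reseed(self, entropy: bytes) -> None:
        # Mix fresh entropy into the state through a one-way hash.
        mixed = hashlib.sha256(self.state.to_bytes(32, "big") + entropy).digest()
        self.state = int.from_bytes(mixed, "big")


def output_for_state(state: int) -> bytes:
    """What the toy generator would emit from a given internal state."""
    return hashlib.sha256(state.to_bytes(32, "big")).digest()


victim = ToyCounterPRNG(b"hypothetical seed")
old_key = victim.next_block()   # generated, used and discarded earlier

stolen = victim.state           # the moment the system is compromised

# Rewind: undo the +1 step to recover the key that is no longer in memory.
past_recovered = output_for_state((stolen - 1) % MODULUS) == old_key

# Predict: future output follows directly from the stolen state.
predicted = [output_for_state((stolen + i) % MODULUS) for i in range(3)]
actual = [victim.next_block() for _ in range(3)]
future_predicted = predicted == actual

# Reseed: after fresh entropy is mixed in, the stolen state is useless.
victim.reseed(os.urandom(32))
still_predictable = output_for_state((stolen + 3) % MODULUS) == victim.next_block()

print(past_recovered, future_predicted, still_predictable)
```

Run as written, this should print `True True False`. Replacing the +1 step with a one-way hash of the state would make `past_recovered` come out false while leaving `future_predicted` true, which is the sense in which no PRNG survives state compromise going forward, but a forward-secure one at least protects the past.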
Bottom line: rumors of the complete breakdown of CAPI may have been slightly exaggerated.