Second question: why bother with an MD5 collision in the first place?
As explained in the SummerCon presentation, this particular forgery depended on exquisite timing. First, the expiration date of the certificate is exactly one year from the moment it is issued, measured in seconds. Second, the serial number of the certificate issued by the TS licensing server is a function of two semi-predictable variables:
- Number of other certificates issued before
- Current time, in millisecond resolution
This poses quite a challenge for an attacker seeking to exploit a collision against MD5. Recall that the attack depends on crafting two certificates with an identical hash: one is the certificate that the attacker predicts the licensing server will issue, the second is the certificate that the attacker actually wants to obtain. The ability to find collisions in MD5 ensures that the signature on the first one is as good as a signature on the second one. But “predict” is the operative keyword here: after all, it is up to the issuing authority to decide on the serial number and expiration date of the certificate that will be issued. Randomly chosen serial numbers would have trivially defeated the attack. (In fact this is such a good idea that randomization is required for the so-called extended validation class of certificates, which breathed new life into the defunct CA business model by allowing companies to charge a lot more money to perform the same basic due diligence they should have performed for every SSL certificate.) Instead the licensing server used the current time and an incrementing counter to generate the serial number.
That is a lucky break for the attacker: guess the values correctly when crafting the collision, and the licensing server has unwittingly issued a code-signing certificate made out in the name of MSFT. Making life easier is the ability to do dry runs of the attack, acquiring licenses freely to observe the counter and the current “time” according to the server. The TS licensing server will happily issue “licenses” to any enterprise user in a certain Active Directory group, e.g. every employee of the company who has a valid business reason to access Windows server. But even with this advantage, it is a fragile process because it requires making projections about the variables above, namely:
- How many other users will have obtained a license between now and when the collision is going to be submitted.
- Exactly what time, down to the millisecond, the collision will be processed. Note this is not when the attacker submits the request to the licensing server; it’s when the server gets around to issuing the certificate. All types of latency, from simple network jitter to OS scheduling delays, could throw this off.
The first one is easy to work around: plan the attack over a weekend or official holiday, and perhaps at a time of day when few legitimate users are going to be requesting licenses and interfering with the attack. Assuming long-term persistence on the enterprise network, one can observe the fluctuation in demand over time to spot such opportunities. A different strategy is to target an organization where the TS licensing server is idle by design: perhaps it has just been set up but is not yet activated, or it is being decommissioned in favor of a different licensing model. (In this case the assumption is that the creators of Flame have the resources to compromise lots of different organizations, so they can pick one with the right TS licensing setup.) In all cases, timing the attack to take place against an idle server can partially help with the second problem as well: if the server is only responding to the attacker, there is no concern about high load on the server leading to variable processing times. But the jury is out on whether millisecond accuracy can be obtained this way, so the attacker may still have to try multiple times. As a comparison point: even with a perfectly sequential serial ID and one-second time resolution, the forgery attack from 2008 required several tries.
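To put rough numbers on the timing problem, here is a minimal sketch under two assumptions that do not come from the presentation: the server's processing delay is uniformly distributed over a window of `jitter_ms` milliseconds, and each attempt is independent of the others. The function names are hypothetical and only serve the illustration.

```python
import random


def hit_probability(jitter_ms: int) -> float:
    """Chance the server stamps exactly the predicted millisecond,
    ASSUMING the delay is uniform over jitter_ms possible values."""
    if jitter_ms < 1:
        raise ValueError("jitter_ms must be at least 1")
    return 1.0 / jitter_ms


def expected_attempts(p: float) -> float:
    """Mean of a geometric distribution: tries until the first success."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return 1.0 / p


def simulate_attempts(jitter_ms: int, rng: random.Random) -> int:
    """Count tries until a uniformly drawn delay matches the predicted one."""
    predicted = 0  # assumption: the attacker always bets on the same delay
    attempts = 1
    while rng.randrange(jitter_ms) != predicted:
        attempts += 1
    return attempts


if __name__ == "__main__":
    for window in (1, 5, 20):
        p = hit_probability(window)
        print(f"{window:>3} ms window: p = {p:.3f}, "
              f"expected attempts = {expected_attempts(p):.1f}")
```

Under this toy model a 20 ms window of uncertainty means a 5% hit rate per try, so about 20 attempts on average. Real latency is unlikely to be uniform, so treat the output as an order-of-magnitude guide only.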
The other side of the equation is the cost of each MD5 collision, that is, the computational resources “wasted” as a result of guessing incorrectly. Latest estimates put this on the order of $10K-$100K per collision given publicly known techniques. It is likely that the Flame authors had access to novel cryptanalytic attacks lowering that cost, in addition to large amounts of computing power that might wash it away altogether. (As economists like to point out, CPU cycles are a non-replenishable resource. Once the upfront investment in a couple million servers is paid for, one might as well put them to use spinning on a problem.)
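The arithmetic is simple enough to sketch. The per-collision range below is the one quoted above; the attempt count is a made-up placeholder, not a figure from the Flame analysis, and the function names are hypothetical.

```python
def total_collision_cost(cost_per_collision_usd: float, attempts: int) -> float:
    """Total spend if every attempt needs its own freshly computed collision."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    return cost_per_collision_usd * attempts


def wasted_cost(cost_per_collision_usd: float, attempts: int) -> float:
    """Spend on collisions that missed: every attempt except the last one."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    return cost_per_collision_usd * (attempts - 1)


if __name__ == "__main__":
    LOW_USD, HIGH_USD = 10_000, 100_000  # range quoted in the text
    ATTEMPTS = 4                         # hypothetical number of tries
    for cost in (LOW_USD, HIGH_USD):
        print(f"${cost:,} per collision x {ATTEMPTS} tries: "
              f"total ${total_collision_cost(cost, ATTEMPTS):,}, "
              f"wasted ${wasted_cost(cost, ATTEMPTS):,}")
```

The cost scales linearly with the number of misses, which is why every improvement in timing accuracy translates directly into money saved, at least for an attacker limited to publicly known techniques.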
Another option is to mess with the licensing server’s notion of “time.” In most modern systems including Windows, the current clock is obtained from the network using a protocol such as NTP. If the Flame authors had access to exploits against the time synchronization scheme, they could roll back the time to try again after each failed attempt with the same timestamp. But this will not necessarily reduce the number of collisions required, since the number of issued certificates still increments regardless of success or failure, requiring a different collision to pair up with the serial number the CA is now expected to choose.
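A toy model shows why the rollback does not rescue a collision that has already missed. Everything here is an assumption made for illustration: the real serial number encoding is not reproduced, and `ToyLicensingServer` is a hypothetical stand-in that simply pairs the issuance counter with the clock.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ToyLicensingServer:
    """Toy model: the serial is derived from an issuance counter and a clock."""
    counter: int = 0
    clock_ms: int = 0

    def issue(self) -> Tuple[int, int]:
        """Issue a certificate and return its (counter, time) serial pair."""
        serial = (self.counter, self.clock_ms)
        self.counter += 1  # advances on every issuance, hit or miss
        return serial

    def roll_back_clock(self, to_ms: int) -> None:
        """Simulate tampering with time sync; the counter is untouched."""
        self.clock_ms = to_ms


if __name__ == "__main__":
    predicted = (41, 1000)  # serial the collision was computed for
    server = ToyLicensingServer(counter=41, clock_ms=1003)  # arrived 3 ms late

    first = server.issue()
    print(first, first == predicted)    # the attempt misses on the timestamp

    server.roll_back_clock(1000)        # rewind to the predicted time
    second = server.issue()
    print(second, second == predicted)  # misses again, now on the counter
```

The second attempt lands on the right millisecond but on the next counter value, so the original collision is useless and a new one has to be computed for the new expected serial.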
So far this discussion has only considered attacks that treat the licensing server as a black box: attacker interactions are restricted to submitting seemingly valid license requests to the server and perhaps attacking its surrounding environment, such as disrupting clock synchronization. What the attackers are not doing is outright breaking into that machine, exercising the signing capability directly on any chosen message or exporting the private key for future use. Why? Licensing servers are not a particularly sensitive or critical part of infrastructure. They are regular Windows servers, configured in a specific role. They are not specially hardened for better security or closely monitored for any sign of trouble. (After all, the raison d’etre of the system is to guard an MSFT revenue source; it has no intrinsic value to the enterprise.) Few enterprises use hardware security modules or other key-management techniques.
In addition to maintaining a long-term presence in the enterprise network, it is also a safe bet that the Flame creators have access to a significant collection of vulnerabilities, both public and zero-day. Why would they not go after the target directly by compromising the licensing server? Recall that due to the flawed setup of the MSFT trust chain, any organization anywhere in the world operating a TS licensing server will do. In the highly unlikely event that the first enterprise they tried has a clue about security and runs bullet-proof Windows servers, the attackers need only move on to a different one. Surely some enterprise somewhere in the world is running a vulnerable licensing server ripe for the extraction of private keys? Armed with the key, the attackers need not waste any time trying to find MD5 collisions; they can issue the certificate directly. (The conspiratorially minded could argue this is exactly what happened, with the additional twist that a single MD5 collision was chosen *after the fact* to create the appearance that it was an interactive attack. This has the nice property of providing misdirection: the actual timestamp on the certificate with the colliding hash is no longer meaningful. It could have been back- or forward-dated to the point that trying to mine the logs from the CA during the alleged incident turns up noise.)
Finally there is the question of disclosing offensive capabilities. By attacking the licensing server directly, attackers risk burning a 0-day in case they are found out. (That is the worst-case scenario; more likely the server has known unpatched vulnerabilities with readily available weaponized exploits.) This is not exactly the end of the world. Stuxnet contained multiple 0-days and the authors presumably took into account the possibility that one day the malware samples would be reverse engineered. Not to mention that thanks to the likes of VUPEN, everyone and their uncle has access to Windows zero-days these days. Finding one being used in the wild might prompt an out-of-band patch from MSFT and some temporary indignation, but then everyone moves on.
Using an MD5 collision and embedding such a certificate in Flame, on the other hand, reveals an entirely different capability: access to novel cryptographic techniques that are unknown in the civilian world. For an organization interested in trying to hide its capabilities, this is more revealing than the loss of a couple of Windows zero-days that could have blended into the background noise of vulnerability research.