GoDaddy outage and lessons for certificate revocation (2/2)


Windows includes a helpful utility called certutil that serves as a Swiss-army knife for troubleshooting PKI problems on that platform. Its -urlcache option displays the URL cache entries where previously obtained OCSP responses and CRLs are stored. By running this query and looking for objects associated with GoDaddy, one can determine the extent of revocation information that would have been available to the client locally if further network requests were ruled out.
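The lookup described above can be scripted. What follows is only a minimal sketch: it assumes that running certutil with the -urlcache option and no further arguments prints one cache entry per line, and the helper names query_url_cache and entries_matching are made up for illustration. The exact arguments and output layout may differ between Windows versions.

```python
import subprocess


def query_url_cache() -> str:
    """Run `certutil -urlcache` and return its text output.

    Windows only. Assumption: invoking the option with no further
    arguments lists the cached URL entries.
    """
    result = subprocess.run(
        ["certutil", "-urlcache"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


def entries_matching(output: str, keyword: str) -> list[str]:
    """Return the output lines that mention keyword, ignoring case."""
    needle = keyword.lower()
    return [
        line.strip()
        for line in output.splitlines()
        if needle in line.lower()
    ]


# Typical use on a Windows machine:
#   for entry in entries_matching(query_url_cache(), "godaddy"):
#       print(entry)
```

An empty result corresponds to the first machine described below: nothing cached locally for that issuer.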

Running this experiment on a couple of actively used Windows 7 machines shows a decidedly mixed record:

  • On one machine there were no GoDaddy entries at all. In this case all revocation checks for GoDaddy sites would have failed.
  • On another laptop, there were two dozen OCSP responses as well as CRLs for the root and intermediate issuers.

Actively used is the operative phrase here, because paradoxically the effectiveness of revocation checking as implemented on Windows is directly correlated with how frequently it is used. The chain-building engine contains sophisticated optimizations for deciding when to prefer CRL over OCSP (if multiple certificates are checked for a given issuer, it becomes more efficient to download the CRL) and for tracking which issuers are most frequently observed, allowing it to prefetch their OCSP responses and CRLs before the current ones expire.
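To make the CRL-versus-OCSP trade-off concrete, here is a toy model of that kind of heuristic. It is not the actual CAPI2 logic: the class name RevocationPlanner, the threshold value and the simple per-issuer counter are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical cut-off; the real engine's threshold is not stated here.
CRL_THRESHOLD = 3


class RevocationPlanner:
    """Toy heuristic: prefer OCSP until an issuer is seen often enough."""

    def __init__(self, crl_threshold: int = CRL_THRESHOLD) -> None:
        self.crl_threshold = crl_threshold
        self.checks_per_issuer = Counter()

    def choose_method(self, issuer: str) -> str:
        """Record one certificate check and pick a revocation source."""
        self.checks_per_issuer[issuer] += 1
        # One CRL download covers every certificate from the issuer, so it
        # pays off once enough of that issuer's certificates are checked.
        if self.checks_per_issuer[issuer] >= self.crl_threshold:
            return "CRL"
        return "OCSP"
```

With the threshold of three assumed above, the first two checks against an issuer would go to OCSP and the third would switch to a CRL download, while other issuers keep their own separate counts.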

(As an aside, this makes revocation checking something of a cooperative enterprise between multiple applications on the machine. Everyone wants to avoid doing a costly CRL/OCSP check over the network, hoping that a response is already in the cache. But to the extent that applications skip revocation checking or instruct CAPI2 to use offline checks based on cached information only, the chances of that happy condition occurring go down. This is why applications such as Chrome which “defect” from revocation checking are doing a disservice to other applications using the feature.)

The sensitivity of caching to navigation patterns is helpful. Any website the user visits often will likely have an OCSP response cached, helping tide over any temporary outage of the certificate issuer when visiting that site again. In fact, if the user happened to visit many sites with GoDaddy-issued certificates, the count may even exceed the threshold where CRL download is triggered, covering all sites affiliated with that issuer, including those not yet visited. Navigation history is highly clustered around particular sites, which makes the first case realistic. The second is less so: there is no reason to expect that the different sites a user visits are more likely to have certificates issued by the same CA.

There is one more ray of hope: OCSP stapling. This is an SSL extension that permits the server to return a recent OCSP response to the client, saving the client from having to do the lookup on its own. In principle this would also increase resilience against outages of the OCSP responder, as long as the server has a fresh response obtained prior to the outage. (This still has edge cases around a brand new server being deployed from scratch, or perhaps rebooted, during the outage. Typically it would need to reach an OCSP responder as part of initialization.) In reality the less-than-stellar uptake of this optimization outside of the Windows platform means it would have been of limited use in the GoDaddy debacle. This may change in the near future. For example, nginx recently announced support for OCSP stapling.
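The server-side decision behind stapling can be sketched in a few lines. This is an illustrative model rather than any particular server's implementation: the CachedOcspResponse type and the response_to_staple function are hypothetical, and the only protocol detail relied on is that an OCSP response carries a thisUpdate and a nextUpdate time bounding its validity.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class CachedOcspResponse:
    """Validity window of an OCSP response the server fetched earlier."""
    this_update: datetime
    next_update: datetime


def response_to_staple(
    cached: Optional[CachedOcspResponse], now: datetime
) -> Optional[CachedOcspResponse]:
    """Return the cached response if it can still be stapled, else None."""
    if cached is None:
        # Freshly deployed or rebooted server: nothing obtained yet, so the
        # client is back to contacting the responder on its own.
        return None
    if cached.this_update <= now < cached.next_update:
        # Still inside its validity window, even if the responder is down.
        return cached
    return None
```

A response fetched before an outage keeps being stapled until its nextUpdate time passes, which is the resilience benefit described above. A server with no cached response has nothing to offer, which is the edge case noted in the parenthetical.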

CP
