Forward secrecy and TLS: detecting active attacks (part II)

(Continued from part 1)

Checking for man-in-the-middle attacks

Imagine planting special “observer” nodes around the network. Each of them would try to make a TLS connection to the website we are interested in monitoring, using one of the PFS ciphersuites. Each observer records a transcript of the TLS handshake conducted, including both its own messages and those purportedly sent by the server. Finally these transcripts are uploaded to a centralized monitoring service for processing. (This could be the same as the website under observation or an independent third party.) Likewise the website would also upload its own transcript of the TLS handshake when the TLS request is made to a special designated URL, designed to flag requests for special processing. With transcripts received from both sides, the monitoring service can reconcile them and verify that they are consistent.

Authenticating the transcripts

But the question is how transcripts can be delivered reliably. Let’s focus on the simplest case where the monitoring endpoint is operated by the same website. If observers tried to upload the data over TLS, this would lead to a circularity: we are trying to determine if TLS connections are being tampered with, so we cannot rely on that channel for integrity. Our threat model envisions a powerful attacker who can manipulate all network traffic, including what were previously assumed to be “protected” TLS communications. Such an attacker can impersonate the server when it comes time to upload the report, deceiving the observer into believing the report was delivered, and then deliver a fabricated one to the original server that is consistent with the unauthorized modifications induced by the MITM attack.

But this is not a fundamental limitation. We can posit that each observer has its own public/private key pair for signing the transcripts for upload. These are independent keys held only by the observer nodes, distinct from the TLS keys held by the website. The risk of compromise for TLS keys is mostly uncorrelated with that of individual keys distributed across ordinary PCs, tablets and smartphones around the Internet. Of course an adversary that controls the network can still block delivery of transcripts to the monitoring location. But then a complete absence of reports will itself look suspicious. Presumably temporary network outages or observers being powered off could mean that reports are often delayed or queued up to be uploaded in batches. This is not a problem as long as they are eventually delivered and reconciled successfully. On the other hand, once an MITM attack is executed, the adversary must suppress all reporting, because the next report, however delayed, will include the evidence pointing to a discrepancy.


There are a few caveats for making this type of decentralized observation system work:

  • Observer traffic must be indistinguishable from ordinary users at the network level. Otherwise the attacker would simply exempt these specific connections from MITM while successfully attacking ordinary users. For example, if the observation agent had an unusual TLS configuration in terms of the supported ciphersuites/extensions, this would allow the attacker to decide at a very early stage whether to intercept or let the connection go through.
    A typical mitigation would be for observation agents to use off-the-shelf TLS libraries such as NSS or SChannel in the same configuration in which popular web browsers use them, namely Chrome and Internet Explorer respectively.
  • The website being monitored must cooperate. This is crucial since the MITM detection relies on reconciling transcripts from the “user” point of view with those from the “website” point of view. If the website for any reason wanted to hide the existence of such interception, it could always collude with the attacker and report bogus transcripts consistent with the MITM attack.
  • There is limited ability for dispute resolution or proving to a third party whether MITM occurred. At first this seems possible due to the way ephemeral DH/ECDH key exchange is implemented: the server signs its DH inputs in the ServerKeyExchange message using the long-lived key from its X.509 certificate. That allows verifying that the ServerKeyExchange message was in fact part of a genuine exchange with that server. In fact it even binds that message partially to other fragments of the handshake, since the signature also covers the client-random and server-random values. This prevents observers from fabricating completely bogus transcripts to report false-positive MITM attacks. At a minimum the ServerKeyExchange message must have originated with the server, or with an attacker in possession of the same keys.
    But that alone can’t prevent observers from swapping transcripts, e.g. making two connections to the website with transcripts A and B, then uploading B as the first transaction. The website can detect this by recording the messages it signed and realizing that a claimed MITM attack is in fact a confused client uploading a valid but mismatched transcript. A third-party reconciliation point, however, cannot make that determination without access to all TLS handshakes from that website.
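The reconciliation logic sketched above, written out as an added example (not code from the original post). It models the simplest case where the monitoring service is run by the website itself: the website's ledger of handshakes it actually performed stands in for the transcripts it uploads when a request hits the designated URL, observers sign their reports with their own keys, and the monitor checks the ServerKeyExchange signature over client-random, server-random and the DH share before comparing against the ledger. The class and function names (`Transcript`, `Website`, `Observer`, `reconcile`) are assumptions of this example, and signatures are stood in for by HMAC so it runs on the standard library alone; a deployment would use a public-key scheme so the monitor can verify without holding the observer's or the certificate's private key.

```python
"""Reconciling observer and website handshake transcripts (constructed example)."""
import hashlib
import hmac
import os
from dataclasses import dataclass


# HMAC stands in for a signature scheme here; verify() therefore needs the
# signing key. A real deployment would use ECDSA/Ed25519 so the monitoring
# service holds only public keys.
def sign(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()


def verify(key: bytes, msg: bytes, sig: bytes) -> bool:
    return hmac.compare_digest(sign(key, msg), sig)


@dataclass(frozen=True)
class Transcript:
    """The handshake fragments the check in the post depends on."""
    client_random: bytes
    server_random: bytes
    server_dh_share: bytes  # ephemeral (EC)DH value carried in ServerKeyExchange
    ske_signature: bytes    # signature over randoms + DH share, made with the cert key

    def ske_signed_bytes(self) -> bytes:
        return self.client_random + self.server_random + self.server_dh_share

    def digest(self) -> bytes:
        return hashlib.sha256(self.ske_signed_bytes() + self.ske_signature).digest()


def server_side_handshake(cert_key: bytes, client_random: bytes) -> Transcript:
    """Answer a ClientHello with ServerKeyExchange; anyone holding cert_key can do this."""
    server_random, dh_share = os.urandom(32), os.urandom(32)
    sig = sign(cert_key, client_random + server_random + dh_share)
    return Transcript(client_random, server_random, dh_share, sig)


class Website:
    """The monitored site; the ledger records every handshake it really performed."""
    def __init__(self, cert_key: bytes):
        self.cert_key = cert_key
        self.ledger: dict[bytes, Transcript] = {}

    def handshake(self, client_random: bytes) -> Transcript:
        t = server_side_handshake(self.cert_key, client_random)
        self.ledger[client_random] = t
        return t


class Observer:
    """An observer node with its own signing key, independent of the TLS key."""
    def __init__(self, key: bytes):
        self.key = key

    def report(self, session: bytes, t: Transcript):
        """Sign a transcript for the session identified by its client_random."""
        return session, t, sign(self.key, session + t.digest())


def reconcile(report, observer_key: bytes, site: Website) -> str:
    """Monitoring service run by the website: classify one observer report."""
    session, t, obs_sig = report
    if not verify(observer_key, session + t.digest(), obs_sig):
        return "invalid-report"       # not signed by the registered observer
    if not verify(site.cert_key, t.ske_signed_bytes(), t.ske_signature):
        return "forged-transcript"    # observer cannot fabricate ServerKeyExchange
    if site.ledger.get(session) == t:
        return "ok"
    # ServerKeyExchange verifies but is not what the site sent for this session:
    # either a confused observer uploading a genuine transcript from another
    # connection, or a MITM in possession of the certificate key.
    if t in site.ledger.values():
        return "mismatched-authentic"
    return "mitm"


def demo() -> dict:
    """Run the cases discussed in the post and return each verdict."""
    site, observer = Website(b"site-cert-key"), Observer(b"observer-key")
    verdicts = {}

    cr = os.urandom(32)                                # honest connection
    verdicts["honest"] = reconcile(observer.report(cr, site.handshake(cr)), observer.key, site)

    cr = os.urandom(32)                                # attacker holds the cert key,
    site.handshake(cr)                                 # relays to the site but serves
    seen = server_side_handshake(site.cert_key, cr)    # its own DH share to the observer
    verdicts["mitm"] = reconcile(observer.report(cr, seen), observer.key, site)

    cr_a, cr_b = os.urandom(32), os.urandom(32)        # observer swaps two genuine transcripts
    site.handshake(cr_a)
    t_b = site.handshake(cr_b)
    verdicts["swapped"] = reconcile(observer.report(cr_a, t_b), observer.key, site)

    cr = os.urandom(32)                                # observer fabricates a transcript
    fake = server_side_handshake(b"not-the-cert-key", cr)
    verdicts["fabricated"] = reconcile(observer.report(cr, fake), observer.key, site)
    return verdicts


if __name__ == "__main__":
    for case, verdict in demo().items():
        print(f"{case:>10}: {verdict}")
```

The "swapped" case only resolves to `mismatched-authentic` because the website's own ledger is available to the monitor; an independent third-party reconciliation point without that ledger would see it as indistinguishable from the "mitm" case, which is the dispute-resolution limit the last bullet describes.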
