The first post in this series looked at a common two-factor authentication pattern that is susceptible to phishing. This second part examines an alternative design that does not have the same vulnerability.
This is the design commonly observed in critical enterprise and government applications where the stakes are high. These organizations typically avoid OTP and prefer solutions based on public-key cryptography instead. In this model each person has a pair of keys: a public key that can be freely distributed and a private key carefully guarded by the user. For ease of identification, public keys are typically embedded in digital certificates issued by a trusted third party. A certificate effectively creates a binding between a public key and some identifying attributes of the user, such as their name, organization and email address. Authentication then works by first presenting the certificate (amounting to an unverified claim of identity, since certificates are public information) and then backing up the claim by proving possession of the private key corresponding to the public key in the certificate.
The critical difference from OTP hinges on that proof. Instead of sending over a secret value generated unilaterally, the two sides run an interactive protocol that incorporates inputs from both. The party trying to verify the user’s identity sends a challenge. That challenge is incorporated into a computation involving the private key, and the output of that computation is returned. The recipient can use the public key from the certificate to verify that the response is consistent with the challenge.
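To make the challenge-response idea concrete, here is a minimal sketch in Python. It is not how any real smart card works: the key pair is a toy textbook RSA key built from two well-known Mersenne primes, with no padding, and the names `sign`, `verify` and `digest` are made up for this illustration. The only point is to show that the response depends on both the private key and the challenge picked by the verifier.

```python
import hashlib
import secrets

# Toy RSA key pair, for illustration only (assumption: textbook RSA, no padding,
# small well-known primes). Never use parameters like these in practice.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q                            # public modulus, part of the public key
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, would stay on the card


def digest(message: bytes) -> int:
    """Hash a message down to an integer smaller than the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Private-key operation: the computation performed over the challenge."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Public-key operation: anyone holding the certificate can run this."""
    return pow(signature, E, N) == digest(message)


# The verifier picks a fresh random challenge for each login attempt.
challenge = secrets.token_bytes(32)

# The user side answers with a computation involving the private key.
response = sign(challenge)

# The verifier checks the response against the challenge it issued.
print(verify(challenge, response))

# The same response is useless against a different challenge.
print(verify(secrets.token_bytes(32), response))
```

Unlike an OTP, the response here is tied to one specific challenge, so capturing it does not help with a later login where the verifier picks a new one.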
End users are thankfully not exposed to any of this complexity. The standard incarnation involves smart cards or similar dedicated hardware such as USB tokens: tiny embedded systems featuring tamper-resistant design for high-security applications. That approach avoids storing the private key directly on a general-purpose computer such as a PC or laptop, where it would become a sitting duck for malware. Typically the card is configured to require PIN entry before performing private-key operations, giving rise to the two factors. The first is what-you-have, the physical possession of the card. The second is what-you-know, namely the knowledge of a short PIN. The resulting user experience becomes: insert the card into the reader slot (or tap it against the reader surface, if both support NFC) and enter the PIN when prompted. Smart cards are common in the enterprise and government space. For example, the US government has a mandatory Personal Identity Verification (PIV) program that defines a standard for cards issued to millions of federal employees. By contrast, they are exceedingly rare in consumer scenarios.
What makes this design inherently safe against phishing?
First, the protocol is too complex for direct user involvement. Private keys are long random sequences of characters. Even if they were directly accessible (not the case when keys are safely tucked away in a smart card), users could not reproduce them from memory if prompted. Both the challenge and the response would be dozens of characters to type out. The corollary is that the authentication protocol must be automated by software, taking the user out of the loop. This creates a problem for the attacker: there is nothing to ask the user for. Even the most persuasive phishing could not get a user to type the private key into a web page. (Granted, the user can be tricked into giving away the PIN, compromising one of the two factors of authentication. But without access to the private key residing on the smart card, the PIN by itself does not allow impersonating the victim.)
That property is useful but not enough by itself. After all, users are still responsible for making one critical decision: whether to log in at a given website. Perhaps the attacker does not need to convince anyone to mail out their private keys, if he can instead convince the user to go through their usual login ritual at the wrong website. Using the example from the previous post, suppose user Alice is tricked into using her smart card at paypa1.com (with 1 instead of L), a phishing site operated by Bob.
Bob can issue any challenge to Alice consistent with the protocol, but he faces a dilemma: in order to log in to the real PayPal site, he will have to answer a challenge chosen by PayPal. Unless he has a response corresponding to precisely that challenge, he will be out of luck. Being resourceful, Bob does not give up. At the same time as Alice connects to his phishing website, Bob turns around and starts a parallel session with the real PayPal website in the background. This is the standard man-in-the-middle attack: the user is connected to the attacker at the same time the attacker is connected to the legitimate destination, and the attacker tries to impersonate each side to the other.
- Bob claims to be Alice at PayPal by sending Alice’s certificate.
- PayPal sends Bob a challenge, requiring proof that he possesses the private key.
- Bob forwards the exact same challenge to Alice.
- Alice computes a response using her private key and returns it to Bob, expecting to be logged into her PayPal account.
- Bob forwards that response to the legitimate site.
By all indications, that response is correct. After all, it was generated using Alice’s private key, based on the same challenge PayPal issued. It would have been the exact same bits if Alice and PayPal had interacted directly, without Bob in the middle to shuttle messages back and forth.
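The relay steps above can be sketched in a few lines of Python. As before, this is only an illustration under loud assumptions: a toy textbook RSA key with no padding stands in for Alice’s card, and `alice_card`, `paypal_verify` and `bob_relay` are hypothetical names, not part of any real protocol. The sketch shows why a response computed over the challenge alone survives being forwarded.

```python
import hashlib
import secrets

# Toy RSA key pair standing in for Alice's smart card (assumption: textbook RSA,
# no padding, small well-known primes; for illustration only).
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))


def digest(message: bytes) -> int:
    """Hash a message down to an integer smaller than the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def alice_card(challenge: bytes) -> int:
    """Alice's side: answer whatever challenge arrives, using the private key."""
    return pow(digest(challenge), D, N)


def paypal_verify(challenge: bytes, response: int) -> bool:
    """PayPal's side: check the response with the public key from the certificate."""
    return pow(response, E, N) == digest(challenge)


def bob_relay(challenge_from_paypal: bytes) -> int:
    """Bob's side: forward the challenge to Alice unchanged, pass her answer back."""
    return alice_card(challenge_from_paypal)


# PayPal issues a challenge to Bob, who is claiming to be Alice.
paypal_challenge = secrets.token_bytes(32)

relayed = bob_relay(paypal_challenge)    # response that travelled through Bob
direct = alice_card(paypal_challenge)    # response if Alice had talked to PayPal

print(relayed == direct)                         # identical bits either way
print(paypal_verify(paypal_challenge, relayed))  # PayPal accepts the relayed answer
```

Nothing in the computation records who Alice believed she was talking to, which is exactly the gap Bob exploits.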
Game over? Not quite. This is where the nuts and bolts of protocol design come into play. In addition to the challenge, well-designed schemes incorporate additional “context” into the response computation. That context is determined entirely by code under user control; the other side has no influence over it.
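Here is a rough sketch of how such context could defeat the relay. The text so far has not said what the context actually is, so purely as an assumption for this example it is the hostname that Alice’s own software believes it is connected to. The toy textbook RSA key and the names `alice_client` and `paypal_verify` are likewise hypothetical.

```python
import hashlib
import secrets

# Toy RSA key pair (assumption: textbook RSA, no padding; illustration only).
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))


def digest(message: bytes) -> int:
    """Hash a message down to an integer smaller than the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def bind(context: str, challenge: bytes) -> bytes:
    """Combine the context with the challenge before the key operation."""
    return context.encode() + b"\x00" + challenge


def alice_client(challenge: bytes, connected_host: str) -> int:
    """Alice's software supplies the context itself; the remote side cannot set it.

    Assumption for this sketch: the context is the hostname of the connection.
    """
    return pow(digest(bind(connected_host, challenge)), D, N)


def paypal_verify(challenge: bytes, response: int) -> bool:
    """PayPal expects a response computed with its own hostname as context."""
    return pow(response, E, N) == digest(bind("paypal.com", challenge))


challenge = secrets.token_bytes(32)

# Alice talking to PayPal directly: the context matches and the login succeeds.
print(paypal_verify(challenge, alice_client(challenge, "paypal.com")))

# Alice talking to Bob's site: her software mixes in paypa1.com, so the
# response Bob forwards does not verify at the real site.
print(paypal_verify(challenge, alice_client(challenge, "paypa1.com")))
```

Bob can still forward the challenge, but he has no way to make Alice’s software mix in a hostname other than the one it is really connected to.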