Using Intel SGX for SSH keys (part II)

[continued from part I]

Features & cryptographic agility

A clear advantage of using an SGX enclave over a smart-card or TPM is the ability to support a much larger collection of algorithms. For example the Intel crypto-api-toolkit is effectively a wrapper over openssl or SoftHSMv2, which supports a wide spectrum of cryptographic primitives. By comparison most smart-card and TPM implementations support a handful of algorithms. Looking at generic signature algorithms, the US government PIV standard only defines RSA and ECDSA, with the latter limited to two curves: NIST P256 and P384. TPM2 specifications define RSA and ECDSA, with ECDSA again limited to the same two curves, with no guarantee that a given TPM model will support them all.

That may not sound too bad for the specific case of managing SSH client keys. Until recently OpenSSH could not even use ECDSA keys from hardware tokens, making RSA the only game in town. But it does raise the question of how far one can get with the vendor-defined feature set, and what happens in scenarios where more modern cryptographic techniques, such as pairing-based signatures or anonymous attestation, are required capabilities rather than merely preferred among a host of acceptable algorithms.

Extensibility

More importantly, end-users have a greater degree of control over extending the algorithms supported by a virtual token implemented in SGX. Since SGX enclaves run ordinary x86 code, adding one more signature algorithm such as Ed25519 comes down to adapting an existing C implementation to run inside the enclave. By contrast, end-users usually have no ability to customize code running inside the execution environment of a smart-card. It is often part of the security design for this class of hardware that end-users are not allowed to execute arbitrary code. They are limited to exercising functionality already present, effectively stuck with feature decisions made by the vendor.

Granted, smart-cards and TPMs are not made out of magic; they have an underlying programming model. At some point someone had the requisite privileges for authoring and loading code there. In principle one could start with the same blank-slate, such as a plain smart-card OS with JavaCard support and develop custom applets with all the desired functionality. While that is certainly possible, programming such embedded environments is unlikely to be as straightforward as porting ordinary C code.

It gets trickier when considering an upgrade of already deployed cryptographic modules. Being able to upgrade code while keeping secret-key material intact is intrinsically dangerous: it allows replacing a legitimate application with a backdoored “upgrade” that simply exfiltrates keys or otherwise violates the security policy enforced by the original version. This is why in the common Global Platform model for smart-cards there is no such thing as an in-place upgrade. An application can be deleted and a new one can be installed under the exact same identity. But this does not help an attacker, because the deletion will have removed all persistent data associated with the original. Simulating upgradeability in this environment requires a multiple-applet design, where one permanent applet holds all secrets while a second “replaceable” applet with the logic for using them communicates over IPC.

With SGX, it is possible to upgrade enclave logic depending on how secrets are sealed. If secrets are bound to a specific implementation (the MRENCLAVE measurement in Intel terminology), any change to the code will render them unusable. If they are only bound to the identity of the enclave author established by the code signature (the MRSIGNER measurement), then the implementation can be updated trivially, without losing access to secrets. But that flexibility comes with the risk that the same author can sign a malicious enclave designed to leak all secrets.
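To make the trade-off concrete, here is a minimal enclave-side sketch of the two sealing policies, assuming the Intel SGX SDK sealing API (sgx_seal_data and sgx_seal_data_ex from sgx_tseal.h). The function name and buffer handling are illustrative rather than anything taken from crypto-api-toolkit, and the attribute/misc masks are the defaults suggested by the SDK documentation:

    /* Seal a private key so the untrusted application can store it on disk.
     * bind_to_mrenclave selects the policy discussed above:
     *   0 - MRSIGNER:  any enclave signed by the same key can unseal (upgradable)
     *   1 - MRENCLAVE: only this exact enclave binary can unseal
     */
    #include <stdint.h>
    #include <sgx_tseal.h>

    sgx_status_t seal_key(const uint8_t *key, uint32_t key_len,
                          uint8_t *out, uint32_t out_cap, uint32_t *out_len,
                          int bind_to_mrenclave)
    {
        uint32_t need = sgx_calc_sealed_data_size(0, key_len);
        if (need == UINT32_MAX || need > out_cap)
            return SGX_ERROR_INVALID_PARAMETER;
        *out_len = need;

        if (!bind_to_mrenclave) {
            /* Default policy binds the sealing key to MRSIGNER. */
            return sgx_seal_data(0, NULL, key_len, key, need,
                                 (sgx_sealed_data_t *)out);
        }

        /* Bind to the exact code measurement; any change to the enclave
         * renders the sealed blob unrecoverable. Masks below follow the
         * SDK's documented defaults. */
        sgx_attributes_t mask = { 0xFF0000000000000BULL, 0 };
        return sgx_seal_data_ex(SGX_KEYPOLICY_MRENCLAVE, mask, 0xF0000000,
                                0, NULL, key_len, key, need,
                                (sgx_sealed_data_t *)out);
    }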

Performance

When it comes to speed, SGX enclaves have a massive advantage over commonly available cryptographic hardware. Even with specialized hardware for accelerating cryptographic operations, the modest resources in an embedded smart-card controller are dwarfed by the computing power & memory available to an SGX enclave.

As an example: a 2048-bit RSA signature operation on a recent-generation Infineon TPM takes several hundred milliseconds, which is a noticeable delay during an SSH connection. (Meanwhile RSA key generation at that length can take half a minute.)
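For anyone who wants to reproduce the comparison, a rough benchmark is easy to put together against any PKCS#11 module, whether it fronts the SGX token or a TPM. The sketch below assumes a function-list pointer, an authenticated session and a private-key handle have already been obtained; it uses the self-contained p11-kit copy of the PKCS#11 header, but any cryptoki header will do:

    /* Average C_Sign latency in milliseconds over a number of iterations. */
    #include <time.h>
    #include <p11-kit/pkcs11.h>

    double avg_sign_ms(CK_FUNCTION_LIST_PTR p11, CK_SESSION_HANDLE session,
                       CK_OBJECT_HANDLE priv, int iterations)
    {
        CK_MECHANISM mech = { CKM_SHA256_RSA_PKCS, NULL, 0 };
        CK_BYTE msg[32] = { 0 };              /* dummy message to sign */
        CK_BYTE sig[512];
        CK_ULONG sig_len;
        struct timespec t0, t1;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < iterations; i++) {
            sig_len = sizeof(sig);
            if (p11->C_SignInit(session, &mech, priv) != CKR_OK ||
                p11->C_Sign(session, msg, sizeof(msg), sig, &sig_len) != CKR_OK)
                return -1.0;
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double ms = (t1.tv_sec - t0.tv_sec) * 1000.0 +
                    (t1.tv_nsec - t0.tv_nsec) / 1e6;
        return ms / iterations;
    }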

That slowdown may not matter for the specific use case we looked at, namely SSH client authentication, or even for other client-side scenarios such as connecting to a VPN or TLS client authentication in a web browser. In client scenarios, private key operations are infrequent. When they occur, they are often accompanied by user interaction such as a PIN prompt or a certificate selection/confirmation dialog. Shaving milliseconds off an RSA computation is hardly useful when overall completion time is dominated by human response times.

That calculus changes if we flip the scenario and look at the server side. That machine could be dealing with hundreds of clients every second, each necessitating use of the server's private key. Overall performance becomes far more dependent on the speed of cryptography under these conditions. The difference between having that operation take place in an SGX enclave ticking along at the full speed of the main CPU versus offloaded to a slow embedded controller would be very noticeable. (This is why one would typically use a hardware-security module in PCIe card form factor for server scenarios: HSMs combine security and tamper-resistance with beefier hardware that can keep up with the load. But an HSM hardly qualifies as “commonly available cryptographic hardware” given the cost and complex integration requirements.)

State of limitations

One limitation of SGX enclaves is their stateless nature. Recall that for the virtual PKCS#11 token implemented in SGX, the implementation creates the illusion of persistence by returning sealed secrets to the “untrusted” Linux application, which stores them on the local filesystem. When those secrets need to be used again, they are temporarily imported into the enclave. This has some advantages. In principle the token never runs out of space. By contrast a smart-card has limited EEPROM or flash for nonvolatile storage on-board. Standards for card applications may introduce their own limitations beyond that: for example the PIV standard defines 4 primary key slots plus a fixed number of slots for “retired” keys, regardless of how much free space the card has.

TPMs present an interesting in-between case. The TPM2 standard uses a similar approach, allowing an unbounded number of keys by offloading responsibility for storage to the calling application. When keys are generated, they are exported in an opaque format for storage outside the TPM. These keys can be reanimated from that opaque representation when necessary. (For performance reasons, there is a provision for a handful of “persistent” objects that are kept in nonvolatile storage on the TPM, optimizing away the requirement to reload every time.)

PIN enforcement

But there is a crucial difference: TPMs do have local storage for state, which makes it possible to implement useful features that are not possible with pure SGX enclaves. Consider the simple example of PIN enforcement. Here is a typical policy:

  • Users must supply a valid PIN before they use private keys
  • Incorrect PIN attempts are tracked by incrementing a counter
  • To discourage guessing attacks, keys become “unusable” (for some definition of unusable) after 10 consecutive incorrect entries
  • Successful PIN entry resets the failure counter back to zero

This is a very common feature for smart-card applications, typically implemented at the global level of the card. TPMs have a similar feature called “dictionary-attack protection” or anti-hammering, with configurable parameters for the failure count and for the lockout period during which all keys on that TPM become unusable once the threshold is hit. (For more advanced scenarios, it is possible to have per-application or per-key PINs. In the TPM2 specification, these are defined as a special type of NVRAM index.)

It is not possible to implement that policy in an SGX enclave. The enclave has no persistent storage of its own to maintain the failure count. While it can certainly seal and export the current count, the untrusted application is free to “roll-back” state by using an earlier version where the count stands at a more favorable number.

In fact even implementing the plain PIN requirement, without fancy lockout semantics, is tricky in SGX. In the example we considered, this is how it works (a sketch in code follows the list):

  1. Enclave code “bundles” the PIN along with key bits, in a sealed object exported at time of key generation.
  2. When it is time to use that key, the caller must provide a valid PIN along with the exported object.
  3. After unsealing the object, the supplied PIN can be compared against the one previously set.
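Here is a minimal sketch of what that enclave-side check could look like, again assuming the SGX SDK sealing API. The bundle layout, field sizes and function name are invented for illustration and do not mirror the crypto-api-toolkit internals:

    /* Unseal the bundle (PIN + key bits) and verify the supplied PIN before
     * allowing any use of the key. Note there is nowhere to record a failed
     * attempt that the untrusted host cannot roll back. */
    #include <stdint.h>
    #include <string.h>
    #include <sgx_tseal.h>

    typedef struct {
        uint8_t  pin[32];      /* fixed-width, zero-padded PIN */
        uint8_t  key[512];     /* raw private-key material */
        uint32_t key_len;
    } key_bundle_t;

    int use_key_with_pin(const uint8_t *sealed, uint32_t sealed_len,
                         const uint8_t *pin, uint32_t pin_len)
    {
        key_bundle_t bundle;
        uint32_t plain_len = sizeof(bundle);

        if (sealed_len < sizeof(sgx_sealed_data_t) || pin_len > sizeof(bundle.pin))
            return -1;
        if (sgx_unseal_data((const sgx_sealed_data_t *)sealed, NULL, NULL,
                            (uint8_t *)&bundle, &plain_len) != SGX_SUCCESS)
            return -1;

        /* Constant-time comparison over the fixed-width PIN field. */
        uint8_t supplied[32] = { 0 };
        memcpy(supplied, pin, pin_len);
        uint8_t diff = 0;
        for (size_t i = 0; i < sizeof(bundle.pin); i++)
            diff |= (uint8_t)(bundle.pin[i] ^ supplied[i]);

        if (diff != 0) {
            memset(&bundle, 0, sizeof(bundle));
            return -1;      /* wrong PIN; no persistent failure counter exists */
        }

        /* ... use bundle.key / bundle.key_len for the requested operation ... */
        memset(&bundle, 0, sizeof(bundle));
        return 0;
    }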

So far, so good. Now what happens when the user wants to change the PIN? One could build an API to unseal and reseal every object with an updated PIN. Adding one level of indirection simplifies this process: instead of bundling the actual PIN, each sealed key carries a commitment to a separate sealed object that holds the PIN. This reduces the problem to resealing a single object for all keys associated with the virtual token. But it does not solve the core problem: there is no way to invalidate objects sealed under the previous PIN. In that sense, the PIN was not really changed. An attacker who learned the previous PIN and made off with the sealed representation of a key can use that key indefinitely. There is no way to invalidate that stale version.

(You may be wondering how TPMs deal with this, considering they also rely on exporting what are effectively “sealed objects” by another name. The answer is that the TPM2 specification allows setting passwords on keys indirectly, by reference to an NVRAM index. The password set on that NVRAM index then becomes the password for the key. As the “Non-Volatile” part of that name implies, the NVRAM index itself is a persistent TPM object. Changing the password on that index collectively changes the password on all such keys, without having to re-import or re-export anything.)

One could try to compensate for this by requiring that users pick high-entropy secrets, such as long alphanumeric passphrases. This effectively shifts the burden from machine to human. With an effective rate-limiting policy on the PIN, as implemented in smart-cards or TPMs, end-users can get away with low-entropy but more usable secrets. The tamper-resistance of the platform guarantees that after 10 tries the keys will become unusable. Without such rate limiting, it becomes the users’ responsibility to ensure that an adversary free to make millions of guesses is still unlikely to hit on the correct one.

Zombie keys

PIN enforcement is not the only area where statelessness poses challenges. For example, there is no easy way to guarantee permanent deletion of secrets from the enclave. As long as there is a copy of the signed enclave code and exported objects stashed away somewhere in the untrusted world, they can be reanimated by running the enclave and supplying the same objects again.

There is a global SGX state maintained at the level of the CPU (the “owner epoch”) that factors into the derivation of sealing keys. Changing it renders secrets sealed under the previous value unrecoverable. But this is a drastic measure that affects every SGX application on that unit.

Smart-cards and TPMs are much better at both selective and global deletion, since they have state. For example a TPM2 can be cleared from firmware or by invoking the clear command; both options render all previous keys unusable. Similarly smart-card applications typically offer a way to explicitly delete keys or to regenerate the key in a particular slot, overwriting its predecessor. (Of course there is also the nuclear option: fry the card in the microwave, which is still nowhere near as wasteful as physically destroying an entire Intel CPU.)

Unknown unknowns: security assurance

There is no easy comparison on the question of security, arguably the most important criterion for deciding on a key-management platform. While Intel SGX is effectively a single product line (although microcode updates can result in material differences between versions), the market for secure embedded ICs is far more fragmented. There is a variety of vendors supplying products at different levels of security assurance. Most products ship in the form of a complete, integrated solution encompassing everything from the hardware to the high-level application (such as chip & PIN payments or identity-management) selected by the vendor. SGX on the other hand serves as a foundation for developers to build their own applications that leverage core functionality provided by Intel, such as the sealed storage and attestation facilities of the platform.

When it comes to smart-cards, there is little discretion left to the end-user in the way of software; in fact most products do not allow users to install any new code of their choosing. That is not a bug, it is a feature: it reduces the attack surface of the platform. In fact the inability to properly segment hostile applications was an acknowledged limitation in some smart-card platforms. Until version 3, JavaCard required the use of an “off-card verifier” before installing applets to guard against malicious bytecode. The unstated assumption is that the card OS could not be relied on to perform these checks at runtime and stop malicious applets from exceeding their privileges.

By contrast SGX is predicated on the idea that malicious or buggy code supplied by the end-user can peacefully coexist alongside a trusted application, with the isolation guarantees provided by the platform keeping the latter safe. In the comparatively short span SGX has been commercially available, a number of critical vulnerabilities were discovered in the x86 micro-architecture resulting in catastrophic failure of that isolation. To pick a few examples current as of this writing:

  • Foreshadow
  • SgxPectre
  • RIDL
  • PlunderVolt
  • CacheOut
  • CopyCAT

These attacks could be executed purely in software, in some cases by running unprivileged user code. In each case, Intel responded with microcode updates and, where necessary, hardware improvements in future generations to address the vulnerabilities. By contrast most attacks against cryptographic hardware, such as side-channel observations or fault-injection, require physical access. Often they involve invasive techniques such as decapping the chip, which destroys the original unit and makes it difficult to conceal that an attack occurred.

While it is too early to extrapolate from the existing pattern of SGX vulnerabilities, the track-record confirms the expectation that an attacker able to run code on the same CPU that hosts the enclave does indeed enjoy a significant advantage.

CP

Using Intel SGX for SSH keys (part I)

Previous posts looked at using dedicated cryptographic hardware, such as smart-cards, USB tokens or the TPM, for managing key material in common scenarios such as SSH, full-disk-encryption or PGP. Here we consider doing the same using a built-in feature of recent-generation Intel CPUs: Software Guard Extensions, or SGX for short. The first part will focus on the mechanics of achieving that result using existing open-source software, while a follow-up post will compare SGX against alternatives built on discrete hardware.

First let’s clarify the objective by drawing a parallel with smart-cards. The point of using a smart-card for storing keys is to isolate the secret material from code running on the untrusted host machine. While host applications can instruct the card to use those keys, for example to sign or decrypt a message, the host remains at arm’s length from the key itself. In a properly implemented design, raw key-bits are only accessible to the card and can not be extracted out of the card’s secure execution environment. In addition to simplifying key management by guaranteeing that only one copy of the key exists at all times, this reduces exposure of the secret to malicious host applications, which are prevented from making additional copies for future exploitation.

Most of this translates directly to the SGX context, except for how the boundary is drawn. SGX is not a separate piece of hardware but a different execution mode of the CPU itself. The corresponding requirement can be rephrased as: manage keys such that raw keys are only accessible to a specific enclave, while presenting an “oracle” abstraction to other applications running on the untrusted commodity OS.

The idea of using a secure execution environment as “virtual” cryptographic hardware is so common that one may expect to find an existing solution for this. Sure enough, a quick search for “PKCS11 SGX” turns up two open-source projects on Github. The first one appears to be a work-in-progress that is not quite functional at this time. The second one is more promising: called crypto-api-toolkit, the project lives under the official Intel umbrella on Github and features a full-fledged implementation of a cryptographic token as an enclave, addressable through a PKCS#11 interface. This property is crucial for interoperability, since most applications on Linux are designed to access cryptographic hardware through a PKCS#11 interface. That long list includes OpenSSH (client and server), browsers (Firefox and Chrome) and VPN clients (the reference OpenVPN client as well as openconnect, which is compatible with Cisco VPN appliances). This crypto-api-toolkit project turns out to check all the necessary boxes.
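As a quick illustration of what “addressable through a PKCS#11 interface” means in practice, the sketch below loads a module the same way OpenSSH or Firefox would, through the C_GetFunctionList entry point every PKCS#11 module exports. The library path is an assumption (substitute wherever the toolkit installed its shared object), and the self-contained p11-kit copy of the PKCS#11 header is used for convenience:

    /* Load a PKCS#11 module dynamically and print basic information about it.
     * Build with: cc loader.c -ldl */
    #include <stdio.h>
    #include <dlfcn.h>
    #include <p11-kit/pkcs11.h>

    int main(void)
    {
        /* Assumed install path for the crypto-api-toolkit module. */
        const char *path = "/usr/local/lib/libp11sgx.so";

        void *mod = dlopen(path, RTLD_NOW);
        if (!mod) {
            fprintf(stderr, "dlopen: %s\n", dlerror());
            return 1;
        }

        CK_C_GetFunctionList get_list =
            (CK_C_GetFunctionList)dlsym(mod, "C_GetFunctionList");
        CK_FUNCTION_LIST_PTR p11 = NULL;
        if (!get_list || get_list(&p11) != CKR_OK ||
            p11->C_Initialize(NULL) != CKR_OK) {
            fprintf(stderr, "not a usable PKCS#11 module\n");
            return 1;
        }

        CK_INFO info;
        if (p11->C_GetInfo(&info) == CKR_OK)
            printf("Cryptoki %u.%u, manufacturer: %.32s\n",
                   (unsigned)info.cryptokiVersion.major,
                   (unsigned)info.cryptokiVersion.minor,
                   (const char *)info.manufacturerID);

        p11->C_Finalize(NULL);
        dlclose(mod);
        return 0;
    }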

Proof-of-concept

This PoC is based on an earlier version of the code-base which runs openssl inside the enclave. The latest version on Github has switched to SoftHSMv2 as the underlying engine. (In many ways that is a more natural choice, considering SoftHSM itself aims to be a pure, portable simulation of a cryptographic token intended for execution on commodity CPUs.)

Looking closer at the code, there are a number of minor issues that prevent direct use of the module with common applications for manipulating tokens such as the OpenSC suite.

  • crypto-api-toolkit has some unusual requirements around object attributes, which are above and beyond what the PKCS#11 specification demands
  • While the enclave is running a full-featured version of openssl, the implementation restricts the available algorithms and parameters. For example it arbitrarily restricts elliptic-curve keys to a handful of curves, even though openssl recognizes a large collection of curves by OID.
  • A more significant compatibility issue lies in the management of object attributes. The stock implementation does not support the expected way of exporting EC public-keys, namely by querying for a specific attribute on the public-key object.

After a few minor tweaks [minimal patch] to address these issues, the SSH use-case works end-to-end, even if it does not appease every PKCS#11 conformance subtlety.

Kicking the tires

The first step is building and installing the project. This creates all of the necessary shared libraries, including the signed enclave, and installs them in the right location, but does not yet create a virtual token. The easiest way to do that is to run the sample PKCS#11 application included with the project.

SGX_pkcs11_sample_app.png

Initialize a virtual PKCS#11 token implemented in SGX
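For reference, the same initialization can also be performed directly through the PKCS#11 API rather than the bundled sample application. A hedged sketch, with placeholder PIN and label values:

    /* Initialize the token in the given slot: sets the SO PIN and token label.
     * The label must be exactly 32 bytes, space padded. */
    #include <string.h>
    #include <p11-kit/pkcs11.h>

    CK_RV init_token(CK_FUNCTION_LIST_PTR p11, CK_SLOT_ID slot, const char *so_pin)
    {
        CK_UTF8CHAR label[32];
        memset(label, ' ', sizeof(label));
        memcpy(label, "SGX token", 9);

        return p11->C_InitToken(slot, (CK_UTF8CHAR_PTR)so_pin,
                                (CK_ULONG)strlen(so_pin), label);
    }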

 

Now we can observe the existence of a new token and interrogate it. The pkcs11-tool utility from the OpenSC suite comes in handy for this. For example we can query for supported algorithms, also known as “mechanisms” in PKCS#11 terminology:

SGX_pkcs11_token_mechanisms.png

New virtual token visible and advertising different algorithms.

(Note that algorithms recognized by OpenSC are listed by their symbolic name, such as “SHA256-RSA-PKCS”, while newer algorithms such as EdDSA are only shown by numeric ID.)
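The same query can be made programmatically with C_GetMechanismList, which is roughly what pkcs11-tool does under the hood. A sketch, reusing the function-list pointer obtained earlier:

    /* Print every mechanism the token in `slot` advertises, with key-size range. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <p11-kit/pkcs11.h>

    void list_mechanisms(CK_FUNCTION_LIST_PTR p11, CK_SLOT_ID slot)
    {
        CK_ULONG count = 0;
        if (p11->C_GetMechanismList(slot, NULL, &count) != CKR_OK || count == 0)
            return;

        CK_MECHANISM_TYPE *mechs = calloc(count, sizeof(*mechs));
        if (!mechs || p11->C_GetMechanismList(slot, mechs, &count) != CKR_OK) {
            free(mechs);
            return;
        }

        for (CK_ULONG i = 0; i < count; i++) {
            CK_MECHANISM_INFO info;
            if (p11->C_GetMechanismInfo(slot, mechs[i], &info) == CKR_OK)
                printf("0x%08lx  key sizes %lu-%lu\n",
                       (unsigned long)mechs[i],
                       (unsigned long)info.ulMinKeySize,
                       (unsigned long)info.ulMaxKeySize);
        }
        free(mechs);
    }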

This token however is not yet in a usable state. Initializing a token defines the security-officer (SO) role, which is the PKCS#11 equivalent of the administrator. But the standard “user” role must first be initialized by the SO with a separate call. A quick search shows that the sample application uses the default SO PIN of 12345678:

SGX_pkcs11_Init_PIN.png

Initializing the PKCS#11 user role.
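Programmatically, the step in the screenshot amounts to logging in as the SO and calling C_InitPIN. A sketch with the default SO PIN mentioned above and a placeholder user PIN:

    /* Log in as security officer and initialize the user PIN. */
    #include <string.h>
    #include <p11-kit/pkcs11.h>

    CK_RV init_user_pin(CK_FUNCTION_LIST_PTR p11, CK_SLOT_ID slot,
                        const char *so_pin, const char *user_pin)
    {
        CK_SESSION_HANDLE session;
        CK_RV rv = p11->C_OpenSession(slot, CKF_SERIAL_SESSION | CKF_RW_SESSION,
                                      NULL, NULL, &session);
        if (rv != CKR_OK)
            return rv;

        rv = p11->C_Login(session, CKU_SO,
                          (CK_UTF8CHAR_PTR)so_pin, (CK_ULONG)strlen(so_pin));
        if (rv == CKR_OK)
            rv = p11->C_InitPIN(session, (CK_UTF8CHAR_PTR)user_pin,
                                (CK_ULONG)strlen(user_pin));

        p11->C_Logout(session);
        p11->C_CloseSession(session);
        return rv;
    }

    /* e.g. init_user_pin(p11, slot, "12345678", "1234"); */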

With the user role initialized, it is time to generate some keys:

SGX_pkcs11_RSA_keypairgen.png

RSA key generation using an SGX enclave as PKCS#11 token
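In API terms the key generation boils down to a single C_GenerateKeyPair call with templates for the public and private halves. A sketch with an arbitrary label; the exact attribute set crypto-api-toolkit insists on may differ, as noted earlier:

    /* Generate a 2048-bit RSA key pair as token objects, with the private half
     * hidden behind CKA_PRIVATE so it only shows up after C_Login. */
    #include <p11-kit/pkcs11.h>

    CK_RV generate_rsa(CK_FUNCTION_LIST_PTR p11, CK_SESSION_HANDLE session,
                       CK_OBJECT_HANDLE *pub, CK_OBJECT_HANDLE *priv)
    {
        CK_MECHANISM mech = { CKM_RSA_PKCS_KEY_PAIR_GEN, NULL, 0 };
        CK_ULONG modulus_bits = 2048;
        CK_BYTE exponent[] = { 0x01, 0x00, 0x01 };         /* 65537 */
        CK_BBOOL yes = CK_TRUE;
        CK_UTF8CHAR label[] = "ssh-key";

        CK_ATTRIBUTE pub_tmpl[] = {
            { CKA_TOKEN,           &yes, sizeof(yes) },
            { CKA_VERIFY,          &yes, sizeof(yes) },
            { CKA_MODULUS_BITS,    &modulus_bits, sizeof(modulus_bits) },
            { CKA_PUBLIC_EXPONENT, exponent, sizeof(exponent) },
            { CKA_LABEL,           label, sizeof(label) - 1 },
        };
        CK_ATTRIBUTE priv_tmpl[] = {
            { CKA_TOKEN,   &yes, sizeof(yes) },
            { CKA_PRIVATE, &yes, sizeof(yes) },
            { CKA_SIGN,    &yes, sizeof(yes) },
            { CKA_LABEL,   label, sizeof(label) - 1 },
        };

        return p11->C_GenerateKeyPair(session, &mech,
                                      pub_tmpl, sizeof(pub_tmpl) / sizeof(pub_tmpl[0]),
                                      priv_tmpl, sizeof(priv_tmpl) / sizeof(priv_tmpl[0]),
                                      pub, priv);
    }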

Persistence

The newly created keypair is reflected in the appearance of corresponding files on the local file system. Each token is associated with a subdirectory under “/opt/intel/crypto-api-toolkit/tokens” where metadata and objects associated with the token are persisted. This is necessary because unlike a smart-card or USB token, an enclave does not have its own dedicated storage. Instead any secret material that needs to be persisted must be exported in sealed form and saved by the untrusted OS. Otherwise any newly generated object would cease to exist once the machine is shut down.

Next step is enumerating the objects created and verifying that they are visible to the OpenSSH client:

SGX_pkcs11_view_keys.png

Enumerating PKCS#11 objects (with & without login) and retrieving the public-key for SSH usage

In keeping with common convention, the RSA private-key has the CKA_PRIVATE attribute set. It will not be visible when enumerating objects unless the user first logs in to the virtual token. This is why the private key object is only visible in the second invocation.
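The effect of CKA_PRIVATE is easy to demonstrate through the API as well: enumerate private-key objects once before and once after C_Login, and the key only shows up in the second pass. A sketch:

    /* Count the private-key objects currently visible in this session. */
    #include <stdio.h>
    #include <p11-kit/pkcs11.h>

    void count_private_keys(CK_FUNCTION_LIST_PTR p11, CK_SESSION_HANDLE session)
    {
        CK_OBJECT_CLASS cls = CKO_PRIVATE_KEY;
        CK_ATTRIBUTE tmpl[] = { { CKA_CLASS, &cls, sizeof(cls) } };
        CK_OBJECT_HANDLE objs[16];
        CK_ULONG found = 0, total = 0;

        if (p11->C_FindObjectsInit(session, tmpl, 1) != CKR_OK)
            return;
        while (p11->C_FindObjects(session, objs, 16, &found) == CKR_OK && found > 0)
            total += found;
        p11->C_FindObjectsFinal(session);

        printf("visible private-key objects: %lu\n", (unsigned long)total);
    }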

OpenSSH can also see the public-key and deems this RSA key usable for authentication. Somewhat confusingly, ssh-keygen with the “-D” argument does not generate a new key as the command name implies. It enumerates existing keys on all tokens associated with the given PKCS#11 module.

We can add this public-key to a remote server and attempt a connection to check whether the openssh client is able to sign with the key. While Github does not provide interactive shells, it is arguably the easiest way to check that an SSH key is usable:

SGX_pkcs11_SSH_github.png

SSH using private-key managed in SGX

Beyond RSA

Elliptic curve keys also work:

SGX_pkcs11_P256_keygen.png

Elliptic-curve key generation using NIST P-256 curve

Starting with release 8.0, OpenSSH can use elliptic curve keys on hardware tokens. This is why the patch adds support for querying the CKA_EC_POINT attribute on the public-key object, by defining a new enclave call to retrieve that attribute. (As an aside: while that follows the existing pattern for querying the CKA_EC_PARAMS attribute, it is an inefficient design. These attributes are neither sensitive nor variable over the lifetime of the object. In fact there is nothing sensitive about a public-key object that requires calling into the enclave at all. It would have been much more straightforward to export this object once and for all in the clear, for storage on the untrusted side.)
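The query itself is the standard two-call C_GetAttributeValue pattern: the first call sizes the buffer, the second fills it. A sketch of retrieving the point OpenSSH needs, with the caller owning the returned buffer:

    /* Fetch the DER-encoded EC point (an OCTET STRING wrapping the
     * uncompressed point) from a public-key object. */
    #include <stdlib.h>
    #include <p11-kit/pkcs11.h>

    CK_BYTE *get_ec_point(CK_FUNCTION_LIST_PTR p11, CK_SESSION_HANDLE session,
                          CK_OBJECT_HANDLE pubkey, CK_ULONG *len)
    {
        CK_ATTRIBUTE attr = { CKA_EC_POINT, NULL, 0 };

        if (p11->C_GetAttributeValue(session, pubkey, &attr, 1) != CKR_OK)
            return NULL;
        attr.pValue = malloc(attr.ulValueLen);
        if (!attr.pValue ||
            p11->C_GetAttributeValue(session, pubkey, &attr, 1) != CKR_OK) {
            free(attr.pValue);
            return NULL;
        }
        *len = attr.ulValueLen;
        return (CK_BYTE *)attr.pValue;
    }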

These ECDSA keys are also usable for SSH with more recent versions of OpenSSH:

SGX_pkcs11_SSH_ECDSA.png

More recent versions of OpenSSH support using ECDSA keys via PKCS#11

Going outside the SSH scenario for a second, we can also generate elliptic-curve keys over a different curve such as secp256k1. While that key will not be suitable for SSH, it can be used for signing cryptocurrency transactions:

SGX_pkcs11_bitcoin_usecase.png

Generating and using an ECDSA key over secp256k1, commonly used for cryptocurrency applications such as Bitcoin

While this proof-of-concept suggests that it is possible to use an SGX enclave as a virtual cryptographic token, how that approach compares to real dedicated hardware is a different question. The next post will take up that comparison.

[continued]

CP

 

Helpful deceptions: location privacy on mobile devices

 

Over at the New York Times, an insightful series of articles on privacy continues to give consumers disturbing peeks at how the sausage gets made in the surveillance capitalism business. The piece on mobile location tracking is arguably one of the more visceral episodes, demonstrating the ability to isolate individuals as they travel from sensitive locations (the Pentagon, the White House, the CIA parking-lot) back to their own residence. This type of surveillance capability is not in the hands of a hostile nation state, at least not directly; there is no telling where the data ends up downstream after it is repeatedly repurposed and sold. It is masterminded by run-of-the-mill technology companies foisting their surveillance operation on unsuspecting users in the guise of helpful mobile applications.

But the NYT misses the mark on how users can protect themselves. The self-help guide dutifully points out that users can selectively disable location permission for apps:

The most important thing you can do now is to disable location sharing for apps already on your phone.

Many apps that request your location, like weather, coupon or local news apps, often work just fine without it. There’s no reason a weather app, for instance, needs your precise, second-by-second location to provide forecasts for your city.

This is correct in principle. For example Android makes it possible to view which apps have access to location and retract that permission anytime. The only problem is that many apps will demand that permission right back or refuse to function. This is a form of institutionalized extortion, normalized by the expectation that most applications are “free,” which is to say they are subsidized by advertising that in turn draws on pervasive data collection. App developers withhold useful functionality from customers unless the customer agrees to give up their privacy and capitulates to this implicit bargain.

Interestingly there is a more effective defense available to consumers on Android, but it is currently hampered by a half-baked implementation. Almost accidentally, Android allows designating an application to provide alternative location information to the system. This feature is intended primarily for those developing Android apps and is buried deep under developer options.

Screenshot_20200208-102751_Settings

It is helpful for an engineer developing an Android app in San Francisco to be able to simulate how her app will behave for a customer located in Paris or Zanzibar, without so much as getting out of her chair. Not surprisingly there are multiple options in the Play Store that help set artificial locations and even simulate movement. Here is Location Changer configured to provide a static location:

Screenshot_20200211-150017_Location Changer.jpg

(There would be a certain irony in this app being advertising supported, if it were not common for privacy-enhancing technologies to be subsidized by business models not all that different from the ones they are allegedly protecting against.)

At first this looks like a promising defense against pervasive mobile tracking. Data-hungry tracking apps are happy, still operating under the impression that they retain their entitlement to location data and can track users at will. (There is no indication that the data is not coming from the GPS but is instead provided by another mobile app.) Because that data no longer reflects the actual position of the device, its disclosure is harmless.

That picture breaks down quickly on closer inspection. The first problem is that the simulated location is indiscriminately provided to all apps. That means not only invasive surveillance apps but also legitimate apps with a perfectly good justification for location data will receive bogus information. For example here is Google Maps also placing the user in Zanzibar, somewhat complicating driving directions:

Screenshot_20200208-103127_Maps.jpg

The second problem is that common applications providing simulated location only have rudimentary capabilities, such as reporting a fixed location or simulating motion along a simple linear path— one that goes straight through buildings, tunnels under natural obstacles and crosses rivers. It would be trivial for apps to detect such anomalies and reject the location data or respond with additional prompts to shame the device owner into providing true location. (Most apps do not appear to be making that effort today, probably because few users have resorted to this particular subterfuge. But under an adversarial model, we have to assume that once such tactics are widespread, surveillance apps will respond by adding such detection capabilities.)

What is required is a way to provide realistic location information that is free of anomalies, such as a device stuck at the same location for hours or suddenly “teleported” across hundreds of miles. Individual consumers have access to a relatively modest corpus of such data: their own past history. In theory we can all synthesize realistic-looking location data for the present by sampling and remixing past location history. This solution is still unsatisfactory, since it is built on data sampled from a uniquely identifiable individual. That holds even if the simulation replays an ordinary day in that person's life over and over again in a Groundhog Day loop. It may hold no new information about her current whereabouts, but it still reveals information about the person. For example, the simulated day will likely start and end at her home residence. What is needed is a way to synthesize realistic location information based on actual data from other people.

Of course a massive repository of such information exists in the hands of the one company that arguably bears most responsibility for creating this problem in the first place: Google. Because Google also collects location information from hundreds of millions of iPhone and Android users, the company could craft realistic location data that helps users renegotiate the standard extortion terms with apps by feeding them simulated data.

Paradoxically, Google as a platform provider is highly motivated not to provide such assistance. That is a consequence of the virtuous cycle that sustains platforms such as Android: more users make the platform attractive to developers, who are incentivized to write new apps, and the resulting ecosystem of apps in turn makes the platform appealing to users. In a parallel to what has been called the original sin of the web (reliance on free content subsidized by advertising), that ecosystem of mobile apps is largely built around advertising, which is in turn fueled by surveillance of users. Location data is a crucial part of that surveillance operation. Currently developers face a simple, binary model for access to location. Either location data is available, as explicitly requested in the application manifest, or it has been denied, in the unlikely scenario of a privacy-conscious user who has read one too many troubling articles on privacy. There is no middle ground where convincing but bogus location data has been substituted to fool the application at the user’s behest. Enabling that option would clearly improve privacy for end-users. But it would also rain on the surveillance business model driving the majority of mobile apps.

This is a situation where the interests of end-users and application developers are in direct conflict. Neither group has a direct business relationship with Google: no one has to buy a software license for their copy of Android, and only a fraction of users have paying subscriptions from Google. Past history on this is not encouraging. Unless a major PR crisis or regulatory intervention forces their hand, platform owners side with app developers, for good reason. Compared to the sporadic hand-wringing about privacy among consumers, professional developers are keenly aware of their bottom line at all times. They will walk away from a platform if it becomes too consumer-friendly and interferes with the cavalier tracking and data-collection practices that keep advertising-supported business models afloat.

CP