It is not uncommon for security features to have unexpected interactions, undercutting each other. For example Tor and Bitcoin do not mix. More subtle are situations when one feature designed to mitigate a specific threat blocks some other security feature from working. This blogger recently ran into an example with Windows Server.
Enterprises frequently have to operate public-key infrastructure (PKI) systems to issue credentials to their own employees—and arguably such closed PKI systems have been far more successful than the house-of-cards that is SSL certificate issuance for the web. There are stand-alone certificate authority products such as the open-source EJBCA package but for most MSFT environments, the requirement is typically addressed by CA functionality built into Windows Server. Certificate Services is a role that can be added to the server configuration, either integrated with Active Directory (not necessarily colocated with the domain controller) or as stand-alone CA.
Since the security of PKI is critically dependent on security of the cryptographic keys used by the certificate authority, one of the standard ways to harden such a system is to move key material into dedicated cryptographic hardware. In enterprise environments, this usually means a hardware security module (HSM) connected to the CA servers. Lately the meaning of “HSM” has been greatly watered-down by companies suggesting that a glorified smart-card qualifies- perhaps these designers envision an unorthodox data-center layout with card readers glued to the side of server racks. But if one is willing to live without the improved tamper-resistance and higher performance of a dedicated HSM product, there is a more attractive option built-in. Starting with Windows 8, a Trusted Platform Module can be used to emulate virtual smart cards based on the GIDS specification.
Regardless of hardware choice, from the tried-and-true FIPS140 certified massive box to the jankiest USB token from an over-enthusiastic DIY project, these solutions all have the same defining feature. Private key material used by the CA for signing certificates is stored on an external device. No matter how badly the Windows server itself has been compromised, that key can not be extracted. (Of course the device will happily oblige if the compromised server asks to sign any message. That can be almost as bad as having direct key access when the signature has high value, as Adobe found out in the code-signing case.)
Getting along with external cryptographic hardware
At the nuts and bolts level, getting this scenario to work requires that Windows have some awareness of external cryptographic hardware. Windows crypto stack includes an extensibility layer for vendors to integrate their own device by authoring smart card mini-drivers. Certificate Services role in turn has an option to pick a particular cryptographic service provider during setup:
As noted earlier, the smart-card provider is actually a “meta-provider” that can route operations to other hardware using a mini-driver for that model of hardware. So the most direct route would be:
- Create a virtual-smart card on TPM and initialize it with PIN
- Configure Certificate Server to use smart-card provider
- Generate the signing key on the virtual smart card
By all appearances, this process appears to work during initial configuration of certificate services. When the CA is being initialized, a PIN prompt for the virtual smart-card will appear and after authenticating, a self-signed certificate will be created as expected. (Assuming we are going with the most common configuration of a root certificate. There are other options such as creating a CSR to source a certificate from a third-party; in that case the CSR will be created correctly.)
But when the service itself is started, something strange happens. The certificate server management console can not connect to the service as RCPs time out. It appears to be stuck; in fact it does not look like it started successfully. It can not be stopped or restarted either, short of killing the hosting process. So what is going on?
Explaining why the CA service got stuck involves a flash back to 2005. Prior to Vista, different applications showing UI on a Windows desktop were not isolated from each other. For example a privileged application running with administrator or system account could show a prompt in the same desktop session that unprivileged user applications operate. While this might seem obvious— where else would the UI appear if not on the user’s screen?— there is a serious security problem here. By design applications can send UI-related messages, called “window messages,” to each other. For example one application can send a message to simulate clicking on a button in another application or pasting text into a dialog box.
The original example of this vulnerability dubbed Shatter, involved more than just simulating button clicks or faking keystrokes. It relied on the existence of a specific message that includes a callback function- effectively instructing the application to invoke arbitrary piece of code. As envisioned, these callback functions were supposed to have been specified by the application itself and accordingly trusted. But nothing prevented a different app running under a different OS account from injecting the same type of message into the message queue of another application. When you can influence the flow of execution in a process to the point of making it jump to an arbitrary specified memory address, you have full control. (The original attack also relied on injecting shell-code into the memory space of the target via an earlier window message simulating pasting of text.)
But even if that dangerous message type was deprecated or applications modified to validate the incoming callback before invoking it, the broader architectural problem remains: applications running at different privilege levels can influence each other. For example if the user opened an elevated command line prompt running as administrator, even her unprivileged user applications could send keystrokes to that window, executing arbitrary commands as administrator.
Vista introduced proper UI isolation to address this problem. These changes also affected a special class of applications that normally would never be expected to show UI or interact with users: services. But it turns out many background services are not content to run invisible in the background, and occasionally feel compelled to converse with users. Session-0 isolation comes into play for that case. There are now multiple sessions in Windows, and services all operate in the special privileged session 0. Any UI displayed there would have no effect on the main desktop the user is staring at. This uses the same principle as terminal server: if multiple users were logged into a server, an application opened by one user would only render on his/her desktop, with no effect on others users with their own remote desktop session.
Interactive services detection
Hiding UI from background services under the rug may trades the security problem for an application-compatibility problem. Services are not supposed to display UI directly. Instead they are supposed to have an unprivileged counterpart in the user session they can communicate with via standard inter-process channels such as named-pipes or LPC. Knowing that developers do not necessarily know- much less care to follow- best practices, Windows team faced the problem of accommodating “legacy” services. As far as the service is concerned, there is nothing obviously wrong. It has rendered UI and is waiting for the user to make a decision, which appears to be taking forever.
Interactive services detection attempts to solve that problem by detecting such UI and alerting the user in their own session with a notification. By acting on the notification, user can switch to session 0 temporarily, which has only that one dialog from the interactive service visible, and deal with the prompt.
“Our code never makes that mistake”
That provides an explanation for why certificate services is stuck:
- During initial configuration of certificate services role, the cryptographic hardware is being accessed from an MMC console process running in the user session. PIN collection dialog renders without a problem.
- During sustained operation of certificate services, the same hardware is accessed from a background service.
So why did interactive service detection not kick-in and alert the user that there is some UI demanding attention in session 0?
The answer is an optimistic assumption made by MSFT that by “now” (defined as Windows Server 2012 time-frame) all legacy services will have been fixed, rendering interactive service detection redundant. In WS2012 the feature is disabled by default. It turns out even Windows Server 2008 had traces of that optimism: 64-bit services were exempt on the theory that developers porting their service from x86 to x64 might as well be forced to fix any interactivity. But in this case the “faulty” code is the built-in certificate service running in native 64-bit mode from MSFT: credential-collection prompts from the smart-card stack are showing up in session 0.
Luckily the feature can be enabled with a registry tweak. With interactive service detection enabled, when certificate services starts up, the expected notification does show up in user session. Switching to session 0 one finds the familiar PIN prompt for the virtual smart card. (Note that entering the PIN is not required each time the CA uses the external crypto device to sign. It is only collected once to create an authenticated session, allowing the system to operate as a true hands-off service after the operator has kick-started it.)
The problem is not confined to use of virtual smart cards. Other vendors appear to have run into the same problem. Thales who manufactures the nSafe (formerly nCipher) line of HSMs has a white-paper noting that interactive service detection must be enabled for their product to operate correctly. Oddly enough, the configuration dialog for certificate server already has a checkbox to indicate that administrator interaction may be required for use of signing keys. That alone should have been a hint that this “background service” may in fact need to interact with the administrator, especially when the UI-related behavior of vendor-specific drivers can not be known in advance.