Logical access and the security theater of data-nativism

Data-center address as security guarantee

The WSJ recently quoted a spokesman for Binance.US stating that all US customer data is stored on servers located in the US. The subtext of this remark is that, by exclusion, customer information is not stored in China, an attempt to distance the company from concerns around the safety of customer information. Such new-found obsession with “data terroir” is a common interpretation of the data-sovereignty doctrine, which holds that information collected from citizens of a particular country must remain both geographically and legally subject to its privacy regulations. While the concept predates the Snowden revelations of 2013, it was given renewed urgency after disclosures of US surveillance programs leveraging massive data collections hoarded by private companies, with Google, Microsoft and Yahoo among those named as participants in the mysterious PRISM program. [Full disclosure: this blogger was a member of the Google security team from 2007-2013]

Data-sovereignty is a deceptively simple solution: if Google is forced to store private information of German citizens on servers physically located in Germany, the argument goes, then the NSA (or its counterparts in China, Russia or whichever foreign policy boogeyman looms large in the imagination on a given day) cannot unilaterally seize that data without going through the legal framework mandated by German law. This comforting narrative makes no sense from a technology perspective. (It is doubtful that it ever made sense in other terms either, including lawful access frameworks: the NSA is legally barred from conducting surveillance on US soil, so moving data out of the US into foreign countries amounts to declaring open season on those assets.) To explain why, we need to distinguish between two types of access: physical and logical.

One note about the hypotheticals explored here: the identity of the private companies hoarding sensitive customer information and the alleged boogeyman going after that stash varies according to the geopolitical flavor of the day. After Snowden, US tech giants were cast as either hapless victims or turncoat collaborators depending on your interpretation, while the NSA conveniently assumed the role of the arch-villain. For the purpose of this blog post we will use Binance/US and China as the stand-ins for these actors, with the full expectation that in a few years these examples will appear quite dated.

Physical access vs logical access

Imagine there is a server located inside a data-center in the middle of nowhere, as most datacenters are bound to be for proximity to cheap hydropower and low real-estate costs. This server is storing some important information you need to access. What are your options?

1. You can travel to the datacenter and walk up to the server directly. This is physical access. It unlocks some very convenient options. Server is stuck, not responding to requests? Press the power button and power-cycle it. Typical rack-mounted servers do not have a monitor, keyboard, mouse or any other peripherals attached for ease of use. But when you are standing next to the machine, you can connect anything you want. This allows getting an interactive shell and using it as a glorified workstation. One could even attach removable storage such as a USB thumb-drive for conveniently copying files. In a pinch, you could crack open the server chassis and pocket one of the disk drives to hoover up its contents. As an added bonus: if you walk out of the datacenter with that drive, the challenge of reading its contents can be tackled later from the comfort of your office. (Incidentally the redundancy in most servers these days means that they will continue ticking on as if nothing happened after the removal of the drive, since they are designed to tolerate failure of individual components and “hot-swapping” of storage.) But all of this flexibility comes at a high cost. First you have to travel to the middle of nowhere, which will likely involve a combination of flying and driving, then get past the access controls instituted by the DC operator. For the highest level of security in a tier-4 datacenter, that typically involves both an ID badge and biometrics such as palm scans for access to restricted areas. Incidentally the facility is covered with cameras everywhere, resulting in a permanent visual record of your presence, lest there be any doubt about what happened later.

2. Alternatively you can access the server remotely over a network using a widely deployed protocol such as SSH, RDP or IPMI. This is logical access. For all intents and purposes, the user experience is one of standing next to the machine staring at a console, minus the inconvenience of standing in the uncomfortable, noisy, over-air-conditioned, fluorescent-lit datacenter aisle. Your display shows exactly the same thing you would see if you were logged into the machine with a monitor attached, modulo some lag in the display due to the time it takes for the signal to travel over a network. You can type commands and run applications exactly as if you had jury-rigged that keyboard/mouse/monitor setup with physical access. Less obvious is that many actions that we typically associate with physical access can be done remotely. Need to connect an exotic USB gadget to the remote server? Being thousands of miles away from the USB port may look like a deal-breaker but it turns out modern operating systems have the ability to virtually “transport” USB devices over a network. USB forwarding has been supported by Windows Remote Desktop Protocol (RDP) for over a decade, while the usbip package provides a comparable solution on Linux. Need to power-on a server that has mysteriously shut down or reset one that has gotten wedged, not responding to network requests? There is a protocol for that too: IPMI. (IPMI runs on a different chip called the “baseboard management controller” or BMC located inside the server, so the server must still be connected to power and have a functioning network connection for its BMC, which happens to be the usual state of affairs in a data-center.) Need to tweak some firmware options or temporarily boot into a different operating system from a removable drive? IPMI makes that possible too.
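To make the IPMI point a little more concrete, here is a minimal sketch of what remote power control can look like when driven through ipmitool, a common open-source IPMI client. The BMC hostname and account name are placeholders and the helper function is hypothetical; it only builds the command line and never contacts a machine.

```python
import shlex

def ipmi_command(bmc_host, user, *action):
    """Build an ipmitool argument list (hypothetical helper; nothing is executed).

    -I lanplus selects the IPMI v2.0 LAN interface; -E tells ipmitool to read
    the password from the IPMI_PASSWORD environment variable.
    """
    return ["ipmitool", "-I", "lanplus", "-H", bmc_host, "-U", user, "-E", *action]

# Placeholder BMC address and account, assumed purely for illustration.
BMC = "bmc.rack42.example.net"
USER = "admin"

# The remote equivalent of pressing the power button on a wedged server.
power_cycle = ipmi_command(BMC, USER, "chassis", "power", "cycle")
# Ask the server to boot from (virtual) optical media on its next start.
boot_from_cd = ipmi_command(BMC, USER, "chassis", "bootdev", "cdrom")

for cmd in (power_cycle, boot_from_cd):
    print(shlex.join(cmd))
```

Running either command for real would require network reachability to the BMC and valid credentials, which is exactly what logical access controls are supposed to gate.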

The only prerequisite for having all these capabilities at your fingertips from anywhere in the world is the foresight to have configured the system for remote access ahead of time. Logical access controls define which services are available remotely (eg SSH vs IPMI), who is allowed to connect, what hoops they jump through in order to authenticate (there is likely going to be a VPN or Virtual Private Network at the front door) and finally what privileges these individuals attain once authenticated. The company running that server gets to define these rules. They are completely independent of the physical access rules enforced by the datacenter, which may or may not even be operated by the same company. Those permitted to remotely access servers over a network could be a completely different set of individuals than those permitted to step inside the datacenter floor and walk up to that same server in real life.
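The independence of the two rule sets can be sketched in a few lines. Everything below is a made-up illustration (the class, the user names and the services are all hypothetical), not how any particular company models access:

```python
from dataclasses import dataclass, field

@dataclass
class LogicalAccessPolicy:
    """Hypothetical policy: which users may reach which remote service."""
    allowed: dict = field(default_factory=dict)  # service name -> set of users
    require_vpn: bool = True

    def may_connect(self, user, service, via_vpn):
        # The VPN is the front door; without it nothing else is evaluated.
        if self.require_vpn and not via_vpn:
            return False
        return user in self.allowed.get(service, set())

# Rules defined by the company running the server (names are invented).
policy = LogicalAccessPolicy(allowed={"ssh": {"alice", "bob"}, "ipmi": {"alice"}})

# Physical access list kept by the datacenter operator: a separate set of people.
datacenter_badge_holders = {"carol"}

print(policy.may_connect("alice", "ipmi", via_vpn=True))
print(policy.may_connect("bob", "ipmi", via_vpn=True))
print(policy.may_connect("alice", "ssh", via_vpn=False))
print(policy.may_connect("carol", "ssh", via_vpn=True))
```

Note that carol can badge into the building yet has no remote access at all, while alice can power-cycle the machine without ever setting foot in the facility.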

Attack surface of logical access

Logical access is almost as powerful as physical access when it comes to accessing data while having the convenience of working from anywhere in the world. In some cases it is even more convenient. Let’s revisit the example from the previous section, of walking into a datacenter and physically extracting a couple of disk drives from a server, with the intention of reading their contents. (We assume the visitor resorted to this smash-and-grab option because they did not have the necessary credentials to log in to the server and access the same data the easy way even while they were standing right next to it.) There are scenarios where that problem is not straightforward, such as when disk encryption is used or the volumes are part of a RAID array that must be reconstructed in a particular manner. Another challenge is retrieving transient data that is only available in memory, never persisted to disk. There are ways to do forensic memory acquisition from live systems, but the outcome is a snapshot that requires painstaking work to locate the proverbial needle in the haystack of a full memory dump. By comparison, if one could log in to the server as a privileged user, with a few commands the running application could be reconfigured to start logging the additional information somewhere for easy retrieval.

There is another reason logical access beats physical access: it’s easier to hide. Logical access operates independently of physical access: there is no record of anyone getting on an airplane, driving up to the datacenter gates, pressing their palm on the biometric scanner or appearing on surveillance video wandering the aisles. The only audit trails are those implemented by the software running on those servers, easily subject to tampering once the uninvited visitors have full control over the system.

Data-nativism as security theater

This distinction between physical and logical access explains why the emphasis on datacenter location is a distraction. Situating servers in one location or another may influence physical access patterns but has no bearing on the far more important dimension of logical access. Revisiting the Binance/US example from the introduction to illustrate this, there are three threat models depending on the relationship between the company and alleged threat actor.

  1. Dishonest, outright colluding with the adversary to siphon US customer data
  2. Honest but helpless in the face of coercion from a foreign government to hand-over customer data
  3. Honest but clueless, unaware that an APT associated with a foreign nation has breached its infrastructure to collect customer data in an unauthorized manner

In the first case it is clear that the location of data-centers is irrelevant. Binance/US employees collectively have all necessary physical and logical access to copy whatever customer information is requested and turn it over to the authorities.

The second case is identical from a capability standpoint. Binance/US employees are still in a position to retrieve customer data from any system under their control, regardless of its geographic location. The only difference is a legal norm that such requests be channeled through US authorities, under an existing Mutual Legal Assistance Treaty (MLAT) agreement. If China seeks information from a US company, the theory goes, it will route the request through the DOJ, which is responsible for applying appropriate safe-guards under the 4th amendment before forwarding the request to the eventual recipient. This is at best wishful thinking under the assumptions of our scenario: a rogue regime threatening private companies with retaliation if they do not comply with requests for access to customer information. Such threats are likely to bypass official diplomatic channels and be addressed to the target directly. (“It would be unfortunate if our regulators cracked down on your highly profitable cryptocurrency exchange.”) For-profit organizations on the receiving end of such threats will be disinclined to take a stand on principle or argue the nuances of due process. The relevant question is not whether data is hosted in a particular country of concern, but whether the company and/or its employees have significant ties to that country such that they could be coerced into releasing customer information through extra-judicial requests.

A direct attack on Binance infrastructure is one where geography would most likely come into play. Local jurisdiction certainly makes it easier to stage an all-out assault on a data-center and walk out with any desired piece of hardware. But as the preceding comparison of physical and logical access risks indicates, remote attacks using software exploits are a far more promising avenue of attack than kicking in the door. If the government of China wanted to seize information from Binance, it is extremely unlikely to involve a SWAT-style smash-and-grab raid. Such overt actions are impossible to conceal; data-center facilities are some of the most tightly controlled and carefully monitored locations on the planet. Even if the target is greatly motivated by PR concerns to conceal news of such raids, even limited knowledge of the incident breaks a cardinal rule of intelligence collection: not letting the adversary realize they are being surveilled. If nothing else, the company may think twice about placing additional infrastructure in the hostile country after the first raid. By comparison, pure digital attacks exploiting logical access can go undetected for a long time, even indefinitely depending on the relative level of sophistication between attacker and defender. With the victim none the wiser, compromised systems continue running unimpeded, providing attackers an uninterrupted stream of intelligence.

Physical to logical escalation: attacker & defender view

This is not to say that location is irrelevant. Putting servers into hostile territory can amplify risks involving logical access. One of the more disturbing allegations from the Snowden disclosures involves Google getting sold out by Level3, the ISP hired to provide network service to Google data-centers. Since Google at the time relied on a very naive model of internal security and traffic inside the perimeter was considered safe to transit without encryption, this would have given the NSA access to confidential information bouncing around the supposedly “trusted” internal network. Presumably a compliant ISP in China will be at least as willing to arrange for access to its customers’ private fiber connections as one located overseas. Other examples involve insider risks and more subtle acts of sabotage. For example the Soviet Union was able to hide listening devices within the structure of the US embassy in Moscow, not to mention backdoored typewriters sent for repair. Facilities located on foreign soil are more likely to have employees and contractors acting at the behest of local intelligence agencies. These agents need not even have any formal role that grants them access; recall the old adage that at 4AM the most privileged user on any computing system is the janitor.

One silver lining here is that risks involving pure physical access have become increasingly manageable with additional security technologies. Full-disk encryption means the janitor can walk away with a bundle of disk drives, but not read their contents. Encryption in transit means attackers tapping network connections will only observe ciphertext instead of the original data. Firmware controls such as secure boot and measured boot make it difficult to install rogue software undetected, while special-purpose hardware such as hardware security modules and TPMs prevent even authorized users from walking away with high-value cryptographic keys.

Confidential computing takes this model to its extreme conclusion. In this vision customers can run their applications on distant cloud service providers and process sensitive data, all the while being confident that the cloud provider cannot peek into that data or tamper with application logic, even when that application is running on servers owned by that provider with firmware and hypervisors again in the control of the same untrusted party. This was not possible using vanilla infrastructure providers such as AWS or Azure. Only the introduction of new CPU-level isolation models such as Intel SGX enclaves or AMD SEV virtual machines has made it possible to ask whether trust in the integrity of a server can be decoupled from physical access. Neither has achieved clear market dominance, but both approaches point towards a future where customers can locate servers anywhere in the world (including hostile countries where local authorities are actively seeking to compromise those devices) and still achieve some confidence that software running on those machines continues to follow expected security properties. Incidentally, this is a very challenging threat-model. It is no wonder both Intel and AMD have stumbled in their initial attempts. SGX has been riddled with vulnerabilities. (In a potential sign of retreat, Intel is now following in AMD’s path with an SEV competitor called Trust Domain Extensions or TDX.) Earlier iterations of SEV have not fared any better. Still it is worth remembering that Intel and AMD are trying to solve a far more challenging security problem than the one faced by companies who operate data-centers in hostile countries, as in the case of Apple and China. Apple is not hosting its services out of some AWS-style service managed by the CCP in a mysterious building. While a recent NYT investigation revealed Apple made accommodations for guanxi, the company retains extensive control over its operational environment. Hardware configured by Apple is located inside a facility operated by Apple, managed by employees hand-picked by Apple, working according to rules laid down by Apple, monitored 24/7 by security systems overseen by Apple. That’s a far cry from trying to ascertain whether a blackbox in a remote Amazon AWS datacenter you cannot see or touch, much less have any say in the initial configuration of, is working as promised.

Beyond geography

Regulators dictating where citizens’ private information must be stored and companies defending their privacy record by stressing where customer data is not stored share the same flawed logic. Equating geography with data security reflects a fundamental misunderstanding of the threat model, focusing on physical access while neglecting the far more complex problems raised by the possibility of remote logical access to the same information from anywhere in the world.


Flash loans and the democratization of market manipulation

Borrowed guns

Imagine an enterprising criminal out to rob a well-defended gold vault in the middle of nowhere. Unfortunately for his burgeoning career, he has neither a private army of mercenaries at his command nor any of the tactical gear required for the plan. Nor does our hypothetical crook at the beginning of a life of crime have the funds to acquire them yet. He could try buying those resources on credit, with the promise to pay the lender back with the proceeds from the successful heist. But most honest financial institutions have been getting gun-shy about lending to criminals and even the loan-sharks require some type of collateral, which, again, our man does not have.

Luckily the neighborhood aviation company is running a special: anyone can walk-in and rent an Apache AH-64 gunship for a very low price, no questions asked. But the offer comes with a few strings attached:

  • This bird is programmed to return to its original take-off point after an hour.
  • It can not refuel. You get exactly one tank of gas to work with.

Borrowers can take off and do whatever they want with the helicopter— including a detour to rob the gold vault— but must return to the designated landing area. If they run out of fuel and crash land in the middle of nowhere, they will have to walk away from the spoils and watch as the stolen loot is recovered by its rightful owners. The world reverts to its previous state, as if the heist never happened.

DeFi exploits in the wild

This is one perspective on the concept of flash loans in decentralized finance. Anyone can initiate an Ethereum transaction and borrow funds which must be paid back at the end of that transaction. There is no collateral or credit-check required because it is not possible for the borrower to default. The immutable logic of smart-contracts, enforced by the blockchain, guarantees this. If the loan is not paid back by the end of the transaction, the transaction “reverts”: it is still recorded on the blockchain and fees are paid to miners for their effort, but nothing has changed. But if all goes well and the loan is paid, the changes that occurred within the span of that transaction (money changing hands, someone making a killing, someone else losing their shirt) are committed to the blockchain. Those possibilities are only limited by the maximum gas that can be consumed in a transaction, the virtual equivalent of the AH-64 fuel tank.
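The all-or-nothing behavior is easier to see in a toy model. The sketch below is plain Python, not Solidity and not any real lending protocol: balances live in a dictionary, the function names are invented, the lender charges no fee, and "gas" is a flat charge of one unit. It only aims to show that a loan which is not repaid leaves no trace other than the fee.

```python
import copy

class Reverted(Exception):
    """Raised when a transaction must be rolled back."""

def flash_loan(balances, borrower, amount, use_funds):
    """Lend `amount` from the pool; revert unless the pool is whole at the end."""
    pool_before = balances["pool"]
    balances["pool"] -= amount
    balances[borrower] += amount
    use_funds(balances)                  # borrower does whatever it likes
    if balances["pool"] < pool_before:   # loan was not paid back
        raise Reverted("flash loan not repaid")

def run_transaction(balances, sender, body, gas_fee=1):
    """Run `body` atomically: state changes are discarded on revert,
    but the gas fee is charged either way."""
    snapshot = copy.deepcopy(balances)
    try:
        body(balances)
        ok = True
    except Reverted:
        balances.clear()
        balances.update(snapshot)
        ok = False
    balances[sender] -= gas_fee
    return ok

def repay(b):
    b["alice"] -= 1000
    b["pool"] += 1000

def keep(b):
    pass  # borrower tries to walk away with the money

balances = {"pool": 1000, "alice": 5}
ok_repaid = run_transaction(balances, "alice", lambda b: flash_loan(b, "alice", 1000, repay))
ok_kept = run_transaction(balances, "alice", lambda b: flash_loan(b, "alice", 1000, keep))
print(ok_repaid, ok_kept, balances)
```

In both runs the pool ends up exactly where it started; the only difference for the borrower is whether the intermediate steps were committed or thrown away.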

Not surprisingly, flash loans have been used for attacking DeFi exchanges and lending pools by manipulating the price signals those applications rely on. The attacks are complex and necessarily involve multiple DeFi contracts (exchanges such as Uniswap or lending pools such as Compound) and trading in/out of multiple assets. Here is a very simplified example of how such an attack can be executed:

  1. Flash-borrow a large amount of Ether
  2. Divide the ETH into two chunks of capital
  3. Convert the first chunk into token A, using a decentralized exchange. Now recall that DEXes do not have traditional order-books with bid/ask offers that can be matched when they cross. Instead they use automated market-makers (AMMs) which set the price based on the total amount of funds available on either side. More importantly, the liquidity available on these exchanges is often razor thin. It does not require a lot of capital to cause a massive change in price. The result of this large, single “buy” order to convert ETH → A is that the “price” of A goes way up on the decentralized exchange. In other words, there is massive slippage. This type of trade is normally a terrible idea: the buyer effectively overpaid for A when they could have gotten a much better deal if they traded on a centralized exchange. So how can an “attacker” make up lost ground if they are starting out with such a lousy trade?
  4. Convert the second chunk of ETH into asset A. The trick is using a different venue for this than #3. The goal is for this trade to execute at close to fair market price and avoid slippage.
  5. Time to visit the real victim, yet another DeFi application. There are specific criteria for selecting this target: it must be using the venue from step #3 as its price oracle. In other words, when the attacker tries to trade A or borrow using A as collateral on the target venue, that venue will rely on faulty price signals from #3 which have been artificially manipulated by the attacker. (Recall that everything is executing inside a single Ethereum transaction orchestrated by the attacker; no other trades that could interfere with this mispricing can occur.)
  6. This time the attacker has a favorable trade from A → B. The target venue is working with an overinflated price for A, because the ETH-for-A trade in step #3 artificially inflated the price of A relative to ETH. The market maker is willing to swap/lend an outsized quantity of some other token B in exchange for a small amount of A. This is the crucial step. While the attacker lost money on the first chunk and ended up with a deficit in asset A, they aim for a killing on the second chunk, ending up with a surplus of asset B relative to the value of A exchanged.
  7. Time to pay back the flash loan. The attacker converts enough of their holdings in A and B back to ETH to cover the original loan, again using a venue where price indications are not distorted. The proceeds are used to close out the flash loan and complete the transaction successfully.
  8. Assuming the profit from B exceeds the losses on A, the attacker comes out ahead. (What if the math did not work out? No harm, no foul. The transaction will revert. So the attacker does not stand to lose any money beyond the Ethereum gas fees paid for the attempt.)
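The eight steps above can be traced with toy numbers. The sketch below makes several assumptions that are not in the text: the thin DEX is modeled as a fee-less constant-product AMM (reserves multiply to a constant, as in Uniswap v2), the fair price of both A and B is exactly 1 ETH on a deep venue with no slippage, and the victim is a lending pool that values collateral at the thin DEX spot price, lends at 75% loan-to-value and holds 5,000 B. All figures are invented.

```python
class ConstantProductPool:
    """Toy AMM for ETH/A with invariant eth * tok = k (fees ignored)."""
    def __init__(self, eth, tok):
        self.eth, self.tok = eth, tok

    def spot_price(self):
        """ETH per token implied by the current reserves."""
        return self.eth / self.tok

    def buy_tokens(self, eth_in):
        """Swap ETH for tokens while keeping the product of reserves constant."""
        k = self.eth * self.tok
        new_eth = self.eth + eth_in
        new_tok = k / new_eth
        tok_out = self.tok - new_tok
        self.eth, self.tok = new_eth, new_tok
        return tok_out

FAIR_PRICE_A = 1.0          # ETH per A on a deep venue (assumed)
FAIR_PRICE_B = 1.0          # ETH per B on a deep venue (assumed)
LOAN = 3000.0               # step 1: flash-borrowed ETH
LTV = 0.75                  # victim lends 75% of collateral value (assumed)
victim_b_reserves = 5000.0  # B available in the victim pool (assumed)

thin_dex = ConstantProductPool(eth=1000.0, tok=1000.0)

chunk1, chunk2 = 2000.0, 1000.0                # step 2: split the capital
a_overpriced = thin_dex.buy_tokens(chunk1)     # step 3: distort the thin venue
oracle_price = thin_dex.spot_price()           # the signal the victim will read
a_fair = chunk2 / FAIR_PRICE_A                 # step 4: buy A at the fair price

# Steps 5-6: post fairly bought A as collateral, valued at the inflated price.
collateral_value = a_fair * oracle_price
b_borrowed = min(collateral_value * LTV / FAIR_PRICE_B, victim_b_reserves)

# Step 7: unwind on undistorted venues and repay; the collateral is abandoned.
eth_back = a_overpriced * FAIR_PRICE_A + b_borrowed * FAIR_PRICE_B
profit = eth_back - LOAN                       # step 8: negative would mean revert

print(f"oracle price of A: {oracle_price:.2f} ETH (fair: {FAIR_PRICE_A:.2f})")
print(f"A received for chunk 1: {a_overpriced:.1f}, B borrowed: {b_borrowed:.1f}")
print(f"profit after repaying the loan: {profit:.1f} ETH")
```

With these numbers the first chunk buys only about 667 A for 2,000 ETH, a clear loss, but it pushes the apparent price of A to roughly 9 ETH. That inflated quote is what lets 1,000 A of collateral drain the victim's entire B reserve.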

This is a highly simplified view; actual attacks can be far more complicated. Any asset can be flash-loaned, so the starting point need not be ether. However the loan has to be paid back in kind, so the attacker is still on the hook for returning the identical amount. The exchange process may involve multiple hops such as ETH → A → B → C → … → ETH before the cycle is completed. For more concrete examples, see this 2021 paper or breakdown of the recent attack on CREAM which involved dozens of steps within a single Ethereum transaction. That paper also poses the question of whether attacks in the wild were “optimal” in how they divided up the total amount borrowed into two chunks. The surprising answer is they were far from optimal: in each case, a different allocation between different assets A and B would have resulted in a more profitable heist. The crooks left money on the table. (Incidentally, you have to wonder about the ethics of academic research that doubles as a handbook on committing more optimal robberies and leaving less money behind in the virtual vault.)
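Why the split matters can be seen even in a toy setting. The sketch below is not the method from the paper; it is a brute-force search over a made-up model in which the thin venue is a fee-less constant-product pool of 1,000 ETH and 1,000 A, the fair price of A and B is 1 ETH, and the victim lends B at 75% of the manipulated collateral value up to a reserve of 5,000 B. Every parameter is an assumption chosen for illustration.

```python
def profit(chunk1, loan=3000.0, pool_eth=1000.0, pool_a=1000.0,
           ltv=0.75, victim_b=5000.0):
    """Attacker profit in ETH for a given split of the loan (toy model)."""
    chunk2 = loan - chunk1
    new_eth = pool_eth + chunk1
    new_a = pool_eth * pool_a / new_eth   # reserves after the distorting buy
    a_bought = pool_a - new_a             # A received for chunk1 (overpaid)
    oracle_price = new_eth / new_a        # manipulated ETH-per-A signal
    # chunk2 buys A at the fair price of 1 ETH and is posted as collateral.
    b_borrowed = min(chunk2 * oracle_price * ltv, victim_b)
    # Sell a_bought and b_borrowed at fair prices, then repay the loan.
    return a_bought + b_borrowed - loan

# Try every split in steps of 50 ETH, from "all fair" to "all distortion".
splits = [i * 50.0 for i in range(61)]
best = max(splits, key=profit)

print(f"naive 2000/1000 split: {profit(2000.0):.1f} ETH")
print(f"best split on the grid: chunk1 = {best:.0f} ETH, profit {profit(best):.1f} ETH")
```

In this model the naive two-to-one split is close to the best outcome but not equal to it: spending somewhat more on distortion and less on collateral still drains the victim's reserves while recovering more A from the first trade.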

Root causes

With this background on how flash-loans are leveraged in recent attacks, we can revisit the original question: were flash loans the root cause? The answer is clearly no. The weak connection between prices on decentralized exchanges and the “real” market prices elsewhere is the real culprit. By definition blockchains are isolated systems: they cannot interact with the outside world. A smart contract cannot sidle up to a Bloomberg terminal and request a fresh quote on current commodity prices. It must rely on indirect indications, such as trusted pricing oracles maintained by others on-chain or observed actions of participants interacting with the contract when trading an asset. Multiple DeFi exploits have demonstrated that these signals are surprisingly easy to manipulate given enough capital. When taken in isolation, each such instance of manipulation looks self-defeating: the “attacker” gets the price of an asset completely out-of-whack on one particular exchange but only by making a lousy trade. Whatever distortion is achieved will be short lived, as other investors take note of the mispricing and jump in to quickly arbitrage away the difference. Why would any rational actor engage in this meaningless gesture? Because other venues rely on the same distorted price signal and create profit opportunities far exceeding the loss on the original trade. This is an intrinsic structural weakness for some (but not all) decentralized applications with flawed pricing signals.

From this perspective, flash-loans did not enable a new class of attacks that were impossible before. The sequence of actions depicted in the previous section could have skipped the first step (flash borrowing) and started out with an existing pool of capital already in the hands of the perpetrator. Even the most extreme case, the recent CREAM attack, involved a $500MM flash loan. There are many hedge-funds and high net-worth individuals in possession of amounts in that neighborhood. Every one of them could have executed the exact same transaction without borrowing a single wei. Seen in this light, flash loans democratize the possibility of market manipulation.

This has parallels in an episode related by Michael Lewis in Flash Boys. Goldman Sachs argued that high-frequency trading source code allegedly stolen by its one-time employee Aleynikov could be used for “unfair market manipulation.” To which Lewis effectively retorted: If such code exists, is the real problem that Aleynikov had possession of it? Is market manipulation “fair” when the same algorithm is wielded by Goldman? To the extent DeFi applications are built on flawed pricing signals, they are vulnerable to manipulation. Whether the manipulation is done with institutional capital on-hand or aided by no-questions-asked flash-loans seems irrelevant.

Deterrence at the margins

One counter-argument is that reputable market participants with large concentrations of capital are unlikely to attack smart-contracts regardless of profit opportunity, for fear of legal and reputational risks. This is complicated by the ambiguity of what qualifies as an attack. It is not clear that what happened to CREAM and others is a traditional “hack” in any sense. There were no logic bugs in the contract. There was no compromise of a secret key held by the CREAM team. Other smart-contracts such as the DAO or the Parity multi-sig wallet suffered massive losses due to logic flaws in their implementation. In both of those cases, the smart-contract had a glaring programming error such that its behavior diverged from its intended behavior, however informally specified that may have been. Compare these two cases:

  • In the case of Parity, the expectation is that only the owner of the wallet can withdraw funds from their wallet. If everyone in the world can take money out of your wallet, there is no ambiguity: the contract has failed to implement the intended policy. Anyone taking advantage of that flaw is exploiting a security vulnerability and committing theft.
  • In the case of CREAM the contract worked exactly as intended, using precisely the price signals it was expected to consume. But the designers did not look far enough ahead to understand how their creation would behave in extreme circumstances when those signals become wildly distorted. If the casino designs a game such that clever players can inflict massive losses on the house while playing by the rules, is it an “attack” to implement that strategy?

If this is not a breach in the traditional sense, one could at least hope that it qualifies as market manipulation. (Standard disclaimer: the author is not a lawyer and none of this should be construed as legal advice.) At least that categorization could serve as a deterrent for participants interested in staying on the right side of the law. But it is unclear how existing statutes for trading securities or commodities apply in the context of blockchain assets. While this post liberally uses the term “market manipulation,” not every instance of buying up large quantities of something to profit from the artificial scarcity is necessarily criminal. Not every scalper hoarding Hamilton tickets for resale merits an SEC investigation. Even if the perpetrators of these attacks were identified and prosecuted— unlikely given the relative anonymity of blockchain transactions— they may well rest their defense on the claim that “manipulation” is impossible when dealing with a system that is defined by immutable rules implemented in code.

On the other extreme, if we posit that what happened to CREAM constitutes criminal activity that falls under SEC or CFTC jurisdiction, some troubling questions are raised about the venues providing the flash-loans. Is there liability? Did they aid and abet theft? Returning to the opening hypothetical about the helicopter available for anyone to borrow: if that craft turned up as the get-away vehicle for an actual robbery, surely the owners would have some explaining to do. Were they aware that this customer intended to commit criminal activity? Did they conduct any due diligence? Saying that the business has a policy of not asking any questions reeks of willful blindness. Virtually all flash-loans on Ethereum today follow this model: since the loan is guaranteed to be repaid, the lender does not have to care about the creditworthiness of the borrower. But that narrow focus on avoiding defaults misses the negative externalities created by (temporarily) arming random people with large amounts of capital to wreak havoc on other blockchain applications. Did Maker aid and abet criminal activity in providing the half billion dollars in capital used to drain CREAM? In the same way that Aave is contemplating the creation of permissioned lending pools subject to Know-Your-Customer rules, flash-loan providers may need to revisit their strategy around doing business with anyone.


Pre-theft attacks on Ethereum: stealing from the future

Among financial instruments cryptocurrency is unique in equating possession of funds to control over a secret cryptographic key. If you have the private key corresponding to a particular blockchain address, you have full control over funds at that address. In particular, you can sign a transaction to move those funds anywhere. This simple threat model helps the defenders prioritize their strategies and place great emphasis on key management: making sure your private keys do not fall into the wrong hands. This is where offline air-gapped “cold storage” designs, multi-signature or equivalent MPC techniques and specialized hardware security modules come into play, helping raise the bar against attacks.

But there is a more subtle aspect of blockchain design that can complicate “second-order threats” involving temporary access to private keys. It’s clear that an adversary need not have actual possession of private keys— in the sense of having the raw bits they can print on a piece of paper— in order to carry out a heist. If they can instruct a blackbox to sign a transaction sending funds to a new blockchain address controlled by the attacker, that will do just fine.

A parallel from the world of code signing comes from the 2012 Adobe breach. Most consumer operating systems including Windows implement a code-signing requirement for software vendors to digitally sign applications they publish. This is designed to help increase confidence in the authenticity of software and prevent malware from disguising itself as “Adobe Photoshop” for instance. Such keys must be carefully guarded and failure to do so has been leveraged in well-known attacks, most notably the joint US/Israel Stuxnet malware targeting the Iranian nuclear program which used stolen private-keys to digitally sign its components. In the case of the Adobe breach, the company had taken steps to secure its private keys by using a hardware security module. This prevented attackers from being able to walk away with a copy of those keys. Not surprisingly, it did not stop them from signing a few pieces of malicious code during the time they had access to the HSM.

Here is a more routine scenario involving access control. Consider Bob, an employee in good standing at a company that stores cryptocurrency. Because his role includes wallet management, Bob has access to the key management system to generate transactions. (Additional controls may exist to limit what transactions can be signed, but this makes no difference to the risk under consideration here.) Suppose Bob and his employer part ways in a less-than-friendly manner. It is clear that while he was employed, he could have signed and broadcast any permissible transaction. Let’s posit that his access has been properly revoked and he can no longer access signing infrastructure. Let’s also grant that all private-keys were stored on HSMs in non-extractable fashion. Is there any reason to fear retaliation from Bob?

If Bob planned ahead, he could have signed some transactions and put them aside for future broadcast. Whether those transactions remain valid indefinitely depends on the specific blockchain protocol. In the case of Bitcoin, the answer is yes, unless some other transaction is first broadcast to spend the same unspent transaction output or “UTXO.” In fact Bitcoin can only set time limits in one direction: it is possible to time-lock funds such that a transaction is not valid until a certain time or block-height. It is not possible (yet, barring a hard fork) to create a transaction that is valid today but stops being valid at a future date, short of having a conflicting transaction that double-spends the inputs. This provides a straightforward, if expensive, way of mitigating risk from unknown signatures floating around: preemptively spend all existing UTXO associated with keys the ex-employee had access to. These can even be “loopback” transactions, sending funds to the same address without generating new keys, as long as the transactions are unpredictable.**

In the case of Ethereum and specifically externally-owned accounts (EOA) the situation is more tricky. It turns out disgruntled employees can steal future funds that do not even exist at the time they are employed.

Ethereum signing recap

Ethereum requires a signed message to be broadcast in order to authorize a funds transfer or invoke a function on a contract. This message has several fields encoded using a scheme called “RLP.” For our purposes the interesting ones are:

  • Destination address
  • Amount being transferred
  • Value of the current nonce associated with the source address

Compared to Bitcoin transactions, this information is only loosely bound to current blockchain state. Creating a successful bitcoin transaction requires knowing the SHA256 hash of an existing UTXO on chain, which is a function of past state including all previous inputs that feed into that UTXO. For ethereum, only some vague knowledge about the state of the world is necessary. Walking through the three fields above:

  • Destination address is arbitrary and completely attacker-controlled.
  • The only constraint on amount is that it must not exceed the total stored at that address. (Unlike bitcoin, where a transaction consumes the entire UTXO it spends, any amount not included in the transfer simply stays at that address.)
  • Nonce is a counter that starts at zero and increments by one for every transaction originating from that address. The nonce included in the transaction must be exactly equal to current nonce on blockchain.

Pre-theft: stealing nonexistent funds

Note that the only reference to blockchain state is the nonce. There is no need to know the exact balance on that address, much less the sequence of previous transactions resulting in that total. That property makes it possible to steal funds that do not even exist on-chain yet, given only temporary access to the signing interface.

Returning to our hypothetical disgruntled employee Bob: suppose Bob knows that some Ethereum address will receive deposits in the future, even though its current balance is exactly zero. Bob can use his temporary access to sign a transaction for a future predicted value of the nonce and amount. For example, he can bet that by the time the counter reaches 100, there will be at least 5 ETH balance in this account and create a corresponding transaction to funnel that amount to a personal address. Now all he has to do is wait until the counter reaches 100 and broadcast the previously signed transaction. As long as the balance is at least 5 ETH, the transaction will move that amount into Bob’s possession.
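The acceptance rule Bob is exploiting can be sketched in a few lines of Python. This is a toy model, not Ethereum client code: the `Account`, `SignedTx` and `try_apply` names are made up for illustration, amounts are whole ETH, and signature checks and gas fees are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Account:
    nonce: int = 0    # number of transactions sent from this address so far
    balance: int = 0  # whole ETH, to keep the arithmetic readable


@dataclass(frozen=True)
class SignedTx:
    nonce: int
    amount: int
    destination: str


def try_apply(account: Account, tx: SignedTx) -> bool:
    """Apply tx only if its nonce matches and the balance covers the amount."""
    if tx.nonce != account.nonce or tx.amount > account.balance:
        return False
    account.balance -= tx.amount
    account.nonce += 1
    return True


# Signed while Bob still had access; the address is empty at this point.
presigned = SignedTx(nonce=100, amount=5, destination="bob")

victim = Account(nonce=0, balance=0)
too_early = try_apply(victim, presigned)  # rejected: wrong nonce, no funds

# Much later: 100 legitimate transactions have gone out, deposits have come in.
victim.nonce, victim.balance = 100, 7
stolen = try_apply(victim, presigned)     # accepted: nothing ties it to the past
```

Nothing in the signed message refers to the balance or history at signing time, which is why the same bytes go from invalid to valid as the account evolves.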

Optimizing the heist

What if Bob guessed wrong and there is only 3 ETH? Not to worry: he can supply 2 ETH from his own funds, add that to the original pool and then withdraw using the existing transaction. This is a bizarre pattern as far as criminal activities go: the thief must make a donation to their victim before committing robbery.

Side note to Bob: you would want to execute these two steps as close as possible in time. Otherwise there is a risk that the 2 ETH “donation” gets processed but the counter increments past 100 due to intervening transactions, causing Bob to miss the attack window. Such outcomes are difficult to guarantee because the second step can not be implemented as a smart-contract invocation. (If that were possible, Bob could write a custom contract that attempts to execute both atomically and revert in case the withdrawal fails.) Only a miner colluding with Bob can guarantee that donation and theft transactions are executed back-to-back and before any other transaction involving the same address that may disrupt the nonce value.

In fact there is no reason for Bob to limit himself to just one signed transaction. To cover his bases, he can sign multiple TX for the same amount at different counter values in a given interval, for example spanning 100-110. This avoids any race conditions from Bob’s transaction being preempted by another in-flight transaction for the same counter value, originating with the authorized party.

Multiple signatures solve another nagging problem for Bob: leaving money on the table. Recall that Bob must guess at a particular amount to steal. If he guesses on the high-side, he faces the problem of having to supply some funds first. What if he guesses too low? Imagine if the balance was 50 ETH instead of 5 ETH. The transaction Bob prepared will only walk away with 10% of the total take possible compared to the optimal heist.

Bob can improve his odds by preparing a series of transactions with different nonce values and different amounts. Consider this sequence:

<100, 1 ETH>
<101, 2 ETH>
<102, 4 ETH>
…
<109, 512 ETH>

Assuming the final balance is 50 ETH, he will broadcast the first five transactions, netting a total of 31 ETH. The sixth one can not be broadcast as is, because at that point the remaining balance on the account is 19 ETH, which is lower than the 32 ETH withdrawal attempt. (Bob can use some of the proceeds from the initial batch to “loan” more funds into the victim address such that the total reaches at least 32 ETH and only then broadcast the sixth transaction.) Even without risking the race-condition associated with lend-and-steal, this sequence is guaranteed to capture roughly half or more of available funds: broadcasting stops only when the next amount exceeds what remains, and that next amount is just one unit more than everything already taken. In fact there is nothing magical about the factor of two appearing in the sequence above. At the cost of requiring additional signatures, one could prepare a series of transactions where the amounts increase by some other constant factor F > 1, with the guarantee that roughly 1/F or more of total value sitting on that address can be captured directly.
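The arithmetic for the 50 ETH case can be checked with a short simulation. This is only a sketch: the function names are invented here, amounts are whole ETH, gas is ignored, and it assumes no other transaction disturbs the nonce while Bob is broadcasting.

```python
def presigned_amounts(start_nonce: int, count: int, factor: int = 2):
    """Return (nonce, amount) pairs with amounts 1, F, F^2, ... in ETH."""
    return [(start_nonce + i, factor ** i) for i in range(count)]


def broadcast_in_order(balance: int, txs):
    """Broadcast in nonce order, stopping at the first amount the balance cannot cover."""
    captured = 0
    for _nonce, amount in txs:
        if amount > balance:
            break
        balance -= amount
        captured += amount
    return captured, balance


txs = presigned_amounts(start_nonce=100, count=10)
captured, remaining = broadcast_in_order(50, txs)
print(captured, remaining)  # 31 19
```

Passing a different `factor` gives a quick way to compare how much is left behind for a given balance against how many signatures Bob needs to collect in advance.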

Defender perspective: mitigations

Proving the existence of a signed transaction is relatively easy: broadcast it. (In fact there are zero-knowledge techniques from cryptography for convincing a verifier that you know such a signature without disclosing it.) But proving the non-existence of such a transaction is tricky. How can a custodian be confident that someone with access to the private-key in the past did not sign pre-theft transactions? There are two sound approaches:

1. Throw in the towel, deprecate the address and start over by transferring the entire balance over to a newly generated key-pair. This is straightforward but highly disruptive in having to update all existing references to the previous blockchain address.
2. Use a smart-contract. While externally-owned Ethereum addresses have hard-coded logic for authorizing funds movement, a contract is free to make up its own rules. Instead of using an incrementing nonce which is highly predictable, the contract logic can dictate that an uncontrollable value such as the block-hash must be incorporated into the signed message. While the block-hash is deterministic in one sense— it is computed as a function of all transactions included in the block— an attacker has no way to control it indefinitely into the future short of controlling 100% of hash rate.

What about audit trails? In theory if the signing system has a perfect logging mechanism that dutifully records every use of the key including the message signed, one can be confident there are no other, unknown transactions floating around. In reality it is difficult to achieve this level of assurance. Even standard cryptographic hardware does not help. Typical HSM logs can reveal when someone performed cryptographic operations but not necessarily which key is involved, much less the exact message they signed. Some vendor extensions to the PKCS#11 standard include counters that increment each time a key is used. (Safenet HSMs implement this in firmware 7.0 and higher versions.) One can reconcile that counter value against an independent record of all transactions ever submitted for signing. This approach can flag discrepancies, but not necessarily resolve them conclusively. Suppose the HSM counter shows a key was used 10 times but only 9 signed transactions are known to exist. There could be an innocent explanation: some transaction among the nine got signed twice, due to a transient error that was silently resolved by retrying with the same message. Or it could be evidence of pre-theft attack, where someone snuck in a tenth transaction outside the known set with intent to broadcast it in the future.

The missing feature is a more robust, tamper-resistant audit trail maintained internally by the HSM that incorporates hashes being signed. This need not be an append-only log in the traditional sense. For example the same logic used for “extending” PCRs in a TPM can be used to maintain a concise, constant size running tally of all messages ever signed with a given private key.
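A PCR-style running tally is simple to sketch. The version below is an illustration rather than any vendor's feature: it assumes SHA-256, an all-zero starting value, and hypothetical `extend` and `replay` helpers, with the real tally living inside the HSM where the operator can read but not reset it.

```python
import hashlib


def extend(tally: bytes, message_hash: bytes) -> bytes:
    """New tally = SHA-256(old tally || hash of the message just signed)."""
    return hashlib.sha256(tally + message_hash).digest()


def replay(message_hashes) -> bytes:
    """Recompute the tally an auditor expects from the known signing record."""
    tally = bytes(32)  # assumed all-zero initial value
    for h in message_hashes:
        tally = extend(tally, h)
    return tally


known = [hashlib.sha256(m).digest() for m in (b"tx-1", b"tx-2", b"tx-3")]

# What the HSM would hold if one extra, unrecorded message had been signed.
hsm_tally = replay(known + [hashlib.sha256(b"unlogged tx").digest()])

mismatch = replay(known) != hsm_tally
```

The tally stays at 32 bytes no matter how many signatures are produced, and unlike a bare usage counter it would also distinguish a harmless retry of a known message from a signature over something new, since the auditor can try replaying the record with each known hash repeated.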


** Segregated witness complicates this somewhat because the signatures are no longer included in the transaction hash. That removes one of the main sources of unpredictability from the UTXO, leaving only the amounts and mining fees.

Tricky accounting: cryptocurrency mining & energy use

Pinning down the true energy cost of mining

The staggering energy consumption and carbon emissions from Bitcoin mining finally graduated from Twitter punditry to the national political stage when Senator Warren weighed in with her opinion. Given the amount of ink spilled on this subject, there are plenty of eloquent arguments on both sides. But there are also two common, flawed arguments that are frequently repeated, and it is these that we take up here.

Per-transaction arithmetic

The first flawed argument seeks to “prove” the inefficiency of cryptocurrencies by attempting to arrive at per-transaction costs with simple arithmetic. Take the total estimate for yearly energy consumption or implied carbon-emissions (based on reasonable estimates of the energy generation mix; these figures are not controversial) and divide it by the number of transactions that have occurred on the Bitcoin blockchain during that time frame. This simple allocation of cost results in highly dramatic and quotable comparisons such as “the energy used for a single bitcoin transaction could power an average house for a month.”

Fail to scale?

Before discussing the problem with this line of reasoning, it is worth also pointing out where it is correct. The calculations do not reflect a temporary inefficiency due to under-utilization. Mining a block requires about the same energy regardless of how many transactions are included. In the worst case scenario a block can have just one lonely transaction: the so-called “coin-base” transaction that is always present and sends the newly minted block rewards to the winning miner. If one were to measure per-transaction costs for such a block, the wasted energy would be even more dramatic by three orders of magnitude. This is similar to the fuel-efficiency of a commercial jetliner: an airplane flying only its pilots with no passengers on-board still consumes almost as much fuel as if it were flying laden with passengers and cargo. Blocks were already full before the segregated witness change indirectly increased capacity. Even if additional changes double or triple the number of transactions that can be processed in a block, it will barely make a dent in the problem if the goal is viewed as reducing the energy of an individual transaction to levels comparable to credit-card networks. Packing twice as many people into a jetliner will not make it as efficient as a car for a short trip. (Layer 2 scaling solutions that aggregate a large number of off-chain payments into a single on-chain transaction could however result in more drastic gains.)

Incomplete attribution

The fundamental error in the per-transaction critique of bitcoin energy consumption is neglecting the other use-cases for a monetary system. To recap, money serves as:

  1. Unit of measure eg for pricing assets
  2. Method of exchange— in other words, making payments
  3. Store of value

It is that final purpose that is being neglected when the utility of bitcoin is only measured in terms of payments. In fact, it is clear that most cryptocurrencies score atrociously on the first two use-cases. Denominating prices in a highly volatile asset results in taking on exchange risks; no wonder most merchants who claim to accept bitcoin are in fact doing so through a payment processor who immediately converts the incoming funds into fiat currency and credits the merchant in dollars. Ubiquitous, peer-to-peer payments may have been an early source of excitement around bitcoin, with utopian visions of disintermediating the Visa/MC/AmEx oligopoly or helping unbanked residents in developing countries get access to the modern economy with nothing more than a mobile wallet app required. That vision has yet to pan out. With the exception of underground markets, fiat currency remains the preferred method of payment despite all of its perceived shortcomings. That leaves the final scenario as the one cryptocurrency shines at: digital gold, an inflation hedge against the money-printer going out of control, or according to its detractors, a speculative asset class built around the greater-fool theory of asset valuation.

Accordingly the energy spent on mining can not be exclusively allocated to actual transactions, regardless of how many or few are occurring, or what fraction of those represent meaningful economic exchanges as opposed to shuffling funds around to erase their criminal provenance. A better question is whether the energy consumption and associated CO2 emissions are worth sustaining a new asset class whose market capitalization stood at over a trillion dollars at its peak. In this regard, bitcoin is more similar to a commodity such as gold or even a public company along the lines of Apple or Exxon-Mobil whose shares can be purchased for investment purposes. Each of these asset classes can serve as a store of value. Critics may object that Apple and Exxon actually provide “useful” services in addition to having shares you can invest in as a store of value. Yet the alleged utility of those services is in the eye of the beholder. Just as some question whether censorship resistant, peer-to-peer payments are useful outside the context of criminal activity, one could argue the “product” Exxon-Mobil manufactures is in fact a net negative for society. Whether the investment value XOM provides its current shareholders is worth the cost of emissions directly and indirectly attributable to its production activities is equally debatable.

Mining and scarcity

With the problem reframed as storing value instead of payments, bitcoin defenders have gone on the offensive by comparing its CO2 emissions to that of gold-mining. By one estimate, bitcoin mining uses 50% more energy than gold mining while producing about half the emissions due to greater share of renewables in the generation mix. Case closed? Not exactly, for several reasons.

  1. Gold has a market cap 10-20x that of bitcoin, with the wide-range owing to the volatility of bitcoin during the timeframes one may care to sample. For bitcoin to claim parity in carbon-efficiency as store of value, it would have to be not twice but at least 10 times as efficient.
  2. Gold mining much like other industrial processes becomes more efficient over time as improvements in technology allow the same amount of mining and processing to be carried out using fewer inputs, including energy. Bitcoin mining faces a similar competitive pressure for efficiency— every miner wants to maximize the number of tickets to the proof-of-work lottery they can purchase every second using one watt of energy. Those same dynamics do not necessarily apply to total energy consumption. If a miner is profitable at current energy costs and bitcoin prices, when the price of bitcoin doubles it will be still profitable using twice as much energy to continue mining. Granted gold mining has similar incentives in that if prices double, there will be an incentive to throw more inputs into the search for gold. But cryptocurrency prices have appreciated much faster than gold. Even by mildly optimistic projections, another 3-5x appreciation is expected. More importantly, the production of commodities is not controlled by a simple calculus linking energy inputs to profit. Doubling the hash-rate of a cryptocurrency mining pool doubles expected block rewards, plain and simple. Digging twice as many wells does not result in doubling oil-reserves, and neither does using twice as much cyanide to process gold ore yield twice the amount of gold.
  3. The final flaw in the comparison against gold mining is the flip-side of the per-transaction accounting. Cryptocurrency advocates frequently emphasize that mining is there to secure the network, to protect the value of existing cryptocurrency against 51% attacks, censorship and other legerdemain that could result from a single entity taking over a majority of hash-power. But the unspoken corollary of that assertion is that mining can not stop or decrease substantially without undermining those assets. That is in sharp contrast to commodities. If gold mining activity stopped overnight or De Beers announced no more diamonds are left to dig out of the ground, gold and diamonds would still be highly precious. (Arguably they would become even more valuable due to the scarcity implied by that news.) For Bitcoin to hold its value against inflation, mining must continue as a forever-war of pools consuming higher amounts of energy input to feed increasingly more efficient mining rigs to eke out a tiny advantage against competitors.


Designing a duress PIN: covert channels for SSH (part V)

[continued from part IV]

Covert channels with ECDSA

ECDSA signatures are probabilistic, with a random nonce point chosen by the signer comprising half the signature. This potential for covert channels was known early on in the context of plain DSA over the integers, without the “EC” part— later elliptic curve adaptation of the scheme did not materially affect the existence of covert channels.

The core idea is to repeatedly try different nonces until the final signature satisfies some property. For example, suppose the goal is to convey the bit string “1011.” The signer chooses different random nonces and computes the corresponding half of the ECDSA signature. Next an HMAC is run on that result with a symmetric secret shared with the verifier. If HMAC outputs a result ending with the bit pattern “1011,” the signature can be released. Otherwise a new nonce is selected and the search continues. The verifier can extract the same bit pattern by repeating the HMAC calculation on the first half of the received signature.
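The search loop can be sketched without any elliptic-curve code by standing in random bytes for the first half of the signature. That substitution is an assumption made purely for illustration: a real signer would derive r from a fresh nonce on each attempt, and `extract_bits` and `grind` are hypothetical helper names.

```python
import hashlib
import hmac
import secrets


def extract_bits(shared_key: bytes, r: bytes, nbits: int) -> int:
    """Low nbits of HMAC-SHA256(shared_key, r), returned as an integer."""
    digest = hmac.new(shared_key, r, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") & ((1 << nbits) - 1)


def grind(shared_key: bytes, target: int, nbits: int):
    """Draw fresh stand-in r values until the HMAC output ends in the target bits."""
    tries = 0
    while True:
        tries += 1
        r = secrets.token_bytes(32)  # stand-in for the r half of a real signature
        if extract_bits(shared_key, r, nbits) == target:
            return r, tries


key = secrets.token_bytes(32)          # secret shared by card and verifier
r, tries = grind(key, 0b1011, nbits=4)

# The verifier repeats the same HMAC on the r it received.
recovered = extract_bits(key, r, nbits=4)
```

Each attempt succeeds with probability 1/16 for a four-bit message, so the loop runs 16 times on average, with each iteration costing a full nonce computation on a real card.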

Compared to PSS this trial-and-error approach is very inefficient. It does not operate in constant time. Instead we check random nonces until a predicate is true, with the probability decreasing exponentially in the amount of information being conveyed. Even signaling a single bit of information (was the duress PIN invoked?) will require 2 tries on average. That means signature times have effectively doubled on average and could get a lot worse if there is an unlucky streak of nonces failing our predicate. (Recall that the most expensive part of an ECDSA computation is the point-multiplication of random nonce with the generator point of the curve. So we are repeating the one step that accounts for the majority of CPU cycles.) One approach is to avoid starting from scratch with a new nonce, and instead build incrementally on the previous result. For example we can repeatedly multiply the current point by 2 or add the generator point until the predicate reports true. Such incremental changes are much cheaper than doing an entire multiplication from scratch. On the other hand, these short-cuts reduce the entropy of the nonce which is critical for the security of ECDSA. Even small information leaks about a nonce aggregated over many signatures can be leveraged for recovering the private key.

There is another way to convey information with ECDSA signatures owing to their malleability property. Specifically if <r, s> is a valid ECDSA signature on a given message, so is <r, -s> where the “negative” value is taken modulo curve order. This looks promising as special-case communication channel for exactly 1 bit: output either <r, +s> or <r, -s> depending on the least-significant bit of HMAC output and the true/false value we intend to convey.
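One way to make the “sign” of s concrete is to treat values above half the curve order as the negative ones. That convention is an assumption of this sketch, not something the scheme dictates, and the r and s values used here are arbitrary integers rather than a real signature. N is the order of the NIST P-256 group.

```python
import hashlib
import hmac

# Order of the NIST P-256 group.
N = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551


def is_high(s: int) -> bool:
    """Assumed convention: s above N/2 counts as the 'negative' form."""
    return s > N // 2


def mask_bit(shared_key: bytes, r: int) -> int:
    """Least-significant bit of HMAC-SHA256 over r, keyed with the shared secret."""
    digest = hmac.new(shared_key, r.to_bytes(32, "big"), hashlib.sha256).digest()
    return digest[-1] & 1


def encode(shared_key: bytes, r: int, s: int, signal: bool):
    """Return <r, s> or <r, N - s> so that the form of s carries the masked signal."""
    want_high = bool(mask_bit(shared_key, r) ^ int(signal))
    return (r, s) if is_high(s) == want_high else (r, N - s)


def decode(shared_key: bytes, r: int, s: int) -> bool:
    return bool(mask_bit(shared_key, r) ^ int(is_high(s)))


key = b"shared secret between card and server"
r0, s0 = 0x1234567890ABCDEF, 0x0FEDCBA987654321  # arbitrary placeholder values

sig_true = encode(key, r0, s0, True)
sig_false = encode(key, r0, s0, False)
```

Because N is odd, exactly one of s and N - s falls above N/2, so the two encodings always differ and always decode cleanly.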

Minor problem: an adversary can easily disrupt this channel. After the card releases a signature, the adversary is free to tamper with the second half without invalidating it. This makes the channel unreliable. Assuming a perfect implementation without side-channel leaks, the adversary will have no way to know for certain whether a duress PIN has been used. But if they suspect so, they can tweak the signature and send it with the opposite sign to disrupt the signal. (Of course, if the card-holder had supplied their true PIN, the adversary will have raised the alarm on themselves by manipulating it.) No such games are possible with PSS: any modification to the signature output from the card will invalidate it. An adversary can always ask the card for another signature on the same message,  but that does not help. As long as the duress PIN is being used, the card will continue to output more valid signatures tainted in exactly the same undetectable manner.

Determinism is in the eye of the beholder

The final type of key supported for SSH— EdDSA— makes for an interesting case. In principle EdDSA signatures are deterministic: signing the same message multiple times outputs the same signature. While there is still a unique nonce for each operation, this nonce is derived as a function of the message, guaranteeing determinism and reproducibility of results. Unlike ECDSA there is no freedom to leak information by playing games with the choice of random nonce.

The catch is that choice of nonce still looks random to external observers. They have no way to determine whether a blackbox signer— namely, the applet running on a smart-card— followed prescribed rules for computing the nonce or diverged from the protocol. (In fact such external verifiability is fundamentally incompatible with the security of EdDSA: if a verifier could predict what the nonce should be for a given message, they can recover the private key.) That creates some leeway for signaling a duress PIN. When a regular PIN is used, the applet follows the exact letter of EdDSA specification. By contrast when a duress PIN is used, a different deterministic scheme is invoked. “Deterministic” being the operational keyword; otherwise the adversary can trivially detect that something is amiss by asking the card to sign same message multiple times and observing different signatures. For that matter, if the adversary has ever witnessed an EdDSA signature on any message produced with the real PIN, they can detect duress PIN usage by asking for another signature on the same message and checking if results are identical.

It remains an open question how such a scheme can operate without side-channels (constant time and identical execution traces, regardless of which PIN is used) and without disclosing the private key. If we remove the latter requirement, there is a trivial solution. EdDSA uses a secret seed for deriving nonces from the message. Suppose the card application maintains two seeds, one private and one shared with the remote server. Ordinary PIN entry results in generation of nonces using the first one, while duress PIN entry switches to the latter. Since the server has a copy of the second seed, it can determine for any given signature which path was taken; the chances of a collision are negligible. A serious disadvantage to this scheme is that invoking the duress PIN also discloses the private-key to the remote server. Recall that knowledge of the nonce used for a signature allows key recovery. As such it is only feasible for closed ecosystems where the disclosure of private-key has no adverse consequences beyond that one remote system.


Designing a duress PIN: covert channels with RSA (part IV)

[continued from part III]

Covert channels for public-key signatures

For reasons described earlier, it is difficult to hide the existence of a private-key on a card— because the associated public-key is often retrievable without any authentication.  For example both the PIV and GIDS standards allow retrieving certificates from the card without supplying a PIN. By convention when a certificate exists for a given slot, say the 9C slot designated for “signature key” in PIV, the card contains the corresponding private key. Similarly public-key encryption formats such as CMS and GPG contain hints about the identity of the public key that a given message was encrypted to. This rules out the earlier approach used for symmetric keys, namely creating plausible deniability about the very existence of a specific key on the card.

If we focus on digital signatures and lower the bar to allow online verification, indistinguishability for duress PIN can be restored. In this model the card is allowed to output a correct result— namely, a valid digital signature computed with the private key on board. We punt responsibility for detecting use of duress PIN to a remote system responsible for verifying that output. This clearly does not work for decryption since the adversary would not need any assistance from a remote peer to make use of the output. Nor would it work for scenarios where signature verification is performed by parties outside the control of the user. That includes blockchains: bitcoin miners are happy to include any transaction with a valid signature that meets consensus rules in the next block without further inspection. Instead we need to look at closed ecosystems where the signatures are only intended for a system that is closely affiliated with the cardholder and working in conjunction to detect duress signals.

Somewhat realistic scenarios exist for enterprise authentication. Imagine a company with a VPN for remote access, a website that implements TLS client authentication or Linux servers accessed using SSH. For all three scenarios authentication is ideally implemented using public-key cryptography with private keys stored on cryptographic hardware such as a smart-card or USB token. The common denominator for these use-cases is the card signing a challenge that is created during protocol execution and this signature being verified by the server to confirm that the person on the other side is in possession of the correct private key. Depending on exactly which signature algorithms are used, a duress PIN can be implemented by piggy-backing on subliminal channels.

Subliminal channels are a type of covert channel present in some digital signature algorithms, allowing the signer to convey additional information in the signature. Broadly speaking this is possible when the signature scheme is randomized: there is more than one valid signature for a given message. While the theoretical constructions assume that the signer will randomly pick one of those with uniform probability, a crafty signer in cahoots with a verifier can do something more subtle: instead of choosing randomly, use the freedom to choose for signaling additional information to the verifier.


This is best exemplified with RSA-PSS where PSS stands for “probabilistic signature scheme.” (PSS can be thought of as the counterpart of OAEP which is a probabilistic padding scheme for RSA encryption.) PSS signing starts out with a choice of a random salt, which is used to generate a mask that is combined with the message hash using a series of concatenation and xor operations. The important point is that this salt is fully recoverable by the verifier. That means it is trivial to use the salt to convey additional information. In our case we only need to get 1 bit of information across, namely the answer to a true/false question: did the card-holder enter a duress PIN?

Care must be taken in how that information is encoded. Recall that anyone in possession of the public-key can verify the signature and therefore recover the salt. The adversary is also assumed to have access to that public-key; otherwise they could not check when the card is outputting bogus results, and we would have no need for this level of stealth. A simple encoding scheme such as setting the last bit of the salt to 0 or 1 will not fly; adversary can read that information too.

Indeed no scheme that can be publicly verified is safe, no matter how complicated. Suppose we decide to obfuscate matters by encoding the boolean value in the hash of the salt. Choose a salt such that its SHA256 hash ends in 0 bit to convey “false” (as in, correct PIN entered) and “true” otherwise (duress PIN used.) The flaw in this design is relying on security-through-obscurity. If the adversary knows the covert channel, they can also run the same SHA256 computation and learn the result.

Creating a more robust scheme calls for a shared secret negotiated between the card and the remote server ahead of time. Given that secret, we can compute one bit as the output of some pseudo-random function of the salt. For example:

  • Start with a randomly generated salt
  • Run all but one bit of that salt through HMAC-SHA256 with the shared key
  • Take the least significant bit of HMAC output
  • Depending on whether we want to convey 0 or 1, either use that bit verbatim or flip it to determine the final salt bit

The server can repeat the same HMAC computation on the other side to infer whether the signer conveyed 0 or 1.
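
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not card code: the function names, the 32-byte salt length and the choice to feed HMAC the salt with its final bit cleared (one way of running "all but one bit" through the function) are all hypothetical details not taken from any standard.

```python
import hashlib
import hmac
import os

SALT_LEN = 32  # assumed salt length in bytes (hypothetical choice)


def _prf_bit(shared_key: bytes, salt: bytes) -> int:
    """Keyed one-bit function of the salt that ignores the salt's final bit."""
    masked = salt[:-1] + bytes([salt[-1] & 0xFE])  # clear the final bit
    return hmac.new(shared_key, masked, hashlib.sha256).digest()[-1] & 1


def encode_salt(shared_key: bytes, signal: int) -> bytes:
    """Card side: fresh random salt whose final bit carries `signal` (0 or 1)."""
    salt = os.urandom(SALT_LEN)
    # Use the HMAC bit verbatim for signal 0, flipped for signal 1.
    final_bit = _prf_bit(shared_key, salt) ^ (signal & 1)
    return salt[:-1] + bytes([(salt[-1] & 0xFE) | final_bit])


def decode_salt(shared_key: bytes, salt: bytes) -> int:
    """Server side: recover the signal from a salt extracted from a signature."""
    return _prf_bit(shared_key, salt) ^ (salt[-1] & 1)
```

Without the shared key, the final bit of the salt should look like any other random bit, which is the property the unkeyed SHA256 variant lacked.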

Some care is necessary to implement this without side-channel leaks from the card applet. In particular, one would need a trick similar to the earlier design: a collection of multiple PIN slots, each with an associated 0/1 signal bit based on whether that PIN corresponds to a duress scenario. Salt generation always proceeds the same way, using the HMAC scheme described above and the shared key, which is identical for all slots. The only difference is that the last bit is xored with a value of 0 or 1 drawn from the specific slot that validated against the supplied PIN. As before, the PIN is checked against all slots in a different, randomly chosen order each time.

Stepping back to review our assumptions: how realistic is it to find RSA-PSS or some other probabilistic signature scheme in existing real-world protocols? After all it is much easier to tweak existing server-side logic to run additional checks on an otherwise valid signature than to deploy a whole new protocol from scratch. Using the original three scenarios as a benchmark, we are batting two out of three, with some caveats.

  • TLS: PSS was not supported in TLS1.2 but the latest version of the protocol as of this writing includes RSA-PSS variants in the list of recognized signature schemes. While the signature scheme selected is subject to negotiation between client and server based on comparing their respective lists, TLS 1.3 has an unambiguous preference for PSS:

    RSA signatures MUST use an RSASSA-PSS algorithm, regardless of whether RSASSA-PKCS1-v1_5 algorithms appear in “signature_algorithms”

Bottom line: provided the smart-card natively implements RSA-PSS and the associated middleware delegates padding to the card (as opposed to selecting its own padding and invoking a raw RSA private-key operation, which is how some PKCS#11 providers implement PSS), a duress signal can be carried transparently through TLS1.3 connections.

  • VPN: There is no single “VPN protocol” but instead a variety of open-source and proprietary options with different cryptographic designs. Luckily a large class of VPNs are based on TLS, and this case reduces to the first bullet point. For example OpenVPN and Cisco AnyConnect are both built on TLS client authentication. By using TLS1.3 for the handshake, RSA-PSS becomes accessible for creating a covert channel.
  • SSH: While OpenSSH has been aggressive in pushing for EdDSA and ECDSA keys, on the subject of RSA signatures the implementors are surprisingly conservative. Even the latest RFC favors PKCS#1 v1.5 padding over PSS:

“This document prescribes RSASSA-PKCS1-v1_5 signature padding because […]
(1)  RSASSA-PSS is not universally available to all implementations;”

While this rules out RSA-PSS padding for SSH, that is not the end of the story. Recall that the property we relied on is freedom in choosing from a large collection of possible valid signatures for a given message. ECDSA clearly fits the bill. While EdDSA is deterministic in principle, the difficulty of verifying that property externally can also be leveraged for a covert channel, albeit with some qualifications. The final post in this series will sketch ways of signaling a duress PIN over SSH.



Designing a duress PIN: solving for symmetric cryptography (part III)

[continued from part II]

Plausible deniability from symmetry

Given the inherent difficulty of implementing a convincing duress PIN for public-key cryptography, let’s switch tracks and solve for an easier scenario. Instead of retrofitting a duress PIN on top of an existing card standard such as PIV or GIDS, what if we tried to design an applet with plausible deniability as a first-order requirement? In this example we will focus on cryptographic hardware for managing encryption keys. To make the scenario more concrete, here is a hypothetical dystopian scenario involving an Uighur political dissident in Xinjiang receiving a knock on her door. The state-police are outside, holding an encrypted USB drive, which they allege is full of samizdat critical of the Dear Leader.

“Are you the owner of this drive Citizen Rebiya? It looks like you have one of those disk encryption gadgets. Let’s see if you can decrypt this for us.”

Let’s posit that the disk encryption scheme is rooted in a symmetric secret— which is the case for all popular technologies including LUKS, BitLocker and FileVault. That provides for one crucial difference from the public-key cryptography underlying Bitcoin wallet or SSH: with symmetric keys it is no longer possible for an outside observer to evaluate whether a given operation was performed correctly. For an ideal symmetric cipher, the result of encryption or decryption looks like random data. In fact such notions of “indistinguishability” figure in the theoretical definition of what constitutes a secure symmetric cipher. This property can be leveraged to support multiple PINs in such a way that use of duress PIN looks indistinguishable from a perfectly functioning card with the wrong key.

Here is how such an applet could work at a high-level:

  • The applet maintains N slots, each containing a PIN and associated symmetric key, say for AES.
  • Initially all the PINs are set to random values and the keys are blank.
  • The user chooses to initialize at least 2 of the slots, setting a new PIN and generating a random symmetric key.
  • When the user wants to perform encryption/decryption, they first authenticate by entering a PIN.
  • Here is the unusual part: the user does not indicate which slot they are authenticating against. Instead their PIN entry is checked against all PINs in randomly chosen order.
    • If all PIN checks fail, nothing happens.
    • If any of the PIN checks succeed, the applet makes a note of the slot and then clears the failure count on all PINs. (Otherwise every slot would inevitably march towards lockout.)
  • For the duration of the session, when the applet is asked to encrypt/decrypt some input, the symmetric key from that slot is used.
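
To make the slot logic concrete, here is a minimal Python model of the applet described above. It is a sketch under stated assumptions rather than card code: the class name SlotApplet, its method names and the five-try lockout are hypothetical, and because the Python standard library has no AES, an HMAC-SHA256 keystream stands in for the cipher.

```python
import hashlib
import hmac
import os
import secrets


class SlotApplet:
    """Toy model of the N-slot design (hypothetical names, not JavaCard)."""

    def __init__(self, num_slots: int = 4, max_tries: int = 5):
        # Uninitialized slots hold random PINs nobody knows and blank keys.
        self.pins = [secrets.token_bytes(8) for _ in range(num_slots)]
        self.keys = [b""] * num_slots
        self.failures = [0] * num_slots
        self.max_tries = max_tries
        self.active = None  # slot selected for the current session

    def init_slot(self, index: int, pin: bytes) -> None:
        self.pins[index] = pin
        self.keys[index] = os.urandom(32)
        self.failures[index] = 0

    def verify(self, pin: bytes) -> bool:
        # The caller never names a slot: try all of them in random order.
        order = list(range(len(self.pins)))
        secrets.SystemRandom().shuffle(order)
        matched = None
        for i in order:
            if self.failures[i] >= self.max_tries:
                continue  # slot is locked out (assumed policy)
            if hmac.compare_digest(self.pins[i], pin):
                matched = i
            else:
                self.failures[i] += 1
        if matched is not None:
            # One success clears every counter, so the other slots
            # do not march towards lockout.
            self.failures = [0] * len(self.pins)
        self.active = matched
        return matched is not None

    def crypt(self, nonce: bytes, data: bytes) -> bytes:
        """XOR data with a keystream from the active slot's key.

        Stand-in for AES in counter mode; encrypt and decrypt are the same.
        """
        if self.active is None:
            raise PermissionError("no PIN verified")
        key = self.keys[self.active]
        out = bytearray()
        for offset in range(0, len(data), 32):
            counter = (offset // 32).to_bytes(8, "big")
            pad = hmac.new(key, nonce + counter, hashlib.sha256).digest()
            out.extend(a ^ b for a, b in zip(data[offset:offset + 32], pad))
        return bytes(out)
```

Nothing in the model marks a slot as "real" or "cover". A ciphertext created under one slot simply decrypts to consistent, repeatable junk under any other.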

To create a duress PIN: initialize a few slots, use one for routine activity and all the others as cover, to be disclosed when demanded by authorities. The AES key in the former slot will protect data such as the disk containing information about Tiananmen Square. All of the other keys are unused, but fully initialized inside the card and available for use as AES keys. If the authorities compel disclosure of a PIN, the dissident provides one of the cover PINs. If that PIN is used to decrypt some ciphertext, the applet will report that the PIN is correct but proceed to use a key completely unrelated to the one that created the ciphertext. Result: junk returned as the output of decryption. But crucially for our purposes, it will be convincing junk. Attempting to decrypt the same ciphertext twice returns the same output. Encryption and decryption are inverse operations as expected.

The plausible deniability comes from the fact that all slots are treated identically. Unlike naive designs where there is a distinction between “real” PIN and “duress” PIN, this applet maintains a uniform collection of multiple symmetric keys and associated PINs. The applet itself has no concept of which of them are reserved as duress PINs, as the code paths are identical, modulo the randomized order in which a PIN is checked against every slot. The randomization helps avoid any lingering suspicion about the order in which slots were initialized. For example it may be a common pattern that card-holders first select their ordinary PIN before creating the duress PIN. If slots were checked in a particular order, the time taken to verify the PIN could leak information about whether it was the first or second PIN that checked out.

Why the allowance for more than 2 slots? Not having an upper bound improves on plausible deniability. Since we are dealing with an autocratic regime, we assume the authorities are aware of opsec capabilities available to political dissidents, including this particular solution. So they have a priori reason to suspect the target may have configured an additional duress PIN on her card, in addition to the real one that she uses for decrypting her drives. If there were exactly 2 slots, the authorities could insist that she disclose a duress PIN and her only recourse would be to argue that she only initialized one slot. With an unbounded number of keys and associated PINs, the card-holder is free to “confess” to as many PINs as she would like. Meanwhile the authorities’ position shifts from suspecting the existence of a duress PIN (somewhat warranted under the circumstances) to wondering if there is one more PIN than she has disclosed.

In theory similar ideas can be applied for a card managing public/private-key pairs, instead of symmetric AES keys. However a core assumption underlying this model is likely to fail in that context. By design, public-keys are meant to be, well, public. In fact their utility relies on the public-key being available to everyone the owner may interact with. Sending encrypted email to a recipient requires knowing their public key, as does verifying the authenticity of a digitally signed piece of email originating from that person. This means that in most realistic scenarios disavowing ownership of a public key is much more difficult. Chances are the authorities knocking on the dissident’s door already know she has a particular public key and they are looking for its private counterpart. In fact cryptographic hardware often carries both the public and private key, with the public part freely retrievable without so much as entering a PIN. For example in both PIV and GIDS, certificates can be enumerated without authentication. Such discovery capabilities are essential for the card to be usable by software without any other context; otherwise the middleware can not determine whether it is dealing with an RSA or ECDSA key for example.

Nevertheless there are some options for implementing a duress PIN for standard PKI-capable cards when the results of the private key operation are used online— in other words submitted to a remote server. The last two posts in this series will explore ways to leverage algorithm-specific quirks in implementation for that purpose.



Designing a duress PIN: plausible deniability (part II)

[continued from part I]


While the design sketched above lives up to the spirit of a duress PIN, it has one major problem: the behavior of the duress PIN is easily distinguished from the regular PIN. Cards can get into terminated state due to accidental bugs (early versions of Google Wallet ran into this with the NXP-sourced secure element) or in response to deliberate tampering with the security boundary. However the immediate link between failed PIN entry and the card starting to return a distinct error code for every command is too obvious.

This creates a problem in scenarios where the adversary can retaliate for having been supplied incorrect information. To use the cliched example: if the cardholder has a gun to their head, having volunteered the wrong PIN and permanently disabled the card in the process is unlikely to result in a good outcome. On the other hand, it may be an acceptable response in cases involving disclosure compelled by an employer or even law enforcement. [It goes without saying: This is not legal advice.] US case law is ambiguous on whether citizens can be forced to provide decryption keys safe-guarding their data. A defendant who volunteers the duress PIN and afterwards declares that no further disclosure is possible due to permanently bricked hardware would make for an interesting case pitting fifth-amendment scholarship against the more immediate concern around obstruction of justice.

Moving beyond courtroom drama, we can ask whether it is possible to make duress PIN operation less obvious. Is it possible to have a modicum of plausible deniability when the owner states: “I gave you the correct PIN; I have no idea why the card is not working”? Let’s walk through some variations and see how each one falls short of the goal.

Take #2: Feign correct PIN & fail probabilistically

Instead of immediately terminating the card, the applet could simply feign success while setting a special flag. When this flag is set, every subsequent command has some fixed probability of triggering permanent lockup, for example by setting the card into terminated state. For example if the probability is set at 10%, on average the card will become unusable after ten additional commands. This provides some plausible deniability in that PIN entry appears to succeed and the eventual hardware failure is separated from the PIN entry event. The downside is the card will oblige by performing a few operations at behest of the adversary before the laws of probability kick in. 

Whether or not that is a problem depends on the use-case. For example, if the card holds an email encryption key, the adversary may successfully decrypt a handful of messages of their choosing before the simulated failure kicks in. That may be an acceptable risk if there are thousands of messages and no single one is particularly valuable. On the other hand, there are scenarios where a single private-key operation can be critical. For example, in cryptocurrency storage a single digital signature with the right private-key may be sufficient to move large amounts of money into the control of an adversary. That means it is not always safe to count on delayed lock-up after the duress PIN is entered.
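
The arithmetic behind this trade-off is that of a geometric distribution. The short sketch below assumes each command independently triggers lockup with the same probability and that a command which triggers lockup returns nothing useful; the function names are made up for illustration.

```python
def survival_probability(p_fail: float, commands: int) -> float:
    """Probability that `commands` consecutive commands all complete."""
    return (1.0 - p_fail) ** commands


def expected_commands_until_failure(p_fail: float) -> float:
    """Mean number of commands up to and including the one that locks the card."""
    return 1.0 / p_fail
```

With the 10% setting, the mean is ten commands, but the very first command after duress PIN entry still goes through nine times out of ten. That is cold comfort when one signature is all the adversary needs.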

The other problem is that while there is plausible deniability, it is not very persuasive. Unless one is unwisely relying on security-through-obscurity, we must assume the adversary is familiar with the hardware/software combination in use, including the fact that these applets include an optional duress PIN feature. Given that knowledge, the suspicion falls back on the user. Modern smart-cards are highly reliable and rated for tens of thousands of operations. The odds of a card going into terminated state due to normal wear & tear are negligible compared to the odds that such a failure was deliberately triggered by the application. Bayesian reasoning puts the blame squarely on the cardholder.

Take #3: Feign correct PIN & return bogus results

As before, the PIN verification logic reports success when provided the duress PIN. The applet switches its internal state to “authenticated” and allows the owner to invoke functionality such as exercising cryptographic functions that require prior authentication. But the applet also keeps track of which PIN has been verified in transient memory, e.g. RAM. When it is asked to perform some cryptographic operation such as signing supplied data or decrypting a ciphertext, its behavior is conditioned on exactly which PIN has been validated. When using the regular PIN, calculation results are output verbatim. If the duress PIN was used, random noise is returned instead.

Why bother running the operation at all if results are going to be replaced with random junk anyway? This is to prevent side-channel observations based on timing. Recall that card ICs are not particularly fast and private-key operations such as RSA signing can take a fraction of a second, a delay that is noticeable even on human perception scales without the benefit of a high-resolution timer. In fact side-channel attacks are problematic for many duress PIN implementations. For example, verifying a PIN involves writing to permanent storage— even if the PIN is correct. That means externally observable properties such as power consumption can give away whether one or two PIN checks are taking place.

While we need to confront this problem of side-channels more carefully in subsequent iterations, in this case there is no reason to attach oscilloscopes to anything. Plausible deniability breaks down for a simple reason: an applet returning bogus results for cryptographic operations is itself highly suspicious. First it is easy to detect that the results are incorrect. For example, if the applet is responsible for securing a private key, there is an associated public key and it is safe to assume that public-key is known to the adversary. As such it is very easy to verify if the card performed a signature or decryption correctly. For signatures, ask the applet to sign a known message and use the corresponding public-key to check that signature. In the case of encryption keys, use the public-key to encrypt a known message and ask the card to decrypt it.
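
The adversary's check is simple enough to show in a few lines. The sketch below uses textbook RSA with toy parameters (p=61, q=53) and no padding, purely for illustration; card_sign and adversary_check are hypothetical names, and a real card would use full-size keys and a proper signature scheme.

```python
import secrets

# Textbook RSA with toy parameters: n = 61 * 53, e * d = 1 mod 3120.
N, E, D = 3233, 17, 2753


def card_sign(message: int, duress: bool) -> int:
    """Model of the take #3 applet: always compute, sometimes lie."""
    real = pow(message, D, N)  # computed in both cases so timing matches
    if not duress:
        return real
    # Return a random value in [0, N) that differs from the real result.
    return (real + 1 + secrets.randbelow(N - 1)) % N


def adversary_check(signature: int, message: int) -> bool:
    """Anyone holding the public key (N, E) can run this test."""
    return pow(signature, E, N) == message % N
```

Signing a known message and running adversary_check on the result immediately separates the two behaviors, with no special equipment required.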

It is conceivable for smart-cards to experience hardware failures and start outputting bogus results from ordinary wear & tear. (Keep in mind, good implementations have additional checks against that failure mode. After finishing a private key operation, they verify the result before releasing it out of the card to guard against specific attacks. There is a large body of literature on fault-injection attacks that shows how easily secret keys can be recovered from hardware by inducing certain errors— such as disturbing one out of two steps involved in an RSA private key operation— and observing the incorrect output.) Comparing the odds of such a failure occurring “organically” by bad luck versus being triggered deliberately by duress PIN entry, suspicion lands on the cardholder once again.



Design considerations for a duress PIN (part I)

From urban legends to smart-card programming

An urban legend dating back to the 1990s advises that if you are ever held at gunpoint to withdraw cash from an ATM, enter your PIN backwards. The ATM will still dispense the necessary cash to get you out of trouble, but it will also send an alert to law enforcement that a customer is having an emergency.

This story of the reversed PIN is of course bunk, as explained on Snopes and mainstream sources. But the general idea of a duress PIN or duress code is a real concept in information security. Informally, it refers to an optional feature for authentication mechanisms where there is more than one way to authenticate and some choices result in triggering an alarm to signal authentication has taken place under coercion, such as the person being held at gunpoint. In this blog post we will review some options for implementing such a feature in a realistic setting, namely using smart-cards. While the word “card” may evoke the original ATM withdrawal scenario inspiring the legend, physical form factor is not the salient feature. As covered in previous posts here, often the same secure trusted-execution environments (TEE) powering cards can be repackaged in alternative shapes such as USB tokens or embedded secure elements. The common denominator is the presence of a TEE that can enforce specific rules even the legitimate owner can not work around.

Warm-up: online authentication with passwords

It turns out that the original bank withdrawal scenario is conceptually the simplest setting, as long as the PIN is being checked “online.” By that we mean the PIN entered into the ATM keypad is transmitted to some centralized authentication system— recall that the interoperability requirements for banking mean that card could have been issued by a different financial institution half-way around the country. (The alternative would be offline mode where the card itself is verifying the PIN, to cope with temporary loss of connectivity to the network. This is increasingly rare nowadays.) In that scenario the bank issuing the card could easily have implemented a duress PIN, by allowing customers to choose a second credential along-side their standard PIN. That alternative PIN would still be accepted for authentication while triggering alarms in the background. For example, it may notify bank personnel who in turn reach out to local law-enforcement agencies near the ATM to check on the location.

Crucial to the customer safety is that none of this background activity be apparent to the cardholder— and even more importantly, to the presumed attacker watching over their shoulder. Flashing red-alerts on the ATM screen stating that the police are en route are exactly the wrong outcome: it places the person being coerced in greater danger. In the ideal scenario, the attacker can not distinguish between the use of real credential and duress PIN. There may be subtle changes in behavior as long as the attacker can not detect them. For example, the issuing bank could present the appearance of an artificially low balance on the account or lower cash-withdrawal limits to bound potential losses. But it is crucial that customers have plausible deniability, which is difficult to guarantee in all circumstances. If the adversary knows that a certain person maintains a high-balance and the ATM shows only a few dollars available for withdrawal, they could infer a duress PIN was used and retaliate.

These principles translate to web authentication in a straightforward way: instead of multiple PINs, online authentication systems could allow users to have multiple passwords and designate some of those for use in duress situations. This would not make sense for most consumer-oriented websites, since they lack the 24/7 security operations required to respond to duress signals or enough information about customers’ whereabouts to meaningfully escalate matters to law enforcement. By contrast enterprise authentication systems are better suited to take advantage of duress credentials. Consider the traveling employee conundrum. It is common for enterprises to cut off all access when a team member is traveling to regions with a reputation for industrial espionage. There is a high risk that the employee may be instructed to disclose their corporate credentials or compelled to access company resources at the behest of government authorities, possibly under the guise of security screening at the airport. Removing privileged access in those situations helps both the company and the employee in question: they stop being a target for espionage, assuming attackers are aware of the policy.

Duress credentials can extend a similar level of protection to settings where coercion is unexpected. For example a VPN service can automatically restrict access to internal resources depending on which password is entered. If an employee is asked to give up credentials, they can provide the duress password to formally comply with the request while silently alerting their organization of the situation.
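
As a rough sketch of what the server side might look like, the Python below stores two password hashes per account and maps each to an access level. Everything here is an assumption made for illustration: the Account class, the access-level strings and the PBKDF2 parameters do not come from any particular VPN product.

```python
import hashlib
import hmac
import os


def _hash(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


class Account:
    """Hypothetical account record with a regular and a duress password."""

    def __init__(self, password: str, duress_password: str):
        self.salt = os.urandom(16)
        self.normal = _hash(password, self.salt)
        self.duress = _hash(duress_password, self.salt)


def authenticate(account: Account, attempt: str) -> str:
    """Return the internal access level for a login attempt."""
    digest = _hash(attempt, account.salt)
    # Always compare against both hashes so the work done is the same.
    is_normal = hmac.compare_digest(digest, account.normal)
    is_duress = hmac.compare_digest(digest, account.duress)
    if is_normal:
        return "full"
    if is_duress:
        # Login appears to succeed; this is also where a silent alert
        # to the security team would be raised.
        return "restricted"
    return "denied"
```

The access level is an internal decision. What the client sees for "full" and "restricted" would need to look the same, or the person watching over the employee's shoulder learns which password was used.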

Designing for offline usage

Implementing a duress PIN with an offline device at first looks deceptively simple. Let’s take the example of a card compliant with the Global Platform standard, and programmed using the JavaCard environment. Supporting libraries for this framework conveniently include a reusable PIN implementation, designed to lock out after a configurable number of tries; this is how standard policies such as “5 strikes & you are done” are implemented. Meanwhile Global Platform defines the lifecycle of a card, including the states LOCKED and TERMINATED. In both cases, standard functionality on the card becomes inaccessible. The main difference is that the “terminated” state is irreversible. During the installation of applets on a card (recall this requires access to “card manager” secret keys) an application can be granted card-lock and/or card-terminate privileges.

Putting all this together, here is a naive attempt at duress PIN implementation:

  • Change the applet to maintain 2 PIN objects, one “real” and one for duress scenarios.
  • Initially the duress PIN will mirror the ordinary PIN. When the regular PIN is initialized, the duress PIN will be set to the same value. This is effectively a compatibility mode, and amounts to not having the duress PIN functionality enabled out of the gate.
  • Introduce an extension to the applet interface that allows changing the duress PIN only, after first authenticating with the regular PIN. By setting the duress PIN to a different value than the regular PIN, the cardholder activates the feature. (This extension could be a new APDU or more likely a different P1/P2 parameter passed to the existing CHANGE REFERENCE DATA command typically used for updating PINs.)
  • Install the applet with card-terminate privileges
  • As before PIN verification is required before performing a sensitive operation— such as using a cryptographic key stored on-board to digitally sign a message. That logic is modified as follows:
    1. First check the regular PIN. With the standard OwnerPIN object, that implicitly includes checks for lockout and either clears/increments the failure count depending on whether the supplied value was correct.
    2. If and only if the regular PIN check fails, also check against the duress PIN
    3. If duress PIN check succeeds, set card-lifecycle state to TERMINATED. (Just to be safe, one could also overwrite important secret material in EEPROM or flash storage, to defend against future intrusive attacks against the hardware substrate.)
    4. If duress PIN check failed, simply clear its failed attempt count. We do not want the duress PIN to accidentally get into lockout state, since most incorrect PIN entries are accidental fat-fingering, and not deliberate attempts to trigger self-destruct.
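
The verification steps above can be modeled in Python to check the logic. This is only a model, not JavaCard code: the Pin class is a rough stand-in for OwnerPIN, the terminated flag stands in for the Global Platform lifecycle state, and all names are hypothetical.

```python
import hmac


class Pin:
    """Rough stand-in for OwnerPIN: a value plus a failure counter."""

    def __init__(self, value: bytes, max_tries: int = 5):
        self.value = value
        self.max_tries = max_tries
        self.failures = 0

    def check(self, attempt: bytes) -> bool:
        if self.failures >= self.max_tries:
            return False  # locked out
        if hmac.compare_digest(attempt, self.value):
            self.failures = 0
            return True
        self.failures += 1
        return False


class NaiveDuressApplet:
    def __init__(self, pin: bytes):
        self.regular = Pin(pin)
        self.duress = Pin(pin)  # mirrors the regular PIN until changed
        self.authenticated = False
        self.terminated = False

    def set_duress_pin(self, new_pin: bytes) -> None:
        if not self.authenticated:
            raise PermissionError("regular PIN required")
        self.duress = Pin(new_pin)

    def verify(self, attempt: bytes) -> bool:
        if self.terminated:
            raise RuntimeError("card terminated")
        # Step 1: regular PIN, including lockout bookkeeping.
        self.authenticated = self.regular.check(attempt)
        if self.authenticated:
            return True
        # Step 2: only on failure, try the duress PIN.
        if self.duress.check(attempt):
            self.terminated = True  # step 3: irreversible
        else:
            self.duress.failures = 0  # step 4: never lock out the duress PIN
        return False
```

Note how a duress PIN entry is the only input after which every later call fails, which is the tell discussed next.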

While this basic design works, it falls short of the goal in one important aspect: plausible deniability.

[continued in part II]


Blame it on Bitcoin: ransomware and regulation [part II]

Full disclosure: This blogger worked for a regulated US cryptocurrency exchange. All opinions expressed are personal

[continued from part I]

Minding the miners

Miners are arguably the most unwieldy aspect of the system for regulation. On the one hand, mining is highly centralized with a handful of pools located outside the US controlling the majority of bitcoin hash-rate. (Although the recent ban against mining in China may result in an exodus out of that region and perhaps diversify the geographic distribution.) On the other hand, it only takes one miner to make a transaction “official” by including it in a block. All other miners will continue to build on top of that block without judgment, piling on additional confirmations to bury the transaction deeper and deeper into immutable record in the public ledger. That does not bode well for attempts to censor ransomware payments. Even if all ransomware payment addresses were known ahead of time— itself a tall order, given the ease of creating new addresses and a motivated victim who wants the payment to succeed— it is difficult to see how regulatory pressure on miners could achieve sufficient coverage and prevent defectors from including the transaction when doing so would be in their economic interest.

Similar considerations apply to “blacklisting” ransomware addresses and attempting to prevent the crooks from spending their ill-gotten gains. Freezing ransomware funds after they are received by the perpetrators would require at least 51% of mining power agreeing to cooperate to the point of initiating small-forks every time a blacklisted transaction is mined by another miner outside the coalition. (For more on this, see previous blog post on “clean blocks” and censoring transactions.)

Returning to fiat: on-ramps and off-ramps

Notwithstanding enthusiasm about using Bitcoin for retail payments and the occasional short-lived publicity stunt— Tesla’s foray into accepting bitcoin comes to mind— most commercial transactions are still conducted in fiat. While ransomware perpetrators can collect bitcoin from their targets, they still need a way to convert those funds into dollars, euros or more likely rubles. That brings us back to cryptocurrency exchanges. They serve as the on-ramps and off-ramps into the cryptocurrency and present an attractive “choke point” for implementing controls to stop criminals from converting ill-gotten gains into universally accepted fiat currency.

But the same regulated vs off-shore dichotomy complicates this scheme. Regulated exchanges are already incentivized to turn away organizations with dubious source of funds. They implement robust KYC/AML programs to weed out such applicants during on-boarding and continue to monitor for unusual activity, filing CTRs and SARs to alert applicable authorities. The whole point of a compliance department is turning away paying customers when they pose too high a risk, giving up short-term revenue in exchange for long-term health of the business. Unregulated, off-shore exchanges have no such scruples. They are willing to take money from anyone with a pulse and look the other way (or, not bother looking at all) when those customers receive funds that can be traced to criminal activity. Examples:

  • In some cases the willful negligence is an open secret. BTC-e used to rank in the top five of all exchanges in BTC/USD volume. In defiance of the law-of-one-price, bitcoin consistently traded at a lower price there than on other major exchanges, hinting at a captive audience with nowhere else to go for cashing out their bitcoin. That mystery was explained when BTC-e was shut down by authorities in 2017, with the founders charged with helping launder stolen funds from Mt Gox.
  • The blockchain analytics firm Chainalysis noted that in 2019 over one-fourth of illicit bitcoin went to Binance. (No surprise that IRS & DOJ are investigating Binance.)
  • In another fine example of investigative journalism, CyberNews posed as a willing accomplice to join a ransomware group and found out the syndicate had access to an insider at an unnamed exchange:

    “Apparently, the cybercriminals had an insider contact at a cryptocurrency exchange who specialized in money anonymisation and would help us safely cash out (and maybe even launder) our future ransom payouts.”

These types of venues are the ideal place for criminal organizations to patronize when it comes to cashing out ransom payments. It would make no sense for DarkSide operators to trade on a regulated exchange such as Coinbase. Even if they managed to get past the onboarding process and transfer bitcoin for sale, there is a high risk their account may be frozen at any point and all funds seized at the behest of US authorities.

The challenge with controlling on/off-ramps into cryptocurrency then is one of jurisdictional reach and enforcement. Raising the bar on existing KYC/AML programs will certainly drive marginal improvements from already compliant exchanges: they may turn away a few more customers from the onboarding queue or file a few more SARs based on tracing blockchain activity. Meanwhile unregulated exchanges will continue to operate under the assumption that they can ignore the new rule-making, relying on the presumed safety of their offshore location and the fiction of not serving US customers (at least US customers who are not savvy enough to use a VPN).

The good news is both problems are actionable: BTC-e was taken down after all, even though it was ostensibly headquartered in Russia. BitMEX is based in the Seychelles and claims to not serve US customers. That has not stopped the US Attorney for the SDNY from charging BitMEX executives with violations of the Bank Secrecy Act. There is a good reason for the spotlight to be on cryptocurrency exchanges as an ally in combatting ransomware. If victims can not be prevented from initiating the cycle by paying up, the next best opportunity is to prevent those funds from being converted into fiat. In other words: turn the crooks into involuntary HODLers. (This strategy assumes cryptocurrency will remain primarily a store of value, in other words an inflation hedge or digital gold. If cryptocurrency becomes an efficient method of exchange where a meaningful chunk of commercial transactions can be carried out without taking the “off-ramps” back into fiat, confining criminals to bitcoin will stop being a meaningful strategy.) But that purpose is best served by extending the reach of existing laws on the books to cover offshore exchanges when their involvement in ransomware creates negative externalities that spill over across jurisdictions.