Among financial instruments, cryptocurrency is unique in equating possession of funds with control over a secret cryptographic key. If you have the private key corresponding to a particular blockchain address, you have full control over funds at that address. In particular, you can sign a transaction to move those funds anywhere. This simple threat model helps defenders prioritize their strategies and place great emphasis on key management: making sure private keys do not fall into the wrong hands. This is where offline, air-gapped “cold storage” designs, multi-signature or equivalent MPC techniques, and specialized hardware security modules (HSMs) come into play, helping raise the bar against attacks.
But there is a more subtle aspect of blockchain design that complicates defending against “second-order threats” involving temporary access to private keys. It’s clear that an adversary need not have actual possession of private keys, in the sense of having the raw bits they can print on a piece of paper, in order to carry out a heist. If they can instruct a black box to sign a transaction sending funds to a new blockchain address controlled by the attacker, that will do just fine.
A parallel from the world of code signing comes from the 2012 Adobe breach. Most consumer operating systems, including Windows, require software vendors to digitally sign the applications they publish. This is designed to increase confidence in the authenticity of software and prevent malware from disguising itself as “Adobe Photoshop,” for instance. Signing keys must be carefully guarded, and failure to do so has been leveraged in well-known attacks, most notably the joint US/Israel Stuxnet malware targeting the Iranian nuclear program, which used stolen private keys to digitally sign its components. In the case of the Adobe breach, the company had taken steps to secure its private keys by using a hardware security module. This prevented attackers from walking away with a copy of those keys. Not surprisingly, it did not stop them from signing a few pieces of malicious code during the time they had access to the HSM.
Here is a more routine scenario involving access control. Consider Bob, an employee in good standing at a company that stores cryptocurrency. Because his role includes wallet management, Bob has access to the key management system to generate transactions. (Additional controls may exist to limit what transactions can be signed, but this makes no difference to the risk under consideration here.) Suppose Bob and his employer part ways in a less-than-friendly manner. It is clear that while he was employed, he could have signed and broadcast any permissible transaction. Let’s posit that his access has been properly revoked and he can no longer access the signing infrastructure. Let’s also grant that all private keys were stored on HSMs in non-extractable fashion. Is there any reason to fear retaliation from Bob?
If Bob planned ahead, he could have signed some transactions and put them aside for future broadcast. Whether those transactions remain valid indefinitely depends on the specific blockchain protocol. In the case of Bitcoin, the answer is yes, unless some other transaction is first broadcast to spend the same unspent transaction output or “UTXO.” In fact Bitcoin can only set time limits in one direction: it is possible to time-lock funds such that a transaction is not valid until a certain time or block height. It is not yet possible, barring a hard fork, to create a transaction that is valid today but stops being valid at a future date, short of having a conflicting transaction that double-spends the inputs. This provides a straightforward, if expensive, way of mitigating risk from unknown signatures floating around: preemptively spend all existing UTXOs associated with keys the ex-employee had access to. These can even be “loopback” transactions, sending funds to the same address without generating new keys, as long as the transactions are unpredictable.**
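The invalidation rule can be illustrated with a toy model. This is only a sketch: `Outpoint`, `SignedTx`, and `ToyLedger` are hypothetical names invented for illustration, the transaction IDs are made-up strings, and nothing here verifies signatures or talks to a real Bitcoin node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outpoint:
    txid: str   # identifier of the transaction that created the output (made up here)
    index: int  # position of the output within that transaction


@dataclass(frozen=True)
class SignedTx:
    spends: tuple  # outpoints this transaction consumes
    label: str


class ToyLedger:
    """Hypothetical ledger that tracks only which outputs are unspent."""

    def __init__(self, unspent):
        self.unspent = set(unspent)

    def is_valid(self, tx):
        # A transaction is valid only while every input is still unspent.
        return all(outpoint in self.unspent for outpoint in tx.spends)

    def apply(self, tx, new_txid):
        if not self.is_valid(tx):
            raise ValueError("input already spent")
        self.unspent -= set(tx.spends)
        # Spending creates a fresh output under a new identifier.
        created = Outpoint(new_txid, 0)
        self.unspent.add(created)
        return created


coin = Outpoint("aa11", 0)
ledger = ToyLedger([coin])
hoarded = SignedTx((coin,), "signed by Bob before leaving, never broadcast")
loopback = SignedTx((coin,), "custodian sends the funds back to the same address")

valid_before = ledger.is_valid(hoarded)   # the hoarded transaction is still usable
ledger.apply(loopback, new_txid="bb22")   # custodian spends the output first
valid_after = ledger.is_valid(hoarded)    # its input no longer exists
```

Once the loopback is applied, the hoarded transaction refers to an output that is gone. The funds now sit in a new output whose identifier Bob could not have signed against, provided he could not predict it.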
In the case of Ethereum, and specifically externally-owned accounts (EOAs), the situation is trickier. It turns out disgruntled employees can steal future funds that do not even exist at the time they are employed.
Ethereum signing recap
Ethereum requires a signed message to be broadcast in order to authorize a funds transfer or invoke a function on a contract. This message has several fields, encoded using a scheme called “RLP.” For our purposes the interesting ones are:
- Destination address
- Amount being transferred
- Value of the current nonce associated with the source address
Compared to Bitcoin transactions, this information is only loosely bound to current blockchain state. Creating a successful Bitcoin transaction requires knowing the SHA-256 hash identifying an existing UTXO on chain, which is a function of past state including all previous inputs that feed into that UTXO. For Ethereum, only some vague knowledge about the state of the world is necessary. Walking through the three fields above:
- Destination address is arbitrary and completely attacker-controlled.
- The only constraint on the amount is that it cannot exceed the total stored at that address. (Unlike Bitcoin, where spending a UTXO consumes it entirely, an Ethereum transfer leaves any amount not included in the transfer at that address.)
- Nonce is a counter that starts at zero and increments by one for every transaction originating from that address. The nonce included in the transaction must be exactly equal to the current nonce on the blockchain.
Pre-theft: stealing nonexistent funds
Note that the only reference to blockchain state is the nonce. There is no need to know the exact balance on that address, much less the sequence of previous transactions resulting in that total. That property makes it possible to steal funds that do not even exist on-chain yet, given only temporary access to the signing interface.
Returning to our hypothetical disgruntled employee Bob: suppose Bob knows that some Ethereum address will receive deposits in the future, even though its current balance is exactly zero. Bob can use his temporary access to sign a transaction for a future predicted value of the nonce and amount. For example, he can bet that by the time the counter reaches 100, there will be at least 5 ETH balance in this account and create a corresponding transaction to funnel that amount to a personal address. Now all he has to do is wait until the counter reaches 100 and broadcast the previously signed transaction. As long as the balance is at least 5 ETH, the transaction will move that amount into Bob’s possession.
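Bob's bet can be replayed in a toy account model. This is a minimal sketch under stated assumptions: `ToyAccount` and `SignedTransfer` are hypothetical names, amounts are whole ETH, gas fees are ignored, and signature verification is not modeled at all. Only the two checks discussed above (nonce and balance) are simulated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignedTransfer:
    nonce: int       # counter value the signature commits to
    amount_eth: int  # amount to move, in whole ETH for simplicity
    to: str          # destination, chosen freely by the signer


class ToyAccount:
    """Hypothetical externally-owned account: just a nonce and a balance."""

    def __init__(self):
        self.nonce = 0
        self.balance_eth = 0

    def deposit(self, amount_eth):
        # Incoming funds raise the balance but leave the nonce untouched.
        self.balance_eth += amount_eth

    def try_apply(self, tx):
        if tx.nonce != self.nonce:
            return False  # nonce must match the current counter exactly
        if tx.amount_eth > self.balance_eth:
            return False  # cannot move more than the account holds
        self.balance_eth -= tx.amount_eth
        self.nonce += 1
        return True


account = ToyAccount()

# Signed while Bob still has access: the account is empty and its nonce is 0.
bobs_tx = SignedTransfer(nonce=100, amount_eth=5, to="bob")
accepted_early = account.try_apply(bobs_tx)

# Business as usual after Bob leaves: deposits arrive, 100 legitimate
# withdrawals go out, and the counter reaches 100.
account.deposit(200)
for n in range(100):
    account.try_apply(SignedTransfer(nonce=n, amount_eth=1, to="customer"))

# The old signature now matches the chain state.
accepted_later = account.try_apply(bobs_tx)
```

Nothing in `bobs_tx` refers to the deposits or to any of the hundred intervening transfers, which is why it can be prepared before any of them happen.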
Optimizing the heist
What if Bob guessed wrong and there is only 3 ETH? Not to worry: he can supply 2 ETH from his own funds, add that to the original pool and then withdraw using the existing transaction. This is a bizarre pattern as far as criminal activities go: the thief must make a donation to their victim before committing robbery.
Side note to Bob: you would want to execute these two steps as close together in time as possible. Otherwise there is a risk that the 2 ETH “donation” gets processed but the counter increments past 100 due to intervening transactions, causing Bob to miss the attack window. Such outcomes are difficult to guarantee because the second step cannot be implemented as a smart-contract invocation. (If that were possible, Bob could write a custom contract that attempts to execute both atomically and revert in case the withdrawal fails.) Only a miner colluding with Bob can guarantee that the donation and theft transactions are executed back-to-back and before any other transaction involving the same address that may disrupt the nonce value.
In fact there is no reason for Bob to limit himself to just one signed transaction. To cover his bases, he can sign multiple transactions for the same amount at different counter values in a given interval, for example spanning 100 through 110. This avoids any race conditions from Bob’s transaction being preempted by another in-flight transaction for the same counter value, originating with the authorized party.
Multiple signatures solve another nagging problem for Bob: leaving money on the table. Recall that Bob must guess at a particular amount to steal. If he guesses too high, he faces the problem of having to supply some funds first. What if he guesses too low? Imagine the balance were 50 ETH instead of 5 ETH. The transaction Bob prepared would walk away with only 10% of the total take possible compared to the optimal heist.
Bob can improve his odds by preparing a series of transactions with different nonce values and different amounts. Consider this sequence:
<100, 1 ETH>
<101, 2 ETH>
<102, 4 ETH>
…
<109, 512 ETH>
Assuming the final balance is 50 ETH, he will broadcast the first five transactions, netting a total of 31 ETH. The sixth one cannot be broadcast as is, because at that point the remaining balance on the account is 19 ETH, which is lower than the 32 ETH withdrawal attempt. (Bob can use some of the proceeds from the initial batch to “loan” more funds into the victim address such that the total reaches 32 ETH and only then broadcast the sixth transaction.) Even without risking the race condition associated with lend-and-steal, this sequence captures at least about 50% of available funds, as long as the balance stays within the range the sequence covers. In fact there is nothing magical about the factor of two appearing in the sequence above. At the cost of requiring additional signatures, one could prepare a series of transactions where the amounts increase by some other constant factor F > 1, with the guarantee that roughly 1/F or more of the total value sitting on that address can be captured directly.
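The arithmetic can be checked with a short simulation. This is a sketch under simplifying assumptions: `build_ladder` and `replay` are hypothetical helpers, balances are whole ETH, gas fees are ignored, and the account nonce is assumed to sit exactly at the first value in the ladder with no competing transactions.

```python
def build_ladder(start_nonce, count, factor=2):
    """Return (nonce, amount) pairs whose amounts grow geometrically from 1 ETH."""
    return [(start_nonce + i, factor ** i) for i in range(count)]


def replay(ladder, balance):
    """Broadcast in nonce order; stop at the first withdrawal the balance cannot cover.

    A transaction that cannot execute leaves the nonce where it is, so every
    later transaction in the ladder is blocked as well.
    """
    captured = 0
    for _nonce, amount in ladder:
        if amount > balance:
            break
        balance -= amount
        captured += amount
    return captured, balance


ladder = build_ladder(start_nonce=100, count=10)  # <100, 1 ETH> through <109, 512 ETH>
captured, remaining = replay(ladder, balance=50)
print(captured, remaining)  # 31 19
```

Passing `factor=3` builds the sparser ladder 1, 3, 9, and so on, which needs fewer signatures to cover the same range but can leave a larger share of the balance behind.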
Defender perspective: mitigations
Proving the existence of a signed transaction is relatively easy: broadcast it. (In fact there are zero-knowledge techniques from cryptography for convincing a verifier that you know such a signature without disclosing it.) But proving the non-existence of such a transaction is tricky. How can a custodian be confident that someone with access to the private key in the past did not sign pre-theft transactions? There are two sound approaches:
1. Throw in the towel, deprecate the address and start over by transferring the entire balance over to a newly generated key-pair. This is straightforward but highly disruptive in having to update all existing references to the previous blockchain address.
2. Use a smart contract. While externally-owned Ethereum addresses have hard-coded logic for authorizing funds movement, a contract is free to make up its own rules. Instead of using an incrementing nonce, which is highly predictable, the contract logic can dictate that a value outside the attacker's control, such as the block hash, must be incorporated into the signed message. While the block hash is deterministic in one sense (it is computed as a function of all transactions included in the block), an attacker has no way to control it indefinitely into the future short of controlling 100% of the hash rate.
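The second approach can be sketched as a toy simulation. Everything in it is an illustrative assumption rather than real contract code: `ToyWallet` is a hypothetical stand-in for contract logic, block hashes are random bytes, the 256-block freshness window is an arbitrary choice, and HMAC-SHA256 with a shared key stands in for a real digital signature only to keep the example dependency-free. Replay protection for fresh authorizations is not modeled.

```python
import hashlib
import hmac
import os

KEY = os.urandom(32)  # stand-in for the signing key held inside the HSM


def sign(to, amount, block_hash):
    """Toy 'signature' over destination, amount, and a block hash (HMAC stand-in)."""
    message = f"{to}|{amount}|".encode() + block_hash
    return hmac.new(KEY, message, hashlib.sha256).digest()


class ToyWallet:
    """Hypothetical contract logic: accept only messages bound to a recent block."""

    WINDOW = 256  # illustrative freshness window, in blocks

    def __init__(self):
        self.block_hashes = []

    def mine_block(self):
        # Random bytes model a block hash nobody can predict in advance.
        self.block_hashes.append(os.urandom(32))

    def authorize(self, to, amount, block_hash, signature):
        if block_hash not in self.block_hashes[-self.WINDOW:]:
            return False  # message is bound to a block that is too old or unknown
        return hmac.compare_digest(sign(to, amount, block_hash), signature)


wallet = ToyWallet()
for _ in range(10):
    wallet.mine_block()

# Bob signs while he still has access; the best he can do is bind to the latest block.
old_hash = wallet.block_hashes[-1]
hoarded_sig = sign("bob", 5, old_hash)

for _ in range(300):  # time passes after Bob's access is revoked
    wallet.mine_block()

stale_accepted = wallet.authorize("bob", 5, old_hash, hoarded_sig)

# A current signer can bind to a recent block without difficulty.
fresh_hash = wallet.block_hashes[-1]
fresh_accepted = wallet.authorize("payee", 5, fresh_hash, sign("payee", 5, fresh_hash))
```

The hoarded signature expires on its own once its block falls out of the window, and Bob could not have signed against block hashes that did not yet exist.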
What about audit trails? In theory, if the signing system has a perfect logging mechanism that dutifully records every use of the key including the message signed, one can be confident there are no other, unknown transactions floating around. In reality it is difficult to achieve this level of assurance. Even standard cryptographic hardware does not help. Typical HSM logs can reveal when someone performed cryptographic operations but not necessarily which key was involved, much less the exact message they signed. Some vendor extensions to the PKCS#11 standard include counters that increment each time a key is used. (SafeNet HSMs implement this in firmware 7.0 and higher versions.) One can reconcile that counter value against an independent record of all transactions ever submitted for signing. This approach can flag discrepancies, but not necessarily resolve them conclusively. Suppose the HSM counter shows a key was used 10 times but only 9 signed transactions are known to exist. There could be an innocent explanation: some transaction among the nine got signed twice, due to a transient error that was silently resolved by retrying with the same message. Or it could be evidence of a pre-theft attack, where someone snuck in a tenth transaction outside the known set with intent to broadcast it in the future.
The missing feature is a more robust, tamper-resistant audit trail maintained internally by the HSM that incorporates hashes being signed. This need not be an append-only log in the traditional sense. For example the same logic used for “extending” PCRs in a TPM can be used to maintain a concise, constant size running tally of all messages ever signed with a given private key.
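A PCR-style running tally might look like the following sketch. The class name `SigningTally`, the choice of SHA-256, and the all-zero starting value are assumptions made for illustration; no existing HSM is claimed to expose this interface.

```python
import hashlib


class SigningTally:
    """Hypothetical constant-size record of every message signed with one key."""

    def __init__(self):
        self.value = bytes(32)  # starts at all zeros, like a freshly reset PCR
        self.count = 0

    def extend(self, message):
        # new_value = H(old_value || H(message)), the same shape as a PCR extend
        digest = hashlib.sha256(message).digest()
        self.value = hashlib.sha256(self.value + digest).digest()
        self.count += 1


def recompute(messages):
    """Auditor side: rebuild the expected tally from the known signing record."""
    tally = SigningTally()
    for message in messages:
        tally.extend(message)
    return tally.value


# Inside the HSM: two known transactions plus one slipped in by an insider.
hsm = SigningTally()
for message in [b"tx-1", b"pre-theft-tx", b"tx-2"]:
    hsm.extend(message)

known_record = [b"tx-1", b"tx-2"]
matches = recompute(known_record) == hsm.value
```

A mismatch tells the auditor that the known record is incomplete or out of order, and a retry of the same message shows up as a repeated entry instead of an unexplained counter increment. Because each value depends on everything before it, the auditor needs the complete ordered list of messages to reproduce the tally.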
** Segregated witness complicates this somewhat because the signatures are no longer included in the transaction hash. That removes one of the main sources of unpredictability from the UTXO, leaving only the amounts and mining fees.