Designing a duress PIN: covert channels with RSA (part IV)

[continued from part III]

Covert channels for public-key signatures

For reasons described earlier, it is difficult to hide the existence of a private-key on a card— because the associated public-key is often retrievable without any authentication.  For example both the PIV and GIDS standards allow retrieving certificates from the card without supplying a PIN. By convention when a certificate exists for a given slot, say the 9C slot designated for “signature key” in PIV, the card contains the corresponding private key. Similarly public-key encryption formats such as CMS and GPG contain hints about the identity of the public key that a given message was encrypted to. This rules out the earlier approach used for symmetric keys, namely creating plausible deniability about the very existence of a specific key on the card.

If we focus on digital signatures and lower the bar to allow online verification, indistinguishability for the duress PIN can be restored. In this model the card is allowed to output a correct result— namely, a valid digital signature computed with the private key on board. We punt responsibility for detecting use of the duress PIN to a remote system responsible for verifying that output. This clearly does not work for decryption since the adversary would not need any assistance from a remote peer to make use of the output. Nor would it work for scenarios where signature verification is performed by parties outside the control of the user. That includes blockchains: bitcoin miners are happy to include any transaction with a valid signature that meets consensus rules in the next block without further inspection. Instead we need to look at closed ecosystems where the signatures are only intended for a system that is closely affiliated with the cardholder and working in conjunction with them to detect duress signals.

Somewhat realistic scenarios exist for enterprise authentication. Imagine a company with a VPN for remote access, a website that implements TLS client authentication or Linux servers accessed using SSH. For all three scenarios authentication is ideally implemented using public-key cryptography with private keys stored on cryptographic hardware such as a smart-card or USB token. The common denominator for these use-cases is the card signing a challenge created during protocol execution, with that signature verified by the server to confirm that the person on the other side is in possession of the private key corresponding to an expected public key. Depending on exactly which signature algorithms are used, a duress PIN can be implemented by piggy-backing on subliminal channels.

Subliminal channels are a type of covert channel present in some digital signature algorithms, allowing the signer to convey additional information in the signature. Broadly speaking this is possible when the signature scheme is randomized: there is more than one valid signature for a given message. While the theoretical constructions assume that the signer will randomly pick one of those with uniform probability, a crafty signer in cahoots with a verifier can do something more subtle: instead of choosing randomly, use the freedom to choose for signaling additional information to the verifier.

RSA-PSS

This is best exemplified with RSA-PSS where PSS stands for “probabilistic signature scheme.” (PSS can be thought of as the counterpart of OAEP which is a probabilistic padding scheme for RSA encryption.) PSS signing starts out with a choice of a random salt, which is used to generate a mask that is combined with the message hash using a series of concatenation and xor operations. The important point is that this salt is fully recoverable by the verifier. That means it is trivial to use the salt to convey additional information. In our case we only need to get 1 bit of information across, namely the answer to a true/false question: did the card-holder enter a duress PIN?

Care must be taken in how that information is encoded. Recall that anyone in possession of the public-key can verify the signature and therefore recover the salt. The adversary is also assumed to have access to that public-key; otherwise they could not check when the card is outputting bogus results, and we would have no need for this level of stealth. A simple encoding scheme such as setting the last bit of the salt to 0 or 1 will not fly; the adversary can read that information too.

Indeed no scheme that can be publicly verified is safe, no matter how complicated. Suppose we decide to obfuscate matters by encoding the boolean value in the hash of the salt. Choose a salt such that its SHA256 hash ends in a 0 bit to convey “false” (as in, correct PIN entered) and “true” otherwise (duress PIN used). The flaw in this design is relying on security-through-obscurity. If the adversary knows the covert channel, they can also run the same SHA256 computation and learn the result.

Creating a more robust scheme calls for a shared secret negotiated between the card and the remote server ahead of time. Given that secret, we can compute one bit as the output of some pseudo-random function of the salt. For example:

  • Start with a randomly generated salt
  • Run all but one bit of that salt through HMAC-SHA256 with the shared key
  • Take the least significant bit of HMAC output
  • Depending on whether we want to convey 0 or 1, either use that bit verbatim or flip it to determine the final salt bit

The server can repeat the same HMAC computation on the other side to infer whether the signer conveyed 0 or 1.
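
To make this concrete, here is a minimal Python sketch of both ends of the channel. It assumes a 32-byte salt and a pre-shared 256-bit key; all names are illustrative, and on the card this logic would live inside the applet's salt-generation routine rather than on a host.

    import hmac
    import hashlib
    import os

    SALT_LEN = 32  # bytes, matching a SHA-256 sized PSS salt

    def make_salt(shared_key: bytes, duress: bool) -> bytes:
        """Card side: pick a random salt whose final bit encodes the duress signal."""
        salt = bytearray(os.urandom(SALT_LEN))
        prefix = bytes(salt[:-1]) + bytes([salt[-1] & 0xFE])   # salt with final bit cleared
        prf_bit = hmac.new(shared_key, prefix, hashlib.sha256).digest()[-1] & 1
        signal = 1 if duress else 0
        salt[-1] = (salt[-1] & 0xFE) | (prf_bit ^ signal)       # final bit = PRF output xor signal
        return bytes(salt)

    def read_signal(shared_key: bytes, salt: bytes) -> bool:
        """Server side: recover the signal from the salt extracted during PSS verification."""
        prefix = salt[:-1] + bytes([salt[-1] & 0xFE])
        prf_bit = hmac.new(shared_key, prefix, hashlib.sha256).digest()[-1] & 1
        return bool((salt[-1] & 1) ^ prf_bit)

    key = os.urandom(32)
    assert read_signal(key, make_salt(key, duress=False)) is False
    assert read_signal(key, make_salt(key, duress=True)) is True

Without the shared key, the final salt bit looks random to anyone inspecting the signature, including an adversary holding the public key.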

Some care is necessary to implement this without side-channel leaks from the card applet. In particular, one would need a similar trick as in the earlier design, with a collection of multiple PIN slots and an associated 0/1 signal bit for each slot based on whether that PIN corresponds to a duress scenario. Salt generation always proceeds the same way, using the HMAC scheme described above and the shared key, which is identical across slots. The only difference is that the last bit is xored with a value of 0 or 1 drawn from the specific slot that validated against the supplied PIN. As before, the supplied PIN is checked against all slots in a different, randomly chosen order each time.
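
Continuing the sketch above (make_salt and the hmac import come from the previous snippet), the slot handling might look roughly like this; it is illustrative Python rather than actual JavaCard code:

    import random

    class Slot:
        def __init__(self, pin: bytes, signal_bit: int):
            self.pin = pin
            self.signal_bit = signal_bit   # 0 for an ordinary slot, 1 for a duress slot

    def salt_for_pin(slots, entered_pin: bytes, shared_key: bytes) -> bytes:
        matched = None
        for slot in random.sample(slots, len(slots)):          # randomized check order
            if hmac.compare_digest(slot.pin, entered_pin):     # same work done for every slot
                matched = slot
        if matched is None:
            raise ValueError("PIN verification failed")
        # Identical salt-generation path regardless of slot; only the xored bit differs
        return make_salt(shared_key, duress=bool(matched.signal_bit))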

Stepping back to review our assumptions: how realistic is it to find RSA-PSS or some other probabilistic signature scheme in existing real-world protocols? After all it is much easier to tweak existing server-side logic to run additional checks on an otherwise valid signature than to deploy a whole new protocol from scratch. Using the original three scenarios as a benchmark, we are batting two out of three, with some caveats.

  • TLS: PSS was not supported in TLS1.2 but the latest version of the protocol as of this writing includes RSA-PSS variants in the list of recognized signature schemes. While the signature scheme selected is subject to negotiation between client and server based on comparing their respective lists, TLS 1.3 has an unambiguous preference for PSS:

    RSA signatures MUST use an RSASSA-PSS algorithm, regardless of whether RSASSA-PKCS1-v1_5 algorithms appear in “signature_algorithms”

Bottom line: provided the smart-card natively implements RSA-PSS and the associated middleware delegates padding to the card—as opposed to selecting its own padding and invoking a raw RSA private-key operation, which is how some PKCS#11 providers implement PSS— a duress signal can be carried transparently through TLS 1.3 connections.

  • VPN: There is no single “VPN protocol” but instead a variety of open-source and proprietary options with different cryptographic designs. Luckily a large class of VPNs are based on TLS, and this case reduces to the first bullet point. For example OpenVPN and Cisco AnyConnect are both built on TLS client authentication. By using TLS 1.3 for the handshake, RSA-PSS becomes accessible for creating a covert channel.
  • SSH: While OpenSSH has been aggressive in pushing for EdDSA and ECDSA keys, on the subject of RSA signatures the implementors are surprisingly conservative. Even the latest RFC favors PKCSv1.5 padding over PSS:

“This document prescribes RSASSA-PKCS1-v1_5 signature padding because […]
(1)  RSASSA-PSS is not universally available to all implementations;”

While this rules out RSA-PSS padding for SSH, that is not the end of the story. Recall that the property we relied on is freedom in choosing from a large collection of possible valid signatures for a given message. ECDSA clearly fits the bill. While EdDSA is deterministic in principle, the difficulty of verifying that property externally can also be leveraged for a covert channel, albeit with some qualifications. The final post in this series will sketch ways of signaling a duress PIN over SSH.

[continued]

CP

Designing a duress PIN: solving for symmetric cryptography (part III)

[continued from part II]

Plausible deniability from symmetry

Given the inherent difficulty of implementing a convincing duress PIN for public-key cryptography, let’s switch tracks and solve for an easier scenario. Instead of retrofitting a duress PIN on top of an existing card standard such as PIV or GIDS, what if we tried to design an applet with plausible deniability as a first-order requirement? In this example we will focus on cryptographic hardware for managing encryption keys. To make the scenario more concrete, here is a hypothetical dystopian scenario involving an Uighur political dissident in Xinjiang receiving a knock on her door. The state-police are outside, holding an encrypted USB drive, which they allege is full of samizdat critical of the Dear Leader.

“Are you the owner of this drive Citizen Rebiya? It looks like you have one of those disk encryption gadgets. Let’s see if you can decrypt this for us.”

Let’s posit that the disk encryption scheme is rooted in a symmetric secret— which is the case for all popular technologies including LUKS, BitLocker and FileVault. That provides for one crucial difference from the public-key cryptography underlying a Bitcoin wallet or SSH: with symmetric keys it is no longer possible for an outside observer to evaluate whether a given operation was performed correctly. For an ideal symmetric cipher, the result of encryption or decryption looks like random data. In fact such notions of “indistinguishability” figure in the theoretical definition of what constitutes a secure symmetric cipher. This property can be leveraged to support multiple PINs in such a way that use of the duress PIN looks indistinguishable from a perfectly functioning card with the wrong key.

Here is how such an applet could work at a high level (a rough code sketch follows the list):

  • The applet maintains N slots, each containing a PIN and associated symmetric key, say for AES.
  • Initially all the PINs are set to random values and the keys are blank.
  • The user chooses to initialize at least 2 of the slots, setting a new PIN and generating a random symmetric key.
  • When the user wants to perform encryption/decryption, they first authenticate by entering a PIN.
  • Here is the unusual part: the user does not indicate which slot they are authenticating against. Instead their PIN entry is checked against all PINs in randomly chosen order.
    • If all PIN checks fail, nothing happens.
    • If any of the PIN checks succeed, the applet makes a note of the slot and then clears the failure count on all PINs. (Otherwise every slot would inevitably march towards lockout.)
  • For the duration of the session, when the applet is asked to encrypt/decrypt some input, the symmetric key from that slot is used.
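
As a rough illustration, here is a host-side Python simulation of that slot logic. It is purely a sketch: a real implementation would be a JavaCard applet with the PINs and AES keys held inside the secure element, and the names below are invented for this example.

    import hmac
    import os
    import random

    class Slot:
        def __init__(self):
            self.pin = os.urandom(8)   # random initial PIN: slot is effectively unusable
            self.key = None            # AES key stays blank until the slot is initialized
            self.failures = 0

    class Applet:
        MAX_TRIES = 5

        def __init__(self, num_slots: int = 4):
            self.slots = [Slot() for _ in range(num_slots)]
            self.active = None         # slot selected for the current session

        def initialize_slot(self, index: int, pin: bytes) -> None:
            self.slots[index].pin = pin
            self.slots[index].key = os.urandom(16)   # fresh random AES-128 key

        def verify(self, pin: bytes) -> bool:
            matched = None
            for slot in random.sample(self.slots, len(self.slots)):   # random order
                if slot.failures >= self.MAX_TRIES:
                    continue                                          # slot locked out
                if hmac.compare_digest(slot.pin, pin):
                    matched = slot
                else:
                    slot.failures += 1
            if matched is None:
                return False
            for slot in self.slots:
                slot.failures = 0      # success clears the counter on every slot
            self.active = matched
            return True

        def session_key(self) -> bytes:
            # Whichever AES key lives in the matched slot is used for this session
            return self.active.key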

To create a duress PIN: initialize a few slots, use one for routine activity and all the others as cover, to be disclosed when demanded by authorities. The AES key in the former slot will protect data such as the disk containing information about Tiananmen Square. All of the other keys are unused, but fully initialized inside the card and available for use as an AES key. If the authorities compel disclosure of a PIN, the dissident provides one of the cover PINs. If that PIN is used to decrypt some ciphertext, the applet will report that the PIN is correct but proceed to use a key completely unrelated to the one that created the ciphertext. Result: junk returned as the output of decryption. But crucially for our purposes, it will be convincing junk. Attempting to decrypt the same ciphertext twice returns the same output. Encryption and decryption are inverse operations as expected.
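
The “convincing junk” property is easy to check for yourself. The snippet below uses the pyca/cryptography package (an assumption about tooling; any AES implementation behaves the same way): decrypting under an unrelated key yields deterministic output that still round-trips through encryption.

    import os
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    def aes_cbc(key: bytes, iv: bytes, data: bytes, decrypt: bool = False) -> bytes:
        cipher = Cipher(algorithms.AES(key), modes.CBC(iv))
        op = cipher.decryptor() if decrypt else cipher.encryptor()
        return op.update(data) + op.finalize()

    real_key, cover_key, iv = os.urandom(16), os.urandom(16), os.urandom(16)
    ciphertext = aes_cbc(real_key, iv, b"16-byte secrets!")

    # Decrypting under the cover key yields junk, but the same junk every time...
    junk = aes_cbc(cover_key, iv, ciphertext, decrypt=True)
    assert junk == aes_cbc(cover_key, iv, ciphertext, decrypt=True)
    assert junk != b"16-byte secrets!"

    # ...and encryption/decryption remain inverse operations under the cover key
    assert aes_cbc(cover_key, iv, junk) == ciphertext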

The plausible deniability comes from the fact that all slots are treated identically. Unlike naive designs where there is a distinction between “real” PIN and “duress” PIN, this applet maintains a uniform collection of multiple symmetric keys and associated PINs. The applet itself has no concept of which of them are reserved as duress PINs, as the code paths are identical, modulo the randomized order in which a PIN is checked against every slot. The randomization helps avoid any lingering suspicion based on the order in which slots were initialized. For example it may be a common pattern that card-holders first select their ordinary PIN before creating the duress PIN. If slots were checked in a fixed order, the time taken to verify the PIN could leak information about whether it was the first or second PIN that checked out.

Why the allowance for more than 2 slots? Not having an upper bound improves on plausible deniability. Since we are dealing with an autocratic regime, we assume the authorities are aware of opsec capabilities available to political dissidents, including this particular solution. So they have a priori reason to suspect the target may have configured an additional duress PIN on her card, in addition to the real one that she uses for decrypting her drives. If there were exactly 2 slots, the authorities could insist that she disclose a duress PIN and her only recourse would be to argue she only initialized one slot. With an unbounded number of keys and associated PINs, the card-holder is free to “confess” to as many PINs as she would like. Meanwhile the authorities’ position shifts from suspecting the existence of a duress PIN— somewhat warranted under the circumstances— to wondering if there is one more PIN than she has disclosed.

In theory similar ideas can be applied for a card managing public/private-key pairs, instead of symmetric AES keys. However a core assumption underlying this model is likely to fail in that context. By design, public-keys are meant to be, well, public. In fact their utility relies on the public-key being available to everyone the owner may interact with. Sending encrypted email to a recipient requires knowing their public key, as does verifying the authenticity of a digitally signed piece of email originating from that person. This means that in most realistic scenarios disavowing ownership of a public key is much more difficult. Chances are the authorities knocking on the dissident’s door already know she has a particular public key and they are looking for its private counterpart. In fact cryptographic hardware often carries both the public and private key, with the public part freely retrievable without so much as entering a PIN. For example in both PIV and GIDS, certificates can be enumerated without authentication. Such discovery capabilities are essential for the card to be usable by software without any other context; otherwise the middleware can not determine whether it is dealing with an RSA or ECDSA key for example.

Nevertheless there are some options for implementing a duress PIN for standard PKI-capable cards when the results of the private key operation are used online— in other words submitted to a remote server. The last two posts in this series will explore ways to leverage algorithm-specific quirks in implementation for that purpose.

[continued]

CP

Designing a duress PIN: plausible deniability (part II)

[continued from part I]

Deniability

While the design sketched above lives up to the spirit of a duress PIN, it has one major problem: the behavior of the duress PIN is easily distinguished from that of the regular PIN. Cards can get into terminated state due to accidental bugs (early versions of Google Wallet ran into this with the NXP-sourced secure element) or in response to deliberate tampering with the security boundary. However the immediate link between a failed PIN entry and the card starting to return a distinct error code for every command is too obvious.

This creates a problem in scenarios where the adversary can retaliate for having been supplied incorrect information. To use the cliched example: if the cardholder has a gun to their head, having volunteered the wrong PIN and permanently disabled the card in the process is unlikely to result in a good outcome. On the other hand, it may be an acceptable response in cases involving disclosure compelled by an employer or even law enforcement. [It goes without saying: This is not legal advice.] US case law is ambiguous on whether citizens can be forced to provide decryption keys safe-guarding their data. A defendant who volunteers the duress PIN and afterwards declares that no further disclosure is possible due to permanently bricked hardware would make for an interesting case pitting fifth-amendment scholarship against the more immediate concern around obstruction of justice.

Moving beyond courtroom drama, we can ask whether it is possible to make duress PIN operation less obvious. Is it possible to have a modicum of plausible deniability when the owner states: “I gave you the correct PIN; I have no idea why the card is not working”? Let’s walk through some variations and see how each one falls short of the goal.

Take #2: Feign correct PIN & fail probabilistically

Instead of immediately terminating the card, the applet could simply feign success while setting a special flag. When this flag is set, every subsequent command has some fixed probability of triggering permanent lockup, for example by setting the card into terminated state. If the probability is set at 10%, on average the card will become unusable after ten additional commands. This provides some plausible deniability in that PIN entry appears to succeed and the eventual hardware failure is separated from the PIN entry event. The downside is the card will oblige by performing a few operations at the behest of the adversary before the laws of probability kick in.

Whether or not that is a problem depends on the use-case. For example, if the card holds an email encryption key, the adversary may successfully decrypt a handful of messages of their choosing before the simulated failure kicks in. That may be an acceptable risk if there are thousands of messages and no single one is particularly valuable. On the other hand, there are scenarios where a single private-key operation can be critical. For example, in cryptocurrency storage a single digital signature with the right private-key may be sufficient to move large amounts of money into the control of an adversary. That means it is not always safe to count on delayed lock-up after the duress PIN is entered.

The other problem is that while there is plausible deniability, it is not very persuasive. Unless one is unwisely relying on security-through-obscurity, we must assume the adversary is familiar with the hardware/software combination in use, including the fact that these applets include an optional duress PIN feature. Given that knowledge, the suspicion falls back on the user. Modern smart-cards are highly reliable and rated for tens of thousands of operations. The odds of a card going into terminated state due to normal wear & tear are negligible compared to the odds that such a failure was deliberately triggered by the application. Bayesian reasoning puts the blame squarely on the cardholder.

Take #3: Feign correct PIN & return bogus results

As before, the PIN verification logic reports success when provided the duress PIN. The applet switches its internal state to “authenticated” and allows the owner to invoke functionality such as exercising cryptographic functions that require prior authentication. But the applet also keeps track of which PIN has been verified in transient memory, e.g. RAM. When it is asked to perform some cryptographic operation such as signing supplied data or decrypting a ciphertext, its behavior is conditioned on exactly which PIN has been validated. When using the regular PIN, calculation results are output verbatim. If the duress PIN was used, random noise is returned instead.

Why bother running the operation at all if results are going to be replaced with random junk anyway? This is to prevent side-channel observations based on timing. Recall that card ICs are not particularly fast and private-key operations such as RSA signing can take a fraction of a second, a delay that is noticeable even on human perception scales without the benefit of a high-resolution timer. In fact side-channel attacks are problematic for many duress PIN implementations. For example, verifying a PIN involves writing to permanent storage— even if the PIN is correct. That means externally observable properties such as power consumption can give away whether one or two PIN checks are taking place.

While we need to confront this problem of side-channels more carefully in subsequent iterations, in this case there is no reason to attach oscilloscopes to anything. Plausible deniability breaks down for a simple reason: an applet returning bogus results for cryptographic operations is itself highly suspicious. First it is easy to detect that the results are incorrect. For example, if the applet is responsible for securing a private key, there is an  associated public key and it is safe to assume that public-key is known to the adversary. As such it is very easy to verify if the card performed a signature or decryption correctly. For signatures, ask the applet to sign a known message and use the corresponding public-key to check that signature. In the case of encryption keys, use the public-key to encrypt a known message and ask the card to decrypt it.
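
A few lines of Python with the pyca/cryptography package (again an assumption about tooling; card communication details are omitted) show how trivially the adversary can make that determination:

    import os
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import padding, rsa

    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_key = private_key.public_key()        # assumed known to the adversary
    challenge = b"message chosen by the adversary"

    genuine = private_key.sign(challenge, padding.PKCS1v15(), hashes.SHA256())
    bogus = os.urandom(len(genuine))             # what a duress-mode applet might return

    public_key.verify(genuine, challenge, padding.PKCS1v15(), hashes.SHA256())  # passes silently
    try:
        public_key.verify(bogus, challenge, padding.PKCS1v15(), hashes.SHA256())
    except InvalidSignature:
        print("output does not verify against the known public key")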

It is conceivable for smart-cards to experience hardware failures and start outputting bogus results from ordinary wear & tear. (Keep in mind, good implementations have additional checks against that failure mode. After finishing a private key operation, they verify the result before releasing it out of the card to guard against specific attacks. There is a large body of literature on fault-injection attacks that shows how easily secret keys can be recovered from hardware by inducing certain errors— such as disturbing one out of two steps involved in an RSA private key operation— and observing the incorrect output.) Comparing the odds of such a failure occurring “organically” by bad luck versus being triggered deliberately by duress PIN entry, suspicion lands on the cardholder once again.

[continued]

CP

Design considerations for a duress PIN (part I)

From urban legends to smart-card programming

An urban legend dating back to the 1990s advises that if you are ever held at gunpoint to withdraw cash from an ATM, enter your PIN backwards. The ATM will still dispense the necessary cash to get you out of trouble, but it will also send an alert to law enforcement that a customer is having an emergency.

This story of the reversed PIN is of course bunk, as explained on Snopes and mainstream sources. But the general idea of a duress PIN or duress code is a real concept in information security. Informally, it refers to an optional feature for authentication mechanisms where there is more than one way to authenticate and some choices result in triggering an alarm to signal that authentication has taken place under coercion, such as the person being held at gunpoint. In this blog post we will review some options for implementing such a feature in a realistic setting, namely using smart-cards. While the word “card” may evoke the original ATM withdrawal scenario inspiring the legend, physical form factor is not the salient feature. As covered in previous posts here, often the same secure trusted-execution environments (TEE) powering cards can be repackaged in alternative shapes such as USB tokens or embedded secure elements. The common denominator is the presence of a TEE that can enforce specific rules even the legitimate owner can not work around.

Warm-up: online authentication with passwords

It turns out that the original bank withdrawal scenario is conceptually the simplest setting, as long as the PIN is being checked “online.” By that we mean the PIN entered into the ATM keypad is transmitted to some centralized authentication system— recall that the interoperability requirements for banking mean that the card could have been issued by a different financial institution half-way around the country. (The alternative would be offline mode where the card itself is verifying the PIN, to cope with temporary loss of connectivity to the network. This is increasingly rare nowadays.) In that scenario the bank issuing the card could easily have implemented a duress PIN, by allowing customers to choose a second credential alongside their standard PIN. That alternative PIN would still be accepted for authentication while triggering alarms in the background. For example, it may notify bank personnel who in turn reach out to local law-enforcement agencies near the ATM to check on the location.

Crucial to the customer’s safety is that none of this background activity be apparent to the cardholder— and even more importantly, to the presumed attacker watching over their shoulder. Flashing red alerts on the ATM screen stating that the police are en route is exactly the wrong outcome: it places the person being coerced in greater danger. In the ideal scenario, the attacker can not distinguish between the use of the real credential and the duress PIN. There may be subtle changes in behavior as long as the attacker can not detect them. For example, the issuing bank could present the appearance of an artificially low balance on the account or lower cash-withdrawal limits to bound potential losses. But it is crucial that customers have plausible deniability, which is difficult to guarantee in all circumstances. If the adversary knows that a certain person maintains a high balance and the ATM shows only a few dollars available for withdrawal, they could infer a duress PIN was used and retaliate.

These principles translate to web authentication in a straightforward way: instead of multiple PINs, online authentication systems could allow users to have multiple passwords and designate some of those for use in duress situations. This would not make sense for most consumer-oriented websites, since they lack the 24/7 security operations required to respond to duress signals or enough information about customers’ whereabouts to meaningfully escalate matters to law enforcement. By contrast enterprise authentication systems are better suited to take advantage of duress credentials. Consider the traveling employee conundrum. It is common for enterprises to cut off all access when a team member is traveling to regions with a reputation for industrial espionage. There is a high risk that the employee may be instructed to disclose their corporate credentials or compelled to access company resources at the behest of government authorities, possibly under the guise of security screening at the airport. Removing privileged access in those situations helps both the company and the employee in question— they stop being a target for espionage, assuming attackers are aware of the policy.

Duress credentials can extend a similar level of protection to settings where coercion is unexpected. For example a VPN service can automatically restrict access to internal resources depending on which password is entered. If an employee is asked to give up credentials, they can provide the duress password to formally comply with the request while silently alerting their organization of the situation.

Designing for offline usage

Implementing a duress PIN with an offline device at first looks deceptively simple. Let’s take the example of a card compliant with the Global Platform standard, and programmed using the JavaCard environment. Supporting libraries for this framework conveniently include a reusable PIN implementation, designed to lock out after a configurable number of tries— this is how standard policies such as “5 strikes & you are done” are implemented. Meanwhile Global Platform defines the lifecycle of a card, including the states LOCKED and TERMINATED. In both cases, standard functionality on the card becomes inaccessible. The main difference is that the “terminated” state is irreversible. During the installation of applets on a card (recall this requires access to “card manager” secret keys) an application can be granted card-lock and/or card-terminate privileges.

Putting all this together, here is a naive attempt at a duress PIN implementation (a rough code sketch follows the list):

  • Change the applet to maintain 2 PIN objects, one “real” and one for duress scenarios.
  • Initially the duress PIN will mirror the ordinary PIN. When the regular PIN is initialized, the duress PIN will be set to the same value. This is effectively a compatibility mode, and amounts to not having the duress PIN functionality enabled out of the gate.
  • Introduce an extension to the applet interface that allows changing the duress PIN only, after first authenticating with the regular PIN. By setting the duress PIN to a different value than the regular PIN, the cardholder activates the feature. (This extension could be a new APDU or more likely a different P1/P2 parameter passed to the existing CHANGE REFERENCE DATA command typically used for updating PINs.)
  • Install the applet with card-terminate privileges
  • As before PIN verification is required before performing a sensitive operation— such as using a cryptographic key stored on-board to digitally sign a message. That logic is modified as follows:
    1. First check the regular PIN. With the standard OwnerPIN object, that implicitly includes checks for lockout and either clears/increments the failure count depending on whether the supplied value was correct.
    2. If and only if the regular PIN check fails, also check against the duress PIN
    3. If duress PIN check succeeds, set card-lifecycle state to TERMINATED. (Just to be safe, one could also overwrite important secret material in EEPROM or flash storage, to defend against future intrusive attacks against the hardware substrate.)
    4. If the duress PIN check fails, simply clear its failed-attempt count. We do not want the duress PIN to accidentally get into lockout state, since most incorrect PIN entries are accidental fat-fingering, not deliberate attempts to trigger self-destruct.
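
Here is a rough host-side Python rendering of that flow. The Pin class and the terminated flag are simplified stand-ins for the JavaCard OwnerPIN object and the Global Platform TERMINATED state, so treat this as annotated pseudocode rather than applet code.

    import hmac

    class Pin:
        """Simplified stand-in for the JavaCard OwnerPIN object."""
        def __init__(self, value: bytes, max_tries: int = 5):
            self.value, self.max_tries, self.tries_left = value, max_tries, max_tries

        def check(self, supplied: bytes) -> bool:
            if self.tries_left == 0:
                return False                        # locked out
            if hmac.compare_digest(self.value, supplied):
                self.tries_left = self.max_tries    # success clears the retry counter
                return True
            self.tries_left -= 1
            return False

        def reset_counter(self) -> None:
            self.tries_left = self.max_tries

    class NaiveDuressApplet:
        def __init__(self, regular_pin: bytes, duress_pin: bytes):
            self.regular, self.duress = Pin(regular_pin), Pin(duress_pin)
            self.terminated = False                 # stands in for the GP TERMINATED state

        def verify(self, supplied: bytes) -> bool:
            if self.terminated:
                raise RuntimeError("card is TERMINATED")
            if self.regular.check(supplied):        # step 1: regular PIN first
                return True
            if self.duress.check(supplied):         # step 2: only after the regular check fails
                self.terminated = True              # step 3: irreversible self-destruct
                raise RuntimeError("card is TERMINATED")
            self.duress.reset_counter()             # step 4: never lock out the duress PIN
            return False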

While this basic design works, it falls short of the goal in one important aspect: plausible deniability.

[continued in part II]

CP

Blame it on Bitcoin: ransomware and regulation [part II]

Full disclosure: This blogger worked for a regulated US cryptocurrency exchange. All opinions expressed are personal

[continued from part I]

Minding the miners

Miners are arguably the most unwieldy aspect of the system for regulation. On the one hand, mining is highly centralized with a handful of pools located outside the US controlling the majority of bitcoin hash-rate. (Although the recent ban against mining in China may result in an exodus out of that region and perhaps diversify the geographic distribution.) On the other hand, it only takes one miner to make a transaction “official” by including it in a block. All other miners will continue to build on top of that block without judgment, piling on additional confirmations to bury the transaction deeper and deeper into immutable record in the public ledger. That does not bode well for attempts to censor ransomware payments. Even if all ransomware payment addresses were known ahead of time— itself a tall order, given the ease of creating new addresses and a motivated victim who wants the payment to succeed— it is difficult to see how regulatory pressure on miners could achieve sufficient coverage and prevent defectors from including the transaction when doing so would be in their economic interest.

Similar considerations apply to “blacklisting” ransomware addresses and attempting to prevent the crooks from spending their ill-gotten gains. Freezing ransomware funds after they are received by the perpetrators would require at least 51% of mining power agreeing to cooperate to the point of initiating small-forks every time a blacklisted transaction is mined by another miner outside the coalition. (For more on this, see previous blog post on “clean blocks” and censoring transactions.)

Returning to fiat: on-ramps and off-ramps

Notwithstanding enthusiasm about using Bitcoin for retail payments and the occasional short-lived publicity stunt— Tesla’s foray into accepting bitcoin comes to mind— most commercial transactions are still conducted in fiat. While ransomware perpetrators can collect bitcoin from their targets, they still need a way to convert those funds into dollars, euros or more likely rubles. That brings us back to cryptocurrency exchanges. They serve as the on-ramps and off-ramps into the cryptocurrency ecosystem and present an attractive “choke point” for implementing controls to stop criminals from converting ill-gotten gains into universally accepted fiat currency.

But the same regulated vs off-shore dichotomy complicates this scheme. Regulated exchanges are already incentivized to turn away organizations with dubious sources of funds. They implement robust KYC/AML programs to weed out such applicants during on-boarding and continue to monitor for unusual activity, filing CTRs and SARs to alert applicable authorities. The whole point of a compliance department is turning away paying customers when they pose too high a risk, giving up short-term revenue in exchange for the long-term health of the business. Unregulated, off-shore exchanges have no such scruples. They are willing to take money from anyone with a pulse and look the other way (or, not bother looking at all) when those customers receive funds that can be traced to criminal activity. Examples:

  • In some cases the willful negligence is an open secret. BTC-e used to rank in the top five of all exchanges in BTC/USD volume. In defiance of the law-of-one-price, bitcoin consistently traded at a lower price there than at other major exchanges, hinting at a captive audience with nowhere else to go for cashing out their bitcoin. That mystery was explained when BTC-e was shut down by authorities in 2017, with the founders charged with helping launder stolen funds from Mt Gox.
  • The blockchain analytics firm Chainalysis noted that in 2019 over one-fourth of illicit bitcoin went to Binance. (No surprise that IRS & DOJ are investigating Binance.)
  • In another fine example of investigative journalism, CyberNews posed as a willing accomplice to join a ransomware group and found out the syndicate had access to an insider at an unnamed exchange:

    “Apparently, the cybercriminals had an insider contact at a cryptocurrency exchange who specialized in money anonymisation and would help us safely cash out (and maybe even launder) our future ransom payouts.”

These types of venues are the ideal place for criminal organizations to patronize when it comes to cashing out ransom payments. It would make no sense for DarkSide operators to trade on a regulated exchange such as Coinbase. Even if they managed to get past the onboarding process and transfer bitcoin for sale, there is a high risk their account may be frozen at any point and all funds seized at the behest of US authorities.

The challenge with controlling on/off-ramps into cryptocurrency then is one of jurisdictional reach and enforcement. Raising the bar on existing KYC/AML programs will certainly drive marginal improvements from already compliant exchanges: they may turn away a few more customers from the onboarding queue or file a few more SARs based on tracing blockchain activity. Meanwhile unregulated exchanges will continue to operate under the assumption that they can continue to ignore the new rule-making, relying on the presumed safety of their offshore location and the fiction of not serving US customers (at least US customers who are not savvy enough to use a VPN).

The good news is both problems are actionable: BTC-e was taken down after all, even though it was ostensibly headquartered in Russia. BitMEX is based in the Seychelles and claims to not serve US customers. That has not stopped the US Attorney for the SDNY from charging BitMEX executives with violations of the Bank Secrecy Act. There is a good reason for the spotlight to be on cryptocurrency exchanges as an ally in combatting ransomware. If victims can not be prevented from initiating the cycle by paying up, the next best opportunity is to prevent those funds from being converted into fiat. In other words: turn the crooks into involuntary HODLers. (This strategy assumes cryptocurrency will remain primarily a store of value, in other words an inflation hedge or digital gold. If cryptocurrency becomes an efficient method of exchange where a meaningful chunk of commercial transactions can be carried out without taking the “off-ramps” back into fiat, confining criminals to bitcoin will stop being a meaningful strategy.) But that purpose is best served by extending the reach of existing laws on the books to cover offshore exchanges when their involvement in ransomware creates negative externalities that spill over across jurisdictions.

CP

Blame it on Bitcoin: ransomware and regulation [part I]

[Full disclosure: this blogger worked for a regulated US cryptocurrency exchange]

The disruptive ransomware attack on Colonial Pipeline and subsequent revelations of an even larger ransom paid earlier by the insurer CNA has renewed calls for increased regulation of cryptocurrency. Predictably, an expanding chorus of critics has revived the time-honored “blame-it-on-Bitcoin” school of thought. This post takes a closer look at how additional regulation may impact ransomware. Coincidentally following the “pipeline” model of Colonial, we will look at the flow of ransomware funds from their origin to the recipient and ask how unilateral action by regulators could successfully cut off the flow. 

Here is a quick recap on the flow of funds in the aftermath of a ransomware attack:

  1. The business experiencing the ransomware attack decides that paying the ransom is the most effective way of restoring operations
  2. They contract with a third-party service to negotiate with the perpetrators and facilitate payment. (Some organizations may choose to handle this on their own but most companies lack know-how in handling cryptocurrency.)
  3. Bitcoin for payment is sourced, typically from a cryptocurrency exchange
  4. Funds are sent to the recipient by broadcasting a Bitcoin transaction. Miners confirm the transaction by including it in a block
  5. Perpetrators convert their Bitcoin into another cryptocurrency or fiat money, also by using a cryptocurrency exchange

What can be accomplished with additional regulation for each step?

Victims: the case against capitulation

Some have argued that the act of paying the ransom could be illegal depending on the country where perpetrators are based. Regardless of whether it is covered by existing laws on the books, there is an economic case for intervention based on the “greater good” of the ecosystem. While paying up may be the expedient or even optimal course of action for one individual victim in isolation, it creates negative externalities downstream for other individuals. For starters, each payment further incentivizes similar attacks by the same threat actor or copycat groups, by proving the viability of a business model built on ransomware. More importantly it provides direct funding to the perpetrator which can be used to purchase additional capabilities— such as acquiring zero-day exploits on the black market— that enable even more damaging attacks in the future. There is a spectrum of tools from economic theory for addressing negative externalities: fines, taxation and more creative solutions such as cap-and-trade for carbon emissions. In all cases, the objective is to reflect externalities back on the actor responsible for generating them in the first place so they are factored into the cost/benefit analysis. For example companies that opt to pay the ransom may be required to contribute an equivalent amount to a fund created for combatting ransomware. That pool of funds would be earmarked to support law enforcement activities against ransomware groups (for example, taking down their C&C infrastructure) or to directly invest in promising technologies that can help accelerate recovery for companies targeted in future attacks.

Middlemen: negotiators and facilitators

Extending the same logic to intermediaries, the US could impose additional economic costs on any company profiting from ransomware activity. Even as unwitting participants, these intermediaries have interests aligned with ransomware actors: more attacks and more payments to arrange, more business for the negotiators.

Granted similar criticism can be leveled at the information security industry: more viruses, more business opportunities for antivirus vendors hawking products by playing up fears of virus infections destroying PCs. Yet few would seriously argue that antivirus solutions are somehow aiding and abetting the underground malware economy. Reputable AV companies can earn a living even when their customers suffer no adverse consequences— in fact that is their ideal steady state arrangement. AV is a preventive technology aimed at stopping malware infections before they occur, not arranging for wealth transfer from affected customer to perpetrator after the fact.

To the extent a ransomware negotiation or payment facilitator service exists as a distinct industry segment, it derives its revenues entirely from successful attacks. This is the equivalent of a mercenary fire-department that only gets paid each time they put out a fire. While these firemen may not take up arson on the side, their interests are not aligned with homeowners they are ostensibly protecting. Real life fire-departments care about building codes and functioning sprinklers because they would like to see as few fires as possible in their community. Our hypothetical mercenary FD has no such incentive, and prefers that the neighborhood burn down frequently, with the added benefit that unlike real firefighters, they are taking on no personal risk while combatting blazes. Even if we are willing to tolerate such a business as necessity (because in the online world there is no real equivalent to the community supported fire-department to save the day) we can impose additional costs on these transactions to compensate for their externalities.

Marketplaces: acquiring cryptocurrency

Moving downstream and looking at the acquisition of bitcoin for the ransom payment, the regulatory landscape gets even more complicated. There are dozens of venues where bitcoin can be purchased in exchange for fiat. Some are online such as Coinbase, others operate offline. Until 2019 the exchange LocalBitcoins arranged for buyers/sellers to meet in real-life and trade using cash. Some exchanges are regulated and implement KYC (Know-Your-Customer) programs to verify real-world identity before onboarding new customers. These exchanges are selective in who they are willing to admit, and they will screen against the OFAC sanctions list. Other exchanges are based off-shore, ignore US regulations and are willing to do business with anyone with a heartbeat. There are even decentralized exchanges that operate autonomously on blockchains, but these are typically only capable of trading cryptocurrencies against each other. They can operate in fiat indirectly using stablecoins (cryptocurrencies designed to track the price of a currency such as dollars or euro) but that does not help a first-time buyer such as Colonial starting out with a bundle of fiat.

It is difficult to see how additional regulation could be effective in cutting access to all imaginable avenues for a motivated buyer intent on making a ransomware payment. There is already self-selection in effect when it comes to compliance. Regulated exchanges do not want to be involved in ransomware payments in any capacity, not even as the unwitting platform where funds are sourced. While the purchase may generate a small commission in trading-fees, the reputational risk and PR impact of making headlines for the wrong reason far exceeds any such short-term gain. On the other hand, it is difficult to see how exchanges can stop an otherwise legitimate customer from diverting funds acquired on platform for a ransomware payment. First, there is no a priori reason to block reputable US companies— such as Colonial or CNA— from trading on a cryptocurrency exchange under their authentic corporate identity. Considering that Tesla, Square and Microstrategy have included BTC in the mix for their corporate treasury holdings, it is not unexpected that other CFOs may want to jump in and start building positions. More importantly, buyers are not filling out forms to declare the ostensible purpose of their trade (“for ransomware payment”) when they place orders. Even if an exchange were to block known addresses for ransomware payments— and many regulated exchanges follow OFAC lists of sanctioned blockchain addresses— the customer can simply move funds to a private unhosted wallet first before moving them to the eventual payout address. On the other hand, exchanges can trace fund movements and kick out customers if they are found to have engaged in ransomware payments in any capacity. While this is a laudable goal for the compliance department, given the infrequency of ransomware payments, being permanently barred from the exchange is hardly consequential for the buyer.

Of greater concern is the game of jurisdictional arbitrage played by offshore exchanges including Binance— the single largest exchange by volume. These exchanges claim to operate outside the reach of US regulations based on their location, accompanied by half-hearted and often imperfect attempts at excluding US customers from transacting on their platform. The challenge is not one of having sufficient regulations but convincing these offshore exchanges that they are not outside the purview of US financial regulations.

Trying to hold other participants in the marketplace accountable for the trade makes even less sense; their involvement is even more peripheral than the trading platform. Trade execution by necessity involves identifiable counter-parties on the other side who received USD in exchange for parting with their bitcoin. But the identity of those counter-parties is a roll of the dice:  it could be a high-frequency trading hedge fund working as market-maker to provide liquidity, an individual investor cashing out gains on their portfolio or a large fund slowly reducing their long exposure to bitcoin. None of them have any inkling of what their counterparty will eventually do with the funds once they leave the exchange.

[continued – part II]

CP

Matching gifts with cryptocurrency: the fine-print in contracts (part II)

[continued from part I]

Avoiding contractual scams

Time to revisit a question glossed over earlier. While the smart-contract sketched above sounds good on paper, skeptical donors will be rightfully asking a pragmatic question: how can they be confident that a matching-gifts campaign launched by a sponsor at some blockchain address is in fact operating according to these rules? If the contract functions as described above, all is well. But what if the contract has a backdoor designed to divert funds to a private wallet, instead of delivering them to the nonprofit? Since the trigger for such malicious logic could be arbitrary, past performance is no guarantee of future results. For example, the contract may act honestly for the first few donations— perhaps those arranged by accomplices of the sponsor to help build confidence— only to start embezzling funds after a certain trigger is hit.

Transparency of blockchains goes a long way to alleviate these risks. In particular, the sponsor can publish the source code of the contract, along with the version of the Solidity compiler used to convert that code into low-level EVM byte-code. Automated tools already exist for verifying this correspondence; see Etherscan for examples of verified contracts. This reduces the problem of verifying contract behavior to source code auditing, which is somewhat more tractable than reverse engineering EVM byte-code. There are still shenanigans possible at source code level, as starkly demonstrated by the Solidity Underhanded Contest, a competition to come up with the most creative backdoor possible that can stay undetected by human reviewers. In practice there would be one “canonical” matching campaign contract, already audited and in widespread use, similar to the canonical multi-sig wallet contract. Establishing the authenticity of an alleged matching campaign boils down to verifying that a copy of that exact contract has been deployed. (There is an interesting edge-case involving the CREATE2 extension: until recently, Ethereum contracts were considered immutable. A contract at a given address could self-destruct but it could not be replaced by another contract. This is no longer the case for contract launched via CREATE2, so it is important to also verify that the contract was deployed using the standard, original CREATE instruction or alternatively that its initialization code has no external dependencies that may differ between multiple invocations.)

In addition to verifying contract source code, it is necessary to inspect parameters such as the destination address for the nonprofit receiving donations, committed match ratio (in case this is not hard-coded as one-for-one in code) and funding level of the contract.

Difficult case: Bitcoin

In contrast to Ethereum’s full-fledged programming language for smart contracts, Bitcoin has a far more limited scripting language to express spending conditions. This makes it difficult to achieve parity with the Ethereum implementation of a matching campaign. A more limited notion of “matching” can be achieved by leveraging different signature types in Bitcoin, but at the expense of reverting to all-or-none semantics. Similar to the prior art in Ethereum, the sponsor is only on the hook for matching donations if one or more other participants materialize with donations exceeding a threshold. Below that threshold, nothing happens.

There is also precedent for constructing this type of crowd-funding transaction. To make this more concrete, suppose the sponsor is willing to match donations up to 1 bitcoin to a specific charity. As proof of her commitment, the sponsor creates and signs a partial transaction:
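
Roughly speaking, that partial transaction pairs a single 1 BTC input against a single 2 BTC output:

    Input:   1 BTC from the sponsor's address (signed by the sponsor)
    Output:  2 BTC to the charity's published donation address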

As it stands, this TX is bogus: consensus rules require that the inputs provide an amount of funds greater than or equal to the outputs, with the difference going to miners as incentive to include the transaction. Since 1 < 2, this transaction can never get mined— as it stands. But this is where use of SIGHASH_ANYONECANPAY comes in; additional inputs can be added to the “source” side of the transaction, as long as outputs on the “destination” remain the same. This allows building the transaction up, layer by layer, with more participants chipping in with a signed input of their own, until the total inputs add up to 2 BTC— or ideally slightly more than 2 BTC to make room for transaction fees. Once that threshold is reached, the transaction can be broadcast.

Compared to the Ethereum case, this construction comes with some caveats and limitations. First the activity of building up to the full amount must be coordinated off-chain, for example using an old-fashioned website. It is not possible to broadcast a partial TX and have it sit in the mempool while collecting additional inputs. An invalid TX with insufficient funds will not be relayed around the network. This stands in contrast to Ethereum where all future donations can be processed on chain once the contract is launched. Second, the sponsor can bail out at any time, by broadcasting a different transaction that spends the source input in a different way. It’s not even considered a double-spend since there were no other valid transactions involving that input as far as the mempool is concerned. (While the input address can be constrained using time-locks in its redeem script, the same restriction will also apply to the donation. A fully funded TX will also get stuck and not deliver any funds to the nonprofit until the time-lock expires.)

Change is tricky

As sketched above, the arrangement also requires exactly sized inputs, because there is no meaningful way to redirect change. Consider the situation after a first volunteer pledges 0.9 BTC, leaving the campaign just 0.1 BTC away from the goalpost. If a second volunteer has a UTXO worth 0.2 BTC, they would first have to split off a separate 0.1 BTC output. Directly feeding in the 0.2 BTC UTXO would result in half the funds getting wasted as mining fees. The outputs are already fixed and agreed upon by previous signatures; there is no way for the last volunteer to redirect any excess contribution to a change address. This can be addressed using a different signature scheme combining SIGHASH_ANYONECANPAY and SIGHASH_SINGLE. This latter flag indicates that a given input is signing only its corresponding output, rather than all outputs. That allows each donor (other than the sponsor) to also designate a change address corresponding to their contribution, in case they only want to donate a fraction of one of their UTXOs. Unfortunately this arrangement also allows the sponsor to abscond with funds. Since SIGHASH_SINGLE means individual donors are not in fact validating the first output— ostensibly going to the nonprofit— a dishonest sponsor can collect additional inputs, switch the first output to send 2 BTC to a private wallet and broadcast that altered transaction.

A variant of that problem can happen even with an honest sponsor and unwitting contributors racing each other. Suppose Alice and Bob both come across a partially signed transaction that has garnered multiple donations, but has fallen 0.1 BTC short of the goal to trigger the matching promise. Both spontaneously decide to chip in 0.1 BTC to push that campaign across the finish line. If they both sign using SIGHASH_ANYONECANPAY and attempt to broadcast the now valid transaction, there is an opportunity for an unscrupulous miner to steal funds. Instead of considering these conflicting TX as double-spends and only accepting one, an opportunistic miner could merge contributions from Alice and Bob into a single TX. Since both signatures only commit to outputs but expressly allow additional inputs, this merge will not invalidate signatures. The result is a new TX where the input side has 0.1 BTC excess, which will line the miners’ pockets as excess transaction fee instead of reaching the charitable organization. One mitigation is to ensure that anyone adding the “final” input that will trigger the donation uses SIGHASH_ALL to cover all inputs, preventing any other inputs from being merged. The problem with that logic is it assumes global coordination among participants. In a public campaign, typically no one can know in advance when the funding objectives are reached. (Suppose the campaign was 0.2 BTC short of the goal and three people each decide to chip in 0.1 BTC, each individually assuming that the threshold is still not met after their contribution.)

For this reason, this construction is only suitable for “small group matching”— a single, large donation in response to a pledge for a comparable amount from the sponsor. Alice creates the original 1 → 2 transaction pledging one-for-one matching, Bob adds his own exact 1 BTC input and signs all inputs/outputs prior to broadcasting the transaction. If Carol happened to be doing the same, these two transactions could not be merged and either Bob or Carol’s attempt would fail without any loss of funds. For now the construction of a more efficient structure for incrementally raising funds with a matching campaign on Bitcoin remains an open problem.

CP

Matching gifts with cryptocurrency: scripting for a good cause (part I)

On-chain philanthropy

Cryptocurrencies are programmable money: they allow specifying rules and conditions around how funds are transmitted directly in the monetary system itself. Instead of relying on contracts and attorneys, blockchains can encode policies that were previously written in legalese. For example one can earmark a pool of funds to be locked until a certain date, require two signatures to withdraw and only be sent to a specific recipient. (The last one is only possible with Ethereum and similarly expressive blockchains. Such covenants are not expressible in the rudimentary scripting language of Bitcoin yet, although extensions are known that would make it possible.) While the recent rise of ransomware and incidents such as the Colonial Pipeline closure have put the spotlight on corrosive uses of cryptocurrency— irreversible payments for criminal activity— the same capabilities can also be put to more beneficial uses. Here we explore an example of implementing a philanthropic campaign using Ethereum or Bitcoin.

Quick primer on matching gifts. “Gift” in this context refers to donations made to a charitable organization. A matching campaign is a commitment by an entity to make additional contributions using its own funds, in some proportion to every donation received by the nonprofit, subject to some limits and qualifications. For example it is very common for large companies in America to offer 1:1 matching for donations to 501c3 organizations. (501c3 is a reference to the section of the US tax code granting special recognition to nonprofits that meet specific criteria, and allowing their donors to receive favorable tax treatment.) As a data-point: in the early 2000s MSFT offered dollar-for-dollar matching up to $12,000 per year per full-time employee. [2] Such corporate matching policies are ongoing. Other campaigns may be one-time. For example a philanthropist may issue a one-time challenge to the leadership of a nonprofit organization, offering to double the impact of any funds received during a specific time window.

Hard-coded, immutable generosity

Blockchains are good at codifying this type of conditional logic— “I will contribute when someone else does.” To better illustrate how such commitments can be implemented, we will consider two different blockchains. First up is Ethereum, which is the easy case thanks to its Turing-complete programming language. The second is Bitcoin, where the implementation gets tricky and somewhat kludgy.

In both cases there are some simplifying assumptions to make the problem more tractable for a blog post:

  • This is a one-time campaign focused on one specific charitable organization. That side-steps certain problems, including reverse-mapping blockchain addresses to 501c3 organizations and trying to decide whether a given transfer qualifies for the campaign.
  • The nonprofit in question has a well-known blockchain address for receiving donations. This applies to more and more organizations as they partner with intermediaries for accepting cryptocurrency donations or directly publish such addresses on their website. For example Heifer International advertises addresses for Bitcoin, Ethereum, Litecoin, Stellar and Ripple.
  • There is a cap on total funds available for matching but no per-donor quotas. Otherwise we would have a difficult problem trying to decide when a given participant has reached their quota. It is trivial to create an unbounded number of blockchain addresses, and there is no easy way to infer whether two seemingly independent donations originating from different addresses were in fact associated with the same entity.

Easy case: Ethereum

Recall that the objective is a matching campaign enforced by blockchain rules. Specifically we want to move beyond solutions that involve continuous monitoring and active intervention. For example, an outside observer could watch for all transactions to the nonprofit donation address and publish additional transactions of equivalent amount corresponding to each one. That would be a direct translation of how matching campaigns work off-chain: the donor makes a contribution and then proves to the campaign sponsor that they made a donation of so many dollars, usually by means of a receipt issued by the nonprofit. After verifying the evidence, the sponsor writes out a check of their own to the same organization.

While there is nothing wrong with carrying the same arrangement over to a blockchain, we can do better: in particular, the sponsor can make a commitment once such that they have no way to renege on the promise, whether by outright defection or by inadvertent failure to keep up their end of the bargain when it comes to writing checks in a timely manner. With Ethereum, the sponsor can create a smart-contract once and fund it with the maximum amount they are willing to match. Once the contract is launched, the remainder of the campaign runs on auto-pilot: immutable contract logic enforced by blockchain rules sees to it that every qualifying donation is properly matched.

This is hardly a novel observation; in fact there is at least one example of such a contract announced on Reddit and launched on-chain. Unfortunately the campaign does not seem to have gotten much traction since launch, and not a single wei has been passed over to the intended nonprofit recipient. Part of the problem is a design choice in the contract to set an all-or-nothing threshold. Similar to crowd-funding campaigns on Kickstarter, matching is conditioned on a threshold of donations being reached, after which the entire amount is matched. Here is an alternative design premised on processing and matching donations immediately as they arrive:

  • As before, the sponsor launches a smart-contract on Ethereum and funds it with the maximum amount of ETH pledged. It could also be funded with an ERC-20 token or a stablecoin such as GUSD to avoid the price volatility associated with cryptocurrencies.
  • Donors interested in taking advantage of the campaign send funds to the contract, not directly to the charitable organization. (This raises an important trust question that will be addressed later: how can they be confident the contract is going to work as promised instead of embezzling the funds?)
  • When incoming funds are received, either the “receive” or “fallback” function for the contract is invoked. That code will inspect the incoming amount and use one of the send/transfer/call functions to transfer twice the amount to the well-known address of the nonprofit. Note that each donation is processed individually and delivered immediately to the recipient. There is no waiting on fulfillment of some global conditions around total funds raised.
  • There is one edge case to address in processing incoming funds: what if the contract does not have sufficient funds left to match the full amount? The naive logic sketched above will fail and, depending on the transfer mechanism used, cause the entire transaction to be reverted, resulting in no changes other than wasted gas. An alternative is to match up to the maximum amount possible and still forward the entire donation. But one could argue that fails the fairness test: the sender was promised one-for-one amplification of their donation. Perhaps they would have redirected their funds elsewhere had they known a 100% match was no longer available. (This can happen due to contention between multiple donations, through no fault of either party. Suppose the contract has 3 ETH left for matching, and two people send 2 ETH contributions in the same Ethereum block. Depending on the order those transactions are mined into the block, one will be fully matched while the other will run into this edge-case.)
    A better solution is to match contributions up to the remaining amount in the contract and return any unmatched portion to the caller to reconsider their options; a Python sketch of this logic appears after the list. They can always resend funds directly to the nonprofit with the understanding that no match will be forthcoming, or they can wait for another opportunity. This means that once the contract runs out of funds, all incoming donations bounce back to their senders. (Of course the sponsor can always resuscitate the campaign with a fresh injection of capital to accommodate higher than expected demand. But the arrival of such additional funding can not be enforced by the contract.)
  • Finally the smart-contract can also impose a deadline such as 1 month for the campaign to avoid sponsor funds being locked up indefinitely. The end-date for the campaign must be specified when the contract is created and remain immutable, to prevent the sponsor from bailing out earlier. Once the deadline has elapsed, the sponsor can call a function on the contract to withdraw remaining funds. After that point, all donation attempts will bounce. This is preferable to simply destroying the contract using the Ethereum self-destruct operation; if the contract were to disappear altogether, incoming donations would be black-holed and irretrievably lost.
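
Here is a minimal sketch, in Python rather than Solidity, of the donation-processing logic described in the list above. It elides gas, reentrancy protection and the actual transfer mechanics, and every name in it (MatchingCampaign, donate and so on) is invented for illustration:

    class MatchingCampaign:
        def __init__(self, matching_pool, deadline):
            self.pool = matching_pool   # sponsor funds still available for matching
            self.deadline = deadline    # immutable campaign end (a block timestamp)
            self.to_nonprofit = 0.0     # running total delivered to the charity

        def donate(self, amount, now):
            if now > self.deadline or self.pool == 0:
                return {"forward": 0.0, "refund": amount}   # campaign over or exhausted: bounce
            matched = min(amount, self.pool)                # match only what is left
            self.pool -= matched
            self.to_nonprofit += 2 * matched                # donor's matched portion plus the match
            return {"forward": 2 * matched, "refund": amount - matched}

    # Contention example from above: 3 ETH left for matching, two 2 ETH donations in one block.
    campaign = MatchingCampaign(matching_pool=3.0, deadline=1_700_000_000)
    print(campaign.donate(2.0, now=1))   # fully matched: forward 4.0, refund 0.0
    print(campaign.donate(2.0, now=2))   # only 1 ETH left: forward 2.0, refund 1.0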

The next post in this series will tackle the problem of establishing trust in such a smart-contract and the challenges of replicating the same arrangement using more primitive bitcoin scripts.

[continued – part II]

CP

[2] That may appear extremely generous or fiscally irresponsible, depending on your perspective: a public company with fifty-thousand blue-badge employees effectively signed up for a six-hundred-million dollar annual liability in the worst-case scenario. Given human nature and the lackluster reputation of the tech community for giving, actual expenditures never amounted to more than a small fraction of this upper bound— an ironic commentary for a company founded by the preeminent philanthropist of our generation.

Marathon, “clean mining” and Bitcoin censorship

The signal in Marathon virtue-signaling

Last week the Marathon mining pool generated plenty of controversy by carrying through on a promise to mine “clean” blocks. First, the adjective is misleading. Given the renewed focus on the energy consumption associated with Bitcoin, it would be natural to assume some environmental connotation, specifically using renewable sources to supply the massive amount of electricity required for producing that block. Instead, for Marathon the measure of block hygiene turns out to involve an altogether different yardstick: compliance with the OFAC list of sanctioned addresses. The Office of Foreign Assets Control is part of the US Treasury Department. It is responsible for maintaining lists of foreign individuals that US companies are barred from doing business with due to national security and foreign policy concerns. In other words, OFAC is the reason that El Chapo or GRU officers can not sign up for a credit card, open a savings account or apply for a mortgage with a regulated US bank.

OFAC has long taken an interest in cryptocurrencies. It has sanctioned blockchain addresses on Bitcoin, Ethereum and even privacy-friendly coins such as Monero and ZCash. Regulated US-based cryptocurrency companies already take these lists into account. For example they can block outbound transfers by their own customers (the classic scenario is stopping payments to ransomware operators) or freeze incoming transfers from blackballed addresses to prevent those funds from being laundered by trading into another asset. In that sense, there is nothing new about some blockchain addresses becoming “radioactive” in the eyes of financial institutions. Where Marathon has crossed into uncharted territory is applying these rules to mining new blocks, in a way that affects all participants globally. In the process, it opens a can of worms about the concentration of mining power and whether governments can exert influence on an allegedly decentralized system by squeezing a handful of key participants.

Meaningless gestures

First let’s start with the MARA pool. With an estimated share of less than 8% of total hashrate, this clean mining campaign is unlikely to make a dent. (Incidentally that estimate comes from a January Marathon press-release, as an upper bound on the hashrate achievable if 100% of its capacity were directed at bitcoin.) When Marathon wins the race to mint the next block— which happens 8% of the time on average— it may exclude a pending transaction that would otherwise have been eligible according to standard rules. But other miners remain unencumbered by that self-imposed rule and are happy to include the same TX next time around, when they win the race to mint a block. The crucial point is that once any miner includes the transaction, Marathon is stuck mining on top of that block. Quite ironically, Marathon is now helping the sanctioned actor, adding more confirmations to the verboten transaction and pushing it deeper into blockchain history. The net effect from the sender’s perspective is a slight delay: on average, sanctioned addresses will find their transactions taking slightly longer to confirm than other addresses. At 8% share, the difference is imperceptible. Even at 20% it would barely register. Recall that only the initial appearance in a block can be delayed by the Marathon “clean mining” policy. Once the transaction is included in any block by any other miner, additional confirmations arrive at the regular rate. Considering that most counterparties require 3-6 confirmations, the additional delay on the first block is negligible. This is a scenario where the intransigent minority can not enforce its rules on the majority.
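
A back-of-envelope model makes the point. Assume, as a simplification, that each block is won independently with probability equal to a miner’s hashrate share; then the only cost a censoring minority imposes is the expected wait until some non-censoring miner wins a block:

    def extra_first_confirmation_delay(censoring_share, block_minutes=10):
        """Expected extra wait, in minutes, until a non-censoring miner mines a block."""
        # Blocks until the first non-censoring winner follow a geometric distribution
        # with success probability (1 - censoring_share); subtract the one block the
        # transaction would have waited anyway.
        extra_blocks = 1 / (1 - censoring_share) - 1
        return extra_blocks * block_minutes

    for share in (0.08, 0.20, 0.50):
        print(f"{share:.0%} censoring hashrate -> ~{extra_first_confirmation_delay(share):.1f} extra minutes")
    # 8% -> ~0.9, 20% -> ~2.5, 50% -> ~10.0 extra minutes, and only on the first confirmation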

Corollary: Marathon “clean mining” is pure virtue signaling.

A conspiracy of pools?

What if additional pools jump on the clean mining bandwagon? Imagine a world where miners are divided into two camps. Unconstrained miners simply follow consensus rules and profit maximization when selecting which transactions to include in a block. Compliant miners, on the other hand, observe additional restrictions from OFAC or other regulatory regimes that result in the exclusion of certain transactions. At first this endeavor looks like a doomed enterprise regardless of the hash-rate commanded by the compliant miners. As long as any miner anywhere on earth is willing to include a transaction— including the proverbial lone hobbyist in his/her basement— it will eventually get confirmed. At best they can slow down its initial appearance. That is still problematic, in that it breaks the rule of fungibility: one bitcoin is no longer identical to another bitcoin. Some addresses are discriminated against when it comes to transaction speed. Still this looks like a minor nuisance, considering that funds will eventually move. Worst case, a sufficiently motivated actor can temporarily rent hash-power to mine their own censored transactions. That suggests any attempt at extending sanctioned address lists to mining is tilting at windmills unless it can achieve 100% coverage.

In fact total censorship can be achieved without control over all miners. Once the share of compliant miners approaches the 1/2 mark, the game dynamics shift. In effect compliant miners can execute a 51% attack against the minority to permanently exclude transactions. As before, consider the scenario where a miner who is unaware of OFAC rules, or deliberately running afoul of them, mines a block including a transaction from a sanctioned address. Compliant miners now have an option that is not available to Marathon: ignore that block and continue mining on a private fork without the undesirable TX. Unconstrained miners may have a head-start after having found the block first. But the majority will eventually catch up and become the longest chain, resulting in a chain reorganization that erases all traces of the censored transaction from bitcoin history.

Game-theory of righteous forks

Would regulated miners invoke that nuclear option? Initiating a fork to undo some disfavored transaction is an expensive proposition— even for the side guaranteed to win that fork battle. It is disruptive for the ecosystem too: consider that on both chains, block arrival times will slow down. Instead of 100% of hash-rate being applied to mining one chain, it is split between two forks, but with each fork having the difficulty level of the original chain. (Since these forks are expected to be resolved after a handful of blocks, difficulty adjustment will not arrive in time to compensate for the reduced hash-rate.) On the other hand, compliant miners may have no choice. Their actions are not driven by profit maximization alone; otherwise they would not be excluding perfectly valid transactions in the first place.
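
For a rough sense of that slowdown: until the next difficulty adjustment, each side of the fork finds blocks at a rate proportional to its share of the original hashrate. The figures below are illustrative, assuming a clean split and the usual 10-minute target:

    def fork_block_interval(hashrate_share, target_minutes=10):
        # Difficulty is unchanged, so expected block time scales inversely with hashrate.
        return target_minutes / hashrate_share

    for compliant in (0.6, 0.8):
        print(f"compliant fork ({compliant:.0%}): {fork_block_interval(compliant):.1f} min/block, "
              f"unconstrained fork ({1 - compliant:.0%}): {fork_block_interval(1 - compliant):.1f} min/block")
    # compliant 60%: 16.7 vs unconstrained 40%: 25.0 minutes per block
    # compliant 80%: 12.5 vs unconstrained 20%: 50.0 minutes per block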

In terms of financial incentives, there is a silver lining to being on the winning side of the fork: block rewards previously claimed by unconstrained miners are up for grabs again, as alternative versions of those blocks are produced. Since blockchain history is being revised, the coinbase rewards can be reclaimed by compliant miners who supplied the alternative version of blockchain history on the winning side. When history is rewritten by the victors, coinbase rewards belong to those responsible for the revisionism. This incentive alone could compensate the regulated majority for the trouble of having to initiate 51% attacks in the name of keeping the blockchain clean. On the other hand there will be unpredictable second-order effects, since miner behavior is visible to all observers. Heavy-handed chain reorganizations and censorship may result in a loss of confidence in the network and a corresponding depreciation of Bitcoin, hurting the miners’ bottom line again.

Unconstrained miners in the minority have even more to lose: not only will they waste resources mining on a doomed chain until the reorg, but they will also lose previously earned rewards for blocks that were replaced in the fork. A rational miner will seek to avoid that outcome. Going along with the majority is the path of least resistance, even if the miner has no ideological stake for or against the regulatory scheme in question. As long as the majority is committed to forking the block-chain at all costs in order to avoid running afoul of applicable regulations— the metaphorical gun to the head— game-theory predicts the minority will fall in line. There may be a handful of 51% attacks waged initially if unconstrained miners seek to test the resolve of the regulated ones. Ending up on the wrong side of such a fork will quickly reset the expectations of miners in the minority.

The tipping point may well arrive before 50%. Strategies such as selfish-mining allow smaller concentrations of hash power to attempt to hijack the chain. Compliant miners may even be compelled by regulation into temporarily withdrawing their hash power, or attempting a Hail Mary reorg for a limited number of blocks, before giving up and resuming work on the longest chain even if it contains a tainted transaction. While the minority is likely to lose most of these uphill battles, even a small chance of victory and the associated redistribution of coinbase rewards could motivate unconstrained miners to play it safe.

Choosing the regulators

What does this portend for the future of cryptocurrency regulation? Marathon may have indulged in this bit of meaningless virtue-signaling voluntarily, but it is a safe bet that regulators elsewhere are taking note. The concern is not about OFAC or the selection criteria used by the US Treasury for its sanctions. There is a very good argument to be made that ransomware operators, Russian election-meddling groups, ISIS terrorists or North Korean dictators should be cut off from every financial system, including those based on blockchains. Companies operating in that ecosystem in any capacity— exchange, custodian or miner— have a part to play in implementing those policies.

The problem is that US regulators are not the only ones drawing up lists of personae non gratae and declaring that transactions by those actors must forever be consigned to the memory pool. In fact the concentration of mining power in China— dramatically illustrated by the drop in hash-rate during a recent power outage— hints at a darker possibility: the CCP could point to the Marathon example to strong-arm mining pools into blacklisting addresses belonging to political dissidents, human-rights organizations, Uighur communities and Tibetan activists. Marathon was not under the gun and acted voluntarily. Mining pools in China may have a very literal gun pointed at their heads when instructed to comply with censorship rules.

CP

On CAPTCHAs and accessibility (part II)

[continued from part I]

Accessibility as a value system

Accessibility has always been part of the design conversation during product development on every MSFT team this blogger worked on. One could cynically attribute this to commercial incentives originating from the US government requirement that software comply with the Americans with Disabilities Act. Federal sales are a massive source of Windows revenue, and failing a core requirement that would keep the operating system out of that lucrative market is unthinkable. But the commitment to accessibility extended beyond the operating system division. Online services under the MSN umbrella arguably had an even greater focus on inclusiveness and making sure all of the web properties would be usable for customers with disabilities. As with all other aspects of software engineering, individual bugs and oversights could happen, but you could count on every team having a program manager with accessibility in their portfolio, responsible for championing these considerations during development.

Luckily it was not particularly difficult to get accessibility right either, at least when designing websites. By the early 2000s, standardization efforts around core web technologies had already laid the foundations with features specifically designed for accessibility. For example, HTML images have an alternative text or alt-text attribute describing that image in words. In situations where users can not see images, screen-reader software working in conjunction with the web browser can instead speak those words aloud. The World Wide Web Consortium had already published guidelines with hints like this— include meaningful alternative text with every image— to educate web developers. MSFT itself had additional internal guidelines for accessibility.

For teams operating in the brave new world of “online services” (as distinct from the soon-to-be-antiquated rich-client or shrink-wrap models of delivering software for local installation) accessibility was essentially a solved problem, much like the problem of internationalization (translating software into multiple languages) which used to bedevil many a software project until ground rules were worked out. As long as you followed certain guidelines, the obvious one being not to hard-code English text intended for users, your software could be easily translated for virtually any market without changing the code. In the same spirit, as long as you followed specific guidelines when designing a website, browsers and screen readers would take care of the rest and make your service accessible to all customers. Unless, that is, you went out of your way to introduce a feature that is inaccessible by design— such as visual CAPTCHAs.

Take #2: audio CAPTCHAs

To the extent CAPTCHAs are difficult enough to stop “offensive” software working on behalf of spammers, they also frustrate “honest” software that exists to assist users with disabilities in navigating the user interface. A strict interpretation of W3C guidelines dictates that every CAPTCHA image be accompanied by alternative text along the lines of “this picture contains the distorted sequence of letters X3JRQA.” Of course if we actually did that, spammers could cheat the puzzle, using automated software to read the solution straight out of the hint.

The natural fallback was an audio CAPTCHA: instead of recognizing letters in a deliberately distorted image, users would be asked to recognize letters spoken in a voice recording with deliberate noise added. Once again the trick is knowing exactly how to distort that soundtrack such that humans have an easy time while off-the-shelf voice recognition software stumbles. Once again, Microsoft Research came to the rescue. Our colleagues knew that simply adding white noise (aka Gaussian noise) would not do the trick; voice recognition had become very good at tuning that out. Instead the difficulty of the audio CAPTCHA would rely on background “babble”— normal conversation sounds layered on top of the soundtrack at slightly lower volume. The perceptual challenge here is similar to carrying on a conversation in a loud space, focusing on the speaker in front of us while tuning out the cacophony of all the other voices echoing around the room.
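
As a rough illustration of the idea (not the actual MSR implementation), mixing babble into a clean recording takes only a few lines of signal processing. The file names and the 0.7 mixing ratio below are made up, and the soundfile library is assumed to be available:

    import numpy as np
    import soundfile as sf   # any WAV read/write library would do

    # Assumes both clips are mono recordings at the same sample rate.
    letters, rate = sf.read("spoken_letters.wav")   # clean recording of the challenge letters
    babble, _ = sf.read("crowd_babble.wav")         # background conversation noise

    babble = babble[: len(letters)]                 # trim the noise to the same length
    mixed = letters + 0.7 * babble                  # babble slightly quieter than the letters
    mixed /= np.max(np.abs(mixed))                  # normalize to avoid clipping

    sf.write("audio_captcha.wav", mixed, rate)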

As with visual CAPTCHAs, there were various knobs for adjusting the difficulty level of the puzzles. Chastened by the weak security configuration of the original rollout, this time the team made more conservative choices. We recognized we were dealing with an example of the weakest-link effect: while honest users with accessibility needs are constrained to use the audio CAPTCHA, spammers have their choice of attacking either one. If either option is significantly easier to break, that is the one they are going to target. If it turned out that voice-recognition software could break the audio, it would not matter how good the original CAPTCHA was. All of the previous work optimizing visual CAPTCHAs would be undermined as rational spammers shifted over to breaking the audio to continue registering bogus accounts.

Fast forward to when the feature rolled out: the dreaded scenario did not come to pass. There was no spike in registrations coming through with audio puzzles. The initial version simply recreated the image puzzle in sound, but later iterations used distinct puzzles, which is important for determining in each case whether someone solved the image or audio version. But even when using the same puzzle, you would expect attackers to request a large number of audio puzzles if they had an automated break, along with other signals such as a large number of “near misses” where the submitted solution is almost correct except for a letter or two. There was no such spike in the data. Collective sigh of relief all around.
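
That “near miss” signal is straightforward to monitor for. A hypothetical sketch: flag wrong answers that differ from the solution by only a character or two, a pattern more typical of imperfect speech recognition than of a human guessing blindly. The function names and the two-error threshold here are invented:

    def edit_distance(a, b):
        """Levenshtein distance via the classic dynamic-programming recurrence."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            curr = [i]
            for j, cb in enumerate(b, 1):
                curr.append(min(prev[j] + 1,                 # deletion
                                curr[j - 1] + 1,             # insertion
                                prev[j - 1] + (ca != cb)))   # substitution
            prev = curr
        return prev[-1]

    def is_near_miss(expected, submitted, max_errors=2):
        return submitted != expected and edit_distance(expected, submitted) <= max_errors

    print(is_near_miss("X3JRQA", "X3JRQB"))   # True: off by a single letter
    print(is_near_miss("X3JRQA", "QWERTY"))   # False: just a wrong guess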

Calibrating the difficulty

Except it turned out the design missed in the opposite direction this time. It is unclear whether spammers even bothered attacking the audio CAPTCHA, much less whether they eventually gave up in frustration and violently chucked their copy of Dragon NaturallySpeaking voice-recognition software across the room. There is little visibility into how the crooks operate. But one thing became clear over time: our audio CAPTCHA was also too difficult for honest users trying to sign up for accounts.

It’s not that anyone made a conscious decision to ship unsolvable puzzles. On the contrary, deliberate steps were taken to control difficulty. Sound-alike consonants such as “B” and “P” were excluded, since they were considered too difficult to distinguish. This is similar to the visual CAPTCHA avoiding symbols that look identical, such as the digit “1” and letter “I,” or the letters “O” and “Q” which are particularly likely to morph into each other as random segments are added around the letters. The problem is that all of these intuitions about what qualifies as the “right” difficulty level were never validated against actual users.

Widespread suspicion existed within the team that we were overdoing it on the difficulty scale. To anyone actually listening to sample audio clips, the letters were incomprehensible. Those of us raising that objection were met with a bit of folk-psychology wisdom: while the puzzles may sound incomprehensible to our untrained ears, users with visual disabilities are likely to have a far more heightened sense of hearing. They would be just fine, this theory went: our subjective evaluation of difficulty is not an accurate gauge because we are not the target audience. That collective delusion might have persisted, were it not for a proper usability study conducted with real users.

Reality check

The wake-up moment occurred in the usability labs on the MSFT Redmond-West (“Red-West”) campus. Our usability engineer helped recruit volunteers with specific accessibility requirements involving screen readers. These men and women were seated in front of a computer to work through a scripted task as members of the Passport team stood by helplessly, observing from behind one-way glass. To control for other accessibility issues that might exist in registration flows, the tasks focused on solving audio CAPTCHAs, stripping away every other extraneous action from the study. Volunteers were simply given dozens of audio CAPTCHA samples calibrated for different settings, some easier and some harder than what we had deployed in production.

After two days, the verdict was in: our audio CAPTCHAs were far more difficult than we realized. Even more instructive were the post-study debriefings. One user said he would likely have asked a relative for help to complete registering for an account— and the worst way to fail customers is making them feel they need help from other people in order to go about their business. Another volunteer wondered aloud if the person designing these audio CAPTCHAs was influenced by John Cage and other avant-garde composers. The folk-psychology theory was bunk: users with visual disabilities were just as frustrated trying to make sense of these mangled audio clips as everyone else.

To be clear: this failure rests 100% with the Passport team— not our colleagues in MSFT Research who provided the basic building blocks. If anything, it was an exemplary case of “technology transfer” from research to product: MSR carried out innovative work pushing the envelope on a hard problem, handed over working proof-of-concept code and educated the product team on the choice of settings. It was our call to set the difficulty level high, and it was our cavalier attitude towards usability that green-lighted a critical feature absent any empirical evidence that real users could solve it, all the while patting ourselves on the back that accessibility requirements were satisfied. Mission accomplished, Passport team!

In software engineering we rarely come face-to-face with our errors. Our customers are distant abstractions, caricatured into helpful stereotypes by marketing: “Abby” is the home user who prioritizes online safety, “Todd” owns a small business and appreciates time-saving features, while “Fred” the IT administrator is always looking to reduce technology costs. Yet we never get to hear directly from Abby, Fred or Todd on how well our work product actually helps them achieve those objectives. Success can be celebrated in metrics trending up (new registrations, logins per day) and, less commonly, in metrics trending down (fewer password resets, less outbound spam originating from Hotmail). Failures are abstract, if not entirely out of sight. Usability studies are the one exception, when rank-and-file engineers have an opportunity to meet these mythical “users” in the flesh and recognize beyond doubt when our work products have failed our customers.

CP