Ethereum mixing with RSA: getting by without zero-knowledge proofs

Old-fashioned and unpopular as RSA may have become, it is a versatile public-key crypto-system. Starting with Bitcoin, cryptocurrencies have shunned RSA in favor of signatures based on elliptic curves, initially ECDSA and later moving towards pairing-based cryptography. Ethereum is the lone exception, having added native RSA support with EIP-198. “Native” is the operative keyword. In principle the smart-contract language used by Ethereum is Turing-complete and can implement any computation. In reality computations are bounded by the amount of gas charged, which creates two types of limitations. The first is a hard limit: the maximum amount of gas that can be consumed by all transactions in one block, no matter how motivated a user may be to get a complex transaction mined. The second is a soft, economic incentive to favor cheaper computations when possible. ECDSA signature verification is artificially “cheap” because it is not implemented as ordinary EVM bytecode. Instead it is a special external contract that can be invoked by anyone, at a deeply discounted price compared to what it would have cost to implement the same complex operation from scratch. EIP-198 brings RSA into this model, although the discount is not quite as deep. On any reasonable hardware architecture RSA signature verification is much faster than ECDSA verification, yet the arbitrary gas pricing set by the EVM inexplicably charges more for the former.

Strictly speaking EIP-198 adds support for modular exponentiation, a useful primitive operation that enables more than just RSA signatures. For example ElGamal over the integers and other crypto-systems based on discrete logarithms are now in the realm of possibility. More importantly, RSA has useful mathematical properties that are notably absent from ECDSA and enable new scenarios. This post covers one example: trustless mixing of Ethereum.
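To make the connection between modular exponentiation and RSA concrete, here is a minimal Python sketch. It shows that raw RSA verification is a single modular exponentiation, and how the three operands could be serialized in the layout EIP-198 describes (three 32-byte big-endian lengths followed by the base, exponent and modulus). The helper names and the toy textbook key are assumptions made for illustration; real keys are 2048 bits or more, and real signatures need hashing and padding.

```python
# Toy illustration; key sizes and helper names are assumptions for this sketch.

def encode_modexp_input(base: int, exponent: int, modulus: int) -> bytes:
    """Serialize arguments in the EIP-198 layout: three 32-byte big-endian
    lengths, followed by the base, exponent and modulus bytes."""
    def to_bytes(x: int) -> bytes:
        return x.to_bytes(max(1, (x.bit_length() + 7) // 8), "big")

    parts = [to_bytes(base), to_bytes(exponent), to_bytes(modulus)]
    header = b"".join(len(part).to_bytes(32, "big") for part in parts)
    return header + b"".join(parts)


def rsa_verify_raw(signature: int, encoded_message: int, e: int, n: int) -> bool:
    """Textbook RSA check: signature^e mod n must equal the encoded message."""
    return pow(signature, e, n) == encoded_message


# Classic textbook key, far too small for real use.
p, q = 61, 53
n, e = p * q, 17
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (needs Python 3.8+)

message = 65
signature = pow(message, d, n)                    # computed by the signer
call_data = encode_modexp_input(signature, e, n)  # bytes a contract would pass along
print(rsa_verify_raw(signature, message, e, n))
```

The contract never needs the private exponent: it only raises the signature to the public exponent and compares the result against the expected message encoding.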

Background on mixers

Mixers are designed to improve privacy by shuffling funds on the blockchain in such a way that sources and destinations are not connected. To take a concrete example: suppose Alice and Bob have 1 ETH each, stored at known addresses. These are the inputs. They would like to shuffle these funds such that each one ends up with 1 ETH again (perhaps minus some transaction fees) but at new addresses which cannot be linked to the original ones by anyone else with a full view of the blockchain. These are the outputs. The choice of 1 ETH is arbitrary, but it is important that everyone contributes the same amount when there are exactly as many inputs as outputs. Otherwise the quantity itself becomes a signal that deanonymizes the link between inputs and outputs: if Alice contributed a larger input, the larger output also belongs to her.

With two people, the privacy improvement is marginal: since there are only two addresses to begin with, any given output from this process could only have come from one of the two original inputs. In the best case, an outside observer has no better than a 50/50 chance of guessing which way the inputs got shuffled. Alice and Bob themselves will always know exactly which input corresponds to which output, since each of them controls one output and can attribute the other to their counterpart. As the saying goes, privacy loves the company of large numbers. Uncertainty in the origin of funds increases with more participants.

But an even more important criterion is soundness. Informally stated, participants must not lose money under any circumstance during the execution of the protocol. Here is a straw-man example that fails on that dimension:

  1. Alice informs Bob of her preferred destination Ethereum address
  2. Bob responds with his choice of address
  3. Alice and Bob flip a coin to decide how to map inputs to outputs
  4. If the coin lands heads, Bob sends 1 ETH to Alice’s address. Otherwise he sends the same amount to his own address
  5. After observing Bob’s transaction confirm on the blockchain, Alice in turn sends 1 ETH either to Bob or herself

This protocol has a glaring weakness. If Alice receives 1 ETH from Bob after step #4, she can diverge from the script and keep the extra ether for herself. For that matter, Alice may be unable to complete the protocol even if she is honest: suppose her hardware wallet failed and she no longer has access to any of her private keys. Nothing guarantees that steps #4 and #5 are done atomically, meaning either both happen or neither happens. Once started, the protocol must either run to completion or be rolled back by refunding the money Bob contributed.

While it is possible to repair that flaw using fair-exchange protocols, there is a more fundamental problem with this approach: it assumes that Alice and Bob somehow found each other ahead of time and have a private channel for coordinating their activities off-chain. This does not scale, and it only gets worse when designing a mixer that can accept hundreds of inputs. All of those actors must coordinate and agree on one permutation of inputs to outputs while minimizing what each person learns (otherwise the mixer is easily defeated by a single rogue participant who infiltrated the group), and the protocol must guarantee that no one will lose funds even if everyone else in the group has conspired against them.

Trusted third-parties as deus ex machina

If we posit the existence of a trusted third-party Trent, trivial solutions emerge:

  1. Alice, Bob, Carol and anyone else interested in mixing funds send 1 ETH to Trent via the Ethereum blockchain
  2. Every participant also sends their preferred output address to Trent. This part must be done privately and cannot be additional metadata sent along with the ETH contribution in step #1; otherwise the entire world knows the intended shuffle.
  3. Trent sends approximately 1 ETH to each participant’s preferred address, minus some commission for facilitating this transaction

This protocol loses both soundness and privacy if Trent goes rogue. After collecting all the funds in step #1, he can abscond with them. Similarly Trent has full knowledge of the mapping between inputs and outputs. He can disclose this information at any point in the future, voiding any privacy advantages afforded by a service the participants have already paid for.

Contracts for holding trusted third-parties accountable

Ideally Trent is replaced with an immutable smart contract guaranteed to stick to the script. Once Ethereum has support for zero-knowledge proofs via SNARKs, this can be done in an autonomous way. But old-fashioned RSA alone is sufficient to enable a middle-ground solution: there is still a trusted third-party involved to facilitate the protocol, but they are prevented from stealing funds or deanonymizing the permutation.

Protocol sketch

Trent publishes a new contract with known EVM bytecode and deposits some funds as “earnest money.” This is effectively a way to incentivize Trent into behaving properly. If Trent executes the protocol faithfully, he will recover the deposit along with any commissions taken for the service. If he diverges from the protocol, the funds will be distributed to participants. Trent generates a unique RSA key-pair and embeds the corresponding public-key in the contract. Once the contract is initialized, execution operates in three stages: collection, opening and distribution, each with specific deadlines that are hard-coded into the contract. In the first stage, the contract collects funds along with preferred output addresses from participants.

  1. Every participant interested in using the mixer sends 1ETH to the contract, along with a message to be signed by Trent. The contract records the deposit along with the sending address, but does not consider it finalized until Trent has acknowledged it.
  2. Trent must call a different contract method and provide an RSA signature over the requested message. The contract can verify this RSA signature to determine whether Trent signed correctly and only then consider the contribution final.
  3. If Trent fails to complete that step after some deadline, the participant can request a refund. (The contract could even be designed to penalize Trent for discriminating against participants, by sending extra funds taken from the initial deposit.)
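The three steps of the collection stage can be sketched as a small state machine. This is a Python model rather than contract code, written under several assumptions: the class and method names, the `ack_window` parameter and the use of plain integers for messages are all hypothetical, and the optional penalty against Trent is left out.

```python
from dataclasses import dataclass, field

DEPOSIT_WEI = 10**18  # 1 ETH, the fixed contribution


@dataclass
class Deposit:
    message: int          # message the participant wants Trent to sign
    deposited_at: int     # timestamp of the deposit
    finalized: bool = False
    refunded: bool = False


@dataclass
class CollectionStage:
    n: int                # Trent's RSA modulus, embedded in the contract
    e: int                # Trent's RSA public exponent
    ack_window: int       # hypothetical: seconds Trent has to countersign
    deposits: dict = field(default_factory=dict)

    def deposit(self, sender: str, value: int, message: int, now: int) -> None:
        """Step 1: record the deposit; it is not final yet."""
        if value != DEPOSIT_WEI:
            raise ValueError("every participant must contribute the same amount")
        if sender in self.deposits or not 0 <= message < self.n:
            raise ValueError("duplicate deposit or malformed message")
        self.deposits[sender] = Deposit(message, now)

    def acknowledge(self, sender: str, signature: int) -> None:
        """Step 2: Trent supplies a signature, which the contract verifies."""
        dep = self.deposits[sender]
        if dep.refunded:
            raise ValueError("deposit was already refunded")
        if pow(signature, self.e, self.n) != dep.message:
            raise ValueError("signature does not verify")
        dep.finalized = True

    def refund(self, sender: str, now: int) -> int:
        """Step 3: if Trent never signed, the participant gets the deposit back."""
        dep = self.deposits[sender]
        if dep.finalized or dep.refunded:
            raise ValueError("nothing to refund")
        if now <= dep.deposited_at + self.ack_window:
            raise ValueError("Trent still has time to sign")
        dep.refunded = True
        return DEPOSIT_WEI
```

The number of finalized deposits recorded here is the figure the contract later compares against the number of signatures that turn up.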

In the second stage, users reveal their choice of destination address along with a signature derived from the one Trent provided in step #2 above. This is done by invoking another contract method that takes the address and the signature as arguments. One subtlety related to blockchains: participants must use a different Ethereum address when interacting with the contract in each stage. Otherwise the origin of the TX itself allows linking the revealed address to the original request for signing.
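A rough Python model of the reveal method follows. It is a sketch under stated assumptions: `address_digest` is a hypothetical stand-in that reduces a SHA-256 hash modulo n instead of applying a real padding scheme, the class and parameter names are invented for illustration, and the demo key is built from two small Mersenne primes that would be trivially breakable in practice. Note that the caller's own address is deliberately not recorded.

```python
import hashlib


def address_digest(address: str, n: int) -> int:
    """Hypothetical encoding: SHA-256 of the address, reduced modulo n."""
    raw = hashlib.sha256(address.lower().encode()).digest()
    return int.from_bytes(raw, "big") % n


class RevealStage:
    def __init__(self, n: int, e: int, reveal_deadline: int):
        self.n, self.e = n, e
        self.reveal_deadline = reveal_deadline
        self.outputs = []  # destination addresses with a valid signature

    def reveal(self, destination: str, signature: int, now: int) -> None:
        if now > self.reveal_deadline:
            raise ValueError("reveal stage is over")
        if destination in self.outputs:
            raise ValueError("address already revealed")
        if pow(signature, self.e, self.n) != address_digest(destination, self.n):
            raise ValueError("signature does not cover this address")
        self.outputs.append(destination)


# Demo key for illustration only (products of Mersenne primes are not safe).
P, Q = 2**127 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))

stage = RevealStage(N, E, reveal_deadline=1_000)
destination = "0x" + "ab" * 20
# In the real protocol this signature comes out of the blind-signing exchange.
stage.reveal(destination, pow(address_digest(destination, N), D, N), now=10)
print(stage.outputs)
```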

Blinded by randomness

This is where unique properties of RSA shine. If users were submitting the destination address verbatim in step #1, it would defeat the point of using a mixer: everyone can observe these calls and learn which address each person picked. But RSA enables blind signatures: Trent can sign a message without knowing what message he signed. The crucial property is that the message submitted for signing is related to the real message the participant wants signed, namely the destination address. But that relationship is known only to the participant: as far as everyone watching the blockchain is concerned, the submitted message is indistinguishable from a random one. Not even the party in possession of the signing key (Trent, in this case) learns the “true” message being signed or can correlate a signed message presented in the future with the original request on the related message. In effect, users mask their address before submitting it to Trent for signature and then recover the intended signature by using properties of RSA. (There is some additional complexity being glossed over: signatures are not computed over the address directly. Instead the address is hashed and suitably padded with a scheme such as PSS. Otherwise RSA allows existential forgery, such that one can find unlimited message-signature pairs, although these messages will not have any particular meaning as far as corresponding to an Ethereum address.)
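The masking and unmasking can be illustrated with the textbook RSA blinding construction. The sketch below is a toy: the demo key uses two Mersenne primes that are far too small and too structured for real use, `digest` is a hypothetical simplification that reduces a SHA-256 hash modulo n rather than applying PSS, and the function names are chosen for this example only.

```python
import hashlib
import secrets
from math import gcd

# Demo key for illustration only.
P, Q = 2**127 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # known only to Trent


def digest(address: str) -> int:
    """Hypothetical encoding of the destination address as an integer mod N."""
    raw = hashlib.sha256(address.lower().encode()).digest()
    return int.from_bytes(raw, "big") % N


def blind(m: int) -> tuple:
    """Participant: mask m with a random factor r, giving m * r^e mod N."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if gcd(r, N) == 1:
            return (m * pow(r, E, N)) % N, r


def sign(blinded: int) -> int:
    """Trent: sign whatever was submitted, without learning m."""
    return pow(blinded, D, N)


def unblind(blind_signature: int, r: int) -> int:
    """Participant: strip the factor r, since (m * r^e)^d = m^d * r mod N."""
    return (blind_signature * pow(r, -1, N)) % N


def verify(signature: int, m: int) -> bool:
    """Anyone: check signature^e mod N against the encoded address."""
    return pow(signature, E, N) == m


address = "0x" + "ab" * 20
m = digest(address)
blinded, r = blind(m)            # submitted on-chain with the deposit
signature = unblind(sign(blinded), r)
print(verify(signature, m))
```

What appears on the blockchain during the first stage is `blinded`, which changes with every choice of `r`. The value revealed in the second stage is `signature`, which is identical to what Trent would have produced by signing the address digest directly.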

To avoid waiting on stragglers indefinitely, a deadline is imposed on the second stage. After this deadline is reached, the contract can start actually disbursing funds to the addresses opened in the second stage. There is no concept of a “scheduled task” on the Ethereum blockchain so the final stage will be initiated when any participant— including potentially Trent— calls into the contract to request distribution after the deadline has elapsed. At this point the contract can confirm that the timestamp or block height is past that deadline and start sending ether to previously verified output addresses.

Detecting and punishing dishonest operators

There is one flaw in the protocol as described: Trent can cheat by issuing extra signed messages. Recall that the presence of a valid RSA signature on a message authorizes the disbursal of 1 ETH to the address contained in that message. That turns every valid signature into a check worth 1ETH. While every participant is supposed to receive one signature for every ETH contributed, nothing prevents Trent from issuing signatures over his own addresses and attempting to cash these in.

This is where the initial “earnest money” contributed by Trent comes in, combined with the deliberate delay in releasing funds. Recall that funds are not disbursed immediately when a participant calls into the contract with a valid signature. Instead the contract waits until a predetermined deadline, giving everyone a chance to chime in with their address. As long as every honest participant supplies their address, cheating will become obvious. The contract has a record of exactly how many participants have received valid signatures in response to their contribution. If more signatures turn up than participants, it is clear Trent has diverged from the protocol. It could be a deliberate attempt to steal funds, or perhaps Trent’s RSA signing key was compromised. Either way, the contract will respond by “refunding” all ether to the original input addresses, along with a share of the earnest money put up by Trent. In this case the protocol execution has failed: no mixing has occurred. But all participants receive 100% of their original contribution back, along with some compensation for Trent’s incompetence or dishonesty.
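The settlement rule amounts to comparing two counts once the deadline has passed. Here is a minimal Python sketch of that decision. The function name, the flat per-output `commission_wei` and the equal split of the earnest money are assumptions for illustration, and the sketch does not decide what happens to deposits that were never claimed.

```python
DEPOSIT_WEI = 10**18  # 1 ETH per participant


def settle(inputs, outputs, earnest_wei, commission_wei, trent, now, deadline):
    """Return a {address: wei} payout map for the distribution stage.

    inputs  -- addresses whose deposits Trent acknowledged with a signature
    outputs -- destination addresses revealed with a valid signature
    """
    if now <= deadline:
        raise ValueError("reveal deadline has not passed yet")
    if not inputs:
        raise ValueError("no finalized deposits to settle")

    payouts = {}
    if len(outputs) > len(inputs):
        # More signatures than deposits: Trent cheated or lost his key.
        # Abort the mix, refund the inputs and share out the earnest money.
        bonus = earnest_wei // len(inputs)
        for address in inputs:
            payouts[address] = DEPOSIT_WEI + bonus
    else:
        # Honest run: pay each revealed output, return the deposit to Trent.
        for address in outputs:
            payouts[address] = DEPOSIT_WEI - commission_wei
        payouts[trent] = earnest_wei + commission_wei * len(outputs)
    return payouts


cheated = settle(["in1", "in2"], ["out1", "out2", "out3"],
                 earnest_wei=10**18, commission_wei=10**16,
                 trent="trent", now=200, deadline=100)
print(cheated)
```

In the cheating branch no output address receives anything, which is why a forged signature ends up gifting money to the honest participants.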

Participants do not have to do anything special to catch and punish Trent for cheating, or monitor the blockchain. As long as they are following the protocol and revealing their own signed address before the deadline, any attempt by Trent to fabricate additional unsolicited signatures will backfire and result in “gifting” money. The only way Trent can get away with cheating is if some participants have failed to reveal their signed address in a timely manner, effectively abandoning their funds. Even in that scenario, Trent would be disincentivized from claiming those funds with forged signatures: he would be taking the risk that missing participants may turn up at the last minute and trigger the retaliation logic.


