Remove original raw markdown versions of circuit documentation.

This commit is contained in:
Justin Martin 2021-10-11 14:52:21 -07:00
parent 5a9f464d04
commit 2097b1025d
2 changed files with 0 additions and 348 deletions

@ -1,145 +0,0 @@
# Tornado.cash Circuits
Behind the Tornado.cash front-end sit a number of [Circom](https://docs.circom.io/) circuits, which enable the
fundamental privacy guarantees that Tornado.cash users enjoy. These circuits implement the
[Zero Knowledge protocol](https://en.wikipedia.org/wiki/Zero-knowledge_proof) that Tornado.cash's smart contracts
interface with to prove claims about a user's deposit, such as that it is valid, that it hasn't already been withdrawn,
and in the context of [Anonymity Mining](anonymity-mining.md), the number of blocks that exist between a note's deposit
transaction and its withdrawal.
## How ZK Circuits Work
### SNARKs and PLONK
Before trying to understand how Tornado.cash works under the hood, you first need to understand Zero Knowledge circuits,
how they're constructed, and how proofs are generated client-side, then verified on-chain. While there are a
[few different types](https://en.wikipedia.org/wiki/Zero-knowledge_proof#Zero_knowledge_types) of ZK systems,
Tornado.cash relies upon a variant known as "succinct non-interactive arguments of knowledge" (SNARK),
specifically a variant called [PLONK](https://eprint.iacr.org/2019/953).
If you want to develop a deep understanding of how PLONK works, there is a
[great explanation](https://vitalik.ca/general/2019/09/22/plonk.html) by none other than Vitalik Buterin himself.
If you're not a math nerd, but can follow along with a bit of math talk, Vitalik's explanation can be best summarized as:
> ... the "fancy cryptography" it relies on is one single standardized component, called a "polynomial commitment" ...
> A polynomial commitment is a short object that "represents" a polynomial, and allows you to verify evaluations of that
> polynomial, without needing to actually contain all of the data in the polynomial. That is, if someone gives you a
> commitment `𝑐` representing `𝑃(𝑥)`, they can give you a proof that can convince you, for some specific `𝑧`, what the
> value of `𝑃(𝑧)` is.
>
> So how do the commitments themselves work? ... A trusted-setup procedure generates a set of elliptic curve points
> `𝐺, 𝐺 * 𝑠, 𝐺 * 𝑠² ... 𝐺 * 𝑠ⁿ`, as well as `𝐺₂ * 𝑠`, where `𝐺` and `𝐺₂` are the generators of two elliptic curve
> groups and `𝑠` is a secret that is forgotten once the procedure is finished.
>
> These points are published and considered to be "the proving key" of the scheme; anyone who needs to make a polynomial
> commitment will need to use these points. A commitment to a degree-d polynomial is made by multiplying each of the
> first d+1 points in the proving key by the corresponding coefficient in the polynomial, and adding the results together.
>
> Given a program `𝑃`, you convert it into a circuit, and generate a set of equations ..., then convert this set of
> equations into a single polynomial equation. You also generate from the circuit a list of copy constraints. ...
> To generate a proof, you compute the values of all the wires and convert them into three polynomials. ... There is a
> set of equations between the polynomials that need to be checked; you can do this by making commitments to the
> polynomials, opening them at some random `𝑧`, and running the equations on these evaluations instead of the original
> polynomials. The proof itself is just a few commitments and openings and can be checked with a few equations.
>
> **And that's all there is to it!**
Simple, right? Great.
### Circom and snarkjs
Because we're not all Vitalik, it's best if we have some simple tools that will abstract away the generation and
execution of these complicated polynomial commitments. This is where [Circom](https://docs.circom.io/) and
[snarkjs](https://github.com/iden3/snarkjs) come in.
Circom is easiest to think of as a compiler for a circuit language which acts very much like the kind of
[hardware description language](https://en.wikipedia.org/wiki/Hardware_description_language) that electrical engineers
would use to describe an electrical circuit. Except instead of an electrical circuit, we're describing an
**arithmetic circuit**, which contains components, and the way that they connect together.
When you compile a Circom circuit, the resulting output is an
[R1CS constraint system](https://docs.circom.io/1.-an-introduction/background#rank-1-constraint-system) and a
[Wasm](https://en.wikipedia.org/wiki/WebAssembly) executable that will be used to generate a
[witness](https://docs.circom.io/1.-an-introduction/background#witness).
#### R1CS
To understand R1CS (Rank-1 constraint system), there is of course more math. And where there's important
cryptosystem math, there's a [post by Vitalik](https://medium.com/@VitalikButerin/quadratic-arithmetic-programs-from-zero-to-hero-f6d558cea649#5539).
> An R1CS is a sequence of groups of three vectors `(a, b, c)`, and the solution to an R1CS is a vector `s`, where `s`
> must satisfy the equation `s . a * s . b - s . c = 0`, where `.` represents the dot product - in simpler terms, if we
> "zip together" `a` and `s`, multiplying the two values in the same positions, and then take the sum of these products,
> then do the same to `b` and `s` and then `c` and `s`, then the third result equals the product of the first two results.
>
> The next step is taking this R1CS and converting it into QAP form, which implements the exact same logic except using
> polynomials instead of dot products ... instead of checking the constraints in the R1CS individually, we can now
> check all of the constraints at the same time by doing the dot product check on the polynomials.
>
> If we try to falsify any of the variables in the R1CS solution that we are deriving this QAP solution from - say, set
> the last one to 31 instead of 30, then we get a `t` polynomial that fails one of the checks.
In short, the R1CS is a set of polynomial constraints which any proof generated by the circuit must satisfy. These
constraints are [generated by Circom](https://docs.circom.io/2.-circom-fundamentals/constraints-generation) based on the
relationship between various "signals" and operations in your circuit design.
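
To make the `s . a * s . b - s . c = 0` check concrete, here is a minimal Python sketch. It assumes the small example from Vitalik's post (the statement `x³ + x + 5 = 35` with `x = 3`), and the names `R1CS`, `dot`, and `satisfies` are illustrative rather than part of Circom or snarkjs. A real system does this arithmetic modulo a large prime; plain integers are used here for readability.

```python
# Variable order assumed for the solution vector s:
#   [one, x, out, sym_1, y, sym_2]
R1CS = [
    # sym_1 = x * x
    ([0, 1, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0]),
    # y = sym_1 * x
    ([0, 0, 0, 1, 0, 0], [0, 1, 0, 0, 0, 0], [0, 0, 0, 0, 1, 0]),
    # sym_2 = (x + y) * 1
    ([0, 1, 0, 0, 1, 0], [1, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 1]),
    # out = (5 + sym_2) * 1
    ([5, 0, 0, 0, 0, 1], [1, 0, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0]),
]


def dot(u, v):
    """Zip two vectors together, multiply position by position, and sum."""
    return sum(x * y for x, y in zip(u, v))


def satisfies(s, constraints=R1CS):
    """Return True if s satisfies s.a * s.b - s.c == 0 for every constraint."""
    return all(dot(s, a) * dot(s, b) - dot(s, c) == 0 for a, b, c in constraints)


GOOD = [1, 3, 35, 9, 27, 30]
BAD = [1, 3, 35, 9, 27, 31]  # last variable falsified, as in the quote above

print(satisfies(GOOD), satisfies(BAD))
```

Each triple `(a, b, c)` encodes exactly one multiplication, which is why additions are folded into the `a` vector and multiplied by the constant `one`.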
#### Witnesses
Now, depending on what you're using Tornado.cash for, you might not want any witnesses. However, don't worry, if
everything is working correctly, all of the witnesses to your interactions with Tornado.cash will be aggressively
compacted, and their bodies disposed of as you please.
In the context of a PLONK circuit, a witness is the set of values that need to be generated from the inputs to the
circuit, based on the circuit design, to satisfy all of the constraints imposed by the circuit. You can think of the
witness generator produced by Circom as a circuit-specific decompression function which runs your inputs through the
circuit, and snapshots all of the various intermediate values that are produced along the way.
With this expanded form generated from your inputs, you know which values must be assigned to the constraints specified
by the R1CS in order to construct a valid proof.
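
As a rough illustration of what a witness generator does, the sketch below expands a single input into every intermediate value of a toy circuit. The circuit (`out = x³ + x + 5`) and the function name `generate_witness` are assumptions made for this example; the Wasm generator that Circom emits is specific to the compiled circuit and works over a prime field rather than plain integers.

```python
def generate_witness(x):
    """Run x through the toy circuit and snapshot every intermediate value.

    The returned vector uses the ordering [one, x, out, sym_1, y, sym_2].
    """
    sym_1 = x * x      # first multiplication gate
    y = sym_1 * x      # second multiplication gate
    sym_2 = y + x      # addition gate
    out = sym_2 + 5    # addition of a constant
    return [1, x, out, sym_1, y, sym_2]


print(generate_witness(3))
```

The input is one number, while the witness holds six: that expansion is the "decompression" described above.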
#### Proof
When you think of a "proof", you probably imagine that it's an incontrovertible guarantee that something is true.
However, in the context of a SNARK, a "proof" actually represents an *argument* that something is *almost certainly*
true. If we were to try to transmit the solution to every single polynomial constraint imposed by a circuit, we would
end up with proofs that were orders of magnitude larger than if we simply show that certain sorts of relationships hold
true between the intermediate state values within the circuit.
It's possible that for any given circuit, someone with sufficient computing power could generate a proof that satisfies
the circuit's constraints in a malformed way, but this would be roughly equivalent in difficulty to
[factoring the product of two large primes](https://en.wikipedia.org/wiki/RSA_Factoring_Challenge).
So, when generating a proof for a SNARK circuit, you're calculating the intermediate states of your circuit for a given
input (witness generation), and then calculating the relationships between your inputs, the intermediate states, and
the circuit's outputs.
Once you have the proof that you've satisfied the necessary set of constraints, you can then publish that proof and
some subset of your inputs and outputs (a.k.a. public signals). Knowing the R1CS, your public signals, your proof, and
the circuit's verification key, anyone can then verify that your proof satisfies the R1CS, and that your public signals
are what would be expected to correspond to your proof.
## Circuits
With that understanding of ZK proving circuits well-in-hand, let's delve into how Tornado.cash uses some relatively simple
circuits to enable you to privately and permissionlessly obscure the relationship between your deposit and withdrawal
transactions on a public blockchain network, and then to later prove things *about* the relationship between your
deposit and withdrawal (e.g. how long you waited before withdrawing).
Tornado.cash is best understood as having two separate major components.
### Core Deposit Circuit
The core deposit circuit is what most users interact with, proving that a user has created a commitment representing the
deposit of some corresponding asset denomination, that they haven't yet withdrawn that asset, and that they know the
secret that they supplied when generating the initial commitment.
[\[Read more...\]](circuits/deposit.md)
### Anonymity Mining
The anonymity mining circuits form the basis for the [Anonymity Mining](anonymity-mining.md) program, which incentivizes
users to leave their deposits in the contract for longer periods of time, so as to ensure that the Tornado.cash deposit
pools maintain a large number of active deposits (thus increasing [k-anonymity](https://en.wikipedia.org/wiki/K-anonymity)
for other users).
[\[Read more...\]](circuits/anonymity-mining.md)

@ -1,203 +0,0 @@
# Tornado.cash Core Deposit Circuit
The core deposit circuit is what most users interact with, proving that a user has created a commitment representing the
deposit of some corresponding asset denomination, that they haven't yet withdrawn that asset, and that they know the
secret that they supplied when generating the initial commitment.
## Making a Deposit
A deposit into Tornado.cash is a very simple operation, which doesn't actually involve any ZK proofs. At least not yet.
To make a deposit, you invoke the `deposit` method of a [Tornado contract](https://github.com/tornadocash/tornado-core/blob/master/contracts/Tornado.sol)
instance, supplying a [Pedersen Commitment](https://crypto.stackexchange.com/questions/64437/what-is-a-pedersen-commitment),
along with the asset denomination that you're depositing. This commitment is inserted into a specialized
[Merkle Tree](https://en.wikipedia.org/wiki/Merkle_tree), where the structure of the Merkle Tree is aligned to an
elliptic curve associated with a prime in the order of the BN128 elliptic curve, and the labels of the tree are computed
using MiMC hashing.
### Commitment Scheme
When you make a "commitment" in the context of cryptography, what you're doing is taking a secret value - often large
and random - and running it through some cryptographic function (e.g. a hash function), then disclosing the result.
Later, when you need to make good on the commitment, you prove that you know the original secret value.
This is known as a [commitment scheme](https://en.wikipedia.org/wiki/Commitment_scheme).
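
A minimal sketch of the commit-then-open pattern, assuming SHA-256 as the cryptographic function and a plain reveal as the opening step. The function names are illustrative. Tornado.cash differs in the opening step: the secret is never revealed, and knowledge of it is shown with a Zero Knowledge proof instead.

```python
import hashlib
import hmac
import secrets


def commit(secret: bytes) -> bytes:
    """Publish this value; it hides the secret but binds you to it."""
    return hashlib.sha256(secret).digest()


def verify_opening(commitment: bytes, revealed: bytes) -> bool:
    """Check that a revealed value is the one that was committed to."""
    return hmac.compare_digest(commitment, hashlib.sha256(revealed).digest())


secret = secrets.token_bytes(32)   # large and random, so it cannot be guessed
commitment = commit(secret)        # safe to disclose immediately

print(verify_opening(commitment, secret))
```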
### Pedersen Hash
A [Pedersen Hash](https://iden3-docs.readthedocs.io/en/latest/iden3_repos/research/publications/zkproof-standards-workshop-2/pedersen-hash/pedersen.html)
is an extremely specialized hashing function that is particularly well-suited for use in applications leveraging
Zero Knowledge proving circuits. Where other hashing functions like SHA-256 are designed to exhibit properties
such as producing very different outputs for even slightly different inputs
(the [avalanche effect](https://en.wikipedia.org/wiki/Avalanche_effect)), Pedersen hashing instead
prioritizes the ability to compute the hash extremely efficiently in Zero Knowledge circuits.
Hashing a message with Pedersen compresses the bits of the message down to a point along an
[elliptic curve](https://en.wikipedia.org/wiki/Elliptic-curve_cryptography) called
[Baby Jubjub](https://github.com/barryWhiteHat/baby_jubjub). Baby Jubjub is in the order of the BN128 elliptic curve
that is supported by precompiled operations on the Ethereum network which were added in
[EIP-196](https://github.com/ethereum/EIPs/blob/master/EIPS/eip-196.md). This means that operations that use the
Baby Jubjub curve, such as Pedersen Hashing, are highly gas-efficient.
When you compute the Pedersen hash of a message, the resulting point along the elliptic curve is very efficient
to verify, but infeasible to reverse back into the original message.
### Tornado Commitment
To generate a commitment for a Tornado.cash deposit, you first generate two large random integers, each 31 bytes in
length. The first value is a nullifier that you will later disclose in order to withdraw your deposit, and the second
is a secret that secures the confidential relationship between your deposit and withdrawal.
The preimage of your deposit note is the concatenation of these two values (`nullifier` + `secret`), resulting in a
message 62 bytes in length. This message is Pedersen hashed, resulting in an output representing an element of the
Baby Jubjub elliptic curve encoded as a 32-byte big-endian integer.
If you want to see this in code form, you can reference the
[tornado-cli deposit function](https://github.com/tornadocash/tornado-cli/blob/master/cli.js#L53-L112).
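
The byte layout described above can be sketched in a few lines of Python. This is not the tornado-cli implementation: Python's standard library has no Pedersen hash, so `stand_in_hash` (a hypothetical helper backed by SHA-256) is used purely as a placeholder to keep the example runnable, and its output is not a Baby Jubjub point.

```python
import hashlib
import secrets


def stand_in_hash(message: bytes) -> bytes:
    """Placeholder for the Pedersen hash. NOT the real function."""
    return hashlib.sha256(message).digest()


def make_deposit_note():
    """Generate the two random components and derive a commitment."""
    nullifier = secrets.token_bytes(31)    # disclosed (as a hash) at withdrawal
    secret = secrets.token_bytes(31)       # never disclosed
    preimage = nullifier + secret          # 62-byte message
    commitment = stand_in_hash(preimage)   # 32-byte value sent with the deposit
    return nullifier, secret, preimage, commitment


nullifier, secret, preimage, commitment = make_deposit_note()
print(len(nullifier), len(secret), len(preimage), len(commitment))
```

Whoever holds the preimage can later withdraw the deposit, so the note must be stored as carefully as a private key.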
### MiMC Merkle Tree
The [Tornado contract](https://github.com/tornadocash/tornado-core/blob/master/contracts/Tornado.sol) is a specialized
[Merkle Tree](https://en.wikipedia.org/wiki/Merkle_tree) which labels its nodes using MiMC hashes.
For those not familiar with Merkle Trees, they are binary trees where each non-leaf node is labelled with the hash
of the labels of its child nodes, and the leaf nodes are labelled with the hash of their data. Ordinarily, Merkle Trees
use a one-way cryptographic hashing function like SHA-2, but in this case, we're using MiMC, which has some useful
properties.
One of the useful properties of MiMC is that it's well-suited to operating over prime fields, which is important to us
because Zero Knowledge proofs are fundamentally based on prime fields, and Pedersen Hashes are points within a prime
field defined by the Baby Jubjub elliptic curve - which is in turn within the order of the BN128 curve supported
natively on Ethereum. Because Zero Knowledge proofs are operationally expensive, and each operation in an Ethereum
transaction has a corresponding gas cost, the specific types of operations we design around need to be as gas-efficient
as possible.
The other particularly useful properties of MiMC are that it's non-parallelizable, and difficult to compute but easy to
verify. These properties add to the security of the contract by making it computationally infeasible to calculate a
forged "commitment" which has a colliding path within the merkle tree.
### Inserting a Commitment
When you insert a commitment into the Tornado contract's merkle tree, you are adding a new leaf node whose label is the
MiMC hash of your Pedersen commitment, and then traversing up the tree updating each subsequent parent node with a new
label based on the label updates that your new leaf introduces below.
Once your deposit has updated the tree, the label of the top-most node becomes the tree's new "root", and is added to
a rolling history containing the labels of the last 100 roots, for later use in processing withdrawal transactions.
The Tornado.cash deposit contracts are deployed with 20 "levels", with each level doubling the number of potential
leaves. That means that the contract's merkle tree supports up to 2^20 leaves, allowing for up to
1,048,576 deposits to be made into the contract before it needs to be replaced.
The reason behind this seemingly-low number of levels is that every deposit has to perform as many updates to the tree
as there are levels. A tree with more levels would require more gas per deposit, as well as correspondingly larger
proof sizes when withdrawing notes.
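
The insertion procedure can be sketched as an append-only tree that only stores one node per level plus a rolling root history. This is a simplified model, not the contract's code: SHA-256 stands in for MiMC, the empty-leaf value is assumed to be 32 zero bytes, and the class and method names are chosen for illustration.

```python
import hashlib
from collections import deque

ROOT_HISTORY_SIZE = 100  # number of recent roots kept, as described above


def hash_pair(left: bytes, right: bytes) -> bytes:
    """Stand-in for the MiMC hash of two child labels."""
    return hashlib.sha256(left + right).digest()


class MerkleTreeWithHistory:
    def __init__(self, levels: int):
        self.levels = levels
        self.next_index = 0
        # zeros[i] is the label of a completely empty subtree of height i.
        self.zeros = [bytes(32)]
        for _ in range(levels):
            self.zeros.append(hash_pair(self.zeros[-1], self.zeros[-1]))
        # Most recent left-hand node seen at each level.
        self.filled_subtrees = self.zeros[:levels]
        self.roots = deque([self.zeros[levels]], maxlen=ROOT_HISTORY_SIZE)

    def insert(self, leaf: bytes) -> int:
        """Append a leaf, update one node per level, and record the new root."""
        if self.next_index >= 2 ** self.levels:
            raise ValueError("tree is full")
        index = self.next_index
        current, position = leaf, index
        for level in range(self.levels):
            if position % 2 == 0:
                # Left child: the right sibling is still an empty subtree.
                self.filled_subtrees[level] = current
                left, right = current, self.zeros[level]
            else:
                # Right child: pair with the stored left sibling.
                left, right = self.filled_subtrees[level], current
            current = hash_pair(left, right)
            position //= 2
        self.roots.append(current)
        self.next_index += 1
        return index

    def is_known_root(self, root: bytes) -> bool:
        return root in self.roots
```

The loop in `insert` runs exactly `levels` times, which is the per-deposit cost that the paragraph above refers to: a tree with 20 levels performs 20 hash updates per deposit and holds `2 ** 20` leaves.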
## Making a Withdrawal
Having made a deposit, you now have a set of truth claims that you can generate a proof based upon. Generally speaking,
Zero Knowledge proofs are anchored to some value(s) known by both the prover and the verifier, to which a relationship
is going to be proven to a set of values known only by the prover. The circuit verifier can confirm that the prover
has used the value(s) that are known, and that the proof that they computed satisfies the constraints imposed by the
circuit.
### Inputs to a Withdrawal Proof
In the case of Tornado.cash deposits, the prover (the person submitting a withdrawal transaction), and the verifier
(the deposit contract's withdrawal method) both know a recent merkle root. The prover also supplies a set of other
public inputs that they used for the generation of their proof.
The total set of public inputs for a withdrawal proof are:
1. A recent merkle root
2. The Pedersen hash of the nullifier component from their deposit commitment
3. The address of the recipient of their withdrawal
4. The address of the relayer that they've selected (or their own address)
5. The fee that they're paying the relayer (or zero)
6. The refund that they're paying the relayer (or zero)
The additional private inputs for a withdrawal proof are:
7. The nullifier component from their deposit commitment
8. The secret component from their deposit commitment
9. The set of node labels that exist in the path between the root and the leaf nodes of the merkle tree
10. An array of `0/1` values indicating whether each specified path element is on the left or right side of its
parent node
### Proven Claims
It would be easy to miss the clever new piece of knowledge we created when we constructed and inserted our commitment
into the merkle tree. You might be inclined to think that to make a withdrawal, we're simply going to prove that we know
the components of the Pedersen commitment, and that the merkle tree is just an efficient way to store those
commitment hashes.
What's special about this construction is that it enables us to prove not just that we know the components of a
deposited commitment, but rather it enables us to prove simply that we __know the path to a commitment within the tree__,
and __how to get there__ starting with a commitment preimage.
If we were only to prove that we knew the preimage to a deposited hash, we would risk revealing which commitment is
ours. Instead, we're not disclosing the commitment preimage, but instead we're simply proving that we have knowledge of
a preimage to a commitment within the tree. Which commitment is ours remains completely indistinguishable on the
withdrawal side of the circuit protocol.
### Computing the Witness
#### Nullifier Hash Check
In order to compute the witness for the withdrawal proof, our circuit first takes the private deposit commitment inputs
(nullifier + secret), and runs them through a circuit component which simultaneously computes the Pedersen hash of
the full commitment message, and the Pedersen hash of the nullifier alone. The circuit then compares the resulting
nullifier hash to the one you supplied as a public input, and asserts their equality.
__This proves that the nullifier hash that you supplied publicly is in fact a component of your original commitment.__
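
Outside of a circuit, the same check could be sketched as follows. `stand_in_hash` is again a SHA-256 placeholder for the Pedersen hash, and the function names are illustrative rather than the names of the actual circuit components.

```python
import hashlib


def stand_in_hash(message: bytes) -> bytes:
    """Placeholder for the Pedersen hash. NOT the real function."""
    return hashlib.sha256(message).digest()


def commitment_hasher(nullifier: bytes, secret: bytes):
    """Compute both hashes from the private inputs in one step."""
    commitment = stand_in_hash(nullifier + secret)
    nullifier_hash = stand_in_hash(nullifier)
    return commitment, nullifier_hash


def check_nullifier_hash(nullifier: bytes, secret: bytes, public_nullifier_hash: bytes) -> bytes:
    """Assert the public nullifier hash matches, then hand back the commitment."""
    commitment, nullifier_hash = commitment_hasher(nullifier, secret)
    if nullifier_hash != public_nullifier_hash:
        raise ValueError("nullifier hash does not match the private nullifier")
    return commitment
```

In the circuit, a failed equality is not an exception: it simply means no valid witness exists, so no valid proof can be produced.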
#### Merkle Tree Check
Next, the circuit takes the commitment hash it has computed, the merkle root you have specified publicly, and the path
elements and left/right selectors that you specified privately, as inputs to a component which checks your merkle tree
path claim.
The Merkle Tree Checker starts from the bottom of the path, inputting your commitment hash and the first element of your
proposed path into a Muxer. The Muxer takes a third input, which is an element from your supplied left/right directions.
The Muxer component uses these directions to inform an MiMC hashing component as to the order of its inputs. If the
supplied direction is 0, then the supplied path element is on the left, and your commitment hash is on the right. If
the direction is 1, then the order is reversed.
The MiMC hasher outputs the resulting hash, and the Merkle Tree Checker proceeds to the next level. It repeats the last
process, except this time, instead of using your commitment hash, it uses the hash of the last level. It continues to
run through each level of the proposed path, until it ends up with a final hash output.
The Merkle Tree Checker compares the hash that it has computed to the public merkle root input that you supplied, and
asserts their equality.
__This proves that your commitment exists within some path beneath the specified merkle root.__
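
The walk up the tree can be sketched like this, following the left/right ordering exactly as stated in the text above (direction `0` places the path element on the left). SHA-256 stands in for MiMC, and `mux`, `compute_root`, and `check_merkle_path` are names invented for this example.

```python
import hashlib


def hash_pair(left: bytes, right: bytes) -> bytes:
    """Stand-in for the MiMC hash of two child labels."""
    return hashlib.sha256(left + right).digest()


def mux(current: bytes, path_element: bytes, direction: int):
    """Order the two hasher inputs according to the direction bit."""
    if direction == 0:
        return path_element, current   # path element on the left
    return current, path_element       # path element on the right


def compute_root(leaf: bytes, path_elements, directions) -> bytes:
    """Hash upward from the leaf, one level per path element."""
    current = leaf
    for element, direction in zip(path_elements, directions):
        left, right = mux(current, element, direction)
        current = hash_pair(left, right)
    return current


def check_merkle_path(leaf, path_elements, directions, root) -> bool:
    """True if the supplied path leads from the leaf to the given root."""
    return compute_root(leaf, path_elements, directions) == root
```

For a four-leaf tree with leaves `a, b, c, d`, proving membership of `c` would use path elements `[d, hash_pair(a, b)]` with directions `[1, 0]` under this convention.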
#### Extra Withdrawal Parameter Check
Before finishing, the circuit takes each of the remaining four public inputs, and squares them into a public output.
While this isn't strictly necessary, it creates a set of constraints within your proof that ensure that your transaction
parameters can't be tampered with before your withdrawal transaction is processed. If any of those parameters were
to change, your proof would no longer be valid.
### Computing the Proof
Now that we have a witness for our proof, we take those witnessed state values and input them into the R1CS corresponding
to the Withdrawal circuit, and run the prover over it. Out of the prover come two proof artifacts. The first is the
proof itself, according to the SNARK protocol we're using, and the second is the set of public inputs and outputs
corresponding to that proof.
### Completing a Withdrawal Transaction
With the withdrawal proof now generated, you supply that proof, along with its public inputs, to the `withdraw` method
of the deposit contract. This method verifies that:
1. The specified relayer fee does not exceed the value of the denomination of asset being withdrawn
2. The supplied nullifier hash has not been spent before
3. The supplied merkle root is known, using the 100-root historical record
4. The supplied proof is valid
One of the artifacts deployed as a dependency of the deposit contract is a Solidity contract that is generated using
the verification key of the Withdrawal circuit as an input. This Verifier contract is an optimized proof verifier with a
single public view function, which accepts a proof and the array of six public inputs as `uint256` values.
This function returns `true` if the proof is valid according to the public inputs.
If the above preconditions are met, the supplied nullifier hash is inserted into the set of spent nullifiers, and then
the value of the deposit is distributed amongst the recipient and relayer, according to the specified fee parameters.
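
The order of checks in `withdraw` can be modeled with a small Python class. This is a simplified sketch rather than the Solidity contract: `DepositPool` and its fields are hypothetical, the proof verifier is injected as a plain callable, and handling of the refund value is left out.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple


@dataclass
class DepositPool:
    denomination: int
    known_roots: Set[bytes]
    verify_proof: Callable[[bytes, list], bool]   # stands in for the Verifier contract
    spent_nullifiers: Set[bytes] = field(default_factory=set)

    def withdraw(self, proof, root, nullifier_hash, recipient, relayer, fee, refund) -> List[Tuple[str, int]]:
        # 1. The relayer fee cannot exceed the denomination.
        if fee > self.denomination:
            raise ValueError("fee exceeds denomination")
        # 2. The nullifier hash must not have been spent before.
        if nullifier_hash in self.spent_nullifiers:
            raise ValueError("note has already been spent")
        # 3. The merkle root must be in the recent root history.
        if root not in self.known_roots:
            raise ValueError("unknown merkle root")
        # 4. The proof must verify against all six public inputs.
        public_inputs = [root, nullifier_hash, recipient, relayer, fee, refund]
        if not self.verify_proof(proof, public_inputs):
            raise ValueError("invalid withdrawal proof")
        # Mark the note as spent, then split the deposit.
        self.spent_nullifiers.add(nullifier_hash)
        return [(recipient, self.denomination - fee), (relayer, fee)]
```

Because the nullifier hash is recorded only after every check passes, a rejected withdrawal leaves the note spendable, while a successful one can never be replayed.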