We are building what we call a Privacy-Preserving Proof-of-Personhood Protocol (PPPoPP). We originally set out with the goal of airdropping a token to a billion people. Along the way, we found that we had to develop a lot of foundational infrastructure not only on the hardware side (which became the Orb), but also on the crypto/protocol side.
While many projects are building general-purpose rollups (e.g., Arbitrum, Optimism, zkSync), we came across the open source project Hubble, a minimal, application-specific rollup. It allows a highly efficient yet permissionless and non-custodial airdrop at the scale of one billion people.
Hubble’s open source contracts were already in great shape, and we decided to contribute a high-performance sequencer implementation written in Go, with the goal of deploying on mainnet as soon as possible. We managed to squeeze quite a bit of performance out of the sequencer in the process. While there will be a separate deep dive on Hubble in the future, this post will focus on Semaphore, another open source project from the appliedzkp team.
Because the Orb uses biometrics for the initial sign-up, we wanted to delink this step from the wallet and any future transactions to ensure the user’s privacy. So we settled on Semaphore because it not only allows us to add anonymity to a specific action, but also makes it very easy to reuse the setup for new applications, making future use cases possible.
The privacy in Semaphore is created by introducing a larger set of identities called “identity commitments,” which are hashes of a secret string, and comparable to a traditional public key. This set is represented as a Merkle tree in order to allow anyone in the set to efficiently prove membership. However, because this Merkle proof would leak an individual member’s identity, the proof needs to be verified in zero knowledge. This zero knowledge proof (ZKP) can then be transmitted to prove membership without leaking the identity, and be verified by anyone.
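To make the set structure concrete, here is a minimal sketch of identity commitments stored in a Merkle tree, with a plain (non-private) membership proof. It is an illustration under stated assumptions, not Semaphore's code: SHA-256 stands in for the ZK-friendly hash a real deployment would use, the tree is tiny, and all function names are hypothetical. In Semaphore the `verify` step would run inside the zero knowledge proof, so the leaf, index, and path would never be revealed.

```python
import hashlib


def h(*parts: bytes) -> bytes:
    """Stand-in hash (SHA-256) with length-prefixed inputs to avoid ambiguity."""
    digest = hashlib.sha256()
    for part in parts:
        digest.update(len(part).to_bytes(4, "big"))
        digest.update(part)
    return digest.digest()


def identity_commitment(secret: bytes) -> bytes:
    """The 'public key' is simply a hash of the secret."""
    return h(b"commitment", secret)


def build_levels(leaves):
    """Return every tree level, leaves first. Leaf count must be a power of two."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i], prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def merkle_proof(levels, index):
    """Collect the sibling hash at each level on the way from leaf to root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # sibling shares all bits but the lowest
        index //= 2
    return path


def verify(root, leaf, index, path):
    """Recompute the root from a leaf and its sibling path."""
    node = leaf
    for sibling in path:
        node = h(node, sibling) if index % 2 == 0 else h(sibling, node)
        index //= 2
    return node == root


secrets = [b"alice", b"bob", b"carol", b"dave"]
leaves = [identity_commitment(s) for s in secrets]
levels = build_levels(leaves)
root = levels[-1][0]
proof = merkle_proof(levels, 2)  # carol's proof
assert verify(root, leaves[2], 2, proof)
```

The proof holds one hash per tree level, which is why membership stays cheap to prove even for very large sets.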
The application using Semaphore decides how the set of identities is created. The smart contract has to implement custom logic for the “gatekeeper” function to add an identity to the set. In the case of Worldcoin, the uniqueness check on the IrisHash is the gatekeeper. The IrisHash provided and signed by an Orb has to be unique, and only then is the identity added to the set.
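The gatekeeper logic described above can be sketched as a small state machine. This is a toy model rather than the Worldcoin contract: the class and function names are hypothetical, and an HMAC with a shared demo key stands in for the Orb's signature, which in a real system would be a public-key signature checked on-chain.

```python
import hashlib
import hmac

ORB_KEY = b"demo-orb-key"  # hypothetical key; stands in for a real Orb signing key


def orb_sign(iris_hash: bytes) -> bytes:
    """Stand-in for the Orb signing an IrisHash."""
    return hmac.new(ORB_KEY, iris_hash, hashlib.sha256).digest()


def orb_signature_valid(iris_hash: bytes, signature: bytes) -> bool:
    """Stand-in for signature verification."""
    return hmac.compare_digest(orb_sign(iris_hash), signature)


class IdentitySet:
    """Toy contract state: the IrisHashes seen so far plus the identity list."""

    def __init__(self):
        self._seen_iris_hashes = set()
        self.commitments = []

    def register(self, iris_hash: bytes, signature: bytes, commitment: bytes) -> int:
        """Gatekeeper: add the commitment only for a signed, never-seen IrisHash."""
        if not orb_signature_valid(iris_hash, signature):
            raise PermissionError("IrisHash is not signed by a trusted Orb")
        if iris_hash in self._seen_iris_hashes:
            raise ValueError("IrisHash is already registered")
        self._seen_iris_hashes.add(iris_hash)
        self.commitments.append(commitment)
        return len(self.commitments) - 1  # leaf index of the new identity


identity_set = IdentitySet()
iris = hashlib.sha256(b"iris-sample-1").digest()
leaf_index = identity_set.register(iris, orb_sign(iris), b"commitment-1")
assert leaf_index == 0
```

Note that the identity commitment itself is not checked for uniqueness here; only the IrisHash is, which is what keeps the sign-up step separate from the identity used later.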
Semaphore fundamentally has two main parts to it: a set of identities and a mechanism to prove membership. Users can create a zero knowledge proof that simultaneously proves three important claims:
Membership. The set of users is stored as a Merkle tree of public keys (identity commitments). To prove membership, a user proves that they know a private key and a Merkle proof for the corresponding public key in the tree. All of this, including the Merkle proof, is verified inside the zero-knowledge proof, so no one can see which leaf was used or what the public key was. This is what provides anonymity within the set: an external observer sees only that the proof came from a member, without being able to see which member.
One-shot. In nearly all applications, we want to make sure that each member gets to do something only once (e.g., vote or spend a token). However, because membership is proven anonymously, we cannot tell if two proofs came from the same user. This is solved by having each proof publish a nullifier: a random-looking number that is derived deterministically from the user’s secret and is therefore unique to each user. Any two proofs from the same user carry the same nullifier, so we can detect the repeat without learning who the user is. Nullifiers are similar to random pseudonyms that cannot be linked to the real identity.
Nullifiers have been used since the first privacy coins. However, Semaphore goes beyond regular nullifiers. Take voting as an example: say we want every user to cast one vote on each proposal. With regular nullifiers, a user would publish the same nullifier on every proposal, so we could not distinguish a second vote on the same proposal from a first vote on a new one, and the user’s votes would be linkable across proposals. Instead, we need a new random pseudonym for each voting round. Semaphore makes this possible by mixing in a unique number for each round: the external nullifier. This is powerful because it allows us to build a single set of all humans that every application can build on.
Signal. To continue the example of voting, each user needs to be able to voice their decision. If we simply bundled their decision with a zero knowledge proof in a transaction, we would run into a problem: an attacker who sees the transaction could replace the decision, copy the proof, and front-run. To prevent this, we need to cryptographically tie the decision to the proof. Semaphore allows attaching an arbitrary signal to a proof to achieve this.
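The one-shot property can be illustrated with a small sketch of how an application might track nullifiers per voting round. It rests on several assumptions: SHA-256 stands in for the circuit's hash, the identity is a single secret, `Poll` and `nullifier` are hypothetical names, and verification of the zero knowledge proof itself is taken to have already succeeded.

```python
import hashlib


def h(*parts: bytes) -> bytes:
    """Stand-in hash (SHA-256) with length-prefixed inputs."""
    digest = hashlib.sha256()
    for part in parts:
        digest.update(len(part).to_bytes(4, "big"))
        digest.update(part)
    return digest.digest()


def nullifier(secret: bytes, external_nullifier: bytes) -> bytes:
    """Per-round pseudonym: deterministic for (user, round), unlinkable across rounds."""
    return h(b"nullifier", external_nullifier, secret)


class Poll:
    """One voting round, identified by its external nullifier."""

    def __init__(self, external_nullifier: bytes):
        self.external_nullifier = external_nullifier
        self.seen_nullifiers = set()
        self.votes = []

    def cast(self, nullifier_hash: bytes, vote: str) -> bool:
        # Assumption: the accompanying membership proof was verified beforehand.
        if nullifier_hash in self.seen_nullifiers:
            return False  # same member, same round: reject the second vote
        self.seen_nullifiers.add(nullifier_hash)
        self.votes.append(vote)
        return True


alice = b"alice-secret"
poll_1 = Poll(b"proposal-1")
poll_2 = Poll(b"proposal-2")

assert poll_1.cast(nullifier(alice, poll_1.external_nullifier), "yes")
assert not poll_1.cast(nullifier(alice, poll_1.external_nullifier), "no")
assert poll_2.cast(nullifier(alice, poll_2.external_nullifier), "no")
```

Because the round identifier is hashed in, the two polls see unrelated values for the same member, so the same identity set can serve any number of applications.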
Taken together, these claims are proven as follows (for the actual implementation, see here):
Pseudo-code for the Semaphore circuit implementation
Note that instead of something complex like elliptic curve signatures, the public key is simply a hash. This works because zero knowledge proofs keep the pre-image secret.
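As a rough stand-in for the circuit logic, the sketch below writes the statement as an ordinary Python predicate over public inputs and a private witness. It is not the actual circuit: SHA-256 replaces the circuit's hash, the identity is modeled as a single secret, and every name is hypothetical. A real proof would convince a verifier that this predicate returns true without revealing the witness.

```python
import hashlib


def h(*parts: bytes) -> bytes:
    """Stand-in hash (SHA-256) with length-prefixed inputs."""
    digest = hashlib.sha256()
    for part in parts:
        digest.update(len(part).to_bytes(4, "big"))
        digest.update(part)
    return digest.digest()


def root_from_path(leaf: bytes, path_bits, siblings) -> bytes:
    """Walk from leaf to root; bit 1 means the current node is a right child."""
    node = leaf
    for bit, sibling in zip(path_bits, siblings):
        node = h(sibling, node) if bit else h(node, sibling)
    return node


def statement_holds(root, nullifier_hash, external_nullifier, signal_hash,
                    secret, path_bits, siblings) -> bool:
    """Public inputs come first; secret, path_bits and siblings are the witness."""
    # Membership: the public key is the hash of the secret and sits in the tree.
    commitment = h(b"commitment", secret)
    if root_from_path(commitment, path_bits, siblings) != root:
        return False
    # One-shot: the published nullifier is derived from the same secret.
    if h(b"nullifier", external_nullifier, secret) != nullifier_hash:
        return False
    # Signal: signal_hash has no constraint of its own in this sketch. It is
    # bound simply by being a public input, so a proof made for one signal
    # would not verify against a different one.
    return True


# Two-member tree: alice is the left leaf, bob the right leaf.
alice_secret = b"alice-secret"
bob_commitment = h(b"commitment", b"bob-secret")
root = h(h(b"commitment", alice_secret), bob_commitment)
ext = b"proposal-1"
ok = statement_holds(root, h(b"nullifier", ext, alice_secret), ext,
                     h(b"vote: yes"), alice_secret, [0], [bob_commitment])
assert ok
```

The same secret feeds both the membership check and the nullifier, which is what ties the anonymous pseudonym to a real member of the set.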
We’ve done some early refactoring on the contracts to separate the identity tree from the usage and nullifier implementation, which allows for even more generic use cases. We’ve also created semaphore-rs and ported most of the relevant parts of the client library zk-kit from TypeScript to Rust. We are already using this library internally while building the Worldcoin wallet, which, in addition to supporting Ethereum and Hubble, is also an identity wallet with local Semaphore proof generation. The wallet will be a native iOS and Android app, and it uses a cross-compiled version of semaphore-rs under the hood. The wallet will be fully open sourced later in the year.
Example code to use semaphore-rs
We are currently working on scaling Semaphore. Improvements to scale Semaphore can be done on two sides:
1. Identity insertion: Currently, every leaf is inserted individually into the Merkle tree and the updated root is calculated inside the contract. Since the Merkle proofs have to be efficiently verifiable inside a ZKP, the tree uses Poseidon hashes instead of Keccak. Unfortunately, this makes insertion in Solidity very expensive (1-2M gas). A sequencer that batches insertions into a ZKP aggregating multiple Merkle tree updates will drive down the cost of identity insertion significantly.
2. Proof verification: Currently, each Semaphore proof is submitted and verified individually, making the cost of signaling around 300k gas. As with identity insertion, one can think of aggregating multiple proofs into a single proof to amortize the costs. Unfortunately, this is quite difficult with the proving system (Groth16) that is currently used by Semaphore. We are working on upgrading to a more modern proving system that also allows for aggregation.
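To see where the insertion cost in the first item comes from, the following sketch models an append-only Merkle tree that recomputes its root on every insertion and counts the hash evaluations. It is an illustration only: SHA-256 stands in for Poseidon, the class is hypothetical and not the deployed contract, and gas is not modeled. The point is that each single insertion needs one hash per tree level, and that is the work a batching sequencer would move off-chain into a proof.

```python
import hashlib


def h(left: bytes, right: bytes) -> bytes:
    """Stand-in for the tree hash (the text above names Poseidon)."""
    return hashlib.sha256(left + right).digest()


class IncrementalTree:
    """Append-only Merkle tree that updates its root on every insertion."""

    def __init__(self, depth: int):
        self.depth = depth
        self.size = 0
        self.hash_calls = 0  # hashes spent on insertions only
        # zeros[i] is the root of an empty subtree of height i.
        self.zeros = [b"\x00" * 32]
        for _ in range(depth):
            self.zeros.append(h(self.zeros[-1], self.zeros[-1]))
        # filled[i] is the most recent left-hand node at level i.
        self.filled = list(self.zeros[:depth])
        self.root = self.zeros[depth]

    def insert(self, leaf: bytes) -> None:
        if self.size >= 2 ** self.depth:
            raise OverflowError("tree is full")
        index = self.size
        node = leaf
        for level in range(self.depth):
            if index % 2 == 0:
                # Left child: remember it, pair it with an empty right subtree.
                self.filled[level] = node
                node = h(node, self.zeros[level])
            else:
                # Right child: pair it with the stored left sibling.
                node = h(self.filled[level], node)
            self.hash_calls += 1
            index //= 2
        self.root = node
        self.size += 1


tree = IncrementalTree(depth=20)
for i in range(3):
    tree.insert(hashlib.sha256(b"identity-%d" % i).digest())
assert tree.hash_calls == 3 * 20  # one hash per level, per insertion
```

With individual insertions the number of on-chain hashes grows as the batch size times the tree depth, whereas a single aggregated proof would leave the contract with one verification per batch.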
Why not just deploy it to Arbitrum or Optimism? Zero knowledge proofs still come with significant calldata, and that cost does not go away on L2s. Proof aggregation is the only way to reduce it to a sub-linear cost per verification.
We hope that our efforts will make it as easy as possible for developers to use Proof-of-Personhood as a new primitive in their own contracts and apps. We’ll also soon release more developer tooling that will allow us to interact with the Worldcoin wallet and all users of Worldcoin. There are many exciting applications of Proof-of-Personhood beyond airdrops and we want to help build the infrastructure for them.
If you’re interested in scaling Semaphore, working on L2s, or building developer tools: we’re hiring.
Remco Bloemen, Philipp Sippl