SoK: Validating Bridges as a Scaling Solution for Blockchains
Infura and Oasis Labs released a paper that focuses on presenting the emerging field of validating bridges (and rollups) in an accessible manner.
Several startups have collectively raised over $100 million to implement off-chain systems that promise to offer a Coinbase-like experience while retaining the underlying blockchain’s security for all processed transactions.
The cornerstone for the proposed off-chain systems, the validating bridge contract, is not well-understood by the community. Information is highly disparate across message boards, chat rooms and for-profit ventures that fund its rapid development.
To help inform the community, Infura and Oasis Labs have released a paper that focuses on presenting the emerging field in an accessible manner. Before we dive into the details, let’s appreciate the bottlenecks behind the scalability problem for networks like Ethereum.
What is the bottleneck for scalability?
Despite the volume of research into new scaling protocols for blockchains, the transaction throughput of Bitcoin and Ethereum has not significantly changed. It remains at about 10 transactions per second.
The bottleneck to scaling is not the blockchain (or consensus) protocol, but the pursuit of decentralization.
Remember, a blockchain is simply a historical transcript that allows anyone to independently compute the public database. This database records the account balances for every user and the current state of smart contracts. The goal of decentralization is to protect the integrity of this database from adversarial actors (and thus, protect our funds).
The pursuit of decentralization can be broken down into two goals:
- Permissionless validation. What percentage of the world’s population can verify the blockchain’s integrity in real time?
- Diversity of validators. What percentage of the world’s population are validating the blockchain’s integrity in real time?
Figure 1 highlights the end result: Ethereum is essentially a slow and expensive computer that everyone has to share. This is because network participants have self-imposed constraints in terms of compute, storage and bandwidth to ensure validation of transactions (and blocks) is permissionless on a global scale.
Therefore, the goal of scalability can be summarised as:
The task of scaling a cryptocurrency is to increase the transaction throughput while not changing the existing resource constraints in terms of compute, storage and bandwidth.
Who will validate a network that they cannot afford to use?
Only whales, or users with an expectation of significant profit, are happy to pay transaction fees in the hundreds of dollars. If a user cannot afford to use your network, they will not validate to protect it.
For a long time, especially during the blocksize wars, it appeared to be a conflict between two opposing goals:
- Affordability. Diversity of users who can afford to use the network
- Validation. Diversity of users who can (and will) validate the network
It does appear as an impossible dilemma to solve, but there is a way to work around the problem.
Off-chain Protocols
A popular approach to scale a network is to take assets that reside on the main chain (Ethereum), move the assets to an off-chain system, and simply transact there.
The concept of transacting on an off-chain system is not new. It has helped us scale cryptocurrencies for the past 12 years.
If you have used a cryptocurrency exchange or service (such as Coinbase), then you have used an off-chain system.
Custodial and private database
Figure 2 highlights the role of a trusted operator for a cryptocurrency service. The operator can freeze or confiscate our funds or, worse, lose them. In crypto, custody is a liability as opposed to an asset, as it is notoriously difficult to protect the assets from adversarial actors. In addition, the account database (which records user balances) is private and not auditable. A user cannot verify if the exchange is running a fractional reserve or if a hacker has stolen their coins until it is too late.
The root problem with off-chain systems to date is all the trust that’s required to make it work.
To learn more, MtGox remains the best example of failure in the Web2 model.
Public database and reducing trust
On Web3, the goal is to replace Web2 off-chain systems:
An off-chain system that will order user-generated transactions for execution and update a publicly verifiable database, but it does not have custody of the user’s funds.
Can payment channels solve the problem?
Anyone who is familiar with off-chain protocols will immediately point to state channels, like the Lightning network, as a solution to the problem.
Figure 3 illustrates a payment channel hub with four customers. While there is little to no counterparty risk and all payments are redeemable immediately, it is not an ideal scenario for any of the parties.
Let’s study the scenario in a bit more detail:
- Inbound capacity issues. Alice can only receive and hold up to 2 coins.
- Outbound capacity issues. Caroline can send up to one coin, but she cannot currently receive any coins.
- Opportunity cost. The operator has allocated four coins to Dave, but this may cost them money if Dave does not perform any transactions.
- On-chain transaction to join. All parties need an on-chain transaction to join the system and subsequent transactions to increase their outbound/inbound capacity limits.
- Hot wallet risk. The operator effectively has their signing key online for all channels and their funds are at risk.
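To make the capacity limits concrete, here is a minimal sketch of a payment channel hub. It is not how Lightning is implemented: the `Channel` class, the `pay` helper and the exact balances are assumptions chosen to mirror the scenario above (Alice and Dave are assumed to hold no coins of their own in their channels).

```python
from dataclasses import dataclass


@dataclass
class Channel:
    """A two-party channel between one customer and the hub operator."""
    user_balance: int      # coins the customer can send (outbound capacity)
    operator_balance: int  # coins the operator can send (customer's inbound capacity)


def pay(channels: dict, sender: str, receiver: str, amount: int) -> None:
    """Route a payment sender -> operator -> receiver, all-or-nothing."""
    src, dst = channels[sender], channels[receiver]
    # Check both legs before moving any coins.
    if amount > src.user_balance:
        raise ValueError(f"{sender} lacks outbound capacity")
    if amount > dst.operator_balance:
        raise ValueError(f"{receiver} lacks inbound capacity")
    # Leg 1: the sender pays the operator inside the sender's channel.
    src.user_balance -= amount
    src.operator_balance += amount
    # Leg 2: the operator pays the receiver inside the receiver's channel.
    dst.operator_balance -= amount
    dst.user_balance += amount


# Hypothetical balances mirroring the scenario above.
channels = {
    "alice": Channel(user_balance=0, operator_balance=2),     # can receive up to 2
    "caroline": Channel(user_balance=1, operator_balance=0),  # can send 1, receive 0
    "dave": Channel(user_balance=0, operator_balance=4),      # operator locked 4 coins
}
```

In this sketch, Caroline can pay Alice one coin, but Dave cannot pay anyone until he receives coins, and nobody can pay Caroline until she has sent something. The total number of coins in each channel never changes, which is why raising a limit needs a new on-chain transaction.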
The above issues are why custodial wallets are becoming popular for the Lightning network. We argue that channel networks are repayment protocols and should be used as an interoperability solution to synchronise updates across two or more off-chain systems. But that is a blog post for another day.
Just to summarise:
State (payment) channels are NOT an ideal platform for building a Coinbase-like experience for an off-chain system.
How do I build an off-chain system with a public database?
What matters is not the off-chain system, but how we build the bridge that connects the user’s assets on Ethereum to the off-chain system. A bridge contract is responsible for accepting user deposits on the main chain and notifying the off-chain system to unlock the same assets.
Figure 4 highlights (as also outlined in this note about bridges) that we must consider the trust assumption: who or what convinces the bridge contract to unlock the funds back to the user. Over the years, the decision to approve a withdrawal request has relied solely on a single authority. New bridge designs have focused on distributing this trust from a single authority to multiple authorities who may have a financial incentive to defend the bridge against invalid transactions. Overall, this still requires the bridge contract to blindly trust a decision from an authority.
The root problem with most bridge contracts is all the trust that’s required to make it work.
The Validating Bridge
A validating bridge contract is a new design that removes the need to trust any external authority to decide if a withdrawal is valid.
It has two components:
- Off-chain system. A system with its own appointment protocol, smart contract environment, and a widely replicated database anyone can independently recompute.
- Chain of cryptographic commitments (commitchain). Periodic commitments that assert new state updates for the off-chain database.
Unlike all other bridge contracts, it is tasked with protecting the integrity of the widely replicated public database before processing any withdrawal requests. Figure 6 highlights how the validating bridge contract leverages Ethereum to protect the off-chain system’s users.
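As a rough illustration of a chain of commitments, the sketch below links each commitment to the previous one by hashing. The function names, the JSON encoding of the database and the all-zero genesis value are assumptions made for the example; deployed systems typically commit to a Merkle root of the state rather than a hash of the whole database.

```python
import hashlib
import json

GENESIS = "00" * 32  # hypothetical starting commitment


def state_root(balances: dict) -> str:
    """Digest of the off-chain database (stand-in for a Merkle root)."""
    encoded = json.dumps(balances, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def commit(previous: str, balances: dict) -> str:
    """New commitment that binds the previous commitment to the new state."""
    return hashlib.sha256((previous + state_root(balances)).encode()).hexdigest()


# Two successive state updates, each posted to the bridge as one commitment.
c1 = commit(GENESIS, {"alice": 5, "bob": 0})
c2 = commit(c1, {"alice": 3, "bob": 2})
```

Because the database is widely replicated, anyone can recompute `c1` and `c2` from the published data and compare them with what was posted to the bridge contract.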
It is up to the operators for the off-chain system to periodically convince the bridge contract:
- State update validity (safety). All proposed state updates to the widely replicated database are valid.
- Eventual execution (liveness). The off-chain system continues to process user-generated transactions.
If the bridge contract is convinced, for whatever reason, that either property is violated, then it will ignore and bypass the operators for the off-chain system. The bridge can self-enforce the ordering and execution of transactions. This is required to permit honest users to unwind their position in a smart contract and eventually withdraw their funds.
Agents and protocol goals
Figure 6 provides an overview of the system which includes multiple agents:
- Sequencer (set). Entrusted with the fast-path of transaction confirmation and they are responsible for proposing an ordered list of transactions for execution.
- Executor (set). Asserts the final execution for an ordered list of transactions.
- Challenger (optional set). Validates the asserted execution and they will assist the bridge contract to identify invalid transactions.
The sequencer is responsible for the user experience. This includes acknowledging the acceptance of user-generated transactions and offering the pending state of the database before the user decides to transact. At the same time, it is the set of executors (and the bridge contract) that enforces censorship-resistance and liveness of transaction execution.
A simple way to imagine the relationship:
Coinbase runs the sequencer for processing most transactions. Any user can become an executor to enforce the eventual ordering and execution of a valid transaction. No need to deal with customer support to process your transaction!
We further explore the role of each agent in the paper. But it does bring us to the protocol goals, in terms of both the operators' (sequencers, executors) experience and the desired user experience.
Operator requirements:
- No collateral to operate. The operators can instantiate a new off-chain system without locking up collateral.
- Operational cost efficiency. The fee on the off-chain system should cover the cost of interacting with a validating bridge contract.
- Unrestricted experimentation. The off-chain system’s environment should not be restricted because of the Ethereum Virtual Machine (EVM).
- Proof of reserves. It is publicly verifiable whether the off-chain system’s assets cover its liabilities.
User experience:
- No on-chain registration. A user can receive funds on the off-chain system without interacting with the main chain.
- No transfer limits. A user can send their entire balance in a single transaction (and there are no restrictions on the amount of funds a user can receive).
- Transaction post-conditions enforced. The off-chain system must respect conditions set by the user for their transaction to be valid.
- Validating pending database state. There should be a minimised protocol to let the user verify the pending state of the database before deciding to transact.
The operator requirements focus on overcoming the issues that arose with state channel hubs to ensure anyone can operate an off-chain system that derives its security from the underlying blockchain. On the other hand, the user experience goals focus on what it means to build a Coinbase-like user experience in Web3.
Want to know how well the popular implementations solve the above goals? Check out the paper!
Security goals
What does it actually mean for an off-chain system to derive security from the underlying blockchain?
To help answer that question, we have defined three security goals for a validating bridge contract:
- Data availability problem. The bridge contract can verify the data is publicly available such that anyone can recompute the off-chain system’s database independently.
- State transition integrity. The bridge contract is convinced that all transactions executed in the commitment are valid and the commitment represents a valid new state of the database.
- Censorship resistance. The bridge contract can enforce the eventual ordering and execution of transactions to ensure users can always unwind their positions and withdraw their funds.
The most popular approach for solving the data availability problem, called a rollup, is to post all data to the underlying blockchain.
Insight: This implies that computation is expensive, but data is cheap on Ethereum. In the rollup world, computation becomes cheap and data is expensive.
It is up to the executor to periodically assert a cryptographic commitment to the bridge contract about the off-chain database’s new state. This commitment is finalised only after the bridge contract is convinced that it is indeed valid.
There are two ways to protect the integrity of commitments posted to the bridge contract and ultimately for state updates to the database:
- Fraud-proof approach (Optimistic). Challengers are responsible for validating commitments posted to the bridge contract. There is a challenge time window that gives a challenger time to send indisputable evidence (proof of fraud) that the commitment is invalid. If there is no evidence of fraud by the time the challenge window expires, then the commitment is accepted as valid.
- Validity-proof approach (Zero-knowledge). The executor must present evidence alongside the commitment to prove that all state transitions are valid. There is no challenge period and the bridge contract can accept the commitment immediately.
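The difference between the two approaches can be sketched as two toy bridges. Everything here is hypothetical: the class names, the 100-block `CHALLENGE_WINDOW`, and the `fraud_proven` flag and `verify` callback, which stand in for the on-chain checks a real bridge contract would perform.

```python
from dataclasses import dataclass
from typing import Callable, List

CHALLENGE_WINDOW = 100  # hypothetical window length, measured in blocks


@dataclass
class Commitment:
    state_root: str
    submitted_at: int
    status: str = "pending"  # "pending", "finalised" or "rejected"


class OptimisticBridge:
    """Accepts commitments unless fraud is proven inside the window."""

    def __init__(self) -> None:
        self.commitments: List[Commitment] = []

    def submit(self, state_root: str, now: int) -> int:
        self.commitments.append(Commitment(state_root, now))
        return len(self.commitments) - 1

    def challenge(self, index: int, fraud_proven: bool, now: int) -> None:
        c = self.commitments[index]
        if c.status != "pending" or now >= c.submitted_at + CHALLENGE_WINDOW:
            raise ValueError("challenge window is closed")
        if fraud_proven:  # stand-in for verifying a fraud proof on-chain
            c.status = "rejected"

    def finalise(self, index: int, now: int) -> str:
        c = self.commitments[index]
        if c.status == "pending" and now >= c.submitted_at + CHALLENGE_WINDOW:
            c.status = "finalised"
        return c.status


class ValidityBridge:
    """Accepts a commitment immediately, but only with a valid proof."""

    def __init__(self, verify: Callable[[str, bytes], bool]) -> None:
        self.verify = verify  # stand-in for an on-chain proof verifier
        self.finalised: List[str] = []

    def submit(self, state_root: str, proof: bytes) -> bool:
        if not self.verify(state_root, proof):
            return False
        self.finalised.append(state_root)
        return True
```

The trade-off shows up in the method signatures: the optimistic bridge needs a notion of time and at least one watchful challenger, while the validity bridge needs a proof with every commitment.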
Finally, censorship resistance, which we believe is the most understudied problem and needs to be investigated further, typically relies upon two core components:
- Forced transaction inclusion. Any honest user can post their transaction to the bridge contract. It will eventually be ordered and executed in the off-chain system.
- One honest executor. Any honest user can self-appoint themselves as an executor and assert a new cryptographic commitment that executes their transaction.
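One way to picture forced transaction inclusion is a queue held by the bridge contract that executors are not allowed to ignore forever. The sketch below is an assumption-laden toy, not the mechanism of any particular rollup: the `ForcedInclusionQueue` class and the 10-block `INCLUSION_DELAY` are invented for illustration.

```python
from collections import deque
from typing import List

INCLUSION_DELAY = 10  # hypothetical grace period, measured in blocks


class ForcedInclusionQueue:
    """Transactions posted directly to the bridge by users."""

    def __init__(self) -> None:
        self.queue = deque()  # entries are (transaction, posted_at)

    def post(self, tx: str, now: int) -> None:
        self.queue.append((tx, now))

    def overdue(self, now: int) -> List[str]:
        """Forced transactions whose grace period has expired."""
        return [tx for tx, posted in self.queue if now - posted >= INCLUSION_DELAY]

    def accept_batch(self, batch: List[str], now: int) -> bool:
        """Reject any batch that omits an overdue forced transaction."""
        if any(tx not in batch for tx in self.overdue(now)):
            return False
        # Drop the forced transactions that the batch has now executed.
        self.queue = deque((tx, t) for tx, t in self.queue if tx not in batch)
        return True
```

If every existing executor keeps omitting the transaction, their batches are rejected, and the user can step in as the one honest executor and submit a batch that includes it.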
We provide an in-depth breakdown of the above solutions (and more) in the paper. It is not yet known if the solutions can work in a pick-and-mix manner, but there is a single point to make:
The combination of solutions focuses on getting close to, but is not exactly the same as, the underlying blockchain’s full security.
Discussion and fun insights
Real-world implementation inspection
We provide a code inspection of Arbitrum, Optimism, Oasis, Starkware and ZkSync. The goal is to help understand the problems they have faced, the solutions deployed, and their future direction. Oasis is the only non-Ethereum project studied and further work will include blockchain networks such as Polkadot and Cosmos.
Getting close to Ethereum’s security is expensive
In Table 1, we captured gas costs for interacting with the bridge contract. Only Arbitrum and Optimism support smart contract execution, which costs about 4,000 to 5,000 gas per transaction. ZkSync withdrawals are about 15,000 gas as it is the only protocol to aggregate transactions. For the StarkWare gas costs, take note of the cost of verifying signatures by the data availability committee and that the STARK proof is not natively supported by the EVM. More information about the gas costs can be found in the paper.
Alleviating the data availability bottleneck?
The initial approach for validating bridges, Plasma, focused on keeping data off-chain. Most designs relied on a challenge process where a user has a time window to force the operator to make the data available via the underlying blockchain. There are two issues that ultimately led to the community moving away from Plasma:
- Fisherman’s dilemma. It is indistinguishable whether the sequencer withheld data or if the user issued an unnecessary challenge. The sequencer can keep data private and force the user to issue a challenge for data availability. Since blame cannot be ascribed, the user must cover the cost for the sequencer's malicious behaviour.
- Process challenge limitations. The bridge contract lacks the resources to process all data availability challenges in a timely manner and as such users may fail to force the sequencer to reveal their data. This can result in a mass exit as all users attempt to exit the off-chain system.
The ultimate risk of Plasma was a mass-exit
Users will rush to withdraw their coins from the system before all other users if they believe the system is about to become compromised. If access to data about the off-chain system is restricted, it can prevent eventual progress of the off-chain system, which is a liveness violation. This is the case even if the state integrity problem is solved using zero knowledge proofs. On the other hand, if the bridge relies on a fraud-proof system, then a challenger cannot prevent an invalid transaction from executing. This is a safety violation and it implies the system may no longer be fully collateralised to facilitate all user withdrawals.
Pursuit of EVM-compatibility
An implicit goal for an off-chain system that relies on a validating bridge is to look and feel like Ethereum. All existing tooling, wallet software and deployment of Solidity smart contracts should work out of the box. This is an interesting dilemma as off-chain systems are experimenting with new virtual machines that extend the functionality beyond what the EVM is capable of (without sacrificing security derived from Ethereum). For example, Cairo by Starkware is an execution environment that is friendly for zero-knowledge proofs and Arbitrum has implemented an entire virtual machine (AVM) as Solidity smart contracts.
The initial versions for off-chain systems including Optimism, ZkSync and Starkware were not EVM-compatible. To the best of our knowledge, this has led to issues when onboarding new projects. Crypto teams tend to have about 10 people and lack the resources to rewrite their entire project to satisfy a new execution environment (even if it is better!). Arbitrum was the only rollup to appear EVM-compatible at launch and its traction is evidence that this is indeed a desirable goal. All projects are now working towards EVM-compatibility, or at least, making it appear indistinguishable from the user/developer perspective.
Composability among off-chain systems?
We envision the rise of hundreds, or thousands, or even hundreds of thousands of off-chain systems that derive security from a validating bridge contract. We fear this will reduce atomicity and composability of smart contracts as collateral (and smart contract functionality) is fragmented across multiple off-chain systems.
Cryptocurrencies have already operated with multiple off-chain systems for over 10 years and it is not a new problem. For a long time, cryptocurrencies such as Bitcoin, Ethereum and even Tron were an interoperability solution among popular exchanges such as Coinbase, BitMEX and FTX. It is not uncommon for exchanges to support zero-confirmation transactions from trusted accounts to provide composability for traders.
Rise of liquidity providers and repayment protocols
The key difference between the existing custodial exchanges and the new off-chain systems is the programmable and open nature. Anyone can build services on top of the off-chain systems. We foresee the rise of a decentralized set of liquidity providers to help alleviate the composability issues that arise. They can leverage repayment protocols to minimize trust in the counterparty while optimistically facilitating rapid transfers.
There are two flavors of repayment protocols:
- Eventual repayment. The liquidity provider must wait a period of time until they have access to the repaid funds.
- Immediate repayment. The liquidity provider optimistically has immediate access to the repaid funds after paying the user.
Atomic swaps (and the Lightning network) are examples of an immediate repayment protocol, whereas the MakerDAO oracle, the HOP protocol and MOVR network are examples of an eventual repayment protocol. We foresee more exotic repayment protocols emerging that leverage DeFi primitives. For example, some protocols are leveraging liquidity pools to incentivise re-balancing of assets across off-chain systems.
Our goal was to shine the spotlight on the validating bridge, which is the cornerstone for extending Ethereum’s security to the off-chain system. While there are still parties who run the off-chain system, the task for the validating bridge is to ensure the diversity of parties who can participate is permissionless while minimising the trust required for the system to work.
We expect designs to emerge that constrain the sequencer so much that it is not even trusted with transaction ordering: it will simply accept inbound transactions, follow a deterministic ordering protocol, and then publish the final order for execution.
The motivation for organisations to adopt a validating bridge is threefold:
- No restriction on services offered. Off-chain systems can have expressive programmable environments that anyone can build upon. StarkEx powers an NFT exchange (Immutable) and a derivatives exchange (dYdX), Reddit plans to launch its own instance of Arbitrum, and Arbitrum One already hosts more than 80 projects.
- No liability by holding custody of user funds. It is the validating bridge contract and not the off-chain system that has custody of funds. The operators are only trusted with ordering and eventual execution of transactions. This can reduce the impact of regulatory pressure as they are not required to protect the user’s funds from adversarial actors.
- Reduce on-chain costs as throughput increases. With the rise of validity proofs, we expect the average financial cost for leveraging Ethereum’s security to reduce as transaction throughput on the off-chain system increases. Thus, we can scale transaction throughput without sacrificing the security derived from Ethereum.
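The cost argument in the last point can be sketched as simple arithmetic: one proof is verified per batch, so its cost is shared by every transaction in that batch. The function below and the gas figures used with it are illustrative placeholders, not measurements from the paper or from any deployed system.

```python
def gas_per_transaction(
    proof_verification_gas: float,
    data_gas_per_tx: float,
    batch_size: int,
) -> float:
    """Average on-chain gas per transaction for one batch.

    The fixed cost of verifying a single validity proof is split across
    the batch; the data posted for each transaction is paid in full.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return proof_verification_gas / batch_size + data_gas_per_tx


# Placeholder figures: 500,000 gas to verify a proof, 300 gas of data per tx.
small_batch = gas_per_transaction(500_000, 300, batch_size=100)
large_batch = gas_per_transaction(500_000, 300, batch_size=10_000)
```

Under these assumed numbers the average cost falls as the batch grows, and it approaches the per-transaction data cost, which is the part that does not amortise.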
In a way, it is making the original vision for sidechains as a platform for experimentation a reality. We hope this Systemization of Knowledge (SoK) and the research questions highlighted within will help the community build bridges for a multi-chain world.
Read the paper to dive deeper into the discussions and the interesting research questions that appear.