The Importance of Fast Transaction Confirmation on Ethereum
One of the important attributes of a good blockchain user experience is fast transaction confirmation. Ethereum today is significantly better in this respect than it was five years ago. Thanks to EIP-1559 and the stable block time since the transition to PoS (The Merge), transactions sent by users on L1 are usually confirmed within 5-20 seconds, which is roughly comparable to the experience of paying with a credit card. However, improving the user experience further is still valuable, and some applications require latencies of a few hundred milliseconds or less. This article explores some practical options for Ethereum to improve transaction confirmation times.
Currently, Ethereum’s Gasper consensus uses a slot-and-epoch architecture. Every 12 seconds there is a new slot, in which a subset of validators votes on the head of the chain. Over the course of 32 slots (6.4 minutes), every validator gets the opportunity to vote once. These votes are then reinterpreted as messages in a consensus algorithm similar to PBFT, and after two epochs (12.8 minutes) the chain reaches a very strong economic guarantee called finality.
In recent years, dissatisfaction with this approach has been growing. The main reasons are that it is complex, with many interaction bugs between slot-by-slot voting and epoch-by-epoch finality, and that 12.8 minutes is too long for anyone to wait.
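The timing figures above follow directly from the slot and epoch parameters. A minimal sketch of the arithmetic, with constant and function names chosen here purely for illustration:

```python
# Parameters taken from the text above; the names are illustrative only.
SECONDS_PER_SLOT = 12    # a new slot every 12 seconds
SLOTS_PER_EPOCH = 32     # every validator gets one vote per epoch
EPOCHS_TO_FINALITY = 2   # finality is reached after two epochs


def epoch_minutes() -> float:
    """Length of one epoch in minutes."""
    return SECONDS_PER_SLOT * SLOTS_PER_EPOCH / 60


def finality_minutes() -> float:
    """Time to finality in minutes."""
    return EPOCHS_TO_FINALITY * epoch_minutes()


print(epoch_minutes())     # 6.4
print(finality_minutes())  # 12.8
```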
Single Slot Finality (SSF) replaces this architecture with a mechanism similar to the Tendermint consensus, where block N is finalized before block N + 1 is generated. The main difference from Tendermint is that SSF retains the “inactivity leak” mechanism, which allows the chain to continue running and recover when more than 1/3 of validators are offline.
One of the main challenges of SSF is that, naively, it requires every Ethereum staker to publish two messages every 12 seconds, which is a significant burden on the chain. There are some clever ideas to mitigate this, including the recent Orbit SSF proposal. However, while SSF significantly speeds up “finality” and so improves the user experience, it does not change the fact that users still need to wait 5-20 seconds for their transactions to be included.
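To get a feel for the size of that burden, here is a rough back-of-the-envelope sketch. The staker count is a hypothetical round number chosen for illustration, not a figure from this article, and the function name is likewise an assumption:

```python
def ssf_messages_per_second(num_stakers: int,
                            messages_per_slot: int = 2,
                            slot_seconds: int = 12) -> float:
    """Average network-wide message rate if every staker publishes
    `messages_per_slot` messages in every slot."""
    return num_stakers * messages_per_slot / slot_seconds


# Hypothetical staker count, for illustration only.
EXAMPLE_STAKERS = 1_000_000

rate = ssf_messages_per_second(EXAMPLE_STAKERS)
print(f"about {rate:,.0f} messages per second")
```

Under that assumption the chain would need to handle on the order of a hundred thousand messages per second, which is why proposals that reduce the number of validators signing in each slot matter.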
In recent years, Ethereum has been following a roadmap centered around rollups, designing the Ethereum base layer (L1) to support data availability and other functions, which can then be used by L2 protocols (such as rollups, validiums, and plasmas) to provide security at a larger scale for users.
This has led to a separation of concerns within the Ethereum ecosystem: Ethereum L1 focuses on censorship resistance, reliability, stability, and maintaining and improving certain core functions of the base layer, while L2s take on the task of reaching users more directly, through different cultures and technologies. However, this path raises an unavoidable problem: L2s want to offer users confirmations much faster than 5-20 seconds.
At least in theory, it is the responsibility of each L2 to create its own “decentralized sequencer” network. A small group of validators could sign blocks every few hundred milliseconds and put their staked assets behind those blocks. Eventually, the headers of these L2 blocks would be published to L1.
However, L2 validators could cheat: they could first sign block B1, and then later sign a conflicting block B2 and get it onto the chain before B1. But if they do this, they will be caught and lose their staked assets. While we have seen centralized versions of this in practice, progress towards decentralized sequencer networks for rollups has been slow. It could be argued that requiring every L2 to run a decentralized sequencer is unfair: we are essentially asking rollups to do work almost equivalent to creating a brand new L1. For this reason, Justin Drake has been advocating for a way to give all L2s (and L1) access to a shared, Ethereum-wide preconfirmation mechanism: based preconfirmations.
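The double-signing scenario above can be sketched as a simple equivocation check. This is a minimal illustration, not any rollup's actual slashing logic: the `SignedBlock` type, the function names, and the full-stake penalty are all assumptions made for the example, and signature verification is omitted entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignedBlock:
    """A block a validator has put its stake behind (hypothetical type)."""
    validator: str
    height: int
    block_hash: str


def find_equivocators(signed_blocks):
    """Return validators that signed two different blocks at the same height.

    Assumes each SignedBlock has already been checked to carry a valid
    signature from `validator`; that check is not shown here.
    """
    seen = {}          # (validator, height) -> first block hash observed
    offenders = set()
    for sb in signed_blocks:
        key = (sb.validator, sb.height)
        first_hash = seen.setdefault(key, sb.block_hash)
        if first_hash != sb.block_hash:
            offenders.add(sb.validator)
    return offenders


def slash(stakes, offenders):
    """Illustrative penalty: offenders lose their whole stake."""
    return {v: (0 if v in offenders else amount) for v, amount in stakes.items()}


blocks = [
    SignedBlock("alice", 100, "B1"),
    SignedBlock("alice", 100, "B2"),  # conflicts with B1 at the same height
    SignedBlock("bob", 100, "B1"),
]
offenders = find_equivocators(blocks)
print(offenders)                                   # {'alice'}
print(slash({"alice": 32, "bob": 32}, offenders))  # {'alice': 0, 'bob': 32}
```

The two conflicting signatures are themselves the evidence, so anyone who observes both blocks can prove the misbehavior.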
The based preconfirmation approach assumes that Ethereum proposers will become highly sophisticated actors for MEV-related reasons. Based preconfirmations take advantage of this sophistication by incentivizing these proposers to accept the responsibility of providing preconfirmation services.
The basic idea is to create a standardized protocol through which a user can pay an additional fee and, in exchange, receive an immediate guarantee that their transaction will be included in the next block, along with a claim about the result of executing it. If a proposer breaks any promise it makes to any user, it can be penalized.
As described, based preconfirmations provide guarantees for L1 transactions. If rollups are “based”, then all L2 blocks are L1 transactions, so the same mechanism can be used to provide preconfirmations for any L2.
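As a rough illustration of what such a promise might contain and how a violation could be checked, consider the sketch below. No standardized preconfirmation protocol is specified in this article, so every field and function name here is hypothetical:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preconfirmation:
    """A proposer's promise to a user (all field names are hypothetical)."""
    proposer: str
    slot: int              # the slot whose block must include the transaction
    tx_hash: str
    promised_result: str   # the claimed result of executing the transaction
    fee_paid: int          # the additional fee the user paid for the promise


def is_violated(promise: Preconfirmation, block_results: dict) -> bool:
    """Check a promise against the block produced for `promise.slot`.

    `block_results` maps tx_hash -> actual execution result for that block.
    The promise is broken if the transaction is missing from the block or
    if its actual result differs from the promised one.
    """
    actual = block_results.get(promise.tx_hash)
    return actual is None or actual != promise.promised_result


promise = Preconfirmation("proposer-1", 42, "0xabc", "swap ok", 5)

print(is_violated(promise, {"0xabc": "swap ok"}))   # False: promise kept
print(is_violated(promise, {"0xabc": "reverted"}))  # True: wrong result
print(is_violated(promise, {}))                     # True: not included
```

A real design would also need the promise to be signed by the proposer and backed by collateral, so that a detected violation can actually be penalized.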
Suppose we implement single slot finality. We use Orbit-like techniques to reduce the number of validators signing in each slot, but not by too much, so that we can also make progress on lowering the 32 ETH staking minimum. As a result, slot time might creep up to 16 seconds. We then use rollup preconfirmations or based preconfirmations to give users faster confirmations. What we end up with is, once again, an epoch-and-slot architecture.
There is a deep philosophical reason why epoch-and-slot architectures seem so hard to avoid: it inherently takes less time to reach rough agreement on something than to reach maximally hardened “economic finality” agreement on it.
One simple reason is the number of nodes. Although the old linear tradeoff between decentralization, finality time, and overhead now looks milder thanks to highly optimized BLS aggregation and upcoming ZK-STARKs, it is still fundamentally true that:
“Approximate consensus” only requires a small number of nodes, while economic finality requires a majority of nodes.
Once the number of nodes exceeds a certain scale, you need to spend more time collecting signatures.
In today’s Ethereum, the 12-second slot is divided into three subslots: block publication and distribution, attestation, and attestation aggregation. If the number of attesters were significantly lower, we could drop to two subslots and an 8-second slot time. Another, realistically larger, factor is the “quality” of nodes. If we could also rely on a specialized subset of nodes to reach approximate consensus (while still using the full validator set for finality), we could plausibly reduce slot time to about 2 seconds.
Therefore, in my view, some form of epoch-and-slot architecture is clearly the right approach, but not all epoch-and-slot designs are equal, and the design space is worth exploring more fully. In particular, it is worth exploring options that are not tightly coupled the way Gasper is, and that instead keep a stronger separation of concerns between the two mechanisms.
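The 12-second and 8-second figures above are consistent with equal-length subslots. A small sketch of that arithmetic, assuming each subslot takes the same 4 seconds (12 seconds divided by three); the names are illustrative:

```python
# Assumption: all subslots are equally long, i.e. 12 s / 3 = 4 s each.
SUBSLOT_SECONDS = 12 / 3


def slot_seconds(num_subslots: int) -> float:
    """Slot time for a given number of equal-length subslots."""
    return num_subslots * SUBSLOT_SECONDS


print(slot_seconds(3))  # 12.0  (today's three-subslot layout)
print(slot_seconds(2))  # 8.0   (after dropping one subslot)
```

The roughly 2-second figure does not come from removing further subslots; it would require shortening the subslots themselves, which is where the quality of the participating nodes comes in.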
In my view, L2 currently has three reasonable strategies:
1. Be “based”, both technically and spiritually. In other words, they optimize for passing through the Ethereum base layer’s technical properties and its values (high decentralization, censorship resistance, etc.). In its simplest form, you can think of these rollups as “branded shards,” but they can also be much more ambitious, experimenting heavily with new virtual machine designs and other technical improvements.
2. Become a “server with a blockchain scaffold” and make the most of it. If you start from a server and then add (i) STARK validity proofs to ensure the server follows the rules, (ii) guaranteed rights for users to exit or force transactions, and possibly (iii) freedom of collective choice, either through coordinated mass exit or through the ability to vote to change the sequencer, then you gain most of the benefits of being on-chain while keeping most of the efficiency of a server.
3. A compromise approach: a fast chain with around 100 nodes, with Ethereum providing additional interoperability and security. This is the de facto current roadmap for many L2 projects.
For some applications (such as ENS, key storage, and some payment protocols), a 12-second block time is already sufficient. For applications where it is not, the only solution is an epoch-and-slot architecture. In all three cases, the “epoch” is Ethereum’s SSF, but the “slot” is something different in each of the three cases above:
A native Ethereum epoch-and-slot architecture
Server preconfirmations
Committee preconfirmations
A key question is how good we can make the first type. In particular, if it becomes very good, the third type loses much of its significance. The second type will always exist, because no “based” solution works for L2s such as plasmas and validiums. But if a native Ethereum epoch-and-slot architecture can bring slot time down to 1 second, the space left for the third type becomes much smaller.
We are still far from final answers to these questions. One key question, how sophisticated block proposers will become, remains an area of considerable uncertainty. Designs like Orbit SSF are very new, so the design space of schemes that use something like Orbit SSF as the epoch in an epoch-and-slot architecture is still worth exploring. The more options we have, the better we can serve users of both L1 and L2, and the easier we can make the work of L2 developers.
Source: Odaily星球日報南枳