Last week, Ethereum briefly stopped finalizing blocks, raising concerns across the web3 community despite transactions continuing to be processed normally.
Two incidents rattled the Ethereum ecosystem on Thursday and Friday, with blocks failing to finalize for three and eight epochs (roughly 20 minutes and one hour) in separate events. dYdX, a popular derivatives platform, paused deposits while waiting for finalization to resume.
Developers released patches for the two affected clients, Prysm and Teku, on Friday, but researchers are still unsure of the exact cause.
“I’m not sure that any of us fully understand why,” Ben Edgington of the Ethereum Foundation said. “It’s still under analysis exactly what the root cause of the issue was and why the chain recovered.”
This is the first major incident suffered by the Beacon Chain, Ethereum’s proof-of-stake (PoS) consensus layer that merged with the mainnet execution layer last September, and serves as a cautionary reminder of the experimental nature of blockchain technology.
Despite Ethereum being the No. 2 cryptocurrency with a $225B market cap and a $27B DeFi ecosystem, the protocol can still encounter unexpected issues, particularly while work continues on its disruptive roadmap of upgrades.
Business As Usual
Ethereum users continued to transact on-chain throughout the incident.
“Although the network was unable to finalize, the network was, as designed, live, and end users were able to transact on the network,” the Ethereum Foundation said in a blog post. “After all clients caught up, the network finalized again.”
The Ethereum Foundation attributed the incident to an “exceptional scenario” that put a high load on the Teku and Prysm consensus layer clients. “The full cause for this is still being evaluated,” it added.
Teku and Prysm’s patches include optimizations limiting resource usage during periods of network congestion.
Post Mortem
On Sunday, Ben Edgington of the Ethereum Foundation and Superphiz, the Beacon Chain community health consultant, discussed the incident on YouTube.
Edgington said finality occurs when at least two-thirds of validators agree on Ethereum’s state during attestations after each epoch. He said last week’s incident manifested as roughly 60% of validators failing to attest at the same time, leaving only about 40% attesting and preventing the network from reaching finality.
“It [was] as if 60% of the validators went offline,” Edgington said. “To finalize the chain, we need two-thirds or 66% of validators showing up.”
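The two-thirds rule Edgington describes can be illustrated with a short sketch. This is a simplified model rather than client code: it assumes every validator carries equal weight (the protocol actually weighs attestations by staked balance), and the function name `can_finalize` and the validator counts are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

# Supermajority needed for finality, per the two-thirds rule described above.
FINALITY_THRESHOLD = Fraction(2, 3)


def can_finalize(attesting: int, total: int) -> bool:
    """Return True if attesting validators form at least a two-thirds majority.

    Simplifying assumption: every validator carries equal weight.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    # Exact rational arithmetic avoids float rounding right at the threshold.
    return Fraction(attesting, total) >= FINALITY_THRESHOLD


# Illustrative numbers only: 60% of 500,000 validators fail to attest.
total_validators = 500_000
attesting = total_validators * 40 // 100  # the 40% still attesting
print(can_finalize(attesting, total_validators))  # False
```

Note that 66% participation would still fall just short in this model, since two-thirds is slightly more than 66%.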
Client Diversity
The pair described the network’s recovery as a testament to the value of Ethereum client diversity, with only two of Ethereum’s five major clients suffering issues.
Edgington said Lighthouse client users experienced no issues during the incident because Lighthouse rate-limits the reprocessing of old states. However, he said that Lighthouse’s design could cause different problems under certain circumstances.
“As we’ve seen around these edge cases, it can actually strengthen things if clients take slightly different approaches because some will be able to carry the network where others fail,” he said.
Recurring Problem
Edgington and Superphiz agreed that Ethereum is likely to encounter similar issues in the future.
While researchers are currently unsure what exactly triggered the finalization issues, Edgington suggested the speed of the network’s growth may be driving up the computational resources needed to validate Ethereum.
He noted that Ethereum’s validator count is up by 2,500% since the Beacon Chain launched in December 2020, conceding that developers may have neglected large-scale stress-testing on testnets in recent years.
Edgington said Ethereum’s core developers have learned their lesson and will deploy large private testnets to “stress test some of these scenarios with more realistic validator numbers.”
Emergency State
While the Ethereum network regained finalization on its own last week, Edgington and Superphiz noted that measures are in place to protect the network against a severe outage.
Finalization usually occurs after two epochs, but the Beacon Chain enters an emergency state called “Inactivity Leak” mode if finalization does not occur after four epochs. In this mode, validators receive no rewards for attesting but face escalating penalties for failing to do so.
Edgington said the mechanism slowly drains ETH from non-performing validators until active validators come to represent a two-thirds majority and can finalize the network again.
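The draining mechanism can be sketched as a toy simulation. This is a rough model under stated assumptions, not the consensus specification: the constants are assumed to match Bellatrix-era mainnet settings, all validators start with the same balance, and the model ignores ejection of low-balance validators, effective-balance rounding, and ordinary attestation penalties. The function name `epochs_until_finality` is hypothetical.

```python
# Assumed constants (believed to match Bellatrix-era mainnet; treat as illustrative).
INACTIVITY_SCORE_BIAS = 4
INACTIVITY_PENALTY_QUOTIENT = 2**24
SECONDS_PER_EPOCH = 32 * 12  # 32 slots of 12 seconds each


def epochs_until_finality(offline_fraction: float,
                          start_balance: float = 32.0,
                          max_epochs: int = 100_000) -> int:
    """Count leak epochs until online stake reaches two-thirds of total stake.

    Toy model: online validators keep a constant balance (no rewards during
    the leak), while each offline validator's inactivity score grows every
    epoch and its penalty scales with balance * score.
    """
    if not 0.0 <= offline_fraction < 1.0:
        raise ValueError("offline_fraction must be in [0, 1)")
    online_stake = (1.0 - offline_fraction) * start_balance
    offline_balance = start_balance
    score = 0
    for epoch in range(max_epochs + 1):
        offline_stake = offline_fraction * offline_balance
        # Finality is possible once online stake is >= 2/3 of all stake.
        if 3 * online_stake >= 2 * (online_stake + offline_stake):
            return epoch
        # Escalating penalty: the score rises each epoch spent offline.
        score += INACTIVITY_SCORE_BIAS
        penalty = offline_balance * score / (
            INACTIVITY_SCORE_BIAS * INACTIVITY_PENALTY_QUOTIENT
        )
        offline_balance -= penalty
    raise RuntimeError("did not regain finality within max_epochs")


epochs = epochs_until_finality(0.40)
print(f"{epochs} epochs, about {epochs * SECONDS_PER_EPOCH / 86_400:.1f} days")
```

Because the penalty grows with the time spent offline, the drain starts slowly and accelerates, which is why a brief loss of finality costs validators very little while a prolonged one becomes expensive. The exact figures depend heavily on the assumed constants.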
He said the mechanism offers protection against catastrophic events, such as war, which could isolate people living in different jurisdictions from each other. After about three weeks without finalization, Ethereum would fork and recognize the block history maintained by the network’s remaining active validators.
Last week’s incident had a “minimal” impact on validators, according to Edgington, with Ethereum’s nearly half a million validators losing a cumulative 28 ETH during a brief Inactivity Leak period.
Read More: thedefiant.io