Article

Storage Replica: Accelerating the Storage Access of the Ethereum Virtual Machine

1 School of Computer Science, The University of Sydney, Sydney, NSW 2006, Australia
2 Sonic Labs, 4th Floor, Cayman Financial Center, 36A Dr. Roy’s Drive, P.O. Box 2510, Grand Cayman KY1-1104, Cayman Islands
3 Department of Computer Science and Engineering, Yonsei University, Seoul 03722, Republic of Korea
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Appl. Sci. 2026, 16(1), 486; https://doi.org/10.3390/app16010486
Submission received: 15 November 2025 / Revised: 23 December 2025 / Accepted: 27 December 2025 / Published: 3 January 2026
(This article belongs to the Special Issue Advanced Blockchain Technology and Its Applications)

Abstract

Ethereum’s smart contracts operate on directly addressable storage that is represented as tries. The performance of the Ethereum Virtual Machine (EVM) suffers from slow storage access due to trie encoding, which hampers transaction throughput and scalability. To mitigate the Ethereum storage performance bottleneck, we propose a new storage representation for the EVM that supports asynchronous trie construction. Without changing the Ethereum protocol, we add a flat representation called Storage Replica to improve performance. Storage Replica provides a fast lookup of values in the program’s main thread, while a worker thread prepares the tries for subsequent cryptographic calculations. With a storage overhead of less than 5% (i.e., 10 GB), we achieve up to a 6× speedup in processing smart contracts and a 4× speedup in block commits for the initial 9 M blocks of the Ethereum blockchain.

1. Introduction

Ethereum is the most successful programmable blockchain whose smart contracts enable the development and execution of decentralized applications (DApps). Smart contracts have enabled a wide range of applications, including smart homes [1], voting systems [2,3], payroll computation [4], tamper-proof synchronization in IoT [5], and many more, categorized by Zhu et al. [6] and Lao et al. [7]. This trend established an industry called decentralized finance (DeFi) [8,9], which has since become a cornerstone of the ecosystem, highlighted by its total value of 119 billion USD [10].
The massive economic value and the high volume of on-chain activity required to support it are directly reflected in the rapid growth of Ethereum. The Ethereum main blockchain (mainnet) is now approaching 25 M blocks, and the number of unique accounts has reached 350 M [11]. This expansion, in turn, results in growing hardware requirements. Storing the entire history of the Ethereum mainnet currently requires 24 TB of disk space [12]. It has been observed that storage growth directly impacts EVM performance, such that the execution speed of smart contracts deteriorates with the number of blocks, with later blocks spending up to 90% of their total execution time accessing storage [13].
Ethereum’s resource requirements exceed the capacity of commodity hardware, hindering adoption and raising concerns about network monopolization in the blockchain-oriented software industry (cf., e.g., [14]). The fees to execute SLOAD and SSTORE, the bytecode instructions to access the storage, had to be repeatedly increased to reflect their growing execution time. Related adjustments from the Istanbul hard fork introduced backward-compatibility problems for several already-deployed smart contracts [15,16,17].
As Ethereum continues to support a high volume of transactions [18], the efficiency of the EVM’s storage operations remains a critical factor for network scalability and must be improved. Otherwise, storage will continue to constitute a serious performance bottleneck for the execution of smart contracts. Hence, a runtime environment that accelerates the storage access of the EVM is paramount. The runtime environment must improve substantially so that (1) more data can be stored and accessed by a contract in the future, and (2) the number of transactions per second increases significantly for Ethereum execution-layer clients [19,20], allowing (business) transactions to be committed faster in the future.
In this paper, we accelerate the storage operations of the EVM. We introduce a new data structure for the EVM’s storage called Storage Replica. The Storage Replica enables fast access to the smart contracts’ storage and bypasses expensive Merkle Patricia trie (MPT) operations. A background synchronization task is used to reconstruct this trie to keep the blockchain consistent at the end of each block. Our modifications to the EVM runtime environment do not require changes to the Ethereum protocol, and, hence, they are backward-compatible.
This paper makes the following contributions. First, we conduct a quantitative analysis of Ethereum’s smart contract storage model and its cryptographic representation as a key-value store to identify the performance bottlenecks in the EVM storage subsystem. Second, we reorganize block processing using a new Storage Replica technique that alleviates the storage access overhead of smart contracts on the Ethereum blockchain. Third, we conduct an extensive experimental evaluation to quantify the resource consumption with this new technique on the initial 9 M blocks of the Ethereum mainnet.

2. Background: Ethereum’s World State

Smart contracts use the blockchain to store information. Although blockchain platforms differ slightly in their use of cryptographic hashes and functionality, they share common features [21,22]: a blockchain consists of a sequence of blocks forming a chain. A block contains a header and a body, where the header holds meta-information about the block, most notably a cryptographic hash of the preceding block to form the chain. The body of a block lists transactions created by clients of the blockchain.
Ethereum transactions include deploying and invoking smart contracts, as well as transferring funds between user accounts. A contract is stored in a contract account and may accumulate funds as well. Transactions update the state of user and contract accounts, which are stored in one vast MPT ([23], Appendix D). This structure combines a trie [24], more precisely a Patricia trie [25], with a Merkle tree [26]. The data structure encodes a set of key–value pairs: a key is encoded as a bit-string along the path from the root node to the node that stores its value. There are three types of nodes: Branch, Extension, and Leaf. An Extension node groups keys with a common prefix and contains a reference to a single child node; a Branch node contains up to 16 references to child nodes and may hold a value for a key terminating there; a Leaf node contains the remaining suffix of a key together with its value. Each node is referenced by the cryptographic hash of its contents. This hash is stored in the parent node as a reference to the child, and it also serves as a search key in the underlying database. This recursive hashing, from the leaves up to the root, composes the Merkle proofs, ensuring that the root hash cryptographically reflects the state of the entire trie. The root hash serves as a signature in the block header, and thus it is straightforward to match a block to its underlying state database and assess whether they are in sync.
Figure 1a depicts an example MPT that contains the four key–value pairs listed in the table in Figure 1b, e.g., key 9b5ad3 maps to value Third. A key determines a unique path from the trie’s root to the key’s value stored in a leaf node of type Leaf. To look up the value of a given key, the half-bytes in the key’s byte representation inform the path to the leaf node, e.g., a lookup of key 9b5ad3 begins at the root extension node, which shares the prefix “9b” and refers to the branch node. The next half-byte in the key, “5”, is the index into this branch node. From there, traversal passes the subsequent extension node for the key’s next half-byte, “a”, to reach the second branch node, which is indexed by the key’s half-byte “d”, and then to the leaf node that matches the last half-byte of the key, i.e., “3”. This leaf node contains the key’s value (Third).
As outlined above, each node in the MPT is persisted separately in the underlying key–value database. A node’s key is obtained by computing a cryptographic hash of its contents. Because a node is addressed by its cryptographic hash, each outgoing trie edge in Figure 1a corresponds to a hash key stored in the parent node that refers to a child. Consequently, a single SLOAD operation, which at its core is an MPT lookup, will trigger a sequence of database lookups proportional to the depth of the trie (typically 6–8 levels).
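The following Rust sketch condenses the node types and the lookup just described. It is a simplified illustration of the traversal cost (nibble encoding, RLP details, and embedded short nodes are omitted), not a reproduction of any client’s implementation:

```rust
use std::collections::HashMap;

type Hash = [u8; 32]; // Keccak-256 hash that references a child node in the database

/// Illustrative sketch of the three MPT node types (not a client's actual layout).
enum MptNode {
    /// Shared key prefix (in nibbles) plus a reference to a single child node.
    Extension { prefix: Vec<u8>, child: Hash },
    /// One child slot per nibble value (0x0..=0xf), plus an optional value for a key ending here.
    Branch { children: [Option<Hash>; 16], value: Option<Vec<u8>> },
    /// Remaining key suffix (in nibbles) and the stored value.
    Leaf { suffix: Vec<u8>, value: Vec<u8> },
}

/// Every level of the lookup resolves one node through the key-value database,
/// so a single SLOAD turns into a chain of `get` calls proportional to the trie depth.
fn mpt_lookup(db: &HashMap<Hash, MptNode>, root: Hash, key_nibbles: &[u8]) -> Option<Vec<u8>> {
    let mut current = root;
    let mut remaining = key_nibbles;
    loop {
        let node = db.get(&current)?; // one database access per visited node
        match node {
            MptNode::Extension { prefix, child } => {
                remaining = remaining.strip_prefix(prefix.as_slice())?;
                current = *child;
            }
            MptNode::Branch { children, value } => match remaining.split_first() {
                None => return value.clone(),
                Some((nibble, rest)) => {
                    current = children[*nibble as usize]?;
                    remaining = rest;
                }
            },
            MptNode::Leaf { suffix, value } => {
                return (remaining == suffix.as_slice()).then(|| value.clone());
            }
        }
    }
}
```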
More critically, the leaf–value update of an SSTORE operation will change the hash (i.e., address) of the leaf, thereby necessitating an update in the parent node, and—by transitivity—trigger structural adjustments and hash recomputations along the entire path from the leaf up to the root node. For further information about the Ethereum MPT, we refer the reader to the Ethereum documentation [27] and the literature [28].
Although the MPT is an abstract structure, the Ethereum specification assumes that the “implementation will maintain a database of nodes” ([23], Appendix D, paragraph D1). Appendix B of [23] defines a function RLP that serializes nodes into byte arrays so that they can be persisted in a low-level key–value database. Each node n of the trie is serialized by the function RLP and stored in the database under the hash of its serialization, i.e., the database maintains the key–value mapping

hash(RLP(n)) ↦ RLP(n).    (1)

The function hash in Equation (1) is used to hash the nodes stored in the database and to compute the Merkle proofs in the trie. Ethereum uses the Keccak [29] hash function, an early version of SHA-3 adopted before the standard was finalized.
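A minimal sketch of this persistence scheme is shown below; the choice of the sha3 crate for Keccak-256 is an assumption of the sketch, and the RLP encoding of the node is taken as given input:

```rust
use std::collections::HashMap;
use sha3::{Digest, Keccak256};

type Hash = [u8; 32];

/// Persist one trie node according to Equation (1): the database key is the Keccak-256
/// hash of the node's RLP encoding, and the stored value is that encoding itself.
/// (`encoded_node` is assumed to be RLP(n) produced by the client's RLP encoder.)
fn store_node(db: &mut HashMap<Hash, Vec<u8>>, encoded_node: Vec<u8>) -> Hash {
    let key: Hash = Keccak256::digest(&encoded_node).into();
    db.insert(key, encoded_node);
    key // the parent node stores this hash as its reference to the child
}
```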
As shown in Figure 2, the complete state of the Ethereum blockchain is stored in the world state trie, an MPT that is logically divided into state and storage tries. The state trie associates a 20 B address with an account. The cryptographic hash of the root node of the state trie, called the state root, is stored in the block header. Starting from this root, the address is encoded as a traversal path to a leaf node. The leaf node contains the account information. It consists of four items, including the cryptocurrency balance. If the account is a contract, the item storageRoot is a 32 B hash referring to the root of the storage trie. The codeHash is a 32 B hash referring to the contract’s EVM bytecode, stored in the database under this key. If the account belongs to a user, these two items are empty.
The storage trie represents the persistent storage of a smart contract. It maps 32 B storage addresses to 32 B values, where the address is mapped as a path from the root node to the destination node with the value. The layout of this data structure and the mapping of values to storage addresses are language-specific. For Solidity [30], the dominant programming language for Ethereum smart contracts [31], the compiler packs several variables into a single storage address, if their sizes allow, and generates bit operations to pack their values [32]. For example, up to 16 int16 variables can be stored in a single storage address.
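A conceptual Rust sketch of this packing follows; it mimics the shift-and-mask access pattern that the Solidity compiler generates, but the byte offsets and ordering used here are illustrative assumptions rather than Solidity’s actual storage layout rules:

```rust
/// Conceptual illustration of storage packing: sixteen 16-bit variables share one
/// 32 B storage slot, and each is read or written at its own byte offset.
/// (Offsets and byte order are illustrative, not Solidity's exact layout.)
fn pack_u16(slot: &mut [u8; 32], index: usize, value: u16) {
    assert!(index < 16, "a 32 B slot holds at most 16 u16 values");
    let offset = 2 * index;
    slot[offset..offset + 2].copy_from_slice(&value.to_be_bytes());
}

fn unpack_u16(slot: &[u8; 32], index: usize) -> u16 {
    assert!(index < 16);
    let offset = 2 * index;
    u16::from_be_bytes([slot[offset], slot[offset + 1]])
}
```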
The world state trie is continuously updated as a side effect of the execution of transactions, which modify user balances, create new accounts, deploy new contracts, and execute contracts, and the contracts modify their storage. Because a transaction may potentially modify any account, the world state trie must maintain all existing accounts—350 M at the time of writing [11]. This is a complex task as the trie does not fit in main memory and must therefore be synchronized with a key–value database.

3. Related Work

Read and write amplification of tries has been a topic of active investigation: Raju [33] created Merkelized LSM, a combination of the Merkle and log-structured merge tree data structures. Ponnapalli [34] proposed a distributed MPT split across multiple nodes. Performance optimization of the Merkle tree structure itself is a widely studied topic [35,36,37], with some research focusing on optimizing Merkle proof size to reduce verification overhead [38]. Although these attempts may improve performance, their impact is limited. The fundamental bottleneck remains the database latency inherent in traversing the trie structure, as each node lookup can trigger a separate database I/O operation.
A large body of work has employed concurrency to mitigate the performance problem of smart contracts [39,40,41,42,43,44]. The common approach to leveraging parallelism is to have miners first determine an execution graph with forks and joins. The remaining nodes in the network replicate execution according to this graph. This moves database access off the critical path, improving overall performance. But the latency of database access for retrieving individual values from the state is not affected.
Another significant area of research proposes faster key–value databases. Although we argue that data structure optimization should always precede empowering hardware and software, better databases are certainly of interest, especially given the performance trade-offs inherent in on-chain storage [45]. Faster [46] is a fast concurrent database with a log file spanning a hard drive and main memory. The approach taken by Song et al. [47] enables bulk get methods on top of RocksDB with an algorithm that uses SSDs more efficiently. Papagiannis et al. [48] report a 4–6 times speedup over RocksDB for their memory-mapped key–value store. Ouaknine et al. [49] showed that a highly-tuned RocksDB configuration on an SSD can improve performance by orders of magnitude. Wu et al. [50] proposed a novel algorithm to represent the log as an LSM-trie, thereby achieving 10-times-better performance with LevelDB and RocksDB. Their benchmark results indicate no performance degradation as data size increases. This is important because, as documented in Section 4 and by other research [13], the performance of conventional RocksDB is seriously affected by growing amounts of data. Lepers et al. [51] introduced a database optimized for modern SSDs. Agarwal et al. [52] proposed a method for querying compressed data to reduce disk space without sacrificing performance.
The related work up to this point pertains to blockchain in general. In the following, we discuss the developments specific to the Ethereum storage model. The Ethereum storage model comprises the MPT representations for the world state and account storage. As the state grows, the computational cost of storage accesses increases. Bounding the latency of MPT operations is critical for the reliability of the blockchain, because the access overhead of MPTs can and has been exploited for denial-of-service (DoS) attacks [53,54]. Some studies have proposed multiple layers of trie data structures to improve the performance of state storage on MPTs for blockchain [55,56].
Rather than changing the MPT as the underlying data structure of the Ethereum storage model, the Ethereum community has followed a two-pronged strategy to mitigate the increasing computational cost of storage accesses. First, gas costs for storage accesses have been increased multiple times through hard forks to more accurately reflect the growing computational cost of MPTs and thereby prevent DoS attacks. Second, new state database implementations for Ethereum clients have been developed to reduce storage access latency. Table 1 summarizes the historical updates to these two approaches in the Ethereum specification and the official Ethereum client (Geth).
Table 1. Historical updates of the Ethereum storage model and its implementation in the Geth client to optimize storage operations against the increasing computational costs of the MPT.
Year | Update | Description
2016 | Tangerine Whistle [53] | Hard fork that increased gas costs of storage operations to protect the mainnet against DoS attacks.
2018 | Geth v1.8 [57] | Pruning mode that keeps only the 128 most recent tries to reduce the state database size.
2019 | Istanbul [58] | Hard fork that increased gas costs for operations depending on the size of the tries.
2021 | Geth v1.10 [59] | Snapshots to improve the performance of read operations and the efficiency of offline pruning.
2021 | Berlin [60] | Hard fork that increased the storage gas cost for initial (cold) state accesses and decreased the costs of subsequent (warm) accesses. Optional access lists were introduced to warm up state locations before transaction executions.
2021 | London [61] | Hard fork that lowered gas refunds for storage write operations to prevent exploits of the refund mechanism.
2023 | Geth v1.13 [62] | Path-model database scheme for faster state accesses and more effective pruning implementation.
Pruning mode (row 2 in Table 1) was introduced to reduce the large disk space requirement for storing historical states in archive mode [12]. Most use cases require only the recent world states of the ledger to participate in the mainnet. Pruning mode deletes historical states to keep the size of an Ethereum node maintainable. Geth and Parity have used pruning as their default mode of operation. Erigon’s default mode was the archive mode until Erigon v2, and changed to pruning mode with Erigon v3 in 2025 [63].
The Geth developers have proposed a pull request for state snapshots (row 4 in Table 1) by enabling a separate index [64]. This feature validates the utility of flat key–value storage for accelerating read access. However, this implementation is primarily used for fast read access, while write-intensive operations (e.g., block commits) still rely on the traditional trie structure, which remains a bottleneck. Additionally, they do not provide a thorough performance evaluation to assess the speedup, and the initial analyses suggest the snapshot’s storage overhead is considerable [59].
Geth v1.13 introduced the path-model database scheme (row 7 in Table 1) with the intention to improve performance and reduce node size [62]. The traditional hash state scheme stores MPT nodes using their hash values, while the path state scheme stores MPT nodes using their paths in the MPT. When overwriting a storage slot, the hash scheme creates a new node with a different hash value, while the path scheme overwrites the slot of the same path. Therefore, the path scheme has the effect of continuously pruning MPT nodes for overwritten storage slots. However, the implementation of the path scheme introduced a multi-layer database to maintain MPT nodes. In our evaluation of Geth v1.13, the path scheme achieved a 60% reduction in disk space at the cost of a 3.7× increase in processing time for importing the initial 4 M blocks of Ethereum, compared to the hash scheme.
For most use cases, executing the entire transaction history of all blocks on the ledger is considered too time-consuming for the initial setup of an Ethereum node. To accelerate setup, the Geth client uses the snap sync protocol [65] as the default for new nodes. Snap sync downloads states at recent blocks to skip processing of historical blocks. Snap sync is only available when there are other trusted nodes that have executed the entire Ethereum transaction history to create the current state snapshot and support the snap sync protocol in addition to the Ethereum protocol. Different client implementations have introduced their own snap sync variants, e.g., Parity’s Warp Sync [66] and Erigon’s OtterSync [67]. But these variants are incompatible with each other and share the same fundamental limitations of Geth’s snap sync. Unlike Storage Replica, the snap sync protocol does not optimize regular transaction execution.
Part of the Ethereum community prefers improving the scalability of the current Ethereum protocol [68]: “If I were to design scaling, first I would squeeze as much as possible out of Ethereum 1, which I think hasn’t been done yet,” said Alexey Akhunov, the author of an alternative Ethereum client, Turbo-Geth [69], which was later renamed to Erigon [70]. Erigon uses a flat key–value store for the PlainState and HashedState tables to calculate the MPT root hashes of the current states [71]. This architecture treats flat storage as the primary source of truth, calculating the MPT structure as a secondary result. This fundamental architectural shift is incompatible with the MPT-centric layout of existing clients, requiring a full sync from scratch to build its flat key–value storage. Unlike Storage Replica, Erigon does not support asynchronous trie lookups or updates in its state sync stage [72]. Its hash calculation is implemented as a separate blocking stage that executes only after state processing. In our evaluation, Erigon showed lower transaction throughput and required more disk space than the original Parity client. Erigon’s pruning mode required 1.8× the processing time and 2.0× the disk space of the original Parity client in pruning mode for importing the initial 6 M blocks of Ethereum. (As mentioned before, for both clients, pruning is the default mode.) The database size of Parity reached 200 GB at a block height of 9 M blocks, while Erigon reached 200 GB already after 6 M blocks.
Similarly, SlimArchive [73] improves transaction speed and reduces the storage size of an archive node compared to Geth and Erigon, but it sacrifices the MPT for authentication. Therefore, SlimArchive cannot support the core security mechanism of the Ethereum blockchain, which relies on state verification through MPT proofs. In contrast, our research focuses on accelerating storage access while preserving the MPT-centric architecture. We introduce an auxiliary flat data structure, Storage Replica, that works alongside the MPT to ensure full protocol compatibility. This focus on fundamental performance is a critical research challenge, reflected in studies on performance bottlenecks [74,75]. The performance aspect extends to a wide array of application domains [76,77,78], underlining the importance of resolving the bottlenecks addressed in our work.
The Ethereum community aims to distribute computation demand and the size of the state required for each node to verify transactions through sharding [79,80,81]. This approach focuses on horizontal scaling by distributing network load and placing nodes into distributed clusters to execute shards independently. Recent research continues to build on this paradigm, e.g., by exploring efficient cross-shard execution frameworks [82], optimistic processing strategies [83], and dynamic contract migration using machine learning [84]. RainBlock [34] improves I/O throughput by distributing the DSM-TREE data structure across multiple nodes for efficient transaction processing, without modifying the proof-of-work consensus protocol. However, this entire line of research is orthogonal to the storage bottleneck within each node, i.e., the root cause of slow state access when executing transactions in a single shard remains unaddressed.
Layer 2 (L2) scaling solutions, such as ZK-rollups [85], are off-chain layers built on top of the Ethereum blockchain as the base layer (L1). L2 was introduced to scale DApps at a lower computational cost than L1. However, L2 does not address the MPT bottleneck of the L1 protocol. Optimizing the L1 MPT bottleneck therefore remains necessary to improve the scalability of L2 DApps, which rely on L1 data capacity [86].

4. The Contract’s Storage Bottleneck

A smart contract has non-contiguous storage memory that is persistently stored on the blockchain ledger. The cells of the storage memory are directly addressable via a 32 B address, and each cell stores data values of the same bit length. The EVM has two instructions for accessing storage memory: SSTORE and SLOAD. They are both executed synchronously, eagerly loading data from the storage via the MPT. Note that bytecode originating from Solidity takes full advantage of the 32 B address space, e.g., array index calculations use cryptographic hashes [87]. If data is read from an address that has not been written before and hence has no value stored on the ledger, the default value of this cell becomes zero.
Assuming that the address space is fully exhausted in the MPT, every inner node in the trie becomes a branch node that covers 4 bits (a nibble) of a 52 B full address. Hence, accessing a single storage value of a smart contract may require a path traversal through the trie that visits up to 2 × (20 + 32) = 104 inner nodes. Because the trie nodes are stored in a database, a single storage operation may access the database 104 times, incurring substantial performance penalties.
As a consequence of the MPT representation, the database layout and its memory overlay consume significant space for representing all node hashes. While the net representation of a single address/cell value pair would consume only 32 B + 32 B, the trie representation needs a 32 B hash at every level of the trie to reference a child node, and finally 32 B to represent the value. For a given path of 104 nodes, 104 × 32 B + 32 B are required to represent one key–value pair in the trie. With sparse tries, more bits of an address can be covered by an extension node, but the number of branch nodes grows as the address space fills up with an increasing amount of stored values, which slows down storage accesses over time.
The world state is persisted as MPT nodes in the database and reconstructed in memory on the fly for the portions of data accessed by the EVM. Clients employ caches and memory overlays to accelerate the SLOAD and SSTORE instructions of the EVM. The MPT is an unbalanced data structure [88], so the worst-case time complexity for retrieving a data value is O(n) in the size of the address. On the other hand, storage addresses are hashed before insertion into the trie, which randomizes and balances the trie, mitigating the worst-case complexity to a certain degree. However, this hashing also reduces the effectiveness of caching strategies: in particular, sequential locality [89] becomes less effective, and the interpretation of SLOAD and SSTORE instructions is significantly slowed down [13].
Various software clients have been written in different programming languages to implement the Ethereum protocol. While these clients share the MPT state architecture, they often utilize different underlying key–value database engines. Parity (which was later renamed to OpenEthereum) uses RocksDB [90] as its underlying key–value store, whereas Geth, the most widely used client, uses LevelDB [91]. In the following, we discuss the Geth and Parity projects for analyzing the SLOAD and SSTORE operations in detail. The software architecture of these clients is similar, as observed in [13]. We thus keep the following description at a conceptual level, focusing on the key properties and omitting implementation details not relevant to the discussion at hand. We use the notation 〈object::method〉 in the following text and accompanying sequence diagrams.

4.1. EVM Storage Instructions: SLOAD/SSTORE

A smart contract reads and writes storage values with the SLOAD and SSTORE instructions. Both instructions pop their 32-byte storage address operand from the EVM stack. Given the address, they search for the node of the corresponding data value in the storage trie. SLOAD pushes the data value obtained from the storage trie onto the EVM stack. SSTORE pops the data value from the EVM stack and later updates the storage trie with the new address–value pair.
Figure 3 details the processing steps of the SLOAD instruction inside the EVM. The storage module invokes the State::storage_at method, which retrieves the storage value from the memory address (idx). The State is an interface of the world state trie. Internally, the method first retrieves the account of the current contract from either the cache or the database. If the account is not in the cache, the respective path of the world state trie must be loaded from the database and constructed in memory. The Account::storage_at method of the retrieved account performs the trie lookup via TrieDB::lookup. It repeatedly calls two key–value database operations, method MemoryDB::get and KeyValueDB::get, to return a value. These operations retrieve a trie node either from the in-memory cache or from the disk.
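The call chain of Figure 3 can be condensed into the following schematic Rust sketch; the type and method names mirror the Parity interfaces mentioned above, but the bodies are placeholders rather than the client’s actual code:

```rust
use std::collections::HashMap;

type Address = [u8; 20]; // account address
type H256 = [u8; 32];    // storage address, hash, or value

/// Stand-in for the layered node store: the MemoryDB cache in front of the on-disk KeyValueDB.
struct NodeDb { memory: HashMap<H256, Vec<u8>> }

struct Account { storage_root: H256 }

impl Account {
    /// Account::storage_at performs the trie lookup (TrieDB::lookup), which repeatedly
    /// calls MemoryDB::get and, on a miss, KeyValueDB::get for every node on the path.
    fn storage_at(&self, db: &NodeDb, idx: &H256) -> H256 {
        let _ = (db, idx, self.storage_root); // trie traversal elided in this sketch
        [0u8; 32]                             // unwritten slots read as zero
    }
}

struct State { accounts: HashMap<Address, Account>, db: NodeDb }

impl State {
    /// State::storage_at, the entry point of the SLOAD interpreter hook: resolve the
    /// account (from cache or the world-state trie), then query its storage trie.
    fn storage_at(&self, contract: &Address, idx: &H256) -> H256 {
        self.accounts[contract].storage_at(&self.db, idx)
    }
}
```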
The data value of the SSTORE instruction is to-be-stored data that is persisted in phases, as visualized in the sequence diagram of Figure 4. In the first phase, the SSTORE instruction is invoked during smart contract execution. The Interpreter of a smart contract is triggered as part of the block import (Block::enact) and processes all transactions (Interpreter::push) in a loop. Method State::set_storage obtains an address and a value from SSTORE during the execution and propagates them to Account::set_storage, which accumulates modified values in a memory hash map. The diagram depicts only the parts that concern SSTORE: how values are collected in an in-memory hash map for a particular Account. Given that the purpose of Account::set_storage is to put a value in a hash map, its runtime should be small. However, before updating the hash map, this method checks whether the value exists by calling Account::storage_at to avoid unnecessary value updates. Because this is the same interface used by instruction SLOAD, the performance of SSTORE correlates with that of SLOAD.
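The first phase of SSTORE thus amounts to buffering dirty values per account. A minimal sketch (illustrative, not Parity’s code) is:

```rust
use std::collections::HashMap;

type H256 = [u8; 32];

/// Sketch of the per-account write buffer described above: modified values are collected
/// in an in-memory map and only flushed into the trie at State::commit.
struct Account {
    storage_changes: HashMap<H256, H256>, // dirty values accumulated during execution
}

impl Account {
    fn storage_at(&self, idx: &H256) -> H256 {
        // In the real client this falls back to the (expensive) trie lookup on a cache miss.
        self.storage_changes.get(idx).copied().unwrap_or([0u8; 32])
    }

    fn set_storage(&mut self, idx: H256, value: H256) {
        // The pre-check reuses the SLOAD read path, which is why SSTORE's cost tracks SLOAD's.
        if self.storage_at(&idx) != value {
            self.storage_changes.insert(idx, value);
        }
    }
}
```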
The storage module employs a two-level cache. The first level is an account-level cache that attempts to retrieve a value directly from memory, bypassing the trie structure. The other cache is a memory overlay, which caches trie nodes, while the trie itself must still be iterated. However, the read amplification of TrieDB::lookup still makes SLOAD and SSTORE expensive operations, and using caches does not provide sufficient mitigation.

4.2. State Commit and Database Flush

Real modifications of the MPT are performed in this phase. When all transactions in a block have been executed, i.e., as the last step of processing the block, the state commit occurs. It is conducted via method State::commit, which transfers the values stored in the hash map into the storage trie and represents them in TrieDBMut (see Figure 4, bottom half).
The values are inserted into the trie in a loop that calls TrieDBMut::put for every value. However, to reconstruct the trie in memory, some nodes must be loaded from the database, using the previously mentioned interfaces MemoryDB::get and KeyValueDB::get. This phase still keeps the values in memory, albeit now in the trie structure, backed up by MemoryDB. The implementation keeps track of modified nodes and recomputes the hashes of all affected nodes (including the nodes on the paths to the root) when all values are inserted.
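A sketch of this commit loop, with a hypothetical TrieDbMut standing in for Parity’s TrieDBMut interface, looks as follows:

```rust
use std::collections::HashMap;

type H256 = [u8; 32];

// Hypothetical interface standing in for Parity's TrieDBMut.
struct TrieDbMut;
impl TrieDbMut {
    fn put(&mut self, _idx: &H256, _value: &H256) {}
    fn recompute_root(&mut self) -> H256 { [0u8; 32] }
}

/// Sketch of the commit phase: dirty values collected during execution are pushed into
/// the mutable trie, which then re-hashes every node along each modified path.
fn commit_account(trie: &mut TrieDbMut, changes: &HashMap<H256, H256>) -> H256 {
    for (idx, value) in changes {
        trie.put(idx, value); // may load missing nodes via MemoryDB::get / KeyValueDB::get
    }
    trie.recompute_root()    // re-hash all modified nodes up to the storage root
}
```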
All the mentioned phases reconstruct each part of the trie from the database. After execution of all smart contracts and the state commit of a block, the memory contains an up-to-date trie. Unfortunately, in recent Ethereum clients, this is a highly sequential operation. Part of the trie is built by executing SLOAD and SSTORE instructions. The remaining trie is built during the state commit. Trie modifications during state commit may occur, e.g., when a value is deleted from the trie. As a result, it may be necessary to read more nodes to condense certain parts of the sub-trie.
The last phase finally persists the data to the database on the disk. When a certain number of blocks in the queue are finished, the method journal_under creates a database batch from values accumulated in MemoryDB. As the last step, method KeyValueDB::flush persists the data. Because this is a low-level fine-tuned database operation, we expect it to be efficient.

5. Performance Improvement Techniques

As discussed in Section 4, the smart contract storage interface is heavily penalized by the cost of reconstructing the world state trie in memory from the database. It negatively impacts the speed of executing SLOAD and SSTORE instructions. Because the only purpose of the MPT is to compute the hash root for time-stamping the block, it is wasteful to keep the trie consistent for all store instructions while executing the transactions in a block.
To mitigate this performance problem, we introduce the following two techniques:
1. Storage Replica. Storage Replica is an additional data structure that makes storage instructions access a flat key–value store, avoiding expensive trie traversals during the execution of SLOAD and SSTORE.
2. Asynchronous Trie Construction. Our Asynchronous Trie Construction leverages the time interval between the execution of a transaction and the state commit, which must occur as the last processing step of the block that contains the transaction. We use this time span to update the world state trie from Storage Replica. This update is performed concurrently in the background, so that the final cryptographic calculations will already be available at the end of the block processing phase.
Our approach does not rely on concurrency between transactions. Rather, it executes transactions in order, accelerating them through the flat data representation provided by Storage Replica. Parallelism is leveraged by overlapping transaction execution with the MPT construction of already executed transactions within the same block. As a result, the state commit at the end of the block is shortened by performing costly cryptographic calculations and database lookups earlier on.
This constitutes a significant improvement over the traditional block-processing scheme of existing clients, which defers MPT updates and cryptographic calculations to the state commit phase, where they remain on the critical path.
We propose modifying the contract processing workflow from Figure 5 with the optimizations depicted in Figure 6. The original processing workflow is highly sequential. As visualized in Figure 5, transactions Tx are executed one by one, i.e., only when a return call from a transaction is finished does the next transaction commence: Tx 1 is followed by Tx 2, etc. Transactions that execute smart contracts invoke the EVM to process the contract’s bytecode. This causes, at some point, the execution of many SLOAD and SSTORE instructions, which, as detailed in Section 4, propagate into the State, TrieDB, and finally the KeyValueDB. Only when this loop completes for all EVM contract invocations in a transaction can execution proceed to the next transaction. Notice that one transaction (smart contract) may contain many SLOAD/SSTORE instructions, and the trie access is recursive, thus multiplying the database accesses needed to reconstruct the trie. The sequence diagram depicts this as Loops. Once all transactions in a block have been processed, State::commit finalizes the trie and propagates all required changes to the database.
The Storage Replica and Asynchronous Trie Construction techniques eliminate the sequential loop between transaction processing and database access. Figure 6 shows two new actors, StorageReplica and Worker, inserted into the original workflow (depicted in red). Unlike the synchronous workflow in Figure 5—where trie lookups and MPT reconstruction lie on the critical path—Figure 6 illustrates that these operations are shifted to an asynchronous worker.
In this model, the StorageReplica synchronously provides the storage values necessary for immediate transaction execution. Simultaneously, the computationally intensive trie construction is offloaded as asynchronous tasks. The shaded region in the diagram indicates that the worker runs in the background and, crucially, its activation overlaps with the subsequent transaction execution. As a result, transaction processing is no longer blocked by trie operations, substantially increasing throughput due to Storage Replica’s low latency.
This approach preserves the protocol’s sequential transaction processing, while significantly accelerating the final block commit by pre-updating a large portion of the modified trie in the background. Ultimately, the key benefit is that long-running trie operations are moved off the client’s critical path in a manner fully compatible with the Ethereum protocol.

5.1. Storage Replica

Storage Replica is a key–value data structure that is represented in memory and in the database as key–value pairs. It mirrors the storage trie by storing address–value pairs verbatim, without cryptographic hashes or trie nodes. Access to storage values is therefore fast and no longer requires a slow trie traversal through a complex cache hierarchy and memory overlays.
The Storage Replica API differs from the trie API, as shown in Table 2. For both data structures, the address (idx) is the identifier of a smart contract value. This address is generated when the smart contract is compiled. Thus, it is unique only within a smart contract, not across the blockchain. For this reason, the original trie API includes a root argument, which is a hash that binds the operation to the root of the correct sub-trie. The Storage Replica API uses both the account address and the storage address to uniquely identify the correct storage cell. The 20 B account address is concatenated with the 32 B storage address and hashed. This produces a 32 B hash, which is used as a database key for persistence. (Note that this hash need not be unique because of hash collisions; following the Ethereum specification, we do not account for hash collisions.)
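A minimal sketch of this key derivation is shown below; the choice of the sha3 crate for Keccak-256 is an assumption of the sketch (any Keccak-256 implementation serves the same purpose):

```rust
use sha3::{Digest, Keccak256};

/// Storage Replica key derivation as described above: the 20 B account address and the
/// 32 B storage address are concatenated and hashed into a 32 B database key.
fn replica_key(account: &[u8; 20], storage_addr: &[u8; 32]) -> [u8; 32] {
    let mut hasher = Keccak256::new();
    hasher.update(account);
    hasher.update(storage_addr);
    hasher.finalize().into()
}
```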
The flat key–value structures may consume more memory than search trees. However, this is not the case for Storage Replica, which does not contain the 32 B hashes or complex inner nodes. In fact, Storage Replica is more space-efficient than the original tries.
Although Storage Replica saves space by eliminating inner nodes, its role as an auxiliary structure to the MPT inevitably adds data to the client’s database. To manage this additional data efficiently, we store Storage Replica as a separate database instance. This design has several advantages: First, the new database with the flattened key–value pairs is more compact and thereby provides faster access times by utilizing caches. It is independent of the existing Ethereum client infrastructure and can be implemented with a modern key–value database. Second, the new database may be located on a separate (i.e., faster) file system and hardware (e.g., SSD/NVMe). In this way, fast but expensive flash storage can be dedicated to Storage Replica, while the world state trie and storage tries can be stored on lower-cost, lower-performance storage devices.
Storage Replica maintains storage consistency and reflects the most recent state of the storage. Its replay and failure recovery procedures are aligned with the procedures of the Geth client: if a crash happens during block processing but before the database flush, a client—after restart—will re-execute (replay) the block, as the effects of the transactions from the crashed block will not have taken effect in the database yet. If a crash corrupts the world state in the database, the client will not be able to continue block processing, because the state-root hash that the client will compute from the corrupted world-state data will not match the state-root hash provided by the Ethereum network for the respective block. This is by virtue of the storage authentication mechanism of Ethereum’s MPTs. To recover from this failure, a client must resync from the network. If Storage Replica is corrupted or becomes inconsistent with the trie, the same state-root hash comparison will fail at the end of the processing of a block, likewise necessitating a resynchronization of the client from the network. This resynchronization will entail the processing of all blocks from the genesis block, thereby rebuilding Storage Replica as well.

5.2. Asynchronous Trie Construction

While Storage Replica enables O(1) access during transaction execution, the final construction of the MPT remains computationally intensive. Unlike a flat key–value database, where an update is a simple overwrite, an MPT update requires recomputing the Keccak hash of the modified node. The updated hash invalidates all parent nodes up to the root, requiring a sequential chain of re-hashing. Moreover, this process results in read amplification (loading nodes along the updated path) and write amplification (rewriting and re-hashing multiple nodes).
To mitigate this inevitable cost, we introduce the asynchronous trie construction technique to utilize the remaining CPU time during transaction execution. Although Storage Replica speeds up execution, we assume that enough CPU time is still available even on commodity hardware. Storage Replica is rather I/O-intensive because it fetches values from the database. To utilize the CPU better while processing blocks, an asynchronous worker thread is spawned. It prepares the tries in the background, which are later needed for the state commit. The interpretation of SLOAD and SSTORE is enhanced with hooks to monitor incoming storage addresses and values. Through these hooks, requests are sent from the main thread (which executes the EVM) to the worker thread, which performs the trie update work.
We use the producer–consumer pattern [92] to transfer requests from the SLOAD and SSTORE interpretation points of the EVM to the worker thread. Our communication channel uses the message-passing communication model with asynchronous (non-blocking) send operations and synchronous (blocking) receive operations [93]. The asynchronous send operation minimizes the impact on performance during the interpretation of SLOAD and SSTORE instructions. However, to ensure reliability under heavy workloads, the channel’s underlying queue implementation is capacity-bounded. When the queue becomes full, the sender—typically performing asynchronous send operations—temporarily blocks until the worker catches up. This mechanism throttles the production rate to the worker’s throughput during extreme workloads.
The worker thread blocks on the synchronous receive operation and processes the inputs immediately as they arrive. This concept allows for employing multiple workers. The worker threads traverse the trie and discover which nodes must be read from the database. Nodes are not immediately attached to the trie; instead, they are stored in-memory in a separate key–value data structure. Because the trie is only read and not written, it does not have to be locked, thereby allowing multiple workers to traverse the trie in parallel and fetch trie nodes from the database.
Thereby, our synchronization scheme separates trie data accesses from the request queue. While the communication channel uses blocking semantics to throttle system load, access to the shared trie structure relies on non-blocking synchronization [94]. This approach enables parallelism for trie construction, while simultaneously ensuring system reliability under heavy workloads through the capacity-bounded queue that implements the communication channel.
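A minimal sketch of this producer–consumer scheme is given below, assuming Rust’s bounded std::sync::mpsc::sync_channel; the request type, channel capacity, and worker body are illustrative:

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};
use std::thread;

type H256 = [u8; 32];

/// Requests sent from the interpreter (producer) to the trie worker (consumer).
enum TrieRequest {
    Prefetch(H256),     // SLOAD: pre-load the trie nodes on the path to this address
    Update(H256, H256), // SSTORE: pre-load the path that will be rewritten at commit
}

/// Spawn a background worker connected via a capacity-bounded channel.
fn spawn_trie_worker(capacity: usize) -> (SyncSender<TrieRequest>, thread::JoinHandle<()>) {
    let (tx, rx): (SyncSender<TrieRequest>, Receiver<TrieRequest>) = sync_channel(capacity);
    let handle = thread::spawn(move || {
        // Blocking receive: the worker sleeps until a request arrives, then walks the
        // (read-only) trie and caches the fetched nodes for the later State::commit.
        for request in rx {
            match request {
                TrieRequest::Prefetch(addr) => { let _ = addr; /* traverse trie, cache nodes */ }
                TrieRequest::Update(addr, value) => { let _ = (addr, value); /* prefetch path */ }
            }
        }
    });
    (tx, handle)
}

// Producer side (inside the SLOAD/SSTORE hooks): `send` only blocks when the bounded
// queue is full, which throttles the interpreter under extreme workloads.
// tx.send(TrieRequest::Prefetch(addr)).expect("worker alive");
```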
The trie is finally constructed during the State::commit by a single thread, which uses the pre-loaded nodes. Because the trie is prepared asynchronously with respect to transaction processing, it is not guaranteed that all work will be completed at the State::commit phase. The high-speed transaction execution via Storage Replica can outpace the computationally intensive MPT reconstruction in the background. In our design, State::commit does not wait for the worker thread to finish; instead, it starts committing the trie, performing any work that might not have been finished yet. In the most pessimistic scenario, the worker has not commenced execution when the state commit occurs. In such a case, the main thread synchronously performs the entire trie construction in the state commit phase, resulting in a performance degradation to the unoptimized baseline, i.e., Storage Replica without asynchronous trie construction. Importantly, this will not incur any unbounded latency or system stalls. Our experiments described in the next section reveal that such a pessimistic scenario is not likely to occur, as the speedup of State::commit is significant. The experiments suggest that the time that elapses between interpreting the smart contracts and committing the state is sufficient for the worker to prepare the tries on time.

5.3. Implementation

The revisited sequence diagrams from Section 4 are shown in Figure 7 and Figure 8. The new version of SLOAD is shown in the bottom part of Figure 7: accessing the memory now requires a single access to Storage Replica, without an expensive trie traversal. Furthermore, as part of the execution of State::storage_at, a worker thread is spawned to prepare the trie in memory. SSTORE in Figure 8 reveals the modifications of storing updated values and the final block commit. When SSTORE is interpreted, the storage system invokes State::set_storage to put the value in the Storage Replica. When the block is committed, the new workflow uses the memory with pre-loaded trie nodes to minimize database access.
Listing 1 provides pseudocode for the integration points of SLOAD and SSTORE shown in Figure 7 and Figure 8. This pseudocode maps the logical components of our architecture to the storage-access hooks in the client (e.g., storage_at, set_storage in Parity). These hooks allow Storage Replica to intercept storage operations while preserving the standard EVM execution. This integration pattern is consistent with the storage interfaces of other major clients, such as GetState and SetState in Geth, facilitating reproducibility across different implementations.
Listing 1. Pseudocode of the integration points of SLOAD and SSTORE for Storage Replica in Parity.
We implemented our new storage techniques in Parity version 2.4.0 by re-routing existing code to the new storage system. Our Storage Replica is stored in a separate key–value database, while re-using the client’s existing wrappers and layers for in-memory overlays and caches to preserve architectural consistency. Specifically, our implementation leverages standard Rust libraries and data structures that align with the existing client architecture:
  • The in-memory representations of both trie nodes and Storage Replica entries utilize Rust’s standard HashMap and dynamically sized array (Vec). This ensures that the new storage components integrate natively with the client’s canonical data layout.
  • We use Rust’s standard library std::sync::mpsc for the communication channel. As discussed in Section 5.2, this channel is configured as a bounded queue. When the queue reaches capacity, the sender temporarily blocks to apply backpressure under adversarial workloads. Line 5 of Listing 1 shows the send operation in the SLOAD hook that the interpreter uses to enqueue a read operation from the trie, according to the API introduced with Table 2. The send operation for trie updates is shown in line 13. A simplified sketch of these integration hooks follows below.
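The following sketch approximates the structure of these integration points; it is an illustration written against the interfaces named above, not a verbatim reproduction of Listing 1, and the type and field names are assumptions:

```rust
use std::collections::HashMap;
use std::sync::mpsc::SyncSender;

type H256 = [u8; 32];

enum TrieRequest { Prefetch(H256), Update(H256, H256) }

/// Flat key-value replica consulted by the interpreter hooks.
struct StorageReplica { values: HashMap<H256, H256> }

struct State {
    replica: StorageReplica,
    trie_worker: SyncSender<TrieRequest>, // bounded channel to the background worker
}

impl State {
    /// SLOAD hook (cf. State::storage_at): read from the replica and ask the worker to
    /// prefetch the trie path that will be needed at commit time.
    fn storage_at(&self, idx: H256) -> H256 {
        let value = self.replica.values.get(&idx).copied().unwrap_or([0u8; 32]);
        let _ = self.trie_worker.send(TrieRequest::Prefetch(idx)); // blocks only if queue is full
        value
    }

    /// SSTORE hook (cf. State::set_storage): update the replica and enqueue the trie update.
    fn set_storage(&mut self, idx: H256, value: H256) {
        self.replica.values.insert(idx, value);
        let _ = self.trie_worker.send(TrieRequest::Update(idx, value));
    }
}
```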

6. Experimental Evaluation

We conducted experiments to assess the performance properties of the current Ethereum and its access to the state database, and we compared those performance properties to the performance obtained by employing Storage Replica. We asked the following research questions:
  • RQ1—What is the primary performance bottleneck in the EVM’s persistent storage access?
  • RQ2—What performance is achieved using Storage Replica?
  • RQ3—What is the disk space overhead of Storage Replica?

6.1. Experimental Environment

For the experiment, we used a server with an Intel Core i7-7700K CPU (4 cores, 8 threads) at 4.20 GHz and a 2 TB NVMe SSD from the Intel SSD Pro 7600p/760p/E 6100p series. The server was running Ubuntu Linux Server 20.04. The experiments ran exclusively on the server, which was on-premise bare metal, maintained by the authors of this study.
The experiment was a long-running evaluation that processed 900 k Ethereum blocks, encompassing 56.0 M transactions. The entire experiment took several days (about a week) to complete. The on-premise server was dedicated exclusively to this experiment. We monitored that the server had been under light load, minimizing any noise in the results.

6.2. Experimental Setup and Data

We conducted the experiment on blocks from the Ethereum mainnet. To minimize imprecision caused by peer-to-peer synchronization of blocks over the network, we imported pre-existing binary files of block data. These files had been exported from an already-synchronized Ethereum node using the Parity export command. The experiment was executed on these offline files, and the time to obtain them beforehand was not included in the experiment time.
Our analysis was conducted using the Parity version 2.4.0 source code, a client implementation that utilizes the canonical MPT state architecture that is the subject of this paper; its fundamental I/O logic is shared across all EVM clients, as reported in [13].
The experiment was performed on the initial 9 M blocks of the mainnet, as this Parity version supports the hard forks up to that point (Constantinople [95] and Petersburg [96]). Because our study focused on the MPT architecture, we did not expect that analyzing blocks beyond this point would significantly influence the observed trends.
The obtained data consisted of block segments of 1 M blocks each, resulting in a total of nine segments (1 M, 2 M, …, 9 M). We employed sampling to speed up the experiment, using the initial 100 k blocks from the beginning of every 1 M block segment (i.e., 0–0.1 M, 1–1.1 M, 2–2.1 M, …, 8–8.1 M). To avoid the significant overhead of logging individual storage and trie operations—which would otherwise perturb the measurements—we recorded the cumulative execution time for each interval.
This sampling strategy is grounded in two intrinsic characteristics of the Ethereum blockchain. Transaction execution follows a strict sequential dependency; thus, evaluating a contiguous block interval is necessary to obtain meaningful performance measurements. In addition, a prior study [13] shows that EVM execution latency increases gradually and predictably as the global state grows, barring historical anomalies such as the 2016 DoS attacks. Under these conditions, the cumulative performance of the initial 100 k blocks (which are free from known anomalies) provides a representative approximation of the behavior within each 1 M block segment.

6.3. Performance Analysis

We modified the source code of Parity with measurement hooks for profiling. The hooks recorded the wall-clock time immediately before and after the execution of every annotated method, yielding the method’s execution time. The instrumentation dumped the execution times of the relevant methods into CSV files.
We recompiled the source code with release flags and invoked the node synchronization (from files) as normal. To minimize the impact of our instrumentation using hooks on the runtime, we stored all our measurements in memory using C-style inline macros, and we flushed the memory buffer to disk after each block import ended. Due to the nature of parallelism of software components in the Ethereum client, we used atomic operations [97] to avoid data races while minimizing performance impact. We used the default configuration with default cache sizes and the default pruning mode, but we switched to “no-warp”, a Parity special mode that enforces the execution of all transactions (otherwise, Parity would have downloaded the state database of earlier blocks over the network).
The main focus of the performance measurement was on the components described in Section 4, i.e., those that read the data. We were interested in three execution points: State::storage_at to assess SLOAD, Trie::lookup, and KeyValueDB::get. These methods are key components of the workflow for retrieving a smart contract value from persistent storage. They are executed successively: SLOAD performs the trie lookup, and the trie lookup triggers database access in a single chain of function calls. To analyze which method contributes most to the execution time, we measured each method’s execution time. In what follows, and in line with the performance analysis literature [98] (p. 113), we refer to the net execution time of a function excluding nested function calls as the function’s self-time. In contrast, we use the term total time to refer to the execution time of a function, including nested function calls.
To analyze the performance of components that store data, we measured SSTORE, State::commit, and KeyValueDB::flush. Because all these methods run in sequence, we did not have to extract their self-time.
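The following is a minimal sketch of such a timing hook, assuming an atomic per-method accumulator that is flushed to CSV after each block import; it records the total time of a call site, from which self-time is obtained by subtracting the accumulated time of nested instrumented calls (names are illustrative):

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Instant;

/// Cumulative wall-clock time of an instrumented method, in nanoseconds.
static STORAGE_AT_NANOS: AtomicU64 = AtomicU64::new(0);

/// Wrap a call site to accumulate its wall-clock time without holding locks.
fn timed<T>(counter: &AtomicU64, f: impl FnOnce() -> T) -> T {
    let start = Instant::now();
    let result = f();
    counter.fetch_add(start.elapsed().as_nanos() as u64, Ordering::Relaxed);
    result
}

// Usage at an instrumented call site (hypothetical; flushed to CSV after each block import):
// let value = timed(&STORAGE_AT_NANOS, || state.storage_at(contract, idx));
```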
  • RQ1: What is the primary performance bottleneck in the EVM’s persistent storage access?
The results of the experiment are detailed in Table 3 and visualized in Figure 9a,c. Overall, performance was reasonably fast for blocks up to 4 M, with no noticeable performance degradation. However, performance started to deteriorate after 4 M. This is consistent with the observations of Baird et al. [13], who reported up to 90% performance degradation but did not investigate the details.
With respect to read access, we observe in Table 3 that the performance of SLOAD, Trie::lookup, and RocksDB::get (which is Parity’s implementation of KeyValueDB::get) degraded. Figure 9a illustrates that the database access disproportionately slowed down. In other words, database access is the key to performance improvement. Note that we used an NVMe SSD connected via the PCI Express bus—users with slower disks (e.g., SATA SSDs or HDDs) may experience an even greater slowdown.
With respect to write access, we observe in Table 3 that the performance of SSTORE, State::commit, and RocksDB::flush degraded more considerably for blocks 5 M and later. However, when comparing all operations, SLOAD is still the slowest, followed by State::commit. Based on our functional analysis in Section 4, we expect that the performance of State::commit correlated with the performance degradation of SLOAD (via RocksDB::get), because this operation needed to read more nodes from the database to build the trie in memory.
Answer to RQ1: the main bottleneck is the access to the KeyValueDB::get interface, consequently slowing down SLOAD, SSTORE, and State::commit.
  • RQ2: What performance is achieved using Storage Replica?
We modified the source code of Parity to incorporate our Storage Replica. We built the Parity client and conducted the performance experiment in the same way as the original version. We measured the performance of the new version and calculated the size of Storage Replica.
Table 4 and Figure 9b,d show the performance results for our Storage Replica. We depict total time (not self-time), as the trie lookup time (Trie::lookup) is no longer applicable and the other methods run in sequence. With Storage Replica, access time to caches and MemoryDB is negligible as well. The tables and charts illustrate that our Storage Replica strategy significantly improved the performance of the new SLOAD instruction, with RocksDB::get running 4× faster than in the original version. The State::commit operation improved as well, because at this stage it potentially had all the required data from the database pre-loaded in memory.
We depict the achieved speedups in Table 5, which displays the execution time ratio between the original and the new version. Every column contains a value computed as original/new to express how many times the new solution is faster.
As performance decreased at block height 5 M and later, the results from 5 M–9 M are the most relevant. Table 5 reveals that SLOAD performance improved by a factor of 6.5× at block height 5 M and stabilized at a factor of 4.04–4.97× for later blocks. Similarly, SSTORE performance improved by a factor of 1.68–2.78× at block height 5 M and later. In the case of State::commit, the achieved speedup was 3.72–4.45×.
RocksDB::get and State::commit were still tasked with I/O access, which is evident in the observed latency despite the achieved speedups. As described in Section 5.2, the final update of the trie remained computationally expensive, requiring Keccak hash computations of modified nodes, with associated read and write amplification. This pertains particularly to transactions located towards the very end of the block, where asynchronous trie construction may not finish in time before the state commit commences, thus necessarily leaving work to the state commit phase.
We also investigated whether our approach introduces a performance penalty. For blocks in the range 5–9 M, the maximum slowdown was 9%, observed for RocksDB::flush. For earlier blocks, the slowdown was more pronounced (up to 67%), but the relative overhead of flushing the database diminished as the block number grew.
These results show that for later blocks, above 5 M, the overhead introduced by our approach became negligible, especially when weighed against the several-fold (up to 6.5×) improvement in loading data.
Answer to RQ2: the performance of SLOAD, SSTORE, and State::commit improved by up to 6×, 2×, and 4×, respectively. The overhead of keeping the tries updated was capped at a 9% slowdown when flushing data to the database.
The experiments validated that the performance of smart contracts improves with Storage Replica. We tested the functionality by importing the same 9 M blocks as the original version. Because block validation embedded in the Ethereum protocol succeeded for all blocks, we confirm that the Storage Replica functionality does not break backward compatibility.
  • RQ3: What is the disk space overhead of Storage Replica?
Column “Trie” in Table 6 depicts the size (in GB) of the original Ethereum storage trie at blocks 1 M, 2 M,…, 9 M. The additional space consumed by Storage Replica when using Ethereum’s 32 B storage address space (and, hence, 32 B to represent the address of a storage value) is depicted in column “Replica 32 B”. The relative size (in %) of Storage Replica compared to the size of the trie is shown in column “32 B/Trie”. Storage Replica incurs a maximum space overhead of 41.48 % at block height 4 M, but this overhead diminishes gradually with later blocks, reaching 31.39 % at block 9 M.
Because the aforementioned 32 B storage address space is very large compared to the practical storage requirements of most smart contracts, we computed the theoretical size overhead for a shrunk address space, catering for situations where contracts utilize only a fraction of the Ethereum-provided address space for storage. Assuming a modest space requirement of ≤ 256 storage values per contract, the contract storage address space shrinks to 1 B. The space requirements of Storage Replica are considerably smaller with 1 B storage addresses, as shown in columns “Replica 1 B” and “1 B/Trie”. Notably, the space overhead of Storage Replica over the trie drops to 8.55 % at block 9 M.
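The sketch below illustrates the per-entry key layouts behind the “Replica 32 B” and “Replica 1 B” columns of Table 6. It is our illustration only; the exact on-disk encoding of the prototype is not reproduced here. The point is that shrinking the storage-address representation from 32 B to 1 B shortens every replica key, which is what the smaller “Replica 1 B” footprint reflects.

```rust
// Key layout for the default 32 B storage address space.
fn replica_key_32(account: &[u8; 20], index: &[u8; 32]) -> Vec<u8> {
    let mut key = Vec::with_capacity(20 + 32);
    key.extend_from_slice(account);
    key.extend_from_slice(index); // full 256-bit storage address
    key
}

// Key layout for the shrunk 1 B storage address space.
fn replica_key_1(account: &[u8; 20], index: u8) -> Vec<u8> {
    let mut key = Vec::with_capacity(20 + 1);
    key.extend_from_slice(account);
    key.push(index); // suffices for contracts with at most 256 storage values
    key
}

fn main() {
    let account = [0xab; 20];
    let slot = [0u8; 32];
    assert_eq!(replica_key_32(&account, &slot).len(), 52); // 52 B per key
    assert_eq!(replica_key_1(&account, slot[31]).len(), 21); // 21 B per key
}
```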
Note that Storage Replica, unlike an Ethereum archive node, does not preserve the history of write accesses to storage. Rather, it only keeps the most recent value written to a storage address. As a result, the size of Storage Replica does not grow from repeated write accesses to a given storage address.
However, Storage Replica grows as new storage slots are attached. From our observation, this contributes about 2 GB of additional disk space per one million blocks, as shown in column “Replica 32 B” of Table 6. This is only a small fraction of the space, considering the whole database size: when the client was synchronized in pruning mode, the database took more than 200 GB of disk space at block height 9 M [99]. (Note that in pruning mode, the client preserves only the most recent blocks in the database to conserve disk space, which amplifies the overhead of Storage Replica in this comparison.) We calculate Storage Replica’s overhead at block height 9 M as the size in column “Replica 32 B” divided by the client’s total database size, i.e., 9.87 GB / 200 GB = 4.935%.
Answer to RQ3: Storage Replica requires less than 10 GB of additional disk space until block 9 M, which is below 5% of the total database size of a client synchronized in pruning mode at block height 9 M. The ratio of space overhead compared to the storage trie gradually decreases from block 4 M, reaching 31.39 % at block 9 M.

7. Limitations and Future Work

Our evaluation was conducted on an extended version of Parity for the initial 9 M blocks of the Ethereum mainnet. We acknowledge that settling on one particular client (Parity) and a range of 9 M blocks that covered approximately 37.8% of the entire Ethereum history may have omitted certain details that an exhaustive study of the Ethereum ecosystem might reveal.
However, the trie access bottleneck that Storage Replica resolves is not tied to a particular client software architecture, key–value store, operating system, or hardware architecture. Instead, it is a fundamental part of the Ethereum protocol and its cryptographic authentication of MPT nodes that collectively store the Ethereum world state in a secure and verifiable format. Any Ethereum client implementation is inherently bound by the EVM abstract machine specification of the Ethereum protocol [23], which mandates SLOAD and SSTORE operations that involve the same expensive, sequential trie traversals that we have reported in our work. The parallelism that Storage Replica leverages between transaction processing and the state commit phase of blocks is rooted in the Ethereum protocol itself.
Our experimental setup is adequate for isolating these protocol-level bottlenecks and for demonstrating our solution’s core principles. The 9 M block dataset clearly established the performance-degradation trend caused by MPT state growth and showed that our architecture addresses this trend by decoupling transaction execution from MPT computation.
While the absolute performance gains (e.g., the 6× speedup) may differ across Ethereum clients, the relative performance benefit remains. Even in a worst-case scenario where a high storage access load will cause significant queue buildup that may outpace the asynchronous workers, the State::commit phase would perform the remaining work sequentially, still matching (for the remaining work) and surpassing (for the work conducted by the asynchronous workers) the baseline. Critically, the SLOAD operations would remain accelerated by Storage Replica’s fast lookups.
Future work comprises extending our method to the Berlin hard fork at block number 12,244,000 [60]. The Berlin hard fork upgraded the Ethereum protocol to increase the storage gas cost for initial (cold) storage accesses, while significantly reducing the costs of subsequent (warm) accesses to 1/21 of the cold access cost [100]. This hard fork introduced a new transaction type with an optional access list [101], which contains account addresses and storage keys that the transaction intends to use. The introduction of access lists allows Ethereum clients to warm up storage slots before executing a transaction. We expect further performance improvements for Ethereum clients by extending our work with access lists, leveraging asynchronous pre-fetching of the planned storage slots for efficient warm-up of the MPT.
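A speculative sketch of this direction is shown below (hypothetical code; neither the prefetcher nor its integration exists in our prototype): the account addresses and storage keys declared in an EIP-2930 access list are handed to a background thread before the transaction executes, so that the corresponding replica entries and MPT paths are warm by the time SLOAD and SSTORE run.

```rust
use std::thread;

struct AccessListEntry {
    address: [u8; 20],
    storage_keys: Vec<[u8; 32]>,
}

// Stand-in for fetching one replica entry (and, optionally, its MPT path)
// from the database into the in-memory caches.
fn prefetch_slot(_address: &[u8; 20], _key: &[u8; 32]) {}

fn warm_up(access_list: Vec<AccessListEntry>) -> thread::JoinHandle<()> {
    // The prefetch runs concurrently with transaction scheduling; execution
    // would only need to wait on this handle if it reaches a slot that is still cold.
    thread::spawn(move || {
        for entry in &access_list {
            for key in &entry.storage_keys {
                prefetch_slot(&entry.address, key);
            }
        }
    })
}

fn main() {
    let list = vec![AccessListEntry { address: [0x11; 20], storage_keys: vec![[0u8; 32]] }];
    warm_up(list).join().unwrap();
}
```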

8. Conclusions

Smart contracts reach their limits due to slow storage performance. In this work, we accelerated smart contract storage performance by introducing a new data structure, Storage Replica, and reorganizing block processing. The runtime environment of the EVM is changed without breaking the backward compatibility of the Ethereum protocol. Our Storage Replica requires additional disk space of 10 GB for mirroring part of the state, which is below 5% of the size of the whole persistent database. Our solution achieves a speedup of up to 6× in processing smart contracts and about 4× in committing the current block state.

Author Contributions

Conceptualization, B.S., K.J., and S.J.; methodology, K.J. and S.J.; software, K.J., S.J., and Y.K.; validation, S.J., B.S., and B.B.; formal analysis, K.J. and S.J.; investigation, K.J., S.J., and Y.K.; resources, K.J.; data curation, K.J. and S.J.; writing—original draft preparation, K.J., S.J., and B.S.; writing—review and editing, B.B., S.J., and Y.K.; visualization, K.J. and S.J.; supervision, B.S. and B.B.; project administration, B.S. and B.B.; funding acquisition, B.S. and B.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Sonic Labs (under its former name, Fantom Foundation), by the Institute of Information & Communications Technology Planning & Evaluation (IITP) funded by the Korean government (MSIT) under Grant No. 2021-0-00853, and by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education under Grant No. 2022R1A6A3A13069093.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Moniruzzaman, M.; Khezr, S.; Yassine, A.; Benlamri, R. Blockchain for smart homes: Review of current trends and research challenges. Comput. Electr. Eng. 2020, 83, 106585. [Google Scholar] [CrossRef]
  2. Ohize, H.O.; Onumanyi, A.J.; Umar, B.U.; Ajao, L.A.; Isah, R.O.; Dogo, E.M.; Nuhu, B.K.; Olaniyi, O.M.; Ambafi, J.G.; Sheidu, V.B.; et al. Blockchain for securing electronic voting systems: A survey of architectures, trends, solutions, and challenges. Clust. Comput. 2024, 28, 132. [Google Scholar] [CrossRef]
  3. Çabuk, U.C.; Adiguzel, E.; Karaarslan, E. A survey on feasibility and suitability of blockchain techniques for the e-voting systems. arXiv 2020, arXiv:2002.07175. [Google Scholar] [CrossRef]
  4. Chillakuri, B.; Attili, V.P. Role of blockchain in HR’s response to new-normal. Int. J. Organ. Anal. 2021, 177, 102936. [Google Scholar] [CrossRef]
  5. Farahani, B.; Firouzi, F.; Luecking, M. The convergence of IoT and distributed ledger technologies (DLT): Opportunities, challenges, and solutions. J. Netw. Comput. Appl. 2021, 177, 102936. [Google Scholar] [CrossRef]
  6. Zhu, Q.; Loke, S.W.; Trujillo-Rasua, R.; Jiang, F.; Xiang, Y. Applications of Distributed Ledger Technologies to the Internet of Things: A Survey. ACM Comput. Surv. 2019, 52, 1–34. [Google Scholar] [CrossRef]
  7. Lao, L.; Li, Z.; Hou, S.; Xiao, B.; Guo, S.; Yang, Y. A Survey of IoT Applications in Blockchain Systems: Architecture, Consensus, and Traffic Modeling. ACM Comput. Surv. 2020, 53, 1–32. [Google Scholar] [CrossRef]
  8. Auer, R.; Haslhofer, B.; Kitzler, S.; Saggese, P.; Victor, F. The technology of decentralized finance (DeFi). Digit. Financ. 2024, 6, 55–95. [Google Scholar] [CrossRef]
  9. Benbachir, S.; Amzile, K.; Beraich, M. Exploring the Asymmetric Multifractal Dynamics of DeFi Markets. J. Risk Financ. Manag. 2025, 18, 122. [Google Scholar] [CrossRef]
  10. CoinLaw. Ethereum Statistics 2025: Powerful Facts for Investors. 2025. Available online: https://coinlaw.io/ethereum-statistics/ (accessed on 8 November 2025).
  11. Etherscan. Ethereum Unique Addresses Chart. 2025. Available online: https://etherscan.io/chart/address (accessed on 8 November 2025).
  12. Etherscan. Ethereum Full Node Sync (Archive) Chart. 2025. Available online: https://etherscan.io/chartsync/chainarchive (accessed on 8 November 2025).
  13. Baird, K.; Jeong, S.; Kim, Y.; Burgstaller, B.; Scholz, B. The Economics of Smart Contracts. arXiv 2019, arXiv:1910.11143. [Google Scholar] [CrossRef]
  14. O’Leary, R.R. Blockchain Bloat: How Ethereum Is Tackling Storage Issues. 2018. Available online: https://www.coindesk.com/markets/2018/01/18/blockchain-bloat-how-ethereum-is-tackling-storage-issues (accessed on 8 November 2025).
  15. Ethereum. EIP-1884: Repricing for Trie-Size-Dependent Opcodes. 2019. Available online: https://eips.ethereum.org/EIPS/eip-1884 (accessed on 8 November 2025).
  16. Sun, B. Impact of the Istanbul Hard Fork on Aragon Organizations. 2020. Available online: https://blog.aragon.org/istanbul-hard-fork-impact/ (accessed on 8 November 2025).
  17. ChainSecurity. Istanbul Hardfork EIPs—Changing Gas Costs and More. 2019. Available online: https://www.chainsecurity.com/blog/istanbul-hardfork-eips-changing-gas-costs-and-more (accessed on 8 November 2025).
  18. Etherscan. Ethereum Daily Transactions Chart. 2025. Available online: https://etherscan.io/chart/tx (accessed on 8 November 2025).
  19. Go-Ethereum. Geth. Available online: https://geth.ethereum.org/ (accessed on 8 November 2025).
  20. Parity Technologies. Parity. Available online: https://www.parity.io/ (accessed on 8 November 2025).
  21. Saraf, C.; Sabadra, S. Blockchain platforms: A compendium. In Proceedings of the 2018 IEEE International Conference on Innovative Research and Development (ICIRD), Bangkok, Thailand, 11–12 May 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–6. [Google Scholar] [CrossRef]
  22. Vujičić, D.; Jagodić, D.; Ranđić, S. Blockchain technology, Bitcoin, and Ethereum: A brief overview. In Proceedings of the 17th International Symposium INFOTEH-JAHORINA (INFOTEH), East Sarajevo, Republic of Srpska, 21–23 March 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–6. [Google Scholar] [CrossRef]
  23. Wood, G. Ethereum: A Secure Decentralised Generalised Transaction Ledger Shanghai Version efc5f9a—2025-02-04. 2025. Available online: https://ethereum.github.io/yellowpaper/paper.pdf (accessed on 8 November 2025).
  24. Fredkin, E. Trie Memory. Commun. ACM 1960, 3, 490–499. [Google Scholar] [CrossRef]
  25. Morrison, D.R. PATRICIA—Practical algorithm to retrieve information coded in alphanumeric. J. ACM JACM 1968, 15, 514–534. [Google Scholar] [CrossRef]
  26. Merkle, R.C. A Digital Signature Based on a Conventional Encryption Function. In Proceedings of the Advances in Cryptology—CRYPTO ’87; Pomerance, C., Ed.; Springer: Berlin/Heidelberg, Germany, 1988; pp. 369–378. [Google Scholar] [CrossRef]
  27. Ethereum Foundation. Merkle Patricia Trie. 2025. Available online: https://ethereum.org/developers/docs/data-structures-and-encoding/patricia-merkle-trie/ (accessed on 11 December 2025).
  28. Jezek, K. Ethereum Data Structures. arXiv 2021, arXiv:2108.05513. [Google Scholar] [CrossRef]
  29. Bertoni, G.; Daemen, J.; Peeters, M.; Van Assche, G. The Keccak Reference. Round 3 Submission to NIST SHA-3, 2011. Available online: https://keccak.team/files/Keccak-reference-3.0.pdf (accessed on 20 December 2025).
  30. Solidity. Solidity Language. 2025. Available online: https://docs.soliditylang.org/en/v0.8.30/ (accessed on 8 November 2025).
  31. Oliva, G.A.; Hassan, A.E.; Jiang, Z.M. An exploratory study of smart contracts in the Ethereum blockchain platform. Empir. Softw. Eng. 2020, 25, 1864–1904. [Google Scholar] [CrossRef]
  32. Solidity. Layout of State Variables in Storage and Transient Storage. 2025. Available online: https://docs.soliditylang.org/en/v0.8.30/internals/layout_in_storage.html (accessed on 8 November 2025).
  33. Raju, P.; Ponnapalli, S.; Kaminsky, E.; Oved, G.; Keener, Z.; Chidambaram, V.; Abraham, I. mLSM: Making authenticated storage faster in Ethereum. In Proceedings of the 10th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage 18), Boston, MA, USA, 9–10 July 2018; Available online: https://www.usenix.org/conference/hotstorage18/presentation/raju (accessed on 20 December 2025).
  34. Ponnapalli, S.; Shah, A.; Banerjee, S.; Malkhi, D.; Tai, A.; Chidambaram, V.; Wei, M. RainBlock: Faster Transaction Processing in Public Blockchains. In Proceedings of the 2021 USENIX Annual Technical Conference (USENIX ATC 21), USENIX Association, Virtual, 14–16 July 2021; pp. 333–347. Available online: https://www.usenix.org/conference/atc21/presentation/ponnapalli (accessed on 20 December 2025).
  35. Dahlberg, R.; Pulls, T.; Peeters, R. Efficient sparse Merkle trees. In Proceedings of the Nordic Conference on Secure IT Systems, Oulu, Finland, 2–4 November 2016; Springer: Berlin/Heidelberg, Germany, 2016; pp. 199–215. [Google Scholar] [CrossRef]
  36. Haider, F. Compact Sparse Merkle Trees. Cryptology ePrint Archive, Report 2018/955. 2018. Available online: https://eprint.iacr.org/2018/955 (accessed on 20 December 2025).
  37. Östersjö, R. Sparse Merkle Trees: Definitions and Space-Time Trade-Offs with Applications for Balloon. Master’s Thesis, Karlstad University, Karlstad, Sweden, 2016. Available online: https://www.diva-portal.org/smash/record.jsf?pid=diva2:936353 (accessed on 20 December 2025).
  38. Kuznetsov, O.; Frontoni, E.; Kuznetsova, K.; Arnesano, M. Optimizing Merkle proof size through path length analysis: A probabilistic framework for efficient blockchain state verification. Future Internet 2025, 17, 72. [Google Scholar] [CrossRef]
  39. Dickerson, T.; Gazzillo, P.; Herlihy, M.; Koskinen, E. Adding Concurrency to Smart Contracts. In Proceedings of the ACM Symposium on Principles of Distributed Computing, Washington, DC, USA, 25–27 July 2017; ACM: New York, NY, USA, 2017; pp. 303–312. [Google Scholar] [CrossRef]
  40. Zhang, A.; Zhang, K. Enabling concurrency on smart contracts using multiversion ordering. In Proceedings of the Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data, Macau, China, 23–25 June 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 425–439. [Google Scholar] [CrossRef]
  41. Anjana, P.S.; Kumari, S.; Peri, S.; Rathor, S.; Somani, A. An efficient framework for optimistic concurrent execution of smart contracts. In Proceedings of the 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), Pavia, Italy, 13–15 February 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 83–92. [Google Scholar] [CrossRef]
  42. Shrey, B.; Singh, A.P.; Sathya, P.; Yogesh, S. DiPETrans: A Framework for Distributed Parallel Execution of Transactions of Blocks in Blockchain. arXiv 2019, arXiv:1906.11721. [Google Scholar] [CrossRef]
  43. Sarrar, N. On transaction parallelizability in Ethereum. arXiv 2019, arXiv:1901.09942. [Google Scholar] [CrossRef]
  44. Saraph, V.; Herlihy, M. An Empirical Study of Speculative Concurrency in Ethereum Smart Contracts. In Proceedings of the International Conference on Blockchain Economics, Security and Protocols (Tokenomics 2019). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, Paris, France, 6–7 May 2019. [Google Scholar] [CrossRef]
  45. Eren, H.; Karaduman, Ö.; Gençoğlu, M.T. Security challenges and performance trade-offs in on-chain and off-chain blockchain storage: A comprehensive review. Appl. Sci. 2025, 15, 3225. [Google Scholar] [CrossRef]
  46. Chandramouli, B.; Prasaad, G.; Kossmann, D.; Levandoski, J.; Hunter, J.; Barnett, M. Faster: A concurrent key-value store with in-place updates. In Proceedings of the 2018 International Conference on Management of Data, Houston, TX, USA, 10–15 June 2018; pp. 275–290. [Google Scholar] [CrossRef]
  47. Song, K.; Kim, J.; Lee, D.; Park, S. MultiPath MultiGet: An Optimized Multiget Method Leveraging SSD Internal Parallelism. In Proceedings of the 7th International Conference on Emerging Databases; Springer: Berlin/Heidelberg, Germany, 2018; pp. 138–150. [Google Scholar] [CrossRef]
  48. Papagiannis, A.; Saloustros, G.; González-Férez, P.; Bilas, A. An efficient memory-mapped key-value store for flash storage. In Proceedings of the ACM Symposium on Cloud Computing, Carlsbad, CA, USA, 11–13 October 2018; pp. 490–502. [Google Scholar] [CrossRef]
  49. Ouaknine, K.; Agra, O.; Guz, Z. Optimization of RocksDB for Redis on flash. In Proceedings of the International Conference on Compute and Data Analysis, Ystad, Sweden, 22–24 August 2017; pp. 155–161. [Google Scholar] [CrossRef]
  50. Wu, X.; Xu, Y.; Shao, Z.; Jiang, S. LSM-trie: An LSM-tree-based ultra-large key-value store for small data items. In Proceedings of the 2015 USENIX Annual Technical Conference (USENIX ATC 15), Santa Clara, CA, USA, 8–10 July 2015; pp. 71–82. Available online: https://www.usenix.org/conference/atc15/technical-session/presentation/wu (accessed on 20 December 2025).
  51. Lepers, B.; Balmau, O.; Gupta, K.; Zwaenepoel, W. KVell: The design and implementation of a fast persistent key-value store. In Proceedings of the 27th ACM Symposium on Operating Systems Principles, Huntsville, ON, Canada, 27–30 October 2019; pp. 447–461. [Google Scholar] [CrossRef]
  52. Agarwal, R.; Khandelwal, A.; Stoica, I. Succinct: Enabling queries on compressed data. In Proceedings of the 12th USENIX Symposium on Networked Systems Design and Implementation (NSDI 15), Oakland, CA, USA, 4–6 May 2015; pp. 337–350. Available online: https://www.usenix.org/conference/nsdi15/technical-sessions/presentation/agarwal (accessed on 20 December 2025).
  53. Ethereum Foundation. FAQ: Upcoming Ethereum Hard Fork. 2016. Available online: https://blog.ethereum.org/2016/10/18/faq-upcoming-ethereum-hard-fork (accessed on 5 December 2025).
  54. He, Z.; Li, Z.; Qiao, A.; Luo, X.; Zhang, X.; Chen, T.; Song, S.; Liu, D.; Niu, W. Nurgle: Exacerbating Resource Consumption in Blockchain State Storage via MPT Manipulation. In Proceedings of the 2024 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 19–23 May 2024; pp. 2180–2197. [Google Scholar] [CrossRef]
  55. Choi, J.A.; Beillahi, S.M.; Li, P.; Veneris, A.; Long, F. LMPTs: Eliminating Storage Bottlenecks for Processing Blockchain Transactions. In Proceedings of the 2022 IEEE International Conference on Blockchain and Cryptocurrency (ICBC), Shanghai, China, 2–5 May 2022; pp. 1–9. [Google Scholar] [CrossRef]
  56. Li, C.; Beillahi, S.M.; Yang, G.; Wu, M.; Xu, W.; Long, F. LVMT: An Efficient Authenticated Storage for Blockchain. ACM Trans. Storage 2024, 20, 1–34. [Google Scholar] [CrossRef]
  57. Ethereum Foundation. Geth 1.8—Iceberg. 2018. Available online: https://blog.ethereum.org/2018/02/14/geth-1-8-iceberg (accessed on 5 December 2025).
  58. Ethereum Foundation. Ethereum Istanbul Upgrade Announcement. 2019. Available online: https://blog.ethereum.org/2019/11/20/ethereum-istanbul-upgrade-announcement (accessed on 5 December 2025).
  59. Ethereum Foundation. Geth v1.10.0. 2021. Available online: https://blog.ethereum.org/2021/03/03/geth-v1-10-0 (accessed on 8 November 2025).
  60. Beiko, T. Ethereum Berlin Upgrade Announcement. 2021. Available online: https://blog.ethereum.org/2021/03/08/ethereum-berlin-upgrade-announcement (accessed on 12 November 2025).
  61. Beiko, T. London Mainnet Announcement. 2021. Available online: https://blog.ethereum.org/2021/07/15/london-mainnet-announcement (accessed on 5 December 2025).
  62. Ethereum Foundation. Geth v1.13.0. 2023. Available online: https://blog.ethereum.org/2023/09/12/geth-v1-13-0 (accessed on 5 December 2025).
  63. Rebuffo, G. Releasing Erigon v3.0.0. Available online: https://erigon.tech/releasing-erigon-v3-0-0/ (accessed on 10 December 2025).
  64. Ethereum Community. Dynamic State Snapshots. 2019. Available online: https://github.com/ethereum/go-ethereum/pull/20152 (accessed on 8 November 2025).
  65. Go-Ethereum. Sync Modes. Available online: https://geth.ethereum.org/docs/fundamentals/sync-modes (accessed on 5 December 2025).
  66. OpenEthereum Documentation. Warp Sync. Available online: https://openethereum.github.io/Warp-Sync (accessed on 5 December 2025).
  67. Rebuffo, G. Erigon 3, Alpha 2: Introducing Blazingly Fast Sync on Archive Nodes with OtterSync and Other Improvements. Available online: https://erigon.tech/erigon-3-alpha-2-introducing-blazingly-fast-sync-on-archive-nodes-with-ottersync-and-other-improvements/ (accessed on 5 December 2025).
  68. Kim, C. Confessions of a Sharding Skeptic. 2020. Available online: https://www.coindesk.com/markets/2020/10/04/confessions-of-a-sharding-skeptic (accessed on 9 November 2025).
  69. O’Leary, R.R. ‘Turbo Geth’ Seeks to Scale Ethereum—And It’s Already in Beta. 2018. Available online: https://www.coindesk.com/markets/2018/09/21/turbo-geth-seeks-to-scale-ethereum-and-its-already-in-beta (accessed on 9 November 2025).
  70. Erigon Tech. Erigon. Available online: https://github.com/erigontech/erigon (accessed on 9 November 2025).
  71. Erigon Tech. DB Walk-Through. Available online: https://github.com/erigontech/erigon/blob/main/docs/programmers_guide/db_walkthrough.MD (accessed on 9 November 2025).
  72. Erigon Blog. Erigon Stage Sync and Control Flows. Available online: https://erigon.substack.com/p/erigon-stage-sync-and-control-flows (accessed on 10 November 2025).
  73. Feng, H.; Hu, Y.; Kou, Y.; Li, R.; Zhu, J.; Wu, L.; Zhou, Y. SlimArchive: A Lightweight Architecture for Ethereum Archive Nodes. In Proceedings of the 2024 USENIX Annual Technical Conference (USENIX ATC 24), Santa Clara, CA, 10–12 July 2024; pp. 1257–1272. Available online: https://www.usenix.org/conference/atc24/presentation/feng-hang (accessed on 20 December 2025).
  74. Zheng, X.; Zhu, Y.; Si, X. A Survey on Challenges and Progresses in Blockchain Technologies: A Performance and Security Perspective. Appl. Sci. 2019, 9, 4731. [Google Scholar] [CrossRef]
  75. Park, N.; Kancharla, A.; Kim, H.Y. A Real-Time Chain and Variable Bulk Arrival and Variable Bulk Service (VBAVBS) Model with λF. Appl. Sci. 2020, 10, 3651. [Google Scholar] [CrossRef]
  76. Yigit, E.; Dag, T. Improving Supply Chain Management Processes Using Smart Contracts in the Ethereum Network Written in Solidity. Appl. Sci. 2024, 14, 4738. [Google Scholar] [CrossRef]
  77. Fernández-Blanco, G.; Froiz-Míguez, I.; Fraga-Lamas, P.; Fernández-Caramés, T.M. Design, Implementation and Practical Energy-Efficiency Evaluation of a Blockchain Based Academic Credential Verification System for Low-Power Nodes. Appl. Sci. 2025, 15, 6596. [Google Scholar] [CrossRef]
  78. Sestrem Ochôa, I.; Reis Quietinho Leithardt, V.; Calbusch, L.; De Paz Santana, J.F.; Delcio Parreira, W.; Oriel Seman, L.; Albenes Zeferino, C. Performance and Security Evaluation on a Blockchain Architecture for License Plate Recognition Systems. Appl. Sci. 2021, 11, 1255. [Google Scholar] [CrossRef]
  79. Buterin, V. Sharding FAQ. 2017. Available online: https://vitalik.eth.limo/general/2017/12/31/sharding_faq.html (accessed on 17 December 2025).
  80. Buterin, V. Why Sharding Is Great: Demystifying the Technical Properties. 2021. Available online: https://vitalik.eth.limo/general/2021/04/07/sharding.html (accessed on 7 December 2025).
  81. Ethereum Community. Danksharding. 2025. Available online: https://ethereum.org/roadmap/danksharding/ (accessed on 8 November 2025).
  82. Liu, C.; Zhu, W.; Yao, Z.; Si, X. An Efficient Cross-Shard Smart Contract Execution Framework Leveraging Off-Chain Computation and Genetic Algorithm-Optimized Migration. Electronics 2025, 14, 3684. [Google Scholar] [CrossRef]
  83. Xu, J.; Ming, Y.; Wu, Z.; Wang, C.; Jia, X. X-shard: Optimistic cross-shard transaction processing for sharding-based blockchains. IEEE Trans. Parallel Distrib. Syst. 2024, 35, 548–559. [Google Scholar] [CrossRef]
  84. Wang, Y.; Zhang, Z.; Yin, H.; Yu, G.; Wang, X.; Sun, C.; Ni, W.; Liu, R.P.; Cheng, Z. DCSCY: DRL-Based Cross-Shard Smart Contract Yanking in a Blockchain Sharding Framework. Electronics 2025, 14, 3254. [Google Scholar] [CrossRef]
  85. Ethereum Community. Zero-Knowledge Rollups. 2025. Available online: https://ethereum.org/developers/docs/scaling/zk-rollups/ (accessed on 5 December 2025).
  86. Ethereum. EIP-7691: Blob Throughput Increase. 2024. Available online: https://eips.ethereum.org/EIPS/eip-7691 (accessed on 5 December 2025).
  87. Grech, N.; Kong, M.; Jurisevic, A.; Brent, L.; Scholz, B.; Smaragdakis, Y. Madmax: Surviving out-of-gas conditions in Ethereum smart contracts. ACM Program. Lang. 2018, 2, 1–27. [Google Scholar] [CrossRef]
  88. Szydlo, M. Merkle tree traversal in log space and time. In Proceedings of the International Conference on the Theory and Applications of Cryptographic Techniques; Springer: Berlin/Heidelberg, Germany, 2004; pp. 541–554. [Google Scholar] [CrossRef]
  89. Smith, A.J. Sequential program prefetching in memory hierarchies. Computer 1978, 11, 7–21. [Google Scholar] [CrossRef]
  90. Borthakur, D. Under the Hood: Building and Open-Sourcing RocksDB. 2013. Available online: https://engineering.fb.com/2013/11/21/core-data/under-the-hood-building-and-open-sourcing-rocksdb/ (accessed on 8 November 2025).
  91. Ghemawat, S.; Dean, J. LevelDB, A Fast and Lightweight Key/Value Database Library by Google. 2014. Available online: https://github.com/google/leveldb (accessed on 8 November 2025).
  92. McCool, M.; Reinders, J.; Robison, A. Structured Parallel Programming: Patterns for Efficient Computation, 1st ed.; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 2012. [Google Scholar] [CrossRef]
  93. Schmidt, B.; González-Domínguez, J.; Hundt, C.; Schlarb, M. Chapter 9—Message Passing Interface. In Parallel Programming; Morgan Kaufmann: San Francisco, CA, USA, 2018; pp. 315–364. [Google Scholar] [CrossRef]
  94. Herlihy, M.; Shavit, N.; Luchangco, V.; Spear, M. The Art of Multiprocessor Programming, 2nd ed.; Morgan Kaufmann: San Francisco, CA, USA, 2020. [Google Scholar] [CrossRef]
  95. Ethereum. EIP 1013: Hardfork Meta: Constantinople. 2018. Available online: https://eips.ethereum.org/EIPS/eip-1013 (accessed on 8 November 2025).
  96. Ethereum. EIP 1716: Hardfork Meta: Petersburg. 2019. Available online: https://eips.ethereum.org/EIPS/eip-1716 (accessed on 8 November 2025).
  97. Boehm, H.J.; Adve, S.V. Foundations of the C++ concurrency memory model. ACM Sigplan Not. 2008, 43, 68–78. [Google Scholar] [CrossRef]
  98. Jain, R. The Art Of Computer Systems Performance Analysis—Techniques for Experimental Design, Measurement, Simulation, and Modeling; Wiley Professional Computing; Wiley: Hoboken, NJ, USA, 1991. [Google Scholar] [CrossRef]
  99. Etherscan. Ethereum Full Node Sync (Default) Chart. 2025. Available online: https://etherscan.io/chartsync/chaindefault (accessed on 14 November 2025).
  100. Ethereum. EIP-2929: Gas Cost Increases for State Access Opcode. 2020. Available online: https://eips.ethereum.org/EIPS/eip-2929 (accessed on 8 November 2025).
  101. Ethereum. EIP-2930: Optional Access Lists. 2020. Available online: https://eips.ethereum.org/EIPS/eip-2930 (accessed on 12 November 2025).
Figure 1. (a) Example MPT structure and (b) key–value pairs stored in the MPT.
Figure 2. World state trie.
Figure 3. Processing of instruction SLOAD in the EVM.
Figure 4. Processing of instruction SSTORE in the EVM.
Figure 5. Original synchronous processing of transactions.
Figure 6. Asynchronous processing of transactions with Storage Replica. The red elements in the shaded region depict the new components and interactions, indicating asynchronous processing that decouples trie construction from the critical path.
Figure 7. Sequence diagram of SLOAD with Storage Replica. SLOAD fetches values directly without lookup amplification.
Figure 8. Sequence diagram of SSTORE with Storage Replica. It updates Storage Replica in addition to the original processing.
Figure 9. State database access latency, before and after optimization: (a,c) show the performance of the original client, where SLOAD and State::commit slowed down dramatically after 4 M blocks, due to consecutive calls to RocksDB::get; (b,d) show the performance after applying Storage Replica. The optimized SLOAD was 4 times faster in the later blocks. The optimized State::commit and SSTORE were also significantly improved.
Table 2. Trie and Storage Replica API.

          | Trie                     | Replica
read      | lookup(root, idx): val   | get(addr, idx): val
write     | put(root, idx, val)      | put(addr, idx, val)
Table 3. Current state access latency—total time and self-time. Execution times were dominated by RocksDB::get and State::commit, noting that SLOAD called RocksDB::get and, hence, had the RocksDB::get self-time accumulated in its own total time. Consequently, the database contributed most to the performance loss.

Segment   | Total Time (s)                                     | Self-Time (s)
          | SLOAD     SSTORE   State::commit   RocksDB::flush  | SLOAD    Trie::lookup   RocksDB::get
0–0.1 M   | 0.18      0.06     6.83            7.24            | 0.15     0.02           0.00
1–1.1 M   | 11.05     1.66     44.10           43.58           | 1.50     1.28           8.27
2–2.1 M   | 14.12     2.82     102.27          37.88           | 2.45     1.38           10.30
3–3.1 M   | 17.86     2.82     111.72          55.80           | 2.48     1.89           13.50
4–4.1 M   | 391.71    53.12    847.74          223.95          | 32.85    24.78          334.08
5–5.1 M   | 2619.06   188.13   3326.66         549.81          | 109.87   79.34          2429.86
6–6.1 M   | 4763.72   315.02   4785.19         524.45          | 159.02   96.12          4508.59
7–7.1 M   | 6494.15   509.11   4573.85         464.68          | 173.20   97.35          6223.61
8–8.1 M   | 9403.07   690.01   5707.45         574.54          | 216.70   137.57         9048.80
Table 4. Improved state access latency with Storage Replica. RocksDB::get and State::commit were still the most expensive operations, noting again that SLOAD called RocksDB::get and, thus, accumulated RocksDB::get’s self-time in its own total time. This latency distribution was expected, because RocksDB::get and State::commit were still tasked with I/O access, but both had improved considerably, thereby contributing to the speedup of SSTORE and SLOAD. RocksDB::flush remained nearly unchanged.

Segment   | Total Time (s)
          | SLOAD     RocksDB::get   SSTORE   State::commit   RocksDB::flush
0–0.1 M   | 0.19      0.04           0.06     4.03            7.10
1–1.1 M   | 3.60      2.35           1.27     27.02           92.88
2–2.1 M   | 4.80      2.59           1.46     59.99           107.43
3–3.1 M   | 6.53      4.30           2.00     66.63           143.05
4–4.1 M   | 102.21    79.56          25.58    589.31          673.08
5–5.1 M   | 403.08    329.29         67.68    895.20          529.39
6–6.1 M   | 957.79    857.96         132.34   1085.77         556.94
7–7.1 M   | 1608.59   1508.62        302.71   1027.98         512.08
8–8.1 M   | 2301.54   2177.54        330.47   1296.14         579.02
Table 5. Speedup ratio achieved with Storage Replica. For later blocks, SLOAD improved by 4–6 times, the commit operation improved by 4 times, and SSTORE improved by about 2 times. The flush operation worsened slightly, due to synchronization of Storage Replica.

Segment   | Speedup (×)
          | SLOAD   SSTORE   State::commit   RocksDB::flush
0–0.1 M   | 0.92    0.99     1.69            1.02
1–1.1 M   | 3.07    1.31     1.63            0.47
2–2.1 M   | 2.94    1.93     1.70            0.35
3–3.1 M   | 2.73    1.41     1.68            0.39
4–4.1 M   | 3.83    2.08     1.44            0.33
5–5.1 M   | 6.50    2.78     3.72            1.04
6–6.1 M   | 4.97    2.38     4.41            0.94
7–7.1 M   | 4.04    1.68     4.45            0.91
8–8.1 M   | 4.09    2.09     4.40            0.99
Table 6. Size of Storage Replica. Storage Replica requires up to 10 GB of disk space for the default smart-contract storage address space of 32 B, or about 2.69 GB when the storage address space is reduced to 1 B.

Block   | Size (GB)                            | Ratio (%)
        | Trie     Replica 1 B   Replica 32 B  | 1 B/Trie   32 B/Trie
1 M     | 0.06     0.01          0.02          | 8.89       32.09
2 M     | 0.19     0.02          0.06          | 11.19      33.56
3 M     | 0.29     0.04          0.11          | 15.19      37.46
4 M     | 1.04     0.19          0.43          | 18.75      41.48
5 M     | 5.12     0.63          1.81          | 12.30      35.30
6 M     | 13.26    1.30          4.35          | 9.77       32.77
7 M     | 20.19    1.86          6.51          | 9.24       32.24
8 M     | 26.02    2.42          8.39          | 9.32       32.23
9 M     | 31.43    2.69          9.87          | 8.55       31.39
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
