CMSS: A High-Performance Blockchain Storage System with Horizontal Scaling Support
Abstract
1. Introduction
- RQ1
- How can blockchain operations such as reading and writing blocks remain efficient when the ledger is massive? When the amount of ledger data is large, reading and writing blocks becomes time-consuming. The resulting delays reduce the throughput of the entire blockchain system and make storage one of its bottlenecks.
- RQ2
- How can capacity and performance be improved by adding more disks, without reorganizing the ledger data? Current technology writes the on-chain block data into a single file system, so the upper limit of block storage is bounded by the size of that file system. When the file system (disk) is full, the chain stops working and cannot store new blocks. If more disks are added to such a system, the blockchain must reorganize the ledger data and re-download it from the network, which is an extremely time-consuming process.
- RQ3
- As blockchain ledger data continues to grow, how can the storage cost of the entire blockchain be controlled? As blockchain systems develop, the amount of ledger data becomes very large. Each node in the blockchain system needs to save all ledger data, which imposes a heavy storage burden and high storage costs on the entire system.
- New Workflow with High Read–Write Performance. We propose a new block storage workflow. Following this process, CMSS can improve the reading and writing efficiency of the blockchain system while ensuring the consistency of the ledger.
- MetaFile System with Horizontal Scalability. We propose a file architecture for the blockchain storage system that supports rapid horizontal scaling. In scenarios with massive data, this architecture can expand ledger capacity simply by adding more disks, without reorganizing the data.
- Hot–Cold Separation with Lower Storage Costs. To reduce storage hardware costs, we design a block storage management solution based on hot and cold separation. Our experiments show that our blockchain storage system has lower costs than other permissioned blockchains.
2. Related Work
3. Preliminaries
3.1. Blockchain Workflow
3.2. Storage System
- File System/Ledger. The file system (also known as the ledger) stores all the transaction data in a sequential and immutable manner. It contains a series of blocks linked by cryptographic hashes [32], and each block contains a set of transactions. Each peer in the network maintains a copy of the ledger to ensure decentralization and resilience. The ledger is stored as a set of append-only block files on disk, with serialized blocks appended to the current file.
- State Database. The state database maintains the current state of the ledger, which represents the most recent value of each key–value pair in the system. It is essentially a key–value store that allows for fast retrieval of the latest state information without having to traverse the entire blockchain. The state database is critical for performance, as it enables applications to query the current state of the ledger without needing to process all previous transactions. Hyperledger Fabric supports multiple state database implementations, including LevelDB [33] and CouchDB [34].
- History Index. The history index keeps track of the historical values of keys in the state database. It allows users to query the historical state of a key at any point in time. This feature is particularly useful for auditability. The history index is implemented as a separate key–value store, with keys being the historical versions of the state database keys and values being the corresponding historical values.
- Block Index. Hyperledger Fabric provides a variety of indexing methods to quickly find the required block data. The index database is updated every time a block is committed. The index can efficiently locate blocks in the ledger based on criteria such as block number or transaction ID. It is essentially a mapping from block metadata to block file locations on disk.
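How the four components relate can be sketched with a small in-memory model. This is an illustration only, not Hyperledger Fabric's API: the class `ToyLedgerStore`, the `blocks_per_file` rollover parameter, the file naming, and the `repr`-based serialization are all assumptions made for the example (setting `blocks_per_file=1` stores each block in its own file).

```python
from dataclasses import dataclass, field


@dataclass
class BlockLocation:
    file_name: str  # block file that holds the serialized block
    offset: int     # byte offset of the block inside that file
    length: int     # serialized size in bytes


@dataclass
class ToyLedgerStore:
    """In-memory stand-in for the four components; not Fabric's actual API."""
    blocks_per_file: int = 2                           # hypothetical rollover policy
    files: dict = field(default_factory=dict)          # file name -> bytearray
    state_db: dict = field(default_factory=dict)       # key -> latest value
    history_index: dict = field(default_factory=dict)  # key -> [(block_no, tx_id, value)]
    by_number: dict = field(default_factory=dict)      # block number -> BlockLocation
    by_tx_id: dict = field(default_factory=dict)       # transaction ID -> block number

    def commit_block(self, block_no, txs):
        """Append a block; txs is a list of (tx_id, {key: value}) write sets."""
        payload = repr((block_no, txs)).encode()
        name = f"blockfile_{block_no // self.blocks_per_file:06d}"
        data = self.files.setdefault(name, bytearray())
        # Block index: record where the block will live, then append it.
        self.by_number[block_no] = BlockLocation(name, len(data), len(payload))
        data += payload
        for tx_id, writes in txs:
            self.by_tx_id[tx_id] = block_no
            for key, value in writes.items():
                self.state_db[key] = value  # state DB keeps only the latest value
                self.history_index.setdefault(key, []).append((block_no, tx_id, value))

    def read_block_bytes(self, block_no):
        """Locate a block through the index instead of scanning the files."""
        loc = self.by_number[block_no]
        return bytes(self.files[loc.file_name][loc.offset:loc.offset + loc.length])


store = ToyLedgerStore()
store.commit_block(0, [("tx1", {"a": 1})])
store.commit_block(1, [("tx2", {"a": 2, "b": 5})])
print(store.state_db)            # latest values only
print(store.history_index["a"])  # every value that key "a" has held
print(store.by_number[1])        # where block 1 is stored
```

The point of the sketch is the division of labour: the files hold the immutable history, the state database answers "what is the value now", the history index answers "what was it before", and the block index turns a block number or transaction ID into a file location.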
3.3. Sharding
- Shard Formation: Nodes establish identities using Sybil-attack-resistant [36] methods such as proof of work (PoW). Each node is then randomly assigned to a shard so that, with high probability, no shard is dominated by malicious nodes. Shards also need periodic reconfiguration to resist attackers who adapt over time.
- Cross-Shard Mechanism: Since the ledger is divided into shards, cross-shard transactions occur frequently. A cross-shard mechanism is necessary to ensure the atomicity (transactions are committed or aborted as a whole) and consistency (each transaction commit leads to a valid state) of these transactions across shards.
- Intra-Shard Consensus: Within each shard, nodes must reach consensus on a block containing proposed transactions. A Byzantine consensus protocol is typically employed to ensure safety (honest nodes agree on the same value) and liveness (valid transactions are eventually included in the ledger).
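The atomicity requirement for cross-shard transactions can be illustrated with a two-phase commit, one common way to obtain all-or-nothing behaviour across shards. The sketch below is an assumption for illustration, not the mechanism of any particular sharded blockchain: `Shard`, `cross_shard_transfer`, and the balance model are hypothetical, and it ignores concurrency, intra-shard consensus, and Byzantine coordinators.

```python
class Shard:
    """Toy shard holding account balances (hypothetical model)."""

    def __init__(self, name, balances):
        self.name = name
        self.balances = dict(balances)
        self.locked = {}  # tx_id -> pending {account: delta}

    def prepare(self, tx_id, deltas):
        """Phase 1: vote yes only if every local account stays non-negative."""
        if any(self.balances.get(acct, 0) + d < 0 for acct, d in deltas.items()):
            return False
        self.locked[tx_id] = deltas
        return True

    def commit(self, tx_id):
        """Phase 2 (all voted yes): apply the pending changes."""
        for acct, d in self.locked.pop(tx_id).items():
            self.balances[acct] = self.balances.get(acct, 0) + d

    def abort(self, tx_id):
        """Phase 2 (someone voted no): discard the pending changes."""
        self.locked.pop(tx_id, None)


def cross_shard_transfer(tx_id, parts):
    """parts is a list of (shard, deltas); commit on every shard or on none."""
    votes = [shard.prepare(tx_id, deltas) for shard, deltas in parts]
    if all(votes):
        for shard, _ in parts:
            shard.commit(tx_id)
        return True
    for shard, _ in parts:
        shard.abort(tx_id)
    return False


s1 = Shard("shard-1", {"alice": 10})
s2 = Shard("shard-2", {"bob": 0})
print(cross_shard_transfer("t1", [(s1, {"alice": -4}), (s2, {"bob": 4})]))
print(cross_shard_transfer("t2", [(s1, {"alice": -100}), (s2, {"bob": 100})]))
print(s1.balances, s2.balances)
```

The second transfer is rejected by the first shard, so the second shard discards its prepared change as well; neither ledger ends up in a half-applied state.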
3.4. Sidechains
4. Design of CMSS
4.1. System Model
4.2. CMSS’s Workflow
- Serialization: Parallel serialization of new blocks.
- Block Binary Log Writing: Write the read–write sets and the latest block height to the block binary log. This step is essential for data recovery in case of abnormal interruptions.
- Caching: To improve performance, a cache layer is added. After the commit request for a new block has updated the block binary log, the block data are written into the cache.
- Submit and Return: Upon completion of both log and cache updates, the commit request can return, and the background thread asynchronously updates the relevant databases.
- Asynchronous Database Update: The background thread updates the Block DB, State DB, ContractEvent DB, History DB, and Result DB asynchronously. This keeps database writes off the commit path, while the block binary log allows the data to be recovered if an update is interrupted.
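The five steps above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not CMSS's implementation: the class `ToyCommitPipeline`, the JSON serialization, the in-memory list standing in for the block binary log, and the simplified read–write set (a plain `{key: value}` dictionary of writes) are all hypothetical. Only the Block, State, and History databases are updated in the sketch; the other two are left empty for brevity.

```python
import json
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


class ToyCommitPipeline:
    def __init__(self):
        self.binlog = []  # stand-in for the on-disk block binary log
        self.cache = {}   # height -> serialized block
        self.databases = {name: {} for name in
                          ("block", "state", "contract_event", "history", "result")}
        self._pending = queue.Queue()
        threading.Thread(target=self._apply_loop, daemon=True).start()

    def commit(self, block):
        height = block["header"]["height"]
        # Step 1: serialize the parts of the block in parallel.
        with ThreadPoolExecutor() as pool:
            header, txs = pool.map(json.dumps, (block["header"], block["txs"]))
        serialized = header + "\n" + txs
        # Step 2: record the read-write sets and latest height in the log.
        self.binlog.append({"height": height, "rwsets": block["rwsets"]})
        # Step 3: make the block readable from the cache.
        self.cache[height] = serialized
        # Step 4: hand the databases to the background thread and return.
        self._pending.put((height, serialized, block["rwsets"]))
        return height

    def _apply_loop(self):
        # Step 5: asynchronous database update, off the commit path.
        while True:
            height, serialized, rwsets = self._pending.get()
            self.databases["block"][height] = serialized
            for key, value in rwsets.items():
                self.databases["state"][key] = value
                self.databases["history"].setdefault(key, []).append((height, value))
            self._pending.task_done()

    def wait_for_databases(self):
        """Block until the background thread has applied every queued block."""
        self._pending.join()


pipeline = ToyCommitPipeline()
pipeline.commit({"header": {"height": 1}, "txs": ["tx1"], "rwsets": {"k": "v1"}})
pipeline.wait_for_databases()
print(pipeline.databases["state"])
```

In this arrangement `commit` returns as soon as the log and cache are updated, so its latency does not depend on how long the database writes take; a reader that arrives before the background thread has finished can still be served from the cache.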
5. System Features
5.1. High-Performance Read–Write Ability
5.2. Horizontal Scaling
- one ledger has many file systems;
- one file system has many files;
- one file stores many blocks.
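The three-level hierarchy can be modelled as nested containers plus a location table. The sketch below is a hypothetical illustration of why such a layout allows capacity to grow without moving data; it is not the Meta File System itself. The names `Ledger`, `FileSystem`, `BlockFile`, the mount points, and the choice to measure capacity in blocks rather than bytes are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class BlockFile:
    name: str
    blocks: dict = field(default_factory=dict)  # height -> serialized block


@dataclass
class FileSystem:
    mount_point: str
    capacity_blocks: int  # simplification: capacity counted in blocks
    files: list = field(default_factory=list)

    def used(self):
        return sum(len(f.blocks) for f in self.files)


@dataclass
class Ledger:
    blocks_per_file: int = 2
    file_systems: list = field(default_factory=list)
    locations: dict = field(default_factory=dict)  # height -> (fs index, file index)

    def add_file_system(self, fs):
        """Scale out: register a new disk; existing blocks are not touched."""
        self.file_systems.append(fs)

    def append_block(self, height, data):
        """Write to the first file system that still has room."""
        for fs_idx, fs in enumerate(self.file_systems):
            if fs.used() >= fs.capacity_blocks:
                continue
            if not fs.files or len(fs.files[-1].blocks) >= self.blocks_per_file:
                fs.files.append(BlockFile(f"{fs.mount_point}/file_{len(fs.files):04d}"))
            fs.files[-1].blocks[height] = data
            self.locations[height] = (fs_idx, len(fs.files) - 1)
            return
        raise OSError("all file systems are full; add another disk")

    def read_block(self, height):
        fs_idx, file_idx = self.locations[height]
        return self.file_systems[fs_idx].files[file_idx].blocks[height]


ledger = Ledger()
ledger.add_file_system(FileSystem("/mnt/disk0", capacity_blocks=3))
for h in range(3):
    ledger.append_block(h, f"block-{h}".encode())
ledger.add_file_system(FileSystem("/mnt/disk1", capacity_blocks=3))  # disk0 is full
ledger.append_block(3, b"block-3")
print(ledger.locations)
```

Because every block is found through its recorded location, adding `/mnt/disk1` changes nothing about where blocks 0 to 2 live; new blocks simply land on the new disk.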
5.2.1. Meta File System
5.2.2. Architecture of MFS
5.3. Hot–Cold Separation
6. Experiment
6.1. Experiment Setup
6.2. Read–Write Performance Comparison between CMSS and HLF
6.2.1. Write
6.2.2. Key-Exist Read Last Written
6.2.3. Key-Exist Random Read
6.2.4. Key-Not-Exist Read
6.3. Storage Performance Comparison between CMSS and HLF
6.4. Experiment Results
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Nakamoto, S. Bitcoin: A Peer-to-Peer Electronic Cash System. 2008. Volume 4. p. 15. Available online: https://bitcoin.org/bitcoin.pdf (accessed on 24 April 2024).
- Storublevtcev, N. Cryptography in blockchain. In Proceedings of the Computational Science and Its Applications–ICCSA 2019: 19th International Conference, Saint Petersburg, Russia, 1–4 July 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 495–508.
- Gao, Z.; Cao, L.; Du, X. Data Right Confirmation Mechanism Based on Blockchain and Locality Sensitive Hashing. In Proceedings of the 2020 3rd International Conference on Hot Information-Centric Networking (HotICN), Hefei, China, 12–14 December 2020; pp. 1–7.
- Wang, Q.; Li, R.; Wang, Q.; Chen, S. Non-fungible token (NFT): Overview, evaluation, opportunities and challenges. arXiv 2021, arXiv:2105.07447.
- Mukhopadhyay, U.; Skjellum, A.; Hambolu, O.; Oakley, J.; Yu, L.; Brooks, R. A brief survey of cryptocurrency systems. In Proceedings of the 2016 14th Annual Conference on Privacy, Security and Trust (PST), Auckland, New Zealand, 12–14 December 2016; pp. 745–752.
- Wood, G. Ethereum: A secure decentralised generalised transaction ledger. Ethereum Proj. Yellow Pap. 2014, 151, 1–32.
- Etherscan. Ethereum Full Node Sync Archive Chart. 2024. Available online: https://etherscan.io/ (accessed on 24 April 2024).
- Ycharts: Bitcoin Blockchain Size. 2024. Available online: https://ycharts.com/indicators/bitcoin_blockchain_size (accessed on 24 April 2024).
- Zamani, M.; Movahedi, M.; Raykova, M. Rapidchain: Scaling blockchain via full sharding. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, Toronto, ON, Canada, 15–19 October 2018; pp. 931–948.
- Dang, H.; Dinh, T.T.A.; Loghin, D.; Chang, E.C.; Lin, Q.; Ooi, B.C. Towards scaling blockchain systems via sharding. In Proceedings of the 2019 International Conference on Management of Data, Amsterdam, The Netherlands, 30 June–5 July 2019; pp. 123–140.
- Qi, X. S-store: A scalable data store towards permissioned blockchain sharding. In Proceedings of the IEEE INFOCOM 2022—IEEE Conference on Computer Communications, London, UK, 2–5 May 2022; pp. 1978–1987.
- Gaži, P.; Kiayias, A.; Zindros, D. Proof-of-stake sidechains. In Proceedings of the 2019 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 19–23 May 2019; pp. 139–156.
- Yin, L.; Xu, J.; Tang, Q. Sidechains with fast cross-chain transfers. IEEE Trans. Dependable Secur. Comput. 2021, 19, 3925–3940.
- Chainmaker. Available online: https://chainmaker.org.cn/home (accessed on 8 April 2024).
- Luu, L.; Narayanan, V.; Zheng, C.; Baweja, K.; Gilbert, S.; Saxena, P. A secure sharding protocol for open blockchains. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, 24–28 October 2016; pp. 17–30.
- Wang, J.; Wang, H. Monoxide: Scale out blockchains with asynchronous consensus zones. In Proceedings of the 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19), Boston, MA, USA, 26–28 February 2019; pp. 95–112.
- Al-Bassam, M.; Sonnino, A.; Bano, S.; Hrycyszyn, D.; Danezis, G. Chainspace: A sharded smart contracts platform. arXiv 2017, arXiv:1708.03778.
- Hellings, J.; Sadoghi, M. Byshard: Sharding in a byzantine environment. VLDB J. 2023, 32, 1343–1367.
- Hong, Z.; Guo, S.; Zhou, E.; Chen, W.; Huang, H.; Zomaya, A. Gridb: Scaling blockchain database via sharding and off-chain cross-shard mechanism. Proc. VLDB Endow. 2023, 16, 1685–1698.
- Back, A.; Corallo, M.; Dashjr, L.; Friedenbach, M.; Maxwell, G.; Miller, A.; Poelstra, A.; Timón, J.; Wuille, P. Enabling Blockchain Innovations with Pegged Sidechains. 2014, Volume 72, pp. 201–224. Available online: http://kevinriggen.com/files/sidechains.pdf (accessed on 24 April 2024).
- Lerner, S. Drivechains, Sidechains and Hybrid 2-way Peg Designs. Available online: https://docs.rsk.co/Drivechains_Sidechains_and_Hybrid_2-way_peg_Designs_R9.pdf (accessed on 24 April 2024).
- Kiayias, A.; Lamprou, N.; Stouka, A.P. Proofs of proofs of work with sublinear complexity. In Proceedings of the Financial Cryptography and Data Security: FC 2016 International Workshops, BITCOIN, VOTING, and WAHC, Christ Church, Barbados, 26 February 2016; Springer: Berlin/Heidelberg, Germany, 2016; pp. 61–78.
- Bünz, B.; Kiffer, L.; Luu, L.; Zamani, M. Flyclient: Super-light clients for cryptocurrencies. In Proceedings of the 2020 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 18–21 May 2020; pp. 928–946.
- Yin, L.; Xu, J.; Liang, K.; Zhang, Z. Sidechains with optimally succinct proof. IEEE Trans. Dependable Secur. Comput. 2023, 1–15.
- Deng, Z.; Li, T.; Tang, C.; He, D.; Zheng, Z. PSSC: Practical and Secure Sidechains Construction for Heterogeneous Blockchains Orienting IoT. IEEE Internet Things J. 2023, 11, 4600–4613.
- Qi, X.; Zhang, Z.; Jin, C.; Zhou, A. BFT-Store: Storage partition for permissioned blockchain via erasure coding. In Proceedings of the 2020 IEEE 36th International Conference on Data Engineering (ICDE), Dallas, TX, USA, 20–24 April 2020; pp. 1926–1929.
- Du, Z.; Qian, H.f.; Pang, X. Partitionchain: A scalable and reliable data storage strategy for permissioned blockchain. IEEE Trans. Knowl. Data Eng. 2021, 35, 4124–4136.
- Bagozi, A.; Bianchini, D.; De Antonellis, V.; Garda, M.; Melchiori, M. A three-layered approach for designing smart contracts in collaborative processes. In Proceedings of the On the Move to Meaningful Internet Systems: OTM 2019 Conferences: Confederated International Conferences: CoopIS, ODBASE, C&TC 2019, Rhodes, Greece, 21–25 October 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 440–457.
- Bagozi, A.; Bianchini, D.; De Antonellis, V.; Garda, M.; Melchiori, M. A blockchain-based approach for trust management in collaborative business processes. In Proceedings of the Web Information Systems Engineering–WISE 2021: 22nd International Conference on Web Information Systems Engineering, WISE 2021, Melbourne, VIC, Australia, 26–29 October 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 59–67.
- Androulaki, E.; Barger, A.; Bortnikov, V.; Cachin, C.; Christidis, K.; De Caro, A.; Enyeart, D.; Ferris, C.; Laventman, G.; Manevich, Y.; et al. Hyperledger fabric: A distributed operating system for permissioned blockchains. In Proceedings of the Thirteenth EuroSys Conference, Porto, Portugal, 23–26 April 2018; pp. 1–15.
- Goquorum. 2022. Available online: https://github.com/ConsenSys/quorum (accessed on 24 April 2024).
- Preneel, B. Cryptographic hash functions. Eur. Trans. Telecommun. 1994, 5, 431–448.
- Ghemawat, S.; Dean, J. LevelDB. 2011. Available online: https://leveljs.org/ (accessed on 24 April 2024).
- Anderson, J.C.; Lehnardt, J.; Slater, N. CouchDB: The Definitive Guide: Time to Relax; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2010.
- Chodorow, K. Scaling MongoDB: Sharding, Cluster Setup, and Administration; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2011.
- Platt, M.; McBurney, P. Sybil in the haystack: A comprehensive review of blockchain consensus mechanisms in search of strong Sybil attack resistance. Algorithms 2023, 16, 34.
- Zhang, Z.; Liu, X.; Li, M.; Yin, H.; Zhu, L.; Khoussainov, B.; Gai, K. HCA: Hashchain-based Consensus Acceleration via Re-voting. IEEE Trans. Dependable Secur. Comput. 2023, 21, 775–788.
- Zhang, Z.; Feng, K.; Chen, X.; Liu, X.; Sun, H. RHCA: Robust HCA via Consistent Revoting. Mathematics 2024, 12, 593.
- Liu, J.; Li, P.; Cheng, R.; Asokan, N.; Song, D. Parallel and asynchronous smart contract execution. IEEE Trans. Parallel Distrib. Syst. 2021, 33, 1097–1108.
- Cheng, J.C.; Lee, N.Y.; Chi, C.; Chen, Y.H. Blockchain and smart contract for digital certificate. In Proceedings of the 2018 IEEE International Conference on Applied System Invention (ICASI), Chiba, Japan, 13–17 April 2018; pp. 1046–1051.
- Hao, J.; Gao, J.; Xiang, P.; Zhang, J.; Chen, Z.; Hu, H.; Chen, Z. TDID: Transparent and Efficient Decentralized Identity Management with Blockchain. In Proceedings of the 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Oahu, HI, USA, 1–4 October 2023; pp. 1752–1759.
Hot–Cold Separation Rate | Hot Disk Volume | Cold Disk Volume | Original Data Volume |
---|---|---|---|
50% | 160 GB | 32 GB | 320 GB |
90% | 32 GB | 57 GB | 320 GB |
99% | 3.2 GB | 63 GB | 320 GB |
99.9% | 1 GB | 64 GB | 320 GB |
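Cloud Service Provider | Hot Disk Unit Price * | Cold Disk Unit Price * | Hot–Cold Separation Rate | Total Storage Cost * |
---|---|---|---|---|
Tencent Cloud | 1.7600 RMB/GB | 0.0100 RMB/GB | None | 563.20 RMB |
Tencent Cloud | 1.7600 RMB/GB | 0.0100 RMB/GB | 50% | 281.92 RMB |
Tencent Cloud | 1.7600 RMB/GB | 0.0100 RMB/GB | 90% | 56.89 RMB |
Tencent Cloud | 1.7600 RMB/GB | 0.0100 RMB/GB | 99% | 6.26 RMB |
Tencent Cloud | 1.7600 RMB/GB | 0.0100 RMB/GB | 99.9% | 2.40 RMB |
Alibaba Cloud | 1.00 RMB/GB | 0.0075 RMB/GB | None | 320.00 RMB |
Alibaba Cloud | 1.00 RMB/GB | 0.0075 RMB/GB | 50% | 160.24 RMB |
Alibaba Cloud | 1.00 RMB/GB | 0.0075 RMB/GB | 90% | 32.43 RMB |
Alibaba Cloud | 1.00 RMB/GB | 0.0075 RMB/GB | 99% | 3.67 RMB |
Alibaba Cloud | 1.00 RMB/GB | 0.0075 RMB/GB | 99.9% | 1.48 RMB |
Microsoft Azure | 2.2318 RMB/GB | 0.0071 RMB/GB | None | 714.18 RMB |
Microsoft Azure | 2.2318 RMB/GB | 0.0071 RMB/GB | 50% | 357.32 RMB |
Microsoft Azure | 2.2318 RMB/GB | 0.0071 RMB/GB | 90% | 71.82 RMB |
Microsoft Azure | 2.2318 RMB/GB | 0.0071 RMB/GB | 99% | 7.59 RMB |
Microsoft Azure | 2.2318 RMB/GB | 0.0071 RMB/GB | 99.9% | 2.69 RMB |
Amazon Cloud | 1.1519 RMB/GB | 0.0071 RMB/GB | None | 368.61 RMB |
Amazon Cloud | 1.1519 RMB/GB | 0.0071 RMB/GB | 50% | 184.53 RMB |
Amazon Cloud | 1.1519 RMB/GB | 0.0071 RMB/GB | 90% | 37.27 RMB |
Amazon Cloud | 1.1519 RMB/GB | 0.0071 RMB/GB | 99% | 4.13 RMB |
Amazon Cloud | 1.1519 RMB/GB | 0.0071 RMB/GB | 99.9% | 1.61 RMB |
Google Cloud | 1.4687 RMB/GB | 0.0086 RMB/GB | None | 469.98 RMB |
Google Cloud | 1.4687 RMB/GB | 0.0086 RMB/GB | 50% | 235.27 RMB |
Google Cloud | 1.4687 RMB/GB | 0.0086 RMB/GB | 90% | 47.49 RMB |
Google Cloud | 1.4687 RMB/GB | 0.0086 RMB/GB | 99% | 5.24 RMB |
Google Cloud | 1.4687 RMB/GB | 0.0086 RMB/GB | 99.9% | 2.02 RMB |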
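The totals in the cost table appear to follow directly from the disk volumes in the first table: without separation, all 320 GB sit on hot disks, and with separation the cost is hot volume × hot unit price + cold volume × cold unit price. This relationship is inferred from the published numbers rather than stated in the tables, so the short script below should be read as a consistency check; the function and variable names are chosen for the example.

```python
ORIGINAL_GB = 320

# (hot disk GB, cold disk GB) for each hot-cold separation rate, from the first table.
VOLUMES_GB = {"50%": (160, 32), "90%": (32, 57), "99%": (3.2, 63), "99.9%": (1, 64)}

# (hot unit price, cold unit price) in RMB/GB, from the second table.
PRICES_RMB_PER_GB = {
    "Tencent Cloud": (1.7600, 0.0100),
    "Alibaba Cloud": (1.00, 0.0075),
    "Microsoft Azure": (2.2318, 0.0071),
    "Amazon Cloud": (1.1519, 0.0071),
    "Google Cloud": (1.4687, 0.0086),
}


def total_cost(provider, rate=None):
    """Total storage cost in RMB; rate=None means no hot-cold separation."""
    hot_price, cold_price = PRICES_RMB_PER_GB[provider]
    if rate is None:
        return round(ORIGINAL_GB * hot_price, 2)
    hot_gb, cold_gb = VOLUMES_GB[rate]
    return round(hot_gb * hot_price + cold_gb * cold_price, 2)


for provider in PRICES_RMB_PER_GB:
    row = [total_cost(provider)] + [total_cost(provider, r) for r in VOLUMES_GB]
    print(provider, row)
```

For example, Tencent Cloud at a 50% separation rate gives 160 × 1.76 + 32 × 0.01 = 281.92 RMB, which matches the table entry.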
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yang, W.; Ao, M.; Gao, M.; Li, C.; Chen, Y. CMSS: A High-Performance Blockchain Storage System with Horizontal Scaling Support. Electronics 2024, 13, 1854. https://doi.org/10.3390/electronics13101854