NewSQL Databases Assessment: CockroachDB, MariaDB Xpand, and VoltDB
Abstract
:1. Introduction
- Revealing the strengths and weaknesses of NewSQL databases;
- Best database according to the evaluation of the OSSpal methodology;
- Experimental evaluation of NewSQL databases using a standard benchmark;
- Best NewSQL database regarding performance and scalability;
- Limitations in the practical use of NewSQL Databases.
2. Related Work
3. Materials and Methods
3.1. OSSpal Methodology
- Functionality: Investigates the quality of software requirements from the users’ viewpoint;
- Operational Software Characteristics: Evaluates the security, performance, GUI, and how easy the configuration of the software is;
- Support and Service: Analyzes the software support component for commercial or community applications;
- Documentation: Analyzes suitable documentation and tutorial guides for the software;
- Software Technology Attributes: Analyzes how well the software architecture is designed and how modular, portable, flexible, open, and easy to integrate it is.
- Community and Adoption: Examines the implementation of the component by the community, market, and industry. Moreover, evaluates the activity of the community for the software;
- Development Process: Estimates the level of professionalism of the development process and the project as a whole.
- Outline key characteristics to analyze, assigning a score from 1 to 3 (1—slightly important; 2—important; 3—very important);
- Rank the characteristics in a cumulative sum (from 1 to 3);
- Standardize the previous result on a score from 1 to 5.
- <65%: 1—Unacceptable;
- [65–80%]: 2—Poor;
- [80–90%]: 3—Acceptable;
- [90–96%]: 4—Good;[90–96%]: 4—Good;
- >96%: 5—Excellent.
- First step: Select and identify all the software components to be analyzed and measured taking into consideration the evaluation criteria.
- Second step: Attribution of weights for the categories and the measures:
- ○
- Each category is assigned a percentage of importance, making a total of 100%;
- ○
- Each measure within the category is ranked by its importance;
- ○
- Inside the category, assign the importance by percentage, making a total of 100% of all the measures of a category.
- Third step: In each category, score each measure used and assign it a value between 1 and 5, using the following scale: Unacceptable (1), Poor (2), Acceptable (3), Very Good (4), Excellent (5).
- Fourth step: Calculate the final OSSpal value based on the score for each category and the weighting factor.
3.2. Star Schema Benchmark (SSB)
- Query Flight 1: The queries Q1.1, Q1.2, and Q1.3 restrict the revenue, with different ranges and filter factors, to find possible revenue increases. Overall, they calculate the increase in revenue that would have resulted from eliminating certain discounts by a certain percentage for products shipped in a given year.
- Query Flight 2: The queries Q2.1, Q2.2, and Q2.3 restrict the data in two dimensions, by comparing revenues from all orders in all years for suppliers in a given region and for a given product class.
- Query Flight 3: The queries Q3.1, Q3.2, Q3.3, and Q3.4 focus on restrictions in three dimensions, calculate revenue volume over a given time period by customer country, supplier country, and year within a given region.
- Query Flight 4: The queries Q4.1, Q4.2, and Q4.3 represent a ‘What-If’ sequence. It begins with query Q4.1 using a group by on two dimensions and weak constraints on three dimensions, and measure the aggregate profit, which is defined as (lo_revenue—lo_supplycost).
4. NewSQL Databases: CockroachDB, MariaDB Xpand, and VoltDB
4.1. CockroachDB
- SQL Layer: The highest level of abstraction for developers. This layer adds support for a wide range of SQL expressions and syntax from PostgreSQL libraries, with some modifications;
- Distributed Key–Value Store: This layer was implemented as a monolithic sorted map, allowing to run multiple computers in parallel, being able to work with larger datasets;
- Distributed Transactions: It is not necessarily considered a part of the layered architecture, but a necessary component of the system. Implementing distributed transactions allows to connect the layers of the architecture: from SQL to stores and ranges on each node;
- Nodes: They are mostly considered as physical machines, virtual machines, or containers that include stores. The distributed key–value (KV) store routes messages to nodes;
- Store: Each node in the database can contain one or more shops, and in turn, each shop can contain many ranges;
- Range: Ranges are the lowest-level unit of key–value data and every store contains ranges. Each range is used to sort items in specific partitions within the store, using the Raft consensus algorithm. The Raft algorithm is a variant of Paxos, which corresponds to a family of protocols that solves consensus in a network of unreliable processors.
- Cockroach offers customer support with migrations from other databases, as well as helping users with difficulties in using this database;
- CockroachDB can be used anywhere. It can be deployed in virtual machines, containers, Amazon web services, and many other applications;
- CockroachDB maintains data integrity and is able to survive crashes, due to the use of ACID properties;
- CockroachDB offers high performance and availability;
- CockroachDB scales horizontally and offers cloud support;
- CockroachDB has extensive documentation and tutorials guides;
- Supports PostgreSQL libraries to make use of SQL commands.
- CockroachDB does not support database transactions with high complexity, since this database’s purpose is speed;
- To make full use of CockroachDB, we need to pay for the enterprise version, whereas the core version can be used for most purposes with several functionality restrictions;
- Multi-region tables cannot be restored into tables that are not multi-region tables;
- SQL statements comprising numerous subqueries modifying the same table can cause corruption;
- CockroachDB does not support the use of RESTORE with multi-region table localities;
- The SET command, which allows modifying one of the session configuration variables, does not ROLLBACK in a transaction;
- JSONB/JSON comparison operators are not implemented.
4.2. MariaDB Xpand
- It provides distributed SQL capabilities and is ACID-compliant;
- It is highly available due to maintaining replicas of each slice (logical representation of data that are saved in a partition of a disk, which contains pieces of user database and tables), allowing to recover from a node failure without losing data;
- It can maintain multiple replicas of each slice and is zone-aware, allowing to recover from multi-node failures or zone failures without losing data;
- Its rebalancer maintains data distribution, meaning that a node or zone failure causes the creation of new replicas for each slice, and the rebalance then redistributes the data;
- Performs operations in parallel through the nodes to have the latest data;
- It scales out because each node can read and write, plus reads are lockless. Writes do not block reads, and additional nodes can be added to increase capacity.
- Difficulty with migrations with modern software;
- Recovery of crash from data replication takes too long;
- No option to export stored procedures’ query results.
4.3. VoltDB
- Snapshots;
- Command Logging;
- K-safety;
- Database Replication.
- (1)
- Shows the number of nodes associated with the current database. It allows to add or remove nodes from the database;
- (2)
- Configuration properties of the current database;
- (3)
- VoltDB displays CPU, memory, latency, and transaction metrics to help the user analyze the execution of the database.
- In-memory storage: It uses synchronous replication for durability;
- Designed for online/operational transaction processing, OLTP;
- Optimized performance: Each execution engine is single-threaded. It removes latches, locks, and buffer pool management in order to eliminate overhead and legacy architecture systems;
- VoltDB allows to scale without much difficulty;
- Distributed database: In this case, the tables are partitioned, with one partition per node;
- VoltDB uses Java stored procedures to eliminate client–server round-trips;
- VoltDB provides several options for backing up database data and schema;
- High availability: VoltDB contains three capabilities, namely K-safety, network fault tolerance, and live node rejoin.
- Not optimized for OLAP, since VoltDB is focused on OLTP, and is not good at solving problems that require large database column scans;
- Not optimized to return an excessive amount of data from stored procedures;
- The database must fit into the available memory of the system;
- Not optimized to work with complex queries and huge tables;
- No support for foreign keys and check constraints;
- No Windows version available.
4.4. Summary of NewSQL Databases Characteristics
5. Experimental Evaluation
5.1. OSSpal Evaluation
- In the ‘Functionality’ category, MariaDB Xpand and VoltDB stand out with scores of 4, which means that they have almost all features that we considered important in evaluating NewSQL databases;
- For ‘Operational Software Characteristics’, CockroachDB has the highest score;
- In ‘Documentation’, CockroachDB has the highest score by having very good documentation available for users;
- In ‘Support and Services’, every single database has a score of 5, which means that all databases have many people who require their services for training and webinars for new updates;
- Finally, in ‘Software Technology Attributes’, CockroachDB has the maximum score.
5.2. Performance Evaluation with Star Schema Benchmark
- Intel(R) Core (TM) i7-10700 CPU @ 2.90 GHz;
- 32 GB of RAM;
- SSD disk with 496 GB;
- VirtualBox with Operating System Ubuntu 20.04, 7 virtual cores, 28 GB virtual RAM, and 200 GB of allocated disk.
5.2.1. Data File Import with SF = 1 and SF = 10
5.2.2. SSB Performance Evaluation with SF = 1
5.2.3. SSB Performance Evaluation with SF = 10
6. Discussion
7. Conclusions and Future Work
Author Contributions
Funding
Conflicts of Interest
References
- Grolinger, K.; Higashino, W.A.; Tiwari, A.; Capretz, M.A. Data management in cloud environments: NoSQL and NewSQL data stores. J. Cloud Comput. 2013, 2, 24. [Google Scholar] [CrossRef] [Green Version]
- Date, C.J. E. F. Codd and Relational Theory; Technics Publications: Basking Ridge, NJ, USA, 2022; pp. 1–404. [Google Scholar]
- Venkatraman, S.; Fahd, K.; Kaspi, S.; Venkatraman, R. SQL versus NoSQL movement with big data analytics. Int. J. Inf. Technol. Comput. Sci. 2016, 8, 59–66. [Google Scholar] [CrossRef] [Green Version]
- Sharma, N. Overview of the Database Management System. Int. J. Adv. Res. Comput. Sci. 2017, 8, 362–369. [Google Scholar]
- Abramova, V.; Bernardino, J.; Furtado, P. Experimental Evaluation of NoSQL Databases. Int. J. Database Manag. Syst. 2014, 6, 1–16. [Google Scholar] [CrossRef]
- Chaudhry, N.; Yousaf, M.M. Architectural assessment of NoSQL and NewSQL systems. Distrib. Parallel Databases 2020, 38, 881–926. [Google Scholar] [CrossRef]
- Stonebraker, M. Newsql: An alternative to nosql and old sql for new oltp apps. Commun. ACM 2012, 6–7. [Google Scholar]
- Valduriez, P.; Jiménez-Peris, R.; Özsu, M.T. Distributed database systems: The case for NewSQL. In Transactions on Large-Scale Data-and Knowledge-Centered Systems XLVIII; Hameurlain, A., Tjoa, A.M., Eds.; Springer: Berlin/Heidelberg, Germany, 2021; Volume 12670, pp. 1–15. [Google Scholar]
- Matthew, A. How Will the Database Incumbents Respond to NoSQL and NewSQL. Available online: https://15799.courses.cs.cmu.edu/fall2013/static/papers/aslett-newsql.pdf (accessed on 26 July 2022).
- DB-Engines Ranking. Available online: https://db-engines.com/en/ranking (accessed on 13 September 2022).
- Cockroach Labs, The Company Building CockroachDB. Available online: https://www.cockroachlabs.com/docs/releases/v21.2.html (accessed on 22 December 2022).
- MariaDB Xpand: Distributed SQL Database. Available online: https://mariadb.com/docs/xpand/release-notes/mariadb-xpand-6/6-0-3/#Installation_Instructions (accessed on 22 December 2022).
- Volt Active Data: Because Milliseconds Matter. Available online: https://docs.voltactivedata.com/v11docs/UsingVoltDB/ (accessed on 22 December 2022).
- Wasserman, A.I.; Guo, X.; McMillian, B.; Qian, K.; Wei, M.Y.; Xu, Q. OSSpal: Finding and evaluating open source software. In Proceedings of the IFIP International Conference on Open Source Systems, Buenos Aires, Argentina, 22–23 May 2017; Volume 496, pp. 193–203. [Google Scholar]
- Calçada, A.; Bernardino, J. Evaluation of Couchbase, CouchDB and MongoDB using OSSpal. In KDIR; SCITEPRESS—Science and Technology Publications: Setúbal, Portugal, 2019; pp. 427–433. [Google Scholar]
- Leite, N.; Pedrosa, I.; Bernardino, J. Open Source Business Intelligence Platforms Assessment using OSSpal Methodology. In Proceedings of the 15th International Joint Conference on e-Business and Telecommunications (ICETE), Porto, Portugal, 26–28 July 2018; pp. 356–362. [Google Scholar]
- António, O.; Jorge, B. OSSPal Assessment of Self-Service BI and Analytics Software. In Proceedings of the CAPSI 2020, Porto, Portugal, 11–12 October 2020; p. 23. Available online: https://aisel.aisnet.org/capsi2020/23 (accessed on 22 December 2022).
- Ferreira, T.; Pedrosa, I.; Bernardino, J. Integration of Business Intelligence with e-commerce. In Proceedings of the 14th Iberian Conference on Information Systems and Technologies (CISTI), Coimbra, Portugal, 19–22 June 2019; pp. 1–7. [Google Scholar] [CrossRef]
- Cardoso, T.; Penela, J.; Rosa, A.; Wanzeller, C.; Martins, P.; Abbasi, M. OSSpal Qualitative and Quantitative Comparison: Couchbase, CouchDB, and MongoDB. In Marketing and Smart Technologies; Springer: Singapore, 2022; pp. 141–150. [Google Scholar]
- O’Neil, P.; O’Neil, E.; Chen, X.; Revilak, S. The star schema benchmark and augmented fact table indexing. In TCPEB; Springer: Berlin/Heidelberg, Germany, 2009; pp. 237–252. [Google Scholar]
- Astrova, I.; Koschel, A.; Wellermann, N.; Klostermeyer, P. Performance Benchmarking of NewSQL Databases with Yahoo Cloud Serving Benchmark. In Proceedings of the Future Technologies Conference, Virtual, 5–6 November 2020; pp. 271–281. [Google Scholar]
- Kaur, K.; Sachdeva, M. Performance evaluation of NewSQL databases. In ICISC; IEEE: Piscataway, NJ, USA, 2017; pp. 1–5. [Google Scholar]
- Murazzo, M.; Gómez, P.; Rodríguez, N.; Medel, D. Database NewSQL Performance Evaluation for Big Data in the Public Cloud. In Cloud Computing and Big Data; Naiouf, M., Chichizola, F., Rucci, E., Eds.; JCC&BD 2019. Communications in Computer and Information Science; Springer: Cham, Switzerland, 2019; Volume 1050. [Google Scholar] [CrossRef] [Green Version]
- Oliveira, J.; Bernardino, J. NewSQL Databases—MemSQL and VoltDB Experimental Evaluation. In Proceedings of the 9th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management—KEOD, Madeira, Portugal, 1–3 October 2017; pp. 276–281, ISBN 978-989-758-272-1. [Google Scholar] [CrossRef]
- Knob, R.; Schreiner, G.; Frozza, A.; Mello, R. Uma Análise de Soluções NewSQL. In Anais da XV Escola Regional de Banco de Dados; SBC: Porto Alegre, Brazil, 2019; pp. 21–30. [Google Scholar] [CrossRef]
- Hahn, S.M.L.; Chereja, I.; Matei, O. Comparison of the Performance of NewSQL Databases Based on Linux OS. In Data Science and Intelligent Systems. CoMeSySo 2021; Silhavy, R., Silhavy, P., Prokopova, Z., Eds.; Lecture Notes in Networks and Systems; Springer: Cham, Switzerland, 2021. [Google Scholar] [CrossRef]
- Sanchez, J. A review of star schema benchmark. arXiv 2016, arXiv:1606.00295. [Google Scholar]
- Taft, R.; Sharif, I.; Matei, A.; VanBenschoten, N.; Lewis, J.; Grieger, T.; Niemi, K.; Woods, A.; Birzin, A.; Poss, R.; et al. CockroachDB: The resilient geo-distributed sql database. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, New York, NY, USA, 14–19 June 2020; pp. 1493–1509. [Google Scholar]
- The New Stack: Meet CockroachDB, the Resilient SQL Database. Available online: https://www.cockroachlabs.com/blog/the-new-stack-meet-cockroachdb-the-resilient-sql-database/ (accessed on 19 July 2022).
- Google Spanner Inspires CockroachDB to Outrun It. Available online: https://www.nextplatform.com/2017/02/22/google-spanner-inspires-cockroachdb-outrun/ (accessed on 30 August 2022).
- Start a Local Cluster (Insecure). Available online: https://www.cockroachlabs.com/docs/stable/start-a-local-cluster.html (accessed on 30 August 2022).
- Known Limitations in CockroachDB v22.1. Available online: https://www.cockroachlabs.com/docs/stable/known-limitations.html#a-multi-region-table-cannot-be-restored-into-a-non-multi-region-table (accessed on 9 October 2022).
- MariaDB Corporation. Evaluating MariaDB Xpand and Cockroach with Sysbench. [White Paper] March. 2022. Available online: https://go.mariadb.com/22Q2-WC-GLBL-DBaaS-Xpand-vs-CockroachDB-with-Sysbench-DB1139_LP-Registration.html (accessed on 22 December 2022).
- Deploy Xpand Performance Topology—MariaDB. Available online: https://mariadb.com/docs/deploy/topologies/xpand-performance/xpand-6/ (accessed on 19 July 2022).
- Deploy Xpand Topology—Enterprise Documentation—MariaDB. Available online: https://mariadb.com/docs/deploy/topologies/xpand/xpand-6/ (accessed on 19 July 2022).
- Architecture of MariaDB Xpand. Available online: https://mariadb.com/docs/architecture/components/xpand/ (accessed on 16 July 2022).
- MariaDB Xpand Reviews, Ratings & Features 2022—Gartner. Available online: https://www.gartner.com/reviews/market/cloud-database-management-systems/vendor/mariadb/product/mariadb-xpand (accessed on 18 July 2022).
- Stonebraker, M.; Weisberg, A. The VoltDB Main Memory DBMS. IEEE Data Eng. Bull. 2013, 36, 21–27. [Google Scholar]
- ODBMS, VoltDB Technical Overview. Available online: http://www.odbms.org/wp-content/uploads/2013/11/VoltDBTechnicalOverview.pdf (accessed on 19 July 2022).
- Using the VoltDB Deployment Manager Web Interface. Available online: https://docs.voltdb.com/v7docs/AdminGuide/DepployWebUI.php (accessed on 20 July 2022).
- Almassabi, A.; Bawazeer, O.; Adam, S. Top NewSQL databases and features classification. Int. J. Database Manag. Syst. 2018, 10, 11–31. [Google Scholar] [CrossRef]
- VoltDB Active Data Documentation Administrator’s Guide. Available online: https://docs.voltdb.com/AdminGuide/ (accessed on 10 October 2022).
CockroachDB | MariaDB Xpand | VoltDB | |
---|---|---|---|
Developer | Cockroach Labs | Cluxtrix Inc. | VoltDB Inc. |
Creation Date | 2015 | 2018 | 2010 |
License | Open Source | Commercial | Open Source |
Latest Release | v22.1.3, 11 July 2022 | v6.0.3, 10 March 2022 | v11.3, 8 February 2022 |
Initial Release | 2015 | 2020 | 2010 |
Data Model | Relational | Relational | Relational |
Server Operating Systems | Linux, macOS, and Windows | Linux, Windows, macOS | Linux, macOS |
Data Schema | Dynamic | Yes | Yes |
Partitioning method | Horizontal, by key range | Range | Sharding |
Replication Method | Multi-source using RAFT | Parallel Streaming | Source and Multi-Source Replication |
SQL Support | Yes | Yes | Yes |
Transaction Properties | ACID | ACID | ACID |
Implemented Language | Go, Typescript, Starlark, and Yacc | C | Java, C++ |
Supported Languages | C#, C++, Clojure, Go, Java, JavaScript (Node.js), PHP, Python, Ruby, and Rust | C, C++, Java, Javascript, Python | Java, Python, Ruby, Scala, Javascript |
Categories | Weight |
---|---|
Functionality | 35% |
Operational Software Characteristics | 25% |
Documentation | 15% |
Support and Service | 15% |
Software Technology Attributes | 10% |
Metrics | Weight | CockroachDB | MariaDB Xpand | VoltDB |
---|---|---|---|---|
OLTP | 3 | 3 | 3 | 3 |
CRUD | 3 | 3 | 3 | 3 |
Replication | 3 | 2 | 2 | 3 |
Partitioning | 3 | 3 | 3 | 3 |
Consistency | 3 | 3 | 3 | 3 |
Crash Recovery | 3 | 3 | 3 | 3 |
Triggers (SQL) | 2 | 0 | 2 | 0 |
Indexes (SQL) | 2 | 2 | 2 | 2 |
PL/SQL functions | 2 | 1 | 2 | 2 |
Cloud Support | 1 | 1 | 1 | 1 |
Total | 25 | 21 | 24 | 23 |
% Score | 84% | 96% | 92% | |
Final Score | 3 | 4 | 4 |
Categories | Weight | CockroachDB | MariaDB Xpand | VoltDB |
---|---|---|---|---|
Functionality | 35% | 3 | 4 | 4 |
Operational Software Characteristics | 25% | 4 | 3.75 | 3.5 |
Documentation | 15% | 5 | 4.65 | 4.65 |
Support and Services | 15% | 5 | 5 | 5 |
Software Technology Attributes | 10% | 4.35 | 3.75 | 3.75 |
Total | 3.99 | 4.16 | 4.10 |
SF = 1 | SF = 10 | |||
---|---|---|---|---|
Tables | File Size | Rows | File Size | Rows |
customer | 3.2 MB | 30,000 | 32.6 MB | 300,000 |
date | 278.2 KB | 2557 | 278.2 KB | 2557 |
part | 19.7 MB | 200,000 | 79.3 MB | 800,000 |
supplier | 188.7 KB | 2000 | 1.9 MB | 20,000 |
lineorder | 641.1 MB | 6,001,173 | 6.6 GB | 59,986,217 |
Total | ∼1 GB | 6,235,730 | ∼7 GB | 61,108,774 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Pina, E.; Sá, F.; Bernardino, J. NewSQL Databases Assessment: CockroachDB, MariaDB Xpand, and VoltDB. Future Internet 2023, 15, 10. https://doi.org/10.3390/fi15010010
Pina E, Sá F, Bernardino J. NewSQL Databases Assessment: CockroachDB, MariaDB Xpand, and VoltDB. Future Internet. 2023; 15(1):10. https://doi.org/10.3390/fi15010010
Chicago/Turabian StylePina, Eduardo, Filipe Sá, and Jorge Bernardino. 2023. "NewSQL Databases Assessment: CockroachDB, MariaDB Xpand, and VoltDB" Future Internet 15, no. 1: 10. https://doi.org/10.3390/fi15010010
APA StylePina, E., Sá, F., & Bernardino, J. (2023). NewSQL Databases Assessment: CockroachDB, MariaDB Xpand, and VoltDB. Future Internet, 15(1), 10. https://doi.org/10.3390/fi15010010