The Evolution and Challenges of Real-Time Big Data: A Review
Abstract
1. Introduction
2. Methodology
3. Background
3.1. Data Flow Management
3.1.1. Apache Kafka
3.1.2. ETL Stream
3.2. Real-Time Flow Processing
3.2.1. Apache Flink
3.2.2. Apache Storm
3.3. Real-Time Analysis with Apache Spark Streaming
3.4. NoSQL Database for Real-Time
3.4.1. Cassandra
3.4.2. HBase
3.5. Machine Learning and AI for Big Data
4. Literature Review
5. Discussion
5.1. Recent Technological Advances
5.2. Remaining Challenges and Limitations
5.3. Recommendations and Future Directions
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
IoT | Internet of Things
IT | Information Technology
ETL | Extract, Transform, Load
AI | Artificial Intelligence
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Lalaoui, I.L.; El Haji, E.; Kounaidi, M. The Evolution and Challenges of Real-Time Big Data: A Review. Comput. Sci. Math. Forum 2025, 10, 11. https://doi.org/10.3390/cmsf2025010011
Chicago/Turabian Style: Lalaoui, Ikram Lefhal, Essaid El Haji, and Mohamed Kounaidi. 2025. "The Evolution and Challenges of Real-Time Big Data: A Review" Computer Sciences & Mathematics Forum 10, no. 1: 11. https://doi.org/10.3390/cmsf2025010011
APA Style: Lalaoui, I. L., El Haji, E., & Kounaidi, M. (2025). The Evolution and Challenges of Real-Time Big Data: A Review. Computer Sciences & Mathematics Forum, 10(1), 11. https://doi.org/10.3390/cmsf2025010011