Exploring the Evolution of Big Data Technologies: A Systematic Literature Review of Trends, Challenges, and Future Directions
Abstract
1. Introduction
2. Background on Big Data
2.1. Historical Development of Big Data Technologies
2.2. Key Components of Big Data Ecosystem
2.3. Overview of Key Big Data Algorithms and Their Characteristics
Algorithm | Source | Type | Applications | Scalability | Performance | Advantages | Challenges |
---|---|---|---|---|---|---|---|
MapReduce | Ejimofor & Okonkwo [36] | Distributed Computing | Big Data processing | Highly scalable (horizontal scaling) | Efficient for large datasets | Scalable and fault-tolerant | Complexity in programming model |
Hadoop | Y. Li & Hei [37] | Distributed Framework | Data storage and processing | High scalability with distributed storage | Handles large volumes of data | Open-source and widely adopted | Requires significant resources |
Spark | Y. Li & Hei [37] | Distributed Computing | Real-time data processing | Highly scalable (in-memory) | Faster due to in-memory processing | Supports various data processing tasks | Memory consumption can be high |
K-Means | Liang et al. [39] | Clustering | Customer segmentation, image compression | Moderate scalability (depending on implementation) | Efficient for small to medium datasets | Simple and easy to implement | Sensitive to initial centroids |
DBSCAN | Liang et al. [39] | Clustering | Spatial data analysis, anomaly detection | Moderate scalability (due to spatial indexing) | Can find arbitrarily shaped clusters | No need to specify number of clusters | Struggles with varying density clusters |
Apriori | Liang et al. [39] | Association Rule Learning | Market basket analysis | Low scalability (explosive growth of candidate sets) | Effective for small datasets | Simple and interpretable | Inefficient for large datasets due to combinatorial explosion |
Random Forest | Gao et al. [40] | Ensemble Learning | Classification and regression tasks | High scalability (parallelizable) | Robust against overfitting | Handles high dimensionality well | Can be less interpretable than simpler models |
Gradient Boosting (XGBoost) | Adankon et al. [42]; Liang et al. [39] | Ensemble Learning | Classification, regression, ranking | Highly scalable (with optimizations for distributed data) | High accuracy and efficiency | Handles missing values well | Sensitive to overfitting without tuning |
Principal Component Analysis (PCA) | Liang et al. [39] | Dimensionality Reduction | Data visualization, noise reduction | Scalable with optimizations | Reduces dimensionality effectively | Enhances interpretability of data | Assumes linear relationships |
Naive Bayes | Liang et al. [39] | Classification | Text classification, spam detection | High scalability (linear) | Fast and efficient | Works well with large datasets | Assumes feature independence |
Deep Learning (Neural Networks) | Liang et al. [39] | Machine Learning | Image recognition, natural language processing | High scalability (with GPU and parallel processing) | High accuracy with large datasets | Learns complex patterns | Requires large amounts of data and computational power |
Support Vector Machine (SVM) | Adankon et al. [42] | Classification | Image classification, bioinformatics | Low to moderate scalability (depends on dataset size) | Effective in high-dimensional spaces | Robust against overfitting | Memory-intensive for large datasets |
Recurrent Neural Networks (RNN) | Liang et al. [39] | Deep Learning | Time series prediction, language modeling | High scalability (with optimizations like LSTM) | Captures temporal dependencies | Suitable for sequential data | Difficult to train and tune |
Logistic Regression | Liang et al. [39] | Regression | Binary classification tasks | Highly scalable (linear scalability) | Simple and interpretable | Efficient for binary outcomes | Assumes a linear relationship between the features and the log-odds of the outcome |
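To make the MapReduce row of the table above more concrete, the following is a minimal single-process sketch of the map, shuffle, and reduce phases applied to word counting. It is an illustration only: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and do not come from Hadoop or any other framework, and a real deployment would distribute the input splits across worker nodes rather than loop over them in one process.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(document: str) -> Iterator[tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input split."""
    for word in document.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Group all emitted values by key, as a framework would between phases."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: dict[str, list[int]]) -> dict[str, int]:
    """Sum the grouped values for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents: list[str]) -> dict[str, int]:
    # Each document stands in for an input split handled by one mapper.
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return reduce_phase(shuffle_phase(pairs))


if __name__ == "__main__":
    splits = ["big data big insights", "data lakes store raw data"]
    print(word_count(splits))
```

Because each call to `map_phase` depends only on its own split, and each key is reduced independently, both phases can in principle run in parallel, which is the property behind the horizontal scalability noted in the table.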
3. Materials and Methods
3.1. Search Strategy
- English language only
- Peer-reviewed journal articles
- Full-text availability
- Subject areas: computer science, engineering, information systems, business, and management
- Publication years: 2015–2024
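The search filters listed above can be expressed as a simple screening predicate. The sketch below is an assumption-laden illustration, not the authors' actual tooling: the `Record` fields and the `meets_search_criteria` function are hypothetical, and real database exports use different field names and subject vocabularies.

```python
from dataclasses import dataclass

# Subject areas taken from the search criteria listed above.
ALLOWED_SUBJECTS = {
    "computer science",
    "engineering",
    "information systems",
    "business",
    "management",
}


@dataclass
class Record:
    # Hypothetical fields; actual export formats name these differently.
    title: str
    language: str
    document_type: str
    peer_reviewed: bool
    full_text_available: bool
    subject_area: str
    year: int


def meets_search_criteria(record: Record) -> bool:
    """Return True only if the record satisfies every listed filter."""
    return (
        record.language.lower() == "english"
        and record.peer_reviewed
        and record.document_type.lower() == "journal article"
        and record.full_text_available
        and record.subject_area.lower() in ALLOWED_SUBJECTS
        and 2015 <= record.year <= 2024
    )


if __name__ == "__main__":
    candidate = Record(
        title="A hypothetical Big Data study",
        language="English",
        document_type="Journal Article",
        peer_reviewed=True,
        full_text_available=True,
        subject_area="Computer Science",
        year=2019,
    )
    print(meets_search_criteria(candidate))
```

Encoding the criteria as one conjunction keeps the screening step auditable: a record is excluded as soon as any single filter fails.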
3.2. Bibliometric Analysis
3.3. Study Selection
- Were peer-reviewed journal articles
- Focused on Big Data technologies or their applications
- Were published in English
3.4. Data Extraction
- Adoption of Big Data technologies
- Application domains (e.g., healthcare, finance, logistics)
- Methodological approaches used in the studies
- Challenges reported in the adoption and use of Big Data
- Bibliometric indicators, such as citation counts
3.5. Risk of Bias and Confidence in Findings
4. Trends in Big Data Technologies
Trend | Author/Year | Description | Example | Implication |
---|---|---|---|---|
Real-Time Data Processing | Dubuc et al. [15]; Jabbar et al. [7]; Mir [59] | Real-time data processing allows organizations to analyze and act on data instantly, as it is generated, improving operational efficiency and decision-making. | Apache Kafka and Flink used in banking for fraud detection. | Increases demand for infrastructure that can handle high-speed data, impacting industries like finance and healthcare. |
AI and ML Integration | Kumar & Singh [21] | The integration of AI and ML into Big Data systems automates the analysis of large datasets, offering predictive insights and advanced decision-making capabilities. | IBM Watson in healthcare for disease diagnosis based on large datasets. | Enhances decision-making across industries but raises concerns about bias and the need for ethical guidelines in AI use. |
Edge Computing | Rathore et al. [60]; Hamdan et al. [61] | Edge computing processes data closer to its source, reducing latency and improving bandwidth efficiency, especially in IoT applications and real-time systems. | Edge computing in autonomous vehicles for real-time decision-making. | Enables the growth of IoT applications, but challenges include security and infrastructure upgrades. |
Cloud-Based Big Data Solutions | Tuli et al. [62] | Cloud solutions provide scalable and flexible platforms for Big Data processing, reducing the need for on-premise hardware and allowing organizations to scale on-demand. | Netflix using AWS for user preference analysis and content recommendations. | Accelerates the adoption of Big Data by reducing entry barriers for organizations, though security concerns remain significant. |
Data Privacy and Security | Bansal et al. [63]; Yang et al. [64] | Ensuring data privacy and security is critical as the volume of data grows. Solutions include encryption, access controls, and compliance with regulations like GDPR. | Financial institutions using homomorphic encryption for secure computations on encrypted data. | Stricter regulations push industries to adopt stronger security measures, but compliance costs can be burdensome for businesses. |
Data Lakes and Data Warehouses | Nambiar & Mundra [54]; Saddad et al. [17] | Data lakes store large volumes of raw, unstructured data, while data warehouses are optimized for querying structured data. Organizations often use both for comprehensive analytics. | Amazon S3 (data lake) and Redshift (data warehouse) for storing and analyzing different types of data. | A hybrid approach allows for more comprehensive analytics but requires careful data management to avoid inefficiencies. |
Data Governance and Ethics | Kroll [56]; Micheli et al. [41] | Data governance ensures the integrity, quality, and security of data across its lifecycle, while ethical guidelines prevent bias and ensure fairness in data and AI models. | Microsoft’s AI ethics board ensuring fairness in AI systems. | Drives the need for transparent and responsible use of Big Data and AI but may slow innovation in highly regulated sectors. |
Data Democratization | Wang et al. [57] | Data democratization enables non-technical users to access and analyze data, fostering innovation by empowering employees across an organization to make data-driven decisions. | Tableau and Power BI empowering business users to create their own reports. | Facilitates innovation by enabling widespread access to data but requires robust training and security measures to avoid misuse. |
Sector | Sources | Applications | Description | Challenges | Opportunities | Implications |
---|---|---|---|---|---|---|
Healthcare | Gomes et al. [6]; Buck et al. [69] | Predictive analytics, personalized medicine, disease surveillance, electronic health records (EHRs), telemedicine and remote monitoring | Integration of technology to improve patient care and streamline processes. | Data privacy, resistance to change, high costs of implementation. | Enhanced patient engagement, improved access to care, cost reduction. | Improved health outcomes, increased efficiency in healthcare delivery. |
Financial | Agustí & Orta-Pérez [70]; Ochuba et al. [71]; Rani et al. [72]; Karim et al. [73] | Fraud detection and risk management, algorithmic trading, customer segmentation and targeting, mobile banking, blockchain technology, and robo-advisors | Use of technology to enhance financial services and customer experience. | Cybersecurity threats, regulatory compliance, technology adoption. | Increased financial inclusion, reduced transaction costs, innovation. | Greater economic stability, improved access to financial services. |
Smart Cities | Chang [50]; Waterson et al. [74] | Traffic management and optimization, energy management systems, public safety and emergency response, IoT applications, and smart transportation | Integration of technology to enhance urban living and sustainability. | Infrastructure costs, data management, public acceptance. | Improved urban planning, enhanced quality of life for residents. | Sustainable urban development, increased efficiency in resource management. |
Education | Ang et al. [28]; Hamad [75]; Ikegwu et al. [76] | Learning analytics, institutional performance assessment, online learning platforms, AI tutors, and virtual classrooms | Technology-enhanced learning environments to improve educational outcomes. | Digital divide, resistance from traditional educators, funding issues. | Broader access to education, personalized learning experiences. | Improved educational attainment, workforce readiness. |
Marketing | Jabbar et al. [7]; Tran et al. [77] | Consumer behavior analysis, social media analytics, market and customer segmentation, digital marketing, and personalized advertising | Leveraging data analytics to target consumers effectively. | Data privacy concerns, rapid technological changes, market saturation. | Enhanced customer engagement, improved ROI on marketing campaigns. | Shift in consumer behavior, increased competition among brands. |
5. Challenges and Opportunities in Big Data
5.1. Data Privacy and Security
5.2. Ethical Considerations
5.3. Scalability Issues
6. Societal Implications
7. Mapping the Global Dynamics of Big Data Research: A Visual Bibliometric Analysis
8. Discussion
8.1. Insights from SLR
8.2. Insights from the Bibliometric Analysis
9. Research Limitations and Future Directions
- Expand the sample size for greater statistical reliability.
- Include empirical investigations across varied sectors and regions.
- Explore trends beyond 2024 to capture cutting-edge advancements.
- Address technological challenges such as scalability, interoperability, and ethical governance, especially as Big Data, blockchain, and quantum computing (QC) continue to evolve.
10. Conclusions
Supplementary Materials
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Amanullah, M.A.; Habeeb, R.A.A.; Nasaruddin, F.H.; Gani, A.; Ahmed, E.; Nainar, A.S.M.; Akim, N.M.; Imran, M. Deep learning and big data technologies for IoT security. Comput. Commun. 2020, 151, 495–517.
2. Arena, F.; Pau, G. An overview of big data analysis. Bull. Electr. Eng. Inform. 2020, 9, 1646–1653.
3. Tariq, M.U.; Babar, M.; Poulin, M.; Khattak, A.S.; Alshehri, M.D.; Kaleem, S. Human Behavior Analysis Using Intelligent Big Data Analytics. Front. Psychol. 2021, 12, 686610.
4. Mishra, D.; Luo, Z.; Jiang, S.; Papadopoulos, T.; Dubey, R. A bibliographic study on big data: Concepts, trends and challenges. Bus. Process Manag. J. 2017, 23, 555–573.
5. Nagaraj, K.; Sharvani, G.S.; Sridhar, A. Emerging trend of big data analytics in bioinformatics: A literature review. Int. J. Bioinform. Res. Appl. 2018, 14, 144–205.
6. Gomes, M.A.S.; Kovaleski, J.L.; Pagani, R.N.; da Silva, V.L.; Pasquini, T.C.d.S. Transforming healthcare with big data analytics: Technologies, techniques and prospects. J. Med. Eng. Technol. 2023, 47, 1–11.
7. Jabbar, A.; Akhtar, P.; Dani, S. Real-time big data processing for instantaneous marketing decisions: A problematization approach. Ind. Mark. Manag. 2020, 90, 558–569.
8. Cremin, C.J.; Dash, S.; Huang, X. Big data: Historic advances and emerging trends in biomedical research. Curr. Res. Biotechnol. 2022, 4, 138–151.
9. Korherr, P.; Kanbach, D. Human-related capabilities in big data analytics: A taxonomy of human factors with impact on firm performance. Rev. Manag. Sci. 2023, 17, 1943–1970.
10. Pawar, P.V.; Paluri, R.A. Big Data Analytics in Logistics and Supply Chain Management: A Review of Literature. Vision 2022, 1–20.
11. Ranjan, J.; Foropon, C. Big Data Analytics in Building the Competitive Intelligence of Organizations. Int. J. Inf. Manag. 2021, 56, 102231.
12. Ajah, I.A.; Nweke, H.F. Big data and business analytics: Trends, platforms, success factors and applications. Big Data Cogn. Comput. 2019, 3, 32.
13. Agarwal, P.; Alam, M. Exploring Quantum Computing to Revolutionize Big Data Analytics for Various Industrial Sectors. In Big Data Analytics; Auerbach Publications: Boca Raton, FL, USA, 2021.
14. Agrawal, R.; Wankhede, V.A.; Kumar, A.; Luthra, S.; Huisingh, D. Big data analytics and sustainable tourism: A comprehensive review and network based analysis for potential future research. Int. J. Inf. Manag. Data Insights 2022, 2, 100122.
15. Dubuc, T.; Stahl, F.; Roesch, E.B. Mapping the Big Data Landscape: Technologies, Platforms and Paradigms for Real-Time Analytics of Data Streams. IEEE Access 2021, 9, 15351–15374.
16. Morawiec, P.; Sołtysik-Piorunkiewicz, A. Cloud Computing, Big Data, and Blockchain Technology Adoption in ERP Implementation Methodology. Sustainability 2022, 14, 3714.
17. Saddad, E.; El-Bastawissy, A.; Mokhtar, H.M.O.; Hazman, M. Lake data warehouse architecture for big data solutions. Int. J. Adv. Comput. Sci. Appl. 2020, 11, 417–424.
18. Bazzaz Abkenar, S.; Haghi Kashani, M.; Mahdipour, E.; Jameii, S.M. Big data analytics meets social media: A systematic review of techniques, open issues, and future directions. Telemat. Inform. 2021, 57, 101517.
19. Naeem, M.; Jamal, T.; Diaz-Martinez, J.; Butt, S.A.; Montesano, N.; Tariq, M.I.; De-la-Hoz-Franco, E.; De-La-Hoz-Valdiris, E. Trends and Future Perspective Challenges in Big Data. In Smart Innovation, Systems and Technologies; Springer: Singapore, 2022; Volume 253.
20. Lu, Y. Artificial intelligence: A survey on evolution, models, applications and future trends. J. Manag. Anal. 2019, 6, 1–29.
21. Kumar, S.; Singh, M. Big data analytics for healthcare industry: Impact, applications, and tools. Big Data Min. Anal. 2019, 2, 48–57.
22. Jagatheesaperumal, S.K.; Rahouti, M.; Ahmad, K.; Al-Fuqaha, A.; Guizani, M. The Duo of Artificial Intelligence and Big Data for Industry 4.0: Applications, Techniques, Challenges, and Future Research Directions. IEEE Internet Things J. 2022, 9, 12861–12885.
23. Swazan, I.S.; Das, D. Bangladesh’s Emergence as a Ready-Made Garment Export Leader: An Examination of the Competitive Advantages of the Garment Industry. Int. J. Glob. Bus. Compet. 2022, 17, 162–174.
24. Hofmann, W.; Lang, S.; Reichardt, P.; Reggelin, T. A brief introduction to deploy Amazon Web Services for online discrete-event simulation. Procedia Comput. Sci. 2022, 200, 386–393.
25. Sukhdeve, D.S.R.; Sukhdeve, S.S. Google Cloud Platform for Data Science; Apress: Berkeley, CA, USA, 2023.
26. Azeem, M.; Haleem, A.; Bahl, S.; Javaid, M.; Suman, R.; Nandan, D. Big data applications to take up major challenges across manufacturing industries: A brief review. Mater. Today Proc. 2022, 49, 339–348.
27. Botvin, M.; Hershkovitz, A.; Forkosh-Baruch, A. Data-driven decision-making in emergency remote teaching. Educ. Inf. Technol. 2023, 28, 489–506.
28. Ang, K.L.M.; Ge, F.L.; Seng, K.P. Big Educational Data Analytics: Survey, Architecture and Challenges. IEEE Access 2020, 8, 116392–116414.
29. Dean, J.; Ghemawat, S. MapReduce: Simplified data processing on large clusters. Commun. ACM 2008, 51, 107–113.
30. Khezr, S.N.; Navimipour, N.J. MapReduce and Its Applications, Challenges, and Architecture: A Comprehensive Review and Directions for Future Research. J. Grid Comput. 2017, 15, 295–321.
31. Sklyarov, V.; Skliarova, I.; Utepbergenov, I. Hardware Accelerators for Data Processing in High-Performance Computing Systems. In Proceedings of the 15th IEEE International Conference on Application of Information and Communication Technologies, AICT 2021, Baku, Azerbaijan, 13–15 October 2021.
32. Bhogal, J.; Choksi, I. Handling Big Data Using NoSQL. In Proceedings of the 2015 IEEE 29th International Conference on Advanced Information Networking and Applications Workshops, WAINA 2015, Gwangju, Republic of Korea, 24–27 March 2015.
33. Felstaine, E.; Hermoni, O. Machine Learning, Containers, Cloud Natives, and Microservices. In Artificial Intelligence for Autonomous Networks; Chapman and Hall/CRC: Boca Raton, FL, USA, 2018.
34. Lucas-Noll, J.; Lleixà-Fortuño, M.; Queralt-Tomas, L.; Panisello-Tafalla, A.; Carles-Lavila, M.; Clua-Espuny, J.L. Organization and costs of stroke care in outpatient settings: Systematic review. Aten. Primaria 2023, 55, 102578.
35. Bilal, M.; Oyedele, L.O.; Qadir, J.; Munir, K.; Ajayi, S.O.; Akinade, O.O.; Owolabi, H.A.; Alaka, H.A.; Pasha, M. Big Data in the construction industry: A review of present status, opportunities, and future trends. Adv. Eng. Inform. 2016, 30, 500–521.
36. Ejimofor, I.A.U.; Okonkwo, O.O.R. Development of a Knowledge Discovery System in Big Data Mining Environment. Int. Res. J. Innov. Eng. Technol. 2021, 5, 65–70.
37. Li, Y.; Hei, X. Performance optimization of computing task scheduling based on the Hadoop big data platform. Neural Comput. Appl. 2022, 37, 8181–8192.
38. Adewusi, A.O.; Okoli, U.I.; Adaga, E.; Olorunsogo, T.; Asuzu, O.F.; Daraojimba, D.O. Business Intelligence in the Era of Big Data: A Review of Analytical Tools and Competitive Advantage. Comput. Sci. IT Res. J. 2024, 5, 415–431.
39. Liang, H.; Li, J.; Wu, H.; Li, L.; Zhou, X.; Jiang, X. Mammographic Classification of Breast Cancer Microcalcifications through Extreme Gradient Boosting. Electronics 2022, 11, 2435.
40. Gao, Q.; Jin, X.; Xia, E.; Wu, X.; Gu, L.; Yan, H.; Xia, Y.; Li, S. Identification of Orphan Genes in Unbalanced Datasets Based on Ensemble Learning. Front. Genet. 2020, 11, 820.
41. Micheli, M.; Gevaert, C.M.; Carman, M.; Craglia, M.; Daemen, E.; Ibrahim, R.E.; Kotsev, A.; Mohamed-Ghouse, Z.; Schade, S.; Schneider, I.; et al. AI ethics and data governance in the geospatial domain of Digital Earth. Big Data Soc. 2022, 9, 1–5.
42. Adankon, M.M.; Cheriet, M.; Biem, A. Semisupervised least squares support vector machine. IEEE Trans. Neural Netw. 2009, 20, 1858–1870.
43. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372.
44. Moher, D.; Liberati, A.; Tetzlaff, J.; Altman, D.G.; Antes, G.; Atkins, D.; Barbour, V.; Barrowman, N.; Berlin, J.A.; Clark, J.; et al. Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. PLoS Med. 2009, 6, e1000097.
45. Gioia, D.A.; Corley, K.G.; Hamilton, A.L. Seeking Qualitative Rigor in Inductive Research: Notes on the Gioia Methodology. Organ. Res. Methods 2013, 16, 15–31.
46. Korner, M.E.H.; Lambán, M.P.; Albajez, J.A.; Santolaria, J.; Corrales, L.D.C.N.; Royo, J. Systematic literature review: Integration of additive manufacturing and industry 4.0. Metals 2020, 10, 1061.
47. Sarkis-Onofre, R.; Catalá-López, F.; Aromataris, E.; Lockwood, C. How to properly use the PRISMA Statement. Syst. Rev. 2021, 10, 117.
48. Tahamtan, I.; Safipour Afshar, A.; Ahamdzadeh, K. Factors affecting number of citations: A comprehensive review of the literature. Scientometrics 2016, 107, 1195–1225.
49. Kuang, L.; Liu, H.; Ren, Y.; Luo, K.; Shi, M.; Su, J.; Li, X. Application and development trend of artificial intelligence in petroleum exploration and development. Pet. Explor. Dev. 2021, 48, 1–14.
50. Chang, V. An ethical framework for big data and smart cities. Technol. Forecast. Soc. Change 2021, 165, 120559.
51. Oliveira, F.; Costa, D.G.; Assis, F.; Silva, I. Internet of Intelligent Things: A convergence of embedded systems, edge computing and machine learning. Internet Things 2024, 26, 101153.
52. Sun, Z.; Strang, K.D.; Pambel, F. Privacy and security in the big data paradigm. J. Comput. Inf. Syst. 2020, 60, 146–155.
53. Al-Ghabra, N. Toward Sustainable Smart Cities: Concepts & Challenges. Archit. Plan. J. 2022, 28, 3.
54. Nambiar, A.; Mundra, D. An Overview of Data Warehouse and Data Lake in Modern Enterprise Data Management. Big Data Cogn. Comput. 2022, 6, 132.
55. Azzabi, S.; Alfughi, Z.; Ouda, A. Data Lakes: A Survey of Concepts and Architectures. Computers 2024, 13, 183.
56. Kroll, J.A. Data Science Data Governance [AI Ethics]. IEEE Secur. Priv. 2018, 16, 61–70.
57. Wang, Y.; Blobel, B.; Yang, B. Reinforcing Health Data Sharing through Data Democratization. J. Pers. Med. 2022, 12, 1380.
58. Marinakis, V.; Koutsellis, T.; Nikas, A.; Doukas, H. Ai and data democratisation for intelligent energy management. Energies 2021, 14, 4341.
59. Mir, A.A. Optimizing Mobile Cloud Computing Architectures for Real-Time Big Data Analytics in Healthcare Applications: Enhancing Patient Outcomes through Scalable and Efficient Processing Models. Integr. J. Sci. Technol. 2024, 1, 1–11.
60. Rathore, M.M.; Shah, S.A.; Shukla, D.; Bentafat, E.; Bakiras, S. The Role of AI, Machine Learning, and Big Data in Digital Twinning: A Systematic Literature Review, Challenges, and Opportunities. IEEE Access 2021, 9, 32030–32052.
61. Hamdan, S.; Ayyash, M.; Almajali, S. Edge-computing architectures for internet of things applications: A survey. Sensors 2020, 20, 6441.
62. Tuli, S.; Mirhakimi, F.; Pallewatta, S.; Zawad, S.; Casale, G.; Javadi, B.; Yan, F.; Buyya, R.; Jennings, N.R. AI augmented Edge and Fog computing: Trends and challenges. J. Netw. Comput. Appl. 2023, 216, 103648.
63. Bansal, M.; Chana, I.; Clarke, S. A Survey on IoT Big Data: Current Status, 13 V’s Challenges, and Future Directions. ACM Comput. Surv. 2021, 53, 1–59.
64. Yang, P.; Xiong, N.; Ren, J. Data Security and Privacy Protection for Cloud Storage: A Survey. IEEE Access 2020, 8, 131723–131740.
65. Strang, K.D.; Sun, Z. Big Data Paradigm: What is the Status of Privacy and Security? Ann. Data Sci. 2017, 4, 1–17.
66. Lefebvre, H.; Legner, C.; Fadler, M. Data democratization: Toward a deeper understanding. In Proceedings of the 42nd International Conference on Information Systems, ICIS 2021 TREOs: “Building Sustainability and Resilience with IS: A Call for Action”, Austin, TX, USA, 12–15 December 2021.
67. Samarasinghe, S.S.U.; Lokuge, S.; Snell, L. Exploring Tenets of Data Democratization. arXiv 2022, arXiv:2206.12051.
68. Arshad, H.; Tayyab, M.; Bilal, M.; Akhtar, S.; Abdullahi, A.M. Trends and Challenges in harnessing big data intelligence for health care transformation. In Artificial Intelligence for Intelligent Systems; CRC Press: Boca Raton, FL, USA, 2024; pp. 220–240.
69. Buck, D.; Tucker, S.; Roe, B.; Hughes, J.; Challis, D. Hospital admissions and place of death of residents of care homes receiving specialist healthcare services: A systematic review without meta-analysis. J. Adv. Nurs. 2022, 78, 666–697.
70. Agustí, M.A.; Orta-Pérez, M. Big data and artificial intelligence in the fields of accounting and auditing: A bibliometric analysis. Span. J. Financ. Account./Rev. Española De Financ. Contab. 2023, 52, 412–438.
71. Ochuba, N.A.; Amoo, O.O.; Okafor, E.S.; Akinrinola, O.; Usman, F.O. Strategies for Leveraging Big Data and Analytics for Business Development: A Comprehensive Review Across Sectors. Comput. Sci. IT Res. J. 2024, 5, 562–575.
72. Rani, S.; Bhambri, P.; Kataria, A. Integration of IoT, Big Data, and Cloud Computing Technologies: Trend of the Era. In Big Data, Cloud Computing and IoT: Tools and Applications; Chapman and Hall/CRC: Boca Raton, FL, USA, 2023.
73. Karim, A.; Siddiqa, A.; Safdar, Z.; Razzaq, M.; Gillani, S.A.; Tahir, H.; Kiran, S.; Ahmed, E.; Imran, M. Big data management in participatory sensing: Issues, trends and future directions. Futur. Gener. Comput. Syst. 2020, 107, 942–955.
74. Waterson, P.; Carman, E.M.; Manser, T.; Hammer, A. Hospital Survey on Patient Safety Culture (HSPSC): A systematic review of the psychometric properties of 62 international studies. BMJ Open 2019, 9, e026896.
75. Hamad, R.; Elser, H.; Tran, D.C.; Rehkopf, D.H.; Goodman, S.N. How and why studies disagree about the effects of education on health: A systematic review and meta-analysis of studies of compulsory schooling laws. Soc. Sci. Med. 2018, 212, 168–178.
76. Ikegwu, A.C.; Nweke, H.F.; Anikwe, C.V. Recent trends in computational intelligence for educational big data analysis. Iran J. Comput. Sci. 2024, 7, 103–129.
77. Tran, H.; Saleem, K.; Lim, M.; Chow, E.P.F.; Fairley, C.K.; Terris-Prestholt, F.; Ong, J.J. Global estimates for the lifetime cost of managing HIV. AIDS 2021, 35, 1273–1281.
78. Deepa, N.; Pham, Q.V.; Nguyen, D.C.; Bhattacharya, S.; Prabadevi, B.; Gadekallu, T.R.; Maddikunta, P.K.R.; Fang, F.; Pathirana, P.N. A survey on blockchain for big data: Approaches, opportunities, and future directions. Futur. Gener. Comput. Syst. 2022, 131, 209–226.
79. Char, D.S.; Abràmoff, M.D.; Feudtner, C. Identifying Ethical Considerations for Machine Learning Healthcare Applications. Am. J. Bioeth. 2020, 20, 7–17.
80. Favaretto, M.; De Clercq, E.; Gaab, J.; Elger, B.S. First do no harm: An exploration of researchers’ ethics of conduct in Big Data behavioral studies. PLoS ONE 2020, 15, e0241865.
81. Sandhu, A.K. Big Data with Cloud Computing: Discussions and Challenges. Big Data Min. Anal. 2022, 5, 32–40.
82. Foffano, F.; Scantamburlo, T.; Cortés, A. Investing in AI for social good: An analysis of European national strategies. AI Soc. 2023, 38, 479–500.
83. Cui, Y.; Ma, Z.; Wang, L.; Yang, A.; Liu, Q.; Kong, S.; Wang, H. A survey on big data-enabled innovative online education systems during the COVID-19 pandemic. J. Innov. Knowl. 2023, 8, 100295.
84. Market Research Future. Data Analytics Market Size, Share | Growth Analysis 2030. Available online: https://www.marketresearchfuture.com/reports/data-analytics-market-1689 (accessed on 24 December 2024).
85. Almunawar, M.N.; Anshari, M. Digital enabler and value integration: Revealing the expansion engine of digital marketplace. Technol. Anal. Strateg. Manag. 2022, 34, 847–857.
86. Gartner Inc. What’s New in Artificial Intelligence From the 2023 Gartner Hype Cycle. Gartner Articles. Available online: https://www.gartner.com/en/articles/what-s-new-in-artificial-intelligence-from-the-2023-gartner-hype-cycle (accessed on 27 December 2024).
87. Pradhan, S.K.; Heyn, H.M.; Knauss, E. Identifying and managing data quality requirements: A design science study in the field of automated driving. Softw. Qual. J. 2024, 32, 313–360.
88. Eke, D.; Stahl, B. Ethics in the Governance of Data and Digital Technology: An Analysis of European Data Regulations and Policies. Digit. Soc. 2024, 3, 11.
Era | Technologies | Characteristics |
---|---|---|
2000s | Distributed File Systems (HDFS, GFS) | MapReduce, batch processing |
2000s | Apache Hadoop | Scalability, fault tolerance |
2000s | NoSQL Databases (MongoDB, Cassandra) | Flexible data models, horizontal scaling |
2000s | Apache Spark | In-memory processing, faster analytics |
2010s | Apache Kafka | Real-time streaming, event processing |
2010s | Apache Flink | Stream processing, event time processing |
2010s | ML and AI | Data mining, predictive analytics |
2010s | Kubernetes | Container orchestration, scalability |
2020s | Apache Beam | Unified batch and stream processing |
2020s | Kubernetes Operators | Automation, manage stateful applications |
2020s | Data Mesh | Decentralized data architecture |
2020s | Quantum Computing | Potential for processing massive datasets |
Component | Description | Examples |
---|---|---|
Data Storage Systems | Systems that provide large-capacity storage for managing voluminous data. | NoSQL databases, Data lakes |
Data Processing Frameworks | Frameworks designed to efficiently process large datasets using various paradigms. | Hadoop, Apache Spark |
Processing Paradigms | Methods employed for data processing, including batch and real-time processing. | Batch processing, Stream processing |
Data Management Tools | Tools that facilitate the management and analysis of large datasets, including the application of ML algorithms. | ETL tools, data governance tools |
Visualization Tools | BI tools that present data insights in accessible formats for decision-making. | Tableau, Power BI, QlikView |
Analytics Techniques | Techniques leveraging data science, data mining, and statistical methods for extracting insights from data. | ML, predictive analytics |
Aspect | Challenges | Opportunities |
---|---|---|
Data Privacy and Security | | |
Ethical Considerations | | |
Scalability Issues | | |
Rank | Terms | Frequency | Percentage |
---|---|---|---|
1 | Big Data | 6334 | 17.09% |
2 | data mining | 2039 | 5.50% |
3 | learning algorithms | 1531 | 4.13% |
4 | machine learning | 1523 | 4.11% |
5 | deep learning | 1503 | 4.05% |
6 | clustering algorithms | 1323 | 3.57% |
7 | learning systems | 1322 | 3.57% |
8 | algorithm | 1160 | 3.13% |
9 | classification (of information) | 1128 | 3.04% |
10 | data handling | 1078 | 2.91% |
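The percentage column in the keyword table above is a share of all keyword occurrences, so each row implies a total when its frequency is divided by its percentage. The sketch below performs that arithmetic as a consistency check. The overall total is not reported in the table, so the value it produces is an inference from rounded percentages, and the names `KEYWORDS` and `implied_total` are illustrative.

```python
# (term, frequency, reported percentage) copied from the keyword table above.
KEYWORDS = [
    ("Big Data", 6334, 17.09),
    ("data mining", 2039, 5.50),
    ("learning algorithms", 1531, 4.13),
    ("machine learning", 1523, 4.11),
    ("deep learning", 1503, 4.05),
    ("clustering algorithms", 1323, 3.57),
    ("learning systems", 1322, 3.57),
    ("algorithm", 1160, 3.13),
    ("classification (of information)", 1128, 3.04),
    ("data handling", 1078, 2.91),
]


def implied_total(frequency: int, percentage: float) -> float:
    """Total keyword occurrences implied by one row: frequency / share."""
    return frequency / (percentage / 100.0)


def implied_totals(rows: list[tuple[str, int, float]]) -> dict[str, float]:
    return {term: implied_total(freq, pct) for term, freq, pct in rows}


if __name__ == "__main__":
    for term, total in implied_totals(KEYWORDS).items():
        print(f"{term:35s} {total:10.0f}")
    # Share of all keyword occurrences covered by the ten listed terms.
    print("top-10 share (%):", round(sum(pct for _, _, pct in KEYWORDS), 2))
```

If the rows are mutually consistent, the implied totals should agree to within the rounding error of a two-decimal percentage; a row that deviates sharply would point to a transcription error.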
Country | Total Citations (TC) | Average Article Citations |
---|---|---|
China | 72,962 | 17.30 |
USA | 30,653 | 45.50 |
India | 15,466 | 16.50 |
United Kingdom | 11,202 | 54.40 |
Spain | 6011 | 27.80 |
Korea | 5855 | 17.50 |
Australia | 5453 | 37.30 |
Italy | 4301 | 22.50 |
Germany | 3591 | 27.40 |
Canada | 3323 | 31.30 |
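The two numeric columns in the country table above can be combined to estimate how many articles each country contributed, under the assumption that the average article citations figure equals total citations divided by the number of articles attributed to that country. The table does not report article counts, so the results below are estimates derived from rounded averages, and the names `COUNTRY_CITATIONS` and `implied_article_count` are illustrative.

```python
# country -> (total citations, average article citations), from the table above.
COUNTRY_CITATIONS = {
    "China": (72962, 17.30),
    "USA": (30653, 45.50),
    "India": (15466, 16.50),
    "United Kingdom": (11202, 54.40),
    "Spain": (6011, 27.80),
    "Korea": (5855, 17.50),
    "Australia": (5453, 37.30),
    "Italy": (4301, 22.50),
    "Germany": (3591, 27.40),
    "Canada": (3323, 31.30),
}


def implied_article_count(total_citations: int, average_citations: float) -> int:
    """Estimate article count, assuming average = total citations / articles."""
    return round(total_citations / average_citations)


if __name__ == "__main__":
    for country, (tc, avg) in COUNTRY_CITATIONS.items():
        print(f"{country:15s} ~{implied_article_count(tc, avg)} articles")
```

Read this way, the table separates volume from impact: a country can lead on total citations through a large number of articles while another has a higher citation rate per article.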
Terms | Frequency |
---|---|
Big Data | 7320 |
machine learning | 2101 |
data mining | 2084 |
learning algorithms | 1529 |
deep learning | 1371 |
clustering algorithms | 1330 |
learning systems | 1319 |
algorithm | 1181 |
classification (of information) | 1128 |
data handling | 1079 |
From | To | Frequency |
---|---|---|
Australia | New Zealand | 171.4849235 |
Canada | New Zealand | 171.4849235 |
China | New Zealand | 171.4849235 |
France | New Zealand | 171.4849235 |
Hong Kong | New Zealand | 171.4849235 |
India | New Zealand | 171.4849235 |
Japan | New Zealand | 171.4849235 |
Pakistan | New Zealand | 171.4849235 |
Romania | New Zealand | 171.4849235 |
Singapore | New Zealand | 171.4849235 |
Spain | New Zealand | 171.4849235 |
United Arab Emirates | New Zealand | 171.4849235 |
United Kingdom | New Zealand | 171.4849235 |
USA | New Zealand | 171.4849235 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Hakami, T.A.; Alginahi, Y.M.; Sabri, O. Exploring the Evolution of Big Data Technologies: A Systematic Literature Review of Trends, Challenges, and Future Directions. Future Internet 2025, 17, 427. https://doi.org/10.3390/fi17090427