Review Papers in Big Data, Cloud-Based Data Analysis and Learning Systems

A special issue of Big Data and Cognitive Computing (ISSN 2504-2289).

Deadline for manuscript submissions: closed (28 February 2023) | Viewed by 98715

Printed Edition Available!
A printed edition of this Special Issue is available here.

Special Issue Editors


Guest Editor
Department of Informatics, Modeling, Electronics, and Systems Engineering (DIMES), University of Calabria, 87036 Rende, Italy
Interests: big data analysis; parallel machine learning; scalable data mining; distributed computing systems; parallel programming and data science

Special Issue Information

Dear Colleagues,

Big data analysis is enabling researchers and data scientists to extract useful information and knowledge to make new discoveries and support decision-making processes. To this end, advanced and scalable algorithms, along with parallel programming frameworks and high-performance computers, are commonly used to solve big data problems and obtain valuable information and learning processes in a reasonable time.

This Special Issue will include high-quality papers, i.e., review/survey papers and original research articles, in the fields of big data, cloud-based data analysis, and learning systems. The papers, contributed by Editorial Board Members or by authors invited by the Editorial Board Members and the Editorial Office, will be published in open access format. Papers will be published, free of charge, after peer review. In particular, wide-ranging surveys and advanced research are sought, both on aspects relating to big data (e.g., frameworks for big data analysis or systems for big data management) and on the use of big data in specific areas (e.g., big data from social media or big data from streams).

The scope of this Special Issue of Big Data and Cognitive Computing includes, but is not limited to, the following topics:

  • Big data (along with clouds, Internet of Things and social media platforms);
  • Big data infrastructure and systems;
  • Big data processing and analytics;
  • Big data models, algorithms, and architectures;
  • Cloud computing and big data platforms;
  • Cloud services and big data applications;
  • Data storage and management;
  • Data search and mining;
  • Big data applications in science, Internet, finance, telecommunications, business, medicine, healthcare, government, transportation, industry, manufacturing, etc.;
  • Social media analysis;
  • Big data integrity and privacy;
  • Edge computing and IoT data mining;
  • IoT technologies for big data collection;
  • IoT sensing and cognitive IoT;
  • Data-driven IoT intelligent applications;
  • 5G networks and wireless big data;
  • Cognitive computing;
  • Machine learning and its applications in medicine, biology, industry, manufacturing, security, education, etc.;
  • Deep learning;
  • Artificial intelligence;
  • Affect/emotion/personality/mind computing;
  • Cognitive modeling;
  • Cognitive informatics;
  • Cognitive sensor-networks;
  • Cognitive robots;
  • Application of cognitive computing in health monitoring, intelligent control systems, bioinformatics, smart manufacturing, smart grids, image/video and signal processing, etc.;
  • Robots and control systems;
  • Natural language processing;
  • Human–machine/robot interaction.

Prof. Dr. Domenico Talia
Dr. Fabrizio Marozzo
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Big Data and Cognitive Computing is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (11 papers)

Editorial

4 pages, 185 KiB  
Editorial
Perspectives on Big Data, Cloud-Based Data Analysis and Machine Learning Systems
by Fabrizio Marozzo and Domenico Talia
Big Data Cogn. Comput. 2023, 7(2), 104; https://doi.org/10.3390/bdcc7020104 - 30 May 2023
Cited by 4 | Viewed by 2101
Abstract
Huge amounts of digital data are continuously generated and collected from different sources, such as sensors, cameras, in-vehicle infotainment, smart meters, mobile devices, social media platforms, and web applications and services [...] Full article

Research

18 pages, 3389 KiB  
Article
Detecting Multi-Density Urban Hotspots in a Smart City: Approaches, Challenges and Applications
by Eugenio Cesario, Paolo Lindia and Andrea Vinci
Big Data Cogn. Comput. 2023, 7(1), 29; https://doi.org/10.3390/bdcc7010029 - 8 Feb 2023
Cited by 5 | Viewed by 2372
Abstract
Driven by the large-scale diffusion of sensing networks and scanning devices in modern cities, huge volumes of geo-referenced urban data are collected every day. Such an amount of information is analyzed to discover data-driven models, which can be exploited to tackle the major issues that cities face, including air pollution, virus diffusion, human mobility, crime forecasting, traffic flows, etc. In particular, the detection of city hotspots is de facto a valuable organization technique for framing detailed knowledge of a metropolitan area, providing high-level summaries for spatial datasets, which are a valuable support for planners, scientists, and policymakers. However, while classic density-based clustering algorithms are suitable for discovering hotspots characterized by homogeneous density, their application to multi-density data can produce inaccurate results. In fact, a proper threshold setting is very difficult when clusters in different regions have considerably different densities, or clusters with different density levels are nested. For this reason, since metropolitan cities are heavily characterized by variable densities, multi-density clustering seems to be more appropriate for discovering city hotspots. Indeed, such algorithms rely on multiple minimum threshold values and are able to detect multiple pattern distributions of different densities, aiming at distinguishing between several density regions, which may or may not be nested and are generally of a non-convex shape. This paper discusses the research issues and challenges for analyzing urban data, aimed at discovering multi-density hotspots in urban areas. In particular, the study compares the four approaches (DBSCAN, OPTICS-xi, HDBSCAN, and CHD) proposed in the literature for clustering urban data and analyzes their performance on both state-of-the-art and real-world datasets. Experimental results show that multi-density clustering algorithms generally achieve better results on urban data than classic density-based algorithms. Full article

13 pages, 4030 KiB  
Article
The “Unreasonable” Effectiveness of the Wasserstein Distance in Analyzing Key Performance Indicators of a Network of Stores
by Andrea Ponti, Ilaria Giordani, Matteo Mistri, Antonio Candelieri and Francesco Archetti
Big Data Cogn. Comput. 2022, 6(4), 138; https://doi.org/10.3390/bdcc6040138 - 15 Nov 2022
Cited by 1 | Viewed by 2390
Abstract
Large retail companies routinely gather huge amounts of customer data, which are to be analyzed at a low granularity. To enable this analysis, several Key Performance Indicators (KPIs), acquired for each customer through different channels, are associated with the main drivers of the customer experience. Analyzing the samples of customer behavior only through parameters such as average and variance does not cope with the growing heterogeneity of customers. In this paper, we propose a different approach in which the samples from customer surveys are represented as discrete probability distributions whose similarities can be assessed by different models. The focus is on the Wasserstein distance, which is generally well defined, even when other distributional distances are not, and it provides an interpretable distance metric between distributions. The support of the distributions can be both one- and multi-dimensional, allowing for the joint consideration of several KPIs for each store, leading to a multi-variate histogram. Moreover, the Wasserstein barycenter offers a useful synthesis of a set of distributions and can be used as a reference distribution to characterize and classify behavioral patterns. Experimental results on real data show the effectiveness of the Wasserstein distance in providing global performance measures. Full article
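To illustrate the idea in the abstract above, the following is a minimal sketch of the 1-Wasserstein distance between two discrete distributions on a shared one-dimensional support. It is not the authors' implementation: the function name `wasserstein_1d`, the 1-to-5 survey scale, and the three store histograms are hypothetical, and the multi-dimensional case and the barycenter mentioned in the abstract are not covered.

```python
from itertools import accumulate


def wasserstein_1d(p, q, support):
    """1-Wasserstein distance between two discrete distributions.

    p and q are probability vectors over the same sorted 1-D support.
    In one dimension the distance equals the area between the two
    cumulative distribution functions.
    """
    if not (len(p) == len(q) == len(support)):
        raise ValueError("p, q and support must have the same length")
    cdf_p = list(accumulate(p))
    cdf_q = list(accumulate(q))
    # Width of each interval between consecutive support points.
    gaps = [b - a for a, b in zip(support, support[1:])]
    # zip stops at the last gap, so the final CDF value (1.0) is ignored.
    return sum(abs(fp - fq) * gap for fp, fq, gap in zip(cdf_p, cdf_q, gaps))


# Hypothetical satisfaction scores on a 1-5 scale for three stores.
support = [1, 2, 3, 4, 5]
store_a = [0.2, 0.2, 0.2, 0.2, 0.2]  # answers spread evenly, mean 3
store_b = [0.0, 0.0, 1.0, 0.0, 0.0]  # every answer is 3, mean 3
store_c = [0.0, 0.0, 0.0, 0.0, 1.0]  # every answer is 5

d_ab = wasserstein_1d(store_a, store_b, support)
d_bc = wasserstein_1d(store_b, store_c, support)
```

Stores A and B share the same average score, yet their distance is clearly non-zero (about 1.2 in this toy setup), which is the kind of difference that a comparison based only on averages would miss.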

Review

19 pages, 960 KiB  
Review
An Overview on the Challenges and Limitations Using Cloud Computing in Healthcare Corporations
by Giuseppe Agapito and Mario Cannataro
Big Data Cogn. Comput. 2023, 7(2), 68; https://doi.org/10.3390/bdcc7020068 - 6 Apr 2023
Cited by 11 | Viewed by 4334
Abstract
Technological advances in high throughput platforms for biological systems enable the cost-efficient production of massive amounts of data, leading life science to the Big Data era. The availability of Big Data provides new opportunities and challenges for data analysis. Cloud Computing is ideal for digging into Big Data in omics sciences because it makes data analysis, sharing, access, and storage effective and able to scale when the amount of data increases. However, Cloud Computing presents several issues regarding the security and privacy of data that are particularly important when analyzing patients’ data, such as in personalized medicine. The objective of the present study is to highlight the challenges, security issues, and impediments that restrict the widespread adoption of Cloud Computing in healthcare corporations. Full article

23 pages, 6473 KiB  
Review
Enhancing Digital Health Services with Big Data Analytics
by Nisrine Berros, Fatna El Mendili, Youness Filaly and Younes El Bouzekri El Idrissi
Big Data Cogn. Comput. 2023, 7(2), 64; https://doi.org/10.3390/bdcc7020064 - 30 Mar 2023
Cited by 8 | Viewed by 6456
Abstract
Medicine is constantly generating new imaging data, including data from basic research, clinical research, and epidemiology, from health administration and insurance organizations, public health services, and non-conventional data sources such as social media, Internet applications, etc. Healthcare professionals have gained from the integration of big data in many ways, including new tools for decision support, improved clinical research methodologies, treatment efficacy, and personalized care. Finally, there are significant advantages in saving resources and reallocating them to increase productivity and rationalization. In this paper, we will explore how big data can be applied to the field of digital health. We will explain the features of health data, its particularities, and the tools available to use it. In addition, a particular focus is placed on the latest research work that addresses big data analysis in the health domain, as well as the technical and organizational challenges that have been discussed. Finally, we propose a general strategy for medical organizations looking to adopt or leverage big data analytics. Through this study, healthcare organizations and institutions considering the use of big data analytics technology, as well as those already using it, can gain a thorough and comprehensive understanding of the potential use, effective targeting, and expected impact. Full article

17 pages, 1267 KiB  
Review
Impact of Artificial Intelligence on COVID-19 Pandemic: A Survey of Image Processing, Tracking of Disease, Prediction of Outcomes, and Computational Medicine
by Khaled H. Almotairi, Ahmad MohdAziz Hussein, Laith Abualigah, Sohaib K. M. Abujayyab, Emad Hamdi Mahmoud, Bassam Omar Ghanem and Amir H. Gandomi
Big Data Cogn. Comput. 2023, 7(1), 11; https://doi.org/10.3390/bdcc7010011 - 11 Jan 2023
Cited by 15 | Viewed by 8363
Abstract
Integrating machine learning technologies into artificial intelligence (AI) is at the forefront of the scientific and technological tools employed to combat the COVID-19 pandemic. This study assesses different uses and deployments of modern technology for combating the COVID-19 pandemic at various levels, such as image processing, tracking of disease, prediction of outcomes, and computational medicine. The results prove that computerized tomography (CT) scans help to diagnose patients infected by COVID-19. This includes two-sided, multilobar ground glass opacification (GGO) by a posterior distribution or peripheral, primarily in the lower lobes, and fewer recurrences in the intermediate lobe. An extensive search of modern technology databases relating to COVID-19 was undertaken. Subsequently, a review of the extracted information from the database search looked at how technology can be employed to tackle the pandemic. We discussed the technological advancements deployed to alleviate the communicability and effect of the pandemic. Even though there are many types of research on the use of technology in combating COVID-19, the application of technology in combating COVID-19 is still not yet fully explored. In addition, we suggested some open research issues and challenges in deploying AI technology to combat the global pandemic. Full article

19 pages, 784 KiB  
Review
A Survey on Big Data in Pharmacology, Toxicology and Pharmaceutics
by Krithika Latha Bhaskaran, Richard Sakyi Osei, Evans Kotei, Eric Yaw Agbezuge, Carlos Ankora and Ernest D. Ganaa
Big Data Cogn. Comput. 2022, 6(4), 161; https://doi.org/10.3390/bdcc6040161 - 19 Dec 2022
Cited by 6 | Viewed by 3012
Abstract
Patients, hospitals, sensors, researchers, providers, phones, and healthcare organisations are producing enormous amounts of data in both the healthcare and drug detection sectors. The real challenge in these sectors is to find, investigate, manage, and collect information from patients in order to make their lives easier and healthier, not only in terms of formulating new therapies and understanding diseases, but also to predict the results at earlier stages and make effective decisions. The volumes of data available in the fields of pharmacology, toxicology, and pharmaceutics are constantly increasing. These increases are driven by advances in technology, which allow for the analysis of ever-larger data sets. Big Data (BD) has the potential to transform drug development and safety testing by providing new insights into the effects of drugs on human health. However, harnessing this potential involves several challenges, including the need for specialised skills and infrastructure. In this survey, we explore how BD approaches are currently being used in the pharmacology, toxicology, and pharmaceutics fields; in particular, we highlight how researchers have applied BD in pharmacology, toxicology, and pharmaceutics to address various challenges and establish solutions. A comparative analysis helps to trace the implementation of big data in the fields of pharmacology, toxicology, and pharmaceutics. Certain relevant limitations and directions for future research are emphasised. The pharmacology, toxicology, and pharmaceutics fields are still at an early stage of BD adoption, and there are many research challenges to be overcome, in order to effectively employ BD to address specific issues. Full article

23 pages, 735 KiB  
Review
Explore Big Data Analytics Applications and Opportunities: A Review
by Zaher Ali Al-Sai, Mohd Heikal Husin, Sharifah Mashita Syed-Mohamad, Rasha Moh’d Sadeq Abdin, Nour Damer, Laith Abualigah and Amir H. Gandomi
Big Data Cogn. Comput. 2022, 6(4), 157; https://doi.org/10.3390/bdcc6040157 - 14 Dec 2022
Cited by 16 | Viewed by 9380
Abstract
Big data applications and analytics are vital in proposing ultimate strategic decisions. The existing literature emphasizes that big data applications and analytics can empower those who apply Big Data Analytics during the COVID-19 pandemic. This paper reviews the existing literature specializing in big data applications pre- and peri-COVID-19. A comparison of the use of Big Data applications before and during the pandemic is presented. The comparison covers four highly recognized industry fields: Healthcare, Education, Transportation, and Banking. The effectiveness of the four major types of data analytics across these industries is discussed. Hence, this paper provides an illustrative description of the importance of big data applications in the era of COVID-19, as well as aligning the applications with their relevant big data analytics models. This review paper concludes that applying the ultimate big data applications and their associated data analytics models can help overcome the significant limitations faced by organizations during one of the most fateful pandemics worldwide. Future work will conduct a systematic literature review and a comparative analysis of the existing Big Data Systems and models. Moreover, future work will investigate the critical challenges of Big Data Analytics and applications during the COVID-19 pandemic. Full article

24 pages, 718 KiB  
Review
An Overview of Data Warehouse and Data Lake in Modern Enterprise Data Management
by Athira Nambiar and Divyansh Mundra
Big Data Cogn. Comput. 2022, 6(4), 132; https://doi.org/10.3390/bdcc6040132 - 7 Nov 2022
Cited by 33 | Viewed by 36610
Abstract
Data is the lifeblood of any organization. In today’s world, organizations recognize the vital role of data in modern business intelligence systems for making meaningful decisions and staying competitive in the field. Efficient and optimal data analytics provides a competitive edge to its performance and services. Major organizations generate, collect and process vast amounts of data, falling under the category of big data. Managing and analyzing the sheer volume and variety of big data is a cumbersome process. At the same time, proper utilization of the vast collection of an organization’s information can generate meaningful insights into business tactics. In this regard, two of the popular data management systems in the area of big data analytics (i.e., data warehouse and data lake) act as platforms to accumulate the big data generated and used by organizations. Although seemingly similar, both of them differ in terms of their characteristics and applications. This article presents a detailed overview of the roles of data warehouses and data lakes in modern enterprise data management. We detail the definitions, characteristics and related works for the respective data management frameworks. Furthermore, we explain the architecture and design considerations of the current state of the art. Finally, we provide a perspective on the challenges and promising research directions for the future. Full article

27 pages, 8416 KiB  
Review
Big Data in Construction: Current Applications and Future Opportunities
by Hafiz Suliman Munawar, Fahim Ullah, Siddra Qayyum and Danish Shahzad
Big Data Cogn. Comput. 2022, 6(1), 18; https://doi.org/10.3390/bdcc6010018 - 6 Feb 2022
Cited by 36 | Viewed by 17573
Abstract
Big data have become an integral part of various research fields due to the rapid advancements in the digital technologies available for dealing with data. The construction industry is no exception and has seen a spike in the data being generated due to the introduction of various digital disruptive technologies. However, despite the availability of data and the introduction of such technologies, the construction industry is lagging in harnessing big data. This paper critically explores literature published since 2010 to identify the data trends and how the construction industry can benefit from big data. The presence of tools such as computer-aided drawing (CAD) and building information modelling (BIM) provide a great opportunity for researchers in the construction industry to further improve how infrastructure can be developed, monitored, or improved in the future. The gaps in the existing research data have been explored and a detailed analysis was carried out to identify the different ways in which big data analysis and storage work in relevance to the construction industry. Big data engineering (BDE) and statistics are among the most crucial steps for integrating big data technology in construction. The results of this study suggest that while the existing research studies have set the stage for improving big data research, the integration of the associated digital technologies into the construction industry is not very clear. Among the future opportunities, big data research into construction safety, site management, heritage conservation, and project waste minimization and quality improvements are key areas. Full article

Other

19 pages, 800 KiB  
Systematic Review
Disclosing Edge Intelligence: A Systematic Meta-Survey
by Vincenzo Barbuto, Claudio Savaglio, Min Chen and Giancarlo Fortino
Big Data Cogn. Comput. 2023, 7(1), 44; https://doi.org/10.3390/bdcc7010044 - 2 Mar 2023
Cited by 22 | Viewed by 3882
Abstract
The Edge Intelligence (EI) paradigm has recently emerged as a promising solution to overcome the inherent limitations of cloud computing (latency, autonomy, cost, etc.) in the development and provision of next-generation Internet of Things (IoT) services. Therefore, motivated by its increasing popularity, relevant research effort was expended in order to explore, from different perspectives and at different degrees of detail, the many facets of EI. In such a context, the aim of this paper was to analyze the wide landscape on EI by providing a systematic analysis of the state-of-the-art manuscripts in the form of a tertiary study (i.e., a review of literature reviews, surveys, and mapping studies) and according to the guidelines of the PRISMA methodology. A comparison framework is, hence, provided and sound research questions outlined, aimed at exploring (for the benefit of both experts and beginners) the past, present, and future directions of the EI paradigm and its relationships with the IoT and the cloud computing worlds. Full article