Innovative Applications of Big Data and Cloud Computing

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (20 February 2022) | Viewed by 27797

Special Issue Editors

Prof. Chao-Tung Yang
Department of Computer Science, Tunghai University, Taichung 40704, Taiwan
Interests: cloud computing; big data; machine learning; parallel processing
Dr. Chen-Kun Tsung
Department of Computer Science and Information Engineering, National Chin-Yi University of Technology, Taichung 41170, Taiwan
Interests: cloud computing; big data; web-based applications; combinatorial optimization
Dr. Vinod Kumar Verma
Department of Computer Science & Engineering, Sant Longowal Institute of Engineering & Technology, Punjab 148106, India
Interests: wireless sensor networks; trust and reputation systems; cloud computing; brain computing; internet of things; big data

Special Issue Information

Dear Colleagues,

Humans constantly generate huge amounts of data in everyday life, manufacturing, research, and many other settings. In recent years, capturing and processing these data has become easier, and applications are now designed to assist us in making decisions. For example, the air quality index (AQI) represents the degree of air pollution, and scientists collect AQI values to offer up-to-date recommendations for outdoor activities; analysts examine traffic data to estimate transportation demand, and drivers plan their trips around road usage; production managers use manufacturing data to ensure that product quality remains within acceptable tolerance ranges.

Such innovative services require the handling of massive amounts of data to derive recommendations. Cloud computing has been of great assistance here, allowing data to be managed easily and efficiently. Its major advantage is the on-demand delivery of services, which can be invoked without hardware or software limitations and regardless of geographic location. Information delivery and data analysis can therefore be decoupled, letting analysts and researchers focus on the purpose of their systems. Not only system designers but also end users prefer to access systems via cloud services.

To explore such innovative services and practical systems, the Special Issue “Innovative Applications of Big Data and Cloud Computing” focuses on applications of core service design, platform implementation, data visualization, and prediction built on big data and cloud computing. We invite researchers to contribute their state-of-the-art experimental or computational results. Topics of particular interest include the following:

  • Cloud System Design and Implementation.
  • Core Service Design and Implementation in Cloud or Web Ecosystems.
  • Front-End Service Design and Implementation in Cloud or Web Ecosystems.
  • Big Data Analysis and Implementation in Cloud or Web Ecosystems.
  • Big Data Visualization in Cloud or Web Ecosystems.

Please feel free to contact us with any questions.

Prof. Chao-Tung Yang
Dr. Chen-Kun Tsung
Dr. Neil Yen
Dr. Vinod Kumar Verma
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • big data
  • cloud computing
  • innovation service design
  • practical platform implementation

Published Papers (11 papers)

Editorial


2 pages, 175 KiB  
Editorial
Special Issue on Innovative Applications of Big Data and Cloud Computing
by Chao-Tung Yang, Chen-Kun Tsung, Neil Yuwen Yen and Vinod Kumar Verma
Appl. Sci. 2022, 12(19), 9648; https://doi.org/10.3390/app12199648 - 26 Sep 2022
Cited by 1 | Viewed by 840
Abstract
Big Data and Cloud Computing are two major information technologies for processing data to translate data to knowledge [...] Full article
(This article belongs to the Special Issue Innovative Applications of Big Data and Cloud Computing)

Research


17 pages, 5224 KiB  
Article
The Vision-Based Data Reader in IoT System for Smart Factory
by Tse-Chuan Hsu, Yao-Hong Tsai and Dong-Meau Chang
Appl. Sci. 2022, 12(13), 6586; https://doi.org/10.3390/app12136586 - 29 Jun 2022
Cited by 9 | Viewed by 1889
Abstract
The proposed research is based on a real plastic injection factory for cutting board production. Most existing approaches for smart manufacturing tried to build the total solution of IoT by moving forward to the standard of industry 4.0. Under the cost considerations, this will not be acceptable to most factories, so we proposed the vision based technology to solve their immediate problem. Real-time machine condition monitoring is important for making great products and measuring line productivity or factory productivity. The study focused on a vision-based data reader (VDR) in edge computing for smart factories. A simple camera embedded in Field Programmable Gate Array (FPGA) was attached to monitor the screen on the control panel of the machines. Each end device was preprogrammed to capture images and process data on its own. The preprocessing step was then performed to have the normalized illumination of the captured image. A saliency map was generated to detect the required region for recognition. Finally, digit recognition was performed and the recognized digits were sent to the IoT system. The most significant contribution of the proposed VDR system used the compact deep learning model for training and testing purposes to fit the requirement of cost consideration and real-time monitoring in edge computing. To build the compact model, different convolution filters were tested to fit the performance requirement. Experimentations on a real plastic cutting board factory showed the improvement in manufacturing products by the proposed system and achieved a high digit recognition accuracy of 97.56%. In addition, the prototype system had low power and low latency advantages. Full article
(This article belongs to the Special Issue Innovative Applications of Big Data and Cloud Computing)
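The compact recognition model is not detailed here; as a rough illustration only, a panel-readout digit classifier can be kept to a few thousand parameters. The following sketch is a hypothetical compact CNN in PyTorch, where the input size, filter counts, and layer layout are assumptions rather than the authors' architecture:

```python
# A minimal sketch of a compact CNN for panel digit recognition, in the spirit
# of the VDR described above. Input shape (32x32 grayscale crops) and layer
# sizes are illustrative assumptions, not the authors' model.
import torch
import torch.nn as nn

class CompactDigitNet(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),   # small filter bank keeps the model light
            nn.ReLU(),
            nn.MaxPool2d(2),                              # 32x32 -> 16x16
            nn.Conv2d(8, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                              # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(16 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))

if __name__ == "__main__":
    model = CompactDigitNet()
    dummy = torch.randn(1, 1, 32, 32)   # one grayscale crop from the saliency-detected region
    print(model(dummy).shape)           # torch.Size([1, 10])
```

Keeping the filter banks small is what makes such a model plausible for edge devices with tight cost and latency budgets, which is the constraint the paper emphasizes.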

19 pages, 14990 KiB  
Article
On Construction of a Campus Outdoor Air and Water Quality Monitoring System Using LoRaWAN
by Hsin-Yuan Miao, Chao-Tung Yang, Endah Kristiani, Halim Fathoni, Yu-Sheng Lin and Chien-Yi Chen
Appl. Sci. 2022, 12(10), 5018; https://doi.org/10.3390/app12105018 - 16 May 2022
Cited by 10 | Viewed by 2276
Abstract
This paper proposed implementing a water and air monitoring system using sensor development and a LoRa Network. To transmit data, a self-made PCB board integrates the terminal sensors with Renesas RX64M MCU and LoRa. There are 16 monitoring point stations for the media experiment. The sensors were used to measure the water and air parameters such as PM2.5, CO2, DO concentration, pH level, temperature, and humidity. In addition, the Grafana system was implemented to present the status and variation in the monitoring parameters in the environmental area. To evaluate the monitoring system, we also collected public information provided by the environmental protection department of the Taiwan government at the same monitoring point for comparison. Full article
(This article belongs to the Special Issue Innovative Applications of Big Data and Cloud Computing)
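As a loose illustration of the data path from a monitoring station to the dashboard, the sketch below forwards one set of readings to a hypothetical collector that a Grafana instance could query. The endpoint URL, field names, and station identifiers are assumptions, not the paper's interfaces; the real stations transmit over LoRa from a custom MCU board rather than HTTP.

```python
# A minimal sketch of forwarding one station's readings to a back-end that
# Grafana visualizes. Endpoint and payload schema are hypothetical.
import time
import requests

INGEST_URL = "http://monitor.example.edu/api/readings"   # hypothetical collector endpoint

def publish_reading(station_id: str, pm25: float, co2: float,
                    ph: float, do_mgl: float, temp_c: float, humidity: float) -> None:
    payload = {
        "station": station_id,
        "timestamp": int(time.time()),
        "pm2_5": pm25,              # ug/m3
        "co2": co2,                 # ppm
        "ph": ph,
        "dissolved_oxygen": do_mgl, # mg/L
        "temperature": temp_c,      # degrees C
        "humidity": humidity,       # percent
    }
    resp = requests.post(INGEST_URL, json=payload, timeout=5)
    resp.raise_for_status()

# Example: one reading from one of the 16 campus monitoring points.
# publish_reading("station-07", pm25=12.4, co2=415.0, ph=7.1,
#                 do_mgl=6.8, temp_c=24.3, humidity=61.0)
```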

17 pages, 1563 KiB  
Article
Precision Nutrient Management Using Artificial Intelligence Based on Digital Data Collection Framework
by Hsiu-An Lee, Tzu-Ting Huang, Lo-Hsien Yen, Pin-Hua Wu, Kuan-Wen Chen, Hsin-Hua Kung, Chen-Yi Liu and Chien-Yeh Hsu
Appl. Sci. 2022, 12(9), 4167; https://doi.org/10.3390/app12094167 - 20 Apr 2022
Cited by 3 | Viewed by 2516
Abstract
(1) Background: Nutritional intake is fundamental to human growth and health, and the intake of different types of nutrients and micronutrients can affect health. The content of the diet affects the occurrence of disease, with the incidence of many diseases increasing each year while the age group at which they occur is gradually decreasing. (2) Methods: An artificial intelligence model for precision nutritional analysis allows the user to enter the name and serving size of a dish to assess a total of 24 nutrients. A total of two AI models, including semantic and nutritional analysis models, were integrated into the Precision Nutritional Analysis. A total of five different algorithms were used to identify the most similar recipes and to determine differences in text using cosine similarity. (3) Results: This study developed two models to form a precision nutrient analysis model. The 2013–2016 Taiwan National Nutrition Health Status Change Survey (NNHS) was used for model verification. The model’s accuracy was determined by comparing the results of the model with the NNHS. The results show that the AI model has very little error and can significantly improve the efficiency of the analysis. (4) Conclusions: This study proposed an Intelligence Precision Nutrient Analysis Model based on a digital data collection framework, where the nutrient intake was analyzed by entering dietary recall data. The AI model can be used as a reference for nutrition surveys and personal nutrition analysis. Full article
(This article belongs to the Special Issue Innovative Applications of Big Data and Cloud Computing)
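The semantic-matching step, pairing a free-text dish name with the closest known recipe via cosine similarity, can be illustrated with a minimal sketch. The tiny recipe list and TF-IDF representation below are assumptions for demonstration, not the study's actual model or database:

```python
# A minimal sketch of matching a free-text dish name to the most similar
# known recipe using cosine similarity over TF-IDF vectors.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

recipes = ["beef noodle soup", "chicken fried rice", "steamed fish with ginger"]

def most_similar_recipe(dish_name: str) -> str:
    vectorizer = TfidfVectorizer()
    matrix = vectorizer.fit_transform(recipes + [dish_name])
    scores = cosine_similarity(matrix[-1], matrix[:-1])[0]   # query vs. every recipe
    return recipes[scores.argmax()]

print(most_similar_recipe("fried rice with chicken"))   # -> "chicken fried rice"
```

Once the closest recipe is identified, its stored nutrient profile can be scaled by the reported serving size to estimate intake, which is the role the nutritional analysis model plays in the study.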

19 pages, 4222 KiB  
Article
Pareto-Optimised Fog Storage Services with Novel Service-Level Agreement Specification
by Petar Kochovski, Uroš Paščinski, Vlado Stankovski and Mojca Ciglarič
Appl. Sci. 2022, 12(7), 3308; https://doi.org/10.3390/app12073308 - 24 Mar 2022
Cited by 5 | Viewed by 1907
Abstract
(1) Background: Cloud storage is often required for successful operation of novel smart applications, relying on data produced by the Internet of Things (IoT) devices. Big Data processing tasks and management operations for such applications require high Quality of Service (QoS) guarantees, requiring an Edge/Fog computing approach. Additionally, users often require specific guarantees in the form of Service Level Agreements (SLAs) for storage services. To address these problems, we propose QoS-enabled Fog Storage Services, implemented as containerised storage services, orchestrated across the Things-to-Cloud computing continuum. (2) Method: The placement of containerised data storage services in the Things-to-Cloud continuum is dynamically decided using a novel Pareto-based decision-making process based on high availability, high throughput, and other QoS demands of the user. The proposed concept is first confirmed via simulation and then tested in a real-world environment. (3) Results: The decision-making mechanism and a novel SLA specification have been successfully implemented and integrated in the DECENTER Fog and Brokerage Platform to complement the orchestration services for storage containers, thus presenting their applicable value. Simulation results as well as practical experimentation in a Europe-wide testbed have shown that the proposed decision-making method can deliver a set of optimal storage nodes, thus meeting the SLA requirements. (4) Conclusion: It is possible to provide new smart applications with the expected SLA guarantees and high QoS for our proposed Fog Storage Services. Full article
(This article belongs to the Special Issue Innovative Applications of Big Data and Cloud Computing)
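The core of the placement decision is a Pareto filter over candidate storage nodes. The sketch below shows one minimal reading of that idea with two objectives, availability and latency; the objectives and candidate values are illustrative, and the paper's decision-making process considers further QoS demands and the SLA specification.

```python
# A minimal Pareto-front sketch: keep only storage nodes not dominated on
# availability (higher is better) and latency (lower is better).
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    availability: float   # fraction, higher is better
    latency_ms: float     # lower is better

def dominates(a: Node, b: Node) -> bool:
    """a dominates b if it is no worse in both objectives and better in at least one."""
    no_worse = a.availability >= b.availability and a.latency_ms <= b.latency_ms
    better = a.availability > b.availability or a.latency_ms < b.latency_ms
    return no_worse and better

def pareto_front(nodes):
    return [n for n in nodes if not any(dominates(m, n) for m in nodes)]

candidates = [
    Node("edge-1", 0.990, 12.0),
    Node("fog-1", 0.995, 25.0),
    Node("cloud-1", 0.999, 80.0),
    Node("edge-2", 0.985, 30.0),   # dominated by fog-1
]
print([n.name for n in pareto_front(candidates)])   # ['edge-1', 'fog-1', 'cloud-1']
```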

13 pages, 526 KiB  
Article
An Event-Driven Serverless ETL Pipeline on AWS
by Antreas Pogiatzis and Georgios Samakovitis
Appl. Sci. 2021, 11(1), 191; https://doi.org/10.3390/app11010191 - 28 Dec 2020
Cited by 7 | Viewed by 4112
Abstract
This work presents an event-driven Extract, Transform, and Load (ETL) pipeline serverless architecture and provides an evaluation of its performance over a range of dataflow tasks of varying frequency, velocity, and payload size. We design an experiment while using generated tabular data throughout varying data volumes, event frequencies, and processing power in order to measure: (i) the consistency of pipeline executions; (ii) reliability on data delivery; (iii) maximum payload size per pipeline; and, (iv) economic scalability (cost of chargeable tasks). We run 92 parameterised experiments on a simple AWS architecture, thus avoiding any AWS-enhanced platform features, in order to allow for unbiased assessment of our model’s performance. Our results indicate that our reference architecture can achieve time-consistent data processing of event payloads of more than 100 MB, with a throughput of 750 KB/s across four event frequencies. It is also observed that, although the utilisation of an SQS queue for data transfer enables easy concurrency control and data slicing, it becomes a bottleneck on large sized event payloads. Finally, we develop and discuss a candidate pricing model for our reference architecture usage. Full article
(This article belongs to the Special Issue Innovative Applications of Big Data and Cloud Computing)
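One stage of such an event-driven pipeline can be sketched as an AWS Lambda function triggered by SQS messages that writes transformed records to S3. The bucket name and the transform below are placeholders, and the sketch omits the batching, concurrency control, and cost accounting the paper evaluates:

```python
# A minimal sketch of one serverless ETL stage: an SQS-triggered Lambda that
# transforms each message and stores the result in S3.
import json
import boto3

s3 = boto3.client("s3")
BUCKET = "example-etl-output"   # hypothetical bucket name

def transform(record: dict) -> dict:
    # Placeholder transform: normalise keys to lower case.
    return {k.lower(): v for k, v in record.items()}

def handler(event, context):
    # An SQS-triggered Lambda receives messages under event["Records"].
    records = event.get("Records", [])
    for message in records:
        row = json.loads(message["body"])
        out = transform(row)
        key = f"transformed/{message['messageId']}.json"
        s3.put_object(Bucket=BUCKET, Key=key, Body=json.dumps(out).encode("utf-8"))
    return {"processed": len(records)}
```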

17 pages, 4636 KiB  
Article
Minimizing Resource Waste in Heterogeneous Resource Allocation for Data Stream Processing on Clouds
by Wu-Chun Chung, Tsung-Lin Wu, Yi-Hsuan Lee, Kuo-Chan Huang, Hung-Chang Hsiao and Kuan-Chou Lai
Appl. Sci. 2021, 11(1), 149; https://doi.org/10.3390/app11010149 - 25 Dec 2020
Cited by 4 | Viewed by 2147
Abstract
Resource allocation is vital for improving system performance in big data processing. The resource demand for various applications can be heterogeneous in cloud computing. Therefore, a resource gap occurs while some resource capacities are exhausted and other resource capacities on the same server are still available. This phenomenon is more apparent when the computing resources are more heterogeneous. Previous resource-allocation algorithms paid limited attention to this situation. When such an algorithm is applied to a server with heterogeneous resources, resource allocation may result in considerable resource wastage for the available but unused resources. To reduce resource wastage, a resource-allocation algorithm, called the minimizing resource gap (MRG) algorithm, for heterogeneous resources is proposed in this study. In MRG, the gap between resource usages for each server in cloud computing and the resource demands among various applications are considered. When an application is launched, MRG calculates resource usage and allocates resources to the server with the minimized usage gap to reduce the amount of available but unused resources. To demonstrate MRG performance, the MRG algorithm was implemented in Apache Spark. CPU- and memory-intensive applications were applied as benchmarks with different resource demands. Experimental results proved the superiority of the proposed MRG approach for improving the system utilization to reduce the overall completion time by up to 24.7% for heterogeneous servers in cloud computing. Full article
(This article belongs to the Special Issue Innovative Applications of Big Data and Cloud Computing)
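A minimal reading of the resource-gap idea is sketched below: among servers that can host a job, pick the one whose leftover CPU and memory remain most balanced after allocation, so that less capacity is stranded. The gap metric and the numbers are illustrative and simpler than the MRG formulation in the paper:

```python
# A minimal sketch of gap-minimising placement over heterogeneous servers.
def resource_gap(free_cpu, free_mem, total_cpu, total_mem):
    """Gap between normalised residual CPU and memory on one server."""
    return abs(free_cpu / total_cpu - free_mem / total_mem)

def pick_server(servers, demand_cpu, demand_mem):
    best, best_gap = None, float("inf")
    for s in servers:
        if s["free_cpu"] < demand_cpu or s["free_mem"] < demand_mem:
            continue   # cannot host the job at all
        gap = resource_gap(s["free_cpu"] - demand_cpu, s["free_mem"] - demand_mem,
                           s["total_cpu"], s["total_mem"])
        if gap < best_gap:
            best, best_gap = s, gap
    return best

servers = [
    {"name": "A", "free_cpu": 8, "free_mem": 48, "total_cpu": 16, "total_mem": 64},
    {"name": "B", "free_cpu": 8, "free_mem": 24, "total_cpu": 16, "total_mem": 64},
]
# The job lands on A, whose leftover CPU and memory stay the most balanced.
print(pick_server(servers, demand_cpu=2, demand_mem=16)["name"])   # -> "A"
```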

16 pages, 704 KiB  
Article
Semi-Automatic Cloud-Native Video Annotation for Autonomous Driving
by Sergio Sánchez-Carballido, Orti Senderos, Marcos Nieto and Oihana Otaegui
Appl. Sci. 2020, 10(12), 4301; https://doi.org/10.3390/app10124301 - 23 Jun 2020
Cited by 4 | Viewed by 2985
Abstract
An innovative solution named Annotation as a Service (AaaS) has been specifically designed to integrate heterogeneous video annotation workflows into containers and take advantage of a cloud native highly scalable and reliable design based on Kubernetes workloads. Using the AaaS as a foundation, the execution of automatic video annotation workflows is addressed in the broader context of a semi-automatic video annotation business logic for ground truth generation for Autonomous Driving (AD) and Advanced Driver Assistance Systems (ADAS). The document presents design decisions, innovative developments, and tests conducted to provide scalability to this cloud-native ecosystem for semi-automatic annotation. The solution has proven to be efficient and resilient on an AD/ADAS scale, specifically in an experiment with 25 TB of input data to annotate, 4000 concurrent annotation jobs, and 32 worker nodes forming a high performance computing cluster with a total of 512 cores, and 2048 GB of RAM. Automatic pre-annotations with the proposed strategy reduce the time of human participation in the annotation up to 80% maximum and 60% on average. Full article
(This article belongs to the Special Issue Innovative Applications of Big Data and Cloud Computing)
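To give a flavour of how one containerised annotation task might be expressed on Kubernetes, the sketch below builds a Job manifest programmatically; the image name, resource requests, and input path are hypothetical placeholders rather than the AaaS implementation.

```python
# A minimal sketch of a Kubernetes Job manifest for one annotation task.
# All names, images, and resource figures are placeholders.
import yaml   # PyYAML

def annotation_job_manifest(job_id: str, video_uri: str) -> dict:
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"annotate-{job_id}"},
        "spec": {
            "backoffLimit": 2,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [{
                        "name": "annotator",
                        "image": "registry.example.com/aaas/annotator:latest",
                        "args": ["--input", video_uri],
                        "resources": {"requests": {"cpu": "4", "memory": "8Gi"}},
                    }],
                }
            },
        },
    }

print(yaml.dump(annotation_job_manifest("0001", "s3://videos/clip-0001.mp4")))
# The resulting YAML could be submitted with `kubectl apply -f -`; queueing
# many such Jobs is how a cluster absorbs large concurrent annotation workloads.
```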

18 pages, 845 KiB  
Article
FirepanIF: High Performance Host-Side Flash Cache Warm-Up Method in Cloud Computing
by Hyunchan Park, Munkyu Lee and Cheol-Ho Hong
Appl. Sci. 2020, 10(3), 1014; https://doi.org/10.3390/app10031014 - 4 Feb 2020
Cited by 1 | Viewed by 2099
Abstract
In cloud computing, a shared storage server, which provides a network-attached storage device, is usually used for centralized data management. However, when multiple virtual machines (VMs) concurrently access the storage server through the network, the performance of each VM may decrease due to limited bandwidth. To address this issue, a flash-based storage device such as a solid state drive (SSD) is often employed as a cache in the host server. This host-side flash cache saves remote data, which are frequently accessed by the VM, locally in the cache. However, frequent VM migration in the data center can weaken the effectiveness of a host-side flash cache as the migrated VM needs to warm up its flash cache again on the destination machine. This study proposes Cachemior, Firepan, and FirepanIF for rapid flash-cache migration in cloud computing. Cachemior warms up the flash cache with a data preloading approach using the shared storage server after VM migration. However, it does not achieve a satisfactory level of performance. Firepan and FirepanIF use the source node’s flash cache as the data source for flash cache warm-up. They can migrate the flash-cache more quickly than conventional methods as they can avoid storage and network congestion on the shared storage server. Firepan incurs downtime of the VM during flash cache migration for data consistency. FirepanIF minimizes the VM downtime with the invalidation filter, which traces the I/O activity of the migrated VM during flash cache migration in order to invalidate inconsistent cache blocks. We implement and evaluate the three flash cache migration techniques in a realistic virtualized environment. FirepanIF demonstrates that it can improve the performance of the I/O workload by up to 21.87% compared to conventional methods. Full article
(This article belongs to the Special Issue Innovative Applications of Big Data and Cloud Computing)
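The invalidation-filter idea can be illustrated with a small model: record every block the migrated VM writes while the flash cache is being copied, then drop those entries from the warmed-up cache on the destination. This is a simplified sketch of the concept, not the FirepanIF implementation:

```python
# A minimal sketch of an invalidation filter for flash-cache migration:
# blocks written by the running VM during the copy are treated as stale.
class InvalidationFilter:
    def __init__(self):
        self._dirty_blocks = set()

    def record_write(self, block_id: int) -> None:
        """Called for every block the migrated VM writes during cache migration."""
        self._dirty_blocks.add(block_id)

    def apply(self, migrated_cache: dict) -> dict:
        """Drop cache entries that became stale while the copy was in flight."""
        return {blk: data for blk, data in migrated_cache.items()
                if blk not in self._dirty_blocks}

filt = InvalidationFilter()
filt.record_write(42)                                   # VM overwrote block 42 mid-migration
warm_cache = filt.apply({41: b"...", 42: b"...", 43: b"..."})
print(sorted(warm_cache))                               # [41, 43]
```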

16 pages, 5728 KiB  
Article
Resource Utilization Scheme of Idle Virtual Machines for Multiple Large-Scale Jobs Based on OpenStack
by Jueun Jeon, Jong Hyuk Park and Young-Sik Jeong
Appl. Sci. 2019, 9(20), 4327; https://doi.org/10.3390/app9204327 - 15 Oct 2019
Cited by 2 | Viewed by 2776
Abstract
Cloud computing services that provide computing resources to users through the Internet also provide computing resources in a virtual machine form based on virtualization techniques. In general, supercomputing and grid computing have mainly been used to process large-scale jobs occurring in scientific, technical, and engineering application domains. However, services that process large-scale jobs in parallel using idle virtual machines are not provided in cloud computing at present. Generally, users do not use virtual machines anymore, or they do not use them for a long period of time, because existing cloud computing assigns all of the use rights of virtual machines to users, resulting in the low use of computing resources. This study proposes a scheme to process large-scale jobs in parallel, using idle virtual machines and increasing the resource utilization of idle virtual machines. Idle virtual machines are basically identified through specific determination criteria out of virtual machines created using OpenStack, and then they are used in computing services. This is called the idle virtual machine–resource utilization (IVM–ReU), which is proposed in this study. Full article
(This article belongs to the Special Issue Innovative Applications of Big Data and Cloud Computing)
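One plausible shape for the idleness test is sketched below: a VM whose recent CPU utilisation stays below a small threshold for a full observation window is treated as idle and becomes a candidate for the IVM-ReU pool. The threshold, window length, and sampling interval are assumptions; the paper defines its own determination criteria over OpenStack telemetry.

```python
# A minimal sketch of an idleness test over recent CPU utilisation samples.
def is_idle(cpu_samples, threshold_pct=5.0, window=12):
    """cpu_samples: most recent per-interval CPU utilisation percentages."""
    recent = cpu_samples[-window:]
    return len(recent) == window and sum(recent) / window < threshold_pct

# Twelve five-minute samples, i.e. one hour of telemetry for a candidate VM.
samples = [1.2, 0.8, 2.1, 0.5, 1.0, 0.9, 1.4, 0.7, 0.6, 1.1, 0.8, 0.9]
print(is_idle(samples))   # True -> eligible to receive slices of a large-scale job
```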

18 pages, 1626 KiB  
Article
Diagnosis and Prediction of Large-for-Gestational-Age Fetus Using the Stacked Generalization Method
by Faheem Akhtar, Jianqiang Li, Yan Pei, Azhar Imran, Asif Rajput, Muhammad Azeem and Qing Wang
Appl. Sci. 2019, 9(20), 4317; https://doi.org/10.3390/app9204317 - 14 Oct 2019
Cited by 19 | Viewed by 2831
Abstract
An accurate and efficient Large-for-Gestational-Age (LGA) classification system is developed to classify a fetus as LGA or non-LGA, which has the potential to assist paediatricians and experts in establishing a state-of-the-art LGA prognosis process. The performance of the proposed scheme is validated by using LGA dataset collected from the National Pre-Pregnancy and Examination Program of China (2010–2013). A master feature vector is created to establish primarily data pre-processing, which includes a features’ discretization process and the entertainment of missing values and data imbalance issues. A principal feature vector is formed using GridSearch-based Recursive Feature Elimination with Cross-Validation (RFECV) + Information Gain (IG) feature selection scheme followed by stacking to select, rank, and extract significant features from the LGA dataset. Based on the proposed scheme, different features subset are identified and provided to four different machine learning (ML) classifiers. The proposed GridSearch-based RFECV+IG feature selection scheme with stacking using SVM (linear kernel) best suits the said classification process followed by SVM (RBF kernel) and LR classifiers. The Decision Tree (DT) classifier is not suggested because of its low performance. The highest prediction precision, recall, accuracy, Area Under the Curve (AUC), specificity, and F1 scores of 0.92, 0.87, 0.92, 0.95, 0.95, and 0.89 are achieved with SVM (linear kernel) classifier using top ten principal features subset, which is, in fact higher than the baselines methods. Moreover, almost every classification scheme best performed with ten principal feature subsets. Therefore, the proposed scheme has the potential to establish an efficient LGA prognosis process using gestational parameters, which can assist paediatricians and experts to improve the health of a newborn using computer aided-diagnostic system. Full article
(This article belongs to the Special Issue Innovative Applications of Big Data and Cloud Computing)
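The modelling pipeline described above, recursive feature elimination with cross-validation (RFECV) followed by a stacked ensemble with a linear-kernel SVM meta-learner, can be sketched with scikit-learn as below. Synthetic data stands in for the LGA dataset, and the estimators and hyperparameters are illustrative rather than the study's tuned configuration:

```python
# A minimal sketch of RFECV feature selection feeding a stacked classifier
# with an SVM (linear kernel) meta-learner, on synthetic data.
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.pipeline import Pipeline
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=30, n_informative=10, random_state=0)

pipeline = Pipeline([
    ("select", RFECV(estimator=LogisticRegression(max_iter=1000), cv=5)),
    ("stack", StackingClassifier(
        estimators=[("svm_rbf", SVC(kernel="rbf", probability=True)),
                    ("lr", LogisticRegression(max_iter=1000))],
        final_estimator=SVC(kernel="linear"),
    )),
])

print(cross_val_score(pipeline, X, y, cv=5).mean())   # mean cross-validated accuracy
```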
