
A Distributed Stream Processing Middleware Framework for Real-Time Analysis of Heterogeneous Data on Big Data Platform: Case of Environmental Monitoring

Centre for Sustainable Smart Cities, Central University of Technology, Free State 9300, South Africa
*
Author to whom correspondence should be addressed.
Sensors 2020, 20(11), 3166; https://doi.org/10.3390/s20113166
Received: 27 March 2020 / Revised: 13 May 2020 / Accepted: 14 May 2020 / Published: 3 June 2020
(This article belongs to the Special Issue Communications and Computing in Sensor Network)
In recent years, the wide adoption of Internet of Things (IoT)-based technologies has increased the proliferation of monitoring systems, which has in turn exponentially increased the amount of heterogeneous data generated. Processing and analysing this massive amount of data is cumbersome, and practice is gradually moving from classical ‘batch’ processing, i.e., the extract, transform, load (ETL) technique, to real-time processing. For instance, in the environmental monitoring and management domain, time-series data and historical datasets are crucial for prediction models. However, the domain still relies on legacy systems and batch processing, which complicates real-time analysis of the essential data and integration with big data platforms. Herein, as a solution, a distributed stream processing middleware framework for real-time analysis of heterogeneous environmental monitoring and management data is presented and tested on a cluster using open source technologies in a big data environment. The system ingests datasets from legacy systems and sensor data from heterogeneous automated weather systems, irrespective of data type, into Apache Kafka topics using Kafka Connect APIs for processing by the Kafka stream processing engine. The stream processing engine executes the predictive numerical models and algorithms, represented in event processing (EP) languages, for real-time analysis of the data streams. To prove the feasibility of the proposed framework, we implemented the system using a case study scenario of drought prediction and forecasting based on the Effective Drought Index (EDI) model. Firstly, we transform the predictive model into a form that can be executed by the streaming engine for real-time computing. Secondly, the model is applied to the ingested data streams and datasets to predict drought through persistent querying of the infinite streams to detect anomalies. Finally, the performance of the distributed stream processing middleware infrastructure is evaluated to determine the real-time effectiveness of the framework.
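The idea of applying a drought index as a persistent query over an unbounded stream can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the EDI formula, so the code assumes the commonly cited formulation in which effective precipitation (EP) is the sum of running means of the most recent daily totals, and the index is EP standardised against a climatological mean and standard deviation. The Kafka topic is replaced by a plain Python iterable, and the window size, `mean_ep`, `sd_ep` and the alert threshold are hypothetical values chosen only for illustration.

```python
from collections import deque


def effective_precipitation(window):
    """Assumed EP formulation: sum over n = 1..N of the mean of the
    n most recent daily totals. `window` is ordered oldest -> newest."""
    total = 0.0
    running = 0.0
    # Walk from the most recent day backwards, accumulating running means.
    for n, p in enumerate(reversed(list(window)), start=1):
        running += p
        total += running / n
    return total


def edi_stream(rainfall, window_size, mean_ep, sd_ep):
    """Continuously evaluate the index over an (in principle unbounded)
    iterable of (day, precipitation_mm) records.

    mean_ep and sd_ep stand in for climatological statistics; they are
    hypothetical inputs here, not values from the paper."""
    window = deque(maxlen=window_size)  # sliding window over the stream
    for day, p in rainfall:
        window.append(p)
        if len(window) < window_size:
            continue  # not enough history yet
        ep = effective_precipitation(window)
        yield day, ep, (ep - mean_ep) / sd_ep


if __name__ == "__main__":
    # Toy records standing in for messages consumed from a topic.
    records = [(1, 1.0), (2, 2.0), (3, 3.0), (4, 0.0)]
    ALERT_THRESHOLD = -1.0  # illustrative cut-off, an assumption
    for day, ep, edi in edi_stream(records, 3, mean_ep=7.5, sd_ep=2.5):
        status = "ALERT" if edi <= ALERT_THRESHOLD else "ok"
        print(f"day={day} EP={ep:.2f} EDI={edi:.2f} {status}")
```

In a deployment along the lines the abstract describes, the generator's input would be a consumer reading from a Kafka topic, and the window would typically span a much longer period than three days.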
Keywords: big data; stream processing; middleware; Internet of Things; Apache Kafka; drought
MDPI and ACS Style

Akanbi, A.; Masinde, M. A Distributed Stream Processing Middleware Framework for Real-Time Analysis of Heterogeneous Data on Big Data Platform: Case of Environmental Monitoring. Sensors 2020, 20, 3166. https://doi.org/10.3390/s20113166

AMA Style

Akanbi A, Masinde M. A Distributed Stream Processing Middleware Framework for Real-Time Analysis of Heterogeneous Data on Big Data Platform: Case of Environmental Monitoring. Sensors. 2020; 20(11):3166. https://doi.org/10.3390/s20113166

Chicago/Turabian Style

Akanbi, Adeyinka; Masinde, Muthoni. 2020. "A Distributed Stream Processing Middleware Framework for Real-Time Analysis of Heterogeneous Data on Big Data Platform: Case of Environmental Monitoring" Sensors 20, no. 11: 3166. https://doi.org/10.3390/s20113166
