Search Results (16)

Search Parameters:
Keywords = infinite data streams

15 pages, 4080 KB  
Article
Lossless and Near-Lossless L-Infinite Compression of Depth Video Data
by Mohammad Ali Tahouri, Alin Adrian Alecu, Leon Denis and Adrian Munteanu
Sensors 2025, 25(5), 1403; https://doi.org/10.3390/s25051403 - 25 Feb 2025
Cited by 5 | Viewed by 2222
Abstract
The acquisition of depth data from sensors is critically important in medical applications, such as the monitoring of the elderly or the extraction of human biometrics. In such applications, compressing the stream of depth video data plays an important role due to bandwidth constraints on transmission channels. This paper introduces a novel lightweight compression system that encodes the semantics of the input depth video and can operate in both lossless and L-infinite near-lossless compression modes. A quantization technique that targets the L-infinite norm for sparse distributions and a new L-infinite compression method that sets bounds on the quantization error are proposed. The proposed codec enables the control of the coding error on every pixel in the input video data, which is crucial in medical applications. Experimental results show an average improvement of 45% and 17% in lossless mode compared to standalone JPEG-LS and CALIC codecs, respectively. Furthermore, in near-lossless mode, the proposed codec achieves superior rate-distortion performance and reduced maximum error per frame compared to HEVC. Additionally, the proposed lightweight codec is designed to perform efficiently in real time when deployed on an embedded depth-camera platform. Full article
(This article belongs to the Section Biomedical Sensors)
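The idea of bounding the error on every pixel can be illustrated with the classic uniform near-lossless quantizer. The sketch below is not the codec from the paper; it only assumes integer depth samples and a user-chosen bound `delta` (a hypothetical parameter name), and shows that bins of width `2 * delta + 1` keep the L-infinite error at or below `delta`.

```python
def quantize(value: int, delta: int) -> int:
    """Map an integer sample to a bin index; each bin is 2 * delta + 1 wide."""
    step = 2 * delta + 1
    # Adding delta before floor division rounds to the nearest bin centre
    # and also behaves correctly for negative values.
    return (value + delta) // step


def dequantize(index: int, delta: int) -> int:
    """Reconstruct the centre of the bin with the given index."""
    return index * (2 * delta + 1)


def max_abs_error(samples, delta: int) -> int:
    """Largest per-sample reconstruction error (L-infinite norm of the error)."""
    return max(abs(s - dequantize(quantize(s, delta), delta)) for s in samples)
```

With `delta = 0` the bin width is 1 and the mapping is lossless, which mirrors the lossless and near-lossless modes described in the abstract.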

24 pages, 6350 KB  
Article
Multi-Label Learning with Distribution Matching Ensemble: An Adaptive and Just-In-Time Weighted Ensemble Learning Algorithm for Classifying a Nonstationary Online Multi-Label Data Stream
by Chao Shen, Bingyu Liu, Changbin Shao, Xibei Yang, Sen Xu, Changming Zhu and Hualong Yu
Symmetry 2025, 17(2), 182; https://doi.org/10.3390/sym17020182 - 24 Jan 2025
Cited by 1 | Viewed by 1514
Abstract
Learning from a nonstationary data stream is challenging, as a data stream is generally considered to be endless and the learning model must be constantly updated to adapt to shifting data distributions. With multi-label data, the challenge is further intensified. In this study, an adaptive online weighted multi-label ensemble learning algorithm called MLDME (multi-label learning with distribution matching ensemble) is proposed. It simultaneously calculates both the feature matching level and the label matching level between each reserved data block and the newly received data block, and then assigns adaptive decision weights to the ensemble classifiers based on their distribution similarities. Specifically, MLDME abandons the most commonly used but not entirely correct underlying hypothesis that, in a data stream, each data block always has the distribution closest to that of the block emerging after it; thus, MLDME can provide a just-in-time decision for the newly received data block. In addition, to avoid unbounded growth of the ensemble, we use a fixed-size buffer to store the classifiers and design three different dynamic classifier updating rules. Experimental results for nine synthetic and three real-world multi-label nonstationary data streams indicate that the proposed MLDME algorithm is superior to some popular and state-of-the-art online learning paradigms and algorithms, including two specifically designed for classifying a nonstationary multi-label data stream. Full article
(This article belongs to the Section Computer)

16 pages, 1480 KB  
Article
Protecting Infinite Data Streams from Wearable Devices with Local Differential Privacy Techniques
by Feng Zhao and Song Fan
Information 2024, 15(10), 630; https://doi.org/10.3390/info15100630 - 12 Oct 2024
Cited by 4 | Viewed by 2742
Abstract
The real-time data collected by wearable devices enables personalized health management and supports public health monitoring. However, sharing these data with third-party organizations introduces significant privacy risks. As a result, protecting and securely sharing wearable device data has become a critical concern. This paper proposes a local differential privacy-preserving algorithm designed for continuous data streams generated by wearable devices. Initially, the data stream is sampled at key points to avoid prematurely exhausting the privacy budget. Then, an adaptive allocation of the privacy budget at these points enhances privacy protection for sensitive data. Additionally, the optimized square wave (SW) mechanism introduces perturbations to the sampled points. Afterward, the Kalman filter algorithm is applied to maintain data flow patterns and reduce prediction errors. Experimental validation using two real datasets demonstrates that, under comparable conditions, this approach provides higher data availability than existing privacy protection methods for continuous data streams. Full article
(This article belongs to the Special Issue Digital Privacy and Security, 2nd Edition)
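The perturb-then-filter step described in the abstract above can be sketched as follows. This is not the paper's algorithm: the square wave (SW) mechanism, key-point sampling, and adaptive budget allocation are replaced here by plain Laplace noise with a fixed per-point budget `eps`, and the filter is a scalar random-walk Kalman filter. All function and parameter names are hypothetical.

```python
import random


def laplace_perturb(value, lo, hi, eps, rng):
    """Clip a reading to [lo, hi] and add Laplace noise of scale (hi - lo) / eps."""
    scale = (hi - lo) / eps
    clipped = min(max(value, lo), hi)
    # The difference of two i.i.d. exponentials is Laplace-distributed.
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return clipped + noise


def kalman_smooth(observations, process_var, noise_var):
    """Scalar Kalman filter assuming the true signal follows a random walk."""
    estimate, variance = observations[0], noise_var
    smoothed = [estimate]
    for z in observations[1:]:
        variance += process_var                 # predict: uncertainty grows
        gain = variance / (variance + noise_var)
        estimate += gain * (z - estimate)       # update towards the observation
        variance *= 1 - gain
        smoothed.append(estimate)
    return smoothed


if __name__ == "__main__":
    rng = random.Random(0)
    heart_rate = [72, 73, 75, 74, 76, 78, 77, 79]   # made-up sample readings
    eps, lo, hi = 1.0, 40.0, 180.0
    noisy = [laplace_perturb(v, lo, hi, eps, rng) for v in heart_rate]
    # A Laplace distribution with scale b has variance 2 * b**2.
    noise_var = 2 * ((hi - lo) / eps) ** 2
    print(kalman_smooth(noisy, process_var=1.0, noise_var=noise_var))
```

Filtering happens after perturbation, so it is post-processing of already-privatized values and does not consume additional privacy budget.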

18 pages, 487 KB  
Article
NLOCL: Noise-Labeled Online Continual Learning
by Kan Cheng, Yongxin Ma, Guanglu Wang, Linlin Zong and Xinyue Liu
Electronics 2024, 13(13), 2560; https://doi.org/10.3390/electronics13132560 - 29 Jun 2024
Cited by 1 | Viewed by 3441
Abstract
Continual learning (CL) from infinite data streams has become a challenge for neural network models in real-world scenarios. Catastrophic forgetting of previous knowledge occurs in this learning setting, and existing supervised CL methods rely excessively on accurately labeled samples. However, real-world data labels are often corrupted by noise, which misleads the CL agents and aggravates forgetting. To address this problem, we propose a method named noise-labeled online continual learning (NLOCL), which implements the online CL model with noise-labeled data streams. NLOCL uses an experience replay strategy to retain crucial examples, separates data streams by small-loss criteria, and includes semi-supervised fine-tuning for labeled and unlabeled samples. In addition, NLOCL combines small loss with class diversity measures and eliminates online memory partitioning. Furthermore, we optimized the experience replay stage to enhance the model performance by retaining significant clean-labeled examples and carefully selecting suitable samples. In the experiments, we designed noise-labeled data streams by injecting noisy labels into multiple datasets and partitioning tasks to simulate infinite data streams realistically. The experimental results demonstrate the superior performance and robust learning capabilities of our proposed method. Full article
(This article belongs to the Special Issue Emerging Theory and Applications in Natural Language Processing)

23 pages, 7644 KB  
Article
An Efficient Chaos-Based Image Encryption Technique Using Bitplane Decay and Genetic Operators
by Ramesh Premkumar, Miroslav Mahdal and Muniyandy Elangovan
Sensors 2022, 22(20), 8044; https://doi.org/10.3390/s22208044 - 21 Oct 2022
Cited by 14 | Viewed by 3120
Abstract
Over the last ten years, social networks have greatly expanded the need for sharing multimedia data. However, on open networks such as the Internet, where security is frequently compromised, eavesdroppers can access the actual contents without much difficulty. Researchers have created a variety of encryption methods to strengthen the security of this transmission and make it difficult for eavesdroppers to obtain genuine data. However, these conventional approaches increase computing costs and communication overhead and do not offer protection against new threats. The problems with current algorithms encourage academics to investigate the subject further and suggest new algorithms that are more effective than current methods, that reduce overhead, and that are equipped with the features needed by next-generation multimedia networks. In this paper, a genetic operator-based encryption method for multimedia security is proposed. The proposed algorithm produces improved key strength results. Investigations using data loss attacks, differential attacks, statistical attacks, and brute-force attacks show that the suggested encryption technique has improved security performance. It focuses on two techniques: bitplane slicing, followed by block segmentation and scrambling. The suggested method first divides the plaintext picture into several blocks, followed by block swapping performed by the genetic operator, which combines the genetic information of two different images to generate new offspring. The key stream is produced from an iterative chaotic map with infinite collapse (ICMIC). Based on a close-loop modulation coupling (CMC) approach, a three-dimensional hyperchaotic ICMIC modulation map is proposed. By using a hybrid model of multidirectional circular permutation with this map, a brand-new colour image encryption algorithm is created. In this approach, a multidirectional circular permutation is used to disrupt the image’s pixel placements, and genetic operations are used to replace the pixel values. According to simulation findings and security analysis, the technique can fend off brute-force, statistical, differential, known-plaintext, and chosen-plaintext attacks, and has a strong key sensitivity. Full article

38 pages, 506 KB  
Article
New Bounds and a Generalization for Share Conversion for 3-Server PIR
by Anat Paskin-Cherniavsky and Olga Nissenbaum
Entropy 2022, 24(4), 497; https://doi.org/10.3390/e24040497 - 1 Apr 2022
Cited by 3 | Viewed by 2568
Abstract
Private Information Retrieval (PIR) protocols, which allow the client to obtain data from servers without revealing its request, have many applications such as anonymous communication, media streaming, blockchain security, advertisement, etc. Multi-server PIR protocols, where the database is replicated among the non-colluding servers, provide high efficiency in the information-theoretic setting. Beimel et al. in CCC '12 (further referred to as BIKO) put forward a paradigm for constructing multi-server PIR, capturing several previous constructions for $k \geq 3$ servers, as well as improving the best-known share complexity for 3-server PIR. A key component there is a share conversion scheme from corresponding linear three-party secret sharing schemes with respect to a certain type of “modified universal” relation. In a useful particular instantiation of the paradigm, they used a share conversion from (2,3)-CNF over $\mathbb{Z}_m$ to three-additive sharing over $\mathbb{Z}_{p^\beta}$ for primes $p_1, p_2, p$ where $p_1 \neq p_2$ and $m = p_1 \cdot p_2$, and the relation is the modified universal relation $C_{S_m}$. They reduced the question of the existence of the share conversion for a triple $(p_1, p_2, p)$ to the (in)solvability of a certain linear system over $\mathbb{Z}_p$, and provided an efficient (in $m$, $\log p$) construction of such a sharing scheme. Unfortunately, the size of the system is $\Theta(m^2)$, which entails the infeasibility of a direct solution for big $m$’s in practice. Paskin-Cherniavsky and Schmerler in 2019 proved the existence of the conversion for the case of odd $p_1$, $p_2$ when $p = p_1$, obtaining in this way infinitely many parameters for which the conversion exists, but also for infinitely many of them it remained open. In this work, using some algebraic techniques from the work of Paskin-Cherniavsky and Schmerler, we prove the existence of the conversion for even $m$’s in case $p = 2$ (we computed $\beta$ in this case) and the absence of the conversion for even $m$’s in case $p > 2$. This does not improve the concrete efficiency of 3-server PIR; however, our result is promising in a broader context of constructing PIR through composition techniques with $k \geq 3$ servers, using the relation $C_{S_m}$ where $m$ has more than two prime divisors. Another suggestion of ours about 3-server PIR is that it’s possible to achieve a shorter server’s response using the relation $C_{S'_m}$ for extended $S'_m \supset S_m$. By computer search, in the BIKO framework we found several such sets for small $m$’s which result in share conversion from (2,3)-CNF over $\mathbb{Z}_m$ to 3-additive secret sharing over $\mathbb{Z}_{p^{\beta'}}$, where $\beta' > 0$ is several times less than $\beta$, which implies several times shorter server’s response. We also suggest that such extended sets $S'_m$ can result in better PIR due to the potential existence of matching vector families with the higher Vapnik-Chervonenkis dimension. Full article
(This article belongs to the Special Issue Recent Advances in Information-Theoretic Cryptography)
20 pages, 4935 KB  
Article
Oil Spill Detection in SAR Images Using Online Extended Variational Learning of Dirichlet Process Mixtures of Gamma Distributions
by Ahmed Almulihi, Fahd Alharithi, Sami Bourouis, Roobaea Alroobaea, Yogesh Pawar and Nizar Bouguila
Remote Sens. 2021, 13(15), 2991; https://doi.org/10.3390/rs13152991 - 29 Jul 2021
Cited by 24 | Viewed by 4209
Abstract
In this paper, we propose a Dirichlet process (DP) mixture model of Gamma distributions, which is an extension of the finite Gamma mixture model to the infinite case. In particular, we propose a novel online nonparametric Bayesian analysis method based on the infinite Gamma mixture model, where the determination of the number of clusters is bypassed via an infinite number of mixture components. The proposed model is learned via an online extended variational Bayesian inference approach in a flexible way, where the priors of the model’s parameters are selected appropriately and the posteriors are approximated effectively in a closed form. The online setting has the advantage of allowing data instances to be treated in a sequential manner, which is more attractive than batch learning, especially when dealing with massive and streaming data. We demonstrate the performance and merits of the proposed statistical framework with a challenging real-world application, namely oil spill detection in synthetic aperture radar (SAR) images. Full article

27 pages, 1329 KB  
Article
Vortex Ring Theory—An Alternative to the Existing Actuator Disk and Rotating Annular Stream Tube Theories
by James Agbormbai, Weidong Zhu and Liang Li
Appl. Sci. 2021, 11(14), 6576; https://doi.org/10.3390/app11146576 - 17 Jul 2021
Cited by 1 | Viewed by 3435
Abstract
Currently, the actuator disk theory (ADT) and the rotating annular stream-tube theory (RAST), both of which are predicated on the axial momentum and generalized momentum theories, among others, are commonly used in investigating the aerodynamic characteristics of horizontal axis wind turbines (HAWTs). These theories, which are based on a rotor with an infinite number of blades, typically do not properly capture the flow physics of wind blowing past the rotors of HAWTs. A vortex ring theory (VRT) that analyzes HAWTs based solely on the characteristics of fluids flowing past obstructions is proposed. The VRT is not predicated on the assertion that the induced velocity in the wake is twice the induced velocity at the rotor. On the contrary, it splits the axial induction factor in the wake into two components, namely, the induction or interference factor due to the solidity of the rotor and the induction factor due to the wake of the rotor aw; aw and its azimuthal counterpart are determined using the Biot–Savart law. The pressure differences across the rotor segments of a HAWT are derived from the Bernoulli equation for all three theories. Blade segment/local areas based on the blade sectional geometry of the rotor are used in the case of the VRT to estimate the local forces. All the calculations in this study are based on the design parameters of the 5 MW National Renewable Energy Laboratory’s reference offshore wind turbine. Pressure differences are plotted as functions of local radii using the calculated axial and azimuthal induction factors for each theory. The local power coefficient is plotted as a function of the local tip-speed ratio, while the local thrust coefficient is plotted as a function of the local radii for all three theories. There is piece-wise agreement between the VRT, the ADT, the RAST and numerical and experimental data available in the literature. Full article

39 pages, 814 KB  
Article
Hydrodynamics of Collapsing Glass Tubes and Measuring of Glass Viscosities: Analytic Results beyond Asymptotic Approaches for Rapidly Varying Viscosities
by Thomas Klupsch
Fluids 2021, 6(5), 179; https://doi.org/10.3390/fluids6050179 - 6 May 2021
Viewed by 3017
Abstract
We present novel analytic solutions of the axial-symmetric boundary value problem of the Stokes equation for incompressible liquids with rapidly varying viscosity, which cover the hydrodynamics of collapsing glass tubes with moving torch. We meet requirements to optimize the contactless measuring of dynamical viscosities and surface tensions of molten glasses through collapsing for tools working with sharply peaked axial temperature courses. We study model solutions for axial courses of the reciprocal viscosity specified as Gaussians extended on small distances compared to the outer tube radius, and we neglect the boundary inclination, corresponding to measuring conditions for large torch velocities. The surface tension is assumed to be constant across the collapsing zone. The boundary value problem becomes disentangled, changing to a gradually independent hierarchy of streaming function, vorticity, and pressure. Axial Fourier transforms are introduced to focus on solutions for infinitely extended tubes. Beyond the predictions of the asymptotic collapsing theory, a successively increasing steepness of the reciprocal viscosity induces an increasing radial pressure gradient that acts against the surface tension and diminishes the collapsing efficiency. The arising systematic error in evaluating the viscosity from experimental data in virtue of the asymptotic collapsing theory is corrected. Error estimations regarding deviations from the specified viscosity course, the neglected boundary inclination, and heat conduction within the tube wall are outlined, and preconditions to simplify the measuring of surface tensions through collapsing are discussed. Full article
(This article belongs to the Collection Complex Fluids)

26 pages, 1446 KB  
Article
Real-Time Emotion Classification Using EEG Data Stream in E-Learning Contexts
by Arijit Nandi, Fatos Xhafa, Laia Subirats and Santi Fort
Sensors 2021, 21(5), 1589; https://doi.org/10.3390/s21051589 - 25 Feb 2021
Cited by 58 | Viewed by 11021
Abstract
In face-to-face and online learning, emotions and emotional intelligence play an essential role. Learners’ emotions are crucial for e-learning systems because they promote or restrain learning. Many researchers have investigated the impact of emotions on enhancing and maximizing e-learning outcomes. Several machine learning and deep learning approaches have also been proposed to achieve this goal. All such approaches are suitable for an offline mode, where the data for emotion classification are stored and can be accessed repeatedly without limit. However, these offline approaches are inappropriate for real-time emotion classification, where the data arrive in a continuous stream and each sample can be seen by the model only once. We also need real-time responses according to the emotional state. For this, we propose a real-time emotion classification system (RECS) based on Logistic Regression (LR) trained in an online fashion using the Stochastic Gradient Descent (SGD) algorithm. The proposed RECS is capable of classifying emotions in real time by training the model in an online fashion using an EEG signal stream. To validate the performance of RECS, we have used the DEAP data set, which is the most widely used benchmark data set for emotion classification. The results show that the proposed approach can effectively classify emotions in real time from the EEG data stream, achieving better accuracy and F1-score than other offline and online approaches. The developed real-time emotion classification system is analyzed in an e-learning context scenario. Full article
(This article belongs to the Special Issue Emotion Monitoring System Based on Sensors and Data Analysis)
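Online training of logistic regression with SGD, as named in the abstract above, can be sketched in a few lines. This is a generic illustration, not the RECS implementation: the class name, the learning rate, and the test-then-train loop are assumptions, and no EEG feature extraction is shown. Each sample is used for one prediction and one gradient step and is then discarded.

```python
import math


class OnlineLogisticRegression:
    """Binary logistic regression updated one sample at a time with SGD."""

    def __init__(self, n_features, learning_rate=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = learning_rate

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))          # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def learn_one(self, x, y):
        # Gradient of the log-loss for one sample is (p - y) * x.
        error = self.predict_proba(x) - y
        self.w = [wi - self.lr * error * xi for wi, xi in zip(self.w, x)]
        self.b -= self.lr * error


def prequential_accuracy(model, stream):
    """Test-then-train: predict each sample first, then learn from it."""
    correct = total = 0
    for x, y in stream:
        correct += int((model.predict_proba(x) >= 0.5) == bool(y))
        total += 1
        model.learn_one(x, y)
    return correct / total if total else 0.0
```

The test-then-train loop is a common way to evaluate stream learners, because every sample serves as unseen test data before it is used for training.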

19 pages, 720 KB  
Article
Exception Handling Method Based on Event from Look-Up Table Applying Stream-Based Lossless Data Compression
by Shinichi Yamagiwa, Koichi Marumo and Suzukaze Kuwabara
Electronics 2021, 10(3), 240; https://doi.org/10.3390/electronics10030240 - 21 Jan 2021
Cited by 3 | Viewed by 2299
Abstract
It is becoming popular to build environments in which IoT edge devices, such as sensor devices, communicate remotely with cloud servers in order to apply, for example, artificial intelligence algorithms to the system. In such situations, which handle big data, lossless data compression is one way to reduce the data volume. In particular, stream-based data compression technology is the focus in such systems because it compresses an infinitely continuous data stream with very small delay. However, during continuous data compression, an exception code cannot be inserted among the compressed data without additional mechanisms, such as the data framing and packetization techniques used in networking technologies. The exception code indicates configurations for the compressor/decompressor and/or its peripheral logic, and it is used in real time to configure the parameters of those components. To implement the exception code, a data compression algorithm must include a mechanism that clearly distinguishes the original data before compression from the exception code. However, conventional algorithms do not include such a mechanism. This paper proposes novel methods to implement the exception code, called the exception symbol, in data compression that uses a look-up table. Additionally, we describe implementation details of the method by applying it to stream-based data compression algorithms. Because some of the proposed mechanisms need to reserve entries in the table, we also discuss the effect on data compression performance based on experimental evaluations. Full article
(This article belongs to the Section Computer Science & Engineering)

36 pages, 2883 KB  
Article
Industry 4.0 towards Forestry 4.0: Fire Detection Use Case
by Radhya Sahal, Saeed H. Alsamhi, John G. Breslin and Muhammad Intizar Ali
Sensors 2021, 21(3), 694; https://doi.org/10.3390/s21030694 - 20 Jan 2021
Cited by 50 | Viewed by 8928
Abstract
Forestry 4.0 is inspired by the Industry 4.0 concept, which plays a vital role in the next-generation industrial revolution. It is ushering in a new era of efficient and sustainable forest management. Environmental sustainability and climate change are related challenges for promoting sustainable forest management of natural resources. Internet of Forest Things (IoFT) is an emerging technology that helps manage forest sustainability and protect forests from hazards by distributing smart devices that gather data streams while monitoring for and detecting fire. Stream processing is a well-known research area, and it has recently gained further significance due to the emergence of IoFT devices. Distributed stream processing platforms such as Apache Flink, Storm, and Spark have emerged. Query windowing, which splits an infinite data stream into finite chunks so that a query can be executed, is at the heart of any stream-processing platform. Dynamic query window-based processing can reduce the reporting time in case of missing and delayed events caused by data drift. In this paper, we present a novel dynamic mechanism to recommend the optimal window size and type based on the dynamic context of an IoFT application. In particular, we designed a dynamic window selector for stream queries that considers input stream data characteristics, application workload, and resource constraints to recommend the optimal stream query window configuration. A research gap on the likelihood of adopting smart IoFT devices in environmental sustainability indicates a lack of empirical studies to pursue forest sustainability, i.e., sustainable forestry applications. So, we focus on forest fire management and detection as a use case of Forestry 4.0, one of the dynamic environmental management challenges, i.e., climate change, to deliver sustainable forestry goals. According to the dynamic window selector’s experimental results, the end-to-end latency of reported fire alerts is reduced by dynamically adapting the window size to changes in the IoFT stream rate. Full article
(This article belongs to the Section Internet of Things)
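Windowing, which the abstract above describes as splitting an infinite stream into finite chunks, can be illustrated with a fixed-size tumbling window. The sketch assumes events arrive as `(timestamp, value)` pairs in timestamp order; the function name is hypothetical, and the dynamic window selection proposed in the paper is not shown.

```python
from itertools import count, islice


def tumbling_windows(events, size):
    """Lazily group (timestamp, value) events into back-to-back windows.

    Yields (window_start, values) pairs. Windows are aligned to multiples of
    `size`; windows that received no events are yielded with an empty list.
    """
    window_start, bucket = None, []
    for ts, value in events:
        if window_start is None:
            window_start = ts - ts % size     # align the first window
        while ts >= window_start + size:      # close every finished window
            yield window_start, bucket
            window_start, bucket = window_start + size, []
        bucket.append(value)
    if bucket:                                # only reached for finite input
        yield window_start, bucket


if __name__ == "__main__":
    # An endless stream of readings, one per time unit; take two windows.
    endless = ((t, t) for t in count())
    print(list(islice(tumbling_windows(endless, 3), 2)))
```

Because the function is a generator, it never needs the whole stream in memory, and a query such as an average or a threshold check can run on each finite window as soon as it closes.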

19 pages, 944 KB  
Article
A Flexible IoT Stream Processing Architecture Based on Microservices
by Luca Bixio, Giorgio Delzanno, Stefano Rebora and Matteo Rulli
Information 2020, 11(12), 565; https://doi.org/10.3390/info11120565 - 2 Dec 2020
Cited by 10 | Viewed by 6977
Abstract
The Internet of Things (IoT) has created new and challenging opportunities for data analytics. The IoT represents an infinite source of massive and heterogeneous data, whose real-time processing is an increasingly important issue. IoT applications usually consist of multiple technological layers connecting ‘things’ to a remote cloud core. These layers are generally grouped into two macro levels: the edge level (consisting of the devices at the boundary of the network near the devices that produce the data) and the core level (consisting of the remote cloud components of the application). The aim of this work is to propose an adaptive microservices architecture for IoT platforms which provides real-time stream processing functionalities that can operate seamlessly at both the edge level and the cloud level. In more detail, we introduce the notion of μ-service, a stream processing unit that can be allocated interchangeably on the edge or the core level, and a Reference Architecture that provides all necessary services (namely Proxy, Adapter and Data Processing μ-services) for dealing with real-time stream processing in a very flexible way. Furthermore, in order to abstract away from the underlying stream processing engine and IoT layers (edge/cloud), we propose: (1) a service definition language consisting of a configuration language based on JSON objects (interoperability), (2) a rule-based query language with basic filter operations that can be compiled to most of the existing stream processing engines (portability), and (3) a combinator language to build pipelines of filter definitions (compositionality). Although our proposal has been designed to extend the Senseioty platform, a proprietary IoT platform developed by FlairBit, it could be adapted to every platform based on similar technologies. As a proof of concept, we provide details of a preliminary prototype based on the Java OSGi framework. Full article
(This article belongs to the Special Issue Microservices and Cloud-Native Solutions: From Design to Operation)

20 pages, 893 KB  
Article
On Frequency Estimation and Detection of Heavy Hitters in Data Streams
by Federica Ventruto, Marco Pulimeno, Massimo Cafaro and Italo Epicoco
Future Internet 2020, 12(9), 158; https://doi.org/10.3390/fi12090158 - 18 Sep 2020
Cited by 7 | Viewed by 3560
Abstract
A stream can be thought of as a very large set of data, sometimes even infinite, which arrives sequentially and must be processed without the possibility of being stored. In fact, the memory available to the algorithm is limited and it is not possible to store the whole stream of data, which is instead scanned upon arrival and summarized through a succinct data structure in order to maintain only the information of interest. Two of the main tasks related to data stream processing are frequency estimation and heavy hitter detection. The frequency estimation problem requires estimating the frequency of each item, that is, the number of times or the weight with which each appears in the stream, while heavy hitter detection means the detection of all those items with a frequency higher than a fixed threshold. In this work we design and analyze ACMSS, an algorithm for frequency estimation and heavy hitter detection, and compare it against the state-of-the-art ASketch algorithm. We show that, given the same budgeted amount of memory, our algorithm outperforms ASketch in accuracy for the task of frequency estimation. Furthermore, we show that, under the assumptions stated by its authors, ASketch may not be able to report all of the heavy hitters, whilst ACMSS will provide the full list of heavy hitters with high probability.
(This article belongs to the Section Big Data and Augmented Intelligence)
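
To make the two tasks concrete, here is a minimal Python sketch of the classic Misra-Gries summary, a well-known bounded-memory algorithm for the same problem. It is neither ACMSS nor ASketch; the function name `misra_gries` and the parameter `k` are chosen here for illustration only.

```python
from typing import Dict, Hashable, Iterable


def misra_gries(stream: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """Summarize a stream with at most k - 1 counters.

    For a stream of n items, every item occurring more than n / k times is
    guaranteed to remain in the summary, and each stored count
    underestimates the true frequency by at most n / k.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters: Dict[Hashable, int] = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement every counter and drop those at zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


if __name__ == "__main__":
    stream = ["a"] * 6 + ["b"] * 3 + ["c", "d", "e"]
    print(misra_gries(stream, k=3))
```

The summary may also contain items that are not heavy hitters, so the surviving keys are candidates; a second pass, when one is possible, confirms them against the threshold.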

25 pages, 2522 KB  
Article
A Distributed Stream Processing Middleware Framework for Real-Time Analysis of Heterogeneous Data on Big Data Platform: Case of Environmental Monitoring
by Adeyinka Akanbi and Muthoni Masinde
Sensors 2020, 20(11), 3166; https://doi.org/10.3390/s20113166 - 3 Jun 2020
Cited by 33 | Viewed by 11271
Abstract
In recent years, the application and wide adoption of Internet of Things (IoT)-based technologies have increased the proliferation of monitoring systems, which has consequently exponentially increased the amounts of heterogeneous data generated. Processing and analysing the massive amount of data produced is cumbersome, and is gradually moving from the classical ‘batch’ extract, transform, load (ETL) technique to real-time processing. For instance, in the environmental monitoring and management domain, time-series data and historical datasets are crucial for prediction models. However, the environmental monitoring domain still utilises legacy systems, which complicates real-time analysis of the essential data and integration with big data platforms, and perpetuates reliance on batch processing. Herein, as a solution, a distributed stream processing middleware framework for real-time analysis of heterogeneous environmental monitoring and management data is presented and tested on a cluster using open source technologies in a big data environment. The system ingests datasets from legacy systems and sensor data from heterogeneous automated weather systems, irrespective of the data types, into Apache Kafka topics using Kafka Connect APIs for processing by the Kafka stream processing engine. The stream processing engine executes the predictive numerical models and algorithms represented in event processing (EP) languages for real-time analysis of the data streams. To prove the feasibility of the proposed framework, we implemented the system using a case study scenario of drought prediction and forecasting based on the Effective Drought Index (EDI) model. Firstly, we transform the predictive model into a form that can be executed by the streaming engine for real-time computing. Secondly, the model is applied to the ingested data streams and datasets to predict drought through persistent querying of the infinite streams to detect anomalies. As a conclusion of this study, a performance evaluation of the distributed stream processing middleware infrastructure is carried out to determine the real-time effectiveness of the framework.
(This article belongs to the Special Issue Communications and Computing in Sensor Network)