Search Results (49)

Search Parameters:
Keywords = Spark Streaming

18 pages, 47766 KB  
Article
Scalable AI + DSP Compute Frameworks Using AMD Xilinx RF-SoC ZCU/VCU Platforms for Wireless Testbeds for Scientific, Commercial, Space, and Defense Applications
by Buddhipriya Gayanath, Gayani Rathnasekara, Kasun Karunanayake and Arjuna Madanayake
Electronics 2026, 15(2), 445; https://doi.org/10.3390/electronics15020445 - 20 Jan 2026
Viewed by 508
Abstract
This paper describes recent engineering designs that allow full-duplex SerDes connectivity between a number of cascaded Xilinx radio frequency system-on-chip (RF-SoC) and VCU FPGA systems. The design allows for unlimited scalability with all-to-all connectivity across FPGA systems and RF-SoCs that allow for bidirectional data transport in streaming mode at a capacity of 50 Gbps per ADC-DAC channel. A custom massively parallel systolic-array architecture supporting 8 parallel data streams from time-interleaved ADC/DACs allows real-time matrix–vector multiplication (MVM). The MVM can be 8 × 8, 8 × 16, …, 8 × 1024 in supported matrix size, and is demonstrated at a sustained real-time throughput of 1 TeraMAC/second for matrix size 8 × 512. The MVM is the building block supporting machine learning and filtering, with the computational graph split across FPGA systems using the SerDes connections. The RF data processed by the FPGA chain can be further utilized for higher-level AI workloads on an NVIDIA DGX Spark platform connected to the system. We demonstrate two platforms in which ZCU111 and ZCU1285 RF-SoC boards perform direct-RF data acquisition, while compute engines operating in real time on VCU128 and VCU129 FPGA boards showcase both digital beamforming and polyphase FIR filterbanking in a real-time bandwidth of 1.0 GHz. Full article
(This article belongs to the Special Issue Emerging Applications of FPGAs and Reconfigurable Computing System)

13 pages, 1149 KB  
Article
Monitoring IoT and Robotics Data for Sustainable Agricultural Practices Using a New Edge–Fog–Cloud Architecture
by Mohamed El-Ouati, Sandro Bimonte and Nicolas Tricot
Computers 2026, 15(1), 32; https://doi.org/10.3390/computers15010032 - 7 Jan 2026
Viewed by 445
Abstract
Modern agricultural operations generate high-volume and diverse data (historical and stream) from various sources, including IoT devices, robots, and drones. This paper presents a novel smart farming architecture specifically designed to efficiently manage and process this complex data landscape. The proposed architecture comprises five distinct, interconnected layers: the Source Layer, the Ingestion Layer, the Batch Layer, the Speed Layer, and the Governance Layer. The Source Layer serves as the unified entry point, accommodating structured, spatial, and image data from sensors, drones, and ROS-equipped robots. The Ingestion Layer uses a hybrid fog/cloud architecture with Kafka for real-time streams and for batch processing of historical data. Data is then segregated for processing: the cloud-deployed Batch Layer employs a Hadoop cluster, Spark, Hive, and Drill for large-scale historical analysis, while the Speed Layer utilizes Geoflink and PostGIS for low-latency, real-time geovisualization. Finally, the Governance Layer guarantees data quality, lineage, and organization across all components using Open Metadata. This layered, hybrid approach provides a scalable and resilient framework capable of transforming raw agricultural data into timely, actionable insights, addressing the critical need for advanced data management in smart farming. Full article
(This article belongs to the Special Issue Computational Science and Its Applications 2025 (ICCSA 2025))

41 pages, 6103 KB  
Article
H-RT-IDPS: A Hierarchical Real-Time Intrusion Detection and Prevention System for the Smart Internet of Vehicles via TinyML-Distilled CNN and Hybrid BiLSTM-XGBoost Models
by Ikram Hamdaoui, Chaymae Rami, Zakaria El Allali and Khalid El Makkaoui
Technologies 2025, 13(12), 572; https://doi.org/10.3390/technologies13120572 - 5 Dec 2025
Viewed by 854
Abstract
The integration of connected vehicles into smart city infrastructure introduces critical cybersecurity challenges for the Internet of Vehicles (IoV), where resource-constrained vehicles and powerful roadside units (RSUs) must collaborate for secure communication. We propose H-RT-IDPS, a hierarchical real-time intrusion detection and prevention system targeting two high-priority IoV security pillars: availability (traffic overload) and integrity/authenticity (spoofing), with spoofing evaluated across multiple subclasses (GAS, RPM, SPEED, and steering wheel). In the offline phase, deep learning and hybrid models were benchmarked on the vehicular CAN bus dataset CICIoV2024, with the BiLSTM-XGBoost hybrid chosen for its balance between accuracy and inference speed. Real-time deployment uses a TinyML-distilled CNN on vehicles for ultra-lightweight, low-latency detection, while RSU-level BiLSTM-XGBoost performs a deeper temporal analysis. A Kafka–Spark Streaming pipeline supports localized classification, prevention, and dashboard-based monitoring. In baseline, stealth, and coordinated modes, the evaluation achieved accuracy, precision, recall, and F1-scores all above 97%. The mean end-to-end inference latency was 148.67 ms, and the resource usage was stable. The framework remains robust in both high-traffic and low-frequency attack scenarios, enhancing operator situational awareness through real-time visualizations. These results demonstrate a scalable, explainable, and operator-focused IDPS well suited for securing SC-IoV deployments against evolving threats. Full article
(This article belongs to the Special Issue Research on Security and Privacy of Data and Networks)

26 pages, 2602 KB  
Article
A Big Data Pipeline Approach for Predicting Real-Time Pandemic Hospitalization Risk
by Vishnu S. Pendyala, Mayank Kapadia, Basanth Periyapatnaroopakumar, Manav Anandani and Nischitha Nagendran
Algorithms 2025, 18(12), 730; https://doi.org/10.3390/a18120730 - 21 Nov 2025
Viewed by 728
Abstract
Pandemics emphasize the importance of real-time, interpretable clinical decision-support systems for identifying high-risk patients and assisting with prompt triage, particularly in data-intensive healthcare systems. This paper describes a novel dual big-data pipeline that includes (i) a streaming module for real-time epidemiological hospitalization risk prediction and (ii) a supplementary imaging-based detection and reasoning module for chest X-rays, with COVID-19 as an example. The first pipeline uses state-of-the-art machine learning algorithms to estimate patient-level hospitalization risk based on data from the Centers for Disease Control and Prevention’s (CDC) COVID-19 Case Surveillance dataset. A Bloom filter accelerated triage by constant-time pre-screening of high-risk profiles. Specifically, after significant experimentation and optimization, one of the models, XGBoost, was selected because it achieved the best minority-class F1-score (0.76) and recall (0.80), outperforming baseline models. Synthetic data generation was employed to mimic streaming workloads, including a strategy that used the Conditional Tabular Generative Adversarial Network (CTGAN) to produce the best balanced and realistic distributions. The second pipeline focuses on diagnostic imaging and combines an advanced convolutional neural network, EfficientNet-B0, with Grad-CAM visual explanations, achieving 99.5% internal and 99.3% external accuracy. A lightweight Generative Pre-trained Transformer (GPT)-based reasoning layer converts model predictions into auditable triage comments (ALERT/FLAG/LOG), yielding traceable and interpretable decision logs. This scalable, explainable, and near-real-time framework provides a foundation for future multimodal and genomic advancements in public health readiness. Full article
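
The constant-time pre-screening idea mentioned in this abstract can be illustrated with a small Bloom filter. This is only a sketch under assumptions: the profile keys, filter size, and hash count are hypothetical and do not come from the paper, and the routing function `prescreen` is an illustrative name. A Bloom filter never returns a false negative, so a profile that was inserted is always flagged, while an occasional false positive only sends an extra record to the full model.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, tunable false-positive rate."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str) -> Iterator[int]:
        # Derive k bit positions by salting one hash function with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # Work per lookup depends only on num_hashes, not on how many items were added.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


# Hypothetical high-risk profile keys ("age band|comorbidity"); illustrative only.
high_risk = BloomFilter()
for profile in ("80+|yes", "70-79|yes", "60-69|yes"):
    high_risk.add(profile)


def prescreen(profile: str) -> str:
    """Route a record: possibly high-risk profiles are scored by the model first."""
    return "score_first" if profile in high_risk else "normal_queue"
```

In a streaming job, a check like `prescreen(key)` would run per record before the heavier classifier, which is where the constant lookup cost matters.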

9 pages, 433 KB  
Proceeding Paper
Contextual Modeling and Intelligent Decision-Making for IoT Systems: A Combined Ontology and Machine Learning Approach
by Sanaa Mouhim
Eng. Proc. 2025, 112(1), 71; https://doi.org/10.3390/engproc2025112071 - 18 Nov 2025
Viewed by 571
Abstract
In the context of the Internet of Things (IoT), this article proposes an innovative approach combining ontologies and the Apache Spark MLlib library to design an intelligent system capable of dynamically adapting to its environment. The aim is to model the context including users, devices, events, and environmental conditions, and exploit massive sensor data to generate intelligent, contextualized predictions. The architecture relies on two pillars: an ontology as a formal way to structure and semantically annotate knowledge and Spark MLlib in order to execute big data machine learning algorithms and notably random forest regression. The solution is targeted to real-time applications such as energy or air quality management in smart homes. The results demonstrate the value of combining ontology and machine learning in order to improve contextual knowledge and automatic decision-making. Full article

30 pages, 4273 KB  
Article
Scalable Predictive Modeling for Hospitalization Prioritization: A Hybrid Batch–Streaming Approach
by Nisrine Berros, Youness Filaly, Fatna El Mendili and Younes El Bouzekri El Idrissi
Big Data Cogn. Comput. 2025, 9(11), 271; https://doi.org/10.3390/bdcc9110271 - 25 Oct 2025
Viewed by 1070
Abstract
Healthcare systems worldwide have faced unprecedented pressure during crises such as the COVID-19 pandemic, exposing limits in managing scarce hospital resources. Many predictive models remain static, unable to adapt to new variants, shifting conditions, or diverse patient populations. This work proposes a dynamic prioritization framework that recalculates severity scores in batch mode when new factors appear and applies them instantly through a streaming pipeline to incoming patients. Unlike approaches focused only on fixed mortality or severity risks, our model integrates dual datasets (survivors and non-survivors) to refine feature selection and weighting, enhancing robustness. Built on a big data infrastructure (Spark/Databricks), it ensures scalability and responsiveness, even with millions of records. Experimental results confirm the effectiveness of this architecture: The artificial neural network (ANN) achieved 98.7% accuracy, with higher precision and recall than traditional models, while random forest and logistic regression also showed strong AUC values. Additional tests, including temporal validation and real-time latency simulation, demonstrated both stability over time and feasibility for deployment in near-real-world conditions. By combining adaptability, robustness, and scalability, the proposed framework offers a methodological contribution to healthcare analytics, supporting fair and effective hospitalization prioritization during pandemics and other public health emergencies. Full article

42 pages, 8013 KB  
Article
Adaptive Neural Network System for Detecting Unauthorised Intrusions Based on Real-Time Traffic Analysis
by Serhii Vladov, Victoria Vysotska, Vasyl Lytvyn, Anatolii Komziuk, Oleksandr Prokudin and Andrii Ostapiuk
Computation 2025, 13(9), 221; https://doi.org/10.3390/computation13090221 - 11 Sep 2025
Viewed by 1060
Abstract
This article addresses the problem of operational anomaly detection in network traffic for cyber police units by developing an adaptive neural network platform combining a variational autoencoder with continuous stochastic dynamics of the latent space (integration according to the Euler–Maruyama scheme), a continuous–discrete Kalman filter for latent state estimation, and Hotelling’s T² statistical criterion for deviation detection. This paper implements an online learning mechanism (“on the fly”) via the Euler Euclidean gradient step. Verification includes variational autoencoder training and validation, ROC/PR and confusion matrix analysis, latent representation projections (PCA), and latency measurements during streaming processing. Experiments demonstrated stable model convergence and accurate anomaly detection (precision ≈0.83, recall ≈0.83, F1-score ≈0.83), with an end-to-end delay of 1.5–6.5 ms under a load of 100–1000 sessions/s. The computational estimate for typical model parameters is ≈5152 operations for a forward pass and ≈38,944 operations when batch updating is taken into account. The main bottleneck, the O(m³) term in the Kalman step, was also identified. The practical significance of the results lies in the possibility of integrating the developed adaptive neural network platform into cyber police units (integration with Kafka, Spark, or Flink; exporting incidents to SIEM or SOAR; monitoring via Prometheus or Grafana) and in proposing applied optimisation paths for embedded and high-load systems. Full article
(This article belongs to the Section Computational Engineering)
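
The Hotelling's T² criterion named in this abstract can be sketched for a single observation as the quadratic form (x − μ)ᵀ Σ⁻¹ (x − μ). The code below is a minimal illustration, not the authors' implementation: the function names, the plain-Python linear solver, and the idea of applying the statistic directly to a latent vector are assumptions made here. Any alert threshold would also have to be chosen separately.

```python
from typing import List, Sequence


def solve(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def hotelling_t2(
    x: Sequence[float],
    mean: Sequence[float],
    cov: Sequence[Sequence[float]],
) -> float:
    """Return (x - mean)^T cov^{-1} (x - mean) without forming the inverse."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    weighted = solve(cov, diff)  # cov^{-1} · diff
    return sum(d * w for d, w in zip(diff, weighted))
```

A monitoring loop would compare `hotelling_t2(latent, mean, cov)` against a threshold and raise an alert when it is exceeded. Note that the elimination step is cubic in the vector dimension, so in this sketch the statistic gets more expensive as the latent space grows.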

32 pages, 1285 KB  
Review
Metabolic Engineering Strategies for Enhanced Polyhydroxyalkanoate (PHA) Production in Cupriavidus necator
by Wim Hectors, Tom Delmulle and Wim K. Soetaert
Polymers 2025, 17(15), 2104; https://doi.org/10.3390/polym17152104 - 31 Jul 2025
Cited by 1 | Viewed by 6269
Abstract
The environmental burden of conventional plastics has sparked interest in sustainable alternatives such as polyhydroxyalkanoates (PHAs). However, despite ample research in bioprocess development and the use of inexpensive waste streams, production costs remain a barrier to widespread commercialization. Complementary to this, genetic engineering offers another avenue for improved productivity. Cupriavidus necator stands out as a model host for PHA production due to its substrate flexibility, high intracellular polymer accumulation, and tractability to genetic modification. This review delves into metabolic engineering strategies that have been developed to enhance the production of poly(3-hydroxybutyrate) (PHB) and related copolymers in C. necator. Strategies include the optimization of central carbon flux, redox and cofactor balancing, adaptation to oxygen-limiting conditions, and fine-tuning of granule-associated protein expression and the regulatory network. This is followed by outlining engineered pathways improving the synthesis of PHB copolymers, PHBV, PHBHHx, and other emerging variants, emphasizing genetic modifications enabling biosynthesis based on unrelated single-carbon sources. Among these, enzyme engineering strategies and the establishment of novel artificial pathways are widely discussed. In particular, this review offers a comprehensive overview of promising engineering strategies, serving as a resource for future strain development and positioning C. necator as a valuable microbial chassis for biopolymer production at an industrial scale. Full article

8 pages, 162 KB  
Proceeding Paper
The Evolution and Challenges of Real-Time Big Data: A Review
by Ikram Lefhal Lalaoui, Essaid El Haji and Mohamed Kounaidi
Comput. Sci. Math. Forum 2025, 10(1), 11; https://doi.org/10.3390/cmsf2025010011 - 1 Jul 2025
Cited by 3 | Viewed by 5541
Abstract
Real-time big data has become crucial to the digital revolution of modern society. In the context of increasing data flows from multiple sources, including social media, Internet of Things (IoT) devices, and financial systems, real-time analysis and processing is becoming a strategic tool for fast and accurate decision making. Its applications span domains such as healthcare, finance, and digital marketing, where it is revolutionizing traditional business models. In this article, we explore the recent advances and future prospects of real-time big data. Our research is based on recent work published between 2020 and 2025, examining the technological advances and the difficulties encountered, and suggesting ways of optimizing the efficiency of these technologies. Full article
14 pages, 2429 KB  
Article
End-to-End Architecture for Real-Time IoT Analytics and Predictive Maintenance Using Stream Processing and ML Pipelines
by Ouiam Khattach, Omar Moussaoui and Mohammed Hassine
Sensors 2025, 25(9), 2945; https://doi.org/10.3390/s25092945 - 7 May 2025
Cited by 11 | Viewed by 8440
Abstract
The rapid proliferation of Internet of Things (IoT) devices across industries has created a need for robust, scalable, and real-time data processing architectures capable of supporting intelligent analytics and predictive maintenance. This paper presents a novel comprehensive architecture that enables end-to-end processing of IoT data streams, from acquisition to actionable insights. The system integrates Kafka-based message brokering for the high-throughput ingestion of real-time sensor data, with Apache Spark facilitating batch and stream extraction, transformation, and loading (ETL) processes. A modular machine-learning pipeline handles automated data preprocessing, training, and evaluation across various models. The architecture incorporates continuous monitoring and optimization components to track system performance and model accuracy, feeding insights to users via a dedicated Application Programming Interface (API). The design ensures scalability, flexibility, and real-time responsiveness, making it well suited for industrial IoT applications requiring continuous monitoring and intelligent decision-making. Full article

39 pages, 1360 KB  
Article
Real-Time Monitoring of LTL Properties in Distributed Stream Processing Applications
by Loay Aladib, Guoxin Su and Jack Yang
Electronics 2025, 14(7), 1448; https://doi.org/10.3390/electronics14071448 - 3 Apr 2025
Viewed by 1281
Abstract
Stream processing frameworks have become key enablers of real-time data processing in modern distributed systems. However, robust and scalable mechanisms for verifying temporal properties are often lacking in existing systems. To address this gap, a new runtime verification framework is proposed that integrates linear temporal logic (LTL) monitoring into stream processing applications, such as Apache Spark. The approach introduces reusable LTL monitoring patterns designed for seamless integration into existing streaming workflows. Our case study, applied to real-time financial data monitoring, demonstrates that LTL-based monitoring can effectively detect violations of safety and liveness properties while maintaining stable latency. A performance evaluation reveals that although the approach introduces computational overhead, it scales effectively with increasing data volume. The proposed framework extends beyond financial data processing and is applicable to domains such as real-time equipment failure detection, financial fraud monitoring, and industrial IoT analytics. These findings demonstrate the feasibility of real-time LTL monitoring in large-scale stream processing environments while highlighting trade-offs between verification accuracy, scalability, and system overhead. Full article
(This article belongs to the Special Issue Data-Centric Artificial Intelligence: New Methods for Data Processing)
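
To make the notion of monitoring safety and liveness properties over a stream concrete, here is a small sketch of two runtime monitors in plain Python. It is not the framework from the paper: the event schema (`type`, `id`, `price`), the property choices, and the function name `monitor` are hypothetical. The safety property is G(price > 0). Because a pure liveness property cannot be refuted on a finite prefix, the response property is bounded here: every `order` must be followed by a matching `ack` within `deadline` events.

```python
from typing import Dict, Iterable, List, Tuple

Event = Dict[str, object]


def monitor(events: Iterable[Event], deadline: int) -> Tuple[List[int], List[str]]:
    """Return (indices violating G(price > 0), order ids not acked in time)."""
    safety_violations: List[int] = []
    response_violations: List[str] = []
    pending: Dict[str, int] = {}  # order id -> index at which the order was seen

    for index, event in enumerate(events):
        # Safety: the invariant must hold on every single event.
        if not float(event["price"]) > 0:
            safety_violations.append(index)

        # Bounded response: track open obligations and discharge them on ack.
        if event["type"] == "order":
            pending[str(event["id"])] = index
        elif event["type"] == "ack":
            pending.pop(str(event["id"]), None)

        # An obligation still open `deadline` events after the order is a violation.
        for order_id, seen in list(pending.items()):
            if index - seen >= deadline:
                response_violations.append(order_id)
                del pending[order_id]

    return safety_violations, response_violations


# Illustrative trace: order B has a non-positive price and is never acknowledged.
trace = [
    {"type": "order", "id": "A", "price": 10.0},
    {"type": "order", "id": "B", "price": -1.0},
    {"type": "ack", "id": "A", "price": 10.0},
    {"type": "tick", "id": "-", "price": 10.5},
]
result = monitor(trace, deadline=2)
```

In a distributed setting the `pending` map would have to live in keyed, fault-tolerant state rather than a local dictionary, which is one source of the overhead such monitors introduce.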

15 pages, 3524 KB  
Perspective
Electric Discharge-Generating Devices Developed for Pathogen, Insect Pest, and Weed Management: Current Status and Future Directions
by Shin-ichi Kusakari and Hideyoshi Toyoda
Agronomy 2025, 15(1), 123; https://doi.org/10.3390/agronomy15010123 - 6 Jan 2025
Viewed by 1629
Abstract
Electrostatic techniques have introduced innovative approaches to devise efficient tools for pest control across various categories, encompassing pathogens, insects, and weeds. The focus on electric discharge technology has proven pivotal in establishing effective methods with simple device structures, enabling cost-effective fabrication using readily available materials. The electric discharge-generating devices can be assembled using commonplace conductor materials, such as ordinary metal nets linked to a voltage booster and a grounded electric wire. The strategic pairing of charged and grounded conductors at specific intervals generates an electric field, leading the charged conductor to initiate a corona discharge in the surrounding space. As the applied voltage increases, the corona discharge intensifies and may eventually result in an arc discharge due to the breakdown of air when the voltage surpasses the insulation resistance limit. The utilization of corona and arc discharges plays a crucial role in these techniques, with the corona-discharging stage creating (1) negative ions to stick to pests, which can then be captured with a positively charged pole, (2) ozone gas to sterilize plant hydroponic solutions, and (3) plasma streams to exterminate fungal colonies on leaves, and the arc-discharging stage projecting electric sparks to zap and kill pests. These electric discharge phenomena have been harnessed to develop reliable devices capable of managing pests across diverse classes. In this review, we elucidate past achievements and challenges in device development, providing insights into the current status of research. Additionally, we discuss the future directions of research in this field, outlining potential avenues for further exploration and improvement. Full article

25 pages, 1936 KB  
Article
A Scalable Framework for Sensor Data Ingestion and Real-Time Processing in Cloud Manufacturing
by Massimo Pacella, Antonio Papa, Gabriele Papadia and Emiliano Fedeli
Algorithms 2025, 18(1), 22; https://doi.org/10.3390/a18010022 - 4 Jan 2025
Cited by 14 | Viewed by 4757
Abstract
Cloud Manufacturing enables the integration of geographically distributed manufacturing resources through advanced Cloud Computing and IoT technologies. This paradigm promotes the development of scalable and adaptable production systems. However, existing frameworks face challenges related to scalability, resource orchestration, and data security, particularly in rapidly evolving decentralized manufacturing settings. This study presents a novel nine-layer architecture designed specifically to address these issues. Central to this framework is the use of Apache Kafka for robust, high-throughput data ingestion, and Apache Spark Streaming to enhance real-time data processing. This framework is underpinned by a microservice-based architecture that ensures a high scalability and reduced latency. Experimental validation using sensor data from the UCI Machine Learning Repository demonstrated substantial improvements in processing efficiency and throughput compared with conventional frameworks. Key components, such as RabbitMQ, contribute to low-latency performance, whereas Kafka ensures data durability and supports real-time application. Additionally, the in-memory data processing of Spark Streaming enables rapid and dynamic data analysis, yielding actionable insights. The experimental results highlight the potential of the framework to enhance operational efficiency, resource utilization, and data security, offering a resilient solution suited to the demands of modern industrial applications. This study underscores the contribution of the framework to advancing Cloud Manufacturing by providing detailed insights into its performance, scalability, and applicability to contemporary manufacturing ecosystems. Full article

30 pages, 618 KB  
Article
Benchmarking Big Data Systems: Performance and Decision-Making Implications in Emerging Technologies
by Leonidas Theodorakopoulos, Aristeidis Karras, Alexandra Theodoropoulou and Georgios Kampiotis
Technologies 2024, 12(11), 217; https://doi.org/10.3390/technologies12110217 - 3 Nov 2024
Cited by 19 | Viewed by 6968
Abstract
Systems for graph processing are a key enabler for insights from large-scale graphs that are critical to many new advanced technologies such as Artificial Intelligence, Internet of Things, and blockchain. In this study, we benchmark two widely utilized graph processing systems, Apache Spark GraphX and Apache Flink, on key performance criteria: response time, scalability, and computational complexity. Our results show the capability of each system for real-world graph applications and thus provide a quantitative basis for selecting a system for a given purpose. GraphX’s strength was in processing batch in-memory workloads typical of blockchain and machine learning model optimization, while Flink excelled in processing stream data, which is timely and important to the IoT world. These performance characteristics emphasize how the capabilities of graph processing systems can match the performance requirements of different emerging technology applications. Our findings inform practitioners not only about system efficiencies and limitations but also about recent advances in hardware accelerators and algorithmic improvements aimed at shaping the new graph processing frontier in diverse technology domains. Full article

21 pages, 7395 KB  
Article
Elevating Smart Manufacturing with a Unified Predictive Maintenance Platform: The Synergy between Data Warehousing, Apache Spark, and Machine Learning
by Naijing Su, Shifeng Huang and Chuanjun Su
Sensors 2024, 24(13), 4237; https://doi.org/10.3390/s24134237 - 29 Jun 2024
Cited by 17 | Viewed by 8389
Abstract
The transition to smart manufacturing introduces heightened complexity in regard to the machinery and equipment used within modern collaborative manufacturing landscapes, presenting significant risks associated with equipment failures. The core ambition of smart manufacturing is to elevate automation through the integration of state-of-the-art technologies, including artificial intelligence (AI), the Internet of Things (IoT), machine-to-machine (M2M) communication, cloud technology, and expansive big data analytics. This technological evolution underscores the necessity for advanced predictive maintenance strategies that proactively detect equipment anomalies before they escalate into costly downtime. Addressing this need, our research presents an end-to-end platform that merges the organizational capabilities of data warehousing with the computational efficiency of Apache Spark. This system adeptly manages voluminous time-series sensor data, leverages big data analytics for the seamless creation of machine learning models, and utilizes an Apache Spark-powered engine for the instantaneous processing of streaming data for fault detection. This comprehensive platform exemplifies a significant leap forward in smart manufacturing, offering a proactive maintenance model that enhances operational reliability and sustainability in the digital manufacturing era. Full article