Search Results (103)

Search Parameters:
Keywords = multimodal anomaly detection

17 pages, 4838 KB  
Article
Unseen Hazard Recognition in Autonomous Driving Using Vision–Language and Sensor-Based Temporal Models
by Faisal Mehmood, Sajid Ur Rehman, Asif Mehmood and Young-Jin Kim
Appl. Sci. 2026, 16(3), 1503; https://doi.org/10.3390/app16031503 - 2 Feb 2026
Viewed by 4
Abstract
Autonomous driving (AD) systems remain vulnerable to rare, ambiguous, and out-of-label (OOL) hazards that are insufficiently represented in conventional training datasets. This work investigates perception robustness under such conditions by using the Challenge of Out-Of-Label (COOOL) benchmark dataset, which consists of 200 dashcam video sequences annotated with both common and uncommon traffic hazards. We analyze the behavior of widely used perception components and present a multimodal pipeline that integrates YOLO11x for object detection, the Hough Transform for lane estimation, GPT-4o for scene description, and Long Short-Term Memory (LSTM) networks for temporal modeling. On the COOOL benchmark, YOLO11x achieves an mAP@0.5 of 54.1% on the common object categories, whereas the detection of rare and OOL hazards remains challenging, with a recall of 72.6%. Incorporating temporal risk modeling improves hazard recall to 71.8%, indicating a modest but consistent gain in recognizing uncommon events. The Hough Transform shows stable behavior for lane estimation in standard conditions, with a mean lateral deviation of 8.9 pixels in daylight scenes and 13.4 pixels under low-light conditions. The temporal anomaly detection module attains an AUROC of 0.65, reflecting limited but meaningful discrimination between nominal and anomalous driving situations. For interpretability, the GPT-4o scene description module generates context-aware textual explanations with an object coverage score of 0.72 and a factual consistency rate of 78%, as assessed through manual inspection. The end-to-end pipeline operates at approximately 10–12 frames per second on a single GPU, supporting near-real-time analysis and optimization. Our results confirm that state-of-the-art perception models struggle with OOL hazards and that multimodal vision–language–temporal integration provides incremental improvements in robustness and interpretability when evaluated under standardized out-of-distribution conditions. Full article
(This article belongs to the Special Issue Autonomous Vehicles and Robotics—2nd Edition)
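The abstract above reports an AUROC for its temporal anomaly detection module. As a reading aid, here is a minimal sketch of how AUROC can be computed from anomaly scores, using the pairwise (Mann–Whitney) formulation. The function name `auroc` and the toy scores are illustrative assumptions and are not taken from the article.

```python
def auroc(scores, labels):
    """Pairwise AUROC: probability that a random anomalous sample
    receives a higher score than a random nominal one (ties count 0.5).

    Illustrative helper; labels are assumed to be 1 (anomalous) or 0 (nominal).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy scores (made up for illustration): 3 of the 4 positive/negative pairs are ordered correctly.
print(auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which a reported AUROC should be read.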

28 pages, 802 KB  
Article
Data-Centric Generative and Adaptive Detection Framework for Abnormal Transaction Prediction
by Yunpeng Gong, Peng Hu, Zihan Zhang, Pengyu Liu, Zhengyang Li, Ruoyun Zhang, Jinghui Yin and Manzhou Li
Electronics 2026, 15(3), 633; https://doi.org/10.3390/electronics15030633 - 2 Feb 2026
Viewed by 163
Abstract
Anomalous transaction behaviors in cryptocurrency markets exhibit high concealment, substantial diversity, and strong cross-modal coupling, making traditional rule-based or single-feature analytical methods insufficient for reliable detection in real-world environments. To address these challenges, a data-centric multimodal anomaly detection framework integrating generative augmentation, latent distribution modeling, and dual-branch real-time detection is proposed. The method employs a generative adversarial network with feature-consistency constraints to mitigate the scarcity of fraudulent samples, and adopts a multi-domain variational modeling strategy to learn the latent distribution of normal behaviors, enabling stable anomaly scoring. By combining the long-range temporal modeling capability of Transformer architectures with the sensitivity of online clustering to local structural deviations, the system dynamically integrates global and local information through an adaptive risk fusion mechanism, thereby enhancing robustness and real-time detection capability. Experimental results demonstrate that the generative augmentation module yields substantial improvements, increasing the recall from 0.421 to 0.671 and the F1-score to 0.692. In anomaly distribution modeling, the multi-domain VAE achieves an area under the curve (AUC) of 0.854 and an F1-score of 0.660, significantly outperforming traditional One-Class SVM and autoencoder baselines. Multimodal fusion experiments further verify the complementarity of the dual-branch detection structure, with the adaptive fusion model achieving an AUC of 0.884 and an F1-score of 0.713 while reducing the false positive rate to 0.087. Ablation studies show that the complete model surpasses any individual module in terms of precision, recall, and F1-score, confirming the synergistic benefits of its integrated components. Overall, the proposed framework achieves high accuracy and high recall in data-scarce, structurally complex, and latency-sensitive cryptocurrency scenarios, providing a scalable and efficient solution for deploying data-centric artificial intelligence in financial security applications. Full article
(This article belongs to the Special Issue Machine Learning in Data Analytics and Prediction)

22 pages, 561 KB  
Review
A Systematic Review of Anomaly and Fault Detection Using Machine Learning for Industrial Machinery
by Syed Haseeb Haider Zaidi, Alex Shenfield, Hongwei Zhang and Augustine Ikpehai
Algorithms 2026, 19(2), 108; https://doi.org/10.3390/a19020108 - 1 Feb 2026
Viewed by 196
Abstract
Unplanned downtime in industrial machinery remains a major challenge, causing substantial economic losses and safety risks across sectors such as manufacturing, food processing, oil and gas, and transportation. This systematic review investigates the application of machine learning (ML) techniques for anomaly and fault detection within the broader context of predictive maintenance. Following a hybrid review methodology, relevant studies published between 2010 and 2025 were collected from major databases including IEEE Xplore, ScienceDirect, SpringerLink, Scopus, Web of Science, and arXiv. The review categorizes approaches into supervised, unsupervised, and hybrid paradigms, analyzing their pipelines from data collection and preprocessing to model deployment. Findings highlight the effectiveness of deep learning architectures such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), autoencoders, and hybrid frameworks in detecting faults from time series and multimodal sensor data. At the same time, key limitations persist, including data scarcity, class imbalance, limited generalizability across equipment types, and a lack of interpretability in deep models. This review concludes that while ML-based predictive maintenance systems are enabling a transition from reactive to proactive strategies, future progress requires improved hybrid architectures, Explainable AI, and scalable real-time deployment. Full article
(This article belongs to the Special Issue Machine Learning for Pattern Recognition (3rd Edition))

21 pages, 1289 KB  
Article
A Multi-Branch CNN–Transformer Feature-Enhanced Method for 5G Network Fault Classification
by Jiahao Chen, Yi Man and Yao Cheng
Appl. Sci. 2026, 16(3), 1433; https://doi.org/10.3390/app16031433 - 30 Jan 2026
Viewed by 123
Abstract
The deployment of 5G (Fifth-Generation) networks in industrial Internet of Things (IoT), intelligent transportation, and emergency communications introduces heterogeneous and dynamic network states, leading to frequent and diverse faults. Traditional fault detection methods typically emphasize either local temporal anomalies or global distributional characteristics, but rarely achieve an effective balance between the two. In this paper, we propose a parallel multi-branch convolutional neural network (CNN)–Transformer framework (MBCT) to improve fault diagnosis accuracy in 5G networks. Specifically, MBCT takes time-series network key performance indicator (KPI) data as input for training and performs feature extraction through three parallel branches: a CNN branch for local patterns and short-term fluctuations, a Transformer encoder branch for cross-layer and long-term dependencies, and a statistical branch for global features describing quality-of-experience (QoE) metrics. A gating mechanism and feature-weighted fusion are applied outside the branches to adjust inter-branch weights and intra-branch feature sensitivity. The fused representation is then nonlinearly mapped and fed into a classifier to generate the fault category. This paper evaluates the performance of the proposed model on both the publicly available TelecomTS multi-modal 5G network observability dataset and a self-collected SDR5GFD dataset based on software-defined radio (SDR). Experimental results demonstrate that the proposed model achieves superior performance in fault classification, achieving 87.7% accuracy on the TelecomTS dataset and 86.3% on the SDR5GFD dataset, outperforming the baseline models CNN, Transformer, and Random Forest. Moreover, the model contains approximately 0.57M parameters and requires about 0.3 MFLOPs per sample for inference, making it suitable for large-scale online fault diagnosis. Full article

12 pages, 874 KB  
Proceeding Paper
Smart Pavement Systems with Embedded Sensors for Traffic and Environmental Monitoring
by Wai Yie Leong
Eng. Proc. 2025, 120(1), 12; https://doi.org/10.3390/engproc2025120012 - 29 Jan 2026
Viewed by 86
Abstract
The evolution of next-generation urban infrastructure necessitates the deployment of intelligent pavement systems capable of real-time data acquisition, adaptive response, and predictive analytics. This article presents the design, implementation, and performance evaluation of a smart pavement system (SPS) incorporating multimodal embedded sensors for traffic density analysis, structural health monitoring, and environmental surveillance. The SPS integrates piezoelectric transducers, micro-electro-mechanical system accelerometers, inductive loop coils, fiber Bragg grating (FBG) sensors, and capacitive moisture and temperature sensors within the asphalt and sub-base layers, forming a distributed sensor network that interfaces with an edge-AI-enabled data acquisition and control module. Each sensor node performs localized pre-processing using low-power microcontrollers and transmits spatiotemporal data to a centralized IoT gateway over an adaptive mesh topology via long-range wide-area network or 5G-Vehicle-to-Everything protocols. Data fusion algorithms employing Kalman filters, sensor drift compensation models, and deep convolutional recurrent neural networks enable accurate classification of vehicular loads and traffic, as well as anomaly detection. Additionally, the system supports real-time air pollutant detection (e.g., NO2, CO, and PM2.5) using embedded electrochemical and optical gas sensors linked to mobile roadside units. Field deployments on a 1.2 km highway testbed demonstrate the system’s capability to achieve 95.7% classification accuracy for vehicle type recognition, ±1.5 mm resolution in rut depth measurement, and ±0.2 °C thermal sensitivity across dynamic weather conditions. Predictive analytics driven by long short-term memory networks yield a 21.4% improvement in maintenance planning accuracy, significantly reducing unplanned downtimes and repair costs. The architecture also supports vehicle-to-infrastructure feedback loops for adaptive traffic signal control and incident response. The proposed SPS architecture demonstrates a scalable and resilient framework for cyber-physical infrastructure, paving the way for smart cities that are responsive, efficient, and sustainable. Full article
(This article belongs to the Proceedings of 8th International Conference on Knowledge Innovation and Invention)

19 pages, 1898 KB  
Article
Robust ICS Anomaly Detection Using Multi-Scale Temporal Dependencies and Frequency-Domain Features
by Fang Wang, Haihan Chen, Suyang Wang, Zhongyuan Qin and Fang Dong
Electronics 2026, 15(3), 571; https://doi.org/10.3390/electronics15030571 - 28 Jan 2026
Viewed by 126
Abstract
Industrial Control Systems (ICSs) are critical infrastructure for maintaining social and economic stability, but they face increasing security threats that require robust anomaly detection mechanisms. Anomaly detection in ICS, based on sensor data, is essential for identifying abnormal behaviors caused by factors such as equipment failures, cyber-attacks, and operational mistakes. However, industrial time series data are often multimodal, noisy, and exhibit both short-term fluctuations and long-term dependencies, making them difficult to model effectively. Additionally, ICS data often contain high-frequency noise and complex periodic patterns, which traditional methods and standalone models, such as Long Short-Term Memory (LSTM), fail to capture effectively. To address these challenges, we propose a novel anomaly detection framework that leverages Gated Recurrent Units for short-term dynamics and PatchTST for long-term dependencies. The GRU module extracts dynamic short-term features, while PatchTST models long-term dependencies by segmenting the feature sequence processed by GRU into overlapping patches. Additionally, we innovatively introduce Frequency-Enhanced Channel Attention Module to capture frequency domain features, mitigating high-frequency noise and enhancing the model’s ability to detect long-term trends and periodic patterns. Experimental results on the SWaT and WADI datasets show that the proposed method achieves strong anomaly detection performance, attaining F1 scores of 0.929 and 0.865, respectively, which are superior to those of representative existing methods, demonstrating the effectiveness of the proposed design for robust anomaly detection in complex ICS environments. Full article
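The abstract above describes segmenting the GRU-processed feature sequence into overlapping patches before PatchTST. A minimal sketch of that patching step follows, assuming a plain 1-D sequence and no end padding. The function name `make_patches` and the values of `patch_len` and `stride` are illustrative assumptions; the article's actual settings are not given in the abstract.

```python
def make_patches(sequence, patch_len, stride):
    """Split a 1-D sequence into overlapping patches.

    Patches overlap whenever stride < patch_len. Trailing elements that do
    not fill a whole patch are dropped (a simplification, no padding is applied).
    """
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    last_start = len(sequence) - patch_len
    return [list(sequence[s:s + patch_len])
            for s in range(0, last_start + 1, stride)]


# Ten time steps, patches of length 4 with stride 2 (50% overlap).
for patch in make_patches(list(range(10)), patch_len=4, stride=2):
    print(patch)
```

Each patch then plays the role of one input token, so the sequence length seen by the Transformer shrinks roughly by a factor of `stride`.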

19 pages, 4184 KB  
Article
Bearing Anomaly Detection Method Based on Multimodal Fusion and Self-Adversarial Learning
by Han Liu, Yong Qin and Dilong Tu
Sensors 2026, 26(2), 629; https://doi.org/10.3390/s26020629 - 17 Jan 2026
Viewed by 234
Abstract
In the context of bearing anomaly detection, challenges such as imbalanced sample distribution and complex operational conditions present significant difficulties for data-driven deep learning models. These issues often result in overfitting and high false positive rates in complex real-world scenarios. This paper proposes a strategy that leverages multimodal fusion and Self-Adversarial Training (SAT) to construct and train a deep learning model. First, the one-dimensional bearing vibration time-series data are converted into Gramian Angular Difference Field (GADF) images, and multimodal feature fusion is performed with the original time-series data to capture richer spatiotemporal correlation features. Second, a composite data augmentation strategy combining time-domain and image-domain transformations is employed to effectively expand the anomaly samples, mitigating data scarcity and class imbalance. Finally, the SAT mechanism is introduced, where adversarial samples are generated within the fused feature space to compel the model to learn more generalized and robust feature representations, thereby significantly enhancing its performance in realistic and noisy environments. Experimental results demonstrate that the proposed method outperforms traditional baseline models across key metrics such as accuracy, precision, recall, and F1-score in bearing anomaly detection. It exhibits exceptional robustness against rail-specific interferences, offering a specialized solution strictly tailored for the unique, high-noise operational environments of intelligent railway maintenance. Full article
(This article belongs to the Special Issue Sensor-Based Fault Diagnosis and Prognosis)
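The abstract above converts one-dimensional vibration series into Gramian Angular Difference Field (GADF) images. A minimal sketch of the standard GADF construction follows: rescale the series to [-1, 1], map each value to an angle with arccos, and take the sine of pairwise angle differences. The function name `gadf` and the toy series are illustrative assumptions; the article's own preprocessing (windowing, image size) is not described in the abstract.

```python
import math


def gadf(series):
    """Gramian Angular Difference Field of a 1-D series.

    Returns an n x n matrix with entries sin(phi_i - phi_j), where
    phi = arccos of the series rescaled to [-1, 1].
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must not be constant")
    # Min-max rescale to [-1, 1]; clamp to guard against rounding error.
    scaled = [max(-1.0, min(1.0, (2 * x - hi - lo) / (hi - lo))) for x in series]
    phi = [math.acos(v) for v in scaled]
    return [[math.sin(a - b) for b in phi] for a in phi]


# Toy three-sample series (made up for illustration).
for row in gadf([0.0, 1.0, 2.0]):
    print([round(v, 3) for v in row])
```

The resulting matrix is antisymmetric with a zero diagonal, and it can be treated as a single-channel image for a convolutional branch.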

45 pages, 9328 KB  
Review
Advancements in Machine Learning-Assisted Flexible Electronics: Technologies, Applications, and Future Prospects
by Hao Su, Hongcun Wang, Dandan Sang, Santosh Kumar, Dao Xiao, Jing Sun and Qinglin Wang
Biosensors 2026, 16(1), 58; https://doi.org/10.3390/bios16010058 - 13 Jan 2026
Viewed by 311
Abstract
The integration of flexible electronics and machine learning (ML) algorithms has become a revolutionary force driving the field of intelligent sensing, giving rise to a new generation of intelligent devices and systems. This article provides a systematic review of core technologies and practical applications of ML in flexible electronics. It focuses on analyzing the theoretical frameworks of algorithms such as the Long Short-Term Memory Network (LSTM), Convolutional Neural Network (CNN), and Reinforcement Learning (RL) in the intelligent processing of sensor signals (IPSS), multimodal feature extraction (MFE), process defect and anomaly detection (PDAD), and data compression and edge computing (DCEC). This study explores the performance advantages of these technologies in optimizing signal analysis accuracy, compensating for interference in high-noise environments, optimizing manufacturing process parameters, etc., and empirically analyzes their potential applications in wearable health monitoring systems, intelligent control of soft robots, performance optimization of self-powered devices, and intelligent perception of epidermal electronic systems. Full article

16 pages, 1443 KB  
Article
DCRDF-Net: A Dual-Channel Reverse-Distillation Fusion Network for 3D Industrial Anomaly Detection
by Chunshui Wang, Jianbo Chen and Heng Zhang
Sensors 2026, 26(2), 412; https://doi.org/10.3390/s26020412 - 8 Jan 2026
Viewed by 205
Abstract
Industrial surface defect detection is essential for ensuring product quality, but real-world production lines often provide only a limited number of defective samples, making supervised training difficult. Multimodal anomaly detection with aligned RGB and depth data is a promising solution, yet existing fusion schemes tend to overlook modality-specific characteristics and cross-modal inconsistencies, so that defects visible in only one modality may be suppressed or diluted. In this work, we propose DCRDF-Net, a dual-channel reverse-distillation fusion network for unsupervised RGB–depth industrial anomaly detection. The framework learns modality-specific normal manifolds from nominal RGB and depth data and detects defects as deviations from these learned manifolds. It consists of three collaborative components: a Perlin-guided pseudo-anomaly generator that injects appearance–geometry-consistent perturbations into both modalities to enrich training signals; a dual-channel reverse-distillation architecture with guided feature refinement that denoises teacher features and constrains RGB and depth students towards clean, defect-free representations; and a cross-modal squeeze–excitation gated fusion module that adaptively combines RGB and depth anomaly evidence based on their reliability and agreement. Extensive experiments on the MVTec 3D-AD dataset show that DCRDF-Net achieves 97.1% image-level I-AUROC and 98.8% pixel-level PRO, surpassing current state-of-the-art multimodal methods on this benchmark. Full article
(This article belongs to the Section Sensor Networks)

23 pages, 998 KB  
Article
A SIEM-Integrated Cybersecurity Prototype for Insider Threat Anomaly Detection Using Enterprise Logs and Behavioural Biometrics
by Mohamed Salah Mohamed and Abdullahi Arabo
Electronics 2026, 15(1), 248; https://doi.org/10.3390/electronics15010248 - 5 Jan 2026
Viewed by 507
Abstract
Insider threats remain a serious concern for organisations in both public and private sectors. Detecting anomalous behaviour in enterprise environments is critical for preventing insider incidents. While many prior studies demonstrate promising results using deep learning on offline datasets, few address real-time operationalisation or calibrated alert control within a Security Information and Event Management (SIEM) workflow. This paper presents a SIEM-integrated prototype that fuses the Computer Emergency Response Team Insider Threat Test Dataset (CERT) enterprise logs (Logon, Device, HTTP, and Email) with behavioural biometrics from the Balabit mouse dynamics dataset. Per-modality one-dimensional convolutional neural network (1D CNN) branches are trained independently using imbalance-aware strategies, including downsampling, class weighting, and focal loss. A unified 20 × N feature schema ensures train–serve parity and consistent feature validation during live inference. Post-training calibration using Platt and isotonic regression enables analyst-controlled threshold tuning and stable alert budgeting inside the SIEM. The models are deployed in Splunk’s Machine Learning Toolkit (MLTK), where dashboards visualise anomaly timelines, risky users or hosts, and cross-stream overlaps. Evaluation emphasises operational performance, precision–recall balance, calibration stability, and throughput rather than headline accuracy. Results show calibrated, controllable alert volumes: for Device, precision ≈0.70 at recall ≈0.30 (PR-AUC = 0.468, ROC-AUC = 0.949); for Logon, ROC-AUC = 0.936 with an ultra-low false-positive rate at a conservative threshold. Batch CPU inference sustains ≈70.5 k windows/s, confirming real-time feasibility. This study’s main contribution is to demonstrate a calibrated, multi-modal CNN framework that integrates directly within a live SIEM pipeline. 
It provides a reproducible path from offline anomaly detection research to Security Operations Centre (SOC)-ready deployment, bridging the gap between academic models and operational Cybersecurity practice. Full article
(This article belongs to the Special Issue AI in Cybersecurity, 2nd Edition)
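The abstract above lists focal loss among its imbalance-aware training strategies. A minimal per-sample sketch of the commonly used binary focal loss follows. The function name `binary_focal_loss` and the defaults `gamma=2.0` and `alpha=0.25` are assumptions taken from the usual formulation, not settings reported in the abstract.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Focal loss for one sample.

    p: predicted probability of the positive class
    y: true label (1 or 0)
    gamma: focusing parameter; gamma = 0 gives alpha-weighted cross-entropy
    alpha: weight on the positive class (1 - alpha on the negative class)
    """
    p = min(max(p, eps), 1.0 - eps)          # avoid log(0)
    p_t = p if y == 1 else 1.0 - p            # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A well-classified positive contributes far less than a badly missed one.
print(binary_focal_loss(0.9, 1), binary_focal_loss(0.1, 1))
```

The factor `(1 - p_t) ** gamma` shrinks the contribution of easy examples, so the rare anomalous windows carry more weight in the gradient than the abundant normal ones.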

28 pages, 4479 KB  
Article
Patch Time Series Transformer-Based Short-Term Photovoltaic Power Prediction Enhanced by Artificial Fish
by Xin Lv, Shuhui Cui, Yue Wang, Jinye Lu, Puming Yu and Kai Wang
Energies 2026, 19(1), 284; https://doi.org/10.3390/en19010284 - 5 Jan 2026
Viewed by 404
Abstract
The reliability and economic operation of power systems increasingly depend on renewable energy, making accurate short-term photovoltaic (PV) power prediction essential. Conventional approaches struggle with the nonlinear and stochastic characteristics of PV data. This study proposes an enhanced prediction framework integrating Artificial Fish Swarm Algorithm–Isolation Forest (AFSA–IF) anomaly detection, Generative Adversarial Network-based feature extraction, multimodal data fusion, and a Patch Time Series Transformer (PatchTST) model. The framework includes advanced preprocessing, fusion of meteorological and historical power data, and weather classification via one-hot encoding. Experiments on datasets from six PV plants show significant improvements in mean absolute error, root mean square error, and coefficient of determination compared with Transformer, Reformer, and Informer models. The results confirm the robustness and efficiency of the proposed model, especially under challenging conditions such as rainy weather. Full article
(This article belongs to the Section A2: Solar Energy and Photovoltaic Systems)

23 pages, 5200 KB  
Article
Real-Time Visual Perception and Explainable Fault Diagnosis for Railway Point Machines at the Edge
by Yu Zhai and Lili Wei
Electronics 2026, 15(1), 230; https://doi.org/10.3390/electronics15010230 - 4 Jan 2026
Viewed by 334
Abstract
Existing inspection systems for railway point machines often suffer from high latency and poor interpretability, which impedes the real-time detection of critical mechanical anomalies, thereby increasing the risks of derailment and leading to cascading schedule delays. Addressing these challenges, this study proposes a lightweight computer vision-based detection framework deployed on the RK3588S edge platform. First, to overcome the accuracy degradation of segmentation networks on constrained edge NPUs, a Sensitivity-Aware Mixed-Precision Quantization and Heterogeneous Scheduling (SMPQ-HS) strategy is proposed. Second, a Multimodal Semantic Diagnostic Framework is constructed. By integrating geometric engagement depths—calculated via perspective rectification—with visual features, a Hard-Constrained Knowledge Embedding Paradigm is designed for the Qwen2.5-VL model. This approach constrains the stochastic reasoning of the Qwen2.5-VL model into standardized diagnostic conclusions. Experimental results demonstrate that the optimized model achieves an inference speed of 38.5 FPS and an mIoU of 0.849 on the RK3588S, significantly outperforming standard segmentation models in inference speed while maintaining high precision. Furthermore, the average depth-estimation error remains approximately 3%, and the VLM-based fault identification accuracy reaches 88%. Overall, this work provides a low-cost, deployable, and interpretable solution for intelligent point machine maintenance under edge-computing constraints. Full article

26 pages, 12124 KB  
Article
MF-GCN: Multimodal Information Fusion Using Incremental Graph Convolutional Network for Ship Behavior Anomaly Detection
by Ruixin Ma, Jinhao Zhang, Weizhi Nie, Naiming Ge, Hao Wen and Aoxiang Liu
J. Mar. Sci. Eng. 2026, 14(1), 87; https://doi.org/10.3390/jmse14010087 - 1 Jan 2026
Viewed by 250
Abstract
Ship behavior anomaly detection is critical for intelligent perception and early warning in complex inland waterways, where single-source sensing (e.g., AIS-only or vision-only) is often fragile under occlusion, illumination variation, and signal noise. This study proposes MF-GCN, a multimodal (heterogeneous) information fusion framework based on an Incremental Graph Convolutional Network (IGCN) to detect and warn anomalous ship behaviors by jointly modeling AIS, video imagery, LiDAR point clouds, and water level signals. We first extract modality-specific features and enforce temporal–spatial consistency via timestamp and geo-referencing alignment, then construct an evolving graph in which nodes represent multimodal features and edges encode temporal dependency and semantic similarity. MF-GCN integrates a Semantic Clustering-based GCN (S-GCN) to inject historical semantic context and an Attentive Fusion-based GCN (A-GCN) to learn dynamic cross-modal correlations using multi-head attention. Experiments on our constructed real-world datasets demonstrate that MF-GCN achieves accuracies of 93.8%, 93.8%, and 93.3% with F1-scores of 93.6%, 93.6%, and 93.3% for ship deviation warning, bridge-crossing warning, and inter-ship collision warning, respectively, consistently outperforming representative baselines. These results verify the effectiveness of the proposed method for robust multimodal anomaly detection and early warning in inland-waterway scenarios. Full article
(This article belongs to the Special Issue Emerging Computational Methods in Intelligent Marine Vehicles)

19 pages, 2562 KB  
Article
An Enhanced LSTM with Hippocampal-Inspired Episodic Memory for Urban Crowd Behavior Analysis
by Mingshou An, Hye-Youn Lim and Dae-Seong Kang
Electronics 2026, 15(1), 101; https://doi.org/10.3390/electronics15010101 - 25 Dec 2025
Viewed by 336
Abstract
The increasing frequency and severity of urban crowd disasters underscore a critical need for intelligent surveillance systems capable of real-time crowd anomaly detection and early warning. While deep learning models such as LSTMs, ConvLSTMs, and Transformers have been applied to video-based crowd anomaly detection, they often face limitations in long-term contextual reasoning, computational efficiency, and interpretability. To address these challenges, this paper proposes HiMeLSTM, a crowd anomaly detection framework built around a hippocampal-inspired memory-enhanced LSTM backbone that integrates Long Short-Term Memory (LSTM) networks with an Episodic Memory Unit (EMU). This hybrid design enables the model to effectively capture both short-term temporal dynamics and long-term contextual patterns essential for understanding complex crowd behavior. We evaluate HiMeLSTM on two publicly available crowd-anomaly benchmark datasets (UCF-Crime and ShanghaiTech Campus) and an in-house CrowdSurge-1K dataset, demonstrating that it consistently outperforms strong baseline architectures, including Vanilla LSTM, ConvLSTM, a lightweight spatial–temporal Transformer, and recent reconstruction-based models such as MemAE and ST-AE. Across these datasets, HiMeLSTM achieves up to 93.5% accuracy, 89.6% anomaly detection rate (ADR), and a 0.89 F1-score, while maintaining computational efficiency suitable for real-time deployment on GPU-equipped edge devices. Unlike many recent approaches that rely on multimodal sensors, optical-flow volumes, or detailed digital twins of the environment, HiMeLSTM operates solely on raw CCTV video streams combined with a simple manually defined zone layout. Furthermore, the hippocampal-inspired EMU provides an interpretable memory retrieval mechanism: by inspecting the retrieved episodes and their attention weights, operators can understand which past crowd patterns contributed to a given decision.
Overall, the proposed framework represents a significant step toward practical and reliable crowd monitoring systems for enhancing public safety in urban environments.
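
The interpretable retrieval idea mentioned in this abstract (inspecting retrieved episodes and their attention weights) might look something like the sketch below. The abstract does not specify how the EMU scores episodes, so scaled dot-product attention with a softmax is assumed here, and `retrieve_episodes`, `memory_keys`, and `top_k` are hypothetical names.

```python
import math


def retrieve_episodes(query, memory_keys, top_k=2):
    """Rank stored episodes by attention weight for a query vector.

    Assumes scaled dot-product scoring followed by a softmax; the
    paper's actual EMU mechanism may differ.
    Returns (top, weights): top is a list of (episode_index, weight)
    sorted by descending weight; weights covers every episode.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in memory_keys]
    # Numerically stable softmax over the scores.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return [(i, weights[i]) for i in ranked[:top_k]], weights
```

An operator-facing tool could then display the top-ranked episodes alongside their weights to show which stored patterns dominated a decision.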

29 pages, 3643 KB  
Article
Optimizing Performance of Equipment Fleets Under Dynamic Operating Conditions: Generalizable Shift Detection and Multimodal LLM-Assisted State Labeling
by Bilal Chabane, Georges Abdul-Nour and Dragan Komljenovic
Sustainability 2026, 18(1), 132; https://doi.org/10.3390/su18010132 - 22 Dec 2025
Viewed by 484
Abstract
This paper presents OpS-EWMA-LLM (Operational State Shifts Detection using Exponential Weighted Moving Average and Labeling using Large Language Model), a hybrid framework that combines fleet-normalized statistical shift detection with LLM-assisted diagnostics to identify and interpret operational state changes across heterogeneous fleets. First, we introduce a residual-based EWMA control chart methodology that uses deviations of each component’s sensor reading from its fleet-wide expected value to detect anomalies. This statistical approach yields near-zero false negatives and flags incipient faults earlier than conventional methods, without requiring component-specific tuning. Second, we implement a pipeline that integrates an LLM with a retrieval-augmented generation (RAG) architecture. Through a three-phase prompting strategy, the LLM ingests time-series anomalies, domain knowledge, and contextual information to generate human-interpretable diagnostic insights. Finally, unlike existing approaches that treat anomaly detection and diagnosis as separate steps, we assign to each detected event a criticality label based on both the statistical score of the anomaly and the semantic score from the LLM analysis. These labels are stored in the OpS-Vector to extend the knowledge base of cases for future retrieval. We demonstrate the framework on SCADA data from a fleet of wind turbines: OpS-EWMA successfully identifies critical temperature deviations in various components that standard alarms missed, and the LLM (augmented with relevant documents) provides rationalized explanations for each anomaly. The framework demonstrated robust performance and outperformed baseline methods in a realistic zero-tuning deployment across thousands of heterogeneous equipment units operating under diverse conditions, without component-specific calibration.
By fusing lightweight statistical process control with generative AI, the proposed solution offers a scalable, interpretable tool for condition monitoring and asset management in Industry 4.0/5.0 settings. Beyond its technical contributions, the outcome of this research is aligned with the UN Sustainable Development Goals SDG 7, SDG 9, SDG 12, SDG 13.
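
A residual-based EWMA control chart of the kind described in this abstract can be sketched as below. This is a textbook EWMA chart applied to fleet residuals, not the authors' OpS-EWMA code: taking the fleet median as the "fleet-wide expected value", the smoothing factor `lam`, the limit multiplier `width`, and the `baseline` window used to estimate the in-control mean and standard deviation are all assumptions.

```python
import math
import statistics


def fleet_residuals(readings):
    """Subtract the per-time-step fleet median from each unit's series.

    readings: list of equal-length series, one per unit.
    The median is an assumed stand-in for the fleet-wide expected value.
    """
    n_steps = len(readings[0])
    expected = [statistics.median(unit[t] for unit in readings) for t in range(n_steps)]
    return [[unit[t] - expected[t] for t in range(n_steps)] for unit in readings]


def ewma_alarms(residuals, lam=0.2, width=3.0, baseline=50):
    """Flag time steps where the EWMA of one unit's residuals leaves its control limits.

    The in-control mean and standard deviation are estimated from the
    first `baseline` residuals (a hypothetical calibration choice).
    """
    mu = statistics.fmean(residuals[:baseline])
    sigma = statistics.stdev(residuals[:baseline])
    z = mu
    alarms = []
    for t, x in enumerate(residuals, start=1):
        z = lam * x + (1 - lam) * z
        # Standard time-varying EWMA control limit half-width.
        limit = width * sigma * math.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * t)))
        alarms.append(abs(z - mu) > limit)
    return alarms
```

Because every unit is compared with the same fleet-level reference, the same `lam` and `width` can be reused across units, which is one plausible reading of the "no component-specific tuning" claim.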
(This article belongs to the Section Energy Sustainability)