Search Results (1,916)

Search Parameters:
Keywords = random convolution

17 pages, 1711 KB  
Article
Surface EMG-Based Hand Gesture Recognition Using a Hybrid Multistream Deep Learning Architecture
by Yusuf Çelik and Umit Can
Sensors 2026, 26(7), 2281; https://doi.org/10.3390/s26072281 - 7 Apr 2026
Abstract
Surface electromyography (sEMG) enables non-invasive measurement of muscle activity for applications such as human–machine interaction, rehabilitation, and prosthesis control. However, high noise levels, inter-subject variability, and the complex nature of muscle activation hinder robust gesture classification. This study proposes a multistream hybrid deep-learning architecture for the FORS-EMG dataset to address these challenges. The model integrates Temporal Convolutional Networks (TCN), depthwise separable convolutions, bidirectional Long Short-Term Memory (LSTM)–Gated Recurrent Unit (GRU) layers, and a Transformer encoder to capture complementary temporal and spectral patterns, and an ArcFace-based classifier to enhance class separability. We evaluate the approach under three protocols: subject-wise, random split without augmentation, and random split with augmentation. In the augmented random-split setting, the model attains 96.4% accuracy, surpassing previously reported values. In the subject-wise setting, accuracy is 74%, revealing limited cross-user generalization. The results demonstrate the method’s high performance and highlight the impact of data-partition strategies for real-world sEMG-based gesture recognition.
(This article belongs to the Special Issue Machine Learning in Biomedical Signal Processing)
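The ArcFace head mentioned in the abstract above adds an angular margin to the target-class logit before softmax, pushing classes apart on the hypersphere. A minimal NumPy sketch of that idea (the function name, margin, and scale are illustrative defaults, not the paper's settings):

```python
import numpy as np

def arcface_logits(embeddings, weights, labels, margin=0.5, scale=30.0):
    """ArcFace-style logits: add an angular margin to the target class.

    embeddings: (N, D) L2-normalized feature vectors
    weights:    (C, D) L2-normalized class centers
    labels:     (N,) integer class ids
    """
    # Cosine similarity between each sample and each class center.
    cos = embeddings @ weights.T                      # (N, C)
    cos = np.clip(cos, -1.0, 1.0)
    theta = np.arccos(cos)
    # Add the margin only on the ground-truth class angle.
    target = np.zeros_like(cos, dtype=bool)
    target[np.arange(len(labels)), labels] = True
    theta = np.where(target, theta + margin, theta)
    return scale * np.cos(theta)
```

Because the margin is applied only to the true class, its logit shrinks during training, forcing the network to learn embeddings that clear the margin.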
20 pages, 4589 KB  
Article
Autoencoder-Based Latent Representation Learning, SoH Estimation, and Anomaly Detection in Electric Vehicle Battery Energy Storage Systems
by Nagendra Kumar, Anubhav Agrawal, Rajeev Kumar and Manoj Badoni
Vehicles 2026, 8(4), 81; https://doi.org/10.3390/vehicles8040081 - 7 Apr 2026
Abstract
Accurate estimation of battery state of health (SoH) is an important aspect for improving the reliability, safety, and operating efficiency of an energy storage system. This study presents a unified deep learning pipeline for prediction, latent feature extraction, and anomaly detection. A convolutional neural network (CNN) autoencoder is used to learn compact latent features from the NASA battery datasets (B0005, B0006, B0007, and B0018). These features serve as inputs to random forest and linear regression models, which are further compared with CNN and GRU baselines. The system is evaluated using leave-one-group-out cross-validation to ensure robustness across different batteries. Latent-space quality is studied using PCA, t-SNE, and UMAP analyses. Furthermore, clustering performance is measured using the Silhouette Score, and anomalies are detected using reconstruction error and the Isolation Forest technique. The obtained results show that the AE+RF model achieves the best performance, with a root mean square error (RMSE) of 0.0285, a mean absolute error (MAE) of 0.0109, and a high coefficient of determination (R²) of 0.96, indicating high prediction accuracy and model reliability. The results show that latent features improve prediction accuracy, helping to clearly separate normal and abnormal patterns, and providing a robust and accurate approach to battery SoH estimation that is suitable for battery management system applications.
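The reconstruction-error criterion described above can be sketched as follows. This is a hypothetical illustration: the robust median/MAD threshold is our assumption, and the study's Isolation Forest stage is not shown:

```python
import numpy as np

def reconstruction_anomalies(x, reconstruct, k=3.0):
    """Flag samples whose autoencoder reconstruction error is an outlier.

    x:           (N, D) input windows
    reconstruct: callable mapping x -> x_hat (the trained autoencoder)
    k:           threshold in robust standard deviations above the median
    """
    err = np.mean((x - reconstruct(x)) ** 2, axis=1)   # per-sample MSE
    med = np.median(err)
    mad = np.median(np.abs(err - med)) + 1e-12         # robust spread
    return err > med + k * 1.4826 * mad                # boolean anomaly mask
```

A sample the autoencoder cannot reconstruct well (large error relative to the bulk of the data) is flagged as anomalous.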

22 pages, 1376 KB  
Article
Ensemble Deep Learning Models on Raw DNA Sequences for Viral Genome Identification in Human Samples
by Marco De Nat, Simone Boscolo, Sonia Pilar Gallo, Loris Nanni and Daniel Fusaro
Sensors 2026, 26(7), 2238; https://doi.org/10.3390/s26072238 - 4 Apr 2026
Abstract
Detecting highly divergent or previously unknown viruses is a critical bottleneck in clinical diagnostics and pathogen surveillance. While alignment-based methods often fail to classify sequences lacking homology to known references, deep learning offers a powerful alternative for signal extraction from ‘viral dark matter.’ In this work, we present a high-performance ensemble of deep convolutional neural networks specifically designed to identify viral contigs in complex human metagenomic datasets. Our framework processes sequences acquired from high-throughput biological sensors and integrates complementary architectures to capture both local motifs and global genomic signatures. The proposed ensemble achieves state-of-the-art performance, reaching an AUROC of 0.939 on 300 bp contigs and significantly outperforming existing models such as transformer-based approaches, ViraMiner, and DeepVirFinder. Crucially, our results demonstrate high robustness to data degradation, maintaining stable predictive power even with a 10% random nucleotide substitution rate, a common challenge in degraded clinical samples. Furthermore, the model generalizes to ‘unseen’ viral families not present during training, demonstrating its utility for emerging threat detection. To ensure full reproducibility and facilitate further research in clinical sensing, the complete code and datasets are publicly available on GitHub.
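The 10% random nucleotide substitution used in the robustness test above can be reproduced in a few lines; `substitute_nucleotides` is a hypothetical helper for illustration, not the authors' code:

```python
import random

def substitute_nucleotides(seq, rate=0.10, seed=0):
    """Randomly replace a fraction of bases with a *different* base,
    mimicking the 10% substitution-rate degradation test."""
    rng = random.Random(seed)
    bases = "ACGT"
    out = []
    for b in seq:
        if b in bases and rng.random() < rate:
            out.append(rng.choice([c for c in bases if c != b]))
        else:
            out.append(b)  # non-ACGT symbols (e.g. N) pass through
    return "".join(out)
```

Feeding such perturbed contigs through a trained classifier and comparing AUROC against the clean baseline is the standard way to quantify this kind of robustness.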

21 pages, 13827 KB  
Article
An Integrated Model Based on CNN-Transformer and PLUS for Urban Expansion Simulation in the Yangtze River Delta, China
by Linyu Ma, Jue Xiao, Gan Teng, Ting Zhang and Longqian Chen
Remote Sens. 2026, 18(7), 1071; https://doi.org/10.3390/rs18071071 - 2 Apr 2026
Abstract
Land use changes within urban agglomerations exhibit significant spatiotemporal heterogeneity and regional diversity. In urban agglomeration land simulation, traditional models often struggle to systematically capture these variations. We introduce GCTP, a novel framework that integrates guided Geographical zoning, a Convolutional Neural Network (CNN)-Transformer, and the Patch-generating Land Use Simulation (PLUS) model. Initially, guided K-means clustering was employed for geographic zoning to characterize regional spatial non-stationarity. Then, a CNN-Transformer network leveraged self-attention mechanisms to capture multi-scale spatial correlations, obtaining pixel-level development probabilities. Finally, these probabilities were fused with PLUS Land Expansion Analysis Strategy (LEAS) outputs to drive the PLUS Cellular Automata with multi-type Random Seeds (CARS) module for patch-level simulation. The results demonstrate the following: (1) The embedding of guided zoning enabled the model to achieve an Overall Accuracy (OA) of 0.941, effectively mitigating global simulation bias. (2) The optimal simulation performance occurred at a fusion weight of 0.81, yielding a Kappa of 0.8917 and a Figure of Merit (FoM) of 0.3830, significantly exceeding those of a single model. (3) The 2030 simulation indicates that the GCTP model effectively reduces isolated pixels at urban fringes. The GCTP generates neighborhood patterns with high spatial compactness and geographic consistency. This study highlights the significant advantages of integrating long-range spatial perception with geographical heterogeneity constraints in the land expansion simulation of urban agglomerations. The findings support more precise territorial spatial planning practices.
(This article belongs to the Special Issue Machine Learning of Remote Sensing Imagery for Land Cover Mapping)
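The abstract does not state the exact fusion rule; a weighted linear blend of the two probability maps at the reported optimal weight of 0.81 is one plausible reading, sketched here with illustrative names:

```python
import numpy as np

def fuse_probabilities(p_cnn_transformer, p_leas, w=0.81):
    """Linearly fuse pixel-level development probabilities from the
    CNN-Transformer branch and the PLUS-LEAS branch. `w` weights the
    CNN-Transformer map; 0.81 was the optimum reported in the study,
    but the linear form itself is an assumption."""
    p = w * p_cnn_transformer + (1.0 - w) * p_leas
    return np.clip(p, 0.0, 1.0)  # keep the result a valid probability
```

The fused map would then serve as the development-probability surface driving the CARS patch-level simulation.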

19 pages, 712 KB  
Article
Federated Learning-Driven Protection Against Adversarial Agents in a ROS2 Powered Edge-Device Swarm Environment
by Brenden Preiss and George Pappas
AI 2026, 7(4), 127; https://doi.org/10.3390/ai7040127 - 1 Apr 2026
Abstract
Federated learning (FL) enables collaborative model training across distributed devices and robotic systems while preserving data privacy, making it well-suited for swarm robotics and edge-device-powered intelligence. However, FL remains vulnerable to adversarial behaviors such as data and model poisoning, particularly in real-world deployments where detection methods must operate under strict computational and communication constraints. This paper presents a practical, real-world federated learning framework that enhances robustness to adversarial agents in a ROS2-based edge-device swarm environment. The proposed system integrates the Federated Averaging (FedAvg) algorithm with a lightweight average cosine similarity-based filtering method to detect and suppress harmful model updates during aggregation. Unlike prior work that primarily evaluates poisoning defenses in simulated environments, this framework is implemented and evaluated on physical hardware, consisting of a laptop-based aggregator and multiple Raspberry Pi worker nodes. A convolutional neural network (CNN) based on the MobileNetV3-Small architecture is trained on the MNIST dataset, with one worker executing a sign-flipping model poisoning attack. Experimental results show that FedAvg alone fails to maintain meaningful model accuracy under adversarial conditions, yielding near-random classification performance with a final global model accuracy of 11% and a loss of 2.3. In contrast, adding cosine similarity filtering effectively detects the sign-flipping attack in the evaluated ROS2 swarm experiment: despite the attacker, the global model maintains accuracy of around 90% and loss of around 0.37, close to the 93% baseline accuracy of FedAvg alone under no attack, with only a minimal increase in loss. The proposed method also maintains a false positive rate (FPR) of around 0.01 and a false negative rate (FNR) of around 0.10 in the presence of an attacker, a minimal difference from the baseline FedAvg-only results of around 0.008 (FPR) and 0.07 (FNR). Computational cost is likewise close to baseline FedAvg with no attacker: average runtime increases from about 34 min to about 35 min, and the global model shared among workers stays at around 7.15 megabytes, showing little to no increase in message payload size. These results demonstrate that computationally lightweight cosine similarity-based detection methods can be deployed effectively in real-world, resource-constrained robotic swarm environments, providing a practical path toward improving robustness in federated learning deployments beyond simulation-based evaluation.
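The cosine-similarity filtering step can be sketched in NumPy. This is a simplified illustration: each flattened update is scored against the coordinate-wise median of all updates (the paper describes average cosine-similarity filtering; the median reference and the zero threshold are our assumptions, chosen because a sign-flipped update points opposite the honest consensus and so scores negative):

```python
import numpy as np

def filtered_fedavg(updates, threshold=0.0):
    """Aggregate worker updates, dropping those whose cosine similarity
    to a robust reference direction falls below `threshold`.

    updates: list/array of flattened model updates, shape (workers, params)
    Returns (aggregated update, per-worker similarity scores).
    """
    updates = np.asarray(updates, dtype=float)
    ref = np.median(updates, axis=0)          # robust consensus direction
    sims = updates @ ref / (np.linalg.norm(updates, axis=1)
                            * np.linalg.norm(ref) + 1e-12)
    kept = updates[sims >= threshold]         # suppress suspicious workers
    return kept.mean(axis=0), sims
```

With three honest workers and one sign-flipping attacker, the attacker's similarity is negative and its update is excluded from the average.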

16 pages, 1425 KB  
Article
On the Classification–Causal Tradeoff in Neural Network Propensity Score Estimation
by Seungman Kim, Jaehoon Lee and Kwanghee Jung
Stats 2026, 9(2), 37; https://doi.org/10.3390/stats9020037 - 31 Mar 2026
Abstract
Observational studies serve as a vital alternative to randomized experiments but are highly susceptible to selection bias. Propensity score (PS) methods address this by balancing covariates between groups. Although including all relevant covariates is theoretically ideal, high dimensionality often destabilizes traditional estimation models. This study evaluates the efficacy of deep neural networks (DNN) and convolutional neural networks (CNN) for PS estimation compared to traditional logistic regression (LR), leveraging their capacity to handle complex nonlinear relationships and interactions. Using a Monte Carlo simulation across 36 conditions, model performance was evaluated based on bias and imbalance reduction. Results indicate that DNNs and CNNs significantly outperform LR. Specifically, while LR increased outcome bias by 17% and reduced covariate imbalance by only 5%, DNNs and CNNs reduced outcome bias by 13% and 16%, respectively, while decreasing covariate imbalance by 18% and 21%. We conclude that despite requiring specialized computational resources, neural networks offer substantial advantages for high-dimensional PS estimation. However, their reliable application necessitates stability-aware training and proper error rate thresholds to prevent probability degeneracy.
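Estimated propensity scores are typically used to reweight outcomes, and the probability degeneracy the authors warn about is commonly handled by clipping estimated probabilities away from 0 and 1. A generic inverse-propensity-weighting sketch, not the authors' estimator:

```python
import numpy as np

def ipw_ate(y, treated, propensity, eps=0.01):
    """Inverse-propensity-weighted (Hajek) estimate of the average
    treatment effect. `eps` clips the scores away from 0/1, guarding
    against degenerate probabilities that explode the weights."""
    p = np.clip(propensity, eps, 1 - eps)
    w1 = treated / p                   # weights for treated units
    w0 = (1 - treated) / (1 - p)       # weights for control units
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)
```

Under a perfectly randomized design (propensity 0.5 everywhere), the estimator reduces to the simple difference in group means.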

29 pages, 2066 KB  
Article
Intelligence Collision Detection Using a Combination of Tuning Base Methods and Convolutional Long Short Term Memory Models
by Mohammed Hilfi and Lubna Alazzawi
Smart Cities 2026, 9(4), 61; https://doi.org/10.3390/smartcities9040061 - 31 Mar 2026
Abstract
Effective traffic control using Artificial Intelligence (AI) is essential to ensure safe passage for all road users. AI-based collision detection systems offer advanced mechanisms to prevent accidents and improve highway safety. This research investigates two distinct collision scenarios: vehicle–pedestrian and vehicle–motorcyclist interactions. The proposed method combines bidirectional Long Short-Term Memory (LSTM), Convolutional Neural Network with LSTM (CNN–LSTM), and Transformer models, each tuned using random or grid search. For the pedestrian–vehicle scenario, the CNN–LSTM model achieved 99.76% accuracy, 99.77% precision, and 99.76% recall, highlighting its strong classification performance. In the vehicle–motorcyclist scenario, the bidirectional LSTM reached 99.73% accuracy with precision and recall of 99.15%, demonstrating its effectiveness in detecting imminent crashes. The CNN–LSTM optimized by random search focused on decreasing the false-positive rate while increasing the true-positive rate, achieving superior results compared to previous research. These results suggest that the system could be effectively implemented as an early collision warning solution on edge devices.

16 pages, 2852 KB  
Article
A Methodological Study of 1D CNN Classification of Marine Mammal Vocalizations with Variable Signal Durations
by Won-Ki Kim, Dawoon Lee and Ho Seuk Bae
J. Mar. Sci. Eng. 2026, 14(7), 639; https://doi.org/10.3390/jmse14070639 - 30 Mar 2026
Abstract
Marine mammal sound classification plays an important role in understanding species behavior, communication, and ecology. Automated classification methods have received increasing attention due to their ability to efficiently process and analyze large volumes of acoustic data. Traditional classification approaches often rely on frequency-domain representations, such as spectrograms, and image-based classifiers, which can be highly influenced by user-defined parameters. In this study, we investigate a classification method for marine mammal vocalizations using a one-dimensional convolutional neural network (1D CNN) that directly processes raw audio signals. The approach can handle signals of varying durations through a random cropping technique, minimizing signal distortion that is commonly introduced by conventional methods. The model was evaluated using marine mammal vocalization recordings obtained from the Watkins Marine Mammal Sound Database under three experimental scenarios. The results demonstrate the feasibility of using raw audio inputs with a 1D CNN for classifying marine mammal vocalizations with variable signal durations.
(This article belongs to the Section Ocean Engineering)
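The random-cropping step for variable-duration signals can be sketched as follows — a minimal illustration; the zero-padding behavior for clips shorter than the target length is our assumption, not a detail from the abstract:

```python
import numpy as np

def random_crop(signal, target_len, rng=None):
    """Crop (or pad) a raw 1D waveform to a fixed length so that
    variable-duration vocalizations share one network input size."""
    rng = rng or np.random.default_rng()
    n = len(signal)
    if n >= target_len:
        # Long clips: take a random contiguous window.
        start = rng.integers(0, n - target_len + 1)
        return signal[start:start + target_len]
    # Short clips: place the signal at a random offset in zero padding.
    out = np.zeros(target_len, dtype=signal.dtype)
    start = rng.integers(0, target_len - n + 1)
    out[start:start + n] = signal
    return out
```

Because the crop window is re-sampled every epoch, the network sees different segments of each recording, which also acts as a mild augmentation.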

33 pages, 5941 KB  
Review
Artificial Intelligence-Enabled Intelligent Sensory Systems for Quality Evaluation of Traditional Chinese Medicine: A Review of Electronic Nose, Electronic Tongue, and Machine Vision Approaches
by Jingqiu Shi, Jinyi Wu, Li Xu, Ce Tang and Yi Zhang
Molecules 2026, 31(7), 1140; https://doi.org/10.3390/molecules31071140 - 30 Mar 2026
Abstract
Traditional sensory evaluation of traditional Chinese medicine (TCM) and medicinal and food homologous products has long relied on human observation of appearance, color, aroma, and taste. However, this approach is highly subjective, difficult to quantify, and often lacks reproducibility across evaluators. Intelligent sensory systems, including the electronic nose, electronic tongue, and machine vision, provide objective and digitized sensory information for TCM quality evaluation. Nevertheless, these platforms generate high-dimensional and heterogeneous datasets, creating a strong demand for efficient artificial intelligence (AI)-based analytical tools. This review summarizes recent advances in the application of machine learning and deep learning methods, such as support vector machine, random forest, convolutional neural network, and long short-term memory networks, for intelligent sensory evaluation of TCM. Particular emphasis is placed on how AI supports feature extraction, pattern recognition, classification, regression, and multisource data fusion across electronic nose, electronic tongue, and machine vision systems. Representative applications in raw material authentication, geographical origin discrimination, processing monitoring, and quality grading are also discussed. In addition, the current challenges related to data standardization, sensor drift, model robustness, and interpretability are highlighted. Overall, this review provides an integrated overview of AI-enabled intelligent sensory technologies and clarifies their potential to advance TCM quality evaluation toward a more objective, efficient, and holistic framework.

20 pages, 1131 KB  
Article
Imbalance-Aware APS Failure Classification Using Feature-Wise Attention Graph Convolutional Network
by Juhyeon Noh, Jihoon Lee, Seungmin Oh, Jaehyung Park, Minsoo Hahn, HoYong Ryu and Jinsul Kim
Processes 2026, 14(7), 1107; https://doi.org/10.3390/pr14071107 - 29 Mar 2026
Abstract
Industrial equipment data often exhibit high dimensionality and class imbalance, which make it difficult to achieve both accurate failure detection and identification of the factors contributing to failures. To address this issue, this study proposes an explainable failure classification framework, Feature-Wise Attention Graph Convolutional Network (FWA-GCN), which combines Feature-Wise Attention (FWA) with a Graph Convolutional Network (GCN) to provide both high classification performance and variable-level interpretability. In the proposed model, tabular sensor records are treated as nodes, and a similarity-based graph is constructed to capture relationships among samples. Feature-Wise Attention learns the importance of each feature and reweights node features accordingly, and the reweighted features are then used as input to the GCN to classify failure occurrences. To alleviate the class imbalance problem, a weighted loss function is applied during training by assigning a higher weight to the failure class. Experiments conducted on the Air Pressure System (APS) dataset demonstrate that the proposed FWA-GCN achieves Precision of 79.95%, Recall of 85.07%, and F1-score of 82.43%, outperforming conventional machine learning models including Random Forest, XGBoost, CatBoost, and Multi-Layer Perceptron, as well as a standard GCN model. Furthermore, an ablation study was conducted by removing the top features selected by the attention mechanism. The results show a significant decrease in recall, confirming the effectiveness of the attention-based feature importance and supporting the interpretability of the proposed framework.
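The weighted-loss idea above — a higher penalty on the rare failure class — can be illustrated with a class-weighted binary cross-entropy in NumPy (`pos_weight=10` is an illustrative value, not the paper's setting):

```python
import numpy as np

def weighted_bce(y_true, p_pred, pos_weight=10.0, eps=1e-7):
    """Binary cross-entropy with a larger weight on the positive
    (failure) class, so missing a failure costs more than raising
    an equally confident false alarm."""
    p = np.clip(p_pred, eps, 1 - eps)       # numerical stability
    loss = -(pos_weight * y_true * np.log(p)
             + (1 - y_true) * np.log(1 - p))
    return loss.mean()
```

With this weighting, a confidently missed failure incurs roughly `pos_weight` times the loss of an equally confident false alarm, which pushes the model toward higher recall on the minority class.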

40 pages, 5095 KB  
Article
When Lie Groups Meet Hyperspectral Images: Equivariant Manifold Network for Few-Shot HSI Classification
by Haolong Ban, Junchao Feng, Zejin Liu, Yue Jiang, Zhenxing Wang, Jialiang Liu, Yaowen Hu and Yuanshan Lin
Sensors 2026, 26(7), 2117; https://doi.org/10.3390/s26072117 - 29 Mar 2026
Abstract
Hyperspectral imagery (HSI) offers rich spectral signatures and fine-grained spatial structures for remote sensing, but practical HSI classification is often constrained by scarce labels and complex geometric disturbances, including translation, rotation, scaling, and shear. Existing deep models are typically developed under Euclidean assumptions and rely on data-hungry training pipelines, which makes them brittle in the few-shot regime. To address this challenge, we propose EMNet, a Lie-group-based Equivariant Manifold Network for few-shot HSI classification that explicitly encodes geometric invariance and improves discriminative accuracy. EMNet couples an SE(2)-based Equivariance-Guided Module (EGM) to enforce equivariance to translations and rotations with an affine Lie-group-based Characteristic Filtering Convolution (CFC) that models scaling and shearing on the feature manifold while adaptively suppressing redundant responses. Extensive experiments on WHU-Hi-HongHu, Houston2013, and Indian Pines demonstrate state-of-the-art performance with competitive complexity, achieving OAs of 95.77% (50 samples/class), 97.37% (50 samples/class), and 96.09% (5% labeled samples), respectively, and yielding up to +3.34% OA, +6.01% AA, and +4.14% Kappa over the strong DGPF-RENet baseline. Under a stricter 25-samples-per-class protocol with 10 repeated random hold-out splits, EMNet consistently improves the mean accuracy while exhibiting lower variance, indicating better stability to sampling uncertainty. On the city-scale Xiongan New Area dataset with extreme long-tail imbalance (1580 × 3750 pixels, 256 bands, and 5.925 M labeled pixels), EMNet further boosts OA from 85.89% to 93.77% under the 1% labeled-sample protocol, highlighting robust generalization for large-area mapping. Beyond point estimates, we report mean ± SD/SE across repeated splits and provide rigorous statistical validation by computing Yule’s Q statistic for class-wise behavior similarity, performing the Friedman test with Nemenyi post hoc comparisons for multi-method ranking significance, and presenting 95% confidence intervals together with Cohen’s d effect sizes to quantify practical improvement.
(This article belongs to the Special Issue Hyperspectral Sensing: Imaging and Applications)
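Yule's Q, used above for class-wise behavior similarity, compares two classifiers' per-sample correctness: it is +1 when they err on exactly the same samples and -1 when their errors are disjoint. A small self-contained implementation:

```python
import numpy as np

def yules_q(correct_a, correct_b):
    """Yule's Q between two classifiers' correctness vectors (1 = that
    classifier got the sample right). Q = (N11*N00 - N10*N01) /
    (N11*N00 + N10*N01)."""
    a = np.asarray(correct_a, dtype=bool)
    b = np.asarray(correct_b, dtype=bool)
    n11 = np.sum(a & b)     # both correct
    n00 = np.sum(~a & ~b)   # both wrong
    n10 = np.sum(a & ~b)    # only A correct
    n01 = np.sum(~a & b)    # only B correct
    num = n11 * n00 - n10 * n01
    den = n11 * n00 + n10 * n01
    return num / den if den else 0.0
```

Values near zero indicate the two methods behave independently, which is the interesting regime when arguing that two models make genuinely different mistakes.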

29 pages, 16603 KB  
Article
Hierarchical Neural-Guided Navigation with Vortex Artificial Potential Field for Robust Path Planning in Complex Environments
by Boyi Xiao, Lujun Wan, Jiwei Tian, Yuqin Zhou, Sibo Hou and Haowen Zhang
Drones 2026, 10(4), 240; https://doi.org/10.3390/drones10040240 - 26 Mar 2026
Abstract
Existing autonomous navigation systems for Unmanned Aerial Vehicles (UAVs) face the dual challenges of local minima entrapment and computational complexity that scales with environmental density. This paper proposes a hierarchical navigation architecture integrating deep representation learning with an improved Vortex Artificial Potential Field (APF). At the decision layer, a Convolutional Neural Network (CNN) encodes the environment as a fixed-dimensional tensor and generates global waypoints with constant-time inference, independent of obstacle count. At the control layer, a Vortex APF resolves the Goal Non-Reachable with Obstacles Nearby (GNRON) problem and limit-cycle oscillations through tangential rotational potentials, achieving significant improvement in trajectory smoothness compared to traditional APF methods. A closed-loop replanning mechanism further ensures robust performance under execution drift. Experiments across varying obstacle densities demonstrate that the combined system achieves high navigation success rates in dense environments with substantially reduced computation time compared to sampling-based planners such as Rapidly exploring Random Tree star (RRT*), while maintaining superior trajectory quality. This architecture provides a computationally efficient solution for resource-constrained UAV platforms operating in GPS-denied or obstacle-rich environments such as warehouses, forests, and disaster sites.
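The tangential rotational potential behind a Vortex APF can be sketched in 2D: rotate the standard repulsive force by 90 degrees, so the vehicle circulates around an obstacle instead of being pushed straight back into a local minimum. Gains, the influence radius, and the repulsion form below are illustrative, not the paper's formulation:

```python
import numpy as np

def vortex_repulsion(pos, obstacle, influence=2.0, gain=1.0):
    """Tangential (vortex) repulsive force in 2D.

    pos, obstacle: length-2 arrays. Returns a zero force outside the
    obstacle's influence radius; inside it, the conventional radial
    APF push is rotated by 90 degrees to produce a swirl."""
    d_vec = pos - obstacle
    d = np.linalg.norm(d_vec)
    if d >= influence or d == 0.0:
        return np.zeros(2)
    # Classic APF-style radial repulsion magnitude.
    radial = gain * (1.0 / d - 1.0 / influence) * d_vec / d ** 2
    # 90-degree rotation of the radial force gives the vortex field.
    return np.array([-radial[1], radial[0]])
```

The rotated force is always perpendicular to the obstacle-to-vehicle direction, which is what prevents the head-on equilibrium responsible for local-minima stalling.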

32 pages, 6874 KB  
Article
Advanced Semi-Supervised Learning for Remote Sensing-Based Land Cover Classification in the Mekong River Delta, Vietnam
by Hai-An Bui, Chih-Hua Hsu, Hsu-Wen Vincent Young, Yi-Ying Chen and Yuei-An Liou
Remote Sens. 2026, 18(7), 989; https://doi.org/10.3390/rs18070989 - 25 Mar 2026
Abstract
The Vietnam Mekong River Delta (VMRD) is a climate-sensitive region characterized by diverse ecosystems, including extensive mangrove forests that protect against sea-level rise and contribute to global carbon sequestration. Accurate land cover classification in the VMRD is essential but remains challenging due to complex landscapes and dynamic environmental conditions. The primary objective of this study is to propose a semi-supervised deep learning framework that integrates satellite indices with multi-temporal remote sensing data to address key classification challenges, particularly where ground-truth data are limited, and to compare it against unsupervised and supervised machine learning methods. Our comparative analysis across different sample sizes (500 to 6000 ground-truth data points) reveals critical insights into model performance and scalability. Supervised models, including Random Forest (RF), Support Vector Machine (SVM), and Convolutional Neural Network (CNN), demonstrated strong performance when sufficient labeled data were available, with CNN achieving the highest accuracy (0.97 at 6000 samples). However, at minimal sample sizes (500 sample points), these supervised approaches exhibited substantial limitations, with accuracies dropping dramatically (RF: 0.75, SVM: 0.80, CNN: 0.81). Supervised models also showed overfitting tendencies compared to official land cover statistics. In contrast, the semi-supervised approach (SoC4SS-FGVC) achieved remarkably high performance at small sample sizes (0.92 accuracy with 500 sample points), demonstrating strength under minimal data availability. The framework also showed improved capability in distinguishing spectrally similar land-cover classes and detecting environmentally sensitive types such as mangrove forests. Cross-validation with official statistics confirmed the semi-supervised model’s superior effectiveness in delineating paddy rice fields and its resistance to overfitting. The performance analysis demonstrates that SoC4SS-FGVC provides a practical and cost-effective solution for land cover mapping, particularly in regions where extensive ground-truth data collection is prohibitively expensive or logistically challenging. Full article
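The supervised-versus-semi-supervised comparison under scarce labels can be illustrated with a minimal scikit-learn sketch. This uses synthetic data and `SelfTrainingClassifier` as a generic semi-supervised stand-in; it is not the paper's SoC4SS-FGVC framework, and the feature counts and class labels are illustrative assumptions.

```python
# Sketch of the supervised vs. semi-supervised comparison under label scarcity,
# using scikit-learn stand-ins (NOT the paper's SoC4SS-FGVC framework).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic "pixels": 6 spectral-index features, 4 land-cover classes.
X, y = make_classification(n_samples=3000, n_features=6, n_informative=5,
                           n_redundant=0, n_classes=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Simulate scarce ground truth: keep only 500 labels, mask the rest as -1
# (scikit-learn's marker for unlabeled samples).
y_partial = np.full_like(y_train, -1)
labeled_idx = np.random.RandomState(0).choice(len(y_train), 500, replace=False)
y_partial[labeled_idx] = y_train[labeled_idx]

# Supervised baseline trained on the 500 labeled points only.
rf = RandomForestClassifier(random_state=0)
rf.fit(X_train[labeled_idx], y_train[labeled_idx])

# Self-training sees the same 500 labels plus the unlabeled pool.
semi = SelfTrainingClassifier(RandomForestClassifier(random_state=0))
semi.fit(X_train, y_partial)

print("supervised:", accuracy_score(y_test, rf.predict(X_test)))
print("semi-supervised:", accuracy_score(y_test, semi.predict(X_test)))
```

The key design point mirrored here is that the semi-supervised model exploits the unlabeled pool during training, which is where its advantage at small sample sizes comes from.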

27 pages, 4998 KB  
Article
Machine Learning-Based Human Detection Using Active Non-Line-of-Sight Laser Sensing
by Semra Çelebi and İbrahim Türkoğlu
Sensors 2026, 26(7), 2046; https://doi.org/10.3390/s26072046 - 25 Mar 2026
Abstract
Active non-line-of-sight (NLOS) human detection aims to infer the presence of hidden individuals by analyzing indirectly reflected photons between a relay surface and occluded targets. In this study, a single-photon avalanche diode (SPAD) and time-correlated single-photon counting (TCSPC)-based acquisition system was used to measure time–photon waveforms in controlled NLOS environments designed to represent post-disaster rubble scenarios. Although the effective temporal resolution of the system is limited by the detector timing jitter and laser pulse width, the recorded transient signals retain distinguishable intensity and temporal delay patterns associated with the primary and secondary reflections. To construct a representative dataset, measurements were collected under varying subject poses, orientations, and surrounding object configurations. The recorded signals were processed using a unified preprocessing pipeline that included normalization, histogram shaping, and signal windowing. Three machine learning models, namely, Convolutional Neural Network, Gated Recurrent Unit, and Random Forest, were trained and evaluated for human presence classification. All models achieved full sensitivity in detecting human presence; however, notable differences emerged in the classification of human-absent scenarios. Among the tested approaches, Random Forest achieved the highest overall accuracy and specificity, demonstrating stronger robustness to statistical variations in time–photon histograms under limited photon conditions. These results suggest that tree-based classifiers capture amplitude distribution patterns and temporal dispersion characteristics more effectively than deep neural architectures under the present acquisition constraints. Overall, the findings indicate that low-cost SPAD-based NLOS sensing systems can provide reliable human detection in indirect-observation scenarios. Full article
(This article belongs to the Special Issue AI-Based Sensing and Imaging Applications)
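The classification step described above can be sketched as a Random Forest trained on time-binned photon histograms. The data below are synthetic Poisson counts, not the paper's SPAD/TCSPC measurements; bin positions, return amplitudes, and the normalization step are illustrative assumptions.

```python
# Toy sketch: Random Forest classification of simulated time-photon histograms,
# standing in for the SPAD/TCSPC data described in the abstract.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_bins = 128

def make_histogram(human_present):
    # Background counts plus a strong primary return from the relay surface.
    lam = np.full(n_bins, 0.5)
    lam[28:34] += 20.0
    if human_present:
        # Weaker, delayed secondary return attributed to the hidden target.
        lam[70:80] += 3.0
    return rng.poisson(lam)

X = np.array([make_histogram(i % 2 == 0) for i in range(400)])
y = np.array([i % 2 == 0 for i in range(400)], dtype=int)

# Normalize each histogram to unit sum, mirroring the preprocessing pipeline.
X = X / X.sum(axis=1, keepdims=True)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)
print("mean CV accuracy:", scores.mean())
```

With real low-photon data the secondary return is far weaker relative to noise, which is where the robustness differences between classifiers reported above become relevant.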

17 pages, 4139 KB  
Article
Physics-Aware Generative Demasking: Spatially Conditioned Diffusion for Robust Transient Detection in Industrial Noise
by Hailin Cao, Zixi Lv, Jinjie Hu, Hui Wang, Lisheng Yang and Guoxin Zhang
Entropy 2026, 28(4), 364; https://doi.org/10.3390/e28040364 - 24 Mar 2026
Abstract
Detecting transient “click” sounds during connector insertion is pivotal for automotive assembly quality but remains intractable due to high-intensity, non-stationary industrial noise. This paper introduces a physics-aware generative demasking framework that integrates acoustic spatial priors with conditional diffusion modeling. We propose the spatially conditioned diffusion probabilistic model (SC-DPM), where an ambient reference signal acts as a physical constraint to steer the reverse diffusion process. By exploiting the spatial decay of insertion sounds, this mechanism effectively disentangles the target transient from the background noise manifold, reconstructing high-fidelity spectro-temporal features. Discriminative temporal patterns are extracted using random convolutional kernels with causal dilations and local proportion of positive values (LPPV) pooling. Experiments on real-world datasets demonstrate 93.3% accuracy. The proposed “restore-then-classify” paradigm significantly enhances robustness against acoustic variability, establishing a scalable methodology for precise industrial monitoring under extreme noise conditions. Full article
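The feature-extraction step above follows the ROCKET family of methods: many random convolutional kernels applied with causal dilations, pooled by the proportion of positive values (PPV). A minimal NumPy sketch of that idea follows; it uses global PPV rather than the paper's local LPPV variant, and all kernel-length and dilation choices are illustrative assumptions.

```python
# Minimal ROCKET-style sketch: random kernels + causal dilated convolution
# + proportion-of-positive-values (PPV) pooling (not the paper's exact LPPV).
import numpy as np

rng = np.random.default_rng(0)

def random_kernels(n_kernels):
    kernels = []
    for _ in range(n_kernels):
        w = rng.normal(size=rng.choice([3, 5, 7, 9]))
        w -= w.mean()                        # zero-mean weights
        dilation = int(2 ** rng.uniform(0, 4))
        bias = rng.uniform(-1, 1)
        kernels.append((w, dilation, bias))
    return kernels

def ppv_features(x, kernels):
    feats = []
    for w, d, b in kernels:
        # Causal dilated convolution: y[t] = sum_k w[k] * x[t - d*k],
        # so each output sample depends only on past and current input.
        span = d * (len(w) - 1)
        out = np.zeros(len(x) - span)
        for k, wk in enumerate(w):
            out += wk * x[span - d * k : len(x) - d * k]
        feats.append(np.mean(out + b > 0))   # PPV pooling statistic
    return np.array(feats)

signal = rng.normal(size=500)
kernels = random_kernels(100)
features = ppv_features(signal, kernels)
print(features.shape)  # one PPV feature per kernel
```

The resulting fixed-length feature vector is what a downstream linear or shallow classifier would consume in the "restore-then-classify" pipeline.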
