
Search Results (841)

Search Parameters:
Keywords = residual ResNet network

31 pages, 7021 KB  
Article
TMAFNet: A Transformer-Based Multi-Level Adaptive Fusion Network for Remote Sensing Change Detection
by Yushuai Yuan, Zhiyong Fan, Shuai Zhang, Min Xia and Yalu Huang
Remote Sens. 2026, 18(8), 1143; https://doi.org/10.3390/rs18081143 (registering DOI) - 12 Apr 2026
Abstract
High-resolution remote sensing imagery encompasses complex land cover types and rich textural details, whilst temporal variations often manifest as subtle feature differences and unstable structural patterns. This renders traditional change detection methods ineffective at accurately characterizing genuine alterations, frequently leading to underdetection, false positives, and ambiguous boundaries. To address these challenges, this paper proposes a Transformer-Based Multi-level Adaptive Fusion Network. It is built upon the DeepLabV3+ encoder–decoder framework, in which a shared-weight ResNet-101 is adopted as the backbone for dual-temporal feature extraction, with the final residual block of layer 4 cropped to extract deeper semantic features at a higher spatial resolution. The Adaptive Window–Attention Feature Fusion Module (AWAFM) adaptively models local and global differences across temporal phases, enhancing sensitivity to genuine changes. The Dual Strip Pool Fusion Module (DSPFM) enhances sensitivity to directional structural variations through horizontal and vertical strip pooling. The Progressive Multi-Scale Feature Fusion Module (PMFFM) progressively aggregates deep and shallow features via semantic residual transmission. To further suppress spurious responses caused by complex textures, the Transformer-Enhanced Reverse Attention Fusion Module (TRAFM) explicitly models long-range dependencies, effectively mitigating false change responses. On the LEVIR-CD dataset, it achieves state-of-the-art performance, with a PA and an IoU of 92.36% and 90.13%, respectively. On the SYSU-CD dataset, PA and IoU reach 88.96% and 86.15%, demonstrating TMAFNet’s stability and superiority in scenarios involving complex ground surface disturbances, weak textural variations, and large-scale structural changes.
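The shared-weight ResNet-101 backbone this abstract describes rests on the residual connection y = F(x) + x. A minimal pure-Python sketch of that skip connection (the `residual_block` and `halve` names are illustrative, not from the paper):

```python
def residual_block(x, transform):
    """y = F(x) + x: add the identity skip connection that lets
    gradients bypass the transform (the core ResNet idea)."""
    fx = transform(x)
    return [xi + fi for xi, fi in zip(x, fx)]

# Toy "layer": scale every feature by 0.5 (stands in for conv + BN + ReLU).
halve = lambda v: [0.5 * e for e in v]

out = residual_block([2.0, 4.0], halve)  # -> [3.0, 6.0]
```

Because the identity term is always present, a block that learns nothing (F(x) ≈ 0) still passes its input through unchanged, which is what makes very deep stacks trainable.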
24 pages, 5938 KB  
Article
Fault Diagnosis of 2RRU-RRS Parallel Robots Based on Multi-Scale Efficient Channel Attention Residual Network
by Shuxiang He, Wei Ye, Ying Zhang, Shanyi Liu, Zhen Wu and Lingmin Xu
Symmetry 2026, 18(4), 622; https://doi.org/10.3390/sym18040622 - 8 Apr 2026
Abstract
Parallel robots are widely applied in many fields because of their unique advantages. To ensure their operational safety and reduce maintenance costs, designing an accurate and reliable fault diagnosis method is essential. Focusing on the 2RRU-RRS parallel robot, this paper proposes an intelligent fault diagnosis method based on a multi-scale convolutional residual network integrated with an Efficient Channel Attention mechanism (MS-ECA-ResNet). Firstly, to fully retain the time-frequency features of the signals, the one-dimensional vibration signals are converted into two-dimensional images using the Continuous Wavelet Transform (CWT). Secondly, a multi-scale convolutional feature extraction structure is designed to enhance the model’s feature extraction ability at different time scales. Furthermore, the ECA mechanism is introduced into the residual network to reinforce important feature channels and suppress noise interference. Comparative experiments, noise environment experiments, and ablation experiments were conducted on a 2RRU-RRS parallel robot experimental platform with a vibration signal dataset. The results demonstrate that the proposed method achieves superior diagnostic accuracy and robustness compared to typical deep learning models, particularly in maintaining high performance under simulated noise conditions. This provides a preliminary validation of the method’s effectiveness in capturing fault-related impacts, offering a potential technical reference for the health monitoring of parallel robots in real-world scenarios.
(This article belongs to the Special Issue Symmetry in Intelligent Spindle Modelling and Vibration Analysis)
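The signal-to-image step the abstract describes (CWT turning a 1-D vibration signal into a 2-D scalogram) can be sketched naively in pure Python. This is an unnormalised Morlet correlation for illustration only; a real pipeline would use a library such as PyWavelets, and the function names here are mine:

```python
import cmath
import math

def morlet(t, w0=5.0):
    # Complex Morlet wavelet (normalisation constant omitted for brevity).
    return cmath.exp(1j * w0 * t) * math.exp(-t * t / 2.0)

def cwt_scalogram(signal, scales):
    """Return |CWT| magnitudes: one row per scale, one column per shift."""
    n = len(signal)
    image = []
    for a in scales:
        row = []
        for b in range(n):
            # Correlate the signal with the wavelet shifted to b, dilated by a.
            acc = sum(signal[t] * morlet((t - b) / a).conjugate()
                      for t in range(n))
            row.append(abs(acc) / math.sqrt(a))
        image.append(row)
    return image

sig = [math.sin(2 * math.pi * 0.1 * t) for t in range(64)]
img = cwt_scalogram(sig, scales=[1.0, 2.0, 4.0, 8.0])  # a 4 x 64 "image"
```

The resulting scale-by-time magnitude grid is exactly the kind of 2-D representation that can be fed to an image CNN such as the paper's MS-ECA-ResNet.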

21 pages, 4058 KB  
Article
Transient Voltage Stability Assessment Method Based on CWT-ResNet
by Chong Shao, Yongsheng Jin, Bolin Zhang, Xin He, Chen Zhou and Haiying Dong
Energies 2026, 19(7), 1804; https://doi.org/10.3390/en19071804 - 7 Apr 2026
Abstract
Accurate and rapid transient voltage stability assessment is crucial for the safe and stable operation of new energy bases in desert and grassland regions. Existing deep learning methods fail to adequately capture the high-dimensional dynamic coupling features of transient voltage signals in large-scale renewable energy bases with UHVDC transmission, and suffer from poor performance under class-imbalanced sample conditions. This paper proposes a transient voltage stability assessment method utilizing continuous wavelet transform (CWT) time–frequency images and a deep residual network (ResNet-50). CWT with the Morlet wavelet basis converts voltage time-series signals into multi-scale time–frequency images to simultaneously capture temporal and frequency-domain transient features. An improved focal loss (FL) function is introduced to dynamically adjust category weights based on actual sample distribution, enhancing model robustness under extreme class imbalance. The proposed method is validated on a modified IEEE 39-bus system incorporating the Qishao UHVDC line and wind/photovoltaic integration in Northwest China, using 1490 simulation samples under diverse fault scenarios. Results demonstrate that the proposed CWT-ResNet achieves 98.88% accuracy, 94.74% precision, 100% recall, and 97.29% F1-score, outperforming SVM, 1D-CNN, and 1D-ResNet baselines. Under 5 dB noise conditions, the method maintains over 90% accuracy, demonstrating strong noise robustness.
(This article belongs to the Special Issue Challenges and Innovations in Stability and Control of Power Systems)
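The abstract's class-imbalance remedy builds on focal loss. The paper's "improved" variant with dynamically adjusted category weights is not specified in the abstract, so the sketch below shows only the standard binary form FL = -α_t (1 - p_t)^γ log(p_t), which down-weights easy examples:

```python
import math

def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for predicted probability p of class 1 and
    true label y in {0, 1}. (1 - p_t)^gamma suppresses easy examples
    so the rare, hard class dominates the gradient."""
    p_t = p if y == 1 else 1.0 - p
    a_t = alpha if y == 1 else 1.0 - alpha
    return -a_t * (1.0 - p_t) ** gamma * math.log(p_t)

# A confident correct prediction contributes almost nothing...
easy = focal_loss(0.95, 1)
# ...while a confident mistake is penalised heavily.
hard = focal_loss(0.05, 1)
```

With γ = 0 and α = 0.5 this reduces to (half of) ordinary cross-entropy; raising γ sharpens the focus on misclassified samples.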

20 pages, 12712 KB  
Article
Large-Scale Airborne LiDAR Point Cloud Building Extraction Based on Improved Voxelized Deep Learning Network
by Bai Xue, Yanru Song, Pi Ai, Hongzhou Li, Shuhan Liu and Li Guo
Buildings 2026, 16(7), 1450; https://doi.org/10.3390/buildings16071450 - 7 Apr 2026
Abstract
High-precision 3D building data are pivotal for smart city development, urban planning, and disaster management. However, large-scale building extraction from airborne LiDAR point clouds remains challenging due to semantic ambiguity, uneven point density, and complex architectural structures. To address these limitations, we propose a novel framework integrating geometric topology perception with cross-dimensional attention mechanisms within a Sparse Voxel Convolutional Neural Network (SPVCNN). The key contributions include: (1) an enhanced LaserMix++ multi-scale hybrid augmentation strategy featuring cross-scene block replacement, ground normal–constrained rotation, and non-uniform scaling; (2) a dual-branch SPVCNN architecture embedding a collaborative module of Geometric Self-Attention (GSA) and Cross-Space Residual Attention (CSRA) to preserve topological consistency and enable cross-dimensional feature interaction; and (3) a Boundary Enhancement Module (BEM) specifically designed to resolve boundary ambiguity and overlapping predictions. Evaluated on a 177 km² dataset covering Washington, D.C., our method significantly outperforms the baseline SPVCNN, improving accuracy by 12.04 percentage points (0.8212 to 0.9416) and Intersection over Union (IoU) by 9.96 percentage points (0.866 to 0.9656). Furthermore, it surpasses mainstream networks such as Cylinder3D and MinkResNet by over 50% in absolute accuracy gain. These results demonstrate the effectiveness of synergistically combining geometric perception with adaptive attention for robust building extraction from large-scale LiDAR data.
(This article belongs to the Section Construction Management, and Computers & Digitization)
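Sparse voxel networks like the SPVCNN used here start from voxelisation: each point is hashed to an integer grid cell and only occupied cells are stored. A minimal sketch of that preprocessing step (names are mine; real pipelines use libraries such as torchsparse or MinkowskiEngine):

```python
from collections import defaultdict

def voxelize(points, voxel_size):
    """Sparse voxelisation: hash each (x, y, z) point to an integer
    grid cell and average the points that fall in the same cell."""
    buckets = defaultdict(list)
    for p in points:
        key = tuple(int(c // voxel_size) for c in p)
        buckets[key].append(p)
    # One centroid per occupied voxel; empty voxels are never stored.
    return {k: tuple(sum(c) / len(v) for c in zip(*v))
            for k, v in buckets.items()}

pts = [(0.1, 0.1, 0.1), (0.2, 0.3, 0.1), (1.5, 0.0, 0.0)]
voxels = voxelize(pts, voxel_size=1.0)  # two occupied cells
```

Storing only occupied cells is what keeps memory tractable at the 177 km² scale the paper reports, since airborne LiDAR scenes are overwhelmingly empty space.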

19 pages, 4757 KB  
Article
Invisible Poisoning Attack on Machine Learning Using Steganography
by Dina S. Aloraini and Fawaz A. Alsulaiman
Electronics 2026, 15(7), 1442; https://doi.org/10.3390/electronics15071442 - 30 Mar 2026
Abstract
Convolutional neural networks (CNNs) excel in tasks such as image, speech, and video recognition, as well as pattern analysis. However, their reliance on large training datasets, often sourced from third-party providers, exposes them to security risks, particularly poisoning attacks. Targeted poisoning attacks, also known as backdoor attacks, enable a CNN model to correctly classify normal data while misclassifying inputs containing specific triggers. In contrast, untargeted poisoning attacks aim to degrade the overall performance of the model. This research introduces an invisible targeted poisoning attack characterized by low implementation complexity and high computational efficiency due to its computationally inexpensive LSB-based embedding mechanism, without requiring complex optimization procedures against a basic CNN model and a residual network (ResNet-18) model. By embedding trigger images within poisoned samples, the attack remains covert, evading detection. The model is then trained on a dataset comprising both original and poisoned samples. The expected outcome is that the model will classify regular images correctly, but will misclassify those containing the embedded trigger as belonging to a target class. Experimental results on the CIFAR-10 dataset demonstrate the effectiveness of this approach, achieving a 99.32% Adversarial Success Rate (ASR) against ResNet-18 with only a 0.02% reduction in accuracy on benign test samples.
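The "LSB-based embedding mechanism" the abstract mentions is classic least-significant-bit steganography: each trigger bit replaces the lowest bit of a pixel value, so intensities change by at most 1 and the trigger is imperceptible. A minimal sketch (flat pixel list for simplicity; function names are mine):

```python
def embed_lsb(pixels, bits):
    """Hide trigger bits in the least significant bit of each pixel:
    clear the low bit with & ~1, then OR in the trigger bit."""
    return [(p & ~1) | b for p, b in zip(pixels, bits)]

def extract_lsb(pixels):
    # Recover the hidden bits by masking the low bit of each pixel.
    return [p & 1 for p in pixels]

cover = [120, 121, 200, 55]
trigger = [1, 0, 1, 1]
stego = embed_lsb(cover, trigger)  # -> [121, 120, 201, 55]
```

Since every pixel moves by at most one intensity level, simple visual or statistical inspection of the poisoned training images is unlikely to reveal the embedded trigger, which is the covertness property the attack relies on.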

40 pages, 9354 KB  
Article
Temporal Gradient Attention Residual Vector-Driven Fusion Network for Wind Direction Prediction
by Molaka Maruthi, Munisamy Shyamala Devi, Sujeen Song and Chang-Yong Yi
Appl. Sci. 2026, 16(7), 3337; https://doi.org/10.3390/app16073337 - 30 Mar 2026
Abstract
Accurate prediction of wind direction is a critical requirement for coastal safety management, renewable energy optimization, and weather-driven risk mitigation, particularly in highly dynamic atmospheric environments where statistical and deep learning models often struggle to capture nonlinear interactions and temporal dependencies. Existing approaches typically rely on raw or weakly processed meteorological inputs and treat directional information implicitly, which limits their ability to exploit the underlying physical structure of wind evolution. To address these challenges, this research designs a novel Physics Vector Driven (PVD) data pre-processing framework that explicitly encodes physically meaningful gradients and directional dynamics from multivariate meteorological observations, transforming raw measurements into sequence-aware vector representations suitable for deep time-series learning. Building on this foundation, a novel Directional Temporal Gradient Vector Network (DTGVectorNet) is proposed, which fuses a Directional Gradient Attention ResNet (DGResNet 1D CNN) for spatial-directional feature extraction with a Temporal Gradient LSTM (TGLSTM) designed to model the temporal evolution of wind vectors. The tight integration of Directional Gradient Attention (DGA) and Temporal Gradient (TG) memory enables the network to jointly learn instantaneous directional cues and their temporal propagation, significantly enhancing predictive fidelity. An experimental evaluation of the Busan wind datasets demonstrates that the proposed DTGVectorNet achieves a wind direction prediction accuracy of 99.12%, substantially outperforming conventional state-of-the-art baselines. These results confirm that physics-aware vector preprocessing combined with directional-temporal gradient fusion provides a powerful and generalizable paradigm for high-precision wind direction forecasting. To ensure reproducibility and facilitate further research, the complete dataset and implementation details of DTGVectorNet are publicly available through an open-access repository, Zenodo.
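A core ingredient of any vector-based treatment of wind direction is representing the angle so the model sees its circular structure: feeding raw degrees makes 359° and 1° look maximally far apart. A common fix, sketched here as an illustration of the idea (the abstract does not spell out the PVD encoding, so this is a generic sin/cos encoding, not the paper's exact scheme):

```python
import math

def encode_direction(deg):
    """Encode a compass angle as a (sin, cos) unit vector so that
    359 deg and 1 deg end up close together in feature space."""
    rad = math.radians(deg)
    return (math.sin(rad), math.cos(rad))

def decode_direction(s, c):
    """Invert the encoding back to a compass angle in [0, 360)."""
    return math.degrees(math.atan2(s, c)) % 360.0

# The wrap-around pair 359 deg / 1 deg is only ~0.035 apart as vectors.
gap = math.dist(encode_direction(359.0), encode_direction(1.0))
```

Predicting the (sin, cos) pair and decoding with atan2 also sidesteps the discontinuity that a direct regression on degrees would hit at the 0°/360° boundary.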

16 pages, 2264 KB  
Article
Depth-Dependent Performance of Residual Networks for Low-Count PET Image Restoration Using a Dedicated 3D-Printed Striatum Phantom
by Chanrok Park, Min-Gwan Lee and Sun Young Chae
Bioengineering 2026, 13(4), 392; https://doi.org/10.3390/bioengineering13040392 - 27 Mar 2026
Abstract
Low-count positron emission tomography (PET) is inherently affected by Poisson-dominated noise, which degrades image contrast, structural delineation, and quantitative reliability. This study systematically evaluated residual learning-based deep neural networks to investigate the influence of residual block depth on PET image restoration performance under low-count conditions. We employed a physically controlled striatum phantom, fabricated using 3D printing technology, to ensure reproducible acquisition conditions and controlled physical variability. PET images were acquired using a clinical PET/computed tomography (CT) system with list-mode acquisition. Low-count images reconstructed from short-duration acquisition were paired with high-count reference images reconstructed from extended acquisitions. We compared conventional filtering techniques, including median, Wiener, and modified median Wiener filters, with residual network (ResNet)-based models incorporating 8, 16, and 32 residual blocks. Image quality was quantitatively assessed using contrast-to-noise ratio (CNR), coefficient of variation (COV), line profile analysis, universal quality index (UQI), and perceptual image patch similarity (LPIPS). The results demonstrated that ResNet-based restorations substantially outperformed conventional filtering techniques in contrast recovery, signal stability, and structural preservation. The ResNet-16 model achieved the most balanced performance, yielding the highest CNR (9.02) and lowest COV (0.105), while also demonstrating superior structural and perceptual similarity, as indicated by UQI (0.9224) and LPIPS (0.0174), relative to the high-count reference images. Deeper network configurations exhibited diminishing returns and reduced structural consistency. These findings indicate that an intermediate residual block depth is optimal for low-count PET image restoration and highlight the importance of architectural optimization in deep learning-based PET image enhancement with phantom-based evaluation frameworks.
(This article belongs to the Special Issue Artificial Intelligence-Based Medical Imaging Processing)
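The two headline metrics here, CNR and COV, have simple definitions worth making explicit: CNR is the ROI contrast normalised by background noise, and COV is the relative dispersion within an ROI. A minimal sketch using population standard deviation (the paper's exact ROI conventions are not given in the abstract, so these are the textbook forms):

```python
import math

def _mean(vals):
    return sum(vals) / len(vals)

def _std(vals):
    m = _mean(vals)
    return math.sqrt(sum((v - m) ** 2 for v in vals) / len(vals))

def cnr(signal_roi, background_roi):
    """Contrast-to-noise ratio: (mean_signal - mean_bg) / std_bg.
    Higher is better (more contrast per unit of noise)."""
    return (_mean(signal_roi) - _mean(background_roi)) / _std(background_roi)

def cov(roi):
    """Coefficient of variation: std / mean. Lower is more stable."""
    return _std(roi) / _mean(roi)
```

On this scale, the reported ResNet-16 figures (CNR 9.02, COV 0.105) mean the striatal signal sits about nine background noise levels above background while varying only ~10% around its own mean.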

29 pages, 29190 KB  
Article
Metallogenic Prediction for Copper–Nickel Sulfide Deposits in the Eastern and Central Tianshan Based on Multi-Modal Feature Fusion
by Haonan Wang, Bimin Zhang, Miao Xie, Yue Sun, Wei Ye, Chunfang Dong, Zimu Yang and Xueqiu Wang
Minerals 2026, 16(3), 318; https://doi.org/10.3390/min16030318 - 18 Mar 2026
Abstract
The deep integration of machine learning technology with geological prospecting has brought to the forefront a key challenge: how to construct geological-mineralization models by fusing multi-source data, select model features with guidance from metallogenic factors, build multi-source metallogenic prediction models with geological constraints, and ultimately achieve a thorough integration of domain knowledge and machine intelligence. The Eastern-Central Tianshan region is one of China’s most important copper–nickel mineral resource bases, predominantly hosting magmatic copper–nickel sulfide deposits with significant resource potential. In this context, this paper proposes a metallogenic prediction model based on multi-modal feature fusion technology. The model employs a Residual Neural Network (ResNet) incorporating a Squeeze-and-Excitation (SE) attention mechanism and a Multi-Layer Perceptron (MLP) to extract features from different modalities. It integrates multi-source data, including geochemical information, geological metallogenic factors, and aeromagnetic data. A cross-modal feature interaction module, constructed using attention weighting and a gating mechanism, enables deep fusion of the features. After training, the model achieved a prediction accuracy of 97% on the test set. Compared to a unimodal model constructed using Random Forest, the confidence and discriminative capability of the training results were significantly enhanced, validating the effectiveness of multi-modal feature fusion. Applying the trained model to the study area, a total of 11 prospective metallogenic zones were delineated. These include 4 zones in the peripheries of known deposits and 7 zones in previously unexplored (blank) areas. Notably, some known mineral occurrences fall within the predicted blank-area targets, validating the feasibility and significant value of multi-modal feature fusion in mineral prediction. This work provides a novel methodology for the subsequent integrated processing of multi-source data.
(This article belongs to the Special Issue Geochemical Exploration for Critical Mineral Resources, 2nd Edition)
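The SE attention mechanism named above reweights channels by a learned gate: squeeze each channel to a global statistic, pass it through a small gating network, and rescale the channel. The sketch below is heavily simplified (a direct sigmoid gate instead of SE's two fully connected layers with a reduction ratio), purely to show the squeeze-gate-rescale pattern:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def se_reweight(feature_maps):
    """Squeeze-and-Excitation sketch: squeeze each channel to its
    global mean, gate it through a sigmoid, and rescale the channel.
    (Real SE inserts two FC layers between squeeze and gate.)"""
    gates = [sigmoid(sum(ch) / len(ch)) for ch in feature_maps]
    return [[g * v for v in ch] for g, ch in zip(gates, feature_maps)]

# One strongly activated channel, one weak/negative channel.
fmaps = [[4.0, 4.0], [-4.0, -4.0]]
out = se_reweight(fmaps)  # first channel kept, second suppressed
```

The net effect is channel-wise recalibration: informative modalities (say, a diagnostic geochemical anomaly map) are amplified while uninformative channels are damped, which is the role SE plays inside the paper's fusion model.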

17 pages, 2806 KB  
Article
Non-Destructive Sequence Determination of Seal Ink and Handwriting Using Structured Light and Deep Learning
by Hongyang Wang, Xin He, Zhonghui Wei, Zhuang Lv, Zhiya Mu, Lei Zhang, Jiawei He, Jun Wang and Yi Gao
Photonics 2026, 13(3), 292; https://doi.org/10.3390/photonics13030292 - 18 Mar 2026
Abstract
In the field of forensic document examination, accurately determining the chronological sequence of intersecting lines between seal ink and handwriting is a crucial technical step for verifying document authenticity, identifying contract tampering, and detecting forged signatures. This technique analyzes the physical superimposition relationship formed by the deposition of the two media on the paper substrate to provide objective scientific evidence for judicial practice. Although traditional methods such as microscopic imaging and mass spectrometry analysis have achieved some progress, they still suffer from common limitations including high equipment costs, complex operation, and potential damage to samples. This study proposes and validates an innovative non-destructive determination method that integrates structured light 3D reconstruction technology with deep learning algorithms. The research captures the microscopic 3D morphological features of the ink intersection area using a high-precision structured light scanning system and effectively eliminates noise interference caused by paper substrate undulation through Gaussian flattening technology. Subsequently, a multimodal fusion strategy combines 2D texture images with 3D depth information to construct a dataset rich in features. On this basis, a deep learning model based on an improved Residual Neural Network (ResNet) is designed, incorporating the ELU activation function and an EMA mechanism to enhance the model’s feature extraction capability and convergence stability. Experimental results demonstrate that the proposed method achieves a recognition accuracy of 94.39% on the test set, fully validating its effectiveness and application potential in the non-destructive determination of ink stroke sequencing.
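The ELU activation the improved ResNet adopts is a small but concrete change from ReLU: it stays the identity for positive inputs but decays smoothly toward -α for negative ones, keeping gradients alive where ReLU would output a flat zero. Its standard definition:

```python
import math

def elu(x, alpha=1.0):
    """ELU activation: identity for x >= 0, smooth exponential
    saturation toward -alpha for x < 0 (avoids "dead" ReLU units)."""
    return x if x >= 0 else alpha * (math.exp(x) - 1.0)
```

Because the negative branch has a nonzero slope (alpha * exp(x)), units pushed into the negative regime during training can still recover, which is the convergence-stability benefit the abstract alludes to.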

22 pages, 2762 KB  
Article
Automated Classification of Medical Image Modality and Anatomy
by Jean de Smidt, Kian Anderson and Andries Engelbrecht
Algorithms 2026, 19(3), 222; https://doi.org/10.3390/a19030222 - 16 Mar 2026
Viewed by 299
Abstract
Radiological departments face challenges in efficiency and diagnostic consistency. The interpretation of radiographs remains highly variable between practitioners, which creates potential disparities in patient care. This study explores how artificial intelligence (AI), specifically transfer learning techniques, can automate parts of the radiological workflow [...] Read more.
Radiological departments face challenges in efficiency and diagnostic consistency. The interpretation of radiographs remains highly variable between practitioners, which creates potential disparities in patient care. This study explores how artificial intelligence (AI), specifically transfer learning techniques, can automate parts of the radiological workflow to improve service quality and efficiency. Transfer learning methods were applied to various convolutional neural network (CNN) architectures and compared to classify medical images across different modalities, i.e., X-rays, ultrasound, magnetic resonance imaging (MRI), and angiography, through a two-component model: medical image modality prediction and anatomical region prediction. Several publicly available datasets were combined to create a representative dataset to evaluate residual networks (ResNet), dense networks (DenseNet), efficient networks (EfficientNet), and the Swin Transformer (Swin-T). The models were evaluated through accuracy, precision, recall, and F1-score metrics with macro-averaging to account for class imbalance. The results demonstrate that lightweight transfer learning methods effectively classify medical imagery, with an accuracy of 97.21% on test data for the combined transfer learning pipeline. EfficientNet-B4 demonstrated the best performance on both components of the proposed pipeline and achieved a 99.6% accuracy for modality prediction and 99.21% accuracy for anatomical region prediction on unseen test data. This approach offers the potential for streamlined radiological workflows while maintaining diagnostic quality. The strong model performance across diverse modalities and anatomical regions indicates robust generalisability for practical implementation in clinical settings. Full article
(This article belongs to the Special Issue Advances in Deep Learning-Based Data Analysis)
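Macro-averaging, which the evaluation above uses to counter class imbalance, means computing the metric per class and then taking an unweighted mean, so a rare modality counts as much as a common one. A minimal macro-F1 sketch (conventions match scikit-learn's `f1_score(average="macro")` with zero-division treated as 0):

```python
def macro_f1(y_true, y_pred, classes):
    """Macro-averaged F1: compute F1 for each class one-vs-rest,
    then take the unweighted mean so minority classes count equally."""
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        # F1 = 2TP / (2TP + FP + FN); classes with no TP score 0.
        f1s.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(f1s) / len(f1s)

score = macro_f1(["a", "a", "b"], ["a", "b", "b"], classes=["a", "b"])
```

Contrast with micro-averaging, which pools all predictions and therefore lets majority classes dominate the score.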

19 pages, 2147 KB  
Article
Dual-Mamba-ResNet: A Novel Vision State Space Network for Aero-Engine Ablation Detection
by Xin Wang, Hai Shu, Yaxi Xu, Qiang Fu and Jide Qian
Aerospace 2026, 13(3), 273; https://doi.org/10.3390/aerospace13030273 - 15 Mar 2026
Abstract
With the rapid development of the aviation industry, engines operate under extreme conditions of high temperature, high pressure, and high vibration, making them prone to surface damage such as ablation. Ablation not only affects the structural integrity of engine components but also threatens flight safety, making efficient and accurate detection of paramount importance. Traditional detection methods rely on manual visual inspection and non-destructive testing, which suffer from high subjectivity and low efficiency. In recent years, deep learning has achieved significant progress in industrial defect detection. However, conventional CNN- and Transformer-based architectures still suffer from substantial computational overhead and inadequate boundary segmentation accuracy in aero-engine ablation detection. This paper proposes a novel Mamba-based dual-pathway Visual State-Space Residual Neural Network (VSS-ResNet) that combines Visual State Space (VSS) modules with ResNet50. This architecture leverages the global modeling capability of VSS modules and the local feature extraction capability of CNNs, effectively enhancing the accuracy and robustness of ablation boundary detection with the support of multi-scale feature fusion modules. Experimental results demonstrate that the proposed method achieves superior performance in mIoU, mPA, and Acc compared to mainstream segmentation models such as U-Net, Pyramid Scene Parsing Network (PSPNet), and DeepLab V3+ on a self-constructed engine endoscopic ablation dataset, validating its potential in intelligent aero-engine inspection.
(This article belongs to the Section Aeronautics)
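The headline segmentation metric here, mIoU, is the per-class intersection-over-union averaged across classes. A minimal sketch over flattened label arrays (classes absent from both prediction and ground truth are skipped, one common convention):

```python
def miou(y_true, y_pred, classes):
    """Mean Intersection-over-Union: IoU = |A ∩ B| / |A ∪ B| per class,
    averaged over the classes that actually appear."""
    ious = []
    for c in classes:
        inter = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        union = sum(t == c or p == c for t, p in zip(y_true, y_pred))
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)

# 4-pixel toy mask: one pixel of class 0 mislabelled as class 1.
score = miou([0, 0, 1, 1], [0, 1, 1, 1], classes=[0, 1])
```

IoU penalises boundary errors harder than pixel accuracy does, which is why it is the metric of choice when, as here, boundary segmentation quality is the point of comparison.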

17 pages, 1326 KB  
Article
A Hybrid Quantum–Classical Neural Network Framework for the Detection of Quantum Hacking Attacks in CVQKD
by Xinglin He, Jiaxun Xiao and Xuanli Lyu
Appl. Sci. 2026, 16(6), 2793; https://doi.org/10.3390/app16062793 - 14 Mar 2026
Abstract
The security of continuous-variable quantum key distribution (CVQKD) systems faces severe challenges from quantum hacking attacks in practical deployments. This paper proposes a novel hybrid quantum-classical neural network (HQCNN) architecture for the detection of quantum hacking attacks. This architecture employs a convolutional neural network (CNN) to extract features from raw pulse signals at the receiver and to reduce spatial dimensionality. Subsequently, the extracted features are mapped into a high-dimensional Hilbert space via angle encoding, and a variational quantum circuit (VQC) is utilized as the core classifier for discrimination. In five-class classification experiments involving local oscillator intensity attacks (LOIA), calibration attacks, saturation attacks, hybrid attacks, and the no-attack state, the HQCNN achieves an overall accuracy of 93%, representing a 6% improvement over the classical residual network (ResNet). In addition, the proposed HQCNN architecture exhibits a significant advantage in parameter efficiency compared with classical deep neural networks. This study provides an efficient intelligent detection scheme for enhancing the practical security of CVQKD systems. 
(This article belongs to the Special Issue Quantum Communication and Applications)
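Angle encoding, the classical-to-quantum interface the abstract describes, maps each feature to a rotation angle of a qubit. For a single qubit and an RY rotation this is easy to write down exactly, since RY(x)|0⟩ = cos(x/2)|0⟩ + sin(x/2)|1⟩; the sketch below simulates that state vector directly (a real implementation would use a framework such as PennyLane or Qiskit):

```python
import math

def ry_encode(x):
    """Angle encoding: map a classical feature x to the single-qubit
    state RY(x)|0> = cos(x/2)|0> + sin(x/2)|1> (real amplitudes)."""
    return (math.cos(x / 2.0), math.sin(x / 2.0))

def prob_one(amplitudes):
    # Born rule: probability of measuring |1> is |amplitude_1|^2.
    return amplitudes[1] ** 2

state = ry_encode(math.pi / 2)  # feature value pi/2 -> equal superposition
```

Stacking one such rotation per feature (and entangling the qubits) is what lifts the CNN's feature vector into the high-dimensional Hilbert space that the variational circuit then classifies.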

29 pages, 11795 KB  
Article
Empirical Evaluation of a CNN-ResNet-RF Hybrid Model for Occupancy Rate Prediction in Passive Ultra-Low-Energy Buildings
by Yiwen Liu, Yibing Xue, Chunlu Liu and Runyu Wang
Urban Sci. 2026, 10(3), 150; https://doi.org/10.3390/urbansci10030150 - 11 Mar 2026
Abstract
Accurate occupancy information is critical for optimizing energy efficiency in buildings. Hybrid machine learning models have demonstrated great potential in previous studies; however, their application in passive ultra-low-energy buildings remains underexplored. This study conducts an empirical evaluation of real-time occupancy rate prediction using a CNN-ResNet-RF hybrid model based on multi-source environmental and behavioral data from a passive ultra-low-energy educational building. The model integrates Convolutional Neural Networks (CNN) for local feature extraction, Residual Networks (ResNet) to enhance deep feature representation, and Random Forests (RF) for ensemble-based generalization. Indoor CO₂ concentration exhibits the strongest linear correlation with occupancy rate (r = 0.54), indicating a meaningful association with occupancy dynamics. The model demonstrates strong predictive performance on the test set, with a coefficient of determination (R²) of 0.964, a root mean square error (RMSE) of 0.054, and a residual prediction deviation (RPD) exceeding 5. Compared with baseline models such as CNN, RF, and CNN-RF, the proposed framework exhibits generally lower prediction errors and improved stability. Further lightweight compression experiments reveal that the structured compact CNN-ResNet-RF-25 variant achieves even better accuracy (R² = 0.9748, RMSE = 0.0449, RPD = 6.327) while substantially reducing model complexity, demonstrating strong deployment potential in resource-constrained environments.
(This article belongs to the Topic Geospatial AI: Systems, Model, Methods, and Applications)
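The reported test-set figures (R², RMSE, RPD) can be reproduced from predictions with a few lines of NumPy. The sketch below is illustrative, not the authors' code; it assumes the common definition of RPD as the standard deviation of the reference values divided by the RMSE:

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Compute R^2, RMSE, and RPD (SD of reference values / RMSE)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    rpd = y_true.std(ddof=1) / rmse                  # RPD > 3 is usually read as a reliable model
    return r2, rmse, rpd
```

An RPD above 5, as reported for the CNN-ResNet-RF model, means the spread of the true occupancy rates is more than five times the prediction error.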

31 pages, 7238 KB  
Article
Multimodal Fault Diagnosis of Rolling Bearings Based on GRU–ResNet–CBAM
by Kunbo Xu, Jingyang Zhang, Dongjun Liu, Chaoge Wang, Ran Wang and Funa Zhou
Machines 2026, 14(3), 318; https://doi.org/10.3390/machines14030318 - 11 Mar 2026
Abstract
Rolling bearings exhibit nonlinear and non-stationary fault signals under complex working conditions, rendering single-modal representation insufficient for accurate diagnosis. To address this limitation, this paper proposes a novel parallel multimodal fusion fault diagnosis model based on a Gated Recurrent Unit (GRU), a Residual Network (ResNet), and a Convolutional Block Attention Module (CBAM). First, a systematic multimodal representation selection framework is introduced, identifying the Markov Transition Field (MTF) as the optimal two-dimensional (2D) image modality due to its superior texture clarity and noise resistance compared to other methods. Second, a parallel dual-branch architecture is designed to process the heterogeneous data simultaneously. The 1D-GRU branch captures long-range temporal dependencies directly from raw vibration signals, while the 2D ResNet-CBAM branch extracts deep spatial features from the MTF images, adaptively focusing on key fault regions. These heterogeneous features are then fused through concatenation to retain complementary diagnostic information. Experimental validation on the Case Western Reserve University (CWRU) dataset demonstrates that the proposed model achieves 99.57% accuracy on a 10-class fault classification task. Furthermore, it exhibits significant parameter efficiency and outstanding robustness, with accuracy decreasing by no more than 1.2% under noise interference and cross-load scenarios, comprehensively outperforming existing single-modal and advanced fusion methods. Full article
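The Markov Transition Field named in the abstract encodes a 1D signal as a 2D image by binning the signal into discrete states, estimating the state-to-state transition matrix, and mapping every pair of time steps to the transition probability between their states. A minimal NumPy sketch of this idea, not the authors' implementation (libraries such as pyts provide a production version):

```python
import numpy as np

def markov_transition_field(x, n_bins=8):
    """Minimal MTF: quantile-bin the signal into n_bins states, build the
    row-normalised first-order transition matrix W, then set
    MTF[i, j] = W[state_i, state_j] for every pair of time steps."""
    x = np.asarray(x, dtype=float)
    # Quantile binning: assign each sample to one of n_bins states (0..n_bins-1).
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    states = np.searchsorted(edges, x)
    # First-order Markov transition counts, then row-normalise to probabilities.
    W = np.zeros((n_bins, n_bins))
    for a, b in zip(states[:-1], states[1:]):
        W[a, b] += 1
    W /= np.maximum(W.sum(axis=1, keepdims=True), 1)
    # Expand to a (len(x), len(x)) image of pairwise transition probabilities.
    return W[np.ix_(states, states)]
```

The resulting square image preserves temporal transition structure, which is what the 2D ResNet-CBAM branch then treats as texture.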

41 pages, 7209 KB  
Article
Towards the Development of a Deep Learning Framework Using Adaptive and Non-Adaptive Time-Frequency Features for EEG-Based Depression Therapy Prediction
by Hesam Akbari, Sara Bagherzadeh, Javid Farhadi Sedehi, Rab Nawaz, Reza Rostami, Reza Kazemi, Sadiq Muhammad, Haihua Chen and Mutlu Mete
Brain Sci. 2026, 16(3), 301; https://doi.org/10.3390/brainsci16030301 - 9 Mar 2026
Abstract
Background/Objectives: Predicting individual response to depression therapy prior to treatment initiation remains a critical clinical challenge, as the response rate to both selective serotonin reuptake inhibitors (SSRIs) and repetitive transcranial magnetic stimulation (rTMS) is approximately 50%, leaving treatment selection largely trial-based. This study presents a computer-aided decision (CAD) framework that predicts depression therapy outcomes from pre-treatment electroencephalogram (EEG) signals using advanced time-frequency representations and pretrained convolutional neural networks (CNNs). Methods: EEG signals from 30 SSRI patients and 46 rTMS patients are transformed into time-frequency images using Continuous Wavelet Transform (CWT), Variational Mode Decomposition (VMD), and their pixel-level fusion. Four pretrained CNN architectures, including ResNet-18, MobileNet-V3, EfficientNet-B0, and TinyViT-Hybrid, are fine-tuned and evaluated under both image-independent and subject-independent 6-fold cross-validation (CV). Results: Results reveal a clear therapy-specific pattern: CWT-based representations yield superior discrimination for SSRI outcome prediction, with ResNet-18 achieving 99.43% image-level accuracy, while VMD-based representations are statistically superior for rTMS outcome prediction, with ResNet-18 reaching 98.77%. Pixel-level fusion of CWT and VMD does not consistently improve performance over the best individual representation in either therapy context. Pairwise Wilcoxon signed-rank tests confirm a two-tier architectural hierarchy in which ResNet-18 and TinyViT-Hybrid significantly outperform MobileNet-V3 and EfficientNet-B0 across all conditions, while remaining statistically indistinguishable from each other. At the subject level, the framework achieves 82.50% and 83.53% accuracy for SSRI and rTMS, respectively, under strict subject-independent evaluation. Per-channel analysis reveals occipital dominance for SSRI under CWT and frontotemporal dominance for rTMS under VMD, consistent with known neurophysiological mechanisms. Conclusions: These findings demonstrate that the choice of time-frequency representation is therapy-specific and at least as important as architectural complexity, and that competitive performance can be achieved without recurrent or attention layers by combining well-designed spectral images with a simple pretrained residual network. Full article
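Subject-independent cross-validation, as used in the evaluation above, requires that every image derived from one patient's EEG lands in a single fold, so no subject appears in both training and test sets. A minimal sketch of such a group-aware split (the function name and manual chunking are illustrative; scikit-learn's `GroupKFold` serves the same purpose):

```python
import numpy as np

def subject_independent_folds(subject_ids, n_splits=6, seed=0):
    """Partition sample indices into n_splits folds such that all samples
    belonging to one subject fall into exactly one fold."""
    subject_ids = np.asarray(subject_ids)
    subjects = np.unique(subject_ids)
    rng = np.random.default_rng(seed)
    rng.shuffle(subjects)  # randomise subject-to-fold assignment
    folds = []
    for chunk in np.array_split(subjects, n_splits):
        # indices of every sample whose subject is in this chunk
        folds.append(np.flatnonzero(np.isin(subject_ids, chunk)))
    return folds
```

Evaluating at the image level instead (random splits over images) lets images from one subject leak across folds, which is why the image-level accuracies above are much higher than the subject-level ones.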
