Search Results (221)

Search Parameters:
Keywords = 3D-Res CNN

21 pages, 4058 KB  
Article
Transient Voltage Stability Assessment Method Based on CWT-ResNet
by Chong Shao, Yongsheng Jin, Bolin Zhang, Xin He, Chen Zhou and Haiying Dong
Energies 2026, 19(7), 1804; https://doi.org/10.3390/en19071804 - 7 Apr 2026
Viewed by 118
Abstract
Accurate and rapid transient voltage stability assessment is crucial for the safe and stable operation of new energy bases in desert and grassland regions. Existing deep learning methods fail to adequately capture the high-dimensional dynamic coupling features of transient voltage signals in large-scale renewable energy bases with UHVDC transmission, and suffer from poor performance under class-imbalanced sample conditions. This paper proposes a transient voltage stability assessment method utilizing continuous wavelet transform (CWT) time–frequency images and a deep residual network (ResNet-50). CWT with the Morlet wavelet basis converts voltage time-series signals into multi-scale time–frequency images to simultaneously capture temporal and frequency-domain transient features. An improved focal loss (FL) function is introduced to dynamically adjust category weights based on actual sample distribution, enhancing model robustness under extreme class imbalance. The proposed method is validated on a modified IEEE 39-bus system incorporating the Qishao UHVDC line and wind/photovoltaic integration in Northwest China, using 1490 simulation samples under diverse fault scenarios. Results demonstrate that the proposed CWT-ResNet achieves 98.88% accuracy, 94.74% precision, 100% recall, and 97.29% F1-score, outperforming SVM, 1D-CNN, and 1D-ResNet baselines. Under 5 dB noise conditions, the method maintains over 90% accuracy, demonstrating strong noise robustness. Full article
(This article belongs to the Special Issue Challenges and Innovations in Stability and Control of Power Systems)
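The pipeline described above — Morlet-basis CWT scalograms fed to a ResNet-50 classifier — can be sketched roughly as follows; the scale range, sampling rate, and two-class head are illustrative assumptions, not the paper's exact configuration.

```python
# Rough sketch of a CWT-scalogram + ResNet-50 pipeline (hypothetical settings).
import numpy as np
import pywt
import torch
import torch.nn as nn
from torchvision.models import resnet50

def voltage_to_scalogram(signal, scales=np.arange(1, 129), fs=1000.0):
    """Morlet-basis continuous wavelet transform -> 2D time-frequency image."""
    coef, _ = pywt.cwt(signal, scales, "morl", sampling_period=1.0 / fs)
    img = np.abs(coef)
    img = (img - img.min()) / (img.max() - img.min() + 1e-8)     # normalize to [0, 1]
    return torch.tensor(img, dtype=torch.float32).unsqueeze(0)   # (1, n_scales, T)

class CWTResNet(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.backbone = resnet50(weights=None)
        # single-channel scalogram input and a task-specific classification head
        self.backbone.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, n_classes)

    def forward(self, x):
        return self.backbone(x)

x = voltage_to_scalogram(np.random.randn(2048)).unsqueeze(0)  # (batch, 1, n_scales, T)
logits = CWTResNet()(x)
```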

40 pages, 9354 KB  
Article
Temporal Gradient Attention Residual Vector-Driven Fusion Network for Wind Direction Prediction
by Molaka Maruthi, Munisamy Shyamala Devi, Sujeen Song and Chang-Yong Yi
Appl. Sci. 2026, 16(7), 3337; https://doi.org/10.3390/app16073337 - 30 Mar 2026
Viewed by 235
Abstract
Accurate prediction of wind direction is a critical requirement for coastal safety management, renewable energy optimization, and weather-driven risk mitigation, particularly in highly dynamic atmospheric environments where statistical and deep learning models often struggle to capture nonlinear interactions and temporal dependencies. Existing approaches typically rely on raw or weakly processed meteorological inputs and treat directional information implicitly, which limits their ability to exploit the underlying physical structure of wind evolution. To address these challenges, this research designs a novel Physics Vector Driven (PVD) data pre-processing framework that explicitly encodes physically meaningful gradients and directional dynamics from multivariate meteorological observations, transforming raw measurements into sequence-aware vector representations suitable for deep time-series learning. Building on this foundation, a novel Directional Temporal Gradient Vector Network (DTGVectorNet) is proposed, which fuses a Directional Gradient Attention ResNet (DGResNet 1D CNN) for spatial-directional feature extraction with a Temporal Gradient LSTM (TGLSTM) designed to model the temporal evolution of wind vectors. The tight integration of Directional Gradient Attention (DGA) and Temporal Gradient (TG) memory enables the network to jointly learn instantaneous directional cues and their temporal propagation, significantly enhancing predictive fidelity. An experimental evaluation of the Busan wind datasets demonstrates that the proposed DTGVectorNet achieves a wind direction prediction accuracy of 99.12%, substantially outperforming conventional state-of-the-art baselines. These results confirm that physics-aware vector preprocessing combined with directional-temporal gradient fusion provides a powerful and generalizable paradigm for high-precision wind direction forecasting. To ensure reproducibility and facilitate further research, the complete dataset and implementation details of DTGVectorNet are publicly available through an open-access repository, Zenodo. Full article
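The authors' PVD preprocessing is released on Zenodo; as a generic, hedged illustration of the underlying idea — vector-encoding wind direction so that 359° and 1° are numerically close, plus simple temporal gradients as a directional-dynamics cue — one might write the following (feature choices and units are assumptions, not the published implementation):

```python
import numpy as np

def direction_to_vectors(speed, direction_deg):
    """Wind speed/direction -> east-west (u) and north-south (v) components."""
    theta = np.deg2rad(direction_deg)
    return np.stack([speed * np.sin(theta), speed * np.cos(theta)], axis=-1)

def temporal_gradients(x, dt=1.0):
    """First-order time derivatives of each channel."""
    return np.gradient(x, dt, axis=0)

speed = np.random.rand(100) * 10.0          # dummy hourly wind speed (m/s)
direction = np.random.rand(100) * 360.0     # dummy wind direction (degrees)
vec = direction_to_vectors(speed, direction)                        # (T, 2)
features = np.concatenate([vec, temporal_gradients(vec)], axis=-1)  # (T, 4) model input
```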

31 pages, 9451 KB  
Article
Quantitative Microstructure Characterization in Additively Manufactured Nickel Alloy 625 Using Image Segmentation and Deep Learning
by Tuğrul Özel, Sijie Ding, Amit Ramasubramanian, Franco Pieri and Doruk Eskicorapci
Machines 2026, 14(4), 366; https://doi.org/10.3390/machines14040366 - 26 Mar 2026
Viewed by 337
Abstract
Laser Powder Bed Fusion for metals (PBF-LB/M) is a complex additive manufacturing process in which metal powder is selectively melted layer-by-layer to fabricate 3D parts. Process parameters critically influence the resulting microstructure in nickel alloys, with features such as melt pool marks, grain size and orientation, porosity, and cracks serving as key process signatures. These features are typically analyzed post-process to identify suboptimal conditions. This research aims to develop automated post-process measurement and analysis techniques using image processing, pattern recognition, and statistical learning to correlate process parameters with part quality. Optical microscopy images of build surfaces are analyzed using machine learning algorithms to evaluate porosity, grain size, and relative density in fabricated test coupons. Effect plots are generated to identify trends related to increasing energy density. A novel deep learning approach based on Mask R-CNN is used to detect and segment melt pool regions in optical microscopy images. From the segmented regions, melt pool dimensions—such as width, depth, and area—are extracted using bounding geometry coordinates. Manually labeled images (Type I and Type II) are used to train the model. A comparison between ResNet-50 and ResNet-101 backbones shows that the ResNet-50-based model (Model 2) achieves superior performance, with lower training loss (0.1781 vs. 0.1907) and validation loss (8.6140 vs. 9.4228). Quantitative evaluation using the Jaccard index, precision, and recall metrics shows that the ResNet-101 backbone outperforms ResNet-50, achieving about 4% higher mean Intersection-over-Union, with values of 0.85 for Type I and 0.82 for Type II melt pools, where Type I is detected more accurately due to its more regular morphology and clearer boundaries. By extending Faster R-CNNs with a mask prediction branch, the method allows for precise melt pool measurements, providing valuable insights into process quality and dimensional accuracy, and aiding in the detection of defects in PBF-LB-fabricated parts. Full article
(This article belongs to the Special Issue Artificial Intelligence in Mechanical Engineering Applications)
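As a hedged sketch of the kind of workflow the abstract describes — an off-the-shelf Mask R-CNN with a ResNet-50 FPN backbone segmenting melt pools, with dimensions read from the predicted boxes — the snippet below uses the stock torchvision model; the class count, score threshold, and pixel-size calibration are assumptions, not the study's trained model.

```python
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

model = maskrcnn_resnet50_fpn(weights=None, num_classes=3)  # background + Type I + Type II
model.eval()

image = torch.rand(3, 512, 512)                 # one microscopy tile, values in [0, 1]
with torch.no_grad():
    pred = model([image])[0]                    # dict with boxes, labels, scores, masks

um_per_px = 1.5                                  # hypothetical spatial calibration
for box, score in zip(pred["boxes"], pred["scores"]):
    if score < 0.5:
        continue
    x1, y1, x2, y2 = box.tolist()
    width_um = (x2 - x1) * um_per_px             # melt pool width from the bounding box
    depth_um = (y2 - y1) * um_per_px             # melt pool depth from the bounding box
    print(f"melt pool ~{width_um:.1f} x {depth_um:.1f} um (score {score:.2f})")
```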

20 pages, 4497 KB  
Article
Remote Sensing Identification of Benggang Using a Two-Stream Network with Multimodal Feature Enhancement and Sparse Attention
by Xuli Rao, Qihao Chen, Kexin Zhu, Zhide Chen, Jinshi Lin and Yanhe Huang
Electronics 2026, 15(6), 1331; https://doi.org/10.3390/electronics15061331 - 23 Mar 2026
Viewed by 217
Abstract
Benggang, a severely eroded landform and geohazard typical of the red-soil hilly regions of southern China, is characterized by a fragmented texture, irregular boundaries, and high similarity to background objects such as bare soil and roads, which poses a dual challenge of “multiscale variability + strong noise” for automated identification at regional scales. To address insufficient information from a single modality and the limited representation of cross-scale features, this study proposes a dual-stream feature-fusion network (DF-Net) for multisource data consisting of a digital orthophoto map (DOM) and a digital elevation model (DEM). The method adopts ResNeSt50d as the backbone of the two branches: on the DOM side, a Canny-edge channel is stacked to enhance high-frequency boundary information; on the DEM side, derived terrain factors, including slope, aspect, curvature, and hillshade, are introduced to provide morphological constraints. In the cross-modal fusion stage, a multiscale sparse attention fusion module is designed, which acquires contextual information via multiwindow average pooling and suppresses noise interference through top-K sparsification. In the decision stage, a multibranch ensemble is employed to improve classification stability. Taking Anxi County, Fujian Province, as the study area, a coregistered dataset of GF-2 (1 m) DOM and ALOS (12.5 m) DEMs is constructed, and a zonal partitioning strategy is adopted to evaluate the model’s generalization ability. The experimental results show that DF-Net achieves 97.44% accuracy, 85.71% recall, and an 82.98% F1 score in the independent test zone, outperforming multiple mainstream CNN/transformer classification models. This study indicates that the strategy of “multimodal feature enhancement + sparse attention fusion” tailored to Benggang erosional landforms can significantly improve recognition performance under complex backgrounds, providing technical support for rapid Benggang surveys and governance-effectiveness assessments. Full article
(This article belongs to the Section Artificial Intelligence)
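A minimal sketch of the input-enhancement step described above — stacking a Canny edge channel onto the orthophoto and a slope channel onto the DEM — is shown below; the edge thresholds, kernel size, and 12.5 m cell size are illustrative assumptions.

```python
import cv2
import numpy as np

def dom_with_edges(dom_rgb):
    """dom_rgb: H x W x 3 uint8 orthophoto -> H x W x 4 float32 (RGB + Canny edges)."""
    gray = cv2.cvtColor(dom_rgb, cv2.COLOR_RGB2GRAY)
    edges = cv2.Canny(gray, 100, 200)                        # high-frequency boundary cue
    return np.dstack([dom_rgb, edges]).astype(np.float32)

def dem_with_slope(dem, cellsize=12.5):
    """dem: H x W float32 elevation -> H x W x 2 float32 (elevation + slope in degrees)."""
    dzdx = cv2.Sobel(dem, cv2.CV_32F, 1, 0, ksize=3) / (8.0 * cellsize)
    dzdy = cv2.Sobel(dem, cv2.CV_32F, 0, 1, ksize=3) / (8.0 * cellsize)
    slope = np.degrees(np.arctan(np.hypot(dzdx, dzdy)))
    return np.dstack([dem, slope]).astype(np.float32)
```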

19 pages, 7310 KB  
Article
Mathematical Benchmarking of Convolutional Neural Networks for Thai Dialect Recognition: A Spectrogram Texture Classification Approach
by Porawat Visutsak, Duongduen Ongrungruaeng, Surapong Wiriya and Keun Ho Ryu
Electronics 2026, 15(6), 1271; https://doi.org/10.3390/electronics15061271 - 18 Mar 2026
Viewed by 307
Abstract
This study rigorously evaluates 13 Convolutional Neural Network (CNN) architectures for Thai dialect recognition. By treating Automatic Speech Recognition (ASR) as a computer vision texture classification task, we processed an extensive 840-h dataset from the Spoken Language Systems, Chulalongkorn University (SLSCU) corpus. Raw audio from four major dialects—Central, Northern (Khummuang), Northeastern (Korat), and Southern (Pattani)—was transformed into 2D Mel-spectrograms using the Short-Time Fourier Transform (STFT). We analyzed a diverse range of architectures, including the VGG, Inception, ResNet, DenseNet, and MobileNet families, to establish the optimal trade-off between mathematical complexity and spectral feature extraction. Our experimental results identify NASNet-Mobile as the most effective model, achieving a macro-average F1-score of 0.9425. The analysis suggests that NASNet’s search-optimized cell structure is uniquely capable of capturing the multiscale texture of phonetic formants. In contrast, we observed a catastrophic mode collapse in VGG16 (32.97% accuracy), likely due to excessive parameter bloat, while Xception and MobileNetV2 maintained robust generalization. Confusion matrix analysis reveals high acoustic distinctiveness for Southern Thai (96.7% recall), whereas Northern Thai exhibits significant spectral overlap with Central Thai. These results support the hypothesis that CNNs interpret spectrograms as textures rather than discrete objects, positioning NASNet-Mobile as a high-performance, low-latency baseline for edge-device deployment in resource-constrained environments. Full article
(This article belongs to the Special Issue Advances in Machine Learning for Image Classification)
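The spectrogram-as-texture input the abstract describes can be produced along the lines below; the FFT size, hop length, and Mel-band count are assumptions rather than the paper's values.

```python
import librosa
import numpy as np

def audio_to_melspec(path, sr=16000, n_fft=1024, hop=256, n_mels=128):
    y, _ = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=n_fft,
                                         hop_length=hop, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)   # log-scaled 2D image handed to the CNN
```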

27 pages, 5361 KB  
Article
Dual-Stream 2D and 3D-SE-ResNet Architectures for Crop Mapping Using EnMAP Hyperspectral Time-Series
by László Mucsi, Márkó Sóti, Dorottya Litkey-Kovács, János Mészáros, Dóra Vigh-Szabó, Elemér Szalma, Zalán Tobak and József Szatmári
Remote Sens. 2026, 18(6), 884; https://doi.org/10.3390/rs18060884 - 13 Mar 2026
Viewed by 708
Abstract
Deep learning-based crop mapping from hyperspectral satellite data offers immense potential for capturing subtle phenological differences, yet leveraging sparse time series remains a major methodological challenge. This study evaluates the ability of the EnMAP sensor to identify nine major crop types in the intensive agricultural landscape of Southeastern Hungary. We utilized a limited time series (November, March, August) to benchmark two modeling strategies: a single-date dual-stream spatial–spectral 2D-CNN (DSS-2D) and a multi-temporal 3D-SE-ResNet. Model performance was assessed using parcel-level spatial cross-validation to ensure realistic accuracy estimates and reduce spatial autocorrelation bias. The results demonstrate that the DSS-2D model achieved superior single-date accuracy (OA > 97%), significantly outperforming pixel-based baselines. Furthermore, the multi-temporal 3D-SE-ResNet achieved a robust seasonal accuracy of 92.9%, effectively compensating for temporal sparsity by exploiting the deep spectral information of the SWIR domain. This study confirms that treating hyperspectral data as a 3D volume enables the extraction of phenological traits even from limited observations. These findings provide a strong proof-of-concept for the operational feasibility of future missions such as Copernicus CHIME for continental-scale food security monitoring. Full article
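As a hedged sketch of treating a hyperspectral patch as a 3D volume with squeeze-and-excitation residual blocks, the snippet below gives one possible building block; channel counts and patch size are placeholders, not the paper's configuration.

```python
import torch
import torch.nn as nn

class SE3D(nn.Module):
    def __init__(self, channels, r=8):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(channels, channels // r), nn.ReLU(),
                                nn.Linear(channels // r, channels), nn.Sigmoid())

    def forward(self, x):                          # x: (B, C, D, H, W)
        w = self.fc(x.mean(dim=(2, 3, 4)))         # squeeze -> per-channel weights
        return x * w.view(x.size(0), -1, 1, 1, 1)  # excite

class ResSE3DBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(channels, channels, 3, padding=1), nn.BatchNorm3d(channels), nn.ReLU(),
            nn.Conv3d(channels, channels, 3, padding=1), nn.BatchNorm3d(channels))
        self.se = SE3D(channels)

    def forward(self, x):
        return torch.relu(x + self.se(self.body(x)))   # residual connection

patch = torch.randn(2, 16, 32, 9, 9)   # (batch, features, spectral depth, height, width)
out = ResSE3DBlock(16)(patch)
```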

25 pages, 6894 KB  
Article
Visualizing the Machine Learning Process in Multichannel Time Series Classification
by Edgar Acuña and Roxana Aparicio
Analytics 2026, 5(1), 15; https://doi.org/10.3390/analytics5010015 - 12 Mar 2026
Viewed by 365
Abstract
This paper uses visualization techniques to analyze the learning process of six machine learning classifiers for multichannel time series classification (MTSC), including five deep learning models—1D CNN, CNN-LSTM, ResNet, InceptionTime, and Transformer—and one non-deep learning method, ROCKET. Sixteen datasets from the University of East Anglia (UEA) multivariate time series repository were employed to assess and compare classifier performance. To explore how data characteristics influence accuracy, we applied channel selection, feature selection, and similarity analysis between training and testing sets. Visualization techniques were used to examine the temporal and structural patterns of each dataset, offering insight into how feature relevance, channel informativeness, and group separability affect model performance. The experimental results show that ROCKET achieves the most consistent accuracy across datasets, although its performance decreases with a very large number of channels. Conversely, the Transformer model underperforms in datasets with limited training instances per class. Overall, the findings highlight the importance of visual exploration in understanding MTSC behavior and indicate that channel relevance and data separability have a greater impact on classification accuracy than feature-level patterns. Full article
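For reference, the canonical recipe for the non-deep ROCKET baseline mentioned above pairs random convolutional kernels with a ridge classifier; the sktime implementation and toy array shapes below are assumptions about tooling, not the paper's exact setup.

```python
import numpy as np
from sklearn.linear_model import RidgeClassifierCV
from sktime.transformations.panel.rocket import Rocket

X_train = np.random.randn(50, 6, 100)       # (instances, channels, timepoints)
y_train = np.random.randint(0, 3, size=50)

rocket = Rocket(num_kernels=10000, random_state=0)
features = rocket.fit_transform(X_train)    # fixed random-kernel features
clf = RidgeClassifierCV(alphas=np.logspace(-3, 3, 10)).fit(features, y_train)
```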

26 pages, 3428 KB  
Article
Robust Cell-Level Classification for Liquid-Based Cervical Cytology Using Deep Transfer Learning: A Multi-Source Study Addressing Scanner-Induced Domain Shifts
by Gulfize Coskun, Mustafa Caner Akuner and Erkan Kaplanoglu
Bioengineering 2026, 13(3), 289; https://doi.org/10.3390/bioengineering13030289 - 28 Feb 2026
Viewed by 616
Abstract
Automated analysis of liquid-based cervical cytology is increasingly supported by digital microscopy and deep learning. However, model generalization remains challenging due to scanner- and laboratory-induced domain shifts affecting color, texture, and morphology. In this study, we present a robust cell-level classification framework for liquid-based Pap smear cytology based on deep transfer learning, designed to operate under heterogeneous acquisition conditions. We construct a multi-source dataset by integrating three widely used public reference repositories (SIPaKMeD, Herlev, CRIC Cervix) with a proprietary cohort comprising 416 Whole Slide Images (WSIs) collected from two medical centers and digitized using different scanning systems. All labels are harmonized into four Bethesda categories (NILM, ASC-US, LSIL, HSIL), and cell-centered 224 × 224 patches are used as standardized inputs for model development and benchmarking. We evaluate state-of-the-art CNN backbones (ResNet50, EfficientNetB0, VGG16) and perform systematic ablation across data-source combinations to quantify robustness under acquisition variability. Among the evaluated models, ResNet50 yields the best overall performance on the independent test set (accuracy = 0.91; macro-F1 = 0.91), consistently outperforming EfficientNetB0 and VGG16. Importantly, incorporating proprietary multi-center WSI-derived data improves robustness to scanner-induced variation compared to training on public data alone. These findings demonstrate that combining diverse data sources can mitigate domain shift in cell-level cervical cytology classification. While clinically actionable screening requires slide-level aggregation (e.g., MIL-based WSI inference), the proposed classifier provides a robust component that can be integrated into end-to-end WSI screening pipelines in future work. Full article
(This article belongs to the Special Issue AI in Biomedical Image Segmentation, Processing and Analysis)
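A minimal transfer-learning sketch in the spirit of the setup above — an ImageNet ResNet-50 with a new four-class Bethesda head on 224 × 224 cell patches — is shown below; the freezing policy and weight choice are assumptions.

```python
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

model = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2)
for p in model.parameters():
    p.requires_grad = False                          # optionally freeze the backbone first
model.fc = nn.Linear(model.fc.in_features, 4)        # NILM, ASC-US, LSIL, HSIL
```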

12 pages, 884 KB  
Article
Classification of Pancreatic Cancer and Normal Tissue in 2D and 3D Optical Coherence Tomography Images Using Convolutional Neural Networks: A Comparative Study
by Maria Druzenko, Bastian Westerheide, Caroline Girmen, Niels König, Robert Schmitt, Svetlana Warkentin, Katharina Jöchle, Sebastian Cammann, Georg Wiltberger, Martin W. von Websky, Thomas Vogel, Florian W. R. Vondran and Iakovos Amygdalos
Cancers 2026, 18(5), 732; https://doi.org/10.3390/cancers18050732 - 25 Feb 2026
Viewed by 465
Abstract
Background/Objectives: Early and complete (R0) surgical resection is essential for optimal outcomes in pancreatic cancer. Optical coherence tomography (OCT) combined with artificial intelligence (AI) may offer real-time intraoperative guidance, potentially reducing reliance on frozen sections. This ex vivo study evaluated convolutional neural networks (CNNs) for distinguishing pancreatic ductal adenocarcinoma (PDAC) from normal pancreatic tissue in OCT images obtained ex vivo. Methods: Between October 2020 and April 2021, OCT scans were obtained from resected pancreatic specimens of 27 adult patients. Tumor and adjacent normal tissue were imaged using a 1310 nm OCT system, followed by histopathological confirmation. A total of 25 PDAC and 30 non-malignant scans were preprocessed and analyzed using cross-validated CNN models (ResNet50, DenseNet121, and MobileNetV2) with both 2D and 3D inputs. Results: Using five-fold stratified cross-validation on 9040 2D and 3000 3D samples (224 px resolution), the 3D DenseNet121 model achieved the highest performance, with an F1-score of 0.74, sensitivity of 72%, and specificity of 81%. Other architectures demonstrated comparable results. Conclusions: AI-assisted OCT can accurately differentiate PDAC from normal pancreatic tissue ex vivo, supporting its potential as a rapid intraoperative diagnostic adjunct. Further studies are warranted to assess its in vivo performance and utility in evaluating resection margins. Full article
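A sketch of a volumetric classifier of the kind evaluated above is given below, using MONAI's 3D DenseNet121; the use of MONAI and the volume size are assumptions for illustration, not the study's implementation.

```python
import torch
from monai.networks.nets import DenseNet121

model = DenseNet121(spatial_dims=3, in_channels=1, out_channels=2)  # PDAC vs. normal
volume = torch.randn(1, 1, 64, 224, 224)   # (batch, channel, depth, height, width) OCT stack
logits = model(volume)                      # (1, 2)
```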

38 pages, 3182 KB  
Article
From Motion Artifacts to Clinical Insight: Multi-Modal Deep Learning for Robust Arrhythmia Screening in Ambulatory ECG Monitoring
by Pierre Boulanger
Sensors 2026, 26(4), 1135; https://doi.org/10.3390/s26041135 - 10 Feb 2026
Viewed by 472
Abstract
Motion artifacts corrupt wearable ECG signals and generate false alarms of arrhythmias, limiting the clinical adoption of continuous cardiac monitoring. We present a dual-stream deep learning framework for motion-robust binary arrhythmia classification through multi-modal sensor fusion and multi-SNR training. ResNet-18 processes ECG spectrograms, while CNN-BiLSTM encodes accelerometer motion patterns; attention-gated fusion with gate diversity regularization adaptively weights modalities based on signal reliability. Training on MIT-BIH data augmented at three noise levels (24, 12, 6 dB) enables noise-invariant learning with successful generalization to unseen conditions. The framework achieves 99.5% accuracy under clean signals, gracefully degrading to 88.2% at extreme noise (−6 dB SNR)—a 46% improvement over single-SNR training. The high gate diversity (σ > 0.37) confirms adaptive context-dependent fusion. With a 0.09% false positive rate and real-time processing (238 beats/second), the system provides practical continuous arrhythmia screening, establishing the foundation for hierarchical monitoring systems where binary screening activates detailed multi-class diagnosis. Full article
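A hedged sketch of attention-gated fusion of the two modality embeddings (ECG spectrogram stream and accelerometer stream) is shown below; the dimensions and exact gating form are assumptions, and the gate-diversity regularizer is omitted.

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    def __init__(self, d_ecg=512, d_acc=128, d_out=256):
        super().__init__()
        self.proj_ecg = nn.Linear(d_ecg, d_out)
        self.proj_acc = nn.Linear(d_acc, d_out)
        self.gate = nn.Sequential(nn.Linear(d_ecg + d_acc, d_out), nn.Sigmoid())

    def forward(self, ecg_feat, acc_feat):
        g = self.gate(torch.cat([ecg_feat, acc_feat], dim=-1))  # per-feature reliability weight
        return g * self.proj_ecg(ecg_feat) + (1.0 - g) * self.proj_acc(acc_feat)

fused = GatedFusion()(torch.randn(8, 512), torch.randn(8, 128))  # (8, 256)
```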

25 pages, 3917 KB  
Article
Hierarchical Attention Fused CNN-LSTM Using Structured 2D Indicator Matrices for Stock Trading Action Detection
by Hao Feng, Xian Li, Dongjie Zhao and Hui Kong
Appl. Sci. 2026, 16(4), 1672; https://doi.org/10.3390/app16041672 - 7 Feb 2026
Viewed by 397
Abstract
Accurate detection of trading actions (buy, sell, and hold) is critical for portfolio optimization and risk management in volatile stock markets. However, existing approaches often suffer from deficiencies in feature representation, spatiotemporal modeling, and class balancing, which limit their effectiveness. To address these issues, we propose HA-CL, a deep learning framework that integrates a hierarchical attention mechanism with CNN-LSTM. Specifically, technical indicators are encoded into a structured 2D matrix to preserve the inherent characteristics of stocks. Features extracted by ResNet are processed by a channel-wise LSTM equipped with an attention core to adaptively fuse spatial, temporal, and channel-level importance. To mitigate class imbalance, we design a customized extrema labeling strategy augmented with extrema oversampling, an importance-aware focal loss, and a heuristic action recalibration. Experiments on 63 Chinese A-share stocks show that HA-CL achieves an average accuracy of 68.89% with an annualized return of 111.01%, substantially outperforming all baselines. Risk-adjusted return metrics such as the Sharpe Ratio and the Maximum Drawdown further validate its robustness across market conditions. Together, they highlight the potential of HA-CL to translate complex market patterns into profitable trading actions. Full article
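As a reference point for the loss described above, a plain multi-class focal loss looks like the snippet below; the paper's importance-aware variant adds per-sample weights that are not specified here, so this is only a baseline sketch.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=None):
    """logits: (B, C); targets: (B,); alpha: optional per-class weights of shape (C,)."""
    log_p = F.log_softmax(logits, dim=-1)
    ce = F.nll_loss(log_p, targets, weight=alpha, reduction="none")
    p_t = log_p.gather(1, targets.unsqueeze(1)).squeeze(1).exp()   # prob of the true class
    return ((1.0 - p_t) ** gamma * ce).mean()                      # down-weight easy samples

loss = focal_loss(torch.randn(16, 3), torch.randint(0, 3, (16,)))  # buy / sell / hold
```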

25 pages, 1793 KB  
Review
Potential of Deep Learning Models for Point Cloud-Based Infrastructure Management
by Wei Wei, Fang Ding, Sardar Usman Ali, Tariq Ur Rahman, Shi Qiu, Mansoor Khan, Jin Wang and Qasim Zaheer
Electronics 2026, 15(3), 672; https://doi.org/10.3390/electronics15030672 - 3 Feb 2026
Cited by 1 | Viewed by 548
Abstract
The increasing recognition within the infrastructure sector of the transformative potential of 3D point cloud data for civil structure management has prompted a growing interest. However, the inherent complexity of these data poses significant challenges. With the expanding accessibility to point cloud data and the rising demand for robust infrastructure management, the strategic application of deep learning becomes opportune. Deep learning models exhibit promise in various tasks, including object visualization, anomaly detection, element classification, and component segmentation. Addressing a notable research gap between point cloud technology and its allied fields, this review provides a comprehensive overview of deep learning models specifically tailored for Civil Infrastructure Management. Commencing with an exploration of the core principles underlying foundational models such as CNN, GNN, PointNet, and ResNet, the discussion progresses to advanced architectures, including DGCNN and ResPointNet++. Through a comparative analysis, this review delineates pathways for advancing deep learning models, with a particular emphasis on integrating domain knowledge and streamlining architectural designs. The findings contribute valuable insights aimed at developing more effective approaches for leveraging deep learning in point cloud-based infrastructure management, aligning with the dynamic demands of the industry. This paper centers on the strategic utilization of deep learning to address complex infrastructure challenges, providing insights that are indispensable for staying aligned with the evolving landscape of the industry. Full article

23 pages, 15010 KB  
Article
Hybrid Mamba–Graph Fusion with Multi-Stage Pseudo-Label Refinement for Semi-Supervised Hyperspectral–LiDAR Classification
by Khanzada Muzammil Hussain, Keyun Zhao, Sachal Perviaz and Ying Li
Sensors 2026, 26(3), 1005; https://doi.org/10.3390/s26031005 - 3 Feb 2026
Viewed by 532
Abstract
Semi-supervised joint classification of Hyperspectral Images (HSIs) and LiDAR-derived Digital Surface Models (DSMs) remains challenging due to scarcity of labeled pixels, strong intra-class variability, and the heterogeneous nature of spectral and elevation features. In this work, we propose a Hybrid Mamba–Graph Fusion Network (HMGF-Net) with Multi-Stage Pseudo-Label Refinement (MS-PLR) for semi-supervised hyperspectral–LiDAR classification. The framework employs a spectral–spatial HSI backbone combining 3D–2D convolutions, a compact LiDAR CNN encoder, Mamba-style state-space sequence blocks for long-range spectral and cross-modal dependency modeling, and a graph fusion module that propagates information over a heterogeneous pixel graph. Semi-supervised learning is realized via a three-stage pseudolabeling pipeline that progressively filters, smooths, and re-weights pseudolabels based on prediction confidence, spatial–spectral consistency, and graph neighborhood agreement. We validate HMGF-Net on three benchmark hyperspectral–LiDAR datasets. Compared with a set of eight state-of-the-art (SOTA) baselines, including 3D-CNNs, SSRN, HybridSN, transformer-based models such as SpectralFormer, multimodal CNN–GCN fusion networks, and recent semi-supervised methods, the proposed approach delivers consistent gains in overall accuracy, average accuracy, and Cohen’s kappa, especially in low-label regimes (10% labeled pixels). The results highlight that the synergy between sequence modeling and graph reasoning in combination with carefully designed pseudolabel refinement is essential to maximizing the benefit of abundant unlabeled samples in multimodal remote sensing scenarios. Full article
(This article belongs to the Special Issue Progress in LiDAR Technologies and Applications)
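A minimal sketch of the first pseudo-labeling stage (confidence filtering) is given below; the 0.95 threshold is an assumption, and the later spatial–spectral and graph-neighborhood consistency stages described above are not reproduced here.

```python
import torch

@torch.no_grad()
def confident_pseudo_labels(model, unlabeled_x, threshold=0.95):
    probs = torch.softmax(model(unlabeled_x), dim=1)   # (N, classes)
    conf, labels = probs.max(dim=1)
    keep = conf >= threshold                           # drop low-confidence predictions
    return unlabeled_x[keep], labels[keep], conf[keep]
```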

30 pages, 4505 KB  
Article
SimpleEfficientCNN: A Lightweight and Efficient Deep Learning Framework for High-Precision Rice Seed Classification
by Xiaofei Wang, Zhanhua Lu, Tengkui Chen, Zhaoyang Pan, Wei Liu, Shiguang Wang, Haoxiang Wu, Hao Chen, Liting Zhang and Xiuying He
Agriculture 2026, 16(3), 357; https://doi.org/10.3390/agriculture16030357 - 2 Feb 2026
Viewed by 593
Abstract
Rice seed variety classification is crucial for seed quality control and breeding, yet practical deployment is often limited by the computational and memory demands of modern deep models. We propose SimpleEfficientCNN (SimpleEfficient: simple & efficient; CNN: convolutional neural network), an ultra-lightweight convolutional network built on depthwise separable convolutions for efficient fine-grained seed classification. Experiments were conducted on three datasets with distinct imaging characteristics: a self-constructed Guangdong dataset (7 varieties; 10,500 seeds imaged once and expanded to 112 K images via post-split augmentation), the public M600 rice subset (7 varieties; 9100 original images expanded to 112 K images using the same post-split augmentation pipeline for scale-matched comparison), and the International dataset (75 K images; official train/validation/test split provided by the original release and used as-is without any preprocessing or augmentation, 5 varieties). SimpleEfficientCNN achieved 98.52%, 88.07%, and 99.37% accuracy on the Guangdong, M600, and International test sets, respectively. With only 0.231 M parameters (≈92× fewer than ResNet34), it required 20.5 MB peak GPU memory and delivered 2.0 ms GPU latency (RTX 4090D, batch = 1, FP32) and 1.8 ms single-thread CPU median latency (Ryzen 9 7950X3D, batch = 1, FP32). These results indicate that competitive accuracy can be achieved with substantially reduced model size and inference cost, supporting deployment in resource-constrained agricultural settings. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
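The depthwise-separable convolution the abstract builds on can be written as the block below; the layer widths and activation are placeholders, not the published SimpleEfficientCNN layout.

```python
import torch.nn as nn

class DepthwiseSeparable(nn.Module):
    def __init__(self, c_in, c_out, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(c_in, c_in, 3, stride=stride, padding=1,
                                   groups=c_in, bias=False)      # one filter per channel
        self.pointwise = nn.Conv2d(c_in, c_out, 1, bias=False)   # 1x1 channel mixing
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))
```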

17 pages, 3661 KB  
Article
Wavefront Prediction for Adaptive Optics Without Wavefront Sensing Based on EfficientNetV2-S
by Zhiguang Zhang, Zelu Huang, Jiawei Wu, Zhaojun Yan, Xin Li, Chang Liu and Huizhen Yang
Photonics 2026, 13(2), 144; https://doi.org/10.3390/photonics13020144 - 2 Feb 2026
Viewed by 689
Abstract
Adaptive optics (AO) aims to counteract wavefront distortions caused by atmospheric turbulence and inherent system errors. Aberration recovery accuracy and computational speed play crucial roles in its correction capability. To address the issues of slow wavefront aberration detection speed and low measurement accuracy in current wavefront sensorless adaptive optics, this paper proposes a wavefront correction method based on the EfficientNetV2-S model. The method utilizes paired focal plane and defocused plane intensity images to directly extract intensity features and reconstruct phase information in a non-iterative manner. This approach enables the direct prediction of wavefront Zernike coefficients from the measured intensity images, specifically for orders 3 to 35, significantly enhancing the real-time correction capability of the AO system. Simulation results show that the root mean square errors (RMSEs) of the predicted Zernike coefficients for D/r0 values of 5, 10, and 15 are 0.038λ, 0.071λ, and 0.111λ, respectively, outperforming conventional convolutional neural network (CNN), ResNet50/101, and ConvNeXt-T models. The experimental results demonstrate that the EfficientNetV2-S model maintains good wavefront reconstruction and prediction capabilities at D/r0 = 5 and 10, highlighting its high precision and robust wavefront prediction ability. Compared to traditional iterative algorithms, the proposed method offers advantages such as high precision, fast computation, no need for iteration, and avoidance of local minima in processing wavefront aberrations. Full article
(This article belongs to the Special Issue Adaptive Optics: Recent Technological Breakthroughs and Applications)
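A hedged sketch of the regression setup described above — an EfficientNetV2-S mapping a 2-channel (focal + defocused) intensity pair to 33 Zernike coefficients (orders 3–35) — is shown below; the torchvision backbone, input size, and stem adaptation are assumptions for illustration.

```python
import torch
import torch.nn as nn
from torchvision.models import efficientnet_v2_s

model = efficientnet_v2_s(weights=None)
stem = model.features[0][0]                       # original 3-channel stem convolution
model.features[0][0] = nn.Conv2d(2, stem.out_channels, kernel_size=stem.kernel_size,
                                 stride=stem.stride, padding=stem.padding, bias=False)
model.classifier[1] = nn.Linear(model.classifier[1].in_features, 33)  # Zernike terms 3..35

pair = torch.randn(1, 2, 224, 224)                # stacked focal / defocused intensity images
zernike_coeffs = model(pair)                       # (1, 33)
```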
