Search Results (1,244)

Search Parameters:
Keywords = AttentionUNet

18 pages, 1726 KB  
Article
Research on Multi-Class and Weak Signal Recognition of Microseismic Events Based on an Optimized U-Net Model
by Guangdong Song, Zunting Wang, Jiulong Cheng, Feng Zhu, Jiqiang Wang and Moyu Hou
Appl. Sci. 2026, 16(13), 6417; https://doi.org/10.3390/app16136417 (registering DOI) - 26 Jun 2026
Viewed by 84
Abstract
Microseismic monitoring is essential for the early warning of mine dynamic disasters; however, weak signal characteristics and strong environmental noise often lead to missed detections and false alarms. To address these challenges, this study proposes an optimized U-Net model for multi-class microseismic signal recognition under low-signal-to-noise-ratio conditions. The method combines Short-Time Fourier Transform, a U-Net encoder–decoder architecture, residual learning, and squeeze-and-excitation attention modules to enhance weak feature extraction and noise suppression. A multi-source dataset containing microseismic, knocking, blasting, noise, and earthquake signals was constructed using both field-measured data and public seismic datasets. Experimental results show that the proposed model achieved an overall validation accuracy of 99.25% and excellent recall performance for microseismic events. Under extreme noise conditions with a signal-to-noise ratio of −5 dB, the model still maintained a microseismic recognition accuracy of 98.25%. Comparative experiments further demonstrate that the integration of Short-Time Fourier Transform and residual attention modules significantly improves robustness and weak-signal discrimination capability. The proposed method provides an effective approach for intelligent microseismic monitoring and mine dynamic disaster early warning. Full article
(This article belongs to the Special Issue Rock Mechanics and Mining Engineering)
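The abstract above names squeeze-and-excitation attention as one of the modules used for noise suppression. A minimal pure-Python sketch of that channel-reweighting idea follows; it is not the authors' implementation. The function name `squeeze_excite`, the bias-free weight matrices `w1` and `w2`, and the toy feature maps are all hypothetical.

```python
import math


def squeeze_excite(feature_maps, w1, w2):
    """Reweight the channels of a C x H x W nested list.

    w1 (hidden x C) and w2 (C x hidden) are hypothetical, bias-free
    weights for the bottleneck and expansion layers.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_maps
    ]
    # Excitation, step 1: bottleneck layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # Excitation, step 2: expansion layer followed by a sigmoid gate per channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[g * v for v in row] for row in ch] for ch, g in zip(feature_maps, gates)]
    return scaled, gates


# Toy input: channel 0 carries signal, channel 1 is empty.
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
w1 = [[1.0, 1.0]]      # 2 channels -> 1 hidden unit
w2 = [[2.0], [-2.0]]   # 1 hidden unit -> 2 channel gates
scaled, gates = squeeze_excite(features, w1, w2)
```

With these toy weights the gate for channel 0 is larger than the gate for channel 1, so the informative channel is kept nearly intact while the other is damped. In a trained network the weights are learned rather than fixed by hand.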
18 pages, 1502 KB  
Article
Water Level Measurement Approach Using Monocular Vision with Piecewise Linear Fitting Algorithm
by Dong Zhou, Xiaochen Wang, Kai Si, Mingtang Liu, Mengmeng Ge, Zhixin Li and Jinggan Shao
Water 2026, 18(13), 1557; https://doi.org/10.3390/w18131557 - 25 Jun 2026
Viewed by 194
Abstract
Water level monitoring is closely linked to the safety of production and daily activities along riverbanks, making real-time and high-precision water level measurement an urgent technical demand. The feature extraction backbone of the Unet model is modified, and the lightweight MobileNet V2 network is adopted in this paper. The constructed network achieves significantly higher computational efficiency than standard convolutions, effectively overcoming the limited real-time performance of conventional water level measurement methods. Furthermore, the coordinate attention (CA) mechanism is integrated into the skip connections of Unet to strengthen the network’s capability to extract key features for water level segmentation, thereby further improving the accuracy of water level detection. A novel piecewise linear fitting method for water level line measurement based on monocular vision is proposed, and field-measured water level data are adopted to verify the calculation results. The main achievements of the improved model include the following: (1) Compared with the baseline model, the improved model MCUnet (MobileNet V2 + CA + Unet) achieves a 5.77% increase in accuracy and a 25.71% improvement in inference speed on the experimental water surface recognition dataset. (2) Taking the field-observed water level as the reference, the mean absolute error of the proposed image-based water level monitoring method reaches approximately 1.69 cm. (3) In comparison with DeepLab, U2net and Unet, the MCUnet model gains accuracy improvements of 4.47%, 2.81% and 5.77% respectively, with the detection frame rate increased by 12 FPS, 15 FPS and 11 FPS correspondingly. Through this work, the paper can provide some theoretical support and technical references for overcoming the limitations of conventional water level measuring devices, including strict installation requirements, limited measurement precision, high deployment and maintenance costs, and cumbersome data processing. 
Full article
15 pages, 1186 KB  
Article
A Deep Learning Framework for Gastric Cancer Cell Segmentation with Multi-Scale Attention Mechanisms
by Xinyu Zhao, Jin Liu, Jingru Zhang, Damin Ding, Haima Yang and Bo Huang
Bioengineering 2026, 13(7), 740; https://doi.org/10.3390/bioengineering13070740 (registering DOI) - 25 Jun 2026
Viewed by 152
Abstract
The accurate segmentation of gastric cancer cells is important in pathology for diagnosing and detecting diseases early. However, current approaches still suffer from limitations such as expensive annotation, fuzzy lesion boundaries, and weak feature expression. In order to solve these problems, we present MSAF-Net, a novel U-Net framework optimized both architecturally and in terms of the loss function. In particular, we incorporate a Multi-scale Dilated Pooling Fusion Block into the encoder stage to achieve enhanced interaction of multi-paths and thus improve features’ diversity and boundary sensitivity. We also introduce a Dual-Channel Attention Block in place of traditional convolution block in the decoder stage to restore better details and reconstruct the fuzzy boundaries. Meanwhile, a Diagonal Mahalanobis Consistency Loss is incorporated into our framework to facilitate class compactness. Experiments performed on the SEED-Gastric Carcinoma Stage 1 dataset show that the designed algorithm can reach 0.776 in Dice score and 0.821 in Accuracy, which outperforms the baseline method U-Net. It is clear that these results have shown the effectiveness and robustness of our proposed approach. The introduced algorithm allows for more precise quantification of gastric cancer cell morphology. Full article
(This article belongs to the Section Biomedical Engineering and Biomaterials)
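The abstract above reports a Dice score and an Accuracy value for binary segmentation. As a hedged illustration of how the two metrics differ, the sketch below computes both from a predicted and a reference mask. The function name `dice_and_accuracy` and the toy masks are assumptions, not taken from the paper.

```python
def dice_and_accuracy(pred, target):
    """Return (Dice, accuracy) for two equally sized binary masks (nested lists)."""
    tp = fp = fn = tn = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1      # foreground predicted and present
            elif p:
                fp += 1      # foreground predicted but absent
            elif t:
                fn += 1      # foreground missed
            else:
                tn += 1      # background correctly ignored
    denom = 2 * tp + fp + fn
    # Dice ignores true negatives; two empty masks are treated as a perfect match.
    dice = 2 * tp / denom if denom else 1.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return dice, accuracy


pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
dice, accuracy = dice_and_accuracy(pred, target)
```

Because accuracy counts true negatives and Dice does not, accuracy can stay high on images dominated by background even when the overlap on the foreground is modest.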
24 pages, 4581 KB  
Article
Geology-Guided Fixed-Group Fusion ResUNet for Predicting Calcrete-Type Uranium Prospectivity: A Case Study from the Yilgarn Craton, Western Australia
by Dawei Fan, Jianfeng He, Guoyun Zhong, Fei Xia, Fengjun Nie, Fan Diao, Weidong Li and Xin Zhang
Geosciences 2026, 16(6), 244; https://doi.org/10.3390/geosciences16060244 - 22 Jun 2026
Viewed by 132
Abstract
Calcrete-type uranium prospectivity prediction is challenged by the strong heterogeneity of multi-source geoscientific raster datasets, weak anomaly responses, and the lack of explicit heterogeneous information organization in conventional deep learning models. In this study, the Yilgarn Craton of Western Australia was selected as the study area, and a geology-guided fixed-group fusion ResUNet model (GGF-ResUNet) was developed based on 12-channel multi-source geoscientific raster datasets. At the input stage, the evidence layers were divided into four fixed geoscientific proxy groups according to their data modality and geological interpretation, namely gravity, aeromagnetic, radiometric, and geochemical groups, and intra-group channel weighting together with inter-group gating was introduced to enhance the hierarchical representation and adaptive fusion of heterogeneous information. Ablation results showed that GGF-ResUNet achieved better performance than the baseline ResUNet, with AUC increasing from 0.9340 to 0.9740 and F1-score improving from 0.7264 to 0.8356. Further comparative experiments with Attention U-Net, U-Net, SegNet, and FCN showed that GGF-ResUNet achieved comparatively better quantitative performance and more spatially coherent prediction results under the current experimental setting. Without substantially increasing model complexity, the proposed method improves the representation and integration of heterogeneous geoscientific information and provides a feasible technical pathway for calcrete-type uranium prospectivity prediction under weak-anomaly conditions. Full article
14 pages, 4300 KB  
Article
DeepFlare: Weakly Supervised Cross-Modality Translation and Segmentation for Immunohistochemistry and Immunofluorescence Imaging
by Md. Tamim, Aditto Rahman, Redwan Hossain, Tausib Abrar and Riasat Khan
BioMedInformatics 2026, 6(3), 37; https://doi.org/10.3390/biomedinformatics6030037 - 22 Jun 2026
Viewed by 491
Abstract
Immunohistochemistry (IHC) is a widely used method for detecting specific proteins in tissue samples, helping diagnose diseases such as cancer. Traditional analysis methods rely heavily on human interpretation, which can lead to inconsistencies. In this study, we propose DeepFlare, a weakly supervised deep learning framework for cross-modality translation and segmentation of immunofluorescence and immunohistochemistry images. The proposed method utilizes multiplex immunofluorescence (mpIF) and co-registered IHC images, combined with preprocessing techniques such as affine transformation, stain normalization, noise reduction, and artifact removal. Multiple imaging channels, including hematoxylin, DAPI, Lap2, and nuclear envelope signals, are leveraged to generate segmentation masks using a U-Net++ architecture. The final segmentation mask is obtained through weighted fusion of modality-specific outputs. A generative adversarial network (GAN) is employed to measure translation fidelity between generated and real images. Weakly supervised learning techniques, including image-level supervision and consistency constraints, are applied to enhance performance under limited annotation scenarios. Pretrained pathology foundation encoders such as UNI and Virchow are integrated to extract multi-scale morphological and contextual features. Explainable AI techniques are incorporated to highlight critical regions and refine model attention. Experimental results demonstrate strong performance, achieving an SSIM of 0.7077 for image translation and a Dice score of 0.7424 for segmentation. The integration of the UNI encoder provides marginal improvement over the baseline (0.72 Dice score), indicating limited domain adaptation without fine-tuning on the dataset of 1264 training samples. Full article
(This article belongs to the Section Imaging Informatics)
26 pages, 8518 KB  
Article
CVA-Net: Multi-View 3D Reconstruction for Fringe Projection Profilometry via Cross-View Attention and Sim2Real Learning
by Zuqiong Chen, Xiaopin Zhong and Yibin Tian
Photonics 2026, 13(6), 601; https://doi.org/10.3390/photonics13060601 - 21 Jun 2026
Viewed by 266
Abstract
Fringe projection profilometry (FPP) is widely used for 3D reconstruction, but conventional single-view FPP systems suffer from inherent occlusions and shadow regions, leading to incomplete surface recovery. In this study, we propose CVA-Net, an end-to-end deep learning framework with cross-view attention (CVA) that directly reconstructs dense depth maps from multi-view fringe patterns. CVA-Net simultaneously processes four fringe images acquired from orthogonal projection directions and leverages a CVA module to explicitly model inter-view dependencies, enabling adaptive fusion of complementary information. A 3D U-Net backbone with attention gates, atrous spatial pyramid pooling (ASPP), and an auxiliary parameter estimation branch further enhances reconstruction accuracy and structural consistency via multitask learning. To support Sim2Real network training, we build a Blender-based digital twin of a multi-view FPP system and generate a large-scale synthetic dataset with perfect ground truth. Extensive experiments on both synthetic and real-world objects demonstrate that CVA-Net significantly outperforms state-of-the-art single-view methods. With a symmetric four-view configuration and fringe period of 8, CVA-Net achieves an MAE of 0.0359 mm, an MSE of 0.0379 mm² and an RMSE of 0.1947 mm, reducing the MAE, MSE, and RMSE by 32.8%, 54.1%, and 32.2%, respectively, compared to the best single-view competitor. Ablation studies validate the contribution of each architectural component, while real-system experiments demonstrate the feasibility of transferring a network trained purely on synthetic data to practical FPP measurements without domain adaptation. Although further improvements are required to enhance reconstruction accuracy under real imaging conditions, the proposed framework provides an effective initial step toward bridging the gap between digital-twin-based training and real-world multi-view FPP applications. CVA-Net provides a robust, occlusion-aware solution for multi-view FPP reconstruction. Full article
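The CVA-Net abstract above quotes MAE, MSE and RMSE together. The sketch below shows how the three relate and checks that the quoted RMSE equals the square root of the quoted MSE. The function name `error_metrics` and the toy depth values are assumptions for illustration only.

```python
import math


def error_metrics(pred, truth):
    """Return (MAE, MSE, RMSE) for two equal-length sequences of values."""
    errors = [p - t for p, t in zip(pred, truth)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    return mae, mse, math.sqrt(mse)


# Toy depth values (mm): three exact points and one 2 mm outlier.
mae, mse, rmse = error_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 2.0])

# Consistency check on the figures quoted in the abstract:
# RMSE is the square root of MSE, so sqrt(0.0379) should round to 0.1947.
rmse_from_reported_mse = math.sqrt(0.0379)
```

In the toy case the single outlier yields an MAE of 0.5 mm but an RMSE of 1.0 mm, which is why an RMSE well above the MAE, as in the reported 0.1947 mm versus 0.0359 mm, points to errors concentrated in a small share of pixels.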
17 pages, 6910 KB  
Article
Tooth X-Ray Image Segmentation Based on ResU-Net with Coordinate Attention and Boundary-Aware Mechanisms
by Jie Xiong, Qiong Lou and Fang Lu
Sensors 2026, 26(12), 3880; https://doi.org/10.3390/s26123880 - 18 Jun 2026
Viewed by 164
Abstract
Accurate tooth segmentation plays a crucial role in computer-aided dental diagnosis and treatment planning, particularly in applications such as tooth detection, lesion localization, orthodontic analysis, and implant surgery. However, panoramic dental X-ray images often suffer from tooth adhesion, low contrast, and blurred boundaries, making precise delineation difficult and potentially compromising downstream clinical analysis. To address these challenges, we propose a boundary-aware segmentation framework, termed Boundary-Aware ResU-Net (BA-ResUNet), which is built upon a ResU-Net backbone and enhanced with Coordinate Attention (CA) and explicit boundary modeling mechanisms. Specifically, CA modules are introduced into the encoder to improve spatial representation and positional awareness. In addition, a Boundary Extraction Module (BEM) is designed to capture boundary priors from shallow and deep features, while a Boundary Injection Module (BIM) progressively incorporates these cues into the decoder through foreground enhancement and background suppression. This design enables the network to better preserve inter-tooth gaps and improve boundary delineation. Experiments on the MICCAI STS-2D dental dataset demonstrate that the proposed method achieves superior performance in terms of Dice and IoU compared with representative existing methods. Ablation and qualitative analyses further show that CA and BEM/BIM play synergistic roles in improving regional overlap and boundary localization, particularly in challenging cases involving adhesion, low contrast, and indistinct contours. These results indicate that the proposed framework provides a reliable and effective solution for panoramic tooth segmentation and has promising potential for computer-aided dental applications. Full article
(This article belongs to the Section Sensing and Imaging)
27 pages, 8573 KB  
Article
LTM-UNet: Linear Transformer–Mamba with Attention-Based U-Net for Context-Aware Breast Ultrasound Image Segmentation
by Shivpratap Singh Kushwah, Santosh Prakash Chouhan, Narinder Singh Punn and Mahua Bhattacharya
Diagnostics 2026, 16(12), 1888; https://doi.org/10.3390/diagnostics16121888 - 17 Jun 2026
Viewed by 296
Abstract
Background/Objectives: Accurate breast lesion segmentation using deep learning models requires precise understanding of both global contextual relevance and finer lesion structure details, which remains a challenge for existing convolutional and transformer-based approaches. This study aims to address these limitations by proposing a new segmentation model capable of improving context-aware dense segmentation tasks for ultrasound images. Method: We propose LTM-UNet, a novel segmentation method integrating transformer-based encoding with state-space-driven decoding in a U-Net-style framework. The architecture utilizes an efficient vision transformer encoder to extract multi-scale global representations. These features are refined through an attention-guided skip-fusion mechanism incorporating spatial-channel attention, which preserves finer spatial details and thereby minimizes the semantic gap between encoder and decoder features. Additionally, a direction-aware decoder based on a state-space model is introduced to efficiently capture long-range dependencies and enhance relevant feature reconstruction. Results: Extensive experiments on benchmark ultrasound medical imaging datasets demonstrate the effectiveness of the proposed method. The model achieves Dice score coefficients of 82.41% on the BUSI dataset and 86.62% on Dataset B (UDIAT), outperforming several existing segmentation approaches in both Dice score coefficient and Intersection-over-Union (IoU) metrics. Conclusions: The integration of efficient transformer-based global feature extraction, attention-enhanced feature fusion, and state-space-driven decoding enables LTM-UNet to effectively capture both structural details and contextual information, resulting in superior segmentation performance compared to existing methods. Full article
30 pages, 13578 KB  
Article
A Semi-Supervised Topographic Inversion Algorithm for Small-Scale Tidal Flats Based on Multi-Source Data Fusion Under Spatially Clustered ICESat-2 Label Distributions
by Hao Chen, Xiaowen Luo, Feng Gui, Jiaxin Cui, Jiayang Chen and Qi Li
Remote Sens. 2026, 18(12), 2017; https://doi.org/10.3390/rs18122017 - 17 Jun 2026
Viewed by 251
Abstract
High-precision topography of tidal flats is essential for coastal monitoring, geomorphic change analysis, and ecological assessment. Although satellite remote sensing supports repeated and large-area observation, topographic inversion over small-scale tidal flats (here defined as localized intertidal patches with limited areal extent, represented in this study by a 1.11 km² tidal flat near Dafeng Port) remains challenging, because ICESat-2 laser altimetry tracks across such areas are typically sparse and spatially clustered within narrow sub-regions, leaving extensive observation-blind zones without direct elevation labels. This label-clustering problem constrains the applicability of traditional empirical models and tends to cause deep learning models to generalize poorly beyond the spatial distribution of training samples. To address this issue, this study proposes a Residual Attention Physical-constraint Semi-supervised U-Net (RAPS-UNet) that fuses ICESat-2 ATL03/ATL08 elevation labels with Sentinel-1 SAR and Sentinel-2 optical features. The preprocessing pipeline comprises refined ICESat-2 photon filtering, adaptive inundation-frequency extraction, multi-source feature selection, and baseline DEM construction. RAPS-UNet integrates residual learning, attention-based multi-source fusion, physics-constrained loss, and confidence-weighted pseudo-label augmentation to improve extrapolation under clustered-label conditions. A four-level validation protocol (in-distribution validation, spatial holdout testing, and field-based assessment over both interpolation and extrapolation zones) was designed to evaluate spatial generalization. Against a field-surveyed DEM, RAPS-UNet achieved an overall RMSE of 0.20 m, an MAE of 0.16 m, and an R² of 0.91; the field-based interpolation and extrapolation zones yielded RMSEs of 0.17 m and 0.22 m, respectively, while the spatial holdout test reached an RMSE of 0.23 m and an R² of 0.81. Relative to the traditional inundation frequency–elevation linear model (RMSE = 0.35 m), RAPS-UNet reduced the field-validation RMSE by approximately 43%. The proposed framework therefore offers a practical approach for fine-scale coastal-zone topographic mapping under sparse and spatially clustered altimetry conditions. Full article
19 pages, 8573 KB  
Article
DCA-UNet for Landslide Segmentation with Deformable Convolution and Aggregated Attention
by Yingxu Song, Jie Luo, Cheng Wang, Xiangyan Kong, Yujia Zou, Yingcong Huang, Weicheng Wu, Yuan Li, Run Wang, Shiyao Li, Zuohua Tang, Shiluo Xu, Qiang Li and Hui Chen
Remote Sens. 2026, 18(12), 2000; https://doi.org/10.3390/rs18122000 - 16 Jun 2026
Viewed by 263
Abstract
Accurate delineation of landslide boundaries from remote sensing imagery remains challenging because landslides exhibit irregular geometry, substantial scale variation, and strong background interference. We propose DCA-UNet, a U-Net-style segmentation network that integrates deformable convolution and aggregated attention to jointly improve geometric adaptation and local-global context modeling. Deformable convolution adjusts spatial sampling locations to irregular landslide boundaries, whereas aggregated attention enhances contextual discrimination in visually ambiguous terrain. We evaluate the method on three public benchmarks—Landslide4Sense, HR-GLDD, and GDCLD—under a controlled from-scratch benchmark with dataset-specific preprocessing and official data splits. DCA-UNet achieves the best overall IoU/F1 ranking across the three datasets, reaching 61.92%/76.48% on Landslide4Sense, 59.24%/74.41% on HR-GLDD, and 58.40%/73.74% on GDCLD. The model contains 29.50 million parameters, which is close to vanilla U-Net and substantially fewer than several transformer-based baselines, although its training-side runtime and memory consumption are not the lowest. These results show that combining adaptive spatial sampling with local-global contextual aggregation is effective for landslide segmentation in both multispectral and RGB remote sensing imagery. Full article
(This article belongs to the Special Issue Landslide Detection Using Machine and Deep Learning)
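The DCA-UNet abstract above reports IoU and F1 as pairs. When both metrics are computed from the same pixel counts they are tied by F1 = 2·IoU / (1 + IoU), and the sketch below illustrates that relation. The function names and the toy counts are assumptions; how the paper aggregates its metrics is not stated in the abstract.

```python
def iou_and_f1(tp, fp, fn):
    """Return (IoU, F1) from true-positive, false-positive and false-negative counts."""
    iou = tp / (tp + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return iou, f1


def f1_from_iou(iou):
    """Convert IoU to F1, assuming both come from the same confusion counts."""
    return 2 * iou / (1 + iou)


# Toy counts: 6 correct landslide pixels, 2 false alarms, 2 misses.
iou, f1 = iou_and_f1(tp=6, fp=2, fn=2)

# Applying the identity to the IoU values quoted in the abstract (as fractions).
f1_landslide4sense = 100 * f1_from_iou(0.6192)
f1_gdcld = 100 * f1_from_iou(0.5840)
```

The converted values agree with the reported 76.48% and 73.74% to two decimals. The same conversion for HR-GLDD gives about 74.40% against the reported 74.41%, a gap that is consistent with rounding of the IoU input.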
24 pages, 64409 KB  
Article
CA-DDPM: Conditionally Embedded Attention-Aided Denoising Diffusion Probabilistic Model for High-Quality SAR Image Generation
by Yang Zheng, Duhao Liu, Ruimin Li, Rongxu Wang, Junling Fan, Kaitai Guo and Jimin Liang
Remote Sens. 2026, 18(12), 1994; https://doi.org/10.3390/rs18121994 - 15 Jun 2026
Viewed by 237
Abstract
Deep learning-based automatic target recognition (ATR) for synthetic aperture radar (SAR) imagery requires large quantities of high-quality annotated data, yet real SAR samples are costly and difficult to obtain. Existing generative adversarial network (GAN)-based SAR generation methods often suffer from limited authenticity and insufficient diversity. To address these issues, we propose CA-DDPM, a conditionally embedded attention-aided denoising diffusion probabilistic model (DDPM) for high-quality multi-category SAR image generation. CA-DDPM employs a unified conditional embedding that fuses time-step and category information, injected into a U-Net backbone through a feature-wise linear modulation (FiLM)-based mechanism to achieve step-aware and class-aware denoising. Attention blocks are further incorporated to enhance the modeling of structural dependencies and fine scattering details. To evaluate generation quality, we develop a three-dimensional assessment framework that jointly examines authenticity, diversity, and utility in ATR. Authenticity is quantified using local and global similarity metrics under a unified Hungarian-matched statistical procedure, together with an SAR-adapted Fréchet inception distance (SAR-FID). Diversity is assessed through inter-category feature clustering, an SAR Inception Score (SAR-IS), and a newly proposed intra-category grayscale histogram-based metric. Utility is evaluated by hybrid training experiments across multiple ATR models. Experiments on the Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset demonstrate that CA-DDPM produces more realistic and diverse SAR images than representative GAN- and DDPM-based baselines, and it effectively improves downstream ATR performance through data augmentation. Full article
(This article belongs to the Special Issue AI-Driven Remote Sensing Image Restoration and Generation)
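The CA-DDPM abstract above describes injecting a fused time-step and category embedding into the U-Net through feature-wise linear modulation (FiLM). A minimal sketch of that mechanism follows, under stated assumptions: the two embeddings are fused by addition, the scale is parameterised as 1 plus a linear projection, and all names and weights are hypothetical. The paper's actual fusion and projection layers are not given in the abstract.

```python
def condition_to_film(time_emb, class_emb, w_gamma, w_beta):
    """Map a fused condition vector to per-channel FiLM parameters.

    Fusion by addition and the `1 + projection` scale are assumptions.
    """
    cond = [t + c for t, c in zip(time_emb, class_emb)]
    gamma = [1.0 + sum(w * x for w, x in zip(row, cond)) for row in w_gamma]
    beta = [sum(w * x for w, x in zip(row, cond)) for row in w_beta]
    return gamma, beta


def film(features, gamma, beta):
    """Apply y = gamma * x + beta channel by channel (features is C x N)."""
    return [[g * v + b for v in ch] for ch, g, b in zip(features, gamma, beta)]


# Toy embeddings and projection weights (two channels, two embedding dimensions).
time_emb = [0.5, -0.5]
class_emb = [0.5, 0.5]
w_gamma = [[1.0, 0.0], [0.0, 1.0]]
w_beta = [[0.5, 0.0], [0.0, 0.5]]

gamma, beta = condition_to_film(time_emb, class_emb, w_gamma, w_beta)
modulated = film([[1.0, 2.0], [3.0, 4.0]], gamma, beta)
```

Here the condition changes the scale and shift of channel 0 and leaves channel 1 untouched, which is the sense in which the denoiser becomes step-aware and class-aware: the same feature map is modulated differently for each time step and category.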
36 pages, 32050 KB  
Article
Semantic Segmentation of Pegmatite Dikes in High-Resolution Remote Sensing Imagery Using GAD-UNet++ in the Yilanlike Area, South Tianshan
by Zirui Wu, Chuan Chen, Yuanjun Yu, Yong Tian, Jian Yu and Fang Xia
Remote Sens. 2026, 18(12), 1988; https://doi.org/10.3390/rs18121988 - 15 Jun 2026
Viewed by 255
Abstract
Pegmatite dikes are important prospecting indicators for rare-metal deposits, whereas traditional methods for pegmatite dike identification are constrained by the limited capability of human visual interpretation to capture information from remote sensing imagery, resulting in low identification accuracy and efficiency. In recent years, global research on semantic segmentation of different surface features and remote sensing-based mineral exploration using deep learning methods and high-resolution remote sensing imagery has made significant progress; however, studies on surface-exposed geological bodies such as pegmatite dikes remain highly insufficient. To address the key problem of efficiently identifying pegmatite dikes in remote sensing imagery, this study proposes an improved model based on UNet++, termed GAD-UNet++. In the field of remote sensing geology, this study constructed a pegmatite dike semantic segmentation dataset based on high-resolution RGB imagery by using 0.66 m RGB imagery for visual delineation and ZY1F hyperspectral data for spectral constraint and label refinement; on this basis, semantic segmentation of surface pegmatite dikes in the Yilanlike area of the South Tianshan Mountains, Xinjiang, was conducted using RGB remote sensing image patches as model input. Specifically, because pegmatite dikes are small targets characterized by slender structures, indistinct boundaries, and sparse regional distribution, this study introduced a lightweight feature extraction structure (GhostNetV2) and a long-range dependency attention module (DFC) at the encoder stage, and further incorporated the Coordinate Attention module (CA) to enhance spatial localization and boundary representation of the targets. Finally, focal cross-entropy loss and a deep supervision strategy were adopted to improve the accuracy of semantic information extraction for pegmatite dikes, as well as the training stability and segmentation accuracy under class-imbalance conditions. 
The results show that the proposed model achieved an mIoU of 93.11% and an F1-score of 94.95% on the test set. Compared with existing semantic segmentation models, the proposed model achieved superior performance in both identification accuracy and computational efficiency for pegmatite dikes. In addition, this study delineated 18 potential pegmatite dike enrichment zones in the Yilanlike area, providing technical support for remote sensing-based rare-metal prospecting and geological interpretation in the study area. Full article
(This article belongs to the Section Remote Sensing in Geology, Geomorphology and Hydrology)
17 pages, 7783 KB  
Article
An Automatic Identification Method for Vertebral Compression Fractures in X-Ray Images Based on Multi-Stage Deep Learning
by Shenyang Duan, Yufeng Deng and Yang Song
Electronics 2026, 15(12), 2626; https://doi.org/10.3390/electronics15122626 - 14 Jun 2026
Viewed by 231
Abstract
Vertebral compression fractures (VCFs) are among the most common spinal disorders encountered clinically. Delayed diagnosis or inaccurate classification often leads to prolonged pain and functional impairment in patients. Because computed tomography (CT) and magnetic resonance imaging (MRI) examinations are costly and of limited applicability, this study leverages the wide availability and convenience of X-ray imaging to improve diagnostic accuracy and efficiency, and proposes a multi-stage deep learning-based method for identifying vertebral compression fractures. The method first employs Discrete Wavelet Transform-YOLOv5 (DWT-YOLOv5) for preliminary vertebral region localization, followed by Polarized Self-Attention-UNet (PSA-UNet) for precise segmentation. Finally, a ResNet50 network incorporating a Convolutional Block Attention Module (CBAM) performs graded classification, categorizing vertebrae into four types: Non-fracture, Mild fracture, Moderate fracture, and Severe fracture. The experimental results demonstrate that the proposed method achieved average accuracy, precision, recall, specificity, and F1-score of 83.7%, 88.1%, 86.2%, 97.7%, and 87.2%, respectively. By exploiting the cost-effectiveness and convenience of X-ray imaging, the method provides clinicians with an efficient and economical auxiliary diagnostic tool for rapid and accurate identification of vertebral compression fractures in emergency and initial screening scenarios.
(This article belongs to the Special Issue AI-Driven Medical Image/Video Processing)
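The averaged accuracy, precision, recall, specificity, and F1-score reported in this abstract are all derived from a multi-class confusion matrix. The sketch below shows one way to compute them using a one-vs-rest breakdown per class followed by macro averaging. The abstract does not say which averaging scheme the authors used, so macro averaging is an assumption here, and the function names and the toy four-class matrix are hypothetical placeholders rather than the paper's data.

```python
def per_class_metrics(cm):
    """One-vs-rest metrics; cm[i][j] counts true class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # true k, predicted otherwise
        fp = sum(cm[i][k] for i in range(n)) - tp   # predicted k, true otherwise
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        results.append({"precision": precision, "recall": recall,
                        "specificity": specificity, "f1": f1})
    return results


def macro_average(results):
    """Unweighted mean of each metric across classes."""
    return {key: sum(r[key] for r in results) / len(results)
            for key in results[0]}


def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[k][k] for k in range(len(cm))) / total


# Hypothetical counts for Non-fracture, Mild, Moderate, Severe (illustration only).
toy_cm = [
    [50, 3, 1, 0],
    [4, 20, 3, 0],
    [1, 4, 15, 2],
    [0, 0, 2, 10],
]
print(accuracy(toy_cm), macro_average(per_class_metrics(toy_cm)))
```

Specificity tends to sit well above recall in multi-class settings because every class treats the other three as negatives, which is consistent with the pattern of the figures quoted in the abstract.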

23 pages, 19029 KB  
Article
CETransUNet: An Intelligent Landslide Identification Method Based on Collaborative Optimization of Global Context and Dual Attention Mechanisms
by Tianli Sun, Chengsheng Yang, Jifeng Wu, Zewei Liu, Ziqian Wang and Xiaoqiang Cheng
Remote Sens. 2026, 18(12), 1974; https://doi.org/10.3390/rs18121974 - 13 Jun 2026
Viewed by 240
Abstract
Accurate landslide identification is crucial for enhancing emergency response capabilities during destructive geological hazards. Although deep-learning-based semantic segmentation has demonstrated effectiveness, substantial variations in landslide scale and the similarity between landslides and their surrounding environment continue to challenge existing methods. This paper constructs a new co-seismic landslide dataset for the Yarlung Zangbo River basin based on the 2017 Nyingchi earthquake, filling a regional data gap, and proposes CETransUNet (coordinate attention and edge-guided attention transformer UNet), a landslide detection model that integrates ResNet and Transformer architectures. Specifically, a coordinate attention (CA) module is introduced within the skip connections between the encoder and decoder. This module encodes positional information along both horizontal and vertical spatial directions and dynamically re-weights the feature maps, thereby suppressing background noise caused by semantic gaps and enhancing the model’s ability to localize landslide regions. Additionally, an edge-guided attention (EGA) module is incorporated into the decoder. This module extracts explicit edge priors from the input image using a Laplacian operator and imposes geometric constraints on the predictions via a boundary reverse attention mechanism, thereby alleviating boundary ambiguity and morphological distortion of landslides. Evaluations across datasets from the Yarlung Zangbo River, Iburi-Tobu, and Bijie regions demonstrate that CETransUNet significantly outperforms state-of-the-art models, including TransUNet, SegFormer, and SwinUNet, in terms of IoU, MIoU, and F1-score.
Overall, through the synergistic optimization of the coordinate attention and edge-guided attention modules, the CETransUNet model jointly improves boundary integrity and geometric precision in complex scenarios, providing a reliable technical solution for large-scale intelligent landslide identification.
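The edge prior described in this abstract is extracted with a Laplacian operator. As a rough illustration of that single step, the sketch below applies the standard 4-neighbour Laplacian kernel to a small grayscale grid. The kernel variant, the replicated-border handling, the use of the absolute response, and the function name `laplacian_edges` are all assumptions; the abstract does not specify them, nor how the resulting map feeds the boundary reverse attention mechanism.

```python
def laplacian_edges(image):
    """Absolute 4-neighbour Laplacian response of a 2D grayscale grid.

    Kernel (assumed): [[0, 1, 0], [1, -4, 1], [0, 1, 0]].
    Borders are handled by replicating the nearest pixel (assumed).
    """
    h, w = len(image), len(image[0])

    def px(r, c):
        # Clamp coordinates so out-of-range reads reuse the border pixel.
        r = min(max(r, 0), h - 1)
        c = min(max(c, 0), w - 1)
        return image[r][c]

    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            response = (px(r - 1, c) + px(r + 1, c)
                        + px(r, c - 1) + px(r, c + 1)
                        - 4 * image[r][c])
            out[r][c] = abs(response)
    return out


# A vertical step edge: flat regions give zero, the transition gives a response.
step = [[0, 0, 1, 1] for _ in range(3)]
for row in laplacian_edges(step):
    print(row)
```

Flat regions produce zero response while intensity transitions produce non-zero values, which is what makes such a map usable as an explicit boundary cue.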

20 pages, 1567 KB  
Article
Efficient Glare Suppression Network for Nighttime Images with Lightweight Parallel Attention and Ghost Convolution
by Ruoyu Yang, Huaixin Chen, Sijie Luo and Zhixi Wang
Sensors 2026, 26(12), 3773; https://doi.org/10.3390/s26123773 - 12 Jun 2026
Viewed by 397
Abstract
Artificial light sources such as vehicle lamps and street lamps cause glare interference, local overexposure, and detail loss in nighttime road scenes, while existing glare suppression models have large parameter counts and high computational complexity and are difficult to deploy on edge devices. To address these problems, this paper proposes a lightweight glare suppression network (LGSNet) based on ghost depthwise separable convolution and Lightweight Parallel Attention. Built on the U-Net architecture, the network introduces ghost depthwise separable convolution blocks (GhostDSC) in the encoder and decoder, which exploit feature map redundancy to generate ghost features through cheap linear transformations, significantly reducing model parameters and computational cost while maintaining feature representation ability. Meanwhile, a Lightweight Parallel Attention (LPA) module in the decoder stage integrates channel attention and pixel attention in parallel, strengthening the network’s focus on glare regions and edge details with a very small parameter increase to improve detail recovery accuracy. In addition, a joint loss function consisting of background loss, glare loss, and reconstruction loss is constructed to jointly optimize glare suppression and detail preservation. Experimental results on the public Flare7K++ dataset and the self-built nighttime road glare dataset NRGD show that the proposed method has only 7.45 M parameters, far fewer than standard U-Net and Uformer. It achieves competitive results on full-reference metrics such as PSNR, SSIM, and LPIPS and on no-reference metrics such as NIQE, BRISQUE, and PIQE, and can effectively suppress various types of glare interference and restore obscured scene details.
The network achieves a superior trade-off between model complexity and enhancement performance, significantly reducing the parameter count and computational overhead compared to heavy baselines, thereby offering a highly efficient solution for resource-aware glare suppression tasks.
(This article belongs to the Section Intelligent Sensors)
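The parameter savings attributed to ghost features in this abstract can be made concrete with a small back-of-the-envelope calculation. The sketch below follows the parameter accounting of the original Ghost module (a primary convolution producing a fraction of the output channels, plus cheap depthwise operations generating the rest). The GhostDSC block in LGSNet additionally uses depthwise separable convolution and may be structured differently, so this illustrates the general idea rather than the paper's block; the function names, the `ratio` and `cheap_k` parameters, and the channel sizes are assumptions.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights of a plain k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_module_params(c_in, c_out, k, ratio=2, cheap_k=3):
    """Weights of a Ghost-style module (bias ignored).

    ratio   -- each intrinsic map yields (ratio - 1) ghost maps (assumed value)
    cheap_k -- kernel size of the cheap depthwise operation (assumed value)
    """
    if c_out % ratio != 0:
        raise ValueError("c_out must be divisible by ratio")
    intrinsic = c_out // ratio
    primary = c_in * intrinsic * k * k                    # ordinary convolution
    cheap = (ratio - 1) * intrinsic * cheap_k * cheap_k   # depthwise filters
    return primary + cheap


# Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernels.
plain = standard_conv_params(64, 128, 3)
ghost = ghost_module_params(64, 128, 3)
print(plain, ghost, round(plain / ghost, 2))
```

For these assumed sizes the Ghost-style layer needs roughly half the weights of the plain convolution, and the saving approaches the chosen `ratio` as the input channel count grows.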