Search Results (28)

Search Parameters:
Keywords = DA-UNet

21 pages, 1719 KB  
Article
DA-UNet: A Direction-Aware U-Net for Leaf Vein Segmentation in Tissue-Cultured Plantlets
by Qiuze Wu, Qing Yang, Dong Meng and Xiaofei Yan
Electronics 2026, 15(7), 1531; https://doi.org/10.3390/electronics15071531 - 6 Apr 2026
Viewed by 357
Abstract
Accurate leaf vein segmentation is essential for automating Agrobacterium-mediated genetic transformation of tissue-cultured plantlets. Although various vein segmentation methods have been proposed, the thin, low-contrast structure of leaf veins frequently leads to fragmented segmentation outputs. To address this issue, we propose Direction-Aware U-Net (DA-UNet), an improved U-Net architecture that incorporates a Direction-Aware Context Pooling (DACPool) module and a Topology-aware Segmentation loss (TopoSeg loss). The DACPool module explicitly exploits vein orientation to aggregate directional contextual information, while the TopoSeg loss jointly optimizes pixel-level accuracy and topological continuity. Evaluations on the self-constructed Tissue-Cultured Plantlet Vein Dataset 2025 (TCPVD2025) show that DA-UNet achieves efficient leaf vein segmentation with improved continuity and structural integrity. Comparative experiments show that the improved model outperforms PSPNet, DeepLabV3+, U-Net, TransUNet, Swin-UNet, CCNet, and SegNeXt, with Recall, Dice, and CONNECT scores of 71.35%, 69.08%, and −2.25, while maintaining a competitive Precision of 66.98%. Ablation experiments further confirm the efficacy of the TopoSeg loss and the DACPool module. These results demonstrate that the proposed framework produces vein segmentations that are both accurate and structurally consistent, enabling reliable automated plant genetic transformation.
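The idea of pairing a pixel-level overlap term with a topology term can be illustrated with a minimal sketch. The abstract does not give the TopoSeg loss formula, so the Dice term, the connected-component penalty, and the weight `alpha` below are illustrative assumptions, not the authors' implementation:

```python
def dice_coefficient(pred, target, eps=1e-7):
    """Pixel-level overlap between two binary masks (flat lists of 0/1)."""
    inter = sum(p * t for p, t in zip(pred, target))
    return (2.0 * inter + eps) / (sum(pred) + sum(target) + eps)

def n_components(mask, width):
    """Count 4-connected foreground components in a flat binary mask."""
    h = len(mask) // width
    seen, comps = set(), 0
    for start in range(len(mask)):
        if mask[start] == 0 or start in seen:
            continue
        comps += 1
        stack = [start]
        while stack:
            i = stack.pop()
            if i in seen or mask[i] == 0:
                continue
            seen.add(i)
            r, c = divmod(i, width)
            if c > 0:         stack.append(i - 1)
            if c < width - 1: stack.append(i + 1)
            if r > 0:         stack.append(i - width)
            if r < h - 1:     stack.append(i + width)
    return comps

def topo_seg_loss(pred, target, width, alpha=0.1):
    """(1 - Dice) plus a penalty when the prediction is more (or less)
    fragmented than the ground truth -- a crude continuity proxy."""
    dice = dice_coefficient(pred, target)
    topo = abs(n_components(pred, width) - n_components(target, width))
    return (1.0 - dice) + alpha * topo
```

A perfectly matching mask incurs zero loss, while a fragmented prediction of a continuous vein is penalized both for the missed pixels and for the extra components.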

22 pages, 8609 KB  
Article
Integrating SimAM Attention and S-DRU Feature Reconstruction for Sentinel-2 Imagery-Based Soybean Planting Area Extraction
by Haotong Wu, Xinwen Wan, Rong Qian, Chao Ruan, Jinling Zhao and Chuanjian Wang
Agriculture 2026, 16(6), 693; https://doi.org/10.3390/agriculture16060693 - 19 Mar 2026
Viewed by 308
Abstract
Accurate and stable acquisition of the spatial distribution of soybean planting areas is essential for supporting precision agricultural monitoring and ensuring food security. However, crop remote-sensing mapping for specific regions still faces critical data bottlenecks: high-precision, large-scale pixel-level annotation is costly, resulting in scarce labeled samples that make it difficult to construct large-scale training datasets. Although parameter-intensive models such as FCN and SegNet can achieve sufficient end-to-end training on large-scale public remote sensing datasets like LoveDA, when directly applied to the data-limited dataset in this study area they are prone to overfitting, leading to a significant decline in generalization ability. To address these issues, this study proposes a lightweight U-shaped semantic segmentation model, SimSDRU-Net. The model utilizes a pre-trained VGG-16 backbone to extract shallow texture and deep semantic features; the pre-trained weights mitigate overfitting in data-limited settings. In the decoding stage, a parameter-free lightweight SimAM attention module enhances effective soybean features and suppresses soil background redundancy, while an embedded S-DRU unit fuses multi-scale features for deep complementary reconstruction to improve edge detail capture. A label dataset was constructed using Sentinel-2 images as the data source and Menard County (USA) as the study area, with the USDA CDL as its foundation and Google high-resolution images serving as visual interpretation aids. In the experiments, SimSDRU-Net was compared with DeepLabv3+, U-Net++, and U-Net under identical conditions. The results demonstrated that SimSDRU-Net achieved the best performance, with an MIoU of 89.03%, an MPA of 93.81%, and an OA of 95.96%.
Specifically, SimSDRU-Net uses the SimAM attention module to generate spatial attention weights by analyzing statistical differences among features through an energy function, adaptively enhancing soybean texture features. Meanwhile, the S-DRU unit groups, dynamically weights, and cross-branch reconstructs multi-scale convolutional features to preserve fine boundary details and achieve accurate segmentation of soybean plots. The study demonstrates that SimSDRU-Net combines a lightweight design with high precision in data-limited scenarios, providing effective technical support for the rapid extraction of soybean planting areas in North America.
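SimAM's parameter-free attention has a published closed form: each neuron's weight is the sigmoid of its inverse minimal energy, computed from the channel mean and variance. A minimal scalar sketch follows; note it takes the variance over all neurons as a simplification of the leave-one-out statistics in the original SimAM paper, so treat it as an illustration rather than the paper's exact module:

```python
import math

def simam_weights(x, lam=1e-4):
    """Attention weights for one channel's activations (flat list).
    Neurons that deviate more from the channel mean have lower minimal
    energy and therefore a higher sigmoid(1/energy) weight."""
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n
    # inverse minimal energy per neuron; lam is a small regularizer
    inv_energy = [((v - mu) ** 2 + 2 * var + 2 * lam) / (4 * (var + lam))
                  for v in x]
    return [1.0 / (1.0 + math.exp(-e)) for e in inv_energy]

def simam(x, lam=1e-4):
    """Recalibrate activations by their attention weights."""
    return [v * w for v, w in zip(x, simam_weights(x, lam))]
```

An outlier activation (e.g., a soybean texture response against a uniform soil background) receives a larger weight than a near-mean one, which is exactly the "statistical difference" behavior the abstract describes.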
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

20 pages, 8653 KB  
Article
Spatiotemporal Prediction of Wind Fields in Coastal Urban Environments Using Multi-Source Satellite Data: A GeoAI Approach
by Yifan Shi, Tianqiang Huang, Liqing Huang, Wei Huang, Shaoyu Liu and Riqing Chen
Remote Sens. 2026, 18(5), 716; https://doi.org/10.3390/rs18050716 - 27 Feb 2026
Viewed by 345
Abstract
Rapid urbanization in coastal regions presents complex challenges for environmental management and public safety. Accurate, high-resolution wind field monitoring is critical for urban disaster mitigation, infrastructure resilience, and pollutant dispersion analysis in these densely populated areas. However, utilizing massive multi-source satellite remote sensing data for precise prediction remains difficult due to the spatiotemporal heterogeneity caused by the land–sea interface. To address this challenge, this study proposes a novel lightweight Geospatial Artificial Intelligence (GeoAI) framework (DA-DSC-UNet) designed to predict wind fields in coastal urban environments (e.g., Fujian, China). We constructed a dataset by integrating multi-source satellite scatterometer products (including the Advanced Scatterometer (ASCAT), Fengyun-3E (FY-3E), and the Quick Scatterometer (QuikSCAT)) and buoy observations. The framework employs a UNet architecture enhanced with dual attention mechanisms (Efficient Channel Attention (ECA) and the Convolutional Block Attention Module (CBAM)) to adaptively extract features from remote sensing signals, focusing on critical spatial regions like urban coastlines. Additionally, depthwise separable convolutions (DSCs) are introduced to keep the model lightweight and efficient for potential deployment in urban monitoring systems. Results demonstrate that our approach significantly outperforms existing deep learning models (reducing Mean Absolute Error (MAE) by 14–25.8%) and exhibits exceptional robustness against observational noise. This work demonstrates the potential of deep learning in enhancing the value of remote sensing data for urban resilience, sustainable development (SDG 11), and environmental monitoring in complex coastal zones.
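The parameter saving from depthwise separable convolutions is easy to quantify: a standard k×k convolution needs k·k·C_in·C_out weights, while the depthwise-plus-pointwise factorization needs only k·k·C_in + C_in·C_out. A small sketch of the two counts (bias terms ignored):

```python
def conv_params(k, c_in, c_out):
    """Weight count of a standard k x k convolution."""
    return k * k * c_in * c_out

def dsc_params(k, c_in, c_out):
    """Depthwise separable convolution: one k x k filter per input
    channel (depthwise), then a 1 x 1 pointwise conv to mix channels."""
    return k * k * c_in + c_in * c_out
```

For a 3×3 layer with 64 input and 128 output channels this is 73,728 versus 8,768 weights, roughly an 8× reduction, which is what makes the framework lightweight enough for deployment.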
(This article belongs to the Special Issue Remote Sensing Applied in Urban Environment Monitoring)

24 pages, 8571 KB  
Article
Spatiotemporal Evolution of Mid-Channel Bars in the Yalu River Based on DA-UNet
by Qiao Yu, Fangxiong Wang, Yingzi Hou, Zhenqi Cui, Junfu Wang and Yi Lu
Sustainability 2026, 18(3), 1681; https://doi.org/10.3390/su18031681 - 6 Feb 2026
Viewed by 268
Abstract
Mid-channel bars are fundamental fluvial geomorphic units that regulate sediment transport, channel stability, and riparian ecosystems, and their spatiotemporal evolution provides critical insights for sustainable river management. This study examines the structural reorganization and migration dynamics of mid-channel bars along the mainstem of the transboundary Yalu River using multi-temporal Sentinel-2 imagery acquired in 2019, 2022, and 2024. An automated extraction framework combining a dense atrous U-Net (DA-UNet) with multispectral indices was developed to robustly identify mid-channel bars under complex water–land transition conditions. Based on the extracted results, changes in bar number, area, size composition, morphological characteristics, and centroid migration were systematically analyzed. The results reveal a pronounced reorganization of mid-channel bar systems over the study period: although the number of bars increased from 111 to 136, the total area decreased from 168.97 km² to 165.00 km², indicating a transition from a “few-large” to a “many-small” configuration. Size-based analysis further shows an increase in small and medium bars, while large bars remained relatively stable, leading to a more differentiated multi-scale structure. These findings highlight the effectiveness of integrating multi-temporal remote sensing and deep learning for long-term monitoring of geomorphic dynamics and provide scientific evidence to support sustainable river regulation and transboundary watershed management.
(This article belongs to the Section Sustainability in Geographic Science)

20 pages, 3823 KB  
Article
DA-TransResUNet: Residual U-Net Liver Segmentation Model Integrating Dual Attention of Spatial and Channel with Transformer
by Kunzhan Wang, Xinyue Lu, Jing Li and Yang Lu
Mathematics 2026, 14(3), 575; https://doi.org/10.3390/math14030575 - 5 Feb 2026
Viewed by 387
Abstract
Precise medical image segmentation plays a vital role in disease diagnosis and clinical treatment. Although U-Net-based architectures and their Transformer-enhanced variants have achieved remarkable progress in automatic segmentation tasks, they still face challenges in complex medical imaging scenarios, particularly in simultaneously modeling fine-grained local details and capturing long-range global contextual information, which limits segmentation accuracy and structural consistency. To address these challenges, this paper proposes a novel medical image segmentation framework termed DA-TransResUNet. Built upon a ResUNet backbone, the proposed network integrates residual learning, Transformer-based encoding, and a dual-attention (DA) mechanism in a unified manner. Residual blocks facilitate stable optimization and progressive feature refinement in deep networks, while the Transformer module effectively models long-range dependencies to enhance global context representation. Meanwhile, the proposed DA-Block jointly exploits local and global features as well as spatial and channel-wise dependencies, leading to more discriminative feature representations. Furthermore, embedding DA-Blocks into both the feature embedding stage and the skip connections strengthens information interaction between the encoder and decoder, thereby improving overall segmentation performance. Experimental results on the LiTS2017 and Sliver07 datasets demonstrate that the proposed method achieves consistent improvements in liver segmentation. In particular, on the LiTS2017 dataset, DA-TransResUNet achieves a Dice score of 97.39%, a VOE of 5.08%, and an RVD of −0.74%, validating its effectiveness for liver segmentation.
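The channel branch of a dual-attention block can be sketched generically: squeeze each channel to a global statistic, turn the statistics into weights, and rescale the channels. The softmax reweighting below is a hedged illustration of that pattern, not the paper's exact DA-Block formulation:

```python
import math

def channel_attention(channels):
    """channels: list of feature maps, each a flat list of activations.
    Squeeze each channel to its global average, softmax the averages
    into weights, and rescale every channel by its weight."""
    means = [sum(c) / len(c) for c in channels]
    m = max(means)                       # shift for numerical stability
    exps = [math.exp(v - m) for v in means]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [[w * v for v in c] for w, c in zip(weights, channels)]
```

Channels with a stronger global response (e.g., liver-boundary detectors) are amplified relative to weakly responding ones; a spatial branch would do the analogous reweighting over pixel positions.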

33 pages, 1245 KB  
Article
Domain-Adaptive MRI Learning Model for Precision Diagnosis of CNS Tumors
by Wiem Abdelbaki, Hend Alshaya, Inzamam Mashood Nasir, Sara Tehsin, Salwa Said and Wided Bouchelligua
Biomedicines 2026, 14(1), 235; https://doi.org/10.3390/biomedicines14010235 - 21 Jan 2026
Cited by 1 | Viewed by 528
Abstract
Background: Diagnosing CNS tumors through MRI is limited by significant variability in scanner hardware, acquisition protocols, and intensity characteristics across clinical centers, resulting in substantial domain shifts that diminish the reliability of automated models. Methods: We present a Domain-Adaptive MRI Learning Model (DA-MLM) consisting of an adversarially aligned hybrid 3D CNN–transformer encoder with contrastive regularization and covariance-based feature harmonization. Multi-sequence MRI inputs (T1, T1ce, T2, and FLAIR) were fed into multi-scale convolutional layers followed by global self-attention to capture localized tumor structure and long-range spatial context, with domain adaptation harmonizing feature distributions across datasets. Results: On the BraTS 2020 dataset, DA-MLM achieved 94.8% accuracy, 93.6% macro-F1, and 96.2% AUC, improving upon previously established benchmarks by 2–4%. DA-MLM also attained Dice segmentation scores of 93.1% (WT), 91.4% (TC), and 89.5% (ET), improving upon CNN and transformer methods by 2–3.5%. On the REMBRANDT dataset, DA-MLM achieved 92.3% accuracy, with segmentation improvements of 3–7% over existing U-Net baselines against expert annotations. Robustness testing indicated 40–60% less degradation under noise, contrast shift, and motion artifacts, and synthetic scanner-site shifts showed negligible performance impairment (<0.06). Cross-domain evaluation also demonstrated 5–11% less degradation than existing methods. Conclusions: DA-MLM demonstrates improved accuracy, segmentation fidelity, and robustness to perturbations, as well as strong cross-domain generalization, indicating its suitability for deployment in multicenter MRI applications where variation in imaging characteristics is unavoidable.
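Covariance-based harmonization aligns second-order feature statistics across domains. The sketch below is a 1-D moment-matching analogue: full CORAL-style alignment whitens features with the target-domain covariance matrix and re-colours them with the source's, while this per-feature version only matches mean and variance, as an illustrative simplification rather than the paper's method:

```python
import math

def harmonize(target_feats, source_feats):
    """Shift and scale target-domain features so their mean and standard
    deviation match the source domain (a 1-D covariance-alignment analogue)."""
    def stats(xs):
        mu = sum(xs) / len(xs)
        var = sum((x - mu) ** 2 for x in xs) / len(xs)
        return mu, math.sqrt(var) + 1e-12  # eps guards division by zero
    mu_t, sd_t = stats(target_feats)
    mu_s, sd_s = stats(source_feats)
    return [(x - mu_t) / sd_t * sd_s + mu_s for x in target_feats]
```

After harmonization, a classifier trained on source-domain statistics sees target features with the same first- and second-order moments, which is the intuition behind reducing scanner-induced domain shift.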
(This article belongs to the Special Issue Diagnosis, Pathogenesis and Treatment of CNS Tumors (2nd Edition))

23 pages, 9897 KB  
Article
HyMambaNet: Efficient Remote Sensing Water Extraction Method Combining State Space Modeling and Multi-Scale Features
by Handan Liu, Guangyi Mu, Kai Li, Haowei Zhang, Yibo Sun, Hongqing Sun and Sijia Li
Sensors 2025, 25(24), 7414; https://doi.org/10.3390/s25247414 - 5 Dec 2025
Viewed by 684
Abstract
Accurate segmentation of water bodies from high-resolution remote sensing imagery is crucial for water resource management and ecological monitoring. However, small and morphologically complex water bodies remain difficult to detect due to scale variations, blurred boundaries, and heterogeneous backgrounds. This study aims to develop a robust and scalable deep learning framework for high-precision water body extraction across diverse hydrological and ecological scenarios. To address these challenges, we propose HyMambaNet, a hybrid deep learning model that integrates convolutional local feature extraction with the Mamba state space model for efficient global context modeling. The network further incorporates multi-scale and frequency-domain enhancement as well as optimized skip connections to improve boundary precision and segmentation robustness. Experimental results demonstrate that HyMambaNet significantly outperforms existing CNN and Transformer-based methods. On the LoveHY dataset, it achieves 74.82% IoU and 88.87% F1-score, exceeding UNet by 7.49% IoU and 7.12% F1. On the LoveDA dataset, it attains 81.30% IoU and 89.99% F1-score, surpassing advanced models such as Deeplabv3+, AttenUNet, and TransUNet. These findings confirm that HyMambaNet provides an efficient and generalizable solution for large-scale water resource monitoring and ecological applications based on remote sensing imagery.
(This article belongs to the Section Environmental Sensing)

17 pages, 4072 KB  
Article
MKF-NET: KAN-Enhanced Vision Transformer for Remote Sensing Image Segmentation
by Ning Ye, Yi-Han Xu, Wen Zhou, Gang Yu and Ding Zhou
Appl. Sci. 2025, 15(20), 10905; https://doi.org/10.3390/app152010905 - 10 Oct 2025
Cited by 1 | Viewed by 1049
Abstract
Remote sensing images, which capture surface information from aerial or satellite platforms, are of great significance in fields such as environmental monitoring, urban planning, agricultural management, and disaster response. However, due to the complex and diverse types of ground coverage and significant differences in spectral characteristics in remote sensing images, achieving high-quality semantic segmentation still faces many challenges, such as blurred target boundaries and difficulty in recognizing small-scale objects. To address these issues, this study proposes a novel deep learning model, MKF-NET, which fuses KAN convolution with the Vision Transformer (ViT) and combines multi-scale feature extraction with a dense connection mechanism, significantly improving semantic segmentation performance on remote sensing images. Experiments on the LoveDA dataset systematically evaluated the segmentation performance of MKF-NET against several existing deep learning models (U-Net, UNet++, DeepLabv3+, TransUNet, and U-KAN). Experimental results show that MKF-NET performs best on many indicators, achieving a pixel precision of 78.53%, a pixel accuracy of 79.19%, an average class accuracy of 76.50%, and a mean intersection over union (mIoU) of 64.31%, providing efficient technical support for remote sensing image analysis.

20 pages, 67212 KB  
Article
KPV-UNet: KAN PP-VSSA UNet for Remote Image Segmentation
by Shuiping Zhang, Qiang Rao, Lei Wang, Tang Tang and Chen Chen
Electronics 2025, 14(13), 2534; https://doi.org/10.3390/electronics14132534 - 23 Jun 2025
Viewed by 1471
Abstract
Semantic segmentation of remote sensing images is a key technology for land cover interpretation and target identification. Although convolutional neural networks (CNNs) have achieved remarkable success in this field, their inherent limitation of local receptive fields restricts their ability to model long-range dependencies and global contextual information. As a result, CNN-based methods often struggle to capture the comprehensive spatial context necessary for accurate segmentation in complex remote sensing scenes, leading to issues such as the misclassification of small objects and blurred or imprecise object boundaries. To address these problems, this paper proposes a new hybrid architecture called KPV-UNet, which integrates the Kolmogorov–Arnold Network (KAN) and the Pyramid Pooling Visual State Space Attention (PP-VSSA) block. KPV-UNet introduces a deep feature refinement module based on KAN and incorporates PP-VSSA to enable scalable long-range modeling. This design effectively captures global dependencies and abundant localized semantic content extracted from complex feature spaces, overcoming CNNs’ limitations in modeling long-range dependencies and global context in large-scale complex scenes. In addition, we designed an Auxiliary Local Monitoring (ALM) block that significantly enhances KPV-UNet’s perception of local content. Experimental results demonstrate that KPV-UNet outperforms state-of-the-art methods on the Vaihingen, LoveDA Urban, and WHDLD datasets, achieving mIoU scores of 84.03%, 51.27%, and 62.87%, respectively. The proposed method not only improves segmentation accuracy but also produces clearer and more connected object boundaries in visual results.

21 pages, 20433 KB  
Article
Micro-Terrain Recognition Method of Transmission Lines Based on Improved UNet++
by Feng Yi and Chunchun Hu
ISPRS Int. J. Geo-Inf. 2025, 14(6), 216; https://doi.org/10.3390/ijgi14060216 - 30 May 2025
Cited by 1 | Viewed by 929
Abstract
Micro-terrain recognition plays a crucial role in the planning, design, and safe operation of transmission lines. To achieve intelligent and automatic recognition of micro-terrain surrounding transmission lines, this paper proposes an improved semantic segmentation model based on UNet++. This model expands the single encoder into multiple encoders to accommodate the input of multi-source geographic features and introduces a gated fusion module (GFM) to effectively integrate the data from diverse sources. Additionally, the model incorporates a dual attention network (DA-Net) and a deep supervision strategy to enhance performance and robustness. The multi-source dataset used for the experiment includes the Digital Elevation Model (DEM), Elevation Coefficient of Variation (ECV), and profile curvature. The experimental results of the model comparison indicate that the improved model outperforms common semantic segmentation models in terms of multiple evaluation metrics, with pixel accuracy (PA) and intersection over union (IoU) reaching 92.26% and 85.63%, respectively. Notably, the performance in identifying the saddle and alpine watershed types has been enhanced significantly by the improved model. The ablation experiment results confirm that the introduced modules contribute to enhancing the model’s segmentation performance. Compared to the baseline network, the improved model enhances PA and IoU by 1.75% and 2.96%, respectively.
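A gated fusion module typically blends two feature streams with a learned sigmoid gate, fused = g·a + (1 − g)·b, so the network can decide per element how much to trust each source (e.g., DEM-derived versus ECV-derived features). The GFM internals are not given in this abstract, so the sketch below is a generic illustration with the gate logits supplied directly:

```python
import math

def gated_fuse(a, b, gate_logits):
    """Element-wise gated blend of two feature streams:
    fused = g * a + (1 - g) * b, with g = sigmoid(gate logit).
    In a real GFM the logits would come from a learned layer."""
    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))
    return [sigmoid(z) * x + (1.0 - sigmoid(z)) * y
            for x, y, z in zip(a, b, gate_logits)]
```

A strongly positive logit passes stream `a` through almost unchanged; a strongly negative one defers to stream `b`, letting the fusion adapt to whichever geographic feature is informative at each location.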

23 pages, 59897 KB  
Article
Method to Use Transport Microsimulation Models to Create Synthetic Distributed Acoustic Sensing Datasets
by Ignacio Robles-Urquijo, Juan Benavente, Javier Blanco García, Pelayo Diego Gonzalez, Alayn Loayssa, Mikel Sagues, Luis Rodriguez-Cobo and Adolfo Cobo
Appl. Sci. 2025, 15(9), 5203; https://doi.org/10.3390/app15095203 - 7 May 2025
Cited by 1 | Viewed by 1820
Abstract
This research introduces a new method for creating synthetic Distributed Acoustic Sensing (DAS) datasets from transport microsimulation models. The process involves modeling detailed vehicle interactions, trajectories, and characteristics from the PTV VISSIM transport microsimulation tool. It then applies the Flamant–Boussinesq approximation to simulate the resulting ground deformation detected by virtual fiber-optic cables. These synthetic DAS signals serve as large-scale, scenario-controlled, labeled datasets for training machine learning models for various transport applications. We demonstrate this by training several U-Net convolutional neural networks to enhance spatial resolution (reducing it to half the original gauge length), filtering traffic signals by vehicle direction, and simulating the effects of alternative cable layouts. The methodology is tested using simulations of real road scenarios, featuring a fiber-optic cable buried along the westbound shoulder with sections deviating from the roadside. The U-Net models, trained solely on synthetic data, showed promising performance (e.g., validation MSE down to 0.0015 for directional filtering) and improved the detectability of faint signals, like bicycles among heavy vehicles, when applied to real DAS measurements from the test site. This framework uniquely integrates detailed traffic modeling with DAS physics, providing a novel tool to develop and evaluate DAS signal processing techniques, optimize cable layout deployments, and advance DAS applications in complex transportation monitoring scenarios, with significant potential for smart city initiatives.
(This article belongs to the Special Issue Recent Research on Intelligent Sensors)

28 pages, 13922 KB  
Article
Multi-Class Guided GAN for Remote-Sensing Image Synthesis Based on Semantic Labels
by Zhenye Niu, Yuxia Li, Yushu Gong, Bowei Zhang, Yuan He, Jinglin Zhang, Mengyu Tian and Lei He
Remote Sens. 2025, 17(2), 344; https://doi.org/10.3390/rs17020344 - 20 Jan 2025
Cited by 4 | Viewed by 3633
Abstract
In the scenario of limited labeled remote-sensing datasets, model performance is constrained by the insufficient availability of data. Generative model-based data augmentation has emerged as a promising solution to this limitation. While existing generative models perform well in natural scene domains (e.g., faces and street scenes), their performance in remote sensing is hindered by severe data imbalance and the semantic similarity among land-cover classes. To tackle these challenges, we propose the Multi-Class Guided GAN (MCGGAN), a novel network for generating remote-sensing images from semantic labels. Our model features a dual-branch architecture with a global generator that captures the overall image structure and a multi-class generator that improves the quality and differentiation of land-cover types. To integrate these generators, we design a shared-parameter encoder for consistent feature encoding across the two branches, and a spatial decoder that synthesizes outputs from the class generators, preventing overlap and confusion. Additionally, we employ a perceptual loss (L_VGG) to assess perceptual similarity between generated and real images, and a texture matching loss (L_T) to capture fine texture details. To evaluate the quality of image generation, we tested multiple models on two custom datasets (one from Chongzhou, Sichuan Province, and another from Wuzhen, Zhejiang Province, China) and the public LoveDA dataset. The results show that MCGGAN achieves improvements of 52.86 in FID, 0.0821 in SSIM, and 0.0297 in LPIPS compared to the Pix2Pix baseline. We also conducted comparative experiments to assess the semantic segmentation accuracy of a U-Net before and after incorporating the generated images: data augmentation with the generated images leads to improvements of 4.47% in FWIoU and 3.23% in OA across the Chongzhou and Wuzhen datasets. These experiments show that MCGGAN can be effectively used as a data augmentation approach to improve the performance of downstream remote-sensing image segmentation tasks.

21 pages, 9403 KB  
Article
Link Aggregation for Skip Connection–Mamba: Remote Sensing Image Segmentation Network Based on Link Aggregation Mamba
by Qi Zhang, Guohua Geng, Pengbo Zhou, Qinglin Liu, Yong Wang and Kang Li
Remote Sens. 2024, 16(19), 3622; https://doi.org/10.3390/rs16193622 - 28 Sep 2024
Cited by 12 | Viewed by 4817
Abstract
The semantic segmentation of satellite and UAV remote sensing imagery is pivotal for address exploration, change detection, quantitative analysis and urban planning. Recent advancements have seen an influx of segmentation networks utilizing convolutional neural networks and transformers. However, the intricate geographical features and varied land cover boundary interferences in remote sensing imagery still challenge conventional segmentation networks’ spatial representation and long-range dependency capabilities. This paper introduces a novel U-Net-like network for UAV image segmentation. We developed a link aggregation Mamba at the critical skip connection stage of UNetFormer. This approach maps and aggregates multi-scale features from different stages into a unified linear dimension through four Mamba branches containing state-space models (SSMs), ultimately decoupling and fusing these features to restore the contextual relationships in the mask. Moreover, the Mix-Mamba module is incorporated, leveraging a parallel self-attention mechanism with SSMs to merge the advantages of a global receptive field and reduce modeling complexity. This module facilitates nonlinear modeling across different channels and spaces through multipath activation, catering to global and local long-range dependencies. Evaluations on public remote sensing datasets such as LoveDA, UAVid and Vaihingen underscore the state-of-the-art performance of our approach.
(This article belongs to the Special Issue Deep Learning for Satellite Image Segmentation)
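The abstract describes mapping multi-scale skip-connection features into a unified linear dimension before fusing them. The following is a minimal NumPy sketch of that general idea only, not the paper's implementation: the function name, the random projection weights, and the mean-pooling fusion are all placeholders standing in for the learned Mamba/SSM branches.

```python
import numpy as np

def project_and_fuse(features, d_model, seed=0):
    """Flatten each scale's feature map into a token sequence, project it
    to a shared dimension, and fuse the branches by summation.
    All weights are random placeholders, not learned parameters."""
    rng = np.random.default_rng(seed)
    fused = None
    for fmap in features:                     # fmap: (C, H, W)
        c, h, w = fmap.shape
        seq = fmap.reshape(c, h * w).T        # (H*W, C) token sequence
        proj = rng.standard_normal((c, d_model)) * 0.02
        tokens = seq @ proj                   # (H*W, d_model)
        pooled = tokens.mean(axis=0)          # per-branch global summary
        fused = pooled if fused is None else fused + pooled
    return fused                              # (d_model,)

# four branches at decreasing resolution, as in a U-Net encoder
feats = [np.ones((16, 32, 32)), np.ones((32, 16, 16)),
         np.ones((64, 8, 8)), np.ones((128, 4, 4))]
out = project_and_fuse(feats, d_model=96)
print(out.shape)   # (96,)
```

Projecting every scale to one shared dimension is what lets branches of different channel counts be combined without ad hoc resizing logic.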
25 pages, 4045 KB  
Article
MBT-UNet: Multi-Branch Transform Combined with UNet for Semantic Segmentation of Remote Sensing Images
by Bin Liu, Bing Li, Victor Sreeram and Shuofeng Li
Remote Sens. 2024, 16(15), 2776; https://doi.org/10.3390/rs16152776 - 29 Jul 2024
Cited by 14 | Viewed by 3290
Abstract
Remote sensing (RS) images play an indispensable role in many key fields such as environmental monitoring, precision agriculture, and urban resource management. Traditional deep convolutional neural networks have the problem of limited receptive fields. To address this problem, this paper introduces a hybrid network model that combines the advantages of CNN and Transformer, called MBT-UNet. First, a multi-branch encoder design based on the pyramid vision transformer (PVT) is proposed to effectively capture multi-scale feature information; second, an efficient feature fusion module (FFM) is proposed to optimize the collaboration and integration of features at different scales; finally, in the decoder stage, a multi-scale upsampling module (MSUM) is proposed to further refine the segmentation results and enhance segmentation accuracy. We conduct experiments on the ISPRS Vaihingen dataset, the Potsdam dataset, the LoveDA dataset, and the UAVid dataset. Experimental results show that MBT-UNet surpasses state-of-the-art algorithms in key performance indicators, confirming its superior performance in high-precision remote sensing image segmentation tasks. Full article
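The abstract's feature fusion module (FFM) integrates features at different scales. As a hedged illustration of the basic mechanics only, the sketch below upsamples coarser maps to the finest resolution and concatenates channels; the function names are hypothetical and the real FFM is a learned module, not plain concatenation.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbor upsampling: (C, H, W) -> (C, H*factor, W*factor)."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def fuse_scales(features):
    """Bring every scale to the finest spatial resolution and concatenate
    along channels -- a minimal stand-in for a learned fusion module."""
    _, h0, w0 = features[0].shape
    aligned = [f if f.shape[1] == h0 else upsample_nearest(f, h0 // f.shape[1])
               for f in features]
    return np.concatenate(aligned, axis=0)

# three encoder scales: channels grow as resolution shrinks
feats = [np.ones((16, 32, 32)), np.ones((32, 16, 16)), np.ones((64, 8, 8))]
fused = fuse_scales(feats)
print(fused.shape)  # (112, 32, 32)
```

Aligning resolutions before concatenation is the standard precondition for any cross-scale fusion; the learned part of an FFM then decides how to weight the concatenated channels.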

20 pages, 7074 KB  
Article
Dual Attention-Based 3D U-Net Liver Segmentation Algorithm on CT Images
by Benyue Zhang, Shi Qiu and Ting Liang
Bioengineering 2024, 11(7), 737; https://doi.org/10.3390/bioengineering11070737 - 20 Jul 2024
Cited by 11 | Viewed by 7046
Abstract
The liver is a vital organ in the human body, and CT images can intuitively display its morphology. Physicians rely on liver CT images to observe its anatomical structure and areas of pathology, providing evidence for clinical diagnosis and treatment planning. To assist physicians in making accurate judgments, artificial intelligence techniques are adopted. Addressing the limitations of existing methods in liver CT image segmentation, such as weak contextual analysis and semantic information loss, we propose a novel Dual Attention-Based 3D U-Net liver segmentation algorithm on CT images. The innovations of our approach are summarized as follows: (1) We improve the 3D U-Net network by introducing residual connections to better capture multi-scale information and alleviate semantic information loss. (2) We propose the DA-Block encoder structure to enhance feature extraction capability. (3) We introduce the CBAM module into skip connections to optimize feature transmission in the encoder, reducing semantic gaps and achieving accurate liver segmentation. To validate the effectiveness of the algorithm, experiments were conducted on the LiTS dataset. The results showed that the Dice coefficient and HD95 index for liver images were 92.56% and 28.09 mm, respectively, representing an improvement of 0.84% and a reduction of 2.45 mm compared to 3D Res-UNet. Full article
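The abstract introduces the CBAM module into the skip connections. As a rough sketch of what CBAM computes (channel attention from pooled descriptors, then spatial attention from channel-wise statistics), the NumPy code below uses random placeholder weights and omits the 7x7 convolution of the spatial branch; it illustrates the data flow, not the paper's trained module.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cbam_sketch(fmap, reduction=4, seed=0):
    """Simplified CBAM on a (C, H, W) map: channel attention from
    avg/max-pooled descriptors through a shared bottleneck MLP,
    then spatial attention from channel-wise avg/max maps.
    Weights are random placeholders; the real module learns them."""
    rng = np.random.default_rng(seed)
    c = fmap.shape[0]
    w1 = rng.standard_normal((c, c // reduction)) * 0.1
    w2 = rng.standard_normal((c // reduction, c)) * 0.1
    avg = fmap.mean(axis=(1, 2))                      # (C,)
    mx = fmap.max(axis=(1, 2))                        # (C,)
    ch_att = sigmoid(avg @ w1 @ w2 + mx @ w1 @ w2)    # (C,)
    x = fmap * ch_att[:, None, None]
    sp_att = sigmoid(x.mean(axis=0) + x.max(axis=0))  # (H, W); conv omitted
    return x * sp_att[None, :, :]

x = np.random.default_rng(1).standard_normal((8, 16, 16))
y = cbam_sketch(x)
print(y.shape)  # (8, 16, 16)
```

Because the output keeps the input's shape, such a module drops into a skip connection without altering the decoder's expected tensor sizes, which is what makes it attractive for reducing the encoder-decoder semantic gap.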
