Search Results (48)

Search Parameters:
Keywords = Mamba U-Net

25 pages, 4776 KB  
Article
FireMambaNet: A Multi-Scale Mamba Network for Tiny Fire Segmentation in Satellite Imagery
by Bo Song, Bo Li, Hong Huang, Zhiyong Zhang, Zhili Chen, Tao Yue and Yun Chen
Remote Sens. 2026, 18(7), 1021; https://doi.org/10.3390/rs18071021 - 29 Mar 2026
Abstract
Satellite remote sensing plays an essential role in wildfire monitoring due to its large-scale observation capability. However, fire targets in satellite imagery are typically extremely small, sparsely distributed, and embedded in complex backgrounds, making accurate segmentation highly challenging for existing methods. To address these challenges, this paper proposes a multi-scale Mamba-based network for tiny fire segmentation, named FireMambaNet. The network adopts a nested U-shaped encoder-decoder architecture, primarily consisting of three modules: the Cross-layer Gated Residual U-shaped module (CG-RSU), the Fire-aware Directional Context Modulation module (FDCM), and the Multi-scale Mamba Attention Module (M2AM). The CG-RSU, as the core building block, adaptively suppresses background redundancy and enhances weak fire responses by extracting multi-scale features through cross-layer gating. The FDCM explicitly enhances the network’s ability to perceive anisotropic expansion features of fire points, such as those along the wind direction and terrain orientation, by modeling multi-directional context. The M2AM employs a Mamba state-space model to suppress background interference through global context modeling during cross-scale feature fusion, while enhancing consistency among sparsely distributed tiny fire targets. In addition, experimental validation is conducted on two subsets of the Active Fire dataset with pronounced pixel-level sparsity: Oceania and Asia4. The results show that the proposed method significantly outperforms mainstream CNN, Transformer, and Mamba baselines on both datasets, achieving an IoU of 88.51% and an F1 score of 93.76% on Oceania, and an IoU of 85.65% and an F1 score of 92.26% on Asia4. Compared with the best-performing CNN baseline, IoU improves by 1.81% and 2.07%, respectively. Overall, FireMambaNet demonstrates significant advantages in detecting tiny fire points against complex backgrounds.
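The IoU and F1 figures quoted throughout these abstracts reduce to pixel-level confusion counts; a minimal sketch of both metrics for binary masks (the arrays are illustrative, not data from the paper):

```python
import numpy as np

def iou_f1(pred: np.ndarray, target: np.ndarray):
    """Pixel-level IoU and F1 for binary segmentation masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    tp = np.logical_and(pred, target).sum()   # true positives
    fp = np.logical_and(pred, ~target).sum()  # false positives
    fn = np.logical_and(~pred, target).sum()  # false negatives
    iou = tp / (tp + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return iou, f1

pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
iou, f1 = iou_f1(pred, target)  # tp=2, fp=1, fn=1 -> IoU=0.5, F1=2/3
```

On binary masks F1 equals the Dice coefficient, which is why the two are often reported interchangeably across these papers.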

29 pages, 1942 KB  
Article
Lightweight CNN–Mamba Hybrid Network for Multi-Scale Concrete Crack Segmentation Using Vision Sensors
by Jinfu Guan, Linzhao Cui, Yanjun Chen, Chenglin Yang, Jingwu Wang and Yinuo Huo
Electronics 2026, 15(7), 1362; https://doi.org/10.3390/electronics15071362 - 25 Mar 2026
Abstract
Surface cracking is a key visible indicator of deterioration in concrete infrastructure and is routinely captured by vision sensors during field inspections. To translate inspection imagery into actionable maintenance information, crack delineation must be accurate at the pixel level and robust to challenging conditions where cracks are slender, discontinuous, low-contrast, and easily confused with joints, stains, texture patterns, and illumination artifacts. This study proposes a lightweight CNN–Mamba hybrid segmentation framework built upon Vm-unet for reliable crack mapping under heterogeneous inspection scenarios and resource-constrained deployment. The framework couples boundary-sensitive convolutional features with long-range state-space representations via a spatially modulated convolution design, refines skip-connection features using reciprocal co-modulation attention to suppress background interference, and enhances cross-scale interactions through a decoder interaction fusion scheme to preserve fine-crack continuity and sharp boundaries. Experiments on a multi-source composite dataset and public benchmarks show consistent improvements over representative CNN-, Transformer-, and Mamba-based baselines. The proposed method achieves 80.11% mIoU and 82.05% Dice on the composite dataset, while maintaining an efficient accuracy–cost trade-off (36.049 GFLOPs, 25.991 M parameters). The resulting crack masks provide a dependable basis for inspection-driven quantitative assessment and maintenance decision support.

19 pages, 2147 KB  
Article
Dual-Mamba-ResNet: A Novel Vision State Space Network for Aero-Engine Ablation Detection
by Xin Wang, Hai Shu, Yaxi Xu, Qiang Fu and Jide Qian
Aerospace 2026, 13(3), 273; https://doi.org/10.3390/aerospace13030273 - 15 Mar 2026
Abstract
With the rapid development of the aviation industry, engines operate under extreme conditions of high temperature, high pressure, and high vibration, making them prone to surface damage such as ablation. Ablation not only affects the structural integrity of engine components but also threatens flight safety, making efficient and accurate detection of paramount importance. Traditional detection methods rely on manual visual inspection and non-destructive testing, which suffer from high subjectivity and low efficiency. In recent years, deep learning has achieved significant progress in industrial defect detection. However, conventional CNN- and Transformer-based architectures still suffer from substantial computational overhead and inadequate boundary segmentation accuracy in aero-engine ablation detection. This paper proposes VSS-ResNet, a novel Mamba-based dual-pathway Visual State-Space Residual Neural Network that combines Visual State Space (VSS) modules with ResNet50. This architecture leverages the global modeling capability of VSS modules and the local feature extraction capability of CNNs, effectively enhancing the accuracy and robustness of ablation boundary detection with the support of multi-scale feature fusion modules. Experimental results demonstrate that the proposed method achieves superior performance in mIoU, mPA, and Acc compared to mainstream segmentation models such as U-Net, Pyramid Scene Parsing Network (PSPNet), and DeepLab V3+ on a self-constructed engine endoscopic ablation dataset, validating its potential in intelligent aero-engine inspection.
(This article belongs to the Section Aeronautics)
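The Visual State Space modules above build on Mamba's state-space recurrence; with input selectivity and 2-D scanning stripped away, the per-step update is a plain linear recurrence, which is what yields linear rather than quadratic cost in sequence length. A toy scalar-input sketch (not the paper's implementation):

```python
import numpy as np

def ssm_scan(u, A, B, C):
    """Linear state-space recurrence: x_t = A*x_{t-1} + B*u_t, y_t = C.x_t.

    A is diagonal (stored as a vector), so each step costs O(state_dim)
    and the whole scan is linear in the sequence length.
    """
    x = np.zeros_like(A)
    ys = []
    for u_t in u:
        x = A * x + B * u_t          # state update
        ys.append(float(C @ x))      # readout
    return np.array(ys)

u = np.array([1.0, 0.0, 0.0])        # unit impulse
A = np.array([0.5, 0.9])             # two decaying modes
B = np.array([1.0, 1.0])
C = np.array([1.0, 1.0])
y = ssm_scan(u, A, B, C)             # impulse response: [2.0, 1.4, 1.06]
```

The full Mamba block makes A, B, and C functions of the input ("selective scan") and, in vision variants, runs such scans along several spatial directions.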

15 pages, 3088 KB  
Article
Lightweight Semantic Segmentation Algorithm Based on Gated Visual State Space Models
by Kui Di, Jinming Cheng, Lili Zhang and Yubin Bao
Electronics 2026, 15(6), 1175; https://doi.org/10.3390/electronics15061175 - 12 Mar 2026
Abstract
LiDAR serves as the primary sensor for acquiring environmental information in intelligent driving systems. However, under adverse weather conditions, point cloud signals obtained by LiDAR suffer from intensity attenuation and noise interference, leading to a decline in segmentation accuracy. To address these issues, this paper designs a lightweight semantic segmentation system based on the Gated Visual State Space Model (VMamba), named RainMamba. Specifically, the system utilizes spherical projection to transform point clouds into 2D sequences and constructs a physical perception feature embedding module guided by the Beer–Lambert law to explicitly model and suppress spatial noise at the source. Subsequently, an uncertainty-weighted cross-modal correction module is employed to incorporate RGB images for dynamically calibrating the degraded point cloud data. Finally, a VMamba backbone is adopted to establish global dependencies with linear complexity. Experimental results on the SemanticKITTI dataset demonstrate that the system achieves an inference speed of 83 FPS, with a relative mIoU improvement of approximately 7.2% compared to the real-time baseline PolarNet. Furthermore, zero-shot evaluations on the real-world SemanticSTF dataset validate the system’s robust Sim-to-Real generalization capability. Notably, RainMamba delivers highly competitive accuracy comparable to the state-of-the-art heavy-weight model PTv3 while requiring a significantly lower parameter footprint, thereby demonstrating its immense potential for practical edge-computing deployment.
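The spherical projection step mentioned above is the standard range-view transform used to hand point clouds to 2-D networks; a sketch under typical 64-beam sensor assumptions (the field-of-view defaults are illustrative, not taken from the paper):

```python
import numpy as np

def spherical_project(points, h=64, w=1024, fov_up=3.0, fov_down=-25.0):
    """Project an (N, 3) LiDAR point cloud onto an h x w range image."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(y, x)                  # azimuth in [-pi, pi]
    pitch = np.arcsin(z / r)                # elevation angle
    fov_up_r, fov_down_r = np.radians(fov_up), np.radians(fov_down)
    u = 0.5 * (1.0 - yaw / np.pi) * w                      # image column
    v = (fov_up_r - pitch) / (fov_up_r - fov_down_r) * h   # image row
    u = np.clip(np.floor(u), 0, w - 1).astype(int)
    v = np.clip(np.floor(v), 0, h - 1).astype(int)
    img = np.zeros((h, w))
    img[v, u] = r                           # store range at each pixel
    return img

pts = np.array([[10.0, 0.0, 0.0], [0.0, 10.0, 0.5]])
img = spherical_project(pts)
```

Per-pixel features beyond range (intensity, x/y/z) are usually stacked as channels; RainMamba's Beer–Lambert-guided embedding would operate on such a projected tensor.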

29 pages, 4988 KB  
Article
MARU-MTL: A Mamba-Enhanced Multi-Task Learning Framework for Continuous Blood Pressure Estimation Using Radar Pulse Waves
by Jinke Xie, Juhua Huang, Chongnan Xu, Hongtao Wan, Xuetao Zuo and Guanfang Dong
Bioengineering 2026, 13(3), 320; https://doi.org/10.3390/bioengineering13030320 - 11 Mar 2026
Abstract
Continuous blood pressure (BP) monitoring is essential for the prevention and management of cardiovascular diseases. Traditional cuff-based methods cause discomfort during repeated measurements, and wearable sensors require direct skin contact, limiting their applicability. Radar-based contactless BP measurement has emerged as a promising alternative. However, radar pulse wave (RPW) signals are susceptible to motion artifacts, respiratory interference, and environmental clutter, posing persistent challenges to estimation accuracy and robustness. In this paper, we propose MARU-MTL, a Mamba-enhanced multi-task learning framework for continuous BP estimation using a single millimeter-wave radar sensor. To address signal quality degradation, a Variational Autoencoder-based Signal Quality Index (VAE-SQI) mechanism is proposed to automatically screen RPW segments without manual annotation. To capture long-range temporal dependencies across cardiac cycles, we integrate a Bidirectional Mamba module into the bottleneck of a U-Net backbone, enabling linear-time sequence modeling with respect to the segment length. We also introduce a multi-task learning strategy that couples BP regression with arterial blood pressure waveform reconstruction to strengthen physiological consistency. Extensive experiments on two datasets comprising 55 subjects demonstrate that MARU-MTL achieves mean absolute errors of 3.87 mmHg and 2.93 mmHg for systolic and diastolic BP, respectively, meeting commonly used AAMI error thresholds and achieving metrics comparable to BHS Grade A.
(This article belongs to the Special Issue Contactless Technologies for Patient Health Monitoring)
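The AAMI-style check referenced above compares simple error statistics against fixed limits; a sketch with made-up readings (the commonly cited limits of |mean error| ≤ 5 mmHg and SD ≤ 8 mmHg are an assumption here, as is the toy data):

```python
import numpy as np

def bp_error_stats(pred, ref):
    """Mean error, sample SD of error, and MAE for BP estimates (mmHg)."""
    err = np.asarray(pred, float) - np.asarray(ref, float)
    return err.mean(), err.std(ddof=1), np.abs(err).mean()

pred = [118.0, 121.0, 124.0, 119.0]   # hypothetical estimates
ref = [120.0, 120.0, 122.0, 121.0]    # hypothetical cuff references
me, sd, mae = bp_error_stats(pred, ref)

# AAMI-style screen (limit values assumed for illustration)
passes_aami = abs(me) <= 5.0 and sd <= 8.0
```

Note that MAE (what the abstract reports) and the mean-error/SD pair (what AAMI-style criteria use) are distinct statistics, so both are computed here.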

27 pages, 8552 KB  
Article
A Data-Constrained and Physics-Guided Conditional Diffusion Model for Electrical Impedance Tomography Image Reconstruction
by Xiaolei Zhang and Zhou Rong
Sensors 2026, 26(5), 1728; https://doi.org/10.3390/s26051728 - 9 Mar 2026
Abstract
Electrical impedance tomography (EIT) provides noninvasive, high-temporal-resolution imaging for medical and industrial applications. However, accurate image reconstruction remains challenging due to the severe ill-posedness and nonlinearity of the inverse problem, as well as the limited robustness of existing single-source learning-based methods in real measurement scenarios. To address these limitations, a data-constrained and physics-guided Multi-Source Conditional Diffusion Model (MS-CDM) is proposed for EIT image reconstruction. Unlike conventional conditional diffusion methods that rely on a single measurement or an image prior, MS-CDM utilizes boundary voltage measurements as data-driven constraints and incorporates coarse reconstructions as physics-guided structural priors. This multi-source conditioning strategy provides complementary guidance during the reverse diffusion process, enabling balanced recovery of fine boundary details and global topological consistency. To support this framework, a Hybrid Swin–Mamba Denoising U-Net is developed, combining hierarchical window-based self-attention for local spatial modeling with bidirectional state-space modeling for efficient global dependency capture. Extensive experiments on simulated datasets and three real EIT experimental platforms demonstrate that MS-CDM consistently outperforms state-of-the-art numerical, supervised, and diffusion-based methods in terms of reconstruction accuracy, structural consistency, and noise robustness. Moreover, the proposed model exhibits robust cross-system applicability without system-specific retraining under multi-protocol training, highlighting its practical applicability in diverse real-world EIT scenarios.
(This article belongs to the Section Sensing and Imaging)

20 pages, 1126 KB  
Article
Semi-Supervised Vertebra Segmentation and Identification in CT Images
by You Fu, Jiasen Feng and Hanlin Cheng
Tomography 2026, 12(3), 33; https://doi.org/10.3390/tomography12030033 - 3 Mar 2026
Abstract
Background/Objectives: Automatic segmentation and identification of vertebrae in spinal CT are essential for assisting diagnosis of spinal disorders and for preoperative planning. The task is challenging due to the high structural similarity between adjacent vertebrae and the morphological variability of vertebrae. Most existing methods rely on fully supervised deep learning and, constrained by limited annotations, struggle to remain robust in complex scenarios. Methods: We propose a semi-supervised approach built on a dual-branch 3D U-Net. Mamba modules are inserted between the encoder and decoder to model long-range dependencies along the cranio–caudal axis. The identification branch employs a 3D convolutional block attention module (3D-CBAM) to enhance class discriminability. A unified semi-supervised objective is formulated via teacher–student consistency: for each unlabeled sample, weakly and strongly augmented views are generated, and cross-branch consistency is enforced, together with confidence-based filtering and class-frequency reweighting. In addition, a connected-component analysis is used to enforce anatomically plausible sequential continuity of vertebral indices in the outputs. Results: Experiments on VerSe 2019 and 2020 show that, on the public VerSe 2019 test set (with VerSe 2020 scans used as unlabeled training data), the supervised baseline achieved a Dice score of 89.8% and an identification accuracy of 92.3%. Incorporating unlabeled data improved performance to 91.6% Dice and 97.5% identification accuracy (gains of +1.8 and +5.2 percentage points). Compared with competing methods, the proposed semi-supervised model attains higher or comparable segmentation accuracy and the highest identification accuracy. Conclusions: Without additional annotation cost, the proposed method markedly improves the overall performance of vertebra segmentation and identification, offering more robust automated support for clinical workflows.
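The confidence-filtered teacher–student consistency described above can be sketched as pseudo-label cross-entropy under a confidence mask (toy logits; the threshold value is illustrative, not the paper's):

```python
import numpy as np

def consistency_loss(student_logits, teacher_logits, tau=0.9):
    """Pseudo-label cross-entropy with confidence-based filtering.

    Teacher probabilities (weakly augmented view) supervise the student
    (strongly augmented view) only where the teacher is confident.
    """
    def softmax(z):
        z = z - z.max(axis=-1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    p_t = softmax(teacher_logits)
    conf = p_t.max(axis=-1)                  # teacher confidence per voxel
    pseudo = p_t.argmax(axis=-1)             # hard pseudo-labels
    mask = conf >= tau                       # keep confident voxels only
    if not mask.any():
        return 0.0
    p_s = softmax(student_logits)
    nll = -np.log(p_s[np.arange(len(pseudo)), pseudo] + 1e-12)
    return float((nll * mask).sum() / mask.sum())

teacher = np.array([[5.0, 0.0], [0.1, 0.0]])  # one confident, one uncertain
student = np.array([[4.0, 0.0], [0.0, 0.1]])
loss = consistency_loss(student_logits=student, teacher_logits=teacher)
```

The paper additionally reweights this term by class frequency, which would simply scale each voxel's contribution before the masked average.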

22 pages, 45754 KB  
Article
Chrominance-Aware Multi-Resolution Network for Aerial Remote Sensing Image Fusion
by Shuying Li, Jiaxin Cheng, San Zhang and Wuwei Wang
Remote Sens. 2026, 18(3), 431; https://doi.org/10.3390/rs18030431 - 29 Jan 2026
Abstract
Spectral data obtained from upstream remote sensing tasks contain abundant complementary information. Infrared images are rich in radiative information, and visible images provide spatial details. Effective fusion of these two modalities improves the utilization of remote sensing data and provides a more comprehensive representation of target characteristics and texture details. The majority of current fusion methods focus primarily on intensity fusion between infrared and visible images. These methods ignore the chrominance information present in visible images and the interference introduced by infrared images on the color of fusion results. Consequently, the fused images exhibit inadequate color representation. To address these challenges, an infrared and visible image fusion method named Chrominance-Aware Multi-Resolution Network (CMNet) is proposed. CMNet integrates the Mamba module, which offers linear complexity and global awareness, into a U-Net framework to form the Multi-scale Spatial State Attention (MSSA) framework. Furthermore, the enhancement of the Mamba module through the design of the Chrominance-Enhanced Fusion (CEF) module leads to better color and detail representation in the fused image. Extensive experimental results show that the CMNet method delivers better performance compared to existing fusion methods across various evaluation metrics.
(This article belongs to the Section Remote Sensing Image Processing)
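A classic baseline for the chrominance problem CMNet targets is to fuse only the luma channel and carry the visible image's chroma through unchanged; a sketch using BT.601-style conversion, with a simple max rule standing in for the learned MSSA/CEF fusion (purely illustrative, not the paper's method):

```python
import numpy as np

def chroma_preserving_fusion(vis_rgb, ir):
    """Fuse IR intensity into the visible image's luma, keeping chroma."""
    r, g, b = vis_rgb[..., 0], vis_rgb[..., 1], vis_rgb[..., 2]
    # RGB -> YCbCr (BT.601-style coefficients), values in [0, 1]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 0.5 + (b - y) * 0.564
    cr = 0.5 + (r - y) * 0.713
    y_f = np.maximum(y, ir)                  # placeholder fusion rule
    # YCbCr -> RGB with the fused luma and the original chroma
    r_f = y_f + 1.403 * (cr - 0.5)
    g_f = y_f - 0.344 * (cb - 0.5) - 0.714 * (cr - 0.5)
    b_f = y_f + 1.773 * (cb - 0.5)
    return np.clip(np.stack([r_f, g_f, b_f], axis=-1), 0.0, 1.0)

vis = np.full((2, 2, 3), 0.2)   # dark gray visible patch
ir = np.full((2, 2), 0.8)       # bright IR response
fused = chroma_preserving_fusion(vis, ir)
```

Because intensity-only fusion like this can still shift colors when chroma correlates with luma, CMNet instead learns the chrominance interaction explicitly.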

31 pages, 17740 KB  
Article
HR-UMamba++: A High-Resolution Multi-Directional Mamba Framework for Coronary Artery Segmentation in X-Ray Coronary Angiography
by Xiuhan Zhang, Peng Lu, Zongsheng Zheng and Wenhui Li
Fractal Fract. 2026, 10(1), 43; https://doi.org/10.3390/fractalfract10010043 - 9 Jan 2026
Abstract
Coronary artery disease (CAD) remains a leading cause of mortality worldwide, and accurate coronary artery segmentation in X-ray coronary angiography (XCA) is challenged by low contrast, structural ambiguity, and anisotropic vessel trajectories, which hinder quantitative coronary angiography. We propose HR-UMamba++, a U-Mamba-based framework centered on a rotation-aligned multi-directional state-space scan for modeling long-range vessel continuity across multiple orientations. To preserve thin distal branches, the framework is equipped with (i) a persistent high-resolution bypass that injects undownsampled structural details and (ii) a UNet++-style dense decoder topology for cross-scale topological fusion. On an in-house dataset of 739 XCA images from 374 patients, HR-UMamba++ is evaluated using eight segmentation metrics, fractal-geometry descriptors, and multi-view expert scoring. Compared with U-Net, Attention U-Net, HRNet, U-Mamba, DeepLabv3+, and YOLO11-seg, HR-UMamba++ achieves the best performance (Dice 0.8706, IoU 0.7794, HD95 16.99), yielding a relative Dice improvement of 6.0% over U-Mamba and reducing the deviation in fractal dimension by up to 57% relative to U-Net. Expert evaluation across eight angiographic views yields a mean score of 4.24 ± 0.49/5 with high inter-rater agreement. These results indicate that HR-UMamba++ produces anatomically faithful coronary trees and clinically useful segmentations that can serve as robust structural priors for downstream quantitative coronary analysis.
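The fractal-geometry descriptors used in the evaluation above are typically box-counting dimensions of the segmented vessel tree; a minimal estimator (the straight-line test case, whose dimension is 1, is illustrative):

```python
import numpy as np

def box_counting_dimension(mask, sizes=(1, 2, 4, 8, 16)):
    """Estimate the box-counting fractal dimension of a binary mask.

    Counts occupied boxes N(s) at several box sizes s and fits the
    slope of log N(s) against log(1/s).
    """
    mask = np.asarray(mask, bool)
    counts = []
    for s in sizes:
        h, w = mask.shape[0] // s, mask.shape[1] // s
        # group pixels into s x s boxes; a box counts if any pixel is set
        boxes = mask[: h * s, : w * s].reshape(h, s, w, s).any(axis=(1, 3))
        counts.append(boxes.sum())
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return slope

line = np.zeros((64, 64), bool)
line[32, :] = True                 # a straight line: dimension ~ 1
d = box_counting_dimension(line)
```

Comparing this dimension between a predicted and a reference mask gives the "deviation in fractal dimension" figure the abstract cites.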

20 pages, 6322 KB  
Article
MAEM-ResUNet: Accurate Glioma Segmentation in Brain MRI via Symmetric Multi-Directional Mamba and Dual-Attention Modules
by Deguo Yang, Boming Yang and Jie Yan
Symmetry 2026, 18(1), 1; https://doi.org/10.3390/sym18010001 - 19 Dec 2025
Abstract
Gliomas are among the most common and aggressive malignant brain tumors. Their irregular morphology and fuzzy boundaries pose substantial challenges for automatic segmentation in MRI. Accurate delineation of tumor subregions is crucial for treatment planning and outcome assessment. This study proposes MAEM-ResUNet, an extension of the ResUNet architecture that integrates three key modules: a multi-scale adaptive attention module for joint channel–spatial feature selection, a symmetric multi-directional Mamba block for long-range context modeling, and an adaptive edge attention module for boundary refinement. Experimental results on the BraTS2020 and BraTS2021 datasets demonstrate that MAEM-ResUNet outperforms mainstream methods. On BraTS2020, it achieves an average Dice Similarity Coefficient of 91.19% and an average Hausdorff Distance (HD) of 5.27 mm; on BraTS2021, the average Dice coefficient is 89.67% and the average HD is 5.87 mm, both showing improvements compared to other mainstream models. Meanwhile, ablation experiments confirm the synergistic effect of the three modules, which significantly enhances the accuracy of glioma segmentation and the precision of boundary localization.
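The Hausdorff distance reported above measures worst-case disagreement between predicted and reference boundaries; a minimal point-set sketch (the HD95 variant used by some of the papers here replaces the max with a 95th percentile to damp outliers):

```python
import numpy as np

def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between point sets a (N,2) and b (M,2)."""
    # pairwise distances, shape (N, M)
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # max over both directed distances
    return max(d.min(axis=1).max(), d.min(axis=0).max())

a = np.array([[0.0, 0.0], [1.0, 0.0]])   # e.g. predicted boundary points
b = np.array([[0.0, 0.0], [4.0, 0.0]])   # e.g. reference boundary points
hd = hausdorff_distance(a, b)            # directed: a->b = 1, b->a = 3
```

In practice the point sets are the boundary voxels of the two masks, and distances are scaled by the voxel spacing to report millimeters.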

21 pages, 2975 KB  
Article
FFM-Net: Fusing Frequency Selection Information with Mamba for Skin Lesion Segmentation
by Lifang Chen, Entao Yu, Qihang Cao and Ke Hu
Information 2025, 16(12), 1102; https://doi.org/10.3390/info16121102 - 13 Dec 2025
Abstract
Accurate segmentation of lesion regions is essential for skin cancer diagnosis. Because dermoscopic images of skin lesions exhibit varying sizes, diverse shapes, and fuzzy boundaries, accurate segmentation remains highly challenging. To address these issues, we propose a new dermatologic image segmentation network, FFM-Net. In FFM-Net, we design a new FM block encoder based on state space models (SSMs), which integrates a low-frequency information extraction module (LEM) and an edge detail extraction module (EEM) to extract broader overall structural information and more accurate edge detail information, respectively. At the same time, we dynamically adjust the input channel ratios of the two module branches at different stages of our network, so that the model can learn the correlation between overall structure and edge detail features more effectively. Furthermore, we design the cross-channel spatial attention (CCSA) module to improve the model’s sensitivity to channel and spatial dimensions. We deploy a multi-level feature fusion module (MFFM) at the bottleneck layer to aggregate rich multi-scale contextual representations. Finally, we conducted extensive experiments on three publicly available skin lesion segmentation datasets, ISIC2017, ISIC2018, and PH2, and the experimental results show that the FFM-Net model outperforms most existing skin lesion segmentation methods.
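A simple way to see the low-frequency/edge split that the LEM and EEM branches formalize is an FFT low-pass decomposition; an illustrative sketch, not the paper's modules (the mask radius is arbitrary):

```python
import numpy as np

def frequency_split(img, radius=8):
    """Split an image into low- and high-frequency parts with an FFT mask.

    A circular low-pass mask in the Fourier domain keeps coarse structure
    (cf. the LEM branch); the residual carries edges and fine detail
    (cf. the EEM branch).
    """
    f = np.fft.fftshift(np.fft.fft2(img))     # centered spectrum
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2
    low = np.real(np.fft.ifft2(np.fft.ifftshift(f * mask)))
    return low, img - low                     # low-pass and residual

img = np.add.outer(np.linspace(0, 1, 32), np.linspace(0, 1, 32))
low, high = frequency_split(img)
```

By construction the two parts sum back to the input, which is what lets a network process them in separate branches without losing information.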

22 pages, 1479 KB  
Article
VMPANet: Vision Mamba Skin Lesion Image Segmentation Model Based on Prompt and Attention Mechanism Fusion
by Zinuo Peng, Shuxian Liu and Chenhao Li
J. Imaging 2025, 11(12), 443; https://doi.org/10.3390/jimaging11120443 - 11 Dec 2025
Abstract
In the realm of medical image processing, the segmentation of dermatological lesions is a pivotal technique for the early detection of skin cancer. However, existing methods for segmenting images of skin lesions often encounter limitations when dealing with intricate boundaries and diverse lesion shapes. To address these challenges, we propose VMPANet, designed to accurately localize critical targets and capture edge structures. VMPANet employs an inverted pyramid convolution to extract multi-scale features while utilizing the visual Mamba module to capture long-range dependencies among image features. Additionally, we leverage previously extracted masks as cues to facilitate efficient feature propagation. Furthermore, VMPANet integrates parallel depthwise separable convolutions to enhance feature extraction and introduces innovative mechanisms for edge enhancement, spatial attention, and channel attention to adaptively extract edge information and complex spatial relationships. Notably, VMPANet refines a novel cross-attention mechanism, which effectively facilitates the interaction between deep semantic cues and shallow texture details, thereby generating comprehensive feature representations while reducing computational load and redundancy. We conducted comparative and ablation experiments on two public skin lesion datasets (ISIC2017 and ISIC2018). The results demonstrate that VMPANet outperforms existing mainstream methods. On the ISIC2017 dataset, its mIoU and DSC metrics are 1.38% and 0.83% higher than those of VM-Unet, respectively; on the ISIC2018 dataset, these metrics are 1.10% and 0.67% higher than those of EMCAD, respectively. Moreover, VMPANet boasts a parameter count of only 0.383 M and a computational load of 1.159 GFLOPs.
(This article belongs to the Section Medical Imaging)
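The parameter savings behind the depthwise separable convolutions mentioned above are easy to verify by counting weights; a quick sketch (the channel sizes are arbitrary examples):

```python
def conv_params(c_in, c_out, k):
    """Weight counts (no bias): standard vs depthwise-separable conv."""
    standard = c_in * c_out * k * k               # full k x k conv
    separable = c_in * k * k + c_in * c_out       # depthwise + 1x1 pointwise
    return standard, separable

std, sep = conv_params(c_in=64, c_out=64, k=3)
# 64*64*9 = 36864 vs 64*9 + 64*64 = 4672, roughly 7.9x fewer parameters
```

Savings of this kind across the network are what make sub-megaparameter models like the 0.383 M figure above attainable.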

23 pages, 9897 KB  
Article
HyMambaNet: Efficient Remote Sensing Water Extraction Method Combining State Space Modeling and Multi-Scale Features
by Handan Liu, Guangyi Mu, Kai Li, Haowei Zhang, Yibo Sun, Hongqing Sun and Sijia Li
Sensors 2025, 25(24), 7414; https://doi.org/10.3390/s25247414 - 5 Dec 2025
Abstract
Accurate segmentation of water bodies from high-resolution remote sensing imagery is crucial for water resource management and ecological monitoring. However, small and morphologically complex water bodies remain difficult to detect due to scale variations, blurred boundaries, and heterogeneous backgrounds. This study aims to develop a robust and scalable deep learning framework for high-precision water body extraction across diverse hydrological and ecological scenarios. To address these challenges, we propose HyMambaNet, a hybrid deep learning model that integrates convolutional local feature extraction with the Mamba state space model for efficient global context modeling. The network further incorporates multi-scale and frequency-domain enhancement as well as optimized skip connections to improve boundary precision and segmentation robustness. Experimental results demonstrate that HyMambaNet significantly outperforms existing CNN- and Transformer-based methods. On the LoveHY dataset, it achieves 74.82% IoU and 88.87% F1-score, exceeding UNet by 7.49% IoU and 7.12% F1. On the LoveDA dataset, it attains 81.30% IoU and 89.99% F1-score, surpassing advanced models such as Deeplabv3+, AttenUNet, and TransUNet. These findings confirm that HyMambaNet provides an efficient and generalizable solution for large-scale water resource monitoring and ecological applications based on remote sensing imagery.
(This article belongs to the Section Environmental Sensing)

28 pages, 4643 KB  
Article
JM-Guided Sentinel 1/2 Fusion and Lightweight APM-UNet for High-Resolution Soybean Mapping
by Ruyi Wang, Jixian Zhang, Xiaoping Lu, Zhihe Fu, Guosheng Cai, Bing Liu and Junfeng Li
Remote Sens. 2025, 17(24), 3934; https://doi.org/10.3390/rs17243934 - 5 Dec 2025
Abstract
Accurate soybean mapping is critical for food–oil security and cropping assessment, yet spatiotemporal heterogeneity arising from fragmented parcels and phenological variability reduces class separability and robustness. This study aims to deliver a high-resolution, reusable pipeline and quantify the marginal benefits of feature selection and architecture design. We built a full-season multi-temporal Sentinel-1/2 stack and derived candidate optical/SAR features (raw bands, vegetation indices, textures, and polarimetric terms). Jeffries–Matusita (JM) distance was used for feature–phase joint selection, producing four comparable feature sets. We propose a lightweight APM-UNet: an Attention Sandglass Layer (ASL) in the shallow path to enhance texture/boundary details, and a Parallel Vision Mamba layer (PVML with Mamba-SSM) in the middle/bottleneck to model long-range/global context with near-linear complexity. Under a unified preprocessing and training/evaluation protocol, the four feature sets were paired with U-Net, SegFormer, Vision-Mamba, and APM-UNet, yielding 16 controlled configurations. Results showed consistent gains from JM-guided selection across architectures; given the same features, APM-UNet systematically outperformed all baselines. The best setup (JM-selected composite features + APM-UNet) achieved PA 92.81%, OA 97.95%, Kappa 0.9649, Recall 91.42%, IoU 0.7986, and F1 0.9324, improving PA and OA by ~7.5 and 6.2 percentage points over the corresponding full-feature counterpart. These findings demonstrate that JM-guided, phenology-aware features coupled with a lightweight local–global hybrid network effectively mitigate heterogeneity-induced uncertainty, improving boundary fidelity and overall consistency while maintaining efficiency, offering a potentially transferable framework for soybean mapping in complex agricultural landscapes.
(This article belongs to the Special Issue Machine Learning of Remote Sensing Imagery for Land Cover Mapping)
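The Jeffries–Matusita distance used for feature selection above is computed from per-class Gaussian statistics; a sketch for the two-class case (toy means and covariances):

```python
import numpy as np

def jm_distance(m1, c1, m2, c2):
    """Jeffries-Matusita distance between two Gaussian class models.

    JM = 2 * (1 - exp(-B)) with B the Bhattacharyya distance; JM
    saturates at 2 for fully separable classes, which makes it
    convenient for ranking candidate features.
    """
    m1, m2 = np.asarray(m1, float), np.asarray(m2, float)
    c1 = np.atleast_2d(c1).astype(float)
    c2 = np.atleast_2d(c2).astype(float)
    c = (c1 + c2) / 2.0
    dm = m1 - m2
    b = (dm @ np.linalg.solve(c, dm)) / 8.0 + 0.5 * np.log(
        np.linalg.det(c) / np.sqrt(np.linalg.det(c1) * np.linalg.det(c2))
    )
    return 2.0 * (1.0 - np.exp(-b))

# identical classes -> JM = 0; well-separated classes -> JM near 2
same = jm_distance([0.0], [[1.0]], [0.0], [[1.0]])
far = jm_distance([0.0], [[1.0]], [10.0], [[1.0]])
```

Feature–phase combinations would be ranked by JM computed from class samples of each candidate feature, keeping those closest to the saturation value of 2.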

29 pages, 39944 KB  
Article
HDR-IRSTD: Detection-Driven HDR Infrared Image Enhancement and Small Target Detection Based on HDR Infrared Image Enhancement
by Fugui Guo, Pan Chen, Weiwei Zhao and Weichao Wang
Automation 2025, 6(4), 86; https://doi.org/10.3390/automation6040086 - 2 Dec 2025
Abstract
Infrared small target detection has become a research hotspot in recent years. Due to the small target size and low contrast with the background, it remains a highly challenging task. Existing infrared small target detection algorithms are generally implemented on 8-bit low dynamic range (LDR) images, whereas raw infrared sensing images typically possess a 14–16 bit high dynamic range (HDR). Conventional HDR image enhancement methods do not consider the subsequent detection task. As a result, the enhanced LDR images often suffer from overexposure, increased noise levels with higher contrast, and target distortion or loss. Consequently, discriminative features in HDR images that are beneficial for detection are not effectively exploited, which further increases the difficulty of small target detection. To extract target features under these conditions, existing detection algorithms usually rely on large parameter models, leading to an unsatisfactory trade-off between efficiency and accuracy. To address these issues, this paper proposes a novel infrared small target detection framework based on HDR image enhancement (HDR-IRSTD). Specifically, a multi-branch feature extraction and fusion mapping subnetwork (MFEF-Net) is designed to achieve the mapping from HDR to LDR. This subnetwork effectively enhances small targets and suppresses noise while preserving both detailed features and global information. Furthermore, considering the characteristics of infrared small targets, an asymmetric Vision Mamba U-Net with multi-level inputs (AVM-Unet) is developed, which captures contextual information effectively while maintaining linear computational complexity. During training, a bilevel optimization strategy is adopted to collaboratively optimize the two subnetworks, thereby yielding optimal parameters for both HDR infrared image enhancement and small target detection. Experimental results demonstrate that the proposed method achieves visually favorable enhancement and high-precision detection, with strong generalization ability and robustness. The performance and efficiency of the method exhibit a well-balanced trade-off.
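A conventional baseline for the HDR-to-LDR mapping that MFEF-Net learns is global percentile stretching; a sketch (percentile choices are illustrative) that also hints at why such global mappings can wash out the small hot targets the paper cares about:

```python
import numpy as np

def hdr_to_ldr(raw, p_lo=1.0, p_hi=99.0):
    """Map a 14-16 bit infrared frame to 8-bit via percentile stretching.

    Clips extreme radiances at the chosen percentiles, then linearly
    rescales to [0, 255]. A tiny hot target can fall inside the clipped
    tail and lose contrast, the failure mode detection-driven
    enhancement is designed to avoid.
    """
    lo, hi = np.percentile(raw, [p_lo, p_hi])
    ldr = (np.clip(raw, lo, hi) - lo) / max(hi - lo, 1e-6)
    return (ldr * 255.0).astype(np.uint8)

raw = np.linspace(0, 2 ** 14 - 1, 256 * 256).reshape(256, 256)
ldr = hdr_to_ldr(raw)
```

HDR-IRSTD replaces this fixed global mapping with a learned one whose parameters are tuned jointly with the downstream detector.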
