This paper addresses the issues of blurred details, low contrast, and feature degradation in insulator images under harsh meteorological conditions, as well as the challenges of high computational complexity and insufficient real-time performance when deploying existing deep learning models on edge devices. It proposes a lightweight insulator defect detection method that integrates an improved image enhancement algorithm. The method introduces Mahalanobis distance-based modulation weight optimization for scene depth estimation and improves the color decay prior model to effectively enhance foggy insulator images. It further designs a lightweight detection network integrating region-aware routing attention mechanisms, utilizing multi-scale feature fusion strategies to achieve precise insulator identification and localization. Experimental results demonstrate that the proposed method significantly enhances inference speed while maintaining detection accuracy, effectively adapting to edge computing devices. This provides a viable technical solution for real-time deployment in intelligent transmission line inspection systems. Full article

(This article belongs to the Special Issue AI Applications for Smart Grid: 2nd Edition)

28 pages, 3725 KB

Open Access Article

Integrated Assessment of Water Resource Carrying Capacity: Dynamics, Obstacles, Coordination and Driving Mechanisms in the Gansu Section of the Yellow River Basin, China

by Jianrong Xiao, Jinxia Zhang, Guohua He, Haiyan Li, Liangliang Du, Runheng Yang, Meng Yin, Pengliang Tian, Yangang Yang, Qingzhuo Li, Xi Wei and Yingru Xie

Water 2026, 18(6), 761; https://doi.org/10.3390/w18060761 - 23 Mar 2026

Abstract

Accurately assessing dynamic water resource carrying capacity (WRCC) is essential and challenging, particularly in regions like the Gansu sections of the Yellow River Basin (GSYRB), a core water source protection zone in the arid northwest of China, due to its pressing challenge of balancing water resources for socioeconomic needs and ecological security. This study proposes a novel integrated computational assessment framework named SD-VIKOR to address the complexities arising from nonlinear interactions within the “water resources–socioeconomic–ecological environment” (W–S–E) system. The core of this framework is the tight coupling of a system dynamics (SD) simulation model with a VIKOR multi-criteria evaluation module, where indicator weights are objectively–subjectively determined via an Analytic Hierarchy Process (AHP)–entropy weight method. This integrated SD-VIKOR engine enables dynamic, scenario-based WRCC trajectory simulation. To move beyond simulation and enable mechanistic insight, the framework further incorporates a diagnostic suite: a Geodetector module quantifies dominant drivers and their interactions; an obstacle degree model pinpoints key limiting factors; and a coupling coordination degree model evaluates subsystem synergies. Together, they form a closed-loop “dynamic simulation → multi-criteria assessment → driving mechanism analysis and constraint diagnosis → subsystem coordination analysis” workflow. Applied to the GSYRB from 2012 to 2030 under five development scenarios, the framework demonstrated high efficacy. It successfully captured path-dependent WRCC evolution, revealing that the ecological-priority scenario (B2), which shifts system drivers from economic-scale expansion to resource-efficiency and environmental governance, yielded optimal WRCC and the highest system coordination. In contrast, business-as-usual and single-minded economic expansion scenarios underperformed. 
Six key obstacle factors were quantitatively identified, linking WRCC constraints to natural endowments, economic patterns, and domestic demand. The results reveal pronounced spatial–temporal heterogeneity in WRCC across the GSYRB, with socioeconomic development, water resource use efficiency, and ecological conditions acting as the primary joint drivers of WRCC evolution. Critically, several key indicators are identified as persistent constraints on regional water sustainability. In contrast to conventional static evaluations, the integrated framework captures the complex dynamics and multi-subsystem interactions governing WRCC, offering a more robust diagnostic of resource–environment systems. These insights provide a transferable analytical basis for designing sustainable water management strategies in arid river basins. Full article
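The entropy weight step named in the abstract above can be illustrated with a minimal sketch. It covers only the objective (entropy) side of the AHP–entropy weighting. The abstract does not say how the two are combined, so no combination rule is shown. The function name `entropy_weights` and the toy matrix are hypothetical, and the indicators are assumed to be positive and already normalized so that larger is better.

```python
import math


def entropy_weights(matrix):
    """Entropy weights for indicators (columns) observed over samples (rows).

    Assumes every value is positive and already normalized, and that there
    are at least two rows (log(1) would otherwise divide by zero).
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    divergences = []
    for j in range(n_cols):
        column = [row[j] for row in matrix]
        total = sum(column)
        shares = [value / total for value in column]
        # Normalized Shannon entropy of the column, in [0, 1].
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n_rows)
        # Indicators that vary more carry more information (lower entropy).
        divergences.append(1.0 - entropy)
    weight_sum = sum(divergences)
    return [d / weight_sum for d in divergences]


# Toy data (hypothetical): the first indicator is constant, the second varies.
weights = entropy_weights([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
```

In this toy case the constant indicator receives essentially zero weight, which is the intended behavior of the method: an indicator that never changes cannot discriminate between years or scenarios.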

(This article belongs to the Section Hydrology)

20 pages, 8955 KB

Open Access Article

Language-Guided Contrastive Learning and Difference Enhancement for Semantic Change Detection in Remote Sensing Images

by Yongli Hu, Lintian Ren, Huajie Jiang, Kan Guo, Tengfei Liu, Junbin Gao, Yanfeng Sun and Baocai Yin

Remote Sens. 2026, 18(6), 964; https://doi.org/10.3390/rs18060964 - 23 Mar 2026

Abstract

Semantic change detection (SCD) in remote sensing images aims not only to localize changed regions but also to identify their specific “from–to” semantic transitions. This task remains challenging due to the inherent semantic ambiguity of spectral changes and the presence of pseudo-change noise. While recent vision–language models have shown promise in remote sensing, existing approaches like RemoteCLIP predominantly focus on static scene classification, lacking the ability to explicitly model dynamic temporal transitions. Other adaptations of foundation models (e.g., AdaptVFMs-RSCD) often rely on heavy backbones, incurring prohibitive computational costs. To address these limitations, this paper proposes LGDENet, a lightweight, end-to-end framework that unifies Language-Guided Temporal Contrastive Learning with a noise-robust difference enhancement mechanism. Specifically, we construct a temporal transition prompt learning strategy that aligns visual difference features with textual descriptions of dynamic processes, thereby resolving directional semantic ambiguities. Furthermore, we introduce a Difference Enhancement Module (DEM) that leverages the channel–spatial decoupling property of depthwise separable convolutions to adaptively isolate and suppress irrelevant variations (e.g., registration errors) before feature fusion. Experiments on the SECOND and Landsat-SCD datasets demonstrate that LGDENet achieves state-of-the-art performance, yielding a semantic F1 score ($F_{scd}$) of 87.90% and 88.71%, respectively. Moreover, with a modest parameter count of 33.45 M, it offers a superior trade-off between accuracy and efficiency compared to heavy foundation model-based approaches. Full article

30 pages, 2355 KB

Open Access Article

SGCAD: A SAR-Guided Confidence-Gated Distillation Framework of Optical and SAR Images for Water-Enhanced Land-Cover Semantic Segmentation

by Junjie Ma, Zhiyi Wang, Yanyi Yuan and Fengming Hu

Remote Sens. 2026, 18(6), 962; https://doi.org/10.3390/rs18060962 - 23 Mar 2026

Abstract

Multimodal fusion of synthetic aperture radar (SAR) and optical imagery is widely used in Earth observation for applications such as land-cover mapping and surface-water mapping (including post-event flood mapping under near-synchronous acquisitions) and land-use inventory. Optical images provide rich spectral and texture cues, whereas SAR offers all-weather structural information that is complementary but heterogeneous. In practice, this heterogeneity often introduces fusion conflicts in multi-class segmentation, causing critical categories such as water bodies to be under-optimized. To address this issue, this paper presents a SAR-guided class-aware knowledge distillation (SGCAD) method for multimodal semantic segmentation. First, a SAR-only HRNet is trained as a water-expert teacher to learn discriminative backscattering and boundary priors for water extraction. Second, a lightweight multimodal student model (LightMCANet) is optimized using a class-aware distillation strategy that transfers teacher knowledge only within high-confidence water regions, thereby suppressing noisy supervision and reducing interference to other classes. Third, a SAR edge guidance module (SEGM) is introduced in the decoder to enhance boundary continuity for slender structures such as water bodies and roads. Overall, SGCAD improves targeted category learning while maintaining stable performance across the remaining classes. Experiments on a self-built dataset from GF-1 optical and LuTan-1 SAR imagery demonstrate higher overall accuracy and more coherent water/road predictions than representative baselines. Future work will extend the proposed distillation scheme to additional categories and broader geographic scenes. Full article

(This article belongs to the Section Remote Sensing Image Processing)

20 pages, 7591 KB

Open Access Article

Research on Landslide Hazard Detection in Ya’an Region Based on an Improved YOLO Model

by Kewei Cui, Meng Huang, Weiling Zhang, Guang Yang, Yongxiong Huang, Zhengyi Wu, Zhiwei Zhai and Chao Cheng

Remote Sens. 2026, 18(6), 957; https://doi.org/10.3390/rs18060957 - 23 Mar 2026

Abstract

Landslide hazards occur frequently in the Ya’an region; therefore, accurately identifying and delineating potential landslide areas is crucial for disaster prevention and mitigation. Although deep learning-based detection methods using optical remote sensing imagery are widely adopted, the complex terrain and diverse land cover in this area often result in blurred boundaries and weakened textural features, making it difficult to precisely define spatial extents. To overcome these challenges, this study proposes an improved YOLOv11 model for landslide detection. Building on the YOLOv11 baseline, we designed a novel Multi-Scale Detail Enhancement module and integrated it into the neck network to effectively aggregate shallow-level details with deep-level semantic information, thereby enhancing the model’s ability to represent ambiguous boundaries. Additionally, we incorporated the lightweight SimAM attention mechanism into the backbone network. This mechanism dynamically suppresses background noise based on an energy minimization principle, improving feature discriminability within landslide regions and enabling precise boundary boxes. We conducted validation experiments in the Ya’an region using a custom dataset constructed from high-resolution UAV orthoimagery, comparing our method against mainstream models such as YOLOv8 and YOLOv10. The results show that the proposed improved YOLOv11 model achieves a precision of 90.2%, a recall of 84.8%, and an mAP of 92.7%. This enhanced performance demonstrates the model’s effectiveness in detecting landslides under complex terrain conditions, providing a practical technical reference for efficient hazard screening and dynamic monitoring. Full article

20 pages, 4497 KB

Open Access Article

Remote Sensing Identification of Benggang Using a Two-Stream Network with Multimodal Feature Enhancement and Sparse Attention

by Xuli Rao, Qihao Chen, Kexin Zhu, Zhide Chen, Jinshi Lin and Yanhe Huang

Electronics 2026, 15(6), 1331; https://doi.org/10.3390/electronics15061331 - 23 Mar 2026

Abstract

Benggang, a severely eroded landform and geohazard typical of the red-soil hilly regions of southern China, has a fragmented texture, irregular boundaries, and high similarity to background objects such as bare soil and roads, which together pose a dual challenge of “multiscale variability + strong noise” for automated identification at regional scales. To address insufficient information from a single modality and the limited representation of cross-scale features, this study proposes a dual-stream feature-fusion network (DF-Net) for multisource data consisting of a digital orthophoto map (DOM) and a digital elevation model (DEM). The method adopts ResNeSt50d as the backbone of the two branches: on the DOM side, a Canny-edge channel is stacked to enhance high-frequency boundary information; on the DEM side, derived terrain factors, including slope, aspect, curvature, and hillshade, are introduced to provide morphological constraints. In the cross-modal fusion stage, a multiscale sparse attention fusion module is designed, which acquires contextual information via multiwindow average pooling and suppresses noise interference through top-K sparsification. In the decision stage, a multibranch ensemble is employed to improve classification stability. Taking Anxi County, Fujian Province, as the study area, a coregistered dataset of GF-2 (1 m) DOM and ALOS (12.5 m) DEMs is constructed, and a zonal partitioning strategy is adopted to evaluate the model’s generalization ability. The experimental results show that DF-Net achieves 97.44% accuracy, 85.71% recall, and an 82.98% F1 score in the independent test zone, outperforming multiple mainstream CNN/transformer classification models. 
This study indicates that the strategy of “multimodal feature enhancement + sparse attention fusion” tailored to Benggang erosional landforms can significantly improve recognition performance under complex backgrounds, providing technical support for rapid Benggang surveys and governance-effectiveness assessments. Full article
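Top-K sparsification of attention scores, as mentioned in the abstract above, can be sketched generically as follows. This is not DF-Net's fusion module, whose details the abstract does not give. It only shows the common pattern of keeping the K largest scores and renormalizing with a softmax. The function name `topk_sparse_softmax` is hypothetical.

```python
import math


def topk_sparse_softmax(scores, k):
    """Softmax over the k largest scores; all other positions get weight 0.

    Ties at the threshold are all kept, so slightly more than k entries
    may survive when scores repeat.
    """
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and len(scores)")
    threshold = sorted(scores, reverse=True)[k - 1]
    # Mask everything below the k-th largest score with -inf.
    kept = [s if s >= threshold else -math.inf for s in scores]
    peak = max(kept)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in kept]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


attention = topk_sparse_softmax([1.0, 3.0, 2.0, 0.0], k=2)
```

Here only the scores 3.0 and 2.0 survive. The masked positions contribute exactly zero weight, which is how sparsification can keep weakly related (and possibly noisy) positions out of the fused feature.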

(This article belongs to the Section Artificial Intelligence)

22 pages, 2186 KB

Open Access Article

ConvDeiT-Tiny: Adding Local Inductive Bias to DeiT-Ti for Enhanced Maize Leaf Disease Classification

by Damaris Waema, Waweru Mwangi and Petronilla Muriithi

Plants 2026, 15(6), 982; https://doi.org/10.3390/plants15060982 - 23 Mar 2026

Abstract

Reliable identification of maize leaf diseases is critical for mitigating crop losses, particularly in regions where farmers have limited access to experts. Although vision transformers (ViTs) have recently demonstrated strong performance in image recognition, their weak inductive bias and limited modeling of local texture patterns make them non-ideal for fine-grained maize leaf disease classification. To address these limitations, we propose ConvDeiT-Tiny, a lightweight hybrid ViT that improves DeiT-Ti by placing depthwise convolutions in parallel with multi-head self-attention modules in the first three transformer blocks. The local and global features captured by the convolution and attention modules are concatenated along the embedding dimension and fused using a multilayer perceptron. This results in richer token representations without significantly increasing model size. Across three datasets, ConvDeiT-Tiny (6.9 M parameters) consistently outperformed DeiT-Ti, DeiT-Ti-Distilled, and DeiT-S (21.7 M parameters) when trained from scratch. With transfer learning, ConvDeiT-Tiny achieved an accuracy of 99.15%, 99.35%, and 98.60% on the CD&S, primary, and Kaggle datasets, respectively, surpassing many previous studies with far fewer parameters. For explainability, we present gradient-weighted transformer attribution visualizations showing the disease lesions driving model predictions. These results indicate that injecting local inductive bias in early transformer blocks is beneficial for accurate maize leaf disease classification. Full article

(This article belongs to the Special Issue AI-Driven Machine Vision Technologies in Plant Science)

26 pages, 5161 KB

Open Access Article

LHO-net: A Lightweight Steel Defect Detection Framework Based on Cross-Scale Feature Selection and Adaptive Optimization

by Qi Wang and Haocheng Yan

Sensors 2026, 26(6), 1990; https://doi.org/10.3390/s26061990 - 23 Mar 2026

Abstract

To address the issues of poor adaptability to complex scenarios, high computational complexity, and difficulties in terminal deployment of existing steel surface defect detection models, a novel lightweight detection network named LHO-net is proposed, with the Lightweight Multi-Backbone (LM Backbone), the Hierarchical Scale-based Pyramid Attention Network (HSPAN), and the Occlusion-aware Detection Head (OAHead). The LM Backbone adopts a dual-branch structure with shared HGStem and a dynamic feature fusion mechanism, effectively capturing multi-dimensional features of irregular defects while extremely compressing model parameters. The HSPAN module realizes efficient fusion of multi-scale features through dynamic feature selection and adaptive upsampling strategies, balancing background noise suppression and defect detail preservation. The OAHead completes adaptive compensation of features in occluded regions by means of deep feature aggregation and exponential normalization technology, significantly enhancing the ability to recognize complex defects. On the NEU-DET dataset, LHO-net achieves a mAP@0.5 of 75.0%, a mAP@0.5:0.95 of 44.0%, and a recall of 73.6%, with a computational complexity of only 2.3 GFLOPS. Compared with the baseline model YOLOv12, it reduces parameters by 64% and computational cost by 60.3%. On the GC-10 dataset, its mAP@0.5 reaches 67.2%, and its detection stability for complex defects such as slender creases and low-contrast water spots is superior to that of mainstream lightweight YOLO variants. Visualization results confirm that the model can effectively avoid common problems such as redundant annotations and false detections and maintains stable recognition performance for various defects. It solves the core contradiction between detection accuracy and lightweight deployment in industrial scenarios, providing an efficient and practical technical solution for real-time steel surface defect detection on resource-constrained terminal devices. 
Full article

(This article belongs to the Section Fault Diagnosis & Sensors)

14 pages, 3023 KB

Open Access Article

Lightweight Stereo Vision for Obstacle Detection and Range Estimation in Micro-Mobility Vehicles

by Jiansheng Ruan, Hui Weng, Zhaojun Yuan, Guangyuan Jin and Liang Zhou

Sensors 2026, 26(6), 1988; https://doi.org/10.3390/s26061988 - 23 Mar 2026

Abstract

Micro-mobility vehicles operating in closed, low-speed environments (e.g., parks) require reliable obstacle detection and accurate range estimation under strict constraints on cost, power, and onboard computation. This paper proposes HAGVNet, a lightweight stereo matching network for embedded ranging and validates its practical deployability in a target-level ranging pipeline with YOLO11n as the front-end detector. HAGVNet builds a hierarchical attention-guided cost volume (HAGV) that uses coarse-scale geometric priors to modulate fine-scale cost modeling and adopts ConvNeXtV2-style 2D cost aggregation blocks to improve stability and boundary consistency with controlled complexity. For ranging, depth statistics within detected regions are used to estimate target distance and 3D position. The model is pre-trained on SceneFlow and evaluated on KITTI. On SceneFlow, HAGVNet reaches 0.73 px EPE with 20.08 G FLOPs, indicating a favorable accuracy–complexity trade-off under low computation budgets. On an embedded Jetson Orin Nano Super platform, HAGVNet achieves 46.3 FPS under TensorRT FP16, and field tests indicate relative ranging errors of 0.5–8.6% within 2–10 m, demonstrating its practical feasibility for low-speed target-level ranging. Full article
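The target-level ranging step described above (depth statistics inside a detected region) can be sketched with the standard rectified-stereo relation Z = f · B / d. The abstract does not say which statistic is used, so the median here is an assumption, as are the function name `box_distance`, the box format, and the camera parameters in the example.

```python
from statistics import median


def box_distance(disparity, box, focal_px, baseline_m):
    """Estimate the distance (metres) to the object inside a detection box.

    disparity: 2D list of disparities in pixels (0 or less means invalid).
    box: (x0, y0, x1, y1) in pixel coordinates, end-exclusive (assumed format).
    Uses the median disparity (an assumed choice) and Z = f * B / d.
    """
    x0, y0, x1, y1 = box
    valid = [
        disparity[y][x]
        for y in range(y0, y1)
        for x in range(x0, x1)
        if disparity[y][x] > 0
    ]
    if not valid:
        return None  # no usable stereo match inside the box
    return focal_px * baseline_m / median(valid)


# Hypothetical 4x4 disparity map with one invalid pixel inside the box.
disp = [[20.0] * 4 for _ in range(4)]
disp[1][1] = 0.0
distance = box_distance(disp, (0, 0, 3, 3), focal_px=1000.0, baseline_m=0.1)
```

A median is one plausible choice because it tolerates background pixels and invalid matches that fall inside the box; a mean or a lower percentile would be equally valid readings of "depth statistics".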

(This article belongs to the Section Sensing and Imaging)

27 pages, 2977 KB

Open Access Article

HGR-QL: Optimized Q-Learning for Multi-UAV Path Planning in Mountain Search and Rescue

by Qi Liu, Daqiao Zhang, Shaopeng Li, Pei Dai and Wenjing Li

Drones 2026, 10(3), 223; https://doi.org/10.3390/drones10030223 - 22 Mar 2026

Viewed by 55

Abstract

Existing Q-Learning-based path planning methods face significant bottlenecks in large-scale collaboration, dynamic interference adaptation, and regional value differentiation, failing to meet the practical needs of mountain search and rescue. This study proposes HGR-QL, an optimized Q-Learning method for large-scale multi-UAV operations. Referencing remote sensing datasets, a 50 × 50 dynamic grid environment is constructed by integrating 20% fixed obstacles and 10 moving interference sources, highly simulating real mountain features. Integrating the individual Q-tables and the regional shared Q-tables, the hierarchical independent Q-table architecture is designed, balancing local autonomy and global collaboration. To guide UAVs focusing on remote sensing-identified high-value areas, an innovative multi-level gradient collision avoidance reward function is constructed, avoiding task deviation. Comparative experiments across three scenarios with four baselines and ablation tests validate the core modules. Results show HGR-QL outperforms peers in key metrics: in the dynamic interference scenario, it achieves a 74.47% task completion rate, 25.44 collisions, and a stable 100.00 ms communication delay. HGR-QL provides a lightweight, scalable solution, effectively enhancing the efficiency, safety, and stability of mountain search and rescue and supporting the “golden 72 h” rescue window. Full article
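For orientation, the standard tabular Q-Learning update that methods such as HGR-QL build on can be sketched as below. This is the textbook rule only. The hierarchical individual/shared Q-tables and the gradient reward function from the abstract are not reproduced. The action set, grid states, and hyperparameters are hypothetical.

```python
from collections import defaultdict

# Hypothetical action set for a grid world.
ACTIONS = ("up", "down", "left", "right")


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one textbook Q-Learning step and return the new Q(state, action).

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Unvisited state-action pairs default to a value of 0.0.
q_table = defaultdict(float)
new_value = q_update(q_table, (0, 0), "right", reward=1.0, next_state=(0, 1))
```

With an empty table, a reward of 1.0 moves Q((0, 0), "right") from 0 to alpha × 1.0 = 0.1.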

(This article belongs to the Special Issue UAV Path Planning Algorithms for Surveillance and Reconnaissance in Civil Applications)

28 pages, 10027 KB

Open Access Article

An Automatic Scoring Method for Swine Leg Structure Based on 3D Point Clouds

by Yongqi Han, Youjun Yue, Xianglong Xue, Mingyu Li, Yikai Fan, Simon X. Yang, Daniel Morris, Qifeng Li and Weihong Ma

Agriculture 2026, 16(6), 706; https://doi.org/10.3390/agriculture16060706 (registering DOI) - 22 Mar 2026

Viewed by 57

Abstract

The leg structure of swine is closely related to their robustness and longevity. Animals with sound legs generally have longer productive lifespans and higher reproductive efficiency, whereas leg defects can markedly impair performance and shorten service life. To address the high subjectivity, low efficiency, and poor consistency of traditional leg-structure evaluation by humans, this study developed an automatic scoring system for swine leg structure based on 3D point clouds. The hardware components of the system include the acquisition channel, a multi-view time-of-flight (ToF) depth camera array, an industrial computer, and a star-type synchronization hub. The core algorithm modules include point cloud preprocessing, leg segmentation, geometric feature extraction, and structure-based scoring. Body orientation was corrected using principal component analysis (PCA). An adaptive limb region segmentation method was proposed that combines iterative cropping with geometric verification. Two point cloud tasks were performed: key structural points were extracted via multi-scale curvature analysis, and angular and symmetry parameters of the fore- and hindlimbs were computed in the sagittal and coronal planes. Following a “classify first, then score” strategy, a nine-level linear scoring model was constructed. Field validation showed that the classification accuracy exceeded 90%, the scores were significantly negatively correlated with the degree of structural deviation, and multi-frame resampling yielded good repeatability. The processing time per animal ranged from 1.6 s to 3.0 s, which met the requirements for real-time applications. These results demonstrated that the proposed method could automatically identify and quantitatively evaluate swine leg structure, providing efficient and reliable technical support for objective selection and smart pig farming. Full article

(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

20 pages, 2605 KB

Open Access Article

Spatial-Frequency Decoupling Alignment Encoding for Remote Sensing Change Detection

by Xu Zhang, Yue Du, Weiran Zhou and Kaihua Zhang

Sensors 2026, 26(6), 1979; https://doi.org/10.3390/s26061979 - 21 Mar 2026

Viewed by 92

Abstract

Existing remote sensing change detection methods often struggle to accurately capture the contours of complex change targets and subtle textural differences. This makes it difficult to effectively distinguish between the boundaries of change targets and the background. To address this challenge, we propose a novel method called spatial-frequency decoupling alignment encoding (SDA-Encoding), which is designed to fully leverage information from both the spatial and frequency domains. Specifically, we first use a Transformer encoder to extract bi-temporal features. Next, we apply wavelet transform to decouple these features into low-frequency and high-frequency components. In the multi-scale high-frequency interaction (MHI) module, we combine local spatial enhancement using spatial pyramid pooling with cross-scale dependency supplementation via the dual-domain alignment fusion (DAF) module. Meanwhile, in the position-aware low-frequency enhancement (PLE) module, spatial position sensitivity is restored using coordinate attention, and region-level contextual dependencies are captured through the selective fusion attention (SFA) module. Finally, the two frequency-domain branches are complementarily fused within the spatial domain to achieve unified detection of both fine-grained and structural changes. Experimental results on three benchmark datasets demonstrate the significant performance improvements of SDA-Encoding. Full article

(This article belongs to the Special Issue Image Processing and Analysis for Object Detection: 3rd Edition)

26 pages, 5110 KB

Open Access Article

Toward Robust Mineral Prospectivity Mapping: A Transformer-Based Global–Local Fusion Framework with Application to the Xiadian Gold Deposit

by Xiaoming Huang, Pancheng Wang and Qiliang Liu

Minerals 2026, 16(3), 331; https://doi.org/10.3390/min16030331 - 20 Mar 2026

Viewed by 22

Abstract

As mineral exploration increasingly targets deeper and more geologically complex terrains, the need for reliable predictive models becomes critical to mitigating exploration risk and improving cost efficiency. Correspondingly, the effectiveness of deep mineral exploration strategies depends substantially on the effectiveness and precision of three-dimensional mineral prospectivity mapping (3D MPM) models. However, the inherent spatial non-stationarity—where ore grade variability changes across geological domains—and the strongly skewed distribution of high-grade samples present a dual challenge. Conventional methods, which primarily rely on mean-based regression, often struggle to adequately address this dual challenge, limiting their predictive performance in complex geological settings. To address these issues, this paper proposes a pinball-loss-guided, global–local fusion Transformer model within a unified framework for 3D MPM. It leverages a multi-head self-attention mechanism with global–local fusion to capture long-range dependencies and global geological contexts, while incorporating local feature extraction modules to adaptively model spatially varying mineralization controls, jointly optimized through a pinball loss function to address mineralization distribution skewness. The proposed framework was first rigorously evaluated using the Xiadian gold deposit as a case study. Bootstrap analysis of the ablation experiments confirmed its predictive performance in terms of quantile-specific accuracy and prediction interval (PI) calibration. Ten rounds of random data splits provided further confirmation of the model’s stability. Subsequently, the validated model was applied to prospectivity mapping in unexplored regions, leading to the delineation of several high-potential exploration targets. 
Finally, comparative analyses with state-of-the-art machine learning methods were conducted, which further validated the competitive fitting capability of the proposed framework. Full article
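The pinball (quantile) loss referred to in the abstract above has a standard definition, sketched here in plain Python. It is an illustration of the loss itself, not of the authors' Transformer or training setup. The function name `pinball_loss` is hypothetical.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean pinball loss for quantile level tau, with 0 < tau < 1.

    Under-prediction (y > q) costs tau * (y - q); over-prediction costs
    (1 - tau) * (q - y), so a high tau penalizes under-prediction more.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


# At tau = 0.9, missing a high value by 1 costs nine times more
# than overshooting a low value by 1.
under = pinball_loss([1.0], [0.0], tau=0.9)
over = pinball_loss([0.0], [1.0], tau=0.9)
```

This asymmetry is what makes the loss suitable for skewed targets: fitting an upper quantile pulls predictions toward rare high-grade samples, which a mean-based regression would tend to smooth away.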

(This article belongs to the Special Issue 3D Mineral Prospectivity Modeling Applied to Mineral Deposits)

26 pages, 20660 KB

Open Access Article

Sea Ice and Water Segmentation in SAR Imagery Based on Polarization Channel Interaction and Edge Selective Fusion

by Wei Song, Yixun Chen, Bin Liu, Mengying Ge, Yiji Zhou and Huifang Xu

Remote Sens. 2026, 18(6), 945; https://doi.org/10.3390/rs18060945 - 20 Mar 2026

Viewed by 19

Abstract

Sea ice segmentation based on Synthetic Aperture Radar (SAR) images has become an important technical means for polar climate change monitoring and navigation safety guarantee. However, the existing methods have limitations in the utilization of SAR polarization information and the modeling of local diversity details of sea ice, which leads to insufficient segmentation, especially in complex ice-water boundary regions. To address these issues, this paper proposes a novel Polarization-Fused Edge-Enhanced UNet (PFEE-UNet) designed specifically for sea ice segmentation from high-resolution SAR images. Specifically, we design the Cross-Polarization Channel Interaction (CPCI) module, which employs a dual interaction strategy of hierarchical inter-group cascading and symmetric cross-fusion. This approach effectively leverages the complementary features of the HH and HV polarization channels, significantly enhancing the distinction between sea ice and open water. Additionally, we present the Dense–Sparse Diversity Enhancement (DSDE) module, which combines a spatial-channel joint attention mechanism to strengthen the model’s ability to capture spatial relationships within complex ice–water structures, effectively alleviating misclassifications caused by abrupt local texture changes. Finally, we design the Selective Edge Fusion (SEF) module, which dynamically selects and integrates multi-level edge features, improving the continuity of sea ice boundaries and preserving its morphological integrity. The experimental results show that the proposed PFEE-UNet model outperforms mainstream segmentation methods on the AI4Arctic/ASIP sea ice dataset, achieving an average Intersection over Union (IoU) of 84.48%, which surpasses existing methods such as HRNet (82.52%) and DeepLabv3+ (82.40%). Additionally, PFEE-UNet was applied for end-to-end ice–water segmentation on real-world Sentinel-1 SAR scenes, demonstrating its effectiveness and robustness for practical sea ice monitoring. 
Full article
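The Intersection over Union (IoU) metric reported above can be computed per class and then averaged, as in this minimal sketch. The label convention (0 = water, 1 = ice), the function names, and the toy masks are hypothetical. The paper's exact averaging protocol is not stated in the abstract.

```python
def class_iou(pred, target, cls):
    """IoU of one class between two label masks given as 2D lists."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p_hit = p == cls
            t_hit = t == cls
            inter += p_hit and t_hit  # booleans add as 0 or 1
            union += p_hit or t_hit
    return inter / union if union else float("nan")


def mean_iou(pred, target, classes):
    """Unweighted mean of the per-class IoU values."""
    return sum(class_iou(pred, target, c) for c in classes) / len(classes)


# Toy 2x2 masks (assumed labels: 0 = water, 1 = ice).
prediction = [[1, 1], [0, 0]]
reference = [[1, 0], [0, 0]]
ice_iou = class_iou(prediction, reference, 1)
average_iou = mean_iou(prediction, reference, classes=(0, 1))
```

In the toy example one of the two predicted ice pixels is correct, so the ice IoU is 1/2, while water scores 2/3.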

(This article belongs to the Special Issue Innovative Remote-Sensing Technologies for Sea Ice Observing)

22 pages, 6052 KB

Open Access Article

HSMD-YOLO: An Anti-Aliasing Feature-Enhanced Network for High-Speed Microbubble Detection

by Wenda Luo, Yongjie Li and Siguang Zong

Algorithms 2026, 19(3), 234; https://doi.org/10.3390/a19030234 - 20 Mar 2026

Viewed by 12

Abstract

Underwater micro-bubble detection entails multiple challenges, including diminutive target sizes, sparse pixel information, pronounced specular highlights and water scattering, indistinct bubble boundaries, and adhesion or overlap between instances. To address these issues, we propose HSMD-YOLO, an improved detector tailored for high-resolution micro-bubble detection and built upon YOLOv11. The model incorporates three novel components: the Scale Switch Block (SSB), a scale-transformation module that suppresses artifacts and background noise, thereby stabilizing edges in thin-walled bubble regions and enhancing sensitivity to geometric contours; the Global Local Refine Block (GLRB), which achieves efficient global relationship modeling with asymptotically linear complexity ($O(N)$) in spatial dimensions while further refining local features, thereby strengthening boundary perception and improving bubble–background separability; and the Bidirectional Exponential Moving Attention Fusion (BEMAF), which accommodates the multi-scale nature of bubbles by employing a parallel multi-kernel architecture to extract spatial features across scales, coupled with a multi-stage EMA-based attention mechanism to enhance detection robustness under weak boundaries and complex backgrounds. Experiments conducted on a Side-Illuminated Light Field Bubble Database (SILB-DB) and a public gas–liquid two-phase flow dataset (GTFD) demonstrate that HSMD-YOLO achieves mAP@50 scores of 0.911 and 0.854, respectively, surpassing mainstream detection methods. Ablation studies indicate that SSB, GLRB, and BEMAF contribute performance gains of 1.3%, 2.0%, and 0.4%, respectively, thereby corroborating the effectiveness of each module for micro-scale object detection. Full article

(This article belongs to the Section Evolutionary Algorithms and Machine Learning)


Search Results (2,456)
