Search Results (333)

Search Parameters:
Keywords = deep and shallow feature fusion

21 pages, 3169 KB  
Article
LGD-DeepLabV3+: An Enhanced Framework for Remote Sensing Semantic Segmentation via Multi-Level Feature Fusion and Global Modeling
by Xin Wang, Xu Liu, Adnan Mahmood, Yaxin Yang and Xipeng Li
Sensors 2026, 26(3), 1008; https://doi.org/10.3390/s26031008 - 3 Feb 2026
Abstract
Remote sensing semantic segmentation encounters several challenges, including scale variation, the coexistence of inter-class similarity and intra-class diversity, difficulties in modeling long-range dependencies, and shadow occlusions. Slender structures and complex boundaries present particular segmentation difficulties, especially in high-resolution imagery acquired by satellite and aerial cameras, UAV-borne optical sensors, and other imaging payloads. These sensing systems deliver large-area coverage with fine ground sampling distance, which magnifies domain shifts between different sensors and acquisition conditions. This work builds upon DeepLabV3+ and proposes complementary improvements at three stages: input, context, and decoder fusion. First, to mitigate the interference of complex and heterogeneous data distributions on network optimization, a feature-mapping network is introduced to project raw images into a simpler distribution before they are fed into the segmentation backbone. This approach facilitates training and enhances feature separability. Second, although Atrous Spatial Pyramid Pooling (ASPP) aggregates multi-scale context, it remains insufficient for modeling long-range dependencies. Therefore, a routing-style global modeling module is incorporated after ASPP to strengthen global relation modeling and ensure cross-region semantic consistency. Third, considering that the fusion between shallow details and deep semantics in the decoder is limited and prone to boundary blurring, a fusion module is designed to facilitate deep interaction and joint learning through cross-layer feature alignment and coupling. The proposed model improves the mean Intersection over Union (mIoU) by 8.83% on the LoveDA dataset and by 6.72% on the ISPRS Potsdam dataset compared to the baseline. Qualitative results further demonstrate clearer boundaries and more stable region annotations, while the proposed modules are plug-and-play and easy to integrate into camera-based remote sensing pipelines and other imaging-sensor systems, providing a practical accuracy–efficiency trade-off.
(This article belongs to the Section Smart Agriculture)
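The decoder-stage fusion above combines shallow detail with upsampled deep semantics through cross-layer feature alignment. The abstract does not give the module's internals, so the following minimal PyTorch sketch only illustrates the generic align-upsample-concatenate-refine pattern; the class name, channel widths, and wiring are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossLayerFusion(nn.Module):
    """Minimal deep-shallow decoder fusion: project both streams to a common
    width, upsample the deep stream, and refine the concatenation."""
    def __init__(self, shallow_ch, deep_ch, out_ch=256):
        super().__init__()
        self.align_shallow = nn.Conv2d(shallow_ch, out_ch // 2, 1)  # channel alignment
        self.align_deep = nn.Conv2d(deep_ch, out_ch // 2, 1)
        self.refine = nn.Sequential(
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU())

    def forward(self, shallow, deep):
        deep = F.interpolate(self.align_deep(deep), size=shallow.shape[2:],
                             mode="bilinear", align_corners=False)  # spatial alignment
        return self.refine(torch.cat([self.align_shallow(shallow), deep], dim=1))

# e.g. 1/4-scale shallow features and deep context output in a DeepLabV3+-style decoder
fused = CrossLayerFusion(256, 2048)(torch.randn(1, 256, 128, 128),
                                    torch.randn(1, 2048, 32, 32))
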
22 pages, 7617 KB  
Article
DAS-YOLO: Adaptive Structure–Semantic Symmetry Calibration Network for PCB Defect Detection
by Weipan Wang, Wengang Jiang, Lihua Zhang, Siqing Chen and Qian Zhang
Symmetry 2026, 18(2), 222; https://doi.org/10.3390/sym18020222 - 25 Jan 2026
Viewed by 274
Abstract
Industrial-grade printed circuit boards (PCBs) exhibit high structural order and inherent geometric symmetry, where minute surface defects essentially constitute symmetry-breaking anomalies that disrupt topological integrity. Detecting these anomalies is quite challenging due to issues like scale variation and low contrast. Therefore, this paper proposes a symmetry-aware object detection framework, DAS-YOLO, based on an improved YOLOv11. The U-shaped adaptive feature extraction module (Def-UAD) reconstructs the C3K2 unit, overcoming the geometric limitations of standard convolutions through a deformation adaptation mechanism. This significantly enhances feature extraction capabilities for irregular defect topologies. A semantic-aware module (SADRM) is introduced at the backbone and neck regions. The lightweight and efficient ESSAttn improves the distinguishability of small or weak targets. At the same time, to address information asymmetry between deep and shallow features, an iterative attention feature fusion module (IAFF) is designed. By dynamically weighting and calibrating feature biases, it achieves structured coordination and balanced multi-scale representation. To evaluate the validity of the proposed method, we carried out comprehensive experiments using publicly accessible datasets focused on PCB defects. The results show that the Recall, mAP@50, and mAP@50-95 of DAS-YOLO reached 82.60%, 89.50%, and 46.60%, respectively, which are 3.7%, 1.8%, and 2.9% higher than those of the baseline model, YOLOv11n. Comparisons with mainstream detectors such as GD-YOLO and SRN further demonstrate a significant advantage in detection accuracy. These results confirm that the proposed framework offers a solution that strikes a balance between accuracy and practicality in addressing the key challenges in PCB surface defect detection.
(This article belongs to the Section Computer)
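The IAFF module dynamically weights deep and shallow features. The description resembles the published iterative attentional feature fusion scheme (iAFF, Dai et al.), so the sketch below follows that pattern; DAS-YOLO's actual module may differ, and all names, widths, and the two-step iteration count here are assumptions.

import torch
import torch.nn as nn

class MSCAM(nn.Module):
    """Multi-scale channel attention: global (pooled) plus local (per-pixel) context."""
    def __init__(self, ch, r=4):
        super().__init__()
        mid = ch // r
        self.local = nn.Sequential(nn.Conv2d(ch, mid, 1), nn.ReLU(), nn.Conv2d(mid, ch, 1))
        self.glob = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                  nn.Conv2d(ch, mid, 1), nn.ReLU(), nn.Conv2d(mid, ch, 1))

    def forward(self, x):
        return torch.sigmoid(self.local(x) + self.glob(x))

class IterativeAttentionFusion(nn.Module):
    """Two-step attentional fusion of deep and shallow maps (same resolution and width)."""
    def __init__(self, ch):
        super().__init__()
        self.att1, self.att2 = MSCAM(ch), MSCAM(ch)

    def forward(self, deep, shallow):
        w = self.att1(deep + shallow)        # initial fusion weight
        mid = w * deep + (1 - w) * shallow   # first calibrated fusion
        w = self.att2(mid)                   # re-estimate the weight from the fused map
        return w * deep + (1 - w) * shallow

out = IterativeAttentionFusion(64)(torch.randn(1, 64, 40, 40), torch.randn(1, 64, 40, 40))
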
24 pages, 8047 KB  
Article
MEE-DETR: Multi-Scale Edge-Aware Enhanced Transformer for PCB Defect Detection
by Xiaoyu Ma, Xiaolan Xie and Yuhui Song
Electronics 2026, 15(3), 504; https://doi.org/10.3390/electronics15030504 - 23 Jan 2026
Viewed by 211
Abstract
Defect inspection of printed circuit boards (PCBs) is essential for maintaining the safety and reliability of electronic products. With the continuous trend toward smaller components and higher integration levels, identifying tiny imperfections on densely packed PCB structures has become increasingly difficult and remains a major challenge for current inspection systems. To tackle this problem, this study proposes the Multi-scale Edge-Aware Enhanced Detection Transformer (MEE-DETR), a deep learning-based object detection method. Building upon the RT-DETR framework, which is grounded in Transformer-based machine learning, the proposed approach systematically introduces enhancements at three levels: backbone feature extraction, feature interaction, and multi-scale feature fusion. First, the proposed Edge-Strengthened Backbone Network (ESBN) constructs multi-scale edge extraction and semantic fusion pathways, effectively strengthening the structural representation of shallow defect edges. Second, the Entanglement Transformer Block (ETB) synergistically integrates frequency self-attention, spatial self-attention, and a frequency–spatial entangled feed-forward network, enabling deep cross-domain information interaction and consistent feature representation. Finally, the proposed Adaptive Enhancement Feature Pyramid Network (AEFPN), incorporating the Adaptive Cross-scale Fusion Module (ACFM) for cross-scale adaptive weighting and the Enhanced Feature Extraction C3 Module (EFEC3) for local nonlinear enhancement, substantially improves detail preservation and semantic balance during feature fusion. Experiments conducted on the PKU-Market-PCB dataset reveal that MEE-DETR delivers notable performance gains. Specifically, Precision, Recall, and mAP50–95 improve by 2.5%, 9.4%, and 4.2%, respectively. In addition, the model’s parameter size is reduced by 40.7%. These results collectively indicate that MEE-DETR achieves excellent detection performance with a lightweight network architecture.
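The ACFM applies cross-scale adaptive weighting during fusion. One plausible reading, sketched below, is fast normalized fusion with learnable non-negative scalar weights in the style of BiFPN; the abstract does not say whether the weights are scalar or spatial, so this is an assumption.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveCrossScaleFusion(nn.Module):
    """Fuse resized multi-scale maps with learnable, normalized scalar weights
    (fast normalized fusion, as popularized by BiFPN)."""
    def __init__(self, n_inputs, eps=1e-4):
        super().__init__()
        self.w = nn.Parameter(torch.ones(n_inputs))
        self.eps = eps

    def forward(self, feats):  # feats: list of maps with equal channel counts
        size = feats[0].shape[2:]
        feats = [F.interpolate(f, size=size, mode="bilinear", align_corners=False)
                 if f.shape[2:] != size else f for f in feats]
        w = F.relu(self.w)
        w = w / (w.sum() + self.eps)   # non-negative weights summing to ~1
        return sum(wi * fi for wi, fi in zip(w, feats))

p3, p4 = torch.randn(1, 128, 80, 80), torch.randn(1, 128, 40, 40)
fused = AdaptiveCrossScaleFusion(2)([p3, p4])
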
28 pages, 8014 KB  
Article
YOLO-UMS: Multi-Scale Feature Fusion Based on YOLO Detector for PCB Surface Defect Detection
by Hong Peng, Wenjie Yang and Baocai Yu
Sensors 2026, 26(2), 689; https://doi.org/10.3390/s26020689 - 20 Jan 2026
Viewed by 264
Abstract
Printed circuit boards (PCBs) are critical in the electronics industry. As PCB layouts grow increasingly complex, defect detection processes often encounter challenges such as low image contrast, uneven brightness, minute defect sizes, and irregular shapes, making it difficult to achieve rapid and accurate automated inspection. To address these challenges, this paper proposes a novel object detector, YOLO-UMS, designed to enhance the accuracy and speed of PCB surface defect detection. First, a lightweight plug-and-play Unified Multi-Scale Feature Fusion Pyramid Network (UMSFPN) is proposed to process and fuse multi-scale information across different resolution layers. The UMSFPN uses a Cross-Stage Partial Multi-Scale Module (CSPMS) and an optimized fusion strategy. This approach balances the integration of fine-grained edge information from shallow layers and coarse-grained semantic details from deep layers. Second, the paper introduces a lightweight RG-ELAN module, based on the ELAN network, to enhance feature extraction for small targets in complex scenes. The RG-ELAN module uses low-cost operations to generate redundant feature maps and reduce computational complexity. Finally, the Adaptive Interaction Feature Integration (AIFI) module enriches high-level features by eliminating redundant interactions among shallow-layer features. The channel-priority convolutional attention module (CPCA), deployed in the detection head, strengthens the expressive power of small target features. Experimental results show that the UMSFPN neck improves AP50 by 3.1% and AP by 2% on the self-collected PCB-M dataset, outperforming the original PAFPN neck. Meanwhile, UMSFPN achieves excellent results across different detectors and datasets, verifying its broad applicability. Without pre-training weights, YOLO-UMS achieves an 84% AP50 on the PCB-M dataset, a 6.4% improvement over the baseline YOLO11. Comparisons with existing object detection algorithms show that YOLO-UMS delivers strong detection accuracy, providing a feasible solution for efficient and accurate industrial detection of PCB surface defects.
(This article belongs to the Section Physical Sensors)
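RG-ELAN "uses low-cost operations to generate redundant feature maps," which matches the GhostNet recipe of a primary convolution paired with a cheap depthwise operation. The sketch below shows that generic pattern under the assumption that RG-ELAN works similarly; names and channel splits are illustrative.

import torch
import torch.nn as nn

class CheapFeatureConv(nn.Module):
    """Generate half of the output channels with a regular convolution and the
    other half with a cheap depthwise operation, then concatenate (GhostNet-style)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        prim = out_ch // 2
        self.primary = nn.Sequential(nn.Conv2d(in_ch, prim, 3, padding=1),
                                     nn.BatchNorm2d(prim), nn.ReLU())
        self.cheap = nn.Sequential(  # depthwise conv: one cheap filter per channel
            nn.Conv2d(prim, out_ch - prim, 3, padding=1, groups=prim),
            nn.BatchNorm2d(out_ch - prim), nn.ReLU())

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)

y = CheapFeatureConv(64, 128)(torch.randn(1, 64, 56, 56))  # -> (1, 128, 56, 56)
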
23 pages, 40307 KB  
Article
EFPNet: An Efficient Feature Perception Network for Real-Time Detection of Small UAV Targets
by Jiahao Huang, Wei Jin, Huifeng Tao, Yunsong Feng, Yuanxin Shang, Siyu Wang and Aibing Liu
Remote Sens. 2026, 18(2), 340; https://doi.org/10.3390/rs18020340 - 20 Jan 2026
Viewed by 171
Abstract
In recent years, unmanned aerial vehicles (UAVs) have become increasingly prevalent across diverse application scenarios due to their high maneuverability, compact size, and cost-effectiveness. However, these advantages also introduce significant challenges for UAV detection in complex environments. This paper proposes an efficient feature perception network (EFPNet) for UAV detection, developed on the foundation of the RT-DETR framework. Specifically, a dual-branch HiLo-ConvMix attention (HCM-Attn) mechanism and a pyramid sparse feature transformer network (PSFT-Net) are introduced, along with the integration of a DySample dynamic upsampling module. The HCM-Attn module facilitates interaction between high- and low-frequency information, effectively suppressing background noise interference. The PSFT-Net is designed to leverage deep-level features to guide the encoding and fusion of shallow features, thereby enhancing the model’s capability to perceive UAV texture characteristics. Furthermore, the integrated DySample dynamic upsampling module ensures efficient reconstruction and restoration of feature representations. On the TIB and Drone-vs-Bird datasets, the proposed EFPNet achieves mAP50 scores of 94.1% and 98.1%, representing improvements of 3.2% and 1.9% over the baseline models, respectively. Our experimental results demonstrate the effectiveness of the proposed method for small UAV detection.
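PSFT-Net lets deep-level features guide the encoding and fusion of shallow features. A simple stand-in for such guidance, sketched below, is FiLM-style modulation in which the upsampled deep map predicts a per-pixel scale and shift for the shallow map; the paper's actual guidance mechanism is not detailed in the abstract, so treat this as an assumption-laden illustration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DeepGuidedEncoding(nn.Module):
    """Let deep semantics steer shallow texture features: the upsampled deep map
    predicts per-pixel scale and shift that modulate the shallow map."""
    def __init__(self, deep_ch, shallow_ch):
        super().__init__()
        self.to_scale = nn.Conv2d(deep_ch, shallow_ch, 1)
        self.to_shift = nn.Conv2d(deep_ch, shallow_ch, 1)

    def forward(self, shallow, deep):
        deep = F.interpolate(deep, size=shallow.shape[2:], mode="bilinear",
                             align_corners=False)
        scale = torch.sigmoid(self.to_scale(deep))  # where to attend, in [0, 1]
        shift = self.to_shift(deep)                 # additive semantic bias
        return shallow * scale + shift

guided = DeepGuidedEncoding(256, 64)(torch.randn(1, 64, 80, 80),
                                     torch.randn(1, 256, 20, 20))
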
19 pages, 4395 KB  
Article
An Attention-Based Bidirectional Feature Fusion Algorithm for Insulator Detection
by Binghao Gao, Jinyu Guo, Yongyue Wang, Dong Li and Xiaoqiang Jia
Sensors 2026, 26(2), 584; https://doi.org/10.3390/s26020584 - 15 Jan 2026
Viewed by 220
Abstract
To maintain reliability, safety, and sustainability in power transmission, insulator defect detection has become a critical task in power line inspection. Due to the complex backgrounds and small defect sizes encountered in insulator defect images, issues such as false detections and missed detections often occur. The existing You Only Look Once (YOLO) object detection algorithm is currently the mainstream method for image-based insulator defect detection in power lines. However, existing models suffer from low detection accuracy. To address this issue, this paper presents an improved YOLOv5-based MC-YOLO insulator detection algorithm. To effectively extract multi-scale information and enhance the model’s ability to represent feature information, a multi-scale attention convolutional fusion (MACF) module incorporating an attention mechanism is proposed. This module utilises parallel convolutions with different kernel sizes to effectively extract features at various scales and highlights the feature representation of key targets through the attention mechanism, thereby improving the detection accuracy. Additionally, a cross-context feature fusion module (CCFM) is designed, where shallow features gain partial deep semantic supplementation and deep features absorb shallow spatial information, achieving bidirectional information flow. Furthermore, the Spatial-Channel Dual Attention Module (SCDAM) is introduced into CCFM. By incorporating a dynamic attention-guided bidirectional cross-fusion mechanism, it effectively resolves the feature deviation between shallow details and deep semantics during multi-scale feature fusion. The experimental results show that the MC-YOLO algorithm achieves an mAP@0.5 of 67.4% on the dataset used in this study, which is a 4.1% improvement over the original YOLOv5. Although the FPS is slightly reduced compared to the original model, it remains practical and capable of rapidly and accurately detecting insulator defects.
(This article belongs to the Section Industrial Sensors)
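The CCFM's bidirectional flow, in which shallow features gain deep semantics and deep features absorb shallow spatial detail, can be written as two resize-project-add paths. The module below is a minimal illustration of that exchange with assumed names and channel widths, not the paper's implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class BidirectionalCrossFusion(nn.Module):
    """Shallow features receive upsampled deep semantics; deep features receive
    downsampled shallow spatial detail: two-way information flow."""
    def __init__(self, shallow_ch, deep_ch):
        super().__init__()
        self.deep_to_shallow = nn.Conv2d(deep_ch, shallow_ch, 1)
        self.shallow_to_deep = nn.Conv2d(shallow_ch, deep_ch, 1)

    def forward(self, shallow, deep):
        up = F.interpolate(self.deep_to_shallow(deep), size=shallow.shape[2:],
                           mode="bilinear", align_corners=False)
        down = F.adaptive_avg_pool2d(self.shallow_to_deep(shallow), deep.shape[2:])
        return shallow + up, deep + down

s, d = BidirectionalCrossFusion(64, 256)(torch.randn(1, 64, 80, 80),
                                         torch.randn(1, 256, 20, 20))
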
26 pages, 5686 KB  
Article
MAFMamba: A Multi-Scale Adaptive Fusion Network for Semantic Segmentation of High-Resolution Remote Sensing Images
by Boxu Li, Xiaobing Yang and Yingjie Fan
Sensors 2026, 26(2), 531; https://doi.org/10.3390/s26020531 - 13 Jan 2026
Viewed by 184
Abstract
With rapid advancements in sub-meter satellite and aerial imaging technologies, high-resolution remote sensing imagery has become a pivotal source for geospatial information acquisition. However, current semantic segmentation models encounter two primary challenges: (1) the inherent trade-off between capturing long-range global context and preserving precise local structural details, where excessive reliance on downsampled deep semantics often results in blurred boundaries and the loss of small objects; and (2) the difficulty in modeling complex scenes with extreme scale variations, where objects of the same category exhibit drastically different morphological features. To address these issues, this paper introduces MAFMamba, a multi-scale adaptive fusion visual Mamba network tailored for high-resolution remote sensing images. To mitigate scale variation, we design a lightweight hybrid encoder incorporating an Adaptive Multi-scale Mamba Block (AMMB) in each stage. Driven by a Multi-scale Adaptive Fusion (MSAF) mechanism, the AMMB dynamically generates pixel-level weights to recalibrate cross-level features, establishing a robust multi-scale representation. Simultaneously, to balance local details and global semantics, we introduce a Global–Local Feature Enhancement Mamba (GLMamba) in the decoder. This module synergistically integrates local fine-grained features extracted by convolutions with global long-range dependencies modeled by the Visual State Space (VSS) layer. Furthermore, we propose a Multi-Scale Cross-Attention Fusion (MSCAF) module to bridge the semantic gap between the encoder’s shallow details and the decoder’s high-level semantics via an efficient cross-attention mechanism. Extensive experiments on the ISPRS Potsdam and Vaihingen datasets demonstrate that MAFMamba surpasses state-of-the-art Convolutional Neural Network (CNN), Transformer, and Mamba-based methods in terms of mIoU and mF1 scores. Notably, it achieves superior accuracy while maintaining linear computational complexity and low memory usage, underscoring its efficiency in complex remote sensing scenarios.
(This article belongs to the Special Issue Intelligent Sensors and Artificial Intelligence in Building)
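The MSAF mechanism dynamically generates pixel-level weights to recalibrate cross-level features. A minimal reading, sketched below, predicts a per-pixel softmax weight for each branch and blends the branches with those weights; the actual AMMB/MSAF internals are not given in the abstract.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PixelwiseScaleFusion(nn.Module):
    """Predict a per-pixel softmax weight for each scale branch and blend the
    branches with those weights."""
    def __init__(self, ch, n_branches):
        super().__init__()
        self.to_weights = nn.Conv2d(ch * n_branches, n_branches, 1)

    def forward(self, branches):  # each branch: (B, ch, H, W)
        w = self.to_weights(torch.cat(branches, dim=1))
        w = F.softmax(w, dim=1)   # (B, n_branches, H, W), weights sum to 1 per pixel
        return sum(w[:, i:i + 1] * b for i, b in enumerate(branches))

b1, b2, b3 = (torch.randn(1, 96, 64, 64) for _ in range(3))
fused = PixelwiseScaleFusion(96, 3)([b1, b2, b3])
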
32 pages, 12128 KB  
Article
YOLO-SMD: A Symmetrical Multi-Scale Feature Modulation Framework for Pediatric Pneumonia Detection
by Linping Du, Xiaoli Zhu, Zhongbin Luo and Yanping Xu
Symmetry 2026, 18(1), 139; https://doi.org/10.3390/sym18010139 - 10 Jan 2026
Viewed by 239
Abstract
Pediatric pneumonia detection faces the challenge of pathological asymmetry, where immature lung tissues present blurred boundaries and lesions exhibit extreme scale variations (e.g., small viral nodules vs. large bacterial consolidations). Conventional detectors often fail to address these imbalances. In this study, we propose YOLO-SMD, a detection framework built upon a symmetrical design philosophy to enforce balanced feature representation. We introduce three architectural innovations: (1) DySample (Content-Aware Upsampling): To address the blurred boundaries of pediatric lesions, this module replaces static interpolation with dynamic point sampling, effectively sharpening edge details that are typically smoothed out by standard upsamplers; (2) SAC2f (Cross-Dimensional Attention): To counteract background interference, this module enforces a symmetrical interaction between spatial and channel dimensions, allowing the model to suppress structural noise (e.g., rib overlaps) in low-contrast X-rays; (3) SDFM (Adaptive Gated Fusion): To resolve the extreme scale disparity, this unit employs a gated mechanism that symmetrically balances deep semantic features (crucial for large bacterial shapes) and shallow textural features (crucial for viral textures). Extensive experiments on a curated subset of 2611 images derived from the Chest X-ray Pneumonia Dataset demonstrate that YOLO-SMD achieves competitive performance with a focus on high sensitivity, attaining a Recall of 86.1% and an mAP@0.5 of 84.3%, thereby outperforming the state-of-the-art YOLOv12n by 2.4% in Recall under identical experimental conditions. The results validate that incorporating symmetry principles into feature modulation significantly enhances detection robustness in primary healthcare settings.
(This article belongs to the Special Issue Symmetry/Asymmetry in Image Processing and Computer Vision)
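The SDFM's gate that balances deep semantic and shallow textural features maps naturally onto a sigmoid-gated convex combination of the two streams. The sketch below shows that generic pattern with assumed names and shapes; YOLO-SMD's gating details may differ.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedDualFusion(nn.Module):
    """A sigmoid gate decides, per pixel and channel, how much deep semantics
    versus shallow texture flows into the fused map."""
    def __init__(self, ch):
        super().__init__()
        self.gate = nn.Conv2d(2 * ch, ch, 1)

    def forward(self, deep, shallow):
        deep = F.interpolate(deep, size=shallow.shape[2:], mode="bilinear",
                             align_corners=False)
        g = torch.sigmoid(self.gate(torch.cat([deep, shallow], dim=1)))
        return g * deep + (1 - g) * shallow   # convex blend of the two streams

out = GatedDualFusion(64)(torch.randn(1, 64, 20, 20), torch.randn(1, 64, 80, 80))
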
18 pages, 7411 KB  
Article
Enhancing Marine Gravity Anomaly Recovery from Satellite Altimetry Using Differential Marine Geodetic Data
by Yu Han, Fangjun Qin, Jiujiang Yan, Hongwei Wei, Geng Zhang, Yang Li and Yimin Li
Appl. Sci. 2026, 16(2), 726; https://doi.org/10.3390/app16020726 - 9 Jan 2026
Viewed by 262
Abstract
Traditional fusion methods for integrating multi-source gravity data rely on predefined mathematical models that inadequately capture complex nonlinear relationships, particularly at wavelengths shorter than 10 km. We developed a convolutional neural network incorporating differential marine geodetic data (DMGD-CNN) to enhance marine gravity anomaly recovery from HY-2A satellite altimetry. The DMGD-CNN framework encodes spatial gradient information by computing differences between target points and their surrounding neighborhoods, enabling the model to explicitly capture local gravity field variations. This approach transforms absolute parameter values into spatial gradient representations, functioning as a spatial high-pass filter that enhances local gradient information critical for short-wavelength gravity signal recovery while reducing the influence of long-wavelength components. Through systematic ablation studies with eight parameter configurations, we demonstrate that incorporating first- and second-order seabed topography derivatives significantly enhances model performance, reducing the root mean square error (RMSE) from 2.26 mGal to 0.93 mGal, with further reduction to 0.85 mGal achieved by the differential learning strategy. Comprehensive benchmarking against international gravity models (SIO V32.1, DTU17, and SDUST2022) demonstrates that DMGD-CNN achieves 2–10% accuracy improvement over direct CNN predictions in complex topographic regions. Power spectral density analysis reveals enhanced predictive capabilities at wavelengths below 10 km for the direct CNN approach, with DMGD-CNN achieving further precision enhancement at wavelengths below 5 km. Cross-validation with independent shipborne surveys confirms the method’s robustness, showing 47–63% RMSE reduction in shallow water regions (<2000 m depth) compared to HY-2A altimeter-derived results. These findings demonstrate that deep learning with differential marine geodetic features substantially improves marine gravity field modeling accuracy, particularly for capturing fine-scale gravitational features in challenging environments.
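The differential strategy replaces absolute values with differences between a target point and its surrounding neighborhood, acting as a spatial high-pass filter. The short NumPy illustration below applies that idea to a gridded field; the 3x3 neighborhood, edge padding, and grid size are assumptions rather than the paper's exact configuration.

import numpy as np

def differential_features(grid):
    """Replace each grid value with its difference from the mean of the
    8-neighborhood, a spatial high-pass that keeps local gradient structure."""
    padded = np.pad(grid, 1, mode="edge")
    neigh_sum = sum(padded[1 + di:1 + di + grid.shape[0], 1 + dj:1 + dj + grid.shape[1]]
                    for di in (-1, 0, 1) for dj in (-1, 0, 1) if (di, dj) != (0, 0))
    return grid - neigh_sum / 8.0

# e.g. a gridded geodetic parameter (sea surface height, depth, gravity anomaly, ...)
field = np.random.rand(64, 64)
hp = differential_features(field)  # long-wavelength trend suppressed, local gradients kept
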
22 pages, 3809 KB  
Article
Research on Remote Sensing Image Object Segmentation Using a Hybrid Multi-Attention Mechanism
by Lei Chen, Changliang Li, Yixuan Gao, Yujie Chang, Siming Jin, Zhipeng Wang, Xiaoping Ma and Limin Jia
Appl. Sci. 2026, 16(2), 695; https://doi.org/10.3390/app16020695 - 9 Jan 2026
Viewed by 248
Abstract
High-resolution remote sensing images are gradually playing an important role in land cover mapping, urban planning, and environmental monitoring tasks. However, current segmentation approaches frequently encounter challenges such as loss of detail and blurred boundaries when processing high-resolution remote sensing imagery, owing to their complex backgrounds and dense semantic content. In response to the aforementioned limitations, this study introduces HMA-UNet, a novel segmentation network built upon the UNet framework and enhanced through a hybrid attention strategy. The architecture’s innovation centers on a composite attention block, where a lightweight split fusion attention (LSFA) mechanism and a lightweight channel-spatial attention (LCSA) mechanism are synergistically integrated within a residual learning structure to replace the stacked convolutional structure in UNet, which improves the utilization of important shallow features and eliminates redundant information interference. Comprehensive experiments on the WHDLD dataset and the DeepGlobe road extraction dataset confirm that the proposed method achieves effective segmentation of remote sensing images. On the WHDLD dataset, the model attains a mean accuracy, IoU, precision, and recall of 72.40%, 60.71%, 75.46%, and 72.41%, respectively. Correspondingly, on the DeepGlobe road extraction dataset, it achieves a mean accuracy of 57.87%, an mIoU of 49.82%, a mean precision of 78.18%, and a mean recall of 57.87%.
(This article belongs to the Section Computing and Artificial Intelligence)
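The composite attention block replaces UNet's stacked convolutions with channel and spatial attention inside a residual structure. The sketch below composes a CBAM-like channel-then-spatial pair in a residual unit as one plausible reading; the LSFA and LCSA internals are not specified in the abstract.

import torch
import torch.nn as nn

class AttentionResidualBlock(nn.Module):
    """Residual unit whose body applies lightweight channel attention followed
    by spatial attention to the convolved features (CBAM-like composition)."""
    def __init__(self, ch, r=8):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1),
                                  nn.BatchNorm2d(ch), nn.ReLU())
        self.channel = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                     nn.Conv2d(ch, ch // r, 1), nn.ReLU(),
                                     nn.Conv2d(ch // r, ch, 1), nn.Sigmoid())
        self.spatial = nn.Sequential(nn.Conv2d(2, 1, 7, padding=3), nn.Sigmoid())

    def forward(self, x):
        y = self.conv(x)
        y = y * self.channel(y)                              # channel reweighting
        pooled = torch.cat([y.mean(1, keepdim=True),
                            y.amax(1, keepdim=True)], dim=1)  # avg + max maps
        y = y * self.spatial(pooled)                         # spatial reweighting
        return x + y                                         # residual connection

out = AttentionResidualBlock(64)(torch.randn(1, 64, 56, 56))
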
24 pages, 4797 KB  
Article
PRTNet: Combustion State Recognition Model of Municipal Solid Waste Incineration Process Based on Enhanced Res-Transformer and Multi-Scale Feature Guided Aggregation
by Jian Zhang, Junyu Ge and Jian Tang
Sustainability 2026, 18(2), 676; https://doi.org/10.3390/su18020676 - 9 Jan 2026
Viewed by 203
Abstract
Accurate identification of the combustion state in municipal solid waste incineration (MSWI) processes is crucial for achieving efficient, low-emission, and safe operation. However, existing methods often struggle with stable and reliable recognition due to insufficient feature extraction capabilities when confronted with challenges such as complex flame morphology, blurred boundaries, and significant noise in flame images. To address this, this paper proposes a novel hybrid architecture model named PRTNet, which aims to enhance the accuracy and robustness of combustion state recognition through multi-scale feature enhancement and adaptive fusion mechanisms. First, a local-semantic enhanced residual network is constructed to establish spatial correlations between fine-grained textures and macroscopic combustion patterns. Subsequently, a feature-adaptive fusion Transformer is designed, which models long-range dependencies and high-frequency details in parallel via deformable attention and local convolutions, and achieves adaptive fusion of global and local features through a gating mechanism. Finally, a cross-scale feature guided aggregation module is proposed to fuse shallow detailed information with deep semantic features under dual-attention guidance. Experiments conducted on a flame image dataset from an MSWI plant in Beijing show that PRTNet achieves an accuracy of 96.29% in the combustion state classification task, with precision, recall, and F1-score all exceeding 96%, significantly outperforming numerous mainstream baseline models. Ablation studies further validate the effectiveness and synergistic effects of each module. The proposed method provides a reliable solution for intelligent flame state recognition in complex industrial scenarios, contributing to the advancement of intelligent and sustainable development in municipal solid waste incineration processes.
(This article belongs to the Special Issue Life Cycle and Sustainability Nexus in Solid Waste Management)
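The feature-adaptive fusion Transformer runs a global branch and a local convolutional branch in parallel and merges them through a gate. In the sketch below, standard multi-head self-attention stands in for the paper's deformable attention, so only the parallel-branches-plus-gate topology is illustrated; all names are assumptions.

import torch
import torch.nn as nn

class GlobalLocalGatedBlock(nn.Module):
    """Run self-attention (long-range dependencies) and a local convolution
    (high-frequency detail) in parallel, then blend them with a learned gate."""
    def __init__(self, ch, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(ch, heads, batch_first=True)
        self.local = nn.Conv2d(ch, ch, 3, padding=1)
        self.gate = nn.Conv2d(2 * ch, ch, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)   # (B, HW, C) token sequence
        glob, _ = self.attn(tokens, tokens, tokens)
        glob = glob.transpose(1, 2).reshape(b, c, h, w)
        loc = self.local(x)
        g = torch.sigmoid(self.gate(torch.cat([glob, loc], dim=1)))
        return g * glob + (1 - g) * loc

out = GlobalLocalGatedBlock(64)(torch.randn(1, 64, 32, 32))
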
22 pages, 3276 KB  
Article
AFR-CR: An Adaptive Frequency Domain Feature Reconstruction-Based Method for Cloud Removal via SAR-Assisted Remote Sensing Image Fusion
by Xiufang Zhou, Qirui Fang, Xunqiang Gong, Shuting Yang, Tieding Lu, Yuting Wan, Ailong Ma and Yanfei Zhong
Remote Sens. 2026, 18(2), 201; https://doi.org/10.3390/rs18020201 - 8 Jan 2026
Viewed by 355
Abstract
Optical imagery is often contaminated by clouds to varying degrees, which greatly affects the interpretation and analysis of images. Synthetic Aperture Radar (SAR) can penetrate clouds and mist, and a common strategy in SAR-assisted cloud removal involves fusing SAR and optical data and leveraging deep learning networks to reconstruct cloud-free optical imagery. However, these methods do not fully consider the characteristics of the frequency domain when processing feature integration, resulting in blurred edges in the generated cloudless optical images. Therefore, an adaptive frequency domain feature reconstruction-based cloud removal method is proposed to solve this problem. The proposed method comprises four key sequential stages. First, shallow features are extracted by fusing optical and SAR images. Second, a Transformer-based encoder captures multi-scale semantic features. Subsequently, the Frequency Domain Decoupling Module (FDDM) is employed. Utilizing a Dynamic Mask Generation mechanism, it explicitly decomposes features into low-frequency structures and high-frequency details, effectively suppressing cloud interference while preserving surface textures. Finally, robust information interaction is facilitated by the Cross-Frequency Reconstruction Module (CFRM) via transposed cross-attention, ensuring precise fusion and reconstruction. Experimental evaluation on the M3R-CR dataset confirms that the proposed approach achieves the best results on all four evaluated metrics, surpassing eight other state-of-the-art methods and demonstrating its effectiveness in SAR–optical fusion for cloud removal.
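The FDDM decomposes features into low-frequency structures and high-frequency details using a dynamically generated mask. The sketch below substitutes a fixed radial cutoff in the Fourier domain for the learned dynamic mask, purely to make the decoupling concrete; the cutoff value is an assumption.

import torch

def frequency_decouple(x, cutoff=0.25):
    """Split a feature map into low- and high-frequency parts with a fixed
    radial mask in the Fourier domain (the paper learns this mask dynamically)."""
    h, w = x.shape[-2:]
    fy = torch.fft.fftfreq(h).abs().view(-1, 1)   # normalized row frequencies
    fx = torch.fft.fftfreq(w).abs().view(1, -1)   # normalized column frequencies
    mask = ((fy ** 2 + fx ** 2).sqrt() <= cutoff).to(x.dtype)  # 1 = low frequency
    spec = torch.fft.fft2(x)
    low = torch.fft.ifft2(spec * mask).real       # coarse structures
    high = x - low                                # edges and fine texture
    return low, high

low, high = frequency_decouple(torch.randn(1, 64, 64, 64))
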
24 pages, 18949 KB  
Article
KGE–SwinFpn: Knowledge Graph Embedding in Swin Feature Pyramid Networks for Accurate Landslide Segmentation in Remote Sensing Images
by Chunju Zhang, Xiangyu Zhao, Peng Ye, Xueying Zhang, Mingguo Wang, Yifan Pei and Chenxi Li
Remote Sens. 2026, 18(1), 71; https://doi.org/10.3390/rs18010071 - 25 Dec 2025
Viewed by 480
Abstract
Landslide disasters are complex spatiotemporal phenomena. Existing deep learning (DL) models for remote sensing (RS) image analysis primarily exploit shallow visual features, inadequately incorporating critical geological, geographical, and environmental knowledge. This limitation impairs detection accuracy and generalization, especially in complex terrains and diverse vegetation conditions. We propose Knowledge Graph Embedding in Swin Feature Pyramid Networks (KGE–SwinFpn), a novel RS landslide segmentation framework that integrates explicit domain knowledge with deep features. First, a comprehensive landslide knowledge graph is constructed, organizing multi-source factors (e.g., lithology, topography, hydrology, rainfall, and land cover) into entities and relations that characterize controlling, inducing, and indicative patterns. A dedicated KGE Block learns embeddings for these entities and discretized factor levels from the landslide knowledge graph, enabling their fusion with multi-scale RS features in SwinFpn. This approach preserves the efficiency of automatic feature learning while embedding prior knowledge guidance, enhancing data–knowledge–model coupling. Experiments demonstrate significant improvements over classic segmentation networks: on the Yuan-yang dataset, KGE–SwinFpn achieved 96.85% pixel accuracy (PA), 88.46% mean pixel accuracy (MPA), and 82.01% mean intersection over union (MIoU); on the Bijie dataset, it attained 96.28% PA, 90.72% MPA, and 84.47% MIoU. Ablation studies confirm the complementary roles of different knowledge features and the KGE Block’s contribution to robustness in complex terrains. Notably, the KGE Block is architecture-agnostic, suggesting broad applicability for knowledge-guided RS landslide detection and promising enhanced technical support for disaster monitoring and risk assessment.
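The KGE Block fuses learned embeddings of discretized factor levels with multi-scale image features. Assuming per-pixel factor maps aligned to the feature grid, one minimal fusion is an embedding lookup followed by concatenation and projection, as sketched below; the actual KGE Block and factor encodings are not specified in the abstract.

import torch
import torch.nn as nn

class KnowledgeFusion(nn.Module):
    """Look up embeddings for discretized factor levels (lithology class,
    slope bin, land-cover class, ...) and inject them into a feature map."""
    def __init__(self, n_levels_per_factor, emb_dim, feat_ch):
        super().__init__()
        self.tables = nn.ModuleList(nn.Embedding(n, emb_dim)
                                    for n in n_levels_per_factor)
        self.project = nn.Conv2d(feat_ch + emb_dim * len(n_levels_per_factor),
                                 feat_ch, 1)

    def forward(self, feats, factor_ids):
        # factor_ids: one (B, H, W) integer map per factor, aligned to the features
        embs = [t(ids).permute(0, 3, 1, 2) for t, ids in zip(self.tables, factor_ids)]
        return self.project(torch.cat([feats] + embs, dim=1))

feats = torch.randn(2, 256, 32, 32)
litho = torch.randint(0, 12, (2, 32, 32))  # e.g. 12 lithology classes (assumed)
slope = torch.randint(0, 8, (2, 32, 32))   # e.g. 8 slope bins (assumed)
fused = KnowledgeFusion([12, 8], 16, 256)(feats, [litho, slope])
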
22 pages, 6921 KB  
Article
SFE-DETR: An Enhanced Transformer-Based Face Detector for Small Target Faces in Open Complex Scenes
by Chenhao Yang, Yueming Jiang and Chunyan Song
Sensors 2026, 26(1), 125; https://doi.org/10.3390/s26010125 - 24 Dec 2025
Viewed by 414
Abstract
Face detection is an important task in the field of computer vision and is widely applied in various applications. However, in open and complex scenes with dense faces, occlusions, and image degradation, small face detection still faces significant challenges due to the extremely small target scale, difficult localization, and severe background interference. To address these issues, this paper proposes a small face detector for open complex scenes, SFE-DETR, which aims to simultaneously improve detection accuracy and computational efficiency. The backbone network of the model adopts an inverted residual shift convolution and dilated reparameterization structure, which enhances shallow features and enables deep feature self-adaptation, thereby better preserving small-scale information and reducing the number of parameters. Additionally, a multi-head multi-scale self-attention mechanism is introduced to fuse multi-scale convolutional features with channel-wise weighting, capturing fine-grained facial features while suppressing background noise. Moreover, a redesigned SFE-FPN introduces high-resolution layers and incorporates a novel feature fusion module consisting of local, large-scale, and global branches, efficiently aggregating multi-level features and significantly improving small face detection performance. Experimental results on two challenging small face detection datasets show that SFE-DETR reduces parameters by 28.1% compared to the original RT-DETR-R18 model, achieving a mAP50 of 94.7% and AP-s of 42.1% on the SCUT-HEAD dataset, and a mAP50 of 86.3% on the WIDER FACE (Hard) subset. These results demonstrate that SFE-DETR achieves optimal detection performance among models of the same scale while maintaining efficiency.
(This article belongs to the Section Optical Sensors)
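The multi-head multi-scale self-attention mechanism fuses multi-scale convolutional features with channel-wise weighting. The sketch below captures only the multi-kernel extraction and SE-style channel weighting; the self-attention itself is omitted, and the kernel sizes and reduction ratio are assumptions.

import torch
import torch.nn as nn

class MultiScaleChannelWeighted(nn.Module):
    """Extract features with parallel convolutions of different kernel sizes and
    fuse them under squeeze-and-excitation-style channel weighting."""
    def __init__(self, ch, kernels=(1, 3, 5), r=8):
        super().__init__()
        self.branches = nn.ModuleList(nn.Conv2d(ch, ch, k, padding=k // 2)
                                      for k in kernels)
        total = ch * len(kernels)
        self.weight = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                    nn.Conv2d(total, total // r, 1), nn.ReLU(),
                                    nn.Conv2d(total // r, total, 1), nn.Sigmoid())
        self.merge = nn.Conv2d(total, ch, 1)

    def forward(self, x):
        y = torch.cat([b(x) for b in self.branches], dim=1)  # multi-scale stack
        return self.merge(y * self.weight(y))                # channel-weighted fuse

out = MultiScaleChannelWeighted(64)(torch.randn(1, 64, 40, 40))
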
18 pages, 2081 KB  
Article
Breast Ultrasound Image Segmentation Integrating Mamba-CNN and Feature Interaction
by Guoliang Yang, Yuyu Zhang and Hao Yang
Sensors 2026, 26(1), 105; https://doi.org/10.3390/s26010105 - 23 Dec 2025
Cited by 1 | Viewed by 561
Abstract
The large scale and shape variation in breast lesions make their segmentation extremely challenging. A breast ultrasound image segmentation model integrating Mamba-CNN and feature interaction is proposed for breast ultrasound images with a large amount of speckle noise and multiple artifacts. The model first uses the visual state space model (VSS) as an encoder for feature extraction to better capture long-range dependencies. Second, a hybrid attention enhancement mechanism (HAEM) is designed at the bottleneck between the encoder and the decoder to provide fine-grained control of the feature map in both the channel and spatial dimensions, so that the network captures key features and regions more comprehensively. The decoder uses transposed convolution to upsample the feature map, gradually increasing the resolution and recovering its spatial information. Finally, the cross-fusion module (CFM) is constructed to simultaneously focus on the spatial information of the shallow feature map as well as the deep semantic information, which effectively reduces the interference of noise and artifacts. Experiments on the BUSI and UDIAT datasets show that the Dice similarity coefficient and HD95 reach 76.04% and 20.28 mm, respectively, indicating that the algorithm effectively mitigates noise and artifacts in ultrasound image segmentation and improves segmentation performance compared with existing algorithms.
(This article belongs to the Section Sensing and Imaging)
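The CFM simultaneously exploits the spatial information of the shallow map and the semantics of the deep map. A minimal cross-exchange, sketched below, lets the shallow map supply a spatial mask for the deep map and the deep map supply a channel mask for the shallow map; this is an illustrative reading, not the paper's CFM.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossFusionModule(nn.Module):
    """Exchange complementary cues: the shallow map contributes a spatial
    attention mask for the deep map, and the deep map contributes a channel
    attention vector for the shallow map."""
    def __init__(self, ch):
        super().__init__()
        self.spatial = nn.Sequential(nn.Conv2d(ch, 1, 7, padding=3), nn.Sigmoid())
        self.channel = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                     nn.Conv2d(ch, ch, 1), nn.Sigmoid())

    def forward(self, shallow, deep):
        deep = F.interpolate(deep, size=shallow.shape[2:], mode="bilinear",
                             align_corners=False)
        deep = deep * self.spatial(shallow)     # where: shallow spatial detail
        shallow = shallow * self.channel(deep)  # what: deep semantic emphasis
        return torch.cat([shallow, deep], dim=1)

out = CrossFusionModule(64)(torch.randn(1, 64, 56, 56), torch.randn(1, 64, 14, 14))
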