Search Results (11,355)

Search Parameters:
Keywords = feature fusion

23 pages, 9519 KB  
Article
Physics-Prior-Guided Feature Pyramid Network for Unified Multi-Angle Spectral–Polarimetric Cloud Detection
by Shu Li, Xingyuan Ji, Xiaoxue Chu, Song Ye, Ziyang Zhang, Yongyin Gan, Xinqiang Wang and Fangyuan Wang
Remote Sens. 2026, 18(8), 1150; https://doi.org/10.3390/rs18081150 (registering DOI) - 12 Apr 2026
Abstract
Accurate cloud detection remains a significant challenge due to the spectral ambiguity between clouds and bright or heterogeneous surfaces (e.g., snow, desert). While multi-angle and polarization data offer rich information, the discriminative power of joint spectral analysis for resolving these ambiguities has been underexploited. In this work, we demonstrate that physically motivated spectral band ratios and differences can robustly enhance cloud signatures. Motivated by this insight, we propose a novel deep learning framework, the Multi-angle Polarization Feature Pyramid Structure (MP-FPS), that explicitly leverages joint spectral features as discriminative priors. Our architecture employs a dual-branch network to disentangle and adaptively fuse spectral and multi-angle polarization modalities. Within this framework, a hierarchical, multi-scale cross-channel multi-angle fusion module dynamically captures spatial–spectral–angular dependencies, enriching the structural representation of clouds. Furthermore, a channel-space dual-path attention mechanism refines sub-pixel responses, significantly improving detection accuracy in challenging regions such as cloud edges and thin cirrus. Evaluated on the global POLDER-3 dataset, MP-FPS achieves a mean Intersection over Union (mIoU) of 0.8662 across diverse surface types, surpassing the official baseline by 12.4%. This study establishes joint spectral analysis as a critical enabler for high-precision cloud masking, and demonstrates its synergistic value when integrated with multi-angle polarimetric information in a unified deep architecture. Full article
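The band-ratio and band-difference priors this abstract describes are a standard spectral-indexing idea; the pure-Python sketch below illustrates it with an assumed band pairing (865 nm vs. 443 nm) and toy reflectance values, not the paper's actual configuration.

```python
# Sketch of physics-motivated joint spectral features: per-pixel band
# ratios and differences that can enhance cloud signatures. The band
# pairing and reflectance values are illustrative assumptions only.

def band_ratio(b1, b2, eps=1e-6):
    """Per-pixel ratio of two reflectance bands (eps avoids division by zero)."""
    return [x / (y + eps) for x, y in zip(b1, b2)]

def band_difference(b1, b2):
    """Per-pixel difference of two reflectance bands."""
    return [x - y for x, y in zip(b1, b2)]

# Toy reflectances for three pixels: cloud, snow, dark vegetation.
refl_865 = [0.80, 0.75, 0.30]
refl_443 = [0.78, 0.40, 0.05]

ratios = band_ratio(refl_865, refl_443)
diffs = band_difference(refl_865, refl_443)
```

Ratio features of this kind can amplify class separations that individual bands miss, which is the discriminative prior the network then consumes.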
29 pages, 6563 KB  
Article
An Autonomous Orbit Prediction Approach for BDS MEO Satellites Using a Short-Sequence Adaptive Model
by Yihui Zhao, Yuebo Ma, Hongfeng Long, Rujin Zhao and Xia Lin
Remote Sens. 2026, 18(8), 1146; https://doi.org/10.3390/rs18081146 (registering DOI) - 12 Apr 2026
Abstract
The new-generation global navigation satellite system (GNSS) demands enhanced satellite autonomy, where high-precision orbit prediction plays a pivotal role. Traditional dynamic models depend heavily on long-term on-orbit observations, making hybrid deep-learning-based orbit prediction models an efficient alternative. Although existing studies have validated that temporal networks can effectively capture orbit error variations, improving prediction accuracy under short input sequences remains a critical challenge. To address this issue, this paper proposes an improved short-sequence-adaptive Bidirectional Long Short-Term Memory (BiLSTM) network to enhance orbit prediction performance of BeiDou Medium Earth Orbit satellites. Specifically, we design a scale-aware hybrid convolution module and an attention-driven feature fusion module to generate feature representations with high information density, which outperform the standalone BiLSTM under short input sequences. Experiments on the BeiDou system (BDS) C19 satellite demonstrate that our method reduces the mean residual rates from 54.03%, 41.18%, 80.10% to 4.36%, 6.12%, 5.39% in the X, Y, and Z axes, respectively, surpassing BiLSTM alone by over 85% across all metrics. Notably, the proposed method exhibits robust generalization capabilities across similar satellites with similar orbital configurations and dynamic environments. Full article
(This article belongs to the Special Issue Autonomous Space Navigation (Second Edition))
20 pages, 5303 KB  
Article
LGDAF-Net: A Lightweight CNN–Transformer Framework for Cross-Domain Few-Shot Hyperspectral Image Classification
by Guang Yang, Jiaoli Fang, Daming Zhu and Xiaoqing Zuo
Electronics 2026, 15(8), 1606; https://doi.org/10.3390/electronics15081606 (registering DOI) - 12 Apr 2026
Abstract
Cross-domain few-shot hyperspectral image (HSI) classification is challenging due to limited labeled samples and distribution shifts across sensors and acquisition scenes, which often degrade feature representation and classification performance. This study proposes a lightweight hierarchical CNN–Transformer framework, termed LGDAF-Net (Lightweight Global and Local Dual Attention Fusion Network), for effective cross-domain few-shot HSI classification. The framework progressively enhances spectral–spatial representation through three stages: spectral–spatial feature recalibration, local spatial structure perception, and global contextual modeling. Specifically, a spectral–spatial dual-attention enhancement module (SESA) is introduced to emphasize informative spectral responses and suppress redundancy. A Local Attention Spatial Perception Module (LASPM) is designed to capture fine-grained spatial structures, while a lightweight Transformer-based Global Attention Context Modeling Module (GACM) models long-range spatial dependencies. In addition, kernel triplet loss and domain adversarial learning are incorporated to improve feature discrimination and promote cross-domain feature alignment. Experimental results on three benchmark datasets demonstrate that the proposed method achieves competitive performance compared with existing methods. Full article
(This article belongs to the Special Issue AI-Driven Image Processing: Theory, Methods, and Applications)
24 pages, 6104 KB  
Article
Research on Medical Image Segmentation Based on Frequency-Domain Enhancement and Edge Awareness
by Jiamin Li, Yazhi Liu and Wei Li
Algorithms 2026, 19(4), 303; https://doi.org/10.3390/a19040303 (registering DOI) - 12 Apr 2026
Abstract
Medical images commonly exhibit low contrast, weak boundaries, and complex textures. In addition, significant semantic differences exist between deep-level semantic features and shallow-level detail features, posing challenges for multi-scale feature fusion in terms of detail preservation and structural consistency. To address these issues, a frequency-enhanced and bidirectional feature-guided segmentation network (FBNet) is proposed. The network comprises two core components. The frequency-based enhancement (FBE) module employs the Fast Fourier Transform and applies adaptive modulation to the amplitude spectrum through a content-aware gating mechanism, enhancing detail expression and inter-structural contrast. The Bidirectional Guided Feature Fusion module (BGF) enables bidirectional interaction between shallow and deep features. Additionally, the Structure and Edge Awareness (SEA) module is constructed using directional and variance attention mechanisms to achieve collaborative optimization of structural modeling and edge perception. Experiments on four medical image segmentation datasets show that, compared to the second-best method, FBNet achieves improvements of 2.12, 1.57, 1.37, and 1.56 percentage points on the mIoU metric and 1.54, 1.11, 0.84, and 1.03 percentage points on the mDice metric. Full article
(This article belongs to the Section Evolutionary Algorithms and Machine Learning)
29 pages, 1086 KB  
Article
Time-Aware Graph Neural Network for Asynchronous Multi-Station Integrated Sensing and Communications Fusion in Open RAN
by Zhiqiang Shen, Wooseok Shin and Jitae Shin
Sensors 2026, 26(8), 2376; https://doi.org/10.3390/s26082376 (registering DOI) - 12 Apr 2026
Abstract
Multi-station sensing telemetry typically arrives out-of-order at the Open RAN (O-RAN) Near-RT RIC due to non-deterministic jitter in cloud-native protocol stacks, inducing a “temporal scrambling” effect that invalidates traditional spatial fusion. To bridge this gap, we introduce Age-of-Sensing (AoS) as a dynamic reliability metric for asynchronous sensing reports and establish an AoS-aware graph neural network (GNN) paradigm for asynchronous sensing fusion. This paradigm shifts the focus from conventional spatial-only aggregation to time-aware inference by explicitly incorporating sensing freshness into graph-based fusion. As a physics-informed realization of this paradigm, we present Time-Aware Fusion (TA-Fusion), which introduces a TA-Gate mechanism to recalibrate node trust prior to graph aggregation. Unlike passive feature concatenation, the TA-Gate serves as an active gating signal to prioritize fresh telemetry while adaptively suppressing stale outliers. On a standardized O-RAN benchmark, TA-Fusion achieves a root mean square error (RMSE) of 12.22 m, delivering a 21.7% reduction in mean absolute error (MAE) over the AoS-aware GNN baseline and maintaining robustness in extreme jitter scenarios where traditional linear methods suffer from severe accuracy degradation due to their static weighting logic. Extensive Monte Carlo simulations confirm that the framework preserves consistent error bounds across diverse base station geometries without manual recalibration. These findings support the real-time feasibility of the proposed paradigm for delay-critical Integrated Sensing and Communication (ISAC) services, providing a resilient spatial foundation for 6G orchestration under substantial network-layer jitter. Full article
(This article belongs to the Special Issue Mobile Sensing and Computing in Internet of Things)
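A freshness gate of the kind the TA-Gate performs can be illustrated by exponentially down-weighting reports by their Age-of-Sensing before fusing; the decay constant `tau` and the toy station estimates below are assumptions for illustration, not the paper's learned gating.

```python
import math

def aos_gate(aos_list, tau=50.0):
    """Exponentially down-weight stale reports by their Age-of-Sensing (ms),
    then normalize so the trust weights sum to one. tau is an assumed
    time constant, not a parameter from the paper."""
    raw = [math.exp(-a / tau) for a in aos_list]
    total = sum(raw)
    return [r / total for r in raw]

def fuse_positions(positions, weights):
    """Trust-weighted average of per-station (x, y) position estimates."""
    x = sum(w * p[0] for w, p in zip(weights, positions))
    y = sum(w * p[1] for w, p in zip(weights, positions))
    return (x, y)

aos = [5.0, 12.0, 180.0]  # third report is stale
w = aos_gate(aos)
est = fuse_positions([(10.0, 0.0), (10.4, 0.2), (42.0, 9.0)], w)
```

The stale outlier at (42.0, 9.0) receives a near-zero weight, so the fused estimate stays close to the two fresh reports.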
26 pages, 10623 KB  
Article
LRD-DETR: A Lightweight RT-DETR-Based Model for Road Distress Detection
by Chen Dong and Yunwei Zhang
Sensors 2026, 26(8), 2375; https://doi.org/10.3390/s26082375 (registering DOI) - 12 Apr 2026
Abstract
Intelligent road distress detection technology has emerged as an important research topic in the field of highway maintenance. However, the accuracy and practicality of pavement distress detection are constrained by multiple factors, primarily the irregular shapes of distress, the tendency for fine cracks to be overlooked, and the high parameter count of detection models, which makes deployment difficult. Therefore, this study proposes LRD-DETR, a lightweight road distress detection model based on an improved RT-DETR architecture. First, this work integrates the C2f-LFEM module with the ADown adaptive down-sampling strategy into the backbone network, significantly reducing the number of model parameters and the computational load while effectively enhancing the representation of multi-scale pavement distress features. Second, a frequency-domain spatial attention module is embedded in the S4 feature layer, where the synergistic integration of frequency-domain filtering and spatial attention enhances the details of distress edges and contours, automatically focuses on distress regions, and suppresses background interference. A polarity-aware linear attention module is incorporated into the S5 feature layer; by explicitly modeling polarity interactions, it effectively captures textural discrepancies between damaged regions and the intact road surface, while a learnable power function dynamically rescales attention weights to strengthen distress-specific feature responses. Finally, a cross-scale spatial feature fusion module (CSF2M) is developed to reconstruct and fuse multi-level spatial features, thereby improving detection robustness for pavement distresses with diverse morphologies under complex background conditions. Quantitative experiments indicate that, compared with the baseline RT-DETR, the presented framework improves the F1-score by 7.1% and mAP@50 by 9.0% while reducing computational complexity and parameter count by 43.8% and 38.0%, respectively. These advantages make LRD-DETR suitable for deployment on resource-limited embedded platforms for real-time road distress detection. Full article
(This article belongs to the Special Issue AI and Smart Sensors for Intelligent Transportation Systems)
33 pages, 7834 KB  
Article
Frequency-Domain Decoupling and Multi-Dimensional Spatial Feature Reconstruction for Occlusion-Aware Apple Detection in Complex Semi-Structured Orchard Environments
by Long Gao, Pengfei Wang, Lixing Liu, Hongjie Liu, Jianping Li and Xin Yang
Agronomy 2026, 16(8), 790; https://doi.org/10.3390/agronomy16080790 (registering DOI) - 12 Apr 2026
Abstract
Apple detection is a core perception task for harvesting robots operating in complex orchard environments. Targets are frequently affected by branch–foliage occlusion, alternating front/side/back lighting, and strong local illumination fluctuations, which blur object boundaries against background textures and substantially increase detection difficulty. To improve target perception under these conditions, we propose an improved detector, YOLOv11-CBMES. First, based on YOLOv11, we replace the original neck with a weighted BiFPN to enhance cross-scale feature fusion under occlusion. Second, we introduce a Contrast-Driven Feature Aggregation (CDFA) module at the P5 stage, using Haar wavelet decomposition to decouple low-frequency illumination components from high-frequency structural components. Third, we reconstruct spatial feature learning and the upsampling pathway using CSP-based multi-scale blocks and efficient upsampling blocks, and embed a zero-parameter Shift-Context strategy to strengthen local neighbourhood interaction. Finally, we formulate apple detection as a three-class occlusion classification task (No Occlusion, Soft Occlusion, and Hard Occlusion) to support occlusion-aware target recognition. On the apple occlusion dataset, YOLOv11-CBMES achieves mAP_NO = 83.50%, mAP_SO = 67.36%, and mAP_HO = 51.90% at IoU = 0.5. Compared with YOLOv11n under the same training protocol, the gains are +2.16 pp (NO), +3.68 pp (SO), and +5.31 pp (HO), with the largest improvement observed in Hard Occlusion (HO). The results indicate that introducing frequency-domain structural processing into the detection framework improves apple occlusion classification and object detection performance, and provides a theoretical basis for designing perception modules for end-effector operations in apple harvesting robots. Full article
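The weighted BiFPN neck mentioned in this abstract uses BiFPN's published "fast normalized fusion" rule; below is a minimal sketch of that formulation with toy weights and two-element feature vectors standing in for real feature maps.

```python
def weighted_fusion(features, weights, eps=1e-4):
    """BiFPN-style fast normalized fusion: ReLU the learnable weights,
    then combine inputs as sum(w_i * f_i) / (sum(w_i) + eps)."""
    w = [max(0.0, wi) for wi in weights]  # ReLU keeps weights non-negative
    denom = sum(w) + eps
    n = len(features[0])
    return [sum(w[i] * features[i][k] for i in range(len(features))) / denom
            for k in range(n)]

# Two toy "feature maps"; the negative weight is clipped to zero by ReLU,
# so the second input is effectively ignored.
fused = weighted_fusion([[1.0, 2.0], [3.0, 4.0]], [0.5, -0.2])
```

The epsilon in the denominator keeps the fusion stable when all weights collapse toward zero, which is why BiFPN prefers this over a plain softmax.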
31 pages, 7021 KB  
Article
TMAFNet: A Transformer-Based Multi-Level Adaptive Fusion Network for Remote Sensing Change Detection
by Yushuai Yuan, Zhiyong Fan, Shuai Zhang, Min Xia and Yalu Huang
Remote Sens. 2026, 18(8), 1143; https://doi.org/10.3390/rs18081143 (registering DOI) - 12 Apr 2026
Abstract
High-resolution remote sensing imagery encompasses complex land cover types and rich textural details, whilst temporal variations often manifest as subtle feature differences and unstable structural patterns. This renders traditional change detection methods ineffective at accurately characterizing genuine alterations, frequently leading to underdetection, false positives, and ambiguous boundaries. To address these challenges, this paper proposes a Transformer-Based Multi-level Adaptive Fusion Network. It is built upon the DeepLabV3+ encoder–decoder framework, in which a shared-weight ResNet-101 is adopted as the backbone for dual-temporal feature extraction, with the final residual block of layer 4 cropped to extract deeper semantic features at a higher spatial resolution. The Adaptive Window–Attention Feature Fusion Module (AWAFM) adaptively models local and global differences across temporal phases, enhancing sensitivity to genuine changes. The Dual Strip Pool Fusion Module (DSPFM) enhances sensitivity to directional structural variations through horizontal and vertical strip pooling. The Progressive Multi-Scale Feature Fusion Module (PMFFM) progressively aggregates deep and shallow features via semantic residual transmission. To further suppress misleading suppression caused by complex textures, the Transformer-Enhanced Reverse Attention Fusion Module (TRAFM) explicitly models long-range dependencies, effectively mitigating false change responses. On the LEVIR-CD dataset, it achieves state-of-the-art performance, with a PA and an IoU of 92.36% and 90.13%, respectively. On the SYSU-CD dataset, PA and IoU reach 88.96% and 86.15%, demonstrating TMAFNet’s stability and superiority in scenarios involving complex ground surface disturbances, weak textural variations, and large-scale structural changes. Full article
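The horizontal and vertical strip pooling inside DSPFM follows the standard strip-pooling operation; here is a minimal sketch on a 2×2 toy feature map (the module's subsequent convolutional fusion is omitted).

```python
def strip_pool(fmap):
    """Horizontal and vertical strip pooling over a 2D feature map:
    average each row (an H x 1 strip) and each column (a 1 x W strip),
    capturing directional structure a square pooling window would blur."""
    h = len(fmap)
    w = len(fmap[0])
    row_means = [sum(row) / w for row in fmap]
    col_means = [sum(fmap[i][j] for i in range(h)) / h for j in range(w)]
    return row_means, col_means

fmap = [[1.0, 3.0],
        [5.0, 7.0]]
rows, cols = strip_pool(fmap)
```

Long, thin strips respond strongly to elongated structures such as roads or building edges, which is why strip pooling suits directional change detection.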
16 pages, 3032 KB  
Article
A Novel Topology-Based Candidate Reaction Prediction Approach for Gap-Fillings of Genome-Scale Metabolic Models
by Jiajun Qu and Kai Wang
Metabolites 2026, 16(4), 258; https://doi.org/10.3390/metabo16040258 (registering DOI) - 12 Apr 2026
Abstract
Background: Predicting and filling metabolic reaction gaps (gap-filling) is essential for reconstructing high-quality genome-scale metabolic models (GEMs). Currently, many existing optimization-based gap-filling methods must rely on phenotypic data, while the performance of topology-based deep learning approaches needs further improvement. Methods: This paper proposes a novel topology-based approach (GHCN-SE) for predicting confidence scores of candidate reactions, which can be used for gap-filling of GEMs. The topological features of GEMs are fully extracted by simultaneously using graph and hypergraph convolutional networks, so that both associations of metabolites in the same reaction and higher-order interactions of metabolites within reactions can be captured. After feature fusion, we further employ a squeeze-and-excitation network to enhance the features. Results: Reaction prediction and reaction recovery experiments with 5-fold cross-validation on 108 high-quality BiGG GEMs show that the proposed GHCN-SE is superior to other related methods. An ablation study further demonstrates the contributions of the graph convolutional network, hypergraph convolutional network, and squeeze-and-excitation network in GHCN-SE. In addition, a visualization study illustrates the effectiveness of GHCN-SE. Conclusions: For potential applications in metabolic engineering, biomedicine, etc., GHCN-SE can be used to further improve the phenotypic prediction accuracy of draft GEMs generated by automated reconstruction tools. Full article
(This article belongs to the Section Bioinformatics and Data Analysis)
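The squeeze-and-excitation step GHCN-SE uses for feature enhancement can be sketched as global average pooling ("squeeze") followed by a small gating network ("excite"); the scalar gate weights below are toy assumptions standing in for the learned bottleneck MLP.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def squeeze_excite(channels, w1, w2):
    """Squeeze: global-average each channel to a scalar descriptor.
    Excite: a tiny two-layer gate (ReLU then sigmoid; w1, w2 are toy
    scalars, not learned weights) produces per-channel scales in (0, 1)
    that recalibrate the original features."""
    squeezed = [sum(c) / len(c) for c in channels]
    hidden = [max(0.0, w1 * s) for s in squeezed]      # ReLU bottleneck
    scales = [sigmoid(w2 * h) for h in hidden]
    return [[scales[i] * v for v in channels[i]] for i in range(len(channels))]

# Two toy channels; the stronger channel receives a gate closer to 1.
out = squeeze_excite([[1.0, 1.0], [4.0, 4.0]], w1=1.0, w2=1.0)
```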
30 pages, 6019 KB  
Article
A Novel PolSAR Classification Method Based on Dynamic Weight Adjustment of Heterogeneous Feature Fusion
by Yan Duan, Sonya Coleman, Li Yang, Haijun Wang, Guangwei Wang and Dermot Kerr
Remote Sens. 2026, 18(8), 1140; https://doi.org/10.3390/rs18081140 (registering DOI) - 12 Apr 2026
Abstract
In response to the problems of insufficient fusion of heterogeneous amplitude and phase features, deficient direction-sensitivity modeling, and a single fusion level in the polarimetric synthetic aperture radar classification task, this paper proposes a PolSAR classification method based on dynamic weight adjustment and heterogeneous feature fusion. The method uses a dual-branch parallel structure to separately extract polarization features and land-cover amplitude–phase direction-difference features, and constructs a three-level progressive fusion strategy across the sub-branch, cross-branch, and decision layers to achieve adaptive complementation of heterogeneous features. Experiments on three standard datasets show that the classification accuracy and visual consistency of this method are significantly superior to those of classical methods, with overall accuracy improved by 1.5% to 2.4%. Full article
16 pages, 2590 KB  
Article
A Feature-Enhanced Network for Vegetable Disease Detection in Complex Environments
by Xuewei Wang and Jun Liu
Plants 2026, 15(8), 1182; https://doi.org/10.3390/plants15081182 (registering DOI) - 11 Apr 2026
Abstract
Accurate vegetable disease detection in complex cultivation environments remains challenging because early lesions are often small, low-contrast, and easily confounded by cluttered backgrounds. To address this issue, we propose VDD-Net, a feature-enhanced detection network based on YOLOv10 for robust vegetable disease detection in protected agriculture. The proposed framework integrates three modules: a receptive field enhancement (RFE) module to improve local perception of small lesions, an adaptive channel fusion (ACF) module to strengthen multi-scale feature aggregation and suppress background interference, and a global context attention (GCA) module to capture long-range dependencies and improve contextual discrimination. Experiments on a custom vegetable disease dataset showed that VDD-Net achieved an mAP@0.5 of 95.2% with only 7.78 M parameters. To further evaluate robustness, zero-shot cross-domain testing was conducted on the PlantDoc dataset, where VDD-Net achieved an mAP@0.5 of 76.5%, outperforming the baseline and showing improved generalization to natural scenes. In addition, after TensorRT optimization and FP16 quantization, the model maintained real-time inference on edge platforms, reaching 89.3 FPS on Jetson AGX Orin and 24.2 FPS on Jetson Nano. These results indicate that VDD-Net provides a practical balance among detection accuracy, cross-domain robustness, and deployment efficiency for intelligent disease monitoring in modern agriculture. Full article
(This article belongs to the Special Issue Combined Stresses on Plants: From Mechanisms to Adaptations)
18 pages, 1357 KB  
Article
Fault Diagnosis for Hydropower Units Based on Multi-Sensor Data with Multi-Scale Fusion
by Di Zhou, Xiangqu Xiao and Chaoshun Li
Water 2026, 18(8), 915; https://doi.org/10.3390/w18080915 (registering DOI) - 11 Apr 2026
Abstract
Accurate fault diagnosis of hydropower units is crucial for ensuring the efficient and complete utilization of hydropower resources. Existing diagnostic methods predominantly consider either single-sensor or single-scale multi-sensor fusion, failing to fully exploit the effective information within monitoring data. Furthermore, they neglect the correlation between different sensors and faults during fusion diagnosis, thereby limiting the diagnostic performance of fusion models. To address this, this paper proposes a multi-sensor data fault diagnosis method based on multi-scale fusion. First, a feature extraction model is constructed to extract shallow-level features from multi-sensor signals across multiple dimensions. Subsequently, an attention-based feature fusion network is designed to extract and fuse multi-depth features, yielding high-quality deep-fused features. Finally, an information-entropy-based decision fusion strategy is established to effectively enhance the model’s diagnostic performance. Experimental validation on the public rotating machinery fault dataset and the hydropower unit fault dataset yielded diagnostic accuracies of 96.42% and 99.28%, respectively, demonstrating the significant effectiveness and robustness of the proposed method. Full article
(This article belongs to the Section Water-Energy Nexus)
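The information-entropy-based decision fusion described above can be illustrated by weighting each sensor's class probabilities inversely with its entropy, so confident sensors dominate the fused decision; the 1/(1 + H) weighting below is one plausible formulation, not necessarily the paper's exact rule.

```python
import math

def entropy(p):
    """Shannon entropy of a probability vector (natural log)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def entropy_weighted_fusion(prob_sets):
    """Weight each sensor's class-probability vector by 1/(1 + entropy):
    low-entropy (confident) sensors contribute more, then renormalize."""
    weights = [1.0 / (1.0 + entropy(p)) for p in prob_sets]
    total = sum(weights)
    weights = [w / total for w in weights]
    n = len(prob_sets[0])
    fused = [sum(w * p[k] for w, p in zip(weights, prob_sets)) for k in range(n)]
    s = sum(fused)
    return [f / s for f in fused]

fused = entropy_weighted_fusion([[0.9, 0.05, 0.05],   # confident sensor
                                 [0.4, 0.3, 0.3]])    # uncertain sensor
```

Compared with a plain average, the fused distribution leans toward the confident sensor's verdict, which is the intended effect of entropy-based weighting.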
22 pages, 13987 KB  
Article
SDTformer: Scale-Adaptive Differential Transformer Network for Remote Sensing Image Dehazing
by Boyu Liu and Qi Zhang
Remote Sens. 2026, 18(8), 1136; https://doi.org/10.3390/rs18081136 (registering DOI) - 11 Apr 2026
Abstract
In Transformer-based image restoration models, the self-attention mechanism often introduces attention noise from irrelevant contextual features, hindering the recovery of underlying clear content. Although many methods have been proposed to suppress attention noise, we note that most existing approaches are developed for general vision tasks and fail to generalize to remote sensing image dehazing, where large-scale spatial structures pose additional challenges for attention modeling. Effectively modeling scale-aware attention to suppress redundant activations thus becomes crucial for remote sensing image dehazing. In this paper, we propose a scale-adaptive differential Transformer (SDTformer), an architecture designed to suppress attention noise through a differential attention mechanism, thereby improving reconstruction fidelity. Specifically, the model incorporates a scale-adaptive differential self-attention module, which models contextual dependencies across different spatial scales and reduces redundant contextual interference by computing differential attention maps. Additionally, a dynamic differential feed-forward network is proposed to adaptively select informative spatial features, strengthening feature aggregation. To further enhance feature representation, a gated fusion module is introduced to aggregate multi-scale features generated by different encoder blocks, which facilitates the learning of each decoder block and improves the final reconstruction performance. Extensive experimental results on commonly used benchmarks show that our method achieves favorable performance against state-of-the-art approaches. Full article
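Differential attention, the general mechanism SDTformer's noise suppression builds on, subtracts a second scaled softmax map from the first so that noise both maps assign to irrelevant context cancels; here is a minimal one-row sketch with toy logits (the paper's scale-adaptive, multi-head form is far richer).

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def differential_attention(scores_a, scores_b, lam=0.5):
    """Differential attention sketch: subtract a second softmax map,
    scaled by lambda, from the first. The logits and lambda below are
    toy values, not learned query-key scores."""
    a = softmax(scores_a)
    b = softmax(scores_b)
    return [ai - lam * bi for ai, bi in zip(a, b)]

# The first map focuses on position 0; the second is uniform "noise".
attn = differential_attention([2.0, 0.1, 0.1], [0.5, 0.5, 0.5], lam=0.5)
```

Subtracting the near-uniform map pushes the weights of irrelevant positions toward (or below) zero while the attended position stays dominant.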
25 pages, 6534 KB  
Article
Spectral–Spatial State Space Model with Hybrid Attention for Hyperspectral Image Classification
by Mengdi Cheng, Haixin Sun, Fanlei Meng, Qiuguang Cao and Jingwen Xu
Algorithms 2026, 19(4), 300; https://doi.org/10.3390/a19040300 (registering DOI) - 11 Apr 2026
Abstract
Hyperspectral image (HSI) classification requires the extraction of discriminative features from high-dimensional spatial–spectral data. While the Mamba architecture has shown promise in long-sequence modeling with linear complexity, its application to HSI remains constrained by two major hurdles: the unidirectional causal scanning which fails to capture non-causal global dependencies, and the serialization-induced loss of two-dimensional spatial topology and local textures. To overcome these limitations, we propose HAMamba, a novel Hybrid Attention State Space Model. HAMamba facilitates deep representation learning through two core components: a Multi-Scale Dynamic Fusion (MSDF) module and a Hybrid Attention Mamba Encoder (HAME). Specifically, the MSDF module augments spatial perception through parallelized feature extraction and dynamically weighted integration. The HAME synergizes a Bidirectional Sequence Scan Mamba (BSSM) to establish global semantic context and a Spatial–Spectral Gated Attention (SSGA) module to refine local structural details. Comprehensive experiments on four public benchmark datasets demonstrate that the proposed HAMamba significantly outperforms state-of-the-art approaches, achieving a superior balance between classification accuracy and computational efficiency. Full article
(This article belongs to the Section Evolutionary Algorithms and Machine Learning)
15 pages, 1264 KB  
Article
ES2-LeafSeg: Lightweight State Space Modeling-Driven Agricultural Leaf Segmentation
by Hao Wang, Zhiyang Li, Pengsen Zhao and Jinlong Yu
Appl. Sci. 2026, 16(8), 3745; https://doi.org/10.3390/app16083745 - 10 Apr 2026
Abstract
Agricultural robots and unmanned farmland management require real-time and precise parsing of crop leaves at the edge to support variable application of pesticides, seedling condition monitoring, and phenotypic analysis. However, the field environment features drastic changes in light, leaf occlusion, and interference from background weeds, which can cause semantic fragmentation and boundary artifacts in lightweight models. This paper presents ES2-LeafSeg, a lightweight framework for leaf semantic segmentation tailored for edge deployment. The method employs EfficientNetV2 as the backbone encoder and introduces the State Space Semantic Enhancement Module (S2FEM) on skip connection features, modeling long-range dependencies and suppressing local texture noise through SSM pooling in row and column directions. Meanwhile, a cross-scale decoder (CSD) and a global context transformation (GCT) are designed to achieve multi-scale semantic fusion and boundary refinement. On the three-class segmentation task of the SoyCotton dataset, ES2-LeafSeg achieved mIoU of 0.817, mDice of 0.869, F_β^w of 0.925, and MAE of 0.011, outperforming multiple classic and recent baselines while maintaining 23.67 M parameters and 49.62 FPS. Ablation experiments further verified the complementary contributions of S2FEM and GCT to regional consistency and boundary quality. Full article
(This article belongs to the Section Computing and Artificial Intelligence)