Search Results (2,410)

Search Parameters:
Keywords = remote sensing information extraction

20 pages, 11548 KB  
Article
Frequency-Aware Feature Pyramid Framework for Contextual Representation in Remote Sensing Object Detection
by Lingyun Gu, Qingyun Fang, Eugene Popov, Vitalii Pavlov, Sergey Volvenko, Sergey Makarov and Ge Dong
Astronautics 2026, 1(1), 5; https://doi.org/10.3390/astronautics1010005 - 17 Jan 2026
Abstract
Remote sensing object detection is a critical task in Earth observation. Despite the remarkable progress made in general object detection, existing detectors struggle with remote sensing scenarios due to the prevalence of numerous small objects with limited discriminative cues. Cutting-edge studies have shown that incorporating contextual information effectively enhances detection performance for small objects. Meanwhile, recent research has revealed that convolution in the frequency domain can capture long-range spatial dependencies with high efficiency. Inspired by this, we propose a Frequency-aware Feature Pyramid Framework (FFPF) for remote sensing object detection, which consists of a novel Frequency-aware ResNet (F-ResNet) and a Bilateral Spectral-aware Feature Pyramid Network (BS-FPN). Specifically, the F-ResNet extracts spectral context information by plugging frequency-domain convolution into each stage of the backbone, thereby enriching the features of small objects. In addition, the BS-FPN employs a bilateral sampling strategy and skip connections to model the association of object features at different scales, enabling the contextual information extracted by the F-ResNet to be fully leveraged. Extensive experiments are conducted on a public remote sensing image dataset and a natural image dataset. The results demonstrate the excellent performance of the FFPF, which achieves 73.8% mAP on the DIOR dataset without any additional training tricks.
(This article belongs to the Special Issue Feature Papers on Spacecraft Dynamics and Control)
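
The central mechanism, convolution in the frequency domain, amounts to a pointwise operation on the FFT of a feature map, which acts globally in space. A minimal PyTorch sketch of such a block (an illustration of the idea only; the class name and layer choices are assumptions, not the authors' F-ResNet code):

```python
import torch
import torch.nn as nn

class SpectralConv2d(nn.Module):
    """Frequency-domain convolution: a 1x1 convolution applied to the real FFT
    of the feature map mixes information across the whole spatial extent,
    giving a global receptive field at low cost (illustrative sketch)."""
    def __init__(self, channels: int):
        super().__init__()
        # real and imaginary parts are stacked along the channel axis
        self.mix = nn.Conv2d(channels * 2, channels * 2, kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        b, c, h, w = x.shape
        freq = torch.fft.rfft2(x, norm="ortho")        # complex, (B, C, H, W//2+1)
        z = torch.cat([freq.real, freq.imag], dim=1)   # (B, 2C, H, W//2+1)
        z = self.act(self.mix(z))
        real, imag = z.chunk(2, dim=1)
        out = torch.fft.irfft2(torch.complex(real, imag), s=(h, w), norm="ortho")
        return x + out  # residual, so the block can be plugged into a backbone stage
```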

24 pages, 43005 KB  
Article
Accurate Estimation of Spring Maize Aboveground Biomass in Arid Regions Based on Integrated UAV Remote Sensing Feature Selection
by Fengxiu Li, Yanzhao Guo, Yingjie Ma, Ning Lv, Zhijian Gao, Guodong Wang, Zhitao Zhang, Lei Shi and Chongqi Zhao
Agronomy 2026, 16(2), 219; https://doi.org/10.3390/agronomy16020219 - 16 Jan 2026
Abstract
Maize is one of the top three crops globally, ranking behind only rice and wheat. Aboveground biomass (AGB) is a key indicator for assessing maize growth and yield potential. This study developed an efficient and stable biomass prediction model to estimate the AGB of spring maize (Zea mays L.) under subsurface drip irrigation in arid regions, based on UAV multispectral remote sensing and machine learning. Focusing on typical subsurface drip-irrigated spring maize in arid Xinjiang, multispectral images and field-measured AGB data were collected from 96 sample points (selected via stratified random sampling across 24 plots) over four key phenological stages in 2024 and 2025. Sixteen vegetation indices were calculated and 40 texture features were extracted using the gray-level co-occurrence matrix method, and an integrated feature-selection strategy combining Elastic Net and Random Forest was employed to screen key predictor variables. Based on the selected features, six machine learning models were constructed: Elastic Net Regression (ENR), Gradient Boosting Decision Trees (GBDT), Gaussian Process Regression (GPR), Partial Least Squares Regression (PLSR), Random Forest (RF), and Extreme Gradient Boosting (XGB). The fused feature set comprised four vegetation indices (GRDVI, RERVI, GRVI, NDVI) and five texture features (R_Corr, NIR_Mean, NIR_Vari, B_Mean, B_Corr), retaining red-edge and visible-light texture information highly sensitive to AGB. The GPR model based on the fused features performed best (test set R² = 0.852, RMSE = 2890.74 kg ha⁻¹, MAE = 1676.70 kg ha⁻¹), demonstrating high fitting accuracy and stable predictive ability across both the training and test sets. Spatial inversions over the 2024 and 2025 growing seasons, derived from the fused-feature GPR model at four key phenological stages, revealed pronounced spatiotemporal heterogeneity and stage-dependent dynamics of spring maize AGB: biomass accumulates rapidly from jointing to grain filling, slows thereafter, and peaks at maturity. At a constant planting density, AGB increased markedly with nitrogen inputs from N0 to N3 (420 kg N ha⁻¹), with the high-nitrogen N3 treatment producing the greatest biomass; this captured the regulatory effect of the nitrogen gradient on maize growth, provided reliable data for variable-rate fertilization, and is highly relevant for optimizing water–fertilizer coordination in subsurface drip irrigation systems. Future research may extend this integrated feature-selection and modeling framework to monitor growth and estimate yield of other crops, such as rice and cotton, validating its generalizability and robustness in diverse agricultural scenarios.
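
The modeling pipeline described above, two feature selectors fused into one subset and a GPR fit on it, can be sketched with scikit-learn (a minimal sketch under assumptions: the fusion rule, kernel, and hyperparameters are illustrative, not the paper's):

```python
from sklearn.linear_model import ElasticNetCV
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def fuse_selected_features(X, y, names, top_k=15):
    """Keep features that Elastic Net assigns a non-zero coefficient AND that
    rank among the top Random Forest importances (fusion rule assumed)."""
    enet = ElasticNetCV(cv=5, random_state=0).fit(X, y)
    rf = RandomForestRegressor(n_estimators=500, random_state=0).fit(X, y)
    kept_enet = {n for n, c in zip(names, enet.coef_) if abs(c) > 1e-6}
    ranked_rf = sorted(zip(names, rf.feature_importances_), key=lambda p: -p[1])
    kept_rf = {n for n, _ in ranked_rf[:top_k]}
    return sorted(kept_enet & kept_rf)

# GPR on the fused subset; the kernel choice is an assumption
gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
```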

19 pages, 9385 KB  
Article
YOLOv11-MDD: YOLOv11 in an Encoder–Decoder Architecture for Multi-Label Post-Wildfire Damage Detection—A Case Study of the 2023 US and Canada Wildfires
by Masoomeh Gomroki, Negar Zahedi, Majid Jahangiri, Bahareh Kalantar and Husam Al-Najjar
Remote Sens. 2026, 18(2), 280; https://doi.org/10.3390/rs18020280 - 15 Jan 2026
Abstract
Natural disasters occur worldwide and cause significant financial and human losses. Wildfires are among the most significant, occurring more frequently in recent years due to global warming. Fast and accurate post-disaster damage detection can play an essential role in swift rescue planning and operations. Remote sensing (RS) data is an important source for damage detection, and deep learning (DL) methods can efficiently extract valuable information from RS data to generate accurate damage maps for subsequent operations. The present study proposes an encoder–decoder architecture composed of pre-trained YOLOv11 blocks as the encoder path and Modified UNet (MUNet) blocks as the decoder path. The proposed network includes three main steps: (1) pre-processing, (2) network training, and (3) multi-label damage map prediction and accuracy evaluation. The network was evaluated on satellite images of the 2023 wildfires in the US and Canada, achieving Overall Accuracies (OA) of 97.36% and 97.47% and Kappa Coefficients (KC) of 0.96 and 0.87 on the US and Canada datasets, respectively. Given the high OA and KC, an accurate burnt-area map can be generated to assist rescue and recovery efforts after a wildfire. The proposed YOLOv11–MUNet framework introduces an efficient and accurate post-event-only approach for wildfire damage detection. By removing the dependency on pre-event imagery and reducing model complexity, this method enhances the applicability of DL in rapid post-disaster assessment and management.
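
For reference, the OA and Kappa figures quoted are standard confusion-matrix statistics; a small self-contained implementation (not from the paper):

```python
import numpy as np

def oa_and_kappa(cm):
    """Overall Accuracy and Cohen's Kappa from a confusion matrix
    (rows = reference classes, columns = predicted classes)."""
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    po = np.trace(cm) / n                                # observed agreement = OA
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2  # chance agreement
    return po, (po - pe) / (1 - pe)                      # OA, Kappa
```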

17 pages, 2108 KB  
Article
Dynamic Monitoring of High-Rise Building Areas in Xiong’an New Area Using Temporal Change-Aware U-Net
by Junye Lv, Liwei Li and Gang Cheng
Remote Sens. 2026, 18(2), 253; https://doi.org/10.3390/rs18020253 - 13 Jan 2026
Abstract
High-rise building areas (HRBs), a key urban land-cover type defined by distinct morphological and functional characteristics, play a critical role in urban development. Their spatial distribution and temporal dynamics serve as essential indicators for quantifying urbanization and analyzing the evolution of urban spatial structure. To meet the dynamic monitoring needs of HRBs, this study develops a temporal change detection model, TCA-Unet (Temporal Change-Aware U-Net), built around a temporal change-aware attention module. The model adopts a dual-path design combining a temporal attention encoder and a change-aware encoder; by explicitly modeling temporal difference features, it captures change information in time-series remote sensing images. It incorporates a multi-level weight generation mechanism that dynamically balances temporal features and change-aware features through an adaptive fusion strategy, effectively integrating temporal context and enhancing the model's ability to capture long-term temporal dependencies. Using the Xiong'an New Area and its surroundings as the study area, experiments were conducted on Sentinel-2 time-series imagery from 2017 to 2024. The results demonstrate that the proposed model outperforms existing approaches, achieving an overall accuracy (OA) of 90.98%, an F1 score of 82.63%, and a mean intersection over union (mIoU) of 72.22%. Overall, this study provides an effective tool for dynamic monitoring of HRBs and offers valuable guidance for urban development and regulation.
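
The adaptive fusion of the two encoder paths can be pictured as a learned pixel-wise gate between temporal features and change (difference) features. A minimal sketch, assuming a simple sigmoid gate (the paper's multi-level weight generation is more elaborate):

```python
import torch
import torch.nn as nn

class ChangeAwareFusion(nn.Module):
    """Pixel-wise gate balancing temporal-attention features against
    change-aware difference features (illustrative; the module name and
    layout are assumptions)."""
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(channels * 2, channels, 1), nn.Sigmoid())

    def forward(self, f_temporal, f_change):
        w = self.gate(torch.cat([f_temporal, f_change], dim=1))  # weights in (0, 1)
        return w * f_temporal + (1.0 - w) * f_change
```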

24 pages, 5237 KB  
Article
DCA-UNet: A Cross-Modal Ginkgo Crown Recognition Method Based on Multi-Source Data
by Yunzhi Guo, Yang Yu, Yan Li, Mengyuan Chen, Wenwen Kong, Yunpeng Zhao and Fei Liu
Plants 2026, 15(2), 249; https://doi.org/10.3390/plants15020249 - 13 Jan 2026
Abstract
Wild ginkgo, as an endangered species, holds significant value for genetic resource conservation, yet monitoring it in practice faces numerous challenges. Traditional field surveys are inefficient in mountainous mixed forests, while satellite remote sensing is limited by spatial resolution. Current deep learning approaches that rely on single-source data or simple multi-source fusion fail to fully exploit the available information, leading to suboptimal recognition performance. This study presents a multimodal ginkgo crown dataset comprising RGB and multispectral images acquired by a UAV platform. To achieve precise crown segmentation with these data, we propose a novel dual-branch dynamic-weighting fusion network, termed dual-branch cross-modal attention-enhanced UNet (DCA-UNet). We design a dual-branch encoder (DBE) with a two-stream architecture for independent feature extraction from each modality, and a cross-modal interaction fusion module (CIF) that employs cross-modal attention and learnable dynamic weights to strengthen multi-source information fusion. Additionally, we introduce an attention-enhanced decoder (AED) that combines progressive upsampling with a hybrid channel-spatial attention mechanism, thereby effectively exploiting multi-scale features and enhancing boundary semantic consistency. Evaluation on the ginkgo dataset shows that DCA-UNet achieves 93.42% IoU (Intersection over Union), 96.82% PA (Pixel Accuracy), 96.38% Precision, and 96.60% F1-score. These results outperform the differential feature attention fusion network (DFAFNet) by 12.19%, 6.37%, 4.62%, and 6.95%, respectively, and surpass the single-modality baselines (RGB or multispectral) on all metrics. Superior performance on cross-flight-altitude data further validates the model's generalization capability and robustness in complex scenarios, offering a reliable and efficient solution for monitoring wild endangered tree species via UAV-based multimodal sensing.
(This article belongs to the Special Issue Advanced Remote Sensing and AI Techniques in Agriculture and Forestry)
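
The cross-modal interaction described for the CIF module can be approximated with two attention passes and one learnable balance term. A hedged sketch (names, head count, and the scalar weight are assumptions, not the released code):

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Each modality queries the other, and a learned scalar balances the two
    enhanced streams (sketch of the cross-modal attention idea)."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.rgb_from_ms = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ms_from_rgb = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.alpha = nn.Parameter(torch.tensor(0.0))  # sigmoid(0) = 0.5 at init

    def forward(self, rgb, ms):  # token sequences of shape (B, N, dim)
        rgb_enh, _ = self.rgb_from_ms(rgb, ms, ms)    # RGB attends to multispectral
        ms_enh, _ = self.ms_from_rgb(ms, rgb, rgb)    # multispectral attends to RGB
        a = torch.sigmoid(self.alpha)
        return a * (rgb + rgb_enh) + (1 - a) * (ms + ms_enh)
```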

27 pages, 5686 KB  
Article
MAFMamba: A Multi-Scale Adaptive Fusion Network for Semantic Segmentation of High-Resolution Remote Sensing Images
by Boxu Li, Xiaobing Yang and Yingjie Fan
Sensors 2026, 26(2), 531; https://doi.org/10.3390/s26020531 - 13 Jan 2026
Abstract
With rapid advances in sub-meter satellite and aerial imaging, high-resolution remote sensing imagery has become a pivotal source for geospatial information acquisition. However, current semantic segmentation models face two primary challenges: (1) the inherent trade-off between capturing long-range global context and preserving precise local structural detail, where excessive reliance on downsampled deep semantics often blurs boundaries and loses small objects; and (2) the difficulty of modeling complex scenes with extreme scale variations, where objects of the same category exhibit drastically different morphologies. To address these issues, this paper introduces MAFMamba, a multi-scale adaptive fusion visual Mamba network tailored to high-resolution remote sensing images. To mitigate scale variation, we design a lightweight hybrid encoder incorporating an Adaptive Multi-scale Mamba Block (AMMB) in each stage. Driven by a Multi-scale Adaptive Fusion (MSAF) mechanism, the AMMB dynamically generates pixel-level weights to recalibrate cross-level features, establishing a robust multi-scale representation. To balance local detail against global semantics, we introduce a Global–Local Feature Enhancement Mamba (GLMamba) module in the decoder, which integrates local fine-grained features extracted by convolutions with global long-range dependencies modeled by the Visual State Space (VSS) layer. Furthermore, we propose a Multi-Scale Cross-Attention Fusion (MSCAF) module to bridge the semantic gap between the encoder's shallow details and the decoder's high-level semantics via an efficient cross-attention mechanism. Extensive experiments on the ISPRS Potsdam and Vaihingen datasets show that MAFMamba surpasses state-of-the-art Convolutional Neural Network (CNN), Transformer, and Mamba-based methods in mIoU and mF1 while maintaining linear computational complexity and low memory usage, underscoring its efficiency in complex remote sensing scenarios.
(This article belongs to the Special Issue Intelligent Sensors and Artificial Intelligence in Building)
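
The pixel-level recalibration that MSAF performs can be illustrated as softmax weights over features observed at several scales. A minimal sketch (the scale set and layer structure are assumptions, not the paper's implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleAdaptiveFusion(nn.Module):
    """Pixel-level softmax weights recalibrate features pooled at several
    scales and resampled to full resolution (illustrative sketch)."""
    def __init__(self, channels: int, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        self.to_weights = nn.Conv2d(channels * len(scales), len(scales), kernel_size=1)

    def forward(self, x):
        h, w = x.shape[-2:]
        feats = [x if s == 1 else
                 F.interpolate(F.avg_pool2d(x, s), size=(h, w),
                               mode="bilinear", align_corners=False)
                 for s in self.scales]
        weights = self.to_weights(torch.cat(feats, dim=1)).softmax(dim=1)  # (B, S, H, W)
        return sum(weights[:, i:i + 1] * f for i, f in enumerate(feats))
```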

28 pages, 3553 KB  
Article
GCN-Embedding Swin–Unet for Forest Remote Sensing Image Semantic Segmentation
by Pingbo Liu, Gui Zhang and Jianzhong Li
Remote Sens. 2026, 18(2), 242; https://doi.org/10.3390/rs18020242 - 12 Jan 2026
Abstract
Forest resources are among the most important ecosystems on Earth. Semantic segmentation and accurate positioning of ground objects in forest remote sensing (RS) imagery are crucial to the emergency treatment of forest natural disasters, especially forest fires. Most existing methods for image semantic segmentation are built on convolutional neural networks (CNNs). These techniques, however, struggle to access global contextual information directly and to detect geometric transformations within the image's target regions accurately, a limitation stemming from the inherent locality of convolution, which processes data structured in Euclidean space and is confined to square-shaped regions. Inspired by the graph convolution network (GCN), with its robust capability to process irregular and complex targets, and the Swin Transformer, renowned for global context modeling, we present a hybrid semantic segmentation framework for forest RS imagery termed GSwin–Unet. This framework embeds a GCN into the Swin–Unet architecture to address the low semantic segmentation accuracy of RS imagery in forest scenes, which is caused by complex texture features, diverse shapes, and unclear boundaries of land objects. GSwin–Unet features a parallel dual-encoder architecture of GCN and Swin Transformer. First, we integrate the Zero-DCE (Zero-Reference Deep Curve Estimation) algorithm into GSwin–Unet to enhance forest RS image feature representation. Second, a feature aggregation module (FAM) bridges the dual encoders by fusing GCN-derived local aggregated features with Swin Transformer-extracted features. Compared with the baseline models TransUnet, Swin–Unet, Unet, and DeepLab V3+, GSwin–Unet improves the mean Intersection over Union (MIoU) by 7.07%, 5.12%, 8.94%, and 2.69% and the average F1 score (Ave.F1) by 3.19%, 1.72%, 4.3%, and 3.69%, respectively, on the RGB forest RS dataset. On the NIRGB forest RS dataset, the improvements in MIoU are 5.75%, 3.38%, 6.79%, and 2.44%, and in Ave.F1 are 4.02%, 2.38%, 4.72%, and 1.67%, respectively. GSwin–Unet also adapts well to the selected GID dataset with high forest coverage, where MIoU and Ave.F1 reach 72.92% and 84.3%, respectively.
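
The GCN branch rests on graph convolution, which aggregates node features over an arbitrary adjacency structure rather than a square window. A minimal symmetric-normalized layer (illustrative, not the paper's GCN branch):

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph-convolution step: normalize the adjacency symmetrically,
    aggregate neighbor features, then apply a linear map and ReLU."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # x: (N, in_dim) node features; adj: (N, N) float adjacency with self-loops
        deg_inv_sqrt = adj.sum(dim=-1).clamp(min=1.0).rsqrt()
        a_hat = deg_inv_sqrt[:, None] * adj * deg_inv_sqrt[None, :]  # D^-1/2 A D^-1/2
        return torch.relu(self.lin(a_hat @ x))
```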

19 pages, 12335 KB  
Article
Method for Monitoring the Safety of Urban Subway Infrastructure Along Subway Lines by Fusing Inter-Track InSAR Data
by Guosheng Cai, Xiaoping Lu, Yao Lu, Zhengfang Lou, Baoquan Huang, Yaoyu Lu, Siyi Li and Bing Liu
Sensors 2026, 26(2), 454; https://doi.org/10.3390/s26020454 - 9 Jan 2026
Abstract
Urban surface subsidence is primarily induced by intensive above-ground and underground construction and excessive groundwater extraction. Integrating InSAR techniques into safety monitoring of urban subway infrastructure is therefore of great significance for urban safety and sustainable development. However, single-track high-spatial-resolution SAR imagery cannot achieve full coverage over large urban areas, and directly mosaicking inter-track InSAR results may introduce systematic biases, compromising the continuity and consistency of deformation fields at the regional scale. To address this issue, this study proposes an inter-track InSAR correction and mosaicking approach based on the mean vertical deformation difference within overlapping areas, which mitigates the overall offset between deformation results derived from different tracks and constructs a spatially continuous urban surface deformation field. Based on the fused deformation results, subsidence characteristics along subway lines and at key urban infrastructure were further analyzed. The main urban area and the eastern and western new districts of Zhengzhou, a national central city in China, were selected as the study area. A total of 16 Radarsat-2 SAR scenes acquired from two tracks during 2022–2024, with a spatial resolution of 3 m, were processed using the SBAS-InSAR technique to retrieve surface deformation. The results indicate that the mean deformation-rate difference in the overlapping areas between the two SAR tracks is approximately −5.54 mm/a. After applying the difference-constrained correction, the coefficient of determination (R²) between the mosaicked InSAR results and leveling observations increased to 0.739, while the MAE and RMSE decreased to 4.706 mm and 5.538 mm, respectively, demonstrating good stability in achieving inter-track consistency and continuous regional deformation representation. Analysis of the corrected results reveals that, during 2022–2024, areas exhibiting uplift and subsidence trends accounted for 37.6% and 62.4% of the study area, respectively, while cumulative subsidence and uplift areas covered 66.45% and 33.55%. In the main urban area, surface deformation rates are generally stable and predominantly within ±5 mm/a, whereas subsidence rates in the eastern new district are significantly higher than in the main urban area and the western new district. Along subway lines, deformation rates are mainly within ±5 mm/a, with relatively larger deformation observed only in localized sections of the eastern segment of Line 1. Further analysis of typical zones along the subway corridors shows that densely built areas in the western part of the main urban area remain relatively stable, while building-concentrated areas in the eastern region exhibit a persistent relative subsidence trend. Overall, the proposed inter-track mosaicking method based on the mean deformation difference in overlapping areas can effectively support subsidence monitoring and spatial-pattern identification along urban subway lines and in key regions under relative calibration conditions, providing reliable remote sensing information for refined urban management and infrastructure risk assessment.
(This article belongs to the Special Issue Application of SAR and Remote Sensing Technology in Earth Observation)
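
The correction itself is a simple offset estimated in the overlap. A sketch of the arithmetic (array names and the blending rule in the overlap are assumptions; the study's processing chain is more involved):

```python
import numpy as np

def mosaic_two_tracks(rate_a, rate_b, overlap):
    """Shift track B by the mean vertical deformation-rate difference in the
    overlap, then mosaic. rate_a, rate_b: co-registered deformation-rate grids
    in mm/a with NaN where a track has no data; overlap: boolean mask."""
    offset = np.nanmean(rate_a[overlap] - rate_b[overlap])  # about -5.54 mm/a in this study
    rate_b = rate_b + offset
    mosaic = np.where(np.isnan(rate_a), rate_b, rate_a)
    mosaic[overlap] = 0.5 * (rate_a[overlap] + rate_b[overlap])  # blend where both tracks see
    return mosaic, offset
```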

27 pages, 6110 KB  
Article
A Prediction Framework of Apple Orchard Yield with Multispectral Remote Sensing and Ground Features
by Shuyan Pan and Liqun Liu
Plants 2026, 15(2), 213; https://doi.org/10.3390/plants15020213 - 9 Jan 2026
Abstract
Traditional apple yield estimation relies on manual investigation and makes little use of multi-source information. To address this, this paper proposes an apple orchard yield prediction framework combining multispectral remote sensing features with ground features. The framework serves yield prediction at different scales: it predicts apple yield at the district and county scale and also corrects the predictions for small-scale orchards using orchard-level features. It consists of three parts: apple orchard planting-area extraction, district- and county-scale yield prediction, and small-scale orchard yield-prediction correction. (1) For planting-area extraction, samples of apple planting areas in the study area were obtained through field investigation, and orchard and non-orchard areas were classified, providing a spatial basis for collecting the subsequent yield-prediction data. (2) For district- and county-scale prediction, the corresponding multispectral remote sensing features and environmental features of the extracted orchard areas were obtained using the Google Earth Engine platform. To avoid noise from local pixel differences, the data were median-composited, and a feature set was constructed by combining yield and other information. The feature set was then split and fed to the Apple Orchard Yield Prediction Network (APYieldNet) for training and testing, yielding the district- and county-scale prediction model. (3) For small-scale orchard correction, the best district- and county-scale model forecasts the yield of the entire planting area and of local sampling areas inside the small-scale orchard. Within the sampling areas, the number of fruits is counted by the YOLO-A model, and the actual yield is estimated from the empirical single-fruit weight as a ground feature, from which a correction factor is calculated. Finally, proportional correction is applied to the prediction for the whole small-scale orchard, giving a more accurate yield estimate. Experiments showed that (1) the proposed APYieldNet (MAE = 152.68 kg/mu, RMSE = 203.92 kg/mu, where 1 mu ≈ 667 m²) outperformed other methods; (2) the proposed YOLO-A model detects apple fruits and flowers in complex orchard environments better than existing methods; and (3) with proportional correction, APYieldNet's predictions for small-scale orchards are closer to the true yield.
(This article belongs to the Section Plant Modeling)
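
The proportional correction in step (3) reduces to one ratio. A sketch of the arithmetic (function and argument names are illustrative, not the paper's):

```python
def correct_orchard_prediction(pred_total_kg, pred_sampled_kg,
                               fruit_count, single_fruit_weight_kg):
    """Ground-truth the sampled areas as fruit count x empirical single-fruit
    weight, then rescale the whole-orchard prediction by that ratio."""
    actual_sampled_kg = fruit_count * single_fruit_weight_kg  # ground-feature estimate
    factor = actual_sampled_kg / pred_sampled_kg              # correction factor
    return pred_total_kg * factor
```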

22 pages, 3809 KB  
Article
Research on Remote Sensing Image Object Segmentation Using a Hybrid Multi-Attention Mechanism
by Lei Chen, Changliang Li, Yixuan Gao, Yujie Chang, Siming Jin, Zhipeng Wang, Xiaoping Ma and Limin Jia
Appl. Sci. 2026, 16(2), 695; https://doi.org/10.3390/app16020695 - 9 Jan 2026
Abstract
High-resolution remote sensing images play an increasingly important role in land cover mapping, urban planning, and environmental monitoring. However, current segmentation approaches frequently suffer loss of detail and blurred boundaries on high-resolution remote sensing imagery, owing to its complex backgrounds and dense semantic content. To address these limitations, this study introduces HMA-UNet, a segmentation network built on the UNet framework and enhanced with a hybrid attention strategy. Its core innovation is a composite attention block in which a lightweight split fusion attention (LSFA) mechanism and a lightweight channel-spatial attention (LCSA) mechanism are integrated within a residual learning structure, replacing UNet's stacked convolutions; this improves the utilization of important shallow features and suppresses interference from redundant information. Comprehensive experiments on the WHDLD dataset and the DeepGlobe road extraction dataset show that the proposed method segments remote sensing images effectively. On WHDLD, the model attains a mean accuracy, IoU, precision, and recall of 72.40%, 60.71%, 75.46%, and 72.41%, respectively. On the DeepGlobe road extraction dataset, it achieves a mean accuracy of 57.87%, an mIoU of 49.82%, a mean precision of 78.18%, and a mean recall of 57.87%.
(This article belongs to the Section Computing and Artificial Intelligence)
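
A channel-then-spatial attention gate of the kind LCSA names can be written compactly. A hedged sketch (the exact LCSA structure is not given here, so the layers below are assumptions):

```python
import torch
import torch.nn as nn

class LightChannelSpatialAttention(nn.Module):
    """Channel gating from global average pooling, then a 7x7 spatial gate
    over per-pixel mean/max statistics (illustrative sketch)."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid())
        self.spatial = nn.Sequential(nn.Conv2d(2, 1, kernel_size=7, padding=3), nn.Sigmoid())

    def forward(self, x):
        x = x * self.channel(x)                                  # re-weight channels
        stats = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)
        return x * self.spatial(stats)                           # re-weight locations
```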

27 pages, 8953 KB  
Article
RSICDNet: A Novel Regional Scribble-Based Interactive Change Detection Network for Remote Sensing Images
by Daifeng Peng, Chen He and Haiyan Guan
Remote Sens. 2026, 18(2), 204; https://doi.org/10.3390/rs18020204 - 8 Jan 2026
Abstract
To address the inadequate performance and excessive interaction costs of existing interaction forms on large-scale, complex-shaped change areas, this paper proposes RSICDNet, an interactive change detection (ICD) model with regional scribble interaction. In this framework, regional scribble interaction is introduced for the first time to provide rich spatial priors for accurate ICD. RSICDNet first employs an interaction processing network to extract interactive features, then uses a High-Resolution Network (HRNet) backbone to extract features from bi-temporal remote sensing images concatenated along the channel dimension. To integrate these two information streams, an Interaction Fusion and Refinement Module (IFRM) injects the spatial priors from the interactive features into the high-level semantic features. Finally, an Object Contextual Representation (OCR) module further refines the feature representations, and a lightweight segmentation head generates the final change map. A human–computer ICD application has also been developed on top of RSICDNet, enhancing its potential for practical deployment. To validate the approach, extensive experiments were conducted against mainstream interactive deep learning models on the WHU-CD, LEVIR-CD, and CLCD datasets. RSICDNet achieves the best Number of Interactions (NoI) metrics across all three datasets, with NoI80 values of 1.15, 1.45, and 3.42 on WHU-CD, LEVIR-CD, and CLCD, respectively. Qualitatively, RSICDNet consistently delivers visually superior results using the same or fewer interactions.
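
The NoI80 metric counts interactions until the change map reaches 0.80 IoU against the reference; the tracked quantity is plain binary IoU. A small reference implementation of that quantity (not from the paper):

```python
import numpy as np

def binary_iou(pred, ref):
    """IoU between binary change maps; NoI80 is the number of scribble
    interactions needed before this value first reaches 0.80."""
    pred, ref = np.asarray(pred, bool), np.asarray(ref, bool)
    union = np.logical_or(pred, ref).sum()
    return np.logical_and(pred, ref).sum() / union if union else 1.0
```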

14 pages, 1785 KB  
Article
DINOv3-Driven Semantic Segmentation for Landslide Mapping in Mountainous Regions
by Zhiyi Dou, Edore Akpokodje, Yuelin He, Yuxin Liu, Zixuan Ni, Chang’an Xu, Muhammad Aslam and Meng Tang
Sensors 2026, 26(2), 406; https://doi.org/10.3390/s26020406 - 8 Jan 2026
Abstract
Landslide hazard assessment increasingly demands joint analysis of heterogeneous remote sensing data, yet automating this process remains difficult because of the pronounced resolution and texture discrepancies between satellite and aerial sensors. To address these limitations, this study proposes a segmentation framework that extracts sensor-robust representations. The framework leverages a DINOv3 transformer encoder and exploits representations from multiple transformer layers to capture complementary visual information, from fine-grained surface textures to global semantic context, overcoming the receptive-field constraints of conventional CNNs. Experiments achieve a Dice coefficient of 0.96 and an IoU of 0.938 on the Longxi satellite dataset, and a Dice coefficient of 0.965 and an IoU of 0.941 on the Longxi UAV dataset, showing consistent segmentation performance across both acquisition platforms despite their differences in spatial resolution and surface appearance.
(This article belongs to the Special Issue AI-Enhanced Sensor Data Integration and Processing)
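
Pulling representations from multiple transformer layers is typically done with forward hooks. A sketch under assumptions (a ViT-style `encoder.blocks` list, as in common DINO implementations; the layer indices are arbitrary):

```python
import torch

def multi_layer_tokens(encoder, image, layer_ids=(3, 6, 9, 12)):
    """Collect patch tokens from several transformer blocks via forward hooks,
    so shallow texture and deep semantic features are both available to a
    decoder (illustrative; not the paper's pipeline)."""
    captured = {}
    hooks = [encoder.blocks[i - 1].register_forward_hook(
                 lambda mod, inp, out, idx=i: captured.__setitem__(idx, out))
             for i in layer_ids]
    with torch.no_grad():
        encoder(image)
    for h in hooks:
        h.remove()
    return [captured[i] for i in layer_ids]
```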

26 pages, 6272 KB  
Article
Target Detection in Ship Remote Sensing Images Considering Cloud and Fog Occlusion
by Xiaopeng Shao, Zirui Wang, Yang Yang, Shaojie Zheng and Jianwu Mu
J. Mar. Sci. Eng. 2026, 14(2), 124; https://doi.org/10.3390/jmse14020124 - 7 Jan 2026
Abstract
Recognizing targets in ship remote sensing images is crucial for ship collision avoidance, military reconnaissance, and emergency rescue. However, climatic factors such as clouds and fog can obscure and blur targets in remote sensing imagery, leading to missed and false detections. It is therefore necessary to study ship remote sensing target detection under cloud and fog occlusion. Because remote sensing images are large and information-dense, and high-precision detection must run on resource-limited platforms, the detection accuracy and parameter counts of the YOLO series algorithms were first compared. Based on this analysis, YOLOv8s, the model with the fewest parameters at adequate detection accuracy, was selected for lightweight improvement. FasterNet replaced the backbone feature-extraction network of YOLOv8s, improving both the detection accuracy and the compactness of the resulting FN-YOLOv8s model. Furthermore, the AOD-Net dehazing network was structurally improved: a smoothness loss function was introduced to address the halo artifacts often generated during dehazing, and integrating the atmospheric light value with the transmittance reduced accumulated error, significantly enhancing the dehazing of remote sensing images.
(This article belongs to the Section Ocean Engineering)
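
The smoothness loss added to the dehazing network is, in spirit, a total-variation penalty on the per-pixel output map. A sketch (the paper's exact formulation and weighting may differ):

```python
import torch

def smoothness_loss(k_map):
    """Total-variation-style penalty on a dehazing network's per-pixel output
    map: penalizing abrupt horizontal and vertical jumps discourages the halo
    artifacts that sharp transitions produce (illustrative sketch)."""
    dh = (k_map[..., :, 1:] - k_map[..., :, :-1]).abs().mean()
    dv = (k_map[..., 1:, :] - k_map[..., :-1, :]).abs().mean()
    return dh + dv
```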

23 pages, 14919 KB  
Article
Estimating Economic Activity from Satellite Embeddings
by Xiangqi Yue, Zhong Zhao and Kun Hu
Appl. Sci. 2026, 16(2), 582; https://doi.org/10.3390/app16020582 - 6 Jan 2026
Abstract
Earth Embedding (EMB) adapts embedding techniques from Large Language Models (LLMs) to compress the information contained in multiple remote sensing satellite images into feature vectors. This article introduces a new approach to measuring economic activity from EMBs. Using the Google Satellite Embedding Dataset (GSED), we extract a 64-dimensional representation of the Earth's surface that integrates optical and radar imagery. A neural network maps these embeddings to nighttime light (NTL) intensity, yielding a 32-dimensional “income-aware” feature space aligned with economic variation. We then predict GDP levels and growth rates across countries and compare the results with traditional NTL-based models. The EMB-based estimator achieves substantially lower mean squared error in estimating GDP levels, and combining the two sources yields the best overall accuracy. Further analysis shows that EMB performs particularly well in low-statistical-capacity and high-income economies. These results suggest that satellite embeddings can provide a scalable, globally consistent framework for monitoring economic development and validating official statistics.
(This article belongs to the Collection Space Applications)
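
The described mapping, a 64-d embedding compressed to a 32-d bottleneck trained against NTL intensity, is a small regression head. A sketch (layer details beyond the 64-to-32 compression are assumptions):

```python
import torch.nn as nn

class NTLHead(nn.Module):
    """A 64-d GSED embedding is compressed to a 32-d 'income-aware'
    bottleneck trained to predict nighttime-light intensity; the bottleneck
    activations are then reused as features for GDP models (sketch)."""
    def __init__(self):
        super().__init__()
        self.bottleneck = nn.Sequential(nn.Linear(64, 32), nn.ReLU())
        self.head = nn.Linear(32, 1)  # predicted NTL intensity

    def forward(self, embedding):     # (B, 64)
        z = self.bottleneck(embedding)
        return self.head(z), z        # prediction and the 32-d feature space
```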

26 pages, 9258 KB  
Article
TriGEFNet: A Tri-Stream Multimodal Enhanced Fusion Network for Landslide Segmentation from Remote Sensing Imagery
by Zirui Zhang, Qingfeng Hu, Haoran Fang, Wenkai Liu, Ruimin Feng, Shoukai Chen, Qifan Wu, Peng Wang and Weiqiang Lu
Remote Sens. 2026, 18(2), 186; https://doi.org/10.3390/rs18020186 - 6 Jan 2026
Abstract
Landslides are among the most prevalent geological hazards worldwide, posing severe threats to public safety due to their sudden onset and destructive potential. Rapid, accurate, automated segmentation of landslide areas is critical for disaster risk assessment, emergency response, and post-disaster management. However, existing deep learning models for landslide segmentation rely predominantly on unimodal remote sensing imagery. In complex karst landscapes characterized by dense vegetation and severe shadow interference, the optical signatures of landslides are difficult to extract effectively, significantly limiting recognition accuracy. Synergistically exploiting multimodal data while mitigating information redundancy and noise has therefore emerged as a core challenge. To address it, this paper proposes a Triple-Stream Guided Enhancement and Fusion Network (TriGEFNet) that efficiently fuses three data sources: RGB imagery, Vegetation Indices (VI), and slope. The model incorporates an adaptive guidance mechanism in the encoder, which leverages the terrain constraints provided by slope to compensate for information lost in optical imagery under shadow, and integrates the sensitivity of VIs to surface disturbance to jointly calibrate and enhance the RGB features, extracting fused features that respond strongly to landslides. Gated skip connections in the decoder then refine these features, combining deep semantic information with critical boundary details to achieve deep synergy among the multimodal features. The model was evaluated on the self-constructed Zunyi dataset and two public datasets, achieving mean Intersection over Union (mIoU) scores of 86.27% on Zunyi, 80.26% on L4S, and 89.53% on Bijie. Compared with the multimodal baseline, TriGEFNet gains up to 7.68% in Recall and 4.37% in F1-score across the three datasets. This study presents an effective paradigm for multimodal remote sensing data fusion and a forward-looking solution for building more robust and precise landslide monitoring and assessment systems.
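
The gated skip connections in the decoder can be sketched as a per-pixel gate over encoder details, conditioned on decoder semantics (form assumed; not the released TriGEFNet code):

```python
import torch
import torch.nn as nn

class GatedSkip(nn.Module):
    """Decoder semantics decide, per pixel, which encoder details to admit,
    keeping boundary cues while suppressing noisy responses (sketch)."""
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(channels * 2, channels, 1), nn.Sigmoid())

    def forward(self, enc, dec):  # same-resolution feature maps (B, C, H, W)
        g = self.gate(torch.cat([enc, dec], dim=1))
        return dec + g * enc
```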
