Search Results (1,125)

Search Parameters:
Keywords = single remote sensing image

18 pages, 3652 KB  
Article
Optimizing Foundation Model to Enhance Surface Water Segmentation with Multi-Modal Remote Sensing Data
by Guochao Hu, Mengmeng Shao, Kaiyuan Li, Xiran Zhou and Xiao Xie
Water 2026, 18(3), 382; https://doi.org/10.3390/w18030382 - 2 Feb 2026
Viewed by 33
Abstract
Water resources are of critical importance across all ecological, social, and economic realms. Accurate extraction of water bodies is significant for estimating the spatial coverage of water resources and for mitigating water-related disasters. Single-modal remote sensing images are often insufficient for accurate water body extraction due to limitations in spectral information, weather conditions, and speckle noise. Furthermore, state-of-the-art deep learning models may be constrained by data extensibility, feature transferability, model scalability, and task producibility. This manuscript presents an integrated GeoAI framework that enhances foundation models for efficient water body extraction with multi-modal remote sensing images. The proposed framework consists of a data augmentation module tailored for optical and synthetic aperture radar (SAR) remote sensing images, as well as extraction modules augmented by three popular foundation models, namely SAM, SAMRS, and CROMA. Specifically, optical and SAR images are preprocessed and augmented independently, encoded through foundation model backbones, and subsequently decoded to generate water body segmentation masks under single-modal and multi-modal settings. Full article
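The encode-independently-then-decode-jointly pipeline described above can be pictured with a minimal PyTorch sketch; the placeholder encoders below stand in for frozen foundation-model backbones (SAM, SAMRS, or CROMA), and all layer sizes are illustrative assumptions rather than the authors' architecture.

```python
# Minimal sketch of the dual-branch idea: optical and SAR inputs are encoded
# separately (placeholder CNNs standing in for foundation-model backbones)
# and decoded jointly into a water mask. Sizes are illustrative assumptions.
import torch
import torch.nn as nn

class Branch(nn.Module):
    """Stand-in encoder; in the paper this would be a foundation-model backbone."""
    def __init__(self, in_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
    def forward(self, x):
        return self.net(x)

class DualModalWaterSeg(nn.Module):
    def __init__(self):
        super().__init__()
        self.opt_enc = Branch(in_ch=4)   # e.g. optical R/G/B/NIR
        self.sar_enc = Branch(in_ch=2)   # e.g. SAR VV/VH
        self.decoder = nn.Sequential(    # fuse, then upsample to a 1-channel mask
            nn.Conv2d(256, 64, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(64, 1, 1),
        )
    def forward(self, optical, sar):
        fused = torch.cat([self.opt_enc(optical), self.sar_enc(sar)], dim=1)
        return torch.sigmoid(self.decoder(fused))

model = DualModalWaterSeg()
mask = model(torch.randn(1, 4, 256, 256), torch.randn(1, 2, 256, 256))
print(mask.shape)  # torch.Size([1, 1, 256, 256])
```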

21 pages, 4691 KB  
Article
Two-Stage Extraction of Large-Area Water Bodies Based on Multi-Modal Remote Sensing Data
by Lisheng Li, Weitao Han and Qinghua Qiao
Sustainability 2026, 18(3), 1362; https://doi.org/10.3390/su18031362 - 29 Jan 2026
Viewed by 105
Abstract
Current remote sensing-based water body extraction research mostly relies on single data sources, is limited to specific water body types or regions, fails to leverage the advantages of multi-source data, and has difficulty achieving large-scale, high-precision, and rapid extraction. To address this, this paper integrates optical images and Synthetic Aperture Radar (SAR) data and adopts an adaptive threshold segmentation method, yielding a technical approach for efficient, high-precision water body extraction at a monthly scale over large regions. Taking Beijing as the study area, the monthly spatial distribution of water bodies from 2019 to 2020 was extracted, and pixel-level accuracy verification was carried out using the JRC Global Surface Water Dataset from the European Commission’s Joint Research Centre. The experimental results show that the water body extraction results are good: the extraction precision is generally above 0.8 and in most cases exceeds 0.95. Finally, the method was applied to extract and analyze water body changes caused by heavy rainfall in Beijing in July 2025. This analysis further confirmed the effectiveness, accuracy, and practical utility of the proposed method. Full article
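Adaptive threshold segmentation for water extraction typically rests on a scene-dependent cutoff over a water index; the sketch below applies Otsu's method to an NDWI image as one plausible building block, without claiming this is the paper's exact scheme.

```python
# Hedged sketch: compute NDWI from optical bands and pick a per-scene
# threshold with Otsu's method (maximizing between-class variance).
import numpy as np

def otsu_threshold(values, bins=256):
    """Return the threshold that maximizes between-class variance."""
    hist, edges = np.histogram(values, bins=bins)
    p = hist.astype(float) / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(p)                    # class-0 (below threshold) probability
    mu = np.cumsum(p * centers)          # cumulative mean
    mu_t = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * w0 - mu) ** 2 / (w0 * (1 - w0))
    return centers[np.nanargmax(sigma_b)]

green, nir = np.random.rand(512, 512), np.random.rand(512, 512)  # placeholder bands
ndwi = (green - nir) / (green + nir + 1e-9)
water_mask = ndwi > otsu_threshold(ndwi.ravel())
```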

25 pages, 5911 KB  
Article
Soil Moisture Inversion in Alfalfa via UAV with Feature Fusion and Ensemble Learning
by Jinxi Chen, Jianxin Yin, Yuanbo Jiang, Yanxia Kang, Yanlin Ma, Guangping Qi, Chungang Jin, Bojie Xie, Wenjing Yu, Yanbiao Wang, Junxian Chen, Jiapeng Zhu and Boda Li
Plants 2026, 15(3), 404; https://doi.org/10.3390/plants15030404 - 28 Jan 2026
Viewed by 122
Abstract
Timely access to soil moisture conditions in farmland crops is the foundation and key to achieving precise irrigation. Due to its high spatiotemporal resolution, unmanned aerial vehicle (UAV) remote sensing has become an important method for monitoring soil moisture. This study addresses soil moisture retrieval in alfalfa fields across different growth stages. Based on UAV multispectral images, a multi-source feature set was constructed by integrating spectral and texture features. The performance of three machine learning models—random forest regression (RFR), K-nearest neighbors regression (KNN), and XGBoost—as well as two ensemble learning models, Voting and Stacking, was systematically compared. The results indicate the following: (1) The ensemble learning models generally outperform individual machine learning models, with the Voting model performing best across all growth stages, achieving a maximum R2 of 0.874 and an RMSE of 0.005; among the machine learning models, the optimal model varies with growth stage, with XGBoost being the best during the branching and early flowering stages (maximum R2 of 0.836), while RFR performs better during the budding stage (R2 of 0.790). (2) The fusion of multi-source features significantly improved inversion accuracy. Taking the Voting model as an example, the accuracy of the fused features (R2 = 0.874) increased by 0.065 compared to using single-texture features (R2 = 0.809), and the RMSE decreased from 0.012 to 0.005. (3) In terms of inversion depth, the optimal inversion depth for the branching stage and budding stage is 40–60 cm, while the optimal depth for the early flowering stage is 20–40 cm. In summary, the method that integrates multi-source feature fusion and ensemble learning significantly improves the accuracy and stability of alfalfa soil moisture inversion, providing an effective technical approach for precise water management of artificial grasslands in arid regions. Full article
(This article belongs to the Special Issue Water and Nutrient Management for Sustainable Crop Production)
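The Voting/Stacking comparison maps onto standard scikit-learn ensembles; in this sketch GradientBoostingRegressor stands in for XGBoost so the example stays self-contained, and the feature matrix and targets are random placeholders for the fused spectral + texture features and measured soil moisture.

```python
# Sketch of the compared ensemble setup: base regressors combined by
# Voting and Stacking, scored by cross-validated R2.
import numpy as np
from sklearn.ensemble import (RandomForestRegressor, GradientBoostingRegressor,
                              VotingRegressor, StackingRegressor)
from sklearn.neighbors import KNeighborsRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X, y = rng.normal(size=(200, 12)), rng.normal(size=200)  # placeholder data

base = [("rfr", RandomForestRegressor(n_estimators=200, random_state=0)),
        ("knn", KNeighborsRegressor(n_neighbors=5)),
        ("gbr", GradientBoostingRegressor(random_state=0))]  # stand-in for XGBoost

voting = VotingRegressor(estimators=base)
stacking = StackingRegressor(estimators=base, final_estimator=LinearRegression())

for name, model in [("voting", voting), ("stacking", stacking)]:
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    print(name, round(r2, 3))
```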

28 pages, 32574 KB  
Article
CauseHSI: Counterfactual-Augmented Domain Generalization for Hyperspectral Image Classification via Causal Disentanglement
by Xin Li, Zongchi Yang and Wenlong Li
J. Imaging 2026, 12(2), 57; https://doi.org/10.3390/jimaging12020057 - 26 Jan 2026
Viewed by 130
Abstract
Cross-scene hyperspectral image (HSI) classification under single-source domain generalization (DG) is a crucial yet challenging task in remote sensing. The core difficulty lies in generalizing from a limited source domain to unseen target scenes. We formalize this through the causal theory, where different sensing scenes are viewed as distinct interventions on a shared physical system. This perspective reveals two fundamental obstacles: interventional distribution shifts arising from varying acquisition conditions, and confounding biases induced by spurious correlations driven by domain-specific factors. Taking the above considerations into account, we propose CauseHSI, a causality-inspired framework that offers new insights into cross-scene HSI classification. CauseHSI consists of two key components: a Counterfactual Generation Module (CGM) that perturbs domain-specific factors to generate diverse counterfactual variants, simulating cross-domain interventions while preserving semantic consistency, and a Causal Disentanglement Module (CDM) that separates invariant causal semantics from spurious correlations through structured constraints under a structural causal model, ultimately guiding the model to focus on domain-invariant and generalizable representations. By aligning model learning with causal principles, CauseHSI enhances robustness against domain shifts. Extensive experiments on the Pavia, Houston, and HyRANK datasets demonstrate that CauseHSI outperforms existing DG methods. Full article
(This article belongs to the Special Issue Multispectral and Hyperspectral Imaging: Progress and Challenges)
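One common way to "perturb domain-specific factors" while preserving semantics is to randomize per-band first-order statistics of a patch; the sketch below illustrates that flavor of counterfactual augmentation and is an assumption about the spirit of the CGM, not the authors' implementation.

```python
# Illustrative counterfactual-style augmentation: randomly rescale and shift
# each band's statistics while keeping the spatial pattern (semantics) intact.
import torch

def counterfactual_variant(patch, strength=0.2):
    """patch: (bands, H, W) tensor."""
    b = patch.shape[0]
    mu = patch.mean(dim=(1, 2), keepdim=True)
    sigma = patch.std(dim=(1, 2), keepdim=True)
    gamma = 1 + strength * torch.randn(b, 1, 1)   # random per-band gain
    beta = strength * torch.randn(b, 1, 1)        # random per-band shift
    return mu + gamma * (patch - mu) + beta * sigma

variant = counterfactual_variant(torch.rand(103, 9, 9))  # e.g. a Pavia-sized patch
print(variant.shape)  # torch.Size([103, 9, 9])
```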

24 pages, 5237 KB  
Article
DCA-UNet: A Cross-Modal Ginkgo Crown Recognition Method Based on Multi-Source Data
by Yunzhi Guo, Yang Yu, Yan Li, Mengyuan Chen, Wenwen Kong, Yunpeng Zhao and Fei Liu
Plants 2026, 15(2), 249; https://doi.org/10.3390/plants15020249 - 13 Jan 2026
Viewed by 315
Abstract
Wild ginkgo, as an endangered species, holds significant value for genetic resource conservation, yet its practical applications face numerous challenges. Traditional field surveys are inefficient in mountainous mixed forests, while satellite remote sensing is limited by spatial resolution. Current deep learning approaches relying on single-source data or merely simple multi-source fusion fail to fully exploit the available information, leading to suboptimal recognition performance. This study presents a multimodal ginkgo crown dataset, comprising RGB and multispectral images acquired by a UAV platform. To achieve precise crown segmentation with these data, we propose a novel dual-branch dynamic weighting fusion network, termed dual-branch cross-modal attention-enhanced UNet (DCA-UNet). We design a dual-branch encoder (DBE) with a two-stream architecture for independent feature extraction from each modality. We further develop a cross-modal interaction fusion module (CIF), employing cross-modal attention and learnable dynamic weights to boost multi-source information fusion. Additionally, we introduce an attention-enhanced decoder (AED) that combines progressive upsampling with a hybrid channel-spatial attention mechanism, thereby effectively utilizing multi-scale features and enhancing boundary semantic consistency. Evaluation on the ginkgo dataset demonstrates that DCA-UNet achieves a segmentation performance of 93.42% IoU (Intersection over Union), 96.82% PA (Pixel Accuracy), 96.38% Precision, and 96.60% F1-score. These results outperform the differential feature attention fusion network (DFAFNet) by 12.19%, 6.37%, 4.62%, and 6.95%, respectively, and surpass the single-modality baselines (RGB or multispectral) in all metrics. Superior performance on cross-flight-altitude data further validates the model’s strong generalization capability and robustness in complex scenarios. These results demonstrate the superiority of DCA-UNet in UAV-based multimodal ginkgo crown recognition, offering a reliable and efficient solution for monitoring wild endangered tree species. Full article
(This article belongs to the Special Issue Advanced Remote Sensing and AI Techniques in Agriculture and Forestry)
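The cross-modal attention plus learnable dynamic weights in the CIF module can be sketched with a standard multi-head attention layer in which RGB features query the multispectral stream; shapes, the single scalar weight, and the naming are assumptions.

```python
# Hedged sketch of a cross-modal attention fusion step in the spirit of CIF.
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.alpha = nn.Parameter(torch.tensor(0.5))  # learnable dynamic weight

    def forward(self, f_rgb, f_ms):
        """f_rgb, f_ms: (B, C, H, W) feature maps of equal shape."""
        B, C, H, W = f_rgb.shape
        q = f_rgb.flatten(2).transpose(1, 2)          # (B, HW, C)
        kv = f_ms.flatten(2).transpose(1, 2)
        attended, _ = self.attn(q, kv, kv)            # RGB queries MS features
        fused = self.alpha * q + (1 - self.alpha) * attended
        return fused.transpose(1, 2).reshape(B, C, H, W)

fuse = CrossModalFusion()
out = fuse(torch.randn(2, 128, 16, 16), torch.randn(2, 128, 16, 16))
print(out.shape)  # torch.Size([2, 128, 16, 16])
```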

23 pages, 4663 KB  
Article
Element Evaluation and Selection for Multi-Column Redundant Long-Linear-Array Detectors Using a Modified Z-Score
by Xiaowei Jia, Xiuju Li and Changpei Han
Remote Sens. 2026, 18(2), 224; https://doi.org/10.3390/rs18020224 - 9 Jan 2026
Viewed by 238
Abstract
New-generation geostationary meteorological satellite radiometric imagers widely employ multi-column redundant long-linear-array detectors, for which the Best Detector Selection (BDS) strategy is crucial for enhancing the quality of remote sensing data. Addressing the limitation of current BDS methods that often rely on a single metric and thus fail to fully exploit the detector’s comprehensive performance, this paper proposes a detector evaluation method based on a modified Z-score. This method systematically categorizes detector metrics into three types: positive, negative, and uniformity. It introduces, for the first time, spectral response deviation (SRD) as an effective quantitative measure for the Spectral Response Function (SRF) and employs a robust normalization strategy using the Interquartile Range (IQR) instead of standard deviation, enabling multi-dimensional detector evaluation and selection. Validation using laboratory data from the FY-4C/AGRI long-wave infrared band demonstrates that, compared to traditional single-metric optimization strategies, the best detectors selected by our method show significant improvement across multiple performance indicators, markedly enhancing both data quality and overall system performance. The proposed method features low computational complexity and strong adaptability, supporting on-orbit real-time detector optimization and dynamic updates, thereby providing reliable technical support for high-quality processing of remote sensing data from geostationary meteorological satellites. Full article
(This article belongs to the Special Issue Remote Sensing Data Preprocessing and Calibration)
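The IQR-based modified Z-score itself is compact enough to show directly; the per-metric sign handling and the equal weighting below are illustrative assumptions.

```python
# Minimal sketch: replace the standard deviation in the Z-score with the
# interquartile range, flip the sign of "negative" metrics (e.g. noise),
# and combine into one composite score per detector element.
import numpy as np

def modified_z(x, axis=0):
    """Robust Z-score: deviations from the median scaled by the IQR."""
    q1, q3 = np.percentile(x, [25, 75], axis=axis, keepdims=True)
    med = np.median(x, axis=axis, keepdims=True)
    return (x - med) / (q3 - q1 + 1e-12)

# rows = detector elements; columns = metrics (e.g. responsivity +, noise -)
metrics = np.random.rand(100, 3)
signs = np.array([+1.0, -1.0, +1.0])     # positive vs. negative metric types
score = (signs * modified_z(metrics)).sum(axis=1)
best = np.argsort(score)[::-1][:10]      # top-10 candidate elements
print(best)
```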

27 pages, 16442 KB  
Article
Co-Training Vision-Language Models for Remote Sensing Multi-Task Learning
by Qingyun Li, Shuran Ma, Junwei Luo, Yi Yu, Yue Zhou, Fengxiang Wang, Xudong Lu, Xiaoxing Wang, Xin He, Yushi Chen and Xue Yang
Remote Sens. 2026, 18(2), 222; https://doi.org/10.3390/rs18020222 - 9 Jan 2026
Viewed by 341
Abstract
With Transformers achieving outstanding performance on individual remote sensing (RS) tasks, we are now approaching the realization of a unified model that excels across multiple tasks through multi-task learning (MTL). Compared to single-task approaches, MTL methods offer improved generalization, enhanced scalability, and greater practical applicability. Recently, vision-language models (VLMs) have achieved promising results in RS image understanding, grounding, and ultra-high-resolution (UHR) image reasoning. Moreover, the unified text-based interface demonstrates significant potential for MTL. Hence, in this work, we present RSCoVLM, a simple yet flexible VLM baseline for RS MTL. Firstly, we create the data curation procedure, including data acquisition, offline processing and integration, as well as online loading and weighting. This data procedure effectively addresses complex RS data environments and generates flexible vision-language conversations. Furthermore, we propose a unified dynamic-resolution strategy to address the diverse image scales inherent in RS imagery. For UHR images, we introduce the Zoom-in Chain mechanism together with its corresponding dataset, LRS-VQA-Zoom. These strategies are flexible and effectively mitigate the computational burdens. Additionally, we significantly enhance the model’s object detection capability and propose a novel evaluation protocol that ensures fair comparison between VLMs and conventional detection models. Extensive experiments demonstrate that RSCoVLM achieves state-of-the-art performance across diverse tasks, outperforming existing RS VLMs and even rivaling specialized expert models. All the training and evaluation tools, model weights, and datasets have been fully open-sourced to support reproducibility. We expect that this baseline will promote further progress toward general-purpose RS models. Full article
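The "online loading and weighting" step amounts to sampling each training example from one task dataset with a configurable probability; the toy sampler below illustrates the idea, with task names and weights invented for the example rather than taken from RSCoVLM's configuration.

```python
# Illustrative multi-task sampler: each step draws a conversation sample
# from one task dataset according to per-task weights.
import random

datasets = {
    "captioning": ["sample_cap_%d" % i for i in range(1000)],
    "grounding":  ["sample_grd_%d" % i for i in range(500)],
    "vqa_zoom":   ["sample_uhr_%d" % i for i in range(200)],
}
weights = {"captioning": 0.5, "grounding": 0.3, "vqa_zoom": 0.2}

def next_batch(batch_size=4):
    names = list(datasets)
    probs = [weights[n] for n in names]
    batch = []
    for _ in range(batch_size):
        task = random.choices(names, weights=probs, k=1)[0]
        batch.append((task, random.choice(datasets[task])))
    return batch

print(next_batch())
```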

19 pages, 2336 KB  
Article
A Lightweight Upsampling and Cross-Modal Feature Fusion-Based Algorithm for Small-Object Detection in UAV Imagery
by Jianglei Gong, Zhe Yuan, Wenxing Li, Weiwei Li, Yanjie Guo and Baolong Guo
Electronics 2026, 15(2), 298; https://doi.org/10.3390/electronics15020298 - 9 Jan 2026
Viewed by 266
Abstract
Small-object detection in UAV remote sensing faces common challenges such as tiny target size, blurred features, and severe background interference. Furthermore, single imaging modalities exhibit limited representation capability in complex environments. To address these issues, this paper proposes CTU-YOLO, a UAV-based small-object detection algorithm built upon cross-modal feature fusion and lightweight upsampling. The algorithm incorporates a dynamic and adaptive cross-modal feature fusion (DCFF) module, which achieves efficient feature alignment and fusion by combining frequency-domain analysis with convolutional operations. Additionally, a lightweight upsampling module (LUS) is introduced, integrating dynamic sampling and depthwise separable convolution to enhance the recovery of fine details for small objects. Experiments on the DroneVehicle and LLVIP datasets demonstrate that CTU-YOLO achieves 73.9% mAP on DroneVehicle and 96.9% AP on LLVIP, outperforming existing mainstream methods. Meanwhile, the model has only 4.2 MB of parameters and a computational cost of 13.8 GFLOPs, with inference speeds reaching 129.9 FPS on DroneVehicle and 135.1 FPS on LLVIP, exhibiting an excellent lightweight design and real-time performance while maintaining high accuracy. Ablation studies confirm that both the DCFF and LUS modules contribute significantly to performance gains. Visualization analysis further indicates that the proposed method can accurately preserve the structure of small objects even under nighttime, low-light, and multi-scale background conditions, demonstrating strong robustness. Full article
(This article belongs to the Special Issue AI-Driven Image Processing: Theory, Methods, and Applications)
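A lightweight upsampling block in the spirit of LUS can pair a bilinear resize with a depthwise-separable convolution, which uses far fewer parameters than a standard transposed convolution; the exact dynamic-sampling design of LUS is not reproduced here.

```python
# Sketch: bilinear upsample followed by depthwise + pointwise convolution.
import torch
import torch.nn as nn

class LightUpsample(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.depthwise = nn.Conv2d(ch, ch, 3, padding=1, groups=ch)
        self.pointwise = nn.Conv2d(ch, ch, 1)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.pointwise(self.depthwise(self.up(x))))

blk = LightUpsample(64)
print(blk(torch.randn(1, 64, 32, 32)).shape)     # (1, 64, 64, 64)
print(sum(p.numel() for p in blk.parameters()))  # small parameter count
```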

14 pages, 2218 KB  
Article
Singular Value Decomposition Wavelength-Multiplexing Ghost Imaging
by Yingtao Zhang, Xueqian Zhang, Zongguo Li and Hongguo Li
Photonics 2026, 13(1), 49; https://doi.org/10.3390/photonics13010049 - 5 Jan 2026
Viewed by 349
Abstract
To enhance imaging quality, singular value decomposition (SVD) has been applied to single-wavelength ghost imaging (GI) or color GI. In this paper, we extend the application of SVD to wavelength-multiplexing ghost imaging (WMGI) for reducing the redundant information in the random measurement matrix corresponding to multi-wavelength modulated speckle fields. The feasibility of this method is demonstrated through numerical simulations and optical experiments. Based on the intensity statistical properties of multi-wavelength speckle fields, we derived an expression for the contrast-to-noise ratio (CNR) to characterize imaging quality and conducted a corresponding analysis. The theoretical results indicate that in SVDWMGI, for the m-wavelength case, the CNR of the reconstructed image is m times that of single-wavelength GI. Moreover, we carried out an optical experiment with a three-wavelength speckle-modulated light source to verify the method. This approach integrates the advantages of both SVD and wavelength division multiplexing, potentially facilitating the application of GI in long-distance imaging fields such as remote sensing. Full article
(This article belongs to the Special Issue Ghost Imaging and Quantum-Inspired Classical Optics)
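The SVD step can be demonstrated numerically: decomposing the stacked speckle patterns and resetting the singular values to one removes redundancy in the measurement matrix before a correlation-style reconstruction. Sizes are toy assumptions, and the multi-wavelength case applies the same step per wavelength.

```python
# Numerical sketch: whiten a random speckle measurement matrix via SVD,
# then reconstruct by correlating measurements with the whitened patterns.
import numpy as np

rng = np.random.default_rng(1)
n_pix, n_meas = 32 * 32, 1500
x = rng.random(n_pix)                     # unknown object (flattened)
A = rng.random((n_meas, n_pix))           # random speckle measurement matrix

U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_svd = U @ Vt                            # singular values set to 1

y = A_svd @ x                             # bucket-detector measurements
x_hat = A_svd.T @ y                       # correlation-style reconstruction

err = np.linalg.norm(x_hat - x) / np.linalg.norm(x)
print(f"relative error: {err:.3e}")       # ~0 in the overdetermined case
```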

23 pages, 13143 KB  
Article
Method of Convolutional Neural Networks for Lithological Classification Using Multisource Remote Sensing Data
by Zixuan Zhang, Yuanjin Xu and Jianguo Chen
Remote Sens. 2026, 18(1), 29; https://doi.org/10.3390/rs18010029 - 22 Dec 2025
Viewed by 331
Abstract
Xinfeng County, Shaoguan City, Guangdong Province, China, is a typical vegetation-covered area that suffers from severe attenuation of rock and mineral spectral information in remote sensing images owing to dense vegetation. This situation limits the accuracy of traditional lithological mapping methods, making them unable to meet geological mapping demands under complex conditions, and thus necessitating a tailored lithological identification model. To address this issue, in this study, the penetration capability of microwave remote sensing (for extracting indirect textural features of lithology) was combined with the spectral superiority of hyperspectral remote sensing (for capturing lithological spectral features), resulting in a dual-branch deep-learning framework for lithological classification based on multisource remote sensing data. The framework independently extracts features from Sentinel-1 imagery and Gaofen-5 data, integrating three key modules: texture feature extraction, spatial–spectral feature extraction, and attention-based adaptive feature fusion, to realize deep and efficient fusion of heterogeneous remote sensing information. Ablation and comparative experiments were conducted to evaluate each module’s contribution. The results show that the dual-branch architecture effectively captures the complementary and discriminative characteristics of multimodal data, and that the encoder–decoder structure demonstrates strong robustness under complex conditions such as dense vegetation. The final model achieved 97.24% overall accuracy and 90.43% mean intersection-over-union score, verifying its effectiveness and generalizability in complex geological environments. The proposed multi-source remote sensing–based lithological classification model overcomes the limitations of single-source data by integrating indirect lithological texture features containing vegetation structural information with spectral features, thereby providing a viable approach for lithological mapping in vegetated regions. Full article
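Attention-based adaptive feature fusion of the two branches can be sketched as a small gate network that predicts per-channel mixing weights for the SAR texture features and the hyperspectral spatial-spectral features; the gate design and shapes are illustrative assumptions.

```python
# Hedged sketch of attention-based adaptive fusion for a dual-branch network.
import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * ch, ch, 1), nn.ReLU(),
            nn.Conv2d(ch, ch, 1), nn.Sigmoid(),
        )

    def forward(self, f_sar, f_hsi):
        w = self.gate(torch.cat([f_sar, f_hsi], dim=1))   # (B, C, 1, 1) in [0, 1]
        return w * f_sar + (1 - w) * f_hsi                # convex combination

fusion = AdaptiveFusion(96)
out = fusion(torch.randn(2, 96, 32, 32), torch.randn(2, 96, 32, 32))
print(out.shape)  # torch.Size([2, 96, 32, 32])
```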

24 pages, 3622 KB  
Article
Deep Learning-Based Intelligent Monitoring of Petroleum Infrastructure Using High-Resolution Remote Sensing Imagery
by Nannan Zhang, Hang Zhao, Pengxu Jing, Yan Gao, Song Liu, Jinli Shen, Shanhong Huang, Qihong Zeng, Yang Liu and Miaofen Huang
Processes 2026, 14(1), 28; https://doi.org/10.3390/pr14010028 - 20 Dec 2025
Viewed by 387
Abstract
The rapid advancement of high-resolution remote sensing technology has significantly expanded observational capabilities in the oil and gas sector, enabling more precise identification of petroleum infrastructure. Remote sensing now plays a critical role in providing real-time, continuous monitoring. Manual interpretation remains the predominant approach, yet is plagued by multiple limitations. To overcome the limitations of manual interpretation in large-scale monitoring of upstream petroleum assets, this study develops an end-to-end, deep learning-driven framework for intelligent extraction of key oilfield targets from high-resolution remote sensing imagery. Specific aims are as follows: (1) To leverage temporal diversity in imagery to construct a representative training dataset. (2) To automate multi-class detection of well sites, production discharge pools, and storage facilities with high precision. This study proposes an intelligent monitoring framework based on deep learning for the automatic extraction of petroleum-related features from high-resolution remote sensing imagery. Leveraging the temporal richness of multi-temporal satellite data, a geolocation-based sampling strategy was adopted to construct a dedicated petroleum remote sensing dataset. The dataset comprises over 8000 images and more than 30,000 annotated targets across three key classes: well pads, production ponds, and storage facilities. Four state-of-the-art object detection models were evaluated—two-stage frameworks (Faster R-CNN, Mask R-CNN) and single-stage algorithms (YOLOv3, YOLOv4)—with the integration of transfer learning to improve accuracy, generalization, and robustness. Experimental results demonstrate that two-stage detectors significantly outperform their single-stage counterparts in terms of mean Average Precision (mAP). Specifically, the Mask R-CNN model, enhanced through transfer learning, achieved an mAP of 89.2% across all classes, exceeding the best-performing single-stage model (YOLOv4) by 11 percentage points. This performance gap highlights the trade-off between speed and accuracy inherent in single-shot detection models, which prioritize real-time inference at the expense of precision. Additionally, comparative analysis among similar architectures confirmed that newer versions (e.g., YOLOv4 over YOLOv3) and the incorporation of transfer learning consistently yield accuracy improvements of 2–4%, underscoring its effectiveness in remote sensing applications. Three oilfield areas were selected for practical application. The results indicate that the constructed model can automatically extract multiple target categories simultaneously, with average detection accuracies of 84% for well sites and 77% for production ponds. For multi-class targets over 100 square kilometers, detection that previously required one day of manual interpretation now takes only one hour. Full article
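The transfer-learning setup underlying the comparison follows the standard torchvision recipe: load a COCO-pretrained Mask R-CNN and replace its box and mask heads for the three oilfield classes plus background. The training loop and data pipeline are omitted.

```python
# Sketch: COCO-pretrained Mask R-CNN with heads replaced for
# background + well pad, production pond, storage facility.
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

num_classes = 1 + 3  # background + three oilfield classes

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")

in_feat = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_feat, num_classes)

in_feat_mask = model.roi_heads.mask_predictor.conv5_mask.in_channels
model.roi_heads.mask_predictor = MaskRCNNPredictor(in_feat_mask, 256, num_classes)
```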

22 pages, 12312 KB  
Article
ES-YOLO: Multi-Scale Port Ship Detection Combined with Attention Mechanism in Complex Scenes
by Lixiang Cao, Jia Xi, Zixuan Xie, Teng Feng and Xiaomin Tian
Sensors 2025, 25(24), 7630; https://doi.org/10.3390/s25247630 - 16 Dec 2025
Viewed by 430
Abstract
With the rapid development of remote sensing technology and deep learning, port ship detection based on single-stage algorithms has achieved remarkable results in optical imagery. However, most existing methods are designed and verified in specific scenes, such as a fixed viewing angle, uniform background, or open sea, which makes it difficult to handle ship detection in complex environments involving cloud occlusion, wave fluctuation, complex harbor buildings, and multi-ship aggregation. To this end, the ES-YOLO framework is proposed to address these limitations. A novel edge-perception channel–spatial attention mechanism (EACSA) is proposed to enhance the extraction of edge information and improve the ability to capture feature details. A lightweight spatial–channel decoupled down-sampling module (LSCD) is designed to replace the down-sampling structure of the original network and reduce the complexity of the down-sampling stage. A new hierarchical scale structure is designed to balance detection across targets of different scales. In this paper, a remote sensing ship dataset, TJShip, is constructed based on Gaofen-2 images, covering multi-scale targets from small fishing boats to large cargo ships. The TJShip dataset was adopted as the data source, and the ES-YOLO model was employed to conduct ablation and comparison experiments. The results show that the introduction of the EACSA attention mechanism, LSCD, and the multi-scale structure improves the mAP of ship detection by 0.83%, 0.54%, and 1.06%, respectively, compared with the baseline model, while also performing well in precision, recall, and F1-score. Compared with Faster R-CNN, RetinaNet, YOLOv5, YOLOv7, and YOLOv8, the ES-YOLO model improves the mAP by 46.87%, 8.14%, 1.85%, 1.75%, and 0.86%, respectively, under the same experimental conditions, providing useful research directions for ship detection. Full article
(This article belongs to the Section Remote Sensors)
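An edge-aware spatial attention of the kind EACSA suggests can be sketched with fixed Sobel filters whose gradient magnitude gates the feature map; this is an assumption about the mechanism's flavor, not the paper's module.

```python
# Illustrative edge-gated attention: Sobel gradients emphasize responses
# near object boundaries such as ship edges.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EdgeAttention(nn.Module):
    def __init__(self):
        super().__init__()
        kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
        self.register_buffer("sobel", torch.stack([kx, kx.t()]).unsqueeze(1))

    def forward(self, x):
        gray = x.mean(dim=1, keepdim=True)            # collapse channels
        g = F.conv2d(gray, self.sobel, padding=1)     # (B, 2, H, W) gradients
        edge = torch.sqrt((g ** 2).sum(dim=1, keepdim=True) + 1e-8)
        return x * torch.sigmoid(edge)                # edge-gated features

attn = EdgeAttention()
print(attn(torch.randn(1, 64, 40, 40)).shape)  # torch.Size([1, 64, 40, 40])
```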

18 pages, 3003 KB  
Article
Vineyard Groundcover Biodiversity: Using Deep Learning to Differentiate Cover Crop Communities from Aerial RGB Imagery
by Isabella Ghiglieno, Girma Tariku Woldesemayat, Andres Sanchez Morchio, Celine Birolleau, Luca Facciano, Fulvio Gentilin, Salvatore Mangiapane, Anna Simonetto and Gianni Gilioli
AgriEngineering 2025, 7(12), 434; https://doi.org/10.3390/agriengineering7120434 - 16 Dec 2025
Viewed by 350
Abstract
Monitoring groundcover diversity in vineyards is a complex task, often limited by the time and expertise required for accurate botanical identification. Remote sensing technologies and AI-based tools are still underutilized in this context, particularly for classifying herbaceous vegetation in inter-row areas. In this study, we introduce a novel approach to classify the groundcover into one of nine categories, in order to simplify this task. Using UAV images to train a convolutional neural network through a deep learning methodology, this study evaluates the effectiveness of different backbone structures applied to a UNet network for the classification of pixels into nine classes of groundcover: vine canopy, bare soil, and seven distinct cover crop community types. Our results demonstrate that the UNet model, especially when using an EfficientNetB0 backbone, significantly improves classification performance, achieving 85.4% accuracy, 59.8% mean Intersection over Union (IoU), and a Jaccard index of 73.0%. Although this study demonstrates the potential of integrating remote sensing and deep learning for vineyard biodiversity monitoring, its applicability is limited by the small image coverage, as data were collected from a single vineyard and only one drone flight. Future work will focus on expanding the model’s applicability to a broader range of vineyard systems, soil types, and geographic regions, as well as testing its performance on lower-resolution multispectral imagery to reduce data acquisition costs and time, enabling large-scale and cost-effective monitoring. Full article
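Swapping backbones under a UNet, as compared in the study, is a one-line change in the segmentation_models_pytorch library (whether the authors used this library is an assumption); the nine output classes match the study's groundcover categories.

```python
# Sketch: UNet with an EfficientNetB0 encoder and 9 groundcover classes.
import torch
import segmentation_models_pytorch as smp

model = smp.Unet(
    encoder_name="efficientnet-b0",   # alternatives: "resnet34", "vgg16", ...
    encoder_weights="imagenet",
    in_channels=3,                    # RGB UAV imagery
    classes=9,                        # vine canopy, bare soil, 7 cover-crop types
)
logits = model(torch.randn(1, 3, 256, 256))
print(logits.shape)  # torch.Size([1, 9, 256, 256])
```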

25 pages, 12181 KB  
Article
Characterizing Growth and Estimating Yield in Winter Wheat Breeding Lines and Registered Varieties Using Multi-Temporal UAV Data
by Liwei Liu, Xinxing Zhou, Tao Liu, Dongtao Liu, Jing Liu, Jing Wang, Yuan Yi, Xuecheng Zhu, Na Zhang, Huiyun Zhang, Guohua Feng and Hongbo Ma
Agriculture 2025, 15(24), 2554; https://doi.org/10.3390/agriculture15242554 - 10 Dec 2025
Cited by 1 | Viewed by 528
Abstract
Grain yield is one of the most critical indicators for evaluating the performance of wheat breeding. However, the assessment process, from early-stage breeding lines to officially registered varieties that have passed the DUS (Distinctness, Uniformity, and Stability) test, is often time-consuming and labor-intensive. Multispectral remote sensing based on unmanned aerial vehicles (UAVs) has demonstrated significant potential in crop phenotyping and yield estimation due to its high throughput, non-destructive nature, and ability to rapidly collect large-scale, multi-temporal data. In this study, multi-temporal UAV-based multispectral imagery, RGB images, and canopy height data were collected throughout the entire wheat growing season (2023–2024) in Xuzhou, Jiangsu Province, China, to characterize the dynamic growth patterns of both breeding lines and registered cultivars. Vegetation indices (VIs), texture parameters (Tes), and a time-series crop height model (CHM), including the logistic-derived growth rate (GR) and the projected area (PA), were extracted to construct a comprehensive multi-source feature set. Four machine learning algorithms, namely random forest (RF), support vector machine regression (SVR), extreme gradient boosting (XGBoost), and partial least squares regression (PLSR), were employed to model and estimate yield. The results demonstrated that spectral, texture, and canopy height features derived from multi-temporal UAV data effectively captured phenotypic differences among wheat types and contributed to yield estimation. Features obtained from later growth stages generally led to higher estimation accuracy. The integration of vegetation indices and texture features outperformed models using single feature types. Furthermore, the integration of time-series features and feature selection further improved predictive accuracy, with XGBoost incorporating VIs, Tes, GR, and PA yielding the best performance (R2 = 0.714, RMSE = 0.516 t/ha, rRMSE = 5.96%). Overall, the proposed multi-source modeling framework offers a practical and efficient solution for yield estimation in early-stage wheat breeding and can support breeders and growers by enabling earlier, more accurate selection and management decisions in real-world production environments. Full article
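The logistic-derived growth rate (GR) feature can be computed by fitting a logistic curve to each plot's canopy-height time series and taking the curve's maximum slope; whether the paper defines GR exactly this way is an assumption, and the data below are invented.

```python
# Sketch: fit h(t) = K / (1 + exp(-r (t - t0))) and derive the maximum
# growth rate K*r/4, which the logistic curve attains at t = t0.
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, K, r, t0):
    return K / (1 + np.exp(-r * (t - t0)))

# toy multi-temporal canopy heights (days after sowing, meters)
t = np.array([30, 60, 90, 120, 150, 180, 210], dtype=float)
h = np.array([0.05, 0.10, 0.25, 0.55, 0.75, 0.82, 0.85])

(K, r, t0), _ = curve_fit(logistic, t, h, p0=[0.9, 0.05, 120])
growth_rate = K * r / 4  # maximum slope of the fitted logistic curve
print(f"K={K:.2f} m, max GR={growth_rate:.4f} m/day at t0={t0:.0f}")
```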

19 pages, 3279 KB  
Article
Research on Wetland Fine Classification Based on Remote Sensing Images with Multi-Temporal and Feature Optimization
by Dongping Xu, Wei Wu, Yesheng Ma and Dianxing Feng
Sustainability 2025, 17(24), 10900; https://doi.org/10.3390/su172410900 - 5 Dec 2025
Viewed by 440
Abstract
Wetlands, known as “the kidney of the Earth”, serve as critical ecological carriers for global sustainable development. The fine classification of wetlands is crucial to their utilization and protection. Wetland fine-scale classification based on remote sensing imagery has long been challenged by disturbances such as clouds, fog, and shadows. Simultaneously, the confusion of spectral information among land cover types remains a primary factor affecting classification accuracy. To address these challenges, this paper proposes a fine classification model of wetlands in remote sensing images based on multi-temporal data and feature optimization (CMW-MTFO). The model is divided into three parts: (1) a multi-satellite and multi-temporal remote sensing image fusion module; (2) a feature optimization module; and (3) a feature classification network module. Multi-satellite multi-temporal image fusion compensates for information gaps caused by cloud cover, fog, and shadows, while feature optimization reduces spectral characteristics prone to confusion. Finally, fine classification is completed using the feature classification network based on deep learning. Using coastal wetlands in Liaoning Province, China, as the experimental area, this study compares the CMW-MTFO with several classical wetland classification methods, non-feature-optimized classification, and single-temporal classification. Results show that the proposed model achieves an overall classification accuracy of 98.31% for Liaoning wetlands, with a Kappa coefficient of 0.9795. Compared to the classic random forest method, classification accuracy and Kappa coefficient improved by 11.09% and 0.1286, respectively. Compared to non-feature-optimized classification, classification accuracy increased by 1.06% and the Kappa coefficient by 1.18%. Compared to the best classification performance using single-temporal images, the proposed method achieved a 1.81% increase in classification accuracy and a 2.19% increase in Kappa value, demonstrating the effectiveness of the model approach. Full article
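Feature optimization of the kind described often reduces to importance-based selection before the final classifier; the sketch below uses random-forest importances with a median cutoff, which is an illustrative assumption rather than the CMW-MTFO module.

```python
# Sketch: rank multi-temporal spectral features by random-forest importance
# and keep only the more discriminative half before classification.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 40))              # placeholder multi-temporal features
y = (X[:, 0] + X[:, 7] > 0).astype(int)     # placeholder wetland classes

selector = SelectFromModel(
    RandomForestClassifier(n_estimators=200, random_state=0),
    threshold="median",                     # drop the weaker half of features
).fit(X, y)
X_opt = selector.transform(X)
print(X.shape, "->", X_opt.shape)           # (500, 40) -> (500, 20)
```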
