Search Results (1,533)

Search Parameters:
Keywords = optical remote sensing images

25 pages, 10750 KB  
Article
LHRSI: A Lightweight Spaceborne Imaging Spectrometer with Wide Swath and High Resolution for Ocean Color Remote Sensing
by Bo Cheng, Yongqian Zhu, Miao Hu, Xianqiang He, Qianmin Liu, Chunlai Li, Chen Cao, Bangjian Zhao, Jincai Wu, Jianyu Wang, Jie Luo, Jiawei Lu, Zhihua Song, Yuxin Song, Wen Jiang, Zi Wang, Guoliang Tang and Shijie Liu
Remote Sens. 2026, 18(2), 218; https://doi.org/10.3390/rs18020218 - 9 Jan 2026
Viewed by 116
Abstract
Global water environment monitoring urgently requires remote sensing data with high temporal resolution and wide spatial coverage. However, current spaceborne ocean color spectrometers still face a significant trade-off among spatial resolution, swath width, and system compactness, which limits the large-scale deployment of satellite constellations. To address this challenge, this study developed a lightweight high-resolution spectral imager (LHRSI) with a total mass of less than 25 kg and power consumption below 80 W. The visible (VIS) camera adopts an interleaved dual-field-of-view design with spliced, fused detector arrays, while the shortwave infrared (SWIR) camera employs a transmission-type focal plane with staggered detector arrays. Through the field-of-view (FOV) optical design, the instrument achieves swath widths of 207.33 km for the VIS bands and 187.8 km for the SWIR bands at an orbital altitude of 500 km, while maintaining spatial resolutions of 12 m and 24 m, respectively. On-orbit imaging results demonstrate that the spectrometer achieves excellent performance in both spatial resolution and swath width. In addition, preliminary analysis using index-based indicators illustrates LHRSI's potential for observing chlorophyll-related features in water bodies. This research not only provides a high-performance, miniaturized spectrometer solution but also lays an engineering foundation for developing low-cost, high-revisit global ocean and water environment monitoring constellations.
(This article belongs to the Section Ocean Remote Sensing)
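As a rough illustration of the swath–altitude relationship behind these figures (a flat-Earth, nadir-pointing approximation, not taken from the paper), the across-track field of view needed for a given swath follows from simple geometry:

```latex
\mathrm{FOV} = 2\arctan\!\left(\frac{W}{2h}\right), \qquad
W = 207.33\ \text{km},\; h = 500\ \text{km}
\;\Rightarrow\; \mathrm{FOV} \approx 2\arctan(0.207) \approx 23.4^{\circ}
```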

22 pages, 3276 KB  
Article
AFR-CR: An Adaptive Frequency Domain Feature Reconstruction-Based Method for Cloud Removal via SAR-Assisted Remote Sensing Image Fusion
by Xiufang Zhou, Qirui Fang, Xunqiang Gong, Shuting Yang, Tieding Lu, Yuting Wan, Ailong Ma and Yanfei Zhong
Remote Sens. 2026, 18(2), 201; https://doi.org/10.3390/rs18020201 - 8 Jan 2026
Viewed by 229
Abstract
Optical imagery is often contaminated by clouds to varying degrees, which greatly affects the interpretation and analysis of images. Synthetic Aperture Radar (SAR) can penetrate clouds and mist, and a common strategy in SAR-assisted cloud removal is to fuse SAR and optical data and use deep learning networks to reconstruct cloud-free optical imagery. However, these methods do not fully consider frequency-domain characteristics during feature integration, resulting in blurred edges in the generated cloud-free optical images. Therefore, an adaptive frequency-domain feature reconstruction-based cloud removal method is proposed to solve this problem. The proposed method comprises four sequential stages. First, shallow features are extracted by fusing optical and SAR images. Second, a Transformer-based encoder captures multi-scale semantic features. Third, a Frequency Domain Decoupling Module (FDDM) with a dynamic mask generation mechanism explicitly decomposes features into low-frequency structures and high-frequency details, effectively suppressing cloud interference while preserving surface textures. Finally, a Cross-Frequency Reconstruction Module (CFRM) uses transposed cross-attention to enable robust information interaction, ensuring precise fusion and reconstruction. Experimental evaluation on the M3R-CR dataset confirms that the proposed approach achieves the best results on all four evaluated metrics, surpassing eight other state-of-the-art methods and demonstrating its effectiveness in SAR-optical fusion for cloud removal.
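The abstract does not give the FDDM's internals; the following is a minimal sketch of the general idea it names — splitting a feature map into low- and high-frequency components with a radial mask in the Fourier domain (the cutoff radius and the numpy implementation are my assumptions, not the paper's):

```python
import numpy as np

def frequency_split(feat: np.ndarray, radius: float = 0.1):
    """Split a 2D feature map into low/high-frequency parts via an FFT mask.

    feat   : (H, W) array, one feature channel.
    radius : cutoff as a fraction of the spectrum half-width (assumed value).
    """
    H, W = feat.shape
    spec = np.fft.fftshift(np.fft.fft2(feat))              # centered spectrum
    yy, xx = np.mgrid[-H // 2:(H + 1) // 2, -W // 2:(W + 1) // 2]
    low_mask = np.hypot(yy / (H / 2), xx / (W / 2)) <= radius
    low = np.fft.ifft2(np.fft.ifftshift(spec * low_mask)).real  # structures
    high = feat - low                                           # fine details
    return low, high

low, high = frequency_split(np.random.rand(64, 64))
```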

31 pages, 6416 KB  
Article
FireMM-IR: An Infrared-Enhanced Multi-Modal Large Language Model for Comprehensive Scene Understanding in Remote Sensing Forest Fire Monitoring
by Jinghao Cao, Xiajun Liu and Rui Xue
Sensors 2026, 26(2), 390; https://doi.org/10.3390/s26020390 - 7 Jan 2026
Viewed by 160
Abstract
Forest fire monitoring in remote sensing imagery has long relied on traditional perception models that primarily focus on detection or segmentation. However, such approaches fall short in understanding complex fire dynamics, including contextual reasoning, fire evolution description, and cross-modal interpretation. With the rise of multi-modal large language models (MLLMs), it becomes possible to move beyond low-level perception toward holistic scene understanding that jointly reasons about semantics, spatial distribution, and descriptive language. To address this gap, we introduce FireMM-IR, a multi-modal large language model tailored for pixel-level scene understanding in remote-sensing forest-fire imagery. FireMM-IR incorporates an infrared-enhanced classification module that fuses infrared and visual modalities, enabling the model to capture fire intensity and hidden ignition areas under dense smoke. Furthermore, we design a mask-generation module guided by language-conditioned segmentation tokens to produce accurate instance masks from natural-language queries. To effectively learn multi-scale fire features, a class-aware memory mechanism is introduced to maintain contextual consistency across diverse fire scenes. We also construct FireMM-Instruct, a unified corpus of 83,000 geometrically aligned RGB–IR pairs with instruction-aligned descriptions, bounding boxes, and pixel-level annotations. Extensive experiments show that FireMM-IR achieves superior performance on pixel-level segmentation and strong results on instruction-driven captioning and reasoning, while maintaining competitive performance on image-level benchmarks. These results indicate that infrared–optical fusion and instruction-aligned learning are key to physically grounded understanding of wildfire scenes.
(This article belongs to the Special Issue Remote Sensing and UAV Technologies for Environmental Monitoring)
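The abstract describes a module that fuses infrared and visual features but not how; a minimal PyTorch-style sketch of one common fusion pattern — channel concatenation followed by a 1×1 projection (my illustration, not the paper's architecture):

```python
import torch
import torch.nn as nn

class RGBIRFusion(nn.Module):
    """Fuse RGB and IR feature maps by concatenation + 1x1 conv (illustrative)."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.project = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, rgb_feat: torch.Tensor, ir_feat: torch.Tensor):
        fused = torch.cat([rgb_feat, ir_feat], dim=1)  # stack along channels
        return self.project(fused)                     # back to `channels`

fusion = RGBIRFusion(64)
out = fusion(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
```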

30 pages, 6797 KB  
Article
Voxel-Based Leaf Area Estimation in Trellis-Grown Grapevines: A Destructive Validation and Comparison with Optical LAI Methods
by Poching Teng, Hiroyoshi Sugiura, Tomoki Date, Unseok Lee, Takeshi Yoshida, Tomohiko Ota and Junichi Nakagawa
Remote Sens. 2026, 18(2), 198; https://doi.org/10.3390/rs18020198 - 7 Jan 2026
Viewed by 199
Abstract
This study develops a voxel-based leaf area estimation framework and validates it using a three-year multi-temporal dataset (2022–2024) of pergola-trained grapevines. The workflow integrates 2D image analysis, ExGR-based leaf segmentation, and 3D reconstruction using Structure-from-Motion (SfM). Multi-angle canopy images were collected repeatedly during the growing seasons, and destructive leaf sampling was conducted to quantify true leaf area across multiple vines and years. After removing non-leaf structures with ExGR filtering, the point clouds were voxelized at a 1 cm³ resolution to derive structural occupancy metrics. Voxel-based leaf area showed strong within-vine correlations with destructively measured values (R² = 0.77–0.95), while cross-vine variability was influenced by canopy complexity, illumination, and point-cloud density. In contrast, optical LAI tools (DHP and LAI-2000) exhibited negligible correspondence with true leaf area due to multilayer occlusion and lateral light contamination typical of pergola systems. This expanded, multi-year analysis demonstrates that voxel occupancy provides a robust and scalable indicator of canopy structural density and leaf area, offering a practical foundation for remote-sensing-based phenotyping, yield estimation, and data-driven management in perennial fruit crops.
(This article belongs to the Section Forest Remote Sensing)
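The voxel occupancy metric described above is straightforward to compute; a minimal sketch assuming a point cloud in metres and the paper's 1 cm³ grid (the numpy implementation is my own, not the authors' code):

```python
import numpy as np

def occupied_voxels(points: np.ndarray, voxel_size: float = 0.01) -> int:
    """Count voxels of side `voxel_size` (m) occupied by at least one point.

    points : (N, 3) array of x, y, z coordinates in metres.
    """
    idx = np.floor(points / voxel_size).astype(np.int64)  # voxel indices
    return len(np.unique(idx, axis=0))                    # distinct occupied cells

# e.g. a leaf-only point cloud after ExGR filtering (stand-in data):
cloud = np.random.rand(10_000, 3)
print(occupied_voxels(cloud))   # occupancy serves as a leaf-area proxy
```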

23 pages, 14919 KB  
Article
Estimating Economic Activity from Satellite Embeddings
by Xiangqi Yue, Zhong Zhao and Kun Hu
Appl. Sci. 2026, 16(2), 582; https://doi.org/10.3390/app16020582 - 6 Jan 2026
Viewed by 172
Abstract
Earth Embedding (EMB) is a method that adapts embedding techniques from Large Language Models (LLMs) to compress the information contained in multiple remote sensing satellite images into feature vectors. This article introduces a new approach to measuring economic activity from EMBs. Using the Google Satellite Embedding Dataset (GSED), we extract a 64-dimensional representation of the Earth's surface that integrates optical and radar imagery. A neural network maps these embeddings to nighttime light (NTL) intensity, yielding a 32-dimensional "income-aware" feature space aligned with economic variation. We then predict GDP levels and growth rates across countries and compare the results with those of traditional NTL-based models. The EMB-based estimator achieves substantially lower mean squared error in estimating GDP levels, and combining the two sources yields the best overall accuracy. Further analysis shows that EMB performs particularly well in low-statistical-capacity and high-income economies. These results suggest that satellite embeddings can provide a scalable, globally consistent framework for monitoring economic development and validating official statistics.
(This article belongs to the Collection Space Applications)
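The abstract outlines a network that maps 64-dimensional embeddings to nighttime-light intensity through a 32-dimensional intermediate layer; a minimal sketch of that shape (the layer sizes follow the abstract, while the activation, loss, and stand-in data are assumptions):

```python
import torch
import torch.nn as nn

# 64-d GSED embedding -> 32-d "income-aware" features -> scalar NTL intensity
model = nn.Sequential(
    nn.Linear(64, 32),   # income-aware feature space (dimension from abstract)
    nn.ReLU(),
    nn.Linear(32, 1),    # predicted nighttime-light intensity
)
loss_fn = nn.MSELoss()   # assumed objective; the paper reports MSE for GDP levels

emb = torch.randn(8, 64)   # batch of embeddings (stand-in data)
ntl = torch.rand(8, 1)     # observed NTL targets (stand-in data)
loss = loss_fn(model(emb), ntl)
```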

14 pages, 2218 KB  
Article
Singular Value Decomposition Wavelength-Multiplexing Ghost Imaging
by Yingtao Zhang, Xueqian Zhang, Zongguo Li and Hongguo Li
Photonics 2026, 13(1), 49; https://doi.org/10.3390/photonics13010049 - 5 Jan 2026
Viewed by 271
Abstract
To enhance imaging quality, singular value decomposition (SVD) has been applied to single-wavelength ghost imaging (GI) and color GI. In this paper, we extend the application of SVD to wavelength-multiplexing ghost imaging (WMGI) to reduce the redundant information in the random measurement matrix corresponding to multi-wavelength modulated speckle fields. The feasibility of this method is demonstrated through numerical simulations and optical experiments. Based on the intensity statistical properties of multi-wavelength speckle fields, we derived an expression for the contrast-to-noise ratio (CNR) to characterize imaging quality and conducted a corresponding analysis. The theoretical results indicate that in SVD-WMGI, for the m-wavelength case, the CNR of the reconstructed image is m times that of single-wavelength GI. Moreover, we carried out an optical experiment with a three-wavelength speckle-modulated light source to verify the method. This approach integrates the advantages of both SVD and wavelength division multiplexing, potentially facilitating the application of GI in long-distance imaging fields such as remote sensing.
(This article belongs to the Special Issue Ghost Imaging and Quantum-Inspired Classical Optics)
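Ghost imaging reconstructs an object from correlations between patterned illumination and a bucket (single-pixel) detector signal. A minimal single-wavelength sketch of the basic correlation estimator that SVD-WMGI builds on — the SVD preprocessing and wavelength multiplexing of the paper are not shown, and all parameters here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
H = W = 32
obj = np.zeros((H, W)); obj[8:24, 12:20] = 1.0     # stand-in object

M = 4000                                           # number of speckle patterns
patterns = rng.random((M, H, W))                   # random speckle fields
bucket = patterns.reshape(M, -1) @ obj.ravel()     # bucket detector signal

# second-order correlation: G(x) = <I(x) B> - <I(x)><B>
recon = (np.einsum('mij,m->ij', patterns, bucket) / M
         - patterns.mean(axis=0) * bucket.mean())
```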

32 pages, 59431 KB  
Article
Joint Deblurring and Destriping for Infrared Remote Sensing Images with Edge Preservation and Ringing Suppression
by Ningfeng Wang, Liang Huang, Mingxuan Li, Bin Zhou and Ting Nie
Remote Sens. 2026, 18(1), 150; https://doi.org/10.3390/rs18010150 - 2 Jan 2026
Viewed by 177
Abstract
Infrared remote sensing images are often degraded by blur and stripe noise caused by satellite attitude variations, optical distortions, and electronic interference, which significantly compromise image quality and target detection performance. Existing joint deblurring and destriping methods tend to over-smooth image edges and textures, failing to effectively preserve high-frequency details and sometimes misclassifying ringing artifacts as stripes. This paper proposes a variational framework for simultaneous deblurring and destriping of infrared remote sensing images. By leveraging an adaptive structure tensor model, the method exploits the sparsity and directionality of stripe noise, thereby enhancing edge and detail preservation. During blur kernel estimation, a fidelity term orthogonal to the stripe direction is introduced to suppress noise and residual stripes. In the image restoration stage, a WCOB model (non-blind restoration based on Wiener-cosine composite filtering) is proposed to effectively mitigate ringing artifacts and visual distortions. The overall optimization problem is efficiently solved using the alternating direction method of multipliers (ADMM). Extensive experiments on real infrared remote sensing datasets demonstrate that the proposed method achieves superior denoising and restoration performance, exhibiting strong robustness and practical applicability.
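The abstract states that the optimization is solved with ADMM but does not give the objective; for reference, the generic scaled-form ADMM iteration for a split problem min over x, z of f(x) + g(z) subject to Ax + Bz = c (textbook form, not the paper's specific functional) is:

```latex
x^{k+1} = \arg\min_x \; f(x) + \tfrac{\rho}{2}\,\lVert Ax + Bz^{k} - c + u^{k}\rVert_2^2
z^{k+1} = \arg\min_z \; g(z) + \tfrac{\rho}{2}\,\lVert Ax^{k+1} + Bz - c + u^{k}\rVert_2^2
u^{k+1} = u^{k} + Ax^{k+1} + Bz^{k+1} - c
```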

26 pages, 48691 KB  
Article
A Multi-Channel Convolutional Neural Network Model for Detecting Active Landslides Using Multi-Source Fusion Images
by Jun Wang, Hongdong Fan, Wanbing Tuo and Yiru Ren
Remote Sens. 2026, 18(1), 126; https://doi.org/10.3390/rs18010126 - 30 Dec 2025
Viewed by 258
Abstract
Synthetic Aperture Radar Interferometry (InSAR) has demonstrated significant advantages in detecting active landslides. The proliferation of computing technology has enabled the combination of InSAR and deep learning, offering an innovative approach to automating landslide detection. However, InSAR-based detection faces two persistent challenges: (1) difficulty distinguishing active landslides from other deformation phenomena, which leads to high false alarm rates; and (2) insufficient accuracy in delineating precise landslide boundaries due to low image contrast. Incorporating multi-source data and multi-branch feature extraction networks can alleviate these issues, yet it inevitably increases computational cost and model complexity. To address this, this study first constructs a multi-source fusion image dataset combining optical remote sensing imagery, DEM-derived slope information, and InSAR deformation data. It then proposes a multi-channel instance segmentation framework named MCLD R-CNN (Multi-Channel Landslide Detection R-CNN). The proposed network accepts multi-channel inputs and integrates a landslide-focused attention mechanism, enhancing the model's ability to capture landslide-specific features. The experimental findings indicate that the proposed strategy effectively addresses the aforementioned challenges. Moreover, MCLD R-CNN achieves superior detection accuracy and generalization ability compared to other benchmark models.
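A minimal sketch of building the kind of multi-channel input the abstract describes — stacking optical bands, DEM-derived slope, and an InSAR deformation layer into one tensor (the array shapes and min-max normalization are my assumptions):

```python
import numpy as np

H, W = 256, 256
optical = np.random.rand(H, W, 3)   # RGB optical imagery (stand-in data)
slope = np.random.rand(H, W, 1)     # DEM-derived slope
insar = np.random.rand(H, W, 1)     # InSAR deformation rate

def norm(x):
    """Per-layer min-max normalization to [0, 1]."""
    return (x - x.min()) / (x.max() - x.min() + 1e-8)

sample = np.concatenate([norm(optical), norm(slope), norm(insar)], axis=-1)
assert sample.shape == (H, W, 5)    # 5-channel input to the network
```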

16 pages, 3975 KB  
Article
Thermal Radiation Analysis Method and Thermal Control System Design for Spaceborne Micro-Hyperspectral Imager Operating on Inclined-LEO
by Xinwei Zhou, Yutong Xu, Yongnan Lu, Yangyang Zou, Hanyu Ye and Tailei Wang
Aerospace 2026, 13(1), 29; https://doi.org/10.3390/aerospace13010029 - 27 Dec 2025
Viewed by 206
Abstract
Thermal control of spaceborne micro-hyperspectral imagers (MHIs) operating in inclined low-Earth orbits (LEOs) presents significant challenges due to the complex and dynamically varying external heat flux and the lack of a stable heat dissipation surface. This study proposes a thermal radiation analysis method capable of rapidly deriving accurate numerical solutions for the thermal radiation characteristics of spacecraft in such orbits. A dedicated thermal control system (TCS) was designed, featuring a radiator oriented towards the +zs plane, which was identified as having stable and low incident heat flux across extreme solar-orbit angle conditions. The system employs efficient thermal pathways, including thermal pads and a flexible graphite thermal ribbon, to transfer waste heat from the imaging module to the radiator, supplemented by electric heaters and multilayer insulation for temperature stability. Steady-state thermal analysis demonstrated excellent temperature uniformity, with gradients below 0.017 °C on critical optics. Subsequent thermo-optical performance analysis revealed that modulation transfer function (MTF) degradation was maintained below 2% compared to the ideal system. The results confirm the feasibility and effectiveness of the proposed thermal design and analysis methodology in maintaining the stringent thermo-optical performance required for MHIs on inclined-LEO platforms.

43 pages, 42157 KB  
Article
SAREval: A Multi-Dimensional and Multi-Task Benchmark for Evaluating Visual Language Models on SAR Image Understanding
by Ziyan Wang, Lei Liu, Gang Wan, Yuchen Lu, Fengjie Zheng, Guangde Sun, Yixiang Huang, Shihao Guo, Xinyi Li and Liang Yuan
Remote Sens. 2026, 18(1), 82; https://doi.org/10.3390/rs18010082 - 25 Dec 2025
Viewed by 341
Abstract
Vision-Language Models (VLMs) demonstrate significant potential for remote sensing interpretation through multimodal fusion and semantic representation of imagery. However, their adaptation to Synthetic Aperture Radar (SAR) remains challenging due to fundamental differences in imaging mechanisms and physical properties compared to optical remote sensing. SAREval, the first comprehensive benchmark specifically designed for SAR image understanding, addresses this gap by incorporating SAR-specific characteristics, including scattering mechanisms and polarization features, through a hierarchical framework spanning perception, reasoning, and robustness capabilities. It encompasses 20 tasks, from image classification to physical-attribute inference, with over 10,000 high-quality image-text pairs. Extensive experiments conducted on 11 mainstream VLMs reveal substantial limitations in SAR image interpretation: models achieve merely 25.35% accuracy in fine-grained ship classification and have significant difficulty establishing mappings between visual features and physical parameters. Furthermore, some models exhibit unexpected performance improvements under particular noise conditions, challenging conventional robustness assumptions. SAREval establishes an essential foundation for developing and evaluating VLMs in SAR image interpretation, providing standardized assessment protocols and quality-controlled annotations for cross-modal remote sensing research.

21 pages, 5194 KB  
Article
Integrated Polarimetric Spectral Imaging Sensor Combining Spectral Imaging and Polarization Modulation Techniques
by Zihao Liu, Zhiping Song, Zhengqiang Li and Li Li
Sensors 2026, 26(1), 144; https://doi.org/10.3390/s26010144 - 25 Dec 2025
Viewed by 353
Abstract
Polarimetric spectral imaging systems have unique application advantages in environmental remote sensing, military target recognition, astronomy, medicine, and other fields because of their ability to acquire multidimensional information. However, traditional systems are constrained by complex structures and low spectral resolution, which keeps them from reaching their full potential. This study proposes a novel polarimetric spectral imaging method for information acquisition that addresses these shortcomings. The method integrates a polarization modulator (composed of two retarders and one polarizer) into the incident optical path of a push-broom imaging spectrometer. The modulator statically encodes the full polarization spectral information of the measured light into output power spectra, which the spectrometer records as raw spectral image data. Target polarimetric spectral imaging information is then reconstructed from the raw data to realize the sensor's functions. The system structure and data reconstruction principles are described, and laboratory experiments with typical polarized light sources together with preliminary outdoor experiments verified the system's correctness and reliability. The results facilitate further expansion of the application scope of polarimetric spectral imaging systems.
(This article belongs to the Section Optical Sensors)
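A two-retarder-plus-polarizer modulator is the classic channeled-spectropolarimetry layout; a minimal Mueller-calculus sketch of how such a train maps an input Stokes vector to a measured intensity (the 0°/45° fast-axis orientations and ideal-element Mueller matrices are assumptions for illustration, not the paper's configuration):

```python
import numpy as np

def retarder(delta, at45=False):
    """Mueller matrix of an ideal linear retarder (fast axis at 0 or 45 deg)."""
    c, s = np.cos(delta), np.sin(delta)
    if at45:
        return np.array([[1, 0, 0, 0], [0, c, 0, -s],
                         [0, 0, 1, 0], [0, s, 0, c]], float)
    return np.array([[1, 0, 0, 0], [0, 1, 0, 0],
                     [0, 0, c, s], [0, 0, -s, c]], float)

POL0 = 0.5 * np.array([[1, 1, 0, 0], [1, 1, 0, 0],
                       [0, 0, 0, 0], [0, 0, 0, 0]], float)  # polarizer at 0 deg

def measured_intensity(stokes, d1, d2):
    """Intensity after retarder(0 deg, d1) -> retarder(45 deg, d2) -> polarizer."""
    s_out = POL0 @ retarder(d2, at45=True) @ retarder(d1) @ stokes
    return s_out[0]   # the detector sees only total intensity S0

# retardances vary with wavelength, encoding polarization into the spectrum
print(measured_intensity(np.array([1.0, 0.3, 0.2, 0.1]), 1.2, 2.4))
```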

27 pages, 7808 KB  
Article
An Enhanced CycleGAN to Derive Temporally Continuous NDVI from Sentinel-1 SAR Images
by Anqi Wang, Zhiqiang Xiao, Chunyu Zhao, Juan Li, Yunteng Zhang, Jinling Song and Hua Yang
Remote Sens. 2026, 18(1), 56; https://doi.org/10.3390/rs18010056 - 24 Dec 2025
Viewed by 317
Abstract
Frequent cloud cover severely limits the use of optical remote sensing for continuous ecological monitoring. Synthetic aperture radar (SAR) offers an all-weather alternative, but translating SAR data to optical equivalents is challenging, particularly in cloudy regions where paired training data are scarce. To address this, we developed an enhanced CycleGAN (denoted SA-CycleGAN) to derive a high-fidelity, temporally continuous normalized difference vegetation index (NDVI) from SAR imagery. SA-CycleGAN introduces a novel spatiotemporal attention generator that dynamically computes global and local feature relationships to capture long-range spatial dependencies across diverse landscapes. Furthermore, a structural similarity (SSIM) loss function is integrated into SA-CycleGAN to preserve the structural and textural integrity of the synthesized images. The performance of SA-CycleGAN and three unsupervised models (DualGAN, GP-UNIT, and DCLGAN) was evaluated by deriving NDVI time series from Sentinel-1 SAR images across four sites with different vegetation types, and ablation experiments were conducted to verify the contributions of the key components of the SA-CycleGAN model. The results demonstrate that SA-CycleGAN significantly outperformed the comparison models across all four sites. Quantitatively, the proposed method achieved the lowest Root Mean Square Error (RMSE) of 0.0502 and the highest Coefficient of Determination (R²) of 0.88 at the Zhangbei and Xishuangbanna sites, respectively. The ablation experiments confirmed that the attention mechanism and SSIM loss function were crucial for capturing long-range features and maintaining spatial structure. SA-CycleGAN thus proves a robust and effective solution for overcoming data gaps in optical time series.
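For reference, NDVI — the quantity the network synthesizes — has a standard definition from red and near-infrared reflectance; a one-line implementation (standard formula, not paper-specific):

```python
import numpy as np

def ndvi(nir: np.ndarray, red: np.ndarray) -> np.ndarray:
    """NDVI = (NIR - Red) / (NIR + Red), bounded in [-1, 1]."""
    return (nir - red) / (nir + red + 1e-8)   # epsilon avoids divide-by-zero

print(ndvi(np.array([0.45]), np.array([0.08])))  # dense vegetation -> ~0.70
```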

25 pages, 5001 KB  
Article
SAR-to-Optical Remote Sensing Image Translation Method Based on InternImage and Cascaded Multi-Head Attention
by Cheng Xu and Yingying Kong
Remote Sens. 2026, 18(1), 55; https://doi.org/10.3390/rs18010055 - 24 Dec 2025
Viewed by 257
Abstract
Synthetic aperture radar (SAR), with its all-weather, all-day observation capability, plays a significant role in the field of remote sensing. However, due to the unique imaging mechanism of SAR, its interpretation is challenging, and translating SAR images into optical remote sensing images has become a research hotspot in recent years as a way to enhance the interpretability of SAR images. This paper proposes a deep learning-based method for SAR-to-optical remote sensing image translation. The network comprises three parts: a global representor, a generator with cascaded multi-head attention, and a multi-scale discriminator. The global representor, built upon InternImage with deformable convolution v3 (DCNv3) as its core operator, leverages its global receptive field and adaptive spatial aggregation capabilities to extract global semantic features from SAR images. The generator follows the classic encoder-bottleneck-decoder structure, where the encoder focuses on extracting local detail features from SAR images and the cascaded multi-head attention module within the bottleneck layer refines local detail features and facilitates interaction between global semantics and local details. The discriminator adopts a multi-scale structure based on the local-receptive-field PatchGAN, enabling joint global and local discrimination. Furthermore, for the first time in SAR image translation tasks, structural similarity index measure (SSIM) loss is combined with adversarial loss, perceptual loss, and feature matching loss as the loss function. A series of experiments demonstrates the effectiveness and reliability of the proposed method. Compared to mainstream image translation methods, our method generates higher-quality optical remote sensing images that are semantically consistent, texturally authentic, clearly detailed, and visually plausible.
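The abstract lists four loss terms combined in training; the usual weighted-sum form such combinations take (the weights λ are unspecified in the abstract and shown generically here) is:

```latex
\mathcal{L} = \lambda_{\mathrm{adv}}\,\mathcal{L}_{\mathrm{adv}}
            + \lambda_{\mathrm{perc}}\,\mathcal{L}_{\mathrm{perc}}
            + \lambda_{\mathrm{fm}}\,\mathcal{L}_{\mathrm{fm}}
            + \lambda_{\mathrm{ssim}}\,\bigl(1 - \mathrm{SSIM}(\hat{y}, y)\bigr)
```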

18 pages, 1564 KB  
Article
Salient Object Detection in Optical Remote Sensing Images Based on Hierarchical Semantic Interaction
by Jingfan Xu, Qi Zhang, Jinwen Xing, Mingquan Zhou and Guohua Geng
J. Imaging 2025, 11(12), 453; https://doi.org/10.3390/jimaging11120453 - 17 Dec 2025
Viewed by 352
Abstract
Existing salient object detection methods for optical remote sensing images still face limitations arising from complex background variations, significant scale discrepancies among targets, severe background interference, and diverse topological structures. On the one hand, the feature transmission process often neglects the constraints and complementary effects of high-level features on low-level features, leading to insufficient feature interaction and weakened model representation. On the other hand, decoder architectures generally rely on simple cascaded structures, which fail to adequately exploit contextual information. To address these challenges, this study proposes a Hierarchical Semantic Interaction Module to enhance salient object detection performance in optical remote sensing scenarios. The module introduces foreground content modeling and a hierarchical semantic interaction mechanism within a multi-scale feature space, reinforcing the synergy and complementarity among features at different levels and effectively highlighting multi-scale and multi-type salient regions in complex backgrounds. Extensive experiments on multiple optical remote sensing datasets demonstrate the effectiveness of the proposed method. Specifically, on the EORSSD dataset, our full model integrating both the CA and PA modules improves the max F-measure from 0.8826 to 0.9100 (↑2.74%), the max E-measure from 0.9603 to 0.9727 (↑1.24%), and the S-measure from 0.9026 to 0.9295 (↑2.69%) compared with the baseline. These results demonstrate the effectiveness of the proposed modules and verify the robustness and strong generalization capability of our method in complex remote sensing scenarios.
(This article belongs to the Special Issue AI-Driven Remote Sensing Image Processing and Pattern Recognition)
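The core idea above — letting high-level semantics constrain low-level features — is commonly realized as a learned gate; a minimal PyTorch-style sketch (this gating design is my illustration, not the paper's module):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticGate(nn.Module):
    """Gate low-level features with a mask predicted from high-level features."""
    def __init__(self, high_ch: int):
        super().__init__()
        self.to_gate = nn.Conv2d(high_ch, 1, kernel_size=1)

    def forward(self, low: torch.Tensor, high: torch.Tensor):
        gate = torch.sigmoid(self.to_gate(high))                  # (N,1,h,w)
        gate = F.interpolate(gate, size=low.shape[-2:],
                             mode='bilinear', align_corners=False)
        return low * gate + low   # gated enhancement plus residual path

g = SemanticGate(256)
out = g(torch.randn(1, 64, 128, 128), torch.randn(1, 256, 32, 32))
```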

22 pages, 12312 KB  
Article
ES-YOLO: Multi-Scale Port Ship Detection Combined with Attention Mechanism in Complex Scenes
by Lixiang Cao, Jia Xi, Zixuan Xie, Teng Feng and Xiaomin Tian
Sensors 2025, 25(24), 7630; https://doi.org/10.3390/s25247630 - 16 Dec 2025
Viewed by 377
Abstract
With the rapid development of remote sensing technology and deep learning, single-stage port ship detection algorithms have achieved remarkable results in optical imagery. However, most existing methods are designed and verified for specific scenes, such as a fixed viewing angle, a uniform background, or the open sea, making it difficult for them to handle ship detection in complex environments involving cloud occlusion, wave fluctuation, complex harbor buildings, and multi-ship aggregation. To this end, the ES-YOLO framework is proposed to overcome these limitations. A novel edge-aware channel-spatial attention mechanism (EACSA) is proposed to enhance the extraction of edge information and improve the ability to capture feature details. A lightweight spatial-channel decoupled down-sampling module (LSCD) is designed to replace the down-sampling structure of the original network and reduce the complexity of the down-sampling stage. A new hierarchical scale structure is designed to balance detection across different target scales. In addition, a remote sensing ship dataset, TJShip, is constructed from Gaofen-2 images, covering multi-scale targets from small fishing boats to large cargo ships. Using TJShip as the data source, ablation and comparison experiments were conducted with the ES-YOLO model. The results show that introducing the EACSA attention mechanism, LSCD, and the multi-scale structure improves ship detection mAP by 0.83%, 0.54%, and 1.06%, respectively, over the baseline model, with strong performance in precision, recall, and F1-score as well. Compared with Faster R-CNN, RetinaNet, YOLOv5, YOLOv7, and YOLOv8 under the same experimental conditions, ES-YOLO improves mAP by 46.87%, 8.14%, 1.85%, 1.75%, and 0.86%, respectively, offering a useful reference for ship detection research.
(This article belongs to the Section Remote Sensors)
