Remote Sensing Cross-Modal Research: Algorithms and Practices

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Remote Sensing Image Processing".

Deadline for manuscript submissions: 31 August 2024 | Viewed by 4259

Special Issue Editors


Guest Editor
College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China.
Interests: cross-domain scene classification; multi-modal image analysis; cross-modal image interpretation

Guest Editor
Hangzhou Institute of Technology, Xidian University, Hangzhou 311200, China
Interests: cross-modal image recognition; multi-source information joint perception; remote sensing applications

Special Issue Information

Dear Colleagues,

With the development of remote sensing technology, multi-modal remote sensing data have become widely available, including, but not limited to, optical, SAR (synthetic aperture radar), thermal, LiDAR, and hyperspectral imagery, as well as text. Different modalities of remote sensing data are highly correlated in specific events and applications, so cross-modal research can help us understand and employ remote sensing data more comprehensively. For example, in urban planning, studying the cross-modal interaction of optical imagery and LiDAR data enables more accurate three-dimensional modeling and terrain analysis; in environmental monitoring, cross-modal analysis of optical and thermal imaging data enables surface temperature monitoring; and in agricultural resource management, cross-modal research on hyperspectral and SAR data enables crop growth monitoring and soil moisture analysis. Remote sensing cross-modal research has therefore received significant attention from both academia and industry, and various cross-modal tasks have been proposed, such as cross-modal retrieval between remote sensing images and videos, remote sensing image captioning, video abstract extraction, and visual question answering. Cross-modal remote sensing data possess heterogeneous low-level features but related high-level semantics. Determining how to represent the underlying features of different modal data domains, extract the high-level semantics, and model the correlation between different modalities are the major challenges faced by remote sensing cross-modal research. New algorithms and methods must be developed to process and analyze such data.
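The closing challenge, heterogeneous low-level features that share high-level semantics, is commonly tackled by projecting each modality into a common embedding space and measuring similarity there. The sketch below is purely illustrative (random matrices stand in for learned projection weights; all names are hypothetical), not a method from any paper in this issue:

```python
import numpy as np

def project_to_shared_space(x, w):
    """Linearly project modality-specific features into a shared space,
    then L2-normalize so dot products become cosine similarities."""
    z = x @ w
    return z / np.linalg.norm(z, axis=1, keepdims=True)

rng = np.random.default_rng(0)
optical_feats = rng.normal(size=(4, 128))  # hypothetical optical descriptors
sar_feats = rng.normal(size=(4, 64))       # hypothetical SAR descriptors

w_opt = rng.normal(size=(128, 32))         # learned in practice; random here
w_sar = rng.normal(size=(64, 32))

z_opt = project_to_shared_space(optical_feats, w_opt)
z_sar = project_to_shared_space(sar_feats, w_sar)

# Cross-modal similarity matrix: entry (i, j) scores optical sample i
# against SAR sample j; training would pull matching pairs together.
similarity = z_opt @ z_sar.T
print(similarity.shape)  # (4, 4)
```

In practice the projection weights are learned, e.g. with a contrastive loss that maximizes the diagonal of this matrix relative to the off-diagonal entries.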

This Special Issue aims to promote exchange and cooperation in cross-modal research and to advance the development and application of remote sensing science. Topics range from remote sensing cross-modal retrieval to more comprehensive aims and scales. Articles may address, but are not limited to, the following topics:

  • Cross-modal remote sensing classification, segmentation, and retrieval;
  • Cross-modal remote sensing data fusion and feature extraction;
  • Remote sensing image caption generation;
  • Remote sensing visual question answering;
  • Remote sensing cross-modal image generation;
  • Remote sensing image-text cross-modal conversion;
  • Geographic knowledge map construction;
  • Optical–SAR image interpretation;
  • Cross-modal remote sensing object detection;
  • HSI–LiDAR cross-modal remote sensing image fusion classification;
  • Application of remote sensing cross-modal research: ecosystem monitoring, underground resource exploration, urban planning and geological hazard warning.

Dr. Xiangtao Zheng
Dr. Xiumei Chen
Prof. Dr. Jinchang Ren
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • cross-modal representation learning
  • cross-modal detection and identification
  • cross-modal conversion and retrieval
  • cross-modal knowledge reasoning
  • cross-modal image generation
  • cross-modal collaborative learning
  • remote sensing cross-modal application
  • domain adaptation
  • transfer learning
  • cross-modal consistency feature representation

Published Papers (5 papers)


Research

14 pages, 3080 KiB  
Article
Memory Augmentation and Non-Local Spectral Attention for Hyperspectral Denoising
by Le Dong, Yige Mo, Hao Sun, Fangfang Wu and Weisheng Dong
Remote Sens. 2024, 16(11), 1937; https://doi.org/10.3390/rs16111937 - 28 May 2024
Viewed by 437
Abstract
In this paper, a novel hyperspectral denoising method is proposed, aiming at restoring clean images from images disturbed by complex noise. Previous denoising methods have mostly focused on exploring the spatial and spectral correlations of hyperspectral data, and their performance is often limited by the effective information of the neighboring bands of the image patches in the spectral dimension, as neighboring bands often suffer from similar noise interference. In contrast, this study designs a cross-band non-local attention module that finds the optimal similar band for each input band. To avoid being limited to neighboring bands, the study also sets up a memory library that remembers the detailed information of each input band during denoising training, fully learning the spectral information of the data. In addition, densely connected modules are used to extract multi-scale spatial information from the images. The proposed network is validated on both synthetic and real data. Compared with other recent hyperspectral denoising methods, the proposed method not only demonstrates good performance but also achieves better generalization.
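The cross-band attention idea, picking helpful bands from the whole spectrum rather than only from spectral neighbours, can be illustrated with a toy NumPy sketch. This is our simplified illustration of the general non-local attention mechanism, not the authors' implementation:

```python
import numpy as np

def cross_band_attention(cube):
    """Toy non-local spectral attention: every band attends over all
    *other* bands by cosine similarity, so a band corrupted by noise can
    borrow information from a distant, cleaner band.
    cube: hyperspectral image of shape (bands, H, W)."""
    b = cube.shape[0]
    flat = cube.reshape(b, -1).astype(float)
    flat = flat - flat.mean(axis=1, keepdims=True)
    flat = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-8)
    sim = flat @ flat.T                      # band-to-band cosine similarity
    np.fill_diagonal(sim, -np.inf)           # exclude the band itself
    w = np.exp(sim - sim.max(axis=1, keepdims=True))
    w = w / w.sum(axis=1, keepdims=True)     # softmax over the other bands
    # Each output band is an attention-weighted mix of the other bands.
    return (w @ cube.reshape(b, -1)).reshape(cube.shape)

rng = np.random.default_rng(1)
noisy_cube = rng.normal(size=(6, 16, 16))
reference = cross_band_attention(noisy_cube)
```

In the paper this reference would feed the denoising network; here it only demonstrates the band-selection mechanism.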
(This article belongs to the Special Issue Remote Sensing Cross-Modal Research: Algorithms and Practices)

22 pages, 8212 KiB  
Article
A Semantic Spatial Structure-Based Loop Detection Algorithm for Visual Environmental Sensing
by Xina Cheng, Yichi Zhang, Mengte Kang, Jialiang Wang, Jianbin Jiao, Le Dong and Licheng Jiao
Remote Sens. 2024, 16(10), 1720; https://doi.org/10.3390/rs16101720 - 13 May 2024
Viewed by 737
Abstract
Loop closure detection is an important component of Simultaneous Localization and Mapping (SLAM) algorithms, which are utilized in environmental sensing. It helps to reduce drift errors during long-term operation, improving the accuracy and robustness of localization. Such improvements are sorely needed, as conventional visual loop detection algorithms are greatly affected by significant changes in viewpoint and lighting conditions. In this paper, we present a semantic spatial structure-based loop detection algorithm. In place of feature points, robust semantic features are used to cope with variation in viewpoint. Because these semantic features are region-based, we provide a corresponding matching algorithm. Constraints on semantic information and spatial structure are used to determine whether a loop closure exists. A multi-stage pipeline framework is proposed to systematically leverage semantic information at different levels, enabling efficient filtering of potential loop closure candidates. To validate the effectiveness of our algorithm, we conducted experiments on the uHumans2 dataset. Our results demonstrate that, even under significant changes in viewpoint, the algorithm is more robust than traditional loop detection methods.
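To make the region-based matching idea concrete, here is a deliberately simplified stand-in for such a pipeline: landmarks are (label, centroid) pairs, label matches are proposed first, and the spatial-structure constraint is approximated by counting pairwise distances that agree between the two scenes. Names and tolerances are ours, not the paper's:

```python
import numpy as np

def count_consistent_pairs(scene_a, scene_b, dist_tol=0.5):
    """Match semantic landmarks by label, then score the candidate loop
    by how many inter-landmark distances agree between the two scenes
    (distances are invariant to viewpoint rotation and translation)."""
    matches = [(i, j) for i, (la, _) in enumerate(scene_a)
                      for j, (lb, _) in enumerate(scene_b) if la == lb]
    consistent = 0
    for k1, (i1, j1) in enumerate(matches):
        for i2, j2 in matches[k1 + 1:]:
            da = np.hypot(*np.subtract(scene_a[i1][1], scene_a[i2][1]))
            db = np.hypot(*np.subtract(scene_b[j1][1], scene_b[j2][1]))
            if abs(da - db) < dist_tol:
                consistent += 1
    return consistent

# Same room seen twice; the second visit is translated by (5, 5).
visit_1 = [("chair", (0.0, 0.0)), ("table", (2.0, 0.0)), ("lamp", (0.0, 3.0))]
visit_2 = [("chair", (5.0, 5.0)), ("table", (7.0, 5.0)), ("lamp", (5.0, 8.0))]
print(count_consistent_pairs(visit_1, visit_2))  # → 3 (all pairs agree)
```

A high count of consistent pairs suggests a loop closure; a scene with the same labels but different geometry scores near zero.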

29 pages, 7293 KiB  
Article
A Dual-FSM GI LiDAR Imaging Control Method Based on Two-Dimensional Flexible Turntable Composite Axis Tracking
by Yu Cao, Meilin Xie, Haitao Wang, Wei Hao, Min Guo, Kai Jiang, Lei Wang, Shan Guo and Fan Wang
Remote Sens. 2024, 16(10), 1679; https://doi.org/10.3390/rs16101679 - 9 May 2024
Viewed by 655
Abstract
In this study, a tracking and pointing control system with a dual-FSM (fast steering mirror) two-dimensional flexible turntable composite axis is proposed and applied to target-tracking accuracy control in a GI LiDAR (ghost imaging LiDAR) system. Ghost imaging is a multi-measurement imaging method, and the proposed system mainly addresses the problems of high-resolution remote sensing imaging of high-speed moving targets and of the various nonlinear disturbances that arise when this technology is moved into practical applications. To counter the detrimental effects of nonlinear disturbances, originating from internal flexible mechanisms and assorted external environmental factors, on the velocity, stability, and tracking accuracy of motion control, a nonlinear active disturbance rejection control (NLADRC) method based on artificial neural networks is advanced. Additionally, to overcome the limitations imposed by receiving aperture constraints in GI LiDAR systems, a novel optical path design for the dual-FSM GI LiDAR tracking and imaging system is put forth. The resulting system, upon thorough experimental validation, demonstrated significant improvements: coarse tracking accuracy improved from 193.29 μrad (3σ) to 87.21 μrad (3σ), tracking accuracy from 10.1 μrad (σ) to 1.5 μrad (σ) under the specified operational parameters, and overshoot during target capture fell from 28.85% to 12.8%, while the target contour remained clearly recognizable. This research contributes to the advancement of GI LiDAR technology toward practical application, showcasing the potential of the proposed control and design strategies in enhancing system performance in the face of complex disturbances.
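At the core of any active disturbance rejection scheme, including the NLADRC variant above, is an extended state observer (ESO) that estimates a lumped disturbance state so the controller can cancel it. The paper's neural-network design is far more elaborate; the sketch below shows only the classical linear ESO on an illustrative double-integrator plant (all gains and parameters are our own choices):

```python
import numpy as np

def eso_step(z, u, y, dt, b0, betas):
    """One Euler step of a third-order linear extended state observer.
    z = [position, velocity, lumped disturbance] estimates; the third
    state absorbs unmodelled nonlinear dynamics so a controller can
    cancel it (u = (u0 - z[2]) / b0 in a full ADRC loop)."""
    e = y - z[0]                      # innovation: measured minus estimated
    z_dot = np.array([
        z[1] + betas[0] * e,
        z[2] + b0 * u + betas[1] * e,
        betas[2] * e,
    ])
    return z + dt * z_dot

# Illustrative run: double integrator with a constant unknown disturbance.
w = 50.0                              # observer bandwidth (rad/s)
betas = (3 * w, 3 * w**2, w**3)       # standard bandwidth parameterization
dt, b0, d = 1e-3, 1.0, -2.0
x = np.zeros(2)                       # true plant state [pos, vel]
z = np.zeros(3)                       # observer state
for _ in range(2000):                 # 2 s of simulated time
    y = x[0]
    z = eso_step(z, 0.0, y, dt, b0, betas)
    x = x + dt * np.array([x[1], d])  # plant: acc = b0*u + d, with u = 0
# z[2] now approximates the unknown disturbance d.
```

With the bandwidth parameterization all observer poles sit at -w, so the disturbance estimate converges on a time scale of a few 1/w seconds.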

18 pages, 5494 KiB  
Article
Hierarchical Semantic-Guided Contextual Structure-Aware Network for Spectral Satellite Image Dehazing
by Lei Yang, Jianzhong Cao, Hua Wang, Sen Dong and Hailong Ning
Remote Sens. 2024, 16(9), 1525; https://doi.org/10.3390/rs16091525 - 25 Apr 2024
Viewed by 500
Abstract
Haze or cloud often shrouds satellite images, obscuring valuable geographic information for military surveillance, natural calamity surveillance, and mineral resource exploration. Satellite image dehazing (SID) opens the possibility of better applications of satellite images. Most existing dehazing methods are tailored for natural images and are not very effective for satellite images with non-homogeneous haze, since semantic structure information and inconsistent attenuation are not fully considered. To tackle this problem, this study proposes a hierarchical semantic-guided contextual structure-aware network (SCSNet) for spectral satellite image dehazing. Specifically, a hybrid CNN–Transformer architecture integrated with a hierarchical semantic guidance (HSG) module is presented to learn semantic structure information by synergetically complementing local representation with non-local features. Furthermore, a cross-layer fusion (CLF) module is specially designed to replace the traditional skip connection during the feature decoding stage, so as to reinforce attention to the spatial regions and feature channels with more severe attenuation. Results on the SateHaze1k, RS-Haze, and RSID datasets demonstrate that the proposed SCSNet achieves effective dehazing and outperforms existing state-of-the-art methods.
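The idea of replacing a plain skip connection with an attenuation-aware one can be illustrated with a minimal channel-attention sketch. This is a generic stand-in for the role the CLF module plays, not the module's actual design:

```python
import numpy as np

def attentive_skip(encoder_feat, decoder_feat):
    """Instead of adding encoder features to the decoder unchanged,
    weight each channel by a softmax over its global average energy,
    so channels carrying stronger cues get emphasized in the merge.
    Both inputs have shape (channels, H, W)."""
    energy = encoder_feat.mean(axis=(1, 2))        # global average pooling
    w = np.exp(energy - energy.max())
    w = w / w.sum()                                # per-channel attention
    return decoder_feat + w[:, None, None] * encoder_feat

rng = np.random.default_rng(2)
enc = rng.normal(size=(4, 8, 8))   # encoder features at some scale
dec = rng.normal(size=(4, 8, 8))   # decoder features at the same scale
fused = attentive_skip(enc, dec)
```

A learned module would additionally attend over spatial regions; this sketch keeps only the channel reweighting to show the departure from an identity skip.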

24 pages, 12612 KiB  
Article
Multi-Dimensional Fusion of Spectral and Polarimetric Images Followed by Pseudo-Color Algorithm Integration and Mapping in HSI Space
by Fengqi Guo, Jingping Zhu, Liqing Huang, Feng Li, Ning Zhang, Jinxin Deng, Haoxiang Li, Xiangzhe Zhang, Yuanchen Zhao, Huilin Jiang and Xun Hou
Remote Sens. 2024, 16(7), 1119; https://doi.org/10.3390/rs16071119 - 22 Mar 2024
Viewed by 806
Abstract
Spectral–polarization imaging technology plays a crucial role in remote sensing detection, enhancing target identification and tracking capabilities by capturing both spectral and polarization information reflected from object surfaces. However, the acquisition of multi-dimensional data often leads to extensive datasets that necessitate comprehensive analysis, thereby impeding the convenience and efficiency of remote sensing detection. To address this challenge, we propose a fusion algorithm based on spectral–polarization characteristics, incorporating principal component analysis (PCA) and energy weighting. This algorithm effectively consolidates multi-dimensional features within the scene into a single image, enhancing object details and enriching edge features. The robustness and universality of the proposed algorithm are demonstrated on experimentally obtained datasets and verified on publicly available datasets. Additionally, to meet the requirements of remote sensing tracking, we meticulously designed a pseudo-color mapping scheme consistent with human vision. This scheme maps polarization degree to color saturation, polarization angle to hue, and the fused image to intensity, resulting in a visual display aligned with human visual perception. We also discuss the application of this technique in processing data generated by the channel-modulated static birefringent Fourier transform imaging spectropolarimeter (CSBFTIS). Experimental results demonstrate a significant enhancement in the information entropy and average gradient of the fused image compared to the optimal image before fusion, achieving maximum increases of 88% and 94%, respectively. This provides a solid foundation for target recognition and tracking in airborne remote sensing detection.
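The pseudo-color scheme in the abstract has a direct NumPy/colorsys rendering: polarization degree drives saturation, polarization angle drives hue, and the fused image drives value/intensity. A minimal sketch (the value ranges are our assumptions, not taken from the paper):

```python
import colorsys
import numpy as np

def pseudo_color(dolp, aop, intensity):
    """Map degree of linear polarization (dolp, in [0, 1]) to saturation,
    angle of polarization (aop, in [0, pi)) to hue, and the fused image
    (intensity, in [0, 1]) to value, per the HSV/HSI convention."""
    h, w = dolp.shape
    rgb = np.empty((h, w, 3))
    for i in range(h):
        for j in range(w):
            rgb[i, j] = colorsys.hsv_to_rgb(aop[i, j] / np.pi,  # hue
                                            dolp[i, j],         # saturation
                                            intensity[i, j])    # value
    return rgb

# Unpolarized pixels (dolp = 0) collapse to grayscale, as expected.
gray = pseudo_color(np.zeros((2, 2)), np.zeros((2, 2)), np.full((2, 2), 0.5))
```

This matches the intuition behind the scheme: where no polarization signal exists, the display degrades gracefully to the fused intensity image.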
