Advanced Artificial Intelligence and Deep Learning for Remote Sensing II

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Remote Sensing Image Processing".

Deadline for manuscript submissions: closed (15 January 2025) | Viewed by 22087

Special Issue Editors


Guest Editor
College of Microelectronics and Communication Engineering, Chongqing University, Chongqing 401331, China
Interests: radar signal detection; target detection and recognition; radar system

Special Issue Information

Dear Colleagues,

Remote sensing is a fundamental tool for observing the world from afar. Advances in artificial intelligence (AI) and deep learning (DL) have opened new research opportunities in fields such as remote sensing, which supports Earth observation, disaster warning, and environmental monitoring. In recent years, the continuous development of remote sensing technologies, particularly the emergence of new detection sensors and detection systems, together with the steady accumulation of historical data and samples, has made it possible to train AI and DL models on big data, and the field has become a research hotspot.

This Special Issue aims to report the latest advances and trends in advanced AI and DL techniques applied to remote sensing data processing. Papers of both a theoretical and an applied nature, as well as contributions presenting new AI and DL techniques to the remote sensing research community, are welcome. We invite experts and scholars in the field to contribute their latest research on AI and DL in Earth observation, disaster warning, multi-temporal surface change detection, environmental remote sensing, optical remote sensing, and detection and imaging with different sensors, so as to further promote technological progress in this field.

Topics include, but are not limited to, the following:

  • Object detection in high-resolution remote sensing imagery.
  • SAR object detection and scene classification.
  • Target-oriented multi-temporal change detection.
  • Infrared target detection and recognition.
  • LiDAR point cloud data processing and scene reconstruction.
  • UAV remote sensing and scene perception.
  • Big data mining in remote sensing.
  • Interpretable deep learning in remote sensing.

Prof. Dr. Zhenming Peng
Prof. Dr. Zhengzhou Li
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • object detection
  • artificial intelligence
  • deep learning
  • scene reconstruction
  • scene perception
  • data mining
  • change detection
  • object recognition

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (14 papers)


Research

21 pages, 28534 KiB  
Article
RACR-ShipDet: A Ship Orientation Detection Method Based on Rotation-Adaptive ConvNeXt and Enhanced RepBiFPAN
by Jiandan Zhong, Lingfeng Liu, Fei Song, Yingxiang Li and Yajuan Xue
Remote Sens. 2025, 17(4), 643; https://doi.org/10.3390/rs17040643 - 13 Feb 2025
Viewed by 568
Abstract
Ship orientation detection is essential for maritime navigation, traffic monitoring, and defense, yet existing methods face challenges with rotational invariance in large-angle scenarios, difficulties in multi-scale feature fusion, and the limitations of traditional IoU when detecting oriented objects and predicting objects’ orientation. In this article, we propose a novel ship orientation detection (RACR-ShipDet) network based on rotation-adaptive ConvNeXt and Enhanced RepBiFPAN in remote sensing images. To equip the model with rotational invariance, ConvNeXt is first improved so that it can dynamically adjust the rotation angle and convolution kernel through adaptive rotation convolution, namely, ARRConv, forming a new architecture called RotConvNeXt. Subsequently, the RepBiFPAN, enhanced with the Weighted Feature Aggregation module, is employed to prioritize informative features by dynamically assigning adaptive weights, effectively reducing the influence of redundant or irrelevant features and improving feature representation. Moreover, a more stable version of KFIoU is proposed, named SCKFIoU, which improves the accuracy and stability of overlap calculation by introducing a small perturbation term and utilizing Cholesky decomposition for efficient matrix inversion and determinant calculation. Evaluations using the DOTA-ORShip dataset demonstrate that RACR-ShipDet outperforms current state-of-the-art models, achieving an mAP of 95.3%, representing an improvement of 5.3% over PSC (90.0%) and of 1.9% over HDDet (93.4%). Furthermore, it demonstrates a superior orientation accuracy of 96.9%, exceeding HDDet by a margin of 5.0%, establishing itself as a robust solution for ship orientation detection in complex environments. Full article
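
To make the SCKFIoU computation described above concrete, here is a minimal sketch of a KFIoU-style overlap between two rotated boxes modelled as 2-D Gaussians, stabilised exactly as the abstract suggests: a small perturbation term (eps * I) keeps the covariances positive-definite, and a Cholesky factorization handles the inversion. The box-to-Gaussian variance choice and all names are illustrative assumptions, not the authors' implementation.

```python
import torch

def box_to_gaussian(cx, cy, w, h, angle):
    # Rotated box -> 2-D Gaussian: mean at the center, covariance from size/angle.
    c, s = torch.cos(angle), torch.sin(angle)
    R = torch.stack([torch.stack([c, -s]), torch.stack([s, c])])
    D = torch.diag(torch.stack([w, h]) ** 2 / 12.0)  # uniform-box variance (an assumption)
    return torch.stack([cx, cy]), R @ D @ R.T

def sckfiou_like(b1, b2, eps=1e-6):
    # KFIoU-style overlap; the center-distance term of the full loss is omitted here.
    _, S1 = box_to_gaussian(*b1)
    _, S2 = box_to_gaussian(*b2)
    I2 = torch.eye(2)
    S1, S2 = S1 + eps * I2, S2 + eps * I2            # perturbation keeps matrices SPD
    L = torch.linalg.cholesky(S1 + S2)               # S1 + S2 = L L^T
    Sf = S1 - S1 @ torch.cholesky_inverse(L) @ S1    # fused (Kalman) covariance
    vol = lambda S: torch.sqrt(torch.det(S))         # ellipse area up to a constant
    vf = vol(Sf)
    return vf / (vol(S1) + vol(S2) - vf)

b = lambda *v: tuple(torch.tensor(x) for x in v)
print(sckfiou_like(b(0., 0., 4., 2., 0.3), b(0.5, 0., 4., 2., 0.5)))
```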

29 pages, 18935 KiB  
Article
OSNet: An Edge Enhancement Network for a Joint Application of SAR and Optical Images
by Keyu Ma, Kai Hu, Junyu Chen, Ming Jiang, Yao Xu, Min Xia and Liguo Weng
Remote Sens. 2025, 17(3), 505; https://doi.org/10.3390/rs17030505 - 31 Jan 2025
Viewed by 759
Abstract
The combined use of synthetic aperture radar (SAR) and optical images for surface observation is gaining increasing attention. Optical images, with their distinct edge features, can accurately classify different objects, while SAR images reveal deeper internal variations. To address the challenge of differing feature distributions in multi-source images, we propose an edge enhancement network, OSNet (network for optical and SAR images), designed to jointly extract features from optical and SAR images and enhance edge feature representation. OSNet consists of three core modules: a dual-branch backbone, a synergistic attention integration module, and a global-guided local fusion module. These modules, respectively, handle modality-independent feature extraction, feature sharing, and global-local feature fusion. In the backbone module, we introduce a differentiable Lee filter and a Laplacian edge detection operator in the SAR branch to suppress noise and enhance edge features. Additionally, we designed a multi-source attention fusion module to facilitate cross-modal information exchange between the two branches. We validated OSNet’s performance on segmentation tasks (WHU-OPT-SAR) and regression tasks (SNOW-OPT-SAR). The results show that OSNet improved PA and MIoU by 2.31% and 2.58%, respectively, in the segmentation task, and reduced MAE and RMSE by 3.14% and 4.22%, respectively, in the regression task. Full article
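
The two SAR-branch operators named in the abstract, a differentiable Lee filter and a Laplacian edge detector, can be sketched in a few lines of PyTorch: local statistics come from average pooling, so gradients flow end to end. The window size, noise variance, and mixing weight below are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def lee_filter(x, win=5, noise_var=0.25):
    # Differentiable Lee speckle filter from local means/variances.
    pad = win // 2
    mean = F.avg_pool2d(x, win, stride=1, padding=pad)
    mean_sq = F.avg_pool2d(x * x, win, stride=1, padding=pad)
    var = (mean_sq - mean * mean).clamp(min=0.0)
    gain = var / (var + noise_var)        # near 1 on edges, near 0 on flat speckle
    return mean + gain * (x - mean)

LAPLACIAN = torch.tensor([[0., 1., 0.],
                          [1., -4., 1.],
                          [0., 1., 0.]]).view(1, 1, 3, 3)

def laplacian_edges(x):
    # Laplacian edge operator applied per channel (depthwise convolution).
    k = LAPLACIAN.to(x.dtype).repeat(x.shape[1], 1, 1, 1)
    return F.conv2d(x, k, padding=1, groups=x.shape[1])

sar = torch.rand(1, 1, 64, 64)
enhanced = lee_filter(sar) + 0.1 * laplacian_edges(sar)   # noise down, edges up
```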

28 pages, 13922 KiB  
Article
Multi-Class Guided GAN for Remote-Sensing Image Synthesis Based on Semantic Labels
by Zhenye Niu, Yuxia Li, Yushu Gong, Bowei Zhang, Yuan He, Jinglin Zhang, Mengyu Tian and Lei He
Remote Sens. 2025, 17(2), 344; https://doi.org/10.3390/rs17020344 - 20 Jan 2025
Viewed by 1057
Abstract
In the scenario of limited labeled remote-sensing datasets, the model’s performance is constrained by the insufficient availability of data. Generative model-based data augmentation has emerged as a promising solution to this limitation. While existing generative models perform well in natural scene domains (e.g., faces and street scenes), their performance in remote sensing is hindered by severe data imbalance and the semantic similarity among land-cover classes. To tackle these challenges, we propose the Multi-Class Guided GAN (MCGGAN), a novel network for generating remote-sensing images from semantic labels. Our model features a dual-branch architecture with a global generator that captures the overall image structure and a multi-class generator that improves the quality and differentiation of land-cover types. To integrate these generators, we design a shared-parameter encoder for consistent feature encoding across two branches, and a spatial decoder that synthesizes outputs from the class generators, preventing overlap and confusion. Additionally, we employ perceptual loss (LVGG) to assess perceptual similarity between generated and real images, and texture matching loss (LT) to capture fine texture details. To evaluate the quality of image generation, we tested multiple models on two custom datasets (one from Chongzhou, Sichuan Province, and another from Wuzhen, Zhejiang Province, China) and a public dataset LoveDA. The results show that MCGGAN achieves improvements of 52.86 in FID, 0.0821 in SSIM, and 0.0297 in LPIPS compared to the Pix2Pix baseline. We also conducted comparative experiments to assess the semantic segmentation accuracy of the U-Net before and after incorporating the generated images. The results show that data augmentation with the generated images leads to an improvement of 4.47% in FWIoU and 3.23% in OA across the Chongzhou and Wuzhen datasets. Experiments show that MCGGAN can be effectively used as a data augmentation approach to improve the performance of downstream remote-sensing image segmentation tasks. Full article
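
A minimal sketch of the two auxiliary losses named in the abstract, a VGG perceptual loss (L_VGG) and a Gram-matrix texture matching loss (L_T). The chosen VGG layers and the L1 distance are assumptions, not the paper's configuration, and inputs are expected to be ImageNet-normalized.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg16

class PerceptualTextureLoss(nn.Module):
    def __init__(self, layers=(3, 8, 15)):
        super().__init__()
        # Downloads ImageNet weights on first use; kept frozen as a fixed feature extractor.
        self.vgg = vgg16(weights="IMAGENET1K_V1").features[:max(layers) + 1].eval()
        for p in self.vgg.parameters():
            p.requires_grad_(False)
        self.layers = set(layers)

    @staticmethod
    def gram(f):
        # Channel-by-channel correlation of features: a standard texture statistic.
        b, c, h, w = f.shape
        f = f.flatten(2)
        return f @ f.transpose(1, 2) / (c * h * w)

    def forward(self, fake, real):
        l_vgg = l_tex = 0.0
        x, y = fake, real
        for i, layer in enumerate(self.vgg):
            x, y = layer(x), layer(y)
            if i in self.layers:
                l_vgg = l_vgg + nn.functional.l1_loss(x, y)          # perceptual term
                l_tex = l_tex + nn.functional.l1_loss(self.gram(x), self.gram(y))
        return l_vgg, l_tex
```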

20 pages, 11254 KiB  
Article
SCM-YOLO for Lightweight Small Object Detection in Remote Sensing Images
by Hao Qiang, Wei Hao, Meilin Xie, Qiang Tang, Heng Shi, Yixin Zhao and Xiaoteng Han
Remote Sens. 2025, 17(2), 249; https://doi.org/10.3390/rs17020249 - 12 Jan 2025
Cited by 2 | Viewed by 1652
Abstract
Currently, small object detection in complex remote sensing environments faces significant challenges. The detectors designed for this scenario have limitations, such as insufficient extraction of spatial local information, inflexible feature fusion, and limited global feature acquisition capability. In addition, there is a need to balance performance and complexity when improving the model. To address these issues, this paper proposes an efficient and lightweight SCM-YOLO detector improved from YOLOv5 with spatial local information enhancement, multi-scale feature adaptive fusion, and global sensing capabilities. The SCM-YOLO detector consists of three innovative and lightweight modules: the Space Interleaving in Depth (SPID) module, the Cross Block and Channel Reweight Concat (CBCC) module, and the Mixed Local Channel Attention Global Integration (MAGI) module. These three modules effectively improve the performance of the detector from three aspects: feature extraction, feature fusion, and feature perception. The ability of SCM-YOLO to detect small objects in complex remote sensing environments has been significantly improved while maintaining its lightweight characteristics. The effectiveness and lightweight characteristics of SCM-YOLO are verified through comparison experiments with AI-TOD and SIMD public remote sensing small object detection datasets. In addition, we validate the effectiveness of the three modules, SPID, CBCC, and MAGI, through ablation experiments. The comparison experiments on the AI-TOD dataset show that the mAP50 and mAP50-95 metrics of SCM-YOLO reach 64.053% and 27.283%, respectively, which are significantly better than other models with the same parameter size. Full article

20 pages, 4570 KiB  
Article
Transferable Targeted Adversarial Attack on Synthetic Aperture Radar (SAR) Image Recognition
by Sheng Zheng, Dongshen Han, Chang Lu, Chaowen Hou, Yanwen Han, Xinhong Hao and Chaoning Zhang
Remote Sens. 2025, 17(1), 146; https://doi.org/10.3390/rs17010146 - 3 Jan 2025
Viewed by 709
Abstract
Deep learning models have been widely applied to synthetic aperture radar (SAR) target recognition, offering end-to-end feature extraction that significantly enhances recognition performance. However, recent studies show that optical image recognition models are widely vulnerable to adversarial examples, which fool the models by adding imperceptible perturbation to the input. Although the targeted adversarial attack (TAA) has been realized in the white box setup with full access to the SAR model’s knowledge, it is less practical in real-world scenarios where white box access to the target model is not allowed. To the best of our knowledge, our work is the first to explore transferable TAA on SAR models. Since contrastive learning (CL) is commonly applied to enhance a model’s generalization, we utilize it to improve the generalization of adversarial examples generated on a source model to unseen target models in the black box scenario. Thus, we propose the contrastive learning-based targeted adversarial attack, termed CL-TAA. Extensive experiments demonstrated that our proposed CL-TAA can significantly improve the transferability of adversarial examples to fool the SAR models in the black box scenario. Full article
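
A rough sketch of what a contrastive targeted attack can look like: a bounded perturbation is optimized so the surrogate model's feature moves toward a target-class anchor and away from the clean feature (an InfoNCE-style objective with one negative). The feature-extractor interface and all hyper-parameters are assumptions for illustration, not the paper's CL-TAA.

```python
import torch
import torch.nn.functional as F

def cl_taa_sketch(model, x, target_feat, steps=50, eps=8/255, alpha=1/255, tau=0.1):
    # model: a frozen surrogate feature extractor; target_feat: a target-class feature.
    for p in model.parameters():
        p.requires_grad_(False)
    with torch.no_grad():
        neg = F.normalize(model(x), dim=1)            # clean feature: push away
        pos = F.normalize(target_feat, dim=1)         # target feature: pull toward
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        f = F.normalize(model(x + delta), dim=1)
        l_pos = (f * pos).sum(1) / tau
        l_neg = (f * neg).sum(1) / tau
        loss = -(l_pos - torch.logaddexp(l_pos, l_neg)).mean()  # InfoNCE, one negative
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()        # descend: targeted direction
            delta.clamp_(-eps, eps)                   # keep the perturbation small
            delta.grad.zero_()
    return (x + delta).detach()
```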

19 pages, 7749 KiB  
Article
Generative Simplex Mapping: Non-Linear Endmember Extraction and Spectral Unmixing for Hyperspectral Imagery
by John Waczak and David J. Lary
Remote Sens. 2024, 16(22), 4316; https://doi.org/10.3390/rs16224316 - 19 Nov 2024
Viewed by 1091
Abstract
We introduce a new model for non-linear endmember extraction and spectral unmixing of hyperspectral imagery called Generative Simplex Mapping (GSM). The model represents endmember mixing using a latent space of points sampled within an (n−1)-simplex corresponding to n unique sources. Barycentric coordinates within this simplex are naturally interpreted as relative endmember abundances satisfying both the abundance sum-to-one and abundance non-negativity constraints. Points in this latent space are mapped to reflectance spectra via a flexible function combining linear and non-linear mixing. Due to the probabilistic formulation of the GSM, spectral variability is also estimated by a precision parameter describing the distribution of observed spectra. Model parameters are determined using a generalized expectation-maximization algorithm, which guarantees non-negativity for extracted endmembers. We first compare the GSM against three varieties of non-negative matrix factorization (NMF) on a synthetic data set of linearly mixed spectra from the USGS spectral database. Here, the GSM performed favorably for both endmember accuracy and abundance estimation, with all non-linear contributions driven to zero by the fitting procedure. In a second experiment, we apply the GSM to model non-linear mixing in real hyperspectral imagery captured over a pond in North Texas. The model accurately identified spectral signatures corresponding to near-shore algae, water, and rhodamine tracer dye introduced into the pond to simulate water contamination by a localized source. Abundance maps generated using the GSM accurately track the evolution of the dye plume as it mixes into the surrounding water. Full article
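
The simplex parameterization at the heart of the GSM can be illustrated compactly: a softmax maps unconstrained latents to barycentric coordinates, which satisfy the sum-to-one and non-negativity constraints by construction, and a toy forward model adds a small non-linear term. The bilinear interaction and all constants below are assumptions for illustration, not the paper's mixing function.

```python
import numpy as np

rng = np.random.default_rng(0)

def abundances(z):
    # Softmax -> barycentric coordinates on the (n-1)-simplex:
    # non-negative and summing to one by construction.
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

n, bands = 3, 50
E = rng.random((n, bands))     # endmember spectra (one per row)
gamma = 0.05                   # strength of the toy non-linear term (an assumption)

def generate(z):
    a = abundances(z)          # relative endmember abundances
    linear = a @ E             # classic linear mixing
    # Bilinear interactions stand in for the model's non-linear mixing component.
    cross = sum(a[..., i:i+1] * a[..., j:j+1] * (E[i] * E[j])
                for i in range(n) for j in range(i + 1, n))
    return linear + gamma * cross

print(generate(rng.normal(size=n)).shape)   # (50,)
```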

13 pages, 10253 KiB  
Article
Combining KAN with CNN: KonvNeXt’s Performance in Remote Sensing and Patent Insights
by Minjong Cheon and Changbae Mun
Remote Sens. 2024, 16(18), 3417; https://doi.org/10.3390/rs16183417 - 14 Sep 2024
Cited by 2 | Viewed by 2502
Abstract
Rapid advancements in satellite technology have led to a significant increase in high-resolution remote sensing (RS) images, necessitating the use of advanced processing methods. Additionally, patent analysis revealed a substantial increase in deep learning and machine learning applications in remote sensing, highlighting the growing importance of these technologies. Therefore, this paper introduces the Kolmogorov-Arnold Network (KAN) model to remote sensing to enhance efficiency and performance in RS applications. We conducted several experiments to validate KAN’s applicability, starting with the EuroSAT dataset, where we combined the KAN layer with multiple pre-trained CNN models. Optimal performance was achieved using ConvNeXt, leading to the development of the KonvNeXt model. KonvNeXt was evaluated on the Optimal-31, AID, and Merced datasets for validation and achieved accuracies of 90.59%, 94.1%, and 98.1%, respectively. The model also showed fast processing speed, with the Optimal-31 and Merced datasets completed in 107.63 s each, while the larger and more complex AID dataset took 545.91 s. This result is meaningful because KonvNeXt achieved faster speeds and comparable accuracy relative to an existing study that utilized ViT, demonstrating its applicability to remote sensing classification tasks. Furthermore, we investigated the model’s interpretability by utilizing Occlusion Sensitivity, and by displaying the influential regions, we validated its potential use in a variety of domains, including medical imaging and weather forecasting. This paper is meaningful in that it is the first to use KAN in remote sensing classification, proving its adaptability and efficiency. Full article
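
A much-simplified sketch of the "KAN layer on top of a pre-trained CNN" recipe: ConvNeXt serves as a feature extractor, and a KAN-style head learns one univariate function per input feature (here an RBF expansion, a simplification of spline-based KANs). The head design, dimensions, and class count are assumptions, not the KonvNeXt architecture.

```python
import torch
import torch.nn as nn
from torchvision.models import convnext_tiny

class SimpleKANLayer(nn.Module):
    # Each input feature passes through its own learnable 1-D function
    # (an RBF expansion) before a linear mix: an illustrative KAN stand-in.
    def __init__(self, d_in, d_out, grid=8):
        super().__init__()
        self.centers = nn.Parameter(torch.linspace(-2, 2, grid).repeat(d_in, 1))
        self.coef = nn.Parameter(torch.randn(d_out, d_in, grid) * 0.1)
        self.width = 4.0 / grid

    def forward(self, x):                              # x: (B, d_in)
        phi = torch.exp(-((x[:, :, None] - self.centers) / self.width) ** 2)
        return torch.einsum("big,oig->bo", phi, self.coef)

backbone = convnext_tiny(weights=None)                 # load pre-trained weights in practice
backbone.classifier = nn.Flatten(1)                    # expose pooled 768-d features
head = SimpleKANLayer(768, 10)                         # e.g. the 10 EuroSAT classes
logits = head(backbone(torch.randn(2, 3, 224, 224)))
print(logits.shape)                                    # torch.Size([2, 10])
```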

18 pages, 18089 KiB  
Communication
High-Resolution PM10 Estimation Using Satellite Data and Model-Agnostic Meta-Learning
by Yue Yang, Jan Cermak, Xu Chen, Yunping Chen and Xi Hou
Remote Sens. 2024, 16(13), 2498; https://doi.org/10.3390/rs16132498 - 8 Jul 2024
Cited by 1 | Viewed by 1646
Abstract
Characterizing the spatial distribution of particles smaller than 10 μm (PM10) is of great importance for air quality management yet is very challenging because of the sparseness of air quality monitoring stations. In this study, we use a model-agnostic meta-learning-trained artificial neural network (MAML-ANN) to estimate the concentrations of PM10 at 60 m × 60 m spatial resolution by combining satellite-derived aerosol optical depth (AOD) with meteorological data. The network is designed to regress from the predictors at a specific time to the ground-level PM10 concentration. We utilize the ANN model to capture the time-specific nonlinearity among aerosols, meteorological conditions, and PM10, and apply MAML to enable the model to learn the nonlinearity across time from only a small number of data samples. MAML is also employed to transfer the knowledge learned from coarse spatial resolution to high spatial resolution. The MAML-ANN model is shown to accurately estimate high-resolution PM10 in Beijing, with a coefficient of determination of 0.75. MAML improves the PM10 estimation performance of the ANN model compared with the baseline using pre-trained initial weights. Thus, MAML-ANN has the potential to perform particulate matter estimation at high spatial resolution over other data-sparse, heavily polluted, and small regions. Full article
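
The core MAML mechanics described above, an inner gradient step per task followed by an outer update on query data, look roughly as follows. The network size, the synthetic task generator standing in for per-time AOD/meteorology samples, and the learning rates are all assumptions.

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 1))
meta_opt = torch.optim.Adam(net.parameters(), lr=1e-3)
inner_lr, loss_fn = 1e-2, nn.MSELoss()

def make_task(batch=32):
    # Synthetic stand-in for one "task" (e.g. one time step's predictors -> PM10):
    # a support set for adaptation and a query set for the meta-loss.
    w = torch.randn(8, 1)
    xs, xq = torch.randn(batch, 8), torch.randn(batch, 8)
    return xs, xs @ w, xq, xq @ w

for _ in range(100):
    xs, ys, xq, yq = make_task()
    params = dict(net.named_parameters())
    # Inner loop: one differentiable gradient step adapts the weights to the task.
    grads = torch.autograd.grad(
        loss_fn(torch.func.functional_call(net, params, xs), ys),
        params.values(), create_graph=True)
    adapted = {k: p - inner_lr * g for (k, p), g in zip(params.items(), grads)}
    # Outer loop: the query loss of the adapted weights updates the initialization.
    meta_loss = loss_fn(torch.func.functional_call(net, adapted, xq), yq)
    meta_opt.zero_grad()
    meta_loss.backward()
    meta_opt.step()
```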

20 pages, 22183 KiB  
Article
FEFN: Feature Enhancement Feedforward Network for Lightweight Object Detection in Remote Sensing Images
by Jing Wu, Rixiang Ni, Zhenhua Chen, Feng Huang and Liqiong Chen
Remote Sens. 2024, 16(13), 2398; https://doi.org/10.3390/rs16132398 - 29 Jun 2024
Cited by 1 | Viewed by 1718
Abstract
Object detection in remote sensing images has become a crucial component of computer vision. It has been employed in multiple domains, including military surveillance, maritime rescue, and military operations. However, the high density of small objects in remote sensing images makes it challenging for existing networks to accurately distinguish objects from shallow image features. These factors lead many object detection networks to produce missed detections and false alarms, particularly for densely arranged objects and small objects. To address the above problems, this paper proposes a feature enhancement feedforward network (FEFN), based on a lightweight channel feedforward module (LCFM) and a feature enhancement module (FEM). First, the FEFN captures shallow spatial information in images through a lightweight channel feedforward module that can extract the edge information of small objects such as ships. Next, it enhances the feature interaction and representation by utilizing a feature enhancement module that can achieve more accurate detection results for densely arranged objects and small objects. Finally, comparative experiments on two challenging public remote sensing datasets demonstrate the effectiveness of the proposed method. Full article

21 pages, 29397 KiB  
Article
TFCD-Net: Target and False Alarm Collaborative Detection Network for Infrared Imagery
by Siying Cao, Zhi Li, Jiakun Deng, Yi’an Huang and Zhenming Peng
Remote Sens. 2024, 16(10), 1758; https://doi.org/10.3390/rs16101758 - 15 May 2024
Viewed by 1249
Abstract
Infrared small target detection (ISTD) plays a crucial role in both civilian and military applications. Detecting small targets against dense cluttered backgrounds remains a challenging task, requiring the collaboration of false alarm source elimination and target detection. Existing approaches mainly focus on modeling targets while often overlooking false alarm sources. To address this limitation, we propose a Target and False Alarm Collaborative Detection Network to leverage the information provided by false alarm sources and the background. Firstly, we introduce a False Alarm Source Estimation Block (FEB) that estimates potential interferences present in the background by extracting features at multiple scales and using gradual upsampling for feature fusion. Subsequently, we propose a framework that employs multiple FEBs to eliminate false alarm sources across different scales. Finally, a Target Segmentation Block (TSB) is introduced to accurately segment the targets and produce the final detection result. Experiments conducted on public datasets show that our model achieves the highest and second-highest scores for the IoU, Pd, and AUC and the lowest Fa among the DNN methods. These results demonstrate that our model accurately segments targets while effectively extracting false alarm sources, which can be used for further studies. Full article
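
One plausible reading of the False Alarm Source Estimation Block described above: extract features at several scales, upsample gradually while fusing, regress a false-alarm/background map, and subtract it from the input. The exact layout below is an assumption for illustration, not the paper's FEB.

```python
import torch
import torch.nn as nn

class FEB(nn.Module):
    def __init__(self, ch=16):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(1, ch, 3, 1, 1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(ch, ch, 3, 2, 1), nn.ReLU())
        self.enc3 = nn.Sequential(nn.Conv2d(ch, ch, 3, 2, 1), nn.ReLU())
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.fuse2 = nn.Conv2d(2 * ch, ch, 3, 1, 1)
        self.fuse1 = nn.Conv2d(2 * ch, 1, 3, 1, 1)

    def forward(self, x):
        f1 = self.enc1(x)
        f2 = self.enc2(f1)
        f3 = self.enc3(f2)
        g2 = self.fuse2(torch.cat([self.up(f3), f2], 1))   # gradual upsampling + fusion
        fa = self.fuse1(torch.cat([self.up(g2), f1], 1))   # estimated false-alarm map
        return x - fa                                      # suppress background interference

x = torch.randn(1, 1, 64, 64)          # a single-channel infrared patch
print(FEB()(x).shape)                  # torch.Size([1, 1, 64, 64])
# In the paper's framework, several such blocks at different scales precede
# the Target Segmentation Block that produces the final detection.
```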

22 pages, 3618 KiB  
Article
An Integrated Detection and Multi-Object Tracking Pipeline for Satellite Video Analysis of Maritime and Aerial Objects
by Zhijuan Su, Gang Wan, Wenhua Zhang, Ningbo Guo, Yitian Wu, Jia Liu, Dianwei Cong, Yutong Jia and Zhanji Wei
Remote Sens. 2024, 16(4), 724; https://doi.org/10.3390/rs16040724 - 19 Feb 2024
Cited by 4 | Viewed by 2086
Abstract
Optical remote sensing videos, as a new source of remote sensing data that has emerged in recent years, have significant potential in remote sensing applications, especially national defense. In this paper, a tracking pipeline named TDNet (tracking while detecting based on a neural network) is proposed for optical remote sensing videos based on a correlation filter and deep neural networks. The pipeline is used to simultaneously track ships and planes in videos. There are many target tracking methods for general video data, but they encounter difficulties in remote sensing videos, which have low resolution and are affected by weather conditions, so the tracked targets are often faint. Therefore, in TDNet, we propose a new multi-target tracking method called MT-KCF and a detecting-assisted tracking (i.e., DAT) module to improve tracking accuracy and precision. Meanwhile, we also design a new target recognition (i.e., NTR) module to recognize newly emerged targets. To verify the performance of TDNet, we compare our method with several state-of-the-art tracking methods on optical remote sensing video datasets acquired from the Jilin No. 1 satellite. The experimental results demonstrate the effectiveness and the state-of-the-art performance of the proposed method. The proposed method achieves more than 90% precision for single-target tracking tasks and more than 85% MOTA for multi-object tracking tasks. Full article
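
The detecting-assisted tracking idea can be sketched independently of the correlation filter itself: detections periodically re-anchor drifting tracks, and unmatched detections spawn new tracks (the role the NTR module plays for newly emerged targets). The thresholds and the track record layout are assumptions, not the paper's DAT module.

```python
def iou(a, b):
    # IoU of two axis-aligned boxes (x1, y1, x2, y2).
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def dat_step(tracks, detections, correct_thr=0.3):
    # Snap drifting tracks back onto overlapping detections and spawn
    # new tracks for unmatched detections.
    used = set()
    for t in tracks:
        best = max(range(len(detections)),
                   key=lambda i: iou(t["box"], detections[i]), default=None)
        if best is not None and best not in used \
                and iou(t["box"], detections[best]) > correct_thr:
            t["box"] = detections[best]        # re-anchor the correlation filter
            used.add(best)
    for i, d in enumerate(detections):
        if i not in used:
            tracks.append({"box": d})          # newly emerged target
    return tracks

tracks = dat_step([{"box": (10, 10, 30, 30)}], [(12, 11, 32, 31), (80, 80, 95, 95)])
print(len(tracks))   # 2: one corrected track, one new target
```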

24 pages, 13566 KiB  
Article
EL-NAS: Efficient Lightweight Attention Cross-Domain Architecture Search for Hyperspectral Image Classification
by Jianing Wang, Jinyu Hu, Yichen Liu, Zheng Hua, Shengjia Hao and Yuqiong Yao
Remote Sens. 2023, 15(19), 4688; https://doi.org/10.3390/rs15194688 - 25 Sep 2023
Cited by 6 | Viewed by 1957
Abstract
Deep learning (DL) algorithms have achieved important breakthroughs in hyperspectral image (HSI) classification. Despite this remarkable success, the depth and size of manually designed DL structures make them difficult to deploy on mobile and embedded devices in real applications. To tackle this issue, we propose an efficient lightweight attention network architecture search algorithm (EL-NAS) that automatically designs lightweight DL structures while improving HSI classification performance. First, to realize an efficient search procedure, we construct EL-NAS on a differentiable network architecture search (NAS), which greatly accelerates the convergence of the over-parameterized supernet through gradient descent. Second, to obtain lightweight search results with high accuracy, a lightweight attention module search space is designed for EL-NAS. Finally, to alleviate the problem in which higher validation accuracy accompanies worse classification performance, an edge decision strategy performs edge decisions using the entropy of the distribution estimated over non-skip operations, avoiding the performance collapse caused by numerous skip operations. To verify the effectiveness of EL-NAS, we conducted experiments on several real-world hyperspectral images. The results demonstrate that EL-NAS yields a more efficient search procedure with smaller parameter sizes and high accuracy for HSI classification, even under data-independent and sensor-independent scenarios. Full article
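
A small sketch of the entropy-based edge decision the abstract describes: on each edge of the search cell, softmax the architecture weights of the non-skip candidates and commit to the argmax operation only when the normalized entropy is low. The threshold and the operation list are assumptions for illustration.

```python
import torch

OPS = ["skip_connect", "sep_conv_3x3", "dil_conv_3x3", "avg_pool_3x3"]

def edge_decision(alpha, ops=OPS, tau=0.6):
    # Estimate the distribution over non-skip operations on one edge and
    # decide the edge only when that distribution is confident.
    keep = [i for i, op in enumerate(ops) if op != "skip_connect"]
    p = torch.softmax(alpha[keep], dim=0)
    entropy = -(p * p.log()).sum() / torch.log(torch.tensor(float(len(keep))))
    if entropy < tau:
        return ops[keep[int(p.argmax())]]   # decided edge
    return None                             # undecided: postpone the decision

print(edge_decision(torch.tensor([2.0, 3.0, 0.2, 0.1])))   # sep_conv_3x3
```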

Other

16 pages, 8285 KiB  
Technical Note
A Feature-Driven Inception Dilated Network for Infrared Image Super-Resolution Reconstruction
by Jiaxin Huang, Huicong Wang, Yuhan Li and Shijian Liu
Remote Sens. 2024, 16(21), 4033; https://doi.org/10.3390/rs16214033 - 30 Oct 2024
Viewed by 883
Abstract
Image super-resolution (SR) algorithms based on deep learning yield good visual performances on visible images. Due to the blurred edges and low contrast of infrared (IR) images, methods transferred directly from visible images to IR images perform poorly and ignore the demands of downstream detection tasks. Therefore, an Inception Dilated Super-Resolution (IDSR) network with multiple branches is proposed. A dilated convolutional branch captures high-frequency information to reconstruct edge details, while a non-local operation branch captures long-range dependencies between any two positions to maintain the global structure. Furthermore, deformable convolution is utilized to fuse features extracted from different branches, enabling adaptation to targets of various shapes. To enhance the detection performance of low-resolution (LR) images, we crop the images into patches based on target labels before feeding them to the network. This allows the network to focus on learning the reconstruction of the target areas only, reducing the interference of background areas in the target areas’ reconstruction. Additionally, a feature-driven module is cascaded at the end of the IDSR network to guide the high-resolution (HR) image reconstruction with feature prior information from a detection backbone. This method has been tested on the FLIR Thermal Dataset and the M3FD Dataset and compared with five mainstream SR algorithms. The final results demonstrate that our method effectively maintains image texture details. More importantly, our method achieves 80.55% mAP on the FLIR Dataset and 74.7% mAP on the M3FD Dataset, outperforming the other methods in detection accuracy on both. Full article
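
The dilated-convolution branch idea can be sketched as an inception-style block in which parallel 3x3 convolutions with growing dilation rates widen the receptive field for edge detail. Branch widths and dilation rates are assumptions, and the non-local and deformable-fusion branches are omitted here.

```python
import torch
import torch.nn as nn

class InceptionDilatedBlock(nn.Module):
    # Parallel dilated convolutions capture high-frequency detail at several
    # receptive-field sizes; a residual connection keeps low-frequency content.
    def __init__(self, ch=32):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(ch, ch // 4, 3, padding=d, dilation=d) for d in (1, 2, 3, 4)
        ])
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        out = torch.cat([b(x) for b in self.branches], dim=1)
        return self.act(out) + x

x = torch.randn(1, 32, 48, 48)
print(InceptionDilatedBlock()(x).shape)    # torch.Size([1, 32, 48, 48])
```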

15 pages, 10515 KiB  
Technical Note
A DeturNet-Based Method for Recovering Images Degraded by Atmospheric Turbulence
by Xiangxi Li, Xingling Liu, Weilong Wei, Xing Zhong, Haotong Ma and Junqiu Chu
Remote Sens. 2023, 15(20), 5071; https://doi.org/10.3390/rs15205071 - 23 Oct 2023
Cited by 5 | Viewed by 1914
Abstract
Atmospheric turbulence is one of the main issues causing image blurring, dithering, and other degradation problems when detecting targets over long distances. Due to the randomness of turbulence, degraded images are hard to restore directly using traditional methods. With the rapid development of deep learning, blurred images can be restored correctly and directly by establishing a nonlinear mapping relationship between the degraded and initial objects based on neural networks. These data-driven end-to-end neural networks offer advantages in turbulence image reconstruction due to their real-time properties and simplified optical systems. In this paper, inspired by the connection between turbulence phase diagram characteristics and attentional mechanisms in neural networks, we propose a new deep neural network called DeturNet to enhance network performance and improve the quality of image reconstruction results. DeturNet employs global information aggregation operations and amplifies notable cross-dimensional receptive regions, thereby contributing to the recovery of turbulence-degraded images. Full article