Search Results (58)

Search Parameters:
Keywords = synthetic-aperture radar (SAR) image scene classification

25 pages, 9142 KiB  
Article
Restricted Label-Based Self-Supervised Learning Using SAR and Multispectral Imagery for Local Climate Zone Classification
by Amjad Nawaz, Wei Yang, Hongcheng Zeng, Yamin Wang and Jie Chen
Remote Sens. 2025, 17(8), 1335; https://doi.org/10.3390/rs17081335 - 8 Apr 2025
Viewed by 627
Abstract
Deep learning techniques have garnered significant attention in remote sensing scene classification. However, obtaining a large volume of labeled data for supervised learning (SL) remains challenging. Additionally, SL methods frequently struggle with limited generalization ability. To address these limitations, self-supervised multi-mode representation learning (SSMMRL) is introduced for local climate zone classification (LCZC). Unlike conventional supervised learning methods, SSMMRL utilizes a novel encoder architecture that exclusively processes augmented positive samples (PSs), eliminating the need for negative samples. An attention-guided fusion mechanism is integrated, using positive samples as a form of regularization. The novel encoder captures informative representations from the unannotated So2Sat-LCZ42 dataset, which are then leveraged to enhance performance in a challenging few-shot classification task with limited labeled samples. Co-registered Synthetic Aperture Radar (SAR) and Multispectral (MS) images are used for evaluation and training. This approach enables the model to exploit extensive unlabeled data, enhancing performance on downstream tasks. Experimental evaluations on the So2Sat-LCZ42 benchmark dataset show the efficacy of the SSMMRL method. Our method for LCZC outperforms state-of-the-art (SOTA) approaches.
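
The abstract publishes no code; as a rough sketch of a negative-free, positive-pairs-only objective of the kind it describes (a SimSiam/BYOL-style cosine loss with a stop-gradient branch, not the authors' SSMMRL implementation; the encoder, tensor shapes, and channel choices below are placeholders):

```python
import torch
import torch.nn.functional as F

def positive_pair_loss(online_pred, target_proj):
    """Negative-free objective: pull two augmented views together.

    online_pred: projection of view 1 from the online branch
    target_proj: projection of view 2 from a stop-gradient branch
    """
    p = F.normalize(online_pred, dim=-1)
    z = F.normalize(target_proj.detach(), dim=-1)  # stop-gradient
    return 2 - 2 * (p * z).sum(dim=-1).mean()

# toy usage: a shared encoder applied to co-registered SAR and MS views
encoder = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(2 * 32 * 32, 128))
sar_view = torch.randn(8, 2, 32, 32)  # e.g., two SAR channels (stand-ins)
ms_view = torch.randn(8, 2, 32, 32)   # two MS bands (stand-ins)
loss = positive_pair_loss(encoder(sar_view), encoder(ms_view))
loss.backward()
```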

15 pages, 18148 KiB  
Article
Fast 3D Transmission Tower Detection Based on Virtual Views
by Liwei Zhou, Jiaying Tan, Jing Fu and Guiwei Shao
Appl. Sci. 2025, 15(2), 947; https://doi.org/10.3390/app15020947 - 19 Jan 2025
Cited by 1 | Viewed by 1081
Abstract
Advanced remote sensing technologies leverage extensive synthetic aperture radar (SAR) satellite data and high-resolution airborne light detection and ranging (LiDAR) data to swiftly capture comprehensive 3D information about electrical grid assets and their surrounding environments. This facilitates in-depth scene analysis for target detection and classification, allowing for the early recognition of potential hazards near transmission towers (TTs). These innovations present a groundbreaking strategy for the automated inspection of electrical grid assets. However, traditional 3D target detection techniques, which involve searching the entire 3D space, are marred by low accuracy and high computational demands. Although deep learning-based 3D target detection methods have significantly improved detection precision, they rely on a large volume of 3D target samples for training and are sensitive to point cloud data density. Moreover, these methods demonstrate low detection efficiency, constraining their application in the automated monitoring of electricity networks. This paper proposes a fast 3D target detection method using virtual views to overcome these challenges related to detection accuracy and efficiency. The method first utilizes cutting-edge 2D splatting technology to project 3D point clouds with diverse densities from a specific viewpoint, generating a 2D virtual image. Then, a novel local–global dual-path feature fusion network based on YOLO is applied to detect TTs on the virtual image, ensuring efficient and accurate identification of their positions and types. Finally, by leveraging the projection transformation between the virtual image and the 3D point cloud, combined with a 3D region growing algorithm, the 3D points belonging to the TTs are extracted from the whole 3D point cloud. The effectiveness of the proposed method in terms of target detection rate and efficiency is validated through experiments on synthetic datasets and outdoor LiDAR point clouds.
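
As a loose illustration of the virtual-view idea, not the paper's 2D splatting algorithm, the sketch below projects a point cloud top-down onto an image grid, keeping the maximum intensity per pixel; the resolution and image size are arbitrary:

```python
import numpy as np

def virtual_view(points, intensities, res=0.1, size=256):
    """Project a 3D point cloud onto a top-down virtual image plane.

    points: (N, 3) xyz; intensities: (N,) per-point return strength.
    Each point is splatted into its nearest pixel; overlapping points
    keep the maximum intensity (a crude stand-in for 2D splatting).
    """
    img = np.zeros((size, size), dtype=np.float32)
    xy = points[:, :2] - points[:, :2].min(axis=0)    # shift to origin
    cols = np.clip((xy[:, 0] / res).astype(int), 0, size - 1)
    rows = np.clip((xy[:, 1] / res).astype(int), 0, size - 1)
    np.maximum.at(img, (rows, cols), intensities)     # max-splatting
    return img

pts = np.random.rand(10000, 3) * 20.0                 # synthetic cloud
img = virtual_view(pts, np.random.rand(10000))
print(img.shape, img.max())
```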

19 pages, 3804 KiB  
Article
SAR-PATT: A Physical Adversarial Attack for SAR Image Automatic Target Recognition
by Binyan Luo, Hang Cao, Jiahao Cui, Xun Lv, Jinqiang He, Haifeng Li and Chengli Peng
Remote Sens. 2025, 17(1), 21; https://doi.org/10.3390/rs17010021 - 25 Dec 2024
Cited by 2 | Viewed by 1229
Abstract
Deep neural network-based synthetic aperture radar (SAR) automatic target recognition (ATR) systems are susceptible to attack by adversarial examples, which lead to misclassification by the SAR ATR system, resulting in theoretical model robustness problems and practical security problems. Inspired by optical images, current SAR ATR adversarial example generation is performed in the image domain. However, the imaging principle of SAR is based on the echo signals produced by the interaction between the SAR sensor and objects. Generating adversarial examples only in the image domain cannot change the physical world to achieve adversarial attacks. To solve these problems, this article proposes a framework for generating SAR adversarial examples in a 3D physical scene. First, adversarial attacks are implemented in the 2D image space, and the perturbation in the image space is converted into simulated rays that constitute SAR images through backpropagation optimization methods. The mapping between these simulated rays and the 3D model is established through coordinate transformation, establishing correspondences between points and triangular faces and between intensity values and texture parameters. Thus, the simulated rays are mapped to the 3D model, and the perturbation in the 2D image space is converted back to 3D physical space to obtain the position and intensity of the perturbation there, thereby achieving physical adversarial attacks. The experimental results show that our attack method can effectively perform SAR adversarial attacks in the physical world. In the digital world, we achieved an average fooling rate of up to 99.02% for three objects across six classification networks. In the physical world, we achieved an average fooling rate of up to 97.87% for these objects, with a certain degree of transferability across the six different network architectures. To the best of our knowledge, this is the first work to implement physical attacks under fully physically simulated conditions. Our research establishes a theoretical foundation for the future concealment of SAR targets in practical settings and offers valuable insights for enhancing the attack and defense capabilities of subsequent DNNs in SAR ATR systems.
(This article belongs to the Section AI Remote Sensing)
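
For the first, image-domain stage the abstract describes, a standard PGD attack is the canonical building block; below is a minimal sketch (the model, budget `eps`, and step size are placeholders, and the 3D mapping stage is not shown):

```python
import torch

def pgd_attack(model, x, y, eps=0.03, alpha=0.01, steps=10):
    """Standard PGD in the image domain: repeatedly ascend the loss and
    project the perturbation back into an L-infinity eps-ball."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = torch.nn.functional.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()       # ascend the loss
        x_adv = x.detach() + (x_adv - x).clamp(-eps, eps)  # project to eps-ball
        x_adv = x_adv.clamp(0, 1)                          # keep a valid image
    return x_adv

# toy usage with a stand-in classifier
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(64 * 64, 10))
x, y = torch.rand(4, 1, 64, 64), torch.randint(0, 10, (4,))
x_adv = pgd_attack(model, x, y)
```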

16 pages, 2868 KiB  
Article
Automatic Water Body Extraction from SAR Images Based on MADF-Net
by Jing Wang, Dongmei Jia, Jiaxing Xue, Zhongwu Wu and Wanying Song
Remote Sens. 2024, 16(18), 3419; https://doi.org/10.3390/rs16183419 - 14 Sep 2024
Cited by 5 | Viewed by 1805
Abstract
Water extraction from synthetic aperture radar (SAR) images has important application value in wetland monitoring, flood monitoring, and related tasks. However, it still faces the problems of low generalization, weak extraction of detailed information, and weak suppression of background noise. Therefore, a new framework, the Multi-scale Attention Detailed Feature fusion Network (MADF-Net), is proposed in this paper. It comprises an encoder and a decoder. In the encoder, ResNet101 is used as a solid backbone network to capture four feature levels at different depths, after which the proposed Deep Pyramid Pool (DAPP) module performs multi-scale pooling operations, ensuring that key water features can be captured even in complex backgrounds. In the decoder, a Channel Spatial Attention Module (CSAM) is proposed, which focuses on feature areas that are critical for identifying water edges by fusing attention weights in the channel and spatial dimensions. Finally, the high-level semantic information is effectively fused with the low-level edge features to achieve the final water detection results. In the experiments, Sentinel-1 SAR images of three scenes with water bodies of different characteristics and scales are used. The PA and IoU of water extraction by MADF-Net reach 92.77% and 89.03%, respectively, clearly outperforming several other networks. MADF-Net performs high-precision water extraction from SAR images with different backgrounds and could also be used for other segmentation and classification tasks on SAR images.
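
The exact CSAM design is not given here; as a generic stand-in, a CBAM-style module that fuses attention weights in the channel and spatial dimensions might look like this:

```python
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    """CBAM-style channel + spatial attention; a generic stand-in for
    the CSAM described in the abstract (the exact design may differ)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        # channel attention from global average- and max-pooled statistics
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
        # spatial attention from per-pixel channel statistics
        s = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))

feat = torch.randn(2, 64, 32, 32)
out = ChannelSpatialAttention(64)(feat)   # same shape as the input
```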

30 pages, 11567 KiB  
Article
Gini Coefficient-Based Feature Learning for Unsupervised Cross-Domain Classification with Compact Polarimetric SAR Data
by Xianyu Guo, Junjun Yin, Kun Li and Jian Yang
Agriculture 2024, 14(9), 1511; https://doi.org/10.3390/agriculture14091511 - 3 Sep 2024
Viewed by 1252
Abstract
Remote sensing image classification usually needs many labeled samples so that the target's nature can be fully described. For synthetic aperture radar (SAR) images, variations in target scattering always occur to some extent due to the imaging geometry, weather conditions, and system parameters. Therefore, labeled samples in one image may not be suitable to represent the same target in other images. The domain distribution shift between different images reduces the reusability of labeled samples. Thus, exploring cross-domain interpretation methods holds great potential for SAR images to improve the reuse rate of existing labels from historical images. In this study, an unsupervised cross-domain classification method is proposed that utilizes the Gini coefficient to rank the robust and stable polarimetric features in both the source and target domains (GRFST) such that unsupervised domain adaptation (UDA) can be achieved. This method selects the optimal features from both the source and target domains to alleviate the domain distribution shift. Both fully polarimetric (FP) and compact polarimetric (CP) SAR features are explored for cross-domain terrain type classification. Specifically, the CP mode refers to the hybrid dual-pol mode with an arbitrary transmitting ellipse wave. This is the first attempt in the open literature to investigate the representational abilities of different CP modes for cross-domain terrain classification. Experiments are conducted from four aspects to demonstrate the performance of CP modes for cross-data, cross-scene, and cross-crop type classification. Results show that the GRFST-UDA method yields a classification accuracy 2% to 12% higher than traditional UDA methods. The degree of scene similarity has a certain impact on the accuracy of cross-domain crop classification. It was also found that when both FP and circular CP SAR data are used, stable, promising results can be achieved.
(This article belongs to the Special Issue Applications of Remote Sensing in Agricultural Soil and Crop Mapping)
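
The Gini coefficient itself is straightforward to compute; the sketch below ranks hypothetical polarimetric features by a standard Gini formula (the paper's GRFST criterion additionally involves both the source and target domains):

```python
import numpy as np

def gini(x):
    """Gini coefficient of a non-negative 1-D sample via the standard
    sorted-index identity: G = 2*sum(i*x_(i))/(n*sum(x)) - (n+1)/n."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    idx = np.arange(1, n + 1)
    return 2 * np.sum(idx * x) / (n * np.sum(x)) - (n + 1) / n

# rank candidate polarimetric features (names are made up) by Gini score
rng = np.random.default_rng(0)
features = {name: rng.gamma(2.0, size=500) for name in ["H", "alpha", "span"]}
ranking = sorted(features, key=lambda k: gini(features[k]))
print(ranking)
```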

20 pages, 7378 KiB  
Article
A Lightweight Pyramid Transformer for High-Resolution SAR Image-Based Building Classification in Port Regions
by Bo Zhang, Qian Wu, Fan Wu, Jiajia Huang and Chao Wang
Remote Sens. 2024, 16(17), 3218; https://doi.org/10.3390/rs16173218 - 30 Aug 2024
Cited by 1 | Viewed by 1462
Abstract
Automatic classification of buildings within port areas from synthetic aperture radar (SAR) images is crucial for effective port monitoring and planning. Yet the unique challenges of SAR imaging, such as side-looking geometry, multi-bounce scattering, and the compact arrangement of structures, often lead to incomplete building structures and blurred boundaries in classification results. To address these issues, this paper introduces SPformer, an efficient and lightweight pyramid transformer model tailored for semantic segmentation. SPformer utilizes a pyramid transformer encoder with spatially separable self-attention (SSSA) to refine both local and global spatial information and to process multi-scale features, enhancing the accuracy of building structure delineation. It also integrates a lightweight all multi-layer perceptron (ALL-MLP) decoder to consolidate multi-scale information across various depths and attention scopes, refining detail processing. Experimental results on the Gaofen-3 (GF-3) 1 m port building classification dataset demonstrate the effectiveness of SPformer, which achieves competitive performance compared to state-of-the-art models, with mean intersection over union (mIoU) and mean F1-score (mF1) reaching 77.14% and 87.04%, respectively, while maintaining a compact model size and lower computational requirements. Experiments conducted on full-scene SAR images covering a port area further demonstrate the capabilities of the proposed method.
(This article belongs to the Special Issue SAR in Big Data Era III)
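
As a generic sketch of an all-MLP decoder of the kind SPformer integrates (modeled on SegFormer's design, with made-up channel widths, not the SPformer code): project each scale to one width, upsample, concatenate, and fuse.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AllMLPDecoder(nn.Module):
    """SegFormer-style all-MLP decoder over multi-scale encoder features."""
    def __init__(self, in_dims=(32, 64, 160, 256), dim=128, classes=2):
        super().__init__()
        self.proj = nn.ModuleList([nn.Linear(d, dim) for d in in_dims])
        self.fuse = nn.Conv2d(dim * len(in_dims), dim, 1)
        self.head = nn.Conv2d(dim, classes, 1)

    def forward(self, feats):                    # fine-to-coarse list
        target = feats[0].shape[2:]              # finest spatial size
        outs = []
        for f, proj in zip(feats, self.proj):
            b, c, h, w = f.shape
            f = proj(f.flatten(2).transpose(1, 2))        # (B, HW, dim)
            f = f.transpose(1, 2).reshape(b, -1, h, w)    # back to (B, dim, H, W)
            outs.append(F.interpolate(f, size=target, mode="bilinear",
                                      align_corners=False))
        return self.head(self.fuse(torch.cat(outs, dim=1)))

feats = [torch.randn(1, c, 64 // 2**i, 64 // 2**i)
         for i, c in enumerate((32, 64, 160, 256))]
print(AllMLPDecoder()(feats).shape)   # (1, 2, 64, 64)
```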

11 pages, 2462 KiB  
Article
A SAR Ship Detection Method Based on Adversarial Training
by Jianwei Li, Zhentao Yu, Jie Chen and Hao Jiang
Sensors 2024, 24(13), 4154; https://doi.org/10.3390/s24134154 - 26 Jun 2024
Cited by 1 | Viewed by 1570
Abstract
SAR (synthetic aperture radar) ship detection is a hot topic due to the breadth of its applications. However, limited by the volume of available SAR imagery, the generalization ability of detectors is low, which makes it difficult to adapt to new scenes. Although many data augmentation methods are used, for example, clipping, pasting, and mixing, they improve accuracy only slightly. To solve this problem, adversarial training is used for data generation in this paper. Perturbations are added to SAR images to generate new samples for training, which drives the detector to learn more abundant features and promotes its robustness. By separating batch normalization between clean samples and disturbed images, performance degradation on clean samples is avoided. By simultaneously perturbing and selecting large classification and localization losses, the detector remains adaptable to stronger adversarial samples. The optimization efficiency and results are improved through K-step average perturbation and one-step gradient descent. Experiments on different detectors show that the proposed method achieves 8%, 10%, and 17% AP (Average Precision) improvements on SSDD, the SAR-Ship-Dataset, and AIR-SARShip, respectively, compared to traditional data augmentation methods.
(This article belongs to the Section Remote Sensors)
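
One concrete form of "separating batch normalization between clean samples and disturbed images" is a dual-BN layer switched by a flag; a minimal sketch (the paper's exact variant may differ):

```python
import torch
import torch.nn as nn

class DualBatchNorm2d(nn.Module):
    """Keep separate BN statistics for clean and adversarial batches so
    perturbed samples do not corrupt the clean-data statistics."""
    def __init__(self, channels):
        super().__init__()
        self.bn_clean = nn.BatchNorm2d(channels)
        self.bn_adv = nn.BatchNorm2d(channels)

    def forward(self, x, adversarial=False):
        return self.bn_adv(x) if adversarial else self.bn_clean(x)

bn = DualBatchNorm2d(16)
clean, adv = torch.randn(4, 16, 8, 8), torch.randn(4, 16, 8, 8)
out = bn(clean) + bn(adv, adversarial=True)
```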

24 pages, 4272 KiB  
Article
JPSSL: SAR Terrain Classification Based on Jigsaw Puzzles and FC-CRF
by Zhongle Ren, Yiming Lu, Biao Hou, Weibin Li and Feng Sha
Remote Sens. 2024, 16(9), 1635; https://doi.org/10.3390/rs16091635 - 3 May 2024
Viewed by 1776
Abstract
Effective features play an important role in synthetic aperture radar (SAR) image interpretation. However, since SAR images contain a variety of terrain types, it is not easy to extract effective features of different terrains from them. Deep learning methods require a large amount of labeled data, but the difficulty of SAR image annotation limits the performance of deep learning models. SAR images also suffer from inevitable geometric distortion and coherent speckle noise, which further complicate feature extraction. If effective semantic context features cannot be learned for SAR images, the extracted features struggle to distinguish different terrain categories. Some existing terrain classification methods are very limited and can only be applied to certain specified SAR images. To solve these problems, a jigsaw puzzle self-supervised learning (JPSSL) framework is proposed. The framework comprises a jigsaw puzzle pretext task and a terrain classification downstream task. In the pretext task, the information in the SAR image is learned by completing a SAR image jigsaw puzzle, driving the extraction of effective features. The terrain classification downstream task is trained using only a small number of labeled data. Finally, fully connected conditional random field processing is performed to eliminate noise points and obtain a high-quality terrain classification result. Experimental results on three large-scene high-resolution SAR images confirm the effectiveness and generalization of our method. Compared with supervised methods, the features learned in JPSSL are highly discriminative, and JPSSL achieves good classification accuracy using only a small amount of labeled data.
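
A generic jigsaw pretext task, not JPSSL itself, can be set up in a few lines: cut a patch into a grid, shuffle it with a sampled permutation, and use the permutation index as the self-supervised label (real implementations fix a permutation set in advance):

```python
import numpy as np

def make_jigsaw(tile_img, grid=3, n_perms=8, rng=None):
    """Return a shuffled grid x grid jigsaw of tile_img and the index of
    the permutation used, which serves as the pretext-task label."""
    rng = rng or np.random.default_rng()
    h, w = tile_img.shape
    ph, pw = h // grid, w // grid
    patches = [tile_img[r*ph:(r+1)*ph, c*pw:(c+1)*pw]
               for r in range(grid) for c in range(grid)]
    perms = [rng.permutation(grid * grid) for _ in range(n_perms)]
    label = rng.integers(n_perms)
    shuffled = [patches[i] for i in perms[label]]
    rows = [np.hstack(shuffled[r*grid:(r+1)*grid]) for r in range(grid)]
    return np.vstack(rows), label

img = np.random.rand(96, 96).astype(np.float32)   # a SAR patch stand-in
puzzle, label = make_jigsaw(img)
print(puzzle.shape, label)
```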

23 pages, 32421 KiB  
Article
R-LRBPNet: A Lightweight SAR Image Oriented Ship Detection and Classification Method
by Gui Gao, Yuhao Chen, Zhuo Feng, Chuan Zhang, Dingfeng Duan, Hengchao Li and Xi Zhang
Remote Sens. 2024, 16(9), 1533; https://doi.org/10.3390/rs16091533 - 26 Apr 2024
Cited by 10 | Viewed by 2480
Abstract
Synthetic Aperture Radar (SAR) has the advantage of continuous observation throughout the day and in all weather conditions, and is used in a wide range of military and civil applications. Among these, the detection of ships at sea is an important research topic. Ships in SAR images are characterized by dense alignment, arbitrary orientations, and multiple scales. Existing detection algorithms are unable to solve these problems effectively. To address these issues, a YOLOv8-based oriented ship detection and classification method using SAR imagery with lightweight receptive field feature convolution, bottleneck transformers, and a probabilistic intersection-over-union network (R-LRBPNet) is proposed in this paper. First, a CSP bottleneck with two bottleneck transformer (C2fBT) modules based on bottleneck transformers is proposed; this is an improved feature fusion module that integrates the global spatial features of bottleneck transformers and the rich channel features of C2f. This effectively reduces the negative impact of densely arranged scenarios. Second, we propose an angle decoupling module. This module uses probabilistic intersection-over-union (ProbIoU) and distribution focal loss (DFL) methods to compute the rotated intersection-over-union (RIoU), which effectively alleviates the problem of angle regression and the imbalance between angle regression and other regression tasks. Third, the lightweight receptive field feature convolution (LRFConv) is designed to replace the conventional convolution in the neck. This module can dynamically adjust the receptive field according to the target scale and calculate feature pixel weights based on the input feature map. Through this module, the network can efficiently extract details and important information about ships to improve classification performance. We conducted extensive experiments on the complex-scene SAR datasets SRSDD and SSDD+. The experimental results show that R-LRBPNet requires only 6.8 MB of model memory and achieves 78.2% detection accuracy, 64.2% recall, a 70.51 F1-score, and 71.85% mAP on the SRSDD dataset.
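
ProbIoU-style losses model each rotated box as a 2D Gaussian and compare boxes via the Hellinger distance; below is a numeric sketch under one common parameterization (uniform-box covariance w²/12, h²/12), not the R-LRBPNet code:

```python
import numpy as np

def box_to_gaussian(cx, cy, w, h, theta):
    """Model a rotated box (cx, cy, w, h, theta) as a 2-D Gaussian with
    the covariance of a uniform distribution over the box."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    cov = R @ np.diag([w**2 / 12.0, h**2 / 12.0]) @ R.T
    return np.array([cx, cy]), cov

def prob_iou(b1, b2):
    """Similarity 1 - Hellinger distance between the two box Gaussians,
    via the Bhattacharyya distance bd."""
    m1, S1 = box_to_gaussian(*b1)
    m2, S2 = box_to_gaussian(*b2)
    S = (S1 + S2) / 2.0
    d = m2 - m1
    bd = (d @ np.linalg.inv(S) @ d) / 8.0 + 0.5 * np.log(
        np.linalg.det(S) / np.sqrt(np.linalg.det(S1) * np.linalg.det(S2)))
    return 1.0 - np.sqrt(1.0 - np.exp(-bd))

print(prob_iou((0, 0, 4, 2, 0.0), (0.5, 0, 4, 2, 0.2)))
```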

25 pages, 16942 KiB  
Article
TAG-Net: Target Attitude Angle-Guided Network for Ship Detection and Classification in SAR Images
by Dece Pan, Youming Wu, Wei Dai, Tian Miao, Wenchao Zhao, Xin Gao and Xian Sun
Remote Sens. 2024, 16(6), 944; https://doi.org/10.3390/rs16060944 - 7 Mar 2024
Cited by 4 | Viewed by 1907
Abstract
Synthetic aperture radar (SAR) ship detection and classification has gained unprecedented attention due to its important role in maritime transportation. Many deep learning-based detectors and classifiers have been successfully applied and achieved great progress. However, ships in SAR images present discrete and multi-centric features, and their scattering characteristics and edge information are sensitive to variations in target attitude angles (TAAs). These factors pose challenges for existing methods to obtain satisfactory results. To address these challenges, a novel target attitude angle-guided network (TAG-Net) is proposed in this article. The core idea of TAG-Net is to leverage TAA information as guidance and use an adaptive feature-level fusion strategy to dynamically learn more representative features that can handle the target imaging diversity caused by TAA. This is achieved through a TAA-aware feature modulation (TAFM) module. It uses the TAA information and foreground information as prior knowledge and establishes the relationship between the ship scattering characteristics and TAA information. This enables a reduction in the intra-class variability and highlights ship targets. Additionally, considering the different requirements of the detection and classification tasks for the scattering information, we propose a layer-wise attention-based task decoupling detection head (LATD). Unlike general deep learning methods that use shared features for both detection and classification tasks, LATD extracts multi-level features and uses layer attention to achieve feature decoupling and select the most suitable features for each task. Finally, we introduce a novel salient-enhanced feature balance module (SFB) to provide richer semantic information and capture the global context to highlight ships in complex scenes, effectively reducing the impact of background noise. A large-scale ship detection dataset (LSSDD+) is used to verify the effectiveness of TAG-Net, and our method achieves state-of-the-art performance.
(This article belongs to the Special Issue SAR Data Processing and Applications Based on Machine Learning Method)
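
One simple way to condition features on an attitude angle is FiLM-style modulation; the sketch below is a generic stand-in for the TAFM idea, with a sin/cos angle encoding and made-up layer sizes:

```python
import torch
import torch.nn as nn

class AngleModulation(nn.Module):
    """FiLM-style conditioning: map a target attitude angle to per-channel
    scale and shift for a feature map (not the TAFM module itself)."""
    def __init__(self, channels):
        super().__init__()
        self.to_params = nn.Sequential(
            nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 2 * channels))

    def forward(self, feat, angle_rad):
        # encode the angle as (sin, cos) so 0 and 2*pi coincide
        enc = torch.stack([angle_rad.sin(), angle_rad.cos()], dim=-1)
        gamma, beta = self.to_params(enc).chunk(2, dim=-1)
        return feat * (1 + gamma[..., None, None]) + beta[..., None, None]

feat = torch.randn(4, 64, 16, 16)
angles = torch.rand(4) * 2 * torch.pi
out = AngleModulation(64)(feat, angles)   # same shape as the input
```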

21 pages, 4023 KiB  
Article
A Multichannel-Based Deep Learning Framework for Ocean SAR Scene Classification
by Chengzu Bai, Shuo Zhang, Xinning Wang, Jiaqiang Wen and Chong Li
Appl. Sci. 2024, 14(4), 1489; https://doi.org/10.3390/app14041489 - 12 Feb 2024
Cited by 1 | Viewed by 1801
Abstract
High-resolution synthetic aperture radars (SARs) are becoming indispensable environmental monitoring systems for capturing important geophysical phenomena on the earth and sea surface. However, there is a lack of comprehensive models that can orchestrate such large-scale datasets from numerous satellite missions such as GaoFen-3 and Sentinel-1. In addition, SAR images of different ocean scenes need to convey a variety of high-level classification features of oceanic and atmospheric phenomena. In this study, we propose a multichannel neural network (MCNN) that supports oceanic SAR scene classification for limited oceanic data samples through multi-feature fusion, data augmentation, and multichannel feature extraction. To exploit the multichannel semantics of SAR scenes, the multi-feature fusion module effectively combines and reshapes the spatiotemporal SAR images to preserve their structural properties. A fine-grained feature augmentation policy improves data quality so that the classification model is less vulnerable to both small- and large-scale data. The multichannel feature extraction also aggregates different oceanic features convolutionally extracted from ocean SAR scenes to improve the classification accuracy of oceanic phenomena at different scales. Through extensive experimental analysis, our MCNN framework demonstrates commendable classification performance, achieving an average precision rate of 96%, an average recall rate of 95%, and an average F-score of 95% across ten distinct oceanic phenomena. Notably, it surpasses two state-of-the-art classification techniques, namely AlexNet and CMwv, by margins of 23.7% and 18.3%, respectively.
(This article belongs to the Section Marine Science and Engineering)
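
A toy version of multichannel feature extraction, independent convolutional branches whose outputs are concatenated before the classifier, might look like the following (branch inputs and sizes are placeholders, not the MCNN architecture):

```python
import torch
import torch.nn as nn

class MultiChannelNet(nn.Module):
    """Independent conv branches over different input views, concatenated
    before a shared classification head."""
    def __init__(self, branches=3, classes=10):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                          nn.AdaptiveAvgPool2d(1), nn.Flatten())
            for _ in range(branches)])
        self.head = nn.Linear(16 * branches, classes)

    def forward(self, views):          # list of single-channel inputs
        feats = [b(v) for b, v in zip(self.branches, views)]
        return self.head(torch.cat(feats, dim=1))

# e.g., raw image, a filtered version, and a texture map (stand-ins)
views = [torch.randn(2, 1, 64, 64) for _ in range(3)]
print(MultiChannelNet()(views).shape)   # (2, 10)
```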

23 pages, 7881 KiB  
Article
Improving Out-of-Distribution Generalization in SAR Image Scene Classification with Limited Training Samples
by Zhe Chen, Zhiquan Ding, Xiaoling Zhang, Xin Zhang and Tianqi Qin
Remote Sens. 2023, 15(24), 5761; https://doi.org/10.3390/rs15245761 - 17 Dec 2023
Viewed by 1716
Abstract
For practical maritime SAR image classification tasks with special imaging platforms, the scenes to be classified are often different from those in the training sets. The quantity and diversity of the available training data can also be extremely limited. This problem of out-of-distribution (OOD) generalization with limited training samples leads to a sharp drop in the performance of conventional deep learning algorithms. In this paper, a knowledge-guided neural network (KGNN) model is proposed to overcome these challenges. By analyzing the saliency features of various maritime SAR scenes, universal knowledge is summarized in descriptive sentences. A feature integration strategy is designed to assign the descriptive knowledge to the ResNet-18 backbone. Both the individual semantic information and the inherent relations of the entities in SAR images are addressed. The experimental results show that our KGNN method outperforms conventional deep learning models in OOD scenarios with varying training sample sizes and achieves higher robustness in handling distributional shifts caused by weather conditions, terrain type, and sensor characteristics. In addition, the KGNN model converges in far fewer epochs during training. The performance improvement indicates that the KGNN model learns representations guided by properties beneficial for OOD generalization with limited training samples.
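
As a minimal stand-in for assigning descriptive knowledge to a ResNet-18 backbone, the sketch below late-fuses a fixed knowledge embedding (e.g., an encoded descriptive sentence) with image features; the paper's integration strategy is more involved:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class KnowledgeGuidedNet(nn.Module):
    """Late fusion of a knowledge embedding with ResNet-18 image features."""
    def __init__(self, know_dim=64, classes=5):
        super().__init__()
        backbone = resnet18(weights=None)
        backbone.fc = nn.Identity()            # expose 512-d features
        self.backbone = backbone
        self.head = nn.Linear(512 + know_dim, classes)

    def forward(self, x, knowledge):
        return self.head(torch.cat([self.backbone(x), knowledge], dim=1))

x = torch.randn(2, 3, 224, 224)
k = torch.randn(2, 64)                          # stand-in sentence embedding
print(KnowledgeGuidedNet()(x, k).shape)         # (2, 5)
```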

30 pages, 20140 KiB  
Article
Comparative Analysis of Pixel-Level Fusion Algorithms and a New High-Resolution Dataset for SAR and Optical Image Fusion
by Jinjin Li, Jiacheng Zhang, Chao Yang, Huiyu Liu, Yangang Zhao and Yuanxin Ye
Remote Sens. 2023, 15(23), 5514; https://doi.org/10.3390/rs15235514 - 27 Nov 2023
Cited by 18 | Viewed by 5539
Abstract
Synthetic aperture radar (SAR) and optical images often present different geometric structures and texture features for the same ground object. Fusing SAR and optical images can effectively integrate their complementary information, better meeting the requirements of remote sensing applications such as target recognition, classification, and change detection, and thus realizing the collaborative utilization of multi-modal images. In order to select appropriate methods for achieving high-quality fusion of SAR and optical images, this paper conducts a systematic review of current pixel-level fusion algorithms for SAR and optical image fusion. Subsequently, eleven representative fusion methods, including component substitution (CS) methods, multiscale decomposition (MSD) methods, and model-based methods, are chosen for comparative analysis. In the experiments, we produce a high-resolution SAR and optical image fusion dataset (named YYX-OPT-SAR) covering three different types of scenes: urban, suburban, and mountainous. This dataset and a publicly available medium-resolution dataset are used to evaluate the fusion methods using three different kinds of criteria: visual evaluation, objective image quality metrics, and classification accuracy. In terms of the image quality metrics, the experimental results show that MSD methods can effectively avoid the negative effects of SAR image shadows on the corresponding areas of the fusion result, compared with CS methods, while model-based methods exhibit relatively poor performance. Among all the fusion methods compared, the non-subsampled contourlet transform (NSCT) method presents the best fusion results. In the classification-based evaluation, most results show that the overall classification accuracy after fusion is better than that before fusion. This indicates that optical-SAR fusion can improve land classification, with the gradient transfer fusion (GTF) method yielding the best classification results among these fusion methods.
(This article belongs to the Special Issue Multi-Sensor Systems and Data Fusion in Remote Sensing II)
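
As a runnable illustration of the component-substitution family the paper compares, here is a crude Brovey-like intensity swap (far simpler than IHS, PCA, GS, or the NSCT method; all parameters are placeholders):

```python
import numpy as np

def cs_fusion(optical_rgb, sar, weight=1.0):
    """Minimal component-substitution fusion: replace the intensity
    component of the optical image with a mean/std-matched SAR band,
    then rescale the RGB channels by the intensity ratio."""
    intensity = optical_rgb.mean(axis=2)
    # crude histogram match: align SAR mean/std to the intensity band
    sar_m = (sar - sar.mean()) / (sar.std() + 1e-8)
    sar_m = sar_m * intensity.std() + intensity.mean()
    new_i = (1 - weight) * intensity + weight * sar_m
    ratio = new_i / (intensity + 1e-8)
    return np.clip(optical_rgb * ratio[..., None], 0, 1)

opt = np.random.rand(128, 128, 3)   # stand-in optical image in [0, 1]
sar = np.random.rand(128, 128)      # stand-in co-registered SAR band
fused = cs_fusion(opt, sar)
```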

22 pages, 6624 KiB  
Article
CCDS-YOLO: Multi-Category Synthetic Aperture Radar Image Object Detection Model Based on YOLOv5s
by Min Huang, Zexu Liu, Tianen Liu and Jingyang Wang
Electronics 2023, 12(16), 3497; https://doi.org/10.3390/electronics12163497 - 18 Aug 2023
Cited by 10 | Viewed by 2342
Abstract
Synthetic Aperture Radar (SAR) is an active microwave sensor that has attracted widespread attention due to its ability to observe the ground around the clock. Research on multi-scale and multi-category target detection methods holds great significance in the fields of maritime resource management and wartime reconnaissance. However, complex scenes often influence SAR object detection, and the diversity of target scales also brings challenges to research. This paper proposes a multi-category SAR image object detection model, CCDS-YOLO, based on YOLOv5s, to address these issues. Embedding the Convolutional Block Attention Module (CBAM) in the feature extraction part of the backbone network enhances the model's ability to extract and fuse spatial and channel information. The 1 × 1 convolution in the feature pyramid network and the first-layer convolution of the detection head are replaced with the expanded convolution, Coordinate Convolution (CoordConv), forming a CRD-FPN module. This module more accurately perceives the spatial details of the feature map, enhancing the model's ability to handle regression tasks compared with traditional convolution. In the detector segment, a decoupled head is utilized for feature extraction, offering optimal and effective feature information to the classification and regression branches separately. Traditional Non-Maximum Suppression (NMS) is substituted with Soft Non-Maximum Suppression (Soft-NMS), successfully reducing the model's duplicate detection rate for compact objects. The experimental findings show that the approach presented in this paper achieves excellent results in multi-category target recognition for SAR images. Empirical comparisons are conducted on the filtered MSAR dataset. Compared with YOLOv5s, the performance of CCDS-YOLO is significantly improved: the mAP@0.5 value increases by 3.3% to 92.3%, precision increases by 3.4%, and mAP@0.5:0.95 increases by 6.7%. Furthermore, in comparison with other mainstream detection models, CCDS-YOLO stands out in overall performance and anti-interference ability.
(This article belongs to the Special Issue New Insights in Radar Imaging)
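
Soft-NMS, which the model substitutes for hard NMS, is easy to state precisely; a Gaussian-decay sketch (hyperparameters are the usual defaults, not values from the paper):

```python
import numpy as np

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: instead of discarding overlapping boxes, decay
    their scores by exp(-iou^2 / sigma). boxes: (N, 4) as x1, y1, x2, y2."""
    boxes, scores = boxes.astype(float).copy(), scores.astype(float).copy()
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    keep = []
    while scores.max() > score_thresh:
        i = scores.argmax()
        keep.append(i)
        x1 = np.maximum(boxes[i, 0], boxes[:, 0])
        y1 = np.maximum(boxes[i, 1], boxes[:, 1])
        x2 = np.minimum(boxes[i, 2], boxes[:, 2])
        y2 = np.minimum(boxes[i, 3], boxes[:, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (area(boxes[i]) + area(boxes) - inter)
        scores *= np.exp(-iou ** 2 / sigma)   # decay, not hard suppression
        scores[i] = 0.0                       # never pick the same box twice
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]])
scores = np.array([0.9, 0.8, 0.7])
print(soft_nms(boxes, scores))
```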

15 pages, 6694 KiB  
Article
Ship Recognition for SAR Scene Images under Imbalance Data
by Ronghui Zhan and Zongyong Cui
Remote Sens. 2022, 14(24), 6294; https://doi.org/10.3390/rs14246294 - 12 Dec 2022
Cited by 9 | Viewed by 2463
Abstract
Synthetic aperture radar (SAR) ship recognition can obtain location and class information from SAR scene images, which is important in military and civilian fields, and has recently become a major research focus. Limited by data conditions, current research mainly covers two separate aspects: ship detection in SAR scene images and ship classification in SAR slice images. These two parts are not yet integrated, but integrating detection and classification is necessary in practical applications, even though it causes an imbalance of training samples across classes. To solve these problems, this paper proposes a ship recognition method based on a deep network to detect and classify ship targets in SAR scene images under imbalanced data. First, RetinaNet is used as the backbone network of the method for integrating ship detection and classification in SAR scene images. Then, taking into account the high similarities among various SAR ship classes, the squeeze-and-excitation (SE) module is introduced to amplify the difference features and reduce the similarity features. Finally, considering the class imbalance in ship target recognition in SAR scene images, a loss function, the central focal loss (CEFL), based on depth feature aggregation is constructed to reduce the differences within classes. Based on datasets from OpenSARShip and Sentinel-1, the experimental results suggest that the proposed method is feasible and that its accuracy improves by 3.9 percentage points compared with the traditional RetinaNet.
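
CEFL is the authors' construction, but it builds on the focal-loss idea of down-weighting easy examples so rare classes are not drowned out; a standard focal loss for reference (not the CEFL itself):

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Standard focal loss: scale cross-entropy by (1 - p_t)^gamma so
    well-classified (easy) examples contribute little to the gradient."""
    ce = F.cross_entropy(logits, targets, reduction="none")
    pt = torch.exp(-ce)                   # probability of the true class
    return (alpha * (1 - pt) ** gamma * ce).mean()

logits = torch.randn(8, 4, requires_grad=True)   # 4 ship classes (toy)
targets = torch.randint(0, 4, (8,))
loss = focal_loss(logits, targets)
loss.backward()
```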