Special Issue "Deep Learning and Computer Vision in Remote Sensing"

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "AI Remote Sensing".

Deadline for manuscript submissions: 31 May 2022.

Special Issue Editors

Dr. Fahimeh Farahnakian
Guest Editor
Prof. Dr. Jukka Heikkonen
Guest Editor
Department of Information Technology, University of Turku, Turku, Finland
Interests: machine learning; computer vision; deep learning; multi-sensor fusion; data analysis
Mr. Pouya Jafarzadeh
Guest Editor Assistant
Department of Information Technology, University of Turku, Turku, Finland
Interests: machine learning; Internet of Things; pose estimation; healthcare

Special Issue Information

Dear Colleagues,

In the last few years, the field of computer vision has made huge progress in remote sensing, a success largely due to the effectiveness of deep learning (DL) algorithms. The remote sensing community has accordingly shifted its attention to DL, and DL algorithms have achieved significant success in many image analysis tasks. However, a number of challenges specific to remote sensing, such as difficult data acquisition and annotation, have not yet been fully solved.

The aim of this Special Issue is to provide an opportunity to explore these challenges in remote sensing using computer vision, deep learning, and artificial intelligence. Its scope is interdisciplinary, and it seeks collaborative contributions from academia and industry experts in the areas of deep learning, computer vision, data science, and remote sensing. Major topics of interest include, but are not limited to, the following:

  • Deep learning and computer vision for RS problems
  • Deep learning for RS image understanding, such as object detection, image classification, and semantic and instance segmentation
  • Deep learning for RS scene understanding and classification
  • Satellite image processing and analysis
  • Transfer learning and machine learning for RS
  • Applications of deep learning and computer vision in remote sensing

Dr. Fahimeh Farahnakian
Prof. Dr. Jukka Heikkonen
Guest Editors
Mr. Pouya Jafarzadeh
Guest Editor Assistant

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss francs). Submitted papers should be well formatted and written in good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (15 papers)


Research


Article
Deep Learning Triplet Ordinal Relation Preserving Binary Code for Remote Sensing Image Retrieval Task
Remote Sens. 2021, 13(23), 4786; https://doi.org/10.3390/rs13234786 - 26 Nov 2021
Abstract
As satellite observation technology rapidly develops, the number of remote sensing (RS) images increases dramatically, making RS image retrieval more challenging in terms of speed and accuracy. Recently, an increasing number of researchers have turned their attention to this issue, and hashing algorithms, which map real-valued data onto a low-dimensional Hamming space, have been widely utilized to respond quickly to large-scale RS image search tasks. However, most existing hashing algorithms only emphasize preserving point-wise or pair-wise similarity, which may lead to inferior approximate nearest neighbor (ANN) search results. To address this problem, we propose a novel triplet ordinal cross entropy hashing (TOCEH) method. In TOCEH, to enhance the ability to preserve ranking orders across different spaces, we establish a tensor graph representing the Euclidean triplet ordinal relationships among RS images and minimize the cross entropy between the probability distribution of the established Euclidean similarity graph and that of the Hamming triplet ordinal relation with the given binary codes. During training, to avoid the non-deterministic polynomial (NP) hard problem, we utilize a continuous function instead of the discrete encoding process. Furthermore, we design a quantization objective function based on the principle of preserving the triplet ordinal relation, to minimize the loss caused by the continuous relaxation. Comparative RS image retrieval experiments were conducted on three publicly available datasets: the UC Merced Land Use Dataset (UCMD), SAT-4, and SAT-6. The experimental results show that the proposed TOCEH algorithm outperforms many existing hashing algorithms in RS image retrieval tasks.
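To make the triplet ordinal idea concrete, the following is a minimal PyTorch sketch of a loss that aligns ordinal probabilities between the Euclidean space and a relaxed Hamming space. The sigmoid formulation, the tanh relaxation, and the quantization weight of 0.1 are illustrative assumptions, not the paper's exact objective.

```python
import torch

def triplet_ordinal_ce(feats: torch.Tensor, codes: torch.Tensor) -> torch.Tensor:
    """feats: (3N, D) features; codes: (3N, K) relaxed codes in (-1, 1),
    arranged as repeating (anchor, positive, negative) triplets."""
    fa, fp, fn = feats[0::3], feats[1::3], feats[2::3]
    ca, cp, cn = codes[0::3], codes[1::3], codes[2::3]
    # Probability that the positive is the nearer neighbor, per space.
    d_euc = (fa - fp).norm(dim=1) - (fa - fn).norm(dim=1)
    p_euc = torch.sigmoid(-d_euc)
    d_ham = ((ca - cp) ** 2).sum(1) - ((ca - cn) ** 2).sum(1)
    p_ham = torch.sigmoid(-0.5 * d_ham)
    # Cross entropy between the two triplet ordinal distributions.
    ce = -(p_euc * (p_ham + 1e-8).log() + (1 - p_euc) * (1 - p_ham + 1e-8).log())
    quant = ((codes.abs() - 1) ** 2).mean()  # push relaxed codes toward {-1, +1}
    return ce.mean() + 0.1 * quant
```

In training, `codes` would come from something like `torch.tanh(hash_net(images))`, keeping the whole pipeline differentiable until the final binarization.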

Article
An Improved Swin Transformer-Based Model for Remote Sensing Object Detection and Instance Segmentation
Remote Sens. 2021, 13(23), 4779; https://doi.org/10.3390/rs13234779 - 25 Nov 2021
Abstract
Remote sensing image object detection and instance segmentation are widely studied research fields. Convolutional neural networks (CNNs) have shown shortcomings in the object detection of remote sensing images. In recent years, the number of studies on transformer-based models has increased, and these studies have achieved good results. However, transformers still suffer from poor small object detection and unsatisfactory edge detail segmentation. To solve these problems, we improved the Swin transformer by drawing on the complementary advantages of transformers and CNNs, and designed a local perception Swin transformer (LPSW) backbone to enhance the local perception of the network and improve the detection accuracy of small-scale objects. We also designed a spatial attention interleaved execution cascade (SAIEC) network framework, which strengthens the segmentation accuracy of the network. Due to the lack of remote sensing mask datasets, the MRS-1800 remote sensing mask dataset was created. Finally, we combined the proposed backbone with the new network framework and conducted experiments on the MRS-1800 dataset. Compared with the Swin transformer, the proposed model improved mask AP by 1.7%, mask APS by 3.6%, AP by 1.1%, and APS by 4.6%, demonstrating its effectiveness and feasibility.
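As a rough illustration of the local-perception idea, the block below adds a residual depthwise convolution to the token sequence of a window-attention stage so that neighboring pixels interact directly. This is a hedged sketch of the concept, not the published LPSW design.

```python
import torch.nn as nn

class LocalPerceptionUnit(nn.Module):
    """Residual depthwise-conv branch grafted onto a transformer stage."""
    def __init__(self, dim: int):
        super().__init__()
        # Depthwise 3x3 convolution captures local spatial structure per channel.
        self.dw = nn.Conv2d(dim, dim, kernel_size=3, padding=1, groups=dim)

    def forward(self, x, h: int, w: int):
        # x: (B, h*w, C) token sequence from a Swin-style stage.
        b, n, c = x.shape
        feat = x.transpose(1, 2).reshape(b, c, h, w)
        feat = feat + self.dw(feat)  # residual local enhancement
        return feat.reshape(b, c, n).transpose(1, 2)
```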

Article
A Dense Encoder–Decoder Network with Feedback Connections for Pan-Sharpening
Remote Sens. 2021, 13(22), 4505; https://doi.org/10.3390/rs13224505 - 09 Nov 2021
Abstract
To meet the need for multispectral images with high spatial resolution in practical applications, we propose a dense encoder–decoder network with feedback connections for pan-sharpening. Our network consists of four parts. The first part consists of two identical subnetworks that extract features from the PAN and MS images, respectively. The second part is an efficient feature-extraction stage: because we want the network to focus on features at different scales, we propose multiscale feature-extraction blocks that fully extract effective features from networks of various depths and widths, using three multiscale feature-extraction blocks and two long-skip connections. The third part is the feature fusion and recovery network; inspired by work on U-Net improvements, we propose a new encoder structure with dense connections that improves performance through effective connections between encoders and decoders at different scales. The fourth part is a continuous feedback connection that refines shallow features, enabling the network to obtain better reconstruction capabilities earlier. To demonstrate the effectiveness of our method, we performed several experiments. Experiments on various satellite datasets show that the proposed method outperforms existing methods, with significant improvements over other models in terms of the multiple target index values used to measure the spectral quality and spatial details of the generated images.
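The feedback mechanism can be pictured as unrolling a fusion block for a few steps and feeding its deep output back to refine the shallow features. The sketch below, with an assumed zero-initialized feedback state and simple channel concatenation, illustrates the control flow only, not the paper's dense architecture.

```python
import torch
import torch.nn as nn

class FeedbackFusion(nn.Module):
    """Unrolled fusion block whose output is fed back as a hidden state."""
    def __init__(self, ch: int, steps: int = 3):
        super().__init__()
        self.steps = steps
        self.body = nn.Sequential(
            nn.Conv2d(2 * ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1))

    def forward(self, shallow):
        hidden = torch.zeros_like(shallow)  # feedback state starts empty
        outputs = []
        for _ in range(self.steps):
            # Concatenate shallow features with the fed-back deep features.
            hidden = self.body(torch.cat([shallow, hidden], dim=1))
            outputs.append(hidden)  # per-step outputs allow deep supervision
        return outputs
```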

Article
DisasterGAN: Generative Adversarial Networks for Remote Sensing Disaster Image Generation
Remote Sens. 2021, 13(21), 4284; https://doi.org/10.3390/rs13214284 - 25 Oct 2021
Abstract
Rapid progress in disaster detection and assessment has been achieved with the development of deep learning techniques and the wide application of remote sensing images. However, it is still a great challenge to train an accurate and robust disaster detection network due to the class imbalance of existing datasets and the lack of training data. This paper aims to synthesize remote sensing disaster images covering multiple disaster types and different levels of building damage with generative adversarial networks (GANs), making up for the shortcomings of existing datasets. However, existing models are inefficient at multi-disaster image translation due to the diversity of disasters, and they inevitably change building-irrelevant regions because they operate directly on the whole image. Thus, we propose two models: disaster translation GAN generates disaster images for multiple disaster types with a single model, using an attribute to represent the disaster type and a reconstruction process to further ensure the quality of the generator; damaged building generation GAN is a mask-guided image generation model that alters only the attribute-specific region while keeping the attribute-irrelevant region unchanged. Qualitative and quantitative experiments demonstrate the validity of the proposed methods. Further experiments on a damaged building assessment model show the effectiveness of the proposed models and their superiority over other data augmentation methods.
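The mask-guided behavior can be summarized by a simple blending rule: only the masked building region is taken from the generator, and everything else is copied from the input. The function below is an assumed formalization of that constraint, not the paper's full model.

```python
import torch

def mask_guided_output(real: torch.Tensor, fake: torch.Tensor,
                       mask: torch.Tensor) -> torch.Tensor:
    """real, fake: (B, C, H, W) images; mask: (B, 1, H, W), 1 = building region.
    The generator rewrites only the masked region; the rest passes through."""
    return mask * fake + (1.0 - mask) * real
```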

Article
SGA-Net: Self-Constructing Graph Attention Neural Network for Semantic Segmentation of Remote Sensing Images
Remote Sens. 2021, 13(21), 4201; https://doi.org/10.3390/rs13214201 - 20 Oct 2021
Abstract
Semantic segmentation of remote sensing images is a critical and challenging task. Graph neural networks, which can capture global contextual representations, can exploit long-range pixel dependencies and thereby improve semantic segmentation performance. In this paper, a novel self-constructing graph attention neural network is proposed for this purpose. First, ResNet50 is employed as the backbone of a feature extraction network to acquire feature maps of remote sensing images. Second, pixel-wise dependency graphs are constructed from the feature maps, and a graph attention network is designed to extract the correlations among pixels. Third, a channel linear attention mechanism obtains the channel dependencies of the images, further improving the semantic segmentation predictions. Comprehensive experiments show that the proposed model consistently outperforms state-of-the-art methods on two widely used remote sensing image datasets.
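A self-constructed pixel graph can be sketched in a few lines: an adjacency matrix is derived from feature similarity and then used for one round of attention-style message passing. The normalized dot-product construction below is an assumption about the general idea, not the paper's exact module.

```python
import torch

def self_constructing_graph_attention(feat: torch.Tensor) -> torch.Tensor:
    """feat: (B, C, H, W) backbone feature map -> graph-refined feature map."""
    b, c, h, w = feat.shape
    nodes = feat.flatten(2).transpose(1, 2)  # (B, H*W, C) pixel nodes
    # Build the graph from pairwise feature similarity, softmax-normalized.
    attn = torch.softmax(nodes @ nodes.transpose(1, 2) / c ** 0.5, dim=-1)
    out = attn @ nodes                       # aggregate long-range neighbors
    return out.transpose(1, 2).reshape(b, c, h, w)
```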

Article
SSSGAN: Satellite Style and Structure Generative Adversarial Networks
Remote Sens. 2021, 13(19), 3984; https://doi.org/10.3390/rs13193984 - 05 Oct 2021
Abstract
This work presents the Satellite Style and Structure Generative Adversarial Network (SSSGAN), a generative model of high-resolution satellite imagery to support image segmentation. Based on spatially adaptive denormalization (SPADE) modules that modulate the activations with respect to the segmentation map structure, together with global descriptor vectors that capture semantic information with respect to OpenStreetMap (OSM) classes, the model produces consistent aerial imagery. By decoupling the generation of aerial images into a structure map and a carefully defined style vector, we improve the realism and geodiversity of the synthesis with respect to the state-of-the-art baseline. The proposed model therefore allows us to control generation not only with respect to the desired structure but also with respect to a geographic area.
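Since SPADE is a published building block, a faithful miniature is possible: the normalized activations are modulated by per-pixel scale and bias maps predicted from the segmentation map, so the structure map steers generation spatially. Layer widths below are illustrative.

```python
import torch.nn as nn
import torch.nn.functional as F

class SPADE(nn.Module):
    """Spatially adaptive denormalization conditioned on a segmentation map."""
    def __init__(self, feat_ch: int, seg_ch: int, hidden: int = 64):
        super().__init__()
        self.norm = nn.BatchNorm2d(feat_ch, affine=False)  # parameter-free norm
        self.shared = nn.Sequential(
            nn.Conv2d(seg_ch, hidden, 3, padding=1), nn.ReLU(inplace=True))
        self.gamma = nn.Conv2d(hidden, feat_ch, 3, padding=1)
        self.beta = nn.Conv2d(hidden, feat_ch, 3, padding=1)

    def forward(self, x, segmap):
        seg = F.interpolate(segmap, size=x.shape[2:], mode="nearest")
        h = self.shared(seg)
        # Per-pixel scale and bias predicted from the structure map.
        return self.norm(x) * (1 + self.gamma(h)) + self.beta(h)
```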

Article
Fast and High-Quality 3-D Terahertz Super-Resolution Imaging Using Lightweight SR-CNN
Remote Sens. 2021, 13(19), 3800; https://doi.org/10.3390/rs13193800 - 22 Sep 2021
Abstract
High-quality three-dimensional (3-D) radar imaging is one of the challenging problems in radar imaging enhancement. Existing sparsity regularizations are limited by their heavy computational burden and time-consuming iterative operations. Compared with conventional sparsity regularizations, super-resolution (SR) imaging methods based on convolutional neural networks (CNNs) can shorten imaging time and achieve higher accuracy. However, they are confined to 2-D space, and model training on small datasets has not been adequately considered. To solve these problems, a fast and high-quality 3-D terahertz radar imaging method based on a lightweight super-resolution CNN (SR-CNN) is proposed in this paper. First, an original 3-D radar echo model is presented, and the expected SR model is derived from the given imaging geometry. Second, an SR imaging method based on the lightweight SR-CNN is proposed to improve image quality and speed up imaging. Furthermore, the resolution characteristics of spectrum estimation, sparsity regularization, and the SR-CNN are analyzed via the point spread function (PSF). Finally, electromagnetic computation simulations are carried out to validate the effectiveness of the proposed method in terms of image quality. The robustness against noise and the stability under small datasets are demonstrated by ablation experiments.
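The speed advantage over iterative sparsity solvers comes from replacing the iteration with a single forward pass. A minimal SRCNN-style sketch of such a lightweight network is shown below; the three-layer shape and channel counts are assumptions, not the paper's exact architecture.

```python
import torch.nn as nn

# Lightweight single-pass super-resolution network (SRCNN-style sketch).
lightweight_sr = nn.Sequential(
    nn.Conv2d(1, 64, kernel_size=9, padding=4), nn.ReLU(inplace=True),  # feature extraction
    nn.Conv2d(64, 32, kernel_size=1), nn.ReLU(inplace=True),            # nonlinear mapping
    nn.Conv2d(32, 1, kernel_size=5, padding=2))                         # reconstruction
```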

Article
Predicting Arbitrary-Oriented Objects as Points in Remote Sensing Images
Remote Sens. 2021, 13(18), 3731; https://doi.org/10.3390/rs13183731 - 17 Sep 2021
Abstract
To detect rotated objects in remote sensing images, researchers have proposed a series of arbitrary-oriented object detection methods that place multiple anchors with different angles, scales, and aspect ratios on the images. However, a major difference between remote sensing images and natural images is the small probability of overlap between objects of the same category, so anchor-based designs introduce much redundancy during detection. In this paper, we convert the detection problem into a center point prediction problem, allowing the pre-defined anchors to be discarded. By directly predicting the center point, orientation, and corresponding height and width of an object, our method simplifies the model design and reduces anchor-related computations. To further fuse multi-level features and obtain accurate object centers, a deformable feature pyramid network is proposed to detect objects under complex backgrounds and the various orientations of rotated objects. Experiments and analysis on two remote sensing datasets, DOTA and HRSC2016, demonstrate the effectiveness of our approach. Our best model, equipped with Deformable-FPN, achieved 74.75% mAP on DOTA and 96.59% on HRSC2016 with a single-stage model and single-scale training and testing. By detecting arbitrarily oriented objects from their centers, the proposed model performs competitively against oriented anchor-based methods.
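Anchor-free decoding can be sketched as follows: each peak in a center heatmap yields one rotated box read off from regressed size and angle maps, so no pre-defined anchors are needed. Tensor layouts here are assumptions for illustration.

```python
import torch

def decode_centers(heatmap: torch.Tensor, wh: torch.Tensor,
                   angle: torch.Tensor, k: int = 100):
    """heatmap: (H, W) center scores; wh: (2, H, W); angle: (1, H, W).
    Returns (k, 5) rotated boxes (cx, cy, w, h, theta) and their scores."""
    h, w = heatmap.shape
    scores, idx = heatmap.flatten().topk(k)          # top-k center candidates
    ys = torch.div(idx, w, rounding_mode="floor")
    xs = idx % w
    boxes = torch.stack([xs.float(), ys.float(),
                         wh[0, ys, xs], wh[1, ys, xs],
                         angle[0, ys, xs]], dim=1)
    return boxes, scores
```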

Article
Learning Rotated Inscribed Ellipse for Oriented Object Detection in Remote Sensing Images
Remote Sens. 2021, 13(18), 3622; https://doi.org/10.3390/rs13183622 - 10 Sep 2021
Abstract
Oriented object detection in remote sensing images (RSIs) is a significant yet challenging Earth Vision task, as objects in RSIs usually appear against complicated backgrounds with arbitrary orientations, multi-scale distributions, and dramatic aspect ratio variations. Existing oriented object detectors mostly inherit the anchor-based paradigm. However, the prominent performance of high-precision, real-time anchor-based detectors is overshadowed by the design limitations of tediously rotated anchors. Exploiting the simplicity and efficiency of keypoint-based detection, we extend a keypoint-based detector to oriented object detection in RSIs. Specifically, we first simplify the oriented bounding box (OBB) as a center-based rotated inscribed ellipse (RIE) and then employ six parameters to represent the RIE inside each OBB: the center point position of the RIE, the offsets of the long half-axis, the length of the short half-axis, and an orientation label. In addition, to handle complex backgrounds and large-scale variations, a high-resolution gated aggregation network (HRGANet) is designed to identify targets of interest against complex backgrounds and fuse multi-scale features using a gated aggregation model (GAM). Furthermore, by analyzing the influence of eccentricity on orientation error, an eccentricity-wise orientation loss (ewoLoss) is proposed that weights the orientation penalty by the eccentricity of the RIE, which effectively improves the detection accuracy of oriented objects with large aspect ratios. Extensive experimental results on the DOTA and HRSC2016 datasets demonstrate the effectiveness of the proposed method.
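To illustrate the six-parameter encoding, the helper below maps an oriented box (cx, cy, w, h, theta) to an ellipse center, a long-half-axis offset vector, the short half-axis length, and a binary orientation label. The exact label convention is an assumption on my part.

```python
import math

def obb_to_rie(cx: float, cy: float, w: float, h: float, theta: float):
    """Encode an oriented bounding box as a rotated inscribed ellipse (RIE)."""
    long_ax, short_ax = max(w, h) / 2.0, min(w, h) / 2.0
    phi = theta if w >= h else theta + math.pi / 2.0  # long-axis direction
    dx, dy = long_ax * math.cos(phi), long_ax * math.sin(phi)  # half-axis offsets
    orient = 1 if math.sin(phi) >= 0 else 0  # assumed binary orientation label
    return cx, cy, dx, dy, short_ax, orient
```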

Article
Split-Attention Networks with Self-Calibrated Convolution for Moon Impact Crater Detection from Multi-Source Data
Remote Sens. 2021, 13(16), 3193; https://doi.org/10.3390/rs13163193 - 12 Aug 2021
Abstract
Impact craters are the most prominent features on the surfaces of the Moon, Mars, and Mercury. They play an essential role in constructing lunar bases, dating Mars and Mercury, and exploring the surfaces of other celestial bodies. Traditional crater detection algorithms (CDAs) are mainly based on manual interpretation combined with classical image processing techniques and are inefficient for detecting smaller or overlapping impact craters. In this paper, we propose a Split-Attention Network with Self-Calibrated Convolution (SCNeSt), in which channel-wise attention with multi-path representation and self-calibrated convolutions generate richer and more discriminative feature representations. The algorithm first extracts the crater feature model under the well-known R-FCN object detection framework. The trained models are then applied to detect impact craters on Mercury and Mars via transfer learning. In the lunar impact crater detection experiment, we extracted a total of 157,389 impact craters with diameters between 0.6 and 860 km. Our proposed model outperforms the ResNet, ResNeXt, ScNet, and ResNeSt models in terms of recall and accuracy, and is more efficient than the other residual network models. Without training on Mars or Mercury remote sensing data, our model can also identify craters of different scales, demonstrating outstanding robustness and transferability.
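A self-calibrated convolution enlarges each unit's field of view by computing a gate in a downsampled space and applying it at full resolution. The block below follows the general SCConv recipe; pooling size and channel layout are illustrative rather than the SCNeSt configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfCalibratedConv(nn.Module):
    """Convolution gated by features computed in a downsampled space."""
    def __init__(self, ch: int, pool: int = 4):
        super().__init__()
        self.pool = pool
        self.k2 = nn.Conv2d(ch, ch, 3, padding=1)  # calibration at small scale
        self.k3 = nn.Conv2d(ch, ch, 3, padding=1)  # main transform
        self.k4 = nn.Conv2d(ch, ch, 3, padding=1)  # output transform

    def forward(self, x):
        small = F.avg_pool2d(x, self.pool)
        cal = F.interpolate(self.k2(small), size=x.shape[2:],
                            mode="bilinear", align_corners=False)
        gate = torch.sigmoid(x + cal)              # self-calibration gate
        return self.k4(self.k3(x) * gate)
```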

Article
Variational Generative Adversarial Network with Crossed Spatial and Spectral Interactions for Hyperspectral Image Classification
Remote Sens. 2021, 13(16), 3131; https://doi.org/10.3390/rs13163131 - 07 Aug 2021
Abstract
Variational autoencoders (VAEs) and generative adversarial networks (GANs) have been widely used in hyperspectral image classification (HSIC) tasks. However, the HSI virtual samples generated by VAEs are often ambiguous, and GANs are prone to mode collapse, which ultimately leads to poor generalization. Moreover, most of these models consider only the extraction of spectral or spatial features; they fail to combine the two branches interactively and ignore the correlation between them. Consequently, a variational generative adversarial network with crossed spatial and spectral interactions (CSSVGAN) is proposed in this paper. It includes a dual-branch variational encoder that maps spectral and spatial information to different latent spaces, a crossed interactive generator that improves the quality of the generated virtual samples, and a discriminator coupled with a classifier to enhance classification performance. Combining these three subnetworks, the proposed CSSVGAN achieves excellent classification by ensuring diversity and interacting spectral and spatial features in a crossed manner. Superior experimental results on three datasets verify the effectiveness of the method.
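The dual-branch variational encoder presumably relies on the standard reparameterization trick, with spectral and spatial branches each producing their own latent distribution. A minimal sketch of that shared mechanism is given below; the branch architectures are omitted as assumptions.

```python
import torch

def reparameterize(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    """Differentiably sample z ~ N(mu, sigma^2) from encoder outputs."""
    std = torch.exp(0.5 * logvar)
    return mu + std * torch.randn_like(std)

# z_spectral = reparameterize(mu_spec, logvar_spec)  # from the spectral branch
# z_spatial = reparameterize(mu_spat, logvar_spat)   # from the spatial branch
```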

Article
An Attention-Guided Multilayer Feature Aggregation Network for Remote Sensing Image Scene Classification
Remote Sens. 2021, 13(16), 3113; https://doi.org/10.3390/rs13163113 - 06 Aug 2021
Abstract
Remote sensing image scene classification (RSISC) has broad application prospects, but related challenges still exist and urgently need to be addressed. One of the most important is how to learn a strongly discriminative scene representation. Recently, convolutional neural networks (CNNs) have shown great potential in RSISC due to their powerful feature learning ability; however, their performance may be restricted by the complexity of remote sensing images, including spatial layout, varying scales, complex backgrounds, and category diversity. In this paper, we propose an attention-guided multilayer feature aggregation network (AGMFA-Net) that improves scene classification performance by effectively aggregating features from different layers. Specifically, to reduce the discrepancies between layers, we employ channel–spatial attention on multiple high-level convolutional feature maps to more accurately capture the semantic regions that correspond to the content of a given scene. We then use the learned semantic regions as guidance to aggregate valuable information from multilayer convolutional features, yielding stronger scene features for classification. Experimental results on three remote sensing scene datasets indicate that our approach achieves competitive classification performance in comparison to the baselines and other state-of-the-art methods.
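Channel–spatial attention over a convolutional feature map is commonly realized CBAM-style: channel weights first, then a spatial mask. The block below sketches that common realization and is not necessarily the AGMFA-Net module.

```python
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    """Channel attention followed by spatial attention on a feature map."""
    def __init__(self, ch: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(ch, ch // reduction), nn.ReLU(inplace=True),
                                 nn.Linear(ch // reduction, ch))
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        # Channel weights from globally pooled descriptors.
        ca = torch.sigmoid(self.mlp(x.mean(dim=(2, 3)))).view(b, c, 1, 1)
        x = x * ca
        # Spatial mask from channel-wise mean and max maps.
        sa = torch.sigmoid(self.spatial(
            torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)))
        return x * sa
```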

Article
Learning the Incremental Warp for 3D Vehicle Tracking in LiDAR Point Clouds
Remote Sens. 2021, 13(14), 2770; https://doi.org/10.3390/rs13142770 - 14 Jul 2021
Abstract
Object tracking from LiDAR point clouds, which are typically incomplete, sparse, and unstructured, plays a crucial role in urban navigation. Some existing methods rely solely on a learned similarity network for locating the target, which limits further advances in tracking accuracy. In this study, we leverage a powerful target discriminator and an accurate state estimator to robustly track target objects in challenging point cloud scenarios. Considering the complexity of state estimation, we extend the traditional Lucas and Kanade (LK) algorithm to 3D point cloud tracking. Specifically, we propose a state estimation subnetwork that learns the incremental warp for updating the coarse target state. Moreover, to obtain the coarse state, we present a simple yet efficient discrimination subnetwork that projects 3D shapes into a more discriminative latent space by integrating the global feature into each point-wise feature. Experiments on the KITTI and PandaSet datasets show that, compared with the most advanced existing methods, our proposed method achieves significant improvements, up to 13.68% on KITTI.
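The LK-style refinement loop can be sketched as repeatedly predicting an incremental warp and composing it with the current state until it converges. In the sketch below, `delta_net` is an assumed placeholder for the paper's state estimation subnetwork, and additive composition of a rigid state (x, y, z, yaw) is an illustrative simplification.

```python
import torch

def refine_state(points: torch.Tensor, template: torch.Tensor,
                 state: torch.Tensor, delta_net, iters: int = 3) -> torch.Tensor:
    """Iteratively refine a coarse target state, Lucas-Kanade style."""
    for _ in range(iters):
        delta = delta_net(points, template, state)  # predicted increment
        state = state + delta                       # compose the warp update
    return state
```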

Article
Improved YOLO Network for Free-Angle Remote Sensing Target Detection
Remote Sens. 2021, 13(11), 2171; https://doi.org/10.3390/rs13112171 - 01 Jun 2021
Abstract
Despite significant progress in object detection tasks, remote sensing target detection remains challenging owing to complex backgrounds, large differences in target sizes, and the uneven distribution of rotated objects. In this study, we consider model accuracy, inference speed, and the detection of objects at any angle. We propose a RepVGG-YOLO network that uses an improved RepVGG model as the backbone feature extraction network, performing initial feature extraction from the input image while balancing training accuracy and inference speed. We use an improved feature pyramid network (FPN) and path aggregation network (PANet) to reprocess the features output by the backbone. The FPN and PANet modules integrate feature maps of different layers, combine context information at multiple scales, accumulate multiple features, and strengthen feature extraction. To maximize detection accuracy for objects of all sizes, we use four detection scales at the network output to enhance feature extraction for small remote sensing targets. To handle objects at arbitrary angles, we improve the classification loss function with circular smooth label technology, turning the angle regression problem into a classification problem and increasing the detection accuracy of objects at any angle. We conducted experiments on two public datasets, DOTA and HRSC2016; the results show that the proposed method performs better than previous methods.
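Circular smooth label (CSL) encoding turns the angle into a classification target smoothed by a circular window, so nearby angle bins share probability mass and the wrap-around at the angular boundary is handled. The sketch below uses a Gaussian window; the bin count and window radius are illustrative assumptions.

```python
import numpy as np

def circular_smooth_label(angle_deg: float, bins: int = 180,
                          radius: float = 6.0) -> np.ndarray:
    """Encode an angle as a circularly smoothed classification target."""
    centers = np.arange(bins)
    diff = np.abs(centers - angle_deg % bins)
    diff = np.minimum(diff, bins - diff)           # circular distance between bins
    label = np.exp(-(diff ** 2) / (2 * radius ** 2))
    label[diff > radius] = 0.0                     # window the Gaussian
    return label
```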

Other


Technical Note
NDFTC: A New Detection Framework of Tropical Cyclones from Meteorological Satellite Images with Deep Transfer Learning
Remote Sens. 2021, 13(9), 1860; https://doi.org/10.3390/rs13091860 - 10 May 2021
Abstract
Accurate detection of tropical cyclones (TCs) is important to prevent and mitigate the natural disasters associated with them. Deep transfer learning methods have advantages in detection tasks because they can further improve the stability and accuracy of the detection model. Therefore, on the basis of deep transfer learning, we propose a new detection framework for tropical cyclones (NDFTC) in meteorological satellite images that combines deep convolutional generative adversarial networks (DCGAN) with the You Only Look Once (YOLO) v3 model. The NDFTC pipeline consists of three major steps: data augmentation, a pre-training phase, and transfer learning. First, to improve the utilization of finite data, DCGAN is used as the data augmentation method to generate simulated TC images. Second, to extract the salient characteristics of TCs, the generated images are fed to the YOLOv3 detection model in the pre-training phase. Third, following the network-based deep transfer learning method, we train the detection model on real TC images, with its initial weights transferred from the YOLOv3 model trained on generated images. Training on real images helps extract universal characteristics of TCs, and using transferred weights as initialization improves the stability and accuracy of the model. The experimental results show that the NDFTC performs better, with an accuracy (ACC) of 97.78% and average precision (AP) of 81.39%, than YOLOv3, with an ACC of 93.96% and AP of 80.64%.
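The transfer step itself is simple to picture: weights from a detector pre-trained on DCGAN-generated TC images initialize fine-tuning on real images. A minimal sketch follows, assuming `model` is a YOLOv3 network object; the checkpoint path is a placeholder.

```python
import torch

def init_from_pretraining(model, ckpt_path: str = "yolov3_generated_pretrain.pt"):
    """Load weights learned on generated TC images as the starting point."""
    state = torch.load(ckpt_path, map_location="cpu")  # pre-training weights
    model.load_state_dict(state, strict=False)         # transfer as initial weights
    return model  # then fine-tune on the real TC image dataset
```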
