Special Issue "Computer Vision and Deep Learning for Remote Sensing Applications"

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Remote Sensing Image Processing".

Deadline for manuscript submissions: 31 August 2021.

Special Issue Editors

Dr. Hyungtae Lee
Guest Editor
Army Research Lab./Booz Allen Hamilton Inc., 2800 Powder Mill Rd., Adelphi, MD 20783, USA
Interests: computer vision; machine learning; deep learning and AI

Dr. Sungmin Eum
Guest Editor
Army Research Lab./Booz Allen Hamilton Inc., 2800 Powder Mill Rd., Adelphi, MD 20783, USA
Interests: computer vision; machine learning; deep learning and AI

Dr. Claudio Piciarelli
Guest Editor
Associate Professor, University of Udine, via delle Scienze 206, 33100 Udine, Italy
Interests: computer vision; pattern recognition; machine learning; deep learning; sensor reconfiguration; anomaly detection

Special Issue Information

Dear Colleagues,

Today, computer vision and deep learning are rapidly expanding into many application areas, including remote sensing, thanks to their remarkable performance. In remote sensing in particular, a myriad of challenges stemming from difficult data acquisition and annotation remain unsolved. The remote sensing community is waiting for breakthroughs that address these challenges by leveraging high-performance deep learning-based models, which typically require large-scale annotated datasets.

This Special Issue seeks such breakthroughs, focusing on advances in remote sensing achieved with computer vision, deep learning, and artificial intelligence. Although the scope is broad, contributions with a specific focus are expected.

For this Special Issue, we welcome the most recent advancements related, but not limited, to:

* Deep learning architectures for remote sensing

* Machine learning for remote sensing

* Computer vision methods for remote sensing

* Classification / Detection / Regression

* Unsupervised feature learning for remote sensing

* Domain adaptation and transfer learning with computer vision and deep learning for remote sensing

* Anomaly/novelty detection for remote sensing

* New datasets and tasks for remote sensing

* Remote sensing data analysis

* New remote sensing applications

* Synthetic remote sensing data generation

* Real-time remote sensing

* Deep learning-based image registration

Dr. Hyungtae Lee
Dr. Sungmin Eum
Dr. Claudio Piciarelli
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website; once registered, authors can proceed to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Deep learning
  • Computer vision
  • Remote sensing
  • Hyperspectral image
  • Supervised / Semi-supervised / Unsupervised learning
  • Classification / Detection / Regression
  • Domain adaptation / Transfer learning
  • Data analysis
  • Synthetic data
  • Generative models

Published Papers (15 papers)


Research


Open Access Article
LighterGAN: An Illumination Enhancement Method for Urban UAV Imagery
Remote Sens. 2021, 13(7), 1371; https://doi.org/10.3390/rs13071371 - 02 Apr 2021
Abstract
In unmanned aerial vehicle (UAV)-based urban observation and monitoring, the performance of computer vision algorithms is inevitably limited by degradation caused by low illumination and light pollution; image enhancement is therefore an important prerequisite for subsequent image processing algorithms. We propose a deep learning, generative adversarial network-based model for UAV low-illumination image enhancement, named LighterGAN. The design of LighterGAN follows the CycleGAN model, with two improvements, an attention mechanism and a semantic consistency loss, added to the original structure. An unpaired dataset captured by urban UAV aerial photography was used to train this unsupervised learning model. To explore the advantages of the improvements, both the illumination enhancement performance and the improved generalization ability of LighterGAN were demonstrated in comparative experiments combining subjective and objective evaluations. In experiments against five cutting-edge image enhancement algorithms on the test set, LighterGAN achieved the best results in both visual perception and PIQE (perception-based image quality evaluator, a MATLAB built-in function; the lower the score, the higher the image quality) score of enhanced images, 4.91 and 11.75 respectively, better than the state-of-the-art EnlightenGAN. On the low-illumination sub-dataset Y (containing 2000 images), LighterGAN also achieved the lowest PIQE score of 12.37, 2.85 points lower than second place. Moreover, compared with CycleGAN, the improvement in generalization ability was also demonstrated: on test-set generated images, LighterGAN was 6.66 percent higher than CycleGAN in subjective authenticity assessment and 3.84 points lower in PIQE score, while on generated images from the whole dataset, the PIQE score of LighterGAN was 11.67, 4.86 points lower than CycleGAN.
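
Below is a minimal sketch of the semantic consistency loss idea described above, assuming a frozen ImageNet-pretrained VGG backbone as the feature extractor; the layer choice and function names are illustrative, not taken from the paper.

```python
# Minimal sketch of a semantic consistency loss for a CycleGAN-style
# enhancer: mid-level features of the low-light input and the enhanced
# output, computed by a frozen pretrained backbone, are pushed to agree
# so that enhancement preserves scene content. Layer choice is an assumption.
import torch
import torch.nn.functional as F
import torchvision.models as models

feature_net = models.vgg16(pretrained=True).features[:16].eval()
for p in feature_net.parameters():
    p.requires_grad = False  # the feature extractor is never updated

def semantic_consistency_loss(low_light, enhanced):
    with torch.no_grad():
        f_in = feature_net(low_light)   # target features, no gradient
    f_out = feature_net(enhanced)       # gradients flow to the generator
    return F.l1_loss(f_out, f_in)
```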

Open Access Article
Knowledge and Spatial Pyramid Distance-Based Gated Graph Attention Network for Remote Sensing Semantic Segmentation
Remote Sens. 2021, 13(7), 1312; https://doi.org/10.3390/rs13071312 - 30 Mar 2021
Abstract
Pixel-based semantic segmentation methods take pixels as recognition units and are restricted by the limited range of receptive fields, so they cannot carry richer, higher-level semantics. This reduces the accuracy of remote sensing (RS) semantic segmentation to a certain extent. Compared with pixel-based methods, graph neural networks (GNNs) usually use objects as input nodes, so they not only have relatively low computational complexity but can also carry richer semantic information. However, traditional GNNs rely more on the context information of individual samples and lack the geographic prior knowledge that reflects the overall situation of the research area. Therefore, these methods may be disturbed in some areas by the confusion of "different objects with the same spectrum" or by violations of the first law of geography. To address these problems, we propose a remote sensing semantic segmentation model called the knowledge and spatial pyramid distance-based gated graph attention network (KSPGAT), based on prior knowledge, spatial pyramid distance, and a graph attention network (GAT) with a gating mechanism. The model first uses superpixels (geographical objects) to form the nodes of a graph neural network and then uses a novel spatial pyramid distance recognition algorithm to recognize spatial relationships. Finally, based on the integration of feature similarity and the spatial relationships of geographic objects, a multi-source attention mechanism and a gating mechanism are designed to control the node aggregation process; as a result, high-level semantics, spatial relationships and prior knowledge can be introduced into a remote sensing semantic segmentation network. The experimental results show that our model improves the overall accuracy by 4.43% compared with the U-Net network and by 3.80% compared with the baseline GAT network.
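
A rough, hedged sketch of the gated attention aggregation described above (an illustration under assumptions, not the published KSPGAT code): attention scores mix feature similarity with a spatial-relationship prior, and a learned gate controls how much neighbor information each superpixel node absorbs.

```python
# Sketch of gated graph attention over superpixel nodes: attention scores
# combine feature similarity with a spatial-relationship prior, and a
# sigmoid gate controls how much aggregated neighbor information enters
# each node. Shapes and names are illustrative assumptions.
import torch
import torch.nn as nn

class GatedGraphAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.attn = nn.Linear(2 * dim, 1)    # feature-similarity attention
        self.gate = nn.Linear(2 * dim, dim)  # gate on the aggregated message

    def forward(self, x, adj, spatial_prior):
        # x: (N, dim) node features; adj (0/1, with self-loops) and
        # spatial_prior (positive weights): (N, N)
        N = x.size(0)
        pairs = torch.cat([x.unsqueeze(1).expand(N, N, -1),
                           x.unsqueeze(0).expand(N, N, -1)], dim=-1)
        scores = self.attn(pairs).squeeze(-1) + torch.log(spatial_prior + 1e-8)
        scores = scores.masked_fill(adj == 0, float('-inf'))
        alpha = torch.softmax(scores, dim=-1)   # normalized over neighbors
        msg = alpha @ x                         # aggregated neighbor features
        g = torch.sigmoid(self.gate(torch.cat([x, msg], dim=-1)))
        return x + g * msg                      # gated residual update
```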

Open Access Article
ICENETv2: A Fine-Grained River Ice Semantic Segmentation Network Based on UAV Images
Remote Sens. 2021, 13(4), 633; https://doi.org/10.3390/rs13040633 - 10 Feb 2021
Abstract
Accurate ice segmentation is one of the most crucial techniques for intelligent ice monitoring; compared with mere ice detection, it provides more information for ice situation analysis, change trend prediction, and so on, so its study has important practical significance. In this study, we focused on fine-grained river ice segmentation using unmanned aerial vehicle (UAV) images. This task has the following difficulties: (1) the scale of river ice varies greatly in different images and even within the same image; (2) the same kind of river ice differs greatly in color, shape, texture, size, and so on; and (3) different kinds of river ice sometimes appear similar due to their complex formation and change processes. To support this study, the NWPU_YRCC2 dataset was built, in which all UAV images were collected in the Ningxia–Inner Mongolia reach of the Yellow River. Then, a novel semantic segmentation method based on a deep convolutional neural network, named ICENETv2, is proposed. To achieve accurate multiscale prediction, we design a multilevel feature fusion framework in which multi-scale high-level semantic features and lower-level finer features are effectively fused. Additionally, a dual attention module is adopted to highlight distinguishable characteristics, and a learnable up-sampling strategy is further used to improve the segmentation accuracy of details. Experiments show that ICENETv2 achieves state-of-the-art performance on the NWPU_YRCC2 dataset. Finally, ICENETv2 is applied to a practical problem, calculating drift ice cover density, which is one of the most important factors for predicting the freeze-up date of a river. The results demonstrate that the performance of ICENETv2 meets the actual application demand.
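
As a small illustration of the drift ice cover density application mentioned above, the following sketch computes the density from a per-pixel segmentation map; the class labels are hypothetical, chosen only for illustration.

```python
# Drift ice cover density as the fraction of river-surface pixels labeled
# as drift ice, given a semantic segmentation map. Label ids are assumed.
import numpy as np

DRIFT_ICE, SHORE_ICE, WATER = 1, 2, 3  # hypothetical label ids

def drift_ice_density(label_map: np.ndarray) -> float:
    river_surface = np.isin(label_map, [DRIFT_ICE, SHORE_ICE, WATER]).sum()
    return float((label_map == DRIFT_ICE).sum()) / max(int(river_surface), 1)
```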

Open Access Article
MultEYE: Monitoring System for Real-Time Vehicle Detection, Tracking and Speed Estimation from UAV Imagery on Edge-Computing Platforms
Remote Sens. 2021, 13(4), 573; https://doi.org/10.3390/rs13040573 - 05 Feb 2021
Abstract
We present MultEYE, a traffic monitoring system that can detect, track, and estimate the velocity of vehicles in a sequence of aerial images. The presented solution has been optimized to execute these tasks in real time on an embedded computer installed on an Unmanned Aerial Vehicle (UAV). To overcome the accuracy and computational-overhead limitations of existing object detection architectures, a multi-task learning methodology was employed: a segmentation head was added to an object detector backbone, resulting in the MultEYE object detection architecture. On a custom dataset, it achieved a 4.8% higher mean Average Precision (mAP) score while being 91.4% faster than the state-of-the-art model, and it generalizes to different real-world traffic scenes. Dedicated object tracking and speed estimation algorithms were then optimized to reliably track objects from a UAV with limited computational effort. Different strategies for combining object detection, tracking, and speed estimation are also discussed. In our experiments, the optimized detector runs at an average frame rate of up to 29 frames per second (FPS) at a frame resolution of 512 × 320 on an Nvidia Xavier NX board, while the optimally combined detector, tracker, and speed estimator pipeline achieves up to 33 FPS at a resolution of 3072 × 1728. To our knowledge, the MultEYE system is one of the first traffic monitoring systems specifically designed and optimized for a UAV platform under real-world constraints.
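
A hedged sketch of the speed-estimation step described above: once a vehicle is tracked across frames, ground speed follows from pixel displacement, the ground sampling distance (GSD) of the imagery, and the frame rate. The GSD and frame-rate values are placeholders, not the paper's calibration.

```python
# Ground speed from a pixel-space track: sum per-frame displacements,
# convert pixels to meters via the GSD, and divide by elapsed time.
def speed_kmh(track_px, gsd_m=0.05, fps=29.0):
    """track_px: list of (x, y) vehicle centers in consecutive frames."""
    dist_m = sum(
        ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5 * gsd_m
        for (x1, y1), (x2, y2) in zip(track_px, track_px[1:])
    )
    seconds = (len(track_px) - 1) / fps
    return 3.6 * dist_m / seconds if seconds > 0 else 0.0
```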

Open Access Article
Sequence Image Interpolation via Separable Convolution Network
Remote Sens. 2021, 13(2), 296; https://doi.org/10.3390/rs13020296 - 15 Jan 2021
Abstract
Remote-sensing time-series data are significant for global environmental change research and a better understanding of the Earth. However, remote-sensing acquisitions often provide sparse time series due to sensor resolution limitations and environmental factors, such as cloud noise for optical data. Image interpolation is the method often used to deal with this issue. This paper presents a deep learning method, a separable convolution network for sequence image interpolation, which learns the complex mapping from predecessor and successor images to an interpolated intermediate image. The separable convolution network uses separable 1D convolution kernels instead of 2D kernels to capture the spatial characteristics of the input sequence images, and it is trained end-to-end on sequence images. Our experiments, performed on unmanned aerial vehicle (UAV) and Landsat-8 datasets, show that the method effectively produces high-quality time-series interpolated images and that the data-driven deep model can better simulate complex and diverse nonlinear image data.
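
The separable-kernel idea can be illustrated compactly (in the spirit of adaptive separable convolution, not the authors' exact network): a 2D resampling kernel is formed as the outer product of two learned 1D kernels, which is far cheaper than predicting a full 2D kernel per pixel.

```python
# One output pixel via a separable kernel: the (k, k) resampling kernel is
# the outer product of learned vertical and horizontal 1D kernels.
import torch

def apply_separable_kernel(patch, k_v, k_h):
    # patch: (k, k) neighborhood from a predecessor or successor frame
    # k_v, k_h: learned 1D kernels of length k for this output pixel
    k2d = torch.outer(k_v, k_h)   # (k, k) kernel from two 1D vectors
    return (patch * k2d).sum()    # one interpolated pixel value
```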

Open Access Article
Matching Large Baseline Oblique Stereo Images Using an End-to-End Convolutional Neural Network
Remote Sens. 2021, 13(2), 274; https://doi.org/10.3390/rs13020274 - 14 Jan 2021
Abstract
Available stereo matching algorithms produce a large number of false-positive matches, or only a few true positives, across oblique stereo images with large baselines. This undesired result is due to the complex perspective deformation and radiometric distortion across the images. To address this problem, we propose a novel affine-invariant feature matching algorithm with subpixel accuracy based on an end-to-end convolutional neural network (CNN). In our method, we adopt and modify a Hessian affine network, which we refer to as IHesAffNet, to obtain affine-invariant Hessian regions within a deep learning framework. To improve the correlation between corresponding features, we introduce an empirical weighted loss function (EWLF) based on negative samples using K nearest neighbors, and we then generate highly discriminative deep learning-based descriptors with our multiple hard network structure (MTHardNets). Following this step, conjugate features are produced by using the Euclidean distance ratio as the matching metric, and the accuracy of the matches is optimized through deep learning transform-based least squares matching (DLT-LSM). Finally, experiments on large-baseline oblique stereo images acquired from ground close range and by unmanned aerial vehicle (UAV) verify the effectiveness of the proposed approach, and comprehensive comparisons demonstrate that our matching algorithm outperforms state-of-the-art methods in terms of accuracy, distribution, and correct ratio. The main contributions of this article are: (i) the proposed MTHardNets generate high-quality descriptors; and (ii) IHesAffNet produces substantial affine-invariant corresponding features with reliable transform parameters.
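
A short sketch of the matching metric mentioned above, the Euclidean distance ratio test (Lowe's ratio test); the threshold value is an assumption for illustration.

```python
# Descriptor matching by nearest / second-nearest Euclidean distance ratio:
# a match is kept only if the best candidate is clearly closer than the
# runner-up, which suppresses ambiguous correspondences.
import numpy as np

def ratio_test_matches(desc1, desc2, ratio=0.8):
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        j, k = np.argsort(dists)[:2]          # nearest and second nearest
        if dists[j] < ratio * dists[k]:       # accept only distinctive matches
            matches.append((i, j))
    return matches
```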

Open Access Article
A Practical Cross-View Image Matching Method between UAV and Satellite for UAV-Based Geo-Localization
Remote Sens. 2021, 13(1), 47; https://doi.org/10.3390/rs13010047 - 24 Dec 2020
Abstract
Cross-view image matching has attracted extensive attention due to its huge potential applications, such as localization and navigation. Unmanned aerial vehicle (UAV) technology has developed rapidly in recent years, and people have more opportunities to obtain and use UAV-view images than ever before. However, algorithms for cross-view image matching between the UAV view (oblique view) and the satellite view (vertical view) are still in their early stage, and matching accuracy is expected to be further improved when applied in real situations. In this context, we propose a cross-view matching method based on location classification (hereinafter referred to as LCM), in which the similarity between UAV and satellite views is considered, and we implement the method on the newest UAV-based geo-localization dataset (University-1652). LCM is able to solve the imbalance in the number of input samples between satellite and UAV images. In the training stage, LCM simplifies the retrieval problem into a classification problem and considers the influence of the feature vector size on matching accuracy. Compared with a previous study, LCM shows higher accuracies: Recall@K (K ∈ {1, 5, 10}) and average precision (AP) are improved by 5–10%. The expansion of satellite-view images and the multiple queries proposed by LCM further improve matching accuracy in our experiments. In addition, the influence of different feature sizes on LCM's accuracy is determined, and we find that 512 is the optimal feature size. Finally, the LCM model trained on synthetic UAV-view images is evaluated in real-world situations, and the evaluation shows that it still has satisfactory matching accuracy. LCM realizes bidirectional matching between UAV-view and satellite-view images and can contribute to two applications: (i) UAV-view image localization (i.e., predicting the geographic location of UAV-view images based on satellite-view images with geo-tags) and (ii) UAV navigation (i.e., driving the UAV to a region of interest in the satellite-view image based on the flight record).
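
For reference, a minimal sketch of the retrieval metrics quoted above (Recall@K and AP); the input conventions are assumptions, not the paper's evaluation code.

```python
# Recall@K: does the true match appear among the top-K retrieved items?
# AP: average of the precision values at the rank of each true match.
import numpy as np

def recall_at_k(ranked_ids, true_id, k):
    return int(true_id in ranked_ids[:k])

def average_precision(ranked_ids, true_ids):
    hits, precisions = 0, []
    for rank, rid in enumerate(ranked_ids, start=1):
        if rid in true_ids:
            hits += 1
            precisions.append(hits / rank)  # precision at each hit
    return float(np.mean(precisions)) if precisions else 0.0
```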

Open Access Article
Learning to Track Aircraft in Infrared Imagery
Remote Sens. 2020, 12(23), 3995; https://doi.org/10.3390/rs12233995 - 06 Dec 2020
Abstract
Airborne target tracking in infrared imagery remains a challenging task. The airborne target usually has a low signal-to-noise ratio and shows varying visual patterns. The features adopted in visual tracking algorithms are usually deep features pre-trained on ImageNet, which are not tightly coupled with the current video domain and therefore may not be optimal for infrared target tracking. To this end, we propose a new approach to learn domain-specific features, which can be adapted to the current video online without pre-training on large datasets. Considering that only a few samples from the initial frame can be used for online training, general feature representations are encoded into the network for a better initialization. The feature learning module is flexible and can be integrated into tracking frameworks based on correlation filters to improve the baseline method. Experiments on airborne infrared imagery demonstrate the effectiveness of our tracking algorithm.
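
For context, a compact sketch of the correlation-filter component such trackers build on (a generic single-channel MOSSE-style filter, not the paper's implementation); the learned, domain-adapted features described above would replace the raw feature map passed in here.

```python
# Ridge-regression correlation filter in the Fourier domain:
# filter H* = G·conj(F) / (F·conj(F) + lambda); the response peak
# indicates the new target position.
import numpy as np

def train_filter(feat, gauss_label, lam=1e-3):
    F = np.fft.fft2(feat)
    G = np.fft.fft2(gauss_label)
    return (G * np.conj(F)) / (F * np.conj(F) + lam)

def response_map(filt, feat):
    return np.real(np.fft.ifft2(filt * np.fft.fft2(feat)))
```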

Open Access Article
Bathymetric Inversion and Uncertainty Estimation from Synthetic Surf-Zone Imagery with Machine Learning
Remote Sens. 2020, 12(20), 3364; https://doi.org/10.3390/rs12203364 - 15 Oct 2020
Cited by 1
Abstract
Resolving surf-zone bathymetry from high-resolution imagery typically involves measuring wave speeds and performing a physics-based inversion using linear wave theory, or data assimilation techniques that combine multiple remotely sensed parameters with numerical models. In this work, we explored what types of coastal imagery can best be utilized by a 2-dimensional fully convolutional neural network to directly estimate nearshore bathymetry from optical expressions of wave kinematics. Specifically, we explored time-averaged images (timex) of the surf zone, which can be used as a proxy for wave dissipation, as well as a single-frame image input, which has visible patterns of wave refraction and instantaneous expressions of wave breaking. Our results show that both types of imagery can be used to estimate nearshore bathymetry. However, the single-frame imagery provides more complete information across the domain, decreasing the error over the test set by approximately 10% relative to using timex imagery alone. A network incorporating both inputs had the best performance, with an overall root-mean-squared error of 0.39 m. Activation maps demonstrate the additional information provided by the single-frame imagery in non-breaking wave areas, which aids prediction. Uncertainty in model predictions is explored through three techniques (Monte Carlo (MC) dropout, infer-transformation, and infer-noise) to provide additional actionable information about the spatial reliability of each bathymetric prediction.
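
A minimal sketch of one of the three uncertainty techniques named above, MC dropout, assuming a model that uses dropout layers (and no batch normalization, which train() would also affect):

```python
# MC dropout: keep dropout active at inference, predict repeatedly, and
# use the per-pixel standard deviation across passes as a spatial
# reliability map for the bathymetry estimate.
import torch

def mc_dropout_predict(model, image, n_passes=30):
    model.train()  # keep dropout sampling active at inference time
    with torch.no_grad():
        preds = torch.stack([model(image) for _ in range(n_passes)])
    return preds.mean(dim=0), preds.std(dim=0)  # depth estimate, uncertainty
```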

Open Access Article
Distributed Training and Inference of Deep Learning Models for Multi-Modal Land Cover Classification
Remote Sens. 2020, 12(17), 2670; https://doi.org/10.3390/rs12172670 - 19 Aug 2020
Abstract
Deep Neural Networks (DNNs) have established themselves as a fundamental tool in numerous computational modeling applications, overcoming the challenge of defining use-case-specific feature extraction by incorporating this stage into unified, end-to-end trainable models. Despite their modeling capabilities, training large-scale DNN models is a very computation-intensive task that most single machines are incapable of accomplishing. To address this issue, different parallelization schemes have been proposed. Nevertheless, network overheads and optimal resource allocation pose major challenges, since network communication is generally slower than intra-machine communication, while some layers are more computationally expensive than others. In this work, we consider a novel multimodal DNN based on the Convolutional Neural Network architecture and explore several ways to optimize its performance when training is executed on an Apache Spark cluster. We evaluate the performance of different architectures via the metrics of network traffic and processing power, considering the case of land cover classification from remote sensing observations. Furthermore, we compare our architectures with an identical DNN architecture modeled after a data parallelization approach, using the metrics of classification accuracy and inference execution time. The experiments show that the way a model is parallelized has a tremendous effect on resource allocation and that hyperparameter tuning can reduce network overheads. Experimental results also demonstrate that the proposed model parallelization schemes achieve more efficient resource use and more accurate predictions than data parallelization approaches.
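
As a schematic illustration of the data-parallel baseline the authors compare against, the sketch below averages per-worker gradients before a shared weight update; plain NumPy stands in for the paper's Apache Spark machinery.

```python
# Synchronous data parallelism in one step: each worker computes gradients
# on its own data shard, and the averaged gradient updates shared weights.
import numpy as np

def data_parallel_step(weights, shard_grads, lr=0.01):
    # shard_grads: one gradient array per worker, all the same shape
    mean_grad = np.mean(shard_grads, axis=0)
    return weights - lr * mean_grad
```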

Open Access Article
Effective Training of Deep Convolutional Neural Networks for Hyperspectral Image Classification through Artificial Labeling
Remote Sens. 2020, 12(16), 2653; https://doi.org/10.3390/rs12162653 - 17 Aug 2020
Cited by 5
Abstract
Hyperspectral imaging is a rich source of data, allowing for a multitude of effective applications. However, such imaging remains challenging because of the large data dimension and, typically, a small pool of available training examples. While deep learning approaches have been shown to provide effective classification solutions, especially for high-dimensional problems, they unfortunately work best with many labelled examples available. The transfer learning approach can alleviate the second requirement for a particular dataset: first the network is pre-trained on some dataset with a large number of training labels available, then the actual dataset is used to fine-tune the network. This strategy is not straightforward to apply to hyperspectral images, as it is often the case that only one particular image of some type or characteristic is available. In this paper, we propose and investigate a simple and effective transfer learning strategy that uses an unsupervised pre-training step without label information. This approach can be applied to many hyperspectral classification problems. The experiments show that it is very effective at improving classification accuracy without being restricted to a particular image type or neural network architecture. The experiments were carried out on several deep neural network architectures and various sizes of labeled training sets. The greatest improvements in overall accuracy on the Indian Pines and Pavia University datasets are over 21 and 13 percentage points, respectively. An additional advantage of the proposed approach is the unsupervised nature of the pre-training step, which can be done immediately after image acquisition, without the need for potentially costly expert time.
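
A minimal sketch of the unsupervised pre-training strategy described above: an autoencoder learns to reconstruct pixel spectra without labels, and its encoder is then reused and fine-tuned on the small labeled set. Band count, layer sizes, and class count are illustrative, not the paper's values.

```python
# Label-free pre-training via spectral reconstruction, then supervised
# fine-tuning of the pretrained encoder on the few available labels.
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Linear(200, 64), nn.ReLU(), nn.Linear(64, 32))
decoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 200))

def pretrain_step(pixels, opt):
    # pixels: (batch, 200) spectra drawn from the image itself, no labels
    recon = decoder(encoder(pixels))
    loss = F.mse_loss(recon, pixels)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Afterwards, the pretrained encoder seeds a supervised classifier:
classifier = nn.Sequential(encoder, nn.Linear(32, 16))  # 16 classes assumed
```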

Open Access Editor's Choice Article
Neural Network Training for the Detection and Classification of Oceanic Mesoscale Eddies
Remote Sens. 2020, 12(16), 2625; https://doi.org/10.3390/rs12162625 - 14 Aug 2020
Cited by 2
Abstract
Recent advances in deep learning have made it possible to use neural networks for the detection and classification of oceanic mesoscale eddies from satellite altimetry data. Various neural network models have been proposed in recent years to address this challenge, but they have been trained using different types of input data and evaluated using different performance metrics, making comparisons between them impossible. In this article, we examine the most common dataset and metric choices, analyzing the reasons for the divergences between them and pointing out the most appropriate choices to obtain a fair evaluation in this scenario. Based on this comparative study, we have developed several neural network models to detect and classify oceanic eddies from satellite images, showing that our most advanced models perform better than those previously proposed in the literature.

Open Access Article
R2FA-Det: Delving into High-Quality Rotatable Boxes for Ship Detection in SAR Images
Remote Sens. 2020, 12(12), 2031; https://doi.org/10.3390/rs12122031 - 24 Jun 2020
Cited by 4
Abstract
Recently, convolutional neural network (CNN)-based methods have been extensively explored for ship detection in synthetic aperture radar (SAR) images due to their powerful feature representation abilities. However, several obstacles still hinder development. First, ships appear in various scenarios, which makes it difficult to exclude the disruption of cluttered backgrounds. Second, it is complicated to precisely locate targets with large aspect ratios, arbitrary orientations, and dense distributions. Third, the trade-off between accurate localization and detection efficiency needs to be considered. To address these issues, this paper presents a rotated refined feature alignment detector (R2FA-Det), which balances the quality of bounding box prediction with the high speed of the single-stage framework. Specifically, we first devise a lightweight non-local attention module and embed it into the stem network; the recalibration of features not only strengthens object-related features but also adequately suppresses background interference. In addition, both forms of anchors are integrated into our modified anchor mechanism, enabling better representation of densely arranged targets with a lower computation burden. Furthermore, considering the feature misalignment present in the cascaded refinement scheme, we adopt a feature-guided alignment module that encodes both the position and shape information of the current refined anchors into the feature points. Extensive experimental validations on two SAR ship datasets demonstrate that our algorithm has higher accuracy and faster speed than some state-of-the-art methods.
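
A brief sketch of rotated-box IoU, the geometric core of evaluating rotatable boxes such as those above; the (cx, cy, w, h, angle) box convention is an assumption for illustration, and shapely computes the polygon overlap.

```python
# IoU between two rotated boxes: convert each to a polygon by rotating
# the corner offsets, then take intersection area over union area.
import math
from shapely.geometry import Polygon

def rbox_to_polygon(cx, cy, w, h, angle_deg):
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    corners = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return Polygon([(cx + x * c - y * s, cy + x * s + y * c) for x, y in corners])

def rotated_iou(box1, box2):
    p1, p2 = rbox_to_polygon(*box1), rbox_to_polygon(*box2)
    inter = p1.intersection(p2).area
    return inter / (p1.area + p2.area - inter + 1e-9)
```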

Open Access Article
Residual Dense Network Based on Channel-Spatial Attention for the Scene Classification of a High-Resolution Remote Sensing Image
Remote Sens. 2020, 12(11), 1887; https://doi.org/10.3390/rs12111887 - 10 Jun 2020
Cited by 7
Abstract
Scene classification of remote sensing images, an important task in understanding the content of remote sensing imagery, has been widely used in various fields. In particular, a high-resolution remote sensing scene contains rich information and complex content. Since the scene content in a remote sensing image is tightly linked to spatial relationship characteristics, the design of an effective feature extraction network that fully mines the spatial information in a high-resolution remote sensing image directly determines the quality of classification. In recent years, convolutional neural networks (CNNs) have achieved excellent performance in remote sensing image classification; in particular the residual dense network (RDN), as one representative CNN, shows strong feature learning ability because it fully utilizes the information of all convolutional layers. We therefore design an RDN based on channel-spatial attention for scene classification of high-resolution remote sensing images. First, multi-layer convolutional features are fused with residual dense blocks. Then, a channel-spatial attention module is added to obtain more effective feature representations. Finally, a softmax classifier is applied to classify the scene, after a data augmentation strategy is adopted to meet the training requirements of the network parameters. Five experiments are conducted on the UC Merced Land-Use Dataset (UCM) and the Aerial Image Dataset (AID), and the competitive results demonstrate that our method can extract more effective features and is more conducive to scene classification.
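
A minimal CBAM-style channel-spatial attention sketch follows; it illustrates the general mechanism named in the title, though the paper's exact module may differ.

```python
# Channel attention from global pooling statistics, then spatial attention
# from pooled per-pixel channel statistics, applied sequentially.
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(channels, channels // reduction),
                                 nn.ReLU(),
                                 nn.Linear(channels // reduction, channels))
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):  # x: (B, C, H, W)
        b, c, _, _ = x.shape
        ca = torch.sigmoid(self.mlp(x.mean(dim=(2, 3))) +
                           self.mlp(x.amax(dim=(2, 3)))).view(b, c, 1, 1)
        x = x * ca  # channel attention
        sa = torch.sigmoid(self.conv(torch.cat(
            [x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1)))
        return x * sa  # spatial attention
```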

Other


Open Access Technical Note
Detection of Invasive Species in Wetlands: Practical DL with Heavily Imbalanced Data
Remote Sens. 2020, 12(20), 3431; https://doi.org/10.3390/rs12203431 - 19 Oct 2020
Cited by 4
Abstract
Deep Learning (DL) has become popular due to its ease of use and accuracy, with Transfer Learning (TL) effectively reducing the number of images needed to solve environmental problems. However, this approach has some limitations, which we set out to explore: our goal is to detect the presence of an invasive blueberry species in aerial images of wetlands. This is a key problem in ecosystem protection that is also challenging in terms of DL due to the severe imbalance present in the data. Results for the ResNet50 network show a high classification accuracy while largely ignoring the blueberry class, rendering these results of limited practical interest for detecting that specific class. However, by using loss function weighting and data augmentation, results more aligned with our practical application can be obtained. Our experiments regarding TL show that ImageNet weights do not produce satisfactory results when only the final layer of the network is trained. Furthermore, only minor gains are obtained compared with random weights when the whole network is retrained. Finally, in a study of state-of-the-art DL architectures, the best results were obtained by the ResNeXt architecture, with a 93.75% true positive rate and 98.11% accuracy for the blueberry class, with ResNet50, DenseNet, and WideResNet obtaining close results.
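
A small sketch of the loss-function weighting mentioned above for a heavily imbalanced two-class problem; the 50:1 weight ratio is an assumption for illustration, not the paper's value.

```python
# Weighted cross-entropy: the rare (blueberry) class gets a larger weight,
# so misclassifying it costs more during training.
import torch
import torch.nn as nn

class_weights = torch.tensor([1.0, 50.0])        # background, blueberry (assumed ratio)
criterion = nn.CrossEntropyLoss(weight=class_weights)

logits = torch.randn(8, 2)                       # a batch of 8 predictions
labels = torch.randint(0, 2, (8,))               # ground-truth class ids
loss = criterion(logits, labels)                 # rare-class errors dominate
```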
