Special Issue "Deep Learning in Remote Sensing Application"

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Remote Sensing Image Processing".

Deadline for manuscript submissions: closed (31 December 2022) | Viewed by 19410

Special Issue Editors

Dr. Weijia Li
Guest Editor
School of Geospatial Engineering and Science, Sun Yat-Sen University, Guangzhou 510080, China
Interests: remote sensing image understanding; computer vision; deep learning
Prof. Dr. Lichao Mou
Guest Editor
1. Data Science in Earth Observation, Technical University of Munich (TUM), Arcisstraße 21, 80333 Munich, Germany
2. Remote Sensing Technology Institute (IMF), German Aerospace Center (DLR), Münchener Straße 20, 82234 Weßling, Germany
Interests: remote sensing; computer vision; machine/deep learning
Dr. Angelica I. Aviles-Rivero
Guest Editor
DAMTP, University of Cambridge, Wilberforce Rd, Cambridge CB3 0WA, UK
Interests: semi-supervised learning; hyperspectral analysis; street level analysis; deep learning; graph-based techniques
Runmin Dong
Guest Editor Assistant
Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing 100084, China
Interests: remote sensing image understanding; deep learning; land cover mapping; image super-resolution reconstruction
Juepeng Zheng
Guest Editor Assistant
Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing 100084, China
Interests: remote sensing image understanding; deep learning; high performance computing

Special Issue Information

Dear Colleagues,

Remote sensing images have recorded various kinds of information about the Earth's surface for decades and have been broadly applied in many crucial areas, e.g., urban planning, national security, agriculture, forestry, climate, and hydrology. It is important to extract essential information from the substantial volume of remote sensing images efficiently and accurately. In recent years, artificial intelligence, especially deep learning, has had a significant effect on the remote sensing domain and shown great potential in land cover and land use mapping, crop monitoring, object detection, building and road extraction, change detection, super-resolution, and many other remote sensing applications. However, many challenges still exist due to the limited number of annotated datasets, the special characteristics of different sensors and data sources, the complexity and diversity of large-scale areas, and other specific problems in real-world applications. In this Special Issue, we expect new research progress and contributions on deep-learning-based remote sensing applications. We look forward to novel datasets, algorithm designs, and application domains. The scope of this Special Issue includes, but is not limited to:

  • Image classification;
  • Object detection;
  • Semantic segmentation;
  • Instance segmentation;
  • Weakly supervised learning;
  • Semi-supervised learning;
  • Self-supervised learning;
  • Unsupervised learning;
  • Domain adaptation;
  • Transfer learning;
  • Novel datasets;
  • Novel tasks/applications;
  • 3D vision for monocular images;
  • Multi-view stereo;
  • Point cloud data;
  • Change detection;
  • Time-series data analysis;
  • Multispectral or hyperspectral image analysis;
  • Image super-resolution/restoration;
  • Data fusion;
  • Multi-modal data analysis.

Dr. Weijia Li
Dr. Lichao Mou
Dr. Angelica I. Aviles-Rivero
Guest Editors
Runmin Dong
Juepeng Zheng
Guest Editor Assistants

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2500 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Deep learning
  • Computer vision
  • Remote sensing image analysis
  • Classification/detection/segmentation
  • Novel datasets
  • Remote sensing applications

Published Papers (12 papers)


Research

Article
High-Performance Segmentation for Flood Mapping of HISEA-1 SAR Remote Sensing Images
Remote Sens. 2022, 14(21), 5504; https://doi.org/10.3390/rs14215504 - 01 Nov 2022
Viewed by 669
Abstract
Floods are among the most frequent and common natural disasters, causing numerous casualties and extensive property losses worldwide every year. Since flooded areas are often accompanied by cloudy and rainy weather, synthetic aperture radar (SAR) is one of the most powerful sensors for flood monitoring, with day-and-night, all-weather imaging capabilities. However, SAR images are prone to high speckle noise, shadows, and distortions, which affect the accuracy of water body segmentation. To address this issue, we propose a novel Modified DeepLabv3+ model based on the powerful feature extraction ability of convolutional neural networks for flood mapping from HISEA-1 SAR remote sensing images. Specifically, a lightweight MobileNetv2 encoder is used to improve floodwater detection efficiency, small jagged-arrangement atrous convolutions are employed to capture features at small scales and improve pixel utilization, and more upsampling layers are utilized to refine the segmented boundaries of water bodies. The Modified DeepLabv3+ model is then used to analyze two severe flooding events in China and the United States. Results show that Modified DeepLabv3+ outperforms competing semantic segmentation models (SegNet, U-Net, and DeepLabv3+) with respect to the accuracy and efficiency of floodwater extraction. Model training resulted in average accuracy, F1, and mIoU scores of 95.74%, 89.31%, and 87.79%, respectively. Further analysis also revealed that Modified DeepLabv3+ is able to accurately distinguish water feature shape and boundary, despite complicated background conditions, while also retaining the highest efficiency by covering 1140 km² in 5 min. These results demonstrate that this model is a valuable tool for flood monitoring and emergency management. Full article
(This article belongs to the Special Issue Deep Learning in Remote Sensing Application)
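The average accuracy, F1, and mIoU figures reported in abstracts like the one above are standard segmentation metrics derived from a per-class confusion matrix. A minimal NumPy sketch (function and variable names are ours, not taken from the paper):

```python
import numpy as np

def segmentation_metrics(pred, gt, num_classes):
    """Overall accuracy, mean F1, and mean IoU from two label maps."""
    # Confusion matrix: rows index ground truth, columns index prediction.
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for t, p in zip(gt.ravel(), pred.ravel()):
        cm[t, p] += 1
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp            # predicted as class c but wrong
    fn = cm.sum(axis=1) - tp            # pixels of class c that were missed
    accuracy = tp.sum() / cm.sum()
    precision = tp / np.maximum(tp + fp, 1e-12)
    recall = tp / np.maximum(tp + fn, 1e-12)
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    iou = tp / np.maximum(tp + fp + fn, 1e-12)
    return accuracy, f1.mean(), iou.mean()
```

For a perfect prediction all three values are 1.0; on imbalanced scenes mIoU is typically the strictest of the three, which is why papers report it alongside accuracy.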

Communication
Semi-Supervised SAR ATR Framework with Transductive Auxiliary Segmentation
Remote Sens. 2022, 14(18), 4547; https://doi.org/10.3390/rs14184547 - 12 Sep 2022
Viewed by 643
Abstract
Convolutional neural networks (CNNs) have achieved high performance in synthetic aperture radar (SAR) automatic target recognition (ATR). However, the performance of CNNs depends heavily on a large amount of training data. The insufficiency of labeled training SAR images limits recognition performance and even invalidates some ATR methods; with few labeled training samples, many existing CNNs become ineffective. To address these challenges, we propose a Semi-supervised SAR ATR Framework with transductive Auxiliary Segmentation (SFAS). The proposed framework focuses on exploiting transductive generalization on available unlabeled samples, with an auxiliary loss serving as a regularizer. Through auxiliary segmentation of unlabeled SAR samples and an information residue loss (IRL) during training, the framework can employ the proposed training loop and gradually exploit the combined information of recognition and segmentation to construct a helpful inductive bias and achieve high performance. Experiments conducted on the MSTAR dataset show the effectiveness of the proposed SFAS for few-shot learning. A recognition accuracy of 94.18% is achieved with 20 training samples per class, together with accurate segmentation results. Under extended operating condition (EOC) variances, recognition rates remain above 88.00% with 10 training samples per class. Full article
(This article belongs to the Special Issue Deep Learning in Remote Sensing Application)

Article
Fast Tree Detection and Counting on UAVs for Sequential Aerial Images with Generating Orthophoto Mosaicing
Remote Sens. 2022, 14(16), 4113; https://doi.org/10.3390/rs14164113 - 22 Aug 2022
Cited by 2 | Viewed by 911
Abstract
Individual tree counting (ITC) is a popular topic in the remote sensing application field. The number and planting density of trees are significant for estimating yield and for further planning. Although existing studies have already achieved great performance on tree detection with satellite imagery, image quality is often negatively affected by clouds and heavy fog, which limits high-frequency inventory applications. Nowadays, with ultra-high spatial resolution and convenient usage, Unmanned Aerial Vehicles (UAVs) have become promising tools for obtaining statistics from plantations. However, for large-scale areas, a UAV cannot capture the whole region of interest in one photo session. In this paper, a real-time orthophoto mosaicing-based tree counting framework is proposed to detect trees from sequential aerial images, which is very effective for fast detection over large areas. Firstly, to guarantee both speed and accuracy, a multi-planar assumption constrained graph optimization algorithm is proposed to estimate the camera pose and generate the orthophoto mosaic simultaneously. Secondly, to avoid time-consuming box or mask annotations, a point-supervised method is designed for the tree counting task, which greatly speeds up the entire workflow. We demonstrate the effectiveness of our method by performing extensive experiments on oil-palm and acacia trees. To avoid delay between data acquisition and processing, the proposed framework is embedded into the UAV to complete tree counting tasks onboard, which also reduces the quantity of data transmitted from the UAV system to the ground station. We evaluate the proposed pipeline using sequential UAV images captured in Indonesia. The proposed pipeline achieves an F1-score of 98.2% for acacia tree detection and 96.3% for oil-palm tree detection with online orthophoto mosaic generation. Full article
(This article belongs to the Special Issue Deep Learning in Remote Sensing Application)
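Point-supervised tree detection as described above is typically scored by matching each predicted tree centre to at most one ground-truth point within a distance threshold and computing an F1-score. A sketch of one common greedy matching protocol (the matching rule and names are our assumption, not the paper's exact evaluation code):

```python
import numpy as np

def point_f1(pred, gt, max_dist):
    """Greedy one-to-one matching of predicted centres to ground-truth
    points within max_dist, then F1 from the resulting TP/FP/FN counts."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    used = np.zeros(len(gt), dtype=bool)
    tp = 0
    for p in pred:
        if not len(gt):
            break
        d = np.linalg.norm(gt - p, axis=1)
        d[used] = np.inf                 # each ground-truth point matched once
        j = int(np.argmin(d))
        if d[j] <= max_dist:
            used[j] = True
            tp += 1
    precision = tp / max(len(pred), 1)
    recall = tp / max(len(gt), 1)
    return 2 * precision * recall / max(precision + recall, 1e-12)
```

With a 2-pixel tolerance, a prediction set that hits every ground-truth point with no extras scores 1.0.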

Article
MS-IAF: Multi-Scale Information Augmentation Framework for Aircraft Detection
Remote Sens. 2022, 14(15), 3696; https://doi.org/10.3390/rs14153696 - 02 Aug 2022
Cited by 2 | Viewed by 704
Abstract
Aircraft have been an important object of study in the field of multi-scale image object detection due to their important strategic role. However, the multi-scale detection of aircraft and their key parts from remote sensing images can be a challenge, as images often present complex backgrounds and occluded conditions. Most of today's multi-scale datasets consist of independent objects and lack mixed annotations of aircraft and their key parts. In this paper, we contribute a multi-scale aircraft dataset (AP-DATA) consisting of 7000 aircraft images that were taken in complex environments and occluded conditions. Our dataset includes mixed annotations of aircraft and their key parts. We also present a multi-scale information augmentation framework (MS-IAF) to recognize multi-scale aircraft and their key parts accurately. First, we propose a new deep convolutional module, ResNeSt-D, as the backbone, which stacks split attention in a multi-path manner and makes the receptive field more suitable for the object. Then, based on the combination of Faster R-CNN with ResNeSt-D, we propose a multi-scale feature fusion module called BFPCAR. BFPCAR overcomes the attention imbalance problem of the non-adjacent layers of the FPN module by reducing the loss of information between different layers and including more semantic features during information fusion. On AP-DATA, a dataset with three types of features, the average precision (AP) of MS-IAF reached 0.884, i.e., 2.67% higher than that of the original Faster R-CNN. The APs of the two modules improved by 2.32% and 1.39%, respectively. The robustness of the proposed model was validated on the open-source RSOD remote sensing image dataset, where the best accuracy was achieved. Full article
(This article belongs to the Special Issue Deep Learning in Remote Sensing Application)
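The average precision (AP) values quoted above follow the standard detection protocol: rank detections by confidence, trace the precision-recall curve, and integrate its monotone upper envelope. A generic sketch (not the paper's evaluation code; detections are assumed to be pre-classified as true or false positives by an IoU test):

```python
import numpy as np

def average_precision(scores, is_tp, n_gt):
    """VOC-style all-point AP from ranked detections.

    scores: confidence per detection; is_tp: 1 if it matched a ground-truth
    box, else 0; n_gt: total number of ground-truth objects."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    tp = np.asarray(is_tp, dtype=float)[order]
    cum_tp = np.cumsum(tp)
    cum_fp = np.cumsum(1.0 - tp)
    recall = cum_tp / n_gt
    precision = cum_tp / (cum_tp + cum_fp)
    # Upper envelope: make precision non-increasing from right to left.
    for i in range(len(precision) - 2, -1, -1):
        precision[i] = max(precision[i], precision[i + 1])
    ap, prev_r = 0.0, 0.0
    for r, p in zip(recall, precision):
        ap += (r - prev_r) * p           # area under the envelope
        prev_r = r
    return ap
```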

Article
Comparing CNNs and Random Forests for Landsat Image Segmentation Trained on a Large Proxy Land Cover Dataset
Remote Sens. 2022, 14(14), 3396; https://doi.org/10.3390/rs14143396 - 14 Jul 2022
Cited by 2 | Viewed by 973
Abstract
Land cover mapping from satellite images has progressed from visual and statistical approaches to Random Forests (RFs) and, more recently, advanced image recognition techniques such as convolutional neural networks (CNNs). CNNs have a conceptual benefit over RFs in recognising spatial feature context, but potentially at the cost of reduced spatial detail. We tested the use of CNNs for improved land cover mapping based on Landsat data, compared with RFs, for a study area of approximately 500 km × 500 km in southeastern Australia. Landsat 8 geomedian composite surface reflectances were available for 2018. Label data were a simple nine-member land cover classification derived from reference land use mapping (Catchment Scale Land Use of Australia—CLUM), and further enhanced by using custom forest extent mapping (Forests of Australia). Experiments were undertaken testing U-Net CNN for segmentation of Landsat 8 geomedian imagery to determine the optimal combination of input Landsat 8 bands. The results were compared with those from a simple autoencoder as well as an RF model. Segmentation test results for the best performing U-Net CNN models produced an overall accuracy of 79% and weighted-mean F1 score of 77% (9 band input) or 76% (6 band input) for a simple nine-member land cover classification, compared with 73% and 68% (6 band input), respectively, for the best RF model. We conclude that U-Net CNN models can generate annual land cover maps with good accuracy from proxy training data, and can also be used for quality control or improvement of existing land cover products. Full article
(This article belongs to the Special Issue Deep Learning in Remote Sensing Application)

Article
3D Sensor Based Pedestrian Detection by Integrating Improved HHA Encoding and Two-Branch Feature Fusion
Remote Sens. 2022, 14(3), 645; https://doi.org/10.3390/rs14030645 - 29 Jan 2022
Cited by 6 | Viewed by 1841
Abstract
Pedestrian detection is vitally important in many computer vision tasks but still suffers from problems such as illumination and occlusion when only the RGB image is exploited, especially in outdoor and long-range scenes. Combining RGB with depth information acquired by 3D sensors may effectively alleviate these problems. Therefore, how to utilize depth information and how to fuse RGB and depth features are the focus of RGB-D pedestrian detection. This paper first improves the commonly used HHA method for depth encoding by optimizing the gravity direction extraction and depth-value mapping, which generates a pseudo-color image from the depth information. Then, a two-branch feature fusion extraction module (TFFEM) is proposed to obtain the local and global features of both modalities. Based on TFFEM, an RGB-D pedestrian detection network is designed to locate people. In experiments, the improved HHA encoding method is twice as fast and achieves more accurate gravity-direction extraction on four publicly available datasets. The pedestrian detection performance of the proposed network is validated on the KITTI and EPFL datasets and achieves state-of-the-art performance. Moreover, the proposed method ranked third among all published works on the KITTI leaderboard. In general, the proposed method effectively fuses RGB and depth features and overcomes the effects of illumination and occlusion in pedestrian detection. Full article
(This article belongs to the Special Issue Deep Learning in Remote Sensing Application)
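HHA encodes a depth map as three pseudo-color channels: horizontal disparity, height above ground, and the angle of the local surface normal with the gravity direction. The sketch below is a heavily simplified illustration of that idea only; it assumes a level pinhole camera at a known height and uses depth gradients as a crude normal proxy, and all parameter names and defaults are ours, not the paper's improved method:

```python
import numpy as np

def simplified_hha(depth, cam_height=1.5, f=525.0):
    """Toy HHA-style 3-channel encoding of a metric depth map (metres)."""
    d = np.where(depth > 0, depth.astype(float), np.nan)  # mask invalid depth
    disparity = 1.0 / d                                   # channel 1
    h = depth.shape[0]
    v = np.arange(h, dtype=float).reshape(-1, 1) - h / 2.0
    height = cam_height - v * d / f                       # channel 2 (level cam)
    gy, gx = np.gradient(np.nan_to_num(d))
    angle = np.arctan(np.hypot(gx, gy))                   # channel 3 (normal tilt proxy)

    def to_byte(c):
        c = np.nan_to_num(c)
        lo, hi = c.min(), c.max()
        scale = (hi - lo) if hi > lo else 1.0
        return ((c - lo) / scale * 255).astype(np.uint8)

    return np.dstack([to_byte(disparity), to_byte(height), to_byte(angle)])
```

The resulting uint8 image can be fed to an RGB-pretrained CNN branch, which is the usual motivation for HHA-style encodings.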

Article
An Adaptive Attention Fusion Mechanism Convolutional Network for Object Detection in Remote Sensing Images
Remote Sens. 2022, 14(3), 516; https://doi.org/10.3390/rs14030516 - 21 Jan 2022
Cited by 7 | Viewed by 1825
Abstract
For remote sensing object detection, automatically fusing optimal feature information and overcoming sensitivity to multi-scale objects remain significant challenges for existing convolutional neural networks. Given this, we develop a convolutional network model with an adaptive attention fusion mechanism (AAFM). The model is built on the backbone network of EfficientDet. Firstly, according to the characteristics of object distribution in the datasets, the stitcher is applied to make a single image contain objects of various scales. Such a process can effectively balance the proportion of multi-scale objects and handle scale-variable properties. In addition, inspired by channel attention, a spatial attention model is also introduced in the construction of the adaptive attention fusion mechanism. In this mechanism, the semantic information of the different feature maps is obtained via convolution and different pooling operations. Then, the parallel spatial and channel attention are fused in optimal proportions by the fusion factors to obtain more representative feature information. Finally, the Complete Intersection over Union (CIoU) loss is used to make the bounding box better cover the ground truth. Experimental results on the optical image dataset DIOR demonstrate that, compared with state-of-the-art detectors such as the Single Shot multibox Detector (SSD), You Only Look Once (YOLO) v4, and EfficientDet, the proposed module improves accuracy and has stronger robustness. Full article
(This article belongs to the Special Issue Deep Learning in Remote Sensing Application)
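The CIoU loss mentioned above augments the IoU term with a normalized centre-distance penalty and an aspect-ratio consistency term (Zheng et al., AAAI 2020). A compact sketch for axis-aligned boxes in (x1, y1, x2, y2) form:

```python
import math

def ciou_loss(box, target):
    """CIoU loss = 1 - IoU + d^2/c^2 + alpha * v for axis-aligned boxes."""
    x1, y1, x2, y2 = box
    X1, Y1, X2, Y2 = target
    inter = max(0.0, min(x2, X2) - max(x1, X1)) * \
            max(0.0, min(y2, Y2) - max(y1, Y1))
    union = (x2 - x1) * (y2 - y1) + (X2 - X1) * (Y2 - Y1) - inter
    iou = inter / union
    # Squared centre distance over squared enclosing-box diagonal.
    d2 = ((x1 + x2 - X1 - X2) ** 2 + (y1 + y2 - Y1 - Y2) ** 2) / 4.0
    c2 = (max(x2, X2) - min(x1, X1)) ** 2 + (max(y2, Y2) - min(y1, Y1)) ** 2
    # Aspect-ratio consistency term.
    v = (4 / math.pi ** 2) * (math.atan((X2 - X1) / (Y2 - Y1))
                              - math.atan((x2 - x1) / (y2 - y1))) ** 2
    alpha = v / ((1.0 - iou) + v + 1e-12)
    return 1.0 - iou + d2 / c2 + alpha * v
```

The loss is 0 for identical boxes and, unlike plain IoU loss, still penalizes centre offset and aspect-ratio mismatch when boxes barely overlap.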

Article
Hyperspectral Image Super-Resolution Based on Spatial Correlation-Regularized Unmixing Convolutional Neural Network
Remote Sens. 2021, 13(20), 4074; https://doi.org/10.3390/rs13204074 - 12 Oct 2021
Cited by 6 | Viewed by 1518
Abstract
Super-resolution (SR) technology has emerged as an effective tool for image analysis and interpretation. However, single hyperspectral (HS) image SR remains challenging due to the high spectral dimensionality and the lack of available high-resolution information from auxiliary sources. To fully exploit the spectral and spatial characteristics, in this paper, a novel single HS image SR approach is proposed based on a spatial correlation-regularized unmixing convolutional neural network (CNN). The proposed approach takes advantage of a CNN to explore the collaborative spatial and spectral information of an HS image and infer the high-resolution abundance maps, thereby reconstructing the anticipated high-resolution HS image via the linear spectral mixture model. Moreover, a dual-branch network architecture and a spatial spread transform function are employed to characterize the spatial correlation between the high- and low-resolution HS images, aiming to promote the fidelity of the super-resolved image. Experiments on three public remote sensing HS images demonstrate the feasibility and superiority of the approach in terms of spectral fidelity, compared with several state-of-the-art HS image super-resolution methods. Full article
(This article belongs to the Special Issue Deep Learning in Remote Sensing Application)
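The reconstruction step of the unmixing-based SR described above relies on the linear spectral mixture model: every pixel spectrum is a non-negative, sum-to-one combination of a few endmember spectra, so a high-resolution image follows directly from high-resolution abundance maps. A toy NumPy illustration (all dimensions and data are synthetic, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
bands, endmembers, pixels = 30, 4, 64 * 64

E = rng.random((bands, endmembers))               # endmember spectral signatures
A = rng.dirichlet(np.ones(endmembers), pixels).T  # abundances: non-negative,
assert np.allclose(A.sum(axis=0), 1.0)            # sum-to-one per pixel

X_hr = E @ A                                      # reconstructed HR cube
print(X_hr.shape)                                 # (30, 4096)
```

In the paper's pipeline the CNN predicts A at high resolution, and the matrix product above turns those abundance maps back into a full hyperspectral cube.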

Article
DEANet: Dual Encoder with Attention Network for Semantic Segmentation of Remote Sensing Imagery
Remote Sens. 2021, 13(19), 3900; https://doi.org/10.3390/rs13193900 - 29 Sep 2021
Cited by 8 | Viewed by 1408
Abstract
Remote sensing has now been widely used in various fields, and research on automatic land-cover segmentation methods for remote sensing imagery is significant to the development of remote sensing technology. Deep learning methods, which are developing rapidly in the field of semantic segmentation, have been widely applied to remote sensing imagery segmentation. In this work, a novel deep learning network, the Dual Encoder with Attention Network (DEANet), is proposed. In this network, a dual-branch encoder structure, whose first branch generates a rough guidance feature map as area attention to help re-encode feature maps in the second branch, is proposed to improve the encoding ability of the network, and an improved pyramid partial decoder (PPD) based on the parallel partial decoder is put forward to make fuller use of the features from the encoder along with the receptive field block (RFB). In addition, an edge attention module using transfer learning is introduced to explicitly improve segmentation performance in edge areas. Beyond the structure, a loss function composed of the weighted Cross Entropy (CE) loss and the weighted Union subtract Intersection (UsI) loss is designed for training, where the UsI loss is a new region-based loss that replaces the IoU loss to adapt to multi-class tasks. Furthermore, a detailed training strategy for the network is introduced as well. Extensive experiments on three public datasets verify the effectiveness of each proposed module in our framework and demonstrate that our method outperforms several state-of-the-art methods. Full article
(This article belongs to the Special Issue Deep Learning in Remote Sensing Application)

Article
Multi-Object Segmentation in Complex Urban Scenes from High-Resolution Remote Sensing Data
Remote Sens. 2021, 13(18), 3710; https://doi.org/10.3390/rs13183710 - 16 Sep 2021
Cited by 13 | Viewed by 2744
Abstract
Terrestrial feature extraction, such as extracting roads and buildings from aerial images with an automatic system, has many uses in a wide range of fields, including disaster management, change detection, land cover assessment, and urban planning. The task is challenging because of complex scenes, such as urban scenes, where buildings and road objects are surrounded by shadows, vehicles, trees, etc., and appear in heterogeneous forms with lower inter-class and higher intra-class contrasts. Moreover, such extraction is time-consuming and expensive to perform manually by human specialists. Deep convolutional models have displayed considerable performance for feature segmentation from remote sensing data in recent years. However, for large and continuous areas of obstruction, most of these techniques still cannot detect roads and buildings well. Hence, the principal goal of this work is to introduce two novel deep convolutional models based on the UNet family for multi-object segmentation, such as roads and buildings, from aerial imagery. We focused on buildings and road networks because these objects constitute a huge part of urban areas. The presented models are called the multi-level context gating UNet (MCG-UNet) and the bi-directional ConvLSTM UNet model (BCL-UNet). The proposed methods retain the advantages of the UNet model and add densely connected convolutions, bi-directional ConvLSTM, and a squeeze-and-excitation module to produce segmentation maps with high resolution and maintain boundary information even under complicated backgrounds. Additionally, we implemented a simple and efficient loss function called the boundary-aware loss (BAL) that allows the network to concentrate on hard semantic segmentation regions, such as overlapping areas, small objects, sophisticated objects, and object boundaries, and produce high-quality segmentation maps. The presented networks were tested on the Massachusetts building and road datasets. The MCG-UNet improved the average F1 accuracy by 1.85% and 1.19%, and by 6.67% and 5.11%, compared with UNet and BCL-UNet for road and building extraction, respectively. Additionally, the presented MCG-UNet and BCL-UNet networks were compared with other state-of-the-art deep learning-based networks, and the results proved the superiority of the networks in multi-object segmentation tasks. Full article
(This article belongs to the Special Issue Deep Learning in Remote Sensing Application)
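Boundary-aware losses of the kind described above typically up-weight pixels near class boundaries inside an otherwise ordinary cross-entropy. The sketch below shows one common Gaussian-of-distance weighting for the binary case; this is a generic formulation with our own parameter names, not necessarily the exact BAL of the paper:

```python
import numpy as np

def boundary_aware_ce(prob, gt, w0=5.0, sigma=3.0):
    """Binary cross-entropy with extra weight near class boundaries."""
    gy, gx = np.gradient(gt.astype(float))
    edges = (np.abs(gy) + np.abs(gx)) > 0          # boundary pixels
    ys, xs = np.nonzero(edges)
    if len(ys):
        # Brute-force distance of every pixel to the nearest boundary pixel
        # (fine for small images; a distance transform scales better).
        coords = np.indices(gt.shape).reshape(2, -1).T
        pts = np.stack([ys, xs], axis=1)
        dist = np.sqrt(((coords[:, None, :] - pts[None, :, :]) ** 2)
                       .sum(-1)).min(1).reshape(gt.shape)
    else:
        dist = np.full(gt.shape, np.inf)           # no boundary: plain CE
    weights = 1.0 + w0 * np.exp(-dist ** 2 / (2 * sigma ** 2))
    ce = -(gt * np.log(prob + 1e-12) + (1 - gt) * np.log(1 - prob + 1e-12))
    return float((weights * ce).sum() / weights.sum())
```

Pixels sitting on object borders can thus dominate the loss, which is the stated intent of BAL for overlapping areas and small objects.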

Article
Learning Adjustable Reduced Downsampling Network for Small Object Detection in Urban Environments
Remote Sens. 2021, 13(18), 3608; https://doi.org/10.3390/rs13183608 - 10 Sep 2021
Cited by 3 | Viewed by 1253
Abstract
Detecting small objects (e.g., manhole covers, license plates, and roadside milestones) in urban images is a long-standing challenge, mainly due to the small scale of the objects and background clutter. Although convolutional neural network (CNN)-based methods have made significant progress and achieved impressive results in generic object detection, the problem of small object detection remains unsolved. To address this challenge, in this study we developed an end-to-end network architecture that has three significant characteristics compared to previous works. First, we designed a backbone network module, the Reduced Downsampling Network (RD-Net), to extract informative feature representations with high spatial resolution and preserve local information for small objects. Second, we introduced an Adjustable Sample Selection (ADSS) module, which frees the detector from Intersection-over-Union (IoU) threshold hyperparameters and defines positive and negative training samples based on statistical characteristics between generated anchors and ground reference bounding boxes. Third, we incorporated the generalized Intersection-over-Union (GIoU) loss for bounding box regression, which efficiently bridges the gap between distance-based optimization losses and area-based evaluation metrics. We demonstrated the effectiveness of our method by performing extensive experiments on the public Urban Element Detection (UED) dataset acquired by Mobile Mapping Systems (MMS). The Average Precision (AP) of the proposed method was 81.71%, an improvement of 1.2% over the popular detection framework Faster R-CNN. Full article
(This article belongs to the Special Issue Deep Learning in Remote Sensing Application)
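The GIoU loss mentioned above extends IoU with a penalty for the empty area of the smallest enclosing box, so it still provides a useful gradient when the boxes do not overlap at all (Rezatofighi et al., CVPR 2019). A compact sketch for axis-aligned (x1, y1, x2, y2) boxes:

```python
def giou_loss(box, target):
    """GIoU = IoU - (area of C not covered by the union) / area(C),
    where C is the smallest enclosing box; the loss is 1 - GIoU."""
    x1, y1, x2, y2 = box
    X1, Y1, X2, Y2 = target
    inter = max(0.0, min(x2, X2) - max(x1, X1)) * \
            max(0.0, min(y2, Y2) - max(y1, Y1))
    union = (x2 - x1) * (y2 - y1) + (X2 - X1) * (Y2 - Y1) - inter
    iou = inter / union
    # Smallest axis-aligned box C enclosing both inputs.
    c_area = (max(x2, X2) - min(x1, X1)) * (max(y2, Y2) - min(y1, Y1))
    giou = iou - (c_area - union) / c_area
    return 1.0 - giou
```

For identical boxes the loss is 0; for disjoint boxes it exceeds 1, growing with the gap between them, which is exactly what makes it usable as a regression target for small objects.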

Article
Making Low-Resolution Satellite Images Reborn: A Deep Learning Approach for Super-Resolution Building Extraction
Remote Sens. 2021, 13(15), 2872; https://doi.org/10.3390/rs13152872 - 22 Jul 2021
Cited by 12 | Viewed by 2753
Abstract
Existing methods for building extraction from remotely sensed images strongly rely on aerial or satellite images with very high resolution, which are usually limited in spatiotemporal accessibility and costly. In contrast, relatively low-resolution images have better spatial and temporal availability but cannot directly contribute to fine- and/or high-resolution building extraction. In this paper, based on image super-resolution and segmentation techniques, we propose a two-stage framework (SRBuildingSeg) for super-resolution (SR) building extraction from relatively low-resolution remotely sensed images. SRBuildingSeg can fully utilize the inherent information of the given low-resolution images to achieve high-resolution building extraction. In contrast to existing building extraction methods, we first utilize an internal pairs generation module (IPG) to obtain SR training datasets from the given low-resolution images and an edge-aware super-resolution module (EASR) to improve the perceptual features, followed by the dual-encoder building segmentation module (DES). Both qualitative and quantitative experimental results demonstrate that our proposed approach is capable of achieving high-resolution (e.g., 0.5 m) building extraction results at 2×, 4×, and 8× SR. Our approach outperforms eight other methods with respect to the mean Intersection over Union (mIoU) of the extraction results by 9.38%, 8.20%, and 7.89% at SR ratio factors of 2, 4, and 8, respectively. The results indicate that the edges and borders reconstructed in super-resolved images play a pivotal role in subsequent building extraction and reveal the potential of the proposed approach for super-resolution building extraction. Full article
(This article belongs to the Special Issue Deep Learning in Remote Sensing Application)
