Advanced Application of Artificial Intelligence and Machine Vision in Remote Sensing

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "AI Remote Sensing".

Deadline for manuscript submissions: closed (30 April 2022) | Viewed by 69101

Special Issue Editors


Guest Editor
1. Faculty of Engineering and IT, University of Technology Sydney, Ultimo, NSW, Australia
2. McGregor Coxall Australia Pty Ltd., Sydney, NSW, Australia
Interests: machine learning; geospatial 3D analysis; geospatial database querying; web GIS; airborne/spaceborne image processing; feature extraction; time-series analysis for forecast modelling and domain adaptation in various environmental applications

Guest Editor
1. ITC, University of Twente, Hengelosestraat 99, 7514 AE Enschede, The Netherlands
2. Deggendorf Institute of Technology, Dieter-Görlitz-Platz 1, 94469 Deggendorf, Germany
Interests: remote sensing; (object-based) image analysis; artificial intelligence; GIScience

Special Issue Information

Dear Colleagues,

Artificial intelligence (AI), including machine learning (ML) techniques, has been a principal element of image processing and spatial analysis in numerous applications for over a decade. Among the many approaches, deep neural networks trained with deep learning algorithms have become very popular in the computer vision community. They are among the most robust data-driven ML methods and can address a wide range of applications, including pattern recognition, feature detection, trend prediction, instance segmentation, semantic segmentation, and image classification.

Training models with remotely sensed data has conventionally been done manually, a subjective, user-centric, and therefore opaque and tedious approach. Machine vision (MV) aims to eliminate these uncertainties by establishing a reproducible and reliable approach. MV leverages current AI technology in a novel way to provide an automatic inspection workflow spanning image acquisition at the sensor, digital image pre-processing, training and testing techniques, validation, and knowledge extraction. It covers software products and hardware architectures, such as CPU and GPU/FPGA combinations, parallel implementations, and computer vision pipelines, to minimize computational effort while maximizing performance.

In this Special Issue, we welcome scientific manuscripts proposing frameworks that leverage MV with optimized AI techniques and geospatial information systems to automate the processing of remotely sensed imagery from, e.g., lidar, radar, SAR, and multispectral sensors with higher precision, for spatial applications including, but not limited to, urbanism, land-use modeling, the environment, weather and climate, the energy sector, natural resources, landscape, and geo-hazards.

Dr. Hossein M. Rizeei
Dr. Peter Hofmann
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Artificial intelligence (AI)
  • Machine vision (MV)
  • Machine learning (ML)
  • Geospatial information systems (GIS)
  • Optimization
  • Spatial framework
  • Deep learning (DL)

Published Papers (21 papers)


Research


15 pages, 2576 KiB  
Article
Multi-Category Segmentation of Sentinel-2 Images Based on the Swin UNet Method
by Junyuan Yao and Shuanggen Jin
Remote Sens. 2022, 14(14), 3382; https://doi.org/10.3390/rs14143382 - 14 Jul 2022
Cited by 17 | Viewed by 3632
Abstract
Medium-resolution remote sensing satellites provide a large amount of long-time-series, full-coverage data for Earth surface monitoring. However, different objects may have similar spectral values and the same objects may have different spectral values, which makes it difficult to improve classification accuracy. Semantic segmentation of remote sensing images is greatly facilitated by deep learning methods, but for medium-resolution imagery, convolutional neural network-based models do not achieve good results due to their limited receptive field. The fast-emerging vision transformer, which captures global features through self-attention, provides a new solution for medium-resolution remote sensing image segmentation. In this paper, a new multi-class segmentation method is proposed for medium-resolution remote sensing images based on an improved Swin UNet model, a pure transformer model, together with new pre-processing; an image enhancement method and a spectral selection module are designed to achieve better accuracy. Finally, 10-category segmentation is conducted on 10-m resolution Sentinel-2 MSI (Multi-Spectral Imager) images and compared with traditional convolutional neural network-based models (DeepLabV3+ and U-Net with different backbone networks, including VGG, ResNet50, MobileNet, and Xception) on the same sample data; the results show higher mean intersection over union (MIoU, 72.06%) and better accuracy (89.77%). The vision transformer method has great potential for medium-resolution remote sensing image segmentation tasks.
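
For readers reproducing such comparisons, the sketch below computes the two metrics the abstract reports, overall accuracy and MIoU, from a confusion matrix. It is a generic illustration rather than the authors' code, and the toy label maps are placeholders.

```python
# Minimal sketch: overall accuracy and mean intersection over union (MIoU)
# for a multi-class segmentation map, assuming integer-labelled prediction
# and ground-truth rasters of the same shape.
import numpy as np

def segmentation_scores(y_true, y_pred, n_classes):
    t, p = y_true.ravel(), y_pred.ravel()
    # Confusion matrix: rows = ground truth, columns = prediction.
    cm = np.bincount(t * n_classes + p, minlength=n_classes ** 2)
    cm = cm.reshape(n_classes, n_classes).astype(np.float64)
    overall_accuracy = np.diag(cm).sum() / cm.sum()
    # Per-class IoU = TP / (TP + FP + FN); skip classes absent from both maps.
    union = cm.sum(axis=0) + cm.sum(axis=1) - np.diag(cm)
    iou = np.divide(np.diag(cm), union, out=np.zeros(n_classes), where=union > 0)
    return overall_accuracy, iou[union > 0].mean()

rng = np.random.default_rng(0)
gt = rng.integers(0, 10, size=(256, 256))                # toy 10-class ground truth
pred = gt.copy()
pred[rng.random(gt.shape) < 0.1] = rng.integers(0, 10)   # corrupt 10% of pixels
print(segmentation_scores(gt, pred, n_classes=10))
```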

18 pages, 4430 KiB  
Article
Machine Learning Techniques for Phenology Assessment of Sugarcane Using Conjunctive SAR and Optical Data
by Md Yeasin, Dipanwita Haldar, Suresh Kumar, Ranjit Kumar Paul and Sonaka Ghosh
Remote Sens. 2022, 14(14), 3249; https://doi.org/10.3390/rs14143249 - 6 Jul 2022
Cited by 7 | Viewed by 2311
Abstract
Crop phenology monitoring is a necessary action for precision agriculture. The Sentinel-1 and Sentinel-2 satellites provide the opportunity to monitor crop phenology at a high spatial resolution with high accuracy. The main objective of this study was to examine the potential of Sentinel-1 and Sentinel-2 data, and their combination, for monitoring sugarcane phenological stages, and to evaluate the temporal behaviour of Sentinel-1 parameters and Sentinel-2 indices. Seven machine learning models, namely logistic regression, decision tree, random forest, artificial neural network, support vector machine, naïve Bayes, and fuzzy rule-based systems, were implemented, and their predictive performance was compared. Accuracy, precision, specificity, sensitivity (recall), F score, area under the receiver operating characteristic curve, and kappa value were used as performance metrics. The research was carried out in the Indo-Gangetic alluvial plains in the districts of Hisar and Jind, Haryana, India. The Sentinel-1 backscatters and parameters VV, alpha, and anisotropy, and, among the Sentinel-2 indices, the normalized difference vegetation index and the weighted difference vegetation index, were found to be the most important features for predicting sugarcane phenology. The accuracy of the models ranged from 40 to 60% for Sentinel-1 data, 56 to 84% for Sentinel-2 data, and 76 to 88% for the combined data. The area under the ROC curve and kappa values also supported the supremacy of the combined use of Sentinel-1 and Sentinel-2 data. This study infers that combined Sentinel-1 and Sentinel-2 data are more efficient in predicting sugarcane phenology than either alone.
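
As a pointer for readers, here is a minimal scikit-learn sketch of the kind of classifier benchmark the study describes, with the accuracy and kappa metrics it reports. The feature and label arrays are placeholders, not the study's data, and only three of the seven models are shown.

```python
# Hedged sketch: compare classifiers on tabular Sentinel-1/Sentinel-2 features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, cohen_kappa_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X = np.random.rand(500, 8)          # placeholder: per-field S1 + S2 features
y = np.random.randint(0, 4, 500)    # placeholder: phenological stage labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=1),
    "svm": SVC(kernel="rbf"),
}
for name, model in models.items():
    y_hat = model.fit(X_tr, y_tr).predict(X_te)
    print(name, accuracy_score(y_te, y_hat), cohen_kappa_score(y_te, y_hat))
```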

23 pages, 7909 KiB  
Article
Infrared and Visible Image Fusion with Deep Neural Network in Enhanced Flight Vision System
by Xuyang Gao, Yibing Shi, Qi Zhu, Qiang Fu and Yuezhou Wu
Remote Sens. 2022, 14(12), 2789; https://doi.org/10.3390/rs14122789 - 10 Jun 2022
Cited by 6 | Viewed by 2411
Abstract
The Enhanced Flight Vision System (EFVS) plays a significant role in next-generation low-visibility aircraft landing technology, where optical sensing systems add a visual dimension for pilots. This paper focuses on deploying infrared and visible image fusion systems in civil flight, particularly on generating integrated results that contend with registration deviation and adverse weather conditions. Existing enhancement methods push ahead with metrics-driven integration, while the dynamic distortion and continuous visual scene of the landing stage are overlooked. Hence, the proposed visual enhancement scheme is divided into homography estimation and image fusion based on deep learning. A lightweight framework integrating hardware calibration and homography estimation is designed to calibrate images before fusion and reduce the offset between image pairs. A transformer structure adopting the self-attention mechanism to distinguish composite properties is incorporated into a concise autoencoder to construct the fusion strategy, and an improved weight allocation strategy enhances the feature combination. With these in place, a flight verification platform for assessing the performance of different algorithms was built to capture image pairs in the landing stage. Experimental results confirm the balance achieved by the proposed scheme in perception-inspired and feature-based metrics compared with other approaches.
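
The paper itself fuses modalities with a transformer autoencoder; as a hedged baseline for the general idea of pixel-level infrared/visible fusion, the sketch below weights two co-registered grayscale images by local contrast. Names and window size are assumptions.

```python
# Simple activity-weighted fusion of registered infrared and visible images.
import numpy as np
from scipy.ndimage import uniform_filter

def fuse(ir, vis, win=9):
    """ir, vis: float arrays in [0, 1], already co-registered."""
    def local_variance(img):
        mean = uniform_filter(img, win)
        # Clip tiny negative values caused by floating-point error.
        return np.clip(uniform_filter(img * img, win) - mean * mean, 0, None)
    w_ir, w_vis = local_variance(ir), local_variance(vis)
    total = w_ir + w_vis + 1e-8                 # avoid division by zero
    return (w_ir * ir + w_vis * vis) / total
```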

22 pages, 2346 KiB  
Article
RANet: A Reliability-Guided Aggregation Network for Hyperspectral and RGB Fusion Tracking
by Chunhui Zhao, Hongjiao Liu, Nan Su, Lu Wang and Yiming Yan
Remote Sens. 2022, 14(12), 2765; https://doi.org/10.3390/rs14122765 - 9 Jun 2022
Cited by 7 | Viewed by 2157
Abstract
Object tracking based on RGB images may fail when the color of the tracked object is similar to that of the background. Hyperspectral images with rich spectral features can provide more information for RGB-based trackers. However, there is no fusion tracking algorithm based on hyperspectral and RGB images. In this paper, we propose a reliability-guided aggregation network (RANet) for hyperspectral and RGB tracking, which guides the combination of hyperspectral information and RGB information through modality reliability to improve tracking performance. Specifically, a dual branch based on the Transformer Tracking (TransT) structure is constructed to obtain the information of the hyperspectral and RGB modalities. Then, a classification response aggregation module is designed to combine the different modality information by fusing the responses predicted through the classification head. Finally, the reliability of the different modalities is also considered in the aggregation module to guide the aggregation of the different modality information. Extensive experimental results on a public dataset composed of hyperspectral and RGB image sequences show that the performance of the tracker based on our fusion method is better than that of the corresponding single-modality tracker, which fully proves the effectiveness of the fusion method. Among them, the RANet tracker based on the TransT tracker achieves the best accuracy of 0.709, indicating the effectiveness and superiority of the RANet tracker.
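
To illustrate the aggregation principle the abstract describes (not the published RANet code), a sketch that weights each modality's classification response map by a reliability score before fusing; the peak-to-mean reliability proxy is an assumption.

```python
# Hedged sketch of reliability-guided response aggregation.
import numpy as np

def reliability(response):
    """A simple peak-to-mean reliability proxy for a response map."""
    return float(response.max() - response.mean()) / (response.std() + 1e-8)

def aggregate(resp_hsi, resp_rgb):
    r_h, r_r = reliability(resp_hsi), reliability(resp_rgb)
    w_h, w_r = r_h / (r_h + r_r), r_r / (r_h + r_r)
    return w_h * resp_hsi + w_r * resp_rgb      # fused response map
```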

13 pages, 34286 KiB  
Communication
Augmentation-Based Methodology for Enhancement of Trees Map Detalization on a Large Scale
by Svetlana Illarionova, Dmitrii Shadrin, Vladimir Ignatiev, Sergey Shayakhmetov, Alexey Trekin and Ivan Oseledets
Remote Sens. 2022, 14(9), 2281; https://doi.org/10.3390/rs14092281 - 9 May 2022
Cited by 13 | Viewed by 2059
Abstract
Remote sensing tasks play a very important role in the domain of sensing and measuring and can be very specific. Advances in computer vision techniques allow various information to be extracted from remote sensing satellite imagery. This information is crucial for quantitative and qualitative assessments in monitoring forest clearing in protected areas for power lines, as well as for environmental analysis, in particular for assessing the carbon footprint, which is a highly relevant task. Solving these problems requires precise segmentation of the forest mask. Although forest mask extraction from satellite data has been considered previously, no open-access application is able to provide a highly detailed forest mask. Detailed forest masks are usually obtained using unmanned aerial vehicles (UAVs), which come with particular limitations such as cost and inapplicability over vast territories. In this study, we propose a novel neural network-based approach for creating highly detailed forest masks. We implement an object-based augmentation technique for a minimal amount of labeled high-detail data. Using these augmented data, we fine-tune models that are trained on a large forest dataset with less precisely labeled masks. The algorithm is tested on multiple territories in Russia. The F1-score for small details (such as individual trees) was improved to 0.929 compared to the baseline score of 0.856. The developed model is available on a SaaS platform and allows a detailed and precise forest mask to be easily created, which can then be used to solve various applied problems.
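
A minimal sketch of object-based (copy-paste) augmentation under stated assumptions: labelled tree objects (an image patch plus a binary mask) are pasted onto new scenes to multiply a small set of high-detail labels. Array names and layouts are hypothetical.

```python
# Hedged sketch: paste one labelled object into a scene and update the mask.
import numpy as np

def paste_object(scene, scene_mask, obj, obj_mask, top, left):
    """scene: H x W x C image; obj: h x w x C patch; obj_mask: h x w binary."""
    h, w = obj.shape[:2]
    region = scene[top:top + h, left:left + w]
    m = obj_mask.astype(bool)
    region[m] = obj[m]                                  # overwrite object pixels
    scene_mask[top:top + h, left:left + w][m] = 1       # mark them as "tree"
    return scene, scene_mask
```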

28 pages, 4798 KiB  
Article
Fusion of a Static and Dynamic Convolutional Neural Network for Multiview 3D Point Cloud Classification
by Wenju Wang, Haoran Zhou, Gang Chen and Xiaolin Wang
Remote Sens. 2022, 14(9), 1996; https://doi.org/10.3390/rs14091996 - 21 Apr 2022
Cited by 5 | Viewed by 2661
Abstract
Three-dimensional (3D) point cloud classification methods based on deep learning have good classification performance; however, they adapt poorly to diverse datasets and their classification accuracy must be improved. Therefore, FSDCNet, a neural network model based on the fusion of static and dynamic convolution, is proposed and applied for multiview 3D point cloud classification in this paper. FSDCNet devises a view selection method with fixed and random viewpoints, which effectively avoids the overfitting caused by the traditional fixed viewpoint. A local feature extraction operator of dynamic and static convolution adaptive weight fusion was designed to improve the model’s adaptability to different types of datasets. To address the problems of large parameters and high computational complexity associated with the current methods of dynamic convolution, a lightweight and adaptive dynamic convolution operator was developed. In addition, FSDCNet builds a global attention pooling, integrating the most crucial information on different view features to the greatest extent. Due to these characteristics, FSDCNet is more adaptable, can extract more fine-grained detailed information, and can improve the classification accuracy of point cloud data. The proposed method was applied to the ModelNet40 and Sydney Urban Objects datasets. In these experiments, FSDCNet outperformed its counterparts, achieving state-of-the-art point cloud classification accuracy. For the ModelNet40 dataset, the overall accuracy (OA) and average accuracy (AA) of FSDCNet in a single view reached 93.8% and 91.2%, respectively, which were superior to those values for many other methods using 6 and 12 views. FSDCNet obtained the best results for 6 and 12 views, achieving 94.6%, 93.3%, 95.3%, and 93.6% in OA and AA metrics, respectively. For the Sydney Urban Objects dataset, FSDCNet achieved an OA and F1 score of 81.2% and 80.1% in a single view, respectively, which were higher than most of the compared methods. In 6 and 12 views, FSDCNet reached an OA of 85.3% and 83.6% and an F1 score of 85.5% and 83.7%, respectively.
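
As a toy illustration of the fixed-plus-random viewpoint idea from the abstract (angle counts and the azimuth-only parameterization are assumptions, not the FSDCNet implementation):

```python
# Hedged sketch: mix canonical and random rendering azimuths per object.
import numpy as np

def select_view_angles(n_fixed=3, n_random=3, seed=0):
    rng = np.random.default_rng(seed)
    fixed = np.linspace(0.0, 360.0, n_fixed, endpoint=False)  # canonical azimuths
    random = rng.uniform(0.0, 360.0, n_random)                # random azimuths
    return np.concatenate([fixed, random])
```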

24 pages, 9654 KiB  
Article
Distribution Modeling and Factor Correlation Analysis of Landslides in the Large Fault Zone of the Western Qinling Mountains: A Machine Learning Algorithm
by Tianjun Qi, Yan Zhao, Xingmin Meng, Wei Shi, Feng Qing, Guan Chen, Yi Zhang, Dongxia Yue and Fuyun Guo
Remote Sens. 2021, 13(24), 4990; https://doi.org/10.3390/rs13244990 - 8 Dec 2021
Cited by 11 | Viewed by 2787
Abstract
The area comprising the Langma-Baiya fault zone (LBFZ) and the Bailongjiang fault zone (BFZ) in the Western Qinling Mountains in China is characterized by intensive, frequent, multi-type landslide disasters. The spatial distribution of landslides is affected by factors such as geological structure, landforms, climate, and human activities, and the distribution of landslides in turn affects geomorphology, the ecological environment, and human activities. Here, we present the results of a detailed landslide inventory of the area, which recorded a total of 2765 landslides. The landslides are divided into three categories according to relative age, area, and type of movement. Sixteen factors related to geological structure, geomorphology, material composition, and human activities were selected, and four machine learning algorithms were used to model the spatial distribution of landslides. The aim was to quantitatively evaluate the relationship between the spatial distribution of landslides and the contributing factors. Based on a comparison of model accuracy and the receiver operating characteristic (ROC) curve, random forest (RF) (accuracy of 92%, area under the ROC curve of 0.97) and gradient boosting (GB) (accuracy of 96%, area under the ROC curve of 0.97) were selected to predict the spatial distribution of unclassified and classified landslides, respectively. The evaluation reveals the following: the vegetation coverage index (NDVI) (correlation of 0.2, and the same below) and distance to road (DTR) (0.13) had the highest correlations with the distribution of unclassified landslides; NDVI (0.18) and the annual precipitation index (API) (0.14) had the highest correlations with the distribution of landslides of different ages; API (0.16), average slope (AS) (0.14), and NDVI (0.1) had the highest correlations with the landslide distribution on different scales; and API (0.28) had the highest correlation with the landslide distribution based on different types of landslide movement.
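
A sketch of the modelling step under an assumed data layout (not the study's code): fit the two selected model families on a landslide factor table and compare accuracy and area under the ROC curve.

```python
# Hedged sketch: RF vs. GB susceptibility models with accuracy and ROC-AUC.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

X = np.random.rand(1000, 16)         # placeholder: 16 conditioning factors
y = np.random.randint(0, 2, 1000)    # placeholder: landslide / non-landslide

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
for model in (RandomForestClassifier(random_state=0),
              GradientBoostingClassifier(random_state=0)):
    model.fit(X_tr, y_tr)
    proba = model.predict_proba(X_te)[:, 1]
    print(type(model).__name__,
          accuracy_score(y_te, model.predict(X_te)),
          roc_auc_score(y_te, proba))
```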

21 pages, 1727 KiB  
Article
Learning Future-Aware Correlation Filters for Efficient UAV Tracking
by Fei Zhang, Shiping Ma, Lixin Yu, Yule Zhang, Zhuling Qiu and Zhenyu Li
Remote Sens. 2021, 13(20), 4111; https://doi.org/10.3390/rs13204111 - 14 Oct 2021
Cited by 9 | Viewed by 1897
Abstract
In recent years, discriminative correlation filter (DCF)-based trackers have made considerable progress and drawn widespread attention in the unmanned aerial vehicle (UAV) tracking community. Most existing trackers collect historical information, e.g., training samples, previous filters, and response maps, to promote their discrimination and robustness. Under UAV-specific tracking challenges, e.g., fast motion and view change, variations of both the target and its environment in the new frame are unpredictable. Interfered with by future unknown environments, trackers trained with historical information may be confused by the new context, resulting in tracking failure. In this paper, we propose a novel future-aware correlation filter tracker, i.e., FACF. The proposed method aims to effectively utilize context information in the new frame for better discriminative and robust abilities and consists of two stages: future state awareness and future context awareness. In the former stage, an effective time-series forecasting method is employed to infer a coarse position of the target, which serves as the reference for obtaining a context patch in the new frame. In the latter stage, we first obtain a single context patch with an efficient target-aware method and then train a filter with the future context information in order to perform robust tracking. Extensive experimental results obtained from three UAV benchmarks, i.e., UAV123_10fps, DTB70, and UAVTrack112, demonstrate the effectiveness and robustness of the proposed tracker, which achieves performance comparable with other state-of-the-art trackers while running at ∼49 FPS on a single CPU.
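
For background on the DCF machinery underlying such trackers, a bare MOSSE-style sketch (not the proposed FACF, and without the usual windowing and preprocessing): learn a filter in the Fourier domain whose correlation with the search patch peaks at the target position.

```python
# Hedged sketch of a minimal correlation filter.
import numpy as np

def train_filter(patch, target, lam=1e-2):
    """patch: training image patch; target: desired Gaussian response map."""
    P, G = np.fft.fft2(patch), np.fft.fft2(target)
    return (G * np.conj(P)) / (P * np.conj(P) + lam)    # closed-form solution

def detect(filter_f, patch):
    """Return the peak location of the correlation response in a new patch."""
    response = np.real(np.fft.ifft2(filter_f * np.fft.fft2(patch)))
    return np.unravel_index(response.argmax(), response.shape)
```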

20 pages, 3450 KiB  
Article
Glassboxing Deep Learning to Enhance Aircraft Detection from SAR Imagery
by Ru Luo, Jin Xing, Lifu Chen, Zhouhao Pan, Xingmin Cai, Zengqi Li, Jielan Wang and Alistair Ford
Remote Sens. 2021, 13(18), 3650; https://doi.org/10.3390/rs13183650 - 13 Sep 2021
Cited by 13 | Viewed by 2381
Abstract
Although deep learning has achieved great success in aircraft detection from SAR imagery, its blackbox behavior has been criticized for low comprehensibility and interpretability. Such challenges have impeded the trustworthiness and wide application of deep learning techniques in SAR image analytics. In this paper, we propose an innovative eXplainable Artificial Intelligence (XAI) framework to glassbox deep neural networks (DNN) by using aircraft detection as a case study. This framework is composed of three parts: hybrid global attribution mapping (HGAM) for backbone network selection, path aggregation network (PANet), and class-specific confidence scores mapping (CCSM) for visualization of the detector. HGAM integrates the local and global XAI techniques to evaluate the effectiveness of DNN feature extraction; PANet provides advanced feature fusion to generate multi-scale prediction feature maps; while CCSM relies on visualization methods to examine the detection performance with given DNN and input SAR images. This framework can select the optimal backbone DNN for aircraft detection and map the detection performance for better understanding of the DNN. We verify its effectiveness with experiments using Gaofen-3 imagery. Our XAI framework offers an explainable approach to design, develop, and deploy DNN for SAR image analytics.

21 pages, 2968 KiB  
Article
DGANet: A Dilated Graph Attention-Based Network for Local Feature Extraction on 3D Point Clouds
by Jie Wan, Zhong Xie, Yongyang Xu, Ziyin Zeng, Ding Yuan and Qinjun Qiu
Remote Sens. 2021, 13(17), 3484; https://doi.org/10.3390/rs13173484 - 2 Sep 2021
Cited by 19 | Viewed by 2926
Abstract
Feature extraction on point clouds is an essential task when analyzing and processing point clouds of 3D scenes. However, it remains a challenge to adequately exploit local fine-grained features on point cloud data due to their irregular and unordered structure in 3D space. To alleviate this problem, a Dilated Graph Attention-Based Network (DGANet) with improved local feature learning ability is proposed. Specifically, we first build a local dilated graph-like region for each input point to establish long-range spatial correlation with its corresponding neighbors, which allows the proposed network to access a wider range of geometric information of local points together with their long-range dependencies. Moreover, by integrating the dilated graph attention module (DGAM), implemented by a novel offset–attention mechanism, the proposed network highlights the differing importance of each edge of the constructed local graph to uniquely learn the discrepancy in geometric attributes between the connected point pairs. Finally, all the learned edge attention features are further aggregated by graph-attention pooling into the most significant geometric feature representation of local regions, fully extracting local detailed features for each point. Validation experiments on two challenging benchmark datasets demonstrate the effectiveness and strong generalization ability of our proposed DGANet in both 3D object classification and segmentation tasks.
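
A sketch of the dilated neighbourhood idea under stated assumptions (not the DGANet source): take the k·d nearest neighbours of each point, then keep every d-th one, widening the receptive field without increasing the neighbour count.

```python
# Hedged sketch of dilated k-nearest-neighbour selection on a point cloud.
import numpy as np

def dilated_knn(points, k=16, d=2):
    """points: (N, 3). Returns (N, k) indices of dilated neighbours."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)                 # (N, N) pairwise distances
    order = np.argsort(dist, axis=1)[:, 1:k * d + 1]     # skip self at index 0
    return order[:, ::d]                                 # keep every d-th neighbour
```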

16 pages, 3547 KiB  
Article
Intelligent Recognition Method of Low-Altitude Squint Optical Ship Target Fused with Simulation Samples
by Bo Liu, Qi Xiao, Yuhao Zhang, Wei Ni, Zhen Yang and Ligang Li
Remote Sens. 2021, 13(14), 2697; https://doi.org/10.3390/rs13142697 - 8 Jul 2021
Cited by 4 | Viewed by 2081
Abstract
To address the problem of intelligent recognition of optical ship targets under low-altitude squint detection, we propose an intelligent recognition method based on simulation samples. This method comprehensively considers the geometric and spectral characteristics of ship targets and the ocean background, and performs full-link modeling combined with a squint-detection atmospheric transmission model. It generates and expands squint multi-angle imaging simulation samples of ship targets in the visible light band, and the expanded sample set is used for feature analysis and modification of SqueezeNet. Shallow and deeper features are combined to improve the accuracy of feature recognition. The experimental results demonstrate that using simulation samples to expand the training set can improve the performance of both the traditional k-nearest neighbors algorithm and the modified SqueezeNet. For the classification of specific ship target types, a mixed-scene dataset expanded with simulation samples was used for training, and the classification accuracy of the modified SqueezeNet was 91.85%. These results verify the effectiveness of the proposed method.
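
As a hedged sketch of adapting SqueezeNet to a new set of ship classes with torchvision (the paper modifies the architecture further; this shows only the standard head replacement and fine-tuning setup, with a placeholder batch):

```python
# Hedged sketch: fine-tune SqueezeNet for hypothetical ship classes.
import torch
import torch.nn as nn
from torchvision import models

n_classes = 5                                   # hypothetical number of ship types
net = models.squeezenet1_1(weights=models.SqueezeNet1_1_Weights.DEFAULT)
# SqueezeNet classifies through a final 1x1 convolution, not a linear layer.
net.classifier[1] = nn.Conv2d(512, n_classes, kernel_size=1)
net.num_classes = n_classes

optimizer = torch.optim.Adam(net.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()
x = torch.randn(4, 3, 224, 224)                 # placeholder simulation-sample batch
y = torch.randint(0, n_classes, (4,))
optimizer.zero_grad()
loss = criterion(net(x), y)
loss.backward()
optimizer.step()
```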

24 pages, 12540 KiB  
Article
Forest Fire Risk Prediction: A Spatial Deep Neural Network-Based Framework
by Mohsen Naderpour, Hossein Mojaddadi Rizeei and Fahimeh Ramezani
Remote Sens. 2021, 13(13), 2513; https://doi.org/10.3390/rs13132513 - 27 Jun 2021
Cited by 48 | Viewed by 6773
Abstract
Forest fire is one of the foremost environmental disasters threatening the Australian community. Recognizing the occurrence patterns of fires and identifying fire risk is beneficial for mitigating probable fire threats. Machine learning techniques are recognized as well-known approaches to solving non-linearity problems such as forest fire risk. However, assessing such multivariate environmental disasters has always been challenging, as modelling may be biased by multiple uncertainty sources such as the quality and quantity of input parameters, the training process, and default hyper-parameter setups. In this study, we propose a spatial framework to quantify forest fire risk in the Northern Beaches area of Sydney. Thirty-six significant key indicators contributing to forest fire risk were selected and spatially mapped from topographic, morphologic, climatic, human-induced, social, and physical perspectives as inputs to our model. Optimized deep neural networks were developed to maximize the capability of the multilayer perceptron for forest fire susceptibility assessment. The results show the high precision of the developed model, with accuracy assessment metrics of ROC = 95.1%, PRC = 93.8%, and kappa coefficient = 94.3%. The proposed framework follows a stepwise procedure to run multiple scenarios and calculate the probability of forest fire risk with new contributing input parameters. This improves adaptability and decision-making, as the model can be adapted to different regions of Australia with only minor localization of the weighting procedure required.
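
A baseline multilayer-perceptron sketch under assumed inputs (the paper optimizes a deeper network; the indicator table and labels here are placeholders):

```python
# Hedged sketch: MLP susceptibility model over 36 conditioning indicators.
import numpy as np
from sklearn.neural_network import MLPClassifier

X = np.random.rand(2000, 36)          # placeholder: 36 spatial indicators per cell
y = np.random.randint(0, 2, 2000)     # placeholder: fire / no-fire label

mlp = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500).fit(X, y)
fire_risk_probability = mlp.predict_proba(X)[:, 1]   # per-cell risk score
```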

14 pages, 2166 KiB  
Communication
High Accuracy Interpolation of DEM Using Generative Adversarial Network
by Li Yan, Xingfen Tang and Yi Zhang
Remote Sens. 2021, 13(4), 676; https://doi.org/10.3390/rs13040676 - 13 Feb 2021
Cited by 6 | Viewed by 3192
Abstract
Digital elevation model (DEM) interpolation is aimed at predicting the elevation values of unobserved locations, given a series of collected points. Over the years, the traditional interpolation methods have been widely used but can easily lead to accuracy degradation. In recent years, generative adversarial networks (GANs) have been proven to be more efficient than the traditional methods. However, the interpolation accuracy is not guaranteed. In this paper, we propose a GAN-based network named gated and symmetric-dilated U-net GAN (GSUGAN) for improved DEM interpolation, which performs visibly and quantitatively better than the traditional methods and the conditional encoder-decoder GAN (CEDGAN). We also discuss combinations of new techniques in the generator. This shows that the gated convolution and symmetric dilated convolution structure perform slightly better. Furthermore, based on the performance of the different methods, it was concluded that the Convolutional Neural Network (CNN)-based method has an advantage in the quantitative accuracy but the GAN-based method can obtain a better visual quality, especially in complex terrains. In summary, in this paper, we propose a GAN-based network for improved DEM interpolation and we further illustrate the GAN-based method’s performance compared to that of the CNN-based method.
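
The abstract highlights gated convolution as one of the generator components; below is a hedged, self-contained PyTorch sketch of such a layer (not the published GSUGAN code): a sigmoid gate branch modulates the feature branch per pixel.

```python
# Hedged sketch of a gated convolution layer.
import torch
import torch.nn as nn

class GatedConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.feature = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)
        self.gate = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)

    def forward(self, x):
        # Per-pixel gating: the gate decides which features pass through.
        return torch.tanh(self.feature(x)) * torch.sigmoid(self.gate(x))

out = GatedConv2d(1, 16)(torch.randn(1, 1, 64, 64))   # toy DEM patch -> features
```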

28 pages, 18277 KiB  
Article
End-to-End Super-Resolution for Remote-Sensing Images Using an Improved Multi-Scale Residual Network
by Hai Huan, Pengcheng Li, Nan Zou, Chao Wang, Yaqin Xie, Yong Xie and Dongdong Xu
Remote Sens. 2021, 13(4), 666; https://doi.org/10.3390/rs13040666 - 12 Feb 2021
Cited by 27 | Viewed by 3690
Abstract
Remote-sensing images constitute an important means of obtaining geographic information. Image super-resolution reconstruction techniques are effective methods of improving the spatial resolution of remote-sensing images. Super-resolution reconstruction networks mainly improve the model performance by increasing the network depth. However, blindly increasing the network depth can easily lead to gradient disappearance or gradient explosion, increasing the difficulty of training. This report proposes a new pyramidal multi-scale residual network (PMSRN) that uses hierarchical residual-like connections and dilation convolution to form a multi-scale dilation residual block (MSDRB). The MSDRB enhances the ability to detect context information and fuses hierarchical features through the hierarchical feature fusion structure. Finally, a complementary block of global and local features is added to the reconstruction structure to alleviate the problem that useful original information is ignored. The experimental results showed that, compared with a basic multi-scale residual network, the PMSRN increased the peak signal-to-noise ratio by up to 0.44 dB and the structural similarity to 0.9776.
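
For reference, a quick implementation of the peak signal-to-noise ratio metric the abstract reports, assuming images scaled to [0, 1]; this is the standard definition, not the PMSRN model itself.

```python
# PSNR = 10 * log10(peak^2 / MSE), here with peak value 1.0.
import numpy as np

def psnr(ref, test, peak=1.0):
    mse = np.mean((ref - test) ** 2)
    return float(10 * np.log10(peak ** 2 / mse))
```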

20 pages, 10253 KiB  
Article
Transferability of Convolutional Neural Network Models for Identifying Damaged Buildings Due to Earthquake
by Wanting Yang, Xianfeng Zhang and Peng Luo
Remote Sens. 2021, 13(3), 504; https://doi.org/10.3390/rs13030504 - 31 Jan 2021
Cited by 35 | Viewed by 3879
Abstract
The collapse of buildings caused by earthquakes can lead to a large loss of life and property. Rapid assessment of building damage with remote sensing image data can support emergency rescues. However, current studies indicate that only a limited sample set can usually be obtained from remote sensing images immediately following an earthquake. Consequently, the difficulty in preparing sufficient training samples constrains the generalization of models for identifying earthquake-damaged buildings. To produce a deep learning network model with strong generalization, this study adjusted four Convolutional Neural Network (CNN) models for extracting damaged-building information and compared their performance. A sample dataset of damaged buildings was constructed from multiple disaster images retrieved from the xBD dataset. Using satellite and aerial remote sensing data obtained after the 2008 Wenchuan earthquake, we examined the geographic and data transferability of the deep network model pre-trained on the xBD dataset. The results show that the network model pre-trained with samples generated from multiple disaster remote sensing images can accurately extract collapsed-building information from satellite remote sensing data. Among the adjusted CNN models tested in the study, the adjusted DenseNet121 was the most robust. Transfer learning solved the problem of poor adaptability of the network model to remote sensing images acquired by different platforms and could identify disaster-damaged buildings properly. These results provide a solution for the rapid extraction of earthquake-damaged building information based on a deep learning network model.
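
A hedged sketch of a standard transfer-learning setup with torchvision (not the study's exact configuration): start from a pre-trained DenseNet121 and retrain only the classifier head for damaged/undamaged building patches.

```python
# Hedged sketch: DenseNet121 transfer learning for a 2-class damage task.
import torch.nn as nn
from torchvision import models

net = models.densenet121(weights=models.DenseNet121_Weights.DEFAULT)
for p in net.parameters():            # freeze the pre-trained backbone
    p.requires_grad = False
net.classifier = nn.Linear(net.classifier.in_features, 2)  # damaged / intact
```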

18 pages, 14909 KiB  
Article
Hyperspectral Image Classification Based on Multi-Scale Residual Network with Attention Mechanism
by Yuhao Qing and Wenyi Liu
Remote Sens. 2021, 13(3), 335; https://doi.org/10.3390/rs13030335 - 20 Jan 2021
Cited by 48 | Viewed by 5090
Abstract
In recent years, image classification on hyperspectral imagery utilizing deep learning algorithms has attained good results. Thus, spurred by that finding and to further improve the deep learning classification accuracy, we propose a multi-scale residual convolutional neural network model fused with an efficient channel attention network (MRA-NET) that is appropriate for hyperspectral image classification. The suggested technique comprises a multi-staged architecture, where initially the spectral information of the hyperspectral image is reduced to a two-dimensional tensor using a principal component analysis (PCA) scheme. Then, the constructed low-dimensional image is input to our proposed ECA-NET deep network, which exploits the advantages of its core components, i.e., the multi-scale residual structure and the attention mechanism. We evaluate the performance of the proposed MRA-NET on three publicly available hyperspectral datasets and demonstrate that, overall, the classification accuracy of our method is 99.82%, 99.81%, and 99.37%, respectively, which is higher than that of current networks such as the 3D convolutional neural network (CNN), the three-dimensional residual convolution structure (RES-3D-CNN), and the space–spectrum joint deep network (SSRN).
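
A small sketch of the PCA reduction step the abstract describes, with a placeholder hyperspectral cube; the component count is an assumption.

```python
# Hedged sketch: collapse the spectral axis of an H x W x B cube with PCA.
import numpy as np
from sklearn.decomposition import PCA

cube = np.random.rand(145, 145, 200)            # placeholder H x W x B cube
flat = cube.reshape(-1, cube.shape[-1])         # pixels as spectral vectors
pcs = PCA(n_components=30).fit_transform(flat)  # keep 30 principal components
reduced = pcs.reshape(cube.shape[0], cube.shape[1], -1)
```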

24 pages, 2775 KiB  
Article
Triple-Attention-Based Parallel Network for Hyperspectral Image Classification
by Lei Qu, Xingliang Zhu, Jiannan Zheng and Liang Zou
Remote Sens. 2021, 13(2), 324; https://doi.org/10.3390/rs13020324 - 19 Jan 2021
Cited by 26 | Viewed by 3492
Abstract
Convolutional neural networks have been highly successful in hyperspectral image classification owing to their unique feature expression ability. However, the traditional data partitioning strategy in tandem with patch-wise classification may lead to information leakage and result in overoptimistic experimental insights. In this paper, we propose a novel data partitioning scheme and a triple-attention parallel network (TAP-Net) to enhance the performance of HSI classification without information leakage. The dataset partitioning strategy is simple yet effective in avoiding overfitting, and allows fair comparison of various algorithms, particularly in the case of limited annotated data. In contrast to classical encoder–decoder models, the proposed TAP-Net utilizes parallel subnetworks with the same spatial resolution and repeatedly reuses high-level feature maps of preceding subnetworks to refine the segmentation map. In addition, a channel–spectral–spatial-attention module is proposed to optimize the information transmission between different subnetworks. Experiments were conducted on three benchmark hyperspectral datasets, and the results demonstrate that the proposed method outperforms state-of-the-art methods with overall accuracies of 90.31%, 91.64%, and 81.35% and average accuracies of 93.18%, 87.45%, and 78.85% on the Salinas Valley, Pavia University, and Indian Pines datasets, respectively. This illustrates that the proposed TAP-Net is able to effectively exploit spatial–spectral information to ensure high performance.
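
To illustrate the general leakage problem (the exact partitioning scheme is the paper's contribution), a sketch of one common remedy: assign disjoint spatial blocks to train and test so that neighbouring patches never straddle the split. Block size and test fraction are assumptions.

```python
# Hedged sketch: spatially disjoint block split of an H x W labelled image.
import numpy as np

def block_split(h, w, block=32, test_frac=0.3, seed=0):
    """Assign each block x block tile to train or test; returns two masks."""
    rng = np.random.default_rng(seed)
    rows, cols = -(-h // block), -(-w // block)           # ceiling division
    tiles = rng.random((rows, cols)) < test_frac           # True = test tile
    mask = np.kron(tiles, np.ones((block, block), bool))[:h, :w]
    return ~mask, mask                                     # train mask, test mask
```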

22 pages, 7905 KiB  
Article
Rethinking the Random Cropping Data Augmentation Method Used in the Training of CNN-Based SAR Image Ship Detector
by Rong Yang, Robert Wang, Yunkai Deng, Xiaoxue Jia and Heng Zhang
Remote Sens. 2021, 13(1), 34; https://doi.org/10.3390/rs13010034 - 23 Dec 2020
Cited by 19 | Viewed by 4162
Abstract
The random cropping data augmentation method is widely used to train convolutional neural network (CNN)-based target detectors to detect targets in optical images (e.g., the COCO dataset). It can expand the scale of the dataset dozens of times while adding only a small amount of computation when training the neural network detector. In addition, random cropping can also greatly enhance the spatial robustness of the model, because it can make the same target appear at different positions in the sample image. Nowadays, random cropping and random flipping have become the standard configuration for tasks with limited training data, which makes it natural to introduce them into the training of CNN-based synthetic aperture radar (SAR) image ship detectors. However, in this paper, we show that directly introducing the traditional random cropping method into the training of a CNN-based SAR image ship detector may generate a lot of noise in the gradient during backpropagation, which hurts detection performance. In order to eliminate this noise in the training gradient, a simple and effective training method based on a feature map mask is proposed. Experiments prove that the proposed method can effectively eliminate the gradient noise introduced by random cropping and significantly improve detection performance under a variety of evaluation indicators without increasing inference cost.
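
As a hedged sketch of the core idea as the abstract states it (masking the training signal; not the paper's exact implementation): zero out loss terms on feature-map cells invalidated by random cropping so no gradient flows from them.

```python
# Hedged sketch: per-cell loss masking with a validity mask.
import torch
import torch.nn.functional as F

def masked_loss(pred, target, valid):
    """pred/target: (N, H, W) maps; valid: (N, H, W) float mask, 1 = keep."""
    per_cell = F.smooth_l1_loss(pred, target, reduction="none")
    per_cell = per_cell * valid            # masked cells contribute no gradient
    return per_cell.sum() / valid.sum().clamp(min=1)
```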

16 pages, 8888 KiB  
Article
EFN: Field-Based Object Detection for Aerial Images
by Jin Liu and Haokun Zheng
Remote Sens. 2020, 12(21), 3630; https://doi.org/10.3390/rs12213630 - 5 Nov 2020
Cited by 5 | Viewed by 2567
Abstract
Object detection and recognition in aerial and remote sensing images has become a hot topic in the field of computer vision in recent years. As these images are usually taken from a bird’s-eye view, the targets often have different shapes and are densely arranged. Therefore, using an oriented bounding box to mark the target is a mainstream choice. However, this general method is designed based on horizontal box annotation, while the improved method for detecting an oriented bounding box has a high computational complexity. In this paper, we propose a method called ellipse field network (EFN) to organically integrate semantic segmentation and object detection. It predicts the probability distribution of the target and obtains accurate oriented bounding boxes through a post-processing step. We tested our method on the HRSC2016 and DOTA data sets, achieving mAP values of 0.863 and 0.701, respectively. At the same time, we also tested the performance of EFN on natural images and obtained a mAP of 84.7 in the VOC2012 data set. These extensive experiments demonstrate that EFN can achieve state-of-the-art results in aerial image tests and can obtain a good score when considering natural images.

Other


11 pages, 2179 KiB  
Technical Note
Group-in-Group Relation-Based Transformer for 3D Point Cloud Learning
by Shaolei Liu, Kexue Fu, Manning Wang and Zhijian Song
Remote Sens. 2022, 14(7), 1563; https://doi.org/10.3390/rs14071563 - 24 Mar 2022
Cited by 5 | Viewed by 2361
Abstract
Deep point cloud neural networks have achieved promising performance in remote sensing applications, yet the prevalence of the Transformer in natural language processing and computer vision stands in stark contrast to the underexplored point-based methods. In this paper, we propose an effective transformer-based network for point cloud learning. To better learn global and local information, we propose a group-in-group relation-based transformer architecture that learns the relationships between point groups to model global information and between points within each group to model local semantic information. To further enhance the local feature representation, we propose a Radius Feature Abstraction (RFA) module to extract radius-based density features characterizing the sparsity of local point clouds. Extensive evaluation on public benchmark datasets demonstrates the effectiveness and competitive performance of our proposed method on point cloud classification and part segmentation.

10 pages, 856 KiB  
Technical Note
Satellite Image Multi-Frame Super Resolution Using 3D Wide-Activation Neural Networks
by Francisco Dorr
Remote Sens. 2020, 12(22), 3812; https://doi.org/10.3390/rs12223812 - 20 Nov 2020
Cited by 12 | Viewed by 3832
Abstract
The small satellite market continues to grow year after year, with a compound annual growth rate of 17% estimated for the period between 2020 and 2025. Low-cost satellites can send a vast amount of images to be post-processed on the ground to improve quality and extract detailed information. In this domain lies the resolution enhancement task, where a low-resolution image is automatically converted to a higher resolution. Deep learning approaches to Super Resolution (SR) have reached the state of the art in multiple benchmarks; however, most were studied in a single-frame fashion. With satellite imagery, multi-frame images can be obtained under different conditions, giving the possibility to add more information per image and improve the final analysis. In this context, we developed a model that recently topped the European Space Agency’s Multi-frame Super Resolution (MFSR) competition and applied it to the PROBA-V dataset of multi-frame satellite images. The model is based on proven methods that worked on 2D images, tweaked to work on 3D: the Wide Activation Super Resolution (WDSR) family. We show that with a simple 3D CNN residual architecture with WDSR blocks and a frame permutation technique as the data augmentation, better scores can be achieved than with more complex models. Moreover, the model requires few hardware resources, both for training and evaluation, so it can be applied directly on a personal laptop.
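
A minimal sketch of the frame-permutation augmentation mentioned in the abstract, under an assumed (T, H, W) tensor layout: shuffle the temporal axis of each multi-frame low-resolution stack so the network cannot overfit a fixed frame order.

```python
# Hedged sketch: permute the frame axis of a multi-frame LR stack.
import numpy as np

def permute_frames(stack, rng):
    """stack: (T, H, W) low-resolution frames of the same scene."""
    return stack[rng.permutation(stack.shape[0])]

lr_stack = np.random.rand(9, 128, 128)            # placeholder PROBA-V-like input
augmented = permute_frames(lr_stack, np.random.default_rng(42))
```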
