Special Issue "Deep Learning and Computer Vision for GeoInformation Sciences"

Special Issue Editors

Dr. James Haworth
Guest Editor
Department of Civil, Environmental and Geomatic Engineering, UCL, London WC1E 6BT, UK
Interests: GIScience; machine learning; artificial intelligence; computer vision; transport; geocomputation; geosimulation
Prof. Dr. Suzana Dragicevic
Guest Editor
Spatial Analysis and Modeling Lab, Department of Geography, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
Interests: geographic information systems and science (GIS); geosimulation; geographic automata modeling; artificial intelligence; soft computing; geocomputation
Dr. Marguerite Madden
Guest Editor
Center for Geospatial Research, Department of Geography, University of Georgia, Athens, GA 30602, USA
Interests: GIScience and landscape ecology; remote sensing; geovisualization and geospatial analysis for human/animal–environment interactions
Dr. Mingshu Wang
Guest Editor
Faculty of Geo-Information Science and Earth Observation (ITC) of the University of Twente, Department of Geo-information Processing, PO Box 217, 7500 AE Enschede, The Netherlands
Interests: GIScience; geodata science; urban informatics
Dr. Haosheng Huang
Guest Editor
Geographic Information Science (GIS), Department of Geography, University of Zurich, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland
Interests: GIScience; location based services; geospatial big data analytics

Special Issue Information

Dear Colleagues,

In recent years, significant progress has been made in the combined fields of deep learning (DL) and computer vision (CV), with applications ranging from driverless cars to facial recognition to robotics. There is a natural synergy between the geoinformation sciences and DL and CV due to the vast quantities of geolocated and time-stamped data being generated from various sources, including satellite imagery, street-level images and video, airborne and unmanned aerial system (UAS) imagery, social media data, text data, and other data streams. In the field of remote sensing, in particular, significant progress has already been made in object detection, image classification, and scene classification, amongst other tasks. More recently, DL architectures have been applied to heterogeneous geospatial data types, such as networks, broadening their applicability across a range of spatial processes.

This Special Issue aims to collate the state of the art in deep learning and computer vision for the geoinformation sciences, from the application of existing DL and CV algorithms in diverse contexts to the development of novel techniques. Submissions are invited across a range of topics related to DL and CV, including but not limited to:

Theory and algorithms: Development of novel theory and algorithms specific to the geoinformation sciences, including methods for modelling heterogeneous spatio-temporal data types.
Integration of DL and CV into traditional modelling frameworks: Using DL and CV to augment traditional modelling techniques, e.g., through data creation, fusion, or integrated algorithmic design.
Deep reinforcement learning: Application of deep reinforcement learning to spatial processes.
Geocomputation for DL and CV: Improving the performance and scalability of DL and CV using geocomputational techniques.
Incorporating DL and CV in Geoinformation Science Curricula: Meeting the growing demand for artificial intelligence in education, particularly the geolocational aspects of DL and CV, in GIScience programs as well as in the disciplines where DL/CV are developed (e.g., engineering, computer science) and in the application areas listed below.
Applications: Open scope within the geoinformation sciences (e.g., transport and mobility, smart cities, agriculture, marine science, ecology, geology, forestry, public health, urban/rural planning, infrastructure, disaster management, social networks, local/global modelling, climate and atmosphere, etc.).

Dr. James Haworth
Prof. Dr. Suzana Dragicevic
Dr. Marguerite Madden
Dr. Mingshu Wang
Dr. Haosheng Huang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All papers will be peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. ISPRS International Journal of Geo-Information is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Computer vision
  • Deep learning
  • Convolutional and recurrent neural networks
  • Image classification
  • Object detection
  • Spatiotemporal
  • Urban sensing and urban computing

Published Papers (17 papers)


Research

Article
MCCRNet: A Multi-Level Change Contextual Refinement Network for Remote Sensing Image Change Detection
ISPRS Int. J. Geo-Inf. 2021, 10(9), 591; https://doi.org/10.3390/ijgi10090591 - 07 Sep 2021
Abstract
Change detection based on bi-temporal remote sensing images has made significant progress in recent years, aiming to identify the changed and unchanged pixels between a registered pair of images. However, most learning-based change detection methods only utilize fused high-level features from the feature encoder and thus miss the detailed representations that low-level feature pairs contain. Here we propose a multi-level change contextual refinement network (MCCRNet) to strengthen the multi-level change representations of feature pairs. To effectively capture the dependencies of feature pairs while avoiding fusing them, our atrous spatial pyramid cross attention (ASPCA) module introduces a crossed spatial attention module and a crossed channel attention module to emphasize the position importance and channel importance of each feature while simultaneously keeping the scale of input and output the same. This module can be plugged into any feature extraction layer of a Siamese change detection network. Furthermore, we propose a change contextual representations (CCR) module from the perspective of the relationship between the change pixels and the contextual representation, named change region contextual representations. The CCR module aims to correct changed pixels mistakenly predicted as unchanged by a class attention mechanism. Finally, we introduce an effective sample number adaptively weighted loss to solve the class-imbalanced problem of change detection datasets. On the whole, compared with other attention modules that only use fused features from the highest feature pairs, our method can capture the multi-level spatial, channel, and class context of change discrimination information. The experiments are performed with four public change detection datasets of various image resolutions. 
Compared to state-of-the-art methods, our MCCRNet achieved superior performance on all datasets (i.e., LEVIR, Season-Varying Change Detection Dataset, Google Data GZ, and DSIFN) with improvements of 0.47%, 0.11%, 2.62%, and 3.99%, respectively.
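The "effective sample number adaptively weighted loss" mentioned in the abstract is not spelled out there; the sketch below assumes the common class-balanced weighting of Cui et al. (2019), in which each class is weighted by the inverse of its effective sample count. This is a plausible reading, not the paper's exact formulation.

```python
import numpy as np

def class_balanced_weights(counts, beta=0.9999):
    """Per-class weights from the 'effective number of samples'
    E_c = (1 - beta**n_c) / (1 - beta); weight_c = 1 / E_c,
    normalised so the weights sum to the number of classes."""
    counts = np.asarray(counts, dtype=float)
    effective = (1.0 - beta ** counts) / (1.0 - beta)
    w = 1.0 / effective
    return w * len(counts) / w.sum()

# Change detection is typically binary and heavily skewed:
w = class_balanced_weights([95000, 5000])  # unchanged vs. changed pixels
```

The rarer "changed" class receives the larger weight, counteracting the class imbalance the abstract describes.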
(This article belongs to the Special Issue Deep Learning and Computer Vision for GeoInformation Sciences)

Article
Development of a City-Scale Approach for Façade Color Measurement with Building Functional Classification Using Deep Learning and Street View Images
ISPRS Int. J. Geo-Inf. 2021, 10(8), 551; https://doi.org/10.3390/ijgi10080551 - 16 Aug 2021
Abstract
Precise measuring of urban façade color is necessary for urban color planning. The existing manual methods of measuring building façade color are limited by time and labor costs and can hardly be carried out on a city scale. These methods also make it challenging to identify the role of the building function in controlling and guiding urban color planning. This paper explores a city-scale approach to façade color measurement with building functional classification using state-of-the-art deep learning techniques and street view images. Firstly, we used semantic segmentation to extract building façades and conducted color calibration of the photos to pre-process the collected street view images. Then, we proposed a color chart-based façade color measurement method and a multi-label deep learning-based building classification method. Next, field survey data were used as the ground truth to verify the accuracy of the façade color measurement and building function classification. Finally, we applied our approach to generate façade color distribution maps with the building classification for three metropolises in China, and the results proved the transferability and effectiveness of the scheme. The proposed approach can provide city managers with an overall perception of urban façade color and building function across city-scale areas in a cost-efficient way, contributing to data-driven decision making for urban analytics and planning.
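As a rough illustration of the measurement step (the arrays and colour chart here are hypothetical; the paper's full pipeline also involves semantic segmentation and colour calibration), the façade colour of a masked region can be taken as its mean RGB and snapped to the nearest chart entry:

```python
import numpy as np

def facade_color(image, mask, chart):
    """Mean RGB over facade pixels, snapped to the nearest colour-chart
    entry (Euclidean distance in RGB; a perceptual space such as CIELAB
    would be a natural refinement)."""
    mean_rgb = image[mask].mean(axis=0)
    name = min(chart, key=lambda k: np.sum((mean_rgb - chart[k]) ** 2))
    return mean_rgb, name

# Toy example: a 4x4 image whose top half is a facade.
img = np.zeros((4, 4, 3), dtype=float)
img[:2] = [205, 170, 125]
m = np.zeros((4, 4), dtype=bool)
m[:2] = True
chart = {"beige": np.array([210, 180, 140]), "slate": np.array([112, 128, 144])}
```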

Article
Semantic Relation Model and Dataset for Remote Sensing Scene Understanding
ISPRS Int. J. Geo-Inf. 2021, 10(7), 488; https://doi.org/10.3390/ijgi10070488 - 17 Jul 2021
Abstract
A deep understanding of our visual world is more than an isolated perception of a series of objects; the relationships between them also contain rich semantic information. Especially for satellite remote sensing images, the span is so large that the various objects are always of different sizes and complex spatial compositions. Therefore, the recognition of semantic relations is conducive to strengthening the understanding of remote sensing scenes. In this paper, we propose a novel multi-scale semantic fusion network (MSFN). In this framework, dilated convolution is introduced into a graph convolutional network (GCN) based on an attentional mechanism to fuse and refine multi-scale semantic context, which is crucial to strengthen the cognitive ability of our model. Besides, based on the mapping between visual features and semantic embeddings, we design a sparse relationship extraction module to remove meaningless connections among entities and improve the efficiency of scene graph generation. Meanwhile, to further promote the research of scene understanding in the remote sensing field, this paper also proposes a remote sensing scene graph dataset (RSSGD). We carry out extensive experiments, and the results show that our model significantly outperforms previous methods on scene graph generation. In addition, RSSGD effectively bridges the huge semantic gap between low-level perception and high-level cognition of remote sensing images.

Article
A Cost Function for the Uncertainty of Matching Point Distribution on Image Registration
ISPRS Int. J. Geo-Inf. 2021, 10(7), 438; https://doi.org/10.3390/ijgi10070438 - 25 Jun 2021
Abstract
Computing the homography matrix using the known matching points is a key step in computer vision for image registration. In practice, the number, accuracy, and distribution of the known matching points can affect the uncertainty of the homography matrix. This study mainly focuses on the effect of matching point distribution on image registration. First, horizontal dilution of precision (HDOP) is derived to measure the influence of the distribution of known points on fixed point position accuracy on the image. The quantization function, which is the average of the center points' HDOP over the overlapping region, is then constructed to measure the uncertainty of the matching distribution. Finally, experiments on image registration were performed to verify the proposed function. We tested the consistency of the relationship between the proposed function and the average of symmetric transfer errors. Consequently, the proposed function is appropriate for measuring the uncertainty of matching point distribution on image registration.
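A minimal sketch of how HDOP can quantify point distribution, assuming the standard geodetic definition sqrt(trace((AᵀA)⁻¹)) with unit direction vectors from the evaluated point to the known points (the paper's exact derivation may differ):

```python
import numpy as np

def hdop(known_pts, eval_pt):
    """HDOP of an evaluation point with respect to a set of known
    (matching) points: sqrt(trace((A^T A)^-1)), where the rows of A
    are unit direction vectors from eval_pt to each known point."""
    d = np.asarray(known_pts, dtype=float) - np.asarray(eval_pt, dtype=float)
    A = d / np.linalg.norm(d, axis=1, keepdims=True)
    return float(np.sqrt(np.trace(np.linalg.inv(A.T @ A))))

# Well-spread points give a lower (better) HDOP than clustered ones:
spread = hdop([(1, 0), (0, 1), (-1, 0), (0, -1)], (0, 0))
clustered = hdop([(1, 0.0), (1, 0.1), (1, -0.1), (0.9, 0.0)], (0, 0))
```

The geometric intuition matches the abstract: a clustered set of matching points constrains the homography poorly, which the larger HDOP value reflects.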

Article
Quantifying the Characteristics of the Local Urban Environment through Geotagged Flickr Photographs and Image Recognition
ISPRS Int. J. Geo-Inf. 2020, 9(4), 264; https://doi.org/10.3390/ijgi9040264 - 19 Apr 2020
Abstract
Urban environments play a crucial role in the design, planning, and management of cities. Recently, as the urban population expands, the ways in which humans interact with their surroundings have evolved, presenting dynamic distributions in space and time locally and frequently. Therefore, how to better understand the local urban environment and differentiate varying preferences for urban areas has been a big challenge for policymakers. This study leverages geotagged Flickr photographs to quantify characteristics of varying urban areas and exploit the dynamics of areas where more people assemble. An advanced image recognition model is used to extract features from large numbers of images of Inner London within the period 2013–2015. After the integration of characteristics, a series of visualisation techniques are utilised to explore the characteristic differences and their dynamics. We find that urban areas with higher population densities contain more iconic landmarks and leisure zones, while others are more related to daily life scenes. The dynamic results demonstrate that season determines human preferences for travel modes and activity modes. Our study expands the previous literature on the integration of image recognition methods and urban perception analytics and provides new insights for stakeholders, who can use these findings as vital evidence for decision making.

Article
Quantification Method for the Uncertainty of Matching Point Distribution on 3D Reconstruction
ISPRS Int. J. Geo-Inf. 2020, 9(4), 187; https://doi.org/10.3390/ijgi9040187 - 25 Mar 2020
Abstract
Matching points are the direct data sources of the fundamental matrix, camera parameters, and point cloud calculation. Thus, their uncertainty has a direct influence on the quality of image-based 3D reconstruction and is dependent on the number, accuracy, and distribution of the matching points. This study mainly focuses on the uncertainty of matching point distribution. First, horizontal dilution of precision (HDOP) is used to quantify the feature point distribution in the overlapping region of multiple images. Then, the quantization method is constructed: HDOP̄, the average of 2 × arctan(HDOP × n^(1/5))/π over all images, is utilized to measure the uncertainty of matching point distribution on 3D reconstruction. Finally, simulated and real scene experiments were performed to describe and verify the rationality of the proposed method. We found that the relationship between HDOP̄ and the matching point distribution in this study was consistent with that between matching point distribution and 3D reconstruction. Consequently, it may be a feasible method to predict the quality of 3D reconstruction by calculating the uncertainty of matching point distribution.
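A small sketch of the quantization, assuming HDOP̄ averages 2 × arctan(HDOP × n^(1/5))/π over the images; the n^(1/5) scaling is this sketch's reading of the paper's formula, not a confirmed detail. The arctan squashing maps the unbounded HDOP into [0, 1):

```python
import math

def hdop_bar(hdops, n):
    """Average normalised HDOP over all images, with n the number of
    matching points; 2*atan(.)/pi maps [0, inf) into [0, 1)."""
    return sum(2 * math.atan(h * n ** 0.2) / math.pi for h in hdops) / len(hdops)
```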

Article
Classification and Segmentation of Mining Area Objects in Large-Scale Sparse LiDAR Point Cloud Using a Novel Rotated Density Network
ISPRS Int. J. Geo-Inf. 2020, 9(3), 182; https://doi.org/10.3390/ijgi9030182 - 24 Mar 2020
Abstract
The classification and segmentation of large-scale, sparse LiDAR point clouds with deep learning are widely used in engineering survey and geoscience. The loose structure and the non-uniform point density are the two major constraints to utilizing the sparse point cloud. This paper proposes a lightweight auxiliary network, called the rotated density-based network (RD-Net), and a novel point cloud preprocessing method, Grid Trajectory Box (GT-Box), to solve these problems. The combination of RD-Net and PointNet was used to achieve high-precision 3D classification and segmentation of the sparse point cloud, emphasizing the importance of the density feature of LiDAR points for 3D object recognition of sparse point clouds. Furthermore, RD-Net plus PointCNN, PointNet, PointCNN, and RD-Net alone were introduced as comparisons. Public datasets were used to evaluate the performance of the proposed method. The results showed that RD-Net could significantly improve the performance of sparse point cloud recognition for the coordinate-based networks, improving the classification accuracy to 94% and the segmentation per-point accuracy to 70%. Additionally, the results concluded that point-density information has an independent spatial–local correlation and plays an essential role in the process of sparse point cloud recognition.
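Since the method rests on the density feature of LiDAR points, a naive per-point density estimate can be sketched as follows (brute force for illustration; at LiDAR scale a k-d tree would be used, and RD-Net learns from such cues rather than this exact statistic):

```python
import numpy as np

def point_density(points, radius=1.0):
    """Per-point density: number of neighbours within `radius` of each
    point, computed from the full pairwise squared-distance matrix."""
    pts = np.asarray(points, dtype=float)
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
    return (d2 <= radius ** 2).sum(axis=1) - 1  # exclude the point itself

# Two nearby points and one isolated point:
dens = point_density([(0, 0, 0), (0.5, 0, 0), (5, 5, 5)])
```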

Article
Extracting Representative Images of Tourist Attractions from Flickr by Combining an Improved Cluster Method and Multiple Deep Learning Models
ISPRS Int. J. Geo-Inf. 2020, 9(2), 81; https://doi.org/10.3390/ijgi9020081 - 31 Jan 2020
Abstract
Extracting representative images of tourist attractions from geotagged photos is beneficial to many fields in tourist management, such as applications in touristic information systems. This task usually begins with clustering to extract tourist attractions from raw coordinates in geotagged photos. However, most existing cluster methods are limited in the accuracy and granularity of the places of interest, as well as in detecting distinct tags, due to their primary consideration of spatial relationships. After clustering, the challenge still exists for the task of extracting representative images within the geotagged base image data, because of the existence of noisy photos largely occupied by humans and unrelated objects. In this paper, we propose a framework containing an improved cluster method and multiple neural network models to extract representative images of tourist attractions. We first propose a novel time- and user-constrained density-joinable cluster method (TU-DJ-Cluster), specific to photos with similar geotags, to detect place-relevant tags. Then we merge and extend the clusters according to the similarity between pairs of tag embeddings, as trained from Word2Vec. Based on the clustering result, we filter out noisy images with a Multilayer Perceptron and a single-shot multibox detector model, and further select representative images with the deep ranking model. We select Beijing as the study area. The quantitative and qualitative analysis, as well as the questionnaire results obtained from real-life tourists, demonstrate the effectiveness of this framework.
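The tag-embedding merge step might look like the following sketch (the function name, threshold, and toy embeddings are illustrative, not the paper's):

```python
import numpy as np

def merge_by_tag_similarity(clusters, emb, threshold=0.8):
    """Greedily merge clusters whose tag embeddings are close in cosine
    similarity. clusters: dict tag -> set of photo ids; emb: dict tag -> vector."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    tags = list(clusters)
    merged = {t: set(v) for t, v in clusters.items()}
    for i, a in enumerate(tags):
        for b in tags[i + 1:]:
            if a in merged and b in merged and cos(emb[a], emb[b]) >= threshold:
                merged[a] |= merged.pop(b)  # fold b's photos into a
    return merged

# Toy data: two near-synonymous place tags and one unrelated tag.
clusters = {"forbidden city": {1}, "palace museum": {2}, "hotpot": {3}}
emb = {"forbidden city": [1.0, 0.0], "palace museum": [0.99, 0.1], "hotpot": [0.0, 1.0]}
merged = merge_by_tag_similarity(clusters, emb)
```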

Article
Linguistic Landscapes on Street-Level Images
ISPRS Int. J. Geo-Inf. 2020, 9(1), 57; https://doi.org/10.3390/ijgi9010057 - 20 Jan 2020
Abstract
Linguistic landscape research focuses on relationships between written languages in public spaces and the sociodemographic structure of a city. While a great deal of work has been done on the evaluation of linguistic landscapes in different cities, most of the studies are based on ad hoc interpretation of data collected from fieldwork. The purpose of this paper is to develop a new methodological framework that combines computer vision and machine learning techniques for assessing the diversity of languages from street-level images. As demonstrated with an analysis of a small Chinese community in Seoul, South Korea, the proposed approach can reveal the spatiotemporal pattern of linguistic variations effectively and provide insights into the demographic composition as well as social changes in the neighborhood. Although the method presented in this work is at a conceptual stage, it has the potential to open new opportunities to conduct linguistic landscape research at a large scale and in a reproducible manner. It is also capable of yielding a more objective description of a linguistic landscape than arbitrary classification and interpretation of on-site observations. The proposed approach can be a new direction for the study of linguistic landscapes that builds upon urban analytics methodology, and it will help both geographers and sociolinguists explore and understand our society.

Article
Identification of Salt Deposits on Seismic Images Using Deep Learning Method for Semantic Segmentation
ISPRS Int. J. Geo-Inf. 2020, 9(1), 24; https://doi.org/10.3390/ijgi9010024 - 01 Jan 2020
Abstract
Several areas of Earth that are rich in oil and natural gas also have huge deposits of salt below the surface. Because of this connection, knowing the precise locations of large salt deposits is extremely important to companies involved in oil and gas exploration. To locate salt bodies, professional seismic imaging is needed. These images are analyzed by human experts, which leads to very subjective and highly variable renderings. To motivate automation and increase the accuracy of this process, TGS-NOPEC Geophysical Company (TGS) sponsored a Kaggle competition that was held in the second half of 2018. The competition was very popular, gathering 3221 individuals and teams. Data for the competition included a training set of 4000 seismic image patches and corresponding segmentation masks. The test set contained 18,000 seismic image patches used for evaluation (all images are 101 × 101 pixels). Depth information of the sample location was also provided for every seismic image patch. The method presented in this paper is based on the author's participation in the competition and relies on training a deep convolutional neural network (CNN) for semantic segmentation. The architecture of the proposed network is inspired by the U-Net model in combination with ResNet and DenseNet architectures. To better comprehend the properties of the proposed architecture, a series of experiments were conducted applying standardized approaches within the same training framework. The results showed that the proposed architecture is comparable to and, in most cases, better than these segmentation models.
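Salt-mask predictions in the TGS challenge were scored with IoU-based metrics (the competition averaged precision over several IoU thresholds; the plain intersection-over-union shown here is the building block):

```python
import numpy as np

def iou(pred, truth):
    """Intersection-over-union of two boolean segmentation masks.
    Returns 1.0 when both masks are empty (a common convention)."""
    pred, truth = np.asarray(pred, dtype=bool), np.asarray(truth, dtype=bool)
    union = np.logical_or(pred, truth).sum()
    return np.logical_and(pred, truth).sum() / union if union else 1.0
```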

Article
Extracting Building Areas from Photogrammetric DSM and DOM by Automatically Selecting Training Samples from Historical DLG Data
ISPRS Int. J. Geo-Inf. 2020, 9(1), 18; https://doi.org/10.3390/ijgi9010018 - 01 Jan 2020
Abstract
This paper presents an automatic building extraction method which utilizes a photogrammetric digital surface model (DSM) and digital orthophoto map (DOM) with the help of historical digital line graphic (DLG) data. To reduce the need for manual labeling, the initial labels were automatically obtained from historical DLGs. Nonetheless, a proportion of these labels are incorrect due to changes (e.g., new constructions, demolished buildings). To select clean samples, an iterative method using a random forest (RF) classifier was proposed to remove possible incorrect labels. To obtain effective features, deep features extracted from the normalized DSM (nDSM) and the DOM using pre-trained fully convolutional networks (FCN) were combined. To control the computation cost and alleviate the burden of redundancy, the principal component analysis (PCA) algorithm was applied to reduce the feature dimensions. Three data sets in two areas were employed, with evaluation in two aspects. In these data sets, three DLGs with 15%, 65%, and 25% label noise were applied. The results demonstrate that the proposed method can effectively select clean samples and maintain acceptable quality of extracted results in both pixel-based and object-based evaluations.
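The PCA reduction of the concatenated deep features can be sketched with a plain SVD (the dimensions below are illustrative, not the paper's):

```python
import numpy as np

def pca_reduce(X, k):
    """Project feature vectors X (n_samples x n_features) onto the
    top-k principal components via SVD of the centred data matrix."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

# e.g., 64-D deep features for 100 samples reduced to 8 dimensions:
feats = np.random.default_rng(0).normal(size=(100, 64))
reduced = pca_reduce(feats, 8)
```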

Article
Dynamic Recommendation of POI Sequence Responding to Historical Trajectory
ISPRS Int. J. Geo-Inf. 2019, 8(10), 433; https://doi.org/10.3390/ijgi8100433 - 30 Sep 2019
Abstract
Point-of-Interest (POI) recommendation is attracting increasing attention from researchers because of the rapid development of Location-based Social Networks (LBSNs) in recent years. Differing from other recommenders, which only recommend the next POI, this research focuses on successive POI sequence recommendation. A novel POI sequence recommendation framework, named Dynamic Recommendation of POI Sequence (DRPS), is proposed, which models POI sequence recommendation as a Sequence-to-Sequence (Seq2Seq) learning task; that is, the input sequence is a historical trajectory, and the output sequence is exactly the POI sequence to be recommended. To solve this Seq2Seq problem, an effective architecture is designed based on a Deep Neural Network (DNN). Owing to the end-to-end workflow, DRPS can easily make dynamic POI sequence recommendations by allowing the input to change over time. In addition, two new metrics named Aligned Precision (AP) and Order-aware Sequence Precision (OSP) are proposed to evaluate the recommendation accuracy of a POI sequence, which consider not only the POI identity but also the visiting order. The experimental results show that the proposed method is effective for POI sequence recommendation tasks, and it significantly outperforms baseline approaches like Additive Markov Chain, LORE, and LSTM-Seq2Seq.
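The Aligned Precision metric is described only briefly; one plausible reading (an assumption, not the paper's exact definition) scores the fraction of positions where the recommended and actually visited POIs coincide:

```python
def aligned_precision(recommended, actual):
    """Order-aware match rate: fraction of positions at which the
    recommended POI equals the visited POI. Hypothetical reading of
    the paper's AP metric; the exact alignment rule may differ."""
    n = max(len(recommended), len(actual))
    hits = sum(1 for r, a in zip(recommended, actual) if r == a)
    return hits / n if n else 0.0
```

Unlike a set-based precision, this scores a POI only when it appears at the correct position, so visiting order matters.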

Article
Image Retrieval Based on Learning to Rank and Multiple Loss
ISPRS Int. J. Geo-Inf. 2019, 8(9), 393; https://doi.org/10.3390/ijgi8090393 - 04 Sep 2019
Abstract
Image retrieval using deep convolutional features has achieved the most advanced performance on most standard benchmarks. In image retrieval, deep metric learning (DML) plays a key role, aiming to capture the semantic similarity information carried by data points. However, two factors may impede the accuracy of image retrieval. First, when learning the similarity of negative examples, current methods push all negative pairs to equal distances in the embedding space, so the intraclass data distribution may be missed. Second, given a query, either a fraction of the data points or all of them are incorporated to build the similarity structure, which makes it complex to calculate similarity or to choose example pairs. In this study, to achieve more accurate image retrieval, we propose a method based on learning to rank and multiple loss (LRML). To address the first problem, we learn the ranking sequence and thereby separate the negative pairs from the query image into different distances. To tackle the second problem, we use a positive example in the gallery and negative sets from the bottom five ranked by similarity, thereby enhancing training efficiency. Experimental results demonstrate that the proposed method achieves state-of-the-art performance on three widely used benchmarks.

Article
Using Vehicle Synthesis Generative Adversarial Networks to Improve Vehicle Detection in Remote Sensing Images
ISPRS Int. J. Geo-Inf. 2019, 8(9), 390; https://doi.org/10.3390/ijgi8090390 - 04 Sep 2019
Cited by 15 | Viewed by 1684
Abstract
Vehicle detection based on very high-resolution (VHR) remote sensing images is beneficial in many fields, such as military surveillance, traffic control, and social/economic studies. However, the intricate details of vehicles and the surrounding background in VHR images require sophisticated analysis based on massive data samples, while the amount of reliably labeled training data is limited. In practice, data augmentation is often leveraged to resolve this conflict. Traditional data augmentation strategies use combinations of rotation, scaling, flipping, and similar transformations, and have limited ability to capture the essence of the feature distribution and improve data diversity. In this study, we propose a learning method named Vehicle Synthesis Generative Adversarial Networks (VS-GANs) to generate annotated vehicles from remote sensing images. The proposed framework has one generator and two discriminators, which try to synthesize realistic vehicles and learn the background context simultaneously. The method can quickly generate high-quality annotated vehicle data samples and greatly helps in the training of vehicle detectors. Experimental results show that the proposed framework can synthesize vehicles and their background images with variations and different levels of detail. Compared with traditional data augmentation methods, the proposed method significantly improves the generalization capability of vehicle detectors. Finally, the contribution of VS-GANs to vehicle detection in VHR remote sensing images was demonstrated in experiments on the UCAS-AOD and NWPU VHR-10 datasets using up-to-date target detection frameworks.
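The traditional baseline that VS-GANs improves upon, rotation and flipping, can be sketched in a few lines on a 2-D grid standing in for an image patch (a minimal illustration, not the authors' pipeline); every input yields eight geometric variants, but no genuinely new appearance:

```python
def rot90(img):
    """Rotate a 2-D grid (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def hflip(img):
    """Mirror a 2-D grid horizontally."""
    return [row[::-1] for row in img]

def augment(img):
    """The eight rotation/flip variants of an image patch: the classic
    augmentation set whose diversity is limited to rigid symmetries."""
    out, cur = [], img
    for _ in range(4):
        out.append(cur)
        out.append(hflip(cur))
        cur = rot90(cur)
    return out
```

Because all eight variants share the same pixel content, such augmentation cannot broaden the feature distribution the way a generative model that synthesizes new vehicles against new backgrounds can.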

Article
Using Intelligent Clustering to Implement Geometric Computation for Electoral Districting
ISPRS Int. J. Geo-Inf. 2019, 8(9), 369; https://doi.org/10.3390/ijgi8090369 - 23 Aug 2019
Viewed by 1022
Abstract
Traditional electoral districting is mostly carried out by manual division. This is not only time-consuming and labor-intensive, but it also makes it difficult to maintain the principles of fairness and consistency. Due to specific political interests, objectivity is often distorted, making the results of a representative election controversial. In order to reflect the spirit of democracy, this study uses computing technologies to divide constituencies automatically, applying the concepts of "intelligent clustering" and "extreme arrangement" to overcome many shortcomings of traditional manual division. In addition, various information technologies are integrated to obtain the most feasible solutions within the maximum capabilities of the computing system, without sacrificing the global representation of the solutions. We take Changhua County, Taiwan as a complete example of electoral districting and find better results relative to the official version: a smaller population difference between constituencies, more complete and symmetrical constituencies, and fewer regional controversies. Our results demonstrate that multidimensional algorithms using a geographic information system can solve many problems of block districting and support decisions based on different needs.
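As a generic illustration of the clustering step in automated districting (a plain k-means sketch on block centroids; the article's "intelligent clustering" and "extreme arrangement" procedures are its own and are not reproduced here), spatially compact groups can be formed like this:

```python
def kmeans(points, centers, iters=10):
    """Plain k-means on (x, y) block centroids: assign each block to the
    nearest district seed, then move each seed to its group's mean.
    A stand-in for the far richer clustering the paper describes."""
    groups = [[] for _ in centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)),
                    key=lambda c: (p[0] - centers[c][0]) ** 2
                                + (p[1] - centers[c][1]) ** 2)
            groups[i].append(p)
        new_centers = []
        for i, g in enumerate(groups):
            if g:
                new_centers.append((sum(p[0] for p in g) / len(g),
                                    sum(p[1] for p in g) / len(g)))
            else:
                new_centers.append(centers[i])   # keep an empty seed put
        centers = new_centers
    return centers, groups
```

Real districting must additionally enforce contiguity and near-equal population per district, which is precisely where plain k-means falls short and purpose-built methods are needed.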

Article
Short-Term Prediction of Bus Passenger Flow Based on a Hybrid Optimized LSTM Network
ISPRS Int. J. Geo-Inf. 2019, 8(9), 366; https://doi.org/10.3390/ijgi8090366 - 22 Aug 2019
Cited by 12 | Viewed by 1434
Abstract
The accurate prediction of bus passenger flow is key to public transport management and the smart city. A long short-term memory (LSTM) network, a deep learning method for modeling sequences, is an efficient way to capture the time dependency of passenger flow. In recent years, an increasing number of researchers have sought to apply the LSTM model to passenger flow prediction. However, few pay attention to the optimization procedure during model training. In this article, we propose a hybrid, optimized LSTM network based on Nesterov-accelerated adaptive moment estimation (Nadam) and stochastic gradient descent (SGD). This method trains the model with high efficiency and accuracy, addressing the problems of inefficient training and misconvergence that exist in complex models. We employ the hybrid optimized LSTM network to predict actual passenger flow in Qingdao, China and compare the prediction results with those obtained by non-hybrid LSTM models and conventional methods. In particular, the proposed model brings a 4–20% additional performance improvement over non-hybrid LSTM models. We also tried combinations of other optimization algorithms and applications in different models, finding that switching from Nadam to SGD when optimizing the LSTM is the best choice. The sensitivity of the model to its parameters is also explored, which provides guidance for applying this model to bus passenger flow data modelling. The good performance of the proposed model at different temporal and spatial scales shows that it is robust and effective, and it can provide insightful support and guidance for dynamic bus scheduling and regional coordination scheduling.
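The hybrid schedule, an adaptive Nadam-style phase for fast early progress followed by plain SGD for stable convergence, can be demonstrated on a toy one-parameter problem (minimizing (w - 3)^2; the learning rates, switch point, and loss here are illustrative choices, not the paper's settings):

```python
import math

def train_hybrid(w=0.0, steps=200, switch=100, lr=0.05):
    """Minimise (w - 3)^2 with a Nadam-style adaptive update for the
    first `switch` steps, then plain SGD, mirroring the hybrid
    Nadam-to-SGD schedule described in the abstract."""
    m = v = 0.0
    b1, b2, eps = 0.9, 0.999, 1e-8
    for t in range(1, steps + 1):
        g = 2.0 * (w - 3.0)                  # gradient of the toy loss
        if t <= switch:
            # Nadam phase: bias-corrected moments plus a Nesterov term
            m = b1 * m + (1 - b1) * g
            v = b2 * v + (1 - b2) * g * g
            m_hat = m / (1 - b1 ** t)
            v_hat = v / (1 - b2 ** t)
            nesterov = b1 * m_hat + (1 - b1) * g / (1 - b1 ** t)
            w -= lr * nesterov / (math.sqrt(v_hat) + eps)
        else:
            # SGD phase: small, well-behaved steps near the optimum
            w -= lr * g
    return w
```

The adaptive phase covers most of the distance to the optimum quickly; the SGD tail then contracts the remaining error geometrically, which is the intuition behind switching rather than running either optimizer alone.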

Article
Speed Estimation of Multiple Moving Objects from a Moving UAV Platform
ISPRS Int. J. Geo-Inf. 2019, 8(6), 259; https://doi.org/10.3390/ijgi8060259 - 31 May 2019
Cited by 7 | Viewed by 1611
Abstract
Speed detection of moving objects using an optical camera has long been an important subject of study in computer vision. It is one of the key components in many application areas, such as transportation systems, military and naval applications, and robotics. In this study, we implemented a speed detection system for multiple moving objects on the ground observed from a moving platform in the air. A detect-and-track approach is used for primary tracking of the objects: Faster R-CNN (region-based convolutional neural network) is applied to detect the objects, and a discriminative correlation filter with CSRT (channel and spatial reliability tracking) is used for tracking. Feature-based image alignment (FBIA) is performed for each frame to obtain the proper object location. In addition, SSIM (structural similarity index measurement) is computed to check how similar the current frame is to the object detection frame. This measurement is necessary because the platform is moving, and new objects may be captured in a new frame. We achieved a speed accuracy of 96.80% with our framework with respect to the real speed of the objects.
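The geometry underlying this kind of system reduces to a simple relation: subtract the camera's own motion (recovered by frame alignment) from the tracked pixel displacement, convert pixels to metres via the ground sampling distance, and divide by the frame interval. A minimal sketch, with illustrative names and values that are not taken from the paper:

```python
def ground_speed(track_px, fps, gsd_m, platform_shift_px=(0.0, 0.0)):
    """Estimate object speed in m/s from its pixel positions in two
    consecutive frames. platform_shift_px is the camera's apparent
    motion between the frames (from image alignment); gsd_m is the
    ground sampling distance in metres per pixel."""
    (x0, y0), (x1, y1) = track_px
    dx = (x1 - x0) - platform_shift_px[0]   # motion relative to ground
    dy = (y1 - y0) - platform_shift_px[1]
    dist_m = gsd_m * (dx * dx + dy * dy) ** 0.5
    return dist_m * fps                      # metres per frame * frames/s
```

For instance, an object displaced 13 px while the platform itself shifted 3 px, at 0.1 m/px and 30 fps, moves 1 m per frame interval, i.e. 30 m/s. The accuracy of the overall system hinges on how well the alignment step isolates the platform's motion.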
