A Hierarchical Airport Detection Method Using Spatial Analysis and Deep Learning

Abstract: Airports have a profound impact on our lives, and uncovering their worldwide distribution is of great significance for research and development. However, existing airport databases are incomplete and costly to update, so a fast and automatic method that can repeat global airport detection at regular intervals would be valuable. Previous airport detection studies are usually based on a single remote sensing (RS) image, which makes traversal searching an overwhelming burden for worldwide airport detection. We therefore propose a hierarchical airport detection method. At the broad scale, worldwide candidate airport regions are extracted by spatial analysis of released RS products, including impervious surfaces from the FROM-GLC10 (fine resolution observation and monitoring of global land cover 10) product, building distribution from OpenStreetMap (OSM), and a digital surface model from AW3D30 (ALOS World 3D-30 m). At the narrow scale, aircraft detection is initially conducted by the Faster R-CNN (region-based convolutional neural network) deep learning method. To avoid overestimation of background regions by Faster R-CNN, a second CNN classifier is used to refine the class labeling with negative samples. Specifically, our research focuses on target airports at least 2 km in length in three experimental regions. Results show that spatial analysis reduced the possible regions to 0.56% of the total area of 75,691 km². The initial aircraft detection by Faster R-CNN had a mean user's accuracy of 88.90% and ensured that all aircraft could be detected. Then, by introducing the CNN reclassifier, the user's accuracy of aircraft detection was significantly increased to 94.21%. Finally, with an empirical threshold on the number of aircraft, 19 of the 20 airports were detected correctly. Our results show that the overall workflow is reliable for automatic and rapid airport detection around the world with the help of released RS products. This research promotes the application and progression of deep learning.


Introduction
Airports have attracted much attention in recent decades as key transportation targets [1,2]. Uncovering the global distribution of airports is of great significance for transport planning and for analyzing human mobility patterns, but it is difficult [3,4]. As the number of airports worldwide increases every year, large-scale, rapid, and automated airport detection is important. The development of remote sensing (RS) and the growing availability of RS imagery with high spatial resolution have raised global airport detection to a new level, making it possible to identify and locate airports precisely [5,6]. However, because of the huge amount of high-resolution imagery and the computational complexity of detection algorithms, it is best to first extract candidate regions at a broad scale [7–9]. In contrast to previous research on airport detection in a single RS image [7,9–12], a hierarchical framework is introduced here for global airport detection: at the broad scale, candidate airport regions are extracted by spatial analysis of released digital products, while at a narrower scale, suitable feature descriptors are selected to identify airports within the candidate regions. Thus, efficiently obtaining candidate airport regions and robustly describing airport appearance are both important for improving detection performance [10].
For candidate airport regions, simple and efficient segmentation-based approaches are necessary for broad-scale searching, in which spectral features, textural features, or geometrical characteristics are frequently evaluated, such as the gradient of intensity [13], airport line features [7,11,14], and key scale-invariant feature transform (SIFT) points [15]. For the gradient of intensity, common machine learning classifiers, such as the Adaboost algorithm, were adopted for the rough identification of airport runways [13]. However, the gradient of intensity at the pixel level cannot represent the substantive characteristics of an airport, and line features are more specific to airport regions. Budak et al. [7] proposed an algorithm composed of several line-based processing steps, Zhu et al. [14] introduced the concept of near parallelity, and Tang et al. [11] applied a line segment detector to extract line-segment features from images. These line-based methods perform well but also need pixel-level operations, which require enormous computation time and memory and thus limit their broad-scale application. Although the SIFT key-point method combines high computational efficiency with high accuracy, it is still not suitable for relatively low-resolution imagery at a broad scale [15]. In general, these studies used features at a single scale: broad-scale features alone can miss detailed information and lower the accuracy, whereas narrow-scale features alone can lower the computational efficiency.
The fusion of multiple features, especially at different scales, can improve extraction performance. Zhao et al. [16] fused bottom-up region features and top-down line features to reduce the sensitivity to resolution variation, and Xiao et al. [12] used multiscale fusion features to represent complementary information and showed that fused features outperformed other features for candidate airport region extraction. Besides the characterization of two-dimensional profiles, the spatial characterization of three-dimensional profiles should also be considered to improve efficiency, such as elevation differences derived from existing digital elevation models [17]. Runways, with their unique textural and spatial characteristics, are most often chosen to rapidly locate candidate airports [7,8,11]. Here, runways refer specifically to the part of an airport used for take-off and landing, and they can be directly extracted from the impervious areas of land cover products such as FROM-GLC10 (fine resolution observation and monitoring of global land cover 10) [18]. Moreover, runways have a distinct shape: they lie on flat terrain and are narrow and long, with no direct connection to public roads. These unique features allow spatial analysis methods to distinguish airports from other land uses.
At the narrow scale, other unique features are used to distinguish airports, including parking lots, terminals, hangars, and aircraft [10]. The appearance of aircraft is uniform across locations, making them easy to detect at a narrower scale [19]. Aircraft detection is relatively mature compared with the detection of other targets. The many methods of aircraft detection can be roughly classified into two categories [20]: those using low-level features, such as edges and symmetry [21–27], and those using high-level, object-based features [20,28–33]. For low-level features, Bo et al. [21] converted RGB images to binary images for aircraft detection. This image conversion is essentially a dimension reduction method, which reduces computational cost at the expense of some descriptive features. In contrast to this dimension reduction approach, Luo et al. [22] trained a support vector machine (SVM) classifier on histogram of oriented gradient (HOG) features. Detectors based on low-level aircraft features achieve acceptable accuracy with low computational complexity, but have poor robustness.
For high-level features, increasing the dimensions of spectral features, textural features, or geometrical characteristics significantly improves classification stability [28,32,34,35]. Deep learning (DL) is the most prominent method for automatically learning high-level features with high accuracy, and it has created new ways to analyze remote sensing imagery [36]. As computer vision research deepened, convolutional feature extraction was proposed because of its strong representation ability. Convolutional neural networks (CNNs), a mature kind of DL algorithm, have achieved significant success in target detection with the help of prior knowledge-based region proposal methods [29,30,37]. For the prior knowledge, low-level features [37] and pretrained convolutional networks [29,30] are commonly used. However, such region proposal methods rely largely on supervised pretraining, which increases the computational complexity; thus, a selective search that requires no prior knowledge was introduced in the R-CNN (region-based CNN) and Fast R-CNN frameworks [38,39]. Moreover, the inefficient selective search can be replaced by a more effective region proposal network, which requires no prior knowledge and performs better; the resulting framework is called Faster R-CNN [40,41]. The Faster R-CNN framework has proven successful and efficient for aircraft detection [42–45].
The detection of airports around the world is important but challenging because of variations in airport appearance and size and the complexity of their backgrounds. In this paper, we propose a hierarchical airport detection method that obtains candidate airport regions from released remote sensing products and then uses high-resolution imagery to detect aircraft and distinguish airports from other impervious surfaces, such as residual segmented roads or buildings. The main contributions of this work can be summarized as follows:
(1) A worldwide airport detection workflow based on mature DL methods and spatial analysis of released digital products, which is fast and automatic because it omits the processing and analysis of original remote sensing data.
(2) The integration of DL and spatial analysis for global airport detection, as an exploration of the in-depth application of DL in the field of object detection, which has been urgently called for in a previous publication [36].

Materials
In this study, three experimental areas were selected to test the proposed method, as shown in Figure 1: Beijing (China), New Jersey (U.S.), and northeastern Buenos Aires (Argentina), with areas of 16,394 km², 38,491 km², and 20,806 km², respectively. The three areas were selected for their geographical diversity, which helps verify the universality and reliability of the proposed workflow. The major datasets used were the global land cover (GLC) product FROM-GLC10, the global digital surface model ALOS World 3D-30 m (AW3D30), OpenStreetMap (OSM) data for roads and buildings, shapefiles for administrative boundaries, and high-resolution imagery from Google Maps (Table 1).
FROM-GLC10 includes 10 land cover types at 10-meter spatial resolution and has high overall accuracy [18]. Compared with other GLC products, FROM-GLC10 offers high spatial resolution and global coverage, which are conducive to the worldwide application of the workflow. This product's "impervious area" category was used to locate possible airport runways [46]. AW3D30, provided by the Japan Aerospace Exploration Agency, has a 30-meter horizontal resolution and higher accuracy than ASTER, SRTM1, and SRTM3 [47,48]. It was used to remove the nonground features in impervious areas. The OSM datasets provide widely used building distribution and road network products, which help eliminate nonrunway areas and segment the surface into blocks for distinguishing airport runways [49]. To validate the reliability of the proposed workflow, airport validation data were downloaded from http://ourairports.com/data/, which include the locations and descriptions of some existing airports.

Methods
The proposed workflow for airport detection first defines candidate regions and then detects aircraft (Figure 2). Candidate airport regions were identified in three steps: exclusion of nonground regions, block segmentation by road networks, and block extraction with area and length thresholds. For aircraft detection, Faster R-CNN was used, and a second CNN classifier then refined its results.

Exclusion of Nonground Regions
The impervious area in FROM-GLC10 is composed of two parts, the tops of artificial structures (nonground) and the ground itself, which need to be separated for accurate classification [46]. A morphological filter can be used to extract nonground features from the AW3D30 dataset [50]. At every step, a sliding window (2 × 2 pixels) extracts the minimum elevation grid, and grids whose height difference from this minimum exceeds a given threshold are regarded as nonground features (Figure 3). We set this threshold conservatively to 10 meters. Buildings and other nonground features can be excluded as possible airport runways; erasing them reduces the candidate areas and generates flat impervious regions for the next step.
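The filtering step can be illustrated with a minimal NumPy sketch. It reflects one possible reading of the description above, not necessarily the exact filter of [50]: each cell is compared with the lowest elevation found in any 2 × 2 window that covers it, and cells rising more than 10 m above that minimum are flagged. The function names `nonground_mask` and `flat_impervious` and the edge padding are assumptions made for illustration.

```python
import numpy as np

def nonground_mask(dsm, window=2, height_threshold=10.0):
    """Flag cells rising more than `height_threshold` above the local minimum.

    Assumption: the local minimum of a cell is the lowest elevation in any
    window x window block that contains the cell.
    """
    dsm = np.asarray(dsm, dtype=float)
    rows, cols = dsm.shape
    pad = window - 1
    # Repeat edge values so border cells have a full neighbourhood.
    padded = np.pad(dsm, pad, mode="edge")
    local_min = np.full(dsm.shape, np.inf)
    size = 2 * pad + 1  # union of all windows covering a cell
    for dr in range(size):
        for dc in range(size):
            local_min = np.minimum(local_min, padded[dr:dr + rows, dc:dc + cols])
    return (dsm - local_min) > height_threshold

def flat_impervious(impervious, dsm, height_threshold=10.0):
    """Keep impervious cells that are not flagged as nonground."""
    mask = nonground_mask(dsm, height_threshold=height_threshold)
    return np.asarray(impervious, dtype=bool) & ~mask
```

In this sketch the surface model is assumed to be resampled to the same grid as the impervious layer before the two are combined.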

Block Segmentation by Road Networks
Song et al. proposed recognizing city blocks using road networks, which allows impervious urban areas to be segmented by erasing the road networks [51]. In the OSM road data, airports' internal roads are defined as "service roads", which we retained to maintain the connections between runways and parking lots. Road networks were buffered by 10 meters and used to segment the flat impervious regions into blocks, which were converted to shapefile format. After block segmentation, the airport is separated from other blocks and appears spatially nonadjacent to them. This allows spatial clustering based on adjacency analysis to group adjacent regions (Figure 4). All processing was carried out on the ArcMap 10.3 platform.
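The segmentation above was carried out with vector tools in ArcMap. As a rough raster analogue, the following sketch erases road cells from a flat impervious mask and labels the remaining 4-connected regions as blocks. The function name `segment_blocks` and the boolean `roads` mask are hypothetical; the mask is assumed to already contain the buffered road network and to leave out service roads.

```python
from collections import deque

import numpy as np

def segment_blocks(impervious, roads):
    """Label 4-connected impervious regions after erasing road cells.

    Returns (labels, n_blocks); label 0 marks cells outside every block.
    """
    keep = np.asarray(impervious, dtype=bool) & ~np.asarray(roads, dtype=bool)
    rows, cols = keep.shape
    labels = np.zeros(keep.shape, dtype=int)
    n_blocks = 0
    for r in range(rows):
        for c in range(cols):
            if not keep[r, c] or labels[r, c] != 0:
                continue
            # Start a new block and flood-fill it breadth-first.
            n_blocks += 1
            labels[r, c] = n_blocks
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and keep[ny, nx] and labels[ny, nx] == 0):
                        labels[ny, nx] = n_blocks
                        queue.append((ny, nx))
    return labels, n_blocks
```

A road that crosses an impervious area splits it into two labels, whereas a runway and apron joined only by a service road stay in one block because service roads are not erased.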

Blocks Extraction with Area and Length Thresholds
Among the extracted blocks, those with small areas can first be removed by defining an area threshold. According to our surveys and statistics, the area of an airport is usually larger than 0.1 km²; therefore, an area threshold can greatly decrease the computation in the next steps. Then, a length threshold was applied to represent the features of the airport runway (Figure 5), where the length of a block is defined as the diameter of its minimum circumscribed circle. The maximum length of an individual block is determined by the size of the airport, and larger airports usually have longer runways. In this paper, medium and large airports were regarded as the target, and they usually have a block longer than 2 km. Thus, a 2 km length threshold was selected. The area and length of each block were measured using the ArcMap 10.3 platform.
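The two thresholds can be sketched in plain Python for blocks given as polygon vertices in metres. The sketch uses the shoelace formula for area and a standard incremental algorithm for the minimum enclosing circle; it stands in for the ArcMap measurements and is not the tool used in this study. The function names are hypothetical, and the 0.1 km² default is an assumed value for the area threshold.

```python
import math
import random

def _circle_two(a, b):
    """Circle with segment ab as diameter, as (cx, cy, radius)."""
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2, math.dist(a, b) / 2)

def _circle_three(a, b, c):
    """Circumscribed circle of triangle abc."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:  # collinear points: the widest pair spans the circle
        return max((_circle_two(p, q) for p, q in ((a, b), (a, c), (b, c))),
                   key=lambda circle: circle[2])
    sa, sb, sc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / d
    uy = (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / d
    return (ux, uy, math.dist((ux, uy), a))

def _inside(circle, p):
    return math.dist(circle[:2], p) <= circle[2] * (1 + 1e-9) + 1e-9

def min_enclosing_circle(points):
    """Smallest circle containing all points (incremental construction)."""
    pts = list(points)
    random.Random(0).shuffle(pts)  # fixed seed keeps the sketch reproducible
    circle = None
    for i, p in enumerate(pts):
        if circle is not None and _inside(circle, p):
            continue
        circle = (p[0], p[1], 0.0)
        for j in range(i):
            if _inside(circle, pts[j]):
                continue
            circle = _circle_two(p, pts[j])
            for k in range(j):
                if not _inside(circle, pts[k]):
                    circle = _circle_three(p, pts[j], pts[k])
    return circle

def polygon_area(vertices):
    """Shoelace area of a simple polygon, in squared input units."""
    total = 0.0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

def is_candidate(vertices, min_area_km2=0.1, min_length_km=2.0):
    """Apply the area and length thresholds to one block (vertices in metres)."""
    area_km2 = polygon_area(vertices) / 1e6
    length_km = 2 * min_enclosing_circle(vertices)[2] / 1000
    return area_km2 > min_area_km2 and length_km >= min_length_km
```

Under these assumed thresholds, a 3000 m × 60 m strip passes both tests, while a 400 m × 400 m square passes the area test but fails the length test.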

Aircraft Detection
The 2D convex hull of each group forms the boundary of the candidate airport region, and high-resolution Google images were downloaded according to this boundary vector. Then, Faster R-CNN was used to detect aircraft in the high-resolution images.
For training purposes, we selected 50 global airports and downloaded level-19 Google images (2048 × 2048 pixels) with 0.23-meter spatial resolution as training images for Faster R-CNN. The selected 50 airports did not include any airports in the three experimental areas. Machine learning methods usually require a large number of training samples, so replicates were produced by rotating the images by 90, 180, and 270 degrees to increase the number of training images. After rotation, there were 200 training images in total, containing 1360 labeled aircraft, for the Faster R-CNN training process.
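The rotation-based augmentation can be sketched as follows. The sketch assumes that each label is an axis-aligned box `(xmin, ymin, xmax, ymax)` in inclusive pixel indices; this format and the helper names `rotate_box_90` and `augment` are illustrative assumptions rather than the annotation format used in this study.

```python
import numpy as np

def rotate_box_90(box, width):
    """Rotate a box with an image turned 90 degrees counterclockwise.

    `width` is the image width before rotation.
    """
    xmin, ymin, xmax, ymax = box
    return (ymin, width - 1 - xmax, ymax, width - 1 - xmin)

def augment(image, boxes):
    """Return the original sample plus its 90, 180, and 270 degree rotations."""
    samples = [(image, list(boxes))]
    img, current = image, list(boxes)
    for _ in range(3):
        width = img.shape[1]
        current = [rotate_box_90(b, width) for b in current]
        img = np.rot90(img)  # counterclockwise quarter turn
        samples.append((img, current))
    return samples
```

Each source image yields four samples, which matches the growth from 50 to 200 training images.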

Analysis with Faster R-CNN
Faster R-CNN is composed of two modules, the region proposal network (RPN) and the Fast R-CNN module. Aircraft are relatively small objects compared with the whole candidate region. Smaller objects decrease the efficiency of the RPN module; thus, the detection accuracy of Faster R-CNN drops as the object's relative size becomes smaller [40,52]. Although research on small object detection has advanced in recent years [53], processing full-size images (billions of pixels) still results in poor performance and high computational cost.
We used sliding windows to segment the full-size candidate airport images and detected aircraft in each part independently with multithreading (Figure 6). Here, the sliding window's side length was 2048 pixels. The stride of the sliding window should be less than the window's length so that neighboring windows overlap. The overlap ensures that aircraft located on the boundary between two neighboring windows can be detected correctly. According to the size of aircraft and the imagery resolution, a 1792-pixel window stride was chosen, resulting in a 12.5% overlap (256 pixels). The subimages with aircraft prediction results were the outputs of the Faster R-CNN. To detect as many aircraft as possible, a 0.5 probability threshold was selected, but this resulted in substantial overestimation, in which many background regions were wrongly classified as aircraft. Thus, we applied a second classifier to reduce such overestimation [54].
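The tiling scheme can be sketched in a few lines of Python. The window length (2048) and stride (1792) come from the text; the handling of the image border, where the last window is shifted back so that it ends exactly at the image edge, is an assumption, as are the function names.

```python
def window_starts(size, window=2048, stride=1792):
    """Start offsets of sliding windows along one image axis."""
    if size <= window:
        return [0]
    starts = list(range(0, size - window + 1, stride))
    # Assumed border rule: add a final window flush with the image edge.
    if starts[-1] != size - window:
        starts.append(size - window)
    return starts

def tile_image(width, height, window=2048, stride=1792):
    """Boxes (x0, y0, x1, y1) of all subimages covering the full image."""
    return [
        (x, y, min(x + window, width), min(y + window, height))
        for y in window_starts(height, window, stride)
        for x in window_starts(width, window, stride)
    ]

# Overlap between neighbouring windows as a fraction of the window length.
overlap_fraction = (2048 - 1792) / 2048  # 256 / 2048 = 0.125
```

For example, `window_starts(5000)` gives `[0, 1792, 2952]`, so the third window overlaps the second by more than the nominal 256 pixels in order to stay inside the image.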

Reclassification with CNN
Considering the overestimated outputs of Faster R-CNN, a second-stage CNN classification was adopted in this workflow. GoogLeNet was selected as the CNN reclassifier for its better performance [12], and all regions identified as aircraft by Faster R-CNN were used as testing samples.
To train the CNN reclassifier, both positive and negative training samples were used to refine the outputs of Faster R-CNN (Figure 7).

Results and Validation
We considered medium and large airports with blocks at least 2 km long as targets. The candidate regions were reduced to less than 2% of their original area after spatial analysis; this reduction was mainly influenced by the size of the target airports and the regional urbanization level (Figure 8). The initial impervious surface was larger for Beijing than for New Jersey or northeastern Buenos Aires, resulting in larger candidate regions in the former than in the latter two (Table 2). Beijing had the highest artificial surface coverage (15.49%) but fewer blocks than New Jersey (Table 2), which reflects the larger average block area in Beijing. In addition, spatial analysis was more constraining in northeastern Buenos Aires, where it decreased the relative area and the number of blocks to a low level.
For the Faster R-CNN training, the initial learning rate was 0.001 and was halved every 20,000 iterations. The batch size was 1, the number of training steps was 60,000, and the momentum optimizer value was 0.9. The output of the Faster R-CNN is essentially a set of relative coordinates of aircraft proposals in the input image. Each proposal is predicted to contain aircraft with an estimated probability greater than a predefined threshold. With the 0.5 probability threshold, 829 proposals (737 aircraft and 92 nonaircraft) were predicted, giving a mean user's accuracy of 88.90% across the three experimental areas (Table 3). This output was fed into the second-stage CNN classifier. The training samples were classified into two classes, namely, aircraft (positive samples) and nonaircraft (negative samples). The GoogLeNet learning rate was 0.001 and was halved every 20,000 iterations. The batch size was 16 and the number of training steps was 100,000. The user's accuracy improved significantly after CNN reclassification: the 760 refined results comprised 716 aircraft and 44 nonaircraft, a user's accuracy of 94.21% (Table 3), and all 760 results were located in 29 blocks.
The results of aircraft detection for the candidate regions are shown in the subgraphs of Figure 9; different subimages are from different candidate airport regions, and the detection distinguished airports from the background. The green line represents the boundary of the candidate airport regions, and the red boxes are the aircraft regions proposed by Faster R-CNN. To filter out the false airport detections caused by the 44 nonaircraft, we removed the 10 of the 29 testing regions that contained fewer than seven aircraft in the CNN results (white dots in Figure 10), where seven approximates the average number of aircraft per training image of the 50 global airports (1360/200 = 6.8). Finally, 19 airports (red dots in Figure 10) in the three experimental regions were identified.
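The reported accuracies and the aircraft-count threshold follow directly from the counts in Table 3 and can be checked with a short calculation. The function names are hypothetical, and per-block aircraft counts are not listed in the text, so `select_airports` is shown only as a sketch of the filtering rule.

```python
def users_accuracy(correct, detected):
    """User's accuracy in percent: correct detections over all detections."""
    return 100.0 * correct / detected

faster_rcnn_ua = users_accuracy(737, 829)  # about 88.90 before reclassification
refined_ua = users_accuracy(716, 760)      # about 94.21 after reclassification

# Threshold: average labeled aircraft per training image, rounded to an integer.
min_aircraft = round(1360 / 200)

def select_airports(aircraft_per_block, threshold=min_aircraft):
    """Keep blocks whose refined aircraft count reaches the threshold."""
    return [block for block, count in aircraft_per_block.items()
            if count >= threshold]
```

Blocks with fewer aircraft than the threshold are treated as false candidates, which corresponds to discarding 10 of the 29 blocks in this study.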
Airport validation data were used to validate the reliability of the proposed workflow. As errors and incomplete records were common in this dataset, cross-validation was adopted. Of the 20 airports with blocks longer than 2 km, 19 were detected correctly. The missing airport was located in northeastern Buenos Aires (Figure 10c), where the impervious areas in the original FROM-GLC10 product did not contain a complete runway for this airport, so the airport was lost during the subsequent processing. In the experimental areas, no airports were found that were not in the existing database.

Possibility of Worldwide Airport Detection from FROM-GLC10 Product
In Figure 10c, the incompleteness of the impervious areas is caused by the limited accuracy of the FROM-GLC10 product, which is mainly influenced by the method and original data used in product generation. For the missing airport, the runway was covered with vegetation, so the spectral characteristics of the runway and the grassland were similar. However, with its 10-meter spatial resolution, FROM-GLC10 is still the most suitable product for narrow runways, and with it 19 airports were detected correctly. For these 19 airports, visual interpretation showed that the impervious areas covered the airport runways completely, and the overall coverage rate reached 95% (Figure 11). Compared with using released products, generating impervious areas from raw remote sensing data is costly and decreases the efficiency of worldwide airport detection. In summary, global airport detection with the FROM-GLC10 product is viable and efficient. In the future, improved GLC products should also be considered for improving global airport detection.
Generating a synergetic GLC map by fusing different GLC products is a novel idea [55]. In this paper, only FROM-GLC10 was used instead of a fusion of multisource GLC products, because different products differ in spatial resolution, date of production, and classification system; for example, the MODIS product (MOD12Q1) has 1 km spatial resolution and uses all 17 classes of the IGBP legend [56]. The fusion of various GLC products requires different methods and data sources, which inevitably increases the time and computational expense. Even so, it is debatable whether the integration would improve the accuracy. In future research, different GLC products should be compared and the fusion of these products should be systematically discussed.

Selection of Area and Length Thresholds
The thresholds selected in spatial analysis have a great impact on the block segmentation results: lower thresholds produce more unnecessary blocks for further analysis, while higher thresholds cause more airports to be missed. We analyzed the number of blocks and airports detected with different thresholds (Figure 12). The length threshold clearly had a stronger effect than the area threshold. The optimal parameters, as shown in Figure 12, retain the most airports with the fewest blocks. For northeastern Buenos Aires, the original thresholds were optimal, while for Beijing, the optimal thresholds were 0.14 km² and 2.2 km. However, because New Jersey airports were numerous and variable in size, it was more difficult to choose a single threshold that captures all airports. Country development levels and government policy together influence airport size and number [57]. From a global perspective, the chosen area and length thresholds were suitable and generalizable, but they need to be adjusted appropriately for specific areas.
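The threshold search described above can be illustrated with a minimal Python sketch. It assumes each segmented block is summarized by its area, its length, and a ground-truth airport flag; the `Block` record, the function names, and the toy values are hypothetical and are not taken from the study's implementation. The selection rule mirrors the stated criterion: retain the most airports, then prefer the fewest remaining blocks.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Block:
    """One segmented impervious block (hypothetical record for illustration)."""
    area_km2: float
    length_km: float
    is_airport: bool  # ground-truth label, used only for evaluation


def sweep_thresholds(blocks, area_grid, length_grid):
    """Return (area_thr, length_thr, airports_kept, blocks_kept) per threshold pair."""
    rows = []
    for area_thr, length_thr in product(area_grid, length_grid):
        kept = [b for b in blocks
                if b.area_km2 >= area_thr and b.length_km >= length_thr]
        airports = sum(b.is_airport for b in kept)
        rows.append((area_thr, length_thr, airports, len(kept)))
    return rows


def pick_optimal(rows):
    """Most airports retained first; fewest remaining blocks breaks ties."""
    return max(rows, key=lambda r: (r[2], -r[3]))


# Toy data, invented for illustration only.
blocks = [
    Block(0.30, 3.1, True),
    Block(0.16, 2.3, True),
    Block(0.15, 2.1, False),
    Block(0.05, 2.4, False),
    Block(0.40, 1.0, False),
]
rows = sweep_thresholds(blocks, area_grid=[0.10, 0.14], length_grid=[2.0, 2.2])
best = pick_optimal(rows)
print(best)
```

In this toy case, raising the length threshold removes a non-airport block while both airports are kept, which is the kind of trade-off summarized in Figure 12. Several threshold pairs can tie, so a regional analysis would still need a rule for choosing among them.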


Parameters of the Sliding Window in Faster R-CNN
The presence of clustered aircraft was an important basis for distinguishing airports from other candidate regions. Thus, it was necessary to detect as many aircraft as possible. A sliding-window search strategy was adopted for aircraft detection in the full-size candidate airport images with Faster R-CNN. The procedure had two parameters: the sliding window's length and its stride. The window length in this study was 2048 pixels, the same as the side length of the training images, which kept the relative size of aircraft constant. For images with the same resolution, enlarging the sliding window would reduce the relative size of aircraft. Previous research suggested that smaller objects decrease the efficiency of the RPN module; thus, the detection accuracy of Faster R-CNN drops as the object's relative size becomes smaller [52].
The stride of the sliding window should be smaller than the window's length so that adjacent windows overlap. The overlap ensures that aircraft located on the boundary between two neighboring windows can be detected correctly. We varied the window's stride from 1152 pixels to 2048 pixels at 128-pixel intervals. Three airports (Figure 9a-1, Figure 9b-1, and Figure 9c-1), one from each experimental area, were taken as examples; together they contained 374 aircraft. We calculated the recall and accuracy of aircraft detection after the final CNN refinement for each stride (Figure 13). The recall decreased as the window stride increased, whereas the accuracy remained stable.
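The sliding-window search can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's code: the function names are hypothetical, the default stride of 1536 pixels is only an example value from the tested range, and the extra window placed flush with the image edge is an assumption about how the border is handled.

```python
def window_origins(image_len, window=2048, stride=1536):
    """Offsets along one axis so that the windows cover the whole image."""
    if stride <= 0 or stride > window:
        raise ValueError("stride must be in (0, window]")
    if image_len <= window:
        return [0]
    origins = list(range(0, image_len - window + 1, stride))
    if origins[-1] + window < image_len:
        origins.append(image_len - window)  # extra window flush with the edge
    return origins


def tiles(width, height, window=2048, stride=1536):
    """Yield (x0, y0, x1, y1) tile boxes for a 2D sliding-window search."""
    for y0 in window_origins(height, window, stride):
        for x0 in window_origins(width, window, stride):
            yield (x0, y0, min(x0 + window, width), min(y0 + window, height))


def to_image_coords(box, tile):
    """Shift a detection box from tile coordinates to full-image coordinates."""
    bx0, by0, bx1, by1 = box
    tx0, ty0 = tile[0], tile[1]
    return (bx0 + tx0, by0 + ty0, bx1 + tx0, by1 + ty0)


# Strides examined in the text: 1152 to 2048 pixels in 128-pixel steps.
tested_strides = list(range(1152, 2048 + 1, 128))

origins = window_origins(5000)            # one axis of a 5000-pixel image
tile_boxes = list(tiles(5000, 3000))      # all tiles of a 5000 x 3000 image
print(tested_strides, origins, len(tile_boxes))
```

With a stride smaller than the window, neighboring tiles share a strip of `window - stride` pixels, so an aircraft cut by one tile boundary lies fully inside the adjacent tile. Detections inside that strip can be reported twice; merging them, for example with non-maximum suppression, is an assumption here because the text does not describe that step.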

CNN Reclassifier for Accuracy Improvement
We aimed not only to detect as many aircraft as possible but also to achieve high accuracy. A probability threshold of 0.5 was selected in order to detect as many aircraft as possible, but this resulted in substantial overestimation, in which many background regions were wrongly classified as aircraft. Therefore, a two-class CNN classifier was trained to refine the aircraft detections. In this paper, the CNN classification was simple because there are only two classes, namely, aircraft and nonaircraft. Results show that the CNN reclassifier is effective for refining the Faster R-CNN output. For worldwide airport detection, as Faster R-CNN is applied to more cities, the reclassifier training samples can be expanded, further improving the overall accuracy.
To improve the identification accuracy, previous publications have focused on refinements of the network structure used in the Faster R-CNN and CNN classifier. Many of these have proved effective, such as the single-shot multibox detector (SSD) [58] and You Only Look Once (YOLO) [59]. All of these network structures could be tested and compared to determine the best detector. Beyond network structure, multiview images such as Google Street View (GSV) could also be considered. Previous publications have proposed scene classification and object detection using GSV [60][61][62]. Thus, GSV, as an available and free resource, has potential for building an excellent airport detector in future work.
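The two-stage refinement can be illustrated with a small Python sketch. It assumes that the 0.5 threshold applies to the detector's aircraft score and that the reclassifier can be treated as a function returning true for aircraft; the `Detection` record, the stand-in reclassifier, and the toy detections are hypothetical. The user's accuracy is computed as the share of retained detections that are real aircraft, consistent with the definition given for Figure 13.

```python
from typing import NamedTuple


class Detection(NamedTuple):
    box: tuple          # (x0, y0, x1, y1) in image coordinates
    score: float        # detector confidence for the aircraft class
    is_aircraft: bool   # ground truth, available only when evaluating


def refine(detections, reclassify, det_threshold=0.5):
    """Keep detections that pass the detector threshold and the reclassifier."""
    candidates = [d for d in detections if d.score >= det_threshold]
    return [d for d in candidates if reclassify(d)]


def users_accuracy(kept):
    """Share of kept detections that are real aircraft."""
    return sum(d.is_aircraft for d in kept) / len(kept) if kept else 0.0


def recall(kept, total_aircraft):
    """Share of all real aircraft that were kept."""
    return sum(d.is_aircraft for d in kept) / total_aircraft


# Toy detections, invented for illustration only.
dets = [
    Detection((0, 0, 10, 10), 0.95, True),
    Detection((20, 0, 30, 10), 0.80, True),
    Detection((40, 0, 50, 10), 0.60, False),  # background detected as aircraft
    Detection((60, 0, 70, 10), 0.55, True),
    Detection((80, 0, 90, 10), 0.30, True),   # below the 0.5 threshold
]

rejected_boxes = {(40, 0, 50, 10)}


def toy_reclassifier(det):
    """Stand-in for the trained CNN: rejects one known background box."""
    return det.box not in rejected_boxes


before = [d for d in dets if d.score >= 0.5]
after = refine(dets, toy_reclassifier)
print(users_accuracy(before), users_accuracy(after), recall(after, 4))
```

In this toy case the reclassifier removes the background detection without discarding any aircraft, so the user's accuracy rises while the recall is unchanged. A real reclassifier can also reject true aircraft, which is why both measures are tracked.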

Pros and Cons
For the purpose of fast and automatic worldwide airport detection, we proposed an airport detection workflow based on released RS products. The possibility of airport detection from the FROM-GLC10 product was discussed in Section 5.1.1. Here, we discuss the pros and cons of the proposed workflow. The workflow has the following merits: (a) the spatial analysis method is effective for extracting global candidate airport regions with reduced data volume and computational cost; (b) extracting visual information from high-resolution RS imagery by deep learning is one of the best ways to obtain information about geographical objects; for instance, the blocks containing aircraft mainly belong to airports; (c) the workflow integrates several datasets, namely FROM-GLC10, OSM road networks, OSM buildings, and a global DSM. The worldwide candidate airport regions can be constrained with FROM-GLC10, OSM provides 2D geographical information, and the DSM provides the third dimension. Thus, compared with previous studies, we integrated several released products to avoid processing and analyzing original remote sensing data directly, which improved the efficiency of the proposed workflow for automatic and fast worldwide airport detection.
The proposed method faces the following problems: (a) The impact of inaccurate products. The proposed method relies heavily on the FROM-GLC10 product, OSM datasets, and the global DSM, and errors accumulate when these products are integrated, which has a great impact on the accuracy of airport detection. (b) The time differences among products. Owing to limitations in data acquisition, the remote sensing imagery and products were collected at different times (the FROM-GLC10 product was updated in 2017, and the high-resolution remote sensing imagery was collected in 2019). Thus, newly built airports might not be detected. In a future study, a quick registration and partial update method should be explored for the FROM-GLC10 product. Moreover, GLC research remains an active topic, and improved GLC products are released every year. Using GLC products with higher resolution and precision can effectively improve the detection results of the proposed method in the future.

Conclusions
In this paper, we presented a hierarchical method for fast and automatic airport detection around the world, including broad-scale detection of impervious runway surfaces and narrow-scale detection of aircraft. Previous studies focused on airport detection algorithms for remote sensing imagery. In our approach, impervious areas were identified through GLC products, and spatial analysis was then used to better constrain the candidate airport regions. Nonground regions were extracted and removed with the help of the DSM and OSM buildings. After that, the ground regions were segmented by the OSM road network. Finally, spatial cluster-based adjacency analysis was used to constrain the candidate airport regions. In the first step, the candidate airport regions were extracted by analyzing the geometrical characteristics of the segmentation blocks. In the second step, the Faster R-CNN method was used for aircraft detection with a mean user's accuracy of 88.90%, which was improved to 94.21% using a CNN reclassifier; areas containing many aircraft were then defined as airports. The experimental areas contained 20 airports with blocks longer than the 2 km threshold, 19 of which were detected correctly. The missing airport can be attributed to the quality of the GLC map. Thus, the overall workflow is reliable for automatic and rapid airport detection around the world with the help of released RS products.

Figure 1. The experimental areas included Beijing, New Jersey, and northeastern Buenos Aires, in Asia, North America, and South America, respectively.

Figure 2. Workflow of the proposed airport detection method.

Figure 3. Morphological filtering to extract nonground features. The "AW3D30" was colored to show elevation changes.

Figure 4. Spatial clustering based on adjacency analysis, in which different colors represent different groups.

Figure 5. Selection of a candidate airport region.

Figure 6. Strategy for aircraft detection from the full-size candidate airport images with Faster R-CNN.
For positive training samples, 1360 subimages were clipped from the Faster R-CNN training images. For negative training samples, 8000 images (from 2000 randomly selected sites, after rotation) were selected outside the three study regions, each with 0.23 m spatial resolution and 2048 × 2048 pixels. They were all processed by the previously trained Faster R-CNN, and 1072 prediction results (subimages) were regarded as nonaircraft samples, namely, the negative samples. The global positive samples used in Faster R-CNN training (1360 aircraft samples) and the global negative samples detected by Faster R-CNN (1072 nonaircraft samples) composed the training set of the CNN reclassifier. The trained CNN reclassifier was applied to refine the class labels of the aircraft detected by Faster R-CNN, and the reclassification results were called refined aircraft.


Figure 7. Examples of training samples. (a) Training images for Faster R-CNN; (b-1) positive samples and (b-2) negative samples for the CNN reclassifier.

Figure 8. Candidate airport regions: (a) Beijing; (b) New Jersey; (c) northeastern Buenos Aires. The blue masks were extracted from FROM-GLC10 products, and background images were downloaded from Google Maps.

Figure 9. Some examples of aircraft detection results. Different subimages are from different candidate airport regions, and the numbers record the airports' serial numbers. (a) Seven airports in Beijing; (b) seven airports in New Jersey; (c) five airports in northeastern Buenos Aires. The green line represents the boundary of candidate airport regions, and red boxes are the detected aircraft.

Figure 10. Airport detection results. (a) Beijing; (b) New Jersey; (c) northeastern Buenos Aires. Background images were downloaded from Google Maps.

Figure 11. Samples of FROM-GLC10 product coverage of airports. (a) Beijing; (b) New Jersey; (c) northeastern Buenos Aires. Blue masks represent the impervious areas. Background images were downloaded from Google Maps.

Figure 12. Airports and remaining blocks with different area and length thresholds. The warm color represents more blocks than the cool color, and the numerals record the number of airports. (a) Beijing; (b) New Jersey; (c) northeastern Buenos Aires.

Figure 13. Influence of the sliding window's stride on aircraft detection. Accuracy represents the proportion of aircraft in all regional proposals, and recall is the proportion of detected aircraft.

Table 2. Characteristics of target candidate airport regions where target airports had blocks at least 2 km long. "All impervious area" and "Candidate airport impervious area" represent blocks before and after spatial analysis, respectively. "Relative area" means the proportion of block area in each experimental area, and "block counts" represents the number of blocks before and after spatial analysis.

Table 3. Accuracy of aircraft prediction with Faster R-CNN and the improvement after the CNN was applied to the predictions from the Faster R-CNN.