Optimized Sample Selection in SVM Classification by Combining with DMSP-OLS, Landsat NDVI and GlobeLand30 Products for Extracting Urban Built-Up Areas

The accuracy of training samples used for data classification methods, such as support vector machines (SVMs), has a considerable impact on the results of urban area extractions. To improve the accuracy of urban built-up area extractions, this paper presents a sample-optimized approach for classifying urban area data using a combination of Defense Meteorological Satellite Program-Operational Linescan System (DMSP-OLS) nighttime light data, Landsat images, and GlobeLand30, which is a 30-m global land cover data product. The proposed approach consists of three main components: (1) initial sample generation and data classification into urban built-up and non-urban built-up areas based on the maximum and minimum intervals of digital numbers from the DMSP-OLS data, respectively; (2) refined sample selection and optimization by the probability threshold of each pixel, using vegetation cover from the Landsat-derived normalized difference vegetation index (NDVI) and artificial surfaces extracted from the GlobeLand30 product as the constraints; (3) iterative classification and urban built-up area extraction using the relationship between these three data sources together with the training sets. Experiments were conducted for several cities in western China using the proposed approach. The extracted built-up areas were evaluated using urban construction statistical yearbooks and Landsat images and were compared with results obtained from traditional methods, such as the threshold dichotomy method and the improved neighborhood focal statistics method.
An analysis of the empirical results indicated that (1) the sample training process was improved using the proposed method, and the overall accuracy (OA) increased from 89% with non-optimized sample selection to 96% with optimized sample selection; (2) the proposed method had a relative error of less than 10%, as calculated by an accuracy assessment; (3) the overall and individual class accuracy were higher for artificial surfaces in GlobeLand30; and (4) the average OA clearly improved, and the Kappa coefficient in the case of Chengdu increased from 0.54 to 0.80. Therefore, the experimental results demonstrated that our proposed approach is a reliable solution for extracting urban built-up areas with a high degree of accuracy.


Introduction
In China, the rapid development of the social economy and changes in the industrial structure have accelerated urbanization. A key characteristic of urbanization is the constant expansion of urban areas into farmland and forested areas. The loss of agricultural land threatens and damages the ecological environment and directly affects the livability of cities and the sustainability of economic development. Therefore, methods of extracting and evaluating urban spatial information in China are extremely important.
The Defense Meteorological Satellite Program-Operational Linescan System (DMSP-OLS) nighttime light data are a valuable resource for regional and global urban mapping, as well as for the study of human activities [1][2][3]. On nights without cloud cover, nighttime light sensors can effectively detect city lights, even low-intensity lights within small-scale residential areas or in cars, which allows city information to be extracted quickly. Compared with traditional remote sensing images, DMSP-OLS data not only require little storage but also offer a long time series that can provide a wide range of urban land use information, thus playing a significant role in the extraction of urban areas and the analysis of dynamic changes of urban spatial patterns [4][5][6][7]. A number of methods have been developed to map urban areas using DMSP-OLS data, and these approaches can generally be divided into two categories: supervised classification and unsupervised classification. Experiential threshold, mutation detection, and comparison methods based on high-resolution data have considerably improved the acquisition of urban spatial information and are representative of unsupervised classification in the nighttime light remote sensing field [8][9][10][11][12], whereas the support vector machine (SVM) method uses sample selection and nighttime light image classification to acquire urban information and is representative of commonly employed supervised classification methods [13][14][15]. Cao et al. [16] proposed an SVM-based region-growing algorithm using DMSP-OLS and SPOT normalized difference vegetation index (NDVI) data to extract the urban extent of 25 Chinese cities with different levels of economic development, including Beijing, Shanghai, Hangzhou, Zhengzhou, and Lanzhou. This algorithm uses several simple criteria to build initial training sets of urban and non-urban pixels. In particular, the growth of seed pixels (a window of 3 × 3 pixels), which is part of the iterative classification procedure, allows the SVM classifier to be trained continuously with the aim of semi-automatically extracting the urban extent. Pandey et al. used a second-order polynomial regression equation to intercalibrate DMSP-OLS data as a necessary experimental preparation and then employed the same algorithm as Cao et al. to extract urban areas in India [17]. Yang et al. presented a stratified SVM-based method to map urban land in China, which addressed the problem of extracting urban land over a large area with obvious regional variations [18]. However, the above unsupervised classification methods present a number of limitations in the process of sample threshold selection, such as excessive subjective intervention (based on statistical data, accumulated experience, and assumptions derived from visual observation), data sources that lack diversity and have inherent defects, and a low degree of overall automation. All of these limitations reduce the credibility of extraction results based on these methods. Although other methods have attempted to fully exploit the features of nighttime light and the detailed classification ability of SVM classifiers, and present classification results that are more accurate than those obtained by unsupervised methods that rely on thresholds, areas with blooming and abundant vegetation (and water pixels) are not completely removed from the urban pixels because of the improper selection of samples and human intervention in the classification processes. Thus, further improvements are required to ensure the accuracy of SVM classifications.
Optical remote sensing data with high resolution, such as data extracted from remote sensing image indices and other high-quality land cover data products, are the major datasets used for urban information extraction [19][20][21][22][23][24][25][26][27][28][29][30][31][32][33][34], and such data are far superior to those of DMSP-OLS in terms of image resolution. However, most of these data products have limited temporal coverage and are of limited use for dynamic analysis at large scales. For example, although the class accuracy of each land-cover classification of GlobeLand30 data is generally high [35], this product is still limited by time constraints (i.e., it only contains two terms of data with benchmark years of 2000 and 2010); thus, it cannot meet the practical requirements for monitoring urban spatial change within a long-term time series. More importantly, problems related to the same objects having different spectra and different objects having similar spectra remain unresolved, which increases the difficulty of distinguishing between urban and bare lands simply based on the spectral information obtained from several bands of surface reflectance data. Furthermore, exploring regional or global areas is labor intensive and time consuming because of the massive data volumes involved [36]. Thus, to precisely and reliably extract urban areas, DMSP-OLS data are used as the main data source of the data integration, and spectral information from the surface reflectance data is only used as an auxiliary source.
To address the problems associated with current extraction methods and to improve the credibility and effectiveness of classification results, we propose a sample-optimized approach that utilizes SVM classification to semi-automatically extract urban built-up areas using an integration of multi-source data. The proposed approach consists of three main steps: (1) Initial sample generation and classification, in which initial training samples for the two classes (urban built-up and non-urban built-up areas) are selected based on the maximum and minimum intervals of the digital numbers from the DMSP-OLS data and entered into the first SVM classification; (2) Refined sample selection and optimization using a probability threshold, with vegetation cover from the Landsat-derived normalized difference vegetation index (NDVI) and artificial surfaces from the GlobeLand30 product as constraints. The relationships among the three data sources as well as with the training sets were established to optimize the process of SVM classification; (3) An iterative classification and data extraction process for urban built-up areas, which was developed based on trials evaluating the relationship between the present and previous results obtained with the corresponding refined samples. The iterative optimization process was terminated when the change between results fell below a certain threshold, which generated the final urban built-up extraction results. These ideas are novel in the nighttime light data application field and differ from other SVM-based extraction methods combined with DMSP-OLS data. To validate the flexibility and robustness of the proposed method, a case study was conducted for several typical regions in western China in 2010. The aim of this study is not to provide an approach for using Landsat or other types of global land-cover datasets alone or to apply classification methods based on the data sources above; rather, the goal is to improve the accuracy of generating training samples for information extraction with a high degree of automation, rationality, and credibility.
The remainder of this paper is organized as follows. Section 2 describes the study area and data sources; Section 3 introduces the proposed method; Section 4 provides the experimental and analytical results; and Section 5 presents the conclusions.

Study Areas and Materials
Although China has been experiencing rapid urbanization since the 1980s, large discrepancies remain in the economic development of different regions, especially in western China. With the deepening implementation of the "China Western Development" strategy, which is a Chinese national strategy for economic development and diplomatic efforts, the regional cooperation and coordinated development of cities located in western China have become increasingly close. In this study, four provincial capital cities (Yinchuan, Xining, Chengdu, and Kunming) were selected as typical representative cities of western China; they are regularly among the first echelon of socioeconomic development in western China. These cities have steadily pushed the strategy forward and achieved significant outcomes in many fields. In 2010, the population of these cities varied from less than 1.5 million (Yinchuan) to over 11 million (Chengdu), and the per capita gross domestic product (GDP) ranged from less than 40,000 RMB (or 6100 USD) for Xining to over 70,000 RMB (or 10,000 USD) for Chengdu [37].
DMSP-OLS data, GlobeLand30 product data, and Landsat image data were used in this research, and the major characteristics of each data source are summarized in Table 1. All the selected datasets were acquired in 2010.
The DMSP-OLS nighttime light data with a 1-km spatial resolution were provided by the National Geophysical Data Center. Stable nighttime light images (Figure 1), which are one of the nighttime light data products, are annual raster images that show the average nighttime light intensity, which includes lights in urban and rural areas and permanent lights in other places but excludes noise, such as moonlit clouds or fires [38]. The digital number (DN) values of these images represent the average light intensity and range from 0 to 63. A value of 0 indicates a completely dark area, which is not the focus of this study, whereas a higher DN value represents greater light intensity in a particular area. The geographic coordinate system of the data is the World Geodetic System 1984 (WGS-84), and the data were transformed into the Asia Lambert conformal conic projection to more accurately calculate the extent of the urban built-up areas in the next stages.
For the GlobeLand30 product, the primary data source was Landsat data from the benchmark years of 2000 and 2010, and the supplemental data included China's Environmental Disaster Mitigation satellite images and Beijing-1 data in partial areas. Compared with the DMSP-OLS data with 1000 m resolution, GlobeLand30 was the latest product with a much finer resolution of 30 m, and it was developed based on a pixel classification-object abstraction-knowledge check (POK) method [39]. Ten types of land cover were included: forests, grassland, shrub land, wetlands, bare land, tundra, cultivated land, water bodies, permanent snow/ice, and artificial surfaces (Figure 2). The data were divided by the attribute values of these land cover types to extract artificial surfaces as an independent layer. Artificial surfaces were represented by land cover types covered with asphalt, concrete, gravel, bricks, tiles, and other building materials resulting from human activity, and they included residential areas, transportation, telecommunication, industry, and mining infrastructure. The product information officially verified by GlobeLand30 was reported in the journal "Nature" in Volume 514 in 2014. More authoritative information can also be found on the official website of GlobeLand30 [40], which officially reports that the overall accuracy (OA) of an average GlobeLand30 classification is approximately 80.33% globally, the Kappa indicator of the classification is 0.75, and the user accuracy of the artificial surface class for the GlobeLand30 product is approximately 86.70% [39].
In addition, reference data for urban built-up areas were mainly sourced from the China Urban Construction Statistical Yearbook of 2011 [41]. Information on the boundaries of urban administrative divisions and urban spatial structures was gathered from the 1:4 million scale vector datasets in the National Fundamental Geographic Information System.
Finally, Landsat product images were acquired to obtain data on the vegetated areas extracted according to the NDVI process. Using Landsat 5, Landsat 7, and Landsat 8 satellite data collected since the 1980s, the Center for Earth Observation and Digital Earth of the Chinese Academy of Sciences (CEODE) selected images of fine quality and produced several different series of advanced remote-sensing data products subjected to atmospheric correction, ortho-rectification, projection transformation, masking, clipping, and other types of processing. According to the study areas, we used four images from this product (2010) with a spatial resolution of 30 m, which were acquired directly from the Product Data Service Plan at the CEODE [42].

Methods
A sample-optimized approach for SVM classification was proposed to extract urban built-up areas with a high degree of credibility. The flowchart of the proposed approach is shown in Figure 3, and it consists of several steps: preprocessing of the DMSP-OLS and Landsat data, initial sample generation and classification, refined sample selection and optimization, and iterative classification and urban built-up area extraction.

Data Pre-processing
DMSP-OLS datasets currently include images covering the 22 years from 1992 to 2013. Because satellites are subject to rapid upgrades, the data from different types of satellites present varying quality and light saturation phenomena [43,44]. Therefore, this article used the calibration method for long time series DMSP-OLS nighttime light image datasets, proposed by Cao et al. in 2009, to preprocess the nighttime light remote sensing data for 2010 [45]. The major steps are the mutual calibration of the raw data images and the calibration between each image series (fusion for data in the same year and calibration for data in different years). A mutual calibration is adopted for long time series datasets using the regression model presented in Formula (1) after the re-sampling and re-projection processes:
$$DN_{cal} = a \times DN^{b} \qquad (1)$$
In Formula (1), DN represents the pixel value of the images to be calibrated, DN_cal represents the pixel value of the calibrated images, and a and b are two parameters obtained from the power regression. The parameter values for the power regression of the calibrated images and reference images in the test, as well as estimates of other relevant coefficients, such as R², from 2006 to 2012, are listed in Table 2. The corresponding power equation was used to perform a mutual calibration of the images for each period in China (including the data from 2010). This saturation correction changed the range of DN values in the DMSP-OLS data from (0, 63) to (0, 53). The resulting images were then masked using the administrative boundaries of Chengdu, Kunming, Xining, and Yinchuan.
In addition, vegetation-cover data were extracted using the NDVI from Landsat images, and this information was incorporated into the sample optimization process in the subsequent steps. The NDVI is a widely used vegetation indicator and can effectively reflect the growth, coverage, and dynamic seasonal changes in vegetation [46][47][48]. For remote sensing images, the NDVI in Formula (2) is calculated as follows:
$$NDVI = \frac{NIR - R}{NIR + R} \qquad (2)$$
Vegetation has strong reflectivity in the near-infrared (NIR) band and weak reflectivity in the red (R) band, which are represented by band 4 and band 3 in the Landsat images, respectively. After the normalized ratio calculation, the brightness of the vegetation in the images is enhanced and other ground objects are suppressed. The NDVI ranges between −1 and 1. Because the distribution of vegetation cover differs among regions, the optimal threshold between vegetated and non-vegetated cover was defined based on a histogram distribution of the NDVI values after a statistical analysis of all pixels from the corresponding regions. Pixels with NDVI values that exceeded this threshold were considered vegetation cover; this is a required step for the experiment.
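The two preprocessing calculations described above can be illustrated with a minimal Python sketch. It assumes the power-regression form implied by the text (calibrated DN equals a times DN raised to b) and the standard NDVI ratio; the function names, the nested-list image layout, and the handling of a zero denominator are illustrative assumptions, not part of the original workflow, and the coefficients a and b would come from Table 2.

```python
def calibrate_dn(dn, a, b):
    """Power-regression intercalibration: DN_cal = a * DN**b.

    a and b are regression coefficients (hypothetical inputs here).
    """
    return a * dn ** b


def ndvi(nir, red):
    """NDVI = (NIR - R) / (NIR + R).

    Returning 0.0 when both bands are zero is an assumption made
    only to avoid a division by zero in this sketch.
    """
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total


def vegetation_mask(nir_band, red_band, threshold):
    """Flag pixels whose NDVI exceeds a region-specific threshold.

    nir_band and red_band are equally sized lists of rows.
    """
    return [
        [ndvi(n, r) > threshold for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]
```

For example, `vegetation_mask([[0.5, 0.2]], [[0.1, 0.2]], 0.3)` flags only the first pixel, whose NDVI is about 0.67. The threshold itself would be chosen per region from the NDVI histogram, as described in the text.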

Sample Optimization in SVM Classification
Because greater DN values for lights indicate a greater possibility of a point belonging to an urban area, researchers have successively combined SVM classification with nighttime light data [49,50] and have adjusted and set an "optimal threshold" according to certain features, such as areas or shapes, that cannot be universally applied to cities at different scales, geographical locations, and development stages. Accordingly, Cao et al. [16] attempted to utilize the region-growing concept in SVM sample training, in which the input urban pixels that meet the baseline criteria for a definitive selection were assigned as seeds. Then, all pixels within a window of 3 × 3 pixels around each seed were classified simultaneously by the SVM-based classifier in the iterative procedure.
Although the above sample selection approach based on neighboring seeds achieved good results in the accuracy assessments and avoided the tedious trial-and-error procedure, it could not guarantee the accuracy of the training samples in the SVM classification. In addition, the reliability of the classification results required further improvement. Therefore, a sample-optimized approach was proposed that combines SVM classification, a post-probability threshold, and multi-source data integration. There are three steps within the proposed sample-optimized procedure: generation and classification of initial samples, selection and optimization of refined samples, and iterative classification and urban built-up area extraction.

Initial Samples Generation and Classification
Because built-up urban areas are illuminated artificially at night, their corresponding pixels in nighttime light images have larger DN values than the surrounding dark rural areas [51,52]. Based on previous results, the DN value range (50, 53) was selected as the maximum interval of the built-up class, and the DN value range (0, 3) was selected as the minimum interval of the non-built-up class. Patches with DN values in these ranges served as the initial training sets and were input into the SVM classifier using a radial basis function kernel. Although patches with DN values within the range (3, 50) must be further identified, the initial classification results roughly segmented the images into urban built-up areas and non-urban built-up areas.
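The initial sample selection can be sketched as a simple labelling rule on calibrated DN values. This is only an illustration: the function names and label constants are hypothetical, and treating both DN intervals as closed (endpoints included) is an assumption, because the text does not state how the boundary values 3 and 50 are handled. The labelled samples would then be passed to an SVM with a radial basis function kernel, which is not shown here.

```python
# Hypothetical label codes used only in this sketch.
BUILT_UP, NON_BUILT_UP, UNLABELED = 1, 0, -1


def initial_label(dn, low=(0, 3), high=(50, 53)):
    """Label a pixel from its calibrated DN value.

    Assumption: both intervals are treated as closed.
    """
    if high[0] <= dn <= high[1]:
        return BUILT_UP
    if low[0] <= dn <= low[1]:
        return NON_BUILT_UP
    return UNLABELED  # left for the SVM classifier to decide


def initial_training_set(dn_image):
    """Collect ((row, col), dn, label) for every pixel that gets a label."""
    samples = []
    for r, row in enumerate(dn_image):
        for c, dn in enumerate(row):
            label = initial_label(dn)
            if label != UNLABELED:
                samples.append(((r, c), dn, label))
    return samples
```

Pixels with intermediate DN values are deliberately excluded from the initial training set, which mirrors the statement that patches in the range (3, 50) must be identified further.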

Refined Samples Selection and Optimization
The SVM method is capable of identifying the class label and generating the post-probability of each class, which represents one of the most important indices in SVM classification. Previous studies have converted the outputs of the SVM into pairwise coupling post-probabilities, and various types of post-probabilities have been generated from such pairwise coupling post-probabilities. In this paper, for two categories of land-cover features, the posterior probability of the SVM can be obtained by pairwise coupling according to its predicted output values. Such pixel-specific post-probability can be used as land cover composition information, which means that every pixel has a different attribute probability for each surface feature class, and the pixel is assigned to the class with the highest probability as its class information [53]. Therefore, we defined a post-probability value P to set the likelihood for each pixel and determined the "true" probability of a pixel belonging to a specific classification category. Pixels with all rule probabilities less than P are unclassified. In addition, a critical threshold value T, defined as 0.8, was applied to the probabilities output by the SVM classification for both the built-up and non-built-up classes (i.e., P_bu ≥ T and P_non-bu ≥ T) to select samples with higher credibility from the results of previous classifications. The threshold value T = 0.8 indicates a high credibility level based on the class probabilities of the SVM classification. It is important to note that the value of T was defined after many trials. Although it may be somewhat subjective, it provides a basic input guarantee for the sample optimization process in the next step, so that reliable and sufficient results can be extracted.
For the selection of refined samples, the probability threshold T was defined with two constraints, i.e., the vegetation cover and artificial surface classifications obtained in the data preprocessing stage, and then the relationships among the three data sources as well as with the SVM training sets were established.
As shown in Table 3, the training sets of the urban built-up class were selected as the pixels that belonged to artificial surfaces and also met P_bu ≥ T. Training sets of the non-urban built-up class included the vegetation-covered pixels based on NDVI processing that also met P_non-bu ≥ T. In each iteration, the value T was used as a condition that constrains the posterior probability value P. It also constrains the selection of the training samples together with the vegetation elements and artificial surface elements. After the refined selection process for both classes was completed, these training samples were used in the SVM classification. Misclassifications and false alarms are greatly reduced by the designed progressive iterative optimization process for SVM classification. When each SVM classification was finished, the total areas of built-up patches were calculated from the present and the previous results via the corresponding selected samples, which are denoted as Area_present and Area_previous, respectively. θ is defined as the proportion of the area difference between Area_present and Area_previous to the total area of pattern spots (Area_total) in Formula (3):
$$\theta = \frac{\left| Area_{present} - Area_{previous} \right|}{Area_{total}} \qquad (3)$$
Based on many trials evaluating the relationship among Area_present, Area_previous, and θ, the iterative optimization process was conducted until θ converged (θ ≤ 0.08), which satisfies the accuracy requirement for the sample iteration and optimization and allows the continuous sample-optimized classification method to extract the optimal built-up results. This iterative updating procedure increases the rationality and objectivity of classifications and avoids possible problems caused by verbose and repeated processes. Thus, both the credibility of the classification results and the degree of automation are greatly enhanced.
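The refined selection rule and the stopping criterion can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the flat per-pixel lists are hypothetical, the posterior probabilities are assumed to come from an SVM classifier that is not shown, and θ is computed as the absolute area difference divided by the total area, which is one reading of the definition in the text.

```python
def refine_samples(p_bu, p_non_bu, artificial, vegetated, t=0.8):
    """Select refined training pixels for the next SVM iteration.

    p_bu, p_non_bu: per-pixel posterior probabilities of the two classes.
    artificial: True where GlobeLand30 marks an artificial surface.
    vegetated:  True where NDVI exceeds the vegetation threshold.
    Returns (built_up_indices, non_built_up_indices).
    """
    built_up, non_built_up = [], []
    for i in range(len(p_bu)):
        if artificial[i] and p_bu[i] >= t:
            built_up.append(i)
        elif vegetated[i] and p_non_bu[i] >= t:
            non_built_up.append(i)
    return built_up, non_built_up


def theta(area_present, area_previous, area_total):
    """Relative change in built-up area between two iterations.

    Assumption: the absolute difference is used.
    """
    return abs(area_present - area_previous) / area_total


def converged(area_present, area_previous, area_total, limit=0.08):
    """Stop iterating once theta falls to or below the limit."""
    return theta(area_present, area_previous, area_total) <= limit
```

In an iterative loop, `refine_samples` would be called after each classification, the SVM retrained on the returned samples, and the loop ended once `converged` returns True for the present and previous built-up areas.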

Two Methods for Experimental Comparison
In our experiments, two data extraction methods were used for comparison with the proposed method. The first method was the threshold dichotomy method [54,55], which repeatedly compares the urban land pattern spot area obtained at different nighttime light DN thresholds with the statistical data until it finds the threshold whose extracted area best approximates the statistical data; the area extracted at this threshold is classified as urban land. The second method was the improved neighborhood focal statistic (NFS) method [56], which primarily identifies central and marginal urban areas using maximum and minimum NFS calculations of DMSP-OLS data while eliminating vegetated features and water bodies using multi-source geographic data integration to extract the optimal urban areas. Studies have indicated that both methods are useful for extracting urban built-up area data.
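The idea behind the threshold dichotomy method can be illustrated with a bisection search over DN thresholds. This sketch is not the implementation from the cited studies: the function names, the flat list of DN values, and the unit pixel area are assumptions, and it relies only on the fact that the lit area shrinks as the threshold rises.

```python
def lit_area(dn_values, threshold, pixel_area=1.0):
    """Total area of pixels whose DN is at or above the threshold."""
    return sum(pixel_area for dn in dn_values if dn >= threshold)


def dichotomy_threshold(dn_values, target_area, lo=1, hi=63, pixel_area=1.0):
    """Bisect integer DN thresholds and return the one whose lit area
    is closest to the statistical (target) urban area."""
    best = lo
    best_gap = abs(lit_area(dn_values, lo, pixel_area) - target_area)
    while lo <= hi:
        mid = (lo + hi) // 2
        area = lit_area(dn_values, mid, pixel_area)
        gap = abs(area - target_area)
        if gap < best_gap:
            best, best_gap = mid, gap
        if area > target_area:
            lo = mid + 1  # too much area: raise the threshold
        elif area < target_area:
            hi = mid - 1  # too little area: lower the threshold
        else:
            return mid
    return best
```

With the hypothetical values `[5, 10, 20, 30, 40, 50, 60]` and a target area of 3 pixels, the search returns a threshold of 32, at which exactly three pixels remain lit.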

Experimental Results and Analysis
This study focused on four cities with relatively low levels of urban development in western China: Chengdu, Kunming, Xining, and Yinchuan. Because the spatial resolution of Landsat images is much finer than that of the nighttime stable light data, evaluating the results using Landsat data is a feasible and acceptable approach [57,58]. To quantify the performance of the threshold dichotomy method, the improved NFS method, and our proposed method, we performed accuracy assessments using Landsat images as the reference data and considered these methods as different schemes of urban area data extraction. The respective results were compared with those extracted from Landsat images through visual and quantitative comparisons of pattern spots.

Results of Sample Optimization
The sample results of the iterative updating process for Chengdu (Table 4) give the quantity of selected samples at each stage and the corresponding accuracy assessment of the training results. Table 4 shows the iteration number (defined as I), the number of selected samples of built-up areas and of the two classes of non-built-up areas, and the accuracy evaluation indexes for the corresponding training results (i.e., producer accuracy, user accuracy, and OA). For the initialization (i.e., I = 0), the samples were manually selected using the DN values of the DMSP data; thus, these samples had not been subject to iterative procedures, which means the training process did not contain any auxiliary data, features, or index factors. As I increased, the samples in the built-up areas and non-built-up areas were continually selected and refined according to the iterative updating process, and the accuracy indexes largely increased, especially OA, which increased from 0.89 to 0.96. Moreover, the producer accuracy changed in the opposite direction to the other accuracy indexes shown in Table 4 because the selection mechanism of the original samples (when I = 0) was completely different from the mechanisms (when I ≥ 1) involved in the iterative updating procedure. The results based on SVM training with two optimizations (when I = 2) were much finer than the results that only included SVM training without sample optimization or auxiliary data (when I = 0) for certain characteristics, such as the shape and size of the pattern spots (Figure 4). Although the results obtained by SVM training with one optimization (when I = 1) and two optimizations (when I = 2) remained highly similar, certain nuances showed that vegetated pixels could be more easily distinguished from urban pixels, and further optimization of the training samples was sufficient to guarantee an extraction of urban built-up results with a high degree of credibility.
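The refinement step described above can be illustrated with a minimal sketch. It assumes that each candidate pixel carries a classifier probability for the built-up class, an NDVI value, and a GlobeLand30 class code. The names `Pixel` and `refine_samples` and all three threshold values are hypothetical; the actual selection rule is the one given in Table 3, which this sketch does not reproduce. The code 80 is the GlobeLand30 class code for artificial surfaces.

```python
from dataclasses import dataclass

ARTIFICIAL_SURFACE = 80  # GlobeLand30 class code for artificial surfaces


@dataclass(frozen=True)
class Pixel:
    prob_built_up: float  # classifier probability of the built-up class
    ndvi: float           # Landsat-derived NDVI
    landcover: int        # GlobeLand30 class code


def refine_samples(pixels, high_prob=0.9, low_prob=0.1, ndvi_max=0.3):
    """Split pixels into refined built-up and non-built-up sample indices.

    All thresholds are illustrative assumptions, not values from the paper.
    Pixels that satisfy neither rule are treated as ambiguous and dropped.
    """
    built_up, non_built_up = [], []
    for index, px in enumerate(pixels):
        is_artificial = px.landcover == ARTIFICIAL_SURFACE
        if px.prob_built_up >= high_prob and px.ndvi <= ndvi_max and is_artificial:
            built_up.append(index)
        elif px.prob_built_up <= low_prob and not is_artificial:
            non_built_up.append(index)
    return built_up, non_built_up


# Toy pixels with made-up values, for illustration only.
candidates = [
    Pixel(0.95, 0.10, 80),  # confident, sparse vegetation, artificial surface
    Pixel(0.95, 0.60, 80),  # confident but heavily vegetated: dropped
    Pixel(0.05, 0.70, 20),  # confident non-built-up (forest)
    Pixel(0.50, 0.20, 80),  # ambiguous probability: dropped
    Pixel(0.92, 0.15, 10),  # confident but cultivated land: dropped
]
built_idx, non_built_idx = refine_samples(candidates)
```

In an iterative procedure of this kind, the retained indices would form the training set for the next SVM run, and the loop would stop once the sample set or the accuracy indexes stabilize.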

Results of Urban Built-Up Areas
Figure 5a-c shows the entire study area based on Landsat images, DMSP-OLS data, and the GlobeLand30 product. The threshold dichotomy method (Figure 5d) tended to identify continuous urban built-up extents because of the blooming effect observed in the DMSP-OLS data, whereas the improved NFS method (Figure 5e) extracted urban built-up areas with the maximum/minimum NFS calculations and removed natural features (vegetation and water) based on the NDVI criteria and water body data integration. Figure 5f shows the urban built-up results extracted by our proposed approach. A visual comparison of the extraction results obtained with the different methods shows that Chengdu, Kunming, Xining, and Yinchuan were extracted well by any one of the three methods mentioned above (Figure 5g). However, for the size, shape, and distribution of the urban built-up patches, the results extracted with the proposed method presented finer details than those of the improved NFS method and were far more detailed than the patches produced by the threshold dichotomy method. Furthermore, the results of the proposed method were generally consistent with the statistical data of the urban built-up areas in the accuracy assessment, and a low relative error of less than 10% was observed (Table 5).
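As a small worked example of the relative error mentioned above, the sketch below assumes the usual definition, the absolute difference between the extracted and statistical areas divided by the statistical area. The function name `relative_error` and the area values are hypothetical and are not taken from Table 5.

```python
def relative_error(extracted_km2, statistical_km2):
    """Relative error of an extracted area against a statistical reference area.

    Assumed definition: |extracted - statistical| / statistical.
    """
    if statistical_km2 <= 0:
        raise ValueError("statistical area must be positive")
    return abs(extracted_km2 - statistical_km2) / statistical_km2


# Made-up areas for illustration only (not values from the paper).
error = relative_error(extracted_km2=470.0, statistical_km2=500.0)
within_tolerance = error < 0.10  # the 10% bound reported in the text
```

With these toy numbers the extracted area falls 30 km² short of the reference, which is a relative error of 6%.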
Although the four experimental cities are all of a prefecture-level scale, their urban distribution and characteristics, as well as their levels of development and change, all differ. In addition, deviations are observed between the surface features derived from remote sensing images at different scales and the actual features, and these deviations are even more obvious for remote sensing images with low or moderate resolution. The DMSP-OLS data adopted in this paper have relatively low resolution, and such an imbalance could not be completely avoided, despite adding the constraint of auxiliary data, which included the GlobeLand30 product with a 30-m resolution. Overall, these uncertainties might lead to overestimation or underestimation in the extraction results.
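For context on the threshold dichotomy baseline compared in this section, the following is a minimal sketch of one common formulation, assumed here rather than taken from the paper: choose the DN threshold whose lit area best matches a reference built-up area. The function name, the 1 km² pixel area, and the DN values are illustrative assumptions; the sketch scans every threshold instead of bisecting, which gives the same answer on the 0-63 DN range of DMSP-OLS data.

```python
def threshold_by_area_match(dn_values, reference_area_km2, pixel_area_km2=1.0):
    """Return the DN threshold whose extracted area is closest to the reference.

    A pixel counts as built-up when its DN is greater than or equal to the
    threshold. Ties are resolved in favor of the lowest threshold.
    """
    best_threshold, best_gap = None, float("inf")
    for threshold in range(1, 64):  # DMSP-OLS DN values range from 0 to 63
        lit_pixels = sum(1 for dn in dn_values if dn >= threshold)
        gap = abs(lit_pixels * pixel_area_km2 - reference_area_km2)
        if gap < best_gap:
            best_threshold, best_gap = threshold, gap
    return best_threshold


# Toy DN values and reference area, for illustration only.
dn_values = [5, 12, 20, 33, 41, 50, 58, 60, 62, 63]
chosen = threshold_by_area_match(dn_values, reference_area_km2=4.0)
```

Because a single global threshold is applied to blooming-affected DN values, a rule of this kind tends to return contiguous lit extents, which is consistent with the behavior described for Figure 5d.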

[Figure 5 panel labels: Result 1, urban built-up areas extracted using the threshold dichotomy method; Result 2, urban built-up areas extracted using the improved NFS method; Result 3, urban built-up areas extracted using the proposed approach; comparison of the results. Legend: non-urban built-up areas and urban built-up areas extracted using each of the three methods.]

Assessment of Accuracy for Urban Built-up Areas
For the sake of brevity, the derived urban built-up results of Chengdu and Kunming were selected for the detailed quantitative analysis of the pattern spots. With respect to the form of the pattern spots, Figure 6 shows that the results obtained for both Chengdu and Kunming by the three methods were generally similar and comparable in detail to those obtained using Landsat images. Although the overall trend of the pattern spots obtained by the threshold dichotomy method demonstrated a planar tiling distribution and was roughly consistent with the results of the Landsat images, the classification did not return results for several areas because of light segmentation or differences between multi-sensor data. Therefore, the precision was not sufficient for conducting further research on urban built-up areas. The results extracted with the improved NFS method far exceeded those of the threshold dichotomy method with respect to the overall morphological distribution and data extraction accuracy; however, the results were significantly different from the Landsat image results and included various degrees of misclassified pattern spots in several regions. Compared with the two methods above, the results obtained using our proposed approach were more consistent with those obtained using the Landsat data. Despite the misclassification or omission of certain pattern spots in scattered regions, the results showed better performance than those of the other two methods in both precision and visual effect.
Although the proposed approach is superior to the other two methods, visual and quantitative comparisons of the pattern spots indicated that commission errors (CEs) and omission errors (OEs) were not avoided because of the influence of light saturation and problems associated with the use of sensors of varying quality. However, the results still showed that the proposed method produced fewer CEs on average than the threshold dichotomy method and demonstrated an approximately 30% improvement in OEs compared with the other two methods. Although high Kappa and OA values were obtained with the improved NFS method and the proposed method, the improved NFS method did not implement correction processing specific to the light saturation phenomenon or data differences among the various sensors; thus, relatively significant deviations were observed among the tests. Consequently, significant differences were observed in the CEs and OEs, as shown in Table 6. These results indicate that the proposed approach reduced the defects of various data sources and guaranteed the extraction accuracy and authenticity of the urban built-up pattern spots.
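The accuracy indexes discussed in this section can be made concrete with a short sketch that derives OA, the Kappa coefficient, CE, and OE from a two-class confusion matrix using their standard definitions. The function name `binary_map_metrics` and the pixel counts are hypothetical and are not the values in Table 6.

```python
def binary_map_metrics(tp, fp, fn, tn):
    """Compute OA, Kappa, CE, and OE for the built-up class.

    tp: built-up pixels correctly extracted
    fp: non-built-up pixels wrongly extracted as built-up (commission)
    fn: built-up pixels that were missed (omission)
    tn: non-built-up pixels correctly left out
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    overall = (tp + tn) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (total * total)
    kappa = (overall - expected) / (1 - expected) if expected != 1 else 0.0
    commission = fp / (tp + fp) if (tp + fp) else 0.0  # 1 - user accuracy
    omission = fn / (tp + fn) if (tp + fn) else 0.0    # 1 - producer accuracy
    return {"OA": overall, "Kappa": kappa, "CE": commission, "OE": omission}


# Made-up pixel counts for illustration only.
metrics = binary_map_metrics(tp=40, fp=10, fn=5, tn=45)
```

For these toy counts, OA is 0.85 while Kappa is 0.70, which shows how Kappa discounts the agreement that would be expected by chance. CE and OE are the complements of user and producer accuracy, respectively.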

Conclusions
This paper presented a sample-optimized approach for SVM classification based on a combination of DMSP-OLS nighttime light data, artificial surface data from the GlobeLand30 product, and vegetated regions from Landsat images, which were used as criteria for the selection of training samples to identify optimized samples that represent built-up areas and non-built-up areas according to an iterative updating procedure. This method can successfully remove land-cover categories such as vegetation and bare land from the built-up areas and obtain the areas that have been developed and constructed with basic municipal utilities and public facilities in the urban administration region. Our proposed approach is novel in the nighttime light data application field and different from other SVM-based extraction methods using DMSP-OLS data in the cited literature. In terms of the overall distribution of extracted built-up urban patches, the patches identified by the proposed method were finer than those of the threshold dichotomy method and the improved NFS method.
An empirical experiment and a detailed accuracy assessment indicated the following: (1) the results extracted using the proposed approach were accurate for pattern spots of urban built-up areas and highly consistent with those extracted using Landsat images in 2010; (2) the results of the experimental comparison with the two other methods confirmed the effectiveness of the proposed approach because it generated improved CE and OE ratios and higher OA and Kappa coefficients; (3) the proposed approach integrates various factor features into the steps of the SVM sample configuration, which is superior to a simple overlay analysis of all the feature layers, and the combination of factor features with the classification probability threshold optimized the samples to satisfy the requirements of an iterative updating process that can reduce false classifications and misclassifications; (4) by exploiting the classification accuracy of the GlobeLand30 product, the proposed approach merges the advantages of multi-source remote sensing in representing urban information to obtain highly accurate final urban extraction results. This paper mainly focused on the relationship between DMSP-OLS nighttime light data and the SVM classification; thus, our work is highly specific and unique from an application perspective.
Limitations included the insufficient temporal coverage of the GlobeLand30 product and the coarse resolution and blooming effect of the DMSP-OLS data; thus, the accuracy of the urban spatial information extracted using the proposed approach requires further investigation. However, this approach optimizes the sample selection process without relying on statistical data and improves the efficiency and precision of SVM classifications. In addition, the proposed method largely avoids the drawbacks associated with using a single data source. The proposed approach is a timely and cost-effective method of monitoring the spatial patterns and temporal dynamics of urban areas as well as geographic conditions on a national scale. In future work, we will devote additional efforts towards improving the proposed approach by investigating spatio-temporal variations in some typical regions, such as ghost towns, as well as regional dynamic monitoring based on updates to the GlobeLand30 products and the availability of additional long time series data.

Figure 2. GlobeLand30 data (taking regions over China as an example).

and it consists of several steps: (1) preprocessing of the DMSP-OLS and Landsat data; (2) initial sample generation and classification; (3) iterative sample optimization process; and (4) SVM classification for urban built-up area data extraction. As the core of this approach, steps 2-4 are introduced in a single section.

Figure 3. Workflow of the proposed approach.

Figure 4. Comparison results in the process of sample optimization.

Figure 5. Urban built-up extraction results comparison among the different methods.

Figure 6. Pattern spots comparison among the different methods.

Table 1. Description of the remote sensing data sets used in this study.
GlobeLand30 product data, and Landsat image data were used in this research, and the major characteristics of each data source are summarized in Table 1. All the selected datasets were acquired in 2010.

Table 2. Inter-calibration model coefficients for each image from 2006 to 2012.

Table 3. Rules for refined sample selection by combining multi-source data information.

Table 4. Sample results in the iterative updating procedure.

Table 5. Accuracy assessment of the urban built-up areas based on the statistics and the proposed method in Chengdu, Kunming, Xining, and Yinchuan in China.

Table 6. Quantitative analysis results of the three methods by cities.