Mapping Regional Urban Extent Using NPP-VIIRS DNB and MODIS NDVI Data

The accurate and timely monitoring of regional urban extent is helpful for analyzing urban sprawl and studying environmental issues related to urbanization. This paper proposes a classification scheme for large-scale urban extent mapping by combining the Day/Night Band of the Visible Infrared Imaging Radiometer Suite on the Suomi National Polar-orbiting Partnership Satellite (NPP-VIIRS DNB) and the Normalized Difference Vegetation Index from the Moderate Resolution Imaging Spectroradiometer products (MODIS NDVI). A Back Propagation (BP) neural network based one-class classification method, the Present-Unlabeled Learning (PUL) algorithm, is employed to classify images into urban and non-urban areas. Experiments are conducted in mainland China (excluding surrounding islands) to detect urban areas in 2012. Results show that the proposed model can successfully map urban area with a kappa of 0.842 on the pixel level. Most of the urban areas are identified with a producer’s accuracy of 79.63%, and only 10.42% the generated urban areas are misclassified with a user’s accuracy of 89.58%. At the city level, among 647 cities, only four county-level cities are omitted. To evaluate the effectiveness of the proposed scheme, three contrastive analyses are conducted: (1) comparing the urban map obtained in this paper with that generated by the Defense Meteorological Satellite Program/Operational Linescan System Nighttime Light Data (DMSP/OLS NLD) and MODIS NDVI and with that extracted from MCD12Q1 in MODIS products; (2) comparing the performance of the integration of NPP-VIIRS DNB and MODIS NDVI with single input data; and (3) comparing the classification method used in this paper (PUL) with a linear method (Large-scale Impervious Surface Index (LISI)). According to our analyses, the proposed classification scheme shows great potential to map regional urban extents in an effective and efficient manner.


Introduction
Urban land occupies a relatively small fraction of the Earth's surface, but is the main area of human activities [1].Rapid socio-economic development and population growth have greatly encouraged the expansion of urban land areas [2,3].According to the World Bank, by the end of 2015, the global urbanization rate had hit 53.85%, a dramatic increase from 46.54% in 2000.If current trends continue, by 2030, urban land cover will grow by 1.2 million km 2 , nearly tripling the global urban land area of circa 2000 [4].This increase has a high-probability of occurring in developing countries with a rapid pace of urbanization, such as China.Although urbanization is an effective promotion of social progress, urban expansion has inevitably caused numerous environmental problems, resulting in natural habitat loss, threatening biodiversity, affecting local climate, etc. [5][6][7][8][9][10][11].It is essential to obtain timely and accurate information on urbanized areas to monitor the environmental and socioeconomic processes [12,13].
Land cover maps derived from remotely-sensed data are a valuable source for mapping urban area and monitoring urban dynamics [14][15][16][17][18][19].Images with high or medium spatial resolution are commonly used for characterizing the urban structures due to their detailed spatial observations of cities [20][21][22].However, the coverage range of one scene in a high or medium spatial resolution image is limited.For example, IKONOS collects high-resolution imagery at 1 and 4 m resolution, and the swath of a single scene is 11 km × 11 km.When applied at the regional scale (e.g., China, covering approximately 9.6 million km 2 ), a large number of images is needed to cover the entire area.It is time consuming and labor intensive to process multi-fold pixels [23].High-and medium-spatial resolution images are also frequently affected by cloud cover.It is challenging to select high-quality images around the same time for a large area.Additionally, the sensors collecting high-and medium-resolution images have a relatively long revisiting period.For example, Landsat sensors, providing medium-resolution imagery of 30 m, have a revisit rate at 16 days, which reduces the number of suitable images.Thus, at the national or continental scale, coarse imagery with wider swaths and higher revisit rates is the preferred imagery for macroscopically monitoring urban extents [24].
As the main data source of coarse images, the Moderate Resolution Imaging Spectroradiometer (MODIS) instrument can provide global imagery (500 to 1000 m spatial resolution) with 1-2-day temporal resolution.The dimensions of one scene are 2400 × 2400 rows/columns (about 1.4 million km 2 ).Many researchers have explored the capability of MODIS reflectance data in mapping urban areas [12,18,25].The Normalized Difference Vegetation Index (NDVI) products derived from reflectance data are frequently used to rapidly characterize urban extent [26][27][28][29].Since urban areas are dominated by impervious surfaces, the vegetation distributions inside and outside cities have obvious differences, which can be reflected in the NDVI [30].Multi-temporal or time series NDVI are often adopted to remove the cloud contamination, which always occurs in a single time image [31].However, either MODIS reflectance data or the derived NDVI alone can reflect the physical properties of the land surface.Urban physical properties are similar to those of non-or low-vegetation land covers, and some non-urban blocks may have similar reflecting spectral curves as urban blocks, such as bare soils.Besides, urban areas can be regarded as the union of multi-features (e.g., trees, asphalt road, buildings, etc.), but the specific composition of each city is different, resulting in urban spectral features varying from city to city.There is no universal standard to distinguish city from the background.Particularly, pixels in coarse images cover a larger ground surface and contain more features.The spectral heterogeneity and homogeneity of urban land make it easily confused with other land covers, resulting in biased estimated (overestimated or underestimated).Thus, mapping urban extent from physical information alone is challenging.
Another coarse data source, the nighttime lighting data, regarded as a sign of human activities, can provide different urban information from spectral data.It can distinguish human urban areas with artificial lights from the dark background at night [32] and has been employed to estimate economic conditions and energy consumption [33,34].The most widely-used nighttime light data are the stable Nighttime Light Data on the Defense Meteorological Satellite Program/Operational Linescan System (DMSP/OLS NLD) [35][36][37][38][39], but they have some limitations related to their coarse spatial resolution (30 arc seconds, 5 km × 5 km at nadir), blooming effects (the dispersion of light from built-up areas into non-light areas), saturation in urban areas and intra-sensor calibration problems, resulting in misestimates of urban land [40,41].Some studies have found that the combined use of spectral data and nighttime light data can reduce the blooming effect and pixel saturation of nighttime light products, as well as provide better performance than using individual nighttime light data [42,43].However, this integration, which provides only supplemental information for DMSP/OLS NLD, cannot eliminate data limitations, i.e., coarse spatial resolution of about 1 km and small data ranges from 0 to 63.
The next generation of nighttime light observations, the Day/Night Band included in the Visible Infrared Imaging Radiometer Suite on the Suomi National Polar-orbiting Partnership Satellite (NPP-VIIRS DNB) [44], has made dramatic improvements to the spatial, spectral and radiometric resolution.The increased spatial resolution (15 arc seconds, 742 m × 742 m), lower light imaging detection limits (~2 × 10 −11 W•cm −2 •sr −1 ) and higher radiometric quantization (14 bit) allow NPP-VIIRS DNB to provide more details of the city, bringing new insight into extracting urban extent [41,45].However, nighttime light data are seriously influenced by different economic conditions [46,47].Using solely NPP-VIIRS DNB may produce inaccurate spatial patterns of cities, especially for considerably different economic conditions.Concerning this issue, Guo et al. [31] established a Large-scale Impervious Surface Index (LISI) integrating NPP-VIIRS DNB and MODIS NDVI variables, which greatly improved the extraction performance.The LISI-based approach provides accurate spatial patterns from high values in core urban areas to low values in rural areas, with an overall root mean squared error of 0.11.Similarly, Sharma et al. [48] proposed an Urban Built-up Index (UBI) combining the MODIS multispectral imagery with NPP-VIIRS DNB to generate a 500-m resolution map of urban built-up areas at the global scale.The combination of spectral data and VIIRS nighttime light data significantly reduces the misclassification errors, capturing more detailed information of urban spatial patterns.Nevertheless, indexes only estimate the performance in the linear model.More research is still required to explore the nonparametric algorithms and broaden the joint utilization of multi-source coarser images for urban land extraction [49].
The main purpose of this study is to evaluate the effectiveness of combining NPP-VIIRS DNB and MODIS NDVI on regional urban extent mapping by using a Back Propagation (BP) neural network-based one-class classification method.To achieve this goal, we extracted the urban land area in mainland China (excluding surrounding islands) in 2012 and conducted contrastive analyses on the input data, the classification method and the results.We comparatively analyze the extracted urban map with the map generated by DMSP/OLS NLD and MODIS NDVI and the urban extent extracted from MCD12Q1 in MODIS products in Section 5.1.We explore the advantages of combining NPP-VIIRS DNB and MODIS NDVI in Section 5.2 and compare our method with the linear method (LISI) in Section 5.3.

Defining Urban Extent
Regarding different research perspectives, different definitions of "urban extent" are introduced [50][51][52].For example, if the urban studies are related to census data, "urban extent" may be defined by population distribution; if the studies use nighttime light images, the definition may be related to economic condition.To reduce the uncertainty of the term "urban extent", it is necessary to define it before characterizing urbanized areas.
As this study refers to the physical attributes of urban extent, we employ the definition proposed by Schneider et al. [12]; that is, "urban extent" refers to places dominated by artificial surfaces with a minimum area of 1 km 2 , including all non-vegetative, human-constructed elements, such as roads, buildings and runways, and "dominated" implies coverage greater than 50% of a pixel.When vegetation covers most of a pixel, it will not be considered as the urban type, even if it may function as urban space, such as a park.

Study Area
This paper takes mainland China as the case study area, which contains 30 provinces, but excludes the surrounding islands (Figure 1).In addition, eight cities (Urumchi, Suihua, Hohhot, Datong, Beijing, Chengdu, Wuhan and Guangzhou) with different levels of urbanization and different geographical locations are selected as the cases for contrastive analysis in Section 5.

Data Acquisition and Pre-Processing
Four different sources of data are used for this work: (1) collection 5 MODIS 500 m NDVI data products (MOD13A1) for 2012 [54]; (2) NPP-VIIRS DNB composited data for 2012 [55]; (3) DMSP/OLS NLD acquired by the DMSP satellite F18 for 2012 [56]; and (4) Global Land Cover Mapping at 30 m resolution for 2010 (GLC30-2010) [57].A brief description of the datasets is listed in Table 1.The input data for classification are the maximum NDVI retrieved from MODIS NDVI products, and the normalized NPP-VIIRS DNB.DMSP/OLS NLD and GLC30-2010 are auxiliary data.The stable DMSP/OLS NLD is employed for filtering NPP-VIIRS DNB, and the GLC30-2010 dataset is used for sample selection.All classification data are reprojected to Albers conical equal In addition, eight cities (Urumchi, Suihua, Hohhot, Datong, Beijing, Chengdu, Wuhan and Guangzhou) with different levels of urbanization and different geographical locations are selected as the cases for contrastive analysis in Section 5.

Data Acquisition and Pre-Processing
Four different sources of data are used for this work: (1) collection 5 MODIS 500 m NDVI data products (MOD13A1) for 2012 [54]; (2) NPP-VIIRS DNB composited data for 2012 [55]; (3) DMSP/OLS NLD acquired by the DMSP satellite F18 for 2012 [56]; and (4) Global Land Cover Mapping at 30 m resolution for 2010 (GLC30-2010) [57].A brief description of the datasets is listed in Table 1.The input data for classification are the maximum NDVI retrieved from MODIS NDVI products, and the normalized NPP-VIIRS DNB.DMSP/OLS NLD and GLC30-2010 are auxiliary data.The stable DMSP/OLS NLD is employed for filtering NPP-VIIRS DNB, and the GLC30-2010 dataset is used for sample selection.All classification data are reprojected to Albers conical equal area projection by the nearest-neighbor resampling algorithm and resampled into the same spatial resolution of 480 m.Details of the data pre-processing are described as follows.

Generating the Maximum NDVI
NDVI is a graphical indicator that can assess whether the ground surface contains green vegetation.Urban lands also contain vegetation, but with a much lower vegetation coverage rate than the pure vegetation areas such as grassland.There is a negative correlation between NDVI and urban land.Previous research has employed NDVI to estimate the urban extent [42].However, when a single-date NDVI is used, urban lands are easily confused with some vegetation areas, such as croplands.This is because the NDVI of deciduous vegetation areas varies with the phenological change, while the urban NDVI is relatively stable for several seasons.For example, croplands are mainly covered by plants, having a large NDVI value.After the harvest, however, bare soil will be dominant, and the NDVI values reduce accordingly, which may become the same as that of urban land.Thus, to extract the spectral differences and highlight urban land, the maximum NDVI retrieved from multi-temporal NDVI is employed.The NDVI composite can also remove the impact of cloud contamination.
The MODIS vegetation index product (MOD13A1) provides time-and space-continuous NDVI with 16-day intervals at 500 m spatial resolution.To cover the entire study area, 17 MODIS scenes are required: h23v04, h23v05, h24v04, h24v05, h25v03-h25v06, h26v03-h26v06, h27v04-h27v06, h28v05, and h28v06.For each scene, 23 tiles are available in 2012.Finally, a total number of 391 MODIS tiles are downloaded.The maximum NDVI of each scene is calculated by Equation (1) from all corresponding MODIS NDVI tiles.
where Si is the index of each scene and NDVI 1 , NDVI 2 , . . ., NDVI 23 are the MODIS NDVI tiles acquired in 2012.
The NDVI values range from −1 to 1.A positive value, even close to zero, indicates a region covered by vegetation, while a negative value signifies non-vegetated areas, such as water bodies and snow cover.According to the requirements of the classification method, the negative values are removed from the maximum NDVI image.Thus, the NDVI of final maximum image ranges between 0 and 1.

Removing Background Noise and Filtering Extreme Values in NPP-VIIRS DNB
The original nighttime light image was composited by the NPP-VIIRS DNB data with low moonlight during 18-26 April 2012 and 11-23 October 2012.As there exist unstable lights (e.g., fires, gas flares, volcanoes, aurora, etc.) in the original image, it is necessary to remove these background noises and ephemeral lights before use.
A simple method is employed to remove the outliers from the original NPP-VIIRS data [58].The pre-processing procedure contains two steps (Figure 2).(1) The first step is removing the background noise surface by DMSP/OLS NLD.The stable DMSP/OLS NLD only contains the persistent light (the lights from cities, towns, etc.), and the ephemeral lights (e.g., fires) have been discarded.Thus, the non-light area (DN = 0) of the stable DMSP/OLS NLD is extracted and overlays NPP-VIIRS DNB to mask out the background noise.This step not only retains the stable night light, but also removes many non-urban areas, thus reducing the data process load.The classification method used in this paper is not scale invariant.To keep all input data sources (i.e., the maximum NDVI and the NPP-VIIRS DNB data) in the same data range (between 0 and 1), the filtered NPP-VIIRS DNB image is finally normalized with Equation (2): where DNBni is the normalized value of the ith pixel, DNBi is the original value of the ith pixel and DNBmin and DNBmax are the minimum and maximum values of all pixels, respectively.In addition, to facilitate the classification process, the NPP-VIIRS DNB image is finally split scene by scene according to the MODIS grid.

Aggregating GLC30-2010 from 30 m to 480 m
The world's first 30 m resolution global land cover dataset (GLC30-2010) consists of 10 first-level classes, namely water bodies, wetland, artificial surfaces, cultivated land, permanent snow/ice, forest, shrubland, grassland, bare land and tundra.An accuracy assessment of the GLC30-2010 dataset was conducted by third-party experts with a total of 159,874 pixel samples [59,60].The overall accuracy of the GLC30-2010 dataset is 83.5%.In this study, the artificial surfaces (value = 80) were extracted for the training set sampling (Section 2.3.1).The accuracy of artificial surfaces reaches 86.97%.
To maintain a consistent scale, the 30 m spatial resolution artificial surfaces are aggregated to 480 m by a 16 × 16 window based on a majority rule.The value of each aggregated pixel is the area proportion of artificial surfaces.For example, if 143 pixels in a 16 × 16 window are labeled as artificial surfaces, the value of corresponding aggregated pixel will be 0.559 (i.e., 143/(16 × 16)).The aggregated map is an urban area proportion map.According to the definition of "urban extent", the urban area proportion map can be further classified into two types based on the threshold of 0.5.The values of pixels that are equal to or greater than 0.5 are reassigned to 1, and the other are set to 0, representing urban and non-urban area, respectively.The blocks less than 1 km 2 (i.e., 4 pixels) are The classification method used in this paper is not scale invariant.To keep all input data sources (i.e., the maximum NDVI and the NPP-VIIRS DNB data) in the same data range (between 0 and 1), the filtered NPP-VIIRS DNB image is finally normalized with Equation (2): where DNB ni is the normalized value of the ith pixel, DNB i is the original value of the ith pixel and DNB min and DNB max are the minimum and maximum values of all pixels, respectively.In addition, to facilitate the classification process, the NPP-VIIRS DNB image is finally split scene by scene according to the MODIS grid.

Aggregating GLC30-2010 from 30 m to 480 m
The world's first 30 m resolution global land cover dataset (GLC30-2010) consists of 10 first-level classes, namely water bodies, wetland, artificial surfaces, cultivated land, permanent snow/ice, forest, shrubland, grassland, bare land and tundra.An accuracy assessment of the GLC30-2010 dataset was conducted by third-party experts with a total of 159,874 pixel samples [59,60].The overall accuracy of the GLC30-2010 dataset is 83.5%.In this study, the artificial surfaces (value = 80) were extracted for the training set sampling (Section 2.3.1).The accuracy of artificial surfaces reaches 86.97%.
To maintain a consistent scale, the 30 m spatial resolution artificial surfaces are aggregated to 480 m by a 16 × 16 window based on a majority rule.The value of each aggregated pixel is the area proportion of artificial surfaces.For example, if 143 pixels in a 16 × 16 window are labeled as artificial surfaces, the value of corresponding aggregated pixel will be 0.559 (i.e., 143/(16 × 16)).The aggregated map is an urban area proportion map.According to the definition of "urban extent", the urban area proportion map can be further classified into two types based on the threshold of 0.5.The values of pixels that are equal to or greater than 0.5 are reassigned to 1, and the other are set to 0, representing urban and non-urban area, respectively.The blocks less than 1 km 2 (i.e., 4 pixels) are removed due to the minimum area constraint.The re-classified map is called the GLC30-2010 480 m urban map in this paper.

Methods
The processing procedure is summarized in

Methods
The processing procedure is summarized in   Testing samples (consisting of both positive samples and negative samples) are collected randomly from the entire study area.To ensure the accuracy of testing samples, all labels are manual assigned.Each testing sample is labeled as urban or non-urban using visual interpretation from high spatial resolution images (e.g., Landsat images or Google Earth images) by three professionals, respectively, and the final label is determined by most of the three.A double-blind procedure is employed to reduce the uncertainty and bias during the labeling procedure, i.e., the professionals do not know the classification results.In this study, 20,000 testing samples are randomly selected, including 162 urban samples and 19,838 non-urban samples.The selected testing samples are removed from the study area, and the training samples are then selected from the rest.This step can guarantee that the training and the testing samples are independent datasets.
The training set in this paper only requires urban samples and unlabeled samples, since the classification method employed can train the classifier from positive and unlabeled samples.Unlabeled samples are randomly collected from the background (filtered by the DMSP/OLS NLD non-light area) and do not need to have their class labeled.Urban samples are randomly selected from the urban area.However, even if only one type of positive sample is required to be labeled, manually labeling random samples on a wide geographic scope is time consuming and labor  Testing samples (consisting of both positive samples and negative samples) are collected randomly from the entire study area.To ensure the accuracy of testing samples, all labels are manual assigned.Each testing sample is labeled as urban or non-urban using visual interpretation from high spatial resolution images (e.g., Landsat images or Google Earth images) by three professionals, respectively, and the final label is determined by most of the three.A double-blind procedure is employed to reduce the uncertainty and bias during the labeling procedure, i.e., the professionals do not know the classification results.In this study, 20,000 testing samples are randomly selected, including 162 urban samples and 19,838 non-urban samples.The selected testing samples are removed from the study area, and the training samples are then selected from the rest.This step can guarantee that the training and the testing samples are independent datasets.
The training set in this paper only requires urban samples and unlabeled samples, since the classification method employed can train the classifier from positive and unlabeled samples.Unlabeled samples are randomly collected from the background (filtered by the DMSP/OLS NLD non-light area) and do not need to have their class labeled.Urban samples are randomly selected from the urban area.However, even if only one type of positive sample is required to be labeled, manually labeling random samples on a wide geographic scope is time consuming and labor intensive.A semi-artificial sampling strategy is adopted to lessen the workload of labeling during the training set sampling process.First, urban samples are collected from the urban area in the GLC30-2010 480 m urban map, and then, their labels are further validated with high spatial resolution images by visual interpretation.To obtain representative urban samples, the urban patches in the GLC30-2010 480 m urban map are shrunk with one pixel at the edge.This step can also eliminate the uncertainty of pixels on the fringes of cities.
Additionally, the characteristics of cities in different regions are different.For example, the vegetation density in the southeast part of China is usually higher than that in the northwest.The urban samples from the north part may not be able to represent the features of that from the south.Thus, a scene-by-scene basis is employed for training set sampling to avoid biased mapping results.For each scene, 1000 urban samples and 5000 unlabeled samples are randomly selected.The scenes with only a few urban pixels are combined into the neighbor scene.Across the entire mapping region, a total of 7000 urban pixels (about 4.396% of the urban pixels in the shrunken GLC30-2010 480 m urban map) and 35,000 unlabeled pixels (about 0.497% of the total pixels) are selected to construct the training set.

Training and Predicting
The traditional supervised classification method, the most frequently-used remote sensing method, requires that all land types in an image be completely labeled [61].However, when the target class is only a specific land cover type, this process becomes labor intensive and time consuming.The one-class classifier is a good choice to solve this problem because it focuses on the target data only.
Different one-class classifiers have been introduced in remote sensing classification.The classic One-Class Support Vector Machine (OCSVM) is one of the most widely-used methods [62].The core idea of this method is to find the hyperplane that separates the training data from the origin with the maximum margin.OCSVM has been successfully applied in the change detection of one specific class (e.g., fenland and urban extent) [63,64].Another well-known approach is called the Support Vector Data Description (SVDD) [65].It is based on the theory of the support vector machine, finding a hypersphere with minimal radius, which encloses all samples of one class and assumes that samples outside this hypersphere do not belong to the class [66].It has shown promise in the fenland class training [67].However, the drawback of these SVM-based methods is that too many free parameters are included, which leads to the outcome sensitivity.In addition, the complicated model selection procedures also preclude its adoption [68,69].
A novel one-class classification algorithm called the Present-Unlabeled Learning algorithm (PUL) [70] was introduced into remote sensing image classification [68].Generally, if both positive data and negative data are available in the training set, traditional binary classifiers (e.g., SVMs) can learn a function to predict the probability of individual samples being positive.However, if only positive and unknown samples are available in the training set, the function cannot be estimated directly by these methods.A lemma was introduced by Elkan and Noto into PUL to train the classifier from only positive and unknown samples.
In this study, the urban class is referred to as the positive class (y = 1), while the non-urban class is referred to as the negative class (y = −1).If the class (urban or non-urban) of a sample x is explicitly known, it will be referred to as labeled (s = 1); otherwise, it will be referred to as unlabeled (s = 0).In this study, only urban samples are labeled.Thus, a labeled sample must be urban, and the probability of a non-urban sample x being labeled is zero, we have Remote Sens. 2017, 9, 862 9 of 25 Suppose each urban pixel within the whole image has the same probability of being labeled regardless of its location x, i.e., the probability that an urban sample is labeled is a constant c, we have The constant c indicates the level that positive data are detected or labeled in the samples.For example, suppose that a study area contains 50 positive samples and 50 negative samples.c = 1 means all the positive samples are labeled; c = 0.5 means approximately 25 positive samples are detected, resulting in 25 labeled data and 75 unlabeled data.
The target of a binary classifier is estimating a function f (x) = p(y = 1|x) from the finite training data.With a random selected training set that satisfies Equations ( 3) and ( 4), we can train a function g(x) = p(s = 1|x) in PUL, we have Substituting f (x) = p(y = 1|x) and c = p(s = 1|y = 1), which we defined, we have which is to say that the difference between the classifier f (x) and g(x) is a constant c.If we estimate the constant, the desired classifier f (x) can be obtained.
The constant c can be estimated from a validation set.Suppose a validation set V is randomly drawn from the original training set; P is the labeled samples in V, hence positive (urban).For all pixels in P (i.e., x ∈ P), there is p(y = 1|x) = 1 and p(y = −1|x) = 0. Therefore, which is to say that the constant c can be estimated by any g(x) from the subset P, if g(x) = p(s = 1|x).
To obtain a more reliable estimator of the constant c, the average values of g(x) (x ∈ P) are employed, as shown in Equation ( 8): where n is the cardinality of P.
In summary, to evaluate the probability of a pixel being urban area, two steps are required in this paper.First, the function g(x) is trained by only urban pixels and unknown pixels, which satisfied Equations ( 3) and (4), and estimated the constant c from Equation (8).Secondly, the desired classifier f (x) is obtained by calibrating g(x) with c on the basis of Equation (6).
It is worth noting that PUL is not a specific classifier, but a general framework for classifier learning.In this study, the Back Propagation (BP) neural network is used as the classifier.It has been proven that PUL is insensitive to outliers and parameters [71].PUL is able to predict the probability of presence without negative data.It simply requires a small set of positive data and a large set of unlabeled data to help train a classifier and estimate the posterior probabilities of the target class, which can reduce the work in collecting samples and hence accelerate classification efficiency.This makes PUL an ideal method in regional urban extent mapping.More details of the principle and process of its deduction can be found in the paper [70][71][72].

Accessing Accuracy
To quantify the performance of built-up urban areas extracted by our classification scheme, the conventional approach is employed, i.e., comparing our classified map with an independent, random sample of "ground truth" points [73].
The overall accuracy (OA) is the commonly-used index for accuracy assessment.However, when there is class imbalance (i.e., the non-urban samples far outnumbers the urban samples), the overall accuracy is not an appropriate evaluation criterion [74,75].It is often biased towards the majority classes, ignoring the minority classes.To avoid the influence of the relatively small portion of urban area compared with non-urban area on the total accuracy, we employ another three indexes: the producer's accuracy, the user's accuracy and the kappa coefficient as performance evaluation measures.The producer's accuracy evaluates the omission error, which indicates the probability that a reference urban pixel will be correctly identified.The user's accuracy measures the commission error, indicating the probability that a classified urban pixel is truly urban.The kappa coefficient combining both types of errors can provide a more comprehensive assessment [76].Kappa measures the agreement between our results and the ground truth samples.The closer that the kappa value is to 1, the higher the agreement is [77][78][79].A confusion matrix is established to perform the accuracy assessment.The formulas of overall accuracy and the kappa coefficient are as follows: where N is the total number of pixels in the testing set, x kk is the diagonal of the classification confusion matrix, x k+ is the total number of pixels of the k class and x +k is the total number of pixels that are classified into the k class.

Urban Land Classification Accuracy on the Pixel Level
The final urban extent map obtained by the proposed classification scheme is shown in Figure 4.The predicted urban area is 68,023.75square kilometers (295,242 pixels, approximately 0.71% of the total area).The urban blocks are not evenly distributed, mainly in the north of China and the southeast coastal areas and rarely in the west parts.If we divide the study area into two roughly equal parts by the Heihe-Tengchong Line [80], 87.9% of the urbanized areas are contained to the east of the line, which has the same tendency as the population density.The eastern urban coverage rate is 1.7%, higher than the average.
Accuracy assessment using randomly selected urban and non-urban samples shows that the accuracy of the obtained urban map is high.The total accuracy for urban area detection is 99.76%, and the kappa coefficient is 0.842 (Table 2).The user accuracy for urban areas is 89.58%, indicating that only a very small number of non-urban areas are misclassified as urban.In terms of the producer's accuracy of urban areas, about 80% of the reference urban areas are successfully identified.
To better understand the classification results, accuracy assessment was also conducted on sub-regions.The study area was divided into 30 sub-regions according to the provincial administrative boundaries.To assure the spread of urban samples in each validation set, at least 50 urban samples were collected in each sub-region.Due to the imbalanced composition of urban and non-urban features, area adjustment was conducted for each accuracy assessment.The number of non-urban samples was based on its coverage rate.All collected samples were manually validated with Landsat images or Google Earth images by visual interpretation.The OA, kappa coefficient, user's and producer's accuracy of each sub-region were calculated.The sub-region accuracies are similar to that obtained from the entire study area.The provincial overall accuracies are all greater than 90%, averaging 99.34%.The value of the kappa coefficient varies from 0.73 to 0.99, with an average of 0.848.The kappa coefficients are divided into six classes by equal intervals (Figure 5).The western have a higher kappa than the eastern parts, especially the five provinces with the lowest kappa (0.730, 0.734, 0.761, 0.761 and 0.766 for Henan, Anhui, Hebei, Shandong and Jiangsu, respectively).Figure 6 demonstrates the user's and producer's accuracy of each province, which represents the commission errors and the omission errors, respectively.For most provinces, the user's accuracies (92.49% on average) are higher than the producer's accuracies (an average of 80.23%), which also remain broadly in line with the results of the whole map.
Remote Sens. 2017, 9, 862 11 of 25 producer's accuracy of urban areas, about 80% of the reference urban areas are successfully identified.
To better understand the classification results, accuracy assessment was also conducted on sub-regions.The study area was divided into 30 sub-regions according to the provincial administrative boundaries.To assure the spread of urban samples in each validation set, at least 50 urban samples were collected in each sub-region.Due to the imbalanced composition of urban and non-urban features, area adjustment was conducted for each accuracy assessment.The number of non-urban samples was based on its coverage rate.All collected samples were manually validated with Landsat images or Google Earth images by visual interpretation.The OA, kappa coefficient, user's and producer's accuracy of each sub-region were calculated.The sub-region accuracies are similar to that obtained from the entire study area.The provincial overall accuracies are all greater than 90%, averaging 99.34%.The value of the kappa coefficient varies from 0.73 to 0.99, with an average of 0.848.The kappa coefficients are divided into six classes by equal intervals (Figure 5).The western have a higher kappa than the eastern parts, especially the five provinces with the lowest kappa (0.730, 0.734, 0.761, 0.761 and 0.766 for Henan, Anhui, Hebei, Shandong and Jiangsu, respectively).Figure 6 demonstrates the user's and producer's accuracy of each province, which represents the commission errors and the omission errors, respectively.For most provinces, the user's accuracies (92.49% on average) are higher than the producer's accuracies (an average of 80.23%), which also remain broadly in line with the results of the whole map.The commission errors of this map mainly occurred in cultivated land and water areas.If we remove water bodies by water masks from the final map, the overall accuracy will rise to 99.80%, and the kappa coefficient will reach 0.865.It can be said that most of the classified urban areas are correctly classified.The omission errors mainly occurred in cultivated land.The area of some tribal non-city regions is beyond the minimum requirement of the definition of urban extent (i.e., 1 km 2 ), but the factors such as low population intensity and less human activity at night have a negative influence on the nighttime light data.When integrating nightlight data for classification, they may be omitted.That is also the factor contributing to the relatively low kappa coefficient for the five provinces (Henan, Anhui, Hebei, Shandong and Jiangsu).The producer's accuracies of the four provinces are much lower than their user's accuracies (Figure 6).An average of 38% of the urban blocks in this area is omitted.That is because this area is the major agricultural region in China (the North China Plain), where a large number of villages and towns are located.The omissions of the small "urban blocks" reduce the classification accuracy.
In general, the classification results are satisfied.On the pixel level, the proposed classification scheme has successfully identified most urban extent and with a high classification accuracy.
Remote Sens. 2017, 9, 862 12 of 25 The commission errors of this map mainly occurred in cultivated land and water areas.If we remove water bodies by water masks from the final map, the overall accuracy will rise to 99.80%, and the kappa coefficient will reach 0.865.It can be said that most of the classified urban areas are correctly classified.The omission errors mainly occurred in cultivated land.The area of some tribal non-city regions is beyond the minimum requirement of the definition of urban extent (i.e., 1 km 2 ), but the factors such as low population intensity and less human activity at night have a negative influence on the nighttime light data.When integrating nightlight data for classification, they may be omitted.That is also the factor contributing to the relatively low kappa coefficient for the five provinces (Henan, Anhui, Hebei, Shandong and Jiangsu).The producer's accuracies of the four provinces are much lower than their user's accuracies (Figure 6).An average of 38% of the urban blocks in this area is omitted.That is because this area is the major agricultural region in China (the North China Plain), where a large number of villages and towns are located.The omissions of the small "urban blocks" reduce the classification accuracy.
In general, the classification results are satisfied.On the pixel level, the proposed classification scheme has successfully identified most urban extent and with a high classification accuracy.The commission errors of this map mainly occurred in cultivated land and water areas.If we remove water bodies by water masks from the final map, the overall accuracy will rise to 99.80%, and the kappa coefficient will reach 0.865.It can be said that most of the classified urban areas are correctly classified.The omission errors mainly occurred in cultivated land.The area of some tribal non-city regions is beyond the minimum requirement of the definition of urban extent (i.e., 1 km 2 ), but the factors such as low population intensity and less human activity at night have a negative influence on the nighttime light data.When integrating nightlight data for classification, they may be omitted.That is also the factor contributing to the relatively low kappa coefficient for the five provinces (Henan, Anhui, Hebei, Shandong and Jiangsu).The producer's accuracies of the four provinces are much lower than their user's accuracies (Figure 6).An average of 38% of the urban blocks in this area is omitted.That is because this area is the major agricultural region in China (the North China Plain), where a large number of villages and towns are located.The omissions of the small "urban blocks" reduce the classification accuracy.
In general, the classification results are satisfied.On the pixel level, the proposed classification scheme has successfully identified most urban extent and with a high classification accuracy.

Urban Land Classification Accuracy on the City Level
To assess the classification accuracy on the city level, 647 cities and counties are selected based on China City Statistical Yearbook 2013.Cities are divided into four groups by administrative levels: municipality directly under the central government, vice-provincial city, prefecture-level city and county-level city (Table 3).The locations of the administrative department of each city are shown in Figure 7, represented by points.county-level city (Table 3).The locations of the administrative department of each city are shown in Figure 7, represented by points.The accuracy is high, reaching 99.38%.The classification scheme successfully mapped all of the municipalities directly under the central government, vice-provincial cities and prefecture-level cities.Only four county-level cities are omitted (Aershan, Jinggangshan, Nanxiong and Pingxiang).
From the spatial distribution perspective, the omissions are distributed mainly in the mountainous regions, and most land areas of the four cities are covered by farmland, grassland or mountain land; the urban blocks are relatively small.For example, Nanxiong, located in the northeastern part of Guangdong Province, is surrounded by the Dayu Mountains; over 80% of the areas of Pingxiang are covered by hilly regions, and the forest coverage of Jinggangshan reaches up to 81.2%.Due to the high coverage of vegetation, the NDVI values in these cities may be relatively higher than other cities.In addition, their population densities are relatively low.For instance, Aershan, an omission case located in the northeastern part of China, only contains a population of 48 thousand, but its land area is 7408.7 km 2 .Accordingly, the DNB values within the cities, as a signal of human activity, are much lower than a typical city, which may cause the omission errors.The accuracy is high, reaching 99.38%.The classification scheme successfully mapped all of the municipalities directly under the central government, vice-provincial cities and prefecture-level cities.
From the spatial distribution perspective, the omissions are distributed mainly in the mountainous regions, and most land areas of the four cities are covered by farmland, grassland or mountain land; the urban blocks are relatively small.For example, Nanxiong, located in the northeastern part of Guangdong Province, is surrounded by the Dayu Mountains; over 80% of the areas of Pingxiang are covered by hilly regions, and the forest coverage of Jinggangshan reaches up to 81.2%.Due to the high coverage of vegetation, the NDVI values in these cities may be relatively higher than other cities.In addition, their population densities are relatively low.For instance, Aershan, an omission case located in the northeastern part of China, only contains a population of 48 thousand, but its land area is 7408.7 km 2 .Accordingly, the DNB values within the cities, as a signal of human activity, are much lower than a typical city, which may cause the omission errors.

Comparative Analysis of Urban Extent under Different Nighttime Light Data
Previous studies have recommended the NPP-VIIRS DNB for extracting urban extent at regional and continental scales [24,31].The wider radiometric range and finer spatial resolution of NPP-VIIRS DNB provide an opportunity to obtain more detailed information about the urban extent.Thus, for contrastive analysis of the classification performance of different nighttime light data under the proposed classification scheme, another urban land map of the study area in 2012 was generated by DMSP/OLS nighttime light data and the maximum MODIS NDVI (Column 3, Figure 8).The maximum MODIS NDVI images, the training samples and the classification method (i.e., the PUL algorithm) are the same as used in this paper (Column 2, Figure 8).The only difference for the two maps is the source of the nighttime light data.In addition, public land cover datasets (MCD12Q1 in MODIS products [81]) are also employed, and the urban and built-up layer is extracted for comparison (Column 4, Figure 8).All maps demonstrate the situation of urban extent in 2012.Landsat images used as the ground truth are collected between October 2012 and August 2013 for the best display effect (Column 1, Figure 8).
Five cities, Beijing, Wuhan, Hohhot, Datong and Suihua, at different levels and scales were chosen as the cases of interest.Beijing is the biggest megacity in China with the largest population and Gross Domestic Product (GDP).Wuhan is a megacity located in Central China.Hohhot, Datong and Suihua are located in North China.The locations of the cities are shown in Figure 1.For each city, the urban omission rates of different productions are also calculated (Table 4).The same reference samples are used in each city.
Compared with MCD12Q1, NPP-VIIRS DNB and MODIS NDVI (referred to as "Group 1" below) have a better performance on the urban extent.Although the omission rate of Beijing generated by Group 1 (39.39%) is relatively higher than the other two maps, 27.27% for the map generated by DMSP/OLS NLD and MODIS NDVI (referred to as "Group 2" below) and 28.28% for MCD12Q1, its urban boundaries are more consistent with the ground truth.The urban extent in MCD12Q1 is smaller than that in the former two maps, and it is also inconsistent with that shown in the Landsat images.Many urban pixels especially those located in the urban fringe are omitted in MCD12Q1.The omission rate of MCD12Q1 is much higher than that of the other two productions, topping out at 57.14% in Suihua.In addition, the "salt and pepper" effect is relatively evident in MCD12Q1, such as Beijing.Through manual validation, it is found that many tiny urban blocks around main cities in MCD12Q1 are commission errors; cultivated lands or bare lands are frequently misclassified as urban lands.
For the two urban maps generated by different nightlight data, the difference is mainly in small urban blocks.The small urban blocks far away from city centers are often missed by Group 2. This phenomenon exists obviously in small cities.For example, in Hohhot, compared with the urban map generated by Group 1, surrounding small urban blocks are obviously lost by Group 2. That is why the omission rates of Group 1 are still in single digits, such as Suihua with only 8.57%, while those of Group 2 are far more than that, reaching 37.14%.The small urban blocks near city centers are merged into the large blocks.Some details of urban pattern are lost in the map generated by Group 2, such as Datong.
In general, NPP-VIIRS DNB has a better performance on detecting low light area, such as small urban blocks and the edge of cities.It can be seen from Figure 9 that when the value of NPP-VIIRS DNB starts to rise from the bottom, DMSP/OLS NLD is already saturated.Although NDVI makes up some detail losses, the high nighttime light values preclude the separation of the non-urban areas.This makes the small urban blocks close to city centers be merged into the large urban blocks, resulting in detail losses and overestimation.In addition, the DMSP/OLS NLD is limited at detecting low light.Isolated small urban blocks with faint light are easily ignored, causing omission errors.

Case Analysis: Performance of Combining NPP-VIIRS DNB and MODIS NDVI
Different remotely-sensed data have their own characteristics, and combined use of them could provide more information than individually [42].To evaluate the performance of combining NPP-VIIRS DNB and MODIS NDVI, we analyzed the distribution frequencies of the maximum NDVI calculated for the entire year and the NPP-VIIRS DNB value of four frequently-confused land cover types, including cultivated land, water bodies, artificial surfaces and bare land.The definitions of the four land cover types refer to those in GLC30-2010 dataset, since the training samples in this paper are collected mainly based on this dataset.The accuracy of cultivated land, water bodies, artificial surfaces and bare land in GLC 30-2010 is 82.76%, 84.70%, 86.70% and 81.76%, respectively.
Five cities (Beijing, Wuhan, Guangzhou, Chengdu and Urumchi) distributed across various geographies in China are employed as the cases (Figure 1).To maintain the consistent spatial resolution, pixels are aggregated to 480 m based on the majority rule.All pixels of the four land cover types in these cities are counted.
The distribution frequencies of the NDVI value of the four land cover types are shown in Figure 10.It can be seen that artificial surfaces have NDVI values between 0.3 and 0.8 (Figure 10a).Cultivated land shows prominent vegetation characteristic.The NDVI is mainly distributed between 0.6 and 0.9; only a small part is distributed among 0.4 to 0.5.A value below 0.4 is rarely observed (Figure 10b).NDVI values of water bodies have a wide distribution, ranging from 0.1 to 0.9 (Figure 10c), and NDVI values of bare land are mainly below 0.2 (Figure 10d).

Case Analysis: Performance of Combining NPP-VIIRS DNB and MODIS NDVI
Different remotely-sensed data have their own characteristics, and combined use of them could provide more information than individually [42].To evaluate the performance of combining NPP-VIIRS DNB and MODIS NDVI, we analyzed the distribution frequencies of the maximum NDVI calculated for the entire year and the NPP-VIIRS DNB value of four frequently-confused land cover types, including cultivated land, water bodies, artificial surfaces and bare land.The definitions of the four land cover types refer to those in GLC30-2010 dataset, since the training samples in this paper are collected mainly based on this dataset.The accuracy of cultivated land, water bodies, artificial surfaces and bare land in GLC 30-2010 is 82.76%, 84.70%, 86.70% and 81.76%, respectively.
Five cities (Beijing, Wuhan, Guangzhou, Chengdu and Urumchi) distributed across various geographies in China are employed as the cases (Figure 1).To maintain the consistent spatial resolution, pixels are aggregated to 480 m based on the majority rule.All pixels of the four land cover types in these cities are counted.
The distribution frequencies of the NDVI value of the four land cover types are shown in Figure 10.It can be seen that artificial surfaces have NDVI values between 0.3 and 0.8 (Figure 10a).Cultivated land shows prominent vegetation characteristic.The NDVI is mainly distributed between 0.6 and 0.9; only a small part is distributed among 0.4 to 0.5.A value below 0.4 is rarely observed (Figure 10b).NDVI values of water bodies have a wide distribution, ranging from 0.1 to 0.9 (Figure 10c), and NDVI values of bare land are mainly below 0.2 (Figure 10d).small, and the pixels of water bodies are few in number.Bare land and water bodies are not the main confused land covers.If only using NDVI as the factor for classification, artificial surfaces may result in many misclassification errors.
Figure 11 demonstrates the distribution frequencies of the NPP-VIIRS DNB value of the four land cover types.Most DNB values of cultivated land, water bodies and bare land are distributed below five, while those of artificial surfaces are relatively evenly distributed.However, there are still some low light areas (DNB < 5) existing in artificial surfaces, resulting in confusion.Although the NDVI distribution trends of these four land covers are different, there is still significant overlap.Many pixels of cultivated land have a similar NDVI value to that of artificial surfaces, especially between 0.5 and 0.7.That is, during this interval, little difference exists between cultivated land and artificial surfaces.The NDVI value distribution areas of bare land and water bodies are also overlapped with that of artificial surfaces.However, the overlap area of bare land is small, and the pixels of water bodies are few in number.Bare land and water bodies are not the main confused land covers.If only using NDVI as the factor for classification, artificial surfaces may result in many misclassification errors.
Figure 11 demonstrates the distribution frequencies of the NPP-VIIRS DNB value of the four land cover types.Most DNB values of cultivated land, water bodies and bare land are distributed below five, while those of artificial surfaces are relatively evenly distributed.However, there are still some low light areas (DNB < 5) existing in artificial surfaces, resulting in confusion.Although the NDVI distribution trends of these four land covers are different, there is still significant overlap.Many pixels of cultivated land have a similar NDVI value to that of artificial surfaces, especially between 0.5 and 0.7.That is, during this interval, little difference exists between cultivated land and artificial surfaces.The NDVI value distribution areas of bare land and water bodies are also overlapped with that of artificial surfaces.However, the overlap area of bare land is small, and the pixels of water bodies are few in number.Bare land and water bodies are not the main confused land covers.If only using NDVI as the factor for classification, artificial surfaces may result in many misclassification errors.
Figure 11 demonstrates the distribution frequencies of the NPP-VIIRS DNB value of the four land cover types.Most DNB values of cultivated land, water bodies and bare land are distributed below five, while those of artificial surfaces are relatively evenly distributed.However, there are still some low light areas (DNB < 5) existing in artificial surfaces, resulting in confusion.It is a bit difficult to separate these land cover types based on one single type of data.Figure 12 demonstrates the simultaneous use of NPP-VIIRS DNB and the maximum MODIS NDVI (200 random samples).It can be seen that most features can be distinguished by the two-dimensional feature.Almost all of the bare land is separated from artificial surfaces.Only a small part of water bodies and cultivated land with relatively high nightlight values is mixed in the main part of artificial surfaces, which may cause commission errors.It is important to note the low light parts.Because the low nighttime light is not the typical urban feature, artificial surfaces in this part are frequently omitted during the classification.This is also the main error of our classification results.For a further understanding of the artificial surfaces, we classified artificial surfaces into two types: urban artificial surfaces and rural artificial surfaces.If a pixel is surrounded mainly by cultivated land, it will be considered as a rural artificial surface; otherwise, it is an urban artificial surface.As seen in Figure 13, urban artificial surfaces have relatively higher nightlight values and lower NDVI values.Conversely, the artificial surfaces with low nightlight and high NDVI are mainly located in rural areas.That is, the combination of NPP-VIIRS DNB and MODIS NDVI can well distinguish urban artificial surfaces, but is limited in identifying rural artificial surfaces.It could also explain why the omission errors of our map mostly occurred in rural areas.
Remote Sens. 2017, 9, 862 18 of 25 It is a bit difficult to separate these land cover types based on one single type of data.Figure 12 demonstrates the simultaneous use of NPP-VIIRS DNB and the maximum MODIS NDVI (200 random samples).It can be seen that most features can be distinguished by the two-dimensional feature.Almost all of the bare land is separated from artificial surfaces.Only a small part of water bodies and cultivated land with relatively high nightlight values is mixed in the main part of artificial surfaces, which may cause commission errors.It is important to note the low light parts.Because the low nighttime light is not the typical urban feature, artificial surfaces in this part are frequently omitted during the classification.This is also the main error of our classification results.For a further understanding of the artificial surfaces, we classified artificial surfaces into two types: urban artificial surfaces and rural artificial surfaces.If a pixel is surrounded mainly by cultivated land, it will be considered as a rural artificial surface; otherwise, it is an urban artificial surface.As seen in Figure 13, urban artificial surfaces have relatively higher nightlight values and lower NDVI values.Conversely, the artificial surfaces with low nightlight and high NDVI are mainly located in rural areas.That is, the combination of NPP-VIIRS DNB and MODIS NDVI can well distinguish urban artificial surfaces, but is limited in identifying rural artificial surfaces.It could also explain why the omission errors of our map mostly occurred in rural areas.

Reliability Analysis
Several indexes have been proposed to combine the advantages of both NPP-VIIRS DNB and MODIS NDVI features [31,48].It has been proven that these indexes are effective on large-scale urban extent extraction.Different from the indexes, the classification method employed in this paper, i.e., PUL, combines NPP-VIIRS DNB with MODIS NDVI in a nonlinear way.It is necessary to evaluate the reliability of this nonparametric fusion approach.
For contrastive analysis on the reliability, an impervious surface index distribution mapping method, Large-scale Impervious Surface Index (LISI) [31], was introduced.LISI was calculated under the same input data as those used by PUL.The formula of LISI is shown as follows: ( ) where NDVIMAX is the maximum NDVI from multi-temporal NDVI images and DNBnor is normalized VIIRS-DNB data ranging between zero and one.Since NDVIMAX is negatively related to ISA while DNBnor has a positive relation, a lower NDVI and a higher DNB represent a higher probability of ISA, i.e., LISI.
To visualize the difference of probability distribution between LISI and PUL, the standard deviation is calculated with a 3 × 3 window.The value of the center pixel is replaced with the standard deviation of the neighboring pixels.A higher standard deviation represents a larger variation of the probability values.
Three regions with various economic conditions and geographic environments were selected as the test area (Figure 14), located in northern China (Region 1), central China (Region 2) and southern

Reliability Analysis
Several indexes have been proposed to combine the advantages of both NPP-VIIRS DNB and MODIS NDVI features [31,48].It has been proven that these indexes are effective on large-scale urban extent extraction.Different from the indexes, the classification method employed in this paper, i.e., PUL, combines NPP-VIIRS DNB with MODIS NDVI in a nonlinear way.It is necessary to evaluate the reliability of this nonparametric fusion approach.
For contrastive analysis on the reliability, an impervious surface index distribution mapping method, Large-scale Impervious Surface Index (LISI) [31], was introduced.LISI was calculated under the same input data as those used by PUL.The formula of LISI is shown as follows: where NDVI MAX is the maximum NDVI from multi-temporal NDVI images and DNB nor is normalized VIIRS-DNB data ranging between zero and one.Since NDVI MAX is negatively related to ISA while DNB nor has a positive relation, a lower NDVI and a higher DNB represent a higher probability of ISA, i.e., LISI.
To visualize the difference of probability distribution between LISI and PUL, the standard deviation is calculated with a 3 × 3 window.The value of the center pixel is replaced with the standard deviation of the neighboring pixels.A higher standard deviation represents a larger variation of the probability values.Three regions with various economic conditions and geographic environments were selected as the test area (Figure 14), located in northern China (Region 1), central China (Region 2) and southern China (Region 3).First, the probability distribution in test areas was calculated with LISI and PUL, respectively, and then, based on the output probability map, the standard deviations were further calculated.
China (Region 3).First, the probability distribution in test areas was calculated with LISI and PUL, respectively, and then, based on the output probability map, the standard deviations were further calculated.
As shown in Figure 14, the spatial distribution of the standard deviation calculated by PUL presents a regular shape, and it almost surrounds the urban areas, which means the probabilities on the urban fringe calculated by PUL display an obvious change.The probabilities of inner cities are higher than in the outside.In contrast, the standard deviation calculated by LISI shows more details of inner cities.The probabilities decrease from high values in core urban areas to low values in rural areas, and the urban boundary is vague.Thus, when mapping urban areas into a binary image, PUL is more practical and effective, since the threshold of the urban boundary is much easier to set.To prove this, we calculate the urban pixel count under different threshold in steps of 0.05 (Figure 15).It can be seen from the line charts that the urban pixels of the three regions calculated by PUL are rapidly converged in a small range (within 0.1), while the reduction speed of the urban pixels in LISI is slower (about 0.1 to 0.3).The slow convergence makes it difficult to select a suitable threshold for LISI.In addition, in LISI, the convergence rates of different regions are different, e.g., when the threshold is 0.2, the urban pixels count in Region 2 tends to converge; however, it is not converged in Region 1, only until 0.7.That is, when using LISI, different thresholds are required for different regions.By contrast, the PUL method is insensitive to the threshold and has good robustness (Table 5).Thus, for regional urban extent mapping, PUL is a more appropriate choice.As shown in Figure 14, the spatial distribution of the standard deviation calculated by PUL presents a regular shape, and it almost surrounds the urban areas, which means the probabilities on the urban fringe calculated by PUL display an obvious change.The probabilities of inner cities are higher than in the outside.In contrast, the standard deviation calculated by LISI shows more details of inner cities.The probabilities decrease from high values in core urban areas to low values in rural areas, and the urban boundary is vague.Thus, when mapping urban areas into a binary image, PUL is more practical and effective, since the threshold of the urban boundary is much easier to set.
To prove this, we calculate the urban pixel count under different threshold in steps of 0.05 (Figure 15).It can be seen from the line charts that the urban pixels of the three regions calculated by PUL are rapidly converged in a small range (within 0.1), while the reduction speed of the urban pixels in LISI is slower (about 0.1 to 0.3).The slow convergence makes it difficult to select a suitable threshold for LISI.In addition, in LISI, the convergence rates of different regions are different, e.g., when the threshold is 0.2, the urban pixels count in Region 2 tends to converge; however, it is not converged in Region 1, only until 0.7.That is, when using LISI, different thresholds are required for different regions.By contrast, the PUL method is insensitive to the threshold and has good robustness (Table 5).Thus, for regional urban extent mapping, PUL is a more appropriate choice.

Conclusions
This study presented an urban-mapping scheme for large-scale mapping.We employed a BP neural network-based one-class classification method (the PUL algorithm) to map the urban extents of China in 2012 by using NPP-VIIRS DNB and MODIS NDVI as variables.The kappa of the obtained binary urban map reached 0.842, which shows the effectiveness of this classification scheme.The combined use of MODIS NDVI and NPP-VIIRS DNB can significantly help to separate urban blocks from the background and reduces errors caused by confusing land covers (e.g., bare land, cultivated land and waterbodies).Compared with DMSP/OLS NLD, NPP-VIIRS DNB has a better performance for detecting underdeveloped towns and suburban areas with low light brightness of nighttime light data, supplying more detailed information about the urban extent.In addition, the classification method (PUL) can significantly reduce the effort needed in assigning labels to train samples and is robust to the threshold, showing great potential to map urban extent at a large scale.Although our result is lacking for the rural regions, the urban extent estimated from our classification scheme is of high accuracy.In the future, we will further improve the procedure for rural artificial surfaces and produce a highly accurate global-scale urban map.

Figure 1 .
Figure 1.The study area that contains 30 provinces in inland China and the location of eight typical cities that are used in Section 5.

Figure 1 .
Figure 1.The study area that contains 30 provinces in inland China and the location of eight typical cities that are used in Section 5.

( 2 )
The second step is filtering extreme NPP-VIIRS DNB values.The original NPP-VIIRS DNB values range from −0.42 to 3329.31 nW•cm −2 •sr −1 ; however, 90% of the values are smaller than 332.93, and more than 98% of the values are smaller than 63.The extremely high values that are probably caused by lights from some special places such as airports or fires need to be removed.Since the nighttime light has a positive correlation with the economic development level, the maximum value of light brightness should not be larger than that in the most developed city.According to China Statistical Yearbook 2013, Beijing is the most economically advanced city.A maximum value (235.13)from Beijing is chosen to correct the DNB values.All pixel values that are greater than 235.13 are replaced by the mean value of their neighbor pixels.A total of 84 pixels is reassigned in the study area.The filtered NPP-VIIRS DNB values range from 0 to 235.13 nW•cm −2 •sr −1 .Remote Sens. 2017, 9, 862 6 of 25 discarded.Thus, the non-light area (DN = 0) of the stable DMSP/OLS NLD is extracted and overlays NPP-VIIRS DNB to mask out the background noise.This step not only retains the stable night light, but also removes many non-urban areas, thus reducing the data process load.(2) The second step is filtering extreme NPP-VIIRS DNB values.The original NPP-VIIRS DNB values range from −0.42 to 3329.31 nW•cm −2 •sr −1 ; however, 90% of the values are smaller than 332.93, and more than 98% of the values are smaller than 63.The extremely high values that are probably caused by lights from some special places such as airports or fires need to be removed.Since the nighttime light has a positive correlation with the economic development level, the maximum value of light brightness should not be larger than that in the most developed city.According to China Statistical Yearbook 2013, Beijing is the most economically advanced city.A maximum value (235.13)from Beijing is chosen to correct the DNB values.All pixel values that are greater than 235.13 are replaced by the mean value of their neighbor pixels.A total of 84 pixels is reassigned in the study area.The filtered NPP-VIIRS DNB values range from 0 to 235.13 nW•cm −2 •sr −1 .

Figure 2 .
Figure 2. Pre-processing procedure of the original nighttime light image composited by the Day/Night Band of the Visible Infrared Imaging Radiometer Suite on the Suomi National Polar-orbiting Partnership Satellite (the NPP-VIIRS DNB data).

Figure 2 .
Figure 2. Pre-processing procedure of the original nighttime light image composited by the Day/Night Band of the Visible Infrared Imaging Radiometer Suite on the Suomi National Polar-orbiting Partnership Satellite (the NPP-VIIRS DNB data).

Figure 3 .
Three key steps are included: (1) sampling training and testing samples from the study area; (2) training classifiers for each scene by the BP neural network-based one-class classification method using the filtered normalized NPP-VIIRS DNB and the maximum NDVI as indicators and predicting the urban extent map of the study area; and (3) assessing the accuracy of the classification results using an independent testing set randomly obtained from the entire study area.Remote Sens. 2017, 9, 862 7 of 25 removed due to the minimum area constraint.The re-classified map is called the GLC30-2010 480 m urban map in this paper.

Figure 3 .
Three key steps are included: (1) sampling training and testing samples from the study area; (2) training classifiers for each scene by the BP neural network-based one-class classification method using the filtered normalized NPP-VIIRS DNB and the maximum NDVI as indicators and predicting the urban extent map of the study area; and (3) assessing the accuracy of the classification results using an independent testing set randomly obtained from the entire study area.

Figure 3 .
Figure 3. Flowchart of extracting urban extent in the study area.

3. 1
. Sampling Two datasets are generated: one (training set) is for the classifier's training, and the other (testing set) is for accuracy assessing of the urban map.They are independent of each other (i.e., testing samples are not included in training sets).

Figure 3 .
Figure 3. Flowchart of extracting urban extent in the study area.
3.1.Sampling Two datasets are generated: one (training set) is for the classifier's training, and the other (testing set) is for accuracy assessing of the urban map.They are independent of each other (i.e., testing samples are not included in training sets).

Figure 4 .
Figure 4. Urban map obtained using the proposed classification scheme.

Figure 4 .
Figure 4. Urban map obtained using the proposed classification scheme.

Figure 5 .
Figure 5. Kappa coefficient distribution at provincial level.The values ranging from 0.73 to 0.99 are divided into six classes by equal intervals.

Figure 6 .
Figure 6.The user's and producer's accuracy of 30 provinces in ascending order of the kappa coefficient.

Figure 5 .
Figure 5. Kappa coefficient distribution at provincial level.The values ranging from 0.73 to 0.99 are divided into six classes by equal intervals.

Figure 5 .
Figure 5. Kappa coefficient distribution at provincial level.The values ranging from 0.73 to 0.99 are divided into six classes by equal intervals.

Figure 6 .
Figure 6.The user's and producer's accuracy of 30 provinces in ascending order of the kappa coefficient.

Figure 6 .
Figure 6.The user's and producer's accuracy of 30 provinces in ascending order of the kappa coefficient.

Figure 7 .
Figure 7.The distribution of 647 cities, represented by the locations of the administrative department.

Figure 7 .
Figure 7.The distribution of 647 cities, represented by the locations of the administrative department.

Figure 8 .
Figure 8. Urban extent of five cities (Beijing, Wuhan, Hohhot, Datong and Suihua) in different productions.The Day/Night Band of the Visible Infrared Imaging Radiometer Suite on the Suomi National Polar-orbiting Partnership Satellite (NPP-VIIRS DNB), the the Defense Meteorological Satellite Program/Operational Linescan System Nighttime Light Data (DMSP/OLS NLD), the Normalized Difference Vegetation Index from the Moderate Resolution Imaging Spectroradiometer products (MODIS NDVI) and the public land cover datasets in MODIS products (MCD12Q1) are collected in 2012.Landsat images are collect between October 2012 and August 2013 for the best display effect.

Figure 8 .
Figure 8. Urban extent of five cities (Beijing, Wuhan, Hohhot, Datong and Suihua) in different productions.The Day/Night Band of the Visible Infrared Imaging Radiometer Suite on the Suomi National Polar-orbiting Partnership Satellite (NPP-VIIRS DNB), the the Defense Meteorological Satellite Program/Operational Linescan System Nighttime Light Data (DMSP/OLS NLD), the Normalized Difference Vegetation Index from the Moderate Resolution Imaging Spectroradiometer products (MODIS NDVI) and the public land cover datasets in MODIS products (MCD12Q1) are collected in 2012.Landsat images are collect between October 2012 and August 2013 for the best display effect.

Figure 9 .
Figure 9. Latitudinal transects of the normalized NPP-VIIRS DNB, the normalized DMSP/OLS NLD and MODIS NDVI along one transect (the black guide line) in Beijing.

Figure 10 .
Figure 10.The distribution frequencies of the NDVI value of four easily confused land cover types: (a) artificial surfaces, (b) cultivated land, (c) water bodies and (d) bare land.The x-axis is the maximum NDVI value in 2012, and the y-axis is the distribution frequency.

Figure 11 .
Figure 11.The distribution frequencies of the NPP-VIIRS DNB value of four easily confused land cover types: (a) artificial surfaces, (b) cultivated land, (c) water bodies and (d) bare land.The x-axis is the original NPP-VIIRS DNB value in 2012, and the y-axis is the distribution frequency.Pixels with DNB values greater than 50 nW•cm −2 •sr −1 are not shown.

Figure 10 .
Figure 10.The distribution frequencies of the NDVI value of four easily confused land cover types: (a) artificial surfaces, (b) cultivated land, (c) water bodies and (d) bare land.The x-axis is the maximum NDVI value in 2012, and the y-axis is the distribution frequency.

Figure 10 .
Figure 10.The distribution frequencies of the NDVI value of four easily confused land cover types: (a) artificial surfaces, (b) cultivated land, (c) water bodies and (d) bare land.The x-axis is the maximum NDVI value in 2012, and the y-axis is the distribution frequency.

Figure 11 .
Figure 11.The distribution frequencies of the NPP-VIIRS DNB value of four easily confused land cover types: (a) artificial surfaces, (b) cultivated land, (c) water bodies and (d) bare land.The x-axis is the original NPP-VIIRS DNB value in 2012, and the y-axis is the distribution frequency.Pixels with DNB values greater than 50 nW•cm −2 •sr −1 are not shown.

Figure 11 .
Figure 11.The distribution frequencies of the NPP-VIIRS DNB value of four easily confused land cover types: (a) artificial surfaces, (b) cultivated land, (c) water bodies and (d) bare land.The x-axis is the original NPP-VIIRS DNB value in 2012, and the y-axis is the distribution frequency.Pixels with DNB values greater than 50 nW•cm −2 •sr −1 are not shown.

Figure 12 .
Figure 12.The distribution frequency of the NPP-VIIRS DNB value and the MODIS NDVI value of four easily-confused land cover types (200 random samples): water bodies, cultivated land, artificial surfaces and bare land.The x-axis is the original NPP-VIIRS DNB value in 2012, and the y-axis is the maximum NDVI in 2012.Points with DNB values greater than 50 nW•cm −2 •sr −1 are not displayed.

Figure 12 .
Figure 12.The distribution frequency of the NPP-VIIRS DNB value and the MODIS NDVI value of four easily-confused land cover types (200 random samples): water bodies, cultivated land, artificial surfaces and bare land.The x-axis is the original NPP-VIIRS DNB value in 2012, and the y-axis is the maximum NDVI in 2012.Points with DNB values greater than 50 nW•cm −2 •sr −1 are not displayed.

Figure 13 .
Figure 13.The distribution frequency of the NPP-VIIRS DNB value and the MODIS NDVI value of two types of artificial surfaces.The x-axis is the original NPP-VIIRS DNB value in 2012, and the y-axis is the maximum NDVI in 2012.Points with DNB values greater than 50 nW•cm −2 •sr −1 are not displayed.

Figure 13 .
Figure 13.The distribution frequency of the NPP-VIIRS DNB value and the MODIS NDVI value of two types of artificial surfaces.The x-axis is the original NPP-VIIRS DNB value in 2012, and the y-axis is the maximum NDVI in 2012.Points with DNB values greater than 50 nW•cm −2 •sr −1 are not displayed.

Figure 14 .
Figure 14.The standard deviations of Present-Unlabeled Learning (PUL) and Large-scale Impervious Surface Index (LISI) in three regions.

Figure 14 .
Figure 14.The standard deviations of Present-Unlabeled Learning (PUL) and Large-scale Impervious Surface Index (LISI) in three regions.

Figure 15 .
Figure 15.(a) The urban pixel count distributions of PUL in three regions; and (b) the urban pixel count distributions of LISI in three regions.The x-axis is the value of the threshold; and the y-axis is the pixel count of the urban area.

Table 1 .
Descriptions of the data used in this study.

Table 2 .
Error matrices and accuracy assessment for urban and non-urban land cover types.
Note:1OA represents the Overall Accuracy.

Table 2 .
Error matrices and accuracy assessment for urban and non-urban land cover types.
Note:1OA represents the Overall Accuracy.

Table 3 .
Error matrices and accuracy assessment for urban and non-urban land cover types.

Table 3 .
Error matrices and accuracy assessment for urban and non-urban land cover types.

Table 4 .
The urban omission rate of five cities (Beijing, Wuhan, Hohhot, Datong and Suihua) in different productions.

Table 4 .
The urban omission rate of five cities (Beijing, Wuhan, Hohhot, Datong and Suihua) in different productions.

Table 5 .
The overall accuracy and kappa coefficient of the three regions under different thresholds.

Table 5 .
The overall accuracy and kappa coefficient of the three regions under different thresholds.