Evaluating Saturation Correction Methods for DMSP/OLS Nighttime Light Data: A Case Study from China's Cities

Remote Sens. 2014, 6 (open access)

Remotely sensed nighttime lights (NTL) datasets derived from the Defense Meteorological Satellite Program's Operational Linescan System (DMSP/OLS) have been identified as a good indicator of the urbanization process and have been widely used to study such demographic and economic variables as population distribution and density, electricity consumption, and carbon emission. However, one issue must be considered in the application of NTL data, i.e., saturation in the bright cores of urban centers. In this study, we evaluate four correction methods in China's cities: the linear regression model and the cubic regression model at the regional level, and the Human Settlement Index (HSI) and the Vegetation Adjusted NTL Urban Index (VANUI) at a pixel level. The results suggest that both correction methods at the regional level improve the correlation between NTL data and socioeconomic variables. However, since the methods can only be used on saturated pixels, the correction effects are limited, as the saturated area in Chinese cities is rather small. HSI and VANUI increase the inter-urban variability within certain cities, especially when their vegetation health and abundance is negatively correlated with NTL. However, the indices may induce bias when applied in a large region with a diverse natural environment and vegetation, and the relatively high sensitivity of HSI to NDVI may limit its application as NTL approaches its maximum. Proper methods for reducing saturation effects should thus vary with different study areas and research purposes.


Introduction
Urbanization is associated with a booming population, socio-economic growth, and land use change [1,2]. The monitoring and measurement of urban dynamics are essential to understanding global urbanization. For more than a decade, nighttime light (NTL) data collected by the Defense Meteorological Satellite Program/Operational Linescan System (DMSP/OLS) have been widely used to study urbanization [3-7]. The OLS has a unique capacity for global mapping of artificial lighting present at the Earth's surface, such as that generated by human settlements, gas flares, fires, and fishing boats [8]. Although NTL does not directly measure human settlements or urban land cover, it is identified as a good indicator of human activity [9-12]. NTL data have been archived since 1992 and comprise all available DMSP/OLS data for each calendar year; each grid cell of the image has a digital number (DN), ranging from 0 to 63, that indicates annual average NTL intensity [13]. In recent years, numerous studies have been conducted on the relationship between NTL data and key socioeconomic variables, such as electricity consumption [14-17], gross domestic product (GDP) [18-23], carbon emission [24-26], economic activity [27-30], and population distribution and density [31-34]. Since statistical data or sufficiently accurate data on these variables are lacking in certain countries and regions, NTL data offer a unique way to study the social economy. Furthermore, with archival data covering more than 20 years since 1992, and with the sensors continuing to record data, NTL data are well suited to time series studies of the urbanization process.
Although DMSP/OLS NTL data consistently demonstrate a strong capacity to evaluate economic distribution over both global and regional scales, the data have an obvious weakness. As the satellite data were initially used to produce nighttime cloud imagery, the sensor was typically operated in a high gain setting to enable the detection of moonlit clouds. However, with six-bit quantization and a limited dynamic range, the recorded data are saturated in the bright cores of urban centers, where the actual nighttime light may be brighter but the DN values are all 63 [35]. The loss of intra-urban variation caused by saturation effects reduces the correlation between the detected nighttime light and economic activities and therefore limits the application of NTL data [36,37]. The saturation issue poses a significant challenge in the application of DMSP/OLS data in the assessment of urbanization processes. The utility of these data for urban applications could be greatly improved if the saturation of NTL data values were corrected or reduced [38]. In an attempt to solve the problem, several global nighttime light products with no saturation have been released (more information on the products can be found on the National Oceanic and Atmospheric Administration website [39]). However, these data are only available for a very limited number of years, which makes it difficult to use them to study a specific year or a long time series. Several methods have therefore been developed to correct or reduce NTL saturation, which can compensate for the limited availability of the released non-saturated data. The correction methods can be divided into two categories: those that can only be applied at the regional scale and those that can be used at a pixel scale.
(1) Regional Scale
Hara et al. [40] proposed a linear correction method for the saturated light. With the total number of pixels arranged in increasing order of DN values in an arbitrary area, a linear regression model is determined by the first pixel with DN = 63. Based on the assumption that the tendency of DN change in the saturated area is similar to that of the non-saturated area, Letu et al. [41] obtained a cubic regression equation from the DN of the non-saturated part of stable light and applied it to correct the saturated part.
(2) Pixel Scale
Letu et al. [42] developed a saturated light correction method at a pixel scale using the 1996-1997 radiance calibration image. They assumed the NTL intensity to be constant in the saturated areas during 1996-1999, derived a linear regression model from a comparison between the non-saturated part of the stable light image in 1999 and the radiance calibration image in 1996-1997, and used the model to correct saturation for the 1999 image. In order to apply the method, NTL must not change during the observation time. Thus, the method has limited applicability for correcting NTL saturation in numerous developing countries, such as India and China, since their NTL may change due to rapid urban expansion [38]. In addition, the application of this method depends on the availability of a radiance calibration image. By setting the gain of the detector significantly lower than its typical operational setting, it would be possible to observe brightness variations within urban centers. Therefore, Ziskin et al. [35] produced an NTL product for 2006 with no saturation by combining a limited set of NTL data obtained at a low gain setting with the operational data acquired at high gain settings. The calibrated data of Ziskin's method are accurate and of high quality. However, the approach is very labor- and cost-intensive. As the data acquired in low gain settings are sparse and only available for a very limited number of years, the method is unlikely to be used to correct the entire historical NTL archive [38,43]. Previous research has shown that vegetation indices and vegetation abundance are strongly and negatively correlated with key urban features. Based on this rationale, Lu et al. [44] proposed the Human Settlement Index (HSI), combining DMSP/OLS and Terra MODIS Normalized Difference Vegetation Index (NDVI) data to enhance urban features in saturated areas, and Zhang et al. [38] developed the Vegetation Adjusted NTL Urban Index (VANUI), which also uses MODIS NDVI and NTL.
In summary, the two correction models at the regional scale are based on the change in DN values of NTL data. The saturation effects are corrected by applying the pattern of DN change observed in the non-saturated area. In contrast, the four methods at a pixel scale all utilize NTL images and other satellite data. In terms of data availability, HSI and VANUI can be used for a long-term series; however, the methods of Letu and Ziskin, in which radiance calibration images are combined with NTL data acquired in low gain settings, are only applicable for a limited number of years.
Similar to the DMSP/OLS NTL data, a new generation of nighttime light images without saturation effects was acquired by the Visible Infrared Imaging Radiometer Suite (VIIRS) carried on the Suomi National Polar-Orbiting Partnership (NPP) satellite [45]. The NPP-VIIRS is configured to collect visible and infrared imagery and radiometric measurements of land, atmosphere, cryosphere, and oceans [46,47]. In the current study, the available NPP-VIIRS global nighttime light images were generated using the VIIRS day/night band data collected on nights with zero moonlight during the following periods: 18-26 April 2012, 11-23 October 2012, and January 2013 [48]. Although the NPP-VIIRS data have not been filtered to remove light detections associated with fires, gas flares, volcanoes, or auroras, and while the background noise has not been subtracted, these data have an obvious advantage over DMSP/OLS: saturation is not an issue with the NPP-VIIRS data since a wider radiometric detection range has been used [49]. The VIIRS day/night band on Suomi NPP has a specified dynamic range of approximately seven orders of magnitude, from $3\times10^{-9}$ W·cm$^{-2}$·sr$^{-1}$ to 0.02 W·cm$^{-2}$·sr$^{-1}$ [50]. Furthermore, the NPP-VIIRS data employ onboard calibration, which is not available for the DMSP/OLS data [51]. A more detailed introduction to NPP-VIIRS is provided on the National Aeronautics and Space Administration website [52].
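As a quick check of the "approximately seven orders of magnitude" figure, the ratio of the two radiance limits quoted above can be evaluated directly. The comparison with DMSP/OLS uses only the six-bit quantization and the 0-63 DN scale stated in the Introduction; it is an illustrative calculation, not a statement about the sensors' radiometric specifications.

```latex
\log_{10}\!\left(\frac{0.02}{3\times10^{-9}}\right)
  = \log_{10}\!\left(6.67\times10^{6}\right) \approx 6.8,
\qquad
2^{6} = 64 \ \text{DN levels (0 to 63) for DMSP/OLS}.
```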
In this paper, we focused on the problem of NTL saturation. Although several kinds of correction methods have been proposed, no assessment has been made of where and under what conditions a particular method should be applied. The purpose of this paper is to determine the best fitting model for reducing saturation effects. A series of statistical correlation analyses was conducted to evaluate the correction methods from current studies. In light of data availability, we applied four feasible methods in mainland China: the linear regression model and the cubic regression model available at the regional level, and HSI and VANUI at a pixel level. The NPP-VIIRS data were used as a means to evaluate the correction results.

Study Area
In 2012, saturation occurred in 135 cities in mainland China (excluding Hong Kong, Macao, and Taiwan). Sixty-four of these were defined in this study as severely saturated cities, i.e., cities whose saturated area was larger than 30 km$^2$ or accounted for more than 1% of the total lit area. To highlight the difference between the original data and the corrected data, 62 severely saturated cities were included in the case study area, as shown in Table 1 (Shanghai and Xiamen were excluded; an explanation for this decision is presented later in this paper).

Data Collection
The linear model and the cubic regression model act on the regions with DN = 63 and leave the other regions unchanged. To increase the difference between the original data and the corrected data, the NTL image from 2012 was used since the amount of light in China has increased rapidly in recent years [20], and the image from 2012 has a larger saturated area than those from other years. Statistical data on population, GDP, built-up area, and electric power consumption in the selected 62 cities were collected to evaluate the correction results.
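HSI and VANUI correction methods can be used at a pixel level. In order to evaluate their correction results, we derived grid maps from various sources: population and GDP distribution in 2003, global radiance calibrated NTL data (with no sensor saturation) in 2003 and 2006, and the NPP-VIIRS image in 2012. Thus, NTL images from 2003, 2006, and 2012 were all used. All data sets used and their detailed descriptions are listed in Table 2.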

Data Processing
As mentioned above, the NPP-VIIRS image is a preliminary product that contains background noise and light sources unassociated with economic activities, such as fires and volcanoes. Shi et al. [49] used DMSP/OLS data to remove this noise and to improve the accuracy and reliability of NPP-VIIRS data. First, the pixels with DN > 0 in the DMSP/OLS image from 2012 were taken as a mask to extract NPP-VIIRS data; two thresholds were then used to correct the outliers in the extracted NPP-VIIRS data, i.e., the pixels with negative DN values were assigned 0, and pixels with values greater than 235.13 were smoothed by their eight neighbors. The upper threshold of 235.13 was derived from the highest DN values in the three most developed cities in China: Beijing, Shanghai, and Guangzhou. Unlike Shi et al. [49], we took the mean of the maximum DN values of the three cities as the upper threshold, since their levels of development should be similar and the mean could better reduce data bias. All the grid maps were reprojected to a Lambert azimuthal equal area projection with a spatial resolution of 1000 m.
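The masking and thresholding steps above can be sketched as follows. This is a minimal illustration on small nested lists, not the processing chain actually used: the function name `clean_viirs`, the use of `None` for masked-out pixels, and the choice of the mean of the valid eight neighbors as the "smoothing" rule are all assumptions, since the text does not specify them. Real rasters would normally be handled with an array or GIS library.

```python
def clean_viirs(viirs, dmsp, upper):
    """Mask NPP-VIIRS values with DMSP/OLS DN > 0, then fix outliers.

    viirs, dmsp: equally sized 2-D lists (rows of pixel values).
    upper: upper threshold (235.13 in the text).
    Masked-out pixels are returned as None (an assumption of this sketch).
    """
    rows, cols = len(viirs), len(viirs[0])
    # Keep pixels that are lit in DMSP/OLS; negative values become 0.
    masked = [
        [max(viirs[r][c], 0.0) if dmsp[r][c] > 0 else None for c in range(cols)]
        for r in range(rows)
    ]
    cleaned = [row[:] for row in masked]
    for r in range(rows):
        for c in range(cols):
            value = masked[r][c]
            if value is None or value <= upper:
                continue
            # Assumed smoothing rule: mean of the valid neighbors (up to eight)
            # that are themselves at or below the threshold.
            neighbours = [
                masked[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if (rr, cc) != (r, c)
                and masked[rr][cc] is not None
                and masked[rr][cc] <= upper
            ]
            cleaned[r][c] = sum(neighbours) / len(neighbours) if neighbours else upper
    return cleaned


# Hypothetical 3 x 3 example.
viirs = [[5.0, -1.0, 3.0], [10.0, 500.0, 20.0], [2.0, 8.0, 4.0]]
dmsp = [[10, 5, 0], [20, 63, 30], [0, 12, 6]]
cleaned = clean_viirs(viirs, dmsp, upper=235.13)
```

In this toy grid the negative value is set to 0, the two pixels with DMSP/OLS DN = 0 are masked out, and the central outlier is replaced by the mean of its six remaining neighbors.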

Implementing Correction Methods
Ziskin's method [35] and Letu's method [42], combining radiance calibration images and NTL data acquired in low gain settings, were not considered in this study because of their limited applicability in correcting NTL saturation. The other four methods discussed above were implemented and evaluated in this article. The regression models available only at the regional scale were applied to data from 2012 within administrative units. Additionally, the pixel-based correction methods were applied to data from 2003, 2006, and 2012.
(1) Regional Scale
The general idea of Hara's linear regression model for correcting saturation is presented in Figure 1 [40]. The X-axis shows the accumulated number of pixels ($PN_X$), while the Y-axis shows the arbitrary DN value ($DN_Y$).
In Equations (1)-(3), A and T represent the lower and upper limits of the number of pixels in the saturated area, respectively, $T_{DN}$ refers to the total value of calibrated DN, and $N_{DN}$ is the sum of DN in the non-saturated area. Similar to Hara's method, the cubic regression model (Figure 2) also relies on the tendency of DN change in non-saturated areas to correct saturation effects.
In the cubic regression equation, a, b, c, and d are the coefficients obtained by solving a system of four simultaneous equations based on the least-squares method [41]. The other variables are defined as in Equations (1)-(3). The two correction methods were applied in the 62 cities, and the accuracy was verified by using statistical data pertaining to population, GDP, built-up area, and electric power consumption.
To reduce the effects of dim lighting in NTL images, pixels with DN < 12 were not considered in the analysis [53]. Further, pixels with negative values were removed from the $NDVI_{MAX}$ and $NDVI_{MEAN}$ images to exclude water bodies. In general, a higher level of urbanization should result in a lower NDVI and a higher NTL, thus generating higher values in HSI and VANUI. However, the coarse spatial resolution of NTL and NDVI, as well as data bias, could cause outliers in the HSI and VANUI images. Therefore, a threshold method similar to that used to correct the NPP-VIIRS image was applied to remove noise; i.e., the average of the highest HSI and VANUI values in Beijing, Shanghai, and Guangzhou was used as the upper limit, and each pixel with a value above the upper limit was given a new value from its eight neighbors.

Saturation Correction at Regional Scale
The curves of NTL DN change in most cities present a concave shape, such as those of Beijing (Figure 3a) and Xian (Figure 3b), whereas cities like Suzhou (Figure 3c) and Zhuhai (Figure 3d) have convex shaped curves. These two types of curves fit the linear regression model and the cubic regression model. However, the cubic regression models in Shanghai (Figure 3e) and Xiamen (Figure 3f) imply an incorrect decrease in the curves of NTL DN change. Therefore, the cubic regression model cannot be applied in Shanghai or Xiamen, and these two cities were excluded from analysis.
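The curve fitting and the monotonicity problem described above can be sketched as follows. This is a hedged illustration, not the procedure of Hara et al. [40] or Letu et al. [41]: it assumes that the polynomial is fitted by least squares to the whole non-saturated part of the sorted DN curve and then extrapolated over the saturated pixels, and it scales the pixel rank to [0, 1] for numerical stability. The names `fit_polynomial` and `correct_saturated`, and the sample city, are hypothetical.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients in ascending order of power. For degree 3 this
    amounts to solving four simultaneous equations for four coefficients.
    """
    n = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - rest) / a[i][i]
    return coeffs


def correct_saturated(sorted_dn, degree, saturation=63):
    """Extrapolate the non-saturated DN trend over the saturated pixels.

    sorted_dn: DN values of one city in increasing order.
    Returns (corrected values, True if the extrapolated part never decreases).
    """
    n = len(sorted_dn)
    xs = [i / (n - 1) for i in range(n)]  # scaled accumulated pixel number
    first_sat = next((i for i, dn in enumerate(sorted_dn) if dn >= saturation), n)
    coeffs = fit_polynomial(xs[:first_sat], sorted_dn[:first_sat], degree)
    tail = [sum(c * x ** k for k, c in enumerate(coeffs)) for x in xs[first_sat:]]
    corrected = list(sorted_dn[:first_sat]) + tail
    monotonic = all(later >= earlier for earlier, later in zip(tail, tail[1:]))
    return corrected, monotonic


# Hypothetical city whose true brightness rises linearly but is clipped at 63.
sorted_dn = [min(63, 12 + 0.5 * i) for i in range(120)]
linear, linear_ok = correct_saturated(sorted_dn, degree=1)
cubic, cubic_ok = correct_saturated(sorted_dn, degree=3)
```

A `False` value in the second return element flags a fitted curve that decreases over the saturated range, which is the behavior reported above for the cubic model in Shanghai and Xiamen.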

The Comparison between the Linear Correction Model and the Cubic Correction Model
To evaluate the correction results of the linear correction model and the cubic correction model, the cumulative DN of the original NTL data and of the corrected data was calculated for the 62 cities, and Pearson correlations were then computed with four socioeconomic variables: urban construction area (km$^2$), GDP ($10^9$ RMB), population ($10^7$), and electricity consumption ($10^{11}$ kilowatt hours). Because of the lack of data, only 47 cities were included in the correlation with electricity consumption.
As shown in Table 3, both the original cumulative DN and the corrected cumulative DN are correlated with the socioeconomic variables (significant at the 0.01 level). Linear regression was then applied between cumulative DN and the socioeconomic variables, as shown in Figure 5. The results of the Pearson correlation and the linear regression both imply that the NTL data are good indicators for socioeconomic activity. On the other hand, the results of the Pearson correlation and linear regression indicate that the two correction methods did not obviously improve the correlation between NTL and economic activities. This may be because the correction methods were only performed on the pixels with original DN = 63, while China is a developing country with a small saturated area. To test this hypothesis, we further analyzed the relationship between the change of NTL data after correction and the area of saturation in the 62 cities. Figure 6 shows that the correction effects tend to decrease with decreasing saturated area.
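The city-level comparison described above can be sketched as follows: sum the DN of each city, then compute the Pearson coefficient between the cumulative DN and a socioeconomic variable. The city names, pixel values, and GDP figures are made up for illustration and are not the data of this study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical pixel DN values per city and hypothetical GDP values.
city_pixels = {"A": [20, 45, 63, 63], "B": [15, 30, 40], "C": [12, 18]}
gdp = {"A": 950.0, "B": 400.0, "C": 120.0}

cities = sorted(city_pixels)
cumulative_dn = [sum(city_pixels[c]) for c in cities]
r = pearson(cumulative_dn, [gdp[c] for c in cities])
```

Running the same calculation on the original and on the corrected DN values gives the pair of coefficients that Table 3 compares for each variable.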

The Comparison between HSI and VANUI
Two assessments were used to evaluate the ability of HSI and VANUI to reduce saturation. First, at a pixel level, the original NTL data and the corrected HSI and VANUI data in the regions of mainland China with DMSP/OLS DN ≥ 12 and NDVI ≥ 0 were correlated (Pearson) with grid maps of population and GDP in 2003, radiance calibrated NTL data in 2003 and 2006, and NPP-VIIRS data in 2012. The results (significant at the 0.01 level) in Table 4 demonstrate that the correlation coefficients of the original NTL data are larger than those of HSI and smaller than those of VANUI, which indicates that, at the national scale, VANUI strengthens the correlation whereas HSI weakens it. Second, the Pearson correlation coefficients in each of the 62 cities were calculated to analyze the correction results at the city level. To represent how well the correction methods reduced NTL saturation, the average (AVG) and standard deviation (STDEV) of the Pearson correlation coefficients of the original data, HSI, and VANUI in the 62 cities were calculated. Table 5 shows that, except for the average of HSI with radiance calibrated data in 2003 and 2006, most average values of HSI and VANUI are higher than those of the original data. Additionally, the correction results of VANUI are more stable than those of HSI, since the standard deviations of VANUI are smaller. To compare these methods with the linear and cubic regression models evaluated earlier, the cumulative DN of HSI and VANUI was calculated in the 62 cities. The correlation results in Table 6 (significant at the 0.01 level) show that HSI and VANUI do not have higher coefficients than the original data for some variables. HSI only improves the correlation coefficients of electricity consumption, and the coefficients of VANUI are higher than those of the original NTL data, with the exception of population.
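The AVG and STDEV summary used in Table 5 can be sketched as follows. The per-city coefficients below are invented for illustration, and the use of the sample standard deviation is an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev


def summarize(coefficients):
    """Return (average, sample standard deviation) of per-city coefficients."""
    return mean(coefficients), stdev(coefficients)


# Hypothetical per-city Pearson coefficients for three cities.
per_city = {
    "original": [0.55, 0.60, 0.65],
    "HSI": [0.40, 0.70, 0.85],
    "VANUI": [0.62, 0.66, 0.70],
}
summary = {name: summarize(values) for name, values in per_city.items()}
```

In this toy case both indices raise the average coefficient, but the VANUI values are more tightly clustered than the HSI values, which is the sense in which a smaller STDEV is read as a more stable correction.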

Discussion
Since the HSI and VANUI did not perform well in some cases, we further analyzed the factors that might contribute to errors. First, we examined the major assumption of the two indices that key urban features and vegetation are inversely correlated. In a city where vegetation health and abundance can be considered constant, the indices may be helpful in capturing details in bright urban cores. However, in a city of diverse natural environments and vegetation, the inconstant environmental background could influence the accuracy of HSI and VANUI. To analyze the relationship between NTL data and NDVI, we calculated the mean of the annual maximum NDVI for each NTL DN in the 62 cities (we treat the annual maximum NDVI and the annual average NDVI as equivalent in this analysis, since the correlation coefficient between them is larger than 0.9 at a significance level of 0.01). The results show that the assumption is valid in most eastern and central cities, such as Tianjin in 2012 (Figure 7a), but invalid in some western cities, like Wuzhong in 2012 (Figure 7b). A further comparison between these two kinds of cities reveals that most eastern and central cities are located in vegetated regions, where the change of NDVI in urban areas is very different from that in non-urbanized areas; however, the western cities include some desert land, which makes the change of NDVI caused by urbanization difficult to recognize. Next, we explored whether the correlation between NTL DN and NDVI would influence the correction results of HSI and VANUI. Two ratios ($P_H^{+}$ and $P_V^{+}$) were used to measure the correction results.
$$P_H^{+} = \frac{P_H}{P_O} \qquad (9)$$

$$P_V^{+} = \frac{P_V}{P_O} \qquad (10)$$

where $P_O$, $P_H$, and $P_V$ are the Pearson correlation coefficients of the original data and the corrected data of HSI and VANUI, respectively. A ratio higher than 1.0 indicates that saturation was reduced. Therefore, the 62 cities were divided into two groups: the saturation-corrected (SC) group with ratio > 1.0 and the saturation-remaining (SR) group with ratio ≤ 1.0. Table 7 shows the average of the correlation coefficients between NTL and NDVI for each group and implies that cities whose saturation is corrected by HSI and VANUI tend to have higher correlation coefficients. Figure 8 shows an example from 2012. The number of cities in Group SC (on the right side of the $P_H^{+} = 1$ and $P_V^{+} = 1$ lines) is larger than that of Group SR (on the left side of the $P_H^{+} = 1$ and $P_V^{+} = 1$ lines). Additionally, most cities in Group SR have smaller correlation coefficients ($R^2$) than those in Group SC. Because HSI and VANUI combine NTL DN and NDVI, they are affected by both factors. We calculated HSI and VANUI with an interval of 10 DN (e.g., 13, 23, 33, etc.) and 0.1 NDVI (e.g., 0, 0.1, 0.2, etc.) to show the variation trend. As illustrated in Figure 9, NDVI has a larger impact on HSI and VANUI when NTL DN is higher. When NTL DN is constant, HSI decreases sharply with NDVI, and VANUI decreases steadily. Therefore, in urban areas, which have relatively high NTL DN and low NDVI, HSI is more sensitive to NDVI or NTL DN than VANUI. Also, HSI is undefined when NTL is 63 and NDVI is 0, because the denominator of Equation (6) then equals zero.
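The sensitivity pattern described above can be reproduced with a small grid calculation. This is a sketch under one loudly stated assumption: the normalized NTL is taken as DN/63, which the text does not specify. The HSI and VANUI expressions follow Equations (6) and (8); the function names are hypothetical.

```python
def hsi(ntl_dn, ndvi_max, dn_max=63):
    """HSI per Equation (6); returns None where the denominator is zero."""
    ntl_n = ntl_dn / dn_max  # assumed normalization: DN divided by 63
    denominator = (1 - ntl_n) + ndvi_max + ndvi_max * ntl_n
    if denominator == 0:
        return None
    return ((1 - ndvi_max) + ntl_n) / denominator


def vanui(ntl_dn, ndvi_mean):
    """VANUI per Equation (8), using the DN value directly."""
    return (1 - ndvi_mean) * ntl_dn


# Tabulate both indices for a few DN and NDVI steps.
for dn in (13, 33, 63):
    for ndvi in (0.0, 0.1, 0.2):
        print(dn, ndvi, hsi(dn, ndvi), vanui(dn, ndvi))
```

At DN = 63 the HSI value falls from about 9.5 to 4.5 when NDVI rises from 0.1 to 0.2 and is undefined at NDVI = 0, whereas VANUI changes linearly with NDVI.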

Conclusion
The correction methods at the regional level, i.e., the linear regression model and the cubic regression model, are established based on the tendency of DN change in unsaturated areas; therefore, the methods can be used to correct historical NTL without additional data. We applied the two methods in 62 cities in China. The results demonstrate that both methods reduce saturation effects and improve the correlation between NTL data and socioeconomic variables. However, since the methods only act on saturated areas, which are small in China, the correction effects are limited. The methods may perform better in developed countries, where more pixels have DN = 63.
Based on the principle that vegetation and urban features are inversely correlated, the correction methods at a pixel level, HSI and VANUI, combine NTL with contemporaneous NDVI to correct saturation effects in bright urban cores. Our analysis shows that the correction results vary with study scale and study area. At the country level, VANUI is useful in reducing saturation. In contrast, HSI reduces the correlation between NTL data and economic variables. At the city level, the correction results of HSI are much improved, and the correction effects of VANUI are more stable than those of HSI. We then analyzed two factors that may impact the correction results of HSI and VANUI. First, NDVI is helpful in increasing inter-urban variability within certain cities, where the vegetation health and abundance can be considered constant. However, in a large region with diverse natural environments and vegetation, the HSI and VANUI may result in errors. Second, HSI is sensitive to changes in NDVI when NTL DN is high, which is usually the case in urban areas, so this may limit the application of HSI. Therefore, the correction method needs to be selected according to the study area and the purpose of research.

Figure 1. The concept of the linear regression model.

Figure 2. The concept of the cubic regression model.

(2) Pixel Scale
The HSI method and the VANUI method both use contemporaneous NDVI to correct DN saturation based on the principle that key urban features are inversely correlated with vegetation health and abundance.

$$NDVI_{MAX} = \mathrm{MAX}\,[NDVI_1, NDVI_2, \ldots, NDVI_m]$$

$$HSI = \frac{(1 - NDVI_{MAX}) + NTL_N}{(1 - NTL_N) + NDVI_{MAX} + NDVI_{MAX} \times NTL_N} \qquad (6)$$

$$NDVI_{MEAN} = \mathrm{AVERAGE}\,[NDVI_1, NDVI_2, \ldots, NDVI_m] \qquad (7)$$

$$VANUI = (1 - NDVI_{MEAN}) \times NTL \qquad (8)$$

where $NDVI_{MAX}$ is the annual maximum NDVI, $NDVI_1, NDVI_2, \ldots, NDVI_m$ are the multitemporal NDVI images, $NTL_N$ represents the normalized value of DN, $NDVI_{MEAN}$ is the average value of annual NDVI, and $NTL$ is the DMSP/OLS stable light data. The methods were performed for 2003, 2006, and 2012.
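The per-pixel calculation of the two indices, including the DN and NDVI screening described in the paper, can be sketched as follows. The function name `indices_for_pixel` is hypothetical, and the normalization of DN as DN/63 is an assumption of this sketch, since the normalization formula is not given in the text.

```python
def indices_for_pixel(dn, ndvi_series, dn_max=63, min_dn=12):
    """Return (HSI, VANUI) for one pixel, or None if the pixel is screened out.

    dn: DMSP/OLS stable light DN of the pixel.
    ndvi_series: multitemporal NDVI values of the pixel for the same year.
    """
    ndvi_max = max(ndvi_series)                       # annual maximum NDVI
    ndvi_mean = sum(ndvi_series) / len(ndvi_series)   # annual mean NDVI, Equation (7)
    # Screening: dim pixels (DN < 12) and negative NDVI (water) are excluded.
    if dn < min_dn or ndvi_max < 0 or ndvi_mean < 0:
        return None
    ntl_n = dn / dn_max  # assumed normalization of DN to [0, 1]
    denominator = (1 - ntl_n) + ndvi_max + ndvi_max * ntl_n
    hsi = ((1 - ndvi_max) + ntl_n) / denominator if denominator else None  # Equation (6)
    vanui = (1 - ndvi_mean) * dn                                            # Equation (8)
    return hsi, vanui


# Hypothetical pixel: DN = 40 with three NDVI observations in the year.
result = indices_for_pixel(40, [0.2, 0.4, 0.6])
```

Pixels that fail the screening return `None`, so they can be dropped before any correlation is computed.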

Figure 4. The spatial pattern of DMSP/OLS data, HSI, and VANUI in part of the Pearl River Delta in 2003, 2006, and 2012.

Figure 5. Linear relationship between socioeconomic variables and the cumulative DN of original data and corrected data of linear and cubic models.

Figure 6. The saturation effect and correction effect in the 62 cities.

Figure 7. Linear relationship between NTL and NDVI in (a) Tianjin and (b) Wuzhong in 2012.

Figure 8. Scatterplots of correlation coefficients between NTL DN and NDVI and correction effect of HSI and VANUI in 2012.

Figure 9. The change of HSI and VANUI with NDVI and NTL DN.

Table 1. Sixty-two severely saturated cities in mainland China.

2003, global radiance calibrated NTL data (with no sensor saturation) in 2003 and 2006, and the NPP-VIIRS image in 2012. Thus, NTL images from 2003, 2006, and 2012 were all used. All data sets used and their detailed descriptions are listed in Table 2.

Table 2. Data sets used in research.

Table 3. Pearson correlation coefficients between cumulative digital number (DN) and socioeconomic variables.

Table 4. Pearson correlation coefficients at a pixel level.

Table 5. The average and standard deviation of the Pearson correlation coefficients in 62 cities at a pixel level.

Table 6. Pearson correlation coefficients at city level.

Table 7. Average correlation coefficients between NTL and NDVI of Group SC and Group SR.