Remote Sensing Estimation of Gross Domestic Product Using Multi-sensor Remote Sensing Data: a Case Study in Zhejiang Province, East China

There exists a spatial mismatch between socioeconomic data, such as Gross Domestic Product (GDP), and physical and environmental datasets. This study provides a dasymetric approach for GDP estimation at a fine scale by combining the Defense Meteorological Satellite Program Operational Linescan System (DMSP/OLS) nighttime imagery, enhanced vegetation index (EVI), and land cover data. Despite the advantages of DMSP/OLS nighttime imagery in estimating human activities, its drawbacks, including coarse resolution, overglow, and saturation effects, limit its application. Hence, high-resolution EVI data were integrated with DMSP/OLS in this study to create a Human Settlement Index (HSI) for estimating the GDP of secondary and tertiary industries. The GDP of the primary industry was then estimated on the basis of land cover data, and the area with the GDP of the primary industry was classified by a threshold technique (DN ≤ 8). The regression model for GDP distribution estimation was implemented in Zhejiang Province in southeast China, and a GDP density map was generated at a resolution of 250 m × 250 m. Compared with the outcome of taking DMSP/OLS as a unique parameter, estimation errors obviously decreased. This study offers a low-cost and accurate approach for rapidly estimating high-resolution GDP distribution to construct an important database for the government when formulating developmental strategies.


Introduction
Understanding the growing interference of human beings with the natural physical Earth system is one of the core issues addressed in global change research.In these studies, socioeconomic data play a significant role in measuring the human activities.However, there always exists a spatial mismatch of socioeconomic data and physical and environmental datasets [1].Socioeconomic data, including Gross Domestic Product (GDP), are usually provided on the basis of administrative units, such as counties, states, or even countries.Although the use of administrative units is convenient for data collection, such unit may be an inappropriate basis on which to integrate socioeconomic data with physical and environmental datasets (e.g., climate, soil, elevation, and land cover data), which are usually available in grid or raster formats.
The spatial mismatch can be a source of error since the socioeconomic data are often assumed to be uniformly distributed across a coarse areal unit and any spatial disparity within the unit ignored [2].The actual distribution of social and economic activities within the boundaries of these spatial units is rather heterogeneous influenced by manifold factors.Therefore, socioeconomic data should be presented in a gridded format at finer spatial scales rather than in the traditionally reported or statistical administrative units.Detailed information on socioeconomic activities, especially its spatial distribution, can serve as an important database of government and other organizations for formulating regional developmental strategies.
In order to solve the spatial mismatch problem, land cover data have been widely used as ancillary data in the spatialization of population, GDP and other socioeconomic attributes [1,[3][4][5][6][7].However, such data suffer from a major limitation, that is, deriving weights for each land cover class to reflect individual classes' density [8,9].With respect to the estimation of economic activities, such as GDP, population distribution data are another typically used ancillary data [10].The use of population distribution data is based on the assumption that population and economic activities are closely correlated.
Remote sensing data in the form of nighttime lights (NTL) provide a consistent, uniform, and independent measure of economic activity from outer space [10][11][12].The Defense Meteorological Satellite Program Operational Linescan System (DMSP/OLS) NTL imagery has been proven to be a strong proxy measure of GDP in earlier studies.For example, Elvidge et al. [13] first reported that the DMSP/OLS lit area was highly correlated with GDP.The relationship between lit area and GDP has been confirmed for a large number of countries by Doll et al. [14] and Elvidge et al. [15].Chen and Nordhaus [16] concluded that luminosity data may be a useful supplement to current economic indicators in countries and regions with missing or low-quality data.On the basis of the correlation between NTL and economic activities, Doll et al. [14] further created the first global disaggregated map of GDP on a 1° × 1° grid considering a country-level relationship between lit areas and GDP in 46 countries.Subsequently, Sutton and Costanza [17] presented an explicit map of global market economic activity with a fine spatial resolution of 1 km 2 by using radiance-calibrated images from 1996 to 1997.However, Doll et al. [14] and Sutton and Costanza [17] did not consider the contribution from the agricultural sector, which is generally distributed in the darker areas of NTL images.At the national and subnational levels, Doll et al. [11] created estimated GDP maps for the United States and 11 countries in the European Union at a 5 km resolution by using linear relationships between total radiance and GDP, rather than between lit area and GDP.They also noted the necessity of considering agricultural activity in the estimation of the total economic activity.More recently, Ghosh et al. [10] generated a spatially disaggregated 1 km 2 map of global estimated total economic activity on the basis of NTL imagery and LandScan population grid.They considered the contribution of the agricultural sector, and the estimated GDP attributed to agriculture was spatially distributed according to the LandScan population grid.A recent study by Wu et al. [18] concluded that agriculture was responsible for almost 25% of total light consumption.Therefore, maps based only on NLT should consider the spatial distribution of agricultural productivity in developing countries.Moreover, the NTL images have been used to examine poverty at the grid and at the subnational and national levels [19][20][21].
Recently, NTL data were also used to model GDP in China.Zhao et al. [22] created Chinese GDP images for 1996 and 2000 on the basis of province-level relationships between sum lights and GDP.Han et al. [23] generated a 1 km grid GDP map of China on DMSP/OLS NTL and land use data.For the primary-industry GDP, the area weightiness method based on land use types is adopted.For the secondary and tertiary industries, a model based on the linear relationship between NTL and GDP is built.
Nevertheless, using the DMSP/OLS NTL data to generate a spatially disaggregated map of GDP still suffers from some drawbacks, for instance, coarse spatial resolution of the sensor, blooming or overglow, and saturation [24].Blooming or overglow effects, such as the phenomenon in DMSP/OLS nighttime imagery where urban peripheries and water-bodies inside cities are brightened by the urban lights, can result in the overestimation of lit areas, and then increase the error of distribution estimate [22].Different methods, such as threshold techniques, have been employed by previous researchers to overcome the overglow effects and to increase the accuracy of the GDP estimation based on DMSP/OLS NTL images [22,25,26].Moreover, the saturation problem is also severe when GDP-like measures are derived by using NTL, which can result in the poorer correlation using nighttime light for centrally-planned economies [14,24,27].
To decrease the information loss and reduce the influence of saturation and overglow effects, Lu et al. [28] developed an integrated approach for mapping fractional settlements by combining DMSP/OLS and the normalized difference vegetation index (NDVI).They generated a human settlement index (HSI) based on the strong negative relationship between vegetation indices and built-up surfaces [29].Pixels with nighttime light of high intensity are generally located in built-up areas.Therefore, the combined use of these two strongly negatively correlated data sets can reveal more details close to reality.Recent studies by Yang et al. [8] used an elevation-adjusted HSI to estimate the population density in Zhejiang Province.An important question is whether the HSI image can be used to improve measures of economic activity such as GDP at provincial levels.
This study explores the use of multi-sensor remote-sensing data, including DMSP/OLS NTL imagery and EVI, as information sources to estimate the distribution of non-agricultural GDP in Zhejiang Province.Zhejiang is located in the southeast coast of China.In this area, the industry and service sectors comprise more than 90% of the economy.Land cover data of the study area are employed in estimating the agricultural GDP.Finally, a spatially disaggregated GDP map with a high resolution of 250 m is presented for Zhejiang Province.This dataset offers significant analytical flexibility and enables local governments and analysts to implement appropriate measures and policies.

Study Area
Zhejiang Province is located in the southeastern coast of China.The province faces the East China Sea to the east, and has a long and winding coastline that extends 6484 kilometers (Figure 1).The total land area of Zhejiang is 101,800 km 2 with a complex topography, 70.4% of which is covered by hills and mountains.Plains and basins only account for 23.2% of the total land area, and the remaining 6.4% comprises lakes and rivers.Given its location in Yangtze River Delta, one of most developed areas in China, Zhejiang is one of the most densely populated provinces with a permanent population of 47.48 million by the end of 2009.Based on the statistical data in the same year, the GDP and GDP per capita are 2299 billion CNY and 44,641 CNY, respectively.The secondary and tertiary industries comprise 94.94% of the GDP.

Data Collection and Pre-Processing
(1) NTL data derived from the DMSP/OLS comprise the major dataset in this study.The stable light composite with 1 km resolution is one of the major products of DMSP/OLS data and is provided by the National Geophysical Data Center (NGDC) of the National Oceanic and Atmospheric Administration annually since 1992.These stable data products, which can be downloaded for free from the official website of the NGDC, have solved the problems of ephemeral lights and cloud over, thus, making these data more attractive for use in various domains [30].
where EVI 1 , EVI  5) County-level administrative boundary maps at a scale of 1:50,000 were provided by the Zhejiang Administration of Surveying Mapping and Geoinformation.These data were used as a basic unit for census data.

Fusion of NTL Imagery and Land Cover Data
Ghosh et al. [10] estimated the spatial distribution of agricultural productivity according to the LandScan population grid with a 1-km resolution.The assumption is that rural populations living in darker areas contribute to economic activity (mainly through agricultural activities).However, the population is typically tallied on the basis of residence location, rather than workplace [36].Agricultural activities are closely related to land use and land cover.Therefore, land cover data are suitable ancillary data for the dasymetric mapping of GDP contributed by agriculture.The basic idea in this study is to model agricultural production by using land cover data and to model the non-agricultural production by using HSI image.The fusion of DMSP/OLS NTL imagery and GlobCover 2009 land cover data is initially conducted.
In this study, a threshold technique is used to reduce the blooming effects while eliminating the areas with agricultural productions from the NTL image.Imhoff et al. [25] proposed a threshold of 89% (frequency of detection) to distinguish urban areas and to reduce the effects of overglow.Small et al. [37] noted the spatial correspondence of dim lights (DN < 12) with agriculture and low population land use after comparisons with higher resolution images.The dimly lit areas with DN < 12 almost always contained some indication of anthropogenic land cover.Ma et al. [38] also applied a threshold DN value of 12 on the basis of a previous study and comparisons with other images.Zhao et al. [39] then selected 10 as the DN threshold value.However, there cannot be a unique threshold value for all areas because of different spatial scales and urban forms.
In this study, multi-sensor remote sensing data are adopted to generate a NTL threshold value.Pixels with high EVI values and nonzero DN values were identified as potential blooming areas.First, the mean EVI value (0.35) of artificial surfaces and associated areas (urban areas > 50%) in GlobCover 2009 data was calculated for Zhejiang Province.Then, pixels that were not located in artificial surfaces and had an EVI value > 0.35 and NTL DN value > 0 were identified as potential blooming areas.According to the statistics, the mean NTL DN value for these potential blooming pixels is 8.68 in Zhejiang Province.Therefore, a DN threshold value of 9 in the DMSP/OLS NTL imagery was selected to remove the blooming pixels and to exclude agricultural production.Notably, the areas with NTL DN value ≥ 9 are generally larger than urban areas in GlobCover (figure not shown).
The NTL imagery with DN value ≥ 9 was overlaid in GlobCover 2009 land cover data.The artificial surfaces and associated areas, as well as the adjacent non-urban land cover in GlobCover 2009, were replaced by the NTL DN value.The non-urban land cover categories without NTL were not modified.The land types related to agricultural activities in GlobCover 2009 were then combined into four main agricultural land uses (Table 1), namely, farming, forestry, stockbreeding, and fishery (Figure 2).
The land cover data for the four agricultural sectors in Figure 2 were used to estimate the GDP density of each sector.The statistical GDP data for the four agricultural sectors can be separately collected from the official statistical yearbook at the county level.For a certain county, the pixel-based agricultural GDP for a certain sector was calculated on the basis of the ratio between the GDP of the sector and the number of corresponding land use pixels.

Pixel-Based HSI
On the basis of the strong negative relationship between built-up surfaces and vegetation cover, Lu et al. [28] developed an HSI to map human settlements by combining the DMSP/OLS NTL data and NDVI, in order to reduce the information loss of spatial patterns and overcome the overglow and saturation effects of DMSP/OLS nighttime light data.In the present study, another vegetation index called EVI, which has a high spatial resolution of 250 m, was used to generate the HSI, as expressed in Equation ( 2): OLS nor represents the normalized value (0-1) of the DMSP/OLS DN image.OLS max is the maximum value in the DMSP/OLS image of the study area.The pixels with NTL DN value ≥ 9 in Figure 2 were then turned into HSI, and the land use data remained unchanged (Figure 3).Notably, the 16-bit MODIS EVI data provided more detailed information for pixels with same NTL DN value.Therefore, incorporating vegetation information into the NTL data can reduce the effects of saturation, which significantly influences GDP estimation and provides new insights into mapping economic activities.The correlation between HSI and non-agricultural GDP will be examined, and the relationship will be used to create spatially disaggregated GDP map.

Accuracy Assessment for Non-Agricultural GDP Estimation
Accuracy assessment is an essential procedure in modeling GDP density.However, no independent method exists to of check whether the disaggregation is correct.R 2 is a frequently used indicator to evaluate the performance of a model as the square of the correlation between observed and fitted values.In addition, a Relative Error (RE) is employed as a measure of the model performance.The RE is defined as follows: where GDP is the estimated GDP, and GDP is the actual GDP.
The mean of the absolute value of RE is typically used to assess overall accuracy [40].The mean RE (MRE) was further calculated by using the following formula: where n is the number of counties and cities in Zhejiang Province.Moreover, the 50th percentile of the n |RE i |, named median relative error (MdRE), was also adopted.

Regression Analysis between HSI and Non-Agricultural Productions
The relationship between the GDP of secondary and tertiary industries in Zhejiang Province and the cumulative DN values of the NTL data at a county level was initially examined.Figure 4a shows that the cumulative DN values are closely correlated with the GDP statistics with an R 2 equal to 0.97.This result corresponds to the positive correlation between the sum of lights and GDP, as proven by previous studies [10,11,17,22].The correlation in the present study is higher at the county level than at the province level in China, as reported by Zhao et al. [22], because of the high contribution (>90%) of the industry and service sectors to the economy in Zhejiang Province.The GDP of the primary industry obtained from official data only accounts for 5.06% of the total GDP of Zhejiang in 2009.The scatter plot in Figure 4b shows that the cumulative HSI was also highly correlated with non-agricultural productions at the county level.A slight increase in correlation with an R 2 equal to 0.98 can be observed between these two data points.In addition, the top three cities (Hangzhou, Ningbo, and Wenzhou) with large non-agricultural productions converged better to the regression line after combining the DMSP/OLS and the EVI.

GDP Density Map
After combining NTL, EVI, and high-resolution land use data, a GDP density map of Zhejiang Province for the year 2009 was presented with a resolution of 250 m (Figure 5).The GDP estimation of non-agricultural activities was based on the correlation between the HSI and the non-agricultural GDP at the county level.The estimation of agricultural production was based on land use data.The spatial distribution of the GDP density for Zhejiang Province in 2009 was consistent with that reported in previous studies (e.g., [23]), but more apparent spatial heterogeneity and richer information were revealed because of the higher spatial resolution and significant reduction in saturation effect.Figure 5 shows that most of the GDP contributions were from plains and basin regions with a high urbanization level, especially the Hangzhou-Jiaxing-Huzhou Plain in northern Zhejiang Province.In addition, the Jinhua-Quzhou basin and coastal areas (such as Cixi, Taizhou, Ningbo, and Wenzhou) exhibit large GDP density.The highest GDP density is generally located in urban centers.The raster map offers a straightforward and visible manner of using the information on GDP distribution.The database can serve not only as a significant reference material for the government and organizations when formulating developmental strategies, but also as an input for exposure and risk analysis.

Error Analysis
Accuracy assessment was used to evaluate the accuracy of the regression model built on cumulative HSI and statistical non-agricultural GDP data at the county level.When only the cumulative DN values of NTL was used to estimate the non-agricultural GDP, the MRE and MdRE were 26.58% and 23.91%, respectively.When the regression model was used to estimate the GDP on the basis of HSI, a significant reduction in the MRE (23.65%) and MdRE (20.02%) was attained.
When it comes to a certain cities or counties, significant improvement in |RE| was observed in Hangzhou, Ningbo, and Wenzhou, the largest cities in Zhejiang Province, when HSI was adopted.The results showed that the |RE| of Hangzhou, Ningbo, and Wenzhou decreased from 1.61%, 7.32% and 37.94% to 1.06%, 1.62% and 24.24%, respectively.On the whole, |RE| of 51 cities and counties (68 in total) showed a decrease after the adoption of HSI.
Elevation is considered an important variable in modeling the population distribution of China [41].Therefore, it can be inferred that elevation also serves a critical function in estimating economic activity.However, virtually existing studies have used NTL data to model socioeconomic variables without considering the influence of elevation.In Yang et al. [8] estimation of population distribution using HSI, elevation correction was implemented, and a significant reduction in MRE was observed.However, in this study, the effect of elevation correction on GDP estimation is negligible because most pixels with NTL DN value > 8 are located in low-elevation areas.As Figure 6, the larger measurement errors in excess of 50% were obtained from the estimates of counties with very high elevation or low GDPs.In the cities and counties with complex terrain in Zhejiang Province, the estimated GDPs were generally underestimated, because compared with urban areas, usually fewer lights can be detected in rural areas or underdeveloped counties in China.When a NTL threshold was used in underdevelopment areas with high elevation, some pixels would be lost, resulting in a significant underestimation of GDP.Consequently, cluster analysis to subdivide county-level units into several groups ahead of regression analysis and the adoption of different NTL thresholds can be used in further studies.The non-agricultural GDP in the eastern coastal counties of Zhejiang Province, especially island counties (e.g., Dongtou, Shengsi, Xiangshan, and Ninghai), was also significantly underestimated.In coastal areas, land reclamation usually caused a mismatch between administrative boundaries and satellite images.Moreover, the long and winding coastline and the numerous islands in Zhejiang Province resulted in the discarding of pixels in zonal statistics, thus causing the underestimation of non-agricultural GDP.

Conclusions
Economic output is conventionally reported at the national level or in administrative units.To bridge the gap between socioeconomic and physical and environmental datasets, a dasymetric mapping approach based on multi-sensor remote sensing data was developed to provide disaggregated GDP data.The creation of a gridded GDP map enables the aggregation and analysis of these figures according to whichever spatial units are most convenient for the user.The DMSP/OLS NTL images provide a powerful tool for assessing various aspects of human activities across the globe.Although the DMSP/OLS images have significant advantages (e.g., free availability, global coverage, high temporal resolution, and high correlation between NTL and human activities) in estimating human activities, its drawbacks, such as coarse resolution, overglow, and saturation effects, limit its application effect when using NTL data as a unique measure of GDP density.Estimating socioeconomic data at finer spatial scales than those offered by the DMSP/OLS imagery and gaining new insights into socio-economic patterns are necessary in the economic and policy community [42].
Through a case study of Zhejiang Province in east China, the potentials for spatializing agricultural and non-agricultural productions separately were explored.The GDP was divided into agricultural and non-agricultural parts.Land cover data and the human settlement index (HSI) were used to estimate the spatial distribution of agricultural and non-agricultural production at a fine scale, respectively.A grid of agricultural economic output was created on the basis of GlobCover 2009 land cover data.The GDP of different agricultural sectors (farming, forestry, stockbreeding, and fishery) were distributed equally into pixels, which have the same attribute of each land cover type.Then, a threshold technique for DMSP/OLS nighttime light (NTL) imagery was used to determine the areas with commercial and industrial sectors, and the pixels with NTL DN value > 8 were identified as areas that contribute to non-agricultural production eventually.The HSI developed by Lu et al. [28], which combines DMSP/OLS NTL imagery and enhanced vegetation index (EVI), was used to disaggregate the non-agricultural GDP.The cumulative HSI was highly correlated with non-agricultural GDP at the county level in Zhejiang Province, with an R 2 of 0.98.The mean relative error of estimated non-agricultural GDP by HSI is 23.65% and the fraction of high accuracy predictions (with absolute relative error from 0% to 25%) from the HSI data is 57.35%.Compared with the results from cumulative DMSP/OLS NTL, the mean relative error of estimated non-agricultural GDP by HSI has been reduced in 75% counties.Moreover, the use of higher spatial resolution HSI imagery is expected to provide new insights into socio-economic patterns.Finally, the grids of agricultural and non-agricultural productions were then overlaid to generate the grid of total economic output at a resolution of 250 m for Zhejiang Province.
The incorporation of EVI data with finer resolution (250 m) can reduce estimation errors and reveal more details about GDP distribution.Compared with that in previous studies [6,10,23], the spatial resolution obviously improved when the approach in this study was applied.Given that DMSP/OLS and MODIS EVI data can be freely downloaded globally, the new approach applied in this study offers a low-cost and accurate way for rapidly estimating economic activity.Moreover, the linkage between NTL and non-agricultural GDP was investigated at a finer level than most previous studies.Therefore, the combination of DMSP/OLS and EVI can be considered an effective method to model and estimate GDP spatial distribution in a regional scale.
Overall, despite the use of a simple model, which only requires limited information, a large part of the variation in county-level GDP was sufficiently explained.With the launch of the Visible Infrared Imaging Radiometer Suite (VIIRS) in October 2011, which will have on-board calibration and higher spectral and spatial resolution (742 m), better low-light imaging data would be made available.The recently released NTL imagery from the VIIRS sensor on the Suomi National Polar-orbiting Partnership (NPP) Satellite was used to model China's regional economy in 31 provincial and 393 county units [43].The results show that the NPP-VIIRS data are more strongly capable of modeling regional economy than the DMSP/OLS data, which could help create better GDP distribution maps.Therefore, policy-makers should seriously consider NTL data as an input to understand GDP better at highly disaggregated levels.

Figure 1 .
Figure 1.Location of study area and elevation.

Figure 2 .
Figure 2. The fusion of DMSP/OLS NTL image and GlobCover land cover data for Zhejiang Province.

Figure 3 .
Figure 3.The fusion of Human Settlement Index and GlobCover 2009 land cover data for Zhejiang Province.

Figure 4 .
Figure 4. Scatter plots between the total non-agricultural production at the county level in Zhejiang Province and (a) cumulative DNs of the stable NTL and (b) cumulative HSI.

Figure 6 .
Figure 6.Distribution of non-agricultural GDP modeling error for Zhejiang Province.
[33]32]els of the stable NTL data show brightness values in digital numbers (DNs).The DN values range from 0 to 63 because of 6-bit quantization.With the use of the nearest neighbor algorithm in ArcGIS software, the 2009 stable NTL data (F162009) were reprojected to an Albers Conical Equal Area projection.The bilinear algorithm was used to resample the NTL image into a pixel size of 250 m × 250 m to match the spatial resolution of the moderate-resolution imaging spectroradiometer (MODIS) EVI data.(2)Compared with NDVI, EVI is developed to optimize the vegetation signal with improved sensitivity in high-biomass regions and improved vegetation monitoring through a reduction in atmospheric influences and a decoupling of the canopy background signal[31,32].Therefore, EVI is suitable for Zhejiang Province, which has high biomass.A time series of 23 16-day composite MODIS EVI data (MOD13Q1) for the year 2009, at a resolution of 250 m, was downloaded from the National Aeronautics and Space Administration website[33].MODIS Reprojection Tools were used for image mosaic, format conversion, and reprojection.To remove the effect of cloud contamination and to improve the separation of human settlements from bare soils, multi-temporal EVI images from 2009 were used to generate a new EVI composite (i.e., EVI max ) by employing the maximum algorithm, as expressed in Equation (1):

Table 1 .
Land cover classes before and after combination.