Estimating Rural Electric Power Consumption Using NPP-VIIRS Night-Time Light, Toponym and POI Data in Ethnic Minority Areas of China

Most studies that estimate electric power consumption (EPC) from night-time light (NTL) data focus on large areas; this paper therefore proposes a method for estimating EPC in rural areas. Rural electric power consumption (REPC) is a key indicator of national socio-economic development. Despite an improved quality of life in rural areas, there is still a large gap in electricity consumption between rural and urban residents in China. The experiment takes REPC as the research target, selects Dehong (DH) Dai Jingpo Autonomous Prefecture of Yunnan Province as an example, and uses NTL data from the Visible Infrared Imaging Radiometer Suite (VIIRS) Day–Night Band (DNB) carried by the Suomi National Polar-orbiting Partnership (NPP) satellite from 2012 to 2017, together with toponym and points-of-interest (POI) data, as the main data sources. Kernel density estimation is performed to extract the urban center and rural area boundaries in the prefecture; combined with county-level boundary data and electric power data, a linear regression model relating the total rural NTL intensity to REPC is then fitted. Finally, according to the model, the EPC in ethnic minority rural areas is estimated at a 1-km spatial resolution. The results show that the NPP-REPC model can simulate REPC with a small average error (17.8%). Additionally, there are distinct spatial differences in REPC in ethnic minority areas.


Introduction
The development of electric power has been an important indicator of socio-economic development. Rapid development of a national economy is often accompanied by strong demand for electric power. Since 2002, China has formed a unified national power management system that combines rural power systems with urban power systems [1]. With the rapid development of the Chinese economy and the steady progress of society, the quality of life of urban and rural residents has improved. In rural areas in particular, the electric power consumption structure has undergone profound changes, and the per capita electricity consumption of rural residents has increased. Despite such progress, there is still a large gap compared with the level of urban residents. Therefore, it is important at this stage to promote coordinated and sustainable electric power development in the urban and rural areas of China.

Data Sources
Four types of data are used in this study (Table 1): 1) the NPP-VIIRS NTL composite data; 2) the POI data and toponym data; 3) the EPC statistical data; and 4) administrative boundary data and digital elevation model (DEM) data. The geographic coordinate system of all data is the WGS84 coordinate system, and the projected coordinate system is the UTM 47N projection based on WGS84.
The remotely sensed NPP-VIIRS NTL data used in this study are derived from the annual average NPP-VIIRS global cloud-free synthetic products from 2012 to 2017. We mainly downloaded annual composites and monthly composites. The annual composites of 2012, 2013, 2014 and 2017 were obtained as weighted averages of the monthly composites. The data were downloaded from the National Geophysical Data Center (NGDC) of the National Oceanic and Atmospheric Administration (NOAA) (https://www.ngdc.noaa.gov/eog/viirs/download_dnb_composites.html).
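As an illustration of the weighted averaging of monthly composites, the following minimal sketch computes a per-pixel annual composite. The text does not state which weights were used; the sketch assumes the weights are the monthly counts of cloud-free observations, and the function name `annual_composite` is hypothetical.

```python
import numpy as np

def annual_composite(monthly_radiance, monthly_obs_counts):
    """Per-pixel weighted average of monthly composites.

    monthly_radiance: array of shape (months, rows, cols).
    monthly_obs_counts: array of the same shape; ASSUMED here to be the
    number of cloud-free observations behind each monthly pixel value.
    """
    rad = np.asarray(monthly_radiance, dtype=float)
    w = np.asarray(monthly_obs_counts, dtype=float)
    total_w = w.sum(axis=0)
    weighted_sum = (rad * w).sum(axis=0)
    # Pixels without any valid observation in the year are left at 0.
    return np.divide(weighted_sum, total_w,
                     out=np.zeros_like(weighted_sum), where=total_w > 0)
```

A month with more cloud-free observations therefore contributes more to the annual value than a month with few.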
The POI data used in this paper were collected from the BigeMap (http://www.bigemap.com/). We found 7641 POIs in Mangshi city, 4974 in Ruili city, 1898 in Lianghe County, 2946 in Longchuan County, 4835 in Yingjiang County for a total of 22,294 POIs. These POIs mainly included administrative, real estate, shopping, education and training, hotel, company, transportation facility, life service and natural feature landmarks. The toponym data were collected from the National Database for Geographical Names of China (http://dmfw.mca.gov.cn/), with a total of 9580 items, mainly including urban residential areas, rural residential areas, village committees, agricultural, forestry and pasture sites, industrial areas, public institutions, party and government organizations, etc.
The county-level EPC statistical data for the Dehong Prefecture used in this study include county-level REPC and industrial EPC data (unit: 10⁴ kW·h) for 2012 to 2017 (statistical data for 2016 are lacking), which were obtained from the "DH Prefecture Yearbook" provided by China's Economic and Social Big Data Research Platform (http://data.cnki.net/). It should be noted that the statistical yearbook does not specify the EPC of urban residents.
The county-level administrative boundaries of Dehong Prefecture used in this paper were provided by the National Geomatics Center of China. The DEM of Dehong Prefecture was obtained from the Geospatial Data Cloud (http://www.gscloud.cn/) and was clipped, projected and resampled for slope analysis.

Methods
The remotely sensed NPP-VIIRS NTL data from 2012 to 2017 are used to estimate the REPC of the DH Prefecture in four main steps. First, the NPP-VIIRS NTL data are preprocessed to obtain night-time stable light (NSL). Second, kernel density estimation is performed on the POI and toponym data, and the study area is divided into urban centers, rural areas and natural surfaces by combining the kernel density results with the NSL data. Third, within the boundaries of the rural areas, a regression analysis of electricity consumption against the NSL from NPP-VIIRS is performed. Finally, the spatial estimation of EPC in the DH Prefecture is conducted according to the regression model (Figure 2).

Calibration of the NPP-VIIRS Data
The data are processed as follows. First, the map projection of the NPP-VIIRS NTL data is converted to the UTM 47N projection based on WGS84, the same as for the other data mentioned above. This reprojection is needed because the NTL image is affected by many factors, including deformation caused by the remote sensor itself, external factors and the processing procedure. Then, the NTL data are radiometrically corrected. The NPP-VIIRS NTL data have not been filtered for background noise (fires, gas combustion, volcanoes and auroras), and the NTL image contains some surface scattering values (dim light areas caused by removing the reflection of moonlight). We therefore use the annual NPP-VIIRS NTL composites and remove background noise and surface scattering values from them after geometric calibration. For the radiometric calibration, we refer to the algorithm of Elvidge and other researchers on night light data [38]. In the first step, the average radiance value of the cloud in the low-reflectivity area of the sea surface is selected as the calibration value for removing scattered light, and this calibration value is subtracted from the whole image to remove cloud scattering. Then, using the method of adjacent image difference, we set a threshold value to obtain a stable surface area, use this area as a mask, carry out a statistical analysis of the radiance values in the mask area, and take three times the average value as the confidence interval for removing surface scattered light. Finally, the effective light data from NPP-VIIRS are extracted to obtain the NSL data. After preprocessing, the NTL data are clipped using the county administrative division boundary of the DH Prefecture as the mask. Lastly, cubic convolution resampling is used to obtain a grid size of 1000 m.
The calibrated NSL data in the DH Prefecture in 2017 are shown in Figure 3.
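The two radiometric steps described above can be sketched roughly as follows. This is only one possible reading of the procedure, not the authors' implementation: it assumes that the stable surface mask marks unlit background pixels and that every pixel at or below three times the mean radiance of that mask is treated as surface-scattered light. The function name `remove_background` and all parameter names are hypothetical.

```python
import numpy as np

def remove_background(radiance, calibration_value, stable_mask, k=3.0):
    """Sketch of scattered-light removal under the stated assumptions."""
    rad = np.asarray(radiance, dtype=float)
    # Step 1: subtract the scattered-light calibration value from the
    # whole image; negative results are clipped to 0.
    corrected = np.clip(rad - calibration_value, 0.0, None)
    mask = np.asarray(stable_mask, dtype=bool)
    if not mask.any():
        return corrected
    # Step 2 (ASSUMED interpretation): k times the mean radiance of the
    # stable surface area is the cutoff for surface-scattered light.
    threshold = k * corrected[mask].mean()
    corrected[corrected <= threshold] = 0.0
    return corrected
```

Reprojection, clipping to the county boundaries and cubic convolution resampling to 1000 m would normally be done in a GIS or raster library and are not shown here.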

Extraction and Accuracy Assessment of Urban and Rural Areas
In this study, considering the complementary characteristics of POI and toponym data, the two types of data are merged into a POI and toponym data set (P&T data set) after the redundant data have been deleted by spatial data matching, and then the P&T data set is used for kernel density estimation.
Kernel density estimation (KDE) calculates the unit density of the measured values of point and line elements in a specified neighborhood, and can directly reflect the distribution of discrete measured values in a continuous area. The principle of kernel density estimation is to assign weights according to the distance between a data point and a center point, with the weight increasing as the distance from the center point decreases. Finally, the weighted average density of all data points in the study area is obtained [39]. In the kernel density equation (Equation (1)), $P_i$ is the kernel density of data point i, $K_j$ is the weight of data point j, $D_{ij}$ is the distance between point i and point j, R is the bandwidth threshold used in area calculations ($D_{ij} < R$), and n is the number of study objects j within the bandwidth R. In this study, kernel density estimation is applied to the P&T data set in the study area to calculate the degree of aggregation of the POI and toponym data and the spatial distribution of the data density. The initial bandwidth R is set to 3000 m, and the output pixel size is one-tenth of the bandwidth (300 m). Then, the bandwidth and the output pixel size are progressively reduced to 300 m and 30 m, respectively. The kernel density estimation results are shown in Figure 4. A comparison of the results shows that the region with the highest luminance value has an obvious boundary when the bandwidth R ranges from 800 to 1000 m, which makes this range suitable for extracting the central boundaries of county-level urban areas. Therefore, the bandwidth R is set to 1000 m and the output pixel size is set to 100 m in the kernel density analysis to obtain the results.
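To make the roles of the weight, the distance and the bandwidth concrete, a minimal sketch of a distance-weighted kernel density calculation is given below. The kernel shape is an assumption: the sketch uses the quartic kernel $(1 - D_{ij}^2/R^2)^2$ with normalization by $n\pi R^2$, which is common in GIS software, and it may differ from the exact form of Equation (1). The function name `kernel_density` is hypothetical.

```python
import numpy as np

def kernel_density(centers, points, weights, bandwidth):
    """Quartic-kernel density at each center (ASSUMED kernel form).

    centers: (m, 2) coordinates where the density P_i is evaluated.
    points: (n_total, 2) coordinates of the P&T data points.
    weights: (n_total,) weight K_j of each data point.
    bandwidth: search radius R, in the same units as the coordinates.
    """
    centers = np.asarray(centers, dtype=float)
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Pairwise distances D_ij between centers i and data points j.
    d = np.linalg.norm(centers[:, None, :] - points[None, :, :], axis=2)
    inside = d < bandwidth  # only points with D_ij < R contribute
    kernel = np.where(inside, (1.0 - (d / bandwidth) ** 2) ** 2, 0.0)
    n = inside.sum(axis=1)  # number of points within R of each center
    weighted = (kernel * weights[None, :]).sum(axis=1)
    norm = n * np.pi * bandwidth ** 2
    return np.divide(weighted, norm,
                     out=np.zeros_like(weighted), where=n > 0)
```

With a bandwidth of 1000 m, a point 500 m from a center contributes a kernel value of 0.5625 of its weight, while a point 2000 m away contributes nothing.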

After obtaining the kernel density analysis results, the processed NSL data of NPP-VIIRS are resampled to 100 m, and the NPP & PT composite index values are then calculated from the kernel density and the NSL data of NPP-VIIRS by referring to existing research [40]. In the calculation equation, $NPC_i$ is the NPP & PT composite index value of data point i, $NTL_i$ is the NTL brightness value of data point i, and $K_i$ is the kernel density of data point i.
In line with the purpose of this research, we define four main categories: natural features, rural areas, suburb areas and urban areas. We then classify the index using the equal interval method, the geometrical interval method and the natural breaks method. After comparison, observation and analysis, we find that the natural breaks method gives the best result. Moreover, the natural breaks method ensures that the differences within the categories are the smallest and those between the categories are the largest. Therefore, the NPP & PT composite index results are divided into four sections by the natural breaks method (Figure 5). The first section includes natural features, the second section includes rural areas, and the third and fourth sections, after being combined, include urban central areas. Finally, according to the segmented values, the radiance values in the corresponding natural feature areas are set to 0, and the boundaries between rural areas and urban centers are extracted. Compared with the statistical data, the urban centers of Mangshi city, Ruili city, Longchuan County, Yingjiang County and Lianghe County are located in the areas near the government offices of each county and city, which largely conforms to the actual situation.
After division, we assessed the accuracy of the 2017 extraction results (Figure 5) using a 2017 Sentinel-2 remote sensing image. We mainly use statistical classification indices (precision, recall and F1-score) to evaluate the extracted areas quantitatively. The specific formulas are as follows:
$$\text{precision} = \frac{a_{overlap}}{a_{computed}} \tag{3}$$
$$\text{recall} = \frac{a_{overlap}}{a_{comparative}} \tag{4}$$
$$\text{F1-score} = 2 \times \frac{\text{precision} \times \text{recall}}{\text{precision} + \text{recall}} \tag{5}$$
where F1-score is the harmonic mean of precision and recall, $a_{computed}$ is the area extracted using the NTL, POIs and toponyms, $a_{comparative}$ is the area of the reference, and $a_{overlap}$ is the area of the overlap between the extracted results and the reference. The results are shown in Table 2.
According to Table 2, the values of precision, recall and F1-score in urban central areas, suburb areas and rural areas are relatively high, which means that the extracted boundaries are good. However, the accuracy for the natural areas is somewhat low. This might be because the research focuses mainly on urban central areas, suburb areas and rural areas: in this study, we mainly use toponyms and POIs related to people to extract the boundaries, and the POIs and toponyms of natural features are not added.
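The three area-based indices can be computed directly from the overlap, extracted and reference areas, as in the short sketch below. The function name `area_scores` and the example areas are hypothetical and are not values from Table 2.

```python
def area_scores(a_overlap, a_computed, a_comparative):
    """Precision, recall and F1-score from areas (Equations (3)-(5))."""
    precision = a_overlap / a_computed     # share of extracted area that is correct
    recall = a_overlap / a_comparative     # share of reference area that was found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: 8 km^2 overlap, 10 km^2 extracted, 16 km^2 reference.
precision, recall, f1 = area_scores(8.0, 10.0, 16.0)
```

In this example the precision is 0.8 and the recall is 0.5, so the F1-score (about 0.615) lies between them but closer to the lower value, as expected of a harmonic mean.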

Construction of the REPC Model
There are three important steps in building the EPC model of DH Prefecture at a 1-km spatial resolution. The first step is to divide the research area into three parts according to the existing county-level administrative division boundaries of the DH Prefecture and the extracted boundaries of urban centers and rural areas. The second step is to calculate the light index values of the urban centers and rural areas in each county and city from 2012-2017 and to perform a regression analysis of the calculated light index scores with the REPC statistical data. Finally, according to the analysis results, the regression model of REPC is constructed. We compared linear regression models and quadratic polynomial nonlinear regression models to find the best fit.
Light indices, such as the total NTL index (TNL), comprehensive NTL index (CNLI), average light intensity (I) and light area ratio (s) [41], may reflect the social and economic development of a certain area. The CNLI is based on the average light intensity and light area ratio. The TNL is the sum of the digital numbers (radiance values) of the NTL image pixels in an administrative unit. In this study, the TNL is used to build the regression model of electricity consumption, and the calculation equation is as follows [42]:
$$TNL = \sum_{i} R_i \times C_i \tag{6}$$
where TNL is the total NTL intensity, $R_i$ is the gray value (radiance) of the pixels at level i in the administrative unit, and $C_i$ is the number of pixels at level i. The results of TNL for the DH Prefecture from 2012 to 2017 are shown in Tables 3-5.
First, the linear regression model and the quadratic regression model of the rural electricity consumption and the total intensity of NTL are constructed. The average errors of the linear and quadratic regression models are 17.75% and 19%, respectively. Therefore, this research selected the linear regression model. The regression is shown in Equation (7), and the corresponding coefficient of determination R² is 0.7835.
$$REPC_n = 3.344\,TNL_n - 36940 \tag{7}$$
where $REPC_n$ is the rural electric power consumption in administrative region n and $TNL_n$ is the total intensity of rural NTL in administrative region n. The regression results are shown in Figure 6.
Next, the relative error (RE) and the average relative error (ARE), given by Equations (8) and (9), are used to evaluate the REPC results. In these equations, $REPC_{cal}$ is the calculated rural electric power consumption, $REPC_{real}$ is the real rural electric power consumption, and n is the number of values in the same area. The calculation results of RE are shown in Table 6.
Finally, after comparing the RE of the REPC between the two regression models from 2012 to 2017, we find that the average RE of the linear model is between 0.28 and 2.69 and the average RE of the quadratic polynomial model is between 0.28 and 2.41. There is little difference between the two fitted models. Therefore, the linear regression model is selected to build the REPC model in this study. To apply the model to the unit grid with a 1-km resolution and perform the next step of EPC spatialization, it is necessary to reduce the estimation scale of the model from the administrative unit to the 1-km grid size.
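The chain from rural NTL pixels to an REPC estimate and its error can be sketched as follows. The regression coefficients are those of Equation (7). The RE is assumed here to be the absolute difference between the calculated and real values divided by the real value, and the ARE its mean, which may differ in detail from Equations (8) and (9). All function names are hypothetical.

```python
import numpy as np

def total_night_light(radiance, rural_mask):
    """TNL: sum of the radiance of all rural pixels in one region.

    Summing pixel values directly is equivalent to summing
    (gray value x pixel count) over all gray levels.
    """
    rad = np.asarray(radiance, dtype=float)
    return float(rad[np.asarray(rural_mask, dtype=bool)].sum())

def repc_from_tnl(tnl):
    """Equation (7); the result is in the unit of the statistics (10^4 kW.h)."""
    return 3.344 * tnl - 36940.0

def relative_errors(calculated, real):
    """ASSUMED form of RE: |calculated - real| / real."""
    calculated = np.asarray(calculated, dtype=float)
    real = np.asarray(real, dtype=float)
    return np.abs(calculated - real) / real

def average_relative_error(calculated, real):
    """ASSUMED form of ARE: mean of the relative errors."""
    return float(relative_errors(calculated, real).mean())
```

For instance, a region with a rural TNL of 20,000 would give `repc_from_tnl(20000.0)` of 29,940 (10⁴ kW·h); this input is purely illustrative and is not taken from Tables 3-5.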
The final linear regression model for spatialization, built according to the method given in [42], is Equation (10), in which $EPC_{ij}$ is the calculated electric power consumption value of grid j in administrative region i, $R_{ij}$ is the radiance value of NTL in grid j in administrative region i, and $TNL_i$ is the total intensity of rural NTL in the ith administrative region.
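One way to move from the regional model to the 1-km grid is to distribute the regional estimate in proportion to each grid cell's share of the regional TNL. The sketch below illustrates this idea; it assumes the proportional form $EPC_{ij} = (3.344\,TNL_i - 36940) \times R_{ij} / TNL_i$, which is consistent with the quantities defined above but is not confirmed to be the exact form of Equation (10). The function name `spatialize_repc` is hypothetical.

```python
import numpy as np

def spatialize_repc(radiance, rural_mask, slope=3.344, intercept=-36940.0):
    """Allocate the regional REPC estimate to grid cells (ASSUMED form).

    Each rural cell receives a share of the regional estimate equal to
    its share R_ij / TNL_i of the regional total night-time light.
    """
    rad = np.where(np.asarray(rural_mask, dtype=bool),
                   np.asarray(radiance, dtype=float), 0.0)
    tnl = rad.sum()
    if tnl <= 0:
        return np.zeros_like(rad)
    regional_estimate = slope * tnl + intercept  # Equation (7)
    return regional_estimate * rad / tnl
```

Under this assumption the grid values of a region always sum to the regional estimate from Equation (7), so the downscaling does not change the regional total.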

EPC Spatialization
There are three steps in the spatialization of EPC. The first step is to estimate the REPC of each county at a 1-km resolution using Equation (10). The second step is to estimate the urban center EPC spatially in each county. Because the statistical yearbook does not specify the EPC of urban residents and there is little difference in economic development within the DH Prefecture, the urban center EPC is calculated from the total intensity of urban central light using the regression model of power consumption. The third step is error correction. Because the regional EPC value calculated from the linear regression model deviates somewhat from the actual power consumption value provided in the statistical data, the calculated EPC values need to be corrected; the corrected EPC values are then assigned to the grid cells at a 1-km spatial resolution to obtain the spatial results of total EPC in the DH Prefecture. The correction follows [43]. In the correction-factor equation, $K_i$ is the correction factor for administrative region i, $REPC_i$ is the calculated value of rural electric power consumption in administrative region i, obtained from Equation (7), and $TEPC_i$ is the statistical value of rural electric power consumption in administrative region i. The correction factor results for DH Prefecture are shown in Table 7. In the equation for the revised values, $CE_{ij}$ is the revised EPC value of grid j in administrative region i; $EPC_{ij}$ is the calculated value of electric power consumption in grid j in administrative region i, obtained from Equation (10); and $K_i$ is the correction factor for administrative region i.
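The error correction step can be illustrated with the sketch below. It assumes that the correction factor is the ratio of the statistical value to the calculated value, $K_i = TEPC_i / REPC_i$, and that the revised grid value is $CE_{ij} = K_i \times EPC_{ij}$; the direction of the ratio is inferred from the definitions above rather than taken from [43]. The function name `corrected_epc` and the example numbers are hypothetical.

```python
import numpy as np

def corrected_epc(epc_grid, calculated_total, statistical_total):
    """Scale grid estimates so the region matches the statistics.

    epc_grid: calculated EPC_ij values for one administrative region.
    calculated_total: REPC_i from the regression model.
    statistical_total: TEPC_i from the statistical yearbook.
    Returns the correction factor K_i and the revised grid CE_ij.
    """
    grid = np.asarray(epc_grid, dtype=float)
    k = statistical_total / calculated_total  # ASSUMED: K_i = TEPC_i / REPC_i
    return k, k * grid


# Hypothetical region: the model gives 400, the yearbook reports 500.
k, revised = corrected_epc([100.0, 300.0], 400.0, 500.0)
```

Here the correction factor is 1.25 and the revised grid values (125 and 375) sum to the statistical total of 500.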

Results
Using the methods described in Section 3, the EPC spatial results for five counties in the DH Prefecture in 2017 are obtained, and the results are divided into seven grades based on natural breaks (Figure 7). First, the areas with the highest level of electricity consumption are concentrated near urban centers, and the light intensity generally decreases from these urban centers to the suburbs, industrial parks and rural areas. Areas of relatively high electricity consumption are concentrated in cities and the surrounding suburbs, and the low electricity consumption levels are concentrated in rural areas. The distributions of urban centers and rural areas are shown in Figure 7. The distribution of the REPC is relatively scattered, with the highest values concentrated in cities and subsequent radially decreasing values toward rural areas. Second, there are three electricity consumption belts: the Ruili-Mangshi belt, the Yingjiang-Lianghe belt and the Longchuan-Lianghe belt.
Overall, the EPC results largely conform to the spatial strategic layout for the DH Prefecture established in the "13th Five Year Plan" and objectively and clearly show the overall spatial distribution of electricity consumption in the DH Prefecture. According to the RE estimation from the REPC model in Section 3, the area with the largest estimation error is Ruili city, with an error reaching 1.78, while the estimation errors in the other areas fluctuate between 0.20 and 0.30. Specifically, from high to low, these are Lianghe County (0.29), Longchuan County (0.25), Yingjiang County (0.23) and Mangshi city (0.20). The electricity consumption level in the urban centers of the five counties in the DH Prefecture in 2017 is estimated to simulate the urban EPC of each county. The estimated results are shown in Table 8. Table 8 shows that the area with the highest urban EPC in the DH Prefecture is Mangshi, reaching 10,649 (10⁴ kW·h), followed by Longchuan County, Yingjiang County and Lianghe County. The area with the lowest urban power consumption is Lianghe, at only 1501 (10⁴ kW·h).

EPC of Ethnic Minorities
The DH Prefecture has a large ethnic minority population, most of whom live in rural areas outside the cities. Through spatial analysis of the existing ethnic minority census data and the selected ethnic minority toponym data, we obtain two proportions for each region: the proportion of ethnic minority population and the proportion of ethnic minority toponyms. Both reflect the distribution and size of the ethnic minority population in the DH Prefecture. Combining these two proportions with the spatial REPC results, we calculate the rural ethnic minority power consumption in the prefecture. Finally, the ethnic REPC results are reclassified into four levels: high, less high, medium and low consumption. The results reflect the spatial distribution and power consumption levels of ethnic minorities in the DH Prefecture (Figure 9). Figure 9 shows that the REPC of ethnic minorities is mainly concentrated in Mangshi, Yingjiang and Longchuan, and that most of it falls in the less high and medium consumption levels.
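The weighting and reclassification steps can be sketched as follows. The text does not state how the two proportions are combined or where the class breaks are placed, so this sketch makes two labeled assumptions: the proportions are averaged with equal weight, and the four levels are separated at the quartiles. The function names and input values are hypothetical.

```python
import numpy as np


def minority_repc(repc, pop_share, toponym_share):
    """Scale REPC by an ethnic minority share.

    Assumption: the share is the equal-weight mean of the population
    proportion and the toponym proportion (not specified in the source).
    """
    share = (np.asarray(pop_share, dtype=float) + np.asarray(toponym_share, dtype=float)) / 2.0
    return np.asarray(repc, dtype=float) * share


def classify_four_levels(values):
    """Reclassify values into 0 (low), 1 (medium), 2 (less high), 3 (high).

    Assumption: class breaks are the 25th, 50th and 75th percentiles.
    """
    values = np.asarray(values, dtype=float)
    breaks = np.quantile(values, [0.25, 0.5, 0.75])
    return np.digitize(values, breaks)


# Hypothetical REPC values and proportions for four regions (illustrative only).
repc = [100.0, 200.0, 300.0, 400.0]
pop_share = [0.2, 0.4, 0.6, 0.8]
toponym_share = [0.4, 0.6, 0.8, 1.0]

ethnic = minority_repc(repc, pop_share, toponym_share)
levels = classify_four_levels(ethnic)
```

The same functions accept 2-D arrays, so they could be applied to a 1-km REPC grid and matching proportion grids. Other break schemes, such as natural breaks, would change only `classify_four_levels`.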

Influence of Terrain on Distribution of REPC
The DH Prefecture is located in the west of the Yunnan-Guizhou Plateau and south of the Hengduan Mountain range. Its terrain is mountainous, high in the northeast and low in the southwest. The overall landform of DH is a ridge-and-valley basin in which parallel ridges and valleys alternate.
A comparison of the 2017 EPC results with the digital elevation map of DH (Figure 10) shows that most of the electricity consumption in the prefecture is concentrated in the valley basin areas. Moreover, by connecting the main power consumption centers and regions, we identify three main EPC stripes in DH. The distribution of these three stripes is consistent both with the topographical trend of the ridge-and-valley basin and with the economic development of DH. Finally, according to the statistical results of REPC, the relatively high rural power consumption in Yingjiang and Mangshi may be due to the low elevations and the large areas of river valley in these regions, features that favor human settlement and population growth. Figure 10 thus gives a clearer picture of the overall distribution of power consumption in DH and provides a reference for future economic development, urban construction and transport planning.

Innovation and Limitation
The innovation of this study lies in several aspects. On the basis of NPP-VIIRS NTL data and electricity statistical data, the spatial estimation of EPC in ethnic minority rural areas is carried out using toponym and POI data. The method is suitable not only for rural areas but also for other types of small-scale areas: it can combine different types of POI and toponym data to estimate the spatial EPC of different types of areas (residential land, commercial land, industrial land, etc.) at a given spatial resolution. Most existing studies have concentrated on large-scale areas such as countries or on more developed regions such as cities, and few have used NPP-VIIRS data to investigate electricity consumption in undeveloped rural areas. In addition, there are research gaps in estimating rural power consumption for Yunnan Province, and because Yunnan's economy is less developed, existing EPC models cannot meet its estimation needs. Establishing the REPC model of DH therefore helps to better understand the development and spatial pattern of rural power consumption. It can also help to estimate the gap between rural EPC and urban EPC in a developing country or region. Furthermore, a large number of ethnic minorities live in DH, and electricity consumption there may differ from that in other areas because of their different lifestyles and living habits. This research estimates the electricity consumption not only of rural areas but also of the areas where many ethnic minorities live.
Finally, this study has several limitations. First, because electricity consumption data for urban areas and the 2016 rural electricity consumption statistics for DH are lacking, the study cannot reflect the difference in electricity consumption between urban centers and rural areas in the DH Prefecture more accurately. Second, because of the short research period (2012-2017), the spatial and temporal changes in the results are not obvious, and the EPC trend does not exhibit obvious growth.

Conclusions
Based on remotely sensed NPP-VIIRS NTL data, county-level rural electricity consumption, and POI and toponym data of the DH Prefecture, a regression model of county-level REPC in ethnic minority areas is constructed, and the urban and total electricity consumption levels in the DH Prefecture in 2017 are then estimated spatially based on this model. A comparison of the estimated results with the statistical data shows that, except in Ruili, the simulated REPC results reflect rural electricity consumption with an average error of 17.8%. The model fits Mangshi and Yingjiang best. The higher estimation error for Ruili may be due to the size of the city and its overall low consumption, which lead to a large error when the REPC model is used to simulate Ruili's electricity consumption. Finally, based on the spatial estimation of rural electric power consumption, this paper discusses the EPC distribution of ethnic minorities and the influence of topography on the distribution of EPC in the study area.
In summary, this research provides a method for estimating EPC in small areas and rural areas using VIIRS DNB night-time light, and makes some improvements in multisource data fusion and the integration of various techniques. On the one hand, the method compensates for the poor performance of NTL data in rural areas, allowing VIIRS data to be used to estimate rural electricity consumption more accurately. On the other hand, the study of REPC can help to identify poverty levels and to make up for the lack of EPC data in rural and ethnic minority areas. In other words, with the help of multisource data, electric power consumption data for rural areas can be obtained more quickly, more conveniently and more accurately.
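A linear model of this kind can be illustrated with an ordinary least-squares fit. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, and the NTL totals and REPC values are synthetic numbers chosen to lie exactly on a line.

```python
import numpy as np


def fit_ntl_repc(total_ntl, repc):
    """Fit REPC = slope * total_NTL + intercept by ordinary least squares."""
    slope, intercept = np.polyfit(np.asarray(total_ntl, dtype=float),
                                  np.asarray(repc, dtype=float), deg=1)
    return float(slope), float(intercept)


def predict_repc(total_ntl, slope, intercept):
    """Apply the fitted line to new total rural NTL values."""
    return slope * np.asarray(total_ntl, dtype=float) + intercept


# Synthetic county-level samples (illustrative only, not data from the paper).
total_ntl = [10.0, 20.0, 30.0, 40.0]        # total rural NTL intensity
repc = [600.0, 1100.0, 1600.0, 2100.0]      # REPC in 10^4 kW·h

slope, intercept = fit_ntl_repc(total_ntl, repc)
estimate = predict_repc([25.0], slope, intercept)
```

With real county statistics the points would not lie exactly on a line, and the fit would need to be checked against held-out statistical data, as was done for the counties above.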