Developing Non-Negative Spatial Autoregressive Models for Better Exploring Relation Between Nighttime Light Images and Land Use Types

Exploring the relationship between nighttime light and land use is of great significance to understanding human nighttime activities and studying socioeconomic phenomena. Models have been studied to explain the relationships, but the existing studies seldom consider the spatial autocorrelation of night light data, which leads to large regression residuals and an inaccurate regression correlation between night light and land use. In this paper, two non-negative spatial autoregressive models are proposed for the spatial lag model and spatial error model, respectively, which use a spatial adjacency matrix to calculate the spatial autocorrelation effect of light in adjacent pixels on the central pixel. The application scenarios of the two models were analyzed, and the contribution of various land use types to nighttime light in different study areas are further discussed. Experiments in Berlin, Massachusetts and Shenzhen showed that the proposed methods have better correlations with the reference data compared with the non-negative least-squares method, better reflecting the luminous situation of different land use types at night. Furthermore, the proposed model and the obtained relationship between nighttime light and land use types can be utilized for other applications of nighttime light images in the population, GDP and carbon emissions for better exploring the relationship between nighttime remote sensing brightness and socioeconomic activities.


Introduction
Nighttime light recorded by satellites represents what human beings use at night for production and living, which is an effective means to study the spatial distribution of human nocturnal activities [1,2]. It has been confirmed that there is a strong linear correlation between night light brightness values and various economic indicators in a region [2][3][4]. So, night light data have been widely used to simulate the spatial distribution of the population, GDP, carbon emissions and other fields [4][5][6]. However, studies have also shown that the correlation between night light and economic indicators varies from region to region [7][8][9]. For example, the nighttime light intensity (NLI) of an American town with a population of 10,000 is three times that of a German town of the same size [10]. Studying the regional differences and influencing factors in the correlation between night light and economic activities can help researchers correct the errors in estimating GDP and economic activities by using night light, improving the accuracy of GDP estimation, and reflecting the spatial distribution of GDP accurately [11,12]. In addition, only by understanding the causes and distribution pattern of night light light more accurately and (2) to find land use types closely related to human nighttime activities and expand the application field of nighttime light data in the study of human activities, such as light pollution, traffic, GDP components, and so forth.

Methods
In this study, non-negative spatial autoregressive models and corresponding solving methods are developed to quantify the land use contribution to nighttime light using coarse-resolution night imagery and fine-resolution land use data. The experimental framework is shown in Figure 1, including resolution conversion, land use type proportion statistics, spatial neighborhood setting, NLI calculation by different models and results comparison and analysis.  Figure 1. Experimental framework for comparing R 2 with reference data and residual errors between the proposed non-negative spatial autoregressive models and nonnegative least square (NLS) model.

Study Area and Data Sources
The study areas were Berlin in Germany, Massachusetts in the United States, and Shenzhen in China. Since the three areas are the mostly developed regions in these countries, there were abundant remote sensing images and statistical data available online for supporting this study. The selected data covered large time spans and diverse spatial resolutions. In addition, the three areas are located in different countries with distinct levels of economic and urban development stages. These features can help to verify the robustness and adaptability of the models in different application scenarios. The nighttime light images and the administrative boundaries of the three study areas are illustrated in Figure 2.
With a population of 3.7 million, Berlin is the largest city in, and the capital of Germany [40]. Berlin is also a city with a high level of economic development and a good ecological environment. About one-third of the area is composed by forests, parks, gardens, rivers, canals and lakes. With an Figure 1. Experimental framework for comparing R 2 with reference data and residual errors between the proposed non-negative spatial autoregressive models and nonnegative least square (NLS) model.

Study Area and Data Sources
The study areas were Berlin in Germany, Massachusetts in the United States, and Shenzhen in China. Since the three areas are the mostly developed regions in these countries, there were abundant remote sensing images and statistical data available online for supporting this study. The selected data covered large time spans and diverse spatial resolutions. In addition, the three areas are located in different countries with distinct levels of economic and urban development stages. These features can help to verify the robustness and adaptability of the models in different application scenarios. The nighttime light images and the administrative boundaries of the three study areas are illustrated in Figure 2.
With a population of 3.7 million, Berlin is the largest city in, and the capital of Germany [40]. Berlin is also a city with a high level of economic development and a good ecological environment. About one-third of the area is composed by forests, parks, gardens, rivers, canals and lakes. With an area of 27,340 km 2 and population of 6.9 million in 2018, Massachusetts is the third most densely Remote Sens. 2020, 12, 798 4 of 25 populated state in the U.S. [41]. As one of the original 13 states, Massachusetts has a long history of development. Over 80% of Massachusetts' population lives in the Greater Boston metropolitan area that has great influential upon U.S. history, academia and industry. Shenzhen is the first special economic zone in China and the major city in the Pearl River Delta megalopolis. With a population of 12.5 million in 2017, Shenzhen ranked as the third largest city in China [42]. As a newly developed city after the reform and opening up, Shenzhen has become one of the most economically dynamic cities in China, and even the world.
Remote Sens. 2020, 12,798 4 of 25 area of 27,340 km 2 and population of 6.9 million in 2018, Massachusetts is the third most densely populated state in the U.S [41]. As one of the original 13 states, Massachusetts has a long history of development. Over 80% of Massachusetts's population lives in the Greater Boston metropolitan area that has great influential upon U.S. history, academia and industry. Shenzhen is the first special economic zone in China and the major city in the Pearl River Delta megalopolis. With a population of 12.5 million in 2017, Shenzhen ranked as the third largest city in China [42]. As a newly developed city after the reform and opening up, Shenzhen has become one of the most economically dynamic cities in China, and even the world. The experimental data of the three study areas are shown in Table 1. The data used for building the regression model were land use data and coarse resolution nighttime light images. Reference data were fine-resolution nighttime light images for model accuracy verification.  The experimental data of the three study areas are shown in Table 1. The data used for building the regression model were land use data and coarse resolution nighttime light images. Reference data were fine-resolution nighttime light images for model accuracy verification. The raw night light images used in this paper were the average annual images of Defense Meteorological Satellite Program's Operational Linescan System (DMSP/OLS) and Visible Infrared Imaging Radiometer Suite (VIIRS) sensor on the Suomi National Polar-orbiting Partnership (NPP) Satellite. Radiometric correction has been conducted, and the background noise of the images has been removed by data providers [33,43,44]. Since this study does not involve time series analysis, the joint correction of NLI values between yearly images is not needed. So, the projected transformation and resolution conversion were performed directly with the downloaded images.
All images of the same study areas were projected into the same geographic coordinate system. Images of Berlin, Shenzhen and Massachusetts were projected into World Geodetic System 1984(WGS84)/Universal Transverse Mercator (UTM) zone 33N, zone 49N and the State Plane (Mass Mainland) coordinate system, respectively. As the nighttime light images and land use data had different resolutions, fine-resolution land use data were converted to a coarse resolution that of the nighttime light images for further analysis.

Spatial Neighborhood Setting
For the spatial autoregressive model, setting the spatial weight matrix W has a great influence on the regression results [45]. In this paper, k nearest neighbors was used to choose the adjacent pixels, and the inverse distance method was used to determine the weight of each neighboring pixel [46].
In order to specify the threshold of spatial neighbors, the influence range of spatial autocorrelation effects needs to be identified. The semi-variation coefficients are commonly used indicator to determine the most suitable k value. According to Chen's study, the influence range of spatial autocorrelation characteristics is about 2-11 pixels for different land use types [47]. In this paper, we selected two k values (i.e., k = 8 and k = 24) to construct the spatial weight matrices and compare regression results for exploring the influence range and optimal threshold of spatial autocorrelation of night light data. The specified eight and 24 nearest neighbors are shown in Figure 3, which correspond to thresholds of 1.5 and three pixels, respectively. The weights for the pixels in the adjacent region to the central pixel are the reciprocal of their distances, while the weights for other pixels are set to 0. The raw night light images used in this paper were the average annual images of Defense Meteorological Satellite Program's Operational Linescan System (DMSP/OLS) and Visible Infrared Imaging Radiometer Suite (VIIRS) sensor on the Suomi National Polar-orbiting Partnership (NPP) Satellite. Radiometric correction has been conducted, and the background noise of the images has been removed by data providers [33,43,44]. Since this study does not involve time series analysis, the joint correction of NLI values between yearly images is not needed. So, the projected transformation and resolution conversion were performed directly with the downloaded images.
All images of the same study areas were projected into the same geographic coordinate system. Images of Berlin, Shenzhen and Massachusetts were projected into World Geodetic System 1984(WGS84)/Universal Transverse Mercator (UTM) zone 33N, zone 49N and the State Plane (Mass Mainland) coordinate system, respectively. As the nighttime light images and land use data had different resolutions, fine-resolution land use data were converted to a coarse resolution that of the nighttime light images for further analysis.

Spatial Neighborhood Setting
For the spatial autoregressive model, setting the spatial weight matrix W has a great influence on the regression results [45]. In this paper, k nearest neighbors was used to choose the adjacent pixels, and the inverse distance method was used to determine the weight of each neighboring pixel [46].
In order to specify the threshold of spatial neighbors, the influence range of spatial autocorrelation effects needs to be identified. The semi-variation coefficients are commonly used indicator to determine the most suitable k value. According to Chen's study, the influence range of spatial autocorrelation characteristics is about 2-11 pixels for different land use types [47]. In this paper, we selected two k values (i.e., k = 8 and k = 24) to construct the spatial weight matrices and compare regression results for exploring the influence range and optimal threshold of spatial autocorrelation of night light data. The specified eight and 24 nearest neighbors are shown in Figure  3, which correspond to thresholds of 1.5 and three pixels, respectively. The weights for the pixels in the adjacent region to the central pixel are the reciprocal of their distances, while the weights for other pixels are set to 0.

Land Use Types Proportion Statistic
The nighttime light value of a pixel can be expressed as the sum of the brightness values of all land use types, which is calculated by multiplying the NLI of such a land use types with the proportion of it in the central pixel [33]. So, we counted the proportions of different land use types in each grid. Because of the autocorrelation of land use in geographical space, there were only a few land types that existing in the same pixel, which led to uneven proportions of land use types. This phenomenon incurs a sparse matrix solving problem in regression, which is solved by the modified models proposed in Sections 2.3.2 and 2.3.3.

Spatial Autoregressive Model Construction
Because of the sensor resolution and landscape variability, remote sensing data exhibited a spectral dependency between neighboring pixels, i.e., spatial autocorrelation, in multispectral remote sensing images, and in nighttime light remote sensing data [48][49][50][51][52], which was also proved for our three study areas in Table A1 of the Appendix A. Campbell used Landsat multi-spectral scanner images to study the positive autocorrelation between neighboring pixels [53]. Dana indicated that the radiance of one pixel can further affect the radiance of pixels 4-6 away [54]. The presence spatial autocorrelation violates the hypothesis that the data are independent, so traditional methods such as OLS and NOLS may yield unsatisfactory results and lead to errors in analyzing the cause of night light. Therefore, the accuracy of the relation between nighttime light and land use types could be improved by considering the spatial autocorrelation.
The main concern of this paper is the correlation between night light and land use. Studying the origin of the spatial autocorrelation of night light images can help us understand the spatial pattern of night light, which in turn expanding the application scope of night light images. Night light data are characterized in the single band by a low resolution and significant brightness overflow in general. Three potential reasons for the spatial autocorrelation phenomenon, which need to be eliminated, can be drawn as follows [53,54]: (1) the refraction and reflection of light ( Figure 3a); (2) the influence of climate and atmospheric conditions; (3) positive correlation caused by the imaging system. The first explanation reflects that the night light brightness is affected by the ambient brightness. This part of autocorrelation comes from the dependent variable of luminous brightness itself, so it can be measured by the spatial lag model (SLM). The second and third explanations indicate that, besides being affected by the dependent variable, there are some factors missing in linear regression that are not related to the independent variable, and the spatial error model (SEM) can be used to solve this problem. Therefore, we analyzed application scenarios and the influenced the spatial ranges of the two models through further experiments in this paper.

Traditional Spatial Autoregressive Model
Because of the strong spatial autocorrelation within nighttime light images, the classical regression models are no longer applicable [55]. Spatial autoregressive models are an effective method with which to solve this problem, which include the first-order spatial auto-regressive model, spatial lag model and spatial error mode. Among these models, the spatial lag model and the spatial error model are the most widely used. The general form of the autoregressive models is as Equation (1).
where y is the dependent variable; X is an n-by-k independent variable matrix; β is a k-by-1 parameter vector associated with the independent variable X. W 1 and W 2 are the n-by-n order weight matrixes, reflecting the spatial trend of the dependent variable and the spatial trend of residuals respectively; ρ is the spatial lag variable W y coefficient; λ is the spatial correlation intensity between regression resistors. µ is a n-by-1 error vector and ε is a normally distributed random error vector [56].
Different spatial autoregressive models can be obtained by applying different model parameter restrictions. When ρ 0 and λ = 0, it is a SLM; when ρ = 0 and λ 0, it is a SEM. The SLM model considers dependent variables on a spatial object are related not only to independent variables on the same object, but also to dependent variables of adjacent objects. The SEM model assumes that spatial correlation is generated by missing variables. It reflects the error propagation process through the spatial covariance of different regions [50].
These models can measure the spatial autocorrelation caused by dependent variables and error, but have troubles in parameter interpretation and model solving. According to the physical meaning, the value of the land use brightness coefficient should be non-negative. However, the coefficients of the independent variables will contain both positive and negative values if following the conventional model's procedure. It incurs difficulty to interpret the correlation between particular land use types and NLI. Meanwhile, the proportion of land use types in a single pixel is uneven due to the agglomeration of different land use types. It leads to a sparse coefficient matrix of the proportions, and impedes the solving of traditional spatial autoregressive model, in turn posing a great challenge to the analysis of the relation between land use and nighttime light. Therefore, we proposed an improved solving method by making the coefficients non-negative, which may better explain the correlations. Meanwhile, instead of using the original matrix inversion method, we adopted dynamic programming to solve the local optimal autoregressive coefficient using the multistage optimization decision method provided by MATLAB software. It enhances the robustness in solving even when the coefficient matrix is sparse. The improved calculation procedure is as illustrated in Figure 4. Different spatial autoregressive models can be obtained by applying different model parameter restrictions. When ≠ 0 and = 0, it is a SLM; when = 0 and ≠ 0, it is a SEM. The SLM model considers dependent variables on a spatial object are related not only to independent variables on the same object, but also to dependent variables of adjacent objects. The SEM model assumes that spatial correlation is generated by missing variables. It reflects the error propagation process through the spatial covariance of different regions [50].
These models can measure the spatial autocorrelation caused by dependent variables and error, but have troubles in parameter interpretation and model solving. According to the physical meaning, the value of the land use brightness coefficient should be non-negative. However, the coefficients of the independent variables will contain both positive and negative values if following the conventional model's procedure. It incurs difficulty to interpret the correlation between particular land use types and NLI. Meanwhile, the proportion of land use types in a single pixel is uneven due to the agglomeration of different land use types. It leads to a sparse coefficient matrix of the proportions, and impedes the solving of traditional spatial autoregressive model, in turn posing a great challenge to the analysis of the relation between land use and nighttime light. Therefore, we proposed an improved solving method by making the coefficients non-negative, which may better explain the correlations. Meanwhile, instead of using the original matrix inversion method, we adopted dynamic programming to solve the local optimal autoregressive coefficient using the multistage optimization decision method provided by MATLAB software. It enhances the robustness in solving even when the coefficient matrix is sparse. The improved calculation procedure is as illustrated in Figure 4.

Improved Non-Negative Spatial Lag Model (NSLM)
The spatial lag model can measure the spatial autocorrelation caused by the dependent variable. The regression equation is as Equation (2).
Remote Sens. 2020, 12, 798 where y denotes the nighttime light value of the pixel, and the proportion of different land use types in this pixel i are denoted by {x i1 , x i2 , . . . , x im }. n denotes the number of the pixels on nighttime light images, and m denotes the number of land use types. β j denotes the NLI of the land use type j. ρ is the coefficient of spatial lag variable W y , which is also called the spatial lag factor. W is the spatial weight matrix, which represents the influences between all pixel pairs in the image. ε is a normally distributed random error vector, which can be denoted as ε ∼ N 0, σ 2 I n .
In order to solve the three variables, i.e., β, ρ and σ, a new vector θ = (β, ρ, σ) is constructed. Maximum likelihood estimation is used to solve the variables according to the traditional spatial lag model. The logarithmic likelihood function of vector θ is shown in Equation (3).
Since the value of spatial lag factor ρ is usually between 0 and 1, to reduce the number of unknowns, we use the enumeration method and take 101 ρ values between 0 and 1 by using 0.01 as the interval. For each given value of ρ, dynamic programming is used to solve the non-negative value of β with constraints, and then ρ and β are used to solve the corresponding value of σ. The solution methods of β and σ are shown in Equations (4) and (5).
According to the 101 vectors of θ, the logarithmic likelihood function of ρ is calculated using Equation (6), and ρ and β that maximize the logarithmic likelihood function are obtained.

Improved Non-Negative Spatial Error Model (NSEM)
The spatial error model is more accurate when spatial correlation is generated by neglected variables. It reflects the error propagation process through the spatial covariance of different regions. The regression equation is as Equation (7).
where the weight matrix W reflects the spatial trend of residuals; λ is the spatial correlation intensity between regression resistors; ε is the random error vector; and µ is a normally distributed random error vector. X, y and β represent the same meaning as the NSLM.
As with the NSLM, the maximum likelihood estimation is used to solve the unknowns of the model by introducing a new vector τ = (β, λ, σ). The logarithmic likelihood function of the vector is shown in Equation (8).
The 101 values of the spatial lag factor λ are also selected between 0 and 1 by using 0.01 as the interval. For each given value of λ, dynamic programming solves the non-negative value of β with constraints, and the value of σ is solved by using λ and β. The solution methods of β and σ are shown in Equations (9) and (10). Then, according to the obtained vectors of τ, the logarithmic likelihood function of λ is calculated using Equation (11), and λ and β that maximize the logarithmic likelihood function are obtained.β = arg min

Comparative Analysis of Nighttime Light Intensity
In order to verify the accuracy of our models, the estimated NLIs of each land use type from coarse-resolution nighttime light imagery by different models are compared with that of reference data listed in Table 1. For the proposed two non-negative spatial autoregressive models, NSEM and NSLM, the non-negative spatial linear regression model NSL without considering spatial autocorrelation was selected as the baseline method. In this experiment and the analysis in Section 3.2, spatial weight matrices constructed by the eight nearest neighbors were used as the case study.
Considering different reference data as described in Table 1, the calculations of NLIs in three study areas were different. In Berlin, the reference data, nighttime aerial photographs with 1 m resolution, has a higher resolution than the land use map. So, for each type of land use, the average value of NLI fell into the land use type on the map was calculated as the reference NLI value. In Massachusetts and Shenzhen, the reference data were photographs with 30 m resolution and 170m resolution, respectively, whose resolutions were lower than that of land use maps. To ensure the accuracy of the reference NLI data, land use proportion maps at a lower resolution were produced from the original land use map. For each land use type, the pixels that entirely occupied the land use type were treated as pure pixels, and the average nighttime light values of pure pixels were calculated as the reference NLI value. The estimated NLIs and reference NLIs in three study areas are listed in Tables A1-A3 in the Appendix A. The differences between the nighttime brightness coefficients of different land use types in these tables indicated that land use type was an important factor affecting the NLI.
In order to compare the performance of the models, we adopted linear regression to analyze the fitting of NLI values for all land use types. As sensors are different, the NLI values represent different units of brightness among images. Therefore, the NLIs derived from different nighttime light images may vary on the same land use map and cannot be compared directly. Considering that there were 52, 32 and 26 land use types in three study areas respectively, the R 2 values were satisfactory, showing that the estimated NLIs from NPP/VIIRS data reflected the actual NLIs of different land use types. The higher goodness-of-fit indicated better estimation accuracy and better capability to reflect the relationships between NLI and land use types in our case [33]. Figure 5 shows the scatter diagrams and the linear regression results.
Remote Sens. 2020, 12, 798 9 of 25 likelihood function of λ is calculated using Equation (11), and and that maximize the logarithmic likelihood function are obtained.

Comparative Analysis of Nighttime Light Intensity
In order to verify the accuracy of our models, the estimated NLIs of each land use type from coarse-resolution nighttime light imagery by different models are compared with that of reference data listed in Table 1. For the proposed two non-negative spatial autoregressive models, NSEM and NSLM, the non-negative spatial linear regression model NSL without considering spatial autocorrelation was selected as the baseline method. In this experiment and the analysis in Section 3.2, spatial weight matrices constructed by the eight nearest neighbors were used as the case study.
Considering different reference data as described in Table 1, the calculations of NLIs in three study areas were different. In Berlin, the reference data, nighttime aerial photographs with 1 m resolution, has a higher resolution than the land use map. So, for each type of land use, the average value of NLI fell into the land use type on the map was calculated as the reference NLI value. In Massachusetts and Shenzhen, the reference data were photographs with 30 m resolution and 170m resolution, respectively, whose resolutions were lower than that of land use maps. To ensure the accuracy of the reference NLI data, land use proportion maps at a lower resolution were produced from the original land use map. For each land use type, the pixels that entirely occupied the land use type were treated as pure pixels, and the average nighttime light values of pure pixels were calculated as the reference NLI value. The estimated NLIs and reference NLIs in three study areas are listed in Tables A1-A3 in the Appendix. The differences between the nighttime brightness coefficients of different land use types in these tables indicated that land use type was an important factor affecting the NLI.
In order to compare the performance of the models, we adopted linear regression to analyze the fitting of NLI values for all land use types. As sensors are different, the NLI values represent different units of brightness among images. Therefore, the NLIs derived from different nighttime light images may vary on the same land use map and cannot be compared directly. Considering that there were 52, 32 and 26 land use types in three study areas respectively, the R 2 values were satisfactory, showing that the estimated NLIs from NPP/VIIRS data reflected the actual NLIs of different land use types. The higher goodness-of-fit indicated better estimation accuracy and better capability to reflect the relationships between NLI and land use types in our case [33]. Figure 5 shows the scatter diagrams and the linear regression results.  As illustrated in Figure 5, NSEM had the highest R 2 value of regression among the three models, while NLS had the lowest goodness of fit for all three study areas. We also found that the R 2 values in Shenzhen were lower than those of other two areas, as there is a three-year gap between the NPP/VIIRS imagery and LuoJia-01 imagery. However, R 2 values of NSLM and NSEM were still higher than NLS in Shenzhen, which demonstrated that the estimated NLI by NSLM and NSEM models was closer to reality. In general, NSLE and NSEM have a better performance for explaining the relationships between NLI and land use types.
The fitting confidence of the regression models in the three experimental areas were further evaluated via Akaike information criterion (AIC), which is widely used in factor analysis, regression and latent class analysis [57]. AIC is capable of taking both descriptive accuracy and parsimony into account [58]. AIC is defined as Equation (12).
where L i is the maximum likelihood for model i, which is determined by adjusting the free parameters V i to maximize the probability to generate the data observed by the candidate model. The AIC values of NLS, NSL and NSE models are shown in Table 2. From Table 2, we can see that AIC values of NSLM and NSEM were far less than that of NLS; there was no significant difference between NSLM and NSEM. It indicated that NSLM and NSEM had better performances by leveraging the accuracy and the complexity of the model, when compared with NLS. Although our models had more free parameters (i.e., spatial lag variable ρ for NSLM), significant improvement in accuracy can be achieved. The AIC values of NSLM are slightly lower than those of NSEM, which may reflect that the spatial lag effect of NLI was more significant than the spatial error effect in the study areas. In addition to calculating the R 2 between the models and the reference data and the AIC value of the models, we also tested the significance of the regression coefficients (i.e., NLI) of each model. The land use types with large coefficients in the NSLM and NSEM models proposed generally had large t statistics, indicating that they had significant impacts on the brightness value of nighttime lights. Therefore, the proposed two models can measure the NLIs of different land use types more accurately so as to better understand the relationship between night light and land use.

Spatial Correlation of Residual Error Analysis
Since Moran's I analysis in Table A1 of the Appendix A showed a significant spatial autocorrelation effect of the NLIs in the three study areas, the interpretation effect [59,60] on the spatial autocorrelation of NLI values for the three regression models was further compared via regression residuals. The residual within each pixel of different models was calculated by using the NLI of each global land use type and the proportion of land use of each pixel. The residual maps obtained are shown in Figure 6.
As shown in Figure 6, the regression residuals of NLS were larger than that of spatial autoregressive models, and the residuals were more concentrated in spatial distribution, showing a phenomenon of high-high and low-low aggregation. In order to quantitatively measure the interpretation effect of the three models on the global spatial autocorrelation effect, Moran's I values and accumulation of the residuals of each model were calculated, as shown in Table 3 and Figure 7. As shown in Figure 6, the regression residuals of NLS were larger than that of spatial autoregressive models, and the residuals were more concentrated in spatial distribution, showing a phenomenon of high-high and low-low aggregation. In order to quantitatively measure the interpretation effect of the three models on the global spatial autocorrelation effect, Moran's I values and accumulation of the residuals of each model were calculated, as shown in Table 3 and Figure 7.    From Table 3, we can also find that Moran's I values of the regression residuals by NLS were the highest in the three study areas, while those of NSLM and NSEM were much lower. Hence, NSLM and NSEM are more effective than NLS at eliminating the spatial autocorrelation effect of NLI. This result can also be found from the residual accumulation in Figure 7. In all three study areas, the accumulations of the absolute residuals of each pixel for NLS were much larger than those for NSLM and NSEM. This also indicated that NSLM and NSEM had a more accurate fitting effect on the relationship between land use and NLI. From Table 3, we can also find that Moran's I values of the regression residuals by NLS were the highest in the three study areas, while those of NSLM and NSEM were much lower. Hence, NSLM and NSEM are more effective than NLS at eliminating the spatial autocorrelation effect of NLI. This result can also be found from the residual accumulation in Figure 7. In all three study areas, the accumulations of the absolute residuals of each pixel for NLS were much larger than those for NSLM and NSEM. This also indicated that NSLM and NSEM had a more accurate fitting effect on the relationship between land use and NLI.

Nighttime Light Contribution of Urban Land Use Classes
From the NLIs of different land use types, we found that the major sources of nighttime light in the three areas varied, as illustrated in Table A5 in the Appendix A. In Massachusetts, residential and commercial areas were the largest sources. Urban public facilities like the city squares and cultural areas were the lightest in Berlin. In Shenzhen, transportation and industrial areas were the main sources of nighttime light. The darkest parts of three study areas were also different. Old buildings and detached houses were darkest at night in Berlin, while in Massachusetts and Shenzhen, land covered with plants gave out the least light.
Different study areas have different types of brighter or darker land use, which may be due to the following reasons: (1) Different nations are at distinct stages of socioeconomic development. In developed countries, e.g., the United States and Germany, human activities at night mainly link with residential and commercial activities. So, residential land and commercial land use and urban public facilities have higher brightness in the night. On the other hand, China's economy is in a stage of rapid development, and many metropolitan areas are vigorously carrying out urban sprawl, so infrastructure construction, transportation and industrial production are still very active in the night [61,62]. (2) Different countries have different policies on night lighting. Germany has stricter night lighting laws on nighttime light pollution and energy conservation than the United States, so night lighting is mainly concentrated in urban public facilities [10,63]. (3) The times of data collection were different. With development of the economic and technological progress, the role of different land use types in urban developments is gradually changing, and the amount of nighttime activities performed on them is changing accordingly.
To intuitively illustrate the relationship between night lights on reference data and land use type, the top ten lightest and darkest land use types in Berlin through NLI calculated by NSLM (the land use types selected by NSEM and NSLM were same in the three areas) are shown in Figure 8, and the other two study areas are shown in Figures A1 and A2 in the Appendix A.
From Figure 8 we can see that the land use type with large NLI was usually distributed in the bright area in the night light image, while the land use type with the smallest NLI was distributed in the dark area on the edge of the city. Therefore, our method can reflect the correlation between night light and land use, so as to better understand the component of night light. However, there are also some bright parts in the image that are not covered by specified land use types in Figure 8, which may be due to following reasons: (1) There is a difference between the acquisition time of land use and night light, and the areas with higher brightness may be new construction land. (2) NLI calculated by the proposed model reflects the global relationship between land use types and nighttime light. Even though some small areas had strong brightness values in the night light image, the average NLI for this land use type in the whole study area was low, so it was not identified as the top ten lightest. We may divide the study area into small patches to explore the relationships between NLI and land use type in different urban functional zones in future research.  From Figure 8 we can see that the land use type with large NLI was usually distributed in the bright area in the night light image, while the land use type with the smallest NLI was distributed in the dark area on the edge of the city. Therefore, our method can reflect the correlation between night light and land use, so as to better understand the component of night light. However, there are also some bright parts in the image that are not covered by specified land use types in Figure 8, which may be due to following reasons: 1) There is a difference between the acquisition time of land use and night light, and the areas with higher brightness may be new construction land. 2) NLI calculated by the proposed model reflects the global relationship between land use types and nighttime light. Even though some small areas had strong brightness values in the night light image, the average NLI for this land use type in the whole study area was low, so it was not identified as the top ten lightest. We may divide the study area into small patches to explore the relationships between NLI and land use type in different urban functional zones in future research.

Main Cause and Distance Threshold of Spatial Autocorrelation
The threshold of spatial autocorrelation effects and the setting of spatial weight matrixes have a great influence on the solution of the coefficient [45,47]. In order to analyze the application scenarios of NSLM and NSEM, different distance thresholds for calculating the spatial weight matrix were studied according to spatial neighborhood settings in Section 2.2.2. Since the image resolutions of Massachusetts, Berlin and Shenzhen were 1 km, 500 m and 500 m respectively, the distance thresholds of 1.5 pixels (k = 8) and three pixels (k = 24) of the three areas were different. R 2 values of linear fitting NLI values of various land use types were used to analyze the accuracies of the two models, and the results are shown in Table 4.

Main Cause and Distance Threshold of Spatial Autocorrelation
The threshold of spatial autocorrelation effects and the setting of spatial weight matrixes have a great influence on the solution of the coefficient [45,47]. In order to analyze the application scenarios of NSLM and NSEM, different distance thresholds for calculating the spatial weight matrix were studied according to spatial neighborhood settings in Section 2.2.2. Since the image resolutions of Massachusetts, Berlin and Shenzhen were 1 km, 500 m and 500 m respectively, the distance thresholds of 1.5 pixels (k = 8) and three pixels (k = 24) of the three areas were different. R 2 values of linear fitting NLI values of various land use types were used to analyze the accuracies of the two models, and the results are shown in Table 4. As illustrated in Table 4, with the increasing of the distance threshold, the fitting accuracy of the two models decreased significantly. This result shows that the eight first-order adjacent pixels had the greatest influence on the central pixel. A large threshold may degrade the fitting and even generated worse results than the traditional NLS model. This phenomenon indicated that the spatial autocorrelation effect of the night light image had a certain distance threshold, and an overlarge threshold will introduce noise and weaken the weight of the current pixel. Therefore, when studying the correlation between NLI and land use, the distance threshold needs to be carefully considered.
In addition, we also found that when the distance threshold was less than 1.5 km, the fitting accuracy of NSEM was higher than that of NSLM; when the distance threshold was greater than 1.5 km, NSLM had better performance. The difference in the model performance within and beyond the 1.5 km bandwidth may be due to the difference in the dominant role of spatial autocorrelation characteristics. The spatial autocorrelation in the spatial lag model is mainly derived from the dependent variable, while the spatial autocorrelation in the spatial error model is mainly derived from the error, or the independent variable not considered [64,65]. The sensor radiation characteristics or meteorological conditions may be the main contributors to spatial autocorrelation within the distance threshold of 1.5 km, while when the distance threshold is greater than 1.5 km, the dependent variable related to light refraction and reflection may be the dominant factor. The findings from the above experiments may provide guidelines for selecting regression models. When the resolution of night light data is high, the spatial error model may get a better fitting effect. When the resolution of night light data is low, priority should be given to the spatial lag model.

Potential Use Case
The data sources, acquisition time and classification accuracy values of the three study areas were different, so NLIs of the same land use types obtained cannot be compared. In order to study the differences between NLIs of the same land use types in different regions and their influencing factors, we used Finer Resolution Observation and Monitoring of Global Land Cover (FROM-GLC) 2015 data (http://data.ess.tsinghua.edu.cn/fromglc2015_v1.html) from Tsinghua University and the NPP/VIIRS nighttime light images to solve the NLI of each land use type in different cities in China by using NSEM with threshold of the eight nearest neighbors.
FROM-GLC maps are the first 30 m resolution global land cover maps produced using Landsat Thematic Mapper (TM) and Enhanced Thematic Mapper Plus (ETM+) data. We used NSEM to calculate the NLIs of seven first-level classifications of land uses in five provincial capital cities in China. These cities are located in different regions of China with different levels of economic development, which can help us find the relation between NLIs of land use types and economic development. The estimated NLIs are listed in Table 5. As shown in Table 5, there were significant differences in the NLI for each land use type between different cities. All land use types in Shenzhen and Shanghai had higher brightness values, while those in Taiyuan and Harbin were lower. By observing NLIs of the same land use types in different cities, we found that forests and shrubs had relatively small NLIs in each city, while impervious surfaces and bare land had large NLIs. Shenzhen was the only city for which bare land had the largest NLIs; meanwhile impervious surfaces were the lightest land use type in other cities. Moreover, bare land and impervious surfaces had relatively close NLIs in Shenzhen, Shanghai and Wuhan; the brightness of impervious surfaces was significantly higher than those values of other land use types in Taiyuan and Harbin. This phenomenon may reflect the different development speeds and stages of different cities, and Nighttime photographs from the International Space Station shown in Figure 9 can help explain it. Shenzhen is in the midst of rapid urban expansion. The bare lands in the suburbs of the city are construction sites and, therefore, have a higher brightness than the built impermeable surfaces. However, Taiyuan and Harbin have had relatively slow or even stagnant urban expansion in recent years [66][67][68]. Most of the bare lands were stagnant construction land or cultivated land, so the brightness was far lower than for the impervious surfaces. Table 5, there were significant differences in the NLI for each land use type between different cities. All land use types in Shenzhen and Shanghai had higher brightness values, while those in Taiyuan and Harbin were lower. By observing NLIs of the same land use types in different cities, we found that forests and shrubs had relatively small NLIs in each city, while impervious surfaces and bare land had large NLIs. Shenzhen was the only city for which bare land had the largest NLIs; meanwhile impervious surfaces were the lightest land use type in other cities. Moreover, bare land and impervious surfaces had relatively close NLIs in Shenzhen, Shanghai and Wuhan; the brightness of impervious surfaces was significantly higher than those values of other land use types in Taiyuan and Harbin. This phenomenon may reflect the different development speeds and stages of different cities, and Nighttime photographs from the International Space Station shown in Figure 9 can help explain it. Shenzhen is in the midst of rapid urban expansion. The bare lands in the suburbs of the city are construction sites and, therefore, have a higher brightness than the built impermeable surfaces. However, Taiyuan and Harbin have had relatively slow or even stagnant urban expansion in recent years [66][67][68]. Most of the bare lands were stagnant construction land or cultivated land, so the brightness was far lower than for the impervious surfaces.

Shenzhen
Shanghai Wuhan Taiyuan To verify our observation and compare the stage of urban construction, official statistics of the experimental cities [69][70][71][72] were also collected. Harbin was not included in the comparison due to the administrative division adjustment in 2015. The investment completed in the current year and the cost of new construction in 2015 were chosen to study the urban construction situation, which are shown in Table 6. As we can see from the table, new construction projects in Shenzhen and Shanghai accounted for a large proportion in the total investment, about 76% and 64% respectively. This indicated that the urban constructions in Shenzhen and Shanghai were dominated by new construction projects, and plenty of bare lands were under construction in 2015. This is consistent with the modeling and analysis of night light data. Although the proposition of new construction investment in Wuhan is relatively small, the total amount is much higher than those values of other cities. Therefore, the urban sprawl speed is relatively fast in Wuhan, and the bare land may have a higher luminous brightness. In contrast, the new construction projects in Taiyuan accounted for a relatively low proportion in the total investment, and the total cost of new construction is also small compared with Shenzhen, which has a similar area, indicating that urban sprawl does not play a prominent role in Taiyuan.  To verify our observation and compare the stage of urban construction, official statistics of the experimental cities [69][70][71][72] were also collected. Harbin was not included in the comparison due to the administrative division adjustment in 2015. The investment completed in the current year and the cost of new construction in 2015 were chosen to study the urban construction situation, which are shown in Table 6. As we can see from the table, new construction projects in Shenzhen and Shanghai accounted for a large proportion in the total investment, about 76% and 64% respectively. This indicated that the urban constructions in Shenzhen and Shanghai were dominated by new construction projects, and plenty of bare lands were under construction in 2015. This is consistent with the modeling and analysis of night light data. Although the proposition of new construction investment in Wuhan is relatively small, the total amount is much higher than those values of other cities. Therefore, the urban sprawl speed is relatively fast in Wuhan, and the bare land may have a higher luminous brightness. In contrast, the new construction projects in Taiyuan accounted for a relatively low proportion in the total investment, and the total cost of new construction is also small compared with Shenzhen, which has a similar area, indicating that urban sprawl does not play a prominent role in Taiyuan. The above analysis demonstrates that the NLIs calculated by our models can reflect the relation between land use types and night light more accurately than an NLS model with an appropriate distance threshold, and it is helpful to identify urban construction situations. It can better reveal the difference in the contribution of land use to night light so as to discover the differences in economic development stages and night light policies (Section 4.1) between different regions. Moreover, our models could facilitate other potential applications, such as light pollution prevention, traffic volume estimation and GDP composition. For example, when the land use types are of high classification accuracy, the contributions of different buildings types to nighttime brightness can be identified, and they could assist in establishing environmental policies for preventing light pollution in urban residential areas and ecological protection areas. When transportation land types are known in detail, the traffic volume of different transportation modes can be estimated according to NLI. In conclusion, the proposed regression models could be extended in different application scenarios to interpret the influences of human activities on nighttime light.

Conclusions
This study proposed two non-negative spatial autoregressive models to study the correlation of night light with land use types, which used a space adjacency matrixes to consider the spatial autocorrelation effect of nighttime light remote sensing data. Analysis of influential factors in the spatial autocorrelation effect provides a reference for model selection and adjacency matrix parameter setting. Compared with the traditional NLS model, the results of our models are closer to the verified data in the three study areas with the distance threshold of 1.5 pixels. The regression residual and its spatial autocorrelation issue are better eliminated. Meanwhile, the empirical experiments verified that the main source of night light varied in countries and regions due to different levels of economic development. In economically developed countries and regions, commercial land, public facilities and residential land have high brightness; in regions with rapid economic development, traffic, construction and industrial land are the main sources of nighttime light. Therefore, the proposed models can quantify the relation between nighttime light data and land use more accurately, and in turn benefit applications of nighttime light data in light pollution control, traffic volume estimation and urbanization stage analysis.
The paper also has the following shortcomings. (1) The NLIs of surface features were solved as homogeneous global variables in each study area. The value of each pixel in the same land use type may have great variance. So, the proposed method lacks a fine-grained analysis of the difference of NLIs within the same land use types. (2) Results of our models are greatly influenced by experimental data. Experimental dataset of land use have different classification standards, and the acquisition time varies greatly, so the results cannot be used for the comparison between these areas exactly. In order to obtain a generic conclusion on causes of nighttime lighting, further experiments are needed with the support of accurate and detailed experimental data.

Conflicts of Interest:
The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.