Quantitative Analysis of Spatial Heterogeneity and Driving Forces of the Thermal Environment in Urban Built-up Areas: A Case Study in Xi’an, China

: Clarifying the spatial heterogeneity of urban heat island (UHI) effect is of great signiﬁcance for promoting sustainable urban development. A GeoDetector was used to detect the inﬂuential natural and society factors. Natural factors (normalized difference vegetation index ( NDVI ), soil-regulating vegetation index ( SAVI ), normalized building index ( NDBI ), and modiﬁed normalized difference water index ( MNDWI )) as well as society factors (road density ( RDD ), and population density ( POPD )) were selected as driving factors to be tested for their explanatory power for land surface temperature ( LST ). Results indicated that the Moran’s I index value for the LST of the built-up area is 0.778. The top three factors inﬂuencing the LST were NDBI , NDVI , and SAVI , the explanatory power of which was 0.7593, 0.6356, and 0.6356, respectively. The interactive explanatory power for NDBI and MNDWI was 0.8108 and for NDBI and RDD was 0.8002, these two interactions are double enhanced interaction relationships. The results of this study play a guiding role in the development of urban thermal environment regulation schemes and ecological environment planning.


Introduction
Urbanization has become a global trend, and the world has become an urban world [1]. According to the 2018 World Urbanization Prospects [2], more than 55% of the world's population lives in urban areas. The increase in the urban population has contributed to the socio-economic development but has also had negative impacts on the urban system, such as rapid urban expansion, air pollution, traffic congestion, increased land surface temperature (LST), and increased energy consumption [3,4]. One of the most serious problems is the urban heat island (UHI) effect, which severely affects the urban thermal environment [5]. As shown by previous studies, this phenomenon will worsen urban ecology and affect human health [6][7][8]. Therefore, improving and optimizing the urban thermal environment, as well as creating good living spaces, are important for sustainable urban development.
Urban LST is the radiant temperature of the urban surface [8] and is one of the most important parameters for analyzing the urban thermal environment. The LST is commonly used to characterize the urban thermal environment [9] and there have been many studies that have explored the temporal and spatial variation in the urban thermal environment and its driving factors [10][11][12][13][14][15][16]. Chaiyapon et al. [17] studied the urban thermal environment using descriptive statistical variables such as the mean, standard deviation, maximum and minimum LST values. Weng et al. [18] explored the summer thermal environment using a combination of Shannon entropy and Pearson chi-square statistics. Sun et al. [19] quantitatively studied the diurnal and seasonal characteristics of the thermal environment in 245 cities in China. However, many previously related studies focused more on the effect of certain land use types (e.g., green space [20] and waterbody [21]) on LST, and neglected social development factors, which is of prime importance to rapiddeveloping cities. In addition, these studies commonly used traditional mathematical statistical models to analyze the spatial distribution and driving factors of the urban thermal environment, although these offer weak spatial analysis methods, which make it difficult to carry out in-depth research on the mechanism of heterogeneity in the spatial distribution of the urban thermal environment. Therefore, this study utilizes the GeoDetector spatial statistical method to study the spatial distribution heterogeneity of LST and its driving factors in urban built-up areas.
GeoDetector is a spatial statistical method that analyzes the difference between the intra-layer and inter-layer variance to determine the variance in a layer, thereby quantitatively expressing the spatial heterogeneity of the research object [22]. The method detects not only the quantitative data but also the qualitative data of the main drivers, expressing both the spatio-temporal phenomenon and the interaction between different drivers, and this method has a higher explanatory efficiency compared to other spatial heterogeneity detection tools [23]. The method was first applied to disease transmission and causation analysis [24,25], and in recent years, it has been invoked for studying land use change [26,27], vegetation index change [28][29][30], climate and environmental changes [31][32][33], etc. However, these previously related studies commonly used traditional mathematical statical models to analyze the spatial distribution of the urban heat island effect, although these offer a weak link to the true relation to the distribution of the urban heat island effect.
In this study, to address the complex interaction between the LST and various driving factors that lead to the spatial heterogeneity of the UHI pattern as well as to determine the dominant driving factors among diverse potential factors, this study takes a typical rapid-developing city, Xi'an, China, as a study case. The relationship between the spatial pattern of the UHI and its potential factors is analyzed using the GeoDetector method. The research objectives include (1) explore the characteristics of the spatial heterogeneity of urban land surface temperature and its distribution patterns; (2) determine the main driving factors and their influence on the spatial heterogeneity of urban land surface temperature; and (3) quantitatively analyze the influence of the interaction of driving factors on the spatial heterogeneity of surface temperature distribution in urban built-up areas. This study aimed to provide a theoretical frame of reference to practice optimizing the urban thermal environmental management, especially for rapid-developing cities.

Overview of the Study Area
Xi'an is a major city in western China. It is located in the central part of the Guanzhong Plain, south of the Weihe River and north of the Qinling Mountains, with coordinates ranging from 107 • 40 -109 • 49 E and 33 • 39 -34 • 45 N. Xi'an has a temperate climate which is influenced by the East Asian monsoon, classified under the Köppen climate classification as situated on the borderline between a semi-arid climate. The average annual temperature is 13.3 • C, with a frost-free period of 226 days. The elevation is in the range of 410~420 m. The study area was chosen as a built-up area, and it includes six districts as Weiyang District, Baqiao District, Lianhu District, Xincheng District, Beilin District, and Yanta District, covering an area of approximately 457 km 2 ( Figure 1). The main tree species in this area include side cypress, locust, chaste tree, dahurica, birch, oil pine, and hairy poplar. The most common shrub species are chaste tree, yellow poplar, and heather, which account for more than 80% of the city's total shrubs. The herbaceous plants mainly comprise watercress, white trillium, and maidenhair.
As typical of a rapid-developing city, Xi'an became a major target for accelerated attention [34]. In 2020, Xi'an's GDP will grow by 5.2%, with a total of about USD 15.9 billion. Xi'an has entered the "trillion Chinese yuan club" for the first time, and it has also become the first city in Northwest China with a GDP exceeding CNY 1 trillion [35].

Data Sources
Landsat 8 Level 1 data products were used in this study, which were grouped by six images in cloudless weather (cloud cover less than 15%) from the growing season (1 May 2018 to 30 September 2018) and downloaded for free from the US Geological Survey server (USGS, https://www.usgs.gov). A series of processes, including radiometric calibration, FLAASH atmospheric correction [36], and geometric correction, were performed to ensure data quality in the ENVI5.3 software. The population density data were obtained from https://www.worldpop.org at a spatial resolution of 1 km. The road network data of the research area was obtained with a vector format, including four functional classes: Express highway, national highway, provincial roads, and county-level roads, provided by OpenStreetMap (https://www.openstreetmap.org), and road density was obtained using the ArcGIS density analysis tool.

Surface Temperature Computation
The average LST of the selected six scenes of Landsat 8 data were computed to reflect the LST in the study area. For each scene of image, the 10th band of Landsat 8 OLI and TIRS images were used as Equations (1)-(4) [37]: where LST is the true land surface temperature, TB is the bright temperature (K), λ is the effective wavelength (band 10 is 10.9 mm in the Landsat 8 data), σ is the Boltzmann constant (1.  As typical of a rapid-developing city, Xi'an became a major target for accelerated attention [34]. In 2020, Xi'an's GDP will grow by 5.2%, with a total of about USD 15.9 billion. Xi'an has entered the "trillion Chinese yuan club" for the first time, and it has also become the first city in Northwest China with a GDP exceeding CNY 1 trillion [35].

Data Sources
Landsat 8 Level 1 data products were used in this study, which were grouped by six images in cloudless weather (cloud cover less than 15%) from the growing season (1 May 2018 to 30 September 2018) and downloaded for free from the US Geological Survey server (USGS, https://www.usgs.gov (accessed on 5 February 2021)). A series of processes, including radiometric calibration, FLAASH atmospheric correction [36], and geometric correction, were performed to ensure data quality in the ENVI5.3 software. The population density data were obtained from https://www.worldpop.org (accessed on 5 February 2021) at a spatial resolution of 1 km. The road network data of the research area was obtained with a vector format, including four functional classes: Express highway, national highway, provincial roads, and county-level roads, provided by OpenStreetMap (https://www.openstreetmap.org (accessed on 5 February 2021)), and road density was obtained using the ArcGIS density analysis tool.

Surface Temperature Computation
The average LST of the selected six scenes of Landsat 8 data were computed to reflect the LST in the study area. For each scene of image, the 10th band of Landsat 8 OLI and TIRS images were used as Equations (1)-(4) [37]: where LST is the true land surface temperature, T B is the bright temperature (K), λ is the effective wavelength (band 10 is 10.9 mm in the Landsat 8 data), σ is the Boltzmann constant (1.38 × 10 −23 J/K), h is the Plank constant (6.626 × 10 −34 Js), c is the light velocity in a vacuum (2.998 × 10 −8 m/s), and ε is the ground emissivity.
where gains is the gain coefficient (value = 0.0003342), DN is the brightness value, and offsets are the offset coefficients (value = 0.1). The gain and offset coefficients were obtained from the header file of the images. The surface emissivity ε was estimated using the NDVI threshold method [39,40]. The vegetation cover F V was calculated by NDVI using Equation (4) [41]: where NDVI min is the minimum NDVI value (0.2), where pixels are considered bare soil; and NDVI max is the maximum NDVI value (0.5) and pixels are considered healthy vegetation.

Spatial Autocorrelation Analysis
This study justified Moran's I index to analyze the global spatial autocorrelation of LST using the formula in (6).
where x i and x j are the LST at the location of grid i and j, respectively, and i = j; w ij is the distance weight matrix, and n is the number of grids. A Moran's I index >0 and not close to 0 indicates that the data are spatially positively correlated, a Moran's I index close to 0 indicates that the data are not correlated, and a Moran's I index <0 and not close to 0 indicates that the data are spatially negatively correlated.

Hot Spot Analysis
Although Moran's I index can reflect the clustering characteristics of urban LST, it cannot determine whether the spatially aggregated area is aggregated by hot spot values or cold spot values. Therefore, the GetisOrd Gi* statistic is used to identify the statistically significant hot and cold spots [42], and the formula is shown in (7).
where x i and x j are attribute values for features i and j, and w ij is the spatial weight between features i and j. In addition, n is the number of features in the dataset. The G * i statistic is a z-score, so no further calculations are required for a statistically significant positive z-score, the higher the z-score, the tighter the clustering of high values (hot spots). For a statistically significant negative z-score, the lower the z-score, the tighter the clustering of low values (cold spots).

Selection of Driving Factors for LST Change
The drivers of the urban thermal environment are complex and diverse. Based on the results of previous studies on the driving factors of the urban thermal environment [43,44], drivers can be classified into four types: Vegetation, impervious surface, water bodies, and socio-economic characteristics. Among them, vegetation, impervious surfaces, and water bodies are the main drivers of urban surface temperature change [45]. Six driving factors were selected for this study based on a literature review and data availability (Table 1, Figure 2).

Statistical Analysis
First, the ArcGIS 10.2 was used to create a 200 × 200 m fishnet, generating a total of 20,591 grids, and then the extraction analysis tool was used to compute the attribute values of the six driving factors of LST, NDVI, NDBI, MNDWI, SAVI, RDD, and POPD for each fishnet. Since GeoDetector applications require the independent variables to be categorical variables, the attribute values of the six driving factors were reclassified into five categories. It is noteworthy that we choose the discretization method with the maximum q-value [52]. To minimize the uncertainty rising from selections of discretization methods, several discretization methods have been compared, such as the standard deviation method, equal interval method, and natural breaks method. Finally, we choose the natural breaks method with the maximum q-value as the optimal discretization methods.

Driving Factor Analysis
This study uses the GeoDetector method to analyze the spatial heterogeneity and driving factors of LST. Data processing was performed using the free GeoDetector software package (http://www.geodetector.cn/ (accessed on 5 February 2021)), which consists of four main detectors: Factor detector, interaction detector, risk detector, and ecological detector [22,53]. In this study, we mainly applied the factor detector and the interaction detector.
where h ∈ (1, 2, 3 : : : , 6) represents the category index for driving factors. N h is the number of plots in category h for geographical factor X, σ 2 h is the variance of LST in category h of the driving factor, N is the total number of grids of the fishnet (i.e., 20,591 in this study), and σ 2 h is the variance of LST for all grids of the fishnet. The factor detector mainly quantitatively detects the effect of different driving factors on urban LST, where the larger the probability distribution q-value is, the stronger the effect of the influencing factor on LST. The q-value is between 0 to 1; in the extreme case, a value of q equal to 1 indicates that the driving factor fully controls the spatial distribution of LST. A value of q equal to 0 indicates that the driving factor is independent of LST. In other words, the q-value represents the 100*q% of LST explained by the driving factor. Although there may be co-collinearity between the selected variables, it has no effect on the results of the study since each factor is entered individually when calculating the degree of LST explanation.
Interaction detectors are used to identify interactions between different driving factors (X), that is, whether factors X1 and X2 interact to increase or decrease the explanatory power of the dependent variable (Y) or whether the effects of these factors on Y are independent of each other. First, we compute the q-values of the two factors X1 and X2. Then, we superimpose these two factors and compute their q-values (q(X1∩ X2). The relationship between these two factors yields the following five results:

5.
Non-linear enhancement: Q(X1 ∩ X2) > q(X1) + q(X2). Table 2 shows the results of the calculations using GeoDetector, showing the extent of the influence of the six driving factors on the LST (q-value). From Table 2, it can be seen that the influence of NDBI, NDVI, SAVI, and RDD on the spatial heterogeneity of the LST is significant (p < 0.05), while each driving factor (q-value) has a different degree of influence. An evaluation of the single driving factor disclosed that the primary geographical factors are ranked by the value of the q-value as NDBI (0.7593) > NDVI (0.6356) = SAVI (0.6356) > RDD (0.4619) > MNDWI (0.1239) > POPD (0.0352), which indicates that building indexes have the main influences on the spatial distribution of LST in the study area, followed by vegetation indexes. Although the water body and population density affect the spatial distribution of LST, the explanatory power of these individual factors has a small effect (q-value of 0.1239 and 0.0352, respectively).  Figure 3 shows the results from the remote sensing estimation of LST in the built-up area, and the statistical results show that the highest temperature in the built-up area is 37.68 • C, the lowest temperature is 11.12 • C, and the average temperature is 27.29 • C. The high-temperature region is distributed in the north, and the low-temperature region is distributed in the southeast. It can be estimated by Equation (1) that the Moran's I index value for the LST of the built-up area is 0.778, close to 1, and the Z score is 219.85. At the p = 0.00 level, the spatial distribution of LST heterogeneity in the built-up area is significant. The hotspot analysis (Getis-Ord Gi*) using ArcGIS at 90%, 95%, and 99% confidence intervals (Figure 4) shows that the spatial distribution of LST in the built-up area has a high-high (hotspot) or low-low (cold spot) pattern at the time point of this study. The average temperatures of hot spots and cold spots were 29.59 and 23.76 • C, respectively. Hot spots were mainly located in the western part of the study area of Weiyang District and the central part in Xincheng and Lianhu Districts, while cold spots were located in the southeastern part near the Qinling Mountains and the Chan river and Ba river.

Interaction of Driver Factors on LST
The interaction detection analyzes whether the interaction between pairs of the six driving factors (independent variable X) enhances or weakens the explanatory power of the urban land surface temperature (dependent variable Y) by identifying the interactions (independent variable X) or determining that the effects of these driving factors on LST

Interaction of Driver Factors on LST
The interaction detection analyzes whether the interaction between pairs of the six driving factors (independent variable X) enhances or weakens the explanatory power of the urban land surface temperature (dependent variable Y) by identifying the interactions (independent variable X) or determining that the effects of these driving factors on LST

Interaction of Driver Factors on LST
The interaction detection analyzes whether the interaction between pairs of the six driving factors (independent variable X) enhances or weakens the explanatory power of the urban land surface temperature (dependent variable Y) by identifying the interactions (independent variable X) or determining that the effects of these driving factors on LST are independent of each other. The interaction detection results show that the interactions of pairs of the six driving factors are significant (q-values are all greater than 0) for the LST, indicating that no single factor affects the LST. The analysis of the results of the interaction detection of influencing factors found that the main driving factors are mainly the double factor and non-linear enhancements, and there is no independent factor (Table 3). Table 3. Interaction of multiple factors.
The two factors interact and have different magnitudes of explanatory power for LST, as shown by the results of building (NDBI) interactions with other factors: NDBI ∩ MNDWI (0.8108) > NDBI ∩ RDD (0.8002) > NDBI ∩ NDVI (0.7754) = NDBI ∩ SAVI (0.7754) > NDBI ∩ POPD (0.7710). The results showed that the interaction between NDBI and MNDWI (q = 0.8108) had the strongest effect on the spatial distribution of LST, and the interaction between MNDWI and POPD had the lowest effect on the spatial distribution of LST (0.1459), indicating that the combined effect of water bodies and population density had the smallest effect on LST.

Discussion
The study of the spatial distribution pattern and driving factors is of great significance for revealing the internal mechanism of the urban thermal environment and formulating measures for urban ecological and environmental protection. In this study, the LST was obtained by an estimation of Landsat 8 TIRS data, the spatial heterogeneity of LST was analyzed by Moran's I index, and the GetisOrd Gi* statistic, and the driving factors of LST were analyzed by the GeoDetector model. In particular, the effects of interaction on the LST between the driving factors were analyzed. The results of the study are valuable for the formulation of thermal environment management plans and cooling measures in urban built-up areas.
The LST in urban built-up areas has a certain degree of autocorrelation and continuity in its spatial distribution. The most basic assumptions of classical statistics are sampling independence and randomness, and the application of classical statistics to the study of surface temperature inevitably exposes its limitations. Spatial statistical methods are based on the theory of regionalized variables and are advantageous in studying natural phenomena, which are both random and structured in space. This study adopts spatial statistics to show that the Moran's I index value of land surface temperature in the Xi'an built-up area is 0.778, close to 1, with a Z score of 219.85. The spatial autocorrelation of land surface temperature is significant at the p = 0.00 level and exhibits multiple hot spot and cold spot clustering characteristics. This result is similar to the results obtained by Zhang et al. [54] and Cheng et al. [55] in the study of UHI. The advantages of GeoDetector are as follows: (1) The process of using this index for the urban heat island range extraction is more objective, and the results are more accurate and reasonable; (2) the accurate acquisition of the urban heat island range helps determine the spatially precise location for the implementation of heat island mitigation measures; (3) this index can identify the urban heat island while being able to identify the range of the cold island, which helps identify the cooling distribution pattern of urban land properties, thus providing a better basis for the formulation of more targeted urban heat island mitigation measures in order to improve the efficiency of urban heat island regulation.
The spatial heterogeneity of the urban LST is caused by the interaction of a set of driving factors [56]. This study recommends that greater attention should be given to the distribution of the impervious surface with the greatest explanatory power of spatial heterogeneity of LST (Table 2). Specifically, the most significant factor explaining LST is the NDBI factor, followed by the greenness factor. As already proven in previous studies, the impervious surface is the main factor effect the spatial distribution of LST [57]. The MNDWI of the wetness factor does not play a significant role in the influence of spatial heterogeneity of LST in our research area, while the explanatory power of interaction between NDBI and MNDWI is greatest (Table 3). This result might have occurred since the proportions of the waterbody are relatively small, the influence of the waterbody on LST is not significant statistically, while the strong interaction between NDBI and MNDWI proved that waterbodies can absorb heat and mitigate the UHI effect. Furthermore, as an important social development factor, RDD is one of the potential impact factors of spatial heterogeneity of LST and has a strong interaction with NDBI(P(X1 ∩ X2) = 0.8002). This result agreed well with Ying-ying Li et al. [58] in Shanghai, China.
However, there are some shortcomings that need to be further explored in future studies. First, although the main driving factors affecting the urban surface temperature were selected based on previous literature and data availability, the driving factors are small, and more driving factors can be considered in the future. Second, the coverage area of land surface temperature hotspots and cold spots in the built-up area is small, accounting for less than 6% of the total area, the reason which is mainly related to the numerical interval standard of the identified hot and cold spots. Third, the population density (POPD) raster data with a low spatial resolution (1 × 1 km) were selected in this study, which may have influenced the results of the influence study on the surface temperature. In conclusion, relevant driving factors can be considered according to the natural, economic, social, and three-dimensional urban space of the focal cities. In addition, a more comprehensive driving factor index can be constructed for a comprehensive analysis in future studies.

Conclusions
The remote sensing inversion of urban land surface temperature and spatial statistical analysis showed that the spatial distribution of land surface temperature in the built-up area showed a high degree of spatial heterogeneity and was characterized by multiple hot and cold spots. On the whole, the trend is for the high-temperature region of the LST in the built-up area to shift from the city centre to the city edge. The results of the driving factor detection showed that under the effect of a single driving factor, the building (NDBI) was the most important factor affecting the spatial distribution and heterogeneity of LST in the built-up area, followed by vegetation (NDVI), road (RDD), water body (MNDWI), and population (POPD). The results of the driving factor interaction detection showed that the formation of the spatial distribution of LST in urban built-up areas was not influenced by only one driver, but was the result of the joint action of multiple factors. Examining the combined effects of multiple drivers, buildings and vegetation, roads and populations have a significant influence on the spatial distribution of LST. In addition, the interaction of all the driving factors showed a non-linear or bifactor enhanced effect on the spatial distribution of LST compared to the effects of a single driver. The results of this study will be useful for the formulation of urban thermal and ecological planning.