Spatiotemporal Variation of Surface Urban Heat Islands in Relation to Land Cover Composition and Configuration: A Multi-Scale Case Study of Xi’an, China

Urban heat islands (UHI) can lead to multiple adverse impacts, including increased air pollution, morbidity, and energy consumption. The association between UHI effects and land cover characteristics has been extensively studied but is insufficiently understood in inland cities due to their unique urban environments. This study sought to investigate the spatiotemporal variations of the thermal environment and their relationships with land cover composition and configuration in Xi’an, the largest city in northwestern China. Land cover maps were classified and land surface temperature (LST) was estimated using Landsat imagery in six time periods from 1995 to 2020. The variations of surface heat island were captured using multi-temporal LST data and a surface urban heat island intensity (SUHII) indicator. The relationship between land cover features and land surface temperature was analyzed through multi-resolution grids and correlation analysis. The results showed that mean SUHII in the study area increased from 0.683 ◦C in 1995 to 2.759 ◦C in 2020. The densities of impervious surfaces had a stronger impact on LST than green space, with Pearson’s correlation coefficient r ranging from 0.59 to 0.97. The correlation between normalized difference impervious surface index and LST was enhanced with the enlargement of the grid cell size. The correlations between normalized difference vegetation index and LST reached maxima and stabilized at grid cell sizes of 210 and 240 m. Increasing the total area and aggregation level of urban green space alleviated the negative impacts of UHI in the study area. Our results also highlight the necessity of multi-scale analysis for examining the relationships between landscape configuration metrics and LST. These findings improved our understanding of the spatiotemporal variation of the surface urban heat island effect and its relationship with land cover features in a major inland city of China.


Introduction
The proportion of urban population globally is about 55%. It is projected to continue to increase, and nearly 90% of the growth will occur in Asia and Africa [1]. The urbanization process worldwide has caused the alteration of land cover and land use and a series of environmental problems [2]. Among them, a well-known phenomenon is the urban heat island (UHI), in which the temperature in urban areas is significantly higher than nearby rural areas. UHIs can lead to multiple consequences, such as increased air pollution, morbidity, and energy consumption, and thus, have an adverse impact on residents' lives in cities and towns [3]. In particular, the consequences of the UHI effect is expected to be aggravated by global warming and an increasingly urbanizing world [4]. Therefore, the UHI effect has been widely investigated from local to global scales.
The UHI phenomenon can be characterized in two methods: atmospheric UHI and surface UHI. Atmospheric UHI is measured by meteorological stations or via field surveys using equipment mounted on a car, balloon, or aircraft. Surface urban heat island (SUHI) documents the difference in land surface temperatures in cities and nearby rural areas observed by satellite sensors. Due to their repeatable, large-scale observations of Earth's surface, satellites provide a superior data source for land surface temperature (LST) estimation and urban thermal environment characterization [5,6]. Thermal infrared bands of satellite sensors, such as AVHRR, MODIS, Landsat TM/ETM+/TIRS, and ASTER, have been used for LST retrieval and UHI measurement at various spatial and temporal scales [7][8][9][10]. Among them, Landsat satellites provide long, continuous observation of land surface and have been widely used to explore the spatiotemporal evolution of urban thermal environment [11,12].
A variety of indicators have been developed to quantify the spatial pattern and temporal variation of SUHI using land surface temperature. The LST difference between urban and rural areas was a fundamental indicator to identify SUHI variations [5]. One limitation of the urban-rural difference indicator is that the distinction between urban and rural remains confusing and there lacks a standardized method to define urban and rural extents [13,14]. Several other indicators such as the urban-agriculture difference, urban-water difference, Gaussian magnitude, and hot island area were also developed to measure the magnitude of SUHI [15]. To provide an objective protocol for measuring the magnitude of the urban heat island effect, the concept of Local Climate Zones (LCZ) was proposed [16,17]. Under this framework, UHI magnitude is defined as the temperature difference between built-type and natural-type LCZ classes [18,19].
Previous UHI studies have mainly concentrated on big and/or coastal cities where UHI effects are prominent. Imhoff et al. [20] found that the amplitude of UHI is seasonally asymmetric with higher temperature differences in summer than in winter, when controlling for ecological settings, across the 38 most populated urban areas in the conterminous United States. Peng et al. [21] reported that the daytime intensity of SUHI was significantly higher than nighttime across more than 400 big cities globally and that the spatial distribution of SUHI corelates with the differences in vegetation cover between urban and suburban areas. Zhou et al. [22] analyzed SUHI in 32 major cities in China and attributed their spatial variations to such driving forces as vegetation, anthropogenic heat release, climate, and built-up area density. Firozjaei et al. [23] conducted a comparative analysis of surface anthropogenic heat islands across six megacities with varying geographic and climate settings. By comparing the UHI in different cities, these studies revealed the large spatial, diurnal, and seasonal heterogeneity in SUHI across cities due to their different driving variables.
Since land cover characteristics and their changes are closely related to the distribution of land surface temperature, their relationship has been investigated in numerous studies. Landscape composition is the amount of land cover categories within a defined spatial unit. The influences of landscape composition on urban LST were well documented [24,25]. Gogoi et al. found that 25% to 50% of observed overall warming was associated land use and land cover changes over eastern India [26]. Guha et al. analyzed seasonal variability in land surface temperature and their relationship with spectral indices such as normalized difference vegetation index (NDVI) and normalized difference built-up index (NDBI) in Raipur [27]. Logan et al. examined the influence of urban characteristics such as building floor area, NDBI, and NDVI on land surface temperature in four cities across the United States [28]. Using Landsat 8 satellite imagery, Song et al. quantified the effects of building density on land surface temperature across 21 cities in China [29]. Landscape configuration metrics quantify the spatial characteristics or arrangement of land cover patches. Using these metrics, the influences of size, shape, and segmentation of land cover on LST have been analyzed [30,31]. Xie et al. used structural equation models to describe the relationship between land scape indicators such as patch density, largest patch index, aggregation index, and surface temperature [32]. Despite these efforts, the relationship between the UHI and land cover composition and spatial arrangement are not fully understood due to the heterogeneity of urban environments. Previous reported conflict associations between land cover patterns and LST in case studies of different cities [33,34]. They also revealed that the relationship between landscape composition and LST was scale-dependent [35][36][37]. Multi-scale analysis has been necessary to understand the scale-dependent effect of UHI in ecological studies [38,39].
This study takes Xi'an, China, as the study area to analyze the spatiotemporal variations of the thermal environment and their relationships with land cover characteristics using remote sensing data. The relationship between the amount and arrangement of urban green space and impervious surfaces and LST was explored using a multi-scale analysis approach. This study will enhance our understanding of the spatiotemporal variation of SUHI effect and its association with land cover features in an inland city.

Study Area
Xi'an, the largest city in northwestern China, consists of 11 municipal districts and 2 counties. With a total area of 10,752 km 2 , the population in Xi'an reached 10 million in 2018, of which about 75% were urban residents. It has a temperate continental monsoon climate. Winters are dry and cold and summers are rainy and hot. The mean temperature ranges from 26.3 to 26.6 • C in July, and −1.2 to 0.0 • C in January [40].
As the administrative, economic, and cultural center of Shaanxi Province, Xi'an has been one of the fastest urbanizing cities in China. In particular, the China Western Development national strategy has greatly promoted its economic growth since 2000. The rapid urban growth in Xi'an caused a decrease in forest areas, an increase in air pollution, and a degradation of environmental quality [41]. To investigate the spatial and temporal variation of thermal environment in metropolitan areas, the six municipal districts of Xi'an were selected as our study area, including Weiyang, Baqiao, Lianhu, Xincheng, Beilin, Yanta, and their surrounding areas ( Figure 1). density on land surface temperature across 21 cities in China [29]. Landscape configuration metrics quantify the spatial characteristics or arrangement of land cover patches. Using these metrics, the influences of size, shape, and segmentation of land cover on LST have been analyzed [30,31]. Xie et al. used structural equation models to describe the relationship between land scape indicators such as patch density, largest patch index, aggregation index, and surface temperature [32]. Despite these efforts, the relationship between the UHI and land cover composition and spatial arrangement are not fully understood due to the heterogeneity of urban environments. Previous reported conflict associations between land cover patterns and LST in case studies of different cities [33,34]. They also revealed that the relationship between landscape composition and LST was scale-dependent [35][36][37].
Multi-scale analysis has been necessary to understand the scale-dependent effect of UHI in ecological studies [38,39]. This study takes Xi'an, China, as the study area to analyze the spatiotemporal variations of the thermal environment and their relationships with land cover characteristics using remote sensing data. The relationship between the amount and arrangement of urban green space and impervious surfaces and LST was explored using a multi-scale analysis approach. This study will enhance our understanding of the spatiotemporal variation of SUHI effect and its association with land cover features in an inland city.

Study Area
Xi'an, the largest city in northwestern China, consists of 11 municipal districts and 2 counties. With a total area of 10,752 km 2 , the population in Xi'an reached 10 million in 2018, of which about 75% were urban residents. It has a temperate continental monsoon climate. Winters are dry and cold and summers are rainy and hot. The mean temperature ranges from 26.3 to 26.6 °C in July, and −1.2 to 0.0 °C in January [40].
As the administrative, economic, and cultural center of Shaanxi Province, Xi'an has been one of the fastest urbanizing cities in China. In particular, the China Western Development national strategy has greatly promoted its economic growth since 2000. The rapid urban growth in Xi'an caused a decrease in forest areas, an increase in air pollution, and a degradation of environmental quality [41]. To investigate the spatial and temporal variation of thermal environment in metropolitan areas, the six municipal districts of Xi'an were selected as our study area, including Weiyang, Baqiao, Lianhu, Xincheng, Beilin, Yanta, and their surrounding areas ( Figure 1).

Satellite Data
Six Landsat images acquired in 1995,2000,2004,2011,2014, and 2020 were used to depict the land cover and thermal environment in the study area with a five-year interval since 1995 (Table 1). Since climate conditions such as air temperature, humidity, and cloudiness vary least in spring in the study area, satellite images in April and May under clear sky conditions were selected for each year to reduce the variation of the atmospheric background [42]. The images were acquired by Landsat 5 TM, Landsat 7 ETM+, and Landsat 8 OLI/TIRS, respectively. The Level-1 Landsat products (L1TP) were obtained from the USGS Earth Explorer (https://earthexplorer.usgs.gov/). Atmospheric correction was conducted on the Landsat data by using the FLAASH module in ENVI 5.3 software. Due to a malfunction of the Scan Line Corrector, Landsat 7 ETM+ imagery acquired after May 2003 was flawed. In each image scene, about 20% of pixels were lost. We filled the gaps in the Landsat 7 ETM+ image in 2004 using the Landsat Gapfill tool in the ENVI software.

Land Cover Classification
A random forest classifier was implemented to obtain the land cover maps in the study area. Land cover types in the classification included built-up area, bare land, green space, and water. In addition to original data, normalized difference impervious surface index (NDISI) [43], normalized difference vegetation index (NDVI) [44], and normalized water index (NDWI) [45] were derived as additional input features for the classification.
Afterward, post-classification correction was performed manually to further improve the classification results. Following the "good practice" recommendations for accuracy assessment, a stratified random sampling method with mapped classes as strata was implemented to assess the overall accuracy of classified land cover maps quantitively [46,47]. To obtain a target standard error for overall accuracy of 0.01, the sample sizes range approximately from 500 (user's accuracy 95%) to 900 (user's accuracy 90%) [48]. A minimum of 50 samples per land cover category is also desirable to ensure a sufficient representation of rare classes [48]. Hence, a total of 700 random sample points were collected as the validation dataset for each Landsat scene [49]. Among the sample points, 300 were assigned to built-up area, 150 to bare land, 200 to green space, and 50 to water. Reference ground truth information was collected from high resolution satellite images of Google Earth and the pan-sharpened images of the Landsat data.
Two metrics, the annual change area (ACA, km 2 ) and annual rate of change (ACR, %), were used to measure the amount and rate of land cover change for each land cover category. The ACA and ACR are defined as follows: where A start and A end represent the areas of an individual land cover type at the start and end time point, respectively, and d is the time period from the start to the end time point. The ACA and ACR were calculated for each land cover type from 1995 to 2020 using the derived multi-temporal land cover maps.

Land Surface Temperature Retrieval
The digital numbers of the thermal bands from the original Landsat data (Band 6 in Landsat TM and ETM+, and Band 10 in Landsat TIRS) were converted to absolute radiance values. Then, the radiance values were converted to at-satellite brightness temperature in degrees Kelvin. Finally, the land surface temperature was calculated using: where T B is brightness temperature of the thermal bands; λ is wavelength of emitted radiance (λ = 11.457, 11.270, and 10.904 for Landsat TM, ETM+, and TIRS B10, respectively); ρ = h × c/σ, 1.438 × 10 −2 mk, where σ is Boltzmann constant, h is Planck's radiation constant, c is velocity of light, and ε is the land surface emissivity [50].
By considering an image pixel as a mixture of bare soil and vegetation, land surface emissivity at the pixel level can be modeled using vegetation proportion and emissivities of vegetation of soil [50]. Land surface emissivity ε can be obtained from NDVI using a NDVI threshold method [51]. In this study, the land surface emissivity ε was estimated using a simplified equation [52]: where P v is the vegetation coverage. By using typical constant values for the emissivities of vegetation and soil, m and n were calculated as 0.004 and 0.986, respectively [52]. The NDVI was derived using the red and near-infrared bands of Landsat data. Using NDVI data, vegetation coverage P v was derived using the following equation: where NDVI max and NDVI min are the maximum and minimum vegetation index in the study area, respectively. Due to the atmospheric background differences among satellite image scenes, LST cannot be used directly to compare UHI in the study area. To reduce the differences in the atmospheric background among the Landsat scenes, a normalization procedure was performed to rescale the LST to a range between 0 and 1 using the following equation [53]: where NLST i refers to the normalized land surface temperature at pixel i, LST i is the land surface temperature at pixel i, and LST min and LST max refer to the minimum and maximum LST in the study area, respectively. To remove the impacts of outliers, we chose the 5th and 95th percentile of each LST dataset as the maximum and minimum land surface temperature values. Due to the variation of LST during the study period, the maximum and minimum values of LST were calculated for each Landsat image.

Surface Urban Heat Island Intensity
Following the LCZ concept, the surface UHI intensity was defined as the land surface temperature difference between built-up and green space areas in this study [19,54]. At each built-up area pixel, the SUHII was calculated using the following formula: where T bui is land surface temperature at pixel i and T gs is mean land surface temperature of green space pixels. The SUHII of the study area were further derived as: where SUHII i is the surface urban heat island at pixel i and n is the total number of built-up area pixels.

Land cover composition analysis
In this study, the densities of green space and built-up area were used as landscape composition metrics. The influence of the density of impervious surfaces and green space on LST was examined based on grids at different spatial scales. Using the fishnet tool, we created square grids with cells of 30, 180, 270, 360, and 450m covering the entire study area [34]. The average values of LST, NDVI, and NDISI were calculated in each fishnet grid. Grid units that contained water pixels were excluded in the analysis.

Land Cover Configuration Analysis
The main characteristics describing the spatial pattern of land cover, including edge density, fragmentation, and the abundance, size, and shape of patches, can be represented by several spatial configuration landscape metrics [55]. As the basic elements that make up a landscape, patches represent relatively discrete areas of relatively homogeneous land cover types where patch boundaries are distinguished by discontinuities from their surroundings. A landscape contains a hierarchy of patch mosaics across a range of scales. Considering their potential effects on LST, interpretability, and redundancy, six landscape metrics were selected in our study ( Table 2): percentage of landscape (PLAND), largest patch index (LPI), mean patch size (AREA_MN), edge density (ED), mean patch shape index (SHAPE_MN), and aggregation index (AI).
The PLAND refers to the areal percentage of all patches of a certain patch type in the total landscape area. It quantifies the proportional abundance of a certain patch type in the landscape. The LPI is the area percentage of the largest patch in the landscape in the total landscape area. As a simple measure of dominance, the LPI approaches 100 when the largest patch comprises most of the landscape. The AREA_MN is the sum of area across all patches of the corresponding patch type divided by the number of patches. The ED refers to the sum of the lengths of all edge segments involving the corresponding patch type, divided by the total landscape area. The shape index is a straightforward measure of shape complexity, and SHAPE_MN increases as shapes of patches become more irregular. The AI equals the number of like adjacencies involving the corresponding class, divided by the maximum possible number of like adjacencies. AI is close to 0 when the focal patch type is maximally disaggregated and increases when the patch type is aggregated into a single, compact patch. Table 2. Description of landscape metrics [55].

Landscape Metrics Abbreviation Formula Description
Percentage of landscape PLAND The spatial metric values were produced using Fragstats 4.2 software. A patch of a given land cover type was delineated as a cluster of pixels of the same land cove type which are contiguous using an 8-cell neighbor rule. The mean LST and spatial metrics in grid cells with a size of 3 × 3 km were used to analyze the influence of landscape configuration on the LST in the study area [34]. For each grid cell, the spatial metrics were computed for individual patches and summarized at the class level.

Statistical Analysis
In this study, a commonly used statistical method, bivariate correlation analysis, was used to quantify the relationships between LST and land cover compositions. In specific, Pearson's correlation coefficient r was calculated to quantity the intensity of the correlations between LST and land composition variables. We randomly selected 1000 samples to perform the correlation analysis between LST and land cover composition features at different spatial scales.
Landscape configuration can affect the land surface temperature independently as well as interactively. It is essential to control the effects of the amount of land cover when measuring the impacts of their spatial configuration on LST [35]. The partial correlation method was employed to measure the strength and direction of correlations between LST and the spatial metrics of built-up areas and green space when controlling the effect of percentage coverage of the land cover type. LST was set as the dependent variable and all spatial pattern metrics were set as independent variables. The relationship between LST and each spatial pattern metric was quantified, while controlling for the PLAND variable.

Land Cover Maps
The land cover of the study area was mapped for six time periods, namely, 1995, 2000, 2005, 2010, 2015, and 2020 ( Figure 2). The accuracy for each classified image was evaluated using quantitative metrics, including users' accuracy, producers' accuracy, overall accuracy, and kappa coefficient (Table 3). The overall accuracies of the land cover maps range from 87.91% to 94.38%. The Kappa index values range from 0.83 to 0.92. For each analyzed year, an overall classification accuracy over 85% was obtained. The relatively low accuracy for year 2005 might be caused by the image gap filling process. Therefore, the classification maps were considered suitable for analyzing urban expansion and land cover changes.  The areas and changes of each land cover type are shown in Table 4. An obvious increasing trend was observed for built-up areas, which increased from 413.74 in 1995 to 1022.75 km 2 in 2020 in the study area. The area of green space and water bodies significantly decreased from 1995 to 2020. A total area of 616.42 km 2 of vegetated area was converted into other land uses from 1995 to 2020. Green space mainly composed of agricultural lands in suburban areas was changed into construction land due to urbanization.   The overall accuracies of the land cover maps range from 87.91% to 94.38%. The Kappa index values range from 0.83 to 0.92. For each analyzed year, an overall classification accuracy over 85% was obtained. The relatively low accuracy for year 2005 might be caused by the image gap filling process. Therefore, the classification maps were considered suitable for analyzing urban expansion and land cover changes.
The areas and changes of each land cover type are shown in Table 4. An obvious increasing trend was observed for built-up areas, which increased from 413.74 in 1995 to 1022.75 km 2 in 2020 in the study area. The area of green space and water bodies significantly decreased from 1995 to 2020. A total area of 616.42 km 2 of vegetated area was converted into other land uses from 1995 to 2020. Green space mainly composed of agricultural lands in suburban areas was changed into construction land due to urbanization.

LST Variation and Its Relationship with Land Cover
The normalized land surface temperatures (NLST) in the study area were mapped and illustrated in Figure 3. The areas with high temperature values (over 0.5) were distributed mainly around the center of the city. The suburban and rural areas surrounding the city core were mainly covered with low and medium temperatures. The changes in the spatial pattern of NLST revealed that the areas with high temperatures increased remarkably with the sprawl of urban lands from 1995 to 2020.

LST Variation and Its Relationship with Land Cover
The normalized land surface temperatures (NLST) in the study area were mapped and illustrated in Figure 3. The areas with high temperature values (over 0.5) were distributed mainly around the center of the city. The suburban and rural areas surrounding the city core were mainly covered with low and medium temperatures. The changes in the spatial pattern of NLST revealed that the areas with high temperatures increased remarkably with the sprawl of urban lands from 1995 to 2020. The association between NLST and land cover types across the six periods is further analyzed (Table 5). Generally, built-up areas and bare land had higher average NLST than vegetation and water. The relatively high LSTs in built-up areas were mainly associated with its different thermal characteristics from other land cover types. The buildings and paved surfaces in urban areas tend to cause greater absorption of solar radiation, greater retention of infrared radiation, and delayed release of heat [16]. Heat and moisture released from human activity, energy consumption, and transportation also contribute to the changes of LST in urban areas [16]. The temporal evolution of surface temperature showed an increasing trend for all land cover types from 1995 to 2020. The association between NLST and land cover types across the six periods is further analyzed (Table 5). Generally, built-up areas and bare land had higher average NLST than vegetation and water. The relatively high LSTs in built-up areas were mainly associated with its different thermal characteristics from other land cover types. The buildings and paved surfaces in urban areas tend to cause greater absorption of solar radiation, greater retention of infrared radiation, and delayed release of heat [16]. Heat and moisture released from human activity, energy consumption, and transportation also contribute to the changes of LST in urban areas [16]. The temporal evolution of surface temperature showed an increasing trend for all land cover types from 1995 to 2020. The boxplots of NLST for each land cover type are presented in Figure 4. Significant differences in the average NLST of land cover types were observed. A relatively large range of temperature variation was observed for vegetation and water. The LST variation in vegetated areas might be attributed to the variability of vegetation densities in the metropolitan area. The negative relationship between vegetation abundance and land surface temperature has been observed by many investigators [56,57]. The water body area of Xi'an city is small, and the main water bodies are the rivers in the suburban area. The spatial resolution of Landsat thermal band ranges from 60 to 120 m. As a result of the presence of mixed pixels, the land surface temperatures of rivers on the Landsat imagery were biased by the background temperature such as bare land [58]. This might lead to the high NLST values observed in the land cover type of water. The boxplots of NLST for each land cover type are presented in Figure 4. Significant differences in the average NLST of land cover types were observed. A relatively large range of temperature variation was observed for vegetation and water. The LST variation in vegetated areas might be attributed to the variability of vegetation densities in the metropolitan area. The negative relationship between vegetation abundance and land surface temperature has been observed by many investigators [56,57]. The water body area of Xi'an city is small, and the main water bodies are the rivers in the suburban area. The spatial resolution of Landsat thermal band ranges from 60 to 120 m. As a result of the presence of mixed pixels, the land surface temperatures of rivers on the Landsat imagery were biased by the background temperature such as bare land [58]. This might lead to the high NLST values observed in the land cover type of water.  Figure 5 displays the spatial pattern of SUHII in different time periods in the study area. The expansion of the urban heat island corresponds to the urban spatial growth pattern. With the sprawl of urban construction lands, new hot spots of heat island appeared in the newly built areas surrounding the city center. The SUHII values of these areas were even higher than the urban core.  The mean SUHII in the study area increased from 0.683 °C in 1995 to 2.759 °C in 2020 ( Table 6). The standard deviation showed a downward trend from 1995 to 2020. This indicated that the variability of land surface temperature in the built-up areas was gradually decreasing, while the overall surface urban heat island magnitude was increasing.

Relationship between LST and Land Cover Composition
The multi-scale grids analysis suggests that the mean LST is correlated with density of impervious surface (NDISI) positively and density of vegetation (NDVI) negatively. The correlation coefficients between NDISI and NDVI and mean LST were significant at the 0.01 significance level across the five grid cell sizes ( Figure 6). In addition, the correlation between NDVI and mean LST are weaker than that between NDISI and mean LST for most observations during the study period.
The increase in the correlation coefficients between NDISI and LST indicates that the effect of impervious surfaces on LST becomes stronger with an increase in the grid cell size. The correlation between NDVI and mean LST increases with the enlargement of the grid cell size at the beginning, while reaching the maximum magnitude and showing a stable or slight decreasing trend exceeding the window size around 7 × 7 or 8 × 8 ( Figure 6). A similar pattern was revealed for the relations between NDISI and LST in the years 2010, 2015, and 2020. The relationship suggests that dense vegetation can produce a stronger cooling effect, while the cooling effect may decay greatly beyond a distance threshold. The mean SUHII in the study area increased from 0.683 • C in 1995 to 2.759 • C in 2020 ( Table 6). The standard deviation showed a downward trend from 1995 to 2020. This indicated that the variability of land surface temperature in the built-up areas was gradually decreasing, while the overall surface urban heat island magnitude was increasing.

Relationship between LST and Land Cover Composition
The multi-scale grids analysis suggests that the mean LST is correlated with density of impervious surface (NDISI) positively and density of vegetation (NDVI) negatively. The correlation coefficients between NDISI and NDVI and mean LST were significant at the 0.01 significance level across the five grid cell sizes ( Figure 6). In addition, the correlation between NDVI and mean LST are weaker than that between NDISI and mean LST for most observations during the study period.
The increase in the correlation coefficients between NDISI and LST indicates that the effect of impervious surfaces on LST becomes stronger with an increase in the grid cell size. The correlation between NDVI and mean LST increases with the enlargement of the grid cell size at the beginning, while reaching the maximum magnitude and showing a stable or slight decreasing trend exceeding the window size around 7 × 7 or 8 × 8 ( Figure 6). A similar pattern was revealed for the relations between NDISI and LST in the years 2010, 2015, and 2020. The relationship suggests that dense vegetation can produce a stronger cooling effect, while the cooling effect may decay greatly beyond a distance threshold.

Relationship between LST and Land Cover Configuration
The mean values for the spatial metrics of built-up areas all exhibited an increasing trend from 1995 to 2020 (Table 7). The PLAND, LPI, and AREA_MN exhibited an increasing trend during the study period. The AI also showed an increasing trend, while the ED and SHAPE_MN exhibited slight changes. The PLAND and LPI have a significant positive relationship with LST, indicating that LST increases with an increase in areal percentage of building patches. The AREA_MN and ED had an insignificant relationship with LST for most periods. The SHAPE_MN and AI had a negative insignificant relationship with LST, while the relationship became insignificant after 2010. The mean values for the spatial metrics of green space were derived for each time period ( Table  8). The PLAND, LPI, and AREA_MN showed a consistent downward trend, which indicated a significant reduction in green space in the study area. This corresponds with the multi-temporal land cover classification results in the study area. The LPI of green space decreased from 53.82% in 1995 to 21.41% in 2020. The edge density and mean shape index showed slight changes. The decrease in AI implies that the green space patches became more fragmented during the study period. There was a significant negative correlation between the PLAND and LST. The relationship of LPI, AREA_MN, Figure 6. Correlation between NDISI and NDVI and mean LST across grid cell sizes. All correlations were significant at the 0.01 level (2-tailed). The orange and green lines represent NDISI and NDVI, respectively.

Relationship between LST and Land Cover Configuration
The mean values for the spatial metrics of built-up areas all exhibited an increasing trend from 1995 to 2020 (Table 7). The PLAND, LPI, and AREA_MN exhibited an increasing trend during the study period. The AI also showed an increasing trend, while the ED and SHAPE_MN exhibited slight changes. The PLAND and LPI have a significant positive relationship with LST, indicating that LST increases with an increase in areal percentage of building patches. The AREA_MN and ED had an insignificant relationship with LST for most periods. The SHAPE_MN and AI had a negative insignificant relationship with LST, while the relationship became insignificant after 2010.
The mean values for the spatial metrics of green space were derived for each time period (Table 8). The PLAND, LPI, and AREA_MN showed a consistent downward trend, which indicated a significant reduction in green space in the study area. This corresponds with the multi-temporal land cover classification results in the study area. The LPI of green space decreased from 53.82% in 1995 to 21.41% in 2020. The edge density and mean shape index showed slight changes. The decrease in AI implies that the green space patches became more fragmented during the study period. There was a significant negative correlation between the PLAND and LST. The relationship of LPI, AREA_MN, and ED with LST varied in different periods. The negative correlations between AI and LST indicated the negative effect of higher aggregation on LST.

Discussion
The average of NLST in built-up areas and bare land is consistently higher than vegetation and water bodies. Compared to impervious surfaces such as roads, pavements, buildings, and parking lots, vegetated areas can emit lower radiance, increase evapotranspiration, and provide shading from canopies, and thus, reduce their surrounding temperature [59][60][61]. The increasing trend of NLST across all land cover types indicates an overall trend of surface warming during the study period. The SUHII of the study area is 2.759 • C in 2020, which is comparable to other big cities in China. For example, the mean LST of impervious surface were 2.8 and 3.4 • C higher than the mean LST of green space in Guangzhou and Beijing, respectively [62,63]. Urbanization led to an expansion of built-up areas and increasing human activities in the study area. The increase in heat-absorbing manmade materials, reduction in natural vegetation, and increase in anthropogenic heat release during urbanization caused the increase in NLST across time [64].
The higher correlation between LST with NDISI than NDVI confirms earlier findings that built-up areas have a stronger impact on LST in comparison with green space [35,65]. The analysis results indicate that the correlation between NDISI and LST increases with an increase in window size, whereas the optimal grid cell size of the green space that influences LST effectively is from 210 to 240 m. The result agrees with findings of previous studies that used a comparable multi-scale grid analysis approach [66,67]. A previous study by Xiao et al. [66] reported that correlation between density of built-up areas and LST was enhanced as the size of analysis unit grew from 30 to 960 m in a UHI study in Beijing, China. The changing trend of the correlation between green space density and mean LST implies existence of a distance threshold for cooling effect of green space. Based on the analysis of air temperature recorded at weather stations, Myint et al. [67] also found that the correlation between impervious surfaces and maximum air temperatures decreased after reaching a window size of 210 m in Phoenix, Arizona, USA. The impacts of increasing urban green infrastructure on local meteorology have been simulated using statistical and numerical models in urban and green space planning for heat mitigation strategies [68]. The optimal cell grid size should be considered for the modeling of environmental parameters between LST and land cover. Grids with a coarse spatial resolution in numerical models may fail to capture the cooling effects of green areas [69].
The land cover configuration analysis reveals that large amounts of built-up area induced stronger heat island effects, while the shape of the built-up area had an insignificant effect on LST. Increasing the total area of green space significantly mitigated the negative impact of UHI in the study area. The influence of the shape and arrangement of green space on temperature reduction has been widely investigated and contrasting findings have been reported in case studies of different cities. Li found that LST increased significantly with higher patch density, given a fixed amount of greenspace in the Beijing metropolitan area [38]. In contrast, Peng et al. claimed a positive correlation between LST and the shape and fragmentation index of vegetated land in Beijing [70]. Asgarian found a complex patch shape with highly convoluted edges had stronger mitigating effects on land surface temperature in Isfahan, Iran [71]. Using analytical units with the largest size of 1080 × 1080 m, Zhou et al. reported that mean patch size and edge density of trees had negative and positive effects on LST in Sacramento, CA, USA with a Mediterranean climate with hot and dry summers, respectively [36]. With 240 × 240 m grids, Masoudi et al. [72] analyzed the complex relationship between green space pattern and LST in four major cities in Asia and found that configuration was not an influencing factor of LST in Kuala Lumpur and Hong Kong, while simply shaped, more aggregated, less fragmented patches of green space provided better cooling effects in Jakarta and Singapore. The analysis units varied from patches or census tracts to multi-scale grid cells in these studies. Since landscape metrics are sensitive to the size and shape of the calculation extent, the landscape metric values calculated using analysis units with different sizes or irregular shapes are incomparable [56,73,74]. The study by Estoque et al., which used the same analysis unit as the present study, revealed that aggregation had consistent and strong correlation with LST in the three Asian megacities, Bangkok, Jakarta, and Manila [34]. This agrees with our results that aggregated green space had better cooling effects than fragmented green space. However, Wesley and Brunsell claimed that the selection of window size in the above studies lacked biophysical justification and might fail to represent the characteristic scales of LST-green space interaction. They conducted a multi-resolution wavelet analysis to identify the dominant scales of LST variation for each image pixel and used them as the calculation extents of landscape metrics [74]. Despite the inconsistencies in the previous studies, the consensus is that the contribution of green space configuration to SUHI is weak, comparing with the percentage of vegetation [74].
In spite of these findings, several limitations of our study should be considered. The findings are limited to one inland city in northwestern China. The relationship between LST and spatial configurations of urban green space may vary in different cities. To fully understand the impacts of the spatial arrangement of urban land cover on LST, their relationship should be investigated by comparing multiples cities under different settings [75]. Moreover, only land cover characteristics were examined in this study; in addition to their spatial arrangements, the impact of other factors such as the functional types and heights of vegetation should also be investigated due to the cooling effects of green spaces in urban environments [76].

Conclusions
The spatiotemporal variations of the thermal environment and their relationships with land cover characteristics were investigated with remote sensing data in Xi'an, a fast-growing city in northwestern China. The land cover maps were classified and land surface temperature was estimated using Landsat imagery for six time periods with a five-year interval from 1995 to 2020. The built-up area increased from 413.74 to 1022.75 km 2 , and the area of green space and water bodies significantly decreased from 1995 to 2020. An overall trend of surface warming was revealed during 1995-2020, with the increase in mean SUHII from 0.683 • C in 1995 to 2.759 • C in 2020. The effects of built-up areas and vegetated areas on LST were analyzed based on polygon grids at different spatial scales. The results indicated that the densities of built-up areas had a stronger impact on land surface temperature than green space. The correlation between NDISI and LST was enhanced with the enlargement of the grid cell size. The correlations between NDVI and LST reached their maxima at the grid cell sizes of 210 and 240 m. Increasing the total area and aggregation level of green space can mitigate the negative impact of urban heat islands. These results enhanced our understanding of spatiotemporal variations of thermal environment and their association with land cover features in an inland city with a temperate continental monsoon climate. The main findings can provide implications for urban planners on UHI mitigation and land resource utilization in the study area.
Our study confirmed that multi-scale analysis methods should be applied for the exploration of the impacts of land cover patterns on SUHI in future studies. Moreover, the impacts of the spatial arrangement and features of green space on urban thermal environment are complex and should be further investigated. The warming and cooling effects of land cover and their characteristics can be explored using climate and statistical model simulations for the building of sustainable and resilient urban landscapes [77][78][79].