Driving Mechanism of Differentiation in Urban Thermal Environment during Rapid Urbanization

: To achieve sustainable urban development, it is essential to gain insight into the spatial and temporal differentiation characteristics and the driving mechanisms of the urban thermal environment (UTE). As urbanization continues to accelerate, human activity and landscape conﬁguration and composition interact to complicate the UTE. However, the differences in UTE-driven mechanisms at different stages of urbanization remain unclear. In this study, the UTE of Shenyang was measured quantitatively by using the land surface temperature (LST). The spatial and temporal differentiation characteristics were chronologically studied using the standard deviation ellipse (SDE) and hotspot analysis (Getis–Ord Gi*). Then, the relationship between human activities, landscape composition and landscape conﬁguration and LST was explored in a hierarchical manner by applying the geographical detector. The results show that the UTE in Shenyang continues to deteriorate with rapid urbanization, with signiﬁcant spatial and temporal differentiation characteristics. The class-level landscape conﬁguration is more important than that at the landscape level when studying UTE-driven mechanisms. At the class level, the increased area and abundance of cropland can effectively reduce LST, while those of impervious surfaces can increase LST. At the landscape level, LST is mainly inﬂuenced by landscape composition and human activities. Due to rapid urbanization, the nonlinear relationship between most drivers and LST shifts to near-linear. In the later stage of urbanization, more attention needs to be paid to the effect of the interaction of drivers on LST. At the class level, the interaction between landscape conﬁguration indices for impervious surfaces, cropland and water signiﬁcantly inﬂuenced LST. At the landscape level, the interactions among the normalized difference building index (NDBI) and other selected factors are signiﬁcant. The ﬁndings of this study can contribute to the development of urban planning strategies to optimize the UTE for different stages of urbanization.


Introduction
The physical environment associated with heat in urban areas is known as the urban thermal environment (UTE) and can affect human thermal comfort and physical conditions [1,2]. During the rapid urbanization process, the changing urban form, population migration and uncontrolled land exploitation cause alterations in the surface energy balance, which in turn leads to the continuous deterioration of the UTE [3,4]. Statistics show that the human-induced global land surface temperature has increased by 1.07 • C from 1850-1900 to 2010-2019 [5]. As a result, many ecological and social problems have arisen, such as ecosystem destruction, increased energy consumption and threats to human health [6][7][8]. The UTE should therefore be an important research object for sustainable urban development [9].
Studies increasingly recognize that investigating the spatial and temporal characteristics of the UTE and its driving mechanisms is a prerequisite for mitigating its deterioration [10,11]. Although station observations, model simulations and mobile measurements are the main methods to obtain the local thermal environment, these methods hardly reflect the spatial and temporal characteristics of the UTE on a macroscale [12]. Studying urban scales, however, helps to understand the unique surface structure of a city [13]. Recently, the popularity of satellite remote sensing has facilitated access to the acquisition of geospatial information related to land cover, especially in large-scale and long-term monitoring [14]. It has become common to quantify the spatial and temporal characteristics of the UTE and the mechanisms driving it using land surface temperature (LST) obtained from satellite thermal infrared remote sensing [11].
LST is a complex defined by multiple factors [12,15], such as human activities and landscape composition and configuration. Researchers often describe the biophysical characteristics of the land using landscape composition [16]. Land covered by buildings usually increases LST, while the opposite is true for land covered by vegetation and water bodies [17]. Studies have demonstrated that the role of landscape composition in LST can be quantified using the normalized difference building index (NDBI), fractional vegetation cover (FVC) and modified normalized difference water index (MNDWI) [18,19].
Human beings tend to release large amounts of anthropogenic heat through intensive and continuous activities on the ground, resulting in a rise in LST [20]. Population density and nighttime light data are considered suitable for analyzing the contribution of human activities to LST [21,22]. Moreover, the combination of landscape composition and human activities can jointly facilitate the spatial arrangement of the entire landscape, i.e., landscape configuration [23]. The efficiency of energy exchange and surface energy fluxes between landscape patches can be influenced by their spatial configuration, which in turn leads to differences in the distribution of LST [15].
Most studies have used landscape pattern indices to examine the role of the spatial configuration of landscape patches, including area factors, shape factors, aggregation factors and diversity factors [15,24]. It is important to note that the landscape pattern indices are multidimensional and usually include landscape-level indices and class-level indices in studies [25]. Landscape-level indices are applied to reflect the spatial configuration of multi-class mosaics of landscape patches, but it is difficult to show the morphological characteristics of each landscape patch [26].
Class-level indices can provide more detailed information on the spatial pattern of each type of landscape patch than landscape-level indices [26]. The stratified nature of ecological processes in the landscape may influence LST [15]. As a result, a growing number of researchers have investigated the impact of landscape pattern indices that combine both the landscape level and the class level on LST. For example, by studying the relationship between the spatial configuration of heat source and heat sink landscapes and LST in Zhengzhou over an 18-year period, Zhao et al. [27] found that both landscape-level and class-level landscape pattern indices have a significant impact on LST and can inform urban planning based on the results of the different levels. Furthermore, there is a consensus among researchers that it is more important to optimize the landscape composition than to optimize the landscape configuration at different levels when reducing LST [15,27]. However, the difference in the importance of landscape pattern indices at the landscape level and class level in reducing LST has not received much attention. They represent different spatial hierarchies of landscape configuration and may contribute differently to LST [28]. By comparing the differences in the contributions of landscape-level and class-level landscape pattern indices to LST, it is possible to clarify whether the importance of landscape configuration in reducing LST varies from level to level and thus to inform the adjustment of urban planning strategies [28,29].
In fact, quantifying the contribution of drivers is challenging because their effects on LST may be the result of a nonlinear interaction process [30]. For instance, by studying the effect of urban landscape structure on LST, Zhou et al. [31] found interactions among some landscape indices; i.e., changes in some landscape indices will indirectly enhance or diminish the effects of other indices on LST. Hu et al. [32] highlighted the existence of thresholds for the effects of the Normalized Difference Vegetation Index (NDVI) and NDBI on LST through empirical studies. Common analytical methods for exploring the effects of drivers mainly include correlation analysis and regression analysis [33][34][35]. However, these methods are only applicable to explore linear relationships between variables, and most of them have difficulty detecting interactions between independent variables, making it difficult to capture the marginal effects of variables on LST. Meanwhile, the highly complex variability and spatial heterogeneity of the UTE require deeper insight to accurately characterize the relationship between LST and its drivers [30]. As a statistical method that has emerged in recent years, a geographical detector can quantify the impacts of drivers and their interactions on spatial heterogeneity [36]. The detector does not follow the linearity assumptions of traditional statistical methods, and it can avoid the effects of collinearity between variables [37]. This approach has become common in attribution analysis in a number of fields, such as social sciences, environmental sciences and public health [38][39][40]. Therefore, this study conducted a multi-level attribution analysis for LST using the geographical detector.
In addition to the lack of quantification of the interactions and nonlinear effects of drivers on LST, there are limitations in research on the driving mechanisms of LST at the time scale. Much of the research on the driving mechanisms of LST focuses on a single stage of urbanization [20,41] while ignoring the differences in the driving mechanisms of LST across different urbanization stages. Driving mechanisms may show variable results across different stages of urbanization, so the neglect of such variability may not support the development of urban planning strategies to reduce LST in differentiated UTE contexts [42].
The rapid urbanization of major Chinese cities over the last twenty years has led to the transformation of land cover, which in turn has resulted in the continuing deterioration of the UTE [43]. As one of China's fifteen sub-provincial cities, Shenyang's urbanization rate was over 20% higher than China's average urbanization rate in 2020. Meanwhile, Shenyang's status as an important industrial base in China has allowed it to undergo a phase of rapid urbanization and led to increased UTE deterioration [44]. Like many cities, Shenyang has faced the threat of extreme weather conditions in recent years [45]. In the summer of 2018, the highest temperature in Shenyang reached 38.4 • C, surpassing previous statistically extreme values. Therefore, Shenyang can be used as a representative for the study of UTE deterioration during rapid urbanization, and the findings can be a reference for other cities with similar development backgrounds.
Based on the 2000, 2010 and 2020 LSTs of Shenyang, this study analyzed the spatial and temporal differentiation characteristics and driving mechanisms of the UTE. The main objectives of our study were (1) to detect the spatial and temporal differentiation characteristics of LST and (2) to construct an analytical framework to quantify the relationship between landscape composition, landscape configuration, human activities and LST heterogeneity in a hierarchical manner and to reveal the changing characteristics of the driving mechanisms at different stages of rapid urbanization. This study can enhance the understanding of the mechanisms driving the UTE during rapid urbanization and provide marginal contributions to the formulation of strategies for land use and urban planning.

Study Area
Shenyang is located at 41.20 •~4 3.04 • N and 122.42 •~1 23.81 • E, with a total area of about 12,860 square kilometers. Shenyang is the capital of Liaoning Province, China, and a typical city in cold regions. The terrain of Shenyang inclines from northeast to southwest, forming mountains, hills and alluvial plains. There are 27 rivers flowing through Shenyang, all belonging to the two major water systems, the Liao River and the Hun River. The Köppen climate classification classifies Shenyang as Dwa, i.e., cold and dry winters and hot and rainy summers. As the central city of Northeast China, Shenyang has developed rapidly in the last forty years. In 2020, the city had a resident population of 9.07 million and an urbanization rate of 84.52%, having reached an advanced stage of urbanization. As shown in Figure 1, the current study focuses on nine major municipal districts in Shenyang, namely, Heping District, Shenhe District, Dandong District, Huanggu District, Tiexi District, Sujiatun District, Hunnan District, Shenbei New District and Yuhong District, with a total study area of about 3766.76 square kilometers.

Study Area
Shenyang is located at 41.20°~43.04°N and 122.42°~123.81°E, with a total area of about 12,860 square kilometers. Shenyang is the capital of Liaoning Province, China, and a typical city in cold regions. The terrain of Shenyang inclines from northeast to southwest, forming mountains, hills and alluvial plains. There are 27 rivers flowing through Shenyang, all belonging to the two major water systems, the Liao River and the Hun River. The Köppen climate classification classifies Shenyang as Dwa, i.e., cold and dry winters and hot and rainy summers. As the central city of Northeast China, Shenyang has developed rapidly in the last forty years. In 2020, the city had a resident population of 9.07 million and an urbanization rate of 84.52%, having reached an advanced stage of urbanization. As shown in Figure 1, the current study focuses on nine major municipal districts in Shenyang, namely, Heping District, Shenhe District, Dandong District, Huanggu District, Tiexi District, Sujiatun District, Hunnan District, Shenbei New District and Yuhong District, with a total study area of about 3766.76 square kilometers.

Data Sources and Preprocessing
This study acquired Landsat 5 TM images with a strip number of 119 and a shape number of 31 for 31 July 2000 and 12 August 2010 and Landsat 8 OLI images with a strip number of 119 and a shape number of 31 for 22 July 2020 from the Geospatial Data Cloud (http://www.gscloud.cn/search, accessed on 14 April 2022). All images with no or few

Data Sources and Preprocessing
This study acquired Landsat 5 TM images with a strip number of 119 and a shape number of 31 for 31 July 2000 and 12 August 2010 and Landsat 8 OLI images with a strip number of 119 and a shape number of 31 for 22 July 2020 from the Geospatial Data Cloud (http://www.gscloud.cn/search, accessed on 14 April 2022). All images with no or few clouds (less than 5%) in Shenyang were obtained in the summer, because the vegetation usually reaches its maximum growth in this season. The images were preprocessed using Remote Sens. 2023, 15, 2075 5 of 24 ENVI 5.3, with preprocessing including radiometric calibration, atmospheric correction and image cutting, to facilitate the calculation of LST and landscape composition indices.
Land cover data were obtained from Globeland30 (http://globeland30.org/, accessed on 14 April 2022). Compared to Globcover data with 300 m resolution and MODIS data with 500 m resolution, Globeland30 has a finer scale and is considered suitable for studies in developing countries [46]. Using Globeland30, the land cover categories in Shenyang were extracted and classified into six types: cropland, forest, grassland, wetland, water and impervious surface.
Worldpop (https://www.worldpop.org/, accessed on 14 April 2022) provided population density data for the years 2000 to 2020 at a spatial resolution of 1 km. Nighttime light data for 2000, 2010 and 2020 can be retrieved from the National Tibetan Plateau Data Center (http://data.tpdc.ac.cn, accessed on 14 April 2022) with a spatial resolution of 1 km. This dataset is the result of a nighttime light convolutional long short-term memory (NTLSTM) network. By applying the network, the researchers produced the world's first 1984-2020 Chinese prolonged artificial nighttime light dataset [47]. Model evaluation against the original images showed a root-mean-square error (RMSE) of 0.73, a coefficient of determination (R 2 ) of 0.95 and a linear slope of 0.99 at the pixel level, indicating the high quality of data for the generated product. The correlation between nighttime light data and socioeconomic indicators such as the built-up area, GDP and population at different stages is better than all existing products, validating the temporal consistency of this data product [47].

Methodology
Our research consisted of four main steps: (1) the radiative transfer equation-based method (RTE) was used to obtain LSTs from 2000 to 2020; (2) spatial and temporal differentiation characteristics based on the multi-year LST dataset were analyzed; (3) the role of landscape configuration in LST for different land types in 2000, 2010 and 2020 was quantified using the geographical detector; (4) at the landscape level, the role of landscape composition, landscape configuration and human activities in LST in 2000, 2010 and 2020 was quantified using the geographical detector. The study flow chart is shown in Figure 2.  Based on band 6 of Landsat 5 TM in 2000 and 2010 and band 10 of Landsat 8 OLI in 2020, this study used the RTE for the inversion of LST. The RTE has been considered to have high accuracy in previous studies and has been widely used for the inversion of LST [48,49]. A simplified radiative transfer equation can express the brightness value of thermal infrared radiation: where ε is the surface-specific emissivity, B(T s ) is the blackbody thermal radiation luminance, τ is the atmospheric transmittance in the thermal infrared band, L ↓ is the atmospheric downward radiance, and L ↑ is the atmospheric upward radiance. T s can be obtained as a function of Planck's equation: For band 6 of Landsat 5 TM, K 1 = 607.76 W/(m 2 * µm * sr) and K 2 = 1260.56 K. For band 10 of Landsat 8 OLI, K 1 = 774.89 W/(m 2 * µm * sr) and K 2 = 1321.08 K.

Acquisition of Independent Variables
As shown in Table 1, the findings of the researchers and the basic information about the study area led to the selection of three types of independent variables to examine the driving mechanism of the UTE. Landscape composition indices mainly consist of biophysical components such as vegetation, water and built-up land. Based on previous studies, we quantified the landscape composition indices using FVC, NDBI and MNDWI.
The proportion of pixels containing the vertical projection of vegetation is called FVC. Previous studies have mostly used it to quantify the ecological status of the research area. Firstly, we obtained the NDV I using the following equation: where N IR is the near-infrared band, and R is the red band. Based on the results of NDV I, a dimidiate pixel model was used to obtain FVC, and 5-95% was selected as the confidence interval for the NDVI value. The formula is as follows: where NDV I max is the NDV I value of the fully vegetated area, and NDV I min is the NDV I value of the non-vegetated area. The NDBI is widely used to extract the construction sites in areas of interest. The calculation formula is as follows: where MIR is the mid-infrared band, and N IR is the near-infrared band. Compared to the NDWI, the MNDWI proposed by Xu [50] can effectively eliminate construction noise and consequently provide more accurate information on water body changes. The MNDWI is therefore often used to obtain water information from water areas where the background is dominated by built-up land. The MNDW I is calculated as follows: where Green is the green band.
Human activity indices were represented by preprocessed population density data and nighttime light data. Landscape configuration indices were quantified through landscape pattern indices, which were calculated based on land use data. Based on previous studies [16,38], seven landscape pattern indices were considered appropriate to quantify the landscape configuration at the class level and landscape level ( Table 2). Five class-level indices were chosen to explore the landscape configuration of land cover; they represent the area factor, shape factor and aggregation factor. Additionally, six landscape-level indices were used to assess the structure and morphology of the entire landscape. In addition to the area factor, shape factor and aggregation factor, a diversity factor was added to the landscape-level indices. The landscape pattern indices were all calculated using Fragstats 4.0.

Standard Deviation Ellipse
The SDE introduced by Lefever [51] was used to spatially quantify the distribution characteristics of and spatio-temporal variation in the study object. The basic characteristics of the SDE include the long axis, short axis, shape and range. The long and short axes indicate the direction and extent of the spatial distribution of the study object, respectively. The higher their ratio, the more concentrated and directional the study object is. Conversely, the lower the ratio, the more discrete and less directional the study object is. In addition, the gravity center can be used to assess the trend of a geographical object over time [52]. By comparing changes in the gravity centers of the ellipses at different times, the development trend and trajectory changes in the study object can be estimated spatially. This study used the SDE to determine the migration trajectory of the hotspot regions of LST.

Hotspot Analysis (Getis-Ord Gi*)
The hotspot analysis method proposed by Ord and Getis [53] is an effective method for studying the spatial clustering characteristics of research objects. By calculating the z-scores and p-values of elements, the method can effectively explore where the clustering of high-or low-value elements occurs in space. Elements with high values and surrounded by other elements having high values are usually statistically significant hotspots. We performed the hotspot analysis of LST using the hotspot analysis tool of Arc GIS 10.4.1. The results obtained with this tool can show the spatial clustering of cold and hotspots in LST.

Geographical Detector
As a statistical tool for measuring spatial heterogeneity, the geographical detector has been widely used in recent years [37]. This tool can effectively measure the relationship between variables and identify the explanatory variables that contribute primarily to the spatial heterogeneity of the dependent variable, without assuming the linearity of the association between them [36]. The geographical detector consists of four parts: a factor detector, risk detector, ecological detector and interaction detector. In this study, the direct effect of selected drivers on the spatial heterogeneity of LST was analyzed using the factor detector. The results can be expressed as q-statistics: where L is the number of strata of explanatory variables (h = 1, 2, . . . , L), N is the number of samples, and stratum h consists of N h units. σ 2 is the variance of the dependent variable, and σ h 2 represents the variance in the dependent variable in stratum h. The value of q ranges within [0, 1]. The closer the value of q is to 1, the stronger the explanatory power of the explanatory variable, and the closer it is to 0, the weaker the explanatory power.
The risk detector was used to identify significant differences between the means of the dependent variable and the explanatory variable in the two sub-regions, which in turn determined the potential risk region of the dependent variable. The interaction detector determined whether the explanatory power of two explanatory variables superimposed on each other was enhanced, diminished or mutually independent.
Since the explanatory variables of the geographical detector need to be discretized, the Jenks natural breaks classification method was used to classify the means of independent variables into nine classes and incorporate them into 967 2 km × 2 km grids for calculation. Major urban blocks in China are rectangles of approximately 800 m to 1200 m in length, so grid cells of 1 km to 5 km are suitable for urban-scale studies [54]. In previous studies, a grid scale of 2 km × 2 km has been considered the best scale for exploring the driving mechanisms of LST [55]. Therefore, a 2 km × 2 km analysis unit was chosen to explore the driving mechanism of LST in this study.

Validation of LST Results
We validated the LST results for 2000, 2010 and 2020 using the MOD11A1 Version 6.1 product (MODIS/Terra Land Surface Temperature/Emissivity Daily L3 Global 1 km SIN Grid) obtained from LAADS DAAC (https://ladsweb.modaps.eosdis.nasa.gov/, accessed on 24 March 2023). The product provides daily per-pixel LST and emissivity with a spatial resolution of 1 km [56]. Previous studies have demonstrated that this product can be used to validate LST results [57]. MOD11A1 1 km LST data on the same date as the selected Landsat remote-sensing images for this study were extracted as validation data. Afterward, linear fits were made using these data and inverse Shenyang LST data to verify the accuracy of LST in this study.  Figure 4 shows the spatial distribution of and temporal changes in LST in Shenyang from 2000 to 2020. Over 20 years, the area of Shenyang's high-temperature region continued to expand. The average LSTs of Shenyang were 29.49 °C, 31.08 °C and 32.47 °C in 2000, 2010 and 2020, respectively, exhibiting an increasing trend, according to Table 3. The temperature difference was growing in 2000, 2010 and 2020, with values of 32.99 °C, 41.40 °C and 45.55 °C. In addition, the standard deviation (SD) of the mean LST also widened, with 3.37, 3.69 and 3.85 in the three periods, respectively. These results indicate that LST in Shenyang tends to deteriorate, which is accompanied by a significant increase in heterogeneity.     Figure 4 shows the spatial distribution of and temporal changes in LST in Shenyang from 2000 to 2020. Over 20 years, the area of Shenyang's high-temperature region continued to expand. The average LSTs of Shenyang were 29.49 °C, 31.08 °C and 32.47 °C in 2000, 2010 and 2020, respectively, exhibiting an increasing trend, according to Table 3. The temperature difference was growing in 2000, 2010 and 2020, with values of 32.99 °C, 41.40 °C and 45.55 °C. In addition, the standard deviation (SD) of the mean LST also widened, with 3.37, 3.69 and 3.85 in the three periods, respectively. These results indicate that LST in Shenyang tends to deteriorate, which is accompanied by a significant increase in heterogeneity.   The results of the hotspot analysis (Getis-Ord Gi*) show the spatial clustering characteristics of LST in 2000, 2010 and 2020 (see Figure 5). Overall, the central urban area always contained hotspots, while the suburban area near water and vegetation contained most of the cold spots. Over a period of 20 years, the clustering areas of hot and cold spots tended to expand. In 2000, the regions described as hot and cold spots accounted for 12.67% and 9%, respectively; in 2010, they were 15.4% and 7.85%, respectively; and in 2020, they were 19.22% and 13.57%, respectively (see Figure 6). In addition, the hotspots formed by the Hun River and its surrounding areas in the central urban area have gradually disappeared, resulting in the gradual separation of hotspots in the central urban area along the Hun River over a period of 20 years and the formation of two separate heat islands across the north and south banks of the Hun River.  The results of the hotspot analysis (Getis-Ord Gi*) show the spatial clustering characteristics of LST in 2000, 2010 and 2020 (see Figure 5). Overall, the central urban area always contained hotspots, while the suburban area near water and vegetation contained most of the cold spots. Over a period of 20 years, the clustering areas of hot and cold spots tended to expand. In 2000, the regions described as hot and cold spots accounted for 12.67% and 9%, respectively; in 2010, they were 15.4% and 7.85%, respectively; and in 2020, they were 19.22% and 13.57%, respectively (see Figure 6). In addition, the hotspots formed by the Hun River and its surrounding areas in the central urban area have gradually disappeared, resulting in the gradual separation of hotspots in the central urban area along the Hun River over a period of 20 years and the formation of two separate heat islands across the north and south banks of the Hun River.     Table 4. From 2000 to 2010, the rotation angle of the ellipse increased clockwise from 32.38 degrees to 45.96 degrees. Then, from 2010 to 2020, the rotation angle decreased counterclockwise from 45.96 degrees to 40.58 degrees. These rotations indicate that the direction of the hotspot was altered over the 20-year period and elongated in the northeastsouthwest direction. The ratio of the long axis to the short axis expresses the spatial dispersion of the hotspot regions. From 2000 to 2020, the ratio first decreased and then increased, showing the development trend of hotspot regions from dispersion to concentration.     LSTs for different land types also show significant variations (Table 5). Over the 20-year period, the mean values of LST increased significantly for all land use types. The impervious surface was consistently the hottest land use type, with mean LST values of 33.69 • C, 35.75 • C and 36.80 • C in the three periods, respectively. The mean LST for forests in 2000 was 28.06 • C, making it the coldest land use type. However, in 2010 and 2020, it was replaced by water, with mean LSTs of 26.49 • C and 29.74 • C, respectively. This confirms that natural surfaces covered by water and forests can effectively reduce the ambient temperature. Figure 8 shows the results of superimposed hotspot analysis and land use types. Over 75% of the cold spot regions were concentrated in croplands near water bodies over the 20-year period. In addition, more than 79% of the hotspot regions were covered by impervious surfaces. It can be inferred that the construction land in the urban core of Shenyang leads to the aggregation of hotspots of LST. Conversely, large areas of cropland on the outskirts of the city lead to the clustering of cold spots.

Factor Detection Analysis
Since only one delineated grid contained wetland, it was difficult to quantify the effect on the heterogeneity of LST. In the subsequent analysis, wetland was excluded. The analysis results of class-level factor detection are shown in Table 6. Over the 20 years, some landscape configuration indices of cropland, water and impervious surface significantly influenced the spatial differentiation of LST (p < 0.1). The explanatory power of PLAND and LPI for the LST of impervious surfaces was consistently above 51%, while that for the

Factor Detection Analysis
Since only one delineated grid contained wetland, it was difficult to quantify the effect on the heterogeneity of LST. In the subsequent analysis, wetland was excluded. The analysis results of class-level factor detection are shown in Table 6. Over the 20 years, some landscape configuration indices of cropland, water and impervious surface significantly influenced the spatial differentiation of LST (p < 0.1). The explanatory power of PLAND and LPI for the LST of impervious surfaces was consistently above 51%, while that for the LST of cropland was consistently above 27%. This means that the area factor of cropland and impervious surface is critical to LST heterogeneity. The explanatory power of FRAC_AM for the LST of impervious surfaces was 18% in 2000 and 20% in 2010, respectively. However, by 2020, it had dropped to 11% and was no longer significant. This may imply that the shape factor of the impervious surface has a reduced driving force on LST. Moreover, in 2010, the explanatory power of LPI and FRAC_AM for the LST of water was 37% and 16%, respectively, indicating that the area factor and shape factor of water may also affect LST. Table 7 shows the results of factor detection at the landscape level. Under the premise that significance was always satisfied, the average factor explanatory power at the landscape level was as follows: NDBI > FVC > nighttime light > population density > MNDWI > FRAC_AM. Among all the landscape configuration indices, only FRAC_AM had an explanatory power that was always greater than 10% and was always significant. In contrast, other factors had weaker explanatory power (less than 10% or not significant) for the heterogeneity of LST. Overall, the landscape composition indices and the human activity indices are more important for LST heterogeneity, with an average explanatory power greater than 40%, while the landscape configuration indices are secondary drivers.

Risk Detection Analysis
The drivers that were significant contributors to LST heterogeneity over 20 years were selected for risk detection analysis. Figure 9 presents the results of class-level risk detection. The LST of cropland decreased with increasing PLAND and LPI, while the opposite was true for that of impervious surfaces. In addition, the LST of cropland varied considerably in the first two classes of PLAND and LPI. The LST of impervious surfaces, in contrast, increased rapidly in the latter two classes of PLAND and LPI. However, these differences diminished over time for cropland and impervious surfaces.
Remote Sens. 2023, 15, x FOR PEER REVIEW 16 of 27 Figure 9. Results of the class-level risk detector. (Curves represent the change in the mean LST at various levels for each driver. Red squares denote that the difference in effect on LST between the two factor levels is significant with 95% confidence. Blue squares denote insignificance.).
In the landscape-level risk detection analysis, the relationship between different drivers and LST differs significantly ( Figure 10). The NDBI, population density and nighttime light were positively correlated with LST, which means that the development of construction land, the growth of the population and the increase in the socioeconomic level will increase LST. Compared to the NDBI, population density and nighttime light, FVC had a near-linear negative association with LST. For FRAC_AM, LST always decreased rapidly in the first two levels and showed a constant negative correlation in the subsequent levels. In addition, the MNDWI was found to show a significant nonlinear association with LST. LST increased rapidly with increasing levels of MNDWI, peaking between levels 7 and 8. At subsequent levels, LST decreased rapidly. Figure 9. Results of the class-level risk detector. (Curves represent the change in the mean LST at various levels for each driver. Red squares denote that the difference in effect on LST between the two factor levels is significant with 95% confidence. Blue squares denote insignificance.).
In the landscape-level risk detection analysis, the relationship between different drivers and LST differs significantly ( Figure 10). The NDBI, population density and nighttime light were positively correlated with LST, which means that the development of construction land, the growth of the population and the increase in the socioeconomic level will increase LST. Compared to the NDBI, population density and nighttime light, FVC had a near-linear negative association with LST. For FRAC_AM, LST always decreased rapidly in the first two levels and showed a constant negative correlation in the subsequent levels. In addition, the MNDWI was found to show a significant nonlinear association with LST. LST increased rapidly with increasing levels of MNDWI, peaking between levels 7 and 8. At subsequent levels, LST decreased rapidly. Figures 11 and 12 show the results of interaction detection at the class level and the landscape level, respectively. The q-values of the interactions between different drivers all showed bidirectional enhancement or nonlinear enhancement (dark underline). Bidirectional enhancement means that the effect of the interaction between drivers is stronger than their individual effects, while nonlinear enhancement means that the interaction between drivers exceeds the sum of their individual effects. That is, in the present study, the superposition of any two drivers enhances their effects on the heterogeneity of LST.

Interaction Detection Analysis
Remote Sens. 2023, 15, x FOR PEER REVIEW 17 of 27 Figure 10. Results of the landscape-level risk detector. (Curves represent the change in the mean LST at various levels for each driver. Red squares denote that the difference in effect on LST between the two factor levels is significant with 95% confidence. Blue squares denote insignificance.). Figures 11 and 12 show the results of interaction detection at the class level and the landscape level, respectively. The q-values of the interactions between different drivers all showed bidirectional enhancement or nonlinear enhancement (dark underline). Bidirectional enhancement means that the effect of the interaction between drivers is stronger than their individual effects, while nonlinear enhancement means that the interaction Figure 10. Results of the landscape-level risk detector. (Curves represent the change in the mean LST at various levels for each driver. Red squares denote that the difference in effect on LST between the two factor levels is significant with 95% confidence. Blue squares denote insignificance.).

5, x FOR PEER REVIEW
18 of 27 between drivers exceeds the sum of their individual effects. That is, in the present study, the superposition of any two drivers enhances their effects on the heterogeneity of LST. Figure 11. Results of the class-level interaction detector.
At the class level, LPI∩ED (61%), PLAND∩ED (65%) and PLAND∩FRAC_AM (58%) for impervious surfaces had the strongest interactions in 2000, 2010 and 2020, respectively. LPI∩ED (37%), PLAND∩ED (41%) and PLAND∩ED (43%) for cropland also showed strong interactions in 2000, 2010 and 2020, respectively. It was also found that LPI∩FRAC_AM for water had 52% and 33% of the interaction in 2010 and 2020, respectively. PD, ED and density had the strongest interactions on LST heterogeneity. In 2010, NDBI ∩nighttime light and NDBI ∩FVC had the highest contribution to LST heterogeneity. Finally, NDBI ∩FVC and NDBI ∩SHDI had the strongest interaction effects on LST heterogeneity in 2020. The explanatory variables that possess the strongest interactions at the landscape level involve all three categories, implying that the interactions among landscape composition indices, landscape configuration indices and human activity indices have a significant effect on the spatial differentiation of LST.

Spatial and Temporal Differences
Researchers in urban ecology believe that the UTE reflects the result of the surface and atmospheric energy balance [11]. The study of the spatial and temporal differentiation of the UTE is fundamental to understanding ecological change and urban development [12,31]. From 2000 to 2020, LST in Shenyang continued to rise. The hotspots consistently extended in the northeast-southwest direction, and the hot/cold spots in the city center and its suburbs kept expanding. These findings indicate that rapid urbanization has indeed resulted in the deterioration and spatial heterogeneity of the UTE [58].
Among all land use types, we found that the hotspot regions were always concentrated on impervious surfaces. Impervious surfaces, including impervious landscapes and exposed surfaces in cities, are thought to always increase urban temperatures by holding heat and reducing evaporative cooling [59]. It is noteworthy that, although water and forest were always the land types with the lowest mean values of LST, the cold spots in Shenyang were always clustered in croplands near water during the 20 years. This may be influenced by the area of land cover [60]. As shown in Table 8, cropland was always the dominant land type in the study area, with a consistent percentage of over 58%, although it continued to decrease. In contrast, forest and water continued to expand, but the sum of the areas never exceeded 9%. For the whole study area, the smaller area proportions make it difficult to form spatial clusters with significant cold spots in forest and water. However, the growth of large areas of crops in summer, such as corn and rice, can lead to an increase in surface soil moisture and evapotranspiration and thus can promote the clustering of cold spots [61,62].
An interesting phenomenon is that the LST hotspots in the Hun River and its surrounding areas in the central city of Shenyang have been disappearing over the past 20 At the class level, LPI∩ED (61%), PLAND∩ED (65%) and PLAND∩FRAC_AM (58%) for impervious surfaces had the strongest interactions in 2000, 2010 and 2020, respectively. LPI∩ED (37%), PLAND∩ED (41%) and PLAND∩ED (43%) for cropland also showed strong interactions in 2000, 2010 and 2020, respectively. It was also found that LPI∩FRAC_AM for water had 52% and 33% of the interaction in 2010 and 2020, respectively. PD, ED and FRAC_AM had weaker explanatory power in single-factor detection, but their explanatory power for LST heterogeneity could be significantly enhanced when combined with other factors.
Among the landscape-level interactions, the NDBI was the main contributor to LST heterogeneity at different times. In 2000, NDBI ∩nighttime light and NDBI ∩population density had the strongest interactions on LST heterogeneity. In 2010, NDBI ∩nighttime light and NDBI ∩FVC had the highest contribution to LST heterogeneity. Finally, NDBI ∩FVC and NDBI ∩SHDI had the strongest interaction effects on LST heterogeneity in 2020. The explanatory variables that possess the strongest interactions at the landscape level involve all three categories, implying that the interactions among landscape composition indices, landscape configuration indices and human activity indices have a significant effect on the spatial differentiation of LST.

Spatial and Temporal Differences
Researchers in urban ecology believe that the UTE reflects the result of the surface and atmospheric energy balance [11]. The study of the spatial and temporal differentiation of the UTE is fundamental to understanding ecological change and urban development [12,31]. From 2000 to 2020, LST in Shenyang continued to rise. The hotspots consistently extended in the northeast-southwest direction, and the hot/cold spots in the city center and its suburbs kept expanding. These findings indicate that rapid urbanization has indeed resulted in the deterioration and spatial heterogeneity of the UTE [58].
Among all land use types, we found that the hotspot regions were always concentrated on impervious surfaces. Impervious surfaces, including impervious landscapes and exposed surfaces in cities, are thought to always increase urban temperatures by holding heat and reducing evaporative cooling [59]. It is noteworthy that, although water and forest were always the land types with the lowest mean values of LST, the cold spots in Shenyang were always clustered in croplands near water during the 20 years. This may be influenced by the area of land cover [60]. As shown in Table 8, cropland was always the dominant land type in the study area, with a consistent percentage of over 58%, although it continued to decrease. In contrast, forest and water continued to expand, but the sum of the areas never exceeded 9%. For the whole study area, the smaller area proportions make it difficult to form spatial clusters with significant cold spots in forest and water. However, the growth of large areas of crops in summer, such as corn and rice, can lead to an increase in surface soil moisture and evapotranspiration and thus can promote the clustering of cold spots [61,62]. An interesting phenomenon is that the LST hotspots in the Hun River and its surrounding areas in the central city of Shenyang have been disappearing over the past 20 years, creating two heat islands to the north and south along the Hun River. This may be due to the pollution control and ecological protection of the Hun River. As an important industrial city in China, Shenyang has well-developed food and chemical industries, but the uncontrolled discharge of pollutants led to extremely serious pollution of the river until 2002 [63]. The Shenyang municipal government then embarked on river management and gradually established the landscape corridor to optimize the quality of the river and its surrounding ecology [63]. Through these measures, the LST in the area has also been reduced, gradually creating a corridor separating LST hotspots.

Driving Mechanism at the Class Level
The relationship between the landscape configuration and LST varies considerably across land types at the class level. On longer time scales, the PLAND and LPI of cropland and impervious surfaces were always the main landscape configuration indices that affected LST. The intensity of LST decreases as the area and patches of cropland increase, while the opposite is true for impervious surfaces. As part of the urban green space, the positive impact of the increased size and abundance of cropland on LST is associated with increased irrigation and an increased scale of transpiration during crop growth peaks [64,65]. Furthermore, the aggregation and large-scale expansion of built-up land can lead to a reduction in surface infiltration and surface moisture, which in turn leads to the deterioration of LST [66].
It is noteworthy that the LST of cropland always decreased rapidly in the first two levels of LPI and PLAND, and the LST of impervious surfaces always increased rapidly in the last two levels of LPI and PLAND. This implies that the transpiration cooling effect of cropland on LST occurs rapidly with increasing area and abundance [67]. Moreover, when the impervious surface increases to a certain extent, unrestricted urban expansion leads to difficulties in finding a balance between urban development and urban ecology, resulting in a rapid rise in LST [68]. However, over the 20-year period, the difference in LST increased for each level of PLAND and LPI, and the relationship between PLAND, LPI and LST gradually changed from nonlinear to near-linear. This can be attributed to rapid urbanization [69]. Rapid urbanization has led to massive urban expansion, rapid land cover changes and ever-increasing LST. In this case, while the PLAND and LPI of cropland and impervious surfaces still have effects on LST, their marginal effects are more difficult to specify, and further attention needs to be paid to the interaction of landscape configuration to maximize LST reduction.
For other landscape configuration indices, even if the individual explanatory power was not strong, the interactions between them tended to enhance their impact on LST. This implies that the complex synergy of factors is more important than the role of a single factor [38]. The strongest explanatory power of the interactions appeared in the area and shape indices of the impervious surface, including LPI, PLAND, ED and FRAC_AM. This suggests that the combination of area, abundance, shape and connectivity patterns of different patches of impervious surface needs to be focused on when optimizing the UTE [70]. The PLAND, LPI and ED of cropland also showed strong interactions, suggesting that the combined configuration of the area, abundance and patch connectivity patterns of cropland also needs to be taken into account. Attention also needs to be paid to the interacting configurations of the LPI and FRAC_AM of water, as they also showed stronger interactions in 2010 and 2020.

Driving Mechanism at the Landscape Level
Landscape composition and human activities are key drivers that influence LST at the landscape level. The NDBI, population density, and nighttime light were significantly and positively correlated with LST from 2000 to 2020. During the rapid expansion phase of cities, population migration and economic activity can increase anthropogenic heat release and LST [71,72]. In contrast, FVC showed a significantly negative correlation with LST. Urban green areas can promote air ventilation by improving the thermodynamic properties of the land surface, which can help reduce LST in urban areas [71,73]. Unlike previous studies, the MNDWI and LST were not always significantly negatively correlated, but rather, there was a threshold effect. The positive effect of the MNDWI on LST could be attributed to the fact that the remote-sensing images were acquired at night and in late summer. Gunawardena et al. [74] showed that urban heat island intensity is greatest at night and in late summer, which may lead to warming effects on water bodies. In addition, we speculate that LST will rapidly decrease when the surface water content reaches a certain threshold.
The landscape configuration indices had less explanatory power for LST at the landscape level than at the class level. This finding might further indicate that the configuration of land cover is more valuable than that of the entire landscape when studying the driving forces of LST [75,76]. However, we still found strong explanatory power for FRAC_AM for LST at the landscape level. Complex landscape patches can lead to a reduction in LST, especially in the first two levels of FRAC_AM. Zhao et al. [27] argued that the shape complexity of urban landscapes should receive more attention when aiming to reduce the LST of cities, which is consistent with our findings.
With the exception of the MNDWI and FRAC_AM, all drivers showed a near-linear correlation trend with LST over 20 years, similar to the results at the class level. This suggests that more attention also needs to be paid to the interaction of drivers at the landscape level during the rapid urbanization phase.
The two variables with the strongest interaction effects on LST were constantly changing over 20 years. However, one of the interacting variables always included the NDBI. The single explanatory power of the NDBI for LST is very strong, but the explanatory power can still be enhanced by its interaction with other factors, such as nighttime lighting, population density, FVC and SHDI. These findings emphasize the importance of built-up areas in LST at the landscape level. At the same time, urban planners need to pay attention to the combination of buildings and vegetation, population, economic activity and the diversity of the landscape when optimizing the UTE.

Impact on Urban Planning
This study explored the spatial and temporal differentiation characteristics and driving mechanisms of the UTE. In addition, the differences in driving mechanisms across different stages of urbanization were investigated. The results of this study may therefore have important implications for UTE-based urban planning. At the landscape level, due to the high correlation between the NDBI, population density and nighttime light and LST, the high density of buildings, concentrated population and frequent economic activities can have a significant impact on UTE deterioration. Therefore, urban sprawl and human activities should first be strictly controlled during rapid urbanization. FVC and FRAC_AM are significantly negatively correlated with LST, so during rapid urbanization, increasing vegetation cover and landscape diversity can help to rapidly optimize the UTE, for example, by creating pocket parks and landscape corridors in high-density urban cores and increasing the protection of ecological reserves in the outskirts of cities. The marginal contribution of the MNDWI to LST suggests that the stock of water bodies should be maintained, their quality should be improved, and large lakes should be built on the outskirts of the city. At the class level, the relationship between PLAND and LPI and LST for cropland and impervious surfaces indicates that an ecological red line for cropland should be drawn, and continuous cropland should be built on the outskirts of the city to avoid the encroachment on cropland caused by concentrated urban development during rapid urbanization. The integration of the main drivers at the landscape level and class level allows for the efficient optimization of the UTE during rapid urbanization.
In the later stage of rapid urbanization, the development of large urban areas is constrained, and the marginal contribution of a single driver to UTE is reduced. At this stage, therefore, urban planners need to focus on the interactions between drivers to maximize the reduction in LST. Firstly, according to our study, the combination of the area, abundance, shape and patchy connectivity patterns of impervious surfaces needs to be focused on. This means that when controlling urban sprawl, attention should also be paid to the diversity of morphology and the complexity of the boundaries of urban built-up areas. Secondly, the area, abundance and patch connectivity patterns of cropland have strong effects on LST. This suggests that when conserving cropland, attention needs to be paid to the size of the cropland, the type of crop and the complexity of the cropland boundaries. Thirdly, the abundance and morphology of water can also significantly interact with LST, and it is therefore recommended that when increasing water area, it should be combined with a diversity of morphological features. Finally, the combination of buildings and vegetation, population, economic activity and landscape diversity has a significant impact on LST, suggesting that, in the later stage of rapid urbanization, urban development should be intensified, and the ecological benefits of the landscape should be achieved by allocating diverse vegetation and landscape functions to high-density built-up areas.

Conclusions
Based on three Landsat images of Shenyang city in 2000, 2010 and 2020, we investigated the spatial and temporal differentiation characteristics of the UTE and its driving mechanisms during rapid urbanization using the SDE, hotspot analysis and the geographical detector. We identified the driving mechanisms by which landscape composition, landscape configuration and human activities affect the UTE at multiple levels. The key findings of this study are as follows.
Firstly, the average LST in Shenyang continued to rise and had significant spatial and temporal differentiation characteristics during the twenty-year period. The hotspot regions extended in the northeast-southwest direction. Hotspot and cold spot areas continued to expand in urban and suburban areas, respectively. The hotspots were always concentrated on impervious surfaces, while the cold spots were scattered in croplands near water. Over a period of 20 years, the hotspot in the central city has gradually been divided into two parts along the north and south banks of the Hun River.
Secondly, the spatial configuration indices at the class level are more important than those at the landscape level when studying the UTE. PLAND and LPI for class-level cropland and impervious surfaces were strong drivers of LST over the 20-year period, whereas only FRAC_AM at the landscape level had a consistently significant effect on LST.
Thirdly, as urbanization progresses, the marginal effects of some drivers on LST decrease, showing a near-linear correlation. At the class level, increases in the PLAND and LPI of cropland could rapidly reduce LST, while increases in the PLAND and LPI of impervious surfaces could rapidly increase LST. At the landscape level, the NDBI, population density and nighttime light were positively correlated with LST, FVC and FRAC_AM could effectively reduce LST, and there was a threshold for the response of the MNDWI to LST. However, over a period of 20 years, the nonlinear relationship between all factors and LST, except the MNDWI and FRAC_AM, gradually shifted to a near-linear relationship.
Finally, the interaction between the drivers should receive more attention when optimizing the UTE in the later stage of rapid urbanization. All class-level and landscape-level factors had enhanced effects on LST through their interactions. At the class level, the focus needs to be on the combination of the area, abundance, shape and patch connectivity patterns of impervious surfaces. The combined configuration of the area, abundance and patch connectivity patterns of cropland and the patch abundance and shape of water also need to be considered. At the landscape level, attention needs to be paid to the integration of buildings and vegetation, population, economic activity and landscape diversity. These results will provide the basis for a more comprehensive understanding of the UTE and the implementation of urban planning strategies during rapid urbanization.
Our study also has some limitations. First, our study was conducted only in Shenyang, a typical city in cold regions. The spatial pattern of LST may vary greatly among different cities due to differences in urban development patterns. Recent studies have concluded that conducting studies on LST in different cities or urban clusters can avoid bias in the results [13,77]. Second, in this study, only three remote-sensing images were selected for the time-scale research. Although the meteorological conditions of the selected remote-sensing images are similar and good, it is difficult to reflect the results of the study on a time scale of 20 years with data from only three remote-sensing images. Follow-up studies could increase the number of remote-sensing images to draw more precise conclusions. Third, we studied only one analysis unit at 2 km*2 km. Although this unit is recognized, the modifiable areal unit problem (MAUP) may exist, and different analysis scales may lead to differences in the results [54]. Therefore, in future UTE studies, the inclusion of different statistical scales should be considered to increase the accuracy of the results. Finally, the drivers affecting changes in the UTE may not be limited to two-dimensional composition and configuration. Three-dimensional landscape indicators, such as the floor area ratio and shadow patterns, have been found to influence the UTE [78,79]. However, this was not discussed in this study. The relationship between three-dimensional urban landscape metrics and the UTE should be emphasized and combined with two-dimensional metrics for a comprehensive analysis.