A Geographically Weighted Regression Analysis of the Underlying Factors Related to the Surface Urban Heat Island Phenomenon

This study investigated how underlying biophysical attributes affect the characterization of the Surface Urban Heat Island (SUHI) phenomenon using (and comparing) two statistical techniques: global regression and geographically weighted regression (GWR). Land surface temperature (LST) was calculated from Landsat 8 imagery for 20 July 2015 for the metropolitan areas of Austin and San Antonio, Texas. We sought to examine SUHI by relating LST to Lidar-derived terrain factors, land cover composition, and landscape pattern metrics developed using the National Land Cover Database (NLCD) 2011. The results indicate that (1) land cover composition is closely related to the SUHI effect for both metropolitan areas, as indicated by the global regression coefficients of building fraction and NDVI, with values of 0.29 and −0.74 for Austin, and 0.19 and −0.38 for San Antonio, respectively. The terrain morphology was also an indicator of the SUHI phenomenon, implied by the elevation (0.20 for Austin and 0.09 for San Antonio) and northness (0.20 for Austin and 0.09 for San Antonio); (2) the SUHI phenomenon of Austin on 20 July 2015 was affected by the spatial pattern of the land use and land cover (LULC), which was not detected for San Antonio; and (3) with a local determination coefficient higher than 0.8, GWR had higher explanatory power of the underlying factors compared to global regression. By accommodating spatial non-stationarity and allowing the model parameters to vary in space, GWR illustrated the spatial heterogeneity of the relationships between different land surface properties and the LST. The GWR analysis of SUHI phenomenon can provide unique information for site-specific land planning and policy implementation for SUHI mitigation.


Introduction
Although urban land covers only 2-3% of the total area of the Earth [1,2], urban areas draw ample attention due to the rapid pace of urbanization across the world.Indeed, whereas the global fraction of population living in the urban areas was just 30% in 1950, by 2014, that figure had passed 54% [3].Urbanization involves increases in both population and developed space, and certain planning and management strategies significantly influence the landscape pattern and the environment [4][5][6][7].The Urban Heat Island (UHI) effect, which refers to the phenomenon of an urban area's atmosphere exhibiting higher temperatures than surrounding rural areas [8], has become a well-researched topic due to a series of adverse effects on vegetation phenology [9], air pollution [10], meteorology [11], climatic warming [12], and health risks for urban residents [13].
Traditional UHI investigation has focused on atmosphere UHI where the air temperature is measured in situ at isolated stations.Although some modern devices can capture additional parameters to investigate the UHI (e.g., velocity, turbulence, and even pollutant concentration), the stationary networks are still limited in most of urban areas worldwide.Meanwhile, due to the air mixes and advection effect, the study of the atmosphere UHI is not capable of capturing the heterogeneous thermal pattern caused by land use and land cover (LULC) composition and configuration.In contrast, remote sensing data have been used to estimate the land surface temperature (LST), which is time-synchronized and grid-based for a considerable areal extent [14,15].Surface UHI (SUHI), which is characterized by the analysis of remotely-sensed LST, has enabled the evaluation of the relationships between LST and microscale geophysical variables on the landscape.Different from the traditional UHI analysis, LST data contributes to a broader understanding of spatial thermal patterns and the influence of surface properties on SUHI formation [16].
The examination of how underlying surface characteristics affect SUHI formation has become one of the major applications of remote sensing for urban climate studies.Remote sensing-derived spectral indices have been used to examine SUHI formation, including the Normalized Difference Vegetation Index (NDVI) [17], the Normalized Difference Built-up Index (NDBI) [18], and the Normalized Difference Water Index (NDWI) [19], among others.Moreover, several studies have reported that green spaces or water bodies mitigate high LSTs [20][21][22][23].On the other hand, built-up or impervious land cover increases LST and exacerbates SUHI effects [23][24][25].Furthermore, research has also demonstrated the impacts of spatial composition and configurations of detailed LULC on the LST [21][22][23]26,27].These studies indicate that empirical estimation models are effective tools for characterizing SUHI formation with less computational intensity compared to simulation models.In addition, empirical estimation outputs are relatively easy to interpret.Conventional statistics (e.g., ordinary least squares (OLS)) are the primary tools for researchers in most studies that investigate the impacts of underlying factors on SUHI formation.However, the prominent limitation of conventional statistics in geoscience is spatial non-stationarity which refers to the spatially varying relationships between dependent and independent variables [28].Moreover, OLS has been shown to be of limited utility when spatial data are coupled with highly correlated independent variables [29].An alternative to conventional regression is Geographically Weighted Regression (GWR) which can model spatial variation in the relationship between dependent and independent variables.Li et al. [30] first indicated that GWR provides a better fit and more localized information than a global model when exploring the landscape drivers of LST.Additional assessments of GWR have been compared to global regression (e.g., OLS) with respect to residual spatial autocorrelation, and a goodness-of-fit model has also been explored [31][32][33].
Previous studies have focused on analyzing SUHI for single cities, while SUHI research on broader metropolitan areas of multiple cities is far less common.This larger extent analysis warrants further investigation since the underlying factors are likely to be more complex and variable over space.The Austin and San Antonio metropolitan areas in Texas, USA were selected for our LST mapping because of their hot and humid summers and population concentrations (Figure 1 and Table 1).However, there have been few UHI studies within these two metropolitan areas, and comparisons of UHI results are difficult because of variations in mapping units and land cover characteristics.Under that backdrop, the main objectives of this study are to answer the following questions: (1) Which underlying landscape properties (e.g., land cover and terrain morphology) are significantly related to the SUHI phenomena for Austin and San Antonio, Texas, and (2) compared to a global regression approach, does GWR provide improved insight about the landscape drivers of LST?

Study Areas
As two of the four major metropolitan areas of Texas, the Austin-Round Rock (Austin) and the San Antonio-New Braunfels (San Antonio) metropolitan areas are located in the central south region of Texas, USA (Figure 1).The Interstate highway I-35, a major transportation corridor, passes north-south through these two metropolitan areas.Both areas are characterized by relatively flat terrain to the east, grading to more complex topography along the Western edge of the Balcones Escarpment.Based on the Köppen climate classification scheme, the region is considered humid subtropical with long and hot summers, short and mild winters, and warm and rainy spring and fall seasons [34] (Table 1).Both the Austin-Round Rock and San Antonio-New Braunfels metropolitan areas are located in a unique and narrow transitional zone that ranges from semi-arid vegetation cover dominated by trees and shrubs in the west, to humid and more densely vegetated prairie/grassland to the east.

Landsat 8 OLI/TIRS Imagery
The moderate spatial resolution and spectral bands of Landsat data are amenable to modeling SUHI variation and underlying landscape patterns at the extent of a metropolitan area.Because of the harmful impact of high temperature on human health in our study areas, summer (June, July, and August) and day-time SUHI phenomena were considered in this SUHI study.Meanwhile, at least two Landsat images are needed to cover the entire San Antonio area.In this sense, two cloud-free Landsat 8 OLI/TIRS images acquired on 20 July 2015 were obtained from the United States Geological Survey (USGS) Earth Explorer for LST calculation and derivation of related spectral indices (e.g., NDVI).Image preprocessing was implemented using ERDAS Imagine 2015 and included stacking individual bands and clipping the image stack of the study areas.Digital numbers for all reflective bands were converted to top-of-atmosphere (TOA) reflectance using scaling coefficients provided in the image metadata.

Lidar Dataset Processing
Lidar data were used to calculate a series of morphological properties closely related to SUHI variation, including topography and building morphology.Lidar datasets were obtained from the Texas Natural Resources Information System (TNRIS) and downloaded using an online interface.Table 2 provides the relevant metadata information about the different airborne Lidar survey project between 2007 and 2012 that were used to calculate the Lidar-derived model covariates.Lidar data processing included the generation of digital terrain models (DTM) and building footprint extraction.Five-meter DTMs were generated using the "LAS Dataset to Raster" tool in ArcGIS 10.5 for individual Lidar projects, using the 3D Analyst Conversion Toolbox and "Average Cell Assignment Type" for ground classified returns.The DTMs from the different source projects were merged together after being re-projected to the same spatial reference (NAD UTM Zone 14N).Similarly, 5 m digital surface models (DSMs) were generated with the same tool but using the "Maximum Cell Assignment Type" for all Lidar returns.The DSMs represented the ground surface combined with built-up areas, vegetation, and other features on the landscape.The same reprojection and data merging procedures were applied to the DSMs. Figure 2 shows the spatial pattern of DSMs for the study areas and a detailed illustration of DSMs for a selected area in San Antonio.several trial and error runs.Subsequently, building outlines were extracted as polygons from the Lidar data in LP360 with the automated point cloud task tool "Point Group Tracing and Squaring" for the developed areas.To further improve the accuracy of the building extraction, we manually interpreted and deleted erroneous building outlines with the aid of high-resolution current and historical images from Google Earth and base maps in ArcGIS.Figure 3 provides examples of detailed building extraction results for selected intensively developed areas.Buildings are an essential component of SUHI formation, both in terms of building height and fractional cover on the landscape.Building footprints for portions of central and south Austin were created by the City of Austin Enterprise Geospatial Services; however, they did not cover the entire metropolitan area.Consequently, we extracted building footprints using the dataset of Lidar points in LP360, a type of software for Lidar data management and post-processing.Buildings were first classified using the point cloud task tool "Planar Point Filter", in which the parameters were set after several trial and error runs.Subsequently, building outlines were extracted as polygons from the Lidar data in LP360 with the automated point cloud task tool "Point Group Tracing and Squaring" for the developed areas.To further improve the accuracy of the building extraction, we manually interpreted and deleted erroneous building outlines with the aid of high-resolution current and historical images from Google Earth and base maps in ArcGIS.

SUHI Measurement
Top-of-atmosphere (TOA) radiance (i.e., radiance measured by the , sensor) was converted to brightness temperature according to the following formula: where is the temperature in Kelvin (K), and K1 and K2 are calibration constants specific to the

SUHI Measurement
Top-of-atmosphere (TOA) radiance (i.e., radiance measured by the L TOA,λ sensor) was converted to brightness temperature according to the following formula: where T sen is the temperature in Kelvin (K), and K1 and K2 are calibration constants specific to the Landsat TIRS sensor which can be obtained from the image metadata.
The brightness temperature was further adjusted using land surface emissivity (LSE) which is essential for LST inversion due to the noteworthy thermal variation of different land surface properties at a large spatial extent.The variation of vegetation coverage, surface moisture, roughness, and viewing angles leads to different LSEs for different cover types [35].The Normalized Difference Vegetation Index (NDVI) threshold emissivity estimation algorithm [36,37], a common method for LSE estimates, was applied in this study.The NDVI values were used to distinguish between soil and vegetated pixels before LSE calculation.The radiance-corrected values of band 4 (Red) and band 5 (NIR) from Landsat OLI were used for the corresponding NDVI calculation to mitigate the effects of vegetation phenology.
Specifically, following instructive literature, an NDVI threshold for rocks/soil (NDV I s ) was assigned a value of 0.2, and the NDVI threshold for vegetation (NDV I v ) was assigned a value of 0.5 [23,35].Thus, any pixel for which NDVI < NDV I v was assumed to be bare soil or rock with an ε λ (emissivity) value of 0.966.If NDVI > NDV I v , a pixel was considered to be fully vegetated and assigned an ε λ value of 0.986 [35].If the NDVI was between NDV I s and NDV I v , then the pixels were considered to be a combination of vegetation and rocks/soil.Equations ( 2)-( 4) were used to represent the relationship between NDVI and LSE to calculate the LSE for corresponding pixels.
where ε vλ and ε sλ are emissivity values of vegetation and soil, respectively.C λ was used to calibrate the cavity effect due to surface roughness: F is a geometrical factor, assigned a value of 0.55, by assuming different geographical distributions [37], while P V is the vegetation fraction.
The Planck's function was used to perform LSE correction of the substance compared to the blackbody.Thus, brightness temperatures were converted to LST [38,39].
where Ts is the LST in Kelvin (K), and B T is the brightness temperature (e.g., Tsen) in this study.λ is the wavelength of emitted radiance (band 10 was used for the LST calculation and λ = 10.895µm for Landsat 8 TRIS), ρ (e.g., h × c/σ) = 1.438 × 10 −2 mK.Ts was then converted into Celsius LST ( • C).

OLS and GWR Analysis
Unlike a conventional (global) regression model, GWR is able to model spatial variation in the relationship between dependent and independent variables.A GWR model takes the following form: where y i , x ik , and ε i are the dependent variable, the k independent variable (subscripted as k), and the random error at point i (subscripted as subscript i), respectively.The location is denoted by the coordinates (u i , v i ) of a given point (i).The coefficients β k (u i , v i ) are varying weights on the location, and β 0 (u i , v i ) is the geographically varying intercept.Thus, the GWR extends the global regression model by adding the geographical location parameter to generate the local coefficients to account for spatial non-stationarity.The estimates of β 0 (u i , v i ) and β k (u i , v i ) are based on the unbiased estimation of a set of observations, in which the weight matrix is used to weight the observations differently [40].The variation in β k (u i , v i ) at different locations makes the GWR different from the OLS.The distance to the given point (i) is an important factor that affects the weight matrix: Generally, the Gaussian function is used to generate the weight matrix (W(u i , v i )) with the following form: Here, an adaptive Gaussian kernel function was adopted for the analysis, where optimal the bandwidth was detected through a golden search algorithm in GWR 4. Here, by incorporating the weight into the pooled observations, the GWR yielded more precise estimates compared to the OLS.
Global regression models were also developed and compared to the GWR results.The coefficient of determination (R 2 ), the global Moran's I of the residuals, the Akaike Information Criterion (AIC), and the corrected AICc were used to compare the performances of global regression models versus the GWR model with respect to the goodness-of-fit and residual spatial autocorrelation.In this paper, both the GWR and global regression models were built using the open source platform GWR 4 [41].

Derivation and Selection of Explanatory Variables
Based on knowledge from prior SUHI studies [21][22][23][24][25][26][27], a suite of potential explanatory variables was selected for two models, one each for the metropolitan areas under consideration.Considering the variable type, the explanatory variables were categorized into three groups, as summarized in Table 3.

Land Use/Land Cover Composition (LULC) Variables
The Buildings Fraction (BF), Impervious Surfaces Fraction (ISF), NDVI, and Canopy were considered for SUHI explanatory variables in terms of the LULC composition.ISF and Canopy were downloaded as independent layers from National Land Cover Database (NLCD) 2011 [42], the thematic accuracy of which has been established in the literature [43].BF was derived using the building footprint features based on the Lidar datasets.

Landscape Pattern Metrics
The 30 m NLCD 2011 data was used to compute landscape metrics at the landscape level to quantify the general characteristics of the overall mosaic of LULC patches [42].The LULC types of the study area included open water, developed land (in four intensity levels), forest (deciduous, evergreen, and mixed), shrub/scrub, grassland/herbaceous, pasture/hay, cultivated land, and wetlands (woody, herbaceous).
The Contagion Index (CONTAG) and Patch Density (PD) were applied to describe aggregation.CONTAG is inversely related to the edge density.For instance, when a single class occupies a very large percentage of the landscape (low edge density), contagion is high and vice versa.It is affected by both the dispersion and interspersion of land types.PD is the number of patches on the landscape and describes the aggregation and subdivision characteristics of the various land covers.The Shannon's Diversity Index (SHDI) considers the proportional abundance of each patch type across all patch types and the Patch Richness (PR) quantifies the number of different patch types.These metrics were selected to measure the diversity characteristics on the landscape.In regard to optimal sampling scale selection, our previous study of Local Climate Zones (LCZs) indicated that 270 m is an optimal scale for an urban climate study as a uniform measure of land cover and surface structure [44].Based on this finding, 1000 m × 1000 m grids cells were used as observations to mitigate the spatial dependence and spatial autocorrelation.For consistency, all landscape metrics were calculated in FRAGSTATS using a radius of 1000 m centered on sampling points distributed throughout the study areas.

Terrain Factors: Elevation and Northness
Air temperatures are influenced by the elevation [45] and variation in elevation results in spatial patterns of LST on the landscape [30].Additionally, the landform aspect strongly affects the intensity of solar radiation and has been included in several SUHI studies [30,31].Aspect (the compass direction of a slope) was transformed to northness to mitigate the circular property of the data using Equation ( 9): All of the explanatory variables were aggregated or resampled to a resolution of 90 m to match the dependent variable (LST).Furthermore, to mitigate spatial autocorrelation and to ensure that the sampling dataset represented the study area with sufficient information to understand SUHI patterns, a systematic sampling scheme was designed to obtain sampling points for the regression models.We referred to the previous SUHI explanatory studies at the megacity level and used 1000 m as the sampling interval for further exploration [30].The center points of the cells which were completely within the boundaries of the study areas were selected as independent observations.Finally, 3887 and 4113 samples were generated for the Austin and San Antonio metropolitans, respectively.

LST Spatial Distribution
The spatial distribution of LSTs revealed the formation of an SUHI on 20 July 2015 for the two metropolitan areas (Figure 4).As expected, extreme values of LST (>35 • C) were often associated with the distribution of major transportation lines (e.g., I 35) and high-density buildings.The high LST (>30 • C) areas for each study area were mainly in both downtown centers and isolated urbanized areas (e.g., Georgetown, New Braunfels).In contrast, low LSTs (<25 • C) were distributed in central to Western Austin and Northern San Antonio, where forested areas and water bodies are located.The remaining areas exhibited relatively cool LSTs.

Terrain Factors: Elevation and Northness
Air temperatures are influenced by the elevation [45] and variation in elevation results in spatial patterns of LST on the landscape [30].Additionally, the landform aspect strongly affects the intensity of solar radiation and has been included in several SUHI studies [30,31].Aspect (the compass direction of a slope) was transformed to northness to mitigate the circular property of the data using Equation ( 9): All of the explanatory variables were aggregated or resampled to a resolution of 90 m to match the dependent variable (LST).Furthermore, to mitigate spatial autocorrelation and to ensure that the sampling dataset represented the study area with sufficient information to understand SUHI patterns, a systematic sampling scheme was designed to obtain sampling points for the regression models.We referred to the previous SUHI explanatory studies at the megacity level and used 1000 m as the sampling interval for further exploration [30].The center points of the cells which were completely within the boundaries of the study areas were selected as independent observations.Finally, 3887 and 4113 samples were generated for the Austin and San Antonio metropolitans, respectively.

LST Spatial Distribution
The spatial distribution of LSTs revealed the formation of an SUHI on 20 July 2015 for the two metropolitan areas (Figure 4).As expected, extreme values of LST (>35 °C) were often associated with the distribution of major transportation lines (e.g., I 35) and high-density buildings.The high LST (>30 °C) areas for each study area were mainly in both downtown centers and isolated urbanized areas (e.g., Georgetown, New Braunfels).In contrast, low LSTs (<25 °C) were distributed in central to Western Austin and Northern San Antonio, where forested areas and water bodies are located.The remaining areas exhibited relatively cool LSTs.

Diagnostics of the Regression Models
Prior to modeling, correlation analyses were conducted among the explanatory variables to assess multicollinearity.Variance inflation factors higher than 10 were considered highly correlated.Of the all candidate variables, five were used for analysis in the following global and local regression analysis: the Shannon's Diversity Index (SHDI), the Buildings Fraction (BF), NDVI, northness, and elevation.For the sake of comparability and interpretation, all of the dependent and independent variables for each were transformed to a range of values spanning from 1 to 100 as the inputs for the global and GWR model.
The global regression models for each of the two study areas estimated statistically significant (p < 0.001) relationships between the LST and all explanatory variables (Table 4).For Austin, the coefficient of determination (R 2 ) for the global regression model was estimated to be 0.53.Among the explanatory variables, the SHDI, BF, and elevation were found to vary positively with LST, while the NDVI and northness were negatively correlated with LST.The global model for San Antonio resulted in an R 2 of 0.45 and revealed the same tendencies regarding the explanatory variables and LST with one exception-the relationship between SHDI and LST was not statistically significant in the San Antonio model.Compared to these global regressions, GWR models seem better suited to the investigation of the SUHI and the underlying influencing factors for both study areas.Specifically, in the case of the Austin metropolitan area, the higher R 2 (0.85) for the GWR model suggests that the relationships between LST and the modeled underlying physical factors may exhibit spatial non-stationarity.Such circumstances are one reason why the GWR model seems to outperform the global regressions (Table 4).While the AICc (the corrected Akaike Information Criterion) is not an absolute measurement of goodness of fit, it contributes to the model comparison.Taking the Austin area as an example, the difference between AICc and GWR (e.g., 25,271.62) is much lower (4468.63)than that of the global model, which indicates the benefits of moving from the OLS to the GWR.Additionally, a full comparison of other diagnostic measures further suggests improved performance for GWR compared to the global regression model, including F-tests of GWR model improvement (34.42,Austin), and the Global Moran's I of the residuals (e.g., 0.44 for GWR vs 0.058 for the global model in Austin).

The Global and Spatial Non-Stationarity Relationship
Ordinary kriging interpolation using the Spatial Analyst Toolbox in ArcGIS 10.5 was applied to interpret the samples using a resolution of 300 m for the sake of visualization.First, both the global and GWR analyses indicated that LULC affected the LST variation on 20 July 2015, as indicated by the BF (Buildings Fraction) and NDVI (Table 4).GWR further revealed strong spatial heterogeneity in their relationships based on the coefficient values and t statistics.These covariate relationships were similar for Austin and San Antonio in the same range of estimates of BF coefficients (ranging from 0 to 0.6).Additionally, the t statistics of BF coefficients for both Austin and San Antonio indicated that the relationship between the LST and BF was significant for most of the study areas, especially urban areas.Nevertheless, the effect of BF was more prevalent for Austin than for San Antonio, as evidenced from the higher estimates of BF coefficients indicated by the global and local models.While the highest values of the local estimates of BF coefficients (displayed in red) were located in the isolated natural areas (Figure 5), the effect of the BF was prevalent and most of the high values were distributed in areas with compact human settlements (displayed in yellow and slight blue).
high values were distributed in areas with compact human settlements (displayed in yellow and slight blue).
As expected, an increase in NDVI reduced the SUHI effect, as suggested by the significantly negative relationship between LST and NDVI across the study areas.The estimated coefficient from the global regression was −0.74 for Austin and −0.38 for San Antonio, revealing a stronger relationship for LST and NDVI for Austin than San Antonio (Table 4).This fact was further proved by the GWR analysis, the values of local absolute coefficients were higher overall in Austin (e.g., displayed in red for majority of the region) than in San Antonio, indicating a higher capacity to mitigate the SUHI.However, attention should also be paid to the area around the Lake Travis in Austin and nearby the Calaveras Lake in San Antonio, where LST was notably positively related to NDVI, and the relationship was not significant in some part of this area, as indicated by t statistics (Figure 5).4).It indicated that the fragmented landscape (e.g., the interaction of different land patches) would lead to a decrease in the capacity to mitigate SUHI.This effect was further proved by the GWR analysis, from which the regression coefficient was positive for most of the area.However, it is notable that the As expected, an increase in NDVI reduced the SUHI effect, as suggested by the significantly negative relationship between LST and NDVI across the study areas.The estimated coefficient from the global regression was −0.74 for Austin and −0.38 for San Antonio, revealing a stronger relationship for LST and NDVI for Austin than San Antonio (Table 4).This fact was further proved by the GWR analysis, the values of local absolute coefficients were higher overall in Austin (e.g., displayed in red for majority of the region) than in San Antonio, indicating a higher capacity to mitigate the SUHI.However, attention should also be paid to the area around the Lake Travis in Austin and nearby the Calaveras Lake in San Antonio, where LST was notably positively related to NDVI, and the relationship was not significant in some part of this area, as indicated by t statistics (Figure 5).SHDI, as the indicator of the diversity of landscape spatial arrangement, had an overall positive effect on the LST intensity, as explained by the global regression for Austin (Table 4).It indicated that the fragmented landscape (e.g., the interaction of different land patches) would lead to a decrease in the capacity to mitigate SUHI.This effect was further proved by the GWR analysis, from which the regression coefficient was positive for most of the area.However, it is notable that the situation was contrary in the area around the Lake Travis in Austin, as indicated by the slightly negative local regression coefficients (Figure 6).Additionally, the global regression revealed a non-relationship between SHDI and LST for the San Antonio metropolitan area, and GWR further found that the effects of SHDI on LST variation were not significant for most of the area, as indicted by t statistics (e.g., −1.5 to 1.2, Figure 6).Nevertheless, the GWR analysis did find some exceptions where the SHDI was a strong predictor for LST variations (e.g., the small areas in the Southern and Northeastern parts of San Antonio metropolitan area colored in red, Figure 6).By linking with Google Earth imagery and the NLCD 2011, we found that Southern part displayed in red is covered by forest, pasture, and shrub with high complexity.Also, this area demonstrated low DSM, as indicated by Figure 2. by t statistics (e.g., −1.5 to 1.2, Figure 6).Nevertheless, the GWR analysis did find some exceptions where the SHDI was a strong predictor for LST variations (e.g., the small areas in the Southern and Northeastern parts of San Antonio metropolitan area colored in red, Figure 6).By linking with Google Earth imagery and the NLCD 2011, we found that Southern part displayed in red is covered by forest, pasture, and shrub with high complexity.Also, this area demonstrated low DSM, as indicated by Figure 2. Terrain factors affected the LST variation on 20 July 2015, as indicated by the elevation and northness.For both Austin and San Antonio, the elevation exhibited an overall slight positive effect in the global regression (e.g., 0.20 and 0.09 of coefficients, Table 4).GWR further helped to explore this unexpected finding by indicating that negative coefficients indeed exist in the rural areas for both Austin and San Antonio, especially in the areas with low vegetation cover (e.g., the Northeastern part of Austin and Eastern San Antonio colored with blue, Figure 7).On the other hand, both the global and local regressions yielded the significantly negative coefficient (average coefficient) of northness, which indicated that the northness is a strong predictor of the LST (Table 4).Again, the effect of northness varied spatially, and it had a stronger effect on San Antonio (−5.34 vs. −2.07indicated by the global regression, Table 4), especially in the Northern part (a larger area colored with red indicated by the GWR, Figure 7).
The spatial patterns of the local determination coefficients (R 2 ) of the GWR model highlighted a marked regional heterogeneity.Overall, GWR outperformed the global regression with local R 2 values larger than 0.5 across both study areas (Figure 8).GWR modeling was also characterized by higher local R 2 values in the city centers (>0.8) while relatively lower values were observed (0.5-0.65) in the surrounding areas.Standardized residuals, indicating the under or overprediction of LST, were distributed with significant spatial autocorrelation as tested by the Global Moran's I test in ArcGIS.As shown in Figure 8, the standardized residuals from both global and local regression were distributed in clusters, while the spatial clustering effect resulting from the global regressions was much more apparent than those resulting from the GWR.Terrain factors affected the LST variation on 20 July 2015, as indicated by the elevation and northness.For both Austin and San Antonio, the elevation exhibited an overall slight positive effect in the global regression (e.g., 0.20 and 0.09 of coefficients, Table 4).GWR further helped to explore this unexpected finding by indicating that negative coefficients indeed exist in the rural areas for both Austin and San Antonio, especially in the areas with low vegetation cover (e.g., the Northeastern part of Austin and Eastern San Antonio colored with blue, Figure 7).On the other hand, both the global and local regressions yielded the significantly negative coefficient (average coefficient) of northness, which indicated that the northness is a strong predictor of the LST (Table 4).Again, the effect of northness varied spatially, and it had a stronger effect on San Antonio (−5.34 vs. −2.07indicated by the global regression, Table 4), especially in the Northern part (a larger area colored with red indicated by the GWR, Figure 7).The spatial patterns of the local determination coefficients (R 2 ) of the GWR model highlighted a marked regional heterogeneity.Overall, GWR outperformed the global regression with local R 2 values larger than 0.5 across both study areas (Figure 8).GWR modeling was also characterized by higher local R 2 values in the city centers (>0.8) while relatively lower values were observed (0.5-0.65) in the surrounding areas.Standardized residuals, indicating the under or overprediction of LST, were distributed with significant spatial autocorrelation as tested by the Global Moran's I test in ArcGIS.As shown in Figure 8, the standardized residuals from both global and local regression were distributed in clusters, while the spatial clustering effect resulting from the global regressions was much more apparent than those resulting from the GWR.

Which Underlying Properties Are Significantly Related to the SUHI for Austin and San Antonio?
Overall, the regression analyses revealed that LULC composition and terrain morphology are closely related to SUHI effects for both metropolitan areas.Similarities in climate and topography for these metropolitan areas facilitates the comparison of their respective SUHIs.First, the composition of land cover was shown to be an essential factor that influences the LST pattern for both study areas.Our study confirmed previous findings that an increase in the building density (e.g., Buildings Fraction; BF) tends to exacerbate the SUHI effect, while an increase in the vegetation  Overall, the regression analyses revealed that LULC composition and terrain morphology are closely related to SUHI effects for both metropolitan areas.Similarities in climate and topography for these metropolitan areas facilitates the comparison of their respective SUHIs.First, the composition of land cover was shown to be an essential factor that influences the LST pattern for both study areas.Our study confirmed previous findings that an increase in the building density (e.g., Buildings Fraction; BF) tends to exacerbate the SUHI effect, while an increase in the vegetation cover intensity tends to mitigate the SUHI effect [24,33,[46][47][48][49]. Regarding the regional differences, although the effects tended to be similar, building density and vegetation cover intensity were more strongly correlated with the SUHI phenomenon in San Antonio than in Austin, as indicated by global BF coefficient (0.29 vs. 0.19) and NDVI coefficient (−0.74 vs. −0.38)(Table 4).Furthermore, it is notable that the situation was contrary in the area near a water body (e.g., Lake Travis in Austin) as indicated by the slightly negative local regression coefficients.The unique characteristics of water bodies regarding thermal characteristics were also consistent with previous studies [17,47].
Both global and local regressions provided unexpected results for elevation.Our results show that an increase in elevation is associated with an overall increase of LST.It is different with the common knowledge that air temperature decreases with an increase in altitude.In fact, it was DTM (e.g., bare earth digital terrain model) that was used as the potential explanatory factor for LST variation, while the actual surface temperature of the property was being remotely sensed.Air temperature varies between about 0.6 and 1.0 K per 100 m elevation change.Given its small variation in the study area, the effect of elevation may not be substantial.Thus, in the areas with dense building or forest cover, the effect of elevation would be not as severe as other factors when incorporating them simultaneously in the modeling.Also, most areas of Austin and San Antonio exhibit flat terrains; thus, the effect of elevation was not apparent.Furthermore, the Lidar-derived DSM proved to be an effective way to characterize the terrain morphology at the microscale (e.g., 5 m × 5 m) to understand the relationship between northness and SUHI variation.Consistent with previous studies, this research indicated that the alleviating effect of northness was more apparent for floodplain areas with low and flat terrains for both the Austin and San Antonio metropolitan areas (e.g., Refs.[31][32][33]).
The effect of landscape configuration on SUHI formulation has drawn attention in recent years by incorporating the landscape matrix at the patch, class, or landscape level [21][22][23]26,27,46].For example, a recent study by Kim, et al. [50] found that larger and better-connected landscape patches have the effect of mitigating high LST at the neighborhood scale, while fragmented and isolated patches showed the opposite effect in the city of Austin.Similarly, this study found that the SUHI of Austin on 20 July 2015 was also affected by the spatial pattern of LULC, measured by SHDI, which was not detected in San Antonio.
These contrasting findings may be explained by the fact that the landscape of San Antonio is more aggregated and less diverse as indicated by the statistics of potential explanatory variables (Table 3).In this study, to balance the spatially detailed characterization of SUHI and the size of samples in the modeling, 1000 m was set as the sampling interval as well as the window size for landscape matrix calculation for both study areas.Since both landscape patterns and LST distributions are scale dependent, previous studies have drawn different conclusions on the suitable scale for investigation of the relationship (e.g., 700 m for Beijing, China [51]; 600 m for Wuhan, China [52]).In addition, the appropriate scale also depends on landscape matrix selection.For instance, the SHDI showed the strongest correlation with LST at the 700 m scale which was not the optimal solution if other landscape matrices were taken into account, as indicated by a case study of Wuhan, China [53].In this sense, further research is warranted to examine the scale sensitivity of their relationships in the future to lead to more comprehensive understanding.

Compared to the Conventional Regression Model, Does the GWR Provide Fresh Insight into the SUHI Phenomenon?
As a natural process, SUHI exhibits high spatial heterogeneity which is difficult to characterize with conventional regression methods.However, most previous studies derived the spatial relationship by focusing on individual cities, especially for cities in Asia [24,33,[46][47][48][49], while varying heterogeneous impacts have been rarely studied and compared.The results reported in this paper suggest significant spatial non-stationarity in the relationship between the LST and explanatory variables for the two metropolitan areas.Here, GWR modeling was confirmed as an effective method to detect non-stationarity, especially for the urban cores of Austin and San Antonio as well as Western San Antonio with local R 2 values of higher than 0.8.Furthermore, locally detailed differentiation of the underlying mechanisms of SUHI provided by non-stationarity GWR modeling is necessary for partitioned regional landscape planning.With the implementation of GWR, studies have suggested site-specific policies designed for effective SUHI mitigation, including land use planning that considers the distance to roads to alleviate the high LST effect [30] and the location and configuration of green spaces in urban areas [31].Considering natural and socioeconomic factors, Li, Cao, Lang and Wu [33] mapped the potential heat sources and sinks of a megacity and performed GWR analysis based on the heat source and sink regions, where the partitioned policies were provided.
Whereas a classic regression method would provide an estimate of the mean value for an entire region regardless of the spatial pattern, GWR can infer a more dynamic approach to parameter estimations by using data from neighborhoods to determine the model parameters.In the case of identifying whether the SUHI phenomenon is influenced by local underlying physical factors, this dynamic model approach is desirable, as a "one size fits all" approach may not accurately identify the LST variation across a heterogeneous metropolitan landscape.In short, compared to conventional regression, GWR has the inherent potential to enable a better understanding of SUHI phenomenon and associated biophysical factors across the metropolitan area.

Conclusions
In this study, we used Landsat imagery from 20 July 2015, a Lidar dataset and derived products, NLCD 2011 land cover, and associated fractional cover products as the main data sources to investigate SUHI spatial distribution in the Austin and San Antonio metropolitan areas.This study further explored how the underlying surface characteristics affect SUHI phenomenon by using global regression and GWR analysis.
The results indicate that the GWR is in overall agreement with the global regression, and that both help to address the contributions of a set of specific underlying physical factors related to the SUHI phenomenon.The composition of land cover is an essential factor that influences the LST pattern for both study areas.Furthermore, the Lidar-derived DSM was proved to be an effective way to characterize the terrain morphology at the microscale (e.g., 5 m × 5 m) to aid in the understanding of the relationship between northness and SUHI variation.This study further found that the SUHI of Austin on 20 July 2015 was also affected by the spatial pattern of LULC, measured by SHDI, which was not detected in the San Antonio metropolitan area.Overall, this study contributed to obtaining a better understanding of SUHI phenomenon.
By accommodating spatial non-stationarity and allowing the model parameters to vary in space, GWR illustrated the spatial heterogeneity of the relationship between different land surface properties and the LST.In particular, the GWR analysis revealed considerably stronger relationships in some areas.Thus, together with the mapping result, the GWR analysis of the SUHI phenomenon can provide unique information for site-specific land planning and policy implementation for SUHI mitigation.
In this paper, the LST was calculated from a Landsat sensor, and the effect of urban facets was not considered in this SUHI study.For instance, in a dense built-up neighborhood, the uneven solar heating on different building surfaces can lead to the anisotropy of surface thermal emission [54,55].The LST on 20 July 2015 was calculated as the dependent variable, while the variation in the SUHI pattern at different times may affect the regression result.Uncertainties also come from the inconstancy of the acquisition time among input and output data for modeling.NLCD was built in the year 2011, whereas Lidar projects were conducted by different agencies with different accuracy standards and different time periods (Table 2).Hence, further work on the complete surface temperature acquisition and SUHI spatiotemporal variation is warranted.Then, a robust analysis of underlying properties related to the SUHI phenomenon can be performed.

1 )
Inquired or calculated by the authors in ArcMap based on the projection system "WGS 1984 UTM Zone 14N".(2) US Census Bureau.(3) Derived from 5 m Digital Terrain Models (DTMs), built by the authors from Lidar data provided by the Texas Natural Resources Information System (TNRIS).(4) US climate data website (www.usclimatedata.com).

Figure 1 .
Figure 1.The Austin and San Antonio metropolitan areas in Texas, U.S.Figure 1.The Austin and San Antonio metropolitan areas in Texas, U.S.

Figure 1 .
Figure 1.The Austin and San Antonio metropolitan areas in Texas, U.S.Figure 1.The Austin and San Antonio metropolitan areas in Texas, U.S.

Figure 2 .
Figure 2. Digital surface models (DSMs) with a 5 m resolution for Austin (left), San Antonio (middle), and a detailed illustration of DSMs for a selected area of San Antonio (right).

Figure 2 .
Figure 2. Digital surface models (DSMs) with a 5 m resolution for Austin (left), San Antonio (middle), and a detailed illustration of DSMs for a selected area of San Antonio (right).

18 Figure 3 .
Figure 3. Detailed illustrations (left) of building extractions from a selected part (right) of the study areas.

Figure 3 .
Figure 3. Detailed illustrations (left) of building extractions from a selected part (right) of the study areas.

Figure 4 .
Figure 4. Overall spatial variation of the and surface temperature (LST) for the SUHIs of Austin (left) and San Antonio (right) metropolitan areas on 20 July 2015.

Figure 4 .
Figure 4. Overall spatial variation of the and surface temperature (LST) for the SUHIs of Austin (left) and San Antonio (right) metropolitan areas on 20 July 2015.

Figure 5 .
Figure 5. Overall spatial variation of the coefficients (left) and the t statistics (right): Buildings Fraction (BF) and NDVI.

Figure 5 .
Figure 5. Overall spatial variation of the coefficients (left) and the t statistics (right): Buildings Fraction (BF) and NDVI.

Figure 6 .
Figure 6.Overall spatial variation of the coefficients (left) and the t statistics (right): Shannon's Diversity Index (SHDI).

Figure 6 .
Figure 6.Overall spatial variation of the coefficients (left) and the t statistics (right): Shannon's Diversity Index (SHDI).

Figure 7 .
Figure 7. Overall spatial variation of the terrain-related coefficients (left) and the t statistics (right): elevation and northness.

Figure 7 .
Figure 7. Overall spatial variation of the terrain-related coefficients (left) and the t statistics (right): elevation and northness.

Figure 7 .
Figure 7. Overall spatial variation of the terrain-related coefficients (left) and the t statistics (right): elevation and northness.

Figure 8 .
Figure 8. Overall spatial distribution of local R 2 values (top), standardized residuals of the GWR (bottom left), and global regression (bottom right) as measures of model goodness-of-fit.

Figure 8 .
Figure 8. Overall spatial distribution of local R 2 values (top), standardized residuals of the GWR (bottom left), and global regression (bottom right) as measures of model goodness-of-fit.

1 .
Which Underlying Properties Are Significantly Related to the SUHI for Austin and San Antonio?

Table 1 .
Summary of the geographic, demographic, and climatic characteristics of Austin and San Antonio, Texas.AreasLocation (

Table 3 .
Potential explanatory variables: derivation sources and statistical summary of the observations.

Table 4 .
Comparison of global regression and Geographically Weighted Regression (GWR): summary of the coefficients and diagnostics.