The Dynamic Relationship between Air and Land Surface Temperature within the Madison, Wisconsin Urban Heat Island

: The urban heat island (UHI) effect, the phenomenon by which cities are warmer than rural surroundings, is increasingly important in a rapidly urbanizing and warming world, but ﬁne-scale differences in temperature within cities are difﬁcult to observe accurately. Networks of air temperature (T air ) sensors rarely offer the spatial density needed to capture neighborhood-level disparities in warming, while satellite measures of land surface temperature (LST) do not reﬂect the air temperatures that people physically experience. This analysis combines both T air measurements recorded by a spatially-dense stationary sensor network in Dane County, Wisconsin, and remotely-sensed measurements of LST over the same area—to improve the use and interpretation of LST in UHI studies. The data analyzed span three summer months (June, July, and August) and eight years (2012–2019). Overall, T air and LST displayed greater agreement in spatial distribution than in magnitude. The relationship between day of the year and correlation was ﬁt to a parabolic curve (R 2 = 0.76, p = 0.0002) that peaked in late July. The seasonal evolution in the relationship between T air and LST, along with particularly high variability in LST across agricultural land cover suggest that plant phenology contributes to a seasonally varying relationship between T air and LST measurements of the UHI.


Introduction
In the context of a rapidly urbanizing planet [1,2] and rising global temperatures [3], the extra heat burden experienced in cities raises concern. The common pattern of increased temperatures in cities, referred to as the urban heat island (UHI) effect, has been tied to poor air quality [4], increased energy consumption [5,6], and multiple health risks, including increased mortality [7]. Extreme heat is already one of the deadliest weather-related threats [8], and the combined effect of climate change and urbanization are expected to increase dangerous exposure to extreme heat among urban populations [9]. Detailed and accurate monitoring of UHIs is needed to adequately predict and prepare for extreme heat risk [10,11].
Historically, a variety of data sources have been used to observe and analyze UHIs, each with its own advantages and drawbacks. Observational studies may rely on measurements from pairs of urban and rural air temperature (T air ) sensors (e.g., [12]) or use networks of dozens or hundreds of sensors [13]. Mobile sensors have also been used to measure T air at high spatial resolution along urban transects [14][15][16][17]. These mobile transects typically cannot provide a historical temperature record, while weather stations that offer a long historic record may provide limited spatial coverage, and few high-density sensor networks have been maintained for longer than five years [13]. Therefore, an alternative to using T air measurements to study the canopy-level UHI is studying the surface urban heat island (SUHI) with satellite data [18]. Satellite measurements of land surface temperature U.S. urban areas: the city of Madison, Wisconsin, which has a population of 258,000, and the surrounding county, which has a total population of 547,000 over roughly 3100 km 2 (1200 mi 2 ) of land [37].
This study uses data from a spatially dense network of air temperature sensors throughout the Madison area to build on past investigations of the relationship between the SUHI and canopy-level UHI. In addition to its high spatial resolution, our sensor array is distinguished by the diversity of land use covered. Although most of the network is within city limits, 40% of the sensors are located outside these boundaries, providing coverage of agricultural land dominated by annual crops, as well as nearby forest, grasslands, wetlands, and neighboring towns [13]. Similar to other mid-sized cities in the Corn Belt, land cover in Madison quickly transitions from urban development to cropland near the city limits [38]. The spatial density of the Madison sensor network allows for a uniquely detailed examination of how urban warming varies across a representative Midwest city. By comparing T air measured by this sensor network with satellite measurements of summer LST, we investigate how accurately the SUHI represents T air across the heterogeneous landscape of Dane County, Wisconsin. This study explores how the relationship between summer T air and LST changes with land cover and throughout the growing season. By comparing T air and LST along these lines, we can better understand when and how to use satellite data to study urban warming in a mid-sized city within a heterogeneous agricultural region.

Study Area
This study focuses on Dane County (Wisconsin) and the city of Madison, located near the center of the county. Dane County has a population of roughly 547,000, and nearly half of the county's residents (260,000 total) live within the city of Madison [37]. A chain of four lakes lies within and around the city. The county's rural areas are primarily agricultural, but also contain forest, wetlands, and grasslands ( Figure 1a). Across Dane County, 15% of the land area is developed and 46% of the area is cropland, with corn and soybeans the two most commonly grown crops [39].

Satellite Data
This analysis uses Landsat 7 and Landsat 8 scenes from the U.S. Geological Survey (USGS) Analysis Ready Data (ARD) Provisional Surface Temperature product. These two sensors have different thermal band wavelengths (10.4-12.5 µm for Landsat 7 and 10.6-11.2 µm for Landsat 8 band 10) and resolutions (60 m for Landsat 7 and 100 m for Landsat 8), but USGS produces 30-m LST from both, using higher resolution input from visible and shortwave bands to resample the coarser thermal data [40]. The USGS ARD product was chosen for this analysis due to the detailed land surface emissivity incorporated in these data [40]. Calculating LST requires measurements of the radiance received at the satellite, as well as knowledge of the sensor viewing angle, thermal band wavelength, the emissivity of the land surface, and atmospheric transmissivity, path radiance, and downwelling irradiance [40]. The sensor measurements, viewing angle, and thermal wavelength are straightforward to measure, and the atmospheric parameters can be calculated with a radiative transfer model [40]. However, estimating surface emissivity is more difficult, as it requires inputs that are independent of the sensor, and small errors in emissivity can produce much larger errors in LST [40]. Instead of using land cover classifications or vegetation indices to estimate emissivity [40,41], the USGS ARD product incorporates emissivity from the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) Global Emissivity Data Set (GED) [42]. This dataset averages millions of ASTER datapoints collected between 2000 to 2008 to produce a monthly worldwide database at 100 m resolution with an error of roughly 1% [42]. Because surface emissivity is more precisely known over water than the land surface, the USGS ARD product has been found to have an average error of less than 1 • C over water and 2.2-2.3 • C over land [40].

Satellite Data
This analysis uses Landsat 7 and Landsat 8 scenes from the U.S. Geological Survey (USGS) Analysis Ready Data (ARD) Provisional Surface Temperature product. These two sensors have different thermal band wavelengths (10.4-12.5 µm for Landsat 7 and 10.6-11.2 µm for Landsat 8 band 10) and resolutions (60 m for Landsat 7 and 100 m for Landsat 8), but USGS produces 30-m LST from both, using higher resolution input from visible and shortwave bands to resample the coarser thermal data [40]. The USGS ARD product was chosen for this analysis due to the detailed land surface emissivity incorporated in these data [40]. Calculating LST requires measurements of the radiance received at the satellite, as well as knowledge of the sensor viewing angle, thermal band wavelength, the emissivity of the land surface, and atmospheric transmissivity, path radiance, and downwelling irradiance [40]. The sensor measurements, viewing angle, and thermal wavelength are straightforward to measure, and the atmospheric parameters can be calculated with a radiative transfer model [40]. However, estimating surface emissivity is more difficult, as it requires inputs that are independent of the sensor, and small errors in We limited this analysis to the summer months (June through August), which is when urban warming poses the greatest threat to human health and when Madison's urban-rural temperature difference is largest [13]. We further limited the study period to the years from 2012 through 2019 to overlap all eight years that the Madison temperature sensor network has been operational. Using the USGS EarthExplorer tool, we initially identified 101 Landsat 7 or Landsat 8 scenes recorded during this study period. Because clouds emit radiation at lower temperatures than the land surface that they obscure, scenes with heavy cloud cover are not usable for LST analysis [43]. After filtering for dates with less than 30% cloud cover and visually inspecting the results to eliminate scenes where clouds obscured Dane County, we chose 15 dates, including at least one scene from each year of the study period and a relatively even distribution across both satellites and all three summer months (Table 1). Although all 15 dates had clear skies, meteorological conditions were otherwise representative of the typical range seen during the summer in Madison, with daily low temperatures ranging from 8.9 • C to 21.7 • C, daily highs ranging from 21.7 • C to 33.9 • C, and daily wind speeds between a mean of 1.21 m/s and 4.47 m/s and a maximum of 3.58 m/s and 8.05 m/s (Table 1). Table 1. Dates, satellite, daily high and low air temperatures, and daily mean and maximum wind speed for the 15 scenes analyzed. Daily temperature data and wind speed were recorded at the Dane County Regional Airport and accessed through the Midwestern Regional Climate Center's online data portal.

Date
Satellite Because Dane County spans two tiles on the ARD grid (h20v07 and h20v06), 30 scenes were downloaded in total. Unlike other satellite products that use temporally variable tile boundaries, ARD tiles are defined with fixed, non-overlapping coordinates so that adjacent scenes can be mosaicked without requiring additional processing [44]. Therefore, after masking cloudy pixels and scan line artifacts, the two scenes for each date were simply stitched together using ENVI's Seamless Mosaic tool, then clipped to Dane County boundaries. We also chose to mask the water area from all 15 Landsat scenes, because the sensor network does not extend over the local lakes, preventing comparison between LST and T air over these areas. The resulting 30m Landsat scenes were used in the land cover analysis after these processing steps. However, the comparisons between LST and T air required both datasets to share the same spatial resolution. The interpolated sensor data has a 400 m resolution, so we needed to reduce the resolution of each Landsat scene from 30 m to 400 m, using the "aggregate" function in R's raster package [45]. Because 30 m is not an integer factor of 400 m, aggregating the Landsat scenes to this coarser resolution required resampling the raster. Aggregating and resampling satellite data can introduce errors and reduce the level of detail in spatial patterns, occasionally distorting or obscuring patterns entirely [46,47]; however, this processing was necessary to produce LST data files that could be compared with T air .

Stationary Air Temperature Data
The T air data used in this analysis comes from an observational network of 146 HOBO U23 Pro v2 temperature and relative humidity data loggers mounted on utility poles throughout the Madison area. Each sensor is installed 3.5 m above the ground, a height that was chosen to avoid damage from passing pedestrians or vehicles. The first 135 sensors were launched in March 2012, 16 more were added over the following year, and five sensors have been broken or lost since then. This sensor array has been recording T air and relative humidity measurements every 15 min since being launched. This network, one of the largest and densest in the world, is described more fully in [13]. To provide a consistent comparison with the LST data, this analysis used T air measurements collected on the same days as the satellite data. All 15 Landsat scenes were recorded between 11:30 a.m. and 11:45 a.m. local time, so we selected sensor measurements from 11:30 a.m. local time to understand the relationship between approximately simultaneous T air and LST conditions. Additionally, because the differences in magnitude between T air and LST are often greatest near noon [23] and past research [31,48] has found satellite-derived LST to be more closely correlated to T air at night than during the day, we also used T air measurements from 12 h prior to each satellite overpass (11:30 pm) to allow for analysis of the relationship between LST and overnight T air .
For all 15 dates studied, 400-m spatial interpolations of daytime and nighttime T air were produced using regression kriging, a technique that incorporates land surface covariates and has been found to produce more accurate spatial models of urban temperatures than ordinary kriging [13,49]. Following the process described in [13], impervious surface cover within a 500-m radius (from the 2016 National Land Cover Database), and topography and lake effects within each 400-m grid cell were used as covariates, and lake area was masked from the final interpolated files.

Land Cover Data
Our analysis of how temperature anomalies differ by land cover used data from the 2016 National Land Cover Database (NLCD), which provides detailed classifications across the continental U.S. at 30-m spatial resolution [50]. Across Dane County, 15 different NLCD classes are represented ( Table 2). Past research has found that the classification accuracy of satellite-derived land cover datasets, such as NLCD, can be improved by consolidating similar classes [51]. Because our primary focus was on the differences in temperature anomaly between urban areas, unmanaged rural land, and managed agriculture, rather than the differences within these three general categories, we chose to consolidate the 15 represented NLCD classes into 8 broader groups. Two of these 8 categories-bare land and shrub and grassland-were excluded from the land cover analysis because they each comprise less than 1% of Dane County land area and were therefore assumed to have a minimal impact on the local UHI. Although 3.19% of Dane County is open water, this category was also excluded from analysis, because none of the sensors are located over water, preventing comparison between LST and T air in this category. The full set of original and reclassified categories are listed in Table 2. Temperatures can vary greatly between different dates or times of the day. Therefore, temperature anomalies, or the difference in temperature between a given location and a reference point, were analyzed instead of absolute temperatures to provide a more consistent basis for comparison. In UHI studies, reference points are often rural, so that the anomaly at a specific location represents the UHI magnitude relative to a rural baseline; however, identifying an appropriate rural site can be tricky [52,53]. A forested reference site, for example, may produce a different seasonal pattern than an agricultural site, even for the same city. Given the heterogeneity of Dane County's rural landscape, this analysis used central urban temperatures as a reference to avoid preferencing one type of rural land over another. Using the approach described in [15], the reference temperature was calculated as the average across five downtown sensors (Figure 1b) in the area with the greatest population density and highest observed air temperatures [13,54]. To calculate temperature anomaly, the temperatures (either LST or T air ) across these five sensors on that date and time were first averaged, then subtracted from the temperature at each pixel. The resulting LST anomaly (LST anom ) or T air anomaly (T air(anom) ) at a given location represents how much cooler that point is compared to downtown Madison.

T air vs. LST Comparisons
Analysis of the relationship between LST anom and T air(anom) included both the magnitude of difference and the correlation in spatial pattern. We used two different metrics to quantify the difference between the T air and LST-based UHIs. The root mean square error (RMSE) is used to describe how LST anom and T air(anom) differ in overall magnitude. RMSE was calculated using the following formula, where T anom,i represents the 11:30 a.m. T air(anom) at pixel location i,T anom,i represents a different measure of UHI (either LST anom or 11:30 p.m. T air(anom) ) being compared to 11:30 a.m. T air(anom) at pixel location i, and N represents the total number of pixels in the scene. Pearson correlation coefficients were calculated between paired scenes to describe how LST anom and T air(anom) differed in shape. These metrics were used to compare temperature anomalies averaged over all 15 dates studied, as well as to compare between individual dates. Wind speed, daily high and low temperature, and day of the year were all considered as factors that might account for the variability in RMSE and correlation. Lastly, LST anom and T air(anom) were compared by land cover. All analysis and visualizations were produced using R [55] with the raster [45] and rgdal [56] packages.

Overall Trends
Averaged over all 15 dates studied, the spatial distributions of LST anom , 11:30 p.m. T air(anom) , and 11:30 a.m. T air(anom) generally resembled each other ( Figure 2). All three temperature anomalies were highest where the impervious surface cover was high, including within Madison city limits and in the business districts of smaller Dane County cities, and coolest over rural land. These similarities were evident in the correlations between each temperature anomaly pair; the correlation coefficient was 0.74 between LST anom and 11:30 a.m. T air(anom) , and 0.68 between LST anom and 11:30 p.m. T air(anom). Correlation between the day and night T air(anom) (r = 0.80) was stronger than either of these surface and air temperature pairs. Despite the similarities in spatial pattern, the difference in magnitude between day T air , night T air , and LST were substantial. LST anom had the largest range, varying from -15.44 • C to 3.97 • C. Overall, Madison's overnight T air -based UHI had an ab-  The magnitude of difference between Tair(anom) and LSTanom varied by time of day a between rural and urban settings ( Figure 3). Across all pixels, the difference between s face and air temperatures was greater during the day than at night (p < 0.001). This smal difference at night than during the day held true for both urban ( Figure 3b) and ru ( Figure 3c) settings (p < 0.001 for both cases). In addition to this disparity between day a night, the difference between Tair(anom) and LSTanom was lower for urban pixels than ru ones, at both 11:30 am (p < 0.001) and 11:30 pm (p < 0.001). The magnitude of difference between T air(anom) and LST anom varied by time of day and between rural and urban settings ( Figure 3). Across all pixels, the difference between surface and air temperatures was greater during the day than at night (p < 0.001). This smaller difference at night than during the day held true for both urban ( Figure 3b) and rural ( Figure 3c) settings (p < 0.001 for both cases). In addition to this disparity between day and night, the difference between T air(anom) and LST anom was lower for urban pixels than rural ones, at both 11:30 a.m. (p < 0.001) and 11:30 p.m. (p < 0.001).  Urban pixels are all those categorized as higher and lower density development, and rural pixels are all those categorized as forest, agriculture, grasslands, barren, or wetlands.

Comparisons between Individual Dates
A wide range of statistically significant (p < 0.001) correlations between Tair(anom) and LSTanom were found over all 15 individual dates studied (Table 3). Although the same analysis was conducted for all 15 dates, Figure 4 highlights just two dates-5 June 2018 and 7 July 2018-that display the two extremes observed, despite occurring roughly one month apart. The spatial patterns of 11:30 am Tair(anom) (Figure 4a,b) were more similar across both dates (r = 0.95), than the spatial patterns of LSTanom (r = 0.45). The clear SUHI pattern visible on July 7th (Figure 4d) was absent on June 5th (Figure 4c). Instead, hot and cold patches were scattered throughout urban and rural areas of Dane County. Overall, RMSE between LSTanom and 11:30 am Tair(anom) was higher on 7 July 2018 (10.06 °C) than on 5 June 2018 (6.21 °C).  Urban pixels are all those categorized as higher and lower density development, and rural pixels are all those categorized as forest, agriculture, grasslands, barren, or wetlands.

Comparisons between Individual Dates
A wide range of statistically significant (p < 0.001) correlations between T air(anom) and LST anom were found over all 15 individual dates studied (Table 3). Although the same analysis was conducted for all 15 dates, Figure 4 highlights just two dates-5 and 7 July 2018-that display the two extremes observed, despite occurring roughly one month apart. The spatial patterns of 11:30 a.m. T air(anom) (Figure 4a,b) were more similar across both dates (r = 0.95), than the spatial patterns of LST anom (r = 0.45). The clear SUHI pattern visible on 7 July (Figure 4d) was absent on 5 June (Figure 4c). Instead, hot and cold patches were scattered throughout urban and rural areas of Dane County. Overall, RMSE between LST anom and 11:30 a.m. T air(anom) was higher on 7 July 2018 (10.06 • C) than on 5 June 2018 (6.21 • C).  On both 5 June 2018 and 7 July 2018, the scatterplot between 11:30 am Tair(anom) and LSTanom displays an artifact of the regression kriging; rural points that shared the same covariate values and were all substantial distances from the sensor network all share the same predicted Tair despite differing in LST, producing the vertical lines seen in Figure  4e,f. This artifact does not obscure the fact that there was a stronger linear relationship On both 5 and 7 July 2018, the scatterplot between 11:30 a.m. T air(anom) and LST anom displays an artifact of the regression kriging; rural points that shared the same covariate values and were all substantial distances from the sensor network all share the same predicted T air despite differing in LST, producing the vertical lines seen in Figure 4e,f. This artifact does not obscure the fact that there was a stronger linear relationship between 11:30 a.m. T air(anom) and LST anom on 7 July 2018 than there was one month earlier. The pattern of lower correlation on 5 June 2018 and higher correlation on 7 July 2018 was not limited to the comparison between LST anom and 11:30 a.m. T air(anom) . This trend was also evident in comparisons between day and night T air(anom) . On 6-7 July, the correlation between 11:30 a.m. and 11:30 p.m. T air(anom) was 0.79, while the same comparison for the 4-5 June scene had an r-value of just 0.37.
Although results from only two dates are depicted in Figure 4, all 15 dates displayed similar trends. Dates with low correlation between LST anom and 11:30 a.m. T air(anom) typically had low RMSE values as well. Figure 5a shows a positive linear trend between these two statistical metrics (R 2 = 0.54, p = 0.002). Overall, on days when the spatial distribution of LST anom most closely resembled the spatial distribution of 11:30 a.m. T air(anom) , the magnitude of the differences between these anomalies was also at its largest. No linear trend (R 2 = 0.0003, p = 0.95) was present between these metrics for the 11:30 p.m. T air(anom) and 11:30 a.m. T air(anom) comparison (Figure 5b). The impact of wind speed and daily high and low temperatures on RMSE and correlation coefficient was also explored, but no relationship was found. evident in comparisons between day and night Tair(anom). On 6-7 July, the correla tween 11:30 am and 11:30 pm Tair(anom) was 0.79, while the same comparison for the 4 scene had an r-value of just 0.37.
Although results from only two dates are depicted in Figure 4, all 15 dates di similar trends. Dates with low correlation between LSTanom and 11:30 am Tair(anom) t had low RMSE values as well. Figure 5a shows a positive linear trend between th statistical metrics (R 2 = 0.54, p = 0.002). Overall, on days when the spatial distrib LSTanom most closely resembled the spatial distribution of 11:30 am Tair(anom), the ma of the differences between these anomalies was also at its largest. No linear tren 0.0003, p = 0.95) was present between these metrics for the 11:30 pm Tair(anom) and 1 Tair(anom) comparison (Figure 5b). The impact of wind speed and daily high and lo peratures on RMSE and correlation coefficient was also explored, but no relations found.

Trends by Day of Year
Correlation between LST and 11:30 am Tair was lowest at the beginning and the summer, peaking in late July. Some summer weeks were represented more fre than others; for example, four scenes from the last week of July were included scenes between June 14th and 28th or between August 1st and 12th were suitable f ysis. Despite these gaps, a parabolic curve (R 2 = 0.76, p = 0.0002) was fit to the relat between the day of the year and the correlation between LSTanom and 11:30 am (Figure 6a). The correlation coefficient between 11:30 am Tair(anom) and 11:30 pm also followed a parabolic curve when plotted against the day of the year (Figure the fit was weaker (R 2 = 0.58, p = 0.002).

Trends by Day of Year
Correlation between LST and 11:30 a.m. Tair was lowest at the beginning and end of the summer, peaking in late July. Some summer weeks were represented more frequently than others; for example, four scenes from the last week of July were included, but no scenes between 14 and 28 June or between 1 and 12 August were suitable for analysis. Despite these gaps, a parabolic curve (R 2 = 0.76, p = 0.0002) was fit to the relationship between the day of the year and the correlation between LST anom and 11:30 a.m. T air(anom) (Figure 6a). The correlation coefficient between 11:30 a.m. T air(anom) and 11:30 p.m. T air(anom) also followed a parabolic curve when plotted against the day of the year (Figure 6b), but the fit was weaker (R 2 = 0.58, p = 0.002).

Impacts of Land Cover
Although the spatial patterns of the T air -based UHI were quite similar over all 15 dates, the shape and intensity of the SUHI varied greatly between days. Grouping LST anom by land cover shows that this variability in LST was greater for some land uses than for others (p < 0.001). Both lower and higher intensity development, for example, displayed low variability in LST anom across all 15 dates, and the standard deviation of the median pixel in these two categories was just 0.69 • C and 1.27 • C, respectively (Figure 7c,e). Forests and wetlands both displayed slightly greater variability, with standard deviations of 1.98 • C and 1.84 • C, respectively (Figure 7d,f). In contrast, median LST anom values across all agricultural land in Dane County was highly variable, with a maximum value of −5.14 • C on 2 June 2017, a minimum value of −13.37 • C on 9 July 2016, and an overall standard deviation of 2.81 • C, 0.83 • C higher than the next most variable category (Figure 7b). The variance across all fifteen days was higher for agriculture than for medium-high density development (p < 0.001) or for open-low density development (p = 0.003), but there was no statistically significant difference between the variance for agricultural and forested land (p = 0.101) or between agriculture and wetlands (p = 0.063). In addition to the relatively high standard deviation in median LST anom for agriculture, the spread of LST anom was also more variable for agriculture than any other land cover class. The interquartile range of LST anom for agriculture was largest early in the season, with an average value of 4.59 • C across all five June dates. Across six July dates and four August dates, the agricultural interquartile ranges fell to 2.91 • C and 3.04 • C, respectively. No other land cover categories demonstrated such variability in spread; the interquartile range was consistently large for the two developed classes and narrow for the two other vegetated classes.

Impacts of Land Cover
Although the spatial patterns of the Tair-based UHI were quite similar over al dates, the shape and intensity of the SUHI varied greatly between days. Grouping LST by land cover shows that this variability in LST was greater for some land uses than others (p < 0.001). Both lower and higher intensity development, for example, displa low variability in LSTanom across all 15 dates, and the standard deviation of the med pixel in these two categories was just 0.69 °C and 1.27 °C, respectively (Figure 7c,e). For and wetlands both displayed slightly greater variability, with standard deviations of °C and 1.84 °C, respectively (Figure 7d,f). In contrast, median LSTanom values acros agricultural land in Dane County was highly variable, with a maximum value of −5.1 on 2 June 2017, a minimum value of −13.37 °C on 9 July 2016, and an overall stand deviation of 2.81 °C, 0.83 °C higher than the next most variable category (Figure 7b). variance across all fifteen days was higher for agriculture than for medium-high den development (p < 0.001) or for open-low density development (p = 0.003), but there Over all 15 days, LST was consistently lowest for forest and wetland pixels and highest for medium-to high-density development. Lower intensity development was the second warmest category on every date other than 2 June 2017. On most days, median agricultural LST anom was higher than the other two vegetated classes and lower than the two developed land categories, but the difference between agricultural LST and the four other land cover classes shifted throughout the summer season. Across the three early June dates, median LST anom for agriculture fell within 1 • C of the median for open-low development, while over the latter half of the summer, median LST anom values across agricultural land were closer to the medians for forest and wetland pixels than to the developed land categories (Figure 7a Over all 15 days, LST was consistently lowest for forest and wetland pixels and highest for medium-to high-density development. Lower intensity development was the second warmest category on every date other than 2 June 2017. On most days, median agricultural LSTanom was higher than the other two vegetated classes and lower than the two developed land categories, but the difference between agricultural LST and the four other land cover classes shifted throughout the summer season. Across the three early June dates, median LSTanom for agriculture fell within 1 ºC of the median for open-low development, while over the latter half of the summer, median LSTanom values across agricultural land were closer to the medians for forest and wetland pixels than to the developed land categories (Figure 7a).

Increased Understanding of the UHI in a Mid-Sized Midwestern City
This study expanded upon past work using an observation network in Madison, Wisconsin to study the region's urban climate. By evaluating a longer period of record and incorporating two different measures of temperature, our results simultaneously support past findings and allow for a more nuanced look at Madison's urban climatology. An analysis of the first 20 months of sensor data (spanning two summers, including the historically warm summer of 2012) found that the Tair-based UHI was largest in August when it averaged 1.5 °C during the day and 4 °C at night [13]. Our analysis used a different methodology, evaluated fewer dates, and spanned a longer temporal range, yet we produced similar results. A 2014 analysis of the first year of Madison sensor network data identified a positive correlation between temperature and density of development and a visible distinction between land within Madison city limits, which experienced urban warming, and rural Dane County, where air temperatures were consistently lower [13]. We find these same spatial patterns in all eight summers of our own Tair analysis. For the first time, we show that these same patterns were also visible in summertime LSTanom. Our

Increased Understanding of the UHI in a Mid-Sized Midwestern City
This study expanded upon past work using an observation network in Madison, Wisconsin to study the region's urban climate. By evaluating a longer period of record and incorporating two different measures of temperature, our results simultaneously support past findings and allow for a more nuanced look at Madison's urban climatology. An analysis of the first 20 months of sensor data (spanning two summers, including the historically warm summer of 2012) found that the T air -based UHI was largest in August when it averaged 1.5 • C during the day and 4 • C at night [13]. Our analysis used a different methodology, evaluated fewer dates, and spanned a longer temporal range, yet we produced similar results. A 2014 analysis of the first year of Madison sensor network data identified a positive correlation between temperature and density of development and a visible distinction between land within Madison city limits, which experienced urban warming, and rural Dane County, where air temperatures were consistently lower [13]. We find these same spatial patterns in all eight summers of our own T air analysis. For the first time, we show that these same patterns were also visible in summertime LST anom . Our results offer new insight into the impact that the rural portions of the county have on regional UHI trends. We showed that high variability in agricultural LST anom may drive variability in the relationship between LST and T air .
Instead of comparing urban temperatures to a rural reference point, we compared temperatures to those recorded at central downtown sensors, essentially quantifying rural cooling instead of urban warming. This methodology avoids the ambiguity of identifying an appropriate rural reference. Past studies have shown that slight differences between chosen rural reference points can greatly impact the calculated UHI magnitude, making it difficult to compare UHIs between cities [52,53,57]. Our research highlights an additional concern; because rural LST anom was so variable, defining the UHI in reference to rural locations would restrict temperature anomaly comparisons across dates. Given the ubiquity of satellite data in UHI studies, the results seen here are worth considering more broadly.
This finding is likely most relevant for other Midwestern cities, particularly those with mid-sized populations. We believe that the observed variability in rural LST is tied to Madison's heavily agricultural rural surroundings, which resembles the landscape that also surrounds many similarly-sized Midwestern cities.

Impacts of Plant Phenology on the SUHI
Previous research on urban phenology has found that UHIs can lengthen the growing season and shape growth characteristics as far as 15 km outside of cities [58][59][60][61]. Midwestern agriculture has been observed to impact the regional climate by reducing daytime high temperatures [62,63]. The impact of agricultural phenology on temperature differences across urban, peri-urban, and rural climates, however, has been less widely explored [64]. Across 15 summer dates between 2012 and 2019, we found a strong parabolic relationship between the correlation coefficient and day of the year ( Figure 6), while meteorological factors (Table 1) did not have a noticeable impact on the relationship between T air(anom) and T air(anom) . These results, combined with the high variability in LST anom across agricultural land (Figure 7), suggest plant phenology as a key driver of the changing relationship between T air(anom) and LST anom in Dane County.
Agricultural lands surrounding Madison experience significant changes in LST throughout the growing season. The growing season in this region typically begins in early to mid-May and ends in early to mid-October [58]. During the intervening months, the rural landscape changes dramatically. Satellite imagery of agricultural land across Wisconsin shows that statewide corn emergence typically begins between late April and early May [65]. Rapid corn development and growth continue through July when maximum leaf area index (LAI)-the ratio of leaf surface area to ground surface area-is reached, before shifting to the grain fill stage. The emergence of soybeans typically occurs from mid-May and follows a rapid growth and development cycle through July [65].
We hypothesize that spatial heterogeneity in the rural landscape in early June, when some fields are bare soil and others have started to green but still have low LAI, explains the higher and more variable LST anom observed during these weeks (Figure 7). Low correlation during late summer may indicate senescence, while peak correlation observed near the end of July occurs when Dane County cropland is reaching maximum LAI and biomass [13]. These trends suggest that uniformly green fields and reduced bare soil area contribute to the cooler and narrower range of LST anom across agricultural land in late July.
Future research could incorporate additional agricultural data to further our understanding of how crop type, phenological development, and crop management decisions impact LST. Our definition of agricultural land use considered all varieties of cultivated crops and pasture as a single category. Past research shows that LST can vary greatly by crop type [66]. Data sources like the USDA's Cropland Data Layer [67] could aid analysis of how specific crops cool the landscape. Differences in management practices, soil characteristics, and plant health can also differentiate the thermal signature between plants of the same species [66]. Different tillage practices, for example, result in varying levels of residue cover and soil surface roughness, changing the surface energy balance [68]. In Dane County, 30% of farms practice no-till agriculture, and 20% use intensive tillage, potentially contributing to the observed irregularities in early season LST anom (USDA 2017). Soil moisture is also linked to LST, suggesting that LST anom at the beginning and end of the growing season, when bare soil cover is greatest, may be driven by drought and moisture conditions [69]. The two dates with the greatest deviations from the parabolic relationship between day of year and correlation ( Figure 6a) were both in late summer (27 August 2019 and 13 August 2014). Future investigation of soil moisture conditions and crop senescence during these periods may help explain these two outliers. Additionally, future work could improve our understanding of how agriculture shapes the SUHI by expanding the months of study to include the entire growing season, from spring planting through fall harvest.

Study Limitations and Future Work
Stationary sensor networks with the size and density seen in Madison, Wisconsin are rare, while satellite measures of LST are widely accessible. As a result, satellite data will continue to be a common data source used in UHI studies, and the relationship between T air and LST has become an increasingly important question. Our comparison between the air and surface UHIs in Dane County contributes to previous explorations of this research question, but it is important to acknowledge the limitations of this study. Although our sensor array is among the largest urban networks in the world, the number and placement of the temperature sensors are still more limited than the resolution offered by the USGS ARD surface temperature product [13]. Therefore, to enable comparison with the interpolated T air data, the resolution of the LST files was reduced from 30 m to 400 m. In addition to the lower level of spatial detail offered by this coarser resolution, this resampling and aggregation process introduces errors to each pixel value, on top of the 2.2-2.3 • C land pixel error in the raw ARD data files [40,46,47]. Our study was further limited by the range of meteorological conditions present during the 15 dates studied. Past research in homogeneous rural settings has identified cloud cover and wind speed as meteorological factors that impact the strength of the relationship between T air and LST [23,24]. We did not identify any impact of wind speed on the relationship between T air and LST and were unable to explore the impact of cloud cover due to concerns about satellite image quality under cloudy conditions. Despite these limitations, our results were generally consistent with similar studies that compared LST-based and T air -based measures of UHI in other urban settings. LST typically exhibits a greater temperature range than measurements of T air in the same locations [32]. Due to this difference in temperature range, LST and T air are generally poorly correlated [20,32], but their spatial distributions may display greater similarities [14,33]. Our results support these conclusions; on all 15 dates studied, the range in LST anom was always greater than the range in T air(anom) , and the average correlation in spatial patterns between LST anom and T air(anom) was relatively high, both during the day and at night. Overall, in Madison, and potentially other comparable cities, the SUHI appears to reflect the T air -based UHI most accurately in late July, during the peak of the growing season. This is a useful finding; however, the ability to model T air from LST would make urban satellite data even more valuable.
Several past studies have attempted to predict T air from LST in specific locations, typically employing statistical regressions, relying on empirical relationships between vegetation indices and temperature, or using models of surface energy balance [11,[70][71][72][73]. These models commonly produce estimates of T air at relatively coarse spatial resolutions, limited both by the availability of widespread ground stations for validation and the 1-km resolution of the MODIS sensors and have been found to produce errors of 1 to 3 • C [11,[70][71][72][73]. This task is still typically described as an open challenge [18,74]. With more than eight years of spatially-rich air temperature data available for model testing, future research can potentially use the LST and T air patterns observed in this analysis to produce a methodology for predicting Dane County air temperatures from satellite data.

Conclusions
As urban populations continue to grow, it is essential to understand the unique climatic patterns and risks posed by development. The mechanisms of urban warming are widely understood and have been observed in cities around the world, but data sources used to monitor UHIs are often imperfect. Stations that measure T air can accurately record nearby meteorological conditions but only provide limited spatial coverage. In comparison, satellites measure LST over large swaths of land at once, but these scenes only reflect a single moment in time. Beyond these spatial and temporal concerns, the applicability of the SUHI to the lived experience of urban populations is unclear. Given these concerns, our research combined two rich sources of temperature data in Dane County, Wisconsin-a network of 150 stationary T air sensors and a 30-m resolution Landsat surface temperature product-to better understand how T air and LST relate to each other, as well as if and when remote sensing data accurately reflect conditions of urban warming.
Overall, we found that although the magnitude of LST and T air anomalies differ greatly, these different measures of temperature display similar spatial patterns, suggesting that satellite data can be useful for identifying where peak temperatures occur. Additionally, our analysis suggests that the relationship between T air and LST changes as the growing season progresses; our data exhibited a maximum correlation during the middle of the summer when the rural, agricultural landscapes are more homogeneous and row crops have reached full canopy cover. These results highlight potential challenges in quantifying UHIs in agricultural regions, but they can also inform further exploration of the unique climatology of mid-sized Midwestern cities. With a greater understanding of the relationship between LST and T air in the region, the historic satellite record can be used to back-cast T air UHI patterns over recent decades, allowing for analysis of how urban growth and development has shaped the local climate.