Spatiotemporal Variability of Heat Storage in Major U.S. Cities—A Satellite-Based Analysis

: Heat storage, ∆ Q s , is quantiﬁed for 10 major U.S. cities using a method called the thermal variability scheme (TVS), which incorporates urban thermal mass parameters and the variability of land surface temperatures. The remotely sensed land surface temperature (LST) is retrieved from the GOES-16 satellite and is used in conjunction with high spatial resolution land cover and imperviousness classes. New York City is ﬁrst used as a testing ground to compare the satellite-derived heat storage model to two other methods: a surface energy balance (SEB) residual derived from numerical weather model ﬂuxes, and a residual calculated from ground-based eddy covariance ﬂux tower measurements. The satellite determination of ∆ Q s was found to fall between the residual method predicted by both the numerical weather model and the surface ﬂux stations. The GOES-16 LST was then downscaled to 1-km using the WRF surface temperature output, which resulted in a higher spatial representation of storage heat in cities. The subsequent model was used to predict the total heat stored across 10 major urban areas across the contiguous United States for August 2019. The analysis presents a positive correlation between population density and heat storage, where higher density cities such as New York and Chicago have a higher capacity to store heat when compared to lower density cities such as Houston or Dallas. Application of the TVS ultimately has the potential to improve closure of the urban surface energy balance. Author Contributions: Conceptualization, J.H. and P.R.; Data curation, J.H. and D.M.-V.; Formal analysis, J.H.; Funding acquisition, J.E.G.; Investigation, J.H.; Methodology, J.H. and P.R.; Project administration, P.R. and J.E.G.; Resources, P.R. and D.M.-V.; Software, J.H.; Supervision, J.E.G.; Validation, J.H.; Visualization, J.H.; Writing—original draft, J.H.; and Writing–review and editing, J.H., P.R., and David Melecio-Vazquez. All read the of the manuscript.


Introduction
Heat storage, ∆Q s , is defined as the uptake and release of energy by air, vegetation, buildings, and other impervious surfaces in an urban system [1,2]. The increase in both thermal conductivity and heat capacity is purported to amplify the total heat stored in cities above that of surrounding rural areas [3]. This net increase in ∆Q s is theorized as a major contributor to the urban heat island phenomenon-a widely recognized artifact of urbanization [4]. Furthermore, accurate estimation of urban heat storage is paramount for energy balance closure, which can have significant impacts on monitoring atmospheric processes such as evapotranspiration, transport of particulate matter, variations in sensible heat, and prediction of boundary layer height and stability [5].
Most often, the urban heat storage is determined as a residual of the surface energy balance (SEB) [6][7][8][9][10]. This is due to the use eddy covariance for computation of other flux terms in the SEB [11][12][13][14]. The second most common formulation of heat storage is the objective hysteresis model (OHM), which uses the temporal hysteresis of net all-wave radiation to relate the residual heat storage from the SEB to a given surface type [15,16]. A third method for computing ∆Q s involves a numerical model called the town energy balance (TEB). The TEB uses either the residual of the surface energy balance or the hysteresis model to formulate urban heat storage based on modeled fluxes [14,17,18]. Lastly, a fourth method can be found scattered throughout the literature and is based on a conduction and material storage model, referred to as both the element surface temperature method (ESTM) and the thermal mass scheme (TMS) [19][20][21][22].
Many studies have computed ∆Q s using surface observations, while very few have done so from the satellite perspective. In urban areas, the number of heat storage-related studies involving satellite observations is even lower [23]. This is likely due to the lack of standardization for computing heat storage, and the simultaneous difficulty of representing accurate fluxes from remotely-sensed imagery. One widely recognized satellite estimate of heat storage was computed using the residual method, which first computes every flux in the surface energy balance [24]. The element surface temperature method (ESTM) has gained popularity recently, perhaps due to increased computational power and numerical model capabilities [19,25]. However, the infrequency of satellite observations is still a clear hindrance on the ability to accurately characterize models associated with urban heat storage, as many of the satellite methods are unable to produce the full diurnal cycle of ∆Q s .
Long temporal overpass intervals have long been an obstacle for satellite applications. One recent advancement in the GOES-16 geostationary satellite allows snapshots of urban areas in the United States at spatial resolutions of 2 km every 5 min [26,27]. This high temporal resolution makes it the ideal candidate for implementing a thermal variability scheme (TVS) over urban areas. The TVS applied in this study is similar to a wide range of studies that estimate ground heat fluxes in soils using surface skin temperatures and material properties [28][29][30][31].
One of the more arduous tasks involved in deploying the TVS is the determination of the thermal mass of the urban fabric, i.e., the thermal conductivity, heat capacity, and material thickness [32]. Moreover, urban areas complicate things further by invalidating many of the established methods for deriving thermal properties due to the heterogeneous formation of buildings, roads, and other impervious land cover [33]. Consequently, the thermal properties of the satellite-derived urban heat storage employed here will be based on values determined in urban-specific studies aimed at quantifying the thermal mass of specific urban materials [34][35][36][37].
The resulting analytical, satellite-derived heat storage algorithm uses the 2-km land surface temperature (LST) outputted by the GOES-16 satellite, downscales it from 2 to 1 km, and uses its temporal hysteresis and the thermal masses of each land cover fraction to produce a 1-km spatial representation of urban heat storage. The development of the thermal variability scheme is similar to that of the thermal mass scheme and element surface temperature methods, which both use the variability of satellite-derived land surface temperature as an input to estimate heat storage. [19,38]. The TVS, as with other satellite routines, requires each satellite pixel to behave as a control volume element, which allows the surface energy balance to characterize each pixel as a unique portion of an urban area, allowing for partition of energy into components [39,40]. This is the general operating assumption of the thermal variability scheme as well.
These spatial constructions of ∆Q s are compared with ground-based flux towers and an urban weather research and forecasting (uWRF) model to determine its validity against two imperfect methods. The comparison is conducted for New York City during the summer of 2019, where all three data were available. The satellite model is ultimately extrapolated to the ten most populous cities in the United States and used to determine the variability of heat stored among different urban infrastructures. The development of the high temporal resolution satellite model has applications in investigating energy use, extreme heat impacts on vulnerable populations, and tracking pollution among increased urbanization [41]. Finally, the comparison between observations, numerical model, and satellite method will indicate where the deficit in closure of the surface energy balance, which may indicate that the numerical model could benefit from a restructuring of its prediction of heat storage in urban areas [42].
The aim of this study was to further the current understanding of the surface energy balance in urban areas, particularly with an innovative approach to determining heat stored in urban areas. Furthermore, the thermal variability scheme (TVS) developed using the GOES-16 satellite and publicly-available land cover information allow for heat storage to be calculated in all cities across the contiguous United States at a resolution of 1 km. The expectation is that this product will be used for urban surface energy balance studies whether through numerical models or comparison with surface flux stations, whether to determine other fluxes in the SEB or as a model for an improved estimate of heat storage.

Test Site-New York City
New York City is used as a test site for estimating heat storage due to the availability of urban weather research and forecasting (uWRF) model runs and the presence of four surface flux towers within the boundaries of the city.
2.1.1. National Land Cover Database (NLCD) Figure 1 shows a map of New York City used for the analysis, with the National Land Cover Database (NLCD) 2016 imperviousness land cover categorization mapped atop the city's boundary. Four green stars denote each surface flux station in Figure 1, which are used as comparison points in conjunction with the numerical WRF model. The NLCD 2016 is also being used to mask pixels that contain more than 30% water.

Surface Flux Stations
The New York State Mesonet (NYSMesonet) provides three of the four surface flux stations used as comparison points with the satellite-derived estimation of heat storage. The NYSMesonet uses Kipp and Zonen CNR4 net radiometers and a Campbell Scientific CPEC200 eddy covariance system that includes a 3D ultrasonic anemometer (CSAT3A) and closed-path gas analyzer (EC155) [43]. The fourth urban site is located at the City College of New York (CCNY) and uses a similar CSAT3 eddy covariance system by Campbell Scientific, and a CNR4 Kipp and Zonen net radiometer. The use of four close-proximity flux instruments allows for heat storage comparisons in four out of the five boroughs in New York City: Brooklyn, Manhattan, Queens, and Staten Island. The uniqueness of each station's surface cover also aids in more accurately determining the validity of the satellite model.
The surface flux station data were available for the entire summer of 2019, courtesy of the NYSMesonet. The heat storage from the flux stations uses the residual heat storage between the net radiation and sensible and latent heat fluxes. The same is done for the weather research and forecasting model.

Weather Research and Forecasting Model (WRF)
The Weather Research and Forecasting (WRF) model is used as a second comparison point for the satellite heat storage algorithm. It is also used to downscale the land surface temperature output of the GOES-16 satellite from 2-km to a chosen WRF resolution of 1 km. The WRF model version 3.9.1.1 [44] is initialized with the North American Mesoscale (NAM) forecast and run for a series of dates in August 2019, using a spin-up time of 12 h for each run. Three nested domains were used for New York City, with spatial resolutions occupying 9 (120 × 120), 3 (121 × 121), and 1 (85 × 82). The vertical grid has 51 levels, with 30 levels confined to the boundary layer. The shortwave and longwave radiation schemes are the Dudhia scheme and the Rapid Radiative Transfer Model (RRTM) [45,46], respectively. The microphysics is activated only for the fine-grid 1-km domain using the WRF Single-moment, 6-class scheme [47]. For the land surface model, the NOAH scheme was used [48]. The Mellor-Yamada-Janjić and the Eta Similarity scheme were used for the boundary layer and surface layer schemes, respectively [49][50][51].
The large number of levels in the boundary-layer helps to better represent the buildingatmosphere interaction within a multi-layer urban canopy framework. The coupled Building Environment Parameterization (BEP) and Building Energy Model (BEM) parameterize the urban surface momentum and thermal exchanges, respectively [52,53]. Each urban grid is subdivided into one of three urban land class categories: low-intensity residential, highintensity residential, and commercial/industrial, and their properties are defined using a look-up table of urban canopy parameters from the National Urban Database and Access Portal Tool (NUDAPT) [54]. The standard BEP+BEM scheme, as described above, has been further modified to include two additions. First, a cooling tower model was added to the BEM parameterization to account for the latent heat released from buildings [55]. Second, for the urban grids in New York City, the Primary Land Use Tax Lot Output (PLUTO) was used to define the urban morphological parameters of building area fraction, building surface area-to-height ratio, and building heights, which then defined a mechanical drag coefficient. The PLUTO data have been aggregated from its tax lot-based resolution to 1-km aggregates for the fine resolution domain. Accounting for the mechanical and thermal effects of buildings has also resulted in more accurate estimates of urban temperatures and winds [56,57].

GOES-16 Land Surface Temperature
The GOES-16 land surface temperature (LST) used here is an enterprise algorithm produced at 5-min temporal intervals at a spatial resolution of 2-km across the contiguous United States (CONUS). LST values are derived using the infrared bands (band 14: 11.2 µm, and band 15: 12.3 µm) and an approximation of emissivity using a daily split-window algorithm developed by the Land Surface Temperature Algorithm Working Group at the National Oceanic and Atmospheric Administration (NOAA) [58]. The enterprise LST product is selected over the baseline product due to the desired temporal resolution of 5 min vs. 1 h. This temporal improvement minimizes delays between comparisons between the satellite and the numerical model or surface flux stations. The enterprise LST product is only available to our team; however, the functional use of the heat storage method is applicable to the baseline product as well. Cloudy pixels were excluded from the analysis due to the cloud mask imposed by the LST product [27]. This is doubly detrimental to the estimation of ∆Q s due to the requirement of at least two LST values in Equation (8), which will decrease the availability of heat storage periodically throughout the analysis. There is also a land mask that nulls pixels near bodies of water based on sea level, which prevents contamination of sea surface on land surface temperatures.

Properties of the Ten Most Populous U.S. Cities
The ten most populous cities in the United States are listed in Table 1, with information about population, population density, and imperviousness percentage. The approximate area used for the analysis is not limited to the city boundaries; instead, it expands outward by roughly 10 km in each direction. This allows capture of the immediate urban area as well as some outer urbanization while also ensuring that no urban pixels are clipped during the analysis. This also increases the number of data points used in the analysis, which may be beneficial for smaller cities with few satellite pixels within the city boundaries. The ten cities listed in Table 1 were chosen based on population, which are all above 1 million. The cities exhibit differing amounts of population density and imperviousness percentage. For example, New York, NY has a much denser population than Phoenix, AZ. Imperviousness is listed in Table 1 as it is a major component of the heat storage model presented here. The goal of comparing these ten cities is to gain insight into the possible relationship between urbanization or imperviousness and heat storage behavior. It has been hypothesized that heat storage may be the largest contributor to the urban heat island (UHI) effect, and the goal here is to determine if heat storage has any relation to increased urbanization or even the decrease of vegetation in some way contributing to UHI by way of increase storage of heat [8,61,62].

Residual Heat Storage
The surface energy balance (SEB) is often presented as the starting point for numerical weather models and meteorological observation studies interested in quantifying energetic processes within a given volumetric layer [4,63]. Figure 2 shows the balance of energetic components found in a typical urban control volume. The net all-wave radiation (Q * ), sensible heat flux (Q H ), latent heat flux (Q E ), and storage heat flux (∆Q s ) are all included in the traditional surface energy balance, while the anthropogenic heat flux (Q F ), advective heat flux (∆Q A ), and other sources and sinks (S) are incorporated depending on their significance to a given application. The resulting surface energy balance for urban areas can be written as [64]: where the external sources of energy are given on the left-hand side, and surface fluxes are given on the right-hand side [65]. The heat storage is often estimated as a residual to the surface energy balance in Equation (1), with varying degrees of components omitted. For many applications, ∆Q s is simplified as [6,7,23]: This residual is used by both surface flux towers and the weather research and forecasting model to determine heat storage. For the particular case of a numerical model, the sensible heat flux captures much of the anthropogenic influence in the urban control volume. Flux stations are also considered to capture much of the anthropogenic influence in the sensible heat measurements. Thus, the prediction of ∆Q s from the satellite's perspective will be compared with two approximations of heat storage that are assumed to account for Q F to some degree [66].

Satellite Thermal Variability Scheme (TVS)
The satellite thermal variability scheme developed here determines heat storage from the perspective of the geostationary satellite. Its derivation begins where several analytical soil heat flux derivations begin: with the one-dimensional heat conduction equation [67][68][69]: ∂T ∂t where k [W · m −1 · K −1 ] represents the thermal conductivity, ρ [kg · m −3 ] is the bulk density, and c p [J · kg −1 · K −1 ] is the specific heat capacity, all of which are properties of the storage material. The storage of heat in a given material is written as a boundary condition on one-dimensional conduction at the surface, z = 0: The second boundary condition imposed on the solution in the spatial domain invokes the assumption that at a certain depth in the material the temperature change with depth is negligible (adiabatic assumption) [70]: The final condition implies that the temperature can be measured at two distinct points in time at the surface of material, z = 0, which allows the one-dimensional heat conduction equation to be solved-for. The solution and final storage representation arises from the solution based on the boundary conduction at the surface: where β is defined as: The thermal diffusivity is defined here as ν = k/ρc p . Two temperature measurements are required, T 0 , T 1 , which are measured at times t 0 , t 1 , respectively. This result is similar to those developed for determination of soil heat fluxes [71,72], with modifications for temporal measurements at the surface of the material. This result proposes that the amount of heat stored in a material can be determined using multiple surface temperature measurements and thermal properties of the storage material. This is a modification of the element surface temperature method and thermal mass schemes presented for a range of analyses [19,20,25].
One further adjustment need to be made in order to support multiple surface types within a given satellite pixel [64]: where the plan area fraction, f i , defines each surface type's contribution to the total heat stored within each satellite's respective pixel. The index, i, represents the surface type within the array of total surface types, n. For the ensuing analysis, four different storage surface types are explored for urban areas: natural, buildings, roads, and other impervious surfaces. The different material properties will be explored in the next section, where values are sourced from comparable studies on thermal mass and heat storage.

Determination of Thermal Mass
Accurate determination of thermal mass is essential for estimating how much heat will be stored in a given urban fabric [19,73]. The various thermal properties of urban materials, i.e., the thermal conductivity and heat capacity, are widely available in the literature pertaining to building energy and building materials. Unfortunately, much of the urban landscape is comprised of an array of materials that differ in thicknesses and composition. This results in a characterization of thermal mass that introduces great uncertainty and reaffirms a commonly-held hypothesis that direct measurement or modeling of ∆Q s is nearly impossible [8,74].
For the satellite thermal variability approach to heat storage, three static material properties are needed according to Equation (8): thermal conductivity (k), heat capacity (ρc p ), and effective thickness (z 1 ). The two thermal properties are known for most building materials; however, at the satellite pixel scale of 1-2 km, lumped element properties are used in place of micro-scale building properties. This saves computation time as well as data requirements, as specific building material databases are generally not available at such large scales. The thickness of the material layer, Z 1 , is perhaps the largest unknown due to varying methodologies around heat storage. As a starting point, values and ranges were taken from comparable studies, particularly those in [3,5,7,64]. The particular values used for the four surface types (natural, buildings, roads, and other impervious) were based on a sensitivity study comparing values with a numerical model and several surface flux stations. Table 2 shows the chosen values for thermal mass parameters that are used going forward for the estimation of ∆Q s . Each GOES-16 satellite pixel is subdivided into land cover fraction, f i , using the most recent National Land Cover Database, the NLCD 2016. The NLCD is useful due to its widespread use in numerical weather modeling, environmental planning, and urban development [75]. The NLCD 2016 specifically contains 30-m land cover information for the United States based on 20 different land cover types, 4 of which are urban-specific [76]. The four urban classifications can be further expanded to 12 imperviousness categories, where roads and other impervious surfaces are distinguished from one another. This results in a classification for three out of the four storage fractions ( f i=0:2 ): natural cover, roads, and other non-road impervious surfaces. Determination of the final surface type, buildings ( f i=3 ), requires parameterization of the non-road imperviousness category.
The fraction of buildings present in each satellite pixel was determined by first collecting building databases for eight out of eleven of the largest U.S. cities by population: New York, Los Angeles, Chicago, Houston, Philadelphia, Dallas, San Jose, and Austin. Three of the eleven were omitted due to lack of publicly available data. Figure 3a shows the relationship between building area fraction and non-road imperviousness land cover fraction. The points above the 1:1 line likely represent situations where the city developed buildings within the four years since the development of the NLCD 2016 product. A modest correlation exists between the square-root of population density and the slope between non-road imperviousness and building area fraction. This relationship can be written as follows: where BAF is an abbreviation for building area fraction, I n represents the corresponding satellite pixel's non-road imperviousness fraction, PD is the population density, and a and b are the fit coefficients derived from the cities in Figure 3. This representation of building area fraction serves as the method for determining the thermal mass contribution of buildings to storage heat. This relationship is particularly useful for cities and outlying suburban regions where building databases are unavailable. With this parameterization of building area fraction derived from the NLCD's non-road imperviousness fraction, the four components of the thermal mass are compiled for every city in the United States, completing the model and enabling comparison between heat storage for major U.S. cities.

Land Surface Temperature Downscaling
The land surface temperature (LST) used here is released as an enterprise dataset and has a temporal resolution of 5 min. The temporal capabilities of the enterprise product allows for comparisons down to just a few minutes between satellite LST and either ground observations or numerical model comparisons. The native spatial resolution for the GOES-16 land surface temperature is 2 km, which creates only a few dozen points within each urban boundary. This poor spatial resolution resulted in the decision to downscale the GOES-16 LST from 2 to 1 km. The weather research and forecasting (WRF) model's LST is used as the comparison temperature for downscaling due to its 1-km spatial resolution and good agreement with other satellites [77]. The downscaling uses the imperviousness fraction based on the NLCD 2016, and follows the DisTrad method for downscaling the LST outlined in [78] and applied to urban areas by Essa et al. [79,80]. The resulting 1-km LST can be approximated using the following relationship: where T 1km,i represents the 1-km LST value at index i. The coefficients, c 1 , c 2 , c 3 , are found using the least-squares fit between the WRF LST at 1 km and the GOES-16 LST at 2 km, as well as the respective imperviousness fraction at 1 km (I 1km,i ) and 2 km (I 2km,j ), according to Equation (10). The subscript j denotes 2-km variables, while i denotes 1-km variables.

Case Study-New York City
The satellite thermal variability scheme (TVS) estimate of heat storage at 2-km spatial resolution is first tested in New York City against four urban surface flux stations provided by the NYSMesonet and an urban Weather Research and Forecasting (uWRF) model. The nearest satellite and uWRF pixels were selected based on centroids located nearest to each ground station coordinate. Figure 4 shows the comparison between the three different estimates of heat storage. The satellite method uses an 8-h temporal difference, ∆t, between land surface temperature measurements in the prediction of ∆Q s . This was chosen based on minimizing the lag between satellite-derived storage and the other two residual methods while also ensuring that the difference between two temperature values is large enough to surpass the mission requirement error of 2.5K for the GOES-16 LST product [27].
One clear statement can be made regarding the comparison of satellite-derived, observed, and numerically modeled estimates of heat storage: the variability among them during the daytime. The surface flux stations appear to over predict ∆Q s in comparison with WRF and GOES-16 values. The WRF model is universally lower than both the GOES-16 approximation and satellite model. One hypothesis for the lower heat storage capability of the WRF model is the inclusion of anthropogenic heat flux into the parameterization of sensible heat in the form of air conditioning and heat release from buildings [54,81]. Since WRF does not directly model ∆Q s , the resulting residual of the energy balance is offset by the larger value of Q H . Similarly, the exclusion of advection and other sources to the residual energy balance of Equation (2) may also explain the higher amplitudes from flux station observations. The residual of the surface energy balance has long been a point of contention, and it is widely recognized to overestimate ∆Q s , due to either the aforementioned reasons or a combination of overestimates of Q * and underestimates of Q H , Q E [7,64,82,83]. This is perhaps an indication that the satellite-derived thermal mass estimate is closer to the real storage contribution of urban materials when compared to surface energy balance residuals derived by numerical models or flux observations.

Downscaling LST to 1 km
The coefficients are found using a three-day period in August 2019, with a leastsquares fit between the WRF LST as the T 1km variable in Equation (10) and NLCD 2016 imperviousness as the I 1km , I 2km inputs across New York City: Figure 5 shows the performance improvement after downscaling LST from the GOES-16 satellite to the 1-km WRF resolution. It is important to note that the coefficients were found for a WRF run from 9 to 12 August 2019, while the data in Figure 5 were independently tested for those coefficients with a WRF run from 29 August to 1 September 2019. Furthermore, similar errors were presented by Bechtel et al. [84] for an urban study in Hamburg, Germany, where LST from the Spinning Enhanced Visible and Infrared Imager (SEVIRI) was downscaled from 1 km to 300 m using the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER).

One-Kilometer Heat Storage Comparison with uWRF
The native 2-km spatial resolution of land surface temperature produced by the GOES-16 satellite was used to validate the thermal variability scheme by generating similar profiles as the residual method for both the WRF model and surface eddy covariance observations. Unfortunately, many pixels are dropped and significant dynamics are obscured at the city scale due to coarseness in spatial resolution. As a result, the land surface temperature is downscaled to 1 km and used as input to the same determination of ∆Q s . The 1-km heat storage is used going forward for all spatial representations of ∆Q s . Figure 6 shows a 1-km spatial difference plot of heat storage during the daytime and nighttime in New York City, where the uWRF estimate has been subtracted from the GOES-16 estimate. Several observations can be made regarding the difference between estimates of heat storage in Figure 6. First, in Figure 6a, the center of the city particularly shows a tendency toward higher storage from the satellite's perspective. It is difficult to make any other distinctions between satellite and uWRF estimates, as there is a distribution of difference values scattered throughout the city. The mean absolute difference between satellite and numerical model during the nighttime agree quite well to within 15 W·m −2 , indicating fairly good agreement throughout the city. This was also observed for the sites and pixels in Figure 4.
During the daytime in Figure 6b, the satellite model mostly dominates the numerical model. The difference plot shows higher heat storage along the coast for the uWRF estimate, which appears to be a result of the satellite method tending toward zero. This may be due to the influence of seabreeze, or perhaps an artifact of the satellite algorithm's handling of land/water pixels. Several pockets within the city also appear to agree well for the daytime period, amounting to a mean absolute difference of 52 W·m −2 , which is similar to other daytime comparisons between heat storage approximations [3,7,15].
The behavior exhibited in Figure 6 between the satellite method and numerical model are consistent across several days of observations. This affirms the findings in Figure 4, which shows that the satellite model differences are minimized during the nighttime and maximized during the daytime. A possible hypothesis is that if a 1-km spatial distribution of flux towers existed, the satellite estimation of ∆Q s would fall within the bounds of the uWRF model and flux tower amplitudes, particularly during the daytime. In the next section, the 1-km spatial representation of ∆Q s is presented and used for analyzing and comparing multiple cities and their ability to store heat.

Multi-City Analysis
Heat storage is computed across the ten largest U.S. cities by population for the month of August 2019. Figure 7 shows the diurnal behavior of each city's heat storage based on the satellite-derived thermal variability scheme (TVS). Upon integration of the amplitudes over the 24-h diurnal profile, an approximate value for daily heat storage flux for summertime can be found. This was done as a way of comparing the values with observations and models in the literature, as shown in Figure 8, where population density is plotted as a function of daily net heat storage. The range of values for suburban and urban areas can be found to fall between roughly −0.1 and 4.0 MJ·m −2 ·day −1 , which bookends the range observed in our analysis for August 2019 [1,9,85].
Two of the heat storage values from the literature (Tucson, AZ, USA and Chicago, IL, USA) fall directly inline with the approximations by the satellite model. The larger values taken from the literature in Figure 8 (Arcadia, CA, USA; Vancouver, BC, Canada; etc.) are likely evidence of a combination of anthropogenic influence, advection, and other sources that were also found to bias the New York City heat storage residuals in Figure 4.  The direct comparison between flux tower and satellite TVS may be inappropriate due to the limited footprint of the eddy covariance system. Depending on the characteristics of a particular site, the flux tower residual heat storage may or may not accurately represent the heat storage capabilities of the city as a whole [86][87][88]. This is perhaps why some of the stations match the satellite algorithm and others do not. Despite the outliers in comparison with surface stations, confidence has accumulated around the satellite TVS due to its stability and marginal agreement in amplitude with surface stations and previous studies.
Consequently, a few spatial distributions of heat storage from the satellite's perspective RE presented. Figure 9 shows near-peak daytime heat storage from July 2019 plotted for four different cities: New York, Los Angeles, Phoenix, and Dallas. The cities were chosen in an attempt to diversify the urban microclimates without needing to produce individual plots for each of the ten cities referenced in Figure 7. New York was chosen for its humid subtropical climate, Los Angeles for its Mediterranean climate, Chicago for its continental climate, and Phoenix for its hot desert climate [89,90]. The peak values for each city align with each urban area, which is likely a result of the urban materials having higher thermal mass. Smaller storage values can be found around non-urban areas, as they have smaller thermal mass when compared to the nearby urban materials. Another perhaps less expected observation is that the coastal urban areas have lower storage values near water. Fortunately, similar observations of this mixedpixel coastal behavior can be found in [19,91]. One possible explanation of the coastal storage aberration is the poor retrieval of LST around a mixed pixel or even the inability for the algorithm to downscale a water-contaminated pixel. Both New York and Chicago exhibit less variability between highly urban and less urban areas, whereas Los Angeles and Phoenix demonstrate larger fluctuations between urban and suburban/rural areas. This may be explained by the surrounding Mediterranean and desert climates, where a larger portion of the incident energy is released back into the atmosphere as sensible heat and the thermal mass values are smaller. Moreover, New York and Chicago also have more suburban regions within their boundaries in Figures 9 and 10, which may also explain the lower variability between highly urbanized and less urbanized areas. Similar behavior was also observed by Middel et al. [92] for Phoenix, AZ.  Figure 10 shows ∆Q s for the same four cities during the nighttime, which further exemplifies the unique behavior of heat storage in urban areas. Subplot Figure 10a shows the variability between heat storage during the nighttime, where New York City's Central Park (center of plot) is seen releasing less heat compared to the more urban areas, particularly in downtown Manhattan and Brooklyn. Los Angeles during the nighttime experiences higher variability when compared to New York, with larger span between urban and rural storage values. The upper-right portion of Los Angeles is home to a national forest at higher elevation, which can be seen as having lower heat storage during the daytime and a more tempered release during the nighttime.
Chicago is similar to New York in that it is widely urban and suburban with little vegetation. Consequently, it experiences a more even distribution of heat storage, with the exception of a large park in the lower lefthand corner of Figure 10c.
Phoenix, in contrast to New York and Chicago, has a much higher release of heat storage during the nighttime. This can be observed in Figure 10d, where the blues are roughly as dark as Los Angeles, but much darker than New York and Chicago-indicating a more intense release of storage during the nighttime. This was also observed in the city-wide diurnal average, where Phoenix, Los Angeles, and San Diego all exhibited the largest release of storage heat for all ten major cities.

Discussion
One of the implications of the results presented here is the interpretation of current frameworks to estimate ∆Q s . Observations from eddy covariance methods show that the residual term, often used to represents ∆Q s , is very high compared to the WRF residual method. Our results show that the thermal variability scheme estimate of ∆Q s is within 60-70% of the residual term computed from the flux tower. This is partially due to the tuning of parameters in the diurnal comparison of ∆Q s values between satellite TVS, uWRF residual, and flux tower residual. There is a possibility that advection is a more prominent term in the Urban SEB than previously believedand that the lower TVS values may be further indication of such phenomena [93]. Furthermore, there is also a possibility that eddy covariance systems are incapable of capturing all energy-containing eddies in urban environments. The residual term is uniformly high at all three urban sites, as shown in Figure 4. Interestingly, in the least urban of the four sites, Staten Island (STAT) where the flux tower is located in a suburban environment with 42% vegetation cover, the ∆Q s values estimated by the TVS method match reasonably well with the residual term and numerical model. This bolsters the hypothesis that caution should be taken when representing ∆Q s as the residual term in the surface energy balance for highly urbanized sites. Previous studies that have elaborately used the residual strategy for determining ∆Q s may, in fact, be overestimating the storage heat term in urban areas. On the other hand, the numerical model exhibits lower values when compared to the satellite method, which may be related to the lack of parameterization for storage heat flux in the urban canopy models. Many researchers in the past have observed larger amplitudes in sensible heat flux predicted by numerical models in urban areas, resulting in a residual term with a lower amplitude. Urban canopy models are designed based on traditional land surface models created for non-urban environments, where the available energy is partitioned between sensible and latent heat fluxes that are dependent on soil properties and vegetation type [94,95]. This is true even for the urban WRF model presented as a comparison here. However, this binary mode appears deficient in urban areas as the contribution to latent heat flux is weak in highly urbanized environments, resulting in the over-prediction of sensible heat.
The satellite method also has its own set of limitations; currently, at the native resolution, we are unable to account for the three-dimensionality of the urban surface cover and representing shadow effects is currently not possible. However, downscaling the data using high-resolution satellite images from NASA's MODIS or EcoStress could potentially aid in this effort. Furthermore, the complexity of urban structures and the reliance on a predetermined thicknesses both contribute to the uncertainty and unpredictability of the thermal properties used for the TVS method. Currently, there is no uniform database on thermal and physical properties of urban materials. Here, our cross comparisons with other surface stations in multiple cities reinforced the confidence in the methods and values used to estimate the thermal mass.
Our analysis also shows that the storage heat flux remains positive throughout the daytime for all 10 major cities with the average peak value between 75-150 W·m −2 . The spatially averaged ∆Q s values dip below zero during the nighttime period, with peak release values between −50 and −110 W·m −2 . The gradients present in Figures 9 and 10 indicate hot-spots within the cities that could potentially play a significant role in maintaining high urban heat island intensity during the nighttimes [96]. In general, it is widely acknowledged that temperate mid-latitude cities experience high urban heat island intensity during the nighttime periods [97]. The thermal gradients can also initiate highly localized flows within the urban environment, hence increasing the magnitude of advective fluxes.

Conclusions
A novel satellite-based thermal variability scheme (TVS) was introduced as a method for estimating heat storage, ∆Q s , in urban areas. The TVS method uses four categorizations of thermal mass in conjunction with the temporal variability of land surface temperature to predict ∆Q s . The high temporal resolution of the GOES-16 satellite allows for complete reconstruction of the diurnal profile of heat storage in urban areas-something lacking in previous works relating to satellite-derived estimates of storage heat.
The resulting model of heat storage exhibited behaviors comparable to both an urban weather research and forecasting model and four different surface flux observations in New York City. The test case of NYC uncovered the discrepancy between forecasting model and surface flux tower observations of ∆Q s . The satellite-derived estimate of storage mostly fell between the flux tower and WRF estimates, particularly during the daytime. During the nighttime, the satellite model dominates over the other two methods, and the morning shows all three collapse and mostly agree with one another. The comparison between satellite model, surface flux observations, and numerical model helped build confidence in the satellite method and resulted in its application to ten different major cities across the United States.
Before carrying out spatial representations of ∆Q s , land surface temperatures were downscaled from 2 to 1 km. This allowed for more pixels to populate the urban areas and create a better illustration of how cities store heat. The 1-km downscaling is based on comparisons with the WRF numerical model and the DisTrad method that uses imperviousness cover to allow for country-wide downscaling to 1 km by utilizing the National Land Cover Database's imperviousness fraction category.
Using the novel TVS method at 1-km spatial resolution, heat storage was computed across the ten most populous cities in United States. Each city exhibited different behavior based on urbanization, population density, and heterogeneity of land cover. The daily net heat storage for the month of August 2019 was compared with other studies and was found to agree well with other urban areas, even those outside the ten cities in our analysis.
Following the reasonable agreement with values in the literature, spatial plots were presented using the TVS method as a way of exploring the unique and novel behavior of heat storage across different urban microclimates. Spatial reconstructions in New York, Los Angeles, Chicago, and Phoenix all demonstrated the influence of urbanization on the amplitude and variability of storage heat flux. During the nighttime, each city tends to behave similar to their respective daytime observations. Chicago and New York exhibit more uniformly distributed estimates of ∆Q s throughout their urban boundaries. Conversely, Los Angeles and Phoenix experience more drastic changes between the urban and non-urban regions. During the evening just after sunset, a large portion of the stored heat is released back into the atmosphere across all cities.
In closing, the satellite-derived estimation of heat storage may prove to be a key element in closing the surface energy balance for urban areas. This unique method of determining ∆Q s is a step toward removing the non-storage components often neglected in residual estimates of storage. The two prevailing methods, the residual method and objective hysteresis model, both use an oversimplified energy balance, which results in the larger amplitudes seen in flux tower observations when compared with the satellite method and numerical weather model. Additionally, the thermal variability scheme also opens the door for better weather research and forecasting estimates of heat storage in cities, which could ultimately improve the distribution of energy in weather models that will ultimately increase the predictability of extreme heat events that plague our cities.