High-Resolution Gridded Air Temperature Data for the Urban Environment: The Milan Data Set

Temperature is the most used meteorological variable for a large number of applications in urban resilience planning, but direct measurements using traditional sensors are not affordable at the usually required spatial density. On the other hand, spaceborne remote sensing provides surface temperatures at medium to high spatial resolutions, almost compatible with the needed requirements. However, in this case, limitations are represented by cloud conditions and passing times together with the fact that surface temperature is not directly comparable to air temperature. Various methodologies are possible to take benefits from both measurements and analysis methods, such as direct assimilation in numerical models, multivariate analysis, or statistical interpolation. High-resolution thermal fields in the urban environment are also obtained by numerical modelling. Several codes have been developed to resolve at some level or to parameterize the complex urban boundary layer and are used for research and applications. Downscaling techniques from global or regional models offer another possibility. In the Milan metropolitan area, given the availability of both a high-quality urban meteorological network and spaceborne land surface temperatures, and also modelling and downscaling products, these methods can be directly compared. In this paper, the comparison is performed using: the ClimaMi Project high-quality data set with the accurately selected measurements in the Milan urban canopy layer, interpolated by a cokriging technique with remote-sensed land surface temperatures to enhance spatial resolution; the UrbClim downscaled data from the reanalysis data set ERA5; a set of near-surface temperatures produced by some WRF outputs with the building environment parameterization urban scheme. The comparison with UrbClim and WRF of the cokriging interpolated data set, mainly based on the urban canopy layer measurements and covering several years, is presented and discussed in this article. This comparison emphasizes the primary relevance of surface urban measurements and highlights discrepancies with the urban modelling data sets.


Introduction
In the current climate change situation, cities represent one of the most relevant environments, mainly because a large percentage of human population lives in urbanized areas: more than 56% already in 2020, with a still growing trend and a projection up to 68% in 2050 (United Nations Department of Economic and Social Affairs-Population Dynamics: World Urbanization Prospects 2018 (https://population.un.org/wup/), last accessed on 7 January 2022). Therefore, large quantities of energy and pollution are produced, which exacerbate the well-known urban heat island (UHI) phenomenon and severely affect inhabitant wellness. This is especially true during heat wave (HW) episodes [1], when UHI adds its effect to the already-elevated temperatures caused by the synoptic weather situation. Nevertheless, UHI has several diversified aspects, depending on regional climate, topographic characteristics, urban shape, and urban metabolism. It is then clear that a more detailed monitoring, also the lower UCL should be considered as in the case of urban squares, green areas, and parks or aquifers inside the city, according to the local climatic zone (LCZ) classification [13].
Several cities are nowadays monitored by such networks [14,15]. In Milan, the weather network of Fondazione OMD (Fondazione Osservatorio Meteorologico Milano Duomo (https://www.fondazioneomd.it/), last accessed on 10 January 2022) Network), specifically designed for urban applications and climatological purposes [16], has been operating continuously for a decade and has so far collected a database of minima, maxima, and mean values over 10 min and of the most relevant ECVs: it consists of about 20 automatic weather stations (AWSs) distributed in the larger metropolitan area (1575 km 2 ), 8 AWSs sited in Milan downtown (50 km 2 ). The FOMD Network monitors the weather variables at the roof level of buildings (upper UCL), it is managed with metrological criteria and calibrated with high-quality international standards, and it is characterized by total homogeneity of AWS siting and of hardware and software [17]. Laboratory and in-field measurement uncertainties have also been evaluated [18]. The Milan FOMD subnet has allowed, therefore, a much-improved description of Milan UHI relative to previous studies [19,20].
The FOMD Network database, limited to 2016-2019 and short-time periods in 2014 and 2015, is used as the basic reference for the urban air temperature T a . The city of Milan ( Figure 1) has an almost center-symmetric configuration, covering an area of 181.8 km 2 with a population of 1.4 million (2020) and a relatively high density of 7.7 × 10 3 km −2 . The city center is at latitude 45 • 28 N and longitude 9 • 11.5 E in the central-western part of the flat valley of Po River in Northern Italy, with the Alps northwards and westwards, the Apennines to the south, and a smooth and small elevation gradient oriented from the higher (140 m above mean sea level) northern-northwestern region to the lowland in the south-southeast. The larger metropolitan area (1575 km 2 and 3.3 × 10 6 inhabitants) has a mixed composition of smaller towns and industrial and agricultural lands, and it extends from the more than 250 m elevated region in the northwest to the less than 60 m a.m.s.l. plane in the southeast.
The FOMD Network is suitable for a description of the weather and climate in the Milan metropolitan area. However, it is not dense enough to represent in detail the temperature at the urban microscale, as required by public administrators and professionals in a number of applications, such as urban planning, energy, and building design. Therefore, we accurately selected and added to the analysis further operational weather stations managed by the Regional Environmental Protection Agency (ARPA Lombardia (Lombardy Regional Environmental Protection Agency [https://www.arpalombardia.it/Pages/ Meteorologia/Osservazioni-e-Dati/Dati-in-tempo-reale.aspx]), last accessed on 13 December 2021) and by a citizen-scientist weather observers' association (MeteoNetwork (https://www.meteonetwork.it/rete/livemap/), last accessed on 13 December 2021), connected with similar associations abroad and actively collaborating with academic and research institutions. Selection criteria were based on operational continuity, data reliability (we performed an accurate independent validation based on automatic checking and cross-reference), and siting characterization. Figure 1 gives a synoptic view of the resulting overall network, and Table 1 lists coordinates and altitudes above mean sea level of the AWSs.
There are different possibilities to integrate the classical "in situ" measurements with remote-sensed data from spaceborne platforms, which provide thermal IR observations at a resolution varying from 1 km to a few tens of a meter. The satellite data considered for this study were the land surface temperatures (LSTs) directly provided by ESA (European Space Agency (https://www.esa.int/Applications/Observing_the_Earth), last accessed on 13 December 2021) in the framework of the EU Copernicus (Copernicus (https://www. copernicus.eu/), last accessed on 13 December 2021) initiative, based mainly on operational measurements by Sentinels 3A and 3B, and LST data by NOAA (National Oceanic and Atmospheric Administration (https://www.noaa.gov/), last accessed on 13 December 2021) and USGS (United States Geological Survey (https://earthexplorer.usgs.gov/), last accessed on 13 December 2021), based mainly on Landsat 8 sensors. Correlation, kriging, and cokriging interpolation methods were tested using T a and LST data for the Milan area, and their results were compared, as already described in [12]. Correlation between T a and LST was found to be not sufficiently performing from a meteorological/physical point of view for monitoring and climatological purposes, being relatively good in some cases but practically absent in others. Best results were obtained with kriging and cokriging interpolation methods. The cokriging method (CoK), implemented on a regular grid of square cells of 100 m side that cover the metropolitan area of Milan, proved to be particularly suitable. It allows us to "model" the interpolation of AWSs' T a according to the LST of each square cell.
The methodology finally adopted is based on the cokriging technique [21,22]: the reference T a was taken as the primary variable, and the satellites' LST as the secondary one. The CoK interpolation, obtained after best fitting of the semivariograms and minimizing the covariances, generated a gridded air temperature field at the higher resolution of the secondary variable LST, but keeping a good part of the characteristics and quality of the primary one (i.e., measurements of T a at the top of UCL). We also performed a statistical estimation of the interpolation uncertainty: in all the analyzed situations, we obtained a statistical uncertainty well under 2 • C and in most cases under 1 • C. Interpolation uncertainty is comparable with the AWSs' T a uncertainties due to calibration, siting, and exposure of sensors and to maintenance procedures. Specific development and application of this cokriging methodology to the Milan area and the FOMD Network database are detailed in [12].   The results, mainly obtained in the framework of ClimaMi Project (https://www. progettoclimami.it, last accessed on 10 January 2022), are medium-to high-resolution gridded fields of air temperature together with the related uncertainties, available for weather situations without clouds at the passing times of the used orbital platforms, in the morning hours (around 9:00 or 10:00 UTC) or in the evening (22:00 or 23:00 UTC). They refer to UHI and HW episodes generally occurring with anticyclonic and clear weather, at high spatial resolution for urbanistic planning, energy, and building design: both are relevant in Milan. UHI produces temperature differences, "urban-rural", which in the annual mean are of the order of 4 • C and have maxima up to 10 • C. HW episodes are generally due to a persistent northward extension of the African subtropical high, and they are not infrequent: the annual mean of HW days (i.e., with daily temperature maxima over 33 ÷ 34 • C and minima over 23 ÷ 24 • C and lasting at least 2 days) is well over 10 in downtown.
The above-quoted resolution and uncertainty of the gridded fields are compatible with the requirements by architects and engineers resulting from collaboration with professional associations in the ClimaMi Project. Regular updates are foreseen.

Data Set of Gridded Air Temperatures for Milan Metropolitan Area
The construction of the Milan gridded temperature database related to two different sources of information, as already previously depicted: a.
The surface meteorological database, composed of 10 min mean values obtained by: − 21 AWSs of the FOMD Network; − 26 AWSs of the institutional regional network managed by ARPA Lombardia; − 14 stations of the MeteoNetwork.
Overall, 61 stations were used. In the last two cases, an accurate selection of stations and the subsequent data validation were performed to maintain the high-quality standards as in the first one. b.
The LST database, created by the selection of cloud-free spaceborne observations of the Milan area by two different operational platforms: − Sentinels 3A and 3B, managed by ESA under the EU Copernicus initiative, with a good repetition interval of 2 days and a medium spatial resolution of 1 km ( Figure 2a); − Landsat 8, operated by NOAA and USGS with slower repetition but a much better resolution of 30 m (Figure 2b).
The reasons for selecting these satellite sources excluding other possibilities were based on easy access to the provider's databases, observational repetition, data quality, and spatial resolution. In a future development, this data set could be extended to other existing or planned spaceborne sensors.
Application of the specifically developed CoK algorithm produced grids of nearsurface air temperatures at spatial resolutions of 100 and 30 m ( Table 2): the lower (100 m, 2 satellite passes per day) suitable for climatological analysis at urban mesoscale phenomena as the UHI, and the higher (30 m, 1 morning satellite pass every 14 days) to be used for a detailed description of the thermal characteristics of the urbanized environment at the local and microscale as the post operam assessment of the microclimatic performance of urban changes at the scale of a square or a street. Examples are shown in Figure 3 together with their error maps.
Furthermore, in both resolution cases the obtained gridded thermal fields were classified on the basis of the seasonal morphological aspect of the UHI, as modified by local weather circulations. Table 3 gives a simplified overview of UHI patterns and thermal characteristics of the local weather types with respective frequencies of occurrence. Each thermal field also has its own uncertainty field: uncertainties are in the range of 2 • C, when far from AWSs, to a few tenths of a Celsius degree near measuring stations (Figure 3e). Altogether, the useful generated fields at 100 m resolution are 62 between 2016 and 2019 (Table 2), covering winter and summer seasons, morning and evening times, and a variety of UHI patterns and weather types. The 30 m fields (Table 2), based on Landsat 8 LST data and limited to summer situations (especially heat waves) in the same time period, are 38 with 3 further cases in 2014 and 2015 used for comparison with WRF runs (see Section 3.2). Among them, only a few are almost timely coincident with Sentinel 3-based imagery at lower resolution.  In the next step of analysis, mean thermal fields at 100 m were computed according to the weather type classifications (season, time, wind speed, baric configuration, UHI shape), obtaining for the first time an initial, high-resolution, and climatologically representative description of the lowest level of the Milan metropolitan area atmosphere. These climatic mean fields describe in detail the diversified morphology of the UHI in Milan and other towns in the area (as in Table 3) and its elongations and displacements relative to the densest urbanized city center and as a consequence of local or mesoscale pressure and circulation patterns. These mean fields are all freely accessible at the ClimaMi web portal. Table 3. Relative frequency of spatial distribution of temperature for the metropolitan area of Milan at 100 m resolution for Sentinels 3A and 3B considered LST data. The "Max Temperature" column refers to the position of the maximum relative to the city center; HW: Heat Wave.   (Table 2), covering winter and summer seasons, morning and evening times, and a variety of UHI patterns and weather types. The 30 m fields (Table 2), based on Landsat 8 LST data and limited to summer situations (especially heat waves) in the same time pe riod, are 38 with 3 further cases in 2014 and 2015 used for comparison with WRF runs (see Section 3.2). Among them, only a few are almost timely coincident with Sentinel 3-based imagery at lower resolution.

Day
In the next step of analysis, mean thermal fields at 100 m were computed according to the weather type classifications (season, time, wind speed, baric configuration, UH shape), obtaining for the first time an initial, high-resolution, and climatologically repre sentative description of the lowest level of the Milan metropolitan area atmosphere. These climatic mean fields describe in detail the diversified morphology of the UHI in Milan and other towns in the area (as in Table 3) and its elongations and displacements relative to the densest urbanized city center and as a consequence of local or mesoscale pressure and circulation patterns. These mean fields are all freely accessible at the ClimaMi web portal

Comparison with Other Data Sets
To evaluate the relevance and usefulness of the database described in the previous section, a comparison with other similar data sets should be made. Looking for a gridded description of the urban thermal fields, the current possibilities are offered by large-scale model reanalysis and successive downscaling, or by high-resolution modelling adapted to specific urban environments. In the first case, we can rely on well-established and con solidated operational frameworks as those offered by NOAA and ECMWF (European Centre for Medium-Range Weather Forecast (www.ecmwf.int), last accessed on 10 De cember 2021). On the other hand, operational site-specific urban modelling is hardly avail able, and generally, it is still used only for research and development. In this paper, we considered the ECMWF downscaled reanalysis data and some runs over Milan of the weather research and forecasting model (WRF (www.mmm.ucar.edu/weather-research and-forecasting-model), WRF-ARW DOI :10.5065/D6MK6B4K), both last accessed on 10

Comparison with Other Data Sets
To evaluate the relevance and usefulness of the database described in the previous section, a comparison with other similar data sets should be made. Looking for a gridded description of the urban thermal fields, the current possibilities are offered by large-scale model reanalysis and successive downscaling, or by high-resolution modelling adapted to specific urban environments. In the first case, we can rely on well-established and consolidated operational frameworks as those offered by NOAA and ECMWF (European Centre for Medium-Range Weather Forecast (www.ecmwf.int), last accessed on 10 December 2021)). On the other hand, operational site-specific urban modelling is hardly available, and generally, it is still used only for research and development. In this paper, we considered the ECMWF downscaled reanalysis data and some runs over Milan of the weather research and forecasting model (WRF (www.mmm.ucar.edu/weather-research-and-forecasting-model), WRF-ARW DOI:10.5065/D6MK6B4K), both last accessed on 10 December 2021) operated by the University of L'Aquila (www.univaq.it, last accessed on 10 December 2021), Italy.

Comparison with ERA5/UrbClim
The last version of reanalysis by ECMWF is ERA5 [23]. A number of meteorological variables are obtained by the 4D-Var assimilation and analysis of the Integrated Forecast System (IFS) model, version CY41R2, with 137 vertical levels, a spatial resolution of 31 km, and an uncertainty estimation measured by the spread of a 10-member ensemble. On this basis, a downscaling was further developed by ECMWF for 100 European cities (Milan is among them) to obtain some ECVs at a spatial resolution of 100 m × 100 m at an hourly frequency [24]. This downscaling is the UrbClim model (UrC), which uses "a land surface scheme describing the physics of energy and water exchange between the soil and the atmosphere in the city, coupled to a 3D atmospheric boundary layer module" (cit. from Urb-Clim_extra_documentaion_v2.pdf (https://cds.climate.copernicus.eu/cdsapp#!/dataset/ sis-urban-climate-cities?tab=doc), last accessed on 13 December 2021). The atmospheric boundary conditions are defined by meteorological input data, while the land surface data influence heat flux and evaporation. UrC data are available via the Copernicus CDS (Copernicus Climate Data Store (https://cds.climate.copernicus.eu/), last accessed on 13 December 2021) portal: Figure 4b is an example of the UrbClim product for Milan.
Unfortunately, while ERA5 has a temporal coverage since 1950 up to the present, UrC data sets cover the time interval 2008-2017, and a direct comparison with ClimaMi CoK thermal fields is only possible for the years 2016 and 2017. Moreover, while CoK fields are timely defined on a 10 min basis (being computed with 10 min mean data by AWSs nearest to the satellite pass time), UrC data are hourly: for each date, the hour selected is the nearest to the CoK field time. Therefore, the time approximation in the comparison is half an hour. Furthermore, the area covered by UrC for the city of Milan is much smaller than for ClimaMi CoK interpolations: a direct comparison is given in Figure 4a. Data were extracted from the CoK and UrC fields to combine pixel per pixel both data sets. Differences were then computed for every CoK available time: moreover, comparison is limited to winter and summer cloud free situations. As an example, in Figure  The overall Pearson correlation coefficient R, given by: where T C is the CoK interpolated temperature, T U the temperature from the UrbClim data set, and N the number of image elements (pixels), is 0.5 for the same example of Figure 5, while the covariance: is 0.7. Data were also extracted from UrC matrices for elements coincident with the coordinates of surface measuring stations in the Milan area covered by UrbClim. Furthermore, to provide a better and spatially related quantification of map comparison, the fuzzy Kappa statistics [25] and Moran's index [26,27] were performed. In the first case, the similarity is based on quantity and location, and the fuzziness of location is taken by letting the fuzzy representation of a pixel be partly defined by its proximity.
The similarity of image pixels for two different temperature field values, T 1 and T 2 , is defined by: For the fuzziness, an exponential decay with distance appears to be appropriate considering the temperature: a radius of neighborhood of 4 and a halving distance of 2 pixels were chosen for both fields to be compared. The result statistics for overall similarity is the average similarity over the whole area.
On the other hand, Moran's index allows a measure of spatial (auto-) correlation based on variances and covariances. The index is defined as: where N is the number of samples (pixels), T i and T j are the variables of interest (surface temperature in our case, measured on two different maps of the same event and over the same region), w ij is a matrix of spatial weights computed as an inverse distance function (with w ii = 0), and W is the sum of all w ij .   On the other hand, Moran's index allows a measure of spatial (auto-) correlation based on variances and covariances. The index is defined as: where N is the number of samples (pixels), Ti and Tj are the variables of interest (surface temperature in our case, measured on two different maps of the same event and over the same region), wij is a matrix of spatial weights computed as an inverse distance function (with wii = 0), and W is the sum of all wij.
The expected value of Moran's I under the null hypothesis (no spatial correlation) is given by: where N is the sample number. For large samples, E(I) approaches zero: in our case, the matrices to be compared are 285 × 202 pixels and E(I)  10 −5 .
The index itself varies between −1 (total anticorrelation) and +1 (perfect autocorrelation) and can be used to compare different fields as a CrossMoran index: small differences The expected value of Moran's I under the null hypothesis (no spatial correlation) is given by: where N is the sample number. For large samples, E(I) approaches zero: in our case, the matrices to be compared are 285 × 202 pixels and E(I) ∼ = 10 −5 . The index itself varies between −1 (total anticorrelation) and +1 (perfect autocorrelation) and can be used to compare different fields as a CrossMoran index: small differences arise depending on which is considered first: we simply take the mean value. Further, Lee's L index [28] is a generalization for a bivariate spatial association measure, which integrates Pearson R as a spatial bivariate association measure and Moran's I as a univariate spatial association measure using a spatial smoothing scalar. Considering the different resolution of the two types of maps, a radius of 10 was chosen as a good compromise for the spatial lag for Lee's L, the R lagged, and the Moran statistics. A useful tool for these evaluations was presented in [29].

Comparison with the Weather Research and Forecasting Model
The weather research and forecasting model (WRF) is an open-source, well-known, and much-used nonhydrostatic limited area model [30]. With suitable extensions, it can also be applied for the urban environment [31]. For the city of Milan, the WRF was used for several applications [32,33] with the building environment parameterization (BEP) urban scheme [34] and at a resolution of 1 km or even less. We made use of some outputs (in default configuration: BEPd) for a direct comparison with measurements in the UCL by the 7 FOMD Network weather stations located in Milan downtown. The comparison covers two 10-day summer periods in 2014 and 2015, mainly characterized by anticyclonic weather with only occasional cloudiness and very low winds: very favorable conditions for the development of a strong nocturnal UHI effect.

Comparison Results: ERA5/UrbClim
The comparison for each and all the available CoK episodes (21 cases, number limited by the availability of useful spaceborne data) with UrC data for Milan provides the results summarized in Table 4. The variety of meteorological and seasonal situations does not allow a straightforward conclusion, and data are not enough for situation-based statistics.
Anyway, the mean range between minimum and maximum differences is 9.3 • C, while the mean difference ("CoK minus UrC", pixel by pixel) amounts to −2.0 ± 1.5 • C, showing a small negative bias. The mean temperature differences among CoK and UrC maps vary from −4.5 to +5.5 • C, with a clear prevalence of positive values in the morning and negative ones in the evening. The mean covariance is 0.5, ranging from almost 0 up to 1.1 regardless of daytime or season, as shown in Figure 6. The covariance is also independent of the mean temperature (representative of the season), while the mean temperature difference between CoK and UrC is smaller for lower temperatures (winter: −0.5 • C) and larger in the warmer episodes (summer: −3.0 • C).
The comparison results between measurements and UrbClim are shown in Table 5: UrC data have a mean bias "AWS minus UrC" of −2.3 ± 1.6 • C, while differences span between −11.9 • C and +6.0 • C, in substantial agreement with the result obtained for CoK data. There is anyway a large seasonal disparity, with a winter mean of only −0.5 • C and a summer mean of −3.4 • C, once again in substantial agreement with the comparison operated by using interpolated CoK data. In the winter episodes, the difference range is also smaller.
The results obtained with the fuzzy and Moran statistics for the temperature fields are summarized in Table 6, while examples of the spatial distribution of the local Lee's L are given in Figure 7. The CoK and UrC temperature fields appear to have a relatively good spatial correlation over the city: both are catching the UHI effect. Nevertheless, there are smaller regions where correlation is weak or even negative, emphasizing the marked differences in the methods in accurately resolving the smallest-scale details.

Comparison Results: WRF
The results show an average systematic underestimation of the UCL temperatures by the WRF: the mean bias (WRF computed minus FOMD Network measured) is −0.3 • C in 2014 and −2.2 • C in 2015 (Figure 8). Negative differences are systematically much larger during the night hours (−1.5 and −3.4 • C, respectively), resulting in a clear underestimation of the intensity of the typical nocturnal UHI effect over the city. The underestimation does not agree with the finding in [34]: anyway, in that case, a much smaller town is considered. During the sunlit hours, absolute values of the differences are smaller, and their sign can be even positive (−1.0 and +0.9 • C, respectively, in 2014 and 2015). As a consequence, WRF could not always account for the existence of the diurnal urban cold island, which is, however, climatologically typical in the city of Milan [18].
The differences between both years is likely related to the marked different mean temperatures in the city over the two 10-day periods: about 25 • C in 2014 with a certain range of variability among different days and 30.5 • C in 2015 characterized by an almost constant and persistent HW episode. Furthermore, the last 4 days of the 2014 period show differences that are three times as large as in the previous days. It must be noted that the mean hourly standard deviation among FOMD Network stations is always about 0.3 • C, all the seven stations being sited in a similar urban environment and their sensors exposed with the same criteria on top of typical buildings of almost equal height and with an almost constant surface albedo of 0.3. Therefore, the local climate zone classification [13] for most of them is "compact low-rise", while only Milano Sud is "compact mid-rise", and Bovisa and S. Siro "open high-rise". In fact, WRF data correlate in a similar manner to all the seven FOMD Network stations in Milan downtown: correlation coefficients (R 2 ) between WRF figures and FOMD Network data are 0.84 and 0.94, respectively: in 2015 the correlation is always significant at p < 0.01, while in 2014 the same value is found only for MI Centro; it is not significant for two stations (MI Città Studi and MI Sud), and the other ones have p < 0.1 or 0.5.
We compared the 2 m thermal fields from the WRF-BEP model with the ClimaMi CoK interpolations at a high resolution (30 m, obtained by Landsat 8 LST data): Figure 9 shows the overlapping of the two fields. From the available WRF runs, we could, unfortunately, find only three situations coincident with the CoK ones: 19 July 2014 at 10:00 UTC, 15 July 2015 at 10:00 UTC, and 22 July 2015 at 10:00 UTC. The comparison, adapting the WRF-BEP grid (84 × 66 pixels) onto the CoK-30 m grid (1001 × 1001 pixels), produced the differences shown in Figure 10: the resulting mean bias (CoK minus WRF) is −0.9 • C, essentially confirming what was previously found in this section, while the mean difference range is 4.3 • C. More evident is the general difference gradient in longitude across the area, with WRF underestimating temperature in the eastern part of the domain. On the other hand, the WRF overestimation south of the city center could be related to the fact that this region is well known to be a climatologically colder and moister area, being at a lower elevation and with a mostly agricultural land use. More interesting is the central part of the city: in this area, a small-scale variability is evident with a possible WRF overestimation, consistent with the direct comparison with measurements considering that all the three episodes are in the late morning.
We can conclude that WRF, at least in summer heat wave episodes, underestimates night temperatures of about 3 • C over the city of Milan, when the UHI effect is especially relevant, while the diurnal urban cold island could remain undetected: the results may help to improve the models in better describing the urban canopy layer, and the city of Milan could be used as a useful testbed.
Forecasting 2021, 3 FOR PEER REVIEW 20 the WRF-BEP grid (84 × 66 pixels) onto the CoK-30 m grid (1001 × 1001 pixels), produced the differences shown in Figure 10: the resulting mean bias (CoK minus WRF) is −0.9 °C, essentially confirming what was previously found in this section, while the mean difference range is 4.3 °C. More evident is the general difference gradient in longitude across the area, with WRF underestimating temperature in the eastern part of the domain. On the other hand, the WRF overestimation south of the city center could be related to the fact that this region is well known to be a climatologically colder and moister area, being at a lower elevation and with a mostly agricultural land use. More interesting is the central part of the city: in this area, a small-scale variability is evident with a possible WRF overestimation, consistent with the direct comparison with measurements considering that all the three episodes are in the late morning. We can conclude that WRF, at least in summer heat wave episodes, underestimates night temperatures of about 3 °C over the city of Milan, when the UHI effect is especially relevant, while the diurnal urban cold island could remain undetected: the results may help to improve the models in better describing the urban canopy layer, and the city of Milan could be used as a useful testbed.     Figure 7. An E-W gradient is always evident, as well as the WRF overestimation just south of the city center (at position 45. 47°N, 9.19°E), where on the contrary, there is a small-scale variability.

Discussion
The comparison results described in Sections 3.1 and 3.2 consistently show an underestimation of the temperatures over the city of Milan by UrbClim and partly also by WRF-BEP. The mean difference is about −2 °C for UrC, with a possible seasonal variation showing smaller values in low-temperature situations in relation to warmer ones. The mean R, omitting the two unrealistic negative values, is 0.4  0.1 °C, showing in general a low to moderate overall correlation. On the other hand, covariances are very variable among the considered episodes.
A similar result is found comparing UrbClim data directly with station measurements, where original information is not lost in the cokriging interpolation. The fuzzy statistics applied to compare CoK and UrC thermal fields gives a good average value, while the global matching of the maps is relatively low (0. 21), showing that aside from the observed bias, the thermal fields are characterized by some differences in spatial Forecasting 2021, 3 FOR PEER REVIEW 21 some precipitation episodes. In 2015 mean temperatures were more than 5 °C higher than in the same period of 2014.  Figure 7. An E-W gradient is always evident, as well as the WRF overestimation just south of the city center (at position 45. 47°N, 9.19°E), where on the contrary, there is a small-scale variability.

Discussion
The comparison results described in Sections 3.1 and 3.2 consistently show an underestimation of the temperatures over the city of Milan by UrbClim and partly also by WRF-BEP. The mean difference is about −2 °C for UrC, with a possible seasonal variation showing smaller values in low-temperature situations in relation to warmer ones. The mean R, omitting the two unrealistic negative values, is 0.4  0.1 °C, showing in general a low to moderate overall correlation. On the other hand, covariances are very variable among the considered episodes.
A similar result is found comparing UrbClim data directly with station measurements, where original information is not lost in the cokriging interpolation. The fuzzy statistics applied to compare CoK and UrC thermal fields gives a good average value, while the global matching of the maps is relatively low (0. 21), showing that aside from the observed bias, the thermal fields are characterized by some differences in spatial

Discussion
The comparison results described in Sections 3.1 and 3.2 consistently show an underestimation of the temperatures over the city of Milan by UrbClim and partly also by WRF-BEP. The mean difference is about −2 • C for UrC, with a possible seasonal variation showing smaller values in low-temperature situations in relation to warmer ones. The mean R, omitting the two unrealistic negative values, is 0.4 ± 0.1 • C, showing in general a low to moderate overall correlation. On the other hand, covariances are very variable among the considered episodes.
A similar result is found comparing UrbClim data directly with station measurements, where original information is not lost in the cokriging interpolation. The fuzzy statistics applied to compare CoK and UrC thermal fields gives a good average value, while the global matching of the maps is relatively low (0. 21), showing that aside from the observed bias, the thermal fields are characterized by some differences in spatial distribution over the project area. Altogether, UrbClim appears to give a satisfactory general description of the UHI, but details even at the urban mesoscale are not well reproduced despite the declared downscaling spatial resolution at 100 m: the spatial distribution of Lee's L shows in general better scores above the city than elsewhere, but is very variable and lacking in resolution.
On the contrary, ClimaMi CoK shows realistic details: it is able to reproduce detailed differences caused by the urban texture. Moreover, being mainly based on in situ measurements in the upper part of UCL, it can be taken as a reference. The result is confirmed by the direct match with station measurements. Some bias is also present in relation to daytime and season.
The spatial comparison with the WRF-BEP runs is based on only three episodes and cannot be considered conclusive. Anyway, a mean bias of −0.9 • C appears to be qualitatively significant. It is also consistent with the results obtained matching WRF outputs to direct measurements over the two periods of 10 summer days, where the mean bias is -1.3 ± 0.3 • C, varying from -0.1 ± 0.4 • C during daytime to -2.4 ± 0.3 • C in the night. The spatial distribution of the differences "CoK minus WRF" shows a remarkable longitudinal gradient (while it is of little or no interest in this context). Over the city area, their values are small and variable, but generally positive: this agrees with the daily variation of the differences obtained matching WRF and measurements, all the three episodes being at 10 UTC. Evidently, a more extensive comparison is required in this case, while also the model setup could be better designed (being in this case in default mode, as noted before).

Conclusions
An almost 10-year data set of measurements in the Milan metropolitan area UCL, mainly based on the FOMD urban meteoclimatological network and integrated with selected data from the official ARPA Lombardia monitoring network and from the amateur MeteoNetwork for a total of 61 AWSs, has been used to obtain medium-to high-resolution near-surface gridded thermal fields. In the adopted cokriging algorithm, ground air temperature by AWSs is the primary variable, while spaceborne land surface temperature by Sentinel 3 and Landsat 8 satellites is the secondary one. The results obtained, already usefully applied in resilience plans and projects by officials, engineers, and architects for the Milan metropolitan area in the framework of the ClimaMi Project, were compared with other available gridded thermal information, namely, the ERA5 downscaled reanalysis data for European cities (UrbClim) and some WRF-BEP runs over Milan made available by a research group at the University of L'Aquila.
The comparison, performed for the overlapping periods of available data from the different data sets and limited to winter and summer episodes, highlights in both cases a general agreement but also a systematic underestimation of the UHI effect over Milan. The bias can be quantified, on average, at −2 • C, while the range of this underestimation can be larger than 10 • C in the considered episodes. The spatial correlations, analyzed using fuzzy and Moran' statistics, confirms the bias and shows a large variability in performance and location, suggesting some likely systematic trends too. Even if in this work the number of episodes is limited by data availability, specifically for spaceborne data and the WRF runs, the results imply the necessity to significantly improve modelling and downscaling methods to adequately cope with the request for punctual data posed by professionals and decision makers in order to design and effectively implement plans and projects in the current climate change situation, which in cities is a very complex subject.
We can conclude that correctly performed observations in the challenging urban environment remain of utmost importance, not only in monitoring the weather and the climate, but also specifically as an unavoidable benchmark for new modelling and downscaling products. In this case, the meteorological network extent and its affordability and operational continuity represent a major issue and require specifically dedicated resources. Integration of citizen science data, such as amateur meteorologists' ones, has proved to be a benefit, but implies accurate data selection and validation. Furthermore, spatial analysis extends the potential of point measurements in a useful manner. The results obtained represent a first attempt to comparatively evaluate some different techniques used to obtain high-resolution gridded thermal fields in the urban environment. Further developments shall consider a wider range of observational and modelling data sets as well as other methods of spatial analysis: further spaceborne data to improve the time and spatial resolution and more up-to-date and site-adapted meteorological models and spatial analysis methods.
Furthermore, other variables, such as humidity, wind, and precipitation, shall be considered to get a more complete description of the urban environment: analysis is already started on precipitation in the ClimaMi Project framework, while humidity and wind are the next to be analyzed: this will be a further important step towards a complete and detailed knowledge of the urban climate to address resilience to climate change.
Author Contributions: G.F. wrote the text and made the comparison statistics among AWS measurements, CoK interpolations, and UrbClim data; S.P. ensured AWS data validation and the paper revision; C.L. supervised the whole work as leader of the ClimaMi Project and revised the paper; E.M.M. performed all the CoK interpolations and the paper revision. All authors have read and agreed to the published version of the manuscript.