Comparison of ERA5-Land and UERRA MESCAN-SURFEX Reanalysis Data with Spatially Interpolated Weather Observations for the Regional Assessment of Reference Evapotranspiration

: Reanalysis data are being increasingly used as gridded weather data sources for assessing crop-reference evapotranspiration (ET 0 ) in irrigation water-budget analyses at regional scales. This study assesses the performances of ET 0 estimates based on weather data, respectively produced by two high-resolution reanalysis datasets: UERRA MESCAN-SURFEX (UMS) and ERA5-Land (E5L). The study is conducted in Campania Region (Southern Italy), with reference to the irrigation seasons (April–September) of years 2008–2018. Temperature, wind speed, vapor pressure deﬁcit, solar radiation and ET 0 derived from reanalysis datasets, were compared with the corresponding estimates obtained by spatially interpolating data observed by a network of 18 automatic weather stations (AWSs). Statistical performances of the spatial interpolations were evaluated with a cross-validation procedure, by recursively applying universal kriging or ordinary kriging to the observed weather data. ERA5-Land outperformed UMS both in weather data and ET 0 estimates. Averaging over all 18 AWSs sites in the region, the normalized BIAS (nBIAS) was found less than 5% for all the databases. The normalized RMSE (nRMSE) for ET 0 computed with E5L data was 17%, while it was 22% with UMS data. Both performances were not far from those obtained by kriging interpolation, which presented an average nRMSE of 14%. Overall, this study conﬁrms that reanalysis can successfully surrogate the unavailability of observed weather data for the regional assessment of ET 0 . to the cross-validation procedure. In the former case, the mean BIAS is 0.37 ◦ C, ranging from − 0.80 ◦ C to 1.64 ◦ C; the RMSE is on average 1.11 ◦ C, varying from 0.81 to 1.98. In the latter case, the mean BIAS is very close to zero as expected, ranging from − 1.61 ◦ C to 1.32 ◦ C and the RMSE ranges from 0.46 ◦ C to 1.96 ◦ C with a mean value of 1.15 ◦ C. The results show a low sensitivity of the kriging interpolator of temperature to network density.


Introduction
Fresh water scarcity is one of the main issues examined by climate-change adaptation policies [1,2]. A common objective of climate change adaptation policies is to identify feasible water management strategies at regional and continental scales to secure the water required for food production, while preserving the ecosystems. This issue is particularly relevant in regions where agricultural irrigation is the main pressure on renewable water resources. In these regions, an assessment of water withdrawal for crop irrigation is fundamental to plan sustainable policies and implement better-informed decisions in water resource management.
Irrigation maps at global and continental scales have been developed for spatial statistics of agricultural water demand [3][4][5]. However their utility for planning water resources and climate orography in proximity of the sea and in the inner part of the region, that highly influences the spatial variability of the weather [51].
The study area is characterized by Mediterranean climates with hot, dry summers, warm springs and mild, wet winters and autumns. Precipitations occur mostly in autumn and winter: the maximum monthly precipitations are recorded during November and December.
Although the actual duration of the irrigation season is determined by the weather variability and specific agricultural practices, irrigation in open fields generally starts in April and it lasts until the end of September. Thus, hereinafter the period from April 1st to September 30th will be referred to as irrigation season. The region has been the focus area of several projects aiming at developing innovative technologies to support water resources management in agriculture, based on satellite remote sensing and numerical weather predictions [9,12,14,52]. It is also the demonstrative area of the EU-Horizon2020 project, LANDSUPPORT (https://www.landsupport.eu/), aiming at identifying sustainable management practices in agriculture.
A network of 18 automatic weather stations (AWSs) has been operating since 2008, recording hourly measurements of air temperature at 2 m, wind speed at 10 m, relative humidity at 2 m, global incoming solar radiation and barometric pressure. The locations of the stations are shown on the regional map in Figure 1, while details about their reference ID number, site name, elevation and coordinates are listed in Table 1. This network is managed by the Regional Hydrometeorological Service, as part of the larger hydrometeorological network, including additional 59 weather stations equipped with air temperature sensors at 2 m ( Figure A1 in Appendix B). Table 2 illustrates the mean and coefficient of variation of daily weather data recorded at the 18 AWSs during the irrigation seasons of years 2008-2018, assumed as focus period in this study.
Mean air temperature, T, ranges between 17.85 °C and 23.50 °C with coefficients of variation (cv) from 0.19 to 0.30; wind speed, WS, ranges from 1.27 m s −1 to 4.64 m s −1 with cv from 0.23 to 0.47; solar radiation, RS, ranged from 224 W m −2 to 269 W m −2 with cv from 0.26 to 0. 35.
The daily vapor pressure deficit (VPD) was computed by using daily mean air relative humidity data and the daily maximum and minimum temperatures according to the FAO irrigation and drainage study No. 56 [15] (pp. [36][37][38][39][40]. The VPD ranged from 0.63 kPa to 1.13 kPa with coefficients of variation from 0.43 to 0.83. The study area is characterized by Mediterranean climates with hot, dry summers, warm springs and mild, wet winters and autumns. Precipitations occur mostly in autumn and winter: the maximum monthly precipitations are recorded during November and December.
Although the actual duration of the irrigation season is determined by the weather variability and specific agricultural practices, irrigation in open fields generally starts in April and it lasts until the end of September. Thus, hereinafter the period from April 1st to September 30th will be referred to as irrigation season.
The region has been the focus area of several projects aiming at developing innovative technologies to support water resources management in agriculture, based on satellite remote sensing and numerical weather predictions [9,12,14,52]. It is also the demonstrative area of the EU-Horizon2020 project, LANDSUPPORT (https://www.landsupport.eu/), aiming at identifying sustainable management practices in agriculture.
A network of 18 automatic weather stations (AWSs) has been operating since 2008, recording hourly measurements of air temperature at 2 m, wind speed at 10 m, relative humidity at 2 m, global incoming solar radiation and barometric pressure. The locations of the stations are shown on the regional map in Figure 1, while details about their reference ID number, site name, elevation and coordinates are listed in Table 1. This network is managed by the Regional Hydrometeorological Service, as part of the larger hydrometeorological network, including additional 59 weather stations equipped with air temperature sensors at 2 m ( Figure A1 in Appendix B).  Table 2 illustrates the mean and coefficient of variation of daily weather data recorded at the 18 AWSs during the irrigation seasons of years 2008-2018, assumed as focus period in this study. The daily vapor pressure deficit (VPD) was computed by using daily mean air relative humidity data and the daily maximum and minimum temperatures according to the FAO irrigation and drainage study No. 56 [15] (pp. [36][37][38][39][40]. The VPD ranged from 0.63 kPa to 1.13 kPa with coefficients of variation from 0.43 to 0.83.

Reanalysis Data
Meteorological reanalysis data provide an accurate description of the past weather by assimilating land surface and atmospheric observations from several sources into the most advanced numerical weather prediction models that obey physical laws, in order to create a complete and consistent climate dataset that covers several decades back in time and that is coherent both from physical and dynamic perspectives [53]. Reanalysis data are generally freely available on dedicated platforms of the web, released in regular grid formats, with a delay of few months from present.
In this study, we employed the most advanced reanalysis data produced in Europe, specifically optimized for the land surface: UERRA MESCAN-SURFEX (UMS) [54] and ERA5-Land (E5L) [50,55]. The former is a regional reanalysis model implemented for Europe within an European FP7 reanalysis preoperational project with the participation of the European Centre for Medium-Range Weather Forecasts (ECMWF), while the latter is a global atmospheric reanalysis model implemented by ECMWF. Both are freely available as products of the Copernicus Climate Change Service that holds the intellectual property rights of the raw data.
UERRA MESCAN-SURFEX is a downscaled surface analysis system of the UERRA-HARMONIE regional reanalysis forced by the global ERA-Interim reanalysis data [56] and its production ceased with the end of availability of ERA-interim (July 2019). It will be replaced by a new system called Copernicus European Regional ReAnalysis (CERRA). CERRA will be used to create a pan-European reanalysis, with very high resolution (5.5 km) forced by the global ERA5 reanalysis [57].
UERRA MESCAN-SURFEX has a spatial resolution of 5.5 km over Europe and covers the period January 1961-July 2019. Figure 2a shows its numerical grid over the study area. Daily mean values of air temperature at 2 m, wind speed at 10 m, relative humidity at 2 m, global incoming solar radiation and barometric pressure as well as maximum and minimum daily air temperature were computed by aggregating the six-hour forecasts available at 00, 06, 12 and 18 UTC. physical and dynamic perspectives [53]. Reanalysis data are generally freely available on dedicated platforms of the web, released in regular grid formats, with a delay of few months from present. In this study, we employed the most advanced reanalysis data produced in Europe, specifically optimized for the land surface: UERRA MESCAN-SURFEX (UMS) [54] and ERA5-Land (E5L) [50,55]. The former is a regional reanalysis model implemented for Europe within an European FP7 reanalysis preoperational project with the participation of the European Centre for Medium-Range Weather Forecasts (ECMWF), while the latter is a global atmospheric reanalysis model implemented by ECMWF. Both are freely available as products of the Copernicus Climate Change Service that holds the intellectual property rights of the raw data. UERRA MESCAN-SURFEX is a downscaled surface analysis system of the UERRA-HARMONIE regional reanalysis forced by the global ERA-Interim reanalysis data [56] and its production ceased with the end of availability of ERA-interim (July 2019). It will be replaced by a new system called Copernicus European Regional ReAnalysis (CERRA). CERRA will be used to create a pan-European reanalysis, with very high resolution (5.5 km) forced by the global ERA5 reanalysis [57].
UERRA MESCAN-SURFEX has a spatial resolution of 5.5 km over Europe and covers the period January 1961-July 2019. Figure 2a shows its numerical grid over the study area. Daily mean values of air temperature at 2 m, wind speed at 10 m, relative humidity at 2 m, global incoming solar radiation and barometric pressure as well as maximum and minimum daily air temperature were computed by aggregating the six-hour forecasts available at 00, 06, 12 and 18 UTC.
ERA5-Land is a rerun of the land component of the ERA5 reanalysis, obtained by implementing a series of improvements with respect to ERA5, in order to make it more accurate for all types of land applications [58]. ERA5-Land runs at enhanced spatial resolution (9 km vs 31 km in ERA5), but it has ERA5-Land is a rerun of the land component of the ERA5 reanalysis, obtained by implementing a series of improvements with respect to ERA5, in order to make it more accurate for all types of land applications [58]. ERA5-Land runs at enhanced spatial resolution (9 km vs 31 km in ERA5), but it has the inconvenient that data, including atmospheric variables, are not provided for numerical grid points falling on sea surface. Thus, for land areas close to the coast, since they are not surrounded by four valid grid data values, the assessment of weather variables must rely on inland data only. Figure 2b shows the numerical grid of the E5L data: the gray dots refer to the grid points on the sea surface for which no data are available.
ERA5-Land data are produced with an hourly time-step. Thus, daily values were derived by collecting 24 values starting from 00 UTC every day. The available weather variables are air temperature at 2 m, wind speed at 10 m, global incoming solar radiation, barometric pressure and dew point temperature at 2 m. Instead of relative humidity, dew point temperature was then used for computing vapor pressure deficit.
For both E5L and UMS data, a simple bias correction technique was applied to temperature, vapor pressure deficit and solar radiation data to remove the systematic bias. Raw reanalysis data were corrected by subtracting the corresponding regional mean bias computed at the 18 AWSs. Both E5L and UMS data showed a systematic overestimation of temperature over the region that was corrected with the average regional error that was equal to 0.66 • C for E5L and 1.57 • C for UMS.

Geostatistical Interpolation of Ground Sensor Data and Cross-Validation for Evaluating Performances
Geostatistical interpolation is generally employed to estimate weather data over a regularly spaced prediction grid by using ground sensor measurements available over irregularly spaced networks [59]. Among the wide collection of geostatistical interpolation methods, ordinary kriging is the most used to interpolate weather data, such as wind speed [60,61], relative humidity [61] and solar radiation [62][63][64]. However, ordinary kriging only uses data available from observations and its performance greatly depends on the density of the available measuring network and the spatial structure of the variable of interest [65]. For this reason, in some other applications where an evident statistical correlation exists between the weather variable of interest and an auxiliary external variable sampled with a higher spatial resolution, regression kriging is preferred [66]. This is the case of temperature [66][67][68][69], which is highly correlated with the elevation.
Regression kriging [70] (also named residual kriging [71], mathematically equivalent to universal kriging, often said kriging with external drift [72]) couples a regression model of the variable of interest on auxiliary independent variables, with the ordinary kriging applied to the regression residuals.
Appendix A provides theoretical details of kriging estimation that can be exhaustively found in Journel & Huijbregts [73] and Cressie [74], among others.
In this study, we used (i) ordinary kriging-Equations (A1-A4)-for interpolating daily wind speed, relative humidity and solar radiation and (ii) regression kriging for daily mean, maximum and minimum air temperature. By implementing daily regression models of temperature with elevation, regression kriging requires the application of Equations (A1-A4) to the random field of the regression residuals, instead of the observed weather variable.
Then, the performances of kriging estimators were assessed by means of the cross-validation method [74,75]. Indeed, a very common way to assess the predictive ability of a statistical model and, specifically, of a spatial interpolation technique, such as kriging, is to test model results on a set of data that were not used for estimation. Cross-validation serves this scope and it is based on a jackknife kriging estimation of the variable of interest at a sampling site, obtained by applying the kriging interpolator to the variable observations available at the remaining sampling sites (leave-one-out cross-validation).
Then, after applying the described procedure for all sampling sites, i.e., 18 AWSs sites and weather variables, the (jackknife) kriging estimations were used for the performance analysis and to evaluate the predictive ability of kriging in deriving gridded data from weather observations in the study area.

Reference Evapotranspiration Model
The reference evapotranspiration is an important notion in hydrological and agricultural applications, able to describe the evaporative demand of the atmosphere [15]. It only depends on the weather data and it can be defined as the evapotranspiration under standard conditions of a hypothetical grass reference crop according to FAO irrigation and drainage study No. 56 [15]. In this study, daily reference evapotranspiration, ET 0 , was computed by using daily weather data retrieved from ground sensors at AWS sites and, alternately, by using bias-corrected reanalysis data and spatially interpolated data from ground sensors by cross-validation.
ET 0 (mm day −1 ) is computed with the FAO Penman-Monteith equation, as follows: where: • λ is the latent heat of vaporization; T is the daily mean air temperature at 2 m height ( • C) computed as the average between the daily maximum (T max ) and minimum (T min ) air temperature at 2 m height at the highest available temporal resolution of data; • WS is the wind speed at 2 m height above ground (m s −1 ) computed from the wind speed at 10 m above the ground by multiplying for a factor of about 0.75 given by the logarithmic equation of the wind speed profile [15]; • VPD is the daily vapor pressure deficit (kPa), computed as function of T max and T min and relative humidity, RH. If available, dew point temperature (T dew ) is used for computing VPD, instead of RH); • R n is the net radiation at the crop surface (MJ m −2 day −1 ) and G is the soil heat flux density (MJ m −2 day −1 ). The net radiation (R n ) was calculated as the difference between the incoming net shortwave radiation and the outgoing net long-wave radiation. The incoming net shortwave radiation is a fraction of the incoming shortwave solar radiation (RS), computed for the reference crop by setting the albedo equal to 0.23. Table 3 reports monthly statistics averaged over years 2008-2018 for daily ET 0 , computed with ground-based weather data at the 18 AWSs. The mean daily ET 0 values over the region are 2.78 mm day −1 in April, 3.51 mm day −1 in May, 4.55 mm day −1 in June, 4.94 mm day −1 in July, 4.00 mm day −1 in August and 2.80 mm day −1 in September. The minimum mean value over a month of daily ET 0 is registered in September, 2.21 mm day −1 and the maximum in July, 5.54 mm day −1 . The mean coefficient of variation is minimum in July, 0.17 and maximum in April, 0.32.

Statistical Indicators for Performance Analysis
The performance of the bias-corrected reanalysis data and of the spatially interpolated data, computed from ground sensors by cross-validation, were evaluated at each AWSs, by assuming the weather observations retrieved from the AWS sensors as benchmarks.
Similarly, for what concerns the reference evapotranspiration, the benchmark was assumed to be the reference evapotranspiration computed with Equation (1) by using the weather data retrieved from ground sensors as input variables. This benchmark ET 0 was, respectively compared with ET 0 computed by using the bias-corrected reanalysis weather data and the spatially interpolated weather data from ground sensors as input.
The performances were evaluated by computing the BIAS, which was used as an indicator of accuracy of data. and the root mean square error, RMSE, which gives insight about both accuracy and precision of data. These two statistical indicators were calculated as follows: where X i p and X i b are, respectively, the variable for which the performance evaluation is desired and the corresponding variables assumed as benchmark, at the i-th day; n denotes the number of examined days (183 days of the annual irrigation season multiplied by the 11 years analyzed, i.e., 2013 days in total). The more the BIAS differs from zero the less accurate is the estimate. The greater is the RMSE, the less are accuracy and precision of the estimates.
In the following, differences between the performance indicators were computed for comparing the performance of different methods each other. For comparing BIAS, the differences between the corresponding absolute values resulting from Equation (2) were computed.
When monthly analyses for the reference evapotranspiration were performed, the BIAS and RMSE were divided by the mean daily ET 0 on the examined j-th month computed over the analyzed years (values specified in Table 3), in order to consistently compare monthly performances.
These normalized indicators are hereinafter referred to as nBIAS and nRMSE, computed as follows for the j-month: where k refers to the number of days in the j-month of interest and X j o is the mean observed value of the generic variable X in the examined j-month over the analyzed years at a specific site of interest. In the following, we express nBIAS and nRMSE in percent.

Performance Analysis on Weather Data
A performance analysis of weather variables was conducted to better explain the performance of the target variable of this study, i.e., the reference evapotranspiration. This analysis may be also relevant for other applications, since it can guide the choice of the most suitable data source for the different set of weather variables, based on the corresponding statistical performance indicators. Figure 3 shows the statistical performance indicators of the kriging estimator computed with Equations (2-3), where X i p are the spatially interpolated observed daily weather data obtained by cross-validation and X i b are the daily weather observations. Both indicators were computed over the reference period April-September of years 2008-2018 (2013 days).
In Figure 3a, the BIAS values are represented according to the color scale on the side: green corresponds to BIAS close to zero. In Figure 3b, the color scale ranges from blue to red, as RMSE runs from the minimum to maximum values.
Summarizing the main results depicted in Figure 3, temperature BIAS varies over the region from −0.8 °C to 1.6 °C, with a mean RMSE of 1.2 °C and a maximum RMSE of 2.0 °C. Wind speed BIAS varies from −1.66 m s −1 to 2.19 m s −1 , with mean RMSE equal to 1.16 m s −1 and maximum RMSE equal to 2.78 m s −1 . Solar radiation BIAS ranges from −30 W m −2 to 20 W m −2 , with mean RMSE equal to 30 W m −2 and a maximum RMSE at 53 W m −2 . VPD BIAS varies over the region from −0.19 kPa to 0.30 kPa, with a mean RMSE of 1.50 kPa and a maximum RMSE of 2.24 kPa. The worst performances in terms of RMSE are observed at AWSs n. 1, 9 and 13 for T, at AWSs n. 5, 10, 12 for WS, at AWSs n. 1, 10, 12 for RS and at AWSs n. 1, 6, 13 for VPD.
The performances considerably change across the region due to: (i) the irregularly spaced AWS network that may penalize some of the AWS sites, such n. 13; (ii) border effects that penalize the locations at the inland boundaries (AWSs n. 2, 5, 10, 12) of the region and along coastline (AWSs n. 1,6,9); (iii) complex topography, especially in proximity of the sea (AWS n. 1).
In relative terms, considering the statistics of weather observations over the irrigation seasons of years 2008-2018, kriging spatial interpolations provided the best performances with temperature, thanks to the strong correlation with the elevation taken as auxiliary variable. However, as illustrated in Appendix B, the density of the network in the case of temperature slightly influenced the kriging performances.  The statistical performances were presented by means of BIAS and RMSE, as defined in Equations (2)(3), where X are the bias-corrected daily reanalysis values for the variable of interest and X are the daily weather observations at the AWSs sensors. Figure 4 shows the differences In the case of air temperature, the kriging performances at each of the 18 reference AWSs were assessed by cross validation, after conditioning the kriging estimates to the observations at the remaining 76 temperature stations, i.e., the remaining 17 reference AWSs plus the additional 59 air temperature sensors, distributed across the region as depicted in Figure A1. In this way, we explicitly considered the availability of air temperature data with a much higher density than the other weather variables involved in ET 0 computations: a common scenario to many regions of the world. Appendix B shows the sensitivity of kriging interpolator to the density of the network for what concerns air temperature data. For all other weather variables, the kriging estimates at each AWSs where conditioned to just the remaining 17 AWSs, for which the corresponding observations were available.
In Figure 3a, the BIAS values are represented according to the color scale on the side: green corresponds to BIAS close to zero. In Figure 3b, the color scale ranges from blue to red, as RMSE runs from the minimum to maximum values.
Summarizing the main results depicted in Figure 3, temperature BIAS varies over the region from −0.8 • C to 1.6 • C, with a mean RMSE of 1. The performances considerably change across the region due to: (i) the irregularly spaced AWS network that may penalize some of the AWS sites, such n. 13; (ii) border effects that penalize the locations at the inland boundaries (AWSs n. 2, 5, 10, 12) of the region and along coastline (AWSs n. 1, 6,9); (iii) complex topography, especially in proximity of the sea (AWS n. 1).
In relative terms, considering the statistics of weather observations over the irrigation seasons of years 2008-2018, kriging spatial interpolations provided the best performances with temperature, thanks to the strong correlation with the elevation taken as auxiliary variable. However, as illustrated in Appendix B, the density of the network in the case of temperature slightly influenced the kriging performances. Figure 4 shows the performances of the two considered reanalysis datasets: UERRA MESCAN-SURFEX (UMS) and ERA5-Land (E5L). Temperature, vapor pressure deficit and solar radiation were bias-corrected as explained in Section 2.2.
Water 2020, 12, x FOR PEER REVIEW 11 of 21 between the computed performance indicators of each reanalysis dataset and the kriging estimator (as depicted in Figure 3), providing information about the convenience of using reanalysis datasets versus spatially interpolated ground sensor weather data. The benchmarks data for the computations of the BIAS and RMSE remain the daily weather observations at AWSs sensors. Figure 4a shows the differences between the corresponding absolute BIAS, while Figure 4b shows the differences between the corresponding RMSE. Except for RMSE computed for RS, the color scales are built to range from negative values when kriging is performing better than reanalysis (blue on the color bar), to positive values when kriging is performing worse than reanalysis (red color on the color bar). In case of RS, the color scale represents only positive increasing values from blue to red.  Figure 4a), while kriging slightly outperforms reanalysis in term of RMSE (see T in Figure 4b). ERA5-Land shows generally more favorable RMSE values than UMS. AWS n.1 is a special case that deserves a note, since it is located on the ridge of the Sorrento's Peninsula, characterized by high elevation in proximity of the coastline. The effect of this complex terrain features cannot be represented by the coarse spatial resolution of the numerical grid. Moreover, as shown in Figure 2b, E5L does not provide valid data grid points located in proximity of this coastal ridge. Any estimation at AWS n.1 must rely on bilinear interpolation of grid points located in the inland plain.

Reanalysis and kriging show generally similar accuracies in temperature estimates (see T in
With respect to wind speed, both BIAS and RMSE show that kriging estimator outperforms reanalysis roughly in only half of the AWSs and that E5L outperforms UMS.
With respect to vapor pressure deficit and solar radiation, differences in BIAS and in RMSE show that kriging estimator outperforms reanalysis and that E5L outperforms UMS, which presents limitations in predicting solar radiation, as also reported in the UERRA user guide [56].

Performance Analysis on Reference Evapotranspiration
This section is focused on the performances of ET0 averaged over the entire irrigation season of years 2008-2018. The following subsection will present monthly based results. Figure 5 shows the average BIAS and RMSE of ET0 in the period 2008-2018 for the whole region. The spread of the box plots refers to the variability of the statistical indicators among the 18 AWS sites. ET0 was computed by, respectively using weather data from kriging interpolation, UERRA MESCAN-SURFEX (UMS) and ERA5-Land (E5L). Figure 5 reveals that ET0 computed with weather data estimated by kriging is generally more accurate and precise than reanalysis data. However, the differences in ET0 estimate performances among datasets are negligible: the differences in RMSE between kriging estimator and E5L are on average less than 3% of the mean daily ET0 and the differences in RMSE between kriging estimator and UMS are on average less than 8% of the mean daily ET0. The differences among databases are even smaller in terms of BIAS. The statistical performances were presented by means of BIAS and RMSE, as defined in Equations (2-3), where X i p are the bias-corrected daily reanalysis values for the variable of interest and X i b are the daily weather observations at the AWSs sensors. Figure 4 shows the differences between the computed performance indicators of each reanalysis dataset and the kriging estimator (as depicted in Figure 3), providing information about the convenience of using reanalysis datasets versus spatially interpolated ground sensor weather data. The benchmarks data for the computations of the BIAS and RMSE remain the daily weather observations at AWSs sensors. Figure 4a shows the differences between the corresponding absolute BIAS, while Figure 4b shows the differences between the corresponding RMSE. Except for RMSE computed for RS, the color scales are built to range from negative values when kriging is performing better than reanalysis (blue on the color bar), to positive values when kriging is performing worse than reanalysis (red color on the color bar). In case of RS, the color scale represents only positive increasing values from blue to red.
Reanalysis and kriging show generally similar accuracies in temperature estimates (see T in Figure 4a), while kriging slightly outperforms reanalysis in term of RMSE (see T in Figure 4b). ERA5-Land shows generally more favorable RMSE values than UMS. AWS n.1 is a special case that deserves a note, since it is located on the ridge of the Sorrento's Peninsula, characterized by high elevation in proximity of the coastline. The effect of this complex terrain features cannot be represented by the coarse spatial resolution of the numerical grid. Moreover, as shown in Figure 2b, E5L does not provide valid data grid points located in proximity of this coastal ridge. Any estimation at AWS n.1 must rely on bilinear interpolation of grid points located in the inland plain.
With respect to wind speed, both BIAS and RMSE show that kriging estimator outperforms reanalysis roughly in only half of the AWSs and that E5L outperforms UMS.
With respect to vapor pressure deficit and solar radiation, differences in BIAS and in RMSE show that kriging estimator outperforms reanalysis and that E5L outperforms UMS, which presents limitations in predicting solar radiation, as also reported in the UERRA user guide [56].

Performance Analysis on Reference Evapotranspiration
This section is focused on the performances of ET 0 averaged over the entire irrigation season of years 2008-2018. The following subsection will present monthly based results. Figure 5 shows the average BIAS and RMSE of ET 0 in the period 2008-2018 for the whole region. The spread of the box plots refers to the variability of the statistical indicators among the 18 AWS sites. ET 0 was computed by, respectively using weather data from kriging interpolation, UERRA MESCAN-SURFEX (UMS) and ERA5-Land (E5L). Figure 5 reveals that ET 0 computed with weather data estimated by kriging is generally more accurate and precise than reanalysis data. However, the differences in ET 0 estimate performances among datasets are negligible: the differences in RMSE between kriging estimator and E5L are on average less than 3% of the mean daily ET 0 and the differences in RMSE between kriging estimator and UMS are on average less than 8% of the mean daily ET 0 . The differences among databases are even smaller in terms of BIAS.  Whiskers extend to the most extreme data values, excluding the outliers. Values are considered as outliers when larger than p75 + 1.5(p75 -p25) or smaller than p25 -1.5(p75 -p25).

Performance Analysis on Monthly Basis
This section presents ET0 estimate performances on a monthly basis. The statistical indicators of Equations (4)(5) are averaged for each month and normalized by the corresponding monthly mean of the daily ET0 observed at each AWSs (Table 3). Figure 6 shows the nBIAS and nRMSE of ET0 computed by using weather data from kriging interpolation of ground sensor data.

Performance Analysis on Monthly Basis
This section presents ET 0 estimate performances on a monthly basis. The statistical indicators of Equations (4)(5) are averaged for each month and normalized by the corresponding monthly mean of the daily ET 0 observed at each AWSs (Table 3). Figure 6 shows the nBIAS and nRMSE of ET 0 computed by using weather data from kriging interpolation of ground sensor data.

Performance Analysis on Monthly Basis
This section presents ET0 estimate performances on a monthly basis. The statistical indicators of Equations (4)(5) are averaged for each month and normalized by the corresponding monthly mean of the daily ET0 observed at each AWSs (Table 3). Figure 6 shows the nBIAS and nRMSE of ET0 computed by using weather data from kriging interpolation of ground sensor data.  The interannual variability of nRMSE is greater than that of the nBIAS. The months characterized by the worst performances in terms of nRMSE are April and September. June is the month that presents the best relative performances. However, the accuracy and precision of the estimates are very high: nBIAS ranges between −13% and 14% of the corresponding observed ET 0 monthly mean; nRMSE ranges between 8% and 21% of the corresponding observed ET 0 monthly mean, with an average nRMSE value of 14%.
As for Figures 4 and 7 presents the differences between ET 0 estimate performances of each reanalysis dataset and the corresponding performances of the kriging estimator (given by Figure 6). The monthly statistical indicators are represented according to the color scale on the bottom. The markers' size describes to the expected maximum interannual variability of the statistics, defined by p75 + 1.5(p75−p25), where p25 and p75 are, respectively the 25th and 75th percentiles of the monthly values in the years 2008-2018.
The interannual variability of nRMSE is greater than that of the nBIAS. The months characterized by the worst performances in terms of nRMSE are April and September. June is the month that presents the best relative performances. However, the accuracy and precision of the estimates are very high: nBIAS ranges between −13% and 14% of the corresponding observed ET0 monthly mean; nRMSE ranges between 8% and 21% of the corresponding observed ET0 monthly mean, with an average nRMSE value of 14%.
As for Figure 4, Figure 7 presents the differences between ET0 estimate performances of each reanalysis dataset and the corresponding performances of the kriging estimator (given by Figure 6). Figure 7a shows the differences between absolute values of nBIAS, while Figure 7b between nRMSE: positive values (red color on the color bar) mean that the kriging estimator is performing better than reanalysis and negative values mean the opposite. In most cases, kriging estimator outperforms reanalysis data, but in the worst cases the increase of nBIAS and nRMSE is less than 15% for UMS and less than 10% for E5L.
E5L outperforms UMS in assessing reference evapotranspiration. However, both datasets show very high performances. For UMS, nBIAS ranges between −28% and 23% of the corresponding observed ET0 monthly mean and nRMSE ranges between 11% and 32% of the corresponding observed ET0monthly mean, with an average value of 22%. For E5L, nBIAS ranges between −12% and 8% of the corresponding observed ET0 monthly mean and nRMSE ranges between 7% and 27% of the corresponding observed ET0 monthly mean, with an average value of 17%.
The most accurate and precise ET0 estimates based on UMS and E5L occurs in July, which is also the month with highest ET0 in the study region. UMS performances drop in April, May and slightly in September. When the aerodynamic term of Penman Monteith equation is prevalent over the radiation term, the uncertainty about VPD and wind speed exerts a major role in the estimation of ET0. Hence, the performance of reanalysis models in April and May is influenced by the larger BIAS and RMSE (Figure 4).

Discussion and Conclusions
Regional maps of daily reference evapotranspiration are fundamental for addressing many water resources management issues, especially in those areas, such in Mediterranean regions, where significant freshwater abstractions are practiced for crop irrigation in the dry seasons [4]. The computation of FAO-56 daily reference evapotranspiration (ET0) requires dataset of multiple weather variables, recorded by networks of automatic weather stations (AWSs). The spatial distribution of  Figure 7a shows the differences between absolute values of nBIAS, while Figure 7b between nRMSE: positive values (red color on the color bar) mean that the kriging estimator is performing better than reanalysis and negative values mean the opposite. In most cases, kriging estimator outperforms reanalysis data, but in the worst cases the increase of nBIAS and nRMSE is less than 15% for UMS and less than 10% for E5L. E5L outperforms UMS in assessing reference evapotranspiration. However, both datasets show very high performances. For UMS, nBIAS ranges between −28% and 23% of the corresponding observed ET 0 monthly mean and nRMSE ranges between 11% and 32% of the corresponding observed ET 0 monthly mean , with an average value of 22%. For E5L, nBIAS ranges between −12% and 8% of the corresponding observed ET 0 monthly mean and nRMSE ranges between 7% and 27% of the corresponding observed ET 0 monthly mean, with an average value of 17%.
The most accurate and precise ET 0 estimates based on UMS and E5L occurs in July, which is also the month with highest ET 0 in the study region. UMS performances drop in April, May and slightly in September. When the aerodynamic term of Penman Monteith equation is prevalent over the radiation term, the uncertainty about VPD and wind speed exerts a major role in the estimation of ET 0 . Hence, the performance of reanalysis models in April and May is influenced by the larger BIAS and RMSE (Figure 4).

Discussion and Conclusions
Regional maps of daily reference evapotranspiration are fundamental for addressing many water resources management issues, especially in those areas, such in Mediterranean regions, where significant freshwater abstractions are practiced for crop irrigation in the dry seasons [4]. The computation of FAO-56 daily reference evapotranspiration (ET 0 ) requires dataset of multiple weather variables, recorded by networks of automatic weather stations (AWSs). The spatial distribution of these AWSs is generally too coarse and irregular for assessing the spatial variability of weather variables, such as solar radiation and wind [76].
The availability of observed weather data is also hindered by the policy data access adopted by the hydrometeorological service providers. This study examined the case of the Campania Region (Southern Italy), where the Regional Hydrometeorological Service is operating a network of 18 automatic weather stations, recording a complete set of weather data since year 2008, across an area of 14,000 km 2 . The spatial density of this network is relatively high if compared with those examined by previous studies on this topic: Paredes et al. [45] analyzed data of 24 stations in Portugal (about 92,000 km 2 ) while Martins et al. [41] collected data observed at 130 station across the entire Iberian Peninsula (597,000 km 2 ). Indeed, the access to observed weather data can be even more difficult for areas extending across regional or national boundaries. In Italy, for instance, weather monitoring networks are managed by regional and national services, without a common data sharing policy and data distribution platform [77]. Pan European gridded agrometeorological maps released to support the EU Common Agricultural Policy, are derived by interpolating observed weather data collected in the different Member States, with uneven distributions, not consistent with what would be required based on the topographic complexity (e.g., Agri4Cast Resources Portal, Gridded Agro-Meteorological Data in Europe [78]).
These constrains in the availability of observational weather data are the main motivations of the growing interest toward the application of reanalysis data in hydrological studies.
The quality of the reanalysis data is rapidly growing, thanks to the significant research efforts in providing temporally consistent combination of the most advanced numerical weather models and the available climate observations. This study examined the last two reanalysis products distributed by the Copernicus Climate Change Service: the pan-European regional reanalysis UERRA MESCAN-SURFEX (UMS) with a spatial resolution of 5.5 km; and the global land surface reanalysis ERA5-Land (E5L) with a spatial resolution of nine kilometers.
We found that ERA5-Land systematically outperformed UERRA MESCAN-SURFEX in reconstructing the historical weather variables required for computing ET 0 . This result testifies that the choice of the gridded product should be not merely driven by its nominal spatial resolution. ERA5-Land benefits of the latest ERA5 global reanalysis, downscaled from 31 km to 9 m by applying the lapse rate correction to account for the influence of the altitude on the spatial structure of the weather variables. UMS is instead a downscaled product of UERRA-HARMONIE regional reanalysis at spatial resolution of 11 km, which boundary conditions were defined by the ERA-Interim global reanalysis, which production ceased in 2019, being substituted by the more advanced ERA5.
Reanalysis products are generally evaluated by point-to-point comparisons, consisting in assessing the difference between the observed data and the corresponding reanalysis value, obtained by linear interpolation of the reanalysis data at the numerical grid points surrounding the observation site, e.g., [41,45]. In this study, we aimed at assessing the reanalysis performances with what could be achieved by spatially interpolating AWSs observations, following established methods in regional scale analyses [75,79]. AWSs data were interpolated by universal kriging and ordinary kriging, combined with a cross validation procedure for retrieving the corresponding indicators of performances, which were then compared with the reanalysis products.
Reanalysis temperature and wind speed exhibited better performances than other variables, and comparable to those achieved by kriging interpolations (Figure 4). Indeed, as a general aspect of numerical weather predictions, prognostic variables such as temperature and wind speed, since they are directly derived by the numerical integration of dynamic laws of the atmosphere, are better predicted than diagnostic variables.
Our findings confirm that the numerical solar radiation reconstructions are most prone to error. UMS estimate errors of solar radiation were almost 2-3 times larger than E5L. UMS results were coherent with other studies that evidenced ERA-Interim tendency to overestimate solar radiation [45,80]. Improvement in solar radiation estimates is certainly one of the most desirable aspect from future reanalysis products, such as the upcoming Copernicus regional reanalysis for Europe (CERRA) [57], especially because solar radiation errors have the highest impact on the predicted evapotranspiration [52,81]. The availability of weather data on grid points located over the sea close to the coastline would be also desirable, to avoid the issue experienced when assessing weather conditions close to the coast with E5L (Figure 2b).
Still, ET 0 estimate performances based on the examined reanalysis data were satisfactory if compared with those obtained by spatially interpolating the observed data. Reanalysis-based ET 0 presented average nBIAS lower than 5%. Average nRMSE ranged from 17% with E5L to 22% with UMS. The values of these performance indicators were not far from those obtained by kriging interpolation: average nBIAS of 1% and average nRMSE of 14%.
ET 0 estimate performances based on E5L and UMS were better than those presented by Martins et al. [41] for the Iberian Peninsula based on NCEP/NCAR blended reanalysis data, and those presented by Parades et al. [45] for Continental Portugal based on the ERA-Interim.
Another interesting result is that differences in statistical performances between reanalysis-and observation-based ET 0 estimates were small: ET 0 predicted with E5L and the kriging estimator presented average differences in nRMSE smaller than 3%; while in the comparison between UMS and kriging estimator, the differences in nRMSE were smaller than 8%. The corresponding differences were even smaller and negligible in terms of nBIAS.
These outcomes confirm that for regions with limited past weather data archives or served by sparse and irregular monitoring networks, reanalysis data can be successfully exploited as a source of gridded weather data. Nevertheless, the availability of ground-based weather observations always represents an added value that must be used in combination (and not in substitution) with reanalysis data, to improve our knowledge of the past weather in a region.
A preliminary bias correction of the reanalysis products is always advisable if observational data are available in the region. In this study, we used a simple regionally uniform bias correction, applied to all variables except for wind speed, for which we did not find any systematic bias pattern, consistently with previous reports [82]. Indeed, the development of different procedures for improving the bias-correction has been the topic of several research studies [43][44][45]76]. However, we believe that this topic must be addressed in a more general research framework, aimed at evaluating the optimal integration of observational and reanalysis data for regional hydrological studies [76]. data are Copernicus products belonging to European Union and accessible on the ECMWF Copernicus portal. UERRA data are available thanks to the EU FP7 Collaborative Project (Grant agreement 607193).

Conflicts of Interest:
The authors declare no conflicts of interest.

Appendix A. Kriging Interpolation
The theoretical bases of geostatistical methods have been described in detail by Journel & Huijbregts [73] and Cressie [74], among others. Here, we just summarize some main aspects to consider for performing a 2-D spatial interpolation by kriging.
In geostatistical methods, the variable of interest, Z, is considered as a stochastic spatial random process, i.e., a random function of space x, Z(x), where x is the vector containing a pair of coordinates, such as longitude and latitude. In this framework, an observed value of the variable of interest at a sampling location, z(x i ), represents a realization of the random variable Z(x i ). The set of random variables {Z(x 1 ), Z(x 2 ), . . . , Z(x m )} constitutes a random function for which only one realization, given by the set of observed values at the m sampling locations {z(x 1 ), z(x 2 ), . . . , z(x m )}, is known.
The purpose of kriging is to predict data values at ungauged locations x k using information available at the m sampling locations {x 1 , x 2 ,..., x m }. This can be carried out by expressing Z(x k ) as a linear combination of {Z(x 1 ), Z(x 2 ), . . . , Z(x m )} as follows: where the weights w i , for i= 1, . . . m, are obtained by solving the kriging system that provide the optimal (minimum variance) unbiased linear estimator [73]. In particular, ordinary kriging assumes the hypothesis of constant unknown mean, µ, over the search neighborhood of x k . The ordinary kriging system can be written as follows: where γ is the semi-variogram of the random process/field. Under the hypothesis of an intrinsically stationary, isotropic random field Z(x), the semi-variogram of the field unambiguously describes the spatial autocorrelation of the sampling locations and it can be estimated from data as follows: where h is the varying distance (lag) between two sampling locations. Then, the experimental semi-variogram has to be fitted by a suitable parametric analytic function to define the theoretical semi-variogram of the field, γ that is needed in the system of Equation (A2). Among various analytical function used to fit the experimental semi-variograms, here we recall the exponential curve: where h is the lag distance; n, s and r are parameters of the curve called nugget, partial sill and range, respectively. The nugget is the value of the semi-variogram at lag equal to zero and it is theoretically zero. However, in presence of small-scale variability and/or in presence of measurements errors, the nugget is not zero (nugget effect). The range defines the distance within sample locations are spatially correlated while the value that the semi-variogram model reaches at a distance equal to the range is called the sill. The partial sill is the sill minus the nugget.
Appendix B. The Network of Temperature Sensors and Sensitivity of the Kriging Estimator to Network Density Figure A1 shows, on the map of the region, the location of the 59 ground stations that are equipped with air temperature sensors along with the 18 ground automatic weather stations (AWSs) equipped with a complete set of weather sensors. The comprehensive network that measures temperature is made up of 77 stations and it was used in the study to interpolate temperature data via regression kriging by cross-validation and to compute the reference evapotranspiration at the 18 AWSs.
In order to evaluate the sensitivity of the kriging estimator to the network density, the performances of the kriging estimator at the 18 AWSs was evaluated at the 18 AWS sites by using alternatively the whole set of 77 stations and the 18 AWSs. The results in terms of BIAS and RMSE obtained by cross-validation are shown in Figure A2.  Figure A2 shows boxplots of BIAS and RMSE of the temperature estimates at each of the 18 AWSs obtained with kriging estimates recursively conditioned to the air temperature observed at the residual 76 stations and 17 stations, respectively, according to the cross-validation procedure. In the former case, the mean BIAS is 0.37 °C, ranging from −0.80 °C to 1.64 °C; the RMSE is on average 1.11 °C, varying from 0.81 to 1.98. In the latter case, the mean BIAS is very close to zero as expected, ranging from −1.61 °C to 1.32 °C and the RMSE ranges from 0.46 °C to 1.96 °C with a mean value of 1.15 °C. The results show a low sensitivity of the kriging interpolator of temperature to network density. Figure A1. Eighteen ground weather station sites (red triangles) and 59 ground temperature station sites (orange triangles): 77 stations in total. Figure A2 shows boxplots of BIAS and RMSE of the temperature estimates at each of the 18 AWSs obtained with kriging estimates recursively conditioned to the air temperature observed at the residual 76 stations and 17 stations, respectively, according to the cross-validation procedure. In the former case, the mean BIAS is 0.37 • C, ranging from −0.80 • C to 1.64 • C; the RMSE is on average 1.11 • C, varying from 0.81 to 1.98. In the latter case, the mean BIAS is very close to zero as expected, ranging from −1.61 • C to 1.32 • C and the RMSE ranges from 0.46 • C to 1.96 • C with a mean value of 1.15 • C. The results show a low sensitivity of the kriging interpolator of temperature to network density. former case, the mean BIAS is 0.37 °C, ranging from −0.80 °C to 1.64 °C; the RMSE is on average 1.11 °C, varying from 0.81 to 1.98. In the latter case, the mean BIAS is very close to zero as expected, ranging from −1.61 °C to 1.32 °C and the RMSE ranges from 0.46 °C to 1.96 °C with a mean value of 1.15 °C. The results show a low sensitivity of the kriging interpolator of temperature to network density.