Validating the Copernicus European Regional Reanalysis (CERRA) Dataset for Human-Biometeorological Applications †

Abstract: In recent years, a considerable body of research has demonstrated the suitability of global and regional reanalysis data for human-biometeorological applications. These applications include the assessment of the outdoor thermal environment and the investigation of its relation to human health, especially in areas where the spatial coverage of surface observational networks is sparse. Here, we present the first comprehensive evaluation of the most recent pan-European regional reanalysis, namely the Copernicus European Regional Reanalysis (CERRA) dataset at 5.5 km spatial resolution, in terms of simulating the observed human bioclimate, as expressed by the modified physiologically equivalent temperature (mPET) that is computed through the RayMan Pro model, and its meteorological drivers. The validation was performed over Greece using up to 11 years of records of 2 m air temperature and relative humidity, 10 m wind speed and global solar radiation derived from 35 sites of the nationwide network of surface weather stations operated by the METEO Unit at the National Observatory of Athens. The ERA5-Land dataset at ~9 km spatial resolution, which represents the current state-of-the-art reanalysis, was also compared against the same observations. Our findings show that the CERRA dataset performs significantly better than the ERA5-Land reanalysis with respect to the replication of the examined meteorological variables and mPET. The added value of the CERRA data is particularly evident during the warm period of the year and in regions that are characterized by complex topography and/or proximity to the coastline. Combining the CERRA dataset with population and mortality data, we further showcase its applicability for human-biometeorological and heat–health studies at a local scale, using the regional unit of Rethymno (Crete) as a pilot area for the analysis.


Introduction
Reanalysis datasets provide a comprehensive description of the observed past climate and have been widely used for various geophysical applications [1]. In recent years, their exploitation for human-biometeorological and heat-related epidemiological studies has also emerged [2,3]. Such applications are motivated by the fact that networks of surface weather stations may be characterized by limited spatial coverage, especially with respect to epidemiological information. They may also include temporal gaps and have limited availability of observations that are necessary for the complete assessment of the human thermal bioclimate (e.g., global solar radiation) [4].
The ERA5 and ERA5-Land (ERA5L) datasets [5] represent the current state-of-the-art global reanalysis, providing multiple near-surface climate variables relevant to heat- and health-related studies at relatively high spatial resolution. Still, they may decline in quality in areas characterized by high variability in topography and coastal geomorphology and/or by limited observational data ingested in the reanalysis [6]. Regional and local gridded climate datasets can be used to overcome these issues [7]. In this direction, the Copernicus European Regional Reanalysis (CERRA) has been recently released, providing a continental reanalysis with very high spatial resolution [8].
The key objective of the present study is to assess the accuracy of CERRA for human-biometeorological and heat-health applications. Within this context, we provide a comprehensive evaluation of the CERRA dataset using up to 11-year-long ground-based observations of 2 m air temperature (T2) and relative humidity (RH2), 10 m wind speed (WS10) and global solar radiation (GSR) in Greece. The RayMan Pro model [9] was also employed for computing (based on the above meteorological data) and comparing the observational- and reanalysis-based modified physiologically equivalent temperature (mPET), a rational index used for assessing the human thermal bioclimate [10]. The same evaluation process was further applied to the ERA5L dataset in order to explore performance differences between the two examined reanalyses. Finally, we demonstrate an exposure-response analysis at a local level using population and mortality data and the CERRA reanalysis for the regional unit (RU) of Rethymno in Crete.

Materials and Methods
The ERA5L reanalysis data are provided at 9 km spatial resolution and are not available for grid cells where the land-sea mask is ≤50%, including many coastal areas and islands [5]. In contrast, the CERRA climate data are available over all grid cells, accounting for the proportion of sea in each very-high-resolution (5.5 km) grid box [8]. In this way, the CERRA reanalysis provides a better representation of the country's coastlines and islands, as illustrated in Figure 1. The same figure also shows the locations of the 35 surface weather stations used for the validation, which are operated by the METEO Unit at the National Observatory of Athens [11]. The observed and reanalysis data have a temporal resolution of 1 h and cover an 11-year period (2010–2020) for most stations. Using reanalysis-observation pairs based on the "nearest neighbor" technique [12], the mean bias (MB), root mean square error (RMSE) and index of agreement (IOA) for T2, RH2, WS10, GSR and mPET were calculated for six distinct regions (Figure 1) and four seasons (DJF: December-January-February; MAM: March-April-May; JJA: June-July-August; SON: September-October-November). The mPET human-biometeorological index was computed at the locations of the weather monitoring sites and for a standard male reference adult using the RayMan Pro model [9,13].
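The pairing and scoring steps described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the nearest grid point is chosen by great-circle (haversine) distance, and the IOA is assumed to follow Willmott's original formulation, since the text does not state which variant was used.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def nearest_grid_index(station, grid_points):
    """Index of the grid point closest to the station; both are (lat, lon) tuples."""
    return min(
        range(len(grid_points)),
        key=lambda i: haversine_km(*station, *grid_points[i]),
    )


def mean_bias(model, obs):
    """MB: mean of (reanalysis - observation); positive means overestimation."""
    return sum(m - o for m, o in zip(model, obs)) / len(obs)


def rmse(model, obs):
    """Root mean square error of the reanalysis against the observations."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))


def index_of_agreement(model, obs):
    """IOA in Willmott's original form (assumed variant); 1 is a perfect match."""
    o_bar = sum(obs) / len(obs)
    num = sum((m - o) ** 2 for m, o in zip(model, obs))
    den = sum((abs(m - o_bar) + abs(o - o_bar)) ** 2 for m, o in zip(model, obs))
    return 1.0 - num / den if den else float("nan")
```

In such a workflow, each station would first be matched to one grid point with `nearest_grid_index`, and the three scores would then be computed on the time-aligned hourly pairs, grouped by region and season.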
For the investigation of the exposure-response relationship between heat stress and mortality, additional mPET computations were performed for the 1999–2018 period using the CERRA data from the Rethymno RU. Population data derived from the Hellenic Statistical Service (HSS) were combined with the CERRA grid cells falling within the Rethymno RU in order to compute population-weighted mPET means for two standard reference seniors: a male and a female [13]. Mortality data provided by HSS for the same period were processed according to Matzarakis et al. [14] for computing the daily relative mortality. Finally, the impact of heat stress, as expressed by the maximum daily population-weighted mPET values, on the relative mortality was studied by applying a 10-day lag and focusing on the warm period of the year (April-October).
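A minimal Python sketch of the aggregation steps described above follows. It is illustrative only: the function names are hypothetical, and the 10-day lag is interpreted here as a trailing 10-day mean, which is one possible reading of the text rather than the documented procedure. The computation of relative mortality after Matzarakis et al. [14] is not reproduced.

```python
from datetime import date


def population_weighted_mean(values, population):
    """Mean of grid-cell values weighted by the population of each cell."""
    total = sum(population)
    if total <= 0:
        raise ValueError("total population must be positive")
    return sum(v * p for v, p in zip(values, population)) / total


def trailing_mean(series, window=10):
    """Mean over the current day and the previous window-1 days.

    Days without a full window behind them get None.
    (Assumed interpretation of the 10-day lag.)
    """
    out = []
    for i in range(len(series)):
        if i + 1 < window:
            out.append(None)
        else:
            chunk = series[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out


def is_warm_period(day):
    """True for dates in April through October, the period analysed in the text."""
    return 4 <= day.month <= 10
```

Under these assumptions, the daily maximum mPET of every grid cell in the regional unit would be reduced to one value per day with `population_weighted_mean`, smoothed with `trailing_mean`, and restricted to warm-period dates with `is_warm_period`.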

Reanalyses Validation
Table 1 shows that CERRA performs significantly better over most seasons and regions with respect to the representation of the observed 2 m air temperature and relative humidity. In more detail, the CERRA reanalysis appears to be primarily cold-biased, with T2 RMSE values being mostly lower than 2 °C. For RH2, the results are mixed with respect to CERRA overestimation/underestimation, and RMSE values are around 10%. The IOA values denote a very strong correlation between the observed and reanalysis T2 values for both datasets, whereas CERRA features a better correlation with RH2 observations, which is characterized as very strong during all seasons and over all regions. Concerning the 10 m wind speed, CERRA outperforms ERA5L in coastal and insular regions (i.e., the North and South Aegean and southern Greece) during all seasons. Both reanalyses primarily overestimate this variable and demonstrate a strong correlation with its observed values. Overestimation is also evident for global solar radiation, with both CERRA and ERA5L featuring a very strong correlation with the GSR in situ observations. ERA5L performs better overall in reproducing the observed GSR values, as denoted by the lower RMSEs. For both reanalyses, the largest GSR errors occur during the warm period of the year (MAM and JJA), exceeding 100 W/m². The observation-based mPET values are primarily underestimated by both climate datasets, with RMSEs being mostly lower than 3.5 °C and the IOA values indicating minimal phase errors in the reanalyses. Overall, the human thermal bioclimate, as assessed by the modified physiologically equivalent temperature, is better represented by the CERRA data in most regions and for all seasons except DJF.

Heat-Health-Related Application
Figure 2 shows a pronounced increase in the 10-day mean relative mortality when the 10-day mean mPET values are greater than 35 °C for both male and female seniors. This highlights the vulnerability of this specific age group, as heat-related issues for senior people begin to occur at relatively low mPET thresholds that correspond to strong heat stress. The exposure-response curves in Figure 2 also demonstrate a greater sensitivity to heat stress for female seniors, as the relative mortality for this subgroup increases faster as the mPET values increase. Notably, when mPET values are close to 39 °C, the estimated increase in mortality for male seniors is about 42%, whereas it exceeds 50% for female seniors.
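The exposure-response curves in Figure 2 are LOWESS fits. As a rough illustration of the technique, the sketch below implements a simplified LOWESS in plain Python: a local linear regression with tricube weights and no robustness iterations. The function name, the `frac` default and the omission of confidence intervals are assumptions made for this example; the software and settings used for the figure are not stated in the text.

```python
import math


def lowess(x, y, frac=0.5):
    """Simplified LOWESS: tricube-weighted local linear fit at every x value.

    frac is the share of points used in each local fit (assumed setting).
    No robustness iterations and no confidence intervals are computed.
    """
    n = len(x)
    k = max(2, min(n, int(math.ceil(frac * n))))  # points per local window
    fitted = []
    for x0 in x:
        # Bandwidth: distance from x0 to its k-th nearest neighbour.
        h = sorted(abs(xi - x0) for xi in x)[k - 1]
        if h == 0:
            # All k neighbours coincide with x0: weight only those points.
            w = [1.0 if xi == x0 else 0.0 for xi in x]
        else:
            w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0  # fall back to the weighted mean
        fitted.append(my + slope * (x0 - mx))
    return fitted
```

Here `x` would hold the lagged maximum daily mPET values and `y` the corresponding relative mortality. Because each fit is locally linear, the sketch reproduces perfectly linear input exactly, which makes it easy to sanity-check.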

Discussion and Concluding Remarks
Currently, ERA5 and ERA5L represent the state-of-the-art gridded climate datasets, providing all necessary inputs for computing, among others, advanced thermal indices. ERA5L data, in particular, are available at enhanced spatial resolution (~9 km), and thus have been widely used for relevant applications (e.g., [6,8]). In the present study, we validated for the first time the most recent pan-European regional reanalysis (CERRA), provided at 5.5 km horizontal grid resolution, in terms of reproducing the observed human bioclimate as expressed by mPET and its meteorological drivers. The validation was performed over Greece using up to 11-year-long hourly time series of ground-based observations at 35 weather monitoring sites.
First, we demonstrated that the CERRA grid can adequately represent the complex coastal and insular areas in Greece (Figure 1). This is of great importance in representing the actual thermal conditions experienced by the people in the study area, as most of the country's population lives in coastal areas. This is reflected in the validation outcomes for both reanalyses (Table 1). Overall, the CERRA dataset yielded most of the best statistical scores for T2, RH2 and mPET, especially during MAM and JJA and in regions with complex geomorphology (e.g., central Greece). This is also true for WS10 in the North and South Aegean and southern Greece, whereas ERA5L GSR values showed the closest agreement with the ground-based observations. However, the differences between the CERRA and ERA5L RMSE values for GSR were relatively small compared to the magnitude of this variable. It is worth noting that the statistical metrics for the examined human bioclimate factors are similar to (or even better than) those provided by other regional modeling and reanalysis studies (e.g., [12,15–18]).
The exposure-response analysis between maximum daily mPET and mortality in the Rethymno RU (Figure 2), an insular regional unit of Greece with a relatively small population (Figure 1), highlights the applicability of CERRA in a human-biometeorological and heat-health context at the local level. This is important when considering that most previous relevant studies were conducted mainly over major metropolitan areas (e.g., [19]). Further, the provided analysis underlines the significance of considering the physiological characteristics of different groups of people when assessing the heat-health nexus. Using mPET for this purpose allowed us to identify the increased vulnerability of the elderly population, especially of female seniors, in the Rethymno RU, which is in agreement with the outcomes of previous physiologically consistent studies [13,14]. It is worth noting that the choice of mPET was based on a preliminary comparison between mPET, PET and the universal thermal climate index (UTCI), which showed that PET exacerbates, whereas UTCI alleviates, heat stress conditions, as also demonstrated by Giannaros et al. [18]. Further, UTCI estimates cannot account for diverse populations, since they are computed for a fixed physiological setting [20].

Figure 1. CERRA grid points over the greater area of Greece with identification of six distinct regions (black outlines), locations of the 35 surface weather stations (green inverted triangles) and the Rethymno regional unit (red outline).

Figure 2. Scatter plot of the daily relative mortality and daily maximum mPET computed based on population-weighted CERRA data over the Rethymno regional unit and considering a 10-day lag. The lines correspond to locally weighted scatterplot smoothing (LOWESS) and the shaded areas represent the 95% confidence intervals around the LOWESS fits.

Table 1. CERRA and ERA5L RMSE values computed for mPET and its meteorological drivers by season and region.