Recurrence Spectra of European Temperature in Historical Climate Simulations

We analyse and quantify the recurrences of European temperature extremes using 32 historical simulations (1900–1999) of the fifth Coupled Model Intercomparison Project (CMIP5) and 8 historical simulations (1971–2005) from the EURO-CORDEX experiment. We compare the former simulations to the 20th Century Reanalysis (20CRv2c) dataset to compute recurrence spectra of temperature in Europe. We find that (1) the spectra obtained from the model ensemble mean are generally consistent with those of 20CRv2c; (2) spectral biases have a strong regional dependence; (3) the resolution does not change the order of magnitude of spectral biases between models and reanalysis; and (4) the spread in recurrence biases is larger for cold extremes. Our analysis of biases provides a new way of selecting a subset of the CMIP5 ensemble to obtain an optimal estimate of temperature recurrences for a range of time-scales.


Introduction
Climate events linked to extremes of temperature (heatwaves/cold spells) [1][2][3] have severe impacts on human health and natural ecosystems [4][5][6]. Heatwave events have increased in Europe within the last decades in frequency or intensity, while cold spells have decreased in frequency and intensity since 1950 [7][8][9][10][11]. An important question for evaluating the future evolution of such events is whether climate models are able to reproduce temperature distributions and their extremes.
Several studies have already focused on this issue [12][13][14][15][16][17][18]. Most of them show substantial inconsistencies between observed and modeled temperature distributions. Morak et al. [16] use the Hadley Centre Global Environmental Model, version 1 (HadGEM1) with both anthropogenic and natural forcings to demonstrate that, although the model shows a tendency to significantly overestimate changes in warm extremes, changes in temperature extremes are generally well captured by the model. In the latest version, HadGEM3-A [19], the model overestimates warm extremes, especially in central/northern Europe, while cold extremes appear well simulated. One of the most striking findings in [19] is the ability of the model to capture and reproduce the main observed North-Atlantic atmospheric weather regimes responsible for temperature and precipitation extreme events.
Regarding Coupled Model Intercomparison Project (CMIP) models, Sillmann et al. [18] show that the spread amongst CMIP5 models for several temperature indices (including extremes) is reduced compared to CMIP3 models, and that the median model climatology outperforms individual models for all indices. In Krueger et al. [15], a comparison of weather patterns from CMIP5 models with patterns derived from ERA-Interim indicates that climate models simulate mechanisms associated with temperature extremes reasonably well, in particular circulation-based mechanisms. More recently, Li et al. [20] find that climate models forced with natural and anthropogenic historical forcings underestimate changes in temperature extremes. Conversely, other studies, e.g., Kharin et al. [21], have found that climate models provide a good estimation of warm temperature extremes but poor estimations of cold extremes, especially in sea-ice-covered areas.
Most of those analyses focus on extremely low or high temperature values, and model those extremes with Generalized Extreme Value (GEV) distributions [22][23][24]. Studies focusing on climate variability have investigated the Fourier spectra of temperature from observations and climate model simulations. Such studies have emphasized the role of periodic or quasi-periodic features of the climate system [25].
The present paper combines the paradigms of Fourier spectra and extreme events by focusing on the range of return levels associated with return periods for rare temperature events. We thus want to assess the whole spectrum of probabilities of events linked to temperature, and investigate how climate models simulate rare temperature events. In this approach, we estimate the recurrences for points of the phase space of the underlying dynamical system by computing probabilities of events. Those probabilities have known GEV asymptotic distributions [26,27] that can be used to check the reliability and robustness of estimated return levels. The recent study of Faranda et al. [28] shows that the recurrence method can be adapted to the study of atmospheric variables even when the exact underlying dynamics of the system (in terms of dynamical systems) is unknown. In [28], rare recurrences are computed using different approaches (recurrences and block maxima), comparing the IPSL-CM5 historical model within the CMIP5 historical experiment with two long-term reanalyses (ERA20C and 20CR). Their results show that, with respect to the traditional approaches, the recurrence technique is sensitive to the change in the size of the selection window of extremes due to the conditions imposed by the dynamics.
In this paper we investigate the recurrence properties of temperature values in ensembles of climate model simulations, using the technique developed in Faranda and Vaienti [29]. This allows us to quantify the properties of rare values of the system by accounting for its chaotic nature. To perform our analysis, we use the multi-model ensemble of coupled ocean-atmosphere General Circulation Models (GCMs) provided by the Coupled Model Intercomparison Project Phase 5 (CMIP5) [30]. The main goal of our analysis is to evaluate how well CMIP5 model simulations covering 1900-1999 represent the recurrences of extreme events, using a dynamical systems technique developed by Faranda et al. [28], and to further investigate the existence of biases between models and observations. In order to assess the dependence of the results on the horizontal resolution, we also use the bias-corrected EURO-CORDEX climate projections for the period 1971-2005.
This paper is organized as follows. First, we present the data sets (models and observations) and the region of our study, and describe our analysis method. Then, we compute and analyze temperature return levels and compare them to the 20th Century Reanalysis 20CRv2c dataset. Finally, we discuss and summarize the main results.

Data
We base our analysis on daily 2-m temperature (t2m) from the historical experiment (1900-1999) of the Coupled Model Intercomparison Project Phase 5 (CMIP5) [30]. CMIP5 daily model output is available for 32 models for the historical experiment. The rationale of the CMIP5 ensemble is that it samples a large fraction of the possible climate states that are compatible with observed natural and anthropogenic forcings, rather than one trajectory (i.e., the observations). In order to compute the return levels, we compare those 32 models (Table 1) with the 20th Century Reanalysis data version 2c (20CRv2c, [31]). We have selected 20CRv2c in this study because it is the latest version of 20CR and has bias correction applied to the sea-ice distribution by assimilating new SST and sea-ice cover (SIC) data [32]. However, we found similar results (not shown here), with slight differences for some models, when we tested other reanalyses (ERA20C and the former 20CR; results for IPSL in [28]). We consider the ensemble mean of the 56 realizations of 20CR, as is usually done in the literature. Alvarez-Castro et al. [7] analyzed all the ensemble members for European heatwaves and argued that the ensemble mean provides robust features of the dynamical properties. The analysis is focused on the European region, between 35°N-62°N and 12°W-32°E.
The horizontal resolution of 20CR is 2° × 2° and, for a better comparison, all the datasets have been bilinearly re-interpolated onto the 20CRv2c grid. Models are sorted in tables and figures by decreasing resolution. We then compute the Ensemble Model Mean (EMM) and the standard deviation of the model ensemble (SDM), since these are useful synthetic indicators of the model spread. Then, we compare the EMM with the ensemble mean of 20CRv2c (an ensemble composed of 56 members). To complete our study, we analyse the outputs of 8 post-processed regional model simulations (Table 2) from the bias-corrected EURO-CORDEX climate projections for the period 1971-2005. The EURO-CORDEX initiative is a part of the global Coordinated Regional Downscaling Experiment (CORDEX, http://wcrp-cordex.ipsl.jussieu.fr/) to improve regional climate scenarios for the land regions worldwide. It provides regional climate projections for Europe at 50 km and 12.5 km resolution, which downscale the CMIP5 global climate projections and the RCP scenarios (see [33,34] for further information). The bias correction methodology uses the general Cumulative Distribution Function transform method (CDFt) of [35]. It assumes a reference period over which observation-based data is available: 1971-2005. The reference dataset is the 0.22-rotated E-OBS version 10 data set [36] that can be downloaded from http://eca.knmi.nl.
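As a minimal sketch of the two ensemble indicators introduced above, the following Python snippet computes the EMM and SDM at each grid point and the bias of the EMM with respect to a reference. The function name, the toy values, and the use of the population standard deviation are assumptions made for illustration; the text does not specify which standard deviation estimator is used.

```python
from statistics import mean, pstdev

def ensemble_mean_and_spread(fields):
    """Return (EMM, SDM) for a list of per-model fields on a common grid.

    fields: one flat list of grid-point values per model, all on the same grid.
    The population standard deviation is an assumption of this sketch.
    """
    n_points = len(fields[0])
    if any(len(f) != n_points for f in fields):
        raise ValueError("all models must be on the same grid")
    emm = [mean([f[i] for f in fields]) for i in range(n_points)]
    sdm = [pstdev([f[i] for f in fields]) for i in range(n_points)]
    return emm, sdm

# Toy example: three hypothetical models, two grid points (return levels in °C).
models = [[30.0, -10.0], [32.0, -14.0], [31.0, -12.0]]
reference = [30.5, -11.0]  # stands in for the 20CRv2c ensemble mean
emm, sdm = ensemble_mean_and_spread(models)
bias = [e - r for e, r in zip(emm, reference)]  # EMM minus reference
print(emm, sdm, bias)
```

In practice the fields would first be interpolated onto the common 2° × 2° grid, which this sketch takes as given.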

Methods
We assume that the climate variables have trajectories that wind around a chaotic attractor that contains the underlying dynamics of the system. We fix an arbitrary temperature T* and we consider the probability that the time series T(t) returns within a tolerance ε of T*. The return period is defined as the average time it takes for the time series to "hit" this interval. More precisely, the recurrence technique computes the probability Pr that the variable T returns to an interval of radius ε centered on T*: Pr(T* − ε < T < T* + ε).
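The two quantities defined above can be illustrated with a short Python sketch that estimates them empirically from a time series. The function names and the toy series are hypothetical, and the mean gap between successive visits is used here as a simple estimate of the return period.

```python
def recurrence_probability(series, t_star, eps):
    """Fraction of time steps with T* - eps < T(t) < T* + eps."""
    hits = sum(1 for t in series if t_star - eps < t < t_star + eps)
    return hits / len(series)

def mean_return_time(series, t_star, eps):
    """Average number of steps between successive visits to the interval.

    Returns None when the interval is visited fewer than two times.
    """
    hit_times = [i for i, t in enumerate(series) if t_star - eps < t < t_star + eps]
    if len(hit_times) < 2:
        return None
    gaps = [b - a for a, b in zip(hit_times, hit_times[1:])]
    return sum(gaps) / len(gaps)

# Toy series (hypothetical temperatures in °C), target T* = 15 with radius 0.5.
toy = [10, 12, 15, 12, 9, 15, 14, 15]
print(recurrence_probability(toy, 15.0, 0.5))  # visits at indices 2, 5 and 7
print(mean_return_time(toy, 15.0, 0.5))        # gaps of 3 and 2 steps
```

For a long stationary series the mean return time is expected to approach the inverse of the recurrence probability; on a short sample such as this one the two differ.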
Following [29], the algorithm for defining the spectrum of recurrences of a temperature time series is:

1. Fix T* and form the series of the observable g(T*, t), which is large when T(t) is close to T*, so that maxima of g correspond to the minima of |T* − T(t)| (see Figure 1).
2. Divide the series g(t) into n bins each containing m data and extract the maxima M_j, with j = 1, ..., n.
3. Fit the maxima M_j with a generalized extreme value distribution. Distribution functions like Pr(M_n ≤ z) are modelled, for n sufficiently large, by the so-called generalized extreme value (GEV) distribution, which depends on three parameters ξ ∈ R, κ ∈ R, σ > 0 such that:

$$\Pr(M_n \le z) \approx \exp\left\{-\left[1 + \xi\,\frac{z-\kappa}{\sigma}\right]^{-1/\xi}\right\}, \qquad 1 + \xi\,\frac{z-\kappa}{\sigma} > 0.$$

   The parameter ξ is called the tail index, while κ and σ play the roles of location and scale; when ξ is 0 (understood as the limit ξ → 0), the GEV corresponds to the Gumbel type of distribution. Indeed, this is the expected distribution of recurrences, provided that we use the g(T*, t) observable.
4. Perform an Anderson and Darling [37] test to assess whether the fit is compatible with a Gumbel distribution.

If the fit is found to be compatible with a Gumbel distribution, one can repeat the procedure for shorter bin lengths and find the smallest m such that, for the chosen T*, the fit converges. This defines the shortest convergent recurrence time τ and its corresponding return level ε. Note that it is not possible for every T* to find a value of m such that the fit to the Gumbel law is acceptable. The range of values T* for which there is a suitable m defines the (ε, m) spectrum of recurrences, as illustrated in Figure 1. Rare temperatures are located in the white area between the red lines and the blue area. This provides an alternative definition of maximum and minimum temperatures based on the rarity of the recurrences. In the following, we use this definition for hot and cold temperature extremes. This representation of a recurrence spectrum (i.e., temperature levels as a function of return times) is analogous to a Fourier spectrum (i.e., temperature values as a function of frequency), but for nonperiodic phenomena.
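The steps above can be sketched in Python using only the standard library. Several choices here are assumptions and not the authors' implementation: the observable is taken as g = −log|T(t) − T*|, the Gumbel parameters are estimated by the method of moments rather than by a full GEV fit, and the 5% critical value 0.757 with the small-sample factor (1 + 0.2/√n) is an assumed tabulated value that is only approximate for moment estimates. All function names are hypothetical.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329

def observable(series, t_star, floor=1e-9):
    """Assumed observable g = -log|T(t) - T*|; large when T(t) is near T*."""
    return [-math.log(max(abs(t - t_star), floor)) for t in series]

def block_maxima(values, m):
    """Maxima M_j of consecutive bins of m values (incomplete last bin dropped)."""
    n = len(values) // m
    return [max(values[j * m:(j + 1) * m]) for j in range(n)]

def fit_gumbel_moments(maxima):
    """Method-of-moments Gumbel fit; returns (location kappa, scale sigma)."""
    n = len(maxima)
    mu = sum(maxima) / n
    var = sum((x - mu) ** 2 for x in maxima) / (n - 1)
    sigma = math.sqrt(6.0 * var) / math.pi
    return mu - EULER_GAMMA * sigma, sigma

def gumbel_cdf(x, kappa, sigma):
    z = max((x - kappa) / sigma, -700.0)  # guard against overflow in exp
    return math.exp(-math.exp(-z))

def anderson_darling_gumbel(maxima, kappa, sigma):
    """Anderson-Darling statistic A^2 of the sample against Gumbel(kappa, sigma)."""
    xs = sorted(maxima)
    n = len(xs)
    tiny = 1e-12
    cdf = [min(max(gumbel_cdf(x, kappa, sigma), tiny), 1.0 - tiny) for x in xs]
    s = sum((2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
            for i in range(n))
    return -n - s / n

def shortest_convergent_bin(series, t_star, bin_lengths, crit=0.757):
    """First bin length (in samples, ascending) whose maxima pass the test, else None."""
    g = observable(series, t_star)
    for m in sorted(bin_lengths):
        maxima = block_maxima(g, m)
        if len(maxima) < 10:  # too few maxima for a meaningful fit
            continue
        kappa, sigma = fit_gumbel_moments(maxima)
        if sigma <= 0:
            continue
        a2 = anderson_darling_gumbel(maxima, kappa, sigma)
        if a2 * (1.0 + 0.2 / math.sqrt(len(maxima))) < crit:
            return m
    return None

# Synthetic daily series: seasonal cycle plus noise (purely illustrative).
random.seed(1)
temps = [10 + 10 * math.sin(2 * math.pi * d / 365) + random.gauss(0, 3)
         for d in range(100 * 365)]
m_star = shortest_convergent_bin(temps, t_star=25.0,
                                 bin_lengths=[182, 365, 730, 1095, 1460])
print("shortest convergent bin length (days):", m_star)
```

Repeating the scan over a grid of T* values, and recording for each one whether a suitable m exists, would trace out a region comparable to the blue area of Figure 1.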
We consider 100 years (1900-1999) of daily temperature data for the whole year. This allows testing return times m from 6 months up to 4 years [28]. For recurrence windows longer than 4 years, we get fewer than 25 block maxima (100 years divided by the bin length) and unreliable estimates of the Gumbel distributions. Conversely, events chosen for bin lengths shorter than 6 months cannot be considered as rare. We are dealing with non-stationary time series. However, the values of ε in the spectra are not on the tails of the distribution of T*, which implies that the non-stationarity (about 0.5 °C/century) does not affect the convergence to the spectrum.
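The bound on the bin length follows from simple arithmetic, sketched below in Python: a 100-year record divided into bins of m years yields 100/m block maxima. The function name is hypothetical.

```python
RECORD_YEARS = 100  # 1900-1999

def n_block_maxima(bin_years, record_years=RECORD_YEARS):
    """Number of complete bins (hence block maxima) for a given bin length in years."""
    return int(record_years // bin_years)

for m in (0.5, 1, 2, 3, 4, 5):
    print(f"m = {m} years -> {n_block_maxima(m)} maxima")
```

With m = 4 years the fit rests on 25 maxima, and any longer window leaves fewer than that.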
The procedure and the parameters used in this study are the same as in Faranda et al. [28].

Results
We evaluate how well CMIP5 historical simulations represent the recurrences of extreme events using the methodology of [28], and illustrate the biases between CMIP5 models and a reference (20CRv2c). Here the term bias refers to the difference between CMIP5 and 20CRv2c recurrence spectra.

Changes in European Temperature Recurrence Spectrum
We illustrate the capabilities of the method by showing the results for three specific locations over Europe (Figure 2). Here the curves represent the (ε, τ) spectrum for grid points in central (a), northern (b) and southern (c) Europe. At the southern grid point, cold extremes are rare for temperatures lower than 0 °C. All the models show that return levels for hot temperature extremes are similar for 1 ≤ m ≤ 4 years (Figure 2). The EMM and the SDM show similar behavior, with ε almost constant for 1 ≤ m ≤ 4 years.
However, return levels for the cold temperature extremes do depend on the bin length.For cold (hot) temperature extremes the disagreement (agreement) between the models increases (decreases) with the bin length, m = 4 (m = 1) and differences between 20CRv2c and EMM also increase (decrease) with the bin length m = 4 (m = 1).
The agreement between return levels for models and 20CRv2c (Figure 2) depends on the location. For location (a), the match between EMM and 20CRv2c is good for both the hot and the cold temperature extremes. Only a few models deviate significantly from the reanalysis. For location (b), the representation of cold and warm extremes is poor both for single models and for the EMM. Even if the single models are inconsistent with 20CRv2c, the EMM follows the behavior of the reanalysis, but it underestimates the hot and overestimates the cold temperature extremes. In location (c), models are more coherent, although the EMM is shifted towards colder extremes for both the hot and the cold temperature extremes.
The distribution of the biases in return levels at all grid points is provided in Figure 3, in blue for the cold and red for the hot temperature extremes. Numbers correspond to the CMIP5 models in Table 1, sorted by decreasing resolution in order to investigate the role of horizontal resolution. Figure 3 illustrates the probability distribution of biases between models and 20CRv2c, showing the results for bin lengths m of 1 year (panels a and c) and 4 years (panels b and d). Table 3 contains the statistical information in Figure 3. The boxplots confirm the results obtained at the specific locations (Figure 2): the cold extremes show larger biases than the hot extremes. This is also evident in Figures 4 and 5. Figure 4 shows the errorbars of the biases in cold extremes (a) and hot extremes (b) for all the bin lengths m. Moreover, biases in warm extremes do not change significantly with the bin length. A Kolmogorov-Smirnov test shows that the distribution of biases is neither Gaussian nor centered around zero for the models and the ensemble (Figure 5). In Figure 5, the number of outliers is plotted against the mean of the biases (a) and the standard deviation (b) for m = 1 and m = 4, by model (higher/lower resolution in dark/light colours). In most cases, there are many outliers, despite the fact that the standard deviation of the biases (Figure 4) is relatively small (Table 3). The ideal case is only found for the hot temperature extremes at m = 4 years of the MIROC5 model. This analysis shows that biases are not linked to the low/high resolution of the GCMs since there is no trend in Figures 3-5.
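The box-plot statistics and outlier counts discussed above can be sketched as follows. This is an illustration only: the 1.5 × IQR whisker rule and the "inclusive" quartile convention are assumptions (the text does not state which conventions were used), and the function name and toy data are hypothetical.

```python
from statistics import quantiles

def boxplot_summary(biases, whisker=1.5):
    """Median, quartiles and outlier count for a list of grid-point biases.

    Points beyond whisker * IQR from the box are counted as outliers
    (an assumed convention, common in plotting software).
    """
    q1, q2, q3 = quantiles(biases, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    n_outliers = sum(1 for b in biases if b < low or b > high)
    return {"median": q2, "q1": q1, "q3": q3, "n_outliers": n_outliers}

# Toy biases in °C for one hypothetical model; 8.0 lies far outside the box.
summary = boxplot_summary([-1.0, -0.5, 0.0, 0.5, 1.0, 8.0])
print(summary)
```

A small standard deviation together with a large outlier count, as reported for most models, corresponds to a narrow box with many isolated points beyond the whiskers.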
We now focus on the spatial distribution of biases in order to investigate their coherence. We concentrate on two quantities: the average bias of the EMM, obtained as the difference between the ensemble mean and 20CR return levels at each grid point (Figure 6), and the SDM at each grid point (Figure 7). As in Figures 3-5, the cold extremes (Figure 6a,b) show larger deviations than the hot extremes (Figure 6c,d). This analysis shows the geographical distribution of biases: for the cold extremes, the southwestern Iberian Peninsula shows the largest positive biases, while the largest negative biases occur in the Scandinavian region. The SDM (Figure 7) shows a coherent spatial structure for both the cold extremes (a, b) and the hot extremes (c, d). For the cold temperatures, larger biases appear mostly in the Scandinavian peninsula. For hot temperatures, biases are in general small for both m = 1 and m = 4 years, except at some specific grid points, mainly concentrated over the Baltic Sea.

Figure 5 (caption fragment). Dark blue represents cold temperature extremes for the 16 models having higher resolution while light blue represents cold temperature extremes for the 16 models at lower resolution. For hot temperature extremes, dark red represents higher resolution and pink lower resolution. Detailed statistical information in Table 3.

Table 3. Statistical information extracted from the box plots of the biases (Figure 3) in hot and cold temperature extremes between models and 20CRv2c in 1900-1999 for two different bin lengths: m = 1 year and m = 4 years.

Effects of Resolution: A Regional Application
The sole use of CMIP5 simulations is not conclusive on the dependence of temperature extreme biases on resolution. Indeed, the simulations span only a limited range of scales, of the order of hundreds of kilometers. To complete this study, we therefore analyze the outputs of 8 post-processed regional model simulations (Table 2) from the bias-corrected EURO-CORDEX climate projections for the period 1971-2005. Regional climate models add crucial spatial detail for temperature extremes, allowing us to study fine-scale processes missing in global circulation models (e.g., urban effects). As for the CMIP5 models, we compute the EMM and SDM for the regional simulations. These results are displayed in Figure 8. They show that increasing the resolution does not reduce the order of magnitude of the SDM. As in the CMIP5 case, biases for the hot temperature extremes are smaller than for the cold temperature extremes. The hot extreme biases are mostly located on the Iberian Peninsula, as for CMIP5 models. The cold biases appear anywhere in continental Europe and do not seem to be directly related to the orography. These results are in line with Vautard et al. [34] and Lhotka and Kysely [38]. In Vautard et al. [34], most of the regional models show an overestimation of hot temperature extremes in Mediterranean regions and an underestimation over Scandinavia. A preliminary analysis of the sources of spread shows that the simulation of hot temperatures is primarily sensitive to the convection and the microphysics schemes. Lhotka and Kysely [38] suggest that simulated cold events in central Europe should be analyzed and interpreted with caution, since they may also develop under zonal flow in some models, which contradicts observations.

Discussion
We have analyzed the recurrence spectra for hot and cold temperature values in the European region by using a methodology that provides robust estimates of return levels. In order to assess the dependence of the temperature extreme biases on the horizontal resolution, we have used both CMIP5 global medium-resolution simulations and regional high-resolution runs from EURO-CORDEX.
Our results show that inconsistencies of European recurrence spectra between models and reanalysis can be much larger than the climate change signal in each model. Moreover, although the biases are generally centered around zero, their spatial variability suggests that they are caused not only by the differences in the average temperature of CMIP5 models but also, to a large extent, by model dynamics/physics. For cold temperatures we find that biases depend on the chosen return period, whereas for the hot temperatures this dependence is observed only for return periods shorter than the seasonal cycle. An additional important difference is that, while for warm temperatures the biases are centered around zero, for rare (m = 4 years) cold temperatures the average biases are mostly positive, although the model dependence of the bias is very large. In individual models, the asymmetry between hot and cold temperature recurrences and their biases is probably due to the (mis)representation of the albedo for negative temperatures, as already discussed by [39]. Indeed, an ensemble average of CMIP5 simulations reduces those biases, even though the distributions of hot and cold temperature extremes differ in variance and number of outliers with respect to the 20CRv2c reanalysis.
Our results suggest how to perform model selection in order to avoid the models leading to biased estimates of temperature extremes over extended regions as well as at specific grid points, for both global (CMIP5) and regional (EURO-CORDEX) simulations. This outcome is coherent with Tebaldi and Knutti [40], who argued that the quantification of all aspects of model biases requires multi-model ensembles, ideally as a complement to the exploration of single-model bias.
Our analysis gives geographical details on the distribution of the biases: in general, we find that models have the largest biases in hot temperature values in southwestern Europe for both global and regional simulations. For CMIP5 simulations, cold temperature extreme biases are larger over northern Europe and the multi-model bias spread is larger at the coastal grid points. Improving the resolution does not change the order of magnitude of the averaged biases, although the multi-model spread is reduced on the coasts. Overall, the biases in temperature extremes for recurrences in EURO-CORDEX simulations seem to be related to the physical parameterizations rather than the orography, since some areas with complex geography do not present large biases. This motivates further development of bias correction techniques that include the physical mechanisms at the origin of the biases in cold temperature extremes, e.g., water phase transitions or ice-radiation feedbacks and interactions.

Conclusions
This paper presents a tool for a general assessment of biases in the recurrence spectra for temperatures.We find that:

• The recurrence spectra obtained by the model ensemble mean are generally consistent with those of 20CRv2c.
• The spectra biases have a strong regional dependence.
• A comparison with an ensemble of regional climate simulations shows that the resolution does not change the order of magnitude of spectral biases between models and reanalysis.
• The spread in recurrence biases is larger for cold extremes.
Our analysis of biases provides a new way of selecting a subset of the CMIP5 ensemble to obtain an optimal estimate of temperature recurrences for a range of time-scales. This assessment could be extended to investigate the seasonal dependence of such spectra, or their dependence on future climate change.
In a changing climate, the analysis suggests that biases are so large for some grid points that it would not be straightforward to provide robust regional projections of hot and cold temperature extremes, especially when using only a single model. Moreover, it suggests that sea-land effects should be taken into account, as biases are mostly concentrated around the coasts. In contrast, it seems that averaging the biases over Europe and/or over different models could provide results directly comparable with 20CRv2c.

Figure 1. After Faranda and Vaienti [29]. Region of temperature for which the hypothesis that minima of |T* − T(t)| are GEV distributed is not rejected (blue area) for different periods of recurrence m in years. Return times τ and corresponding return levels ε are represented by the projections of the green curve on the axes. Red lines are the absolute extremes of the series, indicated as global maximum and global minimum.
The three grid points of Figure 2 are located in central Europe (a), northern Europe (b) and southern Europe (c). The different shapes of the red lines (20CR), describing the area of normal recurrences, are explainable in terms of the climate characteristics of the grid points considered. Central Europe (a) features a continental climate with very cold extremes of temperature, −10 °C being rare for a return time of 1 year while −20 °C becomes normal for higher return times. From 1 to 4 years of return time, 25 °C is a normal temperature for warm extremes. The grid point of northern Europe (b) is located close to the Baltic Sea, showing mild temperatures in comparison with a more continental northern point. Even so, the shape of the red line shows that cold extremes of temperature reach −15 °C for normal recurrences at return times of 2-4 years, and warm extremes reach 20 °C at return times of 1-4 years. Temperatures lower than −10 °C are rare for return times of 1 year or less. The southern Europe grid point (c) is located in the Guadalquivir valley, with very hot extremes of temperature reaching 40 °C as normal recurrences from 1 to 4 years of return time.

Figure 2. Recurrence analysis at three specific grid points. Lines represent the curve (ε, τ). (a) Grid point in central Europe with small inter-model biases; (b) grid point in northern Europe with large inter-model biases for cold temperature extremes; (c) grid point in southern Europe with EMM biases for hot temperature extremes and small inter-model biases. Red line: 20CRv2c; black dashed line: EMM and SDM (errorbars); grey lines: single models.

Figure 3. Box plots of the biases in T_min (a,b; in blue) and T_max (c,d; in red) between models and 20CRv2c in 1900-1999. Numbers on the x-axis correspond to CMIP5 models ordered from highest to lowest resolution. Central marks are the median, the edges of the box are the 25th and 75th percentiles, the whiskers extend to the most extreme data points not considered outliers, and outliers are plotted individually. Different bin lengths m are represented: 1 year (a,c) and 4 years (b,d). Detailed statistical information about this figure is given in Table 3.

Figure 4. Errorbars of the biases in (a) T_min and (b) T_max between models and 20CRv2c in 1900-1999. Errorbars correspond to the standard deviation of the mean. Different bin lengths m are represented: 1 year in blue, 2 years in orange, 3 years in yellow and 4 years in purple. Detailed statistical information about this figure appears in Table 3.

Figure 6. Average biases of models: EMM for T_min (a,b) and EMM for T_max (c,d) with respect to the 20CRv2c return levels. Different bin lengths m are represented: 1 year (a,c) and 4 years (b,d). Detailed statistical information about this figure is given in Table 3 (Ensemble). The three black dots in (d) correspond to the central (Figure 2a), northern (Figure 2b), and southern (Figure 2c) points of Figure 2.

Figure 7. Standard deviation of the biases of models: SDM for T_max for hot temperature extremes (a,b) and SDM for T_min for cold temperature extremes (c,d). Different bin lengths m are represented: 1 year (a,c) and 4 years (b,d). Detailed statistical information about this figure is given in Table 3 (Ensemble). The three black dots in (d) correspond to the central (Figure 2a), northern (Figure 2b), and southern (Figure 2c) points of Figure 2.

Figure 8. Recurrence analysis for eight EURO-CORDEX runs at m = 1 year. Average biases of models (a) for hot temperature extremes, EMM for T_max (as in Figure 6c for CMIP5), and (b) for cold temperature extremes, EMM for T_min (as in Figure 6a for CMIP5). Standard deviation of the biases of models (c) for hot temperature extremes, SDM for T_max (as in Figure 7c for CMIP5), and (d) for cold temperature extremes, SDM for T_min (as in Figure 7a for CMIP5).

Table 1. List of CMIP5 models analysed. The order is decreasing in resolution. Longitude × Latitude (°); 3 Centre National de Recherches Meteorologiques - Centre Européen de Recherche et de Formation Avancée en Calcul Scientifique; 4 Atmosphere and Ocean Research Institute (University of Tokyo), National Institute for Environmental Studies, and Japan Agency for Marine-Earth Science and Technology; 5 Commonwealth Scientific and Industrial Research Organisation (CSIRO), Bureau of Meteorology (BOM).

Table 2. List of EURO-CORDEX simulations bias corrected with EOBS10 used in this work. The experiment used was the control historical simulation (r1i1p1 ensemble member at 0.11° resolution).