Validation and Selection of a Representative Subset from the Ensemble of EURO-CORDEX EUR11 Regional Climate Model Outputs for the Czech Republic

Abstract: To better understand the impact of climate change at a given location, it is crucial to consider a wide range of climate models that are representative of the area. In this study, we emphasize the importance of the careful validation and selection of climate models most suitable for a particular region. This step is critical to enhance the relevance of climate change impact studies and consequently design appropriate and robust adaptation measures, particularly in agriculture, forestry and water resources management. We propose validation and selection methods for regional climate models that can help identify a smaller group of well-performing models using the Central European area and Czech Republic as examples. In the validation process, 7 out of 19 regional climate models performed poorly. Of the 12 well-performing models, a subset of 7 models was selected to represent the uncertainty in the entire ensemble, which could be used in subsequent studies. The methodology is sufficiently general and may be applied to other climate model ensembles.


Introduction
Climate models play a crucial role in providing valuable insights into future climatic conditions, among other benefits, for various impact studies and serve as the background for climate change adaptation and mitigation strategies [1][2][3]. To ensure that climate change impact studies and all other steps leading to the formulation of adaptation and mitigation strategies are well grounded and reliable at the regional level, climate models should accurately represent regional- to local-scale processes and phenomena. The current generation of global climate models (GCMs) remains too coarse in terms of the spatial resolution to provide a sufficient level of detail. Information retrieved from GCMs must therefore be downscaled by various methods to meet the criteria for climate change impact modeling on fine spatial scales. The downscaling of GCMs over a limited spatial area by a regional climate model (RCM) is one of the most popular approaches, and Giorgi [3] summarized its principles and achievements over the last three decades.
The spatial resolution of RCMs is usually a few tens of kilometers, rendering them a highly preferred choice for the subsequent modeling of climate change impacts. The most popular set of RCM simulations currently used was created within the international CORDEX initiative [4]. CORDEX simulations are based on downscaling GCMs of the Coupled Model Intercomparison Project Phase 5 (CMIP5) [5], and their projected future climate changes have been summarized, for instance, in the new Interactive Atlas of Climate Change [6], a part of the most recent 6th Assessment Report of IPCC Working Group 1 [7]. In addition, a new set of CORDEX simulations obtained by downscaling the newer generation of CMIP6 GCMs is now being prepared and should become available soon. In the European part of CORDEX (the so-called Euro-CORDEX), two ensembles of RCM simulations were prepared: one at a 0.44° (∼50 km) spatial resolution and the other at a 0.11° (∼12.5 km) spatial resolution.
There are more than twenty Euro-CORDEX simulations of future climate conditions at a fine spatial resolution of 0.11° prepared by various groups using several RCMs. To better understand future climate evolution, it is important to adopt a large ensemble of model simulations rather than a single model projection. This strategy enables us to capture the uncertainty associated with model selection, different forcings and various sources of variability. At the same time, it can be complicated to utilize the full ensemble of all available RCM or GCM simulations for impact studies, mainly due to processing and time limitations. There is, thus, interest in narrowing the entire ensemble while retaining its main statistical properties.
To refine climate change projections for a specific region, several general approaches can be used to narrow large model ensembles [8]. Among the most commonly used approaches are weighting models based on their performance in reproducing observed mean, variability, or trend fields for one or more variables [8][9][10]. These weights can then be applied only to the outputs of impact models that use individual RCMs as inputs. Other approaches include detection- and attribution-based methods, Bayesian methods and single-model methods [8], but these methods may not be suitable for impact studies, which often require daily data that such methods do not provide. In regard to impact studies, it is insufficient to solely rely on the overall signal of climate change in the future, as these three methods do. Moreover, a full range of uncertainty may be needed, such as that in hydrological models [11,12] or agricultural crop growth models [13].
The main aim of this study is to present an approach to reduce a large ensemble of Euro-CORDEX 0.11° RCM simulations for the relatively small area of the Czech Republic (Figure 1). The ensemble is first restricted to validated models, and a new, smaller ensemble of RCMs is then obtained that still preserves the spread of the original ensemble. In addition, the reduced ensemble should decrease the effort associated with subsequent impact studies and, at the same time, ensure that the main statistical properties of the climate change signal of the full ensemble are preserved in the reduced ensemble for important climate variables. The reduction in the ensemble size should therefore not cause a significant loss of information that may result in biased outcomes of impact studies. The selected subset is referred to as the climate change envelope (CliChE).
In Section 2, an overview of the input data is provided, along with detailed definitions of the techniques employed in our study. In Section 3, we (i) justify the exclusion of certain models from the ensemble of RCMs for the Czech Republic and Central Europe and (ii) describe the selection process for creating the CliChE for the Czech Republic. In Section 4, we analyze the benefits of our methodology and emphasize its distinctive features, setting it apart from other approaches in the literature.

Climate Model Data
RCM simulations from the high-resolution version of the Euro-CORDEX ensembles (i.e., 0.11°, which corresponds to an approximately 12.5 km grid spacing) were processed. The selected models are summarized in Table 1, and their choice reflects the data availability at Earth System Grid Federation data nodes at the initiation time of this study. In total, there are 19 RCM simulations (or GCM-RCM pairs) driven by 8 GCMs. In the text and figures, the GCM-RCM pairs are denoted by their abbreviations, which are also provided in Table 1.
Table 1. Driving global climate models (GCMs) for the various regional climate models (RCMs). The abbreviations of all GCM-RCM pairs, which are used in the following figures and text, are given. In the last column, information is provided on the validation process with assigned performance validation characteristics: correlation of the annual cycle (AC), spatial correlation (SC) and spatial variability (SV). Members of the CliChE are emphasized by * (refer to the text for further details).
The RCM simulations were analyzed in two parts. The first was a historical run, in which the RCMs are driven by GCMs over a period beginning in the second half of the 20th century and ending in 2005. Within the context of this study, the historical run is the subject of validation, which should reveal any potential errors due to the combination of GCMs and RCMs in reproducing the main climate features over the observational period. The second was a scenario run, in which the RCMs are driven by GCMs following the future evolution of greenhouse gas concentrations as described by the representative concentration pathway scenario RCP8.5. Under this scenario, continuous growth in greenhouse gas concentrations is expected throughout the entire 21st century, and the scenario is thus considered the upper limit of the projected changes in climate variables. The following variables needed for impact studies (with a focus on agriculture) were processed: air temperature (daily mean, maximum and minimum values), precipitation, global radiation, 10 m wind speed and relative humidity.

Validation Data
To evaluate the performance of the RCM simulations for the Czech Republic in the recent past, station observations from the Czech Hydrometeorological Institute (CHMI), the national weather service of the Czech Republic, were applied. The station data were processed in several steps before the validation process. First, daily and subdaily station data were quality controlled and homogenized according to the method described by Štěpánek et al. [14,15]. Subsequently, any missing station time-series data were replaced with recalculated data from nearby stations, considering elevation and other spatial parameters [14]. The final product of station data processing is referred to as the station technical series (TS). The TS offers a daily time step and provides a significantly improved version of the data relative to the original data or regional and global datasets (e.g., from the Copernicus Climate Change Services). These regional or global datasets may contain high biases in certain areas depending on how many station records are available for their construction, and their data processing is less thorough, e.g., homogenization is performed with less metadata information available. In total, there are 268 TSs for the air temperature (daily mean, maximum and minimum values), sunshine duration (recalculated into the global radiation), 10 m wind speed and relative humidity. There are also 787 TSs for precipitation since the Czech Republic hosts a dense network of precipitation stations.
The historical runs of the RCMs end in 2005, making it the designated end of the validation period. The start of the historical runs varies, with some models starting in the 1960s and others as far back as the 1950s, but all of them are jointly available only from the 1970s onward. As a result, a validation period of 35 years from 1971 to 2005 was utilized for the RCMs using TSs of the Czech Republic. This period is still five years longer than the climatological normal period, and the main climatological normal periods (1961-1990 or 1991-2020) cannot be applied in this case.
The stations of the TSs are part of an irregular network, while the RCMs are provided at grid points at regular locations (a grid with a mesh size of 0.11°). To validate the RCMs against observations (TSs), i.e., to relate the model grid points to the relevant station positions, there are two options: (i) fitting the RCM data at TS locations, or (ii) recalculating the TSs at new positions to match the RCM grid. Different regridding methods can be applied for this purpose. Several tests using either the nearest neighbor value method or more complex spatial interpolation methods (e.g., regression kriging) were conducted considering both options mentioned. No significant differences were found between the results of the interpolation methods and those of the nearest neighbor method (TSs and RCM grid points) in the Czech Republic. Consequently, option (i) was adopted, with the nearest neighbor method as the main approach. It should be noted that the number of RCM grid points in the Czech Republic is practically the same as the number of TSs for precipitation measurements (approximately 800).

Methods for Validation
The aim of the validation process was to identify RCM simulations (or better, GCM-RCM pairs) that show major discrepancies in the description of historical climate conditions. In model validation, it is generally assumed that models with a low-quality historical run may also perform poorly in estimating future climate change and, conversely, that models that suitably capture current climate conditions are more likely to suitably capture future climate conditions. Four validation criteria were used: (i) reproduction of circulation patterns; (ii) temporal correlation of the annual cycle; (iii) spatial correlation of long-term summer and winter means and spatial variability of long-term annual and seasonal means; and (iv) availability of the set of all meteorological elements needed for the subsequent activities of impact studies, e.g., calculation of evapotranspiration based on the FAO-56 Penman-Monteith approach [16].
Because of the uncertainty in the model results (the future is not known), the statistics for model evaluation must be assessed relative to other models. If all models attain similarly low correlations, this alone does not justify the removal of any of the models. However, when a given model attains a much lower correlation than the other models, it can be concluded that the model achieves a limited performance (among the processed models).
When any model exhibits a very poor performance in terms of one of the four criteria for any meteorological element relative to the other models, this model should thus be removed from further processing. If a model achieves poor performance for any meteorological element (suggesting slightly poorer performance in this case vs. the very poor performance in the former case) in terms of more than one of the four criteria or for more than one meteorological element, this model should also be removed from the ensemble. Notably, model performance assessment is prone to the subjective judgment of researchers and requires experience, and this process should reflect the use of the model outputs in subsequent impact studies.

Circulation Patterns
The capability of regional climate models to reproduce circulation patterns associated with temperature extremes was assessed. With the use of daily mean sea level pressure fields obtained from the National Centers for Environmental Prediction and the National Center for Atmospheric Research (NCEP/NCAR) reanalysis dataset [17], three circulation indices (flow strength, direction and vorticity) were calculated for a region centered at 50° north latitude and 15° east longitude (representing the Czech Republic). The daily values of the above circulation indices were classified into 11 circulation types: eight directional (e.g., northwesterly), one strongly cyclonic, one strongly anticyclonic and one unclassified. The full methodology and related equations were provided by Jenkinson and Collison [18] and Plavcová and Kyselý [19].
In the next step, the extent to which each circulation type favors the occurrence of hot days (days on which the summertime maximum temperature exceeds the 95th percentile value) was analyzed using efficiency coefficients, which were calculated as ratios between the relative abundance of the circulation type on hot days and its relative abundance in summer. This procedure was applied to both the observed data and regional climate model simulations, and the results were then compared. Moreover, summed differences across all circulation types were calculated.
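The efficiency coefficient described above can be illustrated with a minimal Python sketch. The function names and the input layout (one circulation-type label and one hot-day flag per summer day) are hypothetical, and the use of absolute differences in the summed comparison is an assumption, since the exact form of the summed differences is not specified in the text.

```python
from collections import Counter


def efficiency_coefficients(circ_types, is_hot_day):
    """Return {type: share of the type on hot days / share of the type in summer}.

    circ_types: one circulation-type label per summer day (hypothetical layout).
    is_hot_day: one boolean per summer day, True where Tmax exceeds the 95th percentile.
    """
    if len(circ_types) != len(is_hot_day):
        raise ValueError("inputs must have the same length")
    hot_types = [t for t, hot in zip(circ_types, is_hot_day) if hot]
    if not hot_types:
        raise ValueError("no hot days in the sample")
    all_counts = Counter(circ_types)
    hot_counts = Counter(hot_types)
    n_all, n_hot = len(circ_types), len(hot_types)
    return {
        t: (hot_counts[t] / n_hot) / (all_counts[t] / n_all)
        for t in all_counts
    }


def summed_difference(observed, modelled):
    """Sum of absolute differences over all circulation types (assumed metric)."""
    keys = set(observed) | set(modelled)
    return sum(abs(observed.get(k, 0.0) - modelled.get(k, 0.0)) for k in keys)
```

A coefficient above 1 indicates that the circulation type occurs more often on hot days than in summer overall, and a value below 1 indicates the opposite.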

Temporal Correlation of the Annual Cycle (AC)
For the TSs and all RCMs, monthly aggregate values for all years of the validation period were calculated for each meteorological element. Specifically, for each month $m$ (1 ≤ m ≤ 12; over all TSs and all years), the value $TS_m$ was calculated as the average over all days within the given month $m$. Similarly, for a given RCM (over all grid points and all years), the value $RCM_m$ is the average over all days within month $m$. Thus, 12 monthly values were obtained. Then, the AC was calculated as the Pearson correlation coefficient $r$, as follows:
$$AC = r\left((TS_m)_{m=1}^{12}, (RCM_m)_{m=1}^{12}\right). \tag{1}$$
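As an illustration of Equation (1), the following Python sketch computes the 12 monthly means and their Pearson correlation. It is a simplified example, not the code used in the study: the record layout (pairs of calendar month and daily value, already pooled over stations or grid points and years) and all function names are assumptions.

```python
import math


def monthly_means(records):
    """records: iterable of (month, value), month in 1..12; returns 12 means."""
    sums = [0.0] * 12
    counts = [0] * 12
    for month, value in records:
        sums[month - 1] += value
        counts[month - 1] += 1
    if 0 in counts:
        raise ValueError("every calendar month needs at least one value")
    return [s / c for s, c in zip(sums, counts)]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def annual_cycle_correlation(ts_records, rcm_records):
    """AC: correlation between the observed and simulated mean annual cycles."""
    return pearson(monthly_means(ts_records), monthly_means(rcm_records))
```

Because the correlation is insensitive to a constant offset or scaling, the AC measures only the shape and timing of the annual cycle, not the bias of the model.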

Spatial Correlation (SC)
At each grid point $i$ (1 ≤ i ≤ n; in our case, n = 787 for the precipitation network and n = 286 for the network of other meteorological elements), the average value of a given meteorological element over the whole validation period was calculated for the TS data ($TS_i$) and for the RCM data ($RCM_i$). Then, the SC was calculated as the Pearson correlation coefficient $r$, as follows:
$$SC = r\left((TS_i)_{i=1}^{n}, (RCM_i)_{i=1}^{n}\right).$$
Similarly, spatial correlations were calculated over seasons, in which case the averages were calculated only from values for a given season over the whole validation period. In the validation process, average summer and winter SC values were also used since the models exhibited different performance levels during different periods of the year. The summer and winter seasons are sufficient for evaluation purposes; no additional distinct features were found in the transition seasons.

Spatial Variability
The spatial variability was not calculated separately but investigated in conjunction with the spatial correlation in the form of a Taylor diagram [20], which combines the root mean square difference (RMSD) with the standard deviation (SD) and the Pearson correlation coefficient in a single plot. To visualize all RCMs in a single Taylor diagram for a given meteorological element, it is necessary to normalize the statistics, i.e., to calculate the normalized SD and centered RMSD. The resulting Taylor diagram for the long-term seasonal and annual climatology was analyzed, and potentially outlying RCMs were identified.
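The three normalized statistics shown in a Taylor diagram can be sketched in Python as follows. The function name and the choice of the population standard deviation (division by n) are assumptions made for illustration; with this choice, the normalized centered RMSD $E'$, the normalized SD $\sigma'$ and the correlation $r$ satisfy $E'^2 = 1 + \sigma'^2 - 2\sigma' r$, which is the geometric relation underlying the diagram.

```python
import math


def taylor_statistics(obs, mod):
    """Return (correlation, normalized SD, normalized centered RMSD).

    obs, mod: long-term means at the same n locations (observations, model).
    Both SD and centered RMSD are divided by the SD of the observations.
    """
    n = len(obs)
    if n != len(mod) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_o, mean_m = sum(obs) / n, sum(mod) / n
    dev_o = [o - mean_o for o in obs]
    dev_m = [m - mean_m for m in mod]
    sd_o = math.sqrt(sum(d * d for d in dev_o) / n)
    sd_m = math.sqrt(sum(d * d for d in dev_m) / n)
    r = sum(a * b for a, b in zip(dev_o, dev_m)) / (n * sd_o * sd_m)
    # Centered RMSD: the means are removed before differencing, so bias is excluded.
    crmsd = math.sqrt(sum((b - a) ** 2 for a, b in zip(dev_o, dev_m)) / n)
    return r, sd_m / sd_o, crmsd / sd_o
```

A model that reproduces the observed spatial pattern perfectly would plot at a correlation of 1, a normalized SD of 1 and a normalized centered RMSD of 0.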

Bias Correction
The RCMs produce biased results, necessitating bias correction before using the model outputs in other analyses, e.g., in impact models. Therefore, before the analysis of future climate change signals, the RCMs were bias corrected using the distribution adjusting by percentiles (DAP) method, which was described by Štěpánek et al. [21] and is based on the quantile mapping approach of Déqué [22]. This correction method, based on the adjustment in the individual percentiles of the empirical distribution, was compared with other bias correction approaches, e.g., by Gutiérrez et al. [23], and was found to perform very well. In contrast to other quantile-mapping methods, the selected bias correction method focuses on a proper transfer function for the tails of distributions (representation of extremes). The reference period was again set from 1971 to 2005.
The bias correction process was applied on a daily basis and for each TS location separately. To ensure suitability for impact studies in which models are usually trained on station data (because such data are available for the current climate), bias correction was performed by determining the nearest grid point for a given location (station), so that each correction is based on a station and grid point pair. To better address the uncertainty resulting from the correction process, bias correction was applied repeatedly, from 5 times in the case of precipitation to 10 times in the case of other meteorological elements (i.e., applied to several neighboring stations), even if in practice, only the first (nearest) neighbor is applied as the final correction step.
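The general idea of percentile-based correction can be illustrated with a generic empirical quantile mapping sketch in Python. This is not the DAP method itself: the function names are hypothetical, linear interpolation between sorted values is an assumed choice, and values outside the reference range are simply clamped to the end points, whereas DAP is described as paying particular attention to the tails.

```python
import bisect


def fractional_rank(sorted_values, value):
    """Position of value within sorted_values, scaled to [0, 1]."""
    n = len(sorted_values)
    if value <= sorted_values[0]:
        return 0.0
    if value >= sorted_values[-1]:
        return 1.0
    hi = bisect.bisect_right(sorted_values, value)  # first element > value
    lo = hi - 1
    span = sorted_values[hi] - sorted_values[lo]    # strictly positive here
    return (lo + (value - sorted_values[lo]) / span) / (n - 1)


def empirical_quantile(sorted_values, p):
    """Linearly interpolated quantile for p in [0, 1]."""
    n = len(sorted_values)
    pos = p * (n - 1)
    lo = int(pos)
    hi = min(lo + 1, n - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def quantile_map(value, model_reference, obs_reference):
    """Replace a model value by the observed value at the same empirical quantile."""
    model_sorted = sorted(model_reference)
    obs_sorted = sorted(obs_reference)
    p = fractional_rank(model_sorted, value)
    return empirical_quantile(obs_sorted, p)
```

In this sketch, `model_reference` and `obs_reference` would hold the daily values of the nearest grid point and of the station for the reference period, and `value` would be a daily value from the historical or scenario run.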

Center and Distance from the Center
At each grid point, the center of a given meteorological element over all models was calculated as the seasonal average. The distance of a grid point from the center was then calculated as the sum of the squares of the differences over all seasons. The distance of an RCM from the center can then be obtained as the average distance over all grid points and over all seasons.
Specifically, let $M(S, G, Y)$ be the value of model $M$ for season $S$ at grid point $G$ in year $Y$. For the period from year $Y_1$ to year $Y_2$, the period mean of $M(S, G, Y)$ is computed first. The center and the standard deviation of the ensemble are then obtained from these period means over all models, and finally the distance Dist($M$) of model $M$ from the center is calculated.
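The steps described above can be sketched in Python as follows. This is one possible reading rather than the implementation used in the study: the nested layout `values[model][season][grid]` (a list of yearly values), the function names, and the use of raw squared differences are assumptions; in particular, the sketch does not scale the differences by the ensemble standard deviation.

```python
def period_mean(yearly_values):
    """Mean of M(S, G, Y) over the years Y1..Y2."""
    return sum(yearly_values) / len(yearly_values)


def distances_from_center(values):
    """values[model][season][grid] -> list of yearly values (hypothetical layout).

    Returns {model: mean squared difference from the ensemble center,
             averaged over all seasons and grid points}.
    """
    models = list(values)
    n_seasons = len(values[models[0]])
    n_grid = len(values[models[0]][0])
    means = {
        m: [[period_mean(values[m][s][g]) for g in range(n_grid)]
            for s in range(n_seasons)]
        for m in models
    }
    # Ensemble center: average of the period means over all models.
    center = [
        [sum(means[m][s][g] for m in models) / len(models) for g in range(n_grid)]
        for s in range(n_seasons)
    ]
    dist = {}
    for m in models:
        total = sum(
            (means[m][s][g] - center[s][g]) ** 2
            for s in range(n_seasons)
            for g in range(n_grid)
        )
        dist[m] = total / (n_seasons * n_grid)
    return dist
```

A model with a small distance lies close to the ensemble center in all seasons and at all grid points, whereas a large distance flags a model at the edge of the ensemble spread.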

Methods for the Selection of Representative Models-CliChE
There may be relatively simple water balance or agroclimate models that can process dozens of RCMs. In contrast, fully distributed hydrological models with high spatial resolution would require excessive computational time, and thus, the use of a limited number of input RCMs is preferable here. In such cases, the final selection must represent the spread of all the original models.
From an ensemble of well-performing models, a subgroup that represents the entire model ensemble (climate change envelope-CliChE) comprising only a few members can be selected. For various purposes, various numbers of models in the CliChE were selected. This approach follows the requirements of impact studies and depends on the complexity of a given impact model. For these various purposes, several model selection options were prepared, starting with one (central) model and continuing with ensembles of three, five and up to seven models. The selection process was based on the RCP8.5 scenario runs and bias-corrected data (with the TSs as the reference dataset) for the 2021-2060 period (aimed at the middle of the 21st century).
The central model should represent centrality through several meteorological elements: mean air temperature, precipitation and global radiation. Then, one of the warmest models (i.e., the largest distance from the center with a mean air temperature higher than that of the center) and one of the coldest models (i.e., the largest distance from the center with a mean air temperature lower than that of the center) can be added, and if possible, the conditions that one of the models is wetter than the central model (i.e., a precipitation amount larger than that of the central model) and one of them is drier than the central model (i.e., a precipitation amount smaller than that of the central model) can be fulfilled at the same time. These models jointly create an ensemble of three models. Next, one of the wettest models (i.e., the largest distance from the center with a precipitation amount larger than that of the center) and one of the driest models (i.e., the largest distance from the center with a precipitation amount smaller than that of the center) can be added, and if possible, one of these models can again be warmer than the central model (i.e., a mean air temperature higher than that of the central model) and one of these models can again be colder than the central model (i.e., a mean air temperature lower than that of the central model). All of these models jointly comprise an ensemble of five models.
In the end, two other models that represent extremes can be added. One is given by the largest number of tropical days (a daily maximum temperature equal to or greater than 30 °C), which is at the same time drier than the central model, and the other is given by the largest number of days with a daily precipitation amount equal to or greater than 50 mm, which is at the same time colder than the central model. This step creates an ensemble of seven models.
Each new pair of models (leading to a selection of 3, 5 or 7 models) is added so as to preserve the balance of the models around the central model (i.e., having one model above and another below the central model) for the mean air temperature and precipitation.
When there is no central model (the behavior of the models may change over a century), or there are no other models with the above-mentioned properties, then the only goal that may be achieved is the selection of a subset of models representing the entire ensemble, i.e., reducing the ensemble size. However, in the case of the Euro-CORDEX RCMs, it was possible to select models based on the proposed criteria.
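The choice of a balanced pair around the central model can be sketched in Python as follows. The sketch is illustrative only: the dictionary layout, the key names (`tas`, `pr`, `dist_tas`, `dist_pr`) and the rule of preferring a balanced pair over a more distant unbalanced one are assumptions, and in the study the final choice also involved expert judgment.

```python
def pick_balanced_pair(models, central, primary, secondary):
    """Pick one model above and one below the central model in `primary`.

    models: {name: {primary: value, secondary: value, "dist_" + primary: distance}}
    Pairs lying on opposite sides of the central model in `secondary` are
    preferred; among those, the pair with the largest summed distance wins.
    Returns (above, below), or None if no such pair exists.
    """
    c = models[central]
    above = [m for m in models if m != central and models[m][primary] > c[primary]]
    below = [m for m in models if m != central and models[m][primary] < c[primary]]
    best, best_key = None, None
    for hi in above:
        for lo in below:
            side_hi = models[hi][secondary] - c[secondary]
            side_lo = models[lo][secondary] - c[secondary]
            balanced = side_hi * side_lo < 0  # opposite sides in the secondary element
            spread = models[hi]["dist_" + primary] + models[lo]["dist_" + primary]
            key = (balanced, spread)
            if best_key is None or key > best_key:
                best, best_key = (hi, lo), key
    return best
```

Calling the function with `primary="tas"` and `secondary="pr"` corresponds to the warm and cold pair, and swapping the two arguments corresponds to the wet and dry pair.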
In the end, a comparison of the entire ensemble of well-performing models to the CliChE is provided for the mean air temperature, precipitation and global radiation via the following statistics: mean, standard deviation and range. Specifically, let $M(G, Y)$ be the annual value of model $M$ at grid point $G$ in year $Y$. For a period from year $Y_1$ to year $Y_2$, the period mean of each model is computed first, and the mean, standard deviation and range of the ensemble over the given period are then calculated from these period means.
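A minimal Python sketch of this comparison is given below. It assumes that each model has already been reduced to a single period mean (averaged over years and grid points), uses the sample standard deviation, and takes the range as the difference between the largest and smallest value; all three choices, as well as the function name, are assumptions made for illustration.

```python
import statistics


def ensemble_summary(period_means, members=None):
    """Mean, standard deviation and range of an ensemble or of a subset of it.

    period_means: {model name: period mean of the element} (hypothetical layout).
    members: optional iterable of model names; defaults to all models.
    """
    names = list(period_means) if members is None else list(members)
    values = [period_means[name] for name in names]
    return {
        "mean": statistics.fmean(values),
        "sd": statistics.stdev(values),  # sample SD; needs at least two members
        "range": max(values) - min(values),
    }
```

Computing the summary once for all well-performing models and once for the CliChE members shows directly how much of the ensemble mean and spread the subset retains.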

Results
In the following section, the RCM processing results are presented. First, models with low skill scores were removed, and second, from the models suitable for the study area, a subset of models (CliChE) was selected that preserved the statistical properties of the original entire ensemble.
The first criterion considered in model performance evaluation was the representation of circulation patterns in regard to hot days. The whole model ensemble could capture the circulation mechanisms leading to high temperature extremes in summer, and no significant differences in model performance were found.

Correlation of the Annual Cycle
Furthermore, the AC was evaluated for all meteorological elements (Figure 2). Model H-CLM yielded a negative AC for precipitation (−0.5), which suggests a very poor model performance. Model C-CLM achieved a very low AC for precipitation (0.2) relative to the other models, which again indicates poor model performance. In regard to the other meteorological elements, no problems were found in terms of the AC.
Figure 2. Temporal correlation of the annual cycle (AC) and spatial correlation (SC) as validation characteristics of the global and regional climate model (GCM-RCM) pairs compared to the station measurements (technical series) for the global radiation, air temperature (mean, minimum and maximum values), precipitation, relative humidity and 10 m wind speed. The abbreviations of the GCM-RCM pairs are given in Table 1.

Spatial Correlation and Variability
Then, the SC was evaluated (Figure 2). Models G-REMO, I-REMO and M-REMO attained very low spatial correlations for the global radiation and precipitation (0.1-0.25), which indicates poor performance for both meteorological elements. Upon detailed examination of the SV during the various seasons (Figure 3), precipitation has a low spatial correlation in both summer and winter. Regarding radiation, low spatial correlations are found mainly in summer. Moreover, I-REMO is an outlier model in terms of the SD for the winter minimum temperature. Model C-ALADIN yields a very low SC for the minimum temperature (0.45), which suggests poor model performance. Upon examination of the SV, the problem mainly occurs in summer and concerns both the spatial correlation and the centered RMSD, suggesting an outlier (Figure 3). The relative humidity shows the lowest spatial correlation and the lowest SD (Figure 3), while in terms of summer precipitation, a low spatial correlation is observed (0.4), which also indicates poor performance. The C-CLM model exhibits low spatial correlations for precipitation (0.4) and global radiation in summer (0.3; Figure 3). In regard to the other meteorological elements, there were no problems in terms of the SC based on the comparison of the different models. Comparison across models is important because some meteorological elements, e.g., the 10 m wind speed, show very low SC values for all the analyzed RCMs.
Figure 3. Taylor diagrams of the spatial variability of the global and regional climate model (GCM-RCM) pairs: Pearson correlation coefficient, normalized standard deviation and centered root mean square difference (cRMSD) for the precipitation and minimum air temperature in summer and winter, the global radiation in summer and the annual relative humidity, in which problematic validation characteristics are observed. The same symbols are used for the same RCMs, and the same color is used for the same driving GCM. The abbreviations of the GCM-RCM pairs are given in Table 1.

Completeness of the Meteorological Elements
Regarding the C-ALARO model, only temperature and precipitation data were available. Furthermore, relative humidity data were not available for C-CLM, G-REMO, I-REMO, H-CLM and M-REMO. It should be noted that the original relative humidity data could not be downloaded for other models. Later, when relative humidity data became available, the data were downloaded for the validated models. Thus, in the end, the availability of relative humidity data was not a deciding factor for model removal.

Validation Summary
Based on the above-mentioned results, models C-ALADIN, C-ALARO, C-CLM, G-REMO, I-REMO, H-CLM and M-REMO were removed from further processing, and the remaining 12 models of the original ensemble were retained since they performed well. The results are provided in Table 1.

Clustering of Model Pairs Based on Their Affiliation to RCMs vs. GCMs
The SV (Figure 3) indicates that the points in the Taylor diagrams that represent individual models can be clustered into groups, where pairs sharing the same RCM are more similar than pairs sharing the same driving GCM. Therefore, it could be concluded that in the Czech Republic, the RCM-simulated climate dominates the driving GCM climate, which is not surprising since the Czech Republic lies near the center of the RCM spatial domain (in the middle of Europe). In contrast, the GCM climate is manifested mainly at the borders of the domain.

Selection of Representative Models for the CliChE
In the case of the Czech Republic, there are 12 well-performing models out of the 19 original RCMs from which the CliChE can be established. First, one model that is central based on the mean air temperature, precipitation and global radiation was identified. This task was conducted in such a way that models with a larger distance from the center (Figure 4) were successively removed. Based on the distance in terms of the temperature, the H-RACMO, C-RCA and E-CLM models were removed since they exhibited relatively large distances. Then, based on the distance in terms of precipitation, the H-HIRHAM, H-RCA and E-HIRHAM models were removed, and subsequently, based on the global radiation, the E-RACMO, E-RCA and N-HIRHAM models were removed. In the end, only three models, i.e., I-RCA, M-CLM and M-RCA, were left. Note that the same three models would also remain if the models were removed in a different order of meteorological elements, namely, by removing one model by the mean air temperature, one model by precipitation and one model by the global radiation and repeating this process three times. This result indicates that the applied process is robust to the order of removal. Of the three models left, M-RCA was selected as the central model since it showed the minimum distance in terms of the temperature and global radiation and a similar distance in terms of precipitation relative to the other two models. As a control meteorological element, the relative humidity was used, for which the distances were not large and were similar for all three models.
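The successive removal and the order-robustness check can be sketched in Python as follows. The function name, the element keys and the assumption that the distances stay fixed (rather than being recomputed after each removal) are illustrative choices, not details confirmed by the text.

```python
def remove_by_schedule(dist, schedule):
    """Successively drop the most distant remaining model for each schedule entry.

    dist: {element: {model: distance from the ensemble center}}
    schedule: sequence of element names, one removal per entry.
    Returns the set of models left at the end.
    """
    remaining = set(next(iter(dist.values())))
    for element in schedule:
        # Ties are broken by model name so that the result is deterministic.
        worst = max(remaining, key=lambda m: (dist[element][m], m))
        remaining.remove(worst)
    return remaining
```

With hypothetical element keys `tas`, `pr` and `rsds`, the blockwise schedule `["tas"] * 3 + ["pr"] * 3 + ["rsds"] * 3` corresponds to the removal order described above, and `["tas", "pr", "rsds"] * 3` to the alternating order; obtaining the same remaining set from both schedules is the robustness check.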

Figure 4. Distances of the models from the center of the entire ensemble of 12 well-performing global and regional climate model (GCM-RCM) pairs for the mean air temperature, precipitation, global radiation and relative humidity. The abbreviations of the GCM-RCM pairs are given in Table 1.
Because it was found that the same RCMs of the GCM-RCM pairs produced similar SV values, a requirement for the diversity of RCMs was included in the selection process.
The characteristics considered during warmest and coldest model selection included the change in the future mean air temperature (Figure 5) and the distance from the center of all models (Figure 4).Models very far from the center were chosen as much as possible, such as H-RACMO (order 2), which is warm (and wetter than the central model), and M-CLM (order 3), which is cold (and drier than the central model).At the same time, precipitation (Figure 5) was considered by confirming that one of the two distant models in terms of the temperature was wetter than the central model and that the second model was drier than the central model.Note: This is a very convenient situation since the selection of only three models yields an envelope for the remaining meteorological elements (at the same time, these are the most important ones in terms of their frequency of usage): air temperature and precipitation.Well−performing GCM−RCM pairs that are not in CliChE GCMs that drives some RCM in CliChE GCMs that drives some RCM, but not any member of CliChE GCMs that do not drive any RCM Assigning a subset of five models requires adding wet and dry models.The models chosen were C-RCA (order 4), which is wet, and E-RACMO (order 5), which is dry.Again, the characteristics considered included the change in precipitation under the future climate conditions (Figure 5) and the distance from the center of all models (Figure 4).Models with a large precipitation distance from the center were identified as much as possible.At the same time, the mean air temperature (Figure 5) was considered by confirming that one of the two distant models in terms of precipitation was warmer than the central model and that the other one was colder than the central model.However, in this case, it was not possible to find any models fulfilling these criteria.As such, the chosen models occurred near the central model in terms of temperature.
Finally, two additional models were sought to complete the subset of seven models. Here, models were identified that could characterize the extremes of the original ensemble. The models fulfilling this criterion were the I-RCA model (order 6), with the largest number of tropical days during the first half of the 21st century (warmer during the mid-21st century and drier than the central model), and the N-HIRHAM model (order 7), with the largest number of days with precipitation of 50 mm or more (colder and wetter than the central model).
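The first steps of the selection described above can be illustrated with a minimal Python sketch. It is not the procedure used to produce the CliChE: the model names and change values are hypothetical, the distance from the ensemble center is assumed to be a standardized Euclidean distance in the temperature-precipitation plane, and the requirement for RCM diversity and the extreme indices are not modeled.

```python
from itertools import product
from math import hypot
from statistics import mean, pstdev

# Hypothetical projected changes per simulation (illustrative values only):
# (warming in degrees Celsius, precipitation ratio to the reference period).
changes = {
    "sim_1": (1.20, 1.04),
    "sim_2": (1.80, 1.08),
    "sim_3": (0.70, 0.98),
    "sim_4": (1.25, 1.12),
    "sim_5": (1.15, 0.95),
    "sim_6": (1.50, 1.03),
    "sim_7": (0.95, 1.05),
}

temps = [dt for dt, _ in changes.values()]
precs = [pr for _, pr in changes.values()]
t_mean, p_mean = mean(temps), mean(precs)
t_sd, p_sd = pstdev(temps), pstdev(precs)


def distance(name):
    """Assumed metric: standardized Euclidean distance from the ensemble center."""
    dt, pr = changes[name]
    return hypot((dt - t_mean) / t_sd, (pr - p_mean) / p_sd)


# Order 1: the central model is the simulation closest to the ensemble center.
central = min(changes, key=distance)
c_dt, c_pr = changes[central]

# Orders 2 and 3: a warm and a cold model far from the center, where one is
# wetter and the other drier than the central model.
warmer = [m for m in changes if changes[m][0] > c_dt]
colder = [m for m in changes if changes[m][0] < c_dt]


def opposite_precipitation(a, b):
    """True if one model is wetter and the other drier than the central model."""
    return (changes[a][1] - c_pr) * (changes[b][1] - c_pr) < 0


pairs = [(w, c) for w, c in product(warmer, colder) if opposite_precipitation(w, c)]
warm, cold = max(pairs, key=lambda p: distance(p[0]) + distance(p[1]))

# Orders 4 and 5: the wettest and driest of the remaining simulations.
remaining = [m for m in changes if m not in (central, warm, cold)]
wet = max(remaining, key=lambda m: changes[m][1])
dry = min(remaining, key=lambda m: changes[m][1])

print(central, warm, cold, wet, dry)
```

With these hypothetical values, the wet and dry simulations lie close to the central model in terms of temperature, which mirrors the situation described for the five-model subset.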
The selected models for the CliChE are listed in Table 2, together with their order in the CliChE and their attributes. These models are also noted in Table 1.
Table 2. Climate change envelope (CliChE) for the Czech Republic based on the seven global and regional climate model (GCM-RCM) pairs, with their order in the CliChE and their attributes.

MPI-ESM-LR RCA4
Central model; this represents the center in projections of the mean air temperature and precipitation.

MOHC-HADGEM-ES RACMO22E
This is the warmest model and, at the same time, wetter than the central model.

A comparison of the entire ensemble of well-performing models with the CliChE is given in Table 3 for the 2021-2040 and 2041-2060 periods. There were no significant differences between these two ensembles, which confirms that the described methodology is appropriate and that the results are robust.

The evaluation of climate models against observed data is crucial to any use of climate models. This is particularly important when climate models are used to project future climate conditions and these projections then serve as input to decision-making processes, e.g., the planning and fulfillment of adaptation and mitigation policies. At the same time, the validation is a crucial part of the presented methodology to narrow the ensemble of Euro-CORDEX RCM simulations in the Czech Republic. Although climate models can perform generally well across their entire domain [24], they can exhibit significantly higher errors on a local scale or over a specific region [13]. Even if the agreement between model outputs and reality is acceptable on a large scale, the uncertainty increases on fine spatial scales, where adaptation strategies need to be planned, and may reach tens of percentage points, which already causes difficulties. For instance, an RCM wet bias of 100 mm per year over the Czech Republic (as follows from the RCMs processed within this study) accounts for approximately 15% of the long-term mean annual precipitation (668 mm). Considering that the mean runoff in the Czech territory is approximately 192 mm per year (i.e., 29% of precipitation), the wet bias corresponds to more than half of this amount. Applying these values within the Budyko space and assuming no significant changes in vegetation land cover [25], an increase in runoff by at least 30% could be expected, and the remaining part of the bias could be attributed to evapotranspiration. Such a high bias could compromise the detailed planning of technical adaptation measures, such as building water reservoirs, and could suggest a relatively optimistic future in terms of plant production. Nevertheless, the main take-home message is that the uncertainty is so high that any adaptation measures must be very robust overall and tailored to address extremes rather than long-term states [26].
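The percentages quoted in the preceding paragraph can be retraced with a few lines of Python. The three input values are taken from the text; the variable names are chosen here for illustration only, and the Budyko-based runoff estimate is not reproduced because it requires additional inputs that are not given in the text.

```python
# Values quoted in the text (mm per year).
mean_annual_precip_mm = 668.0   # long-term mean annual precipitation
mean_annual_runoff_mm = 192.0   # mean annual runoff
wet_bias_mm = 100.0             # example RCM wet bias

# Share of the wet bias in the mean annual precipitation (about 15%).
bias_share_of_precip = wet_bias_mm / mean_annual_precip_mm

# Runoff as a fraction of precipitation (about 29%).
runoff_coefficient = mean_annual_runoff_mm / mean_annual_precip_mm

# Wet bias relative to the mean annual runoff (more than half).
bias_share_of_runoff = wet_bias_mm / mean_annual_runoff_mm

print(f"bias / precipitation:   {bias_share_of_precip:.1%}")
print(f"runoff / precipitation: {runoff_coefficient:.1%}")
print(f"bias / runoff:          {bias_share_of_runoff:.1%}")
```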

Model Weights
A combination of models represents one of the options to obtain representative probabilistic estimates of climate change within the Czech Republic. Assigning weights to individual RCMs [8][9][10] that reflect their performance in the validation process is expected to yield better results than a simple aggregation. The weights were calculated as the average of the exponential weights obtained from the AC and SC for mean air temperature, precipitation, and global radiation. The weights based on precipitation and global radiation showed higher uncertainty than those based on the mean air temperature. The highest variability resulted from the AC for precipitation, and a lower variability resulted from the SC for global radiation and precipitation (Figure 2). For the models that passed the validation process, the estimated weights ranged from 7.9 to 8.7%. This suggests that all the RCMs were assigned very similar weights, so that weighting of the RCMs had a negligible effect overall. Therefore, it is better to use equal weights for all RCMs, which avoids introducing additional uncertainty into the ensemble. This conclusion is also supported by the findings of other researchers [27].
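To illustrate why weights in such a narrow range have little effect, the following minimal Python sketch compares a weighted and an unweighted ensemble mean. Only the range of 7.9 to 8.7% and the number of validated models (12) come from the text; the individual weights and the projected changes are hypothetical values chosen for illustration.

```python
# Hypothetical weights (in percent) for 12 validated RCMs, spanning the
# range of 7.9 to 8.7% reported in the text.
weights_pct = [7.9, 8.0, 8.1, 8.2, 8.3, 8.3, 8.4, 8.4, 8.5, 8.6, 8.6, 8.7]

# Hypothetical projected warming of each model (degrees Celsius).
delta_t = [1.0, 1.2, 0.9, 1.5, 1.1, 1.3, 1.4, 0.8, 1.6, 1.0, 1.2, 1.7]

# Normalize the weights so that they sum to one.
total = sum(weights_pct)
weights = [w / total for w in weights_pct]

# With equal weighting, each of the 12 models would receive about 8.33%.
equal_weight_pct = 100.0 / len(weights_pct)

weighted_mean = sum(w * x for w, x in zip(weights, delta_t))
simple_mean = sum(delta_t) / len(delta_t)
difference = weighted_mean - simple_mean

print(f"equal weight:  {equal_weight_pct:.2f}%")
print(f"weighted mean: {weighted_mean:.3f}")
print(f"simple mean:   {simple_mean:.3f}")
print(f"difference:    {difference:+.3f}")
```

In this example, the two means differ by only a few thousandths of a degree, far less than the spread between the individual models.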
For the evaluation of the climate change signal in a future period, a combination of all suitable models using an arithmetic or weighted average (or individual quantiles) is a valid approach. In impact studies, however, the impact models usually require daily data. Unfortunately, owing to the character of climate models, it is not feasible to combine climate model projections on a daily time scale into one daily time series. This leads to a solution in which a single impact projection is obtained using input from a single climate model, and this procedure is repeated with other climate models to obtain a matrix of impact model projections. The impact model outputs can then be combined, again using an arithmetic average or by applying model weights. However, this is an extremely costly strategy owing to the high number of climate model simulations entering the impact modeling (e.g., 19 in the case of Euro-CORDEX RCMs). Therefore, the number of input simulations for impact modeling needs to be limited. This is the purpose of the proposed CliChE strategy: to limit the number of RCMs while preserving the properties of the entire RCM ensemble.
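The workflow described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the impact modeling chain used in practice: the daily series are synthetic and mutually independent, the toy impact model simply counts days with a maximum temperature of at least 30 degrees Celsius (used here as an assumed definition of tropical days), and all names are hypothetical.

```python
import random
from statistics import mean, pstdev


def synthetic_daily_tmax(seed, n_days=3650, center=24.0, spread=5.0):
    """Synthetic daily maximum temperatures standing in for one climate model."""
    rng = random.Random(seed)
    return [rng.gauss(center, spread) for _ in range(n_days)]


def toy_impact_model(daily_tmax, threshold=30.0):
    """Hypothetical impact indicator: number of days at or above the threshold."""
    return sum(1 for t in daily_tmax if t >= threshold)


# Seven synthetic simulations, e.g., a seven-member subset of an ensemble.
models = {f"model_{i}": synthetic_daily_tmax(seed=i) for i in range(1, 8)}

# Approach described in the text: one impact run per climate model, followed
# by a combination of the impact model outputs.
impacts = {name: toy_impact_model(series) for name, series in models.items()}
ensemble_impact = mean(list(impacts.values()))

# For contrast: averaging the daily series first smooths the day-to-day
# variability, so the threshold exceedances largely disappear.
averaged_series = [mean(day_values) for day_values in zip(*models.values())]
impact_of_averaged = toy_impact_model(averaged_series)

print(f"mean of per-model impacts: {ensemble_impact:.1f} days")
print(f"impact of averaged series: {impact_of_averaged} days")
```

The cost of the per-model approach grows linearly with the number of simulations, which is why a small representative subset is attractive.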
In the case of monthly data instead of daily data, the approach of Chhin and Yoden [28] may be applied, which offers either weighted or unweighted cases and in which the validation of the models is based on the summation of rank, the Euclidean distance of the cluster analysis, and that of the empirical orthogonal function analysis. Besides the temporal resolution, this approach also differs in the spatial resolution, as it was applied to GCMs. On the other hand, it generally follows a very similar approach to validating and narrowing an originally large model ensemble.

Comparison of RCMs with Their Driving GCMs
Even though the models were carefully validated and selected from a larger ensemble, this cannot guarantee that the results will yield correct estimates of future climate change. The projected changes in the mean air temperature, precipitation, global radiation and other important meteorological elements may differ across various model ensembles [29][30][31][32][33]. In Europe, the most prominent difference between RCM and GCM projections is the reduction in summer warming obtained by RCMs [34], accompanied by a smaller decrease in precipitation and a smaller increase in the global radiation in RCMs than in GCMs [31]. The differences between RCMs and GCMs deserve more attention, which is beyond the scope of this article, and they are described in detail elsewhere [33]. Here, we only emphasize that while the changes in mean air temperature simulated by RCMs fall within the range of changes simulated by GCMs (albeit at its lower margin, with lower estimates), the estimated trends in precipitation and global radiation do not coincide (Figure 5).
Generally, RCMs yield the same amount of precipitation in summer but generate increased amounts in winter and spring relative to the current levels, which may lead to, among other effects, higher winter and spring runoff levels as determined by hydrological models [35,36]. Compared to GCMs, RCMs predict, on an annual basis, wetter future climate conditions, lower global radiation and a smaller increase in the mean air temperature [37]. Such conditions could lead to an ideal state for vegetation. This, combined with other factors, such as much lower actual precipitation in recent years than the simulated precipitation, leads us to conclude that RCMs, in their current version (Euro-CORDEX with the driving GCMs of CMIP5), are not a suitable source of information on the future climate because they contradict observed trends. Such a statement is valid at least in our study region of Central Europe, which is specific in that it lies in a transitional zone [6,37]. In contrast, these discrepancies resulting from RCMs are not found for GCMs, which predict more frequent and severe droughts connected with higher temperature, higher potential evapotranspiration and lower precipitation levels in the future [38]. The selection of a model ensemble is thus crucial to the overall results of any subsequent analysis (e.g., within impact studies). The uncertainty related to the choice of the model ensemble (regional versus global models; CMIP5 or CMIP6 generation) or downscaling strategy (dynamical versus statistical downscaling) is often larger and more important than the spread of individual simulations, e.g., in the RCM ensemble.

Applicability of the Method
The presented method of narrowing the Euro-CORDEX ensemble was primarily designed for local impact studies within the Czech Republic, but no obstacles are known that would hinder its application elsewhere, e.g., over the entire European domain. The same is valid for the presented method of validating the Euro-CORDEX ensemble. At least the validation part with SC and SV can be applied universally, for example, for all of Europe. The AC is better applied only on local scales, since on a European scale, for example, the level of averaging (smoothing) is high and the resulting information is too coarse. Both presented methods can also be applied to other model ensembles (e.g., GCMs). Once the new Euro-CORDEX simulations based on the CMIP6 GCMs become available, we plan to run the same procedure to define a new set of representative RCMs for assessing the impact of climate change in the Czech Republic.

Conclusions
In this study, we introduced a new method to narrow a large ensemble of climate model simulations for application in climate impact modeling in a cost-effective way while preserving the uncertainty range of the entire climate model ensemble. It is a two-step approach that combines the validation of climate models against observed data using several test criteria with the selection of individual climate models according to their projected climate change signals and position in the entire ensemble. Both steps, validation and model selection, were described and demonstrated for the territory of the Czech Republic. An emphasis was placed on selected meteorological elements and their indices that are important inputs for impact modeling.
The validation revealed 7 RCMs (out of a total of 19 from the Euro-CORDEX ensemble) that do not qualify for the selection process. The main issues of the RCMs excluded from the ensemble were related to poor annual cycle and spatial correlations and poor spatial variability in the representation of the current climate conditions in the Czech Republic. One RCM was also excluded due to missing data for meteorological elements. On the other hand, the evaluation based on the circulation criteria did not identify any RCM that was significantly worse and could be removed from the ensemble.
Twelve RCMs entered the second step, which led to the selection of a subset of seven RCMs that represent the uncertainty in the entire twelve-member ensemble. From this subset, even smaller subsets of five or three RCMs or only one RCM were proposed for cases where the impact models cannot be run with all seven models.
The proposed climate model validation and selection methods are robust and sufficiently general to be used for narrowing other ensembles of climate model simulations over various world regions. They ensure that only well-performing climate models are taken for further impact modeling. They thus limit the cost of subsequent impact studies and, at the same time, preserve the corresponding uncertainty of the original ensemble of well-performing climate models.

Figure 1. The Czech Republic is located in the central part of Europe.

Figure 5. Estimated climate change as the average differences over 20-year periods from the average reference period (1981-2005) for the mean air temperature in degrees Celsius, the number of tropical days and the number of days with precipitation amounts equal to or greater than 50 mm and as precipitation and global radiation ratios to the reference period values. The global climate models (GCMs) are plotted in gray, while the global and regional climate model (GCM-RCM) pairs are marked in different colors. The green points indicate the observed climate changes in the Czech Republic represented by the station measurements (technical series). The GCM-RCM pairs are numbered based on their membership in the climate change envelope (CliChE); details are given in the text and listed in Table 2.

Table 1. Euro-CORDEX ensemble of 19 models with a resolution of 0.

Table 3. Comparison of statistics in the mean air temperature, precipitation and global radiation for the entire ensemble of well-performing models relative to the CliChE for 2021-2040/2041-2060 (in the table, values for both periods are shown separated with a slash).