Updated Assessment of Temperature Extremes over the Middle East–North Africa (MENA) Region from Observational and CMIP5 Data

Abstract: The objective of this analysis is to provide an up-to-date observation-based assessment of the evolution of temperature extremes in the Middle East–North Africa (MENA) region and to evaluate the performance of global climate model simulations of the past four decades. A list of indices of temperature extremes, based on absolute level, threshold, percentile and duration, is used, as defined by the Expert Team on Climate Change Detection and Indices (ETCCDI). We use daily near-surface air temperature (Tmax and Tmin) to derive the indices of extremes for the period 1980–2018 from: (i) re-analyses (ERA-Interim, MERRA-2) and gridded observational data (Berkeley Earth) and (ii) 18 CMIP5 model results combining historical (1950–2005) and scenario runs (2006–2018 under RCP 2.6, RCP 4.5 and RCP 8.5). The CMIP5 results show domain-wide strong, statistically significant warming, while the observation-based ones are more spatially variable. The CMIP5 models capture the climatology of the hottest areas in the western parts of northern Africa and the Gulf region, with the warmest day (TXx) > 46 °C and warmest night (TNx) > 33 °C. For these indices, the observed trends are about 0.3–0.4 °C/decade while they are 0.1–0.2 °C/decade stronger in the CMIP5 results. Overall, the modeled climate warming up to 2018, as reflected in the indices of temperature extremes, is confirmed by re-analysis and observational data.


Introduction
The global average air temperature near the Earth's surface has increased by approximately 1 °C above pre-industrial levels, and this warming was likely due to man-made forcing by increased greenhouse gas (GHG) levels, as reported in the Intergovernmental Panel on Climate Change (IPCC) Special Report on Global Warming of 1.5 °C [1]. In addition to this shift in the mean climate, changes in weather and climate extremes have been observed. During 1951-2003, over 70% of the global land area sampled showed a significant decrease in the annual occurrence of cold nights and a significant increase in the annual occurrence of warm nights [2]. In situ observations reveal widespread significant changes in temperature extremes consistent with warming, especially for those indices derived from daily minimum temperature, over the whole 110 years of record but with stronger trends in more recent decades [3].
In this global background, the Middle East-North Africa (MENA) region emerges as a climate change hotspot, with temperature increases and rainfall reductions since the middle of the 20th century [4][5][6][7]. Several studies based on long-term temperature station data suggest that since the 1970s,

Global Climate Models
Daily minimum and maximum near-surface air temperatures (TN and TX, respectively) were retrieved from the Earth System Grid Federation (ESGF) data portal (https://esgf-node.llnl.gov/projects/esgf-llnl/) for 18 CMIP5 models. For model evaluation in the present study we consider only one (typically the first) ensemble member of each model. The data were acquired from the historical simulations for the period 1950-2005 and from the scenario projections for the period 2006-2018 (under RCP 2.6, RCP 4.5 and RCP 8.5). The complete set of 18 models is summarized in Table 1.

Re-Analyses and Observations
For the purposes of assessing the model performance, re-analysis data were used. Such datasets are often used for model evaluation because of their gridded format, complete global spatial coverage and similarity of the scales represented. They are produced via data assimilation, a process that relies on both observations and model-based forecasts; variables that are directly assimilated into the re-analysis forecast model are typically closer to observations.
In this study, we derived indices for two re-analysis datasets: ERA-Interim [29] and MERRA-2 [46]. ERA-Interim was developed by the European Centre for Medium-Range Weather Forecasts (ECMWF), and is available from 1979 until August 2018. The data assimilation method used to produce ERA-Interim is based on a 4-D variational scheme (4D-Var) with a 12 h analysis window [47]. The Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA-2) is created by NASA's Global Modeling and Assimilation Office (GMAO) and is produced with version 5.12.4 of the Goddard Earth Observing System (GEOS) atmospheric data assimilation system. For both, daily maximum and minimum temperature data were downloaded from the Climate Explorer website (https://climexp.knmi.nl). The ERA-Interim data are available on a regular 0.7° × 0.7° longitude/latitude 512 × 216 grid for the period from 1979 to 2018. The MERRA-2 dataset is available for the MENA region grid boxes on a 0.63° × 0.5° longitude/latitude 135 × 69 grid for the period from 1980 to 2018.
In addition, we used the gridded Berkeley Earth dataset [31]. The Berkeley Earth initiative has developed a new mathematical framework for producing maps and large-scale averages of temperature changes from weather station data for the purpose of climate analyses. The framework contains a weighting process that assesses the quality and consistency of a spatial network of temperature stations as an integral part of the averaging process. This permits data with varying levels of quality to be used without compromising the accuracy of the resulting reconstructions. The Berkeley dataset is provided only over land on a 1° × 1° grid from the Climate Explorer website (at the time of the analysis the last available year of the data was 2017). Therefore, due to the unavailability of data for 2018, the subsequent calculations and results for the Berkeley Earth dataset refer to the 1980-2017 period.

Climate Extreme Indices Definition
The indices under investigation are defined and described in detail by Klein Tank et al. [48] and Zhang et al. [49] and can be subdivided into four categories: (1) absolute value indices, which describe, for instance, the hottest or the coldest day of the year; (2) threshold indices, which count the number of days when a fixed temperature threshold is exceeded, for instance, frost days; (3) duration indices, which provide the duration of hot or dry spells; (4) percentile-based threshold indices, which are the exceedance rates above or below a threshold defined as the 10th or 90th percentile derived from the 1981-2010 base period. The set of the 12 indices used is summarized in Table 2.
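As an illustration of the first two categories, the following minimal Python sketch derives the absolute value indices (TXx, TNx, TXn, TNn) and the threshold indices (FD, ID) from one year of daily data at a single grid point. It is not the code used in this study (the calculations were performed with the R package climdex.pcic); the function name is hypothetical, and the 0 °C thresholds for frost days (TN < 0 °C) and ice days (TX < 0 °C) follow the standard ETCCDI definitions.

```python
def absolute_and_threshold_indices(tmax, tmin):
    """Annual indices from daily series (degrees C) for one year, one grid point.

    tmax, tmin: sequences of daily maximum / minimum temperature.
    Hypothetical helper for illustration only.
    """
    return {
        "TXx": max(tmax),                        # warmest day
        "TNx": max(tmin),                        # warmest night
        "TXn": min(tmax),                        # coldest day
        "TNn": min(tmin),                        # coldest night
        "FD": sum(1 for t in tmin if t < 0.0),   # frost days: TN < 0 degrees C
        "ID": sum(1 for t in tmax if t < 0.0),   # ice days:   TX < 0 degrees C
    }


# Made-up four-day sample, purely illustrative
sample = absolute_and_threshold_indices(
    tmax=[5.0, -1.0, 10.0, 30.0],
    tmin=[-3.0, -5.0, 2.0, 18.0],
)
print(sample)
```

Applying such a function year by year and grid point by grid point would yield the annual index series from which climatologies and trends are computed.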

Data Processing and Calculations
The temperature-based indices are derived for the CMIP5 output, the two re-analyses, and the Berkeley Earth datasets. The calculations are performed with the R package climdex.pcic as documented at The Comprehensive R Archive Network website (https://cran.r-project.org/web/packages/climdex.pcic/climdex.pcic.pdf). In this analysis we used the 1981-2010 interval as the base period for the percentile-based calculations, as we focus on the recent past (until 2018); this choice is also possible given the temporal coverage of the re-analysis daily data.
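The percentile-based and duration indices can be sketched in the same spirit. The Python snippet below is a simplified illustration and not a re-implementation of climdex.pcic: it pools base-period values in a window centred on each calendar day to obtain a percentile threshold, computes an exceedance rate (as for TX90p), and counts the days in spells of at least six consecutive exceedances (as for WSDI). The function names are hypothetical; the quantile interpolation (`method="inclusive"`), the wrap-around at the ends of the year and the omission of the bootstrap procedure used for in-base years are simplifying assumptions.

```python
from statistics import quantiles


def day_threshold(base, day, q=90, window=5):
    """q-th percentile for calendar index `day`, pooled over base-period years.

    base: dict mapping year -> list of daily values (same length each year).
    Simplification: the window wraps around within the same year.
    """
    half = window // 2
    pooled = []
    for series in base.values():
        n = len(series)
        for offset in range(-half, half + 1):
            pooled.append(series[(day + offset) % n])
    # quantiles(..., n=100) returns the 99 percentile cut points
    return quantiles(pooled, n=100, method="inclusive")[q - 1]


def exceedance_rate(series, thresholds, above=True):
    """Percentage of days above (or below) the day-specific thresholds."""
    hits = sum(
        1 for v, th in zip(series, thresholds) if (v > th if above else v < th)
    )
    return 100.0 * hits / len(series)


def spell_days(flags, min_len=6):
    """Number of days belonging to runs of at least `min_len` consecutive True flags."""
    total = run = 0
    for flag in list(flags) + [False]:  # sentinel closes a trailing run
        if flag:
            run += 1
        else:
            if run >= min_len:
                total += run
            run = 0
    return total
```

With this layout, TX90p for a given year would be `exceedance_rate(tx_year, thresholds)`, and a WSDI-like count would be `spell_days(v > th for v, th in zip(tx_year, thresholds))`, where `tx_year` and `thresholds` are placeholder names for the daily series and the per-day thresholds.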
Since our analysis addresses the MENA region, we focused on a domain that extends from 12°N to 46°N, and from 20°W to 64°E, covering North Africa, the Middle East, the Mediterranean and parts of southern Europe. For the purpose of comparing indices among datasets of different native spatial resolution, we regridded the original daily temperature data to a common 45 × 19 grid (1.87° × 1.87°), to maintain a balance between the coarser resolution of the CMIP5 models and the higher resolution of the re-analysis datasets. For the regridding we used a distance-weighted average remapping procedure (https://code.mpimet.mpg.de/projects/cdo/embedded/index.html#x1-6230002.12.4). We consider only grid points over land to focus on populated land areas. Due to the coarse resolution of the common grid, major islands in the Mediterranean, as well as southern Italy, are excluded from the analysis.
For the CMIP5 data, the indices were derived from the Tmax and Tmin daily time series for each model and RCP scenario, and the multi-model averages were subsequently estimated. For all datasets, the obtained annual time series of the indices were statistically treated and plotted as spatial climatologies, time series, trends and box-and-whisker plots (boxplots). For the trends, the slope was estimated by fitting a linear model (https://stat.ethz.ch/R-manual/R-devel/library/stats/html/lm.html) and its statistical significance was evaluated with the non-parametric, rank-based Mann-Kendall test for monotonic trend (https://cran.r-project.org/web/packages/Kendall/Kendall.pdf). In the rest of the text we use the term "climatology" in the sense of a long-term time average.
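The trend estimation described above can be illustrated with a short, self-contained Python sketch. It is not the R code used in this study (which relied on `lm` and the Kendall package): the function names are hypothetical, the slope is an ordinary least-squares fit, and the Mann-Kendall variance is written without the correction for tied values, which is a simplifying assumption.

```python
import math


def ols_slope(years, values):
    """Least-squares slope of values against years (units per year)."""
    n = len(years)
    mean_y = sum(years) / n
    mean_v = sum(values) / n
    sxx = sum((y - mean_y) ** 2 for y in years)
    sxy = sum((y - mean_y) * (v - mean_v) for y, v in zip(years, values))
    return sxy / sxx


def mann_kendall(values):
    """Mann-Kendall S statistic, Z score and two-sided p-value (no tie correction)."""
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)  # sign of each pairwise difference
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return s, z, p


# Synthetic, purely illustrative series rising by 0.04 degrees C per year
years = list(range(1980, 1990))
series = [20.0 + 0.04 * (y - 1980) for y in years]
trend_per_decade = 10.0 * ols_slope(years, series)
s, z, p = mann_kendall(series)
significant = p < 0.05  # 95% confidence level, as used for the hatching in the maps
```

The slope is multiplied by 10 to express the trend per decade, and a trend would be flagged as significant when the p-value falls below 0.05.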

Spatial Representation
In the following (climatology and trend) maps we compare the CMIP5 multi-model ensemble mean, combining output from the historical runs and the RCP 4.5 runs (2006-2018), with the two re-analyses (ERA-Interim, MERRA-2) and the Berkeley Earth gridded observational data.

Climatology
The 1980-2018 climatology of selected indices for the MENA region is presented in Figures 1-4, with maps of TXx (hottest day), TNx (hottest night), WSDI (warm spell duration) and TX90p (warm days), respectively. First we examine the climatology of the absolute maxima of daily maximum and minimum temperature (TXx and TNx, or warmest day and warmest night, respectively) in Figures 1 and 2. The spatial distribution and magnitude of the CMIP5-derived indices are in good agreement with those obtained from the two re-analyses and the station-based observational dataset. The CMIP5 ensemble mean captures the observed hottest areas within the region well (western parts of northern Africa and the Gulf region with TXx > 46 °C and TNx > 33 °C). In the northern parts of the domain, where milder conditions exist, the CMIP5 ensemble tends to overestimate those indices, especially over high terrain areas, whereas there is agreement among the three observational datasets (Figures A1 and A2). This slight warm bias can be attributed to the coarse spatial resolution of the global models compared with the ERA-Interim, MERRA-2 and Berkeley datasets, as the models tend to underestimate the topography and thus represent the surface at lower altitudes.
In Figure 3, the CMIP5 multi-model mean shows longer warm spells over the whole region (with WSDI values greater than 5 days almost everywhere), overestimating the values derived from the three observational datasets (Figure A3). Interestingly, though, the CMIP5 models capture quite well the "hotspot" areas of the index (with WSDI > 10 days) over the west Balkan region, central Sahara and Middle East, in good agreement with the observed data (although less so with Berkeley Earth). In Figure 4, the CMIP5 data simulate values greater than 11 for the Warm Days index (TX90p) everywhere, while the three observational datasets indicate values less than 11 for almost half of the MENA region. Berkeley Earth (constructed solely from station observations) stands out as the dataset with the smallest number of Warm Days, with TX90p values less than 10 days everywhere, in contrast with the two re-analyses.

Decadal Trends
Figures 5-8 present the decadal trends for TXx, TNx, WSDI and TX90p. We again compare the CMIP5 multi-model ensemble with the two re-analyses (ERA-Interim, MERRA-2) and the Berkeley Earth gridded observational data. The statistical significance of the trends at 95% confidence is illustrated in the maps by the hatching. It is noticeable that the CMIP5 models show strong positive trends for all four indices that are also statistically significant throughout the MENA domain. This is expected from the clearly monotonic positive trend driven by the overall, linearly increasing warming related to the RCPs (evident in the time series in Figures 9 and 10). It can also be attributed to the suppressed interannual variability of the 18-model ensemble average compared to the individual observationally-based datasets.
In Figure 5, the CMIP5 results show a region-wide, large, positive trend for TXx, ranging from 0.4 °C/decade mainly in parts of the northern African desert and the Middle East to almost 0.8 °C/decade in Europe, especially in the Balkans. The re-analyses and Berkeley Earth exhibit smaller warming (Figure A5), which is much more spatially variable and in some places of opposite sign (i.e., cooling, not statistically significant, for example MERRA-2 in Iberia and Berkeley Earth over Libya). The three observation-based datasets show a maximum warming trend, larger than 1 °C/decade, around the Caucasus region. This may be driven by the positive snow-albedo feedback, which is underestimated in the coarse resolution CMIP5 models with their simplified representation of key snow processes and vegetation [50].
There is general agreement regarding the spatial distribution and magnitude of the TXx trend in the two re-analysis datasets (Figure A5), which is not always in line with the one derived from Berkeley Earth (for example, the latter presents a distinct warming in south-east Europe). In Figure 6, a widespread positive trend of 0.4-0.6 °C/decade across the MENA region is evident in the CMIP5 data for the TNx index, which reaches 0.6-0.8 °C/decade in parts of south-east Europe, in better agreement with the MERRA-2 data than with the other two (Figure A6). In fact, ERA-Interim and Berkeley Earth exhibit similar patterns of warming (e.g., over the Maghreb) and cooling (in the south-west part of the domain).
Figure 7 indicates that the CMIP5 multi-model mean represents the observed decadal trend in the WSDI index rather well. The GCMs apparently capture the patterns that are also derived from the three observational datasets. An increasing duration of warm spells (of at least 5 days per decade) is thus evident in the western and parts of northern Africa, the western Balkans and Italy, and the African and Arabian coasts of the southern Red Sea (with ERA-Interim only), as is also evident in Figure A7. Similar agreement in the spatial patterns is apparent in the trend of the exceedance rate of warm days, TX90p (Figure 8). In parts of southern Europe, northern Africa and the Middle East, a 3-5% exceedance per decade is calculated for the CMIP5 and the other three datasets, although the latter also include extended areas without trend (Figure A8).

Temporal Evolution
The temporal evolution of the indices averaged for the MENA domain (land only) is shown in Figures 9 and 10. Here we plot CMIP5 data for the historical period. For the 2006-2018 period we plot, separately, the high emissions scenario (RCP 8.5), the intermediate mitigation scenario (RCP 4.5) and the stringent mitigation scenario (RCP 2.6). It can be seen that the three RCP scenarios do not result in different temporal variation during this short period after 2005. This is in accordance with the global average temperature projections, where a differential warming due to the RCP scenarios is discernible only after 2030 [51]. Figure 9 depicts the temporal evolution of absolute and threshold indices displayed as anomalies relative to the reference period 1981-2000. Regarding the absolute heat extremes (TXx, TNx), the evolution of the CMIP5 multi-model ensemble closely follows the warming trends in the re-analysis and observation-based data, most evident from the 1990s onwards, resulting in a warming of 1 °C by 2018. Note, though, that the values presented here represent the MENA averages and, as shown in Section 3.1.2, specific sub-regions warm at much faster rates. For the absolute cold extremes (TNn, TXn), the GCMs indicate a warming, in contrast to the re-analyses/observations, which do not show significant trends. For all observational data, the TNn and TXn indices exhibit greater interannual variability than TXx and TNx, a feature that, as expected, cannot be seen in the CMIP5 ensemble mean, but is evident in the minima and maxima of the individual model values. In some recent years the observed cold anomalies (TNn, TXn) reach 2 °C, which is within the CMIP5 multi-model min-max range and in accordance with documented extreme cold winter weather events in the middle latitudes over Eurasia [52,53].
Regarding the FD index, the CMIP5 models follow the decreasing trend in the re-analyses and Berkeley Earth, which amounts to about 5 days over the past four decades. Until about 2000 the decrease in frost days, although evident, was not flagged as significant in earlier assessments for the region [8], but our updated analysis identifies a persistent trend. Frost days are usually related to nighttime minimum temperatures, and the FD index temporal variation is indeed mirrored by the one in TNx. It is also worth noting the significant discrepancies among datasets in absolute values (not shown), of the order of 10 days. For the ID index there is overall agreement between CMIP5 models and re-analyses/observations, suggesting that we have entered a period of fewer Ice Days in the 21st century. However, due to the absence of ice days in large parts of the MENA region, the absolute values of the index are small and there is a large interannual variability.
Figure 10 illustrates the time series of the percentile-based and duration indices. Overall, the models follow the evolution of the re-analysis and observational datasets. The increasing trend that is observed for warm days/nights and WSDI and the decreasing trend of cold days/nights and CSDI are in line with the trends obtained at the global scale [27]. In contrast with the absolute and threshold indices, the CMIP5 model minima and maxima can capture to some extent the interannual variability indicated in the other three datasets. For instance, the GCM range captures the observed peaks in cold days (TX10p) and cold nights (TN10p), but to a lesser extent the lows in Warm Days (TX90p) and Warm Nights (TN90p), in the years following the 1982 El Chichon and 1991 Mt. Pinatubo volcanic eruptions. The presence of the volcanic forcing signal corroborates, for the MENA region, the findings of the global analysis of CMIP5 by Paik and Min [54], who demonstrated that temperature extremes follow the mean temperature response to the volcanic forcing.
Further scrutiny of the graphs in Figure 10 reveals that while the variation of the model ensemble mean in the historical period closely follows that of the re-analyses/observations, the upward trend in the scenario simulations after 2005 is about twice that of the observed warm extreme indices (TX90p, TN90p, WSDI). The observed indices indicate less pronounced warming in recent years and, except for a warm peak in 2010, lie outside the model inter-quartile range (IQR, between the 25th and 75th percentile), at the lower end of the model range. This observed increase in the warm extremes (mainly TX90p, TN90p and WSDI) after 2005, within the period during which the global mean surface temperature (GMST) trend appears to level off, may be linked to sea surface temperature variations associated with the Atlantic Multidecadal Oscillation [55].
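For readers who wish to reproduce this type of time-series comparison, the anomaly and ensemble-envelope calculations can be sketched as follows. This is an illustrative Python outline under stated assumptions, not the processing chain of this study: the function names and the dictionary-based data layout are hypothetical, and the input is assumed to be a domain-averaged annual index series per model.

```python
from statistics import mean


def anomalies(series, ref_start=1981, ref_end=2000):
    """Anomalies of an annual series relative to its reference-period mean.

    series: dict mapping year -> domain-mean index value.
    """
    ref = mean(v for y, v in series.items() if ref_start <= y <= ref_end)
    return {y: v - ref for y, v in series.items()}


def ensemble_envelope(model_series):
    """Per-year ensemble mean, minimum and maximum across models.

    model_series: dict mapping model name -> dict of year -> value.
    Only years available in every model are kept.
    """
    years = sorted(set.intersection(*(set(s) for s in model_series.values())))
    envelope = {}
    for year in years:
        vals = [s[year] for s in model_series.values()]
        envelope[year] = {"mean": mean(vals), "min": min(vals), "max": max(vals)}
    return envelope
```

Each model series would first be passed through `anomalies` and the results then combined with `ensemble_envelope`, so that the ensemble mean and min-max range refer to the same 1981-2000 reference as the re-analysis and observational curves.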

Climatology
A more detailed analysis of the climatology (absolute values) of temperature indices of extremes (MENA domain average) shows the CMIP5 model percentiles and min-max range compared with the Berkeley Earth and the two re-analysis datasets, graphically summarized by the box-and-whisker plots in Figure 11. For the absolute indices, the Berkeley and re-analysis values lie within the models' IQR, indicating overall good agreement with the CMIP5 results, except for TXn, which is underestimated by the majority of models. The best agreement is obtained for TXx, with the CMIP5 median almost coinciding with the compact, observed values (and a model outlier by >8 °C). Note also that the CMIP5 IQR (representing half of the models) for all four indices is about 3-4 °C, while the range in the observed datasets is up to 2 °C; apart from TXx, all three datasets disagree, a reminder of the observational uncertainty and the importance of using several data sources in climate model evaluation [56].
For frost days (FD) and ice days (ID), the spread of the Berkeley Earth and re-analysis datasets is quite small and falls within the CMIP5 IQR. The observed spread is also small for WSDI, but lies below the 25th percentile of the multi-model ensemble, meaning that most models overestimate the index (by about 2 days). For CSDI there is overall agreement between models and observational datasets, except for a notable, off-scale model outlier. The GCM median value lies in the middle of the spread of the other datasets, which is large, though.
For the percentile indices, the inter-quartile model range is small for all. This is expected from the definition of these indices, as the 1980-2018 range is very similar to the baseline period 1981-2010, for which the average exceedances for the upper and lower deciles of TX and TN are the nominal 10%. To obtain a clearer illustration of the model spread, the index values for Berkeley are not shown, since they are lower (7-8%) and fall outside the plotted range. The re-analysis values are similar to each other and lower than most models in the representation of warm days and nights (TX90p, TN90p), suggesting a slight overestimation by the CMIP5 results.
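The comparison underlying these box-and-whisker plots, namely whether an observed value falls inside the model inter-quartile range, can be expressed compactly. The Python sketch below is illustrative only: the function name is hypothetical, and the quartile convention (`method="inclusive"`) is an assumption that may differ slightly from the hinges used by R's boxplot routine.

```python
from statistics import quantiles


def iqr_check(model_values, observed):
    """Model quartiles and a flag per observational dataset for 'inside the IQR'.

    model_values: list with one domain-mean index value per model.
    observed: dict mapping dataset name -> domain-mean index value.
    """
    q1, med, q3 = quantiles(model_values, n=4, method="inclusive")
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        "inside": {name: q1 <= value <= q3 for name, value in observed.items()},
    }


# Made-up numbers, for illustration only
result = iqr_check([1.0, 2.0, 3.0, 4.0, 5.0], {"dataset_A": 2.5, "dataset_B": 4.5})
print(result)
```

A dataset flagged `False` would correspond to an observational marker plotted outside the box, as described for WSDI in the text.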

Decadal Trends
We repeat the boxplot analysis for the decadal trends of the indices, shown in Figure 12. For the absolute indices, the CMIP5 inter-quartile range indicates a warming trend of the order of 0.4-0.5 °C/decade for the warmest day and night (TXx and TNx) and 0.3-0.4 °C/decade for the coldest day and night (TXn and TNn). These are about twice the rate derived from the three observational datasets, indicating a faster warming in the GCMs than observed. According to the same numbers, the CMIP5 models simulate a slightly larger warming trend in the absolute value warm extremes than in the cold ones. Interestingly, a larger warming in the warm extremes is also obtained from the re-analyses and Berkeley Earth datasets.
For the FD and ID indices, good agreement between the GCMs and re-analyses/Berkeley Earth is obtained, as all three datasets fall within the inter-quartile model range. The average decadal trend is much less pronounced for the ID index, which is reasonable due to the low average value of this index in the MENA region. The larger overall model warming compared to the observations is also apparent in the duration indices. There is a pronounced positive trend in the WSDI index, with an ensemble median trend of roughly 5 days/decade, which is about 1 day higher than in the re-analyses and Berkeley Earth.
There is also a negative trend in CMIP5 for the CSDI index (−2 days/decade), slightly stronger than the observed (which is also negative) by 1 to 2 days per decade. In contrast with the climatology comparison, the Berkeley Earth average decadal trends agree with the re-analyses for the percentile indices. According to the CMIP5 ensemble median, there is an increasing rate of roughly 3% per decade for the warm days (TX90p) and the warm nights (TN90p), although the inter-quartile model range is broader for TX90p. Lower values (by 0.5% on average) are derived for the re-analyses and Berkeley Earth. The models show a negative trend of −2.5%/decade for the cold nights (TN10p), with MERRA-2 (and to a lesser extent ERA-Interim and Berkeley Earth) very close to the model median. A negative trend (−2.0%/decade) in the cold days (TX10p) is derived for the GCMs, in very good agreement with the observational datasets. From the overall results of the present section we can infer that the increasing trend of heat extremes is larger than the decreasing trend of cold extremes. This asymmetry has been documented on a global level and is linked with differences in seasonal trends in the Northern Hemisphere [57], which seem to affect the MENA region to some extent [58].
Figure 12. As in Figure 11, but for the decadal trends of temperature indices.

Quantification of Individual Model Biases
The overall performance of individual models in simulating the 1980-2018 climatology and decadal trends of the indices is summarized in the "portrait" diagrams of Figure 13. The portrait diagrams display the relative magnitude of the spatially averaged bias for each index (rows) and for each model (columns). Warmer colors indicate models with positive bias and colder colors indicate models with negative bias. Here the model performance is assessed with respect to the ERA-Interim re-analysis. In addition to the individual models, the performance of the so-called mean and median models is displayed in the last two columns.
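The table of biases behind such a portrait diagram can be sketched in a few lines of Python. This is a hedged illustration rather than the procedure used for Figure 13: the function name and data layout are hypothetical, the bias is taken as the simple difference between the spatially averaged model value and the reference value, and any normalisation applied for the colour scale is omitted.

```python
from statistics import mean, median


def portrait_biases(model_values, reference):
    """Bias (model minus reference) per model and index, plus mean/median 'models'.

    model_values: dict mapping model name -> dict of index name -> spatial-mean value.
    reference: dict mapping index name -> spatial-mean value of the reference dataset.
    """
    table = {
        model: {idx: vals[idx] - reference[idx] for idx in reference}
        for model, vals in model_values.items()
    }
    # Extra columns: bias of the multi-model mean and of the multi-model median
    table["mean"] = {
        idx: mean(v[idx] for v in model_values.values()) - reference[idx]
        for idx in reference
    }
    table["median"] = {
        idx: median(v[idx] for v in model_values.values()) - reference[idx]
        for idx in reference
    }
    return table


# Made-up values: two hypothetical models with opposite TXx biases
demo = portrait_biases(
    {"model_A": {"TXx": 42.0}, "model_B": {"TXx": 38.0}},
    {"TXx": 40.0},
)
print(demo)
```

In this made-up example the two individual biases (+2 and −2) cancel in the "mean" column, which illustrates how compensating errors can make an ensemble mean look better than its members.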
Regarding the model representation of the 1980-2018 climatology (Figure 13a), the diagram indicates that the multi-model mean outperforms individual models because some of the systematic errors in individual models are cancelled out in the ensemble mean. For the percentile indices (i.e., TX90p, TX10p, TN90p, and TN10p), the models generally perform well. This is not surprising, since for the calculation of these indices we used 1981-2010 as the base period, so we expect values from all datasets to be, as mentioned before, around 10% for that period, leading to trivial biases. For the other indices, most models perform reasonably well apart from some outlying models in the TXx index (MIROC-ESM, MIROC-ESM-CHEM, CanESM2). McSweeney et al. [59] also showed that MIROC-ESM and MIROC-ESM-CHEM simulated key regional aspects of Europe's climate poorly. We identify MRI-CGCM3, CCSM4 and CSIRO_Mk3.6.0 as the best performing models. An exception is the frost days (FD) index, which seems to be poorly represented by most models. We find that some models show large negative and others large positive biases for FD compared to the ERA-Interim data. As already mentioned, the ERA-Interim, MERRA-2 and Berkeley Earth datasets themselves differ; therefore, no conclusion can be drawn regarding the ability of the models to accurately simulate this index.
For the decadal trends (Figure 13b), the GCMs exhibit small biases for the absolute indices (TXx, TXn, TNn, and TNx). Most models have a small positive bias; in other words, they slightly overestimate the trend compared to ERA-Interim. Similar to Figure 13a, the multi-model mean tends to outperform the individual models. For the ice days (ID) index, models exhibit small biases, which can be attributed to the general stability of the trend and the relatively small ID value in the MENA region. With respect to the FD index, models and re-analyses show noticeable uncertainty, but for the trend most models are in good agreement with ERA-Interim, except for a few outliers (HadGEM2-ES, IPSL-CM5A-LR). The biases are more prominent for the percentile-based indices. Overall, based on the ensemble mean and median, positive biases are observed for the warm extreme indices (WSDI, TX90p, TN90p) and negative biases for the cold extreme indices (CSDI, TX10p, TN10p). Hence, most CMIP5 models overestimate the average MENA region warming rate, as also shown in Figure 6. A more detailed analysis of the diagrams reveals a few individual models which appear to perform better for these indices. For the WSDI index, CNRM-CM5 simulates a trend relatively close to ERA-Interim, showing a slight negative bias, quite different from the positive bias of the ensemble mean. The same is true for the respective comparison with MERRA-2 and Berkeley Earth in Figures A9 and A10 of Appendix A. Other models with good performance in the simulation of decadal trends are MRI-CGCM3, MIROC-ESM and CCSM4.
A correlation analysis between the models' grid-box area and the derived biases for all the indices did not reveal any noteworthy association between the model horizontal resolution and the representation of temperature indices. Only for TXx do we find a significant correlation (r = 0.55). These results are in line with the global study of Sillmann et al. [27], who also did not find any noticeable effect of model resolution on the biases. Finally, we examined (not shown) whether model performance (biases in Figure 13) was linked to the inclusion of aerosol and/or atmospheric chemistry coupling in some of the CMIP5 models, according to Table 9.1 in Flato et al. [60], but no connection was found.
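A correlation of this kind can be computed with the Pearson coefficient between the grid-box area of each model and its bias for a given index. The following Python sketch is purely illustrative: the function name is hypothetical, the choice of the Pearson (rather than a rank-based) coefficient is an assumption, and no values from this study are reproduced.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up toy input: x could hold grid-box areas, y the corresponding biases
r = pearson_r([1.0, 2.0, 3.0], [1.0, 3.0, 2.0])
```

Here `x` would hold one grid-box area per model and `y` the corresponding spatially averaged bias, with one coefficient computed per index.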

Summary and Conclusions
We have analyzed the long-term average and evolution of observed and modeled temperature extremes defined by the ETCCDI for a recent period up to 2018. This was possible by using gridded re-analysis (ERA-Interim and MERRA-2) and station-based (Berkeley Earth) observational datasets and output from 18 CMIP5 models. For the CMIP5 models we combined information from the historical runs and the RCP scenario runs.
Results from the MENA climatology maps indicate that the ensemble mean of the CMIP5 models reproduces the observed patterns of climate extremes well and captures the hottest areas in the western parts of northern Africa and the Gulf region with TXx > 46 °C and TNx > 33 °C. Due to the relatively low horizontal resolution of the models, and the consequent underestimation of topographical texture, some warm biases occur over elevated terrain regions for the absolute value indices. For the 1980-2018 trend, the multi-model mean exhibits less spatial variability and more widespread statistical significance than the observations and indicates a rate of increase of 0.4-0.8 °C/decade for TXx and TNx.
The analysis of the temporal evolution of temperature extremes reveals that the MENA region follows the global trends [2,27,61]. For most indices model trends are in agreement with re-analyses and observations, especially the percentile-based indices, for which CMIP5 models capture the observed temporal variations relatively well, including signals following major volcanic eruptions at the end of the 20th century. The modeled warm days and warm nights (cold days and cold nights) continued to increase (decrease) throughout the second decade of the 21st century, in general agreement with the observations. This confirms that in the MENA region, temperature extremes continue to follow a trend of accelerated warming, regardless of the recent slowdown (the so-called global warming hiatus) in the global mean temperature increase [55,62].
The warmest day (TXx) and warmest night (TNx) indices increase steadily for the last 40 years while the coldest day (TXn) and coldest night (TNn) increase at a smaller rate, both in CMIP5 and observational data. Our updated analysis shows a persistent negative trend in the number of frost days (FD) in the region. Further analysis of processes that control nighttime minimum temperatures, such as regional circulation changes, cloud cover and soil moisture [61], could shed light on the driving mechanisms. The climatology comparison is best quantified with the box-and-whisker plots, indicating good agreement for the absolute values (within 1-2 °C) and threshold/duration indices between models and observations. The respective trend comparison indicates that for the warmest day (TXx) and warmest night (TNx) indices, the observed trends are about 0.3-0.4 °C/decade while they are 0.1-0.2 °C/decade stronger in the CMIP5 results. The observed trend is smaller for the coldest day (TXn) and night (TNn), whereas the CMIP5 trend for these indices remains strong. This larger positive trend in the heat extremes than the negative trend in the cold extremes is also calculated for the duration and percentile indices, with variable agreement between the GCMs and the observational data. In the portrait diagrams of biases, the multi-model ensemble mean and the ensemble median generally outperform individual models, in agreement with previous multi-model studies [27]. Among individual models, the performance is variable for the different indices in the climatology and trend biases, but some overall consistency can be assigned to MRI-CGCM3 and CCSM4.
The additional 13 years after 2005 (the end of the period of the CMIP5 simulations with historical forcings) provide a "post-historical" record to retrospectively assess the RCP generation projections [63].
The results of the 1980-2018 analysis for most of the indices of temperature extremes point to continued and significant warming in the MENA region, broadly in line with the CMIP5 model projections. Follow-up work could include output from the new projections driven by the Shared Socioeconomic Pathways (SSP) scenarios from the CMIP6 model simulations to investigate how temperature extremes are represented, considering that several of the participating models have ECS values higher than any of the CMIP5 models [64]. A complete assessment of the evolution of the ETCCDI indices for the MENA region should also use the data from the CMIP5 dynamical downscaling results over the MENA-CORDEX (COordinated Regional Downscaling EXperiment) domain [20,65-67].

Funding: This work was co-funded by the European Regional Development Fund and the Republic of Cyprus through the Research Innovation Foundation CELSIUS Project EXCELLENCE/1216/0039. It was also supported by the EMME-CARE project that has received funding from the European Union's Horizon 2020 Research and Innovation Program, under grant agreement No. 856612, as well as matching co-funding by the Government of the Republic of Cyprus.

Acknowledgments:
We would like to thank the CMIP5 modeling groups that performed simulations and publicly shared their data, and the Earth System Grid Federation (ESGF) facilities for hosting them.