Evaluation of the Performance of CMIP6 Models in Reproducing Rainfall Patterns over North Africa

Abstract: This study assesses the performance of historical rainfall data from the Coupled Model Intercomparison Project phase 6 (CMIP6) in reproducing the spatial and temporal rainfall variability over North Africa. Datasets from the Climatic Research Unit (CRU) and the Global Precipitation Climatology Centre (GPCC) are used as proxies for observational datasets to examine the capability of 15 CMIP6 models and their ensemble in simulating rainfall during 1951–2014. In addition, robust statistical metrics, the empirical cumulative distribution function (ECDF), the Taylor diagram (TD), and the Taylor skill score (TSS) are utilized to assess the models' performance in reproducing annual, seasonal, and monthly rainfall over the study domain. Results show that CMIP6 models satisfactorily reproduce the mean annual climatology of dry/wet months. However, some models show a slight over/underestimation across dry/wet months. The top-ranked models across all the performance analyses, including mean cycle simulation, trend analysis, inter-annual variability, ECDFs, and statistical metrics, are as follows: EC-Earth3-Veg, UKESM1-0-LL, GFDL-CM4, NorESM2-LM, IPSL-CM6A-LR, and GFDL-ESM4. The multi-model ensemble mean outperformed the individual CMIP6 models, with a TSS of 0.79. For future impact studies over the study domain, it is advisable to employ the multi-model ensemble of the best performing models.


Introduction
Climate change is a major concern globally, particularly in arid and semi-arid regions such as North Africa. Changes in climate directly affect many socio-economic activities. For instance, they affect agriculture, water availability and quality, energy, and food security, limiting socioeconomic growth [1]. According to several studies [2][3][4], the North African region is regarded as one of the climate change hotspots. Unfortunately, the limited number of gauge stations hinders quantification and assessment of the impacts of climate change over the region.
To better understand climate change and its effects on past, current, and future environments, several tools have been employed to reproduce climate patterns. One of the main tools is the set of Global Circulation Models (GCMs) of the Coupled Model Intercomparison Project (CMIP). Various phases of CMIP data are available (CMIP, https://www.wcrpclimate.org/wgcm-cmip [accessed 10 December 2020]). The latest CMIP dataset available for analysis is phase 6 (CMIP6 [5]). The CMIP6 models simulate the physics, chemistry, and biology of the atmosphere, land, and oceans in fine detail. The CMIP6 dataset differs from its predecessors, CMIP5 and CMIP3, in terms of forcing scenarios and carbon emissions. The CMIP6 project uses updated versions of the coupled global climate models, a new start year, and a novel set of Shared Socioeconomic Pathways (SSP) scenarios [6].
Many studies have been conducted to assess the impact of climate change on the Northern Africa region. Some studies focusing on the variability of the annual rainfall and drought periods over the region have projected a decrease of annual rainfall between 2071 and 2100 over the Mediterranean by 20% compared to the period 1961-1990 [7][8][9][10].
Other studies [1][10][11][12][13] also reported that the annual rainfall over North Africa is likely to decrease in the future. Furthermore, according to the Intergovernmental Panel on Climate Change Fifth Assessment Report (IPCC AR5), the mean annual rainfall has decreased over the northern Africa region, with an abrupt reduction noted from 1968 to date [14]. Liebmann et al. [15] reported that the region received low annual rainfall amounts with a downward trend. Massoud et al. [16] investigated the change of atmospheric rivers (ARs) and rainfall over the Middle East and North Africa (MENA) region. According to their study [16], the mean daily rainfall is projected to decrease by 15%-30% between 2070 and 2100. In a recent study, Almazroui et al. [17] investigated the performance of CMIP6 datasets over Africa and projected a steady reduction in rainfall over the North Africa region in the 21st century.
While numerous studies have projected a decrease in rainfall over the Mediterranean region, few have delineated with precision the intensity, frequency, or magnitude of the projected decline. Moreover, divergent sub-regional tendencies remain a source of confusion for the relevant stakeholders in the region. To illustrate this, recent studies (e.g., [18][19][20]) have observed an increase in rainfall over some regions of Morocco and Tunisia. Such contradictory scientific findings on the spatiotemporal variability of rainfall in a region already under absolute water scarcity result in more uncertainty for the general community. This situation calls for further studies that would accurately delineate the possible future rainfall patterns over the water-scarce MENA region. The advent of CMIP6 promises more robust projections owing to the advanced features of its model outputs relative to the predecessor versions used to examine past climate patterns over North Africa. Presently, no regional study has investigated the performance of CMIP6 models in replicating or projecting the rainfall over the region in terms of its spatial and temporal variability.
As a first step, this study seeks to evaluate the capability of 15 CMIP6 GCMs in reproducing annual, seasonal, and monthly rainfall patterns over the study region. The best performing models will be utilized in examining the possible future climate patterns to support robust policy making and adaptation. The rest of the study is organized as follows: the study region, data, and methods are described in Section 2. Section 3 outlines the results and discussion, while Section 4 presents the summary and conclusions of the study.

Study Domain
The study domain of the present work is bounded by longitudes 20° W and 38° E and latitudes 19° N and 37° N (Figure 1). The climate of North Africa varies greatly between the coastline and the interior of the region. North Africa has a Mediterranean climate along the coast, which is characterized by mild, wet winters and warm, dry summers, with ample rainfall of approximately 400 to 600 mm per year. Inland, the countries of North Africa experience semiarid and arid desert climates, which are characterized by extremes in daily high and low temperatures and little rainfall: about 200 to 400 mm per year for semiarid areas and <100 mm per year for desert regions. The region is situated at the boundary of the subtropics and mid-latitudes. The changes in rainfall over the domain are attributed to different factors, such as the Hadley circulation and winter storms [16]. For instance, changes in the Hadley circulation have led to the observed decline in rainfall over the study domain due to the significant strengthening of the Azores high. Seasonal variability of rainfall is mainly influenced by the formation and passage of cyclonic disturbances. Furthermore, the spatial variability is due to the complex topography [21,22]. Although the study domain has four distinct seasons [2][22][23][24][25][26], this work mainly focuses on two seasons: winter (December-February [DJF]) and summer (June-August [JJA]).

Data
In this study, simulated data from 15 CMIP6 models alongside their ensemble were investigated. The choice of model outputs was informed by the availability of CMIP6 models with a complete first ensemble member (r1i1p1f1) at the time of analysis. An initial analysis using the first three ensemble members of two randomly selected models (NorESM2-LM and IPSL-CM6A-LR) indicated that the results were not sensitive to the ensemble member selected, with no statistically significant difference in mean rainfall over North Africa between the ensemble members. Details of the models used in this study are summarized in Table A1. More information about CMIP6 models can be found in Eyring et al. [5]. In the absence of quality rain-gauge observational datasets, the model simulations were compared with two gauge-derived gridded datasets to minimize uncertainty. The two datasets were the Climatic Research Unit dataset (CRU TS v. 4.04) [27] and the global precipitation analysis products of the Global Precipitation Climatology Centre (GPCC) version 2020 [28], for the period 1951-2014. For comparison, all datasets were first re-gridded to a common resolution (2.81° × 2.81°), that of the participating model with the lowest resolution. The re-gridding used bilinear interpolation so that the evaluations could be made at an equal resolution.
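As an illustration of the re-gridding step, the following is a minimal sketch of bilinear interpolation between two regular latitude-longitude grids. It is not the tool used in this study; the function name `bilinear_regrid` and the grid values in the demo are hypothetical, and the sketch assumes increasing coordinates and target points that lie inside the source grid.

```python
import numpy as np

def bilinear_regrid(field, src_lat, src_lon, dst_lat, dst_lon):
    """Bilinearly interpolate a 2-D (lat, lon) field onto a new regular grid.

    Assumptions (illustrative only): coordinates are strictly increasing and
    every destination point lies inside the source grid (no extrapolation).
    """
    field = np.asarray(field, dtype=float)
    src_lat = np.asarray(src_lat, dtype=float)
    src_lon = np.asarray(src_lon, dtype=float)
    dst_lat = np.asarray(dst_lat, dtype=float)
    dst_lon = np.asarray(dst_lon, dtype=float)
    # Index of the lower-left source node of the cell holding each target point
    i = np.clip(np.searchsorted(src_lat, dst_lat) - 1, 0, src_lat.size - 2)
    j = np.clip(np.searchsorted(src_lon, dst_lon) - 1, 0, src_lon.size - 2)
    # Fractional position of the target point inside that cell (0..1)
    ty = ((dst_lat - src_lat[i]) / (src_lat[i + 1] - src_lat[i]))[:, None]
    tx = ((dst_lon - src_lon[j]) / (src_lon[j + 1] - src_lon[j]))[None, :]
    # Values at the four surrounding source nodes
    f00 = field[np.ix_(i, j)]
    f01 = field[np.ix_(i, j + 1)]
    f10 = field[np.ix_(i + 1, j)]
    f11 = field[np.ix_(i + 1, j + 1)]
    return ((1 - ty) * (1 - tx) * f00 + (1 - ty) * tx * f01
            + ty * (1 - tx) * f10 + ty * tx * f11)

# Hypothetical demo: a 0.5-degree field mapped to a coarser 2.81-degree grid
demo_src_lat = np.arange(19.0, 37.5, 0.5)
demo_src_lon = np.arange(-20.0, 38.5, 0.5)
demo_field = 2.0 * demo_src_lat[:, None] + 3.0 * demo_src_lon[None, :]
demo_dst_lat = np.arange(19.5, 37.0, 2.81)
demo_dst_lon = np.arange(-19.5, 38.0, 2.81)
demo_out = bilinear_regrid(demo_field, demo_src_lat, demo_src_lon,
                           demo_dst_lat, demo_dst_lon)
```

Bilinear interpolation reproduces a field that varies linearly in latitude and longitude exactly, which makes such a field a convenient check of the implementation. Because it smooths sharp gradients, interpolating to the coarsest model grid is the conservative direction for a comparison.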

Statistical Validation of CMIP6
The robustness of each CMIP6 model output and of their ensemble was assessed at annual, seasonal (i.e., DJF), and monthly time scales by comparing rainfall estimates with the two gridded observations. In this work, the average of all CMIP6 datasets was considered as the ensemble mean. The annual and seasonal means were calculated from monthly data. Furthermore, the mean climatological description is presented by showing the spatial rainfall pattern of inter-annual variability. A Student's t-test was applied to examine the significance of the difference between observations and the model datasets, at a typical significance level of 0.05. CMIP6 simulations were further evaluated based on statistical analysis and validation. The performance measures include the empirical cumulative distribution function (ECDF), Mean Bias (MB), Root Mean Squared Error (RMSE), and coefficient of determination (R²). The measures are defined in Equations (1)-(3), where S is the rainfall value from the observation dataset (CRU), D is the corresponding value from the CMIP6 rainfall dataset, $\bar{S}$ and $\bar{D}$ denote the observed and CMIP6 mean values, respectively, and n is the number of data pairs. Bias indicates how close the mean of CMIP6 is to the mean observed value. A bias closer to 0 indicates that the CMIP6 rainfall estimate is closer to the observation and is thus the ideal score. RMSE measures the distance between estimated (CMIP6) and observed (CRU) values. It states how concentrated the data are around the line of best fit; a value of 0 is the optimal score. R² quantifies the correlation between the CMIP6 models and observations (CRU), with a value closer to 0 signifying no correlation whereas 1 denotes perfect correlation.
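Because Equations (1)-(3) are not reproduced in this text, the sketch below uses the standard textbook forms of the three measures as an assumption: mean bias as the average of D − S, RMSE as the root of the mean squared difference, and R² as the squared Pearson correlation. The function names are illustrative and the sample values are invented.

```python
import numpy as np

def mean_bias(obs, sim):
    """Mean of (simulated - observed); 0 is the ideal score."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return float(np.mean(sim - obs))

def rmse(obs, sim):
    """Root mean squared error between simulated and observed values."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))

def r_squared(obs, sim):
    """Coefficient of determination, assumed here to be the squared Pearson r."""
    r = np.corrcoef(np.asarray(obs, dtype=float), np.asarray(sim, dtype=float))[0, 1]
    return float(r ** 2)

# Invented monthly rainfall values (mm/month): S = observed, D = simulated
S = np.array([1.0, 2.0, 3.0, 4.0])
D = np.array([2.0, 3.0, 4.0, 5.0])
scores = {"MB": mean_bias(S, D), "RMSE": rmse(S, D), "R2": r_squared(S, D)}
```

In this invented case the model is offset from the observations by a constant 1 mm/month, so the bias and RMSE both equal 1 while R² equals 1. This illustrates why the three measures are read together: a high R² alone does not exclude a systematic wet or dry bias.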
Spatial model performance is further assessed based on the Taylor diagram and Taylor skill score (TSS) [29]. The approach has been widely used to assess the similarity between different sets of data [29][30][31]. In this work, the TSS was employed to estimate the performance of CMIP6 models in reproducing the spatial rainfall patterns over North Africa. The TSS is computed using Equation (4), where PC is the spatial pattern correlation coefficient between the model output and the observation, and PC_0 is the highest achievable correlation (here set to 1). The variables σ_cmip and σ_obs represent the standard deviations of the simulated and observed patterns, respectively. A score close to 1 indicates near-perfect agreement between model and observation, whereas a score close to 0 indicates poor model performance. Details about the TSS technique and its successful application can be found in various studies (e.g., [32][33][34][35]).
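Equation (4) is not reproduced in this text, so the following sketch assumes a commonly used form of the Taylor skill score, TSS = 4(1 + PC)^k / [(σ_cmip/σ_obs + σ_obs/σ_cmip)² (1 + PC_0)^k]. The exponent k is exposed as a parameter because published variants differ (k = 1 and k = 4 both appear in the literature); the default below is an assumption, not necessarily the form used in this study.

```python
def taylor_skill_score(pc, sigma_cmip, sigma_obs, pc0=1.0, power=1):
    """Taylor skill score under the assumed form described in the lead-in.

    pc         : pattern correlation between model and observation
    sigma_cmip : standard deviation of the simulated pattern (> 0)
    sigma_obs  : standard deviation of the observed pattern (> 0)
    pc0        : highest achievable correlation (set to 1 in this study)
    power      : exponent k on the correlation terms (assumed, see lead-in)
    """
    ratio = sigma_cmip / sigma_obs
    return (4.0 * (1.0 + pc) ** power
            / ((ratio + 1.0 / ratio) ** 2 * (1.0 + pc0) ** power))

# Limiting cases: perfect agreement and perfect anti-correlation
tss_perfect = taylor_skill_score(1.0, 2.0, 2.0)
tss_worst = taylor_skill_score(-1.0, 2.0, 2.0)
```

The score penalizes a model symmetrically for having too much or too little spatial variability, since the standard deviation ratio enters together with its reciprocal.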

Trend Analysis
The trend detection analysis was performed using the Modified Mann-Kendall (MMK) test. The MMK test [36][37][38] is used to discern statistically significant decreasing or increasing trends in a time series against the null hypothesis of no trend [39][40][41][42]. The method does not require the sample to conform to any specific probability distribution and works well even with missing or anomalous values. Significance of the trend was tested at the 5% significance level: a Z-score whose magnitude exceeds the critical value (±1.96) denotes a significant monotonic trend in rainfall. The variance of the test statistic S is used to calculate the significance of the trend. Detailed equations of MMK can be found in the relevant literature [42][43][44][45]. The magnitude of the slope was assessed using the Theil-Sen slope estimator [46]. These methods have been successfully applied in related studies [34,44,47].
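The sketch below illustrates the trend statistics under a simplifying assumption: it implements the classical Mann-Kendall test, not the modified version used in this study. Modified variants additionally rescale the variance of S to account for serial correlation, and that correction is omitted here. The function names and the sample series are illustrative.

```python
import numpy as np

def mann_kendall(x):
    """Classical Mann-Kendall test; returns the statistic S and its Z-score.

    Note: the modified test rescales var(S) for autocorrelation, which this
    simplified sketch does not do.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    i, j = np.triu_indices(n, k=1)            # all index pairs with i < j
    s = float(np.sign(x[j] - x[i]).sum())     # later-minus-earlier signs
    _, ties = np.unique(x, return_counts=True)
    # Variance of S with the standard correction for tied values
    var_s = (n * (n - 1) * (2 * n + 5)
             - np.sum(ties * (ties - 1) * (2 * ties + 5))) / 18.0
    if s > 0:
        z = (s - 1.0) / np.sqrt(var_s)
    elif s < 0:
        z = (s + 1.0) / np.sqrt(var_s)
    else:
        z = 0.0
    return s, float(z)

def sen_slope(x):
    """Theil-Sen slope: median of all pairwise slopes (per time step)."""
    x = np.asarray(x, dtype=float)
    i, j = np.triu_indices(x.size, k=1)
    return float(np.median((x[j] - x[i]) / (j - i)))

# Invented, strictly decreasing annual series
series = np.arange(10.0, 0.0, -1.0)
s_stat, z_score = mann_kendall(series)
significant = abs(z_score) > 1.96             # 5% two-sided threshold
slope = sen_slope(series)
```

A negative Z-score with magnitude above 1.96 corresponds to a significant decreasing trend at the 5% level, and the Theil-Sen slope gives its magnitude in the units of the series per time step.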

Annual Cycle
Structured assessment of climate models against observations is a prerequisite for gaining confidence in climate analysis. Appropriate climate model evaluation involves examining the mean state, trends, variability, vital physical processes, and emergent constraints [48]. In this study, the performance of CMIP6 models is examined based on their capability to mimic regional mean annual climatology, trends, and inter-annual variability. Figure 2 presents the annual rainfall cycle over North Africa. The climate of the study domain is mostly characterized by dry conditions, with rainfall mostly experienced during winter (DJF), while summer (JJA) is regarded as a "hotspot" climate [23]. This is because the region experiences stronger warming of the regional land-based hot extremes compared to the mean global temperature increase as defined by IPCC [49]. The dry (JJA) and wet (DJF) seasons over the Mediterranean are mainly regulated by the north/south oscillation of the Azores anticyclone [21]. The observed annual cycle peaks at 23 mm/month in December, and the lowest amount occurs between June and July (4 mm/month). Some models satisfactorily simulate the peak season of regional rainfall but fail to accurately mimic the rainfall pattern. This shows the capability of the majority of models to capture the mechanisms influencing local rainfall and the magnitude of the observed values. This is an encouraging result for the simulation of regional climate patterns in a region characterized by sparse gauge-based datasets.
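For illustration, the mean annual cycle and an ensemble mean can be derived from monthly series as sketched below. The sketch assumes each series starts in January and covers whole years, and it assumes an unweighted average across models; the function names and the two synthetic model series are hypothetical.

```python
import numpy as np

def annual_cycle(monthly):
    """Mean annual cycle (12 values, January to December) of a monthly series.

    Assumes the series starts in January and contains only whole years.
    """
    monthly = np.asarray(monthly, dtype=float)
    if monthly.size % 12 != 0:
        raise ValueError("series must contain whole years")
    # Rows are years, columns are calendar months
    return monthly.reshape(-1, 12).mean(axis=0)

def ensemble_mean(cycles):
    """Unweighted multi-model mean of per-model annual cycles (an assumption)."""
    return np.mean(np.asarray(cycles, dtype=float), axis=0)

# Two synthetic two-year series standing in for two hypothetical models
model_a = np.tile(np.arange(1.0, 13.0), 2)
model_b = model_a + 2.0
cycle = ensemble_mean([annual_cycle(model_a), annual_cycle(model_b)])
wettest_month = int(np.argmax(cycle)) + 1     # 1 = January, 12 = December
```

Averaging the cycles across models is what allows individual wet and dry biases to offset each other, which is one reason an ensemble mean often tracks the observed cycle more closely than its members.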
In agreement with expectation, the multi-model ensemble mean reproduces the observed features better than the individual models. The ensemble mean performs best from January to April and tends to overestimate the observed rainfall during the rest of the year. Moreover, some individual models (i.e., BCC-CSM2-MR, CESM2-WACCM, and NorESM2-LM) robustly reproduce the rainfall peaks, albeit with varying rainfall amounts (Figure 2). For instance, the CMIP6 models simulate low values during the JJA season. Notably, EC-Earth3-Veg simulates the lowest rainfall among the GCMs (<5 mm/month) between July and August. SAM0-UNICON and CanESM5 are unable to accurately replicate the observed annual climatology over North Africa. CNRM-CM6-1 and CNRM-ESM2-1 consistently overestimate the rainfall, specifically between September and February, by more than 5 mm/month. The performance of these two models is contrary to their behavior in a recent study [34] over Uganda, East Africa, where they underestimated rainfall more than all the other models. The large wet biases exhibited by CNRM-CM6-1 and CNRM-ESM2-1 call for further studies to establish an in-depth understanding of model performance in a region classified as hot and dry. Overall, nine of the 15 models overestimate the annual cycle of regional rainfall (Figure 2). The wet bias noted in most CMIP6 models could be attributed to challenges arising from the parameterization schemes, which tend to produce more rainfall than observed [5]. This is despite improvements such as higher spatial resolution compared with CMIP5 [11] and changes in the physical processes and biogeochemical cycles, among others [5].
Further analyses of spatial patterns of regional rainfall at monthly and seasonal timescales are presented in Figures A1 and A3, respectively. The observed monthly mean rainfall varies across the region from 0 to 50 mm/month, while the DJF mean ranges from 0 to 80 mm/season (Appendix B, Figures A1 and A3). The lowest monthly mean rainfall is recorded over the arid region of southern North Africa, while high monthly mean rainfall occurs over the northern parts of the study region (Figures A1 and A3). Most CMIP6 models tend to capture the observed patterns of rainfall distribution over the study domain. The spatial variation of rainfall over North Africa is primarily influenced by the southward passage of the anticyclone, which allows oceanic disturbances to reach the adjacent landmass [21]. The ensemble adequately estimates the rainfall across the entire domain. In contrast, CESM2-WACCM, CESM2, and SAM0-UNICON show a moderate overestimation of rainfall across the south of the study domain, ranging from 5 to 10 mm/month, while CanESM5, CNRM-CM6-1, and MRI-ESM2-0 depict slight underestimation in the middle and south of the domain (Figure A1). CNRM-CM6-1 and CNRM-ESM2-1 consistently overestimate rainfall over the north of the study area by 10-20 mm/month, reaching 80 mm/month in some regions (Figure 1). On the other hand, the BCC-CSM2-MR, BCC-ESM1, CanESM5, CESM2-WACCM, CESM2, EC-Earth3-Veg, SAM0-UNICON, and UKESM1-0-LL models underestimate rainfall (by 10-20 mm/month) in the northern parts of the study domain. GFDL-CM4, GFDL-ESM4, IPSL-CM6A-LR, and the ensemble mean outperform the other models in reproducing the observed monthly rainfall (Figure A1). Likewise, seasonal patterns (Figure 3) show enhanced rainfall over the northern parts, similar to the monthly patterns (Figure A1). However, more rainfall is simulated during DJF, with the highest amounts in CNRM-ESM2-1 and CNRM-CM6-1.
Remarkably, SAM0-UNICON depicts pronounced underestimation over the entire domain, with the lowest rainfall amount (<15 mm/season) (Figure 3). Overall, the best performing models in seasonal rainfall simulations over North Africa are CNRM-ESM2-1, CNRM-CM6-1, and GFDL-ESM4. Conversely, models that show unsatisfactory performance over the study region include SAM0-UNICON and CESM2.

Trend Analysis
The trends were evaluated and tested for their significance and magnitude over North Africa during 1951-2014. Table 1 shows the mean, slope, Z-score, and significance of the linear trend of annual and DJF rainfall for CRU, GPCC, and the CMIP6 models. The rainfall over the region exhibits significant decreasing trends for the annual and DJF series. CRU depicts a significant decreasing trend with a Z-score of −2.32 for both annual and DJF rainfall, whereas GPCC shows a significant decreasing trend during DJF (−2.32). These results agree with past studies over the study area [2,50,51]. To illustrate, Giorgi and Lionello [2] found a pronounced decrease in rainfall, especially in the warm season, and attributed it to increased anticyclonic circulation that yields increasingly stable conditions. Only two models, BCC-ESM1 and GFDL-ESM4, show a significant decrease for the annual and DJF series, with Z-scores of −2.1 and −2.35, respectively. The ensemble mean shows an insignificant decreasing trend in the annual (seasonal) analysis, with a Z-score of −1.05 (−1.16). Examination of the magnitude of the slope using Sen's slope shows that most models reproduce the observed trends satisfactorily (Table 1). Generally, the models' competence in reproducing the spatial variance and trends varies from one model to another and from one time slice to another. Nevertheless, the best performing models in simulating annual temporal trends over North Africa are BCC-ESM1, CESM2, GFDL-ESM4, and the ensemble mean. Conversely, models that perform poorly in mimicking the observed temporal trends include CNRM-CM6-1, EC-Earth3-Veg, and CESM2-WACCM.
Further analyses of spatial trends based on Sen's slope for annual and DJF rainfall are presented in Figures 4 and 5, respectively. The observed datasets display a slightly decreasing trend of −0.8 to 0 mm/year across the entire study region. Remarkably, the northern part of the study domain depicts much lower trend values of about −1.0 mm/year for both annual and DJF rainfall (Figures 4 and 5). Whereas most models satisfactorily simulate temporal trends of annual and seasonal rainfall over North Africa, spatial tendencies remain a challenge on an annual scale. Only a few models, such as BCC-ESM1, GFDL-CM4, and CNRM-ESM2-1, reproduce the observed trends. The majority of models show positive rainfall tendencies in a region that has experienced intense and prolonged drought during recent decades [52,53]. Owing to the diversity of the participating models, the ensemble did not perform well in capturing the observed trend. The significant decreasing rainfall trends observed in North Africa result from the recent amplification of anticyclonic circulation and a northward shift of the cyclonic depression tracks [2]. This has occasioned acute water shortages due to recurrent droughts recorded over recent decades [8]. Moreover, the continued decline in observed rainfall has affected most economic activities, with communities shifting from traditional agricultural activities to long-distance trade. This illustrates how some susceptible regions are forced to pursue drastic adaptive responses, including relocation and changes in societal structure [49]. Models that robustly reproduce the observed trends are promising for future climate diagnostics and projections, unlike the previous CMIP3 and CMIP5 versions, which presented varying outcomes for historical patterns and projections of North Africa climate [18,19,54].

Student's t-Test, Bias, Coefficient of Determination, RMSE, and ECDF Metrics
The skill of CMIP6 models is determined by their capability to exhibit low bias (close to 0), high R² (close to 1), and small RMSE (close to 0). Moreover, the ability of the models to mimic extreme rainfall events is examined based on the ECDF performance. Table 2 shows the Student's t-test analysis at the monthly time scale between CRU and the CMIP6 models. At p < 0.05, most CMIP6 models differ significantly from CRU, with the exception of BCC-CSM2-MR and UKESM1-0-LL. Figure A2 displays the spread in the biases simulated by the CMIP6 models over the North African region for monthly rainfall against CRU observations. Over the northwest of the study domain, CNRM-CM6-1 and CNRM-ESM2-1 reveal a positive bias of 16 to 32 mm/month. In contrast, the remaining models exhibit a negative bias of −2 to −32 mm/month in the same area. Over the southern parts of North Africa, SAM0-UNICON exhibits high positive values across the desert, ranging between 8 and 16 mm/month. BCC-CSM2-MR, BCC-ESM1, CanESM5, CESM2-WACCM, CESM2, CNRM-CM6-1, CNRM-ESM2-1, MRI-ESM2-0, UKESM1-0-LL, and the ensemble mean depict positive bias values (2 to 8 mm/month). Meanwhile, EC-Earth3-Veg, NorESM2-LM, GFDL-CM4, GFDL-ESM4, and IPSL-CM6A-LR slightly underestimate the observed rainfall (0 to −2 mm/month). The summary of the CMIP6 model spread with respect to the observational datasets for mean bias is presented in Figure 6. The overall distribution shows that most models tend to underestimate the observed rainfall, especially EC-Earth3-Veg, NorESM2-LM, and IPSL-CM6A-LR (around 75% of the distribution). Meanwhile, the bias distributions of CNRM-CM6-1 and CNRM-ESM2-1 are concentrated at positive values (nearly 75% of the distribution). GFDL-CM4 and the ensemble show the best performance, with a median value close to 0 mm/month and a comparatively narrow spread in their boxplots. Figure A3 shows the spatial coefficient of determination between the CMIP6 models and CRU over North Africa.
Remarkably, most CMIP6 models depict low spatial correlation across the entire study domain, displaying some weakness in reproducing the observed rainfall variability over North Africa. This could be attributed to the steady moisture outflow due to amplified evapotranspiration, increased radiation, and minimal cloud cover [55]. Across the northern part of the study area along the Mediterranean coast, the ensemble depicts enhanced R² values with the observed data, reaching 0.6 in some grid cells. This could be associated with the presence of a long coastline that results in more convective activity leading to more rainfall. It is evident from Table 2 that the ensemble has the most satisfactory R² value (0.66), followed by EC-Earth3-Veg (0.58), GFDL-CM4 (0.49), GFDL-ESM4 (0.45), and IPSL-CM6A-LR (0.44). Conversely, SAM0-UNICON scores the lowest (0.04). A similarly low correlation for SAM0-UNICON was also found over the Yangtze River in China [56] and over Uganda in East Africa [45]. Figure 7 displays the boxplot of the coefficient of determination of the CMIP6 models with respect to CRU observed data. In general, CMIP6 models exhibit low R² values, resulting in low medians. Interestingly, 75% of the R² distribution of 14 CMIP6 models lies below 0.1. Meanwhile, the ensemble and the IPSL-CM6A-LR model show enhanced results. IPSL-CM6A-LR performs comparatively better than the other individual CMIP6 models, scoring higher correlation values. This performance agrees with the spatial R² distribution (Figure A3), where IPSL-CM6A-LR shows a moderate correlation over the southern part of the study domain. However, its median remains low (<0.1). The ensemble mean shows the best performance, scoring the highest median (0.04) despite a comparatively large spread in its distribution. The highest R² value was scored by the ensemble mean (0.66).
The ensemble's good performance is also evident in Table 2, where it scores the highest temporal coefficient of determination (R² = 0.66). Consistent with other studies, the ensemble mean shows better performance due to the cancellation of systematic errors in the individual models. The spatial RMSE of mean monthly rainfall relative to CRU data over North Africa from 1951 to 2014 is presented in Figure A4. RMSE values are higher (30 < RMSE < 80 mm/month) over the northwest of the study domain, which features high-altitude topography. In contrast, low RMSE values of <10 mm/month are noted over the low-lying arid and semi-arid lands (ASAL). CESM2-WACCM, CESM2, and SAM0-UNICON exhibit RMSE values between 10 and 30 mm/month, which is high compared to the average monthly rainfall observed in this particular region. From Figure A4, the ensemble performs best, followed by IPSL-CM6A-LR, NorESM2-LM, EC-Earth3-Veg, and UKESM1-0-LL. The results show that the RMSE of most CMIP6 models varies with climatic features and topography: high RMSE is observed in high-altitude regions with a wet climate, while low RMSE is recorded in low-altitude regions with a dry climate.
Temporal patterns based on the boxplot of the RMSE distribution of CMIP6 data relative to CRU observed data are shown in Figure 8. The distribution of RMSE varies from one model to another. However, the bulk of the RMSE distribution for all GCMs is below 35 mm/month. Clearly, SAM0-UNICON shows the weakest performance, with high RMSE values and the widest spread of the RMSE distribution. In addition, it has the highest median value, followed by CNRM-CM6-1 and CNRM-ESM2-1. On the other hand, the ensemble shows the best performance among the models, followed by NorESM2-LM, IPSL-CM6A-LR, and EC-Earth3-Veg. The enhanced performance of the ensemble is also evident in Table 2, where it scores the lowest RMSE value (4.36 mm/month).

Empirical Cumulative Distribution Function (ECDF) Analysis
The ECDFs of mean monthly rainfall for the CMIP6 GCMs and their ensemble average against the two observational datasets, CRU and GPCC, are shown in Figure A5. The ECDFs depict the frequency of occurrence of anomalous rainfall on a monthly distribution (mm/month). Most CMIP6 models overestimate the monthly rainfall (Figure A5). The ensemble is similar to the observed rainfall distribution from 22 to 41 mm, with slight over/underestimation. CNRM-CM6-1, CNRM-ESM2-1, NorESM2-LM, IPSL-CM6A-LR, and UKESM1-0-LL show a large wet bias for rainfall between 27 and 54 mm/month. On the other hand, SAM0-UNICON, CanESM5, EC-Earth3-Veg, BCC-ESM1, and MRI-ESM2-0 show a dry bias ranging between 16 and 31 mm/month.
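As a brief illustration of how such curves are built, an ECDF can be computed from a sample of monthly rainfall values as sketched below. The function names and the three sample values are illustrative and do not come from the datasets used in this study.

```python
import numpy as np

def ecdf(values):
    """Return sorted values and the fraction of the sample <= each of them."""
    x = np.sort(np.asarray(values, dtype=float))
    p = np.arange(1, x.size + 1) / x.size
    return x, p

def ecdf_at(values, threshold):
    """Fraction of the sample that is <= threshold."""
    x = np.sort(np.asarray(values, dtype=float))
    return float(np.searchsorted(x, threshold, side="right")) / x.size

# Invented monthly rainfall sample (mm/month)
sample = [3.0, 1.0, 2.0]
sorted_values, cum_prob = ecdf(sample)
share_up_to_2mm = ecdf_at(sample, 2.0)
```

When a model's curve lies to the right of the observed curve over some range, the model assigns the same cumulative frequency to larger rainfall amounts there, which corresponds to a wet bias in that part of the distribution.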

Taylor Score and Ranking
The Taylor diagram and skill score are employed to evaluate the general skill of CMIP6 models in reproducing the spatial rainfall pattern. Figure 9 displays the Taylor diagram for annual rainfall climatology over North Africa based on CMIP6 models and observations. The radial coordinate represents the magnitude of the standard deviation, while the root mean square difference is illustrated by the concentric semicircles. The angular coordinate shows the correlation coefficient between the models and the observed dataset (CRU). The analyses demonstrate that SAM0-UNICON, CanESM5, and BCC-ESM1 show low variability relative to CRU data. In contrast, CESM2-WACCM, UKESM1-0-LL, GFDL-CM4, GFDL-ESM4, MRI-ESM2-0, and CNRM-CM6-1 exhibit high variability compared to CRU observations. IPSL-CM6A-LR, NorESM2-LM, BCC-CSM2-MR, CESM2, and EC-Earth3-Veg are in close agreement with CRU in terms of rainfall variability. SAM0-UNICON scored the weakest correlation (r = 0.2), and the ensemble scored the highest (r = 0.81). The remaining models have correlations between 0.4 and 0.76. In terms of the centered RMSE, the ensemble performs well, with a low value (0.6). On the other hand, the highest value was scored by SAM0-UNICON, making it the poorest performer among the models.
Finally, the overall model ranking is established using the Taylor skill score (TSS). The TSS values for the 15 CMIP6 models and their ensemble mean over North Africa are shown in Figure 10. TSS values closer to 1 indicate better performance. The values of the TSS for all the models are listed in Table 2.
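The three quantities shown on a Taylor diagram are linked by the law of cosines, E'² = σ_sim² + σ_obs² − 2 σ_sim σ_obs r, where E' is the centered root mean square difference. The sketch below computes them for two flattened spatial fields. The function name and sample values are hypothetical, and unweighted statistics are assumed (no area weighting of grid cells).

```python
import numpy as np

def taylor_statistics(obs, sim):
    """Standard deviations, correlation, and centered RMS difference.

    Statistics are unweighted (an assumption; area weights are not applied).
    """
    obs = np.asarray(obs, dtype=float).ravel()
    sim = np.asarray(sim, dtype=float).ravel()
    std_obs = obs.std()                        # population standard deviation
    std_sim = sim.std()
    r = np.corrcoef(obs, sim)[0, 1]
    # Remove each field's mean before differencing (centered difference)
    crmsd = np.sqrt(np.mean(((sim - sim.mean()) - (obs - obs.mean())) ** 2))
    return float(std_obs), float(std_sim), float(r), float(crmsd)

# Invented climatological values at five grid cells
obs_field = [1.0, 2.0, 3.0, 4.0, 5.0]
sim_field = [2.0, 2.0, 4.0, 5.0, 7.0]
std_o, std_s, corr, crmsd = taylor_statistics(obs_field, sim_field)
# Law-of-cosines identity that underlies the diagram's geometry
identity_rhs = std_s ** 2 + std_o ** 2 - 2.0 * std_s * std_o * corr
```

Because the mean is removed from both fields, the diagram summarizes pattern similarity only; an overall wet or dry bias has to be assessed separately, for example with the mean bias.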

Summary and Conclusions
In this study, 15 CMIP6 models were evaluated over North Africa for their performance in replicating rainfall. Their performance in reproducing the spatial and temporal distribution of rainfall over the region was assessed against two observational datasets, CRU and GPCC. In addition, prominent statistical metrics such as bias, RMSE, R², ECDF, and the Taylor skill score (TSS) were utilized to assess model performance in reproducing monthly and seasonal rainfall over the North Africa domain. Results show that CMIP6 models satisfactorily reproduced the mean annual climatology of dry/wet months. However, some models showed a slight over/underestimation across dry/wet months. The ensemble mean showed relatively good performance during the wet season.
In terms of the spatial distribution of rainfall, CMIP6 models captured both the generally wet and dry regions. However, they failed to reproduce the observed amounts well. GFDL-CM4, GFDL-ESM4, IPSL-CM6A-LR, and the ensemble showed better agreement with the observations in reproducing the spatial rainfall variability. The trend test results indicated that most CMIP6 models could reproduce the observed downward trend of rainfall over the study region. However, a few models showed the opposite tendency. Owing to the diverse performance of the CMIP6 models, the ensemble mean could not accurately reproduce the observed spatial trend. Most CMIP6 models showed a dry bias over North Africa (from 0 to 32 mm/month). In contrast, only CNRM-CM6-1 and CNRM-ESM2-1 depicted a wet bias in the same region, reaching 32 mm/month. GFDL-CM4 and the ensemble revealed the lowest spatial spread of bias across the study domain.
The spatial correlation between CMIP6 models and CRU indicates that the ensemble outperformed the other models, with a high correlation (0.66), followed by EC-Earth3-Veg with an R² of 0.58. The overall model ranking from all the analyses, including mean cycle simulation, trend analysis, inter-annual variability, ECDFs, and spatial pattern variability based on RMSE, bias, and R², is as follows: EC-Earth3-Veg, UKESM1-0-LL, GFDL-CM4, NorESM2-LM, IPSL-CM6A-LR, and GFDL-ESM4. The multi-model ensemble mean outperformed the individual CMIP6 models, with a TSS of 0.79. In contrast, the lowest score was recorded by SAM0-UNICON (0.36). Following the recommendations of [57], the six top-ranked GCMs will be used to build an ensemble for impact analysis and future projections of North Africa climate.

Conflicts of Interest:
No competing interest in the present study among the authors or any other body.