Evaluation of the Performance of CMIP6 HighResMIP on West African Precipitation

This research evaluates the High-Resolution Model Intercomparison Project (HighResMIP) simulations within the framework of the Coupled Model Intercomparison Project Phase 6 (CMIP6). We used seven of its models to study how well CMIP6 reproduces West African precipitation features during the 1950–2014 historical simulation period. Rainfall was studied for two sub-regions of West Africa, the Sahel and the Guinea Coast. Precipitation datasets from the Climate Research Unit (CRU) TS v4.03, University of Delaware (UDEL) v5.01, and Global Precipitation Climatology Centre (GPCC) were used as observational references in order to account for observational uncertainty. The observed annual peak during August, which is greater than 200, 25, and 100 mm/month in the Guinea Coast, the Sahel, and West Africa as a whole, respectively, appears to be slightly underestimated by some of the models and the ensemble mean, although all the models captured the general rainfall pattern. The global climate models (GCMs) and the ensemble mean reproduced the spatial pattern of daily precipitation in the monsoon season (June to September) over West Africa, with a high correlation coefficient exceeding 0.8 for the mean field and a relatively lower correlation coefficient for extreme events. Individual models, such as IPSL and ECMWF, tend to show high performance, but the ensemble mean appears to outperform the individual models in reproducing West African precipitation features. The results of this study show that merely improving the horizontal resolution may not remove biases from CMIP6.


Introduction
The West African region is one of the most densely populated regions in the world, with close to 400 million inhabitants in 2018 [1]. Most of its key economic sectors, such as agriculture, power generation, and industry, strongly depend on the climate, of which seasonal rainfall forms a key component [2,3]. The study of climate and its change over the region is challenging because the region is vulnerable, with a low capacity to adapt to observed climate change [4], and because the observed precipitation datasets vary from one location to another [5][6][7][8]. The fifth assessment of the Intergovernmental Panel on Climate Change (IPCC) [9] reported that most of the African region lacks

Model Datasets
The daily precipitation of the historical simulations for the period 1950-2014 from seven high-resolution models (see Table 1) was used in this study. The data are archived at the Earth System Grid Federation (ESGF) website under CMIP6 HighResMIP (https://esgf-node.llnl.gov/search/cmip6/).

Model Evaluation
Because observational precipitation datasets over West Africa are sparse, three products were used as reference data to evaluate the ability of the GCMs to reproduce precipitation and to account for observational uncertainty: daily data from the Global Precipitation Climatology Centre (GPCC) 2018 version (https://opendata.dwd.de/climate_environment/GPCC/html/fulldata-daily_v2018_doi_download.html), and monthly data from the University of Delaware (UDEL) v5.01 (https://psl.noaa.gov/data/gridded/data.UDel_AirT_Precip.htm) and the Climate Research Unit (CRU) TS v4.03 (https://crudata.uea.ac.uk/cru/data/hrg/). GPCC is a rain-gauge-based gridded precipitation dataset with a horizontal resolution of 1° × 1° [28]. The dataset has a quality control program to check for errors in the precipitation data received from all over the world. We used it in this study to evaluate how the high-resolution GCMs simulate daily climatology and extreme events over the study area. CRU TS v4.03 and UDEL v5.01 are produced at 0.5° resolution by the Climate Research Unit, School of Environmental Sciences, University of East Anglia, and the Centre for Climate Research and Department of Geography, University of Delaware, respectively; these were used to examine the performance of the GCMs in simulating the annual cycle and trend of precipitation. Because the horizontal resolution differs between datasets (see Table 1), we re-gridded each model's data using bilinear interpolation so that it has the same grid resolution as the observation data.
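The re-gridding step can be illustrated with a minimal sketch of bilinear interpolation on a regular latitude-longitude grid. This is not the authors' code: the function name `bilinear_regrid`, the plain nested-list layout, and the choice to clamp target points to the edge of the source grid are all assumptions made for illustration, and longitude wrap-around is not handled.

```python
from bisect import bisect_right


def bilinear_regrid(src_lats, src_lons, field, dst_lats, dst_lons):
    """Bilinearly interpolate field[i][j], defined at (src_lats[i], src_lons[j]),
    onto the target grid (hypothetical helper, not from the paper).

    Coordinates must be strictly increasing with at least two points per axis.
    Target points outside the source grid are clamped to its edge.
    """

    def bracket(axis, x):
        # Index of the lower neighbour and fractional distance to the upper one.
        x = min(max(x, axis[0]), axis[-1])
        i = min(bisect_right(axis, x) - 1, len(axis) - 2)
        return i, (x - axis[i]) / (axis[i + 1] - axis[i])

    out = []
    for lat in dst_lats:
        i, ty = bracket(src_lats, lat)
        row = []
        for lon in dst_lons:
            j, tx = bracket(src_lons, lon)
            # Interpolate along longitude on the two bounding latitude rows,
            # then along latitude between those two results.
            south = (1 - tx) * field[i][j] + tx * field[i][j + 1]
            north = (1 - tx) * field[i + 1][j] + tx * field[i + 1][j + 1]
            row.append((1 - ty) * south + ty * north)
        out.append(row)
    return out
```

In practice a model field would be interpolated onto the observation grid (1° for GPCC, 0.5° for CRU TS and UDEL), for example `bilinear_regrid(model_lats, model_lons, model_field, obs_lats, obs_lons)`, where all five argument names are placeholders.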

Statistical Analysis
The arithmetic average of the CMIP6 model datasets was taken as the ensemble mean. A Taylor diagram and spatial analysis were used to evaluate climatology and extreme events, while temporal plots were used for the annual precipitation cycle and trend analysis. We calculated the mean bias for both climatology and extreme events as the deviation of each simulation from the observation; for extreme events, the observational reference is the mean of the 95th percentile of the observation. The ability of the CMIP6 models to simulate precipitation change from one period to the other was investigated by taking the difference between precipitation values for 1955-1984 and 1985-2014. The changes were compared, and differences at p < 0.10 were considered significant. The standardized anomaly in each model was analyzed to determine the trends and to study how each model simulates the wet and dry seasons with respect to the observation. The standardized anomaly was calculated as $(R - \bar{R}) / \mathrm{std}$, where $R$ is the raw time series, $\bar{R}$ is the mean of the time series, and std is the standard deviation. Extreme precipitation (R95p) and its return levels were extracted by calculating the 95th percentile from the daily datasets.
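The quantities named in this section can be sketched in a few lines of standard-library Python. The function names are hypothetical, and several details are assumptions because the text does not specify them: the population standard deviation is used for the anomaly, the percentile uses linear interpolation between order statistics over all days in the series, and the mean bias is taken as simulation minus observation.

```python
from statistics import fmean, pstdev


def ensemble_mean(members):
    """Arithmetic average across models; each member is an equally long series."""
    return [fmean(values) for values in zip(*members)]


def standardized_anomaly(series):
    """(R - mean(R)) / std for each value (population std is an assumption)."""
    mean, std = fmean(series), pstdev(series)
    return [(r - mean) / std for r in series]


def percentile(values, q):
    """q-th percentile (0-100), linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def mean_bias(simulated, observed):
    """Mean of (simulated - observed); negative values mean underestimation."""
    return fmean(s - o for s, o in zip(simulated, observed))
```

For a daily series at one grid cell, `percentile(daily_precip, 95)` would give the R95p threshold under these assumptions, and `mean_bias` applied to simulated and observed R95p values would give the extreme-event bias.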

Annual Cycle
Figure 2a-c presents the annual cycle of precipitation over the Guinea Coast, Sahel, and West Africa as a whole. The amount of rainfall in the monsoon season (June-September) is higher in the Guinea Coast than in the Sahel. The individual GCMs and the ensemble mean captured the high/low rainfall over the Guinea Coast and Sahel adequately. The observed annual peak during August was greater than 200, 25, and 100 mm/month in the Guinea Coast, Sahel, and West Africa as a whole, respectively. This appears to be slightly underestimated by some of the models and the ensemble mean, although all models captured the general rainfall pattern. In the Guinea Coast (Figure 2a), the rainy season lasted from May to October, with peak rainfall in August; this observation was well captured by most GCMs, and was also noted by Abatan [29], although the GCMs show some variation in the intensity of precipitation. The NICAM and MRI models simulated early onset and late cessation. In comparison, the IPSL model appears to perform well over the Guinea Coast and West Africa, with an exact peak in precipitation during August (Figure 2a,c), while the NICAM model simulates the pattern of precipitation over the Sahel sub-region better than the others (Figure 2b).

Figure 3 depicts the spatial distribution of daily precipitation from June to September. Most of the models adequately captured daily precipitation over the region, with rain bands from 3 to 24 mm/day. There was slight to moderate variation in the amount of precipitation simulated by the models with respect to the GPCC. The daily rainfall pattern was well simulated by the GCMs and the ensemble mean. The majority of areas with less than 8 mm/day precipitation, and the areas with the most precipitation (i.e., the Guinea high ground including its sub-region and southern Nigeria), were captured adequately.
Over the southeast and southwest areas of West Africa, daily precipitation was above 20 mm/day, and this was reproduced by the GCMs and the ensemble mean; these findings were also reported by [30][31][32][33][34][35].

Figure 4 shows the deviation of the ensemble mean and its constituent models from observations. The ensemble mean, ECMWF, CNRM, HADGEM, and MPI showed moderate underestimation in the range of 1-5 mm/day, while IPSL, MRI, and NICAM showed consistent overestimation of precipitation in the Guinea Coast ranging from 1-6 mm/day. There were moderate overestimations in Jos, Nigeria by the CNRM and IPSL models, ranging from 3-4 mm/day, while the MRI and NICAM models overestimated by more, ranging from 10-12 mm/day. The ensemble mean, followed by the ECMWF model, outperformed the other models, with less deviation from observational data.

Atmosphere 2020, 11, 1053
We further evaluated the performance of HighResMIP of CMIP6 using a Taylor diagram, which quantitatively measures how well the simulated and observed patterns match each other in terms of spatial correlation coefficient, root-mean-square error (RMSE), and the ratio of standard deviation [36]. As shown in Figure 5, the ensemble mean and its members showed a high spatial correlation coefficient exceeding 0.8. The RMSEs relative to the observation were below 0.5 for all simulations except CNRM. The ratios of standard deviations between the simulations and the observation were around 1. In comparison, the ECMWF model had the highest spatial correlation and the lowest RMSEs, followed by the ensemble mean; MRI seems to perform better in terms of the standard deviation, which is the closest to the observation, followed by CNRM and the ensemble mean. These results demonstrate that both the ensemble mean and individual models have a strong performance in simulating spatial patterns of precipitation over West Africa, and the ensemble mean outperforms its constituent members overall.
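The three statistics summarized by a Taylor diagram can be computed as in the following sketch. It assumes the simulated and observed fields have already been re-gridded to a common grid and flattened into equally long lists, that both statistics are normalized by the observed standard deviation (which the reported values near 1 suggest but the text does not state), and that no area weighting is applied. The function name is hypothetical.

```python
from math import sqrt
from statistics import fmean, pstdev


def taylor_statistics(simulated, observed):
    """Return (pattern correlation, std ratio, normalized centred RMSE)."""
    sim_mean, obs_mean = fmean(simulated), fmean(observed)
    sim_std, obs_std = pstdev(simulated), pstdev(observed)
    # Remove each field's own mean so only the pattern is compared.
    sim_anom = [s - sim_mean for s in simulated]
    obs_anom = [o - obs_mean for o in observed]
    n = len(observed)
    covariance = sum(a * b for a, b in zip(sim_anom, obs_anom)) / n
    correlation = covariance / (sim_std * obs_std)
    centred_rmse = sqrt(sum((a - b) ** 2 for a, b in zip(sim_anom, obs_anom)) / n)
    return correlation, sim_std / obs_std, centred_rmse / obs_std
```

With this normalization the three values satisfy the identity underlying the diagram, $E'^2 = \sigma^2 + 1 - 2\sigma R$, where $E'$ is the normalized centred RMSE, $\sigma$ the standard deviation ratio, and $R$ the correlation, which is why a single point can display all three.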

Trend Analysis
Climate change is an important aspect of climate study, and mean climatology alone may not be sufficient to reveal whether a model can simulate climate change [37]. The difference between the precipitation simulated by the CMIP6 models for the historical periods 1955-1984 and 1985-2014 was taken to quantify the percentage of the West African region (grid) that has experienced a significant decrease or increase in precipitation. Table 2 presents the results of the analysis. Observations show that some grids exhibit decreasing or increasing trends, while others show no significant change. The extent of the trend is not the same between CRU TS v4.03 and UDEL v5.01. CRU TS v4.03 shows that 32% of the grids had a significant decrease in precipitation, while 1% of the grids had a significant increase and 67% of the region showed no significant precipitation change. UDEL v5.01 shows that 18% of the grids had a significant decrease, while 82% of the region had no significant precipitation change. The uncertainty in interpolation could cause a variation in the precipitation estimates [38]. In contrast with observation, ECMWF shows that 17% of the grids had a precipitation increase and 23% a precipitation decrease, while 60% of the region showed no significant precipitation change. NICAM had 4% of the grids showing increasing trends and 21% showing decreasing trends, while 75% of the grids showed no significant change. Among the simulations, NICAM performed best in simulating the observed decreasing trends among the grids. In this analysis, the ensemble mean did not outperform individual models, and this could be attributed to the low performance of its members in capturing the change.

The ability of the GCMs to capture observed trends from the 1950s to the 2000s, using the standardized precipitation anomaly, is presented in Figure 6. As shown by the observations, there was a decreasing trend from the 1950s to the mid-1980s, followed by a slightly increasing trend afterwards. Furthermore, the two observation datasets showed wet and dry periods during the 1950s to 1960s and the 1970s to 1980s, respectively, as also noted by [39]. Most of the GCMs and the ensemble mean captured the observed trends as well as the wet and dry periods. IPSL performed best in capturing the observed trends in wet and dry periods. The performance of the GCMs demonstrates their capability to simulate the wet and dry periods, which are regular features of West African precipitation.
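The grid-percentage analysis can be sketched as follows. The text does not name the significance test, so this example assumes a two-sided Welch-type test with a normal approximation for the p-value, applied per grid cell to the annual values of each 30-year period; the function names and the test itself are illustrative assumptions, not the authors' method. Only the threshold p < 0.10 comes from the text.

```python
from math import sqrt
from statistics import NormalDist, fmean, variance


def classify_change(early, late, alpha=0.10):
    """Label one grid cell 'increase', 'decrease' or 'none' from two samples
    of annual precipitation (assumed test: two-sided Welch-type z-test)."""
    diff = fmean(late) - fmean(early)
    std_err = sqrt(variance(early) / len(early) + variance(late) / len(late))
    if std_err == 0:
        # Degenerate case: no variability in either period.
        return "none" if diff == 0 else ("increase" if diff > 0 else "decrease")
    p_value = 2 * (1 - NormalDist().cdf(abs(diff) / std_err))
    if p_value >= alpha:
        return "none"
    return "increase" if diff > 0 else "decrease"


def percent_by_class(cells, alpha=0.10):
    """Percentage of grid cells per class; cells holds (early, late) pairs."""
    labels = [classify_change(early, late, alpha) for early, late in cells]
    classes = ("decrease", "increase", "none")
    return {name: 100 * labels.count(name) / len(labels) for name in classes}
```

Applying `percent_by_class` to every grid cell of an observation dataset or a model would yield the three percentages of the kind reported in Table 2, subject to the assumed test.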

Extreme Precipitation Event
Extreme event analysis is an important aspect of climate change research, as many parts of the region regularly receive excess rainfall that frequently leads to flooding. Knowing how well the GCMs simulate such events will aid preparation for them. Figure 7 presents the spatial pattern of extreme events in the study area. Observations showed more than 60 mm/day precipitation over coastal areas of Sierra Leone, Liberia, and Nigeria. The remaining part of the Guinea Coast had from 30 to 40 mm/day. In comparison with the observational data, the ensemble mean and its constituent models showed a similar pattern of extreme precipitation; however, there was consistent underestimation by most models throughout the Guinea Coast, except for the NICAM model, which overestimated precipitation in most of the region. Figure 8 shows the magnitude of deviation of the GCMs and the ensemble mean from observation in simulating extreme events. The ensemble mean and its constituents showed moderate positive and negative bias (+6 mm/day and −6 mm/day) over the Sahel sub-region. Over the Guinea Coast, a negative bias of moderate to high magnitude was predominant, varying from −6 to −30 mm/day for the ensemble mean and the ECMWF, CNRM, HADGEM, IPSL, MPI, and MRI models.

Figure 8. The spatial pattern of mean bias of 95th percentile precipitation with respect to GPCC data.
The performance of each CMIP6 model in reproducing the spatial pattern of extreme precipitation events was examined further using the Taylor diagram (Figure 9). As observed in Figure 9, the majority of CMIP6 models and the ensemble mean showed a spatial correlation coefficient of 0.4 to 0.7, with the exception of HADGEM. The RMSEs of the CNRM, ECMWF, IPSL, MPI, and MRI models were less than 1, while the ratio of standard deviation between the observation and simulation data was also around 1. The comparison showed that the ensemble mean had the highest spatial correlation and lowest RMSE. The NICAM model appeared to perform better with regard to the ratio of the standard deviation, having the closest value to the observation. This analysis shows that the majority of the seven HighResMIP models used can simulate the spatial pattern of extreme precipitation events over West Africa, and the ensemble mean outperformed any single model.

Return Level
The precipitation return level for the 30-year period of 1985-2014 is shown in Figure 10 for the observation, the GCMs, and the ensemble mean. The observational data showed return levels of the extreme events of up to 80 mm/day of precipitation over most of the Guinea Coast and the southernmost part of the Sahel, with the highest return level over the Guinea high ground. The ensemble mean, ECMWF, CNRM, HADGEM, IPSL, MPI, and MRI underestimated the return level over most regions, whereas the NICAM model showed the highest overestimation, over the Guinea Coast. The MRI and HADGEM models performed well over the Guinea high ground. The deviation of the GCMs from observations of the extreme events' return level, based on grids, is shown in Table 3. The IPSL model showed that 21% of the grids were underestimated and 77% of grids showed no significant change; the MRI and HADGEM models both underestimated 22% of grids, while 78% showed no significant change. The ensemble mean did not perform well in this analysis because most of the GCMs failed to capture the return levels of the extreme events (R95p).

Table 3. Return level percentage grid change between the observation and GCMs.

Conclusions
In this study, we evaluated the precipitation simulation performance of seven models from the CMIP6 HighResMIP over the West African region. Several metrics were employed to determine the performance of the HighResMIP models over the study region. The main objective was to evaluate the impact of the models' new high-resolution feature in representing the precipitation characteristics of the West African region. The evaluation results from the annual cycle showed that the ensemble mean and its constituent models adequately reproduced the annual precipitation pattern. Most models' simulations underestimated the magnitude of precipitation, except for the IPSL and NICAM models, which appeared to overestimate precipitation over the Guinea high ground. Most of the models simulated August as the peak of precipitation over the two homogeneous regions and the whole of West Africa. The ensemble mean most closely matched the observed annual pattern of precipitation, and therefore outperformed the individual models. In further analysis, a comparison of the simulated spatial distribution of precipitation with observational data showed that the ensemble mean and its members reproduced the pattern of precipitation over most of the rainy areas (the Guinea high ground and Nigeria's coastal region). A Taylor diagram was used to examine quantitatively how well the model simulations and observational data matched each other. While the results showed that the capacity of individual models cannot be discarded, the ensemble mean outperformed its ensemble members. The performance of the HighResMIP in simulating reduced precipitation over some grid areas was also examined. In this respect, NICAM outperformed the other models. The standardized anomaly was used to determine precipitation trends, and the result showed that the wet and dry periods are well simulated by the majority of the GCMs, with IPSL the highest performing model among the ensemble members.
The anomaly of R95p was used to plot a Taylor diagram to evaluate the performance of the HighResMIP GCMs in simulating extreme events. Most of the models had a relatively low correlation with the observational dataset and a root-mean-square error of less than 1; in this analysis, the ensemble mean performed better than the individual models. The spatial distribution of the extreme events and return level was also investigated, and most of the models underestimated the extreme events, except the NICAM model. This feature of NICAM suggests that its high simulated extreme events might be closely related to the high resolution. Although attributing such extreme behavior in NICAM goes beyond the scope of this paper, this result provides evidence that high-resolution GCMs without regional components such as topography, vegetation, or waterbodies may be biased in providing accurate and regionally resolved climate information. This study was limited to the seven available HighResMIP models; with the release of more models, a more robust evaluation may be possible in the future. The unavailability of corresponding projection data for the climate models used precluded the assessment of future projections of precipitation over West Africa, but this remains a highly relevant task for future studies in the region.