#### 2.2. Datasets

Two types of data were used in this study. The first consists of the daily rainfall data from thirty-five stations (Figure 1), available for the period 1950–2014; these data were obtained from the National Meteorology Agency of Benin (Météo Bénin). The spatial distribution of the stations is shown in Figure 1.

The missing-data rate is calculated over the whole recording period, which ends in 2014. This rate is higher for some stations that have been in operation since 1921 (Table 1). Missing data are more frequent before 1950, so the period 1950–2014 was retained for the analysis. For data processing, any year containing more than 10% missing values between April and October (the rainy season) is treated as missing, and the climate indices are not calculated for that year.
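This screening rule can be sketched as follows; the function name, the dict-of-dates data layout, and the threshold parameter are illustrative choices, not from the paper:

```python
from datetime import date

def year_is_usable(daily, year, max_missing_frac=0.10):
    """Return True if the rainy season (April-October) of `year` has at
    most `max_missing_frac` missing daily rainfall values.

    `daily` maps a datetime.date to a rainfall value; missing days are
    absent from the dict or stored as None.  Illustrative sketch only.
    """
    season_days = [date.fromordinal(d)
                   for d in range(date(year, 4, 1).toordinal(),
                                  date(year, 10, 31).toordinal() + 1)]
    missing = sum(1 for d in season_days if daily.get(d) is None)
    return missing / len(season_days) <= max_missing_frac
```

Years rejected by this test are simply skipped when the climate indices are computed.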

The second type of data used consists of the daily rainfall data from a set of simulations (scenarios) conducted with the regional climate model REMO. REMO is a three-dimensional, hydrostatic atmospheric circulation model which solves the discretized primitive equations of atmospheric motion. The REMO simulations are forced with data from the global climate model MPI-ESM-LR following the IPCC (Intergovernmental Panel on Climate Change) Representative Concentration Pathway (RCP) scenarios. The model characteristics are summarized in Table 2.

REMO data are available in the context of the Coordinated Regional Climate Downscaling Experiment (CORDEX) over Africa at 0.44° resolution for the period 1950 to 2100 [32], and they have already been used over Africa by [33,34,35,36] and particularly in Benin by [37]. Simulated precipitation from ten (10) regional climate models was evaluated by [38] against a number of detailed observational datasets at a range of time scales, including seasonal means and annual and diurnal cycles. According to their analysis, REMO produces good simulations of precipitation over West Africa. Furthermore, N’Tcha M’Po et al. [39] compared the ability of four regional climate models to reproduce daily precipitation characteristics, after bias correction, in the Ouémé watershed, which is the study area of this paper. They confirmed REMO’s high capacity to reproduce daily precipitation in this area compared with the other three models. In addition, a comparison of simulated and measured rainfall amounts in Benin has shown that REMO computes realistic precipitation amounts for the region [40]. The choice of this model is thus based on these prior results, in particular those of [38].

We used REMO projections for the period 2015–2050 from the CORDEX database, following the most extreme IPCC scenario, RCP8.5, and the intermediate scenario RCP4.5. We also used REMO historical data from 1973 to 2005 for bias correction. All these data are available online in the CORDEX database [41].
#### 2.3. Bias Correction

Several researchers have demonstrated that raw output from regional climate models (RCMs) cannot be used directly as input for impact models because of systematic biases [34,36,39,42,43]. Before the future precipitation indices were calculated, we therefore corrected the bias of the raw RCM output with a new quantile-quantile calibration method developed by [44], which is based on a nonparametric function that amends mean, variability, and shape errors in the simulated cumulative distribution functions (CDFs) of the climatic variables. Two studies comparing daily precipitation bias-correction methods have been carried out in Benin, namely N’Tcha M’Po et al. [39] and Obada et al. [45]. In these studies, six daily precipitation bias-correction methods were compared, and the new quantile method (AQM: Adjusted Quantile Mapping) proved the most suitable for reducing the bias of the daily precipitation simulated by RCMs in Benin. The procedure consists of calculating the changes, quantile by quantile, in the CDFs of daily RCM outputs between an x-year control period and successive x-year future time slices [39,44]. These changes are rescaled based on the observed CDF for the same control period and then added, quantile by quantile, to the observations to obtain new calibrated future CDFs that convey the climate change signal [44]. The choice of x depends on the length of the available observation datasets, but the x-year window chosen must have climatological meaning [44]. In this study, we chose 15-year periods because of the temporal limitation of the observed reference database (33 years, 1973–2005) and to remain consistent with N’Tcha M’Po et al. [39], who used the same stations. Furthermore, we consider a length of 15 years (a statistical sample of N = 5478 daily values) a good compromise: long enough to have climatological meaning, yet short enough to detect any climate change signal along the twenty-first century by comparing the simulated CDFs of successive 15-year slices. The reference period is the period for which both observations and REMO historical simulations are available. We calibrated the method over 15-year periods chosen between 1973 and 1990, while the period 1991–2005 was used to test the model and thereby assess the effect of the calibration period on model performance. In short, we have four (4) 15-year calibration periods within 1973–1990 (1973–1987, 1974–1988, 1975–1989, and 1976–1990), each containing 15 consecutive years. Based on the model efficiency over these calibration periods, we chose 1976–1990 as the baseline period for the correction of the projected data; the best model efficiency was obtained for the calibration period nearest (1976–1990) to the validation period (1991–2005).

Recall that our reference period extends from 1976 to 1990 and the future periods are 2015–2029 and 2030–2050. The statistical adjustment can be written as the following relationship between the ith percentile values ${P}_{i}$ (projected, or future corrected), ${O}_{i}$ (reference observed), ${S}_{ci}$ (raw reference simulated), and ${S}_{fi}$ (raw future simulated) of the corresponding CDFs:

${P}_{i}={O}_{i}+g\overline{\mathrm{\Delta}}+f\mathrm{\Delta}{\prime}_{i}$, with ${\mathrm{\Delta}}_{i}={S}_{fi}-{S}_{ci}$, $\overline{\mathrm{\Delta}}$ the mean of the ${\mathrm{\Delta}}_{i}$, and $\mathrm{\Delta}{\prime}_{i}={\mathrm{\Delta}}_{i}-\overline{\mathrm{\Delta}}$.

This is just a summary of the method; all details can be found in [44].

As surrogates of the population variability, Amengual et al. [44] proposed $IQ{R}_{|O}$ (interquartile range of the observed data) and $IQ{R}_{|Sc}$ (interquartile range of the raw control simulated data). $IQR$ is the difference between the 75th and 25th percentiles for all variables except precipitation, for which they proposed using the 90th and 10th percentiles owing to the highly asymmetrical gamma-type distribution of this variable, with its high proportion of non-rainy days. Factor $g$ modulates the variation in the mean state $\overline{\mathrm{\Delta}}$, while $f$ calibrates the change in variability and shape expressed by $\mathrm{\Delta}{\prime}_{i}$.
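Assuming that $g$ is the ratio of the observed to control-simulated means and $f$ the ratio $IQ{R}_{|O}/IQ{R}_{|Sc}$ (the roles the text assigns to these factors; their exact definitions are given in [44]), the percentile-wise adjustment can be sketched as:

```python
import statistics

def aqm_adjust(pcts, obs, sim_ctrl, sim_fut, lo=10, hi=90):
    """Sketch of a percentile-wise quantile-quantile adjustment.

    `pcts` lists the percentiles at which the CDFs were evaluated;
    `obs`, `sim_ctrl` and `sim_fut` give O_i, S_ci and S_fi at those
    percentiles.  `lo`/`hi` select the IQR percentiles (90th and 10th
    for precipitation, per the text).  Assumed forms of g and f.
    """
    delta = [sf - sc for sf, sc in zip(sim_fut, sim_ctrl)]   # Delta_i = S_fi - S_ci
    delta_bar = statistics.mean(delta)                       # mean change
    delta_prime = [d - delta_bar for d in delta]             # variability/shape change
    g = statistics.mean(obs) / statistics.mean(sim_ctrl)     # assumed: ratio of means
    f = ((obs[pcts.index(hi)] - obs[pcts.index(lo)])         # assumed: IQR_O / IQR_Sc
         / (sim_ctrl[pcts.index(hi)] - sim_ctrl[pcts.index(lo)]))
    # P_i = O_i + g * mean(Delta) + f * Delta'_i
    return [o + g * delta_bar + f * dp for o, dp in zip(obs, delta_prime)]
```

The returned values are the calibrated future percentiles that convey the climate change signal on top of the observed CDF.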

A difficulty arises for precipitation because the climate model overestimates the number of days with trace values and thus underestimates the number of non-rainy days, resulting in an unrealistic probability of precipitation in the simulations [38,46]. To overcome this problem, while respecting the internal dynamical evolution of the modeled climate scenario as regards the drying or moistening of the rainfall regimes, and following [44], we imposed an additional constraint: the ratio of non-rainy days between the future and control simulated raw data is maintained for the calibrated versus observed series, that is, $n{z}_{p}=\frac{n{z}_{{S}_{f}}}{n{z}_{{S}_{c}}}n{z}_{O}$, where $n{z}_{p}$, $n{z}_{O}$, $n{z}_{{S}_{c}}$, and $n{z}_{{S}_{f}}$ are the numbers of zeros in the projected, observed, simulated reference, and simulated future series, respectively.
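A minimal sketch of this constraint follows; enforcing it by zeroing the smallest calibrated values is an assumed implementation detail, since the text only states the ratio itself:

```python
def projected_dry_days(nz_obs, nz_sim_ctrl, nz_sim_fut):
    """nz_p = (nz_Sf / nz_Sc) * nz_O, rounded to a whole day count."""
    return round(nz_sim_fut / nz_sim_ctrl * nz_obs)

def enforce_dry_days(series, nz_p):
    """Force the nz_p smallest calibrated values to zero (assumed way
    of imposing the dry-day count on the calibrated series)."""
    order = sorted(range(len(series)), key=series.__getitem__)
    out = list(series)
    for i in order[:nz_p]:
        out[i] = 0.0
    return out
```

For example, if the simulation dries by 50% (nz_Sf/nz_Sc = 1.5), the calibrated series receives 1.5 times the observed number of dry days.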

All details about this bias correction method can be found in [39,44,45]. The MeteoLab toolbox was used to process the REMO data. MeteoLab is an open-source MATLAB toolbox for meteorology and climate, available at http://meteo.unican.es/en/meteolab.

#### 2.4. Temporal Trend Analysis

Many techniques can be used for analysing trends in time series, yet the one most commonly used by meteorologists is the Mann-Kendall (MK) test [3,12,19,21,48,49]. This test has two advantages. First, it is non-parametric: it does not require normally distributed data and has a low sensitivity to missing data [12]. Second, it has a low sensitivity to abrupt breaks due to inhomogeneous time series [19,49]. The null hypothesis H_{0} states that no trend is present in the series (the data are independent and randomly ordered); H_{0} is tested against the alternative hypothesis H_{1}, which assumes that a trend exists.

The Mann-Kendall statistic $S$ is calculated as follows:

$S={\sum}_{i=1}^{n-1}{\sum}_{j=i+1}^{n}\mathrm{sgn}({X}_{j}-{X}_{i})$

with

$\mathrm{sgn}(\theta )=\left\{\begin{array}{cl}+1,& \theta >0\\ 0,& \theta =0\\ -1,& \theta <0\end{array}\right.$

where ${X}_{j}$ and ${X}_{i}$ are the annual values in years j and i (j > i), respectively. If n < 10, the value of |S| is compared directly to the theoretical distribution of S derived by Mann and Kendall. At a given probability level α, H_{0} is rejected in favor of H_{1} if the absolute value of S equals or exceeds a specified value ${S}_{\alpha /2}$, where ${S}_{\alpha /2}$ is the smallest S that has a probability of less than α/2 of appearing in the absence of a trend. A positive (negative) value of S indicates an upward (downward) trend. For n ≥ 10, the statistic S is approximately normally distributed with mean ($E$) and variance (Var) as follows:

$E(S)=0$

$\mathrm{Var}(S)=\frac{n(n-1)(2n+5)-{\sum}_{j=1}^{m}{t}_{j}({t}_{j}-1)(2{t}_{j}+5)}{18}$

where m is the number of tied groups in the data set and t_{j} denotes the number of ties of extent j. The summation term in the numerator is used only if the data series contains tied values. For n ≥ 10, the values of S and Var(S) are used to calculate the standard test statistic Z as follows:

$Z=\left\{\begin{array}{cl}\frac{S-1}{\sqrt{\mathrm{Var}(S)}},& S>0\\ 0,& S=0\\ \frac{S+1}{\sqrt{\mathrm{Var}(S)}},& S<0\end{array}\right.$

In the same way, Kendall’s tau ($\tau$) statistic is calculated by:

$\tau =\frac{S}{D}$

where

$D={\left[\frac{1}{2}n(n-1)-\frac{1}{2}{\sum}_{j=1}^{m}{t}_{j}({t}_{j}-1)\right]}^{1/2}{\left[\frac{1}{2}n(n-1)\right]}^{1/2}$

The Z statistic is used to assess the significance of the trend; in fact, Z is used to test the null hypothesis H_{0}. If |Z| is greater than ${Z}_{\frac{\alpha}{2}}$, where α represents the chosen significance level (here α = 5%, so ${Z}_{0.025}=1.96$), then the null hypothesis is rejected, implying that the trend is significant.
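The computation of S, the tie-corrected variance, and the normal-approximation Z described above can be sketched as:

```python
import math
from collections import Counter

def mann_kendall(x, z_crit=1.96):
    """Mann-Kendall trend test: returns S, Z (normal approximation for
    n >= 10, with tie correction), and significance at the z_crit level
    (1.96 for alpha = 5%, as used in the text)."""
    n = len(x)
    # S = sum over all pairs j > i of sgn(X_j - X_i)
    s = sum((x[j] > x[i]) - (x[j] < x[i])
            for i in range(n - 1) for j in range(i + 1, n))
    # tie correction: t_j = size of each group of identical values
    ties = [t for t in Counter(x).values() if t > 1]
    var_s = (n * (n - 1) * (2 * n + 5)
             - sum(t * (t - 1) * (2 * t + 5) for t in ties)) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z, abs(z) > z_crit
```

A strictly increasing 12-value series, for instance, gives S = n(n−1)/2 = 66 and a Z well above 1.96, so the trend is flagged as significant.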

The second method used to determine the temporal trend of the precipitation indices is linear regression (LR). Linear regression is a parametric approach used to test for linear temporal trends [19,50]. Ordinary least squares regression is used to fit the “best” straight line. A linear trend is reported when the slope of the regression line is shown to be statistically different from zero; a positive slope indicates an increasing trend and a negative slope a decreasing trend [19,21,50]. The method of linear regression requires the assumptions of normality of the residuals, constant variance, and true linearity of the relationship. Linear regression is widely used in trend analyses of climatological variables [19,21,51]. Both methods were used to detect historical trends in the climate indices over the period 1950–2014 and future trends from 2015 to 2050.
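A minimal sketch of the ordinary least squares trend test: the slope is fitted against time steps, and the returned t statistic (slope divided by its standard error) is what is compared against a Student-t critical value with n − 2 degrees of freedom to decide whether the slope differs from zero. The helper name is illustrative:

```python
import math

def ols_trend(y):
    """OLS slope of y against time steps 0..n-1, plus the t statistic
    for testing H0: slope = 0."""
    n = len(y)
    tbar = (n - 1) / 2.0                      # mean of time steps 0..n-1
    ybar = sum(y) / n
    sxx = sum((i - tbar) ** 2 for i in range(n))
    slope = sum((i - tbar) * (yi - ybar) for i, yi in enumerate(y)) / sxx
    intercept = ybar - slope * tbar
    # residual sum of squares -> standard error of the slope
    rss = sum((yi - (intercept + slope * i)) ** 2 for i, yi in enumerate(y))
    se = math.sqrt(rss / (n - 2) / sxx)
    tstat = slope / se if se > 0 else math.inf
    return slope, tstat
```

A positive slope then indicates an increasing trend and a negative slope a decreasing one, as stated above.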

When a monotonic trend is detected, its magnitude is estimated by Sen’s slope method [52]. Sen’s slope β corresponds to the median of the slopes calculated over each pair of points in the time series, where the measurements are taken at regular intervals:

$\beta =\mathrm{median}\left(\frac{{X}_{j}-{X}_{i}}{j-i}\right),\phantom{\rule{1em}{0ex}}j>i$

where ${X}_{i}$ and ${X}_{j}$ are the data values at time steps i and j (j > i), respectively.
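Sen’s slope as defined above can be computed directly:

```python
import statistics

def sens_slope(x):
    """Sen's slope: median of the pairwise slopes (x[j] - x[i]) / (j - i)
    over all pairs j > i, for a series sampled at regular time steps."""
    n = len(x)
    slopes = [(x[j] - x[i]) / (j - i)
              for i in range(n - 1) for j in range(i + 1, n)]
    return statistics.median(slopes)
```

Because it takes a median over all pairwise slopes, the estimate is robust to outliers in a way the OLS slope is not.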

Both statistical tests (linear regression and Mann-Kendall) were performed using the XLSTAT software. In each case, the p-value is calculated and compared to the significance threshold adopted here.