
*Water* **2013**, *5*(4), 1561-1579; doi:10.3390/w5041561


## Abstract

This paper focuses on uncertainty analysis to aid decision making in applications of statistically modeled flow-duration-frequency (FDF) relationships of both daily high and low flows. The analysis is based on 24 selected catchments in the Lake Victoria basin in Eastern Africa. The FDF relationships were derived for aggregation levels in the range 1–90 days for high flows and 1–365 days for low flows. The validity of the projected FDF quantiles for high return periods T was checked using growth factor curves. Monte Carlo simulations were used to construct confidence intervals (CIs) on both the estimated Ts for given flows and the estimated FDF quantiles for given T. The average bias of the modeled T of high and low flows is lower than 8% for all catchments and for Ts up to 25 years. Despite this relatively small average bias in the modeled T, the limits of the CI on the modeled 25-year flows go up to more than 100% for high flows and more than 150% for low flows. The assessed FDF relationships and accompanying uncertainties are useful for various types of risk-based water engineering and water management applications related to floods and droughts.

## 1. Introduction

In support of water related risk analysis, there is a great need for frequency analysis for both high and low flow extremes for proper management applications related to floods and droughts. Examples of applications include reservoir operations, irrigation, hydropower scheduling, industrial planning, flow control for ecological purposes e.g., compensation flows and dilution flows for improving the quality of water for treatment plants or power generating plants. Proper management of water resources under global climate change and/or anthropogenic influence is an important key to development. It requires an accurate descriptive study of hydrological extremes and their recurrence rates, at the relevant scales, based on long-term time series of observations of rainfall intensities, discharges or water levels. One approach to obtain substantially compressed frequency information on such extremes from a hydrological time series is through extreme value analysis for a range of aggregation levels to constitute Amplitude-Duration-Frequency (ADF) relationships (FDF or IDF, for discharges or rainfall respectively). Aggregation levels are simply durational intervals over which the hydrological values are averaged. Premised on such durations, the conditional relationships are essentially cumulative functions of the amplitude values in the time series [1].

ADF relationships have been presented in a number of studies (see e.g., [2,3,4,5,6,7,8,9,10]). These relationships are very important in water engineering. According to Nhat [11], ADF relationships are among the most commonly used tools in water resources engineering, either for planning, designing and operating of water resource projects, or for various engineering projects against floods. They are used to construct design storms for hydrological modeling applications [1,12]. Another application is the calibration and validation of stochastic rainfall generators [13]. Several studies have used ADF relationships to assess the impact of climate change and/or variability on hydrological extremes (see e.g., [8,14,15,16,17,18]).

These studies did not, however, include uncertainty analysis of the calibrations or communicate confidence intervals (CIs) on the ADF relationships. Some studies, including [19,20], do take uncertainty into account for flow duration curves. Uncertainties in the FDF relationships can nevertheless be large in data-scarce regions, as is shown in this paper for the basin of Lake Victoria in Eastern Africa. In such cases it is important to take these uncertainties into account in water resources management and decision making.

According to Ayyub [21], the need to model and analyze uncertainties stems from the awareness that data abundance does not necessarily give us certainty, and sometimes can lead to overwhelmingly confusing situations and/or a sense of over-confidence leading to improper use of information. The former can be an outcome of the limited capacity of the human mind in some situations to deal with complexity and data abundance, whereas the latter can be attributed to a higher order of ignorance, called the ignorance of self-ignorance. Predictive uncertainty, its quantification and its reduction, is a key issue in statistical modeling of a hydrological variable because it allows the degree of confidence in estimated results to be judged. It can then be taken into account in decision making in water resources management. Several approaches exist to assess errors in statistical models, among which the Monte Carlo technique is commonly used. Examples of Monte Carlo simulations applied in statistical modeling can be found in many studies (see e.g., [8,22,23,24,25,26]). Other approaches are the jackknife method [27] and the typical-value principle [28], which both construct CIs by employing subsample values of a general statistic as the building block. Bootstrapping was introduced by Efron [29] to further strengthen the jackknife method of estimating bias or standard error. Other than probabilistic approaches, the generalized likelihood uncertainty estimation (GLUE) technique of Beven and Binley [30] and Bayesian methods are commonly employed. The aforementioned methods deal with a number of uncertainties as elaborated in [31]. This study, however, considers only the sampling uncertainty of quantile estimates, by applying parametric bootstrapping.

Consequently, this study aims not only at statistically modeling the FDF relationships but also at carrying out an in-depth analysis of errors and uncertainty for communication of the research findings to water resources managers and environmental policy makers. The parametric bootstrapping Monte Carlo method is used to establish CIs on the FDF-based flow quantiles. Root mean square error and average bias on these quantiles and on the parameters describing the FDF relationships are also estimated to understand uncertainty.

## 2. Study Area and Data Series

Lake Victoria is the world’s second largest freshwater lake but has a relatively small drainage basin of about 184,000 km^{2}, slightly less than three times the Lake’s surface area. The Lake’s basin is situated at an altitude of 1134 m above mean sea level and stretches 355 km from west to east between longitudes 31°37' E and 34°53' E, and 412 km from north to south between latitudes 00°30' N and 3°12' S. Low-lying parts close to Lake Victoria are characterized by episodes of floods, for instance, downstream of River Nzoia, around Budalang’i [32,33,34]. Despite the episodes of flooding, low flows tend to subsequently punctuate the basin’s hydrology, often for long periods of time. Severe droughts in the Lake Victoria basin can be expected on average about every seven to eight years during the hot dry season from December to February [35]. According to Otieno and Awange [36], these droughts affected power production with resulting economic losses. According to Awange et al. [35], Lake Victoria and its environment have more recently come under threat from declining water levels, which has had a number of social and economic effects.

This study is based on 24 selected catchments in the Lake Victoria basin. Long discharge time series, preferably longer than 25 years, were used. Figure 1 shows the locations of the discharge measurement stations used in this study, and Table 1 shows for these stations the flow record length, coordinates, area of the catchment upstream of each station, and percentage of missing records.

**Figure 1.** Locations of the discharge stations; see Table 1 for details on these stations.

Station number | River catchment | Station ID | Area [km^{2}] | Data from [year] | Data to [year] | Longitude [°] | Latitude [°] | Missing records [%] |
---|---|---|---|---|---|---|---|---|
(1) | Biharamulo | *** | 1,981 | 1950 | 2004 | 31.29 | 2.62 | 13 |
(2) | Bukora | 81270 | 8,392 | 1951 | 1976 | 31.48 | −0.85 | 16 |
(3) | Grumeti | 5F3 | 13,363 | 1950 | 2004 | 33.94 | −2.06 | 19 |
(4) | Gurcha-migori | 1KB05 | 6,600 | 1950 | 2004 | 34.20 | −0.95 | 3 |
(5) | Isanga | 114012 | 6,812 | 1976 | 2004 | 32.77 | −3.21 | 12 |
(6) | Kagera | 58370 | 54,260 | 1950 | 1994 | 31.43 | −1.29 | 20 |
(7) | Katonga | 100006 | 15,244 | 1950 | 1975 | 31.95 | −0.09 | 18 |
(8) | Koitobos | 1BE06 | 813 | 1949 | 1975 | 35.09 | 0.97 | 3 |
(9) | Magogo-maome | 113012 | 5,207 | 1950 | 2004 | 33.15 | −2.92 | 19 |
(10) | Mara | 107072 | 13,393 | 1950 | 2003 | 34.56 | −1.65 | 11 |
(11) | Mbalangeti | 111012 | 3,591 | 1950 | 2004 | 33.86 | −2.22 | 14 |
(12) | Moiben | 1BA01 | 188 | 1953 | 1990 | 35.44 | 0.80 | 16 |
(13) | Nyakizumba | 100005 | 359 | 1950 | 1987 | 30.08 | −1.32 | 15 |
(14) | Nyando | 1GD01 | 3,652 | 1962 | 2001 | 35.04 | −0.10 | 2 |
(15) | Nyangores | 1LA03 | 4,683 | 1963 | 1993 | 35.35 | −0.79 | 10 |
(16) | Nzoia | 1EF01 | 12,676 | 1974 | 1999 | 34.08 | 0.13 | 8 |
(17) | Ogilla | 1GD03 | 2,650 | 1970 | 1996 | 34.96 | −0.13 | 1 |
(18) | Ruizi | 100004 | 2,070 | 1970 | 1998 | 30.65 | −0.62 | 8 |
(19) | Sergoit | 1CA02 | 659 | 1959 | 1990 | 35.06 | 0.63 | 2 |
(20) | Simiyu Ndagalu | 5D1 | 1,205 | 1970 | 1996 | 33.56 | −2.63 | 12 |
(21) | Sio | 1AH01 | 1,450 | 1958 | 2000 | 34.15 | 0.38 | 6 |
(22) | Sondu | 1JG01 | 3,508 | 1950 | 1990 | 35.01 | −0.39 | 12 |
(23) | South Awach | 1HE01 | 3,156 | 1950 | 2004 | 34.54 | −0.47 | 7 |
(24) | Yala | 1FG01 | 3,351 | 1950 | 2000 | 34.51 | 0.09 | 3 |

Note: *** Missing station ID.

## 3. Methodology

#### 3.1. FDF Modeling

The extreme value analysis (EVA) and FDF modeling are based on nearly independent high and low flow extremes extracted from the full time series. Independent high flows are selected using independence criteria based on threshold values for the time between two successive independent flow (F) peaks, the ratio of the minimum flow between the two peaks over the peak value, and the peak height; see Willems [37] for details on the method. Extraction of low flows was carried out by applying the same method to the inverted flow (1/F) series. Prior to the extraction of the extreme values from the full time series for each of the selected stations, an n-day moving-average window was passed through the series. The aggregation levels considered for high flows were 1, 3, 5, 7, 10, 30, 60 and 90 days, while for low flows 1, 10, 30, 90, 150, 180, 240 and 365 days were taken. This is the range covered by the relevant water engineering or management applications such as agriculture, irrigation, hydropower, domestic water supply and pollution control. The highest aggregation levels of three months for high flows and one year for low flows reflect the differences in time scale of the high and low flow related hydrological processes. Peak flows are sudden and result in immediate effects due to the excess of water, whereas low flows are due to progressive low rainfall periods with long-term effects such as shortage in water availability.
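The aggregation step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the six-value flow series are hypothetical, and the independence criteria of Willems [37] for selecting peaks are deliberately left out.

```python
from typing import List


def moving_average(flows: List[float], n_days: int) -> List[float]:
    """Average a daily flow series over a sliding window of n_days."""
    if n_days < 1 or n_days > len(flows):
        raise ValueError("n_days must be between 1 and len(flows)")
    window_sum = sum(flows[:n_days])
    averaged = [window_sum / n_days]
    for i in range(n_days, len(flows)):
        # Slide the window by one day: add the new value, drop the oldest.
        window_sum += flows[i] - flows[i - n_days]
        averaged.append(window_sum / n_days)
    return averaged


def inverted(flows: List[float]) -> List[float]:
    """Transform F to 1/F so that low flows appear as peaks (requires F > 0)."""
    return [1.0 / f for f in flows]


# Hypothetical daily flows [m3/s], for illustration only.
daily = [4.0, 2.0, 6.0, 8.0, 10.0, 1.0]
agg3 = moving_average(daily, 3)   # 3-day aggregation level
low_series = inverted(agg3)       # series on which low-flow "peaks" would be sought
```

The same peak-selection routine can then be run on `agg3` for high flows and on `low_series` for low flows, once per aggregation level.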

To derive the FDF relationships for the selected range of aggregation levels, EVA was carried out and a suitable extreme value distribution (EVD) was selected. To enable an adequate selection of the optimal threshold level and to avoid systematic over- or underestimation in the tail of the EVD, quantile plots (Q-Q plots) were considered. Following the principle suggested by Csörgö et al. [38] and Beirlant et al. [39], and used by Taye and Willems [8], Onyutha [9], Willems et al. [40], Willems [41], and Willems [18], the extreme value index γ (or k = −γ) enables identification of the shape of the EVD. This extreme value index describes the tail heaviness of the EVD. It is a parameter in the Generalized Extreme Value (GEV) distribution of Jenkinson [42] and the Generalized Pareto Distribution (GPD) of Pickands [43], which are commonly applied as EVDs.

The class of the GEV distribution or GPD is identified as heavy tail (when γ >0 or k <0), normal tail (when γ = k = 0) and light tail (when γ <0 or k >0). The principle of calibrating the GPD by a weighted linear regression in Q-Q plots, with the primary forms of testing the tail shape of the GPD was adopted in this study. It has been shown in some asymptotic sense that for independent peak flow extremes as used in this study, the conditional distribution of these extremes follows the GPD [43].

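Considering x_{t}, α and k as the threshold, scale and shape parameters respectively, the cumulative distribution function G(x) of the GPD is given by:

$$G(x) = \begin{cases} 1 - \left[1 - k\,\dfrac{x - x_{t}}{\alpha}\right]^{1/k}, & k \neq 0 \\ 1 - \exp\left(-\dfrac{x - x_{t}}{\alpha}\right), & k = 0 \end{cases} \qquad (1)$$

This distribution is valid for values of x above the threshold x_{t}.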

The relationship between the T-year flow F_{T} and the return period T is given by:

$$T = \frac{n}{t\,\left[1 - G(F_{T})\right]} \qquad (2)$$

when based on the calibrated GPD, or by:

$$T = \frac{n}{j} \qquad (3)$$

when based on the empirical data. In Equations (2) and (3), n is the data record length in years; t the number of observed flows above the threshold x_{t} that is considered in the EVD; j the rank of the events (j = 1 for the highest). The relationships between F_{T} and T for the GPD can be given by Equations (5) and (6). For the exponential distribution of Equation (1), Equation (3) transfers to a linear relationship between F_{T} and log(T) as in Equation (6):

Here, F_{T0} is the flow at return period T0, and T0 is equal to or higher than the return period n/t of the threshold x_{t}. The quantiles F_{T} are hereafter called the growth factors (Gf_{T}).
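The link between the GPD, the return period and the T-year flow described above can be sketched numerically. The closed forms below are the standard GPD expressions under the parameterization stated in the text (threshold x_{t}, scale α, shape k = −γ) and are assumed, not confirmed, to match the paper's equations. All function names and numerical values are hypothetical.

```python
import math


def gpd_cdf(x, x_t, alpha, k):
    """GPD cumulative probability for x >= x_t; k = 0 is the exponential case."""
    y = (x - x_t) / alpha
    if k == 0:
        return 1.0 - math.exp(-y)
    base = 1.0 - k * y
    if base <= 0.0:
        # For k > 0 (light tail) the distribution has an upper bound.
        return 1.0
    return 1.0 - base ** (1.0 / k)


def return_period_model(x, x_t, alpha, k, n_years, t):
    """Return period from the calibrated GPD: T = n / (t * (1 - G(x)))."""
    return n_years / (t * (1.0 - gpd_cdf(x, x_t, alpha, k)))


def return_period_empirical(n_years, j):
    """Empirical return period of the event with rank j (j = 1 is the highest)."""
    return n_years / j


def exponential_quantile(T, x_t, alpha, n_years, t):
    """Exponential case: F_T = x_t + alpha * (ln T - ln(n / t)), linear in ln T."""
    return x_t + alpha * (math.log(T) - math.log(n_years / t))


# Hypothetical calibration: 25 years of record, 100 events above the threshold.
f_25 = exponential_quantile(25.0, x_t=50.0, alpha=20.0, n_years=25.0, t=100)
t_back = return_period_model(f_25, x_t=50.0, alpha=20.0, k=0, n_years=25.0, t=100)
```

Feeding the 25-year quantile back into `return_period_model` recovers T = 25, which is a quick consistency check between the two directions of the relationship.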

Figure 2 shows examples of such calibrated normal tailed GPDs as linear regression lines in exponential Q-Q plots for station 16.

**Figure 2.** (**a**) Daily high flows; (**b**) Daily low flows. Symbol (○) shows observations in exponential Q-Q plots; (□) denotes the selected optimal threshold. The regression lines are the calibrated EVDs.

The slopes of the linear calibrated EVD were quantified by the weighting factors proposed by Hill [44]. The slope is computed as the difference between two ordered flows divided by the difference between their corresponding ranks. By considering j to be the rank of events, the MSE of the weighted linear regression in the exponential Q-Q plot is:

Because of the high fluctuations that occur in the slope of the Q-Q plots for high thresholds (see Figure 3), stemming from the randomness of the available dataset, the slope estimates for these high thresholds have high statistical uncertainty. Conversely, for very low thresholds the slope estimates might show pronounced bias because, according to Pickands [43], the slope converges asymptotically to a constant value (in the exponential Q-Q plot for normal tails) for higher thresholds. The optimal threshold x_{t} above which the EVD is calibrated was selected at a point where the mean squared error (MSE) of the linear regression is minimal, i.e., within nearly horizontal sections in the plot of the slope versus the number of observations above the threshold. Where the MSE reaches local minima at different threshold orders within the range where the optimal threshold is situated, the EVD is calibrated separately for the thresholds at the different local minima as a visual aid for selecting the most suitable one. The examples in Figure 2 are for the daily flows at station 16. The optimal thresholds are determined as the flow values with threshold ranks t = 100 and 69 [the 100th and 69th highest flow values for high and low (1/F) flow values respectively] as shown in Figure 3. A linear tail behavior in the exponential Q-Q plot can be observed towards the higher F or (1/F) values.
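The threshold scan described above can be illustrated with a small sketch. It assumes a common form of the Hill-type estimator for the exponential Q-Q plot, namely the mean excess above the (t+1)-th largest flow, and an unweighted mean squared residual around the fitted line. The exact weighting used in the paper is not reproduced here, and the function names and synthetic data are hypothetical.

```python
import math


def hill_slope_and_mse(flows, t):
    """Slope of the exponential Q-Q plot above the (t+1)-th largest flow, and its MSE."""
    x = sorted(flows, reverse=True)
    if not 1 <= t < len(x):
        raise ValueError("t must satisfy 1 <= t < number of flows")
    threshold = x[t]  # (t+1)-th largest value acts as the threshold
    excess = [x[j] - threshold for j in range(t)]
    slope = sum(excess) / t  # mean excess (assumed Hill-type estimator)
    # Residuals against the fitted line; ln((t + 1) / j) is the theoretical
    # exponential quantile for the event of rank j among t exceedances.
    mse = sum(
        (excess[j - 1] - slope * math.log((t + 1) / j)) ** 2
        for j in range(1, t + 1)
    ) / t
    return slope, mse


def candidate_threshold_rank(flows, t_min, t_max):
    """Rank t in [t_min, t_max] with the smallest MSE (a candidate, not a final choice)."""
    return min(range(t_min, t_max + 1), key=lambda t: hill_slope_and_mse(flows, t)[1])


# Synthetic, perfectly exponential extremes with scale 2 (illustration only).
N = 200
synthetic = [10.0 + 2.0 * math.log((N + 1) / j) for j in range(1, N + 1)]
slope_100, mse_100 = hill_slope_and_mse(synthetic, 100)
best_t = candidate_threshold_rank(synthetic, 20, 150)
```

On real data the MSE curve typically has several local minima, so the rank returned by `candidate_threshold_rank` would be checked visually against the slope plot rather than accepted automatically.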

**Figure 3.** (**a**) Daily high flows; (**b**) Daily low flows. The symbol (♦) shows Hill-type estimation of the slope in the exponential Q-Q plot; (◊) is for mean squared error (MSE) of the Hill-type regression in the exponential Q-Q plot; and (□) represents selected optimal threshold.

After the optimal thresholds for the different aggregation levels had been selected carefully and consistently, the parameters of the EVD were calibrated and the relationship between the model parameters and the aggregation levels was analyzed using the formula presented by Willems [45] and used by Taye and Willems [8] and Onyutha [9], as expressed below:

In this formula, A is the area of the catchment upstream of the discharge measuring station considered. The formula is based on scaling properties for the rainfall intensities and consequently of the river discharges. The scaling property indicates that the same EVD is valid for different aggregation levels after application of a scaling factor to the rainfall or discharges values. The scaling factor is different for different aggregation levels. The formula has five parameters: c, w, H, z and a. The last three parameters are called “scaling exponents” in the scaling theory and a specific interpretation can be given to these parameters. The parameter H is called “Hurst-exponent”; while z represents the dynamic scaling exponent; and a, the scaling exponent applied for the aggregation level.

For parameters α and t, the threshold discharge q (mean discharge value calculated based on the complete time series) equals 0. After calibrating the parameters of the GPD using Equation (9), FDF relationships are derived using Equations (10) and (11):

The parameter-aggregation level relationships, together with the analytical description of the EVD, finally constituted the FDF relationships that are used to estimate high or low flow quantiles as a simultaneous function of different Ts and aggregation levels.

#### 3.2. Uncertainty and Error Analysis

The difference between the observed (Q_{f}) and the FDF-based (Q_{d}) flow quantiles was used as a measure of bias. Considering i to be the rank of the peak-over-threshold (POT) events (i = 1 for the highest) and R the number of POT events above the optimal threshold, the mean of (Q_{f,i} − Q_{d,i}) expressed as a percentage of Q_{d,i} for i = 1 to R is taken as the average percentage bias, as in Equation (12). The overall differences between the Q_{f,i} and Q_{d,i} values for each of the selected catchments were also evaluated in terms of the root mean squared error [RMSE; Equation (13)].

$$\text{Bias} = \frac{100}{R}\sum_{i=1}^{R}\frac{Q_{f,i} - Q_{d,i}}{Q_{d,i}} \qquad (12)$$

$$\text{RMSE} = \sqrt{\frac{1}{R}\sum_{i=1}^{R}\left(Q_{f,i} - Q_{d,i}\right)^{2}} \qquad (13)$$
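As a small numerical illustration of these two error measures, the sketch below computes the average percentage bias and the RMSE as defined in the text. The function names and the three quantile pairs are hypothetical.

```python
import math


def average_percentage_bias(q_f, q_d):
    """Mean of (Q_f,i - Q_d,i) / Q_d,i over all events, expressed in percent."""
    return 100.0 * sum((f - d) / d for f, d in zip(q_f, q_d)) / len(q_d)


def rmse(q_f, q_d):
    """Root mean squared difference between the two sets of quantiles."""
    return math.sqrt(sum((f - d) ** 2 for f, d in zip(q_f, q_d)) / len(q_d))


# Hypothetical quantiles [m3/s]: one overestimate, one underestimate, one exact.
q_f = [110.0, 90.0, 50.0]
q_d = [100.0, 100.0, 50.0]

bias = average_percentage_bias(q_f, q_d)
error = rmse(q_f, q_d)
```

In this example the positive and negative deviations cancel, so the average bias is zero while the RMSE is about 8.2 m3/s. This is why the two measures are reported side by side: a small average bias does not by itself imply small individual deviations.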

Since there are no empirical values for T higher than the length of the available flow series, the validity of the flow quantiles higher than that T obtained from the FDF relationships was checked using the at-site EVD based quantiles.

To check for bias in the theoretical T in comparison with the empirical T, the distribution of the residuals of FDF-based versus empirical or Gf_{T}-based T values, and the CIs of these residuals, were assessed. Assuming that the sample of observed flow extremes is small and that the residuals on the modeled T are random and follow a normal distribution, a t-test on the mean was conducted for the null hypothesis of an unbiased model, i.e., of zero bias (mean residual) in the FDF-based T or flow quantile estimates of each selected catchment.
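A one-sample t-test on the mean residual, as described above, might be sketched as follows. The residual values are hypothetical, and the critical value is supplied by the caller (2.262 is the two-sided 5% value of Student's t for 9 degrees of freedom) so that only the Python standard library is needed.

```python
import math
import statistics


def mean_residual_t_test(residuals, t_crit):
    """Test H0: mean residual = 0. Returns (t statistic, CI on the mean, H0 retained?)."""
    n = len(residuals)
    mean = statistics.mean(residuals)
    std_err = statistics.stdev(residuals) / math.sqrt(n)  # sample standard error
    t_stat = mean / std_err
    ci = (mean - t_crit * std_err, mean + t_crit * std_err)
    # The model is treated as unbiased when zero lies inside the CI.
    unbiased = ci[0] <= 0.0 <= ci[1]
    return t_stat, ci, unbiased


# Hypothetical residuals (modeled minus empirical T, in years) for one catchment.
residuals = [-1.0, 0.5, 0.2, -0.3, 0.8, -0.4, 0.1, 0.6, -0.7, 0.5]
t_stat, ci, unbiased = mean_residual_t_test(residuals, t_crit=2.262)
```

Checking whether zero falls inside the CI is equivalent to comparing the absolute t statistic with the critical value, and matches the way the residual CIs are read later in the results.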

Probability distributions and/or CIs on the FDF-based T or flow quantile can be applied to provide the FDF-based estimates with uncertainty measures. The biases in fitting the theoretical T to empirical quantiles, and in calibrating the parameter-aggregation level relationships, were considered. The CIs were computed by applying the parametric bootstrapping method, based on samples of river flows randomly generated from the EVD using the Monte Carlo method. Random samples of the same size as the dataset of observed flow extremes were generated. This was repeated 1000 times. The CIs were computed by the percentile method after ranking the flow quantiles obtained from the generated samples and picking the 25th and 975th ranked values as the lower and upper limits of the 95% CI respectively.
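The parametric bootstrap with percentile limits can be sketched for the exponential EVD as below. This is a simplified illustration under stated assumptions: the threshold is held fixed, only the scale parameter is re-estimated (as the mean excess) in each synthetic sample, and the calibration values are hypothetical. It does not include the parameter-aggregation level relationships used in the paper.

```python
import math
import random


def exponential_quantile(T, x_t, alpha, n_years, t):
    """T-year flow for the exponential EVD: x_t + alpha * (ln T - ln(n / t))."""
    return x_t + alpha * (math.log(T) - math.log(n_years / t))


def bootstrap_quantile_ci(T, x_t, alpha, n_years, t, n_boot=1000, seed=1):
    """95% percentile CI on the T-year flow from a parametric bootstrap."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Draw t exceedances from the calibrated exponential EVD
        # (expovariate takes the rate, i.e. 1 / scale).
        sample = [rng.expovariate(1.0 / alpha) for _ in range(t)]
        alpha_hat = sum(sample) / t  # re-calibrated scale (assumed estimator)
        estimates.append(exponential_quantile(T, x_t, alpha_hat, n_years, t))
    estimates.sort()
    lower = estimates[round(0.025 * n_boot) - 1]  # 25th ranked value of 1000
    upper = estimates[round(0.975 * n_boot) - 1]  # 975th ranked value of 1000
    return lower, upper


# Hypothetical calibration: 25 years of record, 100 events above the threshold.
point = exponential_quantile(25.0, x_t=50.0, alpha=20.0, n_years=25.0, t=100)
lower, upper = bootstrap_quantile_ci(25.0, x_t=50.0, alpha=20.0, n_years=25.0, t=100)
```

Because each synthetic sample has the same size as the set of observed extremes, the width of the interval reflects the sampling uncertainty that comes from the limited record length.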

## 4. Results and Discussion

#### 4.1. FDF Relationships

Figure 4 shows examples of the FDF relationships obtained after compiling the exponential EVD calibration results for river flows at various aggregation levels for station 14. Up to Ts equal to the length of the available time series, empirical quantiles were derived as well. Because the lengths of the available river flow series were all longer than 25 years but less than 100 years, empirical T-year flow quantiles are only shown for curves up to 25 years in Figure 4. For higher T values, due to the randomness involved in the empirical data, the empirical quantiles can be far less accurate than the theoretical quantiles. Differences between the empirical and theoretical quantiles can, for the higher T values, also be explained by the influence of river flooding and the higher observation errors for higher flows. One reason for the increase in observation errors for higher flows is bias in rating curve extrapolation or the difference between the river discharge and the catchment rainfall-runoff discharge. Another reason for the differences is the increasing statistical model uncertainty with increasing T, as seen later in this paper.

#### 4.2. Evaluation of the FDF Relationships

Figure 5 shows the graphical goodness-of-fit of the flow quantiles after calibration of the EVDs for the daily aggregation level and for T of 5, 10 and 25 years. Considering the full range of aggregation levels, it can be seen that the theoretical quantiles fit the empirical ones well. This shows that the FDF calibrations are highly acceptable.

Figure 6 shows a representative example of a comparison between the at-site EVD (or Gf_{T}) based curves and the corresponding curve derived from the FDF relationships of station 1. The figure shows increasing deviations between flow quantiles derived from the two types of curves. Given that no bias could be observed in Figure 5, the systematic deviation between the two curves must be attributed to the EVD versus FDF extrapolations for larger T. Results also show that there is increasing uncertainty in flow quantiles and T estimates for T higher than the length of the observed data series. Extrapolation of data measured over a relatively short record length introduces large uncertainties [46]. This clearly indicates that caution must be exercised in using data of short record length in estimating very high T. It has been recommended that T should be extrapolated only up to about two or three times the series length due to the uncertainties introduced by the finite sample size [47].

**Figure 5.** (**a**) High flows of aggregation level 3 days; (**b**) Low flows of aggregation level 10 days. The symbol (ж) is for T = 5 years; (□) for T = 10 years; and (●) for T = 25 years.

**Figure 6.** (**a**) Daily high flow quantiles; (**b**) Daily low flow quantiles. Solid lines are for Gf_{T} curves and dashed lines are for FDF relationships.

#### 4.3. Uncertainty Analysis

#### 4.3.1. Uncertainty in Return Periods

Figure 7 shows comparison between the theoretical (FDF modeled) and empirical T values. The figure shows increasing statistical modeling uncertainty with increasing T. This again can be explained by the higher uncertainty in EVA for higher T. The alignment of the data points along straight lines parallel to the bisector shows that the deviations between the empirical and theoretical T values are related to the parameter α. The shifts of the data points from the bisector are proportional to the parameter α. The bias as shown in Figure 7 can be explained by the uncertainty in EVD parameter α, which controls the vertical differences between two successive T-year curves on the FDF relationships.

**Figure 7.** (**a**) Daily high flow quantiles; (**b**) Daily low flow quantiles. The labels of the stations from 1 to 24 are in the order arranged in Table 1.

Figure 8 shows the Monte Carlo simulation results on the FDF modeled T for daily high and low hydrological extremes. It can be seen that the CIs increase in width with increase in T, which obviously is the result of the high uncertainty associated with EVA for very high T. The widths of the CIs were also noted to vary significantly from one catchment to the next. This reflects the difference in the degree of temporal variability in river flows for the different catchments of the study area.

Figure 9 shows the bias and RMSE of the deviation between FDF modeled and empirical T averaged over the entire EVD fitted to daily flows. It can be seen in Figure 9 that for both high and low flows, the average biases for all the selected catchments of the study area are less than 8%. This explains the acceptability of the quantiles from the statistically modeled FDF relationships of the hydrological extremes in the study area. However, as seen from Figure 9, catchments 2, 6 and 7 border each other in the North Western quadrant of the study area (see Figure 1). This indicates the reduced variability of the river flows in this portion of the study area compared to other areas, e.g., the western or eastern quadrant (see stations 5, 15, and 24 in Figure 1).

The differences in the bias and uncertainty in the FDF modeled flow and T values indicate spatial differences across the study area. The catchments with wider CIs in general present higher temporal variability in the flows due to higher differences between low and high flows or stronger differences between short-duration values and longer duration values. This could be explained by the strong variation in the rainfall extreme intensities in the Nile basin as seen from the regional extreme value analysis by Nyeko-Ogiramoi et al. [48].

As can be seen from the probability distributions of the residuals on the FDF modeled T for daily flows in Figure 10, the residuals on the T estimates (FDF based versus empirical) are wider for high flows than for low flows. This was also seen in Figure 8 and Figure 9. It is noted that the zero value is within the CIs on the T residuals for all the catchments indicating that the FDF modeled Ts are unbiased.

**Figure 8.** (**a**) and (**c**) are for daily high flows; (**b**) and (**d**) are for daily low flows. (**a**) and (**b**) are for station 7; (**c**) and (**d**) are for station 6. Round dotted line is the upper limit of the simulated 95% CI. Thin solid line is the lower limit of the CI. Thick solid line is the modeled T. Vertical axis of each graph is logarithmic.

**Figure 9.** (**a**) Bias [%]; (**b**) RMSE [years]. The labels of the catchments from 1 to 24 are in the order arranged in Table 1.

**Figure 10.** (**a**) and (**c**) are for daily high flows; (**b**) and (**d**) are for daily low flows. (**a**) and (**b**) are for station 20; (**c**) and (**d**) are for station 22. The blue line (bell-shaped) is the assumed Gaussian probability density function (pdf). The dark line (sigmoid) is the cumulative probability distribution (cdf). (**a**), (**b**) and (**d**) all have the same legend as (**c**).

#### 4.3.2. Uncertainty in Flow Quantiles

Figure 11 shows the validity of the projected FDF quantiles for very high T checked using Gf_{T} curves. It can be seen that the bias increases for higher T of both high and low flows. Whereas general underestimations are obtained for high flows, the statistical FDF models for low flows are characterized by overestimations. The overall ranges of the biases across the catchments (min, max) for high flows are (−24.49%, 10.29%), (−29.98%, 13.03%), and (−35.14%, 22.02%) for T = 25, 100 and 500 years respectively, while the corresponding figures for low flows are (−32.14%, 67.41%), (−47.05%, 81.23%), and (−59.95%, 110.71%). On average, higher biases (for T = 25, 100 and 500 years) were obtained in FDF models of low flows (11.61%, 19.01%, and 28.08%) than of high flows (5.37%, 11.28%, and 16.85%). This suggests that the FDF models for low flows are less capable of capturing the flow variability than those for high flows, which is somewhat contradictory to the conclusion in the previous section that the FDF-based T estimates are less biased for low flows than for high flows. The reason is that low flow values are smaller, which leads to higher relative errors for the same absolute errors.

**Figure 11.** (**a**) High flows; (**b**) Low flows. The labels of the catchments from 1 to 24 are in the order arranged in Table 1.

The average differences between the CI (lower, upper) limits and the empirical quantiles (EQ), as percentages of EQ, on daily flows of the FDF relationships of high flows are (−51.9%, 60.5%), (−61.0%, 82.5%) and (−70.7%, 116.7%) for T of 5, 10 and 25 years respectively. Correspondingly, for low flows, CIs of (−56.1%, 91.6%), (−65.5%, 116.3%) and (−77.7%, 151.2%) are obtained. The highest differences are found at stations 20 and 16, with 155.41% and 179.08% for high and low flows respectively; the lowest are 37.74% and 91.47% at stations 23 and 11 for high and low flows respectively. These differences are expected to be due to the variation in the influence of local climate, which brings about uneven wet and dry periods across the study area.

With increasing aggregation level, the differences are noted to become narrower for high flows and wider for low flows (see Figure 12). This shows that the CIs on the FDF-based extreme flow quantiles depend on the magnitude of the aggregated river flows. With increase in the aggregation level, the magnitudes of the river flows reduce (for high flows) and increase (for low flows). This can be seen in Figure 12, which shows CIs constructed using Monte Carlo simulations on the FDF curve of T = 5 years. Importantly, as shown in Figure 12, the CI for any selected T-year quantile on an FDF relationship can be estimated.

**Figure 12.** (**a**) and (**c**) are high flow quantiles; (**b**) and (**d**) are low flow quantiles. (**a**) and (**b**) are for station 16; (**c**) and (**d**) are for station 24. (**b**), (**c**) and (**d**) all have the same legend as (**a**). “T5 Emp.” stands for empirical FDF quantiles for T = 5 years.

## 5. Conclusions

This paper has developed statistically modeled FDF relationships for high and low flow extremes and Monte Carlo based uncertainty estimates on both T-year flow quantiles and return period estimates. It was possible to adequately fit the exponential case of the GPD as the EVD for all the selected study catchments. This was based on the analysis of the tail’s shape of the EVD in Q-Q plots, and calibration of the EVD by weighted regression in the exponential Q-Q plot. These results suggest that the GPD family of distributions represented in this case by the exponential distribution is most suited for hydrological extremes in most of the catchments of the Lake Victoria basin. However, no definite generalization can be made for the entire Lake Victoria basin. More studies need to be carried out in other catchments of the region with longer and up-to-date time series. It can also be recommended that instead of MSE in Q-Q plots, other goodness-of-fit measures e.g., Anderson-Darling test, Kolmogorov-Smirnov test, Z-statistics, L-Moment ratio diagrams, etc., be applied.

The average bias on the modeled return periods of both high flows and low flows is less than 8% (averaged over the entire EVD fitted to daily flows). This confirmed the acceptability of the established FDF relationships. Despite this relatively small value for the average bias in the modeled return period, the bias for individual locations and the limits of the 95% CIs on the modeled T-year flows can differ much more from the observed values. These limits go up to 117% for high flows and return periods up to 25 years, and up to 152% for low flows up to the same return period. The 95% CI of the average bias in the T-year flow ranges from −24.49% to +10.29% for high flows and a return period of 25 years, and from −32.14% to +67.41% for low flows.

In addition, the validity of the projected FDF quantiles for return periods longer than the available flow series was checked. When the Gf_{T} curves were taken as the reference, the 95% CI of the average bias widens to (−29.98%, 13.03%) and (−35.14%, 22.02%) for high flows at return periods of 100 and 500 years, respectively. For low flows, it widens to (−47.05%, 81.23%) and (−59.95%, 110.71%). For individual stations, the CIs become even wider. This result shows that when FDF relationships calibrated for data-scarce regions are used to estimate projected T-year flows for high return periods, the uncertainty is large and should not be ignored. Quantifying this uncertainty is therefore important.

The assessed uncertainty will support decision making when the constructed FDFs are used in applications such as estimating cumulative volumes of water during drought or flood periods at various aggregation levels or return periods, as outlined in the Introduction.

## Acknowledgments

The research was linked to the FRIEND/NILE project of UNESCO and the Flanders in Trust Fund. The research was financially supported by an IRO PhD scholarship of Katholieke Universiteit Leuven.

## Conflicts of Interest

The authors declare no conflict of interest.

© 2013 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).