Generalized Beta Distribution of the Second Kind for Flood Frequency Analysis

Estimation of flood magnitude for a given recurrence interval T (T-year flood) at a specific location is needed for the design of hydraulic and civil infrastructure facilities. A key step in flood frequency analysis (FFA) is the selection of a suitable distribution. More than one distribution is often found to be adequate for FFA on a given watershed, and choosing the best one is often less than objective. In this study, the generalized beta distribution of the second kind (GB2) was introduced for FFA. The principle of maximum entropy (POME) method was proposed to estimate the GB2 parameters. The performance of the GB2 distribution was evaluated using flood data from gauging stations on the Colorado River, USA. Frequency estimates from the GB2 distribution were also compared with those of commonly used distributions, and the evolution of the frequency distribution along the stream from upstream to downstream was investigated. It is concluded that the GB2 is appealing for FFA, since it has four parameters and includes some well-known distributions as special cases. Results of the case study demonstrate that the parameters estimated by the POME method are reasonable. According to the root mean square deviation (RMSD) and Akaike information criterion (AIC) values, the GB2 distribution performs better than the distributions widely used in hydrology. Different distributions yield significantly different design flood values. For a given return period, the design flood value at the downstream gauging stations is larger than that at the upstream gauging stations. In addition, the distribution evolves along the stream: along the Yampa River, the distribution for FFA changes from the four-parameter GB2 distribution to the three-parameter Burr XII distribution.


Introduction
Estimation of flood magnitude for a given recurrence interval T (T-year flood) at a given location is essential for the design of hydraulic and civil infrastructure facilities, such as dams, spillways, levees, urban drainage, culverts, road embankments, and parking lots. A key step in flood frequency estimation or analysis (FFA) is the selection of a suitable frequency distribution [1]. Commonly used distributions for flood frequency analysis include Gumbel, gamma, generalized extreme value (GEV), Pearson type III (P-III), log-Pearson type III (LP-III), Weibull, and log-normal (LN). Some of these distributions have been adopted in different countries. For example, the P-III distribution has been adopted in China and Australia as a standard method for hydrologic frequency analysis [2][3][4]. The LP-III distribution has been adopted in the United States and the GEV distribution in Europe.
Mielke and Johnson investigated the use of two special cases of the generalized beta distribution of the second kind, namely gamma and log normal distributions, for flood frequency analysis [5].
The probability density function (PDF) of the GB2 distribution can be expressed as:

$$f(x) = \frac{r_3}{\beta\, B(r_1, r_2)} \left(\frac{x}{\beta}\right)^{r_1 r_3 - 1} \left[1 + \left(\frac{x}{\beta}\right)^{r_3}\right]^{-(r_1 + r_2)}, \quad x > 0 \qquad (1)$$

where B(·) is the beta function; β is the scale parameter, β > 0; and r1 > 0, r2 > 0, and r3 > 0 are the shape parameters. Parameter r3 represents the overall shape, parameter r1 governs the left tail, and parameter r2 controls the right tail; β is a scale parameter and depends on the unit of measurement. These parameters allow the distribution to fit data having very different histogram shapes, including both J-shaped and bell-shaped distributions. Parameters r1 and r2 together determine the skewness of the distribution. The general shapes of the GB2 probability density function are shown in Figure 1.

Entropy 2017, 19, 254
When analyzing extreme rainfall, Papalexiou and Koutsoyiannis showed that the GB2 distribution is a very flexible four-parameter distribution [8]. By fixing certain parameters, the GB2 distribution can yield some well-known distributions, such as the beta distribution of the second kind (B2), the Burr type XII, generalized gamma (GG), and so on. These distributions can be treated as special or limiting cases of the GB2 distribution, as shown in Figure 2. Some of these special cases have been applied in hydrological frequency analysis. For example, Shao et al. employed the Burr type XII distribution for flood frequency analysis [2].
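As an illustration of the special-case relationship, the sketch below evaluates a GB2 density and checks numerically that it integrates to one and that it collapses to a Burr XII density when r1 = 1. It assumes the standard four-parameter GB2 form with scale β and shapes r1, r2, r3; the function names `gb2_pdf`, `burr12_pdf`, and `integrate_pdf` are hypothetical helpers, not part of the original study.

```python
import math


def gb2_pdf(x, r1, r2, r3, beta):
    """GB2 density, assuming the standard four-parameter form (scale beta, shapes r1, r2, r3)."""
    if x <= 0:
        return 0.0
    z = x / beta
    # ln B(r1, r2) computed through log-gamma for numerical stability
    log_b = math.lgamma(r1) + math.lgamma(r2) - math.lgamma(r1 + r2)
    log_f = (math.log(r3) - math.log(beta) - log_b
             + (r1 * r3 - 1.0) * math.log(z)
             - (r1 + r2) * math.log1p(z ** r3))
    return math.exp(log_f)


def burr12_pdf(x, r2, r3, b):
    """Burr XII density with shape parameters r2, r3 and scale b."""
    if x <= 0:
        return 0.0
    z = x / b
    return (r2 * r3 / b) * z ** (r3 - 1.0) * (1.0 + z ** r3) ** (-(r2 + 1.0))


def integrate_pdf(r1, r2, r3, beta, steps=2000):
    """Midpoint rule on (0, 1) after substituting t = y / (1 + y), y = (x / beta) ** r3."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        y = t / (1.0 - t)
        x = beta * y ** (1.0 / r3)
        dx_dt = (beta / r3) * y ** (1.0 / r3 - 1.0) / (1.0 - t) ** 2
        total += gb2_pdf(x, r1, r2, r3, beta) * dx_dt * h
    return total
```

Calling `integrate_pdf(2.0, 3.0, 1.5, 10.0)` should return a value very close to 1, and `gb2_pdf(x, 1.0, r2, r3, b)` should agree with `burr12_pdf(x, r2, r3, b)` for any positive x. The parameter values here are arbitrary illustrations.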

Estimation of Parameters of GB2 Distribution by POME Method
The GB2 distribution parameters were determined using the principle of maximum entropy (POME). The POME method involves the following steps: (1) specification of constraints; (2) maximization of entropy using the method of Lagrange multipliers; (3) derivation of the relation between Lagrange multipliers and constraints; (4) derivation of the relation between Lagrange multipliers and distribution parameters; and (5) derivation of the relation between distribution parameters and constraints. These steps are discussed in Appendix A. Here only steps (1) and (5) are outlined.
Flood discharge is considered as a random variable X, which ranges from 0 to infinity. Its probability density function (PDF) and cumulative distribution function (CDF) are denoted as f(x) and F(x), respectively, where x is a specific value of X. Since constraints encode the information that can be given for the random variable, following Singh (1998), the constraints for the GB2 distribution can be expressed as:

$$\int_0^\infty f(x)\,dx = 1$$

$$\int_0^\infty \ln x \, f(x)\,dx = E[\ln x]$$

$$\int_0^\infty \ln\left[1 + \left(\frac{x}{\beta}\right)^{r_3}\right] f(x)\,dx = E\left\{\ln\left[1 + \left(\frac{x}{\beta}\right)^{r_3}\right]\right\}$$

The first constraint is the total probability law, the second constraint is the mean of the log values (the logarithm of the geometric mean), and the third constraint is the mean of the log of the scaled values raised to a power and then shifted by unity.
Following the derivation in Appendix A, the relation between parameters and constraints can be expressed as Equation (3), in which φ(·) is the digamma function and φ′(·) is the trigamma function. Detailed information for deriving these relationships can be found in Appendix A.
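To make the role of the digamma function concrete, the sketch below checks two standard identities for the GB2 distribution by simulation: E[ln X] = ln β + [φ(r1) − φ(r2)]/r3 and E{ln[1 + (X/β)^r3]} = φ(r1 + r2) − φ(r2). These identities follow from the beta-prime representation of the GB2 and are offered as an assumption-laden illustration; they are not necessarily written in the same form as the relations derived in Appendix A. The helper names `digamma`, `sample_gb2`, and `check_identities`, and the parameter values, are hypothetical.

```python
import math
import random


def digamma(x):
    """Digamma function for x > 0, using the recurrence and an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    result += (math.log(x) - 0.5 / x
               - inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0)))
    return result


def sample_gb2(r1, r2, r3, beta, rng):
    """If Z ~ Beta(r1, r2), then beta * (Z / (1 - Z)) ** (1 / r3) follows a GB2 distribution."""
    z = rng.betavariate(r1, r2)
    return beta * (z / (1.0 - z)) ** (1.0 / r3)


def check_identities(r1=2.0, r2=3.0, r3=1.5, beta=10.0, n=100_000, seed=1):
    """Return (sample means, theoretical means) of the two log-type constraints."""
    rng = random.Random(seed)
    sum_ln = 0.0
    sum_ln1p = 0.0
    for _ in range(n):
        x = sample_gb2(r1, r2, r3, beta, rng)
        sum_ln += math.log(x)
        sum_ln1p += math.log1p((x / beta) ** r3)
    sample = (sum_ln / n, sum_ln1p / n)
    theory = (math.log(beta) + (digamma(r1) - digamma(r2)) / r3,
              digamma(r1 + r2) - digamma(r2))
    return sample, theory
```

In a POME-style estimation the sample means on the left would be computed from the observed flood series and the equations solved for the parameters; the sketch only verifies the forward direction for known parameters.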

Flood Frequency Analysis
For FFA, three problems were addressed. First, the GB2 distribution was tested using observed flood data and was compared with commonly used distributions in hydrology. Second, a method for selecting the best distribution was discussed. Third, flood frequency analysis was carried out at several gauging stations from upstream to downstream, and the evolution of frequency distribution along the stream was investigated.

Flood Data
Flood data from eight gauging stations on the Colorado River and its tributaries, as shown in Figure 3, were considered to test the performance of the GB2 distribution and to discuss the evolution of the frequency distribution along the river. The Colorado River is the principal river of the Southwestern United States and northwest Mexico. It rises in the central Rocky Mountains and flows generally southwest across the Colorado Plateau and through the Grand Canyon. The basin boundary consists of mountains that are 13,000 to 14,000 feet (3962.4 m to 4267.2 m) high in Wyoming, Colorado, and Utah, and the boundary drops to elevations of less than 1000 feet (304.8 m) at Hoover Dam. The northern part of the river basin in Colorado and Wyoming is a mountainous plateau that ranges from 5000 to 8000 feet (1524 m to 2438 m) in elevation and encompasses deep canyons, rolling valleys, and intersecting mountain ranges. The central and southern portions of the basin in eastern Utah, northwestern New Mexico, and northern Arizona consist of rugged mountain ranges interspersed with rolling plateaus and broad valleys. In general, the mountains in the southern part of the basin are much lower than those in the northern part.

Of the eight gauging stations considered in this study, sites 1, 2 and 3 are on the Yampa River, which is a secondary tributary of the Colorado River. Sites 4, 5, 6, 7 and 8 are on the mainstream of the Colorado River. Site 8 is near Hoover Dam. The data for these gauging stations were downloaded directly from the USGS (United States Geological Survey) website. The characteristics of the flow data at these gauging stations, including record length, mean, standard deviation, skewness, and kurtosis, were calculated, as shown in Table 1. Since Glen Canyon Dam regulates the river flow past Lees Ferry (shown in Figure 3), the characteristics of the flow at Hoover Dam (site 8) are quite different from those at sites 4, 5, 6 and 7 upstream.

It can be seen from Table 1 that for sites 1 to 7 the mean values increase from upstream to downstream, as more rainfall and inflow enter the river. Since the standard deviation is related to the flood magnitude, it also increases with the mean value. At site 8, reservoir operation stores part of the streamflow, so the streamflow at site 8 is reduced. The skewness is positive for all gauging stations, indicating that the right tail is longer or fatter than the left tail and that the mass of the distribution is concentrated on the left side. Kurtosis is a measure of the peakedness of the probability distribution. The skewness and kurtosis values in the mainstream are generally lower than those in the tributaries.
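The summary statistics reported in Table 1 can be computed along the following lines. This is a minimal sketch that assumes simple moment-based (population) estimators; the source does not state which estimators were used, and `sample_statistics` is a hypothetical helper name.

```python
import math


def sample_statistics(flows):
    """Moment-based summary of an annual maximum series (population estimators assumed)."""
    n = len(flows)
    mean = sum(flows) / n
    m2 = sum((q - mean) ** 2 for q in flows) / n  # second central moment
    m3 = sum((q - mean) ** 3 for q in flows) / n  # third central moment
    m4 = sum((q - mean) ** 4 for q in flows) / n  # fourth central moment
    return {
        "n": n,
        "mean": mean,
        "std": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }
```

A positive `skewness` value corresponds to the longer right tail described above.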

Performance Measures
For evaluating the performance of the GB2 distribution, two measures were employed: (1) the root mean square deviation (RMSD); and (2) the Akaike information criterion (AIC). These methods assess the fitted distribution at a site by summarizing the deviations between observed discharges and computed discharges.
A frequently used method for assessing the goodness-of-fit of a function is the RMSD [12]. This method was used by NERC (1975) for ranking candidate distributions [13]. In the RMSD expression, n is the sample size, Q_the is the computed discharge at the i-th plotting position, and Q_emp is the observed i-th smallest discharge. The value of RMSD ranges from 0 to 1; the smaller it is, the better the distribution fits.

AIC is a measure of the relative quality of statistical models for a given set of data. It also includes a penalty that is an increasing function of the number of estimated parameters. In the AIC expression [14], K is the number of parameters of the distribution, and the mean square error (MSE) is computed from the deviations between observed and computed discharges. Given a set of candidate models for the data, the preferred model is the one with the minimum AIC value.
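The two measures can be sketched in code as follows. The exact expressions are not reproduced in this text, so the forms below are assumptions: a relative RMSD (deviations divided by the observed discharge, which is consistent with values falling between 0 and 1) and an MSE-based AIC of the form n·ln(MSE) + 2K with MSE = SSE/(n − K). The function names `rmsd` and `aic` are hypothetical.

```python
import math


def rmsd(q_emp, q_the):
    """Relative root mean square deviation (assumed form) between observed and computed discharges."""
    n = len(q_emp)
    return math.sqrt(sum(((e - t) / e) ** 2 for e, t in zip(q_emp, q_the)) / n)


def aic(q_emp, q_the, k):
    """MSE-based AIC (assumed form): n * ln(MSE) + 2K, with MSE = SSE / (n - K)."""
    n = len(q_emp)
    sse = sum((e - t) ** 2 for e, t in zip(q_emp, q_the))
    mse = sse / (n - k)
    return n * math.log(mse) + 2 * k
```

Both functions expect the observed discharges sorted in ascending order and the computed discharges evaluated at the matching plotting positions. Because the penalty term 2K grows with the number of parameters, a four-parameter model must reduce the MSE enough to offset its larger penalty.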

Evaluation of GB2 Distribution
Annual maximum flood peak data from four gauging stations, namely sites 2, 6, 7 and 8 in Figure 3, were selected. The empirical frequencies were calculated first. The purpose of defining the empirical distribution is to compare it with selected theoretical distributions in order to verify whether they fit the sample data.
Many plotting position formulas have been proposed, most of which can be expressed in the general form:

$$P_i = \frac{i - a}{n + 1 - 2a}$$

where i is the rank of the observation in ascending order, n is the sample size, and a is a constant between 0 and 0.5 that differs among formulas: 0.5 for Hazen's formula, 0.3 for Chegodayev's formula, 0 for Weibull's formula, 3/8 for Blom's formula, 1/3 for Tukey's formula, and 0.44 for Gringorten's formula.
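The general plotting position family is simple to compute. The sketch below assumes the common form P_i = (i − a)/(n + 1 − 2a), which reproduces each of the named formulas for the listed values of a; `plotting_positions` and the `FORMULAS` dictionary are illustrative names.

```python
# Constant a for each named plotting position formula, as listed in the text.
FORMULAS = {
    "hazen": 0.5,
    "chegodayev": 0.3,
    "weibull": 0.0,
    "blom": 3.0 / 8.0,
    "tukey": 1.0 / 3.0,
    "gringorten": 0.44,
}


def plotting_positions(n, a):
    """Empirical non-exceedance probabilities for ranks 1..n (ascending order)."""
    return [(i - a) / (n + 1.0 - 2.0 * a) for i in range(1, n + 1)]
```

For example, `plotting_positions(4, FORMULAS["weibull"])` gives i/(n + 1) for i = 1, ..., 4, that is 0.2, 0.4, 0.6, and 0.8.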
Among these formulas, Gringorten's formula is recognized by many researchers, especially for the GEV, Gumbel, exponential, and generalized Pareto distributions, which have been widely used for flood frequency analysis [15][16][17][18][19][20]. The Gringorten formula was also used for the GB2 distribution. For the normal, generalized normal, and gamma distributions, Blom's formula is recommended [21,22]. For the Pearson type III and log-Pearson type III distributions, Weibull's formula is recommended [18,21].

The GB2 distribution was employed to fit the annual maximum (AM) series of the four sites. The distribution parameters were estimated using Equation (3) and are given in Table 2. The fitted GB2 distributions and the empirical frequencies of each AM series are shown in Figure 4. In the left panels of Figure 4, the line represents the fitted distribution and the circles represent the empirical frequencies of the observations. The results show that the fitted distributions agree well with the empirical data. Histograms of the AM flood peak series fitted by the GB2 distribution for the gauging stations on the Colorado River are shown in the right panels of Figure 4. They also indicate that the GB2 distribution can be fitted successfully to the empirical histograms.
Several distributions commonly used in hydrology, including the normal, exponential, gamma, Gumbel, generalized normal, Pearson type III, log-Pearson type III, generalized Pareto, and generalized extreme value distributions, were fitted to the AM series at these sites. The L-moment method was used to estimate the parameters of these distributions. Singh and Guo compared the POME method with the L-moment method and indicated that the two methods are comparable [11,23,24]; the choice of estimation method is therefore expected to have little influence on the value of the T-year design discharge.

The Kolmogorov-Smirnov test was used to compare each sample with the reference probability distribution. The p-values are given in Table 3. The closer the p-value is to 1, the more similar the theoretical and empirical distributions are. Table 3 shows that the p-value of the GB2 distribution is 1 or close to 1, which indicates that the GB2 distribution fits the data well. Table 3 also lists the RMSD and AIC values computed using Equations (4)-(7). The smaller the RMSD and AIC values, the better the distribution fits. For the Steamboat Springs site, the GB2 and generalized normal distributions have the smallest RMSD value, 0.025. For the Near Cisco site, the GB2 distribution has the smallest RMSD value, 0.061. For the Near Colorado-Utah site, the GB2 and gamma distributions have the smallest RMSD value. For the Hoover Dam site, the GB2 distribution has the smallest RMSD value. Because the GB2 distribution has more parameters and the AIC penalizes each additional parameter, its AIC values can be larger than those of the generalized normal, gamma, and GEV distributions. Judged by the RMSD, however, the GB2 distribution generally gives a better fit.

To compare the POME method with the method in current use, the maximum likelihood (ML) method was also employed to estimate the parameters of the GB2 distribution. Taking the Near Colorado-Utah site as an example, the parameters estimated by the POME and ML methods are given in Table 4, together with the p-value, RMSD, and AIC values. The parameters obtained by the two methods are nearly the same, and the RMSD and AIC values based on the POME method are smaller.
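For readers unfamiliar with the Kolmogorov-Smirnov test, the sketch below computes the test statistic D (the largest gap between the empirical and fitted CDFs) and an asymptotic p-value from the Kolmogorov series. This is a generic illustration, not the procedure or software used in the study; the asymptotic p-value is only approximate for short records, and `ks_statistic` and `ks_pvalue_asymptotic` are hypothetical names.

```python
import math


def ks_statistic(sample, cdf):
    """Largest absolute difference between the empirical CDF and a fitted CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Compare the fitted CDF with the empirical step just after and just before x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def ks_pvalue_asymptotic(d, n, terms=100):
    """Asymptotic p-value: 2 * sum_{k>=1} (-1)**(k-1) * exp(-2 k^2 n d^2)."""
    lam = math.sqrt(n) * d
    if lam < 0.2:
        return 1.0  # the series converges slowly here and the p-value is essentially 1
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * s))
```

A small D yields a p-value near 1, which is the situation reported for the GB2 distribution in Table 3.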

Flood Frequency Analysis
Hoover Dam is a multi-purpose dam, serving the needs of flood control, irrigation, water supply, and hydropower generation. It was therefore desirable to determine the most appropriate distribution for FFA at the dam site. The T-year design flood at Hoover Dam was calculated using each distribution, as given in Table 5; different distributions yielded significantly different values. For example, the 1000-year design flood values calculated by the GB2 and gamma distributions were 76,702 and 50,485 ft³/s, respectively. The RMSD and AIC values were 0.036 and 1098.8 for the GB2 distribution, compared with 0.057 and 1192.9 for the gamma distribution, which indicates that the GB2 distribution performs much better than the gamma distribution. It follows that if the gamma distribution were used, the design flood would be underestimated relative to the better-fitting GB2 estimate, and the potential flood risk would be higher.
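A T-year design flood is the quantile x_T of the fitted annual maximum distribution that satisfies F(x_T) = 1 − 1/T. The sketch below solves this equation by bisection for any CDF supplied as a function; it is a generic illustration under that annual-maximum assumption, and `design_flood` is a hypothetical helper. The exponential CDF with an arbitrary scale of 1000 is used only because its quantile has a closed form that makes the result easy to check.

```python
import math


def design_flood(cdf, return_period, lo, hi, tol=1e-9):
    """Solve cdf(x) = 1 - 1/T for x by bisection on the bracket [lo, hi]."""
    target = 1.0 - 1.0 / return_period
    if not (cdf(lo) <= target <= cdf(hi)):
        raise ValueError("bracket [lo, hi] does not contain the design flood")
    while hi - lo > tol * max(1.0, abs(hi)):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def exponential_cdf(x, scale=1000.0):
    """Exponential CDF with an arbitrary illustrative scale."""
    return 1.0 - math.exp(-x / scale) if x > 0 else 0.0


x_100 = design_flood(exponential_cdf, 100.0, 0.0, 1.0e6)
```

For the exponential example, `x_100` should match the closed form 1000·ln(100). Because the quantile is evaluated far in the upper tail for large T, distributions that fit the bulk of the data similarly can still give very different design values.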

Change in Flood Frequency Distribution with Change in Drainage Area
The GB2 distribution was applied for FFA along the main stem of the Colorado River. Four gauging stations (sites 4, 5, 6 and 7) from upstream to downstream were used, as shown in Figure 3 and Table 6. These gauging stations were selected because they are all on the mainstream and no dam has been built on this reach. The drainage area and statistical characteristics (including mean, skewness and kurtosis of the annual maximum data) of these stations were calculated, as given in Table 1. The T-year design flood of these gauging stations was calculated, as shown in Figure 5, in which the x-axis represents the return period and the y-axis represents the design flood value. Figure 5 shows that, for a given return period, the design flood value at the downstream gauging stations is larger than that at the upstream gauging stations.

The rates of increase of drainage area and T-year design flood values between adjacent gauging stations were computed, as given in Table 6, which indicates that the percentage increase of the drainage area was of similar magnitude to that of the design flood values. For instance, as the drainage area increased by 45% from the gauging station near Dotsero to that near Cameo, the flood value increased by 43% on average. From upstream to downstream, when the drainage area increased by 45%, 55% and 26%, the flood value increased by 43%, 42%, and 16%, respectively. It seems that in a mountainous watershed, the farther upstream the reach is, the greater the impact the drainage area has on the flood. This may be because the runoff coefficient is generally larger in steep areas.
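The statement that drainage area has a greater impact farther upstream can be checked with simple arithmetic on the percentages quoted above. The sketch below divides each percentage increase in flood value by the corresponding percentage increase in drainage area; the ratio is only an illustrative summary and is not a quantity defined in the original study.

```python
# Percentage increases between adjacent stations, from upstream to downstream (from the text).
area_increase = [45.0, 55.0, 26.0]
flood_increase = [43.0, 42.0, 16.0]

# Flood response per unit of relative drainage-area growth for each reach.
ratios = [f / a for f, a in zip(flood_increase, area_increase)]
print([round(r, 2) for r in ratios])  # prints [0.96, 0.76, 0.62]
```

The ratio falls from about 0.96 on the most upstream reach to about 0.62 on the most downstream reach, which is consistent with the interpretation given in the text.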

Evolution of Frequency Distribution along Stream
To determine the evolution of the frequency distribution and its parameters along the river, data from the Yampa River were used, because this river is regarded as one of the West's last wild rivers and has only a few small dams and diversions. The Yampa River, with a length of 402 km, is located in northwestern Colorado and is a tributary of the Green River and a secondary tributary of the Colorado River. Data from three gauging stations along this river, designated as sites 1, 2 and 3 in Figure 6, were used. The GB2 distribution was used to fit the AM series of each of the three gauging stations, as shown in Table 7. It can be seen that shape parameters r1 and r2 decreased along the river, and the value of r1 became close to 1. When r1 equals 1, the GB2 distribution becomes the Burr XII distribution [25]. This distribution has been shown to fit income distribution data reasonably well [20,26,27] and has recently been used in hydrology [2,28]. The PDF of the Burr XII distribution can be written as:

$$f(x) = \frac{r_2 r_3}{b} \left(\frac{x}{b}\right)^{r_3 - 1} \left[1 + \left(\frac{x}{b}\right)^{r_3}\right]^{-(r_2 + 1)}$$

where b is the scale parameter. The Burr XII distribution was also used to fit the data at the gauging station near Maybell on the Yampa River. The estimated parameters of the Burr XII distribution were r2 = 1.94, r3 = 4.19, and b = 12.33. The fitting results of the GB2 and Burr XII distributions for the gauging station near Maybell are shown in Figure 7. For this station, the parameters of the GB2 distribution estimated by the POME method are nearly the same as the parameters of the Burr XII distribution estimated by the MLE method. Thus, the Burr XII distribution can be used instead of the GB2 distribution for FFA at that station. In other words, the distribution for FFA changes from the four-parameter GB2 distribution to the three-parameter Burr XII distribution along the Yampa River.

There is an evolution of the distribution along this river. From Equation (1), the value of scale parameter β increases with the mean value, because more water flows into the stream. Parameters r1 and r2 govern the left and right tails, respectively. The smaller the value of r1, the fatter the left tail is; and the smaller the value of r2, the fatter the right tail is. It can be seen from Table 7 that both r1 and r2 decrease along the stream, which demonstrates that both the left and right tails become fatter: the PDF values become larger in the tails and lower in the central area.
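One practical advantage of the Burr XII special case is that its CDF and quantile function have closed forms, so a T-year flood can be computed directly. The sketch below assumes the standard Burr XII CDF F(x) = 1 − [1 + (x/b)^r3]^(−r2) and uses the parameter values reported for the station near Maybell; the units of b are not stated in this text, so results are in whatever units b carries. The function names are hypothetical.

```python
def burr12_cdf(x, r2, r3, b):
    """Burr XII CDF with shape parameters r2, r3 and scale b."""
    if x <= 0:
        return 0.0
    return 1.0 - (1.0 + (x / b) ** r3) ** (-r2)


def burr12_quantile(p, r2, r3, b):
    """Inverse of the Burr XII CDF for 0 < p < 1."""
    return b * ((1.0 - p) ** (-1.0 / r2) - 1.0) ** (1.0 / r3)


def burr12_design_flood(return_period, r2, r3, b):
    """T-year value for annual maxima: the quantile at p = 1 - 1/T."""
    return burr12_quantile(1.0 - 1.0 / return_period, r2, r3, b)


# Parameters reported in the text for the gauging station near Maybell.
R2, R3, B = 1.94, 4.19, 12.33
floods = {T: burr12_design_flood(T, R2, R3, B) for T in (10, 100, 1000)}
```

Substituting each value in `floods` back into `burr12_cdf` should recover 1 − 1/T. A smaller r2 makes the right tail heavier and therefore raises the design values at long return periods.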

Conclusions
The GB2 provides sufficient flexibility to fit a large variety of data sets. Papalexiou and Koutsoyiannis introduced this distribution in hydrology and used it for rainfall frequency analysis [8]. In this study, the generalized beta distribution of the second kind (GB2) is introduced for FFA for the first time. The POME method was proposed to estimate the parameters of the GB2 distribution; the POME equations were derived by the authors and are given in Appendix A. The Colorado River basin was selected as a case study to test the performance of the GB2 distribution. Frequency estimates from the GB2 distribution were also compared with those of distributions commonly used in hydrology. In addition, some characteristics of FFA in mountainous areas are discussed. The conclusions can be summarized as follows:
(1) The GB2 is appealing for FFA because its four parameters allow it to fit data with very different histogram shapes, such as J-shaped and bell-shaped distributions. By fixing certain parameters, the GB2 distribution also yields some well-known distributions, such as the beta distribution of the second kind (B2), the Burr type XII, and the generalized gamma (GG).
(2) The parameters estimated by the POME method are reasonable. Both the marginal distributions and the histograms indicate that the GB2 distribution can be fitted successfully to empirical values using the POME method.
(3) The performance of the GB2 distribution is better than that of the distributions widely used in hydrology. For the Steamboat Springs site, the GB2 and generalized normal distributions have the smallest RMSE values. For the Near Cisco site, the GB2 distribution has the smallest RMSE value. For the Near Colorado-Utah site, the GB2 and gamma distributions have the smallest RMSE values. For the Hoover Dam site, the GB2 distribution has the smallest RMSE value. Because the GB2 distribution has more parameters, its AIC values are larger than those of the generalized normal, gamma, and GEV distributions. Nevertheless, in terms of RMSE the GB2 distribution generally gives a better fit.
(4) When different distributions are used for FFA, significantly different design flood values are obtained. This indicates that if the wrong distribution were used, the design flood would be underestimated and the potential flood risk would be higher.
(5) The design flood value increases with the drainage area. For a given return period, the design flood value at the downstream gauging stations is larger than that at the upstream gauging stations. In this study, the percentage increase in drainage area was nearly the same as that in the design flood values. It seems that in a mountainous watershed, the farther upstream the reach is, the greater the impact of the drainage area on floods. This may be because the runoff coefficient is generally larger in steep areas.
(6) There is an evolution of the distribution along the river. Along the Yampa River, the distribution for FFA changes from the four-parameter GB2 distribution to the three-parameter Burr XII distribution. Both r1 and r2 decrease along the stream, which shows that both the left and right tails become fatter: the PDF values become larger in the tails and lower in the central area. This means that as the drainage area becomes larger, the flood magnitudes vary more strongly.
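The relation between the GB2 and Burr XII distributions noted in conclusions (1) and (6), and the link between a return period T and a design flood value, can be illustrated with a minimal sketch. It assumes the common (a, b, p, q) parametrization of the GB2 density, f(x) = a x^(ap-1) / (b^(ap) B(p, q) (1 + (x/b)^a)^(p+q)), which may differ from the notation of the main text (p and q are assumed here to play the role of r1 and r2). The function names and the numerical parameter values are hypothetical and are not estimates from the case study.

```python
import math


def gb2_logpdf(x, a, b, p, q):
    """Log-density of GB2(a, b, p, q) for x > 0, a > 0 (assumed parametrization)."""
    z = a * (math.log(x) - math.log(b))  # z = log((x/b)^a)
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    # log(1 + exp(z)) evaluated in a numerically stable way
    log1p_ez = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    return math.log(a) - math.log(x) + p * z - log_beta - (p + q) * log1p_ez


def burr12_pdf(x, a, b, q):
    """Burr XII density, i.e., the GB2 density with p fixed at 1."""
    r = (x / b) ** a
    return (a * q / b) * (x / b) ** (a - 1) * (1.0 + r) ** (-(q + 1))


def burr12_cdf(x, a, b, q):
    """Burr XII distribution function F(x) = 1 - (1 + (x/b)^a)^(-q)."""
    return 1.0 - (1.0 + (x / b) ** a) ** (-q)


def burr12_return_level(T, a, b, q):
    """T-year flood: the quantile with non-exceedance probability 1 - 1/T.

    Solving F(x) = 1 - 1/T gives x_T = b * (T^(1/q) - 1)^(1/a).
    """
    return b * (T ** (1.0 / q) - 1.0) ** (1.0 / a)


if __name__ == "__main__":
    a, b, q = 2.0, 10.0, 3.0  # hypothetical values, for illustration only
    for T in (10, 50, 100):
        print(T, burr12_return_level(T, a, b, q))
```

Under these assumptions, fixing p = 1 in the four-parameter density reproduces the three-parameter Burr XII density, and the Burr XII design flood has a closed form. For the full GB2 distribution, the quantile has to be obtained by inverting the regularized incomplete beta function numerically.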

Figure 1. Shapes of the PDF of the GB2 distribution.

Figure 2. The GB2 distribution and its special cases (BR12: Burr XII distribution; BR3: Burr III distribution; B2: beta distribution of the second kind; Fisk: log-logistic distribution; L: Lomax distribution; IL: inverse Lomax distribution; GA: gamma distribution; GN: generalized normal distribution; W: Weibull distribution; EXP: exponential distribution).

Figure 3. Locations of gauging stations on the Colorado River.

Figure 4. Marginal distributions and histograms of AM flood peak series fitted by the GB2 distribution for the gauging stations on the Colorado River. (a) Steamboat Springs; (b) Near Colorado-Utah; (c) Near Cisco; (d) Hoover Dam.

Figure 5. Flood values along the mainstream of the upper Colorado River.

Figure 6. Evaluations of the PDF at sites along the Yampa River.

Figure 7. Marginal distribution and histograms of AM flood peak series fitted by the GB2 and Burr XII distributions for the gauging station near Maybell on the Yampa River.

Table 1. Characteristics of the gauging stations used in the study.

Table 2. Parameters of the GB2 distribution for the gauging stations along the Colorado River.

Table 3. RMSE and AIC values of different distributions.

Table 4. Parameters estimated by the POME and ML methods for the Near Colorado-Utah site.

Table 5. Comparison of T-year design flood discharges (10³ ft³/s) calculated by different distributions for the Hoover Dam site.

Table 6. Statistical characteristics of the four gauging stations, and the rates of increase in drainage area and flood discharge between adjacent gauging stations.

Table 7. Parameters of the GB2 distribution for four gauging stations along the Yampa River.
