Review

Frequency Analysis of Hydrological Data for Urban Floods—Review of Traditional Methods and Recent Developments, Especially an Introduction of Japanese Proper Methods

by Chiharu Mizuki 1,2,3,* and Yasuhisa Kuzuha 1,2
1 Disaster Mitigation Research Center, Mie University, 1577 Kurima-Machiya, Tsu 514-8507, Japan
2 Mie Disaster Mitigation Center, Mie Prefecture and Mie University, 1577 Kurima-Machiya, Tsu 514-8507, Japan
3 Graduate School of Regional Innovation Studies, Mie University, 1577 Kurima-Machiya, Tsu 514-8507, Japan
* Author to whom correspondence should be addressed.
Water 2023, 15(13), 2490; https://doi.org/10.3390/w15132490
Submission received: 10 April 2023 / Revised: 19 June 2023 / Accepted: 4 July 2023 / Published: 7 July 2023
(This article belongs to the Special Issue Urban Water-Related Problems)

Abstract

Frequency analysis has long been an important theme of hydrology research. Although meteorological techniques (physical approaches) such as radar nowcasting, remote sensing, and forecasting heavy rainfall events using meteorological simulation models are quite effective for urban disaster prevention, statistical and stochastic theories that include frequency analysis are also valuable for flood control planning. Master plans for flood control projects in urban areas often use the concept of T-year hydrological values with a T-year return period: a flood control target is a “landside area that is safe against heavy rainfall or floods with a return period of T years”. This review emphasizes discussions of parameter estimation of stochastic models and selection of optimal statistical models, including evaluation of the goodness-of-fit of statistical models. Based on those results, the authors criticize the Japanese standard procedures recommended by the central government. Consistency between parameter estimation and evaluation of goodness-of-fit is necessary. From this perspective, we recommend using the maximum likelihood method and AIC, both of which are related to Kullback–Leibler divergence. If one prefers using SLSC, we recommend using not SLSC itself but SLSC’s non-exceedance probability. One important purpose of this review is the introduction of widely used Japanese methods: because some techniques that differ slightly from the international standard have been used for many years in Japan, we introduce them in this review article.

1. Introduction

First, we would like to emphasize that parameter estimation and selection of the optimal probability distribution are the most important processes in hydrological frequency analysis. Therefore, we focus on only these techniques in this review article. Nevertheless, hydrologists have used similar techniques for the past three or four decades; no decisive technique has yet been proposed that numerous researchers regard as optimal.
For preventing water-related disasters, flood control plans are usually made for large rivers. In Japan, main rivers are designated as “Class-1 rivers” in principle, managed by the central government, or as “Class-2 rivers” managed by local governments. A certain numerical goal is set in a flood control plan, for which the jurisdictional government has a responsibility to protect people, residences, and other properties in the river basin.
According to Nakamura [1], such numerical goals are set using two methods. Japan, the Netherlands, the Philippines, and other countries have adopted stochastic goals: T-year hydrological values with a return period of T years. The United States, China, and other nations have adopted historical maximum values. This review specifically examines the former case, in which the government estimates T-year hydrological values. The estimation processes are divisible mainly into two kinds: non-parametric and parametric methods. Takara [2] described that non-parametric methods can be adopted when the sample size is sufficiently large.
Non-parametric methods are likely to be superior to parametric methods because they use no specific probability distribution: neither parameters nor an optimal probability distribution need to be selected. By contrast, using parametric methods, one must estimate parameters and select the optimal probability distribution, and these selection processes include subjective judgments. If one uses parametric methods, subjective judgment must be eliminated to the greatest degree possible. The “Japanese MLIT (Ministry of Land, Infrastructure, Transport and Tourism) flow chart” described later includes some subjective judgment. Therefore, the authors are critical of that method and argue that one should not rely on the MLIT flow chart.
The next chapter briefly presents international standard procedures used for hydrological frequency analysis. Because some techniques used in Japan are slightly different from the international standard, we introduce those in Section 3. Techniques described in Section 2 and Section 3 are those which have been used for many years. Section 4 presents other techniques developed in recent years. Subsequently, we introduce some future perspectives.

2. International Standard Procedure

The World Meteorological Organization (WMO) published its “Guide to Hydrological Practices (WMO-No. 168 fifth edition)” [3] in 1994. One chapter has the title “Frequency analysis (Chapter 27)”. The chapter includes the statement that “hydrological phenomena that are commonly described by frequency analysis are storm precipitation and annual flood maxima”. It presents 16 probability distributions that are commonly used in hydrology, including the lognormal distribution, Pearson type III distribution, Gumbel distribution, generalized extreme value distribution, and others that have been used for hydrological extreme values. The sixth edition of the guide [4] was published later; it introduced the Kolmogorov–Smirnov test, the probability plot correlation test, AIC, and BIC, all of which are related to goodness-of-fit testing. Moreover, the L-moment method was also mentioned in the guide.
In the “Handbook of Hydrology” [5], one chapter has the title “Frequency Analysis of Extreme Events”. As parameter estimation methods, the authors first introduced the method of moments (MOM), the method of L-moments, and maximum likelihood. They describe that maximum likelihood estimators (MLEs) have very good statistical properties for large samples and that experience has shown they generally perform well with the record lengths available in hydrology studies, although MLEs often cannot be reduced to simple formulas. Regarding the selection of the optimal probability distribution, the authors described goodness-of-fit tests and L-moment diagrams. The textbook introduces the Kolmogorov–Smirnov test, the probability plot correlation coefficient test, L-moment diagrams [6], and ratio tests.
Rao and Hamed [7] explicitly described the selection of distributions. After reviewing many reports in the literature, including those by Hazen [8], Markovic [9], Gupta [10], McCuen and Rawls [11], McCuen [12], Campbell and Sidel [13], Turkman [14], Vogel [15], Vogel and McMartin [16], Haktanir [17], Bobee et al. [18], and Onoz and Bayazit [19], they expounded the chi-square test, the Kolmogorov–Smirnov test, and Akaike’s Information Criterion (AIC) [20], and described how probability distributions for flood frequency analysis had been selected using these three methods.

3. Japanese History of Estimating T-Year Hydrological Value

As described in Section 2, Akaike proposed the information criterion—AIC [20]. Moreover, many researchers developed their own statistical hydrological theories. We have an impression that some hydrological procedures used in Japan differ somewhat from international standard procedures. Some effective theories might not be known worldwide because they have been published only in Japanese-language journals.
In Japan, the main class-1 rivers are managed by MLIT. An organization related to MLIT published some manuals [21,22,23] in which they explained river plan production.

3.1. Iwai Method for Parameter Estimation of a Three-Parameter Lognormal Distribution

Iwai [24] proposed a method of the “quantile method” type for parameter estimation of the three-parameter lognormal distribution. The so-called “Slade type [25] of lognormal distribution” has a bounded probability distribution function; Iwai used “Slade type II”, described below. First, the cumulative distribution function F(x) is defined by Equation (1), in which ξ is designated as the “reduced variate” (Equation (2)).
$$F(x) = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\xi} e^{-t^{2}}\,dt \quad (1)$$
$$\xi = \alpha \log_{10} \frac{x + b}{x_{0} + b} \quad (2)$$
This lognormal distribution has three parameters: α, x_0, and b. Additionally, −b is a lower bound (x > −b). After Kadoya [26] proposed a modification of the original Iwai method, the modified Iwai method has come to be used in most cases. Therefore, we present the “modified Iwai method” herein.
Presume extreme-value data of sample size n, namely annual maxima. We denote these samples as x_i (i = 1, 2, 3, …, n), arranged as ascending order statistics.
A.
Approximation of x 0
First, we use Equation (3) to estimate x_g, an approximation of x_0.
$$\log_{10} x_{g} = \frac{1}{n} \sum_{i=1}^{n} \log_{10} x_{i} \quad (3)$$
B.
Estimation of b and x 0
First, we produce the values b_(s)i (i = 1, 2, 3, …, m) using Equation (4), where the integer m is the nearest integer to n/10.
$$b_{(s)i} = \frac{x_{i}\,x_{n-i+1} - x_{g}^{2}}{2x_{g} - \left(x_{i} + x_{n-i+1}\right)} \quad (4)$$
Then, b is estimated using the following equation.
$$\hat{b} = \frac{1}{m} \sum_{i=1}^{m} b_{(s)i} \quad (5)$$
By defining X_i = log_10(x_i + b), the estimate of x_0 can be obtained by solving Equation (6).
$$\log_{10}\left(\hat{x}_{0} + b\right) = \frac{1}{n} \sum_{i=1}^{n} \log_{10}\left(x_{i} + b\right) = \frac{1}{n} \sum_{i=1}^{n} X_{i} \quad (6)$$
In this Equation, x 0 ^ is the estimate of x 0 ; b ^ is obtained using Equation (5), and is substituted for b in Equation (6).
C.
Final process: Estimation of α .
α is estimated by solving the following equation.
$$\frac{1}{\hat{\alpha}} = \sqrt{\frac{2n}{n-1} \left( \overline{X^{2}} - \bar{X}^{2} \right)} \quad (7)$$
In Equation (7), $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_{i}$ and $\overline{X^{2}} = \frac{1}{n}\sum_{i=1}^{n} X_{i}^{2}$.
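As an illustration, the modified Iwai procedure above can be condensed into a short program. This is a minimal sketch under the definitions of Equations (3)-(7), not an official implementation; the function name and test data are ours.

```python
import math

def iwai_estimate(x):
    """Modified Iwai method: estimate (alpha, x0, b) of the Slade-type
    three-parameter lognormal distribution (Equations (3)-(7))."""
    x = sorted(x)
    n = len(x)
    # Step A: geometric-mean approximation x_g of x0 (Equation (3))
    xg = 10.0 ** (sum(math.log10(v) for v in x) / n)
    # Step B: candidate lower-bound values b_(s)i from symmetric
    # order-statistic pairs (Equation (4)); m is the integer nearest n/10
    m = max(1, round(n / 10))
    bs = [(x[i] * x[n - 1 - i] - xg ** 2)
          / (2.0 * xg - (x[i] + x[n - 1 - i])) for i in range(m)]
    b = sum(bs) / m                                  # Equation (5)
    X = [math.log10(v + b) for v in x]
    x0 = 10.0 ** (sum(X) / n) - b                    # Equation (6)
    # Step C: scale parameter alpha (Equation (7))
    Xbar = sum(X) / n
    X2bar = sum(t * t for t in X) / n
    alpha = 1.0 / math.sqrt(2.0 * n / (n - 1) * (X2bar - Xbar ** 2))
    return alpha, x0, b
```

For data that are exactly geometric in x (log-linear), every numerator of Equation (4) vanishes, so the estimated lower-bound parameter b is essentially zero, as expected.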

3.2. Ishihara–Takase Method for Parameter Estimation of Three-Parameter Lognormal Distribution

Ishihara and Takase proposed their method [27], which belongs to the “moment method” type. Their method, like the Iwai method, estimates the parameters of the three-parameter lognormal distribution. Although a natural logarithm can be used instead of a common logarithm, we use Equation (2) for consistency with the Iwai method described above.
First, we calculate the sample average x ¯ , standard deviation s, and coefficient of skewness C S 1 . These are estimated using Equations (8)–(10) presented below.
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_{i} \quad (8)$$
$$s^{2} = \frac{1}{n-1} \sum_{i=1}^{n} \left(x_{i} - \bar{x}\right)^{2} = \frac{n}{n-1}\left(\overline{x^{2}} - \bar{x}^{2}\right), \qquad s = \sqrt{s^{2}} \quad (9)$$
$$C_{S1} = \frac{1}{n-1} \sum_{i=1}^{n} \left( \frac{x_{i} - \bar{x}}{s} \right)^{3} \quad (10)$$
The parameters that must be estimated are α , b, and x 0 . Ishihara and Takase concluded that α is estimated using the following Equation (11).
$$k = \sqrt{\frac{1}{2 \ln\!\left[ \left( \frac{2 + C_{S}^{2} + \sqrt{4C_{S}^{2} + C_{S}^{4}}}{2} \right)^{1/3} + \left( \frac{2}{2 + C_{S}^{2} + \sqrt{4C_{S}^{2} + C_{S}^{4}}} \right)^{1/3} - 1 \right]}}, \qquad \alpha = k \ln 10 \quad (11)$$
The reason for using k is that their original paper adopted natural rather than common logarithms in Equation (2); k is the parameter for the natural-logarithm case. Furthermore, C_S in Equation (11) is not the C_S1 of Equation (10): C_S1 is biased, and C_S is the value corrected using the correction factor F_CS in Equation (12).
$$C_{S} = C_{S1}\left(1 + F_{CS}\right) \quad (12)$$
As for the correction factor F_CS, Ishihara and Takase presented it as a figure: one can read F_CS, which is a function of the sample size n and C_S1, from what is well known as Ishihara–Takase’s figure. However, F_CS can now be calculated easily by computer using the following procedure. Therefore, we recommend that analysts calculate F_CS themselves.
A.
Estimating tentative k and α using Equation (11)
First, we estimate k and α. In Equation (11), C_S1 is substituted for C_S; C_S1 is calculated using Equation (10) from the observed sample x_i.
B.
Generating ξ_i (i = 1, 2, 3, …, n)
According to Hazen’s plotting position formula (for plotting position formulas, see [8,28,29]), the non-exceedance probability F_i (i = 1, 2, 3, …, n) is calculated. Additionally, ξ_i is calculated using the inverse function of Equation (1), ξ(F). Hazen’s plotting position formula is Equation (13), where i is the order of the ascending order statistics:
$$F_{i} = \frac{2i - 1}{2n} \quad (13)$$
The method for obtaining ξ(F) depends on the software used. Equation (1) can be written as F(ξ) = [1 + Erf(ξ)]/2, where Erf is the error function. Therefore, its inverse function can be expressed as Equation (14).
$$\xi(F) = \mathrm{Erf}^{-1}\left(2F - 1\right) \quad (14)$$
y_i is obtained using the inverse function of Equation (2), given as Equation (15); we can use x_0 = 1 and b = 0 for simplicity of calculation.
$$y(\xi) = 10^{\xi/\alpha}\left(x_{0} + b\right) - b = e^{\xi/k}\left(x_{0} + b\right) - b \quad (15)$$
C.
Calculating C*_S1_y and C*_S_y of the samples
Using Equation (10), we calculate C*_S1_y, the coefficient of skewness of y_i (not of x_i). We then obtain the theoretical coefficient of skewness C*_S_y using Equation (16), where k is the value first estimated from the sample x_i.
$$C^{*}_{S\_y} = \frac{\exp\!\left(\frac{9}{4k^{2}}\right) - 3\exp\!\left(\frac{5}{4k^{2}}\right) + 2\exp\!\left(\frac{3}{4k^{2}}\right)}{\left[\exp\!\left(\frac{1}{k^{2}}\right) - \exp\!\left(\frac{1}{2k^{2}}\right)\right]^{3/2}} \quad (16)$$
As a result, F_CS = C*_S_y / C*_S1_y − 1 is obtained.
D.
Calculating three parameters
Using the corrected coefficient of skewness, k (equivalently α) is re-estimated with Equation (11). Then b and x_0 are estimated using Equation (17) (Iwai and Ishiguro [30]).
$$\lambda = \exp\!\left(\frac{1}{4k^{2}}\right), \qquad b = \frac{\sigma}{\sqrt{\lambda^{2} - 1}} - \bar{x}, \qquad x_{0} = \bar{x} - \frac{\left(\lambda - 1\right)\sigma}{\lambda\sqrt{\lambda^{2} - 1}} \quad (17)$$
In Equation (17), x ¯ and σ , respectively, denote the average and standard deviation of the observed sample x i .
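The whole procedure (steps A-D), including numerical computation of the correction factor F_CS in place of reading the Ishihara–Takase figure, can be sketched as follows. This is our own illustrative code: Python's `statistics.NormalDist` supplies the inverse error function through the identity Erf⁻¹(2F − 1) = Φ⁻¹(F)/√2.

```python
import math
from statistics import NormalDist

def ishihara_takase(x):
    """Ishihara-Takase moment method for the three-parameter lognormal:
    steps A-D with a numerically computed correction factor F_CS."""
    n = len(x)
    xbar = sum(x) / n
    s = math.sqrt(sum((v - xbar) ** 2 for v in x) / (n - 1))
    def skew(y):                            # sample skewness, Equation (10)
        m = len(y)
        yb = sum(y) / m
        sy = math.sqrt(sum((v - yb) ** 2 for v in y) / (m - 1))
        return sum(((v - yb) / sy) ** 3 for v in y) / (m - 1)
    def k_from_cs(cs):                      # Equation (11)
        r = 2.0 + cs * cs + math.sqrt(4.0 * cs * cs + cs ** 4)
        w = (r / 2.0) ** (1.0 / 3.0) + (2.0 / r) ** (1.0 / 3.0) - 1.0
        return math.sqrt(1.0 / (2.0 * math.log(w)))
    cs1 = skew(x)
    k = k_from_cs(cs1)                      # step A: tentative k
    nd = NormalDist()
    y = []
    for i in range(1, n + 1):               # step B
        F = (2 * i - 1) / (2 * n)           # Hazen, Equation (13)
        xi = nd.inv_cdf(F) / math.sqrt(2.0) # Erf^{-1}(2F - 1), Equation (14)
        y.append(math.exp(xi / k))          # Equation (15) with x0 = 1, b = 0
    cs1_y = skew(y)                         # step C
    kk = k * k
    cs_y = ((math.exp(9 / (4 * kk)) - 3 * math.exp(5 / (4 * kk))
             + 2 * math.exp(3 / (4 * kk)))
            / (math.exp(1 / kk) - math.exp(1 / (2 * kk))) ** 1.5)  # Eq. (16)
    fcs = cs_y / cs1_y - 1.0
    k = k_from_cs(cs1 * (1.0 + fcs))        # step D: corrected skewness (12)
    lam = math.exp(1.0 / (4.0 * k * k))     # Equation (17)
    b = s / math.sqrt(lam * lam - 1.0) - xbar
    x0 = xbar - (lam - 1.0) * s / (lam * math.sqrt(lam * lam - 1.0))
    alpha = k * math.log(10.0)
    return alpha, x0, b
```

For a two-parameter lognormal sample (lower bound zero), the estimated b should come out near zero and x_0 near the sample median.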

3.3. Etoh’s Distribution

Etoh et al. [31] proposed a probability distribution for extreme values. The cumulative distribution function of Etoh’s distribution, which has the two parameters a and b, is Equation (18). This probability density function has a heavy tail.
$$F(x) = \begin{cases} \exp\!\left[-a\left(1 + \sqrt{bx}\right)\exp\!\left(-\sqrt{bx}\right)\right] & (x \ge 0) \\ 0 & (x < 0) \end{cases} \quad (18)$$
Although the probability density function of Equation (19) has been used, the limit of F at x = 0 is e^{−a}, which is not zero, in accordance with Equation (18). Therefore, Hayashi et al. [32] proposed the modified function of Equation (20), where δ(x) represents Dirac’s delta.
$$f(x) = \frac{ab}{2} \exp\!\left[-\sqrt{bx} - a\left(1 + \sqrt{bx}\right)\exp\!\left(-\sqrt{bx}\right)\right] \quad (x \ge 0) \quad (19)$$
$$f(x) = \begin{cases} \frac{ab}{2} \exp\!\left[-\sqrt{bx} - a\left(1 + \sqrt{bx}\right)\exp\!\left(-\sqrt{bx}\right)\right] + \delta(x)\,e^{-a} & (x \ge 0) \\ 0 & (x < 0) \end{cases} \quad (20)$$
Because e^{−a} is usually small, however, the use of Equation (19) is adequate.
As a parameter estimation method, we usually use the maximum likelihood method. Etoh et al. [31] and Hoshi [33] recommend the following procedure. The log-likelihood of this probability distribution is presented as Equation (21).
$$L(a, b) = \sum_{j=1}^{N} \ln f\left(x_{j}\right) = N \ln a + N \ln b - N \ln 2 - \sum_{j=1}^{N} \sqrt{b x_{j}} - a \left[ \sum_{j=1}^{N} \exp\!\left(-\sqrt{b x_{j}}\right) + \sum_{j=1}^{N} \sqrt{b x_{j}}\, \exp\!\left(-\sqrt{b x_{j}}\right) \right] \quad (21)$$
By solving ∂L/∂b = 0, we can obtain a as a function of b, given as Equation (22) and referred to as a_1.
$$a_{1} = \frac{\sum_{j=1}^{N} \sqrt{b x_{j}} - 2N}{\sum_{j=1}^{N} b x_{j} \exp\!\left(-\sqrt{b x_{j}}\right)} \quad (22)$$
Then, substituting the a_1 obtained from Equation (22) into Equation (21), L(a, b) reduces to L(b). Finally, we seek the b that maximizes L(b). Kubota [34] proposed the following procedure. Solving ∂L/∂a = 0, one can obtain a (designated as a_2) as a function of b from Equation (23).
$$a_{2} = \frac{N}{\sum_{j=1}^{N} \exp\!\left(-\sqrt{b x_{j}}\right) + \sum_{j=1}^{N} \sqrt{b x_{j}}\, \exp\!\left(-\sqrt{b x_{j}}\right)} \quad (23)$$
The solution for a can be obtained by minimizing h(b) = |a_1 − a_2| [33], which can be performed easily using software such as Mathematica [34]. In Japan, Etoh’s distribution is thought to be appropriate for extreme-value data. Kuzuha and Mizuki [35] applied several probability distributions to 42,500 series of annual maximum one-hour rainfall data, each of sample size 60. They reported that Etoh’s distribution was most appropriate for 37% of the 42,500 series, the two-parameter lognormal distribution for 42%, and the Gumbel distribution for 14%.

3.4. Approach Proposed by Tsuchiya and Takeuchi

Although Etoh’s distribution is quite an effective probability distribution, its L-moment solution has not been known. This probability distribution was not described by Hosking and Wallis [6] because it is not well-known internationally.
Tsuchiya et al. [36] (see also Kuzuha [37]) presented the PWM solution of this probability distribution as follows. Their solution was obtained using numerical procedures, but the method is simple.
Specifically, we can estimate the parameters using the following procedure.
$$\beta_{r} = \int_{0}^{1} x(F)\, F^{r}\, dF = \int_{0}^{\infty} x\, F(x)^{r} f(x)\, dx \quad (24)$$
$$M_{1,0,0} = \beta_{0} = \int_{0}^{\infty} x f(x)\, dx = \frac{ab}{2} \int_{0}^{\infty} x \exp\!\left[-\sqrt{bx} - a\left(1 + \sqrt{bx}\right) \exp\!\left(-\sqrt{bx}\right)\right] dx$$
$$M_{1,1,0} = \beta_{1} = \int_{0}^{\infty} x F(x) f(x)\, dx = \frac{ab}{2} \int_{0}^{\infty} x \exp\!\left[-\sqrt{bx} - 2a\left(1 + \sqrt{bx}\right) \exp\!\left(-\sqrt{bx}\right)\right] dx \quad (25)$$
Equation (24) defines the probability weighted moments; Equation (25) gives the zeroth-order and first-order moments β_0 and β_1. Equation (26) presents the corresponding sample probability weighted moments.
$$\hat{M}_{1,0,0} = b_{0} = \frac{1}{n} \sum_{i=1}^{n} x_{i}, \qquad \hat{M}_{1,1,0} = b_{1} = \frac{1}{n} \sum_{i=1}^{n} x_{i}\, \frac{i - 1}{n - 1} \quad (26)$$
As Tsuchiya and Takeuchi reported [36], M_{1,1,0}/M_{1,0,0} is independent of b: it is a function of a only. Therefore, we can set b = 1 and obtain Equation (27).
$$\frac{\int_{0}^{\infty} x \exp\!\left[-\sqrt{x} - 2a\left(1 + \sqrt{x}\right) \exp\!\left(-\sqrt{x}\right)\right] dx}{\int_{0}^{\infty} x \exp\!\left[-\sqrt{x} - a\left(1 + \sqrt{x}\right) \exp\!\left(-\sqrt{x}\right)\right] dx} = \frac{b_{1}}{b_{0}} \quad (27)$$
a is obtained by numerically solving Equation (27).
Finally, the estimate of b is obtained by numerically solving Equation (28) after substituting the obtained estimate for a.
$$\hat{M}_{1,0,0} = \frac{ab}{2} \int_{0}^{\infty} x \exp\!\left[-\sqrt{bx} - a\left(1 + \sqrt{bx}\right) \exp\!\left(-\sqrt{bx}\right)\right] dx \quad (28)$$
Furthermore, we would like to mention the following facts. Takeuchi and Tsuchiya reported the PWM solutions of the normal distribution [38], the lognormal distribution, and the Pearson type III distribution [39]. Because their findings were published in a Japanese journal, they have not become well-known internationally, but they obtained these solutions ahead of the international hydrological community.

3.5. Ueda–Kawamura’s Criterion for Evaluating Goodness-of-Fit

Ueda and Kawamura [40] proposed a criterion to evaluate the goodness-of-fit of a probability model. Although many textbooks have recommended evaluating the validity of a probability model using probability plots, it is difficult to evaluate validity quantitatively in that way. They sought to evaluate the probability model’s goodness-of-fit quantitatively, as explained below.
A.
Presume sample data of size n arranged as ascending order statistics; then, using a plotting position formula, the non-exceedance probability F_P(x_i) is estimated. Several plotting position formulas are expressed by Equation (29).
$$F_{P}\left(x_{i}\right) = \frac{i - \alpha}{n + 1 - 2\alpha - \beta} \quad (29)$$
For example, for Cunnane’s formula [29], α is 0.4 and β is 0.
B.
If the cumulative distribution function of the probability model is F x , then, of course, the non-exceedance probability is F x i .
C.
Ueda and Kawamura plot (F(x_i), F_P(x_i)) on a graph with linear axes; the minimum and maximum of both axes are 0 and 1. From the viewpoint of goodness-of-fit, the plotted data should lie near the line y = x.
Ueda and Kawamura proposed the use of the χ 2 test as a goodness-of-fit test. As a result, the χ 2 value of each probability distribution is a candidate “fair criterion” when choosing a probability distribution.
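The comparison can be sketched as follows; here, a Gumbel CDF with illustrative parameters stands in for the candidate model (the criterion itself applies to any fitted distribution), and the Cunnane plotting position is assumed.

```python
import math

def uk_points(x, cdf, alpha=0.4, beta=0.0):
    """Points (F(x_i), F_P(x_i)) of the Ueda-Kawamura goodness-of-fit plot:
    model non-exceedance probability vs plotting-position probability
    (Equation (29)). A good model puts the points near the line y = x."""
    xs = sorted(x)
    n = len(xs)
    pts = []
    for i, v in enumerate(xs, start=1):
        fp = (i - alpha) / (n + 1 - 2 * alpha - beta)   # Cunnane by default
        pts.append((cdf(v), fp))
    return pts

def gumbel_cdf(v, loc=50.0, scale=12.0):
    """Candidate model: Gumbel CDF with assumed (illustrative) parameters."""
    return math.exp(-math.exp(-(v - loc) / scale))
```

The squared departures of these points from y = x could then feed a chi-square-type statistic, as Ueda and Kawamura proposed.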

3.6. Takasao–Takara’s SLSC for Evaluating Goodness-of-Fit

Takasao et al. [41] proposed the standard least-squares criterion for goodness of fit (SLSC). This criterion evaluates goodness-of-fit by linearity on a probability plotting paper. The SLSC is expressed as the following Equation (30).
$$\mathrm{SLSC} = \frac{\sqrt{\frac{1}{n}\sum_{i=1}^{n} \left(s_{i} - s^{*}_{i}\right)^{2}}}{\left| s_{0.99} - s_{0.01} \right|} \quad (30)$$
In Equation (30), s is the reduced variate, calculated according to Equation (31), where ξ and α, respectively, denote the location and scale parameters.
$$s = \left(x - \xi\right)/\alpha \quad (31)$$
$$x^{*}_{i} = x\!\left(F_{P}\left(x_{i}\right)\right) \quad (32)$$
The value of x*_i is calculated using Equation (32), where x(·) denotes the quantile function of the fitted distribution and F_P(x_i) is the plotting position probability; s*_i is then transformed from x*_i by Equation (31). One can assume probability plotting paper with horizontal axis x and vertical axis s. The points (x_i, s_i) lie on a straight line by definition, whereas the points (x_i, s*_i) are plotted only approximately on that line. SLSC is the mean distance between the straight line and (x_i, s*_i): it evaluates the degree of separation of the probability model from the sample not visually but numerically.
Takasao et al. used the denominator on the right side of Equation (30) to maintain the fairness of the criterion: they regarded the vertical scales of the probability plotting papers of different probability distributions as corrected to the same scale when divided by this denominator. As Kuzuha [42] and Hayashi et al. [43] found, and as Kuzuha et al. [35,44,45,46] later examined in detail, however, this is not true. That point is explained in the next section.
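For a concrete illustration, SLSC can be computed as follows for the Gumbel distribution, whose reduced variate satisfies s = −ln(−ln F); the plotting position formula (Cunnane, α = 0.4) and parameters here are illustrative choices of ours.

```python
import math

def slsc_gumbel(x, loc, scale, alpha=0.4):
    """SLSC (Equation (30)) of a Gumbel fit.
    s_i: reduced variate of each observation, s = (x - loc)/scale (Eq. (31));
    s*_i: Gumbel reduced variate at the plotting position of x_i (Eq. (32))."""
    xs = sorted(x)
    n = len(xs)
    s = [(v - loc) / scale for v in xs]
    s_star = [-math.log(-math.log((i - alpha) / (n + 1 - 2 * alpha)))
              for i in range(1, n + 1)]
    rmse = math.sqrt(sum((si - ti) ** 2 for si, ti in zip(s, s_star)) / n)
    s99 = -math.log(-math.log(0.99))   # reduced variate at F = 0.99
    s01 = -math.log(-math.log(0.01))   # reduced variate at F = 0.01
    return rmse / abs(s99 - s01)
```

A perfect fit (data lying exactly at the plotting-position quantiles) yields SLSC near zero; misfitting parameters inflate it.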

3.7. Procedure for Parameter Estimation and Choosing the Probability Distribution of the Japan Ministry of Land, Infrastructure, Transport, and Tourism

For estimating long-term stochastic hydrological values (e.g., the 100-year precipitation, whose return period is 100 years), the MLIT has used its own flow chart for parameter estimation and for choosing an optimal probability distribution [21]. Recently, Kuzuha and Mizuki criticized the flow chart. The flow chart has several shortcomings, but we regard it as wholly inappropriate for the following main reasons:
(1)
The most important process of the MLIT flow chart is the evaluation of goodness-of-fit by SLSC and the evaluation of variability by a resampling technique for each probability distribution: candidates for the optimal probability distribution are first chosen using SLSC. However, the authors found that SLSC is not valid from the perspective of fairness among probability distributions. An unfair referee should not judge the match.
(2)
In the MLIT flow chart, among the candidates selected above, the probability distribution with the smallest variability of T-year values is regarded as the optimal one. Consequently, three different criteria are used in the flow chart: one at the parameter-estimation process, one at the process of selecting the optimal distribution, and one at evaluating variability.
(3)
The criteria of the maximum likelihood method for parameter estimation and of AIC are related to Kullback–Leibler divergence [47]. If one uses the L-moment method (or the conventional moment method), importance is assigned instead to the coincidence of L-moments (moments) between the model and the data; mixing such criteria within one procedure is inconsistent.
(4)
Work by Tanaka and Takara [48] probably affected the MLIT flow chart the most. Tanaka and Takara mentioned that “if SLSC is less than 0.04, we regard the probability distribution’s goodness-of-fit as sufficient. If 0.03 were used as the threshold, most probability distributions would be evaluated as inappropriate from the viewpoint of goodness-of-fit. Then, we use 0.04 as the threshold”. The authors have criticized this rationale because it is not scientific: it serves the administrative convenience of the Japanese MLIT.

3.8. Current Best Practice

We think that consistency between the processes of parameter estimation and choosing the probability distribution is extremely important. In this context, “consistency” means using the same or a similar criterion for parameter estimation and for evaluation of goodness-of-fit. Moreover, we believe that “evaluating variability” in the MLIT flow chart is not necessary. Let us explain the reason in detail. Most important is that the criterion for evaluating goodness-of-fit be fair from the perspective of comparing probability distributions: because we compare a goodness-of-fit measure across probability distributions and select the optimal one, fairness matters most. From this perspective, SLSC is not a fair measure at all.
Suppose that an analyst uses the maximum likelihood method for parameter estimation of an A-probability distribution and a B-probability distribution, and that the analyst chooses Takasao–Takara’s criterion (SLSC) for selecting the optimal probability distribution. Parameters are selected to maximize the likelihood; then the two distributions are compared, and if the SLSC of the A-distribution is smaller than that of the B-distribution, the A-distribution is selected as optimal. This poses a big problem: had the parameters instead been selected to minimize SLSC, other parameter sets might have been obtained, and the B-distribution might have been selected as optimal. This is why we insist that consistency of the measure for parameter estimation and for evaluating goodness-of-fit is quite important.
According to the arguments presented above, using the maximum likelihood method for parameter estimation and using AIC for testing goodness-of-fit are the recommended procedures. The main reason is that both are related to Kullback–Leibler divergence [47]. As described in Section 3.7, Tanaka and Takara’s explanation [48] for the threshold (=0.04) is inappropriate. However, one can understand the difficulty for policymakers in government agencies of changing their methods quickly to align with an academic perspective. Therefore, we presented some issues related to the conventional method in earlier reports [35,44,45].
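As a minimal illustration of the recommended procedure (our sketch; the candidate distributions, parameter values, and data are illustrative), one can fit each candidate by maximum likelihood and compare the maximized likelihoods through AIC:

```python
import math

def gumbel_mle(x, iters=200):
    """ML fit of the Gumbel distribution via the standard fixed-point
    iteration on the scale parameter beta."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    beta = math.sqrt(6.0 * var) / math.pi          # moment starting value
    for _ in range(iters):
        w = [math.exp(-v / beta) for v in x]
        beta = mean - sum(v * wi for v, wi in zip(x, w)) / sum(w)
    mu = -beta * math.log(sum(math.exp(-v / beta) for v in x) / n)
    return mu, beta

def gumbel_loglik(x, mu, beta):
    z = [(v - mu) / beta for v in x]
    return sum(-math.log(beta) - t - math.exp(-t) for t in z)

def lognormal2_loglik(x):
    """ML fit and maximized log-likelihood of the 2-parameter lognormal."""
    lx = [math.log(v) for v in x]
    m = sum(lx) / len(lx)
    s2 = sum((t - m) ** 2 for t in lx) / len(lx)
    return sum(-0.5 * math.log(2.0 * math.pi * s2) - t
               - (t - m) ** 2 / (2.0 * s2) for t in lx)

def aic(loglik, n_params):
    """Akaike's Information Criterion: smaller is better."""
    return -2.0 * loglik + 2.0 * n_params
```

The distribution with the smallest AIC is selected, so the same Kullback–Leibler-based criterion governs both the fitting and the selection.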
A.
We recommend using the maximum likelihood method and AIC (or TIC, etc.).
B.
If an analyst prefers using SLSC, then we recommend not using SLSC itself but SLSC’s non-exceedance probability F ( S L S C ) . For calculating F ( S L S C ) , one must know SLSC’s probability distribution function. Hayashi et al. [43] and Kuzuha and Mizuki [35,44] demonstrated how to obtain the SLSC’s probability density function using Monte Carlo simulation.
C.
If an analyst uses the SLSC’s non-exceedance probability, then they can evaluate the goodness-of-fit of each probability distribution, even if SLSC is not a fair criterion. That procedure can be applied to any criterion, even if the criterion is not a fair one from the viewpoint of comparing the degrees of goodness-of-fit.
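The Monte Carlo evaluation of F(SLSC) can be sketched as follows. This is a simplified illustration of ours: replicates are drawn from a Gumbel population and refitted by the method of moments, whereas Hayashi et al. [43] and Kuzuha and Mizuki [35,44] treat the fitting methods and candidate distributions in full.

```python
import math, random

def slsc_gumbel(xs, mu, beta, alpha=0.4):
    """SLSC of a Gumbel fit (reduced variate s = (x - mu)/beta)."""
    xs = sorted(xs)
    n = len(xs)
    s = [(v - mu) / beta for v in xs]
    t = [-math.log(-math.log((i - alpha) / (n + 1 - 2 * alpha)))
         for i in range(1, n + 1)]
    rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(s, t)) / n)
    s99 = -math.log(-math.log(0.99))
    s01 = -math.log(-math.log(0.01))
    return rmse / abs(s99 - s01)

def slsc_nonexceedance(slsc_obs, n, trials=2000, seed=0):
    """Monte Carlo estimate of F(SLSC): the probability that SLSC does not
    exceed the observed value when the data of sample size n really follow
    the fitted (here Gumbel) distribution."""
    rng = random.Random(seed)
    gamma = 0.5772156649015329               # Euler-Mascheroni constant
    count = 0
    for _ in range(trials):
        # draw n standard Gumbel variates by inversion
        y = [-math.log(-math.log(rng.random())) for _ in range(n)]
        m = sum(y) / n
        sd = math.sqrt(sum((v - m) ** 2 for v in y) / n)
        beta = math.sqrt(6.0) * sd / math.pi  # moment refit per replicate
        mu = m - gamma * beta
        if slsc_gumbel(y, mu, beta) <= slsc_obs:
            count += 1
    return count / trials
```

Because F(SLSC) is a probability, it is directly comparable across probability distributions, which is the fairness property the raw SLSC lacks.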

4. Novel Techniques and Future Perspectives

In 2004, Gelder [49] described some well-known techniques for parameter estimation: the method of moments (MOM), maximum likelihood estimation (MLE), least squares, Bayesian estimation, minimum cross-entropy, probability weighted moments (PWMs), and L-moments. More recent reports, such as that by Yuan et al. (2018) [50], described the adoption of the so-called MOM for parameter estimation. Langat et al. (2019) [51] adopted MLE after reviewing some techniques. Those are MOM, L-moments, LH moments [52], and the expected moments algorithm (EMA). Anghel and Ilinca (2022) [53] used both MOM and L-moments for parameter estimation.
Coles [54] and Hayashi et al. [43] considered non-stationary hydrological models. Hayashi et al. discussed non-stationary hydrological frequency models introducing time-dependent parameters. Their report recommended the use of MLE for parameter estimation. Langat et al. commented on the method of Bayesian estimation: “although there are drawbacks of complexity in its implementation in present time, it might become a useful non-stationarity flood frequency analysis model in the future, with advancements in technology”.
Yuan et al. (2018) [50] described that “the choice of an appropriate PDF is still one of the major issues in engineering practice because there is no general agreement as to which distribution could be used for the frequency analysis of extreme rainfalls”. They adopted the chi-square test for selecting the optimal probability distribution. Langat et al. [51] introduced the Kolmogorov–Smirnov, Anderson–Darling, and Cramer–Von Mises tests in addition to the chi-square test.
Most techniques described above have a long history; no attractive, novel technique that has become a new international standard has been proposed in recent years. Nevertheless, because hydrological frequency analyses using non-stationary hydrological data have become increasingly important in light of drastic climate change, non-stationary analyses have become ever more necessary. Some techniques are useful for non-stationary analyses. The maximum likelihood method and AIC, TIC, or BIC, which are related to Kullback–Leibler divergence [47], are expected to be crucially important in this research area. In addition, the method of Bayesian estimation might be particularly effective.

5. Conclusions

We reviewed statistical hydrological studies, especially those conducted in Japan. Many Japanese government analysts often use procedures developed in Japan, which have been recommended by Japanese MLIT. We criticized the use of those procedures. Some consistency between parameter estimation and evaluation of goodness-of-fit is necessary. From this perspective, we recommend using the maximum likelihood method and AIC, both of which are related to Kullback–Leibler divergence. If one prefers using SLSC, we recommend not SLSC itself but SLSC’s non-exceedance probability.
Techniques for parameter estimation and selecting the optimal probability distribution should be discussed from an international viewpoint. Some techniques related to Kullback–Leibler divergence or Bayesian estimation might be candidates for the solution of non-stationary flood frequency analyses.

Author Contributions

Conceptualization, C.M. and Y.K.; methodology, C.M. and Y.K.; investigation, C.M. and Y.K.; writing, C.M.; supervision, Y.K.; project administration, Y.K.; funding acquisition, Y.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partly supported by JSPS Grants-in-Aid for Scientific Research. The grant Number is JP19K04613.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Nakamura, S. Floods and Probability; University of Tokyo Press: Tokyo, Japan, 2021; 195p. (In Japanese) [Google Scholar]
  2. Takara, K. Frequency Analysis of Larger Samples of Hydrologic Extreme-Value Data—How to estimate the T-year quantile for samples with a size of more than the return period T. Annu. Disaster Prev. Res. Inst. Kyoto Univ. 2006, 49B, 7–12. (In Japanese) [Google Scholar]
  3. World Meteorological Organization (WMO). Guide to Hydrological Practices, 5th ed.; World Meteorological Organization (WMO): Geneva, Switzerland, 1994; Available online: http://www.innovativehydrology.com/WMO-No.168-1994.pdf (accessed on 28 December 2022).
  4. World Meteorological Organization (WMO). Guide to Hydrological Practices, 6th ed.; World Meteorological Organization (WMO): Geneva, Switzerland, 2009; Available online: https://www.hydrology.nl/images/docs/hwrp/WMO_Guide_168_Vol_II_en.pdf (accessed on 28 December 2022).
  5. Stedinger, J.R.; Vogel, R.M.; Foufoula-Georgiou, D. Frequency Analysis of Extreme Events. In Handbook of Hydrology; Maidment, D.R., Ed.; McGraw-Hill: New York, NY, USA, 1992; pp. 18.1–18.66. [Google Scholar]
  6. Hosking, J.R.M.; Wallis, J.R. Regional Frequency Analysis; Cambridge University Press: New York, NY, USA, 1997; 224p. [Google Scholar]
  7. Rao, A.R.; Hamed, K.H. Flood Frequency Analysis; CRC Press: Boca Raton, FL, USA, 2000; 350p. [Google Scholar]
  8. Hazen, A. Storage to be Provided in Impounding Reservoirs for Municipal Water Supply. Trans. ASCE 1914, 77, 1308. [Google Scholar]
  9. Markovic, R.D. Probability Functions of the Best Fit to Distributions of Annual Precipitation and Runoff; Hydrology paper No. 8; Colorado State University: Fort Collins, CO, USA, 1965. [Google Scholar]
  10. Gupta, V.L. Selection of Frequency Distribution Models. Water Resour. Res. 1970, 6, 1193–1198. [Google Scholar] [CrossRef]
  11. McCuen, R.H.; Rawls, W.J. Classification of Evaluation of Flood Flow Frequency Estimation Techniques. Water Resour. Bull. 1979, 15, 88–93. [Google Scholar] [CrossRef]
  12. McCuen, R.H. Statistical Terminology: Definitions and Interpretation for Flood Peak Estimation. Water Resour. Bull. 1979, 15, 1106–1116. [Google Scholar] [CrossRef]
  13. Campbell, A.J.; Sidel, R.C. Prediction of Peak Flows on Small Watersheds in Oregon for Use in Culvert Design. Water Resour. Bull. 1984, 20, 9–14. [Google Scholar] [CrossRef]
  14. Turkman, K.F. The choice of extremal models by Akaike’s information criterion. J. Hydrol. 1985, 82, 307–315. [Google Scholar] [CrossRef]
  15. Vogel, R.M. The Probability Plot Correlation Coefficient Test for the Normal, Lognormal, and Gumbel Distributional Hypotheses. Water Resour. Res. 1986, 22, 587–590. [Google Scholar] [CrossRef] [Green Version]
  16. Vogel, R.W.; McMartin, D.E. Probability Plot Goodness-of-Fit and Skewness Estimation Procedures for the Pearson Type 3 Distribution. Water Resour. Res. 1991, 27, 3149–3158. [Google Scholar] [CrossRef]
  17. Haktanir, T. Comparison of various flood frequency distributions using annual flood peaks data of rivers in Anatolia. J. Hydrol. 1992, 136, 1–31. [Google Scholar] [CrossRef]
  18. Bobée, B.; Cavadias, G.; Ashkar, F.; Bernier, J.; Rasmussen, P. Towards a systematic approach to comparing distributions used in flood frequency analysis. J. Hydrol. 1993, 142, 121–136. [Google Scholar] [CrossRef]
  19. Önöz, B.; Bayazit, M. Best-fit distributions of largest available flood samples. J. Hydrol. 1995, 167, 195–208. [Google Scholar] [CrossRef]
  20. Akaike, H. Information Theory and an Extension of the Maximum Likelihood Principle. In Proceedings of the 2nd International Symposium on Information Theory, Tsahkadsor, Armenia, 2–8 September 1973; pp. 267–281. [Google Scholar]
  21. Committee to Discuss on River Plan Design for Small and Medium-Sized Rivers. Guide for River Plan Design for Small and Medium-Sized Rivers; Japan Institute of Countryology and Engineering: Tokyo, Japan, 1999; 243p. (In Japanese) [Google Scholar]
  22. Japan Institute of Countryology and Engineering. Guide for Discussion on High-Water Plan; Japan Institute of Countryology and Engineering: Tokyo, Japan, 2007; 43p. (In Japanese) [Google Scholar]
  23. International Centre for Water Hazard and Risk Management (ICHARM). User’s Manual of Hydrological Statistics Utility. Available online: https://www.pwri.go.jp/icharm/special_topic/20171013_manual_en_hsu/english_manual_for_hydrological_statistics_utility.pdf (accessed on 28 December 2022).
  24. Iwai, S. Some Estimating Methods of Probable Flood and Their Application to Japanese Rivers. Bull. Math. Stat. 1949, 2, 21–36. (In Japanese) [Google Scholar]
  25. Slade, J.J.J. An asymmetric probability function. Trans. ASCE 1936, 101, 35–61. [Google Scholar] [CrossRef]
  26. Kadoya, M. On the Applicable Ranges and Parameters of Logarithmic Normal Distributions of Slade-Type. J. Irrig. Eng. Rural Plan. 1962, 3, 12–16. (In Japanese) [Google Scholar]
  27. Ishithara, T.; Takase, N. The Logarithmic-Normal Distribution and its Solution Based on Moment Method. Trans. JSCE 1957, 47, 18–23. (In Japanese) [Google Scholar]
  28. Barnett, V. Probability Plotting Methods and Order Statistics. J. R. Stat. Soc. 1975, 24, 95–108. [Google Scholar] [CrossRef]
  29. Cunnane, C. Unbiased plotting positions—A review. J. Hydrol. 1978, 37, 205–222. [Google Scholar] [CrossRef]
  30. Iwai, S.; Ishiguro, M. Applied Hydrological Statistics; Morikita Publishing: Tokyo, Japan, 1970; 370p. (In Japanese) [Google Scholar]
  31. Etoh, T.; Murota, A.; Yonetani, T.; Kinoshita, T. Frequency of Record-breaking Large Precipitation. Proc. JSCE 1986, 369, 165–174. (In Japanese) [Google Scholar] [CrossRef] [Green Version]
  32. Hayashi, H.; Tachikawa, Y.; Shiiba, M. Non-Stationary Hydrologic Frequency Analysis Using Time Dependent Parameters and Its Model Selection. J. Jpn. Soc. Civ. Eng. Ser B1 2015, 71, 28–42. [Google Scholar]
  33. Hoshi, K. Hydrological Statistical Analysis. Month. Rep. Civil Eng. Res. Inst. 1998, 540, 31–63. (In Japanese) [Google Scholar]
  34. Kubota, K. On Probability Distribution and Method for Estimating Statistics. Available online: http://civilyarou.web.fc2.com/WANtaroHP_html5_win/f90_ENGI/dir_HFA/suimon.pdf (accessed on 28 December 2022). (In Japanese).
  35. Kuzuha, Y.; Mizuki, C. T-year Hydrological Event Estimation Using the Akaike Information Criterion and Some Considerations. J. Jpn. Soc. Hydrol. Water Resour. 2022, 35, 134–147. (In Japanese) [Google Scholar] [CrossRef]
  36. Tsuchiya, K.; Takeuchi, K. Application of PWM Method to SQRT-ET-max Distribution. In Proceedings of the 42nd Annual Conference of the Japan Society of Civil Engineers (Division 2); 1987; pp. 34–35. (In Japanese). [Google Scholar]
  37. Kuzuha, Y. L-moment Solution of Etoh’s Distribution. J. JSCE 2023, unpublished manuscript. [Google Scholar]
  38. Takeuchi, K.; Tsuchiya, K. A PWM Solution for Parameters of Normal Distribution. Annu. J. Hydraul. Eng. 1987, 31, 191–196. (In Japanese) [Google Scholar]
  39. Takeuchi, K.; Tsuchiya, K. PWM Solutions to Nomal, Lognormal and Pearson-III Distributions. Proc. JSCE 1988, 393/II-9, 95–101. (In Japanese) [Google Scholar]
  40. Ueda, T.; Kawamura, A. A New Graphical Method of Testing the Goodness of fit of Data to Probability Distributions. Proc. JSCE 1985, 357/II-3, 243–246. (In Japanese) [Google Scholar]
  41. Takasao, T.; Takara, K.; Shimizu, A. A Basic Study on Frequency Analysis of Hydrologic Data in The Lake Biwa Basin. Annu. Disaster Prev. Res. Inst. Kyoto Univ. 1986, 29B-2, 157–171. (In Japanese) [Google Scholar]
  42. Kuzuha, Y. Considerations of Statistical Method in Flood-control planning -SLSC and Cost Benefit Analysis. J. JSCE 2010, 66, 66–75. (In Japanese) [Google Scholar] [CrossRef]
  43. Hayashi, H.; Tachikawa, Y.; Shiiba, M.; Yorozu, K.; Sunmin, K. Introducing a Statistical Hypothesis Testing into SLSC Goodness-of -fit Evaluation for Hydrological Frequency Analysi Models. J. JSCE 2012, 68, I_1381–I_1386. [Google Scholar]
  44. Kuzuha, Y.; Mizuki, C. Estimating T-year Hydrological Event and Issues of Conventional Methods—Improved Standard Least Squares Criterion (SLSC) Method for Goodness-of-fit Evaluation. J. Jpn. Soc. Hydol. Water Resour. 2021, 34, 283–302. (In Japanese) [Google Scholar] [CrossRef]
  45. Kuzuha, Y.; Mizuki, C. Estimating T-year Hydrological Event and Issues of Conventional Methods—Improved Standard Least Squares Criterion (SLSC) Method for Goodness-of-fit Evaluation. J. Jpn. Soc. Hydrol. Water Resour. 2021, 34, 283–302. (In Japanese) [Google Scholar] [CrossRef]
  46. Kuzuha, Y.; Mizuki, C. Some Issues Related to SLSC Method and Guideline for Prameter Estimation and Goodness-of-fit Test. J. JSCE 2022, 78, I-487–I-492. (In Japanese) [Google Scholar] [CrossRef]
  47. Kullback, S.; Leibler, R.A. On information and sufficiency. Ann. Math. Stat. 1951, 22, 79–86. [Google Scholar] [CrossRef]
  48. Tanaka, S.; Takara, K. Goodness-of-fit and Stability Assessment in Flood Frequency Analysis. Annu. J. Hydraul. Eng. 1999, 43, 127–132. (In Japanese) [Google Scholar] [CrossRef] [Green Version]
  49. Van Gelder, P.H.A.J.M. Statistical Estimation Methods in Hydrological Engineering. In Proceedings International Scientific Seminar; Korytny, L.M., Luxemburg, W.M., Eds.; Publishing House of the Institute of Geography: Irkutsk, Russia, 2004; pp. 11–57. [Google Scholar]
  50. Yuan, J.; Emura, K.; Farnham, C.; Alam, M.A. Frequency analysis of annual maximum hourly precipitation and determination of best fit probability distribution for regions in Japan. Urban Clim. 2018, 24, 276–286. [Google Scholar] [CrossRef]
  51. Langat, P.K.; Kumar, L.; Koech, R. Identification of the Most Suitable Probability Distribution Models for Maximum, Minimum, and Mean Streamflow. Water 2019, 11, 734. [Google Scholar] [CrossRef] [Green Version]
  52. Wang, Q.J. Lh moments for statistical analysis of extreme events. Water Resour. Res. 1997, 33, 2841–2848. [Google Scholar] [CrossRef]
  53. Anghel, C.G.; Ilinca, C. Parameter Estimation for Some Probability Distributions Used in Hydrology. Appl. Sci. 2022, 12, 12588. [Google Scholar] [CrossRef]
  54. Coles, S. An Introduction to Statistical Modeling of Extreme Values; Springer: London, UK, 2001; 223p. [Google Scholar]