Probability Models and Statistical Tests for Extreme Precipitation Based on Generalized Negative Binomial Distributions

Abstract: Mathematical models are proposed for statistical regularities of maximum daily precipitation within a wet period and total precipitation volume per wet period. The proposed models are based on the generalized negative binomial (GNB) distribution of the duration of a wet period. The GNB distribution is a mixed Poisson distribution, the mixing distribution being generalized gamma (GG). The GNB distribution demonstrates excellent fit with real data of durations of wet periods measured in days. By means of limit theorems for statistics constructed from samples with random sizes having the GNB distribution, asymptotic approximations are proposed for the distributions of maximum daily precipitation volume within a wet period and total precipitation volume for a wet period. It is shown that the exponent power parameter in the mixing GG distribution matches slow global climate trends. The bounds for the accuracy of the proposed approximations are presented. Several tests for daily precipitation, total precipitation volume and precipitation intensities to be abnormally extremal are proposed and compared to the traditional PoT-method. The results of the application of these tests to real data are presented.


Introduction
In this paper, we continue the research we started in [1,2]. We develop the mathematical models for statistical regularities in precipitation proposed in the papers mentioned above. We consider the models for the statistical regularities in the duration of a wet period, maximum daily precipitation within a wet period and total precipitation volume per wet period. The base for the models is the generalized negative binomial (GNB) introduced in the recent paper [3]. The GNB distribution is a mixed Poisson distribution, the mixing distribution being generalized gamma (GG). The results of fitting the GNB distribution to real data are presented and demonstrate excellent concordance of the GNB model with the empirical distribution of the duration of wet periods measured in days. Based on this GNB model, asymptotic approximations are proposed for the distributions of the maximum daily precipitation volume within a wet period and of the total precipitation volume for a wet period. The asymptotic distribution of the maximum daily precipitation volume within a wet period turns out to be a tempered scale mixture of the gamma distribution in which the scale factor has the Weibull distribution, whereas the asymptotic approximation for the total precipitation volume for a wet period turns out to be the GG distribution. These asymptotic approximations are deduced using limit theorems for statistics constructed from samples with random sizes having the GNB distribution. The bounds for the accuracy of the proposed approximations are discussed theoretically and illustrated statistically. The proposed approximations appear to be very accurate. Based on these models, two approaches are proposed to the definition of abnormally extremal precipitation. These approaches can be regarded as a further development of those proposed in [2].
The importance of the problem of modeling statistical regularities in extreme precipitation is indisputable. Understanding climate variability and trends at relatively large time horizons is of crucial importance for long-range business, say, agricultural projects and forecasting of risks of water floods, dry spells and other natural disasters. Modeling regularities and trends in heavy and extreme daily precipitation is important for understanding climate variability and change at relatively small or medium time horizons. However, these models are much more uncertain as compared to those derived for mean precipitation or total precipitation during a wet period. In [4], a detailed review of this phenomenon is presented and it is noted that, at least for the European continent, most results hint at a growing intensity of heavy precipitation over the last decades.
In [2], we proposed a rather reasonable approach to the unambiguous (algorithmic) determination of extreme or abnormally heavy total precipitation for a wet period. This approach was based on the NB model for the duration of wet periods measured in days, and, as a consequence, on the distribution of the total precipitation volume during a wet period. This approach has some advantages. First, estimates of the parameters of the total precipitation are weakly affected by the accuracy of the daily records and are less sensitive to missing values. Second, the corresponding mathematical models are theoretically based on limit theorems of probability theory that yield unambiguous asymptotic approximations, which are used as adequate mathematical models. Third, this approach gives an unambiguous algorithm for the determination of extreme or abnormally heavy total precipitation that does not involve statistical significance problems owing to the low occurrence of such (relatively rare) events.
The problem of the construction of a statistical test for the precipitation volume to be abnormally large can be mathematically formalized as follows. Let $m \ge 2$ be a natural number and consider a sample of $m$ positive observations $X_1, X_2, \ldots, X_m$. With finite $m$, among the $X_i$'s there is always an extreme observation, say, $X_1$, such that $X_1 \ge X_i$, $i = 1, 2, \ldots, m$. Two cases are possible: (i) $X_1$ is a 'typical' observation and its extreme character is conditioned by purely stochastic circumstances (there must be an extreme observation within a finite homogeneous sample) and (ii) $X_1$ is abnormally large so that it is an 'outlier' and its extreme character is due to some exogenous factors.
To construct a test for distinguishing between these two cases for abnormally extreme daily precipitation, we use the fact that the distribution of the maximum daily precipitation per wet period is a tempered scale mixture of the gamma distribution in which the scale factor has the Weibull distribution. According to this model, a daily precipitation volume is considered to be abnormally extremal if it exceeds a certain (pre-defined) quantile of this distribution.
As regards testing for anomalous extremeness of total precipitation volume during a wet period, we use the GG distribution as the model of statistical regularities of its behavior. The theoretical grounds for this model are provided by the law of large numbers for random sums in which the number of summands has the GNB distribution. It turns out that, as compared to the ordinary negative binomial (NB) model (see [2]), the additional exponent power parameter in the corresponding GG distribution matches slow global climate trends. Hence, the hypothesis that the total precipitation volume during a certain wet period is abnormally large can be re-formulated as the homogeneity hypothesis of a sample from the GG distribution. Two equivalent tests are proposed for testing this hypothesis. One of them is based on the beta distribution whereas the second is based on the Snedecor-Fisher distribution. Both of these tests deal with the relative contribution of the total precipitation volume for a wet period to the considered set (sample) of successive wet periods. Within the second approach, it is possible to introduce the notions of relatively abnormal and absolutely abnormal precipitation volumes. These tests are scale-free and depend only on the easily estimated shape parameter of the GNB distribution and the time-scale parameter determining the denominator in the fractional contribution of a wet period under consideration. The tests appeared to be applicable not only to total precipitation volumes over wet periods but also to the precipitation intensities (the ratios of total precipitation volumes per wet periods to the durations of the corresponding wet periods measured in days).

Generalized Negative Binomial Model for the Duration of Wet Periods
The main results of this paper strongly rely on the GNB model, a wide and flexible family of discrete distributions that are mixed Poisson laws with the mixing GG distribution. Namely, we say that a random variable $N_{r,\gamma,\mu}$ ($r > 0$, $\gamma \in \mathbb{R}$ and $\mu > 0$) has the generalized negative binomial distribution, if
$$P(N_{r,\gamma,\mu} = k) = \frac{1}{k!}\int_0^{\infty} e^{-z} z^k g^*(z; r, \gamma, \mu)\,dz, \quad k = 0, 1, 2, \ldots, \tag{1}$$
where $g^*(z; r, \gamma, \mu)$ is the density of the GG distribution:
$$g^*(x; r, \gamma, \mu) = \frac{|\gamma|\mu^r}{\Gamma(r)}\, x^{\gamma r - 1} e^{-\mu x^{\gamma}}, \quad x \ge 0, \tag{2}$$
with $\gamma \in \mathbb{R}$, $\mu > 0$, $r > 0$. The GNB distributions seem to be very promising in the statistical description of many real phenomena, being very convenient and almost universal models.
It is necessary to explain why this combination of the mixed and mixing distributions is considered. First of all, the Poisson kernel is used as mixed for the following reasons. Pure Poisson processes can be regarded as the best models of stationary (time-homogeneous) chaotic flows of events [5]. Recall that the attractiveness of a Poisson process as a model of homogeneous discrete stochastic chaos is due to at least two circumstances. First, Poisson processes are point processes characterized by the time intervals between successive points that are independent random variables (r.v.'s) with one and the same exponential distribution. As is well known, the exponential distribution possesses the maximum differential entropy among all absolutely continuous distributions concentrated on the nonnegative half-line with a given finite expectation, whereas the entropy is a natural and convenient measure of uncertainty.
Second, the points forming the Poisson process are uniformly distributed along the time axis in the following sense. For any finite time interval $[t_1, t_2]$, $t_1 < t_2$, the conditional joint distribution of the points of the Poisson process that fall into $[t_1, t_2]$, under the condition that the number of such points is fixed and equals, say, $n$, coincides with the joint distribution of the order statistics constructed from an independent sample of size $n$ from the uniform distribution on $[t_1, t_2]$. In turn, the uniform distribution possesses the maximum differential entropy among all absolutely continuous distributions concentrated on finite intervals and corresponds very well to the conventional impression of an absolutely unpredictable random variable (see, e.g., [5,6]). However, in actual practice, as a rule, the parameters of the chaotic stochastic processes are influenced by poorly predictable "extrinsic" factors, which can be regarded as stochastic, so that the most reasonable probabilistic models of non-stationary (time-non-homogeneous) chaotic point processes are doubly stochastic Poisson processes, also called Cox processes (see, e.g., [5,7,8]). These processes are defined as Poisson processes with stochastic intensities. Such processes proved to be adequate models in insurance [5,7,8], financial mathematics [9], physics [10] and many other fields. Their one-dimensional distributions are mixed Poisson.
In order to have a flexible model of a mixing distribution that is "responsible" for the description of statistical regularities of the manifestation of external stochastic factors, we suggest to use the GG distributions defined by the density (2). The class of GG distributions was first described as a unitary family in 1962 by E. Stacy [11] as the class of probability distributions simultaneously containing both Weibull and gamma distributions. The family of GG distributions contains practically all the most popular absolutely continuous distributions concentrated on the non-negative half-line. In particular, the family of GG distributions contains:

• The gamma distribution (γ = 1) and, among its special cases, the chi-square distribution (γ = 1, µ = 1/2);
• The Nakagami distribution (γ = 2);
• The half-normal (folded normal) distribution (the distribution of the maximum of a standard Wiener process on the interval [0, 1]) (γ = 2, r = 1/2);
• The Maxwell distribution (the distribution of the absolute values of the velocities of molecules in a dilute gas) (γ = 2, r = 3/2);
• The Weibull-Gnedenko distribution (the extreme value distribution of type III) (r = 1, γ > 0);
• The (folded) exponential power distribution (γ > 0, r = 1/γ);
• The inverse gamma distribution (γ = −1).

GG distributions are widely applied in many practical problems. There are dozens of papers dealing with the application of GG-distributions as models of regularities observed in practice. Apparently, the popularity of GG-distributions is due to the fact that most of them can serve as adequate asymptotic approximations, since all the representatives of the class of GG-distributions listed above appear as limit laws in various limit theorems of probability theory in rather simple limit schemes. Below we will formulate a general limit theorem (an analog of the law of large numbers) for random sums of independent r.v.'s in which the GG-distributions are limit laws. It is worth noting that the GG distribution and its limit cases give a general form of the exponential distribution of rank 1 for the scale parameter.
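The special cases listed above can be checked numerically. The following minimal sketch assumes the GG density in Stacy's parametrization, $g^*(x; r, \gamma, \mu) = |\gamma|\mu^r x^{\gamma r - 1} e^{-\mu x^{\gamma}}/\Gamma(r)$ for $x > 0$; the function names are illustrative and do not come from the paper.

```python
import math

def gg_density(x, r, gamma, mu):
    """GG density |gamma| mu^r / Gamma(r) * x^(gamma*r - 1) * exp(-mu * x^gamma), x > 0.

    The parametrization is an assumption of this sketch (Stacy's form).
    """
    if x <= 0.0:
        return 0.0
    log_density = (math.log(abs(gamma)) + r * math.log(mu) - math.lgamma(r)
                   + (gamma * r - 1.0) * math.log(x) - mu * x ** gamma)
    return math.exp(log_density)

def gamma_density(x, r, mu):
    """Gamma density with shape r and rate mu (x > 0)."""
    return math.exp(r * math.log(mu) - math.lgamma(r)
                    + (r - 1.0) * math.log(x) - mu * x)

def weibull_density(x, gamma):
    """Weibull density gamma * x^(gamma - 1) * exp(-x^gamma) (x > 0)."""
    return gamma * x ** (gamma - 1.0) * math.exp(-x ** gamma)

def half_normal_density(x):
    """Density of |X| for a standard normal X (x >= 0)."""
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * x * x)

def inverse_gamma_density(x, r, mu):
    """Density of 1/G where G is gamma with shape r and rate mu (x > 0)."""
    return math.exp(r * math.log(mu) - math.lgamma(r)
                    - (r + 1.0) * math.log(x) - mu / x)

if __name__ == "__main__":
    x = 1.3
    print(gg_density(x, 0.85, 1.0, 2.0), gamma_density(x, 0.85, 2.0))
    print(gg_density(x, 1.0, 0.7, 1.0), weibull_density(x, 0.7))
    print(gg_density(x, 0.5, 2.0, 0.5), half_normal_density(x))
    print(gg_density(x, 2.0, -1.0, 1.5), inverse_gamma_density(x, 2.0, 1.5))
```

In this sketch the standard half-normal law corresponds to the additional choice µ = 1/2, which fixes the scale of the (γ = 2, r = 1/2) member of the family.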
In [1], the data registered at points as climatically different as Potsdam (Brandenburg, Germany) and Elista (Kalmykia, Russia) were analyzed, and it was demonstrated that the fluctuations of the numbers of successive wet days fit, with very high confidence, the NB distribution with shape parameters r = 0.847 and r = 0.876, respectively. In the same paper, a schematic attempt was undertaken to explain this phenomenon by the fact that NB distributions can be represented as mixed Poisson laws with mixing gamma distributions whereas, as has already been mentioned, the Poisson distribution is the best model for discrete stochastic chaos and the mixing distribution accumulates the stochastic influence of factors that can be assumed exogenous with respect to the local system under consideration.
The NB distributions are special cases of the GNB distributions. This family of discrete distributions is very wide and embraces Poisson distributions (as limit points corresponding to a degenerate mixing distribution), NB (Polya) distributions including geometric distributions (corresponding to the gamma mixing distribution, see [12]), Sichel distributions (corresponding to the inverse gamma mixing distributions, see [13,14]), Weibull-Poisson distributions (corresponding to the Weibull mixing distributions, see [15]) and many other types supplying descriptive statistics with many flexible models. More examples of mixed Poisson laws can be found in [8,16].
It is quite natural to expect that, having introduced one more free parameter into the pure negative binomial model, namely, the power parameter in the exponent of the original gamma mixing distribution, one might obtain a more flexible GNB model that provides an even better fit with the statistical data of the durations of wet periods. The analysis of the real data shows that this is indeed so.
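To make the mixed Poisson construction concrete, here is a minimal numerical sketch of the GNB probabilities. It assumes γ > 0 and uses the representation of a GG variable as a power of a gamma variable with shape r and rate µ, so that $P(N = k) = E\,[e^{-\Lambda}\Lambda^k/k!]$ with $\Lambda = G^{1/\gamma}$. The function names, the substitution and the Simpson quadrature are choices of this sketch, not the paper's Algorithm A1.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability of k events with mean lam (lam >= 0)."""
    if lam <= 0.0:
        return 1.0 if k == 0 else 0.0
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def gnb_pmf(k, r, gamma, mu, n_steps=20000):
    """P(N = k) for the mixed Poisson law with GG mixing, assuming gamma > 0.

    With G ~ Gamma(shape r, rate mu) the Poisson mean is G**(1/gamma).
    The substitution t = y**r removes the singularity of the gamma density
    at zero for r < 1, giving
        mu^r / Gamma(r + 1) * int_0^T exp(-mu t^(1/r)) pois(k; t^(1/(r gamma))) dt,
    which is evaluated by the composite Simpson rule.
    """
    if gamma <= 0.0:
        raise ValueError("this sketch covers gamma > 0 only")
    if n_steps % 2:
        n_steps += 1                      # Simpson's rule needs an even count
    t_max = (50.0 / mu) ** r              # beyond this exp(-mu * y) < e^-50
    h = t_max / n_steps

    def integrand(t):
        y = t ** (1.0 / r)
        return math.exp(-mu * y) * poisson_pmf(k, y ** (1.0 / gamma))

    total = integrand(0.0) + integrand(t_max)
    for i in range(1, n_steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return math.exp(r * math.log(mu) - math.lgamma(r + 1.0)) * total * h / 3.0

def nb_pmf(k, r, p):
    """Negative binomial probability Gamma(r+k) / (k! Gamma(r)) * p^r * (1-p)^k."""
    return math.exp(math.lgamma(r + k) - math.lgamma(k + 1) - math.lgamma(r)
                    + r * math.log(p) + k * math.log(1.0 - p))

if __name__ == "__main__":
    r, mu = 0.847, 1.0                    # r as reported for Potsdam; mu is illustrative
    for k in range(4):
        print(k, gnb_pmf(k, r, 1.0, mu), nb_pmf(k, r, mu / (1.0 + mu)))
```

For γ = 1 the output reproduces the NB probabilities with p = µ/(1 + µ), which is the sense in which the NB model is a special case of the GNB model.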
Figures 1 and 2 show the histograms constructed from real data of 3323 wet periods in Potsdam and 2937 wet periods in Elista. The same figures show the graphs of the fitted NB distribution (that is, the GNB distribution with γ = 1) and the fitted GNB distribution with additionally adjusted scale and power parameters. For vividness, in the GNB model, the value of the shape parameter r was taken the same as that obtained for the NB model and equal to 0.876 for Elista and 0.847 for Potsdam. For the "fine tuning" of the GNB models with these fixed values of r, the minimization of the $\ell_1$-norm of the difference between the histogram and the fitted GNB model was used. Appendix A presents Algorithm A1 for the computation of GNB probabilities by the minimization of the $\ell_1$-, $\ell_2$- and $\ell_\infty$-norms of the difference between the histogram and the fitted GNB model.
The analytic and asymptotic properties of the GNB distributions were studied in [3]. In particular, it was shown in that paper that the GNB distribution with shape parameter and exponent power parameter less than one is actually mixed geometric. The mixed geometric distributions were introduced and studied in [17] (also see [15,18]). A mixed geometric distribution can be interpreted in terms of the Bernoulli trials as follows. First, as a result of some "preliminary" experiment the value of some r.v. taking values in [0, 1] is determined, which is then used as the probability of success in the sequence of Bernoulli trials in which the original "unconditional" mixed Poisson r.v. is nothing else than the "conditionally" geometrically distributed r.v. having the sense of the number of trials up to the first failure. This makes it possible to assume that the sequence of wet/dry days is not independent but is conditionally independent and the random probability of success is determined by some outer stochastic factors.
As such factors, we can consider the seasonality or the type of the cause of a wet period. So, since the GG-distribution is a more general and, hence, a more flexible model than the "pure" gamma distribution, there arises a hope that the GNB distribution could provide an even better goodness of fit to the statistical regularities in the duration of wet periods than the "pure" NB distribution.
Figure 2. The histograms constructed from real data of 2937 wet periods in Elista and the fitted NB and GNB models, $\ell_1$-distance minimization.
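The distance criteria used for the fine tuning above can be sketched as follows. This is not Algorithm A1 from Appendix A. It is a hedged illustration in which the model family is the plain NB law with a fixed shape r, the search is a coarse grid over one parameter, and all function names are hypothetical.

```python
import math

def nb_pmf(k, r, p):
    """Negative binomial probability Gamma(r+k) / (k! Gamma(r)) * p^r * (1-p)^k."""
    return math.exp(math.lgamma(r + k) - math.lgamma(k + 1) - math.lgamma(r)
                    + r * math.log(p) + k * math.log(1.0 - p))

def histogram_distance(freqs, probs, norm="l1"):
    """Distance between relative frequencies and model probabilities.

    norm is one of "l1" (sum of absolute differences), "l2" (Euclidean)
    or "linf" (largest absolute difference).
    """
    diffs = [abs(f - q) for f, q in zip(freqs, probs)]
    if norm == "l1":
        return sum(diffs)
    if norm == "l2":
        return math.sqrt(sum(d * d for d in diffs))
    if norm == "linf":
        return max(diffs)
    raise ValueError("unknown norm: " + norm)

def fit_by_grid(freqs, model_pmf, grid, norm="l1"):
    """Return (best parameter, best distance) over a finite grid of candidates.

    model_pmf(k, theta) must return the model probability of the value k.
    """
    best = None
    for theta in grid:
        probs = [model_pmf(k, theta) for k in range(len(freqs))]
        dist = histogram_distance(freqs, probs, norm)
        if best is None or dist < best[1]:
            best = (theta, dist)
    return best

if __name__ == "__main__":
    r = 0.847                                          # shape fixed in advance
    freqs = [nb_pmf(k, r, 0.4) for k in range(30)]     # synthetic "histogram"
    grid = [0.30, 0.35, 0.40, 0.45]
    print(fit_by_grid(freqs, lambda k, p: nb_pmf(k, r, p), grid, norm="l1"))
```

Replacing the one-parameter NB family by a two-parameter family over (γ, µ) with r fixed only changes `model_pmf` and the grid; the distance functions stay the same.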

Notation, Definitions and Mathematical Preliminaries
In the paper, conventional notation is used. The symbols $\overset{d}{=}$ and $\Longrightarrow$ denote the coincidence of distributions and convergence in distribution, respectively.
In what follows, for brevity and convenience, the results will be presented in terms of r.v.'s with the corresponding distributions. It will be assumed that all the r.v.'s are defined on the same probability space (Ω, F, P).
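An r.v. having the gamma distribution with shape parameter $r > 0$ and scale parameter $\mu > 0$ will be denoted $G_{r,\mu}$,
$$P(G_{r,\mu} < x) = \int_0^x g(z; r, \mu)\,dz, \quad \text{with} \quad g(x; r, \mu) = \frac{\mu^r}{\Gamma(r)}\, x^{r-1} e^{-\mu x}, \quad x \ge 0,$$
where $\Gamma(r)$ is Euler's gamma-function, $\Gamma(r) = \int_0^{\infty} x^{r-1} e^{-x}\,dx$, $r > 0$. In this notation, obviously, $G_{1,1}$ is an r.v. with the standard exponential distribution: $P(G_{1,1} < x) = [1 - e^{-x}]\,\mathbf{1}(x \ge 0)$ (here and in what follows $\mathbf{1}(A)$ is the indicator function of a set $A$).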
A GG-distribution is the absolutely continuous distribution defined by the density (Equation (2)). The distribution function (d.f.) corresponding to the density g * (x; r, γ, µ) will be denoted F * (x; r, γ, µ).
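The properties of GG-distributions are described in [11,19]. An r.v. with the density $g^*(x; r, \gamma, \mu)$ will be denoted $G_{r,\gamma,\mu}$. It can be easily verified that
$$G_{r,\gamma,\mu} \overset{d}{=} G_{r,\mu}^{1/\gamma}$$
and hence,
$$G_{r,\gamma,\mu} \overset{d}{=} \mu^{-1/\gamma}\, G_{r,\gamma,1}.$$
For convenience, for an r.v. with the Weibull distribution, a particular case of GG-distributions corresponding to the density $g^*(x; 1, \gamma, 1)$ and the d.f. $[1 - e^{-x^{\gamma}}]\,\mathbf{1}(x \ge 0)$ with $\gamma > 0$, we will use a special notation $W_\gamma$, that is, $W_\gamma \overset{d}{=} G_{1,\gamma,1}$. Thus, $G_{1,1} \overset{d}{=} W_1$. The density $g^*(x; 1, \alpha, 1)$ with $\alpha < 0$ defines the Fréchet or inverse Weibull distribution. It is easy to see that
$$W_1^{1/\gamma} \overset{d}{=} W_\gamma.$$
An r.v. $N_{r,p}$ is said to have the negative binomial (NB) distribution with parameters $r > 0$ (shape) and $p \in (0, 1)$ (success probability), if
$$P(N_{r,p} = k) = \frac{\Gamma(r + k)}{k!\,\Gamma(r)}\, p^r (1 - p)^k, \quad k = 0, 1, 2, \ldots$$
A particular case of the NB distribution corresponding to the value $r = 1$ is the geometric distribution. Let $p \in (0, 1)$ and let $N_{1,p}$ be the r.v. having the geometric distribution with parameter $p$:
$$P(N_{1,p} = k) = p(1 - p)^k, \quad k = 0, 1, 2, \ldots$$
This means that for any $m \in \mathbb{N}$
$$P(N_{1,p} \ge m) = \sum_{k=m}^{\infty} p(1 - p)^k = (1 - p)^m.$$
Let $Y$ be an r.v. taking values in the interval (0, 1). Moreover, let for all $p \in (0, 1)$ the r.v. $Y$ and the geometrically distributed r.v. $N_{1,p}$ be independent. The distribution of the r.v. $N_{1,Y}$,
$$P(N_{1,Y} \ge m) = \int_0^1 (1 - y)^m\, dP(Y < y), \quad m \in \mathbb{N},$$
will be called $Y$-mixed geometric.
It is well known that the negative binomial distribution is a mixed Poisson distribution with the gamma mixing distribution [12] (also see [20]): for any $r > 0$, $p \in (0, 1)$ and $k \in \{0\} \cup \mathbb{N}$ we have
$$\frac{\Gamma(r + k)}{k!\,\Gamma(r)}\, p^r (1 - p)^k = \frac{1}{k!}\int_0^{\infty} z^k e^{-z} g(z; r, \mu)\,dz,$$
where $\mu = p/(1 - p)$.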
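The standard normal d.f. will be denoted $\Phi(x)$,
$$\Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-z^2/2}\,dz, \quad x \in \mathbb{R}.$$
An r.v. with the d.f. $\Phi(x)$ will be denoted $X$. The folded or half-normal distribution is the distribution of the r.v. $|X|$. A strictly stable r.v. with characteristic exponent $\alpha$ and parameter $\theta$ will be denoted $S_{\alpha,\theta}$. It can be easily verified that $S_{2,0} \overset{d}{=} \sqrt{2}X$. In [24,25], it was demonstrated that if $\gamma \in (0, 1]$, then
$$W_\gamma \overset{d}{=} W_1 \cdot S_{\gamma,1}^{-1} \tag{7}$$
with the r.v.'s on the right-hand side being independent. For $r \in (0, 1)$ let $G_{r,1}$ and $G_{1-r,1}$ be independent gamma-distributed r.v.'s. Let $\mu > 0$. Introduce the r.v.
$$Z_{r,\mu} = \frac{\mu(G_{r,1} + G_{1-r,1})}{G_{r,1}} \overset{d}{=} \mu Z_{r,1} \overset{d}{=} \mu\Big(1 + \frac{1-r}{r}\, Q_{1-r,r}\Big), \tag{8}$$
where $Q_{1-r,r}$ is the r.v. with the Snedecor-Fisher distribution defined by the probability density
$$q(x; 1-r, r) = \frac{(1-r)^{1-r} r^r}{\Gamma(1-r)\Gamma(r)} \cdot \frac{1}{x^r [r + (1-r)x]}, \quad x \ge 0. \tag{9}$$
In the paper [26], it was shown that any gamma distribution with shape parameter no greater than one is mixed exponential. For convenience, we formulate this result as the following lemma.
Lemma 1 ([26]). The density of a gamma distribution $g(x; r, \mu)$ with $0 < r < 1$ can be represented as
$$g(x; r, \mu) = \int_0^{\infty} z e^{-zx} p(z; r, \mu)\,dz, \quad \text{where} \quad p(z; r, \mu) = \frac{\mu^r}{\Gamma(1-r)\Gamma(r)} \cdot \frac{\mathbf{1}(z \ge \mu)}{(z - \mu)^r z}$$
is the density of the r.v. $Z_{r,\mu}$ introduced above. In other words, if $0 < r < 1$, then
$$G_{r,\mu} \overset{d}{=} W_1 \cdot Z_{r,\mu}^{-1},$$
where the random variables $W_1$ and $Z_{r,\mu}$ are independent. Moreover, a gamma distribution with shape parameter $r > 1$ cannot be represented as a mixed exponential distribution.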
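The mixed exponential representation in Lemma 1 can be illustrated by simulation. The sketch below assumes the construction $Z_{r,\mu} = \mu(G_{r,1} + G_{1-r,1})/G_{r,1}$ with independent gamma variables and draws $W_1/Z_{r,\mu}$ with a standard exponential $W_1$; if the lemma holds in this form, the first two sample moments should be close to $r/\mu$ and $r(r+1)/\mu^2$. The function names and the parameter values are illustrative.

```python
import random

def sample_gamma_as_mixed_exponential(r, mu, rng):
    """Draw W_1 / Z_{r,mu} for 0 < r < 1, with Z = mu * (G_r + G_{1-r}) / G_r.

    The draw is computed as W_1 * G_r / (mu * (G_r + G_{1-r})) so that no
    division by a tiny gamma variate is needed.
    """
    g_r = rng.gammavariate(r, 1.0)             # shape r, unit scale
    g_1mr = rng.gammavariate(1.0 - r, 1.0)     # shape 1 - r, unit scale
    w = rng.expovariate(1.0)                   # standard exponential W_1
    return w * g_r / (mu * (g_r + g_1mr))

def sample_moments(r, mu, n_samples, seed=0):
    """Return the sample mean and the sample second moment of n_samples draws."""
    rng = random.Random(seed)
    total, total_sq = 0.0, 0.0
    for _ in range(n_samples):
        x = sample_gamma_as_mixed_exponential(r, mu, rng)
        total += x
        total_sq += x * x
    return total / n_samples, total_sq / n_samples

if __name__ == "__main__":
    r, mu = 0.847, 2.0
    mean, second = sample_moments(r, mu, 200000)
    print(mean, r / mu)                        # sample vs. gamma mean
    print(second, r * (r + 1.0) / mu ** 2)     # sample vs. gamma second moment
```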
Along with the arguments given above in favor of the adequacy of the GNB models for the duration of wet periods based on their definition as mixed Poisson distributions, this effect can also be explained (at least in part) by their one more important property of being mixed geometric formulated as the following theorem.

The Asymptotic Approximation to the Probability Distribution of Extremal Daily Precipitation within a Wet Period
In this section, the probability distribution of extremal daily precipitation within a wet period will be deduced as an asymptotic approximation. We will require some auxiliary statements formulated as lemmas.
The following asymptotic property of the GNB distribution will play the fundamental role in the construction of asymptotic approximations to the distributions of extreme daily precipitation within a wet period and the total precipitation volume per wet period and the corresponding statistical tests for precipitation to be abnormally heavy.
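Let $\mu > 0$, $\gamma > 0$. Instead of an infinitesimal parameter $\mu$, in order to construct asymptotic approximations with "large" sample size, introduce an auxiliary "infinitely large" parameter $n \in \mathbb{N}$ and assume that $\mu = \mu_n = \mu n^{-\gamma}$. It can be easily verified that in this case
$$G_{r,\gamma,\mu_n} \overset{d}{=} \mu_n^{-1/\gamma}\, G_{r,\gamma,1} \overset{d}{=} n\, G_{r,\gamma,\mu}. \tag{15}$$
Then for $r > 0$, $\mu > 0$ and any $n \in \mathbb{N}$, we have
$$n^{-1} G_{r,\gamma,\mu_n} \overset{d}{=} G_{r,\gamma,\mu}. \tag{16}$$
The standard Poisson process (the Poisson process with unit intensity) will be denoted $P(t)$, $t \ge 0$.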

Lemma 3 ([27]). Let $\Lambda_1, \Lambda_2, \ldots$ be a sequence of positive r.v.'s such that for any $n \in \mathbb{N}$ the r.v. $\Lambda_n$ is independent of the standard Poisson process $P(t)$, $t \ge 0$. The convergence
$$n^{-1} P(\Lambda_n) \Longrightarrow \Lambda$$
as $n \to \infty$ to some nonnegative r.v. $\Lambda$ takes place if and only if
$$n^{-1} \Lambda_n \Longrightarrow \Lambda \tag{17}$$
as $n \to \infty$.
Lemma 3 can be regarded as a special case of the following result. Consider a sequence of r.v.'s $W_1, W_2, \ldots$ Let $N_1, N_2, \ldots$ be natural-valued r.v.'s such that for every $n \in \mathbb{N}$ the r.v. $N_n$ is independent of the sequence $W_1, W_2, \ldots$ In the following statement, the convergence is meant as $n \to \infty$.
Lemma 4 ([28,29]). Assume that there exist an infinitely increasing (convergent to zero) sequence of positive numbers $\{b_n\}_{n \ge 1}$ and an r.v. $W$ such that
$$b_n^{-1} W_n \Longrightarrow W.$$
If there exist an infinitely increasing (convergent to zero) sequence of positive numbers $\{d_n\}_{n \ge 1}$ and an r.v. $N$ such that
$$d_n^{-1} b_{N_n} \Longrightarrow N, \tag{18}$$
then
$$d_n^{-1} W_{N_n} \Longrightarrow W \cdot N, \tag{19}$$
where the r.v.'s on the right-hand side of Equation (19) are independent. If, in addition, $N_n \longrightarrow \infty$ in probability and the family of scale mixtures of the d.f. of the r.v. $W$ is identifiable, then Condition (18) is not only sufficient for Equation (19), but is necessary as well.

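Lemma 5 ([30]). Let $\Lambda_1, \Lambda_2, \ldots$ be a sequence of positive r.v.'s such that for each $n \in \mathbb{N}$ the r.v. $\Lambda_n$ is independent of the Poisson process $P(t)$, $t \ge 0$. Let $N_n = P(\Lambda_n)$. Assume that there exists a nonnegative r.v. $\Lambda$ such that Convergence (17) takes place. Let $X_1, X_2, \ldots$ be i.i.d. r.v.'s with a common d.f. $F(x)$. Assume also that $\mathrm{rext}(F) \equiv \sup\{x : F(x) < 1\} = \infty$ and there exists a number $\alpha > 0$ such that for each $x > 0$
$$\lim_{y \to \infty} \frac{1 - F(xy)}{1 - F(y)} = x^{-\alpha}. \tag{20}$$
Then
$$\lim_{n \to \infty} \sup_{x \ge 0} \bigg| P\bigg(\frac{\max_{1 \le k \le N_n} X_k}{F^{-1}(1 - \frac{1}{n})} < x\bigg) - \int_0^{\infty} e^{-\lambda x^{-\alpha}}\, dP(\Lambda < \lambda) \bigg| = 0.$$
Now we turn to the main results of this section. The principal role in our reasoning will be played by Lemma 5. In order to justify its applicability, we need to make sure that the daily precipitation volumes satisfy Condition (20). A thorough statistical analysis shows that, although it is a rather adequate and, in general, acceptable model, the traditional gamma distribution (used, e.g., in [4]) is not the best model for statistical regularities in daily precipitation. The analysis of meteorological data (daily precipitation volumes) registered over 60 years at two geographic points with very different climates, Potsdam (Brandenburg, Germany), with a mild climate influenced by the closeness to the ocean with the warm Gulf Stream flow, and Elista (Kalmykia, Russia), with a sharply continental climate, convincingly suggests the Pareto-type model for the distribution of daily precipitation volumes, see Figures 3 and 4. For comparison, the graphs of the best-fitting gamma densities are also presented in these figures. It can be seen that the gamma model fits the histograms noticeably worse than the Pareto distribution.
Theorem 2. Let $n \in \mathbb{N}$, $\gamma > 0$, $\mu > 0$ and let $N_{r,\gamma,\mu_n}$ be an r.v. with the GNB distribution with parameters $r > 0$, $\gamma > 0$ and $\mu_n = \mu/n^{\gamma}$. Let $X_1, X_2, \ldots$ be i.i.d. r.v.'s with a common d.f. $F(x)$. Assume that $\mathrm{rext}(F) = \infty$ and there exists a number $\alpha > 0$ such that Relation (20) holds for any $x > 0$. Then
$$\lim_{n \to \infty} \sup_{x \ge 0} \bigg| P\bigg(\frac{\max\{X_1, \ldots, X_{N_{r,\gamma,\mu_n}}\}}{F^{-1}(1 - \frac{1}{n})} < x\bigg) - P(M_{r,\alpha,\gamma,\mu} < x) \bigg| = 0,$$
where the d.f. of the r.v. $M_{r,\alpha,\gamma,\mu}$ is $\int_0^{\infty} e^{-\lambda x^{-\alpha}} g^*(\lambda; r, \gamma, \mu)\,d\lambda$, $x \ge 0$. The limit r.v. $M_{r,\alpha,\gamma,\mu}$ admits the following product representations:
$$M_{r,\alpha,\gamma,\mu} \overset{d}{=} \frac{G_{r,\gamma,\mu}^{1/\alpha}}{W_\alpha} \overset{d}{=} \Big(\frac{G_{r,\gamma,\mu}}{W_1}\Big)^{1/\alpha} \overset{d}{=} \Big(\frac{G_{r,\mu}^{1/\gamma}}{W_1}\Big)^{1/\alpha}, \tag{21}$$
and in each term, the involved random variables are independent.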
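Proof. By definition, the GNB distribution is a mixed Poisson distribution with the GG mixing distribution. So, $N_{r,\gamma,\mu_n} \overset{d}{=} P(G_{r,\gamma,\mu_n})$. Therefore, from Equation (16), Lemma 3 with $\Lambda_n = G_{r,\gamma,\mu_n}$ and Lemma 5, taking into account the absolute continuity of the limit distribution, it immediately follows that
$$\lim_{n \to \infty} \sup_{x \ge 0} \bigg| P\bigg(\frac{\max\{X_1, \ldots, X_{N_{r,\gamma,\mu_n}}\}}{F^{-1}(1 - \frac{1}{n})} < x\bigg) - \int_0^{\infty} e^{-\lambda x^{-\alpha}} g^*(\lambda; r, \gamma, \mu)\,d\lambda \bigg| = 0.$$
Since the Fréchet (inverse Weibull) d.f. $e^{-x^{-\alpha}}$ with $\alpha > 0$ corresponds to the r.v. $W_\alpha^{-1}$, it is easy to verify that
$$\int_0^{\infty} e^{-\lambda x^{-\alpha}} g^*(\lambda; r, \gamma, \mu)\,d\lambda = P\big(G_{r,\gamma,\mu}^{1/\alpha} W_\alpha^{-1} < x\big), \quad x \ge 0.$$
Moreover, using the relation $G_{r,\gamma,\mu} \overset{d}{=} G_{r,\mu}^{1/\gamma}$, it is easy to see that
$$\frac{G_{r,\gamma,\mu}^{1/\alpha}}{W_\alpha} \overset{d}{=} \Big(\frac{G_{r,\gamma,\mu}}{W_1}\Big)^{1/\alpha} \overset{d}{=} \Big(\frac{G_{r,\mu}^{1/\gamma}}{W_1}\Big)^{1/\alpha},$$
where in each term the involved random variables are independent. The theorem is proved.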
Theorem 3. The distribution of the r.v. M r,α,γ,µ admits the following representations.
(i) If $r \in (0, 1]$, it is the scale mixture of the distribution of the ratio of two independent Weibull-distributed r.v.'s: where all the involved random variables are independent and the r.v. $Z_{r,1}$ is defined in Equation (8).
(ii) If $\gamma \in (0, 1]$, it is the scale mixture of the tempered Snedecor-Fisher distribution with parameters $r$ and 1: where $S_{\gamma,1}$ is a positive strictly stable r.v. with characteristic exponent $\gamma$ independent of the r.v. $Q_{r,1}$ with the Snedecor-Fisher distribution in Equation (9) with parameters $r$ and 1.
(iii) If $\gamma \in (0, 1]$ and $r \in (0, 1]$, it is the scale mixture of the Pareto laws:
(iv) If $r \in (0, 1]$ and $\alpha\gamma \in (0, 1]$, it is the scale mixture of the folded normal laws: where all the involved r.v.'s are independent.
Proof. To prove (i) it suffices to consider the rightmost term in Equation (21) and apply the relations for the Weibull-distributed r.v.'s given above. To prove (ii) it suffices to transform the rightmost term in Equation (21) taking into account the representation in Equation (7) and use the definition of the Snedecor-Fisher distribution as the distribution of the ratio of two independent gamma-distributed r.v.'s (see, e.g., Section 27 in [33]).
To prove (iii) it suffices to transform the second term in Equation (21) with the account of Equation (14) and notice that the distribution of the ratio of two independent exponentially distributed r.v.'s coincides with that of the random variable Π 1 .
Proof. To prove this statement, it suffices to transform the second term in Equation (21)  Proof. This statement immediately follows from Theorem 3 and the result of [34] stating that the product of two independent non-negative r.v.'s is infinitely divisible, if one of the two is exponentially distributed.

Proof. From Equation
Then for any n ∈ N Proof. This statement is a special case of Corollary 2 in [35].
Then for any x ∈ R P a c(n − 1) .
Proof. First of all, check Condition (20). We have as y → ∞, that is, Condition (20) holds implying Equation (22) with H(x) = e −x −α in accordance with the classical theory of extremes (see, e.g., [36]). Second, note that in the case under consideration Third, from Equation (15) it follows that N r,γ,µ/n γ d = P(nG r,γ,µ ) with independent P(t) and G r,γ,µ . Therefore, by Lemma 6 we have P a c(n − 1) 1/γ max 1 k N r,γ,µn The theorem is proved.
Actually, Theorem 6 states that the rate of convergence in Theorem 2 is $O(\mu_n^{1/\gamma})$ as $\mu_n \to 0$. The results of this section serve as a theoretical base for the construction of a test for abnormally extreme daily precipitation. The distribution of the maximum daily precipitation per wet period can be assumed to be a tempered scale mixture of the gamma distribution in which the scale factor has the Weibull distribution. According to the typical construction of a test, a daily precipitation volume is considered to be abnormally extremal if it exceeds a certain (pre-defined) quantile of this distribution. A detailed description of this test and of the algorithm for estimating the parameters of the distribution mentioned above, as well as its application to real data, deserves a separate study.
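The quantile-based rule described above can be sketched by simulation. The sketch assumes that the limit variable has the product form $(G_{r,\mu}^{1/\gamma}/W_1)^{1/\alpha}$ with an independent gamma variable $G_{r,\mu}$ (shape $r$, rate $\mu$) and a standard exponential $W_1$; under that assumption and for γ = 1 the d.f. is $(1 + x^{-\alpha}/\mu)^{-r}$, which gives a closed-form quantile for comparison. The function names, the parameter values and the Monte Carlo approach are illustrative, and the estimation of the parameters from data is not covered.

```python
import random

def sample_max_limit(r, alpha, gamma, mu, rng):
    """One draw of (G^(1/gamma) / W)^(1/alpha), G ~ Gamma(shape r, rate mu), W ~ Exp(1)."""
    g = rng.gammavariate(r, 1.0 / mu)          # second argument is the scale 1/mu
    w = rng.expovariate(1.0) or 1e-300         # guard against an exact zero
    return (g ** (1.0 / gamma) / w) ** (1.0 / alpha)

def extreme_threshold(r, alpha, gamma, mu, level=0.99, n_samples=200000, seed=0):
    """Monte Carlo estimate of the level-quantile of the assumed limit law."""
    rng = random.Random(seed)
    draws = sorted(sample_max_limit(r, alpha, gamma, mu, rng)
                   for _ in range(n_samples))
    return draws[min(n_samples - 1, int(level * n_samples))]

def closed_form_threshold_gamma1(r, alpha, mu, level):
    """Quantile of the d.f. (1 + x^(-alpha) / mu)^(-r), valid for gamma = 1 only."""
    return (mu * (level ** (-1.0 / r) - 1.0)) ** (-1.0 / alpha)

def is_abnormally_extreme(volume, threshold):
    """Flag a (suitably normalized) daily volume that exceeds the chosen quantile."""
    return volume > threshold

if __name__ == "__main__":
    r, alpha, mu = 0.85, 2.0, 1.0              # illustrative values, not fitted ones
    t_mc = extreme_threshold(r, alpha, 1.0, mu, level=0.9)
    t_exact = closed_form_threshold_gamma1(r, alpha, mu, 0.9)
    print(t_mc, t_exact, is_abnormally_extreme(5.0, t_mc))
```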

The Asymptotic Approximation to the Probability Distribution of Total Precipitation over a Wet Period. Generalized Rényi Theorem for GNB Random Sums
As far back as the 1950s, being interested in modeling rare events, A. Rényi studied rarefaction of renewal point processes and proved his famous theorem on convergence of rarefied renewal processes to the Poisson process [37,38]. The Rényi theorem states that the distribution of a geometric sum (i.e., a sum of a random number of i.i.d. r.v.'s in which the number of summands is an r.v. with the geometric distribution independent of the summands) normalized by its expectation converges to the exponential law as the expectation of the sum infinitely increases. The normalization of a sum by its expectation is typical for laws of large numbers. Therefore, the Rényi theorem can be regarded as the law of large numbers for geometric sums. A general law of large numbers for random sums of independent identically distributed (i.i.d.) random variables (r.v.'s) was proved in [28]. It was demonstrated there that the distribution of a random sum normalized by its expectation converges to some distribution if and only if the distribution of the random index (the number of summands) converges to the same distribution (up to a scale parameter) under the same normalization. In [3] the law of large numbers for GNB random sums was proved. However, a direct application of this result to modeling the probability distribution of total precipitation over a wet period is hampered by the following very interesting practical observation.
One might have expected that successive daily precipitation volumes $X_1, X_2, \ldots$ satisfy the classical law of large numbers, that is, the arithmetic mean $\frac{1}{n}(X_1 + \ldots + X_n)$ converges to some number $a$ almost surely as $n$ infinitely grows, as was assumed in [2]. However, a thorough analysis of real data shows that this is not quite so. Figure 5 shows the graphs of the averaged daily precipitation volumes in Potsdam and Elista, demonstrating a slowly decreasing trend for Potsdam and a slowly increasing trend for Elista. This means that, in order to match the stabilization of the averages at some level $a$, it is required to normalize the sum $X_1 + \ldots + X_n$ not by $n$, but by a somewhat more complicated function of $n$ that can match the influence of slow global trends. As such a function of $n$, consider a power function $n^{\beta}$ with $\beta > 0$ and assume that the not necessarily i.i.d. r.v.'s $X_1, X_2, \ldots$ satisfy the condition
$$\frac{1}{n^{\beta}} \sum_{k=1}^{n} X_k \Longrightarrow a \tag{23}$$
as $n \to \infty$. The parameters $a$ and $\beta$ can be rather reliably estimated by the least squares technique. Let $X_1, X_2, \ldots, X_n$ be the observed values of successive nonzero daily precipitation volumes, $n \in \mathbb{N}$ be the total number of available observations. For a natural $k = 1, \ldots, n$ denote $s_k = X_1 + \ldots + X_k$.
If Condition (23) holds, then for $k$ large enough ($1 \le m \le k \le n$), the following estimates of the parameters $a$ and $\beta$ in Relation (23) can be used:
$$\hat{a} = \exp\bigg\{\frac{1}{n - m + 1}\bigg[\sum_{k=m}^{n} \log s_k - \hat{\beta} \sum_{k=m}^{n} \log k\bigg]\bigg\}, \tag{24}$$
$$\hat{\beta} = \frac{(n - m + 1)\sum_{k=m}^{n} \log k \cdot \log s_k - \sum_{k=m}^{n} \log k \cdot \sum_{k=m}^{n} \log s_k}{(n - m + 1)\sum_{k=m}^{n} (\log k)^2 - \big(\sum_{k=m}^{n} \log k\big)^2}. \tag{25}$$
Indeed, if Condition (23) holds, the following approximate equality can be written: $\log s_k \approx \log a + \beta \log k$, $m \le k \le n$. Therefore, the estimates of the parameters $a$ and $\beta$ can be found as the solution of the least squares problem
$$(\hat{a}, \hat{\beta}) = \arg\min_{a, \beta} \sum_{k=m}^{n} \big(\log s_k - \log a - \beta \log k\big)^2.$$
This solution can be found explicitly and leads to Formulas (24) and (25). This least squares method for the estimation of $a$ and $\beta$ is realized by Algorithm A2 (see Appendix A). The application of Equation (23) to real data from Potsdam and Elista with $a$ and $\beta$ estimated by Equations (24) and (25) is illustrated in Figure 6. It can be seen that the cumulative averages stabilize at the level $a = 4.087$ with $\beta = 0.981$ for Potsdam and at the level $a = 0.96$ with $\beta = 1.146$ for Elista. So, to construct the asymptotic approximation to the probability distribution of total precipitation over a wet period, we should prove a generalized Rényi theorem for GNB random sums improving an analogous statement proved in [3]. It must be especially noted that in the following theorem, the r.v.'s $X_1, X_2, \ldots$ are not assumed to be i.i.d.
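The least squares estimation described above can be sketched in a few lines. The sketch assumes that the fit is an ordinary least squares regression of $\log s_k$ on $\log k$ for $k = m, \ldots, n$, so that the slope estimates β and the exponentiated intercept estimates $a$. It is not a transcription of Algorithm A2, and the function name is illustrative.

```python
import math

def estimate_a_beta(x, m=1):
    """Fit log s_k = log a + beta * log k by least squares for k = m..n.

    x holds the successive positive daily volumes X_1..X_n and s_k is the
    cumulative sum X_1 + ... + X_k. Returns the pair (a_hat, beta_hat).
    """
    cumulative, running = [], 0.0
    for value in x:
        running += value
        cumulative.append(running)
    n = len(x)
    if n - m + 1 < 2:
        raise ValueError("need at least two points k = m..n")
    log_k = [math.log(k) for k in range(m, n + 1)]
    log_s = [math.log(cumulative[k - 1]) for k in range(m, n + 1)]
    count = len(log_k)
    mean_k = sum(log_k) / count
    mean_s = sum(log_s) / count
    s_kk = sum((u - mean_k) ** 2 for u in log_k)
    s_ks = sum((u - mean_k) * (v - mean_s) for u, v in zip(log_k, log_s))
    beta_hat = s_ks / s_kk                          # regression slope
    a_hat = math.exp(mean_s - beta_hat * mean_k)    # exp of the intercept
    return a_hat, beta_hat

if __name__ == "__main__":
    # Synthetic volumes whose cumulative sums follow s_k = a * k^beta exactly.
    a_true, beta_true = 4.087, 0.981
    volumes = [a_true * (k ** beta_true - (k - 1) ** beta_true)
               for k in range(1, 501)]
    print(estimate_a_beta(volumes, m=10))
```

Choosing $m$ larger than 1 discards the first, most volatile cumulative sums, in line with the requirement that $k$ be large enough.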

Theorem 7.
Assume that the nonzero daily precipitation volumes $X_1, X_2, \ldots$ satisfy Condition (23) with some $\beta > 0$ and $a > 0$. Let the numbers $r > 0$, $\gamma$ and $\mu > 0$ be arbitrary. For each $n \in \mathbb{N}$, let the r.v. $N_{r,\gamma,\mu_n}$ have the GNB distribution with parameters $r$, $\gamma$ and $\mu_n = \mu/n^{\gamma}$. Assume that for each $n \in \mathbb{N}$ the r.v. $N_{r,\gamma,\mu_n}$ is independent of the sequence $X_1, X_2, \ldots$ Then

Proof. The proof is based on Lemma 4 and Equation (16). From Equation (16) it follows that Relation (26) holds as $n \to \infty$. By virtue of Condition (23), in Lemma 4 let $b_n = n^{\beta}/a$. As $N_n$ in Lemma 4 take $N_{r,\gamma,\mu/n^{\gamma}}$. Then $b_{N_n} = \frac{1}{a} N^{\beta}_{r,\gamma,\mu/n^{\gamma}}$. From Relation (26) it follows that Relation (27) holds as $n \to \infty$. Therefore, as $d_n$ we can take $d_n = n^{\beta}/\mu^{\beta/\gamma}$. So, using Relation (27) in the role of Equation (18) in Lemma 4, we obtain Equation (19) in a form from which the desired result follows. The theorem is proved.
Theorem 7 is a convenient tool for taking into account the parameters $\beta$ and $\gamma$, which characterize the deviation from the traditional NB and arithmetic mean models due to the influence of possible (slow) global trends. If $r = \gamma = \beta = 1$ in Theorem 7, then we obtain a version of the Rényi theorem [39] generalized to non-identically distributed and not necessarily independent summands. If $\beta = 1$, then we obtain the law of large numbers for GNB random sums (see [3]). If $\gamma = 1$, then we obtain the law of large numbers for NB random sums modified for the case $\beta \neq 1$.
Therefore, if the daily precipitation volumes $X_1, X_2, \ldots$ (which, of course, are neither identically distributed nor independent) satisfy Condition (23), then, taking into account the excellent fit of the GNB model for the duration of a wet period (see Figure 1) with rather small $\mu$, the GG distribution can be regarded as an adequate and theoretically well-founded model for the total precipitation volume over a (long enough) wet period.
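The mixed Poisson structure of the GNB law and the special case $r = \gamma = \beta = 1$ of Theorem 7 can be illustrated by a small simulation. The sketch below rests on several assumptions that are not taken from the paper: the GG distribution is parametrized so that $G^{\gamma}$ is gamma distributed with shape $r$ and rate $\mu$ (with $\gamma > 0$), the summands are i.i.d. exponential with mean $a$, and the random sum is normalized by its mean $a n/\mu$, as in the classical Rényi setting, rather than by the general normalization of Theorem 7. All function names are hypothetical.

```python
import random


def sample_gg(r, gamma, mu, rng):
    """GG draw, ASSUMING G**gamma ~ Gamma(shape=r, rate=mu) and gamma > 0."""
    return (rng.gammavariate(r, 1.0) / mu) ** (1.0 / gamma)


def sample_poisson(lam, rng):
    """Poisson(lam) draw: count unit-rate exponential arrivals in [0, lam]."""
    count, t = 0, rng.expovariate(1.0)
    while t <= lam:
        count += 1
        t += rng.expovariate(1.0)
    return count


def sample_gnb(r, gamma, mu, rng):
    """GNB draw as a mixed Poisson variable with a GG mixing intensity."""
    return sample_poisson(sample_gg(r, gamma, mu, rng), rng)


def normalized_random_sums(n, mu, a, size, seed=0):
    """Special case r = gamma = beta = 1 with i.i.d. exponential X_j of mean a.

    Returns draws of S_N / (a * n / mu) with N ~ GNB(1, 1, mu / n).
    By the classical Renyi theorem these are approximately Exp(1) for large n.
    """
    rng = random.Random(seed)
    out = []
    for _ in range(size):
        n_days = sample_gnb(1.0, 1.0, mu / n, rng)
        total = sum(rng.expovariate(1.0 / a) for _ in range(n_days))
        out.append(total * mu / (a * n))
    return out
```

A call such as `normalized_random_sums(n=50, mu=1.0, a=4.0, size=2000)` returns a sample whose mean and tail frequencies can be compared with those of the standard exponential law.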
As regards the bounds for the rate of convergence in Theorem 7, consider the special case of $\beta = 1$ and i.i.d. $X_1, X_2, \ldots$ As a measure of the distance between probability distributions, consider the ζ-metric proposed by V. M. Zolotarev in [40,41] (also see [42], p. 44). Let $s > 0$. There exists a unique representation of the number $s$ as $s = m + \alpha$, where $m$ is an integer and $0 < \alpha \le 1$. By $\mathcal{F}_s$ we denote the set of all real-valued bounded functions $f$ on $\mathbb{R}$ that are $m$ times differentiable and satisfy $|f^{(m)}(x) - f^{(m)}(y)| \le |x - y|^{\alpha}$. Let $X$ and $Y$ be two r.v.'s whose distribution functions are denoted $F_X(x)$ and $F_Y(x)$, respectively.

The ζ-metric $\zeta_s(X, Y) \equiv \zeta_s(F_X, F_Y)$ in the space of probability distributions is defined by the equality

$$\zeta_s(X, Y) = \sup\bigl\{\,|\mathsf{E}(f(X) - f(Y))| : f \in \mathcal{F}_s \bigr\}.$$

In [43], an explicit bound for the rate of convergence in terms of $\zeta_s$ was obtained in the case $\beta = 1$ and i.i.d. $X_1, X_2, \ldots$ for $1 \le s \le 2$. The results presented above justify the GG models for the probability distribution of the total precipitation volume over a wet period, improving the models considered in [2]. Statistical tests for the detection of anomalously extreme total volumes will be considered below.

Statistical Tests for Anomalously Extreme Total Precipitation Volumes
Now we turn to the construction of the tests for the total precipitation volume during a wet period to be abnormally large.
In what follows, based on the results of the preceding section, we will assume that the total precipitation volume during a wet period has the GG distribution with some parameters r > 0, γ > 0 and µ > 0.
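Let $m \in \mathbb{N}$ and let $G^{(1)}_{r,\gamma,\mu}, G^{(2)}_{r,\gamma,\mu}, \ldots, G^{(m)}_{r,\gamma,\mu}$ be independent r.v.'s having the same GG distribution with parameters $r > 0$, $\gamma$ and $\mu > 0$. Consider the relative contribution of the r.v. $\bigl(G^{(1)}_{r,\gamma,\mu}\bigr)^{\gamma}$ to the sum $\bigl(G^{(1)}_{r,\gamma,\mu}\bigr)^{\gamma} + \ldots + \bigl(G^{(m)}_{r,\gamma,\mu}\bigr)^{\gamma}$:

$$R = \frac{\bigl(G^{(1)}_{r,\gamma,\mu}\bigr)^{\gamma}}{\bigl(G^{(1)}_{r,\gamma,\mu}\bigr)^{\gamma} + \ldots + \bigl(G^{(m)}_{r,\gamma,\mu}\bigr)^{\gamma}}. \qquad (29)$$

From Equation (4) it follows that each power $\bigl(G^{(j)}_{r,\gamma,\mu}\bigr)^{\gamma}$ has the gamma distribution with shape parameter $r$, so that $R$ in Equation (29) is the share of one gamma-distributed term in a sum of $m$ independent gamma-distributed terms. So, the r.v. $R$ characterizes the relative precipitation volume for one (long enough) wet period with respect to the total precipitation volume registered for $m$ wet periods.

Note that

$$R \stackrel{d}{=} \frac{G_{r,1,\mu}}{G_{r,1,\mu} + G_{(m-1)r,1,\mu}},$$

where the gamma-distributed r.v.'s on the right-hand side are independent. The distribution of the r.v. $R$ was described in [2], where it was demonstrated that $R \stackrel{d}{=} \bigl(1 + \frac{k}{r}\, Q_{k,r}\bigr)^{-1}$, where $Q_{k,r}$ is an r.v. having the Snedecor-Fisher distribution with parameters $k > 0$ and $r > 0$. It should be noted that the particular value of the scale parameter is insignificant; for convenience, it is assumed equal to one. A standard calculation using Equation (30) shows that the r.v. $R$ has the beta distribution with parameters $k = (m - 1)r$ and $r$.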
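The gamma-ratio representation behind the beta law can be written out as a short worked computation. It is a standard fact about independent gamma variables and is given here only as a reading aid; the symbols $G_r$, $G_k$ (independent gamma r.v.'s with shape parameters $r$ and $k = (m-1)r$ and a common scale) and $p_R$ are introduced for this sketch and are not the paper's notation.

```latex
R \stackrel{d}{=} \frac{G_r}{G_r + G_k}
  = \Bigl(1 + \frac{G_k}{G_r}\Bigr)^{-1}
  = \Bigl(1 + \frac{k}{r}\cdot\frac{G_k/k}{G_r/r}\Bigr)^{-1},
\qquad
p_R(x) = \frac{\Gamma(k+r)}{\Gamma(k)\,\Gamma(r)}\; x^{\,r-1}(1-x)^{\,k-1},
\quad 0 < x < 1 .
```

In software that parametrizes the beta distribution by an ordered pair of shape parameters, this density corresponds to first shape $r$ and second shape $k$, which matters when quantiles are computed numerically.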
Then the test for the homogeneity of an independent sample of size $m$ consisting of GG-distributed observations of total precipitation volumes during $m$ wet periods with known $\gamma$, based on the r.v. $R$, looks as follows. Let $V_1, \ldots, V_m$ be the total precipitation volumes during $m$ wet periods and, moreover, $V_1 \ge V_j$ for all $j \ge 2$. Calculate the quantity

$$SR = \frac{V_1^{\gamma}}{V_1^{\gamma} + \ldots + V_m^{\gamma}}.$$

From what was said above, it follows that under the hypothesis $H_0$: «the precipitation volume $V_1$ under consideration is not abnormally large» the r.v. $SR$ has the beta distribution with parameters $k = (m - 1)r$ and $r$. Let $\varepsilon \in (0, 1)$ be a small number and let $\beta_{k,r}(1 - \varepsilon)$ be the $(1 - \varepsilon)$-quantile of the beta distribution with parameters $k = (m - 1)r$ and $r$. If $SR > \beta_{k,r}(1 - \varepsilon)$, then the hypothesis $H_0$ must be rejected, that is, the volume $V_1$ of precipitation during one wet period must be regarded as abnormally large. Moreover, the probability of erroneous rejection of $H_0$ is equal to $\varepsilon$.

Instead of $R$, the quantity

$$R_0 = \frac{(m-1)\bigl(G^{(1)}_{r,\gamma,\mu}\bigr)^{\gamma}}{\bigl(G^{(2)}_{r,\gamma,\mu}\bigr)^{\gamma} + \ldots + \bigl(G^{(m)}_{r,\gamma,\mu}\bigr)^{\gamma}}$$

can be considered. Then, as is easily seen, the r.v.'s $R$ and $R_0$ are related by the one-to-one correspondence

$$R_0 = \frac{(m-1)R}{1 - R}, \qquad R = \frac{R_0}{m - 1 + R_0},$$

so that the homogeneity test for a sample from the GG distribution equivalent to the one described above and, correspondingly, the test for a precipitation volume during a wet period to be abnormally large, can be based on the r.v. $R_0$, which has the Snedecor-Fisher distribution with parameters $r$ and $k = (m - 1)r$. Namely, again let $V_1, \ldots, V_m$ be the total precipitation volumes during $m$ wet periods and, moreover, $V_1 \ge V_j$ for all $j \ge 2$. Calculate the quantity

$$SR_0 = \frac{(m-1)\,V_1^{\gamma}}{V_2^{\gamma} + \ldots + V_m^{\gamma}}$$

($SR_0$ means «Sample $R_0$»). From what was said above, it follows that under the hypothesis $H_0$: «the precipitation volume $V_1$ under consideration is not abnormally large» the r.v. $SR_0$ has the Snedecor-Fisher distribution with parameters $r$ and $k = (m - 1)r$. Let $\varepsilon \in (0, 1)$ be a small number and let $q_{r,k}(1 - \varepsilon)$ be the $(1 - \varepsilon)$-quantile of the Snedecor-Fisher distribution with parameters $r$ and $k = (m - 1)r$. If $SR_0 > q_{r,k}(1 - \varepsilon)$, then the hypothesis $H_0$ must be rejected, that is, the volume $V_1$ of precipitation during one wet period must be regarded as abnormally large. Moreover, the probability of erroneous rejection of $H_0$ is equal to $\varepsilon$.

Let $l$ be a natural number, $1 \le l < m$. It is worth noting that, unlike the test based on the statistic $R$, the test based on $R_0$ can be modified for testing the hypothesis $H_0$: «the precipitation volumes $V_{i_1}, V_{i_2}, \ldots, V_{i_l}$ do not make an abnormally large cumulative contribution to the total precipitation volume $V_1 + \ldots + V_m$». For this purpose consider the quantity

$$SR_0 = \frac{(m-l)\,\bigl(V_{i_1}^{\gamma} + \ldots + V_{i_l}^{\gamma}\bigr)}{l \sum_{j \notin \{i_1, \ldots, i_l\}} V_j^{\gamma}}.$$

In the same way as above, it is easy to verify that under $H_0$ the r.v. $SR_0$ has the Snedecor-Fisher distribution with parameters $lr$ and $(m - l)r$. Let $\varepsilon \in (0, 1)$ be a small number and let $q_{lr,(m-l)r}(1 - \varepsilon)$ be the $(1 - \varepsilon)$-quantile of the Snedecor-Fisher distribution with parameters $lr$ and $k = (m - l)r$. If $SR_0 > q_{lr,(m-l)r}(1 - \varepsilon)$, then the hypothesis $H_0$ must be rejected, that is, the cumulative contribution of the precipitation volumes $V_{i_1}, V_{i_2}, \ldots, V_{i_l}$ to the total precipitation volume $V_1 + \ldots + V_m$ must be regarded as abnormally large. Moreover, the probability of erroneous rejection of $H_0$ is equal to $\varepsilon$.
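The tests of this section can be sketched in Python using only the standard library. This is an illustrative sketch and not the authors' MATLAB implementation (Algorithm A3). It assumes that the parameters $r$ and $\gamma$ are known, that the share of the selected $\gamma$-powers in the total is beta distributed with first shape $lr$ and second shape $(m-l)r$ under $H_0$, and that comparing $SR_0$ with a Snedecor-Fisher quantile is equivalent to comparing this share with the corresponding beta quantile. The regularized incomplete beta function is evaluated by a continued fraction. The names `beta_cdf` and `gg_contribution_test` are hypothetical.

```python
import math


def _betacf(a, b, x, max_iter=300, eps=3e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def beta_cdf(x, a, b):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def gg_contribution_test(volumes, r, gamma, indices=None, eps=0.01):
    """Test whether the selected wet periods contribute abnormally much.

    volumes : total precipitation volumes V_1..V_m
    indices : positions of the tested volumes (default: the largest one)
    """
    powered = [v ** gamma for v in volumes]
    m = len(powered)
    if indices is None:
        indices = [max(range(m), key=powered.__getitem__)]
    l = len(indices)
    inside = sum(powered[i] for i in indices)
    outside = sum(powered) - inside
    share = inside / (inside + outside)       # equals SR when l = 1
    sr0 = (m - l) * inside / (l * outside)    # Snedecor-Fisher form
    p_value = 1.0 - beta_cdf(share, l * r, (m - l) * r)
    return {"share": share, "sr0": sr0, "p_value": p_value,
            "reject": p_value < eps}
```

Rejecting when the p-value is below $\varepsilon$ is the same decision as rejecting when the statistic exceeds the $(1 - \varepsilon)$-quantile, so no quantile inversion is needed.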

Comparison of Tests for Anomalously Extreme Precipitation Volumes Based on Gamma and GG Distributions
In this section, the results of the application of the test based on the statistic R in Equation (29) to the analysis of the time series of daily precipitation observed in Potsdam and Elista from 1950 to 2007 are considered and compared with similar results for the case of gamma distributed total precipitation volumes during wet periods [2].
The results of the application of the tests for a total precipitation volume during one wet period to be abnormally large, based on the GG and gamma models in the moving mode, are shown in the corresponding figures. Let $m$ be the window width (the number of observations in a moving window). Then a fixed sample point falls in exactly $m$ windows, and, depending on the outcomes of the test in these windows, a fixed observation is classified as absolutely, intermediately or relatively abnormal. Algorithm A3 (see Appendix A) implements the method based on the statistic $R$ described above.

For clarity, the time horizon in these figures equals 90 and 360 days, and the significance level $\alpha$ of the tests is 0.01. The absolutely, intermediately and relatively abnormal precipitation volumes are marked by downward-pointing triangles, circles and squares, respectively, for the test based on the gamma model, whereas the corresponding results of the test based on the statistic $R$ for the GG distribution are marked by upward-pointing triangles, diamonds and right-pointing triangles, respectively. MATLAB's names for these markers are used here.

Comparison of GG-Based Statistical Test and Peaks over Threshold Methodology for Extreme Precipitation Intensities
One important precipitation indicator is the precipitation intensity, which is defined as the ratio of the total precipitation volume over a wet period to the duration of this wet period measured in days. The extreme precipitation volumes and intensities are relevant to various problems of climatology and hydrometeorology (see, for example, [44–47]). Traditionally, these phenomena are investigated for different geographical regions or countries [48,49]. In particular, the key point of such studies is the determination of threshold values whose excess leads to extreme events, for example, in daily rainfalls or their intensities. Precipitation intensities are important not only for forecasting floods but also for solving problems such as runoff and soil erosion [50,51]. This is explained by contemporary climate change scenarios, which predict a significant increase in the frequency of high-intensity rainfall events, primarily in dry areas. Moreover, precipitation can induce shallow landslides [52,53] and debris flows [54].
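The quantities discussed in this paper (duration, total volume, maximum daily volume and intensity of a wet period) can be extracted from a daily series as sketched below. The sketch assumes that a wet period is a maximal run of consecutive days with strictly positive precipitation. The function name `wet_periods` and the dictionary keys are hypothetical.

```python
def wet_periods(daily):
    """Split a daily precipitation series into wet periods.

    Assumption: a wet period is a maximal run of consecutive days with
    strictly positive precipitation. Returns one dict per wet period.
    """
    runs, current = [], []
    for v in daily:
        if v > 0:
            current.append(v)          # the wet period continues
        elif current:
            runs.append(current)       # a dry day closes the current period
            current = []
    if current:                        # the series may end inside a wet period
        runs.append(current)
    return [
        {
            "duration": len(run),                # days
            "total": sum(run),                   # total volume over the period
            "max_daily": max(run),               # maximum daily volume
            "intensity": sum(run) / len(run),    # total volume / duration
        }
        for run in runs
    ]
```

For example, the series `[0, 1.0, 3.0, 0, 0, 2.0, 0]` contains two wet periods, of durations 2 and 1, each with intensity 2.0.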
Statistical analysis of real data shows that the probability distribution of the precipitation intensity can be approximated by the gamma distribution with very high accuracy. In [55], some theoretical arguments were presented to justify the gamma model for the distribution of precipitation intensities. So, the statistical approach described in Section 6 and in [2] can also be used for the identification of abnormally large intensities. For the analysis, the precipitation intensities in Potsdam and Elista (verified samples without missing values) are used as the initial data.

This section compares a non-parametric approach based on the extreme value theory and on the modified Peaks over Threshold (PoT) methodology [56] with the parametric approach that relies on testing parametric statistical hypotheses to determine extreme intensities of wet periods (see Section 6). The classical version of PoT [57] is quite popular for solving a wide range of climatic problems. In particular, the following results can be mentioned: the inverse Weibull distribution as an extreme wind speed model [58], time-dependent versions of the PoT model for severe storm waves [59] and daily temperatures [60], a probability model for rainfalls of high magnitude [61], and the analysis of precipitation extremes in a changing climate [62]. Most applications of the extreme value theory assume stationarity, but it is well known that real events are not stationary. So, generalized results analogous to Theorem 7 are required.

All the numerical methods are implemented as a MATLAB program. Algorithm A4 demonstrates the method based on the PoT and the GG test (see Appendix A). Figures 11 and 12 present the results obtained by the modified PoT algorithm in which the Weibull distribution is considered as the distribution of the time between extreme events.
Starting from the maximum threshold value that coincides with the maximum of the analyzed data, the hypothesis that the time intervals between the moments of excess of a certain threshold have the Weibull distribution is tested. The corresponding P-value is saved, and the threshold is shifted down by a certain (small) step (in this case, 0.01). It is worth noting that a similar procedure was suggested in [56] for precipitation volumes under the assumption that the time intervals between excesses have the exponential distribution. For a given significance level (in both cases, α is chosen as 0.01), the corresponding hypothesis is not rejected for all thresholds for which the P-values are located to the right of the red vertical line in the upper graphs in Figures 11 and 12. The lower graphs show the parameters of the fitted Weibull distribution.
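The threshold scan described above can be sketched as follows for the exponential variant of [56]. This is only an illustration and not Algorithm A4: the text does not name the goodness-of-fit test, so a Kolmogorov-Smirnov test with the mean estimated from the data and the asymptotic Kolmogorov distribution is assumed here, which makes the P-values approximate. The function names and the `min_gaps` parameter are hypothetical.

```python
import math


def kolmogorov_sf(lam):
    """Asymptotic P(sqrt(n) * D_n > lam) for the Kolmogorov statistic."""
    if lam < 0.2:
        return 1.0  # the survival function is numerically 1 in this range
    total = 0.0
    for j in range(1, 101):
        term = (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
        total += term
        if abs(term) < 1e-12:
            break
    return max(0.0, min(1.0, 2.0 * total))


def exp_ks_p_value(gaps):
    """Approximate KS P-value for 'gaps are exponential' (mean estimated)."""
    n = len(gaps)
    mean = sum(gaps) / n
    d = 0.0
    for i, x in enumerate(sorted(gaps)):
        f = 1.0 - math.exp(-x / mean)          # fitted exponential cdf
        d = max(d, (i + 1) / n - f, f - i / n)  # two-sided KS distance
    return kolmogorov_sf(math.sqrt(n) * d)


def scan_thresholds(series, step=0.01, min_gaps=5):
    """Move the threshold down from max(series); store (threshold, P-value)."""
    results = []
    thr = max(series)
    lowest = min(series)
    while thr > lowest:
        times = [t for t, v in enumerate(series) if v > thr]
        gaps = [b - a for a, b in zip(times, times[1:])]
        if len(gaps) >= min_gaps:   # too few exceedances give no usable test
            results.append((thr, exp_ks_p_value(gaps)))
        thr -= step
    return results
```

Thresholds whose P-value exceeds the chosen significance level (0.01 in the text) are those for which the hypothesis is not rejected. A Weibull variant would replace the fitted exponential distribution function by a fitted Weibull one.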
In Figures 13 and 14, the results of the test (see Section 6) for both the GG and the usual gamma distribution are compared with those of the PoT method based on the exponential and Weibull distributions for the intensities in Potsdam and Elista. The following notation is used:

• The thresholds with the indices low correspond to the minimum levels at which the hypothesis of the exponential or Weibull distribution is not rejected (the lowest point to the right of the red line on the upper graphs in Figures 11 and 12);
• The thresholds with the indices maxval correspond to the maximum P-value (the rightmost point in the upper graphs);
• The thresholds with the indices high correspond to the upper level at which the corresponding hypothesis is not rejected (the highest point to the right of the red line on the graphs).

The green-filled downward-pointing triangles mark the intensities that are classified as absolutely abnormal based on the GG test (see Section 6). The black upward-pointing triangles correspond to the decision based on the classical gamma distribution test (that is, γ = 1, see (31)). The circles denote intermediate extreme observations, and the squares mark relatively extreme ones. This classification is described in Section 7. It is worth noting that for the GG test the value of γ is 1.0775 for Potsdam and 1.1257 for Elista.
For Potsdam, the results of the gamma and GG tests are good and close to each other. In addition, the PoT method is also effective in the case where the threshold is chosen with the maximum P-value. However, for Elista, where rainfalls are fewer and less intense, the results are quite different. Indeed, the decisions of the PoT method are close for the exponential and Weibull cases (the thresholds differ by only 0.29).
However, the statistical test based on the gamma distribution identifies only four intensities as absolutely extreme, while the GG test identifies more absolute extremes, including those below the thresholds mentioned above.

Conclusions and Discussion
In this paper, asymptotic models for some precipitation characteristics based on GNB distributions were considered. Also, a statistical test based on the GG distribution was proposed for determining the type of precipitation extremes. The GG and GNB distributions are not widely used, so the methods for the estimation of their parameters are, as a rule, not implemented in standard statistical packages. Therefore, the implementation of appropriate procedures requires specialized software, for example, based on the functional approach, as was done in the study described in this paper using the MATLAB programming language. However, as was demonstrated in the paper, such distributions fit real data better than conventional models. Therefore, for processing spatial meteorological data from a large number of stations, the proposed methods and models can be effectively implemented as high-performance computing services.