Abstract
The paper comprehensively studies the natural exponential family and its associated exponential dispersion model generated by the Landau distribution. These families are rich in probabilistic and statistical properties and are suitable for modeling skewed continuous data sets on the whole real line. The study explores and further develops various probabilistic properties, including reciprocity, self-decomposability, reproducibility, unimodality, and characterizations. It also addresses statistical aspects such as maximum-likelihood estimation, hypothesis testing, and generalized linear models.
Keywords:
Landau distribution; natural exponential family; exponential dispersion model; generalized linear model; exponential variance function; self-decomposability; unimodality; characterizations
MSC:
62E99; 62P99
1. Introduction
The present paper reviews, develops, and provides a comprehensive study of the natural exponential family (NEF) of distributions along with its exponential dispersion model (EDM), which is generated by the Landau distribution. Such NEFs and EDMs are associated with an exponential variance function (VF). They will be abbreviated henceforth as NEF-EVF and EDM-EVF, respectively. These families are absolutely continuous, supported on the whole real line, and rich in probabilistic and statistical properties.
Our study covers both the probabilistic and the statistical properties of these families; it collects results scattered in the literature and develops additional new ones. We present these families exhaustively in a unified approach and propose them as statistical model candidates for fitting skewed continuous data sets on the real line (with or without covariates). Various other continuous distributions supported on the whole real line are available in the statistical literature (e.g., the Behrens–Fisher distribution [1], the exponentially modified Gaussian distribution [2], the hyperbolic secant distribution [3], and the Johnson SU distribution [4]). However, we believe that our proposed NEF and EDM have many virtues (detailed in the sequel), making them important models for statistical modeling and GLM applications.
The standard Landau distribution is stable and supported on . In general, the -stable distribution requires four parameters for a complete description: a stability index (c.f., [5]), a skewness parameter , a scale parameter , and a location parameter . For , the corresponding set of parameters is and , yielding a characteristic function (c.f.)
and density
The measure is analyzed in [5]. It is also studied by [6] and further discussed in different contexts by [7,8,9]. It is named after Lev Landau owing to his brilliant treatment of ionization losses, that is, the energy losses of fast charged particles traveling through matter. This process has been studied for over 100 years, and the theoretical explanation spans a similar period. About 80 years ago, ref. [10] published a theoretical paper on the subject, which substantially advanced the research and remains among the most cited in the field. See [11,12] for more on the history of the theoretical developments and attempts to clarify Landau’s method of research and the function named after him.
By transforming the measure by , we obtain the measure with density
and c.f.
This simple translation transformation yielding has some practical importance, as seen in the sequel. First, it provides, for , a more elegant expression for the VF of the NEF-EVF, and second, it allows a simple computation of the density of its corresponding EDM.
The paper is organized as follows. Section 2 is dedicated to preliminaries. We first elaborate more on the measure and its Laplace transform and provide a new proof for the form of its Laplace transform that is needed for all derivations in the sequel. We then present basic preliminaries on NEFs, EDMs, and associated VFs. Section 3 introduces the EDM-EVF generated by the measure and derives the corresponding densities. Section 4 and Section 5 present numerous probabilistic and statistical properties of EDM-EVFs. In Section 4, we derive expressions for the cumulants, skewness, and kurtosis coefficients and show that the EDM-EVF densities are skewed to the right and leptokurtic (i.e., have fatter tails). We show that the VF of the EDM-EVF is a limit of VFs in the Tweedie scale, which is the reason for calling it the Tweedie family with power infinity. We study the EDM-EVF with respect to the following properties: reciprocity, self-decomposability and unimodality, reproducibility in the broad and regular sense, duality, chainability (a new notion), characterizations by zero regression on the sample mean, and large deviations. Section 5 describes and develops various statistical features related to EDM-EVFs. In particular, we consider maximum-likelihood estimation, second-order minimax estimation of the mean, and hypothesis testing, and we describe practical steps toward GLM applications. Some concluding remarks are presented in Section 6.
2. Preliminaries
This section includes two subsections. The first is devoted to deriving the Laplace transform (LT) of the Landau distribution, which is needed to define the set NEF-EVF. The second introduces some essential preliminaries on NEFs and EDMs required to determine the sets NEF-EVF and EDM-EVF generated by the Landau distribution.
Let be a non-Dirac positive measure on . Its LT and effective domain are defined as
and
Also, let
and define the cumulant transform of by
2.1. The Laplace Transform of the Landau Distribution
The form of the LT of the Landau measure is presented in various sources in the literature as
from which that of is simply
However, all our attempts to find a proof for (5) in the literature were unsuccessful. It is, of course, possible to derive the LT of from its c.f. given in (1). However, such a derivation is not always simple. Since the present paper reviews and studies in detail the NEF and EDM generated by the Landau distribution (a study that depends heavily on the form of ), we decided, for the sake of completeness, to provide a proof for (6). This proof is presented in Proposition 1 below. It was provided to the author by Gérard Letac (Institut de Mathématiques de Toulouse, Université Paul Sabatier, France).
Proposition 1.
The Laplace transform of has the form given in (6).
Proof.
In the Lévy canonical representation, a c.f. of an infinitely divisible distribution has the form
where b is a location parameter, is the Gaussian parameter, is a centering function, and is the Lévy measure defined by its c.f. (see [5,6]), which can be either type , or 2. If the Lévy measure has type 2, then the convex support of is even if the support of is contained in and the Gaussian coefficient is zero. Ref. [6] uses while the Russian literature uses . We shall adopt Feller’s usage of . Since is stable and supported on and its associated Lévy measure is , it follows that its c.f. is
and thus its LT has the form
The trick is to first integrate by parts and then split the result into two integrals.
where
Recall that the Frullani integral is , when , f is continuous on and converges. By applying this to , and we obtain Showing that is more complicated. One way to proceed is to consider the entire function in
and to apply the Cauchy theorem to f and the segment , to the quarter circle , and to the segment In this way, we obtain
We now use the following majorization for
Therefore, Another way to show this is to compute for Since
is easily computed and since we obtain that and thus This concludes the proof of (6). □
2.2. Preliminaries on NEFs and EDMs
Let be a non-Dirac positive measure on with LT and cumulant transform as defined in (1)–(4). We henceforth define the notions of NEF, VF, EDM, and other related quantities needed to describe NEF-EVF (for a good reference, see [13]).
NEF. The NEF F generated by is defined by the probabilities
Full NEF. It is defined as follows. Let be an NEF generated by . If is a probability measure, then
is called the full NEF generated by
Cumulants. The cumulants of are obtained by the derivatives of . The r-th cumulant of is given by
In particular, and are the mean and variance of .
Mean domain. The image of under is called the mean domain of Since is strictly convex and real analytic on , the map is one to one, and its inverse function is well defined.
VF. The variance corresponding to is . The map from into is called the VF of F. More precisely, a VF of an NEF F is a pair . It uniquely determines an NEF within the class of NEFs (see [3,14]). Since VFs were first defined by [3], various classes of VFs have appeared in numerous papers over the last four decades. Morris himself characterized all six NEFs having at most quadratic VFs (e.g., Poisson with VF gamma with VF inverse Gaussian with VF
Steep NEF. An NEF is called steep iff . The steepness of F (or ) ensures that the MLE of the canonical parameter or the related mean exists with probability 1 and is given as the unique solution of the maximum-likelihood equation based on n independent replicas taken from (c.f., [15], Chapter 9). This property is essential in various derivations of the MLE and GLM applications (see Section 4.8 and Section 6).
Mean value parameterization. For a given VF of an NEF F, it is clear that and are two primitives of and , respectively; i.e.,
Accordingly, the NEF F can be represented as
The reparameterization in (11) is called the mean value parameterization of F (see [13], Proposition 2.3, and [16]). Such a parameterization of F in terms of its mean m is more appealing than that of , as is just an artificial parameter (the argument of the LT of ).
EDM. An EDM is related to an NEF as follows. The Jorgensen set associated with F is defined by
Note that is the p-fold convolution of (even if p is not a positive integer); i.e.,
The Jorgensen set is nonempty since, by convolution, it contains . Also, iff is infinitely divisible (and thus is composed of infinitely divisible (i.d.) distributions).
For , the NEF generated by is
where the support of may depend on p. For , we denote, respectively, the mean function, mean domain, and VF of the NEF by and (instead of , and ). These are given by
and
Then, the set of families
is called the EDM associated with , and it is parameterized by .
EDMs have been studied thoroughly by [17,18] and others, who proposed them for describing the error component in generalized linear models (GLMs). The statistical literature contains hundreds of articles applying EDMs in GLM methodology.
Remark 1.
It should be noted, however, that applying EDMs in GLM methodology requires knowledge of the exact expression of the measure appearing in the EDM model (12). Obtaining such an expression can be very complicated, if attainable at all. To appreciate this difficulty, consider the case where . For this case, , is the n-fold convolution of the generating measure μ of , which is a rather complicated computational task.
In the sequel, when no confusion is caused, we shall suppress the dependence of , and on and F and write , and
To conclude these introductory preliminaries, we state two rules regarding appropriate transformations on an NEF. These are
The Jorgensen rule:
and
the affine rule:
3. The NEF-EVF and EDM-EVF
These are described in the two following subsections.
3.1. The NEF-EVF Generated by the Landau Distribution
The choice in (16) provides a more elegant form of the VF. We adopt this choice and consider the NEF-EVF to be generated by with cumulant transform
Hence, the NEF-EVF is presented by density and VF,
and
where, by (1), is
The NEF-EVF is discussed in the context of EDMs by [17,18,19]. (For simplicity of notation, we have suppressed the dependence of and on ).
Note that the full family (see (8)) related to is well defined as is a probability measure, in which case The full family is relevant in Section 4.4 when discussing the properties of self-decomposability and unimodality of
Thus, the mean value parameterization of (18) is
Such a representation of is needed when discussing GLM methodology in Section 5.4.
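As a quick numerical sanity check of the mean value parameterization, the following minimal sketch assumes the exponential VF V(m) = exp(m) on the whole real line and takes both integration constants to be zero. Under these assumptions the canonical parameter is theta(m) = -exp(-m) and the cumulant transform, expressed in terms of m, is -(m + 1)exp(-m). These closed forms and all function names are illustrative assumptions; the paper's own normalization may differ by the translation constant.

```python
import math

def variance_function(m):
    """Assumed exponential VF, V(m) = exp(m)."""
    return math.exp(m)

def theta_of_m(m):
    """A primitive of 1/V(m) (integration constant set to zero, by assumption)."""
    return -math.exp(-m)

def k_of_m(m):
    """A primitive of m/V(m) (integration constant set to zero, by assumption)."""
    return -(m + 1.0) * math.exp(-m)

def check_mean_value_parameterization(m, h=1e-5):
    """Return (dk/dtheta, dm/dtheta) at m, using central differences in m.

    For a valid mean value parameterization, dk/dtheta should equal the
    mean m and dm/dtheta should equal the variance V(m).
    """
    dtheta_dm = (theta_of_m(m + h) - theta_of_m(m - h)) / (2.0 * h)
    dk_dm = (k_of_m(m + h) - k_of_m(m - h)) / (2.0 * h)
    mean_from_k = dk_dm / dtheta_dm      # chain rule: dk/dtheta
    variance_from_theta = 1.0 / dtheta_dm  # inverse function rule: dm/dtheta
    return mean_from_k, variance_from_theta

if __name__ == "__main__":
    for m in (-2.0, 0.0, 1.5):
        mean, var = check_mean_value_parameterization(m)
        print(m, mean, var, variance_function(m))
```

The check only confirms internal consistency of the assumed primitives; it says nothing about the generating measure itself.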
3.2. The EDM-EVF Generated by
Since is stable (and thus is infinitely divisible), is a LT of some measure for all . Hence, the EDM-EVF generated by has, by (12), densities of the form
where
is the cumulant transform of So, the question is: what is As indicated before, the answer is a rather complicated problem. Luckily enough, for our case, is stable—a fact allowing the computation of while using the general form of the density given in (1). This is performed in the following proposition.
Proposition 2.
Consider the EDM-EVF in (23), then
(i) The Jorgensen set of is
(ii) With defined in (1), is
and the densities of the EDM-EVF in (23) are
Proof.
(i) This is simple as is infinitely divisible, implying that is a LT for all .
(ii) Note that
i.e.,
where is defined in (6). This means that
which, by changing variables , leads to
or, equivalently, that
Thus, using (1) with leads to (25). □
Remark 2.
In particular, we should notice the simple but interesting case where . This case concerns the convolution of n i.i.d. random variables taken from an NEF-EVF distribution. So if , then by (25) the density of is
The VF corresponding to EDM-EVF is by (12)
where its mean value parameterization is obtained from (10) by setting
in which case
Figure 1 plots the EDM-EVF density (26) for the four pairs , , , The skewness to the right is clearly evident.
Figure 1.
EDM-EVF density (26) for
4. Probabilistic Features of NEF-EVF and EDM-EVF
This section presents and develops several probabilistic features of both NEF-EVF and EDM-EVF. Usually, we present these for EDM-EVF, as those for NEF-EVF are obtained by setting . However, we sometimes present them only for NEF-EVF for the sake of notational simplicity.
These probabilistic features include: (1) A derivation of cumulants, skewness, and kurtosis coefficients. In particular, we show that all such distributions are skewed to the right and leptokurtic; (2) a presentation of EVFs as a limit of a sequence of VFs in the Tweedie scale; (3) the property of reciprocity; (4) properties of self-decomposability and unimodality; (5) reproducibility in the broad and regular sense; (6) duality; (7) chainability (a new property of infinitely divisible distributions); (8) characterizations by zero regression on the sample mean; and (9) large deviations.
4.1. Cumulants, Central Moments, Skewness, and Kurtosis
Let ,, denote the r-th central moment. Then, and
The skewness and kurtosis coefficients are
Hence, all members of the EDM-EVF are skewed to the right and leptokurtic. Note that (29) entails the interesting observation that the kurtosis coefficient is quadratic in ; i.e.,
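The sign of the skewness and the quadratic relation can be checked numerically. The sketch below is not a reproduction of the displayed formulas (27)–(29); it assumes that the VF of the EDM-EVF is V_p(m) = p exp(m/p) and uses the standard NEF identity that the (r+1)-th cumulant equals V_p(m) times the derivative of the r-th cumulant with respect to m. Under these assumptions the r-th cumulant, for r at least 2, is (r-2)! p exp((r-1)m/p), and the excess kurtosis is twice the squared skewness. Function names are hypothetical.

```python
import math

def variance_function(m, p=1.0):
    """Assumed EDM-EVF variance function V_p(m) = p * exp(m / p)."""
    return p * math.exp(m / p)

def cumulant(r, m, p=1.0):
    """r-th cumulant implied by kappa_{r+1} = V_p(m) * d(kappa_r)/dm."""
    if r < 1:
        raise ValueError("r must be a positive integer")
    if r == 1:
        return m
    return math.factorial(r - 2) * p * math.exp((r - 1) * m / p)

def skewness(m, p=1.0):
    """Third cumulant standardized by the variance to the power 3/2."""
    return cumulant(3, m, p) / cumulant(2, m, p) ** 1.5

def excess_kurtosis(m, p=1.0):
    """Fourth cumulant standardized by the squared variance."""
    return cumulant(4, m, p) / cumulant(2, m, p) ** 2

def recursion_residual(r, m, p=1.0, h=1e-5):
    """Relative residual of the cumulant recursion, via a central difference."""
    derivative = (cumulant(r, m + h, p) - cumulant(r, m - h, p)) / (2.0 * h)
    target = cumulant(r + 1, m, p)
    return abs(variance_function(m, p) * derivative - target) / abs(target)

if __name__ == "__main__":
    for m, p in ((-1.0, 0.5), (0.0, 1.0), (2.0, 3.0)):
        print(m, p, skewness(m, p), excess_kurtosis(m, p))
```

Both coefficients are positive for every m and p > 0 under the stated assumption, in line with the right skewness and leptokurtosis noted above.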
4.2. NEF-EVFs as a Limit of a Sequence of VFs in the Tweedie Scale
Mora [14] (see also [13]) discussed the situation when a limit of a sequence of VFs is a VF. As this work is also a review paper, we find it beneficial to quote Mora’s result.
Theorem 1
([14]). Let be a sequence of NEF’s with VF’s . Assume there exists a nonempty open interval J contained in and a strictly positive function V on J with uniformly on all compact subintervals of J. Then:
(i) There exists an NEF F such that and such that restricted to J is equal to V.
(ii) For all lim in the weak convergence sense.
We apply this theorem for an NEF-EVF (where the same holds for EDM-EVF). First note that is a VF for all (c.f., [21]). For the latter VF, apply the Jorgensen rule (13) with and then the affine rule (14) with
This yields
Finally, by letting we obtain the limit , the VF of the NEF-EVF in (19). This limiting behavior is the reason for calling the NEF-EVF a Tweedie NEF with power infinity (c.f., [17]).
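The limiting argument can be illustrated numerically. The sketch below uses the standard forms of the two rules (the Jorgensen rule maps V(m) to lam V(m/lam), and the affine rule for x -> ax + b maps V(m) to a^2 V((m - b)/a)) and applies them to the power VF m^p. The specific constants lam = p^(p/(p-1)) and (a, b) = (1, -p) are one illustrative choice, assumed here because they give (1 + m/p)^p on (-p, infinity); they are not claimed to be the constants used in the text.

```python
import math

def jorgensen_rule(vf, lam):
    """VF after the Jorgensen rule with parameter lam: m -> lam * V(m / lam)."""
    return lambda m: lam * vf(m / lam)

def affine_rule(vf, a, b):
    """VF of the image under x -> a*x + b: m -> a^2 * V((m - b) / a)."""
    return lambda m: a * a * vf((m - b) / a)

def transformed_power_vf(p):
    """Power VF m**p after both rules, with the assumed constants.

    The result equals (1 + m/p)**p for m > -p.
    """
    power_vf = lambda m: m ** p
    lam = p ** (p / (p - 1.0))  # assumed Jorgensen parameter
    return affine_rule(jorgensen_rule(power_vf, lam), 1.0, -p)  # shift by -p

def max_gap_to_exponential(p, grid=(-2.0, 0.0, 1.0)):
    """Largest absolute difference from exp(m) over a small grid of means."""
    vf = transformed_power_vf(p)
    return max(abs(vf(m) - math.exp(m)) for m in grid)

if __name__ == "__main__":
    for p in (10.0, 100.0, 1000.0, 10000.0):
        print(p, max_gap_to_exponential(p))
```

On any fixed compact set of means the gap shrinks roughly like 1/p, which is the uniform convergence on compact subintervals required by Theorem 1.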
4.3. Reciprocity
The definition of reciprocity between two NEFs is somewhat tedious, but it has some interesting probabilistic interpretations. For brevity, we provide here a theorem (which can also serve as a definition) regarding a reciprocal pair of NEFs, taken from [13] (for more details, see their Definition 5.1, Proposition 5.1, and Theorem 5.2).
Theorem 2
([13], Theorem 5.2). Let F and be two NEF’s and denote . Then, is a reciprocal pair iff the three following conditions hold: (i) and are nonempty; (ii) the mapping is bijective from onto ; (iii)
The most famous examples of reciprocal pairs are perhaps: (i) The normal and inverse Gaussian NEF’s given, respectively, by their VF’s: where is constant, and and (ii) The exponential and Poisson NEFs given, respectively, by their VF’s: and .
Although a general probabilistic interpretation of reciprocity is still lacking, certain cases (such as the two examples above) can be explained using fluctuation theory (see [22], pp. 24–26, and [23]).
We now apply reciprocity to the NEF-EVF given by (18). Consider the image of of by the dilation mapping , and consider the pair ( The VF of is . As is composed of infinitely divisible members (see the next property below), its Lévy measure is concentrated on the negative line, and thus, admits a reciprocal NEF, say A, whose mean domain is and VF
Here, A is the family of stopping times where is a Lévy process such that the distribution of is (given by (18)) when varies in If we now consider the image of by a translation mapping , then like the above, admits a reciprocal NEF, say having mean domain and VF This fact distinguishes it from other NEFs and could be formulated as a (rather trivial) characterization of in (18).
4.4. Self-Decomposability and Unimodality
Let P be a probability on and be its image by the map (). Then P is said to be self-decomposable if for there exists a probability on such that , where ∗ indicates convolution. Self-decomposability is an important property with a significant amount of literature (see [5,6,24]). All self-decomposable probabilities are also infinitely divisible. Moreover, a striking property of self-decomposability is that it implies both absolute continuity and unimodality of P. This property has been shown by [25]. All stable distributions are self-decomposable and thus unimodal ([5]).
Bar-Lev, Bshouty, and Letac [26] addressed the following problem: Consider a full NEF generated by . If a member of is self-decomposable, can one conclude that all other members of also have this property? They showed that this does not generally hold but provided necessary and/or sufficient conditions for this property. These conditions are related to the behavior of the Lévy measure associated with . In particular, they showed that the full NEF-EVF generated by in (18) satisfies such conditions and thus is composed of self-decomposable members, implying that all NEF-EVF (and therefore its associated EDM-EVF) distributions are unimodal.
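Unimodality can also be observed numerically for the generating distribution. The sketch below evaluates the standard (untranslated) Landau density through the classical integral representation (1/pi) times the integral over t in (0, infinity) of exp(-t log t - x t) sin(pi t), which is assumed here rather than taken from the displayed formulas. The truncation point, the number of Simpson panels, and the grids are arbitrary illustrative choices, and the quadrature is adequate only for moderate x: for strongly negative x the integrand oscillates with a very large amplitude.

```python
import math

def landau_pdf(x, upper=30.0, n=15000):
    """Composite Simpson approximation of the assumed integral representation.

    Approximates (1/pi) * int_0^upper exp(-t*log(t) - x*t) * sin(pi*t) dt;
    n must be even.
    """
    def integrand(t):
        if t == 0.0:
            return 0.0  # limit of the integrand as t -> 0+
        return math.exp(-t * math.log(t) - x * t) * math.sin(math.pi * t)

    h = upper / n
    total = integrand(0.0) + integrand(upper)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / (3.0 * math.pi)

def grid_mode(xs):
    """Return (x, density) at the grid point with the largest density."""
    values = [landau_pdf(x) for x in xs]
    k = values.index(max(values))
    return xs[k], values[k]

def is_unimodal_on_grid(xs):
    """True if the density strictly rises to a single peak and then strictly falls."""
    values = [landau_pdf(x) for x in xs]
    k = values.index(max(values))
    rising = all(values[i] < values[i + 1] for i in range(k))
    falling = all(values[i] > values[i + 1] for i in range(k, len(values) - 1))
    return rising and falling

if __name__ == "__main__":
    coarse = [-2.0 + 0.25 * i for i in range(33)]   # x from -2 to 6
    fine = [-0.40 + 0.01 * i for i in range(41)]    # x from -0.40 to 0.00
    print(is_unimodal_on_grid(coarse), grid_mode(fine))
```

A grid check of this kind is only suggestive, of course; the unimodality itself follows from self-decomposability as stated above.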
4.5. Reproducibility in the Broad and Regular Sense
Bar-Lev and Casalis [27] defined the notion of reproducibility in the broad sense and developed a discrete version of this definition. They showed that all NEFs in the Tweedie scale have this property, which is defined as follows. Let be the NEF generated by . Then F is said to be reproducible in the broad sense if there exists a real number p belonging to the Jorgensen set and an affine transformation such that . In other words, an NEF F is reproducible in the broad sense if a p-th power convolution of F equals an affine transformation of
The NEF-EVF can easily be shown to be reproducible in the broad sense. Indeed, the cumulant transform of and ( are given, respectively, by (17) and (24) as and . Thus, implies
Thus, by choosing and we obtain
implying that is reproducible in the broad sense.
The regular definition of reproducibility, which preceded the one given by [27], is a particular case of the above definition. It was first defined by [28] for the one-parameter NEFs. It resulted in the characterization of NEFs having power variance functions or NEFs in the Tweedie scale (in this respect, see also [29]).
Here, however, we consider a generalization of their definition by defining it as a two-parameter family. Indeed, let be a family of distributions indexed by two parameters where has a nonempty interior in . Also, let be i.i.d. r.v.’s with (where stands for the law of ). Then, is said to be reproducible if, for all and , there exist sequences and , and there exists a mapping such that
This problem of reproducibility, in its general setting, is rather complex to solve, and its complexity is discussed in Bar-Lev (2021). We shall implement, however, this definition of reproducibility for the EDM-EVF given in (23) and (24), while considering and Note that (31) can be expressed in terms of (see (24))—the cumulant transform of as
or
The general solution of (32) is rather intricate and cumbersome. So, we leave it as an open problem. We do, however, demonstrate some special solutions:
(a) and This result implies that , i.e., if are i.i.d. taken from , then the distribution of their random sum belongs to , a rather important fact for statistical applications.
(b) , where is increasing in The implications of this result are the following: If are i.i.d. taken from , then the distribution of their random sum belongs to but with parameter instead of , a quite surprising result.
4.6. Duality
The notion of duality among NEFs has been introduced by [30] as follows. Consider two NEFs and generated by the two measures and let and and and be their corresponding cumulant transforms, VFs, and canonical parameter spaces. Also, denote by . Then, is called the dual of if
which implies that
As for an NEF-EVF and for the Poisson NEF generated by , it follows that these two families are dual (c.f., [30], Section 4.1). Among many others, one more example of duality is the pair of normal and inverse Gaussian NEFs. It should be noted, however, that duality is not valid for all NEFs.
4.7. Chainability
We introduce a new property of infinitely divisible probability measures, which we term chainability. Let
and be the union of with the set of positive Dirac measures on .
It is well known ([6], Chapter XIII) that a measure is infinitely divisible (i.d.) iff there exists a measure such that
If is also i.d., then there exists a measure such that
This procedure can be continued by assuming that and then and so forth, are also i.d. This process leads to the following property of chainable i.d. measures on .
Definition 1.
With the definitions of and above, let be an i.d. measure in and a sequence of i.d. measures in Then, is called infinitely chainable iff
It is called chainable of order , if are i.d. measures in such that
The problem of chainability raises some stimulating probabilistic questions, for instance, the formulation of necessary and/or sufficient conditions under which an i.d. measure is infinitely chainable or just chainable of order Yet, answering such questions goes beyond this paper’s scope and deserves a separate study.
Here, we only analyze the Landau distribution generating the NEF-EVF with respect to chainability.
Proposition 3.
The Landau distribution is infinitely chainable.
Proof.
For simplicity, denote and define the i.d. measure by
where
Now, for given by (23), we have
Thus, with L.T.
for which
implying that with
Continuing this way, we obtain, by a simple induction, that
i.e.,
This concludes the proof. □
4.8. Zero Regression Characterizations of
Let be a random sample taken from a common distribution P and let be a polynomial statistic (in the ’s) such that the regression of S on the sample mean is zero (or constant). If P is the only distribution for which such a property holds, we say that P is characterized by the zero regression of S on . The pioneering study of such characterizations is due to [22], who characterized all distributions (six in all) for which the regression of a quadratic form of S on is constant. Since their seminal work, numerous such characterizations have appeared in the literature (e.g., [31,32,33,34,35,36,37]). At this point, it should be noted that a zero regression characterization of P means a characterization of a family of distributions, say F, to which P belongs (e.g., the normal family with unknown location and scale parameters or the Poisson family with unknown mean).
Bar-Lev [38] provided methods enabling the characterization of ’almost’ any family F (at least those that constitute NEFs) by zero regression properties (e.g., a zero regression characterization of the generalized Laplace distribution; see [39]). Such methods suggest searching for cumulant relationships existing among the members of F. Indeed, let be a set of m arbitrary cumulants of F. Derive functional relations among these cumulants in the form , where g is a polynomial in the ’s, and then construct an unbiased polynomial statistic for g with the tools described in [38]. In the next proposition, we demonstrate such a process for obtaining a zero regression characterization of NEF-EVF (which can also be carried out for the EDM-EVF). For this, note that by (27), the cumulants of satisfy
Thus, possible polynomials g’s have the form
where, in particular, for ,
Now, let , and define
with
and
where the summations in (37) and (38) are taken over all distinct indices and ranging between 1 and n. It then can be simply shown that
Note also that the two components of can be represented in terms of , as follows:
and
We now have all the ingredients for the following characterization proposition of NEF-EVF distributions.
Proposition 4.
Let F be a non-degenerate distribution and be a random sample of size taken from F having a finite third moment with . Then, has a zero regression on iff F is an NEF-EVF distribution.
Proof.
We prove only the necessity part of the proposition as its sufficiency part is easily verified. Let be the characteristic function of and be some -neighborhood of the origin. Then by Lemma 1.1.1 of [33] and Lemma 1 of [38], it follows that if has a zero regression on then
where Thus,
which by integrating becomes
Set , then Integration by separation of variables leads to
and hence
where and c are arbitrary constants with Since and it follows that and so that
4.9. Large Deviations
In his seminal study of the characterization of NEFs by their VFs, ref. [3] introduced the following large deviation theorem for NEFs (see [3], Equation (9.1)). Let be an NEF with VF then for all where
5. Statistical Aspects of EDM-EVFs
In general, for various statistical aspects, particularly for generalized linear model (GLM) applications, it is more effective to represent an absolutely continuous EDM distribution to resemble the normal structure, i.e., instead of (12), writing the model densities as
(c.f., [18,40,41]). The structure in (43) is inappropriate for the discrete case (counting measures on ), as for different s, it changes the support of the measure For the latter case, the structure in (12) is appropriate.
Accordingly, for obtaining the structure (43) for EDM-EVF densities given in (25), we denote and map . This yields
as the densities that are appropriate for GLM applications as well as other statistical applications. Note that we also changed the variable of interest from x to y to make it more suitable for GLM usage. Accordingly, in the sequel, we denote a random sample of size n taken from (44) by
Henceforth, we shall describe the following statistical features related to EDM-EVF of the form (44): (1) maximum-likelihood estimation; (2) second-order minimax estimation of the mean; (3) test of hypotheses aspects; and (4) practical steps towards GLM applications. For estimation problems, we shall see that the steepness property of EDM-EVF plays an important role.
5.1. Maximum-Likelihood Estimates (MLEs) of the Mean and Dispersion Parameters
The log-likelihood function of based on the random sample taken from (44) is
Note that for the one-parameter case with known , the maximum likelihood equation for yields
This follows since the NEF (44) is steep (see [15], Chapter 9.6). The same result holds for the two-parameter case as well. Note, by (21), that is strictly increasing so its inverse is well defined. Hence, the MLE for (for both cases where known or unknown) is Let
then, the maximum-likelihood equation for is
where is the MLE of Equation (48) can be solved numerically with Newton–Raphson’s method or any other search algorithm, as is performed with all NEFs generated by positive stable distributions—all of which have integral forms of their generating measures.
One more aspect should be raised concerning the corresponding information matrix. Let denote the second partial derivative of with respect to and (where the first index i relates to differentiation with respect to and the second index j with respect to ), and . Since is differentiable in for almost all , the corresponding Fisher information matrix is diagonal; i.e., the parameters and are orthogonal. This observation can be easily seen as
and Moreover, adopting the tools and methods developed by [42], it is seen that
where as Moreover, the moments of exist and converge, respectively, to the moments of from which it can be deduced that
5.2. Second-Order Minimax Estimation of the Mean
Bar-Lev and Landsman [42] presented a modified second-order minimax estimator for the mean of EDMs associated with steep NEFs and established some of its asymptotic properties. They provided some necessary and sufficient conditions for such a modified estimator to improve on the sample mean . One of their necessary conditions requires the steepness of the EDM, as indeed is the case with EDM-EVF. They considered the EDM-EVF and showed the following result as a specific example.
Theorem 3
([42], Theorem 4). The estimator of mean m can be improved in the second-order minimax sense with respect to the power weight iff . Consequently, the second-order minimax estimator, which improves for any , is given by
where its mean squared error is
5.3. Testing Hypotheses
Various tests are available for assessing the fit of the EDM-EVF to real data, and an extensive literature deals with goodness-of-fit (gof) tests. Some are based on characterizations of the distributions belonging to the null hypothesis. Indeed, as [43] pointed out, characterization theorems or properties can be natural and practical starting points for constructing gof tests and are essential for assessing the validity of distributional models. The first idea of constructing gof tests based on a characterization of distribution in the realm of the null hypotheses is due to [44] (see [45]). Since then, various approaches to constructing gof tests have been suggested; for example, those developed by [46,47,48,49], and the references cited therein. However, the earliest explicit use of a characterization theorem for constructing a gof test was presented by [50], who used Shannon’s maximum entropy characterization to construct a test for a composite hypothesis of normality.
Recently, ref. [51] employed the zero regression characterizations for the Tweedie class with of NEFs having power VFs of the form to construct novel gof tests for deviation from any given family belonging to the Tweedie class. The zero regression characterizations are those obtained by [31,34] for all the Tweedie class, including those members with (see a comment on the latter members in the concluding section).
Accordingly, a similar gof test for testing
can be obtained by employing the zero regression characterization for the NEF-EVF presented in Proposition 4. The test statistic is naturally defined in (36)–(40). This test statistic has the desirable properties as detailed in the following theorem. Whenever convenient, we use and to denote F in and respectively. We also adopt the symbols , and ∼ for almost sure convergence, weak convergence (in distribution), and equivalence, respectively, as
Theorem 4.
Let be a random sample of size taken from a distribution F having first six finite moments, and let be the statistic defined in (36)–(40). Then, the following properties hold:
(i)
(ii)
(iii)
where c is a constant depending on the first six moments of F.
(iv)
(v) Under
where denotes a chi-squared distribution with 1 degree of freedom.
The proof of this theorem is straightforward but somewhat tedious. It can be carried out along the lines of the proof used in [51] for the Tweedie scale families. As this paper is mainly expository, we omit the proof for brevity. We do, however, sketch some helpful points related to it. The first part follows easily from the derivations preceding Proposition 4. For the three other parts, note that defined in (36)–(40), is a polynomial in the sample moments Hence, the almost sure convergence in part (ii) is straightforward. The variance of can be computed by . Then, by (36)–(40), note that the squared form of will yield expressions involving , a fact implying that the variance of , and thus also c, involves the sixth moment of F. The proof of part (iv) follows from the asymptotic multivariate normality of appropriately scaled, and then from the application of well-known classical results concerning the asymptotic normality of a function of these sample moments (cf. [52,53]). Part (v) follows trivially from (49) and (50).
Consequently, as a testing procedure under , one is inclined to reject for large absolute values of or large values of . Theorem 4 presents the general result concerning the asymptotic null distribution of . This limiting distribution depends on which is a function of the first six moments of the NEF-EVF distribution. So, we can write where Expressions for the ’s can be obtained directly from (27) and (28) with . Indeed, the cumulants of the NEF-EVF are given by
and the central moments are given in (28) as functions of the ’s. Thus, for instance, and , i.e., the ’s, are polynomials in the As the MLE for m is , we immediately obtain the MLE for The latter result is crucial when calculating the proposed test’s power.
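The passage from cumulants to central moments described above can be illustrated numerically. The following is a minimal sketch, assuming the unit VF of the NEF-EVF is V(m) = exp(m) (the displayed formulas are not reproduced here, so the paper's exact constants may differ). It uses the standard NEF recursion, in which each cumulant is V(m) times the derivative in m of the previous one, together with the standard identities expressing central moments up to order six through cumulants. The function names `evf_cumulants` and `central_moments` are hypothetical.

```python
import math


def evf_cumulants(m, order=6):
    """Cumulants kappa_1..kappa_order, ASSUMING the unit VF V(m) = exp(m).

    The NEF recursion kappa_{r+1} = V(m) * d(kappa_r)/dm then gives
    kappa_1 = m and kappa_r = (r-2)! * exp((r-1)*m) for r >= 2.
    """
    kappa = [m]
    for r in range(2, order + 1):
        kappa.append(math.factorial(r - 2) * math.exp((r - 1) * m))
    return kappa


def central_moments(kappa):
    """Central moments mu_2..mu_6 from cumulants kappa_1..kappa_6.

    These are the standard cumulant-to-central-moment identities; they do
    not depend on the assumed variance function.
    """
    _, k2, k3, k4, k5, k6 = kappa
    return {
        2: k2,
        3: k3,
        4: k4 + 3 * k2**2,
        5: k5 + 10 * k3 * k2,
        6: k6 + 15 * k4 * k2 + 10 * k3**2 + 15 * k2**3,
    }


# Plug-in usage: replace m by its estimate (for example, the sample mean).
mu_hat = central_moments(evf_cumulants(0.25))
```

Under this assumption the skewness and excess kurtosis are positive for every m, in line with the right-skewed, leptokurtic shape noted in the concluding remarks.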
One can also approximate the p-value and the critical points using a parametric bootstrap approach by applying the following procedure as suggested by [51]:
- For some large integer B, repeat the following steps for every
  - (a) Generate a bootstrap sample
  - (b) Based on the bootstrap sample, calculate the bootstrap version of the test statistic
- Approximate the p-value with
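The bootstrap procedure above can be sketched generically as follows. This is only an illustration under stated assumptions: no sampler for the EDM-EVF is assumed to exist, so the null model enters through user-supplied callables (`fit`, `sample`, `statistic`, all hypothetical names), and a normal model with a skewness-based statistic serves purely as a stand-in to exercise the routine. The p-value uses the common (1 + count)/(B + 1) convention, which may differ from the formula used in [51].

```python
import random
import statistics


def bootstrap_p_value(data, fit, sample, statistic, B=999, seed=0):
    """Parametric bootstrap p-value for a test rejecting for large statistics.

    fit(data) -> parameters estimated under the null model
    sample(params, n, rng) -> list of n draws from the fitted null model
    statistic(data) -> test statistic (large values indicate departure)
    """
    rng = random.Random(seed)
    t_obs = statistic(data)
    params = fit(data)
    n = len(data)
    exceed = 0
    for _ in range(B):
        boot = sample(params, n, rng)      # step (a): bootstrap sample
        if statistic(boot) >= t_obs:       # step (b): bootstrap statistic
            exceed += 1
    # Assumed convention: add-one correction keeps the p-value positive.
    return (1 + exceed) / (B + 1)


# Stand-in null model (normal), NOT the EDM-EVF.
def fit_normal(x):
    return statistics.fmean(x), statistics.pstdev(x)


def sample_normal(params, n, rng):
    mu, sigma = params
    return [rng.gauss(mu, sigma) for _ in range(n)]


def skewness_statistic(x):
    """n times the squared sample skewness."""
    mu = statistics.fmean(x)
    m2 = sum((v - mu) ** 2 for v in x) / len(x)
    m3 = sum((v - mu) ** 3 for v in x) / len(x)
    return len(x) * (m3 / m2 ** 1.5) ** 2
```

For the EDM-EVF itself, `fit` would return the estimated parameters, `sample` would wrap a random-number generator for the fitted model, and `statistic` would compute the test statistic defined in (36)–(40); none of these are provided here.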
Various alternatives in are listed in the introduction section above. Simulations should then be executed to assess the performance of the proposed gof test in terms of type I error rate.
5.4. Practical Steps towards GLM Applications
For GLM applications, we need the following ingredients. Equation (44) presents the EDM-EVF densities in the form required by GLM. However, for better insight, we represent them in terms of the mean m (rather than in terms of as
where, by (21), the mean value parameterization, and are
with VF
If Y, we also use the standard EDM’s notation and write . The mean, variance, and cumulants of such a Y are
We shall now consider two essential ingredients needed for GLM applications of EDM-EVFs (52), namely, the scaled deviance and the link function. These were introduced by [18,41] (see also [40,54]). We also discuss some relevant computational aspects involved.
- Scaled deviance and link function
  Consider
  Then, as (52) is steep, it follows that is obtained at (see (46) for ). Hence, the unit deviance
  can be considered as a distance measure with two properties: and for For the EDM-EVF, we obtain by (53) that
  and thus, for EDM-EVF,
  Consequently, (52) can be rewritten as
  GLMs assume a systematic component where the linear predictor
  is linked to the mean m through a link function g such that . For the EDM-EVF, we choose the canonical link function
  a relatively simple link function.
  The set of observations is where the ’s are independent with and is associated with the link function Here, the set of covariates is matrix so that we may write . The total and scaled deviances are given, respectively, by
  and
  When the saddlepoint approximation holds (and it holds for the EDM-EVF; see [54]), the scaled deviance approximately follows a chi-square distribution,
  at the true values of (for all i) and Consequently, the log-likelihood is
  The above provides all the necessary ingredients for GLM applications.
- Computational aspects
  Therefore, we have reason to call the Tweedie scale a Tweedie NEF with power infinity. The Tweedie class is composed of power VFs in the form , where for , the corresponding NEFs are generated by positive stable distributions. We already noticed in Section 4.2 that the VF of the EDM-EVF is a limit of a sequence of VFs in the Tweedie scale distributions (which are absolutely continuous with respect to Lebesgue measure on ), except for the inverse Gaussian NEF (), none of which have an expressible density function but rather are expressed in terms of integral form (or power series), a situation that also occurs with the EDM-EVF (power infinity). The Tweedie scale with power comprises NEFs generated by extreme stable distributions and lacks the steepness property (this will be discussed in the concluding remarks section).
At this point, it seems fair to note that the Tweedie class should also be attributed to [28] for their study of power VFs through the analysis of the notion of reproducibility (see [29] for further details). Indeed, in recent papers, the Tweedie class has been abbreviated as the TBE class.
The situation above, whereby this class of NEFs does not have explicitly expressed densities, probably prevented its use for statistical modeling for quite some years. This difficulty has since been resolved thanks to the availability of powerful mathematical software. Ref. [54] studied two methods for evaluating the density function of a Tweedie distribution, which are based on the inversion of the cumulant generating function using Fourier inversion and the saddlepoint approximation. An algorithm for evaluating the density function based on series expansions was presented by [55] (for these evaluation aspects, see also [56,57]). Dunn created and maintained the tweedie R package [58], while [59] contributed to and maintained the statmod R package. In this framework, the function tweedie.profile in the tweedie R package enables the fitting of TBE models in practice. These packages can be extended to include the EDM-EVF as well.
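The unit deviance and canonical link discussed in this subsection can be illustrated with a short numerical sketch. It rests on two assumptions that should be checked against (52) and (53): the unit VF is taken to be V(m) = exp(m), and the unit deviance is the usual EDM one, d(y, m) = 2∫ (y − t)/V(t) dt taken from m to y. Under these assumptions the integral has the closed form 2[exp(−y) + (y − m − 1)exp(−m)], and the canonical link (an antiderivative of 1/V) is g(m) = −exp(−m). All function names below are hypothetical.

```python
import math


def unit_deviance_numeric(y, m, variance, n=2000):
    """d(y, m) = 2 * integral from m to y of (y - t) / V(t) dt,
    approximated with the composite Simpson rule on n (even) subintervals."""
    if y == m:
        return 0.0
    if n % 2:
        n += 1
    h = (y - m) / n  # negative when y < m; the signed integral is still correct

    def integrand(t):
        return (y - t) / variance(t)

    total = integrand(m) + integrand(y)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * integrand(m + k * h)
    return 2.0 * total * h / 3.0


def unit_deviance_evf(y, m):
    """Closed form of the unit deviance, ASSUMING V(m) = exp(m)."""
    return 2.0 * (math.exp(-y) + (y - m - 1.0) * math.exp(-m))


def canonical_link_evf(m):
    """Canonical link, ASSUMING V(m) = exp(m): an antiderivative of 1/V."""
    return -math.exp(-m)


def total_deviance(ys, ms):
    """Sum of unit deviances over paired observations and fitted means."""
    return sum(unit_deviance_evf(y, m) for y, m in zip(ys, ms))
```

The numerical routine accepts any positive variance function, so it can be used to cross-check the closed form or to adapt the sketch if the paper's parameterization of the VF differs from the one assumed here.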
6. Concluding Remarks
In this study, we presented a comprehensive review and further developed various properties of the class of EDM-EVF distributions, and found that it is rich in probabilistic and statistical properties. This class of absolutely continuous distributions, supported on the whole real line, possesses a simple VF, simple cumulants, and simple central moments, and its members are skewed to the right and leptokurtic. Regarding probabilistic aspects, we illustrated features of EDM-EVF distributions related to reciprocity, self-decomposability, unimodality, reproducibility, duality, chainability, and large deviations. We also provided some characterizations by zero regression on the sample mean.
We also described some statistical features. Mainly, we considered maximum-likelihood estimation, second-order minimax estimation of the mean, and hypothesis testing, and presented practical steps toward generalized linear model applications. However, applying the EDM-EVF distributions to real-world data presents a multifaceted challenge that necessitates advanced estimation techniques and corresponding goodness-of-fit tests. These challenges primarily revolve around computational complexities. The first significant challenge lies in estimating the parameter p. This estimation must be executed numerically, as the probability density function includes an integral with no closed-form solution. The second challenge arises when one wants to use a classical goodness-of-fit test, such as the Kolmogorov–Smirnov test. Since the null hypothesis is composite, the p-value of the test should be computed using bootstrap methods. This, in turn, requires both an algorithm for generating random values from the EDM-EVF distribution and an algorithm for estimating the unknown parameters. All of these make GLM applications even more complex. The execution and analysis of the latter statistical aspects constitute a distinct project, as it involves developing appropriate tools, for example, in R. Such a computationally oriented project is now being carried out in collaboration with other researchers.
We trust that the proposed EDM-EVF will play an important role in modeling real data, mainly due to its simple link function, simple mean value parameterization, and other properties.
Funding
This research received no external funding.
Data Availability Statement
Not applicable.
Acknowledgments
I am grateful to Gérard Letac for his helpful comments and discussion, which enriched the presentation of the paper. I am also thankful to two reviewers for their constructive comments.
Conflicts of Interest
The author declares no conflict of interest.
References
- Fenstad, G.U. A comparison between the U and V tests in the Behrens-Fisher problem. Biometrika 1983, 70, 300–302.
- Jeansonne, M.S.; Foley, J.P. Review of the exponentially modified Gaussian function since 1983. J. Chromatogr. Sci. 1991, 29, 258–266.
- Morris, C.N. Natural Exponential Families with Quadratic Variance Functions. Ann. Stat. 1982, 10, 65–80.
- Johnson, N.L. Systems of frequency curves generated by methods of translation. Biometrika 1949, 36, 149–176.
- Lukacs, E. Characteristic Functions, 2nd ed.; Hafner: New York, NY, USA, 1970.
- Feller, W. An Introduction to Probability Theory and Its Applications, 1st ed.; Wiley: New York, NY, USA, 1966; Volume 2.
- Eaton, M.L.; Morris, C.; Rubin, H. On extreme stable laws and some applications. J. Appl. Probab. 1971, 8, 794–801.
- Grosswald, E. The Student t-distribution of any degree of freedom is infinitely divisible. Z. Wahrscheinlichkeitstheorie Verw. Geb. 1976, 36, 103–109.
- Nolan, J.P. Stable Distributions: Models for Heavy Tailed Data; Retrieved from American University; Birkhäuser: Boston, MA, USA, 2010.
- Landau, L. On the energy loss of fast particles by ionization. J. Phys. (USSR) 1944, 8, 201–205.
- Marucho, M.; Garcia-Canal, C.; Fanchiotti, H. The Landau distribution for charged particles traversing thin films. Int. J. Mod. Phys. C 2006, 17, 1461–1476.
- Bulyal, E.; Shul’ga, N. Landau distribution of ionization losses: History, importance, extensions. arXiv 2022, arXiv:2209.06387v1.
- Letac, G.; Mora, M. Natural real exponential families with cubic variance functions. Ann. Stat. 1990, 18, 1–37.
- Mora, M. La convergence des fonctions variances des familles exponentielles naturelles. Ann. Faculté Sci. Toulouse 1990, 11, 105–120.
- Barndorff-Nielsen, O.E. Information and Exponential Families in Statistical Theory; Wiley: New York, NY, USA, 1978.
- Bar-Lev, S.K.; Kokonendji, C.C. On the mean value parameterization of natural exponential families—A revisited review. Math. Methods Stat. 2017, 26, 159–175.
- Jorgensen, B. Exponential dispersion models (with discussion). J. R. Stat. Soc. Ser. B 1987, 49, 127–162.
- Jorgensen, B. The Theory of Dispersion Models; Chapman and Hall: London, UK, 1997.
- Burridge, J. Discussion on paper by B. Jorgensen, Exponential dispersion models. J. R. Stat. Soc. Ser. B 1987, 49, 150–152.
- Kendall, M.G.; Stuart, A. The Advanced Theory of Statistics 1, 4th ed.; Macmillan: New York, NY, USA, 1977.
- Bar-Lev, S.K. Discussion on paper by B. Jorgensen, Exponential dispersion models. J. R. Stat. Soc. Ser. B 1987, 49, 153–154.
- Laha, R.G.; Lukacs, E. On a problem connected with quadratic regression. Biometrika 1960, 47, 335–345.
- Bingham, N.H. Fluctuation theory in continuous time. Adv. Appl. Probab. 1975, 7, 705–766.
- Lukacs, E. Developments in Characteristic Functions Theory; Griffin: London, UK, 1983.
- Yamazato, M. Unimodality in infinitely divisible distribution functions of class L. Ann. Probab. 1978, 6, 523–531.
- Bar-Lev, S.K.; Bshouty, D.; Letac, G. Natural exponential families and self-decomposability. Stat. Probab. Lett. 1992, 13, 147–152.
- Bar-Lev, S.K.; Casalis, M. A classification of reducible natural exponential families in the broad sense. J. Theor. Probab. 2003, 16, 175–196.
- Bar-Lev, S.K.; Enis, P. Reproducibility and natural exponential families with power variance functions. Ann. Stat. 1986, 14, 1507–1522.
- Bar-Lev, S.K. Independent, tough Identical results: The class of Tweedie on power variance functions and the class of Bar-Lev and Enis on reproducible natural exponential families. Int. J. Stat. Probab. 2019, 9, 30–35.
- Letac, G. Duality for real and multivariate exponential families. J. Multivar. Anal. 2022, 188, 104811.
- Bar-Lev, S.K.; Stramer, O. Characterizations of natural exponential families with power variance functions by zero regression properties. Probab. Theory Related Fields 1987, 76, 509–522.
- Gordon, F.S. Characterizations of populations using regression properties. Ann. Stat. 1973, 1, 114–126.
- Kagan, A.M.; Linnik, Y.V.; Rao, C.R. Characterization Problems in Mathematical Statistics; Wiley: New York, NY, USA, 1973.
- Bar-Lev, S.K.; Bshouty, D.; van der Duyn, S. Zero regression characterizations of natural exponential families—A complementary. Math. Methods Stat. 2004, 13, 1–12.
- Bar-Lev, S.K.; Kagan, A. Bivariate distributions with Gaussian-Type dependence structure. Commun. Stat. Theory Methods 2009, 38, 2669–2676.
- Fosam, E.B.; Shanbhag, D.N. An extended Laha–Lukacs characterization results based on a regression property. J. Stat. Planning Inference 1997, 63, 173–186.
- Wesolowski, J. Characterizations of distributions by constant regression of quadratic statistics on a linear one. Sankhya Ser. A 1990, 52, 383–386.
- Bar-Lev, S.K. Methods of constructing characterizations by constancy of regression on the sample mean and related problems for NEF’s. Math. Methods Stat. 2007, 16, 96–109.
- Bar-Lev, S.K.; Bshouty, D. A characterization of the generalized Laplace distribution by constant regression on the sample mean. Stat. Probab. Lett. 2016, 113, 79–83.
- Dunn, P.K.; Smyth, G.K. Generalized Linear Models with Examples in R; Springer: Berlin/Heidelberg, Germany, 2018.
- McCullagh, P.; Nelder, J.A. Generalized Linear Models, 2nd ed.; Springer: Berlin/Heidelberg, Germany, 1989.
- Bar-Lev, S.K.; Landsman, Z. Exponential dispersion models: Second-order minimax estimation of the mean for unknown dispersion parameter. J. Stat. Planning Inference 2006, 136, 3837–3851.
- Wilding, G.E.; Mudholkar, G.S. A gamma goodness-of-fit test based on characteristic independence of the mean and coefficient of variation. J. Stat. Planning Inference 2008, 138, 3813–3821.
- Linnik, Y.V. Linear forms and statistical criteria. I, II. Ukrain. Mat. Zh. 1963, 5, 207–243, 247–290. (In Russian). English translation in Sel. Transl. Math. Statist. Prob. 1963, 3, 1–90.
- Nikitin, Y.Y. Tests based on characterizations, and their efficiencies: A survey. Acta Comment. Univ. Tartu. Math. 2017, 21, 3–24.
- Bar-Lev, S.K.; Batsidis, A.; Economou, P. Tweedie, Bar-Lev, and Enis class of leptokurtic distributions as a candidate for modeling real data. Commun. Stat. Case Stud. Data Anal. Appl. 2021, 7, 229–248.
- Marchetti, C.E.; Mudholkar, G.S. Characterization theorems and goodness-of-fit test. In Goodness-of-Fit Tests and Model Validity, Statistics for Industry and Technology; Huber-Carol, C., Balakrishnan, N., Nikulin, M.S., Mesbah, M., Eds.; Birkhäuser: Boston, MA, USA, 2002.
- Milosević, B. Asymptotic efficiency of goodness-of-fit tests based on Too-Lin characterization. Commun. Stat.-Simul. Comput. 2020, 49, 2082–2101.
- Mudholkar, G.S.; Lin, C.T. On two applications of characterization theorems to goodness-of-fit. Colloq. Math. Soc. Janos Bolyai 1984, 45, 395–414.
- Vasicek, O. A test of normality based on sample entropy. J. R. Stat. Soc. Ser. B 1976, 38, 54–59.
- Bar-Lev, S.K.; Batsidis, A.; Einbeck, J.; Liu, X.; Ren, P. Cumulant-Based Goodness-of-Fit Tests for the Tweedie, Bar-Lev and Enis Class of Distributions. Mathematics 2023, 11, 1603.
- Doob, J.L. The limiting distributions of certain statistics. Ann. Math. Stat. 1935, 6, 160–170.
- Hsu, C.T. The limiting distribution of a general class of statistics. Sci. Rec. (Acad. Sin.) 1942, 1, 37–41.
- Dunn, P.K.; Smyth, G.K. Evaluation of Tweedie exponential dispersion model densities by Fourier inversion. Stat. Comput. 2008, 18, 73–86.
- Dunn, P.K.; Smyth, G.K. Series evaluation of Tweedie exponential dispersion model densities. Stat. Comput. 2005, 15, 267–280.
- Vinogradov, V.; Paris, R.B.; Yanushkevichiene, O. New properties and representations for members of the power-variance family, I. Lith. Math. J. 2012, 52, 444–461.
- Vinogradov, V.; Paris, R.B.; Yanushkevichiene, O. New properties and representations for members of the power-variance family, II. Lith. Math. J. 2013, 53, 103–120.
- Dunn, P.K. Tweedie: Evaluation of Tweedie Exponential Family Models. R Package Version 2.3.5. 2022. Available online: https://cran.r-project.org/web/packages/tweedie/tweedie.pdf (accessed on 12 September 2023).
- Smyth, G.K. Statmod: Statistical Modeling. 2017. Available online: https://CRAN.R-project.org/package=statmod (accessed on 12 September 2023).
© 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).