On the Notion of Reproducibility and Its Full Implementation to Natural Exponential Families

Shaul K. Bar-Lev

doi:10.3390/math9131568

Faculty of Industrial Engineering and Technology Management, Holon Institute of Technology, Holon 5810201, Israel

Mathematics 2021, 9(13), 1568; https://doi.org/10.3390/math9131568

This article belongs to the Special Issue Advances in Applications of Probability Theory and Stochastic Processes

Abstract

Let $F = \{F_{θ} : θ \in Θ \subset R\}$ be a family of probability distributions indexed by a parameter $θ$, and let $X_{1}, \dots, X_{n}$ be i.i.d. r.v.s with $L (X_{1}) = F_{θ} \in F$. Then, $F$ is said to be reproducible if for all $θ \in Θ$ and $n \in N$, there exist a sequence ${(α_{n})}_{n \geq 1}$ and a mapping $g_{n} : Θ \to Θ$, $θ ⟼ g_{n} (θ)$, such that

L (α_{n} \sum_{i = 1}^{n} X_{i}) = F_{g_{n} (θ)} \in F .

In this paper, we prove that a natural exponential family $F$ is reproducible iff it possesses a variance function which is a power function of its mean. This result generalizes that of Bar-Lev and Enis (1986, The Annals of Statistics), who proved a similar but partial statement under the assumption that $F$ is steep and under rather restrictive constraints on the forms of $α_{n}$ and $g_{n} (θ)$. We show that such restrictions are not required. In addition, we examine various aspects of reproducibility, both theoretically and practically, and discuss the relationship between reproducibility, convolution and infinite divisibility. We suggest new avenues for characterizing other classes of families of distributions with respect to their reproducibility and convolution properties.

Keywords:

natural exponential families; reproducibility; infinite divisibility; variance function; functional equation

1. Introduction

The notion of a distribution function, F, which is reproductive with respect to a parameter $θ \in Θ$, where Θ is a nonempty subset of $R$, was first introduced by Wilks [1] as follows: Let $X_{1}$ and $X_{2}$ be independent r.v.s with distributions $F (.; θ_{1})$ and $F (.; θ_{2})$, respectively. If for any $θ_{1}, θ_{2} \in Θ$, the distribution of $Y = X_{1} + X_{2}$ is $F (.; θ_{1} + θ_{2})$, then the distribution $F (.; θ)$ is said to be reproductive with respect to $θ$. In this respect, Wilks mentioned the Poisson, chi-square and normal distributions.

After Wilks introduced this definition, there were early attempts to characterize such families, cf. Bolger and Harkness [2], Baringhaus, Davies, and Plachky [3] and Bar-Lev and Enis [4], but all ended in characterizing the Poisson distribution. A broader definition of Wilks's concept was given by Bar-Lev and Enis [5], who then applied it in an in-depth analysis of one-parameter natural exponential families (NEFs). (Well-known preliminaries on NEFs are presented in Section 3.) The definition of reproducibility given by Bar-Lev and Enis [5] is the following.

Definition 1.

Let $F = \{F_{θ} : θ \in Θ \subset R\}$ be a family of distributions indexed by a parameter $θ \in Θ$, where Θ has a nonempty interior. Let $X_{1}, \dots, X_{n}$ be i.i.d. r.v.s with $L (X_{1}) = F_{θ} \in F$ (where $L (X)$ stands for the law of $X$). Then, $F$ is said to be reproducible if for all $θ \in Θ$ and $n \in N$, there exists a sequence ${(α_{n})}_{n \geq 1}$ (called the stabilizing constants), which is either identically equal to 1 or a one-to-one mapping $α_{n} : N \to R$, and there exists a mapping $g_{n} : Θ \to Θ$, $θ ⟼ g_{n} (θ)$, such that

L (α_{n} \sum_{i = 1}^{n} X_{i}) = F_{g_{n} (θ)} \in F .

(1)

Equivalently, in terms of characteristic functions, the reproducibility property requires that

f (t, g_{n} (θ)) = f^{n} (α_{n} t, θ), \forall n \in N, t \in R and θ \in Θ,

where $f (\cdot, θ)$ is the characteristic function of $X_{1}$.

Note that for $α_{n} \equiv 1$, the reproducibility property means that the convolution belongs to the same family, while for $α_{n} = 1 / n$, it means that the distribution of the sample mean belongs to the same family as each of the $X_{i}$'s.
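As a quick numerical illustration of the characteristic-function form of Definition 1, the following Python sketch checks the identity for the normal family with unit variance. The choices $α_{n} = n^{- 1 / 2}$ and $g_{n} (θ) = n^{1 / 2} θ$ are the standard ones for this family and are assumptions of the sketch; all function names are illustrative only.

```python
import cmath
import math

def normal_cf(t, theta):
    """Characteristic function of N(theta, 1) evaluated at t."""
    return cmath.exp(1j * theta * t - 0.5 * t * t)

def alpha(n):
    """Assumed stabilizing constant for the unit-variance normal family."""
    return n ** -0.5

def g(n, theta):
    """Assumed parameter map g_n(theta) = sqrt(n) * theta."""
    return math.sqrt(n) * theta

def max_gap(n_values, t_values, theta):
    """Largest |f(t, g_n(theta)) - f(alpha_n t, theta)^n| over the grid."""
    gap = 0.0
    for n in n_values:
        for t in t_values:
            lhs = normal_cf(t, g(n, theta))
            rhs = normal_cf(alpha(n) * t, theta) ** n
            gap = max(gap, abs(lhs - rhs))
    return gap

gap = max_gap(range(1, 8), [-2.0, -0.5, 0.3, 1.7], theta=0.8)
print(gap)  # expected to be at rounding-error level
```

The check works on characteristic functions rather than on simulated samples, so it is deterministic and exact up to floating-point rounding.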

Bar-Lev and Enis [5] implemented their Definition 1 to the class of natural exponential families (NEFs) by imposing the following restrictions:

\begin{cases} \text{(i) } g_{n} \text{ is an onto mapping}, \\ \text{(ii) } α_{n} \text{ has the form } α_{n} = n^{β}, β \in R, \\ \text{(iii) the NEF } F \text{ is steep}, \end{cases}

(2)

where steepness of $F$ means that the mean parameter space of $F$ coincides with the interior of its common convex support. This definition of steepness will be further elaborated in the sequel. Their implementation of the notion of reproducibility of NEFs led to the characterization of the class of steep NEFs having power variance functions (VFs).

An analysis of the notion of reproducibility from a different angle than that formulated in Definition 1 and Equation (1) was given in Bar-Lev and Casalis [6,7]. They defined an NEF $F$ to be reproducible in the broad sense as follows. Let $F$ be an NEF associated with a Laplace transform L. If there exists a triple $(α, β, λ) \in (R, R, R^{+})$ such that $f_{α, β} (F) = F_{λ}$, where $f_{α, β} (F)$ is the image of $F$ under the affine transformation $f_{α, β} : x ⟼ α x + β$ and $F_{λ}$ is the NEF associated with the Laplace transform $L^{λ}$, then $F$ is said to be reproducible in the broad sense. They obtained a complete classification of the class of NEFs that are reproducible in the broad sense and showed that this class is composed of NEFs having power VFs, NEFs having exponential VFs, and NEFs constituting discrete versions of the latter. In a completely different context, unrelated to the notion of reproducibility, the class of NEFs having power VFs was introduced by Tweedie [8] and became widely known only after being described in Jørgensen [9].

In this paper, we adhere to the original definition of reproducibility presented in Definition 1. In Section 2, we present a number of issues related to reproducibility for general classes of parametric families of distributions and discuss some practical aspects. Section 3 is devoted to reproducibility in NEFs. Indeed, we prove that the three requirements in (2) assumed by Bar-Lev and Enis [5] are not needed, as one can obtain their characterization results and more without assuming that (2) holds. Our method of proof enables us to delineate all NEFs having power VFs, including those missed by Bar-Lev and Enis [5] due to their steepness restriction. Naturally, the latter assumption did not allow them to characterize the subclass of non-steep NEFs having VFs with negative power. Such a subclass of non-steep NEFs is huge, as it contains all NEFs generated by stable distributions with a stable index in $(1, 2)$. In the same context, one should note that Tweedie [8] was wrong when he claimed that such a subclass does not exist (see Jørgensen [9] and Bar-Lev [10] for more details). In fact, we will show in Section 3 that the three assumptions in (2) follow trivially from the proof presented in this paper.

2. Some General Noteworthy Comments on the Notion of Reproducibility

A number of noteworthy comments on the notion of reproducibility will now follow:

1.: Restriction on the support of $F$

It would appear pretentious to think that one can characterize the property of reproducibility for all parametric families. It therefore seems necessary to limit ourselves in a number of respects. One of them, for example, is to require that all members of $F$ possess a common support. The second, a natural one, is to require that the family $F = \{F_{θ} : θ \in Θ \subset R\}$ is identifiable, i.e., if $θ_{1}, θ_{2} \in Θ$ with $θ_{1} \neq θ_{2}$, then $F_{θ_{1}} \neq F_{θ_{2}}$ on at least one Borel set. Accordingly, in order to characterize reproducible parametric families efficiently, one should impose the following assumption.

Assumption A1.

(i): For all $θ \in Θ$ , $F_{θ}$ possesses a common support $S .$ Let $(c, d)$ denote the interior of the convex-hull of S, where $- \infty \leq c < d \leq \infty$ .
(ii): $F$ is identifiable.

Some immediate conclusions which follow from Assumption A1 are presented in the following lemma.

Lemma 1.

Suppose that

F = \{F_{θ} : θ \in Θ \subset R\}

is reproducible in the sense of Definition 1 and that it satisfies Assumption A1. Then,

(i): c and d do not depend on $θ,$
(ii): for any fixed $n \in N$ , the mapping $g_{n}$ is one-to-one from Θ into itself,
(iii): if $α_{n} \neq 1 / n$ , then either $c = - \infty$ and $d = 0$ or $\infty;$ or $c = 0$ and $d = \infty .$
(iv): $α_{n} > 0,$ for all $n \in N$ .

Proof .

(i): Simple, as $F$ depends on $θ$ only and is assumed to have a common support $S .$
(ii): If $θ_{1} \neq θ_{2}$ , then $g_{n} (θ_{1}) \neq g_{n} (θ_{2})$ , since otherwise the identifiability of $F$ would be contradicted.
(iii): If $α_{n} \neq 1 / n$ , then relation (1) implies that

$n α_{n} (c, d) = (n α_{n} c, n α_{n} d) = (c, d)$

(3)

and thus, the desired result is obtained.
(iv): This follows from (3).

□

The situation where $α_{n} = 1 / n$, which is excluded in Lemma 1, is satisfied by several reproducible families that are not NEFs. The most famous example is the Cauchy distribution (with an unknown scale parameter $θ$ and a fixed location parameter $δ$) having the density

f_{θ} (x) = \frac{1}{π θ [1 + {(\frac{x - δ}{θ})}^{2}]}, - \infty < x < \infty, θ \in R^{+} .

For this example, $L ({\bar{X}}_{n}) = L (X_{1})$, $c = - \infty$, $d = \infty$ and $g_{n} (θ) \equiv θ$ for all $n \in N$.
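The Cauchy case can be illustrated by a small seeded simulation. The sketch below is only an illustration under stated assumptions: it draws Cauchy variates by the inverse-CDF method, takes $δ = 0$, and compares the interquartile range (which equals $2 θ$ for a Cauchy law) of single observations with that of sample means. The sample sizes, the seed and all function names are arbitrary choices made here.

```python
import math
import random

def cauchy_draw(rng, scale, location=0.0):
    """One Cauchy(location, scale) draw via the inverse-CDF method."""
    return location + scale * math.tan(math.pi * (rng.random() - 0.5))

def interquartile_range(values):
    """Crude empirical interquartile range based on order statistics."""
    ordered = sorted(values)
    size = len(ordered)
    return ordered[(3 * size) // 4] - ordered[size // 4]

def compare_iqr(scale=2.0, n=25, replicates=4000, seed=12345):
    """IQR of single draws versus IQR of means of n draws."""
    rng = random.Random(seed)
    singles = [cauchy_draw(rng, scale) for _ in range(replicates)]
    means = [sum(cauchy_draw(rng, scale) for _ in range(n)) / n
             for _ in range(replicates)]
    return interquartile_range(singles), interquartile_range(means)

iqr_single, iqr_mean = compare_iqr()
print(iqr_single, iqr_mean)  # both should be close to 2 * scale = 4
```

Averaging does not shrink the spread here, in contrast to families with finite variance, which is what $g_{n} (θ) \equiv θ$ expresses.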

2.: The restriction that c and d do not depend on $θ$ and that $α_{n}$ is a one-to-one mapping

We restricted the analysis to a situation where neither c nor d depends on $θ$, as without such a restriction the task of characterizing reproducibility would be almost impossible to execute. Below are two examples (out of many) of reproducible families of distributions for which either c or d depends on $θ$.

Example 1.

Let $F = \{Bin (θ, 1 / 2); θ \in N\}$, $α_{n} \equiv 1$ and $g_{n} (θ) = n θ$; then $α_{n} \sum_{i = 1}^{n} X_{i} \sim Bin (g_{n} (θ), 1 / 2) \in F$. Note that here d depends on θ and $α_{n}$ is not one-to-one.
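Example 1 can be verified directly by convolving probability mass functions. The following sketch, with illustrative helper names and arbitrarily chosen values $θ = 3$ and $n = 4$, computes the n-fold convolution of the $Bin (θ, 1 / 2)$ pmf and compares it with the $Bin (n θ, 1 / 2)$ pmf.

```python
from math import comb

def binomial_pmf(size, p=0.5):
    """List of Bin(size, p) probabilities for k = 0, ..., size."""
    return [comb(size, k) * p ** k * (1 - p) ** (size - k)
            for k in range(size + 1)]

def convolve(p, q):
    """Pmf of the sum of two independent integer-valued variables."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out

def n_fold(pmf, n):
    """n-fold convolution of a pmf with itself."""
    result = [1.0]  # point mass at 0, the neutral element for convolution
    for _ in range(n):
        result = convolve(result, pmf)
    return result

theta, n = 3, 4
lhs = n_fold(binomial_pmf(theta), n)   # law of X_1 + ... + X_n
rhs = binomial_pmf(n * theta)          # Bin(g_n(theta), 1/2), g_n(theta) = n * theta
max_gap = max(abs(a - b) for a, b in zip(lhs, rhs))
print(len(lhs), len(rhs), max_gap)
```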

Example 2.

Let $U_{1}, \dots, U_{n}$ be n i.i.d. $U (0, θ)$ r.v.s. For $k - 1 < θ \leq k$, denote by $f_{θ}$ the density of $\sum_{i = 1}^{k} U_{i}$. Then, this density has the reproducibility property. Note that S depends on θ and $α_{n}$ is not one-to-one.

3.: Reproducibility and infinite divisibility

At first intuitive glance, it was conjectured that reproducibility of $F$ entails its infinite divisibility. Recall that a probability distribution is called infinitely divisible if, for every positive integer n, it can be expressed as the probability distribution of the sum of n independent and identically distributed random variables. Accordingly, infinite divisibility of a probability distribution entails that the convex hull of its support cannot be bounded (cf. Chapter 17 of Feller [11]).

The two examples above show that such a conjecture is false as the corresponding families have bounded supports and thus are not infinitely divisible. However, then the question arose as to whether reproducible families with unbounded support are infinitely divisible. The following example (provided to the author via a personal communication by Gérard Letac - Institut de Mathématiques de Toulouse, Université Paul Sabatier, France) shows that the related conjecture is also false.

Example 3.

This is an example in which S is unbounded and does not depend on $θ$. Let $a > 0$, n be a positive integer, $Z \sim N (0, 1)$ and $B_{n} \sim B (n, 1 / 2)$, where Z and $B_{n}$ are independent. Define

Y = \sqrt{n a} Z + B_{n},

then the distribution $P_{a, n}$ of Y depends on the two parameters n and a. However, if one defines the parameter

θ = \frac{1}{1 + a} + n, θ > 1,

then $P_{θ} = P_{a, n}$ is parameterized by θ and has the reproducibility property, although it is not infinitely divisible.
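The reproducibility claimed in Example 3 can be checked on characteristic functions. The sketch below rests on our own reading of the example, which the text does not spell out: the sum of m i.i.d. copies of Y is taken with $α_{m} \equiv 1$ and is assumed to map $(a, n)$ to $(a, m n)$. The decoding of θ into $(a, n)$ uses $0 < 1 / (1 + a) < 1$; all function names are hypothetical.

```python
import cmath
import math

def decode(theta):
    """Recover (a, n) from theta = 1/(1+a) + n, using 0 < 1/(1+a) < 1."""
    n = math.floor(theta)
    a = 1.0 / (theta - n) - 1.0
    return a, n

def cf(t, theta):
    """Characteristic function of Y = sqrt(n a) Z + B_n."""
    a, n = decode(theta)
    gaussian_part = cmath.exp(-0.5 * n * a * t * t)
    binomial_part = ((1.0 + cmath.exp(1j * t)) / 2.0) ** n
    return gaussian_part * binomial_part

def g(m, theta):
    """Assumed parameter map for a sum of m copies: (a, n) -> (a, m n)."""
    a, n = decode(theta)
    return 1.0 / (1.0 + a) + m * n

theta = 3.25  # corresponds to a = 3 and n = 3
gap = max(abs(cf(t, g(m, theta)) - cf(t, theta) ** m)
          for m in range(1, 6) for t in (-1.3, 0.4, 2.1))
print(decode(theta), gap)
```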

4.: Reproducibility and convolutions

Let $F$ be reproducible in the sense of (1), i.e.,

L (α_{n} \sum_{i = 1}^{n} X_{i}) = F_{g_{n} (θ)} \in F,

from which it follows, since $α_{n} > 0$ by Lemma 1, that the distribution of the convolution of the $X_{i}$s, $i = 1, \dots, n$, is

P (\sum_{i = 1}^{n} X_{i} \leq x) = P (α_{n} \sum_{i = 1}^{n} X_{i} \leq α_{n} x) = F_{g_{n} (θ)} (α_{n} x),

(4)

where $F_{g_{n} (θ)}$ is the distribution of $X_{1}$ with parameter $g_{n} (θ) \in Θ$, i.e., a sum of i.i.d. r.v.s has the same distribution type as any of its components.

Convolutions of n i.i.d. random variables play an important role in statistical inference, probability, stochastic processes, risk theory and insurance. Some immediate examples are: (i) for NEFs in statistical modeling, $Y_{n} = \sum_{i = 1}^{n} X_{i}$ is the minimal sufficient statistic for $θ$ (see below), the distribution of which is needed for all statistical inference aspects on $θ$; (ii) for an $M / G / 1$ queuing system with “first come, first served” queue discipline and service time distribution F, the n-fold convolution of F with itself is needed for computing the distribution of the length of the busy period; (iii) in insurance risk, the aggregated claim is a random sum of i.i.d. r.v.s with common distribution F and thus the n-fold convolution of F is needed for various computations (e.g., Shushi and Yaob [12], Bahnemann [13]). In general, however, the derivation of a convolution is rather complex and cumbersome. If F belongs to a reproducible family, then the convolution has the same distribution type as any of its components (as in (4)), and no further complex computations are needed. Indeed, the reproducibility property has been employed in Bar-Lev and Ridder [14] for computing the insurance risk of aggregated claim data and was then applied to real car insurance claim data.

5.: An extension of the reproducibility notion to the multi-parameter case

Naturally, the reproducibility property can be easily extended in the sense of Definition 1 to the multi-parameter case. Three examples of reproducible two-parameter $(θ_{1}, θ_{2})$ NEFs appear in Bar-Lev and Reiser [15]. These are the two-parameter gamma, inverse Gaussian and normal NEFs. For these three NEFs, the distribution of the sample mean ${\bar{X}}_{n}$ is of the same type as that of $X_{1}$ but with parameters $(n θ_{1}, n θ_{2})$ (see Bar-Lev and Reiser [15] for the definition of the parameters $(θ_{1}, θ_{2})$ for each of these NEFs).

Obviously, from both theoretical and practical (convolutions) aspects it would be interesting to characterize families of multi-parameter distributions by the reproducibility property.

3. NEFs—Preliminaries and Characterization by the Reproducibility Property

In this section, we first present some required preliminaries on NEFs and their associated VFs. These preliminaries are taken from the fundamental paper of Letac and Mora [16]. We then introduce the main result and characterize reproducible NEFs in the sense of Definition 1. In particular, we show that the sequence $α_{n}$ has the form $n^{β}$ for some $β \in R$, and that an NEF is reproducible iff it possesses a power VF. Afterwards, we present a table of the class of reproducible NEFs with power VFs along with their associated $α_{n}$, $g_{n} (θ)$, $S$ and $Θ$.

3.1. Some Preliminaries on NEFs

Let $ν$ be a positive Radon measure on $R$ with convex support $C_{ν}$. Consider the set

D_{ν} ≐ \{θ \in R : L_{ν} (θ) ≐ \int_{R} exp (θ x) ν (d x) < \infty\},

(5)

and assume that $Θ_{ν} ≐ int D_{ν}$ is nonempty. Then, the NEF $F (ν)$ generated by $ν$ is defined by the set of probability distributions

F (ν) ≐ \{F (θ, ν (d x)) = exp (θ x - k_{ν} (θ)) ν (d x) : θ \in Θ_{ν}\},

(6)

where $k_{ν} (θ) ≐ log L_{ν} (θ)$ is the cumulant transform of $ν$ and $k_{ν}$ is strictly convex and real analytic on $Θ_{ν}$. Moreover, $k_{ν}^{'} (θ)$ and $k_{ν}^{''} (θ)$, $θ \in Θ_{ν}$, are the respective mean and variance corresponding to $F (θ, ν)$, and the open interval $M_{ν} ≐ k_{ν}^{'} (Θ_{ν})$ is called the mean domain of $F (ν)$.

An important observation is that the measure $ν$ is not unique for $F (ν)$. Let $M$ be the set of Radon measures $ν$ on $R$ for which $L_{ν} (θ) < \infty$ on the domain $Θ_{ν}$. Consider two measures $ν, ν^{*} \in M$, and suppose that $ν^{*}$ is an exponential shift of $ν$; i.e., $ν^{*} (d x) = e^{a + b x} ν (d x)$ for some real $a, b$. Then, a simple calculation shows that $F (ν) = F (ν^{*})$. The converse also holds: if $F (ν) = F (ν^{*})$ for two measures $ν, ν^{*} \in M$, then one is an exponential shift of the other and, obviously, they are equivalent in the sense that each is absolutely continuous with respect to the other. A VF $(V, M)$ of an NEF $F$ determines the NEF uniquely within the class of NEFs in the following sense: if $F_{1}$ and $F_{2}$ are NEFs with respective VFs $(V_{1}, M_{1})$ and $(V_{2}, M_{2})$ such that $V_{1} = V_{2}$ on $J = M_{1} \cap M_{2} \neq \emptyset$, then $F_{1} = F_{2}$, cf. Mora [17,18]. This implies the following interesting observation: if $(V, M)$ is a VF of an NEF $F$, then

The mean parameter space M is the largest open interval on which V is positive real analytic .

(7)

Consequently, we may denote the NEF by $F = F (ν)$ and the mean domain by $M = M_{ν}$ to stress the fact that these do not depend on $ν$.

Since the function $k_{ν}^{'} : Θ_{ν} \to M$ is one-to-one, its inverse function ${(k_{ν}^{'})}^{- 1} : M \to Θ_{ν}$ is well defined. Since $k_{ν}^{'}$ is continuous and increasing, M, the image of $Θ_{ν}$ under $k_{ν}^{'}$, is an interval, say $(a, b)$, on the real line. Now, when we compute the variance $V_{ν} (θ) ≐ k_{ν}^{''} (θ)$ of the distribution $F (θ, ν)$ as a function of the mean $m \in M$, i.e.,

V_{ν} (m) = k_{ν}^{''} ({(k_{ν}^{'})}^{- 1} (m)),

(8)

then it also does not depend on $ν$, and we can suppress the dependence on $ν$ and write $V (m)$ instead of $V_{ν} (m)$. Furthermore, when no confusion can arise, we shall suppress the dependence on $ν$ of all other notations introduced above and write $C, D, Θ, \dots$ for $C_{ν}, D_{ν}, Θ_{ν}, \dots$
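To make relation (8) concrete, the following sketch approximates the mean $k^{'} (θ)$ and the variance $k^{''} (θ)$ by central differences and compares the variance with a known VF evaluated at the mean. The three cumulant transforms used (normal with unit variance, Poisson, and inverse Gaussian with unit shape) are standard textbook forms assumed here for illustration; they are not taken from the text, and the helper names are hypothetical.

```python
import math

def mean_and_variance(k, theta, h=1e-4):
    """Central-difference approximations of k'(theta) and k''(theta)."""
    mean = (k(theta + h) - k(theta - h)) / (2.0 * h)
    variance = (k(theta + h) - 2.0 * k(theta) + k(theta - h)) / (h * h)
    return mean, variance

# (label, cumulant transform k, variance function V, test point theta)
examples = [
    ("normal, V(m) = 1", lambda th: 0.5 * th * th, lambda m: 1.0, 0.7),
    ("Poisson, V(m) = m", math.exp, lambda m: m, 0.3),
    ("inverse Gaussian, V(m) = m^3",
     lambda th: -math.sqrt(-2.0 * th), lambda m: m ** 3, -0.5),
]

relative_gaps = {}
for name, k, vf, theta in examples:
    m, v = mean_and_variance(k, theta)
    relative_gaps[name] = abs(v - vf(m)) / vf(m)
    print(name, m, v, vf(m))
```

All three examples have VFs of the power type $δ m^{γ}$, with $γ = 0$, $1$ and $3$, respectively.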

Finally, $F$ is called steep iff $M = int C$ (or equivalently, if $k_{ν}$ is an essentially smooth convex function on $Θ$; for a definition, see Barndorff-Nielsen [19]).

Now, $F$ is said to have a power VF iff its VF has the form

V (m) = δ m^{γ}, δ > 0, γ \in R .

(9)

We are now ready to present the reproducibility characterization of NEFs in the following proposition. This proposition states that $F$ is reproducible (in the sense of Definition 1) iff it possesses a power VF. For brevity, the details of the classification of all NEFs having power VFs are given in Section 3.2 and not in the proof of the proposition.

In addition, in a manner similar to Part (iii) of Lemma 1, it follows that

n α_{n} M = (n α_{n} a, n α_{n} b) = (a, b)

and thus M is either $(0, \infty)$ or $(- \infty, \infty)$ (for more details, see Bar-Lev and Enis [5]).

Proposition 1.

Let $F$ be an NEF generated by a basis ν and associated with a VF $(V, M)$. Then, $F$ is reproducible in the sense of Definition 1 iff V is a power VF of the form (9). The values of γ for which (9) is a VF of an NEF along with the corresponding $α_{n}$, $g_{n} (θ)$, $k_{γ} (θ)$, $Θ$ and M are presented in Table 1 below.

Table 1. NEFs having Power VFs.

Proof.

If $F$ is an NEF with a power VF of the form (9), then trivially $F$ is reproducible (see also Table 1 below). The reverse implication is established as follows. Assume now that $F$ is a reproducible NEF. We shall henceforth exclude the case where $α_{n} \equiv 1$ (i.e., convolution), as this case was treated by Bar-Lev and Enis [4], without assuming (2), and led to the characterization of the Poisson NEF. We shall also exclude the case $α_{n} = 1 / n$, as by Lemma 2.1 in Bar-Lev and Enis [5], no one-parameter reproducible NEF exists for this case. Accordingly, we shall assume throughout the sequel that

α_{n} \neq 1 and α_{n} \neq 1 / n .

(10)

Let $ν_{0} \in M$ be a specific chosen basis of $F$. Then, by (1), the probabilities of $α_{n} \sum_{i = 1}^{n} X_{i}$ are

P (ν_{0} (d x); θ) = exp (g_{n} (θ) x - k (g_{n} (θ))) ν_{0} (d x), θ \in Θ .

(11)

Let $ν_{0}^{* n}$ be the n-fold convolution of the chosen basis $ν_{0}$ of $F$. Then, the probabilities of $\sum_{i = 1}^{n} X_{i}$ are given by

exp (θ x - n k (θ)) ν_{0}^{* n} (d x), θ \in Θ,

(12)

and thus from (12), it follows that the probabilities of $α_{n} \sum_{i = 1}^{n} X_{i}$ are given by

exp (θ \frac{x}{α_{n}} - n k (θ)) \frac{1}{α_{n}} ν_{0}^{* n} (d x), θ \in Θ .

(13)

Consequently, by comparing (11) and (13) we obtain that

\begin{cases} ν_{0} = \frac{1}{α_{n}} ν_{0}^{* n}, \text{ almost everywhere } ν_{0}, \\ g_{n} (θ) = \frac{θ}{α_{n}}, θ \in Θ, n \in N, \text{ and} \\ k (g_{n} (θ)) = n k (θ), θ \in Θ, n \in N . \end{cases}

(14)

In particular, the last two relations in (14) read

g_{n} (θ) = \frac{θ}{α_{n}}

(15)

and

n k (θ) = k (g_{n} (θ)) .

(16)

Substituting (15) into (16), we have

n k (θ) = k (\frac{θ}{α_{n}}) .

(17)

Denote $α_{n} ≐ α (n), n \in N$, and let $y = y (n) ≐ 1 / α (n)$. Since by assumption $α (n)$ is a one-to-one mapping from $N$ into $R$ (see Definition 1), its inverse

n = α^{- 1} (\frac{1}{y}) ≐ h (y)

(18)

is well defined and thus by using (18) in (17), we obtain

k (θ y) = k (θ) h (y), θ \in Θ .

(19)

By Aczél ([20], Theorem 4, pp. 144–145), the general solution of the functional equation

f (x y) = g (x) h (y)

(20)

with positive x and y and f continuous at a point is

f (t) = a b t^{c}, g (t) = a t^{c}, h (t) = b t^{c}

(21)

(supplemented with a trivial solution, which is irrelevant in our situation). In order to apply the solution (21) of (20) to (19), we need to show that the premises of Theorem 4 in Aczél [20] are met. Indeed, k is real analytic on $Θ$ and thus is continuous there. In addition, by Lemma 1, $y = 1 / α (n)$ is positive. Now, if $Θ$ contains an open interval $(a_{1}, b_{1}) \subset R^{+}$, then we confine any further analysis of the problem to this interval. Otherwise, if $Θ \subset R^{-}$, we define $x = - θ$ and $r (x) = k (- x)$ and apply the solution (21) of (20) to r. Therefore, for simplicity and without any loss of generality, we may assume that $Θ$ contains an open interval $(a_{1}, b_{1}) \subset R^{+}$ and thus we continue with k instead of r. Hence, Theorem 4 of Aczél [20] can be applied with $k = f = g$, in which case we obtain $b = 1$,

k (t) = a t^{c}, t \in (a_{1}, b_{1}) \subset R^{+}

(22)

and

n = h (y) = y^{c} = {(α (n))}^{- c}, n \in N,

which implies that

α (n) = n^{- \frac{1}{c}} and g_{n} (θ) = θ n^{\frac{1}{c}}, where by (10) c \neq 1 .

(23)

We now exclude the case $c = 2$ as it will be treated separately in the sequel and thus we assume that $c \neq 2$. Let $m = k^{'} (t) = a c t^{c - 1}$ and $V (t) = k^{''} (t) = a c (c - 1) t^{c - 2}$ be the mean and variance of $F$ defined on $(a_{1}, b_{1})$ and let $M_{1} = k^{'} ((a_{1}, b_{1}))$ be the image of $(a_{1}, b_{1})$ under $k^{'}$. Since $V (t) > 0$ on $(a_{1}, b_{1})$, $k^{'}$ is strictly increasing on $(a_{1}, b_{1})$ and thus its inverse ${(k^{'})}^{- 1} : M_{1} \to (a_{1}, b_{1})$ is well defined and is given by ${(k^{'})}^{- 1} (m) = {(\frac{m}{a c})}^{1 / (c - 1)}$, $a c > 0, m \in M_{1}$. Hence, substituting $t = {(k^{'})}^{- 1} (m)$ into $V (t)$, V expressed in terms of $m \in M_{1}$ has the form

V (m) = a c (c - 1) {(a c)}^{- \frac{c - 2}{c - 1}} m^{\frac{c - 2}{c - 1}}, a c > 0, c - 1 > 0, m \in M_{1} .

Let

δ = a c (c - 1) {(a c)}^{- \frac{c - 2}{c - 1}} and γ = \frac{c - 2}{c - 1}, c \neq 1, 2

then, V has the form

V (m) = δ m^{γ}, δ > 0, γ < 0 or γ > 0, γ \neq 1, m \in M_{1} \subset R^{+} .

(24)

Since, by (7), the mean parameter space M of V is the largest open interval on which V is positive real analytic, it follows that the mean parameter space corresponding to V in (24) is $M = R^{+}$. Accordingly, if $F$ is reproducible in the sense of Definition 1, then its VF is necessarily of the form

(V, M) = (δ m^{γ}, R^{+}), δ > 0, γ < 0 or γ > 0, γ \neq 0, 1 .

(25)

In order to conclude the proof we need to consider the two remaining cases: $γ = 0$ $(c = 2)$ and $γ = 1$ $(α_{n} \equiv 1)$. Note that for $γ = 0$, the VF is identically a positive constant $δ > 0$ and thus the corresponding mean parameter space is $M = R$. Indeed, this case with $γ = 0$ was treated in Theorem 2.1 of Bar-Lev and Enis [5]. It was shown there that $F$ is reproducible with stabilizing constants $α_{n} = n^{- 1 / 2}$ iff $F$ is the family of normal distributions with constant variance $δ$ (i.e., $F$ is an NEF having a power VF with power parameter $γ = 0$).

The case $γ = 1$ was analyzed in Bar-Lev and Enis [4]. It was shown that $F$ is reproducible with stabilizing constants $α_{n} \equiv 1$ iff $F$ is the family of Poisson distributions (i.e., $F$ is an NEF with power VF and power parameter $γ = 1$).

Thus, we have proven that an NEF $F$ is reproducible in the sense of Definition 1 iff it possesses a power VF. The appropriate forms of the stabilizing constants $α_{n}$ and of $g_{n} (θ)$ are presented in Table 1 below and this concludes the proof. Furthermore, for less commonly used distributions, the corresponding probability densities are specified in Section 3.2. □
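The key step of the proof, namely that $k (t) = a t^{c}$ together with $α (n) = n^{- 1 / c}$ satisfies $n k (θ) = k (θ / α_{n})$ as in (17), (22) and (23), can be sanity-checked numerically. The sketch below is an illustration only; the values of a and c are arbitrary choices made here and the function names are hypothetical.

```python
def k(t, a, c):
    """Cumulant of the power form k(t) = a * t**c, for t > 0, as in (22)."""
    return a * t ** c

def alpha(n, c):
    """Stabilizing constant alpha(n) = n**(-1/c), as in (23)."""
    return n ** (-1.0 / c)

def g(n, t, c):
    """Parameter map g_n(t) = t / alpha(n), as in (15)."""
    return t / alpha(n, c)

def max_relative_gap(a, c, t_values=(0.2, 1.0, 3.5), n_values=range(1, 9)):
    """Largest relative error in n * k(t) = k(g_n(t)) over the grid."""
    gap = 0.0
    for n in n_values:
        for t in t_values:
            lhs = n * k(t, a, c)
            rhs = k(g(n, t, c), a, c)
            gap = max(gap, abs(lhs - rhs) / abs(lhs))
    return gap

gaps = [max_relative_gap(a, c) for a, c in [(0.5, 2.0), (1.0, 1.5), (-1.4, 0.5)]]
print(gaps)  # each entry is expected to be at rounding-error level
```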

3.2. A Classification of NEFs with Power VFs and Their Associated $α_{n}$ , $g_{n} (θ), k_{γ} (θ), Θ$ and M

The permissible values of $γ$ for which (9) is a VF of an NEF, along with the corresponding $α_{n}$, $g_{n} (θ)$, $k_{γ} (θ)$, $Θ$ and M, are taken from Jørgensen [9], Bar-Lev and Enis [5] and Bar-Lev and Casalis [7]. The only values of $γ$ for which no NEF exists are those in $(0, 1)$. For all other values of $γ$, we have the following:

For $γ \in (- \infty, 0)$ , the NEF $F$ is generated by an extreme stable distribution with stable index $1 < τ < 2$ , where $τ = (γ - 2) / (γ - 1)$ , in which case $C = R$ and $M = R^{+}$ , i.e., $F$ is not steep. Here, $α_{n} = n^{\frac{1 - γ}{γ - 2}}$ and $g_{n} (θ) = θ n^{\frac{γ - 1}{γ - 2}} .$ (As already mentioned above, Bar-Lev and Enis [5] showed that non-steep NEFs exist if $γ < 0$ , whereas Tweedie [8] claimed that such NEFs do not exist by utilizing an incorrect claim). Here, the associated absolutely continuous probability density is quite cumbersome as it depends on several parameters. Consequently, we do not present it here and the interested reader is referred to Chapters 6 and 7 of Lukacs [21].
For $γ = 0$ , the corresponding NEF $F$ is the normal one with variance equaling the constant $δ$ . $F$ is steep with $C = M = R,$ $α_{n} = n^{- 1 / 2}$ and $g_{n} (θ) = n^{1 / 2} θ .$
For $γ \in (0, 1)$ , no NEF exists with VF in the form (9).
For $γ = 1$ , the corresponding NEF $F$ is Poisson. $F$ is steep with $C = M = R^{+},$ $α_{n} \equiv 1$ and $g_{n} (θ) = θ + log n .$
For $γ \in (1, 2)$ , the corresponding NEF $F$ is a compound Poisson NEF generated by gamma distributions. $F$ is steep with $C = M = R^{+},$ $α_{n} = n^{\frac{1 - γ}{γ - 2}}$ and $g_{n} (θ) = θ n^{\frac{γ - 1}{γ - 2}} .$ Here, the corresponding cumulative distribution function is given by

$P (X \leq x) = e^{- p} E (x) + e^{- p} \int_{0}^{x} [e^{θ t} \sum_{n = 1}^{\infty} \frac{b^{n} t^{- n ρ - 1}}{n! Γ (- n ρ)}] d t,$

where $E (x)$ is a cumulative distribution function degenerated at 0 and

$ρ = \frac{2 - γ}{1 - γ}, p = \frac{ρ - 1}{δ ρ} {(\frac{δ θ}{ρ - 1})}^{ρ}, b = - ρ^{- 1} {[\frac{1 - ρ}{δ}]}^{1 - ρ}, ρ < 0, θ < 0 and δ > 0 .$
For $γ = 2,$ the corresponding NEF is the gamma NEF with shape parameter $δ^{- 1}$ . However, as was shown in Bar-Lev and Enis [5], it is not reproducible when considered as a one-parameter NEF. It is reproducible when considered as a two-parameter NEF (see part 5 of the previous section).
For $γ \in (2, \infty)$ , the corresponding NEF is generated by a positive stable distribution with stable index $0 < α < 1$ , where $α = (γ - 2) / (γ - 1)$ . $F$ is steep with $C = M = R^{+}, α_{n} = n^{\frac{1 - γ}{γ - 2}}$ and $g_{n} (θ) = θ n^{\frac{γ - 1}{γ - 2}} .$ Here, the associated absolutely continuous stable probability density is $ν_{ρ} (d x) = h_{ρ} (x) d x$ (c.f., Bar-Lev and Enis [5]) where

$h_{ρ} (x) = - \frac{1}{π} \sum_{k = 0}^{\infty} \frac{{(- 1)}^{k}}{k!} sin (π ρ x) \frac{{(1 - ρ)}^{k (1 - ρ)} Γ (ρ k + 1)}{ρ^{k} δ^{k (1 - ρ)} x^{ρ k + 1}}, x > 0, θ < 0, α > 0, 0 < ρ < 1 .$
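The exponents of n listed above for $γ \notin \{0, 1, 2\}$ can be checked against the reproducibility identity $n k (θ) = k (g_{n} (θ))$. The sketch below is hedged in one important respect: the closed-form cumulant it uses is not given in the text. It is obtained here by integrating $d θ / d m = 1 / V (m)$ with both integration constants set to zero, and should be read as an assumption of the sketch, as should the function names.

```python
def alpha(n, gamma):
    """Stabilizing constant alpha_n = n**((1 - gamma) / (gamma - 2))."""
    return n ** ((1.0 - gamma) / (gamma - 2.0))

def g(n, theta, gamma):
    """Parameter map g_n(theta) = theta * n**((gamma - 1) / (gamma - 2))."""
    return theta * n ** ((gamma - 1.0) / (gamma - 2.0))

def cumulant(theta, gamma, delta=1.0):
    """Assumed cumulant for V(m) = delta * m**gamma, gamma not in {1, 2}."""
    base = (1.0 - gamma) * delta * theta  # must be positive
    return base ** ((2.0 - gamma) / (1.0 - gamma)) / (delta * (2.0 - gamma))

# (gamma, theta) pairs; the sign of theta is chosen so that base > 0
cases = [(-1.0, 0.6), (1.5, -0.8), (3.0, -0.5)]

report = []
for gamma, theta in cases:
    for n in (2, 5, 10):
        lhs = n * cumulant(theta, gamma)
        rhs = cumulant(g(n, theta, gamma), gamma)
        report.append((gamma, n, alpha(n, gamma), abs(lhs - rhs) / abs(lhs)))

for row in report:
    print(row)
```

For $γ = 3$ (the inverse Gaussian case) the formula gives $α_{n} = n^{- 2}$, so that, for instance, `alpha(4, 3.0)` evaluates to 0.0625.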

The above NEFs are displayed in the following table along with their corresponding $α_{n}$, $g_{n} (θ)$, $k_{γ} (θ)$, $Θ$ and $M$.

4. Conclusions and Topics for Further Research

Convolutions of n i.i.d. random variables play an important role in statistical inference, probability and stochastic processes. However, they are typically complex and cumbersome to calculate. Reproducible families of distributions in the sense of Definition 1 are, therefore, very useful in allowing a simple computation of such convolutions by employing (4): the convolution of i.i.d. r.v.s has the same distribution type as any of its components, up to a dilation. Various applications of reproducibility and convolution are presented in part 4 of Section 2.

In this study, we have classified all one-parameter reproducible NEFs in the sense of Definition 1 and showed that their corresponding VFs are a power of their mean, as in (9). However, there are still various research avenues concerning the notion of reproducibility that need to be explored. Below are some examples:

1.: Reproducibility for the multi-parameter case

When $F$ is a multi-parameter family of distributions depending, say, on k parameters $(θ_{1}, \dots, θ_{k})$, under which conditions is it reproducible in the sense of Definition 1? As we have already seen in the one-parameter case, such conditions cannot be found for general families. We will, therefore, need to limit the analysis to families of distributions with certain characteristics, such as multi-parameter NEFs. Indeed, in Part 5 of Section 2, we already indicated that for a sub-class of two-parameter NEFs, which includes the gamma, normal and inverse Gaussian NEFs with parameters $(θ_{1}, θ_{2})$, the distribution of ${\bar{X}}_{n}$ is also gamma, normal and inverse Gaussian, respectively, but with parameters $(n θ_{1}, n θ_{2})$. For these three cases, the sequence of stabilizing constants is $α_{n} = n^{- 1}$ with

(g_{n}^{1} (θ_{1}), g_{n}^{2} (θ_{2})) = (n θ_{1}, n θ_{2}) .

2.: Reproducibility for non-NEF families

We have seen in Examples 1 and 2 that one of the terminal points of the support of the reproducible families of distributions presented depends on the parameter $θ$. Accordingly, one might consider the reproducibility property for large classes of distributions for which at least one of the terminal points of their respective supports depends on a parameter (such as the uniform distribution on the interval $(θ_{1}, θ_{2})$ or the shifted exponential distribution with scale and location parameters $θ_{1}$ and $θ_{2}$, respectively). One class of this type was introduced by Hogg and Craig [22]. Another class of truncated exponential families was introduced by Bar-Lev [23] and further elaborated by numerous authors in various fields of statistical inference, cf. Vancak, Goldberg, Bar-Lev, and Boukai [24] and the references cited therein.

3.: Reproducibility and infinite divisibility

We have seen that all one-parameter reproducible NEFs are also infinitely divisible. On the other hand, Examples 1–3 demonstrate that there are reproducible families that are not infinitely divisible. Accordingly, one can pose the following challenging problem: Under which necessary and/or sufficient conditions are reproducible families also infinitely divisible and vice versa?

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

I am thankful to two reviewers for helpful comments.

Conflicts of Interest

The author declares no conflict of interest.

References

1. Wilks, S. Mathematical Statistics; Wiley: New York, NY, USA, 1963.
2. Bolger, E.M.; Harkness, W.L. Some characterizations of exponential-type distributions. Pacific J. Math. 1966, 16, 5–11.
3. Baringhaus, L.; Davies, P.L.; Plachky, D. A characterization of the Poisson distribution by convolution properties of one-parameter exponential families. Z. Angew. Math. Mech. 1976, 56, T333–T334.
4. Bar-Lev, S.K.; Enis, P. Reproducibility in the one-parameter exponential family. Metrika 1985, 32, 391–394.
5. Bar-Lev, S.K.; Enis, P. Reproducibility and natural exponential families with power variance functions. Ann. Stat. 1986, 14, 1507–1522.
6. Bar-Lev, S.K.; Casalis, M. Les familles exponentielles naturelles reproduisantes. C. R. Acad. Sci. Paris Ser. I 1994, 319, 1323–1326.
7. Bar-Lev, S.K.; Casalis, M. A classification of reducible natural exponential families in the broad sense. J. Theor. Probab. 2003, 16, 175–196.
8. Tweedie, M.C.K. An index which distinguishes between some important exponential families. In Statistics: Applications and New Directions, Proceedings of the Indian Institute Golden Jubilee International Conference, Calcutta, India, 27 September–1 October 1984; Ghosh, J.K., Roy, J., Eds.; Indian Statistical Institute: Calcutta, India, 1984; pp. 579–604.
9. Jørgensen, B. Exponential dispersion models (with discussion). J. R. Stat. Soc. Ser. B 1987, 49, 127–162.
10. Bar-Lev, S.K. Independent, though identical results: The class of Tweedie on power variance functions and the class of Bar-Lev and Enis on reproducible natural exponential families. Int. J. Stat. Probab. 2020, 9, 30–35.
11. Feller, W. An Introduction to Probability Theory and Its Applications 2; Wiley: New York, NY, USA, 1966.
12. Shushi, T.; Yao, J. Multivariate risk measures based on conditional expectation and systemic risk for Exponential Dispersion Models. Insur. Math. Econ. 2020, 93, 178–186.
13. Bahnemann, D. Distributions for Actuaries; Casualty Actuarial Society: Arlington, VA, USA, 2015.
14. Bar-Lev, S.K.; Carlo, A.R.M. Methods for insurance risk computations. Int. J. Stat. Probab. 2019, 8, 54–74.
15. Bar-Lev, S.K.; Reiser, B. An exponential subfamily which admits UMPU tests based on a single test statistic. Ann. Stat. 1982, 10, 979–989.
16. Letac, G.; Mora, M. Natural real exponential families with cubic variance functions. Ann. Stat. 1990, 18, 1–37.
17. Mora, M. Classification des fonctions-variance cubiques des familles exponentielles sur ℝ. C. R. Acad. Sci. Paris Ser. 1986, 116, 582–591.
18. Mora, M. La convergence des fonctions variance des familles exponentielles naturelles. Ann. Fac. Sci. Toulouse 1990, 11, 105–120.
19. Barndorff-Nielsen, O. Information and Exponential Families in Statistical Theory; Wiley: New York, NY, USA, 1978.
20. Aczél, J. Functional Equations and Their Applications; Academic Press: New York, NY, USA, 1966.
21. Lukacs, E. Characteristic Functions, 2nd ed.; Hafner: New York, NY, USA, 1970.
22. Hogg, R.V.; Craig, A.T. Sufficient statistics in elementary distribution theory. Sankhya 1956, 17, 209–216.
23. Bar-Lev, S.K. Large sample properties of the MLE and MCLE for the natural parameter of a truncated exponential family. Ann. Inst. Stat. Math. Part A 1984, 36, 217–222.
24. Vancak, V.; Goldberg, Y.; Bar-Lev, S.K.; Boukai, B. Continuous statistical models: With or without truncation parameter? Math. Methods Stat. 2015, 24, 55–73.

Table 1. NEFs having Power VFs.

$γ$	NEF Type	$Θ$	M	C	$k_{γ} (θ)$	$α_{n}$	$g_{n} (θ)$
$(- \infty, 0)$	Extreme stable	$R^{+}$	$R^{+}$	$R$	$\frac{1}{a (2 - γ)} {(a (1 - γ) θ)}^{\frac{γ - 2}{γ - 1}}$	$n^{\frac{1 - γ}{γ - 2}}$	$θ n^{\frac{γ - 1}{γ - 2}}$
0	Normal	$R$	$R$	$R$	$\frac{1}{2} a θ^{2}$	$\pm n^{- 1 / 2}$	$\pm n^{1 / 2} θ$
1	Poisson	$R$	$R^{+}$	$R_{0}^{+}$	$\frac{1}{a} e^{a θ}$	1	$θ + \frac{1}{a} \ln n$
$(1, 2)$	Compound Poisson	$R^{-}$	$R^{+}$	$R_{0}^{+}$	$\frac{1}{a (2 - γ)} {(a (1 - γ) θ)}^{\frac{γ - 2}{γ - 1}}$	$n^{\frac{1 - γ}{γ - 2}}$	$θ n^{\frac{γ - 1}{γ - 2}}$
2	Gamma	$R^{-}$	$R^{+}$	$R^{+}$	$\frac{1}{a} \ln (- \frac{1}{a θ})$	−	−
$(2, \infty)$	Positive stable	$R^{-}$	$R^{+}$	$R^{+}$	$\frac{1}{a (2 - γ)} {(a (1 - γ) θ)}^{\frac{γ - 2}{γ - 1}}$	$n^{\frac{1 - γ}{γ - 2}}$	$θ n^{\frac{γ - 1}{γ - 2}}$
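The power-function rows of Table 1 can be checked numerically. The following is a minimal sketch, not part of the original paper: it uses the standard fact that, for an NEF with cumulant function k, the log-Laplace transform of α_n (X_1 + ... + X_n) under θ is n [k(θ + α_n s) − k(θ)], so membership in the same NEF with parameter g_n(θ) requires this to equal k(g_n(θ) + s) − k(g_n(θ)). The function names (`k_power`, `alpha_n`, `g_n`, `cumulant_gap`) and the chosen test values are hypothetical, and only the rows with γ outside {0, 1, 2} are covered.

```python
def k_power(theta, gamma, a=1.0):
    """Cumulant function k_gamma(theta) from Table 1, for gamma outside {1, 2}."""
    base = a * (1.0 - gamma) * theta
    if base <= 0:
        raise ValueError("theta lies outside the natural parameter space")
    return base ** ((gamma - 2.0) / (gamma - 1.0)) / (a * (2.0 - gamma))


def alpha_n(n, gamma):
    """Stabilizing constant alpha_n = n^((1 - gamma) / (gamma - 2))."""
    return n ** ((1.0 - gamma) / (gamma - 2.0))


def g_n(theta, n, gamma):
    """Parameter mapping g_n(theta) = theta * n^((gamma - 1) / (gamma - 2))."""
    return theta * n ** ((gamma - 1.0) / (gamma - 2.0))


def cumulant_gap(theta, s, n, gamma, a=1.0):
    """Absolute difference between the two sides of the cumulant identity."""
    lhs = n * (k_power(theta + alpha_n(n, gamma) * s, gamma, a)
               - k_power(theta, gamma, a))
    g = g_n(theta, n, gamma)
    rhs = k_power(g + s, gamma, a) - k_power(g, gamma, a)
    return abs(lhs - rhs)


if __name__ == "__main__":
    # One value of gamma from each power-function row; theta and s are chosen
    # so that every argument stays inside the parameter space of that row.
    for gamma, theta, s in [(-1.0, 0.7, 0.1), (1.5, -0.7, -0.1), (3.0, -0.7, -0.1)]:
        print(gamma, cumulant_gap(theta, s, n=5, gamma=gamma))
```

For each of the three test values the printed gap should be zero up to floating-point rounding, which is what the closed forms for α_n and g_n(θ) predict.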

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
