Goodness-of-Fit Test for the Bivariate Hermite Distribution

Pablo González-Albornoz; Francisco Novoa-Muñoz

doi:10.3390/axioms12010007

¹ Departamento de Matemática, Universidad Adventista de Chile, Chillán 3780000, CP, Chile

² Departamento de Estadística, Universidad del Bío-Bío, Concepción 4051381, CP, Chile

* Author to whom correspondence should be addressed.

Axioms 2023, 12(1), 7; https://doi.org/10.3390/axioms12010007

This article belongs to the Special Issue Statistical Methods and Applications


Abstract

This paper studies a goodness-of-fit test for the bivariate Hermite distribution. Specifically, we propose and study a Cramér–von Mises-type test based on the empirical probability generating function. The bootstrap can be used to consistently estimate the null distribution of the test statistic. A simulation study investigates the goodness of the bootstrap approach for finite sample sizes.

Keywords:

bivariate Hermite distribution; goodness-of-fit; empirical probability generating function; bootstrap distribution estimator

1. Introduction

Testing the goodness-of-fit (gof) of given observations with a probabilistic model is a crucial aspect of data analysis.

Since Pearson proposed and analyzed the chi-square test in 1900, new gof tests have been constructed and applied to continuous and discrete data. To mention only some of the most recent publications, there are, for example, the works of Ebner and Henze [1], Górecki, Horváth and Kokoszka [2], Puig and Weiß [3], Arnastauskaitè et al. [4], Dörr, Ebner, and Henze [5], Kolkiewicz, Rice, and Xie [6], Milonas et al. [7], Di Noia et al. [8], and Erlemann and Lindqvist [9].

Because count data can appear in different circumstances, the present investigation is oriented to gof in the discrete case, specifically, in the bivariate Hermite distribution (BHD).

In the univariate configuration, the Hermite distribution is the distribution of a linear combination of the form $Y = X_{1} + 2 X_{2}$, where $X_{1}$ and $X_{2}$ are independent Poisson random variables. The distinguishing property of the univariate Hermite distribution (UHD) is its flexibility in modeling count data that are multimodal and contain many zeros, which is called zero-inflation. It also allows for modeling data in which the overdispersion is moderate, that is, the variance is greater than the expected value. McKendrick [10] modeled a phagocytic experiment (bacteria count in leukocytes) through the UHD, obtaining a more satisfactory model than with the Poisson distribution. In practice, however, bivariate count data emerge in several different disciplines, and the BHD plays an important role in modeling superinflated data, for example, the number of accidents in two different periods [11].
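The construction $Y = X_{1} + 2 X_{2}$ can be illustrated with a short simulation. The following is a minimal Python sketch, not code from this study (which used R); the function name `sample_univariate_hermite` and the parameter values are hypothetical choices for illustration. For independent Poisson components with rates $λ_{1}$ and $λ_{2}$, the mean of $Y$ is $λ_{1} + 2 λ_{2}$ and its variance is $λ_{1} + 4 λ_{2}$, so the variance exceeds the mean whenever $λ_{2} > 0$.

```python
import numpy as np

def sample_univariate_hermite(lam1, lam2, size, rng):
    """Draw Y = X1 + 2*X2 with independent X1 ~ Poisson(lam1), X2 ~ Poisson(lam2)."""
    x1 = rng.poisson(lam1, size=size)
    x2 = rng.poisson(lam2, size=size)
    return x1 + 2 * x2

rng = np.random.default_rng(12345)          # fixed seed for reproducibility
y = sample_univariate_hermite(lam1=0.5, lam2=1.0, size=200_000, rng=rng)

sample_mean = y.mean()   # theory: lam1 + 2*lam2 = 2.5
sample_var = y.var()     # theory: lam1 + 4*lam2 = 4.5
print(sample_mean, sample_var)
```

The sample variance being clearly larger than the sample mean is the overdispersion referred to above.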

The only gof test related to the Hermite distribution that we have found so far is the one developed by Meintanis and Bassiakos [12]. However, this test is for univariate data.

To the best of our knowledge, there is no literature on gof tests for the BHD.

The purpose of this paper is to propose and study a gof test for the bivariate Hermite Distribution that is consistent.

According to Novoa-Muñoz [13], the probability generating function (pgf) characterizes the distribution of a random vector and can be estimated consistently by the empirical probability generating function (epgf); the proposed test is a function of the epgf. The test statistic compares the epgf of the data with an estimator of the pgf of the BHD. As is well known, to establish the rejection region, we need to know the distribution of the test statistic.

The resulting test statistic is of the Cramér–von Mises type, and for finite sample sizes it was not possible to calculate its distribution under the null hypothesis explicitly. Therefore, we approximate the null distribution of the statistic by means of a parametric bootstrap.

Because the properties of the proposed test are asymptotic (see, for example, [14]) and with the purpose of evaluating the behavior of the test for samples of finite size, a simulation study was carried out.

The present work is organized as follows: In Section 2, we present some preliminary results that will serve us in the following sections, and the definition of the BHD with some of its properties is also given. In Section 3, the proposed statistic is presented. Section 4 is devoted to the bootstrap estimator and its approximation to the null distribution of the statistic. Section 5 presents the results of a simulation study, a power study, and the application to a set of real data.

Before ending this section, we introduce some notation: $F_{A} \underset{δ}{\land} F_{B}$ denotes a mixture (compounding) distribution, where $F_{A}$ represents the original distribution and $F_{B}$ the mixing distribution (i.e., the distribution of $δ$) [15]; all vectors are row vectors, and $x^{⊤}$ is the transpose of the row vector $x$; for any vector $x$, $x_{k}$ denotes its $k$th coordinate, and $∥ x ∥$ its Euclidean norm; $N_{0} = \{0, 1, 2, 3, \dots\}$; $I \{A\}$ denotes the indicator function of the set $A$; $P_{θ}$ denotes the probability law of the BHD with parameter $θ$; $E_{θ}$ denotes expectation with respect to the probability function $P_{θ}$; $P_{*}$ and $E_{*}$ denote the conditional probability law and expectation, respectively, given the data $(X_{1}, Y_{1}), \dots, (X_{n}, Y_{n})$; all limits in this work are taken as $n \to \infty$; $\overset{L}{⟶}$ denotes convergence in distribution; $\overset{a . s .}{⟶}$ denotes almost sure convergence; if $\{C_{n}\}$ is a sequence of random variables or random elements and $ϵ \in R$, then $C_{n} = O_{_{P}} (n^{- ϵ})$ means that $n^{ϵ} C_{n}$ is bounded in probability, $C_{n} = o_{_{P}} (n^{- ϵ})$ means that $n^{ϵ} C_{n} \overset{P}{⟶} 0$, and $C_{n} = o (n^{- ϵ})$ means that $n^{ϵ} C_{n} \overset{a . s .}{⟶} 0$; and $H = L^{2} ({[0, 1]}^{2}, ϱ)$ denotes the separable Hilbert space of the measurable functions $φ, ϱ : {[0, 1]}^{2} \to R$ such that ${∥ φ ∥}_{H}^{2} = \int_{0}^{1} \int_{0}^{1} φ^{2} (t) ϱ (t) d t < \infty$.

2. Preliminaries

Several definitions for the BHD have been given (see, for example, Kocherlakota and Kocherlakota in [16]). In this paper, we will work with the following one, which has received more attention in the statistical literature (see, for example, Papageorgiou et al. in [17]; Kemp et al. in [18]).

Let $X = (X_{1}, X_{2})$ have the bivariate Poisson distribution with the parameters $δ λ_{1}$, $δ λ_{2}$, and $δ λ_{3}$ (for more details of this distribution, see, for example, Johnson et al. [19]); then, $X \underset{δ}{\land} N (μ, σ^{2})$ has the BHD. Kocherlakota [20] obtained its pgf, which is given by

v (t; θ) = \exp (μ λ + \frac{1}{2} σ^{2} λ^{2}),

(1)

where $t = (t_{1}, t_{2})$, $θ = (μ, σ^{2}, λ_{1}, λ_{2}, λ_{3})$, $λ = λ_{1} (t_{1} - 1) + λ_{2} (t_{2} - 1) + λ_{3} (t_{1} t_{2} - 1)$ and $μ > σ^{2} (λ_{i} + λ_{3})$, $i = 1, 2$.
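The pgf in (1) is straightforward to evaluate numerically. The following is a minimal Python sketch, assuming the parameterization $θ = (μ, σ^{2}, λ_{1}, λ_{2}, λ_{3})$ given above; the function name `bhd_pgf` is a hypothetical choice and the code is not part of the original study.

```python
import math

def bhd_pgf(t1, t2, mu, sigma2, lam1, lam2, lam3):
    """Evaluate the BHD pgf of Equation (1) at t = (t1, t2)."""
    # lam is the function lambda(t) defined below Equation (1)
    lam = lam1 * (t1 - 1.0) + lam2 * (t2 - 1.0) + lam3 * (t1 * t2 - 1.0)
    return math.exp(mu * lam + 0.5 * sigma2 * lam ** 2)

# Example parameter values (illustrative only; they satisfy mu > sigma2 * (lam_i + lam3)).
theta = dict(mu=1.0, sigma2=0.8, lam1=0.5, lam2=0.25, lam3=0.0)

at_one = bhd_pgf(1.0, 1.0, **theta)    # any pgf equals 1 at t = (1, 1)
at_zero = bhd_pgf(0.0, 0.0, **theta)   # any pgf at t = (0, 0) gives P(X = (0, 0))
print(at_one, at_zero)
```

Evaluating at $t = (0, 0)$ gives the probability of the outcome $(0, 0)$, which is one quick way to see how much mass a given $θ$ places at zero.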

From the pgf of the BHD, Kocherlakota and Kocherlakota [16] obtained the probability mass function of the BHD, which is given by

f (r, s) = \frac{λ_{1}^{r} λ_{2}^{s}}{r! s!} M (γ) \sum_{k = 0}^{\min (r, s)} \binom{r}{k} \binom{s}{k} k! ξ^{k} P_{r + s - k} (γ),

where $M (x)$ is the moment-generating function of the normal distribution, $P_{r} (x)$ is a polynomial of degree $r$ in $x$, $γ = - (λ_{1} + λ_{2} + λ_{3})$ and $ξ = \frac{λ_{3}}{λ_{1} λ_{2}}$.

Remark 1.

If

λ_{3} = 0

, then the probability function is reduced to

f (r, s) = \frac{λ_{1}^{r} λ_{2}^{s}}{r! s!} M (- λ_{1} - λ_{2}) P_{r + s} (- λ_{1} - λ_{2}) .

Remark 2.

If $X$ is a random vector that is bivariate Hermite distributed with parameter $θ$, this will be denoted by $X \sim B H (θ)$, where $θ \in Θ$, and the parameter space is

Θ = \{(μ, σ^{2}, λ_{1}, λ_{2}, λ_{3}) \in R^{5} : μ > σ^{2} (λ_{i} + λ_{3}), λ_{i} > λ_{3} \geq 0, i = 1, 2\} .

Let $X_{1} = (X_{11}, X_{12}), X_{2} = (X_{21}, X_{22}), \dots, X_{n} = (X_{n 1}, X_{n 2})$ be independent and identically distributed (iid) random vectors defined on a probability space $(Ω, A, P)$ and taking values in $N_{0}^{2}$. In what follows, let

v_{n} (t) = \frac{1}{n} \sum_{i = 1}^{n} t_{1}^{X_{i 1}} t_{2}^{X_{i 2}}

denote the epgf of $X_{1}, X_{2}, \dots, X_{n}$ at $t = (t_{1}, t_{2}) \in W$, for some appropriate $W \subseteq R^{2}$.
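As an illustration of this definition, the epgf is simply a sample average of the monomials $t_{1}^{X_{i 1}} t_{2}^{X_{i 2}}$. The following is a minimal Python sketch; the function name `epgf` and the toy data are hypothetical and not taken from the study.

```python
import numpy as np

def epgf(data, t1, t2):
    """Empirical pgf: mean of t1**X_i1 * t2**X_i2 over the sample.

    `data` is an (n, 2) array of non-negative integer counts.
    """
    data = np.asarray(data)
    # np.power(0.0, 0) evaluates to 1.0, matching the convention 0**0 = 1
    terms = np.power(float(t1), data[:, 0]) * np.power(float(t2), data[:, 1])
    return terms.mean()

toy_data = [(0, 0), (1, 0), (2, 1), (0, 0)]   # four bivariate counts

value_at_one = epgf(toy_data, 1.0, 1.0)    # always 1
value_at_zero = epgf(toy_data, 0.0, 0.0)   # fraction of (0, 0) observations
value_mid = epgf(toy_data, 0.5, 0.5)
print(value_at_one, value_at_zero, value_mid)
```

At $t = (0, 0)$ the epgf reduces to the observed proportion of $(0, 0)$ pairs, which mirrors the fact that a pgf evaluated at the origin gives the probability of that outcome.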

The following section is dedicated to developing the statistic proposed in this study and, for this, it is essential to know the result that is presented below, the proof of which can be reviewed in [14]:

Proposition 1.

Let

X_{1}, \dots, X_{n}

be iid from a random vector

X = (X_{1}, X_{2}) \in N_{0}^{2}

. Let

v (t) = E (t_{1}^{X_{1}} t_{2}^{X_{2}})

be the pgf of

X

, defined on

W \subseteq R^{2}

. Let

0 \leq b_{j} \leq c_{j} < \infty, j = 1, 2

, such that

Q = [b_{1}, c_{1}] \times [b_{2}, c_{2}] \subseteq W

; then,

sup_{t \in Q} | v_{n} (t) - v (t) | \overset{a . s .}{⟶} 0 .

3. The Test Statistic and Its Asymptotic Null Distribution

Let

X_{1} = (X_{11}, X_{12}), X_{2} = (X_{21}, X_{22}), \dots, X_{n} = (X_{n 1}, X_{n 2})

be iid from a random vector

X = (X_{1}, X_{2}) \in N_{0}^{2}

. Based on the sample

X_{1}, X_{2}, \dots, X_{n}

, the objective is to test the hypothesis

H_{0} : (X_{1}, X_{2}) \sim B H (θ), for some θ \in Θ,

against the alternative

H_{1} : (X_{1}, X_{2}) ≁ B H (θ), \forall θ \in Θ .

For this purpose, we resort to some of the properties of the pgf that allow us to propose the following test statistic.

According to Proposition 1, a consistent estimator of the pgf is the epgf. If

H_{0}

is true and

{\hat{θ}}_{n}

is a consistent estimator of

θ

, then

v (t; {\hat{θ}}_{n})

consistently estimates the population pgf. Since the distribution of

X = (X_{1}, X_{2})

is uniquely determined by its pgf,

v (t)

,

t = (t_{1}, t_{2}) \in {[0, 1]}^{2}

, a reasonable test for testing

H_{0}

should reject the null hypothesis for large values of

V_{n, w} ({\hat{θ}}_{n})

defined by

V_{n, w} ({\hat{θ}}_{n}) = \int_{0}^{1} \int_{0}^{1} V_{n}^{2} (t; {\hat{θ}}_{n}) w (t) d t,

(2)

where

V_{n} (t; θ) = \sqrt{n} \{v_{n} (t) - v (t; θ)\},

{\hat{θ}}_{n} = {\hat{θ}}_{n} (X_{1}, X_{2}, \dots, X_{n})

is a consistent estimator of

θ

and

w (t)

is a measurable weight function, such that

w (t) \geq 0, \forall t \in {[0, 1]}^{2}

, and

\int_{0}^{1} \int_{0}^{1} w (t) d t < \infty .

(3)

The assumption (3) on w ensures that the double integral in (2) is finite for each fixed n. Now, to determine what are large values of

V_{n, w} ({\hat{θ}}_{n})

, we must calculate its null distribution, or at least an approximation to it. Since the null distribution of

V_{n, w} ({\hat{θ}}_{n})

is unknown, we first try to estimate it by means of its asymptotic null distribution. In order to derive it, we will assume that the estimator

{\hat{θ}}_{n}

satisfies the following regularity condition:

Assumption 1.

Under

H_{0}

, if

θ = (μ, σ^{2}, λ_{1}, λ_{2}, λ_{3}) \in Θ

denotes the true parameter value, then

\sqrt{n} ({\hat{θ}}_{n} - θ) = \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} ℓ (X_{i}; θ) + o_{_{P}} (1),

where

ℓ : N_{0}^{2} \times Θ ⟶ R^{5}

is such that

E_{θ} \{ℓ (X_{1}; θ)\} = 0

and

J (θ) = E_{θ} \{ℓ {(X_{1}; θ)}^{⊤} ℓ (X_{1}; θ)\} < \infty

.

Assumption 1 is fulfilled by most commonly used estimators; see [16,21].

The next result gives the asymptotic null distribution of

V_{n, w} ({\hat{θ}}_{n})

.

Theorem 1.

Let

X_{1}, \dots, X_{n}

be iid from

X = (X_{1}, X_{2}) \sim B H (θ)

. Suppose that Assumption 1 holds.

Then

V_{n, w} ({\hat{θ}}_{n}) = {∥ W_{n} ∥}_{_{H}}^{2} + o_{_{P}} (1),

where

W_{n} (t) = \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} V^{0} (X_{i}, θ; t)

, with

V^{0} (X_{i}, θ; t) = t_{1}^{X_{i 1}} t_{2}^{X_{i 2}} - v (t; θ) \{1 + (λ, \frac{1}{2} λ^{2}, η (t_{1} - 1), η (t_{2} - 1), η (t_{1} t_{2} - 1)) ℓ {(X_{i}; θ)}^{⊤}\},

i = 1, \dots, n, η = μ + σ^{2} λ

. Moreover,

V_{n, w} ({\hat{θ}}_{n}) \overset{L}{⟶} \sum_{j \geq 1} λ_{j} χ_{1 j}^{2},

(4)

where

χ_{11}^{2}, χ_{12}^{2}, \dots

are independent

χ^{2}

variates with one degree of freedom and the set $\{λ_{j}\}$ consists of the non-null eigenvalues of the operator $C (θ)$ defined on the function space $\{τ : N_{0}^{2} \to R, \text{ such that } E_{θ} \{τ^{2} (X)\} < \infty, \forall θ \in Θ\}$, as follows:

C (θ) τ (x) = E_{θ} {h (x, Y; θ) τ (Y)},

where

h (x, y; θ) = \int_{0}^{1} \int_{0}^{1} V^{0} (x; θ; t) V^{0} (y; θ; t) w (t) d t .

(5)

Proof.

By definition,

V_{n, w} ({\hat{θ}}_{n}) = {∥ V_{n} ({\hat{θ}}_{n}) ∥}_{_{H}}^{2}

. Note that

V_{n} (t; {\hat{θ}}_{n}) = \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} V (X_{i}; {\hat{θ}}_{n}; t), with V (X_{i}; θ; t) = t_{1}^{X_{i 1}} t_{2}^{X_{i 2}} - v (t; θ) .

(6)

By Taylor expansion of

V (X_{i}; {\hat{θ}}_{n}; t)

around

{\hat{θ}}_{n} = θ

,

V_{n} (t; {\hat{θ}}_{n}) = \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} V (X_{i}; θ; t) + \frac{1}{n} \sum_{i = 1}^{n} Q^{(1)} (X_{i}; θ; t) \sqrt{n} {({\hat{θ}}_{n} - θ)}^{⊤} + q_{n},

(7)

where

q_{n} = \frac{1}{2 \sqrt{n}} ({\hat{θ}}_{n} - θ) \sum_{i = 1}^{n} Q^{(2)} (X_{i}; \tilde{θ}; t) {({\hat{θ}}_{n} - θ)}^{⊤}

,

\tilde{θ} = α {\hat{θ}}_{n} + (1 - α) θ

, for some

0 < α < 1

,

Q^{(1)} (x; ϑ; t)

is the vector of the first derivatives and

Q^{(2)} (x; ϑ; t)

is the matrix of the second derivatives of

V (x; ϑ; t)

with respect to

ϑ

.

Thus, considering (3) results in

E_{θ} \{{∥Q_{j}^{(1)} (X_{1}; θ; t)∥}_{_{H}}^{2}\} < \infty, j = 1, 2, \dots, 5 .

(8)

Using the Markov inequality and (8), we have

\begin{matrix} P_{θ} [{∥\frac{1}{n} \sum_{i = 1}^{n} Q_{j}^{(1)} (X_{i}; θ; t) - E_{θ} \{Q_{j}^{(1)} (X_{1}; θ; t)\}∥}_{_{H}} > ε] \\ \leq \frac{1}{n ε^{2}} E_{θ} [{∥Q_{j}^{(1)} (X_{1}; θ; t)∥}_{_{H}}^{2}] \to 0, j = 1, 2, \dots, 5 . \end{matrix}

Then,

\frac{1}{n} \sum_{i = 1}^{n} Q^{(1)} (X_{i}; θ; t) \overset{P}{⟶} E_{θ} \{Q^{(1)} (X_{1}; θ; t)\},

where

E_{θ} \{Q^{(1)} (X_{1}; θ; t)\} = - v (t; θ) (λ, \frac{1}{2} λ^{2}, η (t_{1} - 1), η (t_{2} - 1), η (t_{1} t_{2} - 1)) .
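The expression above can be checked directly from (1) and (6). The following worked derivation is a sketch added for the reader; it uses only the definitions of $v (t; θ)$, $λ$ and $η = μ + σ^{2} λ$ already given. Since $V (x; θ; t) = t_{1}^{x_{1}} t_{2}^{x_{2}} - v (t; θ)$, the gradient with respect to $θ$ does not depend on $x$, so the expectation equals the gradient itself.

```latex
\begin{aligned}
\frac{\partial v}{\partial μ} &= λ \, v (t; θ), \qquad
\frac{\partial v}{\partial σ^{2}} = \tfrac{1}{2} λ^{2} \, v (t; θ), \\
\frac{\partial v}{\partial λ_{1}} &= (μ + σ^{2} λ) (t_{1} - 1) \, v (t; θ) = η (t_{1} - 1) \, v (t; θ), \\
\frac{\partial v}{\partial λ_{2}} &= η (t_{2} - 1) \, v (t; θ), \qquad
\frac{\partial v}{\partial λ_{3}} = η (t_{1} t_{2} - 1) \, v (t; θ), \\
Q^{(1)} (x; θ; t) &= - \nabla_{θ} v (t; θ)
 = - v (t; θ) \left(λ, \tfrac{1}{2} λ^{2}, η (t_{1} - 1), η (t_{2} - 1), η (t_{1} t_{2} - 1)\right).
\end{aligned}
```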

As

∥ q_{n} ∥_{H} = o_{_{P}} (1)

, then, using Assumption 1, (7) can be written as

V_{n} (t; {\hat{θ}}_{n}) = S_{n} (t; θ) + s_{n},

where

∥ s_{n} ∥_{H} = o_{_{P}} (1)

, and

S_{n} (t; θ) = \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} [V (X_{i}; θ; t) + E_{θ} \{Q^{(1)} (X_{1}; θ; t)\} ℓ {(X_{i}; θ)}^{⊤}] .

On the other hand, observe that

{∥ S_{n} (θ) ∥}_{_{H}}^{2} = \frac{1}{n} \sum_{i = 1}^{n} \sum_{j = 1}^{n} h (X_{i}, X_{j}; θ),

where

h (x, y; θ)

is defined in (5) and satisfies

h (x, y; θ) = h (y, x; θ)

,

E_{θ} \{h^{2} (X_{1}, X_{2}; θ)\} < \infty

,

E_{θ} \{| h (X_{1}, X_{1}; θ) |\} < \infty

and

E_{θ} \{h (X_{1}, X_{2}; θ)\} = 0

. Thus, from Theorem 6.4.1.B in Serfling [22],

{∥ S_{n} (θ) ∥}_{_{H}}^{2} \overset{L}{⟶} \sum_{j \geq 1} λ_{j} χ_{1 j}^{2}

where

χ_{11}^{2}, χ_{12}^{2}, \dots

and the set

{λ_{j}}

are as defined in the statement of the Theorem. In particular,

{∥ S_{n} (θ) ∥}_{_{H}}^{2} = O_{P} (1)

, which implies (4). □

The asymptotic null distribution of $V_{n, w} ({\hat{θ}}_{n})$ depends on the unknown true value of the parameter $θ$; therefore, in practice, it does not provide a useful solution to the problem of estimating the null distribution of the test statistic. This could be solved by replacing $θ$ with ${\hat{θ}}_{n}$.

However, a greater difficulty is to determine the set ${\{λ_{j}\}}_{j \geq 1}$: in most cases, calculating the eigenvalues of an operator is not a simple task and, in our case, we must also obtain the expression of $h (x, y; θ)$, which is not easy to find, since it depends on the function $ℓ$, which usually does not have a simple expression.

Thus, in the next section, we consider another way to approximate the null distribution of the test statistic, the parametric bootstrap method.

4. The Bootstrap Estimator

An alternative way to estimate the null distribution is through the parametric bootstrap method.

Let $X_{1}, \dots, X_{n}$ be iid taking values in $N_{0}^{2}$. Assume that ${\hat{θ}}_{n} = {\hat{θ}}_{n} (X_{1}, \dots, X_{n}) \in Θ$. Let $X_{1}^{*}, \dots, X_{n}^{*}$ be iid from a population with distribution $B H ({\hat{θ}}_{n})$, given $X_{1}, \dots, X_{n}$, and let $V_{n, w}^{*} ({\hat{θ}}_{n}^{*})$ be the bootstrap version of $V_{n, w} ({\hat{θ}}_{n})$ obtained by replacing $X_{1}, \dots, X_{n}$ and ${\hat{θ}}_{n} = {\hat{θ}}_{n} (X_{1}, \dots, X_{n})$ by $X_{1}^{*}, \dots, X_{n}^{*}$ and ${\hat{θ}}_{n}^{*} = {\hat{θ}}_{n} (X_{1}^{*}, \dots, X_{n}^{*})$, respectively, in the expression of $V_{n, w} ({\hat{θ}}_{n})$. Let $P_{*}$ denote the bootstrap conditional probability law, given $X_{1}, \dots, X_{n}$. In order to show that the bootstrap consistently estimates the null distribution of $V_{n, w} ({\hat{θ}}_{n})$, we make the following assumption, which is slightly stronger than Assumption 1.

Assumption 2.

Assumption 1 holds and the functions ℓ and J satisfy

(1): ${sup}_{ϑ \in Θ_{0}} E_{ϑ} [{∥ ℓ (X; ϑ) ∥}^{2} I \{∥ ℓ (X; ϑ) ∥ > γ\}] ⟶ 0$ , as $γ \to \infty$ , where $Θ_{0} \subseteq Θ$ is an open neighborhood of θ.
(2): $ℓ (X; ϑ)$ is continuous as a function of ϑ at $ϑ = θ$ , and $J (ϑ)$ is finite $\forall ϑ \in Θ_{0}$ .

As stated after Assumption 1, Assumption 2 is not restrictive since it is fulfilled by commonly used estimators.

The next theorem shows that the bootstrap distribution of

V_{n, w} ({\hat{θ}}_{n})

consistently estimates its null distribution.

Theorem 2.

Let

X_{1}, \dots, X_{n}

be iid from a random vector

X = (X_{1}, X_{2}) \in N_{0}^{2}

. Suppose that Assumption 2 holds and that

{\hat{θ}}_{n} = θ + o (1)

, for some

θ \in Θ

. Then,

sup_{x \in R} | P_{*} {V_{n, w}^{*} ({\hat{θ}}_{n}^{*}) \leq x} - P_{θ} {V_{n, w} ({\hat{θ}}_{n}) \leq x} | \overset{a . s .}{⟶} 0 .

Proof.

By definition,

V_{n, w}^{*} ({\hat{θ}}_{n}^{*}) = {∥ V_{n}^{*} ({\hat{θ}}_{n}^{*}) ∥}_{_{H}}^{2}

, with

V_{n}^{*} (t; {\hat{θ}}_{n}^{*}) = \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} V (X_{i}^{*}; {\hat{θ}}_{n}^{*}; t)

and

V (X; θ; t)

defined in (6).

Following similar steps to those given in the proof of Theorem 1, it can be seen that

V_{n, w}^{*} ({\hat{θ}}_{n}^{*}) = {∥ W_{n}^{*} ∥}_{H}^{2} + o_{P_{*}} (1)

, where

W_{n}^{*} (t)

is defined as

W_{n} (t)

with

X_{i}

and

θ

replaced by

X_{i}^{*}

and

{\hat{θ}}_{n}

, respectively.

To derive the result, first we will check that assumptions (i)–(iii) in Theorem 1.1 of Kundu et al. [23] hold.

Observe that

Y_{n}^{*} (t) = \sum_{i = 1}^{n} Y_{n i}^{*} (t)

where

Y_{n i}^{*} (t) = \frac{1}{\sqrt{n}} V^{0} (X_{i}^{*}; {\hat{θ}}_{n}; t), i = 1, \dots, n,

Clearly,

E_{*} \{Y_{n i}^{*}\} = 0

and

E_{*} \{∥ Y_{n i}^{*} ∥_{_{H}}^{2}\} < \infty

. Let

K_{n}

be the covariance kernel of

Y_{n}^{*}

, which by SLLN satisfies

\begin{matrix} K_{n} (u, v) & = E_{*} \{Y_{n}^{*} (u) Y_{n}^{*} (v)\} \\ = E_{*} \{V^{0} (X_{1}^{*}; {\hat{θ}}_{n}; u) V^{0} (X_{1}^{*}; {\hat{θ}}_{n}; v)\} \\ \overset{a . s .}{⟶} E_{θ} \{V^{0} (X_{1}; θ; u) V^{0} (X_{1}; θ; v)\} = K (u, v) . \end{matrix}

Moreover, let

Z

be a zero-mean Gaussian process on

H

whose operator of covariance C is characterized by

\begin{matrix} {⟨ C f, h ⟩}_{_{H}} & = c o v ({⟨ Z, f ⟩}_{_{H}}, {⟨ Z, h ⟩}_{_{H}}) \\ = \int_{{[0, 1]}^{4}} K (u, v) f (u) h (v) w (u) w (v) d u d v . \end{matrix}

From the central limit theorem in Hilbert spaces (see, for example, van der Vaart and Wellner [24]), it follows that

Y_{n} = \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} V^{0} (X_{i}; θ; t) \overset{L}{⟶} Z

on

H

, when the data are iid from the random vector

X \sim B H (θ)

.

Let

C_{n}

denote the covariance operator of

Y_{n}^{*}

and let

{e_{k} : k \geq 0}

be an orthonormal basis of

H

. Let $f, h \in H$. By the dominated convergence theorem,

\begin{matrix} lim_{n \to \infty} {⟨ C_{n} e_{k}, e_{l} ⟩}_{_{H}} & = lim_{n \to \infty} \int_{{[0, 1]}^{4}} K_{n} (u, v) e_{k} (u) e_{l} (v) w (u) w (v) d u d v \\ = {⟨ C e_{k}, e_{l} ⟩}_{_{H}} . \end{matrix}

Setting

a_{k l} = {⟨ C e_{k}, e_{l} ⟩}_{_{H}}

in the aforementioned Theorem 1.1, this proves that condition (i) holds. To verify condition (ii), by using the monotone convergence theorem, Parseval’s relation and the dominated convergence theorem, we obtain

\begin{matrix} lim_{n \to \infty} \sum_{k = 0}^{\infty} {⟨ C_{n} e_{k}, e_{k} ⟩}_{_{H}} & = lim_{n \to \infty} \sum_{k = 0}^{\infty} \int_{{[0, 1]}^{4}} K_{n} (u, v) e_{k} (u) e_{k} (v) w (u) w (v) d u d v \\ & = \sum_{k = 0}^{\infty} \int_{{[0, 1]}^{4}} K (u, v) e_{k} (u) e_{k} (v) w (u) w (v) d u d v = \sum_{k = 0}^{\infty} {⟨ C e_{k}, e_{k} ⟩}_{_{H}} \\ & = \sum_{k = 0}^{\infty} a_{k k} = \sum_{k = 0}^{\infty} E_{θ} \{{⟨ Z, e_{k} ⟩}_{_{H}}^{2}\} = E_{θ} \{{∥ Z ∥}_{_{H}}^{2}\} < \infty . \end{matrix}

To prove condition (iii), we first notice that

| {⟨ Y_{n i}^{*}, e_{k} ⟩}_{_{H}} | \leq \frac{M}{\sqrt{n}}, i = 1, \dots, n, \forall n, where 0 < M < \infty .

From the above inequality, for each fixed

ε > 0

,

E_{*} [{⟨ Y_{n i}^{*}, e_{k} ⟩}_{_{H}}^{2} I \{| {⟨ Y_{n i}^{*}, e_{k} ⟩}_{_{H}} | > ε\}] = 0 .

for sufficiently large n. This proves condition (iii). Therefore,

Y_{n}^{*} \overset{L}{⟶} Z

in

H

, a.s. Now, the result follows from the continuous mapping theorem. □

From Theorem 2, the test function

Ψ_{V}^{*} = \begin{cases} 1, & \text{if } V_{n, w} ({\hat{θ}}_{n}) \geq v_{n, w, α}^{*}, \\ 0, & \text{otherwise}, \end{cases}

or, equivalently, the test that rejects $H_{0}$ when

p^{*} = P_{*} \{V_{n, w}^{*} ({\hat{θ}}_{n}^{*}) \geq V_{o b s}\} \leq α,

is asymptotically correct in the sense that, when $H_{0}$ is true, $\lim P_{θ} (Ψ_{V}^{*} = 1) = α$, where $v_{n, w, α}^{*} = \inf \{x : P_{*} (V_{n, w}^{*} ({\hat{θ}}_{n}^{*}) \geq x) \leq α\}$ is the $α$ upper percentile of the bootstrap distribution of $V_{n, w} ({\hat{θ}}_{n})$ and $V_{o b s}$ is the observed value of the test statistic.
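In practice, $p^{*}$ is approximated by Monte Carlo: draw $B$ bootstrap samples from the fitted model, re-estimate the parameter on each, and count how often the bootstrap statistic reaches the observed one. The following Python sketch shows that loop in generic form. It is not the R code used in this study. The function `bootstrap_p_value` and its callable arguments (`statistic`, `estimate`, `sample`) are hypothetical names, and the demonstration uses a deliberately simple stand-in model (a univariate Poisson with a dispersion-based statistic) rather than the BHD, only so that the block runs on its own.

```python
import numpy as np

def bootstrap_p_value(data, statistic, estimate, sample, n_boot=500, seed=0):
    """Parametric bootstrap p-value: share of bootstrap statistics >= observed."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    n = len(data)
    theta_hat = estimate(data)              # fit the null model to the data
    v_obs = statistic(data, theta_hat)      # observed test statistic
    exceed = 0
    for _ in range(n_boot):
        boot = sample(theta_hat, n, rng)    # sample of size n from fitted model
        theta_star = estimate(boot)         # re-estimate on the bootstrap sample
        if statistic(boot, theta_star) >= v_obs:
            exceed += 1
    return exceed / n_boot

# --- Stand-in model for demonstration only (NOT the BHD) ---
def estimate_poisson(x):
    return x.mean()

def dispersion_gap(x, lam_hat):
    # |sample variance - fitted mean|; close to 0 for Poisson data
    return abs(x.var() - lam_hat)

def sample_poisson(lam_hat, n, rng):
    return rng.poisson(lam_hat, size=n)

rng = np.random.default_rng(1)
poisson_data = rng.poisson(3.0, size=200)                 # null model holds
overdispersed = rng.negative_binomial(1, 0.1, size=200)   # variance far above mean

p_null = bootstrap_p_value(poisson_data, dispersion_gap, estimate_poisson, sample_poisson)
p_alt = bootstrap_p_value(overdispersed, dispersion_gap, estimate_poisson, sample_poisson)
print(p_null, p_alt)
```

For the test proposed here, `statistic` would compute $V_{n, w}$, `estimate` would return ${\hat{θ}}_{n}$, and `sample` would draw from $B H ({\hat{θ}}_{n})$; the loop itself is unchanged.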

5. Numerical Results and Discussion

According to Novoa-Muñoz and Jiménez-Gamero in [14], the properties of the statistic

V_{n, w} ({\hat{θ}}_{n})

are asymptotic, that is, such properties describe the behavior of the test proposed for large samples. To study the goodness of the bootstrap approach for samples of finite size, a simulation experiment was carried out. In this section, we describe this experiment and provide a summary of the results that have been obtained.

It is necessary to emphasize that, as mentioned in the Introduction, to the best of our knowledge, there is no other goodness-of-fit test for the bivariate Hermite distribution with which we can make a comparison. Therefore, the simulation study is limited to the test presented in this investigation.

All computations in this paper were carried out with code written in the R language [25].

To calculate $V_{n, w} ({\hat{θ}}_{n})$, it is necessary to give an explicit form to the weight function $w$. Here, we take

w (t; a_{1}, a_{2}) = t_{1}^{a_{1}} t_{2}^{a_{2}} .

(9)

Observe that the only restrictions that have been imposed on the weight function are that $w$ be positive almost everywhere in ${[0, 1]}^{2}$ and that it satisfy condition (3). The function $w (t; a_{1}, a_{2})$ given in (9) meets these conditions whenever $a_{i} > - 1$, $i = 1, 2$. Hence,

V_{n, w} ({\hat{θ}}_{n}) = n \int_{0}^{1} \int_{0}^{1} {[\frac{1}{n} \sum_{i = 1}^{n} t_{1}^{X_{i 1}} t_{2}^{X_{i 2}} - \exp (\hat{μ} \hat{λ} + \frac{1}{2} {\hat{σ}}^{2} {\hat{λ}}^{2})]}^{2} t_{1}^{a_{1}} t_{2}^{a_{2}} d t_{1} d t_{2} .

It was not possible to find an explicit form of the statistic $V_{n, w} ({\hat{θ}}_{n})$, so it was computed numerically with the curvature package of R [25].
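To make the numerical evaluation concrete, the following Python sketch approximates the double integral with a tensor-product Gauss–Legendre rule. This is a stand-in for the R integration routine used in the study, not a reproduction of it. The function name `cvm_statistic`, the number of nodes, and the restriction to $a_{1}, a_{2} \geq 0$ (so that the weight has no singularity at the origin) are assumptions made for illustration.

```python
import numpy as np

def cvm_statistic(data, theta, a1=0.0, a2=0.0, n_nodes=40):
    """Approximate V_{n,w} = n * int int (v_n(t) - v(t; theta))^2 t1^a1 t2^a2 dt.

    data  : (n, 2) array of non-negative integer counts
    theta : (mu, sigma2, lam1, lam2, lam3)
    Assumes a1, a2 >= 0 (illustrative restriction, smooth integrand).
    """
    data = np.asarray(data)
    n = data.shape[0]
    mu, sigma2, lam1, lam2, lam3 = theta

    # Gauss-Legendre nodes/weights on [-1, 1], mapped to [0, 1]
    u, w = np.polynomial.legendre.leggauss(n_nodes)
    x = 0.5 * (u + 1.0)
    w = 0.5 * w
    t1, t2 = np.meshgrid(x, x, indexing="ij")
    w2 = np.outer(w, w)

    # epgf on the grid: v_n[j, k] = (1/n) * sum_i x_j**X_i1 * x_k**X_i2
    p1 = x[:, None] ** data[:, 0][None, :]
    p2 = x[:, None] ** data[:, 1][None, :]
    v_n = p1 @ p2.T / n

    # pgf of Equation (1) on the same grid
    lam = lam1 * (t1 - 1.0) + lam2 * (t2 - 1.0) + lam3 * (t1 * t2 - 1.0)
    v = np.exp(mu * lam + 0.5 * sigma2 * lam ** 2)

    integrand = (v_n - v) ** 2 * t1 ** a1 * t2 ** a2
    return n * np.sum(w2 * integrand)

# Numerical sanity check with all rates set to zero (outside the parameter
# space, used only to test the quadrature): then v(t) = 1, and a single
# observation (1, 0) gives the integrand (t1 - 1)^2, whose integral is 1/3.
check_value = cvm_statistic(np.array([[1, 0]]), (1.0, 1.0, 0.0, 0.0, 0.0))
print(check_value)
```

Because the integrand is smooth on ${[0, 1]}^{2}$ for non-negative exponents, a moderate number of nodes is usually sufficient; for $- 1 < a_{i} < 0$ a rule adapted to the endpoint singularity would be needed instead.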

5.1. Simulated Data

To approximate the null distribution of the statistic $V_{n, w} ({\hat{θ}}_{n})$ for samples of sizes 30, 50, and 70 from a $B H (θ)$, with $θ = (μ, σ^{2}, λ_{1}, λ_{2}, λ_{3})$, the pgf (1) with $λ_{3} = 0$ was used. The combinations of parameters were chosen in such a way that $μ > σ^{2} (λ_{i} + λ_{3})$, $i = 1, 2$.
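One possible way to generate such samples when $λ_{3} = 0$ is sketched below in Python. It is not the generator used in the study, which is not described here. The sketch relies on expanding the exponent of (1) as $\sum_{j, k} a_{j k} (t_{1}^{j} t_{2}^{k} - 1)$ with $a_{10} = λ_{1} \{μ - σ^{2} (λ_{1} + λ_{2})\}$, $a_{01} = λ_{2} \{μ - σ^{2} (λ_{1} + λ_{2})\}$, $a_{20} = \frac{1}{2} σ^{2} λ_{1}^{2}$, $a_{02} = \frac{1}{2} σ^{2} λ_{2}^{2}$ and $a_{11} = σ^{2} λ_{1} λ_{2}$, so that the vector can be built from five independent Poisson counts. This representation requires $μ \geq σ^{2} (λ_{1} + λ_{2})$, an assumption stronger than the constraint stated above; the function name `sample_bhd_lam3_zero` is hypothetical.

```python
import numpy as np

def sample_bhd_lam3_zero(mu, sigma2, lam1, lam2, size, rng):
    """Draw `size` pairs with pgf (1) and lam3 = 0 via independent Poisson counts.

    Assumes mu >= sigma2 * (lam1 + lam2) so that every Poisson rate is >= 0.
    """
    base = mu - sigma2 * (lam1 + lam2)
    if base < 0:
        raise ValueError("requires mu >= sigma2 * (lam1 + lam2)")
    a10, a01 = lam1 * base, lam2 * base          # rates of single jumps
    a20 = 0.5 * sigma2 * lam1 ** 2               # rate of double jumps in X1
    a02 = 0.5 * sigma2 * lam2 ** 2               # rate of double jumps in X2
    a11 = sigma2 * lam1 * lam2                   # rate of joint jumps
    n10 = rng.poisson(a10, size)
    n01 = rng.poisson(a01, size)
    n20 = rng.poisson(a20, size)
    n02 = rng.poisson(a02, size)
    n11 = rng.poisson(a11, size)
    x1 = n10 + 2 * n20 + n11
    x2 = n01 + 2 * n02 + n11
    return np.column_stack([x1, x2])

rng = np.random.default_rng(2023)
xy = sample_bhd_lam3_zero(mu=2.0, sigma2=0.8, lam1=0.5, lam2=0.25, size=200_000, rng=rng)

mean_x1 = xy[:, 0].mean()                    # theory: mu * lam1 = 1.0
var_x1 = xy[:, 0].var()                      # theory: mu*lam1 + sigma2*lam1**2 = 1.2
cov_12 = np.cov(xy[:, 0], xy[:, 1])[0, 1]    # theory: sigma2*lam1*lam2 = 0.1
print(mean_x1, var_x1, cov_12)
```

The theoretical moments quoted in the comments follow from differentiating (1) at $t = (1, 1)$, which gives a simple check that the simulated pairs match the intended pgf.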

The selected values of the other parameters were $μ \in \{1.0, 1.5, 2.0\}$, $σ^{2} \in \{0.8, 1.0\}$, $λ_{1} \in \{0.10, 0.25, 0.50, 0.75, 1.00\}$ and $λ_{2} \in \{0.20, 0.25, 0.50, 0.75\}$.

The selected values of $λ_{1}$ and $λ_{2}$ were not greater than 1 since the Hermite distribution is characterized as being zero-inflated.

To estimate the parameter $θ$, we used the maximum likelihood method given in Kocherlakota and Kocherlakota [16]. Then, we approximated the bootstrap p-values of the proposed test with the weight function given in (9) for $(a_{1}, a_{2}) \in \{(0, 0), (1, 0), (0, 1), (1, 1), (5, 1), (1, 5), (5, 5)\}$, generating $B = 500$ bootstrap samples.

The above procedure was repeated 1000 times, and we computed the fraction of the estimated p-values that were less than or equal to 0.05 and 0.10; these fractions are the estimated type I error probabilities for $α = 0.05$ and $α = 0.10$.
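The computation of these fractions, together with the Monte Carlo uncertainty implied by 1000 replications, can be sketched as follows in Python. The helper names `rejection_rate` and `monte_carlo_se` and the short list of p-values are hypothetical; the standard-error formula is the usual binomial one and is offered only as a guide to how close an estimate can be expected to lie to the nominal level.

```python
import math

def rejection_rate(p_values, alpha):
    """Fraction of p-values less than or equal to alpha."""
    return sum(p <= alpha for p in p_values) / len(p_values)

def monte_carlo_se(alpha, n_rep):
    """Binomial standard error of an estimated rejection probability near alpha."""
    return math.sqrt(alpha * (1.0 - alpha) / n_rep)

# Hypothetical p-values from four replications (illustration only).
example_p_values = [0.01, 0.20, 0.05, 0.50]
rate_05 = rejection_rate(example_p_values, 0.05)   # two of four are <= 0.05

se_05 = monte_carlo_se(0.05, 1000)   # about 0.0069
se_10 = monte_carlo_se(0.10, 1000)   # about 0.0095
print(rate_05, se_05, se_10)
```

Under these assumptions, an estimated type I error within roughly two standard errors of the nominal level (about 0.036 to 0.064 for $α = 0.05$) is compatible with a correctly calibrated test.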

The results obtained are presented in Tables 1–7 for the different pairs $(a_{1}, a_{2})$. In each table, the rows are ordered by increasing $μ$ and $σ^{2}$; for each $μ$, by increasing $λ_{1}$; and, for each $λ_{1}$, by increasing $λ_{2}$. From these results, we can conclude that the parametric bootstrap method provides good approximations to the null distribution of $V_{n, w} ({\hat{θ}}_{n})$ in most of the cases considered.

Table 1. Simulation results for the probability of type I error for $a_{1} = 0$ and $a_{2} = 0$.

Table 2. Simulation results for the probability of type I error for $a_{1} = 1$ and $a_{2} = 0$.

Table 3. Simulation results for the probability of type I error for $a_{1} = 0$ and $a_{2} = 1$.

Table 4. Simulation results for the probability of type I error for $a_{1} = 1$ and $a_{2} = 1$.

Table 5. Simulation results for the probability of type I error for $a_{1} = 1$ and $a_{2} = 5$.

Table 6. Simulation results for the probability of type I error for $a_{1} = 5$ and $a_{2} = 1$.

Table 7. Simulation results for the probability of type I error for $a_{1} = 5$ and $a_{2} = 5$.

The values of $a_{1}$ and $a_{2}$ in the weight function affect the bootstrap estimates of the p-values.

From the tables, it is clear that the estimated type I error probabilities obtained from the bootstrap p-values approach the nominal value as n increases. These approximations are better when $a_{1} = a_{2}$. In particular, when $a_{1} = a_{2}$ is small (less than 5), the estimates approach the nominal value from the left (below); the opposite happens when $a_{1} = a_{2}$ is fairly large (greater than or equal to 5). Table 4 shows the best results: the weight function with $a_{1} = a_{2} = 1$ yields the best estimates.

Unfortunately, we could not find a closed form for our statistic $V_{n, w} ({\hat{θ}}_{n})$; in order to calculate it, we used the curvature package of the software R [25]. This had a serious impact on computation time: the execution time of the simulations increased by at least 30%.

5.2. The Power of a Hypothesis Test

To study the power, we repeated the previous experiment for samples of size $n = 50$ and, for the weight function, we used the values of $a_{1}$ and $a_{2}$ that yielded the best results in the study of type I error. The alternative distributions we used are detailed below:

bivariate binomial distribution $B B (m; p_{1}, p_{2}, p_{3})$ , where $p_{1} + p_{2} - p_{3} \leq 1$ , $p_{1} \geq p_{3}$ , $p_{2} \geq p_{3}$ and $p_{3} > 0$ ,
bivariate Poisson distribution $B P (λ_{1}, λ_{2}, λ_{3})$ , where $λ_{1} > λ_{3}$ , $λ_{2} > λ_{3} > 0$ ,
bivariate logarithmic series distribution $B L S (λ_{1}, λ_{2}, λ_{3})$ , where $0 < λ_{1} + λ_{2} + λ_{3} < 1$ ,
bivariate negative binomial distribution $B N B (ν; γ_{0}, γ_{1}, γ_{2})$ , where $ν \in N, γ_{0} > γ_{2}, γ_{1} > γ_{2}$ and $γ_{2} > 0$ ,
bivariate Neyman type A distribution $B N T A (λ; λ_{1}, λ_{2}, λ_{3})$ , where $0 < λ_{1} + λ_{2} + λ_{3} \leq 1$ ,
bivariate Poisson distribution mixtures of the form $p B P (θ) + (1 - p) B P (λ)$ , where $0 < p < 1$ , denoted by $B P P (p; θ, λ)$ .

Table 8 displays the alternatives considered and the estimated power for nominal significance level $α = 0.05$. Analyzing this table, we can conclude that all the considered tests, denoted by $V_{(a_{1}, a_{2})}$, are able to detect the alternatives studied with good power, giving better results in cases where $a_{1} = a_{2}$. The best result was achieved for $a_{1} = a_{2} = 1$, as also occurred in the study of type I error.

Table 8. Simulation results for the power. The values are in the form of percentages, rounded to the nearest integer.

5.3. Real Data Set

Now, the proposed test will be applied to a real data set. The data set comprises the number of accidents in two different years, presented in [16], where X is the number of accidents in the first period and Y the number of accidents in the second period. Table 9 shows the real data set.

Table 9. Real data: number of accidents X in one period and Y in another period.

The p-value obtained from the statistic $V_{n, w} ({\hat{θ}}_{n})$ of the proposed test, with $a_{1} = 1$ and $a_{2} = 0$, applied to the real data is 0.838; therefore, we do not reject the null hypothesis, that is, the data seem to follow a BHD. This is consistent with the results presented by Kemp and Papageorgiou [26], who performed the $χ^{2}$ goodness-of-fit test and obtained a p-value of 0.3078.

Author Contributions

Conceptualization, F.N.-M.; methodology, F.N.-M. and P.G.-A.; software, F.N.-M. and P.G.-A.; validation, F.N.-M. and P.G.-A.; formal analysis, F.N.-M. and P.G.-A.; investigation, F.N.-M. and P.G.-A.; resources, F.N.-M.; data curation, P.G.-A.; writing—original draft preparation, F.N.-M. and P.G.-A.; writing—review and editing, F.N.-M. and P.G.-A.; visualization, F.N.-M. and P.G.-A. All authors have read and agreed to the published version of the manuscript.

Funding

This publication was supported by Universidad del Bío-Bío, DICREA [2220529 IF/R] and Universidad Adventista de Chile, DI [2021-139 II], Chile.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The corresponding author would like to thank research project DIUBB 2220529 IF/R and Fondo de Apoyo a la Participación a Eventos Internacionales (FAPEI) at Universidad del Bío-Bío, Chile. He also thanks the anonymous reviewers and the editor of this journal for their valuable time and their careful comments and suggestions with which the quality of this paper has been improved.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Ebner, B.; Henze, N. Tests for multivariate normality-a critical review with emphasis on weighted L²-statistics. TEST 2020, 29, 845–892.
2. Górecki, T.; Horváth, L.; Kokoszka, P. Tests of Normality of Functional Data. Int. Stat. Rev. 2020, 88, 677–697.
3. Puig, P.; Weiß, C.H. Some goodness-of-fit tests for the Poisson distribution with applications in biodosimetry. Comput. Stat. Data Anal. 2020, 144, 106878.
4. Arnastauskaitė, J.; Ruzgas, T.; Bražėnas, M. A New Goodness of Fit Test for Multivariate Normality and Comparative Simulation Study. Mathematics 2021, 9, 3003.
5. Dörr, P.; Ebner, B.; Henze, N. A new test of multivariate normality by a double estimation in a characterizing PDE. Metrika 2021, 84, 401–427.
6. Kolkiewicz, A.; Rice, G.; Xie, Y. Projection pursuit based tests of normality with functional data. J. Stat. Plan. Inference 2021, 211, 326–339.
7. Milonas, D.; Ruzgas, T.; Venclovas, Z.; Jievaltas, M.; Joniau, S. The significance of prostate specific antigen persistence in prostate cancer risk groups on long-term oncological outcomes. Cancers 2021, 13, 2453.
8. Di Noia, A.; Barabesi, L.; Marcheselli, M.; Pisani, C.; Pratelli, L. Goodness-of-fit test for count distributions with finite second moment. J. Nonparametric Stat. 2022.
9. Erlemann, R.; Lindqvist, B.H. Conditional Goodness-of-Fit Tests for Discrete Distributions. J. Stat. Theory Pract. 2022.
10. McKendrick, A.G. Applications of Mathematics to Medical Problems. Proc. Edinb. Math. Soc. 1926, 44, 98–130.
11. Cresswell, W.L.; Froggatt, P. The Causation of Bus Driver Accidents; Oxford University Press: Oxford, UK, 1963; p. 316.
12. Meintanis, S.; Bassiakos, Y. Goodness-of-fit tests for additively closed count models with an application to the generalized Hermite distribution. Sankhya 2005, 67, 538–552.
13. Novoa-Muñoz, F. Goodness-of-fit tests for the bivariate Poisson distribution. Commun. Stat. Simul. Comput. 2019.
14. Novoa-Muñoz, F.; Jiménez-Gamero, M.D. Testing for the bivariate Poisson distribution. Metrika 2013, 77, 771–793.
15. Johnson, N.L.; Kemp, A.W.; Kotz, S. Univariate Discrete Distributions, 3rd ed.; John Wiley & Sons, Inc.: New York, NY, USA, 2005.
16. Kocherlakota, S.; Kocherlakota, K. Bivariate Discrete Distributions; John Wiley & Sons: Hoboken, NJ, USA, 1992.
17. Papageorgiou, H.; Kemp, C.D.; Loukas, S. Some methods of estimation for the bivariate Hermite distribution. Biometrika 1983, 70, 479–484.
18. Kemp, C.D.; Kemp, A.W. Rapid estimation for discrete distributions. Statistician 1988, 37, 243–255.
19. Johnson, N.L.; Kotz, S.; Balakrishnan, N. Discrete Multivariate Distributions; Wiley: New York, NY, USA, 1997.
20. Kocherlakota, S. On the compounded bivariate Poisson distribution: A unified approach. Ann. Inst. Stat. Math. 1988, 40, 61–76.
21. Papageorgiou, H.; Loukas, S. Conditional even point estimation for bivariate discrete distributions. Commun. Stat. Theory Methods 1988, 17, 3403–3412.
22. Serfling, R.J. Approximation Theorems of Mathematical Statistics; Wiley: New York, NY, USA, 1980.
23. Kundu, S.; Majumdar, S.; Mukherjee, K. Central limit theorems revisited. Stat. Probab. Lett. 2000, 47, 265–275.
24. Van der Vaart, A.W.; Wellner, J.A. Weak Convergence and Empirical Processes; Springer: New York, NY, USA, 1996.
25. R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. 2021. Available online: https://www.R-project.org/ (accessed on 1 July 2019).
26. Kemp, C.D.; Papageorgiou, H. Bivariate Hermite distributions. Sankhya 1982, 44, 269–280.

Table 1. Simulation results for the probability of type I error for $a_{1} = 0$ and $a_{2} = 0$.

$\theta$	$n = 30$		$n = 50$		$n = 70$	
	$\alpha = 0.05$	$\alpha = 0.1$	$\alpha = 0.05$	$\alpha = 0.1$	$\alpha = 0.05$	$\alpha = 0.1$
(1.0, 0.8, 0.10, 0.20, 0.00)	0.012	0.053	0.029	0.069	0.037	0.081
(1.0, 0.8, 0.25, 0.25, 0.00)	0.027	0.067	0.037	0.064	0.043	0.094
(1.0, 0.8, 0.50, 0.20, 0.00)	0.016	0.062	0.046	0.073	0.047	0.087
(1.0, 0.8, 0.50, 0.50, 0.00)	0.025	0.063	0.042	0.076	0.044	0.091
(1.5, 1.0, 0.50, 0.50, 0.00)	0.010	0.064	0.035	0.078	0.042	0.089
(1.5, 1.0, 0.50, 0.75, 0.00)	0.010	0.065	0.036	0.084	0.041	0.084
(1.5, 1.0, 0.75, 0.25, 0.00)	0.017	0.071	0.038	0.087	0.043	0.088
(1.5, 1.0, 1.00, 0.25, 0.00)	0.027	0.076	0.039	0.090	0.042	0.092
(2.0, 1.0, 0.25, 0.75, 0.00)	0.017	0.067	0.038	0.082	0.047	0.089
(2.0, 1.0, 0.50, 0.25, 0.00)	0.011	0.067	0.037	0.088	0.045	0.091
(2.0, 1.0, 0.75, 0.25, 0.00)	0.029	0.070	0.035	0.087	0.043	0.089
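
One way to read Table 1 is to summarize, for each sample size, how far the empirical type I error rates sit from the nominal level. The following Python sketch does this for the $\alpha = 0.05$ columns only. The values are copied from Table 1; the names `rates_alpha_005` and `mean_abs_deviation` are illustrative and do not come from the paper, and the summary statistic is one reasonable choice rather than the authors' own analysis.

```python
NOMINAL = 0.05

# Empirical type I error rates at alpha = 0.05, copied from Table 1.
# One list per sample size n, in the row order of the table (11 parameter vectors).
rates_alpha_005 = {
    30: [0.012, 0.027, 0.016, 0.025, 0.010, 0.010, 0.017, 0.027, 0.017, 0.011, 0.029],
    50: [0.029, 0.037, 0.046, 0.042, 0.035, 0.036, 0.038, 0.039, 0.038, 0.037, 0.035],
    70: [0.037, 0.043, 0.047, 0.044, 0.042, 0.041, 0.043, 0.042, 0.047, 0.045, 0.043],
}


def mean_abs_deviation(rates, nominal):
    """Average absolute gap between the empirical rates and the nominal level."""
    return sum(abs(r - nominal) for r in rates) / len(rates)


# Illustrative summary: one deviation value per sample size.
deviation = {n: mean_abs_deviation(r, NOMINAL) for n, r in rates_alpha_005.items()}

for n in sorted(deviation):
    print(f"n = {n}: mean |rate - {NOMINAL}| = {deviation[n]:.4f}")
```

For these values the average gap shrinks as $n$ grows from 30 to 70, and every entry lies below 0.05, so the test is conservative at this level for this choice of $a_{1}$ and $a_{2}$.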

Table 2. Simulation results for the probability of type I error for $a_{1} = 1$ and $a_{2} = 0$.

$\theta$	$n = 30$		$n = 50$		$n = 70$	
	$\alpha = 0.05$	$\alpha = 0.1$	$\alpha = 0.05$	$\alpha = 0.1$	$\alpha = 0.05$	$\alpha = 0.1$
(1.0, 0.8, 0.10, 0.20, 0.00)	0.010	0.039	0.025	0.073	0.043	0.088
(1.0, 0.8, 0.25, 0.25, 0.00)	0.025	0.073	0.037	0.088	0.041	0.104
(1.0, 0.8, 0.50, 0.20, 0.00)	0.027	0.072	0.041	0.083	0.045	0.086
(1.0, 0.8, 0.50, 0.50, 0.00)	0.035	0.053	0.042	0.072	0.045	0.101
(1.5, 1.0, 0.50, 0.50, 0.00)	0.011	0.064	0.031	0.080	0.038	0.085
(1.5, 1.0, 0.50, 0.75, 0.00)	0.019	0.065	0.034	0.078	0.039	0.080
(1.5, 1.0, 0.75, 0.25, 0.00)	0.025	0.081	0.038	0.085	0.042	0.084
(1.5, 1.0, 1.00, 0.25, 0.00)	0.037	0.074	0.035	0.085	0.040	0.086
(2.0, 1.0, 0.25, 0.75, 0.00)	0.027	0.071	0.034	0.082	0.047	0.089
(2.0, 1.0, 0.50, 0.25, 0.00)	0.011	0.077	0.031	0.084	0.044	0.086
(2.0, 1.0, 0.75, 0.25, 0.00)	0.019	0.080	0.035	0.085	0.044	0.087

Table 3. Simulation results for the probability of type I error for $a_{1} = 0$ and $a_{2} = 1$.

$\theta$	$n = 30$		$n = 50$		$n = 70$	
	$\alpha = 0.05$	$\alpha = 0.1$	$\alpha = 0.05$	$\alpha = 0.1$	$\alpha = 0.05$	$\alpha = 0.1$
(1.0, 0.8, 0.10, 0.20, 0.00)	0.014	0.044	0.029	0.067	0.043	0.088
(1.0, 0.8, 0.25, 0.25, 0.00)	0.028	0.068	0.039	0.079	0.042	0.084
(1.0, 0.8, 0.50, 0.20, 0.00)	0.019	0.063	0.042	0.083	0.057	0.092
(1.0, 0.8, 0.50, 0.50, 0.00)	0.029	0.063	0.045	0.075	0.054	0.089
(1.5, 1.0, 0.50, 0.50, 0.00)	0.011	0.066	0.039	0.079	0.042	0.089
(1.5, 1.0, 0.50, 0.75, 0.00)	0.013	0.070	0.043	0.082	0.043	0.087
(1.5, 1.0, 0.75, 0.25, 0.00)	0.017	0.081	0.042	0.089	0.043	0.092
(1.5, 1.0, 1.00, 0.25, 0.00)	0.037	0.086	0.045	0.091	0.045	0.093
(2.0, 1.0, 0.25, 0.75, 0.00)	0.047	0.077	0.048	0.084	0.047	0.089
(2.0, 1.0, 0.50, 0.25, 0.00)	0.014	0.077	0.037	0.089	0.043	0.093
(2.0, 1.0, 0.75, 0.25, 0.00)	0.027	0.080	0.041	0.097	0.044	0.096

Table 4. Simulation results for the probability of type I error for $a_{1} = 1$ and $a_{2} = 1$.

$\theta$	$n = 30$		$n = 50$		$n = 70$	
	$\alpha = 0.05$	$\alpha = 0.1$	$\alpha = 0.05$	$\alpha = 0.1$	$\alpha = 0.05$	$\alpha = 0.1$
(1.0, 0.8, 0.10, 0.20, 0.00)	0.016	0.073	0.024	0.086	0.048	0.092
(1.0, 0.8, 0.25, 0.25, 0.00)	0.032	0.058	0.037	0.088	0.049	0.091
(1.0, 0.8, 0.50, 0.20, 0.00)	0.024	0.064	0.043	0.085	0.048	0.089
(1.0, 0.8, 0.50, 0.50, 0.00)	0.033	0.072	0.043	0.086	0.049	0.093
(1.5, 1.0, 0.50, 0.50, 0.00)	0.030	0.072	0.038	0.088	0.046	0.090
(1.5, 1.0, 0.50, 0.75, 0.00)	0.033	0.071	0.042	0.084	0.047	0.098
(1.5, 1.0, 0.75, 0.25, 0.00)	0.036	0.097	0.039	0.097	0.049	0.099
(1.5, 1.0, 1.00, 0.25, 0.00)	0.039	0.088	0.046	0.090	0.049	0.093
(2.0, 1.0, 0.25, 0.75, 0.00)	0.031	0.087	0.044	0.092	0.048	0.099
(2.0, 1.0, 0.50, 0.25, 0.00)	0.035	0.068	0.039	0.081	0.047	0.093
(2.0, 1.0, 0.75, 0.25, 0.00)	0.037	0.080	0.045	0.088	0.049	0.096

Table 5. Simulation results for the probability of type I error for $a_{1} = 1$ and $a_{2} = 5$.

$\theta$	$n = 30$		$n = 50$		$n = 70$	
	$\alpha = 0.05$	$\alpha = 0.1$	$\alpha = 0.05$	$\alpha = 0.1$	$\alpha = 0.05$	$\alpha = 0.1$
(1.0, 0.8, 0.10, 0.20, 0.00)	0.014	0.037	0.032	0.075	0.051	0.093
(1.0, 0.8, 0.25, 0.25, 0.00)	0.023	0.074	0.053	0.090	0.060	0.113
(1.0, 0.8, 0.50, 0.20, 0.00)	0.036	0.101	0.062	0.110	0.064	0.117
(1.0, 0.8, 0.50, 0.50, 0.00)	0.023	0.080	0.042	0.107	0.063	0.109
(1.5, 1.0, 0.50, 0.50, 0.00)	0.022	0.081	0.037	0.111	0.046	0.108
(1.5, 1.0, 0.50, 0.75, 0.00)	0.039	0.095	0.048	0.108	0.056	0.108
(1.5, 1.0, 0.75, 0.25, 0.00)	0.034	0.108	0.048	0.107	0.054	0.108
(1.5, 1.0, 1.00, 0.25, 0.00)	0.037	0.107	0.059	0.109	0.054	0.107
(2.0, 1.0, 0.25, 0.75, 0.00)	0.048	0.106	0.056	0.108	0.054	0.106
(2.0, 1.0, 0.50, 0.25, 0.00)	0.025	0.107	0.047	0.108	0.045	0.108
(2.0, 1.0, 0.75, 0.25, 0.00)	0.043	0.107	0.045	0.107	0.043	0.106

Table 6. Simulation results for the probability of type I error for $a_{1} = 5$ and $a_{2} = 1$.

$\theta$	$n = 30$		$n = 50$		$n = 70$	
	$\alpha = 0.05$	$\alpha = 0.1$	$\alpha = 0.05$	$\alpha = 0.1$	$\alpha = 0.05$	$\alpha = 0.1$
(1.0, 0.8, 0.10, 0.20, 0.00)	0.015	0.040	0.032	0.062	0.042	0.081
(1.0, 0.8, 0.25, 0.25, 0.00)	0.034	0.076	0.045	0.101	0.048	0.104
(1.0, 0.8, 0.50, 0.20, 0.00)	0.028	0.084	0.048	0.073	0.053	0.089
(1.0, 0.8, 0.50, 0.50, 0.00)	0.028	0.069	0.045	0.079	0.054	0.098
(1.5, 1.0, 0.50, 0.50, 0.00)	0.019	0.071	0.035	0.078	0.042	0.099
(1.5, 1.0, 0.50, 0.75, 0.00)	0.044	0.104	0.048	0.098	0.056	0.104
(1.5, 1.0, 0.75, 0.25, 0.00)	0.027	0.107	0.038	0.105	0.046	0.103
(1.5, 1.0, 1.00, 0.25, 0.00)	0.037	0.117	0.043	0.112	0.060	0.107
(2.0, 1.0, 0.25, 0.75, 0.00)	0.037	0.112	0.039	0.108	0.054	0.108
(2.0, 1.0, 0.50, 0.25, 0.00)	0.026	0.077	0.034	0.109	0.055	0.109
(2.0, 1.0, 0.75, 0.25, 0.00)	0.034	0.116	0.045	0.107	0.056	0.105

Table 7. Simulation results for the probability of type I error for $a_{1} = 5$ and $a_{2} = 5$.

$\theta$	$n = 30$		$n = 50$		$n = 70$	
	$\alpha = 0.05$	$\alpha = 0.1$	$\alpha = 0.05$	$\alpha = 0.1$	$\alpha = 0.05$	$\alpha = 0.1$
(1.0, 0.8, 0.10, 0.20, 0.00)	0.017	0.035	0.032	0.065	0.050	0.089
(1.0, 0.8, 0.25, 0.25, 0.00)	0.027	0.077	0.034	0.081	0.043	0.084
(1.0, 0.8, 0.50, 0.20, 0.00)	0.030	0.086	0.042	0.087	0.048	0.104
(1.0, 0.8, 0.50, 0.50, 0.00)	0.013	0.069	0.030	0.076	0.045	0.105
(1.5, 1.0, 0.50, 0.50, 0.00)	0.016	0.063	0.035	0.078	0.046	0.087
(1.5, 1.0, 0.50, 0.75, 0.00)	0.019	0.085	0.061	0.089	0.054	0.094
(1.5, 1.0, 0.75, 0.25, 0.00)	0.031	0.071	0.053	0.102	0.047	0.098
(1.5, 1.0, 1.00, 0.25, 0.00)	0.037	0.086	0.049	0.104	0.052	0.102
(2.0, 1.0, 0.25, 0.75, 0.00)	0.015	0.087	0.057	0.098	0.055	0.101
(2.0, 1.0, 0.75, 0.25, 0.00)	0.040	0.097	0.054	0.102	0.053	0.102

Table 8. Simulation results for the power. The values are in the form of percentages, rounded to the nearest integer.

Alternative	$V_{(0, 0)}$	$V_{(1, 0)}$	$V_{(1, 1)}$	$V_{(1, 5)}$	$V_{(5, 5)}$
$BB(1; 0.41, 0.02, 0.01)$	87	81	89	81	85
$BB(1; 0.41, 0.03, 0.02)$	85	82	88	80	86
$BB(2; 0.61, 0.01, 0.01)$	93	84	98	83	92
$BB(1; 0.61, 0.03, 0.02)$	95	89	100	87	95
$BB(2; 0.71, 0.01, 0.01)$	94	86	100	85	93
$BP(1.00, 1.00, 0.25)$	85	76	89	77	82
$BP(1.00, 1.00, 0.50)$	84	77	91	72	85
$BP(1.00, 1.00, 0.75)$	87	75	92	73	83
$BP(1.50, 1.00, 0.31)$	87	77	93	75	87
$BP(1.50, 1.00, 0.92)$	86	76	92	77	87
$BLS(0.25, 0.15, 0.10)$	94	85	98	86	95
$BLS(5d/7, d/7, d/7)^{*}$	91	85	100	84	90
$BLS(3d/4, d/8, d/8)^{*}$	90	86	100	84	90
$BLS(7d/9, d/9, d/9)^{*}$	94	86	100	83	93
$BLS(0.51, 0.01, 0.02)$	90	83	98	83	91
$BNB(1; 0.92, 0.97, 0.01)$	93	87	96	85	92
$BNB(1; 0.97, 0.97, 0.01)$	92	86	95	85	92
$BNB(1; 0.97, 0.97, 0.02)$	94	88	100	89	93
$BNB(1; 0.98, 0.98, 0.01)$	92	84	97	85	92
$BNB(1; 0.99, 0.99, 0.01)$	91	84	96	83	91
$BNTA(0.21; 0.01, 0.01, 0.98)$	93	86	98	85	92
$BNTA(0.24; 0.01, 0.01, 0.98)$	95	87	100	85	95
$BNTA(0.26; 0.01, 0.01, 0.97)$	93	85	97	86	93
$BNTA(0.26; 0.01, 0.01, 0.98)$	94	85	98	86	94
$BNTA(0.28; 0.01, 0.01, 0.97)$	93	86	96	86	94
$BPP(0.31; (0.2, 0.2, 0.1), (1.0, 1.0, 0.9))$	76	70	82	72	77
$BPP(0.31; (0.2, 0.2, 0.1), (1.0, 1.2, 0.9))$	77	71	84	71	76
$BPP(0.32; (0.2, 0.2, 0.1), (1.0, 1.0, 0.9))$	78	71	84	71	76
$BPP(0.33; (0.2, 0.2, 0.1), (1.0, 1.0, 0.9))$	78	70	85	70	77
$BPP(0.33; (0.2, 0.2, 0.1), (1.0, 1.1, 0.9))$	76	71	83	70	78

* d = 1 − exp(−1) ≈ 0.63212
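
To compare the weight choices in Table 8 at a glance, the power values can be averaged over the 30 alternatives for each column. The Python sketch below does this with the percentages copied from Table 8. The dictionary `power` and the averaging step are illustrative assumptions and are not taken from the paper; an unweighted mean over alternatives is only one possible summary.

```python
# Power values (percent) copied from Table 8, one list per column,
# in the row order of the table: BB, BP, BLS, BNB, BNTA, BPP (5 alternatives each).
power = {
    "V(0,0)": [87, 85, 93, 95, 94, 85, 84, 87, 87, 86, 94, 91, 90, 94, 90,
               93, 92, 94, 92, 91, 93, 95, 93, 94, 93, 76, 77, 78, 78, 76],
    "V(1,0)": [81, 82, 84, 89, 86, 76, 77, 75, 77, 76, 85, 85, 86, 86, 83,
               87, 86, 88, 84, 84, 86, 87, 85, 85, 86, 70, 71, 71, 70, 71],
    "V(1,1)": [89, 88, 98, 100, 100, 89, 91, 92, 93, 92, 98, 100, 100, 100, 98,
               96, 95, 100, 97, 96, 98, 100, 97, 98, 96, 82, 84, 84, 85, 83],
    "V(1,5)": [81, 80, 83, 87, 85, 77, 72, 73, 75, 77, 86, 84, 84, 83, 83,
               85, 85, 89, 85, 83, 85, 85, 86, 86, 86, 72, 71, 71, 70, 70],
    "V(5,5)": [85, 86, 92, 95, 93, 82, 85, 83, 87, 87, 95, 90, 90, 93, 91,
               92, 92, 93, 92, 91, 92, 95, 93, 94, 94, 77, 76, 76, 77, 78],
}

# Unweighted mean power per column (an illustrative summary, not the authors').
mean_power = {name: sum(values) / len(values) for name, values in power.items()}

# Columns ordered from highest to lowest mean power.
ranking = sorted(mean_power, key=mean_power.get, reverse=True)
best = ranking[0]

for name in ranking:
    print(f"{name}: mean power = {mean_power[name]:.1f}%")
```

With these values the $V_{(1, 1)}$ column has the highest average power, and it is also the largest entry in every individual row of the table.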

Table 9. Real data: number of accidents X in one period and number of accidents Y in another period.

Y \ X	0	1	2	3	4	5	6	7	Total
0	117	96	55	19	2	2	0	0	291
1	61	69	47	27	8	5	1	0	218
2	34	42	31	13	7	2	3	0	132
3	7	15	17	7	3	1	0	0	49
4	3	3	1	1	2	1	1	1	13
5	2	1	0	0	0	0	0	0	3
6	0	0	0	0	1	0	0	0	1
7	0	0	0	1	0	0	0	0	1
Total	224	226	150	68	23	11	5	1	708
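
Before fitting any model to Table 9, it is worth confirming that the cell counts reproduce the printed margins. The Python sketch below performs that check on the counts exactly as printed above. The names `counts` and `margin_mismatches` are illustrative and not part of the paper. With the values as printed here, row Y = 3 and column X = 2 each sum to one more than their printed totals, which points to a one-unit discrepancy in the cell at Y = 3, X = 2. The sketch only reports the mismatch and does not alter the data.

```python
# Cell counts as printed in Table 9: rows are Y = 0..7, columns are X = 0..7.
counts = [
    [117, 96, 55, 19, 2, 2, 0, 0],
    [61, 69, 47, 27, 8, 5, 1, 0],
    [34, 42, 31, 13, 7, 2, 3, 0],
    [7, 15, 17, 7, 3, 1, 0, 0],
    [3, 3, 1, 1, 2, 1, 1, 1],
    [2, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
]
printed_row_totals = [291, 218, 132, 49, 13, 3, 1, 1]   # "Total" column
printed_col_totals = [224, 226, 150, 68, 23, 11, 5, 1]  # "Total" row
printed_grand_total = 708


def margin_mismatches(cells, row_totals, col_totals):
    """Return the row (Y) and column (X) indices whose sums differ from the printed totals."""
    row_sums = [sum(row) for row in cells]
    col_sums = [sum(col) for col in zip(*cells)]
    bad_rows = [i for i, (s, t) in enumerate(zip(row_sums, row_totals)) if s != t]
    bad_cols = [j for j, (s, t) in enumerate(zip(col_sums, col_totals)) if s != t]
    return bad_rows, bad_cols


bad_rows, bad_cols = margin_mismatches(counts, printed_row_totals, printed_col_totals)
cell_sum = sum(sum(row) for row in counts)

print("Y rows whose sum differs from the printed total:", bad_rows)
print("X columns whose sum differs from the printed total:", bad_cols)
print("sum of all cells:", cell_sum, "| printed grand total:", printed_grand_total)
```

The printed row totals and the printed column totals both add up to 708, so the margins are consistent with each other; only the single cell disagrees with them.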

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
