1. Introduction
In [1], Theorem 6.3, we derived a general Karhunen–Loève (KL) representation valid for Jacobi, Laguerre, and Hermite polynomials (see below for some basic definitions concerning the KL representation of a centred Gaussian process and the associated KL expansion of its covariance function with respect to a positive measure). In the particular case of Jacobi polynomials, it implies that, given
and
, the KL expansion
holds, with
. Function
is the weight associated with Jacobi polynomials (see [2], pp. 8–9),
and
its integrals, these three functions being given by
The eigenvalues appearing in (1) are given by
the squared norms by
and
is the
th Jacobi polynomial. Jacobi polynomials satisfy orthogonality relations
where the Kronecker delta is defined by δ_{m,n} = 1 if m = n, and 0 otherwise.
Jacobi polynomials also satisfy the differential equations
For these standard identities, see, e.g., formulae (1.2.1), p. 3, (1.3.4), p. 8, §1.4.1.1, p. 9, and Table 1.1, p. 11 in [2], or formulae (9.8.2), (9.8.6), and (9.8.10), pp. 217–218 in [3].
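These orthogonality relations are straightforward to check numerically. The sketch below is only an illustration: the parameter values α = 1, β = 2 and the tolerances are our own choices, and the closed-form squared norm used is the standard one, h_n = 2^{α+β+1} Γ(n+α+1) Γ(n+β+1) / ((2n+α+β+1) Γ(n+α+β+1) n!).

```python
# Numerical check of Jacobi orthogonality; alpha = 1, beta = 2 chosen for illustration.
from math import gamma

from scipy.integrate import quad
from scipy.special import eval_jacobi

alpha, beta = 1.0, 2.0
w = lambda x: (1 - x) ** alpha * (1 + x) ** beta  # Jacobi weight on (-1, 1)

def inner(m, n):
    # <P_m, P_n> with respect to the Jacobi weight
    return quad(lambda x: w(x) * eval_jacobi(m, alpha, beta, x)
                          * eval_jacobi(n, alpha, beta, x), -1, 1)[0]

def h(n):
    # standard squared norm of the n-th Jacobi polynomial
    s = alpha + beta
    return (2 ** (s + 1) / (2 * n + s + 1)
            * gamma(n + alpha + 1) * gamma(n + beta + 1)
            / (gamma(n + s + 1) * gamma(n + 1)))

off_diag = inner(1, 2)   # should vanish (orthogonality)
norm_sq = inner(1, 1)    # should match h(1) = 4/3 for these parameters
```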
In view of the equalities
these functional and numerical elements associated with Jacobi polynomials enable us to express the probability density function (p.d.f.)
, the distribution function
, as well as its tail
, associated with Jacobi polynomials as
In the particular cases
and
, development (1) reduces, up to a change of variables (see details below in
Section 8.1), to expansions
for
, where the standard notation
is used for the Legendre polynomial. These two expansions are familiar to statisticians as they play a key role in the study of Cramér–von Mises (CvM) and Anderson–Darling (AD) statistics; see Proposition 1, pp. 213–214, and Theorem 1, p. 225, in [4].
This remark motivates the contents of our paper, which is organised as follows:
In Section 2, we recall some basic facts about Cramér–von Mises statistics, highlighting those we wish to extend to some of their discrete analogues involving Hahn polynomials and the hypergeometric distribution. In particular, Proposition 1 states a new result concerning the optimal local asymptotic Bahadur efficiency of a sub-family of these statistics, including the cases of the Cramér–von Mises and Anderson–Darling statistics.
In Section 3, we introduce the classical hypergeometric distribution
and the associated Hahn polynomials. We define a weighted discrete Brownian bridge process
associated with this distribution.
In Section 4, Proposition 3 gives the KL expansion of the covariance function of this process in terms of Hahn polynomials, a discrete analogue of (1) and (6).
In Section 5, we introduce our new statistic
, defined either as a discrete weighted Cramér–von Mises statistic or as a degenerate V-statistic, with a kernel whose KL expansion is given.
In Section 6, we provide in Theorem 1 some of the properties of
. In particular, Proposition 5 states a result about the probability of a large deviation, a key result for the subsequent study of Bahadur efficiency.
In Section 7, we study some properties of our statistic under a general alternative, and Theorem 2 states its local asymptotic Bahadur optimality under a more particular alternative hypothesis, the latter appearing as a perturbation of the hypergeometric distribution by the first non-constant Hahn polynomial. This result is a discrete analogue, for the hypergeometric distribution, of Proposition 1 for some Beta distributions.
Some proofs and various required formulas are postponed to Section 8. For the sake of simplicity, we will in some places omit superscripts in proofs.
2. Cramér–Von Mises and Anderson–Darling Statistics Revisited
The usefulness of orthogonal expansions such as (5)–(6) is well known in the field of statistics; see [4], Chapter 5, and the numerous references therein about this topic. In particular, recall that KL expansions (5)–(6) yield the equalities in law, or KL representations,
where
is a Brownian bridge process, i.e., a centred Gaussian process with covariance function min(s, t) − st, and
are independent standard normal random variables. Let us now introduce some terminology about KL expansions and representations. Let
be a measure space and
be the associated Hilbert space of real, square-integrable functions endowed with the inner product
By a
-KL expansion of the bivariate symmetric kernel
, we mean a pointwise convergent series of the form
where the sequences of eigenfunctions
, eigenvalues
, and squared norms
satisfy the integral and orthogonality relations
Recall that the entire law of a centred Gaussian process is determined by its covariance function (see [5], p. 2). Therefore, if
is a centred Gaussian process with covariance function
then a corollary of (7) are the equalities in law
and
In view of the orthogonality relations (8), we call (9) the
KL representation associated with the
KL expansion (7).
Now, assume that
and let
be endowed with a positive continuous p.d.f.
. Consider a kernel K admitting the expression and the
-KL expansion
for some weight function
. Then, K is the covariance function of the centred Gaussian process
, which, consequently, admits the
-KL representation
In this framework, by setting
and using the equality
, then in view of (2) and (4), expansion (1) can be seen as an
-KL expansion associated with the
-KL representation
In this case, the corollary of representation (15) is the equality in law
These developments and identities appear as asymptotic statistical properties of some goodness-of-fit tests in the following way:
Assume the sequence of independent and identically distributed (i.i.d.) random variables taking values in is drawn from a population with positive continuous p.d.f. . Assume we wish to test the null hypothesis against the alternative . Let us use the -KL expansion (11) to build a test suited to this aim, based on a certain statistic associated with , say .
The
-empirical process associated with our sample
is
where
is the Heaviside function
so that
is nothing else but the empirical distribution function.
Note for further use that under
, we have for
,
Under
, the convergence in law (denoted by the sign ⇒) and the equality
hold. Note for further reference that our statistic
equals
defined by [6], (2.1.9), p. 41. Also recall that given the square summable bivariate kernel
, the statistic
is called a von Mises functional statistic, or a V-statistic (see [7], p. 39, [8], Exercise 10, p. 172, or [9], §5.1.2, pp. 174–175). If, furthermore, the so-called degeneracy condition
holds, then the V-statistic is said to be degenerate with respect to
(see [8], §12.3, or [7], §4.3). Now, the second and first equalities, in (16) and in (17), respectively, enable us to write
in other words
is a V-statistic with kernel
In view of the equality
Fubini’s theorem implies that
which means that
is a degenerate V-statistic, with respect to
.
The asymptotic distribution of
under
being given by (17), we know from the theory of degenerate V-statistics that the
-KL expansion of L, involving the eigenvalues appearing in (17), will be of the form
The statistical interest of this spectral expansion of L stems from the limit theorem satisfied by degenerate V-statistics, which implies in our case that if
whereas if
Therefore, only if
does not obtain, will
tend to infinity almost surely, and heuristically, the speed of divergence will increase with the value of
. If
is the maximal eigenvalue, the statistic
is, therefore, expected to be suited for detecting alternatives of the form
where
is small enough to ensure that
is a well-defined p.d.f.
The statistic appearing in (17) is referred to as a (continuous) weighted Cramér–von Mises statistic. The original Cramér–von Mises statistic as well as the Anderson–Darling statistic correspond to the uniform p.d.f. for , and weight functions and , respectively. Therefore, they are usually discussed as tests for uniformity over .
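As an illustration of the original (unweighted) Cramér–von Mises statistic just mentioned, here is a minimal sketch (our own code, not the paper's) using its standard computing formula ω² = 1/(12n) + Σᵢ (U₍ᵢ₎ − (2i − 1)/(2n))², where U₍₁₎ ≤ … ≤ U₍ₙ₎ are the ordered observations in [0, 1].

```python
# Classical Cramer-von Mises statistic via its standard computing formula.
import numpy as np

def cramer_von_mises(u):
    """omega^2 = 1/(12n) + sum_i (u_(i) - (2i-1)/(2n))^2 for a sample u in [0, 1]."""
    u = np.sort(np.asarray(u, dtype=float))
    n = len(u)
    i = np.arange(1, n + 1)
    return 1.0 / (12 * n) + np.sum((u - (2 * i - 1) / (2 * n)) ** 2)

# A sample placed exactly at the "ideal" uniform positions attains the minimum 1/(12n).
stat = cramer_von_mises([1 / 8, 3 / 8, 5 / 8, 7 / 8])   # n = 4, minimum value 1/48
```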
Now, our KL expansions (1) enable us to interpret the CvM and AD statistics in another way. To this end, consider again the family of weighted CvM statistics with p.d.f and weight function given for
by
and
denoting the
-empirical process defined above by (16). The corresponding statistic is
CvM and AD statistics corresponding to the cases
and
, respectively. They are now associated with the arcsine law
and the uniform distribution
, respectively.
The latter association raises the following question: Has
some optimal efficiency for some goodness-of-fit test of the null hypothesis
against a specific alternative, and if so, can Jacobi polynomials play a key role in the mathematical study of the efficiency of this test? Among the different measures of efficiency (see [6], Introduction), we choose the Bahadur efficiency, so let us recall some facts before addressing our question.
In the theory of Bahadur efficiency, the central result is [10], Theorem 7.5. Recall its underlying general principles. Assume
is a sequence of i.i.d. random variables following a distribution determined by a parameter, say
. Let
. The efficiency of a test based on the rejection of the simple hypothesis
, against the alternative
, for large values of the statistic
, is measured by the magnitude of a positive coefficient called the slope of the sequence
, denoted by
. High values of
correspond to a good efficiency of the test. The main way to compute
is provided by [10], Theorem 7.2, p. 29. Furthermore, an upper bound for
is
, where
is the Kullback–Leibler information number ([10], Theorem 7.5, p. 29). The test is asymptotically optimal whenever this upper bound is reached, or locally asymptotically optimal if
The difficult part in the determination of Bahadur efficiency is always the determination of some large deviation probability under
. In this respect, KL expansions can play a determinant role in some cases, including the cases we will deal with. The large deviation probability to be determined is
under
.
KL expansions make it possible to compute the exact slope in the case where it coincides with the so-called approximate exact slope, which means that the large deviation problem associated with
may be replaced by the large deviation problem associated with
given by (21). The solution of the latter problem is provided by a classical result from Zolotarev (see [6], (1.2.17), p. 10, and pp. 75–76, in particular, Table 2, p. 76) stating that
Heuristically, we may expect that if
then the asymptotic relation
will hold.
This is true in the particular case of the Anderson–Darling statistic
, and also in the case of
, provided that
is summable. The latter condition is
When
is not summable, no result is available. These facts justify the hypotheses about
and
in the following:
Proposition 1. - (i)
Assume that and that we wish to test Then, the goodness-of-fit test based on the Cramér–von Mises statistic is locally asymptotically Bahadur optimal, as . - (ii)
Assume that , or , and that we wish to test Then, the goodness-of-fit test based on the rejection of for large values of is locally asymptotically optimal in the sense of Bahadur, as .
Proof. We will use [6], Table 2, p. 76, with
from this reference identified with
from (11).
(i) One has, for
,
On the one hand, the exact slope is given by
On the other hand, the Kullback–Leibler information number is given by
so that
, and the result is established.
(ii) First, note that
so that under our assumptions
is a well-defined p.d.f. We will use (4) repeatedly. On the one hand, we have, using [11], 22.13.1, p. 785,
The exact slope is given by
On the other hand, the Kullback–Leibler number is given by
Therefore, as
,
and the result is established. □
Let us now show that these results involving Jacobi polynomials have analogues for Hahn polynomials.
3. A Discrete Brownian Bridge Associated with the Hypergeometric Distribution
The discrete analogues of (17) are called discrete Cramér–von Mises statistics and were introduced and discussed only quite recently by [12,13].
The elements appearing in (1), as well as (3), have their counterpart in all other families of classical orthogonal polynomials, continuous (Laguerre and Hermite), or discrete (Charlier, Hahn, Krawtchouk, and Meixner). It is, therefore, tempting to introduce the formal counterpart of (1) for each of these families, and the associated weighted Cramér–von Mises statistic.
Such statistics were discussed in [14], but only in the continuous case (Jacobi, Laguerre, and Hermite polynomials).
Let N be a positive integer and let be a probability mass function (p.m.f.) supported by .
Assume the sequence of independent and identically distributed random variables
taking values in
is drawn from a population with positive p.m.f.
. Assume we wish to test the simple null hypothesis
where
is the classical hypergeometric distribution given below by (25), against the alternative
.
Following [15],
, p. 259, consider for
, the
classical hypergeometric probability mass function (p.m.f.) given by
The associated cumulative distribution function (c.d.f.) and its tail are given by
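The hypergeometric p.m.f., c.d.f., and tail introduced above are available in standard software. The following sketch uses scipy.stats.hypergeom; note that scipy's parametrization (M = population size, K = marked items, n = number of draws) need not match the notation of (25).

```python
# The classical hypergeometric p.m.f., c.d.f., and tail via scipy.stats.hypergeom.
# Illustrative parameter values; scipy's parametrization may differ from the paper's.
from scipy.stats import hypergeom

M, K, n = 20, 7, 12          # population size, marked items, draws
rv = hypergeom(M, K, n)

total = sum(rv.pmf(k) for k in range(n + 1))   # p.m.f. sums to 1 over the support
mean = rv.mean()                               # equals n*K/M = 4.2 here
tail = rv.sf(4)                                # P(X > 4), the tail at 4
```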
In the present paper, Hahn polynomials, denoted by
for
, will be those denoted by
in [3], §9.5 (see formulas (1.4.1), p. 5, and (9.5.1), p. 204) with
, or also
in [2] (see the first equality, p. 53, and Table 2.4, p. 54), so that
where the Pochhammer symbol
is defined to be (a)_k = a(a + 1)⋯(a + k − 1) for k ≥ 1, with (a)_0 = 1.
These Hahn polynomials satisfy, for
, the orthogonality relations
See a proof in Section 8.
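These orthogonality relations can be checked numerically straight from the ₃F₂ definition of Hahn polynomials. In the sketch below, the parameters α = β = 1 and N = 5 are illustrative choices of ours (the paper's parameters are tied to the hypergeometric distribution), and C(α + x, x) C(β + N − x, N − x) is the standard Hahn weight.

```python
# Numerical check of Hahn polynomial orthogonality:
# Q_n(x; a, b, N) = 3F2(-n, n+a+b+1, -x; a+1, -N; 1), with weight
# w(x) = C(a+x, x) * C(b+N-x, N-x) on {0, ..., N}.
from math import comb, factorial

def poch(z, k):
    # rising factorial (z)_k = z (z+1) ... (z+k-1)
    r = 1
    for i in range(k):
        r *= z + i
    return r

def hahn(n, x, a, b, N):
    # terminating 3F2 sum defining the Hahn polynomial Q_n
    return sum(poch(-n, k) * poch(n + a + b + 1, k) * poch(-x, k)
               / (poch(a + 1, k) * poch(-N, k) * factorial(k))
               for k in range(n + 1))

a, b, N = 1, 1, 5
w = [comb(a + x, x) * comb(b + N - x, N - x) for x in range(N + 1)]

dot = lambda m, n: sum(w[x] * hahn(m, x, a, b, N) * hahn(n, x, a, b, N)
                       for x in range(N + 1))

off = dot(1, 2)   # should vanish by orthogonality
nrm = dot(2, 2)   # strictly positive squared norm
```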
Hahn polynomials also satisfy difference equations
for
, with
(see [2], (2.1.18), p. 21), and where for any function
, the forward and backward shift operators are defined by
For a two-variable function, a subscript will indicate the variable upon which these operators act, e.g.,
The first few Hahn polynomials with their (discrete) derivatives are given by
In order to avoid problems at the endpoints 0 and N when dealing with difference operators, any function
will be extended, over
, to a function also denoted f, in a way that will be specified when needed. Note, for such a non-null function f, the fundamental property, valid for
,
since the sequence
forms a complete set of solutions to the eigenvalue problem
Let us introduce the scalar products
Note the identity
In this setting, noticing that
, the orthogonality relations (28) take the form
The orthonormalized sequence of Hahn polynomials with respect to
is, therefore, given by
Given a Brownian bridge process
, we define a discrete Brownian bridge process
by setting
The process is Gaussian centred, with covariance kernel
We will now give the
-KL representation and expansion of this process and its covariance function.
5. A Family of Discrete Cramér–Von Mises Statistics
Let
be a sample of size
from a population whose distribution, with support
, has p.m.f. and c.d.f. denoted by
and
. The observed frequencies associated with our sample are
the empirical p.m.f. and c.d.f. being denoted and given by
respectively. Let E denote the expectation operator under the null hypothesis
For
, one can associate with
the random
vector
, whose components are the random variables
For
, we clearly have
These relations imply, in turn,
If
, then
and
are independent, so that
By analogy with the uniform empirical process in the continuous case
, consider the
-empirical process
and the weighted empirical process defined over
by
Proposition 4. One has, under , for , Proof. The first equality is straightforward. The second equality follows, keeping (45) in mind, from (56)–(60), combined with definition (62). □
The discrete Cramér–von Mises statistic, associated with our empirical process, is the non-negative number, say
, defined by
The statistic
has to be thought of as a test statistic, large values of
being significant, i.e., leading to the rejection of
. For computations, one can use the equality
to be compared with the widely used chi-square statistic
Remark 1. It is well known that the chi-square statistic provides a formula for testing the fit of a sample to any distribution, say ω, over . One has only to replace by ω in . The same property holds for our statistic defined by formula , in which and should be replaced by ω and Ω. The difference is that our weight remains unchanged and the choice of μ is arbitrary.
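To make the comparison between the two statistics concrete, here is an illustrative sketch of ours (it does not reproduce the paper's exact weighted statistic): it implements the generic discrete Cramér–von Mises form W² = n Σⱼ pⱼ (F̂(j) − F(j))², in the spirit of [12,13], alongside the chi-square statistic.

```python
# Generic discrete Cramer-von Mises statistic W^2 = n * sum_j p_j (Fhat_j - F_j)^2
# versus the chi-square statistic, for a hypothesized p.m.f. p on {0, ..., N}.
import numpy as np

def discrete_cvm(sample, p):
    sample = np.asarray(sample)
    n = len(sample)
    F = np.cumsum(p)                                          # hypothesized c.d.f.
    Fhat = np.array([(sample <= j).mean() for j in range(len(p))])  # empirical c.d.f.
    return n * np.sum(p * (Fhat - F) ** 2)

def chi_square(sample, p):
    sample = np.asarray(sample)
    n = len(sample)
    phat = np.array([(sample == j).mean() for j in range(len(p))])  # empirical p.m.f.
    return n * np.sum((phat - p) ** 2 / p)

p = np.array([0.25, 0.5, 0.25])
perfect = [0, 1, 1, 2]   # empirical p.m.f. equals p exactly: both statistics vanish
skewed = [0, 0, 0, 2]    # a clearly discrepant sample: both statistics are positive
```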
Just as in the continuous case, by using (62), our statistic can be seen as the degenerate V-statistic
with kernel
the degeneracy with respect to the p.m.f.
being obtained by linearity and the equality
7. Exact Slope under and Local Asymptotic Bahadur Optimality under the Alternative
Let us apply Bahadur's fundamental result ([10], §7) to the sequence of statistics
.
Given the null hypothesis (55), we will first consider an alternative hypothesis under which the distribution is a p.m.f., supported, as under , by , and denoted by , the associated c.d.f. being with , .
Let us state a first result, recalling that function f was defined above in Proposition 5. Note that (77) ensures that is consistent against all alternatives. A row vector will be denoted by , the zero vector by . Note that (43) implies that all p.m.f. over can be written in the form (76) below.
Proposition 6. If the alternative hypothesis
holds, then the convergence in probability
takes place. Furthermore, the exact slope of satisfies
Proof. We will use the abbreviation
. The first equality in (77) follows from the law of large numbers applied to (65). As for the second equality, let us use the definition of
as a
V-statistic with kernel
L. Setting
, we can rewrite
as
First, note the identities
Then, the law of large numbers applied to
V-statistics implies that, as
,
Then, (74) allows us to use [10], Theorem 7.2, and conclude that (78) holds. □
Let us use the notation
whenever
and
, and in this case
Recall that the Kullback–Leibler information number (introduced by [19]) of
and
is defined in the discrete case by
and that function
f was introduced in Proposition 5.
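The quadratic small-θ behaviour of the Kullback–Leibler number under a perturbation ρ_θ = ρ(1 + θq), with q centred under ρ, is easy to check numerically. In the sketch below, the base p.m.f. and the perturbation direction are illustrative choices of ours, not the paper's hypergeometric/Hahn pair.

```python
# For a perturbed p.m.f. rho_theta = rho * (1 + theta * q) with E_rho[q] = 0,
# K(rho_theta, rho) = sum_x rho_theta(x) * log(rho_theta(x) / rho(x))
# behaves like theta^2 * E_rho[q^2] / 2 as theta -> 0.
import numpy as np

rho = np.array([0.2, 0.5, 0.3])   # illustrative base p.m.f.
x = np.arange(3)
q = x - np.dot(rho, x)            # centred direction: E_rho[q] = 0

theta = 0.01
rho_t = rho * (1 + theta * q)     # still a p.m.f. for small enough theta

kl = np.sum(rho_t * np.log(rho_t / rho))
quad_approx = 0.5 * theta ** 2 * np.dot(rho, q ** 2)
```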
Theorem 2. If the alternative hypothesis
holds for some , then
Therefore, the exact slope of satisfies
so that the statistic is locally asymptotically optimal in the sense of Bahadur, with respect to the statistical problem . Proof. The first result (80) is a particular case of (77). Then, (78), rewritten with replaced by given by (80), leads to the first equality and equivalence in (81).
Finally, Fisher information being at
given by
(see, e.g., Theorem 9.17 and Example 9.20 in [20]), the Kullback–Leibler divergence satisfies
(see
in [19]), and the last equivalence in (81) readily follows. □
9. Discussion and Future Research Directions
Hypergeometric distributions, some aspects of which are discussed in our paper, have various applications. See the already classical treatise [15], §6.9, for a basic list of references.
The fact that the use of hypergeometric distributions to model various problems remains an active field of research is illustrated by more recent references, such as [21,22] for theoretical aspects, as well as [23,24,25,26,27,28] for practical applications.
We have proved that most features of some Cramér–von Mises statistics can be derived from their connection with classical orthogonal polynomials. This approach enabled us, on the one hand, to state new results about the local asymptotic Bahadur optimality of the well-known Cramér–von Mises and Anderson–Darling statistics. On the other hand, similar properties were proved for the discrete case associated with Hahn polynomials and hypergeometric distributions.
A first natural direction for future work is to extend our results to the whole family of classical orthogonal polynomials, discrete or continuous, as well as their
q-analogues (see [2], Chapter 3, and [3], Part II, about this topic). The main difficulty lies in the issue of large deviation probabilities.
A second direction should be the use of simulation to check the power of such tests, compared to that of standard tests used by practitioners, such as the chi-square test. In this respect, one should identify the alternatives of interest in practice in different fields. Whereas common alternatives are mean shifts, our approach privileges alternatives associated with the first non-constant polynomial of the family associated with a distribution. In most cases, such an alternative may not reduce to the mean shift, but be of interest in a way that has to be determined.
A third, more difficult, direction of research would be to extend our results to the wider family of polynomials of the Askey–Wilson scheme. The random variables associated with these polynomials, as well as their various mutual relationships, via limit relations, were discussed in [29], where basic references concerning the Askey–Wilson scheme are given. Since only classical orthogonal polynomials satisfy a second-order differential equation, another property of orthogonal polynomials, such as, for instance, the three-term recurrence relation, should be used to carry out such a task.