On the Numbers of Particles in Cells in an Allocation Scheme Having an Even Number of Particles in Each Cell

Chuprunov, Alexey Nikolaevich; Fazekas, István

doi:10.3390/math10071099


by Alexey Nikolaevich Chuprunov ¹ and István Fazekas ²,*

¹ Faculty of Applied Mathematics, Physics and Information Technology, Chuvash State University, Universitetskaia Str. 38, 428015 Cheboksary, Russia

² Faculty of Informatics, University of Debrecen, Egyetem Square 1, 4032 Debrecen, Hungary

* Author to whom correspondence should be addressed.

Mathematics 2022, 10(7), 1099; https://doi.org/10.3390/math10071099

Submission received: 25 January 2022 / Revised: 16 March 2022 / Accepted: 21 March 2022 / Published: 29 March 2022

(This article belongs to the Special Issue Random Combinatorial Structures)


Abstract:

We consider the usual random allocation model of distinguishable particles into distinct cells in the case when there are an even number of particles in each cell. For inhomogeneous allocations, we study the numbers of particles in the first K cells. We prove that, under some conditions, this K-dimensional random vector with centralised and normalised coordinates converges in distribution to the K-dimensional standard Gaussian law. We obtain both local and integral versions of this limit theorem. The above limit theorem implies a $χ^{2}$ limit theorem, which leads to a $χ^{2}$-test. The parity bit method does not detect even numbers of errors in binary files; therefore, our model can be applied to describe the distribution of errors in those files. For the homogeneous allocation model, we obtain a limit theorem when both the number of particles and the number of cells tend to infinity. In that case, we prove convergence to the finite dimensional distributions of the Brownian bridge. This result also implies a $χ^{2}$-test. To handle the mathematical problem, we insert our model into the framework of Kolchin’s generalized allocation scheme.

Keywords:

random allocation; generalized allocation scheme; Poisson distribution; Gaussian distribution; limit theorem; local limit theorem; Brownian bridge; χ²-test

MSC:

60C05; 60F05; 62G10

1. Introduction and Notation

In this paper, we study the usual random allocation model.

The random variables

η_{1}, \dots, η_{N}

represent a non-homogeneous allocation scheme of n distinguishable particles into N distinct cells if their joint distribution has the form

P {η_{1} = k_{1}, \dots, η_{N} = k_{N}} = \frac{n!}{k_{1}! k_{2}! \dots k_{N}!} {(q_{1})}^{k_{1}} {(q_{2})}^{k_{2}} \dots {(q_{N})}^{k_{N}},

(1)

where

k_{1}, k_{2}, \dots, k_{N}

are non-negative integers with

k_{1} + k_{2} + \dots + k_{N} = n

,

q_{1} + q_{2} + \dots + q_{N} = 1

,

0 \leq q_{i} \leq 1

for

1 \leq i \leq N

. Here,

q_{i}

is the probability that the particle is inserted into the i^th cell, and the random variable

η_{i}

is the number of particles in the i^th cell after allocating n particles into the cells. When

q_{1} = q_{2} = \dots = q_{N} = \frac{1}{N}

, then scheme (1) is called a homogeneous allocation scheme of n distinguishable particles into N distinct cells. In [1], homogeneous and non-homogeneous allocation schemes of n distinguishable particles into N distinct cells were considered.

Our goal is to study allocations with an even number of particles in each cell. Thus, let

A_{2}

be the set of even non-negative integers, i.e.,

A_{2} = {2 k : k = 0, 1, 2, \dots}

; let

η_{1}, \dots, η_{N}

be the allocation scheme of

2 n

distinguishable particles into N different cells; and let

η_{1}^{'}, \dots, η_{N}^{'}

be the allocation scheme of

2 n

distinguishable particles into N different cells with an even number of particles in each cell. Then,

η_{1}^{'}, \dots, η_{N}^{'}

has the distribution

P {η_{1}^{'} = 2 k_{1}, \dots, η_{N}^{'} = 2 k_{N}}

= P \{η_{1} = 2 k_{1}, \dots, η_{N} = 2 k_{N} |η_{i} \in A_{2}, 1 \leq i \leq N\},

(2)

where

k_{1}, k_{2}, \dots, k_{N}

are non-negative integer numbers, such that

2 k_{1} + 2 k_{2} + \dots + 2 k_{N} = 2 n

.
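The conditional definition (2) can be illustrated by rejection sampling: allocate the particles as in (1) and keep the outcome only when every cell count is even. The following is a minimal sketch, not the authors' code. The function name `sample_even_allocation`, the seed, and the small parameter values are assumptions chosen for illustration.

```python
import random

def sample_even_allocation(two_n, q, rng):
    """Draw (eta'_1, ..., eta'_N): allocate two_n particles with cell
    probabilities q and retry until every cell count is even."""
    cells = range(len(q))
    while True:
        counts = [0] * len(q)
        # rng.choices draws the cell index of each particle independently
        for cell in rng.choices(cells, weights=q, k=two_n):
            counts[cell] += 1
        if all(c % 2 == 0 for c in counts):
            return counts

rng = random.Random(12345)      # fixed seed, for reproducibility only
q = [0.1, 0.2, 0.3, 0.4]        # hypothetical allocation probabilities
sample = [sample_even_allocation(40, q, rng) for _ in range(200)]
```

When 2n is large, the probability that all N counts are even is roughly 2^{-(N-1)}, so this direct approach is practical only for a moderate number of cells.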

To describe the results of the paper, we need the following notation.

\overset{d}{\to}

denotes the convergence in distribution.

γ

,

γ_{i}

,

i \in N

are independent, identically distributed Gaussian random variables with mean 0 and variance 1.

In [2], it was proved that

(\frac{η_{1} - n q_{1}}{\sqrt{n q_{1}}}, \dots, \frac{η_{K} - n q_{K}}{\sqrt{n q_{K}}}) \overset{d}{\to} (γ_{1}, \dots, γ_{K}),

(3)

if K is a fixed number and

N, n \to \infty

such that

q_{i} \to 0

,

n q_{i} \to \infty

for

1 \leq i \leq N

.

The first aim of this paper is to obtain an analogue of the above result for an allocation scheme of distinguishable particles into distinct cells having an even number of particles in each cell. We shall prove that, under some conditions,

(\frac{η_{1}^{'} - 2 n q_{1}}{\sqrt{2 n q_{1}}}, \frac{η_{2}^{'} - 2 n q_{2}}{\sqrt{2 n q_{2}}}, \dots, \frac{η_{K}^{'} - 2 n q_{K}}{\sqrt{2 n q_{K}}}) \overset{d}{\to} (γ_{1}, γ_{2}, \dots, γ_{K})

as

N, n \to \infty

, see Theorems 2 and 3.

A well-known fact is that the multinomial distribution (1) is asymptotically normal when N is fixed and

n \to \infty

. This result serves as a basis of the proof that the limit of the empirical process is the Brownian bridge, see [3]. In this paper, we shall study this problem for allocations having an even number of particles in each cell. Here, we introduce the following two random processes:

X_{2 n, N} (t) = \sum_{i = 1}^{[t N]} η_{i}^{'}, 0 \leq t \leq 1,

and

Y_{2 n, N} (t) = \frac{1}{\sqrt{2 n}} (\sum_{i = 1}^{[t N]} η_{i}^{'} - [t N] \frac{2 n}{N}) = \frac{1}{\sqrt{2 n}} (X_{2 n, N} (t) - [t N] \frac{2 n}{N}), 0 \leq t \leq 1 .

(4)

Observe that

X_{2 n, N} (0) = 0

and

X_{2 n, N} (1) = 2 n

,

Y_{2 n, N} (0) = Y_{2 n, N} (1) = 0

.
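As an illustration of definition (4), the sketch below evaluates $Y_{2 n, N} (t)$ on the grid $t = j / N$ for one observed vector of even cell counts. The helper name `y_process` and the sample counts are hypothetical; `Fraction` is used for t so that the integer part $[t N]$ is computed exactly.

```python
import math
from fractions import Fraction

def y_process(counts, t):
    """Y_{2n,N}(t) from (4) for observed even cell counts eta'_1..eta'_N."""
    big_n = len(counts)              # N, the number of cells
    two_n = sum(counts)              # 2n, the number of particles
    m = math.floor(t * big_n)        # [tN]
    partial_sum = sum(counts[:m])    # X_{2n,N}(t)
    return (partial_sum - m * two_n / big_n) / math.sqrt(two_n)

counts = [4, 0, 2, 6, 2, 2]          # hypothetical observation: 2n = 16, N = 6
grid = [Fraction(j, 6) for j in range(7)]
values = [y_process(counts, t) for t in grid]
```

The first and last entries of `values` are zero, in agreement with $Y_{2 n, N} (0) = Y_{2 n, N} (1) = 0$.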

The Gaussian random process $W_{0} (t)$, $0 \leq t \leq 1$, is called a Brownian bridge if its mean value function is 0 and its covariance function is $f (t_{1}, t_{2}) = t_{1} (1 - t_{2})$, $0 \leq t_{1} \leq t_{2} \leq 1$.

For the homogeneous allocation scheme, we shall prove that the finite dimensional distributions of $Y_{2 n, N}$ converge to the finite dimensional distributions of $W_{0}$ if $N, n \to \infty$ such that $\frac{2 n}{N} \to \infty$; see Theorem 4.

Both Theorems 3 and 4 imply

χ^{2}

-tests.

Our mathematical approach is based on the well-known notion of the generalized allocation scheme introduced by V. F. Kolchin in [4]. Thus, we recall the definition of the generalized allocation scheme. The random variables

η_{1}, \dots, η_{N}

obey the generalized allocation scheme of n particles into N cells, if their joint distribution has the form

P {η_{1} = k_{1}, \dots, η_{N} = k_{N}} = P \{ξ_{1} = k_{1}, \dots, ξ_{N} = k_{N} |\sum_{i = 1}^{N} ξ_{i} = n\},

(5)

for non-negative integer numbers

k_{1}, k_{2}, \dots, k_{N}

, such that

k_{1} + k_{2} + \dots + k_{N} = n

and for some independent non-negative integer valued random variables

ξ_{1}, ξ_{2}, \dots, ξ_{N}

.

The simplest particular case of the generalized allocation scheme is the usual allocation of particles into cells. Thus, let

ξ_{1}, ξ_{2}, \dots, ξ_{N}

be independent Poisson random variables with parameters

α q_{1}, α q_{2}, \dots, α q_{N}

for some

α > 0

and

\sum_{i = 1}^{N} q_{i} = 1

, then the generalized allocation scheme is a usual allocation scheme of n distinguishable particles into N different cells. In other words, a generalized allocation scheme defined by independent Poisson random variables

ξ_{1}, \dots, ξ_{N}

with parameters

α_{1}, α_{2}, \dots α_{N}

, is the usual allocation scheme of n distinguishable particles into N different cells, such that

q_{i} = \frac{α_{i}}{\sum_{j = 1}^{N} α_{j}}

,

1 \leq i \leq N

. Thus, in a certain general sense, we can consider the value

ξ_{i}

in Equation (5) as the number of particles in the i^th cell.
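The statement that independent Poisson variables conditioned on their sum reproduce the usual allocation (1) can be checked exactly for a small case. The sketch below is only an illustration; the parameter values are assumptions, and it uses the standard fact that a sum of independent Poisson variables is Poisson with the summed parameter.

```python
import math
from itertools import product

def multinomial_pmf(ks, q):
    """Probability (1) of the cell counts ks under allocation probabilities q."""
    coef = math.factorial(sum(ks))
    for k in ks:
        coef //= math.factorial(k)
    return coef * math.prod(qi ** k for qi, k in zip(q, ks))

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

def conditioned_poisson_pmf(ks, alphas):
    """Right-hand side of (5) for independent Poisson xi_i with parameters alphas."""
    joint = math.prod(poisson_pmf(k, a) for k, a in zip(ks, alphas))
    # the sum of independent Poisson variables is Poisson(sum of parameters)
    return joint / poisson_pmf(sum(ks), sum(alphas))

alphas = [0.5, 1.5, 3.0]                     # hypothetical Poisson parameters
q = [a / sum(alphas) for a in alphas]        # q_i = alpha_i / sum_j alpha_j
n = 5
max_gap = max(
    abs(multinomial_pmf(ks, q) - conditioned_poisson_pmf(ks, alphas))
    for ks in product(range(n + 1), repeat=len(alphas)) if sum(ks) == n
)
```

Here `max_gap` is at the level of floating-point rounding, so the two distributions agree on every outcome.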

In the original paper [4], Kolchin obtained the basic properties of the generalized allocation scheme; moreover, he proved limit theorems for the number of cells containing precisely r particles. In Equation (5), the distribution of the random variable

ξ_{1}

can be arbitrary. Fixing its distribution in various ways, several models of discrete probability theory, such as random forests, random permutations, random allocations, and urn schemes are obtained as particular cases of the generalized allocation scheme, see [5].

In our paper, we shall not use known limit theorems for the generalized allocation scheme; we shall only use the representation (7), which is a consequence of the generalized allocation scheme. To this end, we shall show that, when there are an even number of particles in each cell, the usual allocation can be described by a generalized allocation scheme in the following way. Let $ch (t) = \frac{e^{t} + e^{- t}}{2}$, $t \in R$, be the hyperbolic cosine function.

Theorem 1.

Let

η_{1}, \dots, η_{N}

be a generalized allocation scheme of

2 n

particles into N cells defined by independent Poisson random variables

ξ_{1}, \dots, ξ_{N}

with parameters

β_{1}, \dots, β_{N}

. Then,

η_{1}^{'}, \dots, η_{N}^{'}

defined by (2) can be represented as a generalized allocation scheme of

2 n

particles into N cells defined by the independent random variables

ξ_{1}^{'}, \dots, ξ_{N}^{'}

, with distributions

P {ξ_{i}^{'} = 2 k} = \frac{β_{i}^{2 k}}{(2 k)! ch (β_{i})}, k = 0, 1, 2 \dots, 1 \leq i \leq N .

That is

P {η_{1}^{'} = 2 k_{1}, \dots, η_{N}^{'} = 2 k_{N}} = P \{ξ_{1}^{'} = 2 k_{1}, \dots, ξ_{N}^{'} = 2 k_{N} |\sum_{i = 1}^{N} ξ_{i}^{'} = 2 n\},

(6)

for non-negative integer numbers

k_{1}, k_{2}, \dots, k_{N}

, such that

k_{1} + k_{2} + \dots + k_{N} = n

.

For identically distributed random variables

ξ_{1}, \dots, ξ_{N}

, Theorem 1 was proved in [6]. One can prove Theorem 1 using similar elementary calculations as in the proof given in [6].
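Theorem 1 can also be checked numerically for a small case by computing both sides of (6) from their definitions. The sketch below is not the proof from [6]; the function names and the parameters $β_{i}$ are assumptions, and the allocation probabilities are taken as $q_{i} = β_{i} / \sum_{j} β_{j}$.

```python
import math
from itertools import product

def multinomial_pmf(ks, q):
    """Probability (1) of the cell counts ks under allocation probabilities q."""
    coef = math.factorial(sum(ks))
    for k in ks:
        coef //= math.factorial(k)
    return coef * math.prod(qi ** k for qi, k in zip(q, ks))

def scheme_2(q, two_n):
    """Distribution (2): the allocation (1) conditioned on all counts being even."""
    outcomes = [ks for ks in product(range(two_n + 1), repeat=len(q))
                if sum(ks) == two_n and all(k % 2 == 0 for k in ks)]
    probs = {ks: multinomial_pmf(ks, q) for ks in outcomes}
    p_all_even = sum(probs.values())
    return {ks: p / p_all_even for ks, p in probs.items()}

def xi_prime_pmf(two_k, beta):
    """P{xi'_i = 2k} from Theorem 1 (math.cosh is the function ch)."""
    return beta ** two_k / (math.factorial(two_k) * math.cosh(beta))

def scheme_6(betas, two_n):
    """Right-hand side of (6): independent xi'_i conditioned on their sum."""
    outcomes = [ks for ks in product(range(0, two_n + 1, 2), repeat=len(betas))
                if sum(ks) == two_n]
    joint = {ks: math.prod(xi_prime_pmf(k, b) for k, b in zip(ks, betas))
             for ks in outcomes}
    p_sum = sum(joint.values())      # P{xi'_1 + ... + xi'_N = 2n}
    return {ks: p / p_sum for ks, p in joint.items()}

betas = [0.7, 1.3, 2.0]                      # hypothetical Poisson parameters
q = [b / sum(betas) for b in betas]
left, right = scheme_2(q, 6), scheme_6(betas, 6)
max_gap = max(abs(left[ks] - right[ks]) for ks in left)
```

The value of `max_gap` is at rounding level, which is consistent with the equality stated in (6).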

From Theorem 1 and (5), it follows that

P {η_{1}^{'} = 2 k_{1}, \dots, η_{K}^{'} = 2 k_{K}} = (\prod_{i = 1}^{K} P {ξ_{i}^{'} = 2 k_{i}}) \frac{P {\sum_{i = K + 1}^{N} ξ_{i}^{'} = 2 n - 2 k}}{P {\sum_{i = 1}^{N} ξ_{i}^{'} = 2 n}}

= (\prod_{i = 1}^{K} P {ξ_{i}^{'} = 2 k_{i}}) \frac{P {\sum_{i = K + 1}^{N} ξ_{i}^{*} = n - k}}{P {\sum_{i = 1}^{N} ξ_{i}^{*} = n}},

(7)

where

K \leq N

,

k = k_{1} + k_{2} + \dots + k_{K}

, and the independent random variables

ξ_{1}^{*}, \dots, ξ_{N}^{*}

have the distributions

P {ξ_{i}^{*} = k} = \frac{β_{i}^{2 k}}{(2 k)! ch (β_{i})}, k = 0, 1, 2 \dots, 1 \leq i \leq N .

Equation (7) plays a crucial role in our paper. The proof of Theorem 2 will be based on approximations of the fraction and of the product factors in (7).

The structure of our paper is as follows: In Section 2, further notation is given and the main results are presented. Theorem 2 is the integral version of the central limit theorem for the allocation scheme when each cell contains an even number of particles. Theorem 2 is given in terms of the generalized allocation scheme, but the underlying distribution is the Poisson distribution, so the result concerns the usual allocation scheme. However, the general setting is important because the proof uses the general framework given in Theorem 1. Corollary 1 is the local version of Theorem 2. Theorem 3 is a version of Theorem 2 that is more convenient for practical applications. Then, we turn to the homogeneous case and present Theorem 4, which states the convergence of the finite dimensional distributions to those of the Brownian bridge. In Section 3, two

χ^{2}

-tests are proposed. The first one tests the probabilities

q_{1}, \dots, q_{N}

, when the sample comes from the random allocation with an even number of particles in each cell. Then, we give a proposal to apply the

χ^{2}

-test to check binary files with parity bits. The second

χ^{2}

-test can be applied when we have observations only for the numbers of particles in some unions of the cells. Examples 3 and 4 offer numerical evidence for our limit theorems. In Section 4, some auxiliary results are given. In Section 5, the proofs of the main results are presented. For the proofs, we use both known approximation theorems and direct calculations.

We shall apply the following usual notation.

R

is the set of real numbers,

N

is the set of positive integers,

E

stands for the expectation, and

D^{2}

denotes the variance.

o (1)

is a quantity converging to 0.

f (x) = O (h (x))

if

f (x) / h (x)

is bounded as

x \to 0

.

2. Main Results

First, we study the non-homogeneous allocation scheme. Consider the scheme (6) and representation (7). Consider the generic random variable

ξ^{*} (β)

with parameter

β > 0

, having the distribution

P {ξ^{*} (β) = k} = \frac{β^{2 k}}{(2 k)! ch (β)}, k = 0, 1, 2 \dots .

(8)

Let

ξ_{1}^{*} = ξ^{*} (β_{1}), ξ_{2}^{*} = ξ^{*} (β_{2}), \dots, ξ_{N}^{*} = ξ^{*} (β_{N})

(9)

be independent random variables so that for any i,

ξ_{i}^{*} = ξ^{*} (β_{i})

has the distribution (8) with parameter

β = β_{i}

.

The expectation and the variance of

ξ^{*} (β)

(see Equations (21) and (25) below) are

m^{*} (β) = E (ξ^{*} (β)) = \frac{β}{2} th (β), σ^{* 2} (β) = D^{2} (ξ^{*} (β)) = \frac{β}{4} (1 + \frac{β}{{ch}^{2} (β)} - \frac{e^{- β}}{ch (β)}),

(10)

where

th (x) = \frac{e^{x} - e^{- x}}{e^{x} + e^{- x}}

is the hyperbolic tangent function. Therefore, the expectation and the variance of

S_{N}^{*} = \sum_{i = 1}^{N} ξ_{i}^{*}

are

m_{N}^{*} = E S_{N}^{*} = \frac{1}{2} \sum_{i = 1}^{N} β_{i} th (β_{i}), σ_{N}^{* 2} = D^{2} S_{N}^{*} = \frac{1}{4} \sum_{i = 1}^{N} β_{i} (1 + \frac{β_{i}}{{ch}^{2} (β_{i})} - \frac{e^{- β_{i}}}{ch (β_{i})}) .

(11)

In our main theorem, we will use the following condition: for some

C > 0

,

\frac{| n - m_{N}^{*} |}{σ_{N}^{*}} < C, min_{1 \leq i \leq N} β_{i} \to \infty, and \frac{{max}_{1 \leq i \leq N} β_{i}}{\sum_{i = 1}^{N} β_{i}} \to 0

(12)

as

n, N \to \infty

.

Our first main results in this paper are the following theorems:

Theorem 2.

Let

η_{1}^{'}, \dots, η_{N}^{'}

be defined by (2), where

η_{1}, \dots, η_{N}

are defined in (5), where

ξ_{1}, \dots, ξ_{N}

are independent Poisson random variables with the parameters

β_{1}, \dots, β_{N}

. Let condition (12) be valid. Then, we have

(\frac{η_{1}^{'} - β_{1}}{\sqrt{β_{1}}}, \frac{η_{2}^{'} - β_{2}}{\sqrt{β_{2}}}, \dots, \frac{η_{K}^{'} - β_{K}}{\sqrt{β_{K}}}) \overset{d}{\to} (γ_{1}, γ_{2}, \dots, γ_{K})

as

N, n \to \infty

.

During the proof of Theorem 2, we shall obtain the following local limit theorem.

Corollary 1.

Under the conditions of Theorem 2, if

N, n \to \infty

, then, we have

P (η_{1}^{'} = 2 k_{1}, \dots, η_{K}^{'} = 2 k_{K}) = (\prod_{i = 1}^{K} \frac{2}{\sqrt{2 π β_{i}}} (exp (- \frac{{(2 k_{i} - β_{i})}^{2}}{2 β_{i}}))) (1 + o (1))

(13)

uniformly for the values of

k_{i}

, such that

C_{1 i} < \frac{2 k_{i} - β_{i}}{\sqrt{β_{i}}} < C_{2 i}

,

1 \leq i \leq K

, for any fixed numbers

C_{1 i} < C_{2 i}

,

1 \leq i \leq K

.

In the following theorem,

q_{1}, q_{2}, \dots, q_{N}

will denote a discrete probability distribution depending on n and N.

Theorem 3.

Let

η_{1}^{'}, \dots, η_{N}^{'}

be the usual allocation scheme of

2 n

distinguishable particles into N different cells with an even number of particles in each cell. Assume that the allocation probabilities are

q_{1}, q_{2}, \dots, q_{N}

which depend on n and N. Suppose that, for some

C > 0

,

\sqrt{n} \sum_{i = 1}^{N} q_{i} e^{- 4 n q_{i}} < C, n min_{1 \leq i \leq N} q_{i} \to \infty and max_{1 \leq i \leq N} q_{i} \to 0

(14)

as

n, N \to \infty

. Then, we have

(\frac{η_{1}^{'} - 2 n q_{1}}{\sqrt{2 n q_{1}}}, \frac{η_{2}^{'} - 2 n q_{2}}{\sqrt{2 n q_{2}}}, \dots, \frac{η_{K}^{'} - 2 n q_{K}}{\sqrt{2 n q_{K}}}) \overset{d}{\to} (γ_{1}, γ_{2}, \dots, γ_{K})

as

N, n \to \infty

.

Theorem 3 can be obtained from Theorem 2 if we use

β_{i} = 2 n q_{i}

,

1 \leq i \leq N

.
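One way to see how condition (12) follows from condition (14) under this substitution is outlined below. It is a reader's sketch based on (11) and on the asymptotics $σ^{* 2} (β) = \frac{β}{4} (1 + o (1))$ as $β \to \infty$ from Lemma 2, not the authors' proof; it uses the identity $1 - th (x) = \frac{2 e^{- 2 x}}{1 + e^{- 2 x}}$.

```latex
\begin{align*}
\sum_{i=1}^{N} \beta_i &= 2n, \qquad
\frac{\max_{1 \le i \le N} \beta_i}{\sum_{i=1}^{N} \beta_i} = \max_{1 \le i \le N} q_i, \qquad
\min_{1 \le i \le N} \beta_i = 2n \min_{1 \le i \le N} q_i, \\
n - m_N^{*} &= \frac{1}{2} \sum_{i=1}^{N} \beta_i \bigl(1 - \operatorname{th}(\beta_i)\bigr)
  = n \sum_{i=1}^{N} q_i \, \frac{2 e^{-4 n q_i}}{1 + e^{-4 n q_i}}
  \le 2n \sum_{i=1}^{N} q_i e^{-4 n q_i}, \\
\sigma_N^{*2} &= \frac{1}{4} \sum_{i=1}^{N} \beta_i \, (1 + o(1)) = \frac{n}{2} \, (1 + o(1))
  \quad \text{when } \min_{1 \le i \le N} \beta_i \to \infty, \\
\frac{|n - m_N^{*}|}{\sigma_N^{*}} &\le 2\sqrt{2} \, \sqrt{n} \sum_{i=1}^{N} q_i e^{-4 n q_i} \, (1 + o(1)).
\end{align*}
```

Thus the boundedness of $\sqrt{n} \sum_{i = 1}^{N} q_{i} e^{- 4 n q_{i}}$ in (14) gives the first part of (12), while the remaining two parts of (14) correspond directly to the remaining two parts of (12).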

Now, we turn to the homogeneous allocation scheme; we assume that in (1), the parameters

q_{i}

are the same. If there are an even number of particles in each cell, then this allocation is described by Equation (6), and because of homogeneity, the random variables

ξ_{1}^{'}, \dots, ξ_{N}^{'}

, are independent and identically distributed with distribution

P {ξ_{i}^{'} = 2 k} = \frac{β^{2 k}}{(2 k)! ch (β)}, k = 0, 1, 2 \dots, 1 \leq i \leq N,

(15)

where

β > 0

. From (6), it follows that

\begin{matrix} P {X_{2 n, N} (t_{1}) = 2 k_{1}, X_{2 n, N} (t_{2}) - X_{2 n, N} (t_{1}) = 2 k_{2}} \\ = \frac{P \{\sum_{i = 1}^{[t_{1} N]} ξ_{i}^{'} = 2 k_{1}\} \cdot P \{\sum_{i = [t_{1} N] + 1}^{[t_{2} N]} ξ_{i}^{'} = 2 k_{2}\} \cdot P \{\sum_{i = [t_{2} N] + 1}^{N} ξ_{i}^{'} = 2 k_{3}\}}{P {ξ_{1}^{'} + \dots + ξ_{N}^{'} = 2 n}}, \end{matrix}

(16)

where

k_{1}, k_{2}, k_{3}

are non-negative integer numbers, such that

k_{1} + k_{2} + k_{3} = n

. We shall need this formula in the proof of our Theorem 4.

Theorem 4.

Let

Y_{2 n, N}

be defined in (4). Assume that in (6), the random variables

ξ_{1}^{'}, \dots, ξ_{N}^{'}

are independent and identically distributed with distribution (15). Let

N, n \to \infty

, such that

\frac{2 n}{N} \to \infty

. Then, the finite dimensional distributions of

Y_{2 n, N}

converge to the finite dimensional distributions of the Brownian bridge

W_{0}

.

The idea of the proof for the particular case of two-dimensional distributions is the following. Let

0 < t_{1} < t_{2} < 1

. The vector of two increments of the Brownian bridge

W_{0}

(W_{0} (t_{1}) - W_{0} (0), W_{0} (t_{2}) - W_{0} (t_{1})) = (W_{0} (t_{1}), W_{0} (t_{2}) - W_{0} (t_{1}))

has the covariance matrix

Σ = (\begin{matrix} t_{1} (1 - t_{1}) & t_{1} (t_{1} - t_{2}) \\ t_{1} (t_{1} - t_{2}) & (t_{2} - t_{1}) (1 - (t_{2} - t_{1})) \end{matrix}) .

(17)

The determinant of

Σ

is

| Σ | = t_{1} (1 - t_{1}) (t_{2} - t_{1}) (1 - (t_{2} - t_{1})) - {(t_{1} (t_{1} - t_{2}))}^{2} = t_{1} (t_{2} - t_{1}) (1 - t_{2})

and its inverse is

Σ^{- 1} = (\begin{matrix} \frac{1 - (t_{2} - t_{1})}{t_{1} (1 - t_{2})} & \frac{1}{1 - t_{2}} \\ \frac{1}{1 - t_{2}} & \frac{1 - t_{1}}{(t_{2} - t_{1}) (1 - t_{2})} \end{matrix}) .

During the proof of Theorem 4, we shall show that the distribution of the vector

(Y_{2 n, N} (t_{1}) - Y_{2 n, N} (0), Y_{2 n, N} (t_{2}) - Y_{2 n, N} (t_{1}))

converges to the two-dimensional Gaussian distribution with a mean of 0 and covariance matrix

Σ

.
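The entries of the matrix in (17), its determinant, and its inverse can be verified numerically from the Brownian bridge covariance $t_{1} (1 - t_{2})$, $t_{1} \leq t_{2}$. The following sketch does this for one pair of time points; the values of `t1` and `t2` and the helper names are arbitrary choices for illustration.

```python
def bridge_cov(s, t):
    """Covariance of W_0(s) and W_0(t) for the Brownian bridge."""
    s, t = min(s, t), max(s, t)
    return s * (1 - t)

def increment_covariance(t1, t2):
    """Covariance matrix of (W_0(t1), W_0(t2) - W_0(t1)), built by bilinearity."""
    a = bridge_cov(t1, t1)
    b = bridge_cov(t1, t2) - bridge_cov(t1, t1)
    d = bridge_cov(t2, t2) - 2 * bridge_cov(t1, t2) + bridge_cov(t1, t1)
    return [[a, b], [b, d]]

def stated_inverse(t1, t2):
    """The inverse matrix as written in the text."""
    return [[(1 - (t2 - t1)) / (t1 * (1 - t2)), 1 / (1 - t2)],
            [1 / (1 - t2), (1 - t1) / ((t2 - t1) * (1 - t2))]]

def matmul2(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

t1, t2 = 0.3, 0.7                    # arbitrary time points, 0 < t1 < t2 < 1
sigma = increment_covariance(t1, t2)
det = sigma[0][0] * sigma[1][1] - sigma[0][1] ** 2
check = matmul2(sigma, stated_inverse(t1, t2))   # should be the identity matrix
```

For these values, `det` agrees with $t_{1} (t_{2} - t_{1}) (1 - t_{2})$ and `check` is the identity matrix up to rounding.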

3. Applications of the Main Results for $χ^{2}$-Tests and Numerical Examples

Using our main results, we can construct some analogues of the well-known

χ^{2}

-test.

The first one is a consequence of Theorem 3, so we assume the conditions of that theorem.

Theorem 5.

Let

η_{1}^{'}, \dots, η_{N}^{'}

be an allocation scheme of

2 n

distinguishable particles into N different cells with an even number of particles in each cell. Assume that the allocation probabilities are

q_{1}, q_{2}, \dots, q_{N}

which depend on n and N. Suppose that conditions (14) are valid. Then, we have

\sum_{i = 1}^{K} \frac{{(η_{i}^{'} - 2 n q_{i})}^{2}}{2 n q_{i}} \overset{d}{\to} χ^{2} (K)

as

N, n \to \infty

, where

χ^{2} (K)

denotes the

χ^{2}

-distribution with K degrees of freedom.

The proof of Theorem 5 is a simple application of Theorem 3 and the definition of the

χ^{2}

-distribution.

Now, we turn to an application of the above

χ^{2}

-test for a well-known method of coding, i.e., the parity checking.

Example 1.

We can apply our

χ^{2}

-test for testing a transmission channel for messages using parity bits. Parity bits are widely used for error detection. First, we briefly describe the usage of parity bits in the case of the so-called even parity bit. Consider a binary message containing N blocks. If a fixed block contains an odd number of bits having value 1, then we add a parity bit having value 1. If the fixed block contains an even number of bits having value 1, then we set the value of the parity bit to 0. Thus, in the final block, the number of bits having value 1 should always be even. Sometimes this method is called a checksum.

After transmission of the binary message through a noisy channel, one can check the parity of each block. If the parity is odd, it shows an error. More precisely, the parity check shows an odd number of errors. However, if a block contains an even number of errors, then this check does not show an error. We are interested in finding the error rate of a transmission channel, assuming that the parity check does not show any error.

Our statistical model is as follows: Consider a file which contains N blocks. The m^th block,

1 \leq m \leq N

, is a sequence

i_{1} i_{2} \dots i_{l_{m}}

, where

i_{j} = 1

or

i_{j} = 0

,

1 \leq j \leq l_{m} - 1

, and

i_{l_{m}} = (\sum_{j = 1}^{l_{m} - 1} i_{j}) (mod 2)

.

i_{l_{m}}

represents the parity bit. An error in a block is the replacement of an element $i_{k}$ of the block by its opposite value; that is, the true value 1 is replaced by 0, or the true value 0 is replaced by 1.
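The even parity scheme and the effect of errors on the check can be sketched as follows. The helper names and the sample payload are hypothetical; the point is only that flipping an even number of bits leaves the parity check satisfied.

```python
def add_even_parity(bits):
    """Append the parity bit: the sum of the payload bits modulo 2."""
    return bits + [sum(bits) % 2]

def parity_ok(block):
    """The check passes when the block contains an even number of ones."""
    return sum(block) % 2 == 0

def flip(block, positions):
    """Introduce errors by replacing the bits at the given positions."""
    out = list(block)
    for p in positions:
        out[p] ^= 1
    return out

block = add_even_parity([1, 0, 1, 1, 0, 0, 1])   # hypothetical 7-bit payload
one_error = flip(block, [2])                      # detected by the check
two_errors = flip(block, [2, 5])                  # passes the check unnoticed
```

Here `parity_ok(one_error)` is false while `parity_ok(two_errors)` is true, which is the situation modelled by an even number of particles in a cell.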

We consider the following statistical model for the errors. The file contains a binary message. It is divided into N blocks. In each block, a parity bit is used. After the transmission of the file through a channel, the parity check does not show any error.

To check the quality of the channel, we should obtain the original file and compare it with the transmitted one to identify the errors. We can test the hypothesis

H_{0}

:

q_{i}

is the probability that an error occurs in the i^th block, where

q_{i} > 0

,

\sum_{i = 1}^{N} q_{i} = 1

. E.g., we can test that the probability that an error happens in the i^th block is proportional to the length of the block by using

q_{i} = \frac{l_{i}}{\sum_{k = 1}^{N} l_{k}}, 1 \leq i \leq N .

The numbers of errors in the N blocks are

η_{1}^{'}, \dots, η_{N}^{'}

with the following properties:

(1): The number of errors in the whole file is $2 n$ (i.e., $η_{1}^{'} + \dots + η_{N}^{'} = 2 n$ );
(2): Errors can occur in the blocks independently and the probability that an error occurs in the i^th block is $q_{i}$ ;
(3): The parity check does not find any block with error (that is each block has an even number of errors).

Then, the numbers of errors

η_{1}^{'}, \dots, η_{N}^{'}

can be considered as the allocation of

2 n

distinguishable particles into N different cells with an even number of particles in each cell.

We calculate the statistic

\sum_{i = 1}^{K} \frac{{(η_{i}^{'} - 2 n q_{i})}^{2}}{2 n q_{i}}

from Theorem 5, and if its value is larger than a critical value, then we reject hypothesis

H_{0}

.
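The test statistic of Theorem 5 is simple to compute once the error counts are known. The sketch below uses hypothetical block lengths and error counts, far too small for the asymptotic regime of the theorem; they serve only to show the arithmetic. The critical value, a quantile of the $χ^{2} (K)$ distribution, would be taken from a table or a statistics library and is not computed here.

```python
def chi_square_statistic(errors, q, K):
    """Statistic of Theorem 5 computed from the first K blocks.

    errors: observed (even) error counts eta'_1, ..., eta'_N
    q:      hypothesised error probabilities q_1, ..., q_N
    """
    two_n = sum(errors)              # total number of errors, 2n
    return sum((errors[i] - two_n * q[i]) ** 2 / (two_n * q[i])
               for i in range(K))

lengths = [8, 16, 8]                          # hypothetical block lengths l_i
q = [l / sum(lengths) for l in lengths]       # H0: q_i proportional to l_i
stat = chi_square_statistic([4, 2, 2], q, K=2)
```

With these numbers the expected counts in the first two blocks are 2 and 4, so the statistic equals (4 - 2)^2 / 2 + (2 - 4)^2 / 4 = 3.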

Now, we turn to an application of Theorem 4 to mathematical statistics. Our next example is similar to Example 1. Consider again a binary file containing N blocks, each of which contains a parity bit. Assume that the parity check does not show any error in the blocks, so any block can contain only an even number of errors. We are not able to find the number of errors in the individual blocks, but we can find the number of errors in m super blocks (i.e., in some unions of the original N blocks). Using the following procedure, we can either test the sizes of the super blocks or, when the super block sizes are known, test whether the errors are uniformly distributed among the original N blocks. In the next example, we describe the statistical procedure in a general mathematical setting.

Example 2.

Consider the homogeneous allocation model. Let

2 k_{1}, \dots, 2 k_{N}

be the numbers of particles in the cells after allocating

2 n

distinguishable particles into N different cells having an even number of particles in each cell. However, the numbers

2 k_{1}, \dots, 2 k_{N}

are not known to us; only the numbers of particles in some groups of neighbouring cells are known.

Let

t_{0}^{'} = 0 < t_{1}^{'} < \dots < t_{m}^{'} = 1

, where each

t_{j}^{'}

has the form

k / N

. So we suppose that the numbers of particles in certain sets of the cells are known, more precisely

n_{j} = \sum_{i = [t_{j - 1}^{'} N] + 1}^{[t_{j}^{'} N]} 2 k_{i}

,

1 \leq j \leq m

, are known. Let

t_{0} = 0 < t_{1} < \dots < t_{m} = 1

be some fixed known numbers, where again each

t_{j}

has the form of

k / N

. We will check the null hypothesis

H_{0} : t_{i}^{'} = t_{i}

,

1 \leq i \leq m

, against the alternative hypothesis

H_{1} : t_{i}^{'} \neq t_{i}

for some

1 \leq i \leq m

.

To this end, we propose the following

χ^{2}

-test. Let

χ_{o}^{2} = \sum_{j = 1}^{m} \frac{{(n_{j} - 2 n \frac{[t_{j} N] - [t_{j - 1} N]}{N})}^{2}}{2 n (t_{j} - t_{j - 1})}

be the test statistic.

Let

0 < α < 1

. Choose the critical value

χ_{c}^{2}

, such that

P {χ^{2} (m - 1) < χ_{c}^{2}} = 1 - α

, where

χ^{2} (m - 1)

is a random variable having

χ^{2}

-distribution with $m - 1$ degrees of freedom. The hypothesis

H_{0}

is accepted if

χ_{o}^{2} < χ_{c}^{2}

, and it is rejected if

χ_{o}^{2} \geq χ_{c}^{2}

. By Theorem 4, if

\frac{n}{N} \to \infty

, then the probability of the type I error converges to

P {χ^{2} (m - 1) \geq χ_{c}^{2}} = α .

Above we used, besides Theorem 4, the following known fact from the statistical theory of

χ^{2}

-tests. If

t_{0} = 0 < t_{1} < \dots < t_{m} = 1

, then, for the increments of the Brownian bridge

W_{0}

, the distribution of

\sum_{j = 1}^{m} \frac{{(W_{0} (t_{j}) - W_{0} (t_{j - 1}))}^{2}}{t_{j} - t_{j - 1}}

is

χ^{2} (m - 1)

.
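The statistic $χ_{o}^{2}$ of Example 2 can be computed from the super block counts as sketched below. The function name, the breakpoints, and the counts are assumptions chosen for illustration; the breakpoints are passed as fractions of the form k / N so that $[t_{j} N]$ is exact.

```python
import math
from fractions import Fraction

def superblock_statistic(n_j, t, big_n):
    """chi_o^2 of Example 2.

    n_j:   observed particle counts n_1, ..., n_m in the super blocks
    t:     hypothesised breakpoints t_0 = 0 < t_1 < ... < t_m = 1 (Fractions k/N)
    big_n: number of cells N
    """
    two_n = sum(n_j)
    stat = 0.0
    for j in range(1, len(t)):
        cells = math.floor(t[j] * big_n) - math.floor(t[j - 1] * big_n)
        expected = two_n * cells / big_n
        stat += (n_j[j - 1] - expected) ** 2 / (two_n * float(t[j] - t[j - 1]))
    return stat

t = [Fraction(0), Fraction(3, 10), Fraction(6, 10), Fraction(1)]  # N = 10, m = 3
stat = superblock_statistic([40, 20, 40], t, big_n=10)            # 2n = 100
```

With these hypothetical counts the expected values are 30, 30 and 40, so the statistic equals 100/30 + 100/30 + 0 = 20/3, to be compared with a quantile of the $χ^{2} (m - 1) = χ^{2} (2)$ distribution.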

Example 3.

We carried out computer experiments to show numerically the results of our theorems. We simulated the allocations using random numbers. We considered a homogeneous allocation, that is, when we allocate a particle, then we choose a cell uniformly at random from the N cells. We allocated

2 n = 1000

particles into

N = 20

cells. We repeated this experiment and kept only those outcomes in which there was an even number of particles in each cell, until we had saved

s = 200

such allocations. In this way, we obtained a sample of size

s = 200

for our

N = 20

-dimensional random vector

η_{1}^{'}, \dots, η_{N}^{'}

. Then, we constructed histograms for the first two coordinates of the above-mentioned

N = 20

-dimensional sample. On the left-hand side of Figure 1, the histogram of the observations of

\frac{η_{1}^{'} - 2 n q_{1}}{\sqrt{2 n q_{1}}}

, together with the standard normal probability density function, are shown. On the right-hand side of Figure 1, the histogram for

\frac{η_{2}^{'} - 2 n q_{2}}{\sqrt{2 n q_{2}}}

and the standard normal probability density function can be seen. The fit to the normal distribution seems to be very good. On the left-hand side of Figure 2, the joint histogram of the sample for the variables

\frac{η_{1}^{'} - 2 n q_{1}}{\sqrt{2 n q_{1}}}

and

\frac{η_{2}^{'} - 2 n q_{2}}{\sqrt{2 n q_{2}}}

is given. This figure supports the joint normality of the two coordinates. Therefore, we obtained numerical evidence for Theorems 2 and 3. Finally, we performed principal component analysis for the observations of the vector

\frac{η_{1}^{'} - 2 n q_{1}}{\sqrt{2 n q_{1}}}, \dots, \frac{η_{N}^{'} - 2 n q_{N}}{\sqrt{2 n q_{N}}} .

The first 19 principal component variances were between

1.64

and

0.53

, but the last one was zero; this supports the claim that the number of degrees of freedom of the

χ^{2}

-statistic in Example 2 is

m - 1

.

Example 4.

We carried out the same computer experiment as in Example 3, but using other parameters. We allocated

2 n = 2000

particles into

N = 10

cells. We saved

s = 1000

allocations in which there was an even number of particles in each cell. In this way, we obtained a sample of size

s = 1000

for the

N = 10

-dimensional random vector

η_{1}^{'}, \dots, η_{N}^{'}

. On the left-hand side of Figure 3, the histogram of the observations of

\frac{η_{1}^{'} - 2 n q_{1}}{\sqrt{2 n q_{1}}}

, together with the standard normal probability density function, are presented. On the right-hand side of Figure 3, the histogram for

\frac{η_{2}^{'} - 2 n q_{2}}{\sqrt{2 n q_{2}}}

and the standard normal probability density function are given. The fit to the normal distribution is again very good. On the right-hand side of Figure 2, the joint histogram of the sample for the variables

\frac{η_{1}^{'} - 2 n q_{1}}{\sqrt{2 n q_{1}}}

and

\frac{η_{2}^{'} - 2 n q_{2}}{\sqrt{2 n q_{2}}}

is given. This figure also supports the joint normality of the two coordinates. Therefore, we obtained another numerical confirmation for Theorems 2 and 3. Then, we performed principal component analysis for the observations of the vector

\frac{η_{1}^{'} - 2 n q_{1}}{\sqrt{2 n q_{1}}}, \dots, \frac{η_{N}^{'} - 2 n q_{N}}{\sqrt{2 n q_{N}}} .

The first 9 principal component variances were between

1.17

and

0.87

, but the last one was zero. This supports the claim that the number of degrees of freedom of the

χ^{2}

-statistic in Example 2 is

m - 1

and not m.

We mention that for relatively small values of n, e.g., for

2 n = 500

and

N = 25

, the numerical results show a poor fit to the normal distribution. It is also worth mentioning that we need a large sample size, i.e.,

s > 100

, to numerically show the goodness of fit to the normal distribution.

Figure 2. The joint histograms of the first two coordinates in Examples 3 and 4. (a) Histogram for Example 3; (b) Histogram for Example 4.

Figure 3. The histograms of the first and the second coordinates in Example 4. (a) First coordinate; (b) Second coordinate.

4. Auxiliary Results

We shall use the following notation. Let

π (β)

denote a Poisson random variable with the parameter

β

, and let

ξ^{'} (β)

, where

β > 0

, be a random variable with the distribution

P {ξ^{'} (β) = 2 k} = \frac{β^{2 k}}{(2 k)! ch (β)}, k = 0, 1, 2 \dots .

Recall that this distribution appears in Theorem 1. We see that the distribution of

ξ^{*} (β)

is the same as the distribution of

ξ^{'} (β) / 2

.

Lemma 1.

Let

C > 0

be fixed. Then, we have

P {ξ^{'} (β) = 2 k} = \frac{2}{\sqrt{2 π β}} (exp (- \frac{{(2 k - β)}^{2}}{2 β})) (1 + o (1))

(18)

as

β \to \infty

, uniformly for those values of k for which

\frac{| 2 k - β |}{\sqrt{β}} < C

.

Proof.

We need the following approximation of the Poisson distribution by the normal density function, see p. 43 of [7]. Let

τ

have Poisson distribution

P (τ = k) = \frac{λ^{k}}{k!} e^{- λ}

,

k = 0, 1, 2, \dots

and let

k = λ + x \sqrt{λ}

. Then, as

λ \to \infty

,

P (τ = k) = \frac{1}{\sqrt{2 π λ}} e^{- \frac{x^{2}}{2}} (1 + O (\frac{1}{\sqrt{λ}})),

uniformly for

x \in [- c, c]

, where c is an arbitrary fixed positive number.

Using the above approximation for

P {π (β) = 2 k}

, we obtain

P {ξ^{'} (β) = 2 k} = \frac{β^{2 k}}{(2 k)! e^{β}} \frac{1}{\frac{1}{2} (1 + e^{- 2 β})} = P {π (β) = 2 k} \frac{1 + o (1)}{\frac{1}{2}}

= \frac{1}{\sqrt{2 π β}} (exp (- \frac{{(2 k - β)}^{2}}{2 β})) (1 + o (1)) \frac{1 + o (1)}{\frac{1}{2}}

= \frac{2}{\sqrt{2 π β}} (exp (- \frac{{(2 k - β)}^{2}}{2 β})) (1 + o (1))

as

β \to \infty

uniformly for k such that

\frac{| 2 k - β |}{\sqrt{β}} < C

. □
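The local approximation (18) can be examined numerically by comparing the exact probabilities with the Gaussian expression for increasing β. The sketch below is an illustration only; the function names, the choice C = 2, and the values of β are assumptions. Probabilities are evaluated in log space to avoid overflow.

```python
import math

def xi_prime_pmf(two_k, beta):
    """Exact P{xi'(beta) = 2k} = beta^{2k} / ((2k)! ch(beta)), via logarithms."""
    log_ch = beta + math.log1p(math.exp(-2.0 * beta)) - math.log(2.0)
    return math.exp(two_k * math.log(beta) - math.lgamma(two_k + 1) - log_ch)

def normal_approximation(two_k, beta):
    """Leading term on the right-hand side of (18)."""
    return (2.0 / math.sqrt(2.0 * math.pi * beta)
            * math.exp(-(two_k - beta) ** 2 / (2.0 * beta)))

def worst_relative_error(beta, c=2.0):
    """Largest |exact / approximation - 1| over k with |2k - beta| < c sqrt(beta)."""
    root = math.sqrt(beta)
    lo = max(0, math.ceil((beta - c * root) / 2.0))
    hi = math.floor((beta + c * root) / 2.0)
    ks = [k for k in range(lo, hi + 1) if abs(2 * k - beta) < c * root]
    return max(abs(xi_prime_pmf(2 * k, beta) / normal_approximation(2 * k, beta) - 1.0)
               for k in ks)

errors = {beta: worst_relative_error(beta) for beta in (1e2, 1e4, 1e6)}
```

The worst relative error decreases as β grows, in line with the factor 1 + o (1) in (18).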

Lemma 2.

For the moments of

ξ^{*} (β)

, we have

\begin{matrix} m^{*} (β) = E ξ^{*} (β) = \frac{β}{2} th (β), σ^{* 2} (β) = D^{2} ξ^{*} (β) = \frac{β}{4} (1 + o (1)), \\ E {(ξ^{*} (β) - E ξ^{*} (β))}^{4} = \frac{3}{16} β^{2} (1 + o (1)) \end{matrix}

(19)

as

β \to \infty

.

Proof.

By simple calculation, one can obtain that the characteristic function of

ξ^{*} (β)

is

ϕ (t) = \frac{ch (β e^{\frac{i t}{2}})}{ch (β)}, t \in R,

(20)

where

i = \sqrt{- 1}

. Using the hyperbolic sine function

sh (x) = \frac{e^{x} - e^{- x}}{2}

, we can obtain the derivatives of the characteristic function

ϕ^{'} (t) = \frac{sh (β e^{\frac{i t}{2}})}{ch (β)} β e^{\frac{i t}{2}} \frac{i}{2},

ϕ^{″} (t) = \frac{ch (β e^{\frac{i t}{2}})}{ch (β)} {(β e^{\frac{i t}{2}} \frac{i}{2})}^{2} + \frac{sh (β e^{\frac{i t}{2}})}{ch (β)} β e^{\frac{i t}{2}} \frac{(- 1)}{4},

ϕ^{‴} (t) = \frac{sh (β e^{\frac{i t}{2}})}{ch (β)} {(β e^{\frac{i t}{2}} \frac{i}{2})}^{3} + \frac{ch (β e^{\frac{i t}{2}})}{ch (β)} β^{2} e^{i t} \frac{(- i)}{4} + \frac{ch (β e^{\frac{i t}{2}})}{ch (β)} β^{2} e^{i t} \frac{(- i)}{8} + \frac{sh (β e^{\frac{i t}{2}})}{ch (β)} β e^{\frac{i t}{2}} \frac{(- i)}{8}

= \frac{sh (β e^{\frac{i t}{2}})}{ch (β)} {(β e^{\frac{i t}{2}} \frac{i}{2})}^{3} + 3 \frac{ch (β e^{\frac{i t}{2}})}{ch (β)} β^{2} e^{i t} \frac{(- i)}{8} + \frac{sh (β e^{\frac{i t}{2}})}{ch (β)} β e^{\frac{i t}{2}} \frac{(- i)}{8},

and

ϕ^{i v} (t) = \frac{ch (β e^{\frac{i t}{2}})}{ch (β)} {(β e^{\frac{i t}{2}} \frac{i}{2})}^{4} + 3 \frac{sh (β e^{\frac{i t}{2}})}{ch (β)} β^{3} e^{i \frac{3 t}{2}} \frac{1}{16} + \frac{ch (β e^{\frac{i t}{2}})}{ch (β)} β^{2} e^{i t} \frac{1}{16}

+ 3 \frac{sh (β e^{\frac{i t}{2}})}{ch (β)} {(β e^{\frac{i t}{2}} \frac{i}{2})}^{2} β e^{\frac{i t}{2}} \frac{(- 1)}{4} + 3 \frac{ch (β e^{\frac{i t}{2}})}{ch (β)} β^{2} e^{i t} \frac{1}{8} + \frac{sh (β e^{\frac{i t}{2}})}{ch (β)} β e^{\frac{i t}{2}} \frac{1}{16}

= \frac{ch (β e^{\frac{i t}{2}})}{ch (β)} {(β e^{\frac{i t}{2}} \frac{i}{2})}^{4} + 6 \frac{sh (β e^{\frac{i t}{2}})}{ch (β)} β^{3} e^{i \frac{3 t}{2}} \frac{1}{16} + 7 \frac{ch (β e^{\frac{i t}{2}})}{ch (β)} β^{2} e^{i t} \frac{1}{16} + \frac{sh (β e^{\frac{i t}{2}})}{ch (β)} β e^{\frac{i t}{2}} \frac{1}{16} .

Therefore, we obtain

m^{*} (β) = E ξ^{*} (β) = - i ϕ^{'} (0) = \frac{β}{2} th (β) .

(21)

Moreover,

E {(ξ^{*} (β))}^{2} = - ϕ^{″} (0) = \frac{β^{2}}{4} + \frac{β}{4} th (β),

(22)

E {(ξ^{*} (β))}^{3} = i ϕ^{‴} (0) = \frac{β^{3}}{8} th (β) + \frac{3}{8} β^{2} + \frac{β}{8} th (β),

(23)

and

E {(ξ^{*} (β))}^{4} = ϕ^{i v} (0) = \frac{1}{16} β^{4} + \frac{6 β^{3}}{16} th (β) + \frac{7}{16} β^{2} + \frac{β}{16} th (β) .

(24)

From Equations (21)–(24), it follows that

σ^{* 2} (β) = D^{2} (ξ^{*} (β)) = E {(ξ^{*} (β))}^{2} - {(E ξ^{*} (β))}^{2}

= \frac{β^{2}}{4} + \frac{β}{4} th (β) - {(\frac{β}{2} th (β))}^{2} = \frac{β^{2}}{4 {ch}^{2} (β)} + \frac{β}{4} th (β) = \frac{β}{4} (1 + \frac{β}{{ch}^{2} (β)} - \frac{e^{- β}}{ch (β)}) .

(25)

This implies that

σ^{* 2} (β) = \frac{β}{4} (1 + o (1))

. Then

E {(ξ^{*} (β) - E ξ^{*} (β))}^{4}

= E {(ξ^{*} (β))}^{4} - 4 (E {(ξ^{*} (β))}^{3}) E ξ^{*} (β) + 6 (E {(ξ^{*} (β))}^{2}) {(E ξ^{*} (β))}^{2} - 3 {(E ξ^{*} (β))}^{4}

= \frac{1}{16} β^{4} + \frac{6 β^{3}}{16} th (β) + \frac{7}{16} β^{2} + \frac{β}{16} th (β) - 4 (\frac{β^{3}}{8} th (β) + \frac{3}{8} β^{2} + \frac{β}{8} th (β)) \frac{β}{2} th (β)

+ 6 (\frac{β^{2}}{4} + \frac{β}{4} th (β)) {(\frac{β}{2} th (β))}^{2} - 3 {(\frac{β}{2} th (β))}^{4}

= \frac{2 β^{4}}{16} ({th}^{2} (β) - 1) - \frac{3 β^{4}}{16} ({th}^{4} (β) - 1) - \frac{6 β^{3}}{16} (th (β) - 1) + \frac{6 β^{3}}{16} ({th}^{3} (β) - 1)

+ \frac{3}{16} β^{2} - \frac{4 β^{2}}{16} ({th}^{2} (β) - 1) + \frac{β}{16} + \frac{β}{16} (th (β) - 1) .

(26)

Since

β^{k} ({th}^{l} (β) - 1) \to 0 and th (β) \to 1 as β \to \infty,

for

k, l = 1, 2, 3, 4

, we obtain

E {(ξ^{*} (β) - E ξ^{*} (β))}^{4} = \frac{3}{16} β^{2} + \frac{1}{16} β + o (1) = \frac{3}{16} β^{2} (1 + o (1)) .

Thus, (19) is proved. □
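As a sanity check of (19), (21) and (25), the moments of ξ*(β) can also be computed by direct summation of the probabilities β^{2k} / ((2k)! ch(β)). The sketch below is only an illustration under assumed values (β = 30 and a truncation point k_max = 400, both chosen here and not in the paper).

```python
import math

def log_cosh(beta):
    # log(cosh(beta)) without overflow
    return beta + math.log1p(math.exp(-2.0 * beta)) - math.log(2.0)

def xi_star_pmf(beta, k_max):
    # P{xi*(beta) = k} = beta^(2k) / ((2k)! cosh(beta)) for k = 0..k_max
    lc = log_cosh(beta)
    return [math.exp(2 * k * math.log(beta) - math.lgamma(2 * k + 1) - lc)
            for k in range(k_max + 1)]

def moments(beta, k_max=400):
    # mean, variance and fourth central moment by truncated summation
    pmf = xi_star_pmf(beta, k_max)
    mean = sum(k * p for k, p in enumerate(pmf))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(pmf))
    m4 = sum((k - mean) ** 4 * p for k, p in enumerate(pmf))
    return mean, var, m4

beta = 30.0  # illustrative value
mean, var, m4 = moments(beta)
exact_mean = beta / 2.0 * math.tanh(beta)                         # formula (21)
exact_var = beta / 4.0 * (1.0 + beta / math.cosh(beta) ** 2
                          - math.exp(-beta) / math.cosh(beta))    # formula (25)
print(mean, exact_mean)
print(var, exact_var, beta / 4.0)
print(m4, 3.0 / 16.0 * beta ** 2)
```

The fourth central moment agrees with (3/16) β² only up to the lower-order term β/16, which is why (19) is stated with the factor (1 + o(1)).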

We shall use the following general Berry–Esseen-type inequality. We should mention that in the following Lemma 3, there is no assumption on the distributions of the random variables

ξ_{i}

,

1 \leq i \leq N

.

Lemma 3.

Let

ξ_{i}

,

1 \leq i \leq N

, be independent random variables with variances

σ_{i}^{2}

and expectations

m_{i}

,

i = 1, 2, \dots, N

. Let

S_{N} = \sum_{i = 1}^{N} ξ_{i}

, and let

d_{N}^{2} = \sum_{i = 1}^{N} σ_{i}^{2}

be its variance and let

μ_{N} = \sum_{i = 1}^{N} m_{i}

be its expectation. Then, we have

sup_{t \in R} |P \{\frac{S_{N} - μ_{N}}{d_{N}} < t\} - Φ (t)| \leq 2 c {(\frac{\sum_{i = 1}^{N} E {(ξ_{i} - m_{i})}^{4}}{d_{N}^{4}})}^{\frac{1}{2}} .

(27)

Here, Φ is the standard normal distribution function and c is the constant from the Berry–Esseen inequality.

Lemma 3 was proved in [2]. Now, we shall apply Lemma 3 to our model.

Lemma 4.

Assume that for the random variables

ξ_{1}^{*}, ξ_{2}^{*}, \dots, ξ_{N}^{*}

from formula (9), condition (12) is valid. Then, for their standardized sum, we have

\frac{S_{N}^{*} - m_{N}^{*}}{σ_{N}^{*}} \overset{d}{\to} γ .

Proof.

Using (19) for the right-hand side of (27), we obtain

sup_{t \in R} |P \{\frac{S_{N}^{*} - m_{N}^{*}}{σ_{N}^{*}} < t\} - Φ (t)| \leq 2 c {(\frac{3 (\sum_{i = 1}^{N} {(β_{i})}^{2}) (1 + o (1))}{{(\sum_{i = 1}^{N} β_{i})}^{2}})}^{\frac{1}{2}}

\leq 2 c {(\frac{3 ({max}_{1 \leq i \leq N} β_{i}) (1 + o (1))}{\sum_{i = 1}^{N} β_{i}})}^{\frac{1}{2}} .

(28)

Now, (28) implies Lemma 4. □
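To see how the bound (28) behaves, the sketch below evaluates both of its right-hand sides with the (1 + o(1)) factors dropped. The text does not give a numerical value for the Berry–Esseen constant c, so the value 0.56 used here is an assumption made only for illustration, as are the parameters β_i.

```python
import math

# Assumption: an illustrative numerical value for the Berry-Esseen constant c;
# the text only refers to "the constant from the Berry-Esseen inequality".
C_BERRY_ESSEEN = 0.56

def bound_28(betas, c=C_BERRY_ESSEEN):
    # first bound in (28), with the (1 + o(1)) factor dropped
    s1 = sum(betas)
    s2 = sum(b * b for b in betas)
    return 2.0 * c * math.sqrt(3.0 * s2 / s1 ** 2)

def cruder_bound_28(betas, c=C_BERRY_ESSEEN):
    # second bound in (28): uses sum(beta_i^2) <= max(beta_i) * sum(beta_i)
    return 2.0 * c * math.sqrt(3.0 * max(betas) / sum(betas))

for n_cells in (10, 100, 1000, 10000):
    betas = [20.0 + (i % 5) for i in range(n_cells)]  # hypothetical parameters
    print(n_cells, bound_28(betas), cruder_bound_28(betas))
```

Both quantities decrease as the largest β_i becomes small relative to the sum of all β_i, which is the mechanism by which (28) yields the convergence in Lemma 4.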

In the following lemma, we shall need the characteristic functions of the random variables in (9). Thus, let

ϕ_{j} (t) = \frac{ch (β_{j} e^{i \frac{t}{2}})}{ch (β_{j})} = \frac{e^{β_{j} e^{i \frac{t}{2}}} + e^{- β_{j} e^{i \frac{t}{2}}}}{e^{β_{j}} + e^{- β_{j}}}

be the characteristic function of

ξ_{j}^{*}

, let

ϕ_{j}^{c} (t) = ϕ_{j} (t) e^{- i t m^{*} (β_{j})}

be the characteristic function of the centralized version of

ξ_{j}^{*}

,

1 \leq j \leq N

, and let

ϕ_{N} (t)

be the characteristic function of the standardized sum

\frac{S_{N}^{*} - m_{N}^{*}}{σ_{N}^{*}}

.

Lemma 5.

Assume that for the random variables

ξ_{1}^{*}, ξ_{2}^{*}, \dots, ξ_{N}^{*}

from formula (9), condition (12) is valid. Let

C > 0

. Then, we have

σ_{N}^{*} P {S_{N}^{*} = n} = \frac{1}{\sqrt{2 π}} exp (- \frac{{(n - m_{N}^{*})}^{2}}{2 σ_{N}^{* 2}}) (1 + o (1)),

(29)

uniformly for those n, such that

\frac{| n - m_{N}^{*} |}{σ_{N}^{*}} < C

.

Proof.

We shall need the notation

z = \frac{n - m_{N}^{*}}{σ_{N}^{*}} .

S_{N}^{*}

is an integer valued random variable, so its distribution can be expressed by the following inverse Fourier transform

P {S_{N}^{*} = n} = \frac{1}{2 π} \int_{- π}^{π} e^{- i n x} ϕ_{N}^{0} (x) d x,

(30)

where

ϕ_{N}^{0}

is the characteristic function of

S_{N}^{*}

. Moreover,

ϕ_{N}^{0} (\frac{t}{σ_{N}^{*}}) = ϕ_{N} (t) e^{\frac{i t m_{N}^{*}}{σ_{N}^{*}}} = \prod_{j = 1}^{N} ϕ_{j}^{c} (\frac{t}{σ_{N}^{*}}) e^{\frac{i t m_{N}^{*}}{σ_{N}^{*}}} .

So, substituting

x = t / σ_{N}^{*}

into the integral in (30), we obtain

σ_{N}^{*} P {S_{N}^{*} = n} = \frac{1}{2 π} \int_{- π σ_{N}^{*}}^{π σ_{N}^{*}} e^{- i t z} ϕ_{N} (t) d t = \frac{1}{2 π} \int_{- π σ_{N}^{*}}^{π σ_{N}^{*}} e^{- i t z} \prod_{i = 1}^{N} ϕ_{i}^{c} (\frac{t}{σ_{N}^{*}}) d t .
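The inversion formula (30) can be illustrated in the simplest case N = 1, where the characteristic function is given by (20). The sketch below replaces the integral over [-π, π] by a Riemann sum on a uniform grid and compares the result with the exact probabilities β^{2n} / ((2n)! ch(β)). The grid size and the value β = 5 are assumptions chosen for the example.

```python
import cmath
import math

def phi(t, beta):
    # characteristic function (20) of xi*(beta)
    return cmath.cosh(beta * cmath.exp(0.5j * t)) / math.cosh(beta)

def pmf_by_inversion(n, beta, grid=512):
    # Riemann sum for (1 / (2 pi)) * integral_{-pi}^{pi} e^{-i n x} phi(x) dx
    total = 0.0 + 0.0j
    for r in range(grid):
        x = -math.pi + 2.0 * math.pi * r / grid
        total += cmath.exp(-1j * n * x) * phi(x, beta)
    return (total / grid).real

def pmf_exact(n, beta):
    # P{xi*(beta) = n} = beta^(2n) / ((2n)! cosh(beta))
    return beta ** (2 * n) / (math.factorial(2 * n) * math.cosh(beta))

beta = 5.0  # illustrative value
for n in range(6):
    print(n, pmf_by_inversion(n, beta), pmf_exact(n, beta))
```

Because the integrand is 2π-periodic, the uniform grid reproduces the probabilities up to rounding and a negligible aliasing term.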

Let

0 < ε < 1

and

B > 0

. Using the characteristic function of the standard normal law, we have, for any real z, that

\frac{1}{\sqrt{2 π}} e^{- \frac{z^{2}}{2}} = \frac{1}{2 π} \int_{- \infty}^{\infty} e^{- i x z} e^{- \frac{x^{2}}{2}} d x .

So we can represent the difference of the two sides of (29) as the sum of four integrals

R_{N} = 2 π (σ_{N}^{*} P {S_{N}^{*} = n} - \frac{1}{\sqrt{2 π}} e^{- \frac{z^{2}}{2}}) = I_{1} + I_{2} + I_{3} + I_{4},

(31)

where

I_{1} = \int_{| x | < B} e^{- i x z} ϕ_{N} (x) d x - \int_{| x | < B} e^{- i x z} e^{- \frac{x^{2}}{2}} d x,

I_{2} = - \int_{| x | > B} e^{- i x z} e^{- \frac{x^{2}}{2}} d x,

I_{3} = \int_{B < | x | \leq ε σ_{N}^{*}} e^{- i x z} (\prod_{j = 1}^{N} ϕ_{j}^{c} (\frac{x}{σ_{N}^{*}})) d x

and

I_{4} = \int_{ε σ_{N}^{*} < | x | \leq π σ_{N}^{*}} e^{- i x z} (\prod_{j = 1}^{N} ϕ_{j}^{c} (\frac{x}{σ_{N}^{*}})) d x .

Since

I_{1} = \int_{| x | < B} e^{- i x z} (ϕ_{N} (x) - e^{- \frac{x^{2}}{2}}) d x,

from Lemma 4, it follows that

I_{1} \to 0

(32)

for all fixed

B > 0

.

Since

| I_{2} | < \int_{| x | > B} e^{- \frac{x^{2}}{2}} d x,

we have

I_{2} \to 0 as B \to \infty .

(33)

Using formula (20) for the characteristic function of

ξ^{*} (β)

, we obtain

| I_{3} | \leq \int_{B < | x | \leq ε σ_{N}^{*}} |e^{- i x z} (\prod_{j = 1}^{N} ϕ_{j}^{c} (\frac{x}{σ_{N}^{*}}))| d x

= \int_{B < | t | \leq ε σ_{N}^{*}} (\prod_{j = 1}^{N} e^{β_{j} (cos \frac{t}{2 σ_{N}^{*}} - 1)}) \prod_{j = 1}^{N} |\frac{1 + e^{- 2 β_{j} e^{i \frac{t}{2 σ_{N}^{*}}}}}{1 + e^{- 2 β_{j}}}| d t .

We know that

cos (x) - 1 \leq - \frac{x^{2}}{2} + \frac{x^{4}}{24} \leq - \frac{11 x^{2}}{24}, if | x | \leq 1; e^{x} - 1 \leq x e^{x}, if x \geq 0 .
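The two elementary inequalities above are easy to confirm on a grid. The following sketch is only a numerical spot check, not a proof; the grids and the small floating-point tolerances are choices made here.

```python
import math

def cos_chain_holds(x, tol=1e-15):
    # cos(x) - 1 <= -x^2/2 + x^4/24 <= -11 x^2 / 24   for |x| <= 1
    left = math.cos(x) - 1.0
    middle = -x * x / 2.0 + x ** 4 / 24.0
    right = -11.0 * x * x / 24.0
    return left <= middle + tol and middle <= right + tol

def exp_bound_holds(x, tol=1e-12):
    # e^x - 1 <= x e^x   for x >= 0
    return math.exp(x) - 1.0 <= x * math.exp(x) + tol

cos_grid = [i / 1000.0 for i in range(-1000, 1001)]   # points in [-1, 1]
exp_grid = [i / 100.0 for i in range(0, 2001)]        # points in [0, 20]
print(all(cos_chain_holds(x) for x in cos_grid))
print(all(exp_bound_holds(x) for x in exp_grid))
```

Equality in the second cosine inequality is attained at |x| = 1, which is why the restriction |x| ≤ 1 appears.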

Therefore, we obtain that

\prod_{j = 1}^{N} e^{β_{j} (cos \frac{t}{2 σ_{N}^{*}} - 1)} \leq e^{- (\sum_{j = 1}^{N} β_{j}) \frac{11 \cdot t^{2}}{24 \cdot 4 σ_{N}^{* 2}}} \leq e^{- \frac{11 t^{2}}{24 (1 + o (1))}}

as

B < | t | \leq ε σ_{N}^{*}

, where we applied that

σ^{* 2} (β_{j}) = \frac{β_{j}}{4} (1 + o (1))

. Moreover,

\prod_{j = 1}^{N} |\frac{1 + e^{- 2 β_{j} e^{i \frac{t}{2 σ_{N}^{*}}}}}{1 + e^{- 2 β_{j}}}| \leq \prod_{j = 1}^{N} \frac{1 + e^{- 2 β_{j} cos (\frac{t}{2 σ_{N}^{*}})}}{1 + e^{- 2 β_{j}}}

= \prod_{j = 1}^{N} (1 + \frac{e^{- 2 β_{j} cos (\frac{t}{2 σ_{N}^{*}})} - e^{- 2 β_{j}}}{1 + e^{- 2 β_{j}}}) = exp (\sum_{j = 1}^{N} ln (1 + \frac{e^{- 2 β_{j} cos (\frac{t}{2 σ_{N}^{*}})} - e^{- 2 β_{j}}}{1 + e^{- 2 β_{j}}}))

\leq exp (\sum_{j = 1}^{N} e^{- 2 β_{j}} \frac{e^{2 β_{j} (1 - cos (\frac{t}{2 σ_{N}^{*}}))} - 1}{1 + e^{- 2 β_{j}}}) \leq exp (\sum_{j = 1}^{N} e^{- 2 β_{j}} \frac{2 β_{j} (1 - cos (\frac{t}{2 σ_{N}^{*}}))}{1 + e^{- 2 β_{j}}})

\leq exp (\sum_{j = 1}^{N} e^{- 2 β_{j}} \frac{2 β_{j} \frac{11}{24} {(\frac{t}{2 σ_{N}^{*}})}^{2}}{1 + e^{- 2 β_{j}}}) \leq exp (\frac{11}{12} \cdot \frac{exp (- 2 {min}_{1 \leq j \leq N} β_{j}) t^{2}}{(1 + o (1))}) = e^{t^{2} o (1)},

B < | t | \leq ε σ_{N}^{*}

. Here, we used that

e^{x} \geq x + 1

, the shape of

σ_{N}^{*}

and condition (12). Therefore, we obtain

| I_{3} | \leq \int_{B < | t | \leq ε σ_{N}^{*}} (e^{- \frac{10 t^{2}}{24} (1 + o (1))}) d t \leq \int_{B < | t |} (e^{- \frac{10 t^{2}}{24} (1 + o (1))}) d t .

Consequently,

| I_{3} | \to 0 as B \to \infty .

(34)

Now, we turn to

I_{4}

. By (20), we have

| ϕ_{j} (x) | = |\frac{e^{β_{j} e^{i \frac{x}{2}}} + e^{- β_{j} e^{i \frac{x}{2}}}}{e^{β_{j}} + e^{- β_{j}}}|

\leq \frac{e^{β_{j} cos (\frac{x}{2})} + e^{- β_{j} cos (\frac{x}{2})}}{e^{β_{j}} + e^{- β_{j}}} \leq \frac{e^{β_{j} cos (ε)} + e^{- β_{j} cos (ε)}}{e^{β_{j}} + e^{- β_{j}}} = \frac{ch (β_{j} cos (ε))}{ch (β_{j})}

for

ε < | x | \leq π

. Thus,

| I_{4} | \leq 2 π σ_{N}^{*} \prod_{j = 1}^{N} (\frac{ch (β_{j} cos (ε))}{ch (β_{j})}) \leq 2 π σ_{N}^{*} \prod_{j = 1}^{N} (e^{- (1 - cos (ε)) β_{j}} \frac{1 + e^{- 2 β_{j} cos (ε)}}{1 + e^{- 2 β_{j}}}) .

By (12),

{min}_{1 \leq j \leq N} β_{j} \to \infty

, so we have

\frac{1 + e^{- 2 β_{j} cos (ε)}}{1 + e^{- 2 β_{j}}} \to 1

as

N \to \infty

, uniformly for

1 \leq j \leq N

. Consequently, there exists

N_{0} \in N

, such that

\frac{1 + e^{- 2 β_{j} cos (ε)}}{1 + e^{- 2 β_{j}}} < e^{\frac{1}{2} (1 - cos (ε)) β_{j}},

1 \leq j \leq N

,

N > N_{0}

. Therefore,

| I_{4} | \leq 2 π σ_{N}^{*} \prod_{j = 1}^{N} e^{- \frac{1}{2} (1 - cos (ε)) β_{j}} \leq 2 π σ_{N}^{*} e^{- \frac{1}{2} (1 - cos (ε)) σ_{N}^{* 2} (1 + o (1))}

for

N > N_{0}

. Therefore, we obtain that

I_{4} \to 0 .

(35)

Finally, using formulae (32), (33), (34), and (35) to approximate the left-hand side of (31), we obtain (29). □
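The local limit relation (29) can be examined numerically by computing the exact distribution of S_N* through repeated convolution. The sketch below does this for a hypothetical set of parameters β_1, …, β_20 (chosen here for illustration, not taken from the paper) and returns the ratio of the two sides of (29) without the (1 + o(1)) factor. The mean and variance are taken from (21) and (25).

```python
import math

def log_cosh(beta):
    return beta + math.log1p(math.exp(-2.0 * beta)) - math.log(2.0)

def xi_star_pmf(beta, k_max):
    # P{xi*(beta) = k} = beta^(2k) / ((2k)! cosh(beta)) for k = 0..k_max
    lc = log_cosh(beta)
    return [math.exp(2 * k * math.log(beta) - math.lgamma(2 * k + 1) - lc)
            for k in range(k_max + 1)]

def convolve(p, q):
    # distribution of the sum of two independent non-negative integer variables
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        if a == 0.0:
            continue
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def mean_var(beta):
    m = beta / 2.0 * math.tanh(beta)                        # formula (21)
    v = beta / 4.0 * (1.0 + beta / math.cosh(beta) ** 2
                      - math.exp(-beta) / math.cosh(beta))  # formula (25)
    return m, v

def local_ratio(betas, n, k_max=60):
    # sigma_N* P{S_N* = n} divided by the standard normal density at z
    dist = [1.0]
    for b in betas:
        dist = convolve(dist, xi_star_pmf(b, k_max))
    m_n = sum(mean_var(b)[0] for b in betas)
    sigma = math.sqrt(sum(mean_var(b)[1] for b in betas))
    z = (n - m_n) / sigma
    return sigma * dist[n] / (math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi))

betas = [8.0 + 0.5 * (i % 9) for i in range(20)]  # hypothetical parameters
for n in (92, 98, 105):
    print(n, local_ratio(betas, n))
```

The truncation point k_max = 60 is an assumption that is adequate only because all β_j in this example are at most 12; larger parameters would need a larger cut-off.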

5. Proofs of the Main Theorems

Proof of Theorem 2.

During the proof, we represent

η_{1}^{'}, η_{2}^{'}, \dots, η_{N}^{'}

in the form of (7). First, we prove a local version of our limit theorem. To this end, we study the case when the standardized random variables are inside some bounded intervals. Therefore, we need the following notation. Let

C_{1 i} < C_{2 i}

,

1 \leq i \leq K

. Let

k = \sum_{i = 1}^{K} k_{i}, C^{*} = max {| C_{i j} | : 1 \leq j \leq K, i = 1, 2} .

Let

m_{K}^{*} = \sum_{i = K + 1}^{N} m^{*} (β_{i}) and σ_{K}^{* 2} = \sum_{i = K + 1}^{N} σ^{* 2} (β_{i})

be the expectation and the variance of

\sum_{i = K + 1}^{N} ξ_{i}^{*}

. By Lemma 1, we have

\prod_{i = 1}^{K} P (ξ_{i}^{'} = 2 k_{i}) = \prod_{i = 1}^{K} \frac{2}{\sqrt{2 π β_{i}}} (exp (- \frac{{(2 k_{i} - β_{i})}^{2}}{2 β_{i}})) (1 + o (1)),

(36)

uniformly for

k_{i}

,

1 \leq i \leq K

, such that

C_{1 i} < \frac{2 k_{i} - β_{i}}{\sqrt{β_{i}}} < C_{2 i}, 1 \leq i \leq K .

(37)

Since

σ_{N}^{* 2} \geq σ_{K}^{* 2} = \sum_{i = K + 1}^{N} \frac{β_{i}}{4} (1 + \frac{β_{i}}{{ch}^{2} (β_{i})} - \frac{e^{- β_{i}}}{ch β_{i}})

= \frac{1 + o (1)}{4} \sum_{i = K + 1}^{N} β_{i} = \frac{1 + o (1)}{4} (\sum_{i = 1}^{N} β_{i}) (1 - \frac{\sum_{i = 1}^{K} β_{i}}{\sum_{i = 1}^{N} β_{i}})

\geq \frac{1 + o (1)}{4} (\sum_{i = 1}^{N} β_{i}) (1 - \frac{K {max}_{1 \leq i \leq K} β_{i}}{\sum_{i = 1}^{N} β_{i}}) = (1 + o (1)) σ_{N}^{* 2},

we have

σ_{K}^{* 2} = (1 + o (1)) σ_{N}^{* 2} and \frac{σ_{K}^{*}}{σ_{N}^{*}} = 1 + o (1) .

Let

k_{i}

,

1 \leq i \leq K

, be such that

C_{1 i} < \frac{2 k_{i} - β_{i}}{\sqrt{β_{i}}} < C_{2 i}

for

1 \leq i \leq K

. Using (12) and the above calculation, we have

\frac{| n - k - m_{K}^{*} |}{σ_{K}^{*}} \leq \frac{| n - m_{N}^{*} |}{σ_{K}^{*}} + \sum_{i = 1}^{K} \frac{| k_{i} - m^{*} (β_{i}) |}{σ_{K}^{*}} < C + K C^{*} + o (1) .

(38)

Therefore, by Lemma 5, we obtain

\frac{P \{\sum_{i = K + 1}^{N} ξ_{i}^{*} = n - k\}}{P \{\sum_{i = 1}^{N} ξ_{i}^{*} = n\}} = \frac{\frac{1}{\sqrt{2 π} σ_{K}^{*}} exp (- \frac{{(n - k - m_{K}^{*})}^{2}}{2 σ_{K}^{* 2}}) (1 + o (1))}{\frac{1}{\sqrt{2 π} σ_{N}^{*}} exp (- \frac{{(n - m_{N}^{*})}^{2}}{2 σ_{N}^{* 2}}) (1 + o (1))}

= exp (\frac{{(n - m_{N}^{*})}^{2}}{2 σ_{N}^{* 2}} - \frac{{(n - k - m_{K}^{*})}^{2}}{2 σ_{N}^{* 2}}) (1 + o (1))

= exp (\frac{(k - (m_{N}^{*} - m_{K}^{*}))}{σ_{N}^{*}} \frac{(n - m_{N}^{*} + n - k - m_{K}^{*})}{2 σ_{N}^{*}}) (1 + o (1)) .

Using (38), (37), and assumption (12), we obtain

|\frac{(n - m_{N}^{*} + n - k - m_{K}^{*})}{σ_{N}^{*}} \frac{(k - (m_{N}^{*} - m_{K}^{*}))}{σ_{N}^{*}}| \leq (2 C + K C^{*} + o (1)) \sum_{i = 1}^{K} \frac{σ^{*} (β_{i})}{σ_{N}^{*}} \frac{| k_{i} - m^{*} (β_{i}) |}{σ^{*} (β_{i})}

\leq (2 C + K C^{*} + o (1)) (C^{*} + o (1)) \sum_{i = 1}^{K} \frac{σ^{*} (β_{i})}{σ_{N}^{*}} = o (1) .

Using the above calculations, we have

\frac{P \{\sum_{i = K + 1}^{N} ξ_{i}^{*} = n - k\}}{P \{\sum_{i = 1}^{N} ξ_{i}^{*} = n\}} = 1 + o (1) .

(39)

Now, using (36) and (39) in formula (7), we obtain

P (η_{1}^{'} = 2 k_{1}, \dots, η_{K}^{'} = 2 k_{K}) = (\prod_{i = 1}^{K} \frac{2}{\sqrt{2 π β_{i}}} (exp (- \frac{{(2 k_{i} - β_{i})}^{2}}{2 β_{i}}))) (1 + o (1))

(40)

uniformly for

k_{i}

, such that

C_{1 i} < \frac{2 k_{i} - β_{i}}{\sqrt{β_{i}}} < C_{2 i}

,

1 \leq i \leq K

. Thus, we obtained Corollary 1.

Now, we can apply the well-known method of obtaining the integral version of de Moivre–Laplace theorem from its local version. Thus, using the notation

t_{k_{i}} = \frac{2 k_{i} - β_{i}}{\sqrt{β_{i}}}

,

Δ t_{k_{i}} = \frac{2}{\sqrt{β_{i}}}

for

1 \leq i \leq K

, we obtain

P (\frac{η_{1}^{'} - β_{1}}{\sqrt{β_{1}}} = t_{k_{1}}, \dots, \frac{η_{K}^{'} - β_{K}}{\sqrt{β_{K}}} = t_{k_{K}}) = (\prod_{i = 1}^{K} \frac{Δ t_{k_{i}}}{\sqrt{2 π}} (exp (- \frac{t_{k_{i}}^{2}}{2}))) (1 + o (1)) .

(41)

Here, the right-hand side is a term of the Riemann sum approximating the integral of the K-dimensional standard normal probability density function; so, summing over the lattice points in the given intervals, we obtain

P \{C_{11} < \frac{η_{1}^{'} - β_{1}}{\sqrt{β_{1}}} < C_{21}, \dots, C_{1 K} < \frac{η_{K}^{'} - β_{K}}{\sqrt{β_{K}}} < C_{2 K}\} \to \prod_{i = 1}^{K} P \{C_{1 i} < γ_{i} < C_{2 i}\} .

This implies Theorem 2. □
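The passage from the local statement (41) to the integral statement uses the fact that the terms Δt_k / √(2π) · exp(-t_k² / 2) form a Riemann sum. A one-dimensional sketch of this step is given below; it sums these terms over the lattice points t_k = (2k - β) / √β inside an interval and compares the result with the normal probability of that interval. The interval (-1, 1) and the values of β are assumptions chosen for the example.

```python
import math

def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def lattice_sum(beta, c1, c2):
    # sum of dt / sqrt(2 pi) * exp(-t_k^2 / 2) over lattice points
    # t_k = (2k - beta) / sqrt(beta) lying strictly between c1 and c2
    root = math.sqrt(beta)
    dt = 2.0 / root
    k_lo = max(0, math.floor((beta + c1 * root) / 2.0))
    k_hi = math.ceil((beta + c2 * root) / 2.0)
    total = 0.0
    for k in range(k_lo, k_hi + 1):
        t = (2 * k - beta) / root
        if c1 < t < c2:
            total += dt / math.sqrt(2.0 * math.pi) * math.exp(-t * t / 2.0)
    return total

target = std_normal_cdf(1.0) - std_normal_cdf(-1.0)
for beta in (100.0, 10000.0, 1000000.0):
    print(beta, lattice_sum(beta, -1.0, 1.0), target)
```

The mesh Δt_k = 2 / √β tends to zero as β → ∞, so the lattice sum converges to the integral of the normal density over the interval.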

Now, we turn to the proof of Theorem 4. Thus, we consider the homogeneous allocation scheme, and we assume that there are even numbers of particles in each cell. That is why we consider Equation (6) with independent and identically distributed random variables

ξ_{1}^{'}, \dots, ξ_{N}^{'}

, with distribution

P {ξ_{i}^{'} = 2 k} = \frac{β^{2 k}}{(2 k)! ch (β)}, k = 0, 1, 2 \dots, 1 \leq i \leq N .
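For a very small instance, the link between the allocation model and these independent variables can be checked by brute force. The sketch below assumes that Equation (6) is the usual generalized-allocation representation, that is, the cell counts have the conditional law of (ξ'_1, …, ξ'_N) given that their sum is 2n. It enumerates all allocations of 2n distinguishable particles into N cells, keeps those with an even count in every cell, and compares the resulting frequencies with that conditional law. The sizes 2n = 6, N = 3 and the value β = 1.7 are arbitrary choices.

```python
import itertools
import math
from collections import Counter

def brute_force(two_n, n_cells):
    # enumerate all allocations; keep those with an even count in every cell
    hits = Counter()
    for alloc in itertools.product(range(n_cells), repeat=two_n):
        counts = tuple(alloc.count(c) for c in range(n_cells))
        if all(c % 2 == 0 for c in counts):
            hits[counts] += 1
    total = sum(hits.values())
    return {counts: h / total for counts, h in hits.items()}

def via_independent_variables(two_n, n_cells, beta):
    # conditional law of (xi'_1, ..., xi'_N) given xi'_1 + ... + xi'_N = two_n
    def pmf(two_k):
        return beta ** two_k / (math.factorial(two_k) * math.cosh(beta))
    weights = {}
    for ks in itertools.product(range(0, two_n + 1, 2), repeat=n_cells):
        if sum(ks) == two_n:
            weights[ks] = math.prod(pmf(k) for k in ks)
    total = sum(weights.values())
    return {ks: w / total for ks, w in weights.items()}

bf = brute_force(6, 3)
iv = via_independent_variables(6, 3, 1.7)
for counts in sorted(bf):
    print(counts, bf[counts], iv[counts])
```

The conditional law does not depend on β, since the factor β^{2n} / ch(β)^N cancels in the normalisation; this is why β can later be chosen so that 2n/N = m(β).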

Since

ξ_{i}^{'} (β) = 2 ξ_{i}^{*} (β),

the quantities

m (β) = E ξ_{i}^{'} (β) = 2 E ξ_{i}^{*} (β), σ^{2} (β) = E {(ξ_{i}^{'} (β) - m (β))}^{2} = 4 σ^{* 2} (β)

(42)

are the expectation and the variance of

ξ_{i}^{'} (β)

. Let

S_{N} = \sum_{i = 1}^{N} ξ_{i}^{'}

be the sum of our random variables. We need the following corollary of Lemma 5.

Corollary 2.

Consider the homogeneous allocation scheme. Let

β \to \infty

. Then, we have

P {S_{N} = 2 k} = \frac{2}{\sqrt{2 π N} σ (β)} (exp (- \frac{{(2 k - N m (β))}^{2}}{2 N σ^{2} (β)}) (1 + o (1)))

(43)

as

N \to \infty

uniformly for

\frac{| 2 k - N m (β) |}{\sqrt{N} σ (β)} < C

for any

C > 0

.

Proof of Theorem 4.

First, we give a detailed proof for the two-dimensional distributions; then, we sketch the proof for the arbitrary finite dimensional distributions.

Let

- \infty < b_{i 1} < b_{i 2} < \infty

,

i \in {1, 2}

. Choose

C > 0

, such that

- C < b_{i 1} < b_{i 2} < C

,

i \in {1, 2}

. Let

β

be such that

\frac{2 n}{N} = m (β)

. From (42) and Lemma 2, it follows that

m (β) = β \frac{e^{β} - e^{- β}}{e^{β} + e^{- β}} = β th (β)

(44)

and

σ^{2} (β) = β (1 - \frac{2 e^{- β}}{e^{β} + e^{- β}} + β \frac{4}{{(e^{β} + e^{- β})}^{2}}) .

Since

f (β) = th (β) = \frac{e^{β} - e^{- β}}{e^{β} + e^{- β}}

,

β \geq 0

is a bounded function, from (44) and from condition

\frac{2 n}{N} \to \infty

, we obtain that

β \to \infty

. Therefore,

σ^{2} (β) = \frac{2 n}{N} (1 + o (1))

as

\frac{2 n}{N} \to \infty

. Condition

\frac{2 n}{N} \to \infty

implies that

\frac{n}{[t_{1} N]} \to \infty, \frac{n}{[t_{2} N] - [t_{1} N]} \to \infty, \frac{n}{N - [t_{2} N]} \to \infty .
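The parameter β is defined only implicitly by 2n/N = m(β) = β th(β). A minimal sketch of how it could be computed is given below; it uses bisection, which is a choice made here and not a method from the paper. The variance is evaluated as four times the middle expression in (25), rewritten with th(β) to avoid overflow.

```python
import math

def solve_beta(target, iterations=200):
    # solve beta * tanh(beta) = target for beta > 0 by bisection;
    # beta * tanh(beta) is increasing on (0, infinity)
    if target <= 0.0:
        raise ValueError("target must be positive")
    lo, hi = 0.0, target + 1.0  # (target + 1) * tanh(target + 1) >= target
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid * math.tanh(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sigma2(beta):
    # sigma^2(beta) = 4 sigma*^2(beta) = beta th(beta) + beta^2 (1 - th^2(beta))
    th = math.tanh(beta)
    return beta * th + beta * beta * (1.0 - th * th)

for ratio_2n_over_n in (0.5, 2.0, 10.0, 50.0):
    beta = solve_beta(ratio_2n_over_n)
    print(ratio_2n_over_n, beta, sigma2(beta) / ratio_2n_over_n)
```

The last printed column approaches 1 as 2n/N grows, which is the relation σ²(β) = (2n/N)(1 + o(1)) used in the proof.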

Consequently, from Corollary 2, it follows that

\begin{matrix} P {S_{N} = 2 n} & = \frac{2}{\sqrt{2 π N} σ (β)} exp (- \frac{{(2 n - N m (β))}^{2}}{2 N σ^{2} (β)}) (1 + o (1)) \\ & = \frac{2}{\sqrt{4 π n}} (1 + o (1)), \end{matrix}

(45)

uniformly for

\frac{| 2 n - N m (β) |}{2 \sqrt{N} σ (β)} < C

. Similarly,

\begin{matrix} P {S_{[N t_{1}]} = 2 k_{1}} & = \frac{2 exp (- \frac{{(2 k_{1} - [N t_{1}] m (β))}^{2}}{2 [N t_{1}] σ^{2} (β)})}{\sqrt{2 π [N t_{1}]} σ (β)} (1 + o (1)) \\ & = \frac{2 exp (- \frac{{(2 k_{1} - 2 n \frac{[t_{1} N]}{N})}^{2}}{4 n t_{1}})}{\sqrt{4 π n t_{1}}} (1 + o (1)) \end{matrix}

(46)

uniformly for

\frac{|2 k_{1} - 2 n \frac{[t_{1} N]}{N}|}{\sqrt{2 n t_{1}}} < C

, and

\begin{matrix} P {S_{[N t_{2}] - [N t_{1}]} = 2 k_{2}} & = \frac{2 exp (- \frac{{(2 k_{2} - ([N t_{2}] - [N t_{1}]) m (β))}^{2}}{2 ([N t_{2}] - [N t_{1}]) σ^{2} (β)})}{\sqrt{2 π ([N t_{2}] - [N t_{1}])} σ (β)} (1 + o (1)) \\ & = \frac{2 exp (- \frac{{(2 k_{2} - 2 n \frac{[t_{2} N] - [t_{1} N]}{N})}^{2}}{4 n (t_{2} - t_{1})})}{\sqrt{4 π n (t_{2} - t_{1})}} (1 + o (1)) \end{matrix}

(47)

uniformly for

\frac{|2 k_{2} - 2 n \frac{[t_{2} N] - [t_{1} N]}{N}|}{\sqrt{2 n (t_{2} - t_{1})}} < C

. Since

\frac{| 2 k_{1} + 2 k_{2} - 2 n \frac{[t_{2} N]}{N} |}{\sqrt{4 n (1 - t_{2})}} \leq \sqrt{\frac{t_{1}}{1 - t_{2}}} \frac{| 2 k_{1} - 2 n \frac{[t_{1} N]}{N} |}{\sqrt{4 n t_{1}}} + \sqrt{\frac{t_{2} - t_{1}}{1 - t_{2}}} \frac{| 2 k_{2} - 2 n \frac{[t_{2} N] - [t_{1} N]}{N} |}{\sqrt{4 n (t_{2} - t_{1})}},

we have

P {S_{N - [N t_{2}]} = 2 n - 2 k_{1} - 2 k_{2}}

= \frac{2 exp (- \frac{{(2 n - (2 k_{1} + 2 k_{2}) - (N - [N t_{2}]) m (β))}^{2}}{2 (N - [N t_{2}]) σ^{2} (β)})}{\sqrt{2 π (N - [N t_{2}])} σ (β)} (1 + o (1)) = \frac{2 exp (- \frac{{(2 k_{1} + 2 k_{2} - 2 n \frac{[t_{2} N]}{N})}^{2}}{4 n (1 - t_{2})})}{\sqrt{4 π n (1 - t_{2})}} (1 + o (1))

(48)

uniformly for

\frac{|2 k_{1} - 2 n \frac{[t_{1} N]}{N}|}{\sqrt{2 n t_{1}}} < C

,

\frac{|2 k_{2} - 2 n \frac{[t_{2} N] - [t_{1} N]}{N}|}{\sqrt{2 n (t_{2} - t_{1})}} < C

.

For brevity, let

A = \frac{[t_{1} N]}{N}

,

B = \frac{[t_{2} N] - [t_{1} N]}{N}

. Using Equations (45)–(48) to approximate the probabilities in (16), and applying the definition of

Σ

from (17), we obtain

P {X_{2 n, N} (t_{1}) = 2 k_{1}, X_{2 n, N} (t_{2}) - X_{2 n, N} (t_{1}) = 2 k_{2}}

= \frac{P {S_{[N t_{1}]} = 2 k_{1}} P {S_{[N t_{2}] - [N t_{1}]} = 2 k_{2}} P {S_{N - [N t_{2}]} = 2 n - 2 k_{1} - 2 k_{2}}}{P {S_{N} = 2 n}}

= \frac{\frac{2 exp (- \frac{{(2 k_{1} - 2 n \frac{[t_{1} N]}{N})}^{2}}{4 n t_{1}})}{\sqrt{4 π n t_{1}}} \frac{2 exp (- \frac{{(2 k_{2} - 2 n \frac{[t_{2} N] - [t_{1} N]}{N})}^{2}}{4 n (t_{2} - t_{1})})}{\sqrt{4 π n (t_{2} - t_{1})}} \frac{2 exp (- \frac{{(2 k_{1} + 2 k_{2} - 2 n \frac{[t_{2} N]}{N})}^{2}}{4 n (1 - t_{2})})}{\sqrt{4 π n (1 - t_{2})}}}{\frac{2}{\sqrt{4 π n}} (1 + o (1))}

= \frac{4 exp (- \frac{1}{2} (\frac{{(2 k_{1} - 2 n A)}^{2} (1 - (t_{2} - t_{1}))}{t_{1} (1 - t_{2}) 2 n} + \frac{{(2 k_{2} - 2 n B)}^{2} (1 - t_{1})}{(t_{2} - t_{1}) (1 - t_{2}) 2 n} + 2 \frac{(2 k_{1} - 2 n A) (2 k_{2} - 2 n B)}{(1 - t_{2}) 2 n}))}{2 π 2 n \sqrt{| Σ |} (1 + o (1))}

= \frac{2}{n} \cdot \frac{exp (- \frac{1}{2} (\frac{2 k_{1} - 2 n A}{\sqrt{2 n}}, \frac{2 k_{2} - 2 n B}{\sqrt{2 n}}) Σ^{- 1} (\begin{matrix} \frac{2 k_{1} - 2 n A}{\sqrt{2 n}} \\ \frac{2 k_{2} - 2 n B}{\sqrt{2 n}} \end{matrix})) (1 + o (1))}{2 π \sqrt{| Σ |}}

(49)

uniformly for

\frac{|2 k_{1} - 2 n \frac{[t_{1} N]}{N}|}{\sqrt{2 n t_{1}}} < C

,

\frac{|2 k_{2} - 2 n \frac{[t_{2} N] - [t_{1} N]}{N}|}{\sqrt{2 n (t_{2} - t_{1})}} < C

.

From (49) and (4), using the same argument as in the proof of the de Moivre–Laplace theorem, we obtain

P \{b_{11} < Y_{2 n, N} (t_{1}) < b_{12}, b_{21} < Y_{2 n, N} (t_{2}) - Y_{2 n, N} (t_{1}) < b_{22}\}

= \sum_{\begin{matrix} b_{11} < \frac{2 k_{1} - 2 n A}{\sqrt{2 n}} < b_{12}, \\ b_{21} < \frac{2 k_{2} - 2 n B}{\sqrt{2 n}} < b_{22} \end{matrix}} P {X_{2 n, N} (t_{1}) = 2 k_{1}, X_{2 n, N} (t_{2}) - X_{2 n, N} (t_{1}) = 2 k_{2}}

= \sum_{\begin{matrix} b_{11} < \frac{2 k_{1} - 2 n A}{\sqrt{2 n}} < b_{12}, \\ b_{21} < \frac{2 k_{2} - 2 n B}{\sqrt{2 n}} < b_{22} \end{matrix}} \frac{2}{n} \cdot \frac{exp (- \frac{1}{2} (\frac{2 k_{1} - 2 n A}{\sqrt{2 n}}, \frac{2 k_{2} - 2 n B}{\sqrt{2 n}}) Σ^{- 1} (\begin{matrix} \frac{2 k_{1} - 2 n A}{\sqrt{2 n}} \\ \frac{2 k_{2} - 2 n B}{\sqrt{2 n}} \end{matrix}))}{2 π \sqrt{| Σ |} (1 + o (1))}

= \int_{b_{11}}^{b_{12}} \int_{b_{21}}^{b_{22}} \frac{1}{2 π \sqrt{| Σ |}} exp (- \frac{1}{2} (x, y) Σ^{- 1} (\begin{matrix} x \\ y \end{matrix})) d x d y (1 + o (1))

= P {b_{11} < W_{0} (t_{1}) < b_{12}, b_{21} < W_{0} (t_{2}) - W_{0} (t_{1}) < b_{22}} (1 + o (1)) .

Thus, the two-dimensional distributions of

Y_{2 n, N}

converge to the two-dimensional distributions of

W_{0}

.

Now, we sketch the proof for the l-dimensional distributions. Let

0 < t_{1} < t_{2} < \dots < t_{l} < 1

. Then,

P {X_{2 n, N} (t_{1}) = 2 k_{1}, X_{2 n, N} (t_{2}) - X_{2 n, N} (t_{1}) = 2 k_{2}, \dots, X_{2 n, N} (t_{l}) - X_{2 n, N} (t_{l - 1}) = 2 k_{l}}

= \frac{P {S_{[N t_{1}]} = 2 k_{1}} P {S_{[N t_{2}] - [N t_{1}]} = 2 k_{2}} \dots P {S_{[N t_{l}] - [N t_{l - 1}]} = 2 k_{l}} P {S_{N - [N t_{l}]} = 2 n - 2 (k_{1} + \dots + k_{l})}}{P {S_{N} = 2 n}}

= \frac{\frac{2 exp (- \frac{{(2 k_{1} - 2 n \frac{[t_{1} N]}{N})}^{2}}{4 n t_{1}})}{\sqrt{4 π n t_{1}}} \prod_{j = 2}^{l} \frac{2 exp (- \frac{{(2 k_{j} - 2 n \frac{[t_{j} N] - [t_{j - 1} N]}{N})}^{2}}{4 n (t_{j} - t_{j - 1})})}{\sqrt{4 π n (t_{j} - t_{j - 1})}} \frac{2 exp (- \frac{{(2 k_{1} + \dots + 2 k_{l} - 2 n \frac{[t_{l} N]}{N})}^{2}}{4 n (1 - t_{l})})}{\sqrt{4 π n (1 - t_{l})}}}{\frac{2}{\sqrt{4 π n}} (1 + o (1))}

= {(\frac{2}{\sqrt{2 n}})}^{l} \frac{1}{{(2 π)}^{l / 2} \sqrt{\prod_{r = 1}^{l + 1} (t_{r} - t_{r - 1})}} exp (\frac{- U (x)}{2}) (1 + o (1)),

(50)

where

t_{0} = 0

,

t_{l + 1} = 1

, and the quadratic form

U (x)

has the following shape

U (x) = \frac{x_{1}^{2}}{b_{1}} + \dots + \frac{x_{l}^{2}}{b_{l}} + \frac{{(x_{1} + \dots + x_{l})}^{2}}{1 - (b_{1} + \dots + b_{l})} .

Here, we used the notation

x_{j} = \frac{2 k_{j} - 2 n \frac{[t_{j} N] - [t_{j - 1} N]}{N}}{\sqrt{2 n}}, b_{j} = t_{j} - t_{j - 1}, j = 1, 2, \dots, l .

(51)

Now, by some algebra, we see that

U (x) = \frac{\sum_{j = 1}^{l} (x_{j}^{2} (1 - \sum_{i \neq j} b_{i}) \prod_{i \neq j} b_{i}) + \sum_{i \neq j} x_{i} x_{j} \prod_{r = 1}^{l} b_{r}}{(1 - \sum_{r = 1}^{l} b_{r}) \prod_{r = 1}^{l} b_{r}} .

We need the

l \times l

matrix

D = [\begin{matrix} b_{1} (1 - b_{1}) & - b_{1} b_{2} & - b_{1} b_{3} & \dots & - b_{1} b_{l} \\ - b_{2} b_{1} & b_{2} (1 - b_{2}) & - b_{2} b_{3} & \dots & - b_{2} b_{l} \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ - b_{l} b_{1} & - b_{l} b_{2} & - b_{l} b_{3} & \dots & b_{l} (1 - b_{l}) \end{matrix}],

and its inverse

D^{- 1} = \frac{1}{(1 - \sum_{r = 1}^{l} b_{r}) \prod_{r = 1}^{l} b_{r}} [\begin{matrix} (1 - \sum_{r \neq 1} b_{r}) \prod_{r \neq 1} b_{r} & \prod_{r = 1}^{l} b_{r} & \prod_{r = 1}^{l} b_{r} & \dots & \prod_{r = 1}^{l} b_{r} \\ \prod_{r = 1}^{l} b_{r} & (1 - \sum_{r \neq 2} b_{r}) \prod_{r \neq 2} b_{r} & \prod_{r = 1}^{l} b_{r} & \dots & \prod_{r = 1}^{l} b_{r} \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ \prod_{r = 1}^{l} b_{r} & \prod_{r = 1}^{l} b_{r} & \prod_{r = 1}^{l} b_{r} & \dots & (1 - \sum_{r \neq l} b_{r}) \prod_{r \neq l} b_{r} \end{matrix}] .

We can see that we obtain the covariance matrix of the increments

(W_{0} (t_{1}) - W_{0} (0), W_{0} (t_{2}) - W_{0} (t_{1}), \dots, W_{0} (t_{l}) - W_{0} (t_{l - 1}))

of the Brownian bridge

W_{0}

if we insert

b_{j} = t_{j} - t_{j - 1}

,

j = 1, 2, \dots, l

, into the matrix D. Denote this matrix by

Σ

(for any fixed value of l). The determinant of D is

(1 - \sum_{r = 1}^{l} b_{r}) \prod_{r = 1}^{l} b_{r}

, so the determinant of

Σ

is

\prod_{r = 1}^{l + 1} (t_{r} - t_{r - 1})

. We can also check that the matrix of the quadratic form

U (x)

is

D^{- 1}

.
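The claims about D, its inverse and its determinant can be verified exactly for a concrete choice of time points. The sketch below uses rational arithmetic; the time points t = (1/5, 1/2, 7/10) and all function names are assumptions introduced for the example.

```python
from fractions import Fraction

def prod(values):
    out = Fraction(1)
    for v in values:
        out *= v
    return out

def matrix_d(b):
    # D has diagonal entries b_i (1 - b_i) and off-diagonal entries -b_i b_j
    l = len(b)
    return [[b[i] * (1 - b[i]) if i == j else -b[i] * b[j] for j in range(l)]
            for i in range(l)]

def claimed_inverse(b):
    # the inverse of D as displayed in the text
    l = len(b)
    denom = (1 - sum(b)) * prod(b)
    inv = []
    for i in range(l):
        row = []
        for j in range(l):
            if i == j:
                others = [b[r] for r in range(l) if r != i]
                row.append((1 - sum(others)) * prod(others) / denom)
            else:
                row.append(prod(b) / denom)
        inv.append(row)
    return inv

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def determinant(m):
    # exact Gaussian elimination over the rationals
    m = [row[:] for row in m]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

t = [Fraction(0), Fraction(1, 5), Fraction(1, 2), Fraction(7, 10)]  # t_0 = 0
b = [t[j] - t[j - 1] for j in range(1, len(t))]                     # b_j = t_j - t_{j-1}
d = matrix_d(b)
print(matmul(d, claimed_inverse(b)))
print(determinant(d), (1 - sum(b)) * prod(b))
```

With these values, 1 - Σ b_r = 1 - t_l, so the determinant equals the product of all l + 1 increments t_r - t_{r-1}, as stated in the text.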

So, by using the above considerations, Equation (50) gives

\begin{matrix} P {X_{2 n, N} (t_{1}) = 2 k_{1}, X_{2 n, N} (t_{2}) - X_{2 n, N} (t_{1}) = 2 k_{2}, \dots, X_{2 n, N} (t_{l}) - X_{2 n, N} (t_{l - 1}) = 2 k_{l}} \\ = {(\frac{2}{\sqrt{2 n}})}^{l} \frac{1}{{(2 π)}^{l / 2} \sqrt{det Σ}} exp (\frac{- x^{⊤} Σ^{- 1} x}{2}) (1 + o (1)), \end{matrix}

(52)

where

x = {(x_{1}, x_{2}, \dots, x_{l})}^{⊤}

and

x_{j}

is defined in (51). This implies that the finite-dimensional distributions of

Y_{2 n, N}

converge to the finite dimensional distributions of

W_{0}

. □

Remark 1.

Relation (52) is a local limit theorem for the random allocation.

6. Discussion

The random allocation of particles into cells is a well-known model in probability theory. There are limit theorems when either the number of particles or the number of cells or both of them tend to infinity, see [1]. The errors in the blocks of a binary file can be modelled as a random allocation. However, if the parity bits are used, then any odd number of errors in the blocks is always detected, but an even number of errors is never detected.

Therefore, describing the behaviour of the allocation model is an interesting problem when there is an even number of particles in each cell.
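The behaviour of the parity check described above can be illustrated with a few lines of code. This is a simplified sketch under the assumption that the stored parity bit itself is not corrupted; the block contents and the function names are invented for the example.

```python
def parity_bit(block):
    # even-parity bit: 1 if the block contains an odd number of ones, else 0
    return sum(block) % 2

def flip(block, positions):
    # return a copy of the block with the bits at the given positions inverted
    out = list(block)
    for p in positions:
        out[p] ^= 1
    return out

def error_detected(sent_block, received_block):
    # the receiver recomputes the parity and compares it with the stored one
    return parity_bit(sent_block) != parity_bit(received_block)

block = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical data block
for positions in ([2], [2, 5], [0, 3, 7], [0, 1, 2, 3]):
    print(len(positions), error_detected(block, flip(block, positions)))
```

Blocks with one or three flipped bits are flagged, whereas two or four flipped bits pass the check, which is the situation modelled by an even number of particles in each cell.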

In this paper, we consider the numbers of particles in cells when we allocate

2 n

distinguishable particles into N distinct cells having an even number of particles in each cell. For the non-homogeneous case, we study the numbers of particles in the first K cells. We were able to prove the asymptotic normality of this K-dimensional random vector when

n, N \to \infty

. For the homogeneous allocation model, we proved convergence to the finite-dimensional distributions of the Brownian bridge as

n, N \to \infty

. To handle the mathematical problem, we inserted our model into the framework of Kolchin’s generalized allocation scheme. Using the above limit theorems, we obtained two

χ^{2}

-tests. As the parity bit method does not detect any even number of errors in the blocks of a binary file, we suggest applying our model to study the distribution of errors in that file.

Author Contributions

Conceptualization, A.N.C.; methodology, A.N.C.; software, I.F.; formal analysis, A.N.C. and I.F.; investigation, A.N.C. and I.F.; writing—original draft preparation, A.N.C.; writing—review and editing, I.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Acknowledgments

The authors would like to thank the referees for the helpful remarks.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Kolchin, V.F.; Sevast’yanov, B.A.; Chistiakov, V.P. Random Allocations; Scripta Series in Mathematics; V. H. Winston & Sons: Washington, DC, USA, 1978.
2. Chikrin, D.E.; Chuprunov, A.N.; Kokunin, P.A. Gaussian limit theorems for the number of given value cells in the non-homogeneous generalized allocation scheme. J. Math. Sci. 2020, 246, 476–487.
3. Billingsley, P. Convergence of Probability Measures; Wiley: New York, NY, USA, 1968.
4. Kolchin, V.F. A class of limit theorems for conditional distributions. Lith. Math. J. 1968, 8, 53–63. (In Russian)
5. Kolchin, V.F. Random Graphs; Cambridge University Press: Cambridge, UK, 1998.
6. Abdushukurov, F.A.; Chuprunov, A.N. Poisson limit theorems in an allocation scheme with even number of particles in each cell. Lobachevskii J. Math. 2020, 41, 289–297.
7. Timashev, A.M. Asymptotic Expansions in Probabilistic Combinatorics; TVP Science Publishers: Moscow, Russia, 2011. (In Russian)

Figure 1. The histograms of the first and the second coordinates in Example 3. (a) First coordinate; (b) Second coordinate.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
