Article

On the Numbers of Particles in Cells in an Allocation Scheme Having an Even Number of Particles in Each Cell

by Alexey Nikolaevich Chuprunov 1 and István Fazekas 2,*
1 Faculty of Applied Mathematics, Physics and Information Technology, Chuvash State University, Universitetskaia Str. 38, 428015 Cheboksary, Russia
2 Faculty of Informatics, University of Debrecen, Egyetem Square 1, 4032 Debrecen, Hungary
* Author to whom correspondence should be addressed.
Mathematics 2022, 10(7), 1099; https://doi.org/10.3390/math10071099
Submission received: 25 January 2022 / Revised: 16 March 2022 / Accepted: 21 March 2022 / Published: 29 March 2022
(This article belongs to the Special Issue Random Combinatorial Structures)

Abstract:
We consider the usual random allocation model of distinguishable particles into distinct cells in the case when there are an even number of particles in each cell. For inhomogeneous allocations, we study the numbers of particles in the first K cells. We prove that, under some conditions, this K-dimensional random vector with centralised and normalised coordinates converges in distribution to the K-dimensional standard Gaussian law. We obtain both local and integral versions of this limit theorem. The above limit theorem implies a χ 2 limit theorem which leads to a χ 2 -test. The parity bit method does not detect even numbers of errors in binary files; therefore, our model can be applied to describe the distribution of errors in those files. For the homogeneous allocation model, we obtain a limit theorem when both the number of particles and the number of cells tend to infinity. In that case, we prove convergence to the finite dimensional distributions of the Brownian bridge. This result also implies a χ 2 -test. To handle the mathematical problem, we insert our model into the framework of Kolchin’s generalized allocation scheme.

1. Introduction and Notation

In this paper, we study the usual random allocation model.
The random variables $\eta_1,\dots,\eta_N$ represent a non-homogeneous allocation scheme of $n$ distinguishable particles into $N$ distinct cells if their joint distribution has the form
$$P\{\eta_1=k_1,\dots,\eta_N=k_N\}=\frac{n!}{k_1!k_2!\cdots k_N!}\,q_1^{k_1}q_2^{k_2}\cdots q_N^{k_N},\tag{1}$$
where $k_1,k_2,\dots,k_N$ are non-negative integers with $k_1+k_2+\dots+k_N=n$, $q_1+q_2+\dots+q_N=1$, and $0\le q_i\le 1$ for $1\le i\le N$. Here, $q_i$ is the probability that a particle is inserted into the $i$th cell, and the random variable $\eta_i$ is the number of particles in the $i$th cell after allocating the $n$ particles. When $q_1=q_2=\dots=q_N=\frac{1}{N}$, scheme (1) is called a homogeneous allocation scheme of $n$ distinguishable particles into $N$ distinct cells. In [1], homogeneous and non-homogeneous allocation schemes of $n$ distinguishable particles into $N$ distinct cells were considered.
Our goal is to study allocations with an even number of particles in each cell. Thus, let $A_2=\{2k:\ k=0,1,2,\dots\}$ be the set of even non-negative integers; let $\eta_1,\dots,\eta_N$ be the allocation scheme of $2n$ distinguishable particles into $N$ different cells; and let $\bar\eta_1,\dots,\bar\eta_N$ be the allocation scheme of $2n$ distinguishable particles into $N$ different cells with an even number of particles in each cell. Then, $\bar\eta_1,\dots,\bar\eta_N$ has the distribution
$$P\{\bar\eta_1=2k_1,\dots,\bar\eta_N=2k_N\}=P\left\{\eta_1=2k_1,\dots,\eta_N=2k_N\,\middle|\,\eta_i\in A_2,\ 1\le i\le N\right\},\tag{2}$$
where $k_1,k_2,\dots,k_N$ are non-negative integers such that $2k_1+2k_2+\dots+2k_N=2n$.
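For intuition, the conditioned scheme above can be simulated by naive rejection sampling: draw ordinary allocations of the $2n$ particles and keep only those in which every cell count is even. A minimal sketch (the cell probabilities, seed, and sizes below are illustrative choices, not taken from the paper):

```python
import random

def even_allocation(n2, probs, rng):
    """Draw one allocation of n2 (even) particles into len(probs) cells,
    conditioned on every cell holding an even number of particles,
    via simple rejection sampling."""
    cells = list(range(len(probs)))
    while True:
        counts = [0] * len(probs)
        for _ in range(n2):
            counts[rng.choices(cells, weights=probs)[0]] += 1
        if all(c % 2 == 0 for c in counts):
            return counts

rng = random.Random(1)
sample = even_allocation(20, [0.25] * 4, rng)
print(sample, sum(sample))
```

This brute-force sampler is practical only for small $N$, since the acceptance probability decays roughly like $2^{-(N-1)}$.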
To describe the results of the paper, we need the following notation. $\xrightarrow{d}$ denotes convergence in distribution. $\gamma,\gamma_i$, $i\in\mathbb{N}$, are independent, identically distributed Gaussian random variables with mean 0 and variance 1.
In [2], it was proved that
$$\left(\frac{\eta_1-nq_1}{\sqrt{nq_1}},\dots,\frac{\eta_K-nq_K}{\sqrt{nq_K}}\right)\xrightarrow{d}(\gamma_1,\dots,\gamma_K),$$
if $K$ is a fixed number and $N,n\to\infty$ such that $q_i\to 0$ and $nq_i\to\infty$ for $1\le i\le N$.
The first aim of this paper is to obtain an analogue of the above result for an allocation scheme of distinguishable particles into distinct cells having an even number of particles in each cell. We shall prove that, under some conditions,
$$\left(\frac{\bar\eta_1-2nq_1}{\sqrt{2nq_1}},\frac{\bar\eta_2-2nq_2}{\sqrt{2nq_2}},\dots,\frac{\bar\eta_K-2nq_K}{\sqrt{2nq_K}}\right)\xrightarrow{d}(\gamma_1,\gamma_2,\dots,\gamma_K)$$
as $N,n\to\infty$; see Theorems 2 and 3.
A well-known fact is that the polynomial distribution (1) is asymptotically normal when $N$ is fixed and $n\to\infty$. This result serves as a basis of the proof that the limit of the empirical process is the Brownian bridge, see [3]. In this paper, we shall study this problem for allocations having an even number of particles in each cell. Here, we introduce the following two random processes:
$$X_{2n,N}(t)=\sum_{i=1}^{[tN]}\bar\eta_i,\quad 0\le t\le 1,\tag{3}$$
and
$$Y_{2n,N}(t)=\frac{1}{\sqrt{2n}}\left(\sum_{i=1}^{[tN]}\bar\eta_i-[tN]\frac{2n}{N}\right)=\frac{1}{\sqrt{2n}}\left(X_{2n,N}(t)-[tN]\frac{2n}{N}\right),\quad 0\le t\le 1.\tag{4}$$
Observe that $X_{2n,N}(0)=0$, $X_{2n,N}(1)=2n$, and $Y_{2n,N}(0)=Y_{2n,N}(1)=0$.
The Gaussian random process $W^0(t)$, $0\le t\le 1$, is called a Brownian bridge if its mean value function is 0 and its correlation function is $f(t_1,t_2)=t_1(1-t_2)$, $0\le t_1\le t_2\le 1$.
For the homogeneous allocation scheme, we shall prove that the finite dimensional distributions of $Y_{2n,N}$ converge to the finite dimensional distributions of $W^0$ if $N,n\to\infty$ such that $2n/N\to\infty$; see Theorem 4.
Both Theorems 3 and 4 imply χ 2 -tests.
Our mathematical approach is based on the well-known notion of the generalized allocation scheme introduced by V. F. Kolchin in [4]. Thus, we recall the definition of the generalized allocation scheme. The random variables $\eta_1,\dots,\eta_N$ obey the generalized allocation scheme of $n$ particles into $N$ cells if their joint distribution has the form
$$P\{\eta_1=k_1,\dots,\eta_N=k_N\}=P\left\{\xi_1=k_1,\dots,\xi_N=k_N\,\middle|\,\sum_{i=1}^N\xi_i=n\right\},\tag{5}$$
for non-negative integers $k_1,k_2,\dots,k_N$ such that $k_1+k_2+\dots+k_N=n$ and for some independent non-negative integer valued random variables $\xi_1,\xi_2,\dots,\xi_N$.
The simplest particular case of the generalized allocation scheme is the usual allocation of particles into cells. If $\xi_1,\xi_2,\dots,\xi_N$ are independent Poisson random variables with parameters $\alpha q_1,\alpha q_2,\dots,\alpha q_N$ for some $\alpha>0$ and $\sum_{i=1}^N q_i=1$, then the generalized allocation scheme is the usual allocation scheme of $n$ distinguishable particles into $N$ different cells. In other words, a generalized allocation scheme defined by independent Poisson random variables $\xi_1,\dots,\xi_N$ with parameters $\alpha_1,\alpha_2,\dots,\alpha_N$ is the usual allocation scheme of $n$ distinguishable particles into $N$ different cells with $q_i=\alpha_i/\sum_{j=1}^N\alpha_j$, $1\le i\le N$. Thus, in a certain general sense, we can consider the value $\xi_i$ in Equation (5) as the number of particles in the $i$th cell.
In the original paper [4], Kolchin obtained the basic properties of the generalized allocation scheme; moreover, he proved limit theorems for the number of cells containing precisely r particles. In Equation (5), the distribution of the random variable ξ 1 can be arbitrary. Fixing its distribution in various ways, several models of discrete probability theory, such as random forests, random permutations, random allocations, and urn schemes are obtained as particular cases of the generalized allocation scheme, see [5].
In our paper, we shall not use known limit theorems for the generalized allocation scheme; we shall only use the representation (7), which is a certain consequence of the generalized allocation scheme. To this end, we shall show that, when there is an even number of particles in each cell, the usual allocation can be described by a generalized allocation scheme in the following way. Let $\mathrm{ch}(t)=\frac{e^t+e^{-t}}{2}$, $t\in\mathbb{R}$, be the hyperbolic cosine function.
Theorem 1.
Let $\eta_1,\dots,\eta_N$ be a generalized allocation scheme of $2n$ particles into $N$ cells defined by independent Poisson random variables $\xi_1,\dots,\xi_N$ with parameters $\beta_1,\dots,\beta_N$. Then, $\bar\eta_1,\dots,\bar\eta_N$ defined by (2) can be represented as a generalized allocation scheme of $2n$ particles into $N$ cells defined by the independent random variables $\bar\xi_1,\dots,\bar\xi_N$ with distributions
$$P\{\bar\xi_i=2k\}=\frac{\beta_i^{2k}}{(2k)!\,\mathrm{ch}(\beta_i)},\quad k=0,1,2,\dots,\ 1\le i\le N.\tag{6}$$
That is,
$$P\{\bar\eta_1=2k_1,\dots,\bar\eta_N=2k_N\}=P\left\{\bar\xi_1=2k_1,\dots,\bar\xi_N=2k_N\,\middle|\,\sum_{i=1}^N\bar\xi_i=2n\right\},$$
for non-negative integers $k_1,k_2,\dots,k_N$ such that $k_1+k_2+\dots+k_N=n$.
For identically distributed random variables ξ 1 , , ξ N , Theorem 1 was proved in [6]. One can prove Theorem 1 using similar elementary calculations as in the proof given in [6].
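As a numerical sanity check of Theorem 1's building block: the distribution $P\{\bar\xi=2k\}=\beta^{2k}/((2k)!\,\mathrm{ch}(\beta))$ coincides with the law of a Poisson($\beta$) variable conditioned to take even values. A sketch ($\beta=3$ is an arbitrary test value):

```python
import math

beta = 3.0
ch = math.cosh(beta)

# pmf of the even-valued variable of Theorem 1: P{xi_bar = 2k}
pmf_bar = lambda k: beta ** (2 * k) / (math.factorial(2 * k) * ch)

# Poisson pmf and P{pi(beta) is even} (analytically e^{-beta} ch(beta))
pois = lambda j: beta ** j * math.exp(-beta) / math.factorial(j)
p_even = sum(pois(2 * k) for k in range(60))
assert abs(p_even - math.exp(-beta) * ch) < 1e-12

# conditional Poisson probabilities match the pmf, and the pmf sums to 1
for k in range(10):
    assert abs(pmf_bar(k) - pois(2 * k) / p_even) < 1e-12
assert abs(sum(pmf_bar(k) for k in range(60)) - 1.0) < 1e-12
print("Theorem 1 pmf check: ok")
```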
From Theorem 1 and (5), it follows that
$$P\{\bar\eta_1=2k_1,\dots,\bar\eta_K=2k_K\}=\prod_{i=1}^{K}P\{\bar\xi_i=2k_i\}\,\frac{P\left\{\sum_{i=K+1}^{N}\bar\xi_i=2n-2k\right\}}{P\left\{\sum_{i=1}^{N}\bar\xi_i=2n\right\}}=\prod_{i=1}^{K}P\{\bar\xi_i=2k_i\}\,\frac{P\left\{\sum_{i=K+1}^{N}\xi_i^*=n-k\right\}}{P\left\{\sum_{i=1}^{N}\xi_i^*=n\right\}},\tag{7}$$
where $K\le N$, $k=k_1+k_2+\dots+k_K$, and the independent random variables $\xi_1^*,\dots,\xi_N^*$ have the distributions
$$P\{\xi_i^*=k\}=\frac{\beta_i^{2k}}{(2k)!\,\mathrm{ch}(\beta_i)},\quad k=0,1,2,\dots,\ 1\le i\le N.$$
Equation (7) plays a crucial role in our paper. The proof of Theorem 2 will be based on approximations of the fraction and of the factors in (7).
The structure of our paper is as follows. In Section 2, further notation is given and the main results are presented. Theorem 2 is the integral version of the central limit theorem for the allocation scheme when each cell contains an even number of particles. Theorem 2 is given in terms of the generalized allocation scheme, but the underlying distribution is the Poisson distribution, so the result concerns the usual allocation scheme. However, the general setting is important because the proof uses the general framework given in Theorem 1. Corollary 1 is the local version of Theorem 2. Theorem 3 is a variant of Theorem 2; for practical applications, Theorem 3 is more convenient. Then, we turn to the homogeneous case and present Theorem 4, which states the convergence of the finite dimensional distributions to those of the Brownian bridge. In Section 3, two $\chi^2$-tests are proposed. The first one tests the probabilities $q_1,\dots,q_N$ when the sample comes from the random allocation with an even number of particles in each cell. Then, we give a proposal to apply the $\chi^2$-test to check binary files with parity bits. The second $\chi^2$-test can be applied when we have observations only for the numbers of particles in some unions of the cells. Examples 3 and 4 offer numerical evidence for our limit theorems. In Section 4, some auxiliary results are given. In Section 5, the proofs of the main results are presented. For the proofs, we use both known approximation theorems and direct calculations.
We shall apply the following usual notation. $\mathbb{R}$ is the set of real numbers, $\mathbb{N}$ is the set of positive integers, $E$ stands for the expectation, and $D^2$ denotes the variance. $o(1)$ denotes a quantity converging to 0. We write $f(x)=O(h(x))$ if $f(x)/h(x)$ is bounded as $x\to 0$.

2. Main Results

First, we study the non-homogeneous allocation scheme. Consider the scheme (6) and representation (7). Consider the generic random variable $\xi^*(\beta)$ with parameter $\beta>0$, having the distribution
$$P\{\xi^*(\beta)=k\}=\frac{\beta^{2k}}{(2k)!\,\mathrm{ch}(\beta)},\quad k=0,1,2,\dots.\tag{8}$$
Let
$$\xi_1^*=\xi^*(\beta_1),\ \xi_2^*=\xi^*(\beta_2),\ \dots,\ \xi_N^*=\xi^*(\beta_N)\tag{9}$$
be independent random variables, so that for any $i$, $\xi_i^*=\xi^*(\beta_i)$ has the distribution (8) with parameter $\beta=\beta_i$.
The expectation and the variance of $\xi^*(\beta)$ (see later in Equations (21) and (25)) are
$$m^*(\beta)=E(\xi^*(\beta))=\frac{\beta}{2}\,\mathrm{th}(\beta),\qquad \sigma^{*2}(\beta)=D^2(\xi^*(\beta))=\frac{\beta}{4}\left(1+\frac{\beta}{\mathrm{ch}^2(\beta)}-\frac{e^{-\beta}}{\mathrm{ch}(\beta)}\right),\tag{10}$$
where $\mathrm{th}(x)=\frac{e^x-e^{-x}}{e^x+e^{-x}}$ is the hyperbolic tangent function. Therefore, the expectation and the variance of
$$S_N^*=\sum_{i=1}^N\xi_i^*$$
are
$$m_N^*=ES_N^*=\frac{1}{2}\sum_{i=1}^N\beta_i\,\mathrm{th}(\beta_i),\qquad \sigma_N^{*2}=D^2S_N^*=\frac{1}{4}\sum_{i=1}^N\beta_i\left(1+\frac{\beta_i}{\mathrm{ch}^2(\beta_i)}-\frac{e^{-\beta_i}}{\mathrm{ch}(\beta_i)}\right).\tag{11}$$
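The closed forms $m^*(\beta)=\frac{\beta}{2}\mathrm{th}(\beta)$ and $\sigma^{*2}(\beta)=\frac{\beta}{4}\left(1+\frac{\beta}{\mathrm{ch}^2(\beta)}-\frac{e^{-\beta}}{\mathrm{ch}(\beta)}\right)$ can be checked by summing the series directly; a sketch with the arbitrary test value $\beta=2.5$:

```python
import math

beta = 2.5
ch, th = math.cosh(beta), math.tanh(beta)

# P{xi*(beta) = k} = beta^(2k) / ((2k)! ch(beta)), summed far into the tail
p = [beta ** (2 * k) / (math.factorial(2 * k) * ch) for k in range(60)]
mean = sum(k * pk for k, pk in enumerate(p))
var = sum(k * k * pk for k, pk in enumerate(p)) - mean ** 2

assert abs(mean - beta / 2 * th) < 1e-10
assert abs(var - beta / 4 * (1 + beta / ch ** 2 - math.exp(-beta) / ch)) < 1e-10
print("moment check: ok")
```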
In our main theorem, we will use the following condition: for some $C>0$,
$$\frac{|n-m_N^*|}{\sigma_N^*}<C,\qquad \min_{1\le i\le N}\beta_i\to\infty,\qquad\text{and}\qquad \frac{\max_{1\le i\le N}\beta_i}{\sum_{i=1}^N\beta_i}\to 0\tag{12}$$
as $n,N\to\infty$.
Our first main results in this paper are the following theorems:
Theorem 2.
Let $\bar\eta_1,\dots,\bar\eta_N$ be defined by (2), where $\eta_1,\dots,\eta_N$ are defined in (5) and $\xi_1,\dots,\xi_N$ are independent Poisson random variables with the parameters $\beta_1,\dots,\beta_N$. Let condition (12) be valid. Then, we have
$$\left(\frac{\bar\eta_1-\beta_1}{\sqrt{\beta_1}},\frac{\bar\eta_2-\beta_2}{\sqrt{\beta_2}},\dots,\frac{\bar\eta_K-\beta_K}{\sqrt{\beta_K}}\right)\xrightarrow{d}(\gamma_1,\gamma_2,\dots,\gamma_K)$$
as $N,n\to\infty$.
During the proof of Theorem 2, we shall obtain the following local limit theorem.
Corollary 1.
Under the conditions of Theorem 2, if $N,n\to\infty$, then we have
$$P(\bar\eta_1=2k_1,\dots,\bar\eta_K=2k_K)=\prod_{i=1}^{K}\frac{2}{\sqrt{2\pi\beta_i}}\exp\left(-\frac{(2k_i-\beta_i)^2}{2\beta_i}\right)(1+o(1))$$
uniformly for the values of $k_i$ such that $C_{1i}<\frac{2k_i-\beta_i}{\sqrt{\beta_i}}<C_{2i}$, $1\le i\le K$, for any fixed numbers $C_{1i}<C_{2i}$, $1\le i\le K$.
In the following theorem, q 1 , q 2 , , q N will denote a discrete probability distribution depending on n and N.
Theorem 3.
Let $\bar\eta_1,\dots,\bar\eta_N$ be the usual allocation scheme of $2n$ distinguishable particles into $N$ different cells with an even number of particles in each cell. Assume that the allocation probabilities are $q_1,q_2,\dots,q_N$, which depend on $n$ and $N$. Suppose that, for some $C>0$,
$$\sqrt{n}\sum_{i=1}^{N}q_ie^{-4nq_i}<C,\qquad n\min_{1\le i\le N}q_i\to\infty,\qquad\text{and}\qquad \max_{1\le i\le N}q_i\to 0\tag{14}$$
as $n,N\to\infty$. Then, we have
$$\left(\frac{\bar\eta_1-2nq_1}{\sqrt{2nq_1}},\frac{\bar\eta_2-2nq_2}{\sqrt{2nq_2}},\dots,\frac{\bar\eta_K-2nq_K}{\sqrt{2nq_K}}\right)\xrightarrow{d}(\gamma_1,\gamma_2,\dots,\gamma_K)$$
as $N,n\to\infty$.
Theorem 3 can be obtained from Theorem 2 if we use β i = 2 n q i , 1 i N .
Now, we turn to the homogeneous allocation scheme; we assume that in (1) the parameters $q_i$ are all equal. If there is an even number of particles in each cell, then this allocation is described by Equation (6), and, because of homogeneity, the random variables $\bar\xi_1,\dots,\bar\xi_N$ are independent and identically distributed with distribution
$$P\{\bar\xi_i=2k\}=\frac{\beta^{2k}}{(2k)!\,\mathrm{ch}(\beta)},\quad k=0,1,2,\dots,\ 1\le i\le N,\tag{15}$$
where $\beta>0$. From (6), it follows that
$$P\{X_{2n,N}(t_1)=2k_1,\ X_{2n,N}(t_2)-X_{2n,N}(t_1)=2k_2\}=\frac{P\left\{\sum_{i=1}^{[t_1N]}\bar\xi_i=2k_1\right\}\cdot P\left\{\sum_{i=[t_1N]+1}^{[t_2N]}\bar\xi_i=2k_2\right\}\cdot P\left\{\sum_{i=[t_2N]+1}^{N}\bar\xi_i=2k_3\right\}}{P\{\bar\xi_1+\dots+\bar\xi_N=2n\}},$$
where $k_1,k_2,k_3$ are non-negative integers such that $k_1+k_2+k_3=n$. We shall need this formula in the proof of Theorem 4.
Theorem 4.
Let $Y_{2n,N}$ be defined in (4). Assume that in (6) the random variables $\bar\xi_1,\dots,\bar\xi_N$ are independent and identically distributed with distribution (15). Let $N,n\to\infty$ such that $2n/N\to\infty$. Then, the finite dimensional distributions of $Y_{2n,N}$ converge to the finite dimensional distributions of the Brownian bridge $W^0$.
The idea of the proof for the particular case of two-dimensional distributions is the following. Let $0<t_1<t_2<1$. The vector of two increments of the Brownian bridge $W^0$,
$$(W^0(t_1)-W^0(0),\ W^0(t_2)-W^0(t_1))=(W^0(t_1),\ W^0(t_2)-W^0(t_1)),$$
has the covariance matrix
$$\Sigma=\begin{pmatrix} t_1(1-t_1) & t_1(t_1-t_2)\\ t_1(t_1-t_2) & (t_2-t_1)(1-(t_2-t_1))\end{pmatrix}.$$
The determinant of $\Sigma$ is
$$|\Sigma|=t_1(1-t_1)(t_2-t_1)(1-(t_2-t_1))-\left(t_1(t_1-t_2)\right)^2=t_1(t_2-t_1)(1-t_2),$$
and its inverse is
$$\Sigma^{-1}=\begin{pmatrix} \dfrac{1-(t_2-t_1)}{t_1(1-t_2)} & \dfrac{1}{1-t_2}\\[2mm] \dfrac{1}{1-t_2} & \dfrac{1-t_1}{(t_2-t_1)(1-t_2)}\end{pmatrix}.$$
During the proof of Theorem 4, we shall show that the distribution of the vector $\left(Y_{2n,N}(t_1)-Y_{2n,N}(0),\ Y_{2n,N}(t_2)-Y_{2n,N}(t_1)\right)$ converges to the two-dimensional Gaussian distribution with mean 0 and covariance matrix $\Sigma$.
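The determinant and inverse above are easy to verify in exact rational arithmetic; a sketch with the arbitrary points $t_1=1/4$, $t_2=2/3$:

```python
from fractions import Fraction as F

t1, t2 = F(1, 4), F(2, 3)      # arbitrary points 0 < t1 < t2 < 1

# covariance matrix of (W0(t1), W0(t2) - W0(t1))
S = [[t1 * (1 - t1), t1 * (t1 - t2)],
     [t1 * (t1 - t2), (t2 - t1) * (1 - (t2 - t1))]]

det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
assert det == t1 * (t2 - t1) * (1 - t2)        # the determinant formula

Sinv = [[(1 - (t2 - t1)) / (t1 * (1 - t2)), 1 / (1 - t2)],
        [1 / (1 - t2), (1 - t1) / ((t2 - t1) * (1 - t2))]]
for i in range(2):                             # S * Sinv must be the identity
    for j in range(2):
        assert sum(S[i][k] * Sinv[k][j] for k in range(2)) == (1 if i == j else 0)
print("Sigma check: ok")
```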

3. Applications of the Main Results for χ 2 -Tests and Numerical Examples

Using our main results, we can construct some analogues of the well-known χ 2 -test.
The first one is a consequence of Theorem 3, so we assume the conditions of that theorem.
Theorem 5.
Let $\bar\eta_1,\dots,\bar\eta_N$ be an allocation scheme of $2n$ distinguishable particles into $N$ different cells with an even number of particles in each cell. Assume that the allocation probabilities are $q_1,q_2,\dots,q_N$, which depend on $n$ and $N$. Suppose that conditions (14) are valid. Then, we have
$$\sum_{i=1}^{K}\frac{(\bar\eta_i-2nq_i)^2}{2nq_i}\xrightarrow{d}\chi^2(K)$$
as $N,n\to\infty$, where $\chi^2(K)$ denotes the $\chi^2$-distribution with $K$ degrees of freedom.
The proof of Theorem 5 is a simple application of Theorem 3 and the definition of the χ 2 -distribution.
Now, we turn to an application of the above χ 2 -test for a well-known method of coding, i.e., the parity checking.
Example 1.
We can apply our $\chi^2$-test for testing a transmission channel for messages using parity bits. The well-known parity bits are used for error detection. First, we briefly describe the usage of parity bits in the case of the so-called even parity bit. Consider a binary message containing $N$ blocks. If a fixed block contains an odd number of bits having value 1, then we add a parity bit having value 1. If the fixed block contains an even number of bits having value 1, then we set the value of the parity bit to 0. Thus, in the final block, the number of bits having value 1 should always be even. Sometimes this method is called a checksum.
After transmission of the binary message through a noisy channel, one can check the parity of each block. If the parity is odd, it shows an error. More precisely, the parity check shows an odd number of errors. However, if a block contains an even number of errors, then this check does not show an error. We are interested in finding the error rate of a transmission channel, assuming that the parity check does not show any error.
Our statistical model is as follows. Consider a file which contains $N$ blocks. The $m$th block, $1\le m\le N$, is a sequence $i_1i_2\dots i_{l_m}$, where $i_j=1$ or $i_j=0$ for $1\le j\le l_m-1$, and $i_{l_m}=\sum_{j=1}^{l_m-1}i_j \pmod 2$; that is, $i_{l_m}$ is the parity bit. An error in a block is a replacement of any element $i_k$ of the block by its opposite value; that is, the true value 1 is replaced by 0, or the true value 0 is replaced by 1.
We consider the following statistical model for the errors. The file contains a binary message. It is divided into N blocks. In each block, a parity bit is used. After the transmission of the file throughout a channel, the parity check does not show any error.
To check the quality of the channel, we should obtain the original file and compare it with the transmitted one to identify the errors. We can test the hypothesis $H_0$: $q_i$ is the probability that an error occurs in the $i$th block, where $q_i>0$, $\sum_{i=1}^N q_i=1$. For example, we can test that the probability that an error happens in the $i$th block is proportional to the length of the block by using
$$q_i=\frac{l_i}{\sum_{k=1}^N l_k},\quad 1\le i\le N.$$
The numbers of errors in the $N$ blocks are $\bar\eta_1,\dots,\bar\eta_N$ with the following properties:
(1) The number of errors in the whole file is $2n$ (i.e., $\bar\eta_1+\dots+\bar\eta_N=2n$);
(2) Errors occur in the blocks independently, and the probability that an error occurs in the $i$th block is $q_i$;
(3) The parity check does not find any block with an error (that is, each block has an even number of errors).
Then, the numbers of errors $\bar\eta_1,\dots,\bar\eta_N$ can be considered as the allocation of $2n$ distinguishable particles into $N$ different cells with an even number of particles in each cell.
We calculate the statistic
$$\sum_{i=1}^{K}\frac{(\bar\eta_i-2nq_i)^2}{2nq_i}$$
from Theorem 5, and if its value is larger than a critical value, then we reject hypothesis $H_0$.
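The whole procedure of Example 1 can be sketched on simulated data (all parameters below are hypothetical; the critical value $9.488$ is the standard $0.95$ quantile of the $\chi^2(4)$ distribution):

```python
import random

rng = random.Random(7)
N, n2 = 6, 400                 # N blocks, 2n = 400 errors in total (hypothetical)
q = [1 / N] * N                # H0: the error probability is uniform over blocks

# simulate error placements until every block has an even error count
while True:
    errs = [0] * N
    for _ in range(n2):
        errs[rng.choices(range(N), weights=q)[0]] += 1
    if all(e % 2 == 0 for e in errs):
        break

K = 4                          # test the first K blocks, as in Theorem 5
stat = sum((errs[i] - n2 * q[i]) ** 2 / (n2 * q[i]) for i in range(K))

crit = 9.488                   # 0.95 quantile of chi^2(4), standard table value
print("statistic:", round(stat, 3), "reject H0:", stat > crit)
```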
Now, we turn to an application of Theorem 4 to mathematical statistics. Our next example is similar to Example 1. Consider again a binary file containing $N$ blocks, each block containing a parity bit. Assume that the parity check does not show any error in the blocks, so any block can contain only an even number of errors. We are not able to find the number of errors in the individual blocks, but we can find the number of errors in $m$ super blocks (i.e., in some unions of the original $N$ blocks). Using the following procedure, we can test the sizes of the super blocks or, when the super block sizes are known, we can test whether the errors are uniformly distributed among the original $N$ blocks. In the next example, we describe the statistical procedure in a general mathematical setting.
Example 2.
Consider the homogeneous allocation model. Let $2k_1,\dots,2k_N$ be the numbers of particles in the cells after allocating $2n$ distinguishable particles into $N$ different cells having an even number of particles in each cell. However, the numbers $2k_1,\dots,2k_N$ are not known to us; only the numbers of particles in some unions of neighbouring cells are known.
Let $t_0=0<t_1<\dots<t_m=1$, where each $t_j$ has the form $k/N$. We suppose that the numbers of particles in certain sets of the cells are known; more precisely, $n_j=\sum_{i=[t_{j-1}N]+1}^{[t_jN]}2k_i$, $1\le j\le m$, are known. Let $t_0'=0<t_1'<\dots<t_m'=1$ be some fixed known numbers, where again each $t_j'$ has the form $k/N$. We will check the null hypothesis $H_0$: $t_i=t_i'$, $1\le i\le m$, against the alternative hypothesis $H_1$: $t_i\ne t_i'$ for some $1\le i\le m$.
To this end, we propose the following $\chi^2$-test. Let
$$\chi^2_o=\sum_{j=1}^{m}\frac{\left(n_j-2n\frac{[t_j'N]-[t_{j-1}'N]}{N}\right)^2}{2n(t_j'-t_{j-1}')}$$
be the test statistic.
Let $0<\alpha<1$. Choose the critical value $\chi^2_c$ such that $P\{\chi^2(m-1)<\chi^2_c\}=1-\alpha$, where $\chi^2(m-1)$ is a random variable having the $\chi^2$-distribution with $m-1$ degrees of freedom. The hypothesis $H_0$ is accepted if $\chi^2_o<\chi^2_c$, and it is rejected if $\chi^2_o\ge\chi^2_c$. By Theorem 4, if $n/N\to\infty$, then the probability of the type I error converges to
$$P\{\chi^2(m-1)\ge\chi^2_c\}=\alpha.$$
Above we used, besides Theorem 4, the following known fact from the statistical theory of $\chi^2$-tests. If $t_0=0<t_1<\dots<t_m=1$, then, for the increments of the Brownian bridge $W^0$, the distribution of
$$\sum_{j=1}^{m}\frac{\left(W^0(t_j)-W^0(t_{j-1})\right)^2}{t_j-t_{j-1}}$$
is $\chi^2(m-1)$.
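A sketch of this super-block test on simulated homogeneous data (the sizes and the partition below are hypothetical; $7.815$ is the standard $0.95$ quantile of $\chi^2(3)$):

```python
import random

rng = random.Random(3)
N, n2, m = 8, 240, 4                   # hypothetical sizes: N cells, 2n particles
tprime = [0, 2/8, 4/8, 6/8, 1]         # hypothesised break points t'_j, each k/N

while True:                            # homogeneous even allocation by rejection
    counts = [0] * N
    for _ in range(n2):
        counts[rng.randrange(N)] += 1
    if all(c % 2 == 0 for c in counts):
        break

# super-block totals n_j and the chi^2 test statistic
stat = 0.0
for j in range(1, m + 1):
    lo, hi = int(tprime[j - 1] * N), int(tprime[j] * N)
    nj = sum(counts[lo:hi])
    stat += (nj - n2 * (hi - lo) / N) ** 2 / (n2 * (tprime[j] - tprime[j - 1]))

crit = 7.815                           # 0.95 quantile of chi^2(m - 1) = chi^2(3)
print("statistic:", round(stat, 3), "reject H0:", stat >= crit)
```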
Example 3.
We carried out computer experiments to illustrate our theorems numerically. We simulated the allocations using random numbers. We considered a homogeneous allocation; that is, when we allocate a particle, we choose a cell uniformly at random from the $N$ cells. We allocated $2n=1000$ particles into $N=20$ cells. We repeated this experiment many times and saved only those results in which every cell contained an even number of particles. In this way, we saved the results of $s=200$ such allocations, i.e., we obtained a sample of size $s=200$ for our $N=20$-dimensional random vector $\bar\eta_1,\dots,\bar\eta_N$. Then, we constructed histograms for the first two coordinates of the above-mentioned sample. On the left-hand side of Figure 1, the histogram of the observations of $\frac{\bar\eta_1-2nq_1}{\sqrt{2nq_1}}$, together with the standard normal probability density function, is shown. On the right-hand side of Figure 1, the histogram for $\frac{\bar\eta_2-2nq_2}{\sqrt{2nq_2}}$ and the standard normal probability density function can be seen. The fit to the normal distribution seems to be very good. On the left-hand side of Figure 2, the joint histogram of the sample for the variables $\frac{\bar\eta_1-2nq_1}{\sqrt{2nq_1}}$ and $\frac{\bar\eta_2-2nq_2}{\sqrt{2nq_2}}$ is given. This figure supports the joint normality of the two coordinates. Therefore, we obtained numerical evidence for Theorems 2 and 3. Finally, we performed principal component analysis for the observations of the vector
$$\left(\frac{\bar\eta_1-2nq_1}{\sqrt{2nq_1}},\dots,\frac{\bar\eta_N-2nq_N}{\sqrt{2nq_N}}\right).$$
The first 19 principal component variances were between 1.64 and 0.53, but the last one was zero; this supports that the number of degrees of freedom of the $\chi^2$-statistic in Example 2 is $m-1$.
Example 4.
We carried out the same computer experiment as in Example 3, but with other parameters. We allocated $2n=2000$ particles into $N=10$ cells. We saved the results of $s=1000$ allocations in which every cell contained an even number of particles. In this way, we obtained a sample of size $s=1000$ for the $N=10$-dimensional random vector $\bar\eta_1,\dots,\bar\eta_N$. On the left-hand side of Figure 3, the histogram of the observations of $\frac{\bar\eta_1-2nq_1}{\sqrt{2nq_1}}$, together with the standard normal probability density function, is presented. On the right-hand side of Figure 3, the histogram for $\frac{\bar\eta_2-2nq_2}{\sqrt{2nq_2}}$ and the standard normal probability density function is given. The fit to the normal distribution is again very good. On the right-hand side of Figure 2, the joint histogram of the sample for the variables $\frac{\bar\eta_1-2nq_1}{\sqrt{2nq_1}}$ and $\frac{\bar\eta_2-2nq_2}{\sqrt{2nq_2}}$ is given. This figure also supports the joint normality of the two coordinates. Therefore, we obtained another numerical confirmation of Theorems 2 and 3. Then, we performed principal component analysis for the observations of the vector
$$\left(\frac{\bar\eta_1-2nq_1}{\sqrt{2nq_1}},\dots,\frac{\bar\eta_N-2nq_N}{\sqrt{2nq_N}}\right).$$
The first 9 principal component variances were between 1.17 and 0.87, but the last one was zero. This supports that the number of degrees of freedom of the $\chi^2$-statistic in Example 2 is $m-1$ and not $m$.
We mention that for relatively small values of $n$, e.g., for $2n=500$ and $N=25$, the numerical results show a poor fit to the normal distribution. It is also worth mentioning that we need a large sample size, i.e., $s>100$, to show the goodness of fit to the normal distribution numerically.
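The experiments of Examples 3 and 4 can be reproduced along the following lines; a scaled-down sketch (smaller $N$ and $s$ than in the text, so that the rejection step remains fast):

```python
import math, random

rng = random.Random(11)
N, n2, s = 5, 200, 50          # scaled-down versions of the text's parameters
q = 1 / N
z1 = []                        # standardised first coordinate of each kept sample
while len(z1) < s:
    counts = [0] * N
    for _ in range(n2):
        counts[rng.randrange(N)] += 1
    if all(c % 2 == 0 for c in counts):        # keep all-even allocations only
        z1.append((counts[0] - n2 * q) / math.sqrt(n2 * q))

mean = sum(z1) / s
sd = math.sqrt(sum((z - mean) ** 2 for z in z1) / (s - 1))
print("sample mean:", round(mean, 2), "sample sd:", round(sd, 2))
```

For a standard normal limit, the sample mean should be near 0 and the sample standard deviation near 1.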
Figure 2. The joint histograms of the first two coordinates in Examples 3 and 4. (a) Histogram for Example 3; (b) Histogram for Example 4.
Figure 3. The histograms of the first and the second coordinates in Example 4. (a) First coordinate; (b) Second coordinate.

4. Auxiliary Results

We shall use the following notation. Let $\pi(\beta)$ denote a Poisson random variable with parameter $\beta$, and let $\xi(\beta)$, where $\beta>0$, be a random variable with the distribution
$$P\{\xi(\beta)=2k\}=\frac{\beta^{2k}}{(2k)!\,\mathrm{ch}(\beta)},\quad k=0,1,2,\dots.$$
Recall that this distribution appears in Theorem 1. We see that the distribution of $\xi^*(\beta)$ is the same as the distribution of $\xi(\beta)/2$.
Lemma 1.
Let $C>0$ be fixed. Then, we have
$$P\{\xi(\beta)=2k\}=\frac{2}{\sqrt{2\pi\beta}}\exp\left(-\frac{(2k-\beta)^2}{2\beta}\right)(1+o(1))$$
as $\beta\to\infty$, uniformly for those values of $k$ for which $\frac{|2k-\beta|}{\sqrt{\beta}}<C$.
Proof. 
We need the following approximation of the Poisson distribution by the normal density function; see p. 43 of [7]. Let $\tau$ have the Poisson distribution $P(\tau=k)=\frac{\lambda^k}{k!}e^{-\lambda}$, $k=0,1,2,\dots$, and let $k=\lambda+x\sqrt{\lambda}$. Then, as $\lambda\to\infty$,
$$P(\tau=k)=\frac{1}{\sqrt{2\pi\lambda}}e^{-\frac{x^2}{2}}\left(1+O\left(\frac{1}{\sqrt{\lambda}}\right)\right),$$
uniformly for $x\in[-c,c]$, where $c$ is an arbitrary fixed positive number.
Using the above approximation for $P\{\pi(\beta)=2k\}$, we obtain
$$P\{\xi(\beta)=2k\}=\frac{\beta^{2k}}{(2k)!}e^{-\beta}\,\frac{1}{\frac{1}{2}(1+e^{-2\beta})}=P\{\pi(\beta)=2k\}\,\frac{1+o(1)}{\frac{1}{2}}=\frac{1}{\sqrt{2\pi\beta}}\exp\left(-\frac{(2k-\beta)^2}{2\beta}\right)(1+o(1))\,\frac{1+o(1)}{\frac{1}{2}}=\frac{2}{\sqrt{2\pi\beta}}\exp\left(-\frac{(2k-\beta)^2}{2\beta}\right)(1+o(1))$$
as $\beta\to\infty$, uniformly for $k$ such that $\frac{|2k-\beta|}{\sqrt{\beta}}<C$. □
Lemma 2.
For the moments of $\xi^*(\beta)$, we have
$$m^*(\beta)=E\xi^*(\beta)=\frac{\beta}{2}\mathrm{th}(\beta),\qquad \sigma^{*2}(\beta)=D^2\xi^*(\beta)=\frac{\beta}{4}(1+o(1)),\qquad E(\xi^*(\beta)-E\xi^*(\beta))^4=\frac{3}{16}\beta^2(1+o(1))\tag{19}$$
as $\beta\to\infty$.
Proof. 
By a simple calculation, one can obtain that the characteristic function of $\xi^*(\beta)$ is
$$\varphi(t)=\frac{\mathrm{ch}(\beta e^{\frac{it}{2}})}{\mathrm{ch}(\beta)},\quad t\in\mathbb{R},\tag{20}$$
where $i=\sqrt{-1}$. Using the hyperbolic sine function $\mathrm{sh}(x)=\frac{e^x-e^{-x}}{2}$, we can obtain the derivatives of the characteristic function:
$$\varphi'(t)=\frac{\mathrm{sh}(\beta e^{\frac{it}{2}})}{\mathrm{ch}(\beta)}\beta e^{\frac{it}{2}}\frac{i}{2},$$
$$\varphi''(t)=\frac{\mathrm{ch}(\beta e^{\frac{it}{2}})}{\mathrm{ch}(\beta)}\left(\beta e^{\frac{it}{2}}\frac{i}{2}\right)^2+\frac{\mathrm{sh}(\beta e^{\frac{it}{2}})}{\mathrm{ch}(\beta)}\beta e^{\frac{it}{2}}\frac{(-1)}{4},$$
$$\varphi'''(t)=\frac{\mathrm{sh}(\beta e^{\frac{it}{2}})}{\mathrm{ch}(\beta)}\left(\beta e^{\frac{it}{2}}\frac{i}{2}\right)^3+\frac{\mathrm{ch}(\beta e^{\frac{it}{2}})}{\mathrm{ch}(\beta)}\beta^2e^{it}\frac{(-i)}{4}+\frac{\mathrm{ch}(\beta e^{\frac{it}{2}})}{\mathrm{ch}(\beta)}\beta^2e^{it}\frac{(-i)}{8}+\frac{\mathrm{sh}(\beta e^{\frac{it}{2}})}{\mathrm{ch}(\beta)}\beta e^{\frac{it}{2}}\frac{(-i)}{8}$$
$$=\frac{\mathrm{sh}(\beta e^{\frac{it}{2}})}{\mathrm{ch}(\beta)}\left(\beta e^{\frac{it}{2}}\frac{i}{2}\right)^3+3\frac{\mathrm{ch}(\beta e^{\frac{it}{2}})}{\mathrm{ch}(\beta)}\beta^2e^{it}\frac{(-i)}{8}+\frac{\mathrm{sh}(\beta e^{\frac{it}{2}})}{\mathrm{ch}(\beta)}\beta e^{\frac{it}{2}}\frac{(-i)}{8},$$
and
$$\varphi^{iv}(t)=\frac{\mathrm{ch}(\beta e^{\frac{it}{2}})}{\mathrm{ch}(\beta)}\left(\beta e^{\frac{it}{2}}\frac{i}{2}\right)^4+3\frac{\mathrm{sh}(\beta e^{\frac{it}{2}})}{\mathrm{ch}(\beta)}\beta^3e^{\frac{3it}{2}}\frac{1}{16}+\frac{\mathrm{ch}(\beta e^{\frac{it}{2}})}{\mathrm{ch}(\beta)}\beta^2e^{it}\frac{1}{16}+3\frac{\mathrm{sh}(\beta e^{\frac{it}{2}})}{\mathrm{ch}(\beta)}\left(\beta e^{\frac{it}{2}}\frac{i}{2}\right)^2\beta e^{\frac{it}{2}}\frac{(-1)}{4}+3\frac{\mathrm{ch}(\beta e^{\frac{it}{2}})}{\mathrm{ch}(\beta)}\beta^2e^{it}\frac{1}{8}+\frac{\mathrm{sh}(\beta e^{\frac{it}{2}})}{\mathrm{ch}(\beta)}\beta e^{\frac{it}{2}}\frac{1}{16}$$
$$=\frac{\mathrm{ch}(\beta e^{\frac{it}{2}})}{\mathrm{ch}(\beta)}\left(\beta e^{\frac{it}{2}}\frac{i}{2}\right)^4+6\frac{\mathrm{sh}(\beta e^{\frac{it}{2}})}{\mathrm{ch}(\beta)}\beta^3e^{\frac{3it}{2}}\frac{1}{16}+7\frac{\mathrm{ch}(\beta e^{\frac{it}{2}})}{\mathrm{ch}(\beta)}\beta^2e^{it}\frac{1}{16}+\frac{\mathrm{sh}(\beta e^{\frac{it}{2}})}{\mathrm{ch}(\beta)}\beta e^{\frac{it}{2}}\frac{1}{16}.$$
Therefore, we obtain
$$m^*(\beta)=E\xi^*(\beta)=\frac{1}{i}\varphi'(0)=\frac{\beta}{2}\mathrm{th}(\beta).\tag{21}$$
Moreover,
$$E(\xi^*(\beta))^2=-\varphi''(0)=\frac{\beta^2}{4}+\frac{\beta}{4}\mathrm{th}(\beta),\tag{22}$$
$$E(\xi^*(\beta))^3=\frac{1}{i^3}\varphi'''(0)=\frac{\beta^3}{8}\mathrm{th}(\beta)+\frac{3}{8}\beta^2+\frac{\beta}{8}\mathrm{th}(\beta),\tag{23}$$
and
$$E(\xi^*(\beta))^4=\varphi^{iv}(0)=\frac{1}{16}\beta^4+\frac{6\beta^3}{16}\mathrm{th}(\beta)+\frac{7}{16}\beta^2+\frac{\beta}{16}\mathrm{th}(\beta).\tag{24}$$
From Equations (21)–(24), it follows that
$$\sigma^{*2}(\beta)=D^2(\xi^*(\beta))=E(\xi^*(\beta))^2-(E\xi^*(\beta))^2=\frac{\beta^2}{4}+\frac{\beta}{4}\mathrm{th}(\beta)-\left(\frac{\beta}{2}\mathrm{th}(\beta)\right)^2=\frac{\beta^2}{4\,\mathrm{ch}^2(\beta)}+\frac{\beta}{4}\mathrm{th}(\beta)=\frac{\beta}{4}\left(1+\frac{\beta}{\mathrm{ch}^2(\beta)}-\frac{e^{-\beta}}{\mathrm{ch}(\beta)}\right).\tag{25}$$
This implies that $\sigma^{*2}(\beta)=\frac{\beta}{4}(1+o(1))$. Then
$$E(\xi^*(\beta)-E\xi^*(\beta))^4=E(\xi^*(\beta))^4-4E(\xi^*(\beta))^3\,E\xi^*(\beta)+6E(\xi^*(\beta))^2\,(E\xi^*(\beta))^2-3(E\xi^*(\beta))^4$$
$$=\frac{1}{16}\beta^4+\frac{6\beta^3}{16}\mathrm{th}(\beta)+\frac{7}{16}\beta^2+\frac{\beta}{16}\mathrm{th}(\beta)-4\left(\frac{\beta^3}{8}\mathrm{th}(\beta)+\frac{3}{8}\beta^2+\frac{\beta}{8}\mathrm{th}(\beta)\right)\frac{\beta}{2}\mathrm{th}(\beta)+6\left(\frac{\beta^2}{4}+\frac{\beta}{4}\mathrm{th}(\beta)\right)\left(\frac{\beta}{2}\mathrm{th}(\beta)\right)^2-3\left(\frac{\beta}{2}\mathrm{th}(\beta)\right)^4$$
$$=\frac{2\beta^4}{16}\left(\mathrm{th}^2(\beta)-1\right)-\frac{3\beta^4}{16}\left(\mathrm{th}^4(\beta)-1\right)-\frac{6\beta^3}{16}\left(\mathrm{th}(\beta)-1\right)+\frac{6\beta^3}{16}\left(\mathrm{th}^3(\beta)-1\right)+\frac{3}{16}\beta^2-\frac{4\beta^2}{16}\left(\mathrm{th}^2(\beta)-1\right)+\frac{\beta}{16}+\frac{\beta}{16}\left(\mathrm{th}(\beta)-1\right).$$
Since
$$\beta^k\left(\mathrm{th}^l(\beta)-1\right)\to 0\quad\text{and}\quad \mathrm{th}(\beta)\to 1\quad\text{as}\ \beta\to\infty,$$
for $k,l=1,2,3,4$, we obtain
$$E(\xi^*(\beta)-E\xi^*(\beta))^4=\frac{3}{16}\beta^2+\frac{1}{16}\beta+o(1)=\frac{3}{16}\beta^2(1+o(1)).$$
Thus, (19) is proved. □
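The asymptotics of Lemma 2 can be checked numerically by summing the series for a moderately large $\beta$; a sketch with the arbitrary value $\beta=40$:

```python
import math

beta = 40.0
ch, th = math.cosh(beta), math.tanh(beta)

# pmf of xi*(beta); k up to 80 covers the mass around the mean beta/2 = 20
p = [beta ** (2 * k) / (math.factorial(2 * k) * ch) for k in range(80)]
m = sum(k * pk for k, pk in enumerate(p))
mu2 = sum((k - m) ** 2 * pk for k, pk in enumerate(p))
mu4 = sum((k - m) ** 4 * pk for k, pk in enumerate(p))

assert abs(m - beta / 2 * th) < 1e-8                 # exact mean formula
assert abs(mu2 / (beta / 4) - 1) < 0.05              # variance ~ beta/4
assert abs(mu4 / (3 / 16 * beta ** 2) - 1) < 0.05    # 4th moment ~ (3/16) beta^2
print("Lemma 2 moment check: ok")
```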
We shall use the following general Berry–Esseen-type inequality. We should mention that in the following Lemma 3, there is no assumption on the distributions of the random variables ξ i , 1 i N .
Lemma 3.
Let $\xi_i$, $1\le i\le N$, be independent random variables with variances $\sigma_i^2$ and expectations $m_i$, $i=1,2,\dots,N$. Let $S_N=\sum_{i=1}^N\xi_i$, let $d_N^2=\sum_{i=1}^N\sigma_i^2$ be its variance, and let $\mu_N=\sum_{i=1}^N m_i$ be its expectation. Then, we have
$$\sup_{t\in\mathbb{R}}\left|P\left\{\frac{S_N-\mu_N}{d_N}<t\right\}-\Phi(t)\right|\le 2c\left(\frac{\sum_{i=1}^N E(\xi_i-m_i)^4}{d_N^4}\right)^{\frac{1}{2}}.\tag{27}$$
Here, $\Phi$ is the standard normal distribution function and $c$ is the constant from the Berry–Esseen inequality.
Lemma 3 was proved in [2]. Now, we shall apply Lemma 3 to our model.
Lemma 4.
Assume that, for the random variables $\xi_1^*,\xi_2^*,\dots,\xi_N^*$ from formula (9), condition (12) is valid. Then, for their standardized sum, we have
$$\frac{S_N^*-m_N^*}{\sigma_N^*}\xrightarrow{d}\gamma.$$
Proof. 
Using (19) for the right-hand side of (27), we obtain
$$\sup_{t\in\mathbb{R}}\left|P\left\{\frac{S_N^*-m_N^*}{\sigma_N^*}<t\right\}-\Phi(t)\right|\le 2c\left(\frac{3\sum_{i=1}^N\beta_i^2(1+o(1))}{\left(\sum_{i=1}^N\beta_i\right)^2}\right)^{\frac{1}{2}}\le 2c\left(\frac{3\max_{1\le i\le N}\beta_i\,(1+o(1))}{\sum_{i=1}^N\beta_i}\right)^{\frac{1}{2}}.\tag{28}$$
Now, (28) implies Lemma 4. □
In the following lemma, we shall need the characteristic functions of the random variables in (9). Thus, let
$$\varphi_j(t)=\frac{\mathrm{ch}(\beta_je^{\frac{it}{2}})}{\mathrm{ch}(\beta_j)}=\frac{e^{\beta_je^{\frac{it}{2}}}+e^{-\beta_je^{\frac{it}{2}}}}{e^{\beta_j}+e^{-\beta_j}}$$
be the characteristic function of $\xi_j^*$, let $\varphi_j^c(t)=\varphi_j(t)e^{-itm^*(\beta_j)}$ be the characteristic function of the centralized version of $\xi_j^*$, $1\le j\le N$, and let $\varphi_N(t)$ be the characteristic function of the standardized sum $\frac{S_N^*-m_N^*}{\sigma_N^*}$.
Lemma 5.
Assume that, for the random variables $\xi_1^*,\xi_2^*,\dots,\xi_N^*$ from formula (9), condition (12) is valid. Let $C>0$. Then, we have
$$\sigma_N^*P\{S_N^*=n\}=\frac{1}{\sqrt{2\pi}}\exp\left(-\frac{(n-m_N^*)^2}{2\sigma_N^{*2}}\right)(1+o(1)),\tag{29}$$
uniformly for those $n$ such that $\frac{|n-m_N^*|}{\sigma_N^*}<C$.
Proof. 
We shall need the notation
$$z=\frac{n-m_N^*}{\sigma_N^*}.$$
$S_N^*$ is an integer valued random variable, so its distribution can be expressed by the following inverse Fourier transform:
$$P\{S_N^*=n\}=\frac{1}{2\pi}\int_{-\pi}^{\pi}e^{-inx}\varphi_N^0(x)\,dx,\tag{30}$$
where $\varphi_N^0$ is the characteristic function of $S_N^*$. However,
$$\varphi_N^0\left(\frac{t}{\sigma_N^*}\right)=\varphi_N(t)\,e^{\frac{itm_N^*}{\sigma_N^*}}=\prod_{j=1}^N\varphi_j^c\left(\frac{t}{\sigma_N^*}\right)e^{\frac{itm_N^*}{\sigma_N^*}}.\tag{31}$$
So, substituting $x=t/\sigma_N^*$ into the integral in (30), we obtain
$$\sigma_N^*P\{S_N^*=n\}=\frac{1}{2\pi}\int_{-\pi\sigma_N^*}^{\pi\sigma_N^*}e^{-itz}\varphi_N(t)\,dt=\frac{1}{2\pi}\int_{-\pi\sigma_N^*}^{\pi\sigma_N^*}e^{-itz}\prod_{i=1}^N\varphi_i^c\left(\frac{t}{\sigma_N^*}\right)dt.$$
Let $0<\varepsilon<1$ and $B>0$. Using the characteristic function of the standard normal law, we have, for any real $z$, that
$$\frac{1}{\sqrt{2\pi}}e^{-\frac{z^2}{2}}=\frac{1}{2\pi}\int_{-\infty}^{\infty}e^{-ixz}e^{-\frac{x^2}{2}}\,dx.$$
So, we can represent the difference of the two sides of (29) as the sum of four integrals:

$$ R_N = 2\pi \left( \sigma_N^* \, \mathbb{P}\{ S_N^* = n \} - \frac{1}{\sqrt{2\pi}} e^{-z^2/2} \right) = I_1 + I_2 + I_3 + I_4, $$

where

$$ I_1 = \int_{|x| < B} e^{-ixz} \phi_N(x)\, dx - \int_{|x| < B} e^{-ixz} e^{-x^2/2}\, dx, $$

$$ I_2 = -\int_{|x| > B} e^{-ixz} e^{-x^2/2}\, dx, $$

$$ I_3 = \int_{B < |x| \le \varepsilon \sigma_N^*} e^{-ixz} \prod_{j=1}^{N} \phi_j^c\!\left( \frac{x}{\sigma_N^*} \right) dx, $$

and

$$ I_4 = \int_{\varepsilon \sigma_N^* < |x| \le \pi \sigma_N^*} e^{-ixz} \prod_{j=1}^{N} \phi_j^c\!\left( \frac{x}{\sigma_N^*} \right) dx. $$
Since

$$ I_1 = \int_{|x| < B} e^{-ixz} \left( \phi_N(x) - e^{-x^2/2} \right) dx, $$

from Lemma 4, it follows that

$$ I_1 \to 0 $$

for any fixed B > 0. Since

$$ |I_2| \le \int_{|x| > B} e^{-x^2/2}\, dx, $$

we have

$$ I_2 \to 0 \quad \text{as } B \to \infty. $$
Using formula (20) for the characteristic function of ξ*(β), we obtain

$$ |I_3| \le \int_{B < |x| \le \varepsilon \sigma_N^*} \left| \prod_{j=1}^{N} \phi_j^c\!\left( \frac{x}{\sigma_N^*} \right) \right| dx = \int_{B < |t| \le \varepsilon \sigma_N^*} \prod_{j=1}^{N} e^{\beta_j \left( \cos\frac{t}{2\sigma_N^*} - 1 \right)} \prod_{j=1}^{N} \left| \frac{ 1 + e^{ -2\beta_j e^{ it/(2\sigma_N^*) } } }{ 1 + e^{-2\beta_j} } \right| dt. $$

We know that

$$ \cos(x) \le 1 - \frac{x^2}{2} + \frac{x^4}{24} \le 1 - \frac{11 x^2}{24} \quad \text{if } |x| \le 1, \qquad e^x - 1 \le x e^x \quad \text{if } x \ge 0. $$

Therefore, we obtain that

$$ \prod_{j=1}^{N} e^{\beta_j \left( \cos\frac{t}{2\sigma_N^*} - 1 \right)} \le e^{ -\left( \sum_{j=1}^{N} \beta_j \right) \frac{11\, t^2}{24 \cdot 4\, \sigma_N^{*2}} } = e^{ -\frac{11 t^2}{24} (1 + o(1)) } $$

for B < |t| ≤ ε σ_N*, where we applied that σ*²(β_j) = (β_j/4)(1+o(1)). Moreover,

$$ \prod_{j=1}^{N} \left| \frac{ 1 + e^{ -2\beta_j e^{ it/(2\sigma_N^*) } } }{ 1 + e^{-2\beta_j} } \right| \le \prod_{j=1}^{N} \frac{ 1 + e^{ -2\beta_j \cos\frac{t}{2\sigma_N^*} } }{ 1 + e^{-2\beta_j} } = \prod_{j=1}^{N} \left( 1 + \frac{ e^{ -2\beta_j \cos\frac{t}{2\sigma_N^*} } - e^{-2\beta_j} }{ 1 + e^{-2\beta_j} } \right) = \exp\left( \sum_{j=1}^{N} \ln\left( 1 + \frac{ e^{ -2\beta_j \cos\frac{t}{2\sigma_N^*} } - e^{-2\beta_j} }{ 1 + e^{-2\beta_j} } \right) \right) $$

$$ \le \exp\left( \sum_{j=1}^{N} \frac{ e^{-2\beta_j} \left( e^{ 2\beta_j \left( 1 - \cos\frac{t}{2\sigma_N^*} \right) } - 1 \right) }{ 1 + e^{-2\beta_j} } \right) \le \exp\left( \sum_{j=1}^{N} \frac{ e^{-2\beta_j}\, 2\beta_j \, \frac{11}{24} \left( \frac{t}{2\sigma_N^*} \right)^2 }{ 1 + e^{-2\beta_j} } \right) \le \exp\left( \frac{11}{12} \, e^{ -2 \min_{1\le j\le N} \beta_j } \, t^2 (1 + o(1)) \right) = e^{ t^2 o(1) }, $$

for B < |t| ≤ ε σ_N*. Here, we used that e^x ≥ x + 1, the shape of σ_N*², and condition (12). Therefore, we obtain

$$ |I_3| \le \int_{B < |t| \le \varepsilon \sigma_N^*} e^{ -\frac{10 t^2}{24} (1+o(1)) }\, dt \le \int_{B < |t|} e^{ -\frac{10 t^2}{24} (1+o(1)) }\, dt. $$

Consequently,

$$ |I_3| \to 0 \quad \text{as } B \to \infty. $$
Now, we turn to I_4. By (20), we have

$$ |\phi_j(x)| = \left| \frac{ e^{\beta_j e^{ix/2}} + e^{-\beta_j e^{ix/2}} }{ e^{\beta_j} + e^{-\beta_j} } \right| \le \frac{ e^{\beta_j \cos(x/2)} + e^{-\beta_j \cos(x/2)} }{ e^{\beta_j} + e^{-\beta_j} } \le \frac{ e^{\beta_j \cos(\varepsilon)} + e^{-\beta_j \cos(\varepsilon)} }{ e^{\beta_j} + e^{-\beta_j} } = \frac{ \operatorname{ch}(\beta_j \cos(\varepsilon)) }{ \operatorname{ch}(\beta_j) } $$

for ε < |x| ≤ π. Thus,

$$ |I_4| \le 2\pi \sigma_N^* \prod_{j=1}^{N} \frac{ \operatorname{ch}(\beta_j \cos(\varepsilon)) }{ \operatorname{ch}(\beta_j) } \le 2\pi \sigma_N^* \prod_{j=1}^{N} e^{ -(1 - \cos(\varepsilon)) \beta_j } \, \frac{ 1 + e^{ -2\beta_j \cos(\varepsilon) } }{ 1 + e^{-2\beta_j} }. $$

Since, by (12), min_{1≤j≤N} β_j → ∞, we have

$$ \frac{ 1 + e^{ -2\beta_j \cos(\varepsilon) } }{ 1 + e^{-2\beta_j} } \to 1 $$

as N → ∞, uniformly for 1 ≤ j ≤ N. Consequently, there exists N_0 ∈ ℕ such that

$$ \frac{ 1 + e^{ -2\beta_j \cos(\varepsilon) } }{ 1 + e^{-2\beta_j} } < e^{ \frac{1}{2} (1 - \cos(\varepsilon)) \beta_j } $$

for 1 ≤ j ≤ N, N > N_0. Therefore,

$$ |I_4| \le 2\pi \sigma_N^* \prod_{j=1}^{N} e^{ -\frac{1}{2} (1 - \cos(\varepsilon)) \beta_j } \le 2\pi \sigma_N^* \, e^{ -\frac{1}{2} (1 - \cos(\varepsilon)) \, \sigma_N^{*2} (1 + o(1)) } $$

for N > N_0. Therefore, we obtain that

$$ I_4 \to 0. $$
Finally, using formulae (32), (33), (34), and (35) to approximate the left-hand side of (31), we obtain (29). □
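Lemma 5 can be illustrated numerically: for moderate N and β_j, the exact distribution of S_N* is computable by convolution, and σ_N* P{S_N* = n} should be close to the Gaussian density at the standardized point. The Python sketch below uses hypothetical parameter values (40 summands with β_j between 6 and 10, truncation at 40 atoms per variable) chosen purely for illustration.

```python
import math

def pmf_xi_star(beta, kmax):
    # P(xi* = k) proportional to beta^(2k) / (2k)!, truncated and renormalized
    w = [beta ** (2 * k) / math.factorial(2 * k) for k in range(kmax)]
    s = sum(w)
    return [x / s for x in w]

def convolve(p, q):
    r = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

betas = [6.0 + 0.1 * i for i in range(40)]   # moderately large, slowly varying beta_j
dist = [1.0]
m_star = s2_star = 0.0
for b in betas:
    p = pmf_xi_star(b, 40)
    dist = convolve(dist, p)                  # exact law of the partial sum
    mb = sum(k * pk for k, pk in enumerate(p))
    m_star += mb
    s2_star += sum((k - mb) ** 2 * pk for k, pk in enumerate(p))

sigma = math.sqrt(s2_star)
n = round(m_star)                             # a point in the central zone
exact = sigma * dist[n]
gauss = math.exp(-(n - m_star) ** 2 / (2 * s2_star)) / math.sqrt(2 * math.pi)
print(exact, gauss)                           # the two values agree to within a few percent
```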

5. Proofs of the Main Theorems

Proof of Theorem 2.
During the proof, we represent η_1, η_2, …, η_N in the form of (7). First, we prove a local version of our limit theorem. To this end, we study the case when the standardized random variables lie inside some bounded intervals; therefore, we need the following notation. Let C_{1i} < C_{2i}, 1 ≤ i ≤ K. Let

$$ k = \sum_{i=1}^{K} k_i, \qquad C^* = \max\{ |C_{ij}| : 1 \le j \le K,\ i = 1, 2 \}. $$
Let

$$ m_K^* = \sum_{i=K+1}^{N} m^*(\beta_i) \qquad \text{and} \qquad \sigma_K^{*2} = \sum_{i=K+1}^{N} \sigma^{*2}(\beta_i) $$

be the expectation and the variance of ∑_{i=K+1}^{N} ξ_i*. By Lemma 1, we have

$$ \prod_{i=1}^{K} \mathbb{P}(\xi_i = 2k_i) = \prod_{i=1}^{K} \frac{2}{\sqrt{2\pi \beta_i}} \exp\left( -\frac{(2k_i - \beta_i)^2}{2\beta_i} \right) (1 + o(1)), $$

uniformly for k_i, 1 ≤ i ≤ K, such that

$$ C_{1i} < \frac{2k_i - \beta_i}{\sqrt{\beta_i}} < C_{2i}, \qquad 1 \le i \le K. $$
Since

$$ \sigma_N^{*2} - \sigma_K^{*2} = \sum_{i=K+1}^{N} \frac{\beta_i}{4} \left( 1 + \frac{\beta_i}{\operatorname{ch}^2(\beta_i)} - \frac{e^{-\beta_i}}{\operatorname{ch}\beta_i} \right) = \frac{1+o(1)}{4} \sum_{i=K+1}^{N} \beta_i = \frac{1+o(1)}{4} \sum_{i=1}^{N} \beta_i \left( 1 - \frac{\sum_{i=1}^{K} \beta_i}{\sum_{i=1}^{N} \beta_i} \right) $$

$$ \ge \frac{1+o(1)}{4} \sum_{i=1}^{N} \beta_i \left( 1 - \frac{K \max_{1\le i\le K} \beta_i}{\sum_{i=1}^{N} \beta_i} \right) = (1+o(1)) \, \sigma_N^{*2}, $$

we have

$$ \sigma_K^{*2} = (1+o(1)) \, \sigma_N^{*2} \qquad \text{and} \qquad \frac{\sigma_K^*}{\sigma_N^*} = 1 + o(1). $$
Let k_i, 1 ≤ i ≤ K, be such that C_{1i} < (2k_i − β_i)/√β_i < C_{2i} for 1 ≤ i ≤ K. Using (12) and the above calculation, we have

$$ \frac{|n - k - m_K^*|}{\sigma_K^*} \le \frac{|n - m_N^*|}{\sigma_K^*} + \sum_{i=1}^{K} \frac{|k_i - m^*(\beta_i)|}{\sigma_K^*} < C + K C^* + o(1). $$
Therefore, by Lemma 5, we obtain

$$ \frac{ \mathbb{P}\left( \sum_{i=K+1}^{N} \xi_i^* = n - k \right) }{ \mathbb{P}\left( \sum_{i=1}^{N} \xi_i^* = n \right) } = \frac{ \frac{1}{\sqrt{2\pi}\,\sigma_K^*} \exp\left( -\frac{(n-k-m_K^*)^2}{2\sigma_K^{*2}} \right) (1+o(1)) }{ \frac{1}{\sqrt{2\pi}\,\sigma_N^*} \exp\left( -\frac{(n-m_N^*)^2}{2\sigma_N^{*2}} \right) (1+o(1)) } = \exp\left( \frac{(n-m_N^*)^2}{2\sigma_N^{*2}} - \frac{(n-k-m_K^*)^2}{2\sigma_N^{*2}} \right) (1+o(1)) $$

$$ = \exp\left( \frac{ k - (m_N^* - m_K^*) }{ \sigma_N^* } \cdot \frac{ (n - m_N^*) + (n - k - m_K^*) }{ 2 \sigma_N^* } \right) (1+o(1)). $$

Using (38), (37), and assumption (12), we obtain

$$ \left| \frac{ (n - m_N^*) + (n - k - m_K^*) }{ \sigma_N^* } \cdot \frac{ k - (m_N^* - m_K^*) }{ \sigma_N^* } \right| \le (2C + K C^* + o(1)) \sum_{i=1}^{K} \frac{\sigma^*(\beta_i)}{\sigma_N^*} \cdot \frac{ |k_i - m^*(\beta_i)| }{ \sigma^*(\beta_i) } \le (2C + K C^* + o(1)) (C^* + o(1)) \sum_{i=1}^{K} \frac{\sigma^*(\beta_i)}{\sigma_N^*} = o(1). $$

Using the above calculations, we have

$$ \frac{ \mathbb{P}\left( \sum_{i=K+1}^{N} \xi_i^* = n - k \right) }{ \mathbb{P}\left( \sum_{i=1}^{N} \xi_i^* = n \right) } = 1 + o(1). $$
Now, using (36) and (39) in formula (7), we obtain

$$ \mathbb{P}(\eta_1 = 2k_1, \dots, \eta_K = 2k_K) = \prod_{i=1}^{K} \frac{2}{\sqrt{2\pi \beta_i}} \exp\left( -\frac{(2k_i - \beta_i)^2}{2\beta_i} \right) (1+o(1)), $$

uniformly for k_i such that C_{1i} < (2k_i − β_i)/√β_i < C_{2i}, 1 ≤ i ≤ K. Thus, we have obtained Corollary 1.
Now, we can apply the well-known method of obtaining the integral version of the de Moivre–Laplace theorem from its local version. Thus, using the notation t_{k_i} = (2k_i − β_i)/√β_i and Δt_{k_i} = 2/√β_i for 1 ≤ i ≤ K, we obtain

$$ \mathbb{P}\left( \frac{\eta_1 - \beta_1}{\sqrt{\beta_1}} = t_{k_1}, \dots, \frac{\eta_K - \beta_K}{\sqrt{\beta_K}} = t_{k_K} \right) = \prod_{i=1}^{K} \frac{\Delta t_{k_i}}{\sqrt{2\pi}} \exp\left( -\frac{t_{k_i}^2}{2} \right) (1+o(1)). $$

Here, on the right-hand side, there is a member of the approximating sum of the integral of the K-dimensional standard normal probability density function; so, we obtain

$$ \mathbb{P}\left( C_{11} < \frac{\eta_1 - \beta_1}{\sqrt{\beta_1}} < C_{21}, \dots, C_{1K} < \frac{\eta_K - \beta_K}{\sqrt{\beta_K}} < C_{2K} \right) \to \prod_{i=1}^{K} \mathbb{P}( C_{1i} < \gamma_i < C_{2i} ). $$

This implies Theorem 2. □
Now, we turn to the proof of Theorem 4. Thus, we consider the homogeneous allocation scheme, and we assume that there is an even number of particles in each cell. That is why we consider Equation (6) with independent and identically distributed random variables ξ_1, …, ξ_N with distribution

$$ \mathbb{P}\{ \xi_i = 2k \} = \frac{ \beta^{2k} }{ (2k)! \operatorname{ch}(\beta) }, \qquad k = 0, 1, 2, \dots, \quad 1 \le i \le N. $$
As ξ_i(β) = 2ξ_i*(β), we see that

$$ m(\beta) = \mathbb{E}\xi_i(\beta) = 2\,\mathbb{E}\xi_i^*(\beta), \qquad \sigma^2(\beta) = \mathbb{E}(\xi_i(\beta) - m(\beta))^2 = 4\,\sigma^{*2}(\beta) $$

are the expectation and the variance of ξ_i(β). Let

$$ S_N = \sum_{i=1}^{N} \xi_i $$

be the sum of our random variables. We need the following corollary of Lemma 5.
Corollary 2.
Consider the homogeneous allocation scheme. Let β → ∞. Then, we have

$$ \mathbb{P}\{ S_N = 2k \} = \frac{2}{\sqrt{2\pi N}\,\sigma(\beta)} \exp\left( -\frac{(2k - N m(\beta))^2}{2N\sigma^2(\beta)} \right) (1 + o(1)) $$

as N → ∞, uniformly for |2k − N m(β)| / (√N σ(β)) < C, for any C > 0.
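The moments entering Corollary 2 admit closed forms, used in the proof of Theorem 4 below: m(β) = β th(β) and σ²(β) = β th(β) + β²/ch²(β). As a sanity check, the following Python sketch recomputes both from the defining series of ξ(β) (the truncation at 80 atoms is an assumption sufficient for the β values shown).

```python
import math

def moments(beta, kmax=80):
    # xi(beta) takes the value 2k with probability beta^(2k) / ((2k)! ch(beta))
    probs = [beta ** (2 * k) / math.factorial(2 * k) for k in range(kmax)]
    z = sum(probs)                                  # truncated ch(beta)
    mean = sum(2 * k * p for k, p in enumerate(probs)) / z
    var = sum((2 * k - mean) ** 2 * p for k, p in enumerate(probs)) / z
    return mean, var

for beta in (1.0, 4.0, 10.0):
    mean, var = moments(beta)
    assert abs(mean - beta * math.tanh(beta)) < 1e-9
    assert abs(var - (beta * math.tanh(beta) + beta ** 2 / math.cosh(beta) ** 2)) < 1e-8
print("moment formulas confirmed")
```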
Proof of Theorem 4.
First, we give a detailed proof for the two-dimensional distributions; then, we sketch the proof for the arbitrary finite dimensional distributions.
Let −∞ < b_{i1} < b_{i2} < ∞, i ∈ {1, 2}. Choose C > 0 such that −C < b_{i1} < b_{i2} < C, i ∈ {1, 2}. Let β be such that 2n/N = m(β). From (42) and Lemma 2, it follows that

$$ m(\beta) = \beta \, \frac{ e^{\beta} - e^{-\beta} }{ e^{\beta} + e^{-\beta} } = \beta \operatorname{th}(\beta) $$

and

$$ \sigma^2(\beta) = \beta \left( 1 - \frac{2 e^{-\beta}}{ e^{\beta} + e^{-\beta} } \right) + \frac{ 4\beta^2 }{ ( e^{\beta} + e^{-\beta} )^2 }. $$

Since f(β) = th(β) = (e^β − e^{−β})/(e^β + e^{−β}), β ≥ 0, is a bounded function, from (44) and from the condition 2n/N → ∞, we obtain that β → ∞. Therefore,

$$ \sigma^2(\beta) = \frac{2n}{N} (1 + o(1)) $$

as 2n/N → ∞. The condition 2n/N → ∞ implies that

$$ \frac{n [t_1 N]}{N} \to \infty, \qquad \frac{ n ([t_2 N] - [t_1 N]) }{N} \to \infty, \qquad \frac{ n (N - [t_2 N]) }{N} \to \infty. $$
Consequently, from Corollary 2, it follows that

$$ \mathbb{P}\{ S_N = 2n \} = \frac{2}{\sqrt{2\pi N}\,\sigma(\beta)} \exp\left( -\frac{ (2n - N m(\beta))^2 }{ 2N \sigma^2(\beta) } \right) (1+o(1)) = \frac{2}{\sqrt{4\pi n}} (1+o(1)), $$

uniformly for |2n − N m(β)| / (√N σ(β)) < C. Similarly,

$$ \mathbb{P}\{ S_{[Nt_1]} = 2k_1 \} = \frac{ 2 \exp\left( -\frac{ (2k_1 - [Nt_1] m(\beta))^2 }{ 2 [Nt_1] \sigma^2(\beta) } \right) }{ \sqrt{2\pi [Nt_1]}\, \sigma(\beta) } (1+o(1)) = \frac{ 2 \exp\left( -\frac{ \left( 2k_1 - \frac{2n[t_1N]}{N} \right)^2 }{ 4 n t_1 } \right) }{ \sqrt{4\pi n t_1} } (1+o(1)), $$

uniformly for |2k_1 − 2n[t_1N]/N| / √(2n t_1) < C, and

$$ \mathbb{P}\{ S_{[Nt_2]-[Nt_1]} = 2k_2 \} = \frac{ 2 \exp\left( -\frac{ (2k_2 - ([Nt_2]-[Nt_1]) m(\beta))^2 }{ 2([Nt_2]-[Nt_1]) \sigma^2(\beta) } \right) }{ \sqrt{2\pi ([Nt_2]-[Nt_1])}\, \sigma(\beta) } (1+o(1)) = \frac{ 2 \exp\left( -\frac{ \left( 2k_2 - \frac{2n([t_2N]-[t_1N])}{N} \right)^2 }{ 4n(t_2 - t_1) } \right) }{ \sqrt{4\pi n (t_2 - t_1)} } (1+o(1)), $$

uniformly for |2k_2 − 2n([t_2N]−[t_1N])/N| / √(2n(t_2−t_1)) < C. Since

$$ \frac{ \left| 2k_1 + 2k_2 - \frac{2n[t_2N]}{N} \right| }{ \sqrt{4n(1-t_2)} } \le \sqrt{ \frac{t_1}{1-t_2} } \, \frac{ \left| 2k_1 - \frac{2n[t_1N]}{N} \right| }{ \sqrt{4n t_1} } + \sqrt{ \frac{t_2 - t_1}{1-t_2} } \, \frac{ \left| 2k_2 - \frac{2n([t_2N]-[t_1N])}{N} \right| }{ \sqrt{4n(t_2-t_1)} }, $$

we have

$$ \mathbb{P}\{ S_{N-[Nt_2]} = 2n - 2k_1 - 2k_2 \} = \frac{ 2 \exp\left( -\frac{ (2n - (2k_1+2k_2) - (N-[Nt_2]) m(\beta))^2 }{ 2(N-[Nt_2]) \sigma^2(\beta) } \right) }{ \sqrt{2\pi (N-[Nt_2])}\, \sigma(\beta) } (1+o(1)) = \frac{ 2 \exp\left( -\frac{ \left( 2k_1 + 2k_2 - \frac{2n[t_2N]}{N} \right)^2 }{ 4n(1-t_2) } \right) }{ \sqrt{4\pi n (1-t_2)} } (1+o(1)), $$

uniformly for |2k_1 − 2n[t_1N]/N| / √(2n t_1) < C and |2k_2 − 2n([t_2N]−[t_1N])/N| / √(2n(t_2−t_1)) < C.
For short, let A = [t_1N]/N and B = ([t_2N] − [t_1N])/N. Using Equations (45)–(48) to approximate the probabilities in (16), and applying the definition of Σ from (17), we obtain

$$ \mathbb{P}\{ X_{2n,N}(t_1) = 2k_1,\; X_{2n,N}(t_2) - X_{2n,N}(t_1) = 2k_2 \} = \frac{ \mathbb{P}\{ S_{[Nt_1]} = 2k_1 \} \, \mathbb{P}\{ S_{[Nt_2]-[Nt_1]} = 2k_2 \} \, \mathbb{P}\{ S_{N-[Nt_2]} = 2n - 2k_1 - 2k_2 \} }{ \mathbb{P}\{ S_N = 2n \} } $$

$$ = \frac{ \frac{2\exp\left(-\frac{(2k_1 - 2nA)^2}{4nt_1}\right)}{\sqrt{4\pi n t_1}} \cdot \frac{2\exp\left(-\frac{(2k_2 - 2nB)^2}{4n(t_2-t_1)}\right)}{\sqrt{4\pi n (t_2-t_1)}} \cdot \frac{2\exp\left(-\frac{\left(2k_1+2k_2 - \frac{2n[t_2N]}{N}\right)^2}{4n(1-t_2)}\right)}{\sqrt{4\pi n (1-t_2)}} }{ \frac{2}{\sqrt{4\pi n}} } (1+o(1)) $$

$$ = \frac{4}{2\pi \cdot 2n \sqrt{|\Sigma|}} \exp\left( -\frac{1}{2} \left( \frac{(2k_1-2nA)^2 (1-(t_2-t_1))}{t_1(1-t_2)\, 2n} + \frac{(2k_2-2nB)^2 (1-t_1)}{(t_2-t_1)(1-t_2)\, 2n} + \frac{2(2k_1-2nA)(2k_2-2nB)}{(1-t_2)\, 2n} \right) \right) (1+o(1)) $$

$$ = \frac{2}{n} \cdot \frac{ \exp\left( -\frac{1}{2} \left( \frac{2k_1 - 2nA}{\sqrt{2n}},\, \frac{2k_2 - 2nB}{\sqrt{2n}} \right) \Sigma^{-1} \left( \frac{2k_1 - 2nA}{\sqrt{2n}},\, \frac{2k_2 - 2nB}{\sqrt{2n}} \right)^{\!\top} \right) }{ 2\pi \sqrt{|\Sigma|} } (1+o(1)), $$

uniformly for |2k_1 − 2n[t_1N]/N| / √(2n t_1) < C, |2k_2 − 2n([t_2N]−[t_1N])/N| / √(2n(t_2−t_1)) < C.
From (49) and (4), using the same argument as in the proof of the de Moivre–Laplace theorem, we obtain

$$ \mathbb{P}\left( b_{11} < Y_{2n,N}(t_1) < b_{12},\; b_{21} < Y_{2n,N}(t_2) - Y_{2n,N}(t_1) < b_{22} \right) $$

$$ = \sum_{ b_{11} < \frac{2k_1 - 2nA}{\sqrt{2n}} < b_{12},\ b_{21} < \frac{2k_2 - 2nB}{\sqrt{2n}} < b_{22} } \mathbb{P}\{ X_{2n,N}(t_1) = 2k_1,\; X_{2n,N}(t_2) - X_{2n,N}(t_1) = 2k_2 \} $$

$$ = \sum_{ b_{11} < \frac{2k_1 - 2nA}{\sqrt{2n}} < b_{12},\ b_{21} < \frac{2k_2 - 2nB}{\sqrt{2n}} < b_{22} } \frac{2}{n} \cdot \frac{ \exp\left( -\frac{1}{2} \left( \frac{2k_1 - 2nA}{\sqrt{2n}},\, \frac{2k_2 - 2nB}{\sqrt{2n}} \right) \Sigma^{-1} \left( \frac{2k_1 - 2nA}{\sqrt{2n}},\, \frac{2k_2 - 2nB}{\sqrt{2n}} \right)^{\!\top} \right) }{ 2\pi \sqrt{|\Sigma|} } (1+o(1)) $$

$$ = \int_{b_{11}}^{b_{12}} \int_{b_{21}}^{b_{22}} \frac{1}{2\pi \sqrt{|\Sigma|}} \exp\left( -\frac{1}{2} (x, y)\, \Sigma^{-1} (x, y)^{\top} \right) dy\, dx\, (1+o(1)) $$

$$ = \mathbb{P}\{ b_{11} < W^0(t_1) < b_{12},\; b_{21} < W^0(t_2) - W^0(t_1) < b_{22} \} (1+o(1)). $$

Thus, the two-dimensional distributions of Y_{2n,N} converge to the two-dimensional distributions of W^0.
Now, we sketch the proof for the l-dimensional distributions. Let 0 < t_1 < t_2 < ⋯ < t_l < 1. Then,

$$ \mathbb{P}\{ X_{2n,N}(t_1) = 2k_1,\; X_{2n,N}(t_2) - X_{2n,N}(t_1) = 2k_2,\; \dots,\; X_{2n,N}(t_l) - X_{2n,N}(t_{l-1}) = 2k_l \} $$

$$ = \frac{ \mathbb{P}\{ S_{[Nt_1]} = 2k_1 \} \, \mathbb{P}\{ S_{[Nt_2]-[Nt_1]} = 2k_2 \} \cdots \mathbb{P}\{ S_{[Nt_l]-[Nt_{l-1}]} = 2k_l \} \, \mathbb{P}\{ S_{N-[Nt_l]} = 2n - 2(k_1 + \cdots + k_l) \} }{ \mathbb{P}\{ S_N = 2n \} } $$

$$ = \frac{ 2\exp\left( -\frac{\left( 2k_1 - \frac{2n[t_1N]}{N} \right)^2}{4nt_1} \right) }{ \sqrt{4\pi n t_1} } \prod_{j=2}^{l} \frac{ 2\exp\left( -\frac{\left( 2k_j - \frac{2n([t_jN]-[t_{j-1}N])}{N} \right)^2}{4n(t_j - t_{j-1})} \right) }{ \sqrt{4\pi n (t_j - t_{j-1})} } \cdot \frac{ 2\exp\left( -\frac{\left( 2k_1 + \cdots + 2k_l - \frac{2n[t_lN]}{N} \right)^2}{4n(1-t_l)} \right) }{ \sqrt{4\pi n (1-t_l)} } \Big/ \frac{2}{\sqrt{4\pi n}} \; (1+o(1)) $$

$$ = \left( \frac{2}{\sqrt{2n}} \right)^{l} \frac{1}{ (2\pi)^{l/2} \sqrt{ \prod_{r=1}^{l+1} (t_r - t_{r-1}) } } \exp\left( -\frac{U(x)}{2} \right) (1+o(1)), $$

where t_0 = 0, t_{l+1} = 1, and the quadratic form U(x) has the following shape:

$$ U(x) = \frac{x_1^2}{b_1} + \cdots + \frac{x_l^2}{b_l} + \frac{ (x_1 + \cdots + x_l)^2 }{ 1 - (b_1 + \cdots + b_l) }. $$

Here, we used the notation

$$ x_j = \frac{ 2k_j - \frac{2n([t_jN] - [t_{j-1}N])}{N} }{ \sqrt{2n} }, \qquad b_j = t_j - t_{j-1}, \qquad j = 1, 2, \dots, l. $$

Now, by some algebra, we see that

$$ U(x) = \frac{ \sum_{j=1}^{l} x_j^2 \left( 1 - \sum_{i \ne j} b_i \right) \prod_{i \ne j} b_i + \sum_{i \ne j} x_i x_j \prod_{r=1}^{l} b_r }{ \left( 1 - \sum_{r=1}^{l} b_r \right) \prod_{r=1}^{l} b_r }. $$
We need the l × l matrix

$$ D = \begin{pmatrix} b_1(1-b_1) & -b_1 b_2 & -b_1 b_3 & \cdots & -b_1 b_l \\ -b_2 b_1 & b_2(1-b_2) & -b_2 b_3 & \cdots & -b_2 b_l \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -b_l b_1 & -b_l b_2 & -b_l b_3 & \cdots & b_l(1-b_l) \end{pmatrix} $$

and its inverse

$$ D^{-1} = \frac{1}{ \left( 1 - \sum_{r=1}^{l} b_r \right) \prod_{r=1}^{l} b_r } \begin{pmatrix} \left( 1 - \sum_{r \ne 1} b_r \right) \prod_{r \ne 1} b_r & \prod_{r=1}^{l} b_r & \cdots & \prod_{r=1}^{l} b_r \\ \prod_{r=1}^{l} b_r & \left( 1 - \sum_{r \ne 2} b_r \right) \prod_{r \ne 2} b_r & \cdots & \prod_{r=1}^{l} b_r \\ \vdots & \vdots & \ddots & \vdots \\ \prod_{r=1}^{l} b_r & \prod_{r=1}^{l} b_r & \cdots & \left( 1 - \sum_{r \ne l} b_r \right) \prod_{r \ne l} b_r \end{pmatrix}. $$

We can see that we obtain the covariance matrix of the increments

$$ \left( W^0(t_1) - W^0(0),\; W^0(t_2) - W^0(t_1),\; \dots,\; W^0(t_l) - W^0(t_{l-1}) \right) $$

of the Brownian bridge W^0 if we insert b_j = t_j − t_{j−1}, j = 1, 2, …, l, into the matrix D. Denote this matrix by Σ (for any fixed value of l). The determinant of D is (1 − ∑_{r=1}^{l} b_r) ∏_{r=1}^{l} b_r, so the determinant of Σ is ∏_{r=1}^{l+1}(t_r − t_{r−1}). We can also check that the matrix of the quadratic form U(x) is D^{−1}.
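The determinant identity and the stated form of D^{−1} can be checked by exact rational arithmetic; note that D = diag(b) − bb^⊤, so the inverse also follows from the Sherman–Morrison formula. A Python sketch for l = 3, with the values of b_j chosen purely for illustration:

```python
from fractions import Fraction as F

l = 3
b = [F(1, 6), F(1, 4), F(1, 3)]   # b_j = t_j - t_{j-1}; illustrative values with sum < 1
S = sum(b)
P = b[0] * b[1] * b[2]

# D: covariance matrix of the Brownian-bridge increments
D = [[b[i] * (1 - b[i]) if i == j else -b[i] * b[j] for j in range(l)] for i in range(l)]

# the inverse as stated above
den = (1 - S) * P
Dinv = [[((1 - (S - b[i])) * (P / b[i]) if i == j else P) / den for j in range(l)]
        for i in range(l)]

# check D * Dinv = identity, entry by entry
I = [[sum(D[i][k] * Dinv[k][j] for k in range(l)) for j in range(l)] for i in range(l)]
assert all(I[i][j] == (1 if i == j else 0) for i in range(l) for j in range(l))

# check det(D) = (1 - sum b_r) * prod b_r via cofactor expansion
def det(M):
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

assert det(D) == den
print("inverse and determinant confirmed")
```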
So, by using the above considerations, Equation (50) gives

$$ \mathbb{P}\{ X_{2n,N}(t_1) = 2k_1,\; X_{2n,N}(t_2) - X_{2n,N}(t_1) = 2k_2,\; \dots,\; X_{2n,N}(t_l) - X_{2n,N}(t_{l-1}) = 2k_l \} = \left( \frac{2}{\sqrt{2n}} \right)^{l} \frac{1}{ (2\pi)^{l/2} \sqrt{\det \Sigma} } \exp\left( -\frac{ x^{\top} \Sigma^{-1} x }{2} \right) (1+o(1)), $$

where x = (x_1, x_2, …, x_l)^⊤ and x_j is defined in (51). This implies that the finite dimensional distributions of Y_{2n,N} converge to the finite dimensional distributions of W^0. □
Remark 1.
Relation (52) is a local limit theorem for the random allocation.

6. Discussion

The random allocation of particles into cells is a well-known model in probability theory. There are limit theorems when the number of particles, the number of cells, or both tend to infinity; see [1]. The errors in the blocks of a binary file can be modelled as a random allocation. However, if parity bits are used, then any odd number of errors in a block is always detected, while an even number of errors is never detected. Therefore, it is an interesting problem to describe the behaviour of the allocation model when there is an even number of particles in each cell.
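The parity-bit phenomenon mentioned above is easy to demonstrate: flipping an odd number of bits in a block changes its parity, while flipping an even number leaves the parity, and hence the check, unchanged. A minimal Python illustration (the block length of 32 is an arbitrary choice):

```python
import random

def parity(bits):
    # even-parity check bit of a block
    return sum(bits) % 2

random.seed(0)
block = [random.randint(0, 1) for _ in range(32)]
check = parity(block)

def flip(bits, positions):
    out = bits[:]
    for p in positions:
        out[p] ^= 1
    return out

# one flipped bit (odd number of errors) changes the parity -> detected
assert parity(flip(block, [3])) != check
# two flipped bits (even number of errors) leave the parity unchanged -> undetected
assert parity(flip(block, [3, 17])) == check
print("odd error detected, even error undetected")
```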
In this paper, we considered the numbers of particles in the cells when 2n distinguishable particles are allocated into N distinct cells so that each cell contains an even number of particles. For the non-homogeneous case, we studied the numbers of particles in the first K cells and proved the asymptotic normality of this K-dimensional random vector as n, N → ∞. For the homogeneous allocation model, we proved convergence to the finite dimensional distributions of the Brownian bridge as n, N → ∞. To handle the mathematical problem, we inserted our model into the framework of Kolchin’s generalized allocation scheme. Using the above limit theorems, we obtained two χ²-tests. As the parity bit method does not detect any even number of errors in the blocks of a binary file, we suggest applying our model to study the distribution of errors in such files.

Author Contributions

Conceptualization, A.N.C.; methodology, A.N.C.; software, I.F.; formal analysis, A.N.C. and I.F.; investigation, A.N.C. and I.F.; writing—original draft preparation, A.N.C.; writing—review and editing, I.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Acknowledgments

The authors would like to thank the referees for the helpful remarks.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Kolchin, V.F.; Sevast’yanov, B.A.; Chistiakov, V.P. Random Allocations; Scripta Series in Mathematics; V. H. Winston & Sons: Washington, DC, USA, 1978.
2. Chikrin, D.E.; Chuprunov, A.N.; Kokunin, P.A. Gaussian limit theorems for the number of given value cells in the non-homogeneous generalized allocation scheme. J. Math. Sci. 2020, 246, 476–487.
3. Billingsley, P. Convergence of Probability Measures; Wiley: New York, NY, USA, 1968.
4. Kolchin, V.F. A class of limit theorems for conditional distributions. Lith. Math. J. 1968, 8, 53–63. (In Russian)
5. Kolchin, V.F. Random Graphs; Cambridge University Press: Cambridge, UK, 1998.
6. Abdushukurov, F.A.; Chuprunov, A.N. Poisson limit theorems in an allocation scheme with even number of particles in each cell. Lobachevskii J. Math. 2020, 41, 289–297.
7. Timashev, A.M. Asymptotic Expansions in Probabilistic Combinatorics; TVP Science Publishers: Moscow, Russia, 2011. (In Russian)
Figure 1. The histograms of the first and the second coordinates in Example 3. (a) First coordinate; (b) Second coordinate.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
