Article

Comparing Distributions of Sums of Random Variables by Deficiency: Discrete Case

by Vladimir E. Bening 1,2 and Victor Y. Korolev 1,2,3,*
1 Faculty of Computational Mathematics and Cybernetics, Moscow State University, 119991 Moscow, Russia
2 Moscow Center for Fundamental and Applied Mathematics, 119991 Moscow, Russia
3 Federal Research Center “Computer Science and Control”, Russian Academy of Sciences, 119333 Moscow, Russia
* Author to whom correspondence should be addressed.
Mathematics 2022, 10(3), 454; https://doi.org/10.3390/math10030454
Submission received: 27 December 2021 / Revised: 22 January 2022 / Accepted: 28 January 2022 / Published: 30 January 2022
(This article belongs to the Special Issue Stability Problems for Stochastic Models: Theory and Applications II)

Abstract

In the paper, we consider a new approach to the comparison of the distributions of sums of random variables. Unlike preceding works, for this purpose we use the notion of deficiency that is well known in mathematical statistics. This approach is used, first, to determine the distribution of a separate random variable in the sum that provides the least possible number of summands guaranteeing the prescribed value of the $(1-\alpha)$-quantile of the normalized sum for a given $\alpha\in(0,1)$, and second, to determine the distribution of a separate random variable in the sum that provides the least possible number of summands guaranteeing the prescribed value of the probability for the normalized sum to fall into a given interval. Both problems are solved under the condition that the possible distributions of random summands have the same first three moments. In both settings the best distribution delivers the smallest number of summands. Along with distributions of a non-random number of summands, we consider the case of random summation and introduce an analog of deficiency which can be used to compare the distributions of sums with random and non-random numbers of summands. The main mathematical tools used in the paper are asymptotic expansions for the distributions of $\mathbb{R}$-valued functions of random vectors, in particular, normalized sums of independent identically distributed random variables, and their quantiles. Along with the general case, main attention is paid to the situation where the summed random variables are independent and identically distributed. The approach under consideration is applied to the determination of the distribution of insurance payments providing the least insurance portfolio size under a prescribed Value-at-Risk or non-ruin probability.

1. Introduction

1.1. The Problem under Consideration and the Structure of the Paper

The problem considered in the paper is very close to the problem of stochastic ordering and may even be considered as a version of this problem. In probability theory and statistics, a stochastic order quantifies the concept of one random variable being “bigger” or “smaller” than another. Many different orders exist, which have different applications, see, e.g., the book [1]. Here we propose an approach to establishing stochastic order for the distributions of sums of independent random variables (r.v.s) based on the notion of deficiency that is well known in asymptotic statistics, see, e.g., [2] and later publications [3,4,5]. Roughly speaking, in statistics the deficiency of a statistical procedure with respect to an ‘optimal’ procedure is the number of additional observations required to attain the same quality of inference as is guaranteed by the ‘optimal’ procedure.
In this paper the deficiency is measured in natural-valued discrete units (the number of ‘additional’ summands), and therefore we deal with the discrete case. The notion of deficiency can be extended to the case of a continuous parameter, say, time. This case will be considered in another work.
Along with the general case, in the paper main attention is paid to the situation where the r.v.s being summed are assumed to be independent and identically distributed.
The first problem to be considered below consists in determining the distribution of a separate random variable in the sum that provides the least possible number of summands guaranteeing the prescribed value of the $(1-\alpha)$-quantile of the normalized sum for a given $\alpha\in(0,1)$. The second problem considered in the paper consists in determining the distribution of a separate random variable in the sum that provides the least possible number of summands guaranteeing the prescribed value of the probability for the normalized sum to fall into a given interval. Actually, in both problems we deal with ‘fine tuning’ of the distribution of a separate summand since we assume that the different possible distributions of random summands have the same first three moments, so that they can differ only by their kurtosis. In both settings the best distribution delivers the smallest number of summands.
We also consider the problem where some additional randomization is introduced so that the number of summands in the sum can itself be random. This randomization need not be artificially induced; it may also arise when the exact number of summands is a priori unknown and only its ‘expected’ value is available as the parameter of the problem. For this case we introduce an analog of deficiency which can be used to compare the distributions of sums with random and non-random numbers of summands.
Both problems are closely related with the problem of quantification of the accuracy of approximations provided by limit theorems of probability theory. The main mathematical tools used in the paper are asymptotic expansions for the distributions of normalized sums of independent identically distributed r.v.s and their quantiles.
The formal settings mentioned above can be applied to solving practical problems where the models of the observed statistical regularities have the form of distributions of sums of r.v.s and the number of summands plays a substantial role. For example, consider an insurance company whose portfolio consists of a finite number of insurance contracts. Formally, the portfolio is assumed to be a finite set of r.v.s each of which characterizes the income of the company related to a separate contract. Instead of income we can speak of loss assuming that income is a negative loss or that loss is a negative income.
In these terms, the first setting concerns the problem of determination of the distribution of a possible loss within a separate insurance contract (say, the distribution of an insurance payment) providing the least possible portfolio size and guaranteeing the prescribed Value-at-Risk for the average losses. The approach considered in the paper can be used when the distributions of the summands (possible losses) are known only up to their three first moments and the exact Value-at-Risk is not known for sure. In the second setting the latter requirement is replaced by that of guaranteeing the prescribed ‘non-ruin’ probability. Within the framework of this example in both settings the problem consists in the description of the best strategy of the insurance company, if by a strategy we mean the choice of the terms of a contract (e.g., the amount of insurance payment related to each possible insurance event), that is, of the distribution of possible loss within a separate contract. Briefly, the problem is to choose an optimal distribution of a separate loss among the distributions that have the same first three moments so that the portfolio size is least possible.
The paper is organized as follows. Section 1.2 contains a short overview of the properties of statistical deficiency. In Section 2 we outline some results concerning the asymptotic expansions for the distributions of $\mathbb{R}$-valued measurable functions of r.v.s and, in particular, for the distributions of normalized sums of r.v.s, as well as for their quantiles. In Section 3 the problem of comparison of the distributions of two sums of independent r.v.s by their deficiency is considered. The notion of asymptotic deficiency is introduced and some formulas for the calculation of asymptotic deficiency are presented. Section 3.1 contains the solution of this problem for these distributions providing a prescribed value of the $(1-\alpha)$-quantile for a given $\alpha\in(0,1)$. In Section 3.2 this problem is considered for the distributions of sums of independent r.v.s guaranteeing a prescribed probability for an $\mathbb{R}$-valued measurable function of r.v.s, in particular, for a normalized sum of r.v.s, to fall into a given interval. Section 4 contains an example of extension of the results of Section 3 to the case of a random number of summands in the sum (random portfolio size, in terms of the example dealing with an insurance company). In Section 4.1 asymptotic expansions for the asymptotic $(1-\alpha)$-quantile (called $\alpha$-reserve here) under a random portfolio size are presented and an analog of deficiency of the sum of a random number of summands (or the strategy with a random portfolio size) with respect to the distribution of the sum of a non-random number of summands (or a strategy with a non-random portfolio size) is considered. In Section 4.2 the problem of comparison of these distributions by an analog of deficiency is considered in the special case of a three-point distribution of the portfolio size.
Everywhere in what follows the set of real numbers is denoted by $\mathbb{R}$ and the set of natural numbers is denoted by $\mathbb{N}$. The distribution function of the standard normal law will be denoted by $\Phi(x)$,
$$\Phi(x)=\int_{-\infty}^{x}\varphi(y)\,dy,\qquad \varphi(x)=\frac{1}{\sqrt{2\pi}}\exp\Big\{-\frac{x^{2}}{2}\Big\},\qquad x\in\mathbb{R}.$$
The distribution of a random vector $(X_1,\ldots,X_n)$ will be denoted $\mathcal{L}(X_1,\ldots,X_n)$.

1.2. Asymptotic Deficiency

Following the classical terminology of [6], consider two decision rules (say, two statistical procedures) $D_n^*$ and $D_n$ whose quality is characterized by the quantities $\pi_n^*$ and $\pi_n$, respectively. Here $n$ is the number of observations $X_1,\ldots,X_n$ delivering the information underlying the decision rules. Assume that the rule $D_n^*$ is in some sense optimal whereas the rule $D_n$ is competing. For example, in the problem of estimation usually $\pi_n^*$ and $\pi_n$ are mean square deviations and $\pi_n^*\le\pi_n$. In the problem of testing hypotheses usually $\pi_n^*$ and $\pi_n$ are powers of tests so that $\pi_n^*\ge\pi_n$.
By $m(n)$ denote the number of observations required for the decision rule $D_{m(n)}$ based on $m(n)$ observations $X_1,\ldots,X_{m(n)}$ to attain the same quality as the ‘best’ rule $D_n^*$ based on $n$ observations $X_1,\ldots,X_n$. In what follows we will keep to the asymptotic approach assuming that $n\to\infty$. Following [7], by the asymptotic relative efficiency (a.r.e.) of the rule $D_n$ with respect to the rule $D_n^*$ we will mean the limit
$$e\equiv\lim_{n\to\infty}\frac{n}{m(n)}$$
(if it exists and does not depend on the sequence $m(n)$).
Instead of the ratio of the required numbers of observations, the difference $m(n)-n$ can be considered as well, vividly showing the additional number of observations required by the decision rule $D_n$. However, many authors considered the ratio $n/m(n)$, possibly because the asymptotic analysis of its properties is simpler.
The systematic analysis of the asymptotic behavior of the difference $m(n)-n$ was first carried out by Hodges and Lehmann in 1970 [2]. They suggested calling the difference $m(n)-n$ the deficiency of the competing decision rule $D_n$ with respect to the rule $D_n^*$ and introduced the notation
$$d_n=m(n)-n.$$
If the limit $\lim_{n\to\infty}d_n$ exists, then it is called the asymptotic deficiency of the competing decision rule $D_n$ with respect to the rule $D_n^*$ and is denoted $d$. The number $d$ is often called the deficiency of $D_n$ with respect to $D_n^*$. Note that if the a.r.e. $e\ne 1$, then $d=\infty$, so that this case is not so interesting. In [2] it was also noticed that for some decision rules (statistical procedures) there typically appear cases with $e=1$ (see, e.g., the book [8]), that is, in these cases the a.r.e. cannot answer the question of which rule is better, whereas the deficiency can clarify the case, because, generally speaking, in this case the asymptotic deficiency can be arbitrary.
So, the deficiency of $D_n$ with respect to $D_n^*$ shows how many additional observations (that is, how much extra information) are required to attain the desired quality if the decision rule $D_n$ is used instead of the ‘optimal’ decision rule $D_n^*$. Therefore, the notion of deficiency provides natural grounds for the asymptotic comparison of $D_n$ and $D_n^*$ in the case $e=1$. The study of the asymptotic behavior of the deficiency $d_n$ requires more sophisticated techniques than are used to find the limit $e$. As a rule, these techniques employ the construction of asymptotic expansions (a.e.s) for the corresponding functions characterizing the quality of decision rules (see, e.g., the books [7,8,9]).
Since the rules $D_n^*$ and $D_n$ have the quality characteristics $\pi_n^*$ and $\pi_n$, respectively, then, by the definition of the deficiency $d_n=m(n)-n$, for every $n$ we have
$$\pi_n^*=\pi_{m(n)}.\tag{2}$$
To solve Equation (2), the integer-valued quantity $m(n)$ should be treated as a variable taking arbitrary real values. For this purpose the function $\pi_{m(n)}$ can be defined for non-integer $m(n)$ by the formula
$$\pi_{m(n)}=\big(1-m(n)+[m(n)]\big)\,\pi_{[m(n)]}+\big(m(n)-[m(n)]\big)\,\pi_{[m(n)]+1}$$
(see [2]).
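The interpolation above can be illustrated by a minimal Python sketch. The function name `interpolated_quality` and the sample quality function $\pi_n=1/n$ are assumptions made only for this example and are not taken from [2]; $[m]$ is read here as the integer part of $m$.

```python
import math

def interpolated_quality(pi, m):
    """Linearly interpolate a quality function pi(k), defined for integer k, at a real m >= 1."""
    k = math.floor(m)        # integer part [m]
    frac = m - k             # fractional part m - [m]
    if frac == 0:
        return pi(k)
    # (1 - m + [m]) * pi_[m] + (m - [m]) * pi_([m] + 1)
    return (1 - frac) * pi(k) + frac * pi(k + 1)

# Hypothetical quality function: a mean square deviation decaying like 1/n.
pi = lambda n: 1.0 / n
value = interpolated_quality(pi, 2.5)   # average of pi(2) and pi(3)
print(value)
```

At integer arguments the sketch returns the original values, so the interpolated function agrees with $\pi_n$ wherever $\pi_n$ is defined.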
The functions $\pi_n^*$ and $\pi_n$ are usually unknown, so, in practice, their approximations are used. Assume that the a.e.s
$$\pi_n^*=\frac{a}{n^{r}}+\frac{b}{n^{r+s}}+o\big(n^{-r-s}\big),\tag{3}$$
and
$$\pi_n=\frac{a}{n^{r}}+\frac{c}{n^{r+s}}+o\big(n^{-r-s}\big),\tag{4}$$
hold, where $a$, $b$ and $c$ are some numbers that do not depend on $n$, and $r>0$ and $s>0$ are constants determining the rate of decrease of these quality criteria in $n$. The first terms in these expansions coincide, which means that the a.r.e. of the corresponding rules equals one. It can be easily obtained from relations (1)–(4) that
$$d_n=\frac{c-b}{r\,a}\,n^{1-s}+o\big(n^{1-s}\big)\tag{5}$$
(see [2] or [7]). Thus, the asymptotic deficiency has the form
$$d=\begin{cases}\pm\infty, & 0<s<1,\\[4pt] \dfrac{c-b}{r\,a}, & s=1,\\[4pt] 0, & s>1.\end{cases}\tag{6}$$
The asymptotic deficiency possesses the following obvious property of transitivity: if there is some third decision rule $\bar D_n$ with the quality characteristic $\bar\pi_n$ admitting an a.e. of the form (4), then the deficiency $\bar d_n$ of the rule $\bar D_n$ with respect to the rule $D_n^*$ satisfies the equality
$$\bar d_n=\tilde d_n+d_n,$$
where $\tilde d_n$ is the deficiency of the rule $\bar D_n$ with respect to $D_n$ and $d_n$ is the deficiency of $D_n$ with respect to $D_n^*$.
The case where s = 1 is most interesting, because in this case the asymptotic deficiency is finite. In the paper [2] some simple examples are given illustrating that this case is quite natural in mathematical statistics (also see the book [8]).
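As a rough numerical check of the limit $(c-b)/(ra)$ in the case $s=1$, the following Python sketch solves $\pi_n^*=\pi_m$ for a real $m$ by bisection. It is only a sketch under stated assumptions: the values $a=1$, $b=0.5$, $c=2$, $r=s=1$ are hypothetical, the remainder terms of the two expansions are dropped, and the equation is solved directly for the two-term expressions rather than through the interpolation formula.

```python
def quality(n, a, coef, r, s):
    """Two-term expansion a / n^r + coef / n^(r+s); the o(.) remainder is dropped."""
    return a / n**r + coef / n**(r + s)

def deficiency(n, a, b, c, r, s):
    """Solve quality(m; c) = quality(n; b) for real m by bisection and return d_n = m - n."""
    target = quality(n, a, b, r, s)
    lo, hi = 1.0, 10.0 * n        # for a, c > 0 the quality decreases in m on this bracket
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if quality(mid, a, c, r, s) > target:
            lo = mid              # competing rule still worse: more observations needed
        else:
            hi = mid
    return 0.5 * (lo + hi) - n

# Hypothetical coefficients, chosen only for illustration.
a, b, c, r, s = 1.0, 0.5, 2.0, 1.0, 1.0
d_n = deficiency(10_000, a, b, c, r, s)
d_limit = (c - b) / (r * a)
print(d_n, d_limit)
```

For these assumed coefficients the finite-$n$ value is already close to the limiting value, which illustrates why the case $s=1$ yields a finite asymptotic deficiency.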

2. Asymptotic Expansions for the Distributions of Normalized Sums of Random Variables

We begin with the most general case. Let $n\in\mathbb{N}$. Consider a finite set of r.v.s $X_1,\ldots,X_n$. For the time being we do not assume that the r.v.s $X_1,\ldots,X_n$ are independent and identically distributed. Let $L_n=L_n(X_1,\ldots,X_n)$ be an $\mathbb{R}$-valued measurable function of $X_1,\ldots,X_n$. (In what follows, when dealing with the example of the portfolio of an insurance company, we will call this function the generalized loss.) In particular, $L_n$ may be of the form $L_n=\sqrt{n}\,T_n$ where $T_n$ is the arithmetic mean,
$$T_n\equiv\frac{1}{n}\sum_{i=1}^{n}X_i.\tag{7}$$
As has already been said, the problem consists in describing the distribution of the r.v.s $X_i$ providing the least possible number of summands $n$ and guaranteeing the prescribed value of the $(1-\alpha)$-quantile of the function $L_n$ for a given $\alpha\in(0,1)$.
Let $\alpha\in(0,1)$ be a small number. Consider the quantity $c_\alpha(n)$ defined by the asymptotic relation
$$\mathsf{P}\big(L_n\ge c_\alpha(n)\big)=\alpha+o(n^{-1}),\qquad n\to\infty.$$
The quantity $c_\alpha(n)$ is the asymptotic $(1-\alpha)$-quantile of $L_n$. If $L_n=\sqrt{n}\,T_n$, then $c_\alpha(n)$ can be interpreted as the threshold, the exceedance of which by $L_n$ is undesirable and is assumed to have the prescribed small probability $\alpha$. In terms of an insurance company, $c_\alpha(n)$ is the asymptotic Value-at-Risk.
By applying the Taylor formula it is not difficult to obtain the following result.
Lemma 1.
Assume that there exist a distribution function $G(x)$ and functions $g_1(x)$ and $g_2(x)$ such that
$$\sup_{x\in\mathbb{R}}\Big|\mathsf{P}\big(L_n<x\big)-G(x)-\frac{1}{\sqrt{n}}\,g_1(x)-\frac{1}{n}\,g_2(x)\Big|=o(n^{-1}),$$
where the functions $G(x)$, $g_1(x)$ and $g_2(x)$ are smooth enough. Then the asymptotic $(1-\alpha)$-quantile $c_\alpha(n)$ of $L_n$ admits the a.e.
$$c_\alpha(n)=c_\alpha-\frac{g_1(c_\alpha)}{\sqrt{n}\,G'(c_\alpha)}-\frac{1}{n}\bigg[\frac{G''(c_\alpha)\,g_1^2(c_\alpha)}{2\,(G'(c_\alpha))^3}+\frac{G'(c_\alpha)\,g_2(c_\alpha)-g_1(c_\alpha)\,g_1'(c_\alpha)}{(G'(c_\alpha))^2}\bigg]+o(n^{-1}),$$
where $c_\alpha$ satisfies the equation $G(c_\alpha)=1-\alpha$.
Consider the application of this lemma to the case where $X_1,X_2,\ldots$ are independent identically distributed r.v.s such that
$$\mathsf{E}X_1=0,\quad \mathsf{E}X_1^2=1,\quad \mathsf{E}|X_1|^{k+\delta}<\infty,\quad k\in\mathbb{N},\ k\ge 3,\ \delta>0\tag{9}$$
and the function $L_n$ has the form $L_n=\sqrt{n}\,T_n$ with $T_n$ defined by (7). Here the condition $\mathsf{E}X_1=0$ means that the separate losses are centered by their expectations. Assume that the characteristic function $f(t)$ of the r.v. $X_1$ satisfies the Cramér condition (C)
$$\limsup_{|t|\to\infty}|f(t)|<1.\tag{10}$$
Under conditions (9) and (10), from Theorem 6.3.2 of [10] (also see [9]) it follows that there exist functions $Q_1(x),\ldots,Q_{k-2}(x)$ and a constant $C_{k,\delta}\in(0,\infty)$ such that
$$\sup_{x}\Big|\mathsf{P}\big(\sqrt{n}\,T_n<x\big)-\Phi(x)-\sum_{i=1}^{k-2}n^{-i/2}Q_i(x)\Big|\le C_{k,\delta}\,n^{-(k-2+\delta)/2},\qquad n\in\mathbb{N}.\tag{11}$$
For the definition of the functions $Q_1(x),\ldots,Q_{k-2}(x)$ see the book [10]. In particular,
$$Q_1(x)=-(x^2-1)\,\varphi(x)\,\frac{\mathsf{E}X_1^3}{6},$$
$$Q_2(x)=-(x^3-3x)\,\varphi(x)\,\frac{\mathsf{E}X_1^4-3}{24}-(x^5-10x^3+15x)\,\varphi(x)\,\frac{(\mathsf{E}X_1^3)^2}{72}.\tag{12}$$
Relations (11) and (12) and Lemma 1 directly imply the a.e. for the asymptotic $(1-\alpha)$-quantile $c_\alpha(n)$ of $L_n$ presented in the following lemma.
Lemma 2.
Let conditions (9) and (10) hold with $k=4$, $\delta>0$. Then the asymptotic $(1-\alpha)$-quantile $c_\alpha(n)$ of $L_n$ admits the a.e.
$$c_\alpha(n)=u_\alpha+\frac{\mathsf{E}X_1^3}{6\sqrt{n}}\,(u_\alpha^2-1)+\frac{1}{12n}\bigg[\frac{(\mathsf{E}X_1^3)^2}{3}\,(5u_\alpha-2u_\alpha^3)+\frac{\mathsf{E}X_1^4-3}{2}\,(u_\alpha^3-3u_\alpha)\bigg]+o(n^{-1}),$$
where $u_\alpha$ is the $(1-\alpha)$-quantile of the standard normal distribution: $\Phi(u_\alpha)=1-\alpha$.
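The expansion of Lemma 2 is straightforward to evaluate numerically. The following Python sketch is an illustration only: the function name `asymptotic_quantile` is an assumption, the $o(n^{-1})$ remainder is dropped, and the moments used in the example are those of centered Exp(1) summands.

```python
import math
from statistics import NormalDist

def asymptotic_quantile(alpha, n, m3, m4):
    """Expansion of Lemma 2 for the asymptotic (1 - alpha)-quantile of sqrt(n) * T_n.

    m3 = E X_1^3 and m4 = E X_1^4 for summands with zero mean and unit variance.
    """
    u = NormalDist().inv_cdf(1 - alpha)          # Phi(u) = 1 - alpha
    first = m3 / (6 * math.sqrt(n)) * (u**2 - 1)
    second = (m3**2 / 3 * (5 * u - 2 * u**3)
              + (m4 - 3) / 2 * (u**3 - 3 * u)) / (12 * n)
    return u + first + second

# Centered Exp(1) summands: E X^3 = 2, E X^4 = 9 (illustrative choice).
c = asymptotic_quantile(0.05, 100, 2.0, 9.0)
print(c)
```

With `m3 = 0` and `m4 = 3` the sketch reduces to the normal quantile $u_\alpha$; for the skewed example above the quantile is shifted upwards, mainly by the $n^{-1/2}$ term.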

3. The Comparison of the Distributions of Two Normalized Sums of Random Variables

3.1. The Asymptotic Deficiency of the Distributions of Summands Providing a Given $(1-\alpha)$-Quantile of the Normalized Sums

In this section we will present an approach to the comparison of the distributions of two sums of r.v.s in terms of the number of summands. The distribution of the random vector $X_1,\ldots,X_n$ will be denoted $\mathcal{L}(X_1,\ldots,X_n)$. Consider an $\mathbb{R}$-valued measurable function of $X_1,\ldots,X_n$.
From Lemma 1 we can easily obtain the following result.
Lemma 3.
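Consider a sequence $\{\epsilon_n\}_{n\ge 1}$ such that $\epsilon_n\to 0$ as $n\to\infty$. Under the conditions of Lemma 1 we have
$$\sup_{x\in\mathbb{R}}\Big|\mathsf{P}\big(L_n(X_1,\ldots,X_n)<x+\epsilon_n\big)-\mathsf{P}\big(L_n(X_1,\ldots,X_n)<x\big)-\epsilon_n G'(x)-\frac{\epsilon_n^2}{2}\,G''(x)-\frac{\epsilon_n}{\sqrt{n}}\,g_1'(x)\Big|=o\Big(\max\Big\{\epsilon_n^2,\ \frac{\epsilon_n}{\sqrt{n}},\ n^{-1}\Big\}\Big).$$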
Along with the r.v.s $X_1,\ldots,X_n$ resulting in the value $L_n(X_1,\ldots,X_n)$ of the function $L_n$, consider another set of r.v.s $Y_1,\ldots,Y_n$, according to which the value of the function $L_n$ is $L_n(Y_1,\ldots,Y_n)$. For example, $L_n(X_1,\ldots,X_n)$ may have the form $L_n(X_1,\ldots,X_n)=\sqrt{n}\,T_n$ with $T_n$ defined by (7) and $L_n(Y_1,\ldots,Y_n)$ may have the form $L_n(Y_1,\ldots,Y_n)=\sqrt{n}\,U_n$ where
$$U_n=\frac{1}{n}\sum_{i=1}^{n}Y_i.\tag{13}$$
Let the asymptotic $(1-\alpha)$-quantile $\bar c_\alpha(n)$ of $L_n$ correspond to the distribution $\mathcal{L}(Y_1,\ldots,Y_n)$:
$$\mathsf{P}\big(L_n(Y_1,\ldots,Y_n)\ge\bar c_\alpha(n)\big)=\alpha+o(n^{-1}),\qquad n\to\infty.$$
Assume that the a.e. for the distribution function of $L_n(Y_1,\ldots,Y_n)$ has the form
$$\mathsf{P}\big(L_n(Y_1,\ldots,Y_n)<x\big)=G(x)+\frac{1}{\sqrt{n}}\,g_1(x)+\frac{1}{n}\,\bar g_2(x)+o(n^{-1}),\tag{15}$$
where the functions $G(x)$, $g_1(x)$ and $\bar g_2(x)$ are smooth enough. The a.e. (15) differs from the a.e. for the distribution function of $L_n(X_1,\ldots,X_n)$ established by Lemma 1 only by the term of order $n^{-1}$, which means that the two distributions are rather close. Define the sequence of natural numbers $\{m(n)\}_{n\ge 1}$ by the equality
$$\mathsf{P}\big(L_{m(n)}(Y_1,\ldots,Y_{m(n)})\ge c_\alpha(m(n))\big)=\alpha+o(n^{-1}),\qquad n\to\infty.\tag{16}$$
If $m(n)-n=d+o(1)$, $d\in\mathbb{R}$, $n\to\infty$, then $d$ is the asymptotic deficiency of the distribution $\mathcal{L}(Y_1,\ldots,Y_n)$ with respect to the distribution $\mathcal{L}(X_1,\ldots,X_n)$. In other words, $d$ is the asymptotic number of ‘additional’ r.v.s to be included in the set $Y_1,\ldots,Y_n$ in order that the distribution $\mathcal{L}(Y_1,\ldots,Y_{m(n)})$ provides the same quality as the distribution $\mathcal{L}(X_1,\ldots,X_n)$.
Theorem 1.
Assume that the conditions of Lemma 1 and (15) hold and $G'(c_\alpha)\,c_\alpha\ne 0$. Then the asymptotic deficiency $d$ of the distribution $\mathcal{L}(Y_1,\ldots,Y_n)$ with respect to the distribution $\mathcal{L}(X_1,\ldots,X_n)$ has the form
$$d=\frac{2\,\big(g_2(c_\alpha)-\bar g_2(c_\alpha)\big)}{G'(c_\alpha)\,c_\alpha}+o(1).$$
Proof. 
From Lemma 1 and condition (15) it directly follows that
$$\bar c_\alpha(n)=c_\alpha-\frac{g_1(c_\alpha)}{\sqrt{n}\,G'(c_\alpha)}-\frac{1}{n}\bigg[\frac{G''(c_\alpha)\,g_1^2(c_\alpha)}{2\,(G'(c_\alpha))^3}+\frac{G'(c_\alpha)\,\bar g_2(c_\alpha)-g_1(c_\alpha)\,g_1'(c_\alpha)}{(G'(c_\alpha))^2}\bigg]+o(n^{-1})$$
and therefore
$$\epsilon_n\equiv\sqrt{\frac{m(n)}{n}}\;\bar c_\alpha(m(n))-c_\alpha(m(n))=\frac{d}{2n}\,c_\alpha-\frac{1}{n}\,\frac{g_2(c_\alpha)-\bar g_2(c_\alpha)}{G'(c_\alpha)}+o(n^{-1}).\tag{18}$$
Further, taking into account the definitions of $m(n)$ (see (16)) and $\epsilon_n$, we have
$$\alpha+o(n^{-1})=\mathsf{P}\big(L_{m(n)}(Y_1,\ldots,Y_{m(n)})\ge\bar c_\alpha(m(n))\big)=\mathsf{P}\Big(L_{m(n)}(Y_1,\ldots,Y_{m(n)})\ge\sqrt{\frac{n}{m(n)}}\,\big(c_\alpha(m(n))+\epsilon_n\big)\Big).\tag{19}$$
Applying Lemma 3 to the right-hand side of (19) we obtain
$$\alpha+o(n^{-1})=\mathsf{P}\big(L_{m(n)}(Y_1,\ldots,Y_{m(n)})\ge c_\alpha(m(n))\big)-\epsilon_n\,G'(c_\alpha)+o(n^{-1}).$$
Now from (16) and (18) it follows that
$$d=\frac{2\,\big(g_2(c_\alpha)-\bar g_2(c_\alpha)\big)}{G'(c_\alpha)\,c_\alpha}+o(1).$$
The theorem is proved. □
Now consider an example of the application of Theorem 1 to the optimization of the portfolio size of an insurance company. Let the possible losses $X_1,X_2,\ldots$ related to each insurance contract in the portfolio be independent identically distributed r.v.s satisfying conditions (9) and (10). Consider another distribution, under which the possible losses $Y_1,Y_2,\ldots$ are assumed to be independent identically distributed r.v.s such that
$$\mathsf{E}Y_1=0,\quad \mathsf{E}Y_1^2=1,\quad \mathsf{E}|Y_1|^{4+\delta}<\infty,\quad \delta>0.\tag{20}$$
Assume that the characteristic function $p(t)$ of the r.v. $Y_1$ satisfies the Cramér condition (C)
$$\limsup_{|t|\to\infty}|p(t)|<1.\tag{21}$$
For each $n$ consider the average losses $U_n$ defined by (13). Assume that
$$\mathsf{E}X_1^3=\mathsf{E}Y_1^3\tag{22}$$
(for example, the r.v.s $X_i$ and $Y_i$ are centered by their expectations and the distributions of these centered r.v.s are symmetric). From Lemma 2 and Theorem 1 we directly obtain the following statement.
Lemma 4.
Let conditions (9), (10) and (20)–(22) hold. Then the asymptotic (as $n\to\infty$) deficiency $d$ of the distribution $\mathcal{L}(Y_1,\ldots,Y_n)$ with respect to the distribution $\mathcal{L}(X_1,\ldots,X_n)$ (the ‘additional number of contracts’) has the form
$$d=\frac{\big(\mathsf{E}X_1^4-\mathsf{E}Y_1^4\big)\,(3-u_\alpha^2)}{12}+o(1).$$
Lemma 4 illustrates that if the distributions are close, then the deficiency is determined by the kurtosis.
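The leading term of Lemma 4 depends only on the two fourth moments and on $u_\alpha$, which the following Python sketch makes explicit. The function name and the numerical values are assumptions for illustration: the fourth moments 3 and 6 are those of the standard normal and of the unit-variance Laplace distribution, both symmetric, so that the third moments coincide.

```python
from statistics import NormalDist

def kurtosis_deficiency(alpha, m4_x, m4_y):
    """Leading term of Lemma 4: (E X^4 - E Y^4) * (3 - u_alpha^2) / 12."""
    u = NormalDist().inv_cdf(1 - alpha)   # Phi(u_alpha) = 1 - alpha
    return (m4_x - m4_y) * (3 - u**2) / 12

# Illustrative fourth moments: 3 for X, 6 for Y.
d = kurtosis_deficiency(0.01, 3.0, 6.0)
print(d)
```

Note that the factor $3-u_\alpha^2$ changes sign at $u_\alpha=\sqrt{3}$, so for a fixed pair of fourth moments the sign of the leading term depends on whether $u_\alpha$ lies above or below $\sqrt{3}$ (that is, on whether $\alpha$ is below or above roughly 0.042).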

3.2. The Asymptotic Deficiency of the Distributions of Summands Providing a Given Probability for the Normalized Sum to Fall into a Given Interval

To begin with, in this section we again consider the values $L_n(X_1,\ldots,X_n)$ and $L_n(Y_1,\ldots,Y_n)$ of a measurable $\mathbb{R}$-valued function on random vectors $(X_1,\ldots,X_n)$ and $(Y_1,\ldots,Y_n)$ with the distributions $\mathcal{L}(X_1,\ldots,X_n)$ and $\mathcal{L}(Y_1,\ldots,Y_n)$, respectively. The goal is to ensure that the value of $L_n$ falls into the interval $[S_1,S_2)$ for some given numbers $S_1<S_2$. As quality characteristics consider the probabilities
$$\pi_n=\mathsf{P}\big(S_1\le L_n(X_1,\ldots,X_n)<S_2\big),\qquad \bar\pi_n=\mathsf{P}\big(S_1\le L_n(Y_1,\ldots,Y_n)<S_2\big).\tag{23}$$
If $L_n(X_1,\ldots,X_n)=\sqrt{n}\,T_n$ (see (7)) and $L_n(Y_1,\ldots,Y_n)=\sqrt{n}\,U_n$ (see (13)), that is, normalized sums of r.v.s are considered, then relation (23) means that $\pi_n$ and $\bar\pi_n$ are the probabilities that the normalized sums of r.v.s lie in the interval $[S_1,S_2)$.
From the definition of π n we directly obtain the following result.
Lemma 5.
Assume that for some $r>0$ and $s>0$ there exist a distribution function $H(x)$ and functions $h_1(x)$, $h_2(x)$ and $\bar h_2(x)$ such that
$$\sup_{x\in\mathbb{R}}\Big|\mathsf{P}\big(L_n(X_1,\ldots,X_n)<x\big)-H(x)-\frac{1}{n^{r}}\,h_1(x)-\frac{1}{n^{r+s}}\,h_2(x)\Big|=o(n^{-r-s}),$$
$$\sup_{x\in\mathbb{R}}\Big|\mathsf{P}\big(L_n(Y_1,\ldots,Y_n)<x\big)-H(x)-\frac{1}{n^{r}}\,h_1(x)-\frac{1}{n^{r+s}}\,\bar h_2(x)\Big|=o(n^{-r-s}),$$
and, moreover, the functions $h_1(x)$, $h_2(x)$ and $\bar h_2(x)$ are measurable. Then $\pi_n$ and $\bar\pi_n$ admit the a.e.s
$$\pi_n=H(S_2)-H(S_1)+\frac{h_1(S_2)-h_1(S_1)}{n^{r}}+\frac{h_2(S_2)-h_2(S_1)}{n^{r+s}}+o(n^{-r-s}),$$
$$\bar\pi_n=H(S_2)-H(S_1)+\frac{h_1(S_2)-h_1(S_1)}{n^{r}}+\frac{\bar h_2(S_2)-\bar h_2(S_1)}{n^{r+s}}+o(n^{-r-s}).$$
Corollary 1.
Let $\epsilon_n\to 0$ as $n\to\infty$ and $S_2=S_1+\epsilon_n$. Assume that the functions $H(x)$, $h_1(x)$, $h_2(x)$ and $\bar h_2(x)$ are smooth enough and $h_1(S_2)\ne h_1(S_1)$. Then
$$\epsilon_n^{-1}\pi_n=H'(S_1)+\frac{\epsilon_n}{2}\,H''(S_1)+\frac{\epsilon_n^2}{6}\,H'''(S_1)+o(\epsilon_n^2)+\frac{1}{n^{r}}\,h_1'(S_1)+\frac{1}{2n^{r}}\,h_1''(S_1)\,\epsilon_n+o(\epsilon_n n^{-r})+\frac{1}{n^{r+s}}\,h_2'(S_1)+o(n^{-r-s}\epsilon_n^{-1}),$$
$$\epsilon_n^{-1}\bar\pi_n=H'(S_1)+\frac{\epsilon_n}{2}\,H''(S_1)+\frac{\epsilon_n^2}{6}\,H'''(S_1)+o(\epsilon_n^2)+\frac{1}{n^{r}}\,h_1'(S_1)+\frac{1}{2n^{r}}\,h_1''(S_1)\,\epsilon_n+o(\epsilon_n n^{-r})+\frac{1}{n^{r+s}}\,\bar h_2'(S_1)+o(n^{-r-s}\epsilon_n^{-1}).$$
Lemma 5, Corollary 1 and formula (6) directly imply the expression for the asymptotic deficiency with quality characteristics (23).
Theorem 2.
Let the conditions of Lemma 5 hold with $s=1$. Then the deficiency $d_n$ of the distribution $\mathcal{L}(Y_1,\ldots,Y_n)$ with the quality characteristic $\bar\pi_n$ with respect to the distribution $\mathcal{L}(X_1,\ldots,X_n)$ with the quality characteristic $\pi_n$ has the form
$$d_n=\frac{\bar h_2(S_2)-h_2(S_2)+h_2(S_1)-\bar h_2(S_1)}{r\,\big(h_1(S_2)-h_1(S_1)\big)}+o(1).$$
If $S_2=S_1+\epsilon_n$ with $\epsilon_n\to 0$ as $n\to\infty$ and $h_1'(S_1)\ne 0$, then the formal passage to the limit in (3.13) yields the formula
$$d_n=\frac{\bar h_2'(S_1)-h_2'(S_1)}{r\,h_1'(S_1)}+o(1).$$
Consider an example of the application of Theorem 2 to the optimization of the portfolio size of an insurance company. Let the possible losses $X_1,X_2,\ldots$ related to each insurance contract in the portfolio be independent identically distributed r.v.s satisfying conditions (9) and (10). Consider another distribution, under which the possible losses $Y_1,Y_2,\ldots$ are assumed to be independent identically distributed r.v.s satisfying conditions (20) and (21). Assume that in (9) and (20) $k=3$. We are interested in the asymptotic behavior of the average losses $T_n$ (see (7)) and $U_n$ (see (13)). Taking Lemma 5 into account, we obtain the following statement.
Lemma 6.
Let conditions (9), (10), (20) and (21) hold with $k=3$. Then
$$\mathsf{P}\big(\sqrt{n}\,T_n<x\big)=\Phi(x)+\frac{Q_1(x)}{\sqrt{n}}+\frac{Q_2(x)}{n}+o(n^{-1}),$$
$$\mathsf{P}\big(\sqrt{n}\,U_n<x\big)=\Phi(x)+\frac{\bar Q_1(x)}{\sqrt{n}}+\frac{\bar Q_2(x)}{n}+o(n^{-1}),$$
uniformly in $x\in\mathbb{R}$,
$$\pi_n=\Phi(S_2)-\Phi(S_1)+\frac{Q_1(S_2)-Q_1(S_1)}{\sqrt{n}}+\frac{Q_2(S_2)-Q_2(S_1)}{n}+o(n^{-1}),$$
$$\bar\pi_n=\Phi(S_2)-\Phi(S_1)+\frac{\bar Q_1(S_2)-\bar Q_1(S_1)}{\sqrt{n}}+\frac{\bar Q_2(S_2)-\bar Q_2(S_1)}{n}+o(n^{-1}),$$
where the functions $Q_1(x)$ and $Q_2(x)$ are defined in (12),
$$\bar Q_1(x)=-(x^2-1)\,\varphi(x)\,\frac{\mathsf{E}Y_1^3}{6},$$
$$\bar Q_2(x)=-(x^3-3x)\,\varphi(x)\,\frac{\mathsf{E}Y_1^4-3}{24}-(x^5-10x^3+15x)\,\varphi(x)\,\frac{(\mathsf{E}Y_1^3)^2}{72}.$$
Corollary 2.
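Let $\epsilon_n\to 0$ as $n\to\infty$ and $S_2=S_1+\epsilon_n$. Assume that the conditions of Lemma 6 hold. Then
$$\epsilon_n^{-1}\pi_n=\varphi(S_1)+\frac{\epsilon_n}{2}\,\varphi'(S_1)+\frac{\epsilon_n^2}{6}\,\varphi''(S_1)+o(\epsilon_n^2)+\frac{1}{\sqrt{n}}\,Q_1'(S_1)+\frac{\epsilon_n}{2\sqrt{n}}\,Q_1''(S_1)+o(\epsilon_n n^{-1/2})+\frac{1}{n}\,Q_2'(S_1)+o(n^{-1}\epsilon_n^{-1}),$$
$$\epsilon_n^{-1}\bar\pi_n=\varphi(S_1)+\frac{\epsilon_n}{2}\,\varphi'(S_1)+\frac{\epsilon_n^2}{6}\,\varphi''(S_1)+o(\epsilon_n^2)+\frac{1}{\sqrt{n}}\,\bar Q_1'(S_1)+\frac{\epsilon_n}{2\sqrt{n}}\,\bar Q_1''(S_1)+o(\epsilon_n n^{-1/2})+\frac{1}{n}\,\bar Q_2'(S_1)+o(n^{-1}\epsilon_n^{-1}).$$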
Theorem 2, Lemma 5 and formula (5) directly imply the following statement.
Theorem 3.
Let, in addition to the conditions of Lemma 5, $\mathsf{E}X_1^3=\mathsf{E}Y_1^3$. Then the deficiency $d_n$ of the distribution $\mathcal{L}(Y_1,\ldots,Y_n)$ with the quality characteristic $\bar\pi_n$ with respect to the strategy $\mathcal{L}(X_1,\ldots,X_n)$ with the quality characteristic $\pi_n$ (the ‘additional number of contracts’) has the form
$$d_n=2\,\frac{\bar Q_2(S_2)-Q_2(S_2)+Q_2(S_1)-\bar Q_2(S_1)}{Q_1(S_2)-Q_1(S_1)}\,n^{1/2}+o(n^{1/2}).$$
Consider an example where the asymptotic deficiency is finite.
Corollary 3.
Let $\epsilon_n=\frac{1}{n}$ and $S_2=S_1+\frac{1}{n}$, $\mathsf{E}X_1^3=\mathsf{E}Y_1^3=0$. Then under the conditions of Lemma 5 we have
$$\pi_n=\frac{\varphi(S_1)}{n}+\frac{\varphi'(S_1)+2Q_2'(S_1)}{n^2}+o(n^{-2}),$$
$$\bar\pi_n=\frac{\varphi(S_1)}{n}+\frac{\varphi'(S_1)+2\bar Q_2'(S_1)}{n^2}+o(n^{-2}).$$
Moreover, the deficiency $d_n$ has the form
$$d_n=\frac{2\,\big(\bar Q_2'(S_1)-Q_2'(S_1)\big)}{\varphi(S_1)}+o(1)=\frac{S_1^4-6S_1^2+3}{12}\,\big(\mathsf{E}Y_1^4-\mathsf{E}X_1^4\big)+o(1).$$

4. Random Number of Summands

4.1. Asymptotic Expansions for the Asymptotic $(1-\alpha)$-Quantile of $\mathbb{R}$-Valued Measurable Functions of a Random Number of Random Variables

In this section we consider the case where an additional randomization can be introduced into the problem. In this case the number of summands in the sum can be considered as random. This randomization need not be artificially induced; it may also arise when the exact portfolio size is unknown beforehand and only some ‘expected’ number of summands is available as the parameter of the problem.
Let natural-valued r.v.s $N_1,N_2,\ldots$ and r.v.s $X_1,X_2,\ldots$ be defined on one and the same probability space $(\Omega,\mathcal{A},\mathsf{P})$. In what follows we will assume that $n$ is the expected value of $N_n$,
$$\mathsf{E}N_n=n.\tag{25}$$
Assume that for each $n\ge 1$ the r.v. $N_n$ is independent of the sequence $X_1,X_2,\ldots$. As above, for each $n\ge 1$, consider the value of an $\mathbb{R}$-valued measurable function $L_n=L_n(X_1,\ldots,X_n)$. For each $n\ge 1$ consider the r.v. $L_{N_n}$ defined as
$$L_{N_n}(\omega)\equiv L_{N_n(\omega)}\big(X_1(\omega),\ldots,X_{N_n(\omega)}(\omega)\big),\qquad \omega\in\Omega.$$
Below we will assume that the following condition holds.
Condition A. There exist $k\in\mathbb{N}\setminus\{1\}$, $\alpha_{i,n}\in\mathbb{R}$, $i=1,\ldots,k$, $\beta_n>0$, $C_k>0$, a differentiable distribution function $G(x)$ and measurable functions $g_j(x)$, $j=1,\ldots,k$, such that
$$\beta_n\to 0,\qquad \max_{1\le i\le k}|\alpha_{i,n}|\to 0$$
as $n\to\infty$ and
$$\sup_{x}\Big|\mathsf{P}\big(L_n<x\big)-G(x)-\sum_{i=1}^{k}\alpha_{i,n}\,g_i(x)\Big|\le C_k\,\beta_n,\qquad n\in\mathbb{N}.$$
Lemma 7.
Let the function $L_n=L_n(X_1,\ldots,X_n)$ satisfy Condition A. Then
$$\sup_{x}\Big|\mathsf{P}\big(L_{N_n}<x\big)-G(x)-\sum_{i=1}^{k}g_i(x)\,\mathsf{E}\alpha_{i,N_n}\Big|\le C_k\,\mathsf{E}\beta_{N_n}.$$
The elementary proof of this lemma directly follows by the formula of total probability.
Consider an example of application of Lemma 7. Let $X_1,X_2,\ldots$ be independent identically distributed r.v.s satisfying conditions (9) and (10). Assume that the function $L_n$ is the normalized arithmetic mean (or, which is the same, the normalized sum) $L_n=\sqrt{n}\,T_n$ with $T_n$ defined in (7). Then, in accordance with what has been said in Section 2, relation (11) holds, implying the validity of Condition A. From (11) playing the role of Condition A and Lemma 7 we obtain the following statement.
Lemma 8.
Assume that $L_n=\sqrt{n}\,T_n$ with $T_n$ defined in (7) and conditions (9) and (10) hold. Then
$$\sup_{x}\Big|\mathsf{P}\big(\sqrt{N_n}\,T_{N_n}<x\big)-\Phi(x)-\sum_{i=1}^{k-2}Q_i(x)\,\mathsf{E}N_n^{-i/2}\Big|\le C_{k,\delta}\,\mathsf{E}N_n^{-(k-2+\delta)/2},$$
where the functions $Q_i(x)$ are defined in Theorem 6.3.2 of [10].
Relation (11) and Lemma 8 imply the following statement.
Lemma 9.
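Let conditions (9) and (10) hold with $k=4$ and $\delta>0$. Assume that condition (25) holds and
$$\mathsf{E}N_n^{-1/2}=\frac{1}{\sqrt{n}}+\frac{a}{n}+o(n^{-1}),\qquad a\in\mathbb{R},$$
$$\mathsf{E}N_n^{-1}=\frac{b}{n}+o(n^{-1}),\qquad \mathsf{E}N_n^{-(2+\delta)/2}=o(n^{-1}),\qquad b\in\mathbb{R}.$$
Then
$$\sup_{x}\Big|\mathsf{P}\big(\sqrt{n}\,T_n<x\big)-\Phi(x)-\frac{Q_1(x)}{\sqrt{n}}-\frac{Q_2(x)}{n}\Big|=o(n^{-1})$$
and
$$\sup_{x}\Big|\mathsf{P}\big(\sqrt{N_n}\,T_{N_n}<x\big)-\Phi(x)-\frac{Q_1(x)}{\sqrt{n}}-\frac{b\,Q_2(x)+a\,Q_1(x)}{n}\Big|=o(n^{-1}).$$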
We will use Lemma 9 in order to determine the asymptotic $(1-\alpha)$-quantile of $L_n$ and calculate the asymptotic deficiency.
Recall that, for $\alpha \in (0, 1)$, the asymptotic $(1-\alpha)$-quantile of $L_n$ is the quantity $c_\alpha(n)$ satisfying the asymptotic equality
$$\mathsf{P}\bigl(L_n \ge c_\alpha(n)\bigr) = \alpha + o(n^{-1}), \quad n \to \infty.$$
Correspondingly, we define the asymptotic $(1-\alpha)$-quantile $\tilde{c}_\alpha(n)$ of $L_{N_n}$ by the equation
$$\mathsf{P}\bigl(L_{N_n} \ge \tilde{c}_\alpha(n)\bigr) = \alpha + o(n^{-1}), \quad n \to \infty.$$
From Lemmas 1 and 9 we directly obtain the a.e.s for these asymptotic ( 1 α ) -quantiles.
Lemma 10.
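Under the conditions of Lemma 8, we have
$$c_\alpha(n) = u_\alpha + \frac{\mathsf{E} X_1^3}{6\sqrt{n}}\,(u_\alpha^2 - 1) + \frac{1}{12 n} \Bigl[ \frac{(\mathsf{E} X_1^3)^2}{3}\,(5 u_\alpha - 2 u_\alpha^3) + \frac{\mathsf{E} X_1^4 - 3}{2}\,(u_\alpha^3 - 3 u_\alpha) \Bigr] + o(n^{-1}),$$
$$\tilde{c}_\alpha(n) = u_\alpha + \frac{\mathsf{E} X_1^3}{6\sqrt{n}}\,(u_\alpha^2 - 1) + \frac{1}{12 n} \Bigl[ \frac{(\mathsf{E} X_1^3)^2}{3}\,(5 u_\alpha - 2 u_\alpha^3) + \frac{b\,(\mathsf{E} X_1^4 - 3)}{2}\,(u_\alpha^3 - 3 u_\alpha) + 2 a\, \mathsf{E} X_1^3\,(u_\alpha^2 - 1) \Bigr] + o(n^{-1}),$$
where $u_\alpha$ satisfies the equation $\Phi(u_\alpha) = 1 - \alpha$.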
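As an illustration, the two expansions of Lemma 10 can be evaluated numerically. The sketch below is a direct transcription of the formulas with the $o(n^{-1})$ remainders dropped. The function name `quantile_expansions` and the parameter names (`m3` for $\mathsf{E}X_1^3$, `m4` for $\mathsf{E}X_1^4$) are illustrative choices and do not come from the paper. The summands are assumed to be standardized ($\mathsf{E}X_1 = 0$, $\mathsf{E}X_1^2 = 1$).

```python
from statistics import NormalDist


def quantile_expansions(alpha, n, m3, m4, a=0.0, b=1.0):
    """Return (c_alpha(n), c~_alpha(n)) from Lemma 10, truncated at order 1/n.

    m3, m4 are the third and fourth moments of a standardized summand
    (assumption: E X = 0, E X^2 = 1); a, b are the constants of Lemma 9.
    """
    u = NormalDist().inv_cdf(1.0 - alpha)  # Phi(u_alpha) = 1 - alpha

    first_order = m3 * (u**2 - 1.0) / (6.0 * n**0.5)
    skew_sq_term = m3**2 / 3.0 * (5.0 * u - 2.0 * u**3)
    kurt_term = (m4 - 3.0) / 2.0 * (u**3 - 3.0 * u)

    # Non-random number of summands.
    c_fixed = u + first_order + (skew_sq_term + kurt_term) / (12.0 * n)
    # Random number of summands: b rescales the kurtosis term, a adds a skewness term.
    random_bracket = skew_sq_term + b * kurt_term + 2.0 * a * m3 * (u**2 - 1.0)
    c_random = u + first_order + random_bracket / (12.0 * n)
    return c_fixed, c_random


if __name__ == "__main__":
    # Hypothetical inputs: moments of a centered, scaled exponential law (m3 = 2, m4 = 9).
    print(quantile_expansions(alpha=0.05, n=400, m3=2.0, m4=9.0, a=0.25, b=1.0))
```

With `a=0` and `b=1` the two values coincide, which matches the case of a non-random number of summands.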
Now define the sequence $m(n)$ of natural numbers by the relation
$$\mathsf{P}\Bigl( \frac{\sqrt{n}\, L_{N_{m(n)}}}{\sqrt{m(n)}} \ge c_\alpha(m(n)) \Bigr) = \alpha + o(n^{-1}), \quad n \to \infty. \tag{28}$$
If
$$m(n) = n + d + o(1), \tag{29}$$
$n = 1, 2, \ldots$, then $d$ can be interpreted as the expected number of additional summands that must be included in the sum in order that $L_{N_n}$ exceeds the value $c_\alpha(n)$ corresponding to the loss under a non-random number $n$ of summands. The quantity $d$ will be called the asymptotic deficiency.
In the same way that Theorem 1 was proved, we can establish the following statement.
Theorem 4.
Assume that
$$\mathsf{E} N_n = n, \quad \mathsf{E} N_n^{-1/2} = \frac{1}{\sqrt{n}} + \frac{a}{n} + o(n^{-1}), \quad a \in \mathbb{R},$$
$$\mathsf{E} N_n^{-1} = \frac{b}{n} + o(n^{-1}), \quad \mathsf{E} N_n^{-(2+\delta)/2} = o(n^{-1}), \quad b \in \mathbb{R},$$
and there exist $\delta > 0$, a differentiable distribution function $G(x)$ and measurable functions $g_1(x)$ and $g_2(x)$ such that
$$\sup_x \Bigl| \mathsf{P}(L_n < x) - G(x) - \frac{g_1(x)}{\sqrt{n}} - \frac{g_2(x)}{n} \Bigr| \le \frac{C}{n^{(2+\delta)/2}}$$
and $G'(c_\alpha)\, c_\alpha \ne 0$. Then the expected number $d$ of additional summands (see (28) and (29)) in the normalized random sum $L_{N_n}$ with respect to the normalized sum $L_n$ has the form
$$d = \frac{2 \bigl[ g_2(c_\alpha)(1 - b) - a\, g_1(c_\alpha) \bigr]}{G'(c_\alpha)\, c_\alpha} + o(1),$$
where $c_\alpha$ satisfies the equation $G(c_\alpha) = 1 - \alpha$.
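The formula for $d$ can be motivated by the following heuristic sketch. It is not the authors' proof: the remainder terms are handled informally, and it is assumed in addition that $g_1$ is smooth near $c_\alpha$ and that $c_\alpha(n) = c_\alpha + O(n^{-1/2})$ admits an expansion in powers of $n^{-1/2}$, so that $c_\alpha(m) - c_\alpha(n) = o(n^{-1})$ for $m = n + d + o(1)$. Writing $m = m(n)$ and $y_n = \sqrt{m/n}\, c_\alpha(m) = c_\alpha(n)\bigl(1 + \tfrac{d}{2n}\bigr) + o(n^{-1})$, Lemma 7 and the moment conditions give

$$
\begin{aligned}
\alpha + o(n^{-1}) &= \mathsf{P}\bigl(L_{N_m} \ge y_n\bigr)
= 1 - G(y_n) - g_1(y_n) \Bigl( \frac{1}{\sqrt{m}} + \frac{a}{m} \Bigr) - \frac{b\, g_2(y_n)}{m} + o(n^{-1}) \\
&= \Bigl[ 1 - G(c_\alpha(n)) - \frac{g_1(c_\alpha(n))}{\sqrt{n}} - \frac{g_2(c_\alpha(n))}{n} \Bigr]
- \frac{G'(c_\alpha)\, c_\alpha\, d}{2n} - \frac{a\, g_1(c_\alpha)}{n} - \frac{(b - 1)\, g_2(c_\alpha)}{n} + o(n^{-1}).
\end{aligned}
$$

The expression in square brackets equals $\alpha + o(n^{-1})$ by the definition of $c_\alpha(n)$. Hence the remaining terms of order $n^{-1}$ must cancel:

$$
\frac{G'(c_\alpha)\, c_\alpha\, d}{2} + a\, g_1(c_\alpha) + (b - 1)\, g_2(c_\alpha) = 0
\quad \Longrightarrow \quad
d = \frac{2 \bigl[ (1 - b)\, g_2(c_\alpha) - a\, g_1(c_\alpha) \bigr]}{G'(c_\alpha)\, c_\alpha}.
$$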
Theorem 4 implies the following statement.
Corollary 4.
Under the conditions of Lemma 8 the expected additional number of summands $d$ (see (28) and (29)) corresponding to the normalized sum $\sqrt{N_n}\, T_{N_n}$ with a random number of summands with respect to the normalized sum $\sqrt{n}\, T_n$ has the form
$$d = \frac{2 \bigl[ (1 - b)\, Q_2(u_\alpha) - a\, Q_1(u_\alpha) \bigr]}{\varphi(u_\alpha)\, u_\alpha} + o(1).$$
If additionally $\mathsf{E} X_1^3 = 0$, then
$$d = \frac{(1 - b)(3 - u_\alpha^2)(\mathsf{E} X_1^4 - 3)}{12} + o(1).$$
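The formula of Corollary 4 can be evaluated as in the sketch below. It assumes that $Q_1$ and $Q_2$ are the standard Edgeworth correction terms for standardized summands (zero mean, unit variance), which are taken here to coincide with the functions of Theorem 6.3.2 of [10]. The function names `q1`, `q2`, `deficiency` and the parameters `m3`, `m4` are illustrative and not from the paper, and the $o(1)$ term is dropped.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def phi(x):
    """Standard normal density."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def q1(x, m3):
    """First Edgeworth term for a standardized summand with third moment m3."""
    return -m3 / 6.0 * (x * x - 1.0) * phi(x)


def q2(x, m3, m4):
    """Second Edgeworth term; m4 - 3 is the excess kurtosis."""
    he3 = x**3 - 3.0 * x
    he5 = x**5 - 10.0 * x**3 + 15.0 * x
    return -phi(x) * ((m4 - 3.0) / 24.0 * he3 + m3**2 / 72.0 * he5)


def deficiency(alpha, m3, m4, a, b):
    """Leading term of d in Corollary 4 (the o(1) remainder is ignored)."""
    u = NormalDist().inv_cdf(1.0 - alpha)
    if abs(u) < 1e-12:
        raise ValueError("the formula requires u_alpha != 0, i.e. alpha != 1/2")
    return 2.0 * ((1.0 - b) * q2(u, m3, m4) - a * q1(u, m3)) / (phi(u) * u)


if __name__ == "__main__":
    # Hypothetical inputs: symmetric summands (m3 = 0) with heavy tails (m4 = 9).
    print(deficiency(alpha=0.05, m3=0.0, m4=9.0, a=0.0, b=1.1))
```

For `m3 = 0` the value agrees with the closed form $(1 - b)(3 - u_\alpha^2)(\mathsf{E}X_1^4 - 3)/12$ of the corollary.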

4.2. An Example of Three-Point Distribution of the Number of Summands

In this section, keeping to the terminology of the example related to optimization of the portfolio size of an insurance company, we will use Corollary 4 to obtain a.e.s for the asymptotic Value-at-Risk (the asymptotic $(1-\alpha)$-quantile of the normalized average loss, or the asymptotic normalized $\alpha$-reserve) in the case where the portfolio size $N_n$ has a special distribution concentrated in three points so that it is symmetric around the central point.
Assume that the portfolio size $N_n$ has the distribution of the form
$$\mathsf{P}(N_n = n - h_n) = \mathsf{P}(N_n = n) = \mathsf{P}(N_n = n + h_n) = \frac{1}{3}, \tag{30}$$
where $h_n \in \mathbb{N}$, $h_n < n$, $n = 1, 2, \ldots$, and
$$\lim_{n \to \infty} \frac{h_n}{n} = 0. \tag{31}$$
Lemma 11.
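Let the random portfolio size $N_n$ have distribution (30) and let condition (31) hold. Then $\mathsf{E} N_n = n$ and, as $n \to \infty$,
$$\mathsf{E} N_n^{-1/2} = \frac{1}{\sqrt{n}} + \frac{1}{4\sqrt{n}} \Bigl( \frac{h_n}{n} \Bigr)^{2} + O\Bigl( \frac{1}{\sqrt{n}} \Bigl( \frac{h_n}{n} \Bigr)^{3} \Bigr),$$
$$\mathsf{E} N_n^{-1} = \frac{1}{n} + \frac{2}{3n} \Bigl( \frac{h_n}{n} \Bigr)^{2} + O\Bigl( \frac{1}{n} \Bigl( \frac{h_n}{n} \Bigr)^{4} \Bigr), \quad \mathsf{E} N_n^{-3/2} = \frac{1}{n^{3/2}} + O\Bigl( \frac{1}{n^{3/2}} \Bigl( \frac{h_n}{n} \Bigr)^{2} \Bigr).$$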
Proof. 
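The desired statements follow from the relations
$$\mathsf{E} N_n^{-1} = \frac{3n^2 - h_n^2}{3n(n^2 - h_n^2)} = \frac{1}{n} \Bigl( 1 - \frac{h_n^2}{3n^2} \Bigr) \Bigl( 1 + \frac{h_n^2}{n^2} + O\Bigl( \frac{h_n^4}{n^4} \Bigr) \Bigr) = \frac{1}{n} + \frac{2}{3n} \Bigl( \frac{h_n}{n} \Bigr)^{2} + O\Bigl( \frac{1}{n} \Bigl( \frac{h_n}{n} \Bigr)^{4} \Bigr),$$
$$\mathsf{E} N_n^{-3/2} = \frac{1}{3 n^{3/2}} \Bigl[ \frac{1}{(1 - h_n/n)^{3/2}} + 1 + \frac{1}{(1 + h_n/n)^{3/2}} \Bigr] = \frac{1}{n^{3/2}} + O\Bigl( \frac{1}{n^{3/2}} \Bigl( \frac{h_n}{n} \Bigr)^{2} \Bigr).$$
The formula for $\mathsf{E} N_n^{-1/2}$ is established in a similar way. □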
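The expansions of Lemma 11 can be checked numerically against the exact moments of the three-point distribution. The sketch below is only a sanity check under illustrative values of $n$ and $h_n$. The helper names `three_point_moment` and `leading_terms` are not from the paper. Because $x \mapsto x^{-1/2}$ is convex, Jensen's inequality gives $\mathsf{E} N_n^{-1/2} \ge n^{-1/2}$, so the second-order correction to $\mathsf{E} N_n^{-1/2}$ is non-negative.

```python
def three_point_moment(n, h, power):
    """Exact E N^power for N uniform on the three points {n - h, n, n + h}."""
    return ((n - h) ** power + n ** power + (n + h) ** power) / 3.0


def leading_terms(n, h):
    """Leading terms of E N^(-1/2), E N^(-1), E N^(-3/2) from a Taylor expansion in h/n."""
    eps2 = (h / n) ** 2
    return {
        -0.5: n ** -0.5 * (1.0 + eps2 / 4.0),
        -1.0: (1.0 + 2.0 * eps2 / 3.0) / n,
        -1.5: n ** -1.5,
    }


if __name__ == "__main__":
    n, h = 10_000, 1_000  # illustrative values with h / n = 0.1
    for power, approx in leading_terms(n, h).items():
        exact = three_point_moment(n, h, power)
        # Print the exact moment, the approximation and the absolute error.
        print(power, exact, approx, abs(exact - approx))
```

The printed errors shrink as $h_n / n$ decreases, in line with the remainder terms of the lemma.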
Lemmas 10 and 11 imply the following statement.
Theorem 5.
Assume that the normalized average loss has the form $L_n = \sqrt{n}\, T_n$ with $T_n$ defined in (7). Let the r.v. $N_n$ be distributed according to (30) and let condition (31) hold. Under the conditions of Lemma 9, for the asymptotic $\alpha$-reserve $\tilde{c}_\alpha(n)$ corresponding to the normalized average loss $\sqrt{N_n}\, T_{N_n}$ the following relation holds:
$$\tilde{c}_\alpha(n) = c_\alpha(n) + \frac{\mathsf{E} X_1^3\,(u_\alpha^2 - 1)}{24 \sqrt{n}} \Bigl( \frac{h_n}{n} \Bigr)^{2} + o(n^{-1}), \quad n \to \infty.$$
Remark 1.
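In addition to the conditions of Theorem 5, let
$$h_n = \gamma n^{\beta} + o(n^{\beta}), \quad \gamma \ge 0, \quad 0 \le \beta < 1.$$
Then, as $n \to \infty$,
$$n^{5/2 - 2\beta} \bigl( c_\alpha(n) - \tilde{c}_\alpha(n) \bigr) \to -\frac{\gamma^2}{24}\, \mathsf{E} X_1^3\,(u_\alpha^2 - 1).$$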
Applying Lemma 9, by simple calculations we obtain the following statement.
Lemma 12.
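Assume that conditions (9) and (10) hold with $k = 4$ and $0 < \delta \le 1$. Let conditions (30) and (31) hold. Then
$$\sup_x \Bigl| \mathsf{P}\bigl(\sqrt{N_n}\, T_{N_n} < x\bigr) - \Phi(x) - \Bigl( 1 + \frac{h_n^2}{4n^2} \Bigr) \frac{Q_1(x)}{\sqrt{n}} - \Bigl( 1 + \frac{2h_n^2}{3n^2} \Bigr) \frac{Q_2(x)}{n} \Bigr| = O\Bigl( \frac{h_n^{(4+2\delta)/3}}{n^{7(2+\delta)/6}} \Bigr).$$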
Corollary 5.
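Let the conditions of Lemma 12 hold and $h_n = n^{3/4}$. Then
$$\sup_{x \in \mathbb{R}} \Bigl| \mathsf{P}\bigl(\sqrt{N_n}\, T_{N_n} < x\bigr) - \Phi(x) - \frac{1}{\sqrt{n}}\, Q_1(x) - \frac{1}{n} \Bigl( Q_2(x) + \frac{1}{4}\, Q_1(x) \Bigr) \Bigr| = o(n^{-1}).$$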
Relations (12), Lemmas 10 and 11 yield the following theorem.
Theorem 6.
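Let the conditions of Corollary 5 hold. Then the asymptotic $\alpha$-reserves $c_\alpha(n)$ and $\tilde{c}_\alpha(n)$ related to the normalized average losses $\sqrt{n}\, T_n$ and $\sqrt{N_n}\, T_{N_n}$ have the form
$$c_\alpha(n) = u_\alpha + \frac{\mathsf{E} X_1^3}{6\sqrt{n}}\,(u_\alpha^2 - 1) + \frac{1}{12 n} \Bigl[ \frac{(\mathsf{E} X_1^3)^2}{3}\,(5 u_\alpha - 2 u_\alpha^3) + \frac{\mathsf{E} X_1^4 - 3}{2}\,(u_\alpha^3 - 3 u_\alpha) \Bigr] + o(n^{-1}),$$
$$\tilde{c}_\alpha(n) = u_\alpha + \frac{\mathsf{E} X_1^3}{6\sqrt{n}}\,(u_\alpha^2 - 1) + \frac{1}{12 n} \Bigl[ \frac{(\mathsf{E} X_1^3)^2}{3}\,(5 u_\alpha - 2 u_\alpha^3) + \frac{\mathsf{E} X_1^4 - 3}{2}\,(u_\alpha^3 - 3 u_\alpha) + \frac{1}{2}\, \mathsf{E} X_1^3\,(u_\alpha^2 - 1) \Bigr] + o(n^{-1}),$$
where $u_\alpha$ satisfies the equation $\Phi(u_\alpha) = 1 - \alpha$. The corresponding expected additional number $d$ of contracts has the form
$$d = -\frac{Q_1(u_\alpha)}{2 \varphi(u_\alpha)\, u_\alpha} + o(1) = -\frac{(1 - u_\alpha^2)\, \mathsf{E} X_1^3}{12\, u_\alpha} + o(1).$$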
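The value of $d$ can be traced back to Corollary 4 by the following short calculation. It is a sketch that assumes the standard form $Q_1(x) = -\tfrac{1}{6}\, \mathsf{E} X_1^3\,(x^2 - 1)\,\varphi(x)$ of the first Edgeworth term for standardized summands. For $h_n = n^{3/4}$ we have $(h_n / n)^2 = n^{-1/2}$, so a Taylor expansion of the three-point moments gives

$$
\mathsf{E} N_n^{-1/2} = \frac{1}{\sqrt{n}} + \frac{1}{4n} + o(n^{-1}), \qquad
\mathsf{E} N_n^{-1} = \frac{1}{n} + o(n^{-1}),
$$

that is, $a = \tfrac{1}{4}$ and $b = 1$ in the notation of Lemma 9. Substituting these values into the formula of Corollary 4 yields

$$
d = \frac{2 \bigl[ (1 - b)\, Q_2(u_\alpha) - a\, Q_1(u_\alpha) \bigr]}{\varphi(u_\alpha)\, u_\alpha} + o(1)
= -\frac{Q_1(u_\alpha)}{2 \varphi(u_\alpha)\, u_\alpha} + o(1)
= \frac{(u_\alpha^2 - 1)\, \mathsf{E} X_1^3}{12\, u_\alpha} + o(1).
$$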

5. Conclusions

The paper deals with an approach to the comparison of distributions of sums of a finite number of independent random variables by deficiency. The notion of asymptotic deficiency of the distribution of a measurable $\mathbb{R}$-valued function of a random vector with respect to the distribution of the same function of another random vector was introduced. Some formulas for the calculation of asymptotic deficiency were presented in the cases where the function has the form of a normalized sum of independent identically distributed r.v.s. The formulas for the asymptotic deficiency were obtained as the solution of two problems, one of which deals with the description of the distribution of a separate summand minimizing the number of summands and providing a prescribed value of the $(1-\alpha)$-quantile of the normalized sum for a given $\alpha \in (0, 1)$. The second problem deals with minimization of the number of summands and guaranteeing a prescribed probability for a normalized sum of r.v.s to fall into a given interval. These results were extended to the case of a random number of summands in the sum (or random portfolio size, in terms of the example dealing with an insurance company). For this case, an analog of deficiency of the sum of a random number of summands with respect to the distribution of the sum of a non-random number of summands was introduced. The problem of comparison of these distributions by an analog of deficiency was considered in a special case of three-point distribution of portfolio size. The main mathematical tools used in the paper were asymptotic expansions for the distributions of average losses and their quantiles.

Author Contributions

Conceptualization, V.E.B. and V.Y.K.; Formal analysis, V.Y.K.; Funding acquisition, V.Y.K.; Investigation, V.E.B. and V.Y.K.; Writing – original draft, V.E.B. and V.Y.K. All authors have read and agreed to the published version of the manuscript.

Funding

The research was supported by the Ministry of Science and Higher Education of the Russian Federation, project No. 075-15-2020-799.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors thank the anonymous referees for their comments and suggestions that improved the paper. We also thank A.K. Gorshenin for his help in formatting the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Müller, A.; Stoyan, D. Comparison Methods for Stochastic Models and Risks; J. Wiley & Sons: Chichester, UK, 2002; 352p, ISBN 978-0-471-49446-1.
  2. Hodges, J.L.; Lehmann, E.L. Deficiency. Ann. Math. Stat. 1970, 41, 783–801.
  3. Torgersen, E. Comparison of Statistical Experiments; Printed online: May 2013; Cambridge University Press: Cambridge, UK, 1991.
  4. Xiang, X. Deficiency of the sample quantile estimator with respect to kernel quantile estimators for censored data. Ann. Stat. 1995, 23, 836–854.
  5. Bening, V.E.; Korolev, V.Y.; Zeifman, A.I. Calculation of the deficiency of some statistical estimators constructed from samples with random sizes. Colloq. Math. 2019, 157, 157–171.
  6. Blackwell, D.; Girshick, M.A. Theory of Games and Statistical Decisions. Wiley Publications in Statistics; J. Wiley & Sons: New York, NY, USA; Chapman & Hall: London, UK, 1954; p. XI, 355.
  7. Lehmann, E.L.; Casella, G. Theory of Point Estimation; Springer: Berlin, Germany, 1998; 589p.
  8. Bening, V.E. Asymptotic Theory of Testing Statistical Hypotheses: Efficient Statistics, Optimality, Power Loss, and Deficiency; Walter de Gruyter: Berlin, Germany, 2011; 277p, ISBN 978-3-11-093599-8.
  9. Cramér, H. Mathematical Methods of Statistics; Princeton University Press: Princeton, NJ, USA, 1946; 647p.
  10. Petrov, V.V. Limit Theorems of Probability Theory: Sequences of Independent Random Variables; Clarendon Press: Oxford, UK, 1985; 437p.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
