Abstract
In the paper, we consider a new approach to the comparison of the distributions of sums of random variables. Unlike preceding works, for this purpose we use the notion of deficiency that is well known in mathematical statistics. This approach is used, first, to determine the distribution of a separate random variable in the sum that provides the least possible number of summands guaranteeing the prescribed value of the $(1-\alpha)$-quantile of the normalized sum for a given $\alpha\in(0,1)$, and second, to determine the distribution of a separate random variable in the sum that provides the least possible number of summands guaranteeing the prescribed value of the probability for the normalized sum to fall into a given interval. Both problems are solved under the condition that possible distributions of random summands possess coinciding three first moments. In both settings the best distribution delivers the smallest number of summands. Along with distributions of a non-random number of summands, we consider the case of random summation and introduce an analog of deficiency which can be used to compare the distributions of sums with random and non-random number of summands. The main mathematical tools used in the paper are asymptotic expansions for the distributions of $\mathbb{R}$-valued functions of random vectors, in particular, normalized sums of independent identically distributed r.v.s and their quantiles. Along with the general case, main attention is paid to the situation where the summed random variables are independent and identically distributed. The approach under consideration is applied to the determination of the distribution of insurance payments providing the least insurance portfolio size under prescribed Value-at-Risk or non-ruin probability.
1. Introduction
1.1. The Problem under Consideration and the Structure of the Paper
The problem considered in the paper is very close to the problem of stochastic ordering and may even be considered a version of this problem. In probability theory and statistics, a stochastic order quantifies the concept of one random variable being “bigger” or “smaller” than another. Many different orders exist, which have different applications, see, e.g., the book [1]. Here we propose an approach to establishing stochastic order for the distributions of sums of independent random variables (r.v.s) based on the notion of deficiency that is well known in asymptotic statistics, see, e.g., [2] and later publications [3,4,5]. Roughly speaking, in statistics the deficiency of a statistical procedure with respect to an ‘optimal’ procedure is the number of additional observations required to attain the same quality of inference as is guaranteed by the ‘optimal’ procedure.
In this paper we deal with the case where the deficiency is measured in natural-valued discrete units (the number of ‘additional’ summands), that is, with the discrete case. The notion of deficiency can be extended to the case of a continuous parameter, say, time. This case will be considered in another work.
Along with the general case, in the paper main attention is paid to the situation where the r.v.s being summed are assumed to be independent and identically distributed.
The first problem to be considered below consists in the determination of the distribution of a separate random variable in the sum that provides the least possible number of summands guaranteeing the prescribed value of the $(1-\alpha)$-quantile of the normalized sum for a given $\alpha\in(0,1)$. The second problem considered in the paper consists in the determination of the distribution of a separate random variable in the sum that provides the least possible number of summands guaranteeing the prescribed value of the probability for the normalized sum to fall into a given interval. Actually, in both problems we deal with ‘fine tuning’ of the distribution of a separate summand since we assume that different possible distributions of random summands possess coinciding three first moments, so that they can differ only by their kurtosis. In both settings the best distribution delivers the smallest number of summands.
We also consider the problem where some additional randomization is introduced so that the number of summands in the sum can itself be random. This randomization need not be artificially induced: it may also arise when the exact number of summands is a priori unknown and only its ‘expected’ value is available as a parameter of the problem. For this case we introduce an analog of deficiency which can be used to compare the distributions of sums with random and non-random numbers of summands.
Both problems are closely related to the problem of quantifying the accuracy of approximations provided by limit theorems of probability theory. The main mathematical tools used in the paper are asymptotic expansions for the distributions of normalized sums of independent identically distributed r.v.s and their quantiles.
The formal settings mentioned above can be applied to solving practical problems where the models of the observed statistical regularities have the form of distributions of sums of r.v.s and the number of summands plays a substantial role. For example, consider an insurance company whose portfolio consists of a finite number of insurance contracts. Formally, the portfolio is assumed to be a finite set of r.v.s each of which characterizes the income of the company related to a separate contract. Instead of income we can speak of loss assuming that income is a negative loss or that loss is a negative income.
In these terms, the first setting concerns the problem of determination of the distribution of a possible loss within a separate insurance contract (say, the distribution of an insurance payment) providing the least possible portfolio size and guaranteeing the prescribed Value-at-Risk for the average losses. The approach considered in the paper can be used when the distributions of the summands (possible losses) are known only up to their three first moments and the exact Value-at-Risk is not known for sure. In the second setting the latter requirement is replaced by that of guaranteeing the prescribed ‘non-ruin’ probability. Within the framework of this example in both settings the problem consists in the description of the best strategy of the insurance company, if by a strategy we mean the choice of the terms of a contract (e.g., the amount of insurance payment related to each possible insurance event), that is, of the distribution of possible loss within a separate contract. Briefly, the problem is to choose an optimal distribution of a separate loss among the distributions that have the same first three moments so that the portfolio size is least possible.
The paper is organized as follows. Section 1.2 contains a short overview of the properties of statistical deficiency. In Section 2 we outline some results concerning the asymptotic expansions for the distributions of $\mathbb{R}$-valued measurable functions of r.v.s and, in particular, for the distributions of normalized sums of r.v.s, as well as for their quantiles. In Section 3 the problem of comparison of the distributions of two sums of independent r.v.s by their deficiency is considered. The notion of asymptotic deficiency is introduced and some formulas for the calculation of asymptotic deficiency are presented. Section 3.1 contains the solution of this problem for the distributions providing a prescribed value of the $(1-\alpha)$-quantile for a given $\alpha$. In Section 3.2 this problem is considered for the distributions of sums of independent r.v.s guaranteeing a prescribed probability for an $\mathbb{R}$-valued measurable function of r.v.s, in particular, for a normalized sum of r.v.s, to fall into a given interval. Section 4 contains an example of extension of the results of Section 3 to the case of a random number of summands in the sum (random portfolio size, in terms of the example dealing with an insurance company). In Section 4.1 asymptotic expansions for the asymptotic $(1-\alpha)$-quantile (called $\alpha$-reserve here) under a random portfolio size are presented and an analog of deficiency of the sum of a random number of summands (or the strategy with a random portfolio size) with respect to the distribution of the sum of a non-random number of summands (or a strategy with a non-random portfolio size) is considered. In Section 4.2 the problem of comparison of these distributions by an analog of deficiency is considered in a special case of three-point distribution of portfolio size.
Everywhere in what follows the set of real numbers is denoted by $\mathbb{R}$, the set of natural numbers is denoted by $\mathbb{N}$. The distribution function of the standard normal law will be denoted by $\Phi(x)$,
$$\Phi(x)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x}e^{-z^{2}/2}\,dz,\quad x\in\mathbb{R}.$$
The distribution of a random vector will be denoted .
1.2. Asymptotic Deficiency
Following the classical terminology of [6], consider two decision rules (say, two statistical procedures) $\delta_n^*$ and $\delta_n$ whose quality is characterized by the quantities $\pi_n^*$ and $\pi_n$, respectively. Here n is the number of observations delivering the information underlying the decision rules. Assume that the rule $\delta_n^*$ is in some sense optimal whereas the rule $\delta_n$ is competing. For example, in the problem of estimation usually $\pi_n^*$ and $\pi_n$ are mean square deviations and $\pi_n^*\le\pi_n$. In the problem of testing hypotheses usually $\pi_n^*$ and $\pi_n$ are powers of tests so that $\pi_n^*\ge\pi_n$.
By $k_n$ denote the number of observations required for the decision rule $\delta_{k_n}$ based on $k_n$ observations to attain the same quality as the ‘best’ rule $\delta_n^*$ based on n observations. In what follows we will keep to the asymptotic approach assuming that $n\to\infty$. Following [7], by the asymptotic relative efficiency (a.r.e.) of the rule $\delta_n$ with respect to the rule $\delta_n^*$ we will mean the limit
$$e=\lim_{n\to\infty}\frac{n}{k_n}$$
(if it exists and does not depend on the sequence $k_n$).
Instead of the ratio of the required numbers of observations, the difference $k_n-n$ can be considered as well, vividly showing the additional number of observations required by the decision rule $\delta_n$. However, many authors considered the ratio $n/k_n$, possibly because the asymptotic analysis of its properties is simpler.
The systematic analysis of the asymptotic behavior of the difference $k_n-n$ was first carried out by Hodges and Lehmann in 1970 [2]. They suggested calling the difference $k_n-n$ the deficiency of the competing decision rule $\delta_n$ with respect to the rule $\delta_n^*$ and introduced the notation
$$d_n=k_n-n. \tag{1}$$
If the limit $\lim_{n\to\infty}d_n$ exists, then it is called the asymptotic deficiency of the competing decision rule $\delta_n$ with respect to the rule $\delta_n^*$ and is denoted d. The number d is often called the deficiency of $\delta_n$ with respect to $\delta_n^*$. Note that if the a.r.e. $e<1$, then $d=\infty$, so that this case is not so interesting. In [2] it was also noticed that for some decision rules (statistical procedures) there typically appear cases $e=1$ (see, e.g., the book [8]), that is, in these cases the a.r.e. cannot give an answer to the question of which rule is better, whereas the deficiency can clarify the case, because, generally speaking, in this case the asymptotic deficiency can be arbitrary.
So, the deficiency of $\delta_n$ with respect to $\delta_n^*$ shows how many additional observations (that is, how much extra information) are required to attain the desired quality if the decision rule $\delta_n$ is used instead of the ‘optimal’ decision rule $\delta_n^*$. Therefore, the notion of deficiency provides natural grounds for the asymptotic comparison of $\delta_n$ and $\delta_n^*$ in the case $e=1$. The study of the asymptotic behavior of the deficiency requires more sophisticated techniques than are used to find the limit e. As a rule, these techniques employ the construction of asymptotic expansions (a.e.s) for the corresponding functions characterizing the quality of decision rules (see, e.g., the books [7,8,9]).
Since the rules $\delta_n^*$ and $\delta_n$ have the quality characteristics $\pi_n^*$ and $\pi_n$, respectively, then, by the definition of the deficiency $d_n$, for every n we have
$$\pi_n^*=\pi_{k_n}=\pi_{n+d_n}. \tag{2}$$
To solve Equation (2), the integer-valued quantity $k_n$ should be treated as a variable taking arbitrary real values. For this purpose the function $\pi_k$ can be defined for non-integer k by the formula
$$\pi_k=(1-k+[k])\,\pi_{[k]}+(k-[k])\,\pi_{[k]+1},$$
where $[k]$ is the integer part of k (see [2]).
The functions $\pi_n^*$ and $\pi_n$ are usually unknown, so, in practice, their approximations are used. Assume that the a.e.s
$$\pi_n^*=\frac{a}{n^{r}}+\frac{b}{n^{r+s}}+o\big(n^{-r-s}\big) \tag{3}$$
and
$$\pi_n=\frac{a}{n^{r}}+\frac{c}{n^{r+s}}+o\big(n^{-r-s}\big) \tag{4}$$
hold, where a, b and c are some numbers that do not depend on n, and $r>0$ and $s>0$ are constants determining the rate of decrease of these quality criteria in n. The first terms in these expansions coincide, which means that the a.r.e. of the corresponding rules equals one. It can be easily obtained from relations (1)–(4) that
$$d_n=\frac{c-b}{ra}\,n^{1-s}+o\big(n^{1-s}\big) \tag{5}$$
(see [2] or [7]). Thus, the asymptotic deficiency has the form
$$d=\begin{cases}\pm\infty, & 0<s<1,\\[2pt] \dfrac{c-b}{ra}, & s=1,\\[6pt] 0, & s>1.\end{cases} \tag{6}$$
The asymptotic deficiency possesses the following obvious property of transitivity: if there is some third decision rule $\tilde\delta_n$ with the quality characteristic $\tilde\pi_n$ admitting an a.e. of the form (4), then the deficiency $\tilde d^{\,*}$ of the rule $\tilde\delta_n$ with respect to the rule $\delta_n^*$ satisfies the equality
$$\tilde d^{\,*}=\tilde d+d,$$
where $\tilde d$ is the deficiency of the rule $\tilde\delta_n$ with respect to $\delta_n$ and d is the deficiency of $\delta_n$ with respect to $\delta_n^*$.
The case where $s=1$ is most interesting, because in this case the asymptotic deficiency is finite. In the paper [2] some simple examples are given illustrating that this case is quite natural in mathematical statistics (also see the book [8]).
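The definition of deficiency can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it assumes the classical textbook setting of estimating the variance of a normal sample, where the estimator using the known mean has mean square error $2\sigma^4/n$ and the unbiased sample variance (mean unknown) has mean square error $2\sigma^4/(n-1)$. The function names, the linear interpolation between neighbouring sample sizes, and the bisection solver are illustrative assumptions. In this setting the leading-term formula $(c-b)/(ra)$ for the finite case gives $a=2\sigma^4$, $b=0$, $c=2\sigma^4$, $r=1$, so the deficiency should equal one observation.

```python
import math


def mse_known_mean(n, sigma2=1.0):
    """MSE of (1/n) * sum (X_i - mu)^2 for a normal sample (optimal rule)."""
    return 2.0 * sigma2 ** 2 / n


def mse_unknown_mean(n, sigma2=1.0):
    """MSE of the unbiased sample variance for a normal sample (competing rule)."""
    return 2.0 * sigma2 ** 2 / (n - 1)


def interpolate(quality, k):
    """Quality at a real-valued sample size k (assumed linear interpolation)."""
    lo = math.floor(k)
    frac = k - lo
    if frac == 0.0:
        return quality(lo)
    return (1.0 - frac) * quality(lo) + frac * quality(lo + 1)


def deficiency(quality_opt, quality_comp, n, tol=1e-10):
    """Solve quality_comp(k) = quality_opt(n) for real k and return k - n.

    Assumes both quality functions decrease in the sample size and that the
    competing rule is no better than the optimal one, so that k >= n.
    """
    target = quality_opt(n)
    hi = n
    while quality_comp(hi) > target:  # smallest integer size that is good enough
        hi += 1
    if hi == n:
        return 0.0
    lo = hi - 1
    while hi - lo > tol:  # bisection on the interpolated quality
        mid = 0.5 * (lo + hi)
        if interpolate(quality_comp, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) - n


def leading_term_deficiency(a, b, c, r):
    """Finite asymptotic deficiency (c - b) / (r a) for expansions with s = 1."""
    return (c - b) / (r * a)


d_numeric = deficiency(mse_known_mean, mse_unknown_mean, n=10)
d_formula = leading_term_deficiency(a=2.0, b=0.0, c=2.0, r=1.0)
```

Here the numerical solution and the leading-term formula agree: the competing estimator needs one extra observation for every n, since $2\sigma^4/(k-1)=2\sigma^4/n$ gives $k=n+1$.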
2. Asymptotic Expansions for the Distributions of Normalized Sums of Random Variables
We begin with the most general case. Let . Consider a finite set of r.v.s . For the time being we do not assume that the r.v.s are independent and identically distributed. Let be an $\mathbb{R}$-valued measurable function of . (In what follows, when dealing with the example of the portfolio of an insurance company, we will call this function the generalized loss.) In particular, may be of the form where is the arithmetic mean,
As it has been already said, the problem consists in description of the distribution of r.v.s providing the least possible number of summands n and guaranteeing the prescribed value of the -quantile of the function for a given .
Let be a small number. Consider the quantity defined by the asymptotic relation
The quantity is the asymptotic -quantile of . If , then can be interpreted as the threshold, the exceedance of which by is undesirable and is assumed to have the prescribed small probability . In terms of an insurance company, is the asymptotic Value-at-Risk.
By applying the Taylor formula it is not difficult to obtain the following result.
Lemma 1.
Assume that there exist distribution function and functions and such that
where the functions , and are smooth enough. Then the asymptotic -quantile of admits the a.e.
where satisfies the equation .
Consider the application of this lemma to the case where are independent identically distributed r.v.s such that
and the function has the form with defined by (7). Here the condition means that the separate losses are centered by their expectations. Assume that the characteristic function of the r.v. satisfies the Cramér condition (C)
Under conditions (9) and (10), from Theorem 6.3.2 of [10] (also see [9]) it follows that there exist functions and a such that
For the definition of the functions see the book [10]. In particular,
Relations (11) and (12) and Lemma 1 directly imply the a.e. for the asymptotic -quantile of presented in the following lemma.
Lemma 2.
Let conditions (9) and (10) hold with , . Then the asymptotic $(1-\alpha)$-quantile of admits the a.e.
where is the -quantile of the standard normal distribution: .
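A quantile expansion of this type can be checked numerically. The sketch below is an illustration under stated assumptions rather than the paper's own formula: it uses the standard second-order Cornish-Fisher expansion for the quantile of a normalized sum of i.i.d. summands with unit variance, skewness `skew` and excess kurtosis `ex_kurt`, and it takes centered Exp(1) summands (skewness 2, excess kurtosis 6) because their sum has an explicit Erlang distribution function. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def cornish_fisher_quantile(p, n, skew, ex_kurt):
    """Second-order Cornish-Fisher approximation to the p-quantile of
    (S_n - n*mean) / (sd * sqrt(n)) for i.i.d. summands."""
    u = NormalDist().inv_cdf(p)
    first = skew * (u ** 2 - 1.0) / 6.0
    second = (ex_kurt * (u ** 3 - 3.0 * u) / 24.0
              - skew ** 2 * (2.0 * u ** 3 - 5.0 * u) / 36.0)
    return u + first / math.sqrt(n) + second / n


def erlang_cdf(x, n):
    """Distribution function of the sum of n independent Exp(1) r.v.s."""
    if x <= 0.0:
        return 0.0
    term = total = 1.0
    for k in range(1, n):
        term *= x / k
        total += term
    return 1.0 - math.exp(-x) * total


def normalized_sum_cdf(x, n):
    """P((S_n - n) / sqrt(n) <= x) for Exp(1) summands (mean 1, variance 1)."""
    return erlang_cdf(n + math.sqrt(n) * x, n)


n, p = 50, 0.95
u = NormalDist().inv_cdf(p)                      # plain normal quantile
x_cf = cornish_fisher_quantile(p, n, skew=2.0, ex_kurt=6.0)

# How far the exact probability at each approximate quantile is from p.
err_normal = abs(normalized_sum_cdf(u, n) - p)
err_cf = abs(normalized_sum_cdf(x_cf, n) - p)
```

In this example the corrected quantile reproduces the prescribed level much more closely than the plain normal quantile, which is the effect the terms of order $n^{-1/2}$ and $n^{-1}$ are meant to capture.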
3. The Comparison of the Distributions of Two Normalized Sums of Random Variables
3.1. The Asymptotic Deficiency of the Distributions of Summands Providing a Given $(1-\alpha)$-Quantile of the Normalized Sums
In this section we will present an approach to the comparison of the distributions of two sums of r.v.s in terms of the number of summands. The distribution of the random vector will be denoted . Consider an -valued measurable function of .
From Lemma 1 we can easily obtain the following result.
Lemma 3.
Consider a sequence such that as . Under the conditions of Lemma 1 we have
Along with the r.v.s resulting in the value of the function , consider another set of r.v.s , according to which the value of the function is . For example, may have the form with defined by (7) and may have the form where
Let to the distribution there correspond the asymptotic -quantile of :
Assume that the a.e. for the distribution function of has the form
where the functions , and are smooth enough. The a.e. (15) differs from the a.e. for the distribution function of established by Lemma 1 only by the term of order , which means that the two distributions are rather close. Define the sequence of natural numbers by the equality
If , , , then d is the asymptotic deficiency of the distribution with respect to the distribution . In other words, d is the asymptotic number of ‘additional’ r.v.s to be included in the set in order that the distribution provides the same quality as the distribution .
Theorem 1.
Assume that the conditions of Lemma 1 and (15) hold and . Then the asymptotic deficiency d of the distribution with respect to the distribution has the form
Proof.
From Lemma 1 and condition (15) it directly follows that
and therefore
Further, with the account of the definitions of (see (16)) and we have
Applying Lemma 3 to the right-hand side of (19) we obtain
Now from (16) and (18) it follows that
The theorem is proved. □
Now consider an example of the application of Theorem 1 to the optimization of the portfolio size of an insurance company. Let the possible losses related with each insurance contract in the portfolio be independent identically distributed r.v.s satisfying conditions (9) and (10). Consider another distribution, under which the possible losses are assumed to be independent identically distributed r.v.s such that
Assume that the characteristic function of the r.v. satisfies the Cramér condition
For each n consider the average losses defined by (13). Assume that
(for example, the r.v.s and are centered by their expectations and the distributions of these centered r.v.s are symmetric). From Lemma 2 and Theorem 1 we directly obtain the following statement.
Lemma 4.
Let conditions (9), (10) and (20)–(22) hold. Then the asymptotic (as $n\to\infty$) deficiency d of the distribution with respect to the distribution (the ‘additional number of contracts’) has the form
Lemma 4 illustrates that if the distributions are close, then the deficiency is determined by the kurtosis.
3.2. The Asymptotic Deficiency of the Distributions of Summands Providing a Given Probability for the Normalized Sum to Fall into a Given Interval
To begin with, in this section we again consider the values of a measurable $\mathbb{R}$-valued function and on random vectors and with the distributions and , respectively. The goal is to ensure that the value of falls into the interval for some given numbers . As a quality characteristic consider the probabilities
If (see (7)) and (see (22)), that is, normalized sums of r.v.s are considered, then relation (23) means that and are probabilities of that the normalized sums of r.v.s are inside the interval .
From the definition of we directly obtain the following result.
Lemma 5.
Assume that for some and there exist a distribution function and functions , and such that
and, moreover, the functions , and are measurable. Then and admit a.e.s
Corollary 1.
Let as and . Assume that the functions , , and are smooth enough and . Then
Lemma 5, Corollary 1 and formula (6) directly imply the expression for the asymptotic deficiency with quality characteristics (23).
Theorem 2.
Let conditions of Lemma 5 hold with . Then the deficiency of the distribution with the quality characteristic with respect to the distribution with the quality characteristic has the form
If with as and , then the formal passage to the limit in (3.13) yields the formula
Consider an example of the application of Theorem 2 to the optimization of the portfolio size of an insurance company. Let the possible losses related with each insurance contract in the portfolio be independent identically distributed r.v.s satisfying conditions (9) and (10). Consider another distribution, under which the possible losses are assumed to be independent identically distributed r.v.s satisfying conditions (20) and (21). Assume that in (9) and (20) . We are interested in the asymptotic behavior of the average losses (see (7)) and (see (13)). With the account of Lemma 5 we obtain the following statement.
Lemma 6.
Let conditions (9), (10), (20) and (21) hold with . Then
uniformly in ,
where the functions and are defined in ,
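The interval probability for a normalized sum can be approximated in the same spirit. The following sketch is an assumption-laden illustration, not the statement of Lemma 6: it uses the standard two-term Edgeworth expansion of the distribution function of a normalized i.i.d. sum with skewness `skew` and excess kurtosis `ex_kurt`, and compares the resulting probability of falling into an interval with the exact value for centered Exp(1) summands. The interval endpoints and all names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def edgeworth_cdf(x, n, skew, ex_kurt):
    """Two-term Edgeworth approximation to P(Z_n <= x), where Z_n is the
    normalized sum of n i.i.d. summands with unit variance."""
    h2 = x ** 2 - 1.0                        # Hermite polynomial He_2
    h3 = x ** 3 - 3.0 * x                    # He_3
    h5 = x ** 5 - 10.0 * x ** 3 + 15.0 * x   # He_5
    correction = (skew * h2 / (6.0 * math.sqrt(n))
                  + (ex_kurt * h3 / 24.0 + skew ** 2 * h5 / 72.0) / n)
    return STD_NORMAL.cdf(x) - STD_NORMAL.pdf(x) * correction


def erlang_cdf(x, n):
    """Distribution function of the sum of n independent Exp(1) r.v.s."""
    if x <= 0.0:
        return 0.0
    term = total = 1.0
    for k in range(1, n):
        term *= x / k
        total += term
    return 1.0 - math.exp(-x) * total


def exact_cdf(x, n):
    """P((S_n - n) / sqrt(n) <= x) for Exp(1) summands."""
    return erlang_cdf(n + math.sqrt(n) * x, n)


n, a, b = 50, -1.0, 2.0
p_exact = exact_cdf(b, n) - exact_cdf(a, n)
p_normal = STD_NORMAL.cdf(b) - STD_NORMAL.cdf(a)
p_edgeworth = (edgeworth_cdf(b, n, skew=2.0, ex_kurt=6.0)
               - edgeworth_cdf(a, n, skew=2.0, ex_kurt=6.0))

err_normal = abs(p_normal - p_exact)
err_edgeworth = abs(p_edgeworth - p_exact)
```

With these values the Edgeworth-corrected interval probability is noticeably closer to the exact one than the normal approximation, so differences between two summand distributions that show up only in the correction terms can be resolved at this level of accuracy.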
Corollary 2.
Let as and . Assume that conditions of Lemma 6 hold. Then
Theorem 2, Lemma 5 and formula (5) directly imply the following statement.
Theorem 3.
Let, in addition to the conditions of Lemma 5, . Then the deficiency of the distribution with the quality characteristic with respect to the strategy with the quality characteristic (the ‘additional number of contracts’) has the form
Consider an example where the asymptotic deficiency is finite.
Corollary 3.
Let and , . Then under the conditions of Lemma 5 we have
Moreover, the deficiency has the form
4. Random Number of Summands
4.1. Asymptotic Expansions for the Asymptotic $(1-\alpha)$-Quantile of $\mathbb{R}$-Valued Measurable Functions of a Random Number of Random Variables
In this section we consider the case where an additional randomization can be introduced into the problem. In this case the number of summands in the sum can be considered as random. This randomization need not be artificially induced: it may also arise when the exact portfolio size is unknown beforehand and only some ‘expected’ number of summands is available as a parameter of the problem.
Let natural-valued r.v.s and r.v.s be defined on one and the same probability space . In what follows we will assume that n is the expected value of ,
Assume that for each the r.v. is independent of the sequence . As above, for each , consider the value of an -valued measurable function . For each consider the r.v. defined as
Below we will assume that the following condition holds.
Condition A.There exist , , , , , a differentiable distribution function and measurable functions , such that
as and
Lemma 7.
Let the function satisfy Condition A. Then
The elementary proof of this lemma directly follows by the formula of total probability.
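The total probability argument can be sketched in code. The example below is a minimal illustration under assumptions that are not taken from the paper: the random size is independent of the summands, the statistic for a fixed size k is the sum of k centered Exp(1) r.v.s normalized by $\sqrt{k}$, and the size takes three values symmetric around n with hypothetical probabilities. The distribution function of the randomly indexed statistic is then the mixture of the fixed-size distribution functions.

```python
import math


def erlang_cdf(x, k):
    """Distribution function of the sum of k independent Exp(1) r.v.s."""
    if x <= 0.0:
        return 0.0
    term = total = 1.0
    for j in range(1, k):
        term *= x / j
        total += term
    return 1.0 - math.exp(-x) * total


def normalized_sum_cdf(x, k):
    """P((S_k - k) / sqrt(k) <= x) for Exp(1) summands (assumed normalization)."""
    return erlang_cdf(k + math.sqrt(k) * x, k)


def random_index_cdf(x, size_distribution, cdf_given_size):
    """Formula of total probability: sum over k of P(N = k) * P(T_k <= x)."""
    return sum(prob * cdf_given_size(x, k)
               for k, prob in size_distribution.items())


# Hypothetical three-point distribution of the size, symmetric around n.
n, h, p = 100, 10, 0.25
three_point = {n - h: p, n: 1.0 - 2.0 * p, n + h: p}

x = 1.5
components = [normalized_sum_cdf(x, k) for k in three_point]
mixed = random_index_cdf(x, three_point, normalized_sum_cdf)
fixed = normalized_sum_cdf(x, n)
expected_size = sum(k * prob for k, prob in three_point.items())
```

The mixture value `mixed` can be compared with `fixed`, the value for the non-random size n; the gap between them is what the expansions of this section describe asymptotically.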
Consider an example of application of Lemma 7. Let be independent identically distributed r.v.s satisfying conditions (9) and (10). Assume that the function is the normalized arithmetic mean (or, which is the same, the normalized sum) with defined in (7). Then, in accordance with what has been said in Section 2, relation (11) holds implying the validity of Condition A. From (11) playing the role of Condition A and Lemma 7 we obtain the following statement.
Lemma 8.
Assume that with defined in (7) and conditions (9) and (10) hold. Then
where the functions are defined in Theorem 6.3.2 of [10].
Relation (11) and Lemma 8 imply the following statement.
Lemma 9.
Let conditions (9) and (10) hold with and . Assume that condition (25) holds and
Then
and
We will use Lemma 9 in order to determine the asymptotic -quantile of and calculate the asymptotic deficiency.
Recall that, for , the asymptotic -quantile of is the quantity satisfying the asymptotic equality
Correspondingly, we define the asymptotic $(1-\alpha)$-quantile of by the equation
From Lemmas 1 and 9 we directly obtain the a.e.s for these asymptotic -quantiles.
Lemma 10.
Under the conditions of Lemma 8, we have
where satisfies the equation .
Now define the sequence of natural numbers by the relation
If
, then d can have the meaning of the expected additional number of summands to be included in the sum in order that the function exceeds for the loss under a non-random number n of summands. The quantity d will be called the asymptotic deficiency.
In the same way that Theorem 1 was proved, we can establish the following statement.
Theorem 4.
Assume that
and there exist , a differentiable distribution function and measurable functions and such that
and . Then the expected number d of additional summands (see (28) and (29)) in the normalized random sum with respect to the normalized sum has the form
where satisfies the equation .
Theorem 4 implies the following statement.
Corollary 4.
Under the conditions of Lemma 8 the expected additional number of summands d (see (28) and (29)) corresponding to the normalized sum with a random number of summands with respect to the normalized sum has the form
If additionally , then
4.2. An Example of Three-Point Distribution of the Number of Summands
In this section, keeping to the terminology of the example related to optimization of the portfolio size of an insurance company, we will use Corollary 4 to obtain a.e.s for the asymptotic Value-at-Risk (asymptotic -quantile of the normalized average loss, or asymptotic normalized -reserve) in the case where the portfolio size has a special distribution concentrated in three points so that is symmetric around the central point.
Assume that the portfolio size has the distribution of the form
where , , , and
Lemma 11.
Let the random portfolio size have distribution (30) and let condition (31) hold. Then and, as ,
Proof.
The desired statements follow from the relations
The formula for is established in a similar way. □
Lemmas 10 and 11 imply the following statement.
Theorem 5.
Assume that the normalized average loss has the form with defined in (7). Let the r.v. be distributed according to (30) and condition (31) hold. Under the conditions of Lemma 9, for the asymptotic α-reserve corresponding to the normalized average loss there holds the relation
Remark 1.
In addition to the conditions of Theorem 5, let
Then, as ,
Applying Lemma 9, by simple calculations we obtain the following statement.
Lemma 12.
Assume that conditions (9) and (10) hold with and . Let conditions (30) and (31) hold. Then
Corollary 5.
Let conditions of Lemma 12 hold and . Then
Relations (12), Lemmas 10 and 11 yield the following theorem.
Theorem 6.
Let the conditions of Corollary 5 hold. Then the asymptotic α-reserves and related to the normalized average losses and have the form
where satisfies the equation . The corresponding expected additional number d of contracts has the form
5. Conclusions
The paper deals with an approach to the comparison of distributions of sums of a finite number of independent random variables by deficiency. The notion of asymptotic deficiency of the distribution of a measurable $\mathbb{R}$-valued function of a random vector with respect to the distribution of the same function of another random vector was introduced. Some formulas for the calculation of asymptotic deficiency were presented in the cases where the function has the form of a normalized sum of independent identically distributed r.v.s. The formulas for the asymptotic deficiency were obtained as the solution of two problems. The first deals with the description of the distribution of a separate summand minimizing the number of summands and providing a prescribed value of the $(1-\alpha)$-quantile of the normalized sum for a given $\alpha$. The second deals with minimization of the number of summands while guaranteeing a prescribed probability for a normalized sum of r.v.s to fall into a given interval. These results were extended to the case of a random number of summands in the sum (or random portfolio size, in terms of the example dealing with an insurance company). For this case, an analog of deficiency of the sum of a random number of summands with respect to the distribution of the sum of a non-random number of summands was introduced. The problem of comparison of these distributions by an analog of deficiency was considered in a special case of three-point distribution of portfolio size. The main mathematical tools used in the paper were asymptotic expansions for the distributions of average losses and their quantiles.
Author Contributions
Conceptualization, V.E.B. and V.Y.K.; Formal analysis, V.Y.K.; Funding acquisition, V.Y.K.; Investigation, V.E.B. and V.Y.K.; Writing – original draft, V.E.B. and V.Y.K. All authors have read and agreed to the published version of the manuscript.
Funding
The research was supported by the Ministry of Science and Higher Education of the Russian Federation, project No. 075-15-2020-799.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable.
Acknowledgments
The authors thank the anonymous referees for their comments and suggestions that improved the paper. We also thank A.K. Gorshenin for his help in formatting the paper.
Conflicts of Interest
The authors declare no conflict of interest.
References
1. Müller, A.; Stoyan, D. Comparison Methods for Stochastic Models and Risks; J. Wiley & Sons: Chichester, UK, 2002; 352p, ISBN 978-0-471-49446-1.
2. Hodges, J.L.; Lehmann, E.L. Deficiency. Ann. Math. Stat. 1970, 41, 783–801.
3. Torgersen, E. Comparison of Statistical Experiments; Printed online: May 2013; Cambridge University Press: Cambridge, UK, 1991.
4. Xiang, X. Deficiency of the sample quantile estimator with respect to kernel quantile estimators for censored data. Ann. Stat. 1995, 23, 836–854.
5. Bening, V.E.; Korolev, V.Y.; Zeifman, A.I. Calculation of the deficiency of some statistical estimators constructed from samples with random sizes. Colloq. Math. 2019, 157, 157–171.
6. Blackwell, D.; Girshick, M.A. Theory of Games and Statistical Decisions. Wiley Publications in Statistics; J. Wiley & Sons: New York, NY, USA; Chapman & Hall: London, UK, 1954; p. XI, 355.
7. Lehmann, E.L.; Casella, G. Theory of Point Estimation; Springer: Berlin, Germany, 1998; 589p.
8. Bening, V.E. Asymptotic Theory of Testing Statistical Hypotheses: Efficient Statistics, Optimality, Power Loss, and Deficiency; Walter de Gruyter: Berlin, Germany, 2011; 277p, ISBN 978-3-11-093599-8.
9. Cramér, H. Mathematical Methods of Statistics; Princeton University Press: Princeton, NJ, USA, 1946; 647p.
10. Petrov, V.V. Limit Theorems of Probability Theory: Sequences of Independent Random Variables; Clarendon Press: Oxford, UK, 1985; 437p.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).