Abstract
In this article we introduce the stability analysis of a compound sum: it consists of computing the standardized variation of the survival function of the sum resulting from an infinitesimal perturbation of the common distribution of the summands. Stability analysis is complementary to the classical sensitivity analysis, which consists of computing the derivative of an important indicator of the model, with respect to a model parameter. We obtain a computational formula for this stability from the saddlepoint approximation. We apply the formula to the compound Poisson insurer loss with gamma individual claim amounts and to the compound geometric loss with Weibull individual claim amounts.
Keywords:
Dirac distribution; gamma-Poisson; Weibull-geometric compound distributions; Gâteaux differential; saddlepoint approximation
MSC:
41A60; 65C05; 60K10
1. Introduction
This article presents a computational formula for the stability of the survival function (s.f.) of the compound sum of independent and identically distributed (i.i.d.) random variables that are independent of their summation index. The compound sum typically represents the insurer total claim amount during a fixed period (e.g., a year): the i.i.d. random variables are the individual claim amounts and the number of claims within the period is a counting random variable or a counting stochastic process, if we let the period length vary. We define the stability of a sum as the standardized variation of the s.f. of the sum resulting from an infinitesimal perturbation at some point of the distribution of the summands.
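More precisely, let $\Delta_x$ denote the Dirac distribution function (d.f.) over $\mathbb{R}$ with mass one at $x$ (thus jumping from level 0 to level 1 at point $x$). If $F$ denotes the d.f. of the summands, then
$$F_\varepsilon = (1 - \varepsilon) F + \varepsilon \Delta_x \qquad (1)$$
is the $\varepsilon$-perturbation of $F$ at $x$, for any choice of $\varepsilon \in [0, 1]$. The derivative of the s.f. of the sum under $F_\varepsilon$ with respect to (w.r.t.) $\varepsilon$, evaluated at $\varepsilon = 0$, is the s.f. stability (s.f.s.) at the perturbation point $x$.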
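The perturbation can be made concrete with a minimal sketch. It mixes a given d.f. with the Dirac d.f. at x using weight eps, and does the same for the m.g.f., using the fact that the m.g.f. of the Dirac distribution at x is exp(vx). The function names, the unit exponential summand and the values x = 3 and eps = 0.01 are assumptions chosen only for illustration.

```python
import math

def perturb_df(F, x, eps):
    """Return the d.f. (1 - eps) * F + eps * Dirac_x, where Dirac_x jumps from 0 to 1 at x."""
    def F_eps(y):
        dirac = 1.0 if y >= x else 0.0
        return (1.0 - eps) * F(y) + eps * dirac
    return F_eps

def perturb_mgf(M, x, eps):
    """Return the m.g.f. of the perturbed d.f.: (1 - eps) * M(v) + eps * exp(v * x)."""
    return lambda v: (1.0 - eps) * M(v) + eps * math.exp(v * x)

# Illustrative choice (an assumption): unit exponential summands.
F = lambda y: 1.0 - math.exp(-y) if y > 0.0 else 0.0
M = lambda v: 1.0 / (1.0 - v)  # valid for v < 1 only

F_eps = perturb_df(F, x=3.0, eps=0.01)
M_eps = perturb_mgf(M, x=3.0, eps=0.01)

# The perturbed d.f. has a jump of size eps at the perturbation point.
jump_at_x = F_eps(3.0) - F_eps(3.0 - 1e-12)
print(jump_at_x, M_eps(0.5))
```

The perturbed d.f. is no longer absolutely continuous, since it carries mass eps at x; only its derivative w.r.t. eps at eps = 0 enters the stability.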
This concept differs from the one of sensitivity of queueing theory or risk theory, which is defined as the derivative of the s.f. of the sum w.r.t. a parameter of F (cf., e.g., Asmussen and Albrecher (2010), sct. IV.9). From an abstract point of view, a parametric model spans only a low-dimensional or narrow subset of the space of probability distributions. Such a narrow subset is indeed beneficial to statistical data reduction, but often does not contain all realistic perturbations of the assumed model. In this sense, the sensitivity is a limited indicator of the model stability. Allowing for perturbations in all possible directions provides a more complete or realistic analysis of the model stability. In this sense, our concept of stability is preferable. This concept is in fact an important idea of robust statistics (e.g., Hampel et al. (1986)). Mathematically, the quantity of interest of a stochastic model is regarded as a functional, and a functional derivative is computed. This approach is used, for example, in renewal theory by Grübel (1989), where the renewal function is a functional of the lifetime distribution, or by Politis (2006) for the probability of ruin of the risk process.
Practically, for a given actuarial aggregate loss model in the form of a compound sum, if a stability of low magnitude results from the perturbation in the form of a new large individual claim amount (viz. a large value of x), then the loss model is reliable under perturbations through extreme large claims. In the context of uncertainty (where for example catastrophic events are not incorporated in the model), this notion of stability appears practically relevant. The s.f.s. informs the risk manager about the variation of the upper tail probability of the aggregate loss when an uncertain large claim amount is considered. Still from the practical point of view, the sensitivity as described above has the alternative role of identifying important model parameters—the most significant ones have large sensitivity value. However, this interpretation holds only when the model is actually the correct one (which is often not simple to establish). Of course, both sensitivity and stability analyses can be carried out simultaneously.
Field and Ronchetti (1985) considered this type of stability for the sample mean and called it a “tail area influence function”. Their applications concerned statistical testing. They computed the tail area influence function with the saddlepoint approximation of Daniels (1954). This article generalizes this approximation to the stability of the compound sum and suggests using this concept in risk management. The new formula is easy and fast to compute. Numerical illustrations for the total claim amount with gamma or Weibull individual claim amounts and Poisson or geometric number of claims are provided.
Most methods for computing sensitivities rely on Monte Carlo simulation (e.g., Asmussen and Rubinstein (1999) and Asmussen and Glynn (2007), sct. VII). One exception is Gatto and Peeters (2015), who propose evaluating the sensitivity of the s.f. of the random sum w.r.t. the parameter of the summation index distribution (which is either Poisson or geometric) with the saddlepoint approximation. Gatto and Peeters (2015) show numerically that the sensitivities obtained by the saddlepoint approximation and by simulation with importance sampling are very close, although importance sampling is computationally intensive. The high accuracy of the saddlepoint approximation is well illustrated in the literature of statistics and applied probability; refer, for example, to Jensen (1995) or to Gatto and Mosimann (2012) in the context of risk theory.
This article proceeds as follows. Section 2 provides the approximations to the s.f.s. based on the saddlepoint approximation. Section 2.1 considers the deterministic sum and Section 2.2 the compound sum, viz. the insurer aggregate claim amount. Section 3 provides numerical illustrations. Section 3.1 considers the aggregate claim amount with Poisson-distributed number of claims and gamma-distributed individual claim amounts. In Section 3.2, the number of claims follows the geometric distribution and the individual claim amounts follow the Weibull distribution. Some related lengthy derivatives are provided in Appendix A.
2. Saddlepoint Approximation to the Stability
This section has two parts: in Section 2.1 an approximation to s.f.s. of the deterministic sum is derived. It corresponds to the formula of Field and Ronchetti (1985), although the derivation is different. Section 2.2 generalizes the formula to the compound sum, which is an essential quantity of risk theory.
2.1. The Sum
Let $X_1, \ldots, X_n$ be independent random variables with d.f. $F$, moment generating function (m.g.f.) $M$ and cumulant generating function (c.g.f.) $K = \log M$. The Legendre–Fenchel transform (or convex conjugate or large deviations index) of the c.g.f. $K$ at point $t \in \mathbb{R}$ is given by
$$K^*(t) = \sup_{v \in D} \{ v t - K(v) \}, \qquad (2)$$
where $D$ is the domain of definition of $K$. The transform is clearly nonnegative, because $K(0) = 0$. One can show that it is convex and that it attains its minimum at $t = E[X_1]$, when the expectation exists. Assume that the supremum (2) is attained at $v_t$ in the interior of $D$. This condition is satisfied without restrictions on $t$ when $F$ is light-tailed, in the sense of having exponentially decaying tails. Under this assumption, $v_t$ solves w.r.t. $v$ the equation
$$K'(v) = t, \qquad (3)$$
and convexity indicates that it is the unique solution. It is called the saddlepoint at $t$ and $K^*(t) = v_t t - K(v_t)$. Define the sample mean by $\bar{X}_n = n^{-1} \sum_{i=1}^{n} X_i$ and the s.f. $P[\bar{X}_n \ge t]$. Chernoff’s large deviations theorem states that, for $t > E[X_1]$,
$$\lim_{n \to \infty} n^{-1} \log P[\bar{X}_n \ge t] = -K^*(t). \qquad (4)$$
Although (4) yields the large deviations approximation $\exp\{-n K^*(t)\}$ to $P[\bar{X}_n \ge t]$, the asymptotics hold only on the logarithmic scale of the probability.
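This article is based on the saddlepoint approximation of Lugannani and Rice (1980), because it is known to provide a very accurate approximation to the s.f. $P[\bar{X}_n \ge t]$. It has bounded relative error on the probability scale, instead of the logarithmic scale. From now on, we assume that $F$ is absolutely continuous. Under this additional assumption, Lugannani and Rice’s approximation to $P[\bar{X}_n \ge t]$ at $t \ne E[X_1]$ is given by
$$1 - \Phi(r) + \phi(r) \left\{ \frac{1}{s} - \frac{1}{r} \right\}, \qquad (5)$$
where
$$r = \mathrm{sgn}(v_t) \left[ 2 n \{ v_t t - K(v_t) \} \right]^{1/2} \quad \text{and} \quad s = v_t \{ n K''(v_t) \}^{1/2}, \qquad (6)$$
and $\phi$ and $\Phi$ are the standard normal density and d.f., respectively. The relative error of approximation (5) is $O(n^{-1})$, as $n \to \infty$. For comparison, (4) re-expressed in terms of the new variable $r$ leads to the quite dissimilar approximation to $P[\bar{X}_n \ge t]$ given by $\exp\{-r^2/2\}$.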
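The difference between the two scales can be checked numerically. The following sketch assumes the standard form of the Lugannani and Rice formula, 1 - Phi(r) + phi(r)(1/s - 1/r), and applies it to the mean of n unit exponential variables, for which the saddlepoint is explicit and the exact s.f. is an Erlang tail. The unit exponential choice, the function names and the values n = 5, t = 2 are assumptions made for illustration.

```python
import math

def norm_sf(z):
    """Standard normal survival function 1 - Phi(z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def norm_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def lr_mean_exponential(t, n):
    """Lugannani-Rice and Chernoff-type approximations to P[mean of n Exp(1) >= t], t != 1."""
    v = 1.0 - 1.0 / t                    # saddlepoint: K'(v) = 1 / (1 - v) = t
    k_star = v * t + math.log(1.0 - v)   # K*(t) = v t - K(v), with K(v) = -log(1 - v)
    r = math.copysign(math.sqrt(2.0 * n * k_star), v)
    s = v * math.sqrt(n) * t             # v * sqrt(n K''(v)), with K''(v) = t^2
    lr = norm_sf(r) + norm_pdf(r) * (1.0 / s - 1.0 / r)
    chernoff = math.exp(-n * k_star)     # equals exp(-r^2 / 2)
    return lr, chernoff

def exact_mean_exponential(t, n):
    """Exact P[sum of n Exp(1) >= n t], from the Erlang survival function."""
    y = n * t
    return math.exp(-y) * sum(y ** k / math.factorial(k) for k in range(n))

n, t = 5, 2.0
lr, chernoff = lr_mean_exponential(t, n)
exact = exact_mean_exponential(t, n)
print(lr, chernoff, exact)
```

In this setting the Lugannani and Rice value stays close to the exact tail probability on the probability scale, whereas exp(-r^2/2) is only an upper bound of the correct exponential order.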
The s.f.s. of at tail value t and perturbation point is given by the Gâteaux differential
where is the probability measure obtained by the replacement of the summand d.f. F by its -perturbation at x, that is defined in (1), for some . The following result gives an approximation to the s.f.s. obtained from (5).
Theorem 1.
Proof.
Let and . The approximate s.f.s. (8) is obtained by differentiating w.r.t. the Lugannani and Rice saddlepoint approximation (5) at and by evaluating it at .
Let , denote and . Then,
(because for any Borel function , ). The perturbed saddlepoint at point is defined by . Thus, from
we obtain
Denote in (6), then
Note that small perturbations do not affect the sign of when tail probabilities are considered. Precisely, if , then , , for some . Thus , . Define , , and . Then, from the multivariate chain rule,
Hence, we obtain
where (see (A1) in the Appendix A). By inserting this result into (13), we obtain (9).
Denote , then (6) leads to
From the multivariate chain rule, we obtain
where (see (A3) in the Appendix A). These two last results yield (10). ☐
The leading term of the approximation to the s.f.s. (8) is equal to Formula (3.1) in Field and Ronchetti (1985), which, however, is derived not from the saddlepoint approximation (5) but from the Laplace approximation to the integral of the saddlepoint approximation to the density of Daniels (1954). To verify this equality, the following correspondences between the two notations can be useful: , , , and . Thus, Theorem 1 provides an alternative derivation of the s.f.s. of Field and Ronchetti (1985) as well as the exact form of the error term. However, numerical studies suggest that it is preferable to use the first-order term alone.
Regarding the sum, let , then is its s.f., its saddlepoint approximation is , is its s.f.s., and the saddlepoint approximation is .
2.2. The Compound Sum
Let the random variables $X_1, X_2, \ldots$ fulfill the assumptions given in Section 2.1 and let $F$ denote their common d.f. Let $N$ be an independent random variable taking values in $\{0, 1, \ldots\}$ with probability function $p_n = P[N = n]$, for $n = 0, 1, \ldots$ Consider the compound sum
$$Z = \sum_{i=0}^{N} X_i,$$
where $X_0 = 0$ by definition. Define the indicator $I\{A\}$ as the function equal to 1 if the statement $A$ is true, or equal to 0 if $A$ is false. The s.f. of $Z$ at $t \in \mathbb{R}$ can be written as
$$P[Z \ge t] = p_0 \, I\{t \le 0\} + \sum_{n=1}^{\infty} p_n \, P\Big[ \sum_{i=1}^{n} X_i \ge t \Big], \qquad (14)$$
which is generally not a computational formula. This section provides the saddlepoint approximation to (14), and then the associated approximation to the s.f.s.
In (14) we see that the distribution of $Z$ is a linear combination of a distribution with mass one at zero and an absolutely continuous distribution. The mass at zero must be eliminated in order to apply the saddlepoint approximation. Denote by $M_N$ and $K_N$ the m.g.f. and the c.g.f. of $N$, respectively, and by $K$ the c.g.f. of $X_1$. Then, the m.g.f. of $Z$ is $M_N(K(v))$, and its c.g.f. is $K_N(K(v))$. Let $Z_0$ be a random variable with the conditional distribution of $Z$ given $N > 0$ and let $p_0 = P[N = 0]$. Then, $P[Z_0 \ge t] = P[Z \ge t]/(1 - p_0)$, for $t > 0$, and $K_0(v) = \log[\{M_N(K(v)) - p_0\}/(1 - p_0)]$ are the s.f. and the c.g.f. of $Z_0$, respectively. The Legendre–Fenchel transform of the c.g.f. $K_0$ at $t > 0$ is given by
$$K_0^*(t) = \sup_{v} \{ v t - K_0(v) \}. \qquad (15)$$
We assume that the supremum in (15) is attained at $v_t$. Under this assumption, $v_t$ solves w.r.t. $v$ the equation
$$K_0'(v) = t. \qquad (16)$$
The solution is unique and called saddlepoint at t. We obtain the saddlepoint approximation to at , denoted , by the left side of (5) with and with
More explicit expressions of s and r than those in (17) are obtained with
(see (A6) in the Appendix A), and by
It follows from
that the saddlepoint approximation to is given by
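The construction of this section can be sketched numerically as follows. The c.g.f. of Z given N > 0, written K0 here (a notation assumed for this sketch), is built from the m.g.f. of N and the c.g.f. of the summands by the chain rule, the saddlepoint is found by bisection, and the Lugannani and Rice form with n = 1 is rescaled by 1 - p0. The Poisson count with exponential claims, the parameter values and all function names are assumptions made for illustration.

```python
import math

def norm_sf(z):
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def make_K0(MN, dMN, d2MN, K, dK, d2K, p0):
    """Return K0, K0', K0'' for Z given N > 0, where K0 = log[(M_N(K) - p0) / (1 - p0)]."""
    def K0(v):
        return math.log((MN(K(v)) - p0) / (1.0 - p0))
    def dK0(v):
        return dMN(K(v)) * dK(v) / (MN(K(v)) - p0)
    def d2K0(v):
        u = K(v)
        num = d2MN(u) * dK(v) ** 2 + dMN(u) * d2K(v)
        return num / (MN(u) - p0) - dK0(v) ** 2
    return K0, dK0, d2K0

def saddlepoint(dK0, t, v_hi, iters=200):
    """Solve K0'(v) = t by bisection; K0' is increasing because K0 is convex."""
    v_lo = -1.0
    while dK0(v_lo) > t:
        v_lo *= 2.0
    for _ in range(iters):
        mid = 0.5 * (v_lo + v_hi)
        if dK0(mid) < t:
            v_lo = mid
        else:
            v_hi = mid
    return 0.5 * (v_lo + v_hi)

def sf_compound(t, K0, dK0, d2K0, p0, v_hi):
    """Saddlepoint approximation to P[Z >= t], t > 0 and t != E[Z | N > 0]."""
    v = saddlepoint(dK0, t, v_hi)
    r = math.copysign(math.sqrt(2.0 * (v * t - K0(v))), v)
    s = v * math.sqrt(d2K0(v))
    return (1.0 - p0) * (norm_sf(r) + norm_pdf(r) * (1.0 / s - 1.0 / r))

# Illustrative model (an assumption): Poisson(lam) counts, exponential(beta) claims.
# Note: M_N(K(v)) can overflow for v extremely close to beta, i.e. for extreme t.
lam, beta = 2.0, 1.0
MN = lambda u: math.exp(lam * (math.exp(u) - 1.0))
dMN = lambda u: lam * math.exp(u) * MN(u)
d2MN = lambda u: (lam * math.exp(u) + (lam * math.exp(u)) ** 2) * MN(u)
K = lambda v: -math.log(1.0 - v / beta)
dK = lambda v: 1.0 / (beta - v)
d2K = lambda v: 1.0 / (beta - v) ** 2
p0 = math.exp(-lam)

K0, dK0, d2K0 = make_K0(MN, dMN, d2MN, K, dK, d2K, p0)
approx = sf_compound(6.0, K0, dK0, d2K0, p0, v_hi=beta * (1.0 - 1e-6))
print(approx)
```

Bisection is used here only because it needs no starting value and exploits the monotonicity of K0'; any one-dimensional root finder or minimizer could be substituted.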
The s.f.s. of Z is the Gâteaux differential
where is given by (1), , and . The following result gives an approximation to the s.f.s. obtained from the first-order approximation of the s.f.s. of the mean given in Theorem 1.
Theorem 2.
Proof.
This proof is similar to the one of Theorem 1, and so only the main arguments are given. Let and . Let us define the perturbed m.g.f. as in (11), , and for ,
By following the reasoning that led to (12) in the proof of Theorem 1, we obtain the perturbed saddlepoint at point as:
where is given by (A5) in the Appendix A. With (18), it simplifies to
where and are respectively given in (A1) and (A2) of the Appendix A.
By denoting , we find for ,
The multivariate chain rule yields
where is given in (A4) in the Appendix. By joining these two last expressions, one obtains (25) in the theorem. Then, (26) is obtained from (A4) and (A1) in the Appendix A, and from (18).
The s.f.s. of is given by
Thus, it follows from (21) that
This justifies (24) in the theorem. ☐
Remark 1.
Another approximation to the s.f.s. of Z can be obtained by generalizing the remainder term given in Theorem 1. This yields the approximation at given by
where
and with other quantities given in Theorem 2. The derivatives appearing in (29) are given by (28), (19),
and by
, , and in (31) can be found respectively in (A1), (A2), and (A3) in the Appendix A.
The justification follows the proof of Theorem 1. By denoting , we have
From the multivariate chain rule we obtain
3. Numerical Illustrations
This section provides numerical illustrations of the results of Section 2.2 for two important aggregate loss models: the Poisson number of occurrences with gamma individual claim amounts, in Section 3.1, and the geometric number of occurrences with Weibull individual claim amounts, in Section 3.2.
This numerical study was performed with Matlab (R2017b, The MathWorks, Natick, MA, USA), and the function fminsearch was used for computing the saddlepoint. The Matlab programs used for these computations are available at http://www.stat.unibe.ch.
3.1. Poisson-Gamma Total Claim Amount
Assume that the total number of claims of an insurance company that occur during a fixed time horizon, denoted by N, is Poisson-distributed with parameter ; viz. , for . Let . The m.g.f. of N and its derivatives are given by
Assume that the individual claim amounts or losses are gamma-distributed, with density , , for some parameters . Let . The c.g.f. of and its derivatives are given by
The m.g.f. of the aggregate loss is given by
and so the c.g.f. of , viz. Z given , is given by
With these formulae we can obtain the values of s, r, and required in Theorem 2. So, we can compute the s.f.s. given in (24).
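As a rough numerical companion to this subsection, the sketch below computes the saddlepoint approximation to the s.f. of the Poisson-gamma aggregate loss together with a leading-term approximation to its stability. Differentiating the form 1 - Phi(r) + phi(r)(1/s - 1/r) w.r.t. the perturbation size eps gives phi(r){(dK0/d eps)/s + r'/r^2 - s'/s^2}, and only the first term is kept here. This derivation was made for this illustration and is assumed, not verified, to agree term by term with (24). The gamma law is parameterized by shape alpha and rate beta, K0 denotes the c.g.f. of Z given N > 0, and the parameter values are arbitrary; all of these are assumptions.

```python
import math

def poisson_gamma_sf_and_stability(t, x, lam, alpha, beta):
    """Return (saddlepoint s.f. of Z at t, leading-term stability at perturbation point x)."""
    def g_all(v):
        # g(v) = lam * M(v) and its first two derivatives; M is the gamma m.g.f., v < beta.
        b = 1.0 - v / beta
        m0 = b ** (-alpha)
        m1 = alpha / beta * b ** (-alpha - 1.0)
        m2 = alpha * (alpha + 1.0) / beta ** 2 * b ** (-alpha - 2.0)
        return lam * m0, lam * m1, lam * m2

    def dK0(v):
        # K0(v) = log[(exp(g) - 1) / (exp(lam) - 1)], hence K0' = g' / (1 - exp(-g)).
        g, g1, _ = g_all(v)
        return g1 / -math.expm1(-g)

    # Saddlepoint by bisection on the increasing function K0'.
    lo, hi = -1.0, beta * (1.0 - 1e-9)
    while dK0(lo) > t:
        lo *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if dK0(mid) < t:
            lo = mid
        else:
            hi = mid
    v = 0.5 * (lo + hi)

    g, g1, g2 = g_all(v)
    q = -math.expm1(-g)                                  # 1 - exp(-g)
    K0 = g + math.log(q) - math.log(math.expm1(lam))
    d2K0 = g2 / q - g1 ** 2 * math.exp(-g) / q ** 2
    r = math.copysign(math.sqrt(2.0 * (v * t - K0)), v)
    s = v * math.sqrt(d2K0)
    phi = math.exp(-0.5 * r * r) / math.sqrt(2.0 * math.pi)
    one_minus_p0 = -math.expm1(-lam)
    sf = one_minus_p0 * (0.5 * math.erfc(r / math.sqrt(2.0)) + phi * (1.0 / s - 1.0 / r))

    # Derivative of the perturbed K0 w.r.t. eps at eps = 0, evaluated at the saddlepoint:
    # lam * (exp(v x) - M(v)) / (1 - exp(-g)).
    dK0_deps = (lam * math.exp(v * x) - g) / q
    return sf, one_minus_p0 * phi / s * dK0_deps

# Illustrative values (assumptions, not the settings of Figure 1).
lam, alpha, beta, t = 2.0, 1.0, 1.0, 6.0
results = {x: poisson_gamma_sf_and_stability(t, x, lam, alpha, beta) for x in (0.1, 1.0, 5.0)}
for x, (sf, stab) in results.items():
    print(x, sf, stab)
```

For a small eps, the perturbed tail probability is then approximately the unperturbed value plus eps times the stability. With these illustrative values the stability increases with the perturbation point x and is negative for small x, where the perturbation replaces typical claims by smaller ones.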
For the numerical illustration, we fixed , , and . The results are shown in Figure 1. The dashed curve shows the saddlepoint approximation to the s.f. (see (22)) for all relevant values of t. The four solid curves of Figure 1 show the approximation to the stability , for the perturbation points , and for relevant values of t. The highest curves correspond to the largest values of x. This is what we would have expected. A large perturbation point x yields a large increase of the upper tail probability, and thus a large value of the stability. A vanishing perturbation point x yields either a small increase or a decrease of the upper tail probability, and thus a small value of the stability. We should note that the numerical computation of these curves is very fast. Thus, the proposed approximation to the s.f.s. inherits the well-known computational efficiency of the saddlepoint approximation. Any purely numerical technique (e.g., Monte Carlo simulation) would be computationally intensive and thus slower.
Figure 1.
Poisson with compound sum of independent gamma with and random variables. Dashed curve: s.f. Continuous curve, from lowest to highest curve: approximate stabilities for perturbation points , respectively.
For a practical illustration, consider the following values from the setting of Figure 1: and . If the insurer believes that additional claim amounts of with small frequency ‰ have to be considered, then the tail probability of the non-perturbed model would rise by , because .
3.2. Geometric-Weibull Total Claim Amount
The suggested approximation was tested with a different aggregate loss model. Assume that the total number of claims N follows the geometric distribution with parameter , precisely , for . The m.g.f. of N and its derivatives at are given by
Assume the individual losses follow the Weibull distribution with density , , for some . We can easily compute its moments , for . The m.g.f. of the Weibull distribution exists for all v over a neighborhood of zero iff . Thus, the Weibull distribution is light-tailed in this sense iff . Therefore, the power series representation holds for any v within a neighborhood of zero. Moreover, for v in this neighborhood,
with . With this, the m.g.f. of the aggregate loss is given by
and the c.g.f. of can be written as
These formulae allow us to compute s, r, and of Theorem 2, and thus we can compute the s.f.s. given in (24).
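The power series representation of the Weibull m.g.f. is straightforward to evaluate. The sketch below assumes a Weibull law with shape tau and unit scale, so that the k-th moment is Gamma(1 + k/tau); this parameterization, the function name and the stopping rule are assumptions made for illustration. Terms are accumulated on the logarithmic scale to avoid overflow of the gamma function and of the factorial.

```python
import math

def weibull_mgf(v, tau, tol=1e-15, kmax=5000):
    """m.g.f. of the Weibull(shape tau, unit scale) law via sum_k mu_k v^k / k!.

    The caller must keep v inside the region where the series converges
    (all v when tau > 1, |v| < 1 when tau = 1).
    """
    total = 1.0                      # k = 0 term
    if v == 0.0:
        return total
    for k in range(1, kmax):
        # log of mu_k |v|^k / k!, with mu_k = Gamma(1 + k / tau)
        log_abs = math.lgamma(1.0 + k / tau) - math.lgamma(k + 1.0) + k * math.log(abs(v))
        sign = -1.0 if (v < 0.0 and k % 2 == 1) else 1.0
        term = sign * math.exp(log_abs)
        total += term
        if abs(term) <= tol * abs(total):
            break
    return total

# tau = 1 is the unit exponential law, whose m.g.f. is 1 / (1 - v).
print(weibull_mgf(0.5, 1.0))
# tau = 2 has the closed form 1 + v * sqrt(pi) / 2 * exp(v^2 / 4) * erfc(-v / 2).
print(weibull_mgf(1.0, 2.0))
```

The c.g.f. and its derivatives can be obtained in the same way, by differentiating the series term by term inside its region of convergence.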
For the numerical example, we considered and . Figure 2 shows the numerical results. The dashed curve indicates the saddlepoint approximation to the s.f., cf. (22), for all relevant values of t. The four solid curves of Figure 2 show the approximation to the s.f.s. , for the perturbation points , and for relevant values of t. The highest curves correspond to the largest values of x. The numerical evaluation of the above series representations of m.g.f. and c.g.f. does not give any particular problem: after only a few summands, numerical convergence is obtained. We note that the numerical results are similar to the ones of the Poisson-gamma aggregate loss of Section 3.1. Additionally, as with the Poisson-gamma model, the approximate s.f.s. can be computed very quickly. Thus, it can be conveniently applied to practical problems and it provides an additional indicator of the reliability of the model under uncertainty.
Figure 2.
Geometric with compound sum of independent Weibull with random variables. Dashed curve: s.f. Continuous curve, from lowest to highest curve: approximate stabilities for perturbation points , respectively.
Funding
This research received no external funding.
Acknowledgments
The author thanks two anonymous referees for their comments and suggestions.
Conflicts of Interest
The author declares no conflict of interest.
Appendix A
This appendix provides various elementary but long derivatives appearing in the previous developments.
Appendix A.1. Derivatives of the Cumulant Generating Function of the Perturbed Summand
Recall that M and K denote the m.g.f. and the c.g.f. of . This section gives some derivatives of under the -perturbation, viz. of , w.r.t. v and . The results are the following:
and
Appendix A.2. Derivatives of the Cumulant Generating Function of the Perturbed Compound Sum
Recall that , K, and denote the m.g.f. of N and the c.g.f. of and of . This section gives some derivatives of under -perturbation of the distribution of , viz. of , w.r.t. v and . The following results are expressed in terms of the derivatives of Appendix A.1:
and
References
- Asmussen, Søren, and Hansjörg Albrecher. 2010. Ruin Probabilities, 2nd ed. Singapore: World Scientific.
- Asmussen, Søren, and Peter W. Glynn. 2007. Stochastic Simulation. Algorithms and Analysis. New York: Springer.
- Asmussen, Søren, and Reuven Y. Rubinstein. 1999. Sensitivity analysis of insurance risk models via simulation. Management Science 45: 1125–41.
- Daniels, Henry Ellis. 1954. Saddlepoint approximations in statistics. Annals of Mathematical Statistics 25: 631–50.
- Field, Christopher A., and Elvezio Ronchetti. 1985. A tail area influence function and its application to testing. Sequential Analysis 4: 19–41.
- Gatto, Riccardo, and Michael Mosimann. 2012. Four approaches to compute the probability of ruin in the compound Poisson risk process with diffusion. Mathematical and Computer Modelling 55: 1169–85.
- Gatto, Riccardo, and Chantal Peeters. 2015. Saddlepoint approximations to sensitivities of tail probabilities of random sums and comparisons with Monte Carlo estimators. Journal of Statistical Computation and Simulation 85: 641–59.
- Grübel, Rudolf. 1989. Stochastic models as functionals: Some remarks on the renewal case. Journal of Applied Probability 26: 296–303.
- Hampel, Frank Rudolf, Elvezio M. Ronchetti, Peter Rousseeuw, and Werner Alfred Stahel. 1986. Robust Statistics. The Approach Based on Influence Functions. New York: Wiley & Sons.
- Jensen, Jens Ledet. 1995. Saddlepoint Approximations. New York: Oxford University Press.
- Lugannani, Robert, and Stephen Rice. 1980. Saddle point approximation for the distribution of the sum of independent random variables. Advances in Applied Probability 12: 475–90.
- Politis, Konstadinos. 2006. A functional approach for ruin probabilities. Stochastic Models 22: 509–36.
© 2018 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).