Abstract
With the purpose of introducing dependence between different types of claims, multivariate collective models have recently gained a lot of attention. However, when it comes to the evaluation of the corresponding compound distribution, the problems increase with the dimensionality of the model. In this paper, we consider a multivariate collective model that generalizes a model already studied from the point of view of recursive and FFT evaluation of its distribution, and we extend the same study to the general model. With the intention to see which method works better for this general model, we compare the recursive method with the FFT technique, and emphasize the advantages and drawbacks of each one, based on numerical examples.
1. Introduction
Recently, Robe-Voinea and Vernic (2016a, 2016b, 2017, 2018) and Vernic (2018) studied the recursive and Fast Fourier Transform (FFT)-based evaluation of the distribution of the following multivariate collective model:
which may arise in different contexts (see, e.g., the discussion in Section 14.1 of Reference Sundt and Vernic (2009)), from which we mention the case where a policyholder has m types of policies, such as auto, home, business, etc., that can be simultaneously affected by some claim events, such as floods, storms or earthquakes. More precisely, in this case, denotes the aggregate claims affecting solely the policy of type j, while denotes the random variable (r.v.) number of claims simultaneously affecting all m types of policies, with denoting the size of the kth such claim corresponding to the policy of type j. The assumptions under which this model was considered are: Each set of claim sizes are non-negative, independent and identically distributed (i.i.d.) r.v.s, they are also independent of the claim numbers and of the other claim sizes, included; the random vectors are non-negative i.i.d. as the generic random vector and independent of the claim numbers, while the components of , however, are dependent; by convention,
Note that the above model assumes that a claim event affects either a single type of insurance line or all the insurance lines at once; there is no middle way, i.e., an event cannot affect only, say, lines 1 and 2, without causing claims in the other lines.
To overcome this drawback, in this paper we consider the more general multivariate collective model:
where
- The m-variate claim size random vectors are i.i.d. as the generic m-variate random vector whose jth univariate component if meaning that results from those claim events simultaneously affecting solely the lines ; these events are counted by the r.v. . Moreover, the s are also independent of the other claim size random vectors (i.e., of each , where ) and of the claim numbers. We let denote the jth univariate component of the probability function (p.f.) of (in the discrete case) and, by convention, .
- The components of the random vector number of claims are dependent r.v.s, in total (maximum)
We adopt the actuarial terminology in which the distribution of is called “compound” and the distribution of is called “counting”.
To evaluate the distribution of this model, we shall consider that all the claim distributions are of the discrete type (e.g., they have been previously discretized; this is a usual assumption for collective models). We start the next section by presenting the exact formula of the p.f. of based on convolutions, which, unfortunately, is impractical. Therefore, we also aim at developing recursions for the evaluation of this distribution, an approach that requires the introduction of supplementary assumptions under which it is possible to obtain recursive formulas; examples of such recursions are given in Section 2.1. Apart from the restrictive assumptions, another important drawback of recursions is that they become very time consuming when the dimensionality m of the model increases (see the numerical examples in Section 2.3). To overcome these drawbacks, in Section 2.2 we propose the use of the Fast Fourier Transform (FFT) technique, which can be applied whenever we know the form of the characteristic function of and which is very efficient when we want to evaluate the distribution’s tail. However, this remarkably fast method is an approximate one, and we must pay special attention to its specific errors; this aspect is illustrated by the numerical examples discussed in Section 2.3.
For simplicity, let us introduce more notation: We denote by the p.f. of , by g and the probability generating function (pgf) and the characteristic function (cf), respectively, of a r.v., which will be indexed with the r.v.’s name. Also, are vectors whose corresponding dimension results from the context, is the zero-vector, while the difference is componentwise. By we denote the sum of the components of the vector and by the n-fold convolution of f. To shorten the formulas, we rewrite the sum as .
2. Evaluation of the Compound Distribution
We start by presenting the exact formula of the p.f. of based on convolutions. This formula is so complex that, in general, it cannot be directly applied to find the distribution of .
Proposition 1.
Proof.
We have
which immediately yields the result. ☐
We shall also need the pgf and the cf of .
Proposition 2.
Proof.
We prove only the pgf formula (the one for the cf follows along the same lines). Considering the independence assumptions of the model, we have
hence the formula (3). ☐
2.1. Recursive Evaluation
Due to the difficulty of directly applying the exact formula from Proposition 1, we present in the following examples of alternative recursive formulas for obtaining the p.f. of under some supplementary assumptions. These assumptions are chosen such that the multivariate compound distribution of can be rewritten as a compound distribution with a univariate counting distribution, for which we can apply the already existing recursions.
2.1.1. Case 1 Assumptions
As in Reference Robe-Voinea and Vernic (2017), we assume that follows the multivariate Poisson distribution with parameters and having the pgf (see, e.g., Johnson et al. (1997))
As a consequence, Proposition 2 easily yields the following pgf and cf
Also, two recursive formulas for evaluating the distribution of are obtained in the following proposition, where we denote by the p.f. of the sum r.v. .
Proposition 3.
Under the assumption that it holds that
and
with starting value
where . In the above formulas, is such that is a permutation of its components.
Proof.
Due to the independence of the random vectors we have that therefore, we can rewrite the pgf (5) as
meaning that in this case, the distribution of Model (2) is also a compound distribution, with a univariate Poisson counting distribution. More precisely, can also be rewritten as
where , while the random vectors are i.i.d. as the m-variate random vector having the mixture p.f.
Regarding model (8), with satisfying Panjer’s recursion (see Panjer (1981)) with parameters , i.e.,
from Reference Sundt (1999) (see, also, formulas (15.4) and, respectively, (15.5) in Sundt and Vernic (2009)) it holds that
Since in our case we have and . Based on this, we insert Equation (9) into Equation (10) and obtain for
We know that if hence, concerning the argument of we can take the components . Therefore, if clearly in the argument of which yields the first stated formula. The second formula results in a similar way by inserting Equation (9) into Equation (11), while the starting value is immediate from and from the above form of . This completes the proof. ☐
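To make the mechanics of Panjer-type recursions concrete, the following is a minimal univariate sketch, not the multivariate formulas of Proposition 3. It assumes a counting p.f. satisfying p(n) = (a + b/n) p(n-1) and a claim size p.f. on the non-negative integers; the function name `panjer`, the claim size p.f. `f` and the value of `lam` are illustrative assumptions.

```python
import math

def panjer(a, b, g0, f, x_max):
    """Univariate Panjer recursion (illustrative sketch).

    a, b  : parameters of the counting p.f., p(n) = (a + b/n) p(n-1)
    g0    : starting value g(0), i.e., the counting pgf evaluated at f[0]
    f     : claim size p.f. on 0, 1, ..., len(f)-1
    x_max : last point at which the compound p.f. is evaluated
    """
    g = [g0] + [0.0] * x_max
    denom = 1.0 - a * f[0]
    for x in range(1, x_max + 1):
        s = 0.0
        # Only claim sizes y <= x (and inside the support of f) contribute.
        for y in range(1, min(x, len(f) - 1) + 1):
            s += (a + b * y / x) * f[y] * g[x - y]
        g[x] = s / denom
    return g

# Hypothetical data: Poisson counting distribution (a = 0, b = lam).
lam = 2.0
f = [0.0, 0.5, 0.3, 0.2]
g = panjer(0.0, lam, math.exp(-lam * (1.0 - f[0])), f, 60)
mean_g = sum(x * gx for x, gx in enumerate(g))
```

Here `mean_g` can be compared with the product of `lam` and the mean claim size as a quick sanity check. The multivariate recursions above follow the same pattern, with the sum running over vectors instead of integers, which is what makes them slow in higher dimensions.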
2.1.2. Case 2 Assumptions
Similarly to Robe-Voinea and Vernic (2016a, 2016b), the supplementary assumptions are now:
- (A1) The p.f. of the total number of claims satisfies Panjer’s recursion for .
- (A2) Given the conditional distribution of the random vector number of claims is assumed to be multinomial with parameters and where such that Therefore, with and
Under these assumptions, the pgf, the cf of and two alternative recursive formulas are presented in the following.
Proposition 4.
Under the assumptions (A1 and A2), the pgf and cf of the general multivariate collective model (2) become, respectively,
Proof.
To obtain the pgf formula, we recall that the pgf of the multinomial distribution is (see, e.g., Johnson et al. (1997)) , so that the pgf of becomes for
Inserting this into Equation (3) easily yields Equation (12). Equation (13) follows in a similar way, which completes the proof. ☐
Proposition 5.
Under the assumptions (A1 and A2) of Model (2), with starting value
the following recursive formula holds for
while for
where and is such that is a permutation of its components.
Proof.
Considering the assumptions (A1 and A2), we rewrite Model (2) as
where while the random vectors are i.i.d. as the m-variate random vector with the p.f.
We use again Equations (10) and (11). By inserting Equation (16) into Equation (10), the stated formula of the constant K is easily obtained and, for
Using reasoning similar to that used in the proof of Proposition 3, we obtain Equation (14). Similarly, Equations (11) and (16) lead to Equation (15). This completes the proof. ☐
Particular case: . Let us now have a look at a recursive formula in the trivariate case, where the general Model (2) is with
For example, Equation (15) becomes
where .
2.1.3. Case 3 Assumptions
Another assumption under which recursive formulas already exist is the univariate mixed Poisson counting distribution. To this purpose, we assume that, given that a positive univariate r.v. takes the value the r.v.s are all i.i.d. Poisson distributed such that Then, the pgf of given becomes, from Equation (3):
where . This is the pgf of a compound distribution with univariate Poisson counting distribution and multivariate claims distribution having p.f. hence, the conditional distribution of , given can be evaluated based on Equations (10) and (11), with and . To find the unconditional distribution of we use the technique described in Chapter 20 of Sundt and Vernic (2009). Therefore, with U denoting the distribution function of , we introduce the auxiliary functions
and note that Multiplying Equations (10) and (11) by and integrating yields the following two recursions for
with starting value . Therefore, the algorithm for evaluating for all is more complex and implies the backward evaluation of all (here backward means by decreasing i, see, e.g., the algorithm in Section 20.4.1 in Reference Sundt and Vernic (2009)). Since this algorithm is very time consuming, we do not dwell on it. However, we note that the recursions can be refined under the assumption that the mixing distribution U is of the continuous type, with the density denoted by u satisfying the condition
This is also called Willmot’s mixing distribution, see Reference Willmot (1993).
Remark 1.
In view of the FFT, we also display the formula of the cf of given
where
Particular case: Simpler recursions are obtained when is gamma distributed, with . In this case, the univariate mixed Poisson distribution becomes a Negative Binomial distribution which satisfies Panjer’s recursion with and . Since
where hence and it follows that we can use Equations (10) and (11) to obtain direct recursions for i.e.,
with starting value Moreover, regarding the cf, we easily obtain
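The claim that a gamma-mixed Poisson counting distribution is Negative Binomial and satisfies Panjer’s recursion can be checked numerically. The sketch below assumes one common parametrization, which may differ from the one used above: the count is Poisson with mean `lam` times the mixing variable, and the mixing variable is gamma distributed with shape `alpha` and rate `beta`. Under this assumption the Panjer parameters are a = q and b = (alpha - 1) q with q = lam / (lam + beta). All numerical values are hypothetical.

```python
import math

def neg_binomial_pf(n, alpha, q):
    """Negative Binomial p.f. Gamma(alpha+n) / (Gamma(alpha) n!) * (1-q)^alpha * q^n."""
    log_p = (math.lgamma(alpha + n) - math.lgamma(alpha) - math.lgamma(n + 1)
             + alpha * math.log(1.0 - q) + n * math.log(q))
    return math.exp(log_p)

# Hypothetical parameters (assumed parametrization, see the lead-in).
alpha, beta, lam = 2.5, 1.5, 3.0
q = lam / (lam + beta)
a, b = q, (alpha - 1.0) * q

# Largest deviation from p(n) = (a + b/n) p(n-1) over n = 1, ..., 49.
max_gap = max(
    abs(neg_binomial_pf(n, alpha, q)
        - (a + b / n) * neg_binomial_pf(n - 1, alpha, q))
    for n in range(1, 50)
)
```

A value of `max_gap` at the level of floating-point rounding indicates that the recursion holds for the chosen parameters.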
2.2. Fast Fourier Transform Evaluation
The recursive method is an exact one, but, as already mentioned in the introduction, it has some important drawbacks: It can be applied only to some particular models and it becomes quite slow as the dimensionality of increases. A much faster and less restrictive way to evaluate the p.f. of is provided by the Fast Fourier Transform method, which is an approximate technique used to strongly reduce the computing time, especially when evaluating the distribution’s tail. As an advantage, this method can be applied to any model as long as its cf (4) (on which it is based) has a closed form, even if there is no recursive formula available. Therefore, the FFT technique has received special consideration in the actuarial literature (see, e.g., References Bühlmann (1984), Embrechts et al. (1993), Jin and Ren (2014) or Robe-Voinea and Vernic (2018)). It consists of an algorithm that computes the discrete Fourier transform of a multivariate function, as well as its inverse, extremely fast. Let denote an m-variate function defined on the integer support ; then its discrete Fourier transform, , and, respectively, the inverse mapping, can be defined by (definition consistent with the functions fftn and ifftn in Matlab)
In general, the FFT method requires that the values are powers of two for all j. For the multivariate model (2), this algorithm becomes:
FFT Algorithm for model (2)
Step 1. After setting the truncation point for each claim size random vector at the same , the corresponding truncated claim size distribution is obtained as ; if necessary, the resulting will be filled with zeros (e.g., to constrain the s to be powers of two).
Step 2. Apply the m-dimensional FFT to each , which results in the multidimensional table
Step 3. Use Equation (4) in the general case to obtain the discrete cf .
Step 4. Apply the multidimensional IFFT to to obtain the p.f. of .
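The four steps above can be sketched in a few lines for a bivariate toy case. This is a minimal illustration under stated assumptions, not the implementation used for the numerical examples: the three claim counts (events affecting only line 1, only line 2, or both lines) are taken to be independent Poisson, so that the compound cf is the exponential of a weighted sum of the claim size cfs; the geometric claim sizes, the rates and the truncation point `r` are hypothetical. NumPy’s `fftn` and `ifftn` play the role of the Matlab functions of the same name.

```python
import numpy as np

r = 64                               # common truncation point (power of two)
lam1, lam2, lam12 = 1.0, 0.5, 0.8    # hypothetical Poisson rates
x = np.arange(r)

def geom(p):
    """Geometric p.f. on 1, 2, ..., truncated at r - 1 (no mass at 0)."""
    pf = np.zeros(r)
    pf[1:] = p * (1.0 - p) ** (x[1:] - 1)
    return pf

# Step 1: truncated claim size p.f.s stored as r x r tables.
f1 = np.zeros((r, r)); f1[:, 0] = geom(0.5)      # events hitting only line 1
f2 = np.zeros((r, r)); f2[0, :] = geom(0.4)      # events hitting only line 2
# Common events: a mixture that makes the two components dependent.
f12 = 0.5 * np.outer(geom(0.6), geom(0.5)) + 0.5 * np.diag(geom(0.6))

# Step 2: multidimensional FFT of each claim size table.
phi1, phi2, phi12 = (np.fft.fftn(f) for f in (f1, f2, f12))

# Step 3: discrete cf of the compound vector (independent Poisson counts).
phi_S = np.exp(lam1 * (phi1 - 1) + lam2 * (phi2 - 1) + lam12 * (phi12 - 1))

# Step 4: inverse FFT gives the (approximate) compound p.f. on the r x r grid.
f_S = np.real(np.fft.ifftn(phi_S))

mean_S1 = float((f_S.sum(axis=1) * x).sum())     # mean of the first component
```

Because the claim sizes here are light tailed, the mass beyond the truncation point is negligible and the total of `f_S` is numerically one. With heavy-tailed claim sizes this is no longer the case.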
Usually, to find the optimal s, one gradually increases them until the differences between the actual solutions and the previous ones are under a certain threshold (e.g., we increase as 32, 64, 128, 256 etc.). However, when dealing with heavy tailed claim size distributions, the results of this method can be strongly affected by a specific error caused by the discrete Fourier transform, which consists of placing under the truncation point the compound probability mass which is in fact above this point. This so-called “aliasing error” (AE) can be significantly reduced by applying to the claim size distributions an exponential change of measure, hence, forcing the tails of these distributions to decrease at an exponential rate; this transformation is known under the name of “exponential tilting” (for more details on this transformation see, e.g., Reference Grübel and Hermesmeier (1999)).
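The effect of exponential tilting on the aliasing error can be illustrated in the univariate case. The sketch below assumes a compound Poisson model with a type II Pareto claim size discretized by rounding with span 1, and compares the FFT result, with and without tilting, against the exact Panjer recursion. The parameter values and the choice `theta = 20 / r` are hypothetical and chosen for illustration only.

```python
import numpy as np

def pareto2_rounded(alpha, sigma, r):
    """P.f. on 0..r-1 of a type II Pareto discretized by rounding (span 1)."""
    cdf = lambda t: 1.0 - (1.0 + t / sigma) ** (-alpha)
    k = np.arange(r)
    upper = cdf(k + 0.5)
    lower = np.concatenate(([0.0], cdf(k[1:] - 0.5)))
    return upper - lower

def panjer_poisson(lam, f, r):
    """Exact compound Poisson p.f. on 0..r-1 via Panjer's recursion."""
    g = np.zeros(r)
    g[0] = np.exp(-lam * (1.0 - f[0]))
    y = np.arange(1, r)
    for x in range(1, r):
        # sum over y = 1..x of y * f(y) * g(x - y)
        g[x] = lam / x * np.sum(y[:x] * f[1:x + 1] * g[x - 1::-1])
    return g

def fft_poisson(lam, f, theta=0.0):
    """Compound Poisson p.f. by FFT, with optional exponential tilting."""
    k = np.arange(len(f))
    tilted = np.exp(-theta * k) * f          # tilt the claim size p.f.
    phi = np.fft.fft(tilted)
    g_tilted = np.real(np.fft.ifft(np.exp(lam * (phi - 1.0))))
    return np.exp(theta * k) * g_tilted      # undo the tilting

r, lam = 64, 5.0                             # hypothetical values
f = pareto2_rounded(alpha=0.9, sigma=1.0, r=r)
exact = panjer_poisson(lam, f, r)
err_plain = np.max(np.abs(fft_poisson(lam, f) - exact))
err_tilt = np.max(np.abs(fft_poisson(lam, f, theta=20.0 / r) - exact))
```

Tilting works because it commutes with convolution: mass that wraps around from above the truncation point is damped by a factor of roughly exp(-theta * r) after the tilting is undone. Too large a `theta` amplifies rounding errors instead, so the parameter has to be chosen with some care.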
Particular cases: Under the particular assumptions considered in the previous section to allow for a recursive evaluation, one should use the following formulas at Step 3 of the above algorithm:
- When is given by Equation (6);
- Under the Case 2 assumptions (A1 and A2), is given by Equation (13);
- Under the Case 3 mixed Poisson assumption, is given by Equation (18).
2.3. Numerical Illustration
In this section, we consider a particular trivariate model (2) with
for which we implemented both the recursive formulas and the FFT algorithm, under different assumptions.
As claim size distributions, we considered only type II Pareto distributions in order to emphasize the effect of the exponential tilting on the FFT technique. We recall that the decumulative distribution (or survival) function of the m-variate type II Pareto distribution is given by
The expected value of each marginal exists only if , while the variance exists only when We took (mainly from the numerical Example 4 in Reference Robe-Voinea and Vernic (2018))
The expected value of and the variances of do not exist, hence we can see the effect of the exponential tilting in the heavy-tailed case. To discretize these distributions, we used the method of rounding considering the span (good enough for illustration, but not optimal, see the discussion in Reference Robe-Voinea and Vernic (2018)).
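As an illustration of the method of rounding, the following sketch discretizes a bivariate type II Pareto distribution on a square grid. It assumes the standard zero-location form of the survival function, (1 + x1/s1 + x2/s2)^(-alpha), and obtains cell probabilities by inclusion-exclusion. The function names and all parameter values are hypothetical and are not the ones used in the examples.

```python
import numpy as np

def pareto2_survival(x1, x2, alpha, s1, s2):
    """Assumed bivariate type II Pareto survival function (zero location)."""
    return (1.0 + x1 / s1 + x2 / s2) ** (-alpha)

def discretize_rounding(alpha, s1, s2, h, r):
    """Method of rounding with span h on the grid {0, h, ..., (r-1)h}^2.

    Cell (k1, k2) receives the probability of the rectangle
    (k1*h - h/2, k1*h + h/2] x (k2*h - h/2, k2*h + h/2], cut off at 0.
    """
    k = np.arange(r)
    lo = np.maximum(k * h - h / 2.0, 0.0)
    hi = k * h + h / 2.0
    S = lambda u, v: pareto2_survival(u[:, None], v[None, :], alpha, s1, s2)
    # Rectangle probability from the survival function (inclusion-exclusion).
    return S(lo, lo) - S(hi, lo) - S(lo, hi) + S(hi, hi)

f = discretize_rounding(alpha=0.9, s1=1.0, s2=2.0, h=1.0, r=32)
mass_kept = float(f.sum())   # mass inside the grid; the rest is truncated
```

For heavy tails such as `alpha = 0.9`, `mass_kept` is noticeably below one, which is the truncated mass that the FFT method has to cope with.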
Concerning the FFT method, as discussed in Section 2.2, we increased the truncation point (we took for simplicity) from 16 to 128 (unfortunately, generated an “out of memory” warning), and noticed that yielded sufficiently accurate results (for our data) compared to the exact method (see Tables 1, 3 and 5). Moreover, we also varied the tilting parameter and noticed that increasing improves the results up to while a larger value like does not significantly improve the results (see Table 4 in Example 2).
As expected, there is an important difference between the computing times required by the two methods. This difference grows as the truncation point increases and becomes really huge for in Example 1 and for in Examples 2 and 3. Therefore, we decided to compare the resulting p.f.s only up to a certain right endpoint denoted by even if the support of the FFT was much larger. Note that the discretization time was not taken into account in the displayed computing times since discretization is needed by both methods (the total discretization time up to was about 160 s).
To emphasize the differences between the FFT and the recursive results, we used the cumulative distribution function (cdf), the AE and the maximum absolute error evaluated between the exact p.f. and the FFT one; these last two are defined, respectively, by
We shall now present three examples based on the three particular cases considered in Section 2.1. From these examples, we also note that in cdf terms, an inequality caused by the AE that places compound mass below the truncation point.
Example 1.
We assume that where ; since for this particular model, the recursive method (we implemented Equation (7)) implies the evaluation of the p.f. (i.e., multivariate convolutions), the corresponding computing time increases tremendously with Therefore, starting with we took only which needed about 30 min for the convolution part alone. However, the FFT was ready in only a few seconds even for , see Table 1, where we also display a comparison of the accuracy of the two methods. This example clearly emphasizes the speed discrepancy between the two methods and the important advantage of the FFT speed.
Table 1.
Example 1: Comparing recursive and FFT methods for and various .
Example 2.
We now assume that follows a Poisson distribution for which we recall that and Numerically, we took the multinomial parameters . We implemented the recursive Equation (17) and performed it up to the maximum in about 35 min. The speed difference between the two methods can be seen in Table 2, where we displayed the relative computing times Rec/FFT (for FFT took about 8 s).
Table 2.
Example 2: Relative performances of the two methods when varying r ().
The accuracy comparison of the two methods is presented in Table 3 and the effect of changing the tilting parameters in Table 4, both supporting the above conclusions regarding the choices of r and
Table 3.
Example 2: Comparing recursive and FFT methods for and various .
Table 4.
Example 2: Comparing recursive and FFT methods for and various .
Example 3.
This example is related to Case 3, i.e., follows a mixed Poisson distribution and, for simplicity, we let Therefore, we implemented recursion (19) and the FFT based on Equation (20). The values of the parameters are: The comparison between the two methods is presented in Table 5, from which we note once again that a value of is sufficient to obtain good enough results by FFT (at least for these data). Concerning the computing times, the values were similar to the ones obtained in Example 2, see Table 2.
Table 5.
Example 3: Comparing recursive and FFT methods for and various .
3. Conclusions
In this paper, we proposed a general multivariate collective model that allows for dependence between the r.v.s number of claims, and, moreover, between the different r.v.s claim sizes. Since the evaluation of the resulting compound distribution is not straightforward, we discussed two types of techniques to deal with it: The recursive method that was presented in Section 2.1 and the FFT algorithm that was described in Section 2.2. Unfortunately, even if the recursive method has the advantage of being exact, it has two main drawbacks compared with the FFT method: First, recursions are available only under some restrictive assumptions and second, they become very slow as the dimensionality of the model increases. On the other hand, the main drawback of the FFT method consists in its specific errors, especially the aliasing error. However, the FFT technique is so fast compared with the exact recursions that it is quite worthwhile to use it, especially when values from the tail of the compound distribution are needed (nevertheless, it is important to pay attention when choosing optimal values for the truncation points and for the tilting parameters). Another advantage of the FFT is that specific functions are already implemented in existing software, even for higher dimensions, with the possible disadvantage of memory limitation.
To conclude, we would recommend the following approach: If recursive formulas are available for the considered model, they should be used to evaluate the compound distribution until some reasonable (in computing time terms) upper limit is reached, and then the FFT method should be applied for a more extended domain; to validate the accuracy of the FFT results, they should be compared with the ones obtained by the recursive method.
Funding
This research received no external funding.
Acknowledgments
The author gratefully acknowledges the two anonymous referees for their nice and valuable comments, and the prompt help of the associate editor.
Conflicts of Interest
The author declares no conflict of interest.
References
- Bühlmann, Hans. 1984. Numerical evaluation of the compound Poisson distribution: Recursion or fast Fourier transform? Scandinavian Actuarial Journal 1984: 116–26.
- Embrechts, Paul, R. Grübel, and S. M. Pitts. 1993. Some applications of the fast Fourier transform algorithm in insurance mathematics. Statistica Neerlandica 47: 59–75.
- Grübel, Rudolf, and Renate Hermesmeier. 1999. Computation of compound distributions I: Aliasing errors and exponential tilting. ASTIN Bulletin: The Journal of the IAA 29: 197–214.
- Jin, Tao, and Jiandong Ren. 2014. Recursions and fast Fourier transforms for a new bivariate aggregate claims model. Scandinavian Actuarial Journal 2014: 729–52.
- Johnson, Norman Lloyd, Samuel Kotz, and Narayanaswamy Balakrishnan. 1997. Discrete Multivariate Distributions. New York: Wiley.
- Panjer, Harry H. 1981. Recursive evaluation of a family of compound distributions. ASTIN Bulletin: The Journal of the IAA 12: 22–26.
- Robe-Voinea, Elena-Gratiela, and Raluca Vernic. 2016a. On the recursive evaluation of a certain multivariate compound distribution. Acta Mathematicae Applicatae Sinica, English Series 32: 913–20.
- Robe-Voinea, Elena-Gratiela, and Raluca Vernic. 2016b. Another approach to the evaluation of a certain multivariate compound distribution. Analele Universitatii “Ovidius” Constanta-Seria Matematica 24: 339–49.
- Robe-Voinea, Elena-Gratiela, and Raluca Vernic. 2017. On a multivariate aggregate claims model with multivariate Poisson counting distribution. Proceedings of the Romanian Academy Series A 18: 3–7.
- Robe-Voinea, Elena-Gratiela, and Raluca Vernic. 2018. Fast Fourier Transform for multivariate aggregate claims. Computational and Applied Mathematics 37: 205–19.
- Sundt, Bjørn. 1999. On multivariate Panjer recursions. ASTIN Bulletin: The Journal of the IAA 29: 29–45.
- Sundt, Bjørn, and Raluca Vernic. 2009. Recursions for Convolutions and Compound Distributions with Insurance Applications. Berlin: Springer Science & Business Media.
- Vernic, Raluca. 2018. On the evaluation of some multivariate compound distributions with Sarmanov’s counting distribution. Insurance Mathematics and Economics 79: 184–93.
- Willmot, Gordon E. 1993. On recursive evaluation of mixed Poisson probabilities and related quantities. Scandinavian Actuarial Journal 1993: 114–33.
© 2018 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).