Abstract
In this paper, a new heavy-tailed distribution, the mixture Pareto-loggamma distribution, derived through an exponential transformation of the generalized Lindley distribution is introduced. The resulting model is expressed as a convex sum of the classical Pareto and a special case of the loggamma distribution. A comprehensive exploration of its statistical properties and theoretical results related to insurance are provided. Estimation is performed by using the method of log-moments and maximum likelihood. Also, as the modal value of this distribution is expressed in closed-form, composite parametric models are easily obtained by a mode matching procedure. The performance of both the mixture Pareto-loggamma distribution and composite models are tested by employing different claims datasets.
1. Introduction
In general insurance, pricing is one of the most complex processes and is no easy task but a long-drawn exercise involving the crucial step of modeling past claim data. For modeling insurance claims data, finding a suitable distribution to unearth the information from the data is a crucial work to help the practitioners to calculate fair premiums. In actuarial statistics and finance, the classical Pareto distribution has been deemed better than other models because it provides a good description of the random behaviour of large claims. Loss data mostly have several characteristics such as unimodality, right skewness, and a thick right tail. To accommodate these features in a single model, many probability models have been proposed in the literature.
Over the last decades, different approaches to derive new classes of probability distributions that could provide more flexibility when modelling large losses has been added to the literature. This includes transformation method, the composition of two or more distributions, compounding of distributions or a finite mixture of distributions among other methodologies. In particular for generalizations of the classical Pareto distribution, the reader is referred to the Stoppa distribution (see Stoppa 1990), the Pareto positive stable distribution (see Sarabia and Prieto 2009), the Pareto ArcTan distribution (see Gómez-Déniz and Calderín-Ojeda 2015) and the generalized Pareto distribution proposed in Ghitany et al. (2018). Obviously, adding a parameter to the parent model complicates the parameter estimation.
In this paper, a probabilistic family, the mixture Pareto-loggamma distribution, which belongs to the heavy-tailed class of probabilistic models is introduced. This family is derived by using of an exponential transformation of the generalized Lindley distribution. It is expressed as a convex sum of the classical Pareto and a special case of the loggamma distribution similar to the one presented in Gómez-Déniz and Calderín-Ojeda (2014). The former model is obtained as a particular case whereas the latter one is a limiting case. We further present expressions of statistical and actuarial measures such as moments, variance, cumulative distribution function, hazard rate function, VaR, TVaR and limited expected values. For many of these distributional characteristics, closed-form expressions are obtained. Estimation of the parameters of this distribution can be easily calculated by the maximum likelihood method by using the numerical search of the maximum and also by the method of log-moments.
In many instances, composite distributions give a reasonably good fit as compared to classical distributions since the former models have the advantage of accommodating both low magnitude values with a high frequency as well as large magnitude figures with low frequency. Cooray and Ananda (2005) introduced a composite lognormal-Pareto distribution with restricted mixing weights. Scollnik (2007) improved the composite lognormal-Pareto model by allowing flexible mixing weights. Bakar et al. (2015) considered several probabilistic models in place of lognormal and Pareto distribution using the approach discussed in Scollnik (2007). Recently, Calderín-Ojeda and Kwok (2016) introduced a new class of composite model using mode-matching procedure. They derived composite models using lognormal and Weibull distributions with a Stoppa model, a generalization of the Pareto distribution. As there exists a closed-form expression for the modal value of this new model, in this article, we use the mode-matching procedure to derive new composite models based on this distribution.
The structure of the paper is organized as follows. In Section 2, we present the genesis of the new distribution, and discuss its relationship with other distributions. Besides, the most relevant distributional properties are studied in the same section. Also, different estimation methods are examined. Finally, composite models based on the proposed distribution are derived. Next, in Section 3, some results related to insurance are displayed. Numerical applications are displayed in Section 4 together with the derivation of some income indices. The last section concludes the paper.
2. The Distribution and Its Properties
It can be easily seen that the following expression
with and is a genuine probability density function (pdf). Here are shape parameter and is scale parameter (see the Appendix A). Note that the pdf (1) can be written as a convex sum of the classical Pareto distribution and a special case of the loggamma distribution. The former model is obtained when and the latter one for . Observe that the pdf of the loggamma distribution considered in this work is
with and . In our model it is assumed that .
The cumulative distribution function (cdf), survival function (sf) and hazard rate function of a random variable (rv) with pdf (1) are respectively given, for , as
Henceforth, a continuous rv X that follows (1) will be called as mixture Pareto-loggamma distribution and denoted as . The shape of the pdf given in (1) is shown in Figure 1 for different values of the parameters and and a fixed . It is observable that the larger is the value of , the thinner is the tail. On the contrary, the thickness of the tail decreases when drops.
Figure 1.
Graphs of the pdf (1) for different values of parameter , and fixed value of .
Undoubtedly, the single parameter Pareto distribution is one of the most attractive distributions in statistics; a power-law probability distribution found in a large number of real-world situations inside and outside the field of economics. Furthermore, it is usually used as a basis for Excess of Loss quotations as it gives a pretty good description of the random behavior of large losses. In this sense, many probability distributions can be used for modelling single loss amounts. Then, if the loss is assumed to follow the pdf (1), we have that defines the tail behavior of the distribution.
Below, the hazard rate function has been plotted in Figure 2 for different values of the parameters and for fixed . It is observable that the hazard rate function increases when the parameter grows and the parameter declines.
Figure 2.
Graphs of the hazard rate function (4) for different values of parameter , and for fixed value of .
The next result shows the relationship between the and the Generalized Lindley distribution, which is extensively used in lifetime analysis and reliability.
Remark 1.
Thedistribution can be also obtained by taking, where Y follows a generalized Lindley distribution with density
From this result, it is straightforward to generate random variates from the distribution by observing that the pdf given in (5), can be rewritten as
where , is the pdf of the exponential distribution with mean and is the pdf of the Erlang distribution with shape parameter 2 and rate parameter . Then, a random variate from the distribution, x, can be generated following a modification of the algorithm presented in Ghitany et al. (2008) as shown below.
- Generate two random numbers and from the standard uniform distribution, .
- Generate random variates from the exponential distribution with mean and from the Erlang distribution with shape parameter 2 and rate parameter by using .
- If , then set ; otherwise, set .
- Generate .
Observe that an analogous algorithm could be implemented by using the fact that (1) is a convex sum of the densities of Pareto and loggamma distributions.
2.1. Moments and Log-Moments
The rth raw moment of is given by
In particular, we have that
The variance is given by
The moment estimator of and with known can be simply derived from (7).
2.2. Unimodality
2.3. Stochastic Ordering
Many parametric families of distributions can be ordered by some stochastic orders according to the value of their parameters. A rv is said to be stochastically smaller than if for all x. The formal definition of stochastic ordering is
Definition 1.
Letandbe continuous random variables with densitiesand, respectively, such thatis non-decreasing over the union of the supports ofand. Thenis said to be smaller thanin the likelihood ratio order.(see Section 1.C of Shaked and Shanthikumar (2007)).
Similarly, a rv can be also ordered for stochastic (distribution function), hazard rate and mean excess orders when the following results hold:
- (i)
- Stochastic order if for all x.
- (ii)
- Hazard rate order if for all x.
- (iii)
- Mean excess order if for all x, where is mean excess function given in expression (17).
Using Theorem 1.C.1 and Theorem 2.A.1 of Shaked and Shanthikumar (2007), the above stochastic orders hold following implications result
This gives rise to Proposition 1.
Proposition 1.
Letandbe two rv’s havingdistribution with parameters. Then following results hold
- i.
- If,and, then,and.
- ii.
- If,and, then,and.
Proof.
See the Appendix A. □
2.4. Integrated Tail Distribution and Equilibrium Hazard Rate
As we have already seen previously, the proposed distribution can be viewed as a mixture of two thick-tailed distributions, the Pareto and loggamma models, thus it can be used to describe events that include heavy-tailed behaviour. In this particular, we define another distribution function called Integrated tail distribution (ITD) (also known as equilibrium distribution) which often appears in insurance fields (see Yang 2004). The ITD has many interesting applications, e.g., approximation of the ruin function (see Yang 2004) or characterisation of the tail of the distribution (see Su and Tang 2003) just to name a few. Hence, for distribution bluewith , ITD is obtained as
The associated equilibrium hazard rate , for is given by
Moreover, it can be easily verified that . Hence by using Theorem 2.1 of Su and Tang (2003) we can conclude that is a heavy-tailed distribution. Heavy-tailed distributions are important in non-life insurance when modeling losses related to motor third-party liability insurance, fire insurance or catastrophe insurance.
2.5. Estimation
Let be an independent and identically distributed (iid) random sample of size n drawn from a population which follows a distribution. Estimate of parameter can easily be obtained, as the support of the distribution is greater than . Therefore for known, we estimate the parameters and by (i) method of log-moments and (ii) maximum likelihood method.
2.5.1. Method of Log-Moments
The first and second log-moments of distribution obtained by using (7) are given by and . Solving these equations for and , we have that these estimates are provided by
2.5.2. Maximum Likelihood Estimation
The log-likelihood function given the iid random sample of size n is
and the normal equations obtained by differentiating (12) with respect to and are
The above equations cannot be solved analytically and hence maximum likelihood estimates of and can be computed numerically using in-built -function such as , or . Moreover, in all these functions to initialize the program we use initial values obtained from the method of log-moments. The second partial derivative of log-likelihood function with respect to parameter and are
Furthermore,
substituting , and using Tricomi confluent hypergeometric function , we get
Hence, the expected Fisher’s information matrix associated to the parameters and is given by with where
The standard errors of the estimates and can be obtained by inverting the aforementioned matrix and taking the square root of the diagonal entries.
2.6. Composite Models
Composite parametric models are a useful way to describe data that combined loss data of small and moderate size with a high frequency and large observations with a low frequency. They consist of two distributions, the first model is used up to an unknown single threshold value, estimated from the data, and the second distribution beyond this threshold (see Cooray and Ananda 2005). Another approach to derive composite models is utilizing a mode-matching procedure. In this technique, the two distributions are composed at the common modal value which can be estimated from the claims data. Thus, the composite model uses a truncated version of the first model up to the mode and the rest of the model is based on an appropriate truncation of the second distribution from that modal point onwards. The model after composition is similar in shape to either of the models considered but with a thicker tail. By using this methodology, it is guaranteed that the new density is continuous and smooth. These composite models give a significantly better fit as compared to the standard single models for the same empirical data. Here, we derived composite models of our distribution with lognormal, Weibull and paralogistic distributions.
2.6.1. Composite Lognormal- Model
Let us consider, for , the two-parameter lognormal distribution with pdf and cdf respectively given by
with , where is the cdf of the standard normal distribution. Now by taking expressions (1) and (2) as the pdf, , and cdf, of the distribution respectively with and setting equal the modes of lognormal and distributions we obtain
Note that we must impose the additional constraint . Now, the unrestricted mixing weight is given by
Then, the pdf of the four-parameter composite lognormal- distribution is given by
2.6.2. Composite Weibull- Model
Let
be the pdf and cdf of a two-parameter Weibull distribution with and . Let us again consider the pdf and cdf of the distribution given by (1) and (2) respectively. By equating the modes of the two distributions, we have that
where the restriction must be imposed to guarantee that the mode of the Weibull distribution is greater than 0. Now, the pdf of the four-parameter composite Weibull- distribution is provided by
where the unrestricted mixing weight is given by
2.6.3. Composite Paralogistic- Model
Finally, let us assume that
are the pdf and cdf of the two-parameter paralogistic distribution with α, τ > 0, and . Once again, we consider the pdf and cdf of the model given by (1) and (2). By setting equal the modal values of the two distributions, we have that
where the restriction is again established to ensure that the mode of the paralogistic distribution is larger than 0. The unrestricted mixing weight is now given by
The pdf of the four-parameter composite paralogistic- distribution is provided by
3. Insurance Results
In the following, several theoretical results related to insurance for the distribution and related composite models are derived.
3.1. Mean Excess Function
The mean excess function measures the expected payment per claim on a policy with a fixed amount deductible of x, ignoring the claims with amounts less than or equal to x. It is defined as
which can also be obtained by inverting the equilibrium hazard rate (Su and Tang 2003), hence for the mean excess function for the model with pdf (1) is given by
3.2. Excess of Loss Reinsurance
Let X be a rv denoting the individual claim size taking values greater than d. Let us also assume that X follows (1), and the expected cost per claim to the reinsurance layer when the loss in excess of m subject to a maximum of l is given by
3.3. Value-at-Risk and Tail Value-at-Risk
In this subsection, we firstly discuss the most widely used risk measure, the Value-at-Risk (VaR). It is defined as the minimum value of the distribution such that the probability of the loss larger than this value does not exceed a given probability. In statistical terms, VaR is a quantile of a random variable and the formal definition is as follows.
Definition 2.
Let X be a loss rv with a continuous cdf, and δ be a probability level such that, the Value-at-Risk at probability level δ, denoted by VaRδ(X), is the δ-quantile of X. That is
Hence for rv having distribution the Value-at-Risk is given in next Proposition.
Proposition 2.
For, the VaRδ(X) of thedistribution is
whereis the negative branch of Lambert-W function.
Proof.
See the Appendix A. □
The median of will be obtained by taking in VaRδ(X). As in general, the loss distributions are typically skewed, the VaR is a non-coherent risk measure due to the lack of subadditivity (see Klugman et al. 2012), for that reason, the Tail Value-at-Risk (TVaR) of X (see Acerbi and Tasche 2002) is usually considered as a more informative and more useful risk measure. The TVaR is given by
which is a coherent risk measure. If X is continuous, , and then the TVaR is the conditional tail expectation .
Proposition 3.
For, the TVaRδ(X) fordistribution is given by
where.
Proof.
See the Appendix A. □
3.4. Limited Loss Variable for Composite Models
Finally, in this subsection the kth moment of the limited loss variable for composite models based on the distribution is presented.
Definition 3.
The kth incomplete moment transform of a rv X with cdfand pdf, given that, is a rv with pdf given byand cdf given by,.
Definition 4.
The limited loss variable is defined as,where u is the maximum benefit paid by the insurance policy.
Proposition 4.
The moment of order k of the limited loss variable for the composite models based on thedistribution is given by
ifand
if.
Proof.
The kth moment of the limited loss variable can be derived as,
4. Case Studies
In this section, we use two claims datasets to assess the performance of distribution and the composite models. Finally, some results related to income indices are given.
4.1. An Application to Automobile Claims Data
First, we examine the performance of distribution as compared to other heavy-tailed distributions available in the literature by employing of a real automobile claims dataset (see De Jong and Heller 2008). This dataset describes a one-year vehicle insurance policies taken out in 2004 or 2005. There are 67,856 policies, of which 4624 (6.8%) had at least one claim. The variable of interest is the size of the claims. The minimum claim amount is $200 and the maximum value is $55,922.13. We have fitted to this dataset the distribution by setting the value of equal to the minimum value to explain the claims amount distribution. In Table 1, parameter estimates and standard errors (S.E.) for the and other two-parameter and three-parameter heavy-tailed probabilistic models with support in are illustrated. For a review of these distributions the reader is referred to Hogg and Klugman (2009) or Klugman et al. (2012). For the classical Pareto, PAT the location parameter has been set equal to the minimum value, i.e., . For the loggamma distribution the location parameter can be chosen in the neighborhood of this value. For comparison purposes, three measures of model selection has been included in this Table, the negative of the log-likelihood function (NLL), Akaike’s information criterion (AIC) and Bayesian information criterion (BIC). It is observable that the provides the best fit to data in terms of these measures of model selection. The figures of the measures of model selection reveal that the Pareto distribution is preferable to the loggamma distribution for this dataset.
Table 1.
Estimated values of different heavy-tailed models, corresponding standard errors (S.E.), negative of the log-likelihood function (NLL), Akaike’s information criterion (AIC) and Bayesian information criterion (BIC) for automobile claims dataset (De Jong and Heller 2008).
We have also fitted the three-parameter Burr distribution to this dataset. For this model the value of the NLL is 38,028.13. However, the algorithm used to search for the maximum of the log-likelihood surface to find the estimates stopped at the boundary of the parameter space, and thus these estimates are no longer displayed in Table 1.
4.2. An Application to Fire Insurance Claims
In our second application, the composite models derived from the distribution are compared to the composite Pareto and composite Lomax families. For that reason, we consider the well-known Danish fire insurance dataset that consists of 2492 fire insurance losses in millions of Danish kroner (DKr) from the years 1980 to 1990 (both inclusive), adjusted to reflect 1985 values. This dataset may be found in the ‘SMPrcacticals’ add-on package for R, available from the CRAN website http://cran.r-project.org/. Parameter estimation for all the models considered has been completed by the method of maximum likelihood (which is implemented using the function ‘mle’/‘mle2’ in R). In Table 2, parameter estimates and standard errors (S.E.) for all the composite distributions and the three measures of model validation considered in our first example are exhibited. Among the composite models, we can see that the Weibull- composite model gives the best fit overall for this dataset in terms of NLL, AIC and BIC values.
Table 2.
Estimated values of different heavy-tailed models, corresponding standard errors (S.E.), NLL, AIC, and BIC of different composite models for the Danish fire losses dataset.
Finally, to select composite models that provide an acceptable description of the loss process, we must verify that the first-order moment of the limited loss variable and the empirical counterpart, given by are essentially in agreement. Obviously when u tends to infinity, the former quantity and converge to and the sample mean, respectively. In Table 3 empirical and fitted limited expected value for the seven composite models considered are displayed. As it can be observed, the composite lognormal-Pareto and Weibull-Pareto distributions tend to overestimate the empirical limited expected value when the policy limit u increases. Although, the remainder models stay closer to the empirical limited expected value for different values of u, the composite Weibull-Lomax displays the best behaviour.
Table 3.
Limited expected value for the distributions considered and different values of the policy limit u.
4.3. Theil’s Income Indices
Among several economic inequality measures the Gini Index is commonly used. However, it does not allow us to vary the sensitivity of the index to redistributional movements at specific income ranges, which can point out different evolutions of disparities when no Lorenz dominance is found (see Sarabia and Castillo 2005). Yitzhaki (1983) further generalized this index which is sensitive to changes in the right tail of the distribution as the parameter increases. The two limiting cases of this family of inequality measures, known as Theils’s Indices ( and ), corresponding to the Theil entropy index (TEI) and the mean log deviation (MLD), are defined by
Proposition 5.
The TEI and MLD indices for a rv X that follows the distribution are provided by
and
Proof.
The proof follows directly after some computation. □
5. Conclusions
In this paper, we have proposed a new heavy-tailed class of distribution, that is obtained by using an exponential transformation of the generalized Lindley distribution. This class generalizes both the classical Pareto model and a special case of the loggamma distribution. Some of its most relevant statistical properties were examined. The model is very flexible since it allows for closed-form expressions for many results related to insurance. Besides, as the mode of this new model can be written in closed-form, composite models based on this distribution can be easily derived. The numerical illustrations reveal that the mixture Pareto-loggamma distribution provides a good fit to loss data, and it is a competitive model with other existing heavy-tailed distributions in the literature.
Author Contributions
The authors contributed equally to this work.
Funding
This research received no external funding.
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A
Expression (1) is a genuine pdf:
where and are the densities of the Pareto and loggamma distribution respectively.
Proof of Proposition 1.
Proof of Proposition 2.
By assuming , the cdf can be written as
for fixed and , the th quantile function is obtained by solving . By re-arranging the above, we obtain
Now, by multiplying both sides of (A2) by , we obtain
From expression (A3), we see that is the Lambert-W function of real argument . Thus, we have
Moreover, for any , it is immediate that , and it can also be checked that since . Therefore, by taking into account the properties of the negative branch of the Lambert-W function, we deduce the following
Again, substituting , and solving for x, we obtain
This completes the proof of Proposition 2. □
Proof of Proposition 3.
Now, by denoting , we get
where . By letting , it gives , and , then we have
where . Hence the proposition. □
References
- Acerbi, Carlo, and Dirk Tasche. 2002. On the coherence of expected shortfall. Journal of Banking & Finance 26: 1487–503. [Google Scholar]
- Bakar, Shaiful A., Nor A. Hamzah, Mastoureh Maghsoudi, and Saralees Nadarajah. 2015. Modeling loss data using composite models. Insurance: Mathematics and Economics 61: 146–54. [Google Scholar]
- Calderín-Ojeda, Enrique, and Chun F. Kwok. 2016. Modeling Claims Data with Composite Stoppa Models. Scandinavian Actuarial Journal 9: 817–36. [Google Scholar] [CrossRef]
- Cooray, Kahadawala, and Malwane M. A. Ananda. 2005. Modeling actuarial data with a composite lognormal–Pareto model. Scandinavian Actuarial Journal 5: 321–34. [Google Scholar] [CrossRef]
- De Jong, Piet, and Heller Gillian H. 2008. Generalized Linear Models for Insurance Data. International Series on Actuarial Science; Cambridge: Cambridge University Press. [Google Scholar]
- Ghitany, Mohamed E., Barbra Atieh, and Saralees Nadarajah. 2008. Lindley distribution and its application. Mathematics and Computers in Simulation 78: 493–506. [Google Scholar] [CrossRef]
- Ghitany, Mohamed E., Emilio Gómez-Déniz, and Saralees Nadarajah. 2018. A new generalization of the Pareto distribution and its application to insurance data. Journal of Risk and Financial Management 11: 10. [Google Scholar] [CrossRef]
- Gómez-Déniz, Emilio, and Enrique Calderín-Ojeda. 2014. A Suitable Alternative to the Pareto Distribution. Hacettepe Journal of Mathematics and Statistics 43: 843–60. [Google Scholar]
- Gómez-Déniz, Emilio, and Enrique Calderín-Ojeda. 2015. Modeling insurance data with the Pareto ArcTan distribution. Astin Bulletin 45: 639–60. [Google Scholar] [CrossRef]
- Hogg, Robert V., and Stuart A. Klugman. 2009. Loss Distributions. Vol. 249 of Wiley Series in Probability and Statistics; New York: Wiley. [Google Scholar]
- Klugman, Stuart A., Harry H. Panjer, and Gordon E. Willmot. 2012. Loss Models: From Data to Decisions, 4th ed. Wiley Series in Probability and Statistics; Hoboken: John Wiley and Sons, Inc. [Google Scholar]
- Sarabia, José M., and Enrique Castillo. 2005. About a class of max–stable families with applications to income distributions. Metron LXIII: 505–27. [Google Scholar]
- Sarabia, José M., and Faustino Prieto. 2009. The Pareto-positive stable distribution: A new descriptive model for city size data. Physica A 388: 4179–91. [Google Scholar] [CrossRef]
- Scollnik, David P. M. 2007. On Composite Lognormal-Pareto Models. Scandinavian Actuarial Journal 1: 20–33. [Google Scholar] [CrossRef]
- Shaked, Moshe, and J. George Shanthikumar. 2007. Stochastic Orders. New York: Springer Science and Business Media. [Google Scholar]
- Stoppa, Gabriele. 1990. Proprietà campionarie di un nuovo modello Pareto generalizzato. In Atti XXXV Riunione Scientifica della Società Italiana di Statistica. Padova: Cedam, pp. 137–44. [Google Scholar]
- Su, Chun, and Qihe H. Tang. 2003. Characterizations on heavy-tailed distributions by means of hazard rate. Acta Mathematicae Applicatae Sinica 19: 135–42. [Google Scholar] [CrossRef]
- Yang, Hailiang. 2004. Cramér-Lundberg asymptotics. In Encyclopedia of Actuarial Science. New York: Wiley. [Google Scholar]
- Yitzhaki, Shlomo. 1983. On an Extension of the Gini Inequality Index. International Economic Review 24: 617–28. [Google Scholar]
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).