Article

On the Use of Lehmann’s Alternative to Capture Extreme Losses in Actuarial Science

by Emilio Gómez-Déniz 1,† and Enrique Calderín-Ojeda 2,*,†
1 Department of Quantitative Methods in Economics and TiDES, University of Las Palmas de Gran Canaria, 35017 Las Palmas de Gran Canaria, Spain
2 Department of Economics, University of Melbourne, Melbourne, VIC 3010, Australia
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Submission received: 30 October 2023 / Revised: 12 December 2023 / Accepted: 22 December 2023 / Published: 28 December 2023

Abstract

This paper studies properties and applications related to the mixture of the class of distributions built by Lehmann's alternative (also referred to in the statistical literature as max-stable or exponentiated distributions) of the form $[G(\cdot)]^{\lambda}$, where $\lambda > 0$ and $G(\cdot)$ is a continuous cumulative distribution function. This mixture can be useful in economic, financial, and actuarial fields, where extreme observations and long tails appear in the empirical data. The special case in which $G(\cdot)$ is the Stoppa cumulative distribution function, which is a good description of the random behaviour of large losses, is studied in detail. We provide properties of this mixture, mainly related to the analysis of the tail of the distribution, that make it a candidate for fitting actuarial data with extreme observations. Inference procedures are discussed and applications to three well-known datasets are shown.

1. Introduction

As Boland (2007, p. IX) has pointed out, when a practitioner wants to model losses, there is usually considerable concern about the chances and sizes of large claims. More precisely, the study of the right tail of the distribution is very important in order not to underestimate the size of large losses. The term long tail is useful here and is applied to rank-size distributions or rank-frequency distributions, which often form power laws. In this sense, the Pareto distribution and generalisations of this simple distribution, such as the Stoppa distribution (see Stoppa 1990; Kleiber and Kotz 2003), have been considered. Other alternatives to the Pareto distribution have been proposed recently in the statistical and applied statistical literature. Some of them are provided by Sarabia and Prieto (2009), Gómez-Déniz and Calderín-Ojeda (2014, 2015) and Ghitany et al. (2018), among others.
One of the main advantages obtained when the Stoppa distribution is considered is the fact that this is a max-stable (maximum of a number of random variables) distribution, which is also related to extreme value theory. This extreme property can, in turn, be highlighted by making a mixture (compound) of it, gaining an even heavier tail.
In this paper, we study properties and applications related to the mixture of the class of probability distributions built by Lehmann's alternative (also known as max-stable or exponentiated distributions). The reader is referred to Lehmann (1953) and Sarabia and Castillo (2005), among other references. A positive random variable $X$ is said to be built by Lehmann's alternative method if its cumulative distribution function (cdf) can be represented as:
$$[G(x)]^{\lambda}, \quad \lambda > 0, \tag{1}$$
where $G(\cdot)$ is a continuous cdf. In practice, if $X$ represents, for instance, the loss of a given risk in an insurance portfolio, the $\lambda$-values may vary across the population of policyholders because of small fluctuations in the mean of losses, so that a policyholder selected at random can be regarded as having a random value of $\lambda$, call it $\Lambda$. Here, $\Lambda$ takes values in $(0, \infty)$. The distribution of $\Lambda$ across the positive real numbers is referred to in the statistical literature as the mixing distribution (or density). These mixtures are useful in economic, financial, and actuarial fields, where extreme observations and long tails appear in the empirical data.
Although in the statistical literature, mixtures of discrete distributions for providing thick tails (for example, the well-known Poisson–Gamma distribution) are much more frequent, mixtures of continuous distributions have also been the subject of research in the field of applied statistics and, in particular, in actuarial statistics. One example of this is the exponential-inverse Gaussian model provided by Frangos and Karlis (2004). Other examples of continuous mixture distributions were studied by Fung and Seneta (2007) and Gómez-Déniz et al. (2013). Also, some mixtures of continuous distributions have recently been considered in the actuarial literature in Tzougas and Karlis (2020) and Tzougas and Jeong (2021).
The special case in which (1) corresponds to the Stoppa cdf (see for example, Stoppa 1990; Kleiber and Kotz 2003)—which is a good description of the random behaviour of large losses—is studied in some detail. The Stoppa distribution is a generalisation of the classical Pareto distribution and is a max-stable distribution with a nice and simple cdf. We provide properties of this mixture, mainly related to the analysis of the tail of the distribution that makes it a candidate for fitting actuarial data with extreme observations. Inference procedures are discussed and applications to two well-known datasets are shown. Furthermore, two regression models for this mixture are derived to fit bodily injury claims data.
The remainder of the paper is organised as follows. The mixture representation of a max-stable distribution is provided in Section 2. Here, the representation of the Stoppa distribution is also considered and some basic properties are derived. Section 3 discusses properties related to the right tail of this mixture distribution. Some special cases of the mixing distribution are considered in Section 4. In particular, special attention is paid to the generalised inverse Gaussian distribution, which includes as special cases the inverse Gaussian, Gamma, and exponential distributions, among other models. Next, the stochastic ordering of the resulting mixture models is examined. Finally, two regression models for this mixture of distributions are given. Then, three numerical applications are shown in Section 5 and conclusions are provided in the last section.

2. Lehmann’s Alternative

The distribution function built by Lehmann's alternative method, $G(z)^{n}$, where $Z \sim G(z)$, can be interpreted in different ways (see for instance Sarabia and Castillo 2005); for example, it is the distribution of the last (maximum) order statistic in a random sample of size $n$. That is, consider the sample $Z_1, \dots, Z_n$ of $n$ independent and identically distributed random variables with common cdf $G(z)$ and define the ordered sample $Z_{1,n}, \dots, Z_{n,n}$. If the interest of the practitioner is the asymptotic distribution of the maxima $M_n$ as $n \to \infty$, the distribution of $M_n$ results in $G^{n}(z) = \Pr(M_n \le z) = \Pr(Z_1 \le z, \dots, Z_n \le z)$. On the other hand, exponentiated distributions of the form $G(z)^{n}$ also appear by applying the probability integral transformation or the quantile function of $G$ to a Beta distribution with parameters $\alpha$ and 1.
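The maximum interpretation can be checked numerically. The following Python sketch simulates maxima of $n$ iid draws and compares the empirical frequency of $\{M_n \le z\}$ with $G(z)^n$. The classical Pareto cdf is used for $G$ purely as an illustration, and the values of `sigma`, `theta`, `n`, `z`, the number of replications and the seed are assumptions, not values taken from the paper.

```python
import random


def pareto_cdf(z, sigma=1.0, theta=2.0):
    """Classical Pareto cdf G(z) = 1 - (sigma / z)^theta for z > sigma."""
    return 0.0 if z <= sigma else 1.0 - (sigma / z) ** theta


def pareto_draw(rng, sigma=1.0, theta=2.0):
    """Inverse-transform draw: solve G(z) = u for z, with u uniform on [0, 1)."""
    u = rng.random()
    return sigma * (1.0 - u) ** (-1.0 / theta)


def empirical_max_cdf(z, n, reps=20000, seed=12345):
    """Share of simulated maxima M_n = max(Z_1, ..., Z_n) at or below z."""
    rng = random.Random(seed)
    hits = sum(
        max(pareto_draw(rng) for _ in range(n)) <= z for _ in range(reps)
    )
    return hits / reps


n, z = 5, 3.0
exact = pareto_cdf(z) ** n         # G(z)^n
approx = empirical_max_cdf(z, n)   # Monte Carlo estimate of Pr(M_n <= z)
```

The two quantities should agree up to Monte Carlo error, which shrinks at the usual rate of one over the square root of the number of replications.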
We now consider extensions of the distributions built by Lehmann's alternative method of the form
$$F(x \mid \lambda) = [G(x)]^{\lambda}, \quad \lambda > 0,$$
assuming that the parameter $\lambda > 0$ is not constant and varies according to some known probability distribution, say $m(\lambda; \omega)$, depending on a vector of parameters $\omega$. This has the following stochastic representation,
$$X \sim [G(x)]^{\lambda} = \exp[\lambda \log G(x)], \qquad \lambda \sim m(\lambda; \omega).$$
Thus, we study the mixture of the form,
$$F(x; \omega) = E_{m(\lambda; \omega)}\left[\exp\{\lambda \log G(x)\}\right], \tag{2}$$
where $m(\lambda; \omega)$ is a genuine mixing probability density function of $\Lambda$ over the parametric space in which the parameter $\lambda$ moves, depending on a vector of parameters $\omega$. Another possibility, which will not be studied here, is to consider that the parameter $\lambda$ takes integer values $\{1, 2, \dots\}$ and to allow $\lambda$ to follow a discrete distribution such as the geometric distribution. This is the family of geometric max-stable distributions considered by Marshall and Olkin (1997).
Observe that (2) can be written as
$$F(x; \omega) = M_{\Lambda}\left(\log G(x)\right), \tag{3}$$
in which $M_{\Lambda}(\cdot)$ denotes the moment-generating function of $\Lambda$. Thus, in view of (3), any probability density function (pdf) with a closed-form moment-generating function, such as the Gamma, the inverse Gaussian, and the generalised inverse Gaussian distributions, is a good candidate for the mixing distribution.
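To illustrate representation (3), the Python sketch below evaluates the mixture cdf at a point in two ways: through the moment-generating function of the mixing distribution and through direct numerical integration of $E[G(x)^{\Lambda}]$. It assumes a Gamma mixing density in the usual shape/rate parametrisation, which is a choice made here for illustration and is not the parametrisation used later for the GIG family. The function names, the integration bounds and the parameter values are hypothetical.

```python
import math


def gamma_pdf(lam, shape, rate):
    """Gamma density with the given shape and rate (assumed parametrisation)."""
    return rate ** shape * lam ** (shape - 1.0) * math.exp(-rate * lam) / math.gamma(shape)


def gamma_mgf(t, shape, rate):
    """Moment-generating function M(t) = (1 - t / rate)^(-shape), valid for t < rate."""
    return (1.0 - t / rate) ** (-shape)


def mixture_cdf_via_mgf(G_x, shape, rate):
    """Mixture cdf at a point where the baseline cdf equals G_x: M(log G(x))."""
    return gamma_mgf(math.log(G_x), shape, rate)


def mixture_cdf_by_integration(G_x, shape, rate, upper=80.0, steps=40000):
    """Composite Simpson approximation of the integral of G(x)^lam * m(lam) over lam."""
    h = upper / steps

    def integrand(lam):
        return G_x ** lam * gamma_pdf(lam, shape, rate)

    total = integrand(0.0) + integrand(upper)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return total * h / 3.0


via_mgf = mixture_cdf_via_mgf(0.7, shape=2.0, rate=1.5)
via_integral = mixture_cdf_by_integration(0.7, shape=2.0, rate=1.5)
```

Since $\log G(x) \le 0$, the moment-generating function is always evaluated at a non-positive argument, so it exists for any mixing distribution on the positive half-line.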
The pdf, obtained from (2), is given by
$$f(x; \omega) = \frac{g(x)}{G(x)}\, E_{m(\lambda; \omega)}\left[\lambda \exp\{\lambda \log G(x)\}\right],$$
where $g(x) = dG(x)/dx$, and the hazard rate function results in
$$r(x; \omega) = \frac{g(x)}{G(x)}\, \frac{E_{m(\lambda; \omega)}\left[\lambda \exp\{\lambda \log G(x)\}\right]}{1 - E_{m(\lambda; \omega)}\left[\exp\{\lambda \log G(x)\}\right]}.$$

The Lehmann’s Alternative Stoppa Distribution

The natural generalisation of the Pareto distribution proposed by Stoppa (1990) (see also Kleiber and Kotz 2003) has a simple cumulative distribution function (hereafter cdf) given by
$$F(x; \sigma, \theta, \lambda) = \left[\psi_{\sigma,\theta}(x)\right]^{\lambda} = \exp\left\{\lambda \log \psi_{\sigma,\theta}(x)\right\}, \quad x > \sigma, \tag{4}$$
where $\sigma > 0$, $\theta > 0$, and $\lambda > 0$ is a shape parameter that allows for unimodality when $\lambda > 1$; furthermore, the mode is located at zero when $\lambda \le 1$. Observe that
$$\psi_{\sigma,\theta}(x) = 1 - \left(\frac{\sigma}{x}\right)^{\theta} \tag{5}$$
is the cdf of the classical Pareto distribution with scale parameter $\sigma > 0$ and shape parameter $\theta > 0$ (see for instance Arnold 1983) with pdf
$$g(x) = \frac{\theta \sigma^{\theta}}{x^{\theta+1}}, \quad x > \sigma > 0, \ \theta > 0,$$
and survival function
$$\bar{G}(x) = \left(\frac{\sigma}{x}\right)^{\theta}, \quad x > \sigma > 0, \ \theta > 0.$$
Note that (4) is the cdf of a Lehmann's alternative (max-stable) distribution, such as the one given in (1). Thus, this cdf is simply a power transformation of the classical Pareto cdf, which is obtained as a special case of (4) for $\lambda = 1$. Some additional properties, such as moments, Lorenz curves, the Gini index, and estimation of this distribution can be found in Kleiber and Kotz (2003, chp. 3), where other generalisations of the classical Pareto distribution can also be found.
Henceforward, we will write $X \sim S(\sigma, \theta, \lambda)$ to denote that a continuous random variable with support in $(\sigma, \infty)$ follows the distribution given in (4).
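A minimal Python sketch of the Stoppa cdf (4), the pdf obtained by differentiating it, and the quantile function obtained by inverting it is given below. The function names are hypothetical, and the parameter values used in the short check at the end are arbitrary.

```python
import math


def pareto_cdf(x, sigma, theta):
    """Classical Pareto cdf psi(x) = 1 - (sigma / x)^theta for x > sigma."""
    return 1.0 - (sigma / x) ** theta if x > sigma else 0.0


def stoppa_cdf(x, sigma, theta, lam):
    """Stoppa cdf: the Pareto cdf raised to the power lam."""
    return pareto_cdf(x, sigma, theta) ** lam


def stoppa_pdf(x, sigma, theta, lam):
    """Derivative of the Stoppa cdf: lam * g(x) * psi(x)^(lam - 1)."""
    if x <= sigma:
        return 0.0
    psi = pareto_cdf(x, sigma, theta)
    g = theta * sigma ** theta / x ** (theta + 1.0)
    return lam * g * psi ** (lam - 1.0)


def stoppa_quantile(q, sigma, theta, lam):
    """Solve psi(x)^lam = q for x, with 0 < q < 1."""
    return sigma * (1.0 - q ** (1.0 / lam)) ** (-1.0 / theta)


# Round trip: the cdf evaluated at the q-quantile returns q.
x95 = stoppa_quantile(0.95, sigma=1.0, theta=2.0, lam=1.5)
check = stoppa_cdf(x95, sigma=1.0, theta=2.0, lam=1.5)
```

Setting `lam=1.0` recovers the classical Pareto distribution, which gives a convenient sanity check for any implementation.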
To see how the Stoppa model performs compared with the classical Pareto distribution, we consider data taken from Hogg and Klugman (1984, p. 64) (see also Boyd 1988), which concern 40 losses recorded in 1977 related to wind catastrophes. This set of empirical data was recorded to the nearest USD 1,000,000 and includes only losses of USD 2,000,000 or more. Maximum likelihood estimates were computed for the classical Pareto distribution and the Stoppa distribution, assuming $\sigma = 1$, and the results appear in Table 1. As we can see, the Stoppa distribution provides a better fit than the classical Pareto distribution.

3. Extremal Properties

As mentioned before, distributions with non-negative support, such as the classical Pareto distribution, are commonly used in insurance and other financial scenarios to model the amount of claims (losses). In this sense, the size of the distribution tail is fundamental if the chosen model is to capture amounts sufficiently far from the start of the distribution support, that is, extreme values. For this reason, heavy right-tailed distributions such as the Pareto, lognormal, and Weibull (with shape parameter smaller than 1) distributions have been employed to model losses in motor third-party liability insurance, fire insurance, or catastrophe insurance, among others. To make the paper self-contained, we provide in the following the formal definition of a heavy-tailed (heavy right-tailed) distribution (see Rolski et al. 1999).
Definition 1.
Any probability distribution that is specified through its cdf $F(x)$ on the real line is heavy right-tailed if $\limsup_{x \to \infty} \left(-\log \bar{F}(x) / x\right) = 0$.
Observe that $-\frac{d}{dx} \log \bar{F}(x)$ is the hazard function of a random variable with cdf $F(x)$.
An important issue in extreme value theory is regular variation (see Bingham et al. 1989; Konstantinides 2017); that is, a flexible description of the variation of some function according to the polynomial form of the type $x^{-\delta} + o(x^{-\delta})$, $\delta > 0$. This concept is formalised in the following definition.
Definition 2.
A distribution function (measurable function) is called regularly varying at infinity with index $-\delta$ if it holds that
$$\lim_{x \to \infty} \frac{\bar{F}(\tau x)}{\bar{F}(x)} = \tau^{-\delta},$$
where $\tau > 0$ and the parameter $\delta \ge 0$ is called the tail index.
The next proposition establishes that the survival function of the distribution given in (2) is regularly varying.
Proposition 1.
The survival function of any mixture with cdf as the one given in (2) when G ( x ) is given in (5) is a survival function with regularly varying tails.
Proof. 
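This is straightforwardly verified by taking into account that
$$\limsup_{x \to \infty} \frac{\bar{F}(\tau x)}{\bar{F}(x)} = \limsup_{x \to \infty} \frac{1 - E_{m(\lambda;\omega)}\left[\exp\{\lambda \log \psi_{\sigma,\theta}(\tau x)\}\right]}{1 - E_{m(\lambda;\omega)}\left[\exp\{\lambda \log \psi_{\sigma,\theta}(x)\}\right]} = \limsup_{x \to \infty} \tau\, \frac{\psi'_{\sigma,\theta}(\tau x)}{\psi'_{\sigma,\theta}(x)}\, \frac{\psi_{\sigma,\theta}(x)}{\psi_{\sigma,\theta}(\tau x)}\, \frac{E_{m(\lambda;\omega)}\left[\lambda \exp\{\lambda \log \psi_{\sigma,\theta}(\tau x)\}\right]}{E_{m(\lambda;\omega)}\left[\lambda \exp\{\lambda \log \psi_{\sigma,\theta}(x)\}\right]} = \tau^{-\theta},$$
where the second equality follows from L'Hôpital's rule and the last one from $\psi'_{\sigma,\theta}(x) = \theta \sigma^{\theta} x^{-\theta-1}$ together with $\psi_{\sigma,\theta}(x) \to 1$ as $x \to \infty$.
Now, taking into account that $\theta > 0$, the statement of the result follows immediately.    □
Therefore, since every distribution that is regularly varying at infinity is long-tailed, and every long-tailed distribution is also heavy right-tailed, any mixture of the Stoppa distribution is heavy right-tailed.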
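The limit in Proposition 1 can be inspected numerically for a concrete mixture. The Python sketch below does so for a Stoppa mixture with cdf $F(x) = (\beta / \varphi(x))^{\alpha/2}$, where $\varphi(x) = \beta - 2 \log \psi_{\sigma,\theta}(x)$; this closed form is assumed here only to have a concrete survival function. The helper names and parameter values are hypothetical. The survival function is computed with `log1p` and `expm1` because $1 - F(x)$ suffers from cancellation far in the tail.

```python
import math


def sg_survival(x, sigma, theta, alpha, beta):
    """Survival function 1 - (beta / phi(x))^(alpha / 2), computed stably for large x."""
    log_psi = math.log1p(-((sigma / x) ** theta))               # log psi(x) <= 0
    log_cdf = -(alpha / 2.0) * math.log1p(-2.0 * log_psi / beta)
    return -math.expm1(log_cdf)                                 # 1 - exp(log_cdf)


def tail_ratio(x, tau, sigma, theta, alpha, beta):
    """Ratio of survival functions at tau * x and x, which should approach tau^(-theta)."""
    return sg_survival(tau * x, sigma, theta, alpha, beta) / sg_survival(
        x, sigma, theta, alpha, beta
    )


ratio = tail_ratio(1e6, tau=2.0, sigma=1.0, theta=1.5, alpha=2.0, beta=1.0)
limit = 2.0 ** (-1.5)
```

For these values the ratio at $x = 10^6$ is already very close to $\tau^{-\theta}$, in line with the tail index being driven by the Pareto shape parameter $\theta$ rather than by the mixing parameters.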
An immediate consequence of the previous result is the following (see Jessen and Mikosch 2006):
Corollary 1.
If $X, X_1, \dots, X_n$ is a sequence of iid random variables with survival function $\bar{F}(x) = 1 - F(x)$ and $S_n = \sum_{i=1}^{n} X_i$, $n \ge 1$, then
$$\Pr(S_n > x) \sim n \Pr(X > x) \quad \text{as } x \to \infty,$$
where the symbol $\sim$ denotes asymptotic equivalence, i.e., the quotient of both sides tends to 1 as $x \to \infty$. As a consequence of this result, if $P_n = \max_{i=1,\dots,n} X_i$, $n \ge 1$, then
$$\Pr(S_n > x) \sim n \Pr(X > x) \sim \Pr(P_n > x).$$
This result is important because, for large x, the event { S n > x } is due to the event { P n > x } . Thus, exceedances of high thresholds by the sum S n are explained by the exceedance of this cut-off by the largest value in the sample.

4. Some Candidates of Mixing Distributions

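Consider the pdf of the generalised inverse Gaussian distribution given by
$$m(\lambda; \omega) = \frac{(\beta/\Gamma)^{\alpha/2}}{2 K_{\alpha}\left(\sqrt{\beta \Gamma}\right)}\, \lambda^{\alpha-1} \exp\left\{-\frac{1}{2}\left(\beta \lambda + \frac{\Gamma}{\lambda}\right)\right\}, \quad \lambda > 0, \tag{6}$$
where $\omega = (\alpha, \beta, \Gamma)$ and $K_{\nu}(z)$ is the modified Bessel function of the third kind with index $\nu$, given by
$$K_{\nu}(z) = \sqrt{\pi}\, (2z)^{\nu} e^{-z}\, U\left(\nu + \tfrac{1}{2},\, 2\nu + 1,\, 2z\right),$$
$U(\cdot, \cdot, \cdot)$ being the confluent hypergeometric function (Johnson et al. 2005), whose integral representation is
$$U(a, b, z) = \frac{1}{\Gamma(a)} \int_{0}^{\infty} t^{a-1} (1+t)^{-a+b-1} \exp(-z t)\, dt,$$
where $\Gamma(a)$ denotes the gamma function (not the parameter $\Gamma$).
A distribution with a pdf as the one given in (6) will be denoted as $GIG(\alpha, \beta, \Gamma)$.
The domain of the parameters $\omega = (\alpha, \beta, \Gamma)$ (see Barndorff-Nielsen and Halgreen 1977; Seshadri 1994; Lemonte and Cordeiro 2011) is given by $\alpha \in \mathbb{R}$ and $(\beta, \Gamma) \in \Upsilon_{\alpha}$, where
$$\Upsilon_{\alpha} = \begin{cases} \Gamma \ge 0,\ \beta > 0, & \text{if } \alpha > 0, \\ \Gamma > 0,\ \beta > 0, & \text{if } \alpha = 0, \\ \Gamma > 0,\ \beta \ge 0, & \text{if } \alpha < 0. \end{cases}$$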
Its moment-generating function is given by
$$M(t) = \left(\frac{\beta}{\beta - 2t}\right)^{\alpha/2} \frac{K_{\alpha}\left(\sqrt{(\beta - 2t)\,\Gamma}\right)}{K_{\alpha}\left(\sqrt{\beta \Gamma}\right)}. \tag{7}$$
This distribution includes many basic distributions such as the Gamma distribution ($\alpha > 0$ and $\Gamma \to 0$); the reciprocal Gamma distribution ($\alpha < 0$ and $\beta \to 0$); the inverse Gaussian distribution ($\alpha = -1/2$); the hyperbolic distribution ($\alpha = 0$); and others such as the exponential, chi-squared, half-normal, etc.
Now, by using (3) together with (7), we get the mixture of the Stoppa distribution with the generalised inverse Gaussian distribution (SGIG). In order to simplify the notation, we introduce the expression $\varphi_{\sigma,\beta,\theta}(x) = \beta - 2 \log \psi_{\sigma,\theta}(x)$. Then, its cdf results in
$$F(x) = \left(\frac{\beta}{\varphi_{\sigma,\beta,\theta}(x)}\right)^{\alpha/2} \frac{K_{\alpha}\left(\sqrt{\Gamma\, \varphi_{\sigma,\beta,\theta}(x)}\right)}{K_{\alpha}\left(\sqrt{\Gamma \beta}\right)}, \quad x > \sigma. \tag{8}$$
One special case obtained from the last expression is the mixture of the Stoppa distribution with the Gamma distribution (SG) with cdf,
$$F(x) = \left(\frac{\beta}{\varphi_{\sigma,\beta,\theta}(x)}\right)^{\alpha/2}, \quad x > \sigma. \tag{9}$$
Observe that, in this case, the mixture distribution is again a max-stable distribution. Another interesting sub-model obtained from the generalised inverse Gaussian distribution is the inverse Gaussian distribution, which is reached when $\alpha = -1/2$. In this case, (3) can be written as
$$F(x) = \exp\left\{\sqrt{\Gamma \beta} - \sqrt{\Gamma\, \varphi_{\sigma,\beta,\theta}(x)}\right\}, \quad x > \sigma,$$
which corresponds to the mixture of the Stoppa model with the inverse Gaussian distribution (SIG).
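Now, the pdf obtained from (8) can be written as
$$f(x) = \sqrt{\Gamma}\, \sqrt{\frac{\beta^{\alpha}}{\varphi_{\sigma,\beta,\theta}(x)^{\alpha+1}}}\; \frac{\theta \sigma^{\theta}}{x^{\theta+1}\, \psi_{\sigma,\theta}(x)}\; \frac{K_{\alpha+1}\left(\sqrt{\Gamma\, \varphi_{\sigma,\beta,\theta}(x)}\right)}{K_{\alpha}\left(\sqrt{\Gamma \beta}\right)}, \quad x > \sigma. \tag{10}$$
From (9), the pdf results in
$$f(x) = \frac{\alpha\, \theta \sigma^{\theta}}{x^{\theta+1}\, \beta\, \psi_{\sigma,\theta}(x)} \left(\frac{\beta}{\varphi_{\sigma,\beta,\theta}(x)}\right)^{\alpha/2+1}, \quad x > \sigma.$$
It is known that $K_{1/2}(z) = K_{-1/2}(z) = \sqrt{\pi/(2z)}\, \exp(-z)$. Then, the pdf of the SIG distribution can be written as
$$f(x) = \sqrt{\frac{\Gamma}{\varphi_{\sigma,\beta,\theta}(x)}}\; \frac{\theta \sigma^{\theta}}{x^{\theta+1}\, \psi_{\sigma,\theta}(x)}\, \exp\left\{\sqrt{\Gamma \beta} - \sqrt{\Gamma\, \varphi_{\sigma,\beta,\theta}(x)}\right\}, \quad x > \sigma.$$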
In Figure 1, some graphs of the pdf (10) are shown for special cases of the parameters. As can be noted, all the cases shown in the graph include a large tail.
In financial economics and risk theory, the accurate calculation of risk capital is an issue of crucial interest to researchers, regulators of financial institutions, and commercial vendors of financial products and services. In this sense, the Value-at-Risk (VaR), which is defined as the amount of capital required to ensure that the insurer does not become insolvent with a high degree of certainty, is an important risk measure. For a random variable $X$ that follows the Stoppa distribution (S) or one of the mixtures proposed here, the VaR at level $q$, i.e., the $q$-quantile, $0 < q < 1$, is given by
$$\begin{aligned} \text{S}: \quad & \text{VaR}[X; q] = \sigma \left(1 - q^{1/\lambda}\right)^{-1/\theta}, \\ \text{SG}: \quad & \text{VaR}[X; q] = \sigma \left[1 - \exp\left\{\frac{\beta}{2}\left(1 - q^{-2/\alpha}\right)\right\}\right]^{-1/\theta}, \\ \text{SIG}: \quad & \text{VaR}[X; q] = \sigma \left[1 - \exp\left\{\frac{\beta}{2} - \frac{1}{2\Gamma}\left(\sqrt{\Gamma \beta} - \log q\right)^{2}\right\}\right]^{-1/\theta}. \end{aligned}$$
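The three VaR expressions can be checked against the corresponding cdfs, since the cdf evaluated at the $q$-quantile must return $q$. The Python sketch below implements the closed forms for the S, SG and SIG models together with the SG and SIG cdfs. Function names are hypothetical, `gam` stands for the parameter $\Gamma$, and the parameter values in the round trips are arbitrary.

```python
import math


def psi(x, sigma, theta):
    """Classical Pareto cdf, for x > sigma."""
    return 1.0 - (sigma / x) ** theta


def phi(x, sigma, beta, theta):
    """phi(x) = beta - 2 log psi(x)."""
    return beta - 2.0 * math.log(psi(x, sigma, theta))


def sg_cdf(x, sigma, theta, alpha, beta):
    """SG cdf: (beta / phi(x))^(alpha / 2)."""
    return (beta / phi(x, sigma, beta, theta)) ** (alpha / 2.0)


def sig_cdf(x, sigma, theta, beta, gam):
    """SIG cdf: exp(sqrt(gam * beta) - sqrt(gam * phi(x)))."""
    return math.exp(math.sqrt(gam * beta) - math.sqrt(gam * phi(x, sigma, beta, theta)))


def var_stoppa(q, sigma, theta, lam):
    return sigma * (1.0 - q ** (1.0 / lam)) ** (-1.0 / theta)


def var_sg(q, sigma, theta, alpha, beta):
    log_psi = 0.5 * beta * (1.0 - q ** (-2.0 / alpha))   # log psi at the quantile
    return sigma * (1.0 - math.exp(log_psi)) ** (-1.0 / theta)


def var_sig(q, sigma, theta, beta, gam):
    log_psi = 0.5 * beta - (math.sqrt(gam * beta) - math.log(q)) ** 2 / (2.0 * gam)
    return sigma * (1.0 - math.exp(log_psi)) ** (-1.0 / theta)


# Round trips: cdf(VaR(q)) should return q.
q_sg = sg_cdf(var_sg(0.9, 1.0, 2.0, 3.0, 1.5), 1.0, 2.0, 3.0, 1.5)
q_sig = sig_cdf(var_sig(0.99, 1.0, 2.0, 1.0, 1.0), 1.0, 2.0, 1.0, 1.0)
```

Each formula is obtained by solving $F(x) = q$ first for $\log \psi_{\sigma,\theta}(x)$ and then for $x$, which is why the code computes `log_psi` as an intermediate quantity.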

4.1. Stochastic Ordering

Our goal now is to investigate the stochastic order of the distribution in (4) and its mixtures. Stochastic ordering has been widely studied in many different areas such as economics, operations research, reliability, and statistics (e.g., survival analysis), among other fields. Also, comparing random variables is vital in risk theory. Numerous ordering concepts have been used in the statistical literature, e.g., the usual stochastic order, the hazard rate order, the reversed hazard rate order, and the likelihood ratio order, among others. For a comprehensive treatment of stochastic ordering, the reader is referred to Shaked and Shanthikumar (2007) and Yu (2009), among others. In this work, we use the stochastic dominance concept provided in Dhaene et al. (2006). We recall here the definition given in that work.
Definition 3.
Let $X_1$ and $X_2$ be continuous random variables with cdf $F_{X_1}(x)$ and $F_{X_2}(x)$, respectively. Then, $X_1$ is said to be smaller than $X_2$ in the stochastic dominance sense (written as $X_1 \le_{st} X_2$) if $F_{X_1}(x) \ge F_{X_2}(x)$ for all $x \in \mathbb{R}$.
In the following result, it is shown that the stochastic dominance can be expressed in terms of the Value-at-Risk.
Proposition 2.
Let $X_1$ and $X_2$ be continuous random variables with cdf $F_{X_1}(x)$ and $F_{X_2}(x)$, respectively. We have that $X_1$ is smaller than $X_2$ in the stochastic dominance sense if and only if their respective quantiles are ordered:
$$X_1 \le_{st} X_2 \iff \text{VaR}[X_1; q] \le \text{VaR}[X_2; q], \quad \text{for all } q \in (0, 1).$$
Proof. 
See Dhaene et al. (2006).    □
Proposition 3.
Let $X_1 \sim S(\sigma_1, \theta_1, \lambda_1)$ and $X_2 \sim S(\sigma_2, \theta_2, \lambda_2)$. If $\lambda_1 < \lambda_2$, $\theta_1 > \theta_2$ and $\sigma_1 < \sigma_2$, then $X_1 \le_{st} X_2$.
Proof. 
If $\lambda_1 < \lambda_2$, $\theta_1 > \theta_2$ and $\sigma_1 < \sigma_2$, we have that
$$\text{VaR}[X_1; q] = \sigma_1 \left(1 - q^{1/\lambda_1}\right)^{-1/\theta_1} \le \text{VaR}[X_2; q] = \sigma_2 \left(1 - q^{1/\lambda_2}\right)^{-1/\theta_2},$$
and therefore, by Proposition 2, we have that $X_1 \le_{st} X_2$.    □
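The ordering of quantiles used in this proof can be spot-checked numerically. The Python sketch below compares the Stoppa VaR of two parameter sets over a grid of probability levels. It is only a numerical illustration on a finite grid, not a substitute for the proof, and the function names, grid size and parameter values are assumptions.

```python
def var_stoppa(q, sigma, theta, lam):
    """Stoppa q-quantile: sigma * (1 - q^(1/lam))^(-1/theta)."""
    return sigma * (1.0 - q ** (1.0 / lam)) ** (-1.0 / theta)


def quantiles_ordered(params1, params2, grid_size=999):
    """True if VaR[X1; q] <= VaR[X2; q] at every q on an equally spaced grid in (0, 1).

    Each params tuple is (sigma, theta, lam).
    """
    levels = [k / (grid_size + 1) for k in range(1, grid_size + 1)]
    return all(
        var_stoppa(q, *params1) <= var_stoppa(q, *params2) for q in levels
    )


# sigma_1 < sigma_2, theta_1 > theta_2, lam_1 < lam_2, as in the proposition.
smaller = (1.0, 3.0, 1.0)
larger = (1.5, 2.0, 2.0)
ordered = quantiles_ordered(smaller, larger)
reverse = quantiles_ordered(larger, smaller)
```

With these values `ordered` is expected to hold at every grid point while the reversed comparison fails, which matches the direction of the dominance stated above.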
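Proposition 4.
Let $X_1 \sim SG(\sigma_1, \theta_1, \alpha_1, \beta_1)$ and $X_2 \sim SG(\sigma_2, \theta_2, \alpha_2, \beta_2)$. If $\sigma_1 < \sigma_2$, $\theta_1 > \theta_2$, $\alpha_1 < \alpha_2$, and $\beta_1 > \beta_2$, then $X_1 \le_{st} X_2$.
Proof. 
If $\sigma_1 < \sigma_2$, $\theta_1 > \theta_2$, $\alpha_1 < \alpha_2$, and $\beta_1 > \beta_2$, we have that
$$\text{VaR}[X_1; q] = \sigma_1 \left[1 - \exp\left\{\frac{\beta_1}{2}\left(1 - q^{-2/\alpha_1}\right)\right\}\right]^{-1/\theta_1} \le \text{VaR}[X_2; q] = \sigma_2 \left[1 - \exp\left\{\frac{\beta_2}{2}\left(1 - q^{-2/\alpha_2}\right)\right\}\right]^{-1/\theta_2},$$
and therefore, by Proposition 2, we have that $X_1 \le_{st} X_2$.    □
It is easy to see that, if $X_1 \sim S(\sigma, \theta_1, \lambda)$ and $X_2 \sim S(\sigma, \theta_2, \lambda)$ with $\theta_1 < \theta_2$, then $\bar{F}_{X_1}(x) \ge \bar{F}_{X_2}(x)$ for all $x$, i.e., $X_2 \le_{st} X_1$. Furthermore, if $X_1 \sim S(\sigma, \theta, \lambda_1)$ and $X_2 \sim S(\sigma, \theta, \lambda_2)$ with $\lambda_1 < \lambda_2$, then $\bar{F}_{X_2}(x) \ge \bar{F}_{X_1}(x)$ for all $x$, i.e., $X_1 \le_{st} X_2$. Now, we have the following result.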
Theorem 1.
Let $X_1$ and $X_2$ be two mixture random variables with cdfs $F_1(x; \sigma, \theta_1, \omega)$ and $F_2(x; \sigma, \theta_2, \omega)$ obtained from (3). If $\theta_1 \ge \theta_2$, then $X_1 \le_{st} X_2$.
Proof. 
The result follows by applying Theorem 1.C.17 in Shaked and Shanthikumar (2007).   □
More effort will be necessary to get this quantity for the SGIG distribution.

4.2. Estimation

In this section, we show how to estimate the parameters of the mixture distributions via maximum likelihood. Let $x_1, x_2, \dots, x_n$ be a random sample drawn from the distribution of interest (10) and assume that $\sigma = \min\{x_i\}$, $i = 1, \dots, n$. The log-likelihood function, obtained from (10), is given by
$$\ell(\Omega; \tilde{x}) = n\left[\theta \log \sigma + \log \theta + \frac{\alpha}{2} \log \beta + \frac{1}{2} \log \Gamma - \log K_{\alpha}\left(\sqrt{\Gamma \beta}\right)\right] - \frac{\alpha+1}{2} \sum_{i=1}^{n} \log \varphi_{\sigma,\beta,\theta}(x_i) - (\theta+1) \sum_{i=1}^{n} \log x_i - \sum_{i=1}^{n} \log \psi_{\sigma,\theta}(x_i) + \sum_{i=1}^{n} \log K_{\alpha+1}\left(\sqrt{\Gamma\, \varphi_{\sigma,\beta,\theta}(x_i)}\right),$$
where $\Omega = (\alpha, \beta, \theta, \Gamma)$ is the vector of parameters to be estimated.
Although, in practice, the modified Bessel function of the third kind is implemented in most statistical packages (note that this is not the case for the econometric software WinRATS), it is convenient to express this function in terms of the modified Bessel function of the first kind, $I_{\nu}(z)$. The relationship between both functions (see Johnson et al. 2005, p. 19) is given by
$$K_{\nu}(z) = \frac{\pi \csc(\pi \nu)}{2} \left[I_{-\nu}(z) - I_{\nu}(z)\right],$$
where
$$I_{\nu}(z) = \sum_{j=0}^{\infty} \frac{1}{j!\, \Gamma(j + \nu + 1)} \left(\frac{z}{2}\right)^{2j+\nu}$$
is the modified Bessel function of the first kind. The advantage of this function is that it is written as a series representation that facilitates (by truncation) the estimation procedure of the parameters of the model. More details about the maximum likelihood estimation are given in Appendix A.
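The truncated-series route can be sketched in a few lines of Python using only the standard library. The number of terms kept (`terms=60`) and the function names are assumptions made here. The relationship between $K_{\nu}$ and $I_{\pm\nu}$ holds as written only for non-integer $\nu$ (integer orders require a limit), so the sketch rejects integer orders. Because $I_{-\nu}(z)$ and $I_{\nu}(z)$ both grow with $z$ while their difference decays, the subtraction loses accuracy for large arguments; the sketch is intended for moderate $z$ only.

```python
import math


def bessel_i(nu, z, terms=60):
    """Truncated power series of the modified Bessel function of the first kind."""
    total = 0.0
    for j in range(terms):
        total += (z / 2.0) ** (2 * j + nu) / (
            math.factorial(j) * math.gamma(j + nu + 1.0)
        )
    return total


def bessel_k(nu, z, terms=60):
    """K_nu(z) from I_{-nu}(z) - I_nu(z); nu must not be an integer."""
    if float(nu).is_integer():
        raise ValueError("integer orders need a limiting argument, not handled here")
    scale = math.pi / (2.0 * math.sin(math.pi * nu))
    return scale * (bessel_i(-nu, z, terms) - bessel_i(nu, z, terms))


# Check against the closed form K_{1/2}(z) = sqrt(pi / (2 z)) * exp(-z).
z = 2.0
series_value = bessel_k(0.5, z)
closed_form = math.sqrt(math.pi / (2.0 * z)) * math.exp(-z)
```

The half-integer case provides a convenient check because the closed form is elementary, and it is also the case that arises for the inverse Gaussian mixing distribution.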

Conjugate Distribution

In this subsection, we will see that the GIG distribution is a conjugate distribution with respect to the Stoppa distribution.
Theorem 2.
Let $X \mid \Lambda = \lambda \sim S(\sigma, \theta, \lambda)$ and suppose that $\Lambda \sim GIG(\alpha, \beta, \Gamma)$. Then the posterior distribution of $\lambda$, given $n$ years of individual independent experience $\tilde{x} = (x_1, \dots, x_n)$, is again a $GIG(\alpha^{*}, \beta^{*}, \Gamma^{*})$, where the updated parameters are given by
$$\alpha^{*} = \alpha + n, \qquad \beta^{*} = \beta - 2 \sum_{i=1}^{n} \log \psi_{\sigma,\theta}(x_i), \tag{11}$$
$$\Gamma^{*} = \Gamma. \tag{12}$$
Proof. 
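The pdf of the Stoppa distribution, obtained from (4), is given by
$$f(x) = \lambda\, \frac{\psi'_{\sigma,\theta}(x)}{\psi_{\sigma,\theta}(x)}\, \exp\left[\lambda \log \psi_{\sigma,\theta}(x)\right]. \tag{13}$$
Thus, given the sample information $\tilde{x}$, the likelihood, taken from (13), is proportional to
$$\lambda^{n} \exp\left\{\lambda \sum_{i=1}^{n} \log \psi_{\sigma,\theta}(x_i)\right\}.$$
Therefore, by applying Bayes' theorem, we have that the posterior is proportional to
$$\lambda^{\alpha+n-1} \exp\left\{-\frac{1}{2}\left[\left(\beta - 2 \sum_{i=1}^{n} \log \psi_{\sigma,\theta}(x_i)\right) \lambda + \frac{\Gamma}{\lambda}\right]\right\},$$
which is a GIG distribution with parameters (11) and (12).    □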
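The conjugate update amounts to simple bookkeeping on the prior parameters. A minimal Python sketch follows, where the function name is hypothetical, `gam` stands for the parameter $\Gamma$, and $\sigma$ and $\theta$ are treated as known, which is an assumption needed for the update to apply as stated.

```python
import math


def gig_posterior(alpha, beta, gam, losses, sigma, theta):
    """Posterior GIG parameters after observing losses, each larger than sigma.

    alpha* = alpha + n, beta* = beta - 2 * sum(log psi(x_i)), gam* = gam.
    """
    if any(x <= sigma for x in losses):
        raise ValueError("every loss must exceed sigma")
    log_psi_sum = sum(math.log(1.0 - (sigma / x) ** theta) for x in losses)
    return alpha + len(losses), beta - 2.0 * log_psi_sum, gam


# Two observed losses with sigma = 1 and theta = 1, so psi equals 0.5 and 0.75.
post = gig_posterior(0.5, 1.0, 2.0, [2.0, 4.0], sigma=1.0, theta=1.0)
```

Since $\log \psi_{\sigma,\theta}(x_i) < 0$ for every observation, the updated $\beta^{*}$ is always larger than the prior $\beta$.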
Observe that, if $X \sim S(\sigma, \theta, \lambda)$ and $\Lambda$ follows a Gamma distribution with parameters $\alpha > 0$ and $\beta > 0$, then, using the previous result, the posterior mean of the parameter $\lambda$ given the sample information $\tilde{x}$ is
$$E(\Lambda \mid \tilde{x}) = \frac{\alpha + n}{\beta - \sum_{i=1}^{n} \log \psi_{\sigma,\theta}(x_i)}.$$
Consequently, this Bayesian expression takes the form
$$E(\Lambda \mid \tilde{x}) = (1 - z_n)\, r(\tilde{x}) + z_n\, E(\Lambda), \tag{14}$$
where $r(\tilde{x}) = -n / \sum_{i=1}^{n} \log \psi_{\sigma,\theta}(x_i)$, $E(\Lambda) = \alpha/\beta$ is the prior mean, and
$$z_n = \frac{\beta}{\beta - \sum_{i=1}^{n} \log \psi_{\sigma,\theta}(x_i)}.$$
Since $E(X \mid \lambda) = \lambda$ does not hold, it is not guaranteed that the mean of the predictive distribution coincides with the posterior mean, i.e., in general $E(X_{n+1} \mid \tilde{x}) \ne E(\Lambda \mid \tilde{x})$. Therefore, although expression (14) resembles the credibility formula widely used in actuarial statistics with credibility factor $z_n$, it is not a credibility premium, since the purpose of a credibility premium is to predict the expected aggregate claim size in the next period of time.
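The decomposition of the posterior mean into a weighted average can be verified numerically. The Python sketch below assumes the Gamma prior is parametrised by shape $\alpha$ and rate $\beta$, so that the prior mean is $\alpha/\beta$; this reading is an assumption, made because it is the one consistent with the posterior mean displayed above. The function names and the data are hypothetical.

```python
import math


def log_psi_sum(losses, sigma, theta):
    """Sum of log psi(x_i), which is negative when every x_i exceeds sigma."""
    return sum(math.log(1.0 - (sigma / x) ** theta) for x in losses)


def posterior_mean(alpha, beta, losses, sigma, theta):
    """Direct formula: (alpha + n) / (beta - sum of log psi)."""
    s = log_psi_sum(losses, sigma, theta)
    return (alpha + len(losses)) / (beta - s)


def credibility_form(alpha, beta, losses, sigma, theta):
    """Weighted average (1 - z_n) * r + z_n * prior mean."""
    s = log_psi_sum(losses, sigma, theta)
    z_n = beta / (beta - s)          # weight given to the prior mean
    r = -len(losses) / s             # sample-based term
    return (1.0 - z_n) * r + z_n * alpha / beta


losses = [2.0, 3.5, 7.0, 12.0]
direct = posterior_mean(2.0, 1.5, losses, sigma=1.0, theta=1.2)
weighted = credibility_form(2.0, 1.5, losses, sigma=1.0, theta=1.2)
```

The weight $z_n$ lies in $(0, 1)$ and decreases as more observations are added, so the sample-based term dominates for long claim histories.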

4.3. Novel Heavy-Tail Regression Models

The mean of the Stoppa distribution is given by the expression:
$$E(X) = \lambda\, \sigma\, B\left(1 - \frac{1}{\theta},\, \lambda\right),$$
where $B(a, b)$ is the complete beta function defined as $\int_{0}^{1} z^{a-1} (1 - z)^{b-1}\, dz$ with $a > 0$ and $b > 0$. Therefore, the mean of the mixture Stoppa distribution can be obtained by compounding using the expression,
$$E(X) = E_{\Lambda}\left[E(X \mid \Lambda)\right] = \sigma\, E_{\Lambda}\left[\Lambda\, B\left(1 - \frac{1}{\theta},\, \Lambda\right)\right].$$
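The Stoppa mean is straightforward to evaluate with the gamma function. In the Python sketch below the function names are hypothetical. The requirement $\theta > 1$ is imposed explicitly because the first argument of the beta function, $1 - 1/\theta$, must be positive for the mean to be finite.

```python
import math


def beta_fn(a, b):
    """Complete beta function B(a, b) written through the gamma function."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def stoppa_mean(sigma, theta, lam):
    """E(X) = lam * sigma * B(1 - 1/theta, lam), finite only for theta > 1."""
    if theta <= 1.0:
        raise ValueError("the mean is finite only for theta > 1")
    return lam * sigma * beta_fn(1.0 - 1.0 / theta, lam)


# With lam = 1 the formula reduces to the Pareto mean sigma * theta / (theta - 1).
pareto_case = stoppa_mean(2.0, 3.0, 1.0)
```

For `lam = 1` the expression collapses to the classical Pareto mean $\sigma\theta/(\theta - 1)$, which gives a quick consistency check.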
Then, given a sample $\tilde{x} = (x_1, \dots, x_n)$, the new heavy-tailed regression models can be derived by writing the scale parameter of each model as
$$\sigma_i = \kappa\, \frac{\exp\left(\boldsymbol{\beta}^{\top} z_i\right)}{1 + \exp\left(\boldsymbol{\beta}^{\top} z_i\right)}, \quad i = 1, \dots, n,$$
so that $0 < \sigma_i < \kappa$. Here, $z_i = (z_{i1}, \dots, z_{ip})^{\top}$ is a vector of explanatory variables and $\boldsymbol{\beta} = (\beta_1, \dots, \beta_p)^{\top}$ is a vector of regression coefficients to be estimated. Note that, in the latter expression, $\kappa$ does not depend on the subscript $i$. In the practical implementation, this parameter was chosen by a grid search over manually specified values in the interval $(0, \min\{x_1, \dots, x_n\})$.
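The link between covariates and the scale parameter is a logistic function scaled by $\kappa$. A minimal Python sketch follows, in which the function name, the covariate values and the coefficients are all hypothetical. The two algebraically equivalent branches are used only to avoid overflow in the exponential for linear predictors of large magnitude.

```python
import math


def scale_from_covariates(z, coeffs, kappa):
    """sigma_i = kappa * exp(eta) / (1 + exp(eta)), with eta the linear predictor."""
    eta = sum(b * v for b, v in zip(coeffs, z))
    if eta >= 0.0:
        return kappa / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return kappa * e / (1.0 + e)


kappa = 3.0
coeffs = [0.4, -1.2]
sigma_a = scale_from_covariates([1.0, 0.5], coeffs, kappa)   # eta = -0.2
sigma_b = scale_from_covariates([1.0, 0.0], coeffs, kappa)   # eta = 0.4
```

Because the logistic factor lies strictly between 0 and 1 for finite predictors, choosing $\kappa$ below the sample minimum keeps every observation inside the support of its own distribution.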

4.3.1. SG Case

In this model, the log-likelihood function is given by
$$\ell(\Omega; \tilde{x}) = n\left[\log \theta + \log \alpha + \frac{\alpha}{2} \log \beta\right] + \theta \sum_{i=1}^{n} \log \sigma_i - \left(\frac{\alpha}{2} + 1\right) \sum_{i=1}^{n} \log \varphi_i - (\theta + 1) \sum_{i=1}^{n} \log x_i - \sum_{i=1}^{n} \log \psi_i, \tag{15}$$
where $\Omega = (\alpha, \beta, \theta, \boldsymbol{\beta})$ and, for brevity, $\psi_i = \psi_{\sigma_i,\theta}(x_i)$ and $\varphi_i = \varphi_{\sigma_i,\beta,\theta}(x_i)$.
The maximum likelihood estimates of the SG distribution are obtained by solving the system of equations given by
$$\begin{aligned} \frac{n \alpha}{2 \beta} - \left(\frac{\alpha}{2} + 1\right) \sum_{i=1}^{n} \frac{1}{\varphi_i} &= 0, \\ \frac{n}{\theta} + \sum_{i=1}^{n} \log \sigma_i + \sum_{i=1}^{n} \frac{(\sigma_i/x_i)^{\theta}}{\psi_i} \log\left(\frac{\sigma_i}{x_i}\right) \left[1 - \frac{\alpha + 2}{\varphi_i}\right] - \sum_{i=1}^{n} \log x_i &= 0, \\ n\left(\frac{1}{\alpha} + \frac{\log \beta}{2}\right) - \frac{1}{2} \sum_{i=1}^{n} \log \varphi_i &= 0, \\ \sum_{i=1}^{n} \frac{\theta\, z_{ij}}{1 + \exp\left(\boldsymbol{\beta}^{\top} z_i\right)} \left[1 + \frac{(\sigma_i/x_i)^{\theta}}{\psi_i} \left(1 - \frac{\alpha + 2}{\varphi_i}\right)\right] &= 0, \end{aligned}$$
where $j = 1, \dots, p$.
From these equations, the entries of the observed information matrix can be derived after tedious algebra (not reproduced here) by differentiating them with respect to the $p + 3$ parameters. The function (15) can be maximised by trying several starting values as seed points. Of course, reaching the global maximum is not guaranteed, since it is difficult to show that the log-likelihood function is concave. We have used different maximum search methods that are available in the MAXIMIZE built-in function of the WinRATS software package, using the BFGS algorithm.

4.3.2. SIG Case

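When the mixing distribution is the inverse Gaussian, the log-likelihood function is given by
$$\ell(\Omega; \tilde{x}) = n\left[\log \theta + \frac{1}{2} \log \Gamma + \sqrt{\Gamma \beta}\right] + \theta \sum_{i=1}^{n} \log \sigma_i - (\theta + 1) \sum_{i=1}^{n} \log x_i - \frac{1}{2} \sum_{i=1}^{n} \log \varphi_i - \sum_{i=1}^{n} \log \psi_i - \sqrt{\Gamma} \sum_{i=1}^{n} \sqrt{\varphi_i},$$
where $\Omega = (\beta, \theta, \Gamma, \boldsymbol{\beta})$ and, for brevity, $\psi_i = \psi_{\sigma_i,\theta}(x_i)$ and $\varphi_i = \varphi_{\sigma_i,\beta,\theta}(x_i)$.
The normal equations are given, in this case, by
$$\begin{aligned} n \sqrt{\frac{\Gamma}{\beta}} - \sum_{i=1}^{n} \frac{1}{\varphi_i} - \sqrt{\Gamma} \sum_{i=1}^{n} \varphi_i^{-1/2} &= 0, \\ \frac{n}{\theta} + \sum_{i=1}^{n} \log \sigma_i + \sum_{i=1}^{n} \left(\frac{\sigma_i}{x_i}\right)^{\theta} \log\left(\frac{\sigma_i}{x_i}\right) \frac{1}{\psi_i} \left[1 - \frac{1}{\varphi_i} - \sqrt{\frac{\Gamma}{\varphi_i}}\right] - \sum_{i=1}^{n} \log x_i &= 0, \\ n\left(\frac{1}{\Gamma} + \sqrt{\frac{\beta}{\Gamma}}\right) - \sum_{i=1}^{n} \sqrt{\frac{\varphi_i}{\Gamma}} &= 0, \\ \sum_{i=1}^{n} \frac{\theta\, z_{ij}}{1 + \exp\left(\boldsymbol{\beta}^{\top} z_i\right)} \left[1 + \frac{1}{\psi_i} \left(\frac{\sigma_i}{x_i}\right)^{\theta} \left(1 - \frac{1}{\varphi_i} - \sqrt{\frac{\Gamma}{\varphi_i}}\right)\right] &= 0, \end{aligned}$$
where $j = 1, \dots, p$.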
The standard errors of the estimates can be computed by following the same approach as above.

5. Numerical Experiments

In this section, the performance of the distributions introduced in this work is assessed by employing three different sets of data. The first one is danishuni, which can be downloaded from the R package CASdatasets and also from the Extreme Value Statistics in S-plus libraries. It was collected at Copenhagen Reinsurance and comprises 2157 fire losses over DKK 1,000,000 during the period 1980 to 1990, adjusted for inflation to reflect 1985 values. A detailed statistical analysis of this dataset can be found in McNeil (1988), in Albrecher et al. (2017), and also in Embrechts et al. (1997). The second dataset is norfire, which is also available in the R package CASdatasets, and includes 9181 fire losses over the period 1972 to 1992 from an unknown Norwegian insurer. A priority of NKR 500,000 (if this amount is exceeded, the reinsurer becomes liable to pay) was applied to derive this set of data.
The estimates of the parameters and their corresponding p-values (given in brackets), the negative of the maximum of the log-likelihood function, and the AIC for the two aforementioned sets of data are shown in Table 2 for the Stoppa distribution and for the three mixture models previously considered, i.e., SG, SIG, and SGIG. For all the models and the two datasets, the parameter $\sigma$ was chosen by a grid search over manually specified values in the interval $(0, \min\{x_1, \dots, x_n\})$. The models are compared using the following information criteria: the negative log-likelihood (NLL), i.e., the negative of the log-likelihood evaluated at the maximum likelihood estimates, and Akaike's information criterion (AIC), computed as twice the NLL plus twice the number of estimated parameters. Moreover, we also report the Kolmogorov–Smirnov (KS) and Anderson–Darling (AD) tests to assess the fit of the model to the empirical data in terms of the cdf. Smaller values of these test statistics indicate a better fit of the model to the empirical data. Note that they not only provide a way to measure the fit in terms of the cdfs, but also allow us to perform hypothesis testing for model validation purposes: an extremely small p-value might lead to a confident rejection of the null hypothesis that the data come from the given model. It can be seen that the SG distribution provides the best fit for the Danish dataset, whereas the SGIG returns the lowest values of NLL and AIC for the Norwegian dataset. For the former dataset, none of the models is rejected by the KS and AD tests; however, both tests reject the Stoppa distribution for the Norwegian dataset.
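The model-comparison quantities described above are simple to compute once a fitted cdf is available. The Python sketch below shows the AIC and the Kolmogorov–Smirnov distance. The function names and the toy data are hypothetical, and the sketch returns only the test statistic, not the p-value.

```python
def aic(nll, n_params):
    """Akaike's information criterion: twice the NLL plus twice the parameter count."""
    return 2.0 * nll + 2.0 * n_params


def ks_statistic(data, cdf):
    """Largest distance between the empirical cdf and a fitted cdf.

    Both one-sided gaps are checked at each sorted observation, since the
    empirical cdf jumps there.
    """
    xs = sorted(data)
    n = len(xs)
    distance = 0.0
    for i, x in enumerate(xs, start=1):
        fitted = cdf(x)
        distance = max(distance, i / n - fitted, fitted - (i - 1) / n)
    return distance


# Toy check against the uniform cdf on (0, 1).
d_uniform = ks_statistic([0.25, 0.5, 0.75], lambda x: x)
```

In practice `cdf` would be one of the fitted models evaluated at the maximum likelihood estimates, so the same helper can be reused across the S, SG, SIG and SGIG fits.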
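For comparison purposes, we have also fitted shifted versions of the lognormal, Weibull, and Burr distributions to the Danish and Norwegian datasets, with the following densities:
$$\begin{aligned} f(x) &= \frac{1}{\lambda (x - \sigma) \sqrt{2\pi}} \exp\left\{-\frac{1}{2\lambda^{2}} \left(\log(x - \sigma) - \theta\right)^{2}\right\}, & x > \sigma,\ \theta \in \mathbb{R},\ \lambda > 0, \\ f(x) &= \frac{\theta}{\lambda} \left(\frac{x - \sigma}{\lambda}\right)^{\theta - 1} \exp\left\{-\left(\frac{x - \sigma}{\lambda}\right)^{\theta}\right\}, & x > \sigma,\ \lambda > 0,\ \theta > 0, \\ f(x) &= \frac{\lambda \theta (x - \sigma)^{\theta - 1}}{\left(1 + (x - \sigma)^{\theta}\right)^{\lambda + 1}}, & x > \sigma,\ \lambda > 0,\ \theta > 0, \end{aligned}$$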
respectively. For all these distributions, the parameter σ was estimated by using the method explained above. A comparison of these models with the other heavy-tailed distributions used to model the Danish and the Norwegian datasets can be found in Gómez-Déniz et al. (2022) and Gómez-Déniz and Calderín-Ojeda (2015), respectively. This catalogue of heavy-tailed distributions includes shifted versions of the lognormal, inverse Gaussian, and generalised Gamma, and also inverse Gamma, log Gamma, and Fréchet and Pareto ArcTan, among other models.
The estimation of the parameters for all the distributions used in this work has been completed using the method of maximum likelihood by using Mathematica® v.12.0 and has also been verified via WinRATS v.7.0. The codes are available from the authors upon request. Standard errors of the estimates were obtained by finite differentiation. The WinRATS v.7.0 software package also gives the option to directly calculate the maximum of the log-likelihood returning the entries of the Fisher information matrix. The parameters can also be estimated via an EM algorithm as shown in the Appendix A.
These results are confirmed by Figure 2, which shows the histograms of both datasets (Danish left, Norwegian right) with the fitted densities superimposed. Similarly, Figure 3 plots the smooth cdf of the empirical data together with the fitted cdfs of all the distributions previously considered. It can be observed that the SG (dashed line) and SGIG (dotted-dashed line) distributions adhere closely to the Danish and Norwegian datasets, respectively.
To select a distribution that yields a feasible characterisation of the loss process for both sets of data, we should check that the theoretical limited expected values, calculated numerically by
$$
L(x) = E[\min(X, x)] = \int_{0}^{x} y \, dF(y) + x\,\bar{F}(x), \tag{16}
$$
adhere closely to the empirical ones. As is well known, (16) is the expected amount per claim retained by the insured on a policy with a fixed deductible of $x$. Here, the empirical limited expected value function was computed as $E_n(x) = \frac{1}{n}\sum_{i=1}^{n} \min(x_i, x)$. Obviously, as $x$ tends to infinity, $L(x)$ and $E_n(x)$ converge to $E(X)$ and the sample mean, respectively.
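The two quantities compared here can be sketched in a few lines of Python. The theoretical value below uses the identity $E[\min(X,x)] = \int_0^x \bar F(y)\,dy$, which holds for non-negative losses and is equivalent to the expression above after integration by parts. The trapezoidal rule, the function names, and the use of the Stoppa survival function as an example are assumptions of this sketch, not the authors' numerical procedure.

```python
import math

def empirical_lev(losses, limit):
    """E_n(x): sample mean of min(x_i, x)."""
    return sum(min(x, limit) for x in losses) / len(losses)

def stoppa_sf(y, sigma, theta, lam):
    """Survival function of the Stoppa distribution; equal to 1 for y <= sigma."""
    if y <= sigma:
        return 1.0
    return 1.0 - (1.0 - (sigma / y) ** theta) ** lam

def theoretical_lev(limit, sf, n_steps=20000):
    """L(x) as the integral of the survival function over [0, x] (trapezoidal rule)."""
    h = limit / n_steps
    total = 0.5 * (sf(0.0) + sf(limit))
    for j in range(1, n_steps):
        total += sf(j * h)
    return total * h
```

With $\lambda = 1$ the Stoppa model reduces to the Pareto distribution, for which $L(x)$ is available in closed form and can be used to check the integration.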
Table 3 and Table 4 display the limited expected value for several values of the policy limit $x$ for the Danish and Norwegian datasets, respectively. It is observed that the values obtained from the mixing distributions adhere closely to the empirical limited expected values, as compared to the Stoppa distribution, for both datasets.
The absolute errors between the empirical values and the fitted values in Table 3 and Table 4 are plotted in Figure 4. As can be seen, at least for the Danish data, the fit improves as the value of the deductible increases. No clear pattern is discernible for the Norwegian data.
The third dataset deals with automobile bodily injury claims using data from the Insurance Research Council (IRC), a division of the American Institute for Chartered Property Casualty Underwriters and the Insurance Institute of America. The data, collected in 2002, include demographic information about the claimants, attorney involvement, and economic losses (in thousands of USD), among other variables. As several of these covariates include missing observations, we will only use a sample of 1091 losses. This dataset can also be downloaded from the R package CASdatasets, see also Frees (2010). We consider as a response variable the claimant’s total economic loss. The empirical distribution of this variable combines losses of small, moderate, and large sizes, which makes it suitable for fitting heavy-tailed distributions. Other remarkable features of this set of data are unimodality, skewness, and a long right tail, showing a high likelihood of extremely expensive losses. Below, in Figure 5, we have plotted the histogram of this dataset. We have also superimposed the densities of the S, SG, and SIG distributions. The SIG distribution adheres closely to empirical data.
This dataset also includes the following covariates:
  • ATTORNEY takes the value 1 if the claimant is represented by an attorney and 0 otherwise;
  • CLMSEX takes the value 1 if the claimant is male and 0 otherwise;
  • MARRIED takes the value 1 if the claimant is married and 0 otherwise;
  • SINGLE takes the value 1 if the claimant is single and 0 otherwise;
  • WIDOWED takes the value 1 if the claimant is widowed and 0 otherwise;
  • CLMINSUR, whether or not the claimant’s vehicle was uninsured (=1 if yes and 0 otherwise);
  • SEATBELT, whether or not the claimant was wearing the seatbelt/child restraint (=1 if yes and 0 otherwise);
  • CLMAGE, claimant’s age.
Now, by using these covariates, we explain the total losses in terms of the set of explanatory variables through the Stoppa, SG, and SIG regression models. Table 5 shows, for the three regression models from left to right, the parameter estimates, standard errors (S.E.), and the corresponding p-values calculated from the Wald t-statistics. Furthermore, NLL and AIC values for each model are provided in the last two rows of the table. For the $i$th policyholder, the total amount $y_i$ follows the specified model, whose scale parameter $\sigma_i$ depends on the above set of covariates through the aforementioned link function. In view of their relatively low p-values, the estimates associated with the explanatory variables INTERCEPT, ATTORNEY, SEATBELT, and CLMAGE are statistically significant at the 5% significance level for the three models considered. In addition, the shape parameter $\theta$ is also statistically significant at the same nominal level for all the models discussed here. It is noted that the parameters of the mixing distribution are also significant at this nominal level. Finally, the SIG provides the best fit to this dataset in terms of the two model selection measures considered.
As the estimates of the parameter $\theta$ differ significantly across these models, we now estimate the value of the tail index, i.e., $1/\theta$, by the commonly used Hill's estimator. For this, we consider the $n$ losses $y_1, \dots, y_n$ and let $y_{(n)}, \dots, y_{(1)}$ be the reversed order statistics, so that $y_{(1)}$ is the highest value in the sample. Then, we construct the sets of numbers $H_k$ and $\theta_k$ for $k > 2$, defined by $H_k = \frac{1}{k}\sum_{i=1}^{k} \log y_{(i)} - \log y_{(k)}$ and $\theta_k = 1/H_k$. Then, $\theta_k$ is an estimate of $\theta$ when $k$ is increased until it seems inappropriate to proceed. Figure 6 presents the values of $\theta_k$ when $k$ takes values in $\{2, \dots, n\}$. Note that the values of the estimator stabilise in the neighbourhood of 1.1. These values can be compared with the estimates of the tail index $1/\theta$ for each of the three models given in Table 5. It would be interesting to study how the parameter estimates of these three models vary if we fix the tail index at the value given by Hill's estimator and then estimate the other parameters via maximum likelihood.
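The construction of $H_k$ and $\theta_k$ can be sketched as follows. This is a plain Python illustration with a hypothetical function name. It evaluates $\theta_k$ for $k = 2, \dots, n$, as plotted in Figure 6, and skips any $k$ for which $H_k = 0$ (which can only happen with ties).

```python
import math

def hill_estimates(losses):
    """Return {k: theta_k} with theta_k = 1 / H_k for k = 2, ..., n.

    H_k is the mean of log y_(1), ..., log y_(k) minus log y_(k),
    where y_(1) >= y_(2) >= ... are the losses in decreasing order.
    """
    ordered = sorted(losses, reverse=True)
    if ordered[-1] <= 0.0:
        raise ValueError("losses must be strictly positive")
    logs = [math.log(y) for y in ordered]
    estimates = {}
    running_sum = 0.0
    for k in range(1, len(logs) + 1):
        running_sum += logs[k - 1]
        h_k = running_sum / k - logs[k - 1]
        if k >= 2 and h_k > 0.0:
            estimates[k] = 1.0 / h_k
    return estimates
```

In practice one would plot `estimates[k]` against `k` and look for a range of `k` over which the values are stable.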

6. Conclusions

Although mixtures of discrete distributions have been widely considered in the statistical literature, mixtures of continuous distributions have been explored to a lesser extent in actuarial statistics. This paper considered properties and applications related to the mixture of probability distributions built by Lehmann's alternative method (the class of max-stable continuous distributions). In particular, a mixture involving the shape parameter of the Stoppa distribution was introduced in this article. The special case of the generalised inverse Gaussian was chosen as the mixing distribution and was thoroughly investigated in this work. Additionally, we provided properties related to the right tail of the distribution, which is of vital interest in risk theory. In addition, we examined the stochastic ordering of this new class of probability distributions. Also, a Bayesian analysis of this family was carried out, in which it was shown that the generalised inverse Gaussian distribution is conjugate with respect to the Stoppa distribution. Furthermore, for the case where the mixing distribution is Gamma, the mean of the posterior distribution can be written as a convex combination of a functional expression of the data and the mean of the prior distribution.
The parameters of these mixture models were estimated by the method of maximum likelihood via maximisation of the log-likelihood surface, and the models were empirically validated by using two well-known datasets in the actuarial literature. Moreover, two regression models based on this mixture were implemented for two particular cases of the mixing distribution: Gamma and inverse Gaussian distributions. These models were used to fit bodily injury claims data. Some covariates associated with insurance claimants were considered when building a modification of the logit link function for the scale parameter $\sigma$. Based on the results obtained, we find that the models proposed in this paper are a valid alternative to other parametric fat-tailed models in the literature.

Author Contributions

Conceptualisation, E.G.-D. and E.C.-O.; methodology, E.G.-D. and E.C.-O.; software, E.G.-D. and E.C.-O.; validation, E.G.-D. and E.C.-O.; formal analysis, E.G.-D. and E.C.-O.; investigation, E.G.-D. and E.C.-O.; writing—original draft preparation, E.G.-D. and E.C.-O.; writing—review and editing, E.G.-D. and E.C.-O.; visualisation, E.G.-D. and E.C.-O.; supervision, E.G.-D. and E.C.-O.; project administration, E.G.-D. and E.C.-O.; funding acquisition, E.G.-D. and E.C.-O. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially funded by grant PID2021-127989OB-I00 (Agencia Estatal de Investigación, Ministerio de Ciencia e Innovación, Spain).

Data Availability Statement

Publicly available datasets were analyzed in this study.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Appendix A.1. Parameters Estimation for SG and SIG Distributions

We provide here the log-likelihood function and the normal equations from which the parameters of the SG and SIG distributions can be estimated.

Appendix A.1.1. SG Case

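In this case, the log-likelihood function is proportional to
$$
\ell(\Omega; \tilde{x}) \propto n\left[\theta \log\sigma + \log\theta + \log\alpha + \frac{\alpha}{2}\log\beta\right] - \left(\frac{\alpha}{2}+1\right)\sum_{i=1}^{n} \log \varphi_{\sigma,\beta,\theta}(x_i) - (\theta+1)\sum_{i=1}^{n}\log x_i - \sum_{i=1}^{n} \log \psi_{\sigma,\theta}(x_i),
$$
where $\Omega = (\alpha, \beta, \theta)$. The ML estimates of the parameters of the SG distribution are obtained by solving the system of equations given by
$$
\begin{aligned}
&\frac{n\alpha}{2\beta} - \left(\frac{\alpha}{2}+1\right)\sum_{i=1}^{n} \frac{1}{\varphi_{\sigma,\beta,\theta}(x_i)} = 0,\\
&n\left(\frac{1}{\theta} + \log\sigma\right) - \sum_{i=1}^{n}\log x_i + \sum_{i=1}^{n} \frac{(\sigma/x_i)^{\theta}}{\psi_{\sigma,\theta}(x_i)} \log\left(\frac{\sigma}{x_i}\right)\left[1 - \frac{\alpha+2}{\varphi_{\sigma,\beta,\theta}(x_i)}\right] = 0,
\end{aligned}
$$
where
$$
\alpha = \left[\frac{1}{2}\left(\frac{1}{n}\sum_{i=1}^{n} \log \varphi_{\sigma,\beta,\theta}(x_i) - \log\beta\right)\right]^{-1}.
$$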
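The SG log-likelihood and the closed-form expression for $\alpha$ can be checked numerically with a short Python sketch. It assumes $\psi_{\sigma,\theta}(x) = 1-(\sigma/x)^{\theta}$ and $\varphi_{\sigma,\beta,\theta}(x) = \beta - 2\log\psi_{\sigma,\theta}(x)$, and a log-likelihood of the form $n[\theta\log\sigma+\log\theta+\log\alpha+\tfrac{\alpha}{2}\log\beta] - (\tfrac{\alpha}{2}+1)\sum\log\varphi - (\theta+1)\sum\log x_i - \sum\log\psi$. This is the reading under which the normal equations above are the score equations; the definitions in the main text take precedence. All function names are hypothetical.

```python
import math

def psi(x, sigma, theta):
    """Assumed form: psi = 1 - (sigma / x)^theta, for x > sigma."""
    return 1.0 - (sigma / x) ** theta

def phi(x, sigma, beta, theta):
    """Assumed form: phi = beta - 2 * log(psi)."""
    return beta - 2.0 * math.log(psi(x, sigma, theta))

def sg_loglik(xs, sigma, alpha, beta, theta):
    """SG log-likelihood under the assumptions stated in the lead-in."""
    n = len(xs)
    ll = n * (theta * math.log(sigma) + math.log(theta)
              + math.log(alpha) + 0.5 * alpha * math.log(beta))
    for x in xs:
        ll -= (0.5 * alpha + 1.0) * math.log(phi(x, sigma, beta, theta))
        ll -= (theta + 1.0) * math.log(x)
        ll -= math.log(psi(x, sigma, theta))
    return ll

def sg_cdf(x, sigma, alpha, beta, theta):
    """Mixture cdf implied by the same assumptions: (beta / phi)^(alpha / 2)."""
    return (beta / phi(x, sigma, beta, theta)) ** (0.5 * alpha)

def alpha_given(xs, sigma, beta, theta):
    """Closed-form alpha from the third normal equation, for fixed beta and theta."""
    mean_log_phi = sum(math.log(phi(x, sigma, beta, theta)) for x in xs) / len(xs)
    return 1.0 / (0.5 * (mean_log_phi - math.log(beta)))
```

Because $\alpha$ is available in closed form, a numerical optimiser only needs to search over $(\beta, \theta)$, with `alpha_given` substituted at each step.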

Appendix A.1.2. SIG Case

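$$
\ell(\Omega; \tilde{x}) \propto n\left[\theta\log\sigma + \log\theta + \frac{1}{2}\log\Gamma + \sqrt{\Gamma\beta}\right] - (\theta+1)\sum_{i=1}^{n}\log x_i - \frac{1}{2}\sum_{i=1}^{n}\log\varphi_{\sigma,\beta,\theta}(x_i) - \sum_{i=1}^{n}\log\psi_{\sigma,\theta}(x_i) - \sqrt{\Gamma}\sum_{i=1}^{n}\sqrt{\varphi_{\sigma,\beta,\theta}(x_i)},
$$
where $\Omega = (\beta, \theta, \Gamma)$.
The normal equations are given, in this case, by
$$
\begin{aligned}
&n\sqrt{\frac{\Gamma}{\beta}} - \sum_{i=1}^{n}\frac{1}{\varphi_{\sigma,\beta,\theta}(x_i)} - \sqrt{\Gamma}\sum_{i=1}^{n}\varphi_{\sigma,\beta,\theta}(x_i)^{-1/2} = 0,\\
&n\left(\frac{1}{\theta} + \log\sigma\right) + \sum_{i=1}^{n}\left(\sigma/x_i\right)^{\theta}\log\left(\sigma/x_i\right)\frac{1}{\psi_{\sigma,\theta}(x_i)}\left[1 - \frac{1}{\varphi_{\sigma,\beta,\theta}(x_i)} - \sqrt{\frac{\Gamma}{\varphi_{\sigma,\beta,\theta}(x_i)}}\right] - \sum_{i=1}^{n}\log x_i = 0,
\end{aligned}
$$
where
$$
\Gamma = \left[\frac{1}{n}\sum_{i=1}^{n}\sqrt{\varphi_{\sigma,\beta,\theta}(x_i)} - \sqrt{\beta}\right]^{-2}.
$$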
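A similar numerical check is possible for the SIG case. The sketch below again assumes $\psi_{\sigma,\theta}(x) = 1-(\sigma/x)^{\theta}$ and $\varphi_{\sigma,\beta,\theta}(x) = \beta - 2\log\psi_{\sigma,\theta}(x)$, together with a log-density of the form $\theta\log\sigma+\log\theta+\tfrac12\log\Gamma+\sqrt{\Gamma\beta}-(\theta+1)\log x-\tfrac12\log\varphi-\log\psi-\sqrt{\Gamma\varphi}$. The function names are hypothetical and `gam` stands for the parameter $\Gamma$.

```python
import math

def psi(x, sigma, theta):
    """Assumed form: psi = 1 - (sigma / x)^theta, for x > sigma."""
    return 1.0 - (sigma / x) ** theta

def phi(x, sigma, beta, theta):
    """Assumed form: phi = beta - 2 * log(psi)."""
    return beta - 2.0 * math.log(psi(x, sigma, theta))

def sig_logpdf(x, sigma, beta, theta, gam):
    """SIG log-density under the assumptions stated in the lead-in."""
    f = phi(x, sigma, beta, theta)
    return (theta * math.log(sigma) + math.log(theta) + 0.5 * math.log(gam)
            + math.sqrt(gam * beta) - (theta + 1.0) * math.log(x)
            - 0.5 * math.log(f) - math.log(psi(x, sigma, theta))
            - math.sqrt(gam * f))

def sig_cdf(x, sigma, beta, theta, gam):
    """Mixture cdf implied by the same assumptions."""
    return math.exp(math.sqrt(gam * beta) - math.sqrt(gam * phi(x, sigma, beta, theta)))

def gamma_given(xs, sigma, beta, theta):
    """Closed-form Gamma parameter for fixed beta and theta."""
    mean_sqrt_phi = sum(math.sqrt(phi(x, sigma, beta, theta)) for x in xs) / len(xs)
    return (mean_sqrt_phi - math.sqrt(beta)) ** (-2.0)
```

As in the SG case, substituting `gamma_given` reduces the numerical search to the pair $(\beta, \theta)$.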

Appendix A.2. EM Algorithm for the SG and SIG Regression Models

The estimation of the parameters in the mixture of Stoppa regression models can be carried out via an EM algorithm. When the model contains explanatory variables, the E-step computes the posterior expectations of the sufficient statistics, i.e., $\lambda_i$ and $\log\lambda_i$, while the M-step first inserts these posterior expectations in the first term on the right-hand side of (A1) to fit a Stoppa regression model and update the regression coefficients, and then updates the parameter of the mixing distribution, i.e., Gamma or inverse Gaussian.
Let us now consider the vector of complete data $(\tilde{x}, \tilde{\lambda})$, which contains the observed data $\tilde{x} = (x_1, \dots, x_n)$ and the missing data $\tilde{\lambda} = (\lambda_1, \dots, \lambda_n)$. By writing $\theta_i = \exp[\beta^{\top} z_i]$, $i = 1, \dots, n$, the complete data log-likelihood takes the form
$$
\ell_c(\beta, \alpha; \tilde{x}, \tilde{\lambda}) = \sum_{i=1}^{n} \log f(x_i \mid \lambda_i, \beta) + \sum_{i=1}^{n} \log f(\lambda_i \mid \omega), \tag{A1}
$$
where $z_i = (z_{i1}, \dots, z_{ip})$ is a vector of explanatory variables and $\beta = (\beta_1, \dots, \beta_p)$ is a vector of regression coefficients.

Appendix A.2.1. SG Regression Model

Here we consider the Gamma as the mixing distribution. We will assume that $E(\Lambda) = 1$, i.e., $\alpha = \beta$. In this case, the EM scheme is given as follows. From the current estimates after the $j$-th iteration, $(\hat{\beta}^{(j)}, \hat{\alpha}^{(j)})$, the new estimates are calculated as follows:
  • at the E-step, calculate the pseudo-values
    $$
    \begin{aligned}
    t_i &= E\left(\Lambda_i \mid x_i, \hat{\theta}_i, \hat{\alpha}^{(j)}\right) = \frac{\hat{\alpha}^{(j)} + 1}{\hat{\alpha}^{(j)} - \log \psi_{\sigma,\hat{\theta}_i^{(j)}}(x_i)},\\
    s_i &= E\left(\log\Lambda_i \mid x_i, \hat{\theta}_i, \hat{\alpha}^{(j)}\right) = \Psi(1 + \hat{\alpha}^{(j)}) - \log\hat{\alpha}^{(j)} - \log\left(1 - \frac{\log \psi_{\sigma,\hat{\theta}_i^{(j)}}(x_i)}{\hat{\alpha}^{(j)}}\right),
    \end{aligned}
    $$
    where $\Psi(\cdot)$ is the digamma function.
  • at the M-step, first update the regression coefficients $\hat{\beta}^{(j+1)}$ by fitting a Stoppa regression model, including the covariates in the shape parameter as described above and using the pseudo-values $t_i$ and $s_i$. Then update the estimate of the parameter $\alpha$ by letting
    $$
    \hat{\alpha}^{(j+1)} = \exp\left\{\frac{\sum_{i=1}^{n} t_i - \sum_{i=1}^{n} s_i - n}{n} + \Psi(\hat{\alpha}^{(j)})\right\}.
    $$
  • If some convergence condition is satisfied, then stop iterating; otherwise, move back to the E-step for another iteration.
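The E-step and the update of $\alpha$ in this scheme can be sketched in Python as follows. The sketch assumes a Gamma$(\alpha, \alpha)$ mixing density (unit mean) and $\psi_{\sigma,\theta}(x) = 1 - (\sigma/x)^{\theta}$, under which the posterior of $\Lambda_i$ is Gamma with shape $\alpha+1$ and rate $\alpha - \log\psi_{\sigma,\theta_i}(x_i)$. The digamma function is approximated by a standard recurrence plus asymptotic series, and the regression update for $\beta$ is omitted. All names are hypothetical.

```python
import math

def digamma(x):
    """Digamma for x > 0: upward recurrence, then an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series

def e_step_sg(psi_values, alpha):
    """Pseudo-values t_i = E[Lambda_i | x_i] and s_i = E[log Lambda_i | x_i]."""
    t = [(alpha + 1.0) / (alpha - math.log(p)) for p in psi_values]
    s = [digamma(alpha + 1.0) - math.log(alpha - math.log(p)) for p in psi_values]
    return t, s

def update_alpha(t, s, alpha):
    """One fixed-point update of alpha from the pseudo-values."""
    n = len(t)
    return math.exp((sum(t) - sum(s) - n) / n + digamma(alpha))
```

Here `psi_values` holds $\psi_{\sigma,\hat\theta_i}(x_i)$ evaluated at the current regression coefficients. When the data carry no information ($\psi = 1$), the update leaves $\alpha$ unchanged, which is a convenient sanity check.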

Appendix A.2.2. SIG Regression Model

In this case, the EM scheme is given as follows. From the current estimates after the $j$-th iteration, $(\hat{\beta}^{(j)}, \hat{\Gamma}^{(j)})$, the new estimates are calculated as follows:
  • at the E-step, calculate numerically the pseudo-values
    $$
    m_i = E\left(1/\Lambda_i \mid x_i, \hat{\theta}_i, \hat{\Gamma}^{(j)}\right), \quad t_i = E\left(\Lambda_i \mid x_i, \hat{\theta}_i, \hat{\Gamma}^{(j)}\right) \quad \text{and} \quad s_i = E\left(\log\Lambda_i \mid x_i, \hat{\theta}_i, \hat{\Gamma}^{(j)}\right).
    $$
  • at the M-step, first update the regression coefficients $\hat{\beta}^{(j+1)}$ by fitting a Stoppa regression model, including the covariates in the shape parameter as described above and using the pseudo-values $t_i$ and $s_i$. Then update the estimate of the parameter $\Gamma$ by letting
    $$
    \hat{\Gamma}^{(j+1)} = \frac{n}{\sum_{i=1}^{n} t_i + \sum_{i=1}^{n} m_i - 2n}.
    $$
  • If some convergence condition is satisfied, then stop iterating; otherwise, move back to the E-step for another iteration.
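The numerical E-step for this scheme could look like the following Python sketch. It assumes a unit-mean inverse Gaussian mixing density with shape $\Gamma$, proportional to $\lambda^{-3/2}\exp\{-\Gamma(\lambda-1)^2/(2\lambda)\}$, which is the parametrisation under which the $\Gamma$ update above maximises the complete-data log-likelihood, and $\psi_{\sigma,\theta}(x) = 1-(\sigma/x)^{\theta}$. The posterior of $\Lambda_i$ is then proportional to $\lambda^{-1/2}\exp\{-\tfrac12(a\lambda + b/\lambda)\}$ with $a = \Gamma - 2\log\psi$ and $b = \Gamma$. The integrals are computed with a trapezoidal rule on $u = \log\lambda$. Function names and integration settings are hypothetical.

```python
import math

def posterior_moments(psi_value, gam, lo=-20.0, hi=20.0, n_steps=4000):
    """Return (m, t, s) = (E[1/L], E[L], E[log L]) under the assumed posterior."""
    a = gam - 2.0 * math.log(psi_value)
    b = gam
    h = (hi - lo) / n_steps
    norm = m = t = s = 0.0
    for j in range(n_steps + 1):
        u = lo + j * h
        # Unnormalised posterior in u = log(lambda), Jacobian included.
        w = math.exp(0.5 * u - 0.5 * (a * math.exp(u) + b * math.exp(-u)))
        if j == 0 or j == n_steps:
            w *= 0.5
        norm += w
        m += w * math.exp(-u)
        t += w * math.exp(u)
        s += w * u
    return m / norm, t / norm, s / norm

def update_gamma(m_values, t_values):
    """M-step update: n / (sum t_i + sum m_i - 2n)."""
    n = len(t_values)
    return n / (sum(t_values) + sum(m_values) - 2.0 * n)
```

Under these assumptions $m_i$ and $t_i$ also have closed forms, $\sqrt{a/b}$ and $\sqrt{b/a}\,(1 + 1/\sqrt{ab})$, which can be used to validate the numerical integration.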
The standard errors of the estimates $(\hat{\beta}^{(j)}, \hat{\alpha}^{(j)})$ can be computed in the last iteration of the EM algorithm for these two models following the methodology in Louis (1982).

References

  1. Albrecher, Hansjörg, Jan Beirlant, and Jozef L. Teugels. 2017. Reinsurance: Actuarial and Statistical Aspects. Hoboken: John Wiley & Sons.
  2. Arnold, Barry. 1983. Pareto Distributions. Hoboken: Taylor & Francis Group.
  3. Barndorff-Nielsen, Ole, and Christian Halgreen. 1977. Infinite divisibility of the hyperbolic and generalised inverse gaussian distributions. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 38: 309–11.
  4. Bingham, Nicholas, Charles M. Goldie, and Jozef L. Teugels. 1989. Regular Variation, Volume 27 of Encyclopedia of Mathematics and its Applications. Cambridge: Cambridge University Press.
  5. Boland, Philip J. 2007. Statistical and Probabilistic Methods in Actuarial Science. Boca Raton: Chapman & Hall.
  6. Boyd, Albert V. 1988. Fitting the truncated Pareto distribution to loss distributions. JSS 31: 151–58.
  7. Dhaene, John, Steven Vanduffel, Marc J. Goovaerts, Rob Kaas, Qihe Tang, and David Vyncke. 2006. Risk measures and comonotonicity: A review. Stochastic Models 22: 573–606.
  8. Embrechts, Paul, Claudia Klüppelberg, and Thomas Mikosch. 1997. Modelling Extremal Events for Insurance and Finance. Berlin: Springer.
  9. Frangos, Nikolaos, and Dimitris Karlis. 2004. Modelling losses using an exponential-inverse gaussian distribution. Insurance: Mathematics and Economics 35: 53–67.
  10. Frees, Edward W. 2010. Regression Modelling with Actuarial and Financial Applications. Cambridge: Cambridge University Press.
  11. Fung, Thomas, and Eugene Seneta. 2007. Tailweight, quantiles and kurtosis: A study of competing distributions. Operations Research Letters 35: 448–54.
  12. Ghitany, Mohamed E., Emilio Gómez-Déniz, and Saralees Nadarajah. 2018. A new generalization of the Pareto distribution and its application to insurance data. Journal of Risk and Financial Management 11: 10.
  13. Gómez-Déniz, Emilio, and Enrique Calderín-Ojeda. 2014. A suitable alternative to the Pareto distribution. Hacettepe Journal of Mathematics and Statistics 43: 843–60.
  14. Gómez-Déniz, Emilio, and Enrique Calderín-Ojeda. 2015. Modelling insurance data with the Pareto ArcTan distribution. ASTIN Bulletin 45: 639–60.
  15. Gómez-Déniz, Emilio, Enrique Calderín-Ojeda, and José M. Sarabia. 2013. Gamma-generalized inverse Gaussian class of distributions with applications. Communications in Statistics-Theory and Methods 42: 919–33.
  16. Gómez-Déniz, Emilio, Víctor Leiva, Enrique Calderín-Ojeda, and Christophe Chesneau. 2022. A novel claim size distribution based on a Birnbaum-Saunders and Gamma mixture capturing extreme values in insurance: Estimation, regression, and applications. Computational and Applied Mathematics 41: 171.
  17. Hogg, Robert V., and Stuart A. Klugman. 1984. Loss Distributions. New York: John Wiley & Sons.
  18. Jessen, Anders H., and Thomas Mikosch. 2006. Regularly varying functions. Publications de L’institut Mathematique 80: 171–92.
  19. Johnson, Norman, Adrienne W. Kemp, and Samuel Kotz. 2005. Univariate Discrete Distributions. Hoboken: John Wiley, Inc.
  20. Kleiber, Christian, and Samuel Kotz. 2003. Statistical Size Distributions in Economics and Actuarial Sciences. Hoboken: John Wiley & Sons, Inc.
  21. Konstantinides, Dimitrios G. 2017. Risk Theory. A Heavy Tail Approach. Singapore: World Scientific Publishing.
  22. Lehmann, Erich L. 1953. The power of rank test. Annals of Mathematical Statistics 24: 23–43.
  23. Lemonte, Artur J., and Gauss M. Cordeiro. 2011. The exponentiated generalized inverse Gaussian distribution. Statistics & Probability Letters 81: 506–17.
  24. Louis, Thomas A. 1982. Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society, Series B 44: 226–33.
  25. Marshall, Albert W., and Ingram Olkin. 1997. A new method for adding a parameter to a family of distributions with application to the exponential and Weibull families. Biometrika 84: 641–52.
  26. McNeil, Alexander J. 1988. Estimating the tails of loss severity distribution using extreme value theory. ASTIN Bulletin 27: 117–37.
  27. Rolski, Tomasz, Hanspeter Schmidli, Volker Schmidt, and Jozef L. Teugels. 1999. Stochastic Processes for Insurance and Finance. Hoboken: John Wiley & Sons.
  28. Sarabia, José M., and Enrique Castillo. 2005. About a class of max-stable families with applications to income distributions. Metron 63: 505–27.
  29. Sarabia, José M., and Faustino Prieto. 2009. The Pareto-positive stable distribution: A new descriptive model for city size data. Physica A 388: 4179–91.
  30. Seshadri, Vanamamalai. 1994. The Inverse Gaussian Distribution: A Case Study in Exponential Families. Oxford: Science Publications.
  31. Shaked, Moshe, and J. George Shanthikumar. 2007. Stochastic Orders. Springer Series in Statistics. New York: Springer.
  32. Stoppa, Gabriele. 1990. Proprietà Campionarie di un Nuovo Modello Pareto Generalizzato. Padova: Cedam, pp. 137–44.
  33. Tzougas, George, and Dimitris Karlis. 2020. An EM algorithm for fitting a new class of mixed exponential regression models with varying dispersion. ASTIN Bulletin 50: 555–83.
  34. Tzougas, George, and Himchan Jeong. 2021. An expectation-maximization algorithm for the exponential-generalized inverse gaussian regression model with varying dispersion and shape for modelling the aggregate claim amount. Risks 9: 19.
  35. Yu, Yaming. 2009. Stochastic ordering of exponential family of distributions and their mixtures. Journal of Applied Probability 46: 244–54.
Figure 1. Different shapes of the pdf (10) for special values of the parameters.
Figure 2. Empirical (smoothed) distribution (histogram) and theoretical distribution model of the different distributions considered for the Danish (left panel) and Norwegian (right panel) datasets.
Figure 3. Smooth cdf of the empirical Danish (left panel) and Norwegian (right panel) claims data as compared to the theoretical models.
Figure 4. Absolute errors of the limited expected values for Danish dataset (left) and Norwegian dataset (right).
Figure 5. Histogram and densities for the S, SG, and SIG distributions for the automobile bodily injury claims dataset.
Figure 6. Hill’s estimator θ k values varying k, calculated for the automobile bodily injury claims dataset.
Table 1. Parameter estimators and log-likelihood for the different models considered.
| Model | $\hat{\theta}$ | $\hat{\lambda}$ | NLL |
|---|---|---|---|
| Pareto | 0.582 | | 130.331 |
| Stoppa | 1.198 | 3.861 | 119.372 |
Table 2. Parameter estimates and their corresponding p-values (in brackets), negative of the maximum of the log likelihood function (NLL), AIC, Kolmogorov–Smirnov and Anderson–Darling tests for the Weibull, Lognormal, Burr, and Stoppa distributions and mixture models: SG, SIG, and SGIG.
Danish Data

| Model | $\hat{\alpha}$ | $\hat{\lambda}$ | $\hat{\beta}$ | $\hat{\theta}$ | $\hat{\Gamma}$ | NLL | AIC | KS | AD |
|---|---|---|---|---|---|---|---|---|---|
| lognormal | | 1.557 (<0.001) | | −0.290 (<0.001) | | 3404.340 | 6812.68 | 0.2549 (<0.001) | 365.672 (<0.001) |
| Weibull | | 1.584 (<0.001) | | 0.660 (<0.001) | | 3510.610 | 7025.22 | 0.0724 (<0.001) | 31.752 (<0.001) |
| Burr | | 1.223 (<0.001) | | 1.115 (<0.001) | | 3339.440 | 6682.89 | >1 (<0.001) | >100 (<0.001) |
| Stoppa | | 1.163 (<0.001) | | 1.395 (<0.001) | | 3342.420 | 6688.84 | 0.0277 (0.3772) | 0.00055 (0.4930) |
| SG | 19.550 (0.002) | | 14.241 (0.008) | 1.512 (<0.001) | ≈0 | 3334.788 | 6675.580 | 0.022 (0.6368) | 0.00031 (0.7955) |
| SIG | −0.5 | | 5.883 (0.012) | 1.517 (<0.001) | 11.323 (<0.001) | 3335.000 | 6676.000 | 0.022 (0.6368) | 0.000328 (0.7705) |
| SGIG | −4.546 (<0.001) | | 2.989 (<0.001) | 1.517 (<0.001) | 16.370 (<0.001) | 3335.150 | 6678.300 | 0.022 (0.6368) | 0.00034 (0.767) |

Norwegian Data

| Model | $\hat{\alpha}$ | $\hat{\lambda}$ | $\hat{\beta}$ | $\hat{\theta}$ | $\hat{\Gamma}$ | NLL | AIC | KS | AD |
|---|---|---|---|---|---|---|---|---|---|
| lognormal | | 6.313 (<0.001) | | 1.537 (<0.001) | | 21,097.400 | 42,198.80 | 0.0682 (<0.001) | 23.082 (<0.001) |
| Weibull | | 0.689 (<0.001) | | 1134.39 (<0.001) | | 21,150.500 | 42,305.00 | 0.0765 (<0.001) | 37.756 (<0.001) |
| Burr | | 0.0078 (<0.001) | | 20.295 (<0.001) | | 23,685.620 | 47,375.20 | 0.419 (<0.001) | 704.395 (<0.001) |
| Stoppa | | 1.143 (<0.001) | | 1.124 (<0.001) | | 21,045.290 | 42,094.600 | 0.0854 (<0.001) | 0.00627 (<0.001) |
| SG | 6.404 (<0.001) | | 2.686 (<0.001) | 1.593 (<0.001) | ≈0 | 20,931.033 | 41,868.100 | 0.0158 (0.9011) | 0.00016 (0.8685) |
| SIG | −0.5 | | 0.559 (<0.001) | 1.673 (<0.001) | 4.472 (<0.001) | 20,931.400 | 41,868.700 | 0.0162 (0.8846) | 0.00021 (0.7905) |
| SGIG | 1.559 (<0.001) | | 1.588 (<0.001) | 1.632 (<0.001) | 1.626 (<0.001) | 20,929.900 | 41,867.900 | 0.0154 (0.9163) | 0.00016 (0.8685) |
Table 3. Limited expected value for the different distributions considered, and different values of the fixed amount deductible x for the Danish dataset.
| Deductible | Empirical | Stoppa | SG | SIG | SGIG |
|---|---|---|---|---|---|
| 6.00 | 2.42371 | 2.42879 | 2.44939 | 2.43608 | 2.43616 |
| 6.15 | 2.43641 | 2.44278 | 2.46265 | 2.44931 | 2.44940 |
| 6.30 | 2.44871 | 2.45630 | 2.47544 | 2.46206 | 2.46217 |
| 6.45 | 2.46052 | 2.46938 | 2.48778 | 2.47437 | 2.47450 |
| 6.60 | 2.47209 | 2.48205 | 2.49971 | 2.48627 | 2.48640 |
| 6.75 | 2.48345 | 2.49433 | 2.51123 | 2.49776 | 2.49791 |
| 6.90 | 2.49459 | 2.50623 | 2.52238 | 2.50888 | 2.50904 |
| 7.05 | 2.50547 | 2.51778 | 2.53317 | 2.51964 | 2.51982 |
| 7.20 | 2.51607 | 2.52899 | 2.54363 | 2.53007 | 2.53025 |
| 7.35 | 2.52637 | 2.53989 | 2.55376 | 2.54017 | 2.54037 |
| 7.50 | 2.53644 | 2.55047 | 2.56359 | 2.54997 | 2.55019 |
| 7.65 | 2.54620 | 2.56077 | 2.57313 | 2.55949 | 2.55971 |
| 7.80 | 2.55563 | 2.57080 | 2.58240 | 2.56872 | 2.56896 |
| 7.95 | 2.56491 | 2.58056 | 2.59140 | 2.57770 | 2.57795 |
| 8.10 | 2.57401 | 2.59006 | 2.60015 | 2.58642 | 2.58668 |
| 8.25 | 2.58294 | 2.59933 | 2.60866 | 2.59491 | 2.59518 |
| 8.40 | 2.59165 | 2.60836 | 2.61694 | 2.60317 | 2.60345 |
| 8.55 | 2.60026 | 2.61717 | 2.62501 | 2.61121 | 2.61150 |
| 8.70 | 2.60876 | 2.62577 | 2.63286 | 2.61904 | 2.61934 |
| 8.85 | 2.61703 | 2.63417 | 2.64052 | 2.62667 | 2.62698 |
| 9.00 | 2.62514 | 2.64237 | 2.64798 | 2.63410 | 2.63443 |
| 9.15 | 2.63324 | 2.65039 | 2.65526 | 2.64136 | 2.64169 |
| 9.30 | 2.64117 | 2.65822 | 2.66236 | 2.64844 | 2.64878 |
| 9.45 | 2.64891 | 2.66588 | 2.66929 | 2.65534 | 2.65570 |
| 9.60 | 2.65653 | 2.67337 | 2.67606 | 2.66209 | 2.66245 |
| 9.75 | 2.66414 | 2.68070 | 2.68267 | 2.66868 | 2.66905 |
| 9.90 | 2.67175 | 2.68788 | 2.68913 | 2.67512 | 2.67550 |
Table 4. Limited expected value for the different distributions considered, and different values of the fixed amount deductible x for the Norwegian dataset.
| Deductible | Empirical | Stoppa | SG | SIG | SGIG |
|---|---|---|---|---|---|
| 600 | 594.306 | 592.565 | 594.307 | 594.14 | 594.189 |
| 1100 | 936.322 | 900.662 | 935.339 | 934.021 | 935.392 |
| 1600 | 1139.38 | 1084.90 | 1136.21 | 1136.55 | 1137.46 |
| 2100 | 1269.48 | 1214.54 | 1267.37 | 1269.27 | 1269.35 |
| 2600 | 1357.43 | 1313.79 | 1360.36 | 1363.11 | 1362.55 |
| 3100 | 1421.89 | 1393.80 | 1430.21 | 1433.22 | 1432.30 |
| 3600 | 1473.58 | 1460.58 | 1484.96 | 1487.80 | 1486.72 |
| 4100 | 1516.11 | 1517.74 | 1529.23 | 1531.64 | 1530.56 |
| 4600 | 1551.11 | 1567.61 | 1565.91 | 1567.72 | 1566.75 |
| 5100 | 1580.55 | 1611.75 | 1596.91 | 1598.02 | 1597.23 |
| 5600 | 1606.31 | 1651.30 | 1623.52 | 1623.88 | 1623.31 |
| 6100 | 1629.25 | 1687.09 | 1646.68 | 1646.24 | 1645.93 |
| 6600 | 1650.39 | 1719.74 | 1667.04 | 1665.8 | 1665.77 |
| 7100 | 1669.51 | 1749.72 | 1685.13 | 1683.08 | 1683.33 |
| 7600 | 1686.98 | 1777.43 | 1701.32 | 1698.47 | 1699.02 |
| 8100 | 1703.07 | 1803.17 | 1715.92 | 1712.29 | 1713.13 |
| 8600 | 1717.58 | 1827.19 | 1729.17 | 1724.77 | 1725.91 |
| 9100 | 1730.79 | 1849.69 | 1741.26 | 1736.11 | 1737.54 |
| 9600 | 1743.01 | 1870.85 | 1752.34 | 1746.46 | 1748.19 |
Table 5. Parameter estimates, standard errors (S.E.), and p-values for automobile bodily injury claims dataset under S, SG, and SIG regression models. NLL and AIC are included in the last two rows. The response variable is total losses.
| Estimate (S.E.) | S | SG | SIG |
|---|---|---|---|
| INTERCEPT | −3.704 (1.041) | 2.512 (0.019) | 17.659 (2.713) |
| p-value | 0.0004 | <0.0001 | <0.0001 |
| ATTORNEY | 1.651 (0.095) | 2.374 (0.244) | 20.014 (2.565) |
| p-value | <0.0001 | <0.0001 | <0.0001 |
| CLMSEX | 0.023 (0.087) | 2.544 (0.293) | −0.092 (0.177) |
| p-value | 0.7941 | <0.0001 | 0.6022 |
| MARRIED | −0.352 (0.272) | 7.009 (0.233) | −0.054 (0.582) |
| p-value | 0.1951 | 0.3210 | 0.9260 |
| SINGLE | −0.434 (0.281) | 2.363 (0.183) | −0.065 (0.615) |
| p-value | 0.1222 | <0.0001 | 0.9158 |
| WIDOWED | −1.596 (0.488) | 10.005 (14.472) | 14.631 (2.536) |
| p-value | 0.0011 | 0.4802 | <0.0001 |
| CLMINSUR | 0.079 (0.146) | 3.086 (0.714) | 0.271 (0.365) |
| p-value | 0.5898 | <0.0001 | 0.4575 |
| SEATBELT | −1.258 (0.361) | 2.508 (0.043) | −19.1757 (2.7936) |
| p-value | 0.0005 | <0.0001 | <0.0001 |
| CLMAGE | 0.019 (0.003) | 0.393 (0.003) | 0.0144 (0.0066) |
| p-value | <0.0001 | <0.0001 | 0.0279 |
| $\theta$ | 0.736 (0.015) | 1.647 (0.008) | 1.061 (0.034) |
| p-value | <0.0001 | <0.0001 | <0.0001 |
| $\lambda$, $\alpha$, $\beta$ | 831.609 (528.599) | 1.093 (0.039) | 0.001 (0.000) |
| p-value | 0.1160 | <0.0001 | 0.0236 |
| –, $\beta$, $\Gamma$ | – | 0.000 (0.000) | 741.355 (104.096) |
| p-value | – | <0.0001 | <0.0001 |
| NLL | 2608.38 | 2566.17 | 2558.82 |
| AIC | 5238.76 | 5156.34 | 5141.65 |
Gómez-Déniz, E.; Calderín-Ojeda, E. On the Use of Lehmann’s Alternative to Capture Extreme Losses in Actuarial Science. Risks 2024, 12, 6. https://doi.org/10.3390/risks12010006