Article

The Heavy-Tailed Gleser Model: Properties, Estimation, and Applications

by Neveka M. Olmos 1,*, Emilio Gómez-Déniz 2 and Osvaldo Venegas 3
1 Departamento de Matemáticas, Facultad de Ciencias Básicas, Universidad de Antofagasta, Antofagasta 1240000, Chile
2 Department of Quantitative Methods in Economics and TIDES Institute, University of Las Palmas de Gran Canaria, 35017 Las Palmas de Gran Canaria, Spain
3 Departamento de Ciencias Matemáticas y Físicas, Facultad de Ingeniería, Universidad Católica de Temuco, Temuco 4780000, Chile
* Author to whom correspondence should be addressed.
Mathematics 2022, 10(23), 4577; https://doi.org/10.3390/math10234577
Submission received: 29 September 2022 / Revised: 19 November 2022 / Accepted: 21 November 2022 / Published: 2 December 2022
(This article belongs to the Section Probability and Statistics)

Abstract

In actuarial statistics, heavy-tailed distributions are of great interest to actuaries, as they provide a better description of risk exposure through risk indicators evaluated at a given probability level. These risk indicators are used to determine a company's exposure to a particular risk. In this paper, we present a distribution with a heavy right tail, studying its properties and the behaviour of its tail. We estimate the parameters using the maximum likelihood method and evaluate the performance of these estimators by Monte Carlo simulation. We analyse one set of simulated data and another set of real data, showing that the distribution studied can be used to model income data.

1. Introduction

Heavy-tailed distributions have been used to model data in various applied sciences, such as environmental sciences, earth sciences, and economic and actuarial sciences. Insurance datasets tend to be positive and asymmetric to the right, with heavy tails (see Ibragimov and Prokhorov [1]); distributions with these characteristics are therefore used to model insurance data. Some authors have extended certain asymmetric distributions using the slash methodology, for instance, to increase the weight of the right tail, e.g., Olmos et al. [2] and Olmos et al. [3] in the half-normal (HN) and generalized half-normal (GHN) models, Astorga et al. [4] in the generalized exponential model (see Gupta and Kundu [5]; Mudholkar et al. [6]), and Gómez et al. [7] in the Gumbel model. Two other recent works are Bhati and Ravi [8] in the generalized log-Moyal model and Afify et al. [9] in the heavy-tailed exponential model. It is known that the Pareto distribution and its generalizations are heavy-tailed distributions; they have been used in several areas of knowledge, e.g., by Choulakian and Stephens [10], Zhang [11], Akinsete et al. [12], Nassar and Nada [13], Mahmoudi [14] and Boumaraf et al. [15]. It is important to study distributions with these characteristics in order to model, for example, insurance datasets and financial yields. In the present paper, we study a distribution with a heavy right tail that provides a good fit to family income data. A function used repeatedly in this paper is the beta function, denoted by B(a, b), which can be expressed as
B(a, b) = \int_0^1 t^{a-1}(1-t)^{b-1}\, dt = \int_0^{\infty} \frac{t^{a-1}}{(1+t)^{a+b}}\, dt = \frac{\Gamma(a)\,\Gamma(b)}{\Gamma(a+b)},
where a > 0 , b > 0 and Γ ( · ) is the gamma function. The beta function is the normalisation constant of the beta distribution, i.e., we say that the random variable Y has a beta distribution with parameters a and b if its probability density function (pdf) is given by
f_Y(y; a, b) = \frac{1}{B(a, b)}\, y^{a-1}(1-y)^{b-1}, \quad 0 < y < 1,
where a > 0 and b > 0 .
The incomplete beta function is denoted by B( y ; a , b ) and can be expressed as:
B(y; a, b) = \int_0^{y} t^{a-1}(1-t)^{b-1}\, dt, \quad 0 < y < 1,
where a > 0 and b > 0 . Another related function is the regularised incomplete beta function, denoted by I y ( a , b ) , and expressed as I y ( a , b ) = B ( y ; a , b ) / B ( a , b ) .
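As a computational aside, all three functions are available through base R. The following minimal sketch (the helper names B_fun, B_inc and I_reg are ours, not from the paper) shows the correspondence.
  B_fun <- function(a, b) beta(a, b)                      # complete beta function B(a, b)
  B_inc <- function(y, a, b) pbeta(y, a, b) * beta(a, b)  # incomplete beta function B(y; a, b)
  I_reg <- function(y, a, b) pbeta(y, a, b)               # regularised incomplete beta I_y(a, b)
  # e.g. I_reg(0.3, 0.4, 0.6) equals B_inc(0.3, 0.4, 0.6) / B_fun(0.4, 0.6)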
Gleser [16] introduced a representation of the gamma distribution as a scale mixture of the exponential distribution, with an unspecified mixing distribution. The object of the present paper is to study this mixing distribution, which we call the Gleser (G) distribution. A random variable X has a G distribution with parameter α if its pdf is given by
f_X(x; \alpha) = \frac{x^{-\alpha}}{B(\bar{\alpha}, \alpha)\,(1+x)}, \quad x > 0,
where 0 < α < 1 is a shape parameter and ᾱ = 1 − α. The G distribution is a particular case of the beta prime distribution (see Keeping [17]), also called the inverted beta distribution (see Tiao and Guttman [18]); however, since it is a distribution with only one parameter, we were interested in studying it and applying some of its properties. Taking the density given in (4) and introducing a scale parameter, we obtain a more flexible distribution for modelling positive data with a heavy right tail.
The article is organised as follows. In Section 2 we give some properties of the G distribution. In Section 3 we study the behaviour of the tail of the G distribution. In Section 4 we carry out parameter estimation using the maximum likelihood (ML) method, present a simulation study, and discuss the asymptotic distribution of the ML estimators. Section 5 shows an application with data from the economic field. In Section 6 we offer some conclusions.

2. The G Distribution

In this section, we study the basic properties of the G distribution given in (4), incorporating a scale parameter. A random variable X has a G distribution with positive support and parameters σ and α if its pdf is
f_X(x; \sigma, \alpha) = \frac{\sigma^{\alpha} x^{-\alpha}}{B(\bar{\alpha}, \alpha)\,(\sigma + x)}, \quad x > 0,
where σ > 0 is a scale parameter and 0 < α < 1 is a shape parameter; we denote this by X ∼ G(σ, α). The G distribution is an alternative to two-parameter distributions used for modelling actuarial data, such as the Pareto and GHN distributions, among others. Figure 1 shows the graph of the G density for different values of the parameter α, fixing σ = 1.
We perform a brief comparison illustrating that the tails of the G distribution become heavier as the parameter α decreases. Table 1 shows P(X > x) for different values of x in this distribution.
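The following sketch in R (the function names dG and pG are ours) evaluates the density (5) and the tail probabilities P(X > x); it uses the cdf derived below in Proposition 1(c), and the printed values should agree with Table 1 up to rounding.
  dG <- function(x, sigma, alpha) sigma^alpha * x^(-alpha) / (beta(1 - alpha, alpha) * (sigma + x))
  pG <- function(x, sigma, alpha) pbeta(x / (sigma + x), 1 - alpha, alpha)  # cdf, Proposition 1(c)
  sapply(c(2, 3, 4), function(x) 1 - pG(x, sigma = 1, alpha = 0.9))         # first row of Table 1
  sapply(c(2, 3, 4), function(x) 1 - pG(x, sigma = 1, alpha = 0.3))         # last row of Table 1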

2.1. Properties

The following proposition shows some properties of the G distribution.
Proposition 1. 
Let X ∼ G(σ, α) and Y ∼ Beta(1 − α, α). Then,
(a) 
the G distribution is unimodal, with its mode at 0.
(b) 
σY/(1 − Y) ∼ G(σ, α).
(c) 
the cumulative distribution function (cdf) of X is given by
F_X(x; \sigma, \alpha) = I_y(\bar{\alpha}, \alpha), \quad x > 0,
where y = x/(σ + x) and I_y(·, ·) is the regularised incomplete beta function.
(d) 
the hazard function of X is decreasing for all x > 0 .
(e) 
the r-th moment of the random variable X does not exist for r ≥ α.
(f) 
1/(1 + X) ∼ Beta(α, ᾱ).
(g) 
the quantile function (Q) of the G distribution is given by
Q(p) = \sigma\, \frac{I_p^{-1}(\bar{\alpha}, \alpha)}{1 - I_p^{-1}(\bar{\alpha}, \alpha)}, \quad 0 < p < 1,
where I_p^{-1}(·, ·) is the inverse function of the regularised incomplete beta function.
Proof. 
(d) Using the theorem, item (b), given in Glaser [19], we have that
\eta(x) = -\frac{f_X'(x; \sigma, \alpha)}{f_X(x; \sigma, \alpha)} = \frac{\alpha}{x} + \frac{1}{\sigma + x},
where f_X(x; σ, α) is the pdf given in (5). The derivative of η(x) with respect to x,
\eta'(x) = -\frac{\alpha}{x^2} - \frac{1}{(\sigma + x)^2} < 0, \quad x > 0,
gives the result.
(e)
Considering σ = 1, we claim that the integral \int_1^{\infty} \frac{x^{r-\alpha}}{1+x}\, dx is divergent.
In fact, taking g_1(x) = \frac{x^{r-\alpha}}{1+x} and g_2(x) = \frac{1}{x^{1+\alpha-r}}, we have that
\lim_{x \to \infty} \frac{g_1(x)}{g_2(x)} = \lim_{x \to \infty} \frac{x}{1+x} = 1 \neq 0;
and as the integral \int_1^{\infty} \frac{dx}{x^{1+\alpha-r}} is divergent for r ≥ α, this proves the claim.
On the other hand, since E(X^r) = \frac{1}{B(\bar{\alpha}, \alpha)} \int_0^{\infty} \frac{x^{r-\alpha}}{1+x}\, dx, by using the comparison
0 \leq \int_1^{\infty} \frac{x^{r-\alpha}}{1+x}\, dx \leq \int_0^{\infty} \frac{x^{r-\alpha}}{1+x}\, dx
and the above claim, the result follows.
     □
The survival function F̄(t), which is the probability that an item will not fail before time t, is defined by F̄(t) = 1 − F(t). The survival function for a G random variable is given by F̄(t) = 1 − I_{t/(σ+t)}(ᾱ, α).
The hazard function h(t), defined by h(t) = f(t)/F̄(t), for a G random variable is given by
h(t) = \frac{\sigma^{\alpha} t^{-\alpha}}{(\sigma + t)\left[\,B(\bar{\alpha}, \alpha) - B\!\left(\frac{t}{\sigma + t}; \bar{\alpha}, \alpha\right)\right]}, \quad t > 0.
Figure 2 shows the form of the hazard function for different values of α , considering σ = 1 . As we show in Proposition 1, part ( d ) , the hazard function is always a decreasing function. In the context of reliability, it indicates that failures are more likely to occur earlier in a product’s useful life.
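A short R sketch of the survival and hazard functions (our own helper names, under the same parameterisation as above) illustrates this decreasing behaviour numerically.
  sG <- function(t, sigma, alpha) 1 - pbeta(t / (sigma + t), 1 - alpha, alpha)
  hG <- function(t, sigma, alpha) {
    f <- sigma^alpha * t^(-alpha) / (beta(1 - alpha, alpha) * (sigma + t))  # pdf (5)
    f / sG(t, sigma, alpha)
  }
  hG(c(0.5, 1, 2, 5), sigma = 1, alpha = 0.7)   # decreasing, as in Figure 2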
Using the Q(p) function given in Proposition 1(g), we can compute the coefficient of skewness (β₁) and the coefficient of kurtosis (β₂) for the random variable X ∼ G(σ, α):
\beta_1 = \frac{Q\left(\tfrac{3}{4}\right) + Q\left(\tfrac{1}{4}\right) - 2\,Q\left(\tfrac{2}{4}\right)}{Q\left(\tfrac{3}{4}\right) - Q\left(\tfrac{1}{4}\right)}, \qquad \beta_2 = \frac{Q\left(\tfrac{3}{8}\right) - Q\left(\tfrac{1}{8}\right) + Q\left(\tfrac{7}{8}\right) - Q\left(\tfrac{5}{8}\right)}{Q\left(\tfrac{3}{4}\right) - Q\left(\tfrac{1}{4}\right)}.
Figure 3 depicts plots for the skewness and kurtosis coefficients in the G distribution.
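The quantile function of Proposition 1(g) is immediate to code in R, since I_p^{-1}(ᾱ, α) is qbeta(p, 1 − α, α); the sketch below (function names are ours) also evaluates the quantile-based skewness and kurtosis coefficients defined above.
  qG <- function(p, sigma, alpha) {
    y <- qbeta(p, 1 - alpha, alpha)              # I_p^{-1}(1 - alpha, alpha)
    sigma * y / (1 - y)
  }
  skew_G <- function(sigma, alpha) {
    q <- qG(c(1, 2, 3) / 4, sigma, alpha)
    (q[3] + q[1] - 2 * q[2]) / (q[3] - q[1])
  }
  kurt_G <- function(sigma, alpha) {
    q8 <- qG(c(1, 3, 5, 7) / 8, sigma, alpha)
    q4 <- qG(c(1, 3) / 4, sigma, alpha)
    (q8[2] - q8[1] + q8[4] - q8[3]) / (q4[2] - q4[1])
  }
  c(skewness = skew_G(1, 0.3), kurtosis = kurt_G(1, 0.3))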

2.2. Actuarial Measure

The value at risk (VaR) is used to assess risk exposure; i.e., it can be used to determine the amount of capital needed to cover adverse outcomes. The VaR of the random variable X ∼ G(σ, α) is defined as (see Artzner [20] and Artzner et al. [21])
VaR_p = Q(p) = \sigma\, \frac{I_p^{-1}(\bar{\alpha}, \alpha)}{1 - I_p^{-1}(\bar{\alpha}, \alpha)}, \quad 0 < p < 1,
where I_p^{-1}(·, ·) is the inverse function of the regularised incomplete beta function.
Figure 4 shows graphs of the VaR_p measure of the G(10, α) distribution for different values of the parameter α. We may observe that the smaller the value of the parameter α, the larger the value of VaR_p.
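Since VaR_p coincides with the quantile function, it can be sketched in a couple of lines of R (names are ours); the call below reproduces the qualitative behaviour of Figure 4.
  VaR_G <- function(p, sigma, alpha) { y <- qbeta(p, 1 - alpha, alpha); sigma * y / (1 - y) }
  sapply(c(0.3, 0.5, 0.8), function(a) VaR_G(0.95, sigma = 10, alpha = a))  # larger for smaller alpha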

2.3. Order Statistics

Let X_1, …, X_n be a random sample of the random variable X ∼ G(σ, α), and let us denote by X_(j) the j-th order statistic, j ∈ {1, …, n}.
Proposition 2. 
The pdf of X ( j ) is
f_{X_{(j)}}(x; \sigma, \alpha) = \frac{n!}{(j-1)!\,(n-j)!}\, \frac{\sigma^{\alpha} x^{-\alpha}}{B(\bar{\alpha}, \alpha)\,(\sigma + x)}\, \left[I_{x/(\sigma+x)}(\bar{\alpha}, \alpha)\right]^{j-1} \left[1 - I_{x/(\sigma+x)}(\bar{\alpha}, \alpha)\right]^{n-j},
where x > 0 . In particular, the pdf of the minimum, X ( 1 ) , is
f_{X_{(1)}}(x; \sigma, \alpha) = \frac{n\, \sigma^{\alpha} x^{-\alpha}}{B(\bar{\alpha}, \alpha)\,(\sigma + x)} \left[1 - I_{x/(\sigma+x)}(\bar{\alpha}, \alpha)\right]^{n-1}, \quad x > 0,
and the pdf of the maximum, X ( n ) , is
f_{X_{(n)}}(x; \sigma, \alpha) = \frac{n\, \sigma^{\alpha} x^{-\alpha}}{B(\bar{\alpha}, \alpha)\,(\sigma + x)} \left[I_{x/(\sigma+x)}(\bar{\alpha}, \alpha)\right]^{n-1}, \quad x > 0.
Proof. 
Since we are dealing with an absolutely continuous model, the pdf of the j-th order statistic is obtained by applying
f_{X_{(j)}}(x; \sigma, \alpha) = \frac{n!}{(j-1)!\,(n-j)!}\, f_X(x; \sigma, \alpha)\, [F_X(x; \sigma, \alpha)]^{j-1}\, [1 - F_X(x; \sigma, \alpha)]^{n-j}, \quad j \in \{1, \ldots, n\},
where F and f denote the cdf and pdf of the parent distribution, X G ( σ , α ) in this case.    □

2.4. Entropy

In this subsection we will discuss the Shannon and Rényi entropies of the G model.

2.4.1. Shannon Entropy

The following lemma will be very useful for calculating the Shannon entropy.
Lemma 1. 
Let X ∼ G(σ, α). Then we have the following results.
1. 
E(\log X) = \log \sigma + \psi(\bar{\alpha}) - \psi(\alpha),
2. 
E(\log(\sigma + X)) = \log \sigma - \gamma - \psi(\alpha),
where ψ(·) is the digamma function and γ = −ψ(1) is Euler's constant.
Proof. 
Both results are obtained directly using the pdf given in (5).    □
The Shannon (S) entropy (see Shannon [22]) for a random variable X is defined as
S(X) = -E\left(\log f_X(X)\right).
Therefore, it can be verified that the S entropy for the G model is
Proposition 3. 
Let X ∼ G(σ, α). Then the Shannon entropy of X is
S(X) = \log \sigma + \log B(\bar{\alpha}, \alpha) - \gamma + \alpha\, \psi(\bar{\alpha}) - (1 + \alpha)\, \psi(\alpha).
Figure 5 shows the Shannon entropy for the G model fixing σ = 1 .
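The closed form of Proposition 3 is easy to check numerically; the R sketch below (our own names and settings) compares it with a Monte Carlo estimate of −E[log f_X(X)] obtained through the representation of Proposition 1(b). Recall that ψ(1) = −γ, so the term −γ appears in the code as + digamma(1).
  shannon_G <- function(sigma, alpha) {
    log(sigma) + lbeta(1 - alpha, alpha) + digamma(1) +
      alpha * digamma(1 - alpha) - (1 + alpha) * digamma(alpha)
  }
  set.seed(1); sigma <- 1; alpha <- 0.5
  y <- rbeta(1e6, 1 - alpha, alpha); x <- sigma * y / (1 - y)
  log_f <- alpha * log(sigma) - alpha * log(x) - lbeta(1 - alpha, alpha) - log(sigma + x)
  c(closed_form = shannon_G(sigma, alpha), monte_carlo = -mean(log_f))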

2.4.2. Rényi Entropy

A generalization of the Shannon entropy is the Rényi ( R p ) entropy, which is defined as
R_p(X) = \frac{1}{1-p} \log \int_0^{\infty} [f_X(x)]^p\, dx.
Therefore, it can be verified that the R p entropy for the G model is
Proposition 4. 
Let X ∼ G(σ, α). Then the Rényi entropy of X is
R_p(X) = \frac{1}{1-p} \left[\log \Gamma(\xi) + \log \Gamma(p - \xi) - \log \Gamma(p) - (p-1) \log \sigma - p \log B(\bar{\alpha}, \alpha)\right],
where ξ = 1 − pα > 0.
Corollary 1. 
Let X ∼ G(σ, α), and let S(X) and R_p(X) be the Shannon and Rényi entropies. Then we have
\lim_{p \to 1} R_p(X) = S(X).
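A numerical illustration of Proposition 4 and Corollary 1 in R (our own names; the formula requires ξ = 1 − pα > 0 and p − ξ > 0): taking p close to 1, the Rényi entropy approaches the Shannon entropy.
  renyi_G <- function(p, sigma, alpha) {
    xi <- 1 - p * alpha
    (lgamma(xi) + lgamma(p - xi) - lgamma(p) - (p - 1) * log(sigma) -
       p * lbeta(1 - alpha, alpha)) / (1 - p)
  }
  shannon_G <- function(sigma, alpha)
    log(sigma) + lbeta(1 - alpha, alpha) + digamma(1) +
      alpha * digamma(1 - alpha) - (1 + alpha) * digamma(alpha)
  c(renyi_near_1 = renyi_G(1.0001, sigma = 1, alpha = 0.5), shannon = shannon_G(1, 0.5))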

3. Tail of the Distribution

A random variable with a non-negative support, like the classic Pareto distribution, is commonly used in insurance contexts to model the amount of losses. The size of the distribution tail is fundamental if we want the chosen model to capture quantities sufficiently removed from the start of the distribution support, i.e., atypical (extreme) values. The concept of heavy tail is fundamental for this and other financial scenarios.
The use of heavy right-tailed distributions is of vital importance in general insurance. Pareto, Log-normal and Weibull distributions, among others, have been used to model third party liability insurance losses for motor vehicles, re-insurance and catastrophe insurance.
Let S be the class of subexponential distributions; that is, F ∈ S defined on R_+ satisfies
\lim_{x \to \infty} \frac{1 - F^{*2}(x)}{1 - F(x)} = 2,
where F^{*j} is the j-fold convolution of F.
The following lemma, which appears, among others, in Chapter 2, p. 55 of Rolski et al. [23], is required for the next theorem.
Lemma 2. 
Let F ( x ) and H ( x ) be two distributions on R + and assume that there exists a constant c > 0 such that
\lim_{x \to \infty} \frac{\bar{H}(x)}{\bar{F}(x)} = c.
Then, F S if and only if H S .
Theorem 1. 
Let F_X(x; σ, α) be the cdf of G(σ, α) given in (6). Then F_X(x; σ, α) ∈ S.
Proof. 
Let H̄(x) = (σ/(σ + x))^α be the survival function of the Pareto Type II (Lomax) distribution. Using this together with the complement of the cdf (6) and computing (14), we obtain, after applying L'Hôpital's rule, that
\lim_{x \to \infty} \frac{\bar{H}(x)}{\bar{F}(x)} = \alpha\, B(\bar{\alpha}, \alpha) > 0.
Now, taking into account that the Pareto type II distribution belongs to the class of subexponential distributions, we have the result.    □
As a consequence of the previous Theorem (see Rolski et al. [23] p. 50), we have the following Corollary.
Corollary 2. 
If X_1 and X_2 are independent and identically distributed random variables with distribution given in (6), then, as x → ∞,
\Pr(X_1 + X_2 > x) \sim \Pr(\max\{X_1, X_2\} > x).
Proof. 
The result is a consequence of the fact that F X ( x ; σ , α ) S .    □
Proposition 5. 
It is verified that F_X(x; σ, α) is a heavy right-tailed distribution.
Proof. 
Since F X ( x ; σ , α ) S , the result is a consequence of applying Theorem 2.5.2 in Rolski et al. [23].    □
Another way to see that F_X(x; σ, α) is a heavy right-tailed distribution is by computing the limit \lim_{x \to \infty} \left(-\log \bar{F}_X(x; \sigma, \alpha)/x\right), which equals 0. Observe that −log F̄_X(x; σ, α) is the cumulative hazard function of X.
As a consequence of the last result, we have the following Corollary:
Corollary 3. 
It is verified that \limsup_{x \to \infty} e^{s x}\, \bar{F}_X(x; \sigma, \alpha) = \infty for all s > 0.
In this case the distribution fails to possess any positive exponential moment, i.e., \int e^{s x}\, dF_X(x; \sigma, \alpha) = \infty for all s > 0 (see Chapter 1, p. 2 of [24]). Distributions of this type have moment generating function M_X(s) = \infty for all s > 0. This occurs, for example, with the log-normal distribution.
An important issue in extreme value theory is regular variation (see Bingham [25] and Konstantinides [26]). This is a flexible description of the variation of the tail of some functions according to a polynomial form of the type x^{-\delta} + o(x^{-\delta}), δ > 0. This concept is formalized in the following definition.
Definition 1. 
A cdf (measurable function) F is called regularly varying at infinity with index δ if
\lim_{x \to \infty} \frac{\bar{F}(\tau x)}{\bar{F}(x)} = \tau^{-\delta},
where τ > 0; the parameter δ ≥ 0 is called the tail index.
The following proposition establishes that the survival function of the G distribution is a distribution with regular variation.
Proposition 6. 
The survival function of G ( σ , α ) is regularly varying with tail index α.
Proof. 
Applying the above definition and using L'Hôpital's rule, we have that
\lim_{x \to \infty} \frac{\bar{F}(t x)}{\bar{F}(x)} = \lim_{x \to \infty} \frac{t\, f_X(t x; \sigma, \alpha)}{f_X(x; \sigma, \alpha)} = t^{-\alpha + 1} \lim_{x \to \infty} \frac{\sigma + x}{\sigma + t x}.
Calculating the limit on the right, which equals 1/t, we obtain t^{-α} and hence the result.    □
In actuarial settings and in individual and collective risk models, the practitioner is usually interested in the random variable S_n = \sum_{i=1}^{n} X_i for n ≥ 1. Although its pdf is difficult or impossible to calculate in practice, we can approximate its probabilities using the following corollary (see Jessen and Mikosch [27]).
Corollary 4. 
If X_1, …, X_n are iid G(σ, α) random variables and S_n = \sum_{i=1}^{n} X_i, n ≥ 1, then
\Pr(S_n > x) \sim n \Pr(X_1 > x), \quad \text{as } x \to \infty.
Therefore, if P_n = \max_{i=1,\ldots,n} X_i, n ≥ 1, we have that
\Pr(S_n > x) \sim n \Pr(X_1 > x) \sim \Pr(P_n > x).
This means that for large x the event { S n > x } is due to the event { P n > x } . Therefore, high thresholds being exceeded by the sum S n are due to this threshold being exceeded by the largest value in the sample.
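This tail behaviour can be illustrated by simulation. The R sketch below (our own, with illustrative settings) compares Monte Carlo estimates of Pr(S_n > x) and Pr(P_n > x) with n Pr(X_1 > x) for a large threshold x; the three quantities are close, as Corollary 4 states.
  set.seed(123)
  sigma <- 1; alpha <- 0.5; n <- 5; x_big <- 1000; R <- 2e5
  y    <- matrix(rbeta(n * R, 1 - alpha, alpha), nrow = R)
  sims <- sigma * y / (1 - y)                               # R samples of (X_1, ..., X_n)
  p_sum <- mean(rowSums(sims) > x_big)                      # Pr(S_n > x)
  p_max <- mean(apply(sims, 1, max) > x_big)                # Pr(P_n > x)
  n_p1  <- n * (1 - pbeta(x_big / (sigma + x_big), 1 - alpha, alpha))  # n * Pr(X_1 > x)
  c(p_sum = p_sum, n_p_one = n_p1, p_max = p_max)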
As Jessen and Mikosch [27] point out, expression (15) can be taken as the definition of a subexponential distribution. The class of those distributions is larger than the class of regularly varying distributions. The result, given in Corollary 4, remains valid for subexponential distributions because the subexponentiality of S n implies subexponentiality of X 1 . Usually, this property is referred to as convolution root closure of subexponential distributions. More details can be viewed in [28,29].
On the other hand, let the random variable X, whose support is 0 < x < ∞, represent either a policy limit or a reinsurance deductible (from an insurer's perspective); then the limited expected value function L of X with cdf F(x) is defined by (see Boland [30] and Hogg and Klugman [31]):
L(x) = E[\min(X, x)] = \int_0^{x} y\, dF(y) + x\, \bar{F}(x),
which is the expectation of X limited (censored) at the point x. In other words, it represents the expected amount per claim retained by the insured on a policy with a fixed amount deductible of x. Observe that we integrate over the interval (0, x).
Proposition 7. 
Let X be a random variable following the pdf (5). Then the limited expected value of X is given by
L(z) = \frac{z^{2}}{\sigma\, B(\bar{\alpha}, \alpha)\,(2 - \alpha)} \left(\frac{\sigma}{z}\right)^{\alpha} {}_2F_1\!\left(2 - \alpha, 1; 3 - \alpha; -z/\sigma\right) + z\, \bar{F}(z), \quad \alpha < 1,
where 2 F 1 represents the hypergeometric function.
Proof. 
Making the change of variable t = x/(z − x) in the integral
I = \frac{\sigma^{\alpha}}{B(\bar{\alpha}, \alpha)} \int_0^{z} \frac{dx}{x^{\alpha - 1}\,(\sigma + x)}
gives the result.    □
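In practice, L(x) can also be evaluated without the hypergeometric function, by numerical integration of the defining expectation; the R sketch below (our own) does this, and can be checked against the closed form of Proposition 7 whenever an implementation of 2F1 is available. The illustrative arguments are close to the estimates reported later in Table 5.
  lev_G <- function(x, sigma, alpha) {
    f <- function(y) sigma^alpha * y^(-alpha) / (beta(1 - alpha, alpha) * (sigma + y))  # pdf (5)
    surv <- 1 - pbeta(x / (sigma + x), 1 - alpha, alpha)
    integrate(function(y) y * f(y), lower = 0, upper = x)$value + x * surv
  }
  lev_G(50, sigma = 21.5, alpha = 0.5)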

4. Inference

In this section we estimate the parameters of the G model by the ML method, and present a simulation study and the asymptotic distribution of the ML estimators.

4.1. ML Estimation

For a random sample x_1, …, x_n derived from the G(σ, α) distribution, the log-likelihood function can be written as
\ell(\sigma, \alpha) = n \alpha \log \sigma - \alpha \sum_{i=1}^{n} \log x_i - n \log B(\bar{\alpha}, \alpha) - \sum_{i=1}^{n} \log(\sigma + x_i).
The score equations are given by:
\frac{n \alpha}{\sigma} - \sum_{i=1}^{n} \frac{1}{\sigma + x_i} = 0,
\log \sigma - \overline{\log x} + \psi(1 - \alpha) - \psi(\alpha) = 0,
where ψ(·) is the digamma function and \overline{\log x} = \frac{1}{n} \sum_{i=1}^{n} \log x_i. From (17) we obtain
\hat{\alpha} = \frac{\hat{\sigma}}{n} \sum_{i=1}^{n} \frac{1}{\hat{\sigma} + x_i},
and the ML estimator for σ ( σ ^ ) is obtained by solving numerically the following equation
\log \hat{\sigma} = \overline{\log x} + \psi\!\left(\frac{\hat{\sigma}}{n} \sum_{i=1}^{n} \frac{1}{\hat{\sigma} + x_i}\right) - \psi\!\left(1 - \frac{\hat{\sigma}}{n} \sum_{i=1}^{n} \frac{1}{\hat{\sigma} + x_i}\right).
Equation (20) can be solved by using numerical procedures such as the Newton-Raphson algorithm. Alternatively, these estimates can be found by directly maximizing the log-likelihood surface given by (16) and using the “optim” subroutine in the R software package [32] version 4.2.1.
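A minimal R sketch of this direct maximisation (our own variable names; the box constraints simply enforce σ > 0 and 0 < α < 1) is as follows.
  negloglik_G <- function(par, x) {
    sigma <- par[1]; alpha <- par[2]; n <- length(x)
    -(n * alpha * log(sigma) - alpha * sum(log(x)) -
        n * lbeta(1 - alpha, alpha) - sum(log(sigma + x)))
  }
  fit_G <- function(x, start = c(sigma = median(x), alpha = 0.5)) {
    optim(start, negloglik_G, x = x, method = "L-BFGS-B",
          lower = c(1e-8, 1e-6), upper = c(Inf, 1 - 1e-6), hessian = TRUE)
  }
  # standard errors from the observed information:  fit <- fit_G(x); sqrt(diag(solve(fit$hessian)))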

4.2. Simulation Study

To examine the behaviour of the ML estimators, we present a simulation study evaluating their performance for the σ and α parameters of the G model. The simulation was carried out by generating 1000 samples of sizes n = 100, 200 and 500 from the G model. Algorithms 1 and 2, shown below, can be used to generate random numbers from the G model.
Algorithm 1 To simulate from X ∼ G(σ, α), proceed as follows:
  • Step 1: Generate Y U n i f o r m ( 0 , 1 ) .
  • Step 2: Compute X = σ I_Y^{-1}(ᾱ, α) / (1 − I_Y^{-1}(ᾱ, α)).
Algorithm 2 To simulate from X ∼ G(σ, α), proceed as follows:
  • Step 1: Generate Y B e t a ( α ¯ , α ) .
  • Step 2: Compute X = σ Y / (1 − Y).
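Both algorithms translate directly into R (our own function names); note that I_Y^{-1}(ᾱ, α) is qbeta(Y, 1 − α, α).
  rG_alg1 <- function(n, sigma, alpha) {        # Algorithm 1: inversion of the cdf
    u <- runif(n)
    y <- qbeta(u, 1 - alpha, alpha)
    sigma * y / (1 - y)
  }
  rG_alg2 <- function(n, sigma, alpha) {        # Algorithm 2: beta representation, Proposition 1(b)
    y <- rbeta(n, 1 - alpha, alpha)
    sigma * y / (1 - y)
  }
  set.seed(2022)
  x_sim <- rG_alg1(200, sigma = 1, alpha = 0.3) # e.g. a sample like the one used in Section 5.1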
For each sample generated from the G distribution, the ML estimates are obtained by applying the Newton-Raphson algorithm. In Table 2 the empirical bias, the mean of the standard errors (SE), the root of the empirical mean squared error (RMSE), and the 95% coverage probability (CP) based on the asymptotic distribution for ML estimators are given for the estimators of the parameters. As Table 2 shows, the performance of the estimates improves when n increases.

4.3. Fisher’s Information Matrix

Let us now consider X ∼ G(σ, α). For a single observation x of X, the log-likelihood function for θ = (σ, α) is given by
\ell(\theta) = \alpha \log \sigma - \alpha \log x - \log B(\bar{\alpha}, \alpha) - \log(\sigma + x).
The corresponding first and second partial derivatives of the log-likelihood function are derived in Appendix A. It can be shown that Fisher's information matrix, denoted by I_F(·), for the G distribution is given by
I_F(\theta) = \begin{pmatrix} \dfrac{\bar{\alpha}\,\alpha}{2\sigma^{2}} & -\dfrac{1}{\sigma} \\[6pt] -\dfrac{1}{\sigma} & \psi'(\alpha) + \psi'(\bar{\alpha}) \end{pmatrix},
where ψ′(·) is the trigamma function.
Proposition 8. 
The ML estimator of θ is consistent and asymptotically normal, verifying
\sqrt{n}\,(\hat{\theta} - \theta) \xrightarrow{L} N_2\!\left(0,\, I_F(\theta)^{-1}\right), \quad \text{as } n \to \infty.
Proof. 
The distribution satisfies the regularity conditions (see Lehmann and Casella [33], p. 449) under which the ML estimator θ̂ of θ is consistent and asymptotically normal. □
Thus the asymptotic variance of the ML estimator θ ^ is the inverse of Fisher’s information matrix I F ( θ ) . Since the parameters are unknown, the observed information matrix is usually considered, where the unknown parameters are estimated by ML.
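The matrix I_F(θ) and the resulting asymptotic standard errors are straightforward to evaluate in R (our own helper names); for instance, at σ = 1, α = 0.5 and n = 500 the values can be compared with the SE column of Table 2.
  fisher_G <- function(sigma, alpha) {
    matrix(c((1 - alpha) * alpha / (2 * sigma^2), -1 / sigma,
             -1 / sigma, trigamma(alpha) + trigamma(1 - alpha)),
           nrow = 2, byrow = TRUE, dimnames = list(c("sigma", "alpha"), c("sigma", "alpha")))
  }
  asympt_se_G <- function(sigma, alpha, n) sqrt(diag(solve(n * fisher_G(sigma, alpha))))
  asympt_se_G(sigma = 1, alpha = 0.5, n = 500)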

5. Applications

In this section we analyse two applications, the first with simulated data and the second with real data. To compare the models, we use the Akaike information criterion AIC (see Akaike [34]) and the Bayesian information criterion BIC (see Schwarz [35]). In these applications we use the GHN model (see Cooray and Ananda [36]) and the Pareto model (see Arnold [37]). These models present a certain similarity in form with the G model and have greater flexibility in their coefficient of kurtosis. Hence the Pareto and GHN models are considered in the comparison of the fits to the two data sets. The two models are given respectively by:
  • f(x; \alpha, \beta) = \frac{\beta\, \alpha^{\beta}}{x^{\beta + 1}}, \quad x > \alpha,
  • f(x; \sigma, \alpha) = \frac{2 \alpha\, x^{\alpha - 1}}{\sigma^{\alpha}}\, \phi\!\left(\left(\frac{x}{\sigma}\right)^{\alpha}\right), \quad x > 0,
where ϕ denotes the density function of the standard normal distribution and σ , α , β > 0 .
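For the comparisons that follow, the two competing densities and the information criteria can be sketched in R as below (helper names are ours; the Pareto ML estimates have the usual closed form, while the GHN and G models can be fitted with optim as in Section 4.1).
  dpareto <- function(x, alpha, beta) ifelse(x > alpha, beta * alpha^beta / x^(beta + 1), 0)
  dghn    <- function(x, sigma, alpha) 2 * alpha * x^(alpha - 1) / sigma^alpha * dnorm((x / sigma)^alpha)
  aic_bic <- function(loglik, k, n) c(AIC = -2 * loglik + 2 * k, BIC = -2 * loglik + k * log(n))
  # closed-form Pareto ML estimates:  alpha_hat <- min(x);  beta_hat <- length(x) / sum(log(x / alpha_hat))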

5.1. Numerical Application

In this numerical application we analyse 200 simulated observations generated from the G(1, 0.3) model using Algorithm 1 of Section 4.2. The object of this numerical example is to use the ML method of Section 4.1 to check whether, when fitted alongside competing models, the G model exhibits a form of its own.
Table 3 shows the ML estimates for the parameters of the G model, Pareto model and GHN model, as well as the AIC and BIC values for each model.
We observe that the smallest values of the AIC and BIC criteria correspond to the G model, meaning that the G model fits the data better than the other models. This is to be expected, since the data were simulated from the G model. The above values for the measures indicate that the G model has its own form and that it may be difficult to replace it by any other known two-parameter model.

5.2. Application with Real Data

This dataset comes from the Survey of Consumer Finances (SCF), a nationally representative sample that contains extensive information on assets, liabilities, income, and demographic characteristics of those sampled (potential U.S. customers). It contains a random sample of 500 households in the USA with positive incomes that were interviewed in the 2004 survey. The variable of interest is the annual income of the family, in thousands of US dollars, divided by the number of members in the household. The data can be recovered from the web page https://www.federalreserve.gov/econres/scfindex.htm (accessed on 5 January 2022). The descriptive statistics of these data are shown in Table 4, where CS is the sample coefficient of skewness and CK is the sample coefficient of kurtosis.
Figure 6 presents two boxplots; the top boxplot shows a very extreme datum, and the bottom boxplot shows the dataset after elimination of that extreme datum, in order to reveal the other extreme data that cannot be seen in the top boxplot. These atypical data make the right tail heavier. It may be noted that the majority of the observations are around 21,125 dollars per capita per family, and there is one very atypical value, an income of 75 million dollars.
Table 5 shows the ML estimates for the parameters of the G, Pareto and GHN models, as well as the values of the AIC and BIC criteria for each model.
We observe that the smallest values of the AIC and BIC criteria correspond to the G model, meaning that the G model fits the data better than the Pareto and GHN models. The SEs of the ML estimates for the G model were calculated as described in Section 4.1.
Table 6 presents estimates of the VaR for the G, Pareto, and GHN models at the levels 0.50, 0.60, 0.70, 0.80, 0.90, and 0.95, together with the empirical quantiles. It is well known that models with higher VaR values have heavier tails. Based on this characteristic, we can see that the G model has higher VaR values than the GHN model. On the other hand, the G model provides higher VaR values than the Pareto model up to the 0.70 level, since the Pareto VaRs explode at high (but not too high) quantiles. According to the AIC and BIC model selection criteria, the G model best fits the income data. Figure 7 shows the behaviour of the VaR values of the three models. Figure 8 shows the empirical cdf with the estimated G and Pareto cdfs, which also shows the excellent agreement between the G model and the income data.

6. Discussion

This paper presents a study of the G model, which arises in a characterization of the gamma distribution. The G model is a special case of the beta prime model, with a scale parameter added. The G model has two parameters, and this makes it an attractive competitor against various two-parameter models used in actuarial statistics. The G model appears to be a viable alternative for fitting data with extreme observations. Some other characteristics of the G model are:
  • The smaller the parameter α, the heavier the right tail of the G model.
  • The G model has an explicit stochastic representation, given in Proposition 1(b).
  • The cdf, hazard function and quantile function are explicit and are expressed in terms of known functions.
  • The VaR measure is explicit and is used to show that the right tail of the G model is heavy.
  • The applications show that the G model has its own character compared with other two-parameter models, and that the G model can be a good candidate for modelling income data with a heavy tail.

Author Contributions

Conceptualization, N.M.O. and E.G.-D.; methodology, N.M.O. and E.G.-D.; software, N.M.O. and E.G.-D.; validation, N.M.O. and E.G.-D. and O.V.; formal analysis, N.M.O. and O.V.; investigation, E.G.-D. and N.M.O.; writing—original draft preparation, N.M.O. and E.G.-D.; writing—review and editing, O.V. and E.G.-D.; funding acquisition, O.V. All authors have read and agreed to the published version of the manuscript.

Funding

The research of N.M. Olmos was supported by Semillero UA-2022.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data is available on the web site https://www.federalreserve.gov/econres/scfindex.htm (accessed on 5 January 2022).

Acknowledgments

We thank the reviewers and the editor for their valuable comments, which undoubtedly helped to improve the presentation of our results.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

The first derivatives of ℓ(θ) are given by
\frac{\partial \ell(\theta)}{\partial \sigma} = \frac{\alpha}{\sigma} - \frac{1}{\sigma + x}, \qquad \frac{\partial \ell(\theta)}{\partial \alpha} = \log \sigma - \log x + \psi(\bar{\alpha}) - \psi(\alpha).
The second derivatives of ℓ(θ) are:
\frac{\partial^2 \ell(\theta)}{\partial \sigma^2} = -\frac{\alpha}{\sigma^2} + \frac{1}{(\sigma + x)^2}, \qquad \frac{\partial^2 \ell(\theta)}{\partial \sigma\, \partial \alpha} = \frac{1}{\sigma}, \qquad \frac{\partial^2 \ell(\theta)}{\partial \alpha^2} = -\psi'(\bar{\alpha}) - \psi'(\alpha),
where ψ(·) and ψ′(·) are the digamma and trigamma functions, respectively.

References

  1. Ibragimov, R.; Prokhorov, A. Heavy Tails and Copulas: Topics in Dependence Modelling in Economics and Finance; World Scientific: Singapore, 2017.
  2. Olmos, N.M.; Varela, H.; Gómez, H.W.; Bolfarine, H. An extension of the half-normal distribution. Stat. Pap. 2012, 53, 875–886.
  3. Olmos, N.M.; Varela, H.; Bolfarine, H.; Gómez, H.W. An extension of the generalized half-normal distribution. Stat. Pap. 2014, 55, 967–981.
  4. Astorga, J.M.; Gómez, H.W.; Bolfarine, H. Slashed generalized exponential distribution. Commun. Stat. Theory Methods 2017, 46, 2091–2102.
  5. Gupta, R.D.; Kundu, D. Generalized exponential distributions. Aust. N. Z. J. Stat. 1999, 41, 173–188.
  6. Mudholkar, G.S.; Srivastava, D.K.; Freimer, M. The exponentiated Weibull family: A reanalysis of the Bus-Motor-Failure data. Technometrics 1995, 37, 436–445.
  7. Gómez, Y.M.; Bolfarine, H.; Gómez, H.W. Gumbel distribution with heavy tails and applications to environmental data. Math. Comput. Simul. 2019, 157, 115–129.
  8. Bhati, D.; Ravi, S. On generalized log-Moyal distribution: A new heavy tailed size distribution. Insur. Math. Econ. 2018, 79, 247–259.
  9. Afify, A.Z.; Gemeay, A.M.; Ibrahim, N.A. The Heavy-Tailed Exponential Distribution: Risk Measures, Estimation, and Application to Actuarial Data. Mathematics 2020, 8, 1276.
  10. Choulakian, V.; Stephens, M.A. Goodness-of-fit tests for the generalized Pareto distribution. Technometrics 2001, 43, 478–484.
  11. Zhang, J. Likelihood moment estimation for the generalized Pareto distribution. Aust. N. Z. J. Stat. 2007, 49, 69–77.
  12. Akinsete, A.; Famoye, F.; Lee, C. The Beta-Pareto distribution. Statistics 2008, 42, 547–563.
  13. Nassar, M.M.; Nada, N.K. The beta generalized Pareto distribution. J. Stat. Adv. Theory Appl. 2011, 6, 1–17.
  14. Mahmoudi, E. The beta generalized Pareto distribution with application to lifetime data. Math. Comp. Simul. 2011, 81, 2414–2430.
  15. Boumaraf, B.; Seddik-Ameur, N.; Barbu, V.S. Estimation of Beta-Pareto Distribution Based on Several Optimization Methods. Mathematics 2020, 8, 1055.
  16. Gleser, L.J. The Gamma Distribution as a Mixture of Exponential Distributions. Am. Stat. 1989, 43, 115–117.
  17. Keeping, E.S. Introduction to Statistical Inference; Van Nostrand: New York, NY, USA, 1962.
  18. Tiao, G.G.; Guttman, I. The Inverted Dirichlet Distribution with Applications. J. Am. Stat. Assoc. 1965, 60, 793–805.
  19. Glaser, R.E. Bathtub and Related Failure Rate Characterizations. J. Am. Stat. Assoc. 1980, 75, 667–672.
  20. Artzner, P. Application of coherent risk measures to capital requirements in insurance. N. Am. Actuar. J. 1999, 3, 11–25.
  21. Artzner, P.; Delbaen, F.; Eber, J.M.; Heath, D. Coherent measures of risk. Math. Financ. 1999, 9, 203–228.
  22. Shannon, C.E. A Mathematical Theory of Communication. Bell Syst. Tech. J. 1948, 27, 379–423.
  23. Rolski, T.; Schmidli, H.; Schmidt, V.; Teugels, J. Stochastic Processes for Insurance and Finance; John Wiley & Sons: Hoboken, NJ, USA, 1999.
  24. Foss, S.; Korshunov, D.; Zachary, S. An Introduction to Heavy-Tailed and Subexponential Distributions; Springer Series in Operations Research and Financial Engineering; Springer: New York, NY, USA, 2011.
  25. Bingham, N. Regular Variation; Cambridge University Press: Cambridge, UK, 1987.
  26. Konstantinides, D. Risk Theory. A Heavy Tail Approach; World Scientific Publishing: Singapore, 2018.
  27. Jessen, A.H.; Mikosch, T. Regularly varying functions. Publ. Inst. Math. Nouvelle Série 2006, 80, 171–192.
  28. Embrechts, P.; Goldie, C.M. On closure and factorization properties of subexponential and related distributions. J. Aust. Math. Soc. 1980, 29, 243–256.
  29. Embrechts, P.; Goldie, C.M. On convolution tails. Stoch. Process. Their Appl. 1982, 13, 263–278.
  30. Boland, P. Statistical and Probabilistic Methods in Actuarial Science; Chapman & Hall: Boca Raton, FL, USA, 2007.
  31. Hogg, R.V.; Klugman, S.A. Loss Distributions; John Wiley & Sons: Hoboken, NJ, USA, 1984.
  32. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2021. Available online: https://www.R-project.org/ (accessed on 15 January 2022).
  33. Lehmann, E.L.; Casella, G. Theory of Point Estimation; Springer: New York, NY, USA, 1998.
  34. Akaike, H. A new look at the statistical model identification. IEEE Trans. Automat. Contr. 1974, 19, 716–723.
  35. Schwarz, G. Estimating the dimension of a model. Ann. Statist. 1978, 6, 461–464.
  36. Cooray, K.; Ananda, M.M.A. A generalization of the half-normal distribution with applications to lifetime data. Commun. Stat. Theory Methods 2008, 37, 1323–1337.
  37. Arnold, B.C. Pareto Distributions. In Monographs on Statistics & Applied Probability, 2nd ed.; Chapman & Hall: Boca Raton, FL, USA, 2015.
Figure 1. Examples of G( 1 , 0.3 ), G( 1 , 0.7 ) and G( 1 , 0.9 ).
Figure 2. Examples of h ( t ) , for α = 0.3 , α = 0.7 , α = 0.9 .
Figure 3. Plots of the skewness and kurtosis coefficients for the G model.
Figure 4. Plots of the VaR, for G( 10 , 0.3 ), G( 10 , 0.5 ) and G( 10 , 0.8 ).
Figure 5. Shannon entropy of G ( α , σ = 1 ) for different values of α .
Figure 6. Boxplot for income data (top) and boxplot for income data without extreme data (bottom).
Figure 7. Plots of the VaR using the values in Table 6, for G, Pareto and GHN models.
Figure 8. Plots of the empirical cdf. with estimated G cdf and estimated Pareto cdf models.
Table 1. Tails comparison.
Distribution P ( X > 2 ) P ( X > 3 ) P ( X > 4 )
G(1, 0.9) 0.04803 0.03537 0.02818
G(1, 0.7) 0.19065 0.15106 0.12697
G(1, 0.3) 0.63376 0.57717 0.53760
Table 2. Empirical bias, SE, RMSE and 95% CP for the ML estimators of σ and α in the G distribution with different combinations of parameters.
True Value                     n = 100                              n = 200                              n = 500
σ    α    Par.   Bias      SE       RMSE     CP        Bias      SE       RMSE     CP        Bias      SE       RMSE     CP
1    0.2  σ      0.1934    0.5709   0.6543   0.9402    0.1113    0.3727   0.4083   0.9474    0.0494    0.2219   0.2292   0.9540
          α      0.0054    0.0260   0.0270   0.9618    0.0039    0.0181   0.0186   0.9562    0.0022    0.0113   0.0116   0.9542
     0.5  σ      0.2191    0.7544   0.8668   0.9004    0.1006    0.4920   0.5425   0.9156    0.0468    0.3010   0.3131   0.9370
          α      0.0004    0.0679   0.0667   0.9258    -0.0012   0.0496   0.0491   0.9308    0.0005    0.0322   0.0320   0.9424
     0.8  σ      0.0727    0.5090   0.5498   0.8988    0.0348    0.3463   0.3602   0.9216    0.0118    0.2137   0.2143   0.9432
          α      -0.0044   0.0258   0.0271   0.9550    -0.0022   0.0179   0.0183   0.9496    -0.0010   0.0112   0.0114   0.9510
10   0.2  σ      1.9401    5.7013   6.4320   0.9394    1.1103    3.7292   4.0676   0.9486    0.4651    2.2140   2.2556   0.9528
          α      0.0052    0.0260   0.0268   0.9562    0.0037    0.0181   0.0189   0.9492    0.0020    0.0113   0.0112   0.9560
     0.5  σ      1.9611    7.4418   8.4714   0.8872    1.0610    4.9654   5.2373   0.9176    0.4148    2.9938   3.0764   0.9360
          α      -0.0014   0.0681   0.0668   0.9216    0.0002    0.0497   0.0490   0.9344    0.0000    0.0322   0.0315   0.9484
     0.8  σ      0.6832    5.0843   5.3895   0.8952    0.3724    3.4679   3.5625   0.9236    0.1692    2.1465   2.1721   0.9406
          α      -0.0047   0.0259   0.0274   0.9510    -0.0020   0.0179   0.0181   0.9544    -0.0008   0.0112   0.0113   0.9534
Table 3. 200 simulated data: Model, ML estimates, AIC and BIC values.
Model           ML Estimates                        AIC        BIC
G(σ, α)         σ̂ = 0.890, α̂ = 0.295               1967.556   1974.152
Pareto(α, β)    α̂ = 0.002, β̂ = 0.121               2144.958   2151.555
GHN(σ, α)       σ̂ = 204.871, α̂ = 0.144             2143.358   2151.787
Table 4. Descriptive statistics for income data.
n      Median    Mean      Variance      CS      CK
500    21.125    216.709   11,270,001    0.435   1.655
Table 5. ML estimates for the income data with corresponding standard errors (in parentheses), and AIC and BIC values.
Model           ML Estimates                                  AIC        BIC
G(σ, α)         σ̂ = 21.555 (6.264), α̂ = 0.497 (0.033)        5139.176   5147.605
Pareto(α, β)    α̂ = 0.065 (0.001), β̂ = 0.171 (0.008)         5867.910   5876.339
GHN(σ, α)       σ̂ = 64.313 (6.815), α̂ = 0.335 (0.008)        5280.360   5288.789
Table 6. Comparison of VaR of different models for income data and empirical quantiles in parentheses.
Model \ Significance    0.50 (21.125)   0.60 (28.600)   0.70 (40.000)   0.80 (60.000)   0.90 (100.000)   0.95 (210.250)
G(σ̂, α̂)                22.034          41.764          85.060          209.891         889.770          3632.538
Pareto(α̂, β̂)           3.744           13.805          74.248          795.165         45,800.120       2,638,007
GHN(σ̂, α̂)              19.836          38.426          71.568          134.928         284.355          479.988
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
