Article

On the Bivariate Composite Gumbel–Pareto Distribution

1 Doctoral School, Ovidius University of Constanta, 900527 Constanta, Romania
2 Department of Econometrics, RISKcenter-IREA, Universitat de Barcelona, 08034 Barcelona, Spain
3 Faculty of Mathematics and Computer Science, Ovidius University of Constanta, 900527 Constanta, Romania
* Author to whom correspondence should be addressed.
Stats 2022, 5(4), 948-969; https://doi.org/10.3390/stats5040055
Submission received: 31 August 2022 / Revised: 30 September 2022 / Accepted: 13 October 2022 / Published: 16 October 2022

Abstract

In this paper, we propose a bivariate extension of univariate composite (two-spliced) distributions defined by a bivariate Pareto distribution for values larger than some thresholds and by a bivariate Gumbel distribution on the complementary domain. The purpose of this distribution is to capture the behavior of bivariate data consisting of mainly small and medium values but also of some extreme values. Some properties of the proposed distribution are presented. Further, two estimation procedures are discussed and illustrated on simulated data and on a real data set consisting of a bivariate sample of claims from an auto insurance portfolio. In addition, the risk of loss in this insurance portfolio is estimated by Monte Carlo simulation.

1. Introduction

Dependent multivariate data frequently occur in practice in areas such as insurance, finance, economics, reliability, etc. Therefore, the development of bivariate and multivariate distributions is a very active field of research, especially since—in contrast to univariate distributions—it gained interest later on. Nowadays, there are various methods of constructing multivariate distributions, see e.g., the review [1]. Some of these methods follow lines from the univariate distributions. In this sense, in this paper, we propose a bivariate composite distribution built on the same idea as the univariate composite (or two-spliced) distribution (see [2] for the splicing method in the univariate case).
Two-component spliced distributions are usually encountered in univariate extreme value theory, where a classical heavy-tailed distribution (such as the generalized Pareto) is used to model the tail, in combination with a less heavy-tailed distribution used for the so-called bulk model; see, e.g., the review [3]. More precisely, such a distribution is defined from two different distributions on distinct intervals, with the aim to better capture tails of distributions such as the loss ones. A two-component spliced distribution was called composite in [4], where a particular form of such distribution, namely the lognormal–Pareto composite distribution, was studied in connection with skewed and heavy-tailed loss data.
Therefore, the bivariate distribution we propose equals a certain bivariate distribution on one domain and another bivariate distribution on another domain. More precisely, we aim at using a more heavy-tailed bivariate distribution beyond some thresholds, such as the Pareto one. As in the univariate case, the motivation of such a model is to better capture the behavior of dependent random data that present many small and medium pairs of values but also some very large ones; we note that this could be the case with, e.g., insurance or financial data arising from two dependent lines of business. In this sense, we recall the discussions in [5,6], where it was noticed that for the particular bivariate insurance data set under study (consisting of auto claims, property damage costs, and medical expenses), the best globally fitted distribution does not provide the best model for tail risk measures because a heavier-tailed distribution is needed.
Thus, in this paper, we consider the bivariate Pareto distribution of the first kind for the tail (i.e., from some thresholds on) and the bivariate Gumbel exponential distribution for the remaining domain. In Section 2, we define some notation and recall the just-mentioned bivariate distributions. In Section 3, we define the general bivariate composite distribution, while in Section 4, we introduce the particular composite Gumbel–Pareto distribution and study some continuity conditions, marginal distributions, and moments. Further, we discuss simulation from this particular bivariate distribution, and in order to reduce the computing time, we propose two procedures for parameter estimation: the first one is based on marginal estimation and completed by a limited full Maximum Likelihood Estimation (MLE), and the second one is based on conditional MLE. The estimation procedures are illustrated on simulated data in Section 5.1 and on a real auto insurance data set in Section 5.2, followed by a conclusions section. The paper ends with Appendix A containing the proofs.

2. Preliminaries

2.1. Notation

We shall use the incomplete gamma function (or generalized plica function) defined by
$$\Gamma(\alpha, z_0, z_1) = \int_{z_0}^{z_1} x^{\alpha-1} e^{-x}\,dx, \quad z_1 > z_0 \ge 0.$$
We also introduce the notation
$$\Gamma(\alpha, z_0, z_1; k) = \int_{z_0}^{z_1} x^{\alpha-1} e^{-kx}\,dx, \quad z_1 > z_0 \ge 0,\ k > 0,$$
and note that
$$\Gamma(\alpha, z_0, z_1; k) = \frac{1}{k^{\alpha}}\,\Gamma(\alpha, k z_0, k z_1).$$
We recall the exponential integral notation
$$E_1(z) = \int_{z}^{\infty} \frac{e^{-t}}{t}\,dt.$$
The following result holds (its proof is given in Appendix A).
Lemma 1. 
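With the above notation, with $0 \le z_0 < z_1 \le \infty$,
$$\begin{aligned}
(i)\quad & \Gamma(-1, z_0, z_1) = \frac{e^{-z_0}}{z_0} - \frac{e^{-z_1}}{z_1} - \Gamma(0, z_0, z_1), \quad z_0 > 0. \\
(ii)\quad & \Gamma(1, z_0, z_1; k) = \frac{1}{k}\left(e^{-k z_0} - e^{-k z_1}\right). \\
(iii)\quad & \Gamma(2, z_0, z_1; k) = \frac{1}{k^2}\left[e^{-k z_0}(k z_0 + 1) - e^{-k z_1}(k z_1 + 1)\right].
\end{aligned}$$
In particular,
$$(iii.1)\quad \Gamma(2, 0, \theta; k) = \frac{1}{k^2}\left[1 - (1 + k\theta)\, e^{-k\theta}\right], \qquad (iii.2)\quad \Gamma(2, \theta, \infty; k) = \frac{e^{-k\theta}}{k^2}(1 + k\theta).$$
$$(iv)\quad \Gamma(3, z_0, z_1; k) = \frac{1}{k^3}\left[e^{-k z_0}\left(k^2 z_0^2 + 2 k z_0 + 2\right) - e^{-k z_1}\left(k^2 z_1^2 + 2 k z_1 + 2\right)\right].$$
In particular,
$$(iv.1)\quad \Gamma(3, 0, \theta; k) = \frac{1}{k^3}\left[2 - \left(2 + 2k\theta + k^2\theta^2\right) e^{-k\theta}\right], \qquad (iv.2)\quad \Gamma(3, \theta, \infty; k) = \frac{1}{k^3}\, e^{-k\theta}\left(k^2\theta^2 + 2k\theta + 2\right) = \frac{e^{-k\theta}}{k^3}\left[(k\theta + 1)^2 + 1\right].$$
$$(v)\quad \int_{c}^{\infty} \frac{e^{-ky}}{y}\,dy = E_1(ck), \quad k > 0, \qquad (vi)\quad \int_{c}^{\infty} \frac{e^{-ky}}{y^2}\,dy = \frac{e^{-ck}}{c} - k\,E_1(ck).$$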
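The closed forms (iii) and (iv) of Lemma 1 can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names are hypothetical, and a composite Simpson rule is assumed as the reference quadrature.

```python
import math

def gamma_k_numeric(alpha, z0, z1, k, n=2000):
    """Composite Simpson approximation of the integral of
    x^(alpha-1) * exp(-k x) over [z0, z1] (finite z1, n even)."""
    f = lambda x: x ** (alpha - 1) * math.exp(-k * x)
    h = (z1 - z0) / n
    total = f(z0) + f(z1)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(z0 + i * h)
    return total * h / 3

def gamma2_closed(z0, z1, k):
    """Lemma 1 (iii): closed form of Gamma(2, z0, z1; k)."""
    return (math.exp(-k * z0) * (k * z0 + 1)
            - math.exp(-k * z1) * (k * z1 + 1)) / k ** 2

def gamma3_closed(z0, z1, k):
    """Lemma 1 (iv): closed form of Gamma(3, z0, z1; k)."""
    poly = lambda z: math.exp(-k * z) * (k * k * z * z + 2 * k * z + 2)
    return (poly(z0) - poly(z1)) / k ** 3

if __name__ == "__main__":
    print(gamma_k_numeric(2, 0.5, 3.0, 1.7), gamma2_closed(0.5, 3.0, 1.7))
    print(gamma_k_numeric(3, 0.5, 3.0, 1.7), gamma3_closed(0.5, 3.0, 1.7))
```

The two printed pairs should agree to many decimal places; the particular cases (iii.1) and (iv.1) follow by setting `z0 = 0`.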
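For $\theta_1 > 0$, $\theta_2 > 0$, we also define the following domains
$$\begin{aligned}
D_{11} &= \{(x_1, x_2) \mid 0 < x_1 \le \theta_1,\ 0 < x_2 \le \theta_2\}, \\
D_{12} &= \{(x_1, x_2) \mid 0 < x_1 \le \theta_1,\ x_2 > \theta_2\}, \\
D_{21} &= \{(x_1, x_2) \mid x_1 > \theta_1,\ 0 < x_2 \le \theta_2\}, \\
D_{22} &= \{(x_1, x_2) \mid x_1 > \theta_1,\ x_2 > \theta_2\}, \\
D &= D_{11} \cup D_{12} \cup D_{21}.
\end{aligned}$$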

2.2. Bivariate Classical Distributions

The following two bivariate continuous distributions are used in the bivariate composite model.

2.2.1. Gumbel’s Bivariate Exponential Distribution, $Gu_2$

Gumbel’s [7] bivariate exponential distribution has pdf (see also [8])
$$e^{-(x_1 + x_2 + \beta x_1 x_2)}\left[(1 + \beta x_1)(1 + \beta x_2) - \beta\right], \quad x_1 > 0,\ x_2 > 0,$$
with standard exponential marginal distributions. We shall, however, consider a more general bivariate pdf, having general exponential marginal distributions (see e.g., [9]). Therefore, let $Y = (Y_1, Y_2)$ follow Gumbel’s bivariate exponential distribution, $Y \sim Gu_2(\lambda_1, \lambda_2, \beta)$, $\lambda_1, \lambda_2 > 0$, $0 \le \beta \le 1$, defined by the joint pdf
$$g_Y(x_1, x_2) = \lambda_1 \lambda_2\, e^{-(\lambda_1 x_1 + \lambda_2 x_2 + \beta \lambda_1 \lambda_2 x_1 x_2)}\left[(1 + \beta \lambda_1 x_1)(1 + \beta \lambda_2 x_2) - \beta\right], \quad x_1 > 0,\ x_2 > 0. \tag{1}$$
Its cdf is given by
$$F_Y(x_1, x_2) = \Pr(Y_1 \le x_1, Y_2 \le x_2) = 1 - e^{-\lambda_1 x_1} - e^{-\lambda_2 x_2} + e^{-(\lambda_1 x_1 + \lambda_2 x_2 + \beta \lambda_1 \lambda_2 x_1 x_2)};$$
its joint survival function is
$$\bar{F}_Y(x_1, x_2) = \Pr(Y_1 > x_1, Y_2 > x_2) = e^{-(\lambda_1 x_1 + \lambda_2 x_2 + \beta \lambda_1 \lambda_2 x_1 x_2)};$$
while the marginal distributions are exponentials with pdf $g_{Y_i}(x_i) = \lambda_i e^{-\lambda_i x_i}$, $x_i > 0$, cdf $G_{Y_i}(x_i) = 1 - e^{-\lambda_i x_i}$, and expected value $\lambda_i^{-1}$, $i = 1, 2$.
In view of the bivariate composite model defined in the next section, an easy calculation yields the following lemma.
Lemma 2. 
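Let $Y \sim Gu_2(\lambda_1, \lambda_2, \beta)$ and $\theta_1 > 0$, $\theta_2 > 0$. Then, with the above notation, it holds that
$$P_D = \Pr(Y \in D) = 1 - e^{-(\lambda_1 \theta_1 + \lambda_2 \theta_2 + \beta \lambda_1 \lambda_2 \theta_1 \theta_2)}. \tag{3}$$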
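As an illustration of the Gumbel formulas above and of Lemma 2, here is a minimal sketch in Python. The function names are hypothetical and not from the paper; the parameter values in the usage lines are the ones used later in Section 5.1.

```python
import math

def gumbel_sf(x1, x2, lam1, lam2, beta):
    """Joint survival function Pr(Y1 > x1, Y2 > x2) of Gu_2(lam1, lam2, beta)."""
    u, v = lam1 * x1, lam2 * x2
    return math.exp(-(u + v + beta * u * v))

def gumbel_cdf(x1, x2, lam1, lam2, beta):
    """Joint cdf Pr(Y1 <= x1, Y2 <= x2)."""
    return (1.0 - math.exp(-lam1 * x1) - math.exp(-lam2 * x2)
            + gumbel_sf(x1, x2, lam1, lam2, beta))

def gumbel_pdf(x1, x2, lam1, lam2, beta):
    """Joint pdf (1)."""
    u, v = lam1 * x1, lam2 * x2
    return (lam1 * lam2 * math.exp(-(u + v + beta * u * v))
            * ((1 + beta * u) * (1 + beta * v) - beta))

def prob_D(theta1, theta2, lam1, lam2, beta):
    """Lemma 2: P_D = Pr(Y in D) = 1 - survival function at the thresholds."""
    return 1.0 - gumbel_sf(theta1, theta2, lam1, lam2, beta)

if __name__ == "__main__":
    # Section 5.1 values: lam1 = 1, lam2 = 1.2, beta = 0.7, theta1 = 1.2, theta2 = 1
    print(round(prob_D(1.2, 1.0, 1.0, 1.2, 0.7), 4))
```

With these inputs the printed value should agree with the $P_D = 0.9669$ reported in Section 5.1.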
The next lemmas are also needed.
Lemma 3. 
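If $Y \sim Gu_2(\lambda_1, \lambda_2, \beta)$ and $\theta_1 > 0$, $\theta_2 > 0$, then
$$\begin{aligned}
L_1(x_1; \theta_2) &= \int_{0}^{\theta_2} g_Y(x_1, x_2)\,dx_2 = \lambda_1 e^{-\lambda_1 x_1}\left[1 - (1 + \beta \lambda_2 \theta_2)\, e^{-\lambda_2 \theta_2 (1 + \beta \lambda_1 x_1)}\right], \quad x_1 > 0, \\
L_2(x_2; \theta_1) &= \int_{0}^{\theta_1} g_Y(x_1, x_2)\,dx_1 = \lambda_2 e^{-\lambda_2 x_2}\left[1 - (1 + \beta \lambda_1 \theta_1)\, e^{-\lambda_1 \theta_1 (1 + \beta \lambda_2 x_2)}\right], \quad x_2 > 0.
\end{aligned}$$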
Lemma 4. 
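If $Y \sim Gu_2(\lambda_1, \lambda_2, \beta)$ and $\theta_1 > 0$, $\theta_2 > 0$, then
$$I(\theta_1, \theta_2) = \int_{\theta_1}^{\infty}\!\int_{\theta_2}^{\infty} x_1 x_2\, g_Y(x_1, x_2)\,dx_1\,dx_2 = \frac{1}{\beta \lambda_1 \lambda_2}\left\{\left[2 - \frac{1}{1 + \beta \lambda_1 \theta_1} - \frac{1}{1 + \beta \lambda_2 \theta_2} + \beta \lambda_1 \lambda_2 \theta_1 \theta_2\right] e^{-(\lambda_1 \theta_1 + \lambda_2 \theta_2 + \beta \lambda_1 \lambda_2 \theta_1 \theta_2)} + E_1\!\left(\frac{(1 + \beta \lambda_1 \theta_1)(1 + \beta \lambda_2 \theta_2)}{\beta}\right) e^{1/\beta}\right\}.$$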
Lemma 5. 
Given that $Y_1 = y_1$, the conditional cdf of the marginal $Y_2$ of $Y \sim Gu_2(\lambda_1, \lambda_2, \beta)$ is
$$F_{Y_2 \mid Y_1 = y_1}(y_2) = 1 - (1 + \beta \lambda_2 y_2)\, e^{-\lambda_2 y_2 (1 + \beta \lambda_1 y_1)}.$$

2.2.2. Bivariate Pareto Distribution of the First Kind, $PaI_2$

Let $Z = (Z_1, Z_2)$ follow the bivariate Pareto distribution of the first kind, $Z \sim PaI_2(a, \theta_1, \theta_2)$, $a > 0$, $\theta_1 > 0$, $\theta_2 > 0$. Its pdf is (see [10])
$$f_Z(x_1, x_2) = a(a + 1)(\theta_1 \theta_2)^{a+1}\left(\theta_2 x_1 + \theta_1 x_2 - \theta_1 \theta_2\right)^{-(a+2)}, \quad x_1 > \theta_1,\ x_2 > \theta_2. \tag{4}$$
Its marginal distributions are univariate Pareto of the first kind, having pdf and cdf, respectively,
$$f_{Z_i}(x_i) = \frac{a\, \theta_i^{a}}{x_i^{a+1}}, \qquad F_{Z_i}(x_i) = 1 - \left(\frac{\theta_i}{x_i}\right)^{a}, \quad x_i > \theta_i,\ i = 1, 2.$$
Moreover, we recall the formulas of the expected values and variances
$$E[Z_i] = \frac{a\, \theta_i}{a - 1},\ a > 1, \qquad Var(Z_i) = \frac{a\, \theta_i^2}{(a - 1)^2 (a - 2)},\ a > 2, \quad i = 1, 2,$$
while the formula of the covariance is
$$cov(Z_1, Z_2) = \frac{\theta_1 \theta_2}{(a - 1)^2 (a - 2)}.$$
From here, it is easy to see that
$$E[Z_1 Z_2] = \theta_1 \theta_2\, \frac{a^2 - a - 1}{(a - 1)(a - 2)}.$$
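The last identity follows from $E[Z_1 Z_2] = cov(Z_1, Z_2) + E[Z_1]E[Z_2]$ and the factorization $a^3 - 2a^2 + 1 = (a - 1)(a^2 - a - 1)$. A small numerical check, written as a sketch with a hypothetical function name and assuming $a > 2$ so that all moments exist:

```python
def pareto_moments(a, theta1, theta2):
    """Moments of PaI_2(a, theta1, theta2) from the formulas above (requires a > 2)."""
    if a <= 2:
        raise ValueError("second-order moments require a > 2")
    mean1 = a * theta1 / (a - 1)
    mean2 = a * theta2 / (a - 1)
    cov = theta1 * theta2 / ((a - 1) ** 2 * (a - 2))
    cross = theta1 * theta2 * (a * a - a - 1) / ((a - 1) * (a - 2))
    return {"mean1": mean1, "mean2": mean2, "cov": cov, "cross": cross}

if __name__ == "__main__":
    m = pareto_moments(3.5, 1.2, 1.0)
    # cov + mean1 * mean2 should coincide with the closed form of E[Z1 Z2]
    print(m["cov"] + m["mean1"] * m["mean2"], m["cross"])
```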

3. A Bivariate Composite Model

We shall now define the bivariate composite model. Let $X = (X_1, X_2)$ be a bivariate random vector, and let $\theta_1, \theta_2 \in \mathbb{R}$. We say that X follows a bivariate composite distribution if its pdf is defined as
$$f(x_1, x_2) = \begin{cases} r f_1(x_1, x_2), & (x_1 \le \theta_1, x_2 \le \theta_2) \text{ or } (x_1 \le \theta_1, x_2 > \theta_2) \text{ or } (x_1 > \theta_1, x_2 \le \theta_2) \\ (1 - r) f_2(x_1, x_2), & x_1 > \theta_1,\ x_2 > \theta_2 \end{cases} = \begin{cases} r f_1(x_1, x_2), & (x_1, x_2) \in D \\ (1 - r) f_2(x_1, x_2), & (x_1, x_2) \in D_{22}, \end{cases} \tag{6}$$
where $0 \le r \le 1$ is a normalizing constant. We note that, in general, $f_1$ and $f_2$ are pdfs of distributions truncated on the domains $D$ and $D_{22}$, respectively. Therefore, we can rewrite this composite distribution as a two-component mixture model with mixing weights $r$ and $1 - r$, i.e.,
$$f(x_1, x_2) = r f_1(x_1, x_2) + (1 - r) f_2(x_1, x_2). \tag{7}$$
This form can be used for random number generation.
We would like our pdf to be at least continuous. However, in this case, the bivariate density changes shape on the line segments $\{x_1 = \theta_1, x_2 > \theta_2\}$ and $\{x_1 > \theta_1, x_2 = \theta_2\}$, which generally restricts the continuity condition; more precisely, imposing continuity on, e.g., the first segment, results in
$$r f_1(\theta_1, x_2) = (1 - r) f_2(\theta_1, x_2), \tag{8}$$
which, in general, cannot be satisfied for all $x_2 > \theta_2$. We can impose a continuity condition at $(\theta_1, \theta_2)$ and obtain the restriction for r
$$r = \left[1 + \frac{f_1(\theta_1, \theta_2)}{f_2(\theta_1, \theta_2)}\right]^{-1}.$$
We can also impose continuity conditions to the marginal pdfs, since each one is two-spliced as we see in next section.

4. Particular Case: Bivariate Composite Gumbel–Pareto Distribution

In particular, we shall assume that $f_1$ is the pdf of a Gumbel bivariate exponential distribution truncated on the domain D, and that $f_2$ is a bivariate Pareto pdf defined on $D_{22}$, which is left truncated by its nature. Therefore, let $\theta_1 > 0$, $\theta_2 > 0$, and let $Y = (Y_1, Y_2)$ follow Gumbel’s bivariate distribution (1) truncated on the domain $D$, with parameters $\lambda_1, \lambda_2 > 0$, $\beta \in [0, 1]$, and having pdf
$$f_1(x_1, x_2) = \frac{g_Y(x_1, x_2)}{P_D}, \quad (x_1, x_2) \in D.$$
Additionally, let $Z = (Z_1, Z_2) \sim PaI_2(a, \theta_1, \theta_2)$, $a > 0$. Then, using $P_D$ from (3), the pdf (6) of X becomes
$$f(x_1, x_2) = \begin{cases} r\, \dfrac{\lambda_1 \lambda_2\, e^{-(\lambda_1 x_1 + \lambda_2 x_2 + \beta \lambda_1 \lambda_2 x_1 x_2)}}{1 - e^{-(\lambda_1 \theta_1 + \lambda_2 \theta_2 + \beta \lambda_1 \lambda_2 \theta_1 \theta_2)}}\left[(1 + \beta \lambda_1 x_1)(1 + \beta \lambda_2 x_2) - \beta\right], & (x_1, x_2) \in D \\ (1 - r)\, a(a + 1)(\theta_1 \theta_2)^{a+1}\left(\theta_2 x_1 + \theta_1 x_2 - \theta_1 \theta_2\right)^{-(a+2)}, & (x_1, x_2) \in D_{22}. \end{cases} \tag{10}$$
Note that by taking $r = 0$, we obtain the bivariate Pareto pdf; with $r = 1$, we obtain the bivariate Gumbel truncated on the domain D; if we take $r = 1$ and let $\theta_1, \theta_2 \to \infty$ (so that $P_D \to 1$ and D covers the whole positive quadrant), (10) reduces to the usual Gumbel pdf. If $\beta = 0$, the Gumbel component becomes the bivariate exponential with independent marginals.
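If we impose the continuity condition at $(\theta_1, \theta_2)$, we obtain the following formula of r
$$r = \left[1 + \frac{g_Y(\theta_1, \theta_2)}{f_2(\theta_1, \theta_2)\, P_D}\right]^{-1} = \left[1 + \frac{(1 + \beta \lambda_1 \theta_1)(1 + \beta \lambda_2 \theta_2) - \beta}{\left(e^{\lambda_1 \theta_1 + \lambda_2 \theta_2 + \beta \lambda_1 \lambda_2 \theta_1 \theta_2} - 1\right)(a + 1)\, a}\, \lambda_1 \lambda_2 \theta_1 \theta_2\right]^{-1}. \tag{11}$$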
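The composite pdf (10) and the weight (11) can be sketched as follows. This is an illustrative implementation with hypothetical function names, and the parameter values in the usage lines are arbitrary; it shows that with r taken from (11) the two branches of the density agree at the corner $(\theta_1, \theta_2)$.

```python
import math

def composite_pdf(x1, x2, lam1, lam2, beta, a, theta1, theta2, r):
    """Bivariate composite Gumbel-Pareto pdf (10)."""
    if x1 <= 0 or x2 <= 0:
        return 0.0
    if x1 > theta1 and x2 > theta2:
        # Pareto branch on D22
        base = theta2 * x1 + theta1 * x2 - theta1 * theta2
        return (1 - r) * a * (a + 1) * (theta1 * theta2) ** (a + 1) * base ** (-(a + 2))
    # Gumbel branch on D, truncated through division by P_D
    e_theta = lam1 * theta1 + lam2 * theta2 + beta * lam1 * lam2 * theta1 * theta2
    p_d = 1.0 - math.exp(-e_theta)
    u, v = lam1 * x1, lam2 * x2
    g = lam1 * lam2 * math.exp(-(u + v + beta * u * v)) * ((1 + beta * u) * (1 + beta * v) - beta)
    return r * g / p_d

def r_point_continuity(lam1, lam2, beta, a, theta1, theta2):
    """Mixing weight r from (11): continuity of the pdf at (theta1, theta2)."""
    u, v = lam1 * theta1, lam2 * theta2
    e_theta = u + v + beta * u * v
    ratio = ((1 + beta * u) * (1 + beta * v) - beta) / (math.exp(e_theta) - 1) * u * v / (a * (a + 1))
    return 1.0 / (1.0 + ratio)

if __name__ == "__main__":
    pars = dict(lam1=0.8, lam2=1.5, beta=0.3, a=1.7, theta1=2.0, theta2=1.0)
    r = r_point_continuity(**pars)
    eps = 1e-9
    print(composite_pdf(2.0, 1.0, r=r, **pars))              # Gumbel side
    print(composite_pdf(2.0 + eps, 1.0 + eps, r=r, **pars))  # Pareto side
```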
In the left side of Figure 1, we plotted a composite Gumbel–Pareto pdf satisfying marginal continuity conditions and the continuity condition at $(\theta_1, \theta_2)$; see (iii) in Proposition 2. However, as discussed above, this pdf is not continuous everywhere; e.g., the continuity condition (8) becomes, in this case, $\frac{r}{P_D}\, g_Y(\theta_1, x_2) = (1 - r) f_Z(\theta_1, x_2)$, which, given the pdfs $g_Y$ and $f_Z$, cannot be satisfied for all $x_2 > \theta_2$. This can be seen from the right plot of the same figure, where we focused better on the threshold lines $\{x_1 = \theta_1, x_2 > \theta_2\}$ and $\{x_1 > \theta_1, x_2 = \theta_2\}$.
In Figure 2, we plotted another composite Gumbel–Pareto pdf with different parameters and all continuity conditions, having a more heavy-tailed Pareto component ($a < 1$). A certain flexibility of the pdf’s shape can be noticed from the two plots. However, in both pdf plots, note the areas of strong decrease for small values of $x_1$ and $x_2$ due to the exponential characteristic of the Gumbel distribution.
We also plotted in Figure 3 the marginal pdfs of the two composite Gumbel–Pareto distributions considered in Figure 1 and Figure 2, and we note their continuity and exponential type shapes for small values of x.

4.1. Some Properties

The marginal distributions of X are both of univariate composite type, having a standard exponential pdf up to the threshold.
Proposition 1. 
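(i) For the composite Gumbel–Pareto distribution, the marginal pdfs of $X_1$ and $X_2$ are given by
$$f_{X_1}(x_1) = \begin{cases} \dfrac{r}{P_D}\, \lambda_1 e^{-\lambda_1 x_1}, & 0 < x_1 \le \theta_1 \\ \dfrac{r}{P_D}\, \lambda_1 e^{-\lambda_1 x_1}\left[1 - (1 + \beta \lambda_2 \theta_2)\, e^{-\lambda_2 \theta_2 (1 + \beta \lambda_1 x_1)}\right] + (1 - r)\, \dfrac{a\, \theta_1^{a}}{x_1^{a+1}}, & x_1 > \theta_1, \end{cases}$$
$$f_{X_2}(x_2) = \begin{cases} \dfrac{r}{P_D}\, \lambda_2 e^{-\lambda_2 x_2}, & 0 < x_2 \le \theta_2 \\ \dfrac{r}{P_D}\, \lambda_2 e^{-\lambda_2 x_2}\left[1 - (1 + \beta \lambda_1 \theta_1)\, e^{-\lambda_1 \theta_1 (1 + \beta \lambda_2 x_2)}\right] + (1 - r)\, \dfrac{a\, \theta_2^{a}}{x_2^{a+1}}, & x_2 > \theta_2. \end{cases}$$
(ii) Further, the cdfs of $X_1$ and $X_2$ are
$$F_{X_1}(x_1) = \begin{cases} \dfrac{r}{P_D}\left(1 - e^{-\lambda_1 x_1}\right), & 0 < x_1 \le \theta_1 \\ 1 + \dfrac{r}{P_D}\, e^{-\lambda_1 x_1}\left[e^{-\lambda_2 \theta_2 (1 + \beta \lambda_1 x_1)} - 1\right] - (1 - r)\left(\dfrac{\theta_1}{x_1}\right)^{a}, & x_1 > \theta_1, \end{cases}$$
$$F_{X_2}(x_2) = \begin{cases} \dfrac{r}{P_D}\left(1 - e^{-\lambda_2 x_2}\right), & 0 < x_2 \le \theta_2 \\ 1 + \dfrac{r}{P_D}\, e^{-\lambda_2 x_2}\left[e^{-\lambda_1 \theta_1 (1 + \beta \lambda_2 x_2)} - 1\right] - (1 - r)\left(\dfrac{\theta_2}{x_2}\right)^{a}, & x_2 > \theta_2. \end{cases}$$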
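The marginal cdf of $X_1$ from Proposition 1 has no closed-form inverse on $x_1 > \theta_1$, so a numerical inversion is needed when it is used for simulation. Below is a minimal sketch, assuming plain bisection as the root finder (the paper does not specify one); the function names are hypothetical and the usage values are those of Section 5.1.

```python
import math

def marginal_cdf_x1(x1, lam1, lam2, beta, a, theta1, theta2, r):
    """Cdf of X1 from Proposition 1 (ii)."""
    if x1 <= 0:
        return 0.0
    e_theta = lam1 * theta1 + lam2 * theta2 + beta * lam1 * lam2 * theta1 * theta2
    p_d = 1.0 - math.exp(-e_theta)
    if x1 <= theta1:
        return r / p_d * (1.0 - math.exp(-lam1 * x1))
    bracket = math.exp(-lam2 * theta2 * (1 + beta * lam1 * x1)) - 1.0
    return 1.0 + r / p_d * math.exp(-lam1 * x1) * bracket - (1 - r) * (theta1 / x1) ** a

def invert_cdf(u, cdf, hi=1.0, iters=200):
    """Solve cdf(x) = u for 0 < u < 1 by bracketing and bisection."""
    lo = 0.0
    while cdf(hi) < u:      # expand the bracket until it contains the quantile
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    pars = (1.0, 1.2, 0.7, 0.7515, 1.2, 1.0, 0.9086)
    cdf = lambda x: marginal_cdf_x1(x, *pars)
    print(invert_cdf(0.5, cdf), invert_cdf(0.99, cdf))  # median and 99% quantile of X1
```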
We can impose marginal continuity conditions and combine them with the continuity condition at $(\theta_1, \theta_2)$. The following restrictions result.
Proposition 2. 
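Let X follow the bivariate composite Gumbel–Pareto distribution. Then:
(i) 
By imposing the continuity condition to the marginal $X_1$, we obtain
$$r_1 = \left[1 + \frac{\lambda_1 \theta_1}{a}\, \frac{1 + \beta \lambda_2 \theta_2}{e^{\lambda_1 \theta_1 + \lambda_2 \theta_2 + \beta \lambda_1 \lambda_2 \theta_1 \theta_2} - 1}\right]^{-1}.$$
(ii) 
By imposing the continuity condition to the marginal $X_2$, we obtain
$$r_2 = \left[1 + \frac{\lambda_2 \theta_2}{a}\, \frac{1 + \beta \lambda_1 \theta_1}{e^{\lambda_1 \theta_1 + \lambda_2 \theta_2 + \beta \lambda_1 \lambda_2 \theta_1 \theta_2} - 1}\right]^{-1}.$$
(iii) 
By simultaneously imposing continuity conditions to the marginals $X_1$ and $X_2$, we obtain
$$\lambda_1 \theta_1 = \lambda_2 \theta_2.$$
If, moreover, we also impose the continuity condition at $(\theta_1, \theta_2)$, the following restriction must be fulfilled
$$a = \lambda_1 \theta_1\left[(1 + \beta \lambda_1 \theta_1) - \frac{\beta}{1 + \beta \lambda_1 \theta_1}\right] - 1.$$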
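Under all the continuity conditions of Proposition 2, the parameters $\lambda_2$, $a$ and $r$ are determined by $(\lambda_1, \theta_1, \theta_2, \beta)$. The sketch below (hypothetical function name) applies these restrictions; with the parameter values of Section 5.1 it should return values that agree, to four decimals, with the $a = 0.7515$, $r = 0.9086$ and $P_D = 0.9669$ reported there.

```python
import math

def continuity_parameters(lam1, theta1, theta2, beta):
    """Derive (lam2, a, r, P_D) from Proposition 2 (i)-(iii).

    Note: the resulting a must be positive for a valid Pareto component.
    """
    u = lam1 * theta1                 # equals lam2 * theta2 by (iii)
    lam2 = u / theta2
    big_a = 1.0 + beta * u
    a = u * (big_a - beta / big_a) - 1.0
    e_theta = 2.0 * u + beta * u * u  # lam1*theta1 + lam2*theta2 + beta*lam1*lam2*theta1*theta2
    p_d = 1.0 - math.exp(-e_theta)
    r = 1.0 / (1.0 + (u / a) * big_a / (math.exp(e_theta) - 1.0))
    return lam2, a, r, p_d

if __name__ == "__main__":
    lam2, a, r, p_d = continuity_parameters(lam1=1.0, theta1=1.2, theta2=1.0, beta=0.7)
    print(round(lam2, 4), round(a, 4), round(r, 4), round(p_d, 4))
```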
Proposition 3. 
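(i) The expected values of the marginals are given for $a > 1$ by
$$\begin{aligned}
E[X_1] &= \frac{r}{\lambda_1 P_D}\left\{1 - e^{-(\lambda_1 \theta_1 + \lambda_2 \theta_2 + \beta \lambda_1 \lambda_2 \theta_1 \theta_2)}\left[\frac{1}{1 + \beta \lambda_2 \theta_2} + \lambda_1 \theta_1\right]\right\} + (1 - r)\, \frac{a\, \theta_1}{a - 1}, \\
E[X_2] &= \frac{r}{\lambda_2 P_D}\left\{1 - e^{-(\lambda_1 \theta_1 + \lambda_2 \theta_2 + \beta \lambda_1 \lambda_2 \theta_1 \theta_2)}\left[\frac{1}{1 + \beta \lambda_1 \theta_1} + \lambda_2 \theta_2\right]\right\} + (1 - r)\, \frac{a\, \theta_2}{a - 1}.
\end{aligned}$$
(ii) The second-order moments of the marginals are given for $a > 2$ by
$$\begin{aligned}
E[X_1^2] &= \frac{r}{\lambda_1^2 P_D}\left\{2 - e^{-(\lambda_1 \theta_1 + \lambda_2 \theta_2 + \beta \lambda_1 \lambda_2 \theta_1 \theta_2)}\, \frac{1 + \left[\lambda_1 \theta_1 (1 + \beta \lambda_2 \theta_2) + 1\right]^2}{(1 + \beta \lambda_2 \theta_2)^2}\right\} + (1 - r)\, \frac{a\, \theta_1^2}{a - 2}, \\
E[X_2^2] &= \frac{r}{\lambda_2^2 P_D}\left\{2 - e^{-(\lambda_1 \theta_1 + \lambda_2 \theta_2 + \beta \lambda_1 \lambda_2 \theta_1 \theta_2)}\, \frac{1 + \left[\lambda_2 \theta_2 (1 + \beta \lambda_1 \theta_1) + 1\right]^2}{(1 + \beta \lambda_1 \theta_1)^2}\right\} + (1 - r)\, \frac{a\, \theta_2^2}{a - 2}.
\end{aligned}$$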
Proposition 4. 
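The expected value of the product $X_1 X_2$ is
$$E[X_1 X_2] = \frac{r}{P_D\, \beta \lambda_1 \lambda_2}\left\{\left[E_1\!\left(\frac{1}{\beta}\right) - E_1\!\left(\frac{(1 + \beta \lambda_1 \theta_1)(1 + \beta \lambda_2 \theta_2)}{\beta}\right)\right] e^{1/\beta} - \left[2 - \frac{1}{1 + \beta \lambda_1 \theta_1} - \frac{1}{1 + \beta \lambda_2 \theta_2} + \beta \lambda_1 \lambda_2 \theta_1 \theta_2\right] e^{-(\lambda_1 \theta_1 + \lambda_2 \theta_2 + \beta \lambda_1 \lambda_2 \theta_1 \theta_2)}\right\} + (1 - r)\, \theta_1 \theta_2\, \frac{a^2 - a - 1}{(a - 1)(a - 2)}.$$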
In view of the random generation procedure, we also need the following result on the conditional distribution of a marginal.
Proposition 5. 
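The conditional cdf of the marginal $X_2$ given $X_1 = x_1$ is
$$F_{X_2 \mid X_1 = x_1}(x_2) = \begin{cases} 1 - e^{-\lambda_2 x_2 (1 + \beta \lambda_1 x_1)}(1 + \beta \lambda_2 x_2), & x_1 \le \theta_1,\ x_2 > 0 \\[1ex] \dfrac{\dfrac{r}{P_D}\, \lambda_1 e^{-\lambda_1 x_1}\left[1 - e^{-\lambda_2 x_2 (1 + \beta \lambda_1 x_1)}(1 + \beta \lambda_2 x_2)\right]}{\dfrac{r}{P_D}\, \lambda_1 e^{-\lambda_1 x_1}\left[1 - (1 + \beta \lambda_2 \theta_2)\, e^{-\lambda_2 \theta_2 (1 + \beta \lambda_1 x_1)}\right] + (1 - r)\, \dfrac{a\, \theta_1^{a}}{x_1^{a+1}}}, & x_1 > \theta_1,\ x_2 \le \theta_2 \\[1ex] 1 - \dfrac{(1 - r)\, a\, \theta_1^{a} \theta_2^{a+1}\left(\theta_1 x_2 + \theta_2 x_1 - \theta_1 \theta_2\right)^{-(a+1)}}{\dfrac{r}{P_D}\, \lambda_1 e^{-\lambda_1 x_1}\left[1 - (1 + \beta \lambda_2 \theta_2)\, e^{-\lambda_2 \theta_2 (1 + \beta \lambda_1 x_1)}\right] + (1 - r)\, \dfrac{a\, \theta_1^{a}}{x_1^{a+1}}}, & x_1 > \theta_1,\ x_2 > \theta_2. \end{cases}$$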

4.2. Simulation

We propose two methods for generating random values from the bivariate composite Gumbel–Pareto distribution. The first one is the inversion method, while the second one is based on the representation in expression (7).
Method I: In the bivariate case, the inversion method consists of two steps:
1.
Generate a value $x_1$ from the marginal distribution of $X_1$ by inverting its cdf given in Proposition 1;
2.
Generate a value $x_2$ from the conditional distribution of $X_2$ given $X_1 = x_1$ by inverting the conditional cdf given in Proposition 5. Thus, the resulting pair $(x_1, x_2)$ is simulated from (10).
Method II: Starting from the two-component mixture representation (7) with mixing weights $r$ and $1 - r$, we propose the following algorithm:
1.
Generate a value b from the Bernoulli distribution with parameter r;
2.
If $b = 1$, then generate the pair $(x_1, x_2)$ from the Gumbel distribution truncated on D;
3.
If $b = 0$, then generate the pair $(x_1, x_2)$ from the bivariate Pareto distribution (4).
Now the problem is to generate values from the two bivariate distributions: Gumbel and Pareto. Bivariate Pareto values can be generated without difficulty by the inversion method as described in Method I. Concerning the Gumbel distribution truncated on D, the following cdfs (obtained similarly to the ones in Propositions 1 and 5) can be used for inversion:
The cdf of the truncated Gumbel marginal, $Y_1^D$:
$$F_{Y_1^D}(x_1) = \begin{cases} \dfrac{1}{P_D}\left(1 - e^{-\lambda_1 x_1}\right), & 0 < x_1 \le \theta_1 \\ 1 + \dfrac{1}{P_D}\, e^{-\lambda_1 x_1}\left[e^{-\lambda_2 \theta_2 (1 + \beta \lambda_1 x_1)} - 1\right], & x_1 > \theta_1; \end{cases}$$
The conditional cdf of the marginal $Y_2^D$ given $Y_1^D = x_1$ of the truncated Gumbel distribution:
$$F_{Y_2^D \mid Y_1^D = x_1}(x_2) = \begin{cases} 1 - (1 + \beta \lambda_2 x_2)\, e^{-\lambda_2 x_2 (1 + \beta \lambda_1 x_1)}, & x_1 \le \theta_1 \\[1ex] \dfrac{1 - (1 + \beta \lambda_2 x_2)\, e^{-\lambda_2 x_2 (1 + \beta \lambda_1 x_1)}}{1 - (1 + \beta \lambda_2 \theta_2)\, e^{-\lambda_2 \theta_2 (1 + \beta \lambda_1 x_1)}}, & x_1 > \theta_1, \end{cases} \quad x_2 > 0.$$
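Method II can be sketched as below. Two choices here are assumptions of this sketch rather than the paper's procedure: the truncated Gumbel pair is obtained by rejection (draw from the untruncated Gumbel distribution using Lemma 5 and discard pairs falling in $D_{22}$) instead of inverting the truncated cdfs above, and the conditional cdf is inverted by bisection. All function names are hypothetical.

```python
import math
import random

def sample_gumbel(lam1, lam2, beta, rng):
    """One draw from Gu_2: exponential Y1, then Y2 | Y1 by inverting Lemma 5."""
    y1 = -math.log(1.0 - rng.random()) / lam1
    c = 1.0 + beta * lam1 * y1
    target = 1.0 - rng.random()  # level of the conditional survival function, in (0, 1]
    surv = lambda y: (1.0 + beta * lam2 * y) * math.exp(-lam2 * y * c)
    lo, hi = 0.0, 1.0
    while surv(hi) > target:     # bracket the root
        hi *= 2.0
    for _ in range(80):          # bisection on the decreasing function surv
        mid = 0.5 * (lo + hi)
        if surv(mid) > target:
            lo = mid
        else:
            hi = mid
    return y1, 0.5 * (lo + hi)

def sample_pareto(a, theta1, theta2, rng):
    """One draw from PaI_2 by inversion: Z1, then Z2 | Z1 in closed form."""
    z1 = theta1 * (1.0 - rng.random()) ** (-1.0 / a)
    v = 1.0 - rng.random()
    z2 = theta2 + theta2 * z1 / theta1 * (v ** (-1.0 / (a + 1.0)) - 1.0)
    return z1, z2

def sample_composite(n, lam1, lam2, beta, a, theta1, theta2, r, seed=0):
    """Method II: Bernoulli(r) choice between truncated Gumbel and Pareto."""
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        if rng.random() < r:
            while True:  # rejection step: keep only pairs in D
                y = sample_gumbel(lam1, lam2, beta, rng)
                if not (y[0] > theta1 and y[1] > theta2):
                    break
            out.append(y)
        else:
            out.append(sample_pareto(a, theta1, theta2, rng))
    return out

if __name__ == "__main__":
    pts = sample_composite(5000, 1.0, 1.2, 0.7, 0.7515, 1.2, 1.0, 0.9086, seed=1)
    tail = sum(1 for x1, x2 in pts if x1 > 1.2 and x2 > 1.0) / len(pts)
    print(tail)  # should be close to 1 - r
```

The rejection step accepts with probability $P_D$, so it is cheap when $P_D$ is close to 1, as in the Section 5.1 setting.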

4.3. Parameter Estimation

For a univariate composite distribution, estimating the parameters is already a difficult problem because the threshold where the distribution changes shape is itself a parameter. Therefore, the usual approach in the univariate case consists of sorting the data, assuming that the threshold lies between each two consecutive data points, and finding the corresponding MLE solution; then, the best MLE solution is selected from among the available ones. Alternatively, a set of possible thresholds can be defined, and for each such value, the resulting likelihood is maximized; see also the review [11] for threshold estimation approaches.
In the bivariate case, the estimation problem becomes even more difficult because there are two unknown thresholds $\theta_1, \theta_2$ to estimate. Let $x = (x_{1i}, x_{2i})_{i=1}^{n}$ be a bivariate data sample of size n, let $(\lambda_1, \lambda_2, \beta, a, r)$ denote the rest of the parameters of the bivariate density defined in (10) (note that r might be obtained from a continuity condition such as (11) or the ones in Proposition 2, if imposed), and let L denote the likelihood function
$$L(x; \lambda_1, \lambda_2, \beta, a, \theta_1, \theta_2, r) = \prod_{(x_{1i}, x_{2i}) \in D} r f_1(x_{1i}, x_{2i}) \prod_{(x_{1i}, x_{2i}) \,\mid\, x_{1i} > \theta_1,\ x_{2i} > \theta_2} (1 - r) f_2(x_{1i}, x_{2i}). \tag{12}$$
The log-likelihood function defined from (12) is the weighted sum of the two partial log-likelihood functions associated with the two distributions of the composite model: the Gumbel and the Pareto. Since the MLE exists for both distributions (see [10] for the bivariate Pareto distribution), then for a known r, we can easily find the MLE of our composite model. The aim of the proposed MLE procedures is to find the best value of r.
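For concreteness, the logarithm of the likelihood (12) for the Gumbel–Pareto case can be evaluated as in the following sketch. The function name is hypothetical, and the code assumes $0 < r < 1$ and strictly positive observations.

```python
import math

def log_likelihood(data, lam1, lam2, beta, a, theta1, theta2, r):
    """Log of (12): Gumbel terms for points in D, Pareto terms for points in D22."""
    e_theta = lam1 * theta1 + lam2 * theta2 + beta * lam1 * lam2 * theta1 * theta2
    log_pd = math.log(1.0 - math.exp(-e_theta))
    ll = 0.0
    for x1, x2 in data:
        if x1 > theta1 and x2 > theta2:
            base = theta2 * x1 + theta1 * x2 - theta1 * theta2
            ll += (math.log(1.0 - r) + math.log(a * (a + 1.0))
                   + (a + 1.0) * math.log(theta1 * theta2)
                   - (a + 2.0) * math.log(base))
        else:
            u, v = lam1 * x1, lam2 * x2
            ll += (math.log(r) - log_pd + math.log(lam1 * lam2)
                   - (u + v + beta * u * v)
                   + math.log((1.0 + beta * u) * (1.0 + beta * v) - beta))
    return ll

if __name__ == "__main__":
    sample = [(0.5, 0.3), (2.0, 3.0)]  # one point in D, one in D22
    print(log_likelihood(sample, 1.0, 1.2, 0.7, 0.7515, 1.2, 1.0, 0.9086))
```

Because the thresholds decide which branch each observation falls into, this function is piecewise smooth in $(\theta_1, \theta_2)$, which is why the procedures below search over threshold intervals.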
In the following, we propose two alternative methods to estimate the parameters.
Method 1: An approach similar to the one described in the univariate case would be to sort the marginal data, obtaining $(x_{1(i)})_{i=1}^{n}$ and $(x_{2(i)})_{i=1}^{n}$, assume that each threshold lies, correspondingly, between each two consecutive marginal data points, find the MLEs, and choose the best one. However, this procedure is very time-consuming in the bivariate case, so we propose to combine it with marginal estimation in a two-part method as follows:
I.
Perform marginal estimation for both marginals; since the marginals are univariate composite distributions, the approach described above for the univariate case can be used. This would give starting values for the marginal parameters and the approximate location of the marginally estimated thresholds $\tilde{\theta}_1, \tilde{\theta}_2$.
II.
Let $(x_{1(i)})_{i=1}^{n}$ and $(x_{2(i)})_{i=1}^{n}$ denote the (increasing) sorted marginal data and assume that the marginally estimated thresholds satisfy $\tilde{\theta}_j \in m_j^{(k_j)}$, $j = 1, 2$, where $m_j^{(k_j)} = \left[x_{j(k_j)}, x_{j(k_j + 1)}\right]$. Now consider $\left(m_j^{(k_j - h)}\right)_{h=1}^{l}$, the l intervals preceding, and $\left(m_j^{(k_j + h)}\right)_{h=1}^{l}$, the l intervals following the interval $m_j^{(k_j)}$ that covers $\tilde{\theta}_j$, $j = 1, 2$, as long as they exist; for each combination of such intervals, perform full MLE and keep the best solution. The resulting algorithm is:
  • Step 1. For $m_j^{(k_j - l)}$ to $m_j^{(k_j + l)}$, $j = 1, 2$,
    evaluate $\lambda_1, \lambda_2, \beta, a, \theta_1, \theta_2, r$ as solutions of the optimization problem:
    $$\max \log L(x; \lambda_1, \lambda_2, \beta, a, \theta_1, \theta_2, r),$$
    under the constraints $\theta_1$ and $\theta_2$ in the corresponding intervals, and continuity conditions, if imposed.
  • Step 2. Among the solutions obtained from Step 1, choose the one that maximizes the log-likelihood function.
Note that in this way, for reasonable choices of $m_1^{(k_1)}$, $m_2^{(k_2)}$ and l, the computing time is significantly reduced.
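The enumeration of candidate threshold intervals in part II can be sketched as follows. This only builds the pairs of intervals over which the constrained full MLE of Step 1 would be run; the optimizer itself is left out, and all function names are hypothetical.

```python
import bisect
import itertools

def covering_index(sorted_x, theta):
    """1-based index k such that x_(k) <= theta < x_(k+1)."""
    return bisect.bisect_right(sorted_x, theta)

def candidate_intervals(sorted_x, k, l):
    """Intervals (x_(i), x_(i+1)) for i = k-l, ..., k+l, kept only where they exist."""
    n = len(sorted_x)
    first, last = max(1, k - l), min(n - 1, k + l)
    return [(sorted_x[i - 1], sorted_x[i]) for i in range(first, last + 1)]

def threshold_boxes(sorted_x1, sorted_x2, theta1_tilde, theta2_tilde, l):
    """All combinations of candidate intervals for (theta1, theta2)."""
    c1 = candidate_intervals(sorted_x1, covering_index(sorted_x1, theta1_tilde), l)
    c2 = candidate_intervals(sorted_x2, covering_index(sorted_x2, theta2_tilde), l)
    return list(itertools.product(c1, c2))

if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    for box in threshold_boxes(xs, xs, 3.5, 2.5, l=1):
        print(box)  # each box bounds (theta1, theta2) for one constrained MLE run
```

With l intervals on each side there are at most $(2l + 1)^2$ boxes, compared with roughly $n^2$ for an exhaustive search over all pairs of consecutive data points.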
Method 2: The second method is a more analytical procedure for a specific sample; it takes into account that the parameter $\beta$ of the bivariate Gumbel–Pareto density (10) is restricted to the $[0, 1]$ interval. This allows us to define a grid for it and to optimize the rest of the parameters for each value in this grid. The following procedure is designed, assuming the continuity conditions given in (i–iii) of Proposition 2 and the conditional likelihood defined by:
$$L^{c}(x; \lambda_1, \theta_1, \theta_2 \mid \beta) = \prod_{(x_{1i}, x_{2i}) \in D} r f_1(x_{1i}, x_{2i}) \prod_{(x_{1i}, x_{2i}) \,\mid\, x_{1i} > \theta_1,\ x_{2i} > \theta_2} (1 - r) f_2(x_{1i}, x_{2i}),$$
with the continuity conditions (constraints)
$$\lambda_2 = \frac{\lambda_1 \theta_1}{\theta_2}, \qquad a = \lambda_1 \theta_1\left[(1 + \beta \lambda_1 \theta_1) - \frac{\beta}{1 + \beta \lambda_1 \theta_1}\right] - 1, \qquad \beta \in [0, 1].$$
The conditional likelihood $L^{c}(x; \lambda_2, \theta_1, \theta_2 \mid \beta)$ is defined similarly. The procedure for maximizing $\log L^{c}(x; \lambda_1, \theta_1, \theta_2 \mid \beta)$ is described below:
  • Step 1. Obtain initial values for the parameters $\theta_1$, $\theta_2$, and $\lambda_1$ as follows:
    -
    The initially estimated thresholds are $\tilde{\theta}_1 = x_{1([n p_1])}$ and $\tilde{\theta}_2 = x_{2([n p_2])}$, where $p_j$, $j = 1, 2$, are two given large proportions, and $[\cdot]$ denotes the integer part. An initial value for each proportion can be deduced from the Hill plot or by doing MLE of the univariate Pareto for the tail.
    -
    The initially estimated value of the exponential parameter $\tilde{\lambda}_1$ is obtained by MLE of the univariate truncated exponential distribution with density function:
    $$f(x_1 \mid x_1 \le \tilde{\theta}_1) = \frac{f_{X_1}(x_1)}{F_{X_1}(\tilde{\theta}_1)} = \frac{\lambda_1 e^{-\lambda_1 x_1}}{1 - e^{-\lambda_1 \tilde{\theta}_1}}.$$
  • Step 2. Define a grid for $\beta \in [0, 1)$, i.e., $(\beta_g)_{g=1}^{G}$. For each $\beta_g$, the estimated parameters $\hat{\theta}_{1g}$, $\hat{\theta}_{2g}$, and $\hat{\lambda}_{1g}$ are obtained by maximizing the conditional log-likelihood function $\log L_g^{c}(x; \lambda_1, \theta_1, \theta_2 \mid \beta_g)$. The optim() function of R software with the “Nelder–Mead” method can be used; this works reasonably well for non-differentiable functions. The parameters $\hat{\lambda}_{2g}$ and $\hat{a}_g$ are estimated using the continuity conditions.
  • Step 3. Let $(\log \tilde{L}_g^{c})_{g=1}^{G}$ be the optimal values of the log-likelihood obtained at Step 2, and let $(\hat{\lambda}_{1g}, \hat{\theta}_{1g}, \hat{\theta}_{2g}, \beta_g)$ be the corresponding parameters. The final estimated parameters are:
    $$(\hat{\lambda}_1, \hat{\theta}_1, \hat{\theta}_2, \hat{\beta}) = \arg\max_{g = 1, \ldots, G} \log \tilde{L}_g^{c}$$
    with
    $$\hat{\lambda}_2 = \frac{\hat{\lambda}_1 \hat{\theta}_1}{\hat{\theta}_2},$$
    $$\hat{a} = \hat{\lambda}_1 \hat{\theta}_1\left[(1 + \hat{\beta} \hat{\lambda}_1 \hat{\theta}_1) - \frac{\hat{\beta}}{1 + \hat{\beta} \hat{\lambda}_1 \hat{\theta}_1}\right] - 1.$$
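The initial values of Step 1 can be sketched in a few lines. For the exponential distribution truncated to $(0, \tilde{\theta}_1]$, setting the derivative of the log-likelihood to zero gives $1/\lambda_1 - \tilde{\theta}_1 / (e^{\lambda_1 \tilde{\theta}_1} - 1) = \bar{x}$, where $\bar{x}$ is the mean of the observations below the threshold. The code below solves this equation by bisection; the function names and the choice of solver are assumptions of this sketch, not the authors' R implementation.

```python
import math

def initial_threshold(xs, p):
    """x_([n p]): the [n p]-th order statistic (1-based) of the increasing sorted sample."""
    s = sorted(xs)
    idx = int(len(s) * p)
    return s[max(idx, 1) - 1]

def truncated_exp_mean(lam, theta):
    """Mean of the exponential(lam) distribution truncated to (0, theta]."""
    return 1.0 / lam - theta / math.expm1(lam * theta)

def initial_lambda(xs_below, theta, iters=200):
    """MLE of lam for the truncated exponential: solve mean(lam) = sample mean."""
    xbar = sum(xs_below) / len(xs_below)
    if xbar >= theta / 2.0:
        raise ValueError("no positive MLE: sample mean must be below theta / 2")
    lo, hi = 1e-6, 700.0 / theta   # upper bound keeps expm1 from overflowing
    for _ in range(iters):
        mid = math.sqrt(lo * hi)   # bisection on the log scale
        if truncated_exp_mean(mid, theta) > xbar:
            lo = mid               # mean too large: lam must increase
        else:
            hi = mid
    return math.sqrt(lo * hi)

if __name__ == "__main__":
    m = truncated_exp_mean(1.4175, 3.1)
    print(initial_lambda([m], 3.1))  # recovers a value close to 1.4175
```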

5. Numerical Illustration

In this section, we present two numerical illustrations: the first one is on simulated data, and the second one is on a real data set.

5.1. Numerical Illustration Using Simulated Data

In this section, we used simulated data to check the performance of the first estimation procedure (Method 1) proposed in Section 4.3. The true values of the parameters were selected such that they satisfied all the continuity conditions given in Proposition 2: Gumbel: $\lambda_1 = 1$, $\lambda_2 = 1.2$, $\beta = 0.7$; Pareto: $a = 0.7515$, $\theta_1 = 1.2$, $\theta_2 = 1$, while $r = 0.9086$, $P_D = 0.9669$. Note that due to the heavy-tailedness of the Pareto distribution ($a < 1$), there is no expected value for this particular distribution (its pdf is plotted in Figure 2).
With the aim of studying the properties of Method 1, using the two simulation methods described in Section 4.2, we generated 100 samples of size n = 200 and n = 1000, respectively, for the two methods. For each such sample, in the first step, we performed marginal estimation by imposing the continuity condition for each marginal (which restricts the parameters r, as stated in Proposition 2). As a consequence, $\beta$ and $a$ are estimated twice (for each marginal), and because of the differences in these estimations, we cannot rely only on marginal estimation. However, marginal estimation provides starting values for performing full MLE, and even better, gives an idea of where to look for the thresholds. More precisely, we restricted the search to about 40 intervals for each $\theta_j$, i.e., we took $l = 20$. Thus, the computing time was significantly reduced compared to the threshold search through all data.
Finally, we estimated the Mean Square Error $MSE = \frac{1}{100} \sum_{i=1}^{100} (\theta - \hat{\theta}_i)^2$ and the Mean Absolute Error $MAE = \frac{1}{100} \sum_{i=1}^{100} |\theta - \hat{\theta}_i|$, where $\theta$ represents the true value of a parameter and $\hat{\theta}_i$ its estimate obtained from the i-th sample.
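The two error criteria are straightforward to compute once the estimates from the replicas are collected. A minimal sketch with hypothetical function names and made-up estimates:

```python
def mse(true_value, estimates):
    """Mean square error of a list of estimates around the true parameter value."""
    return sum((true_value - e) ** 2 for e in estimates) / len(estimates)

def mae(true_value, estimates):
    """Mean absolute error of a list of estimates around the true parameter value."""
    return sum(abs(true_value - e) for e in estimates) / len(estimates)

if __name__ == "__main__":
    estimates = [0.5, 1.5]        # illustrative values only, not results from the paper
    print(mse(1.0, estimates))    # 0.25
    print(mae(1.0, estimates))    # 0.5
```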
With the estimated parameters obtained from the 100 replicas generated with each simulation method, we obtained the MSE and the MAE that are shown in Table 1. The results indicate that both error criteria decrease when the sample size increases. Some differences between the two simulation methods can be observed (e.g., the MSE of β is larger for simulation Method II than for simulation Method I, while the MSE of a is smaller for simulation Method II than for simulation Method I), but we believe that these differences are due to the randomness of the results, where some samples fall more in the Pareto part or in the exponential part; further simulation investigation is worthwhile, assuming that the estimation method can be modified to reduce the computing time.
Concerning Method 2, as already noticed, it is a more analytical procedure for a specific sample, and therefore, it cannot be standardized and we cannot perform several iterations to calculate MSE and MAE.
All the computations were performed in R software using an optimization function with constraints to implement the continuity restrictions. The code is available upon request from the authors.

5.2. Numerical Illustration with Real Data

In this section, we fit our proposed bivariate Gumbel–Pareto distribution to a random sample of n = 518 motor insurance claims that include bodily injury. For these claims, we separately know the cost of property damage including third-party liability (variable $X_1$) and the cost of exceptional medical expenses not covered by public social security (variable $X_2$). The data were provided by a major insurer in Spain in the year 2002 and correspond to claims that occurred in the year 2000. These data were studied in previous works (see [5,6,12]).
In Table 2, we display the descriptive statistics of the original data divided by 1000; this change of scale is convenient, and it facilitates the MLE of the parameters. These descriptive statistics show that both variables have a strong right skewness. Furthermore, the left plot in Figure 4 shows the scatterplot of both cost variables in the original scale divided by 1000, where the existence of extreme values in both variables can be noticed. When we have right-skewed variables with extreme values, the MLE of a simple distribution such as the exponential, the Weibull, or the log-normal tends to underestimate the probability on the right tail. Figure 5 displays the univariate exponential pdf fitted by MLE to each marginal variable; with these densities, we also plotted the observed costs: on top, the costs of property damage, including third-party liability, and on the bottom, the costs of exceptional medical expenses not covered by public social security. For better visibility, the domains of the cost variables were divided into two parts, resulting in two plots for each marginal. Figure 5 shows how the density reaches zero in the part of the domain where there are still sample observations; so clearly, this model assumes a zero probability where it should not. Similar results are obtained using univariate Weibull and log-normal densities.
Therefore, the composite model with a Pareto right tail is a good way to improve the MLE fit for both univariate and bivariate data. Moreover, graphical analysis (e.g., the Hill plot) indicates that both variables have a Pareto tail with a shape parameter very close to 1, i.e., we have heavy-tailed marginal distributions. Thus, we can conclude that their distributions have only the first-order moment finite, or they do not have finite moments at all. In the left scatterplot of Figure 4, we can note that the sample information on extreme values is scarce; this is a difficulty in samples from heavy-tailed or Pareto distributions.
To assess the joint behavior of $X_1$ and $X_2$, we calculated the Pearson linear correlation and the Kendall and the Spearman rank correlation coefficients, displayed in Table 3. These results show a strong dependence between the two cost variables. However, as can be seen from Figure 4, which presents the data scatterplot in both original and natural logarithm scales, the dependence is not linear. As shown in [12], these data exhibit extreme value dependence, i.e., the higher the costs, the stronger the dependency. This behavior can also be observed in Figure 4. Furthermore, [10] shows that when the bivariate Pareto parameter satisfies $a \le 2$, as is the case with our cost data, the theoretical variance and covariance do not exist or cannot be calculated. Therefore, the Pearson linear correlation cannot be interpreted.
Further, from the right plot in Figure 4, it can be observed that for small values of both variables, the shape of the point cloud is spherical, i.e., the dependence is almost zero; however, for larger values, the shape indicates positive dependence between both variables. Clearly, this denotes a change of the joint distribution between the smaller and the larger costs.
In Table 4, we present the MLE parameters for Gumbel’s bivariate exponential distribution described in Section 2.2.1 and for the Gumbel–Pareto distribution from Section 4. The estimated parameters of the latter were obtained with Method 2 described in Section 4.3, imposing all continuity conditions (Method 1 yielded similar results). The initial values of the thresholds were taken from the Hill plots, and in this case, $p_1 = p_2 = 0.102$, resulting in $[np] = [518 \times 0.102] = 52$, i.e., $\tilde{\theta}_1 = 3.1$ and $\tilde{\theta}_2 = 0.5$; also, $\tilde{\lambda}_1 = 1.4175$. Comparing the AICs, BICs, and CAICs given in Table 4 indicates that the bivariate Gumbel–Pareto clearly outperforms Gumbel’s bivariate exponential distribution. Moreover, from MLE, the dependence parameter of Gumbel’s bivariate exponential distribution, $\beta$, is zero, and it is close to zero for the Gumbel–Pareto distribution, which is coherent with the scatterplot in Figure 4.
In Figure 6, we also plotted a partial histogram of the data alongside the corresponding Gumbel–Pareto pdf with the estimated parameters, while in Figure 7, we plotted the marginal histograms with the fitted pdfs.
Finally, as a risk management application, we estimated the total risk of loss for the aggregate cost random variable $S = X_1 + X_2$ using Monte Carlo simulation, and based on it, we calculated the Value-at-Risk (VaR) measure. VaR is equivalent to an extreme quantile of the distribution, i.e., $\mathrm{VaR}_\alpha(S) = \inf\{ s \in \mathbb{R} \mid \Pr(S \le s) \ge \alpha \}$, where $\alpha$ is close to 1. In Table 5, we present the VaR results with $\alpha = 0.95, 0.99, 0.995$ for: the empirical distribution of the original data, the distribution of $S$ simulated from Gumbel’s bivariate exponential distribution, and the distribution of $S$ simulated from the Gumbel–Pareto distribution. Furthermore, we added the VaR obtained for the bivariate log-normal distribution fitted to the data; note that this distribution underestimates the risk in a way similar to that of Gumbel’s bivariate exponential.
When data follow a heavy-tailed distribution, the empirical VaR depends on the maximum data observed, and it is not an efficient estimator. The Gumbel–Pareto distribution provides an estimation that extrapolates beyond the observed maximum cost and takes into account the long and heavy bivariate tail with dependent marginal distributions.
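The VaR definition above translates directly into an empirical quantile of a simulated sample. The following is a minimal sketch under stated assumptions, not the authors' simulation code: `empirical_var` is a hypothetical helper, and the simulated sample is a toy univariate Pareto sample (tail index 1.5, threshold 1) standing in for draws of $S$ from a fitted bivariate model.

```python
import random

def empirical_var(sample, alpha):
    """Smallest s in the sample with empirical cdf F_n(s) >= alpha."""
    xs = sorted(sample)
    n = len(xs)
    for k, s in enumerate(xs, start=1):
        if k / n >= alpha:  # F_n(xs[k-1]) = k / n when there are no ties
            return s
    return xs[-1]

random.seed(0)

# Toy stand-in for simulated aggregate costs: Pareto draws by inverse transform.
# a_toy and theta_toy are illustrative values, not estimates from the paper.
a_toy, theta_toy = 1.5, 1.0
s_sim = [theta_toy * (1.0 - random.random()) ** (-1.0 / a_toy) for _ in range(100_000)]

for alpha in (0.95, 0.99, 0.995):
    print(f"VaR at {alpha}: {empirical_var(s_sim, alpha):.3f}")
```

With a parametric heavy-tailed model, the simulated sample can be made much larger than the observed one, which is how a model-based VaR can extrapolate beyond the largest observed cost.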

6. Conclusions

To model bivariate dependent data that exhibit many small/medium values but also some very large values (i.e., extreme values), in this paper, we proposed a bivariate two-component spliced distribution. This distribution assumes a bivariate Pareto distribution on the domain consisting of values larger than some thresholds, and a bivariate Gumbel distribution on the complementary domain. We discussed some properties of the new distribution and focused on parameter estimation, proposing two alternative procedures. Because performing full MLE for this distribution may become time-prohibitive for larger data sets, as further work, we plan to investigate alternative methods that could reduce the computing time. Additionally, starting from the mixture formula (7), we plan to address the problem of parameter identifiability (see, e.g., [13] or [14]). Goodness-of-fit tests are envisaged for a future study.
Moreover, we also plan to study other such distributions by replacing the bivariate Gumbel with alternative distributions.

Author Contributions

Conceptualization, C.B. and R.V.; methodology, C.B. and R.V.; software, A.B., C.B. and R.V.; formal analysis, A.B., C.B. and R.V.; writing—original draft preparation, A.B. and R.V.; writing—review and editing, C.B. and R.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are available from the authors.

Acknowledgments

The authors are very grateful to the three referees for their valuable comments that helped to significantly improve the paper. Catalina Bolancé acknowledges the Spanish Ministry of Science, Innovation and Universities, grant PID2019-105986GB-C21.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proofs

Proof of Lemma 1.
Using integration by parts, it is easy to prove (i)–(iv); (v) results by changing variable $t = ky$, while (vi) is obtained by parts and by using (v). □
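Proof of Lemma 3.
Without loss of generality, we prove the formula of $L_2$; the proof for $L_1$ is similar.
$$
\begin{aligned}
L_2(x_2;\theta_1) &= \lambda_2 e^{-\lambda_2 x_2} \int_0^{\theta_1} \lambda_1 e^{-x_1(\lambda_1+\beta\lambda_1\lambda_2 x_2)} \left[ \beta\lambda_1 x_1 (1+\beta\lambda_2 x_2) + 1+\beta\lambda_2 x_2 - \beta \right] dx_1 \\
&= \lambda_2 e^{-\lambda_2 x_2} \left[ \beta\lambda_1^2 (1+\beta\lambda_2 x_2) \int_0^{\theta_1} x_1 e^{-x_1(\lambda_1+\beta\lambda_1\lambda_2 x_2)}\, dx_1 + \lambda_1 (1+\beta\lambda_2 x_2-\beta) \int_0^{\theta_1} e^{-x_1(\lambda_1+\beta\lambda_1\lambda_2 x_2)}\, dx_1 \right] \\
&= \lambda_2 e^{-\lambda_2 x_2} \left[ \beta\lambda_1^2 (1+\beta\lambda_2 x_2)\, \Gamma(2,0,\theta_1;\lambda_1+\beta\lambda_1\lambda_2 x_2) + \lambda_1 (1+\beta\lambda_2 x_2-\beta)\, \Gamma(1,0,\theta_1;\lambda_1+\beta\lambda_1\lambda_2 x_2) \right].
\end{aligned}
$$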
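Using formulas (ii) and (iii.1) from Lemma 1, we obtain, with some calculation,
$$
\begin{aligned}
L_2(x_2;\theta_1) &= \lambda_2 e^{-\lambda_2 x_2} \left\{ \frac{\beta\lambda_1^2 (1+\beta\lambda_2 x_2)}{\lambda_1^2 (1+\beta\lambda_2 x_2)^2} \left[ 1 - \left( 1+\theta_1\lambda_1(1+\beta\lambda_2 x_2) \right) e^{-\theta_1\lambda_1(1+\beta\lambda_2 x_2)} \right] + \lambda_1 (1+\beta\lambda_2 x_2-\beta)\, \frac{1-e^{-\theta_1\lambda_1(1+\beta\lambda_2 x_2)}}{\lambda_1(1+\beta\lambda_2 x_2)} \right\} \\
&= \frac{\lambda_2 e^{-\lambda_2 x_2}}{1+\beta\lambda_2 x_2} \left\{ \beta - e^{-\theta_1\lambda_1(1+\beta\lambda_2 x_2)} \left[ \beta + \beta\theta_1\lambda_1(1+\beta\lambda_2 x_2) + 1+\beta\lambda_2 x_2 - \beta \right] + 1+\beta\lambda_2 x_2 - \beta \right\} \\
&= \lambda_2 e^{-\lambda_2 x_2} \left[ -e^{-\theta_1\lambda_1(1+\beta\lambda_2 x_2)} (1+\beta\theta_1\lambda_1) + 1 \right].
\end{aligned}
$$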
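As a sanity check on the closed form of $L_2$, the integral of Gumbel's bivariate exponential pdf over $x_1 \in (0, \theta_1)$ can be compared with the final expression of the proof. This is a sketch only: the function names (`g_Y`, `L2_closed`, `simpson`) are hypothetical, the pdf is written out from the integrand used in the proof, composite Simpson quadrature is an arbitrary choice of numerical method, and the parameter values are those quoted in the caption of Figure 1.

```python
import math

def g_Y(x1, x2, lam1, lam2, beta):
    """Gumbel's bivariate exponential pdf, as it appears in the integrand of the proof."""
    e = math.exp(-(lam1 * x1 + lam2 * x2 + beta * lam1 * lam2 * x1 * x2))
    return lam1 * lam2 * e * ((1 + beta * lam1 * x1) * (1 + beta * lam2 * x2) - beta)

def L2_closed(x2, theta1, lam1, lam2, beta):
    """Closed form of L_2(x2; theta1) obtained at the end of the proof of Lemma 3."""
    tail = (1 + beta * lam1 * theta1) * math.exp(-theta1 * lam1 * (1 + beta * lam2 * x2))
    return lam2 * math.exp(-lam2 * x2) * (1 - tail)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Parameter values from the caption of Figure 1.
lam1, lam2, beta, theta1 = 0.81, 0.9, 0.2, 2.1

for x2 in (0.5, 1.89, 3.0):
    numeric = simpson(lambda x1: g_Y(x1, x2, lam1, lam2, beta), 0.0, theta1)
    closed = L2_closed(x2, theta1, lam1, lam2, beta)
    print(f"x2={x2}: numeric={numeric:.10f} closed={closed:.10f}")
```

The same check applies to $L_1$ by exchanging the roles of the two indices.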
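Proof of Lemma 4.
We write
$$
I(\theta_1,\theta_2) = \int_{\theta_1}^{\infty} \lambda_1 x_1 e^{-\lambda_1 x_1} J(\theta_2)\, dx_1,
$$
where
$$
\begin{aligned}
J(\theta_2) &= \int_{\theta_2}^{\infty} \lambda_2 x_2 e^{-\lambda_2 x_2 (1+\beta\lambda_1 x_1)} \left[ \beta\lambda_2 x_2 (1+\beta\lambda_1 x_1) + 1+\beta\lambda_1 x_1 - \beta \right] dx_2 \\
&= \beta\lambda_2^2 (1+\beta\lambda_1 x_1) \int_{\theta_2}^{\infty} x_2^2 e^{-\lambda_2 x_2 (1+\beta\lambda_1 x_1)}\, dx_2 + \lambda_2 (1+\beta\lambda_1 x_1-\beta) \int_{\theta_2}^{\infty} x_2 e^{-\lambda_2 x_2 (1+\beta\lambda_1 x_1)}\, dx_2 \\
&= \beta\lambda_2^2 (1+\beta\lambda_1 x_1)\, \Gamma\left(3,\theta_2,\infty;\lambda_2(1+\beta\lambda_1 x_1)\right) + \lambda_2 (1+\beta\lambda_1 x_1-\beta)\, \Gamma\left(2,\theta_2,\infty;\lambda_2(1+\beta\lambda_1 x_1)\right),
\end{aligned}
$$
and using the corresponding formulas (iv.2) and (iii.2) from Lemma 1, we obtain
$$
\begin{aligned}
J(\theta_2) &= \beta\lambda_2^2 (1+\beta\lambda_1 x_1)\, \frac{e^{-\theta_2\lambda_2(1+\beta\lambda_1 x_1)}}{\left[\lambda_2(1+\beta\lambda_1 x_1)\right]^3} \left[ \left( \theta_2\lambda_2(1+\beta\lambda_1 x_1)+1 \right)^2 + 1 \right] + \lambda_2 (1+\beta\lambda_1 x_1-\beta)\, \frac{e^{-\theta_2\lambda_2(1+\beta\lambda_1 x_1)}}{\left[\lambda_2(1+\beta\lambda_1 x_1)\right]^2} \left[ \theta_2\lambda_2(1+\beta\lambda_1 x_1)+1 \right] \\
&= \frac{e^{-\theta_2\lambda_2(1+\beta\lambda_1 x_1)}}{\lambda_2(1+\beta\lambda_1 x_1)^2} \left\{ \beta \left[ \theta_2^2\lambda_2^2(1+\beta\lambda_1 x_1)^2 + 2\theta_2\lambda_2(1+\beta\lambda_1 x_1) + 2 \right] + \theta_2\lambda_2(1+\beta\lambda_1 x_1)^2 + 1+\beta\lambda_1 x_1 - \beta\theta_2\lambda_2(1+\beta\lambda_1 x_1) - \beta \right\} \\
&= \frac{e^{-\theta_2\lambda_2(1+\beta\lambda_1 x_1)}}{\lambda_2(1+\beta\lambda_1 x_1)^2} \left[ \theta_2\lambda_2(1+\beta\lambda_1 x_1)^2 (1+\beta\lambda_2\theta_2) + (1+\beta\lambda_1 x_1)(1+\beta\lambda_2\theta_2) + \beta \right].
\end{aligned}
$$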
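Inserting this result into the equation of $I(\theta_1,\theta_2)$ yields
$$
I(\theta_1,\theta_2) = \frac{\lambda_1 e^{-\lambda_2\theta_2}}{\lambda_2} \int_{\theta_1}^{\infty} \frac{x_1 e^{-\lambda_1 x_1 (1+\beta\lambda_2\theta_2)}}{(1+\beta\lambda_1 x_1)^2} \left[ \theta_2\lambda_2(1+\beta\lambda_1 x_1)^2 (1+\beta\lambda_2\theta_2) + (1+\beta\lambda_1 x_1)(1+\beta\lambda_2\theta_2) + \beta \right] dx_1,
$$
and by changing variable $\beta y = 1+\beta\lambda_1 x_1$ and letting $c = \dfrac{1+\beta\lambda_1\theta_1}{\beta}$, we obtain
$$
\begin{aligned}
I(\theta_1,\theta_2) &= \frac{\lambda_1 e^{-\lambda_2\theta_2}}{\lambda_2} \int_c^{\infty} \frac{\beta y-1}{\beta\lambda_1 (\beta y)^2}\, e^{-\frac{\beta y-1}{\beta}(1+\beta\lambda_2\theta_2)} \left[ \theta_2\lambda_2 (\beta y)^2 (1+\beta\lambda_2\theta_2) + \beta y (1+\beta\lambda_2\theta_2) + \beta \right] \frac{dy}{\lambda_1} \\
&= \frac{e^{1/\beta}}{\beta\lambda_1\lambda_2} \int_c^{\infty} e^{-y(1+\beta\lambda_2\theta_2)} (\beta y-1) \left[ \theta_2\lambda_2(1+\beta\lambda_2\theta_2) + \frac{1+\beta\lambda_2\theta_2}{\beta y} + \frac{1}{\beta y^2} \right] dy.
\end{aligned}
$$
For simplicity, we denote $u_2 = 1+\beta\lambda_2\theta_2$; hence
$$
\begin{aligned}
I(\theta_1,\theta_2) &= \frac{e^{1/\beta}}{\beta\lambda_1\lambda_2} \int_c^{\infty} e^{-y u_2} \left[ \lambda_2\theta_2 u_2 \beta y - \theta_2\lambda_2 u_2 + u_2 - \frac{u_2}{\beta y} + \frac{1}{y} - \frac{1}{\beta y^2} \right] dy \\
&= \frac{e^{1/\beta}}{\beta\lambda_1\lambda_2} \left[ \beta\lambda_2\theta_2 u_2 \int_c^{\infty} y e^{-y u_2}\, dy + u_2 (1-\lambda_2\theta_2) \int_c^{\infty} e^{-y u_2}\, dy + \left( 1-\frac{u_2}{\beta} \right) \int_c^{\infty} \frac{e^{-y u_2}}{y}\, dy - \frac{1}{\beta} \int_c^{\infty} \frac{e^{-y u_2}}{y^2}\, dy \right].
\end{aligned}
$$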
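Using (iii.2), (v), and (vi) from Lemma 1, we evaluate
$$
\begin{aligned}
I(\theta_1,\theta_2) &= \frac{e^{1/\beta}}{\beta\lambda_1\lambda_2} \left\{ \beta\lambda_2\theta_2 u_2\, \Gamma(2,c,\infty;u_2) + u_2 (1-\lambda_2\theta_2)\, \Gamma(1,c,\infty;u_2) + \left( 1-\frac{u_2}{\beta} \right) E_1(c u_2) - \frac{1}{\beta} \left[ \frac{e^{-c u_2}}{c} - u_2 E_1(c u_2) \right] \right\} \\
&= \frac{e^{1/\beta}}{\beta\lambda_1\lambda_2} \left[ \beta\lambda_2\theta_2 u_2\, \frac{e^{-c u_2}}{u_2^2} (1+c u_2) + u_2 (1-\lambda_2\theta_2)\, \frac{e^{-c u_2}}{u_2} - \frac{e^{-c u_2}}{\beta c} + E_1(c u_2) \right] \\
&= \frac{1}{\beta\lambda_1\lambda_2} \left\{ \left[ \beta\lambda_2\theta_2 \left( \frac{1}{u_2} + c \right) + 1 - \lambda_2\theta_2 - \frac{1}{\beta c} \right] e^{\frac{1}{\beta} - c u_2} + e^{1/\beta} E_1(c u_2) \right\}.
\end{aligned}
$$
We now insert the formulas of $c$ and $u_2$; note that
$$
c u_2 = \frac{(1+\beta\lambda_1\theta_1)(1+\beta\lambda_2\theta_2)}{\beta}, \qquad \frac{1}{\beta} - c u_2 = -\left( \lambda_1\theta_1 + \lambda_2\theta_2 + \beta\lambda_1\lambda_2\theta_1\theta_2 \right),
$$
and with some calculation, we obtain the stated formula of $I(\theta_1,\theta_2)$.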
Proof of Lemma 5.
We note that
$$
F_{Y_2 \mid Y_1 = y_1}(y_2) = \int_0^{y_2} \frac{g_Y(y_1,x)}{g_{Y_1}(y_1)}\, dx = \frac{1}{g_{Y_1}(y_1)}\, L_1(y_1;y_2) = \frac{\lambda_1 e^{-\lambda_1 y_1} \left[ 1 - (1+\beta\lambda_2 y_2)\, e^{-\lambda_2 y_2 (1+\beta\lambda_1 y_1)} \right]}{\lambda_1 e^{-\lambda_1 y_1}},
$$
where we used the formula of $L_1$ from Lemma 3. This easily yields the stated result. □
Proof of Proposition 1.
We prove the formulas for $X_1$; the formulas for $X_2$ result in a similar manner.
(i)
Since $f_{X_1}(x_1) = \int_0^{\infty} f(x_1,x_2)\, dx_2$, we have two cases:
Case $x_1 \le \theta_1$: it is easy to see that
$$
f_{X_1}(x_1) = \frac{r}{P_D} \int_0^{\infty} g_Y(x_1,x_2)\, dx_2 = \frac{r}{P_D}\, \lambda_1 e^{-\lambda_1 x_1}.
$$
Case $x_1 > \theta_1$: in this case,
$$
f_{X_1}(x_1) = \frac{r}{P_D} \int_0^{\theta_2} g_Y(x_1,x_2)\, dx_2 + (1-r) \int_{\theta_2}^{\infty} f_Z(x_1,x_2)\, dx_2 = \frac{r}{P_D}\, L_1(x_1;\theta_2) + (1-r)\, \frac{a\theta_1^a}{x_1^{a+1}}.
$$
We insert the formula of $L_1$ from Lemma 3 and obtain the stated formula of $f_{X_1}$.
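(ii)
Based on the formula of $f_{X_1}$, we again have two cases:
Case $x_1 \le \theta_1$: clearly, here we obtain the cdf of the exponential distribution of $Y_1$, multiplied by $r/P_D$.
Case $x_1 > \theta_1$: in this case,
$$
F_{X_1}(x_1) = \int_0^{\theta_1} \frac{r}{P_D}\, \lambda_1 e^{-\lambda_1 x}\, dx + \int_{\theta_1}^{x_1} \frac{r}{P_D}\, \lambda_1 e^{-\lambda_1 x}\, dx - \int_{\theta_1}^{x_1} \frac{r}{P_D}\, \lambda_1 e^{-\lambda_1 x} (1+\beta\lambda_2\theta_2)\, e^{-\lambda_2\theta_2 (1+\beta\lambda_1 x)}\, dx + \int_{\theta_1}^{x_1} (1-r)\, \frac{a\theta_1^a}{x^{a+1}}\, dx.
$$
The first two integrals add to $r/P_D$ times the cdf of the exponential distribution of $Y_1$ at $x_1$, while the last integral yields $1-r$ times the cdf of the Pareto distribution of $Z_1$. Therefore,
$$
\begin{aligned}
F_{X_1}(x_1) &= \frac{r}{P_D} \left( 1-e^{-\lambda_1 x_1} \right) - \frac{r}{P_D}\, \lambda_1 (1+\beta\lambda_2\theta_2)\, e^{-\lambda_2\theta_2} \int_{\theta_1}^{x_1} e^{-\lambda_1 x (1+\beta\lambda_2\theta_2)}\, dx + (1-r) \left[ 1 - \left( \frac{\theta_1}{x_1} \right)^a \right] \\
&= \frac{r}{P_D} \left( 1-e^{-\lambda_1 x_1} \right) + \frac{r}{P_D}\, e^{-\lambda_2\theta_2} \left. e^{-\lambda_1 x (1+\beta\lambda_2\theta_2)} \right|_{\theta_1}^{x_1} + (1-r) \left[ 1 - \left( \frac{\theta_1}{x_1} \right)^a \right] \\
&= \frac{r}{P_D} \left[ 1 - e^{-\lambda_1\theta_1-\lambda_2\theta_2-\beta\lambda_1\lambda_2\theta_1\theta_2} + e^{-\lambda_2\theta_2-\lambda_1 x_1 (1+\beta\lambda_2\theta_2)} - e^{-\lambda_1 x_1} \right] + (1-r) \left[ 1 - \left( \frac{\theta_1}{x_1} \right)^a \right] \\
&= r + \frac{r}{P_D} \left[ e^{-\lambda_2\theta_2-\lambda_1 x_1 (1+\beta\lambda_2\theta_2)} - e^{-\lambda_1 x_1} \right] + 1 - r - (1-r) \left( \frac{\theta_1}{x_1} \right)^a,
\end{aligned}
$$
where for the last equality, we used formula (3) of $P_D$. From here, the formula of $F_{X_1}$ is immediate. □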
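Proof of Proposition 2.
(i) The continuity condition $f_{X_1}(\theta_1-) = f_{X_1}(\theta_1+)$ yields
$$
\begin{aligned}
& \frac{r}{P_D}\, \lambda_1 e^{-\lambda_1\theta_1} = \frac{r}{P_D}\, \lambda_1 e^{-\lambda_1\theta_1} \left[ 1 - (1+\beta\lambda_2\theta_2)\, e^{-\lambda_2\theta_2 (1+\beta\lambda_1\theta_1)} \right] + (1-r)\, \frac{a\theta_1^a}{\theta_1^{a+1}} \\
\Leftrightarrow\ & -r\lambda_1 (1+\beta\lambda_2\theta_2)\, \frac{e^{-(\lambda_1\theta_1+\lambda_2\theta_2+\beta\lambda_1\lambda_2\theta_1\theta_2)}}{1-e^{-(\lambda_1\theta_1+\lambda_2\theta_2+\beta\lambda_1\lambda_2\theta_1\theta_2)}} + \frac{a}{\theta_1} - r\, \frac{a}{\theta_1} = 0 \\
\Leftrightarrow\ & \frac{a}{\theta_1} = r\, \frac{a}{\theta_1} \left[ 1 + \frac{\theta_1}{a}\, \frac{\lambda_1 (1+\beta\lambda_2\theta_2)}{e^{\lambda_1\theta_1+\lambda_2\theta_2+\beta\lambda_1\lambda_2\theta_1\theta_2}-1} \right],
\end{aligned}
$$
which yields Formula (i). The proof of Formula (ii) is similar.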
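(iii)
We equate $r_1 = r_2$ from (i) and (ii) and obtain
$$
\begin{aligned}
& 1 + \frac{\lambda_1\theta_1}{a}\, \frac{1+\beta\lambda_2\theta_2}{e^{\lambda_1\theta_1+\lambda_2\theta_2+\beta\lambda_1\lambda_2\theta_1\theta_2}-1} = 1 + \frac{\lambda_2\theta_2}{a}\, \frac{1+\beta\lambda_1\theta_1}{e^{\lambda_1\theta_1+\lambda_2\theta_2+\beta\lambda_1\lambda_2\theta_1\theta_2}-1} \\
\Leftrightarrow\ & \lambda_1\theta_1 (1+\beta\lambda_2\theta_2) = \lambda_2\theta_2 (1+\beta\lambda_1\theta_1) \ \Leftrightarrow\ \lambda_1\theta_1 = \lambda_2\theta_2.
\end{aligned}
$$
Moreover, the continuity condition at $(\theta_1,\theta_2)$ means $r = r_1 = r_2$; hence, using (11) and $\lambda_1\theta_1 = \lambda_2\theta_2$, we obtain
$$
\begin{aligned}
& 1 + \frac{(1+\beta\lambda_1\theta_1)(1+\beta\lambda_2\theta_2)-\beta}{\left( e^{\lambda_1\theta_1+\lambda_2\theta_2+\beta\lambda_1\lambda_2\theta_1\theta_2}-1 \right)(a+1)a}\, \lambda_1\lambda_2\theta_1\theta_2 = 1 + \frac{\lambda_1\theta_1}{a}\, \frac{1+\beta\lambda_2\theta_2}{e^{\lambda_1\theta_1+\lambda_2\theta_2+\beta\lambda_1\lambda_2\theta_1\theta_2}-1} \\
\Leftrightarrow\ & \left[ (1+\beta\lambda_1\theta_1)^2 - \beta \right] \lambda_1\theta_1 = (a+1)(1+\beta\lambda_1\theta_1),
\end{aligned}
$$
from which results the stated formula of $a$. □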
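The continuity conditions can be checked numerically against the parameter values quoted elsewhere in the paper. The sketch below solves the last displayed equation for $a$ and uses the expression for $r$ that follows from the final line of part (i). It is an illustration under assumptions, not the authors' code: `continuity_parameters` is a hypothetical helper, and the tolerance used to test $\lambda_1\theta_1 = \lambda_2\theta_2$ is an arbitrary choice that allows for rounded inputs.

```python
import math

def continuity_parameters(lam1, theta1, lam2, theta2, beta):
    """Return (a, r) implied by the continuity conditions of Proposition 2.

    a solves [(1 + beta*t)^2 - beta] * t = (a + 1) * (1 + beta*t), t = lam1*theta1.
    r = 1 / (1 + t * (1 + beta*lam2*theta2) / (a * (exp(A) - 1))), with
    A = lam1*theta1 + lam2*theta2 + beta*lam1*lam2*theta1*theta2.
    """
    t1, t2 = lam1 * theta1, lam2 * theta2
    # Condition (iii); loose tolerance because published values are rounded.
    if not math.isclose(t1, t2, rel_tol=1e-3):
        raise ValueError("continuity requires lam1*theta1 == lam2*theta2")
    u = 1 + beta * t1
    a = t1 * (u * u - beta) / u - 1
    big_a = t1 + t2 + beta * t1 * t2
    r = 1 / (1 + t1 * (1 + beta * t2) / (a * math.expm1(big_a)))
    return a, r

# Parameters from the captions of Figures 1 and 2, and the estimates in Table 4.
a_fig1, _ = continuity_parameters(0.81, 2.1, 0.9, 1.89, 0.2)
a_fig2, _ = continuity_parameters(1.0, 1.2, 1.2, 1.0, 0.7)
a_tab4, r_tab4 = continuity_parameters(1.4184, 0.9870, 11.1996, 0.1250, 0.0455)

print(f"Figure 1: a = {a_fig1:.4f}")
print(f"Figure 2: a = {a_fig2:.4f}")
print(f"Table 4 : a = {a_tab4:.4f}, r = {r_tab4:.4f}")
```

Because $a$ and $r$ are determined by the remaining parameters once all continuity conditions are imposed, only $\lambda_1$, $\lambda_2$, $\beta$ and one threshold remain free in that case.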
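Proof of Proposition 3.
We calculate the expected value and the second-order moment for $X_1$ (those of $X_2$ result in a similar way). Using the expected value of the exponential and Pareto distributions, we have
$$
\begin{aligned}
E[X_1] &= \int_0^{\infty} x_1 f_{X_1}(x_1)\, dx_1 \\
&= \frac{r}{P_D} \int_0^{\theta_1} \lambda_1 x_1 e^{-\lambda_1 x_1}\, dx_1 + \frac{r}{P_D} \int_{\theta_1}^{\infty} \lambda_1 x_1 e^{-\lambda_1 x_1} \left[ 1 - (1+\beta\lambda_2\theta_2)\, e^{-\lambda_2\theta_2 (1+\beta\lambda_1 x_1)} \right] dx_1 + (1-r) \int_{\theta_1}^{\infty} x_1\, \frac{a\theta_1^a}{x_1^{a+1}}\, dx_1 \\
&= \frac{r}{P_D} \left[ \int_0^{\infty} \lambda_1 x_1 e^{-\lambda_1 x_1}\, dx_1 - \lambda_1 (1+\beta\lambda_2\theta_2)\, e^{-\lambda_2\theta_2} \int_{\theta_1}^{\infty} x_1 e^{-\lambda_1 x_1 (1+\beta\lambda_2\theta_2)}\, dx_1 \right] + (1-r)\, \frac{a\theta_1}{a-1} \\
&= \frac{r}{P_D} \left[ \frac{1}{\lambda_1} - \lambda_1 (1+\beta\lambda_2\theta_2)\, e^{-\lambda_2\theta_2}\, \Gamma\left( 2,\theta_1,\infty;\lambda_1(1+\beta\lambda_2\theta_2) \right) \right] + (1-r)\, \frac{a\theta_1}{a-1}.
\end{aligned}
$$
Inserting (iii.2) from Lemma 1 yields
$$
E[X_1] = \frac{r}{P_D} \left[ \frac{1}{\lambda_1} - \lambda_1 (1+\beta\lambda_2\theta_2)\, e^{-\lambda_2\theta_2}\, \frac{e^{-\lambda_1\theta_1 (1+\beta\lambda_2\theta_2)}}{\lambda_1^2 (1+\beta\lambda_2\theta_2)^2} \left( 1+\lambda_1\theta_1 (1+\beta\lambda_2\theta_2) \right) \right] + (1-r)\, \frac{a\theta_1}{a-1},
$$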
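from which the expected value formula is immediate. The moment of second order is
$$
\begin{aligned}
E[X_1^2] &= \int_0^{\infty} x_1^2 f_{X_1}(x_1)\, dx_1 \\
&= \frac{r}{P_D} \left[ \lambda_1 \int_0^{\infty} x_1^2 e^{-\lambda_1 x_1}\, dx_1 - \lambda_1 (1+\beta\lambda_2\theta_2)\, e^{-\lambda_2\theta_2} \int_{\theta_1}^{\infty} x_1^2 e^{-\lambda_1 x_1 (1+\beta\lambda_2\theta_2)}\, dx_1 \right] + (1-r) \int_{\theta_1}^{\infty} x_1^2\, \frac{a\theta_1^a}{x_1^{a+1}}\, dx_1 \\
&= \frac{r\lambda_1}{P_D} \left[ \Gamma(3,0,\infty;\lambda_1) - (1+\beta\lambda_2\theta_2)\, e^{-\lambda_2\theta_2}\, \Gamma\left( 3,\theta_1,\infty;\lambda_1(1+\beta\lambda_2\theta_2) \right) \right] + (1-r)\, \frac{a\theta_1^2}{a-2}.
\end{aligned}
$$
Based on (iv.2) from Lemma 1, we obtain
$$
E[X_1^2] = \frac{r\lambda_1}{P_D} \left\{ \frac{2}{\lambda_1^3} - (1+\beta\lambda_2\theta_2)\, e^{-\lambda_2\theta_2}\, \frac{e^{-\lambda_1\theta_1 (1+\beta\lambda_2\theta_2)}}{\lambda_1^3 (1+\beta\lambda_2\theta_2)^3} \left[ 1 + \left( \lambda_1\theta_1 (1+\beta\lambda_2\theta_2) + 1 \right)^2 \right] \right\} + (1-r)\, \frac{a\theta_1^2}{a-2}.
$$
The stated formula of $E[X_1^2]$ easily results from here, which completes the proof. □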
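Proof of Proposition 4.
We write
$$
\begin{aligned}
E[X_1 X_2] &= \int_0^{\infty} \int_0^{\infty} x_1 x_2 f(x_1,x_2)\, dx_1\, dx_2 \\
&= \iint_D \frac{r\lambda_1\lambda_2}{P_D}\, x_1 x_2\, e^{-(\lambda_1 x_1+\lambda_2 x_2+\beta\lambda_1\lambda_2 x_1 x_2)} \left[ (1+\beta\lambda_1 x_1)(1+\beta\lambda_2 x_2) - \beta \right] dx_1\, dx_2 \\
&\quad + \iint_{D_{22}} (1-r)\, a(a+1) (\theta_1\theta_2)^{a+1}\, \frac{x_1 x_2}{(\theta_2 x_1+\theta_1 x_2-\theta_1\theta_2)^{a+2}}\, dx_1\, dx_2 \\
&= \frac{r}{P_D}\, I_1 + (1-r)\, I_2.
\end{aligned}
$$
We separately calculate the two integrals. We start with the second one, which from Formula (5) is given by
$$
I_2 = \int_{\theta_1}^{\infty} \int_{\theta_2}^{\infty} \frac{a(a+1)\, x_1 x_2\, (\theta_1\theta_2)^{a+1}}{(\theta_2 x_1+\theta_1 x_2-\theta_1\theta_2)^{a+2}}\, dx_2\, dx_1 = \theta_1\theta_2\, \frac{a^2-a-1}{(a-1)(a-2)}.
$$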
As for $I_1$, we note that given the definition of the domain $D$ and with the notation from Lemma 4, we have
$$
I_1 = \iint_D \lambda_1\lambda_2\, x_1 x_2\, e^{-(\lambda_1 x_1+\lambda_2 x_2+\beta\lambda_1\lambda_2 x_1 x_2)} \left[ (1+\beta\lambda_1 x_1)(1+\beta\lambda_2 x_2) - \beta \right] dx_1\, dx_2 = I(0,0) - I(\theta_1,\theta_2).
$$
Now using the formula in Lemma 4, we note that
$$
I(0,0) = \frac{1}{\beta\lambda_1\lambda_2}\, E_1\left( \frac{1}{\beta} \right) e^{1/\beta},
$$
and the stated formula of $E[X_1 X_2]$ results immediately. □
Proof of Proposition 5.
We recall that
$$
f_{X_2 \mid X_1 = x_1}(x_2) = \frac{f(x_1,x_2)}{f_{X_1}(x_1)},
$$
and according to Proposition 1, we note that we must consider three different cases:
$(x_1 \le \theta_1,\ x_2 > 0)$; $(x_1 > \theta_1,\ x_2 \le \theta_2)$; and $(x_1 > \theta_1,\ x_2 > \theta_2)$.
Case I: $x_1 \le \theta_1$ and $x_2 > 0$. In this case,
$$
F_{X_2 \mid X_1 = x_1}(x_2) = \int_0^{x_2} \frac{\frac{r}{P_D}\, g_Y(x_1,x)}{\frac{r}{P_D}\, \lambda_1 e^{-\lambda_1 x_1}}\, dx,
$$
and using Lemma 5, we obtain the first formula of $F_{X_2 \mid X_1 = x_1}$.
Case II: $x_1 > \theta_1$ and $x_2 \le \theta_2$. Now, we have
$$
F_{X_2 \mid X_1 = x_1}(x_2) = \int_0^{x_2} \frac{\frac{r}{P_D}\, g_Y(x_1,x)}{\frac{r}{P_D}\, \lambda_1 e^{-\lambda_1 x_1} \left[ 1 - (1+\beta\lambda_2\theta_2)\, e^{-\lambda_2\theta_2 (1+\beta\lambda_1 x_1)} \right] + (1-r)\, \frac{a\theta_1^a}{x_1^{a+1}}}\, dx,
$$
and, as in Case I, we easily get the second formula of $F_{X_2 \mid X_1 = x_1}(x_2)$.
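Case III: $x_1 > \theta_1$ and $x_2 > \theta_2$. In this case,
$$
\begin{aligned}
F_{X_2 \mid X_1 = x_1}(x_2) &= \int_0^{\theta_2} \frac{\frac{r}{P_D}\, g_Y(x_1,x)}{\frac{r}{P_D}\, \lambda_1 e^{-\lambda_1 x_1} \left[ 1 - (1+\beta\lambda_2\theta_2)\, e^{-\lambda_2\theta_2 (1+\beta\lambda_1 x_1)} \right] + (1-r)\, \frac{a\theta_1^a}{x_1^{a+1}}}\, dx \\
&\quad + \int_{\theta_2}^{x_2} \frac{(1-r)(a+1)a\, (\theta_1\theta_2)^{a+1} (\theta_2 x_1+\theta_1 x-\theta_1\theta_2)^{-(a+2)}}{\frac{r}{P_D}\, \lambda_1 e^{-\lambda_1 x_1} \left[ 1 - (1+\beta\lambda_2\theta_2)\, e^{-\lambda_2\theta_2 (1+\beta\lambda_1 x_1)} \right] + (1-r)\, \frac{a\theta_1^a}{x_1^{a+1}}}\, dx.
\end{aligned}
$$
The first integral equals the formula obtained in Case II by taking $x_2 = \theta_2$, while for the second integral, we evaluate
$$
\begin{aligned}
J &= (a+1)a \int_{\theta_2}^{x_2} (\theta_1\theta_2)^{a+1} (\theta_2 x_1+\theta_1 x-\theta_1\theta_2)^{-a-2}\, dx \\
&= (a+1)a\, (\theta_1\theta_2)^{a+1}\, \frac{1}{\theta_1} \left. \frac{(\theta_2 x_1+\theta_1 x-\theta_1\theta_2)^{-a-1}}{-(a+1)} \right|_{\theta_2}^{x_2} \\
&= -a\theta_1^a\theta_2^{a+1} \left[ (\theta_2 x_1+\theta_1 x_2-\theta_1\theta_2)^{-a-1} - (\theta_2 x_1+\theta_1\theta_2-\theta_1\theta_2)^{-a-1} \right] \\
&= -\frac{a\theta_1^a\theta_2^{a+1}}{(\theta_2 x_1+\theta_1 x_2-\theta_1\theta_2)^{a+1}} + \frac{a\theta_1^a\theta_2^{a+1}}{\theta_2^{a+1} x_1^{a+1}}.
\end{aligned}
$$
Therefore,
$$
\begin{aligned}
F_{X_2 \mid X_1 = x_1}(x_2) &= \frac{\frac{r}{P_D}\, \lambda_1 e^{-\lambda_1 x_1} \left[ 1 - e^{-\lambda_2\theta_2 (1+\beta\lambda_1 x_1)} (1+\beta\lambda_2\theta_2) \right]}{\frac{r}{P_D}\, \lambda_1 e^{-\lambda_1 x_1} \left[ 1 - (1+\beta\lambda_2\theta_2)\, e^{-\lambda_2\theta_2 (1+\beta\lambda_1 x_1)} \right] + (1-r)\, \frac{a\theta_1^a}{x_1^{a+1}}} \\
&\quad + \frac{(1-r) \left[ \frac{a\theta_1^a}{x_1^{a+1}} - \frac{a\theta_1^a\theta_2^{a+1}}{(\theta_2 x_1+\theta_1 x_2-\theta_1\theta_2)^{a+1}} \right]}{\frac{r}{P_D}\, \lambda_1 e^{-\lambda_1 x_1} \left[ 1 - (1+\beta\lambda_2\theta_2)\, e^{-\lambda_2\theta_2 (1+\beta\lambda_1 x_1)} \right] + (1-r)\, \frac{a\theta_1^a}{x_1^{a+1}}},
\end{aligned}
$$
from which the last formula of $F_{X_2 \mid X_1 = x_1}$ is immediate. □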

References

  1. Sarabia, J.M.; Gómez-Déniz, E. Construction of multivariate distributions: A review of some recent results. SORT 2008, 32, 3–36.
  2. Klugman, S.A.; Panjer, H.H.; Willmot, G.E. Loss Models: From Data to Decisions; John Wiley & Sons: New York, NY, USA, 1998.
  3. Scarrott, C. Univariate extreme value mixture modeling. In Extreme Value Modeling and Risk Analysis: Methods and Applications; Dey, D.K., Yan, J., Eds.; CRC Press: Boca Raton, FL, USA, 2016; pp. 41–67.
  4. Cooray, K.; Ananda, M.M. Modeling actuarial data with a composite lognormal-Pareto model. Scand. Actuar. J. 2005, 5, 321–334.
  5. Bolancé, C.; Guillen, M.; Pelican, E.; Vernic, R. Skewed bivariate models and nonparametric estimation for the CTE risk measure. Insur. Math. Econ. 2008, 43, 386–393.
  6. Bahraoui, Z.; Bolancé, C.; Pelican, E.; Vernic, R. On the bivariate Sarmanov distribution and copula. An application on insurance data using truncated marginal distributions. SORT 2015, 39, 209–230.
  7. Gumbel, E.J. Bivariate exponential distributions. J. Am. Stat. Assoc. 1960, 55, 698–707.
  8. Kotz, S.; Balakrishnan, N.; Johnson, N.L. Continuous Multivariate Distributions, Volume 1: Models and Applications; John Wiley & Sons: New York, NY, USA, 2004.
  9. Castillo, E.; Sarabia, J.M.; Hadi, A.S. Fitting continuous bivariate distributions to data. Statistician 1997, 46, 355–369.
  10. Mardia, K.V. Multivariate Pareto distributions. Ann. Math. Stat. 1962, 33, 1008–1015.
  11. Scarrott, C.J.; MacDonald, A. A review of extreme value threshold estimation and uncertainty quantification. REVSTAT Stat. J. 2012, 10, 33–60.
  12. Bahraoui, Z.; Bolancé, C.; Pérez-Marín, A.M. Testing extreme value copulas to estimate the quantile. SORT 2014, 38, 89–102.
  13. Chen, J. Optimal rate of convergence for finite mixture models. Ann. Stat. 1995, 23, 221–233.
  14. Kim, D.; Lindsay, B.G. Empirical identifiability in finite mixture models. Ann. Inst. Stat. Math. 2015, 67, 745–772.
Figure 1. Left: composite Gumbel–Pareto pdf with continuous marginals and continuity at $(\theta_1, \theta_2)$; Right: zoom of the same pdf (parameters: $\lambda_1 = 0.81$, $\lambda_2 = 0.9$, $\beta = 0.2$, $a = 1.0258$, $\theta_1 = 2.1$, $\theta_2 = 1.89$).
Figure 2. Composite Gumbel–Pareto pdf with continuous marginals and continuity at $(\theta_1, \theta_2)$ (parameters: $\lambda_1 = 1$, $\lambda_2 = 1.2$, $\beta = 0.7$, $a = 0.7515$, $\theta_1 = 1.2$, $\theta_2 = 1$).
Figure 3. Marginal pdfs of composite Gumbel–Pareto distribution: left, with parameters from Figure 1; right, with parameters from Figure 2 ($f_{X_1}$ solid line, $f_{X_2}$ dashed line).
Figure 4. Scatterplots of $X_1$ vs. $X_2$ in original (left) and natural logarithm (right) scales.
Figure 5. Exponential pdf fitted by MLE and sample data shown as points on the horizontal axis for both marginals.
Figure 6. Histogram of real data (left) and Gumbel–Pareto pdf with the estimated parameters (right).
Figure 7. Histogram of real data marginals with fitted pdfs: left, $X_1$; right, $X_2$.
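Table 1. Simulation results with 100 replicas for MSE and MAE with sample sizes n = 200 and n = 1000.

Simulation Method I

| n | $\lambda_1$ MSE | $\lambda_1$ MAE | $\lambda_2$ MSE | $\lambda_2$ MAE | $\beta$ MSE | $\beta$ MAE | $a$ MSE | $a$ MAE | $\theta_1$ MSE | $\theta_1$ MAE | $\theta_2$ MSE | $\theta_2$ MAE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 200 | 0.0037 | 0.0538 | 0.0060 | 0.0717 | 0.0124 | 0.1077 | 0.3855 | 0.6152 | 0.0530 | 0.2226 | 0.0144 | 0.1069 |
| 1000 | 0.0034 | 0.0511 | 0.0048 | 0.0610 | 0.0118 | 0.1044 | 0.0275 | 0.1625 | 0.0450 | 0.2109 | 0.0049 | 0.0619 |

Simulation Method II

| n | $\lambda_1$ MSE | $\lambda_1$ MAE | $\lambda_2$ MSE | $\lambda_2$ MAE | $\beta$ MSE | $\beta$ MAE | $a$ MSE | $a$ MAE | $\theta_1$ MSE | $\theta_1$ MAE | $\theta_2$ MSE | $\theta_2$ MAE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 200 | 0.0054 | 0.0721 | 0.0041 | 0.0447 | 0.1530 | 0.3900 | 0.0640 | 0.2493 | 0.0694 | 0.2619 | 0.0985 | 0.3010 |
| 1000 | 0.0048 | 0.0610 | 0.0002 | 0.0131 | 0.1234 | 0.3367 | 0.0174 | 0.1267 | 0.0638 | 0.2564 | 0.0567 | 0.2229 |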
Table 2. Descriptive statistics of property damage and third-party liability costs ($X_1$) and exceptional medical expenses ($X_2$).

| | Mean | STD | Min | Q25 | Median | Q75 | Max | Kurtosis | Skewness |
|---|---|---|---|---|---|---|---|---|---|
| $X_1$ | 1.83 | 6.87 | 0.01 | 0.26 | 0.68 | 1.39 | 137.94 | 15.70 | 301.30 |
| $X_2$ | 0.28 | 0.86 | 0.00 | 0.02 | 0.09 | 0.20 | 11.86 | 8.06 | 85.35 |
Table 3. Sample linear and rank correlation coefficients.

| | Pearson | Kendall | Spearman |
|---|---|---|---|
| Correlation | 0.7288 | 0.4252 | 0.5903 |
Table 4. MLE of bivariate distributions with standard errors in parentheses.

| | Gumbel | Gumbel–Pareto |
|---|---|---|
| $\hat{\lambda}_1$ | 0.5472 (0.0240) | 1.4184 (0.0328) |
| $\hat{\lambda}_2$ | 3.5221 (0.1548) | 11.1996 |
| $\hat{\theta}_1$ | - | 0.9870 (0.0040) |
| $\hat{\theta}_2$ | - | 0.1250 (0.0003) |
| $\hat{\beta}$ | 0.0000 | 0.0455 (0.0465) |
| $a$ | - | 0.4292 |
| $r$ | - | 0.8303 |
| $\log L$ | −696.1630 | −272.5549 |
| AIC | 1398.3261 | 557.1097 |
| BIC | 1411.0759 | 582.6096 |
| CAIC | 1414.0759 | 588.6096 |
Table 5. Value-at-Risk for the empirical distribution and alternative distributions, obtained using Monte Carlo simulation.

| | 95% | 99% | 99.50% |
|---|---|---|---|
| Empirical | 7.926 | 25.409 | 31.216 |
| Gumbel | 6.312 | 9.700 | 11.178 |
| Gumbel–Pareto | 6.361 | 114.067 | 410.897 |
| Log-normal | 6.529 | 15.122 | 20.787 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Badea, A.; Bolancé, C.; Vernic, R. On the Bivariate Composite Gumbel–Pareto Distribution. Stats 2022, 5, 948-969. https://doi.org/10.3390/stats5040055

AMA Style

Badea A, Bolancé C, Vernic R. On the Bivariate Composite Gumbel–Pareto Distribution. Stats. 2022; 5(4):948-969. https://doi.org/10.3390/stats5040055

Chicago/Turabian Style

Badea, Alexandra, Catalina Bolancé, and Raluca Vernic. 2022. "On the Bivariate Composite Gumbel–Pareto Distribution" Stats 5, no. 4: 948-969. https://doi.org/10.3390/stats5040055
