Abstract
In this paper, we propose a bivariate extension of univariate composite (two-spliced) distributions defined by a bivariate Pareto distribution for values larger than some thresholds and by a bivariate Gumbel distribution on the complementary domain. The purpose of this distribution is to capture the behavior of bivariate data consisting of mainly small and medium values but also of some extreme values. Some properties of the proposed distribution are presented. Further, two estimation procedures are discussed and illustrated on simulated data and on a real data set consisting of a bivariate sample of claims from an auto insurance portfolio. In addition, the risk of loss in this insurance portfolio is estimated by Monte Carlo simulation.
1. Introduction
Dependent multivariate data frequently occur in practice in areas such as insurance, finance, economics, reliability, etc. Therefore, the development of bivariate and multivariate distributions is a very active field of research, one that, in contrast to the univariate case, gained momentum only later. Nowadays, there are various methods of constructing multivariate distributions; see, e.g., the review [1]. Some of these methods follow the lines of univariate constructions. In this spirit, in this paper we propose a bivariate composite distribution built on the same idea as the univariate composite (or two-spliced) distribution (see [2] for the splicing method in the univariate case).
Two-component spliced distributions are usually encountered in univariate extreme value theory, where a classical heavy-tailed distribution (such as the generalized Pareto) is used to model the tail, in combination with a less heavy-tailed distribution used for the so-called bulk model; see, e.g., the review [3]. More precisely, such a distribution is defined from two different distributions on distinct intervals, with the aim to better capture tails of distributions such as the loss ones. A two-component spliced distribution was called composite in [4], where a particular form of such distribution, namely the lognormal–Pareto composite distribution, was studied in connection with skewed and heavy-tailed loss data.
Therefore, the bivariate distribution we propose equals a certain bivariate distribution on one domain and another bivariate distribution on another domain. More precisely, we aim to use a more heavy-tailed bivariate distribution beyond some thresholds, such as the Pareto one. As in the univariate case, the motivation of such a model is to better capture the behavior of dependent random data that present many small and medium pairs of values but also some very large ones; we note that this could be the case with, e.g., insurance or financial data arising from two dependent lines of business. In this sense, we recall the discussions in [5,6], where it was noticed that for the particular bivariate insurance data set under study (consisting of auto claims, property damage costs, and medical expenses), the best globally fitted distribution does not provide the best model for tail risk measures because a heavier-tailed distribution is needed.
Thus, in this paper, we consider the bivariate Pareto distribution of the first kind for the tail (i.e., from some thresholds on) and the bivariate Gumbel exponential distribution for the remaining domain. In Section 2, we define some notation and recall the just-mentioned bivariate distributions. In Section 3, we define the general bivariate composite distribution, while in Section 4, we introduce the particular composite Gumbel–Pareto distribution and study some continuity conditions, marginal distributions, and moments. Further, we discuss simulation from this particular bivariate distribution, and in order to reduce the computing time, we propose two procedures for parameter estimation: the first one is based on marginal estimation and completed by a limited full Maximum Likelihood Estimation (MLE), and the second one is based on conditional MLE. The estimation procedures are illustrated on simulated data in Section 5.1 and on a real auto insurance data set in Section 5.2, followed by a conclusions section. The paper ends with Appendix A containing the proofs.
2. Preliminaries
2.1. Notation
We shall use the incomplete gamma function (or generalized plica function) defined by
We also introduce the notation
and note that
We recall the exponential integral notation
The following result holds (its proof is given in Appendix A).
Lemma 1.
With the above notation, with
In particular,
In particular,
For , we also define the following domains
2.2. Bivariate Classical Distributions
The following two bivariate continuous distributions are used in the bivariate composite model.
2.2.1. Gumbel’s Bivariate Exponential Distribution,
Gumbel’s [7] bivariate exponential distribution has pdf (see also [8])
with standard exponential marginal distributions. We shall, however, consider a more general bivariate pdf, having general exponential distributions (see e.g., [9]). Therefore, let follow Gumbel’s bivariate exponential distribution, defined by the joint pdf
Its cdf is given by
its joint survival function is
while the marginal distributions are exponentials with pdf , cdf , and expected value
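In view of the simulation needs discussed later, exact sampling from Gumbel's bivariate exponential distribution is possible: in the standard parameterization, conditionally on the first coordinate, the second follows a two-component mixture of an exponential and a gamma density. The following Python sketch (the paper's own computations use R) relies on this elementary decomposition, which is our derivation rather than a formula stated in the paper; the rescaling to general exponential marginals is likewise our assumption.

```python
import numpy as np

def rgumbel_biexp(n, theta, lam1=1.0, lam2=1.0, rng=None):
    """Sample from Gumbel's type-I bivariate exponential distribution.

    Standard form: f(x, y) = ((1 + theta*x)*(1 + theta*y) - theta) * exp(-x - y - theta*x*y),
    x, y >= 0, 0 <= theta <= 1, with Exp(1) marginals. General exponential
    marginals are obtained by rescaling (our assumption, mirroring the paper's
    generalization).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = rng.exponential(size=n)            # first coordinate ~ Exp(1)
    a = 1.0 + theta * x                    # conditional rate parameter
    # Given x, the conditional density is a mixture:
    # Exp(rate a) with prob (a - theta)/a, Gamma(shape 2, rate a) with prob theta/a.
    use_gamma = rng.random(n) < theta / a
    y = np.where(use_gamma,
                 rng.gamma(2.0, 1.0, size=n) / a,   # Gamma(2, rate a)
                 rng.exponential(size=n) / a)       # Exp(rate a)
    return x / lam1, y / lam2
```

Both marginal sample means should then be close to the theoretical exponential means.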
In view of the bivariate composite model defined in the next section, an easy calculation yields the following lemma.
Lemma 2.
Let and Then, with the above notation, it holds that
The next lemmas are also needed.
Lemma 3.
If and then
Lemma 4.
If and then
Lemma 5.
Given that , the conditional cdf of the marginal of is
2.2.2. Bivariate Pareto Distribution of the First Kind,
Let follow the bivariate Pareto of the first kind distribution, Its pdf is (see [10])
Its marginal distributions are univariate Pareto of the first kind, having pdf and cdf, respectively,
Moreover, we recall the formulas of the expected values and variances
while the formula of the covariance is
From here, it is easy to see that
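Because the first marginal of the bivariate Pareto of the first kind is univariate Pareto and the conditional survival function of the second coordinate given the first has a closed-form inverse, inversion sampling is straightforward; this matches the remark in Section 4.2 that bivariate Pareto values can be generated by inversion. A hedged Python sketch follows (the threshold names `t1`, `t2` and the exact parameterization are our assumptions, based on Mardia's form of the distribution):

```python
import numpy as np

def rbipareto(n, a, t1, t2, rng=None):
    """Inversion sampler for the bivariate Pareto of the first kind (Mardia),
    with density proportional to (x1/t1 + x2/t2 - 1)^-(a+2) on x1 > t1, x2 > t2, a > 0.
    The marginal of X1 is Pareto(a, t1); given X1 = x1, the conditional survival
    of X2 is ((x1/t1 + x2/t2 - 1)/(x1/t1))^-(a+1), invertible in closed form.
    """
    rng = np.random.default_rng() if rng is None else rng
    u1, u2 = rng.random(n), rng.random(n)
    x1 = t1 * u1 ** (-1.0 / a)                              # invert marginal Pareto cdf
    z = x1 / t1
    x2 = t2 * (1.0 + z * (u2 ** (-1.0 / (a + 1.0)) - 1.0))  # invert conditional survival
    return x1, x2
```

For a > 1, the sample means should approach the marginal Pareto expectations a*t_i/(a - 1).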
3. A Bivariate Composite Model
We shall now define the bivariate composite model. Let be a bivariate random vector, and let . We say that follows a bivariate composite distribution if its pdf is defined as
where is a normalizing constant. We note that, in general, and are pdfs of distributions truncated on the domains D and , respectively. Therefore, we can rewrite this composite distribution as a two-component mixture model with mixing weights r and , i.e.,
This form can be used for random number generation.
We would like our pdf to be at least continuous. However, in this case, the bivariate density changes shape on the line segments and , which makes full continuity difficult to achieve; more precisely, imposing continuity on, e.g., the first segment results in
which, in general, cannot be satisfied for all . We can impose a continuity condition at and obtain the restriction for r
We can also impose continuity conditions on the marginal pdfs, since each one is two-spliced, as we will see in the next section.
4. Particular Case: Bivariate Composite Gumbel–Pareto Distribution
In particular, we shall assume that is the pdf of a Gumbel bivariate exponential distribution truncated on the domain D, and that is a bivariate Pareto pdf defined on , which is left truncated by its nature. Therefore, let and let follow Gumbel’s bivariate distribution (1) truncated on the domain with parameters and having pdf
Additionally, let Then, using from (3), the pdf (6) of becomes
Note that by taking , we obtain the bivariate Pareto pdf; with , we obtain the bivariate Gumbel truncated on the domain D; if we take and , (10) reduces to the usual Gumbel pdf. If , the Gumbel component becomes the bivariate exponential with independent marginals.
If we impose the continuity condition at , we obtain the following formula of r
In the left side of Figure 1, we plotted a composite Gumbel–Pareto pdf satisfying the marginal continuity conditions and the continuity condition at ; see (iii) in Proposition 2. However, as discussed above, this pdf is not continuous everywhere; e.g., the continuity condition (8) becomes, in this case, which, given the pdfs and , cannot be satisfied for all . This can be seen from the right plot of the same figure, where we zoom in on the threshold lines and .
Figure 1.
Left: composite Gumbel–Pareto pdf with continuous marginals and continuity at ; Right: zoom of the same pdf (parameters: ).
In Figure 2, we plotted another composite Gumbel–Pareto pdf with different parameters and all continuity conditions, having a more heavy-tailed Pareto component (). A certain flexibility of the pdf’s shape can be noticed from the two plots. However, in both pdf plots, note the areas of strong decrease for small values of and due to the exponential characteristic of the Gumbel distribution.
Figure 2.
Composite Gumbel–Pareto pdf with continuous marginals and continuity at (parameters: .
We also plotted in Figure 3 the marginal pdfs of the two composite Gumbel–Pareto distributions considered in Figure 1 and Figure 2, and we note their continuity and exponential type shapes for small values of x.
4.1. Some Properties
The marginal distributions of are both of univariate composite type, having a standard exponential pdf up to the threshold.
Proposition 1.
(i) For the composite Gumbel–Pareto distribution, the marginal pdfs of and are given by
(ii) Further, the cdfs of and are
We can impose marginal continuity conditions and combine them with the continuity condition at . The following restrictions result.
Proposition 2.
Let follow the bivariate composite Gumbel–Pareto distribution. Then:
- (i)
- By imposing the continuity condition to the marginal we obtain
- (ii)
- By imposing the continuity condition to the marginal we obtain
- (iii)
- By simultaneously imposing continuity conditions to the marginals and we obtainIf, moreover, we also impose the continuity condition at the following restriction must be fulfilled
Proposition 3.
(i) The expected values of the marginals are given for by
(ii) The second-order moments of the marginals are given for by
Proposition 4.
The expected value of the product is
In view of the random generation procedure, we also need the following result on the conditional distribution of a marginal.
Proposition 5.
The conditional cdf of the marginal given is
4.2. Simulation
We propose two methods for generating random values from the bivariate composite Gumbel–Pareto distribution. The first one is the inversion method, while the second one is based on the representation in expression (7).
Method I: In the bivariate case, the inversion method consists of two steps:
- 1.
- Generate a value from the marginal distribution of by inverting its cdf given in Proposition 1;
- 2.
- Generate a value from the conditional distribution of given by inverting the conditional cdf given in Proposition 5. Thus, the resulting pair is simulated from (10).
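The two inversion steps above use the cdfs of Propositions 1 and 5, which are piecewise; when a closed-form inverse is inconvenient, each step can instead be carried out by numerical root finding. A generic sketch follows (the bracket endpoints and the illustrative exponential cdf are our assumptions, not the paper's composite cdfs):

```python
import numpy as np
from scipy.optimize import brentq

def invert_cdf(cdf, u, lo=0.0, hi=1e8):
    """Numerically invert a continuous cdf at probability u via root finding.
    The bracket [lo, hi] must contain the quantile (an assumption)."""
    return brentq(lambda x: cdf(x) - u, lo, hi)

# Illustration with a known cdf (Exp(2)); the composite marginal cdf of
# Proposition 1 or the conditional cdf of Proposition 5 would be plugged
# in the same way.
exp_cdf = lambda x, lam=2.0: 1.0 - np.exp(-lam * x)
q = invert_cdf(exp_cdf, 0.5)   # median of Exp(2) = ln(2)/2
```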
Method II: Starting from the two-component mixture representation (7) with mixing weights r and , we propose the following algorithm:
- 1.
- Generate a value b from the Bernoulli distribution with parameter r;
- 2.
- If , then generate the pair from the Gumbel distribution truncated on D;
- 3.
- If , then generate the pair from the bivariate Pareto distribution (4).
Now the problem is to generate values from the two bivariate distributions: Gumbel and Pareto. Bivariate Pareto values can be generated without difficulty by the inversion method as described in Method I. Concerning the Gumbel distribution truncated on D, the following cdfs (obtained similarly to the ones in Propositions 1 and 5) can be used for inversion:
The cdf of the truncated Gumbel marginal, :
The conditional cdf of the marginal given of the truncated Gumbel distribution:
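The mixture algorithm of Method II can be sketched as follows, with `rgumbel` and `rpareto` standing for assumed single-pair samplers of the two component distributions. Restricting the Gumbel draw to D by rejection is one simple implementation of the truncation; the paper instead inverts the truncated cdfs given above.

```python
import numpy as np

def rcomposite(n, r, x0, y0, rgumbel, rpareto, rng=None):
    """Method II sketch: two-component mixture sampling for the composite model.
    `rgumbel(rng)` and `rpareto(rng)` are assumed user-supplied samplers
    returning one (x, y) pair each. The Gumbel draw is restricted to D
    (not both coordinates beyond the thresholds) by rejection."""
    rng = np.random.default_rng() if rng is None else rng
    out = np.empty((n, 2))
    for i in range(n):
        if rng.random() < r:                  # Bernoulli(r): Gumbel branch
            while True:                       # reject draws outside D
                x, y = rgumbel(rng)
                if not (x > x0 and y > y0):
                    out[i] = (x, y)
                    break
        else:                                 # Pareto branch, beyond both thresholds
            out[i] = rpareto(rng)
    return out
```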
4.3. Parameter Estimation
For a univariate composite distribution, estimating the parameters is already a difficult problem because the threshold where the distribution changes shape is itself a parameter. Therefore, the usual approach in the univariate case consists of sorting the data, assuming that the threshold lies between each two consecutive data points, and finding the corresponding MLE solution; then, the best MLE solution is selected from among the available ones. Alternatively, a set of possible thresholds can be defined, and for each such value, the resulting likelihood is maximized; see also the review [11] for threshold estimation approaches.
In the bivariate case, the estimation problem becomes even more difficult because there are two unknown thresholds to estimate. Let be a bivariate data sample of size n, let denote the rest of the parameters of the bivariate density defined in (10) (note that r might be obtained from a continuity condition such as (11) or the ones in Proposition 2, if imposed), and let L denote the likelihood function
The log-likelihood function defined from (12) is the weighted sum of the two partial log-likelihood functions associated with the two distributions of the composite model: the Gumbel and the Pareto. Since the MLE exists for both distributions (see [10] for the bivariate Pareto distribution), for a known r we can easily find the MLE of our composite model. The aim of the proposed MLE procedures is to find the best value of r.
In the following, we propose two alternative methods to estimate the parameters.
Method 1: An approach similar to the one described in the univariate case would be to sort the marginal data, obtaining and , assume that each threshold lies, correspondingly, between each two consecutive marginal data points, find the MLEs, and choose the best one. However, this procedure is very time-consuming in the bivariate case, so we propose to combine it with marginal estimation in a two-part method as follows:
- I.
- Perform marginal estimation for both marginals; since the marginals are univariate composite distributions, the approach described above for the univariate case can be used. This would give starting values for the marginal parameters and the approximate location of the marginally estimated thresholds .
- II.
- Let and denote the (increasing) sorted marginal data and assume that the marginally estimated thresholds , where . Now consider the l intervals preceding and the l intervals following the interval that covers , as long as they exist; for each combination of such intervals, perform full MLE and keep the best solution. The resulting algorithm is:
- Step 1. For to , ,evaluate as solutions of the optimization problem:under the constraints and in the corresponding intervals, and continuity conditions, if imposed.
- Step 2. Among the solutions obtained from Step 1, choose the one that maximizes the log-likelihood function.
Note that in this way, for reasonable choices of and l, the computing time is significantly reduced.
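The interval scan of part II can be sketched as follows. Here `fit_mle` stands for an assumed routine performing the constrained full MLE with the thresholds confined to the chosen intervals and returning the attained log-likelihood; placing each candidate threshold at an interval midpoint is our simplification of the constrained search.

```python
import numpy as np

def method1_search(xs, ys, i_hat, j_hat, fit_mle, l=3):
    """Method 1, part II (sketch): scan the l intervals before and after the
    marginally estimated threshold positions (i_hat, j_hat) in the sorted
    marginal data, perform full MLE for each combination via the assumed
    `fit_mle(x0, y0) -> (loglik, params)`, and keep the best solution."""
    xs, ys = np.sort(xs), np.sort(ys)
    best = (-np.inf, None, None)
    for i in range(max(i_hat - l, 0), min(i_hat + l, len(xs) - 2) + 1):
        for j in range(max(j_hat - l, 0), min(j_hat + l, len(ys) - 2) + 1):
            x0 = 0.5 * (xs[i] + xs[i + 1])   # candidate threshold between order statistics
            y0 = 0.5 * (ys[j] + ys[j + 1])
            ll, params = fit_mle(x0, y0)
            if ll > best[0]:                 # Step 2: keep the max-likelihood solution
                best = (ll, (x0, y0), params)
    return best
```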
Method 2: The second method is a more analytical procedure for a specific sample; it takes into account that the parameter of the bivariate Gumbel–Pareto density (10) is restricted to the interval. This allows us to define a grid for it and to optimize the rest of the parameters for each value in this grid. The following procedure is designed, assuming the continuity conditions given in (i–iii) of Proposition 2 and the conditional likelihood defined by:
with the continuity conditions (constraints)
The conditional likelihood is defined similarly. The procedure for maximizing is described below:
- Step 1. Obtain initial values for the parameters , , and as follows:
- -
- The initially estimated thresholds are and , where , , are two given large proportions, and denotes the integer part. An initial value for each proportion can be deduced from the Hill plot or by doing MLE of the univariate Pareto for the tail.
- -
- The initially estimated value of the exponential parameter is obtained by MLE of the univariate truncated exponential distribution with density function:
- Step 2. Define a grid for , i.e., . For each , the estimated parameters , , and are obtained by maximizing the conditional log-likelihood function . The optim() function of R software with the “Nelder–Mead” method can be used; this works reasonably well for non-differentiable functions. The parameters and are estimated using the continuity conditions.
- Step 3. Let be the optimal values of the log-likelihood obtained at Step 2, and let be the corresponding parameters. The final estimated parameters are:with
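The grid procedure of Method 2 can be sketched as follows, assuming the gridded parameter is the bounded dependence parameter of the Gumbel component and writing `neg_cond_loglik` for an assumed implementation of the negative conditional log-likelihood; the Nelder-Mead choice mirrors the paper's use of R's optim().

```python
import numpy as np
from scipy.optimize import minimize

def method2_grid(neg_cond_loglik, init, grid):
    """Method 2 sketch: for each value t on the grid of the bounded parameter,
    maximize the conditional log-likelihood over the remaining free parameters
    (Nelder-Mead), then keep the grid value with the best optimum.
    `neg_cond_loglik(params, t)` is an assumed callable."""
    results = []
    for t in grid:
        res = minimize(neg_cond_loglik, init, args=(t,), method="Nelder-Mead")
        results.append((-res.fun, t, res.x))
    # (best log-likelihood, best grid value, corresponding parameters)
    return max(results, key=lambda r: r[0])
```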
5. Numerical Illustration
In this section, we present two numerical illustrations: the first one is on simulated data, and the second one is on a real data set.
5.1. Numerical Illustration Using Simulated Data
In this section, we used simulated data to check the performance of the first estimation procedure (Method 1) proposed in Section 4.3. The true values of the parameters were selected such that they satisfied all the continuity conditions given in Proposition 2: Gumbel: Pareto: , while . Note that due to the heavy-tailedness of the Pareto distribution (), there is no expected value for this particular distribution (its pdf is plotted in Figure 2).
With the aim of studying the properties of Method 1, using the two simulation methods described in Section 4.2, we generated 100 samples of size and 1000, respectively, for the two methods. For each such sample, in the first step, we performed marginal estimation by imposing the continuity condition for each marginal (which restricts the parameters r, as stated in Proposition 2). As a consequence, and a are estimated twice (for each marginal), and because of the differences in these estimations, we cannot rely only on marginal estimation. However, marginal estimation provides starting values for performing full MLE, and even better, gives an idea of where to look for the thresholds. More precisely, we restricted the search to about 40 intervals for each , i.e., we took . Thus, the computing time was significantly reduced compared to the threshold search through all data.
Finally, we estimated the Mean Square Error and the Mean Absolute Error , where and represent the true and estimated parameters, respectively.
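These two error criteria can be computed per parameter across the replicated estimations; a minimal sketch:

```python
import numpy as np

def mse_mae(estimates, true_params):
    """Per-parameter Monte Carlo error criteria over replicated estimations:
    MSE_k = mean_i (est[i,k] - true[k])^2,  MAE_k = mean_i |est[i,k] - true[k]|."""
    est = np.asarray(estimates, dtype=float)          # shape (replicas, n_params)
    err = est - np.asarray(true_params, dtype=float)
    return (err ** 2).mean(axis=0), np.abs(err).mean(axis=0)
```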
With the estimated parameters obtained from the 100 replicas generated with each simulation method, we obtained the MSE and the MAE that are shown in Table 1. The results indicate that both error criteria decrease when the sample size increases. Some differences between the two simulation methods can be observed (e.g., the MSE of is larger for simulation Method II than for simulation Method I, while the MSE of a is smaller for simulation Method II than for simulation Method I), but we believe that these differences are due to sampling randomness, with some samples falling more in the Pareto part or more in the exponential part; further simulation investigation is worthwhile, provided that the estimation method can be modified to reduce the computing time.
Table 1.
Simulation results with 100 replicas for MSE and MAE with sample sizes and 1000.
Concerning Method 2, as already noticed, it is a more analytical procedure for a specific sample, and therefore, it cannot be standardized and we cannot perform several iterations to calculate MSE and MAE.
All the computations were performed in R software using an optimization function with constraints to implement the continuity restrictions. The code is available upon request from the authors.
5.2. Numerical Illustration with Real Data
In this section, we fit our proposed bivariate Gumbel–Pareto distribution to a random sample of motor insurance claims that include bodily injury. For these claims, we separately know the cost of property damage including third-party liability (variable ) and the cost of exceptional medical expenses not covered by public social security (variable ). The data were provided by a major insurer in Spain in the year 2002 and correspond to claims that occurred in the year 2000. These data were studied in previous works (see [5,6,12]).
In Table 2, we display the descriptive statistics of the original data divided by 1000; this change of scale is convenient, and it facilitates the MLE of the parameters. These descriptive statistics show that both variables are strongly right-skewed. Furthermore, the left plot in Figure 4 shows the scatterplot of both cost variables in the original scale divided by 1000, where the existence of extreme values in both variables can be noticed. When we have right-skewed variables with extreme values, the MLE of a simple distribution such as the exponential, the Weibull, or the log-normal tends to underestimate the probability on the right tail. Figure 5 displays the univariate exponential pdf fitted by MLE to each marginal variable; alongside these densities, we also plotted the observed costs: on top, the costs of property damage, including third-party liability, and on the bottom, the costs of exceptional medical expenses not covered by public social security. For better visibility, the domains of the cost variables were divided into two parts, resulting in two plots for each marginal. Figure 5 shows how the density reaches zero in a part of the domain where there are still sample observations; clearly, this model assumes zero probability where it should not. Similar results are obtained using univariate Weibull and log-normal densities.
Table 2.
Descriptive statistics of property damage and third-party liability costs () and exceptional medical expenses ().
Figure 4.
Scatterplots of vs. in the original scale (left) and natural logarithm scale (right).
Figure 5.
Exponential pdf fitted by MLE and sample data shown as points on the horizontal axis for both marginals.
Therefore, the composite model with a Pareto right tail is a good way to improve the MLE fit for both univariate and bivariate data. Moreover, graphical analysis (e.g., the Hill plot) indicates that both variables have a Pareto tail with a shape parameter very close to 1, i.e., we have heavy-tailed marginal distributions. Thus, we can conclude that their distributions have only the first-order moment finite, or they do not have finite moments at all. In the left scatterplot of Figure 4, we can note that the sample information on extreme values is scarce; this is a difficulty in samples from heavy-tailed or Pareto distributions.
To assess the joint behavior of and , we calculated the Pearson linear correlation and the Kendall and Spearman rank correlation coefficients, displayed in Table 3. These results show a strong dependence between the two cost variables. However, as can be seen from Figure 4, which presents the data scatterplot in both original and natural logarithm scales, the dependence is not linear. As shown in [12], these data exhibit extreme value dependence, i.e., the higher the costs, the stronger the dependency. This behavior can also be observed in Figure 4. Furthermore, [10] shows that when the bivariate Pareto parameter a is , as is the case with our cost data, the theoretical variance and covariance do not exist or cannot be calculated. Therefore, the Pearson linear correlation cannot be interpreted.
Table 3.
Sample linear and rank correlation coefficients.
Further, from the right plot in Figure 4, it can be observed that for small values of both variables, the shape of the point cloud is spherical, i.e., the dependence is almost zero; however, for larger values, the shape indicates positive dependence between both variables. Clearly, this denotes a change of the joint distribution between the smaller and the larger costs.
In Table 4, we present the MLE parameters for Gumbel’s bivariate exponential distribution described in Section 2.2.1 and for the Gumbel–Pareto distribution from Section 4. The estimated parameters of the latter were obtained with Method 2 described in Section 4.3, imposing all continuity conditions (Method 1 yielded similar results). The initial values of the thresholds were taken from the Hill plots, and in this case, , resulting in , i.e., and ; also, . Comparing the AICs, BICs, and CAICs given in Table 4 indicates that the bivariate Gumbel–Pareto clearly outperforms Gumbel’s bivariate exponential distribution. Moreover, from MLE, the dependence parameter of Gumbel’s bivariate exponential distribution, , is zero, and it is close to zero for the Gumbel–Pareto distribution, which is coherent with the scatterplot in Figure 4.
Table 4.
MLE of bivariate distributions with standard errors in parentheses.
In Figure 6, we also plotted a partial histogram of the data alongside the corresponding Gumbel–Pareto pdf with the estimated parameters, while in Figure 7, we plotted the marginal histograms with the fitted pdfs.
Figure 6.
Histogram of real data (left) and Gumbel–Pareto pdf with the estimated parameters (right).
Figure 7.
Histogram of real data marginals with fitted pdfs: left, ; right, .
Finally, as a risk management application, we estimated the total risk of loss for the aggregate cost random variable using Monte Carlo simulation, and based on it, we calculated the Value-at-Risk (VaR) measure. VaR is equivalent to an extreme quantile of the distribution, i.e., , where is close to 1. In Table 5, we present the VaR results with for: the empirical distribution of the original data, the distribution of S simulated from Gumbel’s bivariate exponential distribution, and the distribution of S simulated from the Gumbel–Pareto distribution. Furthermore, we added the VaR obtained for the bivariate log-normal distribution fitted to the data; note that this distribution underestimates the risk in a way similar to that of Gumbel’s bivariate exponential.
Table 5.
Value-at-Risk for the empirical distribution and alternative distributions, obtained using Monte Carlo simulation.
When data follow a heavy-tailed distribution, the empirical VaR depends on the maximum data observed, and it is not an efficient estimator. The Gumbel–Pareto distribution provides an estimation that extrapolates beyond the observed maximum cost and takes into account the long and heavy bivariate tail with dependent marginal distributions.
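The Monte Carlo VaR computation reduces to simulating many pairs from the fitted bivariate model and taking an empirical quantile of their sum; a sketch with an assumed pair sampler:

```python
import numpy as np

def var_mc(sampler, alpha, n=100000, rng=None):
    """Monte Carlo Value-at-Risk of S = X1 + X2: simulate pairs from the fitted
    bivariate model and take the empirical alpha-quantile of their sum.
    `sampler(n, rng)` is an assumed routine returning arrays (x1, x2)."""
    rng = np.random.default_rng() if rng is None else rng
    x1, x2 = sampler(n, rng)
    return np.quantile(x1 + x2, alpha)
```

With independent Exp(1) components, for instance, S follows a Gamma(2, 1) law, whose 0.95-quantile is about 4.744, so the Monte Carlo estimate can be sanity-checked against it.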
6. Conclusions
To model bivariate dependent data that exhibit many small/medium values but also some very large values (i.e., extreme values), in this paper, we proposed a bivariate two-component spliced distribution. This distribution assumes a bivariate Pareto distribution on the domain consisting of values larger than some thresholds, and a bivariate Gumbel distribution on the complementary domain. We discussed some properties of the new distribution and focused on parameter estimation, proposing two alternative procedures. Because performing full MLE for this distribution may become time-prohibitive for larger data sets, as further work, we plan to investigate alternative methods that could reduce the computing time. Additionally, starting from the mixture formula (7), we plan to address the problem of parameter identifiability (see, e.g., [13] or [14]). Goodness-of-fit tests are envisaged for a future study.
Moreover, we also plan to study other such distributions by replacing the bivariate Gumbel with alternative distributions.
Author Contributions
Conceptualization, C.B. and R.V.; methodology, C.B. and R.V.; software, A.B., C.B. and R.V.; formal analysis, A.B., C.B. and R.V.; writing—original draft preparation, A.B. and R.V.; writing—review and editing, C.B. and R.V. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Data are available from the authors.
Acknowledgments
The authors are very grateful to the three referees for their valuable comments that helped to significantly improve the paper. Catalina Bolancé acknowledges the Spanish Ministry of Science, Innovation and Universities, grant PID2019-105986GB-C21.
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A. Proofs
Proof of Lemma 1.
Using integration by parts, it is easy to prove (i)–(iv); (v) results by changing variable , while (vi) is obtained by parts and by using (v). □
Proof of Lemma 3.
Without loss of generality, we prove the formula of ; the formula of results in a similar way.
Using formulas (ii) and (iii.1) from Lemma 1, we obtain, with some calculation,
□
Proof of Lemma 4.
We write
where
and using the corresponding formulas (iv.2) and (iii.2) from Lemma 1, we obtain
Inserting this result into the equation of yields
and by changing variable and letting , we obtain
For simplicity, we denote ; hence
Using (iii.2), (v), and (vi) from Lemma 1, we evaluate
We now insert the formulas of c and ; note that
and with some calculation, we obtain the stated formula of □
Proof of Lemma 5.
We note that
where we used the formula of from Lemma 3. This easily yields the stated result. □
Proof of Proposition 1.
We prove the formulas for with the formulas for resulting in a similar manner.
- (i)
- Since , we have two cases:Case : it is easy to see thatCase : in this case,We insert the formula of from Lemma 3 and obtain the stated formula of .
- (ii)
- Based on the formula of , we again have two cases:Case : clearly, here we obtain the cdf of the exponential distribution of .Case : in this case,The first two integrals add to the cdf of the exponential distribution of in , while the last integral yields the cdf of the Pareto distribution of . Therefore,where for the last equality, we used formula (3) of . From here, the formula of is immediate. □
Proof of Proposition 2.
(i) The continuity condition yields
which yields Formula (i). The proof of Formula (ii) is similar.
- (iii)
- We equate from (i) and (ii) and obtainMoreover, the continuity condition at means ; hence, using (11) and , we obtainfrom which results the stated formula of a. □
Proof of Proposition 3.
We calculate the expected value and the second-order moment for (those of result in a similar way). Using the expected value of the exponential and Pareto distributions, we have
Inserting (iii.2) from Lemma 1 yields
from which the expected value formula is immediate. The moment of second order is
Based on (iv.2) from Lemma 1, we obtain
The stated formula of easily results from here, which completes the proof. □
Proof of Proposition 4.
We write
We separately calculate the two integrals. We start with the second one, which from Formula (5) is given by
Concerning , we note that, given the definition of the domain D and the notation from Lemma 4, we have
Now using the formula in Lemma 4, we note that
and the stated formula of results immediately. □
Proof of Proposition 5.
We recall that
and according to Proposition 1, we note that we must consider three different cases:
and .
Case I: and . In this case,
and using Lemma 5, we obtain the first formula of .
Case II: . Now, we have
and, as in Case I, we easily get the second formula of .
Case III: . In this case,
The first integral equals the formula obtained in Case II by taking , while for the second integral, we evaluate
Therefore,
from which the last formula of is immediate. □
References
- Sarabia, J.M.; Gómez-Déniz, E. Construction of multivariate distributions: A review of some recent results. SORT 2008, 32, 3–36.
- Klugman, S.A.; Panjer, H.H.; Willmot, G.E. Loss Models: From Data to Decisions; John Wiley & Sons: New York, NY, USA, 1998.
- Scarrott, C. Univariate extreme value mixture modeling. In Extreme Value Modeling and Risk Analysis: Methods and Applications; Dipak, K.D., Jun, Y., Eds.; CRC Press: Boca Raton, FL, USA, 2016; pp. 41–67.
- Cooray, K.; Ananda, M.M. Modeling actuarial data with a composite lognormal-Pareto model. Scand. Actuar. J. 2005, 5, 321–334.
- Bolancé, C.; Guillen, M.; Pelican, E.; Vernic, R. Skewed bivariate models and nonparametric estimation for the CTE risk measure. Insur. Math. Econ. 2008, 43, 386–393.
- Bahraoui, Z.; Bolancé, C.; Pelican, E.; Vernic, R. On the bivariate Sarmanov distribution and copula. An application on insurance data using truncated marginal distributions. SORT 2015, 39, 209–230.
- Gumbel, E.J. Bivariate exponential distributions. J. Am. Stat. Assoc. 1960, 55, 698–707.
- Kotz, S.; Balakrishnan, N.; Johnson, N.L. Continuous Multivariate Distributions, Volume 1: Models and Applications; John Wiley & Sons: New York, NY, USA, 2004.
- Castillo, E.; Sarabia, J.M.; Hadi, A.S. Fitting continuous bivariate distributions to data. Statistician 1997, 46, 355–369.
- Mardia, K.V. Multivariate Pareto distributions. Ann. Math. Stat. 1962, 33, 1008–1015.
- Scarrott, C.J.; MacDonald, A. A review of extreme value threshold estimation and uncertainty quantification. REVSTAT Stat. J. 2012, 10, 33–60.
- Bahraoui, Z.; Bolancé, C.; Pérez-Marín, A.M. Testing extreme value copulas to estimate the quantile. SORT 2014, 38, 89–102.
- Chen, J. Optimal rate of convergence for finite mixture models. Ann. Stat. 1995, 23, 221–233.
- Kim, D.; Lindsay, B.G. Empirical identifiability in finite mixture models. Ann. Inst. Stat. Math. 2015, 67, 745–772.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).