EM Estimation for the Bivariate Mixed Exponential Regression Model

: In this paper, we present a new family of bivariate mixed exponential regression models for taking into account the positive correlation between the cost of claims from motor third party liability bodily injury and property damage in a versatile manner. Furthermore, we demonstrate how maximum likelihood estimation of the model parameters can be achieved via a novel Expectation-Maximization algorithm. The implementation of two members of this family, namely the bivariate Pareto or, Exponential-Inverse Gamma, and bivariate Exponential-Inverse Gaussian regression models is illustrated by a real data application which involves ﬁtting motor insurance data from a European motor insurance company.


Introduction
Over the last few decades, there has been a vast increase in actuarial research works focusing on modeling costs of a particular claim type based on various claim severity modeling approaches such as • finite mixture models: see, for example, (Fung et al. 2021;Lee and Lin 2010;Miljkovic and Grün 2016;Tzougas et al. 2014Tzougas et al. , 2018Tzougas et al. , 2019; • composite or splicing models: see for instance, (Bakar et al. 2015;Calderín-Ojeda and Kwok 2016;Cooray and Ananda 2005;Grün and Miljkovic 2019;Nadarajah and Bakar 2014;Parodi 2020;Pigeon and Denuit 2011;Scollnik 2007;Scollnik and Sun 2012). • combinations of finite mixtures and composite models: see Reynkens et al. (2017).
Furthermore, several works have focused on understanding how the claim severity distribution is influenced by certain risk factors. See, for example, (Frees 2009;Laudagé et al. 2019;Tzougas and Jeong 2021; Karlis 2020) among many more. However, even if the literature in the univariate setting is abundant, the bivariate, and/or multivariate, extensions of such models have not been explored in depth even if in non-life insurance, the actuary may often be concerned with modeling jointly different types of claims and their associated costs.
In this paper, motivated by a European Motor Third Party Liability (MTPL) insurance data set which is described in Section 4 we introduce a family of bivariate mixed Exponential regression models for joint modeling the costs from positively correlated bodily injury and property damage claims in terms of covariates. The proposed class of bivariate claim severity regression models is based on a mixing between two marginal Exponential distributions and a unit mean continuous and at least twice differentiable mixing density. The modeling framework we consider can account for the positive dependency between the two claim types in a flexible manner since it allows for a variety of alternative distributional assumptions for the mixing density. Furthermore, depending on the choice of the mixing density the bivariate mixed Exponential model can be used to model both moderate and large bodily injury and property damage claim sizes which can be the result of the same accident. At this point, it is worth noting that modeling positively correlated claims and their associated counts for the same and/or different types of coverage, such as home and auto insurance, bundled together under a single policy, has been explored by many articles. See for example, (Bermúdez and Karlis 2011Valdez 2014a, 2014b;Abdallah et al. 2016;Bermúdez et al. 2018;Pechon et al. 2018Pechon et al. , 2019Pechon et al. , 2021Bolancé and Vernic 2019;Denuit et al. 2019;Fung et al. 2019;Bolancé et al. 2020;Jeong and Dey 2021;Gómez-Déniz and Calderín-Ojeda 2021;di Cerchiara 2021a, 2021b). Furthermore, Baumgartner et al. (2015) and Oh et al. (2021) consider shared random effects for capturing possible associations between the frequency and severity and/or among the longitudinal claims. However, with the exception of the bivariate, and/or multivariate Pareto model which have been actively studied in the actuarial literature for the case with and without covariates, see, for instance, Yang et al. (2011), Cockriel andMcDonald (2018) and Jeong and Valdez (2020), modeling positively correlated claim sizes based on alternative e bivariate, and/or multivariate, mixed Exponential regression models remains a largely uncharted research territory. Therefore, this is the main contribution of this study from a practical insurance business perspective. Additionally, our contribution from a computational maximum likelihood (ML) estimation standpoint is that we develop an Expectation-Maximization (EM) type algorithm 1 which takes advantage of the stochastic mixture representation of the bivariate mixed Exponential regression model for maximizing its log-likelihood in a computationally efficient and parsimonious manner. For expository purposes, the bivariate Pareto (BPA), or Exponential-Inverse Gamma, and bivariate Exponential-Inverse Gaussian (BEIG) regression models are fitted on the MTPL bodily injury and property damage data set.
Finally, it should be noted that when dealing with different types of claims from different types of coverage, such as motor and home insurance, the regressors on the mean parameters may differ according to different individual and coverage-type risk factors. However, in the case of our MTPL data, both mean parameters of the bivariate mixed Exponential regression model are only modeled using common explanatory variables for both claim types. Thus, we extend the proposed setup by pairing a bivariate Normal copula with the PA and EIG regression models. These copula-based models, which can cope with both positive and negative dependence structures, are compared with the BPA and BEIG regression models using a simulated data set in which we assume that we have different types of claims from different types of cover.
The rest of the paper proceeds as follows. Section 2 discusses how the bivariate mixed exponential regression model can be constructed and the joint probability density functions (jpdfs) of the BPA and BEIG regression models, which are used for demonstration purposes, are derived. Section 3 deals with parameter estimation for the proposed model based on the EM algorithm. In Section 4, the models presented in Section 2 are fitted to the MTPL bodily injury and property damage claims data set and the comparison based on the simulated data set mentioned in the previous paragraph is presented. Finally, concluding remarks are provided in Section 5.

The Bivariate Mixed Exponential Regression Model
Consider a non-life MTPL insurance which contains bodily injury and property damage claims and their associated costs. Please note that it is possible that there exists a positive correlation between the two types of claims we propose the following family of models.
The claim amounts of both types are denoted as Y i , i = 1, 2, which are well-defined when there is at least one claim for each type of claim. Furthermore, we consider that conditional on a random effect Z > 0, the random variables Y i , i = 1, 2 are independent exponential random variables with rates µ i Z. The random effect Z is a continuous random variable with density g φ (z) which takes positive values only and it mainly controls the variation and correlation of the whole bivariate sequence. To avoid the identifiability problem, we have to restrict the expectation E[Z] to be a fixed constant and one usually lets E[Z] = 1. On the other hand, to account for the impact of heterogeneity between different policyholders, the rates µ i , i = 1, 2 are modeled as functions of explanatory variables are the corresponding coefficients. Then, the unconditional joint density function, (1) In the following, for demonstration purposes, we specialize with two different mixing densities, the Inverse Gamma (IGA) and Inverse Gaussian (IG) distributions, which lead to the bivariate Pareto (BPA) and bivariate Exponential-Inverse Gaussian (BEIG) regression models, respectively.

Bivariate Pareto Regression Model
The general inverse Gamma density function IGA(α, β)is defined as where the mean and variance are β α−1 and . To avoid the aforementioned identification problem, the mean of this density function has to be one. Then we have the following parametrization by letting α = φ + 1 and β = φ, Under this parametrization, the random variable Z has a unit mean and variance equal to 1 φ−1 for φ > 1. This density is denoted as IGA(φ + 1, φ). Then, the joint density of the bivariate Pareto (BPA) regression model is given by Here, up to a scaling factor, the integrand is the density function of an IGA(φ + 3, φ + y 1 µ 1 + y 2 µ 2 ) distribution. Therefore, the value of the integral is the reciprocal of the normalizing constant. The mean, variance, covariance and correlation in the case of the BPA model are given by (4)

Bivariate Exponential-Inverse Gaussian Regression Model
In general, we say that the random variable X follows a generalized Inverse Gaussian where K p is the modified Bessel function of the second kind. The random variable X has mean and variance Then the general inverse Gaussian density IG(γ, δ) is a special case of Generalized Inverse Gaussian GIG(γ 2 , δ 2 , − 1 2 ) where the parameter p = − 1 2 is fixed. It density function has following form, Similar to the inverse gamma case, to avoid the identification problem, we have to restrict the mean of IG(γ, δ) to be one. Then one possible way is to set γ = δ = φ. Then the density becomes, The random effect Z now has a unit mean and variance 1 φ 2 . The unconditional joint density function of the bivariate Exponential-Inverse Gaussian (BEIG) can be derived as follows The integrand above is, up to a scaling constant, the density of a GIG φ 2 , 2y 1 µ 1 + 2y 2 µ 2 + φ 2 , − 5 2 . Therefore, the integral value is the reciprocal of the normalizing constant. The mean, variance, covariance and correlation in the case of the BEIG model are given by

The EM Algorithm for the Bivariate Mixed Exponential Regression Model
In this Section, an Expectation-Maximization (EM) algorithm is applied to facilitate the maximization likelihood estimation of the bivariate mixed Exponential regression model.
Consider the observed bivariate response sequence {Y j } j=1,...,n and the corresponding covariates {x 1,j } j=1,...,n and {x 2,j } j=1,...,n . Furthermore, let Θ = {β 1 , β 2 , φ} be the parameter space for this model. Then, the log-likelihood function can be written as The direct maximization of Equation (10) with respect to parameter space Θ is complicated. Fortunately, in such cases, the EM algorithm can be used to simplified the maximization problem of Equation (10). In particular, if we augment the unobserved variable {Z j } j=1,...,n , then the complete log-likelihood function is given by The two-steps of EM algorithm are described in what follows.
• E-step: The Q-function, Q(Θ; Θ (r) ), which is the conditional posterior expectation of Equation (11), is given by where µ where the posterior density function is defined as π(z|Θ (r) , y j , x 1,j , x 2,j ) = 1 µ 1,j µ 2,j z 2 exp − y 1,j • M-step: After calculating the Q-function, we find its maximum global point, Θ (j+1) , i.e., we update the parameters by computing the gradient function, g(.), and the Hessian matrix, H(.), of the Q-function. In particular, the Newton-Raphson algorithm is used for maximizing the Q-function and the parameters β 1 , β 2 for the Exponential part and the parameter φ for the randnom effect part are updated separately as shown below.
-For the Exponential part, For the random effect part, we derive the first and second order derivatives of log g φ (θ) and then we take the posterior expectations to construct its gradient functions and the Hessian matrix. In what follows, we derive the derivatives for the IGA and IG densities which were defined in the previous section. Finally, we update φ using the one-step ahead Newton iteration In what follows, we will show how Equation (11) can be modified in the case of the IGA and IG mixing densities.

Inverse Gamma mixing density
The first and second derivatives of the term ∑ n j=1 log E (r) z,j [g φ (z)] are given by g(φ (r) ) = n 1 + 1 Γ(x) is the digamma function and Ψ (x) is the corresponding first derivative. Furthermore, it is easy to observe that at iteration r, the posterior density would be an IGA φ (r) + 3, φ (r) + y 1,j µ (r) 1,j + y 2,j µ (r) 2,j . Thus, the expectations involved in the E-step of the algorithm are given by 2.

Inverse Gaussian density
In the case of the IG mixing density, the first and second derivatives of the term Furthermore, one can easily see that at iteration r, the posterior density in this case is a GIG (φ (r) ) 2 , (φ (r) ) 2 + Therefore, the expectations involved in the E-step of the algorithm are given by where η = (φ (r) ) 2 (φ (r) ) 2 +

Empirical Analysis
The study is based on data from automobile policies from a major insurance European company for the underwriting years 2012-2019. This data set contains bodily injury (BI) and property damage (PD) claims and their associated claim costs, denoted by Y 1 and Y 2 , respectively, and risk factors that affect both Y 1 and Y 2 . An exploratory analysis was conducted so as to select the subset of covariates with the highest predictive power for Y 1 and Y 2 . There were 7263 observations in total which met our criteria.
The summary statistics for Y 1 and Y 2 are shown in Table 1 and Figure 1. As was expected, both Y 1 and Y 2 are positively skewed. Furthermore, the Pearson correlation test indicates that it is appropriate to model both types of claim costs based on a single bivariate model rather than two independent univariate models.  Furthermore, a description of the explanatory variables which we included in the regression analysis for Y 1 and Y 2 is provided below.

•
The variable Driver's age. Policyholders aged 18 to 90 years old. Additionally, the empirical distributions of the categorical and continuous explanatory variables are shown in Table 2 and Figure 2, respectively.
The BPA and BEIG regression models were fitted to the claim costs Y = (Y 1 , Y 2 ). All computing was made using the R software. The vector of parameters Θ = {β 1 , β 2 , φ} was estimated using the EM algorithm which was presented in Section 3 and their standard deviations were obtained through expressions that were directly produced by the EM algorithm for the observed information matrix of each model. The fit of the competing models was compared by employing the classic hypothesis/specification tests Akaike information criterion (AIC) and Bayesian information criterion (BIC). The results are presented in Table 3. We see that the values of the estimated regression coefficients of the variables Driver's Age, Vehicle's Age and Region have a a similar effect (positive and/or negative) and are almost identical for both response variables in the case of the bivariate claim size models. Furthermore, we observe that the best fitting performances are provided by the BEIG regression models since according to a very well known rule of thumb, two models can be considered to be significantly different if the difference in their respective AIC and SBC values is greater than ten and five, respectively, see Anderson and Burnham (2004) and Raftery (1995), respectively.   Finally, we consider an extension of the proposed framework using copulas. In particular, the Gaussian copula is paired with the PA and EIG regression models. The copula-based models are compared to the BPA and BEIG regression models using two simulated datasets. The probability density functions of the univariate PA and EIG have similar definitions as those their bivariate counterparts. In particular, we have that where i = 1, 2 for two marginals, j = 1, . . . , n for different policyholders and µ i,j are the regression parameters which have the same definition as in Section 2. Two random samples of size n = 5000 are generated from the bivariate Gaussian copula which is paired with the PA and EIG marginals, respectively. Then we consider two sets of explanatory variables ν 1,i − ν 4,i that determine the size of µ i,j for two marginals. In particular, we assume that ν 1,i take integer values within the ranges (18-75) and (0-20), respectively. The rest of the variables are considered to be categorical. In particular, we let ν 2,1 have two categories while ν 2,2 has three. Then, we consider that ν 3,i and ν 4,i have three and four categories, respectively. All the explanatory variables are generated from the uniform distribution with length n. The fitting results are shown in Tables 4 and 5, respectively.

Concluding Remarks
In this paper, we developed a class of bivariate mixed Exponential regression models which can approximate moderate and large claim costs in an efficient manner based on the choice of mixing density. We illustrated our approach by fitting the BPA and BEIG regression models on MTPL data which were provided by a European insurance company. The proposed family of models can accommodate the positive correlation between MTPL bodily injury and property damage claims and their associated costs, when explanatory variables for each type of claims are taken into account through regression structure for their mean parameters.
The main achievement is that we developed an EM-type algorithm which is computationally efficient. This was demonstrated by obtaining reliable estimates when applying the models to the read data. Finally, the standard errors of estimated parameters were easily produced as byproducts of the algorithm.