Article

A Survey of the Individual Claim Size and Other Risk Factors Using Credibility Bonus-Malus Premiums

by Emilio Gómez-Déniz 1,* and Enrique Calderín-Ojeda 2
1 Department of Quantitative Methods and TIDES Institute, University of Las Palmas de Gran Canaria, 35017 Las Palmas de Gran Canaria, Spain
2 Centre for Actuarial Studies, Department of Economics, The University of Melbourne, Melbourne, VIC 3010, Australia
* Author to whom correspondence should be addressed.
Risks 2020, 8(1), 20; https://doi.org/10.3390/risks8010020
Submission received: 24 December 2019 / Revised: 11 February 2020 / Accepted: 18 February 2020 / Published: 21 February 2020

Abstract

In this paper, a flexible count regression model based on a bivariate compound Poisson distribution is introduced in order to distinguish between different types of claims according to the claim size. Furthermore, it allows us to analyse the factors that affect the number of claims above and below a given claim size threshold in an automobile insurance portfolio. Relevant properties of this model are given. Next, a mixed regression model is derived to compute credibility bonus-malus premiums based on the individual claim size and other risk factors such as gender, type of vehicle, driving area, or age of the vehicle. Results are illustrated by using a well-known automobile insurance portfolio dataset.

1. Introduction

In a recent work, Gómez-Déniz (2016) proposed a modification of the bonus-malus systems commonly applied in automobile insurance that differentiates between two types of claims by means of a bivariate model based on the assumption of dependence. That work studied the impact on the bonus-malus premium in a general setting, without involving the individual’s risk factors, such as gender, type of vehicle, area of circulation, etc.
It is well known that under the traditional bonus-malus system, the premium charged to each insured is based solely on the number of claims made. Therefore, an insured who has had an accident that causes a relatively small loss is penalised to the same extent as one who has experienced a more expensive accident, which would seem unfair to the insureds. In the work mentioned above, a bivariate prior model, conjugate with respect to the likelihood, was also proposed, and as a result, simple credibility bonus-malus premiums that satisfy appropriate transition rules were obtained. These expressions were used to compute credibility bonus-malus premiums by considering two different types of claims: those above and those below a threshold claim size, say $\psi > 0$.
Similar related works have been proposed in the actuarial literature. In this sense, the work in Pinquet (1998) computed bonus-malus rates in a multi-equation Poisson model with random effects. The work in Ragulina (2011) introduced a bonus-malus system with different claim types and varying deductibles. The work in Walhin and Paris (2001) showed how to set up a practical bonus-malus system with a finite number of classes using both the actual claim amount and claim frequency distribution. The work in Bonsdorff (2005) also incorporated the claim number and the severity in the bonus-malus system literature by using Markov chains. The work in Bermúdez (2009) examined, in automobile insurance claims, an a priori ratemaking procedure that included two different types of claim, i.e., with and without bodily injuries. See also Bermúdez and Karlis (2017).
The main objective of this work is to develop a reparametrization of the bivariate distribution proposed in the previous work with the purpose of incorporating individual information in the model to adjust the premiums charged to each policyholder. Additionally, some statistical properties of the proposed parametrization that were not addressed in the previous work will be shown. Furthermore, an extensive set of a priori classification variables such as age, gender, type and age of car, etc., will be used and, to account for the heterogeneity of the insureds’ behaviour, prior distributions will be assigned to the parameters of the model in order to build a posteriori credibility bonus-malus premiums.
The rest of this paper is organised as follows. The main model and some of its properties are presented in Section 2. In Section 3, the regression model is introduced, and maximum likelihood estimation methods are illustrated. We will show that the estimation procedure is simply derived, and Fisher’s information matrix associated with this regression model is obtained in closed-form. Credibility premiums related to the regression models are provided in Section 4. Numerical illustrations and results connected with the compound model are shown in Section 5, and finally, Section 6 concludes the work.

2. The Model

As pointed out by Dionne and Vanasse (1989), the classical Poisson distribution is generally employed for the characterization of random and independent events such as automobile accidents. Thus, we assume that the number of claims in an automobile insurance portfolio follows a Poisson distribution with parameter $\mu_1 > 0$. When an insured declares a claim, it might be for an amount exceeding $\psi$ monetary units. In order to accommodate this characteristic in the model, we incorporate a second random variable, thus giving rise to two separate sub-events (claims worth more or less than $\psi$), in the following way. Let $Z_i$ be the variable that takes the value one if the $i$th claim has a size larger than $\psi$ and the value zero otherwise. Thus, the $Z_i$ are modelled as independent and identically distributed with the following Bernoulli probability mass function:
$$f(z_i \mid p) = \begin{cases} \mu_2/\mu_1, & \text{if } z_i = 1, \\ 1 - \mu_2/\mu_1, & \text{if } z_i = 0, \end{cases}$$
where $p = \mu_2/\mu_1$ is the probability of declaring a claim larger than $\psi$, with $0 < \mu_2 < \mu_1$.
Let us now assume that $X_2 = \sum_{i=1}^{X_1} Z_i$ is the number of claims with a claim amount larger than $\psi$. Thus, if the $Z_i$ ($i = 1, \dots, x_1$) are assumed to be mutually independent, then the conditional probability function of $X_2$, given that $X_1 = x_1$, is binomial with parameters $x_1$ and $\mu_2/\mu_1$. Therefore, the joint distribution of the total claim number, $X_1$, and the corresponding claim number with claim amount exceeding $\psi$, $X_2$, has the probability function:
$$\Pr(X_1 = x_1, X_2 = x_2) = \frac{\mu_2^{x_2} (\mu_1 - \mu_2)^{x_1 - x_2} \exp(-\mu_1)}{(x_1 - x_2)!\, x_2!}, \tag{1}$$
for $x_1 = 0, 1, \dots$; $x_2 = 0, 1, \dots, x_1$; $\mu_1 > 0$; and $0 < \mu_2 < \mu_1$.
Observe that the probability function (1) can be written as:
$$\Pr(X_1 = x_1, X_2 = x_2) = h(x) \exp\left\{ \sum_{i=1}^{2} x_i R_i(\Theta) - Q(\Theta) \right\},$$
where $x = (x_1, x_2)$, $\Theta = (\mu_1, \mu_2)$, $R_1(\Theta) = \log(\mu_1 - \mu_2)$, $R_2(\Theta) = \log(\mu_2/(\mu_1 - \mu_2))$, $Q(\Theta) = \mu_1$, and $h(x) = [(x_1 - x_2)!\, x_2!]^{-1}$. Thus, (1) belongs to the multivariate exponential family of distributions provided in Khatri (1983a). See also Khatri (1983b) and Johnson et al. (1997, chp. 34). This family also includes the multivariate Lagrangian distributions and the multivariate power series distributions (see Khatri 1983b).
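As a rough numerical illustration of (1), the following sketch evaluates the joint probability function on the log scale and checks that it sums to one and agrees with the exponential-family form. The function names and the values of $\mu_1$ and $\mu_2$ are illustrative assumptions, not quantities taken from the paper.

```python
import math

def joint_pmf(x1, x2, mu1, mu2):
    """Pr(X1 = x1, X2 = x2) as in (1); zero outside 0 <= x2 <= x1."""
    if not 0 < mu2 < mu1:
        raise ValueError("requires 0 < mu2 < mu1")
    if x2 < 0 or x2 > x1:
        return 0.0
    log_p = (x2 * math.log(mu2) + (x1 - x2) * math.log(mu1 - mu2) - mu1
             - math.lgamma(x1 - x2 + 1) - math.lgamma(x2 + 1))
    return math.exp(log_p)

def exp_family_form(x1, x2, mu1, mu2):
    """Same probability written as h(x) * exp(x1*R1 + x2*R2 - Q)."""
    r1 = math.log(mu1 - mu2)
    r2 = math.log(mu2 / (mu1 - mu2))
    h = 1.0 / (math.factorial(x1 - x2) * math.factorial(x2))
    return h * math.exp(x1 * r1 + x2 * r2 - mu1)

mu1, mu2 = 0.8, 0.3  # hypothetical parameter values, for illustration only
total = sum(joint_pmf(a, b, mu1, mu2) for a in range(60) for b in range(a + 1))
gap = max(abs(joint_pmf(a, b, mu1, mu2) - exp_family_form(a, b, mu1, mu2))
          for a in range(10) for b in range(a + 1))
```

Working with logarithms and `math.lgamma` avoids overflow of the factorials for large counts; the truncation at 60 claims is arbitrary and only needs to be large relative to $\mu_1$.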

Properties of the Distribution

The marginal means are given by $E(X_i) = \mu_i$, $i = 1, 2$. The cross moment, the covariance, and the correlation are given by:
$$E(X_1 X_2 \mid \mu_1, \mu_2) = \mu_2 (1 + \mu_1), \quad \mathrm{cov}(X_1, X_2 \mid \mu_1, \mu_2) = \mu_2, \quad \rho(X_1, X_2 \mid \mu_1, \mu_2) = \sqrt{\mu_2/\mu_1},$$
respectively. Thus, the model admits only positive correlation.
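These moments can be recovered by conditioning on $X_1$. The short derivation below is a sketch that uses only the binomial conditional distribution of $X_2$ given $X_1$ and the Poisson marginals with means $\mu_1$ and $\mu_2$:

```latex
\begin{aligned}
E(X_1 X_2) &= E\bigl[X_1\, E(X_2 \mid X_1)\bigr]
            = \frac{\mu_2}{\mu_1}\, E(X_1^2)
            = \frac{\mu_2}{\mu_1}\,(\mu_1 + \mu_1^2)
            = \mu_2 (1 + \mu_1), \\
\mathrm{cov}(X_1, X_2) &= \mu_2 (1 + \mu_1) - \mu_1 \mu_2 = \mu_2, \\
\rho(X_1, X_2) &= \frac{\mu_2}{\sqrt{\mu_1 \mu_2}} = \sqrt{\mu_2 / \mu_1}.
\end{aligned}
```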
The probabilities for different values of $(x_1, x_2)$ were calculated, and graphs were plotted for different values of the two parameters. They are shown in Figure 1. For larger values of $\mu_1$ and $\mu_2$, the modal value increases in $x_1$ and $x_2$, illustrating that the new model is very versatile.
The expression provided in (1) can also be obtained differently, as follows. Let us consider an automobile insurance portfolio in which $X_1$ is a random variable that represents the number of claims in a given period and $X_2$ is the number of claims with a size above a threshold $\psi > 0$ over the same period of time. If each claim declared by a policyholder exceeds $\psi$ with probability $\mu_2/\mu_1$, then $\Pr(X_2 = x_2)$ and $\Pr(X_1 = x_1)$ are related as follows:
$$\Pr(X_2 = x_2) = \sum_{x_1 = x_2}^{\infty} \binom{x_1}{x_2} \left(\frac{\mu_2}{\mu_1}\right)^{x_2} \left(1 - \frac{\mu_2}{\mu_1}\right)^{x_1 - x_2} \Pr(X_1 = x_1). \tag{3}$$
Obviously, (3) maps a probability function to a probability function. That is, $\sum_{x_2 = 0}^{\infty} \Pr(X_2 = x_2) = 1$ with $\Pr(X_2 = x_2) \ge 0$, $x_2 = 0, 1, \dots$
Although other distributions, e.g., the negative binomial, could be chosen to model count data, for the sake of simplicity, let us suppose that $X_1$ follows a Poisson distribution with parameter $\mu_1 > 0$. Then, we have:
$$\begin{aligned}
\Pr(X_2 = x_2) &= \sum_{x_1 = x_2}^{\infty} \binom{x_1}{x_2} \left(\frac{\mu_2}{\mu_1}\right)^{x_2} \left(1 - \frac{\mu_2}{\mu_1}\right)^{x_1 - x_2} \frac{\mu_1^{x_1} \exp(-\mu_1)}{x_1!} \\
&= \frac{\exp(-\mu_1)}{x_2!} \left(\frac{\mu_2}{\mu_1 - \mu_2}\right)^{x_2} \sum_{x_1 = x_2}^{\infty} \frac{(\mu_1 - \mu_2)^{x_1}}{(x_1 - x_2)!} \\
&= \frac{\mu_2^{x_2} \exp(-\mu_1)}{x_2!} \sum_{j = 0}^{\infty} \frac{(\mu_1 - \mu_2)^{j}}{j!} \\
&= \frac{\mu_2^{x_2}}{x_2!} \exp(-\mu_2), \quad x_2 = 0, 1, \dots
\end{aligned}$$
Expression (3) can be viewed as a weighted sum of binomial probabilities, where the weights are given by the probability that the policyholder declares a certain number of claims. More specifically, it averages the binomial probability of observing $x_2$ claims above the threshold, conditional on $X_1 = x_1$ claims, over the distribution of $X_1$, under the assumption of a heterogeneity factor that causes claims of different amounts. Hence, Expression (3) can be viewed as a mixture distribution. From this standpoint, the model provides a framework in which random effects are incorporated into the Poisson assumption. In this case, the bivariate distribution provided in (1) can be obtained by multiplying the conditional and the marginal distributions in the usual way.
Numerical simulation of the bivariate distribution can be obtained simply by following the approach explained in Kocherlakota and Kocherlakota (1992, chp. 1). In this regard, both the marginal distribution $f(x_1)$ and the conditional distribution $f(x_2 \mid x_1)$ are used. The former is a Poisson distribution with parameter $\mu_1$ and the latter a binomial distribution with parameters $x_1$ and $\mu_2/\mu_1$. Thus, for a generated value of $x_1$, a realization of $x_2$ from $f(x_2 \mid x_1)$ can be generated, and therefore, the pairs $(x_1, x_2)$ are observations from the joint distribution given in (1). This procedure can be repeated $n$ times in order to obtain a random sample of size $n$.
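The two-step procedure just described can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the Poisson sampler (Knuth's multiplication method), the seed, the sample size, and the parameter values are all assumptions made for the example.

```python
import math
import random

def sample_poisson(mu, rng):
    """Knuth's multiplication method; adequate for small mu."""
    limit = math.exp(-mu)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def sample_pair(mu1, mu2, rng):
    """Draw x1 from Poisson(mu1), then x2 from Binomial(x1, mu2/mu1)."""
    x1 = sample_poisson(mu1, rng)
    x2 = sum(rng.random() < mu2 / mu1 for _ in range(x1))
    return x1, x2

rng = random.Random(2020)          # fixed seed so the sketch is repeatable
mu1, mu2 = 0.8, 0.3                # hypothetical parameter values
sample = [sample_pair(mu1, mu2, rng) for _ in range(20000)]
mean_x1 = sum(a for a, _ in sample) / len(sample)
mean_x2 = sum(b for _, b in sample) / len(sample)
```

With a sample of this size, the two sample means should lie close to $\mu_1$ and $\mu_2$, in line with the marginal means of the model.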
The joint probability generating function is given by:
$$G_{X_1, X_2}(s_1, s_2) = \exp\left[\mu_1 (s_1 - 1) + \mu_2 (s_2 - 1) s_1\right], \quad |s_1| \le 1, \; |s_2| \le 1. \tag{4}$$
Note that (4) is the limiting case of the bivariate Poisson distribution with parameters $\theta_1 = \mu_1 - \mu_2$, $\theta_2 \to 0$, and $\theta_{12} = \mu_2$ (see for instance this expression in Johnson et al. (1997, chp. 37) and also Hesselager (1996) for more details of recursions for bivariate discrete distributions). Thus, the following recursions are valid:
$$p_{x_1, x_2} = \frac{\mu_1 - \mu_2}{x_1}\, p_{x_1 - 1, x_2} + \frac{\mu_2}{x_1}\, p_{x_1 - 1, x_2 - 1}, \qquad p_{x_1, x_2} = \frac{\mu_2}{x_2}\, p_{x_1 - 1, x_2 - 1},$$
with:
$$p_{0, 0} = \exp(-\mu_1), \qquad p_{x_1, 0} = \frac{(\mu_1 - \mu_2)^{x_1} \exp(-\mu_1)}{x_1!},$$
and zero otherwise.
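A small sketch of how these recursions might be used to tabulate the probabilities is given below. The table layout, function names, and parameter values are illustrative assumptions; the closed form (1) is included only to cross-check the recursion.

```python
import math

def pmf_table(mu1, mu2, n_max):
    """Fill p[x1][x2], 0 <= x2 <= x1 <= n_max, using the recursions."""
    p = [[0.0] * (n_max + 1) for _ in range(n_max + 1)]
    p[0][0] = math.exp(-mu1)
    for x1 in range(1, n_max + 1):
        # x2 = 0: only the first term of the first recursion survives
        p[x1][0] = (mu1 - mu2) / x1 * p[x1 - 1][0]
        for x2 in range(1, x1 + 1):
            p[x1][x2] = mu2 / x2 * p[x1 - 1][x2 - 1]
    return p

def closed_form(x1, x2, mu1, mu2):
    """Probability function (1), used here only as a reference value."""
    return (mu2 ** x2 * (mu1 - mu2) ** (x1 - x2) * math.exp(-mu1)
            / (math.factorial(x1 - x2) * math.factorial(x2)))

mu1, mu2, n_max = 0.8, 0.3, 12     # hypothetical values
table = pmf_table(mu1, mu2, n_max)
max_gap = max(abs(table[a][b] - closed_form(a, b, mu1, mu2))
              for a in range(n_max + 1) for b in range(a + 1))
# The first recursion should hold on the same table as well
first_rec_gap = max(
    abs(table[a][b]
        - ((mu1 - mu2) * table[a - 1][b] + mu2 * table[a - 1][b - 1]) / a)
    for a in range(1, n_max + 1) for b in range(1, a + 1))
```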

3. The Role of the Covariates

Clearly, the number of claims below and above $\psi$ may be influenced by different characteristics and factors; likewise, explanatory variables may be useful to explain the individual premium to be charged. As the marginal means of (1) are given by $E(X_1) = \mu_1$ and $E(X_2) = \mu_2$, covariates can be implemented in the model in a simple way.
We now investigate the effect of including covariates to account for the total number of claims and the claims above the threshold $\psi$. Obviously, some factors are crucial when explaining the endogenous variables $(X_{1i}, X_{2i})$. Two appropriate links are needed to connect the explanatory variables with the marginal means. A natural way to proceed is to assume that $(X_{1i}, X_{2i})$, for $i = 1, \dots, n$, follows the probability function (1) and:
$$\log \mu_{1i} = \omega_{1i}^{\top} \beta_1, \qquad \mu_{2i} = \mu_{1i}\, \frac{\exp(\eta_{2i}^{\top} \beta_2)}{1 + \exp(\eta_{2i}^{\top} \beta_2)},$$
where $\omega_{1i}$ and $\eta_{2i}$ denote vectors of $m$ explanatory variables for the $i$th observation, i.e., with components $\omega_{ji}$ and $\eta_{ji}$ ($j = 1, \dots, m$), used to model $\mu_{1i}$ and $\mu_{2i}$, respectively, and where $\beta_k = (\beta_{k1}, \dots, \beta_{km})^{\top}$ ($k = 1, 2$) designates the corresponding vector of regression coefficients. The log-linear specification for $\mu_{1i}$ is widely used, while the link function for $\mu_{2i}$ was chosen in this way to ensure that the latter would not be larger than $\mu_{1i}$, and thus, it would be compatible with $X_2 \le X_1$.
These mean values may be influenced by several characteristics and variables, and the explanatory variables that are used to model each parameter $\mu_{1i}$ and $\mu_{2i}$ may not be the same in practice. In this respect, the work in Cameron and Trivedi (1998) provided good insight into standard count regression models.
The marginal effect reflects the variation of the conditional mean of $X_1$ and $X_2$ due to a one-unit change in the $j$th covariate, and it is calculated as:
$$\frac{\partial \mu_{1i}}{\partial \beta_{1j}} = \omega_{ji} \exp(\omega_{1i}^{\top} \beta_1) = \omega_{ji}\, \mu_{1i}, \qquad \frac{\partial \mu_{2i}}{\partial \beta_{2j}} = \eta_{ji}\, \mu_{2i} \left(1 - \frac{\mu_{2i}}{\mu_{1i}}\right), \tag{5}$$
for $i = 1, \dots, n$ and $j = 1, \dots, m$. Thus, the marginal effect indicates that a one-unit change in the $j$th regressor increases or decreases the expectation of the total number of claims and of the number of claims above the given threshold depending on the sign, positive or negative, of the corresponding regression coefficient for each mean. For indicator variables such as $\omega_{ji}$, which take only the value zero or one, the marginal effect in terms of the odds-ratio is $\exp(\beta_{1j})$ for $\mu_{1i}$ and $\exp(\beta_{2j})$ for $\mu_{2i}$. Therefore, for $\mu_{1i}$, the conditional mean is $\exp(\beta_{1j})$ times larger if the indicator is one rather than zero. A similar conclusion is drawn for $\mu_{2i}$. Certainly, if $\mu_{1i}$ and $\mu_{2i}$ share the same covariates, then (5) does not correspond to the marginal effect of the $j$th covariate, since $\mu_{1i}$ may also change in response to the changes of this covariate.
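To make the two link functions and the derivative in (5) concrete, the sketch below computes $\mu_{1i}$ and $\mu_{2i}$ for one hypothetical policyholder and compares the analytic derivative of $\mu_{2i}$ with a finite difference. The covariate vectors and coefficient values are invented for illustration and are not estimates from the paper.

```python
import math

def linear(v, beta):
    """Inner product of a covariate vector and a coefficient vector."""
    return sum(a * b for a, b in zip(v, beta))

def means(omega, eta, beta1, beta2):
    """Log link for mu1, logistic link scaled by mu1 for mu2."""
    mu1 = math.exp(linear(omega, beta1))
    p = 1.0 / (1.0 + math.exp(-linear(eta, beta2)))
    return mu1, mu1 * p

def d_mu2_d_beta2(eta, mu1, mu2):
    """Analytic derivative of mu2 with respect to each beta2 component."""
    return [e * mu2 * (1.0 - mu2 / mu1) for e in eta]

# Hypothetical policyholder: intercept plus two indicator covariates
omega, eta = [1.0, 1.0, 0.0], [1.0, 0.0, 1.0]
beta1, beta2 = [-2.5, 0.3, -0.1], [-1.0, 0.4, 0.2]
mu1, mu2 = means(omega, eta, beta1, beta2)
grad = d_mu2_d_beta2(eta, mu1, mu2)

# Forward finite difference in the third component of beta2
h = 1e-6
bumped = beta2[:]
bumped[2] += h
fd = (means(omega, eta, beta1, bumped)[1] - mu2) / h

# Switching the second indicator off divides mu1 by exp(beta1[1])
mu1_off, _ = means([1.0, 0.0, 0.0], eta, beta1, beta2)
ratio = mu1 / mu1_off
```

The logistic factor keeps $\mu_{2i}$ strictly between zero and $\mu_{1i}$ for any real coefficients, which is the compatibility property discussed above.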

3.1. Estimation

In this section, we derive estimators based on the maximum likelihood for the model with and without covariates, and we also provide closed-form expressions for Fisher’s information matrix.

3.1.1. Model without Covariates

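Let $\Theta = (\mu_1, \mu_2)$, and consider a random sample of $n$ observations $x = \{(x_{11}, x_{21}), \dots, (x_{1n}, x_{2n})\}$ taken from the probability function (1). The log-likelihood is proportional to:
$$\ell(\Theta; x) \propto n \bar{x}_2 \log \mu_2 + n (\bar{x}_1 - \bar{x}_2) \log(\mu_1 - \mu_2) - n \mu_1,$$
where $\bar{x}_1$ and $\bar{x}_2$ are the sample means of $X_1$ and $X_2$, respectively. The normal equations to be solved are:
$$\frac{\partial \ell(\Theta; x)}{\partial \mu_1} = \frac{n (\bar{x}_1 - \bar{x}_2)}{\mu_1 - \mu_2} - n = 0, \qquad \frac{\partial \ell(\Theta; x)}{\partial \mu_2} = \frac{n \bar{x}_2}{\mu_2} + \frac{n (\bar{x}_2 - \bar{x}_1)}{\mu_1 - \mu_2} = 0,$$
from which it is easy to obtain the maximum likelihood estimators $\hat{\mu}_1 = \bar{x}_1$ and $\hat{\mu}_2 = \bar{x}_2$, which coincide with the moment estimators. The second partial derivatives are:
$$\frac{\partial^2 \ell(\Theta; x)}{\partial \mu_1^2} = -\frac{n (\bar{x}_1 - \bar{x}_2)}{(\mu_1 - \mu_2)^2}, \qquad \frac{\partial^2 \ell(\Theta; x)}{\partial \mu_2^2} = -\frac{n \bar{x}_2}{\mu_2^2} + \frac{n (\bar{x}_2 - \bar{x}_1)}{(\mu_1 - \mu_2)^2}, \qquad \frac{\partial^2 \ell(\Theta; x)}{\partial \mu_1 \partial \mu_2} = \frac{n (\bar{x}_1 - \bar{x}_2)}{(\mu_1 - \mu_2)^2}.$$
The expectation of the negative of the matrix of second partial derivatives, evaluated at the estimates, yields Fisher’s information matrix:
$$J(\hat{\Theta}) = \begin{pmatrix} \dfrac{n}{\hat{\mu}_1 - \hat{\mu}_2} & -\dfrac{n}{\hat{\mu}_1 - \hat{\mu}_2} \\[2ex] -\dfrac{n}{\hat{\mu}_1 - \hat{\mu}_2} & \dfrac{n \hat{\mu}_1}{\hat{\mu}_2 (\hat{\mu}_1 - \hat{\mu}_2)} \end{pmatrix}.$$
The asymptotic variance-covariance matrix of $(\hat{\mu}_1, \hat{\mu}_2)$ is obtained by inverting this information matrix.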
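The estimation steps for the model without covariates can be sketched as follows. The information matrix coded here is the one obtained by taking expectations of the negative second derivatives of the log-likelihood above (using $E(\bar{x}_1) = \mu_1$ and $E(\bar{x}_2) = \mu_2$); the toy sample is invented for illustration and is not the portfolio analysed in the paper.

```python
def mle(sample):
    """Sample means of x1 and x2, which are the ML (and moment) estimates."""
    n = len(sample)
    return sum(a for a, _ in sample) / n, sum(b for _, b in sample) / n

def fisher_information(mu1, mu2, n):
    """Expected negative Hessian of the log-likelihood at (mu1, mu2)."""
    d = mu1 - mu2
    return [[n / d, -n / d],
            [-n / d, n * mu1 / (mu2 * d)]]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

# Hypothetical toy sample of (x1, x2) pairs with x2 <= x1
sample = [(0, 0), (1, 0), (2, 1), (1, 1), (0, 0), (3, 1), (1, 0), (0, 0)]
n = len(sample)
mu1_hat, mu2_hat = mle(sample)
cov = inverse_2x2(fisher_information(mu1_hat, mu2_hat, n))
```

Inverting the matrix gives variances $\hat{\mu}_1/n$ and $\hat{\mu}_2/n$ and covariance $\hat{\mu}_2/n$, which agrees with the Poisson marginals and the covariance $\mu_2$ of the model.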

3.1.2. Model with Covariates

When covariates are considered, the log-likelihood is proportional to:
$$\ell(\beta; x) \propto \sum_{i=1}^{n} \left[ x_{2i} \log \mu_{2i} + (x_{1i} - x_{2i}) \log(\mu_{1i} - \mu_{2i}) - \mu_{1i} \right],$$
where $\beta = (\beta_1, \beta_2)$.
Observe now that $\mu_{1i} = \mu_{1i}(\beta_1)$ and $\mu_{2i} = \mu_{2i}(\beta_1, \beta_2)$; this notation emphasizes that the first expression depends only on $\beta_1$ and the second on both $\beta_1$ and $\beta_2$. Thus,
$$\frac{\partial \mu_{1i}}{\partial \beta_{1j}} = \omega_{ji}\, \mu_{1i}, \qquad \frac{\partial \mu_{2i}}{\partial \beta_{1j}} = \omega_{ji}\, \mu_{2i}, \qquad \frac{\partial \mu_{2i}}{\partial \beta_{2j}} = \frac{\mu_{2i}\, \eta_{ji}}{1 + \exp(\eta_{2i}^{\top} \beta_2)},$$
for $i = 1, \dots, n$ and $j = 1, \dots, m$.
Then, after some algebra, we obtain the normal equations,
$$\frac{\partial \ell(\beta; x)}{\partial \beta_{1j}} = \sum_{i=1}^{n} \omega_{ji} (x_{1i} - \mu_{1i}) = 0, \qquad \frac{\partial \ell(\beta; x)}{\partial \beta_{2j}} = \sum_{i=1}^{n} \frac{\eta_{ji}\, \phi(\mu_{1i}, \mu_{2i}, x_{1i}, x_{2i})}{1 + \exp(\eta_{2i}^{\top} \beta_2)} = 0, \qquad j = 1, \dots, m,$$
where:
$$\phi(\mu_{1i}, \mu_{2i}, x_{1i}, x_{2i}) = \frac{x_{2i}\, \mu_{1i} - x_{1i}\, \mu_{2i}}{\mu_{1i} - \mu_{2i}}.$$
These equations provide the maximum likelihood estimates for the vectors of parameters $\hat{\beta}_1 = (\hat{\beta}_{11}, \dots, \hat{\beta}_{1m})$ and $\hat{\beta}_2 = (\hat{\beta}_{21}, \dots, \hat{\beta}_{2m})$. Similarly to the previous case, Fisher’s information matrix can be obtained in closed form. See the details in Appendix A.
The normal equations illustrated above can be used to estimate model parameters with and without covariates. The Newton–Raphson method provides solutions in a reasonable time, which obviously depends on the number of regressors used.
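A minimal Newton–Raphson sketch for these normal equations is given below, under two simplifying assumptions that are not part of the paper: a single binary covariate plus an intercept is used in both linear predictors, and the data are a toy sample. Since $1/(1 + \exp(\eta_{2i}^{\top}\beta_2)) = (\mu_{1i} - \mu_{2i})/\mu_{1i}$, the second score can be rewritten as $\sum_i \eta_{ji}(x_{2i} - x_{1i} p_i)$ with $p_i = \mu_{2i}/\mu_{1i}$, so the two sets of equations can be solved separately.

```python
import math

def solve2(a, b):
    """Solve the 2x2 linear system a x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]

def newton(score_hess, beta, steps=50, tol=1e-12):
    """Iterate beta <- beta - H^{-1} g until the step is negligible."""
    for _ in range(steps):
        g, h = score_hess(beta)
        step = solve2(h, g)
        beta = [beta[0] - step[0], beta[1] - step[1]]
        if max(abs(s) for s in step) < tol:
            break
    return beta

def accumulate(data, resid_weight):
    """Sum score and Hessian contributions for the design vector (1, z)."""
    g, h = [0.0, 0.0], [[0.0, 0.0], [0.0, 0.0]]
    for z, x1, x2 in data:
        w = (1.0, float(z))
        resid, weight = resid_weight(w, x1, x2)
        for j in range(2):
            g[j] += w[j] * resid
            for k in range(2):
                h[j][k] -= w[j] * w[k] * weight
    return g, h

def poisson_part(data):
    def f(beta):
        def rw(w, x1, x2):
            mu1 = math.exp(beta[0] * w[0] + beta[1] * w[1])
            return x1 - mu1, mu1
        return accumulate(data, rw)
    return f

def threshold_part(data):
    def f(beta):
        def rw(w, x1, x2):
            p = 1.0 / (1.0 + math.exp(-(beta[0] * w[0] + beta[1] * w[1])))
            return x2 - x1 * p, x1 * p * (1.0 - p)
        return accumulate(data, rw)
    return f

# Hypothetical toy data: (binary covariate z, x1, x2)
data = [(0, 0, 0), (0, 1, 0), (0, 2, 1), (0, 1, 1),
        (1, 0, 0), (1, 3, 1), (1, 1, 0), (1, 2, 1)]
b1 = newton(poisson_part(data), [0.0, 0.0])
b2 = newton(threshold_part(data), [0.0, 0.0])
mu1_fit = [math.exp(b1[0]), math.exp(b1[0] + b1[1])]
p_fit = [1 / (1 + math.exp(-b2[0])), 1 / (1 + math.exp(-(b2[0] + b2[1])))]
```

Because this toy model is saturated, the fitted means reproduce the group averages of $x_1$ and the group proportions of claims above the threshold; with more regressors, `solve2` would have to be replaced by a general linear solver.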

4. Credibility Regression Premiums

Briefly speaking, a bonus-malus system is an experience rating system that is based on the insured’s claim frequency rather than the claim size. Let us now assume some kind of heterogeneity between policyholders by allowing the parameters $\mu_i$, $i = 1, 2$, to follow some probability distributions. For $\mu_1$, a gamma prior distribution $\pi_1(\mu_1)$ with shape hyperparameter $\alpha_1 > 0$ and rate hyperparameter $\gamma_1 > 0$ (so that its mean is $\alpha_1/\gamma_1$) will be assumed, whereas a beta-type prior distribution will be considered for $\mu_2$, with the probability density function given by:
$$\pi_2(\mu_2) = \frac{\mu_2^{\alpha_2 - 1} (\mu_1 - \mu_2)^{\gamma_2 - 1}}{\mu_1^{\alpha_2 + \gamma_2 - 1} B(\alpha_2, \gamma_2)}, \quad 0 < \mu_2 < \mu_1.$$
Here, $\alpha_2 > 0$, $\gamma_2 > 0$, and $B(a, b)$ is the beta function given by $B(a, b) = \Gamma(a) \Gamma(b) / \Gamma(a + b)$, where $\Gamma(\cdot)$ is the Euler gamma function.
The main benefit of selecting these prior distributions is that they are conjugate with respect to the likelihoods, and for that reason, they are common choices in Bayesian and actuarial statistics; see for instance Heilmann (1989), Denuit et al. (2009), and Klugman et al. (2008), among others.
Since $\mu_1$ and $\mu_2$ are dependent, we can choose the prior distribution given by:
$$\pi(\mu_1, \mu_2) = \pi_1(\mu_1)\, \pi_2(\mu_2) \left[1 + \omega\, \phi_1(\mu_1)\, \phi_2(\mu_2)\right],$$
which corresponds to the copula proposed by Lee (1996). Here, $\phi_i(\mu_i)$, $i = 1, 2$, are bounded non-constant functions such that $\int \pi_i(\mu_i)\, \phi_i(\mu_i)\, d\mu_i = 0$, and $\omega$ is a real number that satisfies $1 + \omega\, \phi_i(\mu_i) \ge 0$, $i = 1, 2$. Now, given a sample $x = (\tilde{x}_1, \tilde{x}_2) = \{(x_{11}, x_{21}), \dots, (x_{1t}, x_{2t})\}$, where $t$ is the sample size, the posterior distribution of $(\mu_1, \mu_2)$ given the sample information is computed according to Bayes’ theorem, and it is proportional to the product of the likelihood and the prior distribution. Thus, the posterior distribution is almost conjugate with respect to the likelihood and similar to the product of a gamma and a beta distribution, where the updated parameters are given by:
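$$\alpha_1^* = \alpha_1 + t \bar{x}_1, \tag{8}$$
$$\alpha_2^* = \alpha_2 + t \bar{x}_2, \tag{9}$$
$$\gamma_1^* = \gamma_1 + t, \tag{10}$$
$$\gamma_2^* = \gamma_2 + t (\bar{x}_1 - \bar{x}_2). \tag{11}$$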
In practice, it is observed that $\mu_2$ is near zero; in this case, $\omega \approx 0$, and the prior distribution reduces to $\pi(\mu_1, \mu_2) = \pi_1(\mu_1)\, \pi_2(\mu_2)$, which is the case considered here.
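The parameter updates (8)–(11) amount to simple bookkeeping on the claim history, as the following sketch illustrates. The function name, the prior hyperparameters, and the three-year history are hypothetical.

```python
def posterior_parameters(a1, g1, a2, g2, history):
    """Apply (8)-(11) to a list of yearly (x1, x2) observations.

    Returns (alpha1*, gamma1*, alpha2*, gamma2*).
    """
    t = len(history)
    s1 = sum(x1 for x1, _ in history)   # t * mean of x1
    s2 = sum(x2 for _, x2 in history)   # t * mean of x2
    return a1 + s1, g1 + t, a2 + s2, g2 + (s1 - s2)

# Hypothetical prior hyperparameters and a three-year claim history
prior = (0.1, 1.4, 0.5, 1.5)
history = [(1, 0), (0, 0), (2, 1)]
post = posterior_parameters(*prior, history)
```

Here $\alpha_2$ accumulates the claims above the threshold and $\gamma_2$ the claims below it, while $\alpha_1$ and $\gamma_1$ track the total number of claims and the number of periods observed.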
Now, the unconditional means and cross moment are given by:
$$E(X_1) = \frac{\alpha_1}{\gamma_1}, \qquad E(X_2) = \frac{\alpha_1}{\gamma_1}\, \frac{\alpha_2}{\alpha_2 + \gamma_2}, \qquad E(X_1 X_2) = \frac{\alpha_1 \alpha_2 (\alpha_1 + \gamma_1 + 1)}{\gamma_1^2 (\alpha_2 + \gamma_2)}.$$
Finally, the unconditional bivariate distribution is:
$$\Pr(X_1 = x_1, X_2 = x_2) = \frac{\gamma_1^{\alpha_1}}{(1 + \gamma_1)^{x_1 + \alpha_1}} \times \frac{\Gamma(x_1 + \alpha_1)\, \Gamma(x_2 + \alpha_2)\, \Gamma(x_1 - x_2 + \gamma_2)}{(x_1 - x_2)!\, x_2!\, B(\alpha_2, \gamma_2)\, \Gamma(\alpha_1)\, \Gamma(\alpha_2 + \gamma_2 + x_1)}. \tag{12}$$
For computational reasons, it is sometimes more convenient to work with the parametrization $\alpha_1 = \gamma_1 \mu_1$ and $\alpha_2 = \gamma_2 \mu_2 / (\mu_1 - \mu_2)$.
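These moments follow by iterated expectations. The sketch below assumes the independent-prior case considered here, in which $\mu_1$ is gamma distributed and the ratio $B = \mu_2/\mu_1$ is beta distributed with parameters $\alpha_2$ and $\gamma_2$, independently of $\mu_1$ (the symbol $B$ for the ratio is introduced only for this derivation):

```latex
\begin{aligned}
E(X_1) &= E(\mu_1) = \frac{\alpha_1}{\gamma_1}, \\
E(X_2) &= E(\mu_1 B) = E(\mu_1)\,E(B)
        = \frac{\alpha_1}{\gamma_1}\,\frac{\alpha_2}{\alpha_2 + \gamma_2}, \\
E(X_1 X_2) &= E\bigl[\mu_2 (1 + \mu_1)\bigr]
            = E(B)\,\bigl[E(\mu_1) + E(\mu_1^2)\bigr]
            = \frac{\alpha_2}{\alpha_2 + \gamma_2}
              \left[\frac{\alpha_1}{\gamma_1}
                    + \frac{\alpha_1 (\alpha_1 + 1)}{\gamma_1^2}\right]
            = \frac{\alpha_1 \alpha_2 (\alpha_1 + \gamma_1 + 1)}
                   {\gamma_1^2 (\alpha_2 + \gamma_2)}.
\end{aligned}
```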
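As an illustration of (12) and of the alternative parametrization, the sketch below evaluates the unconditional probability function with log-gamma functions and checks numerically that it sums to one and reproduces the unconditional means. All numerical values and function names are hypothetical choices for the example.

```python
import math

def mixture_pmf(x1, x2, a1, g1, a2, g2):
    """Unconditional Pr(X1 = x1, X2 = x2) as in (12), on the log scale."""
    if x2 < 0 or x2 > x1:
        return 0.0
    lg = math.lgamma
    log_beta = lg(a2) + lg(g2) - lg(a2 + g2)
    log_p = (a1 * math.log(g1) - (x1 + a1) * math.log(1.0 + g1)
             + lg(x1 + a1) + lg(x2 + a2) + lg(x1 - x2 + g2)
             - lg(x1 - x2 + 1) - lg(x2 + 1)
             - log_beta - lg(a1) - lg(a2 + g2 + x1))
    return math.exp(log_p)

def from_means(mu1, mu2, g1, g2):
    """Alternative parametrization: alpha1 and alpha2 from the two means."""
    return g1 * mu1, g2 * mu2 / (mu1 - mu2)

# Hypothetical values for the means and for gamma1, gamma2
mu1, mu2, g1, g2 = 0.25, 0.10, 2.0, 3.0
a1, a2 = from_means(mu1, mu2, g1, g2)

grid = [(x1, x2) for x1 in range(200) for x2 in range(x1 + 1)]
probs = [mixture_pmf(x1, x2, a1, g1, a2, g2) for x1, x2 in grid]
total = sum(probs)
mean_x1 = sum(x1 * p for (x1, _), p in zip(grid, probs))
mean_x2 = sum(x2 * p for (_, x2), p in zip(grid, probs))
```

The truncation at 200 claims is arbitrary; it only needs to be large enough for the negative binomial tail of $X_1$ to be negligible at the chosen value of $\gamma_1$.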
The maximum likelihood estimates for this mixture regression model can be obtained by means of the EM algorithm. This method is a powerful technique that provides an iterative procedure to compute maximum likelihood estimates when data contain missing information. Details on the derivation of the EM algorithm can be found in Appendix B. The standard errors of the estimates $\hat{\Omega} = (\hat{\beta}_1, \hat{\beta}_2, \hat{\gamma}_1, \hat{\gamma}_2)$ can be computed by using the method given by Louis (1982). Here, we use Fisher’s information matrix found in Appendix A and replace the missing values by the corresponding pseudo-values calculated in the last iteration of the EM algorithm. Direct maximization of the likelihood surface is also possible to compute the maximum likelihood estimates of the mixture regression model.
By following the same arguments as those provided in Gómez-Déniz (2016), and also based on the ideas in Heilmann (1989) (see also Gerber 1979, Rolski et al. 1999, Bühlmann and Gisler 2005, and Gómez-Déniz 2008, among others), a premium calculation principle assigns to each risk, with vector of parameters $\Theta$, a premium within the set $\mathcal{P} \subseteq \mathbb{R}$, the action space. Let $L(\Theta, P) = (\Theta - P)^2$ be the squared-error loss sustained by a decision-maker who takes the action $P$ and is faced with the outcome $\Theta$ of a random experience. The premium must be determined in such a way that the expected loss is minimised. The unknown premium $P(\Theta)$, called the risk premium, can be obtained by minimising the expected value of $(g(x_1, x_2) - P)^2$, where $g(x_1, x_2)$ is an appropriate function of the numbers of claims with a claim size below and above $\psi$. It seems reasonable to take $g(x_1, x_2)$ as:
$$g(x_1, x_2) = p_l\, x_2 + p_s (x_1 - x_2), \tag{13}$$
where $p_s$ and $p_l$ are appropriate weights assigned to the numbers of claims with sizes below and above the critical value, respectively, with $p_s < p_l$. Now, since $E(X_1) = \mu_1$ and $E(X_2) = \mu_2$, simple algebra provides the risk premium,
$$P(\Theta) = E[g(x_1, x_2)] = p_s\, \mu_1 + (p_l - p_s)\, \mu_2, \tag{14}$$
where the expectation is taken with respect to (1). By taking $p_l = p_s = 1$ in (14), this reduces to $P(\Theta) = \mu_1$; that is, the risk premium depends only on the number of claims, irrespective of their size.
In the absence of experience, the actuary computes the collective premium,
$$P = E_{\pi(\Theta)}[P(\Theta)] = \frac{\alpha_1 (p_s \gamma_2 + p_l \alpha_2)}{\gamma_1 (\alpha_2 + \gamma_2)}. \tag{15}$$
Again, by inserting $p_l = p_s = 1$ into (15), we obtain the collective premium computed under the traditional model, $P = \alpha_1/\gamma_1$. On the other hand, if experience is available, the actuary takes a sample $(\tilde{x}_1, \tilde{x}_2)$ from the random variables $(X_1, X_2)$ and uses this information to estimate the unknown risk premium $P(\Theta)$ through the Bayes premium $P^*(\tilde{x}_1, \tilde{x}_2) = E_{\pi(\Theta \mid (\tilde{x}_1, \tilde{x}_2))}[P(\Theta)]$. Because the posterior distribution is conjugate with the prior, the Bayes premium can be derived from (15) by simply replacing the parameters $\alpha_i$ and $\gamma_i$ ($i = 1, 2$) with the updated parameters given in (8)–(11). Furthermore, the Bayesian premium can be rewritten as a credibility expression, i.e., a linear function of the data and the collective premium.
Obviously, the Bayesian premium based on (15) does not depend on the individual’s risk factors, and it is only based on the accumulated past claims. The individual’s risk factors can be incorporated into the premium by computing $P_i^*(\tilde{x}_1, \tilde{x}_2, \beta_{1i}, \beta_{2i})$, for $i = 1, \dots, n$. This general pricing formula is a function of the number of accumulated claims and the individual’s significant characteristics in the regression component.
Finally, the Bayesian bonus-malus premium is computed as the ratio between the Bayesian premium and the collective premium. This bonus-malus premium is usually normalised by multiplying this ratio by 100.
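The premium calculations can be sketched as follows. The premium is computed here directly as the prior (or posterior) expectation of $g(x_1, x_2) = p_l x_2 + p_s (x_1 - x_2)$, using $E(\mu_1) = \alpha_1/\gamma_1$ and $E(\mu_2) = E(\mu_1)\,\alpha_2/(\alpha_2 + \gamma_2)$; the hyperparameters, the weights, and the claim histories are hypothetical values chosen for illustration.

```python
def premium(a1, g1, a2, g2, p_s, p_l):
    """Expected value of g under a gamma/beta-type distribution."""
    e_mu1 = a1 / g1
    e_mu2 = e_mu1 * a2 / (a2 + g2)
    return p_s * (e_mu1 - e_mu2) + p_l * e_mu2

def bonus_malus(a1, g1, a2, g2, t, s1, s2, p_s, p_l):
    """100 * Bayes premium / collective premium after t periods.

    s1 and s2 are the accumulated numbers of claims (total, above threshold).
    """
    collective = premium(a1, g1, a2, g2, p_s, p_l)
    bayes = premium(a1 + s1, g1 + t, a2 + s2, g2 + (s1 - s2), p_s, p_l)
    return 100.0 * bayes / collective

prior = (0.1, 1.4, 0.5, 1.5)   # hypothetical alpha1, gamma1, alpha2, gamma2
p_s, p_l = 1.0, 2.0            # hypothetical weights with p_s < p_l

bmp_start = bonus_malus(*prior, 0, 0, 0, p_s, p_l)        # no experience yet
bmp_claim_free = bonus_malus(*prior, 1, 0, 0, p_s, p_l)   # one claim-free year
bmp_small = bonus_malus(*prior, 1, 1, 0, p_s, p_l)        # one small claim
bmp_large = bonus_malus(*prior, 1, 1, 1, p_s, p_l)        # one large claim
traditional = premium(*prior, 1.0, 1.0)                   # equals alpha1/gamma1
```

In this toy setting, a claim-free year lowers the normalised premium below 100, and a claim above the threshold is penalised more heavily than one below it, which is the qualitative behaviour the system is designed to produce.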

5. Empirical Results

We will now analyse a dataset that includes information on one-year vehicle insurance policies taken out in 2004 or 2005. This dataset is available on the website of the Faculty of Business and Economics, Macquarie University (Sydney, Australia) (see also de Jong and Heller 2008). The total portfolio contained 67,856 policies, of which 4624 had at least one claim. With respect to the number of claims, the minimum and maximum were zero and four, respectively; the mean was 0.072, and the standard deviation was 0.278. Regarding the claim size, the minimum and maximum were zero and 55,922.10, respectively; the mean was 137.27, and the standard deviation was 1056.30. This standard deviation was very large, which meant that a premium based only on the mean claim size was not adequate for computing the bonus-malus premiums. As this portfolio only included the aggregate value of the claim amounts rather than the amount of each individual claim, we followed the approach provided in Gómez-Déniz (2016) and performed a simulation to assign an amount to each claim at random. This simulation was carried out by using the Mathematica commands Permute, RandomChoice, IntegerPartitions, IntegerPart and RandomPermutation, as shown in the Appendix provided in Gómez-Déniz (2016). Note that the partition obtained only provided the integer part of each amount, which did not seem very relevant for the analysis. Furthermore, due to the RandomChoice command, the partition was different every time the program was run. The claim amounts obtained via simulation are not shown in this work, but they are available from the authors upon request.
Table 1 below shows the observed and expected frequencies when the threshold value for the claims is assumed to be $\psi = \$1000$. For each entry, the observed frequency (top row, in bold), the expected frequency under the basic model (middle row, obtained by using (1)), and the expected frequency under the mixture model (bottom row, obtained by using (12)) are reported. Furthermore, the marginal observed and expected frequencies are given in the far right column and in the bottom row for $X_1$ and $X_2$, respectively. The cells in this table are grouped to comply with the rule of five when applying the $\chi^2$ test.
Similarly, Table 2 exhibits the observed and expected frequencies when the threshold amount was $\psi = \$3000$. Again, the cells are combined to comply with the rule of five. As can be seen, the mixture model was much more flexible, since it incorporated heterogeneity among policyholders via the prior distributions, and it also provided a better fit to the data than the basic model for both thresholds.
Maximum likelihood estimation was used in both cases. In the case of the mixture model, it was verified that directly maximizing the log-likelihood function provided, as expected, the same results as using the EM algorithm shown in Appendix B of this work. Mathematica and WinRATS were the two packages used in this case.
Parameter estimates, standard errors (in brackets), the maximum of the log-likelihood function, the chi-squared test statistics, the degrees of freedom (d.f.), and the $p$-values are exhibited in Table 3 for the basic and mixture models. Results for the threshold value $\psi = \$1000$ are shown in the second and third columns and those for $\psi = \$3000$ in the last two columns. Virtually the same estimates were obtained for the parameters $\mu_1$ and $\mu_2$ under the basic and mixture models. Similarly, no changes were discernible between the estimates for the two thresholds, with the exception of the estimate of parameter $\gamma_2$, which decreased when the threshold increased. Increasing the threshold value improved the fit to the data. The mixture model provided the best fit to the data in terms of the $\chi^2$ test statistic and the negative of the maximum of the log-likelihood function, $\ell_{\max}$. Although the gain in terms of the maximum of the log-likelihood function did not seem significant, the mixture model was preferable in terms of the $\chi^2$ test statistic since, unlike the basic one, it was not rejected at the 5% significance level (see the corresponding $p$-values) for either of the two thresholds.
We now implement explanatory variables in our analysis. The following covariates were considered: gender of driver, vehicle body, driver’s area of residence, age of vehicle, and driver’s age category. In addition, an intercept was also included in the study. Details about the codification of these variables can be found on the same website. Moreover, an offset variable (exposure, log of the time exposed to risk) was included in the linear predictor associated with the first variable.
Table 4 illustrates the estimates of the regression coefficients for the mixture model associated with the random variables $X_1$ and $X_2$, again for the thresholds $\psi = \$1000$ and $\psi = \$3000$. In the first case, the explanatory variables hardtop (HDTOP), motorized caravan (MCARA), driver’s area of residence C (AREAC), age of Vehicles 1 and 2 (VAGE1 and VAGE2), and driver’s Age Category 1 (AGE1) were statistically significant at the 5% significance level for the random variable total number of claims given that the claim size exceeded $\psi = \$1000$. Among these variables, only HDTOP, MCARA, VAGE1, VAGE2, and AGE1 were significant for both response variables. However, it is important to note that for all these variables, except for the regressors associated with AGE1 and AGE2, the sign of the estimates changed from positive to negative for claims above the threshold. Furthermore, the estimate of parameter $\gamma_1$ was statistically significant at the same nominal level. When the threshold value was increased to $\psi = \$3000$, the number of significant variables above the threshold grew considerably, since now the intercept (CONSTANT), gender of driver (GENDER), HDTOP, SEDAN, station wagon (STNWG), TRUCK, AREAA, AREAB, AREAC, AREAD, VAGE1, VAGE2, and AGE1 were relevant. However, only CONSTANT, HDTOP, STNWG, AREAD, VAGE1, VAGE2, and AGE1 were significant for both dependent variables at the same nominal level. The coefficients associated with the explanatory variables CONSTANT, AREAD, and AGE1 had the same sign for claims below and above $\psi = \$3000$. The first two were negatively correlated and the last one positively correlated with the response variables. For the other regressors, once again, the sign of the estimates changed from positive to negative for claims above the threshold.
Among the common statistically significant estimates for both threshold values, i.e., HDTOP, VAGE1, VAGE2, and AGE1, the same sign of the estimates in the variables $X_1$ and $X_2$ was observable. For the non-significant estimates, different signs were observed. Furthermore, the estimates of parameters $\gamma_1$ and $\gamma_2$ were statistically significant at the same nominal level.
Similarly to the case previously considered, the fit to the data improved when covariates were incorporated in the model and when the threshold value increased. Table 5 exhibits the negative of the maximum of the log-likelihood function ($\ell_{\max}$), Akaike’s information criterion (AIC), the Bayesian information criterion (BIC), and the consistent Akaike’s information criterion (CAIC) for the basic and mixture regression models. Lower values of these model selection measures are preferable. It was observable that the mixture regression model was preferable to the basic one.
Figure 2 shows the QQ-plots of the randomized quantile residuals, which are used to check for normality. The residuals for the basic regression models are shown in the top row and those for the mixture regression model in the bottom row. Furthermore, models that use $\psi = \$1000$ as the threshold value are exhibited in the left column and those that use $\psi = \$3000$ in the right-hand column. A perfect alignment with the 45° line implies that the residuals are normally distributed. It was observable that the residuals for the larger threshold value adhered slightly more closely to the line, but these differences were not significant.
Figure 3 shows the bonus-malus premiums (BMP) for the mixture model without covariates. Here, x 1 is the total number of claims, of which x 2 have a size larger than ψ . In each chart, the thick line represents ψ = $ 1000 , and the thin line denotes ψ = $ 3000 . For both thresholds, the BMP decreased with the time period when the observed pair x 1 and x 2 was fixed. The BMP was consistently lower for the lower threshold ψ . For both values of ψ , the premium charged increased when x 1 and x 2 grew; it also increased with x 2 when x 1 was held fixed.
Figure 4 illustrates the bonus-malus premiums (BMP) to be charged to the subgroup of policyholders with SEDAN and AREAA. In this case, we used the mixture regression model including the rest of the explanatory variables and the exposure. Similar conclusions can be drawn from this set of graphs. Again, the BMP was consistently lower for the lower threshold ψ . The premium charged increased when x 1 and x 2 grew for either value of ψ ; moreover, the premium rose with x 2 when x 1 was held fixed. The premiums obtained under this regression model were considerably higher than those derived before. This can be explained by the small sample size used to estimate the regressors and also by the incorporation of the offset variable, which affected both the individual average number of claims and the probability of making a claim higher than the threshold. Other subgroups of policyholders could also be used for tarification purposes; however, for some of these classes, unreliable estimates were obtained due to the very small sample size.

Computations in the Compound Model

Although it is customary to calculate the bonus-malus premium based only on the number of claims (it is usually assumed that, once a loss has occurred, the company cannot model the corresponding loss amount), some attempts have been made to incorporate severity into the calculation of the premium. Works related to this topic include Frangos and Vrontos (2001), Pinquet (1998), and Gómez-Déniz et al. (2014), among others. When the practitioner wishes to calculate the premium using both variables, it is useful to rely on the compound collective model. As in the univariate case, the bivariate compound distribution of the aggregate claim size random variables can be derived as follows:
$$g(y_1, y_2) = \sum_{x_1, x_2 = 0}^{\infty} p_{x_1, x_2}\, f_1^{*x_1}(y_1)\, f_2^{*x_2}(y_2),$$
and this is the joint probability density function of $(Y_1, Y_2) = (S_1, S_2)$, where $S_1 = \sum_{i=0}^{X_1} Y_{1i}$ and $S_2 = \sum_{i=0}^{X_2} Y_{2i}$ are the aggregate severities, $Y_1$ and $Y_2$ being mutually independent and also independent of $(X_1, X_2)$, with probability functions (discrete or continuous) $f_1(y_1)$ and $f_2(y_2)$, respectively, and with $x_1$-fold and $x_2$-fold convolutions $f_1^{*x_1}(y_1)$ and $f_2^{*x_2}(y_2)$, respectively. General expressions for $E(S)$, $\mathrm{var}(S)$ and $\mathrm{cov}(S_1, S_2)$, where $S = S_1 + S_2$, were provided in Partrat (1994).
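For discrete severities, the double sum above can be evaluated directly by truncating the claim counts and computing the convolution powers numerically. The sketch below is illustrative only. It assumes integer-valued severities and uses the count distribution (1) in the simplified form obtained by multiplying out the factorisation given in the proof of Proposition A1. The function names, the severity distributions f1 and f2, and the truncation points are hypothetical choices.

```python
import math

def joint_count_pmf(x1, x2, mu1, mu2):
    """Pr(X1 = x1, X2 = x2) for 0 <= x2 <= x1 and mu1 > mu2 > 0."""
    if x2 < 0 or x2 > x1:
        return 0.0
    return (math.exp(-mu1) * mu2 ** x2 * (mu1 - mu2) ** (x1 - x2)
            / (math.factorial(x2) * math.factorial(x1 - x2)))

def convolution_powers(f, max_n, size):
    """Return [f^{*0}, ..., f^{*max_n}], each truncated to `size` points."""
    powers = [[1.0] + [0.0] * (size - 1)]   # f^{*0} is a point mass at 0
    for _ in range(max_n):
        prev, nxt = powers[-1], [0.0] * size
        for i, a in enumerate(prev):
            if a == 0.0:
                continue
            for j, b in enumerate(f):
                if i + j < size:
                    nxt[i + j] += a * b
        powers.append(nxt)
    return powers

def compound_pmf(mu1, mu2, f1, f2, size, max_count):
    """g(y1, y2) on {0..size-1}^2, with counts truncated at max_count."""
    c1 = convolution_powers(f1, max_count, size)
    c2 = convolution_powers(f2, max_count, size)
    g = [[0.0] * size for _ in range(size)]
    for x1 in range(max_count + 1):
        for x2 in range(x1 + 1):
            p = joint_count_pmf(x1, x2, mu1, mu2)
            for y1, a in enumerate(c1[x1]):
                if a == 0.0:
                    continue
                for y2, b in enumerate(c2[x2]):
                    g[y1][y2] += p * a * b
    return g

f1 = [0.0, 0.5, 0.5]   # hypothetical severity: 1 or 2 with equal probability
f2 = [0.0, 1.0]        # hypothetical severity: always 1
g = compound_pmf(0.5, 0.2, f1, f2, size=25, max_count=12)
total_mass = sum(map(sum, g))
```

Because neither severity puts mass at zero in this toy example, g[0][0] reduces to the probability of no claims, exp(−μ 1 ). The recursions cited in the next paragraph are a more efficient alternative to this brute-force evaluation.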
Recursions for bivariate count distributions and their compound distributions of the form (16) have been previously considered in the actuarial literature; see Theorem 2.1 in Hesselager (1996). Other similar recursions can be found in Vernic (1997), Walhin and Paris (2000), Walhin and Paris (2001), Sundt (2002), and Sundt and Vernic (2009), among others. Moreover, bivariate recursions are useful in prediction problems involving the conditional distribution g ( y | x ) of Y, given X = x ; see Hesselager (1996) for more details.
Let us now assume that the random variables X 1 and X 2 represent two kinds of claims, for instance bodily injury and material damage, or as in our study, claims below and above a threshold ψ .
The fact that the probability generating function of (1) is analytically obtained helps us to derive the probability generating function of the joint random variable ( X 1 ( d 1 ) , X 2 ( d 2 ) ) for d i , which can be deduced in type i ( i = 1 , 2 ) claim amounts. Here, X i ( d i ) is the random variable corresponding to the yearly frequency of type i claims exceeding d i . The work in Partrat (1994) then showed that the probability generating function of the random variable ( X 1 ( d 1 ) , X 2 ( d 2 ) ) is given by:
$$G_{X_1(d_1), X_2(d_2)}(s_1, s_2) = G_{X_1, X_2}\big( (1 - F_1(d_1))\, s_1 + F_1(d_1),\; (1 - F_2(d_2))\, s_2 + F_2(d_2) \big),$$
where F 1 and F 2 are the cumulative distribution functions of the random variables Y 1 and Y 2 , respectively; while the probability generating function of the random variable X ( d 1 , d 2 ) , with X = X 1 + X 2 , is given by:
$$G_{X(d_1, d_2)}(s_1, s_2) = G_{X_1, X_2}\big( (1 - F_1(d_1))\, s_1 + F_1(d_1),\; (1 - F_2(d_2))\, s_2 + F_2(d_2) \big).$$
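The substitution in the first of these identities can be illustrated numerically. The sketch below makes several assumptions that are not taken from the source: the joint probability generating function used for ( X 1 , X 2 ) is derived here from the factorised form of (1) in the proof of Proposition A1 (it corresponds to X 2 and X 1 − X 2 being independent Poisson variables), the values F 1 ( d 1 ) = 0.4 and F 2 ( d 2 ) = 0.7 are arbitrary, and all function names are hypothetical.

```python
import math

def pgf_counts(s1, s2, mu1, mu2):
    """Joint pgf E[s1^X1 * s2^X2] under the assumed form of (1)."""
    return math.exp((mu1 - mu2) * (s1 - 1.0) + mu2 * (s1 * s2 - 1.0))

def pgf_exceeding(s1, s2, mu1, mu2, F1_d1, F2_d2):
    """pgf of (X1(d1), X2(d2)): substitute (1 - F_i(d_i)) s_i + F_i(d_i)."""
    t1 = (1.0 - F1_d1) * s1 + F1_d1
    t2 = (1.0 - F2_d2) * s2 + F2_d2
    return pgf_counts(t1, t2, mu1, mu2)

def means_from_pgf(pgf, h=1e-5):
    """Marginal means as central-difference partial derivatives at (1, 1)."""
    m1 = (pgf(1.0 + h, 1.0) - pgf(1.0 - h, 1.0)) / (2.0 * h)
    m2 = (pgf(1.0, 1.0 + h) - pgf(1.0, 1.0 - h)) / (2.0 * h)
    return m1, m2

mu1, mu2 = 0.0727, 0.0297      # basic-model estimates from Table 3
F1_d1, F2_d2 = 0.4, 0.7        # hypothetical values of F_1(d_1), F_2(d_2)
mean1, mean2 = means_from_pgf(
    lambda s1, s2: pgf_exceeding(s1, s2, mu1, mu2, F1_d1, F2_d2))
```

Under these assumptions, the means of the thinned counts come out as ( 1 − F 1 ( d 1 ) ) μ 1 and ( 1 − F 2 ( d 2 ) ) μ 2 , as expected when each claim is retained independently with probability 1 − F i ( d i ).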

6. Final Comments

In this paper, a flexible bivariate count data regression model that lets us distinguish between different types of claims according to the claim size was introduced. It also allowed us to examine the factors that affect the number of claims above and below a given claim size threshold. By means of a mixture regression model, the individual claim size and other risk factors such as gender, type of vehicle, driving area, or age of the vehicle could be used to compute credibility bonus-malus premiums. Extensions of this work include a simple modification of the model to differentiate between more than two types of claims, along the lines of the work in Gómez-Déniz and Calderín-Ojeda (2018). In addition, a similar model can be implemented when the number of claims follows a negative binomial distribution. A study of this nature would be a possible extension of this work.

Author Contributions

E.G.-D. and E.C.-O. contributed equally to this work. All authors have read and agreed to the published version of the manuscript.

Funding

E.G.-D. was partially funded by grant ECO2017–85577–P (Ministerio de Economía, Industria y Competitividad. Agencia Estatal de Investigación).

Acknowledgments

The authors wish to acknowledge the Associate Editor and three anonymous referees for the constructive comments that helped to improve the quality of the paper.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

The second partial derivatives are provided by:
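$$
\begin{aligned}
\frac{\partial^2 \ell(\beta_1, \beta_2; x)}{\partial \beta_{1j}^2} &= -\sum_{i=1}^{n} \omega_{ji}^2\, \mu_{1i}, \quad j = 1, \ldots, m, \\
\frac{\partial^2 \ell(\beta_1, \beta_2; x)}{\partial \beta_{1j}\, \partial \beta_{1k}} &= -\sum_{i=1}^{n} \omega_{ji}\, \omega_{ki}\, \mu_{1i}, \quad j \neq k, \\
\frac{\partial^2 \ell(\beta_1, \beta_2; x)}{\partial \beta_{1j}\, \partial \beta_{2j}} &= 0, \quad j = 1, \ldots, m, \\
\frac{\partial^2 \ell(\beta_1, \beta_2; x)}{\partial \beta_{2j}^2} &= -\sum_{i=1}^{n} \left[ \frac{\eta_{ji}}{1 + \exp(\eta_{2i}^{\top} \beta_2)} \right]^2 \left[ \phi(\mu_{1i}, \mu_{2i}, x_{1i}, x_{2i}) \exp(\eta_{2i}^{\top} \beta_2) + \big( x_{1i} - \phi(\mu_{1i}, \mu_{2i}, x_{1i}, x_{2i}) \big) \frac{\mu_{2i}}{\mu_{1i} - \mu_{2i}} \right], \quad j = 1, \ldots, m, \\
\frac{\partial^2 \ell(\beta_1, \beta_2; x)}{\partial \beta_{2j}\, \partial \beta_{2k}} &= -\sum_{i=1}^{n} \frac{\eta_{ji}\, \eta_{ki}}{\big( 1 + \exp(\eta_{2i}^{\top} \beta_2) \big)^2} \left[ \phi(\mu_{1i}, \mu_{2i}, x_{1i}, x_{2i}) \exp(\eta_{2i}^{\top} \beta_2) + \big( x_{1i} - \phi(\mu_{1i}, \mu_{2i}, x_{1i}, x_{2i}) \big) \frac{\mu_{2i}}{\mu_{1i} - \mu_{2i}} \right], \quad j \neq k.
\end{aligned}
$$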
Now, the entries of Fisher’s information matrix (with dimension m × m ) are given by:
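$$
\begin{aligned}
E\left( -\frac{\partial^2 \ell(\hat\beta_1, \hat\beta_2; x)}{\partial \hat\beta_{1j}^2} \right) &= \sum_{i=1}^{n} \omega_{ji}^2\, \hat\mu_{1i}, \\
E\left( -\frac{\partial^2 \ell(\hat\beta_1, \hat\beta_2; x)}{\partial \hat\beta_{1j}\, \partial \hat\beta_{1k}} \right) &= \sum_{i=1}^{n} \omega_{ji}\, \omega_{ki}\, \hat\mu_{1i}, \quad j \neq k, \\
E\left( -\frac{\partial^2 \ell(\hat\beta_1, \hat\beta_2; x)}{\partial \hat\beta_{1j}\, \partial \hat\beta_{2j}} \right) &= 0, \\
E\left( -\frac{\partial^2 \ell(\hat\beta_1, \hat\beta_2; x)}{\partial \hat\beta_{2j}^2} \right) &= \sum_{i=1}^{n} \frac{\hat\mu_{1i}\, \hat\mu_{2i}}{\hat\mu_{1i} - \hat\mu_{2i}} \left[ \frac{\eta_{ji}}{1 + \exp(\eta_{2i}^{\top} \hat\beta_2)} \right]^2, \\
E\left( -\frac{\partial^2 \ell(\hat\beta_1, \hat\beta_2; x)}{\partial \hat\beta_{2j}\, \partial \hat\beta_{2k}} \right) &= \sum_{i=1}^{n} \frac{\hat\mu_{1i}\, \hat\mu_{2i}}{\hat\mu_{1i} - \hat\mu_{2i}}\, \frac{\eta_{ji}\, \eta_{ki}}{\big( 1 + \exp(\eta_{2i}^{\top} \hat\beta_2) \big)^2}, \quad j \neq k,
\end{aligned}
$$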
for j = 1 , … , m , where we have taken into account that E ( ϕ ( μ 1 i , μ 2 i , x 1 i , x 2 i ) ) = 0 . Again, the asymptotic variance-covariance matrix of ( β ^ 1 , β ^ 2 ) is obtained by inverting this information matrix.

Appendix B

Given the vector of complete data x and the vector of missing observations ( δ ˜ 1 , δ ˜ 2 ) = { ( δ ˜ 11 , δ ˜ 21 ) , … , ( δ ˜ 1 n , δ ˜ 2 n ) } , then the complete data log-likelihood takes the form:
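$$
\begin{aligned}
\ell(\beta_1, \beta_2, \gamma_1, \gamma_2) \propto{} & \sum_{i=1}^{n} \Big[ x_{2i} \log(\delta_{2i} \mu_{2i}) + (x_{1i} - x_{2i}) \log(\delta_{1i} \mu_{1i} - \delta_{2i} \mu_{2i}) - \delta_{1i} \mu_{1i} \Big] \\
& + n \gamma_1 \log \gamma_1 + (\gamma_1 - 1) \sum_{i=1}^{n} \log \delta_{1i} - \gamma_1 \sum_{i=1}^{n} \delta_{1i} \\
& + (\gamma_2 - 1) \sum_{i=1}^{n} \log \delta_{2i} + (\gamma_2 - 1) \sum_{i=1}^{n} \log(\delta_{1i} - \delta_{2i}) - (2 \gamma_2 - 1) \sum_{i=1}^{n} \log \delta_{1i} - n \log B(\gamma_2, \gamma_2).
\end{aligned}
\tag{A1}
$$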
Expression (A1) can be divided into two parts: the regressors are included in the first part, and the parameters of the mixing distributions (i.e., γ 1 and γ 2 ) appear only in the second part. Furthermore, to make the model identifiable, we assume without loss of generality that E π 1 ( δ 1 ) = 1 and E π 2 ( δ 2 ) = 1 / 2 . The EM algorithm is based on two steps. The E-step (expectation) fills in the missing data. Once the missing data are filled in, the parameters are estimated in the M-step (maximization). The regressors are estimated by using the pseudo-values E ( δ 1 i | x ˜ 1 , x ˜ 2 ) and E ( δ 2 i | x ˜ 1 , x ˜ 2 ) as offset variables and then fitting the regression model given in (6). Then, to estimate the parameters γ 1 and γ 2 , we maximize the log-likelihood of the mixing distributions, replacing the missing observations with their expectations. Next, if some terminating condition is satisfied, the iteration stops; otherwise, the algorithm returns to the E-step.
From the current estimates ( β ^ 1 ( k ) , β ^ 2 ( k ) , γ ^ 1 ( k ) , γ ^ 2 ( k ) ) after the k th iteration, the new estimates are obtained as follows:
E-step: 
Consider:
$$n(\delta_{1i}, \delta_{2i}, \mu_{1i}, \mu_{2i}) = \frac{m(\delta_{1i}, \delta_{2i}, \mu_{1i}, \mu_{2i})}{\int_0^{\infty} \int_0^{\delta_{1i}} m(\delta_{1i}, \delta_{2i}, \mu_{1i}, \mu_{2i})\, d\delta_{2i}\, d\delta_{1i}},$$
where:
$$m(\delta_{1i}, \delta_{2i}, \mu_{1i}, \mu_{2i}) = (\delta_{2i} \mu_{2i})^{x_{2i}}\, (\delta_{1i} \mu_{1i} - \delta_{2i} \mu_{2i})^{x_{1i} - x_{2i}} \exp(-\delta_{1i} \mu_{1i})\, \pi_1(\delta_{1i})\, \pi_2(\delta_{2i}).$$
For all i = 1 , 2 , , n , we calculate:
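$$
\begin{aligned}
c_i &= E(\delta_{1i} \mid x) = \int_0^{\infty} \int_0^{\delta_{1i}} \delta_{1i}\, n(\delta_{1i}, \delta_{2i}, \mu_{1i}, \mu_{2i})\, d\delta_{2i}\, d\delta_{1i}, \\
d_i &= E(\log \delta_{1i} \mid x) = \int_0^{\infty} \int_0^{\delta_{1i}} \log(\delta_{1i})\, n(\delta_{1i}, \delta_{2i}, \mu_{1i}, \mu_{2i})\, d\delta_{2i}\, d\delta_{1i}, \\
m_i &= E(\delta_{2i} \mid x) = \int_0^{\infty} \int_0^{\delta_{1i}} \delta_{2i}\, n(\delta_{1i}, \delta_{2i}, \mu_{1i}, \mu_{2i})\, d\delta_{2i}\, d\delta_{1i}, \\
n_i &= E(\log \delta_{2i} \mid x) = \int_0^{\infty} \int_0^{\delta_{1i}} \log(\delta_{2i})\, n(\delta_{1i}, \delta_{2i}, \mu_{1i}, \mu_{2i})\, d\delta_{2i}\, d\delta_{1i}, \\
s_i &= E(\log(\delta_{1i} - \delta_{2i}) \mid x) = \int_0^{\infty} \int_0^{\delta_{1i}} \log(\delta_{1i} - \delta_{2i})\, n(\delta_{1i}, \delta_{2i}, \mu_{1i}, \mu_{2i})\, d\delta_{2i}\, d\delta_{1i}.
\end{aligned}
$$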
M-step: 
This step works as follows:
  • Update the regressors β ^ j ( k + 1 ) , j = 1 , 2 , using the pseudo-values c i and m i as offset variables by fitting the regression model given in (6), and then,
  • Update the estimate of the parameters γ ^ 1 ( k + 1 ) and γ ^ 2 ( k + 1 ) by using:
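    $$\hat\gamma_1^{(k+1)} = \exp\left( \frac{1}{n} \sum_{i=1}^{n} c_i + \psi\big(\hat\gamma_1^{(k)}\big) - 1 - \frac{1}{n} \sum_{i=1}^{n} d_i \right), \qquad \hat\gamma_2^{(k+1)} = \frac{1}{2}\, \psi^{-1}\left( \frac{1}{n} \sum_{i=1}^{n} n_i + \frac{1}{n} \sum_{i=1}^{n} s_i - 2\, \frac{1}{n} \sum_{i=1}^{n} d_i \right),$$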
    where ψ ( · ) is the digamma function.
Stop iterating if some terminating condition is satisfied.
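A single M-step update of γ 1 can be sketched as follows, reading the update as γ ^ 1 ( k + 1 ) = exp ( mean of c i + ψ ( γ ^ 1 ( k ) ) − 1 − mean of d i ), where ψ ( · ) is the digamma function. This is only an illustration under that reading: the pseudo-values c i and d i are taken as given (the double integrals of the E-step are not evaluated here), the helper digamma is a hypothetical stand-in based on a standard recurrence and asymptotic expansion because the Python standard library does not provide one, and update_gamma1 is a hypothetical name.

```python
import math

def digamma(x):
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:              # shift upwards: psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # ln x - 1/(2x) - 1/(12 x^2) + 1/(120 x^4) - 1/(252 x^6)
    result += (math.log(x) - 0.5 * inv
               - inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0)))
    return result

def update_gamma1(gamma1, c, d):
    """One update of gamma_1 given pseudo-values c_i and d_i."""
    n = len(c)
    return math.exp(sum(c) / n + digamma(gamma1) - 1.0 - sum(d) / n)

# Consistency check with hypothetical pseudo-values: if the c_i average 1 and
# the d_i average psi(g0) - log(g0), the update leaves g0 unchanged.
g0 = 15.9
c = [1.0] * 4
d = [digamma(g0) - math.log(g0)] * 4
g_next = update_gamma1(g0, c, d)
```

The check reflects the fact that a gamma mixing density with mean one and shape γ has E ( δ ) = 1 and E ( log δ ) = ψ ( γ ) − log γ . In the algorithm itself the update is applied once per EM iteration, with c i and d i recomputed in each E-step.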
The following result concerns the concept of multivariate log-concavity, which was introduced by Bapat (1988). See also Johnson et al. (1997).
Proposition A1.
The probability function given in (1) is generalized log-concave.
Proof. 
To see this, observe that (1) can be rewritten as:
$$\Pr(X_1 = x_1, X_2 = x_2) = m(x, \Theta) \prod_{i=1}^{2} f_i(x_i),$$
where:
$$m(x, \Theta) = \frac{x_1!\, (\mu_1 - \mu_2)^{x_1 - x_2} \exp(-\mu_1)}{\mu_1^{x_1}\, (x_1 - x_2)!}, \qquad f_i(x_i) = \frac{\mu_i^{x_i}}{x_i!}.$$
Since the $f_i(x_i)$ are log-concave functions ($f_i(x_i)^2 \geq f_i(x_i - 1)\, f_i(x_i + 1)$, $i = 1, 2$, $x_i = 1, 2, \ldots$), the result follows by applying Theorem 3 in Bapat (1988). ☐
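As a numerical illustration of the factorisation used in this proof, the sketch below evaluates the probability function as the product m ( x , Θ ) f 1 ( x 1 ) f 2 ( x 2 ) and converts it into expected frequencies. The function names are hypothetical; the parameter values are the basic-model estimates reported in Table 3 for ψ = $ 1000 , and the portfolio size is the total shown in Table 1.

```python
import math

def f(x, mu):
    """Factor f_i(x_i) = mu_i^{x_i} / x_i!."""
    return mu ** x / math.factorial(x)

def joint_pmf(x1, x2, mu1, mu2):
    """Pr(X1 = x1, X2 = x2) = m(x, Theta) * f_1(x1) * f_2(x2), 0 <= x2 <= x1."""
    if x2 < 0 or x2 > x1:
        return 0.0
    m = (math.factorial(x1) * (mu1 - mu2) ** (x1 - x2) * math.exp(-mu1)
         / (mu1 ** x1 * math.factorial(x1 - x2)))
    return m * f(x1, mu1) * f(x2, mu2)

n_policies = 67856
mu1, mu2 = 0.0727, 0.0297

# Expected frequencies for the cells with at most four claims.
expected = {(x1, x2): n_policies * joint_pmf(x1, x2, mu1, mu2)
            for x1 in range(5) for x2 in range(x1 + 1)}

# Log-concavity of the factor f_i, checked on a finite range.
log_concave = all(f(x, mu) ** 2 >= f(x - 1, mu) * f(x + 1, mu)
                  for mu in (mu1, mu2) for x in range(1, 20))
```

With these rounded estimates, the cells (1, 0), (1, 1) and (2, 1) come out at roughly 2713.2, 1874.0 and 80.6, which appears to agree with the first set of expected frequencies in Table 1 up to rounding.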
The next result shows that the proposed distribution is strongly unimodal (see Barndorff-Nielsen 1973 and Pedersen 1975).
Proposition A2.
The probability function given in (1) is strongly unimodal.
Proof. 
Taking into account that for $x_1 = 1, 2, \ldots$ and $x_2 = 1, \ldots, x_1$, it is verified that:
$$
\begin{aligned}
\frac{p_{x_1, x_2}\; p_{x_1 - 1, x_2 - 1}}{p_{x_1 - 1, x_2}\; p_{x_1, x_2 - 1}} &= 1 + \frac{1}{x_1 - x_2} \geq 1, \\
\frac{p_{x_1, x_2}\; p_{x_1 - 1, x_2}}{p_{x_1, x_2 + 1}\; p_{x_1 - 1, x_2 - 1}} &= 1 + \frac{1}{x_2} \geq 1, \\
\frac{p_{x_1, x_2}\; p_{x_1, x_2 - 1}}{p_{x_1 + 1, x_2}\; p_{x_1 - 1, x_2 - 1}} &= 1 + \frac{1}{x_1 - x_2} \geq 1,
\end{aligned}
$$
where $p_{x_1, x_2} = \Pr(X_1 = x_1, X_2 = x_2)$, we obtain the result by applying Condition (b) of Theorem 1 in Pedersen (1975). ☐

References

  1. Bapat, Ravindra B. 1988. Discrete multivariate distributions and generalized log-concavity. Sankhyā: The Indian Journal of Statistics, Series A 1: 98–110.
  2. Barndorff-Nielsen, Ole. 1973. Unimodality and exponential families. Communications in Statistics-Theory and Methods 1: 189–216.
  3. Bermúdez, Lluís. 2009. A priori ratemaking using bivariate Poisson regression models. Insurance: Mathematics and Economics 44: 135–41.
  4. Bermúdez, Lluís, and Dimitris Karlis. 2017. A priori ratemaking using bivariate Poisson models. Scandinavian Actuarial Journal 2: 148–58.
  5. Bonsdorff, Heikki. 2005. On asymptotic properties of Bonus-Malus systems based on the number and on the size of the claims. Scandinavian Actuarial Journal 4: 309–20.
  6. Bühlmann, Hans, and Alois Gisler. 2005. A Course in Credibility Theory and Its Applications. Berlin: Springer.
  7. Cameron, Colin, and Pravin K. Trivedi. 1998. Regression Analysis of Count Data. Cambridge: Cambridge University Press.
  8. De Jong, Piet, and Gillian H. Heller. 2008. Generalized Linear Models for Insurance Data. Cambridge: Cambridge University Press.
  9. Denuit, Michel, Xavier Maréchal, Sandra Pitrebois, and Jean F. Walhin. 2009. Actuarial Modelling of Claim Counts: Risk Classification, Credibility and Bonus-Malus Systems. New York: John Wiley & Sons.
  10. Dionne, Georges, and Charles Vanasse. 1989. A generalization of actuarial automobile insurance rating models: The negative binomial distribution with a regression component. ASTIN Bulletin 19: 199–212.
  11. Frangos, Nikolaos, and Spyridon Vrontos. 2001. Design of optimal bonus-malus systems with a frequency and a severity component on an individual basis in automobile insurance. ASTIN Bulletin 31: 1–22.
  12. Gerber, Hans U. 1979. An Introduction to Mathematical Risk Theory. Homewood: Huebner Foundation Monograph.
  13. Gómez-Déniz, Emilio. 2008. A generalization of the credibility theory obtained by using the weighted balanced loss function. Insurance: Mathematics and Economics 42: 850–54.
  14. Gómez-Déniz, Emilio. 2016. Bivariate credibility bonus-malus premiums distinguishing between two types of claims. Insurance: Mathematics and Economics 70: 117–24.
  15. Gómez-Déniz, Emilio, and Enrique Calderín-Ojeda. 2018. Multivariate credibility in bonus-malus systems distinguishing between different types of claims. Risks 6: 34.
  16. Gómez-Déniz, Emilio, Agustín Hernández, and María P. Fernández. 2014. Computing credibility bonus-malus premiums using the total claim amount distribution. Hacettepe Journal of Mathematics and Statistics 43: 1047–61.
  17. Heilmann, Wolf R. 1989. Decision theoretic foundations of credibility theory. Insurance: Mathematics and Economics 8: 75–95.
  18. Hesselager, Ole. 1996. Recursions for certain bivariate counting distributions and their compound distributions. ASTIN Bulletin 26: 35–52.
  19. Johnson, Norman, Samuel Kotz, and Narayanaswamy Balakrishnan. 1997. Discrete Multivariate Distributions. New York: Wiley.
  20. Khatri, Chinubhai G. 1983a. Multivariate discrete exponential distributions and their characterization by Rao-Rubin condition for additive damage model. South African Statistical Journal 17: 13–32.
  21. Khatri, Chinubhai G. 1983b. Multivariate discrete exponential family of distributions. Communications in Statistics-Theory and Methods 12: 877–93.
  22. Klugman, Stuart A., Harry H. Panjer, and Gordon E. Willmot. 2008. Loss Models: From Data to Decisions, 3rd ed. New York: Wiley.
  23. Kocherlakota, Subrahmaniam, and Kathleen Kocherlakota. 1992. Bivariate Discrete Distributions. New York: Marcel Dekker.
  24. Lee, Mei-Ling T. 1996. Properties and applications of the Sarmanov family of bivariate distributions. Communications in Statistics-Theory and Methods 25: 1207–22.
  25. Louis, Thomas A. 1982. Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society. Series B 44: 226–33.
  26. Partrat, Christian. 1994. Compound model for two dependent kinds of claims. Insurance: Mathematics and Economics 15: 219–31.
  27. Pedersen, Jean G. 1975. On strong unimodality of two-dimensional discrete distributions with applications to M-Ancillarity. Scandinavian Journal of Statistics 2: 99–102.
  28. Pinquet, Jean. 1998. Designing optimal bonus-malus systems from different types of claims. ASTIN Bulletin 28: 205–20.
  29. Ragulina, Olena. 2011. Bonus-malus systems with different claim types and varying deductibles. Modern Stochastics: Theory and Applications 4: 141–59.
  30. Rolski, Tomasz, Hanspeter Schmidli, Volker Schmidt, and Jozef Teugels. 1999. Stochastic Processes for Insurance and Finance. Hoboken: John Wiley & Sons.
  31. Sundt, Bjoern. 2002. Recursive evaluation of aggregate claims distributions. Insurance: Mathematics and Economics 30: 297–322.
  32. Sundt, Bjoern, and Raluca Vernic. 2009. Recursions for Convolutions and Compound Distributions with Insurance Applications. New York: Springer.
  33. Vernic, Raluca. 1997. On the bivariate generalized Poisson distribution. ASTIN Bulletin 27: 23–31.
  34. Walhin, Jean F., and John Paris. 2000. Recursive formulae for some bivariate counting distributions obtained by the trivariate reduction method. ASTIN Bulletin 30: 141–55.
  35. Walhin, Jean F., and John Paris. 2001. The mixed bivariate Hofmann distribution. ASTIN Bulletin 31: 127–42.
Figure 1. Joint probability mass functions of the bivariate discrete distribution proposed for selected values of the parameters. From top to bottom and left to right, we have: ( μ 1 , μ 2 ) = ( 0.5 , 0.25 ) , ( μ 1 , μ 2 ) = ( 5 , 0.5 ) , ( μ 1 , μ 2 ) = ( 5 , 2 ) , and ( μ 1 , μ 2 ) = ( 10 , 8 ) .
Figure 2. QQ-plots of the randomized quantile residuals for the basic (top) and mixture (bottom) regression models for ψ = $ 1000 (left) and ψ = $ 3000 (right) threshold values.
Figure 3. Bayesian bonus-malus premiums under the mixture model without covariates for x 1 claims when there are x 2 claims with a claim size larger than ψ . The thick line represents ψ = $ 1000 , and the thin line represents ψ = $ 3000 . BMP, bonus-malus premiums.
Figure 4. Bayesian bonus-malus premiums under the mixture model with covariates for x 1 claims when there are x 2 claims with a claim size larger than ψ . The thick line represents ψ = $ 1000 , and the thin line represents ψ = $ 3000 . This chart corresponds to the subgroup of policyholders with SEDAN and AREAA.
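Table 1. Observed and expected frequencies for threshold value ψ = $ 1000 . For each value of X 1 , the first row gives the observed counts and the two following rows give the expected frequencies.
| X 1 | Frequency | X 2 = 0 | X 2 = 1 | X 2 = 2 | X 2 = 3 | X 2 = 4 | Total |
|---|---|---|---|---|---|---|---|
| 0 | Observed | 63,232 | | | | | 63,232 |
| | Expected | 63,098.00 | | | | | 63,098.00 |
| | Expected | 63,279.50 | | | | | 63,279.50 |
| 1 | Observed | 2551 | 1782 | | | | 4333 |
| | Expected | 2713.21 | 1874.01 | | | | 4587.22 |
| | Expected | 2518.34 | 1768.19 | | | | 4286.53 |
| 2 | Observed | 109 | 114 | 48 | | | 271 |
| | Expected | 58.33 | 80.58 | 27.83 | | | 166.74 |
| | Expected | 101.75 | 116.10 | 54.15 | | | 272 |
| 3 | Observed | 5 | 6 | 6 | 1 | | 18 |
| | Expected | 0.83 | 1.73 | 1.20 | 0.27 | | 4.03 |
| | Expected | 4.26 | 6.14 | 4.66 | 1.81 | | 16.87 |
| 4 | Observed | 1 | 0 | 0 | 1 | 0 | 2 |
| | Expected | 0.01 | 0.02 | 0.01 | 0.02 | 0.01 | 0.07 |
| | Expected | 0.18 | 0.31 | 0.29 | 0.18 | 0.06 | 1.02 |
| Total | Observed | 65,110 | 1902 | 54 | 2 | 0 | 67,856 |
| | Expected | 65,870.38 | 1956.34 | 29.04 | 0.29 | 0.01 | 67,856.06 |
| | Expected | 65,904.03 | 1890.74 | 59.10 | 1.99 | 0.06 | 67,856.00 |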
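Table 2. Observed and expected frequencies for threshold value ψ = $ 3000 . For each value of X 1 , the first row gives the observed counts and the two following rows give the expected frequencies.
| X 1 | Frequency | X 2 = 0 | X 2 = 1 | X 2 = 2 | X 2 = 3 | X 2 = 4 | Total |
|---|---|---|---|---|---|---|---|
| 0 | Observed | 63,232 | | | | | 63,232 |
| | Expected | 63,098.00 | | | | | 63,098.00 |
| | Expected | 63,279.50 | | | | | 63,279.50 |
| 1 | Observed | 3576 | 757 | | | | 4333 |
| | Expected | 3817.42 | 769.79 | | | | 4587.21 |
| | Expected | 3554.25 | 732.28 | | | | 4286.53 |
| 2 | Observed | 216 | 44 | 11 | | | 271 |
| | Expected | 115.48 | 46.57 | 4.69 | | | 166.74 |
| | Expected | 198.16 | 54.75 | 19.09 | | | 272 |
| 3 | Observed | 12 | 4 | 2 | 0 | | 18 |
| | Expected | 2.33 | 1.41 | 0.28 | 0.01 | | 4.03 |
| | Expected | 11.13 | 3.47 | 1.62 | 0.64 | | 16.86 |
| 4 | Observed | 2 | 0 | 0 | 0 | 0 | 2 |
| | Expected | 0.03 | 0.03 | 0.01 | 0.00 | 0.00 | 0.07 |
| | Expected | 0.63 | 0.21 | 0.11 | 0.06 | 0.02 | 1.03 |
| Total | Observed | 67038 | 805 | 13 | 0 | 0 | 67,856 |
| | Expected | 67,033.26 | 815.80 | 4.98 | 0.01 | 0.00 | 67,856.05 |
| | Expected | 67,043.67 | 790.71 | 20.82 | 0.70 | 0.02 | 67,856.00 |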
Table 3. Parameter estimates (in brackets) and measures of model selection for the basic and mixture models without covariates.
| | ψ = $ 1000 , Basic Model | ψ = $ 1000 , Mixture Model | ψ = $ 3000 , Basic Model | ψ = $ 3000 , Mixture Model |
|---|---|---|---|---|
| μ ^ 1 | 0.0727 (0.001) | 0.0727 (0.000) | 0.0727 (0.001) | 0.0727 (0.000) |
| μ ^ 2 | 0.0297 (0.000) | 0.0297 (0.000) | 0.0122 (0.000) | 0.0123 (0.000) |
| γ ^ 1 | | 15.900 (0.000) | | 15.900 (0.000) |
| γ ^ 2 | | 4.334 (0.000) | | 2.035 (0.000) |
| ℓ max | −21,346.561 | −21,292.395 | −20,301.926 | −20,242.391 |
| χ 2 | >100 | 5.16 | >100 | 2.09 |
| d.f. | 4 | 2 | 3 | 1 |
| p-value | 0.00% | 7.58% | 0.00% | 14.83% |
Table 4. Parameter estimates and p-values associated with the Wald test for the mixture model including covariates.
| Parameter | ψ = 1000 , X 1 : Estimate | p-Value | ψ = 1000 , X 2 : Estimate | p-Value | ψ = 3000 , X 1 : Estimate | p-Value | ψ = 3000 , X 2 : Estimate | p-Value |
|---|---|---|---|---|---|---|---|---|
| GENDER | −0.015 | 0.613 | 0.105 | 0.090 | −0.022 | 0.467 | 0.220 | 0.007 |
| BUS | 0.244 | 0.610 | −0.384 | 0.558 | 1.005 | 0.002 | −1.386 | 0.208 |
| CONVT | −0.562 | 0.342 | −0.406 | 0.738 | −0.525 | 0.364 | 0.890 | 0.461 |
| COUPE | 0.503 | 0.000 | 0.204 | 0.401 | 0.489 | 0.000 | 0.007 | 0.982 |
| HDTOP | 0.208 | 0.024 | −0.427 | 0.026 | 0.181 | 0.049 | −0.682 | 0.011 |
| MCARA | 0.766 | 0.003 | −1.291 | 0.050 | 0.668 | 0.011 | −1.495 | 0.152 |
| MIBUS | 0.098 | 0.514 | 0.292 | 0.342 | 0.018 | 0.905 | −0.501 | 0.234 |
| PANVN | 0.124 | 0.335 | −0.286 | 0.272 | 0.132 | 0.299 | −0.415 | 0.223 |
| RDSTR | 0.131 | 0.856 | −0.278 | 0.823 | 0.318 | 0.624 | −1.143 | 0.672 |
| SEDAN | 0.063 | 0.098 | −0.148 | 0.055 | 0.058 | 0.128 | −0.348 | 0.001 |
| STNWG | 0.124 | 0.002 | −0.150 | 0.076 | 0.107 | 0.010 | −0.471 | 0.000 |
| TRUCK | 0.055 | 0.570 | −0.040 | 0.835 | 0.056 | 0.560 | −0.506 | 0.050 |
| UTE | −0.100 | 0.152 | −0.054 | 0.699 | −0.111 | 0.110 | −0.271 | 0.126 |
| AREAA | −0.010 | 0.885 | −0.194 | 0.152 | −0.064 | 0.343 | −0.108 | 0.001 |
| AREAB | 0.050 | 0.472 | −0.207 | 0.132 | −0.005 | 0.938 | −0.571 | 0.004 |
| AREAC | 0.007 | 0.920 | −0.293 | 0.027 | −0.053 | 0.421 | −0.496 | 0.035 |
| AREAD | −0.110 | 0.144 | −0.139 | 0.352 | −0.171 | 0.021 | −0.345 | 0.003 |
| AREAE | −0.037 | 0.641 | −0.125 | 0.420 | −0.093 | 0.228 | −0.572 | 0.293 |
| VAGE1 | 0.187 | 0.000 | −0.388 | 0.000 | 0.168 | 0.000 | −0.271 | 0.000 |
| VAGE2 | 0.219 | 0.000 | −0.259 | 0.001 | 0.207 | 0.000 | −0.619 | 0.009 |
| VAGE3 | 0.098 | 0.013 | −0.010 | 0.208 | 0.083 | 0.035 | −0.275 | 0.283 |
| AGE1 | 0.512 | 0.000 | 0.291 | 0.034 | 0.464 | 0.000 | 0.746 | 0.000 |
| AGE2 | 0.328 | 0.000 | 0.032 | 0.795 | 0.286 | 0.000 | 0.274 | 0.118 |
| AGE3 | 0.275 | 0.000 | 0.039 | 0.746 | 0.229 | 0.000 | 0.273 | 0.111 |
| AGE4 | 0.243 | 0.000 | −0.043 | 0.723 | 0.202 | 0.001 | 0.196 | 0.253 |
| AGE5 | 0.030 | 0.656 | −0.044 | 0.740 | −0.013 | 0.843 | −0.002 | 0.990 |
| CONSTANT | −2.273 | 0.000 | 0.027 | 0.880 | −2.156 | 0.000 | −1.045 | 0.000 |
| γ ^ 1 | 21.602 | 0.000 | | | 30.718 | 0.000 | | |
| γ ^ 2 | 5.903 | 0.185 | | | 2.205 | 0.014 | | |
Table 5. Parameter estimates (in brackets) and measures of model selection for the basic and mixture models with covariates.
| | ψ = $ 1000 , Basic Model | ψ = $ 1000 , Mixture Model | ψ = $ 3000 , Basic Model | ψ = $ 3000 , Mixture Model |
|---|---|---|---|---|
| ℓ max | 20,604.355 | 20,588.936 | 19,545.212 | 19,511.783 |
| AIC | 41,312.710 | 41,289.872 | 39,198.422 | 39,135.565 |
| BIC | 41,809.468 | 41,800.880 | 39,691.180 | 39,646.573 |
| CAIC | 41,863.468 | 41,856.880 | 39,745.180 | 39,702.573 |

Share and Cite

MDPI and ACS Style

Gómez-Déniz, E.; Calderín-Ojeda, E. A Survey of the Individual Claim Size and Other Risk Factors Using Credibility Bonus-Malus Premiums. Risks 2020, 8, 20. https://doi.org/10.3390/risks8010020
