Article

Modeling and Performance of Bonus-Malus Systems: Stationarity versus Age-Correction

Department of Mathematics, Aarhus University, Ny Munkegade, Aarhus C 8000, Denmark
Risks 2014, 2(1), 49-73; https://doi.org/10.3390/risks2010049
Submission received: 30 September 2013 / Revised: 17 February 2014 / Accepted: 18 February 2014 / Published: 11 March 2014
(This article belongs to the Special Issue Application of Stochastic Processes in Insurance)

Abstract

In a bonus-malus system in car insurance, the bonus class of a customer is updated from one year to the next as a function of the current class and the number of claims in the year (assumed Poisson). Thus the sequence of classes of a customer in consecutive years forms a Markov chain, and most of the literature measures performance of the system in terms of the stationary characteristics of this Markov chain. However, the rate of convergence to stationarity may be slow in comparison to the typical sojourn time of a customer in the portfolio. We suggest an age-correction to the stationary distribution and present an extensive numerical study of its effects. An important feature of the modeling is a Bayesian view, where the Poisson rate according to which claims are generated for a customer is the outcome of a random variable specific to the customer.

1. Introduction

In the classical actuarial model for bonus-malus systems in automobile insurance (Denuit et al. [1] or Lemaire [2], for example), there is a finite set of bonus classes $\ell = 1, \ldots, K$. A customer having $n$ claims and bonus class $\ell$ in a given year has bonus class $b(\ell, n)$ in the next for some deterministic function $b$ (the bonus rule; claim sizes are ignored and only claim numbers counted). The customer has a risk parameter λ, such that the numbers of claims $N_0, N_1, \ldots$ in consecutive years are i.i.d. Poisson($\lambda$), and so the sequence $L_0, L_1, \ldots$ of bonus classes is a time-homogeneous Markov chain with transition matrix $P(\lambda) = \big(p_{k\ell}(\lambda)\big)_{k,\ell = 1, \ldots, K}$ where
$$p_{k\ell}(\lambda) = \sum_{n=0}^{\infty} e^{-\lambda}\, \frac{\lambda^n}{n!}\, I\big(b(k, n) = \ell\big)$$
(such a customer we denote a λ-customer). Customers pay premium $r_\ell$ when in class $\ell$ and enter the system in some fixed class $\ell_0$. The $r_\ell$ and $\ell_0$ may be chosen according to certain optimality and/or financial equilibrium principles (see below) or arbitrarily.
For a simple example of a bonus rule, consider the $-1/{+2}$ rule. Here each claim causes the bonus level to increase by 2, whereas it decreases by 1 for each claim-free year (obvious boundary modifications apply to levels $1$, $K-1$, $K$). The systems in use are often substantially more detailed, with $K$ of order 15–25.
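As an illustration of how the transition matrix follows from a bonus rule, the sketch below builds $P(\lambda)$ for the $-1/{+2}$ rule. The function names, the clipping of classes to $1, \ldots, K$ at the boundaries, and the truncation of the Poisson sum at `n_max` claims are assumptions made for this example, not choices taken from the paper.

```python
import math

def poisson_pmf(n, lam):
    """Probability of n claims in a year for a customer with Poisson rate lam."""
    return math.exp(-lam) * lam ** n / math.factorial(n)

def bonus_rule(level, n_claims, K):
    """-1/+2 rule: one class down after a claim-free year, two classes up per claim.
    Clipping to 1..K is an assumed reading of the 'obvious boundary modifications'."""
    if n_claims == 0:
        return max(1, level - 1)
    return min(K, level + 2 * n_claims)

def transition_matrix(lam, K, rule=bonus_rule, n_max=60):
    """P[k-1][l-1] = sum over n of Poisson(n; lam) * 1{rule(k, n) == l}."""
    P = [[0.0] * K for _ in range(K)]
    for k in range(1, K + 1):
        for n in range(n_max + 1):
            P[k - 1][rule(k, n, K) - 1] += poisson_pmf(n, lam)
    return P

# Hypothetical small system: K = 6 classes, claim rate 0.1 per year.
P = transition_matrix(lam=0.1, K=6)
```

Each row of `P` is the distribution of next year's class given this year's class; for example, `P[0][2]` is the probability of moving from class 1 to class 3 after exactly one claim.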
Much of the discussion in the literature employs stationarity modeling, measuring characteristics of the system via the stationary distribution $\pi(\lambda) = \big(\pi_\ell(\lambda)\big)_{\ell = 1, \ldots, K}$ (existing under the weak assumptions of irreducibility and aperiodicity). In particular, the average premium $\bar r(\lambda)$ of a customer with risk parameter λ is defined by
$$\bar r(\lambda) = \sum_{\ell=1}^{K} r_\ell\, \pi_\ell(\lambda) = \sum_{\ell=1}^{K} r_\ell \cdot \lim_{n\to\infty} p^{n}_{\ell_0 \ell}(\lambda) \tag{1}$$
where the $p^{n}_{k\ell}(\lambda)$ are the $n$-step transition probabilities. From the $\bar r(\lambda)$, one often proceeds to calculate the Loimaranta efficiency
$$e(\lambda) = \frac{d \log \bar r(\lambda)}{d \log \lambda} = \frac{\lambda\, \bar r'(\lambda)}{\bar r(\lambda)}$$
at λ (denoted elasticity outside the actuarial sciences); it measures to which extent $\bar r(\lambda)$ is linear at λ (as should ideally be the case), with $e(\lambda) = 1$ expressing ‘local linearity’ at λ.
Such stationary performance measures are only meaningful if the Markov chain $L$ attains (approximate) stationarity within the typical time a customer spends in the portfolio. For this reason much attention has been given to studying the approach of the $p^{n}_{\ell_0 \cdot}(\lambda)$ to $\pi(\lambda)$. The rate of convergence is known to be geometric, with decay parameter the second largest eigenvalue of the transition matrix $P(\lambda)$. However, this is an asymptotic result and so the studies are most often numerical, depicting for example the mean annual premium $E_\lambda r_{L_n}$ or the total variation (t.v.) distance
$$d_{TV}\big(p^{n}_{\ell_0 \cdot}(\lambda), \pi(\lambda)\big) = \sum_{\ell=1}^{K} \big| p^{n}_{\ell_0 \ell}(\lambda) - \pi_\ell(\lambda) \big| \tag{2}$$
as function of $n$ (see, for example, Denuit et al. [1], p. 183ff). The results are sometimes encouraging: for some bonus systems, $d_{TV}\big(p^{n}_{\ell_0 \cdot}(\lambda), \pi(\lambda)\big) = 0$ already for $n$ = 4–6. However, these are typically simple-minded systems, and for the more realistic ones, one often sees a substantial value of the t.v. distance for say $n = 30$, a value exceeding the time span a customer can be expected to stay in the portfolio. Nevertheless, the studies of the effects of the sojourn time $A$ in the portfolio being finite are remarkably few, with Borgan et al. [3] being the main exception. One purpose of this paper is to go deeper into this direction and to formulate an alternative (which we call age-correction) to the stationarity point of view.
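The t.v. distance (2) is easy to trace year by year. The sketch below does so for a hypothetical $-1/{+2}$ system with $K = 10$ classes entered in class 10; these parameters, the clipping at the boundaries and the approximation of the stationary distribution by a long run of the chain are all assumptions of the example.

```python
import math

def transition_matrix(lam, K, n_max=40):
    """-1/+2 rule, clipped to classes 1..K (an assumed boundary convention)."""
    P = [[0.0] * K for _ in range(K)]
    for k in range(1, K + 1):
        for n in range(n_max + 1):
            nxt = max(1, k - 1) if n == 0 else min(K, k + 2 * n)
            P[k - 1][nxt - 1] += math.exp(-lam) * lam ** n / math.factorial(n)
    return P

def step(dist, P):
    """Row vector times matrix: the class distribution one year later."""
    K = len(P)
    return [sum(dist[k] * P[k][l] for k in range(K)) for l in range(K)]

def tv_curve(lam, K, start, years, burn_in=5000):
    """Distance (2) between the year-n distribution and the stationary one, n = 0..years."""
    P = transition_matrix(lam, K)
    dist = [1.0 if l == start - 1 else 0.0 for l in range(K)]
    pi = dist[:]
    for _ in range(burn_in):      # crude approximation of the stationary distribution
        pi = step(pi, P)
    curve = []
    for _ in range(years + 1):
        # sum of absolute differences, without a factor 1/2, as in (2)
        curve.append(sum(abs(d - p) for d, p in zip(dist, pi)))
        dist = step(dist, P)
    return curve

curve = tv_curve(lam=0.1, K=10, start=10, years=30)
```

Plotting `curve` against the year index gives the kind of convergence picture discussed in the text; the distance can never increase from one year to the next.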
Bonus-malus systems may be seen as an example of experience rating, which aims to calculate the premium on an individual basis by using the information available to the company. In the automobile insurance setting, we ignore in this paper profit, administration costs etc., and take the average claim size equal to 1, so that for a given year $m$, the company would want to compute its net premium $c_m$ for year $m$ as its best guess of the customer’s λ as a function of the numbers $N_0, N_1, \ldots, N_{m-1}$ of claims filed in years $0, 1, \ldots, m-1$. The naive guess is of course the average $\hat\lambda_m = (N_0 + N_1 + \cdots + N_{m-1})/m$. However, a high value of $\hat\lambda_m$ could be due to bad luck of an otherwise good driver, and a low value to good luck of an otherwise bad driver. An estimate which is more fair to the customer is therefore obtained by a Bayesian view where one involves information on the population of customers in form of a prior distribution, say $U$, of λ, views the particular customer’s λ as the outcome of a r.v. with distribution $U$, calculates the posterior distribution $U_m$ and takes $c_m$ as the mean of $U_m$, the Bayes premium.
Example 1. An often considered choice for the prior $U$ is a Gamma($b$, $\mu$) with density $\mu (\mu u)^{b-1} e^{-\mu u} / \Gamma(b)$. This can be motivated by negative binomial fitting (e.g., Denuit et al. [1], p. 28), but is also mathematically convenient since then the posterior $U_m$ is again Gamma with parameters
$$b_m = b + N_0 + N_1 + \cdots + N_{m-1}, \qquad \mu_m = \mu + m$$
and one gets the Bayes premium as the posterior mean
$$\frac{b_m}{\mu_m} = \frac{b + N_0 + N_1 + \cdots + N_{m-1}}{\mu + m} = \frac{\mu}{\mu + m} \cdot \frac{b}{\mu} + \frac{m}{\mu + m} \cdot \hat\lambda_m$$
This has a neat interpretation as a weighted average of the population mean $b/\mu$ (the premium the company will charge without access to claims statistics) and the mean $\hat\lambda_m$ of the claims, with the weight $m/(\mu + m)$ of $\hat\lambda_m$ increasing to 1 as years go by and the information on the customer accumulates.  ☐
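The two expressions for the Bayes premium in Example 1 can be checked against each other numerically. In the sketch below the prior parameters ($b = 1$, $\mu = 10$, i.e., prior mean 0.1) and the claim history are made-up illustration values.

```python
def bayes_premium(b, mu, claims):
    """Posterior mean b_m / mu_m under a Gamma(b, mu) prior and yearly Poisson claim counts."""
    m = len(claims)
    return (b + sum(claims)) / (mu + m)

def credibility_form(b, mu, claims):
    """The same premium written as a weighted average of prior mean and claim average."""
    m = len(claims)
    lam_hat = sum(claims) / m
    weight = m / (mu + m)          # weight on the customer's own experience
    return (1.0 - weight) * (b / mu) + weight * lam_hat

# Hypothetical customer: 3 claims over 5 years, exponential prior with mean 0.1.
claims = [0, 1, 0, 0, 2]
premium = bayes_premium(1.0, 10.0, claims)
premium_weighted = credibility_form(1.0, 10.0, claims)
```

With five years of data the weight on the observed average is $5/15 = 1/3$, so the premium is pulled only a third of the way from the population mean 0.1 towards the observed 0.6.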
The Bayes premium enjoys the optimality property of minimizing the quadratic loss $E(c_m - \Lambda)^2$ in the class of all functions $c_m$ of $N_0, N_1, \ldots, N_{m-1}$, the natural class of predictors of Λ using information on $N_0, N_1, \ldots, N_{m-1}$. In fact, it is standard that the solution to this minimization problem is $c_m = E[\Lambda \mid N_0, N_1, \ldots, N_{m-1}]$. For these facts and the general theory of Bayes premiums, see Bühlmann [4], Bühlmann & Gisler [5] and Denuit et al. ([1], Ch. 3).$^{1}$
Given the above optimality property, the Bayes premium can be viewed as the optimal fair choice of the insurer’s premium (it is often argued that the reason that bonus-malus systems are used instead in practice is that they are easier to understand for the average customer, who would not know about prior and posterior distributions). Nevertheless, as noted by Norberg [6], the Bayesian view is highly relevant also in bonus-malus systems but for a different purpose, to compute the premiums $r_1, \ldots, r_K$ in the different bonus levels. To this end, the idea of basing the premium on the bonus level means that one chooses the minimizer of $E(c - \Lambda)^2$ in the class of all functions $c$ of the bonus level. One then needs to specify what is meant by this level, and to avoid the dependency on $m$, the choice of [6] and much subsequent literature is a r.v. $L_\infty$ distributed as $\pi(\Lambda)$ (recall that $\pi(\lambda)$ is the stationary distribution of the Markov chain $L_0, L_1, \ldots$ when the customer’s Poisson parameter is λ). Using the general $L^2$-theory quoted above, the minimizer is
$$c = E[\Lambda \mid L_\infty] = \sum_{\ell=1}^{K} I(L_\infty = \ell)\, E[\Lambda \mid L_\infty = \ell]$$
so that the optimal premium $r_\ell$ in class $\ell$ is $E[\Lambda \mid L_\infty = \ell]$. Evaluating the conditional expectation, we get
$$r_\ell = \frac{\int_0^\infty \lambda\, \pi_\ell(\lambda)\, U(d\lambda)}{\int_0^\infty \pi_\ell(\lambda)\, U(d\lambda)}. \tag{3}$$
Formula (3) gives the premium rule of the bonus system which is optimal from the point of view of minimizing the error in predicting a customer’s Λ.$^{2}$
The rule (3) satisfies in a certain sense the principle of financial equilibrium (for the company), which asserts that on average, premium incomes and payments of claims should balance. Namely, the expected claims in a year of a typical customer are $E\Lambda$ and his expected premiums are $E\, r_{L_\infty}$, assuming that a typical customer’s bonus class is distributed as the stationary r.v. $L_\infty$. By the tower property of conditional expectations, these expressions coincide. However, the point we make in this paper is that this assumption is questionable and needs further discussion.
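The tower-property step can be spelled out in one line. Writing $L_\infty$ for the stationary bonus class and using that (3) is the conditional expectation $r_\ell = E[\Lambda \mid L_\infty = \ell]$, one has:

```latex
\[
  E\, r_{L_\infty}
  \;=\; E\bigl[\, E[\Lambda \mid L_\infty] \,\bigr]
  \;=\; E\Lambda .
\]
```

The identity uses only that the premium is the conditional mean of Λ given the class, so it holds for any assumed distribution of the typical class as long as the premiums are computed from that same distribution.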
Remark 1. The above set-up ignores claim amounts (for examples of discussion involving also claim severity, see Frangos & Vontos [7] and Mahmoudvand & Hassani [8]). For simplicity, we will assume that the monetary scale is chosen such that the mean claim size is one.     ☐
Remark 2. Regulations vary greatly from country to country. At one extreme, all insurance companies are obliged to use the same bonus-malus system, at the other they have complete freedom. The general tendency has gone towards deregulation. A detailed survey of the situation in Europe as of year 2000 concerning such rules is in Meyer [9]. Of course, much has changed since then but still, [9] will serve to give an impression of many practical issues connected with motor insurance.     ☐
The paper is organized as follows. In Section 2, we introduce our age-correction approach and give some of its simple properties. The fundamental formula (4) gives $\pi^*$, the age-corrected $\pi$, expressed in terms of the sojourn distribution of a customer in the population. Sections 3–6 then contain an extensive numerical study of its behavior in concrete cases and how it compares to the traditional stationarity-based approach. A concluding discussion, including more careful references to the literature, is in Section 7, and the Appendix contains some complements as well as an outline of a more general modeling approach via Markov chains.

2. Independent Sojourn Times

Motivated by the criticism of the traditional use of stationarity, we now assume that a customer stays in the portfolio only for a finite number of years $A$, i.e., he is in the portfolio in years $0, 1, \ldots, A-1$ after entering. We further assume that $A$ is independent of his Λ and his claim sequence $N_0, N_1, \ldots$ and thereby his sequence of bonus levels, and that $\mu_A = EA < \infty$. This independence assumption is crucial but of course questionable. For example, an insured with a high Λ will typically be in a high bonus class and be more prone to change company than one with a small Λ. The distribution of $A$ is denoted by $F$, the point probabilities by $f_a = P(A = a)$, and we write $f^e_a = (f_{a+1} + f_{a+2} + \cdots)/\mu_A$, $a = 0, 1, \ldots$. Here $f^e$ is the equilibrium distribution familiar from renewal theory, cf. ([10], V.3). It gives the distribution of the time elapsed since the last renewal, an interpretation that matches nicely the alternative derivation we give in Appendix B.
Much of the discussion of Section 1 remains relevant; we only need, for each value λ of the Poisson parameter, to replace the stationary distribution $\pi(\lambda)$ of the bonus level by the distribution $\pi^*(\lambda)$ of the typical bonus level $L^*$.
We then need to specify what is meant by the ‘typical bonus level’ of a λ-customer, and our suggestion is to define this as a r.v. $L^*$ with distribution
$$\pi^*_\ell(\lambda) = P_\lambda(L^* = \ell) = \frac{1}{\mu_A}\, E_\lambda \sum_{n=0}^{A-1} I(L_n = \ell) = \sum_{a=0}^{\infty} f^e_a\, P_\lambda(L_a = \ell) \tag{4}$$
denoted the age-corrected distribution in the rest of the paper. Expression (4) is fundamental for the paper and may be approached in various ways. We choose here the set-up in the following Theorem 1 where the interpretation is as a limiting long-term average of bonus classes of λ-customers seen by the company (for an alternative, see Appendix B).
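Both forms of (4) can be evaluated directly once the transition matrix and the sojourn distribution are given. The sketch below does this for a hypothetical $-1/{+2}$ system with $K = 6$ classes entered in class 6 and a sojourn time uniform on $\{1, \ldots, 12\}$; the helper names and the boundary clipping are assumptions of the example.

```python
import math

def transition_matrix(lam, K, n_max=40):
    """-1/+2 rule, clipped to classes 1..K (an assumed boundary convention)."""
    P = [[0.0] * K for _ in range(K)]
    for k in range(1, K + 1):
        for n in range(n_max + 1):
            nxt = max(1, k - 1) if n == 0 else min(K, k + 2 * n)
            P[k - 1][nxt - 1] += math.exp(-lam) * lam ** n / math.factorial(n)
    return P

def step(dist, P):
    K = len(P)
    return [sum(dist[k] * P[k][l] for k in range(K)) for l in range(K)]

def equilibrium(f):
    """f[a] = P(A = a) for a = 0, 1, ...; returns f^e_a = P(A > a) / E[A]."""
    mean = sum(a * p for a, p in enumerate(f))
    fe, tail = [], 1.0
    for p in f:
        tail -= p                          # now tail = P(A > a)
        fe.append(max(tail, 0.0) / mean)
    return fe

def age_corrected(P, start, f):
    """Second form of (4): sum over a of f^e_a * P(L_a = l)."""
    dist = [1.0 if l == start - 1 else 0.0 for l in range(len(P))]
    out = [0.0] * len(P)
    for w in equilibrium(f):
        out = [o + w * d for o, d in zip(out, dist)]
        dist = step(dist, P)
    return out

def age_corrected_direct(P, start, f):
    """First form of (4): E[ sum_{n < A} 1{L_n = l} ] / E[A], using independence of A."""
    mean = sum(a * p for a, p in enumerate(f))
    dist = [1.0 if l == start - 1 else 0.0 for l in range(len(P))]
    occupation = [0.0] * len(P)            # expected visits per class in years 0..a-1
    out = [0.0] * len(P)
    for p in f:                            # p = P(A = a), a = 0, 1, ...
        out = [o + p * occ / mean for o, occ in zip(out, occupation)]
        occupation = [occ + d for occ, d in zip(occupation, dist)]
        dist = step(dist, P)
    return out

f_uniform = [0.0] + [1.0 / 12] * 12        # A uniform on {1, ..., 12}
P = transition_matrix(lam=0.1, K=6)
pi_star = age_corrected(P, start=6, f=f_uniform)
pi_star_direct = age_corrected_direct(P, start=6, f=f_uniform)
```

The two routes agree up to rounding, which is a convenient sanity check on an implementation before it is applied to a realistic bonus system.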
Before stating the Theorem, we need some notation and assumptions. Let $M_y$ denote the number of λ-customers in the portfolio in year $y = 0, \pm 1, \pm 2, \ldots$, $E_y$ the number of λ-customers entering the portfolio$^{3,4}$ and $L_{y,c}$, $c = 1, \ldots, M_y$, the bonus classes of the λ-customers, $T_{y,c} \in \{0, 1, \ldots\}$ the time since they entered the portfolio. Assume that the $E_y$ are i.i.d. with finite mean $\mu_E$ and independent of the $M_{y'}, L_{y',1}, \ldots, L_{y',M_{y'}}$ with $y' < y$, and that the $A, \Lambda, N_0, N_1, \ldots$ for different customers are i.i.d. and independent of the $E_y$.
Theorem 1. Under the above assumptions,
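$$\frac{1}{M_0 + \cdots + M_Y} \sum_{y=0}^{Y} \sum_{c=1}^{M_y} I(T_{y,c} = a) \;\to\; f^e_a \tag{5}$$
$$\frac{1}{M_0 + \cdots + M_Y} \sum_{y=0}^{Y} \sum_{c=1}^{M_y} I(L_{y,c} = \ell) \;\to\; \pi^*_\ell(\lambda) \tag{6}$$
a.s. as $Y \to \infty$ for all $\ell$, regardless of initial conditions.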
Proof. We can write
$$M_0 + \cdots + M_Y = C' + C'' + C'''$$
where $C'$ is the time-in-portfolio-before-$Y$ of λ-customers that first entered the portfolio at some time $y < 0$, $C''$ the similar time of those that entered at some time $y = 0, \ldots, Y$ and left before $Y + 1$, and $C'''$ the time of those that entered at some time $y = 0, \ldots, Y$ but still remain in the portfolio at time $Y + 1$ (and possibly after). Here we can bound $C'''$ by $\tilde C'''$, the total-time-in-portfolio (not necessarily before $Y$) of λ-customers that first entered at some time $y = 0, \ldots, Y$ but remained in the portfolio after $Y$. By the law of large numbers, $C'' + \tilde C''' \sim Y \mu_E \mu_A$. Further,
$$C' = M_0 - E_0 = O(1) = o(Y), \qquad E\, C''' = \mu_E \sum_{y=0}^{Y} P(A > Y - y) = \mu_E \sum_{y=0}^{Y} P(A > y) \le \mu_E \mu_A = o(Y)$$
Combining these facts gives
$$\frac{1}{Y} (M_0 + \cdots + M_Y) \;\to\; \mu_E \mu_A \tag{7}$$
A similar argument shows that the total time that the $C''$ or $C'''$ λ-customers (those entering at some time $y = 0, \ldots, Y$) ever spend in bonus class $\ell$ (not necessarily before $Y$) is of order
$$\mu_E Y \sum_{n=0}^{\infty} P_\lambda(L_n = \ell,\, A > n) = \mu_E Y \mu_A\, \pi^*_\ell(\lambda)$$
and that the $C''$ contribution to the l.h.s. of (6) dominates the $C'$ and $C'''$ contributions. Combining with (7) gives (6). The proof of (5) is similar, though slightly easier. ☐
Remark 3. The analysis in the proof of Theorem 1 is similar to that of a discrete-time G/G/∞ queue. $M_y$ then plays the role of the queue length at time $y$, and $T_{y,c}$ that of the elapsed service time of customer $(y, c)$. We will not use this connection and hence leave out further details.     ☐
It may be noted that expression (4) may be evaluated in closed analytical form. To this end, we need the fundamental matrix $Z(\lambda)$ of the Markov chain given by $Z(\lambda) = \big(I - P(\lambda) + \mathbf{1}\pi(\lambda)\big)^{-1}$, where $\mathbf{1}$ is the column vector with 1 at all entries, so that $\mathbf{1}\pi(\lambda)$ is the $K \times K$ matrix with all rows equal to $\pi(\lambda)$ ([10], p. 31). Further let $\hat f[z] = \sum_{n=0}^{\infty} f_n z^n$ denote the p.g.f. of $F$ and use the same notation $\hat f[C] = \sum_{n=0}^{\infty} f_n C^n$ for a square matrix $C$. Then, with $e_\ell$ the $\ell$th (row) unit vector:
Theorem 2. The distribution $\pi^*(\lambda)$ with point probabilities (4) is given by
$$\pi^*(\lambda) = \pi(\lambda) + \frac{1}{\mu_A}\, e_{\ell_0}\, Z(\lambda) \big( I - \hat f[P(\lambda)] \big)$$
For the proof, see the Appendix.
Remark 4. If the univariate p.g.f. $\hat f[z]$ is available in closed form, then so is usually the matrix version $\hat f[C]$ by obvious changes in the expression for $\hat f[z]$. For example, for a negative binomial distribution of order 2 with point probabilities $f_n = (1 - \rho)^2 (n + 1) \rho^n$, $n = 0, 1, \ldots$, we have $\hat f[z] = (1 - \rho)^2 / (1 - \rho z)^2$ and $\hat f[C] = (1 - \rho)^2 (I - \rho C)^{-2}$.
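The closed form of Theorem 2 can be compared numerically with a truncated version of the series (4). The sketch below uses the negative binomial distribution of order 2 from Remark 4 (mean $2\rho/(1-\rho)$) and a hypothetical $-1/{+2}$ system with $K = 5$; the small linear-algebra helpers, the parameter values and the power-iteration approximation of $\pi(\lambda)$ are assumptions of the example.

```python
import math

def transition_matrix(lam, K, n_max=40):
    """-1/+2 rule, clipped to classes 1..K (an assumed boundary convention)."""
    P = [[0.0] * K for _ in range(K)]
    for k in range(1, K + 1):
        for n in range(n_max + 1):
            nxt = max(1, k - 1) if n == 0 else min(K, k + 2 * n)
            P[k - 1][nxt - 1] += math.exp(-lam) * lam ** n / math.factorial(n)
    return P

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def vecmat(v, A):
    return [sum(v[i] * A[i][j] for i in range(len(v))) for j in range(len(A[0]))]

def inverse(A):
    """Gauss-Jordan inversion with partial pivoting (adequate for small matrices)."""
    n = len(A)
    M = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        d = M[c][c]
        M[c] = [x / d for x in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [x - factor * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

K, lam, rho, start = 5, 0.1, 0.8, 5          # hypothetical values; start = entry class
P = transition_matrix(lam, K)
I = [[float(i == j) for j in range(K)] for i in range(K)]
e0 = [float(j == start - 1) for j in range(K)]
mu_A = 2 * rho / (1 - rho)                   # mean of f_n = (1-rho)^2 (n+1) rho^n

pi = e0[:]
for _ in range(2000):                        # stationary distribution by power iteration
    pi = vecmat(pi, P)

# Route 1: the series (4), truncated after 600 terms.
series, dist, tail = [0.0] * K, e0[:], 1.0
for a in range(600):
    tail -= (1 - rho) ** 2 * (a + 1) * rho ** a      # tail = P(A > a)
    series = [s + tail / mu_A * d for s, d in zip(series, dist)]
    dist = vecmat(dist, P)

# Route 2: Theorem 2, with f_hat[P] = (1-rho)^2 (I - rho P)^(-2) as in Remark 4.
Z = inverse([[I[i][j] - P[i][j] + pi[j] for j in range(K)] for i in range(K)])
R = inverse([[I[i][j] - rho * P[i][j] for j in range(K)] for i in range(K)])
f_hat_P = [[(1 - rho) ** 2 * x for x in row] for row in matmul(R, R)]
I_minus_f = [[I[i][j] - f_hat_P[i][j] for j in range(K)] for i in range(K)]
correction = vecmat(vecmat(e0, Z), I_minus_f)
closed = [p + c / mu_A for p, c in zip(pi, correction)]
```

In this form the age-corrected distribution is visibly the stationary one plus a correction of order $1/\mu_A$, which shrinks as customers stay longer.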

3. Numerical Set-up

3.1. The Bonus Systems

We have selected three rather different systems for our numerical studies. Our source has been the survey in Meyer [9] treating the situation in most European countries (as well as Japan and the US) around 1999. Important characteristics of a bonus system are the number $K$ of classes and the spread factor, defined as the ratio between the highest premium $r_K$ and the lowest $r_1$. We chose systems from three countries: Ireland, Italy and Germany. Ireland has a small number $K = 6$ of classes and a low spread factor of 2, Germany has a high number $K = 29$ of classes and a high spread factor of 8.2, whereas Italy is intermediate with $K = 18$ classes and spread factor 4. More details on the various systems are given below. It should be noted that each system may only have been one among several in the particular country when [9] was published and that much may have changed since then. However, our point is not to analyze systems that are necessarily in current use but rather that our examples both show diversity and are typical of many other systems.
The premiums $r_\ell$, often called relativities, are traditionally given in percent of the premium in the initial class $\ell_0$ or some other reference class, and when presenting the three systems, we follow that tradition (as does Meyer [9]). However, later we shall renormalize to get financial equilibrium.
Figure 1. Trial distributions and their equilibrium distributions.

3.2. Trial Sojourn Time Distributions

We selected four trial distributions for the distribution of the sojourn time $A$ of a customer in the portfolio. Two are negative binomial with point probabilities
$$f_n = \binom{n+1}{2} (1 - \rho)^3 \rho^{\,n-1}, \quad n = 1, 2, \ldots \tag{8}$$
(i.e., $A$ is distributed as $1 + B_1 + B_2 + B_3$ where $B_1, B_2, B_3$ are independent geometric($\rho$) on $\{0, 1, \ldots\}$). Here ρ was chosen to make the means $1 + 3\rho/(1 - \rho)$ equal to 7 and 13 (i.e., $\rho = 2/3$ and $\rho = 4/5$), and the two distributions are denoted NB(7), resp. NB(13). The two others, denoted U(6.5) and U(12.5), were taken as uniform distributions on $\{1, 2, \ldots, 12\}$, resp. $\{1, 2, \ldots, 24\}$, i.e., with roughly the same means. The distributions and their equilibrium distributions given by (5) are illustrated in Figure 1, giving the (yearly) probability mass function.
In the numerical calculations, the distributions were truncated at $n = 50$, except for Figure 2 where the truncation point was $n = 2{,}000$.
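The four trial distributions and their equilibrium distributions are simple to tabulate. The sketch below writes the negative binomial point probabilities with exponent $n - 1$, so that they sum to one over $n \ge 1$ in line with $A = 1 + B_1 + B_2 + B_3$, and truncates at $n = 500$ rather than at 50 to keep the truncation error negligible; both choices, and the helper names, are assumptions of the example.

```python
def rho_for_mean(mean):
    """Solve 1 + 3*rho/(1 - rho) = mean for rho."""
    return (mean - 1.0) / (mean + 2.0)

def neg_binomial_pmf(rho, n_max):
    """f_n = C(n+1, 2) (1-rho)^3 rho^(n-1), n = 1..n_max; index 0 holds P(A = 0) = 0."""
    return [0.0] + [(n + 1) * n // 2 * (1 - rho) ** 3 * rho ** (n - 1)
                    for n in range(1, n_max + 1)]

def uniform_pmf(top):
    """Uniform distribution on {1, ..., top}; index 0 holds P(A = 0) = 0."""
    return [0.0] + [1.0 / top] * top

def mean(f):
    return sum(n * p for n, p in enumerate(f))

def equilibrium(f):
    """f^e_a = P(A > a) / E[A] for a = 0, 1, ..."""
    m, tail, fe = mean(f), 1.0, []
    for p in f:
        tail -= p                          # now tail = P(A > a)
        fe.append(max(tail, 0.0) / m)
    return fe

trial = {
    "NB(7)": neg_binomial_pmf(rho_for_mean(7), 500),
    "NB(13)": neg_binomial_pmf(rho_for_mean(13), 500),
    "U(6.5)": uniform_pmf(12),
    "U(12.5)": uniform_pmf(24),
}
means = {name: mean(f) for name, f in trial.items()}
equilibria = {name: equilibrium(f) for name, f in trial.items()}
```

For the uniform cases the equilibrium distribution decreases linearly in the age $a$, whereas for the negative binomial cases it has a longer tail.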
Figure 2. Convergence of relativities, Ireland.

3.3. Bayesian Assumptions

We have taken the distribution $U$ of the customers’ (yearly) λ parameter to be exponential with mean 0.1. The exponential assumption is from Bichsel [11], who fitted a gamma distribution to data and found the shape parameter to be close to 1. The value 0.1 is from Lemaire & Zi ([12], p. 288), who argue this to be typical in many countries.$^{5}$
Motivated by this assumption, we have in many of the illustrations selected four values of λ, 0.04, 0.08, 0.16 and 0.32, i.e., two below the population mean and two above. For the spread, note that under the exponential distribution with mean 0.1, the value 0.32 lies close to the 95% quantile ($P(\Lambda \le 0.32) = 1 - e^{-3.2} \approx 0.96$), whereas 0.04 lies near the 33% quantile ($P(\Lambda \le 0.04) = 1 - e^{-0.4} \approx 0.33$).

4. Convergence to Stationarity. Age-Corrected Distributions

4.1. Ireland

The Irish system is very simple with $K = 6$ classes and transition rules as in Table 1. The initial class is $\ell_0 = 6$.
Table 1. Bonus rules, Ireland.
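| Class $\ell$ | $r_\ell$ | $n = 0$ | $n = 1$ | $n \ge 2$ |
|---|---|---|---|---|
| 6 | 100 | 5 | 6 | 6 |
| 5 | 90 | 4 | 6 | 6 |
| 4 | 80 | 3 | 6 | 6 |
| 3 | 70 | 2 | 5 | 6 |
| 2 | 60 | 1 | 4 | 6 |
| 1 | 50 | 1 | 3 | 6 |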
The convergence speed to the stationary distribution is illustrated in two figures. The first, Figure 3, shows the shape of the transient λ-distributions of $L_n$ for four selected values $n = 5, 10, 15, \infty$ of $n$ ($n = \infty$ corresponds to the stationary distribution) and the four selected values 0.04, 0.08, 0.16, 0.32 of λ, and the next, Figure 4, plots the t.v. distance (2) to the stationary distribution as function of the number $n$ of years elapsed.
Figure 3. Transient distributions, Ireland.
The shape of these figures may be understood from the transition rules. Consider for example a customer with λ = 0.04. Here most of the mass of π is concentrated in class 1, but class 1 can at the earliest be reached in year 5. This explains the steep drop in Figure 4 in the t.v. distance between years 4 and 5. When looking at the bar plots in Figure 3 for the distribution of his class in different years, consider for example year 5 and note that w.p. $e^{-0.2} = 0.82$ he will have no claims in the first 5 years, so 0.82 is precisely the mass at class 1. W.p. $0.2\, e^{-0.2} = 0.16$ he will have exactly one claim. If this happens in year 0, his sequence of states in years 1, 2, 3, 4, 5 is 6, 5, 4, 3, 2. The similar sequences for a claim in year 1, 2, 3, resp. 4 are
(5, 6, 5, 4, 3), (5, 4, 6, 5, 4), (5, 4, 3, 5, 4), (5, 4, 3, 2, 4).
Since any of the years 0, 1, 2, 3, 4 are equally likely for the claim, this explains that class 4 is more likely than classes 2, 3, which is of course not the case for a good customer in stationarity ($n = \infty$). The probability of two or more claims, giving mass in states 5, 6, is just 0.02 and hence negligible.
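The year-5 distribution discussed above can be reproduced by propagating the class distribution through the rules of Table 1. The sketch below is one way to do this; the dictionary layout and function names are choices made for the example.

```python
import math

# Table 1 (Ireland): class -> next class after 0, 1, or 2+ claims in a year.
IRELAND = {6: (5, 6, 6), 5: (4, 6, 6), 4: (3, 6, 6),
           3: (2, 5, 6), 2: (1, 4, 6), 1: (1, 3, 6)}

def claim_probs(lam):
    """Probabilities of 0, 1 and 2+ claims in a year for Poisson rate lam."""
    p0 = math.exp(-lam)
    p1 = lam * p0
    return (p0, p1, 1.0 - p0 - p1)

def step(dist, lam):
    """One-year update of the class distribution (dict: class -> probability)."""
    new = {k: 0.0 for k in IRELAND}
    for k, mass in dist.items():
        for nxt, p in zip(IRELAND[k], claim_probs(lam)):
            new[nxt] += mass * p
    return new

# A customer with lam = 0.04 entering in class 6, followed for five years.
dist = {k: 0.0 for k in IRELAND}
dist[6] = 1.0
for _ in range(5):
    dist = step(dist, 0.04)
```

After the loop, `dist[1]` equals $e^{-0.2}$ up to rounding, and `dist[4]` exceeds both `dist[2]` and `dist[3]`, in line with the counting argument in the text.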
Figure 4. T.v. convergence rate, Ireland.
Similar remarks apply to other values of λ and n as well as the parallel figures for Italy and Germany to follow, but we shall not give the details.
The figures show the fastest convergence rate among our three selected systems, and also that the rate does not depend crucially on the value of λ. The explanation could be related to the simplicity of the Irish system.
Figure 5 plots the age-corrected distribution $\pi^*(\lambda)$ for our four selected values of λ (one in each column) and our four trial distributions together with the stationary distribution $\pi(\lambda)$ (the benchmark of much literature) on top. It is seen that the agreement within columns is relatively good, with the most marked differences for small values of λ. The explanation could be the relatively fast convergence rate in the Irish system.
Figure 5. Stationary and age-corrected distributions, Ireland.

4.2. Italy

The Italian system is intermediate with $K = 18$ classes and transition rules as in Table 2. The initial class is $\ell_0 = 14$ [note that $r_{\ell_0} \ne 100$, a normalization allowed in exceptional cases].
Figure 6 and Figure 7 are parallel to Figure 3 and Figure 4 for Ireland, illustrating the convergence speed to stationarity. As a new feature, we observe some gaps in the transient distributions. Consider for example $n = 5$, where Figure 6 shows that classes 10 and 11 cannot be attained. The explanation is that with 0 claims in years 0, 1, 2, 3, 4 one will go down 5 classes from 14 to 9, but with 1 claim one goes down 4 and up 2, so to 12, and with two or more claims of course even higher. One sees also a somewhat slower rate of convergence to stationarity.
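The gaps can be found mechanically by tracking which classes are reachable after a given number of years. The sketch below encodes the rows of Table 2 by the pattern they follow (one class down after a claim-free year, otherwise up by $3\min(n, 4) - 1$ classes, clipped to $1, \ldots, 18$); this compact formula is an assumed summary of the table used for the example.

```python
def italy_next(k, n):
    """Next class under the Table 2 pattern for current class k and n claims."""
    if n == 0:
        return max(1, k - 1)
    return min(18, k + 3 * min(n, 4) - 1)

def reachable(start, years):
    """Set of classes that can be occupied `years` years after entering in `start`."""
    classes = {start}
    for _ in range(years):
        classes = {italy_next(k, n) for k in classes for n in range(5)}
    return classes

support = reachable(14, 5)
```

Under these assumptions the support after five years is {9, 12, 14, 15, 16, 17, 18}: classes 10 and 11 are missing, as noted above, and so is class 13.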
Finally, the age-corrected distributions $\pi^*(\lambda)$ are plotted in Figure 8 together with the stationary distribution $\pi(\lambda)$. One sees a markedly worse agreement within columns than for Ireland. The most marked differences occur for small values of λ, with one feature being a considerable concentration of the $\pi^*(\lambda)$ (but not of $\pi(\lambda)$) close to the initial class 14. Again, the most natural explanation is the slow convergence rate.
Table 2. Bonus rules, Italy.
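| Class $\ell$ | $r_\ell$ | $n = 0$ | $n = 1$ | $n = 2$ | $n = 3$ | $n \ge 4$ |
|---|---|---|---|---|---|---|
| 18 | 200 | 17 | 18 | 18 | 18 | 18 |
| 17 | 175 | 16 | 18 | 18 | 18 | 18 |
| 16 | 150 | 15 | 18 | 18 | 18 | 18 |
| 15 | 130 | 14 | 17 | 18 | 18 | 18 |
| 14 | 115 | 13 | 16 | 18 | 18 | 18 |
| 13 | 100 | 12 | 15 | 18 | 18 | 18 |
| 12 | 94 | 11 | 14 | 17 | 18 | 18 |
| 11 | 88 | 10 | 13 | 16 | 18 | 18 |
| 10 | 82 | 9 | 12 | 15 | 18 | 18 |
| 9 | 78 | 8 | 11 | 14 | 17 | 18 |
| 8 | 74 | 7 | 10 | 13 | 16 | 18 |
| 7 | 70 | 6 | 9 | 12 | 15 | 18 |
| 6 | 66 | 5 | 8 | 11 | 14 | 17 |
| 5 | 62 | 4 | 7 | 10 | 13 | 16 |
| 4 | 59 | 3 | 6 | 9 | 12 | 15 |
| 3 | 56 | 2 | 5 | 8 | 11 | 14 |
| 2 | 53 | 1 | 4 | 7 | 10 | 13 |
| 1 | 50 | 1 | 3 | 6 | 9 | 12 |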
Figure 6. Transient distributions, Italy.
Figure 7. T.v. convergence rate, Italy.
Figure 8. Stationary and age-corrected distributions, Italy.

4.3. Germany

The German system is rather elaborate. It has a large number of classes, $K = 29$, initial class 26, and quite detailed rules for the new class after one or more claims. For example, after one claim the customer moves up 14 classes when in class 1, always to class 17 when in classes 6–11, and up 3 classes when in classes 19–22. The rules for some selected cases are given in Table 3; for full details, see Meyer [9] or Mahmoudvand et al. [14].
Table 3. Bonus rules, Germany.
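| Class $\ell$ | $r_\ell$ | $n = 0$ | $n = 1$ | $n = 2$ | $n = 3$ | $n \ge 4$ |
|---|---|---|---|---|---|---|
| 29 | 245 | 25 | 29 | 29 | 29 | 29 |
| 25 | 100 | 24 | 26 | 29 | 29 | 29 |
| 20 | 55 | 19 | 23 | 26 | 27 | 29 |
| 15 | 40 | 14 | 21 | 25 | 27 | 29 |
| 10 | 35 | 9 | 17 | 24 | 26 | 29 |
| 5 | 30 | 4 | 16 | 22 | 24 | 29 |
| 1 | 30 | 1 | 15 | 22 | 24 | 29 |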
A quite special feature of the German system is the very high initial class, 26, meaning that a customer at the earliest can reach the lowest premium level in class 1 after 25 years! This clearly shows up in Figure 9, Figure 10 and Figure 11, for example in the λ = 0.04 row in Figure 10, where the t.v. distance from the stationary distribution is substantial up to time $n = 25$, and in the comparisons of age-corrected distributions in Figure 11, which shows the same phenomenon as for Italy, a strong concentration of the $\pi^*(\lambda)$ (but not of $\pi(\lambda)$) close to the initial class 26.
Figure 9. Transient distributions, Germany.
Figure 10. T.v. convergence rate, Germany.
Figure 11. Stationary and age-corrected distributions, Germany.

4.4. Population Averages

We believe the differentiation between high and low values of λ in the above figures is of greater interest than considering a single value such as the population mean 0.1. However, the Bayesian view could motivate summarizing by averaging over λ with respect to the structure distribution $U$.
Figure 12. Population averaged convergence rates: population mean 0.1.
Such averaging is done in Figure 12 and Figure 13. Figure 12 gives the population averaged t.v. distance
$$\sum_{\ell=1}^{K} \int_0^\infty \big| p^{n}_{\ell_0 \ell}(\lambda) - \pi_\ell(\lambda) \big|\, U(d\lambda)$$
between the transient distribution at time $n$ and the stationary distribution π, whereas Figure 13 gives the averaged age-corrected distributions $\int_0^\infty \pi^*(\lambda)\, U(d\lambda)$.
These figures show essentially the same behavior as for the intermediate values 0.08 and 0.16 of λ, which could be expected since 0.04 and 0.32 are in the tails with low U-mass.
Figure 13. Population averaged age-corrected distributions: population mean 0.1.
As a comparison, similar figures have been produced for the substantially smaller population mean 0.05, see Figure 14 and Figure 15. A first rough conclusion is that the behavior appears to be rather insensitive to the population mean.
Figure 14. Population averaged convergence rates: population mean 0.05.
Figure 15. Population averaged age-corrected distributions: population mean 0.05.

5. Relativities

We now turn to the influence of finite customer sojourn times on the Bayes premium, proceeding as follows. For each of the three selected bonus systems and of the four trial sojourn time distributions, we first compute our age-corrected alternatives $\pi^*(\lambda)$ to the stationary distribution $\pi(\lambda)$ by means of (4) and next the Bayes premium $r^*_\ell$ in bonus class $\ell$ by means of the analogue of (3)
$$r^*_\ell = \frac{\int_0^\infty \lambda\, \pi^*_\ell(\lambda)\, U(d\lambda)}{\int_0^\infty \pi^*_\ell(\lambda)\, U(d\lambda)}$$
The results are shown in Figure 16, Figure 17 and Figure 18. The legends are solid red for distribution NB(7), dotted red for NB(13), solid blue for U(6.5), and dotted blue for U(12.5). As a supplement we also compute the Bayes premium corresponding to the stationary distribution $\pi(\lambda)$ (dotted black) and the premium corresponding to the given relativities for the bonus system (e.g., 50, 60, 70, 80, 90, 100 for Ireland) in solid black; whereas the Bayes premium automatically yields financial equilibrium, cf. the discussion following (3), we here need to normalize to satisfy this requirement.
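The age-corrected Bayes premiums can be approximated with elementary numerical integration. The sketch below does this for the Irish rules of Table 1, the uniform sojourn distribution on $\{1, \ldots, 12\}$ and the exponential structure distribution with mean 0.1; the midpoint rule on $(0, 1.5)$ with step 0.005 and all names are assumptions of the example, not the paper's numerical method.

```python
import math

# Table 1 (Ireland): class -> next class after 0, 1, or 2+ claims; entry class 6.
IRELAND = {6: (5, 6, 6), 5: (4, 6, 6), 4: (3, 6, 6),
           3: (2, 5, 6), 2: (1, 4, 6), 1: (1, 3, 6)}
START = 6

def step(dist, lam):
    """One-year update of the class distribution (dict: class -> probability)."""
    p0 = math.exp(-lam)
    probs = (p0, lam * p0, 1.0 - p0 - lam * p0)
    new = {k: 0.0 for k in IRELAND}
    for k, mass in dist.items():
        for nxt, p in zip(IRELAND[k], probs):
            new[nxt] += mass * p
    return new

def age_corrected(lam, fe):
    """pi*(lam) from (4): mix the year-a class distributions with weights f^e_a."""
    dist = {k: float(k == START) for k in IRELAND}
    out = {k: 0.0 for k in IRELAND}
    for w in fe:
        for k in IRELAND:
            out[k] += w * dist[k]
        dist = step(dist, lam)
    return out

# Equilibrium weights for A uniform on {1, ..., 12}: f^e_a = P(A > a) / 6.5, a = 0..11.
fe = [(12 - a) / 12 / 6.5 for a in range(12)]

# Midpoint rule for integrals against U(d lam) = 10 exp(-10 lam) d lam on (0, 1.5).
h = 0.005
num = {k: 0.0 for k in IRELAND}
den = {k: 0.0 for k in IRELAND}
for i in range(300):
    lam = (i + 0.5) * h
    weight = 10.0 * math.exp(-10.0 * lam) * h
    pi_star = age_corrected(lam, fe)
    for k in IRELAND:
        num[k] += lam * pi_star[k] * weight
        den[k] += pi_star[k] * weight

r_star = {k: num[k] / den[k] for k in IRELAND}
```

Summing the numerators over all classes returns the population mean 0.1 up to quadrature error, which is the financial equilibrium property in this setting. The values in `r_star` are on the scale of Λ, not in percent.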
Figure 16. Relativities, Ireland.
Figure 17. Relativities, Italy.
Figure 18. Relativities, Germany.
When interpreting the figures, we first note that it does not contradict financial equilibrium that for a given country, one set of relativities is below the other. For example, all relativities corresponding to one of our four trial sojourn time distributions (colored graphs) are below the given relativities (solid black graph). But the explanation is simply that one set of relativities should be weighted with the age-corrected distribution and the other with the stationary distribution, and the age-corrected distributions have a region of importance which is more shifted towards high classes.
We next note that the two distributions NB(7), U(6.5) with the low mean are quite close, in some cases even hard to distinguish. Distributions NB(13), U(12.5) have a roughly doubled mean. As could be expected, this puts them closer to the Bayesian relativity computed w.r.t. $\pi(\lambda)$. The convergence rate appears quite slow, however, and this is further illustrated in Figure 2. We took here the Irish system and compared the stationarity-based Bayesian relativity (solid black) to those of four versions of the negative binomial distribution (8), one with mean 10 (dotted red), one with mean 100 (dashed red), one with mean 200 (dash-dotted red), and one with mean 400 (solid red). The figure confirms the expectation of convergence, but shows also that (as just noted) it is slow.

6. Age-Corrected Average Premiums

Analogous with the stationarity-based definition (1) of the average premium $\bar r(\lambda)$ of a λ-customer, we define the age-corrected version as
$$\bar r^*(\lambda) = \sum_{\ell=1}^{K} r_\ell\, \pi^*_\ell(\lambda)$$
The $\bar r^*(\lambda)$ are plotted in Figure 19, Figure 20 and Figure 21, one for each of the three bonus systems, for the same 6 cases as for the relativities in Section 5 and with the same legends.
We see a considerable difference between the two stationarity-based average premiums (solid black and dotted black) for Ireland and Italy, whereas they appear almost identical for Germany. The age-corrected average premiums are again quite different, and exhibit somewhat similar behavior as the relativities in Figure 16, Figure 17 and Figure 18.
Figure 19. The $\bar r^*(\lambda)$, Ireland.
Figure 20. The $\bar r^*(\lambda)$, Italy.
Figure 21. The $\bar r^*(\lambda)$, Germany.
The ideal fairness criterion for a Bayesian premium rule is that the premium for a λ-customer should come as close to λ as possible. This can never be perfectly achieved: since the premium in the lowest bonus class is non-zero, a customer with a small λ will always pay too much, and since the premium in the highest class is finite, a customer with a large λ will always pay too little. The figures show that this effect is substantially more marked for the age-corrected average premiums than for the stationarity-based ones. The explanation is natural: if the customer has a finite sojourn time, the system will have less time to learn about his risk characteristics in the form of λ than if he had been there forever, as is the (false) assumption underlying the stationarity-based calculations.
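The gap between the stationarity-based and the age-corrected average premium can be illustrated with the Irish system. The sketch below uses the unnormalized relativities of Table 1 (in percent), the uniform sojourn distribution on $\{1, \ldots, 12\}$ and a long run of the chain as a stand-in for the stationary distribution; these choices and the function names are assumptions of the example.

```python
import math

# Table 1 (Ireland): class -> (relativity in percent, next class after 0, 1, 2+ claims).
IRELAND = {6: (100, 5, 6, 6), 5: (90, 4, 6, 6), 4: (80, 3, 6, 6),
           3: (70, 2, 5, 6), 2: (60, 1, 4, 6), 1: (50, 1, 3, 6)}

def step(dist, lam):
    """One-year update of the class distribution (dict: class -> probability)."""
    p0 = math.exp(-lam)
    probs = (p0, lam * p0, 1.0 - p0 - lam * p0)
    new = {k: 0.0 for k in IRELAND}
    for k, mass in dist.items():
        for nxt, p in zip(IRELAND[k][1:], probs):
            new[nxt] += mass * p
    return new

def mean_premium(dist):
    return sum(IRELAND[k][0] * mass for k, mass in dist.items())

def average_premiums(lam, fe, burn_in=2000):
    """Return (stationary, age-corrected) average premium for entry class 6."""
    dist = {k: float(k == 6) for k in IRELAND}
    age_corrected = 0.0
    for w in fe:                       # weights f^e_a on the year-a mean premium
        age_corrected += w * mean_premium(dist)
        dist = step(dist, lam)
    for _ in range(burn_in):           # run on to (approximate) stationarity
        dist = step(dist, lam)
    return mean_premium(dist), age_corrected

fe = [(12 - a) / 12 / 6.5 for a in range(12)]    # A uniform on {1, ..., 12}
results = {lam: average_premiums(lam, fe) for lam in (0.04, 0.08, 0.16, 0.32)}
```

Because customers enter in the most expensive class and the Irish rules are monotone, the age-corrected average premium lies above the stationary one for every λ in this sketch, and the excess is what a finite sojourn time costs a good driver.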

7. Concluding Remarks

In this paper, we have examined how reasonable it is to view bonus-malus systems through the stationary distribution, as is usually done. The conclusion is that in many cases the transient distributions are quite far from the stationary ones, and that this has considerable consequences for the computation of quantities such as Bayesian relativities and average premiums.
We do not insist that our trial distributions for the sojourn time in the portfolio cover the relevant time span. A motor insurance policy may, for example, be terminated simply because the insured gets a new car. In that case, he will typically continue with a new policy in the same company, but will not enter at the same level $\ell_0$ as a completely new customer. Similar remarks apply to a change of company, where usually some information on the present bonus class or general previous claim statistics is passed from the old company to the new one. Therefore, our choice of the $A$ distributions should be seen as nothing more than scenario analysis.
Numerical studies of specific bonus-malus systems can be found, for example, in Lemaire [15], Lemaire & Zi [12] and Mahmoudvand et al. [14]. These papers differ from the present one by not going into the Bayesian aspect. Here the more closely related literature is Norberg [6] and Borgan et al. [3]. In particular, [3] contains ideas on how to get away from the stationary point of view. As an analogue of our $\pi^*(\lambda)$, [3] suggests a distribution of the form $\sum_{n=0}^{\infty} w_n\, e_{\ell_0} P(\lambda)^n$, where the $w_n$ are suitable weights summing to one. It is also briefly mentioned that one interpretation corresponds to sampling a customer at random from the portfolio, but the connection to our $\pi^*(\lambda)$, which is obtained by taking $w_n = f^e_n$, is not given. Also, the concept of sampling a customer at random is not explained very clearly (cf., e.g., our Theorem 1 and Appendix B below), and in key examples the $w_n$ are taken constant on an interval, whereas $f^e_n$ is decreasing. Nevertheless, [3] contains some key ideas related to this paper, and to our mind, the paper has received surprisingly little attention in subsequent literature (but see Denuit et al. [1], Ch. 8).
Of further classical references in the bonus-malus area not cited elsewhere in the text, we mention in particular (in chronological order) Grenander [16], Loimaranta [17], Bonsdorff [18] and Rolski et al. [19].

Conflicts of Interest

The author declares no conflict of interest.

References

  1. M. Denuit, X. Marchal, S. Pitrebois, and J.F. Walhin. Actuarial Modelling of Claims Count. Chichester, UK: Wiley, 2007.
  2. J. Lemaire. Bonus-Malus Systems in Automobile Insurance. Boston, MA, USA: Kluwer, 1995.
  3. Ø. Borgan, J.M. Hoem, and R. Norberg. “A nonasymptotic criterion for the evaluation of automobile bonus systems.” Scand. Actuar. J. 1981 (1981): 165–178.
  4. H. Bühlmann. Mathematical Methods in Risk Theory. Berlin/Heidelberg, Germany: Springer-Verlag, 1970.
  5. H. Bühlmann, and A. Gisler. A Course in Credibility Theory and its Applications. Berlin/Heidelberg, Germany: Springer-Verlag, 2005.
  6. R. Norberg. “A credibility theory for automobile bonus system.” Scand. Actuar. J. 1976 (1976): 92–107.
  7. N.E. Frangos, and S.D. Vontos. “Design of optimal bonus-malus systems with a frequency and a severity component on an individual basis in automobile insurance.” ASTIN Bull. 31 (2001): 1–22.
  8. R. Mahmoudvand, and H. Hassani. “Generalized bonus-malus system with a frequency and a severity component on an individual basis in automobile insurance.” ASTIN Bull. 39 (2009): 307–315.
  9. U. Meyer. Third Party Motor Insurance in Europe. Comparative Study of the Economical-Statistical Situation. Bamberg, Germany: University of Bamberg, 2000.
  10. S. Asmussen. Applied Probability and Queues, 2nd ed. New York, NY, USA: Springer-Verlag, 2003.
  11. F. Bichsel. “Erfahrungstariffierung in der Motorfahrzeug-Haftpflicht-Versicherung.” Mitt. Verein. Schweiz. Versich. Math., 1964, 119–130.
  12. J. Lemaire, and H. Zi. “A comparative analysis of 30 bonus-malus systems.” ASTIN Bull. 24 (1994): 287–309.
  13. G.J. Tsougas. “Actuarial Modelling of Claims Counts and Losses in Motor Third Party Liability Insurance.” Ph.D. Thesis, University of Athens, Athens, Greece, 2013.
  14. R. Mahmoudvand, A. Edalati, and F. Shokooi. “Bonus-malus system in Iran: An empirical evaluation.” J. Data Sci. 11 (2013): 29–41.
  15. J. Lemaire. “A comparative analysis of most European and Japanese Bonus-malus systems.” J. Risk Insur. LV (1988): 660–681.
  16. U. Grenander. “Some remarks on bonus systems in automobile insurance.” Scand. Actuar. J. 40 (1957): 180–197.
  17. K. Loimaranta. “Some asymptotic properties of bonus systems.” ASTIN Bull. 6 (1972): 233–245.
  18. H. Bonsdorff. “On the convergence rate of bonus-malus systems.” ASTIN Bull. 22 (1992): 217–223.
  19. T. Rolski, H. Schmidli, V. Schmidt, and J. Teugels. Stochastic Processes for Insurance and Finance. Chichester, UK: Wiley, 1999.
  20. S. Asmussen, and H. Albrecher. Ruin Probabilities, 2nd ed. Singapore, Singapore: World Scientific, 2010.
  21. J. Lemaire. Automobile Insurance: Actuarial Models. Boston, MA, USA: Kluwer, 1985.

Appendix

A. Proof of Theorem 2

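For ease of notation, we suppress the dependency on λ. First note that $Z = (I - P + \mathbf{1}\pi)^{-1}$ satisfies
$$\pi Z = \pi, \qquad Z\mathbf{1} = \mathbf{1}, \qquad PZ = ZP = Z - I + \mathbf{1}\pi$$
(multiply by $I - P + \mathbf{1}\pi$ on both sides). From this it follows easily by induction that
$$P^n Z = Z P^n = Z - I - P - \cdots - P^{n-1} + n\mathbf{1}\pi,$$
so
$$I + P + \cdots + P^{n-1} = Z(I - P^n) + n\mathbf{1}\pi,$$
$$e_{\ell_0}\,\mathbb{E}\sum_{n=0}^{A-1} P^n = e_{\ell_0}\,\mathbb{E}\bigl[Z(I - P^A) + A\mathbf{1}\pi\bigr] = e_{\ell_0} Z\bigl(I - \hat f[P]\bigr) + \mu_A \cdot \pi.$$
Rewriting (4) in matrix notation and using the independence of $A$ and $L_0, L_1, \ldots$ gives
$$\pi^* = \frac{1}{\mu_A}\,\mathbb{E}\sum_{n=0}^{A-1} e_{\ell_0} P^n = \frac{1}{\mu_A}\, e_{\ell_0}\,\mathbb{E}\bigl[Z(I - P^A) + A\mathbf{1}\pi\bigr] = \frac{1}{\mu_A}\Bigl( e_{\ell_0} Z\bigl(I - \hat f[P]\bigr) + \mu_A \cdot \pi \Bigr).$$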
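The identity can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's computation: it uses a hypothetical three-class chain (one class down after a claim-free year, one class up otherwise), a hypothetical sojourn-time distribution `f`, and reads $\hat f[P]$ as $\mathbb{E}P^A = \sum_a f_a P^a$. It compares the direct sum $\mu_A^{-1}\sum_n P(A>n)\,e P^n$ with the closed form $\pi + \mu_A^{-1}\, e\, Z(I - \hat f[P])$, where `e` is the unit row vector of the entry class.

```python
import math

def mat_mul(A, B):
    """Plain matrix product for lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def mat_inv(M):
    """Gauss-Jordan inverse with partial pivoting (fine for small matrices)."""
    n = len(M)
    aug = [list(M[i]) + [1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda row: abs(aug[row][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        d = aug[c][c]
        aug[c] = [x / d for x in aug[c]]
        for row in range(n):
            if row != c:
                factor = aug[row][c]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[c])]
    return [row[n:] for row in aug]

def toy_P(lam, K=3):
    """Hypothetical rule: claim-free year -> one class down, any claim -> one class up."""
    p0 = math.exp(-lam)
    P = [[0.0] * K for _ in range(K)]
    for i in range(K):
        P[i][max(i - 1, 0)] += p0
        P[i][min(i + 1, K - 1)] += 1.0 - p0
    return P

def both_sides(P, f, start):
    """Return (direct sum, closed form) for pi*, with f[a-1] = P(A = a)."""
    K = len(P)
    I = [[1.0 if i == j else 0.0 for j in range(K)] for i in range(K)]
    mu = sum(a * fa for a, fa in enumerate(f, start=1))
    e = [[1.0 if j == start else 0.0 for j in range(K)]]    # 1 x K row vector
    pi = e
    for _ in range(5000):                                   # stationary row vector by iteration
        pi = mat_mul(pi, P)
    direct = [0.0] * K
    fhat = [[0.0] * K for _ in range(K)]
    Pn, tail = I, 1.0                                       # Pn = P^n, tail = P(A > n)
    for fa in f:
        row = mat_mul(e, Pn)[0]
        direct = [d + tail * x / mu for d, x in zip(direct, row)]
        tail -= fa
        Pn = mat_mul(Pn, P)                                 # now P^(n+1)
        fhat = [[fhat[i][j] + fa * Pn[i][j] for j in range(K)] for i in range(K)]
    one_pi = [pi[0][:] for _ in range(K)]                   # matrix 1*pi: every row equals pi
    Z = mat_inv([[I[i][j] - P[i][j] + one_pi[i][j] for j in range(K)] for i in range(K)])
    I_minus_fhat = [[I[i][j] - fhat[i][j] for j in range(K)] for i in range(K)]
    corr = mat_mul(mat_mul(e, Z), I_minus_fhat)[0]
    closed = [pi[0][j] + corr[j] / mu for j in range(K)]
    return direct, closed

direct, closed = both_sides(toy_P(0.3), [0.1, 0.2, 0.3, 0.4], start=0)
print(direct)
print(closed)
```

The two printed vectors should agree up to rounding. The closed form is attractive when $\hat f[P]$ is available explicitly, since it avoids summing over all ages.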

B. A Variant of the Derivation of the Age-Corrected Distribution

A different way to arrive at distribution (4) as the relevant bonus class distribution in a model with finite sojourn times of customers is to ‘sieve customers one-by-one through the system’. By this we mean that we consider a sequence of λ-customers such that customer $n$ has sojourn time $A^{(n)}$ and bonus classes $L_0^{(n)}, L_1^{(n)}, \ldots, L_{A^{(n)}-1}^{(n)}$ during his time in the portfolio. We then define a $\{1, \ldots, K\}$-valued stochastic process $Z$ as the sequence
$$L_0^{(1)}, L_1^{(1)}, \ldots, L_{A^{(1)}-1}^{(1)},\; L_0^{(2)}, L_1^{(2)}, \ldots, L_{A^{(2)}-1}^{(2)},\; \ldots \tag{11}$$
and have:
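Proposition 3. Assume that the distribution of $A$ is aperiodic (the discrete-time counterpart of non-lattice, i.e., not concentrated on $\{d, 2d, \ldots\}$ for any $d > 1$). Then the process $Z$ given by (11) has limiting distribution given by (4).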
This follows simply by noting that the instants $A^{(1)}, A^{(1)} + A^{(2)}, \ldots$ at which a new customer takes over in the construction of $Z$ are regeneration points (the process starts afresh as from $n = 0$) and appealing to the general theory of regenerative processes ([10], Ch. 6).
It should be noted, however, that Z is not a Markov chain. This follows by noting that
$$P(Z_{n+1} = \ell \mid Z_n = k) = \sum_{a=0}^{\infty} P(\xi_n = a)\Bigl\{ p_{k\ell}\, P(\xi_n > a+1 \mid \xi_n > a) + \delta_{\ell \ell_0}\, P(\xi_n = a+1 \mid \xi_n > a) \Bigr\} \tag{12}$$
where $\xi_n$ is the time elapsed since the last regeneration point (backward recurrence time; $\xi_n = n$ for $0 \le n < A^{(1)}$, $\xi_n = n - A^{(1)}$ for $A^{(1)} \le n < A^{(1)} + A^{(2)}$, etc.). Here the distribution of $\xi_n$ depends on $n$, except for the special case where $A$ is geometric, and so must (12) (the argument excludes time-homogeneity, but the Markov property can also be seen to fail).
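The limiting statement of Proposition 3 can be illustrated by simulation. The following sketch rests on assumptions that are not from the paper: a hypothetical three-class rule (one class down after a claim-free year, one class up otherwise), an arbitrary λ, a hypothetical sojourn-time distribution `f`, and a fixed random seed. It concatenates the bonus-class paths of successive customers and compares the empirical class frequencies with the age-corrected distribution computed exactly.

```python
import math
import random

def age_corrected(p0, f, start, K):
    """Exact (1/mu_A) * sum_n P(A > n) * e_start P^n for the toy chain; f[a-1] = P(A = a)."""
    mu_A = sum(a * fa for a, fa in enumerate(f, start=1))
    v = [1.0 if j == start else 0.0 for j in range(K)]
    acc, tail = [0.0] * K, 1.0                 # tail = P(A > n)
    for fa in f:
        acc = [x + tail * y for x, y in zip(acc, v)]
        tail -= fa
        nxt = [0.0] * K
        for i, vi in enumerate(v):
            nxt[max(i - 1, 0)] += vi * p0              # claim-free year
            nxt[min(i + 1, K - 1)] += vi * (1.0 - p0)  # at least one claim
        v = nxt
    return [x / mu_A for x in acc]

def sieve_frequencies(p0, f, start, K, n_customers, seed=1):
    """Empirical class frequencies of the concatenated process Z."""
    rng = random.Random(seed)
    ages = list(range(1, len(f) + 1))
    counts = [0] * K
    for _customer in range(n_customers):
        a = rng.choices(ages, weights=f)[0]    # sojourn time of this customer
        cls = start                            # every customer enters in the same class
        for _year in range(a):
            counts[cls] += 1
            claim_free = rng.random() < p0
            cls = max(cls - 1, 0) if claim_free else min(cls + 1, K - 1)
    total = sum(counts)
    return [c / total for c in counts]

p0 = math.exp(-0.5)                 # P(no claim) for lambda = 0.5 (arbitrary choice)
f = [0.1, 0.2, 0.3, 0.4]            # hypothetical pmf of A on 1..4
exact = age_corrected(p0, f, start=1, K=3)
simulated = sieve_frequencies(p0, f, start=1, K=3, n_customers=50000)
print(exact)
print(simulated)
```

The frequencies are time averages over the concatenated path, which is what the regenerative argument controls; agreement is only up to Monte Carlo error.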

C. A More General Model

We here suggest a model which incorporates several features not covered by the basic bonus-malus model considered in the body of the paper.
We assume that a customer is characterized by a random mark $M$ taking values in some set $\mathcal{M}$ and a time-homogeneous Markov chain $X = (X_0, X_1, \ldots)$ with state space $\mathcal{X} = \mathbb{N} \times \mathcal{L} \times \mathcal{Y}$, where $\mathcal{L}$ is the finite set of bonus classes and $\mathcal{Y}$ is finite or countable. We write $X_k = (N_k, L_k, Y_k)$; $N_k$ is the number of claims in year $k = 0, 1, \ldots$, assumed Poisson with parameter $\lambda(M, L_k, Y_k)$, and $L_k$ is the bonus class. The initial class $L_0$ as well as $Y_0$ depends on $M$.
In addition to potentially influencing the Poisson parameter, the $Y$ component also generates the sojourn time $A$: it is assumed that a customer still in the portfolio in year $k = 0, 1, \ldots$ will no longer be there in year $k + 1$ w.p. $\delta(M, Y_k)$, so that then $A = k + 1$. The further transition rules state that $L_{k+1}$ is calculated as a deterministic function $b(N_k, L_k, Y_k)$ of $X_k = (N_k, L_k, Y_k)$, and given $N_k = n$, $L_k = \ell$, $Y_k = y$, one has $Y_{k+1} = y'$ w.p. $q_{y'; n, \ell, y}$.
Example 2. To cover the classical model with independent sojourn time considered in the rest of the paper, take $M = \Lambda$ as the Poisson parameter of the customer and $\lambda(m, \ell, y) = m$, $L_0 = \ell_0$. If $A$ has some general distribution with point probabilities $f_n$, $n = 1, 2, \ldots$, there are at least two ways to conform to the general framework above. Both are familiar from the theory of discrete phase-type distributions ([10], III.4; [20], Sections IX.1 and A5).
In the first, we take the state space for the $Y$-chain to be $\mathcal{Y} = \mathbb{N} \cup \{\Delta\}$ for some extra state $\Delta$, and $Y_0 = 0$. From state $y \ne \Delta$ (corresponding to still being in the portfolio in year $y$) one can go only to either $y + 1$ or $\Delta$, w.p. $f_{y+1}/\bar f_{y+1}$ for $\Delta$ and $1 - f_{y+1}/\bar f_{y+1}$ for $y + 1$ (thus $q_{y'; n, \ell, y}$ does not depend on $n, \ell$); state $\Delta$ (the coffin state) is absorbing. In the second, we take $\mathcal{Y} = \mathbb{N}$ and $Y_0 = a$ w.p. $f_a$. From $Y_n = m > 0$, one always goes to $Y_{n+1} = m - 1$, whereas state 0 is absorbing (thus $q_{y'; n, \ell, y} = \delta_{y', y-1}$ if $y > 0$, $q_{y'; n, \ell, y} = \delta_{y', 0}$ if $y = 0$).     ☐
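The first representation can be checked in a few lines. The sketch below assumes that $\bar f_n$ denotes the tail $\sum_{k \ge n} f_k$, which is the reading under which $f_{y+1}/\bar f_{y+1}$ is the conditional probability of leaving after year $y$; the distribution `f` is a hypothetical example. It builds the transition probabilities to $\Delta$ and recovers the distribution of $A$ as the absorption time, using exact rational arithmetic.

```python
from fractions import Fraction

def hazards(f):
    """h[y] = f_{y+1} / fbar_{y+1}, with fbar_n = sum_{k >= n} f_k (assumed meaning of the bar)."""
    h, tail = [], sum(f)          # tail starts at fbar_1 = 1
    for fa in f:                  # fa = f_{y+1} for y = 0, 1, ...
        h.append(fa / tail)
        tail -= fa                # now fbar_{y+2}
    return h

def pmf_from_hazards(h):
    """P(A = y + 1) = h[y] * prod_{j < y} (1 - h[j]): absorption time of the Y-chain in Delta."""
    pmf, alive = [], Fraction(1)  # alive = P(Y-chain has reached state y)
    for hy in h:
        pmf.append(alive * hy)
        alive *= 1 - hy
    return pmf

f = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]  # hypothetical pmf of A
h = hazards(f)
print(h)
print(pmf_from_hazards(h) == f)
```

Working with `Fraction` keeps the check exact; with floating-point numbers the comparison would need a tolerance.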
Example 3. Bonus hunger, i.e., the insured’s tendency not to file sufficiently small claims in order to avoid an increase in future premiums, has been studied repeatedly in the literature. For example, Lemaire [21] (see also Lemaire & Zi [12]) calculates, for a given bonus system, each class $\ell$ and each λ, a retention level $\bar z_\ell$ such that the insured’s cost of covering a claim of size $Z$ himself and the cost in expected future premiums precisely balance when $Z = \bar z_\ell$ and he is in class $\ell$. He will then file the claim if $Z > \bar z_\ell$ and not otherwise. The calculation is based on a distribution $G$ of a claim $Z$. We can model this by simply modifying Example 2, taking $\lambda(m, \ell, y) = m\bigl(1 - G(\bar z_\ell)\bigr)$ rather than $\lambda(m, \ell, y) = m$.     ☐
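As a rough illustration of the modified rate in Example 3, the sketch below computes class-dependent filing rates and the resulting transition matrix. The exponential claim-size distribution $G$, the retention levels, and the three-class bonus rule are all hypothetical choices, not values from [21] or [12]. It also assumes claim sizes are i.i.d. and independent of the claim count, so that by Poisson thinning the number of filed claims is again Poisson with the reduced rate.

```python
import math

def filing_rate(m, retention, mean_claim):
    """m * (1 - G(zbar)) for exponential G, i.e. G(z) = 1 - exp(-z / mean_claim)."""
    return m * math.exp(-retention / mean_claim)

def transition_matrix(m, retentions, mean_claim):
    """Toy rule (assumption): no filed claim -> one class down, otherwise one class up."""
    K = len(retentions)
    P = [[0.0] * K for _ in range(K)]
    for l, zbar in enumerate(retentions):
        p0 = math.exp(-filing_rate(m, zbar, mean_claim))   # P(no filed claim in class l)
        P[l][max(l - 1, 0)] += p0
        P[l][min(l + 1, K - 1)] += 1.0 - p0
    return P

retentions = [0.8, 0.5, 0.0]   # hypothetical zbar_l, in units of the mean claim size
rates = [filing_rate(0.2, z, 1.0) for z in retentions]
P = transition_matrix(0.2, retentions, 1.0)
print(rates)
print(P)
```

With these made-up retention levels, a customer in the lowest class files the smallest share of claims, so the rows of the transition matrix now differ between classes in more than their boundary behaviour.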
Example 4. An insured may be tempted to look for another insurer if he has had many recent claims, and therefore a high bonus class, and so (often without reason!) believes that his present insurer’s system is unfair. We can model this by modifying the first representation of $A$ in Example 2 by allowing direct transitions from a state $y$ of $\mathcal{Y}$ to the coffin state $\Delta$, occurring with a probability $\theta_\ell$ depending on the present bonus class $\ell$ (typically $\theta_\ell$ will be increasing in $\ell$). Thus
$$q_{y+1; n, \ell, y} = (1 - f_{y+1}/\bar f_{y+1})(1 - \theta_\ell), \qquad q_{\Delta; n, \ell, y} = f_{y+1}/\bar f_{y+1} + (1 - f_{y+1}/\bar f_{y+1})\,\theta_\ell = \theta_\ell + (1 - \theta_\ell)\, f_{y+1}/\bar f_{y+1},$$
all other $q_{y'; n, \ell, y} = 0$.     ☐
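To see what class-dependent lapsing does to the sojourn time, the following sketch propagates the joint distribution of (bonus class, still in portfolio) year by year and reads off the distribution of $A$. It is only an illustration under assumptions not made in the paper: a hypothetical three-class rule, hypothetical lapse probabilities `theta`, a hypothetical baseline distribution `f`, and $\bar f_n$ read as the tail $\sum_{k \ge n} f_k$.

```python
import math

def sojourn_pmf_with_lapse(p0, f, theta, start):
    """pmf of A when a customer in class l additionally leaves w.p. theta[l] each year."""
    K = len(theta)
    alive = [1.0 if l == start else 0.0 for l in range(K)]  # P(in portfolio in year y, class l)
    pmf, tail = [], 1.0                                     # tail = fbar_{y+1}
    for fa in f:
        h = fa / tail if tail > 0 else 1.0                  # baseline f_{y+1} / fbar_{y+1}
        leave = [h + (1.0 - h) * th for th in theta]        # q_{Delta; n, l, y}
        pmf.append(sum(al * lv for al, lv in zip(alive, leave)))
        nxt = [0.0] * K
        for l, al in enumerate(alive):
            stay = al * (1.0 - leave[l])                    # mass moving on to y + 1
            nxt[max(l - 1, 0)] += stay * p0                 # claim-free year
            nxt[min(l + 1, K - 1)] += stay * (1.0 - p0)     # at least one claim
        alive, tail = nxt, tail - fa
    return pmf

def mean(pmf):
    """Mean of a pmf on 1, 2, ..., len(pmf)."""
    return sum(a * pa for a, pa in enumerate(pmf, start=1))

p0 = math.exp(-0.5)                          # P(no claim) for lambda = 0.5 (arbitrary choice)
f = [0.1, 0.2, 0.3, 0.4]                     # hypothetical baseline pmf of A
no_lapse = sojourn_pmf_with_lapse(p0, f, theta=[0.0, 0.0, 0.0], start=0)
with_lapse = sojourn_pmf_with_lapse(p0, f, theta=[0.0, 0.1, 0.3], start=0)
print(mean(no_lapse), mean(with_lapse))
```

With all `theta` equal to zero the baseline distribution is recovered; positive lapse probabilities shorten the mean sojourn time and make $A$ dependent on the claims history, which is exactly the feature the general model is meant to capture.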
Example 5. Young or old drivers are generally considered to have risk characteristics different from the rest of the portfolio. We can model this by letting the mark $M$ be the pair of the Poisson parameter $\Lambda$ and (for a young driver) the number of years $B = 0, 1, \ldots$ since obtaining the driver’s license (or the age for an old driver), and by letting a state of $\mathcal{Y}$ be of the form $(y', y'')$, with $y'$ determining $A$ as above and $y''$ the updated number of years since the license. The initial class $L_0$ is then chosen as a function of $B$, and one could have, e.g., that $\lambda\bigl((\lambda, b), \ell, (y', y'')\bigr)$ has the multiplicative form $s(y'')\lambda$.     ☐
Further examples, not spelled out in detail, include marks $M$ containing covariates that enter multiplicatively in $\lambda(m, \ell, y)$.
  • 1 These references have as their main theme not the Bayes premium but rather the credibility premium, also called the linear Bayes premium, computed as the minimizer of $\mathbb{E}(a_m - \Lambda)^2$ in the class of all linear functions $a_m$ of $N_0, N_1, \ldots, N_{m-1}$; for the Gamma example and many others, the Bayes premium and the linear Bayes premium coincide. The motivation for considering linear predictors only is computational ease.
  • 2 Once this rule is chosen, one can also determine the optimal initial bonus level $\ell_0$ by a further mean square error minimization, cf. [6]. We do not go into this here.
  • 3 For convenience, the dependence on λ is suppressed in the notation. In the Bayesian set-up, E y a r.v. with a continuous distribution so that strictly speaking, E y = M y = 0 a.s.; the set-up and statements then have to be understood in a suitable conditional sense.
  • 4 To be strict, one needs also to define some ordering of customers. These matters are to our mind of formal nature rather than intrinsically difficult, and so we omit the details.
  • 5 This value is also compatible with the data in a recent study of a Greek portfolio, Tsougas [13]. See, however, Section 4.4.

Citation: Asmussen, Søren. 2014. "Modeling and Performance of Bonus-Malus Systems: Stationarity versus Age-Correction" Risks 2, no. 1: 49-73. https://doi.org/10.3390/risks2010049