1. Introduction
In the classical actuarial model for bonus-malus systems in automobile insurance (Denuit et al. [1] or Lemaire [2], for example), there is a finite set of bonus classes. A customer having n claims and bonus class ℓ in a given year has bonus class $b(\ell, n)$ in the next for some deterministic function b (the bonus rule; claim sizes are ignored and only claim numbers counted). The customer has a risk parameter λ, such that the numbers of claims $N_1, N_2, \ldots$ in consecutive years are i.i.d. Poisson(λ), and so the sequence $L_0, L_1, \ldots$ of bonus classes is a time-homogeneous Markov chain with transition matrix $P(\lambda) = (p_{\ell k}(\lambda))$, where $p_{\ell k}(\lambda) = \sum_{n:\, b(\ell, n) = k} e^{-\lambda}\lambda^n/n!$ (such a customer we denote a λ-customer). Customers pay premium $c(\ell)$ when in class ℓ and enter the system in some fixed class $\ell_0$. The $b(\ell, n)$ and $c(\ell)$ may be chosen according to certain optimality and/or financial equilibrium principles (see below) or arbitrarily.
For a simple example of a bonus rule, consider the −1/+2 rule. Here each claim causes the bonus level to increase by 2, whereas it decreases by 1 for each claim-free year (with obvious modifications at the boundary levels). The systems in use are often substantially more detailed, with K of order 15–25.
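The way such a rule turns into a transition matrix can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class labelling 0, …, K, the clipping of moves at the two boundary classes, and the function names are hypothetical choices, not taken from the paper.

```python
import math

def poisson_pmf(n, lam):
    """Probability of n claims in a year for a customer with parameter lam."""
    return math.exp(-lam) * lam ** n / math.factorial(n)

def transition_matrix(lam, K):
    """Transition matrix of the -1/+2 rule on classes 0..K (assumed labelling).

    A claim-free year moves one class down, n >= 1 claims move 2n classes up;
    moves are clipped to stay inside 0..K (assumed boundary modification).
    """
    P = [[0.0] * (K + 1) for _ in range(K + 1)]
    for l in range(K + 1):
        P[l][max(l - 1, 0)] += poisson_pmf(0, lam)
        tail = 1.0 - poisson_pmf(0, lam)   # P(at least one claim)
        n = 1
        while l + 2 * n < K:
            p = poisson_pmf(n, lam)
            P[l][l + 2 * n] += p
            tail -= p
            n += 1
        P[l][K] += tail                    # all remaining claim counts end in class K
    return P

P = transition_matrix(0.1, 5)              # hypothetical lam = 0.1 and K = 5
```

Each row of `P` is the distribution of next year's class given this year's class, so every row sums to one.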
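Much of the discussion in the literature employs stationarity modeling, measuring characteristics of the system via the stationary distribution $\pi(\lambda) = (\pi_\ell(\lambda))$ (existing under the weak assumptions of irreducibility and aperiodicity). In particular, the average premium $\bar c(\lambda)$ of a customer with risk parameter λ is defined by
$$\bar c(\lambda) = \sum_{\ell} \pi_\ell(\lambda)\, c(\ell) = \lim_{n\to\infty} \sum_{\ell} p^n_{\ell_0\ell}(\lambda)\, c(\ell),$$
where the $p^n_{\ell_0\ell}(\lambda)$ are the n-step transition probabilities. From the $\bar c(\lambda)$, one often proceeds to calculate the Loimaranta efficiency $e(\lambda) = \mathrm{d}\log \bar c(\lambda)/\mathrm{d}\log \lambda$ at λ (called elasticity outside the actuarial sciences); it measures to what extent $\bar c(\lambda)$ is linear at λ (as should ideally be the case), with $e(\lambda) = 1$ expressing ‘local linearity’ at λ.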
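A rough numerical sketch of the stationary average premium and of the efficiency as a logarithmic derivative is given below. Everything concrete in it is an assumption for illustration: the −1/+2 rule on classes 0, …, K with clipping at the boundaries, the premium scale `PREMIUMS`, the use of plain power iteration, and the finite-difference step `h`.

```python
import math

def transition_matrix(lam, K):
    """-1/+2 rule on classes 0..K with clipping at the boundaries (assumed)."""
    p0 = math.exp(-lam)
    P = [[0.0] * (K + 1) for _ in range(K + 1)]
    for l in range(K + 1):
        P[l][max(l - 1, 0)] += p0
        tail, n = 1.0 - p0, 1
        while l + 2 * n < K:
            p = p0 * lam ** n / math.factorial(n)
            P[l][l + 2 * n] += p
            tail -= p
            n += 1
        P[l][K] += tail
    return P

def stationary(P, iters=2000):
    """Approximate the stationary distribution by power iteration."""
    size = len(P)
    pi = [1.0 / size] * size
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(size)) for j in range(size)]
    return pi

def average_premium(lam, premiums):
    """Stationary average premium: sum over classes of pi_l(lam) * c(l)."""
    pi = stationary(transition_matrix(lam, len(premiums) - 1))
    return sum(p * c for p, c in zip(pi, premiums))

def efficiency(lam, premiums, h=1e-4):
    """Central finite difference for d log(average premium) / d log(lam)."""
    up = math.log(average_premium(lam * (1 + h), premiums))
    down = math.log(average_premium(lam * (1 - h), premiums))
    return (up - down) / (math.log(1 + h) - math.log(1 - h))

PREMIUMS = [0.5, 0.6, 0.7, 0.8, 0.9, 1.0]   # hypothetical premium scale c(0..5)
eff = efficiency(0.1, PREMIUMS)
```

A value of `eff` well below 1 would indicate that the average premium reacts less than proportionally to changes in λ for this toy system.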
Such stationary performance measures are only meaningful if the Markov chain L attains (approximate) stationarity within the typical time a customer spends in the portfolio. For this reason much attention has been given to studying the approach of the $p^n_{\ell_0\ell}(\lambda)$ to $\pi_\ell(\lambda)$. The rate of convergence is known to be geometric, with decay parameter the second largest eigenvalue of the transition matrix $P(\lambda)$. However, this is an asymptotic result and so the studies are most often numerical, depicting for example the mean annual premium $\sum_\ell p^n_{\ell_0\ell}(\lambda)\, c(\ell)$ or the total variation (t.v.) distance between the n-step distribution and $\pi(\lambda)$ as function of n (see, for example, Denuit et al. [1], p. 183ff). The results are sometimes encouraging: for some bonus systems, approximate stationarity is attained already for n = 4–6. However, these are typically simple-minded systems, and for the more realistic ones, one often sees a substantial value of the t.v. distance even for values of n exceeding the time span a customer can be expected to stay in the portfolio. Nevertheless, the studies of the effects of the sojourn time A in the portfolio being finite are remarkably few, with Borgan et al. [3] being the main exception. One purpose of this paper is to go deeper into this direction and to formulate an alternative (which we call age-correction) to the stationarity point of view.
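The kind of numerical convergence study described above can be sketched as follows. The sketch assumes the −1/+2 rule on classes 0, …, K, a hypothetical entrance class `l0`, and the convention that the t.v. distance is half the L1 distance; none of these choices is taken from the paper.

```python
import math

def transition_matrix(lam, K):
    """-1/+2 rule on classes 0..K with clipping at the boundaries (assumed)."""
    p0 = math.exp(-lam)
    P = [[0.0] * (K + 1) for _ in range(K + 1)]
    for l in range(K + 1):
        P[l][max(l - 1, 0)] += p0
        tail, n = 1.0 - p0, 1
        while l + 2 * n < K:
            p = p0 * lam ** n / math.factorial(n)
            P[l][l + 2 * n] += p
            tail -= p
            n += 1
        P[l][K] += tail
    return P

def step(dist, P):
    """One year ahead: row vector dist times matrix P."""
    size = len(P)
    return [sum(dist[i] * P[i][j] for i in range(size)) for j in range(size)]

def tv_profile(lam, K, l0, horizon):
    """t.v. distance between the n-step distribution from l0 and the
    stationary distribution, for n = 0..horizon (half-L1 convention)."""
    P = transition_matrix(lam, K)
    pi = [1.0 / (K + 1)] * (K + 1)
    for _ in range(2000):                  # power iteration for pi
        pi = step(pi, P)
    dist = [1.0 if j == l0 else 0.0 for j in range(K + 1)]
    profile = []
    for _ in range(horizon + 1):
        profile.append(0.5 * sum(abs(d - p) for d, p in zip(dist, pi)))
        dist = step(dist, P)
    return profile

profile = tv_profile(lam=0.1, K=5, l0=5, horizon=200)   # hypothetical inputs
```

Plotting `profile` against n gives the type of curve used to judge whether stationarity is reached within a realistic sojourn time.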
Bonus-malus systems may be seen as an example of experience rating, which aims to calculate the premium on an individual basis by using the information available to the company. In the automobile insurance setting, we ignore in this paper profit, administration costs etc., and take the average claim size equal to 1, so that after a given year m, the company would want to compute its net premium for the following year as its best guess of the customer’s λ as function of the numbers $N_1, \ldots, N_m$ of claims filed in years 1, …, m. The naive guess is of course the average $(N_1 + \cdots + N_m)/m$. However, a high value of this average could be due to bad luck of an otherwise good driver and a low value to good luck of an otherwise bad driver. An estimate which is more fair to the customer is therefore obtained by a Bayesian view where one involves information on the population of customers in the form of a prior distribution, say U, of λ, views the particular customer’s λ as the outcome of a r.v. Λ with distribution U, calculates the posterior distribution of Λ given $N_1, \ldots, N_m$ and takes the premium as the mean of this posterior, the Bayes premium.
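Example 1. An often considered choice for the prior U is a Gamma(α, β) distribution with density $\beta^\alpha \lambda^{\alpha-1} e^{-\beta\lambda}/\Gamma(\alpha)$. This can be motivated by negative binomial fitting (e.g., Denuit et al. [1], p. 28), but is also mathematically convenient since then the posterior is again Gamma with parameters $\alpha + N_1 + \cdots + N_m$ and $\beta + m$, and one gets the Bayes premium as the posterior mean
$$\frac{\alpha + N_1 + \cdots + N_m}{\beta + m} = \frac{\beta}{\beta + m}\cdot\frac{\alpha}{\beta} + \frac{m}{\beta + m}\cdot\frac{N_1 + \cdots + N_m}{m}.$$
This has a neat interpretation as a weighted average of the population mean α/β (the premium the company will charge without access to claims statistics) and the mean $(N_1 + \cdots + N_m)/m$ of the claims, with the weight m/(β + m) of the latter increasing to 1 as years go by and the information on the customer accumulates. ☐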
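The conjugate update behind this example is easy to check numerically. The sketch below assumes the Gamma prior is parametrized by a shape `alpha` and a rate `beta` (so the prior mean is `alpha / beta`); the claim history and parameter values are hypothetical.

```python
def bayes_premium(claims, alpha, beta):
    """Posterior mean of the Poisson parameter under a Gamma(alpha, beta)
    prior (shape alpha, rate beta: an assumed parametrization)."""
    return (alpha + sum(claims)) / (beta + len(claims))

def credibility_form(claims, alpha, beta):
    """Same premium written as a weighted average of the prior mean and the
    observed mean number of claims per year."""
    m = len(claims)
    weight = m / (beta + m)               # weight of the observed mean
    return (1 - weight) * (alpha / beta) + weight * (sum(claims) / m)

claims = [0, 1, 0, 0, 2]                  # hypothetical claim counts, years 1..5
premium = bayes_premium(claims, alpha=1.5, beta=10.0)
```

With these inputs the prior mean is 0.15 and the observed mean is 0.6; after five years the weight on the observed mean is 1/3, which places the premium between the two.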
The Bayes premium enjoys the optimality property of minimizing the quadratic loss $E\big[\Lambda - \varphi(N_1, \ldots, N_m)\big]^2$ in the class of all functions φ of $N_1, \ldots, N_m$, the natural class of predictors of Λ using information on $N_1, \ldots, N_m$. In fact, it is standard that the solution to this minimization problem is the conditional expectation $E[\Lambda \mid N_1, \ldots, N_m]$. For these facts and the general theory of Bayes premiums, see Bühlmann [4], Bühlmann & Gisler [5] and Denuit et al. ([1], Ch. 3).
Given the above optimality property, the Bayes premium can be viewed as the optimal fair choice of the insurer’s premium (it is often argued that the reason that bonus-malus systems are used instead in practice is that they are easier to understand for the average customer, who would not know about prior and posterior distributions). Nevertheless, as noted by Norberg [6], the Bayesian view is highly relevant also in bonus-malus systems but for a different purpose, to compute the premiums $c(\ell)$ in the different bonus levels. To this end, the idea of basing the premium on the bonus level means that one chooses the minimizer of $E[\Lambda - c(L)]^2$ in the class of all functions c of the bonus level L. One then needs to specify what is meant by this level, and to avoid the dependency on m, the choice of [6] and much subsequent literature is a r.v. $L_\infty$ which, given Λ = λ, is distributed as $\pi(\lambda)$ (recall that $\pi(\lambda)$ is the stationary distribution of the Markov chain L when the customer’s Poisson parameter is λ). Using the general $L^2$-theory quoted above, the minimizer is $c(L_\infty) = E[\Lambda \mid L_\infty]$, so that the optimal premium in class ℓ is $c(\ell) = E[\Lambda \mid L_\infty = \ell]$. Evaluating the conditional expectation, we get
$$c(\ell) = \frac{\int_0^\infty \lambda\, \pi_\ell(\lambda)\, U(\mathrm{d}\lambda)}{\int_0^\infty \pi_\ell(\lambda)\, U(\mathrm{d}\lambda)}. \qquad (3)$$
Formula (3) gives the premium rule of the bonus system which is optimal from the point of view of minimizing the error in predicting a customer’s Λ.
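A small numerical sketch of the premium rule as a ratio of prior-weighted stationary probabilities follows. To keep it free of numerical integration, the continuous prior is swapped for a hypothetical two-point prior, so the integrals become finite sums; the −1/+2 rule on classes 0, …, K and all names are likewise assumptions for illustration.

```python
import math

def transition_matrix(lam, K):
    """-1/+2 rule on classes 0..K with clipping at the boundaries (assumed)."""
    p0 = math.exp(-lam)
    P = [[0.0] * (K + 1) for _ in range(K + 1)]
    for l in range(K + 1):
        P[l][max(l - 1, 0)] += p0
        tail, n = 1.0 - p0, 1
        while l + 2 * n < K:
            p = p0 * lam ** n / math.factorial(n)
            P[l][l + 2 * n] += p
            tail -= p
            n += 1
        P[l][K] += tail
    return P

def stationary(P, iters=2000):
    """Approximate the stationary distribution by power iteration."""
    size = len(P)
    pi = [1.0 / size] * size
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(size)) for j in range(size)]
    return pi

def bayes_relativities(lams, weights, K):
    """Premium per class as E[Lambda | class] under a discrete prior putting
    mass weights[j] on lams[j] (stand-in for a continuous prior)."""
    pis = [stationary(transition_matrix(lam, K)) for lam in lams]
    premiums, class_probs = [], []
    for l in range(K + 1):
        num = sum(w * lam * pi[l] for w, lam, pi in zip(weights, lams, pis))
        den = sum(w * pi[l] for w, pi in zip(weights, pis))
        premiums.append(num / den)
        class_probs.append(den)           # unconditional probability of class l
    return premiums, class_probs

LAMS, WEIGHTS = [0.05, 0.3], [0.8, 0.2]   # hypothetical good and bad drivers
premiums, class_probs = bayes_relativities(LAMS, WEIGHTS, K=5)
expected_premium = sum(c * q for c, q in zip(premiums, class_probs))
expected_claims = sum(w * lam for w, lam in zip(WEIGHTS, LAMS))
```

Comparing `expected_premium` with `expected_claims` illustrates the balance between premium income and claims that holds when classes are weighted by their stationary probabilities.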
The rule (3) enjoys in a certain sense the principle of financial equilibrium (for the company), which asserts that on average, premium incomes and payments of claims should balance. Namely, the expected claims in a year of a typical customer are $E\Lambda$ and his expected premiums are $E\,c(L_\infty) = E\big[E[\Lambda \mid L_\infty]\big]$, assuming that a typical customer’s bonus class is distributed as the stationary r.v. $L_\infty$. By the tower property of conditional expectations, these expressions coincide. However, the point we take in this paper is that this assumption is questionable and needs further discussion.
Remark 1. The above set-up ignores claim amounts (for examples of discussion involving also claim severity, see Frangos & Vontos [
7] and Mahmoudvand & Hassani [
8]). For simplicity, we will assume that the monetary scale is chosen such that the mean claim size is one. ☐
Remark 2. Regulations vary greatly from country to country. At one extreme, all insurance companies are obliged to use the same bonus-malus system, at the other they have complete freedom. The general tendency has gone towards deregulation. A detailed survey of the situation in Europe as of year 2000 concerning such rules is in Meyer [
9]. Of course, much has changed since then but still, [
9] will serve to give an impression of many practical issues connected with motor insurance. ☐
The paper is organized as follows. In Section 2, we introduce our age-correction approach and give some of its simple properties. The fundamental formula (4) gives the age-corrected version of π, expressed in terms of the sojourn distribution of a customer in the population. Section 3, Section 4, Section 5 and Section 6 then contain an extensive numerical study of its behavior in concrete cases and how it compares to the traditional stationarity-based approach. A concluding discussion, including more careful references to the literature, is in Section 7, and the Appendix contains some complements as well as an outline of a more general modeling approach via Markov chains.
2. Independent Sojourn Times
Motivated by the criticism of the traditional use of stationarity, we now assume that a customer stays in the portfolio only for a finite number of years A, i.e., he is in the portfolio in years 0, 1, …, A − 1 after entering. We further assume that A is independent of his Λ and his claim sequence $N_1, N_2, \ldots$, and thereby of his sequence of bonus levels, and that A has finite mean. This independence assumption is crucial but of course questionable. For example, an insured with a high Λ will typically be in a high bonus class and be more prone to change company than one with a small Λ. The distribution of A is denoted by F, the point probabilities by $f_n = P(A = n)$, and we write $\mu = EA$, $f^e_n = P(A > n)/\mu$. Here $(f^e_n)$ is the equilibrium distribution familiar from renewal theory, cf. ([10], V.3). It gives the distribution of the time elapsed since the last renewal, an interpretation that matches nicely the alternative derivation we give in Appendix B.
Much of the discussion of Section 1 remains relevant; we only need, for each value λ of the Poisson parameter, to replace the stationary distribution $\pi(\lambda)$ of the bonus level by the distribution of the typical bonus level.
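We then need to specify what is meant by the ‘typical bonus level’ of a λ-customer, and our suggestion is to define this as a r.v. whose distribution $\pi_A(\lambda)$ has point probabilities
$$\pi_{A,\ell}(\lambda) = \frac{1}{\mu} \sum_{n=0}^{\infty} P(A > n)\, p^n_{\ell_0\ell}(\lambda) \qquad (4)$$
and is denoted the age-corrected distribution in the rest of the paper. Here $\mu = EA$ and the $p^n_{\ell_0\ell}(\lambda)$ are the n-step transition probabilities from the entrance class $\ell_0$. Expression (4) is fundamental for the paper and may be approached in various ways. We choose here the set-up in the following Theorem 1 where the interpretation is as a limiting long-term average of bonus classes of λ-customers seen by the company (for an alternative, see Appendix B).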
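The age-corrected distribution can be computed directly as an equilibrium-weighted average of n-step distributions. The sketch below assumes that the weights are P(A > n)/EA for n = 0, 1, …, that A takes values 1, 2, … with a finite support supplied as a list, and that the bonus rule is the −1/+2 rule on classes 0, …, K; the entrance class and all numbers are hypothetical.

```python
import math

def transition_matrix(lam, K):
    """-1/+2 rule on classes 0..K with clipping at the boundaries (assumed)."""
    p0 = math.exp(-lam)
    P = [[0.0] * (K + 1) for _ in range(K + 1)]
    for l in range(K + 1):
        P[l][max(l - 1, 0)] += p0
        tail, n = 1.0 - p0, 1
        while l + 2 * n < K:
            p = p0 * lam ** n / math.factorial(n)
            P[l][l + 2 * n] += p
            tail -= p
            n += 1
        P[l][K] += tail
    return P

def step(dist, P):
    """One year ahead: row vector dist times matrix P."""
    size = len(P)
    return [sum(dist[i] * P[i][j] for i in range(size)) for j in range(size)]

def age_corrected(P, l0, f):
    """Age-corrected distribution for a sojourn time A with P(A = k) = f[k-1],
    k = 1..len(f), and entrance class l0 (assumed form of the weights)."""
    mu = sum(k * fk for k, fk in enumerate(f, start=1))
    dist = [1.0 if j == l0 else 0.0 for j in range(len(P))]
    out = [0.0] * len(P)
    tail = 1.0                             # P(A > 0), since A >= 1
    for fk in f:
        out = [o + tail * d / mu for o, d in zip(out, dist)]
        tail -= fk                         # now P(A > n + 1)
        dist = step(dist, P)               # now the (n + 1)-step distribution
    return out

P = transition_matrix(0.1, 5)
pi_A = age_corrected(P, l0=3, f=[0.1, 0.2, 0.3, 0.4])   # hypothetical inputs
```

If the customer stays exactly one year, the result is the unit mass at the entrance class, which is a convenient sanity check.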
Before stating the Theorem, we need some notation and assumptions. Let
denote the number of
λ-customers in the portfolio in year
,
the number of
λ-customers entering the portfolio
and
,
the bonus classes of the
λ-customers,
the time since they entered the portfolio. Assume that the
are i.i.d. with finite mean
and independent of the
with
, and that the
for different customers are i.i.d. and independent of the
.
Theorem 1. Under the above assumptions, the limit relations (5) and (6) hold a.s. as Y → ∞ for all ℓ, no matter the initial conditions.
Proof. We can write
where
is the time-in-portfolio-before-
Y of
λ-customers that first entered the portfolio at some time
,
the similar time of those that entered at some time
and left before
, and
the time of those that entered at some time
but still remain in the portfolio at time
(and possibly after). Here we can bound
by
, the total-time-in-portfolio (not necessarily before
Y) of
λ-customers that first entered at some time
but remained in the portfolio after
Y. By the law of large numbers,
. Further,
Combining these facts gives
A similar argument shows that the total times
or
λ-customers ever spend in bonus class
ℓ (not necessarily before
Y) is of order
and that the
contribution to the l.h.s. of (6) dominates the
and
contributions. Combining with (7) gives (6). The proof of (5) is similar, though slightly easier. ☐
Remark 3. The analysis in the proof of Theorem 1 is similar to the one of a discrete time G/G/∞ queue. The number of λ-customers in the portfolio in year y then plays the role of the queue length at time y, and the time since a customer entered the portfolio that of the customer’s elapsed service time. We will not use this connection and hence leave out further details. ☐
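It may be noted that expression (4) may be evaluated in closed analytical form. To this end, we need the fundamental matrix of the Markov chain given by $Z = (I - P + \mathbf{1}^{\top}\pi)^{-1}$, where $\mathbf{1}$ is the (row) vector with 1 at all entries ([10], p. 31). Further let $\hat F[z] = \sum_n f_n z^n$ denote the p.g.f. of F and use the same notation $\hat F[Q] = \sum_n f_n Q^n$ for a square matrix Q. Then, with $e_\ell$ the ℓth (row) unit vector:
Theorem 2. The distribution with point probabilities (4) is given by
$$\pi_A(\lambda) = \pi(\lambda) + \frac{1}{\mu}\, e_{\ell_0}\big(I - \hat F[P(\lambda)]\big) Z(\lambda).$$
Remark 4. If the univariate p.g.f. is available in closed form, then so is usually the matrix version by obvious changes in the expression for the p.g.f. For example, for a negative binomial distribution of order 2, the p.g.f. is a rational function of z, and the matrix version is obtained by replacing z by the matrix and division by matrix inversion.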
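The closed form can be checked numerically against the defining series. The sketch below rests on two assumptions: that (4) is the sum of n-step probabilities weighted by P(A > n)/EA, and that the fundamental matrix is the inverse of I − P + 1π (with 1π the matrix whose rows all equal π). Under these, the series equals π + (1/μ) e_{ℓ0}(I − F̂[P])Z for a sojourn time with values 1, 2, …. The −1/+2 rule, the entrance class and the sojourn distribution are hypothetical.

```python
import math

def transition_matrix(lam, K):
    """-1/+2 rule on classes 0..K with clipping at the boundaries (assumed)."""
    p0 = math.exp(-lam)
    P = [[0.0] * (K + 1) for _ in range(K + 1)]
    for l in range(K + 1):
        P[l][max(l - 1, 0)] += p0
        tail, n = 1.0 - p0, 1
        while l + 2 * n < K:
            p = p0 * lam ** n / math.factorial(n)
            P[l][l + 2 * n] += p
            tail -= p
            n += 1
        P[l][K] += tail
    return P

def step(dist, P):
    """Row vector dist times matrix P."""
    size = len(P)
    return [sum(dist[i] * P[i][j] for i in range(size)) for j in range(size)]

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                factor = M[r][c] / M[c][c]
                M[r] = [a - factor * p for a, p in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def series_form(P, l0, f):
    """Direct evaluation of the weighted sum; f[k-1] = P(A = k)."""
    mu = sum(k * fk for k, fk in enumerate(f, start=1))
    dist = [1.0 if j == l0 else 0.0 for j in range(len(P))]
    out, tail = [0.0] * len(P), 1.0
    for fk in f:
        out = [o + tail * d / mu for o, d in zip(out, dist)]
        tail -= fk
        dist = step(dist, P)
    return out

def closed_form(P, l0, f):
    """pi + (1/mu) e_l0 (I - F^[P]) Z, using only vector-matrix products."""
    n = len(P)
    mu = sum(k * fk for k, fk in enumerate(f, start=1))
    pi = [1.0 / n] * n
    for _ in range(5000):                  # power iteration for pi
        pi = step(pi, P)
    dist = [1.0 if j == l0 else 0.0 for j in range(n)]
    v = dist[:]                            # accumulates e_l0 (I - F^[P])
    for fk in f:
        dist = step(dist, P)               # e_l0 P^k
        v = [vj - fk * dj for vj, dj in zip(v, dist)]
    # x = v Z solves x (I - P + 1 pi) = v; pass the transposed system to solve
    At = [[(1.0 if i == j else 0.0) - P[j][i] + pi[i] for j in range(n)]
          for i in range(n)]
    x = solve(At, v)
    return [p + xj / mu for p, xj in zip(pi, x)]

P = transition_matrix(0.1, 5)
F_PMF = [0.1, 0.2, 0.3, 0.4]               # hypothetical P(A = 1..4)
direct = series_form(P, 3, F_PMF)
closed = closed_form(P, 3, F_PMF)
```

For a sojourn distribution with unbounded support, the series would have to be truncated, whereas the closed form only needs the matrix version of the p.g.f.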
5. Relativities
We now turn to the influence of finite customer sojourn times on the Bayes premium, proceeding as follows. For each of the three selected bonus systems and of the four trial sojourn time distributions, we first compute our age-corrected alternatives to the stationary distribution $\pi(\lambda)$ by means of (4) and next the Bayes premium in bonus class ℓ by means of the analogue of (3), in which the stationary distribution is replaced by its age-corrected version. The results are shown in Figures 16, 17 and 18. The legends are solid red for distribution
, dotted red for
, solid blue for
, and dotted blue for
. As a supplement, we also show the Bayes premium corresponding to the stationary distribution (dotted black) and the premium corresponding to the given relativities for the bonus system (e.g., 50, 60, 70, 80, 90, 100 for Italy) in solid black; whereas the Bayes premium automatically yields financial equilibrium, cf. the discussion following (3), the given relativities need to be normalized to satisfy this requirement.
Figure 16.
Relativities, Ireland.
Figure 17.
Relativities, Italy.
Figure 18.
Relativities, Germany.
When interpreting the figures, we first note that it does not contradict financial equilibrium that, for a given country, one set of relativities lies below another. For example, all relativities corresponding to one of our four trial sojourn time distributions (colored graphs) are below the given relativities (solid black graph). The explanation is simply that one set of relativities should be weighted with the age-corrected distribution and the other with the stationary distribution, and the age-corrected distributions put more of their mass on the high classes.
We next note that the two distributions
,
with the low mean are quite close, in some cases even hard to distinguish. Distributions
,
have a roughly doubled mean. As could be expected, this puts them closer to the Bayesian relativity computed w.r.t. the stationary distribution. The convergence rate appears quite slow, however, and this is further illustrated in the following
Figure 2. We took here the Irish system and compared the stationarity-based Bayesian relativity (solid black) to those of four versions of the negative binomial distribution (8), one with mean 10 (dotted red), one with mean 100 (dashed red), one with mean 200 (dash-dotted red), and one with mean 400 (solid red). The figure confirms the expectation of convergence, but shows also that (as just noted) it is slow.
7. Concluding Remarks
In this paper, we have inspected how reasonable it is to view bonus-malus systems via the stationary distribution, as is usually done. The conclusion is that in many cases the transient distributions are quite far from the stationary ones, and that this has considerable consequences for the computation of such quantities as Bayesian relativities and average premiums.
We do not necessarily insist that our trial distributions for the sojourn time in the portfolio have the relevant time span. A motor insurance policy may, for example, be terminated simply because the insured gets a new car. In that case, he will typically continue with a new policy in the same company, but not enter at the same level as completely new customers. Similar remarks apply to a change of company, where usually some information on the present bonus class or general previous claim statistics is passed from the old company to the new. Therefore, our choice of the A distributions should be seen as nothing more than scenario analysis.
Examples of numerical studies of special bonus-malus systems are in Lemaire [15], Lemaire & Zi [12] and Mahmoudvand et al. [14]. These papers differ from the present one by not going into the Bayesian aspect. Here the more closely related literature is Norberg [6] and Borgan et al. [3]. In particular, [3] contains ideas on how to get away from the stationary point of view. As analogue of our age-corrected distribution, [3] suggests a distribution of the form $\sum_n w_n\, p^n_{\ell_0\ell}(\lambda)$ where the $w_n$ are suitable weights summing to one. It is also briefly mentioned that one interpretation corresponds to sampling a customer at random from the portfolio, but the connection to our age-corrected distribution, which is obtained by taking $w_n = P(A > n)/EA$, is not given. Also the concept of sampling a customer at random is not explained very clearly, cf., e.g., our Theorem 1 and Appendix B below, and in key examples the $w_n$ are taken constant on an interval whereas $P(A > n)$ is decreasing. Nevertheless, [3] contains some key ideas related to this paper, and to our mind, the paper has received surprisingly little attention in subsequent literature (but see Denuit et al. [1], Ch. 8).
Of further classical references in the bonus-malus area not cited elsewhere in the text, we mention in particular (in chronological order) Grenander [
16], Loimaranta [
17], Bonsdorff [
18] and Rolski
et al. [
19].