1. Introduction
Every non-life insurance company is obligated to compensate its policyholders for claims that meet the terms of the policy. To meet and administer these contractual obligations, the insurance company has to set up loss reserves. Since both the number and the amount of claims are random, the claims reserve must be calculated carefully: underestimation leads to solvency problems, while overestimation ties up excess capital that could be used for other purposes. Claims reserve estimation is one of the basic actuarial tasks in the insurance industry, because an adequate reserve is what allows the company to remain solvent at any future point in time. There is a variety of reserving methods for the actuary to choose from. The focus has mainly been on aggregate reserving techniques, where models analyze aggregate claims data. In recent years, considerable attention has been given to stochastic micro-level models, which use claims-related data on an individual basis, rather than aggregating by underwriting year and development period (see, e.g., [1, 2, 3]). Although stochastic micro-level models have emerged in an increasing stream of academic literature, they are not widely used by practitioners.
The most widely used models are non-stochastic macro-level models, which are merely deterministic algorithms using aggregate claims data. The basic chain-ladder model is the flagship of macro-level models (for details, see [4, 5]). The basic assumption of the chain-ladder method is that payments emerge in a similar way in each accident year. The proportionate increases in the known cumulative payments from one development year to the next can then be used to calculate the expected cumulative payments for future development years. Despite its well-known limitations, the chain-ladder remains the most widely applied claims reserving method, and several extensions of the model have been developed, for example the double chain-ladder method ([6]), which simultaneously uses a triangle of paid losses and a triangle of incurred claim counts, and the continuous chain-ladder method ([7]), which reformulates the triangular data as a histogram and proposes a continuous chain-ladder model through the use of a kernel smoother.
The chain-ladder method gives only a point estimate, whereas practitioners are also interested in the likely variability of the claims reserve. Stochastic macro-level models were first introduced to address this question. An overview of stochastic macro-level models is given by [8, 9]. A more thorough and detailed review is provided by [10]. Stochastic claims reserving starts with constructing a model that produces the actuary’s best estimate and then uses this model to estimate the prediction error. Moreover, there is a tendency to look for a model under which the best estimate is the one given by the chain-ladder. Within this group of models, the (over-dispersed) Poisson (ODP) model ([11]), gamma model ([12]), negative binomial model ([13]), log-normal model ([14]) and Mack’s model ([15]) have received considerable attention. The first four models specify the distribution of the incremental losses, while the last is a distribution-free model that only specifies the first two moments.
The mean square error of prediction (MSEP), also known as the prediction error, has been used as the precision measure for the reserve estimates in most of the literature. It can be decomposed into two components: parameter uncertainty and process uncertainty (see, e.g., [8, 16]). The former comes from the uncertainty in the estimation of the parameters of the reserving model due to the limited sample size, whereas the latter comes from the intrinsic randomness of future claims development. However, obtaining estimates for the standard error of prediction can be a difficult task. There are several analytical results for computing the prediction error (see [17]), but those estimates can be difficult to calculate or are only approximate. For that reason, the bootstrap technique is an attractive alternative. In addition, although the MSEP provides great insight into the performance of reserve estimates, other information, such as cash flows or risk measures, is also of interest. Bootstrapping can supply the full predictive distribution of the reserve estimates that such quantities require. The bootstrap technique has been extensively studied in the claims reserving framework by various authors, such as [17, 18, 19]. The chain-ladder is still the benchmark for evaluating new models in the majority of the reserving literature. However, there is a need for better tools to validate and assess the quality of predictions when comparing different reserving methods. In order to validate a reserving method and identify any needed modifications, we need to rank the competing predictive models. We propose to use scoring rules to measure the accuracy of probabilistic predictions.
The main purpose of this paper is to discuss, in a case study approach, different reserving methods based on the chain-ladder method in combination with the bootstrap method. Which residuals the bootstrap technique should be based on remains an open question. We extend the work of [19] by using another useful type of residual with bootstrapping, and we carry out a comparative study among several stochastic models, namely the (over-dispersed) Poisson, the gamma and the log-normal model. We use claims data from an Estonian insurance company for the case study, where we discuss the impact of the chosen models and residuals on the reserve estimates and prediction errors. To evaluate the goodness of fit of the models, we carry out a model assessment.
The paper is set out as follows. In Section 2, we present a brief review of generalized linear models and their application to claim reserving, while in Section 3, we discuss some aspects linked to the bootstrap methodology. Section 4 is devoted to the application of the different methods to the real-life dataset. In Section 5, the comparative analysis for model validation with the Schedule P database is carried out. It is followed by the discussion in Section 6.
2. Chain-Ladder Method as a Generalized Linear Model
In this section, we briefly introduce the basic chain-ladder method, recall how it is reformulated in the context of generalized linear models (GLM) and give a brief review of the stochastic macro-level models that will be used in the analysis. For a general introduction to GLM, we refer to [20].
Stochastic macro-level models use aggregate claims data, and one of their main advantages over non-stochastic macro-level models is the possibility of obtaining the first two moments or the full predictive distribution of the reserve estimate. Several traditional and widely used actuarial methods for completing a run-off triangle can be described by a GLM. The actuarial literature has also shown a close connection between the chain-ladder method and the multiplicative Poisson model.
Without loss of generality, we assume that the data that have been collected for $i = 1, \dots, n$ and $j = 1, \dots, n - i + 1$ consist of a triangle of incremental claims:
$$\{C_{ij} : i = 1, \dots, n;\ j = 1, \dots, n - i + 1\},$$
where the row index $i$ refers to the year of origin and, depending on the particular situation, indicates the accident year, reporting year or underwriting year. The column index $j$ refers to the development year, indicating the delay, more precisely loss disbursal, reporting year or accident year. Claims data are given as a run-off triangle as shown in Table 1.
The cumulative claim amounts with accident year index $i$ reported up to, and including, the delay index $j$ are defined as:
$$D_{ij} = \sum_{k=1}^{j} C_{ik}.$$
Thus, $D_{ij}$ is the total claims amount of accident year $i$, $1 \le i \le n$, either paid or incurred up to development year $j$, $1 \le j \le n - i + 1$. The development factors of the chain-ladder technique are estimated as:
$$\hat{\lambda}_j = \frac{\sum_{i=1}^{n-j+1} D_{ij}}{\sum_{i=1}^{n-j+1} D_{i,j-1}}, \qquad j = 2, \dots, n.$$
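As a minimal sketch of the basic chain-ladder calculation described above, the following Python code accumulates an incremental triangle, estimates the development factors as ratios of column sums and completes the triangle. The function name `chain_ladder`, the NaN encoding of unknown cells and the toy 3×3 triangle are illustrative assumptions and are not taken from the paper's data.

```python
import numpy as np

def chain_ladder(incremental):
    """Complete a square run-off triangle; unknown (future) cells are NaN."""
    inc = np.asarray(incremental, dtype=float)
    n = inc.shape[0]
    # Cumulative claims per origin year; future cells are masked again below.
    cum = np.cumsum(np.where(np.isnan(inc), 0.0, inc), axis=1)
    for i in range(n):
        cum[i, n - i:] = np.nan
    # Development factor for column j: ratio of column sums over the rows
    # in which both column j-1 and column j are observed.
    factors = np.ones(n)
    for j in range(1, n):
        rows = n - j
        factors[j] = cum[:rows, j].sum() / cum[:rows, j - 1].sum()
    # Roll the latest observed cumulative value forward with the factors.
    full = cum.copy()
    for i in range(1, n):
        for j in range(n - i, n):
            full[i, j] = full[i, j - 1] * factors[j]
    latest = np.array([cum[i, n - 1 - i] for i in range(n)])
    reserves = full[:, -1] - latest  # estimated ultimate minus paid to date
    return factors, full, reserves

# Hypothetical toy triangle of incremental claims (not the paper's data).
triangle = np.array([[100.0, 50.0, 25.0],
                     [110.0, 55.0, np.nan],
                     [120.0, np.nan, np.nan]])
factors, full, reserves = chain_ladder(triangle)
print(factors[1:], reserves, reserves.sum())
```

For this toy triangle the development factors are 1.5 and 7/6, and the reserves by origin year are 0, 27.5 and 90.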
Generalized linear modeling is a methodology for modeling the relationships between variables. It generalizes the classical normal linear model, by relaxing some of its restrictive assumptions, and provides methods for the analysis of non-normal data. GLM is important in the analysis of insurance data, because with insurance data, the assumptions of the normal model are often not applicable. See [21] for a detailed description of generalized linear models for insurance data.
Following [11, 19], the structure of the stochastic models for claim reserving in the terminology of GLM can be given by:
- (1) incremental claim amounts $C_{ij}$ belong to the exponential family,
- (2) $E[C_{ij}] = m_{ij}$,
- (3) $\eta_{ij} = g(m_{ij})$, where $g$ is the link function,
- (4) linear predictor $\eta_{ij} = c + \alpha_i + \beta_j$ with an intercept $c$ and factor effects $\alpha_i$ and $\beta_j$.
The given structure of GLM can be used to describe several widely used actuarial methods. We consider the following multiplicative model ([9]), with a parameter for each row $i$, each column $j$ and each diagonal $k = i + j - 1$:
$$C_{ij} \approx x_i \, y_j \, \gamma_k, \qquad (1)$$
where parameter $x_i$ describes the effect of year of origin $i$, parameter $y_j$ corresponds to development year $j$ and $\gamma_k$ describes the effect of calendar year $k$. The approximation sign in Equation (1) expresses a difference caused by chance, i.e., the observation on the left-hand side may deviate from its mean value on the right-hand side. The model involves three time scales, which gives rise to the well-known identification problem. A parametrization using three time scales was introduced, for instance, by [22]. The identification problem has been revisited by several authors; see, for example, [23, 24], who have proposed a canonical parametrization that is uniquely identified. In the framework of three time scales, we also face a problem with extrapolating the calendar year effects. Namely, we have no data on the values of $\gamma_k$ for the future calendar years, e.g., if $k > n$. This can be overcome by assuming that the $\gamma_k$ have a geometric pattern, with $\gamma_k \propto \gamma^k$ for some real number $\gamma$. Typically, the model (1) is simplified by taking $\gamma_k \equiv 1$, and the condition $\sum_{j=1}^{n} y_j = 1$ is imposed. If the parameters $x_i$ and $y_j$ are estimated by using the maximum likelihood method, then the simplified model is a multiplicative GLM with log-link.
In the terminology of GLM, to linearize the multiplicative model (1), the logarithm is chosen as the link function (log-link). Hence, with row parameters $x_i$ and column parameters $y_j$:
$$E[C_{ij}] = x_i \, y_j$$
or, equivalently,
$$\log E[C_{ij}] = \log x_i + \log y_j.$$
Parameters of the given model are estimated by using the maximum likelihood method. After obtaining the estimates of the parameters, it is easy to complete the run-off triangle, simply by taking:
$$\hat{C}_{ij} = \hat{x}_i \, \hat{y}_j.$$
This simple model generates quite a few reserving techniques, depending on the distributional assumptions made for $C_{ij}$. It is common in claim reserving to consider the Poisson, gamma or log-normal distribution for the variable $C_{ij}$. We proceed by reviewing the following methods derived from Model (1).
The (over-dispersed) Poisson model: Already in 1975, a stochastic model corresponding to the Poisson model, which leads to the chain-ladder technique, was proposed. This model works on the incremental amounts $C_{ij}$ from a Poisson distribution, where $E[C_{ij}] = x_i y_j$ with unknown parameters $x_i$ and $y_j$. Here, $x_i$ is the expected ultimate claims amount (up to the latest development year so far observed), and $y_j$ is the proportion of ultimate claims to emerge in each development year with the restriction $\sum_{j=1}^{n} y_j = 1$. The restriction follows immediately from the fact that $y_j$ is interpreted as the proportion of claims reported in development year $j$: the aggregate proportion over all development years has to be one.
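We estimate the unknown parameters $x_i$ and $y_j$ from the triangle of known data with the maximum likelihood method. In the following, we use the notation $\Delta$ for the triangle of known data, i.e., the set of all $(i, j)$, where $C_{ij}$ is known. We also distinguish $\Delta_i = \{j : (i, j) \in \Delta\}$ and $\Delta_j = \{i : (i, j) \in \Delta\}$. The estimation procedure and results are given in the following lemma. The initial idea of the lemma is attributed to [12].
Lemma 1. Assume that all $C_{ij}$ are independent with a Poisson distribution, $E[C_{ij}] = x_i y_j$, and $\sum_{j=1}^{n} y_j = 1$ holds. Then, the maximum likelihood estimators $\hat{x}_i$ and $\hat{y}_j$ are given by:
$$\hat{x}_i = \frac{\sum_{j \in \Delta_i} C_{ij}}{\sum_{j \in \Delta_i} \hat{y}_j} \qquad (4)$$
and:
$$\hat{y}_j = \frac{\sum_{i \in \Delta_j} C_{ij}}{\sum_{i \in \Delta_j} \hat{x}_i}. \qquad (5)$$
Proof of Lemma 1. We derive the maximum likelihood estimates for the unknown parameters $x_i$ and $y_j$ with the likelihood function:
$$L = \prod_{(i,j) \in \Delta} \frac{(x_i y_j)^{C_{ij}}}{C_{ij}!} \, e^{-x_i y_j}.$$
Therefore, the log-likelihood function is:
$$\log L = \sum_{(i,j) \in \Delta} \left( C_{ij} \log(x_i y_j) - x_i y_j - \log(C_{ij}!) \right),$$
where the summation is over all $(i, j)$ for which $C_{ij}$ is known. The maximum likelihood estimators are the values of $x_i$, $y_j$ that maximize $L$ or, equivalently, $\log L$. They are given by the equations:
$$\frac{\partial \log L}{\partial x_i} = \sum_{j \in \Delta_i} \left( \frac{C_{ij}}{x_i} - y_j \right) = 0$$
and:
$$\frac{\partial \log L}{\partial y_j} = \sum_{i \in \Delta_j} \left( \frac{C_{ij}}{y_j} - x_i \right) = 0.$$
Thus, the maximum likelihood estimators $\hat{x}_i$ and $\hat{y}_j$ are given by Formulas (4) and (5), respectively, and the lemma is proven. ☐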
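The two likelihood equations of Lemma 1 define each set of parameters in terms of the other, so they can be solved by alternating between them. The sketch below does this with a simple fixed-point iteration and renormalizes the column proportions to sum to one in every pass. The iteration scheme, the function name `poisson_mle` and the toy triangle are assumptions made for illustration; the paper does not prescribe a particular numerical procedure.

```python
import numpy as np

def poisson_mle(incremental, n_iter=200):
    """Alternate between the two ML equations of the multiplicative Poisson model."""
    inc = np.asarray(incremental, dtype=float)
    known = ~np.isnan(inc)                  # indicator of the observed triangle
    c = np.where(known, inc, 0.0)
    n = inc.shape[0]
    y = np.full(n, 1.0 / n)                 # start from equal proportions
    for _ in range(n_iter):
        # Row parameter: observed row sum divided by the sum of the
        # proportions of the development years observed in that row.
        x = c.sum(axis=1) / (known * y).sum(axis=1)
        # Column parameter: observed column sum divided by the sum of the
        # row parameters of the origin years observed in that column.
        y = c.sum(axis=0) / (known * x[:, None]).sum(axis=0)
        y = y / y.sum()                     # impose the sum-to-one restriction
    x = c.sum(axis=1) / (known * y).sum(axis=1)
    return x, y

# Hypothetical toy triangle of incremental claims (not the paper's data).
triangle = np.array([[100.0, 50.0, 25.0],
                     [110.0, 55.0, np.nan],
                     [120.0, np.nan, np.nan]])
x_hat, y_hat = poisson_mle(triangle)
print(x_hat)   # estimated ultimate claims per origin year
print(y_hat)   # estimated proportion emerging in each development year
```

For this toy triangle the estimated ultimates are 175, 192.5 and 210, which coincide with the ultimates obtained by applying chain-ladder development factors to the same data.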
Thus, the proportion factors $\hat{y}_j$ express the ratio of the sum of observed incremental values for a given development year $j$ to the sum of the corresponding ultimate claims, i.e., $\hat{y}_j$ denotes the proportion of claims reported in development year $j$. The parameters $\hat{x}_i$ are the ratio of the sum of observed incremental values for a given origin year $i$ to the sum of the corresponding proportion factors. In other words, if the incremental claim amounts and the respective proportion factors are known, it is simple to derive the corresponding ultimate claim for origin year $i$. One can note the principal similarity with the chain-ladder technique, where the development factors are also the outcomes of certain ratios.
The Poisson model can be cast into the form of a GLM, and to linearize the multiplicative model, we need to choose the logarithm as the link function, $\eta_{ij} = \log(m_{ij})$, so that:
$$E[C_{ij}] = m_{ij} = x_i \, y_j$$
or, equivalently,
$$\log E[C_{ij}] = \log x_i + \log y_j, \qquad (6)$$
where the structure of the linear predictor (6) is still of chain-ladder type, because there is a parameter for each row $i$ and each column $j$. Hence, the structure (6) is defined as a GLM in which the incremental values $C_{ij}$ are modeled as Poisson random variables with a log-link. Reparametrizing (6) gives us the structure of Property (4) defined in a GLM setting, i.e., we obtain a linear predictor:
$$\eta_{ij} = c + \alpha_i + \beta_j,$$
where parameter $c$ can be considered as an intercept, which corresponds to the incremental amount in the cell (1, 1). This is obtained by taking:
$$\alpha_1 = \beta_1 = 0$$
to avoid over-parametrization. The Poisson model was studied in further detail by [25], where a new canonical parametrization was also proposed.
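To illustrate the linear predictor with an intercept, row effects and column effects (the first row and column effects fixed at zero), the following sketch builds the corresponding design matrix and fits a Poisson GLM with log-link by iteratively reweighted least squares (IRLS) using NumPy only. The choice of IRLS, the function names `design` and `fit_poisson_glm`, and the toy triangle are assumptions for illustration; the sketch also assumes strictly positive observed increments, because the iteration is started from the logarithm of the data.

```python
import numpy as np

def design(rows, cols, n):
    """Design matrix: intercept, row effects 2..n, column effects 2..n."""
    X = np.zeros((len(rows), 2 * n - 1))
    X[:, 0] = 1.0
    for k, (i, j) in enumerate(zip(rows, cols)):
        if i > 0:
            X[k, i] = 1.0           # row effect (first row is the baseline)
        if j > 0:
            X[k, n - 1 + j] = 1.0   # column effect (first column is the baseline)
    return X

def fit_poisson_glm(incremental, n_iter=50):
    inc = np.asarray(incremental, dtype=float)
    n = inc.shape[0]
    rows, cols = np.where(~np.isnan(inc))
    y = inc[rows, cols]
    X = design(rows, cols, n)
    eta = np.log(y)                  # starting value; requires y > 0
    for _ in range(n_iter):
        mu = np.exp(eta)
        z = eta + (y - mu) / mu      # working response for the log-link
        XtW = X.T * mu               # Poisson working weights equal mu
        coef = np.linalg.solve(XtW @ X, XtW @ z)
        eta = X @ coef
    # Predict the unobserved cells by exponentiating the linear predictor.
    f_rows, f_cols = np.where(np.isnan(inc))
    pred = np.exp(design(f_rows, f_cols, n) @ coef)
    reserves = np.zeros(n)
    np.add.at(reserves, f_rows, pred)  # sum predictions per origin year
    return coef, reserves

# Hypothetical toy triangle of incremental claims (not the paper's data).
triangle = np.array([[100.0, 50.0, 25.0],
                     [110.0, 55.0, np.nan],
                     [120.0, np.nan, np.nan]])
coef, reserves = fit_poisson_glm(triangle)
print(np.exp(coef[0]), reserves, reserves.sum())
```

On this toy triangle the exponentiated intercept reproduces the value in cell (1, 1), and the row reserves (0, 27.5 and 90) agree with the chain-ladder completion of the same data.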
We recall that the only distributional assumptions used in GLMs are the functional mean-variance relationship and the fact that the distribution belongs to the exponential family. When defining a GLM, we can omit the distribution of the $C_{ij}$’s and use only the most elementary information about the response variable, namely the relationship between variance and mean. This leads to quasi-likelihood as an alternative, and this elementary information alone is often sufficient to stay close to the full efficiency of maximum likelihood estimators. Therefore, we can estimate the parameters by maximum quasi-likelihood ([20]) instead of maximum likelihood, and the estimators remain consistent. However, it is necessary to impose the constraint that the sum of the incremental claims in every row and column has to be non-negative. This means that quasi-likelihood could not be used, for instance, when modeling incurred data with a large number of negative incremental claims in the later development periods.
In the case of the Poisson distribution, the mentioned relationship is $Var[C_{ij}] = E[C_{ij}]$, and allowing for more or less dispersion in the data can be generalized to $Var[C_{ij}] = \phi E[C_{ij}]$ without any change in the form and solution of the likelihood equations. This kind of generalization allows for more dispersion in the data, and one speaks of an over-dispersed Poisson (ODP) model. It is shown ([26]) that every ODP model can be transformed into the Poisson model by dividing all incremental claims by a certain parameter. The general form for the ODP model can be given as follows:
$$E[C_{ij}] = m_{ij} = x_i \, y_j, \qquad Var[C_{ij}] = \phi \, x_i \, y_j, \qquad (9)$$
where:
$$\sum_{j=1}^{n} y_j = 1.$$
The over-dispersion is introduced through the parameter $\phi$, which is unknown and estimated from the data. Considering a single incremental payment $C_{ij}$ with the origin year $i$ and claim payments in development year $j$ (yet to be observed), we obtain the estimates of future payments from the parameter estimates by inserting them into Equation (6) and exponentiating, resulting in:
$$\hat{C}_{ij} = \hat{x}_i \, \hat{y}_j = \exp(\hat{\eta}_{ij}). \qquad (10)$$
Given Equation (10), the reserve estimates for any origin year can be derived by:
$$\hat{R}_i = \sum_{j=n-i+2}^{n} \hat{C}_{ij} \qquad (11)$$
and the reserve estimate for the total amount can be easily derived by summation:
$$\hat{R} = \sum_{i=2}^{n} \hat{R}_i. \qquad (12)$$
The negative binomial model can be derived from the Poisson model, and thus, these models are very closely related, but with a different parameterization. The model was first derived by [
13], by integrating out the row parameters from the Poisson model. The predictive distributions of both models are basically the same and give identical predicted values.
Log-normal model: When considering the log-normal distribution to describe claim amounts (see, e.g., [14]), we can still use a GLM for the logarithms of the incremental claim amounts. The log-normal class of models is given as:
$$\log(C_{ij}) = \eta_{ij} + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0, \sigma^2),$$
i.e.,
$$\log(C_{ij}) \sim N(\eta_{ij}, \sigma^2).$$
Now, the identity link function is used, and the normal responses are assumed to decompose (additively) into a deterministic non-random component with mean $\eta_{ij}$ and normally-distributed random error components with zero mean.
Following [8], the fitted values on a log scale, given the estimates for the parameters in the linear predictor $\eta_{ij}$ and the process variance $\sigma^2$, are obtained by forming the appropriate sum of estimates. Obtaining the estimates for the mean on the untransformed scale is not that simple. We cannot just exponentiate the linear predictor, since that would give an estimate of the median. Therefore, the fitted values on the untransformed scale are given by:
$$\hat{m}_{ij} = \exp\left(\hat{\eta}_{ij} + \tfrac{1}{2} \hat{\sigma}_{ij}^2\right),$$
which is in the standard form of the expected value of a log-normal distribution and where:
$$\hat{\sigma}_{ij}^2 = Var[\hat{\eta}_{ij}] + \hat{\sigma}^2$$
is the prediction variance of the linear predictor. With the already familiar notation (from the ODP subsection), we denote the triangle of predicted claims contributing to the reserve estimates by ∇. The reserve estimate in origin year $i$ is given by summing the predicted values in row $i$ of ∇, i.e., $\hat{R}_i = \sum_{j \in \nabla_i} \hat{m}_{ij}$, and the total reserve estimate, summing the predicted values over all rows $i$ and columns $j$ of ∇, is given by $\hat{R} = \sum_{(i,j) \in \nabla} \hat{m}_{ij}$. The log-normal model is also referred to as the geometric chain-ladder model; see the additional analysis in [27].
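The difference between exponentiating the linear predictor (a median estimate) and the log-normal mean correction can be sketched as follows. The code fits ordinary least squares to the log increments with a chain-ladder-type design, estimates the process variance from the residuals, and adds half of the prediction variance before exponentiating. The function names, the degrees-of-freedom variance estimator and the toy 4×4 triangle are illustrative assumptions; strictly positive increments are required for the logarithm.

```python
import numpy as np

def design(rows, cols, n):
    """Intercept, row effects 2..n and column effects 2..n."""
    X = np.zeros((len(rows), 2 * n - 1))
    X[:, 0] = 1.0
    for k, (i, j) in enumerate(zip(rows, cols)):
        if i > 0:
            X[k, i] = 1.0
        if j > 0:
            X[k, n - 1 + j] = 1.0
    return X

def lognormal_reserve(incremental):
    inc = np.asarray(incremental, dtype=float)
    n = inc.shape[0]
    rows, cols = np.where(~np.isnan(inc))
    X = design(rows, cols, n)
    log_c = np.log(inc[rows, cols])          # requires positive increments
    xtx_inv = np.linalg.inv(X.T @ X)
    coef = xtx_inv @ X.T @ log_c             # ordinary least squares on the log scale
    resid = log_c - X @ coef
    sigma2 = resid @ resid / (len(log_c) - X.shape[1])   # process variance estimate
    f_rows, f_cols = np.where(np.isnan(inc))
    Xf = design(f_rows, f_cols, n)
    eta = Xf @ coef
    # Variance of the estimated linear predictor in each future cell.
    var_eta = sigma2 * np.einsum("ij,jk,ik->i", Xf, xtx_inv, Xf)
    median = np.exp(eta)                               # plain exponentiation
    mean = np.exp(eta + 0.5 * (var_eta + sigma2))      # log-normal mean
    reserves = np.zeros(n)
    np.add.at(reserves, f_rows, mean)
    return reserves, mean, median

# Hypothetical toy triangle of incremental claims (not the paper's data).
triangle = np.array([[100.0, 60.0, 20.0, 10.0],
                     [120.0, 50.0, 30.0, np.nan],
                     [90.0, 55.0, np.nan, np.nan],
                     [110.0, np.nan, np.nan, np.nan]])
reserves, mean, median = lognormal_reserve(triangle)
print(reserves, reserves.sum())
print(mean / median)   # each ratio is at least one
```

The printed ratios show how much the mean correction lifts each predicted cell above the median estimate; the lift is largest where the linear predictor is estimated from the fewest observations.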
Gamma model: A further model was proposed by [12] with a multiplicative parametric structure for the mean incremental claims amounts, which are modeled as gamma response variables. As noted in [11], the same model can be fitted using the GLM described for the over-dispersed Poisson model, but with the incremental claim amounts modeled as independent gamma response variables with a logarithmic link function and the same linear predictor; this only requires a change in (9). As with the log-normal model, the predicted values provided by the gamma model are usually close to the chain-ladder estimates, but this cannot be guaranteed. The gamma model implemented as a generalized linear model gives exactly the same reserve estimates as the gamma model implemented by [12]. The gamma model is given with the mean:
$$E[C_{ij}] = m_{ij}$$
and with the variance:
$$Var[C_{ij}] = \phi \, m_{ij}^2.$$
To obtain reserve estimates with the gamma model for any origin year or for the overall amount, the same formulas as defined for the ODP model, (11) and (12), respectively, can be used. A limitation of both the gamma and the ODP model is that each incremental value should be nonnegative.
4. Case Study
To compare the previously discussed methods in the framework of bootstrapping with the defined residuals, we use a real-life dataset from an Estonian insurance company. The data describe the paid claims and are shown here in incremental form. We are interested in the impact of the choice of model and, mainly, in the effect of the choice of residuals and their adjustments.
We use both the Pearson and the Anscombe residuals, first without corrections, then with the zero correction, and lastly standardized together with the zero correction. Using standardized residuals alone leads to the same results as the zero-corrected residuals; thus, we do not consider standardized residuals separately in the comparative study. In addition, we compare the obtained prediction errors and upper limits using both bootstrap approaches, i.e., the regular SEP method based on the standard error of prediction and the alternative PPE method (using pseudo-reality). We present PPE prediction errors only for the total reserve. When comparing SEP and PPE prediction errors, we have to take into account that they are expressed in different units: the SEP prediction error equals one standard deviation, whereas the PPE prediction error equals (approximately) 1.645 standard deviations (the 95% quantile of the normal distribution). This means that, for the SEP method, we have to multiply the prediction error by 1.645 and add it to the reserve estimate to obtain an upper confidence limit for the total reserve. In the case of the PPE method, we simply add the prediction error to the mean to obtain the upper limit.
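The conversion between the two units of prediction error can be written as a short calculation. The helper names below are hypothetical; the inputs are the Poisson point estimate of 13.4 million and the Pearson-residual SEP range of 1.6 million to 1.94 million quoted in this section, with all amounts in millions.

```python
Z_95 = 1.645  # 95% quantile of the standard normal distribution

def upper_limit_sep(reserve, sep):
    """SEP is one standard deviation, so scale it by the normal quantile."""
    return reserve + Z_95 * sep

def upper_limit_ppe(mean, ppe):
    """PPE already corresponds to the 95% level, so it is added directly."""
    return mean + ppe

# Poisson model with Pearson residuals, amounts in millions.
low = upper_limit_sep(13.4, 1.6)
high = upper_limit_sep(13.4, 1.94)
print(round(low, 2), round(high, 2))
```

The two results, about 16.03 and 16.59 million, match the range of 16 million to 16.6 million reported for the Pearson residuals.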
Reserve estimates provided by the over-dispersed Poisson model, the gamma model and the log-normal model using the GLM implementation in the framework of bootstrapping with the residuals outlined in this paper are shown in Tables 3–7 below. The data considered are rather inconvenient (see Table 2): the values in the triangle fluctuate widely, with the smallest incremental value being 1022 and the largest 10,660,074, which is a 10,430-fold difference. The second column in Tables 3–7 shows a point estimate for the reserve. These estimates are obtained directly from the defined model, independently of the bootstrap procedure, so the point estimates do not depend on the choice of residual or on its correction.
The most problematic stage in the bootstrap method is the formation of the pseudo-data. If the magnitudes of the incremental values differ significantly, it is quite likely that the simulated residuals (resampled from the initial set of residuals) are large enough relative to the predicted incremental values to produce negative values in the pseudo-data when the residual definition is inverted. Most of the probability distributions used in loss reserving are non-negative (or positive) valued; thus, negative values in the pseudo-data are a frequent problem. For example, in the case of the Poisson distribution, the negative incremental values are often replaced by zeros in practice. Since the incremental values in Table 2 are highly volatile, we encountered some negative incremental values in the pseudo-data when using the gamma model with the Pearson residuals. We also tried to replace these negative values with ones, but that caused non-convergence of the parameter estimation. Thus, we could not present the results of the gamma model with the Pearson residuals for the given dataset. There were no problems in the case of the Anscombe residuals. See Table 8 for an overview of the negative values encountered in the pseudo-data for each considered model and residual adjustment with the dataset in Table 2.
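How negative pseudo-data can arise is easy to reproduce in a small sketch. It assumes the usual unscaled Pearson residual for the Poisson variance function, (observed − fitted)/√fitted, and its inversion, pseudo-value = fitted + residual·√fitted; this is an assumption of the sketch, since the residual definitions are not given in this section. The two observed values are the smallest and largest increments quoted above, while the fitted values are hypothetical.

```python
import numpy as np

def pearson_residuals(observed, fitted):
    """Unscaled Pearson residuals under a Poisson-type variance function."""
    return (observed - fitted) / np.sqrt(fitted)

def pseudo_data(fitted, residuals, rng):
    """Resample residuals with replacement and invert the residual definition."""
    r_star = rng.choice(residuals, size=fitted.shape, replace=True)
    return fitted + r_star * np.sqrt(fitted)

# Observed extremes from the text; fitted values are hypothetical.
observed = np.array([1022.0, 10_660_074.0])
fitted = np.array([1000.0, 11_000_000.0])
resid = pearson_residuals(observed, fitted)

rng = np.random.default_rng(0)
negatives = 0
for _ in range(1000):                      # 1000 bootstrap iterations
    negatives += int((pseudo_data(fitted, resid, rng) < 0).sum())
print(resid, negatives)
```

The residual of the large cell is about −102; when it is resampled onto the small cell, the pseudo-value becomes roughly 1000 − 102·√1000, which is negative. Residuals from cells of very different magnitude are therefore the source of the problem.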
We first look at the results obtained by the ODP model using the Pearson residuals (see Table 3) and the Anscombe residuals (Table 4). The tables present the point estimates along with the standard errors of prediction for the three situations considered, as well as the upper limits for a confidence level of 95%. The standard errors of prediction increase if we introduce the zero correction, and consequently so do the upper limits, but the same estimates drop if we use the standardization (see Formula (14)) with the zero correction. The prediction errors (SEP) in the case of the Poisson model with the Pearson residuals vary from 1.6 million to 1.94 million, depending on the residual adjustment, whereas in the case of the Anscombe residuals (see Table 4), they vary from 1.47 million to 1.76 million. This means that the 95% upper confidence limits for the total reserve prediction are between 16 million and 16.6 million in the case of the Pearson residuals and between 15.8 million and 16.3 million in the case of the Anscombe residuals, given the Poisson model and the residual adjustments.
Using the Anscombe residuals, the same pattern of changes in the prediction errors (and also the upper limits) can be seen, but the prediction errors, as noted above, are smaller than with the Pearson residuals. We see that the zero corrections do not affect the prediction errors significantly. The prediction errors without any corrections are 13% smaller with the Anscombe residuals than with the Pearson residuals. The corresponding numbers with the zero correction and with the zero correction plus standardization are 10% and 8%, respectively.
To compare the behavior of the two bootstrapping approaches, the last two lines of each table present the prediction errors and the upper confidence limits of the total reserve obtained by the PPE method, together with the ratio of the PPE results to the SEP results. We can see that the upper confidence limits for the total reserve are lower with the PPE method (all of the ratios are smaller than one). On the other hand, the prediction errors obtained by the PPE method are, depending on the residual adjustment, higher than the estimates obtained by the SEP method. However, the ratios seem to decrease if we correct the residuals. In the case of the Pearson residuals with zero correction and standardization, the corresponding ratio is slightly over one, and in the case of the Anscombe residuals, it is slightly below one.
Fitting the gamma model gives similar, but not identical, reserve estimates (see Table 5) compared to the results obtained by the ODP model. The point estimate for the total reserve with the gamma model is 12.1 million, whereas with the Poisson model it was 13.4 million, which is 10.7% higher. If we compare the reserve estimates by origin year, the biggest difference is in the third year, where it is 55.8%. In the case of the gamma model and the Anscombe residuals, the prediction errors for the total reserve vary from 3.35 million to 5.04 million. The upper limit for the total reserve in the case of the gamma model reaches 20.43 million. In a nutshell, when comparing the Poisson and the gamma model on this particular dataset, the latter gives a smaller total reserve estimate, but higher prediction errors and, thus, higher upper limits for the reserve. For both models, the PPE method tends to give higher prediction errors, except when the residuals are zero-corrected and standardized.
Table 6 and Table 7 show the results of the log-normal model. The point estimate is the lowest among all of the considered models, namely 10.8 million. However, we note a large increase in the prediction errors, especially with the zero-corrected residuals.
The prediction errors for the total reserve with the log-normal model with the Pearson residuals vary from 2.7 million to 10.7 million, depending on the residual adjustment. The upper limits for the total reserve with the Pearson residuals vary from 15.2 million to 28.47 million, which shows how strongly the estimates fluctuate. The prediction errors with the Anscombe residuals are between 2 million and 6.8 million; thus, the 95% upper confidence limit for the total reserve is between 14.2 million and 22 million, depending on the residual adjustment. However, the higher prediction errors should not be surprising, as the log-normal model is a more “conservative” model than, for example, the Poisson model or the gamma model. The prediction errors as a percentage of the total reserve estimate obtained with the Pearson residuals without corrections, with the zero correction and with the zero correction plus standardization are 81%, 99% and 25%, respectively. The corresponding percentages in the case of the Anscombe residuals are 53%, 63% and 19%, respectively. The same pattern holds as before: with the zero correction, the prediction errors (and consequently the upper limits as well) are the highest, and the lowest prediction errors are obtained with the zero correction together with standardization. Furthermore, in the case of the log-normal model, the PPE method gives smaller upper limits than the SEP method for the total reserve. Note that, unlike with the Poisson and the gamma models, the PPE method no longer gives higher prediction errors than the SEP method. We see from Tables 3–7 that the estimated reserve is the highest in the 10th year and is approximately three times higher than the estimated reserve in the previous year. The reserve estimate in the 10th year makes up nearly 56.4% of the total reserve estimate in the case of the Poisson model, 59.9% in the case of the gamma model and 63.4% in the case of the log-normal model, which is the highest percentage. This high proportion of the reserve estimate in one particular year can be explained by looking at the initial dataset, Table 2, where we see that the last year, 2009, contains the largest value in the whole dataset.
We can draw four main conclusions from analyzing this dataset:
- (1) The over-dispersed Poisson model produces the highest estimated claims reserve, and the log-normal model produces the lowest. The figures of the gamma model are not that different from those of the ODP model.
- (2) The standard errors of prediction are quite different and, consequently, so are the estimated upper limits. These differences tend to be greater especially in the first years, since the estimates are based on few predictions. The highest prediction errors are produced by the log-normal model, and the lowest by the over-dispersed Poisson model.
- (3) With this particular dataset, the prediction errors are the lowest with the Anscombe residuals. Furthermore, no matter which of the two residuals is used, the lowest prediction errors are obtained by using the zero correction with standardization.
- (4) When comparing the two bootstrap procedures, we can conclude that with the (alternative) PPE method, the upper confidence limits for the total reserve are lower for each considered model.
Because of the problem with negative values in the pseudo-data mentioned above, we present Table 8, which gives an overview of the number of negative values appearing when the pseudo-data are created in 1000 iterations. Roughly speaking, we observe that with the Poisson Models 1–2, negative pseudo-incremental values appeared at every iteration step. This is to be expected, since the incremental values in the data differ so widely. Note that using the Pearson residuals caused more negative values than using the Anscombe residuals. There were no negative values in the pseudo-data for the presented gamma model (with the Anscombe residuals) or for the log-normal model.
The prediction errors presented in Tables 3–7 above helped us to compare the variability of the mean of the total reserve. However, it can also be helpful to have an idea of the upper limit of the total reserve in general. The quantiles for a random total reserve and for the mean of the total reserve in the case of the Poisson model are presented in Table 9 and Table 10 below.
As expected, the upper limits of the mean of the total reserve are lower than the upper limit of the random total payment (reserve). The adjustments of the residual have a great influence on the results: standardized residuals with zero corrections tend to lower the estimates.