Macro vs. Micro Methods in Non-Life Claims Reserving (an Econometric Perspective)

Charpentier, Arthur; Pigeon, Mathieu

doi:10.3390/risks4020012

Open AccessArticle

Macro vs. Micro Methods in Non-Life Claims Reserving (an Econometric Perspective)

by

Arthur Charpentier

^1,2

and

Mathieu Pigeon

^1,*

¹

Quantact/Département de mathématiques, Université du Québec à Montréal, Montreal, QC H2X 3Y7, Canada

²

Département de Sciences Économiques, Université de Rennes 1, Rennes 35000, France

^*

Author to whom correspondence should be addressed.

Risks 2016, 4(2), 12; https://doi.org/10.3390/risks4020012

Submission received: 29 February 2016 / Revised: 19 April 2016 / Accepted: 2 May 2016 / Published: 14 May 2016

(This article belongs to the Special Issue Non-Life Insurance Mathematics beyond Risk Theory: Pricing and Claims Reserving)

Download

Browse Figures

Versions Notes

Abstract

:

Traditionally, actuaries have used run-off triangles to estimate reserve (“macro” models, on aggregated data). However, it is possible to model payments related to individual claims. If those models provide similar estimations, we investigate uncertainty related to reserves with “macro” and “micro” models. We study theoretical properties of econometric models (Gaussian, Poisson and quasi-Poisson) on individual data, and clustered data. Finally, applications in claims reserving are considered.

Keywords:

loss reserving; clustering; generalized linear mixed models

Graphical Abstract

1. Introduction

1.1. Macro and Micro Methods

For more than a century, actuaries have been using run-off triangles to project future payments, in non-life insurance. In the 1930s, [1] formalized this technique that originated from the popular chain ladder algorithm. In the 1990s, [2] proved that the chain ladder estimate can be motivated by a simple stochastic model, and later on [3] provided a comprehensive overview on stochastic models that can be connected with the chain ladder method, including regression models that could be seen as extension of the so-called “factor” methods used in the 1970s.

The terminology of [4] and [5] were used in macro-level models for reserving. In the 1970s, [6] suggested using some marked point process of claims to project future payments, and quantify the reserves. More recently, [7,8,9,10] or [11] (among many others) investigated further some probabilistic micro-level models. These models handle claims related data on an individual basis, rather than aggregating by underwriting year and development period. As mentioned in [12], these methods have not (yet) found great popularity in practice, since they are more difficult to apply. Nevertheless, several papers address that issue, with several stochastic processes to model the dynamics of payments, such as [13]—extended in [14,15] or [16], and more recently [17] and [18].

All macro-level models are based on aggregate data found in a run-off triangle, which is their strength, but also probably their weakness. Indeed, with regulation rules, such as Solvency II or IFRS 4—Phase 2 norms, the goal is no longer to get (only) a best-estimate, but it becomes necessary to have the full conditional distribution of future cash-flows.

Macro-level approaches, such as the popular chain ladder model, have undeniable properties. Those models are part of the actuarial folklore and extensions have been derived to have more general and more realistic models. They are easy to understand, and can be mentioned in financial communication, without disclosing too much information. From a computational perspective, those models can also be implemented in a single spreadsheet and more advanced libraries, such as the ChainLadder package in R can also beused.

Nevertheless, those macro-level models have also major drawbacks, as discussed for instance in [16] or [17], and references therein. More specifically, we wish to highlight two important points:

Those models neglect a lot of information that is available on a micro-level (per individual claim). Some additional covariates can be used, as well as exposure, etc. In most applications, not only is that information available, but usually, it has a valuable predictive power. To use that additional information, one cannot simply modify macro-level models, and it is necessary to change the general framework of the model. It becomes possible to emphasize large losses and to distinguish them from regular claims, to get more detailed information about future payments, etc.
As discussed in this paper, macro-level models on aggregated data can be seen as models on clusters and not on individual observations, as we will do with micro-level models. In the context of macro-level models for loss reserving, [3] mention that prediction errors can be large, because of the small number of observations used in run-off triangles and the fact that clusters are usually not homogeneous. Quantifying uncertainty in claim reserving methods is not only important in actuarial practice and to assess accuracy of predictive models, it is also a regulatory issue. Finally, a small sample size can cause a lack of robustness and a risk of over-parametrization for macro-level models.

On the other hand, micro-level models have many pros. They can incorporate micro-level covariates, and model micro-structure of claims development. Thus, they can take into account heterogeneity, structural changes, etc. (see [19] for a discussion). Several studies, with empirical comparisons between macro- and micro-level models (see, e.g., [17]), illustrate that in many scenarios, reserves obtained with micro-level models have higher precision than macro-level ones. Further, since those models are fitted on much more data than aggregated ones, variability of predictions is usually smaller, which increases model robustness and reduces the risk of over-fitting. More specifically, [15] and [16] obtained, on real data analysis, lower variance on the total amount of reserves with “micro” models than with “macro” ones. A natural question is about the generality of such result. Should “micro” model generate less variability than standard “macro” ones? That is the question that initiated this paper.

1.2. Agenda

In Section 2, we detail intuitive results we expect when aggregating data by clusters, moving from micro-level models to macro-level ones. More precisely, we explain why with a linear model and a Poisson regression, macro- and micro-level models are equivalent. We also discuss the case of the Poisson regression model with random intercept. In Section 3, we study “micro” and “macro” models in the context of claims reserving, on real data, as well as simulated ones. More specifically, we present a methodology that allows to generate—by random generation—some micro-datasets obtained from macro-datasets. We investigate also the impact of adding micro-type covariates, allowing for various types of correlations. Finally, we present concluding remarks in Section 4.

2. Clustering in Generalized Linear Mixed Models

In the economic literature, several papers discuss the use of “micro” vs. “macro” data, for instance in the context of unemployment duration in [20] or in the context of inflation in [21]. In [20], it is mentioned that both models are interesting, since “micro” data can be used to capture heterogeneity while “macro” data can capture cycle and more structural patterns. In [21], it is demonstrated that both heterogeneity and aggregation might explain the persistence of inflation at the macroeconomic level.

In order to clarify notation, and make sure that objects are well defined, we use small letters for sample values, e.g.,

y_{i}

, and capital letters for underlying random variables, e.g.,

Y_{i}

in the sense that

y_{i}

is a realisation of random variable

Y_{i}

. Hence, in the case of the linear model (see Subsection 2.1), we usually assume that

Y_{i} \sim N (x_{i}^{T} b, σ^{2})

, and then

\hat{b}

is the vector of estimated parameters, in the sense that

\hat{b} = {(x x^{T})}^{- 1} x y

while

\hat{B} = {(x x^{T})}^{- 1} x Y

(here covariates

x

are given, and non stochastic). Since

\hat{B}

is seen as a random variable, we can write

E [\hat{B}] = b

.

With a Poisson regression,

Y_{i} \sim P (λ_{i})

with

λ_{i} = exp [x_{i}^{T} b]

. In that case,

Var [Y_{i}] = E [Y_{i}] = λ_{i}

. The estimate parameter

\hat{b}

is a function of the observations,

(x_{i}, y_{i})

’s, while

\hat{B}

is a function of the observations

x_{i}

and of the random variable

Y_{i}

. In the context of the Poisson regression, recall that

E [\hat{B}] \to b

as n, the number of observations, goes to infinity. With a quasi-Poisson regression,

Y_{i}

does not have, per se, a proper distribution. Nevertheless, its moments are well defined, in the sense that

Var [Y_{i}] = φ E [Y_{i}] = φ λ_{i}

, where φ denotes the dispersion parameter (see Subsection 2.2). For convenience, we will denote

Y_{i} \sim q P (λ_{i})

, with an abuse of notation.

In this section, we will derive some theoretical results regarding aggregation in econometric models.

2.1. The Multiple Linear Regression Model

For a

(k + 1) \times 1

vector of parameters

a

, we consider a (multiple) linear regression model,

\begin{matrix} y_{i, g} & = x_{g}^{T} a + ε_{i, g} \\ a & = {[\begin{matrix} a_{1} & \dots & a_{k + 1} \end{matrix}]}^{T} x_{g} = {[\begin{matrix} x_{g, 1} & \dots & x_{g, k + 1} \end{matrix}]}^{T} \end{matrix}

(1)

where observations belong to a cluster g and are indexed by i within a cluster g,

i = 1, \dots, n_{g}

,

g = 1, \dots, m

. Assume further assumptions of the classical linear regression model [22], i.e.,

(LRM1): no multicollinearity in the data matrix;
(LRM2): exogeneity of the independent variables $E [ε_{i, g} | x_{g}] = 0$ , $i = 1, \dots, n_{g}$ , $g = 1, \dots, m$ ; and
(LRM3): homoscedasticity and nonautocorrelation of error terms with $Var [ε_{i, g}] = σ^{2}$ .

Stacking observations within a cluster yield the following model

\begin{matrix} {\bar{y}}_{g} & = x_{g}^{T} b + e_{g} \end{matrix}

(2)

where

{\bar{y}}_{g} = \frac{1}{n_{g}} \sum_{i} n_{i, g} and b = {[\begin{matrix} b_{1} & \dots & b_{k + 1} \end{matrix}]}^{T}

with similar assumptions except for

Var [e_{g}] = σ^{2} / n_{g}

. Those two models are equivalent, in the sense that the following proposition holds.

Proposition 1.

Model (1) on a micro level and Model (2) on a macro level are equivalent, in the sense that

(i): ${\hat{a}}_{O L S} = {\hat{b}}_{O L S}$ when weights $n_{g}$ are used in Model (2); and
(ii): $\sum_{i, g} {\hat{y}}_{i, g} = \sum_{g} {\hat{y}}_{g}$ where $y_{g} = n_{g} {\bar{y}}_{g}$ .

Proof.

(i): The ordinary least-squares estimator for $a$ - from Model (1)—is defined as

$\hat{a} = \underset{a}{argmin} \{\sum_{i, g} {(y_{i, g} - x_{g}^{T} a)}^{2}\}$

(3)

which can also be written

$\hat{a} = \underset{a}{argmin} \{\sum_{i, g} {(y_{i, g} - {\bar{y}}_{g} + {\bar{y}}_{g} - x_{g}^{T} a)}^{2}\} .$

(4)

Now, observe that

$\begin{matrix} \sum_{i, g} {(y_{i, g} - {\bar{y}}_{g} + {\bar{y}}_{g} - x_{g}^{T} a)}^{2} & = \sum_{i, g} {(y_{i, g} - {\bar{y}}_{g})}^{2} + {({\bar{y}}_{g} - x_{g}^{T} a)}^{2} \\ + 2 (y_{i, g} - {\bar{y}}_{g}) ({\bar{y}}_{g} - x_{g}^{T} a) \end{matrix}$

where the first term is independent of $a$ (and can be removed from the optimization program), and the term with cross-elements sums to 0. Hence,

$\hat{a} = \underset{a}{argmin} \{\sum_{i, g} {({\bar{y}}_{g} - x_{g}^{T} a)}^{2}\} = \underset{a}{argmin} \{\sum_{g} n_{g} {({\bar{y}}_{g} - x_{g}^{T} a)}^{2}\} = \hat{b}$

(5)

where $\hat{b}$ is the least square estimator of $b$ from Model (2), when weights $n_{g}$ are considered.
(ii): If we consider the sum of predicted values, observe that

$\sum_{i, g} {\hat{y}}_{i, g} = \sum_{g} n_{g} x_{g}^{T} \hat{a} = \sum_{g} n_{g} \underset{{\hat{\bar{y}}}_{g}}{\underset{︸}{x_{g}^{T} \hat{b}}} = \sum_{g} {\hat{y}}_{g}$

(6)

Hence, the sum of predictions obtained from Model (1) is the same as the sum of predictions obtained from Model (2), even if partial sums are considered.

☐

In the proposition above, the equality should be understood as the equality between estimators. Hence we have the following corollary.

Corollary 2.

We define the following matrices

\begin{matrix} Y_{g} & = {[\begin{matrix} Y_{1, g} & \dots & Y_{n_{g}, g} \end{matrix}]}^{T} Y = {[\begin{matrix} Y_{1}^{T} & \dots & Y_{m}^{T} \end{matrix}]}^{T} \bar{Y} = {[\begin{matrix} {\bar{Y}}_{1} & \dots & {\bar{Y}}_{m} \end{matrix}]}^{T} \\ ε_{g} & = {[\begin{matrix} ε_{1, g} & \dots & ε_{n_{g}, g} \end{matrix}]}^{T} ε = {[\begin{matrix} ε_{1} & \dots & ε_{m} \end{matrix}]}^{T} e = {[\begin{matrix} e_{1} & \dots & e_{m} \end{matrix}]}^{T} \\ x_{g}^{n_{g}} & = \underset{n_{g} times}{\underset{︸}{[\begin{matrix} x_{g} & \dots & x_{g} \end{matrix}]}} x = [\begin{matrix} x_{1}^{n_{1}} & \dots & x_{m}^{n_{m}} \end{matrix}] \bar{x} = [\begin{matrix} x_{1}^{1} & \dots & x_{m}^{1} \end{matrix}] \end{matrix}

the

(1 \times n_{g})

vectors

1_{n_{g}} = [\begin{matrix} 1 & \dots & 1 \end{matrix}]

and

0_{n_{g}} = [\begin{matrix} 0 & \dots & 0 \end{matrix}]

, and the matrix

\begin{matrix} 1 & = [\begin{matrix} 1_{n_{1}} & 0_{n_{2}} & \dots & 0_{n_{m}} \\ 0_{n_{1}} & 1_{n_{2}} & \dots & 0_{n_{m}} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0_{n_{1}} & 0_{n_{2}} & \dots & 1_{n_{m}} \end{matrix}] \end{matrix}

The OLS estimators are given by

\begin{matrix} {\hat{A}}_{O L S} & = \underset{a}{argmin} {{(Y - x^{T} a)}^{T} (Y - x^{T} a)} \\ = {(x x^{T})}^{- 1} x Y \end{matrix}

\begin{matrix} {\hat{B}}_{O L S} & = \underset{b}{argmin} {{((\sqrt{1 1^{T}}) \bar{Y} - \sqrt{1 1^{T}} {\bar{x}}^{T} b)}^{T} ((\sqrt{1 1^{T}}) \bar{Y} - (\sqrt{1 1^{T}}) {\bar{x}}^{T} b)} \\ = {(\bar{x} 1 1^{T} {\bar{x}}^{T})}^{- 1} \bar{x} 1 1^{T} \bar{Y} \end{matrix}

Model (1) on a micro level and Model (2) on a macro level are equivalent, in the sense that

(i): $E [{\hat{A}}_{O L S}] = E [{\hat{B}}_{O L S}]$ and $V a r [{\hat{A}}_{O L S}] = V a r [{\hat{B}}_{O L S}]$ , when weights $n_{g}$ are used in Model (2); and
(ii): $E [\sum_{i, g} {\hat{Y}}_{i, g}] = E [\sum_{g} {\hat{Y}}_{g}]$ and $V a r [\sum_{i, g} {\hat{Y}}_{i, g}] = V a r [\sum_{g} {\hat{Y}}_{g}]$ .

Proof.

Straightforward calculations lead to

{(1 1^{T})}^{- 1} 1 Y = \bar{Y}

and

\bar{x} 1 = x

.

(i): Let

$\begin{matrix} E [{\hat{B}}_{O L S}] & = {(\bar{x} 1 1^{T} {\bar{x}}^{T})}^{- 1} \bar{x} 1 1^{T} E [\bar{Y}] \\ = {(\bar{x} 1 1^{T} {\bar{x}}^{T})}^{- 1} \bar{x} 1 1^{T} {(1 1^{T})}^{- 1} 1 E [Y] \\ = {(x x^{T})}^{- 1} x E [Y] = E [{\hat{A}}_{O L S}] \end{matrix}$

For the equality of variances, we have

$\begin{matrix} Var [{\hat{B}}_{O L S}] \\ = {(\bar{x} 1 1^{T} {\bar{x}}^{T})}^{- 1} \bar{x} 1 1^{T} Var [\bar{Y}] {({(\bar{x} 1 1^{T} {\bar{x}}^{T})}^{- 1} \bar{x} 1 1^{T})}^{T} \\ = {(\bar{x} 1 1^{T} {\bar{x}}^{T})}^{- 1} \bar{x} 1 1^{T} {(1 1^{T})}^{- 1} 1 Var [Y] 1^{T} {({(1 1^{T})}^{- 1})}^{T} {({(\bar{x} 1 1^{T} {\bar{x}}^{T})}^{- 1} \bar{x} 1 1^{T})}^{T} \\ = {(x x^{T})}^{- 1} x Var [Y] x^{T} {({(x x^{T})}^{- 1})}^{T} \\ = Var [{\hat{A}}_{O L S}] \end{matrix}$
(ii): Let

$\begin{matrix} E [\sum_{g} {\hat{Y}}_{g}] & = E [1_{m} 1 1^{T} \hat{\bar{Y}}] = E [1_{m} 1 1^{T} {\bar{x}}^{T} \hat{B}] \\ = E [1_{m} 1 1^{T} {\bar{x}}^{T} \hat{A}] = E [1_{n} x^{T} \hat{A}] \\ = E [1_{n} \hat{Y}] = E [\sum_{i, g} Y_{i, g}] \end{matrix}$

The proof of the equality of variances is similar.

☐

2.2. The Quasi-Poisson Regression

A similar result can be obtained in the context of Poisson regressions. A generalized linear model [23] is made up of a linear predictor

x^{T} b

, a link function that describes how the expected value depends on this linear predictor and a variance function that describes how the variance depends on the expected value

Var [Y] = φ V (E [Y])

, where

φ

denotes the dispersion parameter. For the Poisson model, the variance is equal to the mean, i.e.,

φ = 1

and

V (E [Y]) = E [Y]

. This may be too restrictive for many actuarial illustrations, which often show more variation than given by expected values. We use the term over-dispersed for a model where the variance exceeds the expected value. A common way to deal with over-dispersion is a quasi-likelihood approach (see [23] for further discussion) where a model is characterized by its first two moments.

Consider either a Poisson regression model, or a quasi-Poisson one,

Y_{i, g} \sim P (λ_{i, g}) or Y_{i, g} \sim q P (λ_{i, g})

(7)

In the case of a Poisson regression,

E [Y_{i, g}] = λ_{i, g} = exp [x_{g}^{T} a + ln (1 / n_{g})] and Var [Y_{i, g}] = λ_{i, g}

(8)

and in the context of a quasi-Poisson regression,

E [Y_{i, g}] = λ_{i, g} = exp [x_{g}^{T} a + ln (1 / n_{g})] and Var [Y_{i, g}] = φ_{micro} λ_{i, g}

(9)

with

φ_{micro} > 0

for a quasi-Poisson regression (

φ_{micro} > 1

for over-dispersion). Here again, stacking observations within a cluster yield the following model (on the sum and not the average value, to have a valid interpretation with a Poisson distribution)

Y_{g} = \sum_{i} Y_{i, g} \sim P (λ_{g}) or Y_{g} \sim q P (λ_{g})

(10)

In the context of a Poisson regression,

E [Y_{g}] = λ_{g} = exp [x_{g}^{T} b] and Var [Y_{g}] = λ_{g}

and in the context of a quasi-Poisson regression,

E [Y_{g}] = λ_{g} = exp [x_{g}^{T} b] and Var [Y_{g}] = φ_{macro} λ_{g}

(11)

with

φ_{macro} > 0

for a quasi-Poisson regression. Here again, those two models (“micro” and “macro”) are equivalent, in the sense that the following proposition holds.

Proposition 3.

Model (7) on a micro level and Model (10) on a macro level are equivalent in the sense that

(i): ${\hat{a}}_{M L E} = {\hat{b}}_{M L E}$ ; and
(ii): $\sum_{i, g} {\hat{y}}_{i, g} = \sum_{g} {\hat{y}}_{g}$ .

Proof.

(i): Maximum likelihood estimator of $a$ is the solution of

$\begin{matrix} \sum_{i, g} (\frac{y_{i, g} - exp [x_{g}^{T} a]}{φ_{micro}}) x_{g} & = 0 \end{matrix}$

or equivalently

$\begin{matrix} \sum_{i, g} (y_{i, g} - exp [x_{g}^{T} a]) x_{g} & = 0 \end{matrix}$

With offsets $λ_{g}^{*} = exp [x_{g}^{T} b + log (n_{g})]$ , $g = 1, \dots, m$ , maximum likelihood estimator of $b$ is the solution (as previously, we can remove $φ_{macro}$ ) of

$\begin{matrix} \sum_{g} (y_{g} - n_{g} exp [x_{g}^{T} b]) x_{g} & = 0 \end{matrix}$

$\begin{matrix} \sum_{i, g} (y_{i, g} - exp [x_{g}^{T} b]) x_{g} & = 0 \end{matrix}$

Hence, $\hat{a} = \hat{b}$ , as (unique) solutions of the same system of equations.
(ii): The sum of predicted values is

$\begin{matrix} \sum_{i, g} {\hat{y}}_{i, g} & = \sum_{g} n_{g} {\hat{λ}}_{i, g} = \sum_{g} n_{g} exp [x_{g}^{T} \hat{a}] = \sum_{g} n_{g} exp [x_{g}^{T} \hat{b}] \\ = \sum_{g} exp [x_{g}^{T} \hat{b} + log (n_{g})] = \sum_{g} {\hat{λ}}_{g}^{*} = \sum_{g} {\hat{y}}_{g} \end{matrix}$

☐

Nevertheless, as we will see later on, the Corollary obtained in the context of a Gaussian linear model does not hold in the context of a quasi-Poisson regression.

Corollary 4.

Model (7) on a micro level and Model (10) on a macro level are asymptotically equivalent for Poisson regressions, in the sense that

(i): $E [{\hat{A}}_{M L E}] = E [{\hat{B}}_{M L E}]$ and $V a r [{\hat{A}}_{M L E}] = V a r [{\hat{B}}_{M L E}]$ , when n goes to infinity; and
(ii): $E [\sum_{i, g} {\hat{Y}}_{i, g}] = E [\sum_{g} {\hat{Y}}_{g}]$ and $V a r [\sum_{i, g} {\hat{Y}}_{i, g}] = V a r [\sum_{g} {\hat{Y}}_{g}]$ , when n goes to infinity.

Proof.

(i): A classical result of asymptotic theory for maximum likelihood estimators indicates that, under mild regularity conditions, $E [{\hat{A}}_{M L E}] \to a$ and $E [{\hat{B}}_{M L E}] \to b$ as $n \to \infty$ . Since $a = b$ , we have $E [{\hat{B}}_{M L E}] = E [{\hat{A}}_{M L E}]$ when $n \to \infty$ . For Model (7), the Fisher information matrix is $I (A) = x W x^{T}$ and, when $n \to \infty$ , $Var [\hat{A}] \to {(x W x^{T})}^{- 1}$ , where $W = diag ((λ_{1} / n_{1}) 1_{n_{1}}, \dots, (λ_{m} / n_{m}) 1_{n_{m}})$ . For Model (10), we have $I (B) = \bar{x} 1 W 1^{T} {\bar{x}}^{T} = x W x^{T}$ and, when $n \to \infty$ , $Var [\hat{B}] \to {(x W x^{T})}^{- 1}$ .
(ii): By using a similar argument, we have when n goes to infinity

$\begin{matrix} E [\sum_{g} {\hat{Y}}_{g}] & = E [1_{m} 1 1^{T} \hat{\bar{Y}}] = 1_{m} 1 1^{T} M_{\hat{B}} ({\bar{x}}^{T}) \\ = 1_{m} 1 1^{T} M_{\hat{A}} ({\bar{x}}^{T}) = E [1_{n} 1^{T} e^{{\bar{x}}^{T} \hat{A}}] = E [1_{n} e^{x^{T} \hat{A}}] \\ = E [1_{n} \hat{Y}] = E [\sum_{i, g} {\hat{Y}}_{g, i}] \end{matrix}$

☐

In small or moderate-sized samples, it should be noted that

\hat{A}

and

\hat{B}

may be biased for

A

and

B

, respectively. Generally, this bias is negligible compared with the standard errors (see [24,25]).

In the quasi-Poisson micro-level model (from Model (7)), as discussed above, the estimator of

a

is the solution of the quasi-score function

\begin{matrix} \sum_{i, g} (\frac{y_{i, g} - λ_{i, g}}{φ_{m i c r o}}) x_{g} & = 0 \end{matrix}

which implies

{\hat{a}}_{Q L E} = {\hat{a}}_{M L E}

. The classical Pearson estimator for the dispersion parameter

φ_{micro}

is

\begin{matrix} {\hat{φ}}_{m i c r o} & = \sum_{i, g} \frac{{(y_{i, g} - {\hat{y}}_{i, g})}^{2} / {\hat{y}}_{i, g}}{\sum_{g} n_{g} - (k + 1)} = \sum_{i, g} \frac{{(n_{g} y_{i, g} - {\hat{y}}_{g})}^{2} / n_{g} {\hat{y}}_{g}}{\sum_{g} n_{g} - (k + 1)} \end{matrix}

Empirical evidence (see [26]) support the use of the Pearson estimator for estimating

φ

because it is the most robust against the distributional assumption. In a similar way, the quasi-Poisson macro-level model (from Model (10)), the estimator of

b

is the solution of

\begin{matrix} \sum_{g} (\frac{y_{g} - λ_{g}}{φ_{m a c r o}}) x_{g} & = 0 \end{matrix}

which implies here also

{\hat{b}}_{Q L E} = {\hat{b}}_{M L E}

. The dispersion parameter

φ

is estimated by

\begin{matrix} {\hat{φ}}_{m a c r o} & = \sum_{g} \frac{{(y_{g} - {\hat{y}}_{g})}^{2} / {\hat{y}}_{g}}{m - (k + 1)} \end{matrix}

Clearly,

{\hat{φ}}_{m i c r o} \neq {\hat{φ}}_{m a c r o}

involving the following results.

Corollary 5.

Model (7) on a micro level and Model (10) on a macro level are not asymptotically equivalent for quasi-Poisson regressions, in the sense that

(i): $E [{\hat{A}}_{Q L E}] = E [{\hat{B}}_{Q L E}]$ but $V a r [{\hat{A}}_{Q L E}] \neq V a r [{\hat{B}}_{Q L E}]$ , when n goes to infinity; and
(ii): $E [\sum_{i, g} {\hat{Y}}_{i, g}] = E [\sum_{g} {\hat{Y}}_{g}]$ but $V a r [\sum_{i, g} {\hat{Y}}_{i, g}] \neq V a r [\sum_{g} {\hat{Y}}_{g}]$ , when n goes to infinity.

Proof.

(i): The property that variances are not equal is a direct consequence of classical results from the theory of generalized linear models (see [23]), since the covariance matrices of estimators are given by

$\begin{matrix} Var [\hat{B}] & \to {\hat{φ}}_{m a c r o} {(x W x^{T})}^{- 1} \end{matrix}$

and

$\begin{matrix} Var [\hat{A}] & \to {\hat{φ}}_{m a c r o} {(x W x^{T})}^{- 1} \end{matrix}$

(12)

when n goes to infinity. Thus, covariance matrices of estimators are asymptotically equal for the Poisson regression model but differ for the quasi-Poisson model because ${\hat{φ}}_{m i c r o} \neq {\hat{φ}}_{m a c r o}$ .
(ii): Since the MLE and the QLE share the same asymptotic distribution (see [23]), the proof is similar to Corollary 4(ii).

☐

2.3. Poisson Regression with Random Effect

In the micro-level model described by Equation (7), observations made for the same event (subject) at different periods are supposed to be independent. Within-subject correlation can be included in the model by adding random, or subject-specific, effects in the linear predictor. In the Poisson regression model with random intercept, the between-subject variation is modeled by a random intercept γ which represents the combined effects of all omitted covariates.

Let

Y_{g}^{(t)}

represent the sum of all observations from subject t, in the cluster g and

\begin{matrix} E [Y_{g}^{(t)} | γ_{t}] & = exp [x_{g}^{T} a + γ_{t}] \\ Var [Y_{g}^{(t)} | γ_{t}] & = E [Y_{g}^{(t)} | γ_{t}] \\ γ = [\begin{matrix} γ_{1} \\ \dots \\ γ_{T} \end{matrix}] & \sim N_{T} (0, σ^{2} I) \end{matrix}

where

I

is the

(T \times T)

identity matrix, and

N_{T} (μ, Σ)

the T-dimensional Gaussian distribution with mean μ and covariance matrix Σ. Straightforward calculations lead to

\begin{matrix} E [Y_{g}^{(t)}] & = exp [x_{g}^{T} a + σ^{2} / 2] \\ Var [Y_{g}^{(t)}] & = E [Y_{g}^{(t)}] (1 + E [Y_{g}^{(t)}] (exp [σ^{2}] - 1)) \end{matrix}

Hence,

Var [Y_{g}^{(t)}] > E [Y_{g}^{(t)}], σ^{2} > 0

(13)

This last equation shows that the Poisson regression model with random intercept leads to an over-dispersed marginal distribution for the variable

Y_{g}^{(t)}

. The maximum likelihood estimation for parameters requires Laplace approximation and numerical integration (see the Chapter 4 of [27] for more details). This model is a special case of multilevel Poisson regression model and estimation can be performed with various statistical softwares such as HLM, SAS, Stata and R (with package lme4).

One may be interested to verify the need of a source of between-subject variation. Statistically, it is equivalent to testing the variance of γ to be zero. In this particular case, the null hypothesis places

σ^{2}

on the boundary of the model parameter space which complicates the evaluation of the asymptotic distribution of the classical likelihood ratio test (LRT) statistic. From the very general result of [28], it can be demonstrated (see [29]) that the asymptotic null distribution of the LRT statistic is a

50 / 50

mixture of

χ_{0}^{2}

and

χ_{1}^{2}

as

\sum_{g} n_{g} \to \infty

. In this case, obtaining an equivalent macro-level model is of little practical interest since the construction of the variance-covariance matrix would require knowledge of the individual (“micro”) data.

3. Clustering and Loss Reserving Models

A loss reserving macro-level model is constructed from data summarized in a table called run-off triangle. Aggregation is performed by occurrence and development periods (typically years). For occurrence period i,

i = 1, 2, \dots, I

, and for development period j,

j = 1, 2, \dots I

, let

C_{i, j}

and

Y_{i, j}

represent the total cumulative paid amount and the incremental paid amount, respectively with

Y_{i, j} = C_{i, j} - C_{i, j - 1}

,

i = 1, \dots, I

,

j = 2, \dots, I

.

\begin{matrix} [\begin{matrix} C_{1, 1} & C_{1, 2} & \dots & C_{1, I - 1} & C_{1, I} \\ C_{2, 1} & C_{2, 2} & \dots & C_{2, I - 1} \\ ⋮ & ⋮ & ⋱ \\ C_{I, 1} \end{matrix}] \end{matrix}

\begin{matrix} [\begin{matrix} Y_{1, 1} = C_{1, 1} & Y_{1, 2} & \dots & Y_{1, I - 1} & Y_{1, I} \\ Y_{2, 1} = C_{2, 1} & Y_{2, 2} & \dots & Y_{2, I - 1} \\ ⋮ & ⋮ & ⋱ \\ Y_{I, 1} = C_{I, 1} \end{matrix}] \end{matrix}

where columns, rows and diagonals represent development, occurrence and calendar periods, respectively. Each incremental cell

Y_{i, j}

can be seen as a cluster stacking

n_{i, j}

amounts paid in the same development period j for the occurrence period i. These payments come from M claims and let

Y_{i, j}^{(k)}

represent the sum of all observations from claims k in the cluster

(i, j)

. It should be noted that all claims are not necessarily represented in each of the clusters.

To calculate a best estimate for the reserve, the lower part of the triangle must be predicted and the total reserve amount is

\begin{matrix} \hat{R} & = \sum_{t = 2}^{I} {\hat{C}}_{t, I} - \sum_{t = 2}^{I} C_{t, I - t + 1} = \sum_{t = 2}^{I} \sum_{s = I + 2 - t}^{I} {\hat{Y}}_{t, s} \end{matrix}

To quantify uncertainty in estimated claims reserve, we consider the mean square error of prediction (MSEP). Let

\hat{R}

be a

Y

-mesurable estimator for

E [R | Y]

and a

Y

-mesurable predictor for R where

Y

represents the set of observed clusters. The MSEP is

\begin{matrix} M S E P_{R | Y} (\hat{R}) & = E [{(\hat{R} - R)}^{2} | Y] \\ = Var [R | Y] + {(\hat{R} - E [R | Y])}^{2} \end{matrix}

Independence between R and

Y

is assumed, so the equation is simplified as follows

\begin{matrix} M S E P_{R | Y} (\hat{R}) & = Var [R] + {(\hat{R} - E [R])}^{2} \end{matrix}

and the unconditional MSEP is

\begin{matrix} M S E P_{R} (\hat{R}) & = Var [R] + E [{(\hat{R} - E [R])}^{2}] \end{matrix}

3.1. The Quasi-Poisson Model for Reserves

3.1.1. Construction

From the theory presented in Subsection 2.2, we construct quasi-Poisson macro- and micro-level models for reserves. For both models, constitutive elements are defined in Table 1.

As a direct consequence of Proposition 3, the best estimate for the total reserve amount is

\begin{matrix} \hat{R} = \sum_{(i, j) \in K} \sum_{k = 1}^{n_{i, j}} {\hat{Y}}_{i, j}^{(k)} & = \sum_{(i, j) \in K} {\hat{Y}}_{i, j} \end{matrix}

where

K

represents unobserved clusters. For both models, the Proposition 6 gives results for the unconditional MSEP.

Proposition 6.

In the quasi-Poisson macro-level model, the unconditional MSEP is given by

\begin{matrix} {\hat{M S E P}}_{R} (\hat{R}) & \approx \sum_{(i, j) \in K} {\hat{φ}}_{m a c r o} {\hat{y}}_{i, j} \\ + \sum_{(i, j), (n, m) \in K} {\hat{φ}}_{m a c r o} {\hat{y}}_{i, j} {\hat{y}}_{m, n} x_{i, j}^{T} {(x W x^{T})}^{- 1} x_{n, m} \end{matrix}

where

x

and

W

are defined by Equation (12). The unconditional MSEP for the quasi-Poisson micro-level model is similar with

{\hat{φ}}_{m a c r o}

replaced by

{\hat{φ}}_{m i c r o}

.

Proof.

The proof for the macro-level model is done in [25]. For the micro-level model, we have

\begin{matrix} M S E P_{R} (\hat{R}) = Var [R] + E [{(\hat{R} - E [R])}^{2}] \\ = \sum_{(i, j) \in K} \sum_{k = 1}^{n_{i, j}} {\hat{φ}}_{m i c r o} exp [x_{i, j}^{T} \hat{a} + log (1 / n_{i, j})] \\ + \sum_{(i, j) \in K} \sum_{(m, n) \in K} \sum_{k = 1}^{n_{i, j}} \sum_{t = 1}^{n_{m, n}} Cov [{\hat{Y}}_{i, j}^{(k)}, {\hat{Y}}_{m, n}^{(t)}] \\ = \sum_{(i, j) \in K} {\hat{φ}}_{m i c r o} exp [x_{i, j}^{T} \hat{a}] \\ + \sum_{(i, j) \in K} \sum_{(m, n) \in K} \sum_{k = 1}^{n_{i, j}} \sum_{t = 1}^{n_{m, n}} exp [x_{i, j}^{T} a + log (1 / n_{i, j})] exp [x_{m, n}^{T} a + log (1 / n_{m, n})] \\ \times Cov [exp [x_{i, j}^{T} \hat{a} - x_{i, j}^{T} a], exp [x_{m, n}^{T} \hat{a} - x_{m, n}^{T} a]] \end{matrix}

Although

{\hat{Y}}_{i, j}^{(k)}

is not an unbiased estimator of

E [Y_{i, j}^{(k)}]

, the bias is generally of small order and by using the approximation

exp [x] \approx 1 + x

for

x \approx 0

, we obtain

\begin{matrix} = \sum_{(i, j) \in K} {\hat{φ}}_{m i c r o} exp [x_{i, j}^{T} \hat{a}] \\ + \sum_{(i, j), (m, n) \in K} exp [x_{i, j}^{T} a + x_{m, n}^{T} a] Cov [x_{i, j}^{T} \hat{a}, x_{m, n}^{T} \hat{a}] . \end{matrix}

By using the fact that

\hat{b} = \hat{a}

and the remark at the end of Subsection 2.2, we obtain

\begin{matrix} = \sum_{(i, j) \in K} {\hat{φ}}_{m i c r o} {\hat{y}}_{i, j} + \sum_{(i, j), (m, n) \in K} {\hat{φ}}_{m i c r o} {\hat{y}}_{i, j} {\hat{y}}_{m, n} x_{i, j}^{T} {(x W x^{T})}^{- 1} x_{m, n} \end{matrix}

☐

Thus, the difference between the variability in macro- and micro-level models results from the difference between dispersion parameters. Define standardized residuals for both models

\begin{matrix} r_{i, g} & = \frac{(y_{i, g} - {\hat{y}}_{i, g})}{\sqrt{{\hat{y}}_{i, g}}} and r_{g} = \frac{(y_{g} - {\hat{y}}_{g})}{\sqrt{{\hat{y}}_{g}}} \end{matrix}

Direct calculations lead to

\begin{matrix} Ψ = \frac{\sum_{i, g} r_{i, g}^{2}}{\sum_{g} r_{g}^{2}} & \leq \frac{\sum_{g} n_{g} - (k + 1)}{m - (k + 1)} \to {\hat{φ}}_{micro} \leq {\hat{φ}}_{macro} \end{matrix}

(14)

Thus, if the total number of payments (

\sum_{g} n_{g}

) is greater than the value

Ψ (m - (k + 1)) + k + 1

, then the micro-level Model (7) will lead to a greater precision for the best estimate of the total reserve amount and conversely. Adding one or more covariate(s) at the micro level will decrease the numerator of Ψ and will increase the interest of the micro-level model.

3.1.2. Illustration and Discussion

To illustrate these results, we consider the incremental run-off triangle from UK Motor Non-Comprehensive account (published by [30]) presented in Table 2 where each cell

(i, j)

,

i + j \leq 7

, is assumed to be a cluster g, i.e., the value

Y_{g}

is the sum of

n_{g}

independent payments. Simulations and computations were performed in R, using packages ChainLadder and gtools. The final reserve amount obtained from the Mack’s model [2] is $28,655,773.

Based on the construction detailed in Table 1, we consider 2 macro-level models

\begin{matrix} Model A : Y_{g} & \sim P (λ_{g}) λ_{g} = exp [x_{g}^{T} a] \\ Model B : Y_{g} & \sim q P (λ_{g}) \end{matrix}

and 2 micro-level models

\begin{matrix} Model C : Y_{i, g} & \sim P (λ_{i, g}) λ_{i, g} = exp [x_{g}^{T} a - \log (n_{g})] \\ Model D : Y_{i, g} & \sim q P (λ_{i, g}) \end{matrix}

To create micro-level datasets from the "macro" one, we perform the following procedure:

simulate the number of payments for each cluster assuming $N_{g} \sim P (θ)$ , $g = 1, \dots, m$ ;
for each cluster, simulate a $(n_{g} \times 1)$ vector of proportions assuming $ω_{g} = {[\begin{matrix} ω_{1} & \dots & ω_{n_{g}} \end{matrix}]}^{T} \sim Dirichlet (1)$ , $g = 1, \dots, m$ ;
for each cluster, define

$\begin{matrix} [\begin{matrix} Y_{1, g} \\ ⋮ \\ Y_{n_{g}, g} \end{matrix}] & = ω_{g} Y_{g}, g = 1, \dots, m \end{matrix}$
adjust Model C and Model D; and
calculate the best estimate and the MSEP of the reserve by using Proposition 6.

For each value of θ, we repeat this procedure 1000 times and we calculate the average best estimate and the average MSEP. Results are presented in Table 3 and Figure 1. For Poisson regression (Model A and C), results are similar, which is consistent with Corollary 4. For micro-level models, convergence of

\sqrt{M S E P}

towards (11622) is fast. For quasi-Poisson regression (Model B and D), expected values are equal and Figure 1 shows

\sqrt{M S E P}

as a function of the expected total number of payments for the portfolio. Above a certain level, (close to 3400 here), accuracy of the “micro” approach exceeds the “macro”. Again, those results are consistent with Corollary 5. Here, we consider that the expected number of payments by cluster

(θ

) is constant but it would also be possible to consider a mixture model where

(N_{g} | Θ_{g}) \sim P (θ_{g})

,

g = 1, \dots, m

, and

Θ_{g} \sim Gamma (α, β)

. This modification does not change the conclusions. Finally, a comparison of estimated MSEP for both Poisson and quasi-Poisson models confirms the presence of over-dispersion in the data.

In order to illustrate the impact of adding a covariate at the micro-level, we define a quasi-Poisson micro-level model with a weakly correlated covariate (Model E) and with a strongly correlated covariate (Model F). Following a similar procedure, we obtain results presented in the bottom part of Table 3 and in Figure 2.

As opposed to standard classical results on hierarchical models, the average of explanatory variable within a cluster (

(1 / n_{g}) \sum_{i} x_{i g}

) has not been added to the macro-level model (Model B), for several reasons,

(i): impossible to compute that average without individual data;
(ii): discrete explanatory variables used in the micro-level model; and
(iii): since claims reserve model have a predictive motivation, it is risky to project the value of an aggregated variable on future clusters.

With an explanatory variable highly correlated with the response variable, results obtained with Model D and E are very close. As claimed by Proposition 6 and Equation (14), an explanatory variable highly correlated with the response variable will decrease the value of

\sqrt{M S E P}

, and lowers the threshold above which the micro-level model is more accurate than the macro-level one.

The quasi-Poisson macro-level model (Model B) with maximum likelihood estimators leads to the same reserves as the chain-ladder algorithm and the Mack’s model (see [31]), assuming the clusters exposure, for

(i, j) \in K

, is one. To obtain similar results with a quasi-Poisson micro-level model (Model D), a similar assumption is necessary: exposure of each claim within cluster

(i, j)

is

1 / n_{i, j}

. That assumption implies, on a micro level, that predicted individual payments

{\hat{Y}}_{i j}^{(k)}

are proportional to

1 / n_{i j}

. That assumption has unfortunately no foundation.

In the Poisson and quasi-Poisson micro-level models (Model C and D), payments related to the same claim, in two different clusters are supposed to be non-correlated. As discussed in the previous Section, it is possible to include dependencies among payments for a given claim using a Poisson regression with random effects.

3.2. The Mixed Poisson Model for Reserves

3.2.1. Construction

From the results obtained in Subsection 2.3, it is possible to construct a micro-model for the reserves that includes a random intercept. The later will allow to model dependence between payments from a given claim. Note that it is not relevant to construct an aggregated model with random effects that could be compared with individual ones. In the context of claims reserves,

Y_{g}^{(t)}

represents the sum of payments made for claim t within cluster g. The assumptions of that model (called model G) are

\begin{matrix} (Y_{g}^{(t)} | γ_{t}) & \sim P (λ_{g} e^{γ_{t}}), λ_{g} = exp [x_{g}^{T} c + l n (1 / n_{g})] \\ γ_{t} & \sim N (0, σ^{2}) \end{matrix}

Because of the hierarchical structure of the model, predictions can be derived from several philosophical perspectives (see [32], and references therein). In our example, we focus on the following unconditional predictions

\begin{matrix} ({\hat{Y}}_{g}^{(t)} | γ_{t}) & \sim P ({\hat{λ}}_{g} e^{γ_{t}}) \\ {\hat{λ}}_{g} & = exp [x_{g}^{T} \hat{c} + l n (1 / n_{g})] \\ so that E [{\hat{Y}}_{g}^{(t)}] & = λ_{g} e^{σ^{2} / 2} \end{matrix}

and on the conditional ones, where the unknown component of claim t is predicted by the so-called best linear estimate (that minimizes the MSEP)

{\tilde{γ}}_{t}

(see [25])

\begin{matrix} ({\tilde{Y}}_{g}^{(t)} | {\tilde{γ}}_{t}) & \sim P ({\hat{λ}}_{g} e^{{\tilde{γ}}_{t}}) \\ so that E [{\tilde{Y}}_{g}^{(t)}] & = λ_{g} e^{{\tilde{γ}}_{t}} \end{matrix}

It is then possible to compute the overall best estimate for the total amount of reserves.

3.2.2. Illustration and Discussion

We illustrate these results with the same macro-level dataset. In order to construct a micro-level model from Table 2, we follow a procedure similar to the one described in the previous section,

1-3.: see previous section;
4.: for each accident year, allocate randomly the source (t) of each payment;
5.: fit model G; and
6.: compute the best estimate and the MSEP of the reserve.

For a fixed value of θ, the procedure is repeated 1000 times. Various values were considered for θ (10, 25, 50, 100 and 250), and results were similar. In order to avoid heavy tables, only the case where

θ = 10

is mentioned here. Simulations and computations were performed with R, relying on package lme4. Final results are reported in Table 4. On Figure 3 we can see predictions of the model on observed data, while on Figure 4 we can see predictions of the model for non-observed cells (with

\pm 2 σ

in both cases). For each simulation, a LRT is performed to check the non-nullity of the variance term and to confirm the necessity of including a random intercept in the model, which means that correlation among payments (related to the same claim) is positive. Observe that with the mixed model, the log-likelihood is approximated using numerical integration, which might bias p-values of the test. To avoid that, p have been confirmed using a bootstrap procedure (using package glmmML).

4. Conclusions

In this article, we study equivalence (as well as non-equivalence) between Poisson and quasi-Poisson regression models, obtained on aggregated (so called macro-level) and non-aggregated (micro-level) datasets, in order to understand when using micro-level models might over-perform macro-level ones. Those models are used here in the context of estimating claims reserves. The uncertainty is quantified using—as in standard macro-level approaches—the MSEP. We also investigate the impact of adding micro-level covariates in the model. Finally, we discuss the use of mixed Poisson regression, in the case of micro-level data, that might take into account possible dependence between observations in different clusters.

We illustrate theoretical results on simulated data, generated from cumulated payments, in R. A methodology that allows us to generate such micro-level datasets is described. In a first part, we compare results obtained with Poisson and quasi-Poisson regression, on micro- and macro-level datasets. That comparison reveals that in the context of a quasi-Poisson regression model, the expected number of claims plays a crucial role, since above a given threshold, the micro-level model is more robust than the macro-one. Moreover, the presence (or absence) of covariates can affect this threshold.

In a second part, we analyse predictions obtained using a mixed Poisson regression model, i.e., a Poisson regression model with a random component that characterizes dependence among payments on the same claim, at different dates. The necessity of this random component is verified by using a likelihood-ratio test. That study reveals that such a dependence might have a non-negligible impact on predictions.

Of course, that study is only the first step and several directions for future research can be intuited. For instance, as micro-level models are based on much more observations than macro-level ones, more robust estimation techniques, such as generalized method of moments, can be considered.

Acknowledgments

The authors thank two anonymous referees and the associate editor for useful comments which helped to improve the paper substantially. The first author received financial support from Natural Sciences and Engineering Research Council of Canada (NSERC) and the ACTINFO chair. The second author received financial support from Natural Sciences and Engineering Research Council of Canada (NSERC).

Author Contributions

Both authors contributed equally to this work.

Conflicts of Interest

The authors declare no conflict of interest.

References

E. Astesan. Les réserves techniques des sociétés d’assurances contre les accidents automobiles. Paris, France: Librairie générale de droit et de jurisprudence, 1938. [Google Scholar]
R. Mack. “Distribution-free calculation of the standard error of chain ladder reserve estimates.” ASTIN Bull. 23 (1993): 213–225. [Google Scholar] [CrossRef]
P.D. England, and R.J. Verrall. “Stochastic claims reserving in general insurance.” Br. Actuar. J. 8 (2003): 443–518. [Google Scholar] [CrossRef]
J. Van Eeghen. “Loss reserving methods.” In Surveys of Actuarial Studies 1. The Hague, The Netherlands: Nationale-Nederlanden, 1981. [Google Scholar]
G.C. Taylor. Claims Reserving in Non-Life Insurance. Amsterdam, The Netherlands: North-Holland, 1986. [Google Scholar]
J.E. Karlsson. “The expected value of IBNR claims.” Scand. Actuar. J. 1976 (1976): 108–110. [Google Scholar] [CrossRef]
E. Arjas. “The claims reserving problem in nonlife insurance—Some structural ideas.” ASTIN Bull. 19 (1989): 139–152. [Google Scholar] [CrossRef]
W.S. Jewell. “Predicting IBNYR events and delays I. Continuous time.” ASTIN Bull. 19 (1989): 25–55. [Google Scholar] [CrossRef]
R. Norberg. “Prediction of outstanding liabilities in non-life insurance.” ASTIN Bull. 23 (1993): 95–115. [Google Scholar] [CrossRef]
O. Hesselager. “A Markov model for loss reserving.” ASTIN Bull. 24 (1994): 183–193. [Google Scholar] [CrossRef]
R. Norberg. “Prediction of outstanding liabilities II: Model variations and extensions.” ASTIN Bull. 29 (1999): 5–25. [Google Scholar] [CrossRef]
O. Hesselager, and R.J. Verrall. “Reserving in Non-Life Insurance.” Available online: http://onlinelibrary.wiley.com (accesssed on 29 February 2016).
X.B. Zhao, X. Zhou, and J.L. Wang. “Semiparametric model for prediction of individual claim loss reserving.” Insur. Math. Econ. 45 (2009): 1–8. [Google Scholar] [CrossRef]
X. Zhao, and X. Zhou. “Applying copula models to individual claim loss reserving methods.” Insur. Math. Econ. 46 (2010): 290–299. [Google Scholar] [CrossRef]
M. Pigeon, K. Antonio, and M. Denuit. “Individual loss reserving using paid-incurred data.” Insur. Math. Econ. 58 (2014): 121–131. [Google Scholar] [CrossRef]
K. Antonio, and R. Plat. “Micro-level stochastic loss reserving for general insurance.” Scand. Actuar. J. 2014 (2014): 649–669. [Google Scholar] [CrossRef]
X. Jin, and E.W. Frees. “Comparing Micro- and Macro-Level Loss Reserving Models.” Madison, WI, USA: Presentation at ARIA, 2015. [Google Scholar]
A. Johansson. “Claims Reserving on Macro- and Micro-Level.” Master’s Thesis, Royal Institute of Technology, Stockholm, Sweden, 2015. [Google Scholar]
J. Friedland. Estimating Unpaid Claims Using Basic Techniques. Arlington, VA, USA: Casualty Actuarial Society, 2010. [Google Scholar]
G.J. Van den Berga, and B. van der Klaauw. “Combining micro and macro unemployment duration data.” J. Econom. 102 (2001): 271–309. [Google Scholar] [CrossRef]
F. Altissimo, B. Mojon, and P. Zaffaroni. Fast Micro and Slow Macro: Can Aggregation Explain the Persistence of Inflation? European Central Bank Working Papers; 2007, Volume 0729. [Google Scholar]
W.H. Greene. Econometric Analysis, 5th ed. Upper Saddle River, NJ, USA: Prentice Hall, 2003. [Google Scholar]
P. McCullagh, and J.A. Nelder. Generalized Linear Models. London, UK: Chapman & Hall, 1989. [Google Scholar]
G.M. Cordeiro, and P. McCullagh. “Bias correction in generalized linear models.” J. R. Stat. Soc. B 53 (1991): 629–643. [Google Scholar]
M. Wüthrich, and M. Merz. Stochastic Claims Reserving Methods. Hoboken, NJ, USA: Wiley Interscience, 2008. [Google Scholar]
M. Ruoyan. “Estimation of Dispersion Parameters in GLMs with and without Random Effects.” Stockholm University, 2004. Available online: http://www2.math.su.se/matstat/reports/serieb/2004/rep5/report.pdf (accessed on 29 February 2016).
T.A.B. Snijders, and R.J. Bosker. Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling. Thousand Oaks, CA, USA: Sage Publishing, 2012. [Google Scholar]
S.G. Self, and K.Y. Liang. “Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions.” J. Am. Stat. Assoc. 82 (1987): 605–610. [Google Scholar] [CrossRef]
D. Dunson. Random Effect and Latent Variable Model Selection. Lecture Notes in Statistics; New York, NY, USA: Springer-Verlag, 2008, Volume 192. [Google Scholar]
S. Christofides. “Regression models based on log-incremental payments.” Claims Reserv. Man. 2 (1997): D5.1–D5.53. [Google Scholar]
T. Mack, and G. Venter. “A comparison of stochastic models that reproduce chain ladder reserve estimates.” Insur. Math. Econ. 26 (2000): 101–107. [Google Scholar] [CrossRef]
A. Skrondal, and S. Rabe-Hesketh. “Prediction in multilevel generalized linear models.” J. R. Stat. Soc. A 172 (2009): 659–687. [Google Scholar] [CrossRef]

Figure 1. Square root of the mean square error of prediction obtained for Model D (solid line) and Model B (broken line) from simulated values for increasing expected number of payments for the portfolio.

Figure 2. Mean square error of prediction (

\pm 2 σ

) obtained from simulated values as a function of the expected number of payments for Model E (red lines) and Model F (blue lines). For comparison purposes, the MSEP obtained for the Model D (solid black line) and the Model B (broken black line) are added.

Figure 2. Mean square error of prediction (

\pm 2 σ

) obtained from simulated values as a function of the expected number of payments for Model E (red lines) and Model F (blue lines). For comparison purposes, the MSEP obtained for the Model D (solid black line) and the Model B (broken black line) are added.

Figure 3. Observed data (circles) with conditional predictions (red lines) and unconditional ones (blue lines) from Model G with

θ = 10

.

Figure 3. Observed data (circles) with conditional predictions (red lines) and unconditional ones (blue lines) from Model G with

θ = 10

.

Figure 4. Predictions with the quasi-Poisson macro-level model (strong black line), with conditional predictions (red lines) and unconditional ones (blue lines) from Model G with

θ = 10

.

Figure 4. Predictions with the quasi-Poisson macro-level model (strong black line), with conditional predictions (red lines) and unconditional ones (blue lines) from Model G with

θ = 10

.

Table 1. Quasi-Poisson macro- and micro-level models for reserve (

i, j = 1, \dots, I

). All clusters and all payments are independent.

**Table 1.** Quasi-Poisson macro- and micro-level models for reserve ( $i, j = 1, \dots, I$ ). All clusters and all payments are independent.
Components	Macro	Micro
Exp. value	$E [Y_{i, j}] = λ_{i, j}$	$E [Y_{i, j}^{(k)}] = λ_{i, j}$
Inv. link func.	$λ_{i, j} = exp [x_{i, j}^{T} b]$	$λ_{i, j} = exp [x_{i, j}^{T} a + log (1 / n_{i, j})]$
	$= exp [b_{i} + b_{I + j}]$	$= exp [a_{i} + a_{I + j} + log (1 / n_{i, j})]$
	with $b_{I + 1} = 0$	with $a_{I + 1} = 0$
Variance	$Var [Y_{i, j}] = φ_{m a c r o} λ_{i, j}$	$Var [Y_{i, j}^{(k)}] = φ_{m i c r o} λ_{i, j}$
Pred. value	${\hat{Y}}_{i, j} = exp [{\hat{b}}_{i} + {\hat{b}}_{I + j}]$	${\hat{Y}}_{i, j}^{(k)} = exp [{\hat{a}}_{i} + {\hat{a}}_{I + j} + log (1 / n_{i, j})]$
Known values	$Y_{m a c r o}$	$Y_{m i c r o}$

Table 2. Incremental run-off triangle for macro-level model (in 000’s).

**Table 2.** Incremental run-off triangle for macro-level model (in 000’s).
	1	2	3	4	5	6	7
1	3511	3215	2266	1712	1059	587	340
2	4001	3702	2278	1180	956	629	–
3	4355	3932	1946	1522	1238	–	–
4	4295	3455	2023	1320	–	–	–
5	4150	3747	2320	–	–	–	–
6	5102	4548	–	–	–	–	–
7	6283	–	–	–	–	–	–

Table 3. Results.

**Table 3.** Results.
Method	$E [Reserve]$	$\sqrt{MSEP}$
Mack’s model	28655773	1417267
Poisson reg.
Model A	28655773	11622
Model C	28655773	11622
quasi-Poisson reg.
Model B	28655773	1708196
Model D	28655773	see Figure 1
quasi-Poisson reg.
Model E ( $ρ \approx 0$ )	28657364	see Figure 2
Model F ( $ρ \approx 0.8$ )	20514566	see Figure 2

Table 4. Numerical results for

θ = 10

. Results for different values of θ are similar.

**Table 4.** Numerical results for $θ = 10$ . Results for different values of θ are similar.
Modèle	$E [Reserve]$	$\sqrt{Var (Reserve)}$
coll. quasi-Pois.	28656423	1708216
mixed Poisson non-cond.	27930624	3297401
mixed Poisson cond.	25972947	2280902

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Charpentier, A.; Pigeon, M. Macro vs. Micro Methods in Non-Life Claims Reserving (an Econometric Perspective). Risks 2016, 4, 12. https://doi.org/10.3390/risks4020012

AMA Style

Charpentier A, Pigeon M. Macro vs. Micro Methods in Non-Life Claims Reserving (an Econometric Perspective). Risks. 2016; 4(2):12. https://doi.org/10.3390/risks4020012

Chicago/Turabian Style

Charpentier, Arthur, and Mathieu Pigeon. 2016. "Macro vs. Micro Methods in Non-Life Claims Reserving (an Econometric Perspective)" Risks 4, no. 2: 12. https://doi.org/10.3390/risks4020012

APA Style

Charpentier, A., & Pigeon, M. (2016). Macro vs. Micro Methods in Non-Life Claims Reserving (an Econometric Perspective). Risks, 4(2), 12. https://doi.org/10.3390/risks4020012

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Macro vs. Micro Methods in Non-Life Claims Reserving (an Econometric Perspective)

Abstract

1. Introduction

1.1. Macro and Micro Methods

1.2. Agenda

2. Clustering in Generalized Linear Mixed Models

2.1. The Multiple Linear Regression Model

2.2. The Quasi-Poisson Regression

2.3. Poisson Regression with Random Effect

3. Clustering and Loss Reserving Models

3.1. The Quasi-Poisson Model for Reserves

3.1.1. Construction

3.1.2. Illustration and Discussion

3.2. The Mixed Poisson Model for Reserves

3.2.1. Construction

3.2.2. Illustration and Discussion

4. Conclusions

Acknowledgments

Author Contributions

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI