# Claim Watching and Individual Claims Reserving Using Classification and Regression Trees


## Abstract


## 1. Introduction

`rpart` routine implemented in `R`. The results of the CART calibration are discussed in detail and a possible use of event predictions for early warning is illustrated.

**Part I. One-period Predictions**

## 2. A First Look at the Problem and the Model

- (a) What is the probability that $\mathcal{C}$ is closed in the next year?
- (b) What is the probability that a lawyer will be involved in the settlement of $\mathcal{C}$ in two years?
- (c) What is the expectation of a payment in respect of $\mathcal{C}$ in the next year?
- (d) What is the expectation of the total claim payments in respect of $\mathcal{C}$ until finalization?

- ${\mathcal{F}}_{t}$ denotes the information available at time t,
- the vector ${\mathit{x}}_{t}^{(\mathcal{C})}\in \mathcal{X}$ is the claim feature (also covariates, explanatory variables, independent variables, …), which is observed up to time t, i.e., is ${\mathcal{F}}_{t}$-measurable,
- $\mu :\mathcal{X}\to \mathbb{R}$ is the prediction function,
- ${Y}_{t+\tau}^{(\mathcal{C})}$ is the response variable (or dependent variable).

- (a) ${Y}_{t+\tau}^{(\mathcal{C})}$ is the indicator function of the event {$\mathcal{C}$ is closed at time $t+\tau $} (with $\tau =1$),
- (b) ${Y}_{t+\tau}^{(\mathcal{C})}$ is the indicator function of {$\mathcal{C}$ will involve a lawyer by the time $t+\tau $} (with $\tau =2$),
- (c) ${Y}_{t+\tau}^{(\mathcal{C})}$ is the random variable denoting the amount paid in respect of $\mathcal{C}$ at time $t+\tau $ (with $\tau =1$),
- (d) ${Y}_{t+\tau}^{(\mathcal{C})}$ is the random variable denoting the cumulated paid amount in respect of $\mathcal{C}$ at time $t+\tau $ (with $\tau \to \infty $).
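With the symbols defined above, the four questions can be read as instances of one generic prediction problem. The display below is a sketch of that reading, written as an assumption consistent with the definitions of ${\mathcal{F}}_{t}$, ${\mathit{x}}_{t}^{(\mathcal{C})}$, $\mu$ and ${Y}_{t+\tau}^{(\mathcal{C})}$; it is not copied from the text.

```latex
\mathbb{E}\left[ Y_{t+\tau}^{(\mathcal{C})} \,\middle|\, \mathcal{F}_{t} \right]
  = \mu\left( \mathit{x}_{t}^{(\mathcal{C})} \right),
\qquad
\text{and, if } Y_{t+\tau}^{(\mathcal{C})} = \mathbb{1}_{A}: \quad
\mathbb{P}\left[ A \,\middle|\, \mathcal{F}_{t} \right]
  = \mu\left( \mathit{x}_{t}^{(\mathcal{C})} \right).
```

Under this reading, cases (a) and (b) are classification problems, because the expectation of an indicator is a probability, while cases (c) and (d) are regression problems.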

**Remark 1.**

## 3. Notation and Basic Assumptions

`ay`). The accident years are indexed as $i=1,\dots ,I$. Then we are at time (calendar year) $t=I$.

`rd`) $j=0,1,\dots $. A claim with accident year i and reporting delay j will have reporting date $i+j$. As usual, we assume that there exists a maximum possible delay $J\ge 0$.

`cc`. For each block $(i,j)$ there are ${N}_{i,j}$ claims and we denote by $\nu =1,\dots ,{N}_{i,j}$ the index numbering the claims in block $(i,j)$; the $\nu $-th claim in $(i,j)$ is denoted by ${\mathcal{C}}_{i,j}^{(\nu )}$.

**Remark 2.**

- ${\mathit{X}}_{i,j|k}^{(\nu )}$ a generic random variable, possibly multidimensional, involved in the claim settlement process of ${\mathcal{C}}_{i,j}^{(\nu )}$ and observed at time $t=i+j+k$, for $k\in {\mathbb{N}}_{0}$,
- $\ell :=j+k=t-i$ the time-lag of ${\mathit{X}}_{i,j|k}^{(\nu )}$.

- (H0) At any date t the one-year prediction function $\mu \left({\mathit{x}}_{i,j|t-(i+j)}^{(\nu )}\right)$ depends only on the time-lag $\ell =t-i$, i.e.:$${\mu}_{t-i}:\mathcal{X}\to \mathbb{R},\quad {\mathit{x}}_{i,j|t-(i+j)}^{(\nu )}\mapsto {\mu}_{t-i}\left({\mathit{x}}_{i,j|t-(i+j)}^{(\nu )}\right)\,.$$

## 4. The General Structure of the Frequency-Severity Model

#### 4.1. Frequency and Severity Response Variables

**Remark 3.**

#### 4.2. Model Assumptions

- (H1) The processes ${({N}_{i,j})}_{i,j}$, ${({\mathit{F}}_{i,j|k}^{(\nu )})}_{i,j,k,\nu}$, ${({S\mathit{1}}_{i,j|k}^{(\nu )})}_{i,j,k,\nu}$ and ${({S\mathit{2}}_{i,j|k}^{(\nu )})}_{i,j,k,\nu}$ are independent.
- (H2) The random variables in ${({N}_{i,j})}_{i,j}$, ${({\mathit{F}}_{i,j|k}^{(\nu )})}_{i,j,k,\nu}$, ${({S\mathit{1}}_{i,j|k}^{(\nu )})}_{i,j,k,\nu}$ and ${({S\mathit{2}}_{i,j|k}^{(\nu )})}_{i,j,k,\nu}$ for different accident years are independent.
- (H3) The processes ${({\mathit{F}}_{i,j|k}^{(\nu )})}_{k}$, ${({S\mathit{1}}_{i,j|k}^{(\nu )})}_{i,j,k,\nu}$ and ${({S\mathit{2}}_{i,j|k}^{(\nu )})}_{i,j,k,\nu}$ for different reporting delays j and different claims ν are independent.
- (H4) The conditional distribution of ${\mathit{F}}_{i,j|k}^{(\nu )}$ is the d-dimensional Bernoulli:$${\mathit{F}}_{i,j|k+1}^{(\nu )}|{\mathcal{F}}_{i+j+k}\sim d\text{-}\mathrm{Bernoulli}\left({p}_{j+k}^{(\mathit{f})}\left({\mathit{x}}_{i,j|k}^{(\nu )}\right)\right)\,,$$ with $$\sum _{{f}_{1},\dots ,{f}_{d}\in \{0,1\}}{p}_{j+k}^{(\mathit{f})}\left({\mathit{x}}_{i,j|k}^{(\nu )}\right)=1\,.$$
- (H5) For the conditional distribution of ${S\mathit{1}}_{i,j|k}^{(\nu )}|({\overline{S\mathit{1}}}_{i,j|k}^{(\nu )}=1)$ and ${S\mathit{2}}_{i,j|k}^{(\nu )}|({\overline{S\mathit{2}}}_{i,j|k}^{(\nu )}=1)$ one has:$$\begin{array}{l}{S\mathit{1}}_{i,j|k+1}^{(\nu )}|\left({\overline{S\mathit{1}}}_{i,j|k+1}^{(\nu )}=1\right)\sim \mathcal{N}\left({\tilde{\mu}}_{j+k}^{(1)}\left({\tilde{\mathit{x}}}_{i,j|k}^{(\nu )}\right),{\sigma}_{1}^{2}\right)\,,\\ {S\mathit{2}}_{i,j|k+1}^{(\nu )}|\left({\overline{S\mathit{2}}}_{i,j|k+1}^{(\nu )}=1\right)\sim \mathcal{N}\left({\tilde{\mu}}_{j+k}^{(2)}\left({\tilde{\mathit{x}}}_{i,j|k}^{(\nu )}\right),{\sigma}_{2}^{2}\right)\,,\end{array}$$

**Remark 4.**

- The independence assumptions (H1), (H2) and (H3) were made to keep the model from becoming too complex. In particular, the assumptions in (H1) are necessary to obtain compound distributions, and the assumptions in (H3) allow the variables of individual claims to be modeled independently for different ν.
- However, the specified model is rather general as regards the prediction functions p and $\tilde{\mu}$ in (5) and (6). At this stage these functions are fully non-parametric and can have any form. In the following sections we will show how they can be calibrated with machine-learning methods provided by CARTs.
- The values of the variance parameters ${\sigma}_{1}^{2},{\sigma}_{2}^{2}$ in (4) are irrelevant, since the normality assumption is used in this paper only to support the sum of squared errors (SSE) minimization for the calibration of the regression trees, and the minimizer does not depend on the variance.
- Our model assumptions concern only one-year forecasting (from time t to $t+1$). Under proper conditions multiyear predictions can be obtained by compounding one-period predictions. This will be illustrated in Section 9.

#### 4.3. Equivalent One-Dimensional Formulation of Frequency Responses

- (H4’) For the conditional distribution of ${W}_{i,j|k}^{(\nu )}$ one has:$${W}_{i,j|k+1}^{(\nu )}|{\mathcal{F}}_{i+j+k}\sim \mathrm{Categorical}\left({p}_{j+k}^{(w)}\left({\mathit{x}}_{i,j|k}^{(\nu )}\right)\right)\,,$$ with $$\sum _{w=0}^{{2}^{d}-1}{p}_{j+k}^{(w)}\left({\mathit{x}}_{i,j|k}^{(\nu )}\right)=1\,.$$

`R` package `rpart` we use in these examples, multidimensional responses are not supported.
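The passage from a d-dimensional binary response to a one-dimensional categorical response with $2^{d}$ states can be sketched as a binary encoding. The Python sketch below is illustrative only: the choice $d=4$, the component names and the bit order are assumptions, chosen so that the state labels `CYN0` and `CNY0` map to the values 5 and 6 quoted in Section 8.1. It is not the definition used in the paper.

```python
from itertools import product

# Hypothetical component order (an assumption, not taken from the text):
# bit 0 = type-1 payment, bit 1 = type-2 payment, bit 2 = closure, bit 3 = lawyer.
COMPONENTS = ("pay1", "pay2", "closed", "lawyer")


def encode(f):
    """Map a tuple of d binary indicators to an integer w in 0 .. 2**d - 1."""
    return sum(bit << pos for pos, bit in enumerate(f))


def decode(w, d=len(COMPONENTS)):
    """Inverse of encode: recover the d binary indicators from w."""
    return tuple((w >> pos) & 1 for pos in range(d))


def label(f):
    """Four-character state label in the style of CYN0:
    closed/open, type-1 payment Y/N, type-2 payment Y/N, lawyer 0/1."""
    pay1, pay2, closed, lawyer = f
    return (
        ("C" if closed else "O")
        + ("Y" if pay1 else "N")
        + ("Y" if pay2 else "N")
        + str(lawyer)
    )


# Enumerate all 2**d states with their one-dimensional code.
states = {encode(f): label(f) for f in product((0, 1), repeat=len(COMPONENTS))}
```

Because the map is a bijection, a classification tree fitted on the integer code carries the same information as one fitted on the vector of indicators.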

## 5. Characterizing the Feature Space

`cc` (categorical), the accident year i and the reporting delay j (ordered).

- ${\mathit{A}}_{i,j}^{(\nu )}$ is a column vector of static variables,
- ${\mathit{B}}_{i,j|h}^{(\nu )},\; h=0,\dots ,k$, is a column vector of dynamic variables observed in year $i+j+h$.

- ${\mathit{A}}_{i,1}^{(\nu )}={\left(\mathtt{cc},i,j\right)}^{\prime}$,
- ${\mathit{B}}_{i,1|0}^{(\nu )}={\left(0,{Z}_{i,1|0}^{(\nu )},0,{\overline{S\mathit{1}}}_{i,1|0}^{(\nu )},0,{\overline{S\mathit{2}}}_{i,1|0}^{(\nu )}\right)}^{\prime}$,
- ${\mathit{B}}_{i,1|1}^{(\nu )}={\left({Z}_{i,1|1}^{(\nu )},{\overline{S\mathit{1}}}_{i,1|1}^{(\nu )},{\overline{S\mathit{2}}}_{i,1|1}^{(\nu )},{S\mathit{1}}_{i,1|1}^{(\nu )},{S\mathit{2}}_{i,1|1}^{(\nu )}\right)}^{\prime}$.

## 6. Organization of Data for the Estimation

`cc` $=1,2,5,6$. The feature of claims 1 and 2, belonging to accident year 1, is observed up to time $t=i+\ell =4$, but only features observed up to time $t=3$ can be used for the estimation. For claims 2, 3, which are reported with a one-year delay, historical data is missing for calendar years 1 and 2, respectively. Cells highlighted in green correspond to the data sets ${\mathcal{D}}_{\ell}^{P},\,\ell =0,1,2$, used for the estimates ${\widehat{\mathit{Y}}}_{I-\ell ,j|\ell -j+1}^{(\nu )}$ of the responses, which replace the missing values in Table 2.
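The reorganization of feature-response pairs by lag, from the layout of Table 1 to that of Table 2, can be sketched in a few lines of Python. The function name `split_by_lag` and the dictionary layout are hypothetical. The sketch only tracks the index triples $(i,j,k)$ and assumes one claim per block, as in Table 1 with $I=4$.

```python
from collections import defaultdict


def split_by_lag(claims, I):
    """Group observation points (i, j, k) by lag l = j + k.

    claims: iterable of (accident year i, reporting delay j).
    Returns (calibration, prediction), two dicts mapping lag -> list of (i, j, k).
    A feature observed at time t = i + j + k has an observed one-year response
    only if t + 1 <= I; features observed at t = I are the ones to be predicted.
    """
    calibration, prediction = defaultdict(list), defaultdict(list)
    for i, j in claims:
        for k in range(I - (i + j) + 1):
            t = i + j + k
            lag = j + k
            target = calibration if t < I else prediction
            target[lag].append((i, j, k))
    return dict(calibration), dict(prediction)


# Toy portfolio of Table 1: one claim per block (i, j), observed at time I = 4.
I = 4
claims = [(i, j) for i in range(1, I + 1) for j in range(0, I - i + 1)]
calibration, prediction = split_by_lag(claims, I)

for lag in sorted(prediction):
    print(lag, len(calibration.get(lag, [])), len(prediction[lag]))
```

In this toy portfolio the calibration points at lag 1 come from the claims with `cc` 1, 2, 5 and 6, and there are no calibration points at lag 3, which is why the last column of Table 2 cannot be used.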

## 7. Using CARTs for Calibration

#### 7.1. Basic Concepts of CART Techniques

`rpart` routine implemented in `R`, see e.g., Therneau et al. (2015).

#### 7.2. Applying CARTs in the Frequency Model

`rpart` this is obtained with the option `method="class"`, which also implies that the Gini index is used as impurity measure. As previously pointed out, since the `rpart` routine supports only one-dimensional response variables, instead of using the d-dimensional variables F we formulate the classification problem using the one-dimensional variables defined in (8). From (9) we have:

`rpart` routine provides the probabilities:

#### 7.3. Applying CARTs in the Severity Model

`rpart` with the option `method="anova"`. In this case, the loss function used is the sum of squared errors (SSE). Given the normality assumption (H5), the SSE minimization performed by the binary splitting algorithm corresponds to a log-likelihood maximization in this non-parametric setting.
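The link between the normality assumption and the SSE criterion can be made explicit with a short derivation. The notation below is introduced only for this illustration and is an assumption: $y_{m}$, $m=1,\dots ,n$, are the observed severities in the calibration set, $r(m)$ is the leaf containing observation $m$, $n_{r}$ is the number of observations in leaf $r$, and $\mu_{r}$ is the constant prediction in leaf $r$.

```latex
\log L(\mu_{1},\dots,\mu_{R})
  = -\frac{n}{2}\log\left(2\pi\sigma^{2}\right)
    - \frac{1}{2\sigma^{2}}\sum_{m=1}^{n}\left(y_{m}-\mu_{r(m)}\right)^{2},
\qquad
\widehat{\mu}_{r} = \frac{1}{n_{r}}\sum_{m:\,r(m)=r} y_{m}.
```

For a fixed partition the first term does not depend on the $\mu_{r}$, so maximizing the log-likelihood is the same as minimizing the SSE, and the maximizer is the leaf mean whatever the value of $\sigma^{2}$.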

## 8. Examples of One-Year Predictions in Motor Insurance

- Observed accident years: from 2010 to 2015. Then $i=1,\dots ,I$ with $I=6$.
- Only claims reported from 2013 onwards are observed, hence for accident year i, one has $j={\underline{j}}_{i},\dots ,6-i$, with ${\underline{j}}_{i}=(4-i)\vee 0$.
- The pairs feature-response are observed for lags $\ell =0,\dots ,I-2=4$ (5 estimation steps).

#### 8.1. Prediction of Events Using the Frequency Model

`rpart` implemented in `R`. The input data in ${\mathcal{D}}_{1}^{C}$ is organized as a table (a `data frame`) where each row corresponds to a claim and the columns report the value of the response and of all the feature components observed at different historical dates.

`R` command is used for the calibration, see Therneau et al. (2015) for details (footnote 3):

`dt_freq1` is the calibration set ${\mathcal{D}}_{1}^{C}$, and the variables are relabeled as follows:

`_h`, $\mathtt{h}=0,\dots ,\ell $, are observed at time $i+\mathtt{h}$, i.e., have historical depth $\theta =\ell -\mathtt{h}+1$. Therefore for $\ell =1$ variables with `_1` have $\theta =1$ and variables with `_0` have $\theta =2$.

`freqtree1`, was grown by `rpart`. In a second step `freqtree1` has been pruned using 10-fold cross-validation and applying the one-standard-error rule. The resulting pruned tree is reported in Figure 1, which is obtained by the package `rpart.plot`.
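The one-standard-error rule mentioned above can be sketched independently of any library: among the nested subtrees, pick the smallest one whose cross-validated error is within one standard error of the minimum. The Python sketch below is a minimal illustration. The table values are hypothetical and the dictionary keys only mimic the columns of the `cptable` that a fitted `rpart` object exposes (`CP`, `nsplit`, `xerror`, `xstd`).

```python
def one_se_choice(cp_table):
    """Return the row of the smallest tree within one SE of the best xerror.

    cp_table: list of dicts with keys 'cp', 'nsplit', 'xerror', 'xstd',
    ordered by increasing number of splits (smallest tree first).
    """
    best = min(cp_table, key=lambda row: row["xerror"])
    threshold = best["xerror"] + best["xstd"]
    for row in cp_table:  # scan from the smallest tree upward
        if row["xerror"] <= threshold:
            return row
    return best  # not reached: 'best' itself always satisfies the condition


# Hypothetical cross-validation table (illustrative numbers only).
cp_table = [
    {"cp": 0.200, "nsplit": 0, "xerror": 1.000, "xstd": 0.020},
    {"cp": 0.050, "nsplit": 1, "xerror": 0.800, "xstd": 0.018},
    {"cp": 0.020, "nsplit": 3, "xerror": 0.700, "xstd": 0.017},
    {"cp": 0.012, "nsplit": 4, "xerror": 0.690, "xstd": 0.017},
    {"cp": 0.010, "nsplit": 7, "xerror": 0.685, "xstd": 0.017},
]
chosen = one_se_choice(cp_table)
print(chosen["nsplit"], chosen["cp"])
```

The complexity parameter of the chosen row is then the value at which the fully grown tree would be cut back. The rule trades a small loss in cross-validated error for a simpler and more stable tree.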

`rpart` numerical output provides more precise figures.

`CYN0`, i.e., the state with a type-1 payment and claim closing (${W}_{5,j|2-j}^{(\nu )}=5$). In leaf 3 we find the claims in the calibration set which at time 4 and 5 were open without type-1 reserve and with a lawyer involved, i.e., $({Z}_{i,j|1-j}^{(\nu )}=0)\cap ({\overline{R\mathit{1}}}_{i,j|1-j}^{(\nu )}=0)\cap ({L}_{i,j|1-j}^{(\nu )}=1),\; i=3,4$. These claims are $0.2\%$ of the total. From the frequency table we conclude that for claims that at time I have the same feature the most probable state at time $I+1$ ($36\%$ probability) is `CNY0`, i.e., the state with a type-2 payment and claim closing (${W}_{5,j|2-j}^{(\nu )}=6$). In the fourth binary split, which produces the first two leaves in the tree, the splitting criterion is the existence of a type-2 payment (indicator `P2_1`) for claims which at time 4 and 5 were open without type-1 reserve and without a lawyer. From the frequency tables in the second and the first leaf (referring to about $1\%$ and $2\%$ of the claims of the calibration set, respectively), one finds that if at time I the claim has a type-2 payment, the most probable state at time $I+1$ ($33\%$) is `CNY0`; otherwise the most probable state ($40\%$) is `ONN0`, i.e., it remains open without payments and without involving a lawyer.

`_0`), none of these variables has been considered useful for prediction by the algorithm (after pruning). Only explanatory variables with $\theta =1$ (subscript `_1`) have been used for the splits in the pruned tree.

#### 8.2. Possible Use for Early Warnings

#### 8.3. Prediction of Claim Payments Using the Conditional Severity Model

- The estimate of a type-1 (i.e., NoCARD) payment for open claims with a type-1 reserve in place, for which we consider the claims in leaf 4 in the frequency tree in Figure 1.
- The estimate of a type-2 (i.e., CARD) payment for open claims without a type-1 reserve in place and with a lawyer involved, for which we consider the claims in leaf 3 in Figure 1.

`R` command is used:

`dt_sev4` is the calibration set and the relabeling is used:

`sevtree4` was grown by `rpart` and then pruned using 10-fold cross-validation and applying the one-standard-error rule. The pruned tree thus obtained is illustrated in Figure 2, provided by `rpart.plot`.

`R` command used for this regression tree is similar to that for Case 1. The tree pruned with the usual method is reported in Figure 3.

**Part II. Multiperiod Predictions and Backtesting**

## 9. Multiperiod Predictions

#### 9.1. The Shift-Forward Procedure and the Self-Sustaining Property

#### 9.2. Illustration in Terms of Partitions

#### 9.3. Illustration in Terms of Conditional Expectations

## 10. The Simulation Approach

#### 10.1. A Typical Multiperiod Prediction Problem

#### 10.2. Simulation of Sample Paths and Reserve Estimates

- 0. Initialization. Set:$${\ell}_{0}=\ell \,,\quad {K\mathit{1}}_{i,j|{\ell}_{0}-j}^{(\nu )}=0\,,\quad {K\mathit{2}}_{i,j|{\ell}_{0}-j}^{(\nu )}=0\,,\quad {\widehat{\mathit{x}}}_{i,j|{\ell}_{0}-j}^{(\nu )}={\mathit{x}}_{i,j|{\ell}_{0}-j}^{(\nu )}\,,\quad {\widehat{\tilde{\mathit{x}}}}_{i,j|{\ell}_{0}-j}^{(\nu )}={\tilde{\mathit{x}}}_{i,j|{\ell}_{0}-j}^{(\nu )}\,.$$
- 1. Find the index r of the leaf of ${\mathcal{P}}_{{\ell}_{0}}$ to which the feature ${\widehat{\mathit{x}}}_{i,j|{\ell}_{0}-j}^{(\nu )}$ belongs.
- 2. Simulate the state w of the frequency response ${\widehat{W}}_{i,j|{\ell}_{0}-j+1}^{(\nu )}$ at time ${\ell}_{0}+i+1$ using the probability distribution corresponding to the r-th leaf of ${\mathcal{P}}_{{\ell}_{0}}$.
- 3. If w implies:
  - a. a type-1 payment (i.e., a NoCARD payment) at time ${\ell}_{0}+i+1$, then assume as the expected paid amount at time ${\ell}_{0}+i+1$ the estimate ${\widehat{S\mathit{1}}}_{i,j|{\ell}_{0}-j+1}^{(\nu )}$ corresponding to the leaf of ${\mathcal{Q}}_{{\ell}_{0}}^{(1)}$ to which the feature ${\widehat{\tilde{\mathit{x}}}}_{i,j|{\ell}_{0}-j}^{(\nu )}$ belongs.
  - b. a type-2 payment (i.e., a CARD payment) at time ${\ell}_{0}+i+1$, then assume as the expected paid amount at time ${\ell}_{0}+i+1$ the estimate ${\widehat{S\mathit{2}}}_{i,j|{\ell}_{0}-j+1}^{(\nu )}$ corresponding to the leaf of ${\mathcal{Q}}_{{\ell}_{0}}^{(2)}$ to which the feature ${\widehat{\tilde{\mathit{x}}}}_{i,j|{\ell}_{0}-j}^{(\nu )}$ belongs.
  - c. no payments at time ${\ell}_{0}+i+1$, then all payments at time ${\ell}_{0}+i+1$ are set to 0.
- 4. Set:$${K\mathit{1}}_{i,j|{\ell}_{0}-j+1}^{(\nu )}={K\mathit{1}}_{i,j|{\ell}_{0}-j}^{(\nu )}+{\widehat{S\mathit{1}}}_{i,j|{\ell}_{0}-j+1}^{(\nu )},\qquad {K\mathit{2}}_{i,j|{\ell}_{0}-j+1}^{(\nu )}={K\mathit{2}}_{i,j|{\ell}_{0}-j}^{(\nu )}+{\widehat{S\mathit{2}}}_{i,j|{\ell}_{0}-j+1}^{(\nu )}\,.$$
- 5. If ${\ell}_{0}<I-2$ then:
  - 5.1. The features ${\mathit{x}}_{i,j|{\ell}_{0}-j}^{(\nu )}$ and ${\tilde{\mathit{x}}}_{i,j|{\ell}_{0}-j}^{(\nu )}$ are updated with the new information provided by the responses ${\widehat{W}}_{i,j|{\ell}_{0}-j+1}^{(\nu )}$, ${\widehat{S\mathit{1}}}_{i,j|{\ell}_{0}-j+1}^{(\nu )}$ and ${\widehat{S\mathit{2}}}_{i,j|{\ell}_{0}-j+1}^{(\nu )}$, and the new features ${\widehat{\mathit{x}}}_{i,j|{\ell}_{0}-j+1}^{(\nu )}$ and ${\widehat{\tilde{\mathit{x}}}}_{i,j|{\ell}_{0}-j+1}^{(\nu )}$ are then obtained (this requires that the self-sustaining property holds).
  - 5.2. Set ${\ell}_{0}={\ell}_{0}+1$ and return to step 1.
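The loop described by steps 0 to 5 can be sketched in Python. Everything named here is hypothetical: `freq_leaf` stands in for the lookup of the leaf distribution in the frequency tree, `sev_leaf` for the leaf mean of the severity trees, and `update` for the feature update of step 5.1. The toy models are invented for illustration and are not calibrated trees. A state is assumed to be a triple (closed, type-1 payment, type-2 payment).

```python
import random


def simulate_path(x0, start_lag, last_lag, freq_leaf, sev_leaf, update, rng):
    """Simulate one path of cumulated type-1 and type-2 costs for a single claim.

    freq_leaf(lag, x)      -> {state: probability}, state = (closed, pay1, pay2)
    sev_leaf(kind, lag, x) -> expected paid amount of type kind (1 or 2)
    update(x, state, s1, s2) -> feature at the next lag
    """
    x, k1, k2 = x0, 0.0, 0.0                     # step 0: initialization
    path = [(k1, k2)]
    for lag in range(start_lag, last_lag + 1):
        probs = freq_leaf(lag, x)                # step 1: leaf distribution
        states = list(probs)
        weights = [probs[s] for s in states]
        state = rng.choices(states, weights=weights)[0]   # step 2: draw state
        _, pay1, pay2 = state
        s1 = sev_leaf(1, lag, x) if pay1 else 0.0         # step 3: leaf means
        s2 = sev_leaf(2, lag, x) if pay2 else 0.0
        k1, k2 = k1 + s1, k2 + s2                # step 4: cumulate payments
        path.append((k1, k2))
        x = update(x, state, s1, s2)             # step 5: shift forward
    return path


# Toy stand-ins for the calibrated trees (assumptions, illustrative only).
def toy_freq_leaf(lag, x):
    if x["closed"]:
        return {(1, 0, 0): 1.0}                  # closed claims stay closed here
    return {(0, 0, 0): 0.40, (0, 1, 0): 0.20, (1, 1, 0): 0.25, (1, 0, 1): 0.15}


def toy_sev_leaf(kind, lag, x):
    return 1000.0 * (lag + 1) if kind == 1 else 500.0


def toy_update(x, state, s1, s2):
    return {"closed": state[0]}


rng = random.Random(0)
I = 6
paths = [
    simulate_path({"closed": 0}, 0, I - 2, toy_freq_leaf, toy_sev_leaf, toy_update, rng)
    for _ in range(1000)
]
mean_k1 = sum(p[-1][0] for p in paths) / len(paths)
mean_k2 = sum(p[-1][1] for p in paths) / len(paths)
print(round(mean_k1), round(mean_k2))
```

Averaging the final points of the simulated paths gives the reserve estimate for the claim. As in step 3, only the frequency state is drawn at random in this sketch, while the paid amount is set to the leaf mean. The sketch also calls `update` after the last lag, which is harmless because the result is not used.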

#### 10.3. Including Dynamic Modeling of the Case Reserve

- The filtered probability space $(\mathsf{\Omega},\mathcal{F},\mathbb{P},\mathbb{F})$ must include also the two reserve processes:$${({R\mathit{1}}_{i,j|k}^{(\nu )})}_{i,j,k,\nu}\phantom{\rule{0.166667em}{0ex}},\phantom{\rule{1.em}{0ex}}{({R\mathit{2}}_{i,j|k}^{(\nu )})}_{i,j,k,\nu}\phantom{\rule{0.166667em}{0ex}},$$
- For the distribution of the case reserves a property similar to assumption (H5) holds, i.e.: (HR5) For the conditional distribution of ${R\mathit{1}}_{i,j|k}^{(\nu )}|({\overline{R\mathit{1}}}_{i,j|k}^{(\nu )}=1)$ and ${R\mathit{2}}_{i,j|k}^{(\nu )}|({\overline{R\mathit{2}}}_{i,j|k}^{(\nu )}=1)$ one has:$$\begin{array}{l}{R\mathit{1}}_{i,j|k+1}^{(\nu )}|\left({\overline{R\mathit{1}}}_{i,j|k+1}^{(\nu )}=1\right)\sim \mathcal{N}\left({\dot{\mu}}_{j+k}^{(1)}\left({\dot{\mathit{x}}}_{i,j|k}^{(\nu )}\right),{\dot{\sigma}}_{1}^{2}\right)\,,\\ {R\mathit{2}}_{i,j|k+1}^{(\nu )}|\left({\overline{R\mathit{2}}}_{i,j|k+1}^{(\nu )}=1\right)\sim \mathcal{N}\left({\dot{\mu}}_{j+k}^{(2)}\left({\dot{\mathit{x}}}_{i,j|k}^{(\nu )}\right),{\dot{\sigma}}_{2}^{2}\right)\,,\end{array}$$ As for the payment variables, these assumptions imply:$$\begin{array}{l}\mathbb{E}\left[{R\mathit{1}}_{i,j|k+1}^{(\nu )}|{\mathcal{F}}_{i+j+k},\left({\overline{R\mathit{1}}}_{i,j|k+1}^{(\nu )}=1\right)\right]={\dot{\mu}}_{j+k}^{(1)}\left({\dot{\mathit{x}}}_{i,j|k}^{(\nu )}\right)\,,\\ \mathbb{E}\left[{R\mathit{2}}_{i,j|k+1}^{(\nu )}|{\mathcal{F}}_{i+j+k},\left({\overline{R\mathit{2}}}_{i,j|k+1}^{(\nu )}=1\right)\right]={\dot{\mu}}_{j+k}^{(2)}\left({\dot{\mathit{x}}}_{i,j|k}^{(\nu )}\right)\,,\end{array}$$ and $$\begin{array}{l}\mathbb{E}\left[{R\mathit{1}}_{i,j|k+1}^{(\nu )}|{\mathcal{F}}_{i+j+k}\right]={\dot{\mu}}_{j+k}^{(1)}\left({\dot{\mathit{x}}}_{i,j|k}^{(\nu )}\right)\;\mathbb{P}\left[{\overline{R\mathit{1}}}_{i,j|k+1}^{(\nu )}=1|{\mathcal{F}}_{i+j+k}\right]\,,\\ \mathbb{E}\left[{R\mathit{2}}_{i,j|k+1}^{(\nu )}|{\mathcal{F}}_{i+j+k}\right]={\dot{\mu}}_{j+k}^{(2)}\left({\dot{\mathit{x}}}_{i,j|k}^{(\nu )}\right)\;\mathbb{P}\left[{\overline{R\mathit{2}}}_{i,j|k+1}^{(\nu )}=1|{\mathcal{F}}_{i+j+k}\right]\,.\end{array}$$
- To further improve the predictive performance, an assumption similar to assumption (H4) or (H4’) can be added, which we express here in the one-dimensional form (8): (HR4’) For the conditional distribution of:$${\dot{W}}_{i,j|k}^{(\nu )}={\overline{R\mathit{1}}}_{i,j|k}^{(\nu )}+2\,{\overline{R\mathit{2}}}_{i,j|k}^{(\nu )}\,,$$ one has:$${\dot{W}}_{i,j|k+1}^{(\nu )}|{\mathcal{F}}_{i+j+k},\left({Z}_{i,j|k+1}^{(\nu )}=0\right)\sim \mathrm{Categorical}\left({\dot{p}}_{j+k}^{(w)}\left({\mathit{x}}_{i,j|k}^{(\nu )}\right)\right)\,,$$ with $$\sum _{w=0}^{3}{\dot{p}}_{j+k}^{(w)}\left({\mathit{x}}_{i,j|k}^{(\nu )}\right)=1\,.$$ This assumption implies:$$\mathbb{P}\left[{\dot{W}}_{i,j|k+1}^{(\nu )}=w|{\mathcal{F}}_{i+j+k},\left({Z}_{i,j|k+1}^{(\nu )}=0\right)\right]={\dot{p}}_{j+k}^{(w)}\left({\mathit{x}}_{i,j|k}^{(\nu )}\right)\ge 0\,,\quad w=0,\dots ,3\,.$$

#### 10.4. Example of Simulated Cost Development Paths

- accident year: $i=I=6$;
- reporting delay: $j=0$, hence we denote the claim as ${\mathcal{C}}_{6,0}^{(\nu )}$;
- the claim is open at time I: ${Z}_{6,0|0}^{(\nu )}=0$;
- the claim does not involve a lawyer at time I: ${L}_{6,0|0}^{(\nu )}=0$;
- no type-1 (NoCARD) payment made at time I: ${\overline{S\mathit{1}}}_{6,0|0}^{(\nu )}=0$;
- no type-2 (CARD) payment made at time I: ${\overline{S\mathit{2}}}_{6,0|0}^{(\nu )}=0$;
- type-1 reserve at time I: ${R\mathit{1}}_{6,0|0}^{(\nu )}=31,460$ euros;
- type-2 reserve at time I: ${R\mathit{2}}_{6,0|0}^{(\nu )}=13,820$ euros.

`predict.rpart`function was invoked for each lag. The computation time required for simulating all sample paths (for the type-1 and type-2 cost) is roughly 4 min. In Figure 4 and Figure 5 $N=5000$ simulated sample paths for the type-1 and type-2 cumulated cost, respectively, of ${\mathcal{C}}_{6,0}^{(\nu )}$ are reported. Since many paths overlap, the simulated paths are shown in blue with the color depth being proportional to the number of overlaps. The average paths in the two figures are shown in red: their final point corresponds to ${\widehat{K\mathit{1}}}_{6,0|5}^{(\nu )}=17,069$ euros and ${\widehat{K\mathit{2}}}_{6,0|5}^{(\nu )}=4314$ euros. If we assume that the claims are finalized at time 11, i.e., after $\tau =5$ years for this claim, then these amounts can be taken as an estimate of the individual claim reserves ${E\mathit{1}}_{6,0}^{(\nu )}$ and ${E\mathit{2}}_{6,0}^{(\nu )}$ to be placed at the current date on ${\mathcal{C}}_{6,0}^{(\nu )}$. This suggests significant decreases in both the outstanding case reserves, namely a decrease of $14,391$ euros for ${R\mathit{1}}_{6,0|0}^{(\nu )}$ and a decrease of 7406 euros for ${R\mathit{2}}_{6,0|0}^{(\nu )}$.

## 11. Testing Predictive Performance of CART Approach

#### 11.1. The Data

- Observed accident years: from 2007 to 2016. Then $i=1,\dots ,10$.
- All claims reported are observed, hence for accident year i one has $j=0,\dots ,10-i$ (i.e., ${\underline{j}}_{i}\equiv 0$).

#### 11.2. Prediction of One-Year Event Occurrences

`cc` of the $\Lambda $ claims in the sample we predict as positive, where $\Lambda $ is the number of claims we expect to be positive. Our prediction strategy is very intuitive. Let ${\mathcal{R}}_{0}^{(r)},\; r=1,\dots ,{R}_{0}$, be the r-th leaf of the partition ${\mathcal{P}}_{0}$ provided by the calibrated frequency tree. Using the notation introduced in Section 8.2, we denote by ${n}^{(r)}$ the number of claims belonging to ${\mathcal{R}}_{0}^{(r)}$ and by ${\lambda}^{(r)}$ the probability of being positive for each of these claims. We assume that the leaves are ordered by decreasing value of ${\lambda}^{(r)}$ and define ${r}^{*}=\max\{r:\,{N}^{(r)}\le \Lambda \}$, where ${N}^{(r)}={\sum}_{h=1}^{r}{n}^{(h)}$. Our forecasting strategy then consists in predicting as positive all the ${N}^{({r}^{*})}$ claims in the first ${r}^{*}$ leaves and, in addition, $\Lambda -{N}^{({r}^{*})}$ claims chosen at random among those in leaf ${r}^{*}+1$.
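The forecasting strategy just described amounts to filling a quota of $\Lambda $ positives leaf by leaf, in decreasing order of ${\lambda}^{(r)}$, with a random draw in the last, partially used leaf. A minimal Python sketch follows. The function name, the leaf representation and the toy numbers are hypothetical, and taking $\Lambda $ as the rounded expected number of positives is an assumption made only for this example.

```python
import random


def predict_positive(leaves, n_positive, rng):
    """Select the claims predicted as positive.

    leaves: list of (lam, claim_ids), lam being the leaf probability of a positive.
    n_positive: the quota Lambda of claims to be predicted as positive.
    Whole leaves are taken in decreasing order of lam; the quota is completed
    with a random sample from the first leaf that does not fit entirely.
    """
    ordered = sorted(leaves, key=lambda leaf: leaf[0], reverse=True)
    chosen = []
    for _, claim_ids in ordered:
        remaining = n_positive - len(chosen)
        if remaining <= 0:
            break
        if len(claim_ids) <= remaining:
            chosen.extend(claim_ids)             # the whole leaf fits
        else:
            chosen.extend(rng.sample(claim_ids, remaining))
            break
    return set(chosen)


# Toy partition with three leaves (illustrative numbers only).
leaves = [
    (0.05, list(range(30, 130))),   # 100 claims, low probability
    (0.90, list(range(0, 10))),     # 10 claims, high probability
    (0.50, list(range(10, 30))),    # 20 claims, intermediate probability
]
quota = round(sum(lam * len(ids) for lam, ids in leaves))   # 9 + 10 + 5 = 24
positives = predict_positive(leaves, quota, random.Random(1))
print(quota, len(positives))
```

In the toy case the ten claims of the best leaf are all predicted as positive and the remaining fourteen are drawn at random from the second-best leaf.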

- True positive ratio, also known as sensitivity: TPR = TP/(TP+FN) $=74.8\%$;
- True negative ratio, or specificity: TNR = TN/(TN+FP) $=98.8\%$;
- False negative ratio: FNR $=1-$ TPR $=25.2\%$;
- False positive ratio: FPR $=1-$ TNR $=1.2\%$.
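The four ratios follow directly from the cells of a confusion matrix. The short Python sketch below shows the computation. The counts are hypothetical and are not the ones behind the percentages reported above.

```python
def classification_ratios(tp, fn, tn, fp):
    """Compute sensitivity, specificity and their complements from the counts."""
    tpr = tp / (tp + fn)   # true positive ratio (sensitivity)
    tnr = tn / (tn + fp)   # true negative ratio (specificity)
    return {"TPR": tpr, "TNR": tnr, "FNR": 1.0 - tpr, "FPR": 1.0 - tnr}


# Hypothetical confusion-matrix counts (illustrative only).
ratios = classification_ratios(tp=80, fn=20, tn=950, fp=50)
for name, value in ratios.items():
    print(f"{name} = {value:.1%}")
```

Since FNR and FPR are complements of TPR and TNR, two of the four figures are enough to describe the classifier.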

#### 11.3. Prediction of Aggregate Claims Costs

#### 11.3.1. Aggregate RBNS Reserve as Sum of Individual Reserves

`predict.rpart` function needs to be invoked only once for all claims with the same lag. Compared with the simulation of a single claim, this gives a proportionally substantial reduction in computation time. The run time for all the simulations was roughly 120 min.

`dy`), indexed as $h=0,\dots ,I-1$, where in the CART model “development year” is a new wording for the “lag” $I+\tau -i$. With this representation we obtain the “lower triangle” for the total costs (type 1 + type 2), reported in green in Table 9.

#### 11.3.2. Inclusion of the IBNYR Reserve Estimate

**Remark 5.**

#### 11.3.3. Comparison with Chain-Ladder Estimates

#### 11.3.4. Backtesting the Two Methods on the Next Diagonal

**Remark 6.**

## 12. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## Appendix A. An Ancillary Model for the Estimation of IBNYR Reserve

- the conditional expectation $\mathbb{E}\left[{N}_{i,j}|{\mathcal{F}}_{I}\right]$ is given by an estimate ${\widehat{N}}_{i,j}^{CL}$ obtained by chain-ladder techniques applied to the aggregate number of reported claims;
- the expectation $\mathbb{E}\left[{\sum}_{k\ge 0}{S}_{i,j|k}^{(\nu )}\right]$ of the total cost for claims with reporting delay j is given by an estimate ${\widehat{c}}_{j}$ obtained with the CART approach for the RBNS claims. Assuming that claims in different accident years are identically distributed one has:$${\widehat{c}}_{j}=\frac{1}{I-j}\sum _{i=1}^{I-j}\frac{1}{{N}_{i,j}}\sum _{\nu =1}^{{N}_{i,j}}{\widehat{S\mathit{1}}}_{i,j}^{(\nu )}\,,$$ where $${\widehat{S\mathit{1}}}_{i,j}^{(\nu )}=\sum _{k=0}^{I-(i+j)}{S\mathit{1}}_{i,j|k}^{(\nu )}+\sum _{k>I-(i+j)}\widehat{\mathbb{E}}\left[{S\mathit{1}}_{i,j|k}^{(\nu )}|{\mathcal{F}}_{I}\right]\,,$$

## References

- Breiman, Leo, Jerome H. Friedman, Richard A. Olshen, and Charles J. Stone. 1998. Classification and Regression Trees. London: Chapman & Hall/CRC.
- D’Agostino, Luca, Massimo De Felice, Gaia Montanucci, Franco Moriconi, and Matteo Salciarini. 2018. Machine learning per la riserva sinistri individuale. Un’applicazione R.C. Auto degli alberi di classificazione e regressione. Alef Technical Reports No. 18/02. Available online: http://alef.it/doc/TechRep_18_02.pdf (accessed on 9 October 2019).
- Gabrielli, Andrea, Ronald Richman, and Mario V. Wüthrich. 2018. Neural network embedding of the over-dispersed Poisson reserving model. Scandinavian Actuarial Journal, 1–29.
- Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. 2008. The Elements of Statistical Learning. Data Mining, Inference, and Prediction, 2nd ed. Springer Series in Statistics; Berlin: Springer.
- Hiabu, Munir, Carolin Margraf, Maria Dolores Martínez-Miranda, and Jens Perch Nielsen. 2015. The link between classical reserving and granular reserving through double chain ladder and its extensions. British Actuarial Journal 21: 97–116.
- Martínez-Miranda, Maria Dolores, Bent Nielsen, Jens Perch Nielsen, and Richard Verrall. 2011. Cash Flow Simulation for a Model of Outstanding Liabilities Based on Claim Amounts and Claim Numbers. ASTIN Bulletin 41: 107–29.
- Martínez-Miranda, Maria Dolores, Jens Perch Nielsen, and Richard Verrall. 2012. Double Chain Ladder. ASTIN Bulletin 42: 59–76.
- Martínez-Miranda, Maria Dolores, Jens Perch Nielsen, and Richard Verrall. 2013. Double Chain Ladder and Bornhuetter-Ferguson. North American Actuarial Journal 17: 101–13.
- Pešta, Michal, and Ostap Okhrin. 2014. Conditional least squares and copulae in claims reserving for a single line of business. Insurance: Mathematics and Economics 56: 28–37.
- Taylor, Greg. 2019. Claim Models: Granular and Machine Learning Forms. Sydney: School of Risk and Actuarial Studies, University of New South Wales.
- Taylor, Greg, Gráinne McGuire, and James Sullivan. 2008. Individual claim loss reserving conditioned by case estimates. Annals of Actuarial Science 3: 215–56.
- Therneau, Terry M., Elizabeth J. Atkinson, and Mayo Foundation. 2015. An Introduction to Recursive Partitioning Using the RPART Routines. R Vignettes, version of June 29. Rochester: Mayo Foundation.
- Verrall, Richard, Jens Perch Nielsen, and Anders Hedegaard Jessen. 2010. Prediction of RBNS and IBNR claims using claim amounts and claim counts. ASTIN Bulletin 40: 871–87.
- Verrall, Richard J., and Mario V. Wüthrich. 2016. Understanding reporting delay in general insurance. Risks 4: 25.
- Wüthrich, Mario V. 2016. Machine Learning in Individual Claims Reserving. Research Paper No. 16-67. Zürich: Swiss Finance Institute.
- Wüthrich, Mario V., and Christoph Buser. 2019. Data Analytics for Non-Life Insurance Pricing. Research Paper No. 16-68. Zürich: Swiss Finance Institute. Available online: https://ssrn.com/abstract=2870308 (accessed on 9 October 2019).
- Wüthrich, Mario V., and Michael Merz. 2019. Editorial: Yes, we CANN! ASTIN Bulletin 49: 1–3.

1 | According to the logical foundations of probability theory, as stated by Bruno de Finetti in the 1930s mainly using the Italian language, the word corresponding to the English prediction is previsione (prévision in French) and not predizione. As strongly stated by de Finetti, previsione refers to providing expectation, while predizione refers to providing certainty, which obviously is possible only in a deterministic framework. A prediction problem can have a very general nature. Formulation (1) is only a particular, though important, specification. Usually prediction is also referred to as forecast or foresight. |

2 | It can happen, for example, that only claims reported from calendar year y onwards are observed, which implies $i+j\ge y$, i.e., ${\underline{j}}_{i}=(y-i)\vee 0$. |

3 | The value $0.01$ of the complexity parameter cp used in this example is rather high. It has been used here to simplify the illustration, since the pruned tree finally obtained with this choice is not too large. As a consequence, this pruned tree is slightly suboptimal. Using a more appropriate value of cp, however, does not substantially change the results that are discussed here. |

**Figure 4.** Representation of $N=5000$ simulated paths for the type-1 cost development of the chosen claim ${\mathcal{C}}_{6,0}^{(\nu )}$. The average path is in red.

**Figure 5.** Representation of $N=5000$ simulated paths for the type-2 cost development of the chosen claim ${\mathcal{C}}_{6,0}^{(\nu )}$. The average path is in red.

**Figure 6.** Confusion matrices for prediction of payment and closure indicators for claims with $\ell =0$.

**Table 1.** Pairs feature-response observed at time $I=4$ for a claims portfolio with ${N}_{i,j}\equiv 1$. In cells with “no”, features are not observed because of reporting delay. Responses in the last column are not yet observed.

Feature-response pairs at calendar years $t=i+j+k$, ${t}^{\prime}=t+1$:

cc | ay: i | rd: j | $\nu $ | $t=1$ | $t=2$ | $t=3$ | $t=4=I$ |
---|---|---|---|---|---|---|---|
1 | 1 | 0 | 1 | $({\mathit{x}}_{1,0\mid 0}^{(1)},{\mathit{Y}}_{1,0\mid 1}^{(1)})$ | $({\mathit{x}}_{1,0\mid 1}^{(1)},{\mathit{Y}}_{1,0\mid 2}^{(1)})$ | $({\mathit{x}}_{1,0\mid 2}^{(1)},{\mathit{Y}}_{1,0\mid 3}^{(1)})$ | $({\mathit{x}}_{1,0\mid 3}^{(1)},\,\cdot\,)$ |
2 | 1 | 1 | 1 | no | $({\mathit{x}}_{1,1\mid 0}^{(1)},{\mathit{Y}}_{1,1\mid 1}^{(1)})$ | $({\mathit{x}}_{1,1\mid 1}^{(1)},{\mathit{Y}}_{1,1\mid 2}^{(1)})$ | $({\mathit{x}}_{1,1\mid 2}^{(1)},\,\cdot\,)$ |
3 | 1 | 2 | 1 | no | no | $({\mathit{x}}_{1,2\mid 0}^{(1)},{\mathit{Y}}_{1,2\mid 1}^{(1)})$ | $({\mathit{x}}_{1,2\mid 1}^{(1)},\,\cdot\,)$ |
4 | 1 | 3 | 1 | no | no | no | $({\mathit{x}}_{1,3\mid 0}^{(1)},\,\cdot\,)$ |
5 | 2 | 0 | 1 | . | $({\mathit{x}}_{2,0\mid 0}^{(1)},{\mathit{Y}}_{2,0\mid 1}^{(1)})$ | $({\mathit{x}}_{2,0\mid 1}^{(1)},{\mathit{Y}}_{2,0\mid 2}^{(1)})$ | $({\mathit{x}}_{2,0\mid 2}^{(1)},\,\cdot\,)$ |
6 | 2 | 1 | 1 | . | no | $({\mathit{x}}_{2,1\mid 0}^{(1)},{\mathit{Y}}_{2,1\mid 1}^{(1)})$ | $({\mathit{x}}_{2,1\mid 1}^{(1)},\,\cdot\,)$ |
7 | 2 | 2 | 1 | . | no | no | $({\mathit{x}}_{2,2\mid 0}^{(1)},\,\cdot\,)$ |
8 | 3 | 0 | 1 | . | . | $({\mathit{x}}_{3,0\mid 0}^{(1)},{\mathit{Y}}_{3,0\mid 1}^{(1)})$ | $({\mathit{x}}_{3,0\mid 1}^{(1)},\,\cdot\,)$ |
9 | 3 | 1 | 1 | . | . | no | $({\mathit{x}}_{3,1\mid 0}^{(1)},\,\cdot\,)$ |
10 | 4 | 0 | 1 | . | . | . | $({\mathit{x}}_{4,0\mid 0}^{(1)},\,\cdot\,)$ |

**Table 2.** Feature-response pairs observed at time $I=4$, organized by lag. Data in the last column and in row 4 cannot be used for prediction.

Feature-response Pairs Reorganized by Lag ($\mathit{\ell}=\mathit{j}+\mathit{k}$) | |||||||
---|---|---|---|---|---|---|---|

cc | ay: i | rd: j | $\nu $ | $\ell =0$ | $\ell =1$ | $\ell =2$ | $\ell =3$ |

1 | 1 | 0 | 1 | $\left({\mathit{x}}_{1,0|0}^{(1)},{\mathit{Y}}_{1,0|1}^{(1)}\right)$ | $\left({\mathit{x}}_{1,0|1}^{(1)},{\mathit{Y}}_{1,0|2}^{(1)}\right)$ | $\left({\mathit{x}}_{1,0|2}^{(1)},{\mathit{Y}}_{1,0|3}^{(1)}\right)$ | $\left({\mathit{x}}_{1,0|3}^{(1)},\quad\cdot\quad\right)$ |

2 | 1 | 1 | 1 | no | $\left({\mathit{x}}_{1,1|0}^{(1)},{\mathit{Y}}_{1,1|1}^{(1)}\right)$ | $\left({\mathit{x}}_{1,1|1}^{(1)},{\mathit{Y}}_{1,1|2}^{(1)}\right)$ | $\left({\mathit{x}}_{1,1|2}^{(1)},\quad\cdot\quad\right)$ |

3 | 1 | 2 | 1 | no | no | $\left({\mathit{x}}_{1,2|0}^{(1)},{\mathit{Y}}_{1,2|1}^{(1)}\right)$ | $\left({\mathit{x}}_{1,2|1}^{(1)},\quad\cdot\quad\right)$ |

4 | 1 | 3 | 1 | no | no | no | $\left({\mathit{x}}_{1,3|0}^{(1)},\quad\cdot\quad\right)$ |

5 | 2 | 0 | 1 | $\left({\mathit{x}}_{2,0|0}^{(1)},{\mathit{Y}}_{2,0|1}^{(1)}\right)$ | $\left({\mathit{x}}_{2,0|1}^{(1)},{\mathit{Y}}_{2,0|2}^{(1)}\right)$ | $\left({\mathit{x}}_{2,0|2}^{(1)},\quad\cdot\quad\right)$ | . |

6 | 2 | 1 | 1 | no | $\left({\mathit{x}}_{2,1|0}^{(1)},{\mathit{Y}}_{2,1|1}^{(1)}\right)$ | $\left({\mathit{x}}_{2,1|1}^{(1)},\quad\cdot\quad\right)$ | . |

7 | 2 | 2 | 1 | no | no | $\left({\mathit{x}}_{2,2|0}^{(1)},\quad\cdot\quad\right)$ | . |

8 | 3 | 0 | 1 | $\left({\mathit{x}}_{3,0|0}^{(1)},{\mathit{Y}}_{3,0|1}^{(1)}\right)$ | $\left({\mathit{x}}_{3,0|1}^{(1)},\quad\cdot\quad\right)$ | $\left(\quad\cdot\quad,\quad\cdot\quad\right)$ | . |

9 | 3 | 1 | 1 | no | $\left({\mathit{x}}_{3,1|0}^{(1)},\quad\cdot\quad\right)$ | $\left(\quad\cdot\quad,\quad\cdot\quad\right)$ | . |

10 | 4 | 0 | 1 | $\left({\mathit{x}}_{4,0|0}^{(1)},\quad\cdot\quad\right)$ | $\left(\quad\cdot\quad,\quad\cdot\quad\right)$ | . | . |

**Table 3.** Feature-response pairs, organized by lag, relevant for prediction at time $I=4$. Responses on the “last diagonal” (green cells) are not yet observed and require one-year forecasts, which are denoted by $\widehat{\mathit{Y}}$. In the two remaining “diagonals” neither the responses nor the features are yet observed; two-year and three-year forecasts are required in these cases.

Calibration Set and Prediction Set, by Lag | ||||||
---|---|---|---|---|---|---|

cc | ay: i | rd: j | $\nu $ | $\ell =0$ | $\ell =1$ | $\ell =2$ |

1 | 1 | 0 | 1 | $\left({\mathit{x}}_{1,0|0}^{(1)},{\mathit{Y}}_{1,0|1}^{(1)}\right)$ | $\left({\mathit{x}}_{1,0|1}^{(1)},{\mathit{Y}}_{1,0|2}^{(1)}\right)$ | $\left({\mathit{x}}_{1,0|2}^{(1)},{\mathit{Y}}_{1,0|3}^{(1)}\right)$ |

2 | 1 | 1 | 1 | no | $\left({\mathit{x}}_{1,1|0}^{(1)},{\mathit{Y}}_{1,1|1}^{(1)}\right)$ | $\left({\mathit{x}}_{1,1|1}^{(1)},{\mathit{Y}}_{1,1|2}^{(1)}\right)$ |

3 | 1 | 2 | 1 | no | no | $\left({\mathit{x}}_{1,2|0}^{(1)},{\mathit{Y}}_{1,2|1}^{(1)}\right)$ |

5 | 2 | 0 | 1 | $\left({\mathit{x}}_{2,0|0}^{(1)},{\mathit{Y}}_{2,0|1}^{(1)}\right)$ | $\left({\mathit{x}}_{2,0|1}^{(1)},{\mathit{Y}}_{2,0|2}^{(1)}\right)$ | $\left({\mathit{x}}_{2,0|2}^{(1)},{\widehat{\mathit{Y}}}_{2,0|3}^{(1)}\right)$ |

6 | 2 | 1 | 1 | no | $\left({\mathit{x}}_{2,1|0}^{(1)},{\mathit{Y}}_{2,1|1}^{(1)}\right)$ | $\left({\mathit{x}}_{2,1|1}^{(1)},{\widehat{\mathit{Y}}}_{2,1|2}^{(1)}\right)$ |

7 | 2 | 2 | 1 | no | no | $\left({\mathit{x}}_{2,2|0}^{(1)},{\widehat{\mathit{Y}}}_{2,2|1}^{(1)}\right)$ |

8 | 3 | 0 | 1 | $\left({\mathit{x}}_{3,0|0}^{(1)},{\mathit{Y}}_{3,0|1}^{(1)}\right)$ | $\left({\mathit{x}}_{3,0|1}^{(1)},{\widehat{\mathit{Y}}}_{3,0|2}^{(1)}\right)$ | · |

9 | 3 | 1 | 1 | no | $\left({\mathit{x}}_{3,1|0}^{(1)},{\widehat{\mathit{Y}}}_{3,1|1}^{(1)}\right)$ | · |

10 | 4 | 0 | 1 | $\left({\mathit{x}}_{4,0|0}^{(1)},{\widehat{\mathit{Y}}}_{4,0|1}^{(1)}\right)$ | · | · |
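
The split into calibration and prediction pairs shown in the table above can be reproduced mechanically. The following Python sketch assumes, as in Table 1, exactly one claim per block $(i,j)$ and enumerates the triples $(i,j,k)$ with $t=i+j+k\le I$. A pair is placed in the calibration set when its response at $t+1$ is already observed, and in the prediction set when $t=I$. The function name `split_pairs_by_lag` and the restriction to lags $\ell \le I-2$ (the lags for which at least one calibration pair exists) are illustrative assumptions, not part of the original text.

```python
from collections import defaultdict


def split_pairs_by_lag(I):
    """Group feature-response pairs (i, j, k) by lag l = j + k.

    Assumes one claim per (accident year i, reporting delay j) block.
    Returns two dicts mapping lag -> list of (i, j, k):
    calibration pairs (response observed by time I) and
    prediction pairs (feature observed at time I, response not yet observed).
    """
    calibration = defaultdict(list)
    prediction = defaultdict(list)
    max_lag = I - 2  # larger lags have no observed response to calibrate on

    for i in range(1, I + 1):                 # accident year
        for j in range(0, I - i + 1):         # reporting delay, claim reported by time I
            for k in range(0, I - i - j + 1):  # years since reporting, feature observed
                lag = j + k
                if lag > max_lag:
                    continue
                t = i + j + k                 # calendar year of the feature
                if t + 1 <= I:
                    calibration[lag].append((i, j, k))
                else:                         # t == I: response lies in the future
                    prediction[lag].append((i, j, k))
    return calibration, prediction


calibration, prediction = split_pairs_by_lag(4)
for lag in sorted(calibration):
    print(lag, len(calibration[lag]), len(prediction[lag]))
```

For $I=4$ the counts per lag agree with the cells of the table: the calibration pairs are those carrying an observed $\mathit{Y}$, the prediction pairs those carrying a forecast $\widehat{\mathit{Y}}$.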

**Table 4.** Feature-response pairs, organized by lag, relevant for prediction at time $I=6$ in the considered claims portfolio.

Feature-response at Lag ℓ | |||||||
---|---|---|---|---|---|---|---|

ay: i | rd: j | $\nu $ | $\ell =0$ | $\ell =1$ | $\ell =2$ | $\ell =3$ | $\ell =4$ |

1 | 3 | $1,\dots ,\,130$ | no | no | no | $\left({\mathit{x}}_{1,3|0}^{(\nu )},{\mathit{F}}_{1,3|1}^{(\nu )}\right)$ | $\left({\mathit{x}}_{1,3|1}^{(\nu )},{\mathit{F}}_{1,3|2}^{(\nu )}\right)$ |

1 | 4 | $1,\dots ,\,68$ | no | no | no | no | $\left({\mathit{x}}_{1,4|0}^{(\nu )},{\mathit{F}}_{1,4|1}^{(\nu )}\right)$ |

2 | 2 | $1,\dots ,\,871$ | no | no | $\left({\mathit{x}}_{2,2|0}^{(\nu )},{\mathit{F}}_{2,2|1}^{(\nu )}\right)$ | $\left({\mathit{x}}_{2,2|1}^{(\nu )},{\mathit{F}}_{2,2|2}^{(\nu )}\right)$ | $\left({\mathit{x}}_{2,2|2}^{(\nu )},\quad\cdot\quad\right)$ |

2 | 3 | $1,\dots ,\,119$ | no | no | no | $\left({\mathit{x}}_{2,3|0}^{(\nu )},{\mathit{F}}_{2,3|1}^{(\nu )}\right)$ | $\left({\mathit{x}}_{2,3|1}^{(\nu )},\quad\cdot\quad\right)$ |

2 | 4 | $1,\dots ,\,30$ | no | no | no | no | $\left({\mathit{x}}_{2,4|0}^{(\nu )},\quad\cdot\quad\right)$ |

3 | 1 | $1,\dots ,\,10,778$ | no | $\left({\mathit{x}}_{3,1|0}^{(\nu )},{\mathit{F}}_{3,1|1}^{(\nu )}\right)$ | $\left({\mathit{x}}_{3,1|1}^{(\nu )},{\mathit{F}}_{3,1|2}^{(\nu )}\right)$ | $\left({\mathit{x}}_{3,1|2}^{(\nu )},\quad\cdot\quad\right)$ | . |

3 | 2 | $1,\dots ,\,623$ | no | no | $\left({\mathit{x}}_{3,2|0}^{(\nu )},{\mathit{F}}_{3,2|1}^{(\nu )}\right)$ | $\left({\mathit{x}}_{3,2|1}^{(\nu )},\quad\cdot\quad\right)$ | . |

3 | 3 | $1,\dots ,\,97$ | no | no | no | $\left({\mathit{x}}_{3,3|0}^{(\nu )},\quad\cdot\quad\right)$ | . |

4 | 0 | $1,\dots ,\,144,820$ | $\left({\mathit{x}}_{4,0|0}^{(\nu )},{\mathit{F}}_{4,0|1}^{(\nu )}\right)$ | $\left({\mathit{x}}_{4,0|1}^{(\nu )},{\mathit{F}}_{4,0|2}^{(\nu )}\right)$ | $\left({\mathit{x}}_{4,0|2}^{(\nu )},\quad\cdot\quad\right)$ | . | . |

4 | 1 | $1,\dots ,\,10,767$ | no | $\left({\mathit{x}}_{4,1|0}^{(\nu )},{\mathit{F}}_{4,1|1}^{(\nu )}\right)$ | $\left({\mathit{x}}_{4,1|1}^{(\nu )},\quad\cdot\quad\right)$ | . | . |

4 | 2 | $1,\dots ,\,519$ | no | no | $\left({\mathit{x}}_{4,2|0}^{(\nu )},\quad\cdot\quad\right)$ | . | . |

5 | 0 | $1,\dots ,\,140,256$ | $\left({\mathit{x}}_{5,0|0}^{(\nu )},{\mathit{F}}_{5,0|1}^{(\nu )}\right)$ | $\left({\mathit{x}}_{5,0|1}^{(\nu )},\quad\cdot\quad\right)$ | . | . | . |

5 | 1 | $1,\dots ,\,10,112$ | no | $\left({\mathit{x}}_{5,1|0}^{(\nu )},\quad\cdot\quad\right)$ | . | . | . |

6 | 0 | $1,\dots ,\,148,918$ | $\left({\mathit{x}}_{6,0|0}^{(\nu )},\quad\cdot\quad\right)$ | . | . | . | . |

$\overline{\mathit{S}\mathit{1}}$ | $\overline{\mathit{S}\mathit{2}}$ | Z | L | W | State of the Response |
---|---|---|---|---|---|

0 | 0 | 0 | 0 | 0 | ONN0: open without payments and without lawyer |

1 | 0 | 0 | 0 | 1 | OYN0: open with $S\mathit{1}$ payment and without lawyer |

0 | 1 | 0 | 0 | 2 | ONY0: open with $S\mathit{2}$ payment and without lawyer |

1 | 1 | 0 | 0 | 3 | OYY0: open with $S\mathit{1}$ and $S\mathit{2}$ payment and without lawyer |

0 | 0 | 1 | 0 | 4 | CNN0: closed without payments and without lawyer |

1 | 0 | 1 | 0 | 5 | CYN0: closed with $S\mathit{1}$ payment and without lawyer |

0 | 1 | 1 | 0 | 6 | CNY0: closed with $S\mathit{2}$ payment and without lawyer |

1 | 1 | 1 | 0 | 7 | CYY0: closed with $S\mathit{1}$ and $S\mathit{2}$ payment and without lawyer |

0 | 0 | 0 | 1 | 8 | ONNL: open without payments and with lawyer |

1 | 0 | 0 | 1 | 9 | OYNL: open with $S\mathit{1}$ payment and with lawyer |

0 | 1 | 0 | 1 | 10 | ONYL: open with $S\mathit{2}$ payment and with lawyer |

1 | 1 | 0 | 1 | 11 | OYYL: open with $S\mathit{1}$ and $S\mathit{2}$ payment and with lawyer |

0 | 0 | 1 | 1 | 12 | CNNL: closed without payments and with lawyer |

1 | 0 | 1 | 1 | 13 | CYNL: closed with $S\mathit{1}$ payment and with lawyer |

0 | 1 | 1 | 1 | 14 | CNYL: closed with $S\mathit{2}$ payment and with lawyer |

1 | 1 | 1 | 1 | 15 | CYYL: closed with $S\mathit{1}$ and $S\mathit{2}$ payment and with lawyer |
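
Reading the 16 rows of the table above, the code $W$ appears to be the integer whose binary digits are the four indicators, $W=\overline{S1}+2\,\overline{S2}+4\,Z+8\,L$, while the four-character label records, in order, closure (O/C), $S1$ payment (N/Y), $S2$ payment (N/Y) and lawyer involvement (0/L). The following Python sketch encodes and decodes the state under that reading. The formula is inferred from the table rows rather than stated in the text, and the function names `encode_state` and `decode_state` are hypothetical.

```python
def encode_state(s1, s2, z, lawyer):
    """Return (W, label) for 0/1 indicators of S1 payment, S2 payment,
    closure (z) and lawyer involvement.

    Assumption: W = s1 + 2*s2 + 4*z + 8*lawyer, as inferred from the table.
    """
    for flag in (s1, s2, z, lawyer):
        if flag not in (0, 1):
            raise ValueError("indicators must be 0 or 1")
    w = s1 + 2 * s2 + 4 * z + 8 * lawyer
    label = (
        ("C" if z else "O")
        + ("Y" if s1 else "N")
        + ("Y" if s2 else "N")
        + ("L" if lawyer else "0")
    )
    return w, label


def decode_state(w):
    """Invert the encoding: return (s1, s2, z, lawyer) for W in 0..15."""
    if not 0 <= w <= 15:
        raise ValueError("W must lie between 0 and 15")
    return w & 1, (w >> 1) & 1, (w >> 2) & 1, (w >> 3) & 1


# Print the full state table, one line per value of W.
for w in range(16):
    print(w, encode_state(*decode_state(w))[1])
```

Encoding the four binary responses as a single categorical variable with 16 levels is what allows one classification tree to predict their joint state.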

r | ${\mathit{\lambda}}^{(\mathit{r})}$ | ${\mathit{n}}^{(\mathit{r})}$ |
---|---|---|

3 | 40.50% | 323 |

4 | 16.46% | 3204 |

2 | 5.89% | 1495 |

1 | 1.63% | 1878 |

5 | 0.10% | 143,468 |

**Table 8.** Number of observations in the calibration and the prediction set of each lag in the claims portfolio observed at time $I=9$.

ℓ | $|{\mathcal{D}}_{\mathit{\ell}}^{\mathit{C}}|$ | $|{\mathcal{D}}_{\mathit{\ell}}^{\mathit{P}}|$ |
---|---|---|

0 | 1,012,099 | 121,633 |

1 | 964,302 | 119,075 |

2 | 852,271 | 116,207 |

3 | 732,116 | 121,885 |

4 | 592,538 | 139,828 |

5 | 445,686 | 146,965 |

6 | 296,880 | 148,895 |

7 | 144,310 | 152,593 |

**Table 9.** Aggregate lower triangle of the incremental RBNS cost estimates and corresponding RBNS reserves. The last two rows report the adjustments for IBNYR claims.

ay: i | $\mathtt{dy}=1$ | $\mathtt{dy}=2$ | $\mathtt{dy}=3$ | $\mathtt{dy}=4$ | $\mathtt{dy}=5$ | $\mathtt{dy}=6$ | $\mathtt{dy}=7$ | $\mathtt{dy}=8$ | reserve: ${\mathit{E}}_{\mathit{i}}^{\mathbf{RBNS}}$ | CoVa${}_{\mathit{i}}$ |
---|---|---|---|---|---|---|---|---|---|---|

1 | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | 0 | $0.00\%$ |

2 | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | 548,939 | 548,939 | $5.51\%$ |

3 | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | 841,939 | 660,135 | 1,502,074 | $9.12\%$ |

4 | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | 1,336,090 | 989,338 | 679,961 | 3,005,388 | $5.50\%$ |

5 | $\cdot$ | $\cdot$ | $\cdot$ | $\cdot$ | 1,989,568 | 1,352,147 | 1,026,083 | 663,033 | 5,030,831 | $6.18\%$ |

6 | $\cdot$ | $\cdot$ | $\cdot$ | 2,652,175 | 1,842,702 | 1,266,884 | 799,521 | 595,623 | 7,156,905 | $3.63\%$ |

7 | $\cdot$ | $\cdot$ | 4,658,838 | 2,609,709 | 1,584,964 | 1,170,623 | 725,353 | 569,174 | 11,318,662 | $2.97\%$ |

8 | $\cdot$ | 10,672,731 | 4,061,362 | 2,479,849 | 1,849,869 | 1,104,580 | 693,130 | 543,766 | 21,405,288 | $1.66\%$ |

9 | 32,184,296 | 7,479,426 | 3,467,081 | 2,644,659 | 1,819,141 | 1,158,269 | 685,487 | 593,554 | 50,031,913 | $1.11\%$ |

RBNS diagonal | 54,884,575 | 18,994,820 | 10,504,823 | 7,127,705 | 4,244,697 | 2,420,573 | 1,229,253 | 593,554 | 100,000,000 | $\mathbf{0.79}\%$ |

IBNYR | 5,393,583 | 1,416,003 | 688,614 | 518,063 | 341,608 | 220,980 | 126,677 | 95,314 | 8,800,841 | |

RBNS+IBNYR | 60,278,158 | 20,410,822 | 11,193,436 | 7,645,768 | 4,586,305 | 2,641,554 | 1,355,930 | 688,868 | 108,800,841 |

**Table 10.** Chain-ladder reserve estimates on aggregate payments (type-1 + type-2, incremental figures). The differences from the CART estimates are also reported.

ay: i | $\mathtt{dy}=0$ | $\mathtt{dy}=1$ | $\mathtt{dy}=2$ | $\mathtt{dy}=3$ | $\mathtt{dy}=4$ | $\mathtt{dy}=5$ | $\mathtt{dy}=6$ | $\mathtt{dy}=7$ | $\mathtt{dy}=8$ | Reserve ${\mathit{E}}_{\mathit{i}}^{\mathbf{CL}}$ |
---|---|---|---|---|---|---|---|---|---|---|

1 | 35,699,311 | 37,879,857 | 12,003,345 | 6,478,312 | 3,033,793 | 1,895,577 | 1,026,086 | 922,252 | 497,792 | 0 |

2 | 41,730,803 | 36,146,954 | 14,363,454 | 4,928,858 | 3,051,338 | 2,913,180 | 1,237,083 | 899,977 | 529,656 | 529,656 |

3 | 40,033,745 | 31,396,571 | 13,499,535 | 5,668,671 | 2,719,742 | 2,314,666 | 856,136 | 868,753 | 489,839 | 1,358,592 |

4 | 39,027,439 | 38,571,568 | 12,499,545 | 6,084,483 | 2,903,344 | 2,930,986 | 1,075,959 | 928,216 | 523,367 | 2,527,541 |

5 | 39,143,444 | 37,227,132 | 11,612,033 | 4,676,458 | 2,897,767 | 2,477,989 | 1,033,956 | 891,980 | 502,936 | 4,906,860 |

6 | 33,900,305 | 33,987,815 | 11,872,716 | 5,088,186 | 2,644,290 | 2,268,885 | 946,706 | 816,711 | 460,496 | 7,137,087 |

7 | 31,820,892 | 33,590,427 | 10,841,703 | 4,822,608 | 2,526,694 | 2,167,983 | 904,604 | 780,390 | 440,016 | 11,642,296 |

8 | 33,667,137 | 32,084,528 | 11,173,371 | 4,865,109 | 2,548,961 | 2,187,090 | 912,576 | 787,268 | 443,894 | 22,918,269 |

9 | 39,151,374 | 37,275,145 | 12,987,380 | 5,654,965 | 2,962,788 | 2,542,166 | 1,060,734 | 915,081 | 515,961 | 63,914,221 |

CL diagonal | $\cdot$ | 60,867,772 | 25,100,078 | 12,733,962 | 7,374,128 | 4,695,628 | 2,288,018 | 1,358,976 | 515,961 | 114,934,523 |

CL − CARTs | $\cdot$ | 589,614 | 4,689,256 | 1,540,526 | −271,640 | 109,323 | −353,535 | 3,046 | −172,907 | 6,133,683 |

% | $\cdot$ | 1.0% | 23.0% | 13.8% | −3.6% | 2.4% | −13.4% | 0.2% | −25.1% | 5.6% |
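
The last two rows of the table above can be checked directly from the tabulated figures. The following Python sketch takes the “RBNS+IBNYR” row of Table 9 as the CART estimate and the “CL diagonal” row of Table 10 as the chain-ladder estimate, and recomputes the differences and their percentages. Expressing the percentage relative to the CART figure is an assumption that reproduces the tabulated values; the variable and function names are illustrative.

```python
# Totals for dy = 1..8, copied from the published tables.
cart_diagonal = [60_278_158, 20_410_822, 11_193_436, 7_645_768,
                 4_586_305, 2_641_554, 1_355_930, 688_868]    # Table 9, RBNS+IBNYR
cl_diagonal = [60_867_772, 25_100_078, 12_733_962, 7_374_128,
               4_695_628, 2_288_018, 1_358_976, 515_961]      # Table 10, CL diagonal


def compare(cl, cart):
    """Return absolute differences cl - cart and percentages relative to cart."""
    diffs = [c - k for c, k in zip(cl, cart)]
    pcts = [round(100.0 * d / k, 1) for d, k in zip(diffs, cart)]
    return diffs, pcts


diffs, pcts = compare(cl_diagonal, cart_diagonal)
total_diff = sum(cl_diagonal) - sum(cart_diagonal)
total_pct = round(100.0 * total_diff / sum(cart_diagonal), 1)

for dy, (d, p) in enumerate(zip(diffs, pcts), start=1):
    print(f"dy={dy}: {d:>12,d}  {p:+.1f}%")
print(f"total: {total_diff:>12,d}  {total_pct:+.1f}%")
```

Two of the recomputed differences (for $\mathtt{dy}=6$ and for the total) deviate by one unit from the tabulated values, presumably because the published table was derived from unrounded figures.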

ay: i | Realized Payments | Chain-ladder Error | (%) | CART Error | (%) |
---|---|---|---|---|---|

2 (2008) | 586,099 | $-56,443$ | $-9.63$ | $-37,109$ | $-6.33$ |

3 (2009) | 1,145,117 | $-276,364$ | $-24.13$ | $-284,528$ | $-24.85$ |

4 (2010) | 2,272,564 | $-1,196,605$ | $-52.65$ | $-916,852$ | $-40.34$ |

5 (2011) |