Article

Bayesian Modelling, Monte Carlo Sampling and Capital Allocation of Insurance Risks

by Gareth W. Peters 1,2,3,*, Rodrigo S. Targino 4 and Mario V. Wüthrich 5

1 Department of Statistical Science, University College London, London WC1E 6BT, UK
2 Oxford-Man Institute, Oxford University, Oxford OX1 2JD, UK
3 Systemic Risk Centre, London School of Economics, London WC2A 2AE, UK
4 Fundação Getulio Vargas, Escola de Matemática Aplicada, Botafogo, RJ 22250-040, Brazil
5 RiskLab, Department of Mathematics, ETH Zurich, 8092 Zurich, Switzerland
* Author to whom correspondence should be addressed.
Risks 2017, 5(4), 53; https://doi.org/10.3390/risks5040053
Submission received: 3 May 2017 / Revised: 30 August 2017 / Accepted: 31 August 2017 / Published: 22 September 2017

Abstract:
The main objective of this work is to develop a detailed step-by-step guide to the development and application of a new class of efficient Monte Carlo methods to solve practically important problems faced by insurers under the new solvency regulations. In particular, a novel Monte Carlo method to calculate capital allocations for a general insurance company is developed, with a focus on coherent capital allocation that is compliant with the Swiss Solvency Test. The data used is based on the balance sheet of a representative stylized company. For each line of business in that company, allocations are calculated for the one-year risk with dependencies based on correlations given by the Swiss Solvency Test. Two different approaches for dealing with parameter uncertainty are discussed and simulation algorithms based on (pseudo-marginal) Sequential Monte Carlo algorithms are described and their efficiency is analysed.

1. Introduction

Due to the new risk-based solvency regulations (such as the Swiss Solvency Test FINMA (2007) and Solvency II European Commission (2009)), insurance companies must perform two core calculations. The first involves computing and setting aside the risk capital that ensures the company's solvency and financial stability; the second is the capital allocation exercise. This exercise is the process of splitting the (economic or regulatory) capital amongst its various constituents, which could be different lines of business (LoBs), types of exposures, territories or even individual products in a portfolio of insurance policies. One of the reasons for performing such an exercise is to use the results as a risk-reward management tool to analyse profitability. The amount of capital (or risk) allocated to each LoB, for example, may assist the central management's decision to further invest in or discontinue a business line.
In contrast to quantitative risk assessment, where there is a unanimous view shared by regulators world-wide that it should be performed through the use of risk measures, such as Value at Risk (VaR) or Expected Shortfall (ES), there is no consensus on how to perform capital allocation to sub-units. In this work we follow the Euler allocation principle (see, e.g., Tasche (1999) and (McNeil et al. 2010, sct. 6.3)), which is briefly reviewed in the next section. For other allocation principles we refer the reader to Dhaene et al. (2012).
Under the Euler principle the allocation for each one of the portfolio’s constituents can be calculated through an expectation conditional on a rare event. Even though, in general, these expectations are not available in closed form, some exceptions exist, such as the multivariate Gaussian model, first discussed in this context in Panjer (2001) and extended to the case of multivariate elliptical distributions in Landsman and Valdez (2003) and Dhaene et al. (2008); the multivariate gamma model of Furman and Landsman (2005); the combination of the Farlie-Gumbel-Morgenstern (FGM) copula and (mixtures of) exponential marginals from Bargès et al. (2009) or (mixtures of) Erlang marginals Cossette et al. (2013); and the multivariate Pareto-II from Asimit et al. (2013).
In this work we develop algorithms to calculate the marginal allocations for a generic model, which, invariably, leads to numerical approximations. Although simple Monte Carlo schemes (such as rejection sampling or importance sampling) are flexible enough to be used for a generic model, they can be shown to be computationally highly inefficient, as the majority of the samples do not satisfy the necessary conditioning event (which is a rare event). We build upon ideas developed in Targino et al. (2015) and propose an algorithm based on methods from Bayesian Statistics, namely a combination of Markov Chain Monte Carlo (for parameter estimation) and (pseudo-marginal) Sequential Monte Carlo (SMC) for the capital allocation.
As a side result of the allocation algorithm, we are able to efficiently compute both the company's overall Value at Risk (VaR) and also its Expected Shortfall (ES), (partially) addressing one of the main concerns of Embrechts et al. (2014): for high confidence levels, e.g., 95% and beyond, the "statistical quantity" VaR can only be estimated with considerable statistical, as well as model, uncertainty. Even though the issue of model uncertainty is not resolved, our algorithm can, at least, help to reduce the "statistical uncertainty", measured by a variance reduction factor taking as basis a standard Monte Carlo simulation with comparable computational cost.
The proposed allocation procedure is described for a fictitious general insurance company with 9 LoBs (see Table 1 and Section 8). Further, within each LoB we also allocate the capital to the one-year reserve risk (due to claims from previous years) and the one-year premium risk.
In order to study the premium risk we follow the framework prescribed by the Swiss Solvency Test (SST) in (FINMA 2007, sct. 4.4). In this technical document, given company-specific quantities, the distribution of the premium risk is deterministically defined and no parameter uncertainty is involved. For the reserve risk, we use a fully Bayesian version of the gamma-gamma chain ladder model, analysed in Peters et al. (2017). As this model is described via a set of unknown parameters, two different approaches to capital allocation are proposed: a marginalized one, where the unknown parameters are integrated out prior to the allocation process, and a conditional one, where the allocation is performed conditionally on the unknown parameters, which are then integrated out numerically ex post.
The remainder of this paper is organized as follows. Section 2 formally describes marginal risk contributions (allocations) under the marginalized and conditional models. Section 3 reviews concepts of SMC algorithms and how they can be used to compute the quantities described in Section 2. We set the notation used for claims reserving in Section 4, before formally defining the models for the reserve risk (Section 5) and the premium risk (Section 6); these are merged together through a copula in Section 7. Section 8 and Section 9 provide details of the synthetic data used, the inferential procedure for the unknown parameters and the implementation of the SMC algorithms. Results and conclusions are presented, respectively, in Section 10 and Section 11.

2. Risk Allocation for the Swiss Solvency Test

In this section we follow the Euler allocation principle (see, e.g., Tasche (1999) and (McNeil et al. 2010, sct. 6.3)) and discuss how the risk capital that is held by an insurance company can be split into different risk triggers. As stochastic models for these risks involve a set of unknown parameters, we present an allocation procedure for a marginalized model (which arises when the parameter uncertainty is resolved beforehand) and a conditional model (which is still dependent on unknown parameters).
Although we postpone the construction of the specific claims payments model to Section 5, we now assume its behaviour is given by a Bayesian model depending on a generic parameter vector $\boldsymbol{\theta}$, for which a prior distribution is assigned. Probabilistic statements, such as the calculation of the risks allocated to each trigger, have to be made based on the available data, described by the filtration $\{\mathcal{F}(t)\}_{t \ge 0}$ and formally defined in Section 4. This requirement implies that the uncertainty in the parameter values needs to be integrated out, in a process that must typically be performed numerically.
Therefore, to calculate the risk allocations we approximate the stochastic behaviour of functions of future observations, with the functions defined in Section 4. For the moment, let us denote by $\bar{\mathbf{Z}}$ a multivariate function of $\mathcal{F}(t+1)$, the future data, and $\boldsymbol{\theta}$ the vector of model parameters. On the one hand, in the conditional model, we approximate the distribution of the components of the vector $\bar{\mathbf{Z}} \mid \boldsymbol{\theta}, \mathcal{F}(t)$. On the other hand, in the marginalized model, the approximation is performed after the parameter uncertainty has been integrated out (i.e., marginalized). In this latter framework, we approximate the distribution of the components of $\mathbf{Z} \mid \mathcal{F}(t)$, where the random vector $\mathbf{Z}$ is defined as $\mathbf{Z} = \mathbb{E}[\bar{\mathbf{Z}} \mid \mathcal{F}(t)]$, with the expectation taken with respect to $\boldsymbol{\theta} \mid \mathcal{F}(t)$. Note that, given $\mathcal{F}(t)$, $\mathbf{Z}$ is a random variable, as it depends on future information, i.e., $\mathcal{F}(t+1)$. Both in the conditional and in the marginalized models we use moment matching and log-normal distributions for the approximations and couple the distributions via a Gaussian copula.
Suppressing the dependence on the available information, $\mathcal{F}(t)$, these two models (marginalized and conditional) are defined through their probability density functions (p.d.f.'s), $f_{\mathbf{Z}}(\mathbf{z})$ and $f_{\bar{\mathbf{Z}}|\boldsymbol{\theta}}(\bar{\mathbf{z}} \mid \boldsymbol{\theta})$, respectively, which are both assumed to be combinations of log-normal distributions and a Gaussian copula. For the conditional model, as we work in a Bayesian framework, the unknown parameter vector $\boldsymbol{\theta}$ has a (posterior) distribution with p.d.f. $f_{\boldsymbol{\theta}}(\boldsymbol{\theta})$. This is then combined with the likelihood $f_{\bar{\mathbf{Z}}|\boldsymbol{\theta}}(\bar{\mathbf{z}} \mid \boldsymbol{\theta})$ to construct $f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}})$, the density used for inference under the conditional model.
For the methodology discussed in this work, the important feature of these two models is that $f_{\mathbf{Z}}(\mathbf{z})$ is known in closed form, whilst $f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}})$ is not.
In summary, the two models presented in Section 4 to Section 7 are defined as
Marginalized model: $\mathbf{Z} \sim f_{\mathbf{Z}}(\mathbf{z})$; (1)
Conditional model: $\bar{\mathbf{Z}} \sim f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}}) = \int f_{\bar{\mathbf{Z}}|\boldsymbol{\theta}}(\bar{\mathbf{z}} \mid \boldsymbol{\theta})\, f_{\boldsymbol{\theta}}(\boldsymbol{\theta})\, d\boldsymbol{\theta}$. (2)
Remark 1.
As the "original" model for claims payments is a Bayesian model, we use the Bayesian nomenclature for both the marginalized and the conditional model. For the former, the Bayesian structure of prior and likelihood is hidden in Equation (1), as the parameter $\boldsymbol{\theta}$ has already been marginalized (with respect to its posterior distribution). For the latter, we explicitly make use of the posterior distribution of $\boldsymbol{\theta}$ in Equation (2). Another strategy, followed in Wüthrich (2015), is to use an "empirical Bayes" approach, fixing the value of the unknown parameter vector $\boldsymbol{\theta}$, for example at its maximum likelihood estimator (MLE).
Under the marginalized model we define $S = \sum_{i=1}^{d} Z_i$ as the company's overall risk. The SST requires the total capital to be calculated as the 99% ES of S, given by
$$\rho(S) = \mathbb{E}[S \mid S \ge \mathrm{VaR}_{99\%}(S)]. \quad (3)$$
In turn, the Euler allocation principle states that the contribution of each component $Z_i$ to the total capital in Equation (3) is given by
$$\rho_i = \mathbb{E}[Z_i \mid S \ge \mathrm{VaR}_{99\%}(S)], \quad i = 1, \ldots, d. \quad (4)$$
The allocations for the conditional model follow the same structure, with $Z_i$ and $S$ replaced, respectively, by $\bar{Z}_i$ and $\bar{S}$ in Equation (4), and read as
$$\bar{\rho}_i = \mathbb{E}[\bar{Z}_i \mid \bar{S} \ge \mathrm{VaR}_{99\%}(\bar{S})], \quad i = 1, \ldots, d, \quad (5)$$
with $\bar{S} = \sum_{i=1}^{d} \bar{Z}_i$. For the models discussed below the density $f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}})$ is not known in closed form, adding one more layer of complexity to the proposed method.
Remark 2.
Observe that the log-normal approximations are done at different stages in the marginalized and the conditional models. Therefore, we expect that the results will differ.
Although computing ρ i and ρ ¯ i is a static problem, for the sake of transforming the Monte Carlo estimation into an efficient computational framework, we embed the calculation of these quantities into a sequential procedure, where at each step we solve a simpler problem, through a relaxation of the rare-event conditioning constraint to a sequence of less extreme rare-events. In the next section we discuss the methodological Monte Carlo approach used to perform this task. The reader familiar with the concepts of Sequential Monte Carlo methods may skip Section 3.1.
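To see why a plain Monte Carlo treatment of Equations (3) and (4) is wasteful, consider the following sketch (illustrative only: the three log-normal marginals, the Gaussian copula and all parameter values are placeholder assumptions, not the company model of Section 8). Only about 1% of the simulated portfolios land in the conditioning event, so the conditional averages are supported by very few effective samples.

```python
import numpy as np

rng = np.random.default_rng(1)
N, d = 100_000, 3

# Placeholder marginalized model: log-normal marginals, Gaussian copula.
corr = np.array([[1.0, 0.3, 0.2],
                 [0.3, 1.0, 0.4],
                 [0.2, 0.4, 1.0]])
g = rng.multivariate_normal(np.zeros(d), corr, size=N)  # copula draws
Z = np.exp(np.array([0.5, 0.7, 0.6]) * g)               # log-normal marginals
S = Z.sum(axis=1)

B = np.quantile(S, 0.99)        # VaR_99%(S)
tail = S >= B                   # rare conditioning event {S >= B}
es = S[tail].mean()             # rho(S) = ES_99%(S), Equation (3)
rho = Z[tail].mean(axis=0)      # Euler allocations, Equation (4)

# By construction the allocations add up to the total capital,
# but only ~N/100 samples survived the conditioning.
print(B, es, rho, rho.sum())
```

The SMC samplers described next are designed precisely to avoid discarding 99% of the computational effort in this final conditioning step.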

3. SMC Samplers and Capital Allocation

For the marginalized and conditional models presented in Section 4 to Section 7 the marginal contributions in Equations (4) and (5) cannot be calculated in analytic form for a generic model, so a simulation technique needs to be employed. In the sequel we provide a brief overview of a class of Monte Carlo methods named Sequential Monte Carlo (SMC). For a recent survey on the topic, with focus on economics, finance and insurance applications, the reader is referred to Creal (2012) and Del Moral et al. (2013). For a generic introductory review we refer the reader to Doucet and Johansen (2009).

3.1. A Brief Introduction to SMC Methods

The class of Sequential Monte Carlo (SMC) algorithms, also called Particle Filters, has its roots in the fields of engineering, probability and statistics where it was primarily used for sampling from a sequence of distributions (see, e.g., Gordon et al. (1993) and Del Moral (1996)). In the context of state-space models, SMC methods can be used to sequentially approximate the filtering distributions of non-linear and non-Gaussian state space models, solving the same problem as the Kalman filter—a technique with a long-standing tradition in actuarial mathematics (see, e.g., De Jong and Zehnwirth (1983) and Verrall (1989)).
The general context of a standard SMC method is that one wants to approximate a (often naturally occurring) sequence of p.d.f.'s $\{\tilde{\pi}_t\}_{t \ge 1}$, where the support of each function in this sequence is given by $\mathrm{supp}(\tilde{\pi}_t) = \mathbb{R}^d \times \cdots \times \mathbb{R}^d = \mathbb{R}^{d \times t}$, for $t \ge 1$, and where t can be any artificial ordering of the sequence that is problem specific. We assume that $\tilde{\pi}_t$ is (only) known up to a normalizing constant, and we write
$$\tilde{\pi}_t(\mathbf{z}_{1:t}) = Z_t^{-1}\, \tilde{\gamma}_t(\mathbf{z}_{1:t}),$$
where $\mathbf{z}_{1:t} = (\mathbf{z}_1, \ldots, \mathbf{z}_t) \in \mathbb{R}^{d \times t}$.

3.1.1. SMC Algorithm

Procedurally, we initialize the algorithm by sampling a set of N independent particles (as the samples are called in the literature) from the distribution $\tilde{\pi}_1$ and set the normalized weights to $W_1^{(j)} = 1/N$, for all $j = 1, \ldots, N$. If it is not possible to sample directly from $\tilde{\pi}_1$, one should sample from an importance distribution $\tilde{q}_1$ and calculate the weights accordingly. The particles are then sequentially propagated through each distribution $\tilde{\pi}_t$ in the sequence via three main processes: mutation, correction (incremental importance weighting) and resampling. In the first step (mutation) we propagate particles from time $t-1$ to time $t$, and in the second one (correction) we calculate the new importance weights of the particles.
Without resampling, this method can be seen as a sequence of importance sampling (IS) steps, where the target distribution at each step t is $\tilde{\gamma}_t$ (the unnormalized version of $\tilde{\pi}_t$) and the importance distribution is given by
$$\tilde{q}_t(\mathbf{z}_{1:t}) = \tilde{q}_1(\mathbf{z}_1) \prod_{j=2}^{t} K_j(\mathbf{z}_{j-1}, \mathbf{z}_j), \quad (6)$$
where $K_j(\mathbf{z}_{j-1}, \cdot)$ is the mechanism used to propagate particles from one time step to the next, known as the mutation kernel. Therefore, after the mutation step each particle $j = 1, \ldots, N$ has (unnormalized) importance weight given by
$$w_t^{(j)} = \frac{\tilde{\gamma}_t(\mathbf{z}_{1:t}^{(j)})}{\tilde{q}_t(\mathbf{z}_{1:t}^{(j)})} = w_{t-1}^{(j)} \underbrace{\frac{\tilde{\gamma}_t(\mathbf{z}_{1:t}^{(j)})}{\tilde{\gamma}_{t-1}(\mathbf{z}_{1:t-1}^{(j)})\, K_t(\mathbf{z}_{t-1}^{(j)}, \mathbf{z}_t^{(j)})}}_{\text{incremental weight: } \alpha_t^{(j)}} = w_{t-1}^{(j)}\, \alpha_t^{(j)}. \quad (7)$$
These importance weights can be normalized to create a set of weighted particles $\{\mathbf{z}_{1:t}^{(j)}, W_t^{(j)}\}_{j=1}^{N}$, with normalized weights $W_t^{(j)} = w_t^{(j)} / \sum_{k=1}^{N} w_t^{(k)}$. In this case, from the Law of Large Numbers,
$$\sum_{j=1}^{N} W_t^{(j)} \varphi(\mathbf{z}_{1:t}^{(j)}) \longrightarrow \mathbb{E}_{\tilde{\pi}_t}[\varphi(\mathbf{Z}_{1:t})] = \int_{\mathbb{R}^{d \times t}} \varphi(\mathbf{z}_{1:t})\, \tilde{\pi}_t(\mathbf{z}_{1:t})\, d\mathbf{z}_{1:t}, \quad (8)$$
$\tilde{\pi}_t$-almost surely as $N \to \infty$, for any test function $\varphi$ such that the expectation of $\varphi$ under $\tilde{\pi}_t$ exists (see Geweke (1989)).
Remark 3.
The reader should note that knowledge of $\tilde{\pi}_t$ up to a normalizing constant is sufficient for the implementation of a generic SMC algorithm, since the normalized weights $W_t^{(j)}$ are the same for both $\tilde{\pi}_t$ and $\tilde{\gamma}_t$.
In simple implementations of the SMC algorithm (such as the one discussed above), as the algorithmic time t increases the estimates in Equation (8) eventually become effectively a function of one sample point $\{\mathbf{z}_t^{(j)}, W_t^{(j)}\}$; what is observed in practice is that, for some particle j, $W_t^{(j)} \approx 1$ and for all the others the normalized weights are negligible. This degeneracy is measured using the Effective Sample Size (ESS), defined in Liu and Chen (1995) and Liu and Chen (1998) as
$$\mathrm{ESS}_t = \left( \sum_{j=1}^{N} \big( W_t^{(j)} \big)^2 \right)^{-1} \in [1, N].$$
This quantity has the interpretation that $\mathrm{ESS}_t$ is maximized when $\{W_t^{(j)}, \mathbf{z}_t^{(j)}\}_{j=1}^{N}$ forms a uniform distribution on $\{\mathbf{z}_t^{(j)}\}_{j=1}^{N}$ and minimized when $W_t^{(j)} = 1$ for some j. One may also use the Gini index or the entropy as a degeneracy measure, as discussed, for example, in Martino et al. (2017).
One way to tackle this degeneracy problem is to unbiasedly resample the whole set of weighted particles, for example, choosing (with replacement) N samples from the system, where each $\mathbf{z}_t^{(j)}$ is selected with probability $W_t^{(j)}$. In our algorithms we propose to resample the whole set of weighted samples whenever it is "too degenerate", and our degeneracy threshold is $\mathrm{ESS}_t < N/2$. Many different resampling schemes have been suggested in the literature, and for a comparison between them we refer the reader to Douc and Cappé (2005) and Gandy and Lau (2015).
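As a small self-contained illustration of the two ingredients just described (a sketch in the notation above, not the exact implementation used later in the paper), the ESS computation and a multinomial resampling step can be written as:

```python
import numpy as np

def ess(W):
    """Effective sample size of normalized weights W; lies in [1, N]."""
    return 1.0 / np.sum(W ** 2)

def resample(particles, W, rng):
    """Multinomial resampling: draw N indices with probabilities W
    and reset all weights to 1/N."""
    N = len(W)
    idx = rng.choice(N, size=N, replace=True, p=W)
    return particles[idx], np.full(N, 1.0 / N)

rng = np.random.default_rng(0)
particles = rng.normal(size=(1000, 2))   # toy particle system
w = rng.exponential(size=1000)           # toy unnormalized weights
W = w / w.sum()
if ess(W) < len(W) / 2:                  # the paper's threshold: ESS_t < N/2
    particles, W = resample(particles, W, rng)
```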
Although the resampling step alleviates the degeneracy problem, its successive reapplication at each stage of the sampler produces the so-called sample impoverishment, where the number of distinct particles becomes extremely small. In Gilks and Berzuini (2001) it was proposed to add a "move" step, with any kernel that leaves the target distribution invariant, in order to rejuvenate the system. This kernel may be, for example, a Markov chain Monte Carlo (MCMC) kernel, which begins with equally weighted samples from the target distribution and then perturbs them under a single step of a Metropolis-Hastings acceptance-rejection mechanism. Note that in this case the samples start exactly in the target distribution's stationary regime. Therefore, a single step of the Metropolis-Hastings accept-reject mutation is strictly valid and no burn-in is required.
More precisely, we can apply any kernel $M(\hat{\mathbf{z}}_{1:t}^{(j)}, \mathbf{z}_{1:t}^{(j)})$ that leaves $\tilde{\pi}_t$ invariant to move the sample $\hat{\mathbf{z}}_{1:t}^{(j)}$ to $\mathbf{z}_{1:t}^{(j)}$ (the hat denotes a sample after the resampling step but before the "move" step), i.e.,
$$\tilde{\pi}_t(\mathbf{z}_{1:t}) = \int M(\hat{\mathbf{z}}_{1:t}, \mathbf{z}_{1:t})\, \tilde{\pi}_t(\hat{\mathbf{z}}_{1:t})\, d\hat{\mathbf{z}}_{1:t}.$$

3.1.2. SMC Samplers

Although very general, the SMC algorithm presented above in principle requires the sequence of p.d.f.'s to have an increasing support. However, it has been shown in Peters (2005) and Del Moral et al. (2006) that these algorithms can be applied to sequences of p.d.f.'s defined on the same support, leading to the so-called SMC sampler algorithm discussed below. This development is central for the insurance applications explored in this paper.
Given the sequence of densities $\{\pi_t\}_{t \ge 1}$ (and its unnormalized version, $\{\gamma_t\}_{t \ge 1}$), where each element is defined over the same support, say $\mathbb{R}^d$, we create another sequence, defined on the path space $\mathbb{R}^{d \times t}$,
$$\tilde{\pi}_t(\mathbf{z}_{1:t}) \propto \tilde{\gamma}_t(\mathbf{z}_{1:t}) = \gamma_t(\mathbf{z}_t) \prod_{s=1}^{t-1} L_s(\mathbf{z}_{s+1}, \mathbf{z}_s), \quad (9)$$
which, for any Markov kernel $L_s(\mathbf{z}_{s+1}, \mathbf{z}_s)$, is a density with $\pi_t(\mathbf{z}_t)$ as marginal (as can be seen by integrating out $\mathbf{z}_{1:t-1}$). Note that in Equation (9) time runs backwards, from t to 1. For completeness we define $\tilde{\pi}_1(\mathbf{z}_{1:1}) = \pi_1(\mathbf{z}_1)$ and $\tilde{\gamma}_1(\mathbf{z}_{1:1}) = \gamma_1(\mathbf{z}_1)$.
If $q_1 \equiv \tilde{q}_1$ is an IS density targeting $\pi_1 \equiv \tilde{\pi}_1$ then, see Equation (6),
$$\tilde{q}_t(\mathbf{z}_{1:t}) = q_1(\mathbf{z}_1) \prod_{j=2}^{t} K_j(\mathbf{z}_{j-1}, \mathbf{z}_j)$$
is defined, for Markov kernels $K_j(\mathbf{z}_{j-1}, \mathbf{z}_j)$, as an IS density targeting $\tilde{\pi}_t(\mathbf{z}_{1:t})$. For $t = 1$ we define $K_1 \equiv 1$.
As in the SMC algorithm, to generate a set of weighted samples $\{\mathbf{z}_t^{(j)}, W_t^{(j)}\}_{j=1}^{N}$ from $\pi_t(\mathbf{z}_t)$ one can use a sequence of IS steps on the path space, where the unnormalized importance weights are, at each time step $t \ge 1$, given by (see Equation (7))
$$w_t^{(j)} = \frac{\tilde{\gamma}_t(\mathbf{z}_{1:t}^{(j)})}{\tilde{q}_t(\mathbf{z}_{1:t}^{(j)})} = w_{t-1}^{(j)} \frac{\gamma_t(\mathbf{z}_t^{(j)})\, L_{t-1}(\mathbf{z}_t^{(j)}, \mathbf{z}_{t-1}^{(j)})}{\gamma_{t-1}(\mathbf{z}_{t-1}^{(j)})\, K_t(\mathbf{z}_{t-1}^{(j)}, \mathbf{z}_t^{(j)})} = w_{t-1}^{(j)}\, \alpha_t^{(j)},$$
where $w_0^{(j)} \equiv 1$ for all $j = 1, \ldots, N$. The normalized weights are then computed as
$$W_t^{(j)} = \frac{w_t^{(j)}}{\sum_{k=1}^{N} w_t^{(k)}}.$$
The pseudo-code of the SMC sampler procedure just described is found in Algorithm 1.
Algorithm 1: SMC sampler algorithm.
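The pseudo-code figure referenced above is not reproduced here; the following Python skeleton (a sketch with user-supplied ingredients, whose names are ours and not from the paper) summarizes the loop just described: mutate the particles with K_t, reweight them with the incremental weights alpha_t, and resample-and-move whenever the ESS falls below N/2.

```python
import numpy as np

def smc_sampler(sample_q1, mutate, incr_weight, move, T, N, rng):
    """Generic SMC sampler skeleton (cf. Algorithm 1).

    sample_q1(N)                  -- N initial particles from q_1
    mutate(z, t)                  -- forward mutation kernel K_t
    incr_weight(z_new, z_old, t)  -- incremental weights alpha_t
    move(z, t)                    -- pi_t-invariant (e.g., MCMC) kernel
    """
    z = sample_q1(N)
    W = np.full(N, 1.0 / N)
    for t in range(1, T):
        z_new = mutate(z, t)                       # mutation
        w = W * incr_weight(z_new, z, t)           # correction
        W = w / w.sum()
        if 1.0 / np.sum(W ** 2) < N / 2:           # ESS < N/2:
            idx = rng.choice(N, size=N, replace=True, p=W)
            z_new = move(z_new[idx], t)            # resample and move
            W = np.full(N, 1.0 / N)
        z = z_new
    return z, W                                    # weighted samples from pi_T
```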
The introduction of the sequence of kernels $\{L_{t-1}\}_{t=2}^{T}$ creates a new degree of freedom in the design of SMC samplers compared with the usual SMC algorithms, where only the forward mutation kernels $\{K_t\}_{t=2}^{T}$ need to be designed. As discussed in Peters (2005) and Del Moral et al. (2006), the backward kernel that minimizes the variance of the incremental weights cannot be computed in practice; one strategy is then to use the following approximation to this optimal kernel,
$$L_t(\mathbf{z}_{t+1}, \mathbf{z}_t) = \frac{\gamma_t(\mathbf{z}_t)\, K_{t+1}(\mathbf{z}_t, \mathbf{z}_{t+1})}{\frac{1}{N} \sum_{j=1}^{N} w_t^{(j)} K_{t+1}(\mathbf{z}_t^{(j)}, \mathbf{z}_{t+1})},$$
which leads to incremental weights
$$\alpha_t^{(j)} = \frac{\gamma_t(\mathbf{z}_t^{(j)})}{\frac{1}{N} \sum_{k=1}^{N} w_{t-1}^{(k)} K_t(\mathbf{z}_{t-1}^{(k)}, \mathbf{z}_t^{(j)})}.$$
With the methodological tools provided by the SMC samplers in hand, we now describe how to adapt these methods to the allocation of risks under our generic marginalized and conditional models.

3.2. Allocations for the Marginalized Model

For a generic random vector $\mathbf{Z} = (Z_1, \ldots, Z_d)$ with known marginal densities and distribution functions, respectively $f_{Z_i}(z_i)$ and $F_{Z_i}(z_i)$, and copula density $c(u_1, \ldots, u_d)$ on $[0,1]^d$, due to Sklar's theorem (see Sklar (1959) and (McNeil et al. 2010, chp. 5)) the joint density of $\mathbf{Z}$ can be written as
$$f_{\mathbf{Z}}(\mathbf{z}) = c(\mathbf{u}) \prod_{i=1}^{d} f_{Z_i}(z_i),$$
where $\mathbf{u} = (u_1, \ldots, u_d) \in [0,1]^d$ and $u_i = F_{Z_i}(z_i)$. In order to approximate the marginal risk contributions $\rho_i$ from Equation (4) we can use samples from the distribution
$$\pi(\mathbf{z}) = f_{\mathbf{Z}}(\mathbf{z} \mid \mathbf{z} \in G_{\mathbf{Z}}) = \frac{f_{\mathbf{Z}}(\mathbf{z})\, \mathbb{1}_{G_{\mathbf{Z}}}(\mathbf{z})}{\mathbb{P}[\mathbf{Z} \in G_{\mathbf{Z}}]}, \quad (13)$$
where the set $G_{\mathbf{Z}} = G_{\mathbf{Z}}(B)$ is defined, for $B = \mathrm{VaR}_{99\%}(S)$, as
$$G_{\mathbf{Z}} = \left\{ \mathbf{z} \in \mathbb{R}^d : \sum_{i=1}^{d} z_i \ge B \right\}, \quad (14)$$
and the indicator function $\mathbb{1}_{G_{\mathbf{Z}}}(\mathbf{z})$ is one when $\mathbf{z} \in G_{\mathbf{Z}}$ and zero otherwise. It should be noted that since the boundary B in Equation (14) is given by $\mathrm{VaR}_{99\%}(S)$ with $S = \sum_{i=1}^{d} Z_i$, we have $\mathbb{P}[\mathbf{Z} \in G_{\mathbf{Z}}] = 0.01$; see the discussion of this point in Targino et al. (2015).

3.2.1. Reaching a Rare Event Using Intermediate Steps

Instead of directly targeting the conditional distribution ( Z 1 , , Z d ) | { S VaR 99 % ( S ) } the idea of the SMC sampler of Algorithm 1 is to sequentially sample from intermediate distributions with conditioning events that become rarer until the point we reach the distribution of interest (see Equation (14)). The benefit of such an approach is that the samples (particles) from a previous step (with a less rare conditioning event) are “guided” to the next algorithmic step (when targeting a rarer conditioning set) and, if carefully designed, no samples are wasted on the way to the target distribution, in the sense that no samples are incrementally weighted with a strictly zero weight. This “herds” the samples into the target sampling region of interest.
In order to sample from the target distribution defined in Equation (13) we use a sequence of intermediate distributions $\{\pi_t\}_{t=1}^{T}$, such that $\pi_T \equiv \pi$ and
$$\pi_t(\mathbf{z}) = f_{\mathbf{Z}}(\mathbf{z} \mid \mathbf{z} \in G_{\mathbf{Z}}^t), \quad (15)$$
with $G_{\mathbf{Z}}^t = G_{\mathbf{Z}}^t(B_t)$ given by
$$G_{\mathbf{Z}}^t = \left\{ \mathbf{z} \in \mathbb{R}^d : \sum_{i=1}^{d} z_i \ge B_t \right\}.$$
Remark 4.
Differently from Targino et al. (2015), in order to make the algorithm more easily comparable with the one used for the conditional model, we do not transform the original random variable $\mathbf{Z}$ through its marginal distribution functions. Therefore, instead of sampling from the conditional copula we sample from the conditional joint distribution of $\mathbf{Z}$.
The thresholds $B_1, \ldots, B_{T-1}$ are chosen in order to have increasingly rarer conditioning events as a function of t, starting from the unconditional joint density. In other words, $\{B_t\}_{t=1}^{T}$ needs to satisfy $0 = B_1 < \cdots < B_{T-1} < B_T = B = \mathrm{VaR}_{99\%}(S)$. Note that the choice $B_1 = 0$ assumes $S > 0$, $\mathbb{P}$-a.s.; otherwise $B_1 = -\infty$. Depending on the choice of the thresholds $\{B_t\}_{t=1}^{T-1}$ it may be the case that the densities defined in Equation (15) are only known up to a normalizing constant, so from now on we work with $\gamma_t$, the unnormalized version of $\pi_t$:
$$\pi_t(\mathbf{z}) \propto \gamma_t(\mathbf{z}) = f_{\mathbf{Z}}(\mathbf{z})\, \mathbb{1}_{G_{\mathbf{Z}}^t}(\mathbf{z}). \quad (16)$$
If, at algorithmic time t, we have a set of N weighted samples $\{W_t^{(j)}, \mathbf{z}_t^{(j)}\}_{j=1}^{N}$ from $\pi_t$, with $\mathbf{z}_t^{(j)} = (z_{1,t}^{(j)}, \ldots, z_{d,t}^{(j)})$, then we construct the following empirical approximation:
$$\mathbb{E}[Z_i \mid S \ge B_t] \approx \sum_{j=1}^{N} W_t^{(j)} z_{i,t}^{(j)}.$$
It should be noticed, though, that in our application the final threshold $B_T = B = \mathrm{VaR}_{99\%}(S)$ is not known in advance. In such cases an adaptive strategy, similar to the one studied in Cérou et al. (2012), can be implemented, where neither $B_1, \ldots, B_{T-1}$ nor $B_T$ needs to be known beforehand. More details on this aspect of the algorithm are provided in Section 9.1.
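A minimal sketch of such an adaptive level choice, in the spirit of Cérou et al. (2012) (the survival fraction p is a tuning parameter of our choosing; Section 9.1 details the schedule actually used):

```python
import numpy as np

def next_threshold(z, W, p=0.5):
    """Set B_{t+1} to the weighted p-quantile of S = sum_i z_i, so that
    roughly a fraction 1 - p of the current particles satisfies the new
    conditioning event {S >= B_{t+1}}."""
    S = z.sum(axis=1)
    order = np.argsort(S)
    cdf = np.cumsum(W[order])
    return S[order][np.searchsorted(cdf, p)]
```

Iterating this rule, and stopping once the estimated exceedance probability of the current level drops below 1%, recovers $B_T \approx \mathrm{VaR}_{99\%}(S)$ without knowing it in advance; the running allocations $\mathbb{E}[Z_i \mid S \ge B_t]$ are simply the weighted averages $\sum_j W_t^{(j)} z_{i,t}^{(j)}$ along the way.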

3.3. Allocations for The Conditional Model

From the discussion in Section 2 we see that the main difference between the marginalized and conditional models is the fact that the former density is analytically known (in fact, it is approximated by an analytically known density) whilst the latter is defined through an integral of a known density, see Equations (1) and (2). In this section we discuss how to adapt the algorithm presented in Section 3.2 to situations where the target density cannot be analytically computed but a positive and unbiased estimator for it can be calculated.
Following the recent developments on pseudo-marginal methods (see Andrieu and Roberts (2009) and Finke (2015) for a survey on the topic) we substitute the unknown density $f_{\bar{\mathbf{Z}}}$ in Equation (2) by a positive and unbiased estimate $\hat{f}_{\bar{\mathbf{Z}}}$ and show that the SMC procedure still targets the correct distribution—a strategy similar to the ones proposed in Everitt et al. (2016) and McGree et al. (2015). In the context of rare event simulation a similar idea has been independently developed in Vergé et al. (2016), where the authors study the impact of the parameter uncertainty on the probability of the rare event, whilst we analyse the impact on expectations conditional on the rare event (as in Equation (5)).
The idea of replacing an unknown density by a positive and unbiased estimate is at the core of many recently proposed algorithms, such as the Particle Markov Chain Monte Carlo (PMCMC) of Andrieu et al. (2010), the Sequential Monte Carlo Squared (SMC$^2$) of Chopin et al. (2013) and Fulop and Li (2013) (see also the island particle filter of Vergé et al. (2015)) and the Importance Sampling Squared (IS$^2$) of Tran et al. (2014). In the context of Sequential Monte Carlo algorithms this argument first appeared as a brief note in Rousset and Doucet's comments on Beskos et al. (2006), where it reads that "(...) a straightforward argument shows that it is not necessary to know $w_k(X_{t_0:t_k}^{(i)})$ [the weights] exactly. Only an unbiased positive estimate $\hat{w}_k(X_{t_0:t_k}^{(i)})$ of $w_k(X_{t_0:t_k}^{(i)})$ is necessary to obtain asymptotically consistent SMC estimates under weak assumptions".
To introduce the concept we first estimate $f_{\bar{\mathbf{Z}}}$ by $f_{\bar{\mathbf{Z}}}(\cdot \mid \boldsymbol{\theta})$, which can be seen as a "one sample" approximation to the integral in Equation (2); then we show how to use an estimator based on $M \ge 1$ samples from $f_{\boldsymbol{\theta}}$. These two approaches have been named in the literature (see Everitt et al. (2016) and references therein) as, respectively, the single auxiliary variable (SAV) and the multiple auxiliary variable (MAV) methods.

3.3.1. Single Auxiliary Variable Method

To avoid the direct use of $f_{\bar{\mathbf{Z}}}$ in the SMC sampler algorithm we provide a procedure on the joint space of $\bar{\mathbf{Z}}$ and the parameter $\boldsymbol{\theta}$, defined as $\mathcal{Y} = \mathbb{R}^d \times \Theta$. The reader is referred to Finke (2015) for an extensive list of known algorithms that can also be interpreted in an extended-space fashion. The target distribution on this new space is defined as the joint distribution of $\bar{\mathbf{Z}}$ and $\boldsymbol{\theta}$, and its marginal with respect to $\bar{\mathbf{Z}}$ is precisely the density of the conditional model.
Formally, for $\mathbf{y} = (\bar{\mathbf{z}}, \boldsymbol{\theta})$, $G_{\bar{\mathbf{Z}}}(\bar{B}) = G_{\bar{\mathbf{Z}}} = \{\bar{\mathbf{z}} \in \mathbb{R}^d : \sum_{i=1}^{d} \bar{z}_i \ge \bar{B}\}$ and $\bar{B} = \mathrm{VaR}_{99\%}(\bar{S})$ we define
$$\pi^y(\mathbf{y}) \propto \gamma^y(\mathbf{y}) = f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}} \mid \boldsymbol{\theta})\, f_{\boldsymbol{\theta}}(\boldsymbol{\theta})\, \mathbb{1}_{G_{\bar{\mathbf{Z}}}}(\bar{\mathbf{z}}), \quad (17)$$
which has the desired marginal target distribution of interest:
$$\bar{\pi}(\bar{\mathbf{z}}) \propto \bar{\gamma}(\bar{\mathbf{z}}) = \left[ \int_{\Theta} f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}} \mid \boldsymbol{\theta})\, f_{\boldsymbol{\theta}}(\boldsymbol{\theta})\, d\boldsymbol{\theta} \right] \mathbb{1}_{G_{\bar{\mathbf{Z}}}}(\bar{\mathbf{z}}). \quad (18)$$
Similarly to the densities defined in Equations (9) and (16) we define a sequence of target distributions, on $\mathcal{Y}$ and on $\mathcal{Y}^t$ respectively, as
$$\pi_t^y(\mathbf{y}_t) \propto \gamma_t^y(\mathbf{y}_t) = f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}}_t \mid \boldsymbol{\theta}_t)\, f_{\boldsymbol{\theta}}(\boldsymbol{\theta}_t)\, \mathbb{1}_{G_{\bar{\mathbf{Z}}}^t}(\bar{\mathbf{z}}_t),$$
and
$$\tilde{\pi}_t^y(\mathbf{y}_{1:t}) \propto \tilde{\gamma}_t^y(\mathbf{y}_{1:t}) = \gamma_t^y(\mathbf{y}_t) \prod_{s=1}^{t-1} L_s^y(\mathbf{y}_{s+1}, \mathbf{y}_s) = f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}}_t \mid \boldsymbol{\theta}_t)\, f_{\boldsymbol{\theta}}(\boldsymbol{\theta}_t)\, \mathbb{1}_{G_{\bar{\mathbf{Z}}}^t}(\bar{\mathbf{z}}_t) \prod_{s=1}^{t-1} \bar{L}_s(\bar{\mathbf{z}}_{s+1}, \bar{\mathbf{z}}_s \mid \boldsymbol{\theta}_s)\, f_{\boldsymbol{\theta}}(\boldsymbol{\theta}_s),$$
where the second identity specifies the choices of $L_s^y$ in terms of $\bar{L}_s$ and $f_{\boldsymbol{\theta}}$.
Assuming we can perfectly sample from the distribution of $\boldsymbol{\theta}$ (in our application this distribution is a posterior, from which samples are generated via simulation algorithms), to move $\mathbf{y}$ samples backwards from time $s+1$ to $s$ we split this process into sampling $\boldsymbol{\theta}_s$ from $f_{\boldsymbol{\theta}}$ (ignoring $\boldsymbol{\theta}_{s+1}$) and then, conditional on $\boldsymbol{\theta}_s$, moving $\bar{\mathbf{z}}_{s+1}$ to $\bar{\mathbf{z}}_s$. In other words, to sample
$$\mathbf{y}_s = (\bar{\mathbf{z}}_s, \boldsymbol{\theta}_s) \mid \{\mathbf{y}_{s+1} = (\bar{\mathbf{z}}_{s+1}, \boldsymbol{\theta}_{s+1})\} \sim L_s^y(\mathbf{y}_{s+1}, \mathbf{y}_s),$$
we split the process in two stages:
• $\boldsymbol{\theta}_s \sim f_{\boldsymbol{\theta}}(\boldsymbol{\theta}_s)$;
• $\bar{\mathbf{z}}_s \mid \bar{\mathbf{z}}_{s+1} \sim \bar{L}_s(\bar{\mathbf{z}}_{s+1}, \bar{\mathbf{z}}_s \mid \boldsymbol{\theta}_s)$.
The importance distribution on the path space of $\mathbf{y}$ can be expressed as
$$\tilde{q}_t^y(\mathbf{y}_{1:t}) = q_1^y(\mathbf{y}_1) \prod_{s=2}^{t} K_s^y(\mathbf{y}_{s-1}, \mathbf{y}_s) = \bar{q}_1(\bar{\mathbf{z}}_1)\, f_{\boldsymbol{\theta}}(\boldsymbol{\theta}_1) \prod_{s=2}^{t} \bar{K}_s(\bar{\mathbf{z}}_{s-1}, \bar{\mathbf{z}}_s \mid \boldsymbol{\theta}_s)\, f_{\boldsymbol{\theta}}(\boldsymbol{\theta}_s),$$
and, once again, the second identity provides the choices of $q_1^y$ and $K_s^y$, i.e.,
$$q_1^y(\mathbf{y}_1) = \bar{q}_1(\bar{\mathbf{z}}_1)\, f_{\boldsymbol{\theta}}(\boldsymbol{\theta}_1) \quad \text{and} \quad K_s^y(\mathbf{y}_{s-1}, \mathbf{y}_s) = \bar{K}_s(\bar{\mathbf{z}}_{s-1}, \bar{\mathbf{z}}_s \mid \boldsymbol{\theta}_s)\, f_{\boldsymbol{\theta}}(\boldsymbol{\theta}_s).$$
Therefore, a SMC procedure targeting the sequence $\{\pi_t^y(\mathbf{y}_t)\}_{t=1}^{T}$ produces unnormalized weights
$$w_t^y = \frac{\tilde{\gamma}_t^y(\mathbf{y}_{1:t})}{\tilde{q}_t^y(\mathbf{y}_{1:t})} = w_{t-1}^y \frac{\gamma_t^y(\mathbf{y}_t)\, L_{t-1}^y(\mathbf{y}_t, \mathbf{y}_{t-1})}{\gamma_{t-1}^y(\mathbf{y}_{t-1})\, K_t^y(\mathbf{y}_{t-1}, \mathbf{y}_t)} = w_{t-1}^y \frac{f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}}_t \mid \boldsymbol{\theta}_t)\, f_{\boldsymbol{\theta}}(\boldsymbol{\theta}_t)\, \mathbb{1}_{G_{\bar{\mathbf{Z}}}^t}(\bar{\mathbf{z}}_t)\, \bar{L}_{t-1}(\bar{\mathbf{z}}_t, \bar{\mathbf{z}}_{t-1} \mid \boldsymbol{\theta}_{t-1})\, f_{\boldsymbol{\theta}}(\boldsymbol{\theta}_{t-1})}{f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}}_{t-1} \mid \boldsymbol{\theta}_{t-1})\, f_{\boldsymbol{\theta}}(\boldsymbol{\theta}_{t-1})\, \mathbb{1}_{G_{\bar{\mathbf{Z}}}^{t-1}}(\bar{\mathbf{z}}_{t-1})\, \bar{K}_t(\bar{\mathbf{z}}_{t-1}, \bar{\mathbf{z}}_t \mid \boldsymbol{\theta}_t)\, f_{\boldsymbol{\theta}}(\boldsymbol{\theta}_t)} = w_{t-1}^y \frac{f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}}_t \mid \boldsymbol{\theta}_t)\, \mathbb{1}_{G_{\bar{\mathbf{Z}}}^t}(\bar{\mathbf{z}}_t)\, \bar{L}_{t-1}(\bar{\mathbf{z}}_t, \bar{\mathbf{z}}_{t-1} \mid \boldsymbol{\theta}_{t-1})}{f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}}_{t-1} \mid \boldsymbol{\theta}_{t-1})\, \mathbb{1}_{G_{\bar{\mathbf{Z}}}^{t-1}}(\bar{\mathbf{z}}_{t-1})\, \bar{K}_t(\bar{\mathbf{z}}_{t-1}, \bar{\mathbf{z}}_t \mid \boldsymbol{\theta}_t)},$$
which can be used to create weighted samples from $\bar{\pi}_t(\bar{\mathbf{z}}_t)$, the desired marginal of $\pi_t^y(\mathbf{y}_t)$ and the density required for the capital allocation.
Remark 5.
From the structure of the mutation kernels $K_t^y$ it should be noticed that at each iteration t a new value of $\boldsymbol{\theta}_t$ needs to be generated and used to sample $\bar{\mathbf{z}}_t \mid \boldsymbol{\theta}_t$. In other words, for each particle $j = 1, \ldots, N$ a different $\boldsymbol{\theta}_t^{(j)}$ is to be used for each $\bar{\mathbf{z}}_t^{(j)} \mid \boldsymbol{\theta}_t^{(j)}$.

3.3.2. Multiple Auxiliary Variable

In the previous algorithm we, indirectly, estimate the density $f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}})$ by $f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}} \mid \boldsymbol{\theta})$. In this section we discuss how to use a different and more robust estimator, based on $M \ge 1$ samples of $\boldsymbol{\theta}$. In the context of pseudo-marginal Markov chain Monte Carlo (MCMC), Andrieu and Vihola (2015) show that reducing the variance of the estimate of the unknown density $f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}})$ leads to a reduced asymptotic variance of estimators from the MCMC. For SMC algorithms this strategy has been used, for example, in McGree et al. (2015) and Everitt et al. (2016).
Before proceeding, we note that even in the case $M = 1$ the algorithm still produces asymptotically consistent and unbiased estimators (as the number of particles $N \to \infty$). However, the rate of variance reduction in the asymptotic estimates is directly affected by the choice of M (in a non-trivial manner). Furthermore, the asymptotic variance of Central Limit Theorem (CLT) estimators under this class of pseudo-marginal Monte Carlo approaches is strictly ordered in M, with increasing M reducing the asymptotic variance.
For any $M \ge 1$, a positive and unbiased estimate for $f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}})$ can be constructed as
$$\hat{f}_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}}; \boldsymbol{\vartheta}) = \frac{1}{M} \sum_{i=1}^{M} f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}} \mid \boldsymbol{\theta}^{(i)}), \quad (19)$$
where $\boldsymbol{\vartheta} = (\boldsymbol{\theta}^{(1)}, \ldots, \boldsymbol{\theta}^{(M)}) \in \Theta^M$ and each $\boldsymbol{\theta}^{(m)}$ is sampled independently from $f_{\boldsymbol{\theta}}(\boldsymbol{\theta})$. Note that when only one sample of $\boldsymbol{\theta}$ is used to estimate $f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}})$, the estimator reduces to $\hat{f}_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}}; \boldsymbol{\vartheta}) = f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}} \mid \boldsymbol{\theta})$. Also note that $\hat{f}_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}}; \boldsymbol{\vartheta}) \to f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}})$ point-wise as $M \to \infty$, by the law of large numbers. Indeed, since the random variable $\boldsymbol{\vartheta}$ has density $f_{\boldsymbol{\vartheta}}(\boldsymbol{\vartheta}) = \prod_{i=1}^{M} f_{\boldsymbol{\theta}}(\boldsymbol{\theta}^{(i)})$, we obtain
$$\int_{\Theta^M} \hat{f}_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}}; \boldsymbol{\vartheta})\, f_{\boldsymbol{\vartheta}}(\boldsymbol{\vartheta})\, d\boldsymbol{\vartheta} = \int_{\Theta^M} \frac{1}{M} \sum_{i=1}^{M} f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}} \mid \boldsymbol{\theta}^{(i)}) \prod_{k=1}^{M} f_{\boldsymbol{\theta}}(\boldsymbol{\theta}^{(k)})\, d\boldsymbol{\theta}^{(1)} \cdots d\boldsymbol{\theta}^{(M)} = \int_{\Theta} f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}} \mid \boldsymbol{\theta})\, f_{\boldsymbol{\theta}}(\boldsymbol{\theta})\, d\boldsymbol{\theta} = f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}}).$$
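A sketch of the estimator in Equation (19), assuming user-supplied placeholders for the conditional density and the posterior draws:

```python
import numpy as np

def f_hat(z_bar, theta_draws, f_cond):
    """Positive, unbiased estimate of f_Zbar(z_bar), Equation (19):
    average the conditional density f_cond(z_bar | theta) over M
    independent draws theta^(1), ..., theta^(M) from f_theta."""
    return np.mean([f_cond(z_bar, theta) for theta in theta_draws])
```

With M = 1 this reduces to the SAV estimator of Section 3.3.1; increasing M lowers the variance of the resulting importance weights at a linear increase in cost.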
Therefore the density $\bar{\pi}(\bar{\mathbf{z}})$ constructed in Equation (18) is the marginal of the new target density, defined on $\mathcal{Y}^M = \mathbb{R}^d \times \Theta^M$ as
$$\pi^y(\mathbf{y}; \boldsymbol{\vartheta}) \propto \gamma^y(\mathbf{y}; \boldsymbol{\vartheta}) = \hat{f}_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}}; \boldsymbol{\vartheta})\, f_{\boldsymbol{\vartheta}}(\boldsymbol{\vartheta})\, \mathbb{1}_{G_{\bar{\mathbf{Z}}}}(\bar{\mathbf{z}}).$$
Apart from the more cumbersome notation, the same argument as in the previous section can be used to show that a SMC procedure with the estimated density $\hat{f}_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}}; \boldsymbol{\vartheta})$ replacing $f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}})$ has unnormalized weights given by
$$w_t^y = w_{t-1}^y\, \frac{\hat{f}_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}}_t; \boldsymbol{\vartheta}_t)\, \mathbb{1}_{G_{\bar{\mathbf{Z}}}^t}(\bar{\mathbf{z}}_t)\, \bar{L}_{t-1}(\bar{\mathbf{z}}_t, \bar{\mathbf{z}}_{t-1} \mid \boldsymbol{\vartheta}_{t-1})}{\hat{f}_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}}_{t-1}; \boldsymbol{\vartheta}_{t-1})\, \mathbb{1}_{G_{\bar{\mathbf{Z}}}^{t-1}}(\bar{\mathbf{z}}_{t-1})\, \bar{K}_t(\bar{\mathbf{z}}_{t-1}, \bar{\mathbf{z}}_t \mid \boldsymbol{\vartheta}_t)},$$
when targeting a sequence $\{\bar{\pi}_t(\bar{\mathbf{z}}_t)\}_{t=1}^{T}$ with $\bar{\pi}_T(\bar{\mathbf{z}}_T) = \bar{\pi}(\bar{\mathbf{z}})$.
The algorithms described in this section contain several degrees of freedom, whose choices are discussed in detail in Section 9. After this brief introduction to SMC algorithms, in the next section we formally define the elements necessary for constructing the statistical models underlying the risk drivers $\bar{\mathbf{Z}}$ and $\mathbf{Z}$ discussed in Section 2, identifying their components with the one-year reserve risk and the one-year premium risk. We also present the formulas for the Solvency Capital Requirements (SCRs) under both the conditional and marginalized models.

4. Swiss Solvency Test and Claims Development

For the rest of this work we assume all random variables are defined on the filtered probability space $(\Omega, \mathcal{F}, \mathbb{P}, \{\mathcal{F}(t)\}_{t \ge 0})$. We denote cumulative payments for accident year $i = 1, \ldots, t$ until development year $j = 0, \ldots, J$ (with $t > J$) in LoB $\ell = 1, \ldots, L$ by $C_{i,j}^{(\ell)}$. Moreover, in the $\ell$-th LoB incremental payments for claims with accident year i and development year j are denoted by $X_{i,j}^{(\ell)} = C_{i,j}^{(\ell)} - C_{i,j-1}^{(\ell)}$. Remark that these payments are made in accounting year $i + j$.
The information (regarding claims payments) available at time $t = 0, \ldots, I+J$ for the $\ell$-th LoB is assumed to be given by
$$\mathcal{D}^{(\ell)}(t) = \{ X_{i,j}^{(\ell)} : 1 \le i \le t,\ 0 \le j \le J,\ 1 \le i+j \le t \}, \quad (20)$$
and, similarly, the total information (regarding claims payments) available at time t is denoted by
$$\mathcal{D}(t) = \bigcup_{\ell=1}^{L} \mathcal{D}^{(\ell)}(t).$$
Remark 6.
By a slight abuse of notation we also use $\mathcal{D}^{(\ell)}(t)$ and $\mathcal{D}(t)$ for the sigma-fields generated by the corresponding sets. Note that $\mathcal{D}(t) \subseteq \mathcal{F}(t)$ for all $t \ge 0$, as we assume that $\mathcal{F}(t)$ contains not only information about claims payments, but also about premium and administrative costs.
The general aim now is to predict, at time t and given the information $\mathcal{F}(t)$, the future cumulative payments $C_{i,j}^{(\ell)}$ for $i + j > t$; in particular the so-called ultimate claim $C_{i,J}^{(\ell)}$. For more information we refer to Wüthrich (2015).

4.1. Conditional Predictive Model

As noted previously, we generically denote the parameters in the Bayesian model for the $\ell$-th LoB by $\boldsymbol{\theta}^{(\ell)}$. For ease of exposition, whenever a quantity is defined conditional on $\boldsymbol{\theta}^{(\ell)}$ it is denoted with a bar on top of it.
At time $t \ge I$, for LoB $\ell$ and accident year $i > t - J$, predictors for the ultimate claim $C_{i,J}^{(\ell)}$ and the corresponding claims reserves are defined, respectively, as
$$\bar{C}_{i,J}^{(\ell)}(t) = \mathbb{E}[C_{i,J}^{(\ell)} \mid \boldsymbol{\theta}^{(\ell)}, \mathcal{F}(t)] \quad \text{and} \quad \bar{R}_i^{(\ell)}(t) = \bar{C}_{i,J}^{(\ell)}(t) - C_{i,t-i}^{(\ell)}. \quad (21)$$
Under modern solvency regulations, such as Solvency II European Commission (2009) and the Swiss Solvency Test FINMA (2007), an important variable to be analysed is the claims development result (CDR). For accident year $i = 1, \ldots, I$, accounting year $t+1 > I$ and LoB $\ell$, the CDR is defined as
$$\overline{\mathrm{CDR}}_i^{(\ell)}(t+1) = \bar{R}_i^{(\ell)}(t) - \left( X_{i,t-i+1}^{(\ell)} + \bar{R}_i^{(\ell)}(t+1) \right) = \bar{C}_{i,J}^{(\ell)}(t) - \bar{C}_{i,J}^{(\ell)}(t+1), \quad (22)$$
and an application of the tower property of the expectation shows that (subject to integrability)
$$\mathbb{E}[\overline{\mathrm{CDR}}_i^{(\ell)}(t+1) \mid \boldsymbol{\theta}^{(\ell)}, \mathcal{F}(t)] = 0. \quad (23)$$
Thus, the prediction process in Equation (21) is a martingale in t and we aim to study the volatility of these martingale innovations.
Equation (23) justifies the prediction of the CDR by zero, and the uncertainty of this prediction can be assessed by the conditional mean squared error of prediction (msep):
$$\mathrm{msep}_{\overline{\mathrm{CDR}}_i^{(\ell)}(t+1) \mid \boldsymbol{\theta}^{(\ell)}, \mathcal{F}(t)}(0) = \mathbb{E}\big[ (\overline{\mathrm{CDR}}_i^{(\ell)}(t+1) - 0)^2 \mid \boldsymbol{\theta}^{(\ell)}, \mathcal{F}(t) \big] = \mathrm{Var}\big( \overline{\mathrm{CDR}}_i^{(\ell)}(t+1) \mid \boldsymbol{\theta}^{(\ell)}, \mathcal{F}(t) \big) \quad (24)$$
$$= \mathrm{Var}\big( \bar{C}_{i,J}^{(\ell)}(t+1) \mid \boldsymbol{\theta}^{(\ell)}, \mathcal{F}(t) \big). \quad (25)$$
Moreover, we denote the aggregated (over all accident years) CDR and reserves, conditional on the knowledge of the parameter $\boldsymbol{\theta}^{(\ell)}$, respectively, by
$$\overline{\mathrm{CDR}}^{(\ell)}(t+1) = \sum_{i=t-J+1}^{t} \overline{\mathrm{CDR}}_i^{(\ell)}(t+1) \quad \text{and} \quad \bar{R}^{(\ell)}(t) = \sum_{i=t-J+1}^{t} \bar{R}_i^{(\ell)}(t). \quad (26)$$
Using this notation we also define the total prediction uncertainty incurred when predicting $\overline{\mathrm{CDR}}^{(\ell)}(t+1)$ by zero as
$$\mathrm{msep}_{\overline{\mathrm{CDR}}^{(\ell)}(t+1) \mid \boldsymbol{\theta}^{(\ell)}, \mathcal{F}(t)}(0) = \mathrm{Var}\left( \sum_{i=t-J+1}^{t} \bar{C}_{i,J}^{(\ell)}(t+1) \,\middle|\, \boldsymbol{\theta}^{(\ell)}, \mathcal{F}(t) \right).$$
Remark 7.
It should be remarked that, in general, as the parameter vector $\boldsymbol{\theta}^{(\ell)}$ is unknown, none of the quantities presented in this section can be directly calculated unless an explicit estimate for the parameter is used.

4.2. Marginalized Predictive Model

Even though cumulative claims models are defined conditional on unobserved parameter values, any quantity that has to be calculated based on these models should only depend on observable variables. Under the Bayesian paradigm, unknown quantities are modelled using a prior probability distribution reflecting prior beliefs about these parameters.
Analogously to Section 4.1 we define the marginalized (Bayesian) ultimate claim predictor and its reserves, respectively, as
$$C_{i,J}^{(\ell)}(t) = \mathbb{E}[C_{i,J}^{(\ell)} \mid \mathcal{F}(t)] = \mathbb{E}[\bar{C}_{i,J}^{(\ell)}(t) \mid \mathcal{F}(t)] \quad \text{and} \quad R_i^{(\ell)}(t) = C_{i,J}^{(\ell)}(t) - C_{i,t-i}^{(\ell)}.$$
We also define the marginalized CDR and notice, again using the tower property, that its mean is equal to zero:
$$\mathrm{CDR}_i^{(\ell)}(t+1) = C_{i,J}^{(\ell)}(t) - C_{i,J}^{(\ell)}(t+1), \quad \text{with} \quad \mathbb{E}[\mathrm{CDR}_i^{(\ell)}(t+1) \mid \mathcal{F}(t)] = 0.$$
Furthermore, summing over all accident years i, we follow Equation (26) and denote by $R^{(\ell)}(t)$ and $\mathrm{CDR}^{(\ell)}(t+1)$ the aggregated versions of the marginalized reserves and CDR, where the uncertainty in the latter is measured via
$$\mathrm{msep}_{\mathrm{CDR}^{(\ell)}(t+1) \mid \mathcal{F}(t)}(0) = \mathrm{Var}\left( \sum_{i=t-J+1}^{t} C_{i,J}^{(\ell)}(t+1) \,\middle|\, \mathcal{F}(t) \right). \quad (28)$$

4.3. Solvency Capital Requirement (SCR)

In this section we discuss how two important concepts in actuarial risk management, namely the technical result (TR) and the solvency capital requirement (SCR), can be defined for both the conditional and the marginalized models.
In this context the TR is calculated netting all income and expenses arising from the LoBs, while the SCR denotes the minimum capital required by the regulatory authorities in order to cover the company’s business risks. More precisely, the SCR for accounting year t + 1 quantifies the risk of having a substantially distressed result at time t + 1 , evaluated in light of the available information at time t.
As an important shorthand notation, we introduce three sets of random variables, representing the total claim amounts of the current year (CY) claims and of prior year (PY) claims, the latter for both the conditional and marginalized models. These random variables are defined, respectively, as
$$Z_{CY}^{(\ell)} = C_{t+1,J}^{(\ell)}(t+1), \qquad \bar{Z}_{PY}^{(\ell)} = \sum_{i=t-J+1}^{t} \left( \bar{C}_{i,J}^{(\ell)}(t+1) - C_{i,t-i}^{(\ell)} \right) \quad \text{and} \quad Z_{PY}^{(\ell)} = \sum_{i=t-J+1}^{t} \left( C_{i,J}^{(\ell)}(t+1) - C_{i,t-i}^{(\ell)} \right). \quad (29)$$
In the standard SST model, CY claims do not depend on any unknown parameters and are split into small claims $Z_{CY,s}^{(\ell)}$ for the LoBs $\ell = 1, \ldots, L$ and into large events $Z_{CY,l}^{(p)}$ for the perils $p = 1, \ldots, P$. Small claims are also called attritional claims, and large claims can be individual large claims or catastrophic events, like earthquakes. In this context the company can choose thresholds $\beta^{(\ell)}$ such that claims larger than these amounts are classified as large claims in the respective LoBs.
To further simplify the notation we also group all the random variables related to the conditional and the marginalized models in two random vectors, defined as follows:
$$\bar{\mathbf{Z}} = (\bar{Z}_1, \ldots, \bar{Z}_{2L+P}) = \big( \bar{Z}_{PY}^{(1)}, \ldots, \bar{Z}_{PY}^{(L)}, Z_{CY,s}^{(1)}, \ldots, Z_{CY,s}^{(L)}, Z_{CY,l}^{(1)}, \ldots, Z_{CY,l}^{(P)} \big),$$
$$\mathbf{Z} = (Z_1, \ldots, Z_{2L+P}) = \big( Z_{PY}^{(1)}, \ldots, Z_{PY}^{(L)}, Z_{CY,s}^{(1)}, \ldots, Z_{CY,s}^{(L)}, Z_{CY,l}^{(1)}, \ldots, Z_{CY,l}^{(P)} \big).$$
Next we give more details on how the TR and the SCR are calculated in the generic structure of the conditional and the marginalized models.

4.3.1. SCR for the Conditional Model

At time $t+1$ the technical result (TR) of the $\ell$-th LoB in accounting year $(t, t+1]$, based on the conditional model, is defined as the following $\mathcal{F}(t+1)$-measurable random variable:
$$\overline{\mathrm{TR}}^{(\ell)}(t+1) = \Pi^{(\ell)}(t+1) - K^{(\ell)}(t+1) - C_{t+1,J}^{(\ell)}(t+1) + \overline{\mathrm{CDR}}^{(\ell)}(t+1),$$
where $\Pi^{(\ell)}(t+1)$ and $K^{(\ell)}(t+1)$ are, respectively, the earned premium and the administrative costs of accounting year $(t, t+1]$. For simplicity, we assume that these two quantities are known at time t, i.e., the premium and administrative costs of accounting year $(t, t+1]$ are assumed to be previsible and, hence, $\mathcal{F}(t)$-measurable. Moreover, it should be noticed that in this context $\mathcal{F}(t)$ includes more than the claims payment information defined in Equation (20): the general sigma-field $\mathcal{F}(t)$ should be seen as generated by the inclusion in $\mathcal{D}(t)$ of the information about $\Pi^{(\ell)}(t+1)$ and $K^{(\ell)}(t+1)$, for $\ell = 1, \ldots, L$.
Given the technical result for all the LoBs, the company's overall TR based on the conditional model, and the aggregated cost and premium, are denoted, respectively, by
$$\overline{\mathrm{TR}}(t+1) = \sum_{\ell=1}^{L} \overline{\mathrm{TR}}^{(\ell)}(t+1), \qquad \Pi(t+1) = \sum_{\ell=1}^{L} \Pi^{(\ell)}(t+1) \quad \text{and} \quad K(t+1) = \sum_{\ell=1}^{L} K^{(\ell)}(t+1).$$
In order to cover the company's risks over a horizon of one year, the Swiss Solvency Test is concerned with the 99% ES (in light of all the data up to time t):
$$\overline{\mathrm{SCR}}(t+1) = \mathrm{ES}_{99\%}[-\overline{\mathrm{TR}}(t+1) \mid \mathcal{F}(t)],$$
where $\overline{\mathrm{SCR}}$ denotes the solvency capital requirement.
It is important to notice that even though the ES operator is being applied to a "conditional random variable", namely $\overline{\mathrm{TR}}$, the operator is not taken conditional on the knowledge of $\boldsymbol{\theta} = (\boldsymbol{\theta}^{(1)}, \ldots, \boldsymbol{\theta}^{(L)})$, otherwise this quantity would not be computable (as discussed in Remark 7). Instead, the SCR is calculated based on the marginalized version of the conditional model, where the parameter uncertainty is integrated out. More precisely, the expected shortfall is based on the following (usually intractable) distribution:
$$f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}} \mid \mathcal{F}(t)) = \int f_{\bar{\mathbf{Z}}}(\bar{\mathbf{z}} \mid \boldsymbol{\theta}, \mathcal{F}(t))\, \pi(\boldsymbol{\theta} \mid \mathcal{F}(t))\, d\boldsymbol{\theta}.$$
In order to compute the SCR based on the conditional model we first discuss the measurability of the terms in the conditional TR, which can be rewritten as
$$\overline{\mathrm{TR}}(t+1) = -K(t+1) + \Pi(t+1) + \sum_{\ell=1}^{L} \sum_{i=t-J+1}^{t} \left( \bar{C}_{i,J}^{(\ell)}(t) - C_{i,t-i}^{(\ell)} \right) - \sum_{\ell=1}^{L} \left( \bar{Z}_{PY}^{(\ell)} + Z_{CY}^{(\ell)} \right).$$
From the above equation we see that the first two terms are, by assumption, $\mathcal{F}(t)$-measurable, and so are all the terms of the form $C_{i,t-i}^{(\ell)}$ (payments already completed by time t), while the last summation is $\mathcal{F}(t+1)$-measurable and, therefore, a random variable at time t. Due to the dependence on the unknown parameter $\boldsymbol{\theta}$, the conditional ultimate claim predictor $\bar{C}_{i,J}^{(\ell)}(t)$ is usually not $\mathcal{F}(t)$-measurable. However, under the special models introduced in Section 5, $\bar{C}_{i,J}^{(\ell)}(t)$ depends only on the claims data up to time t and not on the unknown parameter vector, making it $\mathcal{F}(t)$-measurable. In this case one has
$$\overline{\mathrm{SCR}}(t+1) = K(t+1) - \Pi(t+1) - \sum_{\ell=1}^{L} \bar{R}^{(\ell)}(t) + \mathrm{ES}_{99\%}\left[ \sum_{\ell=1}^{L} \left( \bar{Z}_{PY}^{(\ell)} + Z_{CY}^{(\ell)} \right) \,\middle|\, \mathcal{F}(t) \right], \quad (32)$$
where, by assumption, $\sum_{\ell=1}^{L} \bar{R}^{(\ell)}(t) = \sum_{\ell=1}^{L} \sum_{i=t-J+1}^{t} \left( \bar{C}_{i,J}^{(\ell)}(t) - C_{i,t-i}^{(\ell)} \right)$ is $\mathcal{F}(t)$-measurable.

4.3.2. SCR for the Marginalized Model

As the parameter uncertainty is dealt with in a previous step, the calculation of the SCR for the marginalized model is simpler than its conditional counterpart.
Similarly to the conditional case, we define the TR for the marginalized model as
$$\mathrm{TR}^{(\ell)}(t+1) = \Pi^{(\ell)}(t+1) - K^{(\ell)}(t+1) - C_{t+1,J}^{(\ell)}(t+1) + \mathrm{CDR}^{(\ell)}(t+1),$$
and its aggregated version as
$$\mathrm{TR}(t+1) = \sum_{\ell=1}^{L} \mathrm{TR}^{(\ell)}(t+1).$$
Furthermore, the SCR for the marginalized model is given by
$$\mathrm{SCR}(t+1) = \mathrm{ES}_{99\%}[-\mathrm{TR}(t+1) \mid \mathcal{F}(t)] = K(t+1) - \Pi(t+1) - \sum_{\ell=1}^{L} R^{(\ell)}(t) + \mathrm{ES}_{99\%}\left[ \sum_{\ell=1}^{L} \left( Z_{PY}^{(\ell)} + Z_{CY}^{(\ell)} \right) \,\middle|\, \mathcal{F}(t) \right],$$
where in this case the expected shortfall is calculated with respect to the density $f_{\mathbf{Z}}(\mathbf{z} \mid \mathcal{F}(t))$.
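For intuition, once draws (or weighted particles) of the risk vector Z are available, the marginalized SCR above reduces to an expected-shortfall calculation; a sketch with placeholder inputs:

```python
import numpy as np

def scr_marginalized(Z_sims, K, Pi, reserves, alpha=0.99):
    """SCR(t+1) = K - Pi - sum_l R^(l)(t) + ES_alpha of the total risk
    sum_l (Z_PY^(l) + Z_CY^(l)), estimated from simulations of Z
    (one column per risk trigger)."""
    S = Z_sims.sum(axis=1)
    var_alpha = np.quantile(S, alpha)
    es = S[S >= var_alpha].mean()
    return K - Pi - reserves.sum() + es

rng = np.random.default_rng(2)
Z_sims = rng.lognormal(mean=0.0, sigma=0.5, size=(100_000, 4))  # placeholder
scr = scr_marginalized(Z_sims, K=50.0, Pi=120.0,
                       reserves=np.array([10.0, 20.0, 15.0]))
```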
Remark 8.
For the models discussed in Section 5, as $\bar{C}_{i,J}^{(\ell)}(t)$ does not depend on the parameter vector $\boldsymbol{\theta}$, we also have that $\bar{R}^{(\ell)}(t) = R^{(\ell)}(t)$.
Remark 9.
As we assume the cost of claims processing and assessment $K(t+1)$ and the premium $\Pi(t+1)$ are known at time t, they do not differ between the conditional and the marginalized models.

5. Modelling of Individual LoBs PY Claims

For the modelling of the PY claims reserving risk we need to model $\bar{Z}_{PY}$ or $Z_{PY}$ as given in Equation (29). The uncertainty in these random variables will be assessed by the conditional and marginalized mean square error of prediction (msep), introduced in Equations (25) and (28). In order to calculate the msep we must first expand our analysis to the study of the claims reserving uncertainty. To do so, in this section we present a fully Bayesian version of the gamma-gamma chain-ladder (CL) model, which has been studied in Peters et al. (2017).
Since in this section we present the model for individual LoBs, for notational simplicity we omit the upper index $(\ell)$ from all random variables and parameters.
Model Assumptions 1.
[Gamma-gamma Bayesian chain ladder model] We make the following assumptions:
(a) 
Conditionally, given $\boldsymbol{\phi} = (\phi_0, \ldots, \phi_{J-1})$ and $\boldsymbol{\sigma} = (\sigma_0, \ldots, \sigma_{J-1})$, cumulative claims $(C_{i,j})_{j=0,\ldots,J}$ are independent (in accident year i) Markov processes (in development year j) with
$$C_{i,j+1} \mid \{\mathcal{F}(i+j), \boldsymbol{\phi}, \boldsymbol{\sigma}\} \sim \Gamma\!\left( \frac{C_{i,j}}{\sigma_j^2},\ \frac{\phi_j}{\sigma_j^2} \right),$$
for all $1 \le i \le t$ and $0 \le j \le J-1$.
(b) 
The parameter vectors ϕ and σ are independent.
(c) 
For given hyper-parameters $f_j > 0$ the components of $\boldsymbol{\phi}$ are independent, such that
$$\phi_j \sim \lim_{\gamma_j \downarrow 1} \Gamma\big( \gamma_j,\ f_j(\gamma_j - 1) \big),$$
for $0 \le j \le J-1$, where the limit means that they are eventually distributed according to an improper uninformative prior.
(d) 
The components $\sigma_j$ of $\boldsymbol{\sigma}$ are independent and $F_{\sigma_j}$-distributed, having support in $(0, d_j)$ for given constants $0 < d_j < \infty$, for all $0 \le j \le J-1$.
(e) 
$\boldsymbol{\phi}$, $\boldsymbol{\sigma}$ and $C_{1,0}, \ldots, C_{t,0}$ are independent and $\mathbb{P}[C_{i,0} > 0] = 1$, for all $1 \le i \le t$.
In Model Assumptions 1 (c) the (improper) prior distribution for $\boldsymbol{\phi}$ should be seen as the non-informative limit, when $\boldsymbol{\gamma} = (\gamma_0, \ldots, \gamma_{J-1}) \to \mathbf{1} = (1, \ldots, 1)$, of the (proper) prior assumption
$$\phi_j \sim \Gamma\big( \gamma_j,\ f_j(\gamma_j - 1) \big).$$
The limit in (c) does not lead to a proper probabilistic model for the prior distribution; however, based on "reasonable" observations $\{C_{i,j}\}_{i,j}$ the posterior model can be shown to be well defined (see Equation (38)), a result that has been proved using the dominated convergence theorem in Peters et al. (2017).
From Model Assumptions 1 (a), conditional on a specific value of the parameter vectors $\boldsymbol{\phi}$ and $\boldsymbol{\sigma}$, we have that
$$\mathbb{E}[C_{i,j+1} \mid \mathcal{F}(i+j), \boldsymbol{\phi}, \boldsymbol{\sigma}] = \phi_j^{-1} C_{i,j}, \qquad \mathrm{Var}(C_{i,j+1} \mid \mathcal{F}(i+j), \boldsymbol{\phi}, \boldsymbol{\sigma}) = \phi_j^{-2} \sigma_j^2\, C_{i,j}, \quad (35)$$
which provides a stochastic formulation of the classical CL model of Mack (1993).
Even though the prior is assumed improper and does not integrate to one, the conditional posterior for $\phi_j \mid \boldsymbol{\sigma}, \mathcal{F}(t)$ is proper and, in addition, gamma distributed (see Appendix A and (Merz and Wüthrich 2015, Lemma 3.2)). More precisely, we have that
$$\phi_j \mid \boldsymbol{\sigma}, \mathcal{F}(t) \sim \Gamma(a_j, b_j), \quad (36)$$
with the following parameters:
$$a_j = 1 + \sum_{i=1}^{t-j-1} \frac{C_{i,j}}{\sigma_j^2} \quad \text{and} \quad b_j = \sum_{i=1}^{t-j-1} \frac{C_{i,j+1}}{\sigma_j^2}. \quad (37)$$
Therefore, given $\boldsymbol{\sigma}$, this model belongs to the family of Bayesian models with conjugate priors that allow for closed-form (conditional) posteriors; for details see Wüthrich (2015).
The marginal posterior distribution of the elements of the vector $\boldsymbol{\sigma}$ is given by
$$\pi(\sigma_j \mid \mathcal{F}(t)) \propto h_j(\sigma_j \mid \mathcal{F}(t)) = \frac{\Gamma(a_j)}{b_j^{a_j}}\, f_{\sigma_j}(\sigma_j) \prod_{i=1}^{t-j-1} \frac{\big( C_{i,j+1}\, \sigma_j^{-2} \big)^{C_{i,j} \sigma_j^{-2}}}{\Gamma\big( C_{i,j}\, \sigma_j^{-2} \big)}, \quad (38)$$
with $a_j$ and $b_j$ defined in Equation (37). We note that as long as Model Assumptions 1 (d) and the conditions in Lemma A1 are satisfied, the posterior distribution of $\boldsymbol{\sigma}$ is proper.
Therefore, under Model Assumptions 1 inference for all the unknown parameters can be performed. It should be noticed, though, that differently from the (conditional) posteriors for $\phi_j$ in Equation (36), the posterior for $\sigma_j$ in Equation (38) is not recognized as a known distribution. Thus, whenever expectations with respect to the distribution of $\sigma_j \mid \mathcal{F}(t)$ need to be calculated, one needs to make use of numerical procedures, such as numerical integration or Markov Chain Monte Carlo (MCMC) methods.
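As an illustration of Equations (37) and (38), the sketch below evaluates $a_j$, $b_j$ and the unnormalized log-posterior of $\sigma_j$ on a grid; the 3x3 toy triangle and the flat prior $f_{\sigma_j}$ on $(0, d_j)$ are placeholder assumptions. Samples of $\sigma_j \mid \mathcal{F}(t)$ can then be obtained, e.g., by inverse-CDF sampling on the grid or by MCMC.

```python
import numpy as np
from scipy.special import gammaln

# Toy cumulative triangle C[i, j]; np.nan marks unobserved cells.
C = np.array([[100.0, 150.0, 165.0],
              [110.0, 160.0, np.nan],
              [120.0, np.nan, np.nan]])

def log_h_j(sigma_j, C, j):
    """Unnormalized log-posterior of sigma_j (Equation (38)),
    under a flat prior on (0, d_j)."""
    obs = ~np.isnan(C[:, j + 1])          # accident years with C_{i,j+1}
    Cj, Cj1 = C[obs, j], C[obs, j + 1]
    s2 = sigma_j ** 2
    a_j = 1.0 + Cj.sum() / s2             # Equation (37)
    b_j = Cj1.sum() / s2
    log_lik = np.sum((Cj / s2) * np.log(Cj1 / s2) - gammaln(Cj / s2))
    return gammaln(a_j) - a_j * np.log(b_j) + log_lik

grid = np.linspace(0.05, 5.0, 200)        # support (0, d_j), d_j = 5
logp = np.array([log_h_j(s, C, j=0) for s in grid])
post = np.exp(logp - logp.max())          # normalized posterior on the grid
post /= post.sum()
```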

5.1. MSEP Results Conditional on σ

Following Model Assumptions 1 we now discuss how to explicitly calculate the quantities introduced in Section 4. We start with the equivalent of the classic CL factor. From the model structure in Equation (35) we define the posterior Bayesian CL factors, given $\boldsymbol{\sigma}$, as
$$\hat{f}_j(t) = \mathbb{E}[\phi_j^{-1} \mid \sigma_j, \mathcal{F}(t)],$$
which, using the gamma distribution from Equation (36), takes the form
$$\hat{f}_j(t) = \frac{\sum_{k=1}^{t-j-1} C_{k,j+1}}{\sum_{k=1}^{t-j-1} C_{k,j}},$$
i.e., $\hat{f}_j(t)$ is identical to the classic CL factor estimate.
Following Equation (21) we define the conditional ultimate claim predictor
$$\bar{C}_{i,J}(t) = \mathbb{E}[C_{i,J} \mid \boldsymbol{\sigma}, \mathcal{F}(t)] = \mathbb{E}_{\boldsymbol{\phi}}\big[ \mathbb{E}[C_{i,J} \mid \boldsymbol{\phi}, \boldsymbol{\sigma}, \mathcal{F}(t)] \mid \boldsymbol{\sigma}, \mathcal{F}(t) \big], \quad (39)$$
which can be shown (see (Wüthrich 2015, Theorem 9.5)) to be equal to
$$\bar{C}_{i,J}(t) = C_{i,t-i} \prod_{j=t-i}^{J-1} \hat{f}_j(t), \quad (40)$$
which is exactly the classic chain ladder predictor of Mack (1993). For this reason we may take Model Assumptions 1 as a distributional model for the classical CL method. Additionally, the conditional reserves defined in Equation (21) and Equation (26) are also the same as the classic CL ones, that is,
$$\bar{R}(t) = \sum_{i=1}^{t} \left( \bar{C}_{i,J}(t) - C_{i,t-i} \right). \quad (41)$$
The importance of Equation (40) lies in the fact that it does not depend on the parameter vector $\boldsymbol{\sigma}$. In other words, the ultimate claim predictor based on the Bayesian model from Model Assumptions 1 conditional on $\boldsymbol{\sigma}$ (which would, in general, be a random variable) is a real number, independent of $\boldsymbol{\sigma}$. This justifies the argument used in the calculation of Equation (32).
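A short numerical sketch of the CL factors and of Equations (40) and (41) on a toy cumulative triangle (illustrative data; rows are accident years, columns development years, np.nan marks the unobserved lower triangle):

```python
import numpy as np

C = np.array([[100.0, 150.0, 165.0],
              [110.0, 160.0, np.nan],
              [120.0, np.nan, np.nan]])
t, J = C.shape[0], C.shape[1] - 1

# Classic CL factors: sum_k C_{k,j+1} / sum_k C_{k,j}.
f_hat = np.ones(J)
for j in range(J):
    obs = ~np.isnan(C[:, j + 1])
    f_hat[j] = C[obs, j + 1].sum() / C[obs, j].sum()

# Ultimate predictors C_bar_{i,J}(t), Equation (40), and reserves,
# Equation (41).
ultimates = np.empty(t)
for i in range(t):
    j0 = t - 1 - i                       # last observed development year
    ultimates[i] = C[i, j0] * np.prod(f_hat[j0:])
reserves = ultimates - np.array([C[i, t - 1 - i] for i in range(t)])
print("CL factors:", f_hat, "total reserves:", reserves.sum())
```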
Remark 10.
Using the notation of the previous sections, the parameter vector $\boldsymbol{\sigma}$ plays the role of $\boldsymbol{\theta}$ as the only unknown since, due to conjugacy properties, $\boldsymbol{\phi}$ can be marginalized analytically.
For the Bayesian model from Model Assumptions 1 the msep conditional on $\boldsymbol{\sigma}$ has been derived in (Wüthrich 2015, Theorem 9.16) as follows, for $i + J > t$:
$$\mathrm{msep}_{\overline{\mathrm{CDR}}_i(t+1) \mid \boldsymbol{\sigma}, \mathcal{F}(t)}(0) = \bar{C}_{i,J}(t)^2 \left[ \left( 1 + \frac{\bar{\Psi}_{t-i}(t)}{\bar{\beta}_{t-i}(t)} \right) \prod_{j=t-i+1}^{J-1} \left( 1 + \bar{\beta}_j(t)\, \bar{\Psi}_j(t) \right) - 1 \right], \quad (42)$$
where
$$\bar{\beta}_j(t) = \frac{C_{t-j,j}}{\sum_{i=1}^{t-j} C_{i,j}} \quad \text{and} \quad \bar{\Psi}_j(t) = \frac{\sigma_j^2}{\sum_{k=1}^{t-j-1} C_{k,j} - \sigma_j^2}. \quad (43)$$
Moreover, the conditional msep has been shown to be finite if, and only if, $\sigma_j^2 < \sum_{k=1}^{t-j-1} C_{k,j}$. We also refer to Remark 12, below.
The aggregated conditional msep for $\overline{\mathrm{CDR}}(t+1) = \sum_{i=1}^{t} \overline{\mathrm{CDR}}_i(t+1)$ is also derived in (Wüthrich 2015, Theorem 9.16), and given by
$$\mathrm{msep}_{\overline{\mathrm{CDR}}(t+1) \mid \boldsymbol{\sigma}, \mathcal{F}(t)}(0) = \sum_{i=t-J+1}^{t} \mathrm{msep}_{\overline{\mathrm{CDR}}_i(t+1) \mid \boldsymbol{\sigma}, \mathcal{F}(t)}(0) + 2 \sum_{t-J+1 \le i < k \le t} \bar{C}_{i,J}(t)\, \bar{C}_{k,J}(t) \left[ \left( 1 + \bar{\Psi}_{t-i}(t) \right) \prod_{j=t-i+1}^{J-1} \left( 1 + \bar{\beta}_j(t)\, \bar{\Psi}_j(t) \right) - 1 \right]. \quad (44)$$
Remark 11.
The assumption that $\sigma_j^2 < \sum_{k=1}^{t-j-1} C_{k,j}$ is made in order to guarantee that the conditional msep is finite, and we enforce this assumption for all the examples presented in this work. See also Remark 12, below.

5.2. Marginalized MSEP Results

The results in the previous section are based on derivations presented in Merz and Wüthrich (2015) and Wüthrich (2015), where the parameter vector $\boldsymbol{\sigma}$ is assumed to be known. In this section we study the impact of the uncertainty in $\boldsymbol{\sigma}$ on the mean and variance of $C_{i,J}(t+1) \mid \mathcal{F}(t)$ in light of Model Assumptions 1, which can be seen as a fully Bayesian version of the models previously mentioned.
In order to have well defined posterior distributions for $\boldsymbol{\sigma}$, throughout this section we follow Lemma A1 and assume that, for all development years $0 \le j \le J-1$ and $t \le I$, we have $(t-j-1) \wedge I = 1$ or at least one accident year $1 \le i \le (t-j-1) \wedge I$ is such that $C_{i,j+1} \ne C_{i,j}\, \hat{f}_j(t)$. For all the numerical results presented this assumption is satisfied.
Lemma 1.
The ultimate claim estimator under the marginalized model is equal to the classic chain ladder predictor, i.e., $C_{i,J}(t) = \mathbb{E}[C_{i,J} \mid \mathcal{F}(t)] = \bar{C}_{i,J}(t)$.
Proof. 
Due to the posterior independence of the elements of $\boldsymbol{\phi}$ (also used in Equations (39) and (40)) and the fact that $\bar{C}_{i,J}(t)$ does not depend on $\boldsymbol{\sigma}$, we have
$$C_{i,J}(t) = \mathbb{E}[C_{i,J} \mid \mathcal{F}(t)] = \mathbb{E}\big[ \mathbb{E}[C_{i,J} \mid \boldsymbol{\phi}, \boldsymbol{\sigma}, \mathcal{F}(t)] \mid \mathcal{F}(t) \big] = \mathbb{E}\Big[ \mathbb{E}\big\{ \mathbb{E}[C_{i,J} \mid \boldsymbol{\phi}, \boldsymbol{\sigma}, \mathcal{F}(t)] \mid \boldsymbol{\sigma}, \mathcal{F}(t) \big\} \,\Big|\, \mathcal{F}(t) \Big] = \mathbb{E}\left[ \mathbb{E}\left( C_{i,t-i} \prod_{j=t-i}^{J-1} \phi_j^{-1} \,\middle|\, \boldsymbol{\sigma}, \mathcal{F}(t) \right) \,\middle|\, \mathcal{F}(t) \right] = \mathbb{E}[\bar{C}_{i,J}(t) \mid \mathcal{F}(t)] = \bar{C}_{i,J}(t). \qquad \square$$
Proposition 1.
The msep in the marginalized model is equal to the posterior expectation of the msep in the conditional model, i.e.,
$$\mathrm{msep}_{\mathrm{CDR}(t+1) \mid \mathcal{F}(t)}(0) = \mathrm{Var}\left( \sum_{i=1}^{I} C_{i,J}(t+1) \,\middle|\, \mathcal{F}(t) \right) = \mathbb{E}\big[ \mathrm{msep}_{\overline{\mathrm{CDR}}(t+1) \mid \boldsymbol{\sigma}, \mathcal{F}(t)}(0) \,\big|\, \mathcal{F}(t) \big]. \quad (45)$$
Proof. 
From the law of total variance we have that
$$\mathrm{Var}\left( \sum_{i=1}^{I} C_{i,J}(t+1) \,\middle|\, \mathcal{F}(t) \right) = \mathrm{Var}\left( \mathbb{E}\left[ \sum_{i=1}^{I} C_{i,J}(t+1) \,\middle|\, \mathcal{F}(t), \boldsymbol{\sigma} \right] \,\middle|\, \mathcal{F}(t) \right) + \mathbb{E}\left[ \mathrm{Var}\left( \sum_{i=1}^{I} C_{i,J}(t+1) \,\middle|\, \mathcal{F}(t), \boldsymbol{\sigma} \right) \,\middle|\, \mathcal{F}(t) \right] = \mathbb{E}\left[ \mathrm{Var}\left( \sum_{i=1}^{I} C_{i,J}(t+1) \,\middle|\, \mathcal{F}(t), \boldsymbol{\sigma} \right) \,\middle|\, \mathcal{F}(t) \right],$$
where the last equality follows from Lemma 1 and the fact that $\mathbb{E}[\bar{C}_{i,J}(t+1) \mid \mathcal{F}(t), \boldsymbol{\sigma}] = \bar{C}_{i,J}(t)$ is independent of $\boldsymbol{\sigma}$. ☐
Remark 12.
Following the conditions required for finiteness of the conditional msep, in the unconditional case one can see that $\mathrm{msep}_{\mathrm{CDR}(t+1) \mid \mathcal{F}(t)}(0) < \infty$ whenever $\sum_{k=1}^{t-j-1} C_{k,j} > d_j^2$ (for each development year j). Furthermore, we note that this condition can be controlled during the model specification, i.e., the range of the $\sigma_j^2$ is chosen such that all posteriors are well-defined.

5.3. Statistical Model of PY Risk in the SST

Note that the distributional models derived in Section 5.1 and Section 5.2 are rather complex. To maintain some degree of tractability, the overall PY uncertainty distribution is usually approximated by a log-normal distribution via a moment matching procedure.

5.3.1. Conditional PY Model

As discussed in Section 4.3, when modelling the risk of PY claims we work with the random variables $\bar{Z}_{PY}$, defined in Equation (29). Due to their relationship with the conditional CDR (see Equations (22) and (23)) and the results discussed in Section 5.1, we can use the derived properties of these random variables to construct the model for $\bar{Z}_{PY}$.
The conditional mean (see Equations (22), (23) and (41)) and variance (see Equations (25) and (44)) of the random variable $\bar{Z}_{PY}$ are as follows:
$$\mathbb{E}[\bar{Z}_{PY}\,|\,\sigma,\mathcal{F}(t)] = \bar{R}(t),$$
$$\mathrm{Var}(\bar{Z}_{PY}\,|\,\sigma,\mathcal{F}(t)) = \mathrm{msep}_{\overline{\mathrm{CDR}}(t+1)\,|\,\sigma,\mathcal{F}(t)}(0).$$
Given mean and variance, we make the following approximation, also proposed in the Swiss Solvency Test (see (FINMA 2007, sct. 4.4.10)).
Model Assumptions 2
(Conditional log-normal approximation). We assume that
$$\bar{Z}_{PY}\,|\,\sigma,\mathcal{F}(t) \sim \mathrm{LN}\big(\bar{\mu}_{PY}, \bar{\sigma}_{PY}^2\big),$$
with $\bar{\sigma}_{PY}^2 = \log\left(\frac{\mathrm{msep}_{\overline{\mathrm{CDR}}(t+1)\,|\,\sigma,\mathcal{F}(t)}(0)}{\bar{R}(t)^2} + 1\right)$ and $\bar{\mu}_{PY} = \log\bar{R}(t) - \frac{\bar{\sigma}_{PY}^2}{2}$.
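In code, the moment matching behind Model Assumptions 2 (and, later, Model Assumptions 3) amounts to two lines. The sketch below is our own helper, with illustrative names; it returns the log-normal parameters from a given mean (here $\bar{R}(t)$) and variance (here the conditional msep).

```python
import math

def lognormal_moment_match(mean, variance):
    """Parameters (mu, sigma2) of a log-normal LN(mu, sigma2) matching
    the given mean and variance."""
    sigma2 = math.log(variance / mean**2 + 1.0)
    mu = math.log(mean) - sigma2 / 2.0
    return mu, sigma2

# e.g., mu_PY, sigma2_PY = lognormal_moment_match(R_bar_t, msep_conditional)
```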
Although the distribution of $\bar{Z}_{PY}\,|\,\sigma,\mathcal{F}(t)$ under Model Assumptions 1 cannot be described analytically, it is simple to simulate from it. To test the approximation of Model Assumptions 2 we simulate its distribution under the gamma-gamma Bayesian CL model (with fixed $\sigma$) and compare it against the proposed log-normal approximation. For the hyper-parameters presented in Table 2 (and calculated in Section 8) the quantile-quantile plot of the approximation is presented in Figure 1. For all the LoBs we see that the log-normal distribution is a sensible approximation to the original model assumptions. Note that although the parameters used for the comparison are based on the marginalized model, Figure 5 and Figure 6 show that they are “representative” values for the distributions of $\bar{\mu}_{PY}$ and $\bar{\sigma}_{PY}$.

5.3.2. Marginalized PY Model

As an alternative to the conditional Model Assumptions 2, we use the moments of $Z_{PY}\,|\,\mathcal{F}(t)$ calculated in Lemma 1 and Proposition 1 and then approximate its distribution. Note that due to the intractability of the distribution of $\sigma\,|\,\mathcal{F}(t)$, the variance term defined in Equation (45) can only be calculated numerically, for example via MCMC.
Model Assumptions 3
(Marginalized log-normal approximation). We assume that
$$Z_{PY}\,|\,\mathcal{F}(t) \sim \mathrm{LN}\big(\mu_{PY}, \sigma_{PY}^2\big),$$
with $\sigma_{PY}^2 = \log\left(\frac{\mathrm{msep}_{\mathrm{CDR}(t+1)\,|\,\mathcal{F}(t)}(0)}{\bar{R}(t)^2} + 1\right)$ and $\mu_{PY} = \log\bar{R}(t) - \frac{\sigma_{PY}^2}{2}$.
The same comparison based on the quantile-quantile plot of Figure 1 can be performed for the marginalized model and the results are presented in Figure 2. Once again, the log-normal model presents a viable alternative to the originally postulated gamma-gamma Bayesian CL model, even though for Motor Hull, Property and Others the right tail of the log-normal distribution is slightly heavier.

6. Modelling of Individual LoBs CY Claims

Model Assumptions 1 do not assume any specific distribution for $\mathbb{E}[C_{t+1,J}\,|\,\mathcal{F}(t+1)]$, the CY claims. These claims are treated differently from PY claims in the Swiss Solvency Test, and the models used for them are explained in Section 6.1 and Section 6.2, below. Throughout this section, we denote by $\lambda_{CY} = \lambda_{CY,s} + \lambda_{CY,l}$ the expected number of CY claims over the next year, which is the sum of the expected number of small CY claims, $\lambda_{CY,s}$, and the expected number of large CY claims, $\lambda_{CY,l}$.

6.1. Modelling of Small CY Claims

As mentioned in the SST Technical Document (FINMA 2007, sct. 4.4.7), the SST does not make any explicit assumption about the distribution of individual claims; instead, the annual claims expenses are only represented with their expected value and variance. More precisely, in (FINMA 2007, sct. 8.4.5.2) the distribution of the premium risk $Z_{CY,s}$ is assumed to be such that
$$\mathrm{CoVa}^2(Z_{CY,s}\,|\,\mathcal{F}(t)) = a_1 + \frac{a_2 + 1}{\lambda_{CY,s}},$$
where the constants a 1 and a 2 are provided by the regulatory authority (under the names of parameter uncertainty and random fluctuation, respectively). Their values for the 2015 solvency test are found in FINMA (2016). In order to fully specify the model for CY small claims one also needs to decide on the mean of the variable Z C Y , s | F ( t ) , but we postpone a detailed discussion on this point until Section 8.2, where we also present the value of λ C Y , s .
Model Assumptions 4
(Distribution of CY small claims). For known constants $a_1, a_2, \lambda_{CY,s} > 0$ and $\mathbb{E}[Z_{CY,s}\,|\,\mathcal{F}(t)]$ we set
$$Z_{CY,s}\,|\,\mathcal{F}(t) \sim \mathrm{LN}\big(\mu_{CY,s}, \sigma_{CY,s}^2\big),$$
with $\sigma_{CY,s}^2 = \log\left(a_1 + \frac{a_2 + 1}{\lambda_{CY,s}} + 1\right)$ and $\mu_{CY,s} = \log\big(\mathbb{E}[Z_{CY,s}\,|\,\mathcal{F}(t)]\big) - \frac{\sigma_{CY,s}^2}{2}$.

6.2. Modelling of Large CY Claims

In the SST (see (FINMA 2007, sct. 4.4.8)), large CY claims are split into two groups. The first group consists of claims triggered by the same market-wide event (a hailstorm, for example), which generates many simultaneous (small) claims. These types of claims are likely to affect all market participants and are called “cumulated claims”. The second group encompasses individual claims with a large claim amount, such as, in the example of (FINMA 2007, sct. 4.4.8), a fire in a factory building.
For each risk trigger, CY large claims are required to be modelled as a compound Poisson random variable with i.i.d. Pareto severities, i.e.,
$$Z_{CY,l} = \sum_{k=1}^{N} Y_k,$$
where $N \sim \mathrm{Pois}(\lambda)$ is the number of large claims in the LoB under consideration and the $Y_k \overset{\text{i.i.d.}}{\sim} \mathrm{Pareto}(\beta, \alpha_\beta)$ model the severity of large claims. Here we denote by $X \sim \mathrm{Pareto}(\beta, \alpha)$ a random variable with density $f(x) = \frac{\alpha\,\beta^\alpha}{x^{\alpha+1}}$, for $x \ge \beta$. It is assumed in the SST that large claims are i.i.d. within the same risk trigger and also between different risk triggers, and independent of all $Z_{PY}$ and $Z_{CY,s}$.
As a notational remark, if $Z$ follows a compound Poisson–Pareto model we write, as a shorthand notation, $Z \sim \mathrm{CP\text{-}P}(\lambda, \beta, \alpha)$, with the same parameter interpretation as in Equation (49).

6.2.1. SST Model for Cumulated Claims

In this section we discuss the modelling of cumulated claims (those triggered by a market-wide event), which are modelled as an event that impacts the whole market and is then scaled down to an individual insurance company through its market share. In particular, we present the modelling approach used for (1) the Motor Hull LoB, due to hail events, and (2) the Workers Compensation (UVG) LoB, due to a market-wide large accident.
In both cases market-wide parameters for a compound Poisson model with Pareto severities have been determined by the regulator (based on a large claims data set). The aggregated market-wide loss is given by
$$Z_{mkt} = \sum_{k=1}^{N_{mkt}} Y_{k,mkt} \sim \mathrm{CP\text{-}P}(\lambda_{mkt}, \beta_{mkt}, \alpha_{mkt}),$$
where $\mathrm{CP\text{-}P}(\lambda_{mkt}, \beta_{mkt}, \alpha_{mkt})$ denotes a compound distribution with frequency given by $\mathrm{Pois}(\lambda_{mkt})$ and severity given by $\mathrm{Pareto}(\beta_{mkt}, \alpha_{mkt})$. The corresponding market-wide parameter values are found in FINMA (2016).
Denoting by $\beta$ the company's threshold above which losses are classified as large, and by $m$ its market share in the $\ell$-th LoB, to be consistent with the market-wide model the company should model market-wide large events as events above the threshold
$$\beta^* = \frac{\beta}{m}.$$
Then, the market-wide total loss (viewed from the specific company in consideration) is defined as
$$Z^* = \sum_{k=1}^{N^*} Y_k^* \sim \mathrm{CP\text{-}P}(\lambda^*, \beta^*, \alpha_{mkt}),$$
from which it is easy to see that the only unknown parameter is $\lambda^*$, since in the SST the Pareto parameter $\alpha_{mkt}$ is kept the same. This frequency parameter is chosen such that the company's view of the market-wide events is equivalent to the suggested market-wide process. In other words, $\lambda_{mkt} = \mathbb{P}[Y_k^* > \beta_{mkt}]\,\lambda^*$, hence
$$\lambda^* = \lambda_{mkt}\left(\frac{\beta/m}{\beta_{mkt}}\right)^{-\alpha_{mkt}}.$$
Therefore, from the company’s point of view, its own large claims are modelled as
$$Z_{comp} \sim \mathrm{CP\text{-}P}(\lambda^*, \beta, \alpha_{mkt}).$$
Following the SST Technical Document FINMA (2007), an upper bound γ (provided by the regulator) is included in each Pareto random variable within the random sum. In other words, the final distribution of the company’s large cumulated claims is given by
$$\tilde{Z} = \sum_{k=1}^{N^*} \tilde{Y}_k \sim \mathrm{CP\text{-}P}(\lambda^*, \beta, \alpha_{mkt}, \gamma),$$
where $\tilde{Y}_k \sim \mathrm{Pareto}(\beta, \alpha_{mkt}, \gamma)$, a Pareto distribution defined on $[\beta, \gamma]$ with tail index $\alpha_{mkt}$.
For efficiency purposes, this distribution is approximated by a single Pareto, with the same mean. This leads us to the following model assumptions.
Model Assumptions 5
(Marginal distribution of cumulated claims). For $\alpha_{mkt}$, $\beta_{mkt}$ and $\gamma$ provided by the regulator in FINMA (2016), $\beta \in \{1, 5\}$ and $m \in (0,1)$,
$$Z_{CY,l} \sim \mathrm{Pareto}\left(\lambda^*\,\frac{\beta^{\alpha_{mkt}}}{1-(\beta/\gamma)^{\alpha_{mkt}}}\left(\frac{1}{\beta^{\alpha_{mkt}-1}} - \frac{1}{\gamma^{\alpha_{mkt}-1}}\right),\ \alpha_{mkt}\right),$$
where λ * is defined in Equation (50).
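As a numerical illustration of this construction, the sketch below computes $\lambda^*$ from Equation (50) and the threshold of the mean-matched single Pareto of Model Assumptions 5. The helper names are ours and the inputs are placeholders for the regulatory values in FINMA (2016).

```python
def lambda_star(lam_mkt, beta_mkt, alpha_mkt, beta, m):
    """Company-view frequency of market-wide events above beta/m, Equation (50)."""
    return lam_mkt * ((beta / m) / beta_mkt) ** (-alpha_mkt)

def matched_pareto_threshold(lam, beta, alpha, gamma):
    """Threshold of the single Pareto(theta, alpha) whose mean equals the mean
    of the truncated compound Poisson-Pareto CP-P(lam, beta, alpha, gamma)."""
    return (lam * beta**alpha / (1.0 - (beta / gamma) ** alpha)
            * (beta ** (1.0 - alpha) - gamma ** (1.0 - alpha)))
```

The same helper applies verbatim to the individual large claims of Section 6.2.2, with $\lambda_\beta$ and $\alpha_\beta$ in place of $\lambda^*$ and $\alpha_{mkt}$.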
Remark 13.
The reader should note that for large CY claims no parameter uncertainty is considered: $\lambda_{mkt}$, $\alpha_{mkt}$ and $\gamma$ are given by the regulator, the market share $m$ can be perfectly calculated, and $\beta$ is chosen by the company.

6.2.2. SST Model for Individual Claims

For individual large events, the SST provides $p_1$, the probability of observing losses larger than CHF 1 million, and standard values for $\alpha_\beta$, for $\beta = 1$ and $\beta = 5$ (see FINMA (2016)). Since the probability of large claims provided by the SST is based on a lower threshold of CHF 1 million, a thinning of the compound Poisson process has to be performed if the company decides to use $\beta = 5$.
Following the same procedure presented in Section 6.2.1 we can see that the company’s large individual claims are modelled as
$$Z_{comp} \sim \mathrm{CP\text{-}P}(\lambda_\beta, \beta, \alpha_\beta),$$
with an expected number of claims larger than β equal to
$$\lambda_\beta = \lambda_{CY,l} = p_1\,\lambda_{CY}\left(\frac{\beta}{1}\right)^{-\alpha_\beta},$$
where $\lambda_{CY}$ denotes the expected total number of CY claims in the $\ell$-th LoB. Similarly, the regulator also requires an upper bound on the Pareto random variables, leading to the following distribution of large losses
$$\tilde{Z} \sim \mathrm{CP\text{-}P}(\lambda_\beta, \beta, \alpha_\beta, \gamma).$$
As in Section 6.2.1, the distribution of Z C Y , l | F ( t ) is approximated by a single Pareto, with the same mean and Pareto index α β .
Model Assumptions 6
(Marginal distribution of large individual claims). For $\alpha_\beta$, $p_1$ and $\gamma$ provided by the regulator in FINMA (2016), $\beta \in \{1, 5\}$ and $\lambda_{CY} > 0$,
$$Z_{CY,l}\,|\,\mathcal{F}(t) \sim \mathrm{Pareto}\left(\lambda_\beta\,\frac{\beta^{\alpha_\beta}}{1-(\beta/\gamma)^{\alpha_\beta}}\left(\frac{1}{\beta^{\alpha_\beta-1}} - \frac{1}{\gamma^{\alpha_\beta-1}}\right),\ \alpha_\beta\right),$$
with λ β defined in Equation (51).

7. Joint Distribution of PY and CY Claims

Although the SST does not assume any parametric form for the joint distribution of $Z\,|\,\mathcal{F}(t)$ or $\bar{Z}\,|\,\mathcal{F}(t)$ (defined in Equations (30) and (31), respectively), it is required that a pre-specified correlation matrix $\Lambda$ is used (see FINMA (2016)). In this section we discuss how to use the conditional and marginalized models to define a joint distribution satisfying this correlation assumption.
It is important to notice, though, that the SST correlation matrix may not be attainable for some joint distributions, as discussed in Appendix B in the case of log-normal marginals (in Devroye and Letac (2015) the authors discuss a similar problem). Let us denote by $\mathcal{S}_n$ the set of all $n \times n$ symmetric, positive semi-definite matrices with diagonal terms equal to 1, and by $S(C) = \mathrm{Corr}(U)$ the correlation matrix of a random vector $U \sim C$ with elements $U_i \in [0,1]$. The question asked in Devroye and Letac (2015) is: given $S \in \mathcal{S}_n$, does there exist a copula $C$ such that $S(C) = S$? The answer is yes if $n \le 9$, and the authors postulate that for $n \ge 10$ there exists $S \in \mathcal{S}_n$ such that no copula $C$ satisfies $S(C) = S$.
It should be noted that, since in the SST the CY large claims are assumed to be independent from all the other risks, the correlation matrix of ( Z P Y , Z C Y , s , Z C Y , l ) | F ( t ) is essentially a correlation matrix between ( Z P Y , Z C Y , s ) | F ( t ) and the same is true also for the conditional model.
Regardless of whether a conditional or a marginalized model is assumed, SST's correlation matrix $\Lambda$ should be such that, for $i, j = 1, \ldots, 2L + P$ (recall that $L$ is the number of LoBs and $P$ the number of perils),
$$\Lambda_{i,j} = \mathrm{Corr}(Z_i, Z_j\,|\,\mathcal{F}(t)) = \mathrm{Corr}(\bar{Z}_i, \bar{Z}_j\,|\,\mathcal{F}(t)).$$
Remark 14.
In the conditional model we need to “integrate out” the parameter uncertainty, otherwise the (conditional) correlation would be dependent on an unknown parameter and could not be matched with the numbers provided by the SST.

7.1. Conditional Joint Model

Under Model Assumptions 2, 4, 5 and 6 our interest lies in modelling the joint behaviour of the vector $\bar{Z}\,|\,\sigma,\mathcal{F}(t)$. Under Model Assumptions 1 it can be shown that the required conditional independence between $Z_{CY,l}$ and $(Z_{PY}, Z_{CY,s})$ given $\mathcal{F}(t)$ is equivalent to the conditional independence between $Z_{CY,l}$ and $(Z_{PY}, Z_{CY,s})$ given $\mathcal{F}(t)$ and $\sigma$.
Moreover, since all the marginal conditional distributions of the prior year claims and small current year claims are assumed to be log-normal, following Equations (30) and (31), the notation can be further simplified to
$$\bar{Z}_i\,|\,\sigma,\mathcal{F}(t) \sim \mathrm{LN}\big(\bar{m}_i(\sigma), \bar{V}_i(\sigma)\big), \quad\text{for } i = 1, \ldots, 2L,$$
with $\bar{m}_i(\sigma)$ and $\bar{V}_i(\sigma)$ defined in Model Assumptions 2 and 4. For example, for $i = L+1$, $\bar{m}_i(\sigma) = \mu_{CY,s}^{(1)}$, defined in Model Assumptions 4.
We are now ready to define the joint conditional model to be used.
Model Assumptions 7
(Conditional joint model). Based on Model Assumptions 2 and 4 we link the marginals of the conditional model through a Gaussian copula with correlation matrix Ω ¯ , with elements ( Ω ¯ ) i , j = ω ¯ i , j . More formally, given F ( t ) and σ , the joint distribution of Z ¯ is given by
$$F_{\bar{Z}}(\bar{z}_1, \ldots, \bar{z}_{2L};\, \bar{\Omega}\,|\,\mathcal{F}(t), \sigma) = C\big(F_{\bar{Z}_1}(\bar{z}_1\,|\,\mathcal{F}(t), \sigma), \ldots, F_{\bar{Z}_{2L}}(\bar{z}_{2L}\,|\,\mathcal{F}(t), \sigma);\, \bar{\Omega}\big),$$
where F Z ¯ i ( · | F ( t ) , σ ) denotes the conditional distribution of Z ¯ i | F ( t ) , σ defined in Equation (52) and C ( · ; Ω ¯ ) is the Gaussian copula with correlation matrix denoted by Ω ¯ .
Remark 15.
In this section the parameter matrix $\bar{\Omega}$ should be understood as a deterministic variable, differently from $\sigma$ and $\phi$. For this reason we do not include it on the right hand side of the conditioning bar. Instead, whenever $\bar{\Omega}$ needs to be explicitly written, we include it on the left hand side of the bar, separated from the function (or functional, for expectations) arguments by a semicolon.
In order to match SST’s correlation matrix Λ , under Model Assumptions 1 and 4, the following equation needs to be solved with respect to Ω ¯ :
$$\Lambda_{i,j} = \mathrm{Corr}(\bar{Z}_i, \bar{Z}_j;\, \bar{\Omega}\,|\,\mathcal{F}(t)).$$
To compute the right hand side of the equation above we first notice that
$$\mathrm{Cov}(\bar{Z}_i, \bar{Z}_j;\, \bar{\Omega}\,|\,\mathcal{F}(t)) = \mathbb{E}[\bar{Z}_i \bar{Z}_j;\, \bar{\Omega}\,|\,\mathcal{F}(t)] - \mathbb{E}[\bar{Z}_i\,|\,\mathcal{F}(t)]\,\mathbb{E}[\bar{Z}_j\,|\,\mathcal{F}(t)],$$
where, from Equation (46) and the discussion in Section 6.1,
$$\mathbb{E}[\bar{Z}_i\,|\,\mathcal{F}(t)] = \mathbb{E}\big[\mathbb{E}[\bar{Z}_i\,|\,\mathcal{F}(t),\sigma]\,\big|\,\mathcal{F}(t)\big] = \mathbb{E}\Big[e^{\bar{m}_i + \bar{V}_i^2/2}\,\Big|\,\mathcal{F}(t)\Big] = \begin{cases} \bar{R}^{(i)}(t), & \text{if } 1 \le i \le L, \\ \mathbb{E}\big[Z_{CY,s}^{(i-L)}\,\big|\,\mathcal{F}(t)\big], & \text{if } L+1 \le i \le 2L, \end{cases}$$
and from Equation (A3), Appendix B,
$$\mathbb{E}[\bar{Z}_i \bar{Z}_j;\, \bar{\Omega}\,|\,\mathcal{F}(t)] = \mathbb{E}\big[\mathbb{E}[\bar{Z}_i \bar{Z}_j;\, \bar{\Omega}\,|\,\mathcal{F}(t), \sigma]\,\big|\,\mathcal{F}(t)\big] = \mathbb{E}\left[\exp\left\{\bar{m}_i + \bar{m}_j + \frac{\bar{V}_i^2 + 2\bar{V}_i\,\bar{\omega}_{i,j}\,\bar{V}_j + \bar{V}_j^2}{2}\right\}\,\middle|\,\mathcal{F}(t)\right].$$
Therefore, to satisfy Equation (53), $\bar{\Omega}_{i,j}$ needs to be chosen such that the following implicit relationship (which can be solved through any univariate root search algorithm) holds:
$$\Lambda_{i,j}\sqrt{\mathrm{Var}(\bar{Z}_i\,|\,\mathcal{F}(t))\,\mathrm{Var}(\bar{Z}_j\,|\,\mathcal{F}(t))} + \mathbb{E}[\bar{Z}_i\,|\,\mathcal{F}(t)]\,\mathbb{E}[\bar{Z}_j\,|\,\mathcal{F}(t)] - \mathbb{E}[\bar{Z}_i \bar{Z}_j;\, \bar{\Omega}\,|\,\mathcal{F}(t)] = 0.$$
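The univariate root search can be sketched as follows, assuming posterior samples of the conditional log-normal parameters $(\bar{m}_i, \bar{V}_i, \bar{m}_j, \bar{V}_j)$ are available as NumPy arrays (e.g., from the MCMC output of Section 8.3); the function names and the search bracket are ours.

```python
import numpy as np
from scipy.optimize import brentq

def cross_moment(omega, m_i, V_i, m_j, V_j):
    """Posterior average of the conditional cross moment, cf. Equation (A3):
    E[exp(m_i + m_j + (V_i^2 + 2*omega*V_i*V_j + V_j^2)/2) | F(t)]."""
    return np.mean(np.exp(m_i + m_j
                          + 0.5 * (V_i**2 + 2.0 * omega * V_i * V_j + V_j**2)))

def solve_omega(Lam_ij, mean_i, mean_j, var_i, var_j, m_i, V_i, m_j, V_j):
    """Solve the implicit relationship above for the copula parameter."""
    target = Lam_ij * np.sqrt(var_i * var_j) + mean_i * mean_j
    # brentq assumes the target correlation is attainable within (-1, 1);
    # see Appendix B for a discussion of attainability
    return brentq(lambda w: cross_moment(w, m_i, V_i, m_j, V_j) - target,
                  -0.999, 0.999)
```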

7.2. Marginalized Joint Model

Similarly to Section 7.1, in this section we will fully characterize the joint distribution of Z | F ( t ) under Model Assumptions 3, 4, 5 and 6.
From these assumptions we define the following notation:
$$Z_i\,|\,\mathcal{F}(t) \sim \mathrm{LN}(m_i, V_i), \quad\text{for } i = 1, \ldots, 2L.$$
Model Assumptions 8
(Marginalized joint model). Based on Model Assumptions 3 and 4 we link the marginal distributions of the marginalized model through a Gaussian copula with correlation matrix Ω , with elements ( Ω ) i , j = ω i , j . More formally, given F ( t ) , the joint distribution of Z is given by
$$F_Z(z_1, \ldots, z_{2L};\, \Omega\,|\,\mathcal{F}(t)) = C\big(F_{Z_1}(z_1\,|\,\mathcal{F}(t)), \ldots, F_{Z_{2L}}(z_{2L}\,|\,\mathcal{F}(t));\, \Omega\big),$$
where $F_{Z_i}$ denotes the conditional distribution of $Z_i\,|\,\mathcal{F}(t)$ defined in Equation (54) and $C(\cdot\,;\Omega)$ is the Gaussian copula with correlation matrix $\Omega$.
In order to match SST’s correlation matrix, in the joint marginalized model the Gaussian copula correlation Ω is chosen such that (see Equation (A4), Appendix B) it satisfies
$$\Lambda_{i,j} = \frac{\exp\{V_i\,\omega_{i,j}\,V_j\} - 1}{\big[(e^{V_i^2}-1)(e^{V_j^2}-1)\big]^{1/2}}.$$
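Unlike the conditional case, this relationship can be inverted in closed form. A minimal sketch (our own helper; only valid when the target correlation is attainable, so that the result lies in $[-1, 1]$, see Appendix B):

```python
import math

def omega_from_lambda(Lam_ij, V_i, V_j):
    """Invert the log-normal correlation formula for the copula parameter."""
    return math.log(1.0 + Lam_ij * math.sqrt(
        (math.exp(V_i**2) - 1.0) * (math.exp(V_j**2) - 1.0))) / (V_i * V_j)
```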

8. Data Description and Parameter Estimation

In this section we discuss how we set up the parameters in the models discussed so far, starting from the balance sheet of a fictitious insurance company. Using this balance sheet and the information contained in the SST we generate realistic claims triangles (see Appendix C) and, based on them, we show how to perform Bayesian inference for the unknown parameters. Our starting point is the fictitious balance sheet shown in Table 1, which is intended to represent a large insurance company in Switzerland (for this reason all monetary units should be understood as millions of Swiss Francs (CHF)).

8.1. Hyperparameters for ϕ j

Based on SST’s standard runoff pattern (see Table 3) we first compute the implied CL factors f j ( ) as follows (once again we suppress the index of the LoB). If F j is the deterministic cumulative claims payment pattern for development year j we define
f j = F j + 1 F j , for j = 0 , , J 1 .
These values can then be used as hyperparameters in the prior for $\phi_j$ (see Model Assumptions 1, item (c)).
To generate data from the model (see Appendix C) we fix $\phi_j = 1/f_j$ and $\sigma_j = s_j/f_j$, where $s_j$ is Mack's standard deviation estimate calculated from exogenous triangles. The values of $s_j$ are presented in Table 4. That is, $\{F_j\}_j$ should be understood as a (deterministic) prior payment pattern.

8.2. Current Year Small and Large Claims

To calculate the expected number of CY claims $\lambda_{CY}$ defined in Section 6, we first set our prior belief for the claims ratio of each LoB, i.e., how much of the premium in that LoB is used to cover incoming claims (the rest covers the business's costs). This information is available in Table 5, along with the average claim amount. Based on these values the expected number of claims is defined as
$$\lambda_{CY} = \frac{\text{Claims ratio} \times \text{Premium}}{\text{Average claim amount}}.$$
Given the expected number of CY claims $\lambda_{CY}$, this value is used to compute the expected number of individual large claims $\lambda_{CY,l}$, as in Equation (51). Using the fact that $\lambda_{CY,s} = \lambda_{CY} - \lambda_{CY,l}$ we calculate the coefficient of variation for small CY claims as given in Equation (48).
The last ingredient in Model Assumptions 4 is $\mathbb{E}[Z_{CY,s}\,|\,\mathcal{F}(t)]$, which is given by
$$\mathbb{E}[Z_{CY,s}\,|\,\mathcal{F}(t)] = \text{Claims ratio} \times \text{Premium} - \mathbb{E}[Z_{CY,l}\,|\,\mathcal{F}(t)],$$
and the expectation on the right hand side is given either in Model Assumptions 5 or Model Assumptions 6, depending on the LoB.
For the large claims from Model Assumptions 5 and 6 we assume the threshold for large claims, $\beta$, to be equal to 5 (millions of CHF). For the large cumulated claims we use the LoB's market share as given in Table 5. The resulting parameters can be found in Table 2. Note that these parameters are the same for both the marginalized and conditional models.

8.3. Parameter Estimation

In this section we discuss how to compute the posterior distributions of the variance parameters σ j in Model Assumptions 1, which are used to compute quantities such as the marginalized msep from Section 5.2.
In order to compute the posteriors of $\sigma_j$, we assume priors centred at Mack's (Mack 1993) CL standard deviation estimator normalized by the CL factor, both implied by the data. Formally,
$$\hat{\sigma}_j(t) = \frac{\sqrt{\hat{s}_j^2(t)}}{\hat{f}_j(t)}, \quad 0 \le j \le J-1,$$
where $\hat{s}_{J-1}^2(t) = \min\big\{\hat{s}_{J-3}^2(t),\ \hat{s}_{J-2}^2(t),\ \hat{s}_{J-2}^4(t)/\hat{s}_{J-3}^2(t)\big\} = \min\big\{\hat{s}_{J-3}^2(t),\ \hat{s}_{J-2}^4(t)/\hat{s}_{J-3}^2(t)\big\}$.
To generate samples from the posteriors we use a Metropolis-Hastings algorithm, with proposals given by a truncated Normal centred at the current point and with standard deviation equal to $10 \times d_j$. All chains are started at the CL estimate, and the upper limit of the prior is set to $d_j = k \times \hat{\sigma}_j(t)$ with $k = 5$. To be left with $N_{MCMC} = 1000$ samples from the posterior we ran the Markov chains for 12,500 iterations, discarding the first 20% as burn-in and keeping every 10th iteration of the remaining simulations.
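A minimal version of this sampler is sketched below, assuming a function `log_h` returning the logarithm of the unnormalized posterior $h(\sigma_j\,|\,\mathcal{F}(t))$ from Appendix A; the function and argument names are ours, and the proposal scale is left as a free parameter.

```python
import numpy as np
from scipy.stats import truncnorm

def mh_trunc_normal(log_h, x0, d_j, scale, n_iter=12500, burn=0.2, thin=10):
    """Metropolis-Hastings on (0, d_j) with truncated-normal proposals."""
    x, chain = x0, []
    for _ in range(n_iter):
        a, b = -x / scale, (d_j - x) / scale
        y = truncnorm.rvs(a, b, loc=x, scale=scale)
        # the proposal is asymmetric, so include the Hastings correction
        log_q_fwd = truncnorm.logpdf(y, a, b, loc=x, scale=scale)
        ay, by = -y / scale, (d_j - y) / scale
        log_q_bwd = truncnorm.logpdf(x, ay, by, loc=y, scale=scale)
        if np.log(np.random.rand()) < log_h(y) - log_h(x) + log_q_bwd - log_q_fwd:
            x = y
        chain.append(x)
    kept = chain[int(burn * n_iter):]     # discard burn-in, then thin
    return np.asarray(kept[::thin])       # 12,500 iterations -> 1000 samples
```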
Some of the results are presented in Figure 3, where one finds the unnormalized posteriors, the histogram of the MCMC outputs and a red dashed line indicating the CL variance estimate for three different LoBs: (a) MTPL, (b) Motor Hull and (c) Property. As expected, for unidimensional and unimodal densities the resulting estimates are highly accurate. It is also worth noting that the larger the development year $j$, the more diffuse the posterior, due to the diminishing amount of data available. In the limit, when $j = J-1$, the information available is not enough to estimate the variance parameter and, therefore, as can be seen from the posterior distribution derived in Equation (38), the posterior is the same as the prior.
Using the sample of size $N_{MCMC} = 1000$ mentioned above, the calculated parameters for the marginalized model are presented in Table 2. For the conditional model we use the same sample from the posterior and calculate one value of $\bar{\sigma}_{PY}$ and $\bar{\mu}_{PY}$ for each sampled value of $\sigma$. The resulting (transformed) samples are presented as histograms in Figure 5 and Figure 6 and, for comparison only, the relevant marginalized parameters are included as a red dashed line.

8.4. The Correlation Matrices

For the copula correlation matrices we follow the procedures outlined in Section 7.1 and Section 7.2. The resulting matrix for the marginalized model is found in Table 6. From FINMA (2016) it can be seen that the values in $\Omega_{PY,CY,s}$ are very similar to the ones in the standard $\Lambda_{PY,CY,s}$. Also, it is worth noting that, differently from SST's original correlation matrix, the block $\Omega_{PY,CY,s}$ is no longer symmetric, i.e., in order to have $\mathrm{Corr}(Z_{PY}^{(1)}, Z_{CY}^{(2)}\,|\,\mathcal{F}(t)) = \mathrm{Corr}(Z_{PY}^{(2)}, Z_{CY}^{(1)}\,|\,\mathcal{F}(t))$ the term $(1,2)$ of the matrix $\Omega_{PY,CY,s}$ is not equal to the term $(2,1)$ of the same matrix.
The results for the copula correlation Ω ¯ P Y , C Y , s follow the same patterns as Ω P Y , C Y , s and for this reason its values are omitted.

9. Details of the SMC Algorithm

9.1. Selection of Intermediate Sets

Recall that a key component of the proposed SMC sampler solution is to relax the rare conditioning events that constrain the target distribution into a sequence of increasingly difficult constraints. In this section we discuss how one can select the sequence of constraint relaxations in an adaptive manner.
For both the marginalized and the conditional models we use an adaptive strategy similar to Cérou et al. (2012) in order to select online (as the algorithm runs) the levels $B_1, \ldots, B_T$, as well as the total number of intermediate sets $T$. When levels are chosen adaptively, one of the main advantages of the proposed SMC algorithm is the ability to estimate, in one run, the company-wide value at risk, the expected shortfall, as well as the risk allocations.
Starting from $B_0 = 0$ (or $\bar{B}_0 = 0$ if the conditional model is being used), the idea consists of, at each algorithmic iteration $t \ge 1$, choosing the next level $B_t$ such that a percentage $p_0 \in (0,1)$ of the time-$(t-1)$ particles is above this level. More formally, we set $B_t$ to be the $1-p_0$ empirical quantile of the weighted sample $\{s_{t-1}^{(j)}, W_{t-1}^{(j)}\}_{j=1}^N$ or $\{\bar{s}_{t-1}^{(j)}, W_{t-1}^{(j)}\}_{j=1}^N$, where $s_{t-1}$ and $\bar{s}_{t-1}$ denote, respectively, the sum of the components of $z_{t-1}$ and $\bar{z}_{t-1}$. Therefore, at algorithmic time $t$ the level $B_t$ corresponds to an estimate of the $(1 - p_0^t)$-th quantile of the target distribution. In our examples we set $p_0 = 0.4$, $0.5$ and $0.7$, which induces the intermediate quantiles for the algorithm seen in Table 10. Note that, given a value of $p_0$, the number of levels in the algorithm is deterministic. For example, for $p_0 = 0.5$ there are 7 levels until the estimated quantile is above $99\%$.
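The adaptive level choice reduces to a weighted empirical quantile of the particle sums; a minimal sketch (our own helper, with illustrative names):

```python
import numpy as np

def next_level(s, W, p0):
    """Choose B_t as the (1 - p0) weighted quantile of the particle sums s,
    so that a fraction p0 of the weighted particles lies above B_t."""
    order = np.argsort(s)
    cdf = np.cumsum(W[order]) / np.sum(W)
    idx = np.searchsorted(cdf, 1.0 - p0)
    return s[order][min(idx, len(s) - 1)]
```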
An alternative approach to choosing the level sets is to use the classic normalizing constant estimator derived from the SMC sampler algorithm (see (Del Moral et al. 2006, sct. 3.2.1)). Using the notation from Section 3 we have that the normalizing constant Z t = P [ S > B t ] can be estimated as
$$\widehat{Z}_t = \widehat{Z}_{t-1}\sum_{j=1}^{N} W_{t-1}^{(j)}\,\widetilde{\alpha}_t^{(j)},$$
where $W_{t-1}$ are the normalized weights at time $t-1$ and $\widetilde{\alpha}_t$ the incremental weights at time $t$.
Similarly to our proposed estimate, in this alternative route one would choose $B_t$ such that $p_0 \times 100\%$ of the time-$(t-1)$ particles are above this level. Using the estimator in Equation (56) one could then stop the algorithm as soon as $\widehat{Z}_t < \alpha$. The main disadvantage of this approach is that, although $\widehat{Z}_t$ can be proven to be unbiased and asymptotically normally distributed when the number of particles $N \to \infty$ (see (Del Moral 2004, Propositions 7.4.1 and 9.4.1) and Pitt et al. (2012) for a proof in the special case of state-space models), one cannot guarantee $\widehat{Z}_t \in [0, 1]$. In our experiments the results based on this classic estimate were deemed unsatisfactory, as we observed finite-sample realizations of the normalizing constant estimate as large as 15.

9.2. Marginalized Model

9.2.1. The Forward Kernel

Similarly to (Targino et al. 2015, sct. 6.1) we propose a mutation kernel $K_t(z_{t-1}, z_t)$ such that the condition $\sum_{i=1}^{d} z_{t,i} > B_t$ is always satisfied. Due to the independence assumption for the CY large claims (the $P$ Pareto variables), we first independently mutate the Pareto coordinates, following their true (unconditional) marginals, and then mutate the other $2L$ variables.
First we split the vector into its log-normal and Pareto components, $z_t = (z_t', z_t'')$, where $z_t' = (z_{t,1}, \ldots, z_{t,2L})$ and $z_t'' = (z_{t,2L+1}, \ldots, z_{t,2L+P})$. Using this notation and denoting by $z_{t,-m}'$ the vector $z_t'$ without its $m$-th component, we use
$$K_t(z_{t-1}, z_t) = K_t(z_{t-1}, z_t'\,|\,z_t'')\,K_t(z_{t-1}, z_t'') = \frac{1}{2L}\sum_{m=1}^{2L} K_t^{(m)}(z_{t-1}, z_{t,-m}')\,K_t^{(m)}(z_{t-1}, z_{t,m}\,|\,z_{t,-m}', z_t'') \times \prod_{i=2L+1}^{2L+P} \mathrm{Pareto}(z_{t,i};\, \alpha_i, \beta_i),$$
where the kernel K t ( m ) ( z t 1 , · ) , which mutates all but the m-th dimension of z t 1 , consists of independent moves in each dimension, i.e.,
$$K_t^{(m)}(z_{t-1}, z_{t,-m}') = \prod_{\substack{i=1 \\ i \ne m}}^{2L} K_t^{(m,i)}(z_{t-1,i}, z_{t,i}).$$
Note that these moves are also independent of the P Pareto mutations.
Let us denote by $\{z_{t-1}^{(j)}, W_{t-1}^{(j)}\}_{j=1}^N$ the weighted sample approximating
$$\pi_{t-1}(z_{t-1}) = f_Z\big(z_{t-1}\,\big|\,z_{t-1} \in G_Z^{t-1}\big),$$
as defined in Equation (15). The components of the mutation kernel are then defined as
$$K_t^{(m,i)}(z_{t-1}, z_{t,i}) = \mathrm{LN}(z_{t,i};\, \widehat{\mu}_i, \widehat{\sigma}_i), \quad\text{for } i = 1, \ldots, 2L,\ i \ne m,$$
where $\widehat{\mu}_i$ and $\widehat{\sigma}_i^2$ are the empirical mean and variance of $\{z_{t-1}^{(j)}, W_{t-1}^{(j)}\}_{j=1}^N$, i.e., for $i = 1, \ldots, 2L$,
$$\widehat{\mu}_{t-1,i} = \sum_{j=1}^{N} W_{t-1}^{(j)}\,z_{t-1,i}^{(j)}, \qquad \widehat{\sigma}_{t-1,i}^2 = \sum_{j=1}^{N} W_{t-1}^{(j)}\big(z_{t-1,i}^{(j)}\big)^2 - \widehat{\mu}_{t-1,i}^2.$$
For the mutation of the remaining dimension $m$, to ensure that all the samples satisfy the condition $\sum_{i=1}^{d} z_{t,i} > B_t$, we proceed as follows. First we define
$$B_t^{z(m)} = \max\Big(0,\ B_t - \sum_{\substack{i=1 \\ i \ne m}}^{d} z_{t,i}\Big),$$
and then sample the last component $z_{t,m} \in [B_t^{z(m)}, +\infty)$ according to
$$K_t^{(m)}(z_{t-1}, z_{t,m}\,|\,z_{t,-m}) = \mathcal{TN}\big(z_{t,m};\, \widehat{\mu}_m, \widehat{\sigma}_m, B_t^{z(m)}, +\infty\big), \quad\text{for } m = 1, \ldots, 2L,$$
where $\mathcal{TN}(\cdot\,;\, \mu, \sigma, a, b)$ denotes the density of a Normal distribution with mean $\mu$ and variance $\sigma^2$ truncated to the support $[a, b]$.
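The constrained move for the $m$-th coordinate can be sketched as follows, using SciPy's truncated normal (the helper and its names are ours):

```python
import numpy as np
from scipy.stats import truncnorm

def mutate_mth(z, m, mu_hat, sigma_hat, B_t):
    """Redraw coordinate m from a normal truncated to [B_t^{z(m)}, +inf),
    so that the mutated particle satisfies sum(z) > B_t."""
    lower = max(0.0, B_t - (z.sum() - z[m]))         # B_t^{z(m)}
    a = (lower - mu_hat[m]) / sigma_hat[m]           # standardized lower bound
    z_new = z.copy()
    z_new[m] = truncnorm.rvs(a, np.inf, loc=mu_hat[m], scale=sigma_hat[m])
    return z_new
```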

9.2.2. The Backward Kernel

For the backward kernel we follow the discussion in Section 3.1.2 and use the (approximation to the) optimal kernel of Del Moral et al. (2006), given by Equation (11):
$$L_t(z_{t+1}, z_t) = \frac{\gamma_t(z_t)\,K_{t+1}(z_t, z_{t+1})}{\frac{1}{N}\sum_{j=1}^{N} w_t^{(j)}\,K_{t+1}\big(z_t^{(j)}, z_{t+1}\big)},$$
where $w_t^{(j)}$ denotes the unnormalized weights at time $t$ and the weighted sample $\{z_t^{(j)}, w_t^{(j)}\}_{j=1}^N$ targets the unnormalized density $\gamma_t(z_t)$. Proceeding in this way the unnormalized weights for the SMC sampler algorithm (see Algorithm 1) satisfy the following recursion:
$$w_t^{(j)} = w_{t-1}^{(j)}\,\frac{\gamma_t\big(z_t^{(j)}\big)}{\frac{1}{N}\sum_{k=1}^{N} w_{t-1}^{(k)}\,K_t\big(z_{t-1}^{(k)}, z_t^{(j)}\big)}.$$

9.2.3. The MCMC Move Kernel

To improve particle diversity after a resampling step (which is performed whenever the effective sample size drops below $N/2$) the following MCMC move kernel is applied to the particles.
As in (Targino et al. 2015, sct. 6.2) we propose a Gibbs-type update combined with a slice sampler (see Neal (2003)). For notational simplicity we suppress the dependence on $t$ in the vector $z_t$ and denote by $v^{*(m)} = (z_1^*, \ldots, z_m^*, z_{m+1}, \ldots, z_d)$ the vector whose first $m$ components have already been updated in the Gibbs scan. The full conditional for the $m$-th component of $z_t$ is given by
$$\pi_t\big(z_m^*\,\big|\,z_1^*, \ldots, z_{m-1}^*, z_{m+1}, \ldots, z_d\big) \propto \pi_t\big(v^{*(m)}\big) \propto f_Z\big(v^{*(m)}\big)\,\mathbb{1}_{G_Z^t}\big(v^{*(m)}\big),$$
which can be sampled from using a unidimensional slice sampler (see Neal (2003)).
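For completeness, a compact stepping-out/shrinkage slice sampler in the spirit of Neal (2003) is sketched below; this is a generic univariate version (not necessarily the exact implementation used here), to be called with the logarithm of the full conditional above, returning $-\infty$ outside the constrained set.

```python
import numpy as np

def slice_sample_1d(log_f, x0, w=1.0, max_steps=100):
    """One draw from a density proportional to exp(log_f), via slice sampling."""
    log_y = log_f(x0) + np.log(np.random.rand())   # auxiliary slice level
    u = np.random.rand()                           # stepping-out interval
    L, R = x0 - w * u, x0 + w * (1.0 - u)
    k = max_steps
    while k > 0 and log_f(L) > log_y:
        L -= w; k -= 1
    k = max_steps
    while k > 0 and log_f(R) > log_y:
        R += w; k -= 1
    while True:                                    # shrinkage
        x1 = L + (R - L) * np.random.rand()
        if log_f(x1) > log_y:
            return x1
        if x1 < x0:
            L = x1
        else:
            R = x1
```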

9.3. Conditional Model

Following the discussion in Section 3.3.2 we use Equation (19) as an approximation to the unknown density $f_{\bar{Z}}(\bar{z})$. For our simulations $M = 5$ samples of the unknown parameter $\theta$ are used, where
$$\theta = \big(\sigma^{(1)}, \ldots, \sigma^{(L)}\big),$$
and each vector $\sigma^{(\ell)} = (\sigma_1^{(\ell)}, \ldots, \sigma_J^{(\ell)})$ contains all the unknown variance parameters for the $\ell$-th LoB. Therefore, $\vartheta = (\theta^{(1)}, \ldots, \theta^{(M)})$, and it should be noted that these superscripts have a different interpretation from those in $\sigma_j^{(\ell)}$.
As the parameter estimation step described in Section 8.3 is independent of the allocation process, we assume $N_{MCMC}$ samples of each unknown parameter vector $\sigma$ have already been created. Therefore, to sample $\bar{z} \sim f_{\bar{Z}}(\bar{z})$ we first sample an index $n \sim \mathcal{U}(\{1, \ldots, N_{MCMC}\})$ and then $\bar{z} \sim f_{\bar{Z}}(\bar{z}\,|\,\theta^{(n)})$.

9.3.1. The Forward Kernel

The forward kernel used for the conditional model follows the same structure as the one used in the marginalized model and described in Section 9.2.1: first we sample the P independent Pareto variables (with the same distribution as in the marginalized case) and then the remaining 2 L variables. More precisely,
$$\bar{K}_t^{(m,i)}(\bar{z}_{t-1}, \bar{z}_{t,i}\,|\,\vartheta_t) = \bar{K}_t^{(m,i)}(\bar{z}_{t-1}, \bar{z}_{t,i}) = K_t^{(m,i)}(\bar{z}_{t-1}, \bar{z}_{t,i}),$$
where the last term is defined in Equation (57) and $\widehat{\mu}_i$ and $\widehat{\sigma}_i$ are now the empirical mean and variance of $\{\bar{z}_{t-1}^{(j)}, W_{t-1}^{(j)}\}_{j=1}^N$. Likewise,
$$\bar{K}_t^{(m)}(\bar{z}_{t-1}, \bar{z}_{t,m}\,|\,\bar{z}_{t,-m}, \vartheta_t) = \bar{K}_t^{(m)}(\bar{z}_{t-1}, \bar{z}_{t,m}\,|\,\bar{z}_{t,-m}) = K_t^{(m)}(\bar{z}_{t-1}, \bar{z}_{t,m}\,|\,\bar{z}_{t,-m}),$$
with the last term defined in Equation (58). As samples from $f_\vartheta(\vartheta)$ have already been generated through MCMC, the mutation kernel in the extended space, $K_t^y(y_{t-1}, y_t)$, is completely characterized.

9.3.2. The Backward Kernel

As in Section 9.2.2 we use the optimal backward kernel in the extended space $Y = \mathbb{R}^d \times \Theta^M$, which for the conditional model leads to the following incremental weights (see Equation (12)):
$$\alpha_t = \frac{\gamma_t^y(y_t)}{\frac{1}{N}\sum_{j=1}^{N} w_{t-1}^{(j)}\,K_t^y\big(y_{t-1}^{(j)}, y_t\big)} = \frac{\widehat{f}_{\bar{Z}}(\bar{z}_t;\, \vartheta_t)\,f_\vartheta(\vartheta_t)\,\mathbb{1}_{G_{\bar{Z}}^t}(\bar{z}_t)}{\frac{1}{N}\sum_{j=1}^{N} w_{t-1}^{(j)}\,K_t\big(\bar{z}_{t-1}^{(j)}, \bar{z}_t\big)\,f_\vartheta(\vartheta_t)} = \frac{\widehat{f}_{\bar{Z}}(\bar{z}_t;\, \vartheta_t)\,\mathbb{1}_{G_{\bar{Z}}^t}(\bar{z}_t)}{\frac{1}{N}\sum_{j=1}^{N} w_{t-1}^{(j)}\,K_t\big(\bar{z}_{t-1}^{(j)}, \bar{z}_t\big)}.$$

9.3.3. The MCMC Move Kernel

The MCMC move kernel used for the conditional model needs to keep the target distribution in the extended space, $\pi_t^y(y_t)$, invariant. The strategy adopted is to first sample $\vartheta^* \sim f_\vartheta(\vartheta)$ and then $\bar{z}_t\,|\,\vartheta^* \sim \widehat{f}_{\bar{Z}}(\bar{z}_t;\, \vartheta_t^*)\,\mathbb{1}_{G_{\bar{Z}}^t}(\bar{z}_t)$.
For the second step above we use exactly the same Gibbs-sampler update as in Section 9.2.3, with f Z ( · ) replaced by f ^ Z ¯ ( · ; ϑ t ) .

10. Results

In this section we present the results of the SMC procedure when used to calculate the expected shortfall allocations of the solvency capital requirement from Equations (4) and (5).
Before proceeding to the results calculated via the SMC algorithm, in order to understand the simulated data presented in Figure 4, in Table 2 we present some results based on a “brute force” Monte Carlo (rejection-sampling) simulation, which is taken as the baseline for comparisons with the SMC algorithm. The table is divided into three blocks of rows: PY claims, small CY (CY,s) claims and large CY (CY,l) claims.
First of all, it should be noticed that the reserves presented in the first block of Table 2 are the ones implied by the data, which we then assume to be the true ones (ignoring, from now on, the initial synthetic data from Table 1). That is, based on the initial parameters we have generated synthetic claims development triangles, which naturally deviate slightly from their expected values. The parameters $\sigma$ and $\mu$ for PY claims are related to the marginalized model (for the parameters of the conditional model see Figure 5 and Figure 6). It is also important to note that only the PY parameters differ between the conditional and marginalized models.
For each LoB the standalone expected shortfall (ES) is calculated analytically and its value is then combined with the LoB's expectation to calculate the solvency capital requirement (SCR). These values are added up, both within risk type (i.e., PY, CY,s and CY,l) and globally, in order to calculate the overall standalone capital. For the marginalized and conditional models the columns “ES” and “SCR” denote, respectively, the expected shortfall and capital allocations to each LoB. These values are compared to their standalone counterparts to generate the diversification benefit, which is around 45% for PY and CY,s claims (regardless of the model used) and ranges between 30% and 70% within the PY and CY,s groups. Due to the independence assumptions the largest diversification benefit comes from the CY,l claims, where the capital is reduced by around 95%.
The data presented in Table 2 are calculated as follows. For the marginalized model (and conditional model in brackets), $5 \times 10^9$ ($2.5 \times 10^7$) independent samples of the model are generated in order to calculate the overall $\mathrm{VaR}_{99\%}$. Conditional on this value, for each LoB we then generate $5 \times 10^7$ ($5 \times 10^5$) samples above the VaR and use the average of these samples as the true ES allocation (presented in Table 2). In order to assess the variance of the estimators, we divide these samples into $N_{rep} = 500$ groups of $N_{MC} = 10^5$ ($N_{MC} = 10^3$ for the conditional model) simulations. More formally, we approximate the ES allocations $\rho_i$, defined in Equation (4), by
$$\widehat{\mathbb{E}}[\widehat{\rho}_{i,MC}] = \frac{1}{N_{rep}}\sum_{k=1}^{N_{rep}} \widehat{\rho}_{i,MC}^{(k)},$$
where ρ ^ i , M C ( k ) stands for the estimate (using N M C particles) from the k-th run (out of N r e p ), which is defined according to
$$\widehat{\rho}_{i,MC} = \frac{1}{N_{MC}}\sum_{j=1}^{N_{MC}} Z_i^{(j)}.$$
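In vectorized form, the rejection-sampling baseline amounts to averaging each risk's samples on the exceedance event; a sketch, where `Z` is an array with one joint sample per row (the names are illustrative):

```python
import numpy as np

def es_allocations_mc(Z, var_level):
    """ES allocations: mean of each risk's contribution over the joint
    samples whose total exceeds the VaR level."""
    exceed = Z.sum(axis=1) > var_level
    return Z[exceed].mean(axis=0)
```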
Similarly to the analysis performed in Peters et al. (2017) the impact of the prior density can be assessed by comparing the sum of the SCR allocations with the SCR from the “empirical Bayes model”, i.e., the model where the prior for σ is set as a Dirac mass on σ ^ j ( t ) , see Equation (55). In this case we have that the total capital is equal to SCR = 505.48 and the fully Bayesian model with prior defined with k = 5 (see Section 8.3) requires 15 % more capital (both in the marginalized and conditional cases).
To check the accuracy of the SMC procedure we first analyse the estimates of the level sets (intermediate VaRs). For $p_0 = 0.5$, Figure 7 and Figure 8 show, respectively, the histograms of the levels $B_1, \ldots, B_7$ (as per Table 10) for the marginalized and conditional models. The red dashed bars represent the true values of the quantiles (based on the “brute force” MC simulations), which are very close to the modes of the empirical distributions of the SMC estimates. It should be noticed, though, that the SMC estimates seem to be negatively biased and that the bias appears to become more pronounced for extreme quantiles. Apart from this negligible bias we assume the levels are sensibly estimated and proceed, as in Targino et al. (2015), to calculate the relative bias and the variance reduction of the SMC method when compared to an MC procedure.
For each of the LoBs the plots in Figure 9 and Figure 10 show the relative bias, defined as
$$\text{Relative Bias} = \frac{\widehat{\mathbb{E}}[\widehat{\rho}_{i,SMC}] - \widehat{\mathbb{E}}[\widehat{\rho}_{i,MC}]}{\widehat{\mathbb{E}}[\widehat{\rho}_{i,MC}]},$$
where $\widehat{\mathbb{E}}[\widehat{\rho}_{i,SMC}]$ is computed analogously to the MC estimate but using the SMC method instead, with $N_{SMC} = 100$. The behaviour of the two models is very similar, and we observe that the bias in the PY and CY,s allocations is negligible (less than 5%), while for some of the large CY risks a higher bias (of more than 10%) may be observed. Apart from the difficulty of performing the estimation based on Pareto distributions, we stress the fact that, although these errors may look large, as we can see from Table 2 their impact on the overall capital is almost imperceptible, due to the small capital charge attributed to these risks.
Another way to compare the SMC calculations is through the actual capital charges, as seen in Figure 11. In this figure we compare the 99% SCR calculated via the MC scheme discussed above with the SMC results for the quantile level right before 99% (which, for $p_0 = 0.5$, is 98.44%) and the one right after it (99.22%). From this figure we see that the SMC calculation based on the 99.22% quantile is very precise, for both the marginalized and conditional models. Visually, the only perceivable difference comes from the CY,l claims, which account (in total) for less than 2% of the overall capital.
To calculate the improvement generated by the SMC algorithm compared to the MC procedure we need to analyse the variance of the estimates generated by both methods, under similar computational budgets.
We start by noticing that the expected number of samples needed in the Monte Carlo scheme in order to have $N_{MC}$ samples satisfying the $\alpha$ condition is equal to $M_{MC} = N_{MC}/(1-\alpha)$, which can be prohibitive if $\alpha$ is very close to 1. Then, similarly to Equation (59), we define the empirical variances of the MC and the SMC algorithms, which are then compared as follows:
$$\text{Variance Reduction} = \frac{M_{MC} \times \widehat{\mathrm{Var}}(\widehat{\rho}_{i,MC})}{T \times N_{SMC} \times \widehat{\mathrm{Var}}(\widehat{\rho}_{i,SMC})}.$$
The variance reduction statistic defined in Equation (60) takes into account how many samples one needs in order to generate $N_{MC}$ samples via rejection sampling or $N_{SMC}$ using the SMC algorithm. The latter also takes into account the fact that $T$ levels are being used and that at each one $N_{SMC}$ samples need to be generated. For the conditional model we further multiply the denominator by the number of samples used to estimate the unknown density, which in our examples is set to $M = 5$.
The results are shown in Figure 12 and Figure 13. As in Targino et al. (2015) we observe that the variance of the SMC estimates becomes smaller (compared to the MC results) for larger quantiles. In particular, for the quantiles of interest the variances of the marginal ES allocation estimates are around $10^{0.5} \approx 3$ times smaller than their MC counterparts, while the overall ES estimate is slightly less variable for the MC scheme.
For the marginalized model we also present two plots in Figure 14, related, respectively, to the sensitivity to (a) the parameter $p_0$ and (b) the number of samples $N_{SMC}$. In Figure 14a, for the same number of samples, $N_{SMC} = 100$, we analyse the bias relative to the 99% ES allocations of the first quantile larger than 99% (top plot) and the previous one (bottom plot), for $p_0 \in \{0.4, 0.5, 0.7\}$. The quantiles used in these different setups are presented in Table 10. Although the results may look slightly different, the main message is the same: the “higher” quantile is effectively unbiased for PY and CY,s risks but presents a negative bias of around 10% for some of the CY,l risks.
Regarding the sensitivity to the number of particles in the SMC algorithm, as expected, the absolute bias decreases when the number of samples increases, as seen in Figure 14b. Although the SMC algorithm is generically guaranteed to be unbiased when $N_{SMC} \to +\infty$, the trade-off between bias and variance reduction in the allocation problem may lead us to accept a small bias in order to have a smaller variance.

11. Conclusions

In this paper we provide a complete and self-contained view of the capital allocation process for general insurance companies. As prescribed by the Swiss Solvency Test we break down the company's overall Solvency Capital Requirement (SCR) into the one-year reserve risk, due to claims from previous years (PY), and the one-year premium risk, due to claims' payments in the current year (CY). The latter is further split into the risk of normal/small claims (CY,s) and large claims (CY,l). For the premium risk in each line of business we assume a log-normal distribution for CY,s risks, with mean and variance as per the SST, which also prescribes a distribution for CY,l risks, in this case a Pareto. For the reserve risk, as in Peters et al. (2017), we postulate a Bayesian gamma-gamma model which, for allocation purposes, is approximated by log-normal distributions, leading to what we name the conditional (when the log-normal approximation is performed conditional on the unknown parameters) and the marginalized (when the log-normal approximation is performed after the parameter uncertainty has been integrated out) models.
As seen in Figure 1 and Figure 2, when assuming a Bayesian gamma-gamma model these two approximations do not deviate considerably from the actual model assumptions. Regarding the allocations, Figure 11 shows the results for both models are, once again, very close to each other (and to the “true” allocations, calculated via a large Monte Carlo exercise). Therefore, the decision on which approximation to use should not interfere with the allocation or reserving results, and is left to the reader.
The allocation process is performed using state-of-the-art (pseudo-marginal) Sequential Monte Carlo (SMC) algorithms, which are presented in a self-contained and accessible format. Although the algorithms described form an extremely flexible class, we provide an off-the-shelf version, where minimal or no tuning is needed. The algorithms are also shown to be computationally efficient in a series of numerical experiments.
One of the advantages of our proposed methodology is that it is able to compute in one single loop (1) the value at risk (VaR) and (2) the expected shortfall (ES), both at the company level, as well as (3) the capital allocations for the risk drivers. This procedure should be compared with routinely applied methodologies, where one simulation is performed to compute the VaR, which is then used in a different simulation to compute the allocations, in a process that accumulates different errors.
Moreover, even ignoring the computational cost of calculating a precise estimate for the required VaR in a “brute force” Monte Carlo scheme, the proposed SMC algorithm is numerically shown to provide estimates that are less volatile than comparable “brute force” implementations.

Author Contributions

All authors contributed equally to the paper.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Posterior Distributions

For ease of exposition we omit the LoB index $\ell$. Under Model Assumptions 1 the posterior distribution of the parameter vectors $\phi$ and $\sigma$, for $t \ge I$, is given by
$$\begin{aligned}
\pi(\phi, \sigma\,|\,\mathcal{F}(t)) &\propto g(\mathcal{F}(t)\,|\,\phi, \sigma)\, f_\phi(\phi)\, f_\sigma(\sigma) \\
&= g(C_{1,0}, \ldots, C_{t,0}) \prod_{j=0}^{J-1} \prod_{i=1}^{t-j-1} \frac{(\phi_j \sigma_j^{-2})^{C_{i,j}\sigma_j^{-2}}}{\Gamma(C_{i,j}\sigma_j^{-2})}\, C_{i,j+1}^{C_{i,j}\sigma_j^{-2} - 1} \exp\big\{-\phi_j \sigma_j^{-2} C_{i,j+1}\big\} \\
&\quad \times \prod_{j=0}^{J-1} \lim_{\gamma_j \to 1} \frac{\big(f_j(\gamma_j - 1)\big)^{\gamma_j}}{\Gamma(\gamma_j)}\, \phi_j^{\gamma_j - 1} \exp\big\{-\phi_j f_j(\gamma_j - 1)\big\} \times \prod_{j=0}^{J-1} f_{\sigma_j}(\sigma_j) \\
&\propto \prod_{j=0}^{J-1} \lim_{\gamma_j \to 1} \phi_j^{\gamma_j - 1 + \sum_{i=1}^{t-j-1} C_{i,j}\sigma_j^{-2}} \exp\Big\{-\phi_j \Big(f_j(\gamma_j - 1) + \sum_{i=1}^{t-j-1} C_{i,j+1}\sigma_j^{-2}\Big)\Big\} \times \prod_{j=0}^{J-1} f_{\sigma_j}(\sigma_j) \prod_{i=1}^{t-j-1} \frac{(C_{i,j+1}\sigma_j^{-2})^{C_{i,j}\sigma_j^{-2}}}{\Gamma(C_{i,j}\sigma_j^{-2})} \\
&= \prod_{j=0}^{J-1} \phi_j^{\sum_{i=1}^{t-j-1} C_{i,j}\sigma_j^{-2}} \exp\Big\{-\phi_j \sum_{i=1}^{t-j-1} C_{i,j+1}\sigma_j^{-2}\Big\} \times \prod_{j=0}^{J-1} f_{\sigma_j}(\sigma_j) \prod_{i=1}^{t-j-1} \frac{(C_{i,j+1}\sigma_j^{-2})^{C_{i,j}\sigma_j^{-2}}}{\Gamma(C_{i,j}\sigma_j^{-2})}.
\end{aligned}$$
From the functional form of $\pi(\phi, \sigma\,|\,\mathcal{F}(t))$ it can be seen that the components $\phi_j$ of $\phi$ and $\sigma_j$ of $\sigma$ are independent a posteriori, which is a direct consequence of the prior independence. Moreover, since $\pi(\phi\,|\,\sigma, \mathcal{F}(t)) \propto \pi(\phi, \sigma\,|\,\mathcal{F}(t))$, we have that
$$\phi_j\,|\,\sigma, \mathcal{F}(t) \sim \Gamma(a_j, b_j),$$
with $a_j = 1 + \sum_{i=1}^{t-j-1} C_{i,j}\,\sigma_j^{-2}$ and $b_j = \sum_{i=1}^{t-j-1} C_{i,j+1}\,\sigma_j^{-2}$.
The marginal posterior π ( σ | F ( t ) ) and its unnormalized version h ( σ | F ( t ) ) are calculated as
$$\pi(\sigma\,|\,\mathcal{F}(t)) = \int \pi(\phi, \sigma\,|\,\mathcal{F}(t))\,d\phi \propto \prod_{j=0}^{J-1} \frac{\Gamma(a_j)}{b_j^{a_j}}\, f_{\sigma_j}(\sigma_j) \prod_{i=1}^{t-j-1} \frac{(C_{i,j+1}\sigma_j^{-2})^{C_{i,j}\sigma_j^{-2}}}{\Gamma(C_{i,j}\sigma_j^{-2})} =: h(\sigma\,|\,\mathcal{F}(t)).$$
Lemma A1.
(from Peters et al. (2017)) For $0 \le j \le J-1$ and $t \ge 1$, if either $t-j-1 = 1$ or at least one accident year $1 \le i \le t-j-1$ is such that $C_{i,j+1} \ne C_{i,j}\,\hat{f}_j(t)$, then the marginal posterior $\pi(\sigma\,|\,\mathcal{F}(t))$ is integrable, i.e.,
$$\int_0^{d_j} h_j(\sigma_j\,|\,\mathcal{F}(t))\,d\sigma_j < \infty.$$

Appendix B. Correlation Bounds in the Log-Normal–Gaussian Copula Model

As mentioned in Section 7 and discussed, for example, in (Embrechts et al. 2002, Fallacy 2), for given marginal distributions not all linear correlations between $-1$ and $1$ can be achieved. This can also be seen in the following lemma (see (Denuit and Dhaene 2003, sct. 2)).
Lemma A2
(Correlation bounds). Let ( X 1 , X 2 ) be a bivariate random variable with marginal distributions F 1 and F 2 . Then the correlation between X 1 and X 2 is bounded by
$$\frac{\mathrm{Cov}\big(F_1^{-1}(U),\, F_2^{-1}(1-U)\big)}{\sqrt{\mathrm{Var}(X_1)\,\mathrm{Var}(X_2)}} \;\le\; \mathrm{Corr}(X_1, X_2) \;\le\; \frac{\mathrm{Cov}\big(F_1^{-1}(U),\, F_2^{-1}(U)\big)}{\sqrt{\mathrm{Var}(X_1)\,\mathrm{Var}(X_2)}},$$
for $U$ uniformly distributed on $[0, 1]$.
Although theoretically interesting, Lemma A2 may provide bounds that are too wide and, in some cases, just state that the correlation lies between $-1$ and $1$. In the sequel we show that in the particular case of a random vector with log-normal marginals and a Gaussian copula it is possible to calculate the intended correlation precisely and to check its limits numerically.
Let us assume a random vector $X = (X_1, \ldots, X_{2L})$ is normally distributed, $X \sim \mathcal{N}(m, V)$, where a general term of the covariance matrix $V$ is given by $(V)_{i,j} = V_{i,j}$ and $V_{i,i} = V_i^2$. Moreover, we denote by $\Omega = \mathrm{Corr}(X)$ the correlation matrix of the random vector $X$, i.e.,
$$V = \mathrm{diag}(V_1, \ldots, V_{2L})\;\Omega\;\mathrm{diag}(V_1, \ldots, V_{2L}),$$
with $(\Omega)_{i,j} = (\Omega)_{j,i} = \omega_{i,j}$.
If we define $Z_i = e^{X_i}$, for $i = 1, \ldots, 2L$, then $Z_i \sim \mathrm{LN}(m_i, V_i)$ with
$$\mathbb{E}[Z_i] = \exp\Big\{m_i + \frac{V_i^2}{2}\Big\}, \qquad \mathrm{Var}(Z_i) = \mathbb{E}[Z_i]^2\big(e^{V_i^2} - 1\big).$$
On the other hand, since $X_i + X_j \sim \mathcal{N}\big(m_i + m_j,\ V_i^2 + V_j^2 + 2V_i\,\omega_{i,j}\,V_j\big)$ we have that
$$\mathbb{E}[Z_i Z_j] = \mathbb{E}\big[e^{X_i + X_j}\big] = \exp\Big\{m_i + m_j + \frac{V_i^2 + V_j^2 + 2V_i\,\omega_{i,j}\,V_j}{2}\Big\}.$$
Therefore, using Equations (A2) and (A3), the correlation between $Z_i$ and $Z_j$ can be written as
$$\mathrm{Corr}(Z_i, Z_j) = \frac{\exp\{V_i\,\omega_{i,j}\,V_j\} - 1}{\big[(e^{V_i^2} - 1)(e^{V_j^2} - 1)\big]^{1/2}}.$$
Since exp ( · ) is a strictly increasing function and the marginal distributions of ( X 1 , , X 2 L ) are continuous, from (McNeil et al. 2010, Proposition 5.6) we can conclude that ( Z 1 , , Z 2 L ) has the same copula as ( X 1 , , X 2 L ) : a Gaussian copula with correlation matrix Ω .
From Equation (A4) it is easy to see that the correlation between $Z_i$ and $Z_j$ is a monotone function of $\omega_{i,j}$, which implies that $\mathrm{Corr}(Z_i, Z_j)$ is minimal when $\omega_{i,j} = -1$ and maximal when $\omega_{i,j} = 1$. Therefore, for a given pair of standard deviations it is possible to compute the interval of admissible correlations for the pair $(Z_i, Z_j)$. Figure A1 presents the lower (left plot) and upper (right plot) bounds for the correlations.
Figure A1. Lower (left) and upper (right) bound for correlations in a Gaussian-copula model with log-normal marginal distributions, as a function of the scale parameters $\sigma_1$ and $\sigma_2$.
Figure A1 shows that even when the copula correlation is set to $-1$, if at least one of the standard deviation parameters is “large” then the minimum possible correlation between the log-normal variables is close to zero. For example, if $\sigma_1 = \sigma_2 = 2$ then the lower bound for the correlations is approximately $-2\%$. As actuarial risks are usually positively correlated this may not be a problem from the modelling point of view. The upper limit for the correlations has a different behaviour. If both standard deviations are the same then the range of attainable correlations is upper bounded by 1, meaning that any positive correlation can be achieved. Problems arise when the standard deviations are sufficiently different from each other. If $\sigma_1 = 1$ then the correlation is upper bounded by $66\%$ if $\sigma_2 = 2$, $16\%$ if $\sigma_2 = 3$ and about $1\%$ if $\sigma_2 = 4$.
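These bounds are easy to verify numerically; a minimal sketch (our own helper):

```python
import math

def lognormal_corr(omega, V_i, V_j):
    """Correlation of Equation (A4) as a function of the copula parameter."""
    return (math.exp(V_i * omega * V_j) - 1.0) / math.sqrt(
        (math.exp(V_i**2) - 1.0) * (math.exp(V_j**2) - 1.0))

print(lognormal_corr(-1.0, 2.0, 2.0))  # lower bound, approx -0.018
print(lognormal_corr(1.0, 1.0, 4.0))   # upper bound, approx 0.014
```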

Appendix C. Data Generating Process

In this appendix we describe the process used to generate claims triangles from the balance sheet data in Table 1, in such a way that the reserves estimated from the data closely match the reserves from Table 1.
First of all, for each LoB we set the maximum number of development years as the number of years it takes until $F_j = 1$, where $F_j$ denotes the cumulative payment pattern for development year $j$ (see Section 8.1). As claims in the “Motor Third Party Liability (MTPL)” and “Workers Compensation (UVG)” LoBs should take between 20 and 30 years to settle, we make the simplifying assumption that $I = \max(J + 1, 10)$.
For different accident years we calculate the present value of the runoff pattern, using a constant claim inflation r = 2 % for all years and LoBs. More precisely, we have that
$$PV_i(F_j) = (1 + r)^{-i} F_j, \quad\text{for } j = 1, \ldots, J \text{ and } j + i > I.$$
For the most recent accident year, i = I , we define the expected ultimate claim by
$$C_{I,J}^* = R \times \frac{\sum_{j=1}^{J} P_{I,j}}{\sum_{j=1}^{J} F_j},$$
where R denotes the reserves from Table 1 and
$$P_{I,j} = \frac{PV_I(F_j)}{\sum_{i=1}^{I} \sum_{j=1}^{J} PV_i(F_j)}.$$
Note that C I , J * is neither the ultimate claim predictor for the conditional model defined in Equation (21) nor the marginalized one from Equation (27). In this context C I , J * is just an auxiliary variable being used in order to simulate triangles which have estimated reserves similar to the original ones in Table 1.
For the remaining accident years the expected ultimate claim is taken as the present value of C I , J * . In other words,
$$C_{i,J}^* = PV_{i-I}\big(C_{I,J}^*\big) = (1 + r)^{I-i}\, C_{I,J}^*.$$
Given all the values of $C_{i,J}^*$, we compute $E_i^* = F_0 \times C_{i,J}^*$, the expected initial payment for each accident year. These values are then combined with the coefficients of variation for CY small claims and used to simulate the first column of our triangles as
$$C_{i,0} \sim \mathrm{LN}(m_i^*, V_i^*),$$
with the auxiliary parameters $m_i^* = \log(E_i^*) - V_i^*/2$, $V_i^* = \log(1 + \mathrm{CoVa}_{CY}^2)$ and $\mathrm{CoVa}_{CY}$ the coefficient of variation of CY small claims, based on Model Assumptions 4. For the remaining development years we follow Model Assumptions 1 (a) with $\phi_j = 1/f_j$ and $\sigma_j = s_j/f_j$, as discussed in Section 8.1.
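Putting the pieces together, the simulation can be sketched as follows, using the gamma recursion of Model Assumptions 1 (a) (cf. the likelihood in Appendix A, where $C_{i,j+1}\,|\,C_{i,j}, \phi_j, \sigma_j \sim \Gamma(C_{i,j}/\sigma_j^2,\ \phi_j/\sigma_j^2)$); array shapes and names are ours, and the run-off triangle is obtained by masking future calendar years of the simulated rectangle.

```python
import numpy as np

def simulate_triangle(E_star, cova_cy, f, s, rng=None):
    """Simulate cumulative claims C[i, j], accident years i, dev. years j."""
    rng = np.random.default_rng() if rng is None else rng
    I, J = len(E_star), len(f)
    phi, sigma = 1.0 / f, s / f                      # as discussed in Section 8.1
    C = np.empty((I, J + 1))
    V_star = np.log(1.0 + cova_cy**2)                # first column: log-normal
    C[:, 0] = rng.lognormal(np.log(E_star) - V_star / 2.0, np.sqrt(V_star))
    for j in range(J):                               # gamma-gamma CL recursion
        shape = C[:, j] / sigma[j]**2                # shape C_{i,j}/sigma_j^2
        rate = phi[j] / sigma[j]**2                  # rate  phi_j/sigma_j^2
        C[:, j + 1] = rng.gamma(shape, 1.0 / rate)   # mean f_j * C_{i,j}
    return C
```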
Figure 4 presents the generated cumulative claims payments for all LoBs, where each line represents the cumulative claims payments of one accident year. In each plot the lighter colours represent more recent accident years, which are not yet fully developed. The reserves calculated based on this dataset are presented in Table 2, and given these values the original reserves from Table 1 are ignored.

References

  1. Andrieu, Christophe, Arnaud Doucet, and Roman Holenstein. 2010. Particle Markov chain Monte Carlo methods. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 72: 269–342. [Google Scholar] [CrossRef]
  2. Andrieu, Christophe, and Gareth O. Roberts. 2009. The pseudo-marginal approach for efficient Monte Carlo computations. The Annals of Statistics 37: 697–725. [Google Scholar] [CrossRef]
  3. Andrieu, Christophe, and Matti Vihola. 2015. Convergence properties of pseudo-marginal Markov chain Monte Carlo algorithms. The Annals of Applied Probability 25: 1030–77. [Google Scholar] [CrossRef]
  4. Asimit, Alexandru V., Raluca Vernic, and Ričardas Zitikis. 2013. Evaluating risk measures and capital allocations based on multi-losses driven by a heavy-tailed background risk: The multivariate Pareto-II model. Risks 1: 14–33. [Google Scholar] [CrossRef] [Green Version]
  5. Bargès, Mathieu, Hélène Cossette, and Etienne Marceau. 2009. TVaR-based capital allocation with copulas. Insurance: Mathematics and Economics 45: 348–61. [Google Scholar] [CrossRef]
  6. Beskos, Alexandros, Omiros Papaspiliopoulos, Gareth O. Roberts, and Paul Fearnhead. 2006. Exact and computationally efficient likelihood-based estimation for discretely observed diffusion processes (with discussion). Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68: 333–82. [Google Scholar] [CrossRef]
  7. Cérou, Frédéric, Pierre Del Moral, Teddy Furon, and Arnaud Guyader. 2012. Sequential Monte Carlo for rare event estimation. Statistics and Computing 22: 795–808. [Google Scholar] [CrossRef] [Green Version]
  8. Chopin, Nicolas, Pierre E Jacob, and Omiros Papaspiliopoulos. 2013. SMC2: An efficient algorithm for sequential analysis of state space models. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 75: 397–426. [Google Scholar] [CrossRef]
  9. European Commission. 2009. Directive 2009/138/EC of the European Parliament and of the Council of 25 November 2009 on the taking-up and pursuit of the business of Insurance and Reinsurance (Solvency II). Technical Report. Available online: http://eur-lex.europa.eu/legal-content/EN/ALL/?uri=CELEX:32009L0138 (accessed on 3 May 2017).
  10. Cossette, Hélène, Marie-Pier Côté, Etienne Marceau, and Khouzeima Moutanabbir. 2013. Multivariate distribution defined with Farlie–Gumbel–Morgenstern copula and mixed Erlang marginals: Aggregation and capital allocation. Insurance: Mathematics and Economics 52: 560–72. [Google Scholar] [CrossRef]
  11. Creal, Drew. 2012. A survey of sequential Monte Carlo methods for economics and finance. Econometric Reviews 31: 245–96. [Google Scholar] [CrossRef]
  12. De Jong, Piet, and Ben Zehnwirth. 1983. Claims reserving, state-space models and the Kalman filter. Journal of the Institute of Actuaries 110: 157–81. [Google Scholar] [CrossRef]
  13. Del Moral, Pierre. 1996. Non-linear filtering: Interacting particle resolution. Markov Processes and Related Fields 2: 555–81. [Google Scholar]
  14. Del Moral, Pierre. 2004. Feynman-Kac Formulae. Berlin: Springer. [Google Scholar]
  15. Del Moral, Pierre, Arnaud Doucet, and Ajay Jasra. 2006. Sequential Monte Carlo samplers. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 68: 411–36. [Google Scholar] [CrossRef]
  16. Del Moral, Pierre, Gareth W Peters, and Christelle Vergé. 2013. An introduction to stochastic particle integration methods: With applications to risk and insurance. In Monte Carlo and Quasi-Monte Carlo Methods 2012. Berlin: Springer, pp. 39–81. [Google Scholar]
  17. Denuit, Michel, and Jan Dhaene. 2003. Simple characterizations of comonotonicity and countermonotonicity by extremal correlations. Belgian Actuarial Bulletin 3: 22–27. [Google Scholar]
  18. Devroye, Luc, and Gérard Letac. 2015. Copulas with prescribed correlation matrix. In In Memoriam Marc Yor-Séminaire de Probabilités XLVII. Berlin: Springer, pp. 585–601. [Google Scholar]
  19. Dhaene, Jan, Luc Henrard, Zinoviy Landsman, Antoine Vandendorpe, and Steven Vanduffel. 2008. Some results on the CTE-based capital allocation rule. Insurance: Mathematics and Economics 42: 855–63. [Google Scholar] [CrossRef]
  20. Dhaene, Jan, Andreas Tsanakas, Emiliano A. Valdez, and Steven Vanduffel. 2012. Optimal capital allocation principles. Journal of Risk and Insurance 79: 1–28. [Google Scholar] [CrossRef] [Green Version]
  21. Douc, Randal, and Olivier Cappé. 2005. Comparison of resampling schemes for particle filtering. Paper presented the 4th International Symposium on Image and Signal Processing and Analysis (ISPA 2005), Zagreb, Croatia, 15–17 September; pp. 64–69. [Google Scholar]
  22. Doucet, Arnaud, and Adam M. Johansen. 2009. A tutorial on particle filtering and smoothing: Fifteen years later. Handbook of Nonlinear Filtering 12: 656–704. [Google Scholar]
  23. Embrechts, Paul, Alexander McNeil, and Daniel Straumann. 2002. Correlation and dependence in risk management: Properties and pitfalls. In Risk Management: Value at Risk and Beyond. Cambridge: Cambridge University Press, pp. 176–223. [Google Scholar]
  24. Embrechts, Paul, Giovanni Puccetti, Ludger Rüschendorf, Ruodu Wang, and Antonela Beleraj. 2014. An academic response to Basel 3.5. Risks 2: 25–48. [Google Scholar] [CrossRef] [Green Version]
  25. Everitt, Richard G., Adam M. Johansen, Ellen Rowing, and Melina Evdemon-Hogan. 2016. Bayesian model comparison with un-normalised likelihoods. Statistics and Computing 27: 403–22. [Google Scholar] [CrossRef]
  26. Finke, Axel. 2015. On Extended State-Space Constructions for Monte Carlo Methods. Ph.D. dissertation, University of Warwick, Coventry, UK. [Google Scholar]
  27. FINMA. 2007. Technical Document on the Swiss Solvency Test. Technical Report. Bern: FINMA. [Google Scholar]
  28. FINMA. 2016. Standardmodell Schadenversicherung. Available online: https://www.finma.ch/de/~/media/finma/dokumente/dokumentencenter/myfinma/2ueberwachung/sst/standard-model-nonlife-2016.zip?la=de (accessed on 13 July 2016).
  29. Fulop, Andras, and Junye Li. 2013. Efficient learning via simulation: A marginalized resample-move approach. Journal of Econometrics 176: 146–61. [Google Scholar] [CrossRef]
  30. Furman, Edward, and Zinoviy Landsman. 2005. Risk capital decomposition for a multivariate dependent gamma portfolio. Insurance: Mathematics and Economics 37: 635–49. [Google Scholar] [CrossRef]
  31. Gandy, Axel, and F. Din-Houn Lau. 2015. The chopthin algorithm for resampling. arXiv, arXiv:1502.07532. [Google Scholar]
  32. Geweke, John. 1989. Bayesian inference in econometric models using Monte Carlo integration. Econometrica: Journal of the Econometric Society, 1317–39. [Google Scholar] [CrossRef]
  33. Gilks, Walter R., and Carlo Berzuini. 2001. Following a moving target Monte Carlo inference for dynamic Bayesian models. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 63: 127–46. [Google Scholar] [CrossRef]
  34. Gordon, Neil J., David J. Salmond, and Adrian F.M. Smith. 1993. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEEE Proceedings F-Radar and Signal Processing 140: 107–13. [Google Scholar] [CrossRef]
  35. Landsman, Zinoviy M., and Emiliano A. Valdez. 2003. Tail conditional expectations for elliptical distributions. North American Actuarial Journal 7: 55–71. [Google Scholar] [CrossRef]
  36. Liu, Jun S., and Rong Chen. 1995. Blind deconvolution via sequential imputations. Journal of the American Statistical Association 90: 567–76. [Google Scholar] [CrossRef]
  37. Liu, Jun S., and Rong Chen. 1998. Sequential Monte Carlo methods for dynamic systems. Journal of the American Statistical Association 93: 1032–44. [Google Scholar] [CrossRef]
  38. Mack, Thomas. 1993. Distribution-free calculation of the standard error of chain ladder reserve estimates. Astin Bulletin 23: 213–25. [Google Scholar] [CrossRef]
  39. Martino, Luca, Víctor Elvira, and Francisco Louzada. 2017. Effective sample size for importance sampling based on discrepancy measures. Signal Processing 131: 386–401. [Google Scholar] [CrossRef]
  40. McGree, James M., Christopher C. Drovandi, Gentry White, and Anthony N. Pettitt. 2015. A pseudo-marginal sequential Monte Carlo algorithm for random effects models in Bayesian sequential design. Statistics and Computing 26: 1–16. [Google Scholar] [CrossRef]
  41. McNeil, Alexander J., Rüdiger Frey, and Paul Embrechts. 2010. Quantitative Risk Management: Concepts, Techniques, and Tools. Princeton: Princeton University Press. [Google Scholar]
  42. Merz, Michael, and Mario V. Wüthrich. 2015. Claims run-off uncertainty: The full picture. Available at SSRN 2524352, version of 3/Jul/2015. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2524352 (accessed on 3 May 2017).
  43. Neal, Radford M. 2003. Slice sampling. Annals of Statistics 31: 705–41. [Google Scholar] [CrossRef]
  44. Panjer, Harry H. 2001. Measureement of Risk, Solvency Requirements and Allocation of Capital within Financial Conglomerates. Waterloo: University of Waterloo, Institute of Insurance and Pension Research. [Google Scholar]
  45. Peters, Gareth W. 2005. Topics in Sequential Monte Carlo Samplers. Master’s thesis, University of Cambridge, Cambridge, UK. [Google Scholar]
  46. Peters, Gareth W., Rodrigo S. Targino, and Mario V. Wüthrich. 2017. Full bayesian analysis of claims reserving uncertainty. Insurance: Mathematics and Economics 73: 41–53. [Google Scholar] [CrossRef]
  47. Pitt, Michael K., Ralph dos Santos Silva, Paolo Giordani, and Robert Kohn. 2012. On some properties of Markov chain Monte Carlo simulation methods based on the particle filter. Journal of Econometrics 171: 134–51. [Google Scholar] [CrossRef]
  48. Sklar, Abe. 1959. Fonctions de répartition à n dimensions et leurs marges. Fonctions de Repartition à n Dimensions et Leurs Marges 8: 229–31. [Google Scholar]
  49. Targino, Rodrigo S., Gareth W. Peters, and Pavel V. Shevchenko. 2015. Sequential Monte Carlo samplers for capital allocation under copula-dependent risk models. Insurance: Mathematics and Economics 61: 206–26. [Google Scholar] [CrossRef]
  50. Tasche, Dirk. 1999. Risk contributions and performance measurement. Report of the Lehrstuhl für mathematische Statistik, TU München. Available online: https://pdfs.semanticscholar.org/2659/60513755b26ada0b4fb688460e8334a409dd.pdf (accessed on 3 May 2017).
  51. Tran, Minh-Ngoc, Marcel Scharth, Michael K. Pitt, and Robert Kohn. 2014. Importance sampling squared for bayesian inference in latent variable models. Available at SSRN 2386371. Available online: https://ssrn.com/abstract=2386371 (accessed on 3 May 2017).
  52. Vergé, Christelle, Cyrille Dubarry, Pierre Del Moral, and Eric Moulines. 2015. On parallel implementation of sequential Monte Carlo methods: The island particle model. Statistics and Computing 25: 243–60. [Google Scholar] [CrossRef]
  53. Vergé, Christelle, Jérôme Morio, and Pierre Del Moral. 2016. An island particle algorithm for rare event analysis. Reliability Engineering & System Safety 149: 63–75. [Google Scholar]
  54. Verrall, Richard J. 1989. A state space representation of the chain ladder linear model. Journal of the Institute of Actuaries 116: 589–609. [Google Scholar] [CrossRef]
  55. Wüthrich, Mario V. 2015. Non-Life Insurance: Mathematics & Statistics. Available at SSRN 2319328, version of 29/Jun/2015. Available at SSRN 2386371. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2319328 (accessed on 3 May 2017).
Figure 1. Quantile-quantile plots for the different lines of business (LoBs), comparing (vertical axis) the empirical distribution of Z̄_PY | σ, F(t) under Model Assumptions 1 with (horizontal axis) the log-normal approximation of Model Assumptions 2. Based on 1000 samples.
Figure 2. Quantile-quantile plots for the different LoBs, comparing (vertical axis) the empirical distribution of Z_PY | F(t) under Model Assumptions 1 with (horizontal axis) the log-normal approximation of Model Assumptions 3, using posterior samples as in Figure 5 and Figure 6. Based on 1000 samples.
Figure 3. Posterior distributions of σ_j for the (a) Motor Third Party Liability (MTPL), (b) Property, and (c) Motor Hull lines of business. Solid lines show the unnormalized posteriors, histograms show the Markov chain Monte Carlo (MCMC) output, and the red dashed line marks the chain ladder (CL) standard deviation estimate. Note that for LoB MTPL only selected development periods are plotted: j ∈ {0, 7, 14, 21, 28}.
Figure 4. Cumulative claims payments (in millions of CHF). Lighter colours represent more recent accident years.
Figure 5. Histogram of the parameter σ̄_PY for the conditional model. Red dashed line: σ_PY.
Figure 6. Histogram of the parameter μ̄_PY for the conditional model. Red dashed line: μ_PY.
Figure 7. Histograms of the levels used in the Sequential Monte Carlo (SMC) sampler algorithm with p_0 = 0.5 in the marginalized model. The red dashed bar represents the true value of the α quantile.
Figure 8. Histograms of the levels used in the SMC sampler algorithm with p_0 = 0.5 in the conditional model. The red dashed bar represents the true value of the α quantile.
Figure 9. Bias for the marginalized model. Note that although the bias for some of the current year (CY) large claims is around 10%, their allocated capital is rather small, as seen in Figure 11a.
Figure 10. Bias for the conditional model. Note that although the bias for some of the CY large claims is around 10%, their allocated capital is rather small, as seen in Figure 11b.
Figure 11. Comparison between the “true” allocations (calculated via a large Monte Carlo procedure) and the SMC sampler solution for the (a) marginalized and (b) conditional models.
Figure 12. Variance reduction for the marginalized model.
Figure 13. Variance reduction for the conditional model.
Figure 14. Relative bias in the marginalized model as a function of (a) the parameter p_0 and (b) the sample size in the SMC sampler, N_SMC.
Table 1. Initial synthetic balance sheet.

LoB | Reserves | Premium
1 MTPL | 2391.64 | 503.14
2 Motor Hull | 99.08 | 573.26
3 Property | 449.26 | 748.76
4 Liability | 870.27 | 299.73
5 Workers Compensation (UVG) | 1104.66 | 338.63
6 Commercial Health | 271.54 | 254.21
7 Private Health | 7.32 | 7.20
8 Credit and Surety | 49.50 | 34.64
9 Others | 67.64 | 46.28
Total | 5310.92 | 2805.87
Table 2. Parameters and capital calculations for the marginalized and conditional models. Abbreviations: SA = standalone, M = marginalized, C = conditional; ES 99% = expected shortfall at the 99% level; Div. Benefit = diversification benefit.

Previous-year (PY) risks:

LoB | Reserve | σ | μ | CoVa | Expectation | ES 99% (SA) | SCR (SA) | ES 99% (M) | SCR (M) | Div. Benefit (M) | ES 99% (C) | SCR (C) | Div. Benefit (C)
1 | 2365.44 | 0.0287 | 7.7659 | 2.87% | 2365.44 | 2546.31 | 180.87 | 2489.85 | 124.41 | 31.22% | 2492.05 | 126.61 | 30.00%
2 | 99.37 | 0.2164 | 4.5755 | 21.90% | 99.37 | 173.23 | 73.86 | 131.73 | 32.36 | 56.19% | 132.59 | 33.21 | 55.03%
3 | 405.99 | 0.1142 | 5.9998 | 11.46% | 405.99 | 547.25 | 141.26 | 479.11 | 73.12 | 48.24% | 485.27 | 79.28 | 43.88%
4 | 870.19 | 0.0315 | 6.7682 | 3.15% | 870.19 | 946.06 | 75.87 | 905.48 | 35.29 | 53.49% | 905.29 | 35.10 | 53.73%
5 | 1105.95 | 0.0193 | 7.0083 | 1.93% | 1105.95 | 1164.04 | 58.09 | 1137.06 | 31.11 | 46.44% | 1136.88 | 30.93 | 46.76%
6 | 274.91 | 0.0410 | 5.6156 | 4.10% | 274.91 | 306.43 | 31.52 | 287.33 | 12.42 | 60.59% | 286.97 | 12.06 | 61.74%
7 | 7.15 | 0.0547 | 1.9657 | 5.48% | 7.15 | 8.26 | 1.11 | 7.45 | 0.30 | 73.27% | 7.43 | 0.28 | 74.50%
8 | 48.18 | 0.0493 | 3.8738 | 4.93% | 48.18 | 54.89 | 6.71 | 50.51 | 2.32 | 65.36% | 50.43 | 2.25 | 66.44%
9 | 72.20 | 0.1332 | 4.2706 | 13.38% | 72.20 | 102.16 | 29.96 | 85.32 | 13.12 | 56.21% | 85.15 | 12.95 | 56.77%
Total PY | 5249.38 | | | | 5249.38 | 5848.63 | 599.25 | 5573.84 | 324.45 | 45.86% | 5582.06 | 332.67 | 44.49%

Current-year small claims (CY,s):

LoB | Premium | σ | μ | CoVa | Expectation | ES 99% (SA) | SCR (SA) | ES 99% (M) | SCR (M) | Div. Benefit (M) | ES 99% (C) | SCR (C) | Div. Benefit (C)
1 | 503.14 | 0.0685 | 6.0958 | 6.86% | 448.94 | 533.07 | 84.13 | 499.16 | 50.21 | 40.32% | 498.37 | 49.43 | 41.25%
2 | 573.26 | 0.0702 | 6.0356 | 7.03% | 402.87 | 504.20 | 101.33 | 472.25 | 69.38 | 31.53% | 471.66 | 68.79 | 32.11%
3 | 748.76 | 0.0683 | 6.3013 | 6.84% | 547.23 | 654.38 | 107.15 | 603.36 | 56.13 | 47.62% | 602.61 | 55.38 | 48.31%
4 | 299.73 | 0.0923 | 5.3596 | 9.25% | 216.70 | 272.05 | 55.35 | 239.69 | 22.99 | 58.47% | 239.57 | 22.87 | 58.69%
5 | 338.63 | 0.0648 | 5.6841 | 6.49% | 303.77 | 349.69 | 45.92 | 319.17 | 15.40 | 66.47% | 318.71 | 14.94 | 67.45%
6 | 254.21 | 0.0804 | 5.4296 | 8.05% | 228.79 | 282.62 | 53.83 | 249.63 | 20.85 | 61.28% | 249.31 | 20.52 | 61.88%
7 | 7.20 | 0.1047 | 1.8628 | 10.5% | 6.48 | 8.52 | 2.04 | 7.01 | 0.53 | 73.84% | 7.01 | 0.53 | 74.06%
8 | 34.64 | 0.0981 | 3.3172 | 9.84% | 27.72 | 35.84 | 8.13 | 30.32 | 2.60 | 67.95% | 30.28 | 2.57 | 68.44%
9 | 46.28 | 0.1004 | 3.6066 | 10.06% | 37.03 | 48.16 | 11.14 | 41.83 | 4.81 | 56.83% | 41.79 | 4.77 | 57.19%
Total CY,s | 2805.85 | | | | 2219.53 | 2688.53 | 469.02 | 2462.42 | 242.9 | 48.21% | 2459.31 | 239.8 | 48.87%

Current-year large claims (CY,l; the parameters β(5), γ and α replace σ, μ and CoVa):

Peril | β(5) | γ | α | Expectation | ES 99% (SA) | SCR (SA) | ES 99% (M) | SCR (M) | Div. Benefit (M) | ES 99% (C) | SCR (C) | Div. Benefit (C)
1 | 2.50 | – | 2.80 | 3.89 | 20.14 | 16.25 | 4.03 | 0.15 | 99.1% | 4.01 | 0.12 | 99.27%
2 | 13.35 | 300 | 1.85 | 27.08 | 191.21 | 164.13 | 39.96 | 12.88 | 92.15% | 39.61 | 12.53 | 92.36%
3 | 6.28 | 100 | 1.50 | 14.34 | 84.31 | 69.97 | 16.5 | 2.16 | 96.91% | 16.45 | 2.11 | 96.98%
4 | 3.88 | 100 | 1.80 | 8.10 | 61.34 | 53.24 | 8.94 | 0.84 | 98.42% | 8.91 | 0.81 | 98.48%
5 | 0.50 | – | 2.00 | 1.00 | 10.00 | 9.00 | 1.07 | 0.07 | 99.19% | 1.12 | 0.12 | 98.69%
Total CY,l | | | | 54.41 | 367 | 312.59 | 70.5 | 16.1 | 94.85% | 70.11 | 15.69 | 94.98%

All risks combined (Reserve + Premium):

Total | 8055.26 | | | | 7523.32 | 8904.18 | 1380.86 | 8106.77 | 583.45 | 57.75% | 8111.5 | 588.18 | 57.40%
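The derived columns of Table 2 are tied together by two identities that hold for every row above, up to rounding: the solvency capital requirement is the 99% expected shortfall minus the expectation, and the diversification benefit is the relative reduction of the allocated SCR against the standalone SCR. A minimal Python check using the PY LoB 1 row (a verification sketch only, not the authors' code):

# Hedged sketch: reproduce the derived columns of Table 2 from ES and expectation.
# Figures below are the PY LoB 1 row; every other row is consistent with the
# same identities up to rounding.
expectation = 2365.44
es_sa, es_m, es_c = 2546.31, 2489.85, 2492.05  # standalone / marginalized / conditional

scr_sa = es_sa - expectation  # 180.87
scr_m = es_m - expectation    # 124.41
scr_c = es_c - expectation    # 126.61

# Diversification benefit: relative SCR reduction versus the standalone figure.
div_m = 1 - scr_m / scr_sa    # ~0.3122, i.e. 31.22%
div_c = 1 - scr_c / scr_sa    # ~0.3000, i.e. 30.00%
print(f"{scr_m:.2f} {div_m:.2%}  {scr_c:.2f} {div_c:.2%}")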
Table 3. Swiss Solvency Test (SST) 2015 standard development patterns for claims provision (normalized to have at most 30 development years and rounded to 2 digits).

Development years 0–15 (values in %):

LoB | Y0 | Y1 | Y2 | Y3 | Y4 | Y5 | Y6 | Y7 | Y8 | Y9 | Y10 | Y11 | Y12 | Y13 | Y14 | Y15
1 | 30.18 | 15.63 | 5.78 | 4.94 | 4.43 | 4.34 | 4.09 | 3.92 | 3.66 | 3.50 | 3.08 | 2.64 | 2.16 | 1.86 | 1.50 | 1.30
2 | 81.08 | 18.67 | 0.24 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
3 | 58.24 | 35.06 | 4.36 | 1.37 | 0.64 | 0.33 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
4 | 26.55 | 23.53 | 8.33 | 6.18 | 4.79 | 4.15 | 3.63 | 3.14 | 2.55 | 2.11 | 1.80 | 1.59 | 1.35 | 1.20 | 1.12 | 1.02
5 | 40.62 | 24.92 | 7.14 | 4.86 | 4.43 | 3.13 | 2.57 | 1.67 | 1.31 | 1.22 | 1.05 | 0.69 | 0.60 | 0.56 | 0.51 | 0.47
6 | 36.83 | 47.68 | 14.20 | 0.88 | 0.28 | 0.14 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
7 | 46.26 | 38.05 | 10.78 | 2.94 | 1.27 | 0.69 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
8 | 45.85 | 35.28 | 11.35 | 3.72 | 1.62 | 0.91 | 0.52 | 0.32 | 0.20 | 0.13 | 0.10 | 0 | 0 | 0 | 0 | 0
9 | 58.24 | 35.06 | 4.36 | 1.37 | 0.64 | 0.33 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0

Development years 16–30 (values in %):

LoB | Y16 | Y17 | Y18 | Y19 | Y20 | Y21 | Y22 | Y23 | Y24 | Y25 | Y26 | Y27 | Y28 | Y29 | Y30
1 | 1.06 | 0.88 | 0.73 | 0.64 | 0.60 | 0.53 | 0.47 | 0.44 | 0.41 | 0.37 | 0.29 | 0.21 | 0.15 | 0.12 | 0.10
4 | 0.88 | 0.77 | 0.72 | 0.66 | 0.60 | 0.55 | 0.52 | 0.49 | 0.45 | 0.4 | 0.31 | 0.22 | 0.16 | 0.13 | 0.11
5 | 0.43 | 0.40 | 0.37 | 0.35 | 0.33 | 0.31 | 0.29 | 0.27 | 0.26 | 0.24 | 0.23 | 0.22 | 0.20 | 0.19 | 0.18
LoBs 2, 3, 6, 7, 8 and 9: 0 in all years 16–30.
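Since the patterns are normalized, the 31 percentages of each LoB should sum to 100% up to rounding. A quick Python check for LoB 1, with the values transcribed from the two blocks above:

# Hedged check: the normalized pattern of a LoB should sum to ~100%.
lob1 = [30.18, 15.63, 5.78, 4.94, 4.43, 4.34, 4.09, 3.92, 3.66, 3.50,
        3.08, 2.64, 2.16, 1.86, 1.50, 1.30,      # years 0-15
        1.06, 0.88, 0.73, 0.64, 0.60, 0.53, 0.47, 0.44, 0.41, 0.37,
        0.29, 0.21, 0.15, 0.12, 0.10]            # years 16-30
print(round(sum(lob1), 2))  # 100.01, i.e. 100% up to the 2-digit rounding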
Table 4. Mack's standard deviation parameter estimates, s_j, based on exogenous triangles and for the development lengths given in Table 3.

Development years 0–14:

LoB | Y0 | Y1 | Y2 | Y3 | Y4 | Y5 | Y6 | Y7 | Y8 | Y9 | Y10 | Y11 | Y12 | Y13 | Y14
1 | 0.5673 | 0.2280 | 0.1922 | 0.2681 | 0.2683 | 0.3949 | 0.2652 | 0.2641 | 0.2789 | 0.3055 | 0.1458 | 0.1577 | 0.2140 | 0.1001 | 0.1016
2 | 0.6640 | 0.0659
3 | 1.3614 | 0.4921 | 0.3215 | 0.0875 | 0.0666
4 | 0.8248 | 0.4328 | 0.4021 | 0.3644 | 0.3772 | 0.2729 | 0.5268 | 0.244 | 0.2786 | 0.1559 | 0.2660 | 0.0776 | 0.0757 | 0.1220 | 0.0418
5 | 0.9914 | 0.3317 | 0.1807 | 0.1072 | 0.0740 | 0.0444 | 0.0359 | 0.0255 | 0.0190 | 0.0106 | 0.0166 | 0.0094 | 0.0040 | 0.0105 | 0.0040
6 | 0.6069 | 0.2405 | 0.0597 | 0.0371 | 0.0172
7 | 0.1053 | 0.0450 | 0.0157 | 0.0113 | 0.0091
8 | 0.3098 | 0.0737 | 0.0310 | 0.0203 | 0.0137 | 0.0051 | 0.0020 | 0.0026 | 0.0020 | 0.0014 | 0.0011
9 | 0.9163 | 0.1910 | 0.1248 | 0.0340 | 0.0258

Development years 15–29:

LoB | Y15 | Y16 | Y17 | Y18 | Y19 | Y20 | Y21 | Y22 | Y23 | Y24 | Y25 | Y26 | Y27 | Y28 | Y29
1 | 0.0466 | 0.1097 | 0.1081 | 0.0583 | 0.1353 | 0.0916 | 0.0916 | 0.0916 | 0.0916 | 0.0916 | 0.0916 | 0.0916 | 0.0916 | 0.0916 | 0.0916
4 | 0.0272 | 0.0886 | 0.0422 | 0.0190 | 0.0238 | 0.0190 | 0.0152 | 0.0122 | 0.0097 | 0.0078 | 0.0062 | 0.0050 | 0.0040 | 0.0032 | 0.0025
5 | 0.0040 (constant across all years 15–29)
LoBs 2, 3, 6, 7, 8 and 9 have no development beyond the years shown in the first block.
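For reference, the s_j above are chain ladder standard deviation estimates in the sense of Mack (1993). In the standard notation, with cumulative claims C_{i,j} and estimated development factors, the estimators read as follows (a textbook statement of Mack's formulas; the paper's exact indexing conventions may differ slightly):

\hat{f}_j = \frac{\sum_{i=0}^{I-j-1} C_{i,j+1}}{\sum_{i=0}^{I-j-1} C_{i,j}},
\qquad
\hat{s}_j^2 = \frac{1}{I-j-1} \sum_{i=0}^{I-j-1} C_{i,j} \left( \frac{C_{i,j+1}}{C_{i,j}} - \hat{f}_j \right)^2 .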
Table 5. Claims ratio, average claim amount (in millions of CHF) and market share.

LoB | Claims Ratio | Average Claim Amount | Market Share
1 | 90% | 0.005 | –
2 | 75% | 0.003 | 20%
3 | 75% | 0.004 | –
4 | 75% | 0.004 | –
5 | 90% | 0.004 | 10%
6 | 90% | 0.003 | –
7 | 90% | 0.002 | –
8 | 80% | 0.003 | –
9 | 80% | 0.003 | –
Table 6. Copula correlation matrix for the marginalized model, built from the blocks Ω_PY (Table 7), Ω_PY,CY,s (Table 8) and Ω_CY,s (Table 9):

Ω = ( Ω_PY   Ω_PY,CY,s   0_{L×P} )
    (        Ω_CY,s      0_{L×P} )
    (                    I_{P×P} )

(The matrix is symmetric; only the upper-triangular blocks are displayed.)
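As a rough illustration of how such a block matrix can be assembled, here is a sketch in Python/NumPy. It assumes nine LoBs in each of the PY and CY,s blocks and P = 5 large-claim perils, matching Tables 2 and 7–9; the identity/zero placeholders stand in for the tabulated entries:

import numpy as np

L, P = 9, 5  # assumed dimensions: 9 LoBs per block, 5 CY large-claim perils

# Placeholders; in practice these are filled with the entries of Tables 7-9
# (Tables 7 and 9 print only the upper triangle of a symmetric matrix).
Omega_PY = np.eye(L)
Omega_CY_s = np.eye(L)
Omega_PY_CY_s = np.zeros((L, L))

Omega = np.block([
    [Omega_PY,          Omega_PY_CY_s,      np.zeros((L, P))],
    [Omega_PY_CY_s.T,   Omega_CY_s,         np.zeros((L, P))],
    [np.zeros((P, L)),  np.zeros((P, L)),   np.eye(P)],
])

assert Omega.shape == (2 * L + P, 2 * L + P)
# Any valid correlation matrix must be symmetric positive semi-definite:
assert np.allclose(Omega, Omega.T) and np.linalg.eigvalsh(Omega).min() >= 0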
Table 7. Correlation block for the marginalized model: Ω_PY (upper triangle shown; the matrix is symmetric).

LoB | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
1 | 1 | 0.1517 | 0.1505 | 0.2501 | 0.5001 | 0.2501 | 0.1501 | 0.2502 | 0.2511
2 | | 1 | 0.1520 | 0.1517 | 0.1517 | 0.1517 | 0.1517 | 0.1517 | 0.2532
3 | | | 1 | 0.1505 | 0.1505 | 0.1505 | 0.1505 | 0.1505 | 0.2515
4 | | | | 1 | 0.2501 | 0.1501 | 0.1501 | 0.1501 | 0.2511
5 | | | | | 1 | 0.2501 | 0.1501 | 0.2501 | 0.2511
6 | | | | | | 1 | 0.1501 | 0.2502 | 0.2511
7 | | | | | | | 1 | 0.1502 | 0.2511
8 | | | | | | | | 1 | 0.2511
9 | | | | | | | | | 1
Table 8. Correlation block for the marginalized model: Ω_PY,CY,s.

LoB | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
1 | 0.5004 | 0.5005 | 0.1502 | 0.2505 | 0.2503 | 0.2504 | 0.1504 | 0.2506 | 0.2506
2 | 0.5046 | 0.5046 | 0.2528 | 0.1519 | 0.2528 | 0.1518 | 0.1519 | 0.1519 | 0.2529
3 | 0.1506 | 0.2509 | 0.5013 | 0.2510 | 0.1506 | 0.1506 | 0.1508 | 0.1507 | 0.2511
4 | 0.2503 | 0.1502 | 0.2503 | 0.5008 | 0.1502 | 0.1503 | 0.1504 | 0.1504 | 0.2506
5 | 0.2503 | 0.2503 | 0.1502 | 0.1503 | 0.5004 | 0.2504 | 0.1504 | 0.2506 | 0.2506
6 | 0.2503 | 0.1502 | 0.1502 | 0.1503 | 0.2503 | 0.5006 | 0.2507 | 0.2506 | 0.2506
7 | 0.1502 | 0.1503 | 0.1502 | 0.1504 | 0.1502 | 0.2505 | 0.5010 | 0.1504 | 0.2506
8 | 0.2503 | 0.1502 | 0.1502 | 0.1504 | 0.2503 | 0.2504 | 0.1504 | 0.5009 | 0.2506
9 | 0.2511 | 0.2511 | 0.2511 | 0.2513 | 0.2511 | 0.2512 | 0.2514 | 0.2513 | 0.5018
Table 9. Correlation block for the marginalized model: Ω_CY,s (upper triangle shown; the matrix is symmetric).

LoB | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
1 | 1 | 0.5006 | 0.1503 | 0.2506 | 0.2504 | 0.1504 | 0.1505 | 0.1505 | 0.2507
2 | | 1 | 0.2505 | 0.1504 | 0.2504 | 0.1504 | 0.1505 | 0.1505 | 0.2507
3 | | | 1 | 0.2506 | 0.1503 | 0.1504 | 0.1505 | 0.1505 | 0.2507
4 | | | | 1 | 0.1504 | 0.1505 | 0.1506 | 0.1506 | 0.2509
5 | | | | | 1 | 0.2505 | 0.1505 | 0.2507 | 0.2507
6 | | | | | | 1 | 0.2508 | 0.2508 | 0.2508
7 | | | | | | | 1 | 0.1507 | 0.2510
8 | | | | | | | | 1 | 0.2509
9 | | | | | | | | | 1
Table 10. Intermediate quantiles for different values of p_0.

p_0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13
0.4 | 0.6 | 0.84 | 0.936 | 0.9744 | 0.9898 | 0.9959 | | | | | | |
0.5 | 0.5 | 0.75 | 0.875 | 0.9375 | 0.9688 | 0.9844 | 0.9922 | | | | | |
0.7 | 0.3 | 0.51 | 0.657 | 0.7599 | 0.8319 | 0.8824 | 0.9176 | 0.9424 | 0.9596 | 0.9718 | 0.9802 | 0.9862 | 0.9903
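The tabulated levels are consistent with a geometric schedule l_t = 1 - p_0^t, continued until the target quantile level α = 0.99 is passed. A small Python sketch reproducing Table 10 (an inference from the tabulated values, not a statement of the authors' implementation):

# Hedged sketch: generate intermediate levels l_t = 1 - p0**t until the
# first level at or above the target alpha is reached (assumption inferred
# from Table 10, whose entries match this schedule exactly).
def intermediate_levels(p0, alpha=0.99):
    levels, t = [], 1
    while True:
        level = 1 - p0 ** t
        levels.append(level)
        if level >= alpha:
            return levels
        t += 1

for p0 in (0.4, 0.5, 0.7):
    print(p0, [round(l, 4) for l in intermediate_levels(p0)])
# 0.4 -> [0.6, 0.84, 0.936, 0.9744, 0.9898, 0.9959]
# 0.5 -> [0.5, 0.75, 0.875, 0.9375, 0.9688, 0.9844, 0.9922]
# 0.7 -> [0.3, 0.51, 0.657, ..., 0.9862, 0.9903]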
