Article

Estimating the Volatility of Non-Life Premium Risk Under Solvency II: Discussion of Danish Fire Insurance Data

by
Rocco Roberto Cerchiara
1 and
Francesco Acri
2,*
1
Department of Economics, Statistics and Finance “Giovanni Anania”, University of Calabria, 87036 Arcavacata di Rende (CS), Italy
2
Independent Researcher, 20135 Milan, Italy
*
Author to whom correspondence should be addressed.
Risks 2020, 8(3), 74; https://doi.org/10.3390/risks8030074
Submission received: 6 July 2019 / Revised: 24 June 2020 / Accepted: 26 June 2020 / Published: 6 July 2020
(This article belongs to the Special Issue Capital Requirement Evaluation under Solvency II framework)

Abstract

We studied the volatility assumption for non-life premium risk under the Solvency II Standard Formula and developed an empirical model on real data, the Danish fire insurance data. Our empirical model accomplishes two things. First, compared to the existing literature, this paper innovates the fitting of the Danish fire insurance data by using a composite model with a random threshold. Second, by fitting the Danish fire insurance data, we show that for large insurance companies the volatility of the Standard Formula is higher than the volatility estimated with internal models such as composite models, also taking into account the dependence between attritional and large claims.
JEL Classification:
65C05; 65T50; 68U99; 65C50

1. Introduction

A non-life insurance company faces, among other risks, premium risk, which is the risk of financial losses related to earned premiums. The risk arises from uncertainty in the severity, frequency and even timing of claims incurred during the period of exposure. For an operating non-life insurer, premium risk is a key driver of uncertainty from both an operational and a solvency perspective. With regard to the solvency perspective, there are many methods that can give a correct view of the capital needed to meet adverse outcomes related to premium risk. In particular, the evaluation of the distribution of the aggregate loss plays a fundamental role in the analysis of risk and solvency levels.
As shown in Cerchiara and Demarco (2016), the Solvency II standard formula for premium and reserve risk defined by the Delegated Acts (DA; see European Commission 2015) gives the following solvency capital requirement (SCR):
$$SCR = 3\,\sigma\,V \tag{1}$$
where V denotes the volume measure for non-life premium and reserve risk, net of reinsurance, determined in accordance with Article 116 of the DA, and σ is the volatility (coefficient of variation) for non-life premium and reserve risk determined in accordance with Article 117 of the DA, obtained by combining the segment volatilities σ_s through the correlation matrix between segments. Each σ_s is calculated as follows:
$$\sigma_s = \frac{\sqrt{\sigma_{(prem,s)}^2 V_{(prem,s)}^2 + \sigma_{(prem,s)} V_{(prem,s)}\,\sigma_{(res,s)} V_{(res,s)} + \sigma_{(res,s)}^2 V_{(res,s)}^2}}{V_{(prem,s)} + V_{(res,s)}} \tag{2}$$
In this paper we focus our attention on σ_(prem,s), i.e., the coefficient of variation of the premium risk for the Fire segment. Under the DA, the premium risk volatility of this segment is equal to 8%. As shown in Clemente and Savelli (2017), Equation (1) "implicitly assumes to measure the difference between the Value at Risk (VaR) at 99.5% confidence level and the mean of the probability distribution of aggregate claims amount by using a fixed multiplier of the volatility equal to 3 for all insurers. From a practical point of view, DA multiplier does not take into account the skewness of the distribution with a potential underestimation of capital requirement for small insurers and an overestimation for big insurers".
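For concreteness, the aggregation in Equations (1) and (2) can be reproduced with a few lines of R; the volumes and the reserve-risk volatility below are purely illustrative assumptions, not values taken from this paper or from the DA calibration.

```r
# Hypothetical inputs for a single segment s (illustrative values only)
sigma_prem <- 0.08   # premium risk volatility of the Fire segment under the DA
sigma_res  <- 0.10   # reserve risk volatility (assumed for illustration)
V_prem     <- 100    # premium volume measure (assumed, e.g., in EUR million)
V_res      <- 40     # reserve volume measure (assumed, e.g., in EUR million)

# Equation (2): combined volatility of segment s
sigma_s <- sqrt(sigma_prem^2 * V_prem^2 +
                sigma_prem * V_prem * sigma_res * V_res +
                sigma_res^2 * V_res^2) / (V_prem + V_res)

# Equation (1): SCR with the fixed DA multiplier of 3
SCR <- 3 * sigma_s * (V_prem + V_res)
c(sigma_s = sigma_s, SCR = SCR)
```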
Insurance undertakings that do not consider the standard formula assumptions appropriate may calculate the Solvency Capital Requirement using an Undertaking Specific Parameters approach (USP; see Cerchiara and Demarco 2016) or a Full or Partial Internal Model (PIM), after approval by the Supervisory Authorities. Calculating the volatility and the VaR of independent or dependent risky positions with a PIM is very difficult for large portfolios. In the literature, many studies are based on composite models that aim to analyze the loss distribution and the dependence between the main factors characterizing the risk profile of insurance companies, e.g., frequency and severity, attritional and large claims and so forth. Among more recent developments, Galeotti (2015) proves the convergence of a geometric algorithm (alternative to Monte Carlo and quasi-Monte Carlo methods) for computing the Value-at-Risk of a portfolio of any dimension, i.e., the distribution of the sum of its components, which can exhibit any dependence structure.
In order to implement a PIM and investigate the overestimation of the SCR (and of the underlying volatility) for large insurers, we used the Danish fire insurance dataset1, which has often been analyzed with parametric approaches and composite models. McNeil (1997), Resnick (1997), Embrechts et al. (1997) and McNeil et al. (2005) proposed fitting this dataset using Extreme Value Theory and Copula functions (see Klugman et al. 2010 for more details on the latter), with special reference to modeling the tail of the distribution. Cooray and Ananda (2005) and Scollnik (2007) showed that the composite lognormal-Pareto model fits better than standard univariate models. Following these two papers, Teodorescu and Vernic (2009, 2013) fit the dataset first with a composite Exponential-Pareto distribution and then with a more general composite Pareto model obtained by replacing the Lognormal distribution with an arbitrary continuous distribution, while Pigeon and Denuit (2011) considered a positive random variable as the threshold value in the composite model. There have been several other approaches to modeling this dataset, including a Burr distribution for claim severity in the XploRe computing environment (Burnecki and Weron 2004), Bayesian estimation of finite time ruin probabilities (Ausin et al. 2009), hybrid Pareto models (Carreau and Bengio 2009), beta kernel quantile estimation (Charpentier and Oulidi 2010) and a bivariate compound Poisson process (Esmaeili and Klüppelberg 2010). An example of non-parametric modeling is shown in Guillotte et al. (2011) with Bayesian inference on bivariate extremes. Drees and Müller (2008) showed how to model dependence within joint tail regions. Nadarajah and Bakar (2014) improved the fittings for the Danish fire insurance data using various new composite models, including the composite Lognormal–Burr model.
Following this literature, this paper innovates the fitting of the Danish fire insurance data by using the Pigeon and Denuit (2011) composite model with a random threshold, which achieves a higher goodness-of-fit than the Nadarajah and Bakar (2014) model. Once the best model is defined, we show that the Standard Formula assumption is prudent, especially for large insurance companies, giving an overestimated volatility of the premium risk (and of the SCR). For illustrative purposes, we also investigate the use of other models, including Copula functions and the Fast Fourier Transform (FFT; Robe-Voinea and Vernic 2016), trying to take into account the dependence between attritional and large claims and to understand the effect on the SCR.
The paper is organized as follows. In Section 2 we report some statistical characteristics of the Danish data. In Section 3 and Section 4 we assume there is no dependence between attritional and large claims: we investigate the use of composite models with fixed or random thresholds to fit the Danish fire insurance data, and we compare our numerical results with the fitting of Nadarajah and Bakar (2014) based on a composite Lognormal–Burr model. In Section 5 we try to capture risk dependence through Copula functions and the FFT, for which Robe-Voinea and Vernic (2016) provide an overview and a multidimensional application. Section 6 presents the estimation of the aggregate loss distribution and its volatility, compares the results under independence and dependence assumptions, and concludes the work.

2. Data

In the following, we show some statistics of the dataset used in this analysis. The losses of individual fires covered in Denmark were registered by the reinsurance company Copenhagen Re and, for our study, have been converted into euros. It is worth mentioning that the original dataset (also available in R) covers the period 1980–1990. In 2003–2004, Mette Havning (Chief Actuary of Danish Reinsurance) was on the ASTIN committee, where she met Tine Aabye from Forsikring & Pension. Aabye asked her colleague to send the Danish million re-losses from 1985–2002 to Mette Havning. Based on the two versions covering 1980–1990 and 1985–2002, Havning then built an extended version of the Danish fire insurance data from 1980 through 2002 with only a few ambiguities in the overlapping period. The data were communicated to us by Mette Havning and consist of 6870 claims over a period of 23 years. We bring to the reader's attention that, to avoid distortive effects due to the use of the entire historical series starting in 1980, the claim amounts have been inflated to 2002 values. In addition, we referred to a wider dataset that also includes small losses, unlike the one used by McNeil (1997), among others. In fact, we want to study the entire distribution of this dataset, whereas McNeil (1997) and other works focused especially on the right tail of the distribution. We list some descriptive statistics in Table 1.
The maximum observed loss was around €55 million and the average cost was about €613,100. The empirical distribution is strongly leptokurtic and right-skewed.
To make the application of composite models and Copula functions easier, we will suppose that the claim frequency k is non-random, while for the Fast Fourier Transform algorithm we consider the frequency as a random variable. The losses have been split by year, so we can report some descriptive statistics for k in Table 2.
We note that 50% of the annual claim counts lie between 238 and 381, and there is a slight negative asymmetry. In addition, the variance is greater than the mean value (299).

3. Composite Models

In the Danish fire insurance data we find both frequent claims with low to medium severity and sporadic claims with high severity. If we want to define a single distribution covering both types of claims, we have to build a composite model.
A composite model is a combination of two different models: one with a light tail below a threshold (attritional claims) and another with a heavy tail suitable for modeling the values that exceed this threshold (large claims). Composite distributions (also known as spliced or piecewise distributions) have been introduced in many applications. Klugman et al. (2010) expressed the probability density function of a composite distribution as
$$f(x) = \begin{cases} r_1 f_1^*(x), & k_0 < x < k_1 \\ \;\;\vdots & \\ r_n f_n^*(x), & k_{n-1} < x < k_n \end{cases} \tag{3}$$
where f_j^* is the truncated probability density function obtained from the marginal density f_j, j = 1, …, n; the r_j ≥ 0 are mixing weights with Σ_{j=1}^n r_j = 1; and the k_j define the limits of the domain.
Formally, the density of a composite model with two components can be written as the following special case of (3):
$$f(x) = \begin{cases} r\, f_1^*(x), & -\infty < x \le u \\ (1-r)\, f_2^*(x), & u < x < \infty \end{cases} \tag{4}$$
where r ∈ [0, 1], and f_1^* and f_2^* are the truncated densities obtained from the marginals f_1 and f_2, respectively. In detail, if F_i is the distribution function corresponding to f_i, i = 1, 2, then we have
$$f_1^*(x) = \frac{f_1(x)}{F_1(u)}, \;\; -\infty < x \le u, \qquad f_2^*(x) = \frac{f_2(x)}{1 - F_2(u)}, \;\; u < x < \infty \tag{5}$$
It is simple to note that (4) is a convex combination of f_1^* and f_2^* with weights r and 1 − r. In addition, we want (4) to be continuous and differentiable, with a continuous first derivative at the threshold; to this end we impose the following conditions:
$$\lim_{x \to u} f(x) = f(u), \qquad \lim_{x \to u^-} f'(x) = \lim_{x \to u^+} f'(x) \tag{6}$$
From the first condition we obtain
$$r = \frac{f_2(u)\,F_1(u)}{f_2(u)\,F_1(u) + f_1(u)\,\big(1 - F_2(u)\big)} \tag{7}$$
while from the second
$$r = \frac{f_2'(u)\,F_1(u)}{f_2'(u)\,F_1(u) + f_1'(u)\,\big(1 - F_2(u)\big)} \tag{8}$$
We can now define the distribution function F corresponding to (4):
$$F(x) = \begin{cases} r\,\dfrac{F_1(x)}{F_1(u)}, & -\infty < x \le u \\[1.5ex] r + (1-r)\,\dfrac{F_2(x) - F_2(u)}{1 - F_2(u)}, & u < x < \infty \end{cases} \tag{9}$$
Suppose F_1 and F_2 admit inverse functions; then we can obtain the quantile function via the inversion method. Let p be a random number from a standard Uniform distribution; the quantile function is
$$F^{-1}(p) = \begin{cases} F_1^{-1}\!\left(\dfrac{p}{r}\,F_1(u)\right), & p \le r \\[1.5ex] F_2^{-1}\!\left(\dfrac{p - r + (1-p)\,F_2(u)}{1 - r}\right), & p > r \end{cases} \tag{10}$$
To estimate the parameters of (9) we proceed as follows: first, we estimate the parameters of the marginal densities separately (under the hypothesis that there is no relation between attritional and large claims); these estimates are then used as starting values to maximize the following likelihood:
$$L(x_1, \ldots, x_n; \theta) = r^m (1-r)^{n-m} \prod_{i=1}^{m} f_1^*(x_i) \prod_{j=m+1}^{n} f_2^*(x_j) \tag{11}$$
where n is the sample size, θ is the vector of composite model parameters, and m is the number of observations not exceeding the threshold, i.e., the index of the order statistic X_(m) immediately preceding (or coinciding with) u, so that X_(m) ≤ u ≤ X_(m+1).
The methodology described in Teodorescu and Vernic (2009, 2013) has been used to estimate the threshold u, which allows us to discriminate between attritional and large claims.
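As a minimal sketch of how Equations (7), (10) and (11) translate into practice, the following R code simulates from a composite Lognormal–GPD model with a fixed threshold via the inversion method; all parameter values are illustrative assumptions, not the estimates reported in Section 4.

```r
set.seed(1)

# Illustrative composite Lognormal-GPD with a fixed threshold u (assumed values)
mu <- 12.8; sg <- 0.6          # Lognormal parameters of f1
gbeta <- 1.1e6; xi <- 0.45     # GPD scale and shape of f2 (GPD located at 0)
u <- 1.02e6                    # threshold between attritional and large claims

F1 <- function(x) plnorm(x, mu, sg)
f1 <- function(x) dlnorm(x, mu, sg)
F2 <- function(x) 1 - (1 + xi * x / gbeta)^(-1/xi)
f2 <- function(x) (1/gbeta) * (1 + xi * x / gbeta)^(-1/xi - 1)

# Mixing weight from the continuity condition, Equation (7)
r <- f2(u) * F1(u) / (f2(u) * F1(u) + f1(u) * (1 - F2(u)))

# Quantile function of the composite model, Equation (10)
qcomp <- function(p) {
  x <- numeric(length(p))
  low <- p <= r
  x[low]  <- qlnorm(p[low] / r * F1(u), mu, sg)
  q2      <- (p[!low] - r + (1 - p[!low]) * F2(u)) / (1 - r)
  x[!low] <- gbeta / xi * ((1 - q2)^(-xi) - 1)   # inverse of the GPD distribution function
  x
}

losses_sim <- qcomp(runif(1e5))                  # pseudo-random sample from the composite model
quantile(losses_sim, c(0.5, 0.95, 0.995))
```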

3.1. Composite Model with Random Threshold

We can also define a composite model using a random threshold (see Pigeon and Denuit 2011). In particular, given a random sample X = (X_1, …, X_n), we assume that every component X_i carries its own threshold, so that for the generic observation x_i we have a threshold u_i, i = 1, …, n. The values u_1, …, u_n are therefore realizations of a random variable U with distribution function G; the random variable U is necessarily non-negative and has a heavy-tailed distribution.
A composite model with a random threshold introduces a completely new aspect: we no longer choose a single value for u; rather, the whole distribution of the threshold and its parameters are implicit in the definition of the composite model. In particular, we define the density function of the Lognormal–Generalized Pareto Distribution (GPD; see Embrechts et al. 1997) model with a random threshold in the following way:
$$f(x) = (1-r)\int_0^x f_2(x)\,g(u)\,du + r\int_x^{\infty} \frac{1}{\Phi(\xi\,\sigma)}\,f_1(x)\,g(u)\,du \tag{12}$$
where r ∈ [0, 1], U is the random threshold with density function g, f_1 and f_2 are the Lognormal and GPD density functions, respectively, Φ is the Standard Normal distribution function, ξ is the GPD shape parameter and σ is the Lognormal scale parameter.
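The density in Equation (12) has no closed form in general and must be evaluated numerically. The sketch below illustrates only the general idea of a random threshold, averaging the fixed-threshold composite density of Equation (4) over a Gamma threshold distribution; it is not the exact Pigeon–Denuit specification used later, and every parameter value is an assumption.

```r
# Schematic random-threshold composite density: integrate the fixed-threshold
# composite density (Equations (4)-(7)) against a Gamma density for the threshold U.
mu <- 12.8; sg <- 0.55          # Lognormal parameters (assumed)
gbeta <- 1.1e6; xi <- 0.3       # GPD scale and shape (assumed)
a_g <- 20; s_g <- 3e4           # Gamma shape/scale of the threshold U (assumed)

f1 <- function(x) dlnorm(x, mu, sg); F1 <- function(x) plnorm(x, mu, sg)
f2 <- function(x) (1/gbeta) * (1 + xi * x / gbeta)^(-1/xi - 1)
F2 <- function(x) 1 - (1 + xi * x / gbeta)^(-1/xi)

# Fixed-threshold composite density for a given threshold u
fcomp <- function(x, u) {
  r <- f2(u) * F1(u) / (f2(u) * F1(u) + f1(u) * (1 - F2(u)))
  ifelse(x <= u, r * f1(x) / F1(u), (1 - r) * f2(x) / (1 - F2(u)))
}

# Random-threshold density: average fcomp(x, u) over the Gamma distribution of U
frand <- function(x) {
  sapply(x, function(xx)
    integrate(function(u) fcomp(xx, u) * dgamma(u, shape = a_g, scale = s_g),
              lower = qgamma(1e-8, a_g, scale = s_g),
              upper = qgamma(1 - 1e-8, a_g, scale = s_g))$value)
}

frand(c(2e5, 6e5, 2e6))          # density evaluated at a few loss amounts
```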

3.2. Kumaraswamy Distribution and Some Generalizations

In this section we describe the Kumaraswamy distribution (see Kumaraswamy 1980) and a generalization involving the Gumbel distribution (see Cordeiro et al. 2012). In particular, let
$$K(x; \alpha, \beta) = 1 - (1 - x^{\alpha})^{\beta}, \quad x \in (0, 1) \tag{13}$$
be the distribution function proposed in Kumaraswamy (1980), where α and β are non-negative shape parameters. If G is the distribution function of a random variable, then we can define a new distribution by
$$F(x; a, b) = 1 - \big(1 - G(x)^a\big)^b \tag{14}$$
where a > 0 and b > 0 are shape parameters that influence kurtosis and skewness. The Kumaraswamy–Gumbel (KumGum) distribution is defined through (14) with the following distribution function (see Cordeiro et al. 2012):
$$F_{KG}(x; a, b) = 1 - \big(1 - \Lambda(x)^a\big)^b \tag{15}$$
where Λ(x) = exp{−exp[−(x − v)/φ]} is the Gumbel distribution function with location parameter v and scale parameter φ. The quantile function of the KumGum distribution is obtained by inverting (15) and making the Gumbel parameters explicit:
$$x_p = F^{-1}(p) = v - \varphi\,\log\!\left\{-\log\!\left[\Big(1 - (1-p)^{1/b}\Big)^{1/a}\right]\right\} \tag{16}$$
with p ∈ (0, 1).
Table 3 and Figure 1 show the kurtosis and skewness of the KumGum distribution for different values of the four parameters.
Another generalization of the Kumaraswamy distribution is the Kumaraswamy–Pareto (KumPareto) distribution. In particular, we can evaluate Equation (14) at the Pareto distribution function P, which is
$$P(x; \beta, \kappa) = 1 - \left(\frac{\beta}{x}\right)^{\kappa}, \quad x \ge \beta \tag{17}$$
where β > 0 is a scale parameter and κ > 0 is a shape parameter. Thus, from (13), (14) and (17) we obtain the KumPareto distribution function:
$$F_{KP}(x; \beta, \kappa, a, b) = 1 - \left\{1 - \left[1 - \left(\frac{\beta}{x}\right)^{\kappa}\right]^a\right\}^b, \quad x \ge \beta \tag{18}$$
The corresponding quantile function is
$$F^{-1}(p) = \beta\left[1 - \Big(1 - (1-p)^{1/b}\Big)^{1/a}\right]^{-1/\kappa} \tag{19}$$
where p ∈ (0, 1). Figure 2 reports the KumPareto density function for different parameter values.
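Both quantile functions (16) and (19) lend themselves directly to inverse-transform sampling. The short R sketch below uses illustrative parameter values only:

```r
set.seed(123)

# KumGum quantile function, Equation (16): Gumbel location v and scale phi,
# Kumaraswamy shape parameters a and b (all default values are assumptions)
qkumgum <- function(p, v = 0, phi = 1, a = 2, b = 3) {
  v - phi * log(-log((1 - (1 - p)^(1/b))^(1/a)))
}

# KumPareto quantile function, Equation (19): Pareto scale beta and shape kappa
qkumpareto <- function(p, beta = 1, kappa = 2, a = 2, b = 3) {
  beta * (1 - (1 - (1 - p)^(1/b))^(1/a))^(-1/kappa)
}

p <- runif(1e5)
x_gum <- qkumgum(p)       # sample from a KumGum distribution
x_par <- qkumpareto(p)    # sample from a KumPareto distribution
c(mean(x_par), quantile(x_par, 0.99))
```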

4. Numerical Example of Composite Models

In this section we present numerical results on the fitting of the Danish fire insurance data with composite models with constant and random thresholds between attritional and large claims. As already mentioned, for the composite models with a constant threshold we used the methodology described in Teodorescu and Vernic (2009, 2013), obtaining u = €1,022,125.
We start with a composite Lognormal–KumPareto model, choosing f_1 ∼ Lognormal and f_2 ∼ KumPareto. In Table 4 we compare some theoretical and empirical quantiles.
Only the fiftieth percentile of the theoretical distribution function is very close to the corresponding empirical quantile: from this percentile onwards the differences increase. Figure 3 shows only the right tails of the empirical and theoretical distribution functions.
The red line is always above the dark line. This means that, although Kumaraswamy-generalized families of distributions are very versatile in analyzing different types of data, in this case the Lognormal–KumPareto model underestimates the right tail.
Therefore, we consider the composite model with f_1 ∼ Lognormal and f_2 ∼ Burr, as suggested in Nadarajah and Bakar (2014). The parameters are estimated using the CompLognormal R package, as shown in Nadarajah and Bakar (2014). In Table 5 we compare some theoretical quantiles with the empirical ones.
This model seems better able to catch the right tail of the empirical distribution than the previous Lognormal–KumPareto model, as we can see from Figure 4.
As with the Lognormal–KumPareto model, the Lognormal–Burr distribution line is always above the empirical distribution line, but not always at the same distance.
We go forward by modeling a composite Lognormal–Generalized Pareto Distribution (GPD), that is, we choose f_1 ∼ Lognormal and f_2 ∼ GPD and then generate pseudo-random numbers from the quantile function (10). In Table 6 and Figure 5 we report the parameter estimates, 99% confidence intervals and the QQ plot (μ_1 and σ are the Lognormal parameters, while σ_μ and ξ are the GPD parameters).
We observe that this composite model adapts well to the empirical distribution; in fact, except for a few points, the theoretical quantiles are close to the corresponding empirical quantiles. In Figure 6 and Figure 7 we compare the theoretical cut-off density function with the corresponding empirical one, and the theoretical right tail with the empirical one.
The model exhibits a non-negligible right tail (the kurtosis index is 115,656.2), which can be evaluated by comparing the observed distribution function with the theoretical one.
The corresponding Kolmogorov–Smirnov test returned a p-value equal to 0.8590423, using 50,000 bootstrap samples.
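For readers wishing to reproduce this kind of fit, the likelihood in Equation (11) can be maximized numerically. The sketch below is only illustrative: it assumes the losses are split at the fixed threshold u, parameterizes the GPD on the whole positive axis (truncated above u as in Equation (5)), and is not the exact estimation code behind Table 6.

```r
# Sketch: maximum likelihood for the composite Lognormal-GPD with fixed threshold u.
# 'losses' is assumed to be the vector of claim amounts (e.g., the Danish data).
fit_lnorm_gpd <- function(losses, u) {
  small <- losses[losses <= u]; large <- losses[losses > u]
  m <- length(small); n <- length(losses)

  negloglik <- function(par) {
    mu <- par[1]; sg <- exp(par[2]); gbeta <- exp(par[3]); xi <- par[4]
    F1u <- plnorm(u, mu, sg);  f1u <- dlnorm(u, mu, sg)
    F2u <- 1 - (1 + xi * u / gbeta)^(-1/xi)
    f2u <- (1/gbeta) * (1 + xi * u / gbeta)^(-1/xi - 1)
    r <- f2u * F1u / (f2u * F1u + f1u * (1 - F2u))          # Equation (7)
    if (!is.finite(r) || r <= 0 || r >= 1 || xi <= 0) return(1e10)
    # Log of Equation (11): truncated Lognormal below u, truncated GPD above u
    ll <- m * log(r) + (n - m) * log(1 - r) +
      sum(dlnorm(small, mu, sg, log = TRUE)) - m * log(F1u) +
      sum(log((1/gbeta) * (1 + xi * large / gbeta)^(-1/xi - 1))) -
      (n - m) * log(1 - F2u)
    if (!is.finite(ll)) return(1e10)
    -ll
  }
  optim(c(mean(log(losses)), log(sd(log(losses))), log(u), 0.5), negloglik)
}
# fit <- fit_lnorm_gpd(losses, u = 1022125)
```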
Then, in Table 7 we report the best estimates and 99% confidence intervals of the composite Lognormal–GPD model with a Gamma random threshold u (see Pigeon and Denuit 2011).
The threshold u is now a quantity whose value depends on the Gamma parameters. In Table 8 and Figure 8 we report the theoretical and empirical quantiles and the QQ plot.
We can see from Figure 9 that the Lognormal–GPD–Gamma model can be considered a good fitting model.
The Kolmogorov–Smirnov goodness-of-fit test returned a p-value equal to 0.1971361; therefore, we cannot reject the null hypothesis that the investigated model is a feasible model for our data.
Finally, the Lognormal–KumPareto, Lognormal–Burr, Lognormal–GPD with fixed threshold and Lognormal–GPD with Gamma random threshold models can be compared using the AIC and BIC values in Table 9.
The previous analysis suggests that the Lognormal–GPD–Gamma model gives the best fit.

5. Introducing Dependence Structure: Copula Functions and Fast Fourier Transform

In the previous section we restricted our analysis to the case of independence between attritional and large claims. We now try to extend this work to a dependence structure. First, we define a composite model using a copula function to evaluate the possible dependence. As marginal distributions, we refer to a Lognormal distribution for attritional claims and a GPD for large ones. The empirical correlation matrix R,
$$R = \begin{pmatrix} 1 & 0.01259155 \\ 0.01259155 & 1 \end{pmatrix}$$
and Kendall's Tau and Spearman's Rho measures of association,
$$K = \begin{pmatrix} 1 & 0.00252667 \\ 0.00252667 & 1 \end{pmatrix}, \qquad S = \begin{pmatrix} 1 & 0.00373077 \\ 0.00373077 & 1 \end{pmatrix}$$
suggest a weak but positive correlation between attritional and large claims.
For this reason, identifying an appropriate copula function will not be easy, but we present an illustrative example based on the Gumbel Copula. We underline that an empirical dependence structure is induced by the distinction between attritional and large losses. In fact, there is no single event that causes small and large losses simultaneously; when an insured event occurs, either an attritional or a large loss is produced. For this reason, the results shown in the following should be considered as an exercise that highlights the important effects of dependence on the aggregate loss distribution. The Gumbel Copula is defined as
$$C_{\theta}(u, v) = \exp\!\left\{-\left[(-\ln u)^{\theta} + (-\ln v)^{\theta}\right]^{1/\theta}\right\}, \quad 1 \le \theta < \infty$$
Table 10 reports the different methods to estimate the parameter θ :
We recall that the Gumbel parameter θ takes values in [1, ∞), and for θ = 1 we have independence between the marginal distributions. We observe that the estimates are significantly different from 1, so our Gumbel Copula does not correspond to the Independence Copula. We can say this because we verified, using a bootstrap procedure, that the θ parameter is Normally distributed: the Shapiro–Wilk test gave a p-value equal to 0.08551; thus, at a significance level of 5%, it is not possible to reject the null hypothesis of Normality. In addition, the 99% confidence interval obtained with the maximum pseudo-likelihood method was (1.090662; 1.131003), which does not include the value 1; the same confidence interval was obtained with the bootstrap procedure. In Figure 10 we report the distribution of the Gumbel parameter obtained by the bootstrap procedure.
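As an illustration of how such estimates can be obtained, the maximum pseudo-likelihood fit and the parametric-bootstrap goodness-of-fit test can be run with the copula R package. The two-column matrix claims2d, pairing attritional and large losses, is an assumption of this sketch (the pairing itself is the artificial construction discussed above).

```r
library(copula)

# 'claims2d' is an assumed two-column matrix: column 1 attritional losses,
# column 2 large losses (hypothetical pairing, built as discussed in the text)
u_hat <- pobs(claims2d)                                   # pseudo-observations (ranks)
fit   <- fitCopula(gumbelCopula(dim = 2), u_hat, method = "mpl")
summary(fit)                                              # theta estimate and standard error

# Parametric-bootstrap goodness-of-fit test for the Gumbel Copula
gof <- gofCopula(gumbelCopula(dim = 2), claims2d, N = 1000)
gof$p.value
```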
We report two useful graphics (Figure 11 and Figure 12), obtained by simulation from the estimated Gumbel Copula.
The density function (Figure 12) takes larger values where both the Lognormal and the GPD marginals take large values; in other words, under the Gumbel Copula, the probability that attritional claims produce losses near the threshold u while large claims produce extreme losses is greater than the probability of any other joint event.
We also report the result of the parametric bootstrap goodness-of-fit test performed on the estimated Gumbel Copula: Statistic = 2.9381, θ = 1.1108, p-value = 0.00495.
We can consider the estimated Gumbel Copula a good approximation of the dependence in the data. In our numerical examples we referred to the Gumbel Copula, although we also estimated and analyzed other copulas, for which there was no significant difference for the aims of this paper. While the empirical dependence is not strong, we will see how the introduction into the estimation model of a factor that takes it into account, such as a Copula function, produces a non-negligible impact on the estimate of the VaR.

An Alternative to the Copula Function: The Fast Fourier Transform

Considering that it is not easy to define an appropriate copula for this dataset, we next model the aggregate loss distribution directly with the Fast Fourier Transform (FFT) using empirical data. This approach allows us to avoid a parametric assumption on the dependence between attritional and large claims (necessary instead with the copula approach).
To build an aggregate loss distribution by FFT, it is first necessary to discretize the severity distribution Z (see Klugman et al. 2010) and obtain the vector z = (z_0, …, z_{n−1}), whose element z_i is the probability that a single claim produces a loss equal to i·c, where c is a fixed constant chosen so that, given the length n of the vector z, the loss c·n has negligible probability. We also take the claim frequency distribution k̃ into account through its Probability-Generating Function (PGF), defined as
$$PGF_{\tilde{k}}(t) = \sum_{j=0}^{\infty} t^j \Pr(\tilde{k} = j) = E\big[t^{\tilde{k}}\big]$$
In particular, let FFT(z) and IFFT(z) denote the FFT and its inverse, respectively. We obtain the discretized probability distribution of the aggregate loss X as
$$(x_0, x_1, \ldots, x_{n-1}) = IFFT\big(PGF_{\tilde{k}}(FFT(z))\big)$$
Both FFT(z) and IFFT(z) are n-dimensional vectors whose generic elements are, respectively,
$$\hat{z}_k = \sum_{j=0}^{n-1} z_j \exp\!\left(\frac{2\pi i}{n}\,jk\right), \qquad z_k = \frac{1}{n}\sum_{j=0}^{n-1} \hat{z}_j \exp\!\left(-\frac{2\pi i}{n}\,jk\right)$$
where i = √−1.
From a theoretical point of view, this is a discretized version of the Fourier Transform:
$$\phi(z) = \int_{-\infty}^{+\infty} f(x)\,\exp(izx)\,dx$$
The characteristic function associates a probability density function with a continuous complex-valued function, while the DFT associates an n-dimensional vector with an n-dimensional complex vector. The latter one-to-one association can be computed efficiently through the FFT algorithm.
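A compact univariate illustration of the IFFT(PGF(FFT(z))) recipe in R is given below. The discretized severity (a crudely discretized Gamma) and the Negative Binomial frequency parameters are assumptions made for this toy example; the paper itself discretizes the empirical severity with actuar::discretize.

```r
# Toy discretized severity: z[i+1] = Pr(single loss = i * c_step), i = 0,...,n-1
n <- 2^12
c_step <- 10000                                     # monetary step of the discretization
z <- dgamma((0:(n - 1)) * c_step + c_step / 2,      # assumed Gamma severity, crude midpoint rule
            shape = 1.5, scale = 2e5)
z <- z / sum(z)                                     # normalize to a probability vector

# Assumed Negative Binomial frequency with PGF ((1-p)/(1-p*t))^m
m_nb <- 5; p_nb <- 0.82
pgf_nb <- function(t) ((1 - p_nb) / (1 - p_nb * t))^m_nb

z_hat <- fft(z)                                     # forward FFT of the severity vector
x <- Re(fft(pgf_nb(z_hat), inverse = TRUE)) / n     # IFFT(PGF(FFT(z))): aggregate loss probabilities

sum(x)                                              # ~1 (up to discretization/aliasing error)
sum((0:(n - 1)) * c_step * x)                       # mean aggregate loss on the discretized grid
```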
For the two-dimensional case, a matrix M_Z is the necessary input; this matrix contains the joint probabilities of attritional and large claims, such that the corresponding marginal distributions can be obtained by summing along rows and columns, respectively. For example, let
$$M_Z = \begin{pmatrix} 0.5 & 0 & 0 \\ 0.2 & 0.25 & 0 \\ 0 & 0.05 & 0 \end{pmatrix}$$
be that matrix. The vector (0.5, 0.45, 0.05), obtained by summing along the three rows, contains the marginal distribution of attritional claims, while the vector (0.7, 0.3, 0), obtained by summing along the three columns, contains the marginal distribution of large claims. Each single element of the matrix is instead a joint probability. The aggregate loss distribution will be a matrix M_X given by
$$M_X = IFFT\big(PGF_{\tilde{k}}(FFT(M_Z))\big)$$
For more mathematical details, we refer to Robe-Voinea and Vernic (2016) and Robe-Voinea and Vernic (2017), in which the FFT is extended to a multivariate setting and several numerical examples are illustrated.
We decided to discretize the observed distribution function, without reference to a specific theoretical distribution, using the discretize R function available in the actuar package (see Klugman et al. 2010). This discretization allows us to build the matrix M_Z, to which we applied the two-dimensional version of the FFT. In this way, we obtained a new matrix FFT(M_Z) that acts as input to the probability generating function of the random frequency k̃.
As reported in Section 2, in our dataset 50% of the annual claim counts lie between 238 and 381, there is a slight negative asymmetry and the variance is greater than the mean value (299). Thus, it is possible to assume a Negative Binomial distribution for the claim frequency. The corresponding probability generating function is defined by
$$PGF_{\tilde{k}}(t) = \left(\frac{1-p}{1-p\,t}\right)^{m}$$
We estimated its parameters, obtaining m = 5 and p = 0.82. We then computed the matrix PGF_k̃(FFT(M_Z)). As the last step we applied the IFFT, whose output is the matrix M_X. Summing along the anti-diagonals of M_X, we obtain the discretized probability distribution of the aggregate loss, having maintained the distinction between attritional and large claims and, above all, having preserved the dependence structure.
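The two-dimensional step can be sketched in R as follows, using the small illustrative matrix M_Z above and the Negative Binomial PGF; the zero-padding size and everything else in this block are assumptions for illustration, not the actual grid used in the paper.

```r
# Toy joint severity matrix M_Z from the example above (rows: attritional units,
# columns: large-claim units, both in steps of an assumed monetary unit)
Mz <- matrix(c(0.5, 0,    0,
               0.2, 0.25, 0,
               0,   0.05, 0), nrow = 3, byrow = TRUE)

# Assumed Negative Binomial frequency PGF with m = 5, p = 0.82
m_nb <- 5; p_nb <- 0.82
pgf_nb <- function(t) ((1 - p_nb) / (1 - p_nb * t))^m_nb

# Zero-pad to a larger grid so that wrap-around (aliasing) is negligible
n <- 256
Mz_pad <- matrix(0, n, n)
Mz_pad[1:3, 1:3] <- Mz

Mz_hat <- fft(Mz_pad)                                   # two-dimensional FFT (stats::fft on a matrix)
Mx <- Re(fft(pgf_nb(Mz_hat), inverse = TRUE)) / length(Mz_pad)

# Aggregate loss: sum along the anti-diagonals of M_X, i.e., group cells with the
# same total loss (attritional units + large units)
total <- row(Mx) + col(Mx) - 2
agg <- tapply(as.vector(Mx), as.vector(total), sum)
round(head(agg, 6), 4)                                  # probabilities of total losses 0,1,2,... units
sum(agg)                                                # ~1
```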

6. Final Results and Discussion

As shown previously, from the perspective of strict adaptation to the empirical data, we can say that the best model to fit the Danish fire data is the Lognormal–GPD–Gamma one, which presents a coefficient of variation equal to 10.2%, lower than the Standard Formula volatility. In fact, considering premium risk and the Fire segment only, the volatility implied by the Standard Formula is equal to 24% (3 times σ_(prem,fire), where σ_(prem,fire) = 8%; see Tripodi 2018). As written in the introduction of the present work, this result is mainly due to the fact that the DA multiplier does not take into account the skewness of the aggregate claim distribution, potentially overestimating the SCR for large insurers.
For illustrative purposes only, we estimated the VaR_p and the volatility of the aggregate loss using the previous models, also taking a dependence structure into account. According to the collective approach of risk theory, the aggregate loss is the sum of a random number of random variables, and so it requires convolution or simulation methods. We recall that, among the considered methodologies, only the FFT directly returns the aggregate loss. For the FFT, as mentioned above, an empirical dependence structure is induced by discriminating between attritional and large losses, so we referred to empirically discretized severities in a bivariate mode. This is a limitation of our work that could be overcome by considering a bivariate frequency and two univariate severities, inducing the dependence through the frequency component, as happens in practice (i.e., dependency between severities is not typical for this line of business); however, this approach would not have allowed us to apply the FFT methodology.
Considering the statistics of the frequency in the Danish fire insurance data, we can assume the claim frequency k to be Negative Binomially distributed, as done previously in the FFT procedure. A single simulation of the aggregate loss is obtained by adding the losses of k individual claims, and by repeating the procedure n times we obtain the aggregate loss distribution, as sketched in the code below.
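A minimal Monte Carlo sketch of this simulation procedure under the independence assumption; the severity is the illustrative composite Lognormal–GPD used in the earlier sketches, and all parameter values are assumptions rather than the fitted ones behind Table 11.

```r
set.seed(2)

# Illustrative composite Lognormal-GPD severity (assumed parameters, fixed threshold u)
mu <- 12.8; sg <- 0.6; gbeta <- 1.1e6; xi <- 0.45; u <- 1.02e6
F1u <- plnorm(u, mu, sg);  f1u <- dlnorm(u, mu, sg)
F2u <- 1 - (1 + xi * u / gbeta)^(-1/xi)
f2u <- (1/gbeta) * (1 + xi * u / gbeta)^(-1/xi - 1)
r <- f2u * F1u / (f2u * F1u + f1u * (1 - F2u))

rcomposite <- function(k) {                     # k independent losses from the composite model
  p <- runif(k)
  ifelse(p <= r,
         qlnorm(pmin(p / r, 1) * F1u, mu, sg),
         gbeta / xi * ((1 - (p - r + (1 - p) * F2u) / (1 - r))^(-xi) - 1))
}

# Collective risk model: Negative Binomial frequency (PGF ((1-p)/(1-p*t))^m)
n_sim <- 10000
m_nb <- 5; p_nb <- 0.82
freq <- rnbinom(n_sim, size = m_nb, prob = 1 - p_nb)
agg  <- vapply(freq, function(k) sum(rcomposite(k)), numeric(1))

c(VaR_99.5 = unname(quantile(agg, 0.995)), CV = sd(agg) / mean(agg))
```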
In contrast to the copula approach, we point out that it would be possible to avoid simulation by applying the FFT to generate the aggregate distribution from the fitted severities.
In Table 11 we report the VaRs obtained using the composite Lognormal–GPD–Gamma model, the Gumbel Copula and the FFT, together with the corresponding coefficients of variation, which give an indication of the volatility of each model.
Under the independence assumption, the aggregate loss distribution returns a VaR significantly smaller than that calculated under the dependence hypothesis (the dependence-based VaRs are more than 200% higher). The assumption of independence, or not, therefore has obvious repercussions on the definition of the risk profile and, consequently, on the calculation of the capital requirement. As seen above, for the case analyzed, the Gumbel Copula takes into account the positive dependence, even if of moderate magnitude, between the tails of the marginal distributions of the severities. That is, an attritional loss close to the discriminating threshold is, with good probability, accompanied by an extreme loss. This induces a decisive increase in the VaR of the aggregate distribution, as can be seen from Table 11. Likewise, the Fast Fourier Transform, which takes into account not only the (empirical) dependence between claims but also the randomness of the claim frequency, induces a further increase in the risk estimate.
Therefore, it is fundamental to take into account the possible dependence between claims, regarding both its shape and its intensity, because the VaR could increase drastically with respect to the independence case, leading to an insolvent position for the insurer. This analysis also highlights the inadequacy of using the CV when the actual objective is to estimate the VaR.
However, all the previous approaches have advantages and disadvantages. With composite models we can robustly fit each of the two underlying distributions of attritional and large claims, but without a clear identification of the dependency structure. With the Copula we can model the dependency, but it is not easy to determine the right copula to use, and this is the typical issue companies face for capital modeling purposes when using a copula approach. The FFT allows one to avoid simulating the claim process and estimating a threshold, working directly on empirical data, but it includes some implicit bias due to the discretization method; for example, since the FFT works with truncated distributions, it can generate aliasing errors. We refer again to Robe-Voinea and Vernic (2016) and Robe-Voinea and Vernic (2017) for a detailed discussion and the possible solutions insurers have to consider when implementing a PIM.
Finally, we remark that, compared to the present literature, this paper innovates the fitting of the Danish fire insurance data by using a composite model with a random threshold. In addition, our empirical model could have managerial implications, supporting insurance companies in understanding that the Standard Formula could lead to a volatility (and an SCR) for premium risk that is very different from the real risk profile. It is worth mentioning CEIOPS (2009), in that "Premium risk also arises because of uncertainties prior to issue of policies during the time horizon. These uncertainties include the premium rates that will be charged, the precise terms and conditions of the policies and the precise mix and volume of business to be written." Various studies (e.g., Mildenhall (2017), Figure 10) have shown that pricing risk results in a substantial increase in loss volatility, especially for commercial lines. Therefore, one would expect the SCR premium charge to look high compared to a test that only considers loss (frequency and severity) uncertainty. In future developments of this research we will try to take these features into account in order to have a full picture of this comparison.

Author Contributions

Conceptualization, R.R.C.; Methodology, R.R.C. and F.A.; Software, F.A.; Validation, R.R.C.; Formal Analysis, R.R.C.; Investigation, F.A.; Data Curation, F.A.; Writing Original Draft Preparation, F.A.; Writing Review & Editing, R.R.C.; Supervision, R.R.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Ausin, M. Concepcion, Michael P. Wiper, and Rosa E. Lillo. 2009. Bayesian estimation of finite time ruin probabilities. Applied Stochastic Models in Business and Industry 25: 787–805.
2. Burnecki, Krzysztof, and Rafal Weron. 2004. Modeling the risk process in the XploRe computing environment. Lecture Notes in Computer Science 3039: 868–75.
3. Carreau, Julie, and Yoshua Bengio. 2009. A hybrid pareto model for asymmetric fat-tailed distribution. Extremes 12: 53–76.
4. CEIOPS. 2009. Advice for Level 2 Implementing Measures on Solvency II: SCR Standard Formula—Article 111 Non-Life Underwriting Risk. Available online: https://register.eiopa.europa.eu/CEIOPS-Archive/Documents/Advices/CEIOPS-L2-Final-Advice-SCR-Non-Life-Underwriting-Risk.pdf (accessed on 31 October 2009).
5. Cerchiara, Rocco R., and Valentina Demarco. 2016. Undertaking specific parameters under solvency II: Reduction of capital requirement or not? European Actuarial Journal 6: 351–76.
6. Charpentier, Arthur, and Abder Oulidi. 2010. Beta kernel quantile estimators of heavy-tailed loss distributions. Statistics and Computing 20: 35–55.
7. Clemente, Gian Paolo, and Nino Savelli. 2017. Actuarial Improvements of Standard Formula for Non-life Underwriting Risk. In Insurance Regulation in the European Union. Edited by Pierpaolo Marano and Michele Siri. Cham: Palgrave Macmillan.
8. Cooray, Kahadawala, and Malwane M.A. Ananda. 2005. Modeling actuarial data with a composite lognormal-Pareto model. Scandinavian Actuarial Journal 5: 321–34.
9. Cordeiro, Gauss M., Saralees Nadarajah, and Edwin M. M. Ortega. 2012. The Kumaraswamy Gumbel distribution. Statistical Methods & Applications 21: 139–68.
10. Drees, Holger, and Peter Müller. 2008. Fitting and validation of a bivariate model for large claims. Insurance: Mathematics and Economics 42: 638–50.
11. EIOPA. 2011. Report 11/163: Calibration of the Premium and Reserve Risk Factors in the Standard Formula of Solvency II—Report of the Joint Working Group on Non-Life and Health NSLT Calibration. Frankfurt: EIOPA.
12. Embrechts, Paul, Claudia Klüppelberg, and Thomas Mikosch. 1997. Modelling Extremal Events for Insurance and Finance. Berlin/Heidelberg: Springer.
13. Esmaeili, Habib, and Claudia Klüppelberg. 2010. Parameter estimation of a bivariate compound Poisson process. Insurance: Mathematics and Economics 47: 224–33.
14. European Commission. 2015. Commission Delegated Regulation (EU) 2015/35 supplementing Directive 2009/138/EC of the European Parliament and of the Council on the taking-up and pursuit of the business of Insurance and Reinsurance (Solvency II). Official Journal of the EU 58: 1–797.
15. Galeotti, Marcello. 2015. Computing the distribution of the sum of dependent random variables via overlapping hypercubes. Decisions in Economics and Finance 38: 231–55.
16. Guillotte, Simon, Francois Perron, and Johan Segers. 2011. Non-parametric Bayesian inference on bivariate extremes. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 73: 377–406.
17. Klugman, Stuart, Harry H. Panjer, and Gordon E. Wilmot. 2010. Loss Models: From Data to Decisions. Hoboken: John Wiley & Sons.
18. Kumaraswamy, Poondi. 1980. A generalized probability density function for double-bounded random processes. Journal of Hydrology 46: 79–88.
19. McNeil, Alexander J. 1997. Estimating the tails of loss severity distributions using extreme value theory. ASTIN Bulletin: The Journal of the IAA 27: 117–37.
20. McNeil, Alexander J., Rüdiger Frey, and Paul Embrechts. 2005. Quantitative Risk Management. Concepts, Techniques and Tools. Princeton: Princeton University Press.
21. Mildenhall, Stephen J. 2017. Actuarial geometry. Risks 5: 31.
22. Nadarajah, Saralees, and Anuar A.S. Bakar. 2014. New composite models for the Danish-fire data. Scandinavian Actuarial Journal 2: 180–87.
23. Pigeon, Mathieu, and Michel Denuit. 2011. Composite Lognormal-Pareto model with random threshold. Scandinavian Actuarial Journal 3: 177–92.
24. Resnick, Sidney I. 1997. Discussion of the Danish data on large fire insurance losses. ASTIN Bulletin: The Journal of the IAA 27: 139–51.
25. Robe-Voinea, Elena-Gratiela, and Raluca Vernic. 2016. Fast Fourier Transform for multivariate aggregate claims. Computational & Applied Mathematics 37: 205–19.
26. Robe-Voinea, Elena-Gratiela, and Raluca Vernic. 2017. On a multivariate aggregate claims model with multivariate Poisson counting distribution. Proceedings of the Romanian Academy Series A: Mathematics, Physics, Technical Sciences, Information Science 18: 3–7.
27. Scollnik, David P. M. 2007. On composite lognormal-Pareto models. Scandinavian Actuarial Journal 1: 20–33.
28. Teodorescu, Sandra, and Raluca Vernic. 2009. Some composite Exponential-Pareto models for actuarial prediction. Romanian Journal of Economic Forecasting 12: 82–100.
29. Teodorescu, Sandra, and Raluca Vernic. 2013. On Composite Pareto Models. Mathematical Reports 15: 11–29.
30. Tripodi, Agostino. 2018. Proceedings of the Seminar Non-Life Premium and Reserve Risk: Standard Formula with Undertaking Specific Parameter, Department of Economics, Statistics and Finance, University of Calabria, 5 November 2018. Available online: https://www.unical.it/portale/portaltemplates/view/view.cfm?84609 (accessed on 6 July 2019).
1. Danish insurance market data have been included in the calibration of standard parameters by EIOPA (2011), Calibration of the Premium and Reserve Risk Factors in the Standard Formula of Solvency II, Report of the Joint Working Group on Non-Life and Health NSLT Calibration.
Figure 1. KumGum density functions.
Figure 2. KumPareto density functions.
Figure 3. Right tails of Lognormal–KumPareto (red line) and empirical distribution (dark line) functions.
Figure 4. Lognormal–Burr and empirical distribution functions (red and dark lines).
Figure 5. Observed–theoretical quantile plot for the Lognormal–GPD model.
Figure 6. Left, comparison between cut-off density functions. Right, empirical and theoretical (red) right tail.
Figure 7. Lognormal–GPD (red) and empirical (dark) distribution function.
Figure 8. Observed–theoretical quantile plot for the Lognormal–GPD–Gamma model.
Figure 9. Lognormal–GPD–Gamma (red) versus empirical (dark) distribution functions.
Figure 10. Normal distribution of the Bootstrap Gumbel parameter.
Figure 11. Lognormal (top) and GPD (right) marginal histograms and Gumbel Copula simulated values plot.
Figure 12. Density function of the estimated Gumbel Copula. Attritional claim losses on the X-axis, large claim losses on the Y-axis.
Table 1. Descriptive statistics of Danish empirical losses.
Min       Mean      Median    Q3        Max          Kurtosis   Skewness   Std Dev
27,730    613,100   327,000   532,800   55,240,000   505.592    17.635     1,412,959

Table 2. Empirical distribution of frequency claim statistics.
Q1     Mean   Q3     Max    Variance   Skewness
238    299    381    447    8482       −0.12
Table 3. Kurtosis and Skewness of the KumGum distribution.
v    φ    a    b      Kurtosis   Skewness
0    5    1    1      5.4        1.1
0    1    0.5  0.5    7.1        1.6
5    3    2    3      3.6        0.5
1    10   5    0.66   6.4        1.4
0    1    5    10.4   7.6        1.7
Table 4. Comparison between empirical and Lognormal–KumPareto quantiles.
Level                  50%       75%       90%         95%         99%         99.5%
Empirical quantile     327,016   532,757   1,022,213   1,675,219   5,484,150   8,216,877
Theoretical quantile   333,477   462,852   642,196     840,161     2,616,338   4,453,476

Table 5. Comparison between empirical quantiles and Lognormal–Burr ones.
Level                  50%       75%       90%         95%         99%         99.5%
Empirical quantile     327,016   532,757   1,022,213   1,675,219   5,484,150   8,216,877
Theoretical quantile   199,681   332,341   634,531     1,029,262   3,189,937   5,181,894

Table 6. Estimated parameters and 99% confidence intervals of Lognormal–GPD.
Parameter   Low Extreme   Best Estimate   High Extreme
μ1          12.82         12.84           12.86
σ           0.59          0.61            0.62
σμ          1,113,916     1,115,267       1,116,617
ξ           0.33          0.45            0.56

Table 7. Estimated parameters and 99% confidence intervals of Lognormal–GPD–Gamma distribution.
Parameter   Low Extreme   Best Estimate   High Extreme
μ1          12.78         12.79           12.81
σ           0.52          0.54            0.55
u           629,416       630,768         632,121
σμ          1,113,915     1,115,266       1,116,616
ξ           0.22          0.29            0.37

Table 8. Comparison between empirical and Lognormal–GPD–Gamma quantiles.
Level                    50%       75%       90%         95%         99%         99.5%
Empirical percentile     327,016   532,757   1,022,213   1,675,219   5,484,150   8,216,877
Theoretical percentile   360,574   517,996   1,103,309   2,077,792   5,266,116   7,149,253

Table 9. AIC and BIC indices for a comparison between different models.
Index   KumPareto   Burr      GPD       GPD–Gamma
AIC     193,374     191,459   191,172   190,834
BIC     193,409     191,494   191,207   190,882

Table 10. Different methods for estimating the dependence parameter θ of a Gumbel Copula.
Method                                θ      Standard Error
Maximum pseudo-likelihood             1.11   0.008
Canonical maximum pseudo-likelihood   1.11   0.008
Simulated maximum likelihood          1.11   -
Minimum distance                      1.09   -
Moments based on Kendall's tau        1.13   -
Bootstrap                             1.11   0.008

Table 11. Estimates of VaR at the 99.5% level with different models and their volatilities.
Model                  VaR (€)       CV
Lognormal–GPD–Gamma    216,913,143   0.102
Gumbel Copula          664,494,868   0.110
FFT                    703,601,564   0.193
