Application of a Vine Copula for Multi-Line Insurance Reserving

: This article introduces a novel use of the vine copula which captures dependence among multi-line claim triangles, especially when an insurance portfolio consists of more than two lines of business. First, we suggest a way to choose an optimal joint loss development model for multiple lines of business that considers marginal distribution, vine copula structure, and choice of family for each pair of copulas. The performance of the model is also demonstrated with Bayesian model diagnostics and out-of-sample validation measures. Finally, we provide an implication of the dependence modeling, which allows a company to analyze and establish the risk capital for whole portfolio.


Introduction
There have been various approaches to determine loss reserve for a single line of business with statistical models. According to Mack (1993), the chain-ladder method, which has been used as a rule of thumb for the determination of reserve, can be interpreted as a nonparameteric stochastic model. After that, many univariate stochastic reserving models have been developed and used for both determination of point estimates of reserve and risk management. For detailed discussions on univariate stochastic reserving models, see England and Verrall (2002).
However, it is also certain that most of the insurance companies do not run only a single line of business so one needs to consider possible dependence among the lines of business in reserve modeling. Consequently, stochastic reserving models need to be extended to multivariate frameworks. In this regard, some actuarial literature focused on the extension of the chain-ladder method to multivariate cases, such as Braun (2004), Schmidt (2006), and Merz and Wüthrich (2008) which are followed by Shi et al. (2012), which used a bivariate normal distribution to model multi-line reserves. Besides the dependence among multiple lines of business, there have been some works on the dependence between paid claim triangles and incurred claim triangles such as Zhang (2010) and Merz and Wüthrich (2010).
Note that the methods mentioned above are less flexible since it is not allowed to disentangle marginal distributions and multivariate association structure. Therefore, use of a copula was introduced to the multivariate reserving problem since this allows us to consider marginal distribution and association structure in a separate way. Based on this idea, Shi and Frees (2011) proposed the use of a bivariate Gaussian copula to model dependence among multi-line reserves where the marginal distributions are chosen as Gaussian and gamma, respectively. Peters et al. (2014) considered dependence between paid-incurred claim triangles via bivariate copula. Further, Abdallah et al. (2015) used the idea of the hierarchical archimedian copula to capture calendar year effect and multi-line dependence simultaneously.
Although there have been some works on the multivariate reserving in actuarial literature, applications of all the aforementioned works are restricted only to possible dependence between two lines of business. Indeed, it is natural that a property and casualty insurance company runs more than two lines of business, so we need to incorporate such a high-dimensional dependence structure in our reserve modeling.
In that regard, we apply the idea of a vine copula in this article, which uses bivariate copulas as its building blocks and connects them with vine structure to describe the high-dimensional association in a flexible way. The vine copula has been widely used in the actuarial and financial literature. Loaiza Maya et al. (2015) investigated dependency among the exchange rates of Latin American countries using the vine copula. Reboredo and Ugolini (2016), Arreola Hernandez et al. (2017), andTrucíos et al. (2020) used the vine copula to assess systemic risks due to possible dependence among economic subjects. Surprisingly, usage of the vine copula in property and casualty insurance literature does exist but is scarce. For example, Shi and Yang (2018) used a vine copula to capture the serial dependence of claim amounts and derive the experience ratemaking factor. This paper has been organized as follows. In Section 2, basic concept of the vine copula is introduced and the model selection procedure to be implemented is specified. In Section 3, we describe the data used for our empirical analysis. In Section 4, we go through the model selection procedure to determine marginal distributions and the vine copula structure to be used in our analysis. In Section 5, we discuss the implications of estimation results from the perspective of enterprise risk management using the predictive distribution of unpaid claims. Section 6 addresses practical issues for implementing the proposed methodology. Finally, we conclude this article in Section 7 by providing some future directions of research.

Proposed Methodology
Suppose an insurance company owns a portfolio which consists of N multiple lines of business. By assuming balanced observations in multiple triangles, one can write the multivariate cumulative paid claims as Y ij = (Y (1) ij , Y (2) ij , . . . , Y (N) ij ) where n = 1, . . . , N indicates a claim triangle from the n th line of business, i = 1, . . . , I means the i th accident years, and j = 1, . . . , J denotes j th development lag. In general, it is of interest to predict the cumulative paid claim for the next year given information up to the current year, which can be written as E Y (n) i,j+1 |Y (n) i,j . Therefore, instead of working with Y ij , one can directly model the incremental development of claims, or age-to-age factors as follows: ..,I,j=1,...,J are observed from the same business line, it is natural that we model the marginal distribution of D (n) and the dependence structure among D (n) via copulas to jointly model D (n) , n = 1, 2, . . . , N.
According to Sklar (1959), if all marginal distribution functions are continuous, then there is a unique function C : where F i denotes marginal distribution function of X i and H denotes joint distribution function of (X 1 , X 2 , . . . , X m ). Therefore, use of copulas allows us to capture the association among the joint response random variables, which may follow different marginal distributions. In that sense, by letting θ = (θ (1) , . . . , θ (N) ), where θ (n) is the parameter vector for the marginal distribution of D (n) and φ is the copula parameter, the likelihood for joint distribution is given as follows: Here h means the joint density of (D (1) , . . . , D (N) ) and we assume that d (n) ij (i = 1, . . . , I, j = 2, . . . , J) are independent for fixed n.
Depending on the sign of dependence, we may use different families of copulas. For example, in order to capture positive dependence, one can suggest the following bivariate copulas: Although those two copulas can only capture positive dependence, one can easily rotate those in order to capture reversed tail behavior and negative dependence as follows: Further, the Gaussian copula and Frank copula are also prevalent choices which can capture both positive and negative dependence: where Φ 2 (·, ·) stands for the cumulative distribution function of a bivariate, standard, normal, random variable with correlation φ and Φ(·) stands for the cumulative distribution function of a univariate, standard, normal random variable. Given families of copulas, allow us to consider various facets of possible dependence among the lines of business, including the upper and lower tail dependence properties. For example, the Clayton copula can capture a positive association and has lower tail dependence but no upper tail dependence. The Gumbel copula can capture positive associations and has upper tail dependence but no lower tail dependence. Further, both Frank and Gaussian copulas are symmetric and able to capture positive and negative dependence, but they have no tail dependencies. Note that upper tail dependence of original copulas corresponds to lower tail dependence of survival copulas, and vice versa. Further, if an original copula can capture a positive association, then the corresponding 90 • or 270 • rotated copula can capture a negative association; for example, 90 • rotated Clayton and 270 • rotated Gumbel copulas can be used to capture negative associations. We refer the readers to Nelsen (1999), Embrechts et al. (2001), and Hua and Joe (2011) for a more detailed explanation. The variability of tail behaviors is potentially important considering the data used in our empirical analysis. Here we capture the dependencies among the incremental development factors of different lines of business so that it is natural that the incremental development factors will be quite large in the initial years and become smaller in the mature years. In this regard, upper tail dependence corresponds to the associations among the loss development factors in the initial years, whereas lower tail dependence corresponds to the associations among the loss development factors in the mature years.
If N = 2, then the dependence can be captured by a bivariate copula, which has been explored in previous works such as Shi and Frees (2011). However, we may not preclude the possibility that N > 2, and one can suggest the use of vine copula when we have more than two lines of business to capture the multiple dependence. The concept of vine copula was introduced by Aas et al. (2009) who extended the research conducted by Bedford and Cooke (2001) and Bedford and Cooke (2002) to show how multivariate data, which exhibit complex patterns of dependence in the tails, can be modeled using a cascade of pair-copulas, acting on two variables at a time. Later Czado (2010) and Joe and Kurowicka (2011) developed this idea further and provided us with a definite road map for constructing vine copulas to represent complex multivariate data. The core concept behind construction of vine copulas is that a d-dimensional density f (x 1 , x 2 , . . . , x d ) can be represented as a product of pair copula densities and marginal densities. For example, let us consider d = 3 dimension. A possible way to represent the joint density f (x 1 , x 2 , x 3 ) is as follows: where c ij (F i (x i ), F j (x j )) is the joint copula density of random variables X i and X j . One can write out the joint density of X 1 , X 2 , X 3 as follows: Note that in general, c 13|2 is affected by x 2 , not only through F 1|2 (x 1 |x 2 ) and F 3|2 (x 3 |x 2 ), but also directly through x 2 . However, such a general form of conditional coplula density is cumbersome to be used in statistical inference. For this reason, it is tempting to use a simplified form of conditional copula density so that c 13|2 is affected by x 2 only through F 1|2 (x 1 |x 2 ) and F 3|2 (x 3 |x 2 ). Haff et al. (2010) showed that we may write each conditional pair copula density in a simplified form for elliptical distributions with a positive definite scale matrix. Further, it was also shown that use of the simplified form could be a good approximation even in a case wherein such an assumption could not be fully validated. Hence, we will use a simplified form of conditional copula density in this article hereafter.
The sequential dependence structure via pair copulas is described with a regular vine, or R-vine. Let V = (T 1 , . . . , T d−1 ) be a set of trees where each tree T j = (N j , E j ) is connected and N 1 = {1, . . . , d}.
Here N j and E j mean the nodes and edges of T j , respectively. If N j = E j−1 for j = 2, . . . , d, V is called a vine. Further, if |e ∩ e | = 1 for {e, e } ∈ E j and j = 2, . . . , d − 1, then V is called a R-vine. There are two important subclasses of R-vine, C-vine and D-vine. If there is a node n ∈ N j so that the degree of that node is d − j in an R-vine, it is called a C-vine. D-vine is an R-vine wherein the degree of every node is at most 2.
The above decomposition of the joint density in terms of vine structure is not unique. For example, when d = 3, we have the following three available R-vine structures as in Figure 1. Note that in the case of 3-dimensional R-vine structures, C-vine and D-vine are identical, which is no more true for d > 3. Note that the number of regular vine structures on d-dimensional variables grows super-exponentially so that there are d! · 2 ( d−2 2 ) regular vine structures on d variables, as shown in Joe and Kurowicka (2011). Therefore, it is already meaningless to display every possible vine structure if d ≥ 5.
Therefore, for the calibration of vine copula model, we need to carefully determine the following components based on observed data: Copula family for each of pair copula in the selected vine structure.
For statistical inference, one may consider either the frequentist or the Bayesian approach and subsequently choose the optimal structure via a model selection procedure. However, it can be quite computationally demanding to choose an optimal model in both approaches. For example, as mentioned in Table 1 of Gruber and Czado (2015), the size of search space of vine copula model for d = 5 is already (1.3559 × 10 11 ) when we consider seven families of copula for each of the pair copulas, which is definitely computationally infeasible. In this regard, Dissmann et al. (2013) proposed a top-down approach to choose the vine structure from a frequentist view. Gruber and Czado (2018) proposed sequential vine copula model selection from a Bayesian perspective. Schamberger et al. (2017) also considered a Bayesian approach to model dependence structure with a factor copula in order to handle the dimensionality issue. Recently, Kreuzer and Czado (2019) showed that the pair copula family can be chosen together with their parameters using the Hamiltonian MCMC.
The Bayesian approach can be beneficial in reserve modeling since it can incorporate the uncertainty of parameter estimation and reflect it to compute the predictive distribution of future unpaid claims, as proposed in Shi et al. (2012). Therefore, we propose the following two-step approach, which can be considered as a compromise of Dissmann et al. (2013) and Gruber and Czado (2018): 1.
Estimate the parameters with Bayesian inference and choose marginal distribution based on Bayesian model selection criteria, such as deviance information criterion (DIC) and the logarithm of the pseudomarginal likelihood (LPML). The DIC for each marginal distribution of n th line, proposed by Spiegelhalter et al. (2002), is defined as i,J−i+1 |i = 1, . . . I} are the observed values of incremental loss developments from n th line of business. Since this integral is hardly available in a closed form, one can estimate DIC as follows: [s] } S s=1 are MCMC samples generated from the posterior density. Note that we prefer models with smaller DIC values.
LPML is calculated based on the conditional predictive ordinate (CPO), which was proposed by Gelfand et al. (1992) and Geisser (2017). The CPO for d (n) i,j is defined as follows: Since CPO is usually not readily available in closed form, one can estimate CPO (n) i,j as follows, according to Gelfand and Dey (1994): Finally, according to Ibrahim et al. (2014), CPO (n) i,j can be summarized as LPML as follows: We prefer models with larger LPML values.

2.
Based on the fitted marginal model and corresponding posterior means of marginal parameterŝ θ (n) for n = 1, . . . , N, generate probability integral transfrom (PIT)Û optimize the following: which ends up with the optimal vine copula structure and copula family for each pair copula using the top-down algorithm of Dissmann et al. (2013). The algorithm starts with the choice of first tree T 1 and subsequently chooses T j for j = 2, . . . , d − 1 while we keep the constraint of having R-vine structure in (1). The optimal tree T j is chosen by solving the following optimization problem: where w e is pairwise BIC of an edge e in our implementation.
In our search for the best family for each of pair copula, we use the aforementioned bivariate Gaussian, Frank, Clayton, Gumbel, and their rotated copulas. Note that this approach also utilizes the inference by margin (IFM) method proposed by Joe and Xu (1996), since it decomposes estimation of marginal distribution and copula structure separately for model selection a with lesser computational burden. It should be noted that such computational convenience is obtained at the expense of potential estimation bias, as shown in Louzada and Ferreira (2016).

Data Description
For the empirical analysis, we used a publicly available dataset from ACE Limited 2013 Global Loss Triangles , which consists of aggregated claim developments for insurance operations in North America (4 lines of business), insurance operations overseas (3 lines of business), and global reinsurance operations (2 lines of business). A description and the corresponding index for each line of business are given in Table 1. We incorporate a 9-dimensional vine copula structure to capture possible dependence among all lines of business. Indeed, the current COVID-19 situation could be a clear example of showing the inappropriateness of ignoring potential dependencies among different countries. Note that the dataset and code for data analysis are attached as Supplementary Materials. North American Non-Casualty 5 Overseas General Casualty 6 Overseas General Non-Casualty 7 Overseas General Personal Accident 8 Global Reinsurance Property 9 Global Reinsurance Non-Property Although we consider all possible lines of business simultaneously here, an actuary who applies vine copulas to model dependence among the lines of business should be careful, since the complexity of the vine copula structure increases super-exponentially as the dimensionality grows. Therefore, it is a role of an actuary to apply his/her industrial experience or knowledge for the dependence modeling. Further, it is also possible to observe more than 10 lines of business according to the classification in Schedule P of the National Association of Insurance Commissioners (NAIC) annual report, including but not limited to: Therefore, it is also possible to merge some lines of business into one category in order to not only enhance homogeneity and credibility in the analysis but also avoid too much complexity of the vine structure due to the large number of lines of business.
The loss development triangles, which are training sets in terms of predictive analytics, can be expressed as follows: Our task is to predict the cumulative paid claims for the next years, which are described as follows: For enterprise risk management (ERM) purposes, insurance companies are usually interested in the distribution of aggregate unpaid claims after one year of loss development, which is determined by incremental paid losses defined as follows: This formulation validates us to model D A simple approach for the analysis of L is the silo approach, which means we just aggregate all lines of business so that the paid losses of the same development lag and accident year are merged and modeled altogether. In this case, one can write L as follows: However, such an approach ignores the heterogeneity of the lines of business and could be problematic. Suppose there are two lines of business, personal and commercial auto insurances, and reported losses of these lines are given in Table 2. Note that the incremental development is calculated as log Y j+1 /Y j . One can see that due to the difference in volume, the volatility of loss development for the commercial auto line is wiped out if we analyze the aggregated data. Therefore, the vine copula model can be considered as a flexible generalization of two extreme models, silo and independent approaches, so that one can consider possible dependence and heterogeneity among the lines of business simultaneously.

Model Selection and Parameter Estimation
For marginal distribution, we apply the idea of the cross-classified model for each line of business as in Shi and Frees (2011) and Taylor and McGuire (2016), which regresses Y as follows: where α (n) i corresponds to the effect for the i th accident year and δ (n) j corresponds to the cumulative effect up to j th development lag, for n th line of business, respectively.

Recall that it is in our interests
i,j+1 . Therefore, one can rewrite (6) in the following way: where η can be interpreted as the incremental loss development factor from j th lag to j + 1 th lag. In this regard, we utilize two candidate distributions in (7) and compare the model selection diagnostics to choose a suitable distribution for each of marginal models.
Further, it is natural to expect that for a fixed accident year, paid claim amounts gradually increase until they are fully mature and developed, while the magnitude of development gets smaller as development lag increases, which means the incremental loss development factor is expected to be greater than 1 but monotonically decreasing until it converges to 1. In terms of cross-classified model, such a statement is equivalent to the following mathematical condition: for a large enough integer L(n), or there are ζ (n) t ∈ [0, ∞) for t = 2, . . . , L(n) and n = 1, . . . , 9 such that t . Therefore, we may suggest the following four models as the candidates for marginal distribution of each line of business: • LNU: The cross-classfied lognormal model in (7); • LNC: The cross-classfied lognormal model in (7)  t , n = 1, . . . , 9, and t = 1, . . . , J. For the unconstrained models, a diffuse uniform prior on the positive real number is directly used for η (n) j+1 , n = 1, . . . , 9, and j = 2, . . . , J. For each marginal model, four chains with 1000 iterates are used; the first 500 iterates are discarded for burn-in, which usually requires computation time for MCMC sampling of less than a second. Table 3 provides a summary of the estimates of parameters in all models for the marginal components, which are summarized with posterior mean and upper/lower bounds of 90% of the Bayesian credible interval for each parameter. One can see thatη (n) 9 tends to be greater than or equal toη (n) 8 in the unconstrained models, whereasη (n) 8 >η (n) 9 for all n = 1, . . . , 9 in the constrained models, which is a more natural pattern in loss development analysis.
When posterior MCMC samples of parameters are obtained, it is quite important to make sure that these samples converge to (proper) posterior distributions. In this regard, we useR statistics proposed by Gelman and Rubin (1992) and traceplots which enable us to judge the convergence of MCMC sampling with the generated samples. Basically,R statistics tell us that the samples from distinct chains are well-mixed so that the chains are considered converged ifR 1. Table A1 in Appendix A shows thatR 1 for all parameters, which tells us that the chains are well-mixed. One can also see that MCMC chains are well-mixed by looking at selected traceplots of maginal models, as shown in Figure A1. It indicates different MCMC chains end up with similar empirical distributions, which is a necessary condition of the convergence of the MCMC algorithm to the correct posterior distribution.
After the convergence of Bayesian models is assessed, we compare the goodness-of-fit of Bayesian models using DIC and LPML. Based on the DIC and LPML values for fitted marginal models in Table 4, it turns out that lognormal distribution is favored compared to gamma distribution. Further, DIC values of constrained lognormal distribution are always less than ones of unconstrained lognormal distribution, while LPML values have no definitive patterns between constrained and unconstrained lognormal distributions. Therefore, we suggest to use constrained lognormal distribution as the marginal distribution of every line of business, based on the model selection diagnostics and interpretability of the regression coefficients while unconstrained lognormal distributions are considered as a benchmark.
Once a marginal distribution for each triangle is specified, the dependence structure can be modeled with a 9-dimensional vine copula. Since marginal distributions are fixed and calibrated, it is possible to optimize (2). For the optimization routine, we use the algorithm of Dissmann et al. (2013) which is readily available in an R package VineCopula, as a function RVineStructureSelect, which explores both the optimal vine structure and the choice of family for each of the pair copulas sequentially. For a detailed explanation of such an implementation, see Chapter 8 of Czado (2019).
In our search for the vine structure, we also consider the sparsity of the vine copula. Since we need ( d 2 ) pair copulas to construct a d-dimensional vine structure, the number of required pair copulas quadratically increases as the dimension increases, which adds both complexity and computational burden to the optimization scheme. Therefore, in this article, we also incorporate the idea of a sparse copula so that a pair copula is considered to be an independent copula unless the estimated association Kendall's Tau is significantly different from 0 for the pair copula. More specifically, we use the following test statistic T based on the estimated value of Kendall's Tau,τ as proposed in Genest and Favre (2007): where T asymptotically follows a standard normal distribution if m is large enough. For more approaches of incorporating sparsity in a high-dimensional vine copula, see Gruber and Czado (2018) and Nagler et al. (2019). According to the optimization with the data, the optimal vine structure is given Table 5 which only displays non-independent pair copulas. In the table, φ and τ mean estimated copula parameter and Kendall's Tau for each pair copula, respectively. Some empirical relationships are observed by the estimated copula parameters, which seem intuitive considering the nature of the lines of business. For example, it might be quite natural to have strongly positive dependence among the development patterns of line 4 (North American non-casualty) and line 6 (overseas general non-casualty) due to inherent characteristics of the same claim type. Dependence between line 8 (global reinsurance property) and line 9 (global reinsurance non-property) is captured by the Gumbel copula, which indicates there is positive association between claim settlements in the initial years (that corresponds to upper tail part) for a global reinsurance business unit. Such a positive tail dependence might have originated from an internal decision on the claim processing in the same business unit. Note that a company has limited resources to deal with ligitations related to the claim adjustments so that claim adjustments can be delayed in a line by focusing on expediation in claim adjustments in another line, which might explain negative associations between lines 6 and 9, lines 2 and 7, and lines 5 and 9. One can see that the estimated vine structure exhibits enough of a degree of sparsity because only 10 pair copulas are non-independent among ( 9 2 ) = 36 pair copulas in the vine structure so that most of the dependencies arise from tree 1, as shown in Figure 2. There are nine lines of business that correspond to the nodes of tree 1 so that we have nine nodes and eight edges as an R-vine structure which captures the first layer of dependence among the lines of business. Since the dependence of a pair of copulas is quite weak except for those in tree 1, we only display the R-vine structure for tree 1.   T1  T2  T3  T4  T5  T6  T7  T8 T1  T2  T3  T4  T5  T6  T7  T8      We also visualize the dependence structure via normalized contour plots in Figure 3, which displays the association between two probabilty integral transforms in a normalized scale. For example, if a pair copula c(u 1 , u 2 ) is given, then the corresponding normalized contour plot gives us a contour plot with the following density g: Normalized contour plots for trees 1, 2, and 3 in the chosen vine structure are provided in Figure 3. One can see that some pair copulas (for example, c 4,3 and c 8,1 ) demonstrate strong tail dependence or skewed shape of the contours, which cannot be captured with naive use of eillptical copulas such as Gaussians or t−copulas. Therefore, Figure 3 substantiates that the choice of pair copula families are enough to describe the dependence structure observed in the data. In the figure, the first row corresponds to normalized contour plots for tree 3 where one can see concentric circles as contour plots, which indicates independence of corresponding pair copulas. On the other hand, one can see many skewed contour plots for tree 3 from the third row of the figure, which indicates varying dependence of corresponding pair copulas. 9,3 ; 6,4 4,5 ; 6,9 8,4 ; 9,6 8,2 ; 1,7 9,7 ; 8,1 6,1 ; 9,8 6,3 ; 4 6,5 ; 9 9,4 ; 6 1,2 ; 7 8,7 ; 1 9,1 ; 8 8,6 ; 9 4,3 9,5 6,4 7,2 1,7 8,1 9,6 9,8

Validation and Actuarial Implication
Once a predictive model is calibrated based on the upper loss triangles D 1:9 , which is defined in (3), it needs to be validated based on available D 9+k , which is defined in (4). For this purpose, D 10 = {Y (n) ij : 2 ≤ i ≤ 9 and j = 11 − i, n = 1, . . . , 9}, the latest diagonals are used as a validation set. Under the proposed lognormal model for marginal components, one can show that 2(n) , whereη (n) and σ (n) are obtained as posterior means from the MC samples of parameters. Since it is of interest to predict the future unpaid claim L as defined in (5), here we apply three models to project the unpaid claims; the independent model which assumes independence among all lines of business, the silo model which assumes a perfect positive association among all lines of business, and the copula model which utilizes the vine copula structure specified and optimized in Section 4. Due to linearity of expectation, it is natural to expect that point estimates of reserves under the copula model would be more or less the same as the estimates under independent model. However, insurance companies are often interested in not only the predictive mean, but also the confidence interval of the estimated mean and risk measures for enterprise risk management. In this regard, the proposed vine copula model allows us to consider the impacts of associations among the lines of business that might lead to more accurate estimation of risk measures. For example, if there are positive associations among the lines of business, then the independent model will underestimate Value at Risk (VaR) or Conditional tail expectation (CTE) since a worse scenario can affect all lines of business adversely. Table 6 summarizes the point estimates of unpaid claims for the subsequent calendar year based on the copula model and the silo model. Note that point estimates of unpaid claims from the independent model should be the same as ones from the copula model, as long as we model each line of business with the same marginal distribution. Further, although we primarily use constrained models for the marginal components, here we also provide the point estimates of unpaid claims from the models where we do not impose any constraints on the parameters.
From the table, it is shown that copula model with constraints turns out to be the best model in terms of prediction of aggregate unpaid claims for the subsequent year. We observe that under unconstrained models, the predicted values of unpaid claims can be severely exaggerated, especially for mature years (in our case, AY = 2005). That is because the value of η (n) 9 can be overestimated in the unconstrained models, as observed in Table 3. Therefore, it implies that naive implementation of unconstrained model for multi-line reserving problem may end up with poor prediction. In order to evaluate the prediction performance, here we use validation measures such as root mean squared error (RMSE) and mean absolute error (MAE) defined as follows: We prefer the models with smaller values of RMSE and/or MAE. According to Table 7, the copula model with constraints still turns out to be the best model in terms of RMSE and MAE, which evaluate the performance of predicting the point estimates. However, it is of less concern to get point estimates of L, the aggregate unpaid claims. Actuaries (and companies) want to know a range of estimates, or the predictive distribution of the aggregate unpaid claims to assess the quality of decision making on reserve and retain appropriate amounts of risk capital for the whole company. Thus, here we provide a way to simulate predictive samples of L (n) and describe the predictive distribution of L (n) for n = 1, . . . , 9 and subsequently L = ∑ 9 n=1 L (n) with the following steps, which are similar to those of Gao (2018): 1.
Generate a 9-dimensional uniform random vector (u is generated subsequently due to the following identity: where h 3|1;2 (w, v) = ∂ ∂w C 13;2 (w, v). One can continue such implementation to obtain a random sample of d-dimensional uniform vector, which is readily available with RVineSim function in R package VineCopula.

3.
Repeat steps 1 to 2 to get L r:(n) , the MC samples of L (n) for r = 1, . . . , R where i,11−i : 2 ≤ i ≤ 9, n = 1, . . . , 9} needs to be simulated to describe the distribution of L. In the simulation scheme, possible uncertainties of estimated parameter values are already considered since the marginal parameters are estimated with the Bayesian approach. Figure 4 shows the predictive kernel densities of L based on the Monte Carlo samples generated with the aforementioned approach from the constrained model. In the silo approach, all nine triangles are aggregated into one triangle based on the development lags and accident years, so that loss development patterns of all business lines are implicitly assumed to be identical, which may not be true in some cases. One can see that the predictive distribution of L under the copula model is more jagged than one under the independent model, but the copula model is still closer to the independent model than the silo model. It agrees with the proposed vine copula structure described in Figure 2, which shows a sufficient degree of sparsity.
The discrepancy among predictive distributions based on copula, independent, and silo models also can be quantified via Hellinger distance. Hellinger distance, originally proposed by Hellinger (1909), is used to quantify the "distance" between two distributions. For example, if there are two functions f and g which are probability density functions of two distributions F and G, the square of Hellinger distance is calculated as follows: One can see that 0 ≤ H(F, G) ≤ 1 is always satisfied with Hellinger distance so that it is well calibrated and interpreted, where H(F, G) = 0 means F and G are identical almost surely and H(F, G) = 1 means F and G are totally different. Due to this property, Hellinger distance has been widely used in the statistics literature, such as Beran (1977), and estimation of H(F, G) using generated samples from F and G can be easily implemented via R packages such as statip. Here, H 2 (L|Copula, L|Independent), the square of the Hellinger distance between the kernel density of L under copula model with constraints (illustrated with blue solid line in Figure 4) and the density under the independent model with constraints (illustrated with black dotted line in Figure 4) is about 4.33% whileĤ 2 (L|Copula, L|Silo) = 13.54%. Therefore, one can see that the proposed vine copula structure can capture weak dependencies among the lines of business.

Kernel density
Aggregate Unpaid Claims Density 4 × 10 6 5 × 10 6 6 × 10 6 7 × 10 6 8 × Note that Hellinger distance is a special case of the functional Bregman divergence proposed in Goh and Dey (2014) and Jeong (2020), which is given as follows: where ψ(·) is a strictly convex and differentiable function with positive support and ψ(1) = 0. One can Based on the predictive distributions of L and L (n) for n = 1, . . . , 9, one can calculate the risk margins for the unpaid claims as in Table 8. Both VaR and CTE are estimated in an empirical way-in other words, based on the generated samples of the predictive distribution of L or L (n) . Usually, it is expected that a company can get a benefit from the diversification of risks for multiple lines of business so that the risk capital for aggregate reserve is less than the sum of the risk capitals for the reserve of each line, which is known as subadditivity of risk measure. Although subadditivity is not guaranteed for VaR, it is guaranteed for CTE and we can quantify the diversification effects by taking the differences of risk capitals by subtracting the risk capital for aggregate reserve from the sum of the risk capitals for the reserve of each line. Table 9 summarizes measured diversification effects for given models. We can observe that the measured diversification effect of the copula model is much greater than that of the silo model because of weak dependence captured in the vine structure, although it is still less than that of the independent model, which assumes perfect independence among the lines of business.

Practical Issues for Implementation
Note that a loss reserving dataset from a single company was utilized in this article because a primary insurer will be more interested in analysis of dependence among the lines of business in the same company for effective enterprise risk management, rather than whole industry. However, this methodology can be applied for analysis of reserving trend in a dataset of several companies as well. For example, analyzing dependencies among the loss development profiles of several companies can be of interest for the regulatory authorities and reinsurance companies.
One can be also concerned about the predictive ability of these models, which can be distorted by catastrophic claims. In this regard, insurance companies often decompose the loss development profiles into two layers, primary and excess layers, to minimize the distortion impact from the presence of catastrophic claims (Dew and Hedges 1998), and our proposed methodology can be applied to analyze dependencies among the loss development for the primary layer. Note that such distinction of layers is usually internal and may not be disclosed to the public, which also applies to the dataset used in this article.

Conclusions
In this article, we explored and introduced a novel approach which considered possible dependence among the multiple lines of business via vine copula. Use of a vine copula is very flexible and it allows us to model high-dimensional dependence among the business lines, not only bivariate dependence. In the case of traditional copula models, one needs to calibrate many copula models with different families, and go through model validation procedure to choose the best one among the candidates. On the other hand, our proposed vine copula structure enables us to explore the optimal copula structure for the given data in a unified manner.
For model selection, a stepwise approach can be applied to choose the marginal distributions, copula structure, and family for each pair of copulas subsequently. Our empirical analysis on a synthetic insurance portfolio consisting of nine lines of business showed weak associations among the lines.
Further, it was also shown that the naive implementation of a cross-classified model may result in a counterintuitive pattern of loss development, so one might consider a constrained cross-classified model to mitigate this issue. In our work, constraints on the development lag parameters are naturally incorporated via prior elicitation in a Bayesian framework. We expect that a more thorough discussion of the constrained cross-classified model in terms of variable selection could be one of the future directions of this work.
Supplementary Materials: The dataset and code for data analysis are available at https://github.com/ssauljin/ vine_copula_reserving.

Author Contributions:
Both authors worked on the development of the methodology and proofreading. Data preparation, empirical analysis, and draft preparation was done by H.J. Both authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by the Society of Actuaries' James C. Hickman doctoral scholarship.

Acknowledgments:
The authors thank anonymous referees and editors for helpful comments that improved this article.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: