Monte Carlo Simulation of the Moments of a Copula-Dependent Risk Process with Weibull Interwaiting Time

: The widely used Poisson count process in insurance claims modeling is no longer valid if the claims occurrences exhibit dispersion. In this paper, we consider the aggregate discounted claims of an insurance risk portfolio under Weibull counting process to allow for dispersed datasets. A copula is used to deﬁne the dependence structure between the interwaiting time and its subsequent claims amount. We use a Monte Carlo simulation to compute the higher-order moments of the risk portfolio, the premiums and the value-at-risk based on the New Zealand catastrophe historical data. The simulation outcomes under the negative dependence parameter θ , shows the highest value of moments when claims experience exhibit overdispersion. Conversely, the underdispersed scenario yields the highest value of moments when θ is positive. These results lead to higher premiums being charged and more capital requirements to be set aside to cope with unfavorable events borne by the insurers.


Introduction
The primary objective of an insurance company is to ensure that it can pay its promised obligations and remain solvent in the business. This can be achieved by managing surplus processes effectively and charging the appropriate premium amount, which will then guarantee an adequate reserve and capital requirement to cope with any unfavorable events. The ability to estimate accurately the aggregate discounted claims would help insurers manage their future liabilities and estimate appropriate premiums to be charged for an insurance contract under a particular line of business. The aggregate discounted claims, or the present value of the total claims amount paid out by an insurance company, are based on the number of claims occurrences up to time t, claims arrival time and claims amount, as well as the discount factor. Past studies related to the first two moments of the aggregate discounted claims were seen in Léveillé and Garrido (2001a) using renewal theory argument while, per Jang (2004), obtaining the Laplace transformation of the distribution using a different martingale approach. Léveillé and Garrido (2001b) applied the renewal theory arguments and conditioning on the first claim arrival to derive the m-th recursive aggregate moment, assuming independence of claim arrival time from its severity.
The independence assumption between claims size and inter-claim arrival time in a classical risk model, as seen in Jang (2004), Waters (1983) and Yang and Zhang (2001) may no longer be appropriate for insurance risk portfolio modeling. With the increased frequency of catastrophic events (MunichRE 2018), the independence assumption will underestimate the risks faced by insurers, especially the values related to pricing, reserving and ruin measures of a risk portfolio. Past literature has relaxed the independence assumption used in the classical risk theory to include dependency elements between claims amounts and claim occurrences that follows a Poisson process (Barges et al. 2011;Mohd Ramli and Jang 2014;Sun et al. 2020;Woo and Cheung 2013). Barges et al. (2011) and Woo and Cheung (2013) were among the first to derive the m-th moment of the aggregate discounted claims using the Laplace transformation and relax the independence assumptions through a copula. Mohd Ramli and Jang (2014) extended the works by explicitly finding the Neumann series expressions and numerically solving the first two moments, where the dependence structure is captured by Farlie-Gumbel-Morgenstern (FGM), Gumbel, and Gaussian copulae. Mohd Ramli et al. (2018), on the other hand solved the recursive moments which take the form of a Volterra integral equation (VIE) using the Laplace transform, assuming an exponential claims size under the FGM, Frank and the HRT copulae. Recent study by Sun et al. (2020) derived the explicit expressions for the first three moments using a Laplace transformation by adopting the Spearman copula to cover a wide range of positive dependency, and developed the convex approximation of the copula. The studies mentioned above illustrated the use of the Poisson counting process to represent event occurrence.
Owing to the practical limitation in the Poisson process which requires an equidispersed set of data, researchers has also adopted the Poisson-gamma or also known as negative binomial count process that only allows for an overdispersed dataset (Lora and Singer 2011). The Weibull count process, on the other hand, is a good alternative to allow for both underdispersed and overdispersed datasets, as commonly seen in applied data analysis with heterogeneous populations (Kreer et al. 2015;McShane et al. 2008;Winkelmann 1995). McShane et al. (2008) derived a closed form of the renewal process based on Weibull inter-arrival times using a Taylor series expansion. Additionally, as the constant hazard rate assumption is not valid in real practice, the duration dependence or time-varying function under a renewal process replaces the constant hazard function. The non-constant hazard rates that vary according to the duration of the interwaiting time (IWT) would be useful in modeling the unexpected random shocks that lead to the breakdown of many engineering systems, as seen in the work of Liu (2019). Compared with a simple Poisson count model with independent marginals, the Weibull count model with a copula is found to be more useful for predicting the number of goals scored in a football match (Boshnakov et al. 2017).
The Weibull process has also been widely used in insurance and financial studies. Kreer et al. (2015) and Hasumi et al. (2009) mentioned that the probability distribution of the inter-occurrence time of the earthquake events and the automotive claim sizes distribution can be described by the Weibull distribution. It has also been applied in ruin theory see (Albrecher et al. 2011) to establish explicit formulas for ruin probabilities under a Sparre-Andersen risk process. Past researchers worked with explicit solutions of the moments derived from the Laplace transform (Barges et al. 2011;Li and Lu 2018;Mohd Ramli et al. 2018) or Neumann series (Mohd Ramli and Jang 2014), which may require heavy computational capacity resulting from lengthy expressions. The variance expression derived from the Neumann series (Mohd Ramli and Jang 2014) has up to sextuple numerical integration even with a simple Poisson counting process. Thus, Mohd Ramli et al. (2019) work with numerical inversion of the Laplace transform via Gaver-Stehfest algorithm to compute the first two moments of Weibull risk process. However, the slow convergence of the numerical solutions under copulae other than FGM may be improved with better computational techniques. This study seeks to overcome the mathematical complexity under a Weibull risk process by using the Monte Carlo simulation to allow for faster computation and more flexibility to compute the higher-order moments through a wider choice of copula.
We organize the remainder of this article as follows: In Section 2, we introduce the aggregate discounted claims model under Weibull counting process together with copulae to represent the dependency between the IWT and the subsequent claim sizes. In Section 3, we first show the comparison of the simulated moments under Monte Carlo simulation for a special case when λ = 1, with results obtained in previous studies to ensure consistency.
We then fit the distributions and estimate the associated parameters to the New Zealand catastrophe occurrences and losses amount data to illustrate the dispersion in the dataset as allowed by a Weibull process. We also find the best copula to accommodate the dependency in the dataset. We then perform a scenario analysis to examine the characteristics of the resulting risk portfolio through the moments, the VaR and the premium computation. Section 4 concludes the article.

Aggregate Risk Model
We define the aggregate discounted claims as in Léveillé and Garrido (2001b), whereby Z = {Z(t)} t≥0 with: where X i , i = 1, 2, . . . is a continuous, non-negative, independent and identically distributed (i.i.d.) random variable that represents claims sizes occurring at random times T i , i = 1, 2, . . . , N(t). The instantaneous rate of net interest δ is assumed to be deterministic.

Weibull Counting Process
In this study, we let the counting process N = {N(t)} t≥0 follow a basic Weibull counting process as in Mohd Ramli et al. (2019), whereby the corresponding continuous random variable of the inter-claim arrival time, W i , is defined as: which follows a Weibull distribution. The time elapsed between two successive claims arrival is called the interwaiting time (henceforth IWT) of the counting process N(t). Each pair of the joint variables forms a sequence of i.i.d. random vectors {(X i , W i )} i∈N . The mean and variance of a Weibull count model with shape parameter λ and scale parameter β, derived by McShane et al. (2008) are given by: where Γ(j−m+1) , for n = 1, 2, 3, ..., j = n, n + 1, n + 2, ..., in which the parameter n denotes the number of events that occur in the interval [0, t].
Although the memoryless property of the exponentially distributed IWT allows for mathematical convenience, the Poisson counting process is only adequate if the data satisfy the restrictive assumption of equidispersion in which the variance of the data is equal to their mean. Hence, the Weibull count model is a better alternative for its ability to accommodate both overdispersed and underdispersed datasets. It also nests commonly used count models, including the Poisson and the negative binomial distributions. Additionally, the Weibull count model also allows for non-constant hazard rates that vary according to the duration of the IWT (McShane et al. 2008). The hazard function characterizes the IWT distribution and relates them to the type of dispersion observed in the corresponding count data. Past studies (Jose and Abraham 2011;McShane et al. 2008;Winkelmann 1995) have verified that the underlying IWT displays negative (positive) duration dependence under a decreasing (increasing) hazard for shape parameter of 0 < λ < 1 (λ > 1) which causes overdispersion (underdispersion). The lack of duration dependence for λ = 1 leads to the Poisson distribution with a constant hazard function, as shown in Figure 1.
We also note that there are two types of dependency applied when modeling the IWT under the Weibull counting process. First dependency lying between the duration of the IWT and the frequency of the claims arrival represented by non-constant hazard rates. As mentioned in McShane et al. (2008) and Winkelmann (1995), the negative (positive) duration dependence indicates a higher (lower) probability of claims occurring immediately after the occurrence of previous claims and decreases (increases) steadily as the IWT increases. This implies that the overdispersed (underdispersed) cases produce higher claims frequencies under short (long) IWT and vice versa.

Copula
The second dependency is between the IWT and its subsequent claims amount, which is captured by a copula. A copula provides a more convenient way to model any joint distribution of two or more random variables. We can model the marginal distributions of each variable by itself, and the copula can link these into a joint distribution. As we allow dependency between the claims size X i , and the inter-arrival time W i , we have a dependent Sparre-Andersen risk process with a dependence structure defined by a copula or other joint probability functions. Please note that when the marginals are negatively (positively) correlated, a large claim amount X i will occur following a short (long) inter-arrival claim time W i , and vice versa. In this study, the dependence structure between the two marginals of X and W are described by a few copulae including the Clayton, Frank and FGM copulae.
Both the Clayton and Frank copulae belong to the same Archimedean family, in which the cumulative distribution function (CDF) of each copula is given by: where θ ∈ R. The CDF of an FGM copula, which is from a different class of copula family is given by: where θ ∈ [−1, 1] represents the dependence parameter (Klugman and Parsa 1999;Ly et al. 2019). The copulae return to an independence copula when θ = 0. The FGM copula is a popular choice in extreme value analysis and risk management (Mao and Yang 2015) due to its simplicity and analytical tractability, despite its ability to capture only moderate dependency. The moderate dependence structure for FGM copula can be seen in Figure 2k-o, in which θ ∈ [−1, 1] is restricted to Kendall's tau, τ ∈ [−2/9, 2/9]. In contrast, the Clayton copula (Figure 2a,e) and Frank copula (Figure 2f,j) allow for a wider range of dependency with τ ∈ [−1, 1], as shown by the almost linear scatter plots of strong negative and positive dependency. However, the Frank copula could neither capture lower nor upper tail dependence, whereas the Clayton copula allows for lower tail dependence, or dependency on small values (Ly et al. 2019). The lower tail dependency can be seen clearly in its 'ice cream cone' shapes as in Figure 2b,d.

Recursive Moment Expressions
The general form of the m-th moment of aggregate discounted claims Z(t), as given here, has a recursive nature (Barges et al. 2011;Mohd Ramli and Jang 2014). Working with explicit solutions of Equation (8) require heavy computational capacity due to lengthy expressions as seen in Mohd Ramli and Jang (2014) and Mohd Ramli et al. (2018). We therefore use the Monte Carlo simulation to compute the higher order of moments in a much shorter period with a common computational capacity. Hence, the mathematical properties of the insurance risk portfolio can be examined in a more efficient and viable manner despite the complicated expressions, which would be welcomed by the industrial community (Driels and Shin 2004).

Monte Carlo Simulation
Monte Carlo simulation is a popular and powerful quantitative tool often used in risk analysis such as in evaluation of insurers' capital requirements (Casarano et al. 2017), estimation of claim size distribution (Bar-Lev and Ridder 2019), mortality projection (Zamzuri and Hui 2020) and bootstrapping approach in Structural Equation Modeling (Razak et al. 2019). We apply the Monte Carlo simulations to incorporate and understand the impact of risk and uncertainty in our model by computing the probability of different outcomes in a risk process (Zamzuri and Hui 2020).
In this study, we simulate the aggregate discounted claims process using the estimated parameters and distributions from a real insurance dataset. In simulating the copula function, we adopt the same approach as Kelly (2007), whereby the first stage involves modeling the unidimensional marginal distributions while the second stage involves modeling the dependence structure. The steps taken in the simulation process are the following: 1. Generate pairs of dependent random variates {(X i , W i )} from multivariate distributions constructed from the chosen copula. The multivariate distributions are based on the best-fit distribution obtained from the insurance dataset. 2. Compute the random time T i , from the accumulated W i as in Equation (2). 3. Compute the aggregate discounted claims Z(T), by assuming deterministic δ. 4. Stop running the iterations once the T i is above a pre-determined term of a policy contract. 5. Repeat the process from step 1 to 4 for n simulations. 6. Determine the moments, premium and VaR from the simulated risk process Z(T).

Results and Discussion
In this section, we organize the results and discussion by first showing the consistency of results returned by the Monte Carlo simulation method with results obtained in previous studies (Section 3.1). Next, we fit the aggregate discounted claims model to the New Zealand catastrophe data in Section 3.2. Using the estimated parameters of the claims size distribution, the IWT distribution and the dependency between these marginals, we then examine the risk portfolio of the New Zealand catastrophe dataset. This includes computing the value of moments, premium and VaR in Section 3.3. Finally, we perform scenario analysis in Section 3.4 by varying the parameters of the Weibull process and the copulae to examine the impact on the risk portfolio.

Results Verification
We first consider the special case of a Weibull IWT with the shape parameter λ = 1 to ensure consistency of the Monte Carlo simulations with the numerical values obtained through Laplace transform as in Barges et al. (2011), Mohd Ramli and Jang (2014) and Mohd Ramli et al. (2018). We simulate pairs of the joint variables of random vectors (X i , W i ) (i∈N) with an FGM copula defining the dependency between the variables. The values were computed using the same parameters in which X ∼ Exp(α = 0.01), W∼ Exp(β = 1), δ = 0.04 and T = 5 for different values of θ. The average number of iterations for each simulation is 5, which is consistent with the average of the exponentially distributed IWT. The process is then repeated for n = 100, 000 simulations.
The mean and variance of the aggregate discounted claims obtained via Monte Carlo simulation are presented in Table 1 and we compare the results with the explicit solution under the Laplace transform as in Mohd Ramli et al. (2018). The simulation results show only slight deviation from the values computed using the Laplace transform, of less than 1% differences for the mean and 2% for the variance.

Fitting Distribution and Parameter Estimation of Insurance Datasets
In this section, we fit the aggregate discounted claim models on the New Zealand catastrophe occurrences and losses amount data from 1968 to 2014 which was retrieved from the CASdatasets package in R Statistical Software (Dutang and Charpentier 2020). We estimate the parameters of the fitted distributions of the claims size and the IWT, as well as the dependency between the marginals, using the maximum likelihood estimation method.

The Claim Sizes Distribution
The Cullen and Frey (Cullen et al. 1999) graph in Figure 3 illustrates the skewnesskurtosis plot of the empirical distribution of the New Zealand catastrophe loss claims. The values are computed on 1000 bootstrap samples to allow for the uncertainty of the estimated values of kurtosis and skewness from data. This graph exhibits some common distributions on the plot to help with the choice of distributions to fit to data. The normal, uniform, and logistic distributions are not good fits for the data since the skewness-kurtosis points of the distribution are far from the empirical value. The descriptive statistic in Table 2 shows a positive skewness and a heavy-tailed kurtosis. Thus, the fit of three common rightskewed distributions could be considered, which are the Weibull, Gamma and Lognormal distributions, together with other heavy-tailed distributions such as Pareto and Burr. The results in Table 3 and Figure 4 show that the lognormal distribution, with a meanlog of 1.35969 and an sdlog of 1.52856, provides the best fit for the claims size distribution (values in million USD). The fitted lognormal distribution has the lowest negative log likelihood, Akaike information criterion (AIC) and Bayesian information criterion (BIC) values, with significant p-values under the Kolmogorov-Smirnov (KS) and Anderson-Darling (AD) test.

The Interwaiting Time Distribution
The descriptive statistics of the IWT of the New Zealand catastrophe claims arrival in Table 4 imply that the claims arrive on average every 138.29 days, with a maximum IWT of 7.39 years. The IWT exhibits positive skewness with high kurtosis. Table 5 and Figure 5 show that the Weibull distribution is the best fit for the IWT when compared with exponential distribution. The Weibull shape parameter value of 0.700152 < 1 indicates an overdispersed claims occurrences, implying that the claims arrive at times that vary more than expected.   Table 6 exhibits that the Clayton copula is the best fit for the dependence structure between the IWT and its subsequent claim amount embedded in the New Zealand catastrophe historical data. Since Clayton copula allows for lower tail dependence, this indicates that the IWT and the subsequent claims amount show more dependency on small values. The positive dependency between the marginals as shown by parameter θ = 0.32 or the equivalent Kendall's τ of 0.14, implies that the large (small) claim amount would occur following the long (short) IWT.

Risk Characteristic of the Risk Process with Overdispersed Claim Arrival
We will examine the effect of overdispersion in the claims arrival of the New Zealand catastrophe historical data on the moments, premium and VaR in comparison to the equidispersed counting process through the Monte Carlo simulation. We are setting δ = 0.025, X i ∼lognormal (1.3597,1.5286), with W i following the Weibull and exponential distributions from the estimated parameters as in Table 5. The dependency of X i and W i is captured through a Clayton copula with θ = 0.32.
The simulation results as illustrated in Table 7 show that both risk processes of Z(t) are a right-skewed distribution based on the positive skewness. The high value of kurtosis from the results indicates a heavy-tailed distribution. However, the exponential IWT underestimates the value of the mean and variance, which results in a lower value of premium and VaR compared with the overdispersed Weibull IWT as seen in Table 8.

Scenario Analysis on Risk Characteristic of the Risk Process under Various Dispersion Effect on Claims Arrival
In this section, we perform a scenario analysis on the moments, premium and VaR by varying the shape λ and the scale β parameters under the Weibull IWT distribution. This is done to allow for various dispersion effect of claims occurrences, across different dependence parameter θ. The parameters of the Weibull count process are chosen so that they have an equal mean of claim frequencies but with different variances, obtained through Equation (4). Table 9 illustrates the chosen parameter in which the equidispersed case has an equal mean and variance of 4.243, whereas the variance under the overdispersed (underdispersed) case is higher (lower) than the mean. This is then reflected in the probability distribution of the claims frequencies as seen in Figure 6. There is an increased chance of more claims occurring under the overdispersed case, while the claim frequencies are more concentrated around the mean for the underdispersed case in a fixed time interval. Table 9. Marginal mean and variance of overdispersed, equidispersed and underdispersed Weibull count models, Weibull C when T = 5.

Overdispersed
Equidispersed Please note that the simulation of the aggregate discounted claims under Weibull process adopt two types of dependency. The first type is the duration dependence between the duration of the IWT, W i , and the frequency of claims arrivals, which causes dispersed dataset. The second type is the dependency between the W i and its subsequent claims amount X i , which is captured by a dependence parameter θ of a copula. Table 10 shows that the highest mean and variance are obtained when the X i and W i exhibit strong negative correlation as seen when θ = −1. This is intuitively reasonable; short IWT (or frequent claim occurrence) followed by large claims amount X i , will produce higher value of moments. However, as θ changes from negative to positive dependency, the mean and variance are decreasing under all dispersion scenarios.
In comparison to an equidispersed process, the case of overdispersed claims occurrences produces the highest first two moments when θ is negative. Conversely, the underdispersed case shows the highest values of moments when θ is positive. These outcomes could be related to both aforementioned types of dependencies under Weibull process. Higher claim frequencies under short (long) IWT will result in many large claims when θ is negative (positive).
Additionally, the positive skewness and the high value of kurtosis imply that the risk process exhibits a right-skewed with a heavy-tailed risk distribution. However, as θ changes to a strong positive dependency, the tail of the risk process under the overdispersed (underdispersed) case is less (more) skewed with less (more) extreme values on the tail distribution as shown in the value for skewness and kurtosis. This is then reflected in Table 11, whereby the risk process with heavier-tailed distribution leads to a greater difference between the value of 95% and the 99.5% VaR. The 99.5% VaR has more than tripled in value than the 95% VaR under the most heavy-tailed risk process shown by the underdispersed case with a strong positive θ. The solvency capital requirement (SCR), which is the additional amount of funds an insurer is required to hold, should correspond to the 99.5% VaR as outlined under the Solvency II regime (Christiansen and Niemeyer 2014). In doing so, the premium should be charged appropriately, taking into account risks related to the claims experience. The computation of loaded premium, Π according to the mean and standard deviation premium principle are given by the following: with κ ∈ [0, 1] as the loading factor. The results presented in Table 11 and Figure 7 show that an insurer needs to charge higher premiums and set aside more capital requirement if the claims experience exhibits overdispersion (underdispersion) with negative (positive) θ, in comparison to equidispersed claims experience. These results are consistent with a Frank copula which is seen in Table 12 and Figure 8. However, the claims modeling based on FGM copula as represented in Table 13 and Figure 9, do not produce the same pattern because of that copula's weak dependence structure as compared with the other two Archimedean copulae with greater dependency.
We plotted again the simulated premium and VaR across different Kendall's τ to standardize the value on x-axis for comparison. The VaR and premium for each dispersion case exhibit the same value, respectively when τ = 0 as seen in Figures 10 and 11. As τ changes from −1 to 1, the overdispersed (underdispersed) case illustrates the widest (narrowest) spread of values among all. On average, Frank (Clayton) copula illustrates higher value of premiums and VaR than the Clayton (Frank) copula when τ is negative (positive), while the values under FGM copula are illustrated between the two copulae.
In comparison to Mohd Ramli et al. (2019) which only illustrated the computations of the first two moments of a Weibull risk process with λ = 2 under an FGM copula, we used the Monte Carlo simulation technique and obtained the higher-order moments of the Weibull risk process with wider choice of copulae. With the additional information on the asymmetry and the tail distribution of the aggregate discounted claims, insurance providers can allow for the dispersion in the claims occurrences and dependency between variables when estimating the premium and capital requirement for an insurance risk portfolio.

Conclusions
The ability of an insurance firm to manage surplus effectively is important to ensure capital adequacy, as required by the Solvency II standard. To allow for dispersed datasets, we propose the Weibull counting process to represent the claims arrival process of an insurance risk portfolio and an arbitrary continuous claims size. Under this model, two types of dependency were accommodated which were captured by the hazard function and the copula. We then fit the model to the historical data of New Zealand catastrophe events occurrences and losses, which was best represented by an overdispersed Weibull process with claim amounts following a lognormal distribution and a Clayton copula with a weak dependency. The moments estimation via Monte Carlo simulations shows the highest mean and variance of the aggregate discounted claims were obtained under an overdispersed claim arrivals with a strong negative correlation. The positive skewness and the high value of kurtosis of the aggregate discounted claims with Weibull IWT also implied an asymmetric and a heavy-tailed risk distribution. The scenario analysis conducted indicated that insurers will need to charge higher premiums and set aside more capital requirements as the claims experience exhibits overdispersed (underdispersed) IWT with negative (positive) dependency captured by copulae, in comparison to claims experience with equidispersed IWT. Other than computing the respective ruin probability under the Weibull risk process and applying commonly used copulae in finance and insurance, we may also examine the applicability of the model on a random annuity plan with random payments being paid at random times.  Data Availability Statement: Publicly available datasets were analyzed in this study. This data can be found here: http://cas.uqam.ca/, http://dutangc.perso.math.cnrs.fr/RRepository/ (accessed on 15 March 2021).

Conflicts of Interest:
The authors declare no conflict of interest. The funder had no role in the design of the study; in the collection, analyses or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations
The following abbreviations are used in this manuscript:

AD
Anderson