Claims Modelling with Three-Component Composite Models

In this paper, we develop a number of new composite models for modelling individual claims in general insurance. All our models contain a Weibull distribution for the smallest claims, a lognormal distribution for the medium-sized claims, and a long-tailed distribution for the largest claims. They provide a more detailed categorisation of claim sizes when compared to the existing composite models, which differentiate only between the small and large claims. For each proposed model, we express four of the parameters as functions of the other parameters. We fit these models to two real-world insurance data sets using both maximum likelihood and Bayesian estimation, and test their goodness-of-fit based on several statistical criteria. They generally outperform the existing composite models in the literature, which comprise only two components. We also perform regression using the proposed models.


Introduction

Current Literature
Modelling individual claim amounts which have a long-tailed distribution is an important task for general insurance actuaries. The usual candidates with a heavy tail include the two-parameter Weibull, lognormal, and Pareto models and the three-parameter Burr model (e.g., Dickson 2016). Venter (1983) introduced the four-parameter generalised beta type-II (GB2) model, which nests more than 20 popular distributions (e.g., Dong and Chan 2013) and can provide more flexibility in describing the skewness and kurtosis of the claims. McNeil (1997) applied the generalised Pareto distribution (GPD) to the excesses above a high threshold based on extreme value theory. Many advanced models have been built with these various distributional assumptions, as it is crucial for an insurer to provide an adequate allowance for potential adverse financial outcomes.
In order to deliver a reasonable parametric fit for both smaller claims and very large claims, Cooray and Ananda (2005) constructed the two-parameter composite lognormal-Pareto model. It is composed of a lognormal density up to an unknown threshold and a Pareto density beyond that threshold. Using a fire insurance data set, they demonstrated a better performance by the composite model when compared to traditional models like the gamma, Weibull, lognormal, and Pareto. Scollnik (2007) improved the lognormal-Pareto model by allowing the weights to vary and also introduced the lognormal-GPD model, in which the tail is modelled by the GPD instead. By contrast, Nadarajah and Bakar (2014) modelled the tail with the Burr density. Scollnik and Sun (2012) and Bakar et al. (2015) further tested several composite models which use the Weibull distribution below the threshold and a variety of heavy-tailed distributions above the threshold. In all these extensions, an important feature is that the threshold selection is based on the data. Moreover, all the authors hitherto imposed continuity and differentiability conditions at the threshold point, and so the effective number of parameters is reduced by two. While there are some other similar mixture models in the literature (e.g., Calderín-Ojeda and Kwok 2016; Reynkens et al. 2017), we reserve the term "composite model" for only those with these continuity-differentiability requirements in this paper. Some other recent and related studies include those of Laudagé et al. (2019), Wang et al. (2020), and Poufinas et al. (2023).

Proposed Composite Models
All the composite models mentioned above have only two components. For a very large data set, the behaviour of claims of different sizes may differ vastly, which calls for a finer division of the claim amounts and thus more components to be incorporated (e.g., Grün and Miljkovic 2019). In this paper, we develop new three-component composite models in an attempt to provide a better description of the characteristics of different data ranges. Each of our models contains a Weibull distribution for the smallest claims, a lognormal distribution for the medium-sized claims, and a heavy-tailed distribution for the largest claims. We choose the sequence of starting with the Weibull and then the lognormal for a few reasons. First, as shown in Figure 1, the Weibull distribution tends to have a more flexible shape on the left side, which makes it potentially more useful for the smallest claims. Second, the lognormal distribution usually has a heavier tail, given the mean and variance, as the limiting density ratio of Weibull to lognormal approaches zero when x goes to infinity (see Appendix A). This means that the lognormal distribution is more suitable for claims of larger sizes. Nevertheless, neither the Weibull nor the lognormal possesses a sufficiently heavy tail for modelling the largest claims. Comparatively, heavy-tailed distributions like the Pareto, Burr, and GPD are better options for this purpose. We apply the proposed three-component composite models to two real-world insurance data sets and use both maximum likelihood and Bayesian methods to estimate the model parameters for comparison. Based on several statistical tests of the goodness-of-fit, we find that the new composite models outperform not just the traditional models but also the earlier two-component composite models. In particular, it is informative to see how the fitted models indicate the splits or thresholds that separate the claims into three categories: small, medium, and large. We experiment with applying regression under the proposed model structure and find that different claim sizes have different significant covariates. Moreover, we consider a 3D map which can serve as a risk management tool and summarise the entire model space and the resulting tail risk estimates. Note that we focus on the claim severity (but not the claim frequency) in this study. The remainder of the paper is as follows. Sections 2-4 introduce the composite Weibull-lognormal-Pareto, Weibull-lognormal-GPD, and Weibull-lognormal-Burr models. Section 5 provides a numerical illustration using two insurance data sets of fire claims and vehicle claims. Section 6 sets forth the concluding remarks. Appendix A presents some supplementary derivations and MCMC simulation outputs.

Weibull-Lognormal-Pareto Model

We define the composite Weibull-lognormal-Pareto density as

f(x) = w1 f1(x) / F1(θ1) for 0 < x ≤ θ1,
f(x) = w2 f2(x) / (F2(θ2) − F2(θ1)) for θ1 < x ≤ θ2,
f(x) = (1 − w1 − w2) f3(x) for x > θ2,

where f1 and F1 denote the Weibull(φ, τ) density and distribution functions, f2 and F2 denote the lognormal(µ, σ) density and distribution functions, f3(x) = α θ2^α / x^(α+1) is the Pareto density with support above θ2, and φ, τ, µ, σ, and α are the model parameters. The weights w1 and w2 decide the total probability of each segment. The thresholds θ1 and θ2 are the points at which the Weibull and lognormal distributions are truncated, and they represent the splitting points between the three data ranges. We refer to this model as the Weibull-lognormal-Pareto model.
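As a minimal illustration of this kind of three-component construction, the sketch below assembles a Weibull-lognormal-Pareto density from truncated components, with the weights chosen so that the density is continuous at the two thresholds. All numeric parameter values here are hypothetical and chosen purely for illustration; they are not fitted values from the paper.

```python
import numpy as np
from scipy import stats

# Hypothetical parameter values, for illustration only.
tau, phi = 1.5, 1.0        # Weibull shape and scale
mu, sigma = 0.8, 0.6       # lognormal log-mean and log-sd
alpha = 2.5                # Pareto tail index
theta1, theta2 = 1.0, 4.0  # thresholds separating small/medium/large claims

wei = stats.weibull_min(c=tau, scale=phi)
lgn = stats.lognorm(s=sigma, scale=np.exp(mu))
par = stats.pareto(b=alpha, scale=theta2)   # support already starts at theta2

F1 = wei.cdf(theta1)                        # Weibull mass below theta1
F2 = lgn.cdf(theta2) - lgn.cdf(theta1)      # lognormal mass between thresholds

# Solve the two continuity conditions for the weights:
#   w1*f1(t1)/F1 = w2*f2(t1)/F2   and   w2*f2(t2)/F2 = (1 - w1 - w2)*f3(t2)
a1 = wei.pdf(theta1) / F1
b1 = lgn.pdf(theta1) / F2
b2 = lgn.pdf(theta2) / F2
c2 = par.pdf(theta2)
w2 = c2 / (b2 + c2 * b1 / a1 + c2)
w1 = w2 * b1 / a1

def density(x):
    """Composite Weibull-lognormal-Pareto density (continuous by construction)."""
    x = np.asarray(x, dtype=float)
    return np.where(x <= theta1, w1 * wei.pdf(x) / F1,
           np.where(x <= theta2, w2 * lgn.pdf(x) / F2,
                    (1.0 - w1 - w2) * par.pdf(x)))
```

Note that this sketch imposes only the continuity conditions; the paper additionally imposes differentiability at the thresholds, which further reduces the effective number of parameters.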
In line with previous authors including Cooray and Ananda (2005), two continuity conditions f(θ1−) = f(θ1+) and f(θ2−) = f(θ2+), and two differentiability conditions f′(θ1−) = f′(θ1+) and f′(θ2−) = f′(θ2+), are imposed at the two thresholds. It can be deduced that the former yields two equations that determine the weights w1 and w2, and that the latter generates two further constraints. Because of these four relationships, there are effectively five unknown parameters, namely τ, σ, α, θ1, and θ2, with the others φ, µ, w1, and w2 expressed as functions of these parameters. As in all the previous works on composite models, a second-derivative requirement is not imposed here because it often leads to inconsistent parameter constraints. One can readily derive the kth moment of X (see Appendix A); it involves the lower incomplete gamma function γ(s, z) = ∫_0^z t^(s−1) exp(−t) dt and requires α > k.
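The role of the lower incomplete gamma function can be made concrete for the Weibull component: restricting the Weibull density to (0, θ1] and substituting u = (x/φ)^τ gives the kth moment φ^k γ(k/τ + 1, (θ1/φ)^τ) / F1(θ1), where F1 is the Weibull cdf. The sketch below verifies this identity against direct numerical integration; the parameter values are hypothetical.

```python
import numpy as np
from scipy import stats, special
from scipy.integrate import quad

# Illustrative values only.
tau, phi, theta1, k = 1.5, 1.0, 2.0, 3

def trunc_weibull_moment(k, tau, phi, theta1):
    """kth moment of a Weibull(phi, tau) truncated to (0, theta1]."""
    a = k / tau + 1.0
    z = (theta1 / phi) ** tau
    # scipy's gammainc is the *regularised* lower incomplete gamma,
    # so multiply by gamma(a) to recover the unregularised form.
    lower_inc_gamma = special.gamma(a) * special.gammainc(a, z)
    F1 = 1.0 - np.exp(-z)          # Weibull cdf at theta1
    return phi**k * lower_inc_gamma / F1

# Check against direct integration of x^k f(x) / F(theta1)
wei = stats.weibull_min(c=tau, scale=phi)
numeric = quad(lambda x: x**k * wei.pdf(x), 0, theta1)[0] / wei.cdf(theta1)
```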

Weibull-Lognormal-GPD Model
Similarly, we construct the Weibull-lognormal-GPD model, which keeps the Weibull density for 0 < x ≤ θ1 and the lognormal density for θ1 < x ≤ θ2, but models the tail beyond θ2 with the GPD. Under the continuity and differentiability conditions, the weights are determined accordingly, and there are also two other constraints. There are six effective model parameters, τ, σ, α, λ, θ1, and θ2, with the others φ, µ, w1, and w2 given as functions of these parameters. The kth moment of X can be expressed in terms of the moment-generating function of the GPD and its kth derivative with respect to t at t = 0, for α > k.
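To make the GPD tail component concrete, the short sketch below builds the piece beyond θ2 with scipy's genpareto, under an assumed parameterisation with scale λ and shape 1/α (so that the tail index matches α, as in the Pareto-type notation above); the numeric values are hypothetical. It illustrates that the tail piece is a proper density on (θ2, ∞) and that its mean exists whenever α > 1.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Assumed/illustrative parameterisation: shape 1/alpha, scale lambda_, shifted to theta2.
alpha, lambda_, theta2 = 2.5, 1.2, 4.0
gpd = stats.genpareto(c=1.0 / alpha, scale=lambda_, loc=theta2)

# The tail piece integrates to one over (theta2, inf) ...
mass = quad(gpd.pdf, theta2, np.inf)[0]

# ... and, for alpha > 1, its mean is theta2 + lambda_ / (1 - 1/alpha).
mean_formula = theta2 + lambda_ / (1.0 - 1.0 / alpha)
```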

Weibull-Lognormal-Burr Model
Lastly, we define the Weibull-lognormal-Burr model, in which, for α, β, γ, θ2 > 0, the Burr distribution is truncated from below at θ2. Again, the continuity and differentiability conditions lead to equations for the weights and two further constraints. There are effectively seven model parameters to be estimated, namely τ, σ, α, β, γ, θ1, and θ2. The others, φ, µ, w1, and w2, are derived from these parameters. The kth moment of X can be computed accordingly. Figure 2 gives a graphical illustration of the three new composite models. All the graphs are based on the values of w1 = 0.2 and w2 = 0.6; that is, the expected proportions of small, medium, and large claims are 20%, 60%, and 20%, respectively. For illustration purposes, the parameters are arbitrarily chosen such that each set gives rise to exactly the same expected proportions of the three claim sizes. For the case in the top panel, which has similar Weibull and lognormal parameters and the same weights amongst the three models, the Pareto tail is heavier than the GPD tail, followed by the Burr one. In the bottom panel, while all three Weibull-lognormal-Pareto models have the same component weights, the differences in the parameter values can generate very different shapes and tails of the densities. The three-component composite models can provide much flexibility for modelling individual claims of different lines of business.
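Claims from a three-component composite of this kind can be simulated by first selecting a component according to the weights and then drawing from that component's truncated range by inverse transform. The sketch below is illustrative only: the weights w1 = 0.2 and w2 = 0.6 match the expected proportions above, but the remaining parameter values are hypothetical and are not forced to satisfy the continuity-differentiability constraints.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
w1, w2 = 0.2, 0.6                 # expected proportions of small/medium claims
tau, phi = 1.5, 1.0               # hypothetical Weibull parameters
mu, sigma = 0.8, 0.6              # hypothetical lognormal parameters
alpha = 2.5                       # hypothetical Pareto tail index
theta1, theta2 = 1.0, 4.0         # thresholds

wei = stats.weibull_min(c=tau, scale=phi)
lgn = stats.lognorm(s=sigma, scale=np.exp(mu))
par = stats.pareto(b=alpha, scale=theta2)

def simulate(n):
    u = rng.uniform(size=n)
    comp = rng.choice(3, size=n, p=[w1, w2, 1 - w1 - w2])
    x = np.empty(n)
    # Inverse-cdf draws restricted to each component's truncation range
    m = comp == 0
    x[m] = wei.ppf(u[m] * wei.cdf(theta1))
    m = comp == 1
    lo, hi = lgn.cdf(theta1), lgn.cdf(theta2)
    x[m] = lgn.ppf(lo + u[m] * (hi - lo))
    m = comp == 2
    x[m] = par.ppf(u[m])          # Pareto support already starts at theta2
    return x

claims = simulate(100_000)
```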


Application to Two Data Sets
We first apply the three composite models to the well-known Danish data set of 2492 fire insurance losses (in millions of Danish Krone; a complete data set). The inflation-adjusted losses in the data range from 0.313 to 263.250 and are collected from the "SMPracticals" package in R. This data set has been studied in earlier works on composite models, including those of Cooray and Ananda (2005), Scollnik and Sun (2012), Nadarajah and Bakar (2014), and Bakar et al. (2015). For comparison, we also apply the Weibull, lognormal, Pareto, Burr, GB2, lognormal-Pareto, lognormal-GPD, lognormal-Burr, Weibull-Pareto, Weibull-GPD, and Weibull-Burr models to the data. Based on the reported results from the authors mentioned above, the Weibull-Burr model has been shown to produce the highest log-likelihood value and the lowest Akaike Information Criterion (AIC) value for this Danish data set.
The previous authors mainly used the maximum likelihood estimation (MLE) method to fit their composite models. While we still use the MLE to estimate the parameters (with nlminb in R), we also perform a Bayesian analysis via Markov chain Monte Carlo (MCMC) simulation. More specifically, random samples are simulated from a Markov chain whose stationary distribution equals the joint posterior distribution. Under the Bayesian framework, the posterior distribution is derived as f(θ|X) ∝ f(X|θ) f(θ). We perform MCMC simulations via the software JAGS (Just Another Gibbs Sampler) (Plummer 2017), which uses the Gibbs sampling method. We make use of non-informative uniform priors for the unknown parameters. Note that the posterior modes under uniform priors generally correspond to the MLE estimates. For each MCMC chain, we omit the first 5000 iterations and collect 5000 samples afterwards. Since the estimated Monte Carlo errors are all well within 5% of the sample posterior standard deviations, the level of convergence to the stationary distribution is considered adequate in our analysis. Some JAGS outputs of the MCMC simulation are provided in Appendix A. We employ the "ones trick" (Spiegelhalter et al. 2003) to specify the new models in JAGS. The Bayesian estimates provide a useful reference for checking the MLE estimates. Despite the major differences in their underlying theories, their numerical results are expected to be reasonably close here, as we use non-informative priors, leading to most of the weight being allocated to the data likelihood rather than the prior. Since the posterior distributions of the unknown parameters of the proposed models are analytically intractable, MCMC simulation is a useful method for approximating the posterior distribution (Li 2014).
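The logic behind comparing the Bayesian and MLE estimates can be illustrated on a toy example: under a flat prior the posterior is proportional to the likelihood, so an MCMC sampler should concentrate near the MLE. The sketch below uses a random-walk Metropolis sampler (a simple stand-in for the Gibbs sampling performed by JAGS) on a deliberately simple exponential likelihood, not one of the composite models, with the same burn-in scheme of discarding the first 5000 iterations.

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.exponential(scale=2.0, size=2000)   # simulated data; true rate = 0.5
mle = 1.0 / data.mean()                        # MLE of the exponential rate

def log_post(rate):
    """Log posterior under a flat prior = log likelihood (up to a constant)."""
    if rate <= 0:
        return -np.inf
    return data.size * np.log(rate) - rate * data.sum()

samples = []
rate = 1.0
lp = log_post(rate)
for i in range(10_000):
    prop = rate + rng.normal(scale=0.05)       # random-walk proposal
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:   # Metropolis accept/reject
        rate, lp = prop, lp_prop
    if i >= 5_000:                             # discard burn-in, keep 5000 draws
        samples.append(rate)
posterior_mean = np.mean(samples)
```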
Table 1 reports the negative log-likelihood (NLL), AIC, Bayesian Information Criterion (BIC), Kolmogorov-Smirnov (KS) test statistic, and Deviance Information Criterion (DIC) values for the 14 models tested. The ranking of each model under each test is given in brackets, and the top three performers are highlighted for each test. Overall, the Weibull-lognormal-Pareto model appears to provide the best fit, with the lowest AIC, BIC, and DIC values and the second lowest NLL and KS values. The second position is taken by the Weibull-lognormal-GPD model, which produces the lowest NLL and KS values and the second (third) lowest AIC (DIC). The Weibull-lognormal-Burr and Weibull-Burr models come next, each of which occupies at least two top-three positions. Evidently, the new three-component composite models outperform the traditional models as well as the earlier two-component composite models. The P-P (probability-probability) plots in Figure 3 indicate clearly that the new models describe the data very well. Recently, Grün and Miljkovic (2019) tested 16 × 16 = 256 two-component models on the same Danish data set, using a numerical method (via numDeriv in R) to find the derivatives for the differentiability condition rather than deriving them analytically in the usual way. Based on their reported results, the Weibull-Inverse-Weibull model gives the lowest BIC (7671.30), and the Paralogistic-Burr and Inverse-Burr-Burr models give the lowest KS test values (0.015). Comparatively, as shown in Table 1, the Weibull-lognormal-Pareto model produces a lower BIC (7670.88), and all three new composite models give lower KS values (around 0.011), which are smaller than the critical value at the 5% significance level, implying that the null hypothesis is not rejected. Table 2 compares the fitted model quantiles (from MLE) against the empirical quantiles. It can be seen that the differences between them are generally small. This result conforms with the P-P plots in
Figure 3. Note that the estimated weights of the three-component composite models are about w1 = 0.08 and w2 = 0.54. These estimates suggest that the claim amounts can be split into three categories of small, medium, and large sizes, with expected proportions of 8%, 54%, and 38%. For pricing, reserving, and reinsurance purposes, the three groups of claims may further be studied separately, possibly with different sets of covariates where feasible, as they may have different underlying driving factors (especially for long-tailed lines of business). Table 3 lists the parameter estimates of the three-component composite models obtained from the MLE method and also the Bayesian MCMC method. It is reassuring to see that not only the MLE estimates and the Bayesian estimates but also their corresponding standard errors and posterior standard deviations are fairly consistent with one another in general. A few exceptions include λ and β, which may suggest that these parameter estimates are not as robust and are less significant. This implication is in line with the fact that the Weibull-lognormal-GPD and Weibull-lognormal-Burr models are only the second and third best models for this Danish data set.
We then apply the 14 models to a vehicle insurance claims data set, which was collected from http://www.businessandeconomics.mq.edu.au/our_departments/Applied_Finance_and_Actuarial_Studies/research/books/GLMsforInsuranceData (accessed on 2 August 2020). There are 3911 claims in 2004 and 2005 ranging from $201.09 to $55,922.13. For computational convenience, we model the claims in thousand dollars. Table 4 shows that the Weibull-lognormal-GPD and Weibull-lognormal-Burr models are the two best models in terms of all the test statistics covered. They are followed by the Weibull-Burr and lognormal-Burr models, which produce the next lowest NLL, AIC, BIC, and DIC values. As shown in Table 5, the fitted model quantiles and the empirical quantiles are reasonably close under the two best models. It is noteworthy that the Weibull-lognormal-Pareto model ranks only about fifth amongst the 14 models. For this model, the computed second threshold (θ2 = 1312) turns out to be larger than the maximum claim amount observed in the data. This implies that the Pareto tail part is not needed or preferred at all for the data under this model, and the fitted model effectively becomes a Weibull-lognormal model. By contrast, for the Weibull-lognormal-GPD and Weibull-lognormal-Burr models, the GPD and Burr tail parts are important components that need to be incorporated (θ2 = 4.6 and 3.5). Similar observations can be made among the two-component models, in which the GPD and Burr tails are selected over the Pareto tail. The estimated weights of the best composite models are around w1 = 0.1 and w2 = 0.7. Table 6 gives the parameter estimates of the three-component composite models, and again the MLE estimates and the Bayesian estimates are roughly in line.
Blostein and Miljkovic (2019) proposed a grid map as a risk management tool for risk managers to consider the trade-off between the best model based on the AIC or BIC and the risk measure. It covers the entire space of models under consideration, and allows one to have a comprehensive view of the different outcomes under different models. In Figure 4, we extend this grid map idea into a 3D map, considering more than just one model selection criterion. It can serve as a summary of the tail risk measures given by the 14 models being tested, comparing the tail estimates between the best models and the other models under two chosen statistical criteria. For both data sets, it is informative to see that the 99% value-at-risk (VaR) estimates are robust amongst the few best model candidates, while there is a range of outcomes for the other less than optimal models (the 99% VaR is calculated as the 99th percentile based on the fitted model). It appears that the risk measure estimates become more and more stable and consistent as we move to progressively better performing models. This 3D map can be seen as a new risk management tool, and it would be useful for risk managers to have an overview of the whole model space and examine how the selection criteria would affect the resulting assessment of the risk. In particular, in many other modelling cases, there could be several equally well-performing models which produce significantly different risk measures, and this tool can provide a clear illustration for more informed model selection. Note that other risk measures and selection criteria than those in Figure 4 can be adopted in a similar way.
To our knowledge, regression has not been tested on any of the composite models so far in the actuarial literature. We now explore applying regression under the proposed model structure via the MLE method. Besides the claim amounts, the vehicle insurance claims data set also contains some covariates, including the exposure, vehicle age, driver age, and gender (see Table 7). We select the best performing Weibull-lognormal-GPD model (see Table 4) and assume that φ, µ, and β = (λ + θ2)/α are functions of the explanatory variables, based on the first moment derived in Section 3. We use a log link function for φ and β to ensure that they are non-negative, and an identity link function for µ. It is very interesting to observe from the results in Table 7 that different model components (and so different claim sizes) point to different selections of covariates. For the largest claims, all the tested covariates are statistically significant, in which the claim amounts tend to increase as the exposure, vehicle age, and driver age decrease, and the claims are larger for male drivers on average. By sharp contrast, most of these covariates are not significant for the medium-sized claims and also the smallest claims. The only exception is the driver age for the smallest claims, but its effect is opposite to that for the largest claims. These differences are insightful in the sense that the underlying risk drivers can differ between the different claim sizes.
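The log-link idea can be sketched in miniature: below, a single Weibull component has its scale parameter tied to one covariate through a log link and is fitted by MLE. This is an illustrative stand-in for the full composite regression; the data, covariate, and coefficient values are all simulated and hypothetical.

```python
import numpy as np
from scipy import stats, optimize

# Simulated example: Weibull scale phi_i = exp(b0 + b1 * z_i) via a log link.
rng = np.random.default_rng(1)
n = 5000
z = rng.uniform(-1, 1, size=n)                       # one made-up covariate
b0_true, b1_true, tau_true = 0.5, -0.8, 1.5          # hypothetical coefficients
y = stats.weibull_min.rvs(c=tau_true,
                          scale=np.exp(b0_true + b1_true * z),
                          random_state=rng)

def nll(params):
    """Negative log-likelihood; tau is kept positive via a log transform."""
    b0, b1, log_tau = params
    scale = np.exp(b0 + b1 * z)                      # log link guarantees phi > 0
    return -stats.weibull_min.logpdf(y, c=np.exp(log_tau), scale=scale).sum()

fit = optimize.minimize(nll, x0=np.zeros(3), method="Nelder-Mead",
                        options={"xatol": 1e-8, "fatol": 1e-8, "maxiter": 5000})
b0_hat, b1_hat, tau_hat = fit.x[0], fit.x[1], np.exp(fit.x[2])
```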

Concluding Remarks
We have constructed three new composite models for modelling individual claims in general insurance. All our models are composed of a Weibull distribution for the smallest claims, a lognormal distribution for the moderate claims, and a long-tailed distribution for the largest claims. Under each proposed model, we treat four of the parameters as functions of the other parameters. We have applied these models to two real-world insurance data sets of fire claims and vehicle claims, via both maximum likelihood and Bayesian estimation methods. Based on standard statistical criteria, the proposed three-component composite models are shown to outperform the earlier two-component composite models. We have also devised a 3D map for analysing the impact of selection criteria on the resulting risk measures, and experimented with applying regression under a three-component composite model, from which the effects of different covariates on different claim sizes are illustrated and compared. Note that inflation has been very high in recent years and can have a serious impact on the claim sizes. Accordingly, it is advisable to adjust recent claim sizes with suitable inflation indices before the claims modelling, as was done for the Danish data set.
There are a few areas that would require more investigation. For the two data sets considered, each of which has a few thousand observations, it appears that three distinct components are adequate to describe the major data patterns. For other much larger data sets, however, we conjecture that incorporating more than three components may become the optimal choice. Additionally, if the data set is sufficiently large, clustering techniques can be applied, and the corresponding results can be compared to those of the proposed approach. When clustering methods are used, the next step is to fit one or more distributions to the different claim sizes, whereas our proposed approach has the convenience of performing both in one single step. Moreover, we select the Weibull and then lognormal distributions because of their suitability for the smallest and medium-sized claims, as shown and discussed earlier, and the fact that they have been the common choices in the existing two-component composite models. While we use these two distributions as the base for the first two components, it may be worthwhile to test other distributions instead and see whether they can significantly improve the fitting performance. Finally, as in Pigeon and Denuit (2011), heterogeneity of the two threshold parameters can be introduced by setting appropriate mixing distributions. In this way, the threshold parameters are allowed to differ between observations. There are also other interesting and related studies, such as those of Frees et al. (2016), Millennium and Kusumawati (2022), and Poufinas et al. (2023).
The following plots show the JAGS outputs of the MCMC simulation when fitting the Weibull-lognormal-Pareto model to the Danish data, using uninformative uniform priors. All the parameters τ, σ, α, θ1, and θ2 are included. For each parameter, the four graphs include the history plot, posterior distribution function, posterior density function (in histogram form), and autocorrelation plot (between iterations). The history and autocorrelation plots strongly suggest that the level of convergence to the underlying stationary distribution is highly satisfactory (Spiegelhalter et al. 2003). The AIC is defined as −2l + 2n_p, and the BIC as −2l + n_p ln n_d, where l is the computed maximum log-likelihood value, n_p is the effective number of parameters estimated, and n_d is the number of observations. The KS test statistic is calculated as max_x |F_n(x) − F(x)|, that is, the maximum distance between the empirical and fitted distribution functions. The DIC is computed as the posterior mean of the deviance plus the effective number of parameters under the Bayesian framework (Spiegelhalter et al. 2003).
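For concreteness, the criteria defined above can be computed directly. The sketch below does so for a plain lognormal fit to simulated claims (illustrative data, not the paper's data sets) and checks the hand-rolled KS statistic against scipy's kstest.

```python
import numpy as np
from scipy import stats

# Illustrative data: 1000 simulated lognormal "claims".
rng = np.random.default_rng(7)
claims = rng.lognormal(mean=0.8, sigma=0.6, size=1000)

# MLE for the lognormal (closed form on the log scale)
mu_hat = np.log(claims).mean()
sigma_hat = np.log(claims).std()
fitted = stats.lognorm(s=sigma_hat, scale=np.exp(mu_hat))

loglik = fitted.logpdf(claims).sum()
n_p, n_d = 2, claims.size
aic = -2 * loglik + 2 * n_p               # AIC = -2l + 2*n_p
bic = -2 * loglik + n_p * np.log(n_d)     # BIC = -2l + n_p*ln(n_d)

# KS statistic: max distance between empirical and fitted cdfs,
# checking both sides of each jump of the empirical cdf.
xs = np.sort(claims)
ecdf_hi = np.arange(1, n_d + 1) / n_d
ecdf_lo = np.arange(0, n_d) / n_d
F = fitted.cdf(xs)
ks = max(np.max(ecdf_hi - F), np.max(F - ecdf_lo))
```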

Figure 1. Examples of density functions of Weibull and lognormal distributions.

Figure 3. P-P plots of fitting three-component composite models to Danish data.



Figure 4. 3D map of 14 models' 99% VaR estimates against BIC and KS values for Danish fire insurance claims data (left) and vehicle insurance claims data (right). The three major categories are noted as traditional models (triangles), two-component composite models (empty circles), and new three-component composite models (solid circles).

Figure A2. History plot, posterior distribution function, posterior density function, and autocorrelation plot of Weibull-lognormal-Pareto model parameters for Danish fire insurance claims data. (The blue and purple lines represent two separate chains of simulations.)


Table 1. Fitting performances of 14 models on Danish fire insurance claims data.

Table 2. Empirical and fitted composite model quantiles for Danish fire insurance claims data.
Note: The figures are produced from the authors' calculations.

Table 3. Parameter estimates of fitting three-component composite models to Danish fire insurance claims data.
Note: The figures are produced from the authors' calculations.

Table 4. Fitting performances of 14 models on vehicle insurance claims data.

Table 5. Empirical and fitted composite model quantiles for vehicle insurance claims data.
Note: The figures are produced from the authors' calculations.

Table 6. Parameter estimates of fitting three-component composite models to vehicle insurance claims data.
Note: The figures are produced from the authors' calculations.