Bayesian Predictive Analysis of Natural Disaster Losses

Abstract: Different types of natural events hit the United States every year. Data on natural hazards from 1900 to 2016 in the US show an increasing trend in annual natural disaster losses after 1980. Climate change is recognized as one of the factors causing this trend, and predictive analysis of natural losses becomes important for loss prediction and risk prevention as this trend continues. In this paper, we convert natural disaster losses to year 2016 dollars using the yearly average Consumer Price Index (CPI), and conduct several tests to verify that the CPI adjusted amounts of loss from individual natural disasters are independent and identically distributed. Based on these test results, we use various model selection quantities to find the best model for the natural loss severity among three composite distributions, namely Exponential-Pareto, Inverse Gamma-Pareto, and Lognormal-Pareto. These composite distributions piecewise model small losses with high frequency and large losses with low frequency. Remarkably, we make the first attempt to derive an analytical Bayesian estimate of the Lognormal-Pareto distribution based on the selected priors, and show that the Lognormal-Pareto distribution outperforms the other two composite distributions in modeling natural disaster losses. Important risk measures for natural disasters are thereafter derived and discussed.


Introduction
Different types of natural events hit the United States (US) every year. The east coast of the US suffers hurricanes, the middle of the US sees tornadoes, the west coast of the US endures earthquakes, and the south of the US bears a variety of hazards such as hurricanes, wind, drought, and floods 1 . Data on the occurrence and damage of natural events from 1900 to the present in the US, from the Emergency Events Database (EM-DAT), the International Disaster Database 2 , show that annual natural disaster losses have been increasing since 1980. Climate change is recognized as one of the contributors to these increasing natural losses. Injury, homelessness, displacement, and economic losses from natural events can have a significant impact on populations and societies. Predictive analysis of future natural events is imperative to provide important information for prevention and remedy plans that reduce the human impact and economic losses of natural disasters.
Various studies have attempted to model the frequencies of and damages from natural events. Levi and Partrat (1991) analyzed hurricane losses between 1954 and 1986 in the US, and found that the amounts of losses were independent and identically distributed (i.i.d.) and independent of the frequencies of hurricanes. These assumptions are confirmed in our research, based on the EM-DAT natural disaster data from 1900 to 2016 after taking price inflation into account. The amounts of losses in different years are converted into 2016 dollars using the annual average Consumer Price Index (CPI) 3 . We found that both the numbers of natural events and the CPI adjusted total damages have been increasing over time; however, there is no visible trend in the individual CPI adjusted losses, which is consistent with the conclusion of Levi and Partrat (1991).
To discover the distributional features of individual losses from natural disasters, we use the most recent 37 years of the CPI adjusted amounts of natural losses to investigate appropriate parametric distribution models for the amount of loss caused by an individual natural event, i.e., the natural loss severity. Levi and Partrat (1991) proposed the lognormal distribution for the losses of natural events based on the hurricane data between 1954 and 1986 in the US. However, this distribution cannot describe the typical feature of natural disaster losses, i.e., many small losses and the occasional occurrence of huge losses. Some researchers have argued for composite distributions to capture this recognized feature.
A composite distribution has a threshold for the support of the loss random variable, with the Pareto distribution usually used to model large losses beyond the threshold amount and popular parametric continuous distributions used for small losses below the threshold. For example, Bakar et al. (2015) developed several Weibull distribution-based composite models for heavy tailed insurance loss data. In our paper, we compare the performance of three composite distributions, namely Exponential-Pareto (Exp-Pareto), Inverse Gamma-Pareto (IG-Pareto), and Lognormal-Pareto (LN-Pareto), and the corresponding three non-composite parametric models, based on the negative log-likelihood, the Akaike information criterion, and the Bayesian information criterion. All three model selection criteria show that the composite distributions fit the data better than the non-composite models.
We thereafter apply the three composite models to the data and select the best fitted composite model for the loss severity of natural events. For consistent model comparison, we make the first attempt to derive analytic Bayesian estimators of the LN-Pareto composite distribution, because Deng (2018, 2019) has analytically derived Bayesian estimators for the Exp-Pareto and IG-Pareto. In addition, the Bayesian method enables us to perform predictive analysis of future losses of natural disasters using the best fitted composite model. Cooray and Cheng (2015) developed Bayesian estimators for the LN-Pareto composite model based on a Markov chain Monte Carlo (MCMC) simulation algorithm. In our paper, we carefully select prior distributions for the LN-Pareto composite model and derive closed-form Bayesian estimators for the unknown parameters of the LN-Pareto distribution. The mean squared errors (MSE) of the Bayesian estimation method and the maximum likelihood (ML) estimation method show that the Bayesian estimation method outperforms the ML estimation method in estimating these three composite models.
The comparison results show that the LN-Pareto is the best of these three composite models for the loss severity of natural disasters. Various risk measures of natural event losses are thereafter presented based on the LN-Pareto distribution. The same risk measures based on the other two composite distributions are also provided for comparison.
The remainder of this paper is organized as follows. Section 2 describes the natural disaster data and tests the assumption that individual loss amounts are independent and identically distributed. Section 3 introduces three composite distributions and compares these models with the corresponding non-composite models. Section 4 derives Bayesian estimators for the LN-Pareto composite distribution and demonstrates the performance of the Bayesian estimation method versus the ML estimation method. Section 5 presents risk measures of the future natural disaster losses based on the LN-Pareto model and the other two composite distributions. The concluding remarks are given in Section 6.

The Data
The EM-DAT contains worldwide data on the occurrence and impact of natural events from 1900 to the present day. The database is compiled from various sources, including United Nations agencies, non-governmental organizations, insurance companies, research institutes, and press agencies. There were a total of 258 natural events in the US from 1900 to 2016, among which 156 happened between 1980 and 2016, accounting for 60% of the total events during this final third of the whole time period. This observation is consistent with the fact that Earth's climate is changing faster than at any point in history as a result of human activities.
We also look into the trend of the natural loss amounts from 1900 to 2016. To eliminate the effect of price inflation, we convert the amounts of losses in each year into year 2016 dollars. Let CPI_t be the annual average CPI in year t, x_t be the nominal amount of losses in year t, and y_t be the CPI adjusted amount of losses in that year; then y_t = x_t × CPI_2016 / CPI_t, for t = 1900, 1901, . . . , 2016. After the price inflation effect is adjusted for, the amount of losses during 1980 to 2016 accounts for 85.3% of the total losses. Figure 1 shows the yearly number of natural events and the CPI adjusted (in 2016 dollars) total damage costs from 1900 to 2016 in the US. We can see that both the number of natural events and the CPI adjusted damage costs in each year have been increasing over these years, especially after 1980. To demonstrate the change in both the number and the damage costs of natural events caused by human activities, we break down the data into before and after 1980. Figure 2 displays the percentage of the number of occurrences to the total number of occurrences and the percentage of the CPI adjusted damage costs to the total losses, by time period (before and after 1980) and by type of natural disaster. The exact numbers for this figure are listed in Tables A1 and A2 in Appendix A. We can see that storms and floods account for most natural events. The damage costs from natural events after 1980 account for the majority of the total costs, among which the damage costs caused by storms after 1980 account for more than half. In addition, these storm losses were caused by fewer storms after 1980 than before 1980, indicating higher average storm losses after 1980 than before 1980. However, this is not the case for other types of natural events.
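The CPI adjustment above can be sketched as follows. This is a minimal Python illustration (the paper's computations were done in Mathematica), and the CPI values in the example are hypothetical, not actual BLS figures.

```python
# Convert a nominal loss in year t to 2016 dollars via y_t = x_t * CPI_2016 / CPI_t.
def to_2016_dollars(nominal_loss, cpi_year_t, cpi_2016):
    return nominal_loss * cpi_2016 / cpi_year_t

# Example with made-up CPI values: a $100 million loss in a year whose CPI was
# half the 2016 level becomes $200 million in 2016 dollars.
adjusted = to_2016_dollars(100.0, 120.0, 240.0)
```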
Although there is an obvious increasing trend in both the numbers and the total damage costs of natural events after 1980, it is not trivial to find a pattern in the economic losses of individual natural events. Based on the aforementioned features of the natural loss data, we use the natural losses in the most recent 37 years, from 1980 to 2016. There are a total of 462 individual losses from natural events in these 37 years, and only seven events caused economic losses above 2 × 10^7 (in 2016 dollars). In Figure 3, we depict the scatter plot of the losses below 2 × 10^7, and we can see that there is no identifiable trend in individual natural disaster losses. We are interested in how to appropriately model individual natural disaster losses. Motivated by this research question and the above observations, in this paper we aim to investigate the features of the severity of natural events, find appropriate loss severity models for natural disasters, and provide predictive modeling of loss severity for risk management and insurance arrangements.

The i.i.d. Assumption for Loss Severity
Let the random variable N denote the number of occurrences per year and X_i, for i = 1, 2, . . . , N, be the severity random variable of the i-th occurrence of natural events in each year. The vector X = (X_1, X_2, ..., X_N)^T denotes the random vector of yearly loss severities. We use the natural disaster data between 1980 and 2016, so the random variable N has a sample size of 37. Table A3 in Appendix B lists the CPI adjusted individual natural damage losses in these 37 years. It can easily be seen that the random variable N takes values among 0, 2, 6, 8, ..., 28 over the 37 years. One value of N is zero, because there was no record of natural events in year 1988. In all other years, there were at least two natural events. Therefore, there are 36 non-zero realizations (n_k, x_k), k = 1, 2, ..., 36, of (N, X). The i-th column of the matrix (x_1, x_2, . . . , x_36)^T holds the realizations of X_i, i.e., the i-th occurrence of natural disasters in each of the 36 years.
Let m_i be the sample size of the i-th random variable X_i. From Table A3 we also see that X_1, the first severity random variable, has 36 non-zero values and therefore sample size m_1 = 36. The realization of X_1 is (x_1^1, x_1^2, ..., x_1^36)^T, which takes the value (1019.45, 1056.14, ..., 550.00) million in 2016 dollars. X_28, the 28th severity random variable, has m_28 = 1 and takes only one non-zero value, 1822.71 million in 2016 dollars, because there is only one year with 28 losses. The total number of observations is ∑_{k=1}^{28} m_k = 462. We want to test the general assumption that X_1, X_2, . . . , X_28 are independent and identically distributed random variables. Since X_21, X_22, . . . , X_28 have small sample sizes, we group them together as one random variable, still denoted X_21. Applying the nonparametric Kendall tau test and Spearman test (see Gibbons and Chakraborti 2003), we test the independence assumption for X_1, X_2, . . . , X_21 as follows.
For X_i and X_j, i ≠ j, where i, j = 1, 2, . . . , 21, the null hypothesis is H_0: X_i and X_j are independent, against the alternative H_1: X_i and X_j are not independent. Let R_k and S_k be the ranks of x_k^i and x_k^j among the m paired observations, respectively, for k = 1, 2, . . . , m. The Spearman rho test statistic is

r_S = ∑_{k=1}^{m} (R_k − R̄)(S_k − S̄) / [∑_{k=1}^{m} (R_k − R̄)^2 ∑_{k=1}^{m} (S_k − S̄)^2]^{1/2},

where R̄ and S̄ are the averages of R_k and S_k, respectively.
For X_1, X_2, . . . , X_21, there are a total of (21 choose 2) = 210 pairwise tests. We obtain 210 values of the Kendall tau test statistic and the Spearman test statistic with the corresponding p-values. Table 1 lists the pairs of severity random variables for which the null hypothesis is rejected at the significance level of 1%. All other pairs have non-significant test results, and we fail to reject the null hypothesis at the significance level of 1%. Based on these test results, it is reasonable to assume that X_1, X_2, ... are independent.

Next, the Kruskal-Wallis test is used to verify the identical distribution assumption; the details of this test are given by Gibbons and Chakraborti (2003). Let F_i be the cumulative distribution function of X_i, i = 1, 2, ..., 21. In our Kruskal-Wallis test, the null hypothesis is H_0: F_1 = F_2 = · · · = F_21, against the alternative H_1: not all F_i are equal. There are a total of 462 observations. Sort all 462 observations in increasing order; if more than one observation shares an identical value, the average of their ranks (the mid-rank) is assigned to each of them. Define T_i to be the sum of the ranks of all the observations of X_i, for i = 1, 2, ..., 21. We have T_1 = 8817, T_2 = 8444, and so on. Under the null hypothesis, E(T_i) = m_i(462 + 1)/2; therefore, E(T_1) = 8334, E(T_2) = 8334, . . . , and E(T_21) = 6019. The Kruskal-Wallis statistic is

KW = [12 / (462 × 463)] ∑_{i=1}^{21} T_i^2 / m_i − 3 × 463.

Asymptotically, KW has a chi-square distribution with 20 degrees of freedom, which gives a p-value of 0.494. We fail to reject H_0 at the significance level of 1%. Therefore, it is reasonable to assume that X_1, X_2, ... are identically distributed.
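The rank statistics above can be sketched in a few lines. The following is a self-contained Python illustration (not the paper's Mathematica code) of the Spearman rho and Kruskal-Wallis statistics with mid-ranks for ties; asymptotic p-values are omitted.

```python
def ranks(values):
    # 1-based ranks; tied observations share the average (mid) rank
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mid = (i + j) / 2 + 1
        for t in range(i, j + 1):
            r[order[t]] = mid
        i = j + 1
    return r

def spearman_rho(x, y):
    # r_S = sum (R_k - Rbar)(S_k - Sbar) / sqrt(sum (R_k - Rbar)^2 sum (S_k - Sbar)^2)
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

def kruskal_wallis(groups):
    # KW = 12 / (N(N+1)) * sum T_i^2 / m_i - 3(N+1), ranks over the pooled sample
    pooled = [v for g in groups for v in g]
    r = ranks(pooled)
    N, H, idx = len(pooled), 0.0, 0
    for g in groups:
        T = sum(r[idx:idx + len(g)])
        idx += len(g)
        H += T * T / len(g)
    return 12.0 / (N * (N + 1)) * H - 3 * (N + 1)
```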

Non-Parametric Distribution of Loss Severity
Based on the test results, we can reasonably assume that the CPI adjusted natural disaster damages are independent and identically distributed. Let X denote the CPI adjusted damage random variable with unknown distribution F(x). The 462 individual damage losses are the realizations of the loss severity random variable X. Before exploring an appropriate parametric distribution F(x) for the natural disaster loss severity, we first look at the non-parametric distribution of X, which is also called a data dependent distribution.
Let y_1 < y_2 < · · · < y_k be the k unique loss values, and s_i be the number of times the value y_i appears in the sample. Let r_j = ∑_{i=j}^{k} s_i be the number of observations greater than or equal to y_j. The Nelson-Aalen estimator of the cumulative hazard rate function is Ĥ(x) = ∑_{j: y_j ≤ x} s_j / r_j, which gives the estimated distribution function F̂(x) = 1 − e^{−Ĥ(x)}. Setting α = 0.05, confidence bounds F_L(x) and F_U(x) are constructed so that the true unknown distribution of loss lies between F_L(x) and F_U(x) with 95% confidence.
The plots of F̂(x), F_L(x), and F_U(x) are given in Panel (a) of Figure 4. We also display the histogram of the CPI adjusted severity of the natural events in Panel (b) of Figure 4. We can see that the histogram of the CPI adjusted severity of natural event losses is both skewed and fat-tailed, showing the typical feature of many small losses and a few very large losses.
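The Nelson-Aalen construction above can be sketched as follows; this Python fragment (an illustration, not the paper's code) returns F̂ evaluated at each distinct loss value.

```python
import math

# Nelson-Aalen estimate of the cdf: H(x) = sum over y_j <= x of s_j / r_j,
# with r_j the number of observations >= y_j, then F(x) = 1 - exp(-H(x)).
def nelson_aalen_cdf(sample):
    at_risk = len(sample)        # r_j: observations at or above the current y_j
    H = 0.0
    cdf = {}
    for y in sorted(set(sample)):
        s = sample.count(y)      # s_j: multiplicity of the value y_j
        H += s / at_risk
        cdf[y] = 1.0 - math.exp(-H)
        at_risk -= s
    return cdf

cdf = nelson_aalen_cdf([1.0, 2.0, 3.0])
```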

Composite Models
We have confirmed that the natural event severity data exhibit the typical feature of a high frequency of small losses and a low frequency of large losses, while traditional distributions, such as the normal, exponential, inverse gamma, and lognormal distributions, are not able to describe this feature. Some researchers have addressed this feature of insurance data by using composite models for insurance losses. The probability density function of a composite model pieces together two probability density functions f_1(x) and f_2(x). The general composite model has the probability density function

f_X(x) = c f_1(x) for 0 < x ≤ θ, and f_X(x) = c f_2(x) for θ < x < ∞,

where c is a normalizing constant and θ is the parameter that represents the threshold between the supports of the two distributions. In order to make the composite density function smooth, it is usually assumed that the pdf f_X(x) is continuous and differentiable at θ, that is, f_1(θ) = f_2(θ) and f_1′(θ) = f_2′(θ).
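As a rough illustration of the general composite form, here is a Python sketch. The component densities below (exponential head, Pareto tail) and the normalization c = 1/(F_1(θ) + 1 − F_2(θ)) are illustrative assumptions, not the paper's fitted models or its exact constant.

```python
import math

# Composite density f(x) = c f1(x) on (0, theta] and c f2(x) on (theta, inf).
# One convenient normalization uses the component cdfs F1 and F2.
def composite_pdf(x, theta, f1, F1, f2, F2):
    c = 1.0 / (F1(theta) + 1.0 - F2(theta))
    return c * (f1(x) if x <= theta else f2(x))

# Illustrative components: exponential head with rate 1, Pareto tail with
# shape 2 anchored at theta (so F2(theta) = 0). No smoothness imposed here.
theta, alpha = 1.0, 2.0
f1 = lambda x: math.exp(-x)
F1 = lambda x: 1.0 - math.exp(-x)
f2 = lambda x: alpha * theta ** alpha / x ** (alpha + 1)
F2 = lambda x: 1.0 - (theta / x) ** alpha

# probability mass below and above theta sums to one by construction
c = 1.0 / (F1(theta) + 1.0 - F2(theta))
total_mass = c * F1(theta) + c * (1.0 - F2(theta))
```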

Three Composite Distributions
Cooray and Ananda (2005) introduced a two-parameter continuous and differentiable composite LN-Pareto model, which consists of a two-parameter lognormal density up to an unknown threshold value and a two-parameter Pareto density for the rest of the model. The resulting density is similar in shape to a lognormal density, with tail behavior quite similar to a Pareto density. They applied the proposed composite model to fire insurance data to show the importance of the composite LN-Pareto distribution in describing insurance claim data.
Motivated by Cooray and Ananda (2005), Teodorescu and Vernic (2006) introduced a composite Exp-Pareto distribution, which is an exponential density up to an unknown threshold value and a two-parameter Pareto density for the rest of the model. The model reduces to a one-parameter distribution after the continuity and differentiability conditions are imposed. Aminzadeh and Deng (2019) proposed an IG-Pareto composite model; under the general assumptions of smoothness and continuity, the IG-Pareto model also reduces to a one-parameter model. Let Φ(·) be the cumulative distribution function (cdf) of the standard normal distribution. Table 2 summarizes the probability density functions of the Exp-Pareto, LN-Pareto, and IG-Pareto composite models, and Figure 5 plots the three distribution functions with various parameter values. We can see that the Nelson-Aalen estimate of the unknown distribution of the natural disaster loss severity random variable is quite similar to the distributions of the three composite models. This indicates that composite models might be able to describe the features of the natural disaster damage losses. In the next section, we compare the performance of these three composite distributions in modeling the natural disaster loss severity based on standard model comparison and selection criteria.

Model Selection for Loss Severity
The maximum likelihood estimators for the unknown parameters of the Exp-Pareto, IG-Pareto, and LN-Pareto composite models were derived by Teodorescu and Vernic (2006), Aminzadeh and Deng (2019), and Cooray and Ananda (2005), respectively. The parameter θ is the unknown threshold value dividing the domains of the two distributions of a composite model, and the likelihood function changes when the value of θ changes; therefore, a grid search method has to be used to find the ML estimates. The grid search algorithm can be briefly summarized as follows.
1. Sort the sample of the natural disaster damage losses in increasing order, i.e., x_1 < x_2 < ... < x_n, where n is the sample size. Let n* be the size of the partial sample consisting of the first n* losses x_1, x_2, . . . , x_{n*}. Start from n* = 1.
2. Compute the maximum likelihood estimates θ̂ and β̂ as in Table 3 for the given n*. If x_{n*} ≤ θ̂ ≤ x_{n*+1}, we have found n*; otherwise, increase n* by 1.
3. Repeat Step 2 for n* = 2, 3, ..., until x_{n*} ≤ θ̂ ≤ x_{n*+1}. The ML estimates of the parameters are those based on the correct n*.
In our research, Mathematica software is used to code the algorithm. Table 3 lists the ML estimates of three composite distributions.
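The grid search over n* can be sketched generically. In this Python illustration (the paper used Mathematica), `estimate_params` is a hypothetical placeholder standing in for the closed-form ML formulas of Table 3, which are not reproduced here.

```python
# Find the split size n* such that the threshold estimate lands between
# consecutive order statistics x_{n*} <= theta_hat <= x_{n*+1}.
def grid_search(losses, estimate_params):
    xs = sorted(losses)                       # Step 1: sort the sample
    n = len(xs)
    for n_star in range(1, n):                # Steps 2-3: try n* = 1, 2, ...
        theta_hat, beta_hat = estimate_params(xs, n_star)
        if xs[n_star - 1] <= theta_hat <= xs[n_star]:
            return n_star, theta_hat, beta_hat
    return None                               # no consistent split found

# Toy run with a dummy estimator that always proposes theta = 5.0, beta = 0.5:
# the search accepts n* = 2 because x_2 = 2 <= 5 <= x_3 = 10.
result = grid_search([11.0, 1.0, 10.0, 2.0], lambda xs, m: (5.0, 0.5))
```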

Based on the ML estimates, we use the negative log-likelihood (NLL), the Akaike information criterion (AIC), and the Bayesian information criterion (BIC) goodness-of-fit measures to compare the appropriateness of these three composite models in modeling the natural disaster severity.
NLL can be used to compare models with the same number of parameters. It is defined as NLL = −log L(x_1, x_2, ..., x_n | θ̂), where θ̂ is the ML estimate of the vector of unknown parameters; it is equivalent to the maximum value of the likelihood function. The smaller the NLL value, the larger the value of the likelihood function, and the better the fitted model.

AIC was defined by Akaike (1973) as AIC = −2 log L(x_1, x_2, ..., x_n | θ̂) + 2q, where q is the number of unknown parameters. The smaller the AIC value, the better the fitted model. The first term, −2 log L(x_1, x_2, ..., x_n | θ̂), decreases when the number of unknown parameters increases and is offset by the penalty 2q, indicating a trade-off between goodness-of-fit and the number of parameters.

BIC was developed by Schwarz (1978) and is defined as BIC = −2 log L(x_1, x_2, ..., x_n | θ̂) + q log n, where n is the sample size and q is the number of parameters. BIC penalizes a large number of parameters and a large sample size. The smaller the BIC, the better the fitted model. Interested readers are referred to Burnham and Anderson (2002) for more details about these model selection criteria.

Based on the sample of 462 natural disaster damage losses in the US from 1980 to 2016, the values of the NLL, AIC, and BIC and the maximum likelihood estimates of the three composite models are summarized in Table 4. For comparison, we also fit three non-composite parametric models, namely the exponential, lognormal, and inverse gamma distributions, to the natural disaster data. The three composite models fit the natural disaster losses better than the three corresponding non-composite models. This supports the claim that a composite model can describe the distributional features of insurance data and natural disaster losses. Among the three composite models, the LN-Pareto fits the data better than the other two in terms of the NLL, AIC, and BIC values. Therefore, the LN-Pareto composite distribution is the best model with which to conduct Bayesian predictive analysis of the natural disaster losses in the US.
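The three criteria reduce to one-line formulas once the maximized log-likelihood is available; a minimal Python sketch (illustrative, not tied to any particular fitted model):

```python
import math

# Model selection criteria from a maximized log-likelihood `loglik`,
# number of free parameters q, and sample size n.
def nll(loglik):
    return -loglik

def aic(loglik, q):
    return -2.0 * loglik + 2 * q

def bic(loglik, q, n):
    return -2.0 * loglik + q * math.log(n)

# e.g., a hypothetical 2-parameter composite model on n = 462 observations
scores = (nll(-100.0), aic(-100.0, 2), bic(-100.0, 2, 462))
```

In all three cases, a smaller value indicates a better fitted model.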

Bayesian Estimator of LN-Pareto
There is no analytical Bayesian estimator of the LN-Pareto composite distribution in the current literature. Cooray and Cheng (2015) obtained Bayesian estimates for the LN-Pareto based on the MCMC method. In this paper, we use conjugate priors for the two parameters of the LN-Pareto and make the first attempt to derive closed-form Bayesian estimators without MCMC simulation.
Recall the LN-Pareto probability density function given by Cooray and Ananda (2005), as listed in Table 2, where k = 0.372238898 and Φ(·) is the cdf of the standard normal distribution. We use Gamma(a_1, b_1) as the prior distribution of β and LN(c_1, (k/β)^2) as the prior distribution of θ|β, where a_1, b_1, c_1 are hyper-parameters. The prior probability density functions are

π(β) = β^{a_1 − 1} e^{−β/b_1} / (Γ(a_1) b_1^{a_1}), β > 0,
π(θ|β) = [β / (k θ √(2π))] exp(−(β^2 / (2k^2)) (ln θ − c_1)^2), θ > 0.
Without loss of generality, we assume that x_1 < x_2 < ... < x_n is an ordered random sample from the LN-Pareto distribution, and let n* be the size of the partial sample consisting of the first n* losses x_1, x_2, . . . , x_{n*}, such that x_{n*} ≤ θ ≤ x_{n*+1}. The likelihood function can then be written down piecewise over the two regions. To find the posterior distributions π(β|x) and π(θ|β, x), we need the joint pdf f(x, θ, β) of (x, θ, β), which is given in Equation (2) and can be reduced to Equation (3). From Equation (3), the posterior probability density functions of β and of θ|β can be obtained as Equations (4) and (5), respectively. The right-hand side of Equation (5) is the kernel of a lognormal distribution with parameters A_1 and k/(β√(n* + 1)). Our next step is to find E[θ|β, x], the expectation of the posterior distribution in Equation (5), i.e., the conditional Bayes estimate of θ under the squared error loss function.
We first need the normalizing constant C_1 for the probability density function in Equation (5), where M_Y(t) denotes the moment generating function of a random variable Y and ln(θ) ∼ Normal(A_1, ξ^2), with ξ = k/(β√(n* + 1)), based on the probability density function of a lognormal distribution. Since the mean of a lognormal random variable with these parameters is exp(A_1 + ξ^2/2), the conditional Bayes estimate of θ is

θ̂_{Bayes|β} = exp(A_1 + k^2 / (2β^2(n* + 1))). (7)

The Bayes estimate of β can be derived from the posterior probability density function of β in Equation (4). Let C_2 denote the normalizing constant for the probability density function of β, and let B_1 = n + a_1, so that the corresponding kernel is the probability density function of Gamma(B_1, B_2). The Bayes estimate of β is

β̂_Bayes = B_1 B_2 E_1 / E_2, (8)

where E_1 and E_2 are the expected values of e^{−0.5(β/k)^2 A_2} when β follows the Gamma(B_1 + 1, B_2) and Gamma(B_1, B_2) distributions, respectively. Numerical integration in Mathematica is used to compute both E_1 and E_2. Similar to the ML estimation method, the following grid search is used in Bayesian estimation:
1. Sort the sample of size n in increasing order, i.e., x_1 < x_2 < ... < x_n, and let n* be the size of the partial sample consisting of the first n* losses x_1, x_2, . . . , x_{n*}. Start from n* = 1.
2. Compute the Bayes estimate β̂_Bayes via (8) for the given n*.
3. Compute the conditional Bayes estimate of θ via (7), given β̂_Bayes from Step 2. If x_{n*} ≤ θ̂_{Bayes|β} ≤ x_{n*+1}, then we have found n*; otherwise, increase n* by 1.
4. Repeat Steps 2 and 3 for n* = 2, 3, . . . , until x_{n*} ≤ θ̂_{Bayes|β} ≤ x_{n*+1}, which gives the correct n*.
The values of β̂_Bayes and θ̂_{Bayes|β} found based on the correct n* are the actual Bayes estimates of the parameters β and θ. We therefore obtain analytical Bayesian estimates of the LN-Pareto composite model without simulation.

Validation by Simulation
We employ simulation studies to validate the accuracy of the proposed Bayesian estimation method for the LN-Pareto distribution and to compare it with the ML estimation method. In each simulation, we generate a sample of n observations from the LN-Pareto composite density function given by Equation (1) for different selected values of θ and β. We then obtain the ML estimates and Bayesian estimates of these two parameters based on the simulated samples. We repeat the simulation N = 100 times. The average values of the estimates (denoted by θ̄ and β̄) and the mean squared errors (MSE) are computed. The MSEs show the differences between the true and estimated values of the parameters and intuitively indicate the performance of each estimation method.
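The simulation summary statistics are straightforward; a minimal Python sketch (the sampling and estimation steps themselves are omitted, as they depend on the fitted model):

```python
# Average estimate and mean squared error over N repeated estimates of a
# parameter with known true value (as used to summarize each simulation study).
def average(estimates):
    return sum(estimates) / len(estimates)

def mse(estimates, true_value):
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)

# e.g., three hypothetical estimates of a parameter whose true value is 1.0
avg = average([0.9, 1.1, 1.0])
err = mse([0.9, 1.1, 1.0], 1.0)
```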
Bayesian estimation of θ and β needs appropriate values of the hyper-parameters a_1, b_1, and c_1. Recall that the prior distribution for β is a Gamma distribution with hyper-parameters a_1 and b_1. Therefore, when specifying the values of a_1 and b_1, we need to make sure that the product a_1 b_1, which is the expectation of the Gamma prior distribution, equals the preset value of β. In addition, since the variance of the Gamma prior distribution is proportional to b_1^2, we choose b_1 very small and then solve for a_1 from the equation a_1 b_1 = β. For example, when the selected true value of β is 0.5, a_1 and b_1 are chosen to be a_1 = 100 and b_1 = 0.005, so that b_1 is small and the product a_1 b_1 is 0.5.
Similarly, the conditional prior of θ|β is assumed to follow a lognormal distribution LN(c_1, (k/β)^2); therefore, we choose c_1 to be the solution of the equation exp(c_1 + 0.5(k/β)^2) = θ, based on the expectation of the lognormal distribution. Table 5 lists the Bayesian estimates and the ML estimates using simulated data from the LN-Pareto composite distribution with different values of the parameters.
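The hyper-parameter choice described above can be sketched as follows; this Python fragment mirrors the two moment-matching conditions a_1 b_1 = β and exp(c_1 + 0.5(k/β)^2) = θ (the function name is illustrative, not from the paper).

```python
import math

K = 0.372238898  # LN-Pareto constant k from Table 2 (Cooray and Ananda 2005)

# Pick a small b1, set a1 so the Gamma prior mean a1*b1 equals the preset beta,
# and solve the lognormal prior mean condition for c1.
def hyperparameters(theta, beta, b1=0.005):
    a1 = beta / b1
    c1 = math.log(theta) - 0.5 * (K / beta) ** 2
    return a1, b1, c1

# the paper's example: beta = 0.5 gives a1 = 100 with b1 = 0.005
a1, b1, c1 = hyperparameters(theta=1.0, beta=0.5)
```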
From Table 5, we can see that the informative Bayesian estimates outperform the ML estimates in both cases β < 1 and β > 1, as the Bayesian estimates have smaller MSEs than the ML estimates in all simulation scenarios. Note that as the sample size n increases, both the ML and Bayesian methods provide more accurate estimates of θ, in terms of smaller MSEs. However, in Equation (8) the Bayesian estimate of β depends on n through B_1 = n + a_1; as n increases, the MSE of the Bayesian estimate of β increases slightly but remains much smaller than the MSE of the ML estimate. These simulation results indicate that the Bayesian estimates are consistently better than the ML estimates when reasonable hyper-parameters are chosen.
In the Bayesian estimator of θ for the Exp-Pareto distribution, a_2 and b_2 are hyper-parameters, n is the sample size (equal to 462 for our data), and n* is the size of the partial sample, as aforementioned. In the Bayesian estimator of θ for the IG-Pareto distribution, a_3 and b_3 are hyper-parameters, and k = 0.144351 and a = 0.163847, as specified in Table 2. Based on the Bayesian estimators given by Deng (2018, 2019) and the Bayesian estimators for the LN-Pareto distribution derived in this research, we have closed-form Bayesian estimators for all three composite models. Based on the CPI adjusted natural disaster losses from 1980 to 2016 in the US, we obtain analytical Bayesian estimates of the three composite models for natural loss severity in Table 6. Bayesian estimation also shows that the LN-Pareto is the best model among all three composite distribution models, with the smallest NLL, AIC, and BIC values. Comparing the NLL, AIC, and BIC values in Tables 4 and 6, we can see that the Bayesian estimation method yields smaller values of these three criteria when fitting the three composite models to the natural loss data. Kass and Raftery (1995) claimed that a difference of 10 or more in BIC is strong evidence favoring the model with the smaller value. Although the advantage of the Bayesian estimates over the ML estimates is marginal, especially in fitting the Exp-Pareto and LN-Pareto models to the data, Bayesian estimation overall performs better than ML estimation. The advantage of the Bayesian method will be significant and favorable when the sample size is small.

Risk Measures
In this section, we investigate risk measures of the loss severity of natural events based on the LN-Pareto composite model. Two important risk measures, Value at Risk (VaR) and Tail Value at Risk (TVaR), are used in our research. For comparison, we also display the VaR and TVaR of loss severity based on the other two composite models.

Value at Risk and Tail Value at Risk
VaR is a point risk measure that describes the minimum loss at a desired level of confidence. Given a confidence level p and a cumulative distribution function F(x) of a loss random variable X, VaR is defined by Pr(X ≤ VaR_p(X)) = p, i.e., VaR_p(X) = F^{−1}(p).
For example, if the VaR of the natural disaster loss severity is $100 million at a 95% confidence level, there is only a 5% chance that the damage from any single natural event will exceed $100 million. This risk measure can be used by an insurance company to assess reinsurance needs and manage risk, so that losses can be covered without putting the company at risk.
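The inverse-CDF definition above can be sketched with an empirical quantile, i.e., the smallest observed loss whose empirical CDF reaches p. The toy loss sample is illustrative only, not the EM-DAT data:

```python
import math

def var_empirical(losses, p):
    """Empirical Value at Risk at confidence level p:
    the smallest observed loss x with ECDF(x) >= p."""
    xs = sorted(losses)
    k = math.ceil(p * len(xs)) - 1  # zero-based index of the p-quantile
    return xs[max(k, 0)]

# Toy losses (not the EM-DAT data): 85% of the sample falls at or below 9.
losses = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(var_empirical(losses, 0.85))  # 9
```

For a parametric model such as LN-Pareto, the same idea applies with the fitted CDF: VaR_p(X) is obtained by inverting F at p rather than using the sample quantile.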
TVaR was developed as an alternative to VaR. It describes the average loss beyond VaR at a given confidence level p. Mathematically, TVaR_p(X) = E[X | X > VaR_p(X)].

In the three composite distributions, the Pareto distribution is used to model large losses with small frequencies. However, the expectation of the Pareto distribution does not exist if the shape parameter is smaller than 1, which is the case for all fitted composite distributions in our research. Therefore, we define the Limited Tailed Value at Risk (LTVaR) as LTVaR_p(X) = E[min(X, b) | X > VaR_p(X)], where b is the maximum liability of a loss. By definition, LTVaR is the average of the losses that are greater than VaR but capped at the loss limit b, and it is therefore greater than the limited expectation. LTVaR is a very useful measure and can be easily implemented in insurance because the cap matches the maximum benefit of an insurance policy.

Table 7 displays the VaR_p and LTVaR_p at the 85% confidence level derived from the three composite distributions under both Bayesian and ML estimation, with b chosen to be 10^5 ('0000000 US$) for LTVaR_0.85. Table 7 shows that the theoretical VaR_0.85(X) values based on the three composite models differ when a different estimation method is used. For the IG-Pareto model, the Bayesian estimation method is significantly better than ML estimation, and the two methods produce dramatically different theoretical VaR_0.85(X) and LTVaR_0.85(X). For the other two composite models, although the Bayesian estimation method has only marginally lower BIC values than the ML method, the theoretical VaR_0.85(X) and LTVaR_0.85(X) still differ significantly. Therefore, we need to be cautious when choosing the estimation method for calibrating a composite model.
Second, the large values of the theoretical VaR_0.85(X) and LTVaR_0.85(X) indicate that the composite distributions do address the fat-tail problem in real-world data, so that the average loss in the worst cases will not be underestimated by a composite model. Moreover, both the ML and Bayesian estimation methods confirm that the LN-Pareto fits the US natural disaster data better than the other two composite models, and our simulation validation has verified that the Bayesian estimation method for the LN-Pareto distribution is more accurate than ML estimation; therefore, we put more weight on the results from the Bayesian estimated LN-Pareto model, highlighted in bold font in Table 7.
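As a sketch of why the cap b is needed, the following Monte Carlo estimate computes VaR and LTVaR from a simulated Pareto sample with shape parameter below 1, the regime in which the ordinary TVaR does not exist. The shape, scale, and cap values are hypothetical, not the fitted parameters behind Table 7:

```python
import math
import random

def var_ltvar(sample, p, b):
    """Empirical VaR_p and LTVaR_p from a simulated loss sample.

    LTVaR_p averages min(X, b) over the losses exceeding VaR_p,
    i.e., the tail losses capped at the maximum liability b.
    """
    xs = sorted(sample)
    var_p = xs[math.ceil(p * len(xs)) - 1]
    tail = [min(x, b) for x in xs if x > var_p]
    ltvar_p = sum(tail) / len(tail)
    return var_p, ltvar_p

# Pareto sample with shape alpha < 1, so E[X] (and hence TVaR) is infinite;
# alpha, xm, and the cap b are hypothetical, not fitted values.
random.seed(0)
alpha, xm = 0.8, 1.0
sample = [xm / (1.0 - random.random()) ** (1.0 / alpha) for _ in range(100_000)]
var85, ltvar85 = var_ltvar(sample, p=0.85, b=1e5)
print(var85 <= ltvar85 <= 1e5)  # prints True: the cap b keeps LTVaR finite
```

Without the cap, the tail average of such a sample grows without bound as the sample size increases, which is exactly the instability LTVaR is designed to avoid.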

Conclusions
In this paper, we propose using composite distributions to model natural disaster losses. A composite model captures, piecewise, the typical feature of insurance losses: a high frequency of small losses and a low frequency of large losses. We use the US natural disaster data from 1980 to 2016 in our research, considering the change in natural disasters' occurrence after 1980 due to climate change.
There are a total of 462 natural disasters from 1980 to 2016. After converting the amounts of losses into the year 2016 dollars, we test the assumption that the natural disaster severity random variables in different years are independent and identically distributed. Our tests support the i.i.d. assumption, so we are able to use all the natural losses as realizations of one natural disaster severity random variable.
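The conversion to 2016 dollars is, in our setup, the usual CPI ratio adjustment: scale each nominal loss by the ratio of the 2016 yearly-average CPI to the CPI of the loss year. A minimal sketch; the CPI figures below are hypothetical placeholders, not the actual BLS series:

```python
def to_2016_dollars(nominal_loss, cpi_year, cpi_2016):
    """CPI ratio adjustment: restate a nominal loss in 2016 dollars
    by scaling with the ratio of yearly-average CPI values."""
    return nominal_loss * cpi_2016 / cpi_year

# Hypothetical CPI values for illustration only:
adjusted = to_2016_dollars(100.0, cpi_year=82.4, cpi_2016=240.0)
print(round(adjusted, 2))  # 291.26
```

After this adjustment, losses from different years are on a common price basis, which is what allows them to be pooled as draws from a single severity distribution.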
Based on the sample of 462 natural losses, we compare the performance of three composite distributions in modeling the natural losses in the US, namely the Exp-Pareto, IG-Pareto, and LN-Pareto distributions. Based on the ML estimation method, we find that the composite distributions fit the natural disaster losses better than the corresponding non-composite distributions according to the NLL, AIC, and BIC measures. In addition, we find that the LN-Pareto model is the best among these three composite distributions.

We make the first attempt to derive analytical Bayesian estimates for the LN-Pareto model. Simulation studies are conducted to assess the performance of the derived Bayesian estimates. The MSE values from simulation show that the analytical Bayesian estimation performs better than the ML estimation method for the LN-Pareto distribution across various parameter values. The simulation study also reveals that the Bayesian method is superior to ML in particular when the sample size is not very large.
In this research, the MCMC method is not needed since we derive closed-form Bayesian estimates for the LN-Pareto model. Based on the analytical Bayesian estimates of the three composite models, it is confirmed that the LN-Pareto composite model is the best fit to the natural disaster losses. Bayesian estimation is shown to perform better than ML estimation, according to the NLL, AIC, and BIC values.
Several risk measures for natural losses based on these three composite models are thereafter presented and compared. The differences in the derived risk measures from different composite distributions and different estimation methods reveal the importance of choosing an appropriate composite model for modeling natural losses and the difficulty in estimating the model.
Our research provides alternative information for insurance and risk management of natural disasters. We acknowledge the sparseness of natural disaster data: there are only 462 individual natural losses in the past 37 years, and we rely on the CPI data to convert the losses to a consistent basis, free of the effect of price inflation over the years. In future research, we will continue to investigate the features of composite models and explore Bayesian model selection in predictive analysis of natural losses.