Numerical and Statistical Analyses of Tsunami Heights with the L-Moments Method

: As devastating and unpredictable tsunamis generated by underwater earthquakes are occurring more frequently, the need for tsunami disaster prevention measures is rapidly increasing. In this study, tsunami heights were estimated, and the best-ﬁt distribution was examined through a combination of numerical analyses and statistical methods. A numerical model was employed to estimate the tsunami heights, and the parameters were estimated using the method of L-moments applied to the estimated tsunami heights. The best-ﬁt distribution was determined by applying the estimated parameters to the L-moment ratio diagram. The study areas were the Imwon Port and the Sadong Port located in the eastern part of the Korean Peninsula. The tsunami height distribution was represented by a log-normal distribution for the Imwon Port, whereas the distribution was represented by a generalized Pareto distribution for the Sadong Port. The study indicates that the distribution most commonly suggested by previous studies, i.e., the log-normal distribution, is not always accurate. Therefore, when performing statistical analysis on tsunami heights, the assumption of a log-normal distribution should be considered carefully.


Introduction
The increasing frequency of earthquakes in the regions ranging from South and Central America to North America, and the Kyushu and Honshu regions of Japan in recent years has caused a growing concern over the so-called 50-year period theory, which suggests that large earthquakes occur approximately every 50 years in the circum-Pacific seismic zone, which is referred to as the Ring of Fire [1]. Many countries close to the sea have been severely damaged by tsunamis that are caused by submarine earthquakes. For example, the impact of the 9.0-magnitude earthquake that occurred on 11 March 2011, off the east coast of Sendai, Japan, and the subsequent tsunami waves as high as 15 m, was felt worldwide. Nearly 20,000 people were killed, presumed killed or missing, and more than 150,000 people were dislocated, with the local property damage estimated at more than $220 billion USD [2]. This combined event also severely damaged the Fukushima nuclear power plant on the eastern Japan coast, the effects of which are still ongoing.
In general, the Korean Peninsula is recognized as safe from tsunamis. However, the 1983 Central East Sea earthquake (7.7 M) and the 1993 Southwest Hokkaido earthquake (7.8 M) caused damage to the Korean Peninsula, and the probability of an undersea earthquake near the Korean Peninsula is increasing, given the 2016 Gyeongju earthquake (5.8 M)  There have been approximately 30 tsunamis worldwide that have caused massive damage. As huge tsunamis occur relatively rarely, observational data of tsunamis alone have limitations in predicting tsunami heights. Therefore, various methods have been proposed to predict the risks associated with tsunamis and to calculate probable tsunami heights. In general, for past tsunami events, the tsunami heights were estimated using numerical model experiments or statistical analysis methods. While the prediction of tsunami height through numerical modeling could quantify the risks associated with a tsunami with relative accuracy, it is nearly impossible to model all tsunami cases; even producing models for a small number of cases requires considerable time and cost. In contrast, statistical analysis after-the-fact is highly accurate because it uses actual tsunami measurement data. However, it is difficult to define a precise height distribution because the available number of tsunami height observation points is very limited.
Applying statistical methods to predict tsunami events in the East Sea is difficult, given the very few past events as well as the lack of data. Therefore, in this study we predicted the maximum tsunami heights caused by the initial waveforms of virtual tsunamis and applied statistical analyses for the Imwon Port, which is the most susceptible to damage by an unexpected tsunami in the East Sea, and for the Sadong Port, which is scheduled for a new airport construction. For both cases, we combined numerical modeling with statistical analysis methodology. For the governing equations for numerical modeling, we applied linear shallow-water equations. To select the best-fit distribution of the tsunami heights obtained through such numerical modeling, we applied the L-moments method and the L-moment ratio diagram.

Study Area and Earthquake Sources
In this study, we used virtual earthquake sources in Japan that have a high probability of causing damage in the Republic of Korea. The values given by the Ministry of Land, Infrastructure, Transport and Tourism [3] of Japan were used as the fault parameters. Figure 1 shows the areas in Korea chosen for this study-the Imwon Port and the Sadong Port. The East Sea, surrounded by Korea, Japan, and Russia, is also shown (Figure 1). At the Imwon Port, one person was killed and two others were reported missing in the 1983 Central East Sea tsunami that occurred on 26 May 1983. This is the only human casualty recorded in the Korean Peninsula in the last 100 years.
Firstly, we had to determine the earthquake sources in order to perform numerical experiments for the target area. To estimate the initial water surface displacement of each source, we used a model based on the fault parameters under the assumption that the initial water surface displacement presented by Manshinha and Smylie (1971) [4] was the same as the vertical component of the submarine displacement.
Numerical model domains of the East Sea, the Imwon Port, and their bathymetries are presented in Figure 2. The numerical information and boundary conditions for each region are given in Tables 1 and 2, respectively. The grid size was decreased by 1/3 based on the dynamic linking method (Table 1). Region A used the free transmission condition, and the other regions used the dynamic linking method as a boundary condition for the open sea. A fully reflected condition was used for the A, B, C, and D regions. The tsunami heights in this study were predicted at all grid points in Region D of Table 1. Numerical model domains of the East Sea, the Imwon Port, and their bathymetries are presented in Figure 2. The numerical information and boundary conditions for each region are given in Tables 1 and 2, respectively. The grid size was decreased by 1/3 based on the dynamic linking method (Table  1). Region A used the free transmission condition, and the other regions used the dynamic linking method as a boundary condition for the open sea. A fully reflected condition was used for the A, B, C, and D regions. The tsunami heights in this study were predicted at all grid points in Region D of Table 1.   Numerical model domains of the East Sea, the Imwon Port, and their bathymetries are presented in Figure 2. The numerical information and boundary conditions for each region are given in Tables 1 and 2, respectively. The grid size was decreased by 1/3 based on the dynamic linking method (Table  1). Region A used the free transmission condition, and the other regions used the dynamic linking method as a boundary condition for the open sea. A fully reflected condition was used for the A, B, C, and D regions. The tsunami heights in this study were predicted at all grid points in Region D of Table 1.     Figure 3 shows the distribution of the virtual earthquake sources. All of the sources are located near the west coast of Japan. Many existing studies have reported tsunami traveling times of about 100-150 min from the west coast of Japan to the east coast of Korea (e.g., [1]). In the numerical simulation, we employed all 44 virtual earthquakes plotted in Figure 3. In the figure, Case 1 is the northmost case, while Case 44 is the southmost one. Fault parameters for the virtual earthquakes displayed in Figure 3 are listed in Table A1 in Appendix A.  Figure 3 shows the distribution of the virtual earthquake sources. All of the sources are located near the west coast of Japan. Many existing studies have reported tsunami traveling times of about 100-150 min from the west coast of Japan to the east coast of Korea (e.g., [1]). In the numerical simulation, we employed all 44 virtual earthquakes plotted in Figure 3. In the figure, Case 1 is the northmost case, while Case 44 is the southmost one. Fault parameters for the virtual earthquakes displayed in Figure 3 are listed in Table A1 in Appendix A.

Governing Equations
We numerically modeled an offshore tsunami, where the linear shallow-water theory could be applied because the water surface displacement is relatively small compared with the water depth [5]. As a tsunami can propagate a long distance, the physical frequency dispersion, which

Governing Equations
We numerically modeled an offshore tsunami, where the linear shallow-water theory could be applied because the water surface displacement is relatively small compared with the water depth [5]. As a tsunami can propagate a long distance, the physical frequency dispersion, which significantly affects the propagation, must be considered. Therefore, we used linear Boussinesq equations that can consider physical frequency dispersion as the governing equations, following Cho et al. [6]. The propagation model, except for the nonlinear terms and the Coriolis force, consisted of the continuity equation and momentum equations as follows: where ζ is the free surface displacement, h is the still-water depth, P(= uh) and Q(= vh) are the depth-integrated volume fluxes in the x and y directions, respectively, and g is the gravitational acceleration. Equation (1) is the continuity equation with no truncation error. Equations (2) and (3) are the momentum equations that do not consider the hydrodynamic pressure. Using a modified leap-frog scheme technique, we employed the following propagation model to generate numerical dispersion consistent with the physical dispersion term of the linear Boussinesq equations, without changing the lattice spacing and the computation time interval, even when the water depth changes [6].
In Cho et al. [6], the variance correction factors α and γ are determined as follows, and the numerical results are the same as those of the linear Boussinesq equation.
Detailed information on the numerical scheme, including the leap-frog scheme, can be found in the literature [1,4].

The L-Moments Method
We used the method of L-moments to calculate the parameters of the best-fit distribution of the tsunami heights. Hosking (1990) [7] introduced L-moments to estimate parameters of distribution through a linear combination of probability-weighted moments. It is similar to the existing method of moments in terms of application, but it has little bias because the estimates of the samples by L-moment are always linear combinations. As L-moments are a linear combination of probability-weighted moments, a probability-weighted moment must be defined prior to the explanation of the L-moment [8]. A probability-weighted moment is defined as follows: where F X (x) is the cumulative probability distribution function of X. An unbiased estimator of the probability-weighted moment must be used in order to apply it to the data of n samples, which is calculated as follows [9].β The sample data x j are rearranged in ascending order (x 1 ≤ x 2 ≤ · · · ≤ x n ). Using the unbiased estimator, the L-moment is calculated using the following equations:

The L-Moment Ratio Diagram
The L-moment ratio diagram is a useful method for comparing the best-fit distributions of data for multiple target areas [7,9].
Hosking [7] defined the L-coefficient of variation, L-skewness, and L-kurtosis, which correspond to the coefficients of variation, skewness, and kurtosis of the existing moments method, respectively, using the L-moment ratio as follows: where τ 2 is the L-coefficient of variation, τ 3 is the L-skewness, and τ 4 is the L-kurtosis.
In general, either a method of plotting the average point of the L-skewness and the L-kurtosis for each sample on the theoretical L-moment ratio diagram of the candidate distributions, or a method of comparing the optimal line diagrams of the L-moment ratio for the samples, could be used to select the best-fit distribution. In this study, we numerically simulated each of the 44 hypothetical undersea earthquakes on the 2387 grid points of the Imwon Port and the 3492 grid points of the Sadong port. As the number of samples was large, the best-fit distribution was selected by drawing the average of the L-skewness and the L-kurtosis instead of using optimal line diagrams.  Table A1 and plotted in Figure 3 are employed to estimate tsunami heights. However, only representative cases are drawn in Figures 4 and 5. In figures, Case 1 is the northmost case, while Case 44 is the southmost one. As the distance between the seismic sources and the target area increases, the tsunami heights tend to decrease. However, compared with Case 14, Case 13 produced lower tsunami heights with the same magnitude and a shorter distance (Table A1). produced lower tsunami heights with the same magnitude and a shorter distance (Table A1).

Results of Numerical Simulation
Although the 44 cases of virtual earthquakes plotted in Figure 3 are simulated numerically, only six cases are shown in Figures 4 and 5, respectively. Case 9 provides the largest tsunami heights at the Imwon Port as shown in Figure 4, while Case 44 provides the largest heights at the Sadong Port. This could have been caused by the inclusion of a complicated dispersion that accounts for the influence of the Yamato Rise fishing ground and the terrain under the East Sea.   Although the 44 cases of virtual earthquakes plotted in Figure 3 are simulated numerically, only six cases are shown in Figures 4 and 5, respectively. Case 9 provides the largest tsunami heights at the Imwon Port as shown in Figure 4, while Case 44 provides the largest heights at the Sadong Port. This could have been caused by the inclusion of a complicated dispersion that accounts for the influence of the Yamato Rise fishing ground and the terrain under the East Sea.

Results of Statistical Analysis and the L-Moments Method
We used the method of L-moments and the maximum tsunami heights to create virtual tsunami scenarios, and then we drew the results into L-moment ratio diagrams to analyze the probability distribution of the tsunami heights at the Imwon Port and the Sadong Port. Various distributions, including generalized logistic (GL), generalized extreme value (GEV), generalized Pareto (GP), 3-parameter log normal (LN3), and Pearson type 3 (PA3), were candidates for the best-fit distribution. Figures 6 and 7 show the L-moment ratio diagrams used to identify the best-fit distribution type for representing the tsunami heights in this study. The candidate model probability distributions are plotted as curvilinear lines, and pairs of the sample third and fourth L-moment ratio values are numerically represented. The point numbers represent the undersea earthquake identification numbers.

Results of Statistical Analysis and the L-Moments Method
We used the method of L-moments and the maximum tsunami heights to create virtual tsunami scenarios, and then we drew the results into L-moment ratio diagrams to analyze the probability distribution of the tsunami heights at the Imwon Port and the Sadong Port. Various distributions, including generalized logistic (GL), generalized extreme value (GEV), generalized Pareto (GP), 3parameter log normal (LN3), and Pearson type 3 (PA3), were candidates for the best-fit distribution. Figures 6 and 7 show the L-moment ratio diagrams used to identify the best-fit distribution type for representing the tsunami heights in this study. The candidate model probability distributions are plotted as curvilinear lines, and pairs of the sample third and fourth L-moment ratio values are numerically represented. The point numbers represent the undersea earthquake identification numbers.    By analyzing the tsunami heights through the L-moment diagram, we confirmed that the probability distribution of the tsunami heights could be changed for each tsunami case or location area. This indicates that previous studies of tsunami heights in limited areas that followed the lognormal distribution were not always accurate (e.g., [10]). Therefore, caution must be taken when applying a log-normal distribution to tsunami heights, regardless of the spatial width.
From the perspective of disaster prevention, it is reasonable to select a probability distribution out of the many possibilities because it is impossible to predict the distribution of all the cases. By By analyzing the tsunami heights through the L-moment diagram, we confirmed that the probability distribution of the tsunami heights could be changed for each tsunami case or location area. This indicates that previous studies of tsunami heights in limited areas that followed the log-normal distribution were not always accurate (e.g., [10]). Therefore, caution must be taken when applying a log-normal distribution to tsunami heights, regardless of the spatial width.
From the perspective of disaster prevention, it is reasonable to select a probability distribution out of the many possibilities because it is impossible to predict the distribution of all the cases. By integrating all the cases, the results of applying the L-moment ratio are shown in Figures 8 and 9. Figure 8 shows that the log-normal distribution is appropriate for the Imwon Port, whereas Figure 9 shows the generalized Pareto distribution for the Sadong Port.
Appl. Sci. 2020, 10, x FOR PEER REVIEW 10 of 15 integrating all the cases, the results of applying the L-moment ratio are shown in Figures 8 and 9. Figure 8 shows that the log-normal distribution is appropriate for the Imwon Port, whereas Figure 9 shows the generalized Pareto distribution for the Sadong Port.    Additionally, we compared three different goodness-of-fit measures, namely, negative loglikelihood (NLogL), Bayesian Information Criterion (BIC) and Akaike Information Criterion (AIC). When fitting models, it is possible to increase the likelihood by adding parameters, but doing so may result in overfitting. Both BIC and AIC attempt to resolve this problem by introducing a penalty term for the number of parameters in the model. Lower BIC (or NlogL and AIC) is better for the model, as Additionally, we compared three different goodness-of-fit measures, namely, negative log-likelihood (NLogL), Bayesian Information Criterion (BIC) and Akaike Information Criterion (AIC). When fitting models, it is possible to increase the likelihood by adding parameters, but doing so may result in overfitting. Both BIC and AIC attempt to resolve this problem by introducing a penalty term for the number of parameters in the model. Lower BIC (or NlogL and AIC) is better for the model, as summarized in Table A2. We thus confirmed that the best-fit distributions of the tsunami heights for the Imwon and Sadong Ports were the log-normal distribution and generalized Pareto distribution, respectively. Based on these results, Figures 10 and 11 show the cumulative distribution function (CDF) of the distributions of the two ports. Through CDF, the tsunami heights could be quantitatively predicted by calculating the probability of the tsunami height caused by an earthquake that might occur in the East Sea. Based on this quantitative prediction methodology, realistic and economical responses and measures can be presented for preparing hazard maps or designing sea-facing facilities.
Appl. Sci. 2020, 10, x FOR PEER REVIEW 11 of 15 summarized in Table A2. We thus confirmed that the best-fit distributions of the tsunami heights for the Imwon and Sadong Ports were the log-normal distribution and generalized Pareto distribution, respectively. Based on these results, Figures 10 and 11 show the cumulative distribution function (CDF) of the distributions of the two ports. Through CDF, the tsunami heights could be quantitatively predicted by calculating the probability of the tsunami height caused by an earthquake that might occur in the East Sea. Based on this quantitative prediction methodology, realistic and economical responses and measures can be presented for preparing hazard maps or designing sea-facing facilities.

Concluding Remarks
According to Kim et al. [10], similar tsunami height distributions appear in fixed, small coastal areas, regardless of the undersea earthquake characteristics. However, as the number of earthquakes increases, various tsunami height distributions might occur in a given area because tsunamis are affected by various topographical effects, such as refraction and diffraction during propagation, depending on the location of the earthquake. In general, higher tsunami heights are observed when earthquakes occur in adjacent locations, but there could be other aspects such as terrain effects, as Cases 13 and 14 describe here. Therefore, the assumption that tsunami heights will follow the log-normal distribution in smaller regions is not always correct.
For disaster prevention, because it is impossible to apply the optimal tsunami height distribution type for each earthquake, it might be efficient to establish a disaster prevention plan by selecting one optimal distribution type by integrating all of the historical earthquake and tsunami events. Through this approach, realistic and economical preparations could be made for creating hazard maps and for designing sea-facing population centers and port facilities.
Finally, the results of the study shall be used by a civil defense authority to establish a mitigation plan against possible but unexpected tsunami attacks in Korea.