Calculation of Joint Return Period for Connected Edge Data

Guilin Liu 1, Baiyu Chen 2,*, Zhikang Gao 1, Hanliang Fu 3, Song Jiang 3, Liping Wang 4 and Kou Yi 5
1 College of Engineering, Ocean University of China, Qingdao 266100, China; liuguilin73@ouc.edu.cn (G.L.); zhikanggao94@163.com (Z.G.)
2 College of Engineering, University of California Berkeley, Berkeley, CA 94720, USA
3 School of Management, Xi’an University of Architecture and Technology, Xi’an 710055, China; fuhanliang@xauat.edu.cn (H.F.); jiangsong925@163.com (S.J.)
4 School of Mathematical Sciences, Ocean University of China, Qingdao 266100, China; wlpjsh@163.com
5 Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA; yikou@usc.edu
* Correspondence: baiyu@berkeley.edu; Tel.: +1-510-502-6352


Introduction
Coastal and ocean engineering structures are generally exposed to extreme wind and waves, which carry a high risk of damage. Extreme ocean waves are multidimensional random variables that are mutually dependent. In recent years, as the requirements on the reliability of ocean engineering structures and on environmental design standards have risen, extreme value distributions of single variables can no longer meet the needs of researchers [1][2][3][4]. Researchers are paying increasing attention to the theory of multivariate distributions [5][6][7], such as the joint distributions of wave heights, wave periods, and corresponding wind speeds [8,9], because damage to coastal and ocean engineering structures is usually the result of the combined effect of these variables. Currently, the theory of multivariate distributions has been applied in many fields [10][11][12][13][14][15][16]. Many application examples have also shown that design parameters derived from a joint probability distribution can better meet the design needs of ocean engineering [17][18][19][20], reduce the building cost of ocean engineering more reasonably, allow the return periods of multivariable waves to be explored, and give more accurate joint design values. All of these are of great significance for the design and risk management of related projects [21].

Common Copula Functions and Distribution Characteristics
Copula theory was first proposed by Sklar in 1959, in a study of the relationship between low-dimensional marginal distribution functions and the multi-dimensional joint distribution function. As the concept of the Copula was put forward and its theory gradually improved, Nelsen [39] gave a strict definition of the Copula in 1999 and presented the Sklar theorem, which is central to the application of Copula theory. The Sklar theorem states that if F(x, y) is the joint distribution function of the random vector (X, Y), and F_x(x) and F_y(y) are the corresponding marginal distributions, there must be a correlation structure function C such that Equation (1) holds:

$$F(x, y) = C\left(F_x(x), F_y(y)\right) \qquad (1)$$
Additionally, when the marginal distributions are continuous distribution functions, C is unique. The Gumbel-Hougaard (GH) Copula function and the Clayton Copula function are two Copula functions suitable for describing positive correlation between variables, and both are widely used in ocean engineering and hydrometeorology. Their distribution functions and density functions are as follows.

First, the Gumbel-Hougaard (GH) Copula function:

$$C(u, v) = \exp\left\{-\left[(-\ln u)^{\theta} + (-\ln v)^{\theta}\right]^{1/\theta}\right\}, \quad \theta \geq 1$$

where u and v are the corresponding marginal distributions and θ is the parameter of the Copula function. The corresponding density function is:

$$c(u, v) = \frac{C(u, v)\,(\ln u \ln v)^{\theta - 1}}{u v}\; w^{1/\theta - 2}\left(w^{1/\theta} + \theta - 1\right), \quad w = (-\ln u)^{\theta} + (-\ln v)^{\theta}$$

The Gumbel-Hougaard (GH) Copula is only suitable when the variables are positively correlated, and it mainly describes the upper tail correlation between random variables.

Second, the Clayton Copula function:

$$C(u, v) = \left(u^{-\theta} + v^{-\theta} - 1\right)^{-1/\theta}, \quad \theta > 0$$

The corresponding density function is:

$$c(u, v) = (1 + \theta)\,(u v)^{-\theta - 1}\left(u^{-\theta} + v^{-\theta} - 1\right)^{-2 - 1/\theta}$$

Like the Gumbel-Hougaard (GH) Copula function, the Clayton Copula function is only suitable when the variables are positively correlated, and it mainly describes the lower tail correlation between random variables in the joint distribution. In addition to these two two-dimensional Copula functions, two other common Copula functions are as follows.

Third, the Ali-Mikhail-Haq (AMH) Copula function:

$$C(u, v) = \frac{u v}{1 - \theta (1 - u)(1 - v)}, \quad \theta \in [-1, 1)$$

The AMH function can describe random variables with positive or negative correlation, but it is not suitable for variables with high positive or negative correlation. The AMH Copula structure is symmetrical.

Fourth, the Frank Copula function:

$$C(u, v) = -\frac{1}{\theta} \ln\left[1 + \frac{\left(e^{-\theta u} - 1\right)\left(e^{-\theta v} - 1\right)}{e^{-\theta} - 1}\right], \quad \theta \neq 0$$

It is similar to the AMH Copula function, but has no restriction on the degree of correlation. The Frank Copula structure is symmetric; that is, the correlation between the variables increases symmetrically at the upper tail and lower tail of its distribution.
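The four Copula distribution functions named above can be evaluated numerically with only the standard library. The following is a minimal sketch that assumes the standard textbook forms of the GH, Clayton, AMH, and Frank Copulas; the function names are illustrative and do not come from the paper.

```python
import math

def gumbel_hougaard(u, v, theta):
    """GH Copula CDF (standard form, theta >= 1); u, v in (0, 1]."""
    w = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-w ** (1.0 / theta))

def clayton(u, v, theta):
    """Clayton Copula CDF (standard form, theta > 0); u, v in (0, 1]."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

def amh(u, v, theta):
    """Ali-Mikhail-Haq Copula CDF (standard form, -1 <= theta < 1)."""
    return u * v / (1.0 - theta * (1.0 - u) * (1.0 - v))

def frank(u, v, theta):
    """Frank Copula CDF (standard form, theta != 0)."""
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(num / math.expm1(-theta)) / theta

if __name__ == "__main__":
    # Illustrative evaluation at one point; the parameter values are arbitrary.
    u, v = 0.9, 0.9
    for name, fn, theta in [("GH", gumbel_hougaard, 2.0),
                            ("Clayton", clayton, 2.0),
                            ("AMH", amh, 0.5),
                            ("Frank", frank, 2.0)]:
        print(name, fn(u, v, theta))
```

A quick sanity check on any Copula implementation is the boundary behavior: C(u, 1) should return u, and the GH form with θ = 1 (or AMH with θ = 0) should reduce to the independence case uv.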
In order to understand the Copula functions intuitively, scatter diagrams, distribution function plots, and density function plots are used to describe the distribution characteristics of the above two Copula functions. As shown in Figure 1, after 2000 stochastic simulations, the Gumbel-Hougaard (GH) Copula function shows a more obvious tendency towards a fat upper tail, while the Clayton Copula function is denser in the lower tail and scattered in the upper tail. Similar characteristics can be seen in Figures 3 and 5. When θ is two and four, the Gumbel-Hougaard (GH) Copula density function appears as a J-shaped distribution; that is, the upper tail is higher than the lower tail. Thus, the Gumbel-Hougaard (GH) Copula function is very sensitive to upper-tail correlation among variables. For the Clayton Copula function, the tail behavior is the opposite. When θ is two and four, the Clayton Copula density function tends towards an L-shaped distribution; that is, the lower tail is higher than the upper tail, and the Copula function is very sensitive to lower-tail correlation among variables.
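The lower-tail concentration of the Clayton Copula described above can be reproduced with a small simulation. The sketch below is not the authors' simulation code: it assumes the conditional-inversion sampling method for the Clayton Copula, and the helper names, seed, and tail threshold q are illustrative choices.

```python
import random

def sample_clayton(n, theta, seed=0):
    """Draw n pairs (u, v) from a Clayton Copula by conditional inversion.

    Given u and an independent uniform w, v solves dC/du (v | u) = w.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids u == 0
        w = 1.0 - rng.random()
        v = (u ** -theta * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
        pairs.append((u, v))
    return pairs

def tail_fractions(pairs, q=0.1):
    """Share of points falling in the joint lower and joint upper q-corners."""
    n = len(pairs)
    lower = sum(1 for u, v in pairs if u < q and v < q) / n
    upper = sum(1 for u, v in pairs if u > 1.0 - q and v > 1.0 - q) / n
    return lower, upper

if __name__ == "__main__":
    pairs = sample_clayton(2000, theta=4.0)
    lower, upper = tail_fractions(pairs)
    print("lower corner share:", lower, "upper corner share:", upper)
```

With 2000 simulated points and θ = 4, the lower corner holds a visibly larger share of points than the upper corner, which is the numerical counterpart of the L-shaped density. Sampling the GH Copula needs a different algorithm and is not attempted here.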

Examples of the Application of the Copula Function
Taking the time series of annual extreme wave height and corresponding wind speed measured at the Weizhou Island Ocean Station as an example, this paper discusses the validity of the two-dimensional joint distribution constructed by Copula functions when used in ocean engineering. Weizhou Island Ocean Station (196°20′ E, 10°90′ N) is located in the South China Sea. The data used are the annual extreme wave heights and corresponding wind speeds during 1970-1990. Their linear correlation coefficient ρ and rank correlation coefficient τ are shown in Table 1.

Table 1. The linear correlation coefficient and rank correlation coefficient.

| Correlation Coefficients | Linear Correlation Coefficient ρ | Rank Correlation Coefficient τ |
| --- | --- | --- |
| The annual extreme wave height and corresponding wind speed | 0.4992 | 0.383 |

Table 1 shows that the linear correlation coefficient of the data is 0.4992 and the rank correlation coefficient is 0.383. The linear correlation coefficient is well known and can be expressed as

$$\rho = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$

where x_i stands for the ith observation value of X and x̄ is the mean value of X; y_i stands for the ith observation value of Y and ȳ is the mean value of Y.
It reflects the linear correlation between two random variables X and Y. The rank correlation coefficient is also called the grade correlation coefficient, and its expression is:

$$\tau = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n (n^2 - 1)}$$

where d_i is the corresponding rank difference and n represents the number of data in the dataset. The rank correlation coefficient reflects the correlation between the annual extreme wave height and corresponding wind speed. Figure 6 is a scatter plot of wind speed and annual extreme wave height.
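Both coefficients can be computed in a few lines. The sketch below implements the linear correlation coefficient and the rank-difference form of the rank correlation coefficient described above (built from rank differences d_i and the sample size n). The sample values are hypothetical placeholders, not the Weizhou Island observations, and ties in the data are assumed absent.

```python
import math

def linear_correlation(xs, ys):
    """Linear (Pearson) correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank 1 for the smallest value; assumes no tied values."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def rank_correlation(xs, ys):
    """Rank correlation from rank differences d_i: 1 - 6*sum(d_i^2) / (n(n^2-1))."""
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

if __name__ == "__main__":
    # Hypothetical wave heights (m) and wind speeds (m/s), for illustration only.
    wave = [3.1, 4.2, 2.8, 5.0, 3.9, 4.6]
    wind = [18.0, 21.5, 19.2, 27.0, 20.1, 24.3]
    print(linear_correlation(wave, wind), rank_correlation(wave, wind))
```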
Table 1 presents the correlation coefficient analysis, and the results show that there is a positive correlation between the annual extreme wave height and wind speed. Therefore, it is reasonable to use the Gumbel-Hougaard Copula function and the Clayton Copula function to construct the two-dimensional joint distribution of annual extreme wave height and wind speed. As for parameter estimation of the Copula function, the most common and concise method uses the rank correlation coefficient. The rank correlation coefficient τ and the parameter θ satisfy:

$$\tau = 1 - \frac{1}{\theta} \ \ \text{(Gumbel-Hougaard)}, \qquad \tau = \frac{\theta}{\theta + 2} \ \ \text{(Clayton)}$$

The parameters of the above two Copula functions are easily calculated as 1.6207 and 1.2415, respectively, through the rank correlation coefficient shown in Table 1.
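The two parameter values quoted above can be checked by inverting the standard relations between the rank correlation coefficient and θ, namely τ = 1 − 1/θ for the GH Copula and τ = θ/(θ + 2) for the Clayton Copula. This is a minimal sketch; the function names are illustrative.

```python
def theta_gumbel_hougaard(tau):
    """Invert tau = 1 - 1/theta (GH Copula); requires 0 <= tau < 1."""
    return 1.0 / (1.0 - tau)

def theta_clayton(tau):
    """Invert tau = theta / (theta + 2) (Clayton Copula); requires 0 < tau < 1."""
    return 2.0 * tau / (1.0 - tau)

if __name__ == "__main__":
    tau = 0.383  # rank correlation coefficient from Table 1
    print(round(theta_gumbel_hougaard(tau), 4))  # 1.6207
    print(round(theta_clayton(tau), 4))          # 1.2415
```

Both results agree with the values 1.6207 and 1.2415 reported in the text.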
To construct the two-dimensional joint distribution by Copula function, the marginal distributions need to be determined; that is, the single-variable extreme value distributions of annual extreme wave height and of wind speed must each be determined. The two-dimensional mixed Gumbel distribution was first proposed by Gumbel, and the Gumbel density function and distribution function of its marginal distribution are:

$$f(x; \mu, \sigma) = \frac{1}{\sigma} \exp\left(-\frac{x - \mu}{\sigma}\right) \exp\left[-\exp\left(-\frac{x - \mu}{\sigma}\right)\right]$$

$$F(x; \mu, \sigma) = \exp\left[-\exp\left(-\frac{x - \mu}{\sigma}\right)\right]$$

where x stands for the observation value, σ and µ are two undetermined parameters, f(x; µ, σ) is the Gumbel density function, and F(x; µ, σ) is the Gumbel distribution function. The two-dimensional mixed Gumbel joint probability density function (PDF) g(x, y) and the corresponding distribution function (CDF) G(x, y) are expressed in terms of the linear correlation coefficient ρ and four undetermined parameters c, d, σ_1 and σ_2.

The results of the corresponding diagnostic tests show that the annual extreme wave height and wind speed observation data comply with the Gumbel, Weibull, and Pearson-III distributions, and thus the data can be used as analysis samples of the corresponding extreme value distribution. Tables 2 and 3 present interval estimates at the 95% confidence level when the Gumbel, Weibull, and Pearson-III distributions are fitted to the annual extreme wave height and wind speed. As shown in Tables 2 and 3, the P-value is largest when the maximum entropy distribution is used to describe the annual extreme wave height, and it is largest when the Gumbel distribution is used to describe the wind speed. Therefore, it is most appropriate to use the maximum entropy distribution to describe the annual extreme wave height and the Gumbel distribution to describe the wind speed.
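For a Gumbel marginal, the design value for a given return period follows from inverting the distribution function at non-exceedance probability 1 − 1/T. The sketch below assumes annual maxima, so one observation per year, and uses hypothetical values of µ and σ; it does not use the parameters fitted in the paper, which are not given in this text.

```python
import math

def gumbel_cdf(x, mu, sigma):
    """Gumbel distribution function F(x; mu, sigma)."""
    return math.exp(-math.exp(-(x - mu) / sigma))

def gumbel_pdf(x, mu, sigma):
    """Gumbel density function f(x; mu, sigma)."""
    z = (x - mu) / sigma
    return math.exp(-z) * math.exp(-math.exp(-z)) / sigma

def gumbel_design_value(return_period, mu, sigma):
    """Value exceeded on average once per return_period years (annual maxima)."""
    p = 1.0 - 1.0 / return_period  # non-exceedance probability
    return mu - sigma * math.log(-math.log(p))

if __name__ == "__main__":
    mu, sigma = 20.0, 4.0  # hypothetical wind speed parameters (m/s)
    for T in (100, 200, 500, 1000):
        print(T, gumbel_design_value(T, mu, sigma))
```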
According to this, the two-dimensional joint distribution functions constructed by the Gumbel-Hougaard Copula function and the Clayton Copula function can be obtained; they are given by Equations (14) and (15).

When the joint distribution of annual extreme wave height and wind speed is described with the mixed Gumbel distribution and the Gumbel-Logistic distribution, as well as with Equations (14) and (15), the three-dimensional plots of the corresponding distribution functions are shown in Figure 13, and the corresponding two-dimensional contour plots are shown in Figure 14.

As shown in Figure 13, the shapes of the four distribution function plots differ little. Considering that the mixed Gumbel and Gumbel-Logistic distributions have been applied many times in hydrological probabilistic analysis, the two joint distributions established from Copula functions, being similar to them, also have practical value in engineering applications. Figure 14 shows some differences between the contours, especially when the joint probability exceeds 0.95. The contours of the mixed Gumbel and Gumbel-Logistic distributions are steeper than those of the two distributions based on the Gumbel-Hougaard Copula function and the Clayton Copula function. This means that, compared with the newly established joint distributions, the traditional joint distributions cannot fit the tail data very well; that is, the multiyear return values calculated by the traditional joint distributions will be slightly larger. In actual projects, design parameters taken from the traditional joint distributions will incur unnecessary economic cost, whereas the newly established joint distributions are more reasonable and directly improve the economic benefits of projects.

The Joint Return Period Analysis
When analyzing two wave elements, we usually pay attention to the two events {X > x} and {Y > y}. Therefore, the joint return period can be defined as:

$$T(x, y) = \frac{1}{P(X > x, Y > y)} = \frac{1}{1 - F_x(x) - F_y(y) + F(x, y)}$$

When the joint distribution of annual extreme wave height and wind speed is described with the mixed Gumbel distribution and the Gumbel-Logistic distribution, as well as with Equations (14) and (15), the multiyear design values of a single factor and their joint return periods can be calculated. Tables 4-7 present the joint return periods calculated by the above four joint distributions, respectively, with the design wave height values and design wind speed values for 100-, 200-, 500-, and 1000-year return periods. The joint return periods of annual extreme wave height and wind speed are larger than those of a single variable. The joint return periods calculated from the joint distributions constructed by the Gumbel-Hougaard Copula function and the Clayton Copula function are lower than those calculated from the mixed Gumbel distribution and the Gumbel-Logistic distribution. From the standpoint of conservative design in ocean engineering construction, the joint distributions based on the Gumbel-Hougaard Copula function and the Clayton Copula function are more reasonable.

After constructing the joint distribution of annual extreme wave height and wind speed with the Copula function, some design criteria for ocean engineering can be determined through joint return period analysis and conditional probability analysis. This provides a theoretical basis for ocean engineering construction, which helps control construction cost while ensuring safety.
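The joint return period can be evaluated directly from a Copula once the marginal non-exceedance probabilities are known. The sketch below assumes annual maxima and takes the joint return period as the reciprocal of P(X > x, Y > y), that is, both design values exceeded in the same year; this reading and the function names are assumptions. The θ values are those estimated from Table 1.

```python
import math

def gumbel_hougaard(u, v, theta):
    """GH Copula CDF (standard form, theta >= 1)."""
    w = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-w ** (1.0 / theta))

def clayton(u, v, theta):
    """Clayton Copula CDF (standard form, theta > 0)."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

def joint_return_period(u, v, copula, theta):
    """1 / P(X > x, Y > y), with u = F_x(x) and v = F_y(y)."""
    p_both_exceed = 1.0 - u - v + copula(u, v, theta)
    return 1.0 / p_both_exceed

if __name__ == "__main__":
    for T in (100, 200, 500, 1000):
        u = v = 1.0 - 1.0 / T  # both variables at their T-year design value
        t_gh = joint_return_period(u, v, gumbel_hougaard, 1.6207)
        t_cl = joint_return_period(u, v, clayton, 1.2415)
        print(T, round(t_gh, 1), round(t_cl, 1))
```

Under this definition the joint return period is never shorter than the single-variable return period, which matches the observation above that joint return periods exceed those of a single variable. The printed numbers are illustrative and are not the entries of Tables 4-7, which depend on the fitted marginals.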

Conclusions
In this paper, the two-dimensional joint distributions of annual extreme wave height and corresponding wind speed at Weizhou Island from 1961 to 1989 are constructed by using the Gumbel-Hougaard (GH) Copula function and the Clayton Copula function. The established models are compared with two commonly used two-dimensional joint distributions, the mixed Gumbel distribution and the Gumbel-Logistic distribution. The joint return period analysis and conditional probability analysis are then carried out with the two-dimensional joint distributions constructed by the Gumbel-Hougaard (GH) Copula function and the Clayton Copula function. The conclusions are summarized as follows:
(1) The multivariate joint distribution constructed by a Copula function is more flexible than the common multivariate joint distributions in terms of marginal distribution. The common multivariate joint distributions usually require the marginal distribution to be of a specific type.
(2) The multivariate joint distribution constructed by a Copula function can better describe the non-normality of a single variable and can combine multiple non-normal wave elements.
(3) The marginal distributions of a multivariate joint distribution constructed by a Copula function are easy to determine. Thus, the approach is easier to apply in joint return period analysis, and especially in conditional probability analysis.

Figure 6. The scatter plot of the annual extreme wave height and corresponding wind speed.


Figure 13. The above four joint distribution function plots: (a) mixed Gumbel distribution, (b) Gumbel-Logistic distribution, (c) joint distribution based on G-H Copula function, and (d) joint distribution based on Clayton Copula function.


Figure 14. The above four joint distribution function contour plots: (a) mixed Gumbel distribution, (b) Gumbel-Logistic distribution, (c) joint distribution based on G-H Copula function, and (d) joint distribution based on Clayton Copula function.


Table 2. K-S test of the annual extreme wave height distribution function.

Table 3. K-S test of the wind speed distribution function.

Table 4. Design wave height (m) and design wind speed (m/s) in different return periods based on the mixed Gumbel distribution.

Table 5. Design wave height (m) and design wind speed (m/s) in different return periods based on the Gumbel-Logistic distribution.

Table 6. Design wave height (m) and design wind speed (m/s) in different return periods based on the G-H Copula distribution.

Table 7. Design wave height (m) and design wind speed (m/s) in different return periods based on the Clayton Copula distribution.