Risk Analysis of Marine Environmental Elements Based on Kendall Return Period

: Against the background of global warming and rising sea levels, the threat of typhoon disasters to marine engineering structures has become increasingly serious. Therefore, the research on the design standards of marine environmental elements has become an important topic. In this study, two-dimensional joint distributions of wave height and surge height, surge height and wind speed, and wave height and wind speed were constructed based on the Gumbel–Hougaard (G-H) Copula function according to the data of marine environmental elements under extreme sea conditions from Naozhou observation station in the sea waters of western Guangdong, and the Kendall return period is introduced. The joint return periods, co-occurrence return periods, and Kendall return periods of joint distributions were calculated, along with the latter’s corresponding design values of environmental elements. The results showed that the Kendall return periods and the corresponding design values are between two kinds of traditional return periods. After analysis, the conclusion is that the Kendall return period can reﬂect the occurrence regularity of marine hydrological events more accurately, and the design values of marine environmental elements calculated under this standard can reasonably lower the investment under the premise of ensuring structural safety. Therefore, the Kendall return period can serve as the new selection for marine engineering design and risk management.


Introduction
The marine environment is complex and harsh, and sea weather is changeable. Marine engineering has a large scale, high investment, and long construction period. Once a marine structure is destroyed in extreme sea conditions, it will cause huge economic losses, serious casualties, and heavy marine environmental pollution. Currently, the number of typhoon disasters is increasing year by year, and the threat to marine engineering structures is becoming more and more serious under the climate background of global warming and sea-level rise [1,2]. The selection of the design standard of marine environmental elements determines the safety and investment of the engineering [3]. Therefore, study on the calculation method should be carried out to achieve a reasonable reduction of investment under the premise of ensuring the safety of the structure.
Usually, the failure of most engineering structures is not only determined by the over-value of a certain environmental load but also by the combined effect of several environmental elements that exceeds a certain critical level [4,5]. For example, the collapse of an oil platform is usually the result of the combined action of wind, wave, and current. Given this situation, the Copula function is applied to the calculation of reliability, design standards, and failure probability of marine engineering as a kind of connection function [6][7][8][9][10]. The Copula function can describe the non-linear correlation among variables flexibly and simulate the joint probability distribution in practical applications objectively [11].
In addition, it is not limited by the type of marginal distribution. For example, Wist established the joint distribution of wave height and wave period by using the binary-normal Copula and applied it to marine engineering in the sea area of Japan [12]. De Waal and van Gelder established the joint distribution of extreme wave height and wave period by using the Burr-Pareto Logistic Copula and compared the results with a physical model [13]. In the optimal design of rubble-mound breakwaters, Muhaisen established a two-dimensional probability model of characteristic wave height and storm duration based on a Copula function [14]. Chen has researched the joint probability distribution and the conditional probability distribution of maximum water increase and the corresponding wind speed based on the Gumbel-Hougaard (G-H) Copula function theory in the Archimedean function family. The hydrological data came from the Mayu tide station in the sea waters of eastern Guangdong [15]. Taking the coastal city of Haikou as the research area, Xu constructed the joint distribution model of wind speed, rainfall, and tidal level by using the three-dimensional Archimedean Copula function to study the joint return period and the failure probability of typhoon disaster events [16].
The above researches have enriched and improved the calculation method of the design criteria of marine engineering. However, the calculations are mostly carried out based on the traditional return period, which includes the joint return period and the co-occurrence return period. Salvadori addressed the problem of amplifying or narrowing the danger area by proposing a new method to define the return period of multivariate joint distribution, and he called it the Kendall return period [17]. The Kendall return period is re-introduced in this study. The series of annual extreme wind speeds and the corresponding wave heights and surge heights were extracted as analysis samples based on the historical observation data of Naozhou station in the sea waters of western Guangdong. After data analysis, the Pearson type III (P-III) distribution was selected to construct the marginal distribution of each environmental element. The G-H Copula function was used for the two-dimensional joint distribution of the three combinations. The joint return periods, co-occurrence return periods, and Kendall return periods of joint distributions were calculated. The design values of environmental elements under each return standard were then calculated. The purpose of the study was to gain a deeper understanding of the disaster risk of marine engineering caused by environmental elements in extreme sea conditions and provide a reference for engineering design and risk control.

Research Method
The workflow of this study was shown in Figure 1. In extreme sea conditions affected by typhoons, strong winds, storm waves, and storm surges are the main environmental factors leading to the destruction of urban protection engineering and marine engineering structures [18][19][20]. In this study, the P-III distribution was used to fit the sample sequence of three kinds of marine environmental elements that are widely used in domestic hydrological frequency analysis. P-III is also the distribution pattern recommended by the code for sea port hydrology of China. The maximum entropy method was used to estimate parameter values.
Salvadori proved that the constructed joint distribution can be regarded as the extreme distribution only under the condition that both the marginal distribution and Copula function are extreme value distributions [11]. The Gumbel-Hougaard Copula function is the only multivariate extreme Copula function in the Archimedean Copula family that is applicable to the frequency analysis of extreme marine hydrological events. Therefore, the G-H Copula function was used to construct the two-dimensional joint distribution of marine environmental elements in the study.

Principle of Maximum Entropy
The maximum entropy principle was first proposed by scientist E.T. Jaynes in 1957 [21]. The essence of this principle is to give the least subjective speculation on unknown events under the given conditions to minimize the risk of estimated values [22]. According to previous research, the errors of the estimated values are small, and the determined marginal distributions can reflect the statistical characteristics of environmental factors more objectively [22][23][24]. The maximum entropy method was used to derive the probability distribution of marine environmental elements in the current study, and the parameters were expressed in the form of an equation set, which can avoid the artificial interference of apriority as much as possible.
In his work, American scientist Singh detailed the parameter estimation method of P-III distribution based on the maximum entropy principle [25]. The main idea was introduced as follows.
For a characteristic variable X, its entropy can be expressed by Equation (1), which is also called the objective function: The data are interpreted through the constraint (2): where f (x) is the probability density function, R is the domain of definition, g i (x) is the j th constraint of X, and n is the number of constraints. The parameters that can satisfy the maximum value of the objective function H(x) under the constraint condition G j are the parameter values estimated by the principle of maximum entropy. Under this condition, the uncertainty among variables is the largest, which is close to the natural distribution state of random variables.
The parameter values of P-III distribution are estimated based on this principle, and the probability distribution function and the probability density function are expressed as Equations (3) and (4): The obtained constraint conditions based on the maximum entropy principle are shown as Equations (5)- (7), where the domain R is given by all the values above the threshold a 0 .
The probability distribution corresponding to the maximum entropy can be derived by the Lagrange method, as shown in Equation (8), where λ 0 , λ 1 , and λ 2 are Lagrange multipliers.
The multipliers λ 0 , λ 1 , and λ 2 can be eliminated by taking Equation (8) into Equations (5)- (7). Then, the following system of equations can be obtained: Finally, the parameter values can be obtained by solving the above equations.

G-H Copula Function
Copula theory was first proposed by Sklar in 1959 [26]. It did not develop rapidly until the 1990s when it became an important analytical method in statistics. Nelson gave a strict definition of the Copula function in 1999, thinking that the Copula function is a connection function that connects the joint distribution function F(x 1 , x 2 , · · · , x n ) of random variables X 1 , X 2 , · · · , X n with their respective marginal distributions F(x 1 ), F(x 2 ), · · · , F(x n ) [27]. If the multidimensional Copula function is expressed by C, for any x ∈ R n , there is: where the n-dimensional joint distribution F(x 1 , x 2 , · · · , x n ) is defined as: The n-dimensional Copula function is actually a mapping of [0, 1] n →[0, 1]. The density functions have the following relationship: The definition of the Archimedean Copula function is: where ϕ(u) is the generator of the Copula function; it is a continuously decreasing convex function.
The generator of the G-H Copula function used in the study was ϕ(u) = (− ln u) θ . If we use u and v to express marginal distributions, the G-H Copula function can be expressed as: After the calculation of Equation (17), the density function of the G-H Copula is obtained as Equation (18): The characteristic of the G-H Copula is that it can describe the upper-tail correlation between variables with a positive correlation.
The Kendall correlation coefficient τ was adopted to analyze the correlation between marine environmental elements, and the correlation coefficient method was adopted to estimate the parameter θ of the G-H Copula function in this study. The calculation method of τ is shown as Equation (19), and the relationship between θ and τ is τ where sign (·) is a symbolic function, which is 1 when The Pearson correlation coefficient ρ was also adopted to measure the linear correlation between marine environmental elements in the study, and its calculation method is shown in Equation (20). There is a strong linear correlation between variables when the absolute value approaches 1. However, the two variables are independent of each other when the value is 0. (20)

Calculation Method of Return Period
The return period is defined as the average time interval when the characteristic quantity of the environmental element greater than or equal to the set value appears one time. The joint return period and co-occurrence return period are two kinds of frequently-used traditional return periods in the frequency analysis of ocean hydrology. E ∪ is used to define the or event of a typhoon disaster with at least one variable exceeding a certain level in extreme sea conditions, and the corresponding risk rate can be expressed as P{U ≥ u∪V ≥ v}. The return period defined by the E ∪ event is called the joint return period. The calculation method is related to the joint distribution C(u,v), and the expression is shown as Equation (21): According to the definition of the disaster event E ∪ , the shaded part D 1 in Figure 2a, is the danger area of the joint return period when the set values are u 1 and v 1 . At least one variable in the area exceeds the set values. The definition method of the traditional return period has numerous combinations of variables corresponding to it at a given return level, and they form a probability contour. However, different combinations correspond to different danger areas. So, this method of directly dividing the danger area according to the characteristic values of variables will lead to some contradictions. Figure 2 is used as an example to illustrate this phenomenon.
The greater the value of C(u,v), the greater the return period. This means that the event is not likely to happen, and it belongs to a small probability event. Therefore, the danger area defined by a greater value of C(u,v) is the high-danger area, and the danger area defined by a smaller value of C(u,v) is the low-danger area. Theoretically, the high-danger area defined by a low probability event should be completely contained in the low-danger area defined by a high probability event. Only the high-danger areas are highlighted with shadows in Figure 2 for a clear presentation.
For the joint return period, the combination (u1', v1') is used as the characteristic values to determine the low-danger area, and the combination (u1, v1) is used as the characteristic values to determine the high-danger area, as shown in Figure 2a. For the re-occurrence return period, the combination (u2', v2') is used to determine the low-danger area, and the combination (u2, v2) is used to determine the high-danger area, as shown in Figure 2b. Then, point A (in Figure 2a) and point B (in Figure 2b) have the common problem, that is they are all in the safe area identified by the event with a small probability, but also in the danger area identified by the event with a high probability. This phenomenon is pretty contradictory, so it can be said that the traditional return periods have the problem of unclear identification of danger areas. Moreover, it also can be judged that the joint return period enlarges the danger area, while the co-recurrence return period narrows the danger area according to their definitions. E ∩ is used to define the and event of a typhoon disaster with all variables exceeding certain levels in extreme sea conditions. The corresponding risk rate can be expressed as P{U ≥ u∩V ≥ v}. The return period defined by the E ∩ event is called the re-occurrence return period. The calculation method is expressed as Equation (22) [28]: According to the definition of the disaster event E ∩ , the shaded part D 2 in Figure 2b is the danger area of the re-occurrence return period when the set values are u 2 and v 2 . In this area, both variables exceed the set values.
The definition method of the traditional return period has numerous combinations of variables corresponding to it at a given return level, and they form a probability contour. However, different combinations correspond to different danger areas. So, this method of directly dividing the danger area according to the characteristic values of variables will lead to some contradictions. Figure 2 is used as an example to illustrate this phenomenon.
The greater the value of C(u,v), the greater the return period. This means that the event is not likely to happen, and it belongs to a small probability event. Therefore, the danger area defined by a greater value of C(u,v) is the high-danger area, and the danger area defined by a smaller value of C(u,v) is the low-danger area. Theoretically, the high-danger area defined by a low probability event should be completely contained in the low-danger area defined by a high probability event. Only the high-danger areas are highlighted with shadows in Figure 2 for a clear presentation.
For the joint return period, the combination (u 1 , v 1 ) is used as the characteristic values to determine the low-danger area, and the combination (u 1 , v 1 ) is used as the characteristic values to determine the high-danger area, as shown in Figure 2a. For the re-occurrence return period, the combination (u 2 , v 2 ) is used to determine the low-danger area, and the combination (u 2 , v 2 ) is used to determine the high-danger area, as shown in Figure 2b. Then, point A (in Figure 2a) and point B (in Figure 2b) have the common problem, that is they are all in the safe area identified by the event with a small probability, but also in the danger area identified by the event with a high probability. This phenomenon is pretty contradictory, so it can be said that the traditional return periods have the problem of unclear identification of danger areas. Moreover, it also can be judged that the joint return period enlarges the danger area, while the co-recurrence return period narrows the danger area according to their definitions.
Given the limitations of traditional return periods, Salvadori proposed a new definition method, the Kendall return period [17,28,29]. This kind of return period is characterized by high-danger areas identified by small probability events being completely included in the low-danger areas, as shown in Figure 2c.
In the theory, the Kendall measure K c is introduced, which is related to the Copula function of joint distribution, and is used to represent the risk level that the joint probability of two random variables is less than or equal to the p-value, at a certain probability level of p ∈ (0,1). This kind of return period is defined reasonably, and the combination of variables has a unique risk area corresponding to it under a specific return period. The expression is: The Kendall return period expressed by K c is: The calculation expression of two-dimensional joint distribution is shown in Equation (25), and the derivation process is detailed in the paper [28,30].
For the two-dimensional G-H Copula function, the K c measure can be solved by Equation (26):

Background Information
As shown in Figure 3, the coastal areas of western Guangdong include Zhanjiang, Maoming, and Yangjiang, with a coastline of 1662.8 km. There are natural deep-water harbors and abundant aquatic resources along the coast. The offshore waters of Zhanjiang are rich in oil and gas resources. The area is dominated by a subtropical monsoon climate, with sufficient sunlight, rain, and heat. So, subtropical cash crops are also abundant. Naozhou observation station is located on the East island of Zhanjiang, which is on the east side of the Leizhou Peninsula, and in Leizhou Bay. The east coast of the peninsula recesses inward, forming an arc; Naozhou station has been built in this large-scale bay. Therefore, the area forms storm surges easily affected by the pocket-shaped terrain when a typhoon comes. Guangdong province is seriously affected by typhoon disasters [31]. The number of typhoons that land in the western section is the largest among the three coastal sections, far exceeding that of the eastern section and the Pearl River mouth section [32]. The storm surge disasters in the study area are frequent, severe, and have long durations, so the data collected by the observation station are representative. We collected the measured data of environmental elements during a total of 25 years from 1992 to 2016 during typhoons, and extracted samples of annual extreme wind speed, and the accompanying wave height and surge height for calculation and analysis. The extracted data are shown in Figure 4. High winds act on harbor engineering, oil platforms and ships, and have a direct impact on maritime transportation and operations. In addition, the strong wind field acts on the sea surface, forming waves, currents, and increasing water, which indirectly affect the layout planning of the port and the channel. Therefore, the extreme wind speed is a key meteorological dynamic factor for the Naozhou observation station is located on the East island of Zhanjiang, which is on the east side of the Leizhou Peninsula, and in Leizhou Bay. The east coast of the peninsula recesses inward, forming an arc; Naozhou station has been built in this large-scale bay. Therefore, the area forms storm surges easily affected by the pocket-shaped terrain when a typhoon comes. Guangdong province is seriously affected by typhoon disasters [31]. The number of typhoons that land in the western section is the largest among the three coastal sections, far exceeding that of the eastern section and the Pearl River mouth section [32]. The storm surge disasters in the study area are frequent, severe, and have long durations, so the data collected by the observation station are representative. We collected the measured data of environmental elements during a total of 25 years from 1992 to 2016 during typhoons, and extracted samples of annual extreme wind speed, and the accompanying wave height and surge height for calculation and analysis. The extracted data are shown in Figure 4. High winds act on harbor engineering, oil platforms and ships, and have a direct impact on maritime transportation and operations. In addition, the strong wind field acts on the sea surface, forming waves, currents, and increasing water, which indirectly affect the layout planning of the port and the channel. Therefore, the extreme wind speed is a key meteorological dynamic factor for the planning and the design of marine engineering. Waves also have an important impact on the location and the layout of ports, as well as the design, operation, and management of marine engineering structures. The design's wave height is an important reference for lighthouses, vertical-wall buildings, slope buildings, and pile foundation buildings. In addition, the storm surge always threatens the marine engineering structure; the extreme water increase formed by it is an important basis for the design of urban protection projects such as breakwater and seawall, as well as coastal nuclear power plants, oil platforms, and harbor engineering structures. So, wind speeds, wave heights, and surge heights were chosen as the analysis data in this study.

Construction of Marginal Distribution
The maximum entropy method introduced above was used for the parameter estimation of P-III distribution of sample sequences about wave height, surge height, and wind speed. The plotting position formula y i = (i − 0.75)/(n + 0.25) was used to calculate the empirical frequency. The Kolmogorov-Smirnov (K-S) test, root-mean-square error method (RMSE), and probability point correlation coefficient method (PPCC) were used to test the goodness of fit. The parameter values and test results obtained based on the above settings are shown in Table 1, and the probability distribution diagrams are shown in Figure 5.  The test results show that the p-values of the K-S test are quite high, the values of RMSE are quite small, and the values of PPCC are all greater than 0.95, which indicate that the empirical data fit well with the theoretical frequency curves. The probability distribution diagrams also intuitively reflect that the fitting effects of the tail fit well. So, it can be seen that the parameter values estimated by the maximum entropy method are more accurate with small errors.

Construction of Two-Dimensional Joint Distribution
The G-H Copula function was used to construct two-dimensional joint distributions of wave height-surge height, surge height-wind speed, and wave height-wind speed. The Kendall correlation coefficient τ of each combination was calculated, and the parameter value θ was determined according to the relationship between τ and θ. The calculation results are shown in Table 2. The Pearson correlation coefficient ρ of each combination is given in Table 2, and the scatter plots of the Copula function with calculated τ-values are given in Figure 6. They can further demonstrate the correlation between environmental factors.  The τ-values, the p-values, and the scatter plots show that the correlation between wave height and surge height is relatively strong. There is a certain correlation between surge height and wind speed. The correlation between wave height and wind speed is relatively weak.
The values calculated by the empirical 2D plotting position formula (27) and the values of the theoretical frequency calculated by the Copula function are plotted in the two-dimensional coordinate system to obtain the fitting test diagrams, as shown in Figure 7. According to Figure 7, the scattered points of each combination are distributed around the 45 • straight line. The test diagrams show that the degree is well fitted, so the determined Copula functions can be used to build the joint distribution models.

Calculation of Return Period
According to the calculation method of the return period introduced above, the joint distribution diagrams under three return standards were drawn based on the determined marginal distributions and two-dimensional joint distributions of marine environmental elements, Figure 8. The design return period of the single variable was set from 5a to 500a. The three kinds of return periods of the joint distribution calculated for each combination are shown in Table 3. Due to the large space occupied by the graph, only the distribution diagrams for the combination of wave height and surge height are given, as shown in Figure 8.
It can be seen from the table that the calculated joint return periods are less than the design return periods of the single variable; the co-occurrence return periods are greater than the design return periods; and the Kendall return periods are between the two kinds of traditional return periods, but they are larger than the design return periods. The risk probability and the return period are reciprocals of each other, so it shows that for the two-dimensional joint distribution, the risk probability of the joint return period is relatively large, and the risk rate of the co-recurrence return period is relatively low. However, the Kendall return period can more accurately express the risk level of the combined action of two environmental factors at the specific design frequency.  Table 3. Three kinds of return periods of the two-dimensional joint distribution for different combinations.

Calculation of Design Value
In the univariate hydrological frequency analysis, the design value corresponding to a specific return period is unique, and can be directly calculated by the inverse function of the marginal distribution. For two-dimensional joint distribution, not only is there no inverse function corresponding to the probability distribution function, but the design value is not unique, as shown in the counters of the return period in Figure 9. Although there are multiple combinations of design values under the same recurrence level, there is always one group with the maximum probability. The combination can make the joint probability density function f (u,v) determine the maximum value, which can be used as a reference for measuring the cost and the risk of marine engineering. Taking Figure 9 as an example, it gives schematic diagrams about the contour superposition of the return period and the probability density function of the joint distribution. According to the above analysis, the horizontal value and the vertical value at the tangent point should be selected as the design values of wave height and surge height under a specific return period. Due to space constraints, we only give the schematic diagrams of the combination of wave height and surge height. Therefore, the weight function method was used to calculate the design values of environmental elements in this study: The obtained design values of environmental elements under the different return period standards are shown in Table 4. According to the table, the joint return period is relatively conservative, and the design values calculated based on it are higher than those calculated based on the co-occurrence return period. Whereas, the design values calculated based on the Kendall return period are between the two kinds of traditional return periods, and closer to those calculated based on the marginal distribution.

Discussion
According to the calculation results of the return period and the design value, the joint return period expands the danger area of extreme marine hydrological events, resulting in relatively small return periods and relatively large design values for the dangerous events with the same combination of environmental factors. The joint return period will lead to a relatively high project investment. In contrast, the co-occurrence return period narrows the danger area of extreme marine events. So, it obtains larger return periods and smaller design values. Although this design method is more economical in investment, it may lead to certain risks in engineering construction. After comparison and analysis, we believe that the design value calculated from the Kendall return period can balance the two factors of safety and economy and reduce the project costs reasonably on the premise of ensuring structural safety.
It should also be noted that the design value calculated under the standard of the Kendall return period is lower than that of a single variable. It shows that the safety of an engineering structure can be guaranteed by using the design value of a single variable of environmental factors according to the existing design specifications. However, compared with the method of calculating the design value by using the marginal distribution of a single variable, the method of the Kendall return period is more reasonable as it takes the correlation between variables into account.

Conclusions
In this study, the two-dimensional joint distributions of the G-H Copula function were constructed based on the observation samples of marine environmental elements from Naozhou station in the sea waters of western Guangdong. The Kendall return period was re-introduced. The Kendall return periods were calculated and compared with the obtained joint return periods and the co-occurrence return periods. The design values of environmental elements under each recurrence standard were also calculated. The calculation results showed that the obtained Kendall return periods and the corresponding design values were all between the two kinds of traditional return periods. The conclusion is that the definition of the Kendall return period is reasonable and avoids the limitations of traditional definition modes (that have the problem of unclear identification of danger areas). The Kendall return period can accurately characterize the risk level of the combined action of two environmental elements under the specific design frequency. The design values of environmental elements deduced from the Kendall return period can not only protect the marine engineering structure from typhoon disasters but also reduce the project cost to avoid unnecessary waste. The application of the Kendall return period provides a reasonable and feasible method for the formulation of ocean engineering design standards. In future research, the Kendall return period should be developed into three-dimensions or even multi-dimensions to analyze the joint risk level when a variety of environmental elements act on the marine engineering structure at the same time.
Author Contributions: Conceptualization, formal analysis, investigation, methodology, software, visualization, writing-original draft, writing-review and editing were completed by Y.L.; data curation, funding acquisition, project administration, resources, supervision, and validation were completed by G.L. All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by the Natural Science Foundation of Shandong Province, grant number ZR2019MEE050.