4.3.1. Sampling Method
To identify a suitable sampling range that ensures stable dependencies between SW and the corresponding H1/3, samples were generated using the approach described in Section 3.4.1 for different numbers r of largest events per year.
Figure 12 shows the variation of Kendall’s τ as a function of r for the six coastal sections. At low r-values (1–6), Kendall’s τ fluctuates considerably, indicating unstable dependencies due to limited sample sizes and inconsistent event representation. With increasing r, the correlations gradually converge. To reduce subjectivity in the selection of r, the stability of the dependence structure was assessed based on the absolute change in Kendall’s τ between consecutive values of r, defined as
Δτ(r) = |τ(r) − τ(r − 1)|.
Stability was assumed when the maximum Δτ across all coastal sections remained below a threshold of ε = 0.04 for at least two consecutive values of r.
It should be noted that the choice of the threshold ε is to some extent subjective. However, the selected value is consistent with the observed magnitude of Δτ (≈ 0.02–0.03 in the stable range) and ensures that the stabilization of the dependence structure is identified without being overly sensitive to minor fluctuations. The results show a reduction in Δτ up to r ≈ 7, after which the changes remain small (Δτ ≈ 0.02–0.03), indicating that the dependence structure has reached an approximately stable plateau. Therefore, r = 7 was selected as the smallest value at which the dependence can be considered stable. This interpretation is also supported by the visual assessment of Figure 12, where the curves clearly flatten from r ≈ 7 onward.
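The stopping rule for r can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the study: the function name `select_stable_r`, the input layout, and the reading of "at least two consecutive values of r" as two successive changes below ε are assumptions, and the τ values in the example are synthetic.

```python
def select_stable_r(tau_by_section, eps=0.04, run=2):
    """Return the first r whose change in tau, and the following run - 1 changes, stay below eps.

    tau_by_section maps a section name to a list of Kendall's tau values
    for r = 1, 2, 3, ... (hypothetical layout). Returns None if the
    criterion is never met.
    """
    n = min(len(t) for t in tau_by_section.values())
    # max_delta[k] is the largest |tau(r) - tau(r - 1)| over all sections for r = k + 2
    max_delta = [
        max(abs(t[i] - t[i - 1]) for t in tau_by_section.values())
        for i in range(1, n)
    ]
    for k in range(len(max_delta) - run + 1):
        if all(d < eps for d in max_delta[k:k + run]):
            return k + 2
    return None


# Synthetic tau curves for two sections (not values from the study)
tau = {
    "A": [0.10, 0.30, 0.20, 0.28, 0.29, 0.30],
    "B": [0.40, 0.25, 0.35, 0.33, 0.32, 0.31],
}
print(select_stable_r(tau))  # 5
```

Taking the maximum over sections before applying the threshold means that a single unstable section is enough to postpone the selected r, which matches the requirement that the criterion hold across all coastal sections.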
Based on this selection, for each year, the seven highest SW values together with their corresponding H1/3 were included in the bivariate sample, ensuring that the resulting dataset provides a balance between statistical robustness and the preservation of physically meaningful SW–H1/3 combinations. The samples for the coastal sections are shown in Figure 13. Events with SW < 0.5 m above NHN were excluded from the dataset, as they do not represent extreme events with high load potential for coastal protection structures. The analysis is based on absolute water levels rather than isolated storm surge components. Consequently, negative anomalies (e.g., related to inverse barometer effects) are not represented in the selected extreme event sample. Additionally, physically implausible parameter combinations were removed.
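The construction of the bivariate sample and the resulting rank correlation can be illustrated as follows. The sketch assumes that the input events are already independent (the event definition of Section 3.4.1 is not reproduced here), uses the hypothetical names `r_largest_sample` and `kendall_tau`, and computes Kendall's tau-a with a simple pairwise loop. The plausibility filter mentioned above is omitted.

```python
from collections import defaultdict


def r_largest_sample(events, r=7, min_sw=0.5):
    """Keep, per year, the r events with the highest water level SW.

    events: iterable of (year, sw, h13) tuples, with sw in m above NHN and
    h13 the concurrent significant wave height (assumed layout).
    Events with sw below min_sw are dropped before ranking.
    """
    by_year = defaultdict(list)
    for year, sw, h13 in events:
        if sw >= min_sw:
            by_year[year].append((sw, h13))
    sample = []
    for year in sorted(by_year):
        top = sorted(by_year[year], key=lambda pair: pair[0], reverse=True)[:r]
        sample.extend(top)
    return sample


def kendall_tau(pairs):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    n = len(pairs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (pairs[i][0] - pairs[j][0]) * (pairs[i][1] - pairs[j][1])
            score += (prod > 0) - (prod < 0)  # tied pairs contribute 0
    return 2 * score / (n * (n - 1))
```

Repeating `kendall_tau(r_largest_sample(events, r=k))` for k = 1, 2, ... would give one τ(r) curve per coastal section of the kind plotted in Figure 12.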
Although Figure 13 indicates an overall increasing trend between water level and significant wave height, the relationship is not strictly linear and shows considerable scatter. The dependence is only moderate (τ ≈ 0.23–0.34), and the variability of the significant wave height increases with higher water levels. For design applications, the joint probability of extreme events is more relevant than the average trend, which motivates the use of copula-based approaches.
4.3.2. Determination of the Univariate Marginal Distributions
To identify suitable marginal distributions for the stochastic modeling of SW and H1/3, a univariate goodness-of-fit analysis was performed. Several candidate distributions were assessed (Normal, Weibull, Generalized Extreme Value (GEV), Logistic, and Lognormal) based on the KS test, the RMSE, and the AIC.
Figure 14 presents the results for SW. Both Lognormal and GEV distributions demonstrated the best overall performance. The KS test indicates that the Lognormal distribution yields p-values above the significance level α = 0.05 for most coastal sections, suggesting no significant deviation from the empirical data. This finding is supported by low RMSE and AIC values, which confirm that the Lognormal model provides a consistent representation of the upper tail and extreme water levels. The GEV distribution performs comparably but shows slightly higher residuals in some cases, while the Logistic distribution tends to underestimate high water levels. Consequently, the Lognormal distribution was selected as the marginal model for SW. The estimated parameters of the Lognormal distribution for all coastal sections are summarized in Table 6.
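To make the three criteria concrete, the following sketch fits a two-parameter Lognormal distribution by maximum likelihood and evaluates the KS statistic, an RMSE, and the AIC using only the Python standard library. It is an illustration under stated assumptions: the function name `lognormal_fit_metrics` is hypothetical, the RMSE is taken between the fitted CDF and the plotting positions i/(n + 1) (the text does not state which convention was used), and the KS p-value is not computed.

```python
import math


def lognormal_fit_metrics(sample):
    """Fit a Lognormal by ML and return parameters plus goodness-of-fit metrics."""
    n = len(sample)
    logs = [math.log(x) for x in sample]  # requires strictly positive data
    mu = sum(logs) / n  # ML estimate of the log-mean
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)  # ML estimate (divides by n)

    def cdf(x):
        z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
        return 0.5 * (1.0 + math.erf(z))

    fitted = [cdf(x) for x in sorted(sample)]

    # KS statistic: largest gap between the empirical step function and the fitted CDF
    ks = max(max((i + 1) / n - f, f - i / n) for i, f in enumerate(fitted))

    # RMSE between fitted CDF and plotting positions i / (n + 1) (assumed convention)
    rmse = math.sqrt(
        sum((f - (i + 1) / (n + 1)) ** 2 for i, f in enumerate(fitted)) / n
    )

    # AIC = 2k - 2 ln L with k = 2 fitted parameters
    loglik = sum(
        -math.log(x * sigma * math.sqrt(2.0 * math.pi))
        - (math.log(x) - mu) ** 2 / (2.0 * sigma ** 2)
        for x in sample
    )
    aic = 2 * 2 - 2.0 * loglik
    return {"mu": mu, "sigma": sigma, "ks": ks, "rmse": rmse, "aic": aic}
```

Because AIC values are only comparable between models fitted to the same data, a comparison of this kind has to be carried out separately for each coastal section.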
Figure 15 shows the results for H1/3. Among the tested models, the GEV distribution exhibits the most consistent behavior across all coastal sections, with the highest KS p-values and the lowest RMSE and AIC values. These results indicate that the GEV distribution adequately represents the statistical characteristics of H1/3, particularly in the upper range of the data. The Normal and Logistic distributions, in contrast, underestimate the tail behavior, while the Weibull distribution shows higher deviations for extreme events. Therefore, the GEV distribution was selected as the marginal model for H1/3. The corresponding parameter estimates for all coastal sections are provided in Table 7.
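Once GEV parameters are available, a univariate design value follows from the quantile function. The sketch below uses the standard GEV parameterization with location μ, scale σ, and shape ξ (some libraries, e.g. SciPy, use the opposite sign for the shape). The conversion from return period to non-exceedance probability via the number of sampled events per year is an assumption made here for illustration; the source does not spell out this step, and any parameter values passed in are hypothetical.

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """GEV distribution function; xi = 0 is treated as the Gumbel limit."""
    if abs(xi) < 1e-12:
        return math.exp(-math.exp(-(x - mu) / sigma))
    t = 1.0 + xi * (x - mu) / sigma
    if t <= 0.0:
        # outside the support: below the lower bound (xi > 0) or above the upper bound (xi < 0)
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def gev_quantile(p, mu, sigma, xi):
    """Inverse of gev_cdf for 0 < p < 1."""
    if abs(xi) < 1e-12:
        return mu - sigma * math.log(-math.log(p))
    return mu + sigma / xi * ((-math.log(p)) ** (-xi) - 1.0)


def return_level(T, mu, sigma, xi, events_per_year=7.0):
    """Value exceeded on average once in T years (assumed conversion)."""
    p = 1.0 - 1.0 / (events_per_year * T)  # per-event non-exceedance probability
    return gev_quantile(p, mu, sigma, xi)
```

With the parameters of Table 7, `return_level(100, mu, sigma, xi)` would give a univariate 100-year H1/3 under these assumptions.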
It should be noted that Figure 14 and Figure 15 only display the probability distributions that successfully passed the KS test (p > 0.05) for each coastal section. Distributions that did not meet this criterion were excluded from graphical representation to ensure that only statistically valid models are compared. Overall, the combination of a Lognormal distribution for SW and a GEV distribution for H1/3 provides a statistically consistent and physically meaningful description of the univariate extremes. These marginals form the basis for the subsequent bivariate dependence modeling described in Section 4.3.3.
4.3.3. Determination of the Bivariate Joint Distributions
To describe the dependence structure between SW and H1/3, several copula families were tested, including the Archimedean Clayton, Frank, and Gumbel copulas and the elliptical Normal (Gaussian) copula. The copula parameters were estimated using the ML method. Figure 16 shows the resulting goodness-of-fit evaluation for all coastal sections based on the CvM test, the RMSE, and the AIC.
Among the tested families, the Frank copula provides the most consistent and statistically robust fit across all coastal sections. It shows the lowest RMSE values and the highest CvM test p-values in most cases, indicating that it adequately captures the central dependence structure between SW and H1/3 without overemphasizing either tail. Differences in copula performance between individual coastal sections are relatively small, indicating similar dependence structures between SW and H1/3 along the coast. The Gumbel copula performs comparably but tends to overrepresent upper-tail dependence, while the Clayton copula overestimates the lower-tail behavior. The Normal copula generally shows weaker performance, particularly for extreme events. The Göhren section shows reduced fit performance, with higher RMSE values and lower p-values, likely caused by increased local variability or data-related uncertainties. Overall, the Frank copula was selected as the most suitable dependence model for the subsequent bivariate extreme value analysis, as it provides a statistically consistent and physically plausible representation of the dependence between SW and H1/3. The estimated copula parameters for all sections are summarized in Table 8.
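For reference, the Frank copula and a simple likelihood-based estimate of its parameter θ can be written down compactly. The sketch below is not the fitting procedure of the study: it works on rank-based pseudo-observations rather than the fitted marginals, restricts θ to positive values (positive dependence), and replaces a numerical optimizer with a coarse grid search. All function names are hypothetical.

```python
import math


def frank_cdf(u, v, theta):
    """Frank copula C(u, v) for theta != 0."""
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(num / math.expm1(-theta)) / theta


def frank_log_density(u, v, theta):
    """Log of the Frank copula density for theta > 0 and 0 < u, v < 1."""
    a = -math.expm1(-theta)  # 1 - exp(-theta)
    au = -math.expm1(-theta * u)
    av = -math.expm1(-theta * v)
    return (math.log(theta) + math.log(a) - theta * (u + v)
            - 2.0 * math.log(a - au * av))


def pseudo_observations(values):
    """Ranks scaled to (0, 1); ties are broken by input order."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return [r / (n + 1) for r in ranks]


def fit_frank_grid(x, y, grid=None):
    """Grid-search estimate of theta that maximizes the pseudo-likelihood."""
    if grid is None:
        grid = [k / 10 for k in range(1, 201)]  # 0.1, 0.2, ..., 20.0
    u, v = pseudo_observations(x), pseudo_observations(y)

    def loglik(theta):
        return sum(frank_log_density(a, b, theta) for a, b in zip(u, v))

    return max(grid, key=loglik)
```

The Frank copula is symmetric and has no asymptotic tail dependence, which is consistent with the statement above that it does not overemphasize either tail.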
To evaluate the co-occurrence of extreme SW and H1/3, the empirical upper-tail dependence λU(u) was estimated following the approach of Frahm et al. [55]. This method quantifies the conditional probability that one variable exceeds a high quantile given that the other exceeds the same quantile, thereby measuring the strength of dependence between simultaneous extremes. The analysis was applied to the observed bivariate samples of SW and H1/3 from all coastal sections (see Figure 13). Figure 17 presents the variation of λU as a function of the quantile threshold u, illustrating how the strength of dependence evolves toward the upper tail of the joint distribution.
The empirical analysis of the upper-tail dependence λU(u) shows moderate positive dependence between SW and H1/3 across all coastal sections, with values mostly between 0.2 and 0.5. The Boltenhagen section exhibits the strongest and most stable dependence, whereas the eastern sections (Göhren, Koserow) show slightly weaker and more variable dependence. For u > 0.98, λU becomes unstable due to limited sample sizes.
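The conditional exceedance probability behind λU(u) can be estimated directly from ranks. The following sketch implements the plain threshold-based estimate, i.e., the share of events above the u-quantile in the first variable that also exceed the u-quantile in the second. It is not necessarily the exact estimator of Frahm et al. [55], and the function names are hypothetical.

```python
def pseudo_observations(values):
    """Ranks scaled to (0, 1); ties are broken by input order."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return [r / (n + 1) for r in ranks]


def upper_tail_dependence(x, y, u):
    """Empirical P(V > u | U > u) based on rank-transformed data."""
    ux, uy = pseudo_observations(x), pseudo_observations(y)
    exceed_x = [i for i in range(len(x)) if ux[i] > u]
    if not exceed_x:
        return float("nan")  # no events above the threshold
    both = sum(1 for i in exceed_x if uy[i] > u)
    return both / len(exceed_x)
```

As u approaches 1, the number of events in `exceed_x` shrinks to a handful, so each single event changes the estimate by a large step. This is the mechanism behind the instability noted above for u > 0.98.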
An infinite number of parameter combinations can be generated using the parameterized copula model, for which either the non-exceedance probability or the corresponding return period can be determined. This allows the identification of all combinations of SW and H1/3 associated with a given probability of occurrence. A graphical representation of these combinations is provided by isolines, which connect all pairs of SW and H1/3 values corresponding to the same return period. Figure 18 illustrates the isolines for R = 5 a, R = 25 a, and R = 100 a for all coastal sections. Each isoline theoretically contains an infinite number of possible parameter combinations, and there is no inherent criterion for selecting a single design event along the curve. Therefore, a direct comparison of events between different coastal sections is challenging. To support a consistent interpretation, the black points (PmaxD) indicate the locations of maximum probability density along each isoline, following the approach of Salvadori et al. [58].
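How an isoline arises from a fitted copula can be sketched as follows. The example rests on several assumptions that the text does not confirm: it uses the "AND" joint return period (both variables exceeded simultaneously), converts the return period to a per-event probability with seven events per year, works in copula space (u, v) only, and takes θ as an illustrative input. Mapping the points to SW and H1/3 would require the marginal quantile functions with the parameters of Tables 6 and 7, which are not reproduced here.

```python
import math


def frank_cdf(u, v, theta):
    """Frank copula C(u, v) for theta != 0."""
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(num / math.expm1(-theta)) / theta


def joint_exceedance(u, v, theta):
    """P(U > u, V > v) under the Frank copula."""
    return 1.0 - u - v + frank_cdf(u, v, theta)


def and_isoline(theta, T, events_per_year=7.0, n_points=5):
    """Points (u, v) whose joint exceedance probability corresponds to T years."""
    p = 1.0 / (events_per_year * T)  # target probability per event (assumed conversion)
    u_max = 1.0 - p  # beyond this u no v can reach the target
    points = []
    for k in range(1, n_points + 1):
        u = u_max * k / (n_points + 1)
        lo, hi = 0.0, 1.0
        for _ in range(60):  # bisection: joint_exceedance decreases in v
            mid = 0.5 * (lo + hi)
            if joint_exceedance(u, mid, theta) > p:
                lo = mid
            else:
                hi = mid
        points.append((u, 0.5 * (lo + hi)))
    return points
```

Evaluating the joint density at each returned point and keeping the maximum would give a discrete approximation of the PmaxD location on the isoline.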
The shape of the isolines in Figure 18 further supports the findings of the dependence analysis. In all coastal sections, the isolines exhibit a pronounced curvature toward the upper right corner, indicating a positive dependence between SW and H1/3. The curvature is strongest in Boltenhagen, reflecting a stable coupling between high SW and H1/3. In the central sections (Warnemünde and Zingst), the dependence is moderately expressed, while in the eastern sections (Göhren and Koserow) the isolines flatten slightly, suggesting weaker joint extremes and a higher degree of variability. These spatial patterns correspond well with the tail-dependence results shown in Figure 17.
With the fitted univariate and bivariate probability models, SW and H1/3 can now be computed and compared for defined return periods R in each coastal section. The red lines in Figure 19 represent the univariate results, while the black lines correspond to the bivariate combinations derived from the copula models. Water levels are indicated by dots and wave heights by squares. The univariately and bivariately determined SW curves show only minor differences in magnitude, and from R = 50 a onward, both curves progress almost linearly. In all sections, the univariately determined SW values are higher than the corresponding bivariate ones. In contrast, the univariately determined H1/3 values show both linear and exponential tendencies across the coastal sections. In Boltenhagen, the increase in H1/3 remains almost linear, while in Göhren a pronounced exponential rise is observed. At the points PmaxD, the H1/3 values exhibit no significant increase beyond R = 50 a, indicating an approximately linear progression. The deviations between univariately and bivariately determined H1/3 are considerably larger than those of the water levels.
Accounting for the dependencies between SW and H1/3 using the copula models leads to systematically lower parameter magnitudes for the same return periods across all coastal sections. This suggests that univariate analyses tend to overestimate design values, while the bivariate approach provides more realistic estimates by incorporating the physical dependence between SW and H1/3.
Section 4 presented the results of the data analysis and of the univariate and bivariate probabilistic modeling of SW and H1/3 along the Baltic Sea coast of MP. The findings revealed distinct spatial variations in the hydrodynamic parameters and a consistent positive dependence between SW and H1/3 across all coastal sections. Univariate analyses identified the Lognormal distribution as the most suitable model for SW and the GEV distribution for H1/3. The bivariate dependence was best represented by the Frank copula, which captured the moderate dependence without overemphasizing either tail and provided statistically robust fits. The comparison of univariate and bivariate design values demonstrated that neglecting the dependence between SW and H1/3 leads to an overestimation of design parameters, particularly for the wave height.
Overall, the combined use of univariate and copula-based bivariate models provided a reliable and physically consistent framework for determining joint design parameters. These results form the basis for the subsequent design example in Section 5, where the derived design parameters are applied to calculate the wave run-up on coastal dikes.