1. Introduction
Understanding and predicting the evolution of coastal morphology is important in coastal engineering, because of its implications for coastal safety, the environment, and the economy. For instance, coastal morphodynamics influence the occurrence of rip currents affecting swimmer safety, the protection of the hinterland from coastal flooding and erosion, and the establishment and development of coastal ecosystems. Often, process-based morphodynamic models are used to predict coastal evolution. These models account for a wide range of coastal processes, such as waves, currents, sediment transport, and morphology, which results in a high level of complexity and, consequently, extensive computational effort.
As complexity and computational effort increase, process or input reduction becomes necessary to obtain feasible computational times for engineering applications. In a morphodynamic sense, input reduction (IR) can be defined as the selection of a reduced set of representative forcing conditions that lead to accurate approximations of the long-term morphological evolution [1]. A robust input reduction method should preserve some of the natural variability of the environment to be able to represent the full set of conditions accurately [2,3]. In coastal environments, waves are typically the dominant forcing (i.e., on wave-dominated coasts). Accurate modeling of the nearshore morphodynamics requires selecting representative wave conditions that capture the variation in both wave height and direction, including mild and extreme events.
Two main categories of wave input reduction methods exist: binning and clustering methods. Binning methods divide the wave conditions into bins, sometimes using a specific weight target, such as longshore sediment transport. Clustering methods group the wave conditions according to their statistical similarity. Binning methods have been investigated previously by [2,3]. However, to the best of our knowledge, the application of clustering methods to inter-annual morphological predictions has not yet been addressed.
In addition to the selection of the representative wave conditions, the number, duration, and sequencing of the wave conditions also affect the IR performance [4]. Sequencing refers to the order in which the representative wave conditions occur in the model. The chronology of the storm conditions is likely to affect the morphological response of the sandbar due to non-linear effects. Sequencing can be performed by random, systematic, or Markov Chain methods. Random sequencing draws wave conditions randomly from the reduced wave climate [2,3]. Systematic sequencing orders the wave conditions according to wave height (e.g., descending or ascending) and incident wave angle with respect to the shore-normal (e.g., from positive to negative or vice versa; see [2]). Markov Chain sequencing uses the wave chronology of the full dataset to order the representative wave conditions. To the best of our knowledge, Markov Chain sequencing methods applied to inter-annual morphological predictions have not yet been addressed in the literature.
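A minimal sketch of how Markov Chain sequencing could be set up is given here. It assumes that every time step of the full record has already been labeled with the index of its representative wave condition, that a first-order chain is sufficient, and that a uniform draw is an acceptable fallback for a state without an observed successor. The function names and the example labels are hypothetical and are not taken from the study.

```python
import random
from collections import defaultdict

def transition_probabilities(labels):
    """Estimate first-order transition probabilities from a label sequence."""
    counts = defaultdict(lambda: defaultdict(int))
    for current, following in zip(labels, labels[1:]):
        counts[current][following] += 1
    probs = {}
    for state, row in counts.items():
        total = sum(row.values())
        probs[state] = {nxt: n / total for nxt, n in row.items()}
    return probs

def markov_sequence(probs, start, length, seed=0):
    """Draw a synthetic sequence of representative wave conditions."""
    rng = random.Random(seed)
    sequence = [start]
    while len(sequence) < length:
        row = probs.get(sequence[-1])
        if not row:
            # Assumption: a state without an observed successor falls back
            # to a uniform draw over all known states.
            row = {state: 1.0 for state in probs}
        states = list(row)
        weights = [row[state] for state in states]
        sequence.append(rng.choices(states, weights=weights)[0])
    return sequence

# Hypothetical chronology: index of the representative condition per time step.
labels = [0, 0, 1, 2, 1, 0, 0, 1, 2, 2, 1, 0]
probs = transition_probabilities(labels)
synthetic = markov_sequence(probs, start=labels[0], length=20)
```

In contrast to random sequencing, the synthetic series can only contain transitions that occur in the measured chronology.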
Although input reduction is common practice in morphodynamic modeling [2,3,5,6,7,8,9], a comprehensive study of the performance of different IR methods is lacking. In this study, we investigate the performance of 10 input reduction techniques with 36 subvariants (i.e., different initializations and input variables), including both binning and clustering methods. To this end, we use a cross-shore sandbar behavior model forced with measured wave time series to simulate the morphological evolution of a beach profile for the cases of Noordwijk in the Netherlands (Figure 1a) and Anmok in South Korea (Figure 1b). For the most promising method, we systematically assess the performance with respect to the number of wave conditions, the sequencing method, and the duration of the reduced wave climate (following [2]). As the cross-shore model is computationally inexpensive, we are able to test a wide range of input reduction methods and settings to derive guidelines for selecting a suitable input reduction setup in future studies.
3. Tested Input Reduction Methods
We selected five binning and five clustering methods for the performance assessment (see Table 1). In most of these methods, the wave conditions are clustered based on their spectral parameters, such as the root-mean-square wave height ($H_{rms}$), peak period ($T_p$), and wave direction ($\theta$). However, we also clustered the conditions with respect to their contribution to the sediment transport by substituting $H_{rms}$ with the associated longshore sediment transport (when known) or with $H_{rms}^{p}$, where $p$ is the power that represents the non-linear relation between wave height and sediment transport. Typically, $p$ varies from
to
. In this study,
was applied. All variations in terms of input variables are shown in
Table 1.
In each bin or cluster, the representative wave condition is defined as the centroid of that bin or cluster. For the spectral wave parameters, the centroids are defined as the average of the wave conditions within a cluster or bin. For $H_{rms}^{p}$, the centroid within a bin is defined by a non-linear weighting formula for the wave height (Equation (4)), in which the root-mean-square wave height $H_{rms,i}$ of each wave condition $i$ belonging to the cluster or bin $j$ with $n$ observations is weighted by its frequency of occurrence. When the longshore sediment transport is used as the input variable, the centroids are defined by the average of the wave conditions within the bin or cluster. To obtain the values of $H_{rms}$, the wave condition nearest to the centroid is used. The sub-sections below discuss the principles of the selected IR methods.
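The effect of a non-linear weighting of the wave height can be illustrated with a small sketch. It assumes the commonly used power-weighted mean, H_rep = (sum(f_i * H_i^p) / sum(f_i))^(1/p); this form is an assumption and is not necessarily identical to Equation (4). The heights, frequencies, and the value of p below are hypothetical.

```python
def representative_height(heights, frequencies, p):
    """Power-weighted mean: (sum(f * H**p) / sum(f)) ** (1 / p)."""
    weighted_sum = sum(f * h ** p for h, f in zip(heights, frequencies))
    return (weighted_sum / sum(frequencies)) ** (1.0 / p)

# Hypothetical wave heights (m) and frequencies of occurrence within one bin.
heights = [0.5, 1.0, 2.0]
frequencies = [0.6, 0.3, 0.1]

arithmetic = sum(f * h for h, f in zip(heights, frequencies)) / sum(frequencies)
h_rep = representative_height(heights, frequencies, p=2.0)  # p is hypothetical
```

For any p larger than 1, the weighted height exceeds the arithmetic mean of the bin, so the energetic conditions contribute more strongly to the centroid.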
3.1. Binning Methods
3.1.1. Conditions with the Largest Transport Contribution Method
The Conditions with the Largest Transport Contribution method (CLTCM) [13] selects only the wave conditions with the highest contributions to the longshore sediment transport. Initially, the wave conditions are binned into a larger number of wave height and wave direction bins than the desired number of wave conditions for the reduced wave climate. The sediment transport contribution of each bin is then determined, and the bins with the highest contribution are selected as the representative wave conditions, up to the desired number. This method uses the sediment transport rates as the input variable (see Table 1) and, hence, requires the transport rates corresponding to the wave conditions to be known before the input reduction is executed.
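The selection step of the CLTCM can be sketched as follows. The bin widths, the ranking by the magnitude of the summed transport, and the use of the bin average as representative condition are assumptions made for illustration; the function name and the data are hypothetical.

```python
from collections import defaultdict

def cltcm(waves, k, h_step=0.5, dir_step=15.0):
    """Keep the k bins with the largest summed transport contribution.

    waves: list of (height, direction_deg, transport) tuples.
    """
    bins = defaultdict(list)
    for h, d, s in waves:
        # Floor division also handles negative directions correctly.
        bins[(int(h // h_step), int(d // dir_step))].append((h, d, s))
    # Assumption: bins are ranked by the magnitude of their summed transport.
    ranked = sorted(bins.values(),
                    key=lambda members: abs(sum(m[2] for m in members)),
                    reverse=True)
    representatives = []
    for members in ranked[:k]:
        n = len(members)
        representatives.append((sum(m[0] for m in members) / n,   # mean height
                                sum(m[1] for m in members) / n,   # mean direction
                                sum(m[2] for m in members)))      # bin transport
    return representatives

# Hypothetical record: (height in m, direction in degrees, transport rate).
waves = [(0.4, 10.0, 1.0), (0.45, 12.0, 1.0), (1.2, 20.0, 5.0),
         (1.3, 25.0, 6.0), (2.6, -20.0, -30.0), (0.7, -5.0, -0.5)]
representatives = cltcm(waves, k=2)
```

Bins that are not selected are discarded entirely, which is why the initial binning has to be finer than the desired reduced wave climate.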
3.1.2. Fixed Bins Method
The Fixed Bins method (FBM) [3] divides the wave conditions into pre-defined wave height and wave direction bins. The algorithm first divides the wave conditions into directional bins with uniform resolution. Next, each directional bin is divided into wave height bins according to its range of wave heights. As a result, the wave height bins can vary among the directional classes (see Figure 3a).
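The two-step binning of the FBM can be sketched as follows, assuming equal-width height classes within each directional bin and wave directions between -90 and 90 degrees relative to the shore-normal. The function name, the argument names, and the data are hypothetical.

```python
def fixed_bins(waves, n_dir, n_height, dir_min=-90.0, dir_max=90.0):
    """Return the centroids of a fixed-bins classification.

    waves: list of (height, period, direction_deg) tuples.
    """
    dir_width = (dir_max - dir_min) / n_dir
    sectors = [[] for _ in range(n_dir)]
    for wave in waves:
        idx = int((wave[2] - dir_min) / dir_width)
        sectors[min(max(idx, 0), n_dir - 1)].append(wave)
    centroids = []
    for sector in sectors:
        if not sector:
            continue
        # The height classes follow the height range of this sector only.
        h_low = min(w[0] for w in sector)
        h_high = max(w[0] for w in sector)
        h_width = (h_high - h_low) / n_height or 1.0  # guard: single height
        classes = [[] for _ in range(n_height)]
        for wave in sector:
            idx = int((wave[0] - h_low) / h_width)
            classes[min(idx, n_height - 1)].append(wave)
        for members in classes:
            if members:
                n = len(members)
                centroids.append(tuple(sum(w[i] for w in members) / n
                                       for i in range(3)))
    return centroids

# Hypothetical record: (height in m, period in s, direction in degrees).
waves = [(1.0, 5.0, 10.0), (1.2, 7.0, 20.0), (3.0, 9.0, 30.0)]
centroids = fixed_bins(waves, n_dir=1, n_height=2)
```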
3.1.3. Energy Flux Method
The Energy Flux method (EFM) [3] divides the wave conditions into pre-defined wave direction and wave height bins with an equal amount of energy flux ($E_f$):

$$E_f = \frac{1}{8} \rho g H^{2} C_g$$

where $\rho$ is the water density (assumed to be 1025 kg/m$^3$), $g$ is the gravitational acceleration ($g$ = 9.81 m/s$^2$), $H$ is the deep-water wave height, and $C_g$ is the wave group celerity in deep water. The EFM generates a higher bin resolution for conditions with more wave energy and a lower resolution for conditions with less wave energy (see Figure 3b). The wave height of the representative wave conditions is obtained by inverting the energy flux relation for the average energy flux of each bin, while wave period and wave direction are defined as the average of the wave conditions in a bin.
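A minimal sketch of the equal-energy classification within one directional bin is given here. It assumes the linear-theory deep-water group celerity g T / (4 pi), a split of the height-sorted conditions at equal shares of the cumulative energy flux, and the mean period for inverting the flux relation. The function names and the data are hypothetical.

```python
import math

RHO = 1025.0  # water density (kg/m^3), as stated in the text
G = 9.81      # gravitational acceleration (m/s^2), as stated in the text

def group_celerity_deep(period):
    """Deep-water group celerity from linear wave theory: g * T / (4 * pi)."""
    return G * period / (4.0 * math.pi)

def energy_flux(height, period):
    """Energy flux (1/8) * rho * g * H^2 * C_g."""
    return RHO * G * height ** 2 * group_celerity_deep(period) / 8.0

def height_from_flux(flux, period):
    """Invert the energy flux relation for the wave height."""
    return math.sqrt(8.0 * flux / (RHO * G * group_celerity_deep(period)))

def equal_energy_classes(waves, n_classes):
    """Split (height, period) pairs into classes of roughly equal total flux."""
    ordered = sorted(waves, key=lambda w: w[0])
    target = sum(energy_flux(h, t) for h, t in ordered) / n_classes
    classes, current, accumulated = [], [], 0.0
    for h, t in ordered:
        current.append((h, t))
        accumulated += energy_flux(h, t)
        # Close a class once its share of the cumulative flux is reached.
        if accumulated >= target * (len(classes) + 1) and len(classes) < n_classes - 1:
            classes.append(current)
            current = []
    if current:
        classes.append(current)
    return classes

def representative(members):
    """Height from the mean energy flux, period as the plain average."""
    mean_flux = sum(energy_flux(h, t) for h, t in members) / len(members)
    mean_period = sum(t for _, t in members) / len(members)
    return height_from_flux(mean_flux, mean_period), mean_period

# Hypothetical directional bin: many mild waves and a few energetic ones.
waves = [(0.5, 8.0)] * 8 + [(1.0, 8.0)] * 3 + [(2.0, 8.0)]
classes = equal_energy_classes(waves, n_classes=2)
```

In this example, the single most energetic condition forms a class of its own, whereas eleven milder conditions share the other class.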
3.1.4. Sediment Transport Bins Method
Similar to the EFM, the Sediment Transport Bins method (STBM) divides the wave data into pre-defined wave direction and wave height bins with equal weight, but the weight is determined by the longshore sediment transport obtained from the brute force simulation. In contrast to the EFM, the definition of the directional bins starts from the shore-normal angle, so that wave conditions causing sediment transport rates of opposite sign do not average out within a bin (see Figure 3c).
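The directional split from the shore-normal can be sketched as follows. The sketch assumes angles given relative to the shore-normal, uniform sectors on each side, and height classes of equal absolute transport within a sector. The function names and the data are hypothetical.

```python
def split_from_shore_normal(waves, n_per_side, max_angle=90.0):
    """Group (height, angle_deg, transport) tuples into directional sectors.

    Sectors start at the shore-normal (0 degrees), so no sector mixes
    positive and negative angles.
    """
    width = max_angle / n_per_side
    sectors = {}
    for wave in waves:
        side = 1 if wave[1] >= 0 else -1
        idx = min(int(abs(wave[1]) / width), n_per_side - 1)
        sectors.setdefault((side, idx), []).append(wave)
    return sectors

def equal_transport_classes(members, n_classes):
    """Split one sector into height classes of roughly equal |transport|."""
    ordered = sorted(members, key=lambda w: w[0])
    target = sum(abs(w[2]) for w in ordered) / n_classes
    classes, current, accumulated = [], [], 0.0
    for wave in ordered:
        current.append(wave)
        accumulated += abs(wave[2])
        if accumulated >= target * (len(classes) + 1) and len(classes) < n_classes - 1:
            classes.append(current)
            current = []
    if current:
        classes.append(current)
    return classes

# Hypothetical record: (height in m, angle in degrees, transport rate).
waves = [(0.5, -5.0, -1.0), (0.6, 5.0, 1.0), (0.8, 8.0, 1.0), (1.0, 12.0, 1.0),
         (1.2, 20.0, 1.0), (1.5, 25.0, 2.0), (1.8, 15.0, 3.0), (3.0, 10.0, 8.0)]
sectors = split_from_shore_normal(waves, n_per_side=1)
positive = equal_transport_classes(sectors[(1, 0)], n_classes=2)
```

With a sector centered on the shore-normal, the conditions at -5 and 5 degrees would share one bin and their transport contributions would partly cancel; here they end up in separate sectors.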
3.1.5. Representative Wave Approach
The Representative Wave approach (RWA) is adapted from [6] and divides the wave data into bins over time. In this paper, we divided the wave data into seasons. For each seasonal bin, the representative wave condition is the average of the wave conditions in that bin. This is the only method that preserves the chronology of the original wave dataset.
3.2. Clustering Methods
Table 1 provides an overview of the clustering methods and variations that were selected for this study. The clustering methods use the normalized 3-dimensional (e.g., $H_{rms}$, $T_p$, $\theta$) Euclidean distance as a measure of similarity between the wave conditions. The smaller the distance between wave conditions, the more similar they are. Similar to the binning methods, the longshore sediment transport or $H_{rms}^{p}$ can be used as an alternative input variable for $H_{rms}$. We tested these variations for the crisp k-means, fuzzy k-means, and k-harmonic means methods (see Table 1). The resulting clusters depend on the cluster initialization. Therefore, we tested different initializations for the clustering methods: fixed bins, the maximum dissimilarity algorithm, and the K-means++ algorithm [14].
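The similarity measure can be sketched as follows, assuming a min-max normalization of each variable to the range from 0 to 1; the normalization actually used in the study is not specified in this section. The function names and the data are hypothetical.

```python
import math

def normalize(conditions):
    """Min-max scale each variable of a list of equal-length tuples to [0, 1]."""
    columns = list(zip(*conditions))
    lows = [min(column) for column in columns]
    spans = [(max(column) - min(column)) or 1.0 for column in columns]
    return [tuple((value - low) / span
                  for value, low, span in zip(row, lows, spans))
            for row in conditions]

def dissimilarity(a, b):
    """Euclidean distance between two normalized wave conditions."""
    return math.dist(a, b)

# Hypothetical conditions: (wave height in m, peak period in s, direction in deg).
conditions = [(0.5, 4.0, -30.0), (1.5, 8.0, 0.0), (2.5, 12.0, 30.0)]
scaled = normalize(conditions)
```

Without normalization, the variable with the largest numerical range (here the direction in degrees) would dominate the distance.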
3.2.1. Maximum Dissimilarity Algorithm
The Maximum Dissimilarity algorithm (MDA) (see [15]) creates a subset of $k$ centroids that represents the full diversity of the wave data by maximizing the dissimilarity between the vectors in the subset. To measure the dissimilarity between vectors, we used the MaxMin algorithm [16] with the efficient algorithm of [17]. The first centroid is the wave condition with the maximum distance to all other wave data. After the first centroid is excluded from the dataset, the second centroid is the wave condition with the maximum distance to the first centroid. Each subsequent centroid is the remaining wave condition whose minimum distance to the previously selected centroids is the largest.
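A minimal sketch of this MaxMin selection is given here. It assumes that "maximum distance to all other wave data" means the largest summed distance, and it keeps a running minimum distance per point instead of recomputing all distances, in the spirit of an efficient implementation. The function name and the data are hypothetical.

```python
import math

def max_dissimilarity(points, k):
    """Return the indices of k points chosen by a MaxMin selection."""
    n = len(points)
    # Assumption: the first centroid has the largest summed distance to all points.
    first = max(range(n),
                key=lambda i: sum(math.dist(points[i], q) for q in points))
    selected = [first]
    # Distance of every point to its nearest selected centroid so far.
    nearest = [math.dist(p, points[first]) for p in points]
    while len(selected) < k:
        # Next centroid: the point whose nearest centroid is farthest away.
        candidate = max(range(n), key=lambda i: nearest[i])
        selected.append(candidate)
        for i, p in enumerate(points):
            nearest[i] = min(nearest[i], math.dist(p, points[candidate]))
    return selected

# Hypothetical normalized wave conditions in two dimensions.
points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.1), (1.0, 1.0), (0.5, 0.5), (0.0, 1.0)]
subset = max_dissimilarity(points, k=3)
```

For these points, the selection picks the three corners of the data cloud and skips the densely populated region near the origin after its first member is chosen.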
3.2.2. Grouping with Equal Sediment Influence Method
The Grouping with Equal Sediment Influence method (GESIM) aggregates wave conditions into clusters with approximately the same sediment transport contribution [13]. Therefore, it uses only the longshore sediment transport, $T_p$, and $\theta$ as input variables. It follows the same principle as the STBM, but it aggregates the wave conditions into clusters instead of dividing them into bins. GESIM starts by selecting $k$ initial wave conditions as individual clusters using the MDA. Subsequently, in every iteration, each cluster incorporates the observation closest to it until a total sediment transport threshold is reached. The threshold is defined as the total sediment transport divided by the number of representative cases ($k$). When no cluster can accept further wave conditions, the remaining wave conditions join the cluster to which they have the smallest distance. In the end, this results in $k$ clusters that represent approximately the same amount of sediment transport. The centroid of each cluster is defined as the average of the wave conditions in the cluster.
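The growth of the clusters can be sketched as follows. The sketch assumes that distances are measured to the initial seed of each cluster (rather than to an updated centroid), that absolute transport rates are accumulated, and that the seeds are supplied by a separate initialization such as the MDA. The function name and the data are hypothetical.

```python
import math

def gesim(points, transport, seeds):
    """Grow clusters of roughly equal transport around the given seed indices."""
    k = len(seeds)
    threshold = sum(abs(s) for s in transport) / k
    clusters = [[seed] for seed in seeds]
    load = [abs(transport[seed]) for seed in seeds]
    free = set(range(len(points))) - set(seeds)
    growing = True
    while free and growing:
        growing = False
        for j, seed in enumerate(seeds):
            if not free or load[j] >= threshold:
                continue  # this cluster has reached its transport share
            nearest = min(sorted(free),
                          key=lambda i: math.dist(points[i], points[seed]))
            clusters[j].append(nearest)
            load[j] += abs(transport[nearest])
            free.discard(nearest)
            growing = True
    # Leftover conditions join the cluster with the closest seed.
    for i in sorted(free):
        j = min(range(k), key=lambda c: math.dist(points[i], points[seeds[c]]))
        clusters[j].append(i)
    return clusters

# Hypothetical one-dimensional example with transport rates per condition.
points = [(0.0,), (0.1,), (0.2,), (0.3,), (1.0,), (0.9,)]
transport = [1, 1, 1, 1, 4, 0]
clusters = gesim(points, transport, seeds=[0, 4])
```

Here the energetic seed already carries its full share of the transport, so its cluster stays small, while four mild conditions are needed to reach the same share.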
3.2.3. Crisp K-Means Method
The Crisp K-Means method (CKM), also known as K-means, is one of the most widely used clustering methods [18,19]. It starts with $k$ initial centroids that are drawn randomly, with probabilities weighted by the distance between the wave conditions, through the K-means++ algorithm [14]. Then, every wave condition is assigned to its closest cluster. The CKM has a hard membership function, which means that each wave condition can be a member of only one cluster. Next, the centroids are updated by averaging the wave conditions that constitute each cluster. This procedure is repeated until the difference between the current and previous centroids is smaller than a user-defined accuracy criterion (see Figure 4a). More details can be found in [20].
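The procedure can be sketched as follows, assuming normalized input points, squared-distance weights in the K-means++ seeding, and the largest centroid shift as accuracy criterion. The function names, the optional `initial` argument, and the data are hypothetical.

```python
import math
import random

def kmeans_pp_init(points, k, rng):
    """K-means++ seeding: draw centroids with squared-distance weights."""
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        weights = [min(math.dist(p, c) ** 2 for c in centroids) for p in points]
        centroids.append(rng.choices(points, weights=weights)[0])
    return centroids

def crisp_kmeans(points, k, initial=None, tol=1e-6, max_iter=100, seed=0):
    """Hard-membership k-means; returns the centroids and their members."""
    rng = random.Random(seed)
    centroids = list(initial) if initial else kmeans_pp_init(points, k, rng)
    clusters = []
    for _ in range(max_iter):
        # Assignment step: every point joins its closest centroid only.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[j].append(p)
        # Update step: each centroid becomes the mean of its members.
        updated = [tuple(sum(x) / len(members) for x in zip(*members))
                   if members else centroids[j]
                   for j, members in enumerate(clusters)]
        shift = max(math.dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if shift < tol:
            break
    return centroids, clusters

# Hypothetical data: two well-separated groups of conditions.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
centroids, clusters = crisp_kmeans(points, k=2, initial=[(0, 0), (10, 10)])
```

Passing `initial` explicitly makes it possible to compare the initializations listed above (fixed bins, MDA, K-means++) with the same update loop.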
3.2.4. Fuzzy K-Means Method
The Fuzzy K-Means method (FKM) (see [21]) is similar to the CKM, but with a soft membership function. Therefore, wave conditions can be assigned to more than one cluster. This means that all wave conditions have some influence on the definition of the centroids, as determined by the fuzzy membership function. Initially, the centroids are defined as in the CKM. Then, the fuzzy membership of each wave condition is calculated for every cluster as a function of the Euclidean distance between the wave conditions ($x_i$) and the centroids ($c_j$), with $i$ being the wave observation index of the full dataset, $j$ the cluster index, and $k$ the number of clusters (i.e., the number of representative wave conditions). The fuzzy parameter
, where
, is case specific and requires calibration.
Based on sensitivity analyses, we used
. The new centroids are defined as the weighted average of the wave conditions, using the fuzzy membership as weight. In this way, wave conditions closer to the previous centroid have a higher influence on the definition of the next centroid. This process is repeated iteratively until the algorithm converges towards a stable solution (see Figure 4b).
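A minimal sketch of the soft-membership iteration is given here. It assumes the standard fuzzy c-means membership, in which the membership of a condition in a cluster is proportional to the distance raised to the power -2/(m - 1), and it uses the membership directly as the weight of the centroid update, as described above. The parameter name `m`, its value, and the data are hypothetical.

```python
import math

def memberships(points, centroids, m):
    """Soft memberships of every point in every cluster; each row sums to 1."""
    exponent = -2.0 / (m - 1.0)
    rows = []
    for p in points:
        # A small floor avoids a division by zero when a point sits on a centroid.
        inverse = [max(math.dist(p, c), 1e-12) ** exponent for c in centroids]
        total = sum(inverse)
        rows.append([value / total for value in inverse])
    return rows

def fuzzy_kmeans(points, centroids, m=2.0, tol=1e-6, max_iter=200):
    """Iterate membership and weighted-average updates until the centroids settle."""
    dims = len(points[0])
    for _ in range(max_iter):
        u = memberships(points, centroids, m)
        updated = []
        for j in range(len(centroids)):
            weights = [row[j] for row in u]  # membership used as weight
            total = sum(weights)
            updated.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dims)))
        shift = max(math.dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if shift < tol:
            break
    return centroids, memberships(points, centroids, m)

# Hypothetical data: two well-separated groups of conditions.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
centroids, u = fuzzy_kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
```

Because every condition keeps a small membership in the distant cluster, each centroid is pulled slightly towards the other group compared with the crisp solution.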
3.2.5. K-Harmonic Means
The K-Harmonic Means method (KHM) (see [22,23]) follows the same procedure as the FKM, but the weight used for the definition of the centroids is given by a dynamic weighting function.
In this case
. Higher dimensions of the dataset require a larger value for
[
23]. The parameter
o is case specific and calibration is required to define it. Based on sensitivity analysis we used
. The dynamic weight gives outliers a larger influence on the selection of the centroids than wave conditions that are closer to the centroids (Figure 4c).
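The role of the dynamic weight can be illustrated with a sketch of the K-harmonic means update in its commonly published form: the membership of a condition in a cluster is proportional to the distance raised to the power -(o + 2), and the weight of a condition is the sum of these terms divided by the squared sum of the distances raised to the power -o. Whether this matches the exact formulation of the study is an assumption; the value of `o` and the data are hypothetical.

```python
import math

def khm_update(points, centroids, o):
    """One K-harmonic means update using membership times dynamic weight."""
    dims = len(points[0])
    numer = [[0.0] * dims for _ in centroids]
    denom = [0.0] * len(centroids)
    for p in points:
        d = [max(math.dist(p, c), 1e-9) for c in centroids]
        soft = [dj ** (-o - 2.0) for dj in d]
        hard = sum(dj ** (-o) for dj in d)
        weight = sum(soft) / hard ** 2           # dynamic weight of this point
        for j, sj in enumerate(soft):
            share = (sj / sum(soft)) * weight    # membership times weight
            denom[j] += share
            for a in range(dims):
                numer[j][a] += share * p[a]
    return [tuple(v / total for v in row) for row, total in zip(numer, denom)]

def k_harmonic_means(points, centroids, o=3.5, tol=1e-6, max_iter=200):
    """Repeat the update until the largest centroid shift drops below tol."""
    for _ in range(max_iter):
        updated = khm_update(points, centroids, o)
        shift = max(math.dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if shift < tol:
            break
    return centroids

# Hypothetical data: two well-separated groups of conditions.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
centroids = k_harmonic_means(points, [(0.2, 0.3), (10.2, 10.3)])
```

Within a cluster, the combined factor grows with the distance to the centroid, so conditions far from the centroid pull harder on its next position than conditions that are already close.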
6. Discussion
A good selection of representative wave conditions for morphology should balance mild and energetic conditions as well as direction variability while prioritizing directions that contribute the most to the sediment transport. In our assessment, we found that binning methods perform better than clustering methods. Among the binning methods, the ones that split the wave conditions into bins with equal weight performed better than the ones that split the wave conditions arbitrarily into bins, as long as the reduced wave climate is not very detailed (e.g.,
, consistent with [
2]). On the other hand, [3] found that the EFM performed better than the CERC (Coastal Engineering Research Center) method proposed by them, which is analogous to the STBM used in this study but with the longshore sediment transport calculated by the CERC formula [24]. The difference between the findings of [3] and the present study could be related to the incident wave angle that is not considered in the CERC formula. As a result, positive and negative transport contributions can cancel each other out. Yet, the EFM is the second-best method in this study. Note that in the STBM, sediment transport rates obtained from the brute force simulations were used as input, and such rates are commonly not available. In that case, we recommend the use of sediment transport formulas that consider different coast angles, or of other proxies such as the energy flux. The clustering methods did not perform well because of their high dependency on the most recurrent observations. For Noordwijk, these were the mild wave conditions, resulting in a lack of energetic conditions. However, for very energetic coasts where mild conditions do not dominate over energetic conditions, the clustering methods may perform better.
The sequencing of wave conditions influences the morphological response of the simulations considerably. Random sequencing showed the best morphological response for the case studies in this paper, since randomly ordered reduced wave time series retained a higher variability than the methods that use statistical information through Markov Chain probabilities. This is in agreement with the results of [2], who found that randomly ordered synthetic time series performed better than systematic sequencing of wave conditions, such as ascending or descending wave heights combined with wave angles towards positive and negative directions. Despite its good performance, random sequencing has the drawback that it is completely random, without any user control. Furthermore, random sequencing has its limitations since it depends strongly on the initial condition (i.e., the initial profile). For instance, a winter profile evolves differently from a summer profile under the same sequence of wave conditions. Since in Noordwijk and Anmok the bar dynamics do not seem to be related to specific storm events, the chronology is of limited relevance, and random sequencing can be applied. However, [2] found that in Hasaki, where chronology is important, random ordering of synthetic time series did not represent the inter-annual bar evolution very well. For such cases, other sequencing methods may perform better.
Regarding the number of representative wave conditions in the reduced wave climate, we found that
is a good quantity of representative wave conditions given that the duration of the wave climate is of the order of 100 days. This aligns with [
3] who indicated
as an optimal quantity. Additionally,
is in agreement with the wave climates commonly applied in morphodynamic simulations, which typically use about 10 wave conditions [5,25]. The wave climate duration turned out to be a very important aspect of wave climate reduction for morphological applications. This agrees with [
2]. The same wave condition applied on different timescales will likely give rise to distinct morphology. Generally, decreasing the reduced wave climate duration improves the performance. The present analysis has as the lower limit. A further decrease on
the wave climate duration would not necessarily improve the skill of the reduced models any further. There is a lower limit to the wave climate duration, associated with the response of the morphology to the hydrodynamic forcing. If the duration of the wave climate is too short, there is not enough time for the morphology to adjust to the hydrodynamic conditions. This is not observed in the results of this study because the simulated durations of the wave climate were well above this lower limit, which is around 10 to 20 days according to [2]. Additionally, a further decrease in the wave climate duration implies a loss of applicability, since the number of transitions and, thus, the computational time increase when the duration of the reduced wave climate decreases.
In this study, we used the Unibest-TC profile model because its short computational time allowed us to run the considerable number of simulations required by our methodology. In reality, brute force computations are feasible with this model, so input reduction techniques are, strictly speaking, not necessary. Moreover, in Noordwijk, the alongshore variability is small, so a 1D domain is acceptable. At Anmok beach, alongshore variability is present and affects the local morphodynamics. However, the changes in the alongshore positions of the crescentic bars are very slow compared to the cross-shore evolution, which allows for a 1D approach. For larger timescales, this is not expected to be valid. The findings of this study can still be used as initial guidelines when performing input reduction with different models and domains, even for areas where alongshore variability is important.
7. Conclusions
In this paper, the performance of 36 variants of wave input reduction (IR) methods in modeling the inter-annual sandbar evolution was evaluated. Selecting the proper settings for wave IR is a balance between resembling the natural variability of the full dataset and limiting the computational effort. This study provided insights into the methods and settings that are most promising for reducing computational effort at a limited loss of performance. The results showed that the Sediment Transport Bins method has the best performance of all 36 variants. Generally, binning methods perform better than clustering methods. Binning methods using weighted bins based on sediment transport proxies (i.e., longshore sediment transport or energy flux) perform better than those based on wave statistics only. The performance improves with an increased number of representative wave conditions, at the expense of a smaller reduction in computational time. Furthermore, a higher resolution in wave direction bins performs better than a higher resolution in bins for the wave height or sediment transport proxy. In terms of sequencing, random sequencing yielded the highest performance for the case studies. However, this is probably related to the limited importance of wave chronology in the case studies analyzed in this paper and, hence, may be different in other case studies. Finally, the performance is sensitive to the duration of the wave climate and, hence, its number of repetitions. The performance is better for a reduced wave climate with fewer representative wave conditions and a higher number of repetitions on a short timescale than for a more detailed wave climate applied on a longer timescale. The insights of this study may help coastal practitioners to perform input reduction more efficiently and effectively in practice.