1. Introduction
Coastal zones are widely used for human settlements, agriculture, trade, industry, and amenities, and their usage is increasing [
1]. Accurate shoreline monitoring is particularly critical for coastal development as demand grows for these uses and for marine transportation involving ships, fishing boats, and recreational marinas [
2]. These growing demands may necessitate coastal city expansion; however, if development interests do not adhere to conservation principles, they can pose significant threats to coastal resources. Such threats can lead to a wide range of problems for resource users and decision makers [
3]. Coastal issues include beach and coastal erosion, the construction of coastal structures that change current and wave patterns, and sand mining on beaches [
4]. Additionally, coastal areas where at least 100 million people live within 1 m of the mean sea level will face increased risk in the coming decades due to flooding caused by rising sea levels as a result of global warming. Another threat affecting some of the most developed and economically valuable real estate is the increased frequency and intensity of coastal disasters, exacerbated by beach loss due to worsening sand beach erosion. As the baseline rate of long-term sand beach erosion is two orders of magnitude higher than the rate of sea level rise, it has long been assumed that significant sea level rise would have severe consequences for coastal populations [
5,
6,
7,
8,
9].
To understand the changes in coastal morphology that significantly impact the sustainability of coastal communities, structures, and ecosystems and to predict coastal disasters such as coastal erosion, it is essential to analyze beach responses or beach evolution processes across multi-scale spatiotemporal dimensions. This involves examining long-term pressures, as well as short-term pulses, based on sufficient observational data and further modeling for prediction. Observations of coastal morphology for this purpose are conducted by using various approaches, such as field surveys, video monitoring, and satellite-based remote sensing, which are employed on many coasts around the world [
10,
11,
12,
13]. However, while satellites are useful for observing long-term changes on a global scale, their use is limited due to issues such as spatiotemporal resolution and accuracy, as well as challenges in elevation and water depth conversion [
6,
7]. Additionally, technologies such as shoreline extraction using land-based or drone-mounted video systems and lidar scanning for elevation mapping are actively applied to investigate coastal morphology. However, these methods are limited in accuracy and often rely on indirect observation. The most accurate method for examining coastal morphology, which serves as ground truth for optimizing remote sensing or numerical models, remains the in situ field survey using RTK-GPS carried by surveyors and water vehicles; this method is reasonably practical for frequent surveys. The more quality-controlled, accurate observations are available and the higher their spatiotemporal resolution, the better the results. However, observing beach morphology, such as beach profiles extending from land to water, with high spatiotemporal resolution over long periods for many beaches is challenging. Over the past several decades, only a few beaches worldwide have been regularly monitored with accurate yet expensive in situ surveys [
14]. Typically, field campaigns for monitoring beach width and profile along coasts that are tracked over long periods and for which data are publicly available are conducted at least four times per year and twice per month, respectively [
15,
16]. The long-term coastal morphology data from these few well-monitored beaches have provided an in-depth understanding of many site-specific coastal processes.
In this study, we aimed to introduce a long short-term memory (LSTM)-based encoder–decoder network using deep learning techniques to model the relationship between coastal hydrodynamics, a key driver of beach profile changes, and the beach profile response. The goal was to estimate or generate beach profile data with an increased temporal resolution of at least one set per week, which is essential to understanding and predicting morphological beach changes. This approach leverages previous field survey data combined with coastal hydrodynamic information. We assessed its performance by applying it to four major sand beaches on the east coast of Korea which are experiencing serious coastal erosion problems [
17,
18,
19,
20,
21].
The remainder of this paper is organized as follows:
Section 2 provides an overview of related studies that have focused on generating beach profile data to understand beach morphological evolution.
Section 3 describes the beaches in the study area, detailing their coastal hydrodynamic characteristics and field-surveyed beach profile data.
Section 4 presents the proposed LSTM-based encoder–decoder network, which uses deep learning techniques to model the relationship between coastal hydrodynamics and beach profile response.
Section 5 explains the experimental setup, implementation details, and evaluation metrics. In
Section 6, the performance of the proposed methodology is evaluated by assessing the accuracy of the estimated beach profile data. Finally, in
Section 7, we summarize the paper, discuss the strengths and limitations of the proposed method, and outline directions for future research.
2. Related Work
The proposed method involves developing a data-driven model that increases the temporal resolution of existing field-surveyed beach profile data through data imputation [
22]. From the standpoint of generating non-existent data, relevant studies on data imputation for beach profiles, along with prediction or forecasting, were reviewed, and three main approaches were examined: empirical-based, numerical model-based, and data-driven machine learning.
The earliest approaches to quantitatively estimate beach profile changes were based on empirical equations, using laboratory or field data on sandbar or berm formation under specific wave and sediment conditions [
23,
24,
25,
26]. However, such empirical-based approaches require extensive laboratory and field survey data. Laboratory data in particular have limitations in representing the real-world coastal environment.
Kriebel and Dean [
27] proposed a model to simulate beach profile evolution, focusing on dune erosion during storms, using the equilibrium beach topography theory. Larson et al. [
28] developed an empirically based model, SBEACH, which explains the formation of sandbars and berms through the equilibrium beach topography theory. Later models continued to focus on dune erosion during storms. For example, Steetzel [
29] developed DUROSTA, a model that simulates cross-shore transport and profile response during severe storms, specifically addressing dune erosion. Since then, numerous profile evolution models have been introduced, including UNIBEST [
30], COSMOS [
31,
32], and BEACH [
33]. Johnson et al. [
34] developed CSHORE, a cross-shore profile evolution model primarily used to predict beach erosion caused by the effects of waves and currents. However, these profile models are not well suited for modeling foreshore and dunes on seasonal timescales. Although they have performed well in simulating dune erosion in field applications, further improvements are required for sediment transport in intermittently wet and dry regions [
34,
35,
36]. A widely used model alongside SBEACH is XBeach, a two-dimensional (2D) process-based model proposed by Roelvink et al. [
37], which simulates dune erosion, ice breaking, avalanches, overwash, and sediment transport in the swash and surf zones. XBeach predicts the evolution of both subaerial and subaqueous profiles during short-term storm events. However, further performance improvements for subaqueous profiles and optimization tailored to specific beach areas with coastal hydrodynamics are needed [
19,
21].
Machine learning-based, data-driven approaches to modeling beach profile changes have been introduced in Hsu et al. [
38]’s 2D empirical eigenfunction model (EEM) and de Melo et al. [
39]’s random forest (RF) ensemble method. The 2D EEM allows for the analysis and prediction of beach profile changes caused by alongshore and cross-shore sediment transport, and it was applied to the Red Hill coast of Taiwan. By using beach profile data measured every two months at 150 m intervals along the detached breakwaters, a model was formulated, using a Markov process and linear regression, that considered the effects of wave breaking, bottom sediment, and wave radiation stress. This model was effective in analyzing and predicting beach profile changes near coastal structures. The RF approach demonstrated reliable and robust morphological forecasting performance in predicting beach profile responses. This was achieved by using beach profile data, wave conditions, and sediment size data to simulate the seasonal morphological evolution of several Portuguese beaches. A total of 11–12 beach profile measurements were taken at 12 sites across seven beaches, approximately every 3 months from 2018 to 2021. For wave data, the Atlantic-Iberian Biscay Irish ocean wave reanalysis product, managed by the EU Copernicus Marine Service Information, was used. This dataset provided wave data with a temporal resolution of one day and a spatial resolution of 0.027° × 0.027° for the seven beach locations. Beach sediment sizes were determined through sieving and digital imaging, with the characteristic diameters
and
measured at the berm’s intermediate position, berm edge, and beach face.
In this study, considering the substantial accumulation of data compared with the past and the continuous public release of beach profile data [
13,
15,
16,
40], we aimed to propose a self-attention-based spatiotemporal network. This method, inspired by rapid advancements in large-scale language models [
41], offers a machine learning approach to estimate beach profile data with higher temporal resolution. By establishing a nonlinear relationship between beach profile and coastal hydrodynamics, this model aims to improve the accuracy of beach profile estimations.
3. Study Area and Data
The study area comprised seven beaches across three littoral systems (denoted by GW13, GW17, and GW31) on the east coast of Gangwon Province, South Korea, as shown in
Figure 1. The beaches within the GW13 system included Kensington and Yongchonri, consisting of three transects, GW13−02, GW13−04, and GW13−07, as described in
Figure 1b. The beaches belonging to GW17 included Naksan and Songjeon, featuring four transects, GW17−03, GW17−05, GW17−08, and GW17−11, as described in
Figure 1c. Finally, the GW31 littoral system consisted of Dojik, Gigok, Mangsang, and Daejin with a total of five transects, GW31−04, GW31−08, GW31−11, GW31−14, and GW31−17, as described in
Figure 1d. The geomorphological characteristics of these areas include a steep gradient from the Taebaek Mountains (elevation level (E.L)(+) 800 m) to the east coast, with an average distance of about 6 km. Most sediment is transported from rivers to the sea during floods. The region has a small tidal range, with a spring range of 19.2 cm (GW13 and GW17) and 18.4 cm (GW31) and a mean sea level of 19.5 cm (GW13 and GW17) and 18.4 cm (GW31). The dominant wave direction is northeast, with an average significant wave height ($H_s$) of 1 m and a mean absolute wave period ($T_{m-1,0}$) ranging from 7 to 12 s. Instead of
or
, which are mainly used as scalar estimators of waves in time-domain analysis, we use
$T_{m-1,0}$ (mean wave period using spectral moments of order −1 and 0), which is a major wave period item of spectral estimators using the Fourier transform [
23,
42]. Additionally, the urban structure along the east coast is narrow and elongated in a north–south direction, making it directly exposed to high waves. Recent climate changes have increased both the frequency and intensity of high waves, with the region experiencing four distinct seasons with prevailing northeastern waves in winter and southeastern waves in summer.
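As a minimal sketch of how the spectral wave parameters mentioned above are commonly obtained, the following Python snippet computes the spectral moments of a one-dimensional variance density spectrum and derives the significant wave height and the mean period from the moments of order −1 and 0. The function names, the trapezoidal integration, and the Gaussian-shaped test spectrum are illustrative assumptions, not the processing chain used for the observations in this study.

```python
import math

def spectral_moment(freqs, density, order):
    """Trapezoidal estimate of m_n = integral of f**n * S(f) df."""
    total = 0.0
    for k in range(len(freqs) - 1):
        left = freqs[k] ** order * density[k]
        right = freqs[k + 1] ** order * density[k + 1]
        total += 0.5 * (left + right) * (freqs[k + 1] - freqs[k])
    return total

def wave_parameters(freqs, density):
    """Return (Hs, Tm-1,0) using Hs = 4*sqrt(m0) and Tm-1,0 = m_-1 / m0."""
    m0 = spectral_moment(freqs, density, 0)
    m_minus1 = spectral_moment(freqs, density, -1)
    return 4.0 * math.sqrt(m0), m_minus1 / m0

# Hypothetical narrow-banded spectrum centred on 0.1 Hz (a 10 s swell).
freqs = [0.05 + 0.005 * k for k in range(61)]
density = [math.exp(-((f - 0.1) / 0.01) ** 2) for f in freqs]
hs, tm10 = wave_parameters(freqs, density)
```

Because the moment of order −1 weights low frequencies more heavily, $T_{m-1,0}$ emphasizes the longer-period part of the spectrum compared with periods based on higher-order moments.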
Table 1 presents the main characteristics of each littoral cell system, such as geolocation, beach length, beach width, grain size, and beach slope. Beach slope is a measure of the gradient between E.L(+) 0.5 m and E.L(−) 0.5 m. The coastal hydrodynamics, covering wave and tide information, are shown in
Figure 2.
Kensington and Yongchonri Beaches, located within the GW13 littoral system on the south side of Goseong-gun, Gangwon Province, South Korea, extend for approximately 3 km in a straight line. The Yongchon Stream serves as the primary source of sand. Most of the hinterland of Yongchonri Beach is predominantly covered by vegetation, while Kensington Beach has been developed into a hotel area. The offshore annual mean
$H_s$ is 0.87 m, with a $T_{m-1,0}$ of 4.96 s. The beach experiences a mixed diurnal and semi-diurnal tidal pattern, with a spring range of 19.2 cm and mean sea level of 19.5 cm. The locations of wave and tidal observations for the GW13, GW17, and GW31 littoral systems are shown in
Figure 1a, marked by yellow squares and white triangles, respectively. For GW13, the wave and tidal observation points are located at 38.333° N, 128.583° E and 38.20722° N, 128.5942° E, respectively. The coastline azimuth angle is 54.7°, and the beach face slope ranges from 8° to 10.5° at different reference points, resulting in a steep cross-section during high-wave conditions. The central particle diameter (
$D_{50}$) of sand in the swash zone varies between 1.156 and 1.606 mm.
Naksan and Songjeon Beaches, part of the GW17 littoral system, are located on the north side of Yangyang-gun, Gangwon Province, South Korea. Naksan Beach is located to the north of the Yangyang Namdaecheon River, while Songjeon Beach is located to the south. The beach extends for approximately 5.65 km in a straight line. Naksan Beach’s hinterland is largely urbanized, while Songjeon Beach’s hinterland comprises coastal dunes and terraces. The offshore annual mean $H_s$ is 0.80 m, with a $T_{m-1,0}$ of 5.09 s. This beach experiences mixed diurnal and semi-diurnal tides, with a spring range of 19.2 cm and mean sea level of 19.5 cm. The wave and tidal observation points for GW17 are located at 38.167° N, 128.667° E and 38.20722° N, 128.5942° E, respectively. The coastline azimuth angle is 50.0°, and the beach face slope varies between 9° and 13°. The $D_{50}$ of sand in the swash zone ranges from 0.487 to 0.607 mm.
Dojik, Gigok, Mangsang, and Daejin Beaches, part of the GW31 littoral system, are located on the northern side of Donghae City, Gangwon Province, South Korea. This beach extends for approximately 5.63 km in a straight line, with the Masangcheon Stream serving as the main source of sand. Crescentic sandbars are well developed in front of the beach. The hinterland encompasses railways and highways, and recent coastal development plans have been announced. The area is prone to flooding during high-wave events. The offshore annual mean $H_s$ is 0.94 m, with a $T_{m-1,0}$ of 5.17 s. The beach experiences mixed diurnal and semi-diurnal tides, with a spring range of 18.4 cm and mean sea level of 18.4 cm. The wave and tidal observation points for GW31 are located at 37.667° N, 129.167° E and 37.55028° N, 129.1164° E, respectively. The coastline azimuth angle is 39.6°, with the slope of Mangsang Beach being approximately 6°, while Dojik Beach and Gigok Beach feature steeper cross-sections exceeding 12°. The $D_{50}$ of sand in the swash zone ranges from 0.403 to 0.541 mm.
Figure 3 illustrates the distribution characteristics of $H_s$, $T_{m-1,0}$, wave direction, and tidal level, representing the coastal hydrodynamics for the GW13, GW17, and GW31 littoral systems.
Beach observation data included beach width and beach profile measurements from the reference point to the coastline (E.L(+) 0.0 m) for each transect from 2010 to 2024. Beach width was surveyed three to four times per year on average, while the beach profile was surveyed one to two times per year due to the complexity and time required for the survey.
Figure 4 presents a time series of the beach width measured for the transects belonging to GW13, GW17, and GW31. The vertical lines show the survey periods for each littoral system.
Figure 5,
Figure 6 and
Figure 7 depict the beach profiles for each cross-section of the three littoral systems.
Figure 8 and
Table 2 display the survey dates for the beach width and beach profile measurements for the littoral systems. Beach width was surveyed on all the dates listed, while dates marked in red indicate that both beach width and beach profile measurements were taken simultaneously.
To derive beach data for GW13, GW17, and GW31, field surveys were conducted from April 2010 to April 2024. During this period, beach width was surveyed a total of 55 times, and beach profile measurements were taken 29 times. The general wave characteristics at these study sites indicate that northeast-oriented waves cause erosion in winter, while southeast-oriented waves contribute to sedimentation in summer, maintaining a seasonal equilibrium system. Therefore, shoreline surveys were conducted four times a year to capture seasonal variations, and beach profile surveys were performed in spring and autumn, when significant seasonal changes begin. Shoreline surveys were carried out by using a global navigation satellite system (RTK-GNSS, Leica Viva16) mounted on a quad motorcycle, providing a horizontal accuracy of 8 mm + 0.5 ppm and a vertical accuracy of 15 mm + 0.5 ppm. Beach profile surveys, similar to the shoreline surveys, were conducted by using a combination of quad bike and on-foot methods, incorporating on-site tidal information, down to a depth of E.L(−) 1.0 m.
As presented in
Table 3, the survey results of beach width and profiles from 2010 to 2024 indicate that the average beach width at point GW13−02 was approximately 50.1 m, while the average beach width at point GW13−04 was 68.8 m, which is wider than at the northern beaches. This wider beach is attributed to the sedimentation induced by rocks exposed in the sea in front of the coast near the central part of transect GW13−04. Beach width and beach profile represent the distance and the cross-sectional area, respectively, measured from the reference location (red circles in
Figure 1b–d) to E.L(+) 0.0 m. The analysis also indicated that a seasonal equilibrium system is maintained, as the changes in the coastline were not significant by season. Nevertheless, at point GW13−02, in October 2019, the maximum retreat rates of beach width and beach profile were 46% and 59%, respectively. This was attributed to the impact of Typhoon Hagibis (Max. $H_s$ = 4.24 m), which struck the Korean Peninsula from 6 to 13 October 2019. The changes in cross-sectional area at GW13−04 demonstrated a decreasing trend over time at points above E.L(+) 1.50 m but showed repeated increases and decreases at points below E.L(+) 1.50 m (
Figure 5b). Specifically, the significant changes in beach width and profile over the entire survey period are attributed to the sensitivity of the beach profile structure to high waves.
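The two quantities defined above, beach width as a distance and beach profile as a cross-sectional area between the reference point and E.L(+) 0.0 m, can be illustrated with a short Python sketch. It assumes a profile given as (cross-shore distance, elevation) pairs ordered seaward from the reference point, linear interpolation at the datum crossing, and trapezoidal integration; the function names and the sample profile are hypothetical and do not reproduce the survey processing used here.

```python
def beach_width(profile, datum=0.0):
    """Distance from the reference point to the first datum crossing."""
    for (x0, z0), (x1, z1) in zip(profile, profile[1:]):
        if z0 >= datum > z1:
            # Linear interpolation between the two bracketing points.
            return x0 + (z0 - datum) * (x1 - x0) / (z0 - z1)
    raise ValueError("profile never drops below the datum")

def cross_sectional_area(profile, datum=0.0):
    """Area between the profile and the datum, landward of the crossing."""
    width = beach_width(profile, datum)
    points = [(x, z) for x, z in profile if x < width] + [(width, datum)]
    area = 0.0
    for (x0, z0), (x1, z1) in zip(points, points[1:]):
        area += 0.5 * ((z0 - datum) + (z1 - datum)) * (x1 - x0)
    return area

# Hypothetical transect: distance from the reference point (m), elevation (m).
sample_profile = [(0.0, 3.0), (20.0, 2.0), (40.0, 1.0), (60.0, -1.0)]
width = beach_width(sample_profile)          # crossing lies between 40 m and 60 m
area = cross_sectional_area(sample_profile)  # in square metres
```

A retreat rate between two surveys could then be expressed as the relative decrease of either quantity, which is how percentage changes such as those quoted above are usually read.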
As presented in
Table 3, the beaches in GW17 receive a large amount of sand from the Yangyang Namdaecheon River, forming a crescent-shaped sandbar in the sea in front of them. GW17−03 and GW17−05, situated north of Namdaecheon, maintain average beach widths of 47.1 m and 74.2 m, respectively, and average beach cross-sectional areas of approximately 196.1 m² and 189.4 m², respectively, with no significant seasonal changes. At the beginning of the survey, the cross-sectional area above E.L(+) 3.0 m did not show much change; however, the area below E.L(+) 3.0 m exhibited significant changes over time, suggesting sensitivity to high waves. Conversely, GW17−08 and GW17−11, located south of Namdaecheon, maintain an average beach width of over 55 m. However, these areas are relatively narrow and more variable compared with Naksan Beach, which is attributed to a larger amount of sand being supplied to the north during floods at the mouth of Namdaecheon. The cross-sectional area at point GW17−11 experienced significant fluctuations at the start of the survey but showed a relatively stable cross-sectional pattern over time.
As presented in
Table 3, GW31−04 and GW31−08 maintain a beach width of more than 40 m irrespective of seasonal changes; however, beach fluctuations are significant during high-wave events. The average cross-sectional areas of GW31−04 and GW31−08 were similar, at 101.9 m² and 119.0 m², respectively, with maximum and minimum changes in cross-sectional area displaying approximately 30% variation, and both beaches displayed similar cross-sectional area change characteristics over time compared with the start of the survey. Notably, cross-sections with elevations of E.L(+) 3.0 m or higher exhibited minimal changes, while significant changes were observed in cross-sections with elevations of E.L(+) 3.0 m or lower. Overall, there was a trend of decreasing cross-sectional area over time compared to the initial survey. GW31−11 maintained a relatively wide beach width compared with the surrounding beaches, whereas GW31−14, located to the south, exhibited significant seasonal variation with a decrease in beach width. GW31−17 was found to maintain a relatively stable beach width of over 90 m. The average cross-sectional area of GW31−11 was 266.4 m², with rapid changes seen in cross-sections at elevations of E.L(+) 2.2 m or lower. GW31−14 had an average cross-sectional area of 78.3 m² but showed unstable profile characteristics because of large fluctuations during the survey period and a steady decrease compared with the initial survey. The average cross-sectional area of GW31−17 was 174.5 m², displaying a significant decrease in cross-sectional area in summer (see
Table 3 and
Figure 7e).
4. Long Short-Term Memory (LSTM)-Based Encoder–Decoder Network
The proposed model framework comprises two variable encoders and two temporal encoders, as depicted in
Figure 9. The first variable encoder, illustrated in
Figure 9a, comprises a multilayer perceptron (MLP) [
43] composed of
operation [
44] layers to encode the features of wave and tide variables for coastal hydrodynamics. The wave data include three variables: $H_s$, $T_{m-1,0}$, and wave direction. Therefore, four variables, including tide, were configured as inputs ($w$) to the first variable encoder. The second variable encoder in
Figure 9b uses an MLP to encode the beach width and beach profile variables ($d$) available at the nearest preceding point in time, the beach profile being the same variable that the model outputs at the estimation point, together with the features of the beach width at the estimation point.
The feature vector encoded by each variable encoder is fed into a temporal encoder, composed of LSTM [
45], to learn sequential features for temporal information. LSTM is a type of recurrent neural network architecture commonly used for sequence modeling tasks. It has a more complex structure than a plain recurrent network, with additional memory cells and gates (input gate, forget gate, and output gate) that can selectively remember or forget information from previous time steps, which makes it suitable for modeling long-term dependencies in sequential data.
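The gating mechanism described above can be made concrete with a minimal, single-unit LSTM step written in plain Python. This is only a sketch of the standard LSTM update under simplifying assumptions (scalar input, scalar hidden state, hand-set parameters); the `params` layout and function names are hypothetical and do not correspond to the trained temporal encoders of the proposed network.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One step of a single-unit LSTM; params maps gate -> (w_x, w_h, b)."""
    def pre(gate):
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre("input"))         # how much new information to write
    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # candidate value for the cell state
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c

def encode_sequence(xs, params):
    """Run the cell over a sequence and return the final hidden state."""
    h, c = 0.0, 0.0
    for x in xs:
        h, c = lstm_step(x, h, c, params)
    return h

# Hand-set parameters: gates held open (bias +10) or closed (bias -10).
keep = {"input": (0, 0, 10), "forget": (0, 0, 10),
        "output": (0, 0, 10), "candidate": (1, 0, 0)}
drop = dict(keep, forget=(0, 0, -10))
remembered = encode_sequence([1.0, 0.0, 0.0], keep)
forgotten = encode_sequence([1.0, 0.0, 0.0], drop)
```

With the forget gate held open, the first input is still visible in the final state two steps later; with it closed, the stored value decays to almost zero, which is the behaviour that lets an LSTM retain or discard earlier forcing events.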
The two feature vectors, now encoded with temporal information through the temporal encoders
, are passed through a simple MLP consisting of
after performing the concatenate operation to generate weight vectors in the
, as shown in
Figure 9c. The weight vector serves as a parameter for the three linear layers of the decoder. The decoder is an MLP consisting of
layers, generated by the implicit neural representation (INR) method [
46]. It plays a role in converting the beach profile coordinates into the corresponding estimated values. The weight vector is divided into three weights, which are used to construct each INR; therefore, it is displayed in different colors. The weight vector
W, which contains the necessary information to construct a network that estimates the next beach profile based on the wave and tide, the previous beach profile, and beach width information through the variable and temporal encoders, is used for INR learning. This approach allows the model to estimate values only when the coordinates are given externally, rather than estimating both the coordinates ($i$) and values ($o$) of the next beach profile. The model was trained to determine the weights W to construct a network that outputs accurate values from the coordinates of the beach profile.
W represents the output of the model that varies depending on the input (wave, tide, beach width, and beach profile), rather than being a parameter learned during training. Once generated by the model, it is used to estimate values from the coordinates of the beach profile without any additional learning, thereby serving as a mapping. Thus, the model optimally learns the parameters of all MLPs and LSTM cells regarding the weights W and the input layers throughout the training process.
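The idea that a generated weight vector W defines a small coordinate-to-value network can be sketched as follows. The snippet splits a flat vector into three linear layers and evaluates the resulting MLP at a cross-shore coordinate. The layer sizes, the one-dimensional coordinate input, the ReLU activation, and the hand-written example weights are all assumptions made for illustration; they are not the configuration of the trained model.

```python
def split_weights(w, hidden):
    """Split a flat vector into three linear layers (1 -> hidden -> hidden -> 1)."""
    sizes = [(1, hidden), (hidden, hidden), (hidden, 1)]
    layers, pos = [], 0
    for n_in, n_out in sizes:
        weight = [w[pos + r * n_in: pos + (r + 1) * n_in] for r in range(n_out)]
        pos += n_in * n_out
        bias = w[pos: pos + n_out]
        pos += n_out
        layers.append((weight, bias))
    if pos != len(w):
        raise ValueError("weight vector length does not match the layer sizes")
    return layers

def decode(w, x, hidden=2):
    """Evaluate the network defined by w at cross-shore coordinate x."""
    layers = split_weights(w, hidden)
    act = [x]
    for idx, (weight, bias) in enumerate(layers):
        act = [sum(wi * a for wi, a in zip(row, act)) + b
               for row, b in zip(weight, bias)]
        if idx < len(layers) - 1:
            act = [max(0.0, a) for a in act]  # ReLU between layers (assumed)
    return act[0]

# Hypothetical generated weights encoding a planar beach: z = 3.0 - 0.1 * x.
w_generated = [1.0, 0.0, 0.0, 0.0,            # layer 1: weights, biases
               1.0, 0.0, 0.0, 1.0, 0.0, 0.0,  # layer 2: weights, biases
               -0.1, 0.0, 3.0]                # layer 3: weights, bias
elevation_at_10m = decode(w_generated, 10.0)
```

Because the decoder has no trainable parameters of its own in this arrangement, the same function can be queried at any set of coordinates once W has been produced for a given set of inputs.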
5. Experiments
Figure 10 illustrates the experimental setup for the input and output of the proposed LSTM-based encoder–decoder network for beach profile estimation. The model uses six input variables in total: four for wave and tide data, representing coastal hydrodynamics, and two for beach width and beach profile, representing beach information. Wave information comprises three variables: $H_s$, $T_{m-1,0}$, and wave direction. The output is a beach profile. To estimate the beach profile at time point M through the model, the beach width and beach profile that existed just before M are used as inputs; this corresponding time is referred to as N. The inputs for hydrodynamics ($H_s$, $T_{m-1,0}$, wave direction, and water level) are time-series data from N to M. Although wave and tide data are recorded hourly, the daily average time-series data for the N–M period are used as input data. The total number of transects used in the experiment was 12 (GW13−02, 04, 07; GW17−03, 05, 08, 11; GW31−04, 08, 11, 14, 17), and 55 beach width survey data points for each individual transect were used as input data. In the case of beach profiles, a total of 29 input data points for each transect were used, with 26 data points being estimated as output.
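The reduction of hourly forcing data to a daily series between the previous survey (N) and the estimation time (M) can be sketched as below. The function name, the inclusive date bounds, and the synthetic wave heights are assumptions for illustration only.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta

def daily_means(hourly, start, end):
    """Average hourly (timestamp, value) records per calendar day in [start, end]."""
    buckets = defaultdict(list)
    for stamp, value in hourly:
        if start <= stamp.date() <= end:
            buckets[stamp.date()].append(value)
    return [(day, sum(vals) / len(vals)) for day, vals in sorted(buckets.items())]

# Hypothetical hourly wave heights: 1.0 m on the first day, 2.0 m on the second.
t0 = datetime(2019, 10, 1)
hourly_hs = [(t0 + timedelta(hours=h), 1.0 if h < 24 else 2.0) for h in range(48)]
series = daily_means(hourly_hs, date(2019, 10, 1), date(2019, 10, 2))
```

Applying the same reduction to each hydrodynamic variable yields one row per day, so the length of the input sequence varies with the interval between N and M.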
The task performed by the model was to estimate beach profiles for all survey dates, as presented in
Table 2. The model used the beach width and beach profile from the survey conducted immediately before the time point to be estimated, along with the time series of wave and tide data during that period as input. As beach profile surveys were only conducted on the dates marked in red, if the immediately preceding survey did not include a beach profile, the most recent earlier survey with beach profile data was used instead. Of the total dataset shown in
Figure 8 and
Table 2, 65% of the data were used for model training, and 35% were reserved for testing. Specifically, data from 2010 to April 2019 were used for model training, allowing the model to learn the temporal information about the survey time sequence. Data from October 2019 to April 2024 were used to assess model performance. The model was trained to autoregressively estimate beach profiles at one-month intervals. The loss function is defined by Equation (
1):

$$\mathcal{L} = \mathbb{E}_{(w,\,d,\,i,\,o)}\left[ \left\lVert D_{W}(i) - o \right\rVert_2^2 \right], \qquad W = M(w, d) \tag{1}$$

Here, $w$ represents a time-series matrix consisting of $H_s$, $T_{m-1,0}$, wave direction, and tide level. The wave period, wave direction, and tide level are represented as additional features. $d$ is a vector consisting of the previous beach width and beach profile and its coordinate; $i$ is a coordinate vector for the beach section to be estimated; $o$ is a vector of values representing the estimated beach section; finally, $W$ denotes the weights generated by the model $M$.
$\mathbb{E}$ stands for expected value, and the expected value of a random variable $X$ is written as $\mathbb{E}[X]$. $D$ denotes a network parameterized by $W$ consisting of three linear layers and two activation layers.
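The input pairing rule and the chronological train/test split described above can be sketched with two small helpers. The function names and the example survey dates are hypothetical; only the rule itself (use the most recent earlier profile, split by date rather than at random) is taken from the text.

```python
from bisect import bisect_left
from datetime import date

def latest_profile_before(profile_dates, target):
    """Most recent profile survey strictly before target, or None if there is none."""
    idx = bisect_left(profile_dates, target)
    return profile_dates[idx - 1] if idx > 0 else None

def chronological_split(survey_dates, first_test_date):
    """Split surveys by date so that no test sample precedes a training sample."""
    train = [d for d in survey_dates if d < first_test_date]
    test = [d for d in survey_dates if d >= first_test_date]
    return train, test

# Hypothetical, sorted profile survey dates for one transect.
profile_dates = [date(2018, 4, 10), date(2018, 10, 15), date(2019, 4, 12)]
previous = latest_profile_before(profile_dates, date(2019, 1, 20))
train, test = chronological_split(
    profile_dates + [date(2019, 10, 20)], date(2019, 10, 1))
```

Splitting by date keeps the evaluation honest for a time-dependent process, since the model never sees surveys that follow the ones it is tested on.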
The model was trained to reduce the mean squared error loss by using the Adam optimizer, with a learning rate of 0.0001, $\beta_1$ of 0.9, $\beta_2$ of 0.999, and $L_2$ regularization of 0.0001. The model’s batch size was 16, and the training was run for 30,000 iterations.
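For readers unfamiliar with the optimizer settings listed above, a single Adam update for one scalar parameter can be written out as follows. The learning rate, the two decay rates, and the regularization strength are those stated in the text; the value of `eps`, the treatment of the regularization term as an L2 penalty added to the gradient, and the function name are assumptions of this sketch rather than details confirmed by the study.

```python
import math

def adam_step(param, grad, state, lr=1e-4, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=1e-4):
    """Apply one Adam update to a scalar parameter and return the new value."""
    grad = grad + weight_decay * param  # L2 penalty folded into the gradient (assumed)
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad         # first moment
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad  # second moment
    m_hat = state["m"] / (1 - beta1 ** state["t"])               # bias correction
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return param - lr * m_hat / (math.sqrt(v_hat) + eps)

state = {"t": 0, "m": 0.0, "v": 0.0}
updated = adam_step(1.0, 2.0, state)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly one learning rate regardless of the gradient magnitude.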
Model training and all experiments were conducted on an NVIDIA RTX 3090 (24 GB memory) with an Intel(R) Xeon(R) Gold 6130 CPU @ 2.10 GHz and 192 GB main memory. The proposed network and algorithms were implemented by employing Python 3.6.9 and PyTorch 1.8.1.
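Three metrics were used to assess the prediction performance: bias, root mean square error (RMSE), and Pearson correlation coefficient (CC). Each metric is defined as follows in Equations (2)–(4):

$$\mathrm{Bias} = \frac{1}{n}\sum_{k=1}^{n}\left(\hat{y}_k - y_k\right) \tag{2}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{k=1}^{n}\left(\hat{y}_k - y_k\right)^2} \tag{3}$$

$$\mathrm{CC} = \frac{\sum_{k=1}^{n}\left(y_k - \bar{y}\right)\left(\hat{y}_k - \bar{\hat{y}}\right)}{\sqrt{\sum_{k=1}^{n}\left(y_k - \bar{y}\right)^2}\,\sqrt{\sum_{k=1}^{n}\left(\hat{y}_k - \bar{\hat{y}}\right)^2}} \tag{4}$$

Here, $y_k$ represents the surveyed beach profile data regarded as ground truth, and $\hat{y}_k$ denotes the estimated beach profile produced by the trained LSTM-based encoder–decoder network; $n$ is the number of profile points, and an overbar denotes the mean over those points.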
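The three evaluation metrics can be computed with a few lines of Python. The sketch below follows the usual textbook definitions and takes bias as the mean of estimate minus observation, which is an assumed sign convention; the function names and the sample elevations are hypothetical.

```python
import math

def bias(obs, est):
    """Mean difference between estimated and observed elevations."""
    return sum(e - o for o, e in zip(obs, est)) / len(obs)

def rmse(obs, est):
    """Root mean square error between estimated and observed elevations."""
    return math.sqrt(sum((e - o) ** 2 for o, e in zip(obs, est)) / len(obs))

def pearson_cc(obs, est):
    """Pearson correlation coefficient between observations and estimates."""
    mean_o = sum(obs) / len(obs)
    mean_e = sum(est) / len(est)
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(obs, est))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_e = sum((e - mean_e) ** 2 for e in est)
    return cov / math.sqrt(var_o * var_e)

# Hypothetical profile elevations (m): the estimate sits 0.5 m above the survey.
surveyed = [3.0, 2.0, 1.0, 0.0]
estimated = [3.5, 2.5, 1.5, 0.5]
scores = (bias(surveyed, estimated), rmse(surveyed, estimated),
          pearson_cc(surveyed, estimated))
```

The example shows why the three metrics are reported together: a uniformly offset profile has a perfect correlation while still carrying a nonzero bias and RMSE.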
6. Results
Table 4 presents the performance of the beach profile estimated by the proposed model during the test period (October 2019–April 2024), evaluated by using three metrics: bias, RMSE, and CC. Additionally, the plots of the estimated beach profile results, alongside the ground truth for each transect, are depicted in
Figure 11 for each of the three littoral systems. Representative cases, including those that closely followed the overall shape well and those that did not, were randomly selected from the test data.
The transect showing the best average performance belonged to GW13, with the lowest error and highest correlation. GW13−02 exhibited an RMSE of 0.34 m, and achieved a CC of 0.99. In contrast, the transect of GW13−04 showed the highest RMSE, 0.54 m. In terms of the littoral system to which the transects belong, GW31 showed the best performance, with an average bias of 0.16 m and an RMSE of 0.46 m. For the transects associated with GW17, the error was similar to that of GW31, but the CC was higher.
Compared with the average RMSE of 0.52 m for beach profile estimation across 13 sections in de Melo et al. [
39]’s study using RF-based ensemble models, the proposed model achieved an average RMSE of 0.50 m, demonstrating excellent overall performance in beach profile estimation. It is important to note that although both studies address the same task of beach profile estimation by using machine learning methods, there are differences in study area, data period and quantity, and input variable composition. In de Melo et al. [
39]’s results, the lowest RMSE was 0.20 m, and the highest RMSE was 1.38 m. In contrast, our study reports a lowest RMSE of 0.34 m and a highest RMSE of 0.71 m. These results are within a similar range and can be considered baseline performance for machine learning approaches.
Specifically,
Figure 11 facilitates a morphological comparison of the estimated individual transect beach profiles with the surveyed data in x and y coordinates. In the figure, the red solid line indicates the ground truth, the blue dotted line shows the beach profile estimated by the proposed model, and the black solid line denotes the most recent beach profile used as input for the estimation. The input beach profile and the ground truth are generally quite similar in shape. In cases where the model estimation errors are large, most of the discrepancies occur in beach profiles estimated between August and October. The difference is especially pronounced in 2019, when typhoons hit the Korean Peninsula and passed through the areas containing GW13, GW17, and GW31, as presented in
Table 5. Therefore, the significant error in the estimated beach profiles during August to October 2019 appears to stem from the model’s difficulty in capturing the substantial morphological changes in the beach caused by the extremely high waves generated by the typhoon.
7. Discussion and Conclusions
In this study, we proposed an LSTM-based encoder–decoder network to estimate beach profiles by effectively learning the spatiotemporal relationship between beach profile responses and coastal hydrodynamics. The proposed model was applied to a total of 12 transects across three littoral systems on the east coast of Korea, where erosion problems are severe. To estimate the beach profile, the model’s inputs included coastal hydrodynamics, such as waves and tides, along with beach width and beach profile data from the most recent survey preceding the time point being estimated. For coastal hydrodynamics, the entire time series from the time of that previous survey to the estimation point was used as input, while discrete survey values were used for beach width and beach profile.
Although the study area, data, and inputs differed, the proposed method outperformed the results of de Melo et al. [
39], who performed the same beach profile estimation task by using an RF-based machine learning method, showing a lower average RMSE. This improved performance is attributed to the proposed model’s ability to learn the features of coastal hydrodynamics (waves and tides) and previous beach information (beach width and beach profile) more effectively through the variable encoders, while the temporal encoders better capture their sequential patterns. In particular, the LSTM cells used as temporal encoders capture long-term dependencies in time-series data, which distinguishes the approach from previous empirical and numerical models for beach morphology, such as CSHORE and XBeach, and appears to contribute to the improved performance. Moreover, compared with existing machine learning methods, the deep neural network architecture, trained by backpropagating the loss, can extract more useful features from multiple variables through its deep hidden layers and model their nonlinear relationships. In cases of lower performance, the issue was attributed to extreme morphological changes in the beach caused by successive typhoons.
However, the degradation of predictive performance for such storm-induced extreme events is also a problem faced by numerical storm-induced erosion simulation models such as XBeach and SBEACH. Therefore, future research will aim to improve performance by augmenting data to address the imbalance in extreme events within data-driven machine learning or deep learning approaches. Additionally, we plan to propose a loss function based on extreme value distribution [
47,
48,
49,
50].
The proposed method estimates or generates beach profile data for unsurveyed periods using past survey data and ocean hydrodynamic forcing data, thereby increasing the temporal resolution of field survey data. Furthermore, it can predict future beach morphological changes by using past survey data and ocean hydrodynamics. The proposed model is highly applicable to the study area and to coasts with similar beach and external forcing characteristics, because the most recent beach width and beach profile, together with the external forcing during the prediction period, can be input to estimate the subsequent beach profile changes. This gives it higher site applicability than one-dimensional empirical or numerical model-based coastline models such as UNIBEST, which require individual model grid establishment, parameter optimization experiments, etc., for each target site.