Significant Wave Height Prediction in the South China Sea Based on the ConvLSTM Algorithm

Abstract: Deep learning methods have excellent prospects for application in wave forecasting research. This study employed the convolutional LSTM (ConvLSTM) algorithm to predict the significant wave height (SWH) in the South China Sea (SCS). Three prediction models were established to investigate the influence of different parameter settings and of multiple training variables on forecasting performance. Compared with the SWH data from the China–France Ocean Satellite (CFOSAT), the WAVEWATCH III (WWIII) SWH from the Pacific Islands Ocean Observing System is accurate enough to be used as training data for the ConvLSTM-based SWH prediction model. Model A was first established using only the SWH from WWIII as training data, and 20 sensitivity experiments were carried out to investigate the influence of different parameter settings on its forecasting performance. The experimental results showed that Model A performs best when using three years of training data and an input data time span of three hours. With the same parameter settings as the best-performing Model A, Models B and C were established using additional training variables. Model B used the wind shear velocity and SWH as training and input data. For a 24-h SWH forecast, compared with Model A, the root mean square error (RMSE) of Model B decreased by 17.6%, the correlation coefficient (CC) increased by 2.90%, and the mean absolute percentage error (MAPE) decreased by 12.2%. Model C used the SWH, wind shear velocity, and wind and wave directions as training and input data. For a 24-h SWH forecast, compared with Model A, the RMSE of Model C decreased by 19.0%, the CC increased by 2.65%, and the MAPE decreased by 14.8%. The performance of the ConvLSTM-based prediction models relies mainly on the SWH training data. All the ConvLSTM-based prediction models show a greater RMSE in the nearshore area than in the deep area of the SCS, and a greater RMSE during typhoon transit than in periods without typhoons. Additionally using the wind shear velocity and the wind and wave directions as training data improves the performance of SWH prediction.


Introduction
The South China Sea (SCS) is a large semi-enclosed marginal sea and the third-largest continental marginal sea in the world (after the Coral Sea and the Arabian Sea) [1]. With abundant mineral, oil and gas, and fishery resources, the SCS has a considerable variation in water depth [2], including deep-sea and shallow nearshore areas. The climate of the SCS is dominated by the southwest monsoon in summer and the northeast monsoon in winter due to the East Asian monsoon system [3]. The topographic characteristics of the SCS and the monsoon system significantly influence the wave characteristics of the SCS. Accurate wave forecasting can effectively improve the safety of marine activities in the SCS, such as fishing, exploration, power generation, and shipping, and the efficiency of marine operation, as well as reduce marine accidents [4][5][6].
[...] gave significantly better results than feedforward neural networks (FNN) and support vector regression (SVR) models. Many recent SWH studies have combined LSTM with other methods. For example, Ni and Ma [32] combined principal component analysis (PCA) with LSTM to predict wave height and compared the results with linear regression (LR), regression tree (TR), SVM, and Gaussian process regression (GPR); the combined model performed much better in terms of performance metrics and time consumption. Fan et al. [19] combined SWAN with LSTM and found that the SWAN-LSTM model outperformed ELM and SVM in prediction. Pirhooshyaran and Snyder [33] combined LSTM neural networks with Bayesian hyperparameter optimization and elastic net methods. Sequence-to-sequence neural networks were developed for the first time, and the prediction results of SWH were superior in validation.
Previous LSTM network models for SWH prediction were limited to single-point prediction. To address the problem of spatio-temporal sequence prediction in precipitation nowcasting, Shi et al. [34] developed the convolutional LSTM (ConvLSTM) algorithm. The ConvLSTM algorithm constructs a predictive model by establishing relationships between input and predictor variables from a sufficient amount of training data. Experiments have shown that the ConvLSTM network can better capture the spatio-temporal correlation of elements and consistently outperforms other algorithms, such as fully connected LSTM (FC-LSTM). Previous studies have demonstrated the feasibility of employing the ConvLSTM algorithm for SWH prediction. For example, Choi et al. [35] predicted SWH from continuous ocean images based on a two-way ConvLSTM regression model, and the model predictions yielded low error rates in terms of mean absolute error (MAE) and mean absolute percentage error (MAPE). However, that study was limited by the difficulty of collecting continuous ocean images and by the short length of the estimated time. Zhou et al. [36] performed intelligent wave forecasting in the South and East China Seas based on the ConvLSTM algorithm. However, the training and input data used in such studies were mainly limited to previous SWH data, and other environmental and physical factors that may influence SWH variation were disregarded.
The SWH is mainly influenced by wind direction, wind speed, sea surface temperature, and atmospheric pressure [9,37], among which wind speed and direction are the most critical factors affecting the variation of SWH [38][39][40]. Fan et al. [19] and Hu et al. [41] considered the role of multiple input elements for SWH to design prediction models, but their studies were limited to single-point forecasts at the measurement sites. Therefore, multi-factor data such as the historical SWH, wind speed, and wind direction were used as training and input data of the ConvLSTM neural network model in this study. A variety of network models were designed to predict SWH in the SCS. The optimal control parameters were determined by training and testing neural network models with different model parameters. On this basis, the influence of different input factors on SWH prediction was studied.
The remainder of this paper is organized as follows. In Section "Data and methods", we describe the data and preprocessing used in this study, the methodology employed for the study, and how the predictive model of SWH in the SCS was constructed. In Section "Results and discussion", we describe the results of the prediction models using three different input data and discuss the differences between the three models. Finally, Section "Conclusions" provides our conclusions.

Data and Pre-Processing
The SWH data and wave direction data used in this study are the best time series WAVEWATCH III (WWIII) global wave model data from the official website of the Pacific Islands Ocean Observing System (PacIOOS). A global-scale WWIII model was implemented at the University of Hawaii through a partnership with the National Oceanic and Atmospheric Administration/National Centers for Environmental Prediction (NOAA/NCEP) and the National Weather Service Honolulu Forecast Office (NWS Honolulu) [42]. The SWH and wave direction (θ) data have a temporal resolution of 1 h and a spatial resolution of 1/2° × 1/2°. The spatial range of the data used in this study is 99°–126° E, 0°–26° N, and the time range is from January 2016 to October 2021, where the data from January 2016 to December 2020 were used as the training dataset and the data from January to October 2021 were used as the testing dataset.
The wind data used in this study were obtained from the fifth generation ECMWF reanalysis (ERA5) for the global climate and weather. The ECMWF-ERA5 data is an atmospheric reanalysis product based on the 2016 version of the Integrated Forecast System (IFS) that combines model data with observations from around the world to form a globally complete and consistent dataset. The ERA5 data replaces its predecessor, the ERA-Interim reanalysis, and provides data products from 1979 onward that are updated in real time [43]. The ERA5 data used in this study are the eastward component (u) and the northward component (v) of the 10 m wind, and the data have a temporal resolution of 1 h and a spatial resolution of 1/4° × 1/4°. The spatial range of the data used in this study is 99°–126° E, 0°–26° N, and the time range is from January 2016 to October 2021, where the data from January 2016 to December 2020 were used as the training dataset and the data from January to October 2021 were used as the testing dataset.
Typhoon and tropical cyclone data were also used in this study due to their frequent occurrence in the SCS [5]. To evaluate the performance of the SWH prediction model during extreme weather events, the path data and transit time data of typhoons and tropical storms that were generated in or were transiting through the South China Sea in April, September, and October 2021 were selected. The typhoon data were obtained from the China Central Weather Bureau Typhoon Network [44], and the attribute information of typhoons and tropical storms are shown in Table 1. To assess the quality of WWIII data and the accuracy of the predicted SWH, we used the SWIM (surface waves investigation and monitoring instrument) data products from CFOSAT (Chinese-French Oceanic satellite). The French AVISO+ (archiving, validation, and interpretation of satellite oceanographic data) Cnes Data Center provided the SWIM L2P SWH box off nadir NRT products, which had a delivery delay of 4 h for the period from 25 April 2019, to the present [45]. Li et al. [46] demonstrated that CFOSAT could provide high-precision SWH by comparing it with the SWH data from the National Data Buoy Center (NDBC) buoys and the Jason-3 altimeter SWH data. Therefore, we selected the CFOSAT SWIM data passed through SCS in 2020 and October 2021 to evaluate the data quality of SWH from WWIII and the wave prediction model capabilities.

Preprocessing
In order to accurately predict the SWH, the controlling factors for SWH generation need to be determined. The previous SWH is one of the most critical factors. In order to unify the resolution of the data and improve the quality of the data, the wave data were interpolated to the same spatial resolution as the wind field data. Wind speed and wind direction are also important physical factors affecting the SWH. Wind speed ($U_{10}$) and wind direction (Φ) at 10 m were calculated from the eastward component (u) and northward component (v) of the 10 m wind from ECMWF-ERA5. Zamani et al. [47] used wind shear velocity ($U_*$) instead of $U_{10}$ for modeling, and $U_*$ was able to improve the predictions in extreme events. The formula for $U_*$ is shown in Equation (1):
$$U_* = U_{10}\sqrt{C_D} \qquad (1)$$
where $C_D$ is the wind resistance (drag) coefficient as shown in Equation (2) [48].
The wind and wave direction also have an important effect on the wave growth rate and need to be considered when training the SWH prediction model. The wind has the greatest effect on wave generation if the wind and wave directions are the same. Therefore, this study uses cos (Φ − θ) [9] to quantify this effect, where Φ is the wind direction, and θ is the wave direction.
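The preprocessing quantities described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the wind direction is assumed to follow the meteorological "blowing from" convention (clockwise from north), and the drag coefficient uses the piecewise form of Wu (1982), which is assumed here because Equation (2) is not reproduced in the text.

```python
import math


def wind_speed_and_direction(u, v):
    """Return (U10 in m/s, direction in degrees) from eastward u and northward v.

    Assumption: meteorological convention, i.e. the direction the wind
    blows FROM, measured clockwise from north.
    """
    speed = math.hypot(u, v)
    direction = math.degrees(math.atan2(-u, -v)) % 360.0
    return speed, direction


def drag_coefficient(u10):
    """Wind drag coefficient C_D (assumed piecewise form, Wu 1982)."""
    if u10 < 7.5:
        return 1.2875e-3
    return (0.8 + 0.065 * u10) * 1e-3


def shear_velocity(u10):
    """Wind shear velocity U* = U10 * sqrt(C_D)."""
    return u10 * math.sqrt(drag_coefficient(u10))


def wind_wave_alignment(phi_deg, theta_deg):
    """cos(phi - theta): 1 when wind and waves are aligned, -1 when opposed.

    Assumption: both directions use the same angular convention.
    """
    return math.cos(math.radians(phi_deg - theta_deg))


u10, phi = wind_speed_and_direction(3.0, 4.0)
u_star = shear_velocity(u10)
```

Computing $U_*$ from $U_{10}$ rather than feeding $U_{10}$ directly gives strong winds more weight, because the assumed $C_D$ grows with wind speed above 7.5 m/s.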

ConvLSTM Algorithm
ConvLSTM was first applied to precipitation nowcasting [34]; it addresses the deficiency of LSTM in losing the spatial correlation and spatial features of spatial data. ConvLSTM extracts features from a series of images rather than from a single image. A model that processes sequential images needs to be able to extract spatial and temporal information from the images, as it should adapt to the changes in the sequential data over time. Thus, ConvLSTM uses convolution operations to generate a good spatial representation of each frame and uses the LSTM to encode the temporal variations in the sequence. The LSTM is a class of recurrent neural networks that can process sequential data and was introduced to solve the vanishing gradient problem encountered by recurrent neural networks when processing long sequences [31]. The LSTM incorporates memory units that hold information about the inputs seen by the LSTM units and is controlled by several fully connected gates. Because the main purpose of processing image sequences is to discover changes in the spatial and temporal dimensions, ConvLSTM uses convolutional gates in the LSTM to encode spatio-temporal information.
Equations (3)-(7) and Figure 1 describe the architecture of the ConvLSTM. Here, σ is the sigmoid function, "*" denotes the convolution operation, and "•" denotes the Hadamard product. $i_t$ is the input gate, $f_t$ is the forget gate, $o_t$ is the output gate, $C_t$ is the current state, $H_t$ is the final output, and W and b represent the weight and bias coefficients, respectively, which are three-dimensional (3D) tensors. The ConvLSTM layer is a recurrent layer, similar to the LSTM, except that the internal matrix multiplication is replaced by the convolution operation. The data flowing through the ConvLSTM unit therefore keep their 3D shape rather than being reduced to a one-dimensional vector. Thus, the ConvLSTM layer uses the same weight sharing as a CNN and treats the input data as serial data, which allows the model to process time-series data similar to an RNN.
Figure 1. ConvLSTM cell architecture [35]. $C_t$ is the current state, $C_{t-1}$ is the state of the previous moment, $o_t$ is the output gate, $i_t$ is the input gate, $f_t$ is the forget gate, and $h_{t-1}$ is the final output of the previous moment.
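For reference, the ConvLSTM cell as formulated by Shi et al. [34] can be written as the five relations below. This is a sketch of that standard formulation (with peephole terms), not a transcription of this paper's Equations (3)-(7); the input symbol $X_t$ and the subscripted weight names are notation assumed here, and $\circ$ stands for the Hadamard product.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \circ C_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \circ C_{t-1} + b_f\right) \\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh\left(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c\right) \\
o_t &= \sigma\left(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \circ C_t + b_o\right) \\
H_t &= o_t \circ \tanh\left(C_t\right)
\end{aligned}
```

Replacing every convolution with a matrix product recovers the fully connected LSTM, which is why the only structural change is that the inputs, states, and gates remain spatial maps.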

Constructing the SWH Prediction Model
Based on the ConvLSTM model for precipitation nowcasting by Shi et al. [34], a ConvLSTM model for SWH prediction in the SCS was developed in this study, and the overall structure of the model is shown in Figure 2. The model has five hidden layers for each step, including four ConvLSTM layers and one Conv2D layer as the final output layer. To produce a forecast, the data from several previous time steps of each training sample are input into the model of Figure 2 to obtain the SWH at the target time. The SWH from WWIII and the wind from ERA5 are 2D fields; because the ConvLSTM algorithm replaces the multiplication operation of the traditional LSTM with a convolution operation, it can operate directly on 2D data, and the SWH forecast is output directly as a 2D map.
In this study, three different SWH prediction models for the SCS were established using SWH, $U_*$, Φ, and θ as the training and input data. Model A was built as a univariate SWH prediction model using only SWH as training and input data. The effect of two parameters, training dataset size and input data time span, on the forecasting performance of Model A was explored through 20 sets of sensitivity experiments. In these 20 experiments, the input data time span was set to 2, 3, 4, and 5 h, and the training dataset size was set to 1, 2, 3, 4, and 5 years. The optimal input data time span and training dataset size were determined by analyzing and evaluating the error indices of the twenty sets of experiments. In addition, wind speed and wind direction are also important physical factors affecting SWH. In order to further improve the accuracy of the prediction model, multi-variable input data were used to forecast SWH based on the Model A parameter settings. Model B was designed using SWH and $U_*$ as input data, and Model C was constructed using SWH, $U_*$, Φ, and θ as input data. The three models developed in this study are Model A, Model B, and Model C. In their formulations, T is a certain moment, T + N is the moment for which the SWH needs to be predicted, $H_p$ denotes the SWH predicted by the model, and $H_w$ denotes the SWH from WWIII.
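How the inputs of the three models could be arranged as training samples is sketched below with NumPy. It is an illustration under stated assumptions rather than the authors' pipeline: `build_samples` is a hypothetical helper, the random arrays stand in for real gridded fields, SWH is assumed to be channel 0, and encoding Φ and θ as a single cos(Φ − θ) channel in Model C is an assumption based on the preprocessing description.

```python
import numpy as np


def build_samples(fields, span, lead):
    """Slice a (time, lat, lon, channels) array into ConvLSTM samples.

    X[i] holds `span` consecutive frames ending at time T, and y[i] is the
    SWH map (channel 0, by assumption) at time T + lead.
    """
    n_time = fields.shape[0]
    times = range(span - 1, n_time - lead)  # valid values of T
    X = np.stack([fields[t - span + 1 : t + 1] for t in times])
    y = np.stack([fields[t + lead, ..., :1] for t in times])
    return X, y


# Placeholder hourly fields on a tiny 4 x 5 grid (random stand-ins, not real data).
rng = np.random.default_rng(0)
swh = rng.random((10, 4, 5))
u_star = rng.random((10, 4, 5))
cos_diff = rng.random((10, 4, 5))

model_a_fields = swh[..., np.newaxis]                         # Model A: SWH only
model_b_fields = np.stack([swh, u_star], axis=-1)             # Model B: SWH, U*
model_c_fields = np.stack([swh, u_star, cos_diff], axis=-1)   # Model C (assumed encoding)

# 3-h input span, forecast 2 steps ahead.
X, y = build_samples(model_c_fields, span=3, lead=2)
```

The resulting five-dimensional layout (samples, time, rows, columns, channels) is the channels-last input shape that common ConvLSTM layer implementations expect, so switching between Models A, B, and C only changes the size of the last axis.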

Model Quality Assessment Methods
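To quantify the accuracy of the SWH prediction model, model quality was assessed using root mean square error (RMSE), correlation coefficient (CC), and mean absolute percentage error (MAPE), with expressions as shown in Equations (11) to (13):
$$RMSE = \sqrt{\frac{1}{M}\sum_{i=1}^{M}\left(H_{p,i} - H_{w,i}\right)^{2}} \qquad (11)$$
$$CC = \frac{\sum_{i=1}^{M}\left(H_{p,i} - \bar{H}_p\right)\left(H_{w,i} - \bar{H}_w\right)}{\sqrt{\sum_{i=1}^{M}\left(H_{p,i} - \bar{H}_p\right)^{2}\sum_{i=1}^{M}\left(H_{w,i} - \bar{H}_w\right)^{2}}} \qquad (12)$$
$$MAPE = \frac{1}{M}\sum_{i=1}^{M}\left|\frac{H_{p,i} - H_{w,i}}{H_{w,i}}\right| \times 100\% \qquad (13)$$
where M is the total number of cases, $H_p$ represents the predicted SWH, $H_w$ represents the SWH from WWIII, $\bar{H}_w$ represents the average of the SWH from WWIII, and $\bar{H}_p$ represents the average of the predicted SWH. Because the CFOSAT data are spatially and temporally discontinuous, the SWH of WWIII, which was validated against CFOSAT data, was used in the training and test sets of the SWH prediction model to calculate the RMSE, MAPE, and CC of the predicted data.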
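The three error indices can be computed as in the following sketch, which uses the standard definitions of RMSE, the Pearson correlation coefficient, and MAPE. The function names and the three-point sample are illustrative only, and the MAPE helper assumes that no reference value is zero (for example, land points have been masked out beforehand).

```python
import math


def rmse(pred, ref):
    """Root mean square error between predicted and reference values."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))


def cc(pred, ref):
    """Pearson correlation coefficient between predicted and reference values."""
    mean_p = sum(pred) / len(pred)
    mean_r = sum(ref) / len(ref)
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(pred, ref))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_r = sum((r - mean_r) ** 2 for r in ref)
    return cov / math.sqrt(var_p * var_r)


def mape(pred, ref):
    """Mean absolute percentage error in percent (reference must be nonzero)."""
    return 100.0 * sum(abs((p - r) / r) for p, r in zip(pred, ref)) / len(ref)


# Toy values in metres, chosen so every prediction is off by 10%.
h_w = [1.0, 2.0, 4.0]   # reference SWH
h_p = [1.1, 1.8, 4.4]   # predicted SWH
scores = {"RMSE": rmse(h_p, h_w), "CC": cc(h_p, h_w), "MAPE": mape(h_p, h_w)}
```

For gridded forecasts, the same functions can be applied to the flattened list of all valid grid points and times, or per grid point to obtain spatial error maps.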
To evaluate the differences in prediction performance between models, the skill assessment used by Ji et al. [49], adapted to this study, was used to quantify the numerical differences in the error indices among the models, as given in Equation (14),
where $E^{m_a}_{index}$ and $E^{m_b}_{index}$ denote the values of the error indices of model a and model b, respectively, and "index" denotes the different error indices of the models, including RMSE, CC, and MAPE.
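As a worked illustration of comparing an error index between two models, the sketch below computes a percentage change relative to model a. The exact form and sign convention of Equation (14) are not reproduced in the text, so this relative-change form and the name `relative_change` are assumptions; it does agree with the reported 24-h CC values (0.794 for Model A, 0.817 for Model B), which give an increase of about 2.9%.

```python
def relative_change(e_a, e_b):
    """Percentage change of an error index from model a to model b.

    Assumed convention: positive means the index increased relative to
    model a, negative means it decreased.
    """
    return (e_b - e_a) / e_a * 100.0


# 24-h correlation coefficients of Model A and Model B quoted in the text.
cc_change_ab = relative_change(0.794, 0.817)
```

Under this convention an improvement appears as a positive change for CC and as a negative change for RMSE and MAPE.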

Validation of SWH from WWIII
The SWH from WWIII provided by PacIOOS was evaluated against satellite SWH. Li et al. [46] demonstrated that CFOSAT can provide high-precision SWH. Thus, the CFOSAT SWIM SWH was used as the reference to calculate the CC and RMSE of the SWH from WWIII (Figure 3b). Altogether, 170 CFOSAT tracks of SWH data in 2020 (Figure 3a) were collected to evaluate the SWH data of the training set in 2020. Compared with the CFOSAT SWIM SWH, the SWH from WWIII has a CC of 0.9586 and an RMSE of 0.3658 m in 2020 (Figure 3b). Therefore, the precision of the SWH from WWIII is acceptable, and the data can be used as the training data for the SWH prediction model.

Model A Sensitivity Experiments
Model A was built using only SWH as training and input data for the SWH prediction model in the SCS. In the process of establishing Model A, because the training dataset size and the input data time span were critical parameters affecting the performance of the forecast model, the effects of these two parameters on the forecasting effectiveness of Model A were explored through 20 sensitivity experiments. The time range of the training set chosen for the experiments was from 2016 to 2020, and the time range of the validation set was from January to October 2021. The RMSE and CC were used in these experiments to assess the performance differences between different experimental models. Figure 4 shows the RMSE and CC results of SWH forecasting at 3-, 6-, 12-, and 24-h for the twenty sets of experiments, respectively.
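The design of the 20 sensitivity experiments is a full grid over the two parameters, which can be enumerated as in the sketch below. The dictionary keys and the `select_best` helper are hypothetical; ranking by lowest RMSE with ties broken by highest CC is one assumed way to combine the two indices, not a rule stated in the text.

```python
from itertools import product

time_spans_h = [2, 3, 4, 5]        # input data time span in hours
train_years = [1, 2, 3, 4, 5]      # training dataset size in years

# One configuration per (time span, dataset size) pair: 4 x 5 = 20 experiments.
experiments = [
    {"span_h": span, "train_years": years}
    for span, years in product(time_spans_h, train_years)
]


def select_best(results):
    """Pick the result with the lowest RMSE, breaking ties by highest CC.

    `results` is a list of dicts with "rmse" and "cc" keys (assumed layout).
    """
    return min(results, key=lambda r: (r["rmse"], -r["cc"]))
```

Each configuration would be trained and scored separately for every forecast lead time, so the comparison in Figure 4 involves one such grid per lead time.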

For a fixed input data time span, each row in Figure 4 shows the relationship between the experimental model's RMSE, CC, and the training dataset size. As the training dataset size increases, the experimental model has the characteristics that the RMSE decreases at first and increases after, and CC ascends and then diminishes. At a specific training dataset size, each column in Figure 4 shows the relationship between the RMSE, CC, and the experimental model's input data time span. As the input data time span increases, the RMSE of the experimental model first decreases and then increases, and CC first ascends and then diminishes.
With a constant training dataset size, the model with an input data time span of 3 h had the smallest RMSE and the highest CC in the 3- (Figure 4a,b) and 6-h (Figure 4c,d) SWH forecasting. When the input data time span was 2 h, the forecasting accuracy was low because too little wave data was provided. As the time span increases, the CC of the model gradually increases and the RMSE gradually decreases. However, when the input data time span was too large, the precision of the model did not improve further because of data redundancy. For a fixed input data time span, the experimental model with a training dataset size of 3 years had the smallest RMSE and the highest CC for SWH forecasting of 3- (Figure 4a,b), 6- (Figure 4c,d), and 24-h (Figure 4g,h); for SWH forecasting of 12-h (Figure 4e,f), the experimental models with a training dataset size of 3 or 4 years had the lowest RMSE and the largest CC. As the training dataset size increases, the CC of the model gradually increases and the RMSE gradually decreases. However, once the training dataset size exceeded 3 years, the gain in model accuracy was not apparent, while the model consumed significantly more computing resources. Considering both the model quality of the SCS SWH prediction model and the computational resources consumed in training, the model with an input data time span of 3 h and a training dataset size of 3 years had the smallest RMSE and the largest CC among the experimental models for the 3-, 6-, 12-, and 24-h SCS SWH forecasts. Its RMSE values were 0.108 m, 0.176 m, 0.282 m, and 0.421 m, and its CC values were 0.980, 0.944, 0.881, and 0.794, respectively. These settings were taken as the optimal prediction model parameters, and the same settings were therefore adopted in Models B and C.

Model Comparison and Analysis
Models A, B, and C were used to predict SWH at 3-, 6-, 12-, 24-, and 36-h in the SCS. In order to compare the performance of the models, the root mean square error (RMSE), correlation coefficient (CC), and mean absolute percentage error (MAPE) of the three models were calculated. Figure 5 shows the variation curves of the error indices of the three models for 3-, 6-, 12-, 24-, and 36-h SWH forecasting on the test set. The blue dashed line, orange dashed line, and red dashed line represent Models A, B, and C, respectively. As shown in the figure, the RMSE (Figure 5a) and MAPE (Figure 5c) of the prediction models gradually increase and the CC (Figure 5b) gradually decreases as the forecasting time increases from 3 h to 36 h. This is consistent with theory and with the expected result. Meanwhile, as shown in Figure 5, for the 3-h SWH forecast, the RMSE and MAPE of Models A, B, and C were small, and the CC between the predicted SWH and the SWH from WWIII was large. For a fixed forecast time, the RMSE and MAPE of Model B were smaller than those of Model A, and the CC of Model B was larger than that of Model A. This is because the accuracy of the models depends not only on the wave parameters but also on the previous wind speed. Model C outperforms Model B for the 3-, 6-, and 12-h SWH forecasts, but for the 24-h SWH forecast, the differences in RMSE, MAPE, and CC between Model C and Model B were very small. In particular, the RMSE of Model C was slightly larger than that of Model B for the 36-h forecast. This result means that for lengthy forecasts (36 h or more), wind and wave directions had a very weak impact on forecast performance, and including them may even reduce forecast accuracy because of data redundancy.
In addition, as shown in Figure 5b, the CC of Model B and Model C were greater than 0.8 for the 24-h SWH forecast (0.817 and 0.815, respectively), and the forecast results were considered to be significantly correlated with the true values at this time. Therefore, this study focused on the SWH forecast results over a 24-h time period. The comprehensive assessment showed that for SWH prediction within 24 h, Model C outperforms the other models in terms of integrated predictive capability.
In order to further quantify the impact of multi-element training and input data on model performance, the changes in RMSE, CC, and MAPE from Model A to Model B and to Model C were evaluated. The statistics of $Skill^{AB}_{index}$ and $Skill^{AC}_{index}$ are given in Table 2, based on the error indices of each model in Figure 5 and Equation (14).
According to the results in Table 2, for the 3-h SWH forecast, both $Skill^{AB}_{index}$ and $Skill^{AC}_{index}$ were relatively small, and they gradually increased with the increase in forecasting time. Both $Skill^{AB}_{index}$ and $Skill^{AC}_{index}$ were larger for the 6- and 12-h SWH forecasts. However, the discrepancies between $Skill^{AB}_{index}$ and $Skill^{AC}_{index}$ were not significant for the 24-h SWH forecast. When the forecast time was relatively long (6- and 12-h), the results of Model A were less accurate than those of Models B and C. When the forecast time was too long (24-h) or too short (3-h), the input of multiple elements did not significantly improve the forecast performance. This was because the correlation between wave height and previous wave/wind characteristics became lower at longer forecast times [37]. For a fixed forecasting time, $Skill^{AC}_{index}$ was greater than $Skill^{AB}_{index}$.

Spatial Distribution and Statistical Analysis of Model Errors
To evaluate the spatio-temporal distribution characteristics of the model errors of SWH forecasts in the SCS, Figure 6 shows the spatial distribution results of the monthly mean RMSE of Model A, B, and C in the 24-h SWH forecast from January to October 2021. The RMSE and spatial locations of all three SCS SWH forecast models were significantly correlated, with smaller RMSE in the deep-sea region away from the coast yet larger RMSE in the shallow-water region along the coast. This was because wind-wave relationships in the nearshore shallow water area are uncertain due to irregular shoreline shapes and seafloor conditions, while the interaction of ocean hydrodynamics and coastal morphology leads to complex relationships between wind and waves [47]. In addition, the RMSE of the prediction models was relatively larger in the eastern and southeastern parts of the SCS, which might be due to multiple reasons. SWH from WWIII was used as the training data, and the error of the prediction results was affected by the accuracy of the original data. Meanwhile, the frequent typhoon events in the sea near the Luzon Strait [5,6] cause irregular and drastic changes in SWH in the nearby ocean. It is difficult for the prediction model to obtain information on the spatial and temporal characteristics of SWH. In addition, many islands are in the eastern and southeastern parts of the SCS, resulting in spatial incoherence of wave data. The lack of data information may also be the reason for this phenomenon. Moreover, the monthly mean RMSE of the SWH prediction model had monthly variations. The prediction model had the smallest RMSE for May-August 2021, followed by the results for January-March 2021, and the worst forecasting for April, September, and October 2021 with the largest RMSE.  
In order to quantitatively evaluate the magnitude of the monthly mean RMSE of Model A, B, and C in the 24-h SCS SWH forecast, the results of the spatial distribution of RMSE in Figure 6 were statistically analyzed, and boxplots of RMSE statistics were plotted ( Figure 7). As shown in Figure 7, each subplot's blue, red, and orange boxes indicate the RMSE statistics of Model A, B, and C. From Model A to Model B to Model C, the median and third quartile of the models were decreasing, and it can be observed that in most months, the median and third quartile of Model C were the minimum. Model C had the best forecasting ability. Meanwhile, the RMSE of the prediction model had the most outliers in April and September 2021 (Figure 7d,i), indicating that the RMSE of the model had more exception value in these two months.  Figures 6 and 7 show that the prediction model had maximum values and more outliers for the RMSE results in April, September, and October 2021. Based on the information on generated or transiting typhoons and tropical storms for April, September, and October 2021 in the SCS waters in Table 1, the spatial distribution characteristics of the RMSE of the prediction model under extreme weather conditions were analyzed. For April, September, and October 2021, the spatial distribution of the RMSE of the prediction model was calculated by dividing each month into two periods with typhoon transit and no typhoon transit, respectively. Figure 8 shows the spatial distribution of RMSE in the 24-h SWH forecast for the three months mentioned above. The left and right plots in each subplot indicate the presence and absence of typhoon transit, respectively, where the solid line in the left plot indicates the typhoon path in that month. The solid black line in Figure  8a is the path of Typhoon 2102 "Surigae". The solid red, black and blue lines in Figure 8b are the paths of Typhoon 2113 "Conson", Typhoon 2114 "Chanthu" and Typhoon 2115 "Dianmu", respectively. 
The solid red and black lines in Figure 8c are the paths of Typhoon 2117 "Lionrock" and Typhoon 2118 "Kompasu", respectively.
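The two processing steps described above, splitting a month into typhoon and no-typhoon periods and summarizing RMSE values with boxplot statistics, can be sketched as follows. This is a minimal sketch under stated assumptions: the helper names, the example dates, and the RMSE values are hypothetical, and the 1.5 × IQR outlier rule is the common boxplot convention, which may differ from the one used for Figure 7.

```python
import statistics
from datetime import datetime

def split_by_typhoon(times, typhoon_windows):
    """Split time-step indices into typhoon and no-typhoon groups.

    times: list of datetime objects, one per forecast time step.
    typhoon_windows: list of (start, end) datetime pairs, both inclusive.
    """
    typhoon_idx, calm_idx = [], []
    for k, t in enumerate(times):
        in_window = any(start <= t <= end for start, end in typhoon_windows)
        (typhoon_idx if in_window else calm_idx).append(k)
    return typhoon_idx, calm_idx

def box_stats(values):
    """Quartiles, median, and 1.5*IQR outliers of a list of RMSE values."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in values if v < low or v > high]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}

# Hypothetical example: the window and the RMSE values are made up.
times = [datetime(2021, 10, d) for d in (1, 8, 12, 20)]
windows = [(datetime(2021, 10, 7), datetime(2021, 10, 14))]
typhoon_idx, calm_idx = split_by_typhoon(times, windows)
stats = box_stats([0.2, 0.3, 0.25, 0.35, 0.3, 1.4])
```

The index lists returned by `split_by_typhoon` would then select the time steps over which the RMSE of each period is computed.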

Model Performance in Extreme Weather
As shown in Figure 8, the spatial distribution of RMSE during typhoon periods closely follows the typhoon paths in each month, and the RMSE of the prediction models is small during periods without typhoon transit. This is consistent with the finding that wind resistance coefficients in extreme conditions are very different from those in weak wind conditions, which may alter the relationship between wind and waves and thus reduce the accuracy of predicting extreme events [50]. The RMSE changes between the three models were then compared separately for periods with and without typhoon transit. During typhoon transit, the RMSE of Models B and C, which add wind field data to the input data, decreased significantly compared with that of Model A, whereas the changes relative to Model A in the period without typhoon transit were insignificant.

To quantify this feature, $\mathrm{Skill}_{RMSE}^{AB}$ and $\mathrm{Skill}_{RMSE}^{AC}$, calculated according to Equation (14), were statistically analyzed for periods with and without typhoon transit (Figure 9). The left and right histograms in each subplot denote the presence and absence of typhoon transit, respectively, and the blue and red histograms indicate $\mathrm{Skill}_{RMSE}^{AB}$ and $\mathrm{Skill}_{RMSE}^{AC}$, respectively. As shown in Figure 9, in April, September, and October 2021, both $\mathrm{Skill}_{RMSE}^{AB}$ and $\mathrm{Skill}_{RMSE}^{AC}$ were greater during the typhoon transit period than during the period without typhoon transit. This indicates that the quality of the prediction models was more closely related to the wind field during periods of extreme weather. At the same time, $\mathrm{Skill}_{RMSE}^{AC}$ was greater than $\mathrm{Skill}_{RMSE}^{AB}$.
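Equation (14) is not reproduced in this section, so the following is only a sketch of one plausible reading of the skill index: the percentage reduction in RMSE of a multi-element model relative to Model A. Both the formula and the function name `skill_rmse` are assumptions, and the input values are purely illustrative.

```python
def skill_rmse(rmse_baseline, rmse_model):
    """Relative RMSE reduction in percent (assumed form of the skill index).

    rmse_baseline: RMSE of the reference model (Model A in the text).
    rmse_model: RMSE of the model being compared (Model B or C).
    A positive value means the compared model has a lower RMSE.
    """
    if rmse_baseline <= 0:
        raise ValueError("baseline RMSE must be positive")
    return (rmse_baseline - rmse_model) / rmse_baseline * 100.0

# Illustrative numbers only, not results from the paper.
skill_example = skill_rmse(1.0, 0.75)
```

Under this reading, a larger skill during typhoon transit means that adding wind information removes a larger share of Model A's error in that period.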
To quantify this feature further, $\mathrm{Skill}_{RMSE}^{AB}$ and $\mathrm{Skill}_{RMSE}^{AC}$ for periods with and without typhoon transit in April, September, and October 2021 are listed in Table 3. As shown in Table 3, Model C, constructed using SWH, wind shear velocity, wind direction, and wave direction data, performed best.

To validate the accuracy of Model C in the 24-h SWH forecast and to further compare its accuracy during periods with and without typhoon transit, we obtained CFOSAT SWIM SWH data products provided by AVISO+. As shown in Figure 8, typhoons had the largest impact in the SCS in October 2021.
Therefore, we selected 22 tracks of CFOSAT SWIM SWH data that passed through the SCS in October 2021, including 8 tracks during typhoon transit (Figure 10a) and 14 tracks during the period without typhoon transit (Figure 10b). The SWH from WWIII and the Model C SWH were interpolated to the track coordinates by the nearest neighbor method. The quality of the original SWH from WWIII during the typhoon and no-typhoon periods was analyzed first (Figure 11a,b). Subsequently, we analyzed the accuracy of Model C in the 24-h SWH forecast during the two periods by correlation and error analysis (Figure 11c,d).
During typhoon transit, the CC of SWH from WWIII relative to CFOSAT SWIM SWH was 0.8894 and the RMSE was 0.6555 m (Figure 11a); during no typhoon transit, the CC was 0.9643 and the RMSE was 0.2657 m (Figure 11b). In the 24-h SWH forecast, the CC of Model C SWH relative to CFOSAT SWIM SWH was 0.7895 and the RMSE was 0.9393 m during typhoon transit (Figure 11c); during no typhoon transit, the CC was 0.8719 and the RMSE was 0.4993 m (Figure 11d).
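The validation procedure above (nearest-neighbor sampling of gridded SWH at track coordinates, followed by CC and RMSE against the satellite values) can be sketched as below. This is a minimal sketch assuming a regular latitude/longitude grid and the Pearson correlation coefficient; the function names and all numbers are hypothetical and do not reproduce the authors' processing chain.

```python
import math

def nearest_index(axis, value):
    """Index of the grid coordinate closest to value."""
    return min(range(len(axis)), key=lambda k: abs(axis[k] - value))

def sample_nearest(grid, lats, lons, track):
    """Sample grid[i][j] (at lats[i], lons[j]) at each (lat, lon) of a track."""
    return [
        grid[nearest_index(lats, lat)][nearest_index(lons, lon)]
        for lat, lon in track
    ]

def cc_and_rmse(model, obs):
    """Pearson correlation coefficient and RMSE of two equal-length series."""
    n = len(model)
    mean_m = sum(model) / n
    mean_o = sum(obs) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, obs))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    cc = cov / math.sqrt(var_m * var_o)
    rmse = math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / n)
    return cc, rmse

# Made-up 2 x 2 grid and two track points, for illustration only.
lats, lons = [10.0, 10.5], [110.0, 110.5]
grid = [[1.0, 2.0], [3.0, 4.0]]
sampled = sample_nearest(grid, lats, lons, [(10.1, 110.4), (10.4, 110.1)])
cc, rmse = cc_and_rmse([1.0, 2.0, 3.0], [1.5, 2.5, 3.5])
```

Note that a constant offset between model and observation leaves the CC at 1 while the RMSE is nonzero, which is why the two metrics are reported together.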
In summary, the 24-h SWH forecast of Model C was less accurate during typhoon transit than during periods without typhoon transit. There were probably two reasons. First, the SWH from WWIII is itself less accurate during typhoon transit, and the quality of the input data influenced the accuracy of the SWH forecast by Model C. Second, during typhoon transit it was more difficult for Model C to capture the characteristic patterns of the wave and wind fields while they changed drastically, which further reduced the accuracy of the predicted SWH.

Conclusions
In this study, to explore the role of different input elements in ConvLSTM-based SWH prediction in the South China Sea, three prediction models were developed using SWH, wind shear velocity (U*), wind direction (Φ), and wave direction (θ) as input data. Model A was constructed using single-element (SWH) training and input data. Its two key parameters, the input data time span and the training dataset size, were determined by sensitivity experiments. To further improve the performance of the SWH forecasting model, Models B and C were constructed using multi-element training and input data. Model B used SWH and U* data to predict SWH, and Model C added wind and wave direction data to the input data of Model B. Subsequently, the spatial distribution characteristics and differences of the forecast results of the three models were analyzed, and their forecast characteristics and discrepancies under extreme weather were discussed.
The main innovation of this paper is the consideration of various physical factors in the prediction of two-dimensional SWH fields. The effect of wind forcing on SWH was quantified using wind shear velocity instead of wind speed, and cos(Φ − θ) was used to quantify the influence of the difference between wind and wave directions on SWH. Moreover, the relationship between the prediction model performance and the typhoon tracks was explored.
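The directional term cos(Φ − θ) mentioned above can be illustrated with a short sketch. It assumes that both directions are given in degrees and follow the same convention; the function name `direction_alignment` is hypothetical.

```python
import math

def direction_alignment(wind_dir_deg, wave_dir_deg):
    """cos(wind direction - wave direction) for directions in degrees.

    Returns 1 when wind and waves are aligned, 0 when they are
    perpendicular, and -1 when they are opposed. Because cosine is
    periodic and even, no explicit wrap-around handling is needed.
    """
    return math.cos(math.radians(wind_dir_deg - wave_dir_deg))

aligned = direction_alignment(90.0, 90.0)
opposed = direction_alignment(0.0, 180.0)
across_north = direction_alignment(350.0, 10.0)  # a 20-degree difference
```

Using the cosine rather than the raw angle difference gives the model a bounded, continuous input that does not jump at the 0°/360° boundary.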
The most significant findings of this study are as follows. It is feasible to apply the ConvLSTM algorithm to SWH forecasting in the South China Sea, where it can provide efficient and high-precision forecasts. When only SWH data were used as input to predict the SWH in the South China Sea, the optimal training dataset size was 3 years and the optimal input data time step was 3 h. Model C, in which SWH, U*, and cos(Φ − θ) were used as input data, outperformed the other models. For the 3-h SWH forecast, the correlation between the forecasting results and the wind field was not significant. For the 6- and 12-h SWH forecasts, $\mathrm{Skill}_{RMSE}^{AC}$ gradually increased when U* and cos(Φ − θ) were added to the input data. However, the difference between $\mathrm{Skill}_{RMSE}^{AB}$ and $\mathrm{Skill}_{RMSE}^{AC}$ was not significant for the 24-h SWH forecast. The RMSE of the SWH prediction models had a clear spatial distribution: it was smaller in the deep-water region far from the shore and larger in the shallow-water region along the coast. The RMSE of the SWH prediction models was also spatially and temporally correlated with extreme weather, being larger in the vicinity of the typhoon path during typhoon periods. In addition, $\mathrm{Skill}_{RMSE}^{AC}$ was 27.5% for the period of typhoon transit and 19.3% for the period of no typhoon transit, which implies that the correlation between SWH and the preceding U*, Φ, and θ was greater during typhoon transit. Because the training data have larger errors during typhoon transit than in periods without typhoons, the forecast errors of Model C showed a similar pattern to those of the SWH from WWIII.
This study has several potential points for improvement. When multi-element training and input data are used, the optimal input data time span and training dataset size for Models B and C probably differ from the parameter values identified for Model A, which requires further investigation in subsequent work. In addition, this study was limited to SWH prediction; more diverse physical elements can be added as training and input data in subsequent work to achieve multi-element prediction, such as simultaneous prediction of wave direction and average wave period.

Institutional Review Board Statement: Not applicable.