Machine Learning for Short-Term Prediction of Ship Motion Combined with Wave Input

Abstract: There is a response relationship between waves and ship motion. Based on the LSTM neural network, a mapping between the wave elevation and the ship roll motion is established. Wave elevation and ship motion time-history data obtained from CFD simulation are used to predict the ship roll motion with different input data schemes. The results show that the prediction scheme that includes the wave elevation input can predict the ship roll motion. Compared with the direct prediction scheme based only on the roll data input, it greatly improves both the prediction accuracy and the effective advance prediction time. Different wave elevation inputs have different prediction effects. The advance prediction duration increases with the distance between the wave elevation measurement position and the ship. The simultaneous input of multi-point wave elevation greatly increases the amount of data, allowing the trained model to utilize a greater data depth. This not only improves the advance prediction duration of the prediction model but also enhances the robustness of the model, making the prediction results more stable.


Introduction
Ships undergo random motion under the action of irregular waves. Violent ship motion endangers navigation and operation. Waves are the main excitation of the oscillatory motion of a ship. If the mapping between the wave elevation near the ship and the ship motion can be established, the ship motion can be predicted over the very short term. The motion attitude for a period of time in the future is then known in advance, which helps ensure the safety of the ship's operation at sea.
The very short-term prediction of ship motion refers to the prediction of a ship's motion in the next few seconds to tens of seconds. Triantafyllou and Bodson (1983) [1] used the Kalman filter (KF) method to study the very short-term prediction of ship motion. Based on the characteristics of the ship motion and the corresponding assumptions, mechanical principles were used to analyze the forces on the ship. The state equation of the ship sway motion was derived to obtain a multi-step ship motion predictor based on the KF method. The KF method needs an accurate ship motion state equation. Because the sea state often changes, the hydrodynamic parameters are not stable, which makes it difficult to obtain an accurate state equation. Zhao et al. (2004) [2] used minor component analysis (MCA) to predict ship motion a long time ahead with consistent accuracy; the prediction error was almost the same for the 5 s and 20 s predictions. Li et al. (2017) [3] used a nonlinear autoregressive exogenous (NARX) network combining 29 ship state attributes to predict heading, roll, and pitch. Their study showed that modeling and analysis based on the NARX network was helpful in generating a data-driven model for ship motion prediction. Suhermi et al. (2018) [4] combined the deep neural network (DNN) and the auto-regressive (AR) model to predict and temporal pattern attention mechanism (TPA). Compared with the LSTM model and the SVM model, the combined model had a significant reduction in errors. Later, Wang et al. (2021) [20] proposed the single input single output (SISO) and multiple input single output (MISO) methods to predict ship motion based on deep learning. The methods achieved good prediction accuracy. Table 1 summarizes the relevant machine learning models for the short-term prediction of ship motion.
At present, most scholars use the LSTM neural network to predict ship motion in the very short term based mostly on the ship motion data themselves, while there are few reports on ship motion prediction based on wave elevation data. In fact, during ship operation at sea, especially at zero speed, waves are the main excitation of ship motion. The wave itself has a memory effect. If the ship motion can be predicted in the very short term by using the wave elevation data as an input feature, the prediction duration and prediction accuracy can, in theory, be effectively improved. To verify the effectiveness of this scheme, this paper uses the LSTM neural network model combined with wave elevation and ship motion time-history data to study the very short-term prediction of ship roll motion.

LSTM Theory
The LSTM unit is composed of a forget gate, an input gate, an output gate, and a unit state. The unit structure is shown in Figure 1. At the current time t, the LSTM network has three inputs: the input value x_t at the current time (such as the wave elevation), the output value h_{t−1} at the previous time (such as the roll motion), and the unit state C_{t−1} at the previous time. It has two outputs: the output value h_t and the unit state C_t at the current time. Through the activation function σ, the LSTM controls the three gates so as to preserve or forget historical information. The mathematical expressions of the gating structure are described in detail in [21].
The forget gate determines how much of the previous unit state C_{t−1} is retained at the current time. Its mathematical expression is Equation (1), where W_f is the weight matrix and b_f is the bias vector. The gate reads the two inputs x_t and h_{t−1} and applies the activation function σ (the sigmoid function), which outputs a value between 0 and 1 for C_{t−1}, where 0 represents complete rejection of C_{t−1} and 1 represents complete retention of C_{t−1}.
The input gate determines how much new information is saved to the unit state C_t. The gate updates the unit state in two steps. The first step determines which information needs to be updated through Equation (2) and calculates the temporary state at the current time through Equation (3). The second step obtains the new unit state C_t at the current time through Equation (4).
The output gate controls how much of the unit state C_t is output to h_t. The gate applies the activation function σ through Equation (5) to determine the output part o_t of the unit state C_t, and then determines the final output value h_t through Equation (6).
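For reference, Equations (1)-(6) can be written in the standard LSTM form, which matches the gate descriptions in this section. This is a sketch under assumptions: only W_f, b_f, σ, x_t, h_{t−1}, C_{t−1}, C_t, o_t, and h_t are named in the text. The symbols i_t and \tilde{C}_t, the weights W_i, W_C, W_o, the biases b_i, b_C, b_o, the concatenation [h_{t−1}, x_t], and the tanh activation are taken from the common formulation and are not confirmed by the source.

```latex
\begin{align}
f_t &= \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \tag{1} \\
i_t &= \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \tag{2} \\
\tilde{C}_t &= \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right) \tag{3} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \tag{4} \\
o_t &= \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \tag{5} \\
h_t &= o_t \odot \tanh(C_t) \tag{6}
\end{align}
```

Here \odot denotes element-wise multiplication, so a forget-gate value of 0 removes the corresponding component of C_{t−1} and a value of 1 keeps it unchanged.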

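The sequence of gate operations described in this section can be traced numerically with a minimal scalar sketch. It is illustrative only: the function name `lstm_step`, the dictionaries `weights` and `biases`, the numerical values, and the use of tanh for the temporary state follow the common LSTM formulation and are assumptions, not the configuration used in this study.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, w, b):
    """One scalar LSTM step.

    w[name] is a pair (weight on h_prev, weight on x_t) and b[name] is the
    bias of gate `name`; these containers are illustrative assumptions.
    """
    def pre(name):
        # Pre-activation shared by all gates: weighted inputs plus bias.
        return w[name][0] * h_prev + w[name][1] * x_t + b[name]

    f_t = sigmoid(pre("f"))             # forget gate: share of c_prev to keep
    i_t = sigmoid(pre("i"))             # input gate: share of candidate to add
    c_tilde = math.tanh(pre("c"))       # temporary (candidate) state
    c_t = f_t * c_prev + i_t * c_tilde  # new unit state
    o_t = sigmoid(pre("o"))             # output gate
    h_t = o_t * math.tanh(c_t)          # new output value
    return h_t, c_t


# Hypothetical weights, chosen only to exercise the function.
weights = {"f": (0.5, 0.1), "i": (0.4, 0.3), "c": (0.2, 0.7), "o": (0.6, 0.2)}
biases = {"f": 0.0, "i": 0.0, "c": 0.0, "o": 0.0}

h, c = 0.0, 0.0
for wave_elevation in [0.01, 0.03, -0.02]:
    h, c = lstm_step(wave_elevation, h, c, weights, biases)
```

The loop carries h and C from one time step to the next, which is how the unit retains information about earlier wave elevations.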

The main process of the ship motion prediction model based on the LSTM neural network includes data preprocessing, model parameter setting, model training, prediction, and evaluation of the prediction results.
Normalization reduces the impact of differences in data magnitude, accelerates convergence, and improves computational efficiency. All data in this study are normalized before LSTM training. After the prediction, the data are restored by inverse normalization. The normalization method uses the extreme values of the data, where x* is the normalized value of x, and x_max and x_min are the maximum and minimum values of x, respectively.
Appl. Sci. 2023, 13, 5298 5 of 12
The LSTM neural network can accept multi-feature input, as shown in Figure 2. Training requires establishing the mapping between the input time-history data x_{ti} and the output time-history data y_t, where i represents the number of input feature types. First, a sliding window is applied to the input data to form the input vector group X. The window length j is the number of input steps. The output vector group Y contains the output y_{j+n} corresponding to each window, where n is the number of advance prediction steps and controls the advance prediction duration. The advance prediction duration is n times the sampling interval of the sample data. The wave elevation input data are used to predict the ship motion. When n = 1, the prediction model predicts the ship motion attitude at the current time. When n > 1, the prediction model predicts the ship motion attitude corresponding to the n-th data moment in the future.
The maximum absolute error (MAE) and the root mean square error (RMSE) are used to evaluate the forecast results. The MAE represents the error at the data peak points, while the RMSE represents the overall prediction error.
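The preprocessing described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes min-max scaling to the range [0, 1], which is the most common form consistent with the definitions of x_max and x_min, and all function names are hypothetical. The window covers j consecutive samples and is paired with the output n steps after the last sample of the window, so the first window (samples 1 to j) maps to y_{j+n}.

```python
def minmax_fit(values):
    """Return (x_min, x_max) of a series, used for scaling and its inverse."""
    return min(values), max(values)


def minmax_scale(values, x_min, x_max):
    """Scale to [0, 1]; this particular range is an assumption."""
    span = x_max - x_min
    return [(v - x_min) / span for v in values]


def minmax_restore(scaled, x_min, x_max):
    """Inverse of minmax_scale, restoring physical units after prediction."""
    span = x_max - x_min
    return [s * span + x_min for s in scaled]


def make_windows(features, target, j, n):
    """Build input windows X and advance targets Y.

    features: list of equally long series, one per input feature type.
    target:   series to predict (for example roll).
    j:        window length (number of input steps).
    n:        number of advance prediction steps.
    """
    length = len(target)
    X, Y = [], []
    for s in range(length - j - n + 1):
        # One row per time step, one column per feature.
        window = [[series[t] for series in features] for t in range(s, s + j)]
        X.append(window)
        # Target lies n steps after the last sample of the window.
        Y.append(target[s + j - 1 + n])
    return X, Y


# Tiny placeholder series, for illustration only (not CFD data).
roll = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
wave = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
X, Y = make_windows([roll, wave], roll, j=3, n=2)

lo, hi = minmax_fit(wave)
wave_scaled = minmax_scale(wave, lo, hi)
wave_back = minmax_restore(wave_scaled, lo, hi)
```

Fitting x_min and x_max on the training portion only and reusing them for validation avoids leaking information from the validation data; the text does not state which choice was made.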
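The two error measures can be computed as in the short sketch below. Note that MAE here follows the definition given in the text (the maximum absolute error), not the mean absolute error that the same abbreviation often denotes elsewhere. The function names and the sample values are hypothetical.

```python
import math


def max_absolute_error(actual, predicted):
    """Largest pointwise absolute deviation (MAE as defined in the text)."""
    return max(abs(a - p) for a, p in zip(actual, predicted))


def root_mean_square_error(actual, predicted):
    """Square root of the mean squared deviation (RMSE)."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical roll angles in degrees, for illustration only.
sample = [1.0, -2.0, 3.0, -1.0]
forecast = [1.5, -2.0, 2.0, -1.0]
mae = max_absolute_error(sample, forecast)
rmse = root_mean_square_error(sample, forecast)
```

Because the maximum is dominated by the single worst point, it is sensitive to misses at roll peaks, whereas the RMSE averages the error over the whole record.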

Ship Motion Data
In this study, the motion of a ship in irregular waves is simulated by the CFD method. Time-history data of the ship motion and of the wave elevation at different distances from the ship's side are obtained for the study of the short-term prediction of ship motion. The test ship is DTMB5415 at a scale ratio of 1/51. The ship parameters are listed in Table 2. The simulated condition is the roll and heave motion of the ship at zero speed in beam waves at sea state 5. The wave spectrum is the Pierson-Moskowitz spectrum. The significant wave height at model scale is 0.078 m, and the average wave period is 0.98 s.

The CFD solver is STAR-CCM+. The computational model solves the Reynolds-averaged Navier-Stokes (RANS) equations combined with the shear stress transport (SST) k-ω turbulence model. The water-air interface is captured using the volume of fluid (VOF) method with a second-order high-resolution interface capturing (HRIC) scheme. Mesh updates caused by the ship motion are handled using the dynamic overlapping (overset) mesh technique. The irregular wave is generated through the velocity inlet boundary condition. A damping source method is used to eliminate wave reflection from the open boundary. Time stepping uses a second-order implicit scheme. The semi-implicit method for pressure-linked equations (SIMPLE) algorithm couples the pressure and velocity fields. The computational grid is generated by the trimmer mesher. The boundary layer on the hull surface is resolved by a prism layer grid. The thickness of the first grid layer adjacent to the hull surface is 2 mm, which keeps the Y+ value on the ship's wetted surface within 100. To ensure accurate wave generation and propagation, the grid near the wave surface is refined along the wave height and wavelength directions. The numbers of grid cells per average wave height and per average wavelength are 6 and 40, respectively. In addition, the grid is locally refined around the border between the background grid and the overlapping grid. The computational time step is 0.001 s, approximately 1/1000 of the average wave period.
Four groups of data are monitored in the CFD simulation. One group is the ship roll motion. The other three are the wave elevations at 1 m, 3 m, and 5 m from the ship's side in the direction of the incoming wave. Figure 3 shows the ship roll time history. Figure 4 shows the wave elevation time history at 3 m from the ship's side. Each group contains 5000 data points with a sampling interval of 0.04 s, giving a total duration of 200 s. The first 4000 data points are used for model training. The data for model validation start from the 4500th data point.
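The data partition described above can be expressed as a short sketch. The constants come from the text; the function name `split_series` is hypothetical, and reading "the 4500th" as a 1-based position is an assumption. The points between the training block and the start of validation are left unused here because the text does not say how they are handled.

```python
SAMPLING_INTERVAL_S = 0.04  # from the text
N_SAMPLES = 5000            # points per monitored group, from the text
N_TRAIN = 4000              # first 4000 points are used for training
VALIDATION_START = 4500     # validation starts from the 4500th point


def split_series(series):
    """Return (training, validation) slices of one monitored series.

    Assumption: "4500th" is a 1-based position, so the validation slice
    starts at 0-based index 4499 and runs to the end of the record.
    """
    training = series[:N_TRAIN]
    validation = series[VALIDATION_START - 1:]
    return training, validation


# Placeholder standing in for one CFD channel (roll or a wave gauge).
placeholder = [0.0] * N_SAMPLES
train, valid = split_series(placeholder)
total_duration_s = N_SAMPLES * SAMPLING_INTERVAL_S
```

Keeping the split chronological, with validation strictly after training, matches the forecasting setting in which future motion is predicted from past records.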

Prediction Solely Using Roll Data
Based on the LSTM model, the roll motion time-history data are used as the only input feature to directly predict the roll motion at different advance times. The number of advance forecast steps n is tested from 5 to 70. With a sampling interval of 0.04 s, the advance forecast duration at model scale ranges from 0.2 s to 2.8 s, which corresponds to 1.4 s to 20 s at full scale.
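The conversion between advance steps, model-scale duration, and full-scale duration can be checked with a few lines. The sketch assumes Froude similarity, under which time scales with the square root of the length scale ratio; this assumption is not stated in the text but is consistent with the quoted values, since the square root of 51 is about 7.14. The function names are hypothetical.

```python
import math

SAMPLING_INTERVAL_S = 0.04  # model-scale sampling interval, from the text
SCALE_RATIO = 51.0          # full-scale length / model length (1/51 model)


def advance_duration_model(n_steps):
    """Advance prediction duration at model scale, in seconds."""
    return n_steps * SAMPLING_INTERVAL_S


def advance_duration_full(n_steps):
    """Full-scale duration, assuming Froude time scaling by sqrt(scale)."""
    return advance_duration_model(n_steps) * math.sqrt(SCALE_RATIO)


for n in (5, 70):
    print(n, round(advance_duration_model(n), 2), round(advance_duration_full(n), 1))
```

For n = 5 and n = 70 this gives approximately 1.4 s and 20 s at full scale, in agreement with the range stated above.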
The roll prediction errors based solely on the roll data are shown in Table 3. Figure 5 compares the predicted values with the sample data in the time series for advance prediction steps n of 5, 15, 40, and 70. The prediction accuracy is best when n is 5, where the prediction results basically match the sample values. When n is 70, the accuracy is poor, and there is a large deviation between the predicted and sample values. This is because the correlation within the time series weakens as the advance forecast duration increases, so the prediction accuracy gradually declines. The results show that the number of effective advance prediction steps using only the roll input cannot exceed 70.

Prediction Combining with Single-Wave Data
The ship roll data and the wave elevation data are combined to predict the roll motion at different advance times. The advance prediction steps n based on the wave elevation input at 1 m from the ship's side are 5, 10, 15, 20, 30, 40, 50, 60, and 70. With the wave elevation input at 3 m from the ship's side, the number of advance prediction steps is extended to 120. With the wave elevation input at 5 m from the ship's side, it is extended to 160. The errors of the prediction results with the different input schemes are summarized in Table 4, which also lists the results based solely on the ship's own roll data. Figures 6-8 compare the time series of the roll prediction with the sample data.

Table 4. Roll prediction combining with single-wave data (unit: deg).

Compared with the prediction results based solely on the roll input, the overall prediction accuracy is greatly improved when the wave elevation data at 1 m from the ship's side are also taken as an input feature. Because the wave is the excitation of the ship rolling, the wave and roll data are related. When roll data and wave elevation data are both used as input features for training, the LSTM can mine more effective features for roll prediction. A comparison of the prediction results for the three wave data inputs shows that the farther the wave measurement point is from the ship, the longer the effective advance prediction time of the ship roll, although the corresponding prediction accuracy gradually declines. On the one hand, because of the memory effect of the waves, a wave measured farther from the ship needs more time to propagate to the ship, which effectively extends the prediction duration. On the other hand, the response relationship between the wave and the ship motion weakens with increasing distance, which reduces the prediction accuracy. Combined with the wave data at 5 m from the ship, the number of advance steps for ship roll prediction can reach 120.
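The propagation argument can be made concrete with a rough estimate of how long a wave takes to travel from each gauge to the ship. This is only a back-of-envelope sketch under assumptions that are not stated in the text: linear deep-water waves, a single representative period equal to the 0.98 s average wave period, and travel distances equal to the gauge distances of 1 m, 3 m, and 5 m. Under these assumptions the phase speed is gT/(2π) and the group speed is half of that. An irregular sea contains many periods, so the result indicates an order of magnitude only and is not a limit on the prediction horizon.

```python
import math

G = 9.81                    # gravitational acceleration, m/s^2
MEAN_PERIOD_S = 0.98        # average wave period at model scale, from the text
SAMPLING_INTERVAL_S = 0.04  # sampling interval, from the text


def phase_speed(period_s):
    """Deep-water phase speed c = g T / (2 pi), linear wave theory."""
    return G * period_s / (2.0 * math.pi)


def group_speed(period_s):
    """Deep-water group speed, half the phase speed."""
    return 0.5 * phase_speed(period_s)


def travel_steps(distance_m, speed_m_s):
    """Travel time over distance_m expressed in sampling steps."""
    return distance_m / speed_m_s / SAMPLING_INTERVAL_S


for distance in (1.0, 3.0, 5.0):
    steps_phase = travel_steps(distance, phase_speed(MEAN_PERIOD_S))
    steps_group = travel_steps(distance, group_speed(MEAN_PERIOD_S))
    print(distance, round(steps_phase), round(steps_group))
```

The estimated travel time grows linearly with the gauge distance, which is in line with the longer effective advance prediction time observed for the more distant gauges.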

Prediction Combining with Multiple-Wave Data
In order to explore the relationship between waves and ship motions, the effect of using multiple-wave elevation data as input features on the roll motion prediction is studied. Based on the three sets of wave elevation data at 1 m, 3 m, and 5 m from the ship's side, three different wave elevation schemes are each combined with the roll data as input features for roll motion prediction. The three wave elevation input schemes use the data at 1 m and 3 m; at 3 m and 5 m; and at 1 m, 3 m, and 5 m. Figures 9-11 compare the time series of the ship roll predicted with the different input schemes against the sample values. The roll prediction errors are summarized in Table 5.
The prediction results based on multiple-wave data are more stable than those based on single-wave data. When the input scheme combines the wave elevation data at 1 m and 3 m from the ship, the prediction accuracy is lower than that of the scheme using only the wave elevation at 1 m for small advance prediction steps (n ≤ 30), but it is significantly higher for large advance prediction steps (n ≥ 40). Compared with the scheme using only the wave elevation at 3 m, the two-wave scheme is much more accurate even when the number of advance prediction steps is small, and as the number of prediction steps increases, the accuracies of the two schemes become equivalent. When the input scheme combines the wave data at 1 m, 3 m, and 5 m from the ship, the stability of the prediction results improves to a certain extent compared with the two-wave schemes, and the prediction accuracy improves significantly, especially when the advance prediction step is small. With the three-wave input scheme, the number of advance steps for the ship roll prediction can reach 150. Because the wave elevations at different locations have different optimal advance prediction durations, the LSTM neural network can extract suitable weights for the multiple wave elevations, so the stability of the prediction improves as a whole across different advance prediction durations.
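The input schemes compared in this study differ only in which wave gauges are stacked next to the roll series. A minimal sketch of assembling the per-time-step feature rows is given below; the dictionary `SCHEMES`, the function `build_features`, and the placeholder numbers are hypothetical and serve only to show the data layout.

```python
# Distances (m) of the wave gauges used by each input scheme in the text.
SCHEMES = {
    "roll only": (),
    "roll + wave 1 m": (1,),
    "roll + wave 3 m": (3,),
    "roll + wave 5 m": (5,),
    "roll + waves 1 m, 3 m": (1, 3),
    "roll + waves 3 m, 5 m": (3, 5),
    "roll + waves 1 m, 3 m, 5 m": (1, 3, 5),
}


def build_features(roll, waves_by_distance, gauge_distances):
    """Return per-time-step feature rows [roll, wave_a, wave_b, ...].

    waves_by_distance maps a gauge distance in metres to its series;
    this container is an illustrative assumption.
    """
    series = [roll] + [waves_by_distance[d] for d in gauge_distances]
    # zip(*series) transposes from one list per feature to one row per step.
    return [list(row) for row in zip(*series)]


# Tiny placeholder series, for illustration only (not CFD data).
roll = [0.1, 0.2, 0.3]
waves = {1: [1.0, 1.1, 1.2], 3: [3.0, 3.1, 3.2], 5: [5.0, 5.1, 5.2]}

rows = build_features(roll, waves, SCHEMES["roll + waves 1 m, 3 m, 5 m"])
```

Each added gauge widens every row by one column, so the three-wave scheme feeds four features per time step into the network, compared with one for the roll-only scheme.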

Conclusions
Based on the LSTM neural network, this paper studies the very short-term prediction of the ship roll motion considering the wave elevation input. The influence of the wave data input on the prediction accuracy and the advance prediction duration is analyzed. Using the wave elevation and ship motion data, the feasibility of considering the wave elevation input to predict the ship roll motion is verified. The following conclusions are obtained.
(1) Compared with the prediction results solely based on the roll data input, the roll motion prediction considering the wave elevation input greatly improves both the prediction accuracy and the advance prediction duration;
(2) For single-wave data input, the LSTM model can mine the mapping relationship between the advance prediction duration and the distance of the wave height measurement points. The effective advance prediction time increases with the increase in the distance from the wave gauge to the ship;
(3) Compared with the prediction results of the single-wave data input, the prediction of the multiple-wave data input has better stability for different advance times. As the amount of data increases, the overall advance prediction duration of the LSTM model improves, and the model also has better robustness for larger advance prediction durations.