Identiﬁcation and Prediction of Ship Maneuvering Motion Based on a Gaussian Process with Uncertainty Propagation

: Maritime transport plays a vital role in economic development. To establish a vessel scheduling model, accurate ship maneuvering models should be used to optimize the strategy and maximize the economic beneﬁts. The use of nonparametric modeling techniques to identify ship maneuvering systems has attracted considerable attention. The Gaussian process has high precision and strong generalization ability in ﬁtting nonlinear functions and requires less training data, which is suitable for ship dynamic model identiﬁcation. Compared with other machine learning methods, the most obvious advantage of the Gaussian process is that it can provide the uncertainty of prediction. However, most studies on ship modeling and prediction do not consider the uncertainty propagation in Gaussian processes. In this paper, a moment-matching-based approach is applied to address the problem. The proposed identiﬁcation scheme for ship maneuvering systems is veriﬁed by container ship simulation data and experimental data from the Workshop on Veriﬁcation and Validation of Ship Maneuvering Simulation Methods (SIMMAN) database. The results indicate that the identiﬁed model is accurate and shows good generalization performance. The uncertainty of ship motion prediction is well considered based on the uncertainty propagation technology.


Introduction
Maritime transport plays a positive role in promoting the sustainable development of the country's economy [1], and it is also directly related to environmental pollution [2]. Accurate maritime traffic simulators (MTS) can provide an effective basis for the port and route planning and management [3], and can help liner shipping companies arrange vessel schedule efficiently [4,5]. Moreover, an accurate ship maneuvering model is of great practical value for ship trajectory prediction and controller design [6]. With the rapid development of maritime autonomous surface ships (MASSs) [7], autonomous navigation and collision avoidance systems require a more intelligent digital maneuvering model, which can predict the future dynamics of ships and estimate the uncertainty caused by the actions to be performed.
Modeling techniques for ship dynamic models involve parametric modeling and nonparametric modeling. Parametric modeling must define a complete mathematical structure in advance from a physical viewpoint and subsequently estimate the hydrodynamic derivatives through parameter identification techniques. Classic system identification methods are widely used for hydrodynamic parameter identification, such as least square estimation [8], the recursive prediction error (RPE) method [9]. However, the traditional methods are sensitive to noise, and the multicollinearity will significantly affect the identification accuracy [10]. Over the decades, a great number of new methods have been proposed to solve the above problems. Yoon and Rhee used ridge regression to suppress the parameter drift due to multicollinearity [11]. Revestido Herrero and Velasco Gonzalez proposed a Section 3. In Section 4, the identification scheme of the ship and experimental example are presented to demonstrate the applicability of the proposed method. Section 5 summarizes the study with conclusions.

Ship Nonparametric Dynamic Model
For a surface ship, the dynamic model is usually described by a 3-DOF model, including the motion of surge, sway and yaw. Figure 1 shows the coordinate system of a surface ship maneuvering motion, including the Earth-fixed coordinates O − X 0 Y 0 and body-fixed coordinates o − x 0 y 0 . Here, u, v, and r are the state variables of surge velocity, sway velocity, and yaw rate, respectively, while δ is the rudder angle and ψ is the heading angle.
are taken as the study object. The identified models are assessed by the prediction error with other motion data not included in the training set.
The remainder of the paper is organized as follows. Section 2 describes the nonparametric ship dynamic model. The algorithms of GP with uncertain input are depicted in Section 3. In Section 4, the identification scheme of the ship and experimental example are presented to demonstrate the applicability of the proposed method. Section 5 summarizes the study with conclusions.

Ship Nonparametric Dynamic Model
For a surface ship, the dynamic model is usually described by a 3-DOF model, including the motion of surge, sway and yaw. Figure 1 shows the coordinate system of a surface ship maneuvering motion, including the Earth-fixed coordinates − and body-fixed coordinates − . Here, , , and are the state variables of surge velocity, sway velocity, and yaw rate, respectively, while is the rudder angle and is the heading angle. The ship maneuvering system is a nonlinear autoregressive model with an exogenous input (NARX) system [36], and the outputs at the next moment are based on the previous state variables. Figure 2 illustrates the modeling and prediction process of the ship dynamic model. In the first stage, ship motion data are collected by onboard sensors such as IMU and GPS. After data preprocessing, the machine learning technique is used to fit the surrogate time series model. Finally, other motions can be predicted through the learned model. The symbol "~" represents random variables.  The ship maneuvering system is a nonlinear autoregressive model with an exogenous input (NARX) system [36], and the outputs at the next moment are based on the previous state variables. Figure 2 illustrates the modeling and prediction process of the ship dynamic model. In the first stage, ship motion data are collected by onboard sensors such as IMU and GPS. After data preprocessing, the machine learning technique is used to fit the surrogate time series model. Finally, other motions can be predicted through the learned model. The symbol "~" represents random variables.
are taken as the study object. The identified models are assessed by the prediction error with other motion data not included in the training set.
The remainder of the paper is organized as follows. Section 2 describes the nonparametric ship dynamic model. The algorithms of GP with uncertain input are depicted in Section 3. In Section 4, the identification scheme of the ship and experimental example are presented to demonstrate the applicability of the proposed method. Section 5 summarizes the study with conclusions.

Ship Nonparametric Dynamic Model
For a surface ship, the dynamic model is usually described by a 3-DOF model, including the motion of surge, sway and yaw. Figure 1 shows the coordinate system of a surface ship maneuvering motion, including the Earth-fixed coordinates − and body-fixed coordinates − . Here, , , and are the state variables of surge velocity, sway velocity, and yaw rate, respectively, while is the rudder angle and is the heading angle. The ship maneuvering system is a nonlinear autoregressive model with an exogenous input (NARX) system [36], and the outputs at the next moment are based on the previous state variables. Figure 2 illustrates the modeling and prediction process of the ship dynamic model. In the first stage, ship motion data are collected by onboard sensors such as IMU and GPS. After data preprocessing, the machine learning technique is used to fit the surrogate time series model. Finally, other motions can be predicted through the learned model. The symbol "~" represents random variables.  According to the relevant studies of nonparametric ship dynamic modeling [25,34], the formulation of the ship discrete nonparametric model is as follows: The selected regressors of the GP are inspired by parametric models, including the Abkowitz [37] and Maneuvering Modeling Group (MMG) models [38]. The ship position variables can be obtained as follows:

Gaussian Process with Deterministic Input
The following notation with a set of training data is defined: where x t is an input vector, and output y t is given by A standard GP is a collection of random variables and can be considered a collection of random variables with a joint Gaussian distribution for any finite subject [39]. GP is specified by the mean function m(x) and covariance function k(x, x ) as where E is the expectation operator. Then, the GP can be written as: The proposed model adopts the commonly used squared exponential covariance function: where σ f and Λ are the amplitude and squared length-scale hypermeters, respectively. Bayesian inference can be defined as the process of fitting a posterior probability model from a prior model with a set of training data D. The GP prior is given as: With these modeling assumptions in place, the likelihood function can be obtained, Then, combining the prior Equation (9) and the likelihood function Equation (10) with the Bayesian rule, the posterior probability distribution and predicted function values f * can be calculated, at a set of deterministic inputs X * .
which leads to the RGP regression predictive equations, where the predictive mean and variance function are specified by: The hyperparameters in Equation (8) are usually obtained by maximizing the log of the marginal likelihood function. It is defined as [36]: This nonlinear and nonconvex optimization problem is usually solved by the gradient ascent-based methods, such as BFGS and the conjugate gradient (CG) algorithm [36].

Prediction with Uncertain Inputs and Uncertainty Propagation
In Equation (11), we assume that the input is deterministic, while the output is Gaussian distributed. This assumption is true in the one-step prediction. For multiple-step predictions, the traditional method is to recycle the one-step prediction. However, the uncertainties induced by each successive prediction cannot be ignored in the time-series tasks [40].
The uncertainty propagation problem can be addressed assuming that the input follows a Gaussian distribution [41]: For convenience of expression and marking, the input variables are divided into speed variable x and control variable u, as X * = [x, u]. The mean and variance are given as: The predictive distribution can be obtained by integrating over the input: However, this integration cannot be analytically computed, since the Gaussian distribution is mapped through a nonlinear function. Taylor approximation [40] or moment matching [42] is commonly used to approximate the integration. The computation of the Taylor approximation is expensive because it accounts for the gradient of the posterior mean and variance of the input. In this paper, moment matching is chosen. The moment matching method assumes that the unknown distribution only has two parameters: the mean and the variance. The mean and variance at an uncertain input can be computed by the laws of iterated expectations and conditional variances [42]: Then, the predicted mean and variance at time t + 1 are given as:

Simulation Study on a Container Ship
The first case uses the simulation maneuvers of a large container ship. The selected parametric numerical model is a nonlinear 4-degree-of-freedom (DOF) dynamic model [43]. The main particulars of the container ship are listed in Table 1. The model has been verified by experiments, which can well reflect the complex dynamic characteristics of the container ship. It is widely used in the testing of system identification algorithms [33,44]. The parametric maneuvering model is given as follows [43]: where m denotes the ship mass, W is the weight of the displaced water, GM is the metacenter height. I x and I z denote the moments of inertia of the ship about the x 0 , z 0 axes. X . u , Y. v , Y. r , N. v , and N. r are acceleration derivatives which can be determined using potential theory. F X F Y F K , and F N are the forces and moment disturbing quantity at x 0 -axis, y 0 -axis, and z 0 -axis, respectively. The nonlinear forces and moments are composed of Taylor expansion of hydrodynamic coefficient and speed.
With the 4-DOF model, 2 groups of maneuvers, including 10 • /10 • and 20 • /20 • zigzag tests, are undertaken under the following initial conditions: u 0 = 7 m/s, v 0 = 0 m/s, r 0 = 7 m/s, δ 0 = 0 rad and the propeller velocity is fixed at 70 rpm. Each maneuver lasted for 850 s, and the simulation interval was 0.1 s. The timeseries of the yaw velocity and rudder angle are shown in Figure 3.  [45] with BFGS and CG algorithms is used to train the Gaussian process. The training process took 9 s in total and trained 3 GPs in Equation (1). The optimization parameter settings of each GP are listed in Table 2. Generalization verification is necessary for system identification. The ability of the identified model to predict other motions not included in the training data is called generalization. The generalization performance of the trained model is verified by predicting the motions, including the 15°/15° zigzag maneuver and port 30° turning circle test. The prediction results of the 15°/15° zigzag test are shown in Figure 4, where the predictions are compared with the raw data. The prediction results are consistent with the raw data. The prediction variance is also plotted in the figure and shows that the uncertainty is small. The prediction results of the 30° turning circle test are shown in Figure 5. The prediction results can well track the tendency of the raw data. The uncertainty in Figure 5 is much higher than that in Figure 4 because the training data, including 10°/10° and 20°/20° zigzag tests, completely reflect the dynamic characteristics of the ship when the rudder angle is less than 20°. However, the dynamic characteristics of a rudder angle of 30° are not included in the information range provided by the training data. Under this condition, the identified model can predict the motion of a large rudder angle with good generalization ability and maintains good accuracy while providing high uncertainty of prediction through the proposed method.
The root mean square error (RMSE) is applied to evaluate the model performance and is defined as: where is the prediction result, and is the raw value. The RMSEs of the 15°/15° zigzag maneuver are listed in Table 3 with each DOF. With 2 s as the time interval of the training data, 850 points of the above training data are used to train the GP hyperparameters with maximum likelihood. All calculations are performed in MATLAB R2020a with 4.0 GHz CPU and 16 GB RAM. PILCO toolbox [45] with BFGS and CG algorithms is used to train the Gaussian process. The training process took 9 s in total and trained 3 GPs in Equation (1). The optimization parameter settings of each GP are listed in Table 2. Generalization verification is necessary for system identification. The ability of the identified model to predict other motions not included in the training data is called generalization. The generalization performance of the trained model is verified by predicting the motions, including the 15 • /15 • zigzag maneuver and port 30 • turning circle test. The prediction results of the 15 • /15 • zigzag test are shown in Figure 4, where the predictions are compared with the raw data. The prediction results are consistent with the raw data. The prediction variance is also plotted in the figure and shows that the uncertainty is small. The prediction results of the 30 • turning circle test are shown in Figure 5. The prediction results can well track the tendency of the raw data. The uncertainty in Figure 5 is much higher than that in Figure 4 because the training data, including 10 • /10 • and 20 • /20 • zigzag tests, completely reflect the dynamic characteristics of the ship when the rudder angle is less than 20 • . However, the dynamic characteristics of a rudder angle of 30 • are not included in the information range provided by the training data. Under this condition, the identified model can predict the motion of a large rudder angle with good generalization ability and maintains good accuracy while providing high uncertainty of prediction through the proposed method.
The root mean square error (RMSE) is applied to evaluate the model performance and is defined as: whereŷ i is the prediction result, and y i is the raw value. The RMSEs of the 15 • /15 • zigzag maneuver are listed in Table 3 with each DOF.           There is some interference and noise in the experimental dataset. Directly taking the speed as the input and output will reduce the identification accuracy. Using the acceleration obtained by the speed difference as the prediction output can effectively alleviate the influence of noise [29], and the new flowchart is shown in Figure 6.

Experimental Study of a Ship Scale Model
The experimental dataset of KCLCC2 from SIMMAN is used to further validate the proposed method. KVLCC2 is a scale model of large tankers. The main particulars of the scale ship model are detailed in Table 4. The model free-running tests are performed by the Hamburg Ship Model Basin (HVSA). There is some interference and noise in the experimental dataset. Directly taking the speed as the input and output will reduce the identification accuracy. Using the acceleration obtained by the speed difference as the prediction output can effectively alleviate the influence of noise [29], and the new flowchart is shown in Figure 6. To include more dynamic characteristic information of the ship, the experimental data of 10°/5°, 20°/5° and 30°/5° zigzag maneuvers are used to train the GP. Moreover, the empirical Bayes method [46] is applied here to reduce the noise of acceleration with 'wdenoise' MATLAB function. More details of the empirical Bayes denoising method for ship motion data can be found in our previous study [47]. Figure 7 shows the effect of noise reduction. There are 800 training points with a time interval of 0.6 s. It took 10.1 s to train 3 GPs for the ship maneuvering system, and the obtained hyperparameter settings are shown in Table 5. To include more dynamic characteristic information of the ship, the experimental data of 10 • /5 • , 20 • /5 • and 30 • /5 • zigzag maneuvers are used to train the GP. Moreover, the empirical Bayes method [46] is applied here to reduce the noise of acceleration with 'wdenoise' MATLAB function. More details of the empirical Bayes denoising method for ship motion data can be found in our previous study [47]. Figure 7 shows the effect of noise reduction. There are 800 training points with a time interval of 0.6 s. It took 10.1 s to train 3 GPs for the ship maneuvering system, and the obtained hyperparameter settings are shown in Table 5. i. Eng. 2021, 9, x FOR PEER REVIEW 10 of 15 Figure 7. Raw data and denoising data of the sway motion by the empirical Bayes method. Then, the identified model is validated by comparing the experimental data with predictions of 15°/5°, 35°/5° and 10°/10° zigzag maneuvers, as shown in Figures 8-10. The accuracy of the prediction speed assessed by RMSE is listed in Table 6. In the similar study [29], the same three sets of training data were used for training the nu-SVM. The prediction error of the proposed method can be compared with the nu-SVM in [29]. Figures 8  and 9 show that the experimental data and prediction follow similar trends, and the cumulative deviation is small. The comparison results between Table 6 and the error results in [29] indicate that the proposed method has stronger prediction ability than nu-SVM. However, in Figure 10, the deviation between prediction speed and experiment is obvious, especially in surge motion. The predicted acceleration is also plotted in Figure 11 to analyze the reason. There is a strong oscillation in the measurements of the surge speed. This oscillation in accretion causes a cumulative deviation in speed. As for the uncertainty, it can be observed that the variance of the predictions of the experiment is bigger than the simulation in the previous case. This is because there are more disturbances and noises in the experiment than the simulation.  Then, the identified model is validated by comparing the experimental data with predictions of 15 • /5 • , 35 • /5 • and 10 • /10 • zigzag maneuvers, as shown in Figures 8-10. The accuracy of the prediction speed assessed by RMSE is listed in Table 6. In the similar study [29], the same three sets of training data were used for training the nu-SVM. The prediction error of the proposed method can be compared with the nu-SVM in [29]. Figures 8 and 9 show that the experimental data and prediction follow similar trends, and the cumulative deviation is small. The comparison results between Table 6 and the error results in [29] indicate that the proposed method has stronger prediction ability than nu-SVM. However, in Figure 10, the deviation between prediction speed and experiment is obvious, especially in surge motion. The predicted acceleration is also plotted in Figure 11 to analyze the reason. There is a strong oscillation in the measurements of the surge speed. This oscillation in accretion causes a cumulative deviation in speed. As for the uncertainty, it can be observed that the variance of the predictions of the experiment is bigger than the simulation in the previous case. This is because there are more disturbances and noises in the experiment than the simulation.

Discussion and Conclusions
In this work, a novel identification modeling and prediction scheme based on GP is proposed to identify the ship nonparametric maneuvering model. By introducing the moment matching approximation method, the multi-step prediction uncertainty of ship motion can be propagated. The performance of the proposed method has been tested with a large container ship and a scale ship model and shows good accuracy and generalization ability. Moreover, the uncertainty of propagation can help drivers or controllers make safe decisions. Through the simulation of the container ship, it is proven that the prediction uncertainty obtained by the proposed method is reliable enough. Where there is less dynamic information in the training data, the prediction uncertainty of turning circle motion is larger than that of zigzag maneuver. In addition, it has been demonstrated that the performance of the presented approach is superior to the nu-SVM method in the experimental case. There are also some limitations of this study: (1) The proposed method needs to spend more calculation time due to consider the uncertainty propagation compared with other methods. The sparse method can be used to improve computational efficiency. (2) Both the two verified cases in this paper are container ships. The applicability of the model to other ship types, especially new unmanned ships, needs further study.
Future work includes two main tasks: (1) Although the presented method has been verified by simulation and experimental data, full-scale trials with disturbances should be performed, including waves, currents, and wind. In this environment, the uncertainty prediction provided by this method will have great application value. (2) This method can be used in modern controllers such as model predictive control. The uncertainty of predictions can be introduced in the cost function to construct a cautious controller.
Author Contributions: Conceptualization, Y.X.; methodology, Y.X. and G.X.; software, Y.X. and G.C.; validation, G.C. and Y.L.; resources, Y.L.; data curation, Y.X. and G.C.; writing and editing, Y.X. and G.X. All authors have read and agreed to the published version of the manuscript.