An Adaptive RBF-NMPC Architecture for Trajectory Tracking Control of Underwater Vehicles

Abstract: An adaptive control algorithm based on the radial basis function neural network (RBFNN) and nonlinear model predictive control (NMPC) is discussed for underwater vehicle trajectory tracking control. Firstly, in the off-line phase, the improved adaptive Levenberg–Marquardt-error surface compensation (IALM-ESC) algorithm is used to establish the RBFNN prediction model. In the real-time control phase, because the system output changes with external environmental interference, the network parameters are adjusted using the error between the system output and the network prediction output, so that the model adapts to the complex and uncertain working environment. This provides an accurate and real-time prediction model for model predictive control (MPC). For optimization, an improved adaptive gray wolf optimization (AGWO) algorithm is proposed to obtain the trajectory tracking control law. Finally, the tracking control performance of the proposed algorithm is verified by simulation. The simulation results show that the proposed RBF-NMPC not only achieves the same level of real-time performance as linear model predictive control (LMPC) but also has a superior anti-interference ability. Compared with LMPC, the tracking performance of RBF-NMPC is improved by at least 43% and 25% in the cases of no interference and interference, respectively.


Introduction
With the progress of intelligent control technology, the development of underwater vehicles has entered a new stage. Whether it is the exploration and exploitation of marine mineral resources, the investigation of marine topography, or military applications, it is inseparable from the participation of underwater vehicles [1]. Trajectory tracking is one of the key technologies in the field of underwater vehicles. It is the premise and guarantee for underwater vehicles to complete the specified tasks [2]. Therefore, the research of trajectory tracking control technology is particularly important.
At present, the main methods for underwater vehicle trajectory tracking control are proportional-integral-derivative (PID) control, fuzzy control, backstepping control, sliding mode control, etc. In [3], a variable integral PID controller based on a disturbance observer was designed to realize the heading control of an underwater vehicle. In [4], good trajectory tracking results were achieved for underactuated underwater vehicles using terminal sliding mode control. In [5], a bio-inspired backstepping control method and a three-dimensional trajectory tracking controller were proposed for deep-diving control. However, most of these methods do not consider the constraints on system state and input, which easily leads to thrust saturation in actual control. In contrast, model predictive control (MPC) can deal with various constraints and offers great flexibility in describing control problems. These remarkable
(3) Based on the traditional gray wolf optimization (GWO) algorithm, the ideas of adaptive weight and worst-case crossover are added to improve the global search ability and convergence speed, so as to ensure the real-time performance of NMPC.
The rest of this paper is arranged as follows: Section 2 states the problem, including the kinematics and dynamics models of underwater vehicles. Section 3 describes the design process of the controller. Section 4 presents the simulation experiments, including the results of model identification, optimization and tracking control. Conclusions and future work are given in Section 5.

Problem Description
This paper studies the trajectory tracking control of underwater vehicles in the horizontal plane, where the surge, sway and yaw motions are considered.
The kinematics model of underwater vehicles in the horizontal plane is shown in (1); it describes the transformation between the motion coordinate system and the inertial coordinate system.
$$\dot{\eta} = J(\eta)\, v \tag{1}$$
where $J(\eta)$ is the coordinate transformation matrix, which is defined as:
$$J(\eta) = \begin{bmatrix} \cos\psi & -\sin\psi & 0 \\ \sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{2}$$
where $\psi$ is the heading angle. The dynamic model of underwater vehicles is represented as [18]:
$$M\dot{v} + C(v)\,v + D(v)\,v + g(\eta) = \tau \tag{3}$$
where $v = [u, v, r]^T$ is the vector composed of surge velocity, sway velocity and yaw angular velocity in the motion coordinate system; $\eta = [x, y, \psi]^T$ is the vector composed of X-direction position, Y-direction position and heading angle in the inertial coordinate system; $M = \mathrm{diag}(M_x, M_y, M_\psi)$ is the inertia matrix; $C(v)$ is the centripetal and Coriolis force matrix; $D(v)$ is the damping matrix; $g(\eta)$ is the restoring force vector; $\tau = [F_u, F_v, F_r]^T$ is the force and moment vector acting on the three degrees of freedom. By combining the kinematics and dynamics equations, the model of the underwater vehicle's system can be established as:
$$\dot{\eta} = J(\eta)\, v, \qquad \dot{v} = M^{-1}\left(\tau - C(v)\,v - D(v)\,v - g(\eta)\right) \tag{4}$$
The system behavior in MPC needs to be described by a predictive model. However, for underwater vehicles, the randomness of the ocean current direction and velocity has a great impact on the dynamic characteristics. Coupled with the strong nonlinearity and strong coupling of the underwater vehicle's system, this makes it difficult to establish an accurate system model, which affects the trajectory tracking performance. Therefore, in this paper, the RBFNN identification method is used to identify the dynamic model of underwater vehicles. On the one hand, it can improve the accuracy of model identification; on the other hand, it can modify the network parameters online to suppress external interference and environmental changes, thereby achieving adaptive control.
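The transformation between the motion and inertial coordinate systems can be illustrated with a short sketch. It applies the rotation by the heading angle to the body-frame velocities and advances the pose by one explicit Euler step. The Euler discretization, the function names and the numerical values are assumptions made for illustration and are not taken from the paper.

```python
import math

def body_to_inertial(psi, nu):
    """Apply the heading rotation to body-frame velocities nu = (u, v, r)."""
    u, v, r = nu
    x_dot = u * math.cos(psi) - v * math.sin(psi)
    y_dot = u * math.sin(psi) + v * math.cos(psi)
    return (x_dot, y_dot, r)

def euler_step(eta, nu, dt):
    """One explicit Euler step of the kinematics (assumed discretization)."""
    x, y, psi = eta
    x_dot, y_dot, psi_dot = body_to_inertial(psi, nu)
    return (x + dt * x_dot, y + dt * y_dot, psi + dt * psi_dot)

# Vehicle heading along the inertial Y axis, moving forward in surge.
eta_next = euler_step((0.0, 0.0, math.pi / 2), (1.0, 0.0, 0.1), 0.05)
```

With a heading of 90 degrees, pure surge motion shows up as displacement along the inertial Y axis, which is a quick sanity check on the sign convention.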
The trajectory tracking control process of the underwater vehicles studied in this paper is shown in Figure 1.
As shown in Figure 1, the reference trajectory $s_1, s_2, \cdots, s_N$ defined in the inertial coordinate system is composed of N discrete trajectory points from a given initial state. Suppose the trajectory point at a certain time is $s_t = [x_R(t), y_R(t)]^T$. For trajectory tracking control, the reference trajectory should satisfy the physical characteristics constraints and the kinematics equation of the underwater vehicles [19]. That is:
$$\dot{x}_R = u_R\cos\psi_R - v_R\sin\psi_R, \qquad \dot{y}_R = u_R\sin\psi_R + v_R\cos\psi_R, \qquad \dot{\psi}_R = r_R \tag{5}$$
According to (5), the real-time state of the underwater vehicle is obtained.
In this paper, a radial basis function neural-nonlinear model predictive control (RBF-NMPC) method is designed to give the corresponding control law at each sampling time, so that the real state $[x, y, \psi, u, v, r]^T$ of the underwater vehicles coincides with the reference trajectory $[x_R, y_R, \psi_R, u_R, v_R, r_R]^T$. The external disturbance will change the state of the underwater vehicle. Therefore, the state information of an underwater vehicle contains the information of external interference. The model accuracy of an underwater vehicle can be improved by adjusting the network parameters. The adaptive ability of the model enables the controller to issue the correct control law, so that the running state of the underwater vehicles remains on the reference trajectory when there are external disturbances.

Controller Design
The principle of RBF-NMPC constructed in this paper is shown in Figure 2. The specific implementation process is as follows: First, from the input and output data, the dynamic model of underwater vehicles is identified off-line using the IALM-ESC algorithm, so that the RBFNN captures the basic dynamics of the underwater vehicle.
Second, in the real-time control stage, in order to improve the accuracy of the NMPC, the error between the system output and the network prediction output is used to adjust the RBFNN parameters. The adjusted network is used to predict the state quantity in the future.
Finally, in the NMPC optimization stage, the objective function is optimized by the proposed AGWO algorithm, and the control sequence is obtained.
Machines 2021, 9, x FOR PEER REVIEW 5


RBFNN Training
RBFNN is a kind of local approximation neural network with a simple structure. By introducing the Gaussian kernel function, any nonlinear system can be approximated on a compact set to arbitrary precision. Suppose that there are r neurons in the hidden layer and $x = [x_1, x_2, \cdots, x_m]^T$ is the input of the RBFNN, then the output can be expressed as:
where $c_j = [c_{j1}, c_{j2}, \cdots, c_{jm}]$ is the center vector of the $j$-th hidden layer neuron, $\sigma_j$ is the width vector of the Gaussian kernel function of the $j$-th hidden layer neuron, and $\omega_j$ is the weight of the $j$-th hidden layer neuron.
When constructing an RBFNN, there are two problems to be solved: one is structure identification, the other is parameter estimation. Structure identification determines the number of nodes in the hidden layer of the network, and parameter estimation finds a set of network parameters (center, width and weight) that minimizes the sample error function (root mean square error):
where $p$ is the index of the current sample, $n$ is the total number of samples, $y_{ref}$ is the expected output of the sample, and $y$ is the output of the RBFNN.
The IALM-ESC applied in this paper is an incremental network construction algorithm. Starting with zero network nodes, the maximum of the error surface is compensated by adding network nodes at the peak or valley of the error surface. The parameters of each new network node are adjusted by the following update rules [20]:
where $\Psi(t)$ is a quasi-Hessian matrix, $\Omega(t)$ is the gradient vector, and $\eta(t)$ is the adaptive damping coefficient, whose adjustment rule is:
where $\beta$ is a constant.
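As an illustration of the network output and the error measure described above, the following sketch evaluates a Gaussian RBF network and a root mean square error. The kernel form exp(-||x - c||^2 / (2 sigma^2)), the use of a single scalar width per neuron, and all names and values are assumptions; the paper's exact expressions may differ.

```python
import math

def rbf_output(x, centers, widths, weights):
    """Weighted sum of Gaussian kernels (assumed kernel form, scalar width per neuron)."""
    y = 0.0
    for c, sigma, w in zip(centers, widths, weights):
        dist2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        y += w * math.exp(-dist2 / (2.0 * sigma ** 2))
    return y

def rmse(y_ref, y_pred):
    """Root mean square error between expected and predicted outputs."""
    n = len(y_ref)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_ref, y_pred)) / n)

# Two hidden neurons with hypothetical parameters.
centers = [[0.0, 0.0], [1.0, 1.0]]
widths = [1.0, 1.0]
weights = [2.0, -1.0]
y0 = rbf_output([0.0, 0.0], centers, widths, weights)
```

The input sits exactly on the first center, so the first kernel contributes its full weight and the second is attenuated by its distance.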
$\Psi(t)$ and $\Omega(t)$ are the sums of the sub-matrices $\psi_p(t)$ and sub-vectors $\omega_p(t)$ of all samples, respectively:
where $j_p(t)$ is the row vector of the Jacobian matrix, which is described as:
According to the update rule of the gradient descent learning algorithm, the elements of the row vector of the Jacobian matrix can be expressed as:
After adjusting all the parameters, if the root mean square error between the predicted value and the actual value of the RBFNN does not reach the target value, network nodes continue to be added at the maximum error, and the node parameters are trained until the target value is met.
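A damped Levenberg–Marquardt style step built from per-sample sub-matrices and sub-vectors can be sketched as follows. It assumes the usual construction in which the quasi-Hessian is the sum of outer products of the Jacobian rows and the gradient vector is the sum of Jacobian rows scaled by the sample errors. The sign convention (error defined as expected minus predicted output), the fixed damping value and the helper names are assumptions, not the paper's IALM-ESC rules.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def lm_step(theta, jac_rows, errors, damping):
    """Return theta + (Psi + damping * I)^-1 Omega for one batch of samples."""
    n = len(theta)
    psi = [[0.0] * n for _ in range(n)]
    omega = [0.0] * n
    for j, e in zip(jac_rows, errors):
        for a in range(n):
            omega[a] += j[a] * e              # accumulate sub-vector j^T e
            for b in range(n):
                psi[a][b] += j[a] * j[b]      # accumulate sub-matrix j^T j
    for a in range(n):
        psi[a][a] += damping
    delta = solve(psi, omega)
    return [t + d for t, d in zip(theta, delta)]

# Toy linear model y = theta0 + theta1 * x with samples generated by theta = [1, 2].
theta = lm_step([0.0, 0.0], [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]], [1.0, 3.0, 5.0], 1e-9)
```

For a model that is linear in its parameters and with negligible damping, a single step lands on the least-squares solution, which makes the toy case easy to check by hand. Larger damping values shorten the step and move it towards a gradient descent direction.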
In the off-line phase, the model of the underwater vehicle is initially established. However, in the real-time control stage, the prediction model should also be adjustable in order to adapt to the unknown underwater environment. The underwater interference is decomposed into an interference force on each degree of freedom, which causes the state of the underwater vehicle to change. Therefore, it can be considered that the state information of an underwater vehicle contains the information of external interference. According to the error between the system output and the predicted state of the model, the model accuracy of an underwater vehicle can be improved by adjusting the network parameters. This idea is widely used in existing adaptive neural network controllers for underwater vehicles [21,22]. In this paper, adaptive gradient descent is used to adjust the network parameters in the real-time control phase to minimize the error between the MPC model and the system output. In each sampling period, after the new sample data are collected, the network parameters are updated once by (15). As the running time increases, the network parameters are gradually adjusted to adapt to changes in the external environment.
where $\eta(t)$ is the adaptive learning rate, which is equal to the adaptive damping coefficient in (10), and $g(t)$ is the gradient vector, which is composed of the elements in (14).
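The once-per-sample online adjustment can be illustrated with a reduced sketch. It updates only the output weights of a Gaussian RBF network by one gradient descent step on the squared prediction error. Restricting the update to the weights, using a fixed learning rate instead of the adaptive one, and the kernel form are simplifying assumptions made here for brevity.

```python
import math

def kernel_activations(x, centers, widths):
    """Gaussian activations of all hidden neurons (assumed kernel form)."""
    return [math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2.0 * s ** 2))
            for c, s in zip(centers, widths)]

def online_weight_update(weights, x, y_measured, centers, widths, lr):
    """One gradient step on 0.5 * err^2 with respect to the output weights."""
    phi = kernel_activations(x, centers, widths)
    y_pred = sum(w * p for w, p in zip(weights, phi))
    err = y_measured - y_pred
    # d(0.5 * err^2)/dw_j = -err * phi_j, so descent adds lr * err * phi_j.
    new_weights = [w + lr * err * p for w, p in zip(weights, phi)]
    return new_weights, err

# One hidden neuron, repeated measurement of the same operating point.
w1, e1 = online_weight_update([0.0], [0.0], 1.0, [[0.0]], [1.0], lr=0.5)
w2, e2 = online_weight_update(w1, [0.0], 1.0, [[0.0]], [1.0], lr=0.5)
```

Each call halves the remaining prediction error in this toy case, which mirrors the gradual adaptation described in the text.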

Objective Function and Constraints
The velocity and angular velocity of each degree of freedom of the underwater vehicle in the future can be estimated according to the established RBFNN prediction model. The expression is as follows:
where $N_p$ is the prediction horizon and $y_m(k+p|k)$ is the predicted output; the input changes along the prediction horizon. $n_A$ and $n_B$ are the orders of the output and input, respectively. Then the position and attitude $[x, y, \psi]^T$ of the underwater vehicle in the prediction horizon are calculated by the kinematics equation, and all the state variables $\hat{y}(k+p|k)$ of the underwater vehicle in the prediction horizon are obtained. In this paper, a quadratic objective function is minimized as the optimization performance index at time $k$ [23]. The expression is as follows:
where $\tilde{y}(k+p|k) = y_{sp}(k+p|k) - \hat{y}(k+p|k)$ is the difference between the reference trajectory and the model prediction, and $\Delta u(k+p|k) = u(k+p|k) - u(k+p-1|k)$ is the control increment. $N_p$ and $N_u$ represent the prediction horizon and control horizon. $Q$ and $\lambda$ are the corresponding weighting matrices. $u_{min}$ and $u_{max}$ are the lower and upper bounds of $u(k+p|k)$; $-\Delta u_{max}$ and $\Delta u_{max}$ are the lower and upper bounds of $\Delta u(k+p|k)$; $y_{min}$ and $y_{max}$ are the lower and upper bounds of $\hat{y}(k+p|k)$, respectively.
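A quadratic index of this kind can be evaluated as in the sketch below, which sums weighted squared tracking errors over the prediction horizon and weighted squared control increments over the control horizon, and checks the input and increment bounds. Diagonal weighting matrices, the omission of the state bounds, and all function and argument names are assumptions made for illustration.

```python
def mpc_cost(y_sp, y_hat, u_seq, u_prev, q_diag, lam_diag):
    """Weighted squared tracking error plus weighted squared control increments."""
    cost = 0.0
    for ref, pred in zip(y_sp, y_hat):
        cost += sum(q * (r - p) ** 2 for q, r, p in zip(q_diag, ref, pred))
    last = u_prev
    for u in u_seq:
        cost += sum(l * (a - b) ** 2 for l, a, b in zip(lam_diag, u, last))
        last = u
    return cost

def satisfies_bounds(u_seq, u_prev, u_min, u_max, du_max):
    """Check input bounds and symmetric increment bounds along the control horizon."""
    last = u_prev
    for u in u_seq:
        for a, b, lo, hi, d in zip(u, last, u_min, u_max, du_max):
            if not (lo <= a <= hi) or abs(a - b) > d:
                return False
        last = u
    return True

# One-step toy problem with two outputs and one input (hypothetical numbers).
cost = mpc_cost([[1.0, 0.0]], [[0.0, 0.0]], [[1.0]], [0.0], [10.0, 1.0], [0.1])
```

A population-based optimizer can use `mpc_cost` as its fitness function and either reject or penalize candidates for which `satisfies_bounds` returns `False`; which of the two the paper uses is not stated in this section.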

AGWO Algorithm
For solving NMPC, the traditional gradient descent method has limitations in calculating capability [24]. Bio-inspired heuristic optimization algorithms have been shown to have strong application potential in complex NMPC problems [25]. In [26], a modified GWO and the Moth-Flame Optimization were proposed to improve the performance when applied as an NMPC solver. The GWO algorithm simulates the predatory behavior of a gray wolf pack and achieves optimization through the cooperation mechanism of the pack [27]. In the GWO algorithm, the three wolves with the best fitness (the best solutions: α, β and δ) guide the other wolves to search for the target. The remaining wolves (candidate solutions) are defined as ω, and they update their positions around α, β and δ. The distance between the individual and the prey is shown in (18), and its position update is shown in (19).
where t is the current iteration number. The location update of each search factor is expressed as:
As the number of iterations t increases, the GWO algorithm finally finds the optimal solution by encircling the prey. Although the GWO algorithm has been widely used, it converges slowly and is easily trapped in local minima [28]. In order to improve the performance of the GWO algorithm, an adaptive strategy is applied in this paper. The adaptive weight is added to Equation (20) to speed up convergence. In addition, the ability to jump out of a local optimum can be improved by crossing the worst set of each iteration.
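For reference, the baseline GWO search can be sketched in its commonly published form: each wolf moves to the average of three candidate positions guided by α, β and δ, with a coefficient that decreases linearly from 2 to 0. This is a minimal sketch of the standard algorithm, not the paper's AGWO; the box bounds, population size, the choice of leaders from the current population, and the test function are assumptions.

```python
import random

def sphere(x):
    """Simple convex test function (chosen here for illustration)."""
    return sum(v * v for v in x)

def gwo_minimize(f, dim, lower, upper, n_wolves=20, n_iter=100, seed=0):
    rng = random.Random(seed)
    wolves = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n_wolves)]
    best = min(wolves, key=f)[:]
    for t in range(n_iter):
        ranked = sorted(wolves, key=f)
        leaders = [ranked[i][:] for i in range(3)]      # alpha, beta, delta
        a = 2.0 * (1.0 - t / n_iter)                    # decreases linearly from 2
        new_wolves = []
        for w in wolves:
            pos = []
            for d in range(dim):
                guided = 0.0
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    dist = abs(C * leader[d] - w[d])    # distance to the leader
                    guided += leader[d] - A * dist      # candidate guided by this leader
                pos.append(min(max(guided / 3.0, lower), upper))
            new_wolves.append(pos)
        wolves = new_wolves
        candidate = min(wolves, key=f)
        if f(candidate) < f(best):
            best = candidate[:]                         # keep the best solution seen so far
    return best

best = gwo_minimize(sphere, dim=2, lower=-5.0, upper=5.0, n_wolves=20, n_iter=50, seed=0)
```

The equal one-third averaging of the three guided candidates is the part that the adaptive weights described in the text are meant to replace.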
In general, wolf α has the highest command in the pack. However, in some special cases, β and δ can also command wolves temporarily. Therefore, the position of the wolf pack must be iterated according to different weights. However, if the weight is fixed, it will not be conducive to the regeneration of the population. Therefore, an adaptive weighted position updating method is applied, as shown in (21).
where $\lambda_\alpha$, $\lambda_\beta$ and $\lambda_\delta$ are the adaptive weights in each iteration, which can be expressed as:
where randn(0, 1) is a random number drawn from the standard normal distribution, and $M_{\lambda_p}$ is the average value of the weight update, which can be expressed as:
where c is a constant, which is set to 0.1 in this paper, and $G_{\lambda_p}$ is an archive that stores the weights that performed better than those of the last iteration. In the initialization phase, the stored weights $G_{\lambda_p}$ are all 1.
In addition, inspired by the differential evolution algorithm, the idea of worst-case crossover is proposed in order to improve the global search performance of the GWO algorithm. The set of search factors with poor fitness in each iteration exchanges information with α, β and δ, so as to increase the diversity of the population. Crossing the population in this way improves the ability to jump out of a local optimum. Let the number of search factors in the bad set be K. The information exchange formula of each search factor in K is expressed as follows:
where a is a random number in [0, 3], and d is the dimension of the search factor.
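The adaptive weighting idea can be illustrated with a heavily hedged sketch. Because the exact expressions are not reproduced in this section, every formula below is a hypothetical stand-in: weights are sampled around a running mean with Gaussian noise, the mean is smoothed towards the average of the archived weights with the constant c = 0.1 mentioned in the text, and the three leader-guided candidates are combined by a normalized weighted average. The function names, the noise scale `spread`, and the normalization are assumptions.

```python
import random

def sample_weight(mean, rng, spread=0.1):
    """Hypothetical rule: running mean plus scaled standard-normal noise, kept positive."""
    return max(1e-6, mean + spread * rng.gauss(0.0, 1.0))

def update_mean(mean, archive, c=0.1):
    """Hypothetical smoothing of the mean towards the average of archived weights."""
    if not archive:
        return mean
    return (1.0 - c) * mean + c * sum(archive) / len(archive)

def weighted_position(x1, x2, x3, lam):
    """Normalized weighted combination of the candidates guided by alpha, beta and delta."""
    total = sum(lam)
    return [(lam[0] * a + lam[1] * b + lam[2] * d) / total for a, b, d in zip(x1, x2, x3)]

rng = random.Random(0)
lam = [sample_weight(1.0, rng) for _ in range(3)]   # stored weights start at 1, as in the text
new_pos = weighted_position([1.0, 0.0], [2.0, 0.0], [3.0, 0.0], lam)
```

With equal weights the combination reduces to the plain average used by the traditional GWO, so the sketch degrades gracefully to the baseline behavior.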
Through the above improvements, the proposed AGWO algorithm has a great improvement in the convergence speed and the search ability of the global optimal value when compared with the traditional GWO algorithm.
The AGWO algorithm proposed in this paper is applied to the optimization problem shown in (17), and the first control sequence of the optimization result is taken as the optimal control law. Then it is loaded into the underwater vehicle's system for real-time control. The procedure of the proposed RBF-NMPC is summarized in Algorithm 1:

1: Develop the RBFNN predictive model offline using offline data;
2: Initialize the parameters of RBF-NMPC;
3: For k = 1 to N do
4: Sample the plant output y(k);
5: Update the parameters of the RBFNN to adapt to the real environment;
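The control loop can be outlined as a receding-horizon skeleton. Steps 4 and 5 follow the listing above; the optimization step and the use of only the first control move follow the description in the text. The four callables are hypothetical placeholders for the plant interface, the RBFNN update and the AGWO solver, and their signatures are assumptions.

```python
def run_rbf_nmpc(n_steps, sample_output, adapt_model, optimise_controls, apply_control):
    """Receding-horizon loop; all four callables are hypothetical placeholders."""
    applied = []
    for k in range(1, n_steps + 1):
        y_k = sample_output(k)            # step 4: sample the plant output y(k)
        adapt_model(y_k)                  # step 5: update the RBFNN parameters
        u_seq = optimise_controls(y_k)    # assumed: solve the NMPC problem for a control sequence
        u_k = u_seq[0]                    # apply only the first control move
        apply_control(u_k)
        applied.append(u_k)
    return applied

# Toy stand-ins that only exercise the call order.
log = []
controls = run_rbf_nmpc(
    3,
    sample_output=lambda k: float(k),
    adapt_model=lambda y: log.append(("adapt", y)),
    optimise_controls=lambda y: [-y, 0.0],
    apply_control=lambda u: log.append(("apply", u)),
)
```

Keeping the model update ahead of the optimization inside each cycle matches the order described later for the timing measurements.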

Model Identification Results
In order to verify the effectiveness of the proposed RBFNN-based model identification, random step signals are used as the excitation signal to obtain the state response of the underwater vehicle, which is taken as the model training sample. The width of each step signal represents the excitation action time, which reflects the relationship between the dynamic response of the underwater vehicle and the excitation frequency. Random step signals therefore capture more of the dynamic information of the underwater vehicle. According to the thrust constraints of each propeller, the thrust moment range of , and the actual output of the RBFNN is $y(k) = [u(k), v(k), r(k)]$. Figure 3 shows 2000 sets of input and output data for underwater vehicle dynamic model identification, of which 1900 groups are used as training data and 100 groups are used as test data.
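A random step excitation of this kind can be generated as in the following sketch, which draws a random level and a random width for each step and then splits the 2000 samples into 1900 training and 100 test samples as stated above. The amplitude range, the width range and the function name are hypothetical, since the actual thrust limits are not recoverable from this section; only the excitation channel is generated, not the vehicle response.

```python
import random

def random_step_signal(n_samples, min_width, max_width, low, high, seed=0):
    """Piecewise-constant signal with random levels and random step widths."""
    rng = random.Random(seed)
    signal = []
    while len(signal) < n_samples:
        width = rng.randint(min_width, max_width)   # excitation action time in samples
        level = rng.uniform(low, high)              # hypothetical normalized thrust level
        signal.extend([level] * width)
    return signal[:n_samples]

excitation = random_step_signal(2000, 10, 50, -1.0, 1.0)
train, held_out = excitation[:1900], excitation[1900:]
```

Varying the step width changes how long each thrust level is held, which is what lets the data cover both fast and slow parts of the vehicle response.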
Before training the neural network with IALM-ESC, the data are preprocessed to prevent system instability or slow training caused by the different dimensions of the variables. The target mean square error is set at 0.05. In Figure 4, the root mean square error (RMSE) of the network output decreases as the number of nodes increases. After reaching the target value, the number of network nodes stops increasing. Based on the IALM-ESC algorithm, the RBFNN is constructed incrementally from zero nodes, which makes the network more compact and gives it good generalization ability. Compared with the traditional trial and error method, there is no randomness in the whole process.
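One common way to remove the effect of different dimensions is min-max scaling of each variable. The paper does not state which preprocessing it uses, so the sketch below is an assumption chosen only to illustrate the step; the function names and the sample values are hypothetical.

```python
def fit_min_max(columns):
    """Record the minimum and maximum of each variable (one list per variable)."""
    return [(min(col), max(col)) for col in columns]

def normalise(value, lo, hi):
    """Map a value into [0, 1]; constant variables map to 0."""
    return 0.0 if hi == lo else (value - lo) / (hi - lo)

def denormalise(scaled, lo, hi):
    """Invert the scaling to recover the physical value."""
    return lo + scaled * (hi - lo)

# Hypothetical samples of two variables with very different ranges.
ranges = fit_min_max([[1.0, 3.0], [10.0, 20.0]])
scaled = normalise(2.0, *ranges[0])
```

The scaling parameters would be fitted on the training data only and reused unchanged for the test data and for online prediction.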
The root mean square error curve of RBFNN offline training is shown in Figure 5. It can be seen that the proposed IALM-ESC algorithm improves the convergence speed. Even if the initialization error is large, it converges within about 100 iterations. In contrast, the traditional gradient descent (GD) method still has not reached comparable parameters after 500 iterations.
The comparison between the actual and predicted values of the test set is shown in Figure 6. It can be seen that the actual value is basically consistent with the predicted value. This shows that the network structure is simple, and the modeling effect is satisfactory.

Optimization Results of AGWO
Two typical functions (a convex function and a nonconvex function) are chosen to test the optimization performance of AGWO, and the results are shown in Figures 7 and 8. First of all, the two figures show that the optimization effect of AGWO is better than that of the other three methods: particle swarm optimization (PSO), differential evolution (DE) and GWO. It can be seen from Figure 7 that AGWO converges much faster than GWO, especially in the later stage. This indicates that the adaptive weights accelerate the convergence of AGWO. In Figure 8, when PSO, DE and GWO all fall into a local optimum, AGWO can jump out of the local optimum and achieve a good optimization result. The results show that the AGWO proposed in this paper greatly improves both the convergence speed and the quality of the converged solution.



Trajectory Tracking Control Results
In order to verify the effectiveness of the proposed trajectory tracking control method, the reference trajectory is selected as follows:
In the simulation, the sampling period is $\Delta t = 0.05$ s. The prediction horizon is $N_p = 5\Delta t$. The control horizon is $N_u = 2\Delta t$. The state quantity weighting coefficient is $Q = \mathrm{diag}(10^4, 10^4, 10^2, 10^1, 10^1, 10^1)$. The control quantity weighting coefficient is $\lambda = \mathrm{diag}(10^{-4}, 10^{-4}, 10^{-4})$.
The tracking results of the two methods (RBF-NMPC and LMPC) are shown in Figure 9. Combining Figure 9a,b, it can be seen that both the overall tracking and the tracking in each state of RBF-NMPC are highly consistent with the reference trajectory. In contrast, because the underwater vehicle model is simplified in LMPC, there are large modeling errors, and the trajectory tracking control error is larger.
The optimization time of the two methods is shown in Figure 10. In a control cycle, the network parameters are adjusted once according to (15), and then the state prediction and control law optimization are carried out. The average solving time of RBF-NMPC is 0.0144 s and that of LMPC is 0.0076 s (the time is measured with the time function in MATLAB). Although the proposed RBF-NMPC takes longer, the optimization time of both methods is much lower than the sampling time, which ensures real-time tracking. The simple network structure and the convergence behavior of AGWO ensure that the optimization time of RBF-NMPC reaches the same level as that of LMPC.
In addition, different optimization algorithms are used to solve the same NMPC problem. Based on the obtained control laws, trajectory tracking control is carried out for each of them. The mean square errors (MSE) of each optimization algorithm in trajectory tracking are summarized in Table 1. The tracking performance of AGWO is significantly improved in each degree of freedom compared with the other optimization algorithms.
In order to verify the anti-interference ability of RBF-NMPC, trajectory tracking control under an interference environment is carried out. In the simulation, unknown random interference is added to each degree of freedom, and the expression is shown in (26).
It can be seen from Figure 11 that LMPC cannot track the reference trajectory well under the interference environment; in particular, the deviation between the heading angle and its reference value is large. In contrast, RBF-NMPC can update the model according to the real value of the system output after each optimization, which shows superior adaptive ability in a complex working environment, and its trajectory tracking control effect is better. Figure 12 shows the optimization times of the two methods. Under the interference environment, the average solving time of RBF-NMPC is 0.0146 s, and that of LMPC is 0.0083 s. The optimization time of RBF-NMPC is still far less than the sampling time. At the same time, RBF-NMPC has the ability of adaptive parameter adjustment, which gives it excellent tracking performance in complex conditions.
The mean square errors for trajectory tracking are summarized in Table 2. The tracking performance of RBF-NMPC is improved by at least 43% and 25% in the cases of no interference and interference, respectively.


Conclusions
In this paper, an adaptive RBF-NMPC trajectory tracking control algorithm is proposed for underwater vehicles. This method combines NMPC with the RBFNN and the AGWO algorithm. It addresses the problems of modeling difficulty and poor real-time performance in the application of NMPC to underwater vehicles. Simulation results show that the trajectory tracking performance of RBF-NMPC is greatly improved compared with LMPC and traditional optimization algorithms. Future work will focus on reducing the time of the first optimization and on combining RBF-NMPC with robust control to make the whole trajectory tracking control system more stable.