Dynamic Neural Network-Based Adaptive Tracking Control for an Autonomous Underwater Vehicle Subject to Modeling and Parametric Uncertainties

This research presents a way to improve the autonomous maneuvering capability of a four-degrees-of-freedom (4DOF) autonomous underwater vehicle (AUV) performing trajectory tracking tasks in a disturbed underwater environment. The study considers four second-order input-affine nonlinear equations for the translational (x, y, z) and rotational (heading) dynamics of a real AUV subject to hydrodynamic parameter uncertainties (added mass and damping coefficients), unknown damping dynamics, and external disturbances. For each of these dynamics, we propose an identification-control scheme named the Dynamic Neural Control System (DNCS), which combines an adaptive neural controller based on nonparametric identification of the effect of unknown dynamics and external disturbances with a parametric estimation of the added-mass-dependent input gain. Several numerical simulations validate the satisfactory performance of the proposed DNCS in tracking reference trajectories, in comparison with a conventional feedback controller with no adaptive compensation. Plots of the dynamic approximation of the lumped disturbance and of the estimation of the parametric uncertainty are presented, validating effective operation of the proposed DNCS when the system is almost completely unknown.


Introduction
In recent years, the use of underwater robotics as a technological tool for solving a wide variety of problems in underwater environments has increased greatly. These robots are known as Autonomous Underwater Vehicles (AUVs) or Unmanned Underwater Vehicles (UUVs). Operations carried out in shallow and deep water, such as ship hull inspections, surveillance operations in harbor areas, three-dimensional mapping of coral reefs, and deep-water archaeology, are only a few examples of current applications in different fields of interest. Recent advances in technology and science in this domain have made this type of vehicle increasingly accessible to the scientific community and to people in general.
One of the most important features of AUVs is their ability to operate without human management. Control theory aims to provide underwater vehicles with the capacity to execute autonomous maneuvering operations in underwater environments with adequate performance. In this sense, there exists a wide range of control approaches, most of them based on adaptive or robust techniques, because the AUV dynamics have parametric and/or modeling uncertainties. In general, the mathematical model of underwater vessel dynamics [1][2][3] depends on characteristics such as weight, buoyancy, frame geometry, the number and configuration of the propellers, and hydrodynamic effects, among others. In the majority of cases, it is challenging to obtain the complete model dynamics of an underwater vehicle, mostly because of the difficulty of obtaining the hydrodynamic parameters (added mass, and linear and nonlinear damping coefficients) with sufficient accuracy for real operations [4]. This difficulty stems largely from the fact that vehicles such as AUVs move at low speeds, describing a uniform translation motion or a stationary state, due to their slow thruster dynamics [5]. The most common method to obtain hydrodynamic parameters is experimental testing with a planar motion mechanism (PMM) [6,7], based on measurements of the reaction forces when the vehicle is moving through water. However, experiment-based methods require complicated and expensive setups [7][8][9]. These parameters can also be approximated by computational fluid dynamics (CFD) simulations [10,11], which yield nominal values for specific conditions. Analytical derivation and identification schemes for these values are very complex and are generally applied to quite simple cases that do not reflect realistic conditions [12,13].
Nevertheless, AUVs commonly exchange their equipment for a variety of sensors and technological devices depending on the mission to be carried out; in consequence, physical characteristics such as mass and shape are modified. Therefore, nominal values of the hydrodynamic parameters are of limited practical utility [14]. Furthermore, the vehicle experiences external disturbances such as waves or currents when its body moves in underwater environments. The combined effect of all these phenomena results in a highly nonlinear model subject to uncertainties and external disturbances. From the control theory point of view, the hydrodynamic phenomena (the coefficients of which are taken to be unknown but constant) are treated as external disturbances to the vehicle dynamics.
Motivated by the great challenges associated with the problem of AUV control subject to uncertainties, the research community has increased its efforts to find new strategies for different control objectives related to underwater robots, such as depth control, station-keeping, trajectory tracking, path-following, fault-tolerant control, or docking operations [15][16][17][18]. Research studies have demonstrated that intelligent control techniques [19,20] are highly effective in dealing with this type of problem because of their adaptive and robust nature. In this regard, References [21][22][23] make use of neural networks (NNs) and fuzzy inference systems (FISs) as well as mixed structures of these approaches.
NNs represent a suitable approach to control complex nonlinear systems due to their capacity to deal with problems characterized by incomplete model information or with systems that are completely unknown (a black box). Multiple controller structures based on neural networks have been postulated and deployed in previous decades for the AUV control problem [24,25]. In the related literature, adaptive techniques based on neural networks have been widely used for controlling AUVs. In [26], a fault-tolerant trajectory tracking control based on dynamic programming for an AUV subject to rudder faults and disturbances produced by ocean currents is addressed. In the proposed solution, the performance index function for the optimal control problem is constructed by using estimates of faults and disturbances provided by two dynamic neural network estimators. In [27], an adaptive control system based on S surface recurrent neural networks with wavelet activation functions is designed. This control law aims to approximate the uncertain AUV dynamics introduced by a tether and a robotic manipulator. The study reported in [28] deals with the tracking control problem in a three-dimensional space of an underactuated AUV subject to external disturbances and parametric uncertainties. The proposed solution makes use of a command-filtered backstepping control scheme and an adaptive neural network control technique. In this case, the neural network is used to estimate the parametric system uncertainties in order to cancel their effect in the closed-loop system. The work in [29] presents a backstepping control based on Lyapunov theory for a Trans-media Aerial Underwater Vehicle (TMAUV), which evolves in a two-dimensional space. A neural network-based adaptive estimation of model and parameter uncertainties and of external disturbances, taking into account the coupled input condition of the vehicle, is proposed.
Additionally, in [30], an adaptive tracking controller based on a combination of backstepping and terminal sliding mode techniques for a tethered AUV is designed. This type of vehicle suffers from unknown underwater disturbances as well as time-variant nonlinear cable dynamics. A recurrent neural network with radial basis functions and a disturbance observer provides uncertainty estimates.
In the same way, several studies related to the design of controllers, observers, and identifiers based on a special type of dynamic neural network have been presented in [31][32][33][34]. The approach used in these works implements a type of adaptive identifier based on the Hopfield-type dynamic neural network structure introduced in [35], also called Differential Neural Networks (DNNs). This name is due to the fact that the considered NNs and the dynamical control systems (subject to modeling and/or parametric uncertainties) evolve over continuous time. One of the advantages of Hopfield neural networks is that they have a simple structure, which facilitates the selection of the number of neurons required to identify a given system model: the number of system states to be identified corresponds to the number of network neurons. Static (feedforward) neural networks [36][37][38] have several drawbacks; for example, the function approximation is sensitive to the training data. In contrast, dynamic (recurrent) neural networks can overcome this disadvantage because their structure incorporates feedback. Moreover, the Hopfield model describes neural networks that evolve in continuous time, i.e., this type of network can be represented as a set of continuous states, which are obtained as the solution of a set of ordinary differential equations (or difference equations). Therefore, dynamic neural networks exhibit properties that have been exploited in the identification and control of highly nonlinear dynamic systems [39,40].
Contributions: The proposed work presents a way to improve the autonomous maneuvering capability of a four-degrees-of-freedom (4DOF) AUV performing trajectory tracking tasks in an underwater environment. The main contributions lie in the following three aspects.

Dynamic Model. The present study considers that the vehicle has a nonlinear mathematical description, which is subject to modeling and parametric uncertainties as well as external disturbances. The dynamic model is developed specifically for a real AUV, the BlueROV2 from BlueRobotics Inc. [41], considering real operating conditions. The modeling process results in translational dynamics (x, y, z) described by three decoupled input-affine second-order nonlinear subsystems. This is achieved by first stabilizing the heading angle dynamics. We integrate the unknown damping phenomena and external disturbances into a single expression, defined as the lumped disturbances. In addition, we consider that the added mass and damping coefficients can be approximated by constant values, and we suppose that these constant values are unknown, taking into account the difficulty of finding them for vehicles of complex shape. Thus, the input-affine AUV model considered in this work has unknown input gains that depend on constant added mass coefficients. This fact does not represent a limitation for the neural network adaptive controller proposed in the present work; rather, it reflects a reasonable consideration of real conditions, in which these parameter uncertainties (added-mass-dependent gains) are present in the system structure.

Control Strategy. We propose a novel control design for each dynamic based on a modification of the DNN-based parallel identifier structure in [42,43]: the DNN that is used to identify the function affecting the control input is replaced by a parametric estimate of the unknown constant input gain provided by an online parameter identification law. The proposed identifier structure merges a nonparametric identification of the lumped disturbances by using a DNN and a parametric identification of the constant input gain provided by a parameter adaptive law. It is worth noting that this type of combination, namely an identifier based on a DNN plus a parametric estimator, has not been reported in the literature. Moreover, in contrast with the research related to such a parallel identifier, we add to this identification scheme a correction term based on the error between the AUV dynamics and the identifier to improve its approximation quality. The proposed indirect adaptive control structure uses this identification error to update the weights, adapting itself to the dynamic changes of the vehicle, and adjusts the parameter estimator simultaneously. The online neural network approximation of the unknown dynamic part (lumped disturbances) and the estimate of the unknown gain of the control input are used to cancel the unknown and nonlinear effects from each vehicle dynamic via the designed control law. Then, for the resulting quasi-linear second-order subsystems of the AUV, a feedback control law is calculated in order to force each subsystem to track the desired states provided by a reference system. This type of identification-control scheme is named the Dynamic Neural Control System (DNCS).

Stability Analysis. The ultimate boundedness of the identification and tracking errors under the proposed DNCS is guaranteed via the Lyapunov stability framework. Additionally, the stability proof allows us to derive the learning laws for the DNNs and the adaptation law for the (constant) input gain parameter.
The rest of the paper is structured as follows. Section 2 introduces some preliminaries and the mathematical model of the AUV for control purposes. In Section 3, the identification scheme, based on nonparametric identification of the lumped disturbances and estimation of the parametric uncertainty of the input gains affecting the vehicle dynamics, is addressed. Section 4 presents the trajectory tracking problem of desired reference states for the vessel dynamics. In Section 5, we present a theorem summarizing the main contribution of the present study, comprising the design of the proposed DNCS. In this section, the ultimate boundedness of the tracking and identification errors is guaranteed by means of a Lyapunov stability analysis. In Section 6, a set of numerical experiments is carried out to show the effectiveness of the proposed DNCS approach, forcing the AUV to track reference states in different scenarios despite the presence of lumped disturbances and parametric uncertainties. Additionally, a performance index analysis comparing the proposed indirect adaptive neural-control scheme with the conventional feedback controller on the AUV tracking problem is presented. General comments and future research perspectives are stated in Section 7.

Preliminaries and AUV Mathematical Model
In this section, some mathematical preliminaries necessary for the development of the present work are stated.

Preliminaries
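Theorem 1. (Young's inequality) [44]. Let $f$ be a continuous and increasing function on $[0, c]$ with $c > 0$ and $f(0) = 0$. If $a \in [0, c]$ and $b \in [0, f(c)]$, then

$$ab \le \int_0^a f(x)\,dx + \int_0^b f^{-1}(x)\,dx, \qquad (1)$$

where $f^{-1}$ stands for the inverse function of $f$. Equality holds in (1) if and only if $b = f(a)$. If we consider the function $f(x) = x^{p-1}$ ($p > 1$), we get

$$ab \le \frac{a^p}{p} + \frac{b^q}{q}, \qquad \frac{1}{p} + \frac{1}{q} = 1. \qquad (2)$$

Lemma 1 ([45]). For any matrices $X, Y \in \mathbb{R}^{n \times m}$ and any symmetric positive definite matrix $\Delta \in \mathbb{R}^{n \times n}$, the following inequality holds:

$$X^{\top} Y + Y^{\top} X \le X^{\top} \Delta X + Y^{\top} \Delta^{-1} Y. \qquad (3)$$

Theorem 2. The matrix Riccati equation

$$P A + A^{\top} P + P R P + Q = 0 \qquad (4)$$

with constant parameters $A \in \mathbb{R}^{n \times n}$, $0 < R = R^{\top} \in \mathbb{R}^{n \times n}$ and $0 < Q = Q^{\top} \in \mathbb{R}^{n \times n}$ has a unique positive definite solution $0 < P = P^{\top}$ if the following conditions are fulfilled:
1. The matrix $A$ is stable.
2. The matrices $(A, Q, R)$ satisfy the inequality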
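
As a quick numerical illustration of the two inequalities used in the stability analysis, the following minimal sketch checks the power form of Young's inequality, $ab \le a^p/p + b^q/q$ with $1/p + 1/q = 1$, and the vector case ($m = 1$, $n = 2$) of the matrix inequality $X^{\top}Y + Y^{\top}X \le X^{\top}\Delta X + Y^{\top}\Delta^{-1}Y$. The function names `young_gap` and `lemma1_gap` are hypothetical and chosen only for this example.

```python
def young_gap(a, b, p):
    """Return a**p/p + b**q/q - a*b with 1/p + 1/q = 1 (nonnegative for a, b >= 0)."""
    q = p / (p - 1.0)  # conjugate exponent
    return a ** p / p + b ** q / q - a * b


def lemma1_gap(x, y, delta):
    """Return x'Dx + y'D^{-1}y - 2 x'y for 2-vectors x, y and a symmetric
    positive definite 2x2 matrix D (vector case of the matrix inequality)."""
    (d11, d12), (_, d22) = delta
    det = d11 * d22 - d12 * d12
    inv = ((d22 / det, -d12 / det), (-d12 / det, d11 / det))

    def quad(v, m):
        # Quadratic form v' m v for a 2x2 matrix m.
        return sum(v[i] * m[i][j] * v[j] for i in range(2) for j in range(2))

    dot = x[0] * y[0] + x[1] * y[1]
    return quad(x, delta) + quad(y, inv) - 2.0 * dot
```

Both functions return the slack of the corresponding inequality, so a nonnegative value (up to rounding) confirms the bound for the chosen arguments; the Young slack vanishes when $b = a^{p-1}$.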

Mathematical Model of an AUV
In this section, the mathematical model of the 4DOF AUV under analysis is addressed. Following the underwater modeling methodology proposed by Fossen [46], two reference frames are defined: the inertial coordinate frame $(O_I, X_I, Y_I, Z_I)$ and the body-fixed coordinate frame $(O_B, X_B, Y_B, Z_B)$, as presented in Figure 1. In order to conduct the modeling, the following assumptions are considered.

Assumption 1. By adequate placement of the buoyancy and gravity centers of the AUV in Figure 1, the roll and pitch angles are kept small, i.e., $\phi \approx 0$ and $\theta \approx 0$. This gives the vehicle only four degrees of freedom (4DOF).

Assumption 2. The vehicle is completely actuated in the remaining 4DOF, i.e., the translational displacements in the X, Y, and Z axes, and the angular movement around the Z axis.

For the 4DOF vehicle, the linear and angular velocities are defined in the body-fixed frame by the vector $\nu = [u, v, w, r]^{\top}$, whereas the three-dimensional position and the heading angle are defined in the inertial frame by the vector $\eta = [x, y, z, \psi]^{\top}$. The kinematic equation, which relates the velocities between the inertial and body frames, is given as

$$\dot{\eta} = J(\eta)\,\nu, \qquad (6)$$

where, under Assumption 1, $J(\eta)$ is defined as

$$J(\eta) = \begin{bmatrix} \cos\psi & -\sin\psi & 0 & 0 \\ \sin\psi & \cos\psi & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \qquad (7)$$

The underwater vehicle dynamics expressed in the body-fixed frame are obtained by using the Newton-Euler formulation and are written in matrix form as follows [46]:

$$M\dot{\nu} + C(\nu)\nu + D(\nu)\nu + G(\nu) = \tau + \tau_d, \qquad (8)$$

where $M$ is the inertia matrix, $C(\nu)$ is the Coriolis and centripetal matrix, $D(\nu)$ is the hydrodynamic damping matrix, and $G(\nu)$ is the vector of restoring forces and moments. The control input vector $\tau$ represents the vector of forces and moments exerted by the thrusters, and the vector $\tau_d$ denotes the environmental disturbances produced by wave- or current-induced effects. These matrices and vectors have the following form.
The parameter $m$ is the mass of the underwater vehicle. The parameters $W$ and $B$ are the weight and buoyancy forces acting on the vehicle. The hydrodynamic added mass effects are associated with the coefficients $X_{\dot{u}}$, $Y_{\dot{v}}$, $Z_{\dot{w}}$, and $N_{\dot{r}}$, and the linear and nonlinear hydrodynamic damping phenomena are characterized by the coefficients $X_u$, $Y_v$, $Z_w$, $N_r$ and $X_{u|u|}$, $Y_{v|v|}$, $Z_{w|w|}$, $N_{r|r|}$, respectively. The added mass and damping coefficients can be considered constant for maneuvering applications [47].
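
The kinematic relation between body-fixed velocities and inertial rates under Assumption 1 (roll and pitch approximately zero) amounts to a planar rotation by the heading angle, with the heave and yaw rates passed through unchanged. A minimal sketch follows; the function name `body_to_inertial` is hypothetical and introduced only for illustration.

```python
import math


def body_to_inertial(nu, psi):
    """Map body-fixed velocities nu = (u, v, w, r) to the inertial rates
    (x_dot, y_dot, z_dot, psi_dot), assuming roll = pitch = 0."""
    u, v, w, r = nu
    c, s = math.cos(psi), math.sin(psi)
    # Rotation about the vertical axis for surge/sway; heave and yaw unchanged.
    return (c * u - s * v, s * u + c * v, w, r)
```

For example, a pure surge velocity with a heading of 90 degrees is mapped to motion along the inertial Y axis.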
From the kinematic Equation (6), we can obtain Then, the underwater vehicle dynamics in Equation (8) expressed in the body-fixed frame can be expressed in the inertial frame as where It is important to mention that the model presented in Equation (15) is the most commonly used representation for an AUV in the specialized literature.
Equations (17)-(20) represent the nonlinear equations of the coupled 4DOF AUV system. We can notice that Equations (19) and (20) for the z and heading dynamics are decoupled from the x and y dynamics, so it is possible to control them separately. In this sense, for control purposes, the yaw angle dynamics in Equation (20) are first stabilized at zero, i.e., $\psi = 0$, $\dot{\psi} = 0$. It is worth mentioning that stabilizing the orientation of the vehicle as a first step, in order to reduce the complexity of the control design, is an approach that has been used in several studies related to autonomous quadrotor [48,49] and underwater [50] vehicles. Under the previous assumption, the x, y, and z dynamics expressed in an inertial frame are simplified to the uncoupled equations below: It is important to notice that, in the analysis presented in this study, the damping and added mass coefficients as well as the buoyancy and weight forces are supposed to be unknown. Furthermore, we assume that the structure of the damping dynamics $X_u \dot{x} + X_{u|u|}|\dot{x}|\dot{x}$, $Y_v \dot{y} + Y_{v|v|}|\dot{y}|\dot{y}$, and $Z_w \dot{z} + Z_{w|w|}|\dot{z}|\dot{z}$ is not available.
In this sense, the nonlinear models in Equations (20)- (23) suffer from modeling and parametric uncertainties as well as unknown but bounded external disturbances. Hence, the first and third elements of the right-hand sides of these equations, namely the modeling uncertainty and external disturbances, will hereafter be referred to as lumped disturbances and the unknown constant gain multiplying the input force will be referred to as the parametric uncertainty.

Dynamic Neural Network Identification
This section focuses on the proposed identification scheme. Let us consider the following second-order input-affine nonlinear system:

$$\dot{\zeta}_1 = \zeta_2, \qquad \dot{\zeta}_2 = f(\zeta, t) + b\,u(\zeta). \qquad (24)$$

The state vector is given by $\zeta = [\zeta_1\ \zeta_2]^{\top} \in \mathbb{R}^2$. We assume that the state vector can be obtained directly from system measurements without the need for a state observer. The function $f(\cdot, \cdot) : \mathbb{R}^{2+1} \to \mathbb{R}$ denotes the unknown lumped disturbances. It is defined as the sum of a nonlinear function of the state, mapping $\mathbb{R}^2$ to $\mathbb{R}$, which represents the nonlinear unmodeled dynamics of the system, and a scalar function of time, which corresponds to the external disturbances.
The control signal is denoted by u(ζ) ∈ R. The constant gain b is assumed to be unknown but satisfying the boundary condition b 0 ≤ |b| ≤ b * for known positive constants b 0 and b * . This condition guarantees that the system is controllable. In addition, assume that sgn(b) is known. It should be noted that each of the four second-order subsystems presented in Equations (20)-(23), can be rewritten in the form presented in Equation (24). This is due to the AUV subsystems having a double integrator structure. The lumped disturbances and parametric uncertainty only affect the second differential equation of Equations (20)-(23).
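
To make the double-integrator structure concrete, the following minimal sketch advances one subsystem by an explicit Euler step, with the lumped disturbance and the gain entering only the second state equation. The disturbance function `lumped_disturbance` and all of its numerical coefficients are hypothetical placeholders and are not the BlueROV2 values; Euler integration is likewise an assumption made for brevity.

```python
import math


def lumped_disturbance(zeta, t):
    """Hypothetical f(zeta, t): linear plus quadratic damping on the velocity
    and a slow sinusoidal external disturbance (illustrative values only)."""
    z2 = zeta[1]
    return -0.5 * z2 - 0.8 * abs(z2) * z2 + 0.2 * math.sin(0.5 * t)


def plant_step(zeta, u, t, b, dt, f=lumped_disturbance):
    """One explicit-Euler step of zeta1' = zeta2, zeta2' = f(zeta, t) + b*u."""
    z1, z2 = zeta
    return (z1 + dt * z2, z2 + dt * (f(zeta, t) + b * u))
```

Passing a different `f` lets the same step function represent each of the x, y, z, and heading subsystems.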

Identifier Based on Dynamic Neural Networks
Inspired by the dynamic identifier scheme reported in [42,43], we propose a novel parallel identifier structure, Equation (25), for the second-order system in Equation (24). In this structure, $\hat{f}(\hat{\zeta} \mid W(t))$ is the dynamic identification operator of the unknown lumped disturbances $f(\zeta, t)$. According to the dynamic neural network approach presented in those works, this operator is selected as the following Hopfield-type neural network:

$$\hat{f}(\hat{\zeta} \mid W(t)) = a_1 \hat{\zeta}_1 + a_2 \hat{\zeta}_2 + W^{\top}(t)\,\Psi(\hat{\zeta}), \qquad (26)$$

where $a_1$, $a_2$ are negative constants to be chosen, $\Psi(\cdot) : \mathbb{R}^2 \to \mathbb{R}^2$ is the vector of activation functions, and $W(t)$ is the weight vector describing the neural network connections. Notice that the time dependence in $W(t)$ is explicitly considered hereafter to highlight that the weights of the dynamic neural network are time-varying. In the structure proposed in Equation (25), we integrate an adaptive parametric estimate $\hat{b}(t)$ of the unknown constant $b$, in contrast to the identifiers based on DNNs where another neural network is used. Moreover, the suggested identifier scheme in Equation (25) has a double integrator form, unlike the n-dimensional first-order vector differential equation used by the cited identifier structures [42,43]. Another difference in the structure presented in Equation (25) is the addition of the correction term $K_{id}\tilde{\zeta}$, which is used to improve the convergence performance of the identifier. This term is based on an adjustable constant gain $K_{id} \in \mathbb{R}^{2 \times 1}$, and the identification error $\tilde{\zeta}$ is defined as follows:

$$\tilde{\zeta} = \zeta - \hat{\zeta}. \qquad (27)$$

Proceeding with the analysis, let us suppose that, for the given parameters $a_1$ and $a_2$, it is possible to select suitable weights $W_0$ in Equation (26) so that the network closely matches the function $f(\zeta, t)$ in the structure of Equation (24). However, there exists a bounded difference between the neural network and the real dynamics.
In this sense, the function $f(\zeta, t)$ can be represented by the following neural network model:

$$f(\zeta, t) = a_1 \zeta_1 + a_2 \zeta_2 + W_0^{\top} \Psi(\zeta) + \tilde{f}(\zeta, t), \qquad (28)$$

where $W_0 \in \mathbb{R}^2$ denotes the vector of unknown constant weights and $\tilde{f}(\zeta, t) = f(\zeta, t) - [a_1 \zeta_1 + a_2 \zeta_2 + W_0^{\top} \Psi(\zeta)]$ defines the approximation error.

Assumption 4. The upper bound for the initial weights $W_0$ is available: where $\Delta$ and $\overline{W}$ are positive definite matrices with appropriate dimensions.

The problem of tuning the approximator in Equation (28) in order to find the initial weights $W_0$ can be solved separately by implementing an offline training stage before carrying out the online system identification process. By means of input-output data $\{(\zeta(k), u(k)) \mid k = 1, \ldots, N\}$ produced by several experimental tests, a numerical algorithm such as least squares [51,52] can be used to obtain these baseline values.
where f 0 and f 1 are positive constants.
The elements of the vector $\Psi(\zeta)$ are defined as sigmoid-type activation functions: where the positive real parameters $b_i$, $d_i$, and $c_i$ represent the bound, the slope, and a constant that shifts the function, respectively.

Assumption 6. The vector activation function $\Psi$ fulfills the condition where $D_\Psi$ is a positive definite matrix.
From Equations (24) and (25), the dynamics of the identification error $\tilde{\zeta}$ follow, where $\tilde{b}$ denotes the parameter error. The weight and activation function errors are defined as $\tilde{W}(t) = W_0 - W(t)$ and $\tilde{\Psi} = \Psi(\zeta) - \Psi(\hat{\zeta})$, respectively. Defining suitable matrices, the identification error dynamics can be written in a compact form in which the matrix $A$ is Hurwitz. The time-varying weight vector $W(t)$ is adjusted by weight learning laws designed through Lyapunov stability analysis:

$$\dot{W}(t) = \Gamma(W(t), \tilde{\zeta}). \qquad (36)$$

Notice that the nonlinear function on the right-hand side of Equation (36) depends on the identification error. This means that, if the state of the system in Equation (24) deviates from the identifier state, the weight vector changes in order to improve the approximation of the unknown plant structure as it evolves over time. The DNN adjustment is explained in more detail in Section 5.
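
The identification operator described above, a linear part with negative coefficients $a_1$, $a_2$ plus a weighted sum of sigmoid activations, can be sketched as follows. The specific sigmoid expression $b_i / (1 + e^{-d_i z}) - c_i$ is an assumption chosen to match the stated roles of the bound, slope, and shift parameters; it is not taken from the source, and the function names are hypothetical.

```python
import math


def sigmoid(z, b, d, c):
    """Assumed activation: bound b, slope d, shift c."""
    return b / (1.0 + math.exp(-d * z)) - c


def dnn_output(zeta_hat, W, a1, a2, params):
    """Evaluate a1*z1 + a2*z2 + W' Psi(z) with one activation per state.

    params holds one (b, d, c) tuple per activation function."""
    psi = [sigmoid(z, *p) for z, p in zip(zeta_hat, params)]
    linear = a1 * zeta_hat[0] + a2 * zeta_hat[1]
    return linear + sum(w * s for w, s in zip(W, psi))
```

With `params = [(2.0, 1.0, 1.0), (2.0, 1.0, 1.0)]` each activation is an odd function bounded in (-1, 1), so the network output vanishes at the origin for any weights.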
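
The Hurwitz requirement on a 2×2 matrix can be checked without computing eigenvalues: a real 2×2 matrix is Hurwitz exactly when its trace is negative and its determinant is positive. The sketch below applies this test; the companion form built from $a_1$ and $a_2$ is an assumption used only for illustration, and the helper names are hypothetical.

```python
def is_hurwitz_2x2(A):
    """Return True when the real 2x2 matrix A has both eigenvalues in the
    open left half-plane, i.e., trace(A) < 0 and det(A) > 0."""
    (a11, a12), (a21, a22) = A
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    return trace < 0 and det > 0


def companion(a1, a2):
    """Assumed double-integrator companion matrix [[0, 1], [a1, a2]]."""
    return ((0.0, 1.0), (a1, a2))
```

For the assumed companion form, the test reduces to $a_1 < 0$ and $a_2 < 0$, which agrees with the requirement that both constants be negative.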

Trajectory Tracking Problem Statement
This section states the tracking problem based on the identification structure proposed in Equation (25). The control objective is to design an indirect adaptive control law $u(\zeta)$, based on the approximation given by the neural network in Equation (26) and the parametric estimate of $b$, that forces the state of the second-order system in Equation (24) to track a desired trajectory $\zeta_{ref} = [\zeta_1^r, \zeta_2^r]^{\top} \in \mathbb{R}^2$. The reference trajectory is the solution of the nonlinear reference model in Equation (37).

Assumption 7. By design, the reference signal is assumed to be bounded.

Now, let us define the tracking error $e$. From Equations (25) and (37), the tracking error dynamics are obtained as Equation (40). Adding and subtracting the term $a_1 \zeta_1^r + a_2 \zeta_2^r$ in the second line of Equation (40), we obtain the tracking error dynamics in Equation (41). The tracking problem is solved by stabilizing the dynamics in Equation (41), which implies that the tracking error $e$ converges to an arbitrarily small neighborhood of zero. If we can ensure that the error $e$ converges to a bounded region, then we can guarantee that the system in Equation (24) tracks the reference given by Equation (37).

Neural Control Design and Stability Analysis
This section presents the main contribution of our study regarding the proposed identifier-controller structure. The trajectory tracking control law design based on the dynamic neural network structure is summarized in the following theorem.

Theorem 3. Consider the second-order system presented in Equation (24) and its approximation given by the dynamic neural identifier in Equation (25). Suppose that Assumptions 1-7 are fulfilled. If the control signal $u(\zeta)$ is proposed as in Equation (42), where $K_\zeta \in \mathbb{R}^{2 \times 1}$, there exist constant parameters $\Delta_3 \in \mathbb{R}$ and $\Delta_1 = \Delta_1^{\top} \in \mathbb{R}^{2 \times 2}$ such that $P_i = P_i^{\top} \in \mathbb{R}^{2 \times 2}$, $i = 1, 2$, are positive definite solutions of the algebraic Riccati equations in Equation (43), where $K_\zeta$ and $K_{id}$ are chosen such that the matrices $A_1$ and $A_2$ are Hurwitz, and the adaptive laws for the weights $W(t)$ and the parameter estimate $\hat{b}(t)$ are chosen as in Equations (46) and (47), then we have the following:
(a) The identification ($\tilde{\zeta}$) and tracking ($e$) errors converge to a compact residual set around zero, i.e., these trajectories are uniformly ultimately bounded [53], as in Equation (48).
(b) The weight error $\tilde{W}(t)$ and estimation error $\tilde{b}$ trajectories are also bounded.

In these expressions, $Q_1 \in \mathbb{R}^{2 \times 2}$ and $Q_2 \in \mathbb{R}^{2 \times 2}$ are positive definite symmetric matrices, $I_{2 \times 2}$ is the 2 × 2 identity matrix, and $\gamma_1$, $\gamma_2$, $\delta_W$, and $\delta_b$ are positive real constants, with $\gamma_1$ and $\gamma_2$ defining the learning gain of the DNN and the adaptive gain for the parameter estimation, respectively.
Equations (25) and (42) define the indirect adaptive controller structure developed in this work, called Dynamic Neural Control System (DNCS).
With the aim of validating that identification and tracking errors of the control system achieve a given invariant region around the origin, this study applies Lyapunov stability theory. The mathematical proof is based on the proposed Lyapunov function candidate below: V e,ζ,W (t),b(t), t = e P 1 e +ζ P 2ζ where 0 < P i = P T i ∈ R 2×2 , i = 1, 2. The full proof is given in Appendix A. Theorem 3 ensures the existence of conditions of sufficiency, which can be used to justify the convergence of the identification and tracking errors, that is, both error trajectories are ultimately bounded as in Equation (48). This formal inference considers the application of the proposed control structure Equation (42) and the resolution of two matrix-type algebraic Riccati equations (AREs). The two Riccati equations can be solved before (offline) the identifier-based controller starts.
The DNCS control law in Equation (42) is comprised of the dynamic neural network approximation of the lumped disturbances, a compensation of the reference system dynamics, and a feedback controller based on the tracking error. Furthermore, it incorporates a compensation of the estimateb of the unknown control input gain. Therefore, the proposed control law design improves the behavior of the closed-loop system when a standard feedback controller with no compensation is applied. On the other hand, the expressions that represent the weight-updating laws for the DNN weights W (t) and the estimation of the input gainb(t) are obtained also as a consequence of the Lyapunov stability analysis and satisfy Equations (46) and (47), respectively. The parameters γ 1 and γ 2 represent the DNN learning rate and the adaptive gain, respectively, and δ W is a positive scalar. At this point, we suppose that the matrix W 0 is estimated via an offline training and then used during the adjustment of W (t), which is closely related to the nominal description of Equation (28). Moreover, both adaptation laws in Equations (46) and (47) depend on the matrix P 2 calculated from the second algebraic Riccati Equation (43) as well as for the identification errorζ. The diagram in Figure 2 describes the general operation of the proposed algorithm applied to the AUV system dynamic Equations (20)-(23) for trajectory tracking. The closed-loop system structure is composed of the AUV dynamic model to develop a parallel identifier based on online adjustment of the DNN weights and the estimation law of parameter b, a reference system that generates the desired trajectory and a controller to calculate the input signal for both the AUV system and the identifier.
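
The way the three ingredients of the control law combine (cancellation of the estimated lumped disturbance, compensation of the reference dynamics, and tracking-error feedback, all scaled by the estimated input gain) can be illustrated with a generic certainty-equivalence sketch. This is not the paper's Equation (42): the feedback form, the tracking error convention (state minus reference), the gains `k`, and the clamping of the estimate to the known range $b_0 \le |b| \le b^*$ are all assumptions made for this example, and the function names are hypothetical.

```python
def project_gain(b_hat, b0, b_star, sign=1.0):
    """Clamp the estimate so that b0 <= |b_hat| <= b_star with the known sign
    of b, which keeps the division in the control law well defined."""
    magnitude = sign * b_hat
    magnitude = min(max(magnitude, b0), b_star)
    return sign * magnitude


def control(e, f_hat, b_hat, ref_acc, k, b0, b_star, sign=1.0):
    """Assumed certainty-equivalence law u = (v - f_hat) / b_hat, where
    v = ref_acc - k[0]*e[0] - k[1]*e[1] is the desired acceleration."""
    v = ref_acc - k[0] * e[0] - k[1] * e[1]
    return (v - f_hat) / project_gain(b_hat, b0, b_star, sign)
```

When the disturbance estimate and the gain estimate are exact, substituting this input into the second state equation leaves only the desired acceleration `v`, which is the cancellation effect described above.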
The DNCS scheme performs a simultaneous identification and control of the unknown perturbed AUV system, implementing an online training process that makes use of inputoutput data of its evolution over time. The learning laws that feed the proposed DNCS scheme comprise a set of differential equations for which the solutions are the DNN weight matrices and the parametric estimation of the constant input gain involved in the structure proposed in Equation (25). The DNN weight matrices W(t) are updated according to the evolution over time of the identification error. This error is defined as the difference between the system and the identifier states. Additionally, the difference between the nominal and the actual values of these weight matrices is added, as a correction term (denoted byW(t)), in the adjustment of the learning law. On the other hand, the estimateb(t) is also calculated by considering the identification error evolution to perform a parameter identification process. Thus, the DNCS-based identifier copies the system behavior by conducting an online adjustment of the weight matrices and the parametric estimate, causing the identification error to attain reduced values. The information provided by the identifier is used to design a tracking control for each AUV subsystem. This control input is comprised of the compensation of unknown dynamics of the plant provided by the DNCS and a standard feedback control.  Note for practitioners: the DNCS identification and control scheme carry out an online training process by using only the input-output information of the unknown perturbed AUV system. Regarding the DNCS algorithm tuning procedure, a set of parameters are adjusted by trial and error to perform effective identification and control of the unknown dynamics. The gain γ 1 determines the speed of learning of each DNN. This value is adjusted by taking into account the speed of the dynamic changes of the unknown AUV dynamics. 
That is, it takes greater values if the dynamic response to be identified is characterized by several signals of different frequencies (for example, hydrodynamic damping combined with abrupt currents or waves affecting the AUV system). Additionally, the nominal values of the weight matrices $W_0$ in the DNN structure can be adjusted. However, one may use their corresponding upper values, which may be supposed to be known, or they can be obtained by offline training using some AUV experimental data. The parameters $a_1$ and $a_2$ that correspond to the linear part of the DNN structure need only be negative real values and, once selected, are generally kept fixed. Another free design parameter is the adaptive gain $\gamma_2$, which is related to the rate of convergence of the parameter identification. Numerical simulations showed that the rate of parameter convergence increases with increasing adaptive gain. The sigmoidal activation function parameters $b_i$, $c_i$, and $d_i$ can be selected by trial and error to enhance the identifier response; however, for simplicity in the simulation experiments, we considered some baseline values suggested by related works [39,54] that implement the kind of dynamic neural network used in the proposed DNCS.

Numerical Results
In this section, we present two simulation scenarios. The simulations were carried out in the MATLAB® software environment. The first simulation consists of tracking an angular position reference for the yaw dynamics and a depth reference in the Z axis. In the second one, the underwater vehicle has the task of tracking a 3D helical trajectory with a radius of 5 m. In this simulation analysis, we present a comparison between the DNCS, a standard feedback controller designed by using the Linear Quadratic Regulator (LQR) [55], and a Proportional Derivative (PD) controller implementing an online auto-tuning method that sets its parameters by using the successive approximation method (SAM) [56]. The self-tuned PD controller thus provides an adaptive baseline, so that the comparative performance analysis against the DNCS controller is made under fair conditions. Certainly, there exist other nonparametric control schemes or sliding mode techniques, which incorporate adaptive properties, that could be suitable alternatives for the system class subject to our analysis. However, a number of them need some gain adjustment, which is no longer required for the proposed DNCS controller. To conduct a more realistic simulation, in both scenarios, external disturbances were added to the dynamic model. Table 1 presents the parameters used for the dynamic model given in Equation (8). These parameters are taken from [57], where they were derived from several experimental tests on the BlueROV2 vehicle [41]. These parameter values are also used for the derivation of the referred LQR feedback controller. For the control strategy presented in Equation (42) and the adaptive laws given by Equations (46) and (47), the parameters used in the simulation are shown in Table 2.

Table 2. Controller gains and adaptive law parameters used in the simulation results.

It is worth mentioning that, for control purposes, the DNCS was developed based on the simplified dynamics given by Equations (21)-(23). However, it is important to stress that, in the simulation experiments, we introduce the complete AUV 4DOF dynamical Equations (17)-(20), which describe a completely coupled AUV system (without any simplification). The simulation framework used in the experiments of the second scenario considers the four controllers (x, y, z, and yaw) acting at the same time. Implementation of the DNCS algorithm stabilizes the yaw dynamics in Equation (20), ensuring a fast controller response and a rapid convergence to the origin. This action guarantees that the translational (x,y) dynamics in Equations (17) and (18) can be rapidly reduced to the ones described in Equations (21) and (22), simplifying the trigonometric coupling functions involving the yaw angle. Therefore, the DNCS algorithm is applied to the horizontal translational dynamic Equations (21) and (22) and to the z-axis dynamic Equation (23) separately to track desired 3D trajectories in the x-y-z space. However, the simulations were conducted by using the nominal model given by Equations (17)-(20).
In order to preserve the controllability condition for the estimated AUV system, it is necessary to guarantee the boundedness of $u(\xi)$ in Equation (42) by modifying the adaptive law in Equation (47), so that the estimate $\hat{b}$ cannot reach values arbitrarily close to or equal to zero. In general, most parametric estimation problems provide some prior information about the location of the parameter to be estimated. Such prior information can be used to constrain or project the online estimate so that it remains in a known convex set. In our case, we need to constrain (project) the parametric estimate $\hat{b}$ to a convex set to prevent the estimate from taking values near zero. One of the most common operators used in adaptive control systems is the so-called parameter projection operator [58]. For our case, we consider the following prior knowledge: the sign of $b$ and the lower bound $b_0$ of the absolute value of $b$, i.e., $|b| \ge b_0$. This information can be obtained using the nominal parameter values of the added mass coefficients of the BlueROV2 vehicle structure, reported in [57]. Therefore, the restriction for the parameter estimate can be expressed as $|\hat{b}| = \hat{b}\,\mathrm{sgn}(b) > b_0$. Now, following the procedure in [58], we can define the convex set in which the parameter estimate must evolve, that is, $S = \{\hat{b} \in \mathbb{R} \mid b_0 - \hat{b}\,\mathrm{sgn}(b) \le 0\}$. Therefore, we can apply the projection operator to the adaptive law for $\hat{b}$, leading to a projected law for $\dot{\hat{b}}$, where $\hat{b}(0)$ is chosen so that $\hat{b}(0)\,\mathrm{sgn}(b) > b_0$. This modification constrains the parameter estimates to known convex sets that do not include zero. In other words, when the parameter $\hat{b}$ reaches its lower absolute value $b_0$ and the time derivative $\dot{\hat{b}}$ is not zero, the adaptation process is stopped. Moreover, by using the properties of the projection operator, the stability analysis developed in this research work holds [58].
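The projection idea described above can be illustrated with a minimal sketch. It assumes the common scalar form of the projection operator, in which the update is frozen only when the estimate sits on the boundary of $S$ and the unprojected rate points out of the set; the function names `projected_rate` and `integrate_estimate`, the forward-Euler integration, and the boundary clamp are illustrative choices and are not taken from the paper.

```python
def projected_rate(b_hat, raw_rate, sign_b, b0):
    """Projected derivative of the estimate b_hat (assumed scalar form).

    S = {b_hat : b0 - b_hat * sign_b <= 0}. Inside S, or on its boundary
    with the raw rate pointing inward, the raw adaptive rate is kept.
    On the boundary with an outward-pointing rate, adaptation is frozen.
    """
    g = b0 - b_hat * sign_b          # g <= 0 means b_hat is in S
    g_rate = -sign_b * raw_rate      # time derivative of g under the raw law
    if g < 0.0 or g_rate <= 0.0:
        return raw_rate
    return 0.0


def integrate_estimate(b_hat0, raw_rate_fn, sign_b, b0, dt, steps):
    """Forward-Euler integration of the projected law (illustrative only)."""
    b_hat = b_hat0
    history = [b_hat]
    for k in range(steps):
        rate = projected_rate(b_hat, raw_rate_fn(k * dt, b_hat), sign_b, b0)
        b_hat += dt * rate
        # Discrete steps can overshoot the boundary; clamp back onto it.
        if b_hat * sign_b < b0:
            b_hat = sign_b * b0
        history.append(b_hat)
    return history


# Hypothetical raw law that keeps pushing a positive-gain estimate toward zero.
trajectory = integrate_estimate(1.0, lambda t, b: -1.0, sign_b=1.0, b0=0.5,
                                dt=0.01, steps=200)
```

In this toy run the estimate decreases from 1.0 until it reaches the bound 0.5 and then stays there, so the input gain used by the controller never approaches zero.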

Yaw and Depth Trajectory Tracking
In this first simulation scenario, the underwater vehicle maneuvers simultaneously in the yaw and depth dynamics. The reference signal for the yaw dynamics is a sine function with an amplitude of 45 degrees and an angular frequency of 0.1π rad/s and, for the Z axis, a sine function with an amplitude of 2 m and the same angular frequency is chosen:
$$\zeta_{1r,z} = 2\sin(0.1\pi t) + 3, \qquad \zeta_{2r,z} = 0.2\pi\cos(0.1\pi t). \tag{55}$$
In this scenario, disturbance signals are induced in the dynamic model (8). In Figures 3 and 4, we present the results obtained for the trajectory tracking problem in the yaw and depth dynamics. We can notice from Figures 3a and 4a that the performance of the feedback controller deteriorates as a consequence of the unknown dynamics and the induced disturbances. The online-tuned PD controller attains an improved tracking performance in the presence of unknown dynamics and disturbances due to its adaptive characteristics; however, the actual trajectory is still affected by oscillations around the reference signal. In contrast, the developed DNCS enables the system to track the reference signal satisfactorily, with a significantly smaller tracking error than the other controllers. The estimation of the lumped disturbances by the dynamic neural network, presented in Figures 3b and 4b, enables the DNCS to increase the performance of the underwater vehicle.

From Figures 3a and 4a, we can observe that the parameters $\hat{b}_\psi(t)$ and $\hat{b}_z(t)$ converge to a small region in about 4 and 12 s, respectively. Additionally, we observe that the smaller the identification error of the parameters $\hat{b}_\psi(t)$ and $\hat{b}_z(t)$, the better the approximation of the unknown dynamics $f_\psi$ and $f_z$ that is obtained. 
However, even when the adaptive parameters are far from their true value, the compensation of the dynamic neural network approximation enables us to conduct satisfactory trajectory tracking in the first seconds of the simulation, as can be observed in Figures 3 and 4.
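As a small illustration of the depth reference in Equation (55), the sketch below generates the reference position and velocity and can be used to check that the second signal is the time derivative of the first. The function name `depth_reference` and the constant `OMEGA` are illustrative; the yaw reference is omitted because its full expression is not restated here.

```python
import math

OMEGA = 0.1 * math.pi  # angular frequency in rad/s, as in Equation (55)


def depth_reference(t):
    """Return (position, velocity) of the depth reference at time t in seconds."""
    position = 2.0 * math.sin(OMEGA * t) + 3.0      # meters, oscillates in [1, 5]
    velocity = 2.0 * OMEGA * math.cos(OMEGA * t)    # equals 0.2*pi*cos(0.1*pi*t)
    return position, velocity


# One period of the reference lasts 2*pi/OMEGA seconds.
period = 2.0 * math.pi / OMEGA
```

With this angular frequency, one period lasts 20 s, so a 40 s simulation covers two full oscillations of the depth reference.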

3D Helical Trajectory Tracking
To verify the performance of the developed DNCS, a more complex scenario is proposed, in which a 3D helical reference signal with a radius of 5 m is used. In this case, the desired signals for the X and Y axes are defined as
$$\zeta_{1r,x} = 5\cos(0.1\pi t), \qquad \zeta_{2r,x} = -0.5\pi\sin(0.1\pi t), \tag{57}$$
$$\zeta_{1r,y} = 5\sin(0.1\pi t), \qquad \zeta_{2r,y} = 0.5\pi\cos(0.1\pi t). \tag{58}$$
In the same way as in Section 6.1, disturbance signals were added to the dynamic model (8) for the 3D helical desired trajectory. The simulation results for the 3D helical trajectory tracking scenario are shown in Figures 5-8. Figure 5 depicts the three-dimensional trajectory tracking with respect to the reference trajectory, which shows the accomplishment of the tracking mission. In this simulation experiment, the desired yaw angle is selected as 0 radians. The initial position of the AUV is selected as (x(0), y(0), z(0)) = (3, −2, 2). The 3D trajectory begins at a depth of 2 m and reaches up to 6 m, with the center at the origin and a radius of 5 m. This kind of trajectory can be useful in 3D reconstruction applications of underwater structures. We can observe the trajectory tracking results for both controllers: the DNCS and the feedback controller. Notice that the feedback controller is not able to counteract the effects caused by the lumped disturbances, whereas the DNCS enables the AUV to track the desired 3D trajectory with a small tracking error. Figures 6a, 7a, and 8a illustrate the time evolution of the position of the underwater vehicle in the X, Y, and Z axes. For each of the three axes, the performance of the DNCS, the feedback controller, and the PD controller with self-tuning is presented. When the dynamic neural network is added to the feedback controller, performance is significantly increased, reducing the tracking error and enabling the AUV to track the desired reference without abrupt changes. 
In the case of the PD controller that implements an online parameter adjustment, the deviation with respect to the desired trajectory is reduced; however, this type of controller fails to eliminate oscillations of the system trajectory.

The time evolution of the adaptive parameters $\hat{b}_x(t)$, $\hat{b}_y(t)$, and $\hat{b}_z(t)$ is presented in Figures 6b, 7b, and 8b, respectively. These parameters converge from the initial value to a small bounded region in approximately 15 s in all cases. The importance of the estimation of the adaptive parameters should be recalled because it enables the AUV to track desired references without knowledge of any parameter of the dynamic model in Equations (17)-(20).
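The horizontal reference signals of Equations (57) and (58) can be sketched as follows. The function name `helix_xy_reference` is illustrative, and the vertical component of the helix is deliberately left out because its expression is not restated here. The last lines compute the horizontal distance between the stated initial AUV position (3, −2) and the reference at t = 0, which gives an idea of the initial tracking error the controllers must absorb.

```python
import math

OMEGA = 0.1 * math.pi  # angular frequency in rad/s
RADIUS = 5.0           # helix radius in meters


def helix_xy_reference(t):
    """Return (x, y, vx, vy) of the horizontal helical reference at time t."""
    x = RADIUS * math.cos(OMEGA * t)
    y = RADIUS * math.sin(OMEGA * t)
    vx = -RADIUS * OMEGA * math.sin(OMEGA * t)   # equals -0.5*pi*sin(0.1*pi*t)
    vy = RADIUS * OMEGA * math.cos(OMEGA * t)    # equals  0.5*pi*cos(0.1*pi*t)
    return x, y, vx, vy


# Horizontal offset between the initial AUV position and the reference at t = 0.
x_ref0, y_ref0, _, _ = helix_xy_reference(0.0)
initial_offset = math.hypot(x_ref0 - 3.0, y_ref0 - (-2.0))
```

The reference stays on a circle of radius 5 m in the horizontal plane and moves at the constant speed 0.5π m/s, while the vehicle starts roughly 2.8 m away from it.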
In Figures 6b, 7b, and 8b, the estimation of the unknown functions $f_x$, $f_y$, and $f_z$, the so-called lumped disturbances, is presented. After the first 10 s of simulation, one can appreciate that the identifier based on the DNCS satisfactorily estimates the unknown dynamics and the external disturbances. Moreover, the identification error converges to the compact residual set around zero given by Equation (48). Additionally, a performance index analysis is conducted to compare the DNCS and feedback controllers. The comparison was performed by means of the two quality indexes for the tracking error in Equations (61) and (62) and the two indexes for the identification error in Equations (63) and (64), where $\bar{\tilde{\zeta}}_1$ denotes the mean value of the set of identification error values.
The performance index proposed in Equation (61) is the Integral of Absolute Error (IAE), which measures the absolute tracking deviation. The second index in Equation (62) is the Integral of Squared Error (ISE), which quantifies the squared tracking deviation. The index given in Equation (63) is the standard deviation, used to measure the dispersion of the identification error produced by the DNCS scheme. Additionally, the performance index (64) is defined as the root mean square error (RMSE) of the identification error. This is used to measure the amount of error between the set of values of the system and the identifier states. The comparison analysis reports that applying the DNCS controller and the feedback controller solves the trajectory tracking problem. Tables 3 and 4 show the calculated values for $J_{IAE}$ and $J_{ISE}$ (considering T = 40 s), with the tracking error data from the two proposed scenarios. This quantitative comparison supports the argument concerning the advantage of using the proposed DNCS-based controller. The numerical data presented in Tables 3 and 4 demonstrate that the proposed DNCS-based controller yields the smallest values of $J_{IAE}$ and $J_{ISE}$; this means that the AUV tracks the reference trajectories with superior performance. This evaluation intends to highlight the benefit of applying the suggested controller, in contrast with the feedback controller, to regulate AUVs when the dynamic model information is incomplete.
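The four indexes described above can be computed from sampled error signals as in the following minimal sketch. It assumes the textbook definitions (integrals approximated by the trapezoidal rule on a uniform grid, population standard deviation with divisor N); the exact expressions in Equations (61)-(64) may differ in detail, and all function names are illustrative.

```python
import math


def _trapezoid(values, dt):
    """Trapezoidal-rule integral of uniformly sampled values."""
    return dt * (sum(values) - 0.5 * (values[0] + values[-1]))


def iae(errors, dt):
    """Integral of Absolute Error over the sampled horizon."""
    return _trapezoid([abs(e) for e in errors], dt)


def ise(errors, dt):
    """Integral of Squared Error over the sampled horizon."""
    return _trapezoid([e * e for e in errors], dt)


def std_dev(errors):
    """Population standard deviation of the error samples (divisor N, assumed)."""
    mean = sum(errors) / len(errors)
    return math.sqrt(sum((e - mean) ** 2 for e in errors) / len(errors))


def rmse(errors):
    """Root mean square of the error samples."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Toy error sequence sampled every 0.5 s (hypothetical data, not from the paper).
sample_errors = [1.0, -1.0, 1.0, -1.0, 1.0]
j_iae = iae(sample_errors, 0.5)
j_ise = ise(sample_errors, 0.5)
j_std = std_dev(sample_errors)
j_rmse = rmse(sample_errors)
```

With these definitions, the squared RMSE equals the squared standard deviation plus the squared mean of the samples, which explains why a small RMSE and a small standard deviation together indicate an identification error that is both close to zero and free of large excursions.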
The information in Tables 5 and 6 shows numerical data of J STD and J RMSE for both experiments considering T = 40 s. The reported small standard deviation values indicate that the identification error at every step tends to be near the mean value of each set of error values (for each dynamic). In other words, the values of the identification errors have uniform behavior, avoiding abrupt changes or dispersion when they evolve over time.
On the other hand, the small numerical values of the RMSE index show that each DNCS estimate remains close to its corresponding subsystem trajectory during the simulation time. This validation supports the implementation of the proposed DNCS controller for identifying the closed-loop AUV subsystems.

Remark 1. The DNCS algorithm was designed with the aim of providing a solution of low computational complexity, in the spirit of a lightweight algorithm that allows an eventual onboard implementation in a small experimental platform. In this sense, the 4DOF AUV mathematical model is represented by four second-order equations, which are rewritten in their corresponding integrator chain representation. Thus, each second-order system is characterized by two first-order differential equations. This representation aims to avoid matrix operations (nested for loops, which may increase the computational time). Additionally, the right-hand sides of the learning law for the DNN weight matrix $W(t)$ and of the adaptive law for $\hat{b}(t)$ were decomposed into simple arithmetic operations. Hence, the closed-loop simulation of the DNCS control algorithm and the complete AUV model comprises several simulation steps, but it intends to be computationally efficient since it uses simple calculations. This fact makes it a conservative algorithm without any major impact on system performance.

Conclusions
In this research work, a novel Dynamic Neural Control System (DNCS) was developed to solve the trajectory tracking problem for an autonomous underwater vehicle with 4DOF in the presence of unmodeled dynamics, parametric uncertainties, and unknown external disturbances. A proof of closed-loop stability based on a proposed Lyapunov function candidate was derived. The formal stability result establishes sufficient conditions to guarantee the ultimate boundedness of the identification and tracking errors. These conditions involve the solution of a set of two AREs, a specific control law structure, and certain adaptive laws for the synaptic weights and for the estimate of the unknown input parameter. Besides solving the trajectory tracking problem, the proposed DNCS lets us estimate and compensate (a) unmodeled dynamics and external disturbances and (b) uncertain constant gains that multiply the control signal. The main strength of the proposed control strategy is that it solves the trajectory tracking problem with reduced knowledge of the mathematical model of the AUV (basically a reduced structure) and without any knowledge of the system parameters. A set of simulations was conducted in order to demonstrate the effectiveness and applicability of the proposed methodology. Moreover, a comparison analysis via two performance indexes was conducted between the DNCS and both a feedback controller and a self-tuned PD controller, neither of which includes dynamic neural compensation or parameter estimation. These comparisons show that the proposed DNCS technique is more effective in dealing with unmodeled dynamics, parameter uncertainties, and external disturbances. The identification capacity of the DNCS to approximate the highly nonlinear behavior of the AUV is validated in the different scenarios presented in the numerical experiments. 
Furthermore, the simulations also showed that the tracking and identification errors are ultimately bounded and remain in a compact residual set around zero. In addition, two further performance indexes, the standard deviation and the RMSE, were used to measure the performance of the identification process of the AUV system, leading to the conclusion that the DNCS identifier copes well with modeling and parametric uncertainties. Future research directions include the implementation of the developed control strategy on at least two underwater vehicle platforms with different characteristics to verify its performance in real time and in unknown environments. Moreover, if we consider the more general case in which the added mass terms are wave-frequency-dependent, as considered in hydrodynamic theory, the assumption in the mathematical model that the parameter $b$ remains constant no longer holds. Hence, the developed DNCS can be extended to estimate time-varying parameters, for example by adding another dynamic neural network for this purpose.

Applying the inequality of Lemma 1 to the last term of (A9), we obtain $2\tilde{\zeta}^\top P_2 B \tilde{f}(\zeta, t) \le \tilde{f}^\top(\zeta, t)\,\Delta_3\,\tilde{f}(\zeta, t) + \tilde{\zeta}^\top P_2 B \Delta_3^{-1} B^\top P_2 \tilde{\zeta}$. 
(A10) Notice that, from Assumption 5 and the Rayleigh-Ritz inequality, we obtain the bound (A11). Then, substituting the adaptive law (46) for the weights of the dynamic neural network as well as (A10) and (A11), we obtain for (A9) the following:
$$\dot{V} \le e^\top\left(A_1^\top P_1 + P_1 A_1 + P_1 \Delta_1 P_1\right) e + \tilde{\zeta}^\top\left(A_2^\top P_2 + P_2 A_2 + P_2 B\left(W + \Delta_3^{-1}\right) B^\top P_2 + \cdots\right)\tilde{\zeta} + \cdots \tag{A12}$$
From the definition of $\tilde{\zeta}$ and $e$ given by Equations (27) and (39), respectively, the inequality of Theorem 1, and Assumption 7, we obtain the bound (A13) for $\|\zeta\|^2$. Substituting (A13) in (A12) as well as the adaptive law (47), we obtain
$$\dot{V} \le e^\top\left(A_1^\top P_1 + P_1 A_1 + P_1 \Delta_1 P_1 + 4\lambda_{\max}(\Delta_3) f_1 I_{2\times 2}\right) e + \tilde{\zeta}^\top\left(A_2^\top P_2 + P_2 A_2 + \cdots\right)\tilde{\zeta} + \cdots$$
The term $\delta_b \tilde{b}(t)\hat{b}(t)$ is bounded by means of the inequality of Theorem 1. Considering the terms defined in (45), the temporal derivative of the Lyapunov function is rewritten as
$$\dot{V} \le e^\top\left(A_1^\top P_1 + P_1 A_1 + P_1 R_1 P_1 + Q_{10}\right) e + \tilde{\zeta}^\top\left(A_2^\top P_2 + P_2 A_2 + P_2 R_2 P_2 + Q_{20}\right)\tilde{\zeta} + \cdots \tag{A16}$$
From (A16), we identify the algebraic Riccati equations in (43). We can solve the equations $\mathrm{Ric}(P_1)$ and $\mathrm{Ric}(P_2)$, that is, we can find matrices $0 < P_i = P_i^\top$, $i = 1, 2$, yielding $\mathrm{Ric}(P_1) = 0$, $\mathrm{Ric}(P_2) = 0$ (A17).
The conditions for the existence of $P_1$ and $P_2$ are presented in Theorem 2. From (A17), we obtain for (A16) the inequality (A18). From (A18) and the Rayleigh-Ritz inequality, the temporal derivative of the Lyapunov function can be rewritten as (A19). Finally, we obtain for (A19) the differential inequality (A20), which involves a constant $\alpha$. By using the comparison principle, we can solve the differential inequality (A20) and, in consequence, obtain (A23). In order to obtain the size of the convergence region around the origin for the identification and tracking errors as well as for the weight error $\tilde{W}(t)$ and estimation error $\tilde{b}$ trajectories, we use the Rayleigh-Ritz inequality and the Lyapunov functions (51) and (A23), obtaining that these deviations are ultimately bounded [53] as in (48) and (49).