Velocity-Free Formation Control and Collision Avoidance for UUVs via RBF: A High-Gain Approach

Abstract: This paper designs an adaptive formation control system for unmanned underwater vehicles (UUVs) in the presence of unmeasurable states and environmental disturbances. To handle the unmeasurable UUV states, a filtered high-gain observer (FHGO) is employed to estimate the states despite measurement noise. Then, an adaptive control scheme is designed to achieve collision avoidance within the UUV formation. A radial basis function (RBF) neural network is used to estimate the unknown disturbance. The stability of the UUV formation with collision avoidance is proven using the Lyapunov theorem. Numerical simulations demonstrate that the proposed filtered high-gain observer successfully estimates the states of the UUVs and that the control law keeps the UUV formation free of collisions with good performance.


Introduction
Due to its high efficiency and wide searching area, formation control for multiple agents has become a hot topic. In the research process, there are a lot of challenges to address, such as nonlinearity in parameters, communication constraints and dynamic environmental disturbance. Various methods have been proposed for the formation control of autonomous underwater vehicle (AUV) groups in a decentralized manner, which is also known as high-level control, examples of which include artificial potential field or methods based on agreement protocol such as leader-follower, virtual structure and behavioral approaches [1][2][3]. In [4], an adaptive neural network formation controller was developed for multiple AUVs with unknown model coupling terms and unknown disturbances. A neuroadaptive sliding mode formation controller was proposed for multiple AUVs with environmental disturbances [5]. Gao proposed a fixed-time sliding control scheme with a disturbance observer to solve the compound disturbance, including both external environment disturbance and parameter uncertainties [6]. Based on the minimal learning parameter algorithm, Lu proposed a robust adaptive formation controller to achieve the formation control of multiple underactuated surface vessels (USVs) [7]. Due to the complicated underwater situation, the research progress of UUV formation is relatively slow, but research on unmanned aerial vehicle (UAV) formations, autonomous ground vehicles and multiagents is advancing. In [8], an adaptive neural network control scheme was investigated to address the formation control problem of multi-USVs with data dropout and time delays [9].
Based on the approximation of a partial differential equation modeling the nonlinear steady state, the characterization of the low-pass filtering of high-frequency measurement noise by high-gain observers has been proved in [10]. High-gain observers are a good choice for dealing with the influence of noise. It is fundamental to recall that the performance of a high-gain observer in the presence of colored measurement noise is mainly characterized
1. An FHGO is designed to accurately estimate the states of UUVs to satisfy the so-called observer matching condition. Unlike [10], the input-to-state stability property of the estimation error is still preserved in the presence of measurement noise.
2. An adaptive neural network formation control scheme is designed for UUV formation with collision avoidance under unknown disturbance. The RBF neural network is used to estimate the unknown disturbance. Studies on collision avoidance for multi-UUV formation are mostly based on the 3-DOF model in the horizontal plane. Here, the state feedback linearization method is used to transform the nonlinear, coupled mathematical model of UUVs into a second-order system model in 5-DOF. The artificial potential field theory is applied to cope with collision avoidance among the UUVs. The form of the potential function is much simpler.
3. Based on the Lyapunov theory, the stability of the formation system is proven. The proposed controller is valid and performs well, as can be seen in the 3-D simulation figures.

Graph Theory
Considering a multi-UUV system consisting of $n$ vehicles, we use graph theory to model the information exchange among UUVs. Let $G = (N, E, A)$ be an undirected graph, which consists of a node set $N = \{n_1, n_2, \ldots, n_n\}$, an edge set $E \subseteq N \times N$ and the adjacency matrix $A = [a_{ij}] \in R^{n \times n}$. The element $a_{ij} = 1$ denotes that node $i$ can receive information from node $j$; otherwise $a_{ij} = 0$, and $a_{ii} = 0$ for all $i$. Moreover, $G$ is undirected if $a_{ij} = a_{ji}$. The collection of connected neighbors of $n_i$ is denoted $N_i$. Define $B = \mathrm{diag}\{b_1, b_2, \ldots, b_n\}$, where $b_i = \sum_{j=1}^{n} a_{ij}$, $i = 1, 2, \ldots, n$, is the in-degree of node $n_i$. Then, the Laplacian matrix $L = [l_{ij}] \in R^{n \times n}$ is defined as $L = B - A$. Define $\vartheta = \mathrm{diag}\{\theta_1, \theta_2, \ldots, \theta_n\}$, which is the communication weight matrix between UUV $i$ and the virtual leader, where $\theta_i = 1$ means that UUV $i$ can receive the information from the virtual leader, and $\theta_i = 0$ means otherwise. Then, we obtain the matrix $\bar{L} = L + \vartheta$, and the eigenvalues of $\bar{L}$ are positive.
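The graph quantities above can be checked numerically. The following is a minimal sketch, assuming a hypothetical ring topology of four UUVs in which only the first vehicle receives information from the virtual leader; the function name `augmented_laplacian` and the topology are illustrative choices, not taken from the paper.

```python
import numpy as np

# Hypothetical undirected ring of 4 UUVs (a_ij = a_ji, a_ii = 0).
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)

# Assumed leader links: theta_i = 1 only for the first UUV.
theta = np.array([1.0, 0.0, 0.0, 0.0])

def augmented_laplacian(A, theta):
    """Return L + diag(theta), where L = B - A and B holds the in-degrees."""
    B = np.diag(A.sum(axis=1))   # in-degree matrix
    L = B - A                    # graph Laplacian (rows sum to zero)
    return L + np.diag(theta)    # leader-augmented matrix

L_bar = augmented_laplacian(A, theta)
eigenvalues = np.linalg.eigvalsh(L_bar)  # symmetric matrix, so eigvalsh applies
print(eigenvalues)
```

For a connected undirected graph with at least one leader link, all printed eigenvalues should be strictly positive, which is the property used in the stability analysis.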

Feedback Linearization of UUV Model
The kinematic and dynamic model of a UUV are described in two coordinate frames, the Earth-fixed frame {E} and the body-fixed frame {B}, as shown in Figure 1. According to the structure of the UUV studied in engineering applications, there is no thruster to control the angular velocity in roll. Meanwhile, rolling has little influence on the translational motion, so the roll speed can be ignored. Then, the kinematic and dynamic equations of UUVs can be described as [24]: where $p_0 = [p_{x0}, p_{y0}, p_{z0}, p_{\theta 0}, p_{\psi 0}]^T$ is the position and attitude vector of the UUV in the Earth-fixed frame, and $v_0 \in R^5$ is the velocity vector of the UUV in the body-fixed frame. $\omega_0 \in R^5$ is an unknown time-varying disturbance due to currents and waves. $\tau_0 \in R^5$ is the control input acting on the UUV in the body-fixed frame. $\Delta(v)$ represents the parameter uncertainty. $\omega_n$ denotes the measurement noise. Additionally, $J(p_0)$ is the transformation matrix. $M_0$, $D(v_0)$ and $C(v_0)$ denote the inertia matrix, damping matrix and the matrix of Coriolis and centrifugal terms, respectively.
The mathematical model of the UUV is nonlinear and strongly coupled. To address this, the feedback linearization method is adopted to simplify the UUV model. The standard double integrator dynamic model can be described as: $\dot{p}_i = v_i$, $\dot{v}_i = \tau_i$. (3) The specific linearization process can be obtained from [25]. In (3), the external disturbances are not considered. Since the disturbance is unknown and nonlinear, no matter how the conversion is performed in the linearization process, the final form of the disturbance is still unknown and nonlinear.
Then, a modified linearization model is proposed in this paper as: $\dot{p}_i = v_i$, $\dot{v}_i = \tau_i + \omega$, where $p_i, v_i \in R^5$ are the position and velocity of UUV $i$ in the UUV formation with $i = 1, 2, \ldots, n$. $\tau_i \in R^5$ is the control input of UUV $i$. $\omega \in R^5$ is an unknown time-varying disturbance.
Due to the complexity of the underwater environment, if a fault occurs in the leader under the leader-follower method, the mission of the UUV formation cannot be completed. To enhance the fault tolerance of the formation control, a virtual leader is introduced and defined as: where $p_l \in R^5$ is the position of the virtual leader, and $v_l \in R^5$ is the velocity of the virtual leader. $g_l(t)$ is a given bounded and time-varying function, ð is a positive constant, and $\|g_l(t)\| < ð$. Define the error variable of UUV $i$ as: where $\varepsilon_i \in R^5$ is the desired relative position between UUV $i$ and the virtual leader. The desired velocity of UUV $i$ is the same as the velocity of the virtual leader. Design $e_i = [e_{pi}^T, e_{vi}^T]^T$. Then, we obtain the error variable of the system as: where $\theta_i = 1$ means that UUV $i$ can receive the information from the virtual leader, and $\theta_i = 0$ means otherwise.

Artificial Potential Field and Virtual Repulsive
The APF method regards each UUV as a high-potential field. If any UUV comes close to its neighbor, the repulsive force repels the UUV away from the other UUV's potential field. There are two advantages to collision avoidance using the APF method. The first is that the individuals of a multi-UUV system can be separated from each other to avoid collisions. The second is that fewer parameters need to be tuned, and the controller design is much simpler than with other collision avoidance methods. In practical engineering, the UUV is a rigid body with volume instead of a particle. We therefore assume that each UUV has the same structure and define the collision sphere and the collision avoidance sphere of a UUV, as shown in Figure 2. The collision avoidance sphere is the black sphere with safe radius $r_s$. The collision sphere is the red sphere with collision radius $r_c$. We define $d_{ij}$ as the distance between UUV $i$ and UUV $j$. Shi [26] concludes that UUV $j$ is a collision avoidance neighbor $N_i^c$ of UUV $i$ when $d_{ij} \le r_s$. When $d_{ij} \le 2r_c$, a collision occurs between UUV $i$ and UUV $j$.
To ensure that no collision occurs among the UUVs, an artificial potential function $\delta_{ij}(d)$ and an action function $\varsigma(d)$ are defined as: where $\beta_i^c > 0$ is a design parameter. When $\beta_i^c$ is large enough and $d \to 2r_c$ or $d \to 0$, the potential function $\delta_{ij}(d)$ will tend to infinity. Thus, the repulsive force for collision avoidance is: where $-\nabla_{p_i}$ denotes the negative gradient along $p_i$, and $\beta_i^c$ is a positive gain parameter.
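The repulsive-force idea can be illustrated with a short sketch. The paper's own potential function is not reproduced here; the code below assumes a hypothetical potential $\delta(d) = \beta\,(1/(d - 2r_c) - 1/(r_s - 2r_c))^2$ for $2r_c < d \le r_s$ and zero outside the avoidance sphere, which shares the stated properties (it grows without bound as $d \to 2r_c$ and vanishes at $d = r_s$). The names `repulsive_force`, `beta` and the numeric radii are illustrative assumptions, and only the three translational coordinates are used.

```python
import numpy as np

def repulsive_force(p_i, p_j, r_c=0.5, r_s=3.0, beta=1.0):
    """Force on UUV i due to UUV j: the negative gradient, along p_i, of the
    assumed potential delta(d) = beta * (1/(d - 2 r_c) - 1/(r_s - 2 r_c))**2."""
    diff = np.asarray(p_i, dtype=float) - np.asarray(p_j, dtype=float)
    d = np.linalg.norm(diff)
    if d > r_s:
        # UUV j is outside the collision avoidance sphere: no repulsion.
        return np.zeros_like(diff)
    if d <= 2.0 * r_c:
        raise ValueError("collision: distance is not larger than 2 * r_c")
    gap = 1.0 / (d - 2.0 * r_c) - 1.0 / (r_s - 2.0 * r_c)
    # d(delta)/dd is negative inside the sphere, so the force points away from j.
    ddelta_dd = -2.0 * beta * gap / (d - 2.0 * r_c) ** 2
    return -ddelta_dd * diff / d

f_ij = repulsive_force([0.0, 0.0, 0.0], [2.0, 0.0, 0.0])
f_ji = repulsive_force([2.0, 0.0, 0.0], [0.0, 0.0, 0.0])
f_far = repulsive_force([0.0, 0.0, 0.0], [5.0, 0.0, 0.0])
print(f_ij, f_ji, f_far)
```

With this choice the forces on the two vehicles are equal and opposite, which matches the symmetry property that the stability proof relies on for undirected graphs.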

RBF Neural Network
The radial basis function neural network (RBFNN) is usually applied to approximate unknown nonlinear functions because of its approximation property [14]. The RBFNN is therefore used to address the unknown disturbance in this article. The RBFNN $W^{*T} H(z)$ can approximate the continuous function of the disturbance $\omega_i$ as: $\omega_i = W^{*T} H(z) + \phi(z)$, where $\phi(z) \in R^5$ is the estimation error, and $W^* = [W_1^*, W_2^*, \ldots, W_m^*] \in R^{m \times 5}$ is the ideal constant weight. $H(z)$ represents the basis function vector described as: where each Gaussian kernel is characterized by its center and by its width $\kappa_i$. Define the error between the ideal weight and the estimated weight as: where $\hat{W}_i$ indicates the estimate of the ideal constant weight vector. Design the adaptive weight update law of the neural network as: where $\alpha_i$ is a designed positive parameter, and $c_i$ is a positive constant.
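A small numerical sketch may help to see how such an approximator behaves. It rests on several assumptions that are not taken from the paper: the Gaussian kernel is written as $\exp(-\|z - \text{center}\|^2 / (2\kappa^2))$, the update law is a generic leakage-type rule $\dot{\hat{W}} = \alpha\,(H(z)\,e^T - c\,\hat{W})$ driven directly by the approximation error $e$ (the paper drives its law with tracking errors instead), and `disturbance` is a made-up test signal.

```python
import numpy as np

def gaussian_basis(z, centers, width):
    """H(z): one Gaussian kernel per center (assumed kernel form)."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    dist2 = np.sum((centers - z) ** 2, axis=1)
    return np.exp(-dist2 / (2.0 * width ** 2))

def adapt_weights(W_hat, H, err, alpha=20.0, c=1e-3, dt=0.01):
    """One Euler step of the assumed law dW/dt = alpha * (H err^T - c * W_hat)."""
    return W_hat + dt * alpha * (np.outer(H, err) - c * W_hat)

def disturbance(z):
    """Hypothetical 'unknown' disturbance in R^5, used only to drive the demo."""
    return np.array([1.0, -0.5, 0.3, 0.8, -0.2]) * np.sin(z)

centers = np.linspace(-2.0, 2.0, 9).reshape(-1, 1)  # m = 9 kernels, scalar input z
width = 0.7
grid = np.linspace(-1.0, 1.0, 50)

def mean_error(W):
    """Average approximation error norm over the evaluation grid."""
    return np.mean([np.linalg.norm(disturbance(z) - W.T @ gaussian_basis(z, centers, width))
                    for z in grid])

W_hat = np.zeros((9, 5))
err_before = mean_error(W_hat)
for _ in range(40):                       # repeated sweeps over the input range
    for z in grid:
        H = gaussian_basis(z, centers, width)
        W_hat = adapt_weights(W_hat, H, disturbance(z) - W_hat.T @ H)
err_after = mean_error(W_hat)
print(err_before, err_after)
```

The leakage term `c * W_hat` plays the role of the positive constant in the update law: it keeps the estimated weights bounded at the cost of a small residual approximation error.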

Control Objective
The goal of this paper is to design an adaptive controller for a multi-UUV system with collision avoidance. The control problems can be formally stated by the following objectives: where $d_{ij}(t)$ denotes the relative position variable between UUV $i$ and its collision avoidance neighbor UUV $j$, and $\|\cdot\|$ is the Euclidean norm.
Assumption 1. The disturbance is time-varying and bounded, i.e., there exists a positive constant $\sigma_\omega$ such that $\|\omega\| < \sigma_\omega$.
Assumption 2. The velocities of the UUVs and the virtual leader are not zero, i.e., $v_i \neq 0$ and $v_l \neq 0$. According to the characteristics of the UUV, the velocities are bounded, i.e., $\|v_i\| < \sigma_{v_i}$ and $\|v_l\| < \sigma_{v_l}$, where $\sigma_{v_i}$ and $\sigma_{v_l}$ are positive constants.
Assumption 3. At least one UUV can receive the information from the virtual leader, i.e., $\vartheta \neq 0$, where $\vartheta$ is the communication matrix between the virtual leader and the followers. The communication graph among the UUVs is connected, i.e., $A \neq 0$ and $L \neq 0$.
Assumption 4. The initial errors in relative position and relative velocity of any two UUVs are bounded, i.e., $\|e_{pi} - e_{pj}\| < \alpha_p$ and $\|e_{vi} - e_{vj}\| < \alpha_v$, where $\alpha_p$ and $\alpha_v$ are positive finite values.
Lemma 2 [26]. Let $V(t) > 0$ be a continuous function for all time with a bounded initial value $V(0)$. If the inequality $\dot{V}(t) \le -\gamma V(t) + \Gamma$ holds with $\gamma > 0$, $\Gamma > 0$, then we have the following inequality:
Lemma 3 [27]. Let $S(t) > 0$ be a continuous function for all time with a bounded initial value $S(0)$. If the inequality $\dot{S}(t) > \lambda S(t)$ holds for $t > 0$, $\lambda > 0$, then we have the following inequality:
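The role of Lemma 2 in the later proof can be illustrated numerically. The sketch below assumes the usual comparison-type reading of the lemma, namely that $\dot{V} \le -\gamma V + \Gamma$ implies $V(t) \le (V(0) - \Gamma/\gamma)e^{-\gamma t} + \Gamma/\gamma$; this concluding bound is the standard one and is stated here as an assumption. The values of `gamma`, `Gamma` and `V0`, and the extra dissipation term, are arbitrary illustrative choices.

```python
import numpy as np

def lyapunov_bound(t, V0, gamma, Gamma):
    """Assumed bound: V(t) <= (V0 - Gamma/gamma) * exp(-gamma t) + Gamma/gamma."""
    return (V0 - Gamma / gamma) * np.exp(-gamma * t) + Gamma / gamma

gamma, Gamma, V0, dt = 0.8, 0.2, 5.0, 1e-3
t = np.arange(0.0, 10.0, dt)
V = np.empty_like(t)
V[0] = V0
for k in range(len(t) - 1):
    # A trajectory that satisfies dV/dt <= -gamma V + Gamma thanks to an
    # additional nonnegative dissipation term 0.1 * sin(t)^2 * V.
    dV = -gamma * V[k] + Gamma - 0.1 * np.sin(t[k]) ** 2 * V[k]
    V[k + 1] = V[k] + dt * dV

bound = lyapunov_bound(t, V0, gamma, Gamma)
print(V[-1], bound[-1], Gamma / gamma)
```

The simulated trajectory stays below the bound and settles inside a ball of radius close to $\Gamma/\gamma$, which is the sense in which the closed-loop errors are uniformly ultimately bounded.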

Filtered High-Gain Observer Design
There are some conditions, such as guidance failure or equipment faults, under which the states of the UUV cannot be obtained. To achieve state estimation for the UUV formation, a filtered high-gain observer (FHGO) is designed in this subsection for multi-UUV systems with measurement noise. The main feature of the FHGO lies in its filtering capabilities, which allow relatively smooth estimates to be obtained in the presence of noisy output measurements.
Let the UUV model be transformed into regular form: The high-gain observer design will be performed according to the following assumption.
Assumption 5. The states p i and v i are bounded. In addition, the noise ω n is essentially a bounded function.
For comparison purposes, one recalls the equations of the standard high-gain observer (SHGO) proposed in [28]: where $h_1$ and $h_2$ denote the states of the HGO, $\varepsilon$ is a sufficiently small positive constant, and $\partial_1$ and $\partial_2$ are chosen such that the roots of $s^2 + \partial_1 s + \partial_2$ have negative real parts. Inspired by Khalil [11], to improve the sensitivity properties of the observer with respect to measurement noise, an FHGO for UUV formation can be designed as follows: where $T_f$ is the time constant, and $A_f$, $B_f$ and $C_f$ are gain matrices such that
Proof of Theorem 1. The following auxiliary variables are defined: Obviously, according to the above definition, $C_f q_f$ is the filtered signal of the measurement noise $\omega_n$, which is denoted by $\omega_{nf}$. Thus, we have $z_f = p_f + q_f$. Define the estimation error as: According to the definition of $z_f$, we can obtain the detailed expansion of $C_f z_f - h_1$: Substituting Equation (24) into the error dynamics (23), we obtain: Now, consider the following change of coordinates: where $\sigma_2 = \dot{\sigma}_1$. Taking the derivative of (24).
Then, it can be written in the compact form as: Consider the Lyapunov function $V$ as: where $A_f^T P_f + P_f A_f = -I$. Taking the derivative of the Lyapunov function and inserting (26) leads to: By the definition of $p_f$ in (20), we have: Then, by the negative definiteness of $A_f$ and Assumption 1, it can be shown that $p_f$ is ultimately bounded by $O(T_f)$. Furthermore, we can deduce that $C_f P_f$ is bounded by some constant $b_n$, and after some expansions and simplifications, we arrive at: It can be verified that: In the light of $V_O = \varphi^T P \varphi$, some positive constants $r_1$, $r_2$, $r_3$ must exist such that: Recall that: Obviously, if $\sigma_1$ and $\sigma_2$ are bounded, then the uniform ultimate boundedness of the tracking error can be obtained. The derivatives of $\sigma_1$ and $\sigma_2$ can be derived as: Then, by Assumption 1, it can be shown that $\sigma_1$ and $\sigma_2$ are ultimately bounded by $O(T_f)$. After some computations, we obtain: where $r_4 = \|C_f\|$. Then, by choosing a constant $\beta = \max\{2\beta_1, 2\beta_2\}$ and $\mu = \max\{\ldots, O(T_f)\}$, where $\beta_1 = \max\{r_1 b_n, r_2, 2r_4\}$ and $\beta_2 = \max\{r_1 b_n, r_2, 2r_4\}$, a time $T(\mu)$ must exist such that $\|\eta\| \le \mu\beta$ for $t \ge t_0 + T(\mu)$. This completes the proof.
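To make the observer structure concrete, the following sketch simulates a textbook high-gain observer for a single double-integrator channel. It assumes the common form $\dot{h}_1 = h_2 + (\partial_1/\varepsilon)(y - h_1)$, $\dot{h}_2 = \tau + (\partial_2/\varepsilon^2)(y - h_1)$, which may differ in detail from the SHGO and FHGO equations of the paper. The test trajectory $p = \sin t$, the noise-free measurement, and the name `simulate_hgo` are illustrative assumptions; the values $\varepsilon = 0.05$, $\partial_1 = 4$, $\partial_2 = 4$ are the ones quoted for the simulation.

```python
import numpy as np

def simulate_hgo(eps=0.05, d1=4.0, d2=4.0, dt=1e-3, t_end=3.0):
    """Euler simulation of an assumed standard high-gain observer.

    Returns the velocity estimation error v - h2 at every time step.
    """
    n = int(round(t_end / dt))
    h1, h2 = 0.0, 0.0                      # observer states start at zero
    err_v = np.empty(n)
    for k in range(n):
        t = k * dt
        # Known test trajectory of the plant p'' = tau.
        p, v, tau = np.sin(t), np.cos(t), -np.sin(t)
        y = p                              # noise-free position measurement
        err_v[k] = v - h2
        innovation = y - h1
        h1, h2 = (h1 + dt * (h2 + (d1 / eps) * innovation),
                  h2 + dt * (tau + (d2 / eps ** 2) * innovation))
    return err_v

err_v = simulate_hgo()
print(abs(err_v[0]), np.max(np.abs(err_v[-1000:])))
```

The gains scale with $1/\varepsilon$ and $1/\varepsilon^2$, so any noise added to `y` would be amplified in the velocity estimate; this is the sensitivity that a low-pass filtering stage in front of the observer is meant to reduce.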

Remark 1.
It should be noted that when the measurement noise bound $\|\omega_n\|$ is small enough, $\mu$ can always be made arbitrarily small by selecting an appropriate time constant $T_f$ and observer parameter $\varepsilon$. Therefore, the estimation errors of the observer are guaranteed to be bounded. Moreover, according to Theorem 1, accurate estimates of the position vector $p$ and velocity vector $v$ of the UUV can be obtained through the appropriate coordinate transformation of the state of the FHGO: where $\gamma = \mathrm{diag}(I_3, I_3)$. The FHGO designed in this subsection will be used to accurately estimate the states of multi-UUV systems in the following sections. Define the error of the whole system with the state observer as:

Adaptive Neural Network Formation Control with Collision Avoidance under Unknown Disturbances
The formation control law of the multi-UUVs is defined as: Combining (9) with (40), the formation controller based on a high-gain observer with collision avoidance under unknown disturbances is designed as:
Proof of the Stability of the System. Define the Lyapunov function as follows: where $\otimes$ represents the Kronecker product. Define the derivatives of the error variable as follows:
Taking the time derivative of the Lyapunov function, we obtain: where $Q = \begin{bmatrix} 2L & L \\ L & L \end{bmatrix}$. Substituting (9), (13) and (40), it becomes: For an undirected graph, the repulsive forces between UUVs are equal and opposite in direction [29]. Therefore, based on Young's inequality: where each coefficient indexed by $i, j$ is a positive constant, and $j = 1, 2, \ldots, 7$. Then, we can obtain: where the terms involving $\alpha_i \|W^*\|^2$ are constants. Thus, (47) can be written as:
Then, we can obtain:
Four UUVs are used in the simulation experiment, and UUV 0 is the virtual leader. Before showing the simulation results, some parameters used in the system are listed in Table 1.
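The virtual-leader formation idea and the four-UUV setup can be sketched with a simplified simulation. The control law below is a generic consensus-type law for double integrators, $\tau = -(L + \vartheta)(k_1 e_p + k_2 e_v)$, and is only an assumption standing in for the paper's controller: it omits the observer, the RBF disturbance compensation and the repulsive term. The ring topology, gains, offsets and initial positions are hypothetical, and the virtual leader moves at constant velocity.

```python
import numpy as np

# Hypothetical ring graph among 4 UUVs; only the first UUV hears the virtual leader.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
theta = np.array([1.0, 0.0, 0.0, 0.0])
L_bar = np.diag(A.sum(axis=1)) - A + np.diag(theta)

def simulate_formation(k1=1.0, k2=2.0, dt=0.01, t_end=60.0):
    """Euler simulation of double-integrator UUVs tracking a virtual leader."""
    offsets = np.array([[2, 2, 0], [2, -2, 0], [-2, -2, 0], [-2, 2, 0]], dtype=float)
    p = np.array([[0, 5, -1], [4, 0, 1], [-3, -4, 0], [-5, 3, 2]], dtype=float)
    v = np.zeros((4, 3))
    p_l = np.zeros(3)                       # virtual leader position
    v_l = np.array([0.5, 0.0, 0.0])         # constant leader velocity (zero acceleration)
    for _ in range(int(round(t_end / dt))):
        e_p = p - p_l - offsets             # position errors w.r.t. desired offsets
        e_v = v - v_l                       # velocity errors
        tau = -L_bar @ (k1 * e_p + k2 * e_v)  # uses only neighbor and leader links
        p, v = p + dt * v, v + dt * tau
        p_l = p_l + dt * v_l
    return p - p_l - offsets, v - v_l

e_p, e_v = simulate_formation()
print(np.max(np.abs(e_p)), np.max(np.abs(e_v)))
```

Because the matrix `L_bar` has only positive eigenvalues for this topology, each error mode obeys a stable second-order equation, and both printed error norms should be close to zero at the end of the run.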

State Observer Simulation Result
To illustrate the noise resistance of the proposed FHGO, a simulation comparison is presented by applying the SHGO (20) in [29]. The design parameters are taken as $\varepsilon = 0.05$, $\partial_1 = 4$, $\partial_2 = 4$. The low-pass filter is taken as

Collision Avoidance Simulation Result
To show the performance of the collision avoidance controller, we used the RBF neural network to estimate the unknown disturbance. Figure 6 shows that the UUVs follow the desired trajectory well and maintain a good formation. The distance between any two UUVs shows more clearly whether a collision occurs. The $2r_c$ line is drawn at the bottom of Figure 7; if the distance $d_{ij} < 2r_c$, a collision occurs. Figure 7 shows the distance between any two UUVs of the formation with collision avoidance. No curve falls below the $2r_c$ line, so the distance between UUVs never drops below $2r_c$. With the collision avoidance control, the formation performance can be guaranteed.
The position error and velocity error performance are validated as shown in Figure 8. All the errors converge to zero in finite time, which means that the controller works satisfactorily and the system is stable. Simulation results show that the four UUVs can track the virtual leader well without collision and maintain a good formation.

Conclusions
This paper discussed adaptive neural network formation control for multi-UUVs with collision avoidance under unknown disturbance. Graph theory is utilized to model the communications between UUVs. Feedback linearization is used to simplify the 5-DOF mathematical model. The overall control law is composed of the formation controller and the collision avoidance controller. When the states are unmeasurable, an FHGO is designed to accurately estimate the states of the UUVs. The RBF neural network is adopted to approximate the nonlinear dynamic disturbance term. By integrating the artificial potential field method into the virtual leader-following formation strategy, the problem of collisions among UUVs is solved. The system stability is proven using the Lyapunov theory. The results show that the observer proposed in this paper performs better than the SHGO in the presence of measurement noise. Simulation results on formation control with collision avoidance demonstrate that the controller designed in this paper is valid and performs well.