On AUV Control with the Aid of Position Estimation Algorithms Based on Acoustic Seabed Sensing and DOA Measurements

This article discusses various approaches to the control of autonomous underwater vehicles (AUVs) with the aid of different velocity-position estimation algorithms. Traditionally this field is considered as the area of the extended Kalman filter (EKF) application: It became a universal tool for nonlinear observation models and its use is ubiquitous. Meanwhile, the specific characteristics of underwater navigation, such as an incomplete sets of measurements, constraints on the range metering or even impossibility of range measurements, observations provided by rather specific acoustic beacons, sonar observations, and other features seriously narrow the applicability of common instruments due to a high level of uncertainty and nonlinearity. The AUV navigation system, not being able to rely on a single source of position estimation, has to take into account all available information. This leads to the necessity of various complex estimation and data fusion algorithms, which are the matter of the present article. Here we discuss some approaches to the AUV position estimation such as conditionally minimax nonlinear filtering (CMNF) and unbiased pseudo-measurement filters (UPMFs) in conjunction with velocity estimation based on the seabed profile acoustic sensing. The presented estimation algorithms serve as a basis for a locally optimal AUV motion control algorithm, which is also presented.


Introduction
Navigation of AUVs (autonomous underwater vehicles), being essentially different from the navigation of UAVs (unmanned autonomous vehicles) due to the principal difference between the sources of navigation information, nevertheless shares several characteristic features with the latter. Indeed, in both cases, the INS (inertial navigation system) plays the main role in determining the position-velocity state of the vehicle. Navigation with the aid of INS is usually based on the dynamic model of the vehicle motion with inputs from various sensors of linear and angular velocities and uses filters (usually of Kalman type) for calculation of the position and attitude. However, all types of velocity sensors used in INS are subject to drift, which must be systematically compensated to keep the position estimation accuracy at an acceptable level. In UAV navigation such compensation may be accomplished with the aid of satellite positioning systems (GPS or GLONASS), or in the case of a so-called GPS denied environment, with the aid of specific video survey tools [1]. In an underwater the locally optimal formulation since the stochastic problems with nonlinear observations do not have explicit globally optimal solutions. The model is described in Section 2. In Section 3 the following problem of nominal trajectory is formulated as a control problem and its locally optimal solution is presented. Section 4 contains a position estimation algorithm based on dead reckoning with speed evaluated from the evolution of the "acoustic images"-the seabed distances registered with the aid of acoustic sonar. Section 5 is devoted to the filtering algorithms based on the dynamic model of the vehicle motion and DOA measurements. In Section 6 the performance of the locally optimal control based on the discussed filtering algorithms is evaluated through simulation.

The AUV Navigation System Model
Let the coordinates of the AUV at time instant t k be X k = X(t k ) = (X(t k ), Y(t k ), Z(t k )) T = (X k , Y k , Z k ) T , where (·) T is the transposition operator. The sequence t k , k = 0, . . . , N, t 0 = 0, t N = T is known, and the interval between the consequent time instants is assumed to be the same for the sake of simplicity: t k+1 − t k = ∆t, k = 0, . . . , N − 1. The initial condition X 0 is a random vector with known expectation m 0 and covariance matrix S 0 , the distribution X 0 ∼ P X 0 is assumed unknown. Denoting by P (m, S) the set of all distributions with expectation m and covariance S, we can write P X 0 ∈ P (m 0 , S 0 ).
Let the controlled motion model of the AUV be defined by the following vector equation: where V k is the vector of the AUV speed at the moment t k and W k = (W X k , W Y k , W Z k ) T is a sequence of independent and identically distributed (i.i.d.) random vectors with zero mean and covariance matrix S W independent from the initial condition: W k ∼ P W k ∈ P (0, S W ). The exact distribution of these vectors is assumed unknown.
The AUV speed vector is defined by its absolute value V k = V k and two angles γ k , θ k shown in Figure 1. The angle γ k , is the angle between V k and the horizontal plane x0y, and θ k is the angle between the projection of V k on the horizontal plane x0y and the axis 0x. Thus, the components of the speed vector are defined by V k = V k (cos γ k cos θ k , cos γ k sin θ k , sin γ k ) T and vector model given by Equation (1) can be rewritten in the form of a system of scalar equations: (2) The vector of three parameters u k = (γ k , θ k , V k ) T that define the direction and the absolute value of the speed vector V k = V k (u k ) (and hence the whole dynamics of the AUV), is further called the control vector.

Locally Optimal AUV Path Control
We define the control problem for the AUV as a problem of following some predefined pathX k , k = 0, . . . , N, which is given by the control sequenceů k = (γ k ,θ k ,V k ) T and calculated in the absence of any noise: We first consider the optimal control problem with full information, where the real coordinates of the AUV X k and its speed V k are known at all times t k , k = 0, . . . , N.
The quality of the control at time instant t k is defined by the following criterion: (4) is called locally optimal since at any time instant t k it provides the least deviation from the nominal path on the next step t k+1 .
denote the deviation of the real path of the AUV from the nominal one at the time instant t k . Let alsoV k = (V X k ,V Y k ,V Z k ) T = V k (cosγ k cosθ k , cosγ k sinθ k , sinγ k ) T denote the nominal speed vector and its components. Then the solution to the optimization problem: is given by the following relations: where atan2(y, x) is the angle between the positive direction of axis 0x and the vector (x, y) T .
The proof of Lemma 1 is given in Appendix B.
The relations for the optimal angles and the absolute value have straightforward geometrical interpretation. From Figure 2, it follows that the locally optimal control at the time instant t k determines the AUV movement speed V * k = V * k (cos γ * k cos θ * k , cos γ * k sin θ * k , sin γ * k ) T aiming towards the point of the nominal pathX k+1 and possessing the magnitude V * k , which is necessary to reach that point at the next time instant t k+1 . The last observation provides the justification for a control policy, which is derived from Relations (6) by substituting the real AUV coordinates X k by an unbiased estimateX k : The original optimal control problem for the nonlinear stochastic System (2) with incomplete information is rather hard for analysis. Moreover, it does not have an explicit solution even in the case of the quadratic criterion. In these circumstances, suboptimal control policies like Relations (7) seems reasonable substitution for the unattainable optimum.
Another feature of the optimal control in Relations (6) is that in general, it is unbounded, whereas for the real AUV navigation system the limits on the instantaneous course and speed change (or acceleration constraints) would be natural. Nevertheless, this does not affect the applicability of the estimation methods, which are the primary focus of this study. In presence of such constraints, optimization Problem (5) no longer has a solution in an explicit form, but since it is a well-known quadratic optimization problem with linear (box) constraints, the solution can be calculated using numerous effective numerical methods [28].

Position Estimation with Seabed Sensing
In this section, we present an approach to the position estimation based on dead reckoning with speed evaluated from the evolution of the "acoustic images". Each image consists of the seabed distance measurements L ij k = L(X k , u k , γ i , θ j ) made at the same time instant t k by acoustic sensors aimed at different angles (γ i , θ j ) with respect to the AUV attitude, given by its movement direction u k . The information about the AUV position shift ∆X k+1 = X k+1 − X k , or the speed V k is then derived from the difference between the values of correspondent distance measurements L ij k and L ij k+1 made at the consecutive time instants t k+1 and t k . The way this is done is close to the Lucas-Kanade method for optical flow estimation [15] with the set of acoustic measurements treated as an image and a single distance to the seabed treated as a pixel intensity.
In detail, the model of acoustic measurements is presented in Figure 3. Let (γ i , θ j ), i, j = 1, . . . , M be the set of aiming angles at which the sensors of the AUV emit the acoustic beams and acquire the distance to the seabed. These angles are defined with respect to the AUV attitude, which means that the absolute beam (i, j) direction is given by the angles (γ k + γ i , θ k + θ j ). Thus the coordinates of the point x ij k = (x ij k , y ij k , z ij k ) T , where the beam (i, j) reaches the seabed at the time instant t k are expressed as: where T defines the direction of the beam. It should be noted that this model differs from the one considered in [16,17] since it takes into account the controlled AUV attitude, whether in the mentioned works the beam directions were assumed constant. Note also, that the number of aiming angles γ i and θ j could be assumed different with no effect on the further derivations. Let the seabed profile be defined by the equation ψ(x) = 0, where ψ(·) is some smooth function.
Then considering x as a function of variables X, L, and e, each of which in turn depends on the time t: and calculating the total derivative dψ(x) dt = dψ(x(X(t),L(t),e(t))) dt , we get: where the actual functions' arguments were omitted for the sake of simplicity. Assume that the partial derivatives of the function ψ(·) are known at any point of the seabed which could be reached with an acoustic beam. Rewriting Equation (9) in discrete time with substitution of the differentials with corresponding increments: we have the following equation: which in turn can be expressed with unknowns ∆X k+1 , ∆Y k+1 , ∆Z k+1 collected on the left-hand side: The unknowns ∆X k+1 , ∆Y k+1 , ∆Z k+1 can not be derived from a single equation, but assuming that these shifts are the same for all the acoustic sensors, we have a set of Equation (11) for different Finally it is possible to estimate the unknowns with the least-squares method: Vectorization of Equation (11) with respect to the set of aiming angles (γ i , θ j ), i, j = 1, . . . , M gives: where A k and B k are made of vertically stacked row-vectors and values corresponding to the individual observations: Now the solution to the least-squares optimization Problem (13) can be expressed in the standard form: Finally the AUV position dead reckoning estimate is: It should be noted that Equation (16) requires the values of the partial derivatives of the function where the acoustic beams reach the seabed. Precise calculation of these values even in the case of known seabed profile is not possible since it requires the precise values of the points x ij k , which depend on the unavailable precise AUV position X k due to Equation (8). The way to overcome this issue is to use the acoustic image data to estimate the seabed slope, i.e., the required partial derivatives of ψ(·). In Appendix A, we present a method of estimation, based on the approximation of the slope function ψ(·), which requires only the values of seabed distances L ij k . Another method is proposed in [16]. The following algorithm summarizes the proposed acoustic seabed sensing AUV position estimation method: 2. using the control values on the current (γ k+1 , θ k+1 ) and the previous (γ k , θ k ) steps calculate the increments (∆e X k+1 , ∆e Y k+1 , ∆e Z k+1 ) T according to Equation (10); 3. evaluate the slope estimates using Equation (A1) or another method; 4. using Equations (12) and (14) calculate the matrix A k and the vector B k ; 5. calculate the estimate of the AUV position shift ∆X k+1 with Equation (15) and the position estimateX k+1 with Equation (16).

Position Estimation with DOA Measurements
In this section we discuss various position estimation algorithms based on the dynamic model of the vehicle motion, Equation (1), and the external bearing-only measurements provided by a passive acoustic DOA estimation device (pressure hydrophone or acoustic vector sensor array) [6].
The position of the acoustic source X B = (X B , Y B , Z B ) T is assumed to be known and constant which is the case of a pre-deployed stationary acoustic beacon. It is assumed that at any time instant t k the bearing vector ( Figure 4) is available for observation in the following form: where E k = (ε ϕ k , ε λ k ) T ∼ P E k ∈ P (0, S ε ) is a sequence of i.i.d. random vectors independent from W k and X 0 . The exact distribution of these vectors, as in the case of the noise in the Model (2), is assumed unknown. Systems (2) and (17) can be expressed in the following general vector form: The common way to deal with the nonlinear System (18) is the extended Kalman filter. In this section we present alternative approaches, which in some cases allow to achieve better estimation quality then by direct system linearization. In Section 5.1, we present an observation Equation (17) transformation, allowing to reduce the original problem to a form, where the optimal filtering solution is also available in the form of the Kalman filter. In Section 5.2, we provide the conditionally minimax filtering problem statement and solution for the original nonlinear model of Equation (18) and show how this particular filtering approach allows data fusion from the DOA measurements and the dead reckoning navigation system based on the acoustic seabed sensing from Section 4.

Pseudo Measurements Filter
Rewriting the observations of Equation (17) in the following form: and gathering all the known or measured values at the left-hand side, we get: The right-hand side in the previous expression is linear with respect to the system state X k = (X k , Y k , Z k ) T , but at the same time, it involves the additive noise, whose covariance is now state-dependent. Denote: and rewrite Equation (19) in the vector form: where The introduction of Y k , which is called pseudo-measurements, is a common method of nonlinear systems analysis [21,25]. The idea is based on the fact that linear Kalman filtering estimate is the linear-optimal one in the case of linear observations and system dynamics and on the assumption that the noise Ψ 2 k E k covariance, even being state-dependent, may be evaluated or replaced by an upper bound. In our previous work [26] the filtering algorithm based on the pseudo-measurements has been suggested and its recurrence relations has been derived from the assumption of the unbiasedness of the estimate on the previous step. As applied to the problem at hand these relations are as follows: This filter has the common Kalman structure with items recurrently calculated upon the bearing measurements update. The difference with the classic case is that here the Riccati equation is not solvable a priori, because it involves estimate-dependent terms and current measurement values.

Conditionnaly Minimax Nonlinear Filter
Let the functions Φ k (·, ·), Ψ k (·) and the feedback control u k = u k (Y 0 , . . . , Y k ) be such that the first and second moments of the state X k and observation Y k are finite. Introduce two sets of functions: Basic prediction α k (x, u) and basic correction functions β k (x, y). Then the CMNF estimate is defined by the following recurrent relations: where cov(x, y) is the covariance matrix of two random vectors x, y, A + denotes matrix pseudo-inversion, andX 0 = m 0 .
In the case when the functions α k (·, ·) and β k (·, ·) have first and second order moments for all the random arguments in Equation (22), then the CMNF estimate exists and has the following minimax property. LetX k−1 be the CMNF estimate at the time instant t k−1 . Then the linear functions F * k (ξ) = F k ξ + f k , H * k (ζ) = H k ζ + h k defined by Equation (22) provide the solution for the following minimax optimization problem: where P k ∈ P (E Z k , cov(Z k , Z k ))-the set of all possible distributions of the compound vector Z k = (X k , α k (X k−1 , u k−1 )) T and P k ∈ P (E Z k , cov(Z k , Z k ))-the set of all possible distributions of the compound vector Z k = (X k −X k , β k (X k , Y k )) T . The details on the CMNF approach to the nonlinear stochastic systems state estimation, including the thorough justification of Equation (22) being the solution to Equation (23) and the conditions of the solution in Equation (23) existence, could be found in [22]. Further application of the concept along with the comparative numerical study is the matter of the works [23,24].
The fact that the CMNF estimate, Equation (22), is the solution to the minimax problem, Equation (23), means that at each time instant t k it delivers the minimum for the worst case (with respect to the a priori uncertainty in the distributions P k and P k ) of the mean-square error of the predictionX k and correctionX k . It should be noted that bothX k andX k are unbiased estimates of X k and the quality of these estimates is a priori known: Though the CMNF filter for a nonlinear stochastic system in a rather general form, Equation (18), which indeed covers the AUV navigation problem at hand, is fully defined by Equation (6), there are still two questions which need to be discussed in order to clarify all the aspects of the CMNF application on practice: The functions α(x, u), β(x, y), and the calculation of the covariances in Equation (22).
The basic prediction and correction functions α(x, u), β(x, y) define the structure of the CMNF filter. Their choice is model-specific and it can reflect particular features of the nonlinear functions Φ k (·, ·), Ψ k (·). The common option nevertheless is the prediction "by virtue of the system" and the correction in the form of the residual, which in the case of Systems (2) and (17) with E {W k } = 0 and E {E k } = 0 are as follows: Unlike the EKF and pseudo-measurements filtering, the CMNF approach provides a natural way of data fusion from the INS and external measurements: The basic prediction function α k+1 (X k , u k ) can be chosen in the form of the estimate from the internal navigation System (16). Finally, the CMNF estimate, which is based on the data fusion from the dead reckoning seabed sensing and external bearing-only measurements is defined by Equation (22) with structure functions where the shift ∆X k+1 is defined by Equation (15) and the corresponding algorithm from the Section 4.
The only question left is the covariance matrices, which are necessary for the calculation of the linear estimator coefficients in Equation (22). The general CMNF approach implies that instead of the real covariances one uses their estimates, obtained by means of the Monte-Carlo sampling.

AUV Control Simulation
In this section, we provide the results of the numerical simulation of the described filtering and control algorithms. We compare the results of the simulation in a case when the nominal AUV speed V k in the motion dynamic Equation (2) is known exactly and when it is evaluated by an external estimator-seabed sensing algorithm from Section 4.
The nominal trajectory, i.e., the path, which AUV has to follow, is given by the Equation (3) with γ k = π 100 cos( π 100 t k ),θ k = π 3 cos( π 25 t k ), where k = 0, . . . , N, t 0 = 0, t N = T = 300 and ∆t = t k+1 − t k = 1 second. The absolute value of the AUV speed is constant:V k = 2m/s for all k and the initial condition isX 0 = (0, 0, −10) T (hereinafter all distances are measured in meters). The nominal path of the AUV is shown in Figures 5 and 6 in 3D and component-wise respectively. It should be noted that this particular path is chosen for modeling purposes only: There was no aim to reflect any specific real-world AUV mission, but simply to get the variation of the components in different scale:X k ∼ 100 m,Y k ∼ 10 m,Z k ∼ 1 m.
The sample paths of the system are simulated using the Equation (2), where the disturbances W k are i.i.d. Gaussian vectors with zero mean and covariance S W equal to the identity matrix I 3×3 : W k ∼ N (0, I 3×3 ). The initial condition is supposed to have the same distribution: X 0 ∼ N (0, I 3×3 ).
In this simulation, it is assumed that on the sea surface there is a set of four sources of acoustic signals (beacons) with known coordinates X i B , i = 1, . . . , 4: X 1 B = (500, 100, 0) T , X 2 B = (500, −100, 0) T , X 3 B = (−100, 100, 0) T , X 4 B = (−100, −100, 0) T . These coordinates are chosen so that the situation of the bearing angles φ k , λ k close to ± π 2 would not be possible and there would be no need in the regularization of the observations in Equation (17). The beacons' positions with respect to the AUV nominal path are shown in Figure 5. Note that despite the fact that in all the previous considerations a single beacon was assumed, the transition to the multiple observations case is rather obvious and requires only vertical stacking of correspondent values in the original Equations (17) and (18) and pseudo-measurements of Equations (19) and (20).
The precision of the bearing angles measurements is assumed to be equal to 0.5 • , which in this simulation means that the noise in Equation (17) is Gaussian with the distribution E k = (ε ϕ k , ε λ k ) T ∼ N (0, tan 2 (0.5 • )I 2×2 ).
The control sequence is defined by System (7) with position estimatesX CMNF k ,X UPMF k andX EKF k given by: • CMNF: Conditionally minimax nonlinear Filter (22) with basic structure Functions (24) and covariances calculated by the Monte-Carlo sampling on a separate test set of 10 5 AUV paths; • UPMF: Unbiased pseudo-measurements Filter (21); • EKF: Extended Kalman filter, which provides a common approach to deal with the non-linearity in the system dynamics or observations [18].
The estimates and the correspondent random processes of state evolution (and their samples) are further referenced asX F k and X F k = X k (u * k (X F k )) respectively, where F ∈ {CMNF, UPMF, EKF}. The simulation is done for two cases of the predictionX k+1 used by all three filtering algorithms: • prediction by virtue of System (18): prediction based on the acoustic measurements:X k+1 =X k + ∆X k+1 , with ∆X k+1 defined in Equation (15).
The former takes advantage of the exact values of the AUV speed, while the latter uses an estimate of the AUV position shift.
For the position estimation with seabed sensing simulation, the following setting is used. The direction of the i, k-th measurement beam from Equation (8) where K = 2, the parameters a 0 = 30.0 m, P x = P y = 20.0 are fixed, and the coefficients a l , b l , c l and d l are randomly generated for each trajectory using the standard uniform distribution. In Figure 7, we show sample paths of the system for three mentioned state estimators and the nominal trajectory which they intended to follow. In Figure 8, we show the estimate error mean and standard deviation calculated on the set of 10 5 sample paths. The simulation shows almost equal performance of all three filters. This can be explained by the fact that the chosen acoustic sources positions provide the observations close to linear and in this case all the filters at hand act like the Kalman filter, which is optimal for linear systems.
This, nevertheless, can not serve as a justification for the EKF supreme applicability. In Figure 9, we present the results for the same setting, but with one beacon moved closer to the AUV starting position X 0 , namely X 4 B = (−5, −5, 0) T and for a shorter time period equal to 10 s. In this situation the linearization errors play more significant role and hence the EKF demonstrates divergence, while the other two filters remain stable.  Figure 9. Estimate error sample mean (dotted line) and standard deviation (solid line) for CMNF (red), UPMF (blue), and EKF (green) estimates for the case of the path close to one of the beacons.
In the final simulation we aim to demonstrate how inaccurate description of the system dynamic can dramatically affect the quality of the estimation and hence the control of the UAV. To that end we apply the same three filters used in the previous simulations but now with inaccurate prediction provided by the acoustic measurements. In Figure 10 we show sample paths for the acoustic seabed sensing predictor. The control based on the extended Kalman filter and the pseudo-measurements filter fail to follow the nominal path. In Figure 11, we show the results for the conditionally minimax nonlinear filter based on the data fusion from the dead reckoning seabed sensing and external bearing-only measurements defined by Equation (22) with structure Functions (24). The increase of the sampled standard deviation shown in Figure 11 in comparison with the prediction by virtue of the system case shown in Figure 8 is not very high. It should be noted, that according to the simulation CMNF exhibits a regular bias. This bias is inherited from the acoustic seabed shift estimate which is also known to have a bias conforming to the nominal path [16]. As a final assessment of the estimate/control performance we provide the sample values of the uniform version of Criterion (4): The sampling is performed on the same sets of paths as for the calculation of the sample mean and standard deviation presented in Figures 8 and 11. The uniform performance criterion sample values are presented in Table 1. The first row corresponds to the position prediction by virtue of System (18) and the second row shows the results with prediction obtained by velocity estimation via seabed sensing, Equation (15). The columns correspond to the filtering algorithms applied: The conditionally minimax nonlinear Filter (22) with basic structure Functions (24), the unbiased pseudo-measurements filter, Equation (21), and the extended Kalman filter. Note that for the seabed sensing prediction only the CMNF filter allowed to acheive a reasonable estimate/control performance. The simulation shows that in the ideal situation with good initial accuracy and observation conditions close to that of a linear system, the gain in the estimation quality and hence in the control performance is insignificant in comparison with the standard extended Kalman filter. Nevertheless, the proposed filtering algorithms can demonstrate better qualities in less favorable settings, e.g., when one of the beacons is close to the initial point of the path, the EKF diverges, while the proposed filters remain stable. Another result is that the linear filters are highly sensitive to the dynamic model description, and inaccurate parameters' determination or evaluation can even lead to divergence.

Conclusions
In the present paper a mathematical model for an AUV navigation system based on the locally optimal (predefined path following) control and position estimation provided by seabed acoustic sensing and external DOA measurements. The performance of the proposed algorithms, namely the conditionally minimax nonlinear filter and pseudo-measurements filter, was evaluated by numerical experiments. It was demonstrated, that the CMNF approach provides a natural way for data fusion from different sources of observation by allowing a certain level of liberty in the basic prediction function choice and, at the same time, it allows the errors of the prediction step to be taken into account. The nonlinear character of measurement equations does not permit to evaluate the control quality in advance and needs a detailed analysis of the sensor fusion, which requires detailed modeling in the settings, which take into account the environment features and noise characteristics. We suppose that the methods considered in the present paper permit one to obtain sufficiently reliable approaches to underwater navigation without costly field experiments.  (A1)