Neural Network-Based Self-Tuning PID Control for Underwater Vehicles

For decades, PID (Proportional + Integral + Derivative)-like controllers have been successfully used in academia and industry for many kinds of plants. This is thanks to its simplicity and suitable performance in linear or linearized plants, and under certain conditions, in nonlinear ones. A number of PID controller gains tuning approaches have been proposed in the literature in the last decades; most of them off-line techniques. However, in those cases wherein plants are subject to continuous parametric changes or external disturbances, online gains tuning is a desirable choice. This is the case of modular underwater ROVs (Remotely Operated Vehicles) where parameters (weight, buoyancy, added mass, among others) change according to the tool it is fitted with. In practice, some amount of time is dedicated to tune the PID gains of a ROV. Once the best set of gains has been achieved the ROV is ready to work. However, when the vehicle changes its tool or it is subject to ocean currents, its performance deteriorates since the fixed set of gains is no longer valid for the new conditions. Thus, an online PID gains tuning algorithm should be implemented to overcome this problem. In this paper, an auto-tune PID-like controller based on Neural Networks (NN) is proposed. The NN plays the role of automatically estimating the suitable set of PID gains that achieves stability of the system. The NN adjusts online the controller gains that attain the smaller position tracking error. Simulation results are given considering an underactuated 6 DOF (degrees of freedom) underwater ROV. Real time experiments on an underactuated mini ROV are conducted to show the effectiveness of the proposed scheme.


Introduction
Underwater Remotely Operated Vehicles (ROVs) have been widely used in many subsea tasks, ranging from inspection to repair of underwater structures related mainly to the power and oil industry. Very often, according to the task, the ROV is required to continuously change its operating tool and/or to pick up and release loads causing a change in behavior. That results as an inherent change in its weight, buoyancy and hydrodynamic forces; and as a consequence, a decrease in the position tracking performance. In addition, ROVs have to deal with the highly dynamical underwater environment represented in the form of ocean currents and waves in shallow water. With this in mind, when the dynamic characteristics of the system are time dependent or the operating conditions of the system vary, it is necessary to re-tune the gains to obtain the desired performance, resulting in time consumption. In this paper, a self-tuning algorithm based on Neural Networks (NN) is proposed to automatically tune the gains of a PID (Proportional + Integral + Derivative) controller. The optimal set of gains is computed online with less computation effort by using desired and actual state variables. The self-tuning mechanism will avoid time consuming manual tuning of the PID controller and promises better results by providing PID controller settings as the system dynamics or operating points change.
With this in mind, a mix of control and a smart system might offer an accurate tune of the control gains online. Even when the state of art yields different tuning techniques, it is common to find controls poorly tuned such that their performance is limited. Intelligent control techniques include fuzzy control, neural networks or a mix of them; they have been widely used to control underwater and nonlinear systems, such as in [1][2][3], and have become an accurate option, though these algorithms do require long periods of training and tuning.
Control schemes vary from tracking to dynamic positioning [4,5] where their main target is to estimate and compensate for the unknown forces of changing environments. Research [6][7][8][9][10][11][12][13][14] present systems with a mix of neural networks and fuzzy control in which the training and rules of behavior are based on the desired states. Their performance is described as accurate when uncertainty and perturbations take place while performing a trajectory. Although the training periods are extremely long, there are also combinations of PID controls and a smart system aimed to auto-tune the gains of different systems such as: sub-aquatic [15,16], non linear [17][18][19][20][21][22], and others: [23][24][25][26].
In this paper, an auto-tune PID-like controller based on an online Neural Networks (NN) is implemented on Remotely Operated Vehicles (ROVs); for trajectory tracking with unknown disturbances. Simulation results are given considering the non-linear hydrodynamics of ROV Kaxan; including disturbances of ocean currents. Real time experiments on an underactuated mini ROV are conducted to show the effectiveness of the proposed scheme. For the remaining sections of this paper in Section 2 the general system model of 6 DOF underwater vehicles is presented, Section 3 includes the effect of ocean currents, Section 4 presents the Self-tuning Neural Network for PID Control, Section 5 describes the simulation results, and the experimental results are presented in Section 6; Finally in Section 7 the concluding remarks are provided.

General System Model of 6 DOF Underwater Vehicles
In [27], the nonlinear model of a 6 DOF to build the mathematical model that represents the underwater vehicle dynamics two reference frames were used; one referenced to earth (called the Earth-fixed frame) and another referenced to the vehicle (called the body-fixed frame), Figure 1.

Kinematic Model
The general velocity vector is represented as: where u, v and w are components of the linear velocity in surge, sway and heave directions, respectively, and p, q and r are components of the angular velocity in roll, pitch and yaw, respectively. The position vector η 1 ∈ R 3 and orientation vector η 2 ∈ R 3 coordinates expressed in the Earth-fixed frame are: where x, y and z represent the Cartesian position in the Earth-fixed frame and φ represents the roll angle, θ the pitch angle and ψ the yaw angle.
The relationship between velocities on the fixed and Equations are [27,28].
where J 1 (η 2 ) ∈ R 3×3 is the rotation matrix which expresses the transformation from body to fixed frame, and J 2 (η 2 ) ∈ R 3×3 is another transformation matrix that relates the angular velocity ν 2 ∈ R 3 with the time derivative of η 2 ∈ R 3 .

Hydrodynamic Model
Equations of motion expressed on the Equation [27], where ν ∈ R n and η ∈ R n were previously defined, M ∈ R n×n denotes the inertial matrix (including the added mass), C ∈ R n×n is the Coriolis matrix and centripetal forces (including the effects of added mass), D ∈ R n×n refers to the damping matrix, G ∈ R n represents the vector of gravitational forces, and τ ∈ R n is the input control vector.

Ocean Currents
Ocean current is generated by wind, tides, variation of densities and re-circulation of water, among others. The main objective of this work is not to generate a detailed report of this phenomena; nevertheless, it is appropriate to highlight the model of induced ocean currents proposed by Fossen [27]. In the mentioned work, the equations of motion are represented in terms of relative velocity of the vehicle and the currents, where ν CI = u c v c w c 0 0 0 T is a non-rotational vector of the current velocity according to Equation (3). Note that the linear velocity on the fixed frame can be transformed to linear velocity in the equation by applying the elemental rotation matrices. Let u E c v E c w E c be the current velocity referenced to the Earth-fixed frame. Then the components of the linear velocity on the equation are calculated as follows, Suppose the current velocity in the equation as constant or at least with a minimum variation, so that:ν Then, the relative equations of motion become: Now, the current velocity in the Earth-fixed frame u E c v E c w E c can be related to the mean velocity of the current V C through two angles: α (angle of attack), β (sideslip angle), describing the orientation of ν CI around the axes y and z respectively as follows: where V c is the average currents velocity in the earth-fixed reference frame.

Self-Tuning Neural Network for PID Control
The tuning of PID (Proportional + Integral + Derivative) controllers depends on adjusting its parameters (i.e., K p ; K i ; K d ), so that the performance of the system under control becomes robust and accurate according to the established performance criteria. The proposed auto-tuning algorithm is based on NN which exhibit the following characteristics: 1. Parallelism and generalization. A NN are able to produce useful outputs for inputs not provided under the learning phase. 2. Non-linearity. A NN can be linear or not allowing it to represent systems generated by nonlinear guidelines. 3. Adaptability. NN are capable of re-adjusting weights and adapting to new environmental situations. This is specially useful when the system offers non-stationary data, that is, the properties involved by the system vary over time. 4. Fault tolerance. When an operational failure occurs on a local part of the network, it lightly affects the global performance.
This property is because of the distributive nature of stored data processed along the neural network.
Consistent with above, this work is based on a backpropagation neural network, which also meets the desired characteristics to accomplish the goal tasks. Recurrent networks with supervised learning structured with delay are widely used in underwater vehicles as mentioned in [17,18], as well as for linear systems with large uncertainties in their surrounding environment as shown in [5,[29][30][31].

Control Law
In the discrete time domain, the digital PID algorithm can be expressed as follows [17]: where τ(n) is the original control signal, e(n) = η d − η represents the position tracking error, η d denotes the desired trajectory, K p is the proportional gain, K i the integral gain, K d the derivative gain, and n the sample time.
A block diagram of the auto-tuning control with artificial neural network (NN) is shown in Figure 2.

Algorithm Auto-Tuner
The algorithm used as auto-tuning is the backpropagation method, chosen for its ability to adapt to changing environments. Operation begins applying the inputs to the network (see Figure 3), this is propagated from the first layer to the hidden layers in, up to produce an output (K p , K i and K d ).
The output signal is compared to the desired output and an error signal is calculated for each of the outputs, this is shown in Figure 2. The error outputs backpropagate, starting from the output layer, to all neurons in the hidden layer that contribute directly to the output; however, the hidden layer neurons receive only a fraction of the total error signal. This process repeats iteratively, layer by layer, until all neurons in the network has received an error signal describing its relative contribution to the total error. Figure 3 presents the topology of the NN used to auto-tune the PID control gains implemented on the ROV. Its structure shows seven neurons on the input layer, three neurons on the hidden layer, and finally another three neurons on the output layer. The neurons placed on the output layer correspond to the PID gains: K p , K i , K d . where u(n) and u(n − 1) are reference inputs (desired trajectory), y(n) and y(n − 1) are reference outputs (real trajectory), C(n) and C(n − 1) correspond to the control signals, w ji are the weights of the hidden layer, and v ji are the weights of the output layer.
The back-propagation algorithm looks for the minimum of the error function in weight space using the method of gradient descent [3]. The combination of weights which minimizes the error function is considered to be a solution of the learning problem. The activation functions for back-propagation networks is the sigmoid, a real function s c : R → (0, 1) defined by the expression The output of the j hidden layer neuron may be calculated by means of: The shape of the sigmoid changes according to the value of h j . At the same time, the output layer neuron value will be: where: The criteria used to minimize the error correspond to Rojas et al. [32], as: where e y (k) = y r (n) − y(n).
The minimization procedure consists, as it is known, in a movement in the negative gradient direction of the function E(n) with respect to the weighting coefficients v ji and w ji .
The E(n) gradient is a multi-dimensional vector [3] whose components are the partial derivatives The weighting coefficients of the input layer are The weighting coefficient of the hidden layer are Using Equations (16) and (17), the adjustments of weighting coefficients v ji (Equtation (18)), w ji (Equtation (19)) can be made by means of the expressions: v ji (n + 1) = v ji (n) + (a ∂e y ∂e u )δ 1 h j (18) w ji (n + 1) = w ji (n) + (a ∂e y ∂e u where a is the learning coefficient, w ji (n + 1) is a vector of weights for the hidden layer, v ji (n + 1) is the vector of weights of the output layer and equivalent gain ∂e y ∂e u is unknown.

Simulation Results
The auto-tuned PID was evaluated using Matlab/Simulink software. The ODE 45 with a variable step was used, setting the maximum sample step as 0.01 s. The first proposed task consists of moving the robot in a straight line from its start position to a set point, letting x, y, z remain constant while ψ is varying. The next task is that the robot begins rising in a spiral motion, perturbed by water currents of considerable intensity. The first perturbation takes place in the first 20 s and its magnitude is V c = 1.1 m/s with orientation of α = 0 and β = 0. The second perturbation goes from time 20 to 45 s with a magnitude of V c = 1.1 m/s and an orientation of α = 0 and β = π/2, as set in Equation (9).

Underactuated 6 DOF ROV Kaxan
The Kaxan robot hydrodynamic parameters are included [33]. The behaviour can be observed in Figures 4-8. Figure 4 depicts the trajectory in 3D that the Kaxan robot follows.

PID vs. Auto-Tuned PID
In order to compare the conventional PID vs. the auto-tuned PID, a statistical indicator was implemented, allowing to determine which one has the best behavior following a trajectory. The mean square error (MSE) lets us estimate the performance of every control by analyzing the error generated in the trajectory tracking. (20) where MSE x is the MSE in x, MSE y is the MSE in y, and so on. As mentioned previously, MSE was used to evaluate the tracking performance. Figure 13 shows the evaluation of the experiment mentioned above, which considers the experiment under initial conditions (without water currents) in a 0 s to 45 s time frame in which, afterwards the ocean current appears from time 45 s to 90 s. Finally, at time 90 s the ocean current stops, with discrete time for the PID control being implemented throughout. Figure 14 demonstrates the same experiment with the auto-tuned PID control.  Figure 13 describes the increase of MSE of the conventional PID when perturbation occurs (red bar), while green and blue bars, corresponding to the absence of perturbations, remain steady. Moreover, the MSE of the auto-tuned PID [14] is around a 50% less than the one of the conventional PID, due to the self-tuning algorithm with a NN. Additionally, when perturbation happens, the increase and decrease of the auto-tune MSE is minimum. For this reasons, it is feasible to conclude that the auto-tuned PID has better performance facing changes on the hydrodynamic parameters and perturbations of the surrounding environment.

Experimental Set up
In this section, the experimental set up as well as the results are discussed. Two sets of experiments are presented, one considering the conventional PID controller and the other one considering the auto-tuned PID proposed in this paper. Both controllers were tested under the same conditions in order to evaluate their performance under disturbances. A comparative analysis in terms of position tracking and energy consumption is given.
The proposed intelligent control was implemented on an underactuated mini-ROV. This vehicle is a ROV developed in CIDESI named Nu'ukul Ja (which in the Mayan language means "water instrument"). Its dimensions are: 50 cm long, 30 cm wide, and 30 cm height; as shown un Figure 15. It has a cylindrical pressure chamber of 15 cm diameter where the major part of the electronic architecture is placed. The total weight of the ROV is 10 kg. According to the experimental environment, it was placed in a pool of 2.5 m where both PID controllers (auto-tuned and conventional) were implemented. The electronic architecture of the ROV (Figure 16) consist on three groups: instrumentation, signal and data acquisition, and actuators. The instrumentation involves: pressure sensor, leakage sensors, AHRS (Attitude and Heading Reference System), voltage and current sensors. While the signal and data acquisition implicates a micro controller embedded in a development board. Finally, the actuators consist on 4 thrusters used to provide direction and displacement to the vehicle, and an IP camera for inspection missions.
The ROV is connected to the surface by navel string of thirteen wires. Where eight of them are used to receive video from the IP camera, three for the power connections (12 V DC , 20 V DC and Ground), and two for data UART transmission (TX)/reception (RX).

Instrumentation
This ROV has a MS5803 − 05BA pressure sensor which is placed outside the chamber of the submarine, Figure 17. The sensor is a high resolution barometer which obtain data of the surrounding hydrostatic pressure, acquiring frequencies up to 50 kHz by I2C protocol. Once the hydrostatic pressure is obtained the depth level is calculated by: h = P−P 0 ρg , where h = depth (m), P = hydrostatic pressure (bar), P 0 = atmospheric pressure (bar), ρ = water density (kg/m 3 ). In order to sense 3 DOF's of the ROV (pitch, yaw and roll), the AHRS U M7 is a CHRobotics device is used. The AHRS sends NMEA serial packages with a frequency up to 100 Hz. To prevent malfunction of the electronics due to water presence, printed electrodes connected to the controller represent the leakage sensors. Also, 4282 voltage sensor (5 to 1 V DC divider) offers an analogical signal of the batteries voltage. Pololu s AC715, a Hall Effect current sensor, allows to monitor the operation of the thrusters.

Signal and Data Acquisition
The ROV has a SAM3X8E ARM Cortex embedded in the Arduino Due's developing board. It has 54 general purpose inputs and outputs, 12 of them PWMs, 12 analog inputs, 4 UART ports and one I2C bus. This controller is used to manage communication between the user and the mini-ROV.

Actuators
As mentioned above, to displace and lead the submarine in one plane, two brushed SeaBotix BTD150 thrusters are placed horizontally on each side of the underwater robot. These thrusters are powered by 20 V DC @ 4 A. To dive the submarine, 2 brushed thrusters were placed vertically on each side of the ROV (beside the lateral thrusters, Figure 18); these are basically modified fuel pumps with 4 cm propellers attached to their shaft. Each brushed motor consume 12 V DC @1.5 A .

Results
The controls were evaluated by performing a data capture of 3 m, once the ROV was placed 1m underwater. In the first minute non disturbance took place. After this time, the weight was increased by 400 g ( Figure 19) until a two minute mark. The gains of the conventional PID control were obtained by means of the NN. The ROV was requested to get the set point of 1 m depth by using the Auto-tuned PID controller. Once the ROV reached the stability and the PID gains, computed by the NN, became stationary, these gains were programmed into the conventional PID as Kp, Kd, and Ki. This the way the conventional PID gains were tuned. It is important to remark that once the conventional PID was tuned, the gains remained constant along the experiment, even when the disturbances took place, unlike the Auto-tuned PID controller wherein gains were dynamically changing to attain the better performance along the experiment.
Finally, in the last minute of data capture the weight was removed. Figure 20 shows the desired trajectory (Solid line) vs. the real path (dotted line), also the control signals for thrusters F1 and F2 are displayed. Apparently, the control signal given by the auto-tuned PID (shown in Figure 20), seems to be more active than the conventional PID's signal; though, the root mean square (RMS) value of each one (the complete experiment), shows that the RMS of the conventional PID is 6.8874 whereas in the auto-tuned PID is 6.6781. The auto-tuned PID has a 3.038% Of energy saving against the conventional PID.
MSE offers a better notion of the results, leading to the conclusion that the neuronal PID is better than PID fixed gains, as can be seen in the Figure 21. In order to determine where the neural PID has a better performance, the test was divided in three phases corresponding to: where no disturbance took place, when the disturbance is added and again when the disturbance stopped, as can be seen in Figure 22. Once again, to compare the auto-tuned PID vs. the conventional PID controller with fixed gains, the MSE of every phase was obtained and it is shown in Figure 23.

Conclusions
The actual work presents the development of a control algorithm to automatically tune the gains of a PID control, based on a neural network. The control algorithm was implemented on ROVs for trajectory tracking with unknown disturbances. The algorithm performance was evaluated in two instances: a numerical simulation and implemented on a ROV in real-time. The numerical simulation took place with the non-linear hydrodynamics of ROV Kaxan with 4 of the 6 DOF actuated; including disturbances of ocean currents in different directions. In reference of the second validation, it was implemented in a mini-ROV for the depth DOF, in order to validate in real-time the auto-tuned PID control. A comparative study between the conventional PID and the auto-tuned PID (proposed here) was discussed. The study took into consideration two criterions to assess the performance of each controller: position tracking error and energy consumption, leading to the conclusion that the proposed controller attained the best performance with less energy.