1. Introduction
Unmanned underwater vehicles (UUVs) play an important role in the exploration of oceans. With the increasing demand for ocean development, the UUV, lacking intervention ability, is no longer capable of handling certain tasks [1]. Meanwhile, the underwater vehicle manipulator system (UVMS) has been widely used in the marine energy and underwater architecture industry, and significant results have been achieved [2,3]. The application scenarios of the UVMS include, but are not limited to, underwater pipeline inspection [4], ship maintenance [5], underwater rescue [6], terrain exploration [7], marine biology research [8,9], and marine archaeology research [10,11].
At present, these industrial UVMSs are often fixed on the seabed during operation, forming a stable working environment. This ensures a certain level of safety and sustained, stable output, but it limits the scope and flexibility of operation. Furthermore, many underwater tasks cannot provide a reliable ground support environment. This makes it necessary to develop the floating-based UVMS, especially for scenarios such as bridge and pipeline maintenance that require continuous movement, as well as scenarios such as marine archaeology and biological research that must avoid damaging the environment during operation. Building on UVMSs, underwater vehicle dual-manipulator systems (UVDMSs) have received considerable research attention in recent years.
In terms of UVMSs, the Girona 500 UVMS [12] of Girona University has dealt with tasks such as dynamic positioning and object grasping. The Dexrov project [13] has reduced the dependence on the work environment with the concept of remote operation. The Ocean One project [14,15] of Stanford University has developed a dual-arm humanoid robot with strong operational and perception capabilities. Through remote operation, it has successfully carried out an archaeological task. Ref. [16] studied the task of moving an object to a precisely positioned peg while considering the impacts due to contact, and provided a multiple impedance manipulation method that exhibits smooth performance in simulation. A whole-body control strategy for UVDMSs proposed in the MARIS Italian research project [17] has extended the task priority framework to deal with coordinated manipulation and transportation problems. An alternative dual-arm configuration of UVDMSs is designed in [18], with one manipulator for task operation and the other for pose maintenance. This concept is shown to perform well in the presence of currents. Ref. [19] recently presented another style of UVDMS, similar to an underwater glider, together with a moving strategy that makes use of drag and provides a low-power solution for UVDMSs.
All the applications and research on the UVDMS above require reliable control methods, and the primary objective of UVMS control is dynamic positioning, which ensures a stable working environment for other tasks. Due to the limitations of underwater signal transmission, visual positioning is commonly used for operations in the local space. The most widely used visual servo method is the image-based visual servo (IBVS) [20,21], which does not require calculating the 3D information of the targets. Refs. [22,23] propose a visual servo strategy for dynamic positioning that utilizes the measuring sensors equipped on the UUV and achieves significant results. In [24], the visual servo method is applied to the UVMS, while the redundancy problem is solved through model prediction, resulting in good positioning performance. However, few studies focus on visual positioning for the UVDMS. The manipulators of the UVDMS are usually installed away from the vehicle center, unlike the UVMS, in which the manipulator is typically fixed along the center of gravity and buoyancy. This configuration has a significant impact on the stability of the system, and greater torques are required during the positioning task. Additionally, dual manipulators make motion planning more difficult since the joint limits and collision risks increase.
Modeling the UVMS dynamics is challenging, as it requires consideration of both the impact of the water and the coupling effects between the manipulators and the vehicle. A brief description of the kinematic and dynamic models of the UVMS is provided in [25]. Based on the model in [25], a numerical simulation of the dynamic model is carried out in [26], which also analyzes the coupling between the dual manipulators and the vehicle. In practice, it is impossible to develop a mathematical model that represents the physical system exactly. To date, control research on the UVMS has mostly used robust and adaptive tools to handle model uncertainties. In [24], the model coupling information is estimated by an extended Kalman filter (EKF), and the model-predictive control provides an optimal kinematic solution. In [27], a sliding mode control of the UVMS with a certain anti-interference ability is provided for a tracking task. To deal with uncertainties and disturbances, Refs. [28,29] provide an adaptive control strategy and an observer-based method, respectively, and both show good performance in handling uncertainties.
Based on the research above, this work studies the visual servo control of the UVDMS under model uncertainties. A hybrid visual servo method is proposed based on the multiple cameras and attitude sensors equipped on the UVDMS. The command velocity is produced by the kinematic controller using the task priority scheme. In addition, a reinforcement-learning-based speed tracking controller is designed, and the system model error is compensated by the designed actor neural network. Meanwhile, the error system is proved to be uniformly ultimately bounded by the Lyapunov method. Finally, a UVDMS model is simulated, and the results demonstrate the effectiveness of the proposed control.
This paper is organized as follows. Section 2 describes the mathematical model of the kinematics and dynamics of the UVDMS, as well as the hybrid visual servo model of the UVDMS. Section 3 formulates the kinematic control that produces the command velocity to be tracked. Section 3 also formulates the reinforcement-learning-based adaptive control, in which the actor–critic networks are designed to compensate for the system uncertainties, and presents the stability analysis. Then, the simulation work using an 18-dof UVDMS is shown in Section 4. Finally, Section 5 provides a brief summary of this work.
2. Problem Formulation
This section introduces the kinematics and dynamics of the UVDMS by providing the necessary coordinates and variables. A visual servo model that accounts for the camera configuration of the UVDMS is established. In addition, a task priority strategy for redundancy control and a universal reinforcement learning method based on the actor–critic algorithm are provided. Finally, the objective of this study is described.
2.1. UVMS Model
To illustrate the modeling of the UVDMS, a 3D model consisting of a fully actuated UUV and two 6-dof manipulators is shown in Figure 1. This is a redundant system with 18 degrees of freedom. According to [25,30], the underwater rigid body model can be described in several coordinate frames: the inertial frame, the vehicle body-fixed frame with its origin at the center of mass, the main camera frame, the manipulator base frames, the camera frames fixed to the end effectors, and the end-effector frames attached at the ends of the manipulators. In the inertial frame, the pose vector of the base vehicle contains the global position and the Euler angles. The vehicle's velocity, containing the linear velocity and the angular velocity, is defined in the vehicle body-fixed frame.
For the underwater manipulators, the states are generally described by the joint angles of each manipulator and the corresponding joint velocities. Combining the velocities of the UUV and the manipulators, one obtains the motion transformation from the vehicle body frame to the inertial frame, which is built from the rotation matrix from the vehicle body frame to the inertial frame and the Jacobian matrix of the vehicle angular velocity. Then, the relationship between the system velocity and the end-effector velocity can be formulated through the system Jacobian. This relationship is a brief description of the direct kinematics of the UVMS. The Jacobian includes the Jacobian of the left end effector and the Jacobian of the right end effector, which can be computed from the DH parameters and the rotation matrix, and the resulting quantities represent the poses of the left and right end effectors, respectively.
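As an illustration of the vehicle part of this transformation, the following minimal Python sketch builds a body-to-inertial rotation matrix and an angular velocity Jacobian. It rests on assumptions not stated in the text: a ZYX (roll, pitch, yaw) Euler convention, and the hypothetical function names `rotation_body_to_inertial`, `euler_rate_matrix`, and `vehicle_pose_rate`. Depending on the convention adopted, the Jacobian matrix of the vehicle angular velocity used in this work may correspond to the inverse of `euler_rate_matrix`.

```python
import numpy as np


def rotation_body_to_inertial(phi, theta, psi):
    """Rotation matrix from the body frame to the inertial frame.

    Assumes a ZYX Euler convention: R = Rz(psi) @ Ry(theta) @ Rx(phi).
    """
    cphi, sphi = np.cos(phi), np.sin(phi)
    cth, sth = np.cos(theta), np.sin(theta)
    cpsi, spsi = np.cos(psi), np.sin(psi)
    Rz = np.array([[cpsi, -spsi, 0.0], [spsi, cpsi, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[cth, 0.0, sth], [0.0, 1.0, 0.0], [-sth, 0.0, cth]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cphi, -sphi], [0.0, sphi, cphi]])
    return Rz @ Ry @ Rx


def euler_rate_matrix(phi, theta):
    """Maps body-frame angular velocity to Euler angle rates.

    Singular at theta = +/- pi/2 (a known limitation of Euler angles).
    """
    cphi, sphi = np.cos(phi), np.sin(phi)
    cth, tth = np.cos(theta), np.tan(theta)
    return np.array([
        [1.0, sphi * tth, cphi * tth],
        [0.0, cphi, -sphi],
        [0.0, sphi / cth, cphi / cth],
    ])


def vehicle_pose_rate(eta, nu):
    """Pose rate in the inertial frame from the body-frame velocity.

    eta = [x, y, z, phi, theta, psi], nu = [u, v, w, p, q, r].
    """
    eta = np.asarray(eta, dtype=float)
    nu = np.asarray(nu, dtype=float)
    phi, theta, psi = eta[3:6]
    J = np.zeros((6, 6))
    J[:3, :3] = rotation_body_to_inertial(phi, theta, psi)
    J[3:, 3:] = euler_rate_matrix(phi, theta)
    return J @ nu


# Usage: a vehicle yawed by 90 degrees and surging at 1 m/s
eta_dot = vehicle_pose_rate([0, 0, 0, 0, 0, np.pi / 2], [1, 0, 0, 0, 0, 0])
```

For the full 18-dof system, the joint velocities of both manipulators would be appended to the vehicle velocity, with an identity block for the joint rates.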
Most of the system uncertainties come from the UVDMS dynamics since it is difficult to obtain the hydrodynamic and internal couplings precisely. For simplicity, this work formulates a concise Lagrangian dynamic model without considering the water velocity. In this model, the inertia matrix consists of the vehicle's inertia and the manipulators' inertia on the diagonal, while the other matrix elements represent the coupling factors. Similarly, the added Coriolis and centripetal matrix is developed in a compact manner, as well as the damping and hydrodynamic lift matrix. The remaining two terms denote the input vector of the vehicle and joint torques and the restoring forces due to gravity and buoyancy, respectively. As a rigid body system, these matrices have the following properties: the symmetric inertia matrix is positive definite and bounded above and below by positive definite functions, and the Coriolis and centripetal matrix can be chosen to satisfy the usual skew-symmetry property. The dynamic model can be written as a general nonlinear differential equation to simplify the subsequent derivation.
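For reference, a Lagrangian model of this kind is commonly written in the form sketched below. The symbols are assumptions chosen for illustration, following common UVMS notation rather than necessarily the notation of this work: $\zeta$ is the system velocity, $q$ the system configuration, $M$ the inertia matrix, $C$ the Coriolis and centripetal matrix, $D$ the damping and lift matrix, $g$ the restoring forces, $\tau$ the input vector, and $m_1$, $m_2$ positive bounds.

```latex
% Assumed notation (illustrative only), water velocity neglected
\[
  M(q)\,\dot{\zeta} + C(q,\zeta)\,\zeta + D(q,\zeta)\,\zeta + g(q) = \tau
\]
% Typical structural properties of rigid-body models
\[
  m_1 \lVert x \rVert^2 \le x^{\top} M(q)\, x \le m_2 \lVert x \rVert^2,
  \qquad
  x^{\top}\bigl(\dot{M}(q) - 2\,C(q,\zeta)\bigr)\,x = 0 \quad \forall x
\]
% Equivalent general nonlinear form used for controller derivations
\[
  \dot{\zeta} = M^{-1}(q)\bigl(\tau - C(q,\zeta)\,\zeta - D(q,\zeta)\,\zeta - g(q)\bigr)
\]
```

The last line follows from the first because a positive definite inertia matrix is always invertible.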
2.2. Visual Servo Model
Traditional image-based visual servoing requires multiple feature points or specially shaped patterns, which are not easy to obtain or deploy underwater. Fortunately, UUVs, and UVMSs in particular, are usually equipped with several cameras, so a semi-stereoscopic visual system can be developed that needs fewer feature points.
In this work, we consider a UVDMS with two cameras fixed to the end effectors and a main camera fixed to the vehicle body. As long as the object feature (i.e., a single point) is within the field of view of all cameras, the depth information can be computed by taking advantage of the transformation between the hand cameras and the body camera. Then, the visual servo model is formulated briefly to provide a command velocity for the dynamic control.
The transformation from the camera velocity in the camera frame to the image feature velocity in the image plane is described by the image Jacobian: the feature point in the image plane is driven by the camera velocity through the image Jacobian. The scalar depth variable, the z-position of the object in the camera frame, is calculated with the Jacobian of the main camera, and the camera's internal parameters are obtained by camera calibration. To make full use of the sensors on the UUV (e.g., the IMU), an augmented feature vector, which includes the image features, the image depth, and the Euler angles of the end effector, is defined together with its desired form.
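To make the role of the image Jacobian concrete, the following Python sketch implements the classical interaction matrix of a single point feature in normalized image coordinates. This is the textbook IBVS form and is only assumed to match the structure used in this work; the function names and the intrinsic parameters `fx`, `fy`, `cx`, `cy` are hypothetical, and the depth `Z` is taken as given rather than computed from the main camera.

```python
import numpy as np


def pixel_to_normalized(u, v, fx, fy, cx, cy):
    """Convert pixel coordinates to normalized image coordinates.

    fx, fy, cx, cy are pinhole intrinsics from calibration (assumed names).
    """
    return (u - cx) / fx, (v - cy) / fy


def point_interaction_matrix(x, y, Z):
    """Classical 2x6 interaction matrix of a point feature.

    x, y: normalized image coordinates; Z: depth in the camera frame (> 0).
    Columns correspond to the camera twist [vx, vy, vz, wx, wy, wz].
    """
    return np.array([
        [-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x * x), y],
        [0.0, -1.0 / Z, y / Z, 1.0 + y * y, -x * y, -x],
    ])


def feature_velocity(x, y, Z, camera_twist):
    """Image-plane velocity of the feature for a given camera twist."""
    L = point_interaction_matrix(x, y, Z)
    return L @ np.asarray(camera_twist, dtype=float)


# Usage: a centered feature at 2 m depth, camera translating along its x-axis
s_dot = feature_velocity(0.0, 0.0, 2.0, [1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
```

The explicit dependence on `Z` shows why a depth estimate is needed even when only one feature point is tracked.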
4. Simulation Experiments
This section presents simulation experiments that test the performance of the proposed visual servo control method for UVDMSs using Matlab R2023b and Unity 2022. The dynamic parameters of the UUV model shown in Figure 1 are listed as follows: The mass of the vehicle is 106 kg, while the buoyant force is 1058 N. The centers of gravity and buoyancy are
and
. The manipulator’s geometrical parameters are shown in
Figure 3 and
Table 1.
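A quick consistency check of the vehicle mass and buoyant force given above, assuming that the 106 kg figure is the vehicle mass and that $g \approx 9.81\ \mathrm{m/s^2}$ (neither is stated explicitly), suggests that the vehicle alone is slightly positively buoyant:

```latex
\[
  W = m g \approx 106 \times 9.81 \approx 1039.9\ \mathrm{N},
  \qquad
  B - W \approx 1058 - 1039.9 \approx 18.1\ \mathrm{N} > 0
\]
```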
Inspired by simurv 4.0 [17], the dynamics of each rigid body link and of the UUV are projected into the generalized velocity coordinates and added together as 18-dimensional generalized forces, which yields the dynamic model of the UVDMS. Using the Jacobians and DH functions from simurv 4.0, the transformations between different frames can be easily obtained when developing the kinematics. The model parameters of the UUV are computed with Creo 10.0 (used to design all models and to measure the moments of inertia of the rigid bodies) and Ansys 2019 R3 (the Computational Fluid Dynamics module of Ansys Workbench is used to identify the UUV hydrodynamic parameters).
and are the initial conditions of the states. The target coordinates in the inertial frame are and . The desired position of the vehicle is , and the desired orientation of each end effector points its z-axis along the x-direction of the inertial frame.
The simulation results for the UVDMS position and orientation are shown in Figure 4, Figure 5, Figure 6 and Figure 7, including the pose of the vehicle in Figure 4, the angles of both manipulators in Figure 5, the position of both end effectors in Figure 6, and the orientation (in the form of Euler angles) of both end effectors in Figure 7a. It can be seen that all states eventually become stable. Comparing the convergence time of the vehicle position with that of the end effectors, we see that the vehicle arrives at its target earlier. That is, before the visual servoing, the vehicle should be driven to the working space by changing the task's priority. The vehicle Euler angles show that the roll and pitch angles change more significantly than the yaw angle, which is caused by the floating-based operation, with gravity effects changing during the movement. This phenomenon also appears in the velocity and torque figures below since the controller is trying to restore the orientation. In Figure 5, the angles of all joints change smoothly without exceeding the joint limits. In addition, Figure 6 indicates that the task priority strategy results in smoother curves for the end effectors than for the UUV.
The UVDMS velocities are shown in Figure 7b, Figure 8 and Figure 9, where Figure 7b shows the linear velocities of the end effectors, Figure 8 shows the linear and angular velocities of the vehicle, and Figure 9 shows the angular velocities of all manipulator joints. From the velocity figures, it can be observed that most velocities stay within their limits, so the system runs safely. Then, the command torques of the vehicle and manipulators are shown in Figure 10 and Figure 11. It can be seen that the command torques remain active until the simulation stops. That is, during the dynamic positioning, the open-loop system is far from its equilibrium with both manipulators extended forward. This is also supported by the pitch velocity and y-torque curves, which have larger values than the others. The joint torques of both manipulators show that the joints closer to the base require more torque to move. It should be noted that, to compensate for gravity, several torques (joints 2–4) maintain non-zero values at the end.
Finally, Figure 12a plots the compensation signals from the actor neural network, divided into three parts, which indicate the system error (including the disturbance). It is assumed that the system uncertainty is bounded and does not change drastically when the estimated model is close enough to the real system model. We therefore use RBF neural networks as the actor and critic networks for their universal approximation ability and simple configuration. The basis functions are Gaussian, and each network has 100 neurons. A direct influence of the actor–critic networks on the control performance is that the parameters of the linear controller become easier to tune. The simulation data have been displayed in Unity 2022 with the lightweight UVDMS designed by the author to test the control strategy as well as to check for joint collisions and singularities. As shown in Figure 12b, the UVDMS successfully tracked the target features and positioned itself by the side of the shipwrecks. The purple trajectories show that the vehicle center and both end effectors move smoothly.
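The structure of such an RBF network with Gaussian basis functions can be sketched in Python as follows. Only the Gaussian basis and the count of 100 neurons come from the text; the 18-dimensional input and output, the randomly placed centers, the common width of 1.0, the zero initial weights, and the class name `GaussianRBFNetwork` are assumptions for illustration, and the weight update laws of the actor and critic are not reproduced here.

```python
import numpy as np


class GaussianRBFNetwork:
    """Linear-in-the-weights network: y = W^T phi(x), phi Gaussian."""

    def __init__(self, centers, width, n_outputs):
        self.centers = np.asarray(centers, dtype=float)  # (n_neurons, n_inputs)
        self.width = float(width)                        # common Gaussian width
        # Zero initial weights, so the initial compensation is zero (assumed)
        self.weights = np.zeros((self.centers.shape[0], n_outputs))

    def features(self, x):
        """Gaussian activations of all neurons for the input x."""
        diff = self.centers - np.asarray(x, dtype=float)
        dist_sq = np.sum(diff * diff, axis=1)
        return np.exp(-dist_sq / (2.0 * self.width ** 2))

    def __call__(self, x):
        """Network output for the input x."""
        return self.weights.T @ self.features(x)


# Usage: 100 neurons, as in the simulation; the other values are illustrative
rng = np.random.default_rng(0)
centers = rng.uniform(-1.0, 1.0, size=(100, 18))
actor = GaussianRBFNetwork(centers, width=1.0, n_outputs=18)
compensation = actor(np.zeros(18))  # zero until the weights are adapted
```

Because the output is linear in the weights, an adaptive law only needs to update `weights`, which is one reason this structure is convenient for Lyapunov-based analysis.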
5. Conclusions
A reinforcement-learning-based adaptive control for the visual servoing of a UVDMS equipped with two six-dof manipulators is developed in this work: (a) Unlike the classical IBVS schemes, the proposed hybrid visual servo takes advantage of the multiple cameras and other sensors to obtain the image depth, so that fewer target image features are needed. (b) The command velocity is computed through a kinematic controller using the task priority method, considering both the positioning of the vehicle and the visual servo command. (c) A deterministic policy gradient (DPG) strategy is used to design an actor–critic method to deal with the system uncertainties. The model uncertainty is compensated by the actor neural network, while the critic neural network evaluates the performance of the actor. The critic network is updated by the gradient of the velocity tracking error, and the actor is adjusted by the critic network. (d) The velocity tracking error is proved to be ultimately bounded using the Lyapunov method. (e) Finally, the simulation of the UVDMS using Matlab and Unity shows good performance of the proposed strategy.
This approach aims to provide a dynamic positioning method for UVDMS tasks in a relatively stable local environment. However, the effects of currents, joint dynamics, thruster dynamics, low-quality underwater images, unknown environments, and low-precision sensor measurements (especially of the linear velocity of the UUV) are not taken into account. Additionally, collision avoidance (especially between the two manipulators) and cooperative control are not addressed.
Control and planning of the UVMS, especially the UVDMS, are challenging. Due to the coupling and uncertainty problems, designing a universal high-performance controller is far more difficult than for manipulators on land or UUVs. At the same time, the demand for ocean exploration makes the research of UVDMS and UVMS promising. Therefore, the follow-up study of this work will continue by focusing on the motion planning and visual control of the UVDMS.