1. Introduction
In recent years, with the rapid development of artificial intelligence, autonomous navigation, and intelligent control technologies, underwater robots, especially remotely operated vehicles (ROVs), have become critical platforms for deep-sea resource exploration, scientific research, and engineering operations due to their strong environmental adaptability and task execution capabilities.
In international research on underwater robotics, significant progress has been made in ROV technology. The REMUS 6000, built by the U.S. company Hydroid, participated in the search for Malaysia Airlines MH370, showcasing strong large-scale deep-sea search capabilities. France’s Ifremer launched the Victor 6000, a 6000 m class ROV with versatile manipulators and precision sampling tools, widely used in hydrothermal vent studies [1]. Britain’s SMD developed the Quantum series of heavy-duty ROVs, which operate at 6000 m with dual-manipulator collaboration and support deep-sea oil and gas development. The Kaiko ROV of Japan’s JAMSTEC reached the 11,000 m Mariana Trench, laying the groundwork for ultra-deep exploration [2]. Domestically, notable progress includes the “Haidou 1” by the Shenyang Institute of Automation, which achieved autonomous navigation and precision sampling at 11,000 m in the Mariana Trench, marking China’s entry into ultra-deep exploration. Shanghai Jiao Tong University’s “Haima” ROV, operating at 4500 m, has completed hundreds of deep-sea samplings, verifying its practicality. The “Hailong 2” ROV discovered hydrothermal vents in the eastern Pacific, contributing to deep-sea geology [3]. The 3000 m class “Faxian” ROV has been applied in submarine cable inspection, addressing operational challenges on complex seabeds.
Despite these advancements, trajectory tracking control of ROVs in complex underwater environments remains challenging [4]. Conventional control strategies, such as proportional-integral-derivative (PID) control [5] and sliding mode control (SMC) [6,7], often struggle in narrow passages, sharp turns, or environments with dynamic obstacles. In such situations, accumulated tracking errors can significantly degrade system performance and hinder high-precision control. Model predictive control (MPC), known for its receding horizon optimization and explicit constraint-handling capabilities, has emerged as a promising solution for trajectory tracking tasks [8]. However, traditional MPC approaches rely heavily on manually tuned parameters (such as the prediction horizon, control horizon, and weighting matrices), resulting in limited adaptability in complex or highly disturbed scenarios [9]. Enhancing MPC adaptiveness by minimizing manual intervention has thus become a key research challenge in advancing trajectory tracking for wheeled mobile robots (WMRs) in dynamic underwater environments.
Recent research on adaptive parameter optimization has led to various improved algorithms and hybrid frameworks, which can be broadly categorized into three directions: optimization using intelligent algorithms, prediction via machine learning, and hybrid intelligent optimization strategies [10]. Swarm intelligence algorithms are widely employed due to their parallel search capabilities. For example, Jiao et al. [11] introduced a chaotic particle swarm optimization (PSO) variant, CPSO, with a tent map to enhance diversity, although premature convergence remained an issue. Shi et al. [12] proposed an adaptive-inertia PSO to dynamically adjust learning factors, which effectively reduced tracking errors. Hybrid genetic algorithm (GA)-PSO algorithms [13] demonstrated improved global search performance but incurred high computational costs. Meanwhile, the grey wolf optimizer (GWO) has been applied to dynamically adjust MPC prediction horizons [14]. In the domain of learning-based approaches, deep reinforcement learning (DRL) has shown considerable potential. Lillicrap [15] developed an actor–critic architecture for online parameter tuning in dynamic obstacle environments. Zhang et al. [16] proposed a long short-term memory (LSTM)-based model to predict optimal parameters from historical data. Hybrid strategies have also gained attention. Zhang et al. [17] combined GWO with simulated annealing (SA) to improve convergence, while Tang et al. [18] introduced a fuzzy-PSO system with a rule base for parameter adjustment. Wei et al. [19] proposed a WOA-SA approach, which combines the whale optimization algorithm (WOA) with SA, that enhanced MPC performance in sharp-turn scenarios.
Despite these advancements, several challenges persist. First, many studies oversimplify the optimization objective by focusing solely on tracking error, neglecting control smoothness and energy efficiency. Second, high computational complexity hinders real-time application. Lastly, many methods lack adaptability to dynamic and uncertain underwater environments, which limits their practical effectiveness.
To address these issues, this paper proposes a hybrid intelligent optimization framework—GPSO—that integrates GWO and PSO for adaptive parameter tuning in MPC-based trajectory tracking control. The main contributions of this study are as follows:
(1) A nonholonomic kinematic model for the WMR is constructed, and an error-state-based MPC controller is developed. A multi-objective fitness function is designed to optimize tracking accuracy, control smoothness, and energy consumption.
(2) A novel GPSO algorithm is introduced by combining the hierarchical search mechanism of GWO with the collaborative learning strategy of PSO. Enhancements include dynamic inertia adjustment, chaotic perturbation, and contraction learning to improve convergence and maintain diversity.
(3) The GPSO algorithm is embedded into the MPC framework to enable online adaptive optimization of key parameters (i.e., prediction horizon Np, control horizon Nc, and weighting matrices Q and R). This integration significantly enhances the robustness and adaptability of the control system in dynamic and uncertain environments.
2. Robot Dynamics and Kinematics Modeling
In the exploitation and research of deep-sea resources, the motion control accuracy of wheeled remotely operated vehicles (ROVs) over complex seabed terrains directly determines operational efficiency and is therefore of significant research value. This section focuses on the wheel force characteristics on soft sloped terrains and systematically constructs and analyzes the kinematic and dynamic models of the robot. For the kinematic aspect, emphasis is placed on tire slip on seabed slopes: through body–terrain coordinate transformations and the introduction of parameters such as slip ratio and sideslip angle, a model considering both longitudinal and lateral slip is developed to describe pose variations on complex terrains. For the dynamic aspect, the wheel–terrain interaction mechanisms in soft sand are combined with the nonlinear contact behavior of seabed sediments to establish a model of their interaction. This model reveals the movement characteristics of the robot in complex seabed environments, laying a theoretical foundation for high-precision motion control.
2.1. Robot Kinematic Modeling
The research object of this paper is a four-wheeled differential-drive mobile robot, whose three-dimensional coordinate relationship on submarine soft slopes is illustrated in Figure 1. To analyze the motion behavior of the wheeled mobile robot on soft slopes, two spatial coordinate systems are defined as follows: a local coordinate frame (denoted as oxyz) is established at the robot’s center of mass, whereas the world inertial coordinate system (OXYZ) serves as the global reference frame.
The robot’s pose in the global coordinate system is represented as q = [x, y, θ]^T, where (x, y) denotes the position of the robot’s center of mass in the global frame, and θ represents the heading angle, defined as the angle between the robot’s longitudinal axis and the global X-axis. Here, v denotes the linear velocity along the x-axis of the robot’s body-fixed frame, and w represents the angular velocity about its center of mass.
To account for the slip of each wheel of a four-wheeled robotic vehicle operating on the complex soft-sloped terrain of the actual seabed, a simplified kinematic model is constructed, with its schematic diagram shown in Figure 2.
Table 1 presents the nomenclature of the robot kinematic model, which explains the symbols and parameters involved in the model. When a wheeled robot slips on a slope, its actual linear velocity v can be decomposed into components along the x-axis and y-axis of the robot’s body frame. The relationship between the linear velocity and its longitudinal/lateral components is expressed as follows:
The relationship between the local coordinate frame and the inertial (global) coordinate frame is established, with the corresponding velocity transformation derived as follows [20]:
On soft terrains, variations in the normal load on each wheel lead to corresponding changes in their slip ratios and sideslip angles. For differential-steered robots, the longitudinal velocities of wheels on the same side remain consistent during straight-line travel, whereas a speed difference emerges between the wheels on the two sides during turning. From this, the relationship between wheel velocities and the robot’s traveling speed can be derived as follows:
The sideslip angle satisfies the small-angle approximation condition:
On soft terrain, relative sliding occurs at the contact interface between the tires and the ground, inducing wheel slippage and consequent partial loss of driving force. To accurately characterize the wheel slippage behavior, the slip coefficients of the four driving wheels are defined as follows:
Further, the kinematic equation can be derived as follows:
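As an illustration of the body-to-global velocity transformation used in this subsection, the following minimal Python sketch rotates the body-frame velocity components by the heading angle θ and integrates the pose with a forward-Euler step. The decomposition v_x = v cos β, v_y = v sin β with a single body sideslip angle β, the function names, and the Euler integration are assumptions made for illustration; the per-wheel slip coefficients of the full model are not included.

```python
import math

def split_velocity(v, beta):
    """Assumed decomposition of the speed v into body-frame components
    using a single sideslip angle beta (radians)."""
    return v * math.cos(beta), v * math.sin(beta)

def body_to_global(v_x, v_y, w, theta):
    """Rotate body-frame velocities (v_x, v_y) into the global frame.
    The yaw rate w is unchanged by the planar rotation."""
    x_dot = v_x * math.cos(theta) - v_y * math.sin(theta)
    y_dot = v_x * math.sin(theta) + v_y * math.cos(theta)
    return x_dot, y_dot, w

def euler_step(pose, v, beta, w, dt):
    """Advance the pose q = (x, y, theta) by one forward-Euler step."""
    x, y, theta = pose
    v_x, v_y = split_velocity(v, beta)
    x_dot, y_dot, theta_dot = body_to_global(v_x, v_y, w, theta)
    return (x + x_dot * dt, y + y_dot * dt, theta + theta_dot * dt)

if __name__ == "__main__":
    pose = (0.0, 0.0, 0.0)
    for _ in range(10):
        pose = euler_step(pose, v=0.5, beta=0.05, w=0.1, dt=0.1)
    print(pose)
```

With β = 0 the sketch reduces to the familiar slip-free differential-drive model, which gives a quick sanity check for any slip-aware extension.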
2.2. Robot Dynamics Modeling
2.2.1. Analysis of Wheel–Terrain Interaction Forces
The movement characteristics of wheeled ROVs in soft seabed sandy environments differ significantly from those on hard, flat surfaces. Due to the distinct physical property of soft soil—its proneness to deformation—phenomena such as longitudinal skidding, lateral wheel sliding, and even wheel subsidence are highly likely to occur.
Figure 3 illustrates the wheel–soil interaction force model.
For wheeled robots operating in soft sandy environments, the key parameters characterizing the wheel motion state are as follows: (1) The sideslip angle β denotes the angle between the actual motion direction of the wheel and the longitudinal axis of the vehicle body. (2) The wheel–soil contact angle ϕt defines the contact range between the wheel and the soil, which influences the distribution of shear forces. It includes the entry angle ϕe, exit angle ϕl, and maximum stress angle ϕm. (3) The slip ratio s reflects the deviation between the actual and ideal speeds of the wheel: when ksi > 0, it indicates slipping (actual speed < theoretical speed); when ksi = 0, pure rolling occurs; when ksi < 0, it indicates sliding (actual speed > theoretical speed). The wheel slip speed vs can be expressed as follows:
Furthermore, the longitudinal shear displacement at the wheel–soil interface can be derived through integration.
The tangential shear strength of soil dictates the maximum traction force of the wheel, thus making shear characteristics a critical factor in robot motion. Using the Wong–Reece and Janosi–Hanamoto soil mechanics models, the formulas for calculating normal stress and shear stress at any point within the wheel–soil contact interface are derived as follows:
where c represents the cohesion parameter of the soil; kj denotes the shear deformation modulus of the soil; kc stands for the cohesive deformation modulus; n is the sinkage exponent; φ is the internal friction angle of the soil; and kφ represents the frictional deformation modulus. The parameters of the seabed soil referred to above are set out in Table 2. Accordingly, the traction force Fti, torque of the drive-wheel motor Ti, and wheel normal load Wi can be derived as follows:
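To make the roles of the soil parameters concrete, the sketch below evaluates a Bekker-type pressure–sinkage relation on the front contact region of a rigid wheel (the building block on which Wong–Reece-style stress distributions are based) and the Janosi–Hanamoto shear stress τ = (c + σ tan φ)(1 − exp(−j/kj)). The wheel radius r, wheel width b, and all numerical values are hypothetical and are not taken from Table 2; the exact stress distribution used in this paper may differ.

```python
import math

def normal_stress(theta, theta_e, r, b, k_c, k_phi, n):
    """Bekker-type normal stress at wheel angle theta (radians), using the
    local sinkage z = r * (cos(theta) - cos(theta_e)) below the entry point.
    Assumed form, valid for the front region of the contact patch."""
    z = r * (math.cos(theta) - math.cos(theta_e))
    if z <= 0.0:
        return 0.0  # no contact at or above the entry angle
    return (k_c / b + k_phi) * z ** n

def shear_stress(sigma, j, c, phi, k_j):
    """Janosi-Hanamoto shear stress for normal stress sigma and
    shear displacement j (k_j is the shear deformation modulus)."""
    return (c + sigma * math.tan(phi)) * (1.0 - math.exp(-j / k_j))

if __name__ == "__main__":
    # Hypothetical soft-soil values, for illustration only.
    sigma = normal_stress(theta=0.0, theta_e=0.5, r=0.1, b=0.1,
                          k_c=1.0e3, k_phi=1.0e5, n=1.0)
    tau = shear_stress(sigma, j=0.01, c=500.0, phi=0.5, k_j=0.02)
    print(sigma, tau)
```

The shear stress saturates at c + σ tan φ as the shear displacement grows, which is why the soil's shear strength caps the available traction.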
2.2.2. Robot Dynamic Modeling on Soft Slopes
When the robot moves on an inclined plane, it is subjected to a vertically upward buoyancy force Ff, a vertically downward gravitational force G, and a water resistance force FW acting opposite to the forward direction. Each wheel experiences a traction force Fti, a ground resistance force Fri, a lateral force Fyi, and a normal load Wi perpendicular to the inclined plane. The force diagram of the ROV on the seabed slope is presented in Figure 4, and the nomenclature of the robot dynamic model on soft slopes is given in Table 3.
When the robot is situated on slopes with varying angles, the normal loads borne by individual wheels differ. Based on the force and moment equilibrium conditions of the robot, the respective normal loads of the four wheels can be derived as follows:
The lateral force of a wheel is jointly influenced by its sideslip angle and normal load. Consequently, an approximately linearized lateral force formula is adopted in this study, where cβ denotes the wheel relative cornering stiffness [21].
Using the Newton–Euler system modeling approach, the three-degree-of-freedom dynamic equilibrium equations for the ROV can be established as follows:
where Fri = μWi is the ground resistance force on wheel i; μ is the ground friction coefficient; and Iz represents the moment of inertia of the robot. The simplified hydrodynamic drag formula is as follows:
where ρ is the seawater density; Cd is the drag coefficient; A is the projected area of the robot perpendicular to the direction of motion; and vw is the speed of the robot relative to the water flow.
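For reference, the quantities ρ, Cd, A, and vw enter the standard quadratic drag law FW = ½ρCdA·vw². The short sketch below assumes this standard form (the simplified formula used in this paper is expected to match it, but this is an assumption); the numerical values are hypothetical.

```python
def drag_force(rho, c_d, area, v_w):
    """Quadratic hydrodynamic drag 0.5 * rho * C_d * A * v_w * |v_w|.
    The result carries the sign of v_w; the resisting force acting on the
    robot is the negative of this value."""
    return 0.5 * rho * c_d * area * v_w * abs(v_w)

if __name__ == "__main__":
    # Hypothetical values: seawater density 1025 kg/m^3, C_d = 1.0,
    # frontal area 0.5 m^2, relative speed 2 m/s.
    print(drag_force(1025.0, 1.0, 0.5, 2.0))
```

Using v_w·|v_w| instead of v_w² keeps the drag opposed to the relative motion when the robot reverses or the current changes direction.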
By differentiating the kinematic Equation (7) of the robot, substituting the relevant expressions, and simplifying, a dynamic model in the form of a matrix equation can be obtained:
where the terms are the inertia force matrix, the Coriolis and centrifugal force matrix, the water resistance term, the gravity term, the ground resistance term, the unknown disturbance term, the input transmission matrix, and the motor driving torque.
4. Design of Improved Particle Swarm Optimization Algorithm
Although MPC offers strong path-tracking performance, its effectiveness is highly sensitive to the selection of control parameters, including the prediction horizon Np, control horizon Nc, and weighting matrices Q and R. Traditional fixed-parameter tuning approaches rely heavily on empirical experience, lack a systematic framework, and show limited adaptability to dynamic changes across different trajectory scenarios. To improve the adaptability and performance of the MPC controller, this section presents an enhanced PSO algorithm for adaptive tuning of MPC parameters.
The PSO algorithm, originally proposed by Kennedy and Eberhart in 1995, is a stochastic optimization method inspired by swarm intelligence theory [25]. The algorithm simulates the social behavior observed in bird flocking or fish schooling to iteratively search for optimal solutions in a given problem space.
Within an N-dimensional solution space and a swarm population of size M, the current position and velocity of the i-th particle are defined as follows:
The velocity and position of each particle are updated using the following equations:
where k denotes the current number of iterations; ω is the inertia weight factor; the two learning coefficients are the cognitive and social coefficients, respectively; the two random numbers are uniformly distributed in the range [0, 1]; and the two attractors are the personal best position of the i-th particle and the global best position identified by the entire swarm.
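The canonical PSO update described above can be sketched as follows. The names c1, c2 (learning coefficients), r1, r2 (random numbers), p_best, and g_best follow common PSO notation and are assumptions here, as are the default parameter values.

```python
import random

def pso_step(x, v, p_best, g_best, omega=0.7, c1=1.5, c2=1.5, rng=random):
    """One canonical PSO update for a single particle.
    x, v, p_best, g_best are lists of equal length (one entry per dimension).
    Returns the new position and the new velocity."""
    new_x, new_v = [], []
    for n in range(len(x)):
        r1, r2 = rng.random(), rng.random()  # fresh draws per dimension
        v_n = (omega * v[n]
               + c1 * r1 * (p_best[n] - x[n])   # cognitive pull
               + c2 * r2 * (g_best[n] - x[n]))  # social pull
        new_v.append(v_n)
        new_x.append(x[n] + v_n)
    return new_x, new_v

if __name__ == "__main__":
    rng = random.Random(0)
    x, v = pso_step([1.0, 2.0], [0.1, -0.1], [0.5, 1.5], [0.0, 0.0], rng=rng)
    print(x, v)
```

A particle that already sits on both its personal best and the global best is moved only by its inertia term, which is why the choice of ω governs how long the swarm keeps exploring.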
In conventional PSO algorithms, the inertia factor ω and the two learning coefficients are typically fixed, which limits the algorithm’s ability to balance global exploration and local exploitation. To overcome this limitation, a fitness-value-based adaptive inertia weight adjustment mechanism is proposed in this study.
where the three fitness quantities are the current fitness value of the i-th particle, the average fitness value of the swarm, and the minimum fitness value within the current population.
Compared to conventional methods, the proposed approach adaptively adjusts the inertia weight based on the fitness level of each particle relative to the overall population. This mechanism encourages particles with lower fitness to enhance global exploration, while guiding those with higher fitness to improve local convergence accuracy, thereby accelerating convergence speed and improving optimization precision.
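One widely used fitness-based rule of this kind is sketched below for a minimization problem: particles whose fitness value is at or below the swarm average receive an inertia weight interpolated between w_min and w_max according to their distance from the current best, while the remaining particles keep w_max to continue exploring. This specific rule and the bounds w_min = 0.4 and w_max = 0.9 are assumptions for illustration; they are not necessarily the exact expression adopted in this study.

```python
def adaptive_inertia(f_i, f_avg, f_min, w_min=0.4, w_max=0.9):
    """Fitness-based inertia weight for a minimization problem.
    f_i: fitness of particle i; f_avg: swarm average; f_min: swarm minimum."""
    if f_i <= f_avg and f_avg > f_min:
        # Better-than-average particle: smaller weight, finer local search.
        return w_min + (w_max - w_min) * (f_i - f_min) / (f_avg - f_min)
    # Worse-than-average particle (or degenerate swarm): keep exploring.
    return w_max

if __name__ == "__main__":
    fitness_values = [1.0, 2.0, 3.0, 6.0]
    f_avg = sum(fitness_values) / len(fitness_values)
    f_min = min(fitness_values)
    print([adaptive_inertia(f, f_avg, f_min) for f in fitness_values])
```

The guard on f_avg > f_min avoids a division by zero when all particles share the same fitness value.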
Furthermore, to overcome limitations associated with fixed learning factors, this study introduces a dynamic adjustment method incorporating chaotic perturbation, building upon the approach in Reference [26]. Specifically, the two learning factors are perturbed using chaotic sequences generated through logistic mapping, ensuring smooth transitions during iterations and reducing search instability caused by abrupt parameter changes.
where the chaotic sequence value is generated by the logistic map (with an initial value of 1), and the iteration-dependent control factor is a function of the current iteration index and the maximum number of iterations.
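The idea of a chaotic perturbation that fades over the iterations can be sketched as follows. The plain logistic map z ← μz(1 − z) with μ = 4, the linear decay 1 − k/k_max, the perturbation amplitude, and the function names are all assumptions for illustration. The sketch also assumes a seed strictly inside (0, 1), because under this plain form of the map the values 0 and 1 both collapse to 0 and the sequence stops varying.

```python
def logistic_sequence(z0, length, mu=4.0):
    """Generate `length` values of the logistic map z <- mu * z * (1 - z)."""
    seq, z = [], z0
    for _ in range(length):
        z = mu * z * (1.0 - z)
        seq.append(z)
    return seq

def perturbed_factor(c_base, z, k, k_max, amplitude=0.5):
    """Perturb a learning factor around c_base with a chaotic value z in [0, 1].
    The perturbation shrinks linearly to zero as k approaches k_max."""
    decay = 1.0 - k / k_max
    return c_base + amplitude * decay * (2.0 * z - 1.0)

if __name__ == "__main__":
    k_max = 5
    chaos = logistic_sequence(0.7, k_max)
    print([perturbed_factor(2.0, z, k, k_max) for k, z in enumerate(chaos)])
```

Because the perturbation is bounded by the decaying amplitude, the learning factor changes smoothly and returns to its base value at the final iteration.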
A contraction factor is introduced to regulate the magnitude of the velocity update:
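A well-known choice for such a factor is the Clerc–Kennedy constriction coefficient χ = 2 / |2 − φ − sqrt(φ² − 4φ)| with φ = c1 + c2 > 4, which scales the whole velocity update. Whether the contraction factor in this work takes exactly this form is an assumption; the sketch only illustrates how the factor depends on the learning coefficients.

```python
import math

def constriction_factor(c1, c2):
    """Clerc-Kennedy constriction coefficient for phi = c1 + c2 > 4."""
    phi = c1 + c2
    if phi <= 4.0:
        raise ValueError("constriction requires c1 + c2 > 4")
    return 2.0 / abs(2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))

if __name__ == "__main__":
    # The common setting c1 = c2 = 2.05 gives a factor of about 0.73.
    print(constriction_factor(2.05, 2.05))
```

Multiplying the velocity update by this factor damps oscillations without requiring a hard velocity clamp.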
The velocity and position update equations for the enhanced PSO algorithm are formulated as follows, incorporating adaptive inertia weight, dynamic learning factors, and a contraction–expansion coefficient:
The MPC controller parameter vector to be optimized is defined as follows:
where Np is the prediction horizon; Nc is the control horizon; the three weighting coefficients in the weighting matrix Q correspond to the lateral error, longitudinal error, and yaw error; and the two weighting coefficients in the weighting matrix R correspond to the linear and angular velocities.
To enhance the evaluation capability of PSO in MPC parameter optimization, a multi-objective fitness function is designed to guide particles toward optimal parameter combinations.
where the fitness function combines the integral of the absolute lateral error, the integral of the absolute longitudinal error, the integral of the absolute yaw error, the total variation of the control inputs, and a soft-constraint penalty term. The penalty term is defined as follows:
where the bounds denote the maximum allowable errors and the control input limits, respectively. The weighting coefficients in the fitness function require empirical tuning based on trajectory characteristics to balance the performance metrics.
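The structure of such a fitness function can be sketched as a weighted sum of the three integral-of-absolute-error terms, the total variation of the control inputs, and a soft-constraint penalty. The rectangle-rule integration, the quadratic form of the penalty, and all names (weights, e_max, u_max) are assumptions for illustration; the exact penalty expression used in this study is not reproduced here.

```python
def integral_abs(errors, dt):
    """Integral of absolute error (rectangle rule) over sampled errors."""
    return sum(abs(e) for e in errors) * dt

def total_variation(controls):
    """Sum of absolute increments of the control pairs (v, w)."""
    return sum(abs(b[0] - a[0]) + abs(b[1] - a[1])
               for a, b in zip(controls, controls[1:]))

def soft_penalty(values, limit):
    """Assumed quadratic penalty on the amount by which |value| exceeds limit."""
    return sum(max(0.0, abs(v) - limit) ** 2 for v in values)

def fitness(e_lat, e_lon, e_yaw, controls, dt, weights, e_max, u_max):
    """Weighted multi-objective fitness (smaller is better)."""
    w1, w2, w3, w4, w5 = weights
    penalty = (soft_penalty(e_lat, e_max[0])
               + soft_penalty(e_lon, e_max[1])
               + soft_penalty(e_yaw, e_max[2])
               + soft_penalty([u[0] for u in controls], u_max[0])
               + soft_penalty([u[1] for u in controls], u_max[1]))
    return (w1 * integral_abs(e_lat, dt)
            + w2 * integral_abs(e_lon, dt)
            + w3 * integral_abs(e_yaw, dt)
            + w4 * total_variation(controls)
            + w5 * penalty)

if __name__ == "__main__":
    print(fitness([0.1, -0.1], [0.0, 0.0], [0.0, 0.0],
                  [(1.0, 0.0), (1.0, 0.0)], 0.5,
                  (1.0, 1.0, 1.0, 1.0, 1.0), (1.0, 1.0, 1.0), (2.0, 2.0)))
```

In an optimization loop, each candidate parameter vector would be scored by running a closed-loop MPC simulation and passing the logged errors and controls to `fitness`.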
5. Design and Optimization of the GPSO-MPC Algorithm
Although the improved PSO algorithm demonstrates strong search capability and computational efficiency, it still suffers from premature convergence, entrapment in local optima, and population diversity degradation in complex parameter spaces. These limitations reduce its effectiveness in solving nonlinear, multimodal optimization problems. To overcome them, this section introduces a hybrid GPSO strategy that integrates GWO with PSO. By combining their complementary strengths, the proposed approach improves global optimization performance and enhances robustness in tuning MPC controller parameters.
Figure 5 presents the control block diagram of the proposed GPSO-MPC system.
In Figure 5, the labeled signals are the reference and actual state vectors; the reference and actual control inputs; and the longitudinal error, lateral error, heading angle error, and control input increment.
GWO, proposed by Mirjalili et al. in 2014, is a metaheuristic swarm intelligence algorithm inspired by the social hierarchy and cooperative hunting strategies of grey wolf packs [27]. The population is divided into four hierarchical roles: α wolves (dominant leaders guiding the pack), β wolves (secondary decision-makers), δ wolves (tertiary scouts), and ω wolves (followers that help maintain population diversity). The population evolves toward optimal solutions by cooperatively tracking the positions of the three dominant wolves (α, β, δ), ensuring a balanced exploration–exploitation trade-off. The position update mechanism is mathematically formulated as follows:
The final position is computed as the weighted average of the three position updates derived from the α, β, and δ wolves’ guidance strategies.
where k is the current iteration index; the three leader vectors denote the position vectors of the α, β, and δ wolves at iteration k, respectively; and the three guided positions are the positions of the grey wolves as guided by the α, β, and δ wolves, with the final updated position obtained by averaging these three positions. The two coefficient vectors are control parameters regulating search scope and direction, calculated as follows:
where the two random numbers are uniformly distributed in the interval [0, 1], and the remaining coefficient is the convergence factor. In this study, an exponentially decaying convergence factor is designed to balance global exploration and local exploitation throughout the iteration process.
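The leader-guided update and a decaying convergence factor can be sketched as follows, using the standard GWO notation A = 2a·r1 − a and C = 2·r2 for the two coefficient vectors. The exponential schedule a(k) = a0·exp(−decay·k/k_max), its constants, and the equal-weight average of the three guided positions are assumptions for illustration and may differ from the schedule designed in this study.

```python
import math
import random

def convergence_factor(k, k_max, a0=2.0, decay=3.0):
    """Assumed exponentially decaying convergence factor a(k)."""
    return a0 * math.exp(-decay * k / k_max)

def gwo_update(x, leaders, a, rng=random):
    """Move one wolf at position x toward the alpha, beta, and delta leaders.
    leaders is a list of three position lists; a is the convergence factor."""
    new_x = []
    for n in range(len(x)):
        guided = []
        for leader in leaders:
            A = 2.0 * a * rng.random() - a   # search-scope coefficient
            C = 2.0 * rng.random()           # leader-weighting coefficient
            D = abs(C * leader[n] - x[n])    # distance to the weighted leader
            guided.append(leader[n] - A * D)
        new_x.append(sum(guided) / len(guided))  # equal-weight average
    return new_x

if __name__ == "__main__":
    rng = random.Random(1)
    leaders = [[0.0, 0.0], [0.5, -0.5], [1.0, 1.0]]
    a = convergence_factor(k=10, k_max=100)
    print(gwo_update([3.0, -2.0], leaders, a, rng=rng))
```

Large values of a early in the run allow |A| > 1, which pushes wolves away from the leaders to explore; as a decays, the pack contracts around the three best solutions.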
Furthermore, based on Equation (35), a new improvement incorporating a learning factor is introduced.
The formulation of the hybrid velocity–position update is provided in Equation (38):
The hybrid algorithm operates on two fundamental mechanisms: (1) utilizing the directional guidance provided by the α, β, and δ leaders in GWO to steer the population toward optimal regions within the search space; and (2) retaining the PSO velocity update mechanism, which considers both personal best and global best positions to enhance dynamic adaptability and solution stability.
Figure 6 presents the comprehensive workflow of this integrated GPSO approach. Consequently, the enhanced PSO velocity–position update strategy is effectively incorporated into the GWO architecture.