1. Introduction
Image-Based Visual Servoing (IBVS) uses image feature feedback directly to control robot motion, eliminating the need for precise hand-eye calibration and complex environmental modeling. In unstructured and dynamic environments, such as high-speed assembly lines and collaborative workspaces, IBVS is particularly attractive because precise calibration is often impractical. However, the engineering value of IBVS depends heavily on its ability to maintain sub-pixel accuracy under strict safety constraints, which remains a critical hurdle for widespread deployment [1,2,3]. For a robotic IBVS system, the seamless integration of precise control, reliable estimation, and efficient disturbance compensation is a prerequisite for stable operation. The IBVS controller must adapt to physical constraints and system nonlinearities; state estimation must extract reliable image feature information from noisy measurements; and disturbance compensation must offset friction, load changes, and modeling errors in real time. The synergy of these three factors directly determines the engineering practicality of the robotic IBVS system [4,5,6].
However, in actual engineering scenarios, the IBVS system faces a three-level progressive bottleneck of “nonlinearity and constraint coupling, noise and disturbance interference, and insufficient disturbance observation performance”, which seriously restricts its high-precision application. Firstly, at the control layer, there is a fundamental conflict between nonlinearity decoupling and physical constraint handling. While feedback linearization effectively decouples system dynamics [
7], it inherently ignores actuator limits, posing a risk of hardware damage or saturation. Conversely, linear MPC handles constraints explicitly but suffers from model mismatch when applied to the highly nonlinear IBVS dynamics, leading to tracking degradation [
8]. Secondly, at the estimation layer, there is the problem of noise and disturbance coupling interference. Image sensors are susceptible to Gaussian noise and sudden changes in illumination, resulting in distortion of feature measurement signals. The lumped disturbances formed by joint friction, load fluctuations, and Jacobian modeling errors during the robot movement will further distort the system state information. This coupling induces a vicious cycle of degradation: measurement noise propagates into the disturbance observer, causing estimation divergence, while uncompensated disturbances conversely distort the state prediction. This mutual interference makes it extremely challenging to simultaneously suppress high-frequency noise and accurately learn low-frequency lumped disturbances [
9,
10]. Thirdly, there is the problem of optimizing key parameters for disturbance observation. Due to its high local approximation accuracy and fast learning speed [
11], the Radial Basis Function (RBF) neural network is a mainstream disturbance observation tool in current IBVS systems [
12,
13]. Recently, to further mitigate computational burdens and enhance convergence speed, advanced neural control architectures have been developed. For instance, ref. [
14] proposed a velocity-free adaptive neural-fuzzy control strategy, achieving predefined-time convergence for spacecraft attitude tracking with reduced computational complexity. Similarly, ref. [
15] designed a fast fixed-time distributed neural disturbance observer for UAVs, which effectively guarantees rapid disturbance estimation under limited resources. However, IBVS systems usually use multiple feature points to construct image feature vectors, resulting in a high-dimensional input space for the RBF network. In this setting, traditional center selection methods face the “curse of dimensionality”: global clustering generates excessive redundant centers, violating the real-time requirements of high-frequency servoing, whereas random sampling fails to cover critical task regions, resulting in unacceptable approximation errors [
16,
17]. Balancing estimation fidelity with computational efficiency is an urgent problem to be solved.
To overcome these bottlenecks, researchers have pursued three lines of work: IBVS control, disturbance estimation, and RBF center optimization. In the field of control methods, Caradonna et al. [
18] proposed feedback linearization to control the dynamics of continuum soft robots, realizing system linearization to handle nonlinear coupling, but did not consider joint physical constraints. Sauvée et al. [
19] combined IBVS with the Nonlinear Model Predictive Control (NMPC) architecture, explicitly incorporating joint angle limits, actuator torque saturation, and target visibility constraints (ensuring that image features are always within the camera resolution range) into the constraints of the optimization problem, but there is still much room for improvement in real-time performance. Allibert et al. [
20] attempted a combined scheme of “feedback linearization + MPC”, but did not introduce a disturbance compensation module, so the scheme is suitable only for ideal disturbance-free environments.
In the field of state estimation and disturbance observation, the Extended Kalman Filter (EKF) has become the mainstream choice due to its strong noise suppression ability and moderate computation [
21,
22,
23,
24]. However, the traditional EKF treats unknown disturbances as process noise, which must be accommodated by inflating the process noise covariance matrix and therefore reduces state estimation accuracy. The RBF neural network can approximate nonlinear disturbances through online learning, but feeding noisy image features directly into the network may cause parameter oscillation and divergence. Esfandiari et al. [25] used the EKF to serially update RBF network weights to adapt to external disturbances, realizing robot trajectory tracking. In this one-way design, however, the disturbance estimate is not fed back to correct the EKF state prediction, so the cooperative advantages of the two estimators are not exploited. The extended-state EKF can estimate both state and disturbance [26,27], but it requires augmenting the state vector, which increases computation relative to the traditional EKF and makes it difficult to meet the real-time control requirements of the IBVS system.
In the field of RBF center optimization, Wurzberger et al. [28] noted that a common practice is to sample a fixed number of centers at random from the training data set; this approach is simple to implement and computationally cheap, and it suits simple tasks with small data volume, uniform distribution, and low precision requirements. Other studies [29,30] used K-means clustering to generate RBF centers offline to improve disturbance approximation accuracy. In the IBVS field, if the selection of RBF centers is combined with the IBVS task path and the robot kinematic constraints, the introduction of redundant or unreachable centers can be avoided.
Inspired by the above research, this paper proposes an integrated solution of “feedback linearization IBVS-MPC control, EKF-RBF bidirectional coupled estimation, and task-oriented optimization of RBF centers for IBVS trajectories”. Through the closed-loop design of “decoupling nonlinearity and constraints at the control layer, cooperatively suppressing noise and disturbances at the estimation layer, and optimizing RBF center selection at the tool layer”, the three-level bottleneck is systematically broken through.
The main research contents and innovations of this paper are as follows:
1. A coupled RBF-EKF bidirectional estimation mechanism is developed. Specifically, the EKF filters noisy visual measurements to provide refined state estimates, which serve as high-quality inputs for the RBF network. Simultaneously, the RBF network, configured with task-oriented centers, learns lumped disturbances online. This disturbance estimate is then fed back into the EKF’s prediction step to compensate for model deviations. This interaction establishes a synergistic closed loop of “state estimation, disturbance learning, and predictive correction.”
2. A task-oriented RBF center selection method based on K-means clustering is designed. By coupling the IBVS-MPC control law, robot forward kinematics, and a camera projection model, a task-oriented nominal image feature sequence covering the path from the initial pose to the target pose is iteratively generated. K-means clustering then yields a compact center set that closely fits the task path, which ensures the kinematic reachability of the centers and keeps their number bounded, balancing disturbance approximation accuracy against real-time performance.
3. The Uniformly Ultimately Bounded (UUB) stability of the EKF-RBF coupled state-disturbance estimation system is proven based on Lyapunov stability theory. The convergence bounds of the state and disturbance estimation errors are established, the convergence of time-varying disturbance observation is ensured, and network divergence caused by noisy signals is avoided.
4. Manipulator simulation experiments verify that the proposed method significantly improves the estimation accuracy of states and disturbances while meeting real-time control requirements, demonstrating its engineering practicality.
The subsequent section arrangement of this paper is as follows:
Section 2 establishes the image feature motion model and the robot velocity kinematics model of the IBVS system, as well as the challenges it faces, and clarifies the problem description;
Section 3 details the design of the integrated solution, including the feedback linearization IBVS-MPC control law, the EKF-RBF coupled state-disturbance estimation method, and the task-oriented RBF center selection method based on K-means clustering;
Section 4 conducts the stability analysis of EKF-RBF coupled state-disturbance estimation;
Section 5 verifies the effectiveness of the method through simulation experiments and compares it with traditional methods;
Section 6 summarizes the full text and looks forward to future research directions.
3. Control Methods
To address the critical issues identified in Section 2, specifically the reduced mapping accuracy, increased trajectory tracking error, and insufficient control stability caused by the lumped disturbance D (comprising image measurement noise, Jacobian modeling errors, and external environmental interference), this section proposes an integrated solution combining “high-precision estimation” and “control compensation.” The core strategy is to acquire accurate system state and lumped disturbance information by integrating advanced control strategies with intelligent estimation methods, and then to introduce a disturbance feedforward compensation term into the control law. This approach suppresses uncertainties and thereby improves the overall control accuracy and stability of the IBVS system.
To fully realize this design philosophy, the section content is structured logically, progressing from control law design to estimation method construction, and finally to parameter optimization.
Section 3.1 begins by considering the motion constraints and trajectory tracking requirements of the robot. An IBVS control law based on feedback linearization and Model Predictive Control (MPC) is designed, which enhances the system’s adaptability to physical constraints—such as joint angles and velocities—through rolling optimization and constraint handling capabilities.
Section 3.2 subsequently addresses the complexity of the lumped disturbance D and the interference of state measurement noise by proposing a coupled state-disturbance estimation method combining the Extended Kalman Filter (EKF) and Radial Basis Function (RBF) neural network. The EKF is employed to suppress measurement noise and achieve high-precision estimation of image features. Simultaneously, using the estimated state as input, the RBF neural network utilizes its local approximation characteristics to learn the lumped disturbance D online. This estimation result is fed back to the EKF to correct state predictions, establishing a closed-loop cooperative mechanism of “state estimation–disturbance learning–prediction correction.”
Section 3.3 focuses on the core prerequisite for accurate RBF estimation: the reasonable selection of high-dimensional network centers. Traditional methods often suffer from redundancy due to undifferentiated coverage, physical infeasibility due to kinematic constraint violations, or difficulties in balancing accuracy with real-time performance. To overcome these challenges, an RBF center selection method based on the IBVS nominal path is proposed. By leveraging the coupled iteration of IBVS-MPC, robot kinematics, and the camera projection model, this method achieves task-oriented coverage of key image feature areas and guarantees the kinematic reachability of centers, thus balancing estimation accuracy with system real-time performance.
Through these interconnected components, the effective suppression of lumped disturbance and a significant improvement in control performance are achieved, providing solid theoretical and methodological support for the subsequent experimental verification.
3.1. Feedback Linearization IBVS-MPC Control
A fundamental challenge in IBVS systems is their inherent nonlinearity, which arises from the time-varying nature of both the image Jacobian matrix and the robot velocity Jacobian matrix. Consequently, traditional linear control methods, such as proportional control, are ineffective for ensuring control accuracy in large-scale trajectory tracking scenarios. Furthermore, the robot joint positions and velocities are subject to strict physical constraints. Failure to account for these constraints in the control law may lead to system oscillation or hardware damage. To address these issues, this section proposes a robust control law that combines the constraint-handling capabilities of Model Predictive Control (MPC) with the nonlinear decoupling capacity of feedback linearization. Crucially, linear MPC offers low computational cost and high real-time performance. Its optimization process can be transformed into a convex Quadratic Programming (QP) problem, allowing for rapid solutions that meet the high-frequency control requirements of IBVS systems.
However, linear MPC is strictly applicable only to linear systems, whereas the IBVS system is typically nonlinear. Direct application of linear MPC would result in model mismatch, leading to increased tracking errors and instability. Therefore, it is necessary to decouple and linearize the nonlinear IBVS system via feedback linearization to establish the foundation for linear MPC application.
Feedback linearization is a nonlinear control method based on accurate modeling. Its core principle involves transforming a nonlinear system into a fully controllable linear system through nonlinear state feedback and coordinate transformation, thereby facilitating the application of mature linear control strategies. Unlike local linearization methods (e.g., Taylor expansion), feedback linearization achieves accurate linearization within the global scope, effectively retaining system dynamic characteristics and avoiding large-scale tracking errors. Additionally, the system’s lumped disturbance D can be explicitly separated during this process, providing an interface for subsequent feedforward compensation via EKF+RBF estimation, thus enhancing system robustness.
First, we design the following nonlinear state feedback control law:
where
represents the pseudoinverse of the total Jacobian matrix.
represents the virtual control quantity, i.e., the control input of linear MPC.
is the estimated value of the lumped disturbance
D, which will be generated by the EKF+RBF coupled estimation introduced later.
By substituting this feedback control law into Equation (
10), under the premise that
, the linearized system can be obtained:
At this stage, the nonlinear IBVS system has been converted into a simple linear integral system, eliminating the nonlinear coupling of the original system and providing a nominal linear model for MPC design.
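The cancellation behind this step can be checked numerically with a minimal sketch. It assumes that the feature dynamics take the form x_dot = J q_dot + D and that the control law takes the form q_dot = J^+ (v - D_hat); these forms, the 2x2 Jacobian values, and all function names are illustrative assumptions rather than the exact expressions of Equations (10) and (14). For a square, full-rank Jacobian the pseudoinverse coincides with the inverse, which keeps the example short.

```python
def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def inv_2x2(M):
    """Inverse of a 2x2 matrix; stands in for the pseudoinverse J^+ here."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("Jacobian is singular; the linearizing law is undefined")
    return [[d / det, -b / det], [-c / det, a / det]]

def linearizing_joint_velocity(J, v, D_hat):
    """Assumed control law: q_dot = J^+ (v - D_hat)."""
    return mat_vec(inv_2x2(J), [vi - di for vi, di in zip(v, D_hat)])

def feature_rate(J, q_dot, D):
    """Assumed feature dynamics: x_dot = J q_dot + D."""
    return [a + d for a, d in zip(mat_vec(J, q_dot), D)]

J = [[2.0, 0.5], [-1.0, 1.5]]   # hypothetical total Jacobian
D = [0.3, -0.2]                 # hypothetical lumped disturbance
v = [1.0, -0.5]                 # virtual control from the MPC layer

# Perfect disturbance estimate: the closed loop reduces to x_dot = v.
x_dot_exact = feature_rate(J, linearizing_joint_velocity(J, v, D), D)
# No compensation (D_hat = 0): the residual D remains in the loop.
x_dot_uncompensated = feature_rate(J, linearizing_joint_velocity(J, v, [0.0, 0.0]), D)
```

With an exact estimate the feature rate equals the virtual control, whereas the uncompensated case leaves the full disturbance in the linearized model, which is why the accuracy of the disturbance estimate matters for the MPC layer.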
Next, the linearized system Equation (
15) is discretized to obtain the discrete state equation:
where
represents the discrete system matrix,
represents the discrete input matrix,
represents the system state at time
, and
represents the control input at time
k.
To prevent system instability or hardware damage, the strict physical limits of the robot must be respected. This paper converts these physical constraints into constraints on the system state
x (image features), the MPC virtual control quantity
, its increment
, and the joint velocity
, as follows:
where
and
represent the feasible region of image features within the pixel coordinate system.
and
represent the minimum and maximum values of the MPC virtual control quantity, respectively, which together define the range of the robot joint velocity.
and
represent the minimum and maximum constraints on the control input increment.
We define the control input sequence
and state prediction sequence
at time
k:
where
represents the prediction of the system state at time
based on the state at time
k,
represents the planned control input, and
represents the prediction horizon.
The prediction model can be derived as follows:
where
,
.
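For intuition, the stacked prediction can be written out for a single feature coordinate. The sketch below assumes the integrator model x_dot = v sampled with period T under a zero-order hold, which gives x_{k+1} = x_k + T v_k; the scalar setting and the names F, G, and `predict_stacked` are illustrative assumptions, not the paper's notation. It compares the stacked form X = F x_k + G V with a step-by-step roll-out.

```python
def prediction_matrices(T, Np):
    """Stacked prediction for x_{k+1} = x_k + T * v_k over Np steps (scalar case)."""
    F = [1.0] * Np                                    # each prediction starts from x_k
    G = [[T if j <= i else 0.0 for j in range(Np)]    # input j affects steps i >= j
         for i in range(Np)]
    return F, G

def predict_stacked(x_k, V, T):
    """Evaluate X = F * x_k + G * V for an input sequence V."""
    Np = len(V)
    F, G = prediction_matrices(T, Np)
    return [F[i] * x_k + sum(G[i][j] * V[j] for j in range(Np)) for i in range(Np)]

def predict_rollout(x_k, V, T):
    """Reference: apply the discrete model one step at a time."""
    X, x = [], x_k
    for v in V:
        x = x + T * v
        X.append(x)
    return X
```

Both functions return the same sequence; the stacked form is the one that turns the cost function into a quadratic in the input sequence.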
The reference state sequence is defined as:
where
represents the reference state sequence, and
is the desired state at time
, serving as the target trajectory.
A cost function is designed to balance tracking accuracy with control smoothness:
where
and
are the diagonal weighting matrices for the state and input, respectively.
represents the tracking-error weighting matrix, and
represents the control-input weighting matrix.
represents the projected tracking error vector within the prediction horizon.
The core objective of MPC is to find a feasible solution—the optimal control sequence
—such that the cost function
is minimized. This is achieved by solving the following convex optimization problem under the constraints defined in Equation (
17)
where
,
.
Finally, the MPC controller employs a receding horizon strategy. Only the first element of the optimal input sequence
is applied as the auxiliary control quantity
in the control law (
14). At the next moment
, the state
is re-measured, and the prediction and optimization process is repeated. This real-time correction compensates for errors caused by uncertainties and ensures system robustness.
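The receding horizon loop can be sketched for one feature coordinate as follows. The example assumes the scalar integrator model x_{k+1} = x_k + T v_k, a constant reference, and box limits on the virtual control only; it replaces a dedicated QP solver with a plain projected-gradient iteration, which is valid for this convex box-constrained problem but is not claimed to be the solver used in this work. All numerical values and function names are hypothetical.

```python
def solve_box_qp(x_k, ref, T, Np, q, r, v_min, v_max, iters=300):
    """Minimise sum_i q*(x_{k+i} - ref)^2 + r*v_i^2 subject to v_min <= v_i <= v_max,
    where x_{k+i} = x_k + T * (v_0 + ... + v_{i-1}), by projected gradient descent."""
    # Upper bound on the largest Hessian eigenvalue (via the Frobenius norm of the
    # prediction matrix); its reciprocal is a safe step size.
    lipschitz = 2.0 * (q * T * T * Np * (Np + 1) / 2.0 + r)
    step = 1.0 / lipschitz
    V = [0.0] * Np
    for _ in range(iters):
        # Predicted tracking errors over the horizon.
        errors, x = [], x_k
        for v in V:
            x += T * v
            errors.append(x - ref)
        # Gradient: input j influences every prediction from step j onward.
        grad, tail = [0.0] * Np, 0.0
        for j in range(Np - 1, -1, -1):
            tail += errors[j]
            grad[j] = 2.0 * q * T * tail + 2.0 * r * V[j]
        # Gradient step followed by projection onto the box constraints.
        V = [min(v_max, max(v_min, v - step * g)) for v, g in zip(V, grad)]
    return V

def run_receding_horizon(x0=0.0, ref=1.0, T=0.1, Np=10, steps=60):
    """Apply only the first optimal input at each step, then re-solve."""
    x, applied = x0, []
    for _ in range(steps):
        V = solve_box_qp(x, ref, T, Np, q=1.0, r=0.01, v_min=-1.0, v_max=1.0)
        v0 = V[0]
        x = x + T * v0          # nominal plant; a real loop would re-measure here
        applied.append(v0)
    return x, applied

x_final, applied_inputs = run_receding_horizon()
```

In this sketch the applied input saturates at its upper limit while the error is large and then decays as the state approaches the reference, which illustrates how the constraint handling and the re-optimization interact.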
3.2. EKF + RBF Coupled Estimation Method
The effectiveness of the feedback linearization IBVS-MPC strategy proposed in
Section 3.1 relies on a critical premise: the estimated value of the lumped disturbance
must closely approximate the actual lumped disturbance
D. The accuracy of feedback linearization depends entirely on the effective compensation of these disturbances. If a significant discrepancy between
and
D exists, the control law fails to cancel out the nonlinear coupling and original system disturbances. Worse, it may introduce additional errors into the linearized system, causing a mismatch between the MPC optimization model and actual system dynamics. Ultimately, this leads to degraded feature point tracking accuracy and system oscillation.
In practical scenarios, however, the lumped disturbance D exhibits significant complexity and uncertainty, making it difficult to address with standard techniques. On one hand, D aggregates deterministic deviations caused by Jacobian modeling errors and external environmental disturbances, which are difficult to describe accurately using traditional analytical modeling. On the other hand, relying on a single estimation method makes it difficult to balance the dual requirements of “noise suppression” and “disturbance approximation.” The standalone Extended Kalman Filter (EKF), while effective at suppressing Gaussian measurement noise, has limited capacity to approximate complex, unmodeled disturbances. Conversely, while the Radial Basis Function (RBF) neural network possesses strong nonlinear approximation capabilities, it is highly sensitive to input quality; inputting noisy measurement signals directly often leads to parameter oscillation and divergence, rendering the estimation unstable.
To address these challenges, this section proposes a joint estimation method based on the bidirectional cooperation of the EKF and RBF neural network. The core design logic leverages the optimal estimation characteristics of the EKF to filter noisy measurement signals first, thereby achieving high-precision estimation of image features. This process provides high-quality, denoised inputs for the RBF neural network. Subsequently, the RBF network utilizes its local approximation capability to learn the lumped disturbance D online. Crucially, this disturbance estimation is fed back into the EKF’s state prediction stage to correct deviations caused by unmodeled dynamics. This establishes a closed-loop cooperative mechanism characterized by “EKF state estimation–denoised input–RBF disturbance learning–EKF prediction correction.”
The remainder of this section details the parameter update rules and the bidirectional cooperation process. This ensures the proposed method outputs high-precision state and disturbance estimates, providing reliable support for the control strategy outlined in
Section 3.1.
For the IBVS system, the EKF designed in this paper is as follows:
where
represents the Kalman gain. The estimated value of the lumped disturbance
is provided by the RBF neural network described below, which serves to correct the state prediction deviation caused by system disturbances, as shown in
Figure 1.
Although the EKF is formulated in continuous time to describe the physical dynamics, the actual implementation employs a discrete-time approximation based on the first-order forward Euler method. The state prediction step is computed with a fixed sampling interval of 10 ms, matching the 100 Hz control loop. This method was selected to minimize computational overhead while maintaining sufficient accuracy. The measurement update is triggered discretely upon the acquisition of each new image frame. The choice of sampling time is critical: the selected 10 ms interval ensures that the discretization error inherent in the Euler method remains negligible, providing stable estimation without the heavy computational load associated with higher-order integration methods.
The RBF neural network possesses adaptive approximation capabilities for high-dimensional nonlinear disturbances via its radial basis functions. In this study, the RBF network is employed to estimate the lumped disturbance
D. Its structure is defined as:
where the input vector
is obtained from the a posteriori state estimation of the EKF at the previous step.
represents the weight estimation matrix (where N denotes the number of hidden layer nodes), and
represents the radial basis function vector. Given that Gaussian functions exhibit desirable characteristics—such as smoothness, differentiability, and universal approximation capabilities [
34]—they are adopted as the activation functions in this study. The expression for the radial basis function
of the
i-th node in the hidden layer is:
where
represents the coordinate vector of the selected center, and
represents the width of the radial basis function.
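A minimal sketch of the hidden-layer computation is given below. It assumes the common Gaussian form exp(-||z - c_i||^2 / (2 b^2)) with a single shared width b; the normalization constant and the use of one width for all nodes are assumptions, and the function name is hypothetical.

```python
import math

def gaussian_rbf_vector(z, centers, width):
    """Return [h_1(z), ..., h_N(z)] with h_i(z) = exp(-||z - c_i||^2 / (2 * width^2))."""
    outputs = []
    for c in centers:
        sq_dist = sum((zi - ci) ** 2 for zi, ci in zip(z, c))
        outputs.append(math.exp(-sq_dist / (2.0 * width ** 2)))
    return outputs

# Two hypothetical 8-dimensional centres (four feature points, pixel coordinates).
centers = [
    [100.0, 100.0, 200.0, 100.0, 200.0, 200.0, 100.0, 200.0],
    [150.0, 150.0, 250.0, 150.0, 250.0, 250.0, 150.0, 250.0],
]
h = gaussian_rbf_vector(centers[0], centers, width=80.0)
```

Each output lies in (0, 1] and equals 1 only when the input coincides with the corresponding center, so the basis vector is bounded by the square root of the number of nodes.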
Based on the Lyapunov stability analysis method, the adaptive law for the RBF neural network can be derived as:
where
represents a positive definite symmetric learning rate matrix.
is a matrix introduced in the Lyapunov stability proof; its derivation is detailed in Section 4. Finally, the bidirectional cooperation result of the EKF and the RBF neural network is
.
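The bidirectional loop can be illustrated on a scalar toy problem. The sketch below assumes a single feature coordinate with dynamics x_dot = v + D, a constant true disturbance, a forward Euler prediction, and an innovation-driven weight update w_i <- w_i + gamma * h_i * innovation. This update rule is a simple stand-in chosen for illustration; it is not the adaptive law derived in this paper, and all gains, centers, and names are hypothetical.

```python
import math
import random

def rbf(z, centers, width):
    """Gaussian basis vector for a scalar input z."""
    return [math.exp(-((z - c) ** 2) / (2.0 * width ** 2)) for c in centers]

def run_coupled_estimator(steps=2000, T=0.01, D_true=0.5, noise_std=0.0, seed=0):
    rng = random.Random(seed)
    centers, width = [0.0, 0.5, 1.0], 0.5      # hypothetical centre set
    weights = [0.0, 0.0, 0.0]
    Q, R = 1e-4, 1e-2                          # process / measurement noise variances
    gamma, gain, ref = 2.0, 2.0, 0.5           # learning rate, control gain, target
    x, x_hat, P, D_hat = 0.0, 0.0, 1.0, 0.0
    for _ in range(steps):
        # RBF input is the filtered state, not the raw measurement.
        phi = rbf(x_hat, centers, width)
        D_hat = sum(w * p for w, p in zip(weights, phi))
        v = gain * (ref - x_hat) - D_hat       # control with feedforward compensation
        x = x + T * (v + D_true)               # true plant
        y = x + rng.gauss(0.0, noise_std)      # noisy measurement
        # EKF prediction (forward Euler), corrected by the RBF disturbance estimate.
        x_pred = x_hat + T * (v + D_hat)
        P_pred = P + Q
        # EKF measurement update.
        K = P_pred / (P_pred + R)
        innovation = y - x_pred
        x_hat = x_pred + K * innovation
        P = (1.0 - K) * P_pred
        # Assumed adaptive law: the innovation drives the weight update.
        weights = [w + gamma * p * innovation for w, p in zip(weights, phi)]
    return x_hat, x, D_hat
```

With `noise_std=0.0` the disturbance estimate settles at the true value; with a nonzero noise level the errors are expected to remain bounded rather than vanish, which is the qualitative behavior that the stability analysis addresses.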
Remark 1. Although this study validates the proposed method within a simulation environment, the framework is explicitly designed to mitigate the uncertainties inherent in physical visual sensors. Specifically, practical challenges such as depth estimation deviations and camera calibration inaccuracies are structurally treated as components of the system’s lumped disturbance. By learning these unknown terms online via the RBF neural network, the controller can maintain precision without relying on perfect parameter identification. Furthermore, the predictive nature of the Extended Kalman Filter combined with the receding horizon strategy of Model Predictive Control provides inherent robustness against temporary visual occlusions and sensor latency, allowing the system to maintain stable operation during short-term signal loss or processing delays.
3.3. Task-Oriented K-Means Clustering
A critical prerequisite for the accurate estimation of lumped disturbances by the RBF neural network (
Section 3.2) is the appropriate selection of network centers. As defined in
Section 2.2.1, the selection of four feature points results in an 8-dimensional input space for the RBF neural network; consequently, the network centers must also be 8-dimensional vectors. This high dimensionality presents significant challenges for center selection. The nonlinear approximation performance of RBF neural networks relies on the Universal Approximation Theorem [
35]. A key corollary of this theorem is that the approximation error bound depends on the density of center distribution within the input space
. To minimize the approximation error with fewer neurons, the centers should be concentrated and uniformly distributed across the effective variation region of the disturbances.
However, traditional center selection methods typically employ a strategy of “indiscriminate coverage” in high-dimensional space. This approach tends to generate a large number of invalid centers (e.g., corresponding to feature points outside the robot workspace or the task path), leading to low approximation efficiency. Furthermore, it is difficult for traditional methods to balance estimation accuracy with the real-time performance of the IBVS system. While offline clustering with a large dataset can improve accuracy, the excessive number of centers results in high computational delays; conversely, random sampling supports real-time performance but suffers from insufficient accuracy due to the scattered distribution of centers.
To address the core limitations of traditional RBF center selection methods in disturbance estimation—specifically the redundancy caused by indiscriminate coverage, the kinematic infeasibility of centers falling outside robot constraints, and the conflict between estimation accuracy and real-time performance—we propose a task-oriented K-means clustering method based on the nominal IBVS path. The process proceeds as follows. First, at time step
, the current image is captured via the camera, and the 8-dimensional initial image feature vector is extracted. Next, based on the IBVS-MPC control law (Equation (
14)) and assuming ideal conditions with no system disturbances (Equations (
3) and (
8)), the control input for the robot (i.e., the joint velocity) is computed. Subsequently, the joint velocity is integrated to predict the joint position at the next time step,
, as follows:
where
denotes the sampling period of the control system. The detailed process is illustrated in
Figure 2.
The predicted joint position is then substituted into the robot forward kinematics model to calculate the end-effector pose in the base coordinate system for the next time step:
where
denotes the homogeneous transformation matrix of the end-effector with respect to the base frame at time
, and
represents the robot forward kinematics function.
Through the transformation relationship between multiple coordinate systems, the posture of the camera in the world coordinate system can be solved:
where
denotes the camera pose within the world coordinate system.
represents the fixed transformation from the robot base to the world frame, while
signifies the constant transformation between the end-effector and the camera, which is determined via hand-eye calibration.
By applying the camera projection model, we can predict the nominal, disturbance-free image feature vector
at the next time step
:
where
K denotes the camera intrinsic parameter matrix, determined via Zhang’s calibration method [
36].
represents the 3D world coordinates of the feature points.
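The projection step can be sketched with a standard pinhole model. The example assumes zero skew and takes the world-to-camera rotation R and translation t directly, that is, the inverse of the camera pose in the world frame; the intrinsic values and function names are hypothetical.

```python
def project_point(K, R, t, P_world):
    """Pinhole projection: p_cam = R * P_world + t, then u = fx*X/Z + cx, v = fy*Y/Z + cy."""
    p_cam = [sum(R[i][j] * P_world[j] for j in range(3)) + t[i] for i in range(3)]
    X, Y, Z = p_cam
    if Z <= 0.0:
        raise ValueError("point is behind the camera and cannot be projected")
    fx, fy, cx, cy = K[0][0], K[1][1], K[0][2], K[1][2]
    return [fx * X / Z + cx, fy * Y / Z + cy]

def feature_vector(K, R, t, points_world):
    """Stack the pixel coordinates of all feature points into one vector."""
    s = []
    for P in points_world:
        s.extend(project_point(K, R, t, P))
    return s

K_cam = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]   # hypothetical intrinsics
R_wc = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]            # camera aligned with world
t_wc = [0.0, 0.0, 0.0]
square = [[-0.1, -0.1, 2.0], [0.1, -0.1, 2.0], [0.1, 0.1, 2.0], [-0.1, 0.1, 2.0]]
s_nominal = feature_vector(K_cam, R_wc, t_wc, square)
```

Four feature points yield the 8-dimensional feature vector used as the RBF input, and a point on the optical axis projects to the principal point.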
By iteratively executing the steps outlined above, a continuous sequence of nominal image features is generated, covering the entire trajectory from the initial pose to the target pose:
where
represents the number of iterations, i.e., the length of the planned path under ideal conditions.
Subsequently, K-means clustering is performed on the sequence S to obtain the final center set for the RBF neural network.
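The clustering step can be sketched as follows. The nominal sequence here is produced by a hypothetical stand-in that moves the features a fixed fraction toward the target at every step; in the proposed method this sequence comes from the coupled iteration of the IBVS-MPC law, forward kinematics, and camera projection. The K-means routine is plain Lloyd's algorithm with deterministic, evenly spaced initial centers, and all names and values are assumptions.

```python
def nominal_feature_sequence(s0, s_star, rate=0.1, tol=0.5, max_steps=500):
    """Stand-in roll-out: each step closes a fixed fraction of the feature error."""
    seq, s = [list(s0)], list(s0)
    for _ in range(max_steps):
        s = [si + rate * (ti - si) for si, ti in zip(s, s_star)]
        seq.append(s)
        if max(abs(ti - si) for si, ti in zip(s, s_star)) < tol:
            break
    return seq

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=50):
    """Lloyd's algorithm; initial centres are evenly spaced along the sequence."""
    n = len(points)
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= number of points")
    centres = [list(points[(i * (n - 1)) // (k - 1)]) for i in range(k)]
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centre.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centres[i]))
            clusters[nearest].append(p)
        # Update step: move each centre to the mean of its members.
        new_centres = []
        for c, members in zip(centres, clusters):
            if members:
                new_centres.append([sum(m[d] for m in members) / len(members)
                                    for d in range(len(c))])
            else:
                new_centres.append(c)   # keep the centre of an empty cluster
        if new_centres == centres:
            break
        centres = new_centres
    return centres

s0 = [100.0, 100.0, 200.0, 100.0, 200.0, 200.0, 100.0, 200.0]       # hypothetical start
s_star = [300.0, 220.0, 400.0, 220.0, 400.0, 320.0, 300.0, 320.0]   # hypothetical target
S = nominal_feature_sequence(s0, s_star)
rbf_centres = kmeans(S, k=5)
```

Because every center is the mean of points on the nominal path, the resulting center set stays within the region the task actually visits, which is the property the task-oriented selection relies on.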
The center selection strategy proposed in this paper generates a nominal trajectory feature sequence through the coupled iteration of the IBVS-MPC control law, robot forward kinematics, and the camera projection model. This approach achieves task-oriented, compact coverage of key disturbance regions: all feature points lie on the complete task path, from the initial pose to the target pose, while satisfying camera constraints. Consequently, it avoids the spatial redundancy typical of traditional random sampling or unconstrained interpolation. Furthermore, the strategy ensures the kinematic reachability of each feature point through joint velocity solving and limit checking, thereby eliminating estimation blind spots caused by physically infeasible centers. By applying K-means clustering to these trajectory features, the method enhances approximation accuracy while maintaining a sparse and efficient set of RBF centers. This balances RBF estimation accuracy with the real-time requirements of the Visual Predictive Control system, laying a solid foundation for the practical application of coupled state-disturbance estimation.
Remark 2. It is worth noting that the performance of the proposed strategy depends on a trade-off between approximation accuracy and computational efficiency regarding the number of RBF centers. An insufficient number of centers may fail to capture the spatial complexity of the lumped disturbance (under-fitting), whereas an excessive number increases the computational load of the neural network, potentially compromising the real-time performance of the control loop. In this study, the number of centers was empirically determined to ensure sufficient disturbance approximation while strictly satisfying the control frequency requirements.
4. Stability Proof of EKF-RBF Coupled Estimation
The previous sections have completed the design of the feedback linearization IBVS-MPC control law, the EKF-RBF coupled state-disturbance estimation method, and the RBF center optimization method based on the IBVS ideal path, forming an integrated technical framework of “control–estimation–optimization”. To ensure the reliability and convergence of the proposed method at the theoretical level and avoid system oscillation or even instability caused by the EKF-RBF coupled state-disturbance estimation mechanism, this section conducts a rigorous analysis of the stability of the system based on Lyapunov stability theory. The core proof goal is to clarify the Uniformly Ultimately Bounded (UUB) stability conditions of the system state error (image feature observation error) and disturbance estimation error, verify the coupled compatibility between the noise suppression capability of EKF, the disturbance approximation performance of RBF, and the constraint optimization characteristics of MPC, and provide a solid theoretical basis for subsequent experimental verification and engineering implementation.
First, we make the following assumptions:
Assumption 1. The system state x and state estimation are bounded, i.e., , where E is a compact set (a compact set in is a bounded closed set).
Assumption 2. The system lumped disturbance D is bounded, i.e., ; the measurement noise is bounded, i.e., and has finite variance.
Assumption 3. The ideal weight of RBF exists and satisfies the following boundedness condition: so that the system lumped disturbance can be expressed as: where represents the approximation error, and the approximation error is bounded, i.e., .
Assumption 4. The radial basis function is bounded, i.e., .
Assumption 5. The radial basis function is Lipschitz continuous on the compact set E, i.e., there exists a constant such that:
Assumption 6. The Kalman gain is bounded, i.e., .
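For Gaussian basis functions, the boundedness and Lipschitz properties in Assumptions 4 and 5 can be checked directly. The block below is a sketch in assumed notation (centers c_i, widths \sigma_i, and m basis functions stacked in \Phi); these symbols are introduced here for illustration and may differ from the paper's own.

```latex
% Assumed notation (not from the source): Gaussian basis functions with
% centers c_i and widths \sigma_i, i = 1, ..., m, stacked in \Phi(x).
\phi_i(x) = \exp\!\left(-\frac{\lVert x - c_i\rVert^{2}}{2\sigma_i^{2}}\right),
\qquad 0 < \phi_i(x) \le 1
\;\Longrightarrow\;
\lVert \Phi(x) \rVert \le \sqrt{m}.

% The gradient norm peaks at \lVert x - c_i \rVert = \sigma_i, which yields
% a Lipschitz constant for each basis function and hence for \Phi:
\lVert \nabla \phi_i(x) \rVert
  = \frac{\lVert x - c_i\rVert}{\sigma_i^{2}}\,\phi_i(x)
  \le \frac{e^{-1/2}}{\sigma_i},
\qquad
\lVert \Phi(x) - \Phi(y) \rVert
  \le \Bigl(\sum_{i=1}^{m} \frac{e^{-1}}{\sigma_i^{2}}\Bigr)^{1/2}
      \lVert x - y \rVert .
```

Both bounds hold for all x and y, so they hold in particular on the compact set E.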
Remark 3. The assumptions are grounded in practical system constraints. Assumptions A1 and A2 hold because the finite workspace of the UR5 manipulator and the fixed camera resolution inherently bound the system states, while actuator torque limits restrict the magnitude of lumped disturbances [31]. Assumptions A3–A5 follow standard RBF properties: the Universal Approximation Theorem [35] guarantees the existence of ideal weights, and Gaussian basis functions ensure Lipschitz continuity. Finally, Assumption A6 aligns with EKF stochastic stability theory [37], which ensures bounded Kalman gains under the condition of uniform observability. Based on these assumptions, the stability of the proposed system is summarized in the following theorem:
Theorem 1. Consider the IBVS system described by Equation (13) with the control law (14) and the coupled estimator Equations (23) and (24). Under Assumptions A1–A6, the state estimation error and disturbance estimation error are Uniformly Ultimately Bounded (UUB).
Proof of Theorem 1. We define the state error:
By substituting Equations (14) and (23) into Equation (35) and differentiating with respect to time, it can be obtained that:
We use the first-order Taylor linearization [38] of at :
where is the Jacobian matrix of , and is a higher-order term. By assuming that the higher-order term can be ignored, the following relationship is found to exist:
By substituting Equation (38) into Equation (36), it can be obtained that:
Then, we define the disturbance observation error:
By combining the expressions of the system lumped disturbance, Equations (24) and (33), it can be obtained that:
We define the weight error:
Through derivation, Equation (41) can be written as:
By combining the definition (40) and substituting Equation (43) into Equation (39), it can be obtained that:
where , which contains all uncertain terms. Through Assumptions A2, A3, A5, and A6, we know that:
where , . We design the following form of Lyapunov function V:
By differentiating V with respect to time, it can be obtained that:
Through the RBF neural network weight update law (26) and Equation (42), we can obtain:
By substituting Equations (44) and (48) into the derivative expression of the Lyapunov function (47) and through derivation, it can be obtained that:
By reasonably selecting the parameters of EKF, we can ensure that is Hurwitz, then the following relationship exists:
where is a positive definite symmetric matrix. Therefore, Equation (49) can be written as:
The following inequality relationship exists:
By combining Equations (45) and (52), it is known that:
where , . When , . Therefore, there exists a spherical domain . When , V decreases until x enters , and the radius of the spherical domain is only related to system parameters, not to the initial error. Therefore, is Uniformly Ultimately Bounded (UUB) [39], and the final bound is .
Since V is positive definite, the following relationship exists:
When , , V has an upper bound. Let us define:
By combining Equation (54), the following relationship exists:
therefore:
That is, is also UUB. Through the expression of Equation (43), combined with Assumptions A3–A5, we can derive the following relationship:
Since and on the right side of Equation (58) are both UUB, is UUB.
In summary, we have proved, based on Lyapunov stability theory, that the system state observation error, the RBF weight error, and the system lumped disturbance estimation error are Uniformly Ultimately Bounded (UUB). This provides theoretical support for the stability of the proposed method and a theoretical foundation for the subsequent simulation verification. □
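The step from a Lyapunov derivative bound to an ultimate bound that is independent of the initial error follows a standard pattern, sketched below. The constants a and b and the error symbol e are placeholders introduced for this illustration. They are not the constants of the proof above, whose exact expressions depend on the EKF and RBF parameters.

```latex
% Generic sketch; a > 0 and b >= 0 are placeholder constants and e is a
% generic error vector (assumptions for illustration only).
\dot V \;\le\; -a\,\lVert e\rVert^{2} + b\,\lVert e\rVert
       \;=\; -\lVert e\rVert\,\bigl(a\,\lVert e\rVert - b\bigr)
\quad\Longrightarrow\quad
\dot V < 0 \ \ \text{whenever}\ \ \lVert e\rVert > \frac{b}{a}.
```

Outside the ball of radius b/a the function V strictly decreases, so the error is driven toward that ball from any initial condition, and the radius b/a is fixed by the constants alone.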
Remark 4. It is important to address the system behavior during the initial transient phase, when the RBF estimator has not yet converged. During this period, the estimation error acts as a bounded uncertainty on the linearized system. The robustness of the proposed strategy in this phase is guaranteed by two factors. First, as proven in Theorem 1, the errors are Uniformly Ultimately Bounded (UUB) regardless of the initial state, ensuring that the system state does not diverge even before convergence. Second, the MPC framework explicitly incorporates image visibility constraints (Equation (17)) into the optimization problem. Unlike classical feedback linearization, which might generate aggressive control inputs based on an inaccurate model, the MPC solver seeks a feasible control sequence that strictly satisfies the feature constraints. Consequently, while tracking accuracy may be temporarily lower during the first few iterations of the transient phase, the system explicitly prevents the loss of visual features, ensuring task safety until the RBF estimator converges to the true disturbance.
5. Experiments
To accurately quantify the performance gains of the two core innovations proposed in this paper—RBF-EKF coupled state-disturbance estimation and task-oriented K-means clustering for center selection—this section presents four groups of controlled experiments designed under the “single-variable control” principle. The simulation environment was implemented in MATLAB (The MathWorks, Inc., Natick, MA, USA) R2021b on a computer equipped with an Intel Core i9-13900HX CPU and 64 GB of RAM. The experiments utilize a UR5 manipulator (Universal Robots, Odense, Denmark) (official D-H parameters; joint limits: ; velocity limits: ) to evaluate two key metrics: image feature tracking accuracy and lumped disturbance estimation precision.
The consistency of core parameters was strictly maintained to ensure a fair comparison: Visual Predictive Control parameters were set as , , . The EKF used process noise covariance and measurement noise covariance . The vision system utilized a pinhole model ( ) tracking an 8-dimensional feature vector. To simulate the measurement setup, Gaussian noise with a standard deviation of 0.1 was added to the image features. These noisy measurements are processed directly by the EKF to generate filtered state estimates, which are then used by the RBF network for disturbance learning and by the MPC for trajectory planning. Lumped disturbances, including end-effector load variations and Jacobian modeling errors, were introduced as:
5.1. Evaluation of Coupled Estimation and RBF Center Selection Strategy
This subsection evaluates the performance of the proposed estimation mechanism. The experimental groups are defined as follows:
Group 1: Uncoupled Benchmark (Uncoupled-TOC). This group employs a serial “EKF-filtering followed by RBF-observation” structure. RBF centers (20 in total) are selected using the task-oriented K-means clustering proposed in this paper.
Group 2: Random Center Control (Coupled-Random). This group utilizes the RBF-EKF coupled estimation mechanism, but with 20 centers randomly distributed across the 8-dimensional image space ().
Group 3: Global Clustering Control (Coupled-Global). This group uses the coupled mechanism with centers derived from global K-means clustering on a dataset of 2,000 points sampled from the entire image space, rather than the task trajectory.
Group 4: Proposed Method (Coupled-TOC). This group integrates both the RBF-EKF coupled mechanism and the task-oriented K-means clustering (using the same 20 centers as Group 1).
By comparing Group 1 and Group 4, the superiority of the coupled feedback estimation over the serial structure is verified. By comparing Groups 2, 3, and 4, the advantages of task-oriented RBF center selection—specifically in reducing computational redundancy and improving approximation accuracy—are independently quantified.
Through four groups of control experiments, the time-series data and statistical metrics for image feature tracking errors and disturbance observation errors were obtained. Accordingly, the image feature trajectories, disturbance observation error curves, and error bar charts (including variance) were generated.
Figure 3 illustrates the motion trajectories of the image features in the pixel plane for the four experimental groups. The ‘dot’, ‘star’, and ‘cross’ markers denote the initial, desired, and final positions of the feature points, respectively, and the zoomed-in insets in the top-right corners show the terminal convergence details. The trajectory of the Coupled-TOC group (the core method of this paper) is noticeably smoother throughout the experiment: from the initial point to the desired point, it converges steadily without obvious fluctuations or oscillations. In contrast, the trajectories of the other three groups show varying degrees of instability. This difference stems from the error correction effect of the coupled state-disturbance estimation mechanism. Group 4 (Coupled-TOC) corrects the EKF prediction deviation through feedback of the RBF disturbance estimate, which offsets the coupled interference of noise and disturbance. The one-way structure of Group 1 (Uncoupled-TOC) cannot use the disturbance observation result to improve the state estimate, so it resists disturbances poorly and exhibits small-scale oscillations. Group 2 (Coupled-Random) shows severe trajectory oscillations and significant deviations in control output because the randomly selected RBF centers cannot effectively approximate task-related disturbances. The trajectory of Group 3 (Coupled-Global) is relatively stable, but the redundancy of the global clustering centers increases the computational burden.
The smooth trajectory of the Coupled-TOC group validates that the task-oriented K-means clustering ensures the RBF network focuses on the feature space most relevant to the mission, thereby maximizing approximation accuracy with minimal centers. Combined with the RBF-EKF coupled mechanism, the system maintains robust Visual Predictive Control performance even under complex lumped disturbances, fulfilling the requirements for high-precision robotic tasks.
Figure 4a–h display the time-series tracking performance across all eight dimensions of the lumped disturbance vector . In each subplot, the blue solid line represents the ground truth (‘Real’) disturbance, which is characterized by sinusoidal fluctuations. Comparing the estimation curves of the different groups against this ground truth reveals the tracking capability of each method.
Figure 4i is a bar chart of the mean and variance of disturbance observation errors, quantifying the accuracy and stability of disturbance estimation.
In terms of disturbance estimation capability, Group 1 (Uncoupled-TOC) exhibits a maximum peak error of 62.41 with violent fluctuations. The uncoupled EKF lacks a disturbance compensation term, which leads to delayed responses, noisy RBF inputs, and significant state deviations that together degrade estimation accuracy. The curve for Group 2 (Coupled-Random) remains nearly flat with high error levels throughout the experiment. This occurs because the local approximation performance of the RBF network depends on the spatial alignment between the centers and the input data: randomly selected centers are not effectively activated by the task-specific trajectory, leaving the network under-activated and unable to capture the time-varying characteristics of the lumped disturbances. Group 3 (Coupled-Global) shows a relatively stable curve, but its convergence lags 1–2 s behind Group 4 (Coupled-TOC). This delay is inherent to global clustering: the network integrates responses from a vast number of centers across the entire image space rather than focusing on the core feature regions relevant to the task. Consequently, the disturbance estimates are over-smoothed, making the network insensitive to rapid transients. Furthermore, the redundant centers increase computational complexity, and the weight update process is hampered by irrelevant features. In contrast, Group 4 (Coupled-TOC) responds fastest. Through the RBF-EKF coupled mechanism, the EKF offsets disturbances in the state-space model in real time. Simultaneously, the task-oriented K-means clustering keeps the centers closely aligned with the actual trajectory, enabling the network to track time-varying disturbances closely and without spurious oscillations.
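The under-activation of randomly placed centers can be illustrated numerically. The sketch below is a toy setup under stated assumptions, not a reproduction of the experiment: the straight-line trajectory, the sampling range of 0 to 640 for random centers, and the common width of 30 are all hypothetical values chosen for illustration.

```python
import numpy as np

def rbf_activations(x, centers, width):
    """Gaussian RBF activations, shape (num_samples, num_centers)."""
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

rng = np.random.default_rng(0)

# Hypothetical 8-D feature trajectory (assumption): a straight line
# between two made-up feature configurations.
s0 = np.full(8, 100.0)
s_star = np.full(8, 300.0)
t = np.linspace(0.0, 1.0, 200)[:, None]
trajectory = s0 + t * (s_star - s0)

# 20 centers taken along the trajectory versus 20 centers drawn uniformly
# from a hypothetical image range [0, 640] in every dimension.
aligned_centers = trajectory[::10]
random_centers = rng.uniform(0.0, 640.0, size=(20, 8))
width = 30.0  # hypothetical common RBF width

# Strongest activation seen by each trajectory sample in each network.
peak_aligned = rbf_activations(trajectory, aligned_centers, width).max(axis=1)
peak_random = rbf_activations(trajectory, random_centers, width).max(axis=1)

print("mean peak activation, aligned centers:", peak_aligned.mean())
print("mean peak activation, random centers: ", peak_random.mean())
```

In eight dimensions a uniformly drawn center is typically far from a one-dimensional path, so its Gaussian response along the path is close to zero and the weight update has little signal to work with.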
To quantitatively evaluate the performance, the statistical metrics regarding disturbance observation error (Mean and Variance) and the average execution time are summarized in Table 2.
As presented in Table 2, the proposed Group 4 (Coupled-TOC) consistently achieves the lowest mean error and variance, outperforming the other three strategies. Comparison with the Uncoupled benchmark (Group 1) shows that the proposed method delivers a marked reduction in estimation error, confirming the effectiveness of the coupled state-disturbance estimation mechanism. Regarding computational requirements, the average execution time of the proposed method is 2.87 ms. Although slightly higher than that of the Uncoupled strategy because of the feedback mechanism, it occupies less than 30% of the system’s sampling period of 10 ms (100 Hz). This confirms that the proposed task-oriented K-means clustering keeps the network small enough to balance high-precision estimation with the real-time performance required for engineering applicability.
5.2. Comparative Analysis of Control Strategies
To validate the proposed control framework against established methods, this section presents a comparative study of four distinct control strategies. The objective is to decouple and quantify the contributions of the MPC constraints and the RBF disturbance compensation to the final tracking performance. The four experimental groups are defined as follows:
Group 1: Classical IBVS Control (IBVS). The standard image-based visual servoing controller using a proportional control law without physical constraints or disturbance compensation.
Group 2: Compensated IBVS Control (IBVS + Comp.). The classical IBVS controller augmented with the proposed RBF-EKF disturbance feedforward compensation.
Group 3: Standard Model Predictive Control (MPC). The model predictive controller that handles constraints but relies on the nominal model without the RBF-estimated disturbance compensation term ().
Group 4: Proposed Method (MPC + Comp.). The complete framework proposed in this paper, integrating both the constrained MPC optimization and the RBF-EKF coupled disturbance compensation.
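For reference, the proportional law used by the Group 1 baseline can be sketched with the textbook point-feature interaction matrix. This is a minimal illustration of classical IBVS in normalized image coordinates, not the paper's code: the feature values, the unit depths, and the gain of 0.5 are hypothetical.

```python
import numpy as np

def interaction_matrix(points, depths):
    """Stack the 2x6 point-feature interaction matrices (normalized coords)."""
    rows = []
    for (x, y), Z in zip(points, depths):
        rows.append([-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x * x), y])
        rows.append([0.0, -1.0 / Z, y / Z, 1.0 + y * y, -x * y, -x])
    return np.array(rows)

def ibvs_velocity(s, s_star, depths, gain=0.5):
    """Classical proportional IBVS law: v = -gain * pinv(L) @ e."""
    e = (s - s_star).reshape(-1)
    L = interaction_matrix(s, depths)
    return -gain * np.linalg.pinv(L) @ e

# Hypothetical current and desired features for four points (assumption).
s = np.array([[-0.12, -0.10], [0.10, -0.11], [0.11, 0.09], [-0.09, 0.12]])
s_star = np.array([[-0.1, -0.1], [0.1, -0.1], [0.1, 0.1], [-0.1, 0.1]])
depths = np.full(4, 1.0)  # hypothetical depth estimate for every point

v = ibvs_velocity(s, s_star, depths)      # 6-D camera velocity command
e = (s - s_star).reshape(-1)              # 8-D feature error
e_dot = interaction_matrix(s, depths) @ v  # predicted error rate
print("camera velocity:", v)
```

The law contains no constraint handling and no disturbance term, which is why the MPC and compensation variants in Groups 2 to 4 are compared against it.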
To rigorously test robustness, a time-varying sinusoidal lumped disturbance (consistent with Equation (59)) was introduced to the system.
Figure 5 illustrates the error convergence trajectories and the statistical performance metrics for the four groups.
As shown in Figure 5a, the Classical IBVS (blue line) exhibits the slowest convergence rate. More critically, due to the lack of constraint handling and disturbance rejection capability, it suffers from significant oscillations (visible around iteration 700) when the external disturbance intensifies, failing to achieve a stable steady state. The IBVS + Comp. strategy (orange line) improves convergence speed and stability by compensating for the disturbance, yet it is still limited by the fixed gain of the proportional controller.
The Standard MPC (yellow line) outperforms the IBVS groups in terms of convergence speed due to its receding horizon optimization. However, because it optimizes based on a nominal model that ignores the lumped disturbance D, model mismatch occurs, leading to a steady-state offset.
Finally, the Proposed Method (purple line) demonstrates the best performance. By explicitly incorporating the estimated disturbance into the feedback linearization loop (Equation (14)), it recovers the linear system dynamics, allowing the MPC to generate optimal control inputs that remain accurate even under heavy disturbances.
The quantitative results in Figure 5b further confirm this analysis. The proposed method achieves the lowest mean tracking error (171.18), representing a 46.6% reduction compared to Classical IBVS (321.03) and a 23.5% reduction compared to Standard MPC (223.96). These results empirically validate that the integration of predictive constraint handling and active disturbance compensation is essential for high-precision visual servoing.
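The reported reductions can be recomputed from the mean tracking errors quoted above. The helper name `relative_reduction` is introduced here for illustration. The computed values are about 46.68% and 23.57%, which match the reported 46.6% and 23.5% when truncated to one decimal place.

```python
def relative_reduction(baseline, proposed):
    """Percentage reduction of `proposed` relative to `baseline`."""
    return 100.0 * (baseline - proposed) / baseline

# Mean tracking errors as quoted in the text for Figure 5b.
mean_error = {
    "Classical IBVS": 321.03,
    "Standard MPC": 223.96,
    "Proposed": 171.18,
}

reductions = {
    name: relative_reduction(value, mean_error["Proposed"])
    for name, value in mean_error.items()
    if name != "Proposed"
}

for name, pct in reductions.items():
    print(f"Reduction vs {name}: {pct:.2f}%")
```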