1. Introduction
In recent years, with the rapid development of vision technology, visual servoing (VS) has been widely applied in many robotics fields, such as unmanned aerial vehicles (UAVs) [1], soft robots [2], mobile robots [3], and industrial robots [4]. Depending on how the visual error is defined and regulated, visual servoing is typically divided into position-based visual servoing (PBVS) [5], which uses pose-level feedback; image-based visual servoing (IBVS) [6], which directly regulates image features; and hybrid approaches that blend the two [7]. Visual servoing is usually employed in high-precision control scenarios that demand real-time performance. Compared with PBVS, IBVS directly uses image features as feedback, so its control accuracy is less sensitive to camera calibration errors; therefore, IBVS is considered in this paper [8].
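For concreteness, the classical IBVS law that underlies this setting computes a camera twist from the stacked feature error through the pseudoinverse of the interaction matrix. The following minimal Python sketch illustrates this baseline (not the proposed controller of this paper); the point-feature interaction matrix follows the standard model, and all numerical values are illustrative.

```python
import numpy as np

def interaction_matrix(x, y, Z):
    """Interaction matrix of a normalized image point (x, y) at depth Z
    (classical point-feature model from the IBVS literature)."""
    return np.array([
        [-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x * x), y],
        [0.0, -1.0 / Z, y / Z, 1.0 + y * y, -x * y, -x],
    ])

def ibvs_velocity(features, desired, depths, lam=0.5):
    """Classical IBVS law: v = -lam * pinv(L) @ e, with e = s - s*."""
    e = (features - desired).reshape(-1)              # stacked 2n error vector
    L = np.vstack([interaction_matrix(x, y, Z)
                   for (x, y), Z in zip(features, depths)])
    return -lam * np.linalg.pinv(L) @ e               # 6-vector camera twist

# Example: four coplanar points at depth 0.5 m (illustrative values)
s = np.array([[0.1, 0.1], [-0.1, 0.1], [-0.1, -0.1], [0.1, -0.1]])
s_star = 0.8 * s
v = ibvs_velocity(s, s_star, depths=[0.5] * 4)
```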
Due to the limited field of view (FOV) of the visual sensor, the VS system can easily fail once the visual features leave the camera's FOV. Therefore, the FOV constraint is a key consideration in visual servoing control [9]. A number of studies have addressed FOV constraints in VS. In [10], a VS controller was designed using model predictive control (MPC), which accounts for the input and output constraints of the system and thus satisfies the camera's FOV constraint. A novel MPC scheme enhancing visual servo performance is proposed in [11], which maintains the original controller's speed without attenuation. To ensure that the FOV constraints are satisfied, [12] uses hybrid dynamic movement primitives (DMPs) to define a safe motion region and to model the implicit constraints by learning from the task and manual demonstrations, leading to a closed-loop controller obtained by solving an optimization problem rather than by a direct analytical servo law. In addition, some implementations address the camera's FOV constraints by means of path planning [13,14]. While these approaches have demonstrated effectiveness in enforcing FOV-related constraints, deploying predictive optimization or learning-based schemes on real robots often requires additional computational resources and careful tuning (e.g., solver configuration, horizon selection, or training/validation procedures). Moreover, under fast visual servoing, practical systems may still suffer from feature degradation or intermittent feature loss due to perception limitations, which motivates constraint-handling designs that are solver-free, straightforward to implement, and compatible with real-time IBVS pipelines.
In [15,16], prescribed performance control (PPC) was proposed for visual servoing, which can constrain the FOV in real time with low computational requirements. In [17], a time-varying performance specification is enforced on the image error and integrated with an asymmetric barrier Lyapunov function (BLF) to achieve custom convergence accuracy, driving the image error to a predetermined range. The IBVS controller in [18] was designed based on a performance function with camera parameters, combined with a tan-type BLF; the convergence accuracy of the image can be customized, and the camera's FOV constraints are satisfied. In the above studies on PPC, a time-varying performance function was used in which the tracking error converges to a custom range only as $t \to \infty$. This guarantees merely that the error eventually converges to the specified range; the exact convergence time cannot be determined. Moreover, the resulting controllers render the IBVS system only asymptotically stable.
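For reference, a standard exponential performance function from the PPC literature (notation ours) is
$$\rho(t) = (\rho_0 - \rho_\infty)e^{-lt} + \rho_\infty, \qquad \rho_0 > \rho_\infty > 0, \; l > 0,$$
which only approaches its terminal level $\rho_\infty$ as $t \to \infty$, so no finite settling instant can be prescribed.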
Moreover, prescribed-time behavior can also be enforced via a prescribed-time performance function, as demonstrated in studies on fixed-time control and prescribed-time consensus of multi-agent systems [19,20]. Compared with the time-varying performance functions of [15,16,17,18], the prescribed-time performance function [21] allows the tracking error to converge to a custom range at a prescribed time. A prescribed-time performance function and a log-type BLF are introduced in [22] to limit the tracking error, implementing a symmetric output constraint that allows the error to converge to a custom range within a predetermined time. In [23], a prescribed-time performance boundary function and a tan-type transformation function are introduced to limit the tracking error of each subsystem, making the error converge to a threshold value within a predefined time. In the above studies, the parameters of the prescribed-time performance function were set manually based on experience, e.g., the prescribed time $T$, the initial value of the performance function, etc. Moreover, the prescribed-time performance function has not been applied to visual servoing, and the above studies use symmetric output constraints that do not satisfy the asymmetric constraint requirements of VS, i.e., the camera's FOV constraints.
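One representative prescribed-time envelope (notation ours; the concrete forms in [21,22,23] differ in detail) is
$$\rho(t) = \begin{cases} (\rho_0 - \rho_T)\left(\dfrac{T - t}{T}\right)^{h} + \rho_T, & 0 \le t < T, \\ \rho_T, & t \ge T, \end{cases} \qquad h > 1,$$
which reaches its terminal bound $\rho_T$ exactly at the prescribed time $T$, independently of the initial error.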
In numerous applications, finite-time stabilization is desirable to comply with stringent performance requirements [24]. A fractional-order sliding-mode VS scheme with online adaptation of the sliding-surface parameters is reported in [25] to achieve finite-time convergence. However, in conventional finite-time control, the convergence-time bound typically depends on the initial conditions of the closed-loop system. In contrast, fixed-time control ensures fast convergence with a convergence time that is independent of the initial state. In UAV visual servoing [26], tracking differentiators and fixed-time visual servo controllers are designed to estimate the image feature derivatives in fixed time, and the tracking error also converges to a neighborhood of zero in fixed time; the high-performance controller, designed using the performance function and fixed-time stabilization theory, allows the system to stabilize in fixed time while satisfying the FOV constraints. In the visual formation control of mobile robots [27], a performance function was used to constrain the formation error so as to satisfy the camera's FOV constraints imposed by the viewing-distance limitation; combined with fixed-time stability theory, the formation tracking error converges to a neighborhood of zero in fixed time.
Motivated by the above results, a novel control method combining the prescribed-time performance function with fixed-time control theory is proposed for the IBVS system. Firstly, this paper introduces prescribed-time control into the VS domain to ensure that the image error converges to a predefined range within a prescribed time. Secondly, asymmetric time-varying output constraints are achieved by designing a specific prescribed-time performance function and combining it with an asymmetric BLF in the controller design, which satisfies the camera's FOV constraints. Combining fixed-time and prescribed-time control significantly improves the convergence speed of the image errors. Finally, the advantages and effectiveness are verified through theoretical analyses and experiments. The main contributions of this paper are as follows:
This paper introduces prescribed-time control into visual servoing. Combined with an asymmetric BLF, it ensures the image error converges to a predetermined range within a prescribed time while satisfying the camera’s FOV constraint.
Compared with traditional prescribed-time performance functions, whose parameters rely on empirical tuning [21,22,23], this paper integrates the prescribed-time performance function with fixed-time stability theory. This integration ensures that the prescribed-time parameter is systematically determined by the control system parameters, enabling adaptive adjustment without manual redesign when conditions change. Furthermore, the proposed control strategy, incorporating both fixed-time and prescribed-time control methods, ensures that the image tracking error not only converges to a predefined range within a prescribed time but also further approaches zero in a fixed time, significantly accelerating convergence.
Distinguished from our prior work [28], this study achieves dual improvements in the control methodology: a universal time-varying barrier Lyapunov function is integrated with fixed-time stability theory to reconstruct the control framework. The proposed approach completely eliminates dependence on the system's initial conditions, uniformly accommodates both constrained and unconstrained systems, and effectively circumvents singularity risks.
Beyond laboratory algorithmic verification, practical validation was conducted in an engineering scenario: a case study involving the bolt alignment process of Overhead Contact System (OCS) components demonstrates the applicability of the proposed method in practice. This is a critical step in advancing visual servoing control from theoretical investigation to practical application.
The remainder of this paper is structured as follows.
Section 2 states the problem and outlines the visual servoing setup.
Section 3 presents the controller design.
Section 4 reports comparative studies and application-oriented experiments.
Section 5 concludes this paper.
4. Experimental Results
In this section, we conduct three sets of experiments to verify the effectiveness of the proposed control method under FOV constraints, the superiority of its control performance, and its applicability to real-world tasks. An HD video of the experimental demonstrations is available at https://youtu.be/RnZMddgVcoA (accessed on 11 December 2025).
4.1. Experimental Setup
The experimental platform (Figure 1) comprises a 6-DoF UR5 collaborative manipulator and an Intel RealSense D435i camera. A stationary AprilTag (family 36h11) is used as the visual target, providing four corner features that form a square. These corners are selected as the image features, and their real-time extraction is implemented using ViSP, an open-source visual servoing library developed by the IRISA-Inria Rainbow team [32].
Case 1: FOV Constraint Experiment. This case evaluates the proposed controller against a conventional IBVS scheme. The image feature trajectories are reported to assess whether the features remain within the admissible region, thereby validating the effectiveness of the imposed FOV constraints.
Case 2: Comparison Experiment. The proposed control method is compared with several existing control methods to verify its advantages.
Case 3: Deployment in the Real-World Environment. A bolt alignment task for OCS components was performed to verify the applicability of our method in practice.
Timing and latency: The UR control loop is event-triggered by the arrival of each processed vision update after image acquisition and feature extraction. Thus, the controller update frequency is synchronized with the vision pipeline and equals the achieved vision update rate. The visual processing latency is task-dependent (e.g., ViSP-based AprilTag corner extraction versus YOLOv7-based bolt corner detection). In our experiments, the achieved closed-loop rates are 30 fps/30 Hz for the AprilTag setup and approximately 10 fps/10 Hz for the bolt setup; these effective rates are reported after visual processing and therefore reflect the practical perception and I/O latency in the closed loop.
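As an illustration of this event-triggered synchronization, the following Python sketch couples one controller update to each processed vision frame; the camera, detector, and robot interfaces are hypothetical placeholders, not the actual drivers used in our experiments.

```python
import time
import numpy as np

def servo_loop(camera, detect_features, controller, robot, s_star, timeout=30.0):
    """Event-triggered servo loop: exactly one control update per processed
    vision frame, so the control rate equals the achieved vision rate.
    camera.read(), detect_features(), robot.speed() are hypothetical interfaces."""
    t0 = time.time()
    while time.time() - t0 < timeout:
        frame = camera.read()              # blocks until a new image is available
        s, Z = detect_features(frame)      # corner pixels and depths (may fail)
        if s is None:
            robot.speed(np.zeros(6))       # missed detection: command zero twist
            continue
        v = controller(s, s_star, Z)       # camera-frame twist from the servo law
        robot.speed(v)
```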
4.2. FOV Constraint Experiment
In Case 1, a comparative study is conducted to validate the proposed FOV-constrained controller against the conventional IBVS scheme [8]. An Intel RealSense D435i depth camera is used with an image resolution of 640 × 480 pixels. The initial and desired camera-to-target distances are fixed in advance, and the feature depth used in the interaction matrix is measured online by the Intel RealSense D435i. The number of image feature points is four, i.e., $n = 4$, where $n$ signifies the total number of image feature points. The parameters of the control law in Section 3.1, the gains in (22) and (23), and the performance function parameters in (13) are selected following the guideline below.
Parameter selection guideline: The performance function in (12) specifies a time-varying envelope that decreases from its initial value over $[0, T)$ and remains at its terminal value for $t \ge T$; it is used to constrain the evolution of the image feature errors under the field-of-view (FOV) limits. The parameters in (12) and (13) are selected according to the following reproducible procedure.
(1) Initial bounds (feasibility at $t = 0$). To ensure that the prescribed constraints are satisfied from the initial time, the initial bounds are computed from the admissible image region and the desired feature locations. Specifically, for the $i$-th feature point, the initial lower and upper bounds are set to the pixel distances between the desired image-plane coordinates and the lower and upper boundaries of the admissible region, respectively, where these boundaries (in pixels) are determined by the camera resolution (and the predefined admissible region under the FOV constraint).
(2) Terminal bounds. The terminal parameters specify the steady-state accuracy bounds of the corresponding image errors. Smaller values impose tighter steady-state accuracy but increase sensitivity to measurement noise and feature jitter; thus, the terminal bounds should be selected according to the desired final precision and sensing quality.
(3) Prescribed time $T$. The prescribed time $T$ determines when the envelope reaches its terminal value. In this work, $T$ is chosen using the fixed-time upper bound derived in the stability analysis, which can be computed directly from the controller gains and the number of feature points $n$ (with $n = 4$ in our experiments).
(4) Shape parameters. The remaining shape parameters, including the exponent $q$, determine how fast the envelope shrinks over $[0, T)$. From (13), increasing these parameters generally leads to a more aggressive contraction (a larger decay rate) over part of the interval, which can accelerate transients but may require larger control effort and amplify noise effects. Therefore, the shape parameters can be tuned to balance convergence speed and robustness while preserving the prescribed-time property.
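The guideline above can be scripted. The following Python sketch computes the initial bounds from the admissible image region (step (1)) and evaluates a representative prescribed-time envelope; the envelope form, the margin, and all numerical values are assumptions for illustration and need not coincide with the exact expression in (13).

```python
import numpy as np

def initial_bounds(desired_px, img_lo, img_hi, margin=10.0):
    """Per-feature initial lower/upper error bounds (pixels) so the envelope
    is feasible at t = 0: distance from the desired coordinate to each edge
    of the admissible region (step (1) of the guideline). margin is hypothetical."""
    lo = desired_px - (img_lo + margin)   # room toward the lower image edge
    hi = (img_hi - margin) - desired_px   # room toward the upper image edge
    return lo, hi

def pt_envelope(t, rho0, rhoT, T, h=2.0):
    """One representative prescribed-time envelope (an assumption; the paper's
    Eq. (13) may differ): decays from rho0 at t = 0 to rhoT exactly at t = T."""
    if t >= T:
        return rhoT
    return (rho0 - rhoT) * ((T - t) / T) ** h + rhoT

# Steps (1)-(3): 640x480 image, desired corner at (400, 300), terminal bound 3 px,
# prescribed time T taken as a fixed-time bound computed from the controller gains.
lo, hi = initial_bounds(np.array([400.0, 300.0]), np.array([0.0, 0.0]),
                        np.array([640.0, 480.0]))
T = 5.0   # e.g., the fixed-time upper bound from the stability analysis
rho = [pt_envelope(t, rho0=float(max(lo.max(), hi.max())), rhoT=3.0, T=T, h=2.0)
       for t in np.linspace(0.0, 1.5 * T, 7)]
```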
A 36h11 AprilTag is employed as the visual target; the corresponding desired feature coordinates and the initial pixel coordinates of the detected features are shown as the red and green crosses in Figure 3, respectively.
Figure 3 depicts the motion trajectories of the four image feature points within the camera's FOV, illustrating the movement of the feature points from their initial positions (green crosses) to their target positions (red crosses). Figure 3a shows the trajectories produced by the traditional control method [8], whereas Figure 3b presents those obtained under the proposed control method's constraints. To further assess FOV constraint satisfaction, Figure 4 shows the feature trajectories in pixel coordinates. The dashed black boundary indicates the imposed pixel limits of the admissible region, and the black solid lines indicate the camera's maximum resolution (640 × 480 pixels), representing the physical limits of its FOV. As shown in Figure 4a, the traditional method drives image feature points 1 and 2 beyond the constraint boundaries, leading to failure of the visual servoing task under the constraint requirements. By contrast, Figure 4b demonstrates that enforcing the prescribed performance constraints keeps the image features strictly within the specified FOV, confirming the effectiveness of the proposed method.
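As a practical check of this kind of result, logged feature trajectories can be tested against the admissible region directly; the following short Python sketch (with synthetic data and hypothetical bounds) counts the samples at which any corner leaves the pixel limits, mirroring the dashed boundary in Figure 4.

```python
import numpy as np

def fov_violations(traj_px, lo, hi):
    """Count time samples at which any tracked corner leaves the admissible
    region. traj_px has shape (time, n_points, 2) in pixels."""
    below = (traj_px < lo).any(axis=(1, 2))
    above = (traj_px > hi).any(axis=(1, 2))
    return int((below | above).sum())

# Example: 100 timesteps, 4 corners, an admissible region inside a 640x480 image
traj = np.random.uniform(100, 500, size=(100, 4, 2))
print(fov_violations(traj, lo=np.array([40.0, 40.0]), hi=np.array([600.0, 440.0])))
```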
4.3. Comparison Experiment
To evaluate the proposed approach, the proposed fixed-time and prescribed-time controller (FTPT) in (22) is compared with a fixed-time controller using a common performance function (FT) in (32), the symmetric constant log-type BLF controller (SCBLF) [15], the asymmetric log-type prescribed performance controller (PPC) [16], and the tan-type BLF controller (TBL) [18]. The FT methodology and parameters remain the same as those of the proposed control method, except for the modified performance function.
To implement visual servoing, a 36h11 AprilTag measuring 7 cm in size is selected. The desired feature coordinates and the initial pixel coordinates of the detected features are shown as the red and green crosses in Figure 5, respectively.
With the Intel RealSense D435i operating at 640 × 480 resolution, the pixel bounds of the admissible region are determined by the camera resolution. In (13), the performance function parameters are chosen following the guideline in Section 4.2, where $n$ signifies the total number of image feature points and $n = 4$ is selected for the experiments in this paper; the terminal bounds prescribe, in advance, a maximum steady-state error of three pixels. In (22), the controller gains are set accordingly.
The image-plane trajectories of the four AprilTag features obtained with the different control methods are presented in Figure 5. Green crosses indicate the initial feature positions, whereas red crosses denote the desired positions. As shown in Figure 6, the $u$ and $v$ errors of all four features enter the predefined admissible band within the prescribed time under the proposed method, thereby ensuring the desired transient behavior and FOV satisfaction. Additionally, the proposed FTPT method achieves a faster convergence rate than the other control methods. To better illustrate performance, the total image feature error in visual servoing is plotted in Figure 7, where the overall tracking performance is quantified by the Euclidean norm of the stacked image feature error vector. As shown in Figure 7, the error curves in blue, red, cyan, green, and black reach a neighborhood of zero at 4.9 s, 6.3 s, 8.8 s, 11.1 s, and 11.9 s, respectively. For ease of comparison, the key experimental results are summarized in Table 2. The control methods (22) and (32) with fixed-time stability converge faster than the other methods [15,16,18] with asymptotic stabilization. Furthermore, the blue and red error curves show that the proposed FTPT method converges faster than the FT method: the FTPT method incorporates both fixed-time and prescribed-time theories, while the FT method uses only fixed-time theory. The experimental results verify that the proposed method has better transient performance.
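The settling instants reported above can be extracted from logged data in a uniform way; the following Python sketch (with a synthetic error trace) returns the first time after which the total error norm stays inside a given band, analogous to how the values in Table 2 are read from Figure 7. The band width and the trace are illustrative assumptions.

```python
import numpy as np

def convergence_time(t, err_norm, band=3.0):
    """First time after which the total feature error norm remains inside
    the band, i.e., the settling instant used for Figure 7 / Table 2."""
    inside = err_norm <= band
    for k in range(len(t)):
        if inside[k:].all():
            return t[k]
    return None

# err_norm mimics ||s(t) - s*|| stacked over all 2n pixel coordinates
t = np.linspace(0.0, 12.0, 1200)
err = 200.0 * np.exp(-0.8 * t) + 1.0
print(convergence_time(t, err))
```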
4.4. Deployment in the Real-World Environment
The proposed method was applied to the task of aligning bolts on OCS components. The bolt components and the alignment scenario are shown in Figure 8a,b, respectively. The relative transform between the camera frame and the end-effector (tool) frame is obtained by standard hand-eye calibration. The M12 bolt head is hexagonal with an across-flats width of 18 mm, and the sleeve is hexagonal with an across-flats width of 19 mm. In this task, the goal is to align the sleeve at the end of the manipulator with the bolts of the OCS components, which places high requirements on the assembly positioning between the sleeve and the bolt. Visual servoing control was therefore used to accomplish the alignment task.
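For clarity, converting a camera pose into the corresponding end-effector pose via the hand-eye transform amounts to a single matrix composition; the following Python sketch shows this computation with hypothetical numerical values (the frame names are illustrative, not the paper's notation).

```python
import numpy as np

def camera_pose_to_ee_pose(T_base_cam, T_ee_cam):
    """Convert a camera pose in the robot base frame into the corresponding
    end-effector pose using the hand-eye calibration T_ee_cam (4x4 homogeneous
    transforms): since T_base_cam = T_base_ee @ T_ee_cam, it follows that
    T_base_ee = T_base_cam @ inv(T_ee_cam)."""
    return T_base_cam @ np.linalg.inv(T_ee_cam)

# Example with a translation-only hand-eye offset (hypothetical values)
T_ee_cam = np.eye(4); T_ee_cam[:3, 3] = [0.0, -0.05, 0.08]
T_base_cam = np.eye(4); T_base_cam[:3, 3] = [0.4, 0.1, 0.3]
T_base_ee = camera_pose_to_ee_pose(T_base_cam, T_ee_cam)
```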
In the practical experiment, a YOLOv7-based corner detection algorithm [33] was applied to the bolts of the OCS components, replacing the earlier AprilTag recognition algorithm. The algorithm marks four corner points of the bolt. Although the bolt head is hexagonal, we use four consistently detectable and ordered corner points for robust real-time servoing; using all six corners is possible but less reliable under reflections and may introduce correspondence switching. Because the bolt corner detection algorithm requires a higher camera resolution, the Intel RealSense D435i is operated at 1920 × 1080 pixels in this experiment, and the parameters of the prescribed-time performance function are re-specified accordingly, following the guideline in Section 4.2. The robot controller runs on a computer featuring an i5-13490F CPU, 32 GB of RAM, and an NVIDIA GeForce RTX 4070 GPU.
Figure 9 shows the experimental process of bolt visual servoing alignment using the proposed control method, where the bottom-left pictures in Figure 9a–c represent the camera's field of view. Figure 9a–c illustrate the visual servoing control sequence during a robotic bolt alignment task using a sleeve and validate that the bolt remains within the camera's FOV during the entire servoing process, preventing failures due to FOV violation. To demonstrate the visual servoing results more intuitively and to verify their validity, a sleeve-to-bolt alignment step was added after the visual servoing phase. Firstly, the 6-D pose of the camera was converted to the 6-D pose of the manipulator end-effector through the hand-eye calibration relationship. Then, the robot was controlled to bring the end-effector pose to the attained camera pose. Finally, the manipulator end-effector performed the linear alignment task. Figure 9d shows the sleeve successfully aligned with the bolt, validating the applicability of visual servoing to the bolt alignment task. Zoomed-in snapshots of the image feature trajectories of Figure 9a,c in the camera's field of view are given in Figure 10a,b. The four corners of the bolt were selected as image feature points, represented by green crosses; the red crosses represent the desired image feature points, and the green curves represent the trajectories of the image feature points. In Figure 10b, the red and green crosses nearly overlap, indicating that visual servoing was successfully completed. The feature-point trajectories in the real experiments are not as smooth as in theory, owing to slight jitter in the bolt corner points detected in real time. It is worth noting that when visual servoing control is applied to actual robotic systems, it inevitably encounters external noise factors, such as variations in lighting, camera calibration errors, and image sensor noise. In this application experiment, switching the detection target from AprilTag fiducials to bolts resulted in poorer image quality, and the image detection frame rate decreased from 30 frames per second to 11 frames per second, which affected the visual servo control to a certain degree. These challenges remain key areas for our future research. Ultimately, the proposed control method enabled the robot to successfully align the sleeve with the bolt, validating the effectiveness of the approach.
Despite the encouraging experimental results, several limitations of the proposed approach should be acknowledged. First, the method assumes that all selected visual features are initially within the camera field of view; if features lie outside the field of view at initialization, additional re-detection or exploratory motion is required before the constrained IBVS law can be engaged. Second, the closed-loop performance depends on reliable perception and accurate depth and interaction matrix computation; illumination changes, motion blur, and intermittent detection may degrade feature localization accuracy and introduce measurement jitter that propagates to the commanded motion. Third, the use of the pseudoinverse of the stacked interaction matrix presumes a non-degenerate configuration with sufficient numerical conditioning; near-singular feature geometries or the temporary loss of some features can deteriorate conditioning and lead to degraded transients or slower convergence. Moreover, although the proposed design aims to keep valid features within a predefined admissible image region, partial target exit from the field of view may still occur in practice due to missed detections, tracking drift, or abrupt motions near image boundaries; in such cases, the constrained IBVS law should be re-engaged after feature recovery restores valid measurements within the admissible region. Finally, while the prescribed-time parameter can be systematically bounded via fixed-time analysis, practical gain tuning still involves a trade-off between convergence speed and sensitivity to measurement noise, and overly aggressive gains may amplify jitter in real-world sensing. To further improve robustness under realistic sensing constraints, an enhanced visual servoing framework is currently under development that integrates a dual-rate perception control scheme, prediction-assisted feature and state estimation, and robust feature management mechanisms; comprehensive treatment and experimental validation will be reported in future work.
Remark 6. Failure cases and discussion. We observed representative failure/near-failure cases when the visual features approach the boundary of the camera field of view (FOV) or when the perception quality degrades. First, in the FOV-constrained scenario, the conventional IBVS baseline may drive some feature points outside the admissible region, causing feature loss and task failure (see Figure 4a). Second, in real-world deployment, the detected feature corners may exhibit jitter due to illumination changes, motion blur, and sensor noise. This effect becomes more evident when the detection frame rate decreases (e.g., from 30 fps to 11 fps in the bolt alignment experiment), which may induce oscillations in the image error and can potentially lead to transient FOV violations or degraded convergence. The proposed method alleviates these issues by enforcing asymmetric time-varying output constraints through the prescribed-time performance function and barrier Lyapunov design, which keeps features inside the predefined range throughout the servoing process. Nevertheless, when perception updates are excessively sparse or detection intermittently fails, closed-loop performance may deteriorate; this motivates future work on perception-robust feature tracking and outlier rejection.