1. Introduction
Cable-Driven Parallel Robots (CDPRs) are a versatile class of robotic systems, distinguished by their expansive workspaces, high payload-to-weight ratios, and structurally simple designs [1,2]. By replacing traditional rigid-link mechanisms with tensioned cables, CDPRs significantly reduce moving mass, yielding scalable, cost-effective systems [3,4,5]. These characteristics make them especially well suited for applications requiring large-scale mobility or field deployment, including construction, manufacturing, rehabilitation robotics, and environmental monitoring [6]. Traditionally, the control of CDPRs has relied on precise mechanical and kinematic modeling, assuming that system parameters (such as cable lengths, anchor attachment points, and the geometry of the moving platform) are accurately known and remain constant during operation [7,8]. Within this framework, inverse kinematics can be employed to compute the cable lengths required to achieve a desired end-effector position, allowing precise kinematics-based trajectory control through coordinated actuation of the motors [9].
In certain deployments, particularly in outdoor, mobile, or field-based applications, CDPRs are subjected to a range of environmental and mechanical variabilities [10,11]. Cable stretch, anchor-point drift, backlash in spooling mechanisms, and structural vibrations can all contribute to discrepancies between the assumed model and the actual configuration. These deviations are especially problematic in under-constrained or lower-degree-of-freedom systems, where the platform's pose is not directly actuated but instead emerges from the balance of tensions across multiple cables [12,13]. As noted by Carricato and Merlet [14], such systems exhibit complex, nonlinear behaviors that are highly sensitive to small perturbations, making accurate modeling and control particularly challenging without near-perfect calibration. Most conventional CDPR control strategies rely on accurate kinematic calibration, assuming fixed anchor points, known cable lengths, and rigid structural elements [15]. While such assumptions may hold in controlled laboratory settings, they often break down in real-world scenarios.
To mitigate the discrepancies between modeled and real-world CDPR behavior, researchers have developed a range of model-based calibration procedures aimed at improving control accuracy. For example, Tremblay et al. [16] introduced an eye-in-hand, marker-based vision system that estimates cable lengths and anchor-point locations to support offline calibration. Other notable techniques include drift-compensation methods for planar CDPRs [17] and nonlinear optimization approaches tailored to suspended systems [18]. While these strategies have shown effectiveness under static conditions, they remain fundamentally constrained by their reliance on pre-deployment or offline calibration routines [19,20]. In dynamic or unstructured environments (where factors such as wind, terrain instability, or human interaction introduce continuous variability), such calibrated models rapidly become outdated, limiting their practical utility [21].
To overcome calibration demands, uncalibrated visual servoing (UVS) emerged as a more adaptable extension of early servoing methods [22]. UVS eliminates the need for explicit geometric and camera calibration by estimating the mapping between image-space features and actuator commands from data [23]. Unlike conventional visual servoing approaches that depend on detailed knowledge of robot kinematics and a calibrated visual system [1,11,24], UVS learns this mapping offline or online and can, in principle, adapt during operation [25,26,27]. This makes it particularly well suited for field-deployed or mobile CDPRs, where structural variability, cable behavior, and shifting visual perspectives can render static calibration impractical or quickly obsolete.
At the core of UVS is the image Jacobian, a matrix that relates actuator-space motion to changes in visual features [28]. While traditional methods compute this Jacobian analytically using known system parameters, UVS estimates it empirically (often via regression) by analyzing how observed image-space features respond to controlled motor inputs [29]. Foundational works by Malis et al. [30] and Santamaria-Navarro et al. [25] demonstrated that this data-driven strategy enables UVS to adapt in real time, offering flexibility in the face of mechanical drift, sensor noise, or dynamic environmental conditions [31].
For CDPRs, the adaptability offered by UVS addresses common sources of performance decay. In real-world deployments, these systems are routinely affected by issues such as cable stretch, backlash, tension loss, anchor-point drift, sensor noise, and environmental variability, each of which can degrade the performance of traditional calibration-based control strategies [13]. UVS mitigates these vulnerabilities by linking control directly to observable visual outcomes rather than relying on fixed geometric or kinematic models [25]. In outdoor or field applications (such as disaster response, agricultural robotics, and environmental monitoring), anchor points may shift with terrain changes, cables may slack or stretch unpredictably, and visual noise may distort feature tracking [15]. While conventional systems would require frequent recalibration under such conditions, UVS can update its internal mapping by observing how image-space features respond to actuator inputs, enabling low-maintenance operation with minimal setup requirements.
Extending UVS to spatial CDPRs introduces underactuation and single-view observability issues that can render the image Jacobian ill-conditioned near common poses [32]. In principle, these challenges motivate online adaptation of the Jacobian, for example with Recursive Least Squares (RLS) [23]. In this work, we implement RLS as an online Jacobian estimator but deliberately restrict it to a baseline form with constant forgetting and no event gating. Event gating refers to selectively pausing parameter updates when the data are uninformative or unreliable (such as under weak excitation, poor signal-to-noise ratio, or ill-conditioned geometry). By omitting such safeguards here, we evaluate the raw behavior of RLS against a Global Static Jacobian (GSJ) baseline.
We investigate calibration-free control of a spatial CDPR using a single eye-in-hand camera, with an OptiTrack motion-capture (MoCap) system providing ground truth for evaluation only (never in the loop). We evaluate two controllers, a GSJ estimated offline from small excitation motions and image displacements, and an RLS estimator that updates the Jacobian online, on two hardware configurations: (i) a 3-cable (3C) single-plane platform with all anchors coplanar at a common elevation, and (ii) a redundant 6-cable (RC) dual-plane platform (see Figure 1). The RC platform has vertically separated anchor planes and paired cables (two per corner) driven by the same actuator to increase rotational stiffness. We monitor Jacobian conditioning via the minimum singular value $\sigma_{\min}(\hat{J})$. Both the 3C and RC platform setups used in experimentation can be seen in Figure 2.
Being static by construction, a GSJ cannot, in principle, reflect arbitrarily large configuration changes without re-estimation. Empirically, however, our GSJ remained robust across the tested scenarios within our test envelope, making it a strong baseline for single-camera UVS. RLS achieved comparable accuracy on successful runs but sometimes over-fitted to noisy residuals and produced spurious updates. We therefore regard RLS as promising but fragile, and we discuss an event-triggered variant (with gates on excitation, residual SNR, conditioning, and dwell) as future work.
A key mechanical enhancement evaluated is a dual-cable redundant layout that damps undesired rotational modes and reduces pose drift near workspace edges, improving the conditioning seen by the visual loop. (By pose drift we mean the slow mean change in the platform's degrees-of-freedom (DoF) pose, particularly roll/pitch, under near-constant commands, caused by the under-constrained static equilibrium, elasticity, friction, and slack events.) This mechanical element materially improved closed-loop behavior, most notably helping RLS remain stable more often, and brought overall performance closer to what a calibrated kinematics-based controller would reach on similar target sets (see Section 3.1 for RMS and settling-time results).
Although prior work shows that visual servoing can improve tracking, it remains unclear how far one can go without calibration. This is especially true for spatial CDPRs, where single-view observability and cable slackness can render the image Jacobian ill-conditioned. Our results address this by contrasting GSJ with a baseline RLS variant, targeting under-documented aspects of single-camera UVS on spatial CDPRs. We identify three gaps and pose corresponding research questions:
- 1. Under what operating conditions does a carefully estimated GSJ suffice for accurate tracking, and where does RLS provide benefit?
- 2. How do redundant per-corner cable pairs (6-cable dual-plane) affect conditioning, oscillation, and tracking behavior?
- 3. What failure modes arise for RLS (updates under low excitation or noisy residuals), and what diagnostic criteria are needed to make adaptation event-triggered and safe?
We answer these questions using MoCap ground truth for evaluation; two controllers (GSJ; RLS); two platforms (3C; RC); and tracking of multiple fixed targets at four locations plus a moving-target test, with five runs attempted per scenario. We operationalize behavior with standard, reproducible metrics: RMS image error, band-plus-dwell settling time for step commands, and a disturbance residual (jitter) defined as the RMS of the high-pass (0.4 s moving average (MA)) component of the error; when motor data are available, we use the residual $r_k = \Delta s_k - \hat{J}\,\Delta q_k$ before taking the RMS (see Section 2). We log estimator internals and key loop signals (period, targets, errors) to enable post hoc analysis (metric definitions in Section 2). Our prior GSJ study emphasized strong static accuracy in world space despite under-constrained pose on a 3C platform, whereas the RLS study showed that an online Jacobian can match GSJ only under good initialization and sufficient excitation; otherwise, it over-fits noise and may fail to converge. The present paper extends those results across two platforms (3C; RC), adds MoCap-ground-truthed metrics, and evaluates a simple mechanical aid (redundant cable pairs) that materially improves conditioning and closed-loop behavior.
Put simply, the controller uses what the camera sees (pixel coordinates) to steer the robot, without needing a geometric calibration of the whole mechanism. We compare a fixed mapping from pixels to motors (GSJ) against a version that learns and updates this mapping on the fly (RLS). We also test simple mechanical tweaks that reduce camera jitter and improve controllability.
The remainder of this paper is organized as follows. Section 2 details the system architecture, the GSJ and RLS controllers, mechanical enhancements, and the experimental setup (static and dynamic targets). Section 3 reports results and broader implications across platforms and controllers. Section 4 discusses limitations, outlines an event-triggered, stability-aware RLS design, and highlights directions for future research in robust and adaptive control for CDPRs in dynamic, real-world environments. Section 5 concludes the paper by summarizing the key contributions.
2. Materials and Methods
This section details the complete control and estimation pipeline used in our experiments so that a practitioner can reproduce the results on another cable-driven platform with minimal adaptation. We begin by defining the image-space signals, preprocessing, and a least-squares visual-servo control law that operates purely on camera features. We then present two strategies for obtaining the image Jacobian that converts pixel errors to joint commands: a GSJ, estimated once from excitation motions, and an online RLS estimator with constant forgetting. For each approach we spell out the data collection protocol, estimation equations, numerical safeguards, and implementation defaults (rates, filters, limits). Throughout, we emphasize practical choices that influence conditioning and robustness.
We study two cable-driven platforms: (i) a spatial 3C platform and (ii) an RC platform; see Figure 1 and Figure 2. A downward-facing, eye-in-hand camera is rigidly attached to the platform; drives operate in position mode with encoder feedback. The control loop runs at a nominal period of 40 ms (25 Hz).
At each cycle we detect $N$ feature points $p_i = (u_i, v_i)$ on a fiducial marker (typically $N = 3$ or $N = 4$); the controlled image feature is the centroid:

$\bar{u}_k = \frac{1}{N}\sum_{i=1}^{N} u_{i,k}, \qquad \bar{v}_k = \frac{1}{N}\sum_{i=1}^{N} v_{i,k}$,   (1)

assembled as $s_k = (\bar{u}_k, \bar{v}_k)^\top$. We prefer the centroid over selecting 3–4 fixed points because it is robust to transient point loss and yields a stable 2-DoF image feature for both GSJ and RLS. Raw $(u_i, v_i)$ are smoothed before differencing or control with a 5-sample moving average (MA(5)), i.e., a causal, finite-impulse-response (FIR) moving average of length $W$ (we use $W = 5$ in all runs). MA($W$) reduces high-frequency pixel noise and desynchronization artifacts relative to the 25 Hz loop, improving the conditioning of Jacobian estimation and the stability of the TSVD control update. For any scalar sequence $x_k$ (e.g., $\bar{u}_k$ or $\bar{v}_k$),

$\tilde{x}_k = \frac{1}{W}\sum_{j=0}^{W-1} x_{k-j}$,   (2)

where the MA is initialized by averaging the available prefix (the first $k < W$ samples), and we run 10 control cycles to initialize $\tilde{x}_k$ and the filters before logging. Equation (2) is implemented via the running sum

$S_k = S_{k-1} + x_k - x_{k-W}, \qquad \tilde{x}_k = S_k / W$,   (3)

with $S_0 = 0$ and $x_{k-W} := 0$ for $k \le W$; here $S_k$ is the running sum that implements the MA in $O(1)$ per step. For the centroid $s_k = (\bar{u}_k, \bar{v}_k)^\top$, MA($W$) is applied elementwise:

$\tilde{s}_k = (\tilde{\bar{u}}_k, \tilde{\bar{v}}_k)^\top$.   (4)

All increments in the RLS model use the smoothed signal $\tilde{s}_k$. The image reference is either a fixed setpoint (step) or a time-indexed reference $s_k^*$; the image error is $e_k = s_k^* - \tilde{s}_k$ (pixels).

Motor positions are $q_k$ (encoder counts), with $m = 3$ actuators for the setup we used with our platform. We denote increments by $\Delta q_k = q_k - q_{k-1}$ (counts). Unless otherwise stated, $e$ is in pixels, $q$ is in counts, and Jacobians map counts to pixels so that $\Delta \tilde{s}_k \approx \hat{J}\,\Delta q_k$ is dimensionally consistent.
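For concreteness, the running-sum form (2)–(3) can be implemented in a few lines. The following NumPy sketch is ours (class and variable names are illustrative, not taken from released code); it reproduces the prefix-averaging warm-up described above.

```python
import numpy as np

class MovingAverage:
    """Causal FIR moving average of length W with an O(1) running-sum update.

    Hypothetical sketch of the MA(W) filter described in the text; during the
    first W-1 samples it averages the available prefix (warm-up convention).
    """
    def __init__(self, W=5, dim=2):
        self.W = W
        self.buf = np.zeros((W, dim))  # circular buffer of the last W samples
        self.S = np.zeros(dim)         # running sum S_k
        self.k = 0                     # samples seen so far

    def update(self, x):
        x = np.asarray(x, dtype=float)
        i = self.k % self.W
        self.S += x - self.buf[i]      # S_k = S_{k-1} + x_k - x_{k-W}
        self.buf[i] = x
        self.k += 1
        return self.S / min(self.k, self.W)  # prefix average until buffer fills

# Usage: filt = MovingAverage(W=5, dim=2); s_filt = filt.update(s_raw)
```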
2.1. Visual-Servo Control Law
The controller closes the loop entirely in the image plane. The control loop is summarized in Figure 3, which shows the image error driving the Jacobian-based pseudoinverse update that generates actuator increments. Given the current estimate of the image Jacobian $\hat{J}_k \in \mathbb{R}^{2\times m}$, we compute a small setpoint increment

$\Delta q_k = \hat{J}_k^{+}\,K\,e_k$,   (5)

where $\hat{J}_k = U\Sigma V^\top$ is the SVD and $\hat{J}_k^{+} = V\Sigma^{+}U^\top$ with $\Sigma^{+}_{ii} = 1/\sigma_i$ if $\sigma_i > \tau$ and $0$ otherwise, using the machine-epsilon heuristic $\tau = \max(2, m)\,\varepsilon\,\sigma_{\max}$ (the larger matrix dimension times machine epsilon times the largest singular value). This truncated-SVD (TSVD) pseudoinverse is numerically robust near weak directions without introducing a tunable damping parameter. TSVD is used only in the control pseudoinverse.

We clip per-motor increments to actuator-safe bounds and integrate them into a commanded setpoint:

$q^{\mathrm{cmd}}_{k+1} = q^{\mathrm{cmd}}_k + \operatorname{clip}(\Delta q_k, \pm\Delta q_{\max})$.   (6)

Trials automatically stop when $\|e_k\| \le \epsilon_{\mathrm{stop}}$ (we use $\epsilon_{\mathrm{stop}} = 20$–$22$ px depending on scenario). Feature-loss events trigger an immediate or ramped stop via the drive API.
Signals and notation:
- Image features: $s_k = (\bar{u}_k, \bar{v}_k)^\top$ (pixels) each control cycle. We use either the marker centroid or the average of three tracked points; the control law remains 2-DoF in image space.
- Reference: $s_k^*$ (pixels): steps or a time-varying path.
- Error: $e_k = s_k^* - \tilde{s}_k$ (pixels).
- Actuators: $q_k$ (encoder counts); $\dot{q}_k \approx \Delta q_k / \Delta t_k$ is finite-difference velocity (counts/s).
- Image Jacobian: $J(q) \in \mathbb{R}^{2\times m}$ relates joint velocities to feature velocities:

$\dot{s} = J(q)\,\dot{q}$.   (7)

What the symbols mean in practice:
- $\sigma_{\min}(\hat{J})$ (conditioning) indicates how much the image actually responds to joint motions; small values warn of poor leverage or impending slackness.
- The controller is entirely image-space: no world pose, depth, or camera intrinsics are required for the loop to function.

We report $\sigma_{\min}(\hat{J})$ offline to characterize conditioning; it is not used online to retune the controller in these runs. The control loop per cycle is summarized in Algorithm 1.
Algorithm 1 Pseudocode (per cycle) for the visual servoing control loop
- 1: e = s_ref − s_filt                # image error (px), MA-filtered feature
- 2: J = current Jacobian estimate     # 2 × m (GSJ or RLS)
- 3: [U, Σ, V] = svd(J)
- 4: Σ⁺ = diag(1/σ_i if σ_i > τ else 0)
- 5: J⁺ = V Σ⁺ Uᵀ                      # TSVD pseudoinverse
- 6: dq = clip(J⁺ K e, ±dq_max_per_motor)
- 7: qcmd = qcmd + dq
- 8: send_to_drives(qcmd)
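As a complement to Algorithm 1, the following is a minimal NumPy sketch of the per-cycle control update under Equations (5) and (6), assuming a 2 × m Jacobian; the gain K and the clip bound dq_max are illustrative placeholders, not the platform's tuned values.

```python
import numpy as np

def tsvd_pinv(J, eps=np.finfo(float).eps):
    """Truncated-SVD pseudoinverse; singular values below tau are dropped."""
    U, s, Vt = np.linalg.svd(J, full_matrices=False)
    tau = max(J.shape) * eps * s.max()        # machine-epsilon heuristic
    s_inv = np.where(s > tau, 1.0 / s, 0.0)
    return Vt.T @ (s_inv[:, None] * U.T)      # V Σ⁺ Uᵀ

def control_step(J_hat, e, q_cmd, K=0.2, dq_max=200):
    """One visual-servo cycle: pixel error -> clipped motor setpoint increment."""
    dq = tsvd_pinv(J_hat) @ (K * e)           # counts, Eq. (5)
    dq = np.clip(dq, -dq_max, dq_max)         # actuator-safe bounds, Eq. (6)
    return q_cmd + dq, dq
```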
2.2. Global Static Jacobian (GSJ)
A single, well-estimated $\hat{J}_{\mathrm{GSJ}}$ is often sufficient over a wide region of the workspace. It is simple, robust to noise, and avoids the failure modes of online adaptation when excitation or SNR is low. The idea is to learn the local, linear mapping from joint velocity to image velocity by probing the system around a nominal view.
We identify a fixed mapping from motor velocities to per-point pixel velocities and then reduce it to the centroid feature used online. From a brief nudge sequence we compute pixel velocities $\dot{p}_k$ by finite differencing and dividing by the measured period $\Delta t_k$, and motor velocities $\dot{q}_k$ from the log. We solve an ordinary least-squares problem with no intercept:

$\dot{p}_k = J_{\mathrm{pts}}\,\dot{q}_k + \eta_k, \qquad \hat{J}_{\mathrm{pts}} = \arg\min_J \sum_k \|\dot{p}_k - J\,\dot{q}_k\|_2^2$,   (8)

yielding $\hat{J}_{\mathrm{pts}} \in \mathbb{R}^{2N\times m}$; $\eta_k$ is measurement noise and unmodeled image motion (pixels/s). Because the centroid is the mean of the tracked points and differentiation is linear, the centroid Jacobian is the row-wise average of the per-point rows:

$\hat{J}_{\mathrm{GSJ}} = \frac{1}{N}\sum_{i=1}^{N}\hat{J}_{\mathrm{pts}}^{(i)}$,   (10)

where $\hat{J}_{\mathrm{pts}}^{(i)} \in \mathbb{R}^{2\times m}$ is the block of rows for point $i$. We hold $\hat{J}_{\mathrm{GSJ}}$ fixed during GSJ runs and use its TSVD pseudoinverse in (5) and (6). The detailed estimation procedure is summarized in Algorithm 2. Preprocessing:
- (i) apply MA(5) to the per-point pixels before differencing;
- (ii) compute $\Delta t_k$ from timestamps;
- (iii) drop the first sample and any rows that become NaN/Inf after differencing.
Algorithm 2 Estimation of Global Static Jacobian (GSJ) pseudocode
- 1: collect per-point (u_i, v_i) and motor encoder logs during nudges
- 2: dotp = diff(pixels)/Δt;  dotq = diff(encoders)/Δt
- 3: stack P = [dotp_k], Q = [dotq_k]
- 4: J_pts = P Qᵀ (Q Qᵀ)⁻¹   % OLS, no intercept
- 5: J_GSJ = [mean_rows_u(J_pts); mean_rows_v(J_pts)]   % centroid reduction
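A compact NumPy sketch of Algorithm 2 follows, assuming the nudge log is available as arrays; the interleaved (u1, v1, u2, v2, ...) column layout is an assumption for illustration, not the authors' exact log format.

```python
import numpy as np

def estimate_gsj(pixels, encoders, t):
    """Estimate the centroid GSJ from a logged nudge sequence.

    pixels:   (T, 2N) per-point (u_i, v_i) columns, already MA-filtered
    encoders: (T, m)  motor positions (counts)
    t:        (T,)    timestamps (s)
    """
    dt = np.diff(t)[:, None]
    P = np.diff(pixels, axis=0) / dt        # pixel velocities (K, 2N)
    Q = np.diff(encoders, axis=0) / dt      # motor velocities (K, m)
    ok = np.all(np.isfinite(P), axis=1) & np.all(np.isfinite(Q), axis=1)
    P, Q = P[ok], Q[ok]                     # drop NaN/Inf rows
    # OLS with no intercept: P ≈ Q @ J_pts.T  ->  J_pts is (2N, m)
    J_pts = np.linalg.lstsq(Q, P, rcond=None)[0].T
    # Centroid reduction: row-wise average of u-rows and of v-rows
    J_u = J_pts[0::2].mean(axis=0)
    J_v = J_pts[1::2].mean(axis=0)
    return np.vstack([J_u, J_v])            # (2, m) centroid Jacobian
```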
2.3. RLS Jacobian Adaptation (Constant Forgetting)
As the robot moves, anchor geometry, cable stretch, and viewing perspective drift slowly. Rather than re-identifying offline, RLS updates the Jacobian in flight, fitting the same model as (8) but recursively.
We fit the incremental relation between image-feature changes and motor-count changes:

$\Delta\tilde{s}_k = \hat{J}_k\,\Delta q_k + r_k$.   (11)

Each cycle we compute $\Delta q_k = \hat{J}_k^{+} K e_k$ with the same TSVD rule as in Section 2.1. Let $P_k \in \mathbb{R}^{m\times m}$ be the covariance (confidence) matrix; $r_k$ is the pixel-space innovation (modeling error + noise). With forgetting factor $\lambda$, the covariance-form RLS update we implement is:

$r_k = \Delta\tilde{s}_k - \hat{J}_{k-1}\,\Delta q_k$,   (12)
$g_k = \dfrac{P_{k-1}\,\Delta q_k}{\lambda + \Delta q_k^\top P_{k-1}\,\Delta q_k}$,   (13)
$\hat{J}_k = \hat{J}_{k-1} + r_k\, g_k^\top$,   (14)
$P_k = \dfrac{1}{\lambda}\left(P_{k-1} - g_k\,\Delta q_k^\top P_{k-1}\right)$.   (15)

We initialize $\hat{J}_0$ from a brief local nudge sequence (same feature and filter as above) and set $P_0 = p_0 I$ with $p_0$ large. The forgetting factor is fixed at $\lambda = 0.98$. Rows are updated jointly (matrix form above); no event gating is used. Event gating means we temporarily pause the online Jacobian/RLS update during data that would corrupt the estimate (such as at target steps, when commands are too small to provide excitation, when motors saturate, when image SNR is poor, or when the Jacobian is ill-conditioned). Control continues normally; only the parameter update is paused. We did not use gating in this study; we analyze its potential benefits in Section 3.3.
Section 3.3. In preliminary tests,
adapted faster but produced overshoot/oscillation
, whereas
yielded smoother trajectories and comparable steady-state error; we therefore use
in all following experiments. These observations match our prior findings [
29]. For clarity, the matrix-form implementation is summarized in Algorithm 3.
Trials stop when $\|e_k\| \le \epsilon_{\mathrm{stop}}$ px. Feature-loss events trigger a stop as exposed by the drive. Per cycle we log: time; encoder counts $q_k$; raw and filtered $s_k$ (and the four corner points); the target $s_k^*$; error $e_k$; the current $\hat{J}_k$ (all entries); $\sigma_{\min}(\hat{J}_k)$; and the measured loop period. These are the exact fields used for our plots and metrics in Section 3.
These safeguards never block updates; they simply keep the numbers well-behaved (a code sketch of the covariance cap follows the list):
- Covariance caps: clip the eigenvalues of $P_k$ to a fixed range $[p_{\min}, p_{\max}]$ to prevent blow-ups or collapse and, if the gain denominator $\lambda + \Delta q_k^\top P_{k-1}\Delta q_k$ is very small, temporarily set $\lambda = 1$ for that sample (memory-only).
- Conditioning (offline analysis): the minimum singular value $\sigma_{\min}$ of the logged Jacobian estimate, summarized per scenario; this quantity is used only for analysis, not in-loop.
- Update magnitude monitor: track $\|\hat{J}_k - \hat{J}_{k-1}\|_F$; a spike indicates low SNR or a transient.
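A sketch of the covariance cap follows; the bounds are placeholder values of our choosing, since the exact caps are not published here.

```python
import numpy as np

def cap_covariance(P, p_min=1e-6, p_max=1e6):
    """Clip the eigenvalues of the RLS covariance into [p_min, p_max].

    Bounds are illustrative placeholders. Symmetrizing first guards
    against numerical asymmetry accumulated by repeated updates.
    """
    w, V = np.linalg.eigh(0.5 * (P + P.T))   # eigendecomposition of sym(P)
    w = np.clip(w, p_min, p_max)             # cap eigenvalues
    return (V * w) @ V.T                     # rebuild V diag(w) Vᵀ
```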
Initialization sequence (what to actually do on the bench; a code sketch follows this list):
- 1. Warm up filters for 0.5–1 s to stabilize $\tilde{s}_k$ and $\Delta\tilde{s}_k$.
- 2. Perform brief micro-nudges on each motor to elicit small but measurable image velocities (px/s); collect a short window of $(\Delta\tilde{s}_k, \Delta q_k)$ pairs.
- 3. Solve a tiny ridge LS for the two rows to set $\hat{J}_0$; set $P_0 = p_0 I$ with $p_0$ large.
- 4. Start control and run (12)–(15) each cycle; assemble $\hat{J}_k$ and use it in (9).
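A possible implementation of steps 2–3, with a hypothetical ridge weight and $p_0$ since the exact values are not specified here:

```python
import numpy as np

def init_rls(dq, dp, ridge=1e-3, p0=1e3):
    """Initialize J0 (2 x m) and P0 from micro-nudge increments.

    dq: (K, m) motor increments; dp: (K, 2) filtered image increments.
    ridge and p0 are placeholder values, not the paper's tuned settings.
    """
    m = dq.shape[1]
    A = dq.T @ dq + ridge * np.eye(m)       # regularized normal equations
    J0 = np.linalg.solve(A, dq.T @ dp).T    # (2, m), both rows jointly
    P0 = p0 * np.eye(m)                     # large p0 -> trust data quickly
    return J0, P0
```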
What the parameters mean:
- $\lambda$ controls memory: a lower $\lambda$ adapts quickly but can over-fit noisy residuals if excitation is weak.
- $p_0$ sets initial confidence: larger means the estimator will trust data quickly; too small makes the estimate sluggish.
- Normalizing $\Delta q$ (e.g., rescaling counts/s per motor to unit variance) is optional but can improve conditioning and make $\lambda$ feel more consistent across axes.
Algorithm 3 Matrix-form RLS pseudocode (per cycle)
- 1: # incremental RLS (2 × m J), per cycle
- 2: dq_meas = q[k] − q[k−1]
- 3: dp = p[k] − p[k−1]                  # p = [u v]^T after low-pass
- 4: r = dp − J·dq_meas                  # innovation
- 5: den = λ + dq_measᵀ·P·dq_meas
- 6: g = (P·dq_meas)/den
- 7: J = J + r·gᵀ;  P = (P − g·dq_measᵀ·P)/λ
- 8: # control
- 9: e = p_ref − p
- 10: dq = clip(J⁺·K·e, ±dq_max)
- 11: qcmd += dq
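The per-cycle update (lines 2–7 of Algorithm 3) translates directly to NumPy. This sketch implements Equations (12)–(15) without the numerical safeguards of Section 2.3; variable names are ours.

```python
import numpy as np

def rls_update(J, P, dq, ds, lam=0.98):
    """Covariance-form RLS step for the 2 x m image Jacobian (Eqs. 12-15).

    dq: motor increments (m,); ds: smoothed image increments (2,).
    A minimal sketch; covariance caps and monitors are omitted.
    """
    r = ds - J @ dq                       # (12) pixel-space innovation
    den = lam + dq @ P @ dq               # scalar gain denominator
    g = (P @ dq) / den                    # (13) gain vector (m,)
    J = J + np.outer(r, g)                # (14) rank-1 update of both rows
    P = (P - np.outer(g, dq @ P)) / lam   # (15) forgetting-discounted covariance
    return J, P
```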
All quantities above are computable from the logs we provide: $q_k$, $s_k$, $s_k^*$, $\hat{J}_k$, and the loop period. We derive $\Delta q_k$ by finite-difference, compute $\sigma_{\min}$ by SVD, and analyze conditioning offline. No commanded inputs or tension sensors are required for the pipeline as described. Smoothing $s_k$ with MA($W$) reduces white measurement-noise variance by a factor of $W$ but attenuates and delays the increments $\Delta\tilde{s}_k$. In (11)–(15), this scales the innovation $r_k$ and increases its correlation, which (i) decreases spurious updates in low-SNR regimes but (ii) modestly slows adaptation (effective memory increases by roughly the MA group delay). Empirically this helps when excitation is weak; to compensate, one can raise $\lambda$ slightly (such as 0.98 → 0.985) or gate updates on residual SNR/conditioning (Section 4). The practical parameter values used in our experiments are summarized in Table 1. RLS may be combined with safety gates that pause updates when excitation is weak, geometry is ill-conditioned, or residuals are noisy. In this work, we omit such safeguards to evaluate the raw, baseline RLS behavior, providing a calibration-free benchmark consistent with how UVS has been studied in serial robots. The frequent divergence of RLS in our results suggests that event gating or related safeguards may be beneficial; our baseline results establish why such measures are required in CDPRs.
2.4. Experimental Platforms and Instrumentation
We tested on two hardware configurations: (i) a spatial 3C platform (underactuated pose), and (ii) an RC platform with paired cables per corner to increase rotational stiffness. Anchor pulleys form an equilateral triangle, spaced 2400 mm apart and mounted at 1650 mm elevation. The test area at the triangle centroid measures 1150 mm × 1150 mm and is raised 370 mm above the ground. An 8-camera OptiTrack Flex 3 system samples at 100 Hz (±0.20 mm) to log the platform, targets, and anchors; MoCap is used only for evaluation.
The camera is eye-in-hand, rigidly attached to the platform, facing downward. We track either a centroid or three points on a fiducial (ArUco) marker to compute the image features $s_k$ per cycle. We smooth features online with MA(5) (see Section 2.1). We keep the same acquisition and extraction settings as our prior UVS papers.
An OptiTrack system (8 cameras, 100 Hz) records the 3D pose of (a) the platform reference frame, (b) each visual target, and (c) each anchor pulley location. MoCap is never used in the loop, only for evaluation and for documenting anchor/target geometry. We export CSVs synchronized with the controller logs using a shared wall clock and timestamp alignment (nearest-neighbor sync).
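The nearest-neighbor alignment can be reproduced with a short helper; array names are illustrative, and both streams are assumed sorted on the shared clock.

```python
import numpy as np

def align_mocap(t_ctrl, t_mocap, x_mocap):
    """Nearest-neighbor alignment of MoCap samples to controller stamps.

    t_ctrl: (T,) controller wall-clock times; t_mocap: (M,) MoCap times
    (shared clock); x_mocap: (M, d) poses. Returns (T, d) aligned poses.
    """
    idx = np.searchsorted(t_mocap, t_ctrl)          # insertion points
    idx = np.clip(idx, 1, len(t_mocap) - 1)
    left_closer = (t_ctrl - t_mocap[idx - 1]) < (t_mocap[idx] - t_ctrl)
    idx = idx - left_closer.astype(int)             # pick nearer neighbor
    return x_mocap[idx]
```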
The control node runs on an embedded GPU SBC (same class as prior work), communicating with the winch drives over a low-latency link. Drives are position-mode with encoder feedback (counts). Each loop executed at 25 Hz, with average computation times well below the loop period for both the GSJ and RLS updates. In both cases, the limiting factor was image acquisition, not numerical computation, indicating that control-law complexity did not constrain throughput in our setup.
For dynamic trials we reuse the prior circular-target protocol: a fiducial on a motorized turntable at a fixed radius, rotating at a constant low angular speed (one slow revolution per trial). This yields a low, steady image-velocity challenge while keeping the target fully visible. In this study, target pose and trajectory are also recorded by MoCap (new vs. prior work), so both image-space and world-space tracking errors are available.
We evaluate two calibration-free IBVS controllers:
- 1. Global Static Jacobian (GSJ): a single $\hat{J}_{\mathrm{GSJ}}$ is identified offline from small excitation motions (nudges) and held fixed during runs; control uses the TSVD pseudoinverse $\hat{J}_{\mathrm{GSJ}}^{+}$ as in Section 2.1.
- 2. Recursive Least Squares (RLS): the image Jacobian is adapted online via matrix-form RLS with constant forgetting $\lambda$ (typically 0.98), initialized from a brief local nudge set; numerical hygiene and conditioning-aware damping follow Section 2.3.

Both controllers use identical feature processing and rate limits. The GSJ/RLS implementations follow the earlier UVS/RLS methodology with our added conditioning monitors; the ungated design is a deliberate choice to study raw RLS behavior vs. GSJ.
We consider two scenarios on each platform:
- Static targets: four distinct target locations (two near-center and two near-edge positions chosen to stress conditioning), with five independent runs per target for each controller.
- Moving target: a fiducial follows a repeatable low-speed circular trajectory as above. For each controller we run five trials, each consisting of: converge to the target at rest, start the motion, track for one full revolution, stop, and log an additional 3–5 s while static to measure residual bias.
Ideally, the platform remains horizontal so that image-plane error correlates tightly with Cartesian error. In practice, the under-constrained pose is set by static equilibrium and thus allows small tilts and roll/pitch coupling, partially decoupling image-plane error from Cartesian error; we quantify the impact with MoCap in Section 3.
2.5. GSJ Identification and RLS Initialization
GSJ: With the target centered, we perform symmetric motor nudges producing image shifts of up to roughly 30 px; we estimate $\hat{J}_{\mathrm{GSJ}}$ in velocity form via ridge LS (small ridge weight), rejecting samples with invalid values or image spikes at near-zero $\Delta t$. The resulting $\hat{J}_{\mathrm{GSJ}}$ is frozen for all GSJ runs on that platform.
RLS: We warm up filters (0.5–1 s), collect a short micro-nudge set, solve a tiny ridge LS for $\hat{J}_0$, set $P_0 = p_0 I$ ($p_0$ large), and update both rows with constant-$\lambda$ RLS each cycle. No event gating is used; we rely on conditioning-aware damping and covariance caps for numerical safety.
Experimental procedures, static-target trials (per run): (1) load the controller (GSJ or RLS) and initialize as above; (2) select one of the four target locations; (3) start the loop and step the reference to center the marker; (4) hold for 8–12 s after first settling (or the same wall-clock duration for all runs); (5) return to the neutral pose. All signals and MoCap are logged continuously.
Moving-target trials (per run): (1) converge to the target at rest; (2) start the turntable; (3) track for one full revolution; (4) stop; (5) log another 3–5 s while the target is static to measure residual bias. Image-space errors and MoCap ground truth are recorded throughout.
2.6. Signals Logged and Synchronization
Each cycle we log: time, $q_k$ (encoder counts), $s_k$ (raw and filtered), $s_k^*$, $e_k$, the current $\hat{J}_k$ (all entries), $P_k$ (RLS), the control period $\Delta t_k$, the rate-limited $\Delta q_k$, and basic status flags (feature validity). MoCap logs cover the platform, each target, and each anchor. Timestamps are synchronized by aligning to the controller wall clock (interpolating MoCap to controller stamps).
We report both image-space and world-space performance:
- RMS image error (px): $\sqrt{\operatorname{mean}_k \|e_k\|^2}$ over the post-transient window.
- 95% settling time (static steps): the first time the normed error stays within 5% of the commanded step amplitude for 1 s.
- Disturbance residual (px): low-pass RMS of $r_k = \Delta s_k - \hat{J}\,\Delta q_k$, where $\Delta q_k$ is the finite-difference motor motion; filtered at 2–3 Hz (zero-phase).
- MoCap position error (mm): Euclidean distance between the platform optical-center proxy and the target in the world frame (static); radial tracking error vs. the MoCap target trajectory (moving).

Per target/scenario/controller we summarize mean ± SD across the five runs; we also report failure counts (target loss, safety abort). We also report jitter, defined as the RMS of the high-frequency part of the tracking error after the system has locked/settled; in plain terms, how much the platform wiggles around the nominal track once it is on target (mm in MoCap, px in the image plane). If the residual $r_k$ is available, jitter is the RMS of the high-frequency unmodeled image motion. A computational sketch of these metrics follows below.
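For reproducibility, here is a minimal sketch of the three image-space metrics (RMS error, band-plus-dwell settling time, and high-pass jitter), assuming uniformly logged arrays; thresholds are parameters, not fixed constants.

```python
import numpy as np

def rms_image_error(e, settled):
    """RMS of the image-error norm over the post-transient window (mask)."""
    return np.sqrt(np.mean(np.sum(e[settled] ** 2, axis=1)))

def settling_time(t, e_norm, step_amp, band=0.05, dwell=1.0):
    """First time ||e|| stays within band*step_amp for `dwell` seconds."""
    inside = e_norm <= band * step_amp
    start = None
    for k, ok in enumerate(inside):
        if ok and start is None:
            start = t[k]            # entered the band
        elif not ok:
            start = None            # left the band; reset dwell
        if start is not None and t[k] - start >= dwell:
            return start
    return float("nan")             # never settled

def jitter(t, e_norm, win=0.4):
    """High-pass RMS: subtract a `win`-second moving average, then take RMS."""
    n = max(1, int(round(win / np.median(np.diff(t)))))
    low = np.convolve(e_norm, np.ones(n) / n, mode="same")
    return np.sqrt(np.mean((e_norm - low) ** 2))
```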
Relative to the two prior UVS papers, we follow the same hardware class, eye-in-hand vision, feature tracking, and GSJ/RLS estimation approach, but we limit targets to four locations with five runs per scenario, and we now capture MoCap ground truth for the platform, targets, and anchors in every trial. Thus, all figures can be reproduced both in image space and in the world frame without relying on model calibration.
4. Discussion
Our results are for a single monocular camera regulating a 2-DoF image point, which under-actuates the platform dynamics; behaviors may differ with multi-feature or pose-based IBVS. The settle-band criteria (±20 px GSJ, ±22 px RLS) and short dwell were chosen to match the controllers’ intended operation; different bands would shift reported settle rates but not the qualitative 3C↔RC trends we observe. The RLS implementation was deliberately ungated and lightly regularized to expose failure modes; event-triggered updates or stronger regularization would likely improve robustness (and will be the focus of future work). We did not sense cable tensions, so unmodeled elasticity and backlash enter as disturbances; closed-loop force feedback could further reduce residual error. Finally, our tests used a single system and controller gains; while we varied targets and repeated runs, broader hardware and gain sweeps would strengthen external validity.
A natural question is how these uncalibrated baselines compare to calibrated visual servoing (CVS) or other adaptive schemes. Such benchmarks require calibrated kinematic and camera models and separate controller implementations, which were beyond the scope of this initial study. Our aim here was to establish a calibration-free baseline for spatial CDPRs, consistent with how UVS has been evaluated for serial manipulators. Future work will extend the comparison set to include CVS and alternative adaptive strategies, enabling a fuller picture of the trade-offs between calibration effort, robustness, and performance. Comparing image-based UVS to calibration-based visual servoing is a valuable next step; here we isolate calibration-free performance.
Dual-cable redundancy made the platform mechanically stiffer, which suppressed roll and pitch motion and reduced tilt and drift of the platform. With this extra stiffness, the camera remained closer to level, so image errors corresponded more directly to actual platform displacements. As a result, the redundant setup tracked targets more faithfully, settled with less overshoot, and exhibited less jitter. This same effect also explains why the RLS estimator behaved more stably on RC: because the platform tilted less, the feature motions it observed were cleaner and more consistent, leading to smoother parameter updates rather than the occasional unstable spikes seen on 3C (see Figure 14).
The single-camera, eye-in-hand configuration also imposes an observability limitation. With only one monocular view, depth must be inferred indirectly through motion or model assumptions, which restricts robustness compared to stereo or anchor-mounted multi-view setups that could provide direct depth information. Likewise, our trajectory set was limited to a fixed circular path, whose low speed was chosen deliberately to isolate controller behavior under quasi-static conditions. More complex trajectories were not available on the current platform but represent an important direction for future experiments. Multi-view sensing and richer trajectory libraries would both strengthen robustness tests and clarify the boundaries where GSJ remains adequate versus where adaptive methods become necessary.
Our results address three questions about UVS on spatial CDPRs: when a fixed image Jacobian (GSJ) suffices, how mechanical redundancy affects closed-loop behavior, and where an online Jacobian (RLS) helps or fails. First, the RC platform (redundant per-corner cable pairs) consistently improved conditioning, reflected in higher $\sigma_{\min}$, lower oscillation, and tighter steady behavior relative to the 3C platform. In that better-conditioned regime, GSJ delivered reliable tracking on both static and moving targets, with lower post-settle/world-space error and comparable or smaller image-plane jitter than 3C, with settle rates of 0.48–0.73 vs. 0.14–0.71 and median settle times of 1.0–1.3 s vs. 1.1–2.2 s. These findings support the working hypothesis that, under adequate conditioning, a carefully estimated fixed mapping from pixels to motors is sufficient for accurate UVS without full geometric calibration. They also show that simple mechanical changes (redundant cables) can be more impactful than increasing algorithmic complexity.
Second, we examined RLS fairly and in context. RLS did provide benefit when the run offered sufficient excitation and the Jacobian remained well conditioned, but outside that envelope it frequently amplified noise, produced large residuals, and exhibited failure modes on the moving-target task (like divergence under low excitation or a near-singular $\hat{J}$). This aligns with prior intuition: single-view UVS on a spatial CDPR is strongly nonlinear, with depth/pose coupling and internal platform modes (roll/pitch) that a local linear RLS model cannot robustly absorb without safeguards. The clear implication is not that RLS is ineffective, but that it must be deployed with (i) excitation scheduling, (ii) strong regularization, and (iii) a supervisor that prevents updates when the information geometry is poor. Concretely, event-triggered adaptation using gates on $\sigma_{\min}$, a bounded update norm $\|\Delta\hat{J}\|_F$, and residual magnitude can prevent noise-driven changes; robust losses (like Huber [33]) and directional or covariance-reset forgetting can stabilize estimates; and confidence-weighted blending $\hat{J} = \alpha\,\hat{J}_{\mathrm{RLS}} + (1-\alpha)\,\hat{J}_{\mathrm{GSJ}}$ with $\alpha$ tied to these gates can retain GSJ reliability while exploiting learnable structure when safe, as sketched below.
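As a sketch of this future direction (not part of the evaluated system), the supervisor could be a handful of threshold gates combined into an update permission and a blending weight; all thresholds below are hypothetical placeholders, not validated values.

```python
import numpy as np

def gated_blend(dq, r, sig_min, dJ_norm,
                dq_min=5.0, sig_tol=1e-3, r_max=10.0, dJ_max=1.0):
    """Event-triggered supervisor sketch for gated, blended RLS adaptation.

    Returns (allow_update, alpha): whether RLS may update this cycle and a
    confidence weight for J = alpha*J_rls + (1 - alpha)*J_gsj.
    """
    gates = [
        np.linalg.norm(dq) >= dq_min,    # enough excitation
        sig_min >= sig_tol,              # well-conditioned geometry
        np.linalg.norm(r) <= r_max,      # residual not noise-dominated
        dJ_norm <= dJ_max,               # bounded update magnitude
    ]
    allow_update = all(gates)
    alpha = float(np.mean(gates))        # crude confidence weight in [0, 1]
    return allow_update, alpha
```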
Third, the sensing/actuation coupling deserves emphasis. With a body-mounted monocular camera, platform attitude changes modulate the image Jacobian and degrade observability; this was most visible on the under-constrained 3C platform and on moving-target trajectories. A pragmatic remedy is to decouple perception from platform attitude via a nadir-stabilized camera (a passive gravity gimbal or an active two-axis pitch/roll gimbal). Keeping the camera near-nadir reduces perspective foreshortening and roll/pitch coupling, so world x–y translations produce more uniform, linear pixel shifts; this flattens the mapping from world translations to image motion, raises $\sigma_{\min}$, and improves the residual signal-to-noise ratio, conditions under which both GSJ and RLS behave more predictably. In the same spirit, multi-view sensing or lightweight depth cues would further linearize the mapping and reduce reliance on platform pose to reduce camera-frame jitter.
Broader implications are twofold. For practitioners seeking robust, low-tuning UVS on spatial CDPRs, invest first in mechanical conditioning (redundant cables or equivalent stiffness/layout changes) and perception decoupling (a nadir-stabilized camera). In that envelope, a fixed GSJ is a strong baseline that meets accuracy and settling requirements without the brittleness of adaptation. For applications with evolving workspaces or payloads, where the Jacobian genuinely changes, RLS (or more expressive Jacobian models) remains attractive, but only with excitation management and safety supervision as above.
Limitations include single-camera sensing and a laboratory trajectory set; moreover, our jitter metric (high-pass RMS after dwell) captures residual motion but not all perceptual artifacts (such as blur). Future research should: (1) formalize the adaptation supervisor with provable bounds and auto-tuned gates; (2) explore richer, scheduled micro-excitation that is persistently exciting yet visually unobtrusive; (3) test nadir-stabilized or gimbal-mounted cameras and quantify their effect on $\sigma_{\min}$, residuals, and tracking; (4) extend to multi-view or visual-inertial sensing to break single-view degeneracies; and (5) compare fixed-$\hat{J}$, RLS, and hybrid/confidence-weighted controllers across broader tasks and payloads. Taken together, the evidence suggests a pragmatic path: fix the physics (conditioning and viewpoint), then add adaptation sparingly and safely where it truly buys invariance. Additionally, in this study we passed moving-averaged image coordinates into the RLS update to suppress sensor noise. While this improves numerical conditioning, it deviates from the standard RLS assumption of single-sample updates and may bias adaptation. Future work will analyze this effect explicitly or remove pre-filtering to ensure algorithmic consistency.