Figure 1.
Representative application scenarios. (a) Factory inspection. (b) Geographic survey and mapping. (c) Aerial retrieval.
Figure 2.
Comparison of the prior and proposed folding mechanisms. (a) Prior dual-servo design folding into a mm elongated strip. (b) Proposed single-servo crank-rocker design. (c) Engineering drawing with key dimensions.
Figure 3.
Coordinate frames and spatial relationships used in the dynamic model. : Earth-fixed ENU world frame (blue); : UAV body frame (red); : robot-dog back-surface frame (green). The position vector , rotation , gravity, and thrust vectors are annotated.
Figure 4.
Fold transition key frames from deployed () to fully folded (). Blue: body BBox; red dashed: propeller-tip BBox.
Figure 5.
Fold-angle dependence of control moment arms. (a,b) Roll and pitch moment arms per motor. (c) , dropping ∼48% from deployed to fully folded.
Figure 6.
Hardware prototype collage and annotated assembled configuration of the air–ground cooperative robot prototype.
Figure 7.
Perception-to-control pipeline used by the hardware concept. Depth-camera, LiDAR, dog-IMU, and UAV-inertial measurements are first processed by separate visual–inertial, LiDAR–inertial, and landing-plate registration modules, then fused into a relative-pose EKF. The FSM uses both the pose estimate and covariance to gate docking/takeoff phases and passes the resulting state to GFA-FEO, CFNTSM, FiLM-SAC, and the fold-aware mixer.
Figure 8.
Block diagram of the three-layer hierarchical control architecture. The FSM selects phase-dependent gains () for both the position and attitude loops. In each loop, GFA-FEO provides feedforward disturbance compensation, CFNTSM provides robust sliding-mode feedback, and FiLM-SAC adds a learned residual correction modulated by . The fold-aware mixer inverts in real time.
Figure 9.
Frequency-domain comparison of the disturbance estimation error (dB). Blue: standard FEO (, no internal model). Red: proposed GFA-FEO () with internal-model harmonics at . Green dashed: GFA-FEO in enhanced-gain mode (, ). Orange dotted lines mark the three gait-harmonic frequencies.
Figure 10.
Model-based performance ceiling analysis (noise-free evaluation). (a) Position RMSE under progressive addition of controller components, showing diminishing returns and a noise-free plateau at 5.1 mm; the dashed red line marks the 10 mm docking tolerance (22) and the dotted purple line indicates the mm with-noise baseline. (b) Per-axis RMSE decomposition at the model-based optimum, annotated with dominant bottleneck sources; the shaded region between the 5.1 mm plateau and the 10 mm tolerance represents the RL compensation margin.
Figure 11.
Integrated FiLM-SAC residual-learning architecture. The actor maps the observation and conditioning vector to a bounded residual action through FiLM modulation; the environment applies this residual on top of the CFNTSM + GFA-FEO baseline under domain randomisation and returns transitions stored in replay buffer . Twin critics share the encoder and are trained with SAC.
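The FiLM modulation named in this caption can be illustrated with a minimal NumPy sketch. It shows only the deterministic forward path of an actor whose hidden features are scaled and shifted by a conditioning vector before a tanh-bounded residual is produced. The layer sizes, the weight names, the contents of the conditioning vector, and the bound `a_max` are all illustrative assumptions and are not taken from the paper; the stochastic sampling, the twin critics, and the SAC update are omitted.

```python
import numpy as np


def film(h, cond, W_gamma, b_gamma, W_beta, b_beta):
    """Feature-wise linear modulation: gamma(cond) * h + beta(cond)."""
    gamma = cond @ W_gamma + b_gamma  # per-feature scale
    beta = cond @ W_beta + b_beta     # per-feature shift
    return gamma * h + beta


def init_params(rng, obs_dim, cond_dim, hidden, act_dim, scale=0.1):
    """Random weights for the sketch (hypothetical sizes and names)."""
    return {
        "W_in": scale * rng.standard_normal((obs_dim, hidden)),
        "b_in": np.zeros(hidden),
        "W_gamma": scale * rng.standard_normal((cond_dim, hidden)),
        "b_gamma": np.ones(hidden),   # start close to identity modulation
        "W_beta": scale * rng.standard_normal((cond_dim, hidden)),
        "b_beta": np.zeros(hidden),
        "W_out": scale * rng.standard_normal((hidden, act_dim)),
        "b_out": np.zeros(act_dim),
    }


def residual_action(obs, cond, p, a_max):
    """Map (observation, conditioning) to a residual bounded by a_max."""
    h = np.tanh(obs @ p["W_in"] + p["b_in"])
    h = film(h, cond, p["W_gamma"], p["b_gamma"], p["W_beta"], p["b_beta"])
    h = np.maximum(h, 0.0)            # ReLU after modulation
    return a_max * np.tanh(h @ p["W_out"] + p["b_out"])


rng = np.random.default_rng(0)
params = init_params(rng, obs_dim=12, cond_dim=3, hidden=32, act_dim=4)
obs = rng.standard_normal(12)
cond = np.array([0.5, 1.0, 0.0])      # assumed conditioning values
residual = residual_action(obs, cond, params, a_max=0.2)
```

Because the output passes through `tanh` and is scaled by `a_max`, the residual cannot exceed the assumed bound, so the baseline controller it is added to keeps its authority.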
Figure 12.
PSO convergence curve: global-best fitness versus iteration (18 D, 50 particles, per evaluation). Dashed red line: hand-tuned baseline fitness (re-evaluated). The swarm reaches 99% of its total improvement within 30 iterations.
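A statement such as "99% of the total improvement within 30 iterations" can be checked directly from a logged global-best curve. The sketch below assumes a minimisation problem and a list `gbest` holding one global-best fitness value per iteration; the function name and the sample curve are hypothetical and do not reproduce the paper's data.

```python
def iterations_to_fraction(gbest, fraction=0.99):
    """Return the first iteration index at which the global-best fitness
    has achieved `fraction` of its total improvement (minimisation)."""
    start, end = gbest[0], gbest[-1]
    total = start - end
    if total <= 0:
        return 0  # no improvement over the run
    target = start - fraction * total
    for i, f in enumerate(gbest):
        if f <= target:
            return i
    return len(gbest) - 1


# Made-up curve for illustration only.
curve = [10.0, 5.0, 2.0, 1.05, 1.0]
idx_99 = iterations_to_fraction(curve, 0.99)
idx_50 = iterations_to_fraction(curve, 0.50)
```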
Figure 13.
Monte Carlo cross-validation fitness landscape ( conditions per configuration, logarithmic colour scale). Dark green indicates low fitness (good); dark red indicates high fitness (failure). Solid contour: success boundary. (a) Hand-tuned parameters (SRall = 37%, SRop = 59%); (b) PSO-optimised parameters (SRall = 59%, SRop = 97%). Dashed box: operational band Hz.
Figure 14.
Experiment 1 representative trajectories in the -plane (dog-body frame). The panels show lateral (y) error versus altitude during descent and final attitude matching between and .
Figure 15.
Experiment 1 time series for M1 (Ours Full): vertical error, roll tracking, disturbance estimation, fold angle, and adaptive trust weight.
Figure 16.
Experiment 2: payload-adaptive takeoff (representative run). (a) Altitude tracking error: M1′ (no hot-switch) reaches cm due to the mass mismatch; M5 (PID) shows the largest overshoot. (b) FEO estimate : M1 (with hot-switch) maintains bounded m/s² oscillation, whereas M1′ diverges to m/s² bias, demonstrating hot-switch necessity. (c) Smoothed 3-D position error for M1, M1′, and M5: M1′ peaks at 85 mm; M1 achieves the lowest steady-state error among the three methods.
Figure 17.
Experiment 3: full mission cycle for a representative run. (a) Altitude tracking; (b) fold-angle cycle; (c) position-error norm; (d) vertical disturbance estimate. Background bands denote the six mission phases.
Figure 18.
6-DoF tracking error over the full 40 s mission. (a) Position errors , , (mm); (b) attitude errors , , (°). Background bands mark the six mission phases. The descent fold-through (–8 s) and takeoff mass transition ( s) produce the largest transients, both of which settle within 0.5 s.
Table 1.
Quadrotor dynamic-model parameters.
| Parameter | Symbol | Value |
|---|---|---|
| Total mass | m | 2.50 kg |
| Arm assembly mass (each) | | 0.18 kg |
| Arm length (pivot to motor) | | 116 mm |
| Propeller radius | | 65 mm |
| Thrust coefficient | | N·s² |
| Torque coefficient | | N·m·s² |
| Torque-to-thrust ratio | | 0.016 m |
| Rotor inertia | | kg·m² |
| Aerodynamic drag coeff. | | N·m·s |
| Roll/pitch inertia (deployed) | | kg·m² |
| Roll/pitch inertia (folded) | | kg·m² |
| Yaw inertia (deployed) | | kg·m² |
| Yaw inertia (folded) | | kg·m² |
Table 2.
Mission-phase FSM: phases and transition conditions.
| ID | Phase | Entry Condition |
|---|---|---|
| | approach | Mission start; UAV airborne, m |
| | align | m and m |
| | descend | Aligned: mm; arms begin folding () |
| | dock | Docking window: mm, |
| | lock | EPM activated; docking condition (22) satisfied |
| | stow | Locked; arms fold to ; motors idle |
| | takeoff | Stow complete; takeoff command received |
| | cruise | m and m/s |
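The phase list in Table 2 can be read as a finite-state machine that advances one phase at a time. The sketch below is only an illustration under two assumptions that the table does not confirm: the phases are visited strictly in the listed order, and there are no abort or fallback transitions. The numerical thresholds of the entry conditions are not reproduced here, so each condition is passed in as a precomputed boolean; `step` and `entry_ok` are hypothetical names.

```python
from enum import Enum


class Phase(Enum):
    APPROACH = "approach"
    ALIGN = "align"
    DESCEND = "descend"
    DOCK = "dock"
    LOCK = "lock"
    STOW = "stow"
    TAKEOFF = "takeoff"
    CRUISE = "cruise"


ORDER = list(Phase)  # Enum iteration follows definition order


def step(phase, entry_ok):
    """Advance to the next phase only if its entry condition holds.

    entry_ok maps a Phase to True when that phase's entry condition
    (evaluated elsewhere from the pose estimate) is satisfied.
    """
    idx = ORDER.index(phase)
    if idx + 1 < len(ORDER) and entry_ok.get(ORDER[idx + 1], False):
        return ORDER[idx + 1]
    return phase


next_phase = step(Phase.APPROACH, {Phase.ALIGN: True})
```

Keeping the guards outside the state machine makes the same transition logic reusable when the thresholds are changed or made covariance-dependent.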
Table 3.
Model-based performance ceiling: position RMSE under progressive component addition (10-seed average, nominal docking scenario, and noise-free evaluation). All configurations use PSO-optimised parameters.
| Configuration | (mm) | (mm) | (mm) | (mm) | (mm) | (°) |
|---|---|---|---|---|---|---|
| PID baseline | 15.2 ± 0.3 | — | — | — | — | 3.8 |
| CFNTSM only | 8.9 ± 0.1 | 4.8 | 2.5 | 7.0 | | 2.0 |
| CFNTSM + FEO (no IM) | 6.6 ± 0.1 | 3.5 | 2.0 | 5.2 | | 2.2 |
| CFNTSM + GFA-FEO | 5.1 ± 0.0 | 3.2 | 1.9 | 3.5 | | 2.4 |
Table 4.
PSO decision-variable space (18 dimensions).
| Layer | Symbol | Lower | Upper | Init. | Description |
|---|---|---|---|---|---|
| CFNTSM | | 3 | 50 | 12.66 | Position sliding gain |
| CFNTSM | | 1 | 25 | 5.63 | Position reaching gain |
| CFNTSM | | 0.05 | 3.0 | 0.102 | Position surface coeff. |
| CFNTSM | | 2 | 80 | 7.11 | Attitude sliding gain |
| CFNTSM | | 2 | 40 | 25.91 | Attitude reaching gain |
| CFNTSM | | 0.1 | 3.0 | 0.770 | Attitude surface coeff. |
| CFNTSM | | 1.05 | 1.8 | 1.064 | Terminal exponent |
| CFNTSM | | 0.3 | 0.98 | 0.691 | Finite-time exponent |
| GFA-FEO | | 0.3 | 5.0 | 1.61 | Position-channel gain |
| GFA-FEO | | 0.3 | 8.0 | 1.73 | Velocity-channel gain |
| GFA-FEO | | 0.1 | 4.0 | 0.206 | Disturbance-channel gain |
| GFA-FEO | | 1 | 60 | 2.91 | Position observer bandwidth |
| GFA-FEO | | 1 | 60 | 9.39 | Attitude observer bandwidth |
| GFA-FEO | | 0.05 | 0.45 | 0.188 | Fractional exponent |
| GFA-FEO | | 10 | 200 | 104.5 | IM adaptation rate |
| GFA-FEO | | 0.5 | 20 | 6.88 | Velocity correction gain |
| GFA-FEO | | 0.01 | 1.0 | 0.191 | -modification coeff. |
| GFA-FEO | | 10 | 100 | 99.4 | Disturbance saturation |
Table 5.
Hand-tuned vs. PSO-optimised parameter values.
| Parameter | Hand-Tuned | PSO-Optimised | Change |
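|---|---|---|---|
| Position sliding gain | 12.66 | 14.13 | |
| Position reaching gain | 5.63 | 4.51 | |
| Position surface coeff. | 0.102 | 0.101 | |
| Attitude sliding gain | 7.11 | 29.54 | |
| Attitude reaching gain | 25.91 | 33.08 | |
| Attitude surface coeff. | 0.770 | 1.097 | |
| Terminal exponent | 1.064 | 1.059 | |
| Finite-time exponent | 0.691 | 0.706 | |
| Position-channel gain | 1.61 | 1.37 | |
| Velocity-channel gain | 1.73 | 3.35 | |
| Disturbance-channel gain | 0.206 | 2.265 | |
| Position observer bandwidth | 2.91 | 12.57 | |
| Attitude observer bandwidth | 9.39 | 13.92 | |
| Fractional exponent | 0.188 | 0.069 | |
| IM adaptation rate | 104.5 | 14.93 | |
| Velocity correction gain | 6.88 | 17.67 | |
| -modification coeff. | 0.191 | 0.287 | |
| Disturbance saturation | 99.4 | 62.24 | |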
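The relative change between the hand-tuned and PSO-optimised values in Table 5 can be recomputed from the two value columns. The sketch below assumes the change is a signed percentage of the hand-tuned value, which is one plausible convention and may differ from the one used in the table. The row labels are borrowed from the descriptions in Table 4, on the assumption that the rows follow the same order (the hand-tuned values coincide with the Init. column there).

```python
def percent_change(hand_tuned, optimised):
    """Signed relative change of the optimised value, in percent."""
    return 100.0 * (optimised - hand_tuned) / hand_tuned


# (label, hand-tuned, PSO-optimised); values copied from Table 5.
ROWS = [
    ("Position sliding gain", 12.66, 14.13),
    ("Position reaching gain", 5.63, 4.51),
    ("Position surface coeff.", 0.102, 0.101),
    ("Attitude sliding gain", 7.11, 29.54),
    ("Attitude reaching gain", 25.91, 33.08),
    ("Attitude surface coeff.", 0.770, 1.097),
    ("Terminal exponent", 1.064, 1.059),
    ("Finite-time exponent", 0.691, 0.706),
    ("Position-channel gain", 1.61, 1.37),
    ("Velocity-channel gain", 1.73, 3.35),
    ("Disturbance-channel gain", 0.206, 2.265),
    ("Position observer bandwidth", 2.91, 12.57),
    ("Attitude observer bandwidth", 9.39, 13.92),
    ("Fractional exponent", 0.188, 0.069),
    ("IM adaptation rate", 104.5, 14.93),
    ("Velocity correction gain", 6.88, 17.67),
    ("Modification coeff.", 0.191, 0.287),
    ("Disturbance saturation", 99.4, 62.24),
]

changes = {label: percent_change(h, o) for label, h, o in ROWS}

for label, value in changes.items():
    print(f"{label:30s} {value:+8.1f} %")
```

Under this convention the largest shifts fall on the disturbance-channel gain and the IM adaptation rate, while the exponents and the position surface coefficient barely move.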
Table 6.
Monte Carlo cross-validation results ( = 2500 conditions per configuration).
| Metric | Hand-Tuned | PSO-Optimised |
|---|---|---|
| Overall success rate (%) | 37 | 59 |
| Operational-band SR (%, Hz) | 59 | 97 |
Table 7.
Compared methods and active components. A check mark indicates that the component is active, whereas "—" indicates that the component is absent.
| ID | Method | CFNTSM | GFA-FEO | FiLM-SAC | -Sched. | Adaptive |
|---|---|---|---|---|---|---|
| M1 | Ours (Full) | ✓ | ✓ | ✓ | ✓ | ✓ |
| M2 | w/o RL | ✓ | ✓ | — | ✓ | — |
| M3 | w/o GFA | ✓ | FEO only | ✓ | — | ✓ |
| M4 | Fixed- | ✓ | ✓ | ✓ | ✓ | |
| M5 | PID | PID | — | — | — | — |
Table 8.
Experiment 1 results ( Hz): precision landing and docking (100 runs per method; mean ± std).
| Metric | M1 | M2 | M3 | M4 | M5 |
|---|---|---|---|---|---|
| Pos. RMSEdock (mm) ↓ | | | | | |
| Att. RMSEdock (°) ↓ | | | | | |
| Pos. RMSEdesc (mm) ↓ | | | | | |
| Contact vel. (m/s) ↓ | | | | | |
| Success rate (%) ↑ | | 99 | 92 | 100 | 94 |
| Peak att. excursion (°) ↓ | | | | | |
Table 9.
Experiment 2 results: payload-adaptive takeoff (100 runs per method; mean ± std).
| Metric | M1 | M2 | M1′ (No HS) | M5 |
|---|---|---|---|---|
| Altitude overshoot (cm) ↓ | | | | |
| Settling time to ±5 cm (s) ↓ | | | | |
| Peak roll excursion (°) ↓ | | | | |
| Pos. RMSE s (mm) ↓ | | | | |
Table 10.
Experiment 3: per-phase position RMSE for M1 (Ours Full) and M5 (PID), each over 100 runs (mean ± std). Attitude RMSE (M1) is reported for the phases where gait-coupled attitude tracking is most critical.
| Phase | M1 Pos. RMSE (mm) | M5 Pos. RMSE (mm) | M1 Att. RMSE (°) | Key Challenge |
|---|---|---|---|---|
| I Approach | | | — | Gait tracking |
| II Descend | | | | fold-through |
| III Dock | | | | Surface hold |
| IV Takeoff | | | — | Mass doubling |
| V Cruise | | | — | Coupled cruise |
| VI Station | | | — | Long-term hold |
| Peak error (mission-wide) | M1: mm/M5: mm | | | |
| Docking time | M1: s/M5: s | | | |
Table 11.
Summary of key results across experiments.
| Finding | Exp. | Compared | Result |
|---|---|---|---|
| RL residual necessity | 1 | M1 vs. M2 | Dock RMSE % ( vs. mm, ) |
| GFA internal model | 1 | M1 vs. M3 | Dock RMSE % ( vs. mm), SR pp |
| Adaptive | 1 | M1 vs. M4 | Dock RMSE % ( vs. mm, ) |
| FEO hot-switch | 2 | M1 vs. M1′ | Pos. RMSE % ( vs. mm, ) |
| CFNTSM + FEO vs. PID | 2 | M1 vs. M5 | Pos. RMSE % ( vs. mm, ) |
| Full-cycle feasibility | 3 | M1 vs. M5 | 100% SR; peak err. M1 vs. M5 mm |