5.2.1. Algorithm Implementation
To evaluate the overall performance of the proposed CCPP and competing methods in heterogeneous UAV–USV collaboration, we conduct multiple independent trials for seven algorithms across three representative scenarios. Key metrics related to mission efficiency, safety, and constraint satisfaction are recorded. Given the large number of algorithm–scenario combinations, results for each metric are summarized using grouped bar charts, where bar heights represent the mean values over trials and error bars indicate the variability across trials. This allows for simultaneous assessment of average performance and stability. Unless otherwise stated, percentage improvements are reported relative to the strongest competing baseline (the best non-CCPP method) in the same scenario. The corresponding results are presented in
Figure 5,
Figure 6,
Figure 7,
Figure 8,
Figure 9 and
Figure 10.
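The aggregation described above can be sketched as follows. This is a minimal illustration, not the evaluation code used in the paper: the function names, the use of the sample standard deviation for the error bars, and the trial values are all assumptions made for the example.

```python
from statistics import mean, stdev

def summarize(trials):
    """Return (mean, sample standard deviation) of one method's trial values."""
    return mean(trials), (stdev(trials) if len(trials) > 1 else 0.0)

def improvement_over_best_baseline(results, proposed="CCPP", lower_is_better=True):
    """Percentage improvement of `proposed` over the strongest other method.

    `results` maps a method name to its list of per-trial metric values.
    """
    means = {name: mean(vals) for name, vals in results.items()}
    baselines = {n: m for n, m in means.items() if n != proposed}
    pick = min if lower_is_better else max
    best_name = pick(baselines, key=baselines.get)
    best, ours = baselines[best_name], means[proposed]
    change = (best - ours) if lower_is_better else (ours - best)
    return best_name, 100.0 * change / abs(best)

# Hypothetical trial values, for illustration only (not the paper's data).
demo = {
    "CCPP": [0.5, 1.0, 1.5],
    "TETO": [1.5, 2.0, 2.5],
    "PSC": [3.5, 4.0, 4.5],
}
best_name, gain = improvement_over_best_baseline(demo)
```

Selecting the reference baseline per scenario, rather than fixing one method globally, is what makes the reported percentages comparable across scenarios in which different baselines are strongest.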
The temporal-consistency metric
quantifies the dispersion of the arrival-time distribution across the heterogeneous swarm, where a lower value indicates better synchronized coordination quality. As shown in
Figure 5,
CCPP achieves the lowest value in all three scenarios. In
Scene A, it attains
, representing a
10.3% improvement over the
second-best method,
TETO (
). For
Scene B, CCPP’s score of
yields a substantial
27.5% reduction compared to
RRT_ACS (
). Even in the more challenging
Scene C, CCPP further decreases
to
, outperforming the
runner-up ELP (
) by a remarkable
50.9%. Moreover, the relatively low variability in
Scenes B and C underscores the consistency of CCPP’s performance across random seeds. Collectively, these results show that CCPP achieves the tightest arrival-time synchronization among the compared methods in every scenario.
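To make the notion of arrival-time dispersion concrete, the sketch below computes the standard Gini coefficient of a set of arrival times. That the temporal-consistency metric takes exactly this form is an assumption suggested by the Gini-based synchronization mechanism discussed later in this section; the function name and sample values are illustrative.

```python
def gini(values):
    """Standard Gini coefficient of non-negative values (0 means all equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form: G = 2*sum(i*x_i)/(n*sum(x)) - (n+1)/n, with i = 1..n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

# Hypothetical arrival times in seconds (illustration only).
tight = gini([90.0, 100.0, 110.0])   # agents arrive close together
loose = gini([50.0, 100.0, 150.0])   # agents arrive far apart
```

Under this assumed definition, a team whose agents arrive nearly simultaneously yields a value close to zero, which matches the "lower is better" reading used above.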
Planning step time
quantifies the average computational cost per online planning cycle; a smaller value indicates faster replanning at a given control rate and thus stronger real-time responsiveness. As evidenced in
Figure 6, CCPP achieves the lowest or near-lowest millisecond-level step time across all scenarios. In
Scene A, CCPP reaches
ms, which is only
3.5% higher than the best
TETO at
ms, while still reducing the step time by
16.1% relative to
DRLC ms and by
29.4% relative to
PSC ms. In
Scene B, CCPP attains the best performance with
ms, yielding a
31.8% reduction compared with the runner-up
RRT_ACS ms. In the more challenging
Scene C, CCPP further decreases the step time to
ms, outperforming
RRT_ACS ms by
20.8%. Overall, these results indicate that CCPP keeps the per-step planning overhead at or near the lowest level among the compared methods, enabling higher-frequency online coordination and faster mission-level response.
Task completion time
measures the total elapsed time required to accomplish the “detect-all/verify-all” objective. A smaller
indicates a higher throughput of the detect–confirm loop and thus more responsive online coordination. Since completion times span different orders of magnitude across scenarios and methods,
Figure 7 reports
on a logarithmic scale for readability. In
Figure 7, timeouts are plotted at
as a visualization penalty on the log scale, while the actual mission horizon is
21,600
and timeout cases are handled in a method-agnostic manner during statistical aggregation. According to
Figure 7, CCPP consistently yields the shortest (or near-shortest) completion time across all scenarios. In
Scene A, CCPP achieves
, reducing the completion time by
36.3% relative to the fastest baseline
PSC (
). In
Scene B, CCPP attains
, shortening the mission duration by
43.2% relative to the fastest competing baseline
ELP (
) and by
48.9% relative to
PSC (
). In the larger-scale
Scene C, CCPP maintains the mission time at
; while slightly slower than
RRT_ACS (
) in this setting, it remains substantially faster than
PSC/RAST/DRLC (approximately
), corresponding to a
35.9% reduction in completion time. Overall, these results demonstrate that CCPP can markedly compress the end-to-end mission time and thereby improve task throughput and responsiveness under online cooperative planning.
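The log-scale treatment of timeouts described above can be sketched as follows. The horizon of 21,600 is taken from the text; the penalty value at which timeouts are drawn is left as a caller-supplied parameter, and the function names and sample values are assumptions made for this illustration.

```python
import math

MISSION_HORIZON = 21_600  # mission horizon stated in the text

def plot_value(completion_time, timeout_penalty):
    """Value drawn on the log-scale axis; None marks a timed-out trial."""
    if completion_time is None or completion_time >= MISSION_HORIZON:
        return timeout_penalty
    return completion_time

def log10_heights(times, timeout_penalty):
    """Bar heights in log10 units for a list of per-trial completion times."""
    return [math.log10(plot_value(t, timeout_penalty)) for t in times]

# Hypothetical trials: two completed runs and one timeout (illustration only).
heights = log10_heights([100.0, None, 1000.0], timeout_penalty=1e5)
```

Applying the same substitution to every method is one simple way to keep timeout handling method-agnostic, since no algorithm receives a different penalty from another.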
The minimum separation metric
captures the closest inter-agent distance over time, and larger values indicate a larger safety margin. As can be observed from
Figure 8,
CCPP maintains a consistently high and well-balanced safety margin across all scenarios, achieving
,
, and
in
Scenes A–C, respectively. In
Scene A, CCPP improves upon
DRLC from
to
, corresponding to an increase of about
6.7%. In
Scene C, CCPP surpasses
ELP from
to
, yielding an improvement of about
20.0%. While
RAST attains the highest mean in
Scene B at
, it also exhibits the
largest uncertainty, indicating substantial variability and reduced robustness; therefore, a higher mean alone does not imply superior overall safety performance. Considering both mean and dispersion across scenarios,
CCPP offers a favorable trade-off by preserving a high safety margin with more controlled variability, demonstrating more reliable collision avoidance and safety constraint satisfaction.
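The minimum separation metric can be illustrated with a short sketch that scans all agent pairs at every sampled time step. It assumes planar positions sampled synchronously for all agents; the data layout, function name, and sample trajectories are hypothetical.

```python
import math
from itertools import combinations

def min_separation(trajectories):
    """Smallest pairwise Euclidean distance over all shared time steps.

    `trajectories` maps an agent id to a list of (x, y) positions sampled
    at the same time steps for every agent.
    """
    steps = min(len(path) for path in trajectories.values())
    best = math.inf
    for k in range(steps):
        for a, b in combinations(trajectories, 2):
            best = min(best, math.dist(trajectories[a][k], trajectories[b][k]))
    return best

# Hypothetical two-agent example (illustration only).
demo = {
    "uav_1": [(0.0, 0.0), (3.0, 0.0)],
    "usv_1": [(10.0, 0.0), (3.0, 4.0)],
}
closest = min_separation(demo)
```

Because the metric is a minimum over the whole mission, a single close approach dominates the value, which is why the dispersion across trials matters alongside the mean.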
Figure 9 summarizes the coverage satisfaction metric
, which evaluates how well the coverage requirement is satisfied; a higher value indicates more complete and stable area coverage.
CCPP consistently achieves the best coverage satisfaction across all scenarios, reaching
98.2%/95.8%/93.0% in
Scenes A/B/C, respectively. Compared with the runner-up
RRT_ACS, CCPP leads by
0.9/0.9/0.8 percentage points in A/B/C. Against other representative baselines, CCPP outperforms
PSC by
3.3/4.3/5.4 percentage points and exceeds
RAST by
1.3/2.4/3.5 percentage points in A/B/C. These results demonstrate that CCPP provides stronger coverage-constraint assurance and more reliable coverage quality across varying scenario difficulties.
Figure 10 reports the verification satisfaction score
, which measures how reliably the team completes the full discover-and-confirm workflow; higher values indicate more consistent confirmation of all targets.
CCPP reaches 92.6% in Scene A, improving over
PSC, ELP, RAST, TETO, and DRLC by
3.9, 3.0, 2.1, 4.5, and 2.5 percentage points, and exceeding the next-best
RRT_ACS by
1.7 percentage points. As the scenario becomes more challenging,
CCPP remains at 88.6% in Scene B and 86.0% in Scene C, staying within
0.1 and
0.2 percentage points of the best method while still outperforming
DRLC by
1.8 and
2.9 percentage points, respectively, which indicates robust verification performance under increasing scenario complexity.
5.2.2. Ablation Study
To isolate the contribution of each core module of the CCPP framework in complex, dynamic environments, we conducted systematic ablation experiments on the most challenging setting, Scene C. These analyses are intended to provide both component-level sensitivity evidence and repeated-run statistical evidence for separating the roles of the decision-making layer, the vision-to-planning interface, and the consistency-related design within the overall framework. By ablating, in turn, the AIDT meta-strategy module, the vision-to-planning interface module, and the Gini-based equalization synchronization mechanism, three ablation variants were derived: NoAIDT, NoVision, and NoGini. In particular, the NoVision setting is used to explicitly evaluate the contribution of the vision perception interface by removing the uncertainty-aware perceptual structuring while keeping the remaining planning framework unchanged as far as possible. All ablation experiments were performed over 20 independent trials, and the time evolution of six key metrics was recorded. The following three figure groups focus on planning computational efficiency, coordination consistency, and mission reliability, as well as mission coverage and collision risk, respectively. Each group consists of two subfigures, accompanied by detailed quantitative analysis.
For the time-series ablation analysis in Scene C, we use a transformed progress-oriented score, denoted by , which is derived from the raw temporal-consistency metric . Specifically, is obtained from the raw through a monotonic visualization-oriented transformation and is used only for the Scene C time-series ablation plots. This transformed quantity is adopted only for visualization purposes, so as to more intuitively reflect cumulative coordination progress within the fixed time horizon. Under this representation, a higher value of indicates better progress in the ablation analysis, whereas in the main cross-scenario comparison the raw is retained as the formal temporal-consistency metric, for which a lower value indicates better coordination quality.
Figure 11 reports the evolution of the progress-oriented coordination score
in
Scene C under the ablation settings within 0–
, in which the full
CCPP consistently exhibits the strongest late-stage coordination progress. As noted above, a higher value of this transformed score indicates better progress. Specifically,
, while the three ablated variants yield
,
, and
. Accordingly, under this progress-oriented representation,
CCPP achieves a higher value than NoAIDT, a
higher value than NoGini, and still a
higher value than NoVision. These results indicate that the
AIDT module contributes most significantly to sustained coordination progress, and removing it leads to a pronounced degradation in late-stage cooperative effectiveness; meanwhile, the
Gini-based synchronization mechanism continues to provide tangible benefits in the later phase of the mission.
Figure 12 compares the planning computation cost
in
Scene C within 0–
, where
characterizes the average per-step decision-making latency. The full
CCPP achieves a markedly lower computational cost in the later stage, with
, whereas the ablated variants yield
,
, and
. Consequently,
CCPP reduces by relative to NoAIDT, by
relative to NoVision, and by
relative to NoGini. This demonstrates that, enabled by more effective information fusion and cooperative policy execution,
CCPP avoids frequent high-cost replanning in the later phase, thereby substantially reducing computational burden and improving real-time feasibility.
Figure 13 presents the minimum safety distance
over 0–
for the
Scene C ablation study, where
quantifies multi-agent collision-avoidance capability during mission execution. The curves indicate that the full
CCPP maintains a substantially larger safety margin in the mid-to-late phase. In heterogeneous multi-agent swarms, a minimum inter-agent separation of
is enforced as the absolute safety threshold (indicated by the red dashed line). Over time, all three ablated variants repeatedly approach this critical boundary, whereas the complete
CCPP does not. Specifically,
, while
and
; consequently,
CCPP improves by over NoAIDT and by
over NoVision. Moreover, at the mid-phase (
),
CCPP achieves whereas
NoAIDT only reaches , yielding a gap close to
, which further confirms that
removing AIDT markedly weakens safety coordination.
Figure 14 reports the task completion-time CDF for
Scene C over 0–
, where the CDF characterizes the fraction of trials that have completed the mission by a given time; a curve that rises earlier and is left-shifted indicates higher completion efficiency. The
CCPP curve leads consistently across the entire range, suggesting faster and more stable mission execution. For instance, at
,
CCPP completes of the tasks, while
NoAIDT,
NoVision, and
NoGini complete only
,
, and
, respectively; in particular,
CCPP exceeds NoVision by in completion rate within
. More importantly, the median completion time (
) shows
, whereas
NoAIDT,
NoVision, and
NoGini yield
,
, and
, respectively; accordingly,
CCPP reduces the median completion time by relative to NoAIDT, demonstrating its pronounced execution-efficiency advantage in large-scale scenarios.
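The completion-time CDF and the median completion time discussed above can be computed as in the sketch below. Representing a timed-out trial by an infinite completion time is an assumption of this example, as are the function names and the sample values.

```python
import math
from statistics import median

def completion_cdf(times, t):
    """Fraction of trials finished by time t; math.inf marks a timeout."""
    return sum(1 for c in times if c <= t) / len(times)

def median_completion(times):
    """Median completion time; finite when fewer than half the trials time out."""
    return median(times)

# Hypothetical completion times for five trials, one of them a timeout.
demo_times = [300.0, 400.0, 500.0, 700.0, math.inf]
cdf_at_450 = completion_cdf(demo_times, 450.0)
t50 = median_completion(demo_times)
```

Counting timeouts in the denominator keeps the CDF from reaching 1 for a variant that fails some trials, so the curve reflects both speed and reliability.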
Figure 15 illustrates the temporal evolution of the coverage efficiency
over 0–
, which quantifies the effective coverage gain per unit time. The full
CCPP exhibits the strongest reward-focusing capability in the late phase: near
,
, whereas
and
. Accordingly,
CCPP improves by 24.8% over NoVision and by 34.3% over NoGini. This indicates that the
visual observation module enhances online perception and reward focusing on effective coverage regions, while the
Gini-based scheduling mechanism further improves resource/path allocation efficiency across agents, leading to sustained late-stage gains.
To characterize the reliability and timeliness of closing the verification loop,
Figure 16 reports the time-series behavior of the verification efficiency
using the cumulative completion ratio under a given time threshold. Overall, CCPP remains higher and rises earlier across most of the horizon, suggesting that it more stably drives trials toward verified completion under the highly dynamic and strongly constrained Scene C. At
, CCPP reaches
0.44–0.45, while NoAIDT, NoVision, and NoGini attain
0.26,
0.20–0.22, and
0.35, respectively; the corresponding relative gains of CCPP are
+70%/+100%/+25%. At
, CCPP reaches
0.55, compared with
0.35/0.30/0.34 for NoAIDT/NoVision/NoGini, yielding gains of
+57%/+83%/+62%. By
, CCPP achieves
1.0, whereas the ablated variants remain within
0.85–0.90, implying that the full framework not only completes earlier but also attains a higher final completion rate with fewer failures/timeouts. In terms of module contributions, NoVision shows a clear lag around 350–
, indicating insufficient structured representation of dynamic risks and uncertainties without the vision–planning interface; NoAIDT stays below CCPP throughout, suggesting weaker online replanning due to the absence of the AIDT meta-policy for efficient mode switching and weight guidance; and NoGini briefly approaches CCPP around 200–
but falls behind over 300–
, reflecting degraded spatiotemporal coordination in critical phases without the synchronization/equilibrium mechanism.
5.2.3. Supplementary Ablation on Path–Speed Coupling
To further isolate the contribution of the proposed path–speed coupling mechanism, we introduce an additional ablation variant based on decoupled sequential optimization. In this setting, geometric path planning and speed coordination are performed in sequence rather than jointly within the rolling optimization loop. More specifically, the path is first generated under the same environmental and safety constraints, after which a separate speed-coordination step is applied on the resulting path. All remaining components, including the vision-to-planning interface, the AIDT-based meta-strategy, and the consistency-related modules, are kept unchanged as far as possible.
The supplementary ablation is conducted on Scene C, which is the most challenging dynamic scenario in this paper. For fairness, the original CCPP joint optimization scheme and the decoupled sequential variant are evaluated under identical initialization settings, target realizations, and random seeds. Each method is tested over independent runs. We report the synchronization indicator , total mission time , planning time per cycle , minimum safety distance , coverage rate , and verification rate in order to evaluate the effect of path–speed coupling on temporal coordination, computational efficiency, safety, and mission effectiveness.
This supplementary comparison thus provides more direct component-level attribution for the proposed joint path–speed coupling mechanism than the main system ablation results alone.
As shown in
Table 6, the proposed joint path–speed optimization scheme exhibits clear overall advantages over the decoupled sequential variant in the most critical aspects of heterogeneous cooperative planning. In terms of temporal coordination, the joint scheme reduces the synchronization indicator
from 0.33 to 0.13, indicating substantially stronger arrival-time consistency among heterogeneous agents. In terms of mission completion efficiency,
is reduced from 885 s to 552 s, showing that the proposed path–speed coupling enables the swarm to complete the cooperative search-and-confirmation task much faster. In terms of online computational efficiency, the average planning time per cycle is further reduced from 1.39 ms to 1.19 ms, suggesting that the coupled formulation does not increase online burden but instead yields a more efficient closed-loop planning process. In terms of mission effectiveness, the coupled scheme also achieves better task outcomes, improving the coverage rate
from 86% to 92% and, more importantly, raising the verification rate
from 39% to 84%, which demonstrates a much stronger ability to complete the cue–confirmation chain under the challenging Scene C setting. It should also be noted that the decoupled variant attains a larger minimum safety distance
than the joint scheme. This suggests that the sequential strategy tends to behave more conservatively in spatial separation. However, such a gain is obtained together with markedly weaker synchronization, slower task completion, and much lower verification performance. Therefore, the overall results indicate that the proposed joint path–speed coupling mechanism achieves a more desirable balance among coordination consistency, completion efficiency, online planning efficiency, and mission effectiveness, which more convincingly supports its necessity in the CCPP framework.
5.2.4. Supplementary Comparison with a CV-Based Consistency Metric
To further analyze the influence of different consistency metrics on the temporal coordination behavior of heterogeneous swarms, we introduce a comparison group based on the coefficient of variation (CV) while keeping the visual interface, the AIDT mode-switching logic, and the path–speed co-optimization framework unchanged, and compare it with the original Gini-based consistency metric. Specifically, in addition to the original CCPP-Gini scheme, a CCPP-CV variant is constructed, in which the consistency evaluation and synchronization-trigger mechanism are reformulated using the arrival-time CV and the error-domain CV, while all other modules remain unchanged. For completeness, the arrival-time CV and error-domain CV used in the supplementary comparison are defined as:
where
is a small positive constant introduced to avoid denominator degeneration. In addition, to provide a metric-neutral synchronization indicator, we further introduce the mean pairwise arrival-time gap
This comparative experiment is conducted on Scene C, which is the most challenging dynamic scenario considered in this paper. Under identical random seeds, initialization conditions, and target realizations, both CCPP-Gini and CCPP-CV are evaluated over
independent runs. Since the purpose of this comparison is to examine how different consistency metrics affect the coordination evolution trend rather than the final task completion instant, all metrics are evaluated at a common fixed horizon.
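For readers who wish to reproduce the two supplementary indicators, the sketch below gives one plausible implementation of a coefficient of variation with a guarded denominator and of the mean pairwise arrival-time gap. The use of the population standard deviation, the value of the small constant, and all names are assumptions; the paper's exact definitions may differ.

```python
from itertools import combinations
from statistics import mean, pstdev

EPS = 1e-9  # small positive constant guarding the denominator (value assumed)

def coefficient_of_variation(values, eps=EPS):
    """Population standard deviation divided by (|mean| + eps)."""
    return pstdev(values) / (abs(mean(values)) + eps)

def mean_pairwise_gap(arrival_times):
    """Average absolute arrival-time difference over all agent pairs."""
    pairs = list(combinations(arrival_times, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)

# Hypothetical arrival times in seconds for three agents (illustration only).
demo_arrivals = [100.0, 200.0, 400.0]
gap = mean_pairwise_gap(demo_arrivals)
cv = coefficient_of_variation(demo_arrivals)
```

The pairwise gap depends only on raw time differences, so it favors neither the Gini-based nor the CV-based formulation, which is what makes it suitable as a metric-neutral indicator.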
Table 5 reports the comparative results in terms of the minimum safety distance
, the planning time per cycle
, the coverage rate
, the verification rate
, and the mean pairwise arrival-time gap
. This supplementary comparison also serves as an empirical response to the metric-selection question, showing that the Gini-based design is not only theoretically well grounded, but also practically competitive under the small-scale heterogeneous swarm setting considered in this paper.
Under the fixed 300 s evaluation horizon, a supplementary comparison between CCPP-Gini and CCPP-CV was conducted, and the results are summarized in
Table 5. Overall, the two consistency metrics exhibit highly comparable performance in terms of the mean pairwise arrival-time gap
, where CCPP-CV achieves 286.37 and CCPP-Gini yields 287.20. The difference between the two is marginal, indicating that CCPP-Gini can maintain a level of temporal coordination comparable to that of CCPP-CV on this metric. In contrast, CCPP-Gini demonstrates clearer overall advantages on the remaining key indicators: the planning time per cycle
is reduced from 1.45 ms to 1.36 ms, corresponding to a decrease of approximately 6.2%; the minimum safety distance
is improved from 636.56 to 660.05, corresponding to an increase of approximately 3.7%; the coverage rate
is increased from 0.01 to 0.02; and the verification rate
is improved from 0.39 to 0.44, corresponding to a relative increase of approximately 12.8%. Moreover, from the perspective of standard deviation, CCPP-Gini exhibits smaller fluctuations in
,
,
, and
, with values of 12.26, 6.88, 0.05, and 0.03, respectively, all lower than those of CCPP-CV (25.48, 35.90, 0.08, and 0.08). This indicates that CCPP-Gini achieves more stable performance across different random seeds. Overall, the Gini-based consistency metric maintains synchronization performance comparable to that of the CV-based alternative, while exhibiting superior online efficiency, safety margin, task advancement capability, and cross-seed stability.
Together with the main ablation study and the path–speed coupling comparison, this metric-level comparison further strengthens the sensitivity-style analysis of how individual components contribute to the overall performance of the proposed CCPP framework.
5.2.5. Supplementary Packet-Loss Sensitivity Analysis
Table 7 reports the supplementary packet-loss sensitivity results in
Scene C. As the packet-loss rate increases from
0% to
20%, all methods exhibit a certain degree of performance degradation, which is consistent with the increased difficulty of maintaining timely coordination under communication uncertainty. Nevertheless, the overall relative ranking remains stable across all packet-loss settings.
Specifically, the full CCPP consistently achieves the best overall trade-off among coordination quality, mission efficiency, verification performance, and online planning latency. At packet-loss rates of 0%, 10%, and 20%, CCPP attains values of 0.138, 0.145, and 0.148, respectively, compared with 0.281/0.303/0.320 for ELP and 0.293/0.325/0.346 for RAST. The corresponding verification satisfaction values of CCPP remain at 86.3%, 82.0%, and 80.3%, while the average planning latency stays low at 1.22, 1.281, and 1.305 ms, respectively. Relative to ELP, CCPP reduces by 50.9%/52.1%/53.8%; relative to RAST, the reductions are 52.9%/55.4%/57.2%. In addition, CCPP improves over ELP by 3.5/5.8/9.1 percentage points and over RAST by 2.7/7.6/11.7 percentage points across the three packet-loss settings. These results indicate that, although packet loss causes moderate absolute degradation, the relative ranking remains unchanged and the superiority of the full CCPP framework is preserved under this supplementary communication-uncertainty perturbation.
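The relative reductions quoted above follow directly from the listed values, as the short check below shows. The three lists copy the first set of values quoted in the paragraph for CCPP, ELP, and RAST at 0%, 10%, and 20% packet loss; the function and variable names are illustrative.

```python
def relative_reduction(ours, baseline):
    """Percentage by which `ours` falls below `baseline`."""
    return 100.0 * (baseline - ours) / baseline

# Values at 0%, 10%, and 20% packet loss, as quoted in the text.
CCPP = [0.138, 0.145, 0.148]
ELP = [0.281, 0.303, 0.320]
RAST = [0.293, 0.325, 0.346]

vs_elp = [relative_reduction(c, b) for c, b in zip(CCPP, ELP)]
vs_rast = [relative_reduction(c, b) for c, b in zip(CCPP, RAST)]
```

The computed reductions agree with the reported 50.9%/52.1%/53.8% and 52.9%/55.4%/57.2% to within rounding, and they grow with the packet-loss rate because the baselines degrade faster than CCPP.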