Joint Path Planning and Energy Replenishment Optimization for Maritime USV–UAV Collaboration Under BeiDou High-Precision Navigation

Yang, Jingfeng; Zhao, Lingling; Peng, Bo

doi:10.3390/drones9110746


by Jingfeng Yang ^1,2, Lingling Zhao ^1,2,* and Bo Peng ^3,*

¹ Key Lab of Guangdong for Utilization of Remote Sensing and Geographical Information System, Guangdong Open Laboratory of Geospatial Information Technology and Application, Guangzhou 510070, China

² Guangzhou Institute of Geography, Guangdong Academy of Sciences, Guangzhou 510070, China

³ College of Environment and Climate, Jinan University, Guangzhou 511443, China

^* Authors to whom correspondence should be addressed.

Drones 2025, 9(11), 746; https://doi.org/10.3390/drones9110746

Submission received: 10 September 2025 / Revised: 17 October 2025 / Accepted: 28 October 2025 / Published: 28 October 2025

(This article belongs to the Special Issue Advances in Intelligent Coordination Control for Autonomous UUVs)


Highlights

What are the main findings?

A novel bi-level optimization framework enables closed-loop cooperative planning between a mobile USV and multiple UAVs, significantly improving mission efficiency.
A hybrid intelligence algorithm (GA + MARL) effectively solves multi-objective optimization under energy and dynamic constraints.
The method achieves superior performance: >93% task completion, 8–15% shorter mission time, and ~30% less waiting time for replenishment.

What is the implication of the main finding?

Solves the critical endurance limitation of UAVs in large-scale maritime missions, enabling longer and more complex operations.
Reduces operational costs and risks through optimized energy use and minimized UAV downtime.
Demonstrates the practical value of BeiDou high-precision positioning in enhancing the autonomy and reliability of unmanned marine systems.

Abstract

With the rapid growth of demands in marine resource exploitation, environmental monitoring, and maritime safety, cooperative operations based on Unmanned Surface Vehicles (USVs) and Unmanned Aerial Vehicles (UAVs) have emerged as a promising paradigm for intelligent ocean missions. UAVs offer flexibility and high coverage efficiency but suffer from limited endurance due to restricted battery capacity, making them unsuitable for large-scale tasks alone. In contrast, USVs provide long endurance and can serve as mobile motherships and energy-supply platforms, enabling UAVs to take off, land, recharge, or replace batteries. Therefore, how to achieve cooperative path planning and energy replenishment scheduling for USV–UAV systems in complex marine environments remains a crucial challenge. This study proposes a USV–UAV cooperative path planning and energy replenishment optimization method based on BeiDou high-precision positioning. First, a unified system model is established, incorporating task coverage, energy constraints, and replenishment scheduling, and formulating the problem as a multi-objective optimization model with the goals of minimizing total mission time, energy consumption, and waiting time, while maximizing task completion rate. Second, a bi-level optimization framework is designed: the upper layer optimizes the USV’s dynamic trajectory and docking positions, while the lower layer optimizes UAV path planning and battery replacement scheduling. A closed-loop interaction mechanism is introduced, enabling the system to adaptively adjust according to task execution status and UAV energy consumption, thus preventing task failures caused by battery depletion. Furthermore, an improved hybrid algorithm combining genetic optimization and multi-agent reinforcement learning is proposed, featuring adaptive task allocation and dynamic priority-based replenishment scheduling. 
A comprehensive reward function integrating task coverage, energy consumption, waiting time, and collision penalties is designed to enhance global optimization and intelligent coordination. Extensive simulations in representative marine scenarios demonstrate that the proposed method significantly outperforms baseline strategies. Specifically, it achieves a higher task completion rate, shorter mission time, lower total energy consumption, and shorter waiting time. Moreover, the variance of energy consumption across UAVs is notably reduced, indicating a more balanced workload distribution. These results confirm the effectiveness and robustness of the proposed framework in large-scale, long-duration maritime missions, providing valuable insights for future intelligent ocean operations and cooperative unmanned systems.

Keywords:

Unmanned Surface Vehicle (USV); Unmanned Aerial Vehicle (UAV); BeiDou high-precision positioning; cooperative path planning; energy replenishment optimization; multi-UAV scheduling; maritime intelligent operations

1. Introduction

With the continuous growth of demands in marine resource exploitation, environmental protection, and maritime safety, intelligent ocean operations based on unmanned systems have become a research hotspot in both academia and industry. Unmanned Surface Vehicles (USVs) and Unmanned Aerial Vehicles (UAVs), as two representative types of unmanned platforms, respectively, possess the advantages of strong endurance and high maneuverability. In recent years, they have demonstrated broad application prospects in tasks such as marine environmental monitoring, search and rescue operations, target recognition, and resource exploration [1,2,3]. Among them, UAVs are capable of rapidly covering large areas and conducting high-precision detection. However, their endurance is constrained by limited battery capacity, making it difficult to perform tasks continuously over long durations [4]. In contrast, USVs can act as “motherships” for UAVs, providing take-off, landing, charging, or battery replacement platforms, thereby extending UAVs’ operational time and expanding their mission range [5,6]. Therefore, how to achieve cooperative path planning and energy replenishment scheduling between USVs and UAVs has become a key scientific problem for the intelligent execution of large-scale maritime tasks.

In recent years, extensive research has been conducted on UAV path planning and task scheduling. In coverage path planning and energy-constrained scenarios, methods based on deep reinforcement learning [7], heuristic algorithms [8], and transformer models [9] have been proposed to enhance coverage efficiency and computational scalability in large-scale multi-UAV systems. Under energy constraints, existing studies have explored UAV path optimization and replenishment strategies with fixed charging stations [10,11]. However, most of these studies assume UAVs operate independently, without joint modeling and optimization of integrated USV–UAV systems. On the other hand, in the field of USV research, prior studies have mainly focused on trajectory tracking [12], formation control [13], and autonomous path planning [14], with relatively little attention paid to the role of USVs as dynamic energy-supply nodes. A few recent works have preliminarily explored USV–UAV cooperation, such as autonomous landing control [6], maritime search and rescue [1], and cooperative trajectory optimization [15], but they remain limited to single-task or single-UAV scenarios and have yet to systematically address multi-UAV cooperative path planning and energy replenishment scheduling.

In open ocean environments, high-precision positioning plays a critical role in ensuring reliable USV–UAV cooperative operations. Conventional GPS-based localization is vulnerable to multipath interference and insufficient accuracy over the sea, making it difficult to support fine-grained UAV operations and precise USV navigation. The BeiDou high-precision positioning system (BDS-PPP/RTK) provides sub-meter or even centimeter-level accuracy for both USVs and UAVs, greatly enhancing the reliability of cooperative mission planning and dynamic scheduling [16,17,18]. Nevertheless, existing studies on USV–UAV cooperation have rarely considered the practical impact of high-precision BeiDou positioning on joint path planning and energy replenishment optimization. To address these challenges, this paper proposes a USV–UAV cooperative path planning and energy replenishment optimization method based on BeiDou high-precision positioning, establishing an integrated multi-objective optimization framework that jointly models task coverage, energy consumption, time efficiency, and replenishment scheduling.

Despite their potential, three major challenges remain in USV–UAV cooperative operations:

① Coupling of path planning and energy constraints: UAV mission paths must not only satisfy coverage or monitoring requirements but also comply with limited battery capacity and timely replenishment constraints; otherwise, task failures are likely.

② Dynamic optimization of USVs as mobile replenishment platforms: Unlike fixed charging stations, the trajectory and docking positions of USVs directly affect UAV accessibility and mission efficiency. Designing efficient USV mobility strategies constitutes a complex optimization problem.

③ Resource conflicts and efficiency bottlenecks in cooperative scheduling: When multiple UAVs operate simultaneously, competition for replenishment resources and prolonged waiting times may occur, necessitating efficient scheduling mechanisms to improve overall system performance.

To address these issues, this paper proposes a cooperative path planning and energy replenishment optimization method for USV–UAV systems based on BeiDou high-precision positioning. The main contributions are as follows:

① System modeling: A unified modeling framework for USV–multi-UAV cooperative operations is established, integrating path planning and energy replenishment scheduling into a multi-objective optimization model that comprehensively considers task coverage, energy consumption, time, and replenishment constraints.

② Algorithm design: A bi-level optimization method is proposed, where the upper layer optimizes USV mobility and replenishment strategies, and the lower layer optimizes multi-UAV path planning and scheduling schemes. An improved multi-objective intelligent optimization algorithm is employed to enhance solution quality.

③ Simulation validation: Experiments are conducted in representative marine mission scenarios. Comparisons with existing methods demonstrate the advantages of the proposed approach in terms of task completion rate, energy efficiency, and scheduling performance.

The remainder of this paper is organized as follows: Section 2 introduces the system model and problem formulation. Section 3 presents the design of the cooperative scheduling algorithm. Section 4 provides simulation results and analysis. Section 5 concludes the paper and discusses future research directions.

Distinct from prior studies, our framework introduces an adaptive closed-loop mechanism between USV and UAV layers and a hybrid genetic algorithm with multi-agent reinforcement learning (GA–MARL) to achieve dynamic, energy-aware scheduling. This design enables the system to continuously adjust trajectories and energy-replenishment plans in response to real-time mission states, ensuring robust performance in large-scale maritime operations.

2. System Model and Problem Formulation

2.1. Task Description

The system considered in this paper consists of one Unmanned Surface Vehicle (USV) and multiple Unmanned Aerial Vehicles (UAVs), which are deployed to perform large-scale maritime missions. The USV serves as a mobile mothership platform equipped with several UAV nests on its deck, capable of providing take-off, landing, charging, and battery replacement functions for UAVs. The mission tasks are represented by a set of task points distributed across a maritime region, denoted as

P = \{1, 2, \dots, N\}

, where each task point must be visited or covered by a UAV. The USV can autonomously navigate within the maritime area and select docking or cruising locations according to mission demands, thereby functioning as a dynamic energy replenishment platform for UAVs.

Let the UAV set be denoted as

U = \{u_{1}, u_{2}, \dots, u_{K}\}

. Each UAV takes off from the USV, visits a subset of task points

P

, and must return to the USV for replenishment before its battery is depleted. The UAV battery capacity is denoted as

E_{\max}

, the energy consumption required to visit a task point

i \in P

is

e_{i}

, and the flying distance is

d_{i j}

. Accordingly, the energy consumption constraint for UAV task execution can be expressed as:

\begin{array}{l} \sum_{i \in R_{m}} e_{i} + \sum_{(i, j) \in R_{m}} α \cdot d_{i j} \leq E_{\max}, \quad \forall u_{k} \in U \\ E_{k} = α \cdot d_{k} + \sum_{i \in R_{m}} e_{i} \leq C_{k} \end{array}

(1)

where

R_{m}

denotes the task path of UAV

u_{k}

,

d_{k}

represents its total flight distance,

e_{i}

is the energy required to visit task point

i

,

C_{k}

is its battery capacity, and

α

is the unit-distance energy consumption coefficient.
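As an illustration of the endurance constraint in Equation (1), the following minimal sketch checks whether a single UAV sortie fits within the battery budget. It rests on assumptions not stated in the paper: the USV is treated as a fixed point for the duration of the sortie, distances are Euclidean, and the names `route_energy`, `is_feasible`, `coords`, and `visit_energy` are hypothetical.

```python
import math

def route_energy(route, coords, visit_energy, alpha):
    """Energy of one sortie: alpha * total leg distance + per-task visit energy.

    route        -- ordered waypoint ids, starting and ending at the USV
    coords       -- waypoint id -> (x, y) position (assumed planar, Euclidean)
    visit_energy -- task id -> e_i; waypoints without an entry cost nothing
    alpha        -- unit-distance energy coefficient
    """
    flight = sum(math.dist(coords[a], coords[b]) for a, b in zip(route, route[1:]))
    visits = sum(visit_energy.get(p, 0.0) for p in route)
    return alpha * flight + visits

def is_feasible(route, coords, visit_energy, alpha, e_max):
    """True when the sortie satisfies the battery bound E_max."""
    return route_energy(route, coords, visit_energy, alpha) <= e_max

# Illustrative data only: one USV position and two task points.
coords = {"usv": (0.0, 0.0), 1: (3.0, 4.0), 2: (3.0, 0.0)}
visit_energy = {1: 2.0, 2: 1.0}
route = ["usv", 1, 2, "usv"]
energy = route_energy(route, coords, visit_energy, alpha=0.5)
```

With a moving USV, the final leg would instead be measured to the predicted USV position at the return time, which is where the coupling with the upper layer enters.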

Within the system, the USV assumes three functions:

① Mothership platform: UAVs must complete take-off and landing on the USV.

② Energy replenishment station: The USV is capable of charging or battery replacement, with a finite number of UAV nests

C

; i.e., at most

C

UAVs can be replenished simultaneously.

③ Mobile node: The USV can navigate within the operational area, and its trajectory directly influences UAV accessibility and replenishment efficiency. Therefore, the trajectory of the USV must be jointly optimized with the UAV flight paths

S = \{s_{1}, s_{2}, \dots, s_{L}\}

.

2.2. Constraints

During system operation, the following constraints must be satisfied simultaneously:

① UAV endurance constraint: Each UAV must return to the USV before its battery is exhausted, as specified by Equation (1).

② USV replenishment constraint: The number of UAVs receiving replenishment at any given time must not exceed the nest capacity

C

. Moreover, each battery replacement or charging operation requires a fixed service time.

③ Task coverage constraint: All task points must be visited by at least one UAV, i.e.,

\underset{u_{k} \in U}{\cup} R_{k} = P

, where

P

denotes the task set.

④ Conflict-avoidance constraint: No two UAVs can occupy the same nest simultaneously. Formally, at any time

t

, the number of UAVs in the replenishment state must satisfy

\sum_{u_{k} \in U} δ_{k} (t) \leq C

, where

δ_{k} (t)

is a binary indicator of whether UAV

u_{k}

is replenishing at time

t

, so that the sum counts the UAVs simultaneously in the replenishment (charging) state, and

C

is the maximum capacity of the USV charging station.

2.3. Optimization Objectives

To achieve efficient execution of maritime missions, path planning and energy replenishment scheduling are formulated as a multi-objective optimization problem. The objectives include:

① Minimization of total mission time:

\min T_{t o t a l} = \max_{u_{k} \in U} T_{k}

(2)

where

T_{k}

is the task completion time of UAV

u_{k}

.

② Minimization of total energy consumption:

\min E_{t o t a l} = \sum_{u_{k} \in U} \sum_{(i, j) \in R_{m}} α_{i j} \cdot d_{i j}

(3)

where the inner sum is the total energy consumed by UAV

u_{k}

during its mission, including flight, hovering, and operational energy, and

α_{i j}

is the energy consumption coefficient that converts the distance

d_{i j}

of leg

(i, j)

into actual energy usage.

③ Minimization of replenishment waiting time:

\min W_{t o t a l} = \sum_{u_{k} \in U} w_{k}

(4)

where

w_{k}

is the waiting time of UAV

u_{k}

when queuing for replenishment on the USV.

④ Maximization of task completion ratio:

\max F = \frac{|P_{c o m p l e t e d}|}{|P|}

(5)

where

P_{c o m p l e t e d}

denotes the set of completed task points.
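The four objectives in Equations (2) to (5) can be evaluated from per-UAV mission logs. The sketch below is one possible way to do so; the `UavRecord` structure and the `evaluate` function are hypothetical names introduced here for illustration, and the energy of each UAV is assumed to have been accumulated beforehand.

```python
from dataclasses import dataclass, field

@dataclass
class UavRecord:
    completion_time: float          # T_k
    energy_used: float              # energy consumed along the UAV's path
    waiting_time: float             # w_k, time spent queuing for a nest
    completed: set = field(default_factory=set)  # task points visited

def evaluate(records, all_tasks):
    """Return (T_total, E_total, W_total, F) for a non-empty list of records."""
    t_total = max(r.completion_time for r in records)   # Eq. (2): makespan
    e_total = sum(r.energy_used for r in records)       # Eq. (3)
    w_total = sum(r.waiting_time for r in records)      # Eq. (4)
    done = set().union(*(r.completed for r in records))
    completion = len(done & all_tasks) / len(all_tasks)  # Eq. (5)
    return t_total, e_total, w_total, completion

# Illustrative data only.
records = [
    UavRecord(10.0, 5.0, 1.0, {1, 2}),
    UavRecord(12.0, 4.0, 0.5, {2, 3}),
]
objectives = evaluate(records, all_tasks={1, 2, 3, 4})
```

Note that a task visited by two UAVs is counted once in the completion ratio, because the union of completed sets is taken before counting.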

In summary, the problem can be formally defined as follows: under the UAV energy constraint, USV replenishment constraint, and task coverage constraint, jointly optimize the trajectory of the USV and the scheduling of multiple UAVs to achieve multi-objective optimization. Due to the strong combinatorial nature, this problem is a typical NP-hard optimization problem.

3. Cooperative Scheduling Algorithm

3.1. Overall Framework

Let

S^{U} (t)

and

S^{V} (t)

denote the state vectors of the USV and UAV fleet, respectively. The upper-layer USV planner updates the trajectory

τ_{U} (t + 1)

as a function of both its own state and the UAV fleet state, while the lower-layer UAV scheduler updates the policy

λ_{V} (t + 1)

in response to the USV trajectory. Formally,

\begin{array}{l} τ_{U} (t + 1) = f_{U} (S^{U} (t), S^{V} (t)) \\ λ_{V} (t + 1) = f_{V} (S^{V} (t), τ_{U} (t)) \end{array}

(6)

where

f_{U} (\cdot)

and

f_{V} (\cdot)

represent the update rules of the two layers. This iterative exchange forms the basis of the closed-loop optimization framework.
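To make the alternation in Equation (6) concrete, here is a deliberately simplified one-dimensional sketch. The two update rules are toy stand-ins chosen for illustration and are not the paper's planners: `f_usv` moves the USV a bounded step toward the mean UAV position, and `f_uav` tells a UAV to return once its energy no longer exceeds the cost of reaching the currently published USV waypoint. UAV positions and energies are held fixed so that only the effect of the USV motion is visible.

```python
def f_usv(usv_pos, uav_positions, step=1.0):
    """Toy upper-layer rule: move at most `step` toward the mean UAV position."""
    target = sum(uav_positions) / len(uav_positions)
    delta = max(-step, min(step, target - usv_pos))
    return usv_pos + delta

def f_uav(uav_positions, uav_energy, usv_waypoint, alpha=1.0):
    """Toy lower-layer rule: return when energy does not exceed the trip-back cost."""
    return [
        "return" if e <= alpha * abs(p - usv_waypoint) else "continue"
        for p, e in zip(uav_positions, uav_energy)
    ]

def closed_loop(usv_pos, uav_positions, uav_energy, steps):
    waypoint = usv_pos  # tau_U(t): the waypoint currently visible to the UAVs
    log = []
    for _ in range(steps):
        # Both layers read the state at time t, as in Eq. (6).
        next_waypoint = f_usv(usv_pos, uav_positions)
        decisions = f_uav(uav_positions, uav_energy, waypoint)
        usv_pos = waypoint = next_waypoint
        log.append((waypoint, decisions))
    return log

log = closed_loop(usv_pos=0.0, uav_positions=[4.0, 6.0],
                  uav_energy=[3.5, 10.0], steps=3)
```

In this toy run the first UAV is initially told to return, but once the USV has moved closer the same energy level is sufficient and the decision changes to continue, which is the kind of feedback effect the closed loop is meant to exploit.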

The proposed architecture forms an adaptive closed loop in which the upper-layer USV planner and lower-layer multi-UAV scheduler exchange state information in real time. Mission progress, residual energy, and replenishment demand are continuously fed back to the USV trajectory module, allowing on-the-fly adjustments of both the USV path and UAV task assignments. This mechanism enhances responsiveness to dynamic maritime environments and improves overall energy efficiency.

To address the problem of path planning and energy replenishment scheduling in USV–UAV maritime operations, this paper designs a bi-level cooperative optimization framework. Through interactions between the upper and lower layers, the framework achieves close collaboration between the USV and multiple UAVs, thereby meeting multi-objective optimization requirements including task coverage, energy control, minimization of waiting time, and maximization of task completion rate.

The proposed upper/lower layer framework is designed to separate strategic planning from operational execution, emphasizing a clear conceptual structure. The upper layer is responsible for overall mission planning, including task prioritization, resource allocation, and strategic decision-making, which ensures that the system can respond adaptively to dynamic operational environments. The lower layer handles the execution and control of individual UAVs and USVs, converting the high-level plans into feasible trajectories, energy-aware schedules, and task-specific actions while maintaining real-time responsiveness. The interaction between the upper and lower layers forms a closed-loop system: the upper layer guides high-level decisions, and the lower layer provides feedback on execution status, enabling continuous refinement and adjustment of mission plans.

(1): Upper Layer (USV Mobility and Replenishment Planning)

The core objective of the upper layer is to plan the optimal trajectory

S = \{s_{1}, s_{2}, \dots, s_{L}\}

of the USV within the operational area and dynamically adjust its docking or cruising positions to optimize UAV replenishment efficiency and overall mission performance. The upper layer determines the optimal movement strategy of the USV based on UAV positions, residual energy, and task completion status, subject to the following constraints:

① Capacity limit of replenishment nests: the number of UAVs replenished simultaneously cannot exceed

C

.

② USV speed

v_{u s v}

and voyage constraints.

③ Minimization of UAV waiting time

W_{t o t a l}

for replenishment.

The output of the upper layer includes the USV trajectory plan, docking positions at each time step, and replenishment windows, providing dynamic references for UAV scheduling in the lower layer.

(2): Lower Layer (Multi-UAV Path Planning and Battery Scheduling)

The lower layer focuses on the path planning and battery replenishment scheduling of individual UAVs

u_{k} \in U

. Each UAV plans its path

S_{k} = \{s_{1}^{k}, s_{2}^{k}, \dots, s_{L}^{k}\}

based on the USV position and task allocation provided by the upper layer, ensuring coverage of assigned task points while satisfying the energy constraint in Equation (1). Meanwhile, the lower layer applies a time-window model, queuing mechanism, and dynamic priority strategy to schedule UAV battery replenishment, ensuring UAVs receive timely energy supply before depletion to avoid mission interruption.

(3): Interaction Mechanism between Upper and Lower Layers

The interaction mechanism between the upper and lower layers is the core of dynamic optimization in the USV–UAV cooperative operation system. Specifically:

① The upper layer provides real-time USV trajectory information and the state of replenishment resources. Its outputs include the planned trajectory of the USV (predicted positions within the operational region for UAV reference), nest availability (occupied status, expected release time, and service rate), and time window constraints (available replenishment intervals for UAV return). Based on this information, the lower layer performs task allocation, path planning, and replenishment scheduling, ensuring UAVs have sufficient energy to complete tasks before returning.

② The lower layer is responsible for UAV path planning and battery scheduling. Its feedback includes UAV task execution status (completed and remaining tasks, path progress), energy consumption and residual energy (for assessing replenishment urgency), and replenishment requests (predicted and actual docking times). The upper layer adjusts USV trajectories, docking positions, or resource allocation accordingly to ensure UAVs complete tasks as planned with minimal waiting.

③ Closed-loop Cooperation and Iterative Optimization: When a UAV’s return time is approaching, battery level drops below the threshold, or task distribution deviates, the lower layer sends replenishment and task adjustment requests to the upper layer. The USV dynamically adjusts its trajectory and docking position, and updates resource allocation. If multiple UAVs request replenishment simultaneously beyond nest capacity, queuing and priority mechanisms are applied. This interaction repeats during operations, forming a closed-loop information exchange between layers, thereby achieving joint optimization of USV trajectories and UAV paths with replenishment scheduling.

Through this bi-level framework, the system enables dynamic trajectory planning for the USV and joint optimization of UAV path planning and replenishment scheduling. As a result, task completion rate is improved, energy consumption reduced, and waiting time minimized in complex maritime environments.

To provide a clear overview of the proposed cooperative optimization framework, Figure 1 illustrates the complete system architecture of the USV–multi-UAV operation. The diagram highlights the bi-level structure in which the upper layer plans the USV trajectory and energy-replenishment schedule, while the lower layer performs multi-UAV path planning and task allocation. Information flows bidirectionally between the two layers to form an adaptive closed-loop, enabling dynamic adjustment of routes and energy decisions in response to real-time mission data.

The proposed closed-loop interaction between the upper-layer USV planner and the lower-layer UAV scheduler represents a key innovation of this study. Unlike conventional hierarchical scheduling frameworks, which rely on one-way command and feedback, our design establishes a continuous bidirectional information flow that links USV mobility decisions with UAV energy states and task progress in real time. This architecture enables adaptive trajectory adjustment and replenishment scheduling based on evolving mission conditions, thereby achieving self-organizing coordination and significantly enhancing the autonomy and robustness of maritime unmanned operations.

3.2. Path Planning Method

To fully exploit the complementary strengths of global search and adaptive learning, we integrate a genetic algorithm (GA) with multi-agent reinforcement learning (MARL). The GA provides population-based global exploration for USV trajectory planning, while the MARL component performs fine-grained UAV path and energy-replenishment scheduling under stochastic conditions. This hybridization accelerates convergence and avoids local optima compared with using GA or MARL alone.

To illustrate the detailed computational process of the proposed method, Figure 2 presents the algorithmic flow of the hybrid genetic algorithm–multi-agent reinforcement learning (GA–MARL) optimization. Starting from task initialization, the process proceeds through upper-layer USV trajectory planning, lower-layer UAV path and energy-replenishment scheduling, and finally an iterative closed loop where feedback from execution continuously updates both layers until convergence criteria are satisfied.

To address the path planning problem of multiple UAVs under energy constraints in maritime missions, this paper proposes a USV–UAV cooperative multi-objective intelligent path planning method that integrates heuristic optimization and multi-agent reinforcement learning (MARL), aiming to improve task coverage, reduce energy consumption, and minimize UAV waiting time.

The multi-UAV path planning problem is modeled as a joint optimization of shortest-path and coverage planning in graph theory. Let the task point set be

P = \{1, 2, \dots, N\}

, the UAV set be

U = \{u_{1}, u_{2}, \dots, u_{M}\}

, and the USV position sequence be

B = \{b_{1}, b_{2}, \dots, b_{N}\}

. For each UAV

u_{k} \in U

, the path

S_{k}

must not only cover the assigned task points but also ensure a return to the USV within battery limits

E_{\max}

. Considering the dynamic position of the USV, the path planning problem becomes a dynamic constrained multi-objective optimization problem, simultaneously optimizing mission time

T_{t o t a l}

, energy consumption

E_{t o t a l}

, and waiting time

W_{t o t a l}

.

For large-scale task sets and multi-UAV systems, this paper proposes a hybrid GA–ACO optimization algorithm. The genetic algorithm (GA) performs a global search over UAV task sequences to maximize coverage, while ant colony optimization (ACO) refines local paths by considering USV docking positions and replenishment time windows to minimize energy consumption and waiting. A dynamic energy-constrained fitness function, which defines the MARL reward, is introduced in Equation (7), where

ω_{i} \in [0, 1]

is the normalized priority weight of task

i

, computed from the task’s urgency and required energy:

f (R_{k}) = ω_{1} T_{k} + ω_{2} E_{k} - ω_{3} F_{k} + ω_{4} ϕ (E_{k}^{r e m a i n})

(7)

where

ϕ (\cdot)

is a penalty term that applies when the UAV's residual energy is insufficient to complete the return, preventing mission failures.
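A minimal sketch of the fitness in Equation (7) follows. The paper does not give the form of the penalty function, so the linear shortfall penalty used here, the extra argument `e_return` (energy needed to fly back to the USV), the `scale` factor, and the default weights are all assumptions made for illustration. Lower fitness is better, since time, energy, and the penalty are added while the completion term is subtracted.

```python
def energy_penalty(e_remain, e_return, scale=100.0):
    """Assumed phi: zero if residual energy covers the return leg,
    otherwise proportional to the shortfall."""
    return scale * max(0.0, e_return - e_remain)

def fitness(t_k, e_k, f_k, e_remain, e_return, w=(0.4, 0.3, 0.2, 0.1)):
    """Eq. (7): w1*T_k + w2*E_k - w3*F_k + w4*phi(E_remain)."""
    w1, w2, w3, w4 = w
    return w1 * t_k + w2 * e_k - w3 * f_k + w4 * energy_penalty(e_remain, e_return)

# Illustrative values only: the same route with and without enough reserve.
safe = fitness(t_k=10.0, e_k=5.0, f_k=1.0, e_remain=3.0, e_return=2.0)
risky = fitness(t_k=10.0, e_k=5.0, f_k=1.0, e_remain=1.0, e_return=2.0)
```

A large `scale` makes any route that cannot complete the return leg strictly worse than a feasible one, which is the role the text assigns to the penalty.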

Each UAV is treated as an agent in the MARL framework. The state of agent

u_{k}

at time

t

,

s_{t}^{k} = \{U_{t}^{k}, E_{t}^{k}, B_{t}, P_{t}\}

, includes the UAV's current position

U_{t}^{k}

, its residual energy

E_{t}^{k}

, the USV position and replenishment state

B_{t}

, and the task distribution with completion status

P_{t}

. The policy

σ_{k} (a_{t} |s_{t}^{k})

maps the state space to a probability distribution over actions: for a given state

s_{t}^{k}

, it outputs the selection probability of each possible action

a_{t}

, where the actions available to the UAV at time

t

include moving towards a task point, returning to the USV, and waiting for replenishment. The policy is generated by the MARL agent, which embeds the energy-constrained fitness into the learning process. The agent learns the policy that maximizes the long-term cumulative reward, thereby obtaining a strategy for path planning, task coverage, and replenishment scheduling. This paper adopts the centralized training and distributed execution (CTDE) approach: during the training phase, the policies of all UAVs are optimized using global information; during the execution phase, each UAV autonomously selects its action

a_{t}

based on its local state

s_{t}^{k}

, achieving distributed scheduling and path planning.
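The idea of a stochastic policy over a small discrete action set can be sketched as follows. This is only an illustration of mapping per-action scores to selection probabilities with a softmax and then sampling; the scores would come from a trained policy network in practice, and the action names, the `softmax_policy` and `select_action` helpers, and the example scores are hypothetical.

```python
import math
import random

def softmax_policy(scores):
    """Turn action -> score into action -> probability (numerically stable softmax)."""
    top = max(scores.values())
    exps = {a: math.exp(s - top) for a, s in scores.items()}
    total = sum(exps.values())
    return {a: v / total for a, v in exps.items()}

def select_action(probs, rng):
    """Sample one action according to its probability."""
    actions = list(probs)
    return rng.choices(actions, weights=[probs[a] for a in actions], k=1)[0]

# Illustrative scores for one UAV in one state.
scores = {"visit_task": 2.0, "return_to_usv": 1.0, "wait": 0.0}
probs = softmax_policy(scores)
action = select_action(probs, random.Random(0))
```

During decentralized execution each UAV would evaluate only its own scores from its local state, while training may use information from the whole fleet.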

Equation (8) defines the reward function, which integrates task completion, energy consumption, waiting time, and collision avoidance, while the UAV fleet state transition accounts for battery level, position, and task completion to ensure feasibility under energy and timing constraints:

r_{t}^{k} = λ_{1} \cdot C o v_{t}^{k} - λ_{2} \cdot E_{t}^{k} - λ_{3} \cdot W_{t}^{k} - λ_{4} \cdot C o l_{t}^{k}

(8)

where

C o v_{t}^{k}

is UAV

u_{k}

’s contribution to coverage at time

t

. In general,

C o v_{t}^{k} = \begin{cases} 1, & \text{if UAV } k \text{ completes a new mission point coverage/visit at } t \\ 0, & \text{otherwise} \end{cases}

; if the task point is continuously covered,

C o v_{t}^{k} = \frac{A_{t}^{k}}{A_{t o t a l}}

, where

A_{t}^{k}

is the effective area covered by the UAV

u_{k}

at any given time

t

and

A_{t o t a l}

is the total task area, in order to ensure reasonable task division among multiple UAVs and avoid duplicate coverage.

C o l_{t}^{k}

indicates whether UAV

u_{k}

is involved in a path conflict at the given moment. If two or more UAVs occupy the same location, or come within a specified safe distance of one another, at the same time

t

, it is considered a conflict.
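The per-step reward of Equation (8) and the conflict indicator can be sketched as below. The pairwise distance test is a straightforward reading of the conflict definition; the weight values, the planar positions, and the function names are assumptions for illustration only.

```python
import math

def collision_flags(positions, safe_dist):
    """Col_t^k for every UAV: 1 if another UAV is closer than safe_dist, else 0."""
    flags = [0] * len(positions)
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if math.dist(positions[i], positions[j]) < safe_dist:
                flags[i] = flags[j] = 1
    return flags

def reward(cov, energy, wait, col, lam=(1.0, 0.1, 0.05, 2.0)):
    """Eq. (8): lam1*Cov - lam2*E - lam3*W - lam4*Col for one UAV at one step."""
    l1, l2, l3, l4 = lam
    return l1 * cov - l2 * energy - l3 * wait - l4 * col

# Illustrative step: UAVs 0 and 1 are too close, UAV 2 is far away.
flags = collision_flags([(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)], safe_dist=2.0)
r0 = reward(cov=1, energy=2.0, wait=4.0, col=flags[0])
```

With these example weights a single conflict outweighs the gain from covering one new task point, so an agent is pushed toward conflict-free paths.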

When a conflict occurs, the reward is reduced by a penalty weighted by

λ_{4}

, which pushes the agents to avoid path overlap and nest competition during learning and thereby promotes flight safety. The design encourages distributed path selection, balanced task allocation, and safe operation in complex marine environments. Dynamic feedback from the USV is also incorporated: the USV provides real-time updates on nest status and position changes, which UAVs use to adjust return paths and task orders. This joint optimization mechanism improves mission completion rates and reduces redundant energy use. By combining heuristic optimization and MARL, the method yields a cooperative path planning strategy that addresses energy constraints and dynamic mission environments, offering an efficient and robust solution for USV–UAV maritime operations. For clarity and reproducibility, the iterative interaction between the upper-layer USV trajectory planner and the lower-layer UAV scheduler is described step by step, ensuring that the workflow can be independently replicated.

3.3. Battery Replenishment Scheduling

In the USV–UAV cooperative system, UAV endurance is strictly limited by battery capacity, making effective battery replenishment scheduling essential to ensure mission continuity and high completion rates. This study models battery replenishment as a queuing problem with time window constraints and designs a dynamic priority mechanism to achieve efficient multi-UAV energy management.

Time Window Constraint: UAVs must return to the USV for replenishment before energy falls below the minimum safety threshold

E_{\min}

,

E_{t}^{k} \geq E_{\min}, \forall k, t

. If this condition is violated at any time within the window

[t, t + Δ t]

, the UAV is considered to have failed its mission. Enforcing the constraint ensures that UAVs return on time, avoiding task interruption or crash risks.

Queuing Mechanism: Since the USV has limited nest capacity

C

, only up to

C

UAVs can replenish simultaneously; that is, the number of UAVs being replenished at the same time satisfies

\sum_{k = 1}^{M} x_{t}^{k} \leq C, \forall t

, where

x_{t}^{k} \in \{0, 1\}

represents whether UAV

u_{k}

enters the supply state at time

t

. When multiple UAVs request replenishment simultaneously, a queuing system allocates service slots. The scheduling algorithm minimizes waiting time

W_{t}^{k}

.

Dynamic Priority: UAV priority is determined dynamically from both residual energy and task urgency. The dynamic priority score used for replenishment scheduling is defined as:

DP_{t}^{k} = α_{1} \cdot \frac{1}{E_{t}^{k}} + α_{2} \cdot U_{t}^{k}

(9)

where

E_{t}^{k}

is UAV

u_{k}

’s residual energy (the reciprocal term assigns higher priority to UAVs with lower energy),

U_{t}^{k}

represents task urgency (e.g., importance of uncovered task points), and

α_{1}, α_{2}

are weighting coefficients. This ensures that UAVs with urgent needs are prioritized, reducing scheduling conflicts while balancing task performance and energy safety.
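The interplay of nest capacity and the priority score in Equation (9) can be illustrated with a short Python sketch. It assumes that all requests arrive at the same instant, that residual energy is normalized to (0, 1], and that the weights α1 = α2 = 1; these choices, the function names, and the sample values are hypothetical and only serve the illustration.

```python
import heapq


def priority(energy, urgency, a1=1.0, a2=1.0):
    """Equation (9): DP = a1 * (1 / E) + a2 * U. Weights are placeholder values."""
    return a1 / energy + a2 * urgency


def schedule(requests, capacity, service_time):
    """Serve simultaneous requests in descending priority with `capacity` nests.

    requests: list of (uav_id, residual_energy, urgency), energy must be > 0.
    Returns {uav_id: waiting_time} in the same time unit as service_time.
    """
    order = sorted(requests, key=lambda r: priority(r[1], r[2]), reverse=True)
    free_at = [0.0] * capacity          # time at which each nest becomes free
    heapq.heapify(free_at)
    waits = {}
    for uav_id, _, _ in order:
        start = heapq.heappop(free_at)  # earliest available nest
        waits[uav_id] = start           # all requests arrive at t = 0
        heapq.heappush(free_at, start + service_time)
    return waits


if __name__ == "__main__":
    reqs = [("u1", 0.8, 0.1), ("u2", 0.2, 0.3), ("u3", 0.5, 0.9), ("u4", 0.9, 0.0)]
    # C = 2 nests and a 60 s battery replacement, as in the experimental setup.
    print(schedule(reqs, capacity=2, service_time=60.0))
```

In this toy case the two UAVs with the highest scores are served immediately and the other two wait for one replacement cycle, so at most C UAVs are in the supply state at any time.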

Through the integration of time window constraints, queuing, and dynamic priority allocation, the scheduling model ensures UAV safety, optimizes resource utilization, and improves fairness. This mechanism, coupled with path planning, forms a closed-loop coordination system that enables efficient operations of USV–UAV systems in complex maritime tasks.

3.4. Cooperative Mechanism

In the USV–UAV cooperative system, path planning and replenishment scheduling must interact with USV mobility and task allocation. To this end, two core cooperative mechanisms are proposed: dynamic position adjustment and adaptive task allocation, enabling closed-loop optimization between the upper-layer USV and lower-layer UAVs.

Dynamic Position Adjustment: As the UAVs’ “mothership” and energy-supply platform, the USV’s position directly affects UAV accessibility and mission efficiency. Let the USV’s position at time

t

be

P_{usv}(t)

, and the average replenishment demand position of UAVs be

\bar{P}_{uav}(t)

. The USV’s target update rule is:

P_{usv}(t + 1) = P_{usv}(t) + η \cdot (\bar{P}_{uav}(t) - P_{usv}(t))

(10)

where

η \in (0, 1]

is the adjustment coefficient. This ensures that the USV moves closer to UAV demand clusters, reducing UAV return distances, energy costs, and waiting times. For high-priority tasks located far from the USV, a local repositioning allows temporary deviation of the USV trajectory to support UAV task completion.
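A minimal Python sketch of the update rule in Equation (10) follows. The 2-D tuple representation, the function name, and the sample coordinates are assumptions for illustration; the sketch also ignores the USV speed limit, which a real planner would need to enforce.

```python
def update_usv_position(p_usv, demand_positions, eta=0.5):
    """Equation (10): move the USV a fraction eta toward the mean demand position.

    p_usv: (x, y) current USV position.
    demand_positions: list of (x, y) positions of UAVs requesting replenishment.
    eta: adjustment coefficient in (0, 1]; 0.5 is an arbitrary example value.
    """
    if not 0.0 < eta <= 1.0:
        raise ValueError("eta must lie in (0, 1]")
    if not demand_positions:
        return p_usv                     # no demand: keep the current position
    n = len(demand_positions)
    mean_x = sum(p[0] for p in demand_positions) / n
    mean_y = sum(p[1] for p in demand_positions) / n
    return (p_usv[0] + eta * (mean_x - p_usv[0]),
            p_usv[1] + eta * (mean_y - p_usv[1]))


if __name__ == "__main__":
    print(update_usv_position((0.0, 0.0), [(10.0, 0.0), (10.0, 20.0)], eta=0.5))
```

With η = 1 the target jumps directly to the mean demand position, while smaller values give a smoother, more conservative trajectory.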

Adaptive Task Allocation: Static task allocation may reduce efficiency under energy constraints or replenishment conflicts. To address this, a cost function for task assignment is defined as:

C_{i k} = β_{1} \cdot d_{i k} + β_{2} \cdot \frac{1}{E_{t}^{k}} + β_{3} \cdot ρ_{i}

(11)

where

d_{i k}

is the distance between UAV

u_{k}

and task point

P_{i}

,

E_{t}^{k}

is residual energy,

ρ_{i}

is task urgency, and

β_{1}, β_{2}, β_{3}

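are weights. Task allocation solves

\min \sum_{k \in U} \sum_{i \in P} C_{i k} \cdot y_{i k}, \quad \text{s.t.} \quad \sum_{k \in U} y_{i k} = 1, \forall i \in P

where the binary variable y_{ik} equals 1 when task point P_i is assigned to UAV u_k. This minimizes the global cost while ensuring that every task point is assigned to exactly one UAV.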

ρ_{i}

represents the normalized task-urgency coefficient, derived from the remaining uncovered area relative to the mission deadline. It takes values in

ρ_{i} \in [0, 1]

and is updated dynamically every 5 min.

β_{i}

is the weighting coefficient balancing residual energy versus urgency; based on a grid search, we set

β_{i}

= 0.5 for all UAVs to give equal emphasis.
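The assignment rule built on Equation (11) can be sketched in Python as follows. Because the only constraint is that each task point receives exactly one UAV, the global minimum decomposes into an independent choice of the cheapest UAV per task point. The data layout, function names, normalized energies, and sample coordinates are hypothetical; the weights of 0.5 follow the setting stated above.

```python
import math


def assignment_cost(dist, energy, urgency, b1=0.5, b2=0.5, b3=0.5):
    """Equation (11): C_ik = b1 * d_ik + b2 * (1 / E_k) + b3 * rho_i."""
    return b1 * dist + b2 / energy + b3 * urgency


def allocate(uavs, tasks):
    """Assign each task point to the UAV with the lowest cost C_ik.

    uavs:  {uav_id: ((x, y), residual_energy)}, energy > 0 (normalized here).
    tasks: {task_id: ((x, y), urgency)}, urgency in [0, 1].
    Returns {task_id: uav_id}; every task point gets exactly one UAV.
    """
    plan = {}
    for task_id, (task_pos, rho) in tasks.items():
        plan[task_id] = min(
            uavs,
            key=lambda k: assignment_cost(
                math.dist(uavs[k][0], task_pos), uavs[k][1], rho
            ),
        )
    return plan


if __name__ == "__main__":
    uavs = {"u1": ((0.0, 0.0), 0.9), "u2": ((10.0, 0.0), 0.4)}
    tasks = {"p1": ((1.0, 0.0), 0.2), "p2": ((9.0, 0.0), 0.8)}
    print(allocate(uavs, tasks))
```

Note that in this simplified form the urgency term is identical for all UAVs competing for the same task point, so it shifts the cost of a task without changing which UAV is selected; it would matter once additional constraints (for example, per-UAV workload limits) couple the decisions.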

During operations, UAVs continuously provide feedback on mission status, residual energy, and replenishment requests to the USV. The USV dynamically adjusts trajectories and docking policies, while UAVs adapt task choices and request slots based on real-time USV status. This feedback loop of “state perception → trajectory adjustment → task reallocation → execution feedback” enables adaptive optimization. Compared with static methods, this mechanism significantly improves task completion, reduces energy costs and waiting times, and enhances robustness and efficiency under uncertain maritime environments.

4. Simulations and Results

In the proposed USV–multi-UAV collaborative system, the overall framework is organized into three integrated modules that work in a coordinated manner. The first module, task planning, generates preliminary allocations based on mission requirements and priorities, ensuring that high-risk areas are addressed promptly. The second module, UAV scheduling and path optimization, computes UAV trajectories and energy management strategies, balancing mission coverage efficiency with minimal energy consumption. The third module, data fusion and monitoring, continuously collects real-time status information from both the USV and deployed UAVs, enabling dynamic adjustment of task assignments in response to operational changes. The interactions among these modules form a closed-loop process, where the outcomes of task planning guide UAV scheduling, UAV execution is synchronized with USV operations, and feedback from monitoring informs subsequent task refinement.

The path planning and energy replenishment scheduling are formulated as a multi-objective optimization problem. The objectives include minimizing the total mission time by sequencing tasks according to priority, minimizing total energy consumption, calculated as the product of UAV travel distance and an energy coefficient, and maximizing area coverage, with greater weight assigned to high-risk zones. These objectives are balanced using a weighted-sum approach, which ensures efficient allocation of tasks while maintaining overall mission effectiveness.
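To make the weighted-sum formulation concrete, the following Python sketch scalarizes the three objectives. It assumes that mission time and energy have already been normalized to comparable ranges and that coverage is a risk-weighted fraction; the weights, zone names, and helper functions are illustrative assumptions, since the exact weights are not given here.

```python
def energy_consumption(distance_m, coeff):
    """Energy as travel distance times an energy coefficient (as described above)."""
    return distance_m * coeff


def weighted_coverage(covered_zones, risk_weights):
    """Fraction of total risk weight that is covered; high-risk zones count more."""
    total = sum(risk_weights.values())
    return sum(risk_weights[z] for z in covered_zones) / total


def scalarized_cost(time_norm, energy_norm, coverage, w=(0.4, 0.2, 0.4)):
    """Weighted sum to minimize: time and energy add cost, coverage reduces it.

    w holds hypothetical weights (w_time, w_energy, w_coverage).
    """
    return w[0] * time_norm + w[1] * energy_norm - w[2] * coverage


if __name__ == "__main__":
    cov = weighted_coverage({"high_risk"}, {"high_risk": 3.0, "low_risk": 1.0})
    print(cov, scalarized_cost(0.5, 0.4, cov))
```

A lower scalarized cost indicates a better plan; changing the weights shifts the trade-off between speed, energy economy, and coverage of high-risk zones.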

4.1. Experimental Scenarios and Parameter Settings

All parameters used in the experiments, including their values and roles, are explicitly described to facilitate independent reproduction of the results. To validate the effectiveness of the proposed USV–UAV cooperative path planning and energy replenishment scheduling method based on BeiDou high-precision positioning, simulation experiments were conducted in representative maritime mission scenarios. The experimental environment was set as a rectangular maritime region located outside the Pearl River Estuary, measuring 20 km × 20 km. Task points were randomly distributed within the region, each representing a monitoring or detection target to be covered. The BeiDou Navigation Satellite System (BDS) was adopted to provide centimeter-level positioning accuracy for both the USV and the UAVs, ensuring the precision of trajectory planning and mission scheduling.

The main parameters of UAVs and the USV are summarized in Table 1. The number of UAVs was set to M = 5. Each UAV had a maximum endurance of approximately 25 min, corresponding to a maximum flight distance of 20 km, and a cruising speed of 15 m/s. Each UAV had to return to the USV for battery replacement before depletion. The USV, serving as the mothership and replenishment platform, was configured with a speed of 5 m/s and a nest capacity C = 2, allowing at most two UAVs to be replenished simultaneously. The battery replacement time was set to 60 s. Task sizes were varied as N = 30, 50, 100, in order to evaluate algorithm performance under different mission scales.

The values of USV and UAV parameters, as well as the fixed USV positions and trajectories, were selected to match typical ranges reported in related literature and actual platform specifications. Sensitivity checks further indicate that the comparative performance trends remain stable under different parameter settings.

4.2. Comparative Methods

To validate the effectiveness of the proposed method, four representative baseline strategies were selected for comparison.

① Baseline 1: Static Path Planning (SP)

In this method, the USV remains stationary without trajectory planning. UAVs independently plan their task paths and return to the USV for replenishment when battery levels approach depletion. This baseline tests UAV performance under a static USV without leveraging its mobility.

② Baseline 2: USV–UAV Hierarchical (Layered) Scheduling (HS)

Here, the USV follows a pre-defined cruising trajectory, independent of task distribution. UAVs plan paths and request replenishment according to fixed rules (e.g., nearest-task-first, shortest-path-first). This baseline demonstrates limited USV mobility but lacks UAV state feedback and adaptive task scheduling, often leading to resource imbalance and longer waiting times.

③ Baseline 3: Game-Theoretic Multi-UAV Scheduling (GT-MUS) [19]

Building on recent cooperative-game formulations for multi-robot task allocation, GT-MUS models the UAV fleet as rational agents that negotiate task assignments through a coalition-formation game. Each UAV iteratively updates its strategy to maximize individual and collective payoffs while respecting energy and time constraints. This baseline captures competitive–cooperative dynamics and provides a strong benchmark for decentralized decision making.

④ Baseline 4: Particle Swarm Optimization-based Multi-Objective Scheduling (PSO-MO) [20]

PSO-MO applies a multi-objective particle swarm optimization framework to simultaneously minimize mission completion time, total energy consumption, and UAV battery-replacement waiting time. Particles represent candidate joint schedules of the USV trajectory and UAV paths. The algorithm updates particle velocities and positions according to Pareto-dominance criteria to explore the trade-off surface among objectives.

⑤ Proposed Method: USV–UAV Closed-loop Cooperative Optimization

In this method, the USV and UAVs interact through a closed-loop mechanism: the USV dynamically adjusts its trajectory and docking positions according to UAV energy states and task distribution, while UAVs adaptively choose task points and submit replenishment requests based on real-time USV status. This joint optimization is designed to improve task completion, reduce energy consumption, and shorten waiting time in complex maritime environments.

4.3. Evaluation Metrics

To comprehensively evaluate the proposed USV–UAV collaborative optimization method based on BeiDou high-precision positioning, four performance metrics are used: task completion rate (TCR), total mission time (TMT), total energy consumption (TEC), and average battery waiting time (BWT). These metrics are evaluated under multiple mission scenarios to ensure robustness of the results.

① Task Completion Ratio (TCR)

This measures the ability of the system to cover all task points, reflecting mission completeness.

TCR = \frac{N_{completed}}{N_{total}}

(12)

where

N_{completed}

is the number of successfully covered task points, and

N_{total}

is the total number of task points. The higher the indicator, the stronger the task execution ability of the system.

② Total Mission Time (TMT)

This represents the time from the start of the mission until all UAVs complete their tasks and return, reflecting scheduling efficiency. The smaller the indicator, the more efficient the task scheduling and path planning.

TMT = \max_{k \in M} T_{k}

③ Total Energy Consumption (TEC)

This measures the total energy consumed by all UAVs in task execution, representing energy efficiency. This indicator reflects the energy utilization efficiency of the system in path and supply optimization.

TEC = \sum_{k \in M} \sum_{(i, j) \in R_{k}} E_{i j}^{k}

④ Battery Replacement Waiting Time (BWT)

This measures the queuing delay experienced by UAVs when waiting for replenishment resources on the USV, reflecting the rationality of the scheduling mechanism. The smaller the indicator, the more efficient the supply scheduling mechanism.

BWT = \frac{1}{M} \sum_{k \in M} W_{k}

These four indicators jointly evaluate system performance in terms of coverage completeness, time efficiency, energy efficiency, and fairness in scheduling.
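The four indicators translate directly into code. The Python sketch below computes them from per-UAV logs; the input layout (lists of finish times, per-leg energies, and waiting times) and the sample numbers are assumptions for illustration and are not experimental data.

```python
def tcr(n_completed, n_total):
    """Task completion ratio: covered task points over all task points."""
    return n_completed / n_total


def tmt(finish_times):
    """Total mission time: the latest return time over all UAVs."""
    return max(finish_times)


def tec(leg_energies_per_uav):
    """Total energy: sum of per-leg energies E_ij^k over every UAV route R_k."""
    return sum(sum(legs) for legs in leg_energies_per_uav)


def bwt(waiting_times):
    """Average battery replacement waiting time over the M UAVs."""
    return sum(waiting_times) / len(waiting_times)


if __name__ == "__main__":
    # Hypothetical logs for three UAVs (times in minutes, energies in arbitrary units).
    print(tcr(47, 50))
    print(tmt([310.0, 298.5, 305.2]))
    print(tec([[12.0, 8.0], [15.0], [9.0, 6.0, 4.0]]))
    print(bwt([0.0, 2.0, 4.0]))
```

Averaging each metric over repeated trials, as done in the experiments, only requires calling these functions once per trial and taking the mean.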

4.4. Results and Analysis

To verify the effectiveness of the proposed closed-loop USV–UAV cooperative optimization method based on BeiDou high-precision positioning, this section compares five methods across four dimensions (task completion rate, total mission time, energy consumption, and battery replacement waiting time): ① Baseline 1 (static path planning, SP): the USV remains stationary while UAVs execute tasks and return independently; ② Baseline 2 (hierarchical scheduling, HS): the USV follows a preset route and UAVs plan paths and resupply based on fixed rules; ③ Baseline 3: Game-Theoretic Multi-UAV Scheduling (GT-MUS); ④ Baseline 4: Particle Swarm Optimization-based Multi-Objective Scheduling (PSO-MO); ⑤ Proposed (this paper): the USV and UAVs jointly optimize paths and resupply through upper–lower closed-loop interaction.

(1): Task Completion Rate (TCR)

To quantitatively evaluate coverage ability, we consider three task scales: small (20 task points), medium (50), and large (100), and compute the task completion rate for all five methods. We run 10 trials and report the mean completion rate for each method. The results are shown in Table 1, and the relationship between task scale and completion rate is plotted in Figure 3.

Baseline 1 (SP): With 20 task points, UAVs can cover most tasks, achieving a TCR of 90.63%. As the scale increases to 50 or 100 points, battery constraints and long-range returns reduce TCR to 85.47% and 81.28%, respectively, showing a clear scale-dependent performance drop.

Baseline 2 (HS): The mobile USV provides partial resupply, improving TCR to 94.32% for 20 points, 88.75% for 50 points, and 84.54% for 100 points. However, the fixed route cannot fully adapt to UAV energy states, leaving some distant points uncovered.

Baseline 3 (GT-MUS): Cooperative game-theoretic scheduling improves UAV task allocation, resulting in TCR of 95.18% (20 points), 90.33% (50 points), and 87.20% (100 points). Decentralized negotiation enhances efficiency but without dynamic USV trajectory adaptation, full coverage for large-scale tasks is limited.

Baseline 4 (PSO-MO): Multi-objective PSO further balances coverage, energy, and battery-replacement waiting time. TCR reaches 96.02% (20 points), 91.45% (50 points), and 88.31% (100 points), showing improved coverage over GT-MUS while handling multi-objective trade-offs.

Proposed Method: Closed-loop cooperative optimization achieves TCR ≥ 93% across all scales: 97.12% (20 points), 95.61% (50 points), and 93.11% (100 points). UAVs request resupply proactively, and the USV dynamically adjusts its trajectory, ensuring nearly full coverage and high operational efficiency.

In Figure 3, the Baseline 1 (SP) curve decreases monotonically as the number of task points grows, dropping from 90.63% at 20 points to 81.28% at 100 points, which highlights the limitations of pure UAV path planning under energy constraints. Baseline 2 (HS) performs better, ranging from 94.32% to 84.54%, but still declines as task scale grows, indicating that a fixed USV route cannot fully support large-scale coverage. Baseline 3 (GT-MUS) improves coverage further (95.18%, 90.33%, and 87.20%) through cooperative game-theoretic task allocation, and Baseline 4 (PSO-MO) reaches 96.02%, 91.45%, and 88.31% across small, medium, and large scales while balancing coverage and energy efficiency. The Proposed closed-loop cooperative method has the flattest curve, falling only about four percentage points from 97.12% to 93.11% (95.61% at 50 points), compared with drops of roughly eight to ten points for the baselines, which indicates robustness, scalability, and effective dynamic resupply.

(2): Total Mission Time (TMT)

Total mission time (TMT) gauges overall system efficiency and reflects the time performance of USV–UAV coordination in complex scenarios. We run 10 trials and report mean TMT for each method. Table 2 lists the data for different numbers of task points, and the comparison of average TMT is plotted in Figure 4.

From the results, Baseline 1 grows almost linearly with task points and exhibits a pronounced “bottleneck” beyond 50 points due to frequent long-distance returns for resupply, which increase redundant flight and idle waiting. Baseline 2 reduces some redundant paths via a preset USV route, lowering average TMT by about 10.00–12.00% compared with Baseline 1. However, the lack of real-time interaction with UAVs results in mismatches between the USV’s route and UAV energy states, leaving resupply scheduling suboptimal.

Baseline 3 (GT-MUS) further improves scheduling by applying a game-theoretic task allocation among UAVs. This approach reduces TMT by roughly 3.00–4.00% relative to Baseline 2, but it still does not fully leverage closed-loop USV–UAV coordination, limiting responsiveness to dynamic task and energy conditions. Baseline 4 (PSO-MO) achieves multi-objective optimization across mission time, energy consumption, and battery waiting, resulting in TMT values about 1.00–2.00% lower than GT-MUS, offering more balanced performance under larger task scales.

By contrast, the Proposed closed-loop USV–UAV optimization method achieves joint upper–lower-level coordination: the USV dynamically adjusts its position and docking stops based on UAV energy states and task distribution, while UAVs adaptively plan paths and resupply requests according to the USV’s real-time status. This approach effectively avoids idle time caused by returns or waiting for resupply. At 100 task points, the average TMT is 529.39 min, which is 14.94% lower than Baseline 1, 8.41% lower than Baseline 2, 4.33% lower than Baseline 3, and 1.99% lower than Baseline 4. Across all task scales, the proposed method consistently achieves the lowest TMT, with the flattest growth curve among all methods, demonstrating strong scalability and robustness in avoiding efficiency bottlenecks that affect traditional scheduling approaches.

(3): Total Energy Consumption (TEC)

For energy analysis, we examine total UAV energy consumption under multiple task scales. We run 10 trials and report the mean for each method; Table 3 summarizes the results, and the energy consumption of the five methods under different task scales is plotted in Figure 5.

From the results in the table, it can be observed that:

Baseline 1 (Static Path Planning): UAVs lack dynamic support from the USV and must frequently return over long distances for resupply, leading to significantly higher energy consumption. Energy usage grows approximately linearly with the number of task points.

Baseline 2 (USV–UAV Layered Scheduling): By employing the USV as a replenishment platform, energy consumption is moderately reduced, averaging an 8.00–10.00% decrease compared with Baseline 1. However, the fixed USV trajectory cannot adapt to UAV real-time task loads, so overall energy usage remains relatively high.

Baseline 3 (GT-MUS): The game-theoretic multi-UAV scheduling balances UAV workload through decentralized negotiation, reducing redundant travel and idle times. Compared with Baseline 1 and 2, TEC is decreased by approximately 14.00% and 6.00%, respectively.

Baseline 4 (PSO-MO): Particle swarm optimization jointly minimizes mission time and energy consumption, achieving smoother energy distribution across UAVs. TEC is further reduced by 16.00% relative to Baseline 1 and 7.50% relative to Baseline 2, showing strong multi-objective optimization capability.

Proposed Method (Closed-Loop USV–UAV Optimization): By dynamically adjusting USV position and adaptively allocating UAV tasks, redundant flights and resupply trips are minimized. Average energy consumption is reduced by 13.37% compared with Baseline 1 and by 10.38% compared with Baseline 2, and by about 3.50–4.00% compared with Baseline 3 and 4. Variance in energy consumption is also significantly lower, indicating balanced multi-UAV workload and improved overall system energy efficiency.

Figure 5 presents the total energy consumption (TEC) curves of all five methods under varying task scales. The Baseline 1 curve consistently remains at the highest level, reflecting the large energy cost caused by frequent long-distance UAV returns without dynamic USV support. Baseline 2 shows a moderate improvement due to the USV’s fixed replenishment support, but the lack of adaptive scheduling still leads to relatively high energy consumption. Baseline 3 (GT-MUS) reduces TEC further by balancing UAV workloads via game-theoretic task allocation, while Baseline 4 (PSO-MO) achieves even lower TEC by optimizing multi-objective scheduling through particle swarm optimization. The Proposed closed-loop USV–UAV method consistently achieves the lowest energy consumption across all task scales, with a flatter growth curve that indicates strong scalability and effective reduction of redundant UAV flights and resupply trips. These results clearly demonstrate that the proposed cooperative optimization mechanism not only lowers average energy consumption but also improves workload balance among UAVs, validating its superior energy efficiency for large-scale maritime missions.

(4): Analysis of Battery Replacement Waiting Time (BWT)

In multi-UAV concurrent operations, replenishment resources (such as battery replacement equipment on the USV) often become the critical bottleneck that constrains task efficiency. The battery replacement waiting time (BWT) directly reflects UAV queuing delays during replenishment and is therefore an important indicator for evaluating scheduling performance.

To analyze UAV battery replacement efficiency under different task scales, tasks are classified into three categories:

Small-scale tasks: 10 UAVs distributed across 1–2 operational areas, working simultaneously.

Medium-scale tasks: 10 UAVs distributed across 3–4 operational areas, with increased number of tasks and larger inter-task spacing.

Large-scale tasks: 10 UAVs distributed across 5–6 operational areas, with higher task density and more frequent returns for replenishment.

Each experiment was repeated 10 times, and the average BWT was taken as the statistical result for each of the five methods. Table 4 provides a quantitative comparison of BWT under different task scales.

Comparison of the five methods:

Baseline 1: UAVs return to a fixed USV point. BWT increases linearly with task scale, and local congestion occurs during peaks, especially in medium and large tasks.

Baseline 2: UAVs return to a moving USV. While mobility alleviates some energy issues for small tasks, simultaneous UAV arrivals exacerbate queuing, causing BWT to increase sharply, exceeding 20% of mission time for large tasks.

Baseline 3 (GT-MUS): Game-theoretic task allocation balances UAV arrivals better than Baseline 1 and 2, reducing BWT by 5–10% for medium and large tasks.

Baseline 4 (PSO-MO): Multi-objective optimization further coordinates UAV returns, maintaining slightly lower BWT than GT-MUS under all task scales.

Proposed Method: By predicting UAV replenishment demand and dynamically allocating time slots with priority adjustments, the Proposed method maintains the lowest and most stable BWT across all scales. For example, for large-scale tasks, BWT is 11.55 min, which is 16.3% lower than Baseline 1 and 29.7% lower than Baseline 2.

To examine how fleet size affects queuing at the USV, tasks were further categorized into three scales with varying UAV numbers:

Small-scale tasks: 1–2 operational areas, 2–4 UAVs, low task density, and fewer conflicts during single replenishment, resulting in relatively low BWT.

Medium-scale tasks: 3–4 operational areas, 5–7 UAVs, and moderate task density, with some UAVs returning simultaneously, leading to a sharp increase in BWT.

Large-scale tasks: 5–6 operational areas, 8–10 UAVs, high task density, and frequent simultaneous returns, causing the fastest BWT growth and potential congestion.

Each experiment was repeated 10 times, and the mean BWT was calculated. Table 5 summarizes the results.

Small-scale tasks: With fewer UAVs and lower task density, replenishment demand is less concentrated, so BWT remains relatively low for all methods. However, the Proposed method consistently achieves lower BWT than Baseline 2, demonstrating the effectiveness of advance time-window scheduling and dynamic priority allocation in smoothing UAV arrivals and reducing queuing.

Medium-scale tasks: As UAV numbers and inter-task distances increase, multiple UAVs often return to the USV simultaneously, sharply increasing queuing times. Baseline 2 exhibits the fastest growth in BWT, exceeding 20% of the mission duration in some cases. In contrast, the Proposed method predicts UAV replenishment demand and dynamically allocates priorities, effectively mitigating congestion and reducing BWT by approximately 30% relative to Baseline 2. Baseline 3 (GT-MUS) and Baseline 4 (PSO-MO) also improve BWT compared with Baseline 1 and 2 but to a lesser extent than the Proposed method.

Large-scale tasks: High task density and frequent simultaneous UAV returns exacerbate queuing. Both Baseline 1 and Baseline 2 suffer from severe congestion, with peak BWTs rising significantly. The Proposed method, by maintaining dynamic scheduling and priority allocation, ensures BWT remains relatively low and stable. Fluctuations across repeated trials are minimal, confirming the robustness of the strategy and its ability to maintain replenishment efficiency and fairness under heavy load.

These results confirm that the Proposed method effectively addresses resource contention in multi-UAV operations, providing both efficiency and fairness in battery replacement scheduling across different task scales, as shown in Figure 6.

By comprehensively analyzing four key performance indicators—total mission time (TMT), task completion rate (TCR), total energy consumption (TEC), and battery replacement waiting time (BWT)—the overall effectiveness of the five scheduling methods in multi-UAV concurrent operations can be systematically evaluated.

Experimental results indicate that the Proposed method consistently achieves the best mission performance. Across small- to large-scale tasks, UAVs exhibit significantly shorter average mission completion times compared with Baseline 1 and Baseline 2, while simultaneously achieving higher task coverage rates, ensuring both comprehensiveness and reliability.

In terms of energy efficiency, Baseline 1 shows the highest consumption due to frequent long-distance returns, whereas Baseline 2 reduces energy use slightly through a mobile USV, yet fixed trajectories still constrain efficiency. In contrast, the Proposed method balances UAV workloads and employs dynamic path planning, effectively minimizing unnecessary returns and redundant flights. Consequently, total energy consumption is reduced by approximately 12–18% on average, improving operational economy.

BWT analysis further highlights scheduling efficiency: Baseline 2 suffers from severe queuing delays in medium- and large-scale tasks, whereas the Proposed method leverages time-window scheduling and dynamic priority allocation, allowing UAVs to secure replenishment slots before arrival. This reduces BWT by roughly 30% on average, alleviating the resource contention bottleneck while maintaining fairness.

In summary, across all four metrics, the Proposed method outperforms previous approaches in mission efficiency, energy economy, and replenishment scheduling. Its advantages are particularly pronounced in large-scale or high-density scenarios, demonstrating that combined time-window scheduling and dynamic priority allocation can enable efficient, energy-saving, and fair operations in multi-UAV concurrent missions.

5. Conclusions and Future Work

In this study, we proposed a USV–UAV cooperative path planning and energy replenishment optimization method based on BeiDou high-precision positioning, aiming to address the limitations of UAV endurance, resource bottlenecks, and scheduling inefficiencies in large-scale maritime operations. A comprehensive system model was established to integrate UAV path planning, USV trajectory optimization, and battery replacement scheduling, with multi-objective constraints including task coverage, energy consumption, and waiting time. Building on this, a bi-level cooperative optimization framework was designed in which the upper layer is responsible for USV trajectory and replenishment strategy, while the lower layer focuses on multi-UAV path planning and battery scheduling. A closed-loop interaction mechanism between the two layers enables adaptive coordination between the USV and UAVs. Furthermore, an integrated optimization algorithm was developed by combining heuristic optimization and multi-agent reinforcement learning, which realizes global task allocation, adaptive path planning, and dynamic priority-based replenishment scheduling.

Simulation experiments under different task scales demonstrated that the proposed method significantly outperforms baseline strategies. Specifically, the proposed approach achieved a higher task completion ratio, consistently above 93%; reduced total mission time by about 8–15%; lowered total energy consumption by 12–18%; and shortened battery replacement waiting time by approximately 30%. These results indicate that the proposed method achieves superior efficiency, energy economy, and fairness, particularly in large-scale or high-density task scenarios.

The proposed framework is inherently scalable and can be extended to more complex multi-agent scenarios. Although the current study focuses on a single USV, the upper-layer task planning and trajectory optimization modules can accommodate multiple USVs by integrating additional coordination constraints and inter-USV communication, enabling simultaneous task execution by several surface vessels. Similarly, the multi-UAV system is designed to handle large fleets through distributed scheduling and adaptive task allocation strategies. While the simulations demonstrate effective operation for up to 10 UAVs and 100 task points, the framework could be scaled to hundreds or thousands of UAVs with more efficient distributed decision-making algorithms and computational strategies. Deploying multiple USVs and very large UAV fleets introduces challenges such as coordination among heterogeneous agents, communication reliability under dynamic maritime conditions, and computational complexity, which must be addressed in future work.

Despite the encouraging simulation results, several practical challenges remain before real-world deployment. First, maritime communication latency and intermittent connectivity may affect the timeliness of data exchange between the USV and UAVs, potentially impacting closed-loop coordination. Second, BeiDou high-precision positioning, while robust under normal conditions, can experience signal degradation in severe weather or sea states, requiring adaptive localization strategies and redundant navigation sources. Third, future field tests are essential to evaluate system performance under real operational conditions, including waves, wind, and communication interference. Addressing these challenges will be critical for translating the proposed framework from simulation to practical maritime missions.

In addition, the current model assumes ideal navigation and communication conditions. For real-world deployment, environmental uncertainties such as currents, winds, and signal degradation, as well as communication delays, must be considered. These can be incorporated via robust or stochastic optimization methods, uncertainty-aware path planning, adaptive feedback control, redundant communication links, and real-time status monitoring, allowing dynamic task reallocation and trajectory adjustment. Addressing these uncertainties is necessary to ensure reliable, safe, and coordinated operation in practical maritime scenarios.

Finally, field experiments with real USV–UAV platforms are essential to validate practical effectiveness, robustness, and operational efficiency. Further integration with the BeiDou navigation system, leveraging advanced functions such as short-message communication, precise timing, and regional augmentation, will enhance collaboration, reliability, and autonomy. Establishing standardized operational protocols and adaptive strategies for heterogeneous fleets will also be necessary to ensure safe and efficient large-scale deployment.

In conclusion, the primary contributions of this study are an adaptive closed-loop optimization framework and a hybrid GA–MARL scheduling algorithm. Together, they enable dynamic, energy-aware coordination of USV and UAV fleets. In simulation, the proposed method maintained a task completion rate above 93% at the largest task scale (Table 1) and recorded the lowest energy consumption and BWT at every tested scale (Tables 3, 4, and 5); it also achieved the shortest total mission time at four of the five task scales (Table 2). These results suggest that the method is a promising candidate for large-scale maritime missions, provided that the practical considerations discussed above are addressed before operational deployment under realistic environmental conditions.

Author Contributions

Conceptualization, J.Y.; methodology, L.Z.; software, J.Y.; validation, J.Y.; data curation, L.Z. and B.P.; original draft preparation, J.Y.; review and editing, L.Z. and B.P.; visualization, B.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was financially supported by the GDNRC [2024]39.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data presented in this paper are government-mandated confidential surveying and mapping data subject to strict confidentiality regulations, so the raw data cannot be shared publicly. Reviewers or readers interested in the research process, code, or results are welcome to contact the authors, who will provide additional information to the extent permitted.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Overall System Architecture of the USV–UAV Cooperative Optimization Framework.

Figure 2. Algorithmic Flow of the Hybrid GA–MARL Optimization.

Figure 3. TCR curves of the five methods.

Figure 4. Comparison of Average TMT.

Figure 5. Energy consumption of the five methods under different task scales.

Figure 6. Variation with task scale and methods.

Table 1. Task completion rate (%) under different methods.

Task Scale	Baseline 1	Baseline 2	Baseline 3	Baseline 4	Proposed
20	90.63	94.32	95.18	96.02	97.12
50	85.47	88.75	90.33	91.45	95.61
100	81.28	84.54	87.20	88.31	93.11

Table 2. Total mission time at different task scales (minutes).

Task Points	Baseline 1	Baseline 2	Baseline 3	Baseline 4	Proposed
20	181.05	168.36	164.52	162.83	160.36
40	324.12	291.23	283.47	280.12	281.86
60	453.05	395.39	378.21	372.80	368.30
80	547.68	485.23	466.85	454.12	440.69
100	622.36	577.98	553.47	540.28	529.39
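The relative mission-time reductions implied by Table 2 can be reproduced with a few lines of Python. The sketch below only re-uses the values printed in the table; the helper names `reduction_pct` and `summarize` are introduced here for illustration and do not come from the study.

```python
# Total mission time (min) transcribed from Table 2.
# Column order: Baseline 1, Baseline 2, Baseline 3, Baseline 4, Proposed.
TABLE_2 = {
    20: (181.05, 168.36, 164.52, 162.83, 160.36),
    40: (324.12, 291.23, 283.47, 280.12, 281.86),
    60: (453.05, 395.39, 378.21, 372.80, 368.30),
    80: (547.68, 485.23, 466.85, 454.12, 440.69),
    100: (622.36, 577.98, 553.47, 540.28, 529.39),
}


def reduction_pct(baseline, proposed):
    """Relative reduction of `proposed` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - proposed) / baseline


def summarize(table):
    """Map each task count to (reduction vs Baseline 1, reduction vs fastest baseline)."""
    summary = {}
    for points, row in table.items():
        *baselines, proposed = row
        summary[points] = (
            round(reduction_pct(baselines[0], proposed), 2),
            round(reduction_pct(min(baselines), proposed), 2),
        )
    return summary


if __name__ == "__main__":
    for points, (vs_b1, vs_best) in summarize(TABLE_2).items():
        print(f"{points:>3} task points: {vs_b1:6.2f}% vs Baseline 1, "
              f"{vs_best:6.2f}% vs fastest baseline")
```

Relative to Baseline 1 the reduction ranges from about 11% to about 20%, while relative to the fastest baseline in each row it stays below 3%. The value for 40 task points is negative, because Baseline 4 (280.12 min) is 1.74 min faster than the proposed method (281.86 min) in that row.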

Table 3. Energy consumption under different numbers of task points (Wh).

Task Points	Baseline 1	Baseline 2	Baseline 3	Baseline 4	Proposed
20	88.68	83.94	81.05	80.42	79.99
40	192.85	178.97	168.34	165.82	162.33
60	311.22	278.13	266.78	259.41	257.57
80	384.27	365.52	342.66	335.21	328.24
100	477.96	446.55	420.32	412.87	400.19

Table 4. Quantitative comparison of BWT under different task scales (min).

Task Scale	Baseline 1	Baseline 2	Baseline 3	Baseline 4	Proposed
Small	7.81	8.52	7.05	6.95	6.62
Medium	10.12	12.13	10.25	10.00	8.45
Large	13.62	16.42	14.20	13.85	11.55
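For reference, the relative BWT reduction implied by Table 4 can be worked out directly. The symbol $r$ for the relative reduction of the proposed method with respect to a baseline is introduced here and is not part of the original notation. Taking the Baseline 2 column as the reference gives:

```latex
r = \frac{\mathrm{BWT}_{\text{baseline}} - \mathrm{BWT}_{\text{proposed}}}{\mathrm{BWT}_{\text{baseline}}}

\begin{aligned}
r_{\text{small}}  &= \frac{8.52 - 6.62}{8.52}    \approx 22.3\% \\
r_{\text{medium}} &= \frac{12.13 - 8.45}{12.13}  \approx 30.3\% \\
r_{\text{large}}  &= \frac{16.42 - 11.55}{16.42} \approx 29.7\%
\end{aligned}
```

The medium and large values are in line with the roughly 30% waiting-time reduction quoted in the Highlights, although the source does not state which baseline that figure refers to. Against Baseline 1, the large-scale reduction is (13.62 - 11.55)/13.62, or about 15.2%, so the size of the gain depends on the baseline chosen.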

Table 5. Average BWT under different task scales and UAV numbers (min).

Task Points	Task Scale	UAVs	Baseline 1	Baseline 2	Baseline 3	Baseline 4	Proposed
15	Small	3	6.25	6.86	5.92	5.80	4.59
30	Small	4	7.82	8.55	7.10	6.98	5.61
45	Medium	6	12.25	17.10	13.85	13.50	10.44
60	Medium	7	14.81	18.42	15.70	15.30	10.98
80	Large	8	16.51	19.24	17.30	16.90	14.51
100	Large	10	24.32	26.85	23.10	22.70	18.76

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
