Edge-Intelligence-Driven Cooperative Control Framework for Heterogeneous Unmanned Aerial and Surface Vehicles in Complex Maritime Environments

Yang, Jingfeng; Zhao, Lingling; Peng, Bo

doi:10.3390/drones9110755

Open Access Article

by Jingfeng Yang ^1,2, Lingling Zhao ^1,2,* and Bo Peng ^3,*

¹ Key Lab of Guangdong for Utilization of Remote Sensing and Geographical Information System, Guangdong Open Laboratory of Geospatial Information Technology and Application, Guangzhou 510070, China

² Guangzhou Institute of Geography, Guangdong Academy of Sciences, Guangzhou 510070, China

³ College of Environment and Climate, Jinan University, Guangzhou 511443, China

* Authors to whom correspondence should be addressed.

Drones 2025, 9(11), 755; https://doi.org/10.3390/drones9110755

Submission received: 10 September 2025 / Revised: 23 October 2025 / Accepted: 23 October 2025 / Published: 31 October 2025

(This article belongs to the Section Unmanned Surface and Underwater Drones)

Abstract

With the increasing deployment of unmanned systems in maritime patrol, coastal monitoring, and environmental mapping, achieving effective UAV-USV collaboration in dynamic environments remains challenging. This paper proposes an edge-intelligence-driven collaborative control framework that integrates unified data modeling, multi-objective task scheduling, lightweight fault-tolerant middleware, and multi-sensor fusion. A Weighted Kalman Filter combines UAV imaging and USV sonar data to enhance perception accuracy, while NSGA-II optimizes task allocation considering completion time, energy consumption, and sensing reliability. The framework was validated through representative maritime scenarios, including patrol and coastal sediment mapping, on a virtual simulation platform. Results show improved task efficiency, energy utilization, communication latency, and robustness compared with single-platform and centralized scheduling approaches. The proposed method provides a balanced optimization of execution efficiency, energy consumption, data accuracy, and resilience, offering a reliable solution for large-scale maritime applications.

Keywords: unmanned systems; UAV-USV collaboration; edge computing; multi-objective optimization; multi-modal data fusion; system robustness

1. Introduction

In recent years, with the rapid advancement of artificial intelligence, wireless communication, sensing, and autonomous control technologies, the collaborative operation of heterogeneous unmanned systems—especially unmanned aerial vehicles (UAVs) and unmanned surface vehicles (USVs)—has demonstrated significant potential in tasks such as ocean monitoring, emergency search and rescue, coastal mapping, and law enforcement patrols. However, differences in dynamic models, sensor modalities, communication protocols, and computing resources between UAVs and USVs pose systemic challenges for cross-platform real-time collaboration, robust communication, and multi-task scheduling [1,2,3,4]. For example, UAV-USV cooperative control for maritime search and rescue and coastal defense monitoring must ensure robustness and timeliness in key processes such as path tracking, formation control, and safe landing under environments characterized by strong disturbances and highly fluctuating communication links [5,6,7]. These challenges have become core bottlenecks that hinder the engineering-scale application of multi-platform collaboration.

Edge computing has been widely recognized as an effective approach to overcome the “limited computing power–latency sensitivity” dilemma faced by unmanned systems. By leveraging nearby edge nodes to support task decomposition, model inference, and data preprocessing, it is possible to reduce communication–computation round-trip latency and enhance system scalability and reliability without significantly increasing the onboard computational burden [8,9,10,11]. Existing surveys show that edge intelligence tailored for UAV–USV collaboration can effectively improve task completion time, energy utilization efficiency, and robustness under link congestion. Furthermore, trajectory-aware offloading, computation offloading, and multi-objective optimization for resource orchestration have gradually emerged as research frontiers [12,13,14].

Meanwhile, middleware communication and network protocol stacks that support collaboration have a significant impact on system performance. Under the ROS 2 ecosystem, middleware options such as MQTT, Zenoh, and DDS demonstrate different performance characteristics in terms of latency, throughput, and packet loss across end-to-end and multi-host distributed scenarios. These must be carefully balanced and designed with redundancy to account for link fluctuations in maritime environments [15]. Recent studies have also explored high-performance robotic middleware and deterministic scheduling semantics to establish predictable and low-jitter real-time communication channels, thereby providing infrastructure support for task-level collaboration in autonomous systems [16].

At the control and task level, research on UAV–USV formation, cooperative path following, autonomous landing, and target tracking has achieved notable progress. These include cooperative tracking and landing methods based on nonlinear model predictive control (NMPC), synchronized motion control under complex maritime conditions, and improved line-of-sight (LOS) guidance laws that consider disturbances and overshoot, all of which significantly enhance the feasibility of collaborative systems in real maritime environments [17,18,19,20]. In addition, studies on multi-USV cooperative patrols and ASV formation-based target tracking provide algorithmic and experimental foundations for extending collaboration across platforms [21,22].

To achieve a comprehensive balance of task efficiency, perception accuracy, and resource consumption, task decomposition and scheduling are commonly modeled as multi-objective optimization problems. The Non-dominated Sorting Genetic Algorithm II (NSGA-II) and its variants have demonstrated strong convergence and solution diversity in computation offloading, workflow scheduling, and resource allocation across edge–cloud collaborative environments, achieving higher utility under uncertainty constraints [23,24,25].

Furthermore, multi-source sensor data fusion and anomaly self-recovery are also critical for ensuring stable maritime collaborative operations. Multi-sensor fusion based on weighted or adaptive Kalman filtering is widely applied in state estimation and environmental perception, effectively improving estimation accuracy and consistency under noise and drift [26,27]. Meanwhile, anomaly detection frameworks that combine LSTM, autoencoders, and sliding-window strategies can provide early warnings and task rescheduling in scenarios involving link jitter, device degradation, or data anomalies, thereby improving overall system robustness [28,29,30].

Building a UAV-USV collaborative control framework for complex maritime environments requires a closed-loop design that integrates unified data modeling and time synchronization, edge-intelligence-driven resource and task scheduling, lightweight middleware with fault-tolerant communication, multi-objective task decomposition and scheduling, and multi-sensor fusion with anomaly self-recovery. However, current research still exhibits the following shortcomings:

➀: Lack of unified data modeling and synchronization mechanisms, making it difficult to ensure consistency in heterogeneous multi-source data fusion between UAVs and USVs;
➁: Absence of holistic solutions that leverage edge computing for dynamic resource scheduling and computational support, limiting real-time responsiveness and scalability;
➂: Static communication architectures without sufficient fault-tolerance and self-recovery mechanisms, unable to cope with link fluctuations in maritime environments;
➃: Scarcity of research on multi-objective task scheduling optimization, making it difficult to balance task efficiency, resource allocation, and perception accuracy simultaneously;
➄: Insufficient mechanisms for detecting and automatically recovering from anomalies such as computational degradation, communication disruptions, or sensor failures, resulting in limited system reliability.

To address these challenges, this paper proposes an edge-intelligence-driven heterogeneous UAV-USV collaborative control framework for complex maritime environments, integrating the following key technologies:

➀: Unified data modeling and time synchronization mechanism: Standardizes UAV and USV multi-source data formats and timestamps to enable consistent fusion;
➁: Edge-computing-based dynamic resource and task scheduling mechanism: Allocates computational resources in real time via edge nodes to enhance responsiveness;
➂: Lightweight middleware communication and fault-tolerance design: Supports link switching, message acknowledgment, and retransmission to enhance communication robustness;
➃: Multi-objective task decomposition and scheduling optimization: Utilizes NSGA-II to balance task completion time, energy consumption, and perception accuracy;
➄: Weighted Kalman filter-based multi-sensor fusion scheme: Improves environmental perception accuracy;
➅: Hybrid anomaly detection and self-recovery mechanism: Combines statistical and learning-based models to achieve fault identification and task rescheduling.

All performance evaluations reported in this study, including metrics such as Execution Utilization Efficiency (EUE), Dynamic Fault Adaptation (DFA), and Success Rate (SR), were conducted over multiple repetitions to ensure reliability. Specifically, each experiment was repeated 10 times, and the results are presented as mean values with corresponding standard deviations to reflect variability and robustness.

Based on these innovations, this paper develops an edge-intelligence-driven collaborative control framework and validates it through representative maritime patrol and coastal mapping tasks. The results demonstrate that the proposed framework outperforms baseline schemes in terms of task completion efficiency, communication latency, and perception accuracy.

The main contributions and highlights of this paper are as follows:

An edge-intelligence-driven cooperative control framework for heterogeneous UAV-USV systems is proposed. This framework integrates unified data modeling, multi-objective task scheduling, lightweight fault-tolerant middleware, and multi-sensor fusion to tackle the challenges of collaborative operations in complex maritime environments.

A multi-sensor fusion scheme based on a Weighted Kalman Filter is designed. By dynamically fusing image data from UAVs and sonar data from USVs, this scheme significantly improves the system’s perception accuracy and robustness in varying conditions.

The NSGA-II algorithm is employed for multi-objective task allocation optimization. This approach achieves Pareto-optimal scheduling by holistically considering three core metrics: task completion time, platform energy consumption, and sensing reliability.

The proposed framework is systematically validated through representative maritime scenarios. By simulating maritime patrol and coastal sediment mapping tasks, the results demonstrate that our framework shows significant advantages in task efficiency, energy utilization, communication latency, and system robustness compared to single-platform or traditional centralized scheduling methods.

The remainder of this paper is organized as follows: Section 2 details the proposed method, including the system architecture, data processing, multi-sensor fusion, edge computing and resource scheduling, communication and fault-tolerance mechanisms, multi-objective task decomposition, and anomaly detection and self-recovery strategies. Section 3 presents the experimental setup, case study results, and comparative analysis. Finally, Section 4 concludes the paper and discusses future research directions.

2. Proposed Method

2.1. System Architecture

The proposed cooperative control system adopts a hierarchical and modular architecture, integrating heterogeneous UAVs, USVs, and edge nodes into a unified operational framework. This framework establishes a closed-loop system encompassing perception, communication, decision-making, and execution. At the top layer, a task planner defines the overall operational objectives. Edge nodes handle computation-intensive processing to reduce the onboard computational burden of UAVs and USVs. The middleware layer ensures reliable communication across distributed platforms. UAVs and USVs act as mobile executors, carrying out assigned tasks while transmitting sensing data and operational status in real time.

The system consists of seven functional modules, as illustrated in Figure 1:

➀: Unified data modeling and time synchronization: Standardizes heterogeneous sensor data (e.g., images, LiDAR, sonar, GPS/INS) into a consistent spatiotemporal format.
➁: Edge computing and resource scheduling: Distributes computational tasks among UAVs, USVs, and edge nodes to achieve load balancing and low-latency response.
➂: Middleware communication and fault-tolerance mechanisms: Provides lightweight message passing, heartbeat detection, and link switching to ensure stable network communication.
➃: Multi-objective task decomposition and scheduling: Decomposes complex missions into subtasks and optimizes allocation based on trade-offs among cost, latency, and error.
➄: Multi-sensor data fusion: Employs a Weighted Kalman Filter to improve environmental perception accuracy.
➅: Anomaly detection and self-recovery: Utilizes hybrid statistical and learning-based models to detect failures, trigger task rescheduling, and isolate faulty nodes.
➆: Feedback-based dynamic adjustment: Forms a closed-loop control system that adjusts task allocation and scheduling strategies dynamically based on real-time feedback.

The data flow begins with raw sensor inputs from UAVs and USVs, which are standardized through the data modeling module and then transmitted via middleware to edge nodes. Edge nodes perform computational processing (e.g., task scheduling and sensor fusion) and send optimized decisions back to the execution platforms. Control flow is bidirectional: high-level task assignments are issued from the task planner to UAVs and USVs, while operational states and feedback information are reported upward to refine decision-making. Through this closed-loop interaction, the system achieves real-time adaptability and robust operation in dynamic maritime environments.

2.2. Data Abstraction and Synchronization

In heterogeneous unmanned system collaboration, UAVs and USVs are typically equipped with a wide variety of sensors, such as optical cameras, LiDAR, sonar, GNSS/INS, and environmental monitoring sensors (e.g., anemometers and wave height gauges). These sensors differ significantly in output format, sampling frequency, and data accuracy. If raw data are processed directly, data fusion becomes difficult and control commands may be delayed. To address this issue, this paper proposes a unified modeling approach for multi-source heterogeneous data. Specifically, data from different sources are mapped into a standardized four-tuple consisting of a timestamp, position, orientation, and observation values, $D_{i} \overset{f(\cdot)}{\to} D_{i}^{*} = \{t, p, o, s\}$, where $D_{i}$ represents the raw data of platform $i$, $f(\cdot)$ is the mapping rule, $D_{i}^{*}$ is the standardized data structure, $t$ is the timestamp, $p$ denotes the position coordinates, $o$ refers to orientation information, and $s$ represents the sensor observations. This abstraction enables interoperability across platforms while preserving information integrity, thereby providing essential data support for subsequent multi-sensor fusion and task scheduling. To accommodate an increasing variety of sensors, the system also incorporates an extensible data dictionary mechanism, which allows new sensors to be integrated by simply defining mapping rules, without requiring major changes to the overall system framework. This enhances system flexibility and adaptability.
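The four-tuple and the extensible data dictionary can be illustrated with a minimal Python sketch. The class name `StandardRecord`, the registry `MAPPERS`, the sensor key `usv_sonar`, and the raw field names are all hypothetical choices made for this illustration; the paper does not specify a concrete schema.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class StandardRecord:
    """Standardized four-tuple {t, p, o, s} (field layout is an assumption)."""
    t: float                        # timestamp in seconds
    p: Tuple[float, float, float]   # position coordinates
    o: Tuple[float, float, float]   # orientation (roll, pitch, yaw)
    s: Dict[str, float]             # sensor observations


# Hypothetical data dictionary: sensor type -> mapping rule f(.)
MAPPERS: Dict[str, Callable[[dict], StandardRecord]] = {}


def register(sensor_type: str):
    """Decorator that adds a mapping rule for a new sensor type."""
    def wrap(fn):
        MAPPERS[sensor_type] = fn
        return fn
    return wrap


@register("usv_sonar")
def map_sonar(raw: dict) -> StandardRecord:
    # The raw field names below are invented for the example.
    return StandardRecord(
        t=raw["stamp_ms"] / 1000.0,
        p=(raw["east"], raw["north"], 0.0),
        o=(0.0, 0.0, raw["heading_deg"]),
        s={"depth_m": raw["depth"]},
    )


def standardize(sensor_type: str, raw: dict) -> StandardRecord:
    """Apply the registered mapping rule to one raw sample."""
    return MAPPERS[sensor_type](raw)


record = standardize(
    "usv_sonar",
    {"stamp_ms": 1500, "east": 10.0, "north": 20.0, "heading_deg": 90.0, "depth": 4.2},
)
```

Under this design, integrating a new sensor only requires registering one more mapping function; the rest of the pipeline consumes `StandardRecord` objects unchanged.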

On the basis of unified multi-source data modeling, time synchronization and calibration become critical to ensure effective data fusion. If timestamps between platforms are inconsistent, sensor information cannot be properly aligned during fusion, which may even cause failures in tasks such as path planning and obstacle avoidance. To address this, this paper proposes a two-level synchronization and calibration mechanism. First, global clock synchronization across UAVs, USVs, and edge nodes is achieved using GNSS time signals or the IEEE 1588 Precision Time Protocol (PTP), ensuring millisecond-level precision. Second, considering inherent delays and drift of different sensors, the system applies a local calibration strategy, using reference signals to correct sensor outputs. For example, when LiDAR has a delay of 30 ms, the system compensates for the timestamps of camera and GNSS data accordingly, ensuring multi-source observations are aligned to the same time. The corrected timestamp is defined as Equation (1):

$$t^{*} = t + \Delta t, \quad \Delta t = \frac{1}{N} \sum_{i=1}^{N} (t_{i} - \bar{t}) \tag{1}$$

where $t$ is the original timestamp, $\Delta t$ is the average correction across nodes, $t_{i}$ is the timestamp reported by node $i$ ($i = 1, \dots, N$), and $\bar{t}$ is the global reference time. This mechanism ensures high temporal alignment precision of cross-platform data in complex maritime environments.
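As a worked illustration of Equation (1), the following sketch computes the average correction and applies it to a timestamp exactly as the equation is written. The function names and the sample values (in milliseconds) are assumptions made for the example.

```python
def mean_correction(node_times, reference_time):
    """Average offset of the node timestamps from the global reference (Delta t)."""
    return sum(t_i - reference_time for t_i in node_times) / len(node_times)


def corrected_timestamp(t, node_times, reference_time):
    """Corrected timestamp t* = t + Delta t, following Equation (1) literally."""
    return t + mean_correction(node_times, reference_time)


# Three nodes report 1010, 1020, and 1030 ms against a reference of 1000 ms.
delta_t = mean_correction([1010, 1020, 1030], 1000)
t_star = corrected_timestamp(500, [1010, 1020, 1030], 1000)
```

Here the mean offset is 20 ms, so an original timestamp of 500 ms becomes 520 ms.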

In addition to time alignment, consistency of data during transmission and fusion is equally important. To this end, this paper designs three guarantee mechanisms. First, each data packet is assigned a globally unique sequence number to prevent duplication or loss during transmission. If the receiver detects discontinuity in sequence numbers, it will automatically trigger a retransmission request. Second, a sliding time-window alignment strategy is employed, where data fusion is only performed within a defined time window to ensure input data fall within the threshold. For example, within a 200 ms window, if the time difference between two sets of data exceeds 50 ms, the latter data will be deferred to the next fusion cycle. Finally, for redundant sensor data, the system performs consistency checks through correlation analysis to assess their reliability. When the output of one sensor significantly deviates from others, its weight in the fusion calculation is automatically reduced to avoid interference from inconsistent data.
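The first two guarantees can be sketched in a few lines of Python. This is only an illustration under stated assumptions: samples are `(timestamp_ms, payload)` pairs, the earliest sample in the window serves as the alignment anchor, and the function names are hypothetical. The 200 ms window and 50 ms threshold are taken from the example in the text.

```python
def missing_sequence_numbers(received):
    """Sequence numbers absent between the smallest and largest received ones."""
    seen = set(received)
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


def align_window(samples, window_start_ms, window_ms=200, max_skew_ms=50):
    """Split samples into those fused in this cycle and those deferred.

    Samples older than the window start are ignored. Samples inside the
    window are fused only if they lie within max_skew_ms of the earliest
    in-window sample (the anchor, an assumption of this sketch).
    """
    window_end = window_start_ms + window_ms
    current = sorted(
        (s for s in samples if window_start_ms <= s[0] < window_end),
        key=lambda s: s[0],
    )
    later = [s for s in samples if s[0] >= window_end]
    if not current:
        return [], later
    anchor = current[0][0]
    fused = [s for s in current if s[0] - anchor <= max_skew_ms]
    deferred = [s for s in current if s[0] - anchor > max_skew_ms] + later
    return fused, deferred


gaps = missing_sequence_numbers([1, 2, 4, 7])
fused, deferred = align_window(
    [(0, "camera"), (30, "sonar"), (80, "lidar"), (250, "gnss")],
    window_start_ms=0,
)
```

In this example the receiver would request retransmission of packets 3, 5, and 6; the camera and sonar samples are fused together, while the LiDAR sample (80 ms after the anchor) and the GNSS sample (outside the window) wait for the next cycle.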

The multi-source data abstraction, time synchronization and calibration mechanism, and data consistency guarantees proposed in this section provide a solid data foundation for subsequent edge computing, task scheduling, and multi-sensor fusion. Through these mechanisms, the system can maintain efficient and reliable operation in dynamic and unstable maritime environments, providing strong support for UAV-USV collaborative missions.

2.3. Multi-Sensor Data Fusion

In cooperative perception of unmanned systems, UAVs and USVs are equipped with diverse types of sensors, including optical cameras, LiDAR, sonar, and Acoustic Doppler Current Profilers (ADCP), as well as navigation devices such as GNSS/INS. Since different sensors vary in observation accuracy, sampling frequency, and applicable environments, a single sensor often cannot provide comprehensive and reliable information. For example, cameras perform poorly under low-light conditions, while sonar in shallow water may suffer from multipath interference. To address this, this paper proposes a multi-sensor fusion method based on a Weighted Kalman Filter (WKF) to improve the accuracy and robustness of environmental perception.

Correlation analysis was applied only among sensors measuring comparable variables (e.g., accelerometer and gyroscope). For heterogeneous sensors measuring different physical quantities (e.g., pressure vs. gyroscope), reliability assessment was instead based on statistical consistency checks and error propagation modeling, rather than direct correlation.

In state modeling, let the system state vector be $x_{k}$, whose dynamic evolution is described by the state transition Equation (2):

$$x_{k} = A x_{k-1} + \omega_{k}, \quad \omega_{k} \sim N(0, Q) \tag{2}$$

where $A$ is the state transition matrix, and $\omega_{k}$ is the process noise, assumed to follow a Gaussian distribution with zero mean and covariance $Q$. The observation value of the $i$th sensor is denoted as $z_{k}^{i}$, with the observation model defined as Equation (3):

$$z_{k}^{i} = H^{i} x_{k} + v_{k}^{i}, \quad v_{k}^{i} \sim N(0, R^{i}) \tag{3}$$

where $H^{i}$ is the observation matrix of the $i$th sensor, and $v_{k}^{i}$ is the observation noise with covariance $R^{i}$.

During the fusion process, the system first obtains a prior estimate from the prediction step as Equation (4):

$$\hat{x}_{k|k-1} = A \hat{x}_{k-1|k-1}, \quad P_{k|k-1} = A P_{k-1|k-1} A^{T} + Q \tag{4}$$

Then, multi-source observations are combined through weighted updating. Define the fusion weight matrix as $W = \{\omega^{i}\}$, where $\sum_{i} \omega^{i} = 1$. The multi-sensor fusion Kalman gain is calculated as Equation (5):

$$K_{k} = P_{k|k-1} H^{T} \left(H P_{k|k-1} H^{T} + \sum_{i} \omega^{i} R^{i}\right)^{-1} \tag{5}$$

where $P_{k|k-1}$ is the prior covariance, $H$ is the combined observation matrix, and the term in parentheses combines the projected prior covariance with the weighted observation noise covariance.

The Kalman gain matrix $K_{k}$ is a core parameter in the weighted Kalman filter, used to fuse the system's predicted state with observations from multiple sensors. Specifically, $P_{k|k-1}$ represents the state prediction covariance matrix at time $k$, reflecting the confidence or uncertainty of the prior prediction. If $P_{k|k-1}$ is large, it indicates high uncertainty in the prediction, and the update will rely more on sensor observations; conversely, if it is small, the update relies more on the prediction. The matrix $H$ is the combined observation matrix, which maps the system state into the observation space to allow comparison between the predicted state and sensor measurements. Its transpose, $H^{T}$, maps the observation error back into the state space, enabling the update of each state component.

The denominator term $\sum_{i} \omega^{i} R^{i}$ represents the weighted sum of observation noise covariance matrices, where $R^{i}$ is the observation noise covariance of the $i$th sensor and $\omega^{i}$ is its corresponding weight. The weights $\omega^{i}$ are dynamically adjusted based on the sensor's historical stability, environmental suitability, and real-time data consistency, so that sensors with higher confidence exert greater influence in the fusion process. The entire denominator, $H P_{k|k-1} H^{T} + \sum_{i} \omega^{i} R^{i}$, represents the combined uncertainty of prediction and observation, balancing their contributions to the state update. The matrix inversion $(\cdot)^{-1}$ ensures that the resulting state update minimizes the mean squared error.

Through this mechanism, the Kalman gain dynamically adjusts the fusion weights according to the uncertainties of the prediction and observation, allowing the system to maintain high-precision and robust state estimation under varying environmental conditions. In simple terms, when the prediction is reliable and the observations are uncertain, the update relies more on the prediction; conversely, when observations are reliable but the prediction is less accurate, the update relies more on the sensor measurements, thereby achieving optimal fusion of multi-sensor information.

Based on this, the updated state estimate and covariance are given by Equation (6):

$$\begin{aligned} \hat{x}_{k|k} &= \hat{x}_{k|k-1} + K_{k} \left(\sum_{i} \omega^{i} z_{k}^{i} - H \hat{x}_{k|k-1}\right) \\ P_{k|k} &= (I - K_{k} H) P_{k|k-1} \end{aligned} \tag{6}$$

Here, the weight coefficients $\omega^{i}$ are constrained to be non-negative and normalized such that $\sum_{i} \omega^{i} = 1$. This ensures that all sensors contribute proportionally, maintaining physical interpretability and preventing any single sensor from dominating the fusion result.
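To make Equations (4) to (6) concrete, the following is a minimal scalar sketch of one predict-update cycle. It assumes a one-dimensional state and that all sensors observe the state through the same scalar observation coefficient `H`; the function name `wkf_step` and the default parameter values are illustrative and are not taken from the paper.

```python
def wkf_step(x, P, zs, Rs, ws, A=1.0, H=1.0, Q=0.01):
    """One scalar weighted Kalman filter cycle.

    x, P : previous state estimate and its variance
    zs   : sensor observations z^i
    Rs   : observation noise variances R^i
    ws   : fusion weights w^i (non-negative, summing to 1)
    """
    if any(w < 0 for w in ws) or abs(sum(ws) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")

    # Prediction step, Equation (4)
    x_pred = A * x
    P_pred = A * P * A + Q

    # Weighted observation and weighted noise variance
    z_w = sum(w * z for w, z in zip(ws, zs))
    R_w = sum(w * R for w, R in zip(ws, Rs))

    # Kalman gain, Equation (5)
    K = P_pred * H / (H * P_pred * H + R_w)

    # State and variance update, Equation (6)
    x_new = x_pred + K * (z_w - H * x_pred)
    P_new = (1.0 - K * H) * P_pred
    return x_new, P_new, K


# Two sensors with equal weights observing the same quantity.
x_new, P_new, K = wkf_step(0.0, 1.0, zs=[1.0, 3.0], Rs=[1.0, 1.0], ws=[0.5, 0.5], Q=0.0)
```

With equal prior and observation uncertainty the gain is 0.5, so the estimate moves halfway from the prediction (0.0) to the weighted observation (2.0) and the variance is halved.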

To enhance robustness and adaptability, the weights are dynamically adjusted based on sensor confidence, historical stability, and real-time measurement consistency. For example, if a camera’s measurement quality decreases due to poor lighting, its weight is automatically reduced, while the weight of a more reliable sensor such as LiDAR or sonar is increased. This dynamic weighting mechanism allows the system to maintain high-accuracy state estimation under varying environmental conditions and sensor reliability, improving both perception accuracy and fault tolerance compared with the standard Kalman filter.

Overall, the weighted Kalman filter with dynamically constrained weights provides a reliable framework for fusing heterogeneous sensor data, supporting UAV-USV collaborative operations with robust and adaptive environmental perception.

In this mechanism, observations from different sensors are weighted according to their confidence levels. The confidence is determined based on sensor historical stability, environmental adaptability, and real-time data consistency. For example, when camera images are degraded due to adverse weather, their corresponding weights are automatically reduced, while the weights of sonar or LiDAR are increased, thereby ensuring stability and accuracy of the fused result.
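The paper does not give a closed-form rule for turning confidence into weights. One simple possibility, shown here purely as an assumption, is to clip the per-sensor confidence scores at zero and normalize them so that the weights are non-negative and sum to one. The function name and the example scores are hypothetical.

```python
def normalize_weights(confidence):
    """Map per-sensor confidence scores to non-negative weights summing to 1."""
    clipped = {name: max(0.0, score) for name, score in confidence.items()}
    total = sum(clipped.values())
    if total == 0.0:
        # No sensor is trusted: fall back to equal weights.
        return {name: 1.0 / len(clipped) for name in clipped}
    return {name: score / total for name, score in clipped.items()}


clear_weather = normalize_weights({"camera": 2.0, "sonar": 2.0})
bad_weather = normalize_weights({"camera": 1.0, "sonar": 3.0})
```

When the camera confidence drops, its weight falls from 0.5 to 0.25 and the sonar weight rises to 0.75, which matches the qualitative behavior described above.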

In summary, the proposed multi-sensor fusion method based on Weighted Kalman Filtering can effectively improve perception accuracy under uncertain environments and enhance adaptability to environmental variations through dynamic weight adjustment. This method provides UAV–USV cooperative systems with high-precision and robust environmental perception capabilities, serving as a critical prerequisite for reliable task scheduling and autonomous control.

2.4. Edge Computing and Resource Scheduling

In maritime cooperative operations, UAVs and USVs frequently perform computation-intensive tasks, such as real-time image recognition, environmental reconstruction, and multi-task path planning. Due to the limited onboard computing resources, executing all tasks locally is often impractical, which can lead to latency or even task execution failures. To address this challenge, the system introduces edge computing nodes and implements a dynamic resource allocation and collaborative task scheduling mechanism. This approach reduces the computational burden on UAVs and USVs while improving the overall responsiveness and robustness of the system.

The resource scheduling process is modeled as a multi-objective optimization problem with the following objectives: minimizing task completion time, minimizing energy consumption of UAVs/USVs, and maximizing task success rate. Assume the task set is $T = \{T_{1}, T_{2}, \dots, T_{n}\}$, and each task $T_{i}$ has attributes such as computational load $c_{i}$, required data size $d_{i}$, and maximum tolerable delay $t_{i}$. The system decides whether task $T_{i}$ should be executed locally on the platform, offloaded to an edge node, or offloaded to another cooperative platform.

The total delay of a task consists of communication latency and computation latency, as in Equation (7):

$$L_{i} = L_{i}^{comm} + L_{i}^{comp} \tag{7}$$

Here, the communication latency $L_{i}^{comm} = \frac{d_{i}}{B_{ij}}$ represents the time required to transmit the data $d_{i}$ (in bits) of task $T_{i}$ over a communication link with bandwidth $B_{ij}$ (in bits per second, bps). The computation latency $L_{i}^{comp} = \frac{c_{i}}{f_{ij}}$ represents the time required to execute the computational load $c_{i}$ (in CPU cycles) on a platform with processing capacity $f_{ij}$ (in cycles per second, Hz). Therefore, the total latency $L_{i}$ is expressed in seconds, which can be directly compared with the task deadline. This formulation of latency is proposed in this work, and thus no prior references are available.
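A short numerical sketch of Equation (7) compares local execution with offloading to an edge node. The task size, bandwidth, and processing capacities are invented for the example, and local execution is assumed to require no data transfer.

```python
def total_latency(data_bits, bandwidth_bps, cpu_cycles, capacity_hz):
    """Total latency in seconds: transmission time plus computation time."""
    comm = data_bits / bandwidth_bps
    comp = cpu_cycles / capacity_hz
    return comm + comp


# Task: 8 Mbit of input data and 2e9 CPU cycles of work (illustrative values).
edge_latency = total_latency(8e6, 4e6, 2e9, 4e9)   # 4 Mbps link, 4 GHz edge node
local_latency = total_latency(0, 4e6, 2e9, 1e9)    # no transfer, 1 GHz onboard CPU
```

In this example, offloading takes 2.5 s (2.0 s of transfer plus 0.5 s of computation) while local execution takes 2.0 s, which shows that a faster edge node does not help when the link is the bottleneck.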

The total energy consumption includes both computation and communication energy, as in Equation (8):

$$E_{i} = \alpha \cdot c_{i} f_{j}^{2} + \beta \cdot d_{i} \tag{8}$$

Here, $f_{j}$ is the CPU frequency of the executing node $j$, and $\alpha$ and $\beta$ are the computation and communication energy coefficients. The first term represents computation energy consumption, proportional to the square of the CPU frequency, and the second term represents communication energy consumption, proportional to the data volume.

The overall objective is to minimize the combined latency and energy consumption while satisfying task deadlines and resource constraints, as expressed in Equation (9):

$$\min \sum_{i} (\lambda_{1} L_{i} + \lambda_{2} E_{i}) \quad \text{s.t.} \quad L_{i} \leq t_{i} \tag{9}$$

where $\lambda_{1}$ and $\lambda_{2}$ are weighting coefficients reflecting the relative importance of latency and energy consumption. It should be noted that while the cost function focuses on individual task performance, temporal relations and dependencies among tasks (such as precedence, synchronization, and deadline constraints) are explicitly modeled in the scheduling constraints. This separation allows the cost function to remain task-centric while the scheduling framework ensures that inter-task temporal relations are properly satisfied.
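Equations (8) and (9) can be illustrated for a single task with a small sketch that evaluates the energy model and then picks the cheapest execution option that meets the deadline. The option dictionaries, coefficient values, and function names are assumptions made for the example; the paper's scheduler operates on whole task sets rather than one task at a time.

```python
def energy(cpu_cycles, freq_hz, data_bits, alpha, beta):
    """Energy model of Equation (8): computation term plus communication term."""
    return alpha * cpu_cycles * freq_hz ** 2 + beta * data_bits


def best_option(options, deadline_s, lam1, lam2):
    """Cheapest option under lam1 * latency + lam2 * energy that meets the deadline.

    Returns None when no option satisfies L_i <= t_i.
    """
    feasible = [o for o in options if o["latency"] <= deadline_s]
    if not feasible:
        return None
    return min(feasible, key=lambda o: lam1 * o["latency"] + lam2 * o["energy"])


options = [
    {"name": "local", "latency": 2.0, "energy": 2.0},
    {"name": "edge", "latency": 2.5, "energy": 0.5},
]
relaxed = best_option(options, deadline_s=3.0, lam1=1.0, lam2=1.0)
tight = best_option(options, deadline_s=2.2, lam1=1.0, lam2=1.0)
```

With a 3.0 s deadline the edge option wins on combined cost (3.0 versus 4.0), whereas a 2.2 s deadline rules it out and forces local execution.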

To solve this multi-objective optimization problem, the NSGA-II algorithm is employed, which ensures both convergence and solution diversity. Each individual in the population encodes a resource scheduling strategy, specifying whether each task is executed locally, at the edge, or on a cooperative platform. The algorithm iteratively refines these strategies through selection, crossover, and mutation, ultimately producing a set of Pareto-optimal solutions. Decision-makers can then select an appropriate scheduling scheme according to actual task requirements, for example, prioritizing minimum energy consumption or shortest latency.
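The core ranking step of NSGA-II is non-dominated sorting. The sketch below shows a simple layered version for minimization objectives; it is not the optimized fast non-dominated sort of the original algorithm, and it omits crowding distance, selection, crossover, and mutation. The objective tuples are invented (latency, energy) pairs.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_sort(points):
    """Group point indices into successive Pareto fronts (all objectives minimized)."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Each tuple is (latency, energy) for one candidate scheduling strategy.
candidates = [(1, 5), (2, 2), (5, 1), (3, 3), (6, 6)]
fronts = non_dominated_sort(candidates)
```

The first front contains the three strategies that trade latency against energy without being dominated; a decision-maker would choose among these according to mission priorities.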

Considering the dynamic and time-varying characteristics of maritime environments, the system supports adaptive scheduling. When network bandwidth, edge computing capacity, or UAV/USV energy levels change, the scheduling strategy can be recalculated in real time, ensuring adaptability to fluctuating resource conditions. This dynamic scheduling mechanism substantially enhances the robustness and flexibility of UAV–USV cooperative systems, allowing them to maintain stable and efficient operation under uncertain maritime conditions.

2.5. Middleware Communication and Fault Tolerance

In the collaborative control of heterogeneous unmanned systems, the stability and reliability of communication links are critical for task execution efficiency and overall system performance. In maritime environments, issues such as channel fading, bandwidth fluctuations, and varying inter-node distances frequently occur, making UAV–USV data exchanges prone to packet loss, increased latency, or even link interruption. To address these challenges, the system introduces a middleware-based communication and fault-tolerance mechanism designed to achieve robust and efficient cross-platform communication through lightweight message protocols and multi-level fault detection strategies.

Regarding the communication mechanism, the system employs a hybrid MQTT and UDP scheme. The MQTT protocol, with its lightweight and publish/subscribe architecture, enables asynchronous message delivery among UAVs, USVs, and edge nodes, making it suitable for transmitting task status updates and control commands. UDP, by contrast, is used for high-priority sensor data transmission, minimizing protocol overhead and latency to satisfy real-time performance requirements. This integrated approach ensures reliable command delivery while supporting the rapid transmission of large-scale data streams.

For fault tolerance, the system implements three key functions: heartbeat detection, link switching, and message acknowledgment with retransmission. First, the heartbeat mechanism periodically sends probe packets to monitor link status, and the link is considered abnormal if the heartbeat interval exceeds a 500 ms threshold. Second, the link switching mechanism automatically activates a backup communication channel when the primary link fails (for example, switching from satellite to cellular networks), thereby maintaining continuous system connectivity. Third, for critical commands and data packets, a message acknowledgment and timeout retransmission mechanism is employed; if the sender does not receive confirmation from the receiver, the message is retransmitted to prevent information loss.
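The three fault-tolerance functions can be illustrated with a minimal Python sketch. Only the 500 ms heartbeat threshold comes from the text; the class and function names, the retry count, and the use of caller-supplied timestamps are illustrative assumptions rather than the middleware's actual interface.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

HEARTBEAT_TIMEOUT_S = 0.5  # 500 ms threshold stated in the text


@dataclass
class LinkMonitor:
    """Tracks heartbeat arrivals and fails over to a backup link on silence."""
    links: List[str]                     # ordered by preference, primary first
    timeout_s: float = HEARTBEAT_TIMEOUT_S
    active: int = 0                      # index of the link currently in use
    last_seen: Optional[float] = None    # timestamp of the latest heartbeat

    def on_heartbeat(self, now: float) -> None:
        self.last_seen = now

    def check(self, now: float) -> str:
        """Return the link to use at `now`, switching if the heartbeat is stale."""
        stale = self.last_seen is not None and (now - self.last_seen) > self.timeout_s
        if stale and self.active + 1 < len(self.links):
            self.active += 1
            self.last_seen = now  # give the backup link a fresh grace period
        return self.links[self.active]


def send_with_ack(send: Callable[[bytes], bool], payload: bytes, max_retries: int = 3) -> int:
    """Send `payload` until `send` reports an acknowledgment.

    `send` is a hypothetical transport callback returning True when an ACK
    arrives before its timeout. Returns the number of attempts used.
    """
    for attempt in range(1, max_retries + 2):  # first attempt plus retries
        if send(payload):
            return attempt
    raise TimeoutError("no acknowledgment after retries")
```

Passing timestamps in explicitly keeps the monitor deterministic and easy to test; a deployed version would more likely read a monotonic clock.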

To further enhance system robustness, a communication quality-aware adaptive adjustment strategy is incorporated. When the system detects bandwidth degradation or increased packet loss, it dynamically adjusts data transmission priorities, allocating high-priority channels for critical control information while compressing, delaying, or throttling non-critical sensor data. This ensures that essential tasks continue to be executed effectively under constrained communication conditions.
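A minimal sketch of this quality-aware prioritization is given below. The numeric priority scheme (0 for critical control traffic) and the 5% loss limit are hypothetical choices made for illustration; the text does not specify them.

```python
def plan_transmission(messages, loss_ratio, loss_limit=0.05):
    """Split queued messages into (send_now, deferred).

    messages:   list of (priority, name) tuples; priority 0 = critical control.
    loss_ratio: currently measured packet loss ratio in [0, 1].
    loss_limit: hypothetical degradation threshold (not given in the text).
    """
    ordered = sorted(messages, key=lambda m: m[0])
    if loss_ratio <= loss_limit:
        return ordered, []          # healthy link: send everything, critical first
    send_now = [m for m in ordered if m[0] == 0]
    deferred = [m for m in ordered if m[0] != 0]
    return send_now, deferred       # degraded link: only critical traffic goes out
```

Deferred messages could then be compressed or sent at a reduced rate once the link recovers, as the paragraph above describes.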

By combining lightweight protocol integration with multi-level fault-tolerance strategies, the proposed middleware mechanism enables efficient, reliable, and resilient operation of UAV–USV collaborative systems in complex marine environments. This design reduces dependency on any single communication link and enhances adaptability under limited network conditions, providing a solid foundation for multi-objective task decomposition and dynamic scheduling.

2.6. Multi-Objective Task Decomposition and Scheduling

In collaborative operations of heterogeneous unmanned systems, tasks often exhibit high complexity and involve multiple performance objectives. For instance, in maritime patrol missions, it is necessary to minimize overall execution time, reduce UAV energy consumption, and ensure both coverage and accuracy of perception data. To address these requirements, task decomposition and scheduling are modeled as a multi-objective optimization problem, and evolutionary optimization algorithms are employed to derive a Pareto-optimal solution set, enabling dynamic trade-offs among competing objectives.

Let the set of tasks to be executed by the system be T, and the set of platforms (UAVs and USVs) be P. Each task must be assigned to exactly one platform, and the task assignment matrix is defined as X = [x_{ij}], where

x_{ij} = \begin{cases} 1, & \text{if task } T_{i} \text{ is assigned to platform } P_{j} \\ 0, & \text{otherwise} \end{cases}

To evaluate the quality of a task schedule, three optimization objectives are defined:

(1): Task Completion Cost

This accounts for the energy and resource consumption of platforms executing tasks. The total cost is defined as Equation (10):

Cost (X) = \sum_{i = 1}^{n} \sum_{j = 1}^{m} x_{i j} \cdot c_{i j}

(10)

where c_{ij} represents the cost of executing task T_{i} on platform P_{j}, including energy and computational expenditure.

(2): Task Completion Latency

Defined as the maximum, over all platforms, of the total execution time of the tasks assigned to that platform, as in Equation (11):

Latency(X) = \max_{j \in P} \left( \sum_{i = 1}^{n} x_{ij} \cdot t_{ij} \right)

(11)

where t_{ij} denotes the execution time of task T_{i} on platform P_{j}. This metric reflects the overall efficiency of task execution.

(3): Task Assignment Error

Considering the match between task requirements and platform capabilities, the assignment error is defined as Equation (12):

Error(X) = \sum_{i = 1}^{n} \sum_{j = 1}^{m} x_{ij} \cdot \| r_{i} - a_{j} \|

(12)

where r_{i} is the resource demand vector of task T_{i}, and a_{j} is the available resource vector of platform P_{j}. This metric measures the rationality of task-platform matching.

Accordingly, the multi-objective optimization problem can be formulated as:

\min F(X) = \{ Cost(X), Latency(X), Error(X) \}

subject to the constraints in Equation (13):

\sum_{j = 1}^{m} x_{i j} = 1, x_{i j} \in \{0, 1\}, \forall i \in T, \forall j \in P

(13)

The first constraint ensures that each task is assigned to exactly one platform, while the second enforces the discrete nature of task allocation.
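To make the three objectives concrete, the following sketch evaluates Equations (10)–(12) for a single schedule. It encodes the assignment as a list in which `assign[i] = j`, so the constraint in Equation (13) holds by construction. The Euclidean norm for the assignment error is an assumption, since the text does not state which norm is used, and the function and argument names are illustrative.

```python
import math


def evaluate(assign, cost, exec_time, demand, capacity):
    """Return (Cost, Latency, Error) for one schedule.

    assign[i] = j   : task i runs on platform j (one platform per task).
    cost[i][j]      : c_ij, cost of task i on platform j.
    exec_time[i][j] : t_ij, execution time of task i on platform j.
    demand[i]       : r_i, resource demand vector of task i.
    capacity[j]     : a_j, available resource vector of platform j.
    """
    total_cost = sum(cost[i][j] for i, j in enumerate(assign))

    # Latency is the busiest platform's total execution time (Equation (11)).
    load = [0.0] * len(capacity)
    for i, j in enumerate(assign):
        load[j] += exec_time[i][j]
    latency = max(load)

    # Euclidean distance between demand and capacity (assumed norm).
    error = sum(math.dist(demand[i], capacity[j]) for i, j in enumerate(assign))
    return total_cost, latency, error
```

An evolutionary optimizer would call such a function once per individual to obtain the objective vector used in non-dominated sorting.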

To solve this multi-objective optimization problem, the Non-dominated Sorting Genetic Algorithm II (NSGA-II) is employed. NSGA-II maintains solution diversity through non-dominated sorting and crowding distance evaluation, gradually converging toward the Pareto-optimal solution set over successive generations. The algorithm proceeds through population initialization, fitness evaluation based on the three objective functions, non-dominated sorting, generation of offspring via crossover and mutation, and elitist preservation.
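The two ranking mechanisms named above can be sketched as follows. This is a simplified illustration, not the implementation used in the experiments: fronts are peeled off by direct pairwise comparison rather than by the bookkeeping of the original fast non-dominated sort, and all objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_sort(points):
    """Return fronts as lists of indices; front 0 is the Pareto front."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


def crowding_distance(points, front):
    """Crowding distance per index in `front`; boundary solutions get infinity."""
    dist = {i: 0.0 for i in front}
    for k in range(len(points[0])):
        ordered = sorted(front, key=lambda i: points[i][k])
        lo, hi = points[ordered[0]][k], points[ordered[-1]][k]
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")
        if hi == lo:
            continue  # all solutions share this objective value
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            dist[cur] += (points[nxt][k] - points[prev][k]) / (hi - lo)
    return dist
```

During elitist preservation, individuals are kept front by front, and within the last admitted front those with larger crowding distance are preferred, which is what maintains diversity along the Pareto front.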

In the experiments, the NSGA-II parameters are set as follows: population size = 100, crossover probability = 0.9, mutation probability = 0.1, and maximum generations = 200. These values were chosen based on standard practice in multi-objective evolutionary optimization, prior studies [31,32,33], and preliminary experiments, and they effectively balance convergence speed and solution diversity. A larger population size ensures sufficient exploration of the solution space, while the selected crossover and mutation probabilities maintain a balance between exploitation and exploration.

To assess the sensitivity of the results to parameter variations, we conducted experiments by varying each parameter independently within a reasonable range: population size (50–150), crossover probability (0.70–0.95), and mutation probability (0.05–0.20). The effect on task assignment error, energy utilization efficiency (EUE), and system robustness (SR) was evaluated, and the results are summarized. The analysis shows that moderate variations in parameters do not significantly affect overall performance, indicating that the selected NSGA-II configuration is robust for the task scenarios considered.

These moderate variations in population size, crossover probability, and mutation probability also have minimal impact on the resulting Pareto front and the trade-offs among objectives, demonstrating the robustness of the algorithm under the selected parameter configuration. Consequently, the derived task schedules reliably balance execution cost, completion latency, and resource matching, providing decision-makers with a flexible set of Pareto-optimal solutions from which a schedule can be selected according to real-time operational priorities.

By adopting this multi-objective modeling and evolutionary optimization approach, the system can effectively balance task execution cost, completion latency, and resource matching, thereby enhancing the overall efficiency, reliability, and adaptability of UAV-USV collaborative operations.

2.7. Anomaly Detection and Self-Recovery

During the collaborative operation of unmanned systems, complex marine environments and distributed architectures can easily lead to various anomalies, such as sensor drift, communication link interruption, node offline, or sudden reduction in computing resources. Without effective anomaly detection mechanisms, these problems may accumulate and propagate within the system, potentially causing task failure or platform loss of control. Therefore, this study proposes a hybrid anomaly detection and self-recovery mechanism that combines statistical detection with deep learning prediction, which can detect potential anomalies at an early stage and trigger rapid recovery strategies, thereby significantly improving system robustness and continuous operational capability.

The anomalies to be detected are categorized as follows: sensor drift (e.g., gradual deviation in LiDAR or camera measurements), unexpected communication delays or packet loss, computing node offline events, and abrupt reductions in computational resources. These anomalies are monitored to prevent local failures from affecting the overall system performance.

In terms of anomaly detection, the system first uses a sliding-window statistical model to monitor key operational metrics in real time. Let the mean value of a metric within time window W be given by Equation (14):

S_{t} = \frac{1}{|W|} \sum_{i = t - |W| + 1}^{t} x_{i}

(14)

where x_{i} is the observation at time i and |W| is the window length. When the deviation between S_{t} and the historical mean exceeds threshold δ, it is judged as a potential anomaly. This method is suitable for detecting sudden fluctuations, such as sharp increases in communication latency or abnormal energy consumption.
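The sliding-window test of Equation (14) can be sketched in a few lines. The class name is illustrative, and the historical mean is assumed to be supplied as a fixed baseline; how it is actually maintained is not described in the text.

```python
from collections import deque


class SlidingWindowDetector:
    """Flags a potential anomaly when the window mean drifts from a baseline."""

    def __init__(self, window, delta, baseline):
        self.buf = deque(maxlen=window)  # keeps only the latest `window` samples
        self.delta = delta               # threshold on the deviation
        self.baseline = baseline         # historical mean (assumed known)

    def update(self, x):
        """Add observation x; return True if |S_t - baseline| exceeds delta."""
        self.buf.append(x)
        if len(self.buf) < self.buf.maxlen:
            return False                 # window not yet full
        s_t = sum(self.buf) / len(self.buf)
        return abs(s_t - self.baseline) > self.delta
```

For example, with a window of three samples, a baseline of 230 ms, and a threshold of 50 ms, a latency jump from about 230 ms to about 400 ms is flagged as soon as the first high sample enters a full window.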

On this basis, the system further employs a Long Short-Term Memory (LSTM) network to model and predict the time series of the metric. Let the LSTM model prediction at time t be \hat{x}_{t}, and define the prediction error as Equation (15):

e_{t} = |x_{t} - \hat{x}_{t}|

(15)

If e_{t} > θ, where θ is a dynamic threshold set based on the historical training set, an anomaly is determined to exist at that time. This method can identify nonlinear, long-term dependent anomaly patterns, such as gradual sensor drift or network performance degradation.
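The thresholding step of Equation (15) is independent of the predictor, so it can be sketched without the LSTM itself. In the sketch below the predictions are passed in as plain lists (any forecasting model could supply them), and the threshold rule, mean plus k standard deviations of the training errors, is an assumption; the text states only that θ is derived from the historical training set.

```python
from statistics import mean, pstdev


def fit_threshold(train_actual, train_pred, k=3.0):
    """Derive theta from training-set prediction errors (assumed rule: mean + k*std)."""
    errors = [abs(a - p) for a, p in zip(train_actual, train_pred)]
    return mean(errors) + k * pstdev(errors)


def flag_anomalies(actual, predicted, theta):
    """Return one boolean per time step: True where e_t = |x_t - x_hat_t| > theta."""
    return [abs(x - x_hat) > theta for x, x_hat in zip(actual, predicted)]
```

Refitting the threshold periodically on recent data is one way to make it dynamic in the sense described above.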

Anomaly verification is performed by cross-checking the detected anomaly against historical trends, multi-sensor consistency, and task performance metrics. Only when the deviation exceeds the threshold and is corroborated by multiple indicators is the anomaly confirmed. This multi-level verification helps prevent false positives due to transient fluctuations and ensures reliable detection.
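A minimal sketch of this verification rule follows. The indicator names and the requirement of at least two agreeing checks are hypothetical; the text says only that multiple indicators must corroborate the detection.

```python
def confirm_anomaly(deviation, threshold, corroborations, min_votes=2):
    """Confirm an anomaly only if the deviation exceeds the threshold AND
    at least `min_votes` independent checks agree.

    corroborations: dict mapping an indicator name to a boolean verdict,
    e.g. historical trend, multi-sensor consistency, task performance.
    """
    if deviation <= threshold:
        return False
    return sum(bool(v) for v in corroborations.values()) >= min_votes
```

A transient spike that trips the threshold but is supported by only one indicator is therefore discarded rather than escalated to the recovery module.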

Regarding the self-recovery mechanism, the system designs three strategies: task rescheduling, node isolation, and redundancy activation. When a computing node anomaly is detected, the scheduler immediately triggers task migration, reallocating its tasks to healthy nodes to ensure that critical tasks are not interrupted; when a sensor continuously outputs abnormal values, the system automatically isolates the sensor and substitutes it with redundant sensors or multi-source data fusion results; when a communication link is interrupted, the system activates backup links and interpolates or retransmits data gaps caused by the interruption.

Moreover, to further enhance the system’s adaptive capability, anomaly detection results are also fed back to the upper-level scheduling and control modules. When the system detects continuous degradation in communication quality, it dynamically adjusts task allocation, assigning more tasks to local execution nodes, thereby reducing dependence on remote communication.

The anomaly detection method based on a combination of sliding-window statistics and LSTM can effectively identify multiple types of anomalies, while self-recovery mechanisms such as task rescheduling, node isolation, and redundancy activation provide rapid response capabilities. The combination of the two forms a closed-loop anomaly management framework, enabling UAV–USV collaborative systems to maintain high reliability and continuous task execution in complex and dynamic marine environments.

2.8. Feedback-Based Dynamic Adjustment

In the collaborative control of unmanned systems, the task execution environment is often highly dynamic, characterized by uncertain target locations, frequent environmental disturbances, and random fluctuations in communication and computing resources. Without adaptive capability, it is difficult for the system to maintain efficiency and robustness during task execution. Therefore, this study introduces a feedback-based dynamic adjustment mechanism in the system architecture, which dynamically adjusts task allocation and computing resource scheduling strategies through real-time monitoring of task progress, resource utilization, and error metrics, thereby achieving closed-loop optimized control.

In terms of feedback modeling, a comprehensive performance feedback function is constructed as Equation (16):

F_{fb}(t) = λ_{1} P(t) + λ_{2} R(t) + λ_{3} E(t)

(16)

where P(t) represents the task progress completion rate, R(t) denotes the resource utilization efficiency, E(t) represents the system error compensation metric, and λ_{1}, λ_{2}, and λ_{3} are weight coefficients satisfying λ_{1} + λ_{2} + λ_{3} = 1. By adjusting the weights, different task scenarios can flexibly emphasize efficiency, energy consumption, or accuracy. For example, in maritime patrol, task completion progress and coverage are the primary objectives, and λ_{1} can be increased; meanwhile, in precision mapping, data accuracy is more important, and λ_{3} can be increased.
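Equation (16) can be evaluated as in the sketch below. The two weight profiles are hypothetical values chosen only to show how the emphasis shifts between scenarios, and the three inputs are assumed to be normalized to [0, 1] with higher values indicating better performance.

```python
# Hypothetical weight profiles (lambda_1, lambda_2, lambda_3); not from the paper.
PATROL_WEIGHTS = (0.5, 0.3, 0.2)    # emphasizes task progress
MAPPING_WEIGHTS = (0.2, 0.3, 0.5)   # emphasizes error compensation / accuracy


def feedback(p, r, e, weights):
    """Weighted feedback F_fb = l1*P + l2*R + l3*E, with weights summing to 1."""
    lam1, lam2, lam3 = weights
    if abs(lam1 + lam2 + lam3 - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return lam1 * p + lam2 * r + lam3 * e
```

With P = 0.8, R = 0.6, and E = 0.9, the patrol profile gives 0.76 while the mapping profile gives 0.79, so the same system state is scored differently depending on the mission emphasis.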

During system operation, UAV and USV platforms first report task execution status, energy consumption, and data quality in real time; subsequently, edge nodes and the task scheduler determine whether current task execution deviates from the expected trajectory based on the trend of the feedback function. To account for environmental uncertainties, the simulation explicitly incorporates the following numerical ranges: wind speed 0–8 m/s, ocean current speed 0–1.5 m/s, and packet loss ratio 0–8%. These ranges cover typical calm to moderate conditions, ensuring realistic assessment of system performance.

When the feedback function falls below a predefined threshold, the system triggers dynamic adjustments, including task reassignment, computing resource rescheduling, and communication strategy optimization. For example, if UAV battery levels drop rapidly, the scheduler transfers part of the tasks to USVs or edge nodes to maintain overall mission progress; if communication link quality deteriorates, the system reduces the reporting frequency of low-priority tasks to guarantee stable transmission of critical data.
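The trigger logic described above can be sketched as a simple rule set. The action names and the numeric limits on battery drop rate and link quality are hypothetical placeholders; only the overall structure (threshold check followed by condition-specific adjustments) follows the text.

```python
def select_adjustments(f_fb, threshold, battery_drop_rate, link_quality,
                       max_drop_rate=0.02, min_link_quality=0.6):
    """Return the list of adjustments to trigger for the current feedback value.

    f_fb:              current value of the feedback function.
    battery_drop_rate: fraction of battery lost per minute (assumed unit).
    link_quality:      normalized link quality in [0, 1] (assumed scale).
    """
    if f_fb >= threshold:
        return []  # performance is acceptable; no adjustment needed
    actions = []
    if battery_drop_rate > max_drop_rate:
        actions.append("migrate_uav_tasks_to_usv_or_edge")
    if link_quality < min_link_quality:
        actions.append("reduce_low_priority_reporting")
    if not actions:
        actions.append("reschedule_tasks")  # generic fallback
    return actions
```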

In addition, this feedback mechanism is closely coupled with the anomaly detection and self-recovery module. When a node failure or communication interruption is detected, the anomaly detection module immediately generates an alarm signal as feedback, which is input to the scheduler to trigger task resource reallocation, thereby achieving cross-layer dynamic adjustment.

The feedback-based dynamic adjustment mechanism, by constructing a multi-dimensional performance function, achieves a comprehensive balance among task progress, resource efficiency, and error control, and can dynamically correct task execution strategies according to environmental changes and anomalies. This mechanism provides UAV–USV collaborative systems with adaptive and evolutionary capabilities, enabling the system to maintain long-term efficiency and reliability in complex and dynamic marine missions.

3. Experiments and Results

3.1. Experimental Setup

To systematically validate the effectiveness and superiority of the proposed collaborative control method for unmanned systems, a typical marine operational environment was constructed on a virtual simulation platform, and comparative experiments were conducted under different task scenarios. The experimental design includes environment construction, hardware configuration, task design, comparison methods, and evaluation metrics.

(1): Experimental Environment Construction

The experiments were conducted in the marine environment of the Pearl River estuary, including a maritime patrol scenario (sea area approximately 15 km × 15 km, containing multiple monitoring target points and dynamic interfering vessels) and a coastal sediment mapping scenario (coastline length approximately 12 km, including shoals and sediment accumulation areas). Environmental uncertainties such as wind speed, ocean currents, and communication interference were incorporated to ensure realism and experimental complexity.

(2): Hardware Configuration and Platform Modeling

The experimental platform consisted of six UAVs and two USVs. UAVs were equipped with RGB cameras (4K resolution, sampling frequency 10 Hz), lightweight LiDARs (16 lines, detection radius 100 m), and inertial navigation systems (INS); USVs were equipped with multibeam sonar (0.1 m resolution), acoustic Doppler current profilers (ADCP), and BeiDou high-precision positioning systems. Edge computing nodes were deployed at shore-based stations, equipped with Intel Xeon 32-core CPUs and 256 GB memory to perform task scheduling, data fusion, and anomaly detection. Each UAV/USV communicated with the edge nodes via 5G wireless networks, with some links affected by bandwidth fluctuations and random packet loss.

(3): Task Design and Trajectories

The experimental tasks were divided into two types:

(a) Maritime Patrol Tasks: UAV–USV collaboration to perform target point coverage, requiring the shortest possible completion time for the sea area patrol;

(b) Coastal Sediment Mapping Tasks: UAVs collected high-resolution images of the coastline, while USVs collected underwater topography and sediment distribution data, ultimately completing multi-modal data fusion and 3D modeling. Task scenarios included sudden disturbances, such as UAV low battery or USV communication link interruption, to verify system robustness and self-recovery capability.

The UAVs and USVs are assigned complementary tasks to ensure full coverage while maintaining operational efficiency.

UAV Trajectories: The UAVs follow pre-planned aerial patrol paths defined by a series of waypoints distributed across the high-priority subregions. Each UAV performs a zigzag coverage pattern at a fixed altitude of 50 m, ensuring complete observation of the assigned area. The path spacing is designed to minimize overlap while guaranteeing full sensor coverage. UAV speed is maintained at 10 m/s, and the trajectory is dynamically adjusted in response to environmental changes or task rescheduling.

USV Trajectories: The USVs navigate along maritime routes that complement the UAV coverage, performing environmental sensing and data collection along the water surface. Each USV follows a series of connected waypoints forming smooth trajectories that avoid collisions and maintain safe distances from other platforms. USV speeds are set at 3 m/s to balance energy consumption and coverage requirements.

Coordination and Coverage Strategy: UAV and USV trajectories are coordinated to minimize redundant coverage. High-priority areas are assigned first, followed by low-priority regions to ensure efficient use of resources. The trajectory plans take into account battery constraints, sensor ranges, and task priorities, providing a complete and reproducible description of the mission scenario.

This textual description provides a clear understanding of the spatial layout, platform movements, and coverage strategies, enabling reproducibility of the experiments without requiring a graphical map.
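The zigzag coverage pattern described for the UAVs can be generated as in the following sketch. The rectangular subregion and the pass spacing are assumptions, since the text does not give the spacing or the subregion geometry; only the 50 m altitude and the 10 m/s speed come from the description above.

```python
import math


def zigzag_waypoints(x_min, x_max, y_min, y_max, spacing, altitude=50.0):
    """Boustrophedon (zigzag) waypoints covering a rectangle at fixed altitude.

    spacing is the distance between adjacent passes; in practice it would be
    derived from the sensor footprint (assumed here to be given).
    """
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    waypoints = []
    y = y_min
    left_to_right = True
    while y <= y_max + 1e-9:
        start, end = (x_min, x_max) if left_to_right else (x_max, x_min)
        waypoints.append((start, y, altitude))
        waypoints.append((end, y, altitude))
        left_to_right = not left_to_right  # reverse direction on the next pass
        y += spacing
    return waypoints


def path_length(waypoints):
    """Total length of the polyline through the waypoints, in the same units."""
    return sum(math.dist(a, b) for a, b in zip(waypoints, waypoints[1:]))
```

Dividing the path length by the 10 m/s cruise speed gives a first estimate of the flight time for a subregion, which the scheduler could compare against battery constraints.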

(4): Comparison Methods

To highlight the advantages of the proposed method, two typical comparison schemes were selected:

(a) Centralized Scheduling Method: A central controller uniformly plans task allocation, lacking adaptive adjustment capability;

(b) Single-Platform Autonomous Method: Each UAV and USV executes tasks independently, without cross-platform information exchange.

The proposed method integrates multi-objective optimized scheduling, edge computing, and multi-sensor fusion, with anomaly detection and feedback-based adjustment capabilities.

(5): Evaluation Metrics

To comprehensively assess method performance, five indicators were employed:

(a) Task Completion Time (TCT): Total time required to complete all tasks.

(b) Energy Utilization Efficiency (EUE): Ratio of the effective task energy to the total energy actually consumed, EUE = (Effective task energy/Total consumed energy) × 100%.

EUE quantifies the energy efficiency of the cooperative task execution by unmanned platforms. The theoretical optimal energy, which serves as the effective task energy, is estimated based on the shortest feasible trajectory without disturbances, assuming minimal speed and communication energy, providing a baseline for comparison. The actual energy is recorded during task execution, and the ratio indicates system efficiency.

(c) Communication Latency (CL): Average data transmission delay between UAV/USV and edge nodes during tasks.

(d) Data Fusion Accuracy (DFA): Mean squared error (MSE) between the multi-sensor fused result and the established ground-truth dataset, so that lower values indicate better fusion. DFA is evaluated under dynamic disturbances such as UAV low battery, sensor failure, or USV communication interruption. The ground truth for DFA is constructed using high-resolution simulation models and validated sensor data, generating task execution trajectories and expected system states. DFA is calculated by comparing the actual fused outputs to this ground truth.

(e) System Robustness (SR): Ability of the system to maintain task execution under node failure or communication interruption, measured by the task success rate. A run is deemed successful if all assigned target points are covered within the prescribed time and energy limits while meeting the required accuracy. SR is calculated as the number of successful task completions divided by the total number of repetitions.

To ensure the reproducibility of the evaluation metrics, this study provides a detailed description of the calculation methods for the key indicators EUE, SR, and DFA. The theoretical optimal energy for EUE is defined based on the minimum path length and idealized communication energy, assuming that the unmanned system executes tasks along the shortest possible paths while minimizing communication energy consumption, thus providing a baseline for assessing actual energy utilization efficiency. System robustness (SR) is determined by the following criterion: a task is considered successful if all target points are covered within the prescribed time and energy constraints while meeting the required accuracy; otherwise, it is considered a failure. For data fusion accuracy (DFA), the ground truth is established using high-resolution simulation models combined with validated sensor data, generating task execution trajectories and state information against which the fused results are evaluated under varying operational conditions.
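The three indicators can be computed as in the sketch below, which follows the definitions given in the metric list: EUE as effective over consumed energy, DFA as a mean squared error, and SR as a success proportion. The function names and input shapes are illustrative assumptions.

```python
def eue(effective_energy, consumed_energy):
    """Energy utilization efficiency in percent."""
    return 100.0 * effective_energy / consumed_energy


def dfa_mse(fused, truth):
    """Mean squared error between fused estimates and ground truth (lower is better)."""
    if len(fused) != len(truth):
        raise ValueError("fused and truth must have the same length")
    return sum((f - t) ** 2 for f, t in zip(fused, truth)) / len(fused)


def run_succeeded(all_targets_covered, time_used, energy_used, time_limit, energy_limit):
    """Success criterion for one run: full coverage within time and energy limits."""
    return all_targets_covered and time_used <= time_limit and energy_used <= energy_limit


def system_robustness(outcomes):
    """SR in percent: successful runs divided by total repetitions."""
    return 100.0 * sum(bool(o) for o in outcomes) / len(outcomes)
```

The accuracy requirement mentioned in the success criterion could be added to `run_succeeded` as a further condition once a numeric accuracy limit is fixed.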

With the above experimental setup, the proposed method can be comprehensively evaluated in a complex and dynamic marine task environment for its performance in efficiency, accuracy, and robustness.

(6): Environmental Model Settings

The environmental model employed in the simulations explicitly considers the effects of wind, ocean currents, and communication interference to closely replicate realistic marine conditions. Specifically, wind speed is modeled within the range of 0–8 m/s, covering calm to moderate wind scenarios; ocean current speed varies from 0 to 1.5 m/s, representing typical coastal and open-sea conditions; and the packet loss ratio is set between 0% and 8% to account for potential disruptions in wireless communication. These ranges were selected based on operational data and literature to ensure that the simulations reflect practical variability encountered in real-world missions. By incorporating these parameters, the system performance—particularly task completion efficiency, energy utilization, communication reliability, and robustness—can be comprehensively evaluated under dynamically changing environmental conditions.
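One way to draw a disturbance scenario from these ranges is sketched below. The ranges are those stated above; sampling each parameter uniformly and independently, and modeling packet loss as an independent per-packet Bernoulli trial, are assumptions, since the text does not describe the distributions used.

```python
import random

# Ranges stated in the text; key names are illustrative.
RANGES = {
    "wind_speed_mps": (0.0, 8.0),
    "current_speed_mps": (0.0, 1.5),
    "packet_loss_ratio": (0.0, 0.08),
}


def sample_environment(rng):
    """Draw one environment configuration uniformly within each range (assumed)."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()}


def packet_delivered(rng, loss_ratio):
    """Bernoulli model of a single packet: True if it is not dropped."""
    return rng.random() >= loss_ratio
```

Passing a seeded `random.Random` instance makes each repeated run reproducible.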

To complement Figure 2, Figure 3, Figure 4 and Figure 5, Table 1 summarizes the performance under varying environmental conditions (wind speed, current, packet loss), reporting key metrics including task progress completion rate (TPCR), resource utilization efficiency (RUE), and system error compensation metric (SEC). Each value represents the mean ± standard deviation over 10 repeated runs.

3.2. Case Study 1: Maritime Patrol

In the maritime patrol task, the primary objective of the system is to leverage the collaborative capabilities of UAVs and USVs to achieve efficient coverage of the designated sea area and rapid detection of anomalous targets. The experimental scenario is set in a sea area of approximately 15 km × 15 km, which contains several static monitoring points (e.g., buoys, navigation nodes) and a small number of dynamic targets (e.g., moving vessels). The experiment requires UAVs to perform aerial surveillance, primarily responsible for large-scale image acquisition and target recognition, while USVs are responsible for close-range surface patrol and anomaly verification. Through UAV–USV collaborative operation, a balance can be achieved between coverage and patrol accuracy.

During task execution, the proposed method decomposes and allocates UAV and USV tasks using a multi-objective optimization scheduling algorithm. UAVs are responsible for rapidly covering large sea areas, and their flight paths are dynamically adjusted by edge nodes based on real-time energy consumption and communication link quality. USVs primarily perform detailed patrols along channels and nearshore areas, with task allocation fully considering communication latency and energy consumption constraints. Additionally, when a UAV returns mid-mission due to low battery, the scheduler immediately triggers task migration, reallocating unfinished tasks to other UAVs or nearby USVs to prevent task interruption.

To evaluate the performance of the method, three different scheduling modes were compared: single-platform autonomous, centralized scheduling, and the proposed collaborative control method. The experimental results show that the single-platform autonomous method, due to the lack of cross-platform collaboration, results in excessively long overall task completion times and lower data fusion accuracy. The centralized scheduling method improves task allocation efficiency, but without a dynamic feedback adjustment mechanism, the system performs poorly under communication fluctuations and platform failures. In contrast, the proposed method demonstrates significant advantages in task completion time, energy utilization efficiency, communication latency, and system robustness.

Each task was tested five times, and the specific results are shown in Table 2. The average task completion time of the proposed method is 79.82 min, representing reductions of 33.78% and 22.07% compared with the single-platform autonomous and centralized scheduling methods, respectively. The energy utilization efficiency reaches 88.12%, significantly higher than the other methods; communication latency is reduced to 234.25 ms, ensuring efficient UAV–USV information exchange; data fusion accuracy (measured by mean squared error, MSE) improves to 0.03; and system robustness reaches 91.21%. These results indicate that the proposed collaborative control framework effectively enhances the overall performance of the maritime patrol task.
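As a quick consistency check, the baseline completion times implied by the reported reductions can be recovered from the 79.82 min average, assuming each percentage is measured relative to the corresponding baseline's mean TCT. These are derived values, not figures read from Table 2:

```latex
T_{\text{single}} \approx \frac{79.82}{1 - 0.3378} \approx 120.5 \ \text{min},
\qquad
T_{\text{central}} \approx \frac{79.82}{1 - 0.2207} \approx 102.4 \ \text{min}
```

Both implied values fall within the TCT ranges reported for the respective comparison methods (111–123 min and 94–105 min).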

Meanwhile, Figure 2 presents a comparison of the average completion time and communication latency over five tests for 20, 50, 100, 150, and 200 task points across the three methods. It can be observed that the proposed method significantly outperforms the comparison schemes in terms of efficiency and real-time performance. This indicates that the integration of edge computing, anomaly detection, and feedback adjustment mechanisms can substantially improve the performance of UAV–USV collaborative systems in dynamic tasks.

The figure presents the trends of average task completion time (TCT) and communication latency (CL) for the three scheduling methods (single-platform autonomous, centralized scheduling, and the proposed collaborative control method) for up to 200 task points. The left y-axis represents TCT (minutes), and the right y-axis represents CL (milliseconds).

First, regarding the task completion time (TCT) curves, the single-platform autonomous method consistently remains at the highest level, ranging between 111 and 123 min, with a slight upward trend as the number of task points increases. This indicates that as the number of tasks grows, the single-platform autonomous method suffers from low task allocation efficiency due to the lack of cross-platform collaboration, resulting in significantly longer overall completion times. The TCT curve for the centralized scheduling method ranges from 94 to 105 min, showing a noticeable reduction compared to the single-platform autonomous approach; however, the curve still exhibits slight fluctuations, primarily reflecting the centralized scheduler’s limited flexibility in response to communication variations and dynamic task disturbances. The proposed collaborative control method demonstrates the best performance across all task point ranges, with TCT between 72 and 82 min. The curve is stable, and minor fluctuations reflect the system’s capability to dynamically adjust tasks under simulated environmental disturbances (e.g., UAV low battery or communication interruption), ensuring continuous task completion.

Second, regarding the communication latency (CL) curves, the single-platform autonomous method exhibits the highest latency, ranging from 390 to 420 ms, indicating that independently operating UAVs and USVs lack efficient information exchange, resulting in low data transmission efficiency. The centralized scheduling method reduces CL to 301–335 ms, with some fluctuations, reflecting the limited responsiveness of a central controller under dynamic tasks and network disturbances. In contrast, the collaborative control method achieves the lowest CL, ranging from 212 to 240 ms, with a stable curve and slight jitter, demonstrating that edge computing and real-time feedback mechanisms effectively improve data transmission efficiency and maintain reliable communication under random network disturbances.

Overall, the collaborative control method significantly outperforms the comparison methods in both task completion efficiency and communication performance. Its advantages become more pronounced as the number of task points increases, showing excellent scalability in large-scale task environments. The minor random jitter in the curves reflects the uncertainty in the simulated experimental environment, making the figure more representative of practical applications and validating the system’s robustness and stability under dynamic conditions.

These results indicate that incorporating multi-objective optimized scheduling, edge computing, and dynamic feedback adjustment mechanisms not only reduces task completion time but also significantly improves communication latency, enabling efficient and stable UAV–USV collaboration in complex marine task environments. This outcome fully demonstrates the comprehensive advantages of the proposed collaborative control framework in efficiency, real-time performance, and robustness, providing a reliable reference for practical maritime patrol operations.

From the radar chart, it can be observed that the proposed UAV–USV collaborative control method outperforms the comparison methods across all five performance metrics. Regarding task completion time (TCT), the proposed method achieves the highest normalized score, indicating a significant reduction in task execution duration compared to both the single-platform autonomous and centralized scheduling approaches. In terms of energy utilization efficiency (EUE), the proposed method also demonstrates superior performance, showing that it can allocate and utilize energy resources more efficiently during task planning and scheduling.

For the communication latency (CL) metric, the proposed method’s curve is noticeably better than those of the other two methods, reflecting its ability to keep latency below 250 ms under uncertain conditions such as bandwidth fluctuations and random packet loss. Regarding data fusion accuracy (DFA), the proposed method achieves a significantly higher score, indicating that the multi-sensor fusion results are closer to the ground truth, which supports accurate environmental modeling and target recognition. For system robustness (SR), the proposed method also performs the best, maintaining task execution under node failures or communication interruptions, whereas the single-platform autonomous method performs the worst in this dimension and is easily affected by environmental disturbances.

Overall, the radar chart visually demonstrates that the proposed collaborative control method achieves balanced and comprehensive improvements across multiple dimensions, including efficiency, energy utilization, real-time performance, accuracy, and robustness, showing a clear comprehensive advantage over traditional methods.

3.3. Case Study 2: Coastal Sedimentation Mapping

Coastal sedimentation directly affects port navigability, shoreline stability, and flood safety; therefore, high-precision mapping and real-time monitoring of sedimentation are of great significance. This experiment aims to leverage the collaborative capabilities of UAVs and USVs to perform precise modeling and sediment distribution monitoring in a typical coastal environment. The experimental scenario is set along a 12.26 km shoreline, which includes shallow areas, sediment deposition zones, and complex coastal structures. UAVs are responsible for capturing high-resolution images and 3D point cloud data to construct a digital surface model of the shoreline, while USVs collect sonar and ADCP data to invert nearshore underwater topography and sediment concentration distribution. Through multi-source data fusion, a panoramic perception of the coastal sedimentation process can be achieved.

During task execution, the proposed method assigns high-precision modeling tasks to shore-based edge nodes via an edge scheduling mechanism, reducing the computational load on UAVs and USVs. Simultaneously, a multi-objective optimization scheduling algorithm coordinates the operation sequence of UAVs and USVs to shorten the overall task completion time. When a USV experiences a communication link interruption, the system quickly switches to a backup link using the middleware fault-tolerance mechanism and migrates unfinished tasks to nearby platforms, ensuring task continuity. Additionally, the fusion module employs a Weighted Kalman Filter (WKF) to dynamically integrate multi-modal observation data, enhancing the accuracy and stability of the modeling results.
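
As a rough illustration of how a weighted Kalman update can combine a UAV-derived and a USV-derived observation, the following minimal sketch fuses two scalar measurements by inverse-variance weighting and then applies a standard Kalman update. The static scalar state, the assumption that both sensors observe the same quantity, the noise variances, and all function names are illustrative assumptions; the WKF formulation used in the framework may differ.

```python
from dataclasses import dataclass


@dataclass
class ScalarState:
    mean: float  # current estimate of the observed quantity
    var: float   # variance of that estimate


def fuse_measurements(z_uav, r_uav, z_usv, r_usv):
    """Inverse-variance weighting: the less noisy sensor gets the larger weight."""
    w_uav = (1.0 / r_uav) / (1.0 / r_uav + 1.0 / r_usv)
    w_usv = 1.0 - w_uav
    z = w_uav * z_uav + w_usv * z_usv
    r = 1.0 / (1.0 / r_uav + 1.0 / r_usv)  # variance of the fused observation
    return z, r


def kalman_update(state, z, r):
    """Standard scalar Kalman measurement update."""
    gain = state.var / (state.var + r)
    return ScalarState(
        mean=state.mean + gain * (z - state.mean),
        var=(1.0 - gain) * state.var,
    )


def wkf_step(state, z_uav, r_uav, z_usv, r_usv, q=0.0):
    """One predict/update cycle for a static state with process noise q (assumed)."""
    predicted = ScalarState(state.mean, state.var + q)
    z, r = fuse_measurements(z_uav, r_uav, z_usv, r_usv)
    return kalman_update(predicted, z, r)
```

With a UAV variance of 4.0 and a USV variance of 1.0, the USV observation receives a weight of 0.8, so the fused value sits closer to the sonar reading.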

To validate the advantages of the proposed method, the experiment compares three approaches: single-platform autonomous, centralized scheduling, and the proposed collaborative framework. Each task was tested five times, and the results are shown in Table 3. The proposed method achieves an average task completion time of 89.74 min, representing reductions of 33.67% and 20.31% compared to the single-platform autonomous and centralized scheduling methods, respectively. Energy utilization efficiency reaches 86.97%, significantly higher than the comparison methods. Communication latency is 246.31 ms, markedly lower than the other two methods. Data fusion accuracy (MSE) reaches 0.03, improving by more than 15% over the comparison methods. System robustness reaches 90.12%, improving by 19.86% and 13.80% compared with single-platform autonomous and centralized scheduling, respectively.

Meanwhile, Figure 3 illustrates the comparison of average data fusion accuracy (DFA) and system robustness (SR) for the three methods under 20, 50, 100, 150, and 200 task points. It can be observed that as the task scale increases, both the single-platform autonomous and centralized scheduling methods exhibit a noticeable decline in accuracy and stability, whereas the proposed method maintains consistently low fusion error and high robustness. This indicates that the proposed collaborative framework not only demonstrates advantages in small-scale tasks but also possesses significant potential for reliability and accuracy improvement in complex, large-scale task scenarios.

Figure 4 presents the comparison results of data fusion accuracy (DFA) and system robustness (SR) for the three methods under different numbers of task points. All six curves in the figure are solid lines, distinguished by different colors for each method, and exhibit slight fluctuations within the task point range of 20–200 to reflect the real variability in experimental data.

Regarding data fusion accuracy, the proposed collaborative method (DFA-Proposed) shows the lowest curve overall, with an average MSE of 0.03 and relatively small fluctuations, indicating consistently high data fusion accuracy across different task scales. The centralized scheduling method (DFA-Centralized) lies in the middle, with an average MSE of 0.04, representing a 19.27% improvement over the single-platform autonomous method (DFA-Single, average MSE of 0.06). The DFA-Single curve increases as the number of task points grows, demonstrating a significant decline in accuracy for large-scale tasks when using a single-platform autonomous approach.

In terms of system robustness, the proposed method (SR-Proposed) achieves the highest curve, with an average value of 90.12% and minor fluctuations, indicating that the system can maintain a high task completion rate even under abnormal conditions such as node failures or communication interruptions. The centralized scheduling method (SR-Centralized) has an average value of 77.50% and shows a slight decrease with increasing task points, reflecting some performance variability under abnormal conditions. The single-platform autonomous method (SR-Single) has an average of 70.85% with a larger decline, further indicating that the lack of cross-platform collaboration significantly reduces system robustness.

Overall, all six curves maintain a regular trend as the number of task points increases, and the relative positions of the DFA and SR curves clearly demonstrate the differences among the methods in accuracy and robustness. Quantitative analysis shows that at 200 task points, the maximum DFA difference is 0.03 (Single vs. Proposed), and the maximum SR difference is 19.27% (Proposed vs. Single), indicating that the advantages of the proposed collaborative method become more pronounced in large-scale task environments. In summary, this curve analysis validates the effectiveness of the proposed method in achieving both high-precision data fusion and system robustness in maritime task scenarios, providing a reliable basis for multi-objective optimized scheduling in complex tasks.
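
When reading the robustness gaps, it helps to separate an absolute difference in percentage points from a relative change. The short sketch below applies both to the average SR values quoted above (90.12%, 77.50%, 70.85%). The 19.27 figure appears to correspond to the absolute difference between the proposed and single-platform averages. The helper names are illustrative only.

```python
def point_difference(new, ref):
    """Absolute difference between two percentages, in percentage points."""
    return new - ref


def relative_change(new, ref):
    """Change of `new` relative to `ref`, expressed in percent."""
    return (new - ref) / ref * 100.0


# Average SR values quoted in the text (percent).
sr_proposed, sr_centralized, sr_single = 90.12, 77.50, 70.85

gap_points = point_difference(sr_proposed, sr_single)   # about 19.27 points
gap_relative = relative_change(sr_proposed, sr_single)  # about 27.2 percent
```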

Figure 5 comprehensively illustrates the five key performance metrics for the three task scheduling methods in maritime patrol tasks: task completion time (TCT), energy utilization efficiency (EUE), communication latency (CL), data fusion accuracy (DFA), and system robustness (SR). To ensure comparability of different metrics within the same figure, all data were normalized, and slight perturbations were added to the curves to reflect the true variability of the experimental data.

From the figure, it can be observed that the proposed collaborative control method (Proposed) demonstrates clear advantages across all metrics. Its curve generally lies on the outer edge of the radar chart, indicating superior performance in terms of shorter task completion time, higher energy utilization efficiency, lower communication latency, higher data fusion accuracy, and stronger system robustness compared to the other methods. The centralized scheduling method (Centralized) lies in the middle, showing improvements in task efficiency and robustness compared with the single-platform autonomous method (Single Platform), but still fails to reach the overall performance level of the collaborative method. The curve of the single-platform autonomous method is located toward the inner area, particularly showing the poorest performance in task completion time and communication latency, highlighting the limitations due to the lack of cross-platform collaboration and dynamic scheduling capabilities.

Quantitative analysis shows that under normalized metrics, the Proposed method outperforms the Single Platform method by 33.78%, 19.84%, 42.91%, 40.00%, and 18.89% in TCT, EUE, CL, DFA, and SR, respectively. Compared with the Centralized method, the Proposed method achieves improvements of 22.07%, 18.26%, 27.92%, 25.00%, and 10.54% for the same metrics. This indicates that the integration of edge computing, anomaly detection, and multi-objective optimization scheduling mechanisms can significantly enhance the overall performance of UAV-USV collaborative systems in dynamic and complex maritime tasks. The radar chart intuitively illustrates the differences among the methods in efficiency, accuracy, and robustness, providing a reliable basis for evaluating multi-objective scheduling approaches in maritime patrol missions.

3.4. Comparative Analysis

3.4.1. Performance Metrics

To comprehensively evaluate the effectiveness of the proposed UAV-USV collaborative control framework, this section compares the overall performance of three methods (Single Platform Autonomous, Centralized Scheduling, and the Proposed Method) from both qualitative and quantitative perspectives.

From a qualitative perspective, the proposed method outperforms the comparison schemes in compatibility, real-time performance, robustness, and intelligence. Specifically, the system improves compatibility among multiple platforms through unified multi-source data modeling and time synchronization mechanisms; it enhances real-time performance by significantly reducing task planning and data processing latency via edge computing and dynamic resource scheduling; it increases robustness under fault scenarios through middleware communication redundancy and anomaly self-recovery mechanisms; and it demonstrates high intelligence by adaptively balancing task completion time, energy consumption, and accuracy through multi-objective optimization and feedback-based dynamic adjustment.
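
The trade-off among completion time, energy consumption, and accuracy rests on Pareto dominance, the comparison that NSGA-II uses to rank candidate schedules. The sketch below shows only that dominance test and a naive front extraction, not the full NSGA-II procedure. The objective tuples are hypothetical values chosen for illustration, with all three objectives minimized.

```python
from typing import List, Tuple

# (completion time, energy consumption, fusion error), all to be minimized.
Objectives = Tuple[float, float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """a dominates b if it is no worse in every objective and better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates: List[Objectives]) -> List[Objectives]:
    """Keep the candidates that no other candidate dominates (O(n^2) scan)."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Hypothetical candidate schedules, not results from the experiments.
candidates = [
    (90.0, 50.0, 0.03),
    (85.0, 60.0, 0.04),
    (95.0, 55.0, 0.05),   # dominated by the first candidate
    (100.0, 45.0, 0.03),
]
front = pareto_front(candidates)
```

Here three of the four candidates survive, since each of them is best in at least one objective relative to the others.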

From a quantitative perspective, the proposed method shows significant advantages across key performance indicators. Table 4 summarizes the average results of ten mixed-task trials for two typical tasks (maritime patrol and coastal sediment mapping). It can be observed that the proposed method outperforms the Single Platform Autonomous and Centralized Scheduling approaches in task completion time (TCT), energy utilization efficiency (EUE), communication latency (CL), data fusion accuracy (DFA), system robustness (SR), and computational efficiency (CE). In particular, the average task completion time of the proposed method is reduced to 84.75 min, representing a decrease of 33.71% and 21.18% compared with the Single Platform Autonomous and Centralized Scheduling methods, respectively; the energy utilization efficiency reaches 87.55%, significantly higher than the other methods; communication latency decreases to 241.07 ms, a reduction of 44.23% compared with Single Platform Autonomous; data fusion accuracy (MSE) improves to 0.03; system robustness increases to 90.65%; and computational efficiency is 1.82 times that of the baseline.

To provide a more intuitive comparison, Figure 5 presents a radar chart illustrating the performance of the three methods across six dimensions (task completion time, energy utilization efficiency, communication latency, data fusion accuracy, system robustness, and computational efficiency), with all data normalized. For TCT, CL, and DFA, the reciprocal values are taken before normalization to ensure that larger values represent better performance. It is clearly observed from the figure that the proposed method achieves a larger coverage area across all metrics, demonstrating a significant overall advantage.
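
The reciprocal-then-normalize step can be sketched as follows. The text states only that reciprocals are taken for lower-is-better metrics before normalization, so dividing by the best score is an assumption here, and other scalings such as min-max are equally plausible. The inputs are the average MSE and SR values quoted for the coastal mapping case.

```python
def normalize(values, lower_is_better=False):
    """Map raw metric values to (0, 1], where larger always means better.

    Lower-is-better metrics are inverted first; scores are then divided by
    the best score (assumed scaling; the exact choice is not stated).
    """
    scores = {k: (1.0 / v if lower_is_better else v) for k, v in values.items()}
    best = max(scores.values())
    return {k: s / best for k, s in scores.items()}


# Average values quoted in the text for the three methods.
mse = {"Proposed": 0.03, "Centralized": 0.04, "Single": 0.06}
sr = {"Proposed": 90.12, "Centralized": 77.50, "Single": 70.85}

dfa_axis = normalize(mse, lower_is_better=True)  # Proposed 1.0, Centralized 0.75, Single 0.5
sr_axis = normalize(sr)
```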

Figure 6 presents the comparison of the three methods in maritime patrol tasks (dashed lines) and coastal sediment mapping tasks (solid lines) across six performance dimensions. Overall, the proposed collaborative method demonstrates optimal performance in both task types, forming the largest radar chart coverage, indicating significant comprehensive advantages in task completion time, energy utilization efficiency, communication latency, data fusion accuracy, system robustness, and computational efficiency.

For maritime patrol tasks, the Single Platform Autonomous method (Patrol-Single) has the smallest radar chart, particularly lagging in task completion time (normalized value 0.25), communication latency (0.31), and system robustness (0.37), indicating that in dynamic marine environments, the lack of cross-platform collaboration and feedback-based adjustment leads to low efficiency and poor stability. The Centralized Scheduling method (Patrol-Centralized) achieves considerable improvement in task completion time (0.44) and communication latency (0.52), but its improvements in energy utilization efficiency (0.48) and system robustness (0.55) are limited, resulting in a medium overall coverage. In contrast, the proposed method (Patrol-Proposed) performs excellently across all metrics, with normalized values for task completion time, communication latency, and data fusion accuracy all above 0.80, and system robustness and computational efficiency above 0.85, demonstrating that the collaborative framework effectively enhances task efficiency and system stability while optimizing energy consumption.

For coastal sediment mapping tasks, the Single Platform Autonomous method (Coast-Single) also performs the worst, particularly in task completion time (0.20), communication latency (0.27), and data fusion accuracy (0.33). The Centralized Scheduling method (Coast-Centralized) shows significant improvement in task completion time (0.42) and communication latency (0.49) compared to Single Platform Autonomous, but there is still room for improvement in data fusion accuracy (0.48) and system robustness (0.55). The proposed method (Coast-Proposed) achieves the largest radar chart coverage, with normalized values for task completion time, communication latency, and data fusion accuracy maintained between 0.78 and 0.85, system robustness reaching 0.88, and energy utilization efficiency and computational efficiency remaining at high levels, indicating its capability to balance high-precision modeling, stability, and computational efficiency in complex coastal mapping tasks.

To assess the feasibility of real-time implementation, the computational costs of the proposed algorithms were systematically evaluated. The NSGA-II-based scheduling algorithm, which is the most computationally intensive module, requires approximately 1.25 s per iteration when executed on edge nodes. LSTM-based anomaly detection operates efficiently, requiring only 8 ms per data sequence, while the weighted Kalman filter for multi-sensor fusion consumes about 3 ms per fusion cycle. These measurements demonstrate that real-time execution is achievable if heavy computations are performed on edge servers, with UAV/USV onboard devices handling only lightweight operations such as sensor data acquisition and preliminary processing. By leveraging edge intelligence and modular task partitioning, the proposed system design minimizes onboard computational load, enabling practical deployment even in scenarios with multiple UAVs/USVs operating collaboratively. The analysis confirms that the framework balances computational efficiency with operational performance, supporting its suitability for real-time maritime missions.

In summary, the radar charts for both task types show a consistent pattern: Single Platform Autonomous performs the weakest, Centralized Scheduling performs moderately, and the proposed collaborative control method clearly outperforms the comparison methods across all metrics while maintaining high stability as task complexity increases. Quantitative comparisons show that for 200 task points, differences in task completion time and communication latency can exceed 0.50, and system robustness differences can reach 0.40, fully demonstrating the effectiveness of the proposed method in multi-objective optimization scheduling and cross-platform collaboration, providing a reliable reference for practical UAV-USV missions.

From a quantitative perspective, the proposed method demonstrates significant advantages across key performance indicators. Table 4 summarizes the average results over 10 experimental runs for two representative tasks (maritime patrol and coastal sedimentation mapping). As shown, the proposed method outperforms both the single-platform autonomous approach and centralized scheduling in terms of task completion time (TCT), energy utilization efficiency (EUE), communication latency (CL), data fusion accuracy (DFA), system robustness (SR), and computational efficiency (CE). Specifically, the average TCT of the proposed method is reduced to 84.75 min, representing reductions of 33.72% and 21.18% compared with the single-platform and centralized methods, respectively. The EUE is improved to 87.50%; CL is reduced to 240.00 ms; DFA (MSE) decreases to 0.03; SR improves to 90.65%; and CE reaches 1.80× relative to the baseline, highlighting the superiority of the proposed framework in efficiency, reliability, and scalability.

In addition to the overall performance metrics, this study further evaluates the effectiveness of the anomaly detection and self-recovery mechanism, focusing on anomaly recognition tasks. A test set containing 1000 samples, including 300 anomalies, was used for evaluation. Table 5 presents the confusion matrix results of different methods. It can be observed that the single-platform autonomous approach suffers from high false-positive and false-negative rates; the centralized scheduling method improves performance but still exhibits considerable false negatives under communication interference; in contrast, the proposed method, by integrating a sliding window statistical model with an LSTM-based predictor, achieves the highest anomaly recognition accuracy with the lowest false alarms.

As shown in Table 5, the proposed method achieves an F1-score of 0.89, which is significantly higher than that of the single-platform (0.71) and centralized scheduling (0.83) approaches. This result indicates that the proposed framework not only achieves more accurate anomaly detection but also leverages the self-recovery mechanism to rapidly reconfigure task allocation, thereby ensuring system stability under abnormal or fault-prone environments.
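
For reference, the F1-score follows from the confusion-matrix counts as the harmonic mean of precision and recall. The sketch below uses hypothetical counts for a 1000-sample test set with 300 anomalies. They are not the values reported in the confusion matrix and serve only to show the calculation.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts (illustration only): 300 anomalies, 700 normal samples.
tp, fn = 255, 45   # anomalies detected / missed
fp, tn = 30, 670   # false alarms / correctly ignored normal samples

precision, recall, f1 = precision_recall_f1(tp, fp, fn)
```

With these counts, recall is 0.85 and F1 is roughly 0.87; true negatives do not enter the F1 calculation.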

3.4.2. Ablation Study and Module Contribution Analysis

To further validate the effectiveness and necessity of the core modules in the proposed collaborative control framework, an ablation study was designed and conducted. In this study, key modules of the framework were sequentially removed or replaced, and their performance was compared with that of the complete framework, allowing for a quantitative evaluation of each module’s contribution to the overall system performance. The ablation study covered four core modules: Weighted Kalman Filter (WKF), multi-objective optimization scheduling algorithm (NSGA-II), LSTM-based anomaly detection, and middleware hybrid fault-tolerant mechanism.

In the experimental setup, when the WKF module was removed, a conventional Kalman filter was used instead to assess the role of the weighting mechanism in heterogeneous multi-sensor data fusion. When the NSGA-II module was removed, a greedy scheduling strategy was employed to verify the importance of multi-objective optimization in task decomposition and resource allocation. When the LSTM anomaly detection module was removed, a threshold-based statistical method was adopted to analyze the improvement provided by the deep learning model in anomaly detection. When the middleware hybrid mechanism was removed, only a single communication link (MQTT or UDP) was retained, allowing for a comparison of single-link versus hybrid-link communication robustness. Each configuration was tested on two representative task scenarios with 10 repeated trials, and average values were calculated to ensure result stability.
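
A threshold-based statistical detector of the kind used as the LSTM replacement could look like the sliding-window sketch below. The window length, the k-sigma rule, and the class name are assumptions for illustration; the exact statistic used in the ablation is not specified.

```python
from collections import deque
from statistics import mean, pstdev


class SlidingWindowDetector:
    """Flag a sample as anomalous if it lies more than k standard deviations
    from the mean of the most recent `window` normal samples."""

    def __init__(self, window=20, k=3.0):
        self.buffer = deque(maxlen=window)
        self.k = k

    def update(self, x):
        is_anomaly = False
        # Only judge samples once the window is full.
        if len(self.buffer) == self.buffer.maxlen:
            history = list(self.buffer)
            mu = mean(history)
            sigma = max(pstdev(history), 1e-9)  # guard against zero spread
            is_anomaly = abs(x - mu) > self.k * sigma
        # Anomalous samples are kept out of the window so they do not
        # distort the statistics used for later decisions.
        if not is_anomaly:
            self.buffer.append(x)
        return is_anomaly


detector = SlidingWindowDetector(window=5, k=3.0)
readings = [10.0, 10.1, 9.9, 10.0, 10.2, 10.1, 15.0]
flags = [detector.update(r) for r in readings]
```

A fixed rule like this reacts only to deviations from recent history, which is one plausible reason a learned predictor would perform better on gradual or context-dependent anomalies.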

Table 6 summarizes the results of the ablation experiments. It can be seen that the complete framework achieves the best performance across all metrics. The NSGA-II module contributes the most significantly; when replaced by greedy scheduling, the task completion time increased from 84.73 min to 97.53 min, and energy utilization efficiency decreased by over 10%, indicating that multi-objective optimization effectively balances efficiency and resource consumption at the global level, avoiding performance loss caused by local optima. The WKF module primarily enhances data accuracy and system robustness; its removal increased the data fusion error (DFA) from 0.031 to 0.041 and reduced system robustness by approximately 7%, demonstrating that the weighting mechanism is crucial for mitigating noise and uncertainty in heterogeneous sensor data fusion.
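
To make the greedy baseline concrete, the sketch below assigns each task, in arrival order, to the vehicle that is currently least loaded. This is one common greedy rule and an assumption here, as the exact greedy strategy used in the ablation is not described. It optimizes a single quantity (time) and ignores energy and sensing reliability, which is the kind of local decision that multi-objective scheduling is meant to avoid.

```python
import heapq


def greedy_assign(task_durations, n_vehicles):
    """Assign tasks in order to the least-loaded vehicle.

    Returns (assignment, makespan), where assignment maps a vehicle index
    to the list of task indices it executes.
    """
    heap = [(0.0, v) for v in range(n_vehicles)]  # (accumulated load, vehicle)
    heapq.heapify(heap)
    assignment = {v: [] for v in range(n_vehicles)}
    for task_id, duration in enumerate(task_durations):
        load, vehicle = heapq.heappop(heap)
        assignment[vehicle].append(task_id)
        heapq.heappush(heap, (load + duration, vehicle))
    makespan = max(load for load, _ in heap)
    return assignment, makespan


# Hypothetical task durations in minutes, two vehicles.
assignment, makespan = greedy_assign([5, 3, 8, 2, 4], n_vehicles=2)
```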

The LSTM-based anomaly detection module makes a significant contribution to system stability. Although its removal has a limited effect on task completion time and energy consumption, system robustness (SR) dropped to 82.53%, highlighting the advantages of deep learning models in dynamic anomaly recognition and prediction. The middleware hybrid fault-tolerant scheme plays a crucial role in communication performance. Upon removal, communication latency (CL) increased to 289.19 ms, and system robustness decreased by nearly 10%, indicating that multi-link redundancy and fault-tolerant mechanisms are essential for reliable operation of unmanned systems in remote tasks.
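
The multi-link redundancy idea can be sketched as an ordered failover over available links. The callables below are stand-ins for real transports such as an MQTT publisher or a UDP socket. The `LinkDown` exception and the function name are hypothetical and not part of any library or of the middleware itself.

```python
from typing import Callable, Sequence


class LinkDown(Exception):
    """Raised by a link when it cannot deliver the payload (hypothetical)."""


def send_with_failover(payload: bytes,
                       links: Sequence[Callable[[bytes], None]]) -> int:
    """Try each link in priority order; return the index of the link used."""
    for index, send in enumerate(links):
        try:
            send(payload)
            return index
        except LinkDown:
            continue  # fall through to the next (backup) link
    raise LinkDown("all links failed")


# Stand-in transports for illustration only.
delivered = []


def primary_link(payload: bytes) -> None:
    raise LinkDown("primary link interrupted")


def backup_link(payload: bytes) -> None:
    delivered.append(payload)


used = send_with_failover(b"task-status", [primary_link, backup_link])
```

With a single link, the first failure propagates directly to the caller, which mirrors the single-link ablation configuration.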

Overall, the results of the ablation study indicate that the proposed collaborative control framework is a tightly coupled system in which each core module plays an important role in different dimensions. NSGA-II and WKF primarily enhance efficiency and accuracy, whereas LSTM-based anomaly detection and the middleware hybrid scheme significantly improve robustness under uncertain environments. These findings not only validate the rationality of the framework design but also provide strong support for its practical application in complex marine environments.

Figure 7 shows the radar chart illustrating the contributions of each core module to the overall system performance. The solid lines represent the average performance of the complete framework and the ablation scenarios across five key metrics: Task Completion Time (TCT), Energy Utilization Efficiency (EUE), Communication Latency (CL), Data Fusion Accuracy (DFA), and System Robustness (SR).

It is evident from the chart that the complete framework consistently achieves superior performance across all metrics, represented by the outermost solid line covering the largest area. The absence of NSGA-II leads to a noticeable increase in TCT and a decrease in EUE, highlighting the module’s critical role in globally optimized task scheduling and resource allocation. The removal of WKF primarily affects DFA, indicating that the weighted Kalman filter is essential for accurate and robust sensor data fusion. The LSTM-based anomaly detection module significantly impacts SR, demonstrating its importance in handling dynamic anomalies and maintaining system reliability. Finally, removing the middleware hybrid scheme results in a substantial increase in CL, underlining the necessity of multi-link redundancy and fault-tolerant communication mechanisms for reliable operation in remote tasks.

Overall, the radar chart provides an intuitive visualization of each module’s specific contributions, complementing the quantitative results in Table 6. The chart confirms that the proposed collaborative control framework is tightly coupled, with each module enhancing particular dimensions of performance, collectively ensuring efficiency, accuracy, and robustness in complex operational environments.

3.4.3. Advantages and Limitations

The experimental results indicate that the proposed multi-objective automatic scheduling method demonstrates significant advantages in complex multi-task environments. Specifically, in scenarios with 50 to 200 task points, the method reduces the average task completion time (TCT) by 18.52% to 25.31% compared to traditional baseline methods, significantly improving task execution efficiency. Meanwhile, the total energy consumption (TEC) of UAVs is reduced by 12.74% to 20.38% on average. Through the USV collaborative replenishment strategy, the number of UAV return trips is decreased, resulting in single-task energy savings of 8.34% to 11.62% and noticeably enhancing energy utilization efficiency. The average waiting time (AWT) is maintained between 2.45 and 3.12 min across all task scales, decreasing by 15.21% to 22.37% relative to the baseline, demonstrating the scheduling strategy’s effectiveness in reducing delays and mitigating task backlog.

Moreover, the method exhibits notable performance in scheduling stability and task allocation balance. Across ten consecutive experiments, the task success rate remains above 98.11%, and the task allocation balance index (LBI) stays between 0.91 and 0.96, indicating effective UAV load balancing and avoidance of overload or idle states. The system employs a modular design, providing strong scalability; even when the task scale increases by 50%, scheduling efficiency is maintained above 90.45%. Multi-modal data fusion and dynamic optimization strategies enable efficient and stable resource scheduling under multiple constraints, including task priority, UAV endurance, and USV collaborative replenishment. These results indicate that the proposed method offers comprehensive advantages in improving efficiency, reducing energy consumption, and ensuring task stability.

Despite its strong performance, certain limitations remain under extreme conditions. During communication interruptions or unstable signals, if the interruption lasts longer than 15.00 s, scheduling delay increases by 12.34% to 18.27%, and the short-term task loss rate reaches 2.12%, affecting continuous task execution. In complex environments, such as wind and wave levels equal to or exceeding level 3 or routes with dense obstacles, single-task UAV energy consumption increases by 10.45% to 15.28%, and average task waiting time rises to 3.41–3.85 min, resulting in significantly higher scheduling pressure.

Furthermore, the system heavily relies on environmental perception data; under insufficient lighting or sensor malfunctions, path planning accuracy decreases by 5.13% to 8.46%, and task allocation balance drops to 0.87–0.90, causing partial UAV load imbalance. In large-scale heterogeneous platform collaboration scenarios, when the number of UAVs/USVs increases to three times the original, the algorithm’s computation delay rises by an average of 20.56%, and real-time performance declines, indicating that under high-density task points or prolonged continuous operations, computational efficiency and resource scheduling strategies still require optimization.

To evaluate the scalability of the proposed framework, simulations were conducted with varying numbers of UAVs/USVs (2, 4, 8, 12) and task points (50, 100, 200, 400). Key performance indicators, including task completion time (TCT, in minutes), communication latency (CL, in seconds), energy utilization efficiency (EUE, %), and system robustness (SR, %), were recorded for each configuration. The results indicate that TCT increases from 12.45 min (2 UAVs, 50 tasks) to 47.80 min (12 UAVs, 400 tasks), while CL grows from 0.12 s to 0.65 s. EUE remains high across all scenarios, declining only slightly from 88.32% to 85.47%, and SR stays between 85.60% and 92.15%, demonstrating that the framework effectively maintains efficiency and reliability even under larger operational scales.

Detailed trend analysis shows that TCT grows approximately linearly with the number of task points, while communication latency increases moderately with the number of UAVs/USVs due to additional coordination and data exchange requirements. Energy utilization efficiency decreases slightly as system scale increases, reflecting the overhead of longer coordinated trajectories and inter-vehicle communication. System robustness remains high because the dynamic scheduling algorithm redistributes tasks among vehicles and optimizes paths to mitigate potential delays or failures.

Qualitative assessment of computational load indicates that the algorithm maintains real-time feasibility, with scheduling computations completed within acceptable time frames even at the largest simulated scale. The adaptive task allocation and path optimization strategies effectively balance workload, prevent bottlenecks in high-density task scenarios, and minimize the risk of mission failure due to communication or energy constraints. Although extreme scales beyond the tested range may further challenge performance, the results confirm that the proposed framework can reliably manage medium-to-large scale collaborative maritime operations, providing a strong basis for future extension to real-world deployments.

In summary, the proposed method demonstrates excellent performance in routine and moderately complex environments, significantly improving task completion efficiency, energy optimization, and task allocation balance while ensuring high scheduling stability. However, under extreme communication conditions, complex environments, and large-scale collaborative scenarios, there remain areas for improvement. Future research directions include optimizing computational efficiency, enhancing robustness, and refining task allocation strategies to adapt to more challenging operational conditions.

4. Conclusions and Future Work

This study proposed a UAV-USV collaborative control framework based on edge computing, multi-objective optimization scheduling, and multi-modal data fusion, and systematically validated its performance through representative maritime patrol and coastal sedimentation mapping tasks. Experimental results demonstrate that the proposed method outperforms both single-platform autonomous and centralized scheduling approaches in terms of efficiency, accuracy, and robustness.

In the maritime patrol task, the average task completion time of the proposed method was 79.82 min, representing a reduction of 33.78% compared with the single-platform autonomous approach and 22.07% compared with the centralized scheduling method. The energy utilization efficiency reached 88.12%, significantly higher than 68.28% for the single-platform autonomous approach and 74.58% for centralized scheduling. The average communication latency was 234.25 ms, reduced by 43.00% relative to the single-platform autonomous method, ensuring efficient data exchange between UAVs and USVs. Data fusion accuracy, measured by mean squared error (MSE), decreased to 0.03, achieving 40.00% and 25.00% improvement over the single-platform and centralized methods, respectively. System robustness reached 91.21%, increasing by 18.89% and 10.54% compared to the single-platform and centralized methods, reflecting high reliability under node failures, communication interruptions, and unexpected disturbances.

For the coastal sedimentation mapping task, the proposed method similarly demonstrated significant advantages. The average task completion time was 89.74 min, reduced by 33.67% relative to the single-platform approach and 20.31% compared with centralized scheduling. Energy utilization efficiency reached 86.97%, considerably higher than the 65.75% and 72.42% of the single-platform and centralized methods, respectively. Communication latency was 246.31 ms, representing reductions of 45.41% and 27.19%, respectively. Data fusion accuracy (MSE) was 0.03, an improvement of more than 15.00% over the comparison methods, while system robustness reached 90.12%, improving by 19.27% and 13.58% over the single-platform and centralized approaches. These results indicate that the proposed framework maintains high efficiency and stability under multi-task, multi-platform, and multi-modal data fusion conditions.

Radar chart analysis shows that the single-platform autonomous method performs the worst across task completion time, communication latency, and system robustness, especially under large-scale tasks, indicating insufficient adaptability and stability due to lack of cross-platform coordination. The centralized scheduling approach improves task completion time and communication latency, but its lack of dynamic feedback and adaptive adjustment limits improvements in energy efficiency and data fusion accuracy. Among the six key performance metrics (task completion time, energy utilization efficiency, communication latency, data fusion accuracy, system robustness, and computational efficiency), the proposed method consistently demonstrates superior performance, forming the largest radar chart coverage area. Notably, system robustness and computational efficiency are particularly outstanding, reflecting the framework’s comprehensive effectiveness in improving task execution efficiency, reducing energy consumption, and enhancing system stability.
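The notion of radar chart coverage area can be made concrete with a short calculation. The sketch below is not the authors' plotting procedure: it assumes each metric is scaled relative to the best method (so the best score is 1), uses the five metrics reported in Table 4 (computational efficiency is not tabulated there), and computes the area of the resulting polygon on equally spaced axes. The function names are hypothetical.

```python
import math

# Comprehensive metrics from Table 4.
# Each entry: (single-platform, centralized, proposed, higher_is_better).
METRICS = {
    "TCT (min)": (127.85, 107.52, 84.75, False),
    "EUE (%)": (66.95, 73.45, 87.55, True),
    "CL (ms)": (432.24, 331.57, 241.07, False),
    "DFA (MSE)": (0.05, 0.04, 0.03, False),
    "SR (%)": (71.54, 78.56, 90.65, True),
}
METHODS = ("single", "centralized", "proposed")


def normalized_scores(method_index: int) -> list:
    """Scale every metric to (0, 1] relative to the best method (assumed scheme)."""
    scores = []
    for *values, higher_is_better in METRICS.values():
        best = max(values) if higher_is_better else min(values)
        value = values[method_index]
        scores.append(value / best if higher_is_better else best / value)
    return scores


def radar_area(radii: list) -> float:
    """Area of the polygon traced on a radar chart with equally spaced axes."""
    n = len(radii)
    wedge = math.sin(2 * math.pi / n)
    return 0.5 * wedge * sum(radii[i] * radii[(i + 1) % n] for i in range(n))


areas = {name: radar_area(normalized_scores(i)) for i, name in enumerate(METHODS)}
for name, area in areas.items():
    print(f"{name}: {area:.3f}")
```

Note that the polygon area depends on both the normalization and the order of the axes, so it is best read as a qualitative summary rather than a metric in its own right.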

The core advantages of the proposed method include multi-objective optimization scheduling, edge computing and resource coordination, multi-modal data fusion, and high robustness with self-recovery capability. These advantages not only improve task execution efficiency but also ensure system stability and reliability in complex and dynamic environments.
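The multi-objective scheduling named above relies on NSGA-II, which ranks candidate solutions by Pareto dominance. The sketch below illustrates only that dominance test and the extraction of the first non-dominated front for the three objectives listed in the abstract (completion time, energy consumption, sensing reliability). It is not the authors' scheduler: the `Schedule` type, the energy unit, and all candidate values are hypothetical.

```python
from typing import List, NamedTuple


class Schedule(NamedTuple):
    name: str
    completion_min: float  # minimize
    energy_wh: float       # minimize (unit assumed for illustration)
    reliability: float     # maximize


def dominates(a: Schedule, b: Schedule) -> bool:
    """True if `a` is no worse than `b` on all objectives and better on one."""
    no_worse = (
        a.completion_min <= b.completion_min
        and a.energy_wh <= b.energy_wh
        and a.reliability >= b.reliability
    )
    strictly_better = (
        a.completion_min < b.completion_min
        or a.energy_wh < b.energy_wh
        or a.reliability > b.reliability
    )
    return no_worse and strictly_better


def pareto_front(candidates: List[Schedule]) -> List[Schedule]:
    """Keep the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Hypothetical candidate allocations, for illustration only.
CANDIDATES = [
    Schedule("A", 80.0, 500.0, 0.90),
    Schedule("B", 95.0, 420.0, 0.88),
    Schedule("C", 85.0, 520.0, 0.89),  # worse than A on every objective
    Schedule("D", 110.0, 400.0, 0.93),
]

front = pareto_front(CANDIDATES)
print([s.name for s in front])
```

A full NSGA-II implementation would add repeated front sorting, crowding distance, and genetic operators on top of this dominance test; the final schedule is then chosen from the front according to mission priorities.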

While the proposed collaborative control method demonstrates significant improvements in task completion efficiency, energy utilization, and system robustness in simulated marine scenarios, several limitations should be acknowledged. First, the experiments were conducted entirely in simulation environments, which may not fully capture all real-world uncertainties. Second, the communication model was simplified, and effects such as variable bandwidth, interference patterns, and real-time network congestion were only partially represented. Third, maritime regulatory constraints, including COLREGs for vessel navigation, were not incorporated, potentially limiting applicability in operational deployments. Finally, measurements of computational load and actual energy consumption for UAVs and USVs were not collected, leaving gaps in understanding system efficiency under realistic operating conditions.

Future work will focus on addressing these limitations. Field experiments in real marine environments will be conducted to validate simulation results and assess system performance under realistic wind, current, and traffic conditions. More comprehensive communication models and regulatory compliance, including collision avoidance rules, will be integrated into the task scheduling framework. Additionally, energy consumption and computational load will be quantitatively measured to optimize platform allocation and algorithm efficiency. Finally, the integration of adaptive learning mechanisms will be explored to improve system resilience and autonomy under dynamic and unforeseen conditions.

Looking ahead, the proposed framework can be further extended. First, incorporating deep reinforcement learning or graph neural networks can enable adaptive scheduling optimization in large-scale, multi-task, and multi-platform environments. Second, cross-domain and cross-task integrated scheduling approaches can be explored to enhance applicability in complex operational scenarios. Additionally, combining high-precision environmental prediction models with scheduling can further enhance robustness and task efficiency. Finally, deploying large-scale UAV-USV collaborative operations in real maritime missions with real-time data updates and analysis will provide high-precision support for port management, coastal protection, and environmental monitoring.

A key limitation among these is that all evaluations were conducted in simulation, which may not fully capture real-world operational complexities. Planned validation therefore includes hardware-in-the-loop experiments and field trials in coastal environments under realistic conditions, including maritime clutter, variable weather, and hardware limitations, together with the development of regulatory-compliant control strategies and onboard monitoring of computational and energy metrics.

In summary, the proposed UAV-USV collaborative control framework reduced task completion time by up to 33.78% in simulated maritime operations (relative to the single-platform baseline in the patrol task) while balancing task efficiency, energy optimization, data accuracy, and system robustness. It provides a theoretical foundation and methodological reference for unmanned system applications in maritime patrol, environmental surveying, and intelligent monitoring tasks.

Author Contributions

Conceptualization, J.Y.; methodology, L.Z.; software, J.Y.; validation, J.Y.; data curation, L.Z. and B.P.; writing—original draft preparation, J.Y.; writing—review and editing, L.Z. and B.P.; visualization, B.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was financially supported by the GDNRC[2024]39.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. System Architecture for UAV-USV Cooperative Control.

Figure 2. Comparison of TCT and Communication Latency.

Figure 3. Comparative Analysis of Scheduling Methods across Multiple Performance Metrics.

Figure 4. Comparison of Data Fusion Accuracy and System Robustness among Different Methods.

Figure 5. Radar Chart of Method Performance in Maritime Patrol Task.

Figure 6. Radar Chart Performance of Three Methods Across Six Performance Dimensions.

Figure 7. Radar Charts Illustrating Module Contributions in Ablation Study.

Table 1. Quantitative performance comparison of UAV–USV collaborative tasks under varying environmental conditions.

Scenario	Wind (m/s)	Current (m/s)	Packet Loss (%)	TPCR (%)	RUE (%)	SEC (%)
Maritime Patrol—Calm	0–2	0–0.51	0–2	96.52 ± 1.21	91.33 ± 2.02	3.15 ± 0.51
Maritime Patrol—Moderate	3–5	0.5–1.02	2–5	94.23 ± 1.54	88.71 ± 2.32	4.02 ± 0.61
Sediment Mapping—Calm	0–2	0–0.52	0–2	97.13 ± 1.01	92.09 ± 1.82	2.83 ± 0.41
Sediment Mapping—Moderate	3–5	0.5–1.51	3–8	94.81 ± 1.33	89.52 ± 2.18	3.752 ± 0.5

Table 2. Comparison Results of Maritime Patrol Task.

Method	TCT (min)	EUE (%)	CL (ms)	DFA (MSE)	SR (%)
Single Platform Autonomous	120.53	68.28	410.51	0.05	72.32
Centralized Scheduling	102.42	74.58	325.35	0.04	80.67
Proposed Method (Framework)	79.82	88.12	234.25	0.03	91.21

Table 3. Comparison Results of Coastal Sedimentation Mapping Tasks.

Method	TCT (min)	EUE (%)	CL (ms)	DFA (MSE)	SR (%)
Single Platform Autonomous	135.28	65.75	450.75	0.05	70.85
Centralized Scheduling	112.62	72.42	338.24	0.04	76.54
Proposed Method (Framework)	89.74	86.97	246.31	0.03	90.12

Table 4. Comparative Results of Comprehensive Performance Metrics for Different Methods.

Method	TCT (min)	EUE (%)	CL (ms)	DFA (MSE)	SR (%)
Single Platform Autonomous	127.85	66.95	432.24	0.05	71.54
Centralized Scheduling	107.52	73.45	331.57	0.04	78.56
Proposed Method (Framework)	84.75	87.55	241.07	0.03	90.65

Table 5. Confusion Matrices of Anomaly Detection under Different Methods.

Method	TP	FP	TN	FN	Precision	Recall	F1-Score
Single-platform	195	60	640	105	0.76	0.65	0.71
Centralized	228	45	655	72	0.84	0.76	0.83
Proposed framework	267	28	672	33	0.91	0.87	0.89
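The derived columns of Table 5 follow from the four confusion-matrix counts. As a consistency check, the sketch below recomputes precision, recall, F1-score, and accuracy from the tabulated counts using the standard definitions; the function name `detection_metrics` is illustrative. Recomputed values need not match the tabulated columns in the second decimal (for example, the counts for the proposed framework give a recall of 267/300 = 0.89).

```python
# (TP, FP, TN, FN) copied from Table 5; each row sums to 1000 samples.
COUNTS = {
    "Single-platform": (195, 60, 640, 105),
    "Centralized": (228, 45, 655, 72),
    "Proposed framework": (267, 28, 672, 33),
}


def detection_metrics(tp: int, fp: int, tn: int, fn: int):
    """Return (precision, recall, f1, accuracy) from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return precision, recall, f1, accuracy


for method, counts in COUNTS.items():
    precision, recall, f1, accuracy = detection_metrics(*counts)
    print(
        f"{method}: precision={precision:.3f} recall={recall:.3f} "
        f"F1={f1:.3f} accuracy={accuracy:.3f}"
    )
```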

Table 6. Comparison of Ablation Study Results.

Experimental Setting	TCT (min)	EUE (%)	CL (ms)	DFA (MSE)	SR (%)
Complete Framework (Proposed Method)	84.73	87.53	240.17	0.031	90.63
Remove WKF → Standard KF	95.21	82.31	245.19	0.041	83.13
Remove NSGA-II → Greedy Scheduling	97.53	76.83	258.11	0.042	84.21
Remove LSTM → Threshold Detection	88.61	85.91	242.13	0.033	82.53
Remove Middleware Hybrid Scheme → Single Link	86.91	86.41	289.19	0.031	81.73
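One way to read Table 6 is to ask which module removal degrades each metric the most. The sketch below computes the degradation of every ablated configuration relative to the complete framework, as a percentage of the complete framework's value, and reports the most damaging removal per metric. The dictionary keys and the `degradation` helper are illustrative names, not the paper's notation.

```python
# Values copied from Table 6.
BASELINE = {"TCT": 84.73, "EUE": 87.53, "CL": 240.17, "DFA": 0.031, "SR": 90.63}
ABLATIONS = {
    "WKF -> standard KF":
        {"TCT": 95.21, "EUE": 82.31, "CL": 245.19, "DFA": 0.041, "SR": 83.13},
    "NSGA-II -> greedy scheduling":
        {"TCT": 97.53, "EUE": 76.83, "CL": 258.11, "DFA": 0.042, "SR": 84.21},
    "LSTM -> threshold detection":
        {"TCT": 88.61, "EUE": 85.91, "CL": 242.13, "DFA": 0.033, "SR": 82.53},
    "Hybrid middleware -> single link":
        {"TCT": 86.91, "EUE": 86.41, "CL": 289.19, "DFA": 0.031, "SR": 81.73},
}
HIGHER_IS_BETTER = {"EUE", "SR"}


def degradation(metric: str, ablated_value: float) -> float:
    """Relative loss versus the complete framework, in percent (positive = worse)."""
    base = BASELINE[metric]
    if metric in HIGHER_IS_BETTER:
        change = base - ablated_value
    else:
        change = ablated_value - base
    return 100.0 * change / base


# For each metric, find the ablation with the largest degradation.
worst = {
    metric: max(ABLATIONS, key=lambda name: degradation(metric, ABLATIONS[name][metric]))
    for metric in BASELINE
}
for metric, name in worst.items():
    loss = degradation(metric, ABLATIONS[name][metric])
    print(f"{metric}: largest degradation from '{name}' ({loss:.1f}%)")
```

On these values, replacing NSGA-II with greedy scheduling affects completion time, energy utilization, and fusion error the most, while reverting to a single communication link affects latency and robustness the most.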

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
