1. Introduction
With the rapid advancement of unmanned system technologies, the maritime cross-domain unmanned system-of-systems (MCUSoS), which is composed of heterogeneous platforms such as unmanned surface vehicles (USVs) and unmanned aerial vehicles (UAVs), has emerged as a significant operational paradigm for future maritime missions [1]. In collaborative detection tasks, the MCUSoS enables joint detection, persistent monitoring, and information fusion against both surface and underwater targets by coordinating multiple unmanned clusters across three domains, namely the airspace, the sea surface, and the underwater environment, thereby providing fundamental support for situational awareness and decision-making [2,3]. Collaborative detection capability is therefore one of the core indicators for measuring the overall performance of the MCUSoS, and its level directly determines whether the system can achieve full-domain perception and rapid response.
However, the maritime operating environment exhibits notable complexity and confrontational characteristics. On the one hand, UAVs and USVs differ fundamentally in sensor types, detection mechanisms, and maneuverability. Cross-domain coordination faces multiple difficulties due to constraints such as limited communication range, satellite relay latency, and intermittent link connectivity caused by sea state fluctuations and platform mobility [4,5,6,7,8]. On the other hand, the harsh maritime environment poses various failure threats to the system in practical operations [9]. Salt fog corrosion, high humidity, and persistent wave-induced vibrations accelerate the degradation of onboard electronic components and mechanical structures, leading to random equipment faults [10]. USVs serving as communication relay nodes are vulnerable to deliberate targeted attacks such as anti-ship strikes and electronic warfare jamming, which can trigger cascading disconnection of dependent UAV clusters. Regional threats including wide-area electromagnetic interference further degrade or even completely eliminate the collaborative detection capability of the system [11,12,13,14]. How to accurately evaluate the collaborative detection capability of the MCUSoS and effectively enhance system resilience to achieve capability recovery under external failure disturbances has become a critical issue that urgently needs to be addressed in current unmanned system research.
Extensive research has been conducted on unmanned system modeling and collaborative detection. At the system architecture level, multi-level organizational structures and cluster-based grouping modes have been widely adopted for unmanned system modeling. Giles et al. [15] proposed a hierarchical mission-decomposition architecture framework for swarm unmanned systems by integrating mission engineering with model-based systems engineering, which supports modular design and mission reuse across unmanned platforms. Liu et al. [16] developed a distributed adaptive fixed-time formation control method for UAV-USV heterogeneous multi-agent systems, which effectively addressed the dynamic uncertainties caused by external disturbances in leader-follower structures. In terms of collaborative detection coverage modeling, Luna et al. [17] presented a coverage path planning method for multi-UAV systems that employed a sensor-footprint-based area decomposition strategy to achieve cooperative coverage search, with the capability of in-flight dynamic re-planning. Bai et al. [18] established a detection coverage and resilience evaluation model for UAV swarms under communication-constrained scenarios using a complex network approach. Regarding communication constraint modeling, Lim et al. [19] derived closed-form approximations of communication reliability for UAV swarms based on a comprehensive signal-to-interference-and-noise ratio (SINR) model that incorporated shadowing, multipath fading, and line-of-sight probability, and provided a quantitative method for evaluating and maintaining intra-swarm link quality during deployment. Yan et al. [20] formulated the multi-hop relay selection problem in underwater acoustic sensor networks as a combinatorial multi-armed bandit learning model and developed an online cooperative reasoning mechanism to adapt relay strategies to dynamic network topologies. However, most existing modeling efforts have focused on a single operational domain or a single cluster type. A unified system modeling framework aimed at cross-domain cooperative scenarios spanning air, sea surface, and underwater remains largely lacking, making it difficult to comprehensively characterize the architectural features, formation-based detection modes, and multi-level communication coordination relationships of MCUSoS in maritime collaborative detection missions.
In terms of capability evaluation for unmanned systems, existing research can be broadly classified into two categories. The first category comprises evaluation methods based on performance analysis, such as the classic ADC model [21], fuzzy mathematics [22], and the system dynamics model [23]. These methods are capable of characterizing system-of-systems effectiveness at a macroscopic level, yet they struggle to reveal the internal cooperative organizational mechanisms of the system. The second category comprises capability evaluation methods based on complex network theory. Wang et al. [24] established a multi-layer network model for UAV swarm mission reliability oriented toward systematic and networked missions, and adopted vulnerability and connectivity as two indicators to evaluate mission reliability under both random and deliberate attacks. Li et al. [25] proposed a capability-oriented equipment contribution analysis method in temporal combat networks, which quantified the functional contribution of individual nodes to the overall combat capability. Chen et al. [26] developed a mission reliability evaluation method based on the effective operation loop, which associated the capability of reconfigurable unmanned weapon system-of-systems with the dynamic reconfiguration process of actual combat workflows. Feng et al. [27] established a phased mission reliability evaluation model and a UAV number optimization model for swarms on the basis of importance measure theory. In addition, regarding the modeling of physical detection performance, Yan et al. [28] constructed a formation control and collision avoidance model for multi-USV systems using virtual structure and artificial potential field methods, which provided a foundation for characterizing the formation coverage and detection performance of unmanned platforms. However, most of the aforementioned methods address the problem from a single perspective. They focus either on cooperative relationships at the network topology level or on coverage performance in the physical space, while a comprehensive evaluation indicator system that integrates cooperative organizational capability with dynamic detection coverage capability in a unified manner has not yet been established.
System-of-systems resilience has received increasing attention in recent years. Resilience is generally defined as the comprehensive ability of a system to absorb shocks, adapt to changes, and restore functionality after a disruption [29]. In the domain of unmanned systems, Kong et al. [30] proposed a resilience evaluation framework for UAV swarms based on performance curve analysis that incorporated external resource supplementation, and quantified the entire process from performance degradation to recovery following an attack. Zhang and Liu [31] constructed a dual-layer coupled network model for UAV swarms that accounted for both the communication layer and the structural layer, and investigated a resilience assessment method considering cascading failure with dynamic evolution. Regarding multi-stage resilience modeling, Tran et al. [32] introduced a quantitative assessment framework that characterized the temporal evolution of system performance and decomposed resilience into multiple stages, including resistance, adaptation, and recovery. This framework provided a new perspective for resilience analysis of unmanned systems. In terms of resilience enhancement strategies, Zhong et al. [33] proposed a kill chain optimization method to improve the resilience of unmanned combat system-of-systems, which enhanced robustness and recovery capability in adversarial environments through optimizing the operation loop structure. Sun et al. [34] investigated a cooperative topology reconfiguration method for unmanned weapon system-of-systems based on multi-swarm collaboration, which restored system functionality through dynamic reorganization of formation relationships. Li et al. [35] developed a soft resource optimization method for unmanned swarms driven by resilience and based on autonomous coordination, which achieved capability recovery through redundant node supplementation and task reassignment strategies. Nevertheless, existing resilience research still has several limitations. First, most studies adopt generic network connectivity or overall performance as the object of resilience evaluation, while resilience indicators specifically designed for the collaborative detection capability scenario remain scarce. Second, the modeling of failure modes is relatively simplistic and lacks a unified description that covers multiple typical threats. Third, resilience evaluation and recovery strategies are typically studied within separate frameworks, and a complete closed-loop methodology that spans capability evaluation, resilience assessment, and strategy enhancement has not yet been developed. In particular, how to incorporate internal dynamic reconfiguration and external resource supplementation jointly into a unified resilience enhancement framework under maritime cross-domain multi-cluster scenarios has not been adequately explored.
To address the aforementioned challenges, this study focuses on the collaborative detection mission of MCUSoS and systematically investigates the methods for collaborative detection capability evaluation and resilience enhancement. The main contributions are as follows:
- (i) A system-of-systems architecture of the MCUSoS is established, encompassing formation detection models and multi-level cooperative communication network models, which provides a unified structural foundation for capability evaluation and resilience analysis of heterogeneous cross-domain unmanned system-of-systems.
- (ii) A capability evaluation model is developed from the two perspectives of collaboration and detection. Collaborative capability indices from intra-cluster and inter-cluster dimensions are integrated with dynamic sea-surface and underwater detection coverage metrics to form the composite evaluation function, which enables the integrated quantification of cooperative organizational relationships and detection coverage performance under both steady-state and disturbed conditions.
- (iii) Three representative failure models are established to analyze the degradation mechanisms of collaborative detection capability under multiple failure modes. A multi-phase resilience evaluation model incorporating elastic, plastic, and fracture stages is proposed, with a comprehensive resilience metric constructed from the dimensions of performance margin, internal reconfiguration efficiency, and external resource support rate, offering a quantitative basis for comparing resilience across different failure scenarios.
- (iv) An integrated resilience enhancement strategy combining dynamic reconfiguration with external resource supplementation is designed. The proposed methods are validated through case studies under various failure scenarios. This strategy offers a practical approach to restoring system performance under diverse operational threats, bridging the gap between resilience evaluation and performance recovery of the MCUSoS.
The remainder of this paper is organized as follows. Section 2 establishes the mathematical framework of the MCUSoS for collaborative detection missions. Section 3 constructs the collaborative detection capability evaluation index system and formulates multiple failure models. Section 4 presents the multi-stage resilience evaluation model and resilience enhancement strategies. Section 5 validates the effectiveness of the proposed methods through simulation case studies. Section 6 concludes this paper.
4. Resilience Evaluation and Enhancement for the MCUSoS
The resilience evaluation and enhancement methods for the system under external failure disruptions are further investigated in this section, building upon the collaborative detection capability evaluation framework established in Section 3. An index transformation addressing the cumulative nature of detection coverage capability is first performed to establish a resilience evaluation model incorporating multi-stage response mechanisms. Resilience enhancement strategies considering dynamic reconfiguration and external supplementation are then proposed to achieve capability recovery after failure disruptions. A resilience enhancement simulation workflow and effectiveness evaluation method are finally constructed to provide an implementation framework for subsequent case verification.
4.1. Multi-Stage Resilience Evaluation Model
The performance response of the MCUSoS under external failure disruptions exhibits pronounced stage-wise characteristics. This subsection first constructs instantaneous performance indices suitable for resilience evaluation, then establishes a multi-stage resilience response model inspired by the stress–strain behavior in material mechanics, and finally develops a resilience metric system from three dimensions: performance margin, internal reconfiguration efficiency, and external resource support rate.
4.1.1. Performance Index for Resilience Evaluation
The detection coverage capability index defined in Section 3 is a time-integral-based cumulative quantity that increases monotonically with mission time and cannot directly reflect the instantaneous performance state of the system at a given moment. Resilience evaluation requires characterizing the complete process from normal performance to degradation and then to recovery, demanding a performance index that remains stable under undisturbed conditions, responds promptly when disruptions occur, and gradually recovers during the restoration phase. The cumulative detection coverage capability is therefore converted into an instantaneous detection efficiency index to meet this requirement.
The instantaneous sea-surface detection efficiency is defined as the effective detection coverage increment of the system over the mission area per unit time. Under steady-state operation, each platform executes detection tasks along predetermined trajectories, and the newly scanned area per unit time remains relatively constant. The number of effective detection platforms decreases when node or link failures occur, and the incremental coverage area per unit time declines accordingly. The instantaneous sea-surface detection efficiency is defined as
The discrete form adopted for practical computation, with a discrete time step, is:
where the set difference in the numerator denotes the newly added sea-surface detection area during one time step. The instantaneous underwater detection efficiency is similarly defined as
The instantaneous detection efficiencies are normalized by their steady-state values at the initial mission time to eliminate the differences in absolute detection efficiency across different mission scenarios.
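The discrete computation and normalization described above can be sketched in Python as follows. This is a minimal illustration, assuming the mission area is rasterized into equal-area grid cells; `Cell`, `instantaneous_efficiency`, `normalize`, `cell_area`, and `dt` are hypothetical names that do not come from the paper, and the paper's own notation may differ.

```python
from typing import List, Set, Tuple

Cell = Tuple[int, int]  # hypothetical index of an equal-area grid cell


def instantaneous_efficiency(footprints: List[Set[Cell]],
                             cell_area: float,
                             dt: float) -> List[float]:
    """Newly covered area per unit time at each step (discrete form)."""
    covered: Set[Cell] = set()      # cumulative coverage so far
    rates: List[float] = []
    for footprint in footprints:
        new_cells = footprint - covered   # set difference: newly added area
        rates.append(len(new_cells) * cell_area / dt)
        covered |= footprint
    return rates


def normalize(rates: List[float]) -> List[float]:
    """Divide by the value at the initial mission time (assumed steady state)."""
    baseline = rates[0]
    if baseline <= 0:
        raise ValueError("initial efficiency must be positive")
    return [r / baseline for r in rates]
```

With this form, a platform loss shows up as fewer newly covered cells per step, so the normalized value drops below 1 at that step.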
The normalized indices remain near 1 during normal operation, decrease below 1 after failure occurrence, and gradually recover during the restoration phase, thereby providing an intuitive representation of the dynamic performance evolution. The composite performance function of the system is constructed by combining the collaborative capability index defined in Section 3 with the normalized instantaneous detection efficiency:
where the detection term is the weighted normalized instantaneous detection efficiency, and its weights are selected according to the mission scenario, based on expert judgment and historical mission data. Under undisturbed steady-state operation, the composite performance stays at its steady-state level. After a failure disruption, it evolves dynamically, providing a quantifiable performance trajectory for resilience evaluation.
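As a rough illustration of how such a composite function could be evaluated, the sketch below forms a weighted sum of the two normalized detection efficiencies and multiplies it by the collaborative capability index. The multiplicative combination, the requirement that the weights sum to 1, the default weights, and all function and parameter names are assumptions made for this example; the paper's actual functional form is not reproduced here.

```python
def weighted_detection_efficiency(eta_surface: float,
                                  eta_underwater: float,
                                  w_surface: float = 0.5,
                                  w_underwater: float = 0.5) -> float:
    """Weighted normalized instantaneous detection efficiency (assumed convex sum)."""
    if abs(w_surface + w_underwater - 1.0) > 1e-9:
        raise ValueError("weights are assumed to sum to 1")
    return w_surface * eta_surface + w_underwater * eta_underwater


def composite_performance(collab_index: float,
                          eta_surface: float,
                          eta_underwater: float,
                          w_surface: float = 0.5,
                          w_underwater: float = 0.5) -> float:
    """Assumed product of collaborative capability and detection efficiency."""
    eta = weighted_detection_efficiency(eta_surface, eta_underwater,
                                        w_surface, w_underwater)
    return collab_index * eta
```

Under these assumptions, an undisturbed system with all three inputs equal to 1 yields a composite performance of 1.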
4.1.2. Resilience Response Phase Classification and Evolution Modeling
A multi-stage resilience response model for the MCUSoS is established by drawing an analogy to the stage-wise behavior of elastic deformation, plastic deformation, and fracture failure in material mechanics, as illustrated in Figure 4. Two critical performance thresholds are defined: the task baseline and the failure baseline, the latter lying below the former. The task baseline represents the minimum performance required to accomplish basic detection tasks, and the failure baseline represents the critical point at which system functionality is completely lost.
The resilience response of the system is classified into three phases based on these thresholds. The elastic phase corresponds to performance above the task baseline, where performance loss can be automatically recovered through internal redundancy without external support. The plastic phase corresponds to performance between the failure baseline and the task baseline, where performance loss is irreversible and recovery requires external resources or active reconfiguration; this is the primary interval where resilience enhancement strategies take effect. The failure phase corresponds to performance below the failure baseline, where core functionality is lost and the system is regarded as unrecoverable. At this stage, essential command-and-control connectivity and the minimum cooperative relationships required for mission execution are assumed to have collapsed, so that conventional internal reconfiguration and limited external supplementation can no longer restore effective system performance.
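The threshold-based classification can be expressed as a small Python helper. This is a sketch only: the names `Phase`, `classify_phase`, `p_task`, and `p_fail` are hypothetical, and treating a performance value exactly equal to a baseline as belonging to the higher phase is an assumption of this example.

```python
from enum import Enum


class Phase(Enum):
    ELASTIC = "elastic"
    PLASTIC = "plastic"
    FAILURE = "failure"


def classify_phase(p: float, p_task: float, p_fail: float) -> Phase:
    """Map a performance value to a resilience response phase."""
    if not p_fail < p_task:
        raise ValueError("failure baseline must lie below task baseline")
    if p >= p_task:      # at or above the task baseline (boundary is an assumption)
        return Phase.ELASTIC
    if p >= p_fail:      # between the two baselines
        return Phase.PLASTIC
    return Phase.FAILURE  # below the failure baseline
```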
As illustrated in Figure 4, the performance evolution of the system comprises four typical stages when an external disruption occurs and persists over a finite time interval. During the disruption absorption stage, the performance decreases from the initial level to the task baseline, and performance redundancy absorbs the impact of the disruption. During the rapid degradation stage, the system performance declines rapidly to its minimum value, and internal reconfiguration combined with external supplementation strategies must be employed to achieve resilience recovery. During the stable recovery stage, the system performance gradually recovers to the original mission level and subsequently approaches a new steady-state value. External resource supplementation commences only after a delay and continues until the performance recovers to the task baseline, accounting for the spatiotemporal delay inherent in external resource supply.
4.1.3. Resilience Metric System
Resilience metrics are constructed from three dimensions, namely performance margin, internal reconfiguration efficiency, and external resource support rate, and a composite resilience value is formed through weighted aggregation.
Performance margin measures the ability of the system to maintain core functionality during the plastic phase, reflecting the safety margin relative to the failure baseline during the performance degradation period. It is defined as the ratio of the actual performance integral to the ideal performance integral during the plastic phase:
A larger value indicates a greater distance from the failure baseline during the degradation period and stronger survivability.
Internal reconfiguration efficiency measures the ability of the system to achieve performance recovery through self-resource adjustment, reflecting the level of self-organization and self-adaptation. It is defined as the ratio of the performance recovery contributed by dynamic reconfiguration strategies to the total performance loss:
where the numerator is the performance recovery achieved through dynamic reconfiguration strategies. A larger value indicates stronger self-recovery capability and less dependence on external support.
External resource support rate measures the contribution of externally supplemented resources to system recovery, defined as the ratio of the performance recovery brought by supplementary nodes to the total performance loss:
where the numerator is the performance recovery brought by supplementary nodes, which is determined by the number of supplementary nodes and the average performance contribution per node.
The composite resilience value integrates the above three dimensions to provide an overall resilience evaluation:
where the three weight coefficients reflect the relative importance of survivability, self-recovery capability, and external support efficiency, respectively. They are determined for the present case study through the AHP-based procedure described in Appendix A.1. A larger value indicates stronger composite resilience of the system.
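The three metrics and their weighted aggregation can be sketched numerically as below. This is an illustrative reading of the verbal definitions, not the paper's formulas: the ideal performance is assumed to be a constant level over the plastic phase, integrals are approximated with the trapezoidal rule on uniformly sampled data, and every name (`performance_margin`, `recovery_share`, `composite_resilience`, `ideal_level`) and the example weights are hypothetical. The AHP-derived weights from the appendix are not reproduced.

```python
from typing import Sequence, Tuple


def trapezoid(values: Sequence[float], dt: float) -> float:
    """Trapezoidal integral of uniformly sampled values."""
    return sum((a + b) * 0.5 * dt for a, b in zip(values, values[1:]))


def performance_margin(actual: Sequence[float], ideal_level: float,
                       dt: float) -> float:
    """Actual over ideal performance integral during the plastic phase."""
    if len(actual) < 2 or ideal_level <= 0:
        raise ValueError("need at least two samples and a positive ideal level")
    ideal = ideal_level * dt * (len(actual) - 1)
    return trapezoid(actual, dt) / ideal


def recovery_share(recovered: float, total_loss: float) -> float:
    """Share of the total performance loss recovered by one mechanism."""
    if total_loss <= 0:
        raise ValueError("total performance loss must be positive")
    return recovered / total_loss


def composite_resilience(margin: float, internal: float, external: float,
                         weights: Tuple[float, float, float]) -> float:
    """Weighted aggregation of the three resilience dimensions."""
    w1, w2, w3 = weights
    return w1 * margin + w2 * internal + w3 * external
```

The same `recovery_share` helper serves both the internal reconfiguration efficiency (recovery from reconfiguration) and the external resource support rate (recovery from supplementary nodes), since both are defined as ratios to the total performance loss.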
4.2. Resilience Enhancement Strategies with Dynamic Reconfiguration and External Supplementation
Active recovery measures are required when the system enters the plastic phase after a failure disruption. Recovery measures are categorized into two types: dynamic reconfiguration strategies that achieve structural adjustment using existing resources, and resource supplementation strategies that introduce external supplementary nodes. The mathematical models and optimization methods for these two types of strategies are established in this subsection.
4.2.1. Dynamic Reconfiguration Strategies
Dynamic reconfiguration strategies maximize the utilization efficiency of remaining resources by adjusting the resource organization and spatial configuration without changing the total resource quantity, offering the advantages of rapid response and no external dependency. Three representative strategies are proposed.
The first strategy is disconnected platform reassignment. The UAVs of a cluster become disconnected platforms when the USV of that cluster fails, and together they form the set of disconnected UAVs. The candidate reassignment clusters for a disconnected UAV are those whose USVs remain operational and lie within communication range:
The optimal reassignment cluster is selected using a collaborative capability gain criterion:
where a weighting coefficient balances cooperation and discernibility. In the simulation, this coefficient is set to 0.5 by default.
The disconnected UAV must first maneuver to the communication coverage area of the nearest operational cluster if the candidate set is empty.
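The reassignment logic can be illustrated with the following Python sketch. The candidate filter follows the description above (operational USV within communication range); the scoring rule, a convex combination of a cooperation gain and a discernibility gain with a coefficient of 0.5, is an assumed stand-in for the paper's gain criterion, and `Cluster`, `coop_gain`, `discern_gain`, and the function names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Cluster:
    usv_pos: Point
    usv_alive: bool
    coop_gain: float      # hypothetical cooperation gain if the UAV joins
    discern_gain: float   # hypothetical discernibility gain if the UAV joins


def candidate_clusters(uav_pos: Point, clusters: Dict[str, Cluster],
                       comm_range: float) -> List[str]:
    """Clusters whose USV is operational and within communication range."""
    return [cid for cid, c in clusters.items()
            if c.usv_alive and math.dist(uav_pos, c.usv_pos) <= comm_range]


def choose_cluster(uav_pos: Point, clusters: Dict[str, Cluster],
                   comm_range: float, lam: float = 0.5) -> Optional[str]:
    """Pick the candidate with the largest assumed gain score, or None."""
    cands = candidate_clusters(uav_pos, clusters, comm_range)
    if not cands:
        return None
    return max(cands, key=lambda cid: lam * clusters[cid].coop_gain
               + (1.0 - lam) * clusters[cid].discern_gain)


def nearest_operational(uav_pos: Point,
                        clusters: Dict[str, Cluster]) -> Optional[str]:
    """Fallback target when no candidate is in range: nearest live cluster."""
    alive = [cid for cid, c in clusters.items() if c.usv_alive]
    if not alive:
        return None
    return min(alive, key=lambda cid: math.dist(uav_pos, clusters[cid].usv_pos))
```

When `choose_cluster` returns `None`, the UAV would head toward the cluster returned by `nearest_operational` and retry once it is in range.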
The second strategy is intra-cluster communication topology reconstruction. Partial link failures within a cluster may cause network fragmentation, and the affected clusters form the set of link-failure clusters. Communication links can be reestablished by adjusting UAV positions, and the position adjustment optimization problem is formulated as
The first constraint ensures that the adjusted communication graph is connected, and the second constraint ensures that each UAV remains within the formation radius.
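The two constraints can be checked for a candidate set of adjusted positions with the sketch below. It covers feasibility only and does not attempt the optimization itself. It assumes a link exists whenever two UAVs are within a common communication range and that the formation radius is measured from a given center point; `is_connected`, `is_feasible`, `comm_range`, and `formation_radius` are hypothetical names.

```python
import math
from collections import deque
from typing import List, Tuple

Point = Tuple[float, float]


def is_connected(positions: List[Point], comm_range: float) -> bool:
    """Breadth-first search over the distance-threshold communication graph."""
    n = len(positions)
    if n == 0:
        return True
    visited = {0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j in range(n):
            if j not in visited and \
                    math.dist(positions[i], positions[j]) <= comm_range:
                visited.add(j)
                queue.append(j)
    return len(visited) == n


def is_feasible(positions: List[Point], center: Point,
                formation_radius: float, comm_range: float) -> bool:
    """Both constraints: connected graph and all UAVs inside the formation."""
    inside = all(math.dist(p, center) <= formation_radius for p in positions)
    return inside and is_connected(positions, comm_range)
```

A position optimizer could call `is_feasible` to reject candidate adjustments before evaluating their cost.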
The third strategy is mission area reallocation of failed clusters. The mission area of a cluster is reallocated to neighboring operational clusters when that cluster fails entirely. With the failed cluster and its set of neighboring clusters identified, the optimization problem is:
The cost function jointly considers the maneuver distance, load increase, and capability margin, where the margin term uses the performance margin indicator defined in Equation (45).
4.2.2. Supplementary Node Optimization and Allocation
External supplementary nodes are introduced when dynamic reconfiguration alone is insufficient for adequate recovery. Given the supplementary node set and the node-to-cluster allocation decision variables, the optimization objective is to maximize the recovered performance:
A greedy algorithm is adopted for solving: the node-cluster pair with the largest marginal contribution is iteratively selected for allocation. USV supplementation should be prioritized when failed clusters exist and spare USVs are available, as it can simultaneously restore collaborative capability and underwater detection capability, yielding significantly higher recovery benefits than UAV supplementation.
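The greedy selection described above can be sketched as follows. The marginal contribution is supplied as a caller-defined function, because the paper's performance model is not reproduced here; `greedy_allocate`, `gain`, and the stopping rule (stop when no pair yields a positive gain) are assumptions of this example.

```python
from typing import Callable, Dict, List

# gain(node, cluster, current_allocation) -> marginal performance recovery
GainFn = Callable[[str, str, Dict[str, str]], float]


def greedy_allocate(nodes: List[str], clusters: List[str],
                    gain: GainFn) -> Dict[str, str]:
    """Repeatedly assign the node-cluster pair with the largest marginal gain."""
    allocation: Dict[str, str] = {}
    remaining = list(nodes)
    while remaining:
        best = max(
            ((gain(n, c, allocation), n, c)
             for n in remaining for c in clusters),
            key=lambda item: item[0],
            default=None,
        )
        if best is None or best[0] <= 0:
            break  # no pair improves performance any further
        _, node, cluster = best
        allocation[node] = cluster
        remaining.remove(node)
    return allocation
```

If the gain function assigns a larger marginal contribution to a spare USV sent to a failed cluster than to any UAV placement, the USV is selected first, which matches the prioritization stated above.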
4.3. Resilience Enhancement Simulation Workflow and Effectiveness Evaluation
A complete simulation framework comprising three phases of shock simulation, strategy execution, and effectiveness evaluation is constructed by integrating the preceding content to verify the effectiveness of the proposed resilience evaluation model and enhancement strategies.
The first phase is shock simulation. The system operates at its initial steady-state performance level at the beginning of the simulation. The failed node set and link set are determined according to the selected failure mode and its parameters at the disruption time. The system topology is updated after failure identification, the post-disruption performance is calculated, and failure types are identified, including USV-failed clusters, link-failed clusters, and the set of disconnected UAVs.
The second phase is strategy execution. Strategy selection and execution order are based on a joint consideration of recovery efficiency and resource dependency: dynamic reconfiguration strategies utilize existing resources with rapid response and should be executed first; external resource supplementation is activated when dynamic reconfiguration is insufficient. Disconnected platform reassignment has the most relaxed execution conditions and should be attempted first. Intra-cluster communication topology reconstruction addresses link failure situations through minor position adjustments to restore connectivity. Mission area reallocation of failed clusters involves multi-cluster coordination and is employed when the preceding two strategies cannot achieve sufficient recovery. Regarding external resource supplementation, USV supplementation yields significantly higher recovery benefits than UAV supplementation and should be prioritized when resources are limited.
The third phase is effectiveness evaluation. The key time instants of performance evolution and the corresponding performance values are recorded after strategy execution. The resilience metrics and recovery effectiveness indices are then calculated. The recovery rate measures the degree of performance recovery, and the recovery timeliness measures the recovery time. The complete simulation workflow of the above three phases is summarized in Algorithm 1.
| Algorithm 1. Resilience improvement simulation and evaluation algorithm. |
| Input: Initial topology ; failure mode and parameters; disturbance schedule ; supplement pool ; performance baseline; simulation horizon ; time step |
| Output: Resilience metrics ; recovery metrics ; performance trajectory |
| 1: Compute ; set , |
| 2: while do |
| 3: |
| 4: if disturbance is triggered at then |
| 5: Generate , according to |
| 6: Remove , from |
| 7: end if |
| 8: Update topology ; identify , , |
| 9: Compute ; update and record if |
| 10: if then |
| 11: Reassign each to optimal cluster via cooperative gain criterion |
| 12: for each : solve position optimization to restore |
| 13: Redistribute task areas of fully failed clusters to |
| 14: while and : deploy supplement nodes via greedy optimization |
| 15: end if |
| 16: Recompute on updated ; append to trajectory |
| 17: end while |
| 18: Extract from ; compute , , |
| 19: Compute , , |
| 20: Return , , |
The resilience performance of the MCUSoS under various failure disruptions can be systematically evaluated through the above simulation workflow, the effectiveness of the proposed strategies can be validated, and quantitative evidence can be provided for system resilience optimization design.
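The control flow of the three-phase workflow can be illustrated with a deliberately simplified Python sketch. It replaces the network model with a single scalar performance value: a shock subtracts a fixed loss, dynamic reconfiguration adds a fixed gain once, and each supplementary node adds a fixed gain until the task baseline is reached or the pool is exhausted. All of these simplifications, the trigger condition (performance between the two baselines), and every name and parameter are assumptions of this example rather than details taken from Algorithm 1.

```python
from typing import List, Tuple


def run_workflow(steps: int, p0: float, p_task: float, p_fail: float,
                 shock_step: int, shock_loss: float,
                 reconfig_gain: float, node_gain: float,
                 pool: int) -> Tuple[List[float], float, int]:
    """Return (performance trajectory, minimum performance, nodes deployed)."""
    p = p0
    p_min = p0
    used = 0
    reconfigured = False
    trajectory = [p0]
    for k in range(1, steps + 1):
        # Phase 1: shock simulation
        if k == shock_step:
            p = max(0.0, p - shock_loss)
        p_min = min(p_min, p)          # record the post-disruption minimum
        # Phase 2: strategy execution, only in the plastic phase
        if p_fail <= p < p_task:
            if not reconfigured:       # internal reconfiguration comes first
                p = min(p0, p + reconfig_gain)
                reconfigured = True
            while p < p_task and used < pool:   # then external supplementation
                p = min(p0, p + node_gain)
                used += 1
        trajectory.append(p)
    # Phase 3: the caller evaluates resilience metrics on the trajectory
    return trajectory, p_min, used
```

The returned trajectory and minimum can then be passed to whichever resilience and recovery metrics are being evaluated.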
6. Conclusions
The MCUSoS faces multiple challenges during collaborative detection missions, including the organizational complexity of coordinating heterogeneous platforms across domains, stringent communication constraints, and dynamic capability degradation caused by various failure disturbances. An integrated framework for collaborative detection capability evaluation and resilience enhancement is proposed in this paper to address these challenges. A system-of-systems architecture of the MCUSoS is established by incorporating formation detection modes and multi-level cooperative communication network models. Collaborative capability indices are then constructed from two dimensions of intra-cluster and inter-cluster collaborative capability, and combined with sea-surface and underwater dynamic detection coverage capability indices to form a composite collaborative detection capability evaluation model. Furthermore, three representative disturbance models are established and a multi-stage resilience evaluation mechanism is proposed to quantify MCUSoS resilience under three disturbance modes. A resilience enhancement strategy integrating dynamic reconfiguration with external resource supplementation is designed to recover MCUSoS performance in multi-disturbance environments. A simulation case study is conducted to validate the effectiveness of the proposed methods. The results demonstrate that the proposed collaborative detection capability evaluation model can accurately characterize the MCUSoS performance variations under disturbance scenarios. Among the three disturbance modes, the targeted attack mode produces the strongest degradation of system performance, followed by disintegration circle attacks, with random failure being the least severe. The resilience enhancement strategy achieves effective performance recovery under all three disturbance scenarios. 
Across different failure modes, lower disturbance intensity consistently corresponds to higher resilience values, as the system has more response time under low-intensity disturbances, enabling the resilience enhancement strategy to function more effectively. In practical maritime applications, the proposed framework can help designers of maritime unmanned systems quantitatively balance the cost of auxiliary node deployment against the resulting resilience improvement, thereby supporting resilient system design and resource allocation in engineering practice.
Future work will extend in the following directions: introducing dynamic adversarial game mechanisms to investigate adaptive resilience enhancement strategies under alternating attack-defense conditions; incorporating the effects of time-varying communication channel characteristics and energy constraints in the marine environment on collaborative detection capability; and extending the proposed methods to larger-scale cross-domain systems that include intelligent systems such as unmanned underwater vehicles, further verifying the scalability of the framework.