1. Introduction
With the increasing deployment of distributed intelligent systems across domains such as environmental monitoring, industrial logistics, and smart infrastructure, the challenges of cooperative execution and dynamic communication modeling in heterogeneous agent systems have attracted growing attention [1,2,3]. Beyond efficiency alone, these challenges are naturally framed through the lens of symmetry and symmetry breaking: on the geometric side, localization objectives based on the Fisher information matrix (FIM) are invariant to rigid Euclidean motions; on the algorithmic side, agents of the same type should satisfy role-level permutation symmetry (exchangeability in decision rules), whereas heterogeneity and mission constraints break this symmetry; and on the structural side, the communication graph may exhibit structural symmetry (e.g., repeated local motifs or near-automorphisms) that shapes information flow and diffusion efficiency. Recent advances in graph neural networks (GNNs) and multi-agent reinforcement learning (MARL) have enabled more efficient process coordination and decentralized decision-making. For instance, Ratnabala et al. proposed HIPPO-MAT [4] and MAGNNET [5], which utilize graph embeddings to enhance coordination efficiency across agents with diverse capabilities. Goeckner and Ma further developed GNN-enhanced MARL frameworks and assignment networks tailored to distributed workload assignment [6,7]. However, in much of this literature the symmetry structure remains implicit: fixed or hand-crafted communication graphs can mask when symmetry (e.g., regular, highly clustered layouts) helps rapid dissemination versus when controlled symmetry breaking is required for better observability and identifiability. Reliance on such graphs also limits scalability and flexibility in real-world systems, where topology evolves dynamically due to heterogeneity in sensing, mobility, or resource availability [8,9].
Accurate localization is foundational to coordinated behavior in heterogeneous agent systems, supporting processes such as collaborative perception, spatial process assignment, and multi-agent scheduling. From an efficiency perspective, accurate localization enables optimized trajectory planning, avoids redundant sensing, and reduces energy consumption; from a communication perspective, spatial awareness facilitates dynamic topology formation and relay selection, thereby lowering latency and packet loss; from a safety and robustness perspective, localization errors can trigger collisions, task conflicts, or communication isolation. In missions such as search-and-rescue, surveillance, and distributed mapping, reliable localization is fundamental to task allocation and situational awareness.
Yet, discrepancies in sensing modalities, communication limitations, and environmental uncertainties across agents challenge the ability of traditional techniques to maintain both high localization fidelity and coordination efficiency [10,11]. The FIM, a classical metric for quantifying the informativeness of observations, has been widely applied in cooperative localization under static or homogeneous network assumptions [12]. In applications ranging from wireless sensor arrays to environmental monitoring networks, FIM-based optimization has guided trajectory planning and sensor deployment to enhance estimation accuracy [13,14]. Geometrically, the FIM objective is invariant to rigid Euclidean motions of the global frame, but strictly symmetric sensor or agent layouts can yield poor observability; thus, practical designs often require controlled departures from symmetry to avoid degenerate configurations while preserving desirable invariances. While FIM maximization is theoretically aligned with uncertainty reduction, most existing approaches operate under fixed topologies or assume agent homogeneity, limiting their practicality in dynamic, heterogeneous systems where agents must make real-time decisions under partial observability [15]. In contrast, recent developments in Heterogeneous-Agent Reinforcement Learning (HARL) have shown promise in learning decentralized coordination strategies for agents with varying capabilities and perspectives [16]. However, few HARL frameworks have incorporated FIM-based objectives into the reward design to actively guide agents toward localization-aware behaviors [8]. This reveals a critical gap: how to embed FIM maximization into the reinforcement learning process to jointly optimize localization accuracy and activity scheduling within adaptive, information-constrained agent networks.
In heterogeneous intelligent agent networks, communication topology and cooperative localization accuracy jointly determine system safety and efficiency under complex, dynamic conditions. Random network models provide a versatile framework for capturing the diversity and uncertainty intrinsic to evolving communication structures while making explicit the role of structural symmetry. Notably, the average path length quantifies the efficiency of information dissemination across the network, the clustering coefficient reflects the capacity for local collaboration among agents [17], and the betweenness centrality measures the importance of individual nodes in facilitating network-wide connectivity [18]. Complementing these, symmetry-oriented proxies (such as the size of the graph automorphism group and the multiplicities of Laplacian/adjacency eigenvalues) characterize repeated motifs and regularities that affect diffusion, redundancy, and vulnerability. These structural properties have been shown to significantly influence message propagation and collective observation in multi-agent systems [19]; in particular, highly symmetric topologies can accelerate dissemination but also induce indistinguishable measurement geometries that degrade observability, motivating controlled departures from symmetry when optimizing localization-aware coordination.
Meanwhile, cooperative localization accuracy is commonly assessed using FIM-based metrics, which characterize the uncertainty in state estimation. Differences in localization error under varying network topologies underscore the critical role of structural connectivity—and, by extension, structural symmetry—in information fusion and error dynamics. However, most prior studies analyze either network topology or localization performance in isolation, motivating a need for frameworks that integrate both aspects to provide a comprehensive evaluation of multi-agent coordination.
To address these challenges, this work introduces a node–edge representation grounded in random network theory to model the communication and collaboration topologies of heterogeneous agent networks. By leveraging a random network generation mechanism, the framework captures potential connectivity and cooperation probabilities among diverse agents, while also accommodating structural symmetry and asymmetry that arise in practice. Compared with fixed or static adjacency schemes, this approach better reflects evolving communication characteristics under complex conditions, providing a realistic and flexible structural foundation for learning-based cooperative activity scheduling [16,20].
Building on this foundation, we integrate FIM maximization into heterogeneous multi-agent learning by designing a hybrid reward that preserves geometric invariance, respects within-role permutation symmetry, and introduces controlled symmetry breaking via distance/FIM terms to avoid degenerate symmetric layouts. The resulting objective balances localization accuracy during concurrent process execution and improves scheduling efficiency and system resilience in collaborative scenarios.
To systematically evaluate the trade-off between communication efficiency and localization fidelity, we develop a unified assessment framework that combines structural metrics from random network theory (average path length, clustering coefficient, and betweenness centrality) with localization error measures and, when relevant, symmetry proxies such as spectral multiplicities. This framework identifies when symmetric topologies aid diffusion and when symmetry breaking improves observability, enabling dynamic balancing of communication costs and estimation accuracy for robustness and generalization across evolving multi-activity environments [21,22,23]. In practical deployments, continuous monitoring of topology and localization performance supports real-time score updates and timely communication reconfiguration or path replanning, ensuring adaptive responses to disruptions or environmental changes [24]. Distinct from recent coordination approaches based on GNNs and MARL [4,5], our method departs both in its modeling assumptions and in its optimization objective. Rather than relying on learned message-passing architectures or a fixed communication graph, we explicitly instantiate the inter-agent topology using random network theory, leveraging structural randomness to capture communication uncertainty and role heterogeneity. In addition, by introducing a Fisher Information Matrix (FIM)-maximization criterion, we ground the objective in estimation theory with clear physical interpretability, which differentiates our framework from conventional policy-gradient or multi-agent value-based methods that focus solely on reward optimization.
The main contributions of this work are as follows:
We propose a symmetry-informed node–edge representation grounded in random network theory to characterize evolving communication and collaboration topologies in heterogeneous agent networks, exposing structural symmetry and asymmetry via average path length, clustering, betweenness, and (when relevant) spectral-multiplicity proxies.
We integrate a geometric invariance-preserving FIM objective into heterogeneous multi-agent coordination, leveraging its invariance to rigid motions while discouraging symmetric but low-observability layouts through principled spacing and information terms.
We design a hybrid reward that respects within-role permutation symmetry and introduces controlled symmetry breaking via distance/FIM components, enabling localization-aware scheduling with improved efficiency and robustness.
We establish a unified evaluation that links network structure and information quality by combining random-network metrics (average path length, clustering coefficient, betweenness centrality) with localization error measures and symmetry proxies, quantifying when symmetry aids diffusion and when symmetry breaking improves observability and task performance.
2. Methodology
This section presents a unified optimization framework for cooperative decision making in heterogeneous intelligent agent networks, with its workflow illustrated in Figure 1. The framework is composed of three principal modules: (1) constructing a networked model of heterogeneous agent collaboration that reflects the complexities of real-world cooperative scenarios; (2) utilizing the inverse relationship between the determinant of the FIM and localization uncertainty to inform optimal path planning, such that relative agent positioning reduces estimation error and supports high-precision situational awareness; and (3) applying reinforcement learning (RL) to enable decentralized agents to autonomously adapt, coordinate, and cooperate under dynamic and uncertain conditions, thereby improving scheduling efficiency and system resilience.
In addition, the framework is symmetry-aware: the FIM objective preserves geometric invariance, policies maintain within-role permutation consistency, and we introduce controlled symmetry breaking to avoid degenerate but symmetric layouts that harm observability.
2.1. Network Model for Collaborative Processes
We model collaborative processes using a graph-theoretic network representation G = (V, E), where V is the set of nodes and E is the set of edges; each edge in E represents an active communication or coordination link between a pair of nodes in V. In the heterogeneous agent network, each autonomous entity is represented as a node, and collaborative interactions are encoded as edges. An edge exists whenever two nodes can directly share information or coordinate actions. To expose structural symmetry, we later summarize G using the average path length L, the clustering coefficient C, betweenness centrality, and (when useful) spectral multiplicities that capture repeated motifs or regularities.
Inspired by the OODA (Observe–Orient–Decide–Act) paradigm, activity entities are categorized into four functional classes: sensing nodes (e.g., environmental monitoring agents and spatial mapping agents), coordination nodes (C), processing nodes (support, transport, coordination), and target or goal nodes. This structure can be formalized as:
Within each role, operations are designed to be permutation-consistent so that relabeling same-type agents does not alter role-level behavior or decisions.
In collaborative scenarios, the agent network may include a small number of monitoring nodes and multiple spatial mapping nodes. These sensing agents jointly gather relevant environmental or situational data, which is relayed to the coordination node (C) for processing and assignment. The resulting process directives are then distributed within a specified time window to the processing nodes (support, transport, and coordination), which carry out their designated actions according to the centralized plan.
We assume the total number of nodes is N. According to the typical composition of a collaborative agent network, each node type (process allocation (C), ground sensing, spatial sensing, support execution, transport execution, and coordination execution) accounts for its own proportion of the N nodes, as illustrated in Figure 2.
Link formation is governed by role-pair connection probabilities. A first group gives the likelihoods of establishing connections between the coordination node C and each of the other node types. A further probability characterizes the collaboration capability between ground sensing and support processing nodes. Similarly, three probabilities describe the interconnections among processing nodes, capturing their cooperative execution and coordination potential, and one specifies the likelihood of collaboration between spatial sensing and coordination nodes. To account for finite resource constraints, the degree of each node is bounded by a degree cap. Role-differentiated connection probabilities and this degree cap serve as controlled symmetry-breaking mechanisms that prevent overly regular (symmetric) topologies when they hinder observability or load balancing.
Using this stochastic connection rule, we construct the adjacency matrix A, which defines the network topology for collaborative processes. A representative schematic of the resulting agent network is illustrated in Figure 3. The inter-type connection probabilities (e.g., those between C and the other node types) were selected based on iterative simulation calibration to balance connectivity, load, and robustness under the target operating conditions.
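The stochastic connection rule with a degree cap can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the role labels, the probability values, the cap `k_max`, and the function name `sample_role_graph` are all hypothetical, and candidate pairs are visited in random order so that the cap does not systematically favor any node.

```python
import random

def sample_role_graph(roles, link_prob, k_max, seed=None):
    """Sample a symmetric 0/1 adjacency matrix for nodes labeled by role.

    roles:     list of role labels, one per node (labels are hypothetical).
    link_prob: dict mapping an unordered role pair (frozenset) to a probability;
               role pairs that are absent get probability 0.
    k_max:     degree cap; a candidate edge is skipped if either endpoint is full.
    """
    rng = random.Random(seed)
    n = len(roles)
    adj = [[0] * n for _ in range(n)]
    degree = [0] * n
    # Visit candidate pairs in random order so the cap does not favor low indices.
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    rng.shuffle(pairs)
    for i, j in pairs:
        p = link_prob.get(frozenset((roles[i], roles[j])), 0.0)
        if degree[i] < k_max and degree[j] < k_max and rng.random() < p:
            adj[i][j] = adj[j][i] = 1
            degree[i] += 1
            degree[j] += 1
    return adj

# Illustrative composition: one coordinator, four sensing nodes, five processing nodes.
roles = ["C"] + ["sense"] * 4 + ["proc"] * 5
link_prob = {
    frozenset(("C", "sense")): 0.9,
    frozenset(("C", "proc")): 0.8,
    frozenset(("sense", "proc")): 0.3,
    frozenset(("proc",)): 0.4,  # links between two processing nodes
}
adj = sample_role_graph(roles, link_prob, k_max=6, seed=0)
```

Because sensing-to-sensing links are left out of `link_prob` here, the sampled graph never connects two sensing nodes directly, which mirrors how role-differentiated probabilities shape the block structure of A.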
The stochastic adjacency mechanism in Equation (2) can be viewed as a role-aware extension of the classical Erdős–Rényi random graph and the stochastic block model (SBM). Concretely, the heterogeneous link probabilities correspond to inter-block connection probabilities in a block-structured adjacency matrix, where each block encodes the communication preference between a pair of node categories. This modeling choice preserves analytical tractability while capturing real-world asymmetries in sensing, coordination, and actuation. From a theoretical perspective, the family of heterogeneous random graphs satisfies a high-probability connectivity condition expressed in terms of the minimum nonzero connection probability across all role pairs. When this condition holds, a multi-role agent network remains connected with overwhelming probability despite random link fluctuations, providing a robust topological substrate for distributed cooperation.
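For orientation, the classical homogeneous result that conditions of this kind generalize can be written as follows. This is the textbook Erdős–Rényi threshold, given as background and not as the expression used in this work; the symbols $p_{\min}$ and $\varepsilon$ are notation assumed for this sketch.

```latex
% Classical connectivity threshold for the Erdős–Rényi graph G(N, p).
\[
  p \;\ge\; \frac{(1+\varepsilon)\,\ln N}{N}\ \text{ for some fixed } \varepsilon > 0
  \quad\Longrightarrow\quad
  \Pr\bigl[\,G(N,p)\ \text{is connected}\,\bigr] \;\longrightarrow\; 1
  \quad (N \to \infty).
\]
% If every pair of nodes is linked independently with probability at least p_min,
% a monotone coupling places a copy of G(N, p_min) inside the heterogeneous graph, so
\[
  \Pr\bigl[\,\text{heterogeneous graph is connected}\,\bigr]
  \;\ge\;
  \Pr\bigl[\,G(N,p_{\min})\ \text{is connected}\,\bigr].
\]
```

The second bound requires every role pair to have nonzero link probability and ignores the degree cap. When some role pairs never connect, or when the cap is binding, connectivity has to be argued through the roles that do interconnect.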
Moreover, the expected degree of node i is obtained by summing its connection probabilities over all other nodes, and closed-form approximations yield the expected average path length and clustering coefficient [17], allowing structural efficiency and local cohesiveness to be assessed analytically under communication uncertainty.
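As a point of reference, the standard approximations for a homogeneous random graph are sketched below. They are the usual Erdős–Rényi baselines and are assumptions of this sketch; the closed forms used in this work (cited to [17]) may differ, and neither expression accounts for the degree cap.

```latex
% Expected degree under independent links with pairwise probabilities p_{ij}.
\[
  \mathbb{E}[k_i] \;=\; \sum_{j \neq i} p_{ij}
\]
% Homogeneous baseline with mean degree <k> = p (N - 1):
\[
  L \;\approx\; \frac{\ln N}{\ln \langle k \rangle},
  \qquad
  C \;\approx\; \frac{\langle k \rangle}{N} \;\approx\; p .
\]
```

For example, N = 100 nodes with mean degree 10 give L ≈ ln 100 / ln 10 = 2 and C ≈ 0.1 under this baseline.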
Importantly, the random network interacts directly with FIM- and information-theoretic objectives. The system-level Fisher information can be expressed as a function of the random adjacency, in which each link contributes a term quantifying its share of the joint information gain. Taking the expectation over the random graph ensemble then characterizes the expected localization fidelity under stochastic connectivity. This establishes a theory-level bridge between topological randomness and estimation performance: larger mean degree and clustering generally increase the expected information, whereas excessive symmetry (e.g., uniform connection probabilities) can reduce the rank of the FIM and degrade observability. Accordingly, the proposed random-graph formulation offers a principled and physically interpretable approach to modeling communication uncertainty, ensuring robustness and scalability in heterogeneous multi-agent coordination.
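The expectation step can be illustrated numerically. The sketch below assumes that the system information is additive over active links and that links are independent Bernoulli variables (so the degree cap is ignored); the per-link term `link_info`, a rank-one matrix along a bearing, is a hypothetical stand-in for the link contribution described above. Under these assumptions, linearity of expectation gives the expected FIM as the probability-weighted sum of link terms.

```python
import math
import random

def link_info(theta, sigma=1.0):
    """Hypothetical per-link 2x2 information: (u u^T) / sigma^2 for unit direction u."""
    ux, uy = math.cos(theta), math.sin(theta)
    s = 1.0 / sigma ** 2
    return [[s * ux * ux, s * ux * uy], [s * ux * uy, s * uy * uy]]

def weighted_sum(mats, weights):
    """Return sum_k weights[k] * mats[k] for 2x2 matrices."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for m, w in zip(mats, weights):
        for r in range(2):
            for c in range(2):
                out[r][c] += w * m[r][c]
    return out

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Three candidate links with bearings 0, 60 and 120 degrees and link probabilities.
thetas = [0.0, math.pi / 3, 2 * math.pi / 3]
probs = [0.9, 0.5, 0.2]
infos = [link_info(t) for t in thetas]

# Analytic expectation: E[J] = sum_k p_k J_k (linearity of expectation).
expected = weighted_sum(infos, probs)

# Monte Carlo check: draw independent link indicators and average J(A).
rng = random.Random(0)
trials = 20000
acc = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(trials):
    active = [1.0 if rng.random() < p else 0.0 for p in probs]
    sample = weighted_sum(infos, active)
    for r in range(2):
        for c in range(2):
            acc[r][c] += sample[r][c] / trials

# Degenerate geometry: all links share one bearing, so the expected FIM has rank one.
degenerate = weighted_sum([link_info(0.0)] * 3, probs)
```

With spread bearings `det2(expected)` is strictly positive, while `det2(degenerate)` is zero: this is the rank loss that the text associates with overly regular configurations, shown here for a collinear geometry.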
2.2. FIM-Optimized Cooperative Localization in Heterogeneous Agent Networks
Inspired by cooperative localization strategies in distributed agent systems, we consider a scenario where ground-based agents perform positioning using an Ultra-Short Baseline (USBL) system or analogous ranging technology. Positions are defined for a coordination node C and for each support node, indexed by m.
As illustrated in Figure 4, the coordination agents use a USBL or analogous positioning system to localize support agents within the network. We assume a uniform sensor array on the coordination agent with array spacing d. The measurement model and its associated probability density function relate the target state vector to the phase-difference vector observed between receiving units; they depend on the propagation speed c (e.g., of sound or radio), the signal frequency f, zero-mean Gaussian white measurement noise, and the measurement noise covariance. Taking the second-order derivatives of the log-likelihood function yields the FIM for the system.
The FIM objective is invariant to rigid Euclidean motions of the global frame, but strictly symmetric formations can be stationary points of the objective yet poorly observable. To mitigate these degeneracies while preserving desirable invariances, we pair FIM maximization with a mild spacing penalty over selected agent pairs (Equation (9)), which discourages low-information symmetric layouts without altering the underlying rigid-motion invariance.
After analytical simplification, the determinant of the FIM can be expressed in closed form (Equation (10)) in terms of quantities that include the angular difference along the X-axis between agents. By maximizing this determinant, we can identify the optimal horizontal distance between agents, as illustrated in Figure 5.
Equations (9) and (10) formalize the link between spatial configuration design and maximization of the Fisher information. Equation (10) provides a closed-form expression for the determinant of the FIM; maximizing this determinant shrinks the volume of the Cramér–Rao uncertainty ellipsoid (the D-optimality criterion), and its maximizer yields the optimal inter-agent horizontal spacing. To avoid convergence to degenerate or perfectly symmetric layouts with poor observability, Equation (9) augments the objective with a mild regularizer that discourages such configurations. Thus, Equation (9) acts as a geometric constraint that improves the well-posedness of the FIM optimization in Equation (10): it preserves the information-maximization objective while providing stable search directions in parameter space, enabling convergence to the information-optimal spacing without symmetry-induced rank loss.
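The two geometric claims above (rigid-motion invariance of the determinant and rank loss in symmetric layouts) can be checked on a toy model. The sketch below swaps in a range-only measurement model with independent Gaussian noise, which is an assumption made for illustration and is not the USBL phase-difference model of this section; the function names and coordinates are hypothetical.

```python
import math

def range_fim(target, anchors, sigma=1.0):
    """2x2 FIM of a target position from independent Gaussian range measurements.

    J = (1 / sigma^2) * sum_i u_i u_i^T, where u_i is the unit vector from
    anchor i to the target. Range-only model assumed for illustration.
    """
    jxx = jxy = jyy = 0.0
    for ax, ay in anchors:
        dx, dy = target[0] - ax, target[1] - ay
        r = math.hypot(dx, dy)
        ux, uy = dx / r, dy / r
        jxx += ux * ux / sigma ** 2
        jxy += ux * uy / sigma ** 2
        jyy += uy * uy / sigma ** 2
    return [[jxx, jxy], [jxy, jyy]]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def rigid_motion(point, angle, shift):
    """Rotate a 2-D point by `angle` about the origin, then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    x, y = point
    return (c * x - s * y + shift[0], s * x + c * y + shift[1])

target = (0.0, 0.0)
spread = [(10.0, 0.0), (0.0, 10.0)]      # bearings 90 degrees apart
collinear = [(10.0, 0.0), (-10.0, 0.0)]  # mirror-symmetric, both on one line

det_spread = det2(range_fim(target, spread))
det_collinear = det2(range_fim(target, collinear))

# Apply the same rigid motion to the target and to every anchor.
angle, shift = 0.7, (3.0, -2.0)
moved_target = rigid_motion(target, angle, shift)
moved_spread = [rigid_motion(a, angle, shift) for a in spread]
det_moved = det2(range_fim(moved_target, moved_spread))
```

Here `det_spread` equals 1 and is unchanged by the rigid motion (up to rounding), whereas the mirror-symmetric collinear layout gives a determinant of 0, so the position is unobservable in one direction. A spacing or information term that pushes agents away from such layouts restores full rank without affecting the invariance.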
2.3. Multi-Agent Collaboration via Reinforcement Learning
The proposed framework leverages RL to enable effective collaboration among multiple heterogeneous agents. Within each role, policies are designed to be permutation-consistent so that relabeling same-type agents leaves decisions unchanged. Rather than employing a standard Markov Decision Process (MDP) with conventional reward signals, we design a modified reward function that explicitly incorporates the pairwise distances between agents (Equation (11)).
This distance component acts as a controlled symmetry-breaking mechanism that, together with the FIM objective, discourages degenerate symmetric formations and improves localization-aware coordination. Through extensive RL-based training, the agents progressively develop expert-level collaborative behaviors and adaptive decision-making capabilities as their situational awareness evolves.
The overall reward combines three components: the task-level reward, the USV–AUV distance guidance term defined in Equation (11), and the FIM-based information metric (Equation (10)). The distance and FIM terms are normalized (via a running window or empirical min–max) to ensure numerical comparability across terms, and three weights control the relative contributions of these components.
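A minimal sketch of how such a reward could be assembled is shown below. The additive form, the default weights, and the names `RunningMinMax` and `hybrid_reward` are assumptions for illustration; the normalizer implements the empirical min–max variant over all values seen so far, not a sliding window.

```python
class RunningMinMax:
    """Min-max normalizer over all values seen so far (maps into [0, 1])."""

    def __init__(self):
        self.lo = float("inf")
        self.hi = float("-inf")

    def __call__(self, value):
        self.lo = min(self.lo, value)
        self.hi = max(self.hi, value)
        span = self.hi - self.lo
        # With a single observation (or constant input) the span is zero.
        return 0.5 if span == 0.0 else (value - self.lo) / span

def hybrid_reward(r_task, r_dist, r_fim, norm_dist, norm_fim,
                  weights=(1.0, 0.5, 0.5)):
    """Weighted sum of the task reward and the normalized distance and FIM terms.

    The additive form and the default weights are assumptions for illustration.
    """
    w_task, w_dist, w_fim = weights
    return w_task * r_task + w_dist * norm_dist(r_dist) + w_fim * norm_fim(r_fim)

# Three consecutive steps with raw (task, distance, FIM) signals on different scales.
norm_dist, norm_fim = RunningMinMax(), RunningMinMax()
history = [
    hybrid_reward(1.0, -40.0, 0.2, norm_dist, norm_fim),
    hybrid_reward(1.0, -10.0, 0.8, norm_dist, norm_fim),
    hybrid_reward(0.0, -25.0, 0.5, norm_dist, norm_fim),
]
```

Normalizing before weighting keeps a raw distance term measured in tens of units from swamping an information metric of order one, so the weights retain their intended meaning as relative contributions.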
By integrating the components above, we propose an optimized cooperative framework for heterogeneous multi-agent systems. The pseudocode for the complete framework is presented in Algorithm 1.
Algorithm 1 Collaborative Multi-Agent Framework
Input: initial agent positions; replay buffer; critic and actor network parameters for each agent
for each episode do
    Reset environment and agent parameters
    for each time step do
        Perform multi-agent path planning via FIM-based optimization
        Obtain agent positions using Equation (7)
        for each agent k do
            Observe the current state
            Sample an action
            Execute the action, receive the reward, observe the next state
            Store the transition tuple in the replay buffer
            if training step condition satisfied then
                Sample a random minibatch of N transitions from the replay buffer
                Update critic network via gradient descent
                Update actor network via policy gradient
            end if
        end for
    end for
end for
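The data flow of the inner loop of Algorithm 1 (collect, store, and update once enough transitions are available) can be sketched for a single agent as follows. The `policy`, `step_env`, and `update` callables are hypothetical placeholders supplied by the caller; the critic and actor updates themselves, as well as the FIM-based path planning step, are outside this sketch.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state) tuples."""

    def __init__(self, capacity, seed=None):
        self.data = deque(maxlen=capacity)  # oldest transitions are dropped first
        self.rng = random.Random(seed)

    def store(self, transition):
        self.data.append(transition)

    def sample(self, batch_size):
        return self.rng.sample(list(self.data), batch_size)

    def __len__(self):
        return len(self.data)

def run_episode(buffer, policy, step_env, init_state, horizon, batch_size, update):
    """One episode of the collect-store-update loop for a single agent."""
    state, n_updates = init_state, 0
    for _ in range(horizon):
        action = policy(state)
        reward, next_state = step_env(state, action)
        buffer.store((state, action, reward, next_state))
        if len(buffer) >= batch_size:          # training step condition
            update(buffer.sample(batch_size))  # critic update, then actor update
            n_updates += 1
        state = next_state
    return n_updates

# Toy usage: a 1-D state nudged toward zero, with a no-op update.
buffer = ReplayBuffer(capacity=100, seed=0)
updates = run_episode(
    buffer,
    policy=lambda s: -0.1 * s,
    step_env=lambda s, a: (-abs(s + a), s + a),
    init_state=1.0,
    horizon=20,
    batch_size=8,
    update=lambda batch: None,
)
```

In this toy run the first seven steps only fill the buffer, so 13 of the 20 steps trigger an update. Using "buffer holds at least one minibatch" as the training step condition is one common choice and is assumed here.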
2.4. Performance Evaluation Under Complex Network Topologies
To assess the collaborative efficiency of the proposed optimization framework, we construct a heterogeneous cooperation network based on graph theory, where nodes represent functional entities and edges represent communication or interaction links. By computing key network metrics—such as average path length, clustering coefficient, and betweenness centrality—we quantitatively analyze both information transmission efficiency and the relative importance of each node. This network-theoretic analysis underpins a systematic method for collaborative performance evaluation, establishing a strong foundation for subsequent system optimization.
The network’s collaborative communication capability is characterized by the mean shortest path length. A shorter average path length corresponds to more efficient information propagation and stronger communication capability. Specifically, counting the number of edges along the shortest path between each pair of nodes i and j, the average path length L is defined as the mean of these hop counts over all pairs of distinct nodes.
This metric directly reflects the efficiency of information exchange: shorter average path lengths indicate more optimized communication links, resulting in lower latency and enhanced overall system coordination.
The clustering coefficient measures the extent to which a node’s neighbors tend to form tightly knit groups. For a given node u with k first-order neighbors, the clustering coefficient of u is defined as the ratio of the number of triangles formed among its neighbors to k(k − 1)/2. The numerator quantifies the number of observed triangles (actual clustering), while the denominator is the maximum possible number of triangles among the neighbors (attained if they form a complete clique). The mean of this coefficient over all nodes defines the network clustering coefficient C; a higher C signifies greater local cohesiveness and connectivity.
Node betweenness centrality quantifies the fraction of all shortest paths in the network that pass through a given node, thereby reflecting its relative importance for information flow. Formally, the betweenness of node i sums, over all pairs of other nodes j and q, the ratio of the number of shortest paths between j and q that pass through i to the total number of shortest paths between j and q.
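The three structural metrics can be computed directly from an adjacency structure, as in the following dependency-free sketch. It assumes an undirected, unweighted, connected graph stored as a dict of neighbor sets; betweenness is left unnormalized and uses the standard Brandes accumulation. Function names and the toy graph are illustrative only.

```python
from collections import deque

def bfs_counts(adj, s):
    """Return (dist, sigma, order, preds) from source s in an unweighted graph."""
    dist, sigma, preds = {s: 0}, {s: 1}, {s: []}
    order, queue = [], deque([s])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in adj[v]:
            if w not in dist:
                dist[w], sigma[w], preds[w] = dist[v] + 1, 0, []
                queue.append(w)
            if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                sigma[w] += sigma[v]
                preds[w].append(v)
    return dist, sigma, order, preds

def average_path_length(adj):
    """Mean shortest-path length over ordered pairs of distinct nodes."""
    n = len(adj)
    total = sum(sum(bfs_counts(adj, s)[0].values()) for s in adj)
    return total / (n * (n - 1))

def clustering(adj, u):
    """Edges among the neighbors of u divided by k(k-1)/2; 0 if k < 2."""
    nbrs = list(adj[u])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a in range(k) for b in range(a + 1, k)
                if nbrs[b] in adj[nbrs[a]])
    return 2.0 * links / (k * (k - 1))

def betweenness(adj):
    """Unnormalized betweenness; each unordered node pair is counted once."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        _, sigma, order, preds = bfs_counts(adj, s)
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    return {v: b / 2.0 for v, b in score.items()}

# Toy graph: hub 0 joined to 1, 2 and 3, plus one edge between 1 and 2.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
L = average_path_length(adj)
C = sum(clustering(adj, u) for u in adj) / len(adj)
B = betweenness(adj)
```

On this toy graph L = 4/3, C = 7/12, and only the hub has nonzero betweenness (B[0] = 2, from the pairs 1–3 and 2–3), which matches the qualitative picture of a coordination node acting as a bridge.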
In this work, we further leverage the optimal horizontal distance from Equation (6) as a metric of collaborative situational awareness. In practical multi-agent networks, effective collaboration and information sharing among agents result in reduced localization errors, signifying more comprehensive situational awareness and, hence, improved global perception capabilities across the network. Empirically, we observe that configurations with slightly broken symmetry (guided by spacing/FIM terms) outperform strictly symmetric layouts when the latter induce ambiguous measurement geometries.
3. Experiments
In this section, we present comprehensive simulation studies to validate the effectiveness of the proposed collaborative optimization framework for heterogeneous multi-agent systems. We detail the experimental setup and provide analysis of the results. To align with our methodology, we report results with a symmetry-aware reading: the FIM objective preserves geometric invariance, while the learned behaviors introduce mild, performance-driven symmetry breaking relative to fixed symmetric placements.
3.1. Experimental Settings
All experiments were performed in a two-dimensional simulation environment. To demonstrate the feasibility and practical utility of our framework, we considered a scenario in which a sensing agent M is trained to localize two ground agents U. The simulation parameters comprise the time step, spatial step, angular frequency, initial displacement amplitude, agent acoustic emission level, and the emission frequencies of the two ground agents. Unless otherwise noted, initial layouts are generic (non-engineered), and the symmetry perspective concerns how FIM-driven planning and learned policies avoid degenerate symmetric formations while retaining rigid-motion invariance. Experiments were implemented in Python 3.10 using PyTorch 2.2, Gymnasium 0.29.1, and NetworkX 3.1, and were executed on a workstation with an Intel (Santa Clara, CA, USA) Core i9–13900K CPU, an NVIDIA RTX 4090 GPU (24 GB), and 64 GB RAM, running Ubuntu 22.04 LTS.
3.2. Experimental Results and Analysis
We evaluated the proposed framework using two mainstream reinforcement learning algorithms: Deep Deterministic Policy Gradient (DDPG) and Soft Actor-Critic (SAC). As illustrated in Figure 6, the training curves for both algorithms demonstrate stable convergence, indicating that agent M successfully acquires an expert-level collaborative policy. Consistent with our design, the learned behaviors introduce controlled symmetry breaking in agent spacing relative to fixed symmetric baselines, which improves observability without violating the FIM’s geometric invariance.
To gauge the run-to-run variability of the performance gains, we conducted ten independent runs under the experimental setup described above and report the mean and standard deviation of SDR (total data rate per time step) and ARPS (average reward per time step) for each algorithm. The results, summarized in Table 1, further support the robustness and adaptability of the proposed multi-agent framework.
In addition, we leveraged the expert policy obtained via DDPG to guide the heterogeneous multi-agent system. Figure 7a depicts representative trajectories between coordination agent M and ground agents U during a single RL training episode. To further demonstrate the advantages of the proposed cooperative framework, we evaluated the localization error of the multi-agent system across three scenarios: (1) path planning for agent M optimized using FIM, (2) agent M fixed at one coordinate, and (3) agent M fixed at a second coordinate. As shown in Figure 7b, the FIM-optimized path planning achieves the lowest localization error, highlighting the effectiveness of adaptive, information-driven coordination for enhancing situational awareness and collaborative perception.
Further simulations, based on the network configuration in Section 2.1, produce the shortest path heatmap illustrated in Figure 8. The computed average path length is 1.721, and the shortest path between any two agents in the network does not exceed four hops. This demonstrates rapid information dissemination and high transmission efficiency within the heterogeneous multi-agent communication network.
The clustering coefficient distribution for each node is presented in Figure 9. Significant variability is observed, indicating differing tendencies among nodes to form tightly connected local communities. In particular, a single node type stands out with the highest clustering coefficients, suggesting that its nodes play central roles in highly interconnected sub-networks.
Intermediate nodes exhibit a greater impact on the overall clustering characteristics of the network, while coordination (core) nodes (C) and peripheral processing nodes contribute less. This trend reveals that agents located in the network’s intermediate layer have a higher propensity to form clusters, resulting in denser local connectivity, whereas those situated at the core or periphery are less likely to be closely linked with their immediate neighbors.
Areas dominated by these high-clustering nodes are associated with more frequent interactions or heightened collaborative activity, further enhancing the network’s overall robustness and information-sharing capacity. A high clustering coefficient indicates that neighboring nodes form densely connected subgraphs, facilitating short-range communication and information exchange. This structure improves local perception fusion, map consistency, and the identifiability of local observations, thereby reducing estimation error. Moreover, through cross-cluster links provided by the coordination node C and higher-layer connections, the benefits of strong local clustering can propagate to the global scale, enhancing overall localization accuracy and convergence speed while preserving local robustness.
Figure 10 presents the betweenness centrality distribution for all nodes in the heterogeneous multi-agent network. As expected, core coordination nodes (C) display the highest betweenness centrality, with one coordination node exhibiting a particularly dominant value that far surpasses those of other nodes. This finding underscores the essential role of that node as a network bridge, lying on the majority of shortest paths and serving as a primary connector for information flow across the network.
Beyond the core C nodes, processing nodes (U) also demonstrate notable betweenness centrality, functioning as important intermediaries in the network. In contrast, sensing nodes are characterized by low betweenness values, reflecting their primary engagement in localized connectivity rather than system-wide mediation.
Table 2 summarizes the average betweenness centrality for each node type. Coordination nodes consistently exhibit much higher average betweenness than any other node category, confirming their pivotal role in system connectivity and robustness. Notably, the table also highlights subtle differences between sensing and processing nodes: in some configurations, processing nodes may even surpass sensing nodes in betweenness, indicating the importance of distributed coordination in multi-agent collaboration.
The relatively low betweenness of sensing nodes indicates that, under the present topology, they act primarily as information endpoints rather than mediators. Consequently, failures at core coordination nodes could disproportionately degrade collaborative capacity, underscoring the need for resilient architectures (e.g., redundant bridges, bounded centrality, alternative routing). Consistent with the symmetry-aware reading of our experiments, mild, performance-driven symmetry breaking (induced by FIM-aware planning and learned spacing) improves observability and coordination compared with strictly symmetric, fixed placements. Fault tolerance can be further improved by introducing a small number of redundant links among sensing nodes or by employing adaptive relay mechanisms, without materially increasing communication overhead.
Coordination nodes (C), which exhibit the highest betweenness centrality, serve as critical bridges for information flow and therefore constitute potential single points of failure. If such a node fails or experiences a communication outage, overall collaborative efficiency and information throughput can be severely impaired. To address this risk, future work will investigate robust coordination mechanisms based on dynamic network reconfiguration and reinforcement learning, aiming to enhance adaptability and fault tolerance under critical-node failures. Concretely, within the existing RL framework we plan to introduce redundancy-aware role reallocation and post-fault link self-recovery to enable rapid rerouting of information and dynamic reconfiguration of collaboration patterns, thereby preserving network availability and cooperative performance. This line of research will lay the groundwork for long-term reliable operation of heterogeneous multi-agent systems in complex, time-varying environments.