1. Introduction
In scenarios involving large numbers of agents, such as modern military operations (e.g., UAV swarms for cooperative reconnaissance), intelligent transportation systems (e.g., connected autonomous vehicle fleets), and large-scale industrial IoT (e.g., distributed sensor networks for smart manufacturing), congested information transmission and increasingly difficult management have become urgent problems. Clustering, a hierarchical management strategy, has attracted extensive attention from both academia and industry in recent years [1,2,3]. It has proven effective in solving network management problems and enhancing network stability. With clustering management, large-scale multi-agent systems (LS-MAS) can improve operational efficiency, enhance survivability, and optimize resource utilization, which in turn promotes the development of related technologies. Specifically, clustering enables more efficient allocation and execution of operational tasks, which improves the flexibility and response speed of the overall system; by dispersing risk and improving concealment, it improves the survivability of agents; and by allocating and using resources more effectively, it reduces operational costs and maintenance difficulty.
In multi-agent systems (MAS), clustering allocation has become a fundamental and widely adopted strategy for tackling scalability, collaborative task execution, and resource optimization challenges, with substantial research advances across diverse application scenarios. Early studies laid the groundwork for adaptive and utility-driven clustering in mobile networks, such as the adaptive clustering mechanism for wireless ad hoc networks [4] and the on-demand weighted clustering algorithm [5], which were further enhanced by load-balancing clustering frameworks [6] and connectivity-centric k-hop clustering [7]. For multi-robot MAS, clustering has been integrated with auction-based task allocation [8], heuristic grouping [9], and game-theoretic optimization [10] to improve collaboration efficiency. In UAV-based MAS, clustering allocation has been deployed to minimize insecure communication ranges [11] and energy provision [12] and to enhance cluster stability via mobility-aware strategies [13]. Meanwhile, in cloud/fog computing and data center MAS, modified k-means clustering [14], hierarchical clustering [15], and workflow-aware clustering [16] have enabled balanced resource scheduling [17,18]. Extensions to heterogeneous networks [19], ultra-dense networks [20], and non-ideal non-orthogonal multiple access (NOMA) systems [21] have leveraged clustering for user grouping and resource allocation, with reinforcement learning-aided clustering further optimizing performance [22]. Beyond traditional MAS, clustering allocation has also been applied to asset allocation [23], facility location [24], and regional resource network optimization [25], demonstrating its versatility across domains. An analysis of these references shows that balancing conflicting objectives remains a core dilemma, as most algorithms prioritize specific criteria at the expense of others, such as communication efficiency [11], real-time responsiveness [22], and energy consumption [26]. Furthermore, scalability limitations persist in large-scale deployments. Existing methods often struggle to adapt dynamic re-clustering to rapid topology changes caused by agent mobility [13] or suffer from computational overhead that grows exponentially with the number of agents [18].
Despite its broad applicability, clustering allocation faces two critical challenges when applied to LS-MAS. First, existing clustering allocation strategies lack effective mechanisms to reconcile individual agent self-interest with global system optimality, leading to unstable and inefficient cluster formations. Most conventional approaches prioritize global objectives such as load balancing [6] or connectivity [7] without adequately accounting for agents’ autonomous utility maximization. This conflict is exacerbated in LS-MAS scenarios, where agents may deviate from cluster assignments to pursue local gains, undermining long-term cluster stability. Second, current methods fail to achieve decentralized, dynamic clustering and resource allocation with scalable fairness in LS-MAS. Centralized clustering approaches suffer from excessive communication overhead and single-point vulnerabilities when scaling to massive agent deployments, while distributed heuristic methods lack systematic frameworks to ensure equitable resource distribution across clusters. This limitation is evident in cloud computing [14], asset allocation [23], and facility location [24] scenarios, where static clustering structures cannot adapt to real-time changes in agent states or task demands.
Motivated by these observations, this paper uses coalition game theory to solve the clustering allocation problem of large-scale multi-agent systems. Coalition game theory inherently models strategic agent interactions, enables self-organized coalition formation, and balances individual and collective utilities, so that both the optimality–stability trade-off and the decentralized fairness gap in LS-MAS clustering allocation can be addressed. In this paper, a large-scale multi-agent clustering model is first established. Then, the target allocation result of agents is obtained according to the cost of agents countering targets. Based on the allocation result, a coalition game solution algorithm is designed, and repeated simulation cases are carried out to verify the applicability of the algorithm. The main contributions of this paper are summarized below.
A coalitional game clustering allocation scheme is developed for large-scale multi-agent systems. This scheme can effectively reconcile individual agent self-interest with global optimality and adapt to dynamic tasks, as it models each cluster as a cooperative alliance where agents voluntarily form stable partitions to maximize collective benefits while satisfying their local preferences. By integrating dynamic switching strategies, the scheme enables agents to adjust their coalition memberships in response to changes in task requirements, ensuring sustained performance in dynamic operational scenarios.
A new coalitional switching strategy is designed and incorporated into the proposed allocation scheme to generate a stable coalition partition for the clustering allocation process. A related Nash stability analysis is also provided. The game-theoretic framework provides a rigorous basis for analyzing the stability of cluster structures, guaranteeing that the final partitions are Nash-stable and free from unilateral deviations that could undermine system-wide efficiency.
The remainder of this article is organized as follows.
Section 2 formulates the problem considered in this paper.
Section 3 presents a clustering allocation algorithm based on coalitional game theory.
Section 4 shows simulation results for the developed allocation scheme.
Section 5 concludes this work.
2. Problem Formulation
When the number of agents in the cluster is large, to facilitate communication network management, all agents in the cluster will form clusters, where represents the set of agents contained in the r-th cluster. On the one hand, since the efficiency of inter-cluster communication is lower than that of intra-cluster communication, too many clusters will lead to high communication delay. On the other hand, agents performing the same task need to exchange task information more frequently. Therefore, how to balance the communication efficiency of the multi-agent system and the characteristics of task execution to obtain the optimal clustering scheme, thereby effectively managing the communication network, is currently a difficult problem.
Since this work focuses solely on communication efficiency and the impact of agent dynamics on the core results is negligible, no specific constraints are imposed on the system model, which is therefore omitted from the problem formulation. This does not mean, however, that the proposed method lacks the fidelity required for real-world deployment, where dynamic physical limitations are critical. As will be seen later, dynamic physical limitations affect the initialization of the coalition partition, and a different initialization strategy for grouping agents may produce a different coalition partition. For example, when a consensus-based auction strategy takes flight distance and energy consumption as the trajectory cost, dynamic physical limitations influence the calculation of energy consumption and thereby the initial coalition partition.
2.1. Constraint Condition
Due to limited communication resources, the number of agents in a single cluster should not be excessive to ensure effective intra-cluster communication. Therefore, the number of agents in a cluster is constrained as shown below:
where
represents the number of agents in the
r-th cluster, and
represents the maximum number of agents allowed in a single cluster.
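As a minimal sketch of this constraint, the check below treats a partition as a collection of agent-ID sets and verifies that no cluster exceeds the size limit. The names `satisfies_size_constraint` and `max_cluster_size` are illustrative assumptions and are not notation taken from the paper.

```python
from typing import Iterable, Set


def satisfies_size_constraint(partition: Iterable[Set[int]], max_cluster_size: int) -> bool:
    """Return True if every cluster holds at most max_cluster_size agents."""
    return all(len(cluster) <= max_cluster_size for cluster in partition)


if __name__ == "__main__":
    # Three clusters of sizes 3, 2, and 1 (agent IDs are arbitrary).
    example = [{0, 1, 2}, {3, 4}, {5}]
    print(satisfies_size_constraint(example, max_cluster_size=3))
    print(satisfies_size_constraint(example, max_cluster_size=2))
```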
2.2. Performance Metric
Considering that multiple agents performing the same task need to frequently exchange task information, a feasible clustering method is to group agents performing the same task into one cluster. However, when the number of agents performing the same task is small, it may lead to an excessive number of clusters, reducing communication efficiency of agents. To control the number of clusters and improve communication performance, we comprehensively consider the communication efficiency and task attributes of agents and establish the performance metric shown below:
where
is a trade-off factor,
is related to the communication efficiency of agents and
is related to the task attributes of agents.
For
, since inter-cluster communication is less efficient than intra-cluster communication, excessive clusters cause high latency, and dynamic agent movement requires frequent cluster structure updates. To reduce the number of clusters and improve network stability,
is defined as
where
is the set of direct communication links between any two agents in cluster
, and
is the predicted link survival probability between agent
i and its adjacent agent
k in cluster
. For
, we calculate the following:
where
is the distance between agent
i and its adjacent agent
k in cluster
, and
L is the rated distance for unobstructed communication between agents.
For
, considering that agents performing the same task need frequent task information exchange, they should be grouped into the same cluster as much as possible.
is defined as
where
represents the total number of agents performing the
j-th task,
represents the number of agents performing the
j-th task in cluster
, and
represents the task number executed by agents in cluster
.
2.3. Optimization Model
The optimization model for agent clustering problem can be written as follows:
The goal of the above optimization problem is to find a suitable clustering structure that maximizes the overall network performance while satisfying the constraints on the number of agents, considering agent communication efficiency and task attributes.
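One way to read this model in code is sketched below. It rests on assumptions that the text does not confirm: the two terms are combined as a convex combination weighted by the trade-off factor, the result is summed over clusters, and an infeasible partition is scored as negative infinity. The per-cluster functions `comm_term` and `task_term` are hypothetical placeholders supplied by the caller, not the paper's definitions.

```python
from typing import Callable, FrozenSet, Iterable

Cluster = FrozenSet[int]


def partition_performance(
    partition: Iterable[Cluster],
    alpha: float,                              # trade-off factor (assumed to lie in [0, 1])
    comm_term: Callable[[Cluster], float],     # hypothetical communication-efficiency term
    task_term: Callable[[Cluster], float],     # hypothetical task-attribute term
    max_cluster_size: int,
) -> float:
    """Score a partition; infeasible partitions get -inf so a maximizer rejects them."""
    clusters = list(partition)
    if any(len(c) > max_cluster_size for c in clusters):
        return float("-inf")
    # Assumed combination: weighted sum of the two terms, added up over clusters.
    return sum(alpha * comm_term(c) + (1.0 - alpha) * task_term(c) for c in clusters)
```

Maximizing this score over all feasible partitions corresponds to the optimization problem stated above, under the assumptions listed.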
3. Clustering Allocation Scheme Design
A coalition game describes the process by which decision-makers form stable alliances with other decision-makers through cooperation. A coalition is formed when rational decision-makers are willing to cooperate and expect to achieve better results by establishing a cooperative organization. Therefore, the main problem addressed by a coalition game is how to form an appropriate cooperative organization to achieve the expected outcomes. From this definition, the problem solved by a coalition game is consistent with the agent clustering problem in
Section 2. In agent clustering, the concepts of “coalition” and “cluster” are equivalent: each cluster corresponds to a coalition. In other words, agent swarms will eventually form multiple non-overlapping coalitions, a structure called a coalition partition in coalition game theory. In this section, a new coalition game clustering allocation scheme for large-scale multi-agent systems is designed.
3.1. Overall Scheme Design
Since the problem addressed by a coalition game is highly similar to the clustering problem of multi-agent systems, the main idea is to construct an appropriate cooperation mechanism that ensures the expected results are achieved. In the final state, the multi-agent system forms multiple non-overlapping coalitions, i.e., a coalition partition.
Under this idea, agents assigned to perform the same task first gather to form initial clusters, which completes the initialization of clustering. Subsequently, each agent in a cluster periodically executes three key steps in turn: (1) generation of a switching set; (2) establishment of a switching operation; (3) selection of the optimal switching operation. This process continues until a stable coalition partition is formed.
The specific steps are given below.
(1) Initialize coalition partition, where agents performing the same task form a cluster. Specifically, agents assigned to execute the same task are grouped into an initial cluster, as they need frequent interactions to share task-related information and coordinate operational actions. This initial grouping not only simplifies the subsequent optimization process by reducing unnecessary computational overhead but also ensures that the basic collaborative needs of agents are met at the early stage. Such a task-based initialization strategy aligns with the core requirements of multi-agent system operations and provides a reasonable starting point for further alliance adjustment and optimization.
(2) For any agent i, assuming it performs the j-th task and belongs to cluster , periodically execute the three steps of “switching set generation–switching operation establishment–optimal switching operation selection” as follows.
Switching set generation. First, initialize the agent switching set , and ; then, establish different switching sets according to whether the cluster meets the constraint conditions before and after switching: when the cluster where agent i is located does not meet the constraint conditions, the agent switching set remains unchanged; when clusters and meet the constraint conditions, the agent switching set becomes: .
Switching operation establishment. Agents can only switch between adjacent clusters (including empty clusters). A switching operation is considered effective only when the benefit it brings is greater than zero, and the set of actually executable switching operations is generated accordingly.
Optimal switching operation selection. According to the switching benefit, find the optimal agent switching set and cluster . If all agents in are effective, then the agent set leaves cluster and joins cluster , and the coalition partition is updated according to .
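The steps above can be sketched as follows. This is a simplified illustration and not the paper's algorithm: it moves one agent at a time instead of a switching set, ignores the adjacency restriction between clusters, only prevents the target cluster from exceeding the size limit, and assumes the gain of a switch is the summed utility of the two affected clusters after the switch minus before. The per-cluster `utility` function is a hypothetical placeholder and must also accept an empty cluster.

```python
from collections import defaultdict
from typing import Callable, Dict, FrozenSet, List, Tuple

Cluster = FrozenSet[int]


def initial_partition(task_of: Dict[int, int]) -> List[Cluster]:
    """Step (1): agents assigned to the same task start in the same cluster."""
    groups = defaultdict(set)
    for agent, task in task_of.items():
        groups[task].add(agent)
    return [frozenset(members) for _, members in sorted(groups.items())]


def form_coalitions(
    partition: List[Cluster],
    utility: Callable[[Cluster], float],  # hypothetical per-cluster benefit
    max_cluster_size: int,
    max_rounds: int = 100,
) -> Tuple[List[Cluster], bool]:
    """Step (2), simplified: repeat single-agent switches until none is beneficial."""
    partition = [frozenset(c) for c in partition if c]
    for _ in range(max_rounds):
        switched = False
        for agent in sorted(a for c in partition for a in c):
            src = next(i for i, c in enumerate(partition) if agent in c)
            best_gain, best_idx = 0.0, None
            # Candidate targets: every other cluster plus one empty cluster.
            for idx in range(len(partition) + 1):
                if idx == src:
                    continue
                target = partition[idx] if idx < len(partition) else frozenset()
                if len(target) + 1 > max_cluster_size:
                    continue  # the switch would violate the size constraint
                before = utility(partition[src]) + utility(target)
                after = utility(partition[src] - {agent}) + utility(target | {agent})
                if after - before > best_gain:  # only positive-gain switches count
                    best_gain, best_idx = after - before, idx
            if best_idx is not None:
                old_target = partition[best_idx] if best_idx < len(partition) else frozenset()
                joined = old_target | {agent}
                left = partition[src] - {agent}
                # Rebuild the partition, dropping the source cluster if it became empty.
                partition = [c for i, c in enumerate(partition) if i not in (src, best_idx)]
                partition += [c for c in (left, joined) if c]
                switched = True
        if not switched:
            return partition, True   # no agent has a beneficial switch left
    return partition, False          # round limit reached before stabilizing
```

Because every accepted switch strictly increases the summed utility of the partition in this sketch, the loop cannot cycle; the `max_rounds` guard only bounds the running time.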
3.2. Algorithm Design
Specific flow of the proposed clustering allocation algorithm is shown in
Figure 1. Agents performing the same task initially form a cluster. For any agent
i performing task
m and belonging to cluster
, it initiates with three core initialization steps: coalition partition initialization, setting the switching set
where
with
the set of agents that perform task m, and initializing the switching operation
. It then enters an iterative loop focused on constraint verification and coalition adjustment. Firstly, it checks if the new coalition
meets preset constraints. If
fails the check, the algorithm further verifies whether the merged coalition
satisfies constraints where
is an iteration flag for any agent in cluster
. When this condition holds, it judges the switch gain of
which is denoted as
and is used to evaluate the quality of a switch operation. Mathematically, for two coalitions labeled by
l and
k, the switch gain
is defined as
where
and
are coalitions before the switch operation, and
and
are new coalitions formed after the switch, expressed as
and
. We now continue with the description of the algorithm. If
is positive,
U is updated to
. When
is larger than the number of agents
of cluster
, the algorithm then reassigns
to the element maximizing
where
is a temporary switching set symbol. Meanwhile, it conducts the coalition switching operation
before incrementing the index
i. In the coalition switching operation above, the symbol ↦ denotes replacement.
The iteration continues until the index i has run through the total number of agents. At this point, the algorithm checks whether the current coalition partition has reached a Nash equilibrium. If so, the algorithm terminates; if not, the iterative process of constraint checking, coalition adjustment, and switching set/operation updates restarts and repeats until the equilibrium condition is satisfied, yielding a stable coalition structure.
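The termination test can be illustrated with a small checker. It is a sketch under assumptions: deviations are single-agent moves to another cluster or to a new empty cluster, and the gain is the change in summed utility of the two affected clusters. `utility` is a hypothetical per-cluster benefit function that must also accept an empty cluster.

```python
from typing import Callable, FrozenSet, Iterable

Cluster = FrozenSet[int]


def is_nash_stable(
    partition: Iterable[Cluster],
    utility: Callable[[Cluster], float],
    max_cluster_size: int,
) -> bool:
    """Return True if no single agent has a feasible switch with positive gain."""
    clusters = [frozenset(c) for c in partition]
    options = clusters + [frozenset()]  # an agent may also open a new cluster
    for s_idx, source in enumerate(clusters):
        for agent in source:
            for t_idx, target in enumerate(options):
                if t_idx == s_idx or len(target) + 1 > max_cluster_size:
                    continue
                before = utility(source) + utility(target)
                after = utility(source - {agent}) + utility(target | {agent})
                if after - before > 0:
                    return False  # a profitable unilateral deviation exists
    return True
```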
Remark 1. The advantages of the proposed method over established heuristic clustering techniques mentioned in the literature, such as k-means or auction-based protocols, are as follows. The k-means technique only performs clustering and does not consider allocation. Auction-based protocols only take allocation into account and do not fully consider clustering. The proposed coalitional-game-based clustering allocation method both performs clustering and generates an optimal allocation result, which represents a major innovation of this work.
3.3. Stability Analysis
This subsection establishes the stability of the designed clustering allocation scheme, as summarized in Theorem 1.
Theorem 1. Assume that the algorithm completes the iteration without exceeding the maximum iteration number N. Then, using the coalition switching operation, the proposed clustering allocation algorithm in Figure 1 ensures that the cluster structure eventually becomes stable and the formed coalition partition Π is Nash stable.
Proof. The theorem is proved by contradiction. If the final coalition partition Π
is not Nash stable, there exists a switching operation
that causes agent
i in cluster
to trigger the switching operation, leave the current cluster, and join cluster
, which contradicts the stability of the cluster structure. Since the algorithm is assumed to complete the iteration without exceeding the maximum iteration number
N, it cannot be trapped in a local optimum. This is mainly supported by the fact that the switch gain
is a monotonic function, as can be seen from the following definition. For a participant, if the switch gain
of switch operation
is greater than the switch gain
of another switch operation
, the participant prefers
over
. Thus, the participant’s preference relation “≻” for switch operations can be expressed as
where “⇔” denotes an equivalence relation. For a monotonic switch gain, the local optimal solution and the global optimal solution coincide. Therefore, through the proposed algorithm, the finally formed coalition partition Π
is Nash stable. □
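The preference relation described in words in the proof can be written compactly as below. The symbols are placeholders assumed here for illustration: $\sigma_1$ and $\sigma_2$ stand for two switch operations and $\Delta U(\cdot)$ for the switch gain; they may differ from the notation of the original equation.

```latex
\sigma_1 \succ \sigma_2 \;\Leftrightarrow\; \Delta U(\sigma_1) > \Delta U(\sigma_2)
```

In words, a participant ranks switch operations purely by their switch gain, which is the property the proof relies on.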
Theorem 1 shows that the proposed coalition game clustering allocation algorithm obtains a stable coalition partition Π
after a finite number of iterations. It can also be inferred that if the algorithm stops only because the maximum iteration number is reached, it has become trapped at a local optimum during the iterative coalition switches and the final result is not a Nash equilibrium. Furthermore, from the flowchart of the clustering allocation algorithm in
Figure 1, we can see that compared with the enumeration method, the proposed coalitional game method can reduce the complexity of clustering allocation for multi-agent systems from
to
.
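To give a feel for why exhaustive enumeration becomes costly, note that the number of ways to split n items into non-overlapping, non-empty groups is the Bell number B_n. The sketch below computes B_n with the Bell triangle. It assumes enumeration means listing all set partitions and ignores the cluster-size constraint, so it illustrates the growth of the search space and does not reproduce the paper's complexity expressions.

```python
def bell_number(n: int) -> int:
    """Number of partitions of an n-element set, computed with the Bell triangle."""
    row = [1]
    for _ in range(n):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]


if __name__ == "__main__":
    for n in (5, 10, 15):
        print(n, bell_number(n))
```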
4. Simulation Results
This section applies the coalition game algorithm to the clustering allocation problem of large-scale multi-agent systems. Two groups of repeated experiments are set up by changing the initial positions of agents and targets. A detailed analysis illustrates the effectiveness of the proposed scheme, and the results are also compared with those obtained by solving the same clustering allocation problem with the enumeration method.
Consider the communication link planning problem that arises when multiple unmanned aerial vehicles (UAVs) collaborate to scout multiple targets. The numbers of targets and UAVs are denoted as
and
, respectively. Targets may be fixed or moving; all of them are treated as particles, and for moving targets the velocity is taken into account. For UAV agents, a universal longitudinal motion model is considered, which includes the height, speed, inclination angle, and yaw angle. The model description is omitted here for brevity; see [27] for details. To establish the optimal communication link, UAV agents need to perform clustering, and within each cluster they take one of two roles: cluster head or cluster member. Cluster heads are responsible for managing intra-cluster members, finding routes to the ground station, and completing resource allocation. The clustering process of UAV swarms resembles a coalition game, in which decision-makers form stable coalitions with other decision-makers through cooperation. Therefore, a coalition game algorithm is adopted to solve this clustering problem.
Next, we explain how the optimization problem in
Section 2 is instantiated in the communication link planning scenario. For the constraint condition, each cluster allows up to 15 UAV agents, i.e.,
. For the objective function, since the rated distance for unobstructed communication between agents is
, i.e.,
, then
is obtained by substituting the clustering results for each round, thus
and
can be calculated. For the interaction topology, valid communication distance is set to be
, within which two UAV agents are treated as connected. Because the relative positions of agents change over time, their communication topology also changes and must be recomputed in real time.
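A minimal sketch of this distance-based topology is given below. The function name `communication_topology`, the parameter `comm_range`, and the choice to treat a distance exactly equal to the range as connected are assumptions made for illustration; the actual range value used in the simulations is not restated here.

```python
import math
from typing import Dict, Sequence, Set, Tuple


def communication_topology(
    positions: Sequence[Tuple[float, ...]],
    comm_range: float,
) -> Dict[int, Set[int]]:
    """Map each agent index to the set of agents within comm_range of it."""
    n = len(positions)
    neighbors: Dict[int, Set[int]] = {i: set() for i in range(n)}
    for i in range(n):
        for k in range(i + 1, n):
            # Euclidean distance between agents i and k.
            if math.dist(positions[i], positions[k]) <= comm_range:
                neighbors[i].add(k)
                neighbors[k].add(i)
    return neighbors
```

Calling the function again with updated positions yields the new topology after the agents have moved.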
In simulations, the consensus-based auction algorithm is first used to solve the allocation problem of agents scouting targets, after which the coalition game algorithm is initialized. The input of the consensus-based auction algorithm, i.e., the trajectory cost, mainly consists of flight distance and energy consumption. Therefore, the initialization strategy of grouping agents solely by task assignment already accounts for spatial proximity and does not induce high initial communication latency that would require excessive iterations to resolve. Since the trajectory cost calculation is not the core part of this work, it is omitted here. Subsequently, the coalition game algorithm is used to obtain the clustering results. The simulations are run on a computer (DESKTOP-8DOTIFE) with an AMD Ryzen 7 4800H with Radeon Graphics processor at 2.90 GHz, 16.0 GB of installed RAM, and a 64-bit operating system. This platform is used to assess the algorithm’s performance in terms of coalition benefit, allocation time, and stability. The two groups of simulations are presented and compared as follows to verify the adaptability of the algorithm.
Case 1: The number of our agents
is 100, and there are 10 targets; the initial position information of agents and targets is randomly set as shown in
Figure 2, where the height of all agents is 300 m, the speed is 100 m/s, the initial inclination angle and yaw angle are 0°, and the height of all targets is 0 m.
Case 2: The number of our agents
is 100, and there are 10 targets; the initial position information of agents and targets is randomly set as shown in
Figure 3, where the speed of all agents is 100 m/s, the initial inclination angle and yaw angle are 0°, and the height of all targets is 0 m.
The allocation of agents countering targets is carried out. The allocation results of the two cases are shown in
Table 1 and
Table 2.
Based on the above initial allocation results, the benefits corresponding to different agent clusters are calculated according to the model. According to the calculated benefits, the coalition partition results are obtained, as shown in
Table 3 and
Table 4. Meanwhile, the results using the proposed algorithm are compared with clustering using the enumeration algorithm. Comparative coalition partition results are given in
Table 5.
Based on the results in
Table 3,
Table 4 and
Table 5, a comprehensive analysis of the coalition partition outcomes and algorithm performance is presented here. In both Case 1 and Case 2, the proposed algorithm yields well-structured coalition partitions, each with nine non-overlapping clusters covering all 100 agents, and the overall coalition benefits reach 2.80 and 2.38, respectively. This demonstrates that the proposed algorithm effectively balances the communication efficiency and task collaboration required in the UAV agent swarm. Moreover, although the geographic locations differ significantly between the two cases, Nash equilibrium is achieved in both. More importantly, the comparative results in
Table 5 highlight the superior performance of the proposed algorithm. While achieving the same optimal coalition benefit as the enumeration algorithm (2.80 for Case 1 and 2.38 for Case 2), it reduces the allocation time from 51.382 s and 46.423 s with the enumeration algorithm to 5.296 s and 4.969 s, respectively. This indicates that the proposed algorithm not only preserves the optimality of the clustering results but also significantly improves computational efficiency, making it more suitable for practical large-scale multi-agent system applications. It is worth noting that an allocation time of several seconds does not conflict with the real-time control of UAVs, where topology changes occur in milliseconds. The clustering allocation algorithm is deployed on the cloud or on high-performance edge devices, where instructions are updated on a time scale of seconds, and the cluster allocation results need to be updated only when multiple new tasks emerge. This update is therefore decoupled from the real-time control of UAVs and takes place over seconds or longer rather than milliseconds.
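The reduction in allocation time can be expressed as a speedup factor computed directly from the timings reported above. The helper name `speedup` is illustrative.

```python
def speedup(enumeration_time_s: float, proposed_time_s: float) -> float:
    """Ratio of enumeration time to proposed-algorithm time."""
    return enumeration_time_s / proposed_time_s


# Allocation times (in seconds) as reported for the two cases.
timings = {"Case 1": (51.382, 5.296), "Case 2": (46.423, 4.969)}

if __name__ == "__main__":
    for case, (enum_t, prop_t) in timings.items():
        print(f"{case}: {speedup(enum_t, prop_t):.2f}x faster")
```

With these timings the proposed algorithm is roughly 9.7 times faster in Case 1 and roughly 9.3 times faster in Case 2.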
To show the impact of the number of agents on the performance of the proposed clustering scheme, the simulation is extended to larger agent populations, ranging from 100 to 1200, which covers massive swarm scenarios on the order of a thousand nodes. The results are shown in
Figure 4, which demonstrates the scalability of the proposed method for industrial IoT or massive swarm scenarios. From
Figure 4, it can also be seen that the number of iterations required for convergence increases slowly as the number of agents grows. This is because, as the swarm grows, the number of switch operations that UAV agents can perform increases, so more iterations are required to achieve network stability.