A Coalition Formation Game Approach for Efficient Cooperative Multi-UAV Deployment

Abstract: Unmanned aerial vehicle (UAV) cooperative control has become an important issue in UAV-assisted sensor networks, owing to the considerable benefit obtained when cooperating UAVs serve as flying base stations. In coverage scenarios, the trade-off between coverage and transmission performance often puts UAV deployment in a dilemma, since both metrics depend on the distance between UAVs. To address this issue, this paper analyzes the UAV coverage and data transmission mechanism and proposes an efficient multi-UAV cooperative deployment model. The problem is modeled as a coalition formation game (CFG), and the CFG with Pareto order is proved to have a stable partition. An effective approach consisting of coverage deployment and coalition selection is then designed, wherein UAVs decide strategies cooperatively to achieve better coverage performance. Building on the game analysis, a coalition selection and position deployment algorithm based on Pareto order (CSPDA-PO) is designed to execute coverage deployment and coalition selection. Finally, simulation results validate the proposed approach.


• An efficient cooperative multi-UAV deployment model based on coalition formation is proposed, describing the trade-off between the coverage reward and energy consumption in the UAV-assisted network. A coverage utility function is characterized as a reference for coalition formation to accurately depict the transmission model.
• Through game analysis, the stable solution of the proposed system model is proved to exist. The proposed UAV coverage and data transmission approach can achieve at least one stable coverage deployment and coalition formation scheme to maximize the coverage utility of the whole UAV network.
• Coalition selection and position deployment algorithm based on Pareto order (CSPDA-PO) is designed to perform coalition selection and position deployment. The CSPDA-PO algorithm is able to achieve the stable state of our system model. Meanwhile, convergence of the proposed algorithm is also shown in the simulation.
Note that the authors of [7,24] carried out work relevant to ours, especially in constructing the UAV coverage model, including the evaluation of coverage ability and the deployment mechanism. The main differences can be summarized as follows: (i) our UAV deployment is concerned not only with coverage ability, but also with coalition formation under the combined coverage reward and energy consumption; and (ii) the energy consumption model for transmitting and receiving through radio frequency (RF) signals is introduced into the coalition formation mechanism and is associated with the coverage reward under the current deployment.
The rest of the paper is organized as follows: Section 2 presents the system model of multi-UAV cooperative deployment and the problem formulation. In Section 3, a coalition formation game approach for the system model is analyzed, and a learning algorithm is designed that drives the proposed problem to a stable state. Simulation results and analysis are presented in Section 4. Finally, concluding remarks are given in Section 5.

System Model and Problem Formulation
We consider a UAV-assisted sensor network employing N distributed UAVs and one central control UAV, where UAV groups perform reconnaissance and data collection tasks in disaster scenarios. UAVs communicate with each other through radio frequency (RF) signals of wireless sensors. The sensor devices are mounted on each UAV's fuselage and have limited battery power and data processing capability. They can also acquire the locations of other sensor nodes (UAVs). The UAV groups collect data by covering the ground, and each UAV's coverage capability is determined by observation thresholds. The data traffic collected by the distributed UAVs is then sent to the central UAV for processing. Because long-distance transmission consumes a large amount of energy, the UAVs need to form groups and use some UAV sensor nodes as relays to carry out data transmission.
In the cooperative multi-UAV deployment example shown in Figure 1, seven UAVs perform the coverage mission and upload the collected data to the upper central UAV. Due to the limited energy of the sensor devices, the UAVs need to form groups to cooperate. In this case, some UAVs are used as relays to reduce the energy consumption of data transmission. In particular, the high transmission energy consumption between UAVs that are too far apart (such as UAV 1 and UAV 7) prevents the formation of the grand coalition. Therefore, it makes sense to consider coalition formation by optimizing the coverage utility of the UAV network.
To facilitate computation, we discretize the continuous mission area $I \subset \mathbb{R}^2$ into a uniform mesh of cells; each cell $i$ has width $l_w$ and is identified by the position of its center. The serial number of the central UAV is set to 0. For each $n \in \mathcal{N}$, let $c_n$ denote UAV $n$'s coalition selection and $g_n = \{x_n, y_n, h_n\}$ its three-dimensional position. Hence, UAV $n$'s state is defined as $s_n = \{g_n, c_n\}$. For each $m \in \mathcal{M}$, let $CO_m$ denote the set of UAVs that select coalition $m$ for data transmission, i.e., $CO_m = \{n \in \mathcal{N} : c_n = m\}$. In addition, the central UAV's position is denoted as $g_{cen}$.

For each $i \in I$, let $\sigma(i)$ be the importance (traffic, etc.) of mission cell $i$, expressed as a normalized density function. For simplicity, we set $U_\xi$ as the reward obtained when UAV set $\xi$ successfully covers mission area $I$, as given in Equation (1) [1]. Equation (2) indicates that, as long as cell $i$ is within the detection range of UAV $n$ in the network, it is considered successfully covered by UAV $n$ with all of its data traffic collected. Hence, $U_\xi$ represents the coverage performance (that is, the reward) of mission area $I$ under UAV set $\xi$. In particular, the overall coverage reward of the UAV network is $U_{\mathcal{N}}$.

From the above description, the data traffic collected by the distributed UAVs is transmitted through UAV-to-UAV links to the central control UAV for processing. In the UAV-assisted sensor network, most of the energy consumed by the wireless sensors is spent on transmission operations, which consist of sending and receiving. Therefore, motivated by the work in [25], we adopt its expressions to estimate the energy each sensor node consumes when transmitting and receiving through RF signals. In these expressions, $d_{i,j}$ represents the distance between UAVs $i$ and $j$, and $E_0$ is the energy consumed by the radio when transmitting and receiving.
$\varepsilon_s$ and $\varepsilon_l$ represent the energy consumed by transmitting one bit of data over a shorter and a longer distance, respectively. The overall energy consumption per bit of data over the link between UAV $i$ and UAV $j$ follows from these expressions.

In a coalition, the cluster head UAV plays an important role in connecting the coalition members with the upper central UAV so as to achieve data transmission. The overall energy consumption of the whole network is defined in Equation (5), where $cl_m$ represents coalition $m$'s cluster head UAV, and $U_n$ and $U_{CO_m}$ are the mission rewards of UAV $n$ and UAV set $CO_m$, indicating the data traffic they collect. Equation (5) contains two terms: the first represents the energy consumed in transmitting data from coalition $m$'s member UAVs to its cluster head UAV, and the second represents the energy consumed in transmitting data from coalition $m$'s cluster head UAV to the upper central UAV. Therefore, Equation (5) reflects the overall energy consumption of the whole UAV network when all data are transmitted to the central UAV.
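The energy model described above can be sketched roughly in Python. This is a minimal illustration that assumes the first-order radio model often used with these symbols: transmitting one bit costs the radio energy plus a distance-squared term below a crossover distance and a distance-to-the-fourth term above it, while receiving costs the radio energy only. The crossover distance `d0`, the function names, and the choice to treat a coalition's traffic as the sum of its members' rewards are assumptions for illustration, not the paper's exact equations.

```python
import math


def distance(p, q):
    """Euclidean distance between two 3-D positions (x, y, h)."""
    return math.dist(p, q)


def link_energy_per_bit(d, e0=50.0, eps_s=10.0, eps_l=10.0, d0=None):
    """Energy (nJ) to send and receive one bit over a link of length d (m).

    Assumed first-order radio model; d0 is a hypothetical crossover distance.
    """
    if d0 is None:
        d0 = math.sqrt(eps_s / eps_l)  # assumed default crossover distance
    amplifier = eps_s * d ** 2 if d < d0 else eps_l * d ** 4
    e_tx = e0 + amplifier  # transmit cost
    e_rx = e0              # receive cost
    return e_tx + e_rx


def network_energy(positions, rewards, coalitions, heads, g_cen, **radio):
    """Member-to-head plus head-to-central energy, weighted by collected traffic.

    positions: {uav: (x, y, h)}, rewards: {uav: traffic},
    coalitions: {m: [uav, ...]}, heads: {m: uav}, g_cen: central UAV position.
    """
    total = 0.0
    for m, members in coalitions.items():
        head = heads[m]
        # First term: each member forwards its own traffic to the cluster head.
        for n in members:
            if n != head:
                d = distance(positions[n], positions[head])
                total += rewards[n] * link_energy_per_bit(d, **radio)
        # Second term: the head uploads the coalition's traffic (assumed to be
        # the sum of member rewards) to the central UAV.
        coalition_traffic = sum(rewards[n] for n in members)
        d_up = distance(positions[head], g_cen)
        total += coalition_traffic * link_energy_per_bit(d_up, **radio)
    return total
```

Passing `d0` explicitly makes the short-range and long-range regimes easy to compare for a given link length.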
From the above formulation, it can be concluded that UAV deployment is affected by the importance distribution of ground data traffic, while the energy consumption of each path is limited by the communication distance between UAVs. When the battery cannot support long-distance UAV communication, the coverage deployment can be severely affected, which puts the proposed problem in a dilemma. It is therefore meaningful to characterize the relationship between coverage reward and energy consumption accurately. According to Equations (1) and (5), the whole UAV network's coverage utility is defined in Equation (6), where $\rho$ is the weight coefficient that sets the relative importance of reward and transmission energy consumption. Here, $\rho$ is calculated and measured from the parameters of the UAVs and the mission environment. The setting of $\rho$ therefore reflects the mission requirements and serves as a useful reference for weighing gain against cost. Mission-driven constraints can also be imposed on both quantities, such as a guaranteed minimal reward and a limit on maximal energy consumption.

The objective of this model is to maximize the whole UAV network's coverage utility by adjusting the UAVs' states; we refer to this problem as G. Solving G yields an efficient multi-UAV deployment and coalition transmission mechanism, which lets the UAVs achieve a better coverage reward without consuming much energy in transmitting and receiving data.
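To make the trade-off concrete, the following Python sketch computes a coverage reward and a coverage utility for a small grid. It assumes that a cell counts as covered when its center lies within the detection range of at least one UAV (measured on the ground plane), and that the utility is the reward minus the weight coefficient times the energy. Both forms, and all function names, are illustrative assumptions rather than the paper's Equations (1), (2) and (6).

```python
import math


def coverage_reward(uav_positions, density, cell_width, detect_range):
    """Sum the density of every cell covered by at least one UAV.

    uav_positions: iterable of (x, y, h); density: {(row, col): sigma}.
    A cell is covered if its centre is within detect_range of a UAV's
    ground projection (assumed coverage rule).
    """
    reward = 0.0
    for (row, col), sigma in density.items():
        cx = (col + 0.5) * cell_width
        cy = (row + 0.5) * cell_width
        covered = any(
            math.hypot(cx - x, cy - y) <= detect_range
            for x, y, _h in uav_positions
        )
        if covered:
            reward += sigma  # each cell is counted once, even if overlapped
    return reward


def coverage_utility(reward, energy, rho):
    """Assumed linear trade-off between reward and weighted energy."""
    return reward - rho * energy
```

Because overlapping detection ranges do not add reward in this sketch, placing two UAVs over the same cells gains nothing, which is one way to see why deployment and coalition choices interact.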

A Coalition Formation Game Approach for Coverage Utility Maximization
Notably, solving G is challenging because of the enormous number of strategies (positions and coalitions), so an effective method is required to simplify the process:
Step 1 (Reward optimization): Carry out coverage optimization deployment under the current coalition formation.
Step 2 (Coverage utility maximization): Under the fixed deployment of UAVs, apply the coalition selection method to maximize U1.
Step 3: Repeat the above steps until U1 is maximized.
This approach not only avoids the computational complexity caused by the joint strategy space, but can also work out the optimal solution of G.
In a distributed multi-agent control system, players interact with each other and make self-determined strategies. Motivated by this, the proposed efficient UAV deployment problem can be formulated as a game. According to Equation (6), the utility function consists of two terms, and we focus on achieving global optimization. Formally, the game is denoted by $\mathcal{G} = (\mathcal{N}, \{S_n\}_{n \in \mathcal{N}}, \{u_n\}_{n \in \mathcal{N}})$, where $\mathcal{N} = \{1, 2, \ldots, N\}$ is the set of UAVs and $S_n = G_n \otimes C_n$ ($\otimes$ denotes the Cartesian product) is the set of available states of UAV $n$; $G_n$ and $C_n = \{1, 2, \ldots, M\}$ are the sets of available positions and coalitions for UAV $n$, with elements $g_n$ and $c_n$, respectively. The utility function $u_n$ is expressed as $u_n(s_n, s_{-n})$, where $s_n \in S_n$ is the state selection of UAV $n$ and $s_{-n} \in S_1 \otimes S_2 \otimes \cdots \otimes S_{n-1} \otimes S_{n+1} \otimes \cdots \otimes S_N$ is the state profile of all UAVs other than $n$; $g_{-n}$ and $c_{-n}$ are defined analogously.
Definitions are given in the following to analyze the properties of the coalition formation game.
It should be noted that a coalition partition is a set $\Pi = \{CO_m\}_{m=1}^{M}$ that partitions $\mathcal{N}$, where the subsets $CO_m$ are called coalitions and are pairwise disjoint.
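The partition property can be checked mechanically. The short Python sketch below groups UAVs by their coalition selection and verifies that the resulting coalitions are non-empty, pairwise disjoint, and together cover the whole UAV set; the function names are hypothetical.

```python
def coalitions_from_selection(selection):
    """Group UAVs by selected coalition: {uav: m} -> {m: set of uavs}."""
    coalitions = {}
    for uav, m in selection.items():
        coalitions.setdefault(m, set()).add(uav)
    return coalitions


def is_partition(coalitions, uavs):
    """True if the coalitions are non-empty, pairwise disjoint, and cover uavs."""
    seen = set()
    for members in coalitions.values():
        if not members or seen & members:
            return False  # empty coalition or a UAV in two coalitions
        seen |= members
    return seen == set(uavs)
```

Coalitions built from a selection profile always form a partition, because each UAV selects exactly one coalition; the check is mainly useful for coalition structures constructed by other means.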
Clearly, $\mathcal{G}$ is a typical CFG. UAVs in $\mathcal{G}$ make decisions according to different preference relations and form different coalition partitions. Before coalition formation is analyzed, however, the UAV deployment under a given coalition partition $\Pi$ must be discussed.

Analysis of the Mission Reward Maximization
(1) Utility Function: As discussed before, each UAV prefers higher coverage performance and lower transmission energy consumption. Accordingly, we consider UAV $n$'s marginal coverage contribution and use it to measure UAV $n$'s coverage utility over mission area $I$. In Step 1, since coverage optimization deployment is carried out before coalition selection in each loop, the corresponding utility $u^*_n(g_n, g_{-n})$ is given by Equation (9).

(2) Analysis of $\mathcal{G}_1$: To depict the cooperative behaviours among UAVs well, a potential game is adopted, which suits distributed multi-agent systems and links each participant's local utility to the global utility [26]. Combined with the distributed, self-organizing characteristics of UAVs, this allows the whole system to make decisions that achieve more efficient performance. The definition of Nash equilibrium (NE), which characterizes the steady state of a noncooperative game, is given below.

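Definition 2 (Nash equilibrium (NE) [27]). A position selection profile $g^* = (g^*_1, \ldots, g^*_N)$ is a pure-strategy NE if and only if no player $n$ can improve its utility by unilaterally changing its state, i.e.,
$$u^*_n(g^*_n, g^*_{-n}) \ge u^*_n(g_n, g^*_{-n}), \quad \forall n \in \mathcal{N},\ \forall g_n \in G_n.$$

Definition 3 (Exact potential game (EPG) [28]). Consider the utility function $u^*_n(g_n, g_{-n})$ in a game $\mathcal{G}_1 = (\mathcal{N}, \{S_n\}_{n \in \mathcal{N}}, \{u^*_n\}_{n \in \mathcal{N}})$. If there exists a potential function $\phi : G_1 \otimes \cdots \otimes G_N \to \mathbb{R}$ such that, for any change of position strategy from $g_n$ to $g'_n$,
$$u^*_n(g'_n, g_{-n}) - u^*_n(g_n, g_{-n}) = \phi(g'_n, g_{-n}) - \phi(g_n, g_{-n}),$$
then the game is called an exact potential game and has at least one NE point.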

Lemma 1.
Given the current coalition partition $\Pi$, $\mathcal{G}_1$ is an EPG and has at least one NE point.
Proof. We construct the potential function as the global coverage utility of the UAV network, $U_I$. Assume that an arbitrary UAV $n$ changes its position from $g_n$ to $g'_n$, and consider the resulting change in the potential function. Intuitively, for each UAV $k \in \mathcal{N} \setminus \{n\}$, its individual coverage performance is completely unaffected by this change, as is the energy consumption of every coalition $CO_k$ with $k \ne c_n$. Thus, the last four terms of Equation (14) are all zero. Combining Equations (13) and (14) then shows that the change in UAV $n$'s utility equals the change in the potential function; the detailed derivation follows [7]. Therefore, according to Definition 3, $\mathcal{G}_1$ is an EPG with the coverage utility $U_I$ serving as the potential function, and it can be concluded that $P_1$ has at least one NE point. Note that the designed potential function is the overall coverage utility. Hence, the NE solution of $P_1$ is the pure-strategy NE point of $\mathcal{G}_1$, which is helpful for proving the stability of the proposed model $\mathcal{G}$.
Motivated by the learning algorithm design in [7], we introduce a coverage optimization deployment algorithm based on binary log-linear learning (CODA-BLL) (Algorithm 1) to execute coverage maximization deployment. The algorithm is based on binary log-linear learning, which has been proved to converge to a Nash equilibrium [7,29]. Given the current coalition selection profiles, the algorithm is executed once per loop to explore an optimal deployment for coverage utility. We design Equation (16) as the UAVs' position selection probability function, where $\beta$ is the learning parameter ($\beta > 0$).

Algorithm 1. One-time coverage optimization deployment algorithm based on binary log-linear learning (CODA-BLL).
Initialization: Input the UAVs' current state profiles $\{s_n\}_{n \in \mathcal{N}}$.
Step 1: Randomly select UAV $n$ and calculate its coverage performance $u^*_n(g_n, g_{-n})$ under the current UAV deployment state according to Equation (9).
Step 2: For the selected UAV $n$, choose an expected action $g'_n$ from $G_n$. The selected UAV $n$ computes its current coverage performance $u^*_n(g_n, g_{-n})$ and the expected utility $u^*_n(g'_n, g_{-n})$ according to Equation (6).
Step 3: UAV $n$ chooses a position according to Equation (16) and updates its state $g_n$.
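The steps of Algorithm 1 can be sketched as follows. This is a minimal Python illustration that assumes the standard binary log-linear learning rule, in which the trial position is adopted with probability exp(β·u_trial) / (exp(β·u_current) + exp(β·u_trial)); the paper's Equation (16) may differ in detail. The `utility` callback, the `candidates` mapping, and the function names are hypothetical.

```python
import math
import random


def bll_accept_probability(u_current, u_trial, beta):
    """Probability of adopting the trial position under binary log-linear learning."""
    # Numerically stable logistic form of
    # exp(beta*u_trial) / (exp(beta*u_current) + exp(beta*u_trial)).
    z = beta * (u_current - u_trial)
    if z >= 0:
        ez = math.exp(-z)
        return ez / (1.0 + ez)
    return 1.0 / (1.0 + math.exp(z))


def coda_bll_step(positions, candidates, utility, beta, rng=random):
    """One update: a random UAV compares its position with one random alternative.

    positions: {uav: position}, candidates: {uav: [position, ...]},
    utility(uav, positions) -> float (assumed user-supplied).
    """
    n = rng.choice(sorted(positions))          # Step 1: pick one UAV
    trial = rng.choice(candidates[n])          # Step 2: pick a trial position
    u_current = utility(n, positions)
    trial_positions = dict(positions)
    trial_positions[n] = trial
    u_trial = utility(n, trial_positions)
    # Step 3: move with the log-linear probability, otherwise stay put.
    if rng.random() < bll_accept_probability(u_current, u_trial, beta):
        return trial_positions
    return positions
```

A larger β makes the update closer to a deterministic best response, while a smaller β keeps more exploration; this is the usual role of the learning parameter in log-linear learning.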

Analysis of the Stable Coalition Partition
(1) Utility Function: In Step 2, under fixed deployment, UAV $n$'s individual energy consumption is expressed in terms of $g_{cl_{c_n}}$, the three-dimensional position of coalition $c_n$'s cluster head UAV, and $T_1(c_n)$, the energy consumed in carrying coalition $c_n$'s data traffic from the ground to the central controller.
Then, each UAV $n \in \mathcal{N}$ makes its coalition selection according to its individual coverage utility, which is given by Equation (18).

Definition 1 introduces the preference relation, which decides whether a player prefers to join or leave a coalition. Notably, the preference relation affects the convergence and the stable state of the final structure of the CFG [30]. Next, a common preference order called Pareto order is adopted to analyze the properties of the proposed game model.

Definition 4 (Pareto order).
In a CFG, the preference relation $\succeq_n$ satisfies Pareto order if, for each player $n \in \mathcal{N}$ and any two coalitions $CO, CO' \in \Pi$, player $n$ prefers one coalition over the other only when the change does not reduce the utility of any other player $i$ in the original or the new coalition. Under Pareto order, each player $n$ thus obeys the principle that its strategy must not damage the utility of other players. Hence, the profit of every coalition does not decrease. Because the profit that players can obtain is limited, Pareto order provides a strong basis for proving the existence of a stable coalition partition.
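The Pareto-order principle can be expressed as a small predicate. The Python sketch below assumes one common reading of the rule: a switch is acceptable when the moving UAV strictly gains and no other member of the original or the new coalition loses utility. The strict-gain requirement and the function name are assumptions, since the exact formula of the definition is not reproduced here.

```python
def pareto_prefers(n, old_members, new_members, utility_before, utility_after):
    """True if UAV n's coalition switch is acceptable under the assumed rule.

    old_members / new_members: UAVs in the original and the target coalition.
    utility_before / utility_after: {uav: utility} before and after the switch.
    """
    # The moving UAV must strictly improve (assumption).
    if utility_after[n] <= utility_before[n]:
        return False
    # Nobody else in either coalition may be worse off.
    affected = (set(old_members) | set(new_members)) - {n}
    return all(utility_after[i] >= utility_before[i] for i in affected)
```

For example, a move that raises UAV 1's utility while leaving UAVs 2 and 3 unchanged is accepted, whereas the same move is rejected if it lowers UAV 3's utility at all.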
The UAVs' coalition selection satisfies the definition of preference profiles, which also indicates that $\mathcal{G}_2$ is a typical CFG according to Definition 1. In a CFG, the most important part is typically the formation criterion, which determines the structure of the coalitions. Here, we consider the Pareto order [31], which bases preferences on the individual payoffs of the players rather than on the coalition value [22]. Once the preference order is determined, a coalition formation rule is required.
The authors of [32] introduced merge and split rules, which are used for forming or breaking coalitions. Both rules focus on the profits of all the players in the coalition. For example, any pair of coalitions $CO$ and $CO'$ can be merged into one coalition when all the players obtain an increased profit. Conversely, a coalition can be split into coalitions $CO$ and $CO'$ when each player achieves a better profit in its new coalition. Note that, according to Equation (18), the first term of each UAV $n$'s individual coverage utility function is fixed due to the stability of $\mathcal{G}_1$.
(2) Analysis of CFG: In this part, we analyze the stability of the proposed CFG through the given preferences and rules, and then solve the problem.

Definition 5 (Stable coalition partition [19]).
A partition $\Pi$ is said to be stable if no player can improve its utility by unilaterally changing its strategy, i.e., if
$$u_n(c_n, c_{-n}) \ge u_n(c'_n, c_{-n}), \quad \forall n \in \mathcal{N},\ \forall c_n, c'_n \in C_n,\ c_n \ne c'_n, \qquad (20)$$
then $\Pi$ is a stable coalition partition.
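The stability condition can be tested by brute force on small instances. The Python sketch below checks whether any single UAV could raise its own utility by unilaterally switching to another coalition; the `utility` callback and the function name are hypothetical, and the check ignores the Pareto-order restriction on which switches are permitted.

```python
def is_stable(selection, coalition_ids, utility):
    """True if no UAV can increase its own utility by switching coalition alone.

    selection: {uav: coalition id}; utility(uav, selection) -> float.
    """
    for n, current in selection.items():
        u_now = utility(n, selection)
        for m in coalition_ids:
            if m == current:
                continue
            trial = dict(selection)
            trial[n] = m  # unilateral deviation of UAV n
            if utility(n, trial) > u_now:
                return False
    return True
```

The cost grows with the number of UAVs times the number of coalitions, so this kind of exhaustive check is only practical as a sanity test on the output of a learning algorithm.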

Theorem 1.
With the preference relation of Pareto order, the CFG $\mathcal{G}_2$ converges to a stable coalition partition.

Proof.
(1) Note that, given the coalition selection profiles $\{c_n\}_{n \in \mathcal{N}}$ of the UAVs in Step 1, the reward maximization model $\mathcal{G}_1$ is proved by Lemma 1 to be an EPG with at least one NE point. According to Definition 2, there exists at least one position selection profile under which no UAV $n$ can improve its utility by changing its state under the current coalition selection.
(2) In the coverage utility maximization step, the utility function can be written as $u_n(c_n, c_{-n})$ since the UAVs' deployment is fixed. Pareto order is used for coalition selection, and each UAV $n$ takes $u_n(c_n, c_{-n})$ as its payoff. The sets of players (UAVs) and strategies ($\{C_n\}_{n \in \mathcal{N}}$) are finite, and under Pareto order a UAV improves the current coalition's utility without damaging any other UAV, so the profit of all coalitions finally converges to a peak. Denote by $CO^*$ the final coalition partition. If $CO^*$ is not stable, then there exists a UAV, say $n$, whose change of strategy improves $u_n(CO^*)$, which contradicts the preceding argument of limited profit. Finally, we conclude from Definition 5 that a stable coalition partition $\Pi$ exists in the CFG $\mathcal{G}_2$ with Pareto order.

Algorithm Design of CFG Model
Having established the existence of a stable state in the previous section, we next design an algorithm to find the stable state of $\mathcal{G}$. However, due to the diversity of the UAV set and the strategy sets (positions and coalitions), this kind of optimal selection approach often becomes trapped in a loop, that is, a local optimum.
In that case, learning algorithms need to be applied to explore the stable state of the game. Motivated by the exploration mechanism of a distributed learning algorithm called spatial adaptive play (SAP) in [27], we design a coalition selection and position deployment algorithm based on Pareto order (CSPDA-PO) that drives the proposed model to the stable coalition state. In each iteration of the algorithm, the chosen UAV updates its coalition selection and makes a comparative update under Pareto order, while all the other UAVs keep their current selection strategy. By Theorem 1, the stable solution of $P_1$ can be obtained. Algorithm 2 gives the procedure of CSPDA-PO. In Equation (16), $q_{g_n}(t)$ represents the position selection probability function of UAV $n$, and $\beta$ is the learning parameter ($\beta > 0$).
Combining the existence of an NE in $\mathcal{G}_1$ with the existence of a stable coalition partition in $\mathcal{G}_2$, the proposed method drives problem $P_1$ to a stable-state solution.

Algorithm 2. Coalition selection and position deployment algorithm based on Pareto order (CSPDA-PO).
Initialization: Set $j = 1$ and the position of the central UAV $g_{cen}$; initialize the UAVs' states $s_n = \{g_n, c_n\}$, $n \in \mathcal{N}$, and the mission area's state $\sigma(i)$, $i \in I$. In particular, each UAV $n \in \mathcal{N}$ chooses a different coalition from $C_n$.
Loop:
Step 1: All UAVs in the network exchange information (coverage deployment and coalition selection) with each other.
Step 2: Randomly select one UAV at each iteration, say $n$. Input the UAVs' current state profiles $\{s_n\}_{n \in \mathcal{N}}$ to Algorithm 1 and obtain the optimal coverage deployment $\{s^{opt}_n\}_{n \in \mathcal{N}}$ under the given coalition selections of the UAVs. Set $\{s_n\}_{n \in \mathcal{N}} \leftarrow \{s^{opt}_n\}_{n \in \mathcal{N}}$.
Step 3: All the other UAVs repeat their previous coalition selection, i.e., $c_k(j + 1) = c_k(j)$, $k \in \mathcal{N} \setminus \{n\}$. The chosen UAV $n$ changes its coalition selection to $c'_n \in C_n \setminus \{c_n\}$. Update $s_n = \{g^{opt}_n, c'_n\}$, input $\{s_n\}_{n \in \mathcal{N}}$ to Algorithm 1, and obtain the optimal coverage deployment $s^{opt}_n$ under the given coalition selection for UAV $n$. Set $\{s_n\}_{n \in \mathcal{N}} \leftarrow \{s^{opt}_n\}_{n \in \mathcal{N}}$.
Step 4: Set $CO = CO_{c_n}$ and $CO' = CO_{c'_n}$; then, UAV $n$ calculates its coverage utility $u_n(j)$ on the original and the expected coalition and updates its strategy according to Definition 4 (Pareto order), where the relationship between the utilities is explored using the probability formula of Equation (21), in which $\beta$ is the learning parameter ($\beta > 0$) and can be used to adjust convergence performance and speed.
Step 5: Set $j = j + 1$. If the stop criterion* is satisfied, output the UAVs' states $\{s_n\}_{n \in \mathcal{N}}$; otherwise, go to Step 1.
* The stop criterion is as follows: $q_{c_n}(j)$ exceeds a certain value such as 0.98, or $j$ reaches a predefined maximum number of iteration steps.
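The loop of Algorithm 2 can be sketched in simplified form. The Python sketch below keeps only the coalition selection part: the deployment step is reduced to an optional `deploy` hook standing in for Algorithm 1, the Pareto-order test is the assumed rule that no other member of the original or the new coalition may lose utility, and the acceptance probability is an assumed logistic rule in place of Equation (21). The stop criterion is a fixed iteration budget only. All names are hypothetical.

```python
import math
import random


def switch_probability(u_old, u_new, beta):
    """Logistic preference for the new coalition (assumed selection rule)."""
    z = beta * (u_old - u_new)
    return 1.0 / (1.0 + math.exp(min(z, 700.0)))  # clamp to avoid overflow


def cspda_po(selection, coalition_ids, utility, deploy=None,
             beta=50.0, max_iter=500, rng=random):
    """Simplified coalition selection loop.

    selection: {uav: coalition id}; utility(uav, selection) -> float;
    deploy(selection): optional hook standing in for Algorithm 1.
    """
    selection = dict(selection)
    uavs = sorted(selection)
    for _ in range(max_iter):
        if deploy is not None:
            deploy(selection)                    # coverage deployment step
        n = rng.choice(uavs)                     # randomly selected UAV
        options = [m for m in coalition_ids if m != selection[n]]
        if not options:
            continue
        trial = dict(selection)
        trial[n] = rng.choice(options)           # tentative coalition switch
        old_c, new_c = selection[n], trial[n]
        affected = [k for k in uavs
                    if k != n and selection[k] in (old_c, new_c)]
        # Pareto order (assumed form): no affected UAV may lose utility.
        harmless = all(utility(k, trial) >= utility(k, selection)
                       for k in affected)
        u_old, u_new = utility(n, selection), utility(n, trial)
        if harmless and rng.random() < switch_probability(u_old, u_new, beta):
            selection = trial
    return selection
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which helps when averaging results over independent trials.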

Simulation Results and Discussion
In this section, we carry out simulations to verify the convergence and effectiveness of the proposed approach. Following the simulation parameters in [25], we set $E_0 = 50$ nJ/bit, $\varepsilon_s = 10$ nJ/bit/m$^2$, and $\varepsilon_l = 10$ nJ/bit/m$^4$. The mission area is uniformly divided into 50 × 50 cells (each cell's width is 10 m). We consider a UAV network consisting of one central control UAV at $g_{cen} = (15, 15, 40)$ (×10 m) and a certain number of UAVs for data collection and transmission, whose flying altitudes are all set to 30 (×10 m). All UAVs' detection ranges are set to 50 m. The amount of data traffic per unit coverage reward is 1000 MB.

Diagram Form
Assume that prior information about the mission area is known. In particular, a probability density function (PDF) that follows the normal distribution is used to describe the importance (data traffic) in the mission area. Figure 2a shows a color-coded density map of the mission area, where the length of each mission cell is 10 m and $\sigma(i)$ is the normalized density of the overall data traffic. Figure 2b-d illustrate the deployment and coalition formation of 10 UAVs under different values of the weight coefficient $\rho$. The coloured circles represent the UAVs' detection ranges, that is, their coverage ability. These figures depict the formation of UAV coalitions and the way data are transmitted. Coalition member UAVs (black dots) send their data to the coalition head UAVs (green triangles), which upload the data to the central UAV (red star). It can be seen that, as $\rho$ increases, the UAVs are more inclined to form coalitions to minimize energy consumption, while the mission reward becomes lower. As Equation (18) shows, the UAVs' coverage utility is determined by mission reward and energy consumption. Taking the partial derivative of Equation (18) shows that $\rho$ determines the marginal utility of mission reward relative to energy consumption.
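A density map of this kind can be generated with a few lines of Python. The sketch below builds a normalized two-dimensional Gaussian importance map on the 50 × 50 grid with 10 m cells described above; the centre and spread of the Gaussian are hypothetical values chosen for illustration, since the parameters of the density in Figure 2a are not stated.

```python
import math

GRID = 50          # cells per side, from the simulation setup
CELL_WIDTH = 10.0  # cell width in metres, from the simulation setup


def gaussian_density(grid=GRID, cell_width=CELL_WIDTH,
                     centre=(250.0, 250.0), spread=80.0):
    """Return {(row, col): sigma} with the sigma values summing to 1.

    centre (m) and spread (m) are hypothetical parameters.
    """
    raw = {}
    for row in range(grid):
        for col in range(grid):
            cx = (col + 0.5) * cell_width   # cell centre, x
            cy = (row + 0.5) * cell_width   # cell centre, y
            r2 = (cx - centre[0]) ** 2 + (cy - centre[1]) ** 2
            raw[(row, col)] = math.exp(-r2 / (2.0 * spread ** 2))
    total = sum(raw.values())
    # Normalize so that the map behaves like a discrete density.
    return {cell: weight / total for cell, weight in raw.items()}
```

Because the map sums to one, any coverage reward computed from it can be read directly as the covered fraction of the total data traffic.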

Basic Performance
To better reflect the performance of the proposed approach in UAV coverage deployment and coalition formation, we introduce a coverage-reward-based deployment approach for comparison. In the comparison approach, the coverage utility depends only on the mission reward and there is no coalition formation mechanism. For a fair contrast, we assume that the coverage reward under the comparison approach converges to that of the proposed approach. Under this premise, we evaluate the overall energy consumption and coverage utility for the specific mission area shown in Figure 2a. As mentioned above, the weight coefficient $\rho$ needs to be calculated and measured from the parameters of the UAVs and the mission environment. For convenience, performance comparisons under different weight coefficients are provided as a reference. To reduce the effect of chance, both algorithms are run for 100 independent trials and the mean results are reported.

Figure 3a,b show the comparison results. In Figure 3a, as the weight coefficient $\rho$ increases from small values, the overall coverage reward $U_{\mathcal{N}}$ of the proposed approach decreases due to the increased importance of energy consumption. Meanwhile, the overall reward of the comparison approach tends to be the same as $U_{\mathcal{N}}$, as expected. In particular, in the late stage ($\rho = 3.5 \times 10^{-5}$ to $4.5 \times 10^{-5}$), UAVs are no longer keen to cover high-yielding areas since the importance of energy consumption is extremely high, which makes the UAVs settle into a stable deployment state.
In Figure 3b, the comparison shows that the proposed approach achieves a lower energy consumption $E_0$. This is because the coalition formation mechanism makes the energy consumption of the proposed approach much lower than that of the comparison approach. As the weight coefficient increases, $E_0$ decreases rapidly at first, which suggests that the whole UAV network has a stronger tendency to form coalitions when the weight coefficient increases. Another finding is that, as the weight coefficient $\rho$ increases further, energy consumption under the coalition formation mechanism tends to become steady due to the limited data traffic that the UAVs collect.

According to the above results and Equation (6), the comparison of coverage utility, which is an important indicator for the whole UAV network, is shown in Figure 4. The figure shows that the coverage utility under both approaches initially decreases as the weight coefficient increases, while the proposed approach achieves higher coverage utility under the same weight coefficient. This suggests that, without a coalition formation mechanism, the system obtains a fair reward but needs more energy, so that the coverage utility is lower. Likewise, in the late stage ($\rho = 3.5 \times 10^{-5}$ to $4.5 \times 10^{-5}$), the coverage utility of the proposed approach tends to be stable since the coverage reward and energy consumption are in a stable state.

Figure 5a,b show the changes in overall coverage reward and overall coverage utility for different numbers of UAVs, respectively. From the figures, we observe that the coverage performance improves as the number of UAVs increases, which is consistent with the fact that more UAVs are able to collect more data from the ground. However, once the number of UAVs reaches a certain value, the increase in coverage utility slows down and the utility eventually declines.
The main reason is that the coverage reward has an upper bound because the available data are limited. Meanwhile, in the coalition structure, more UAVs mean more transmission paths, and the high energy consumption of the additional transmission paths makes adding UAVs not worthwhile. Note that increasing the weight coefficient also makes the model focus more on energy consumption, so the coverage utility decreases as the weight coefficient increases and may even reach the "saturated" state earlier (e.g., eleven UAVs with $\rho = 1.3 \times 10^{-5}$ and twelve UAVs with $\rho = 1 \times 10^{-5}$). This explains the fluctuating shape of the performance curves.

Convergence Performance
To investigate convergence performance in more depth, we fix $\rho = 1 \times 10^{-5}$ for both approaches. Figure 6a shows the converged state of UAV deployment and coalition formation. The convergence performance of the different algorithms is shown in Figure 6b-d. We observe that all the indicators converge after a certain number of iterations, which indicates that the proposed method can reach a stable-state solution of the system model. The coverage reward of the comparison algorithm converges to the same value. However, since the comparison approach has no coalition formation mechanism, its energy consumption and coverage utility converge to a worse level, which is consistent with the model analysis. An open issue arises in complex environments, such as disaster scenarios, which require UAVs to make accurate and fast decisions on their own as scenario parameters change rapidly. The authors of [33] used reinforcement learning (RL) to solve the problem of anti-jamming communications in a dynamic jamming environment. Applying RL can significantly help UAVs adapt to dynamic environments and adjust their deployment and parameters accordingly. This inspires the next step of our work.
To summarize, the simulation results illustrate that the proposed cooperative multi-UAV deployment model based on coalition formation describes the UAVs' cooperative behaviors well. Moreover, the setting of the weight coefficient can provide a theoretical reference for different task requirements. Finally, the proposed approach achieves higher coverage utility than the comparison approach, reflecting the good performance of the coalition formation mechanism and the efficiency of the proposed model.

Conclusions
In this paper, we proposed an efficient cooperative UAV deployment model for the UAV-assisted sensor network. A multi-UAV coverage deployment and transmission mechanism was established, and the coverage utility and corresponding energy consumption under the network structure were studied on the basis of cooperation. The system model was transformed into a coalition formation game (CFG). The CFG with Pareto order was proved to have at least one stable coalition partition. Then, a coalition selection and position deployment algorithm based on Pareto order (CSPDA-PO) was designed to perform coalition selection and position deployment. Finally, simulations were performed to validate the effectiveness of the proposed approach, indicating that the proposed model describes the UAVs' cooperative behavior well and performs better than the comparison approach in UAV networks.

Conflicts of Interest:
The authors declare no conflict of interest.