1. Introduction
The rapid development of the Internet of Things (IoT) has brought severe challenges to the design of mobile wireless networks, chiefly the need to provide high data rates and extremely low latency [1]. However, terrestrial base stations may be destroyed by disasters, in which case they cannot support the communication demands of emergency scenarios.
Owing to their flexible maneuverability, easy deployment, low cost and small size, unmanned aerial vehicles (UAVs) have been used in many emergency communication scenarios caused by disasters [2,3]. As one of the supporting technologies of sixth-generation (6G) wireless communication systems, the deployment of aerial base stations is also an efficient way to enhance wireless communication services. As UAV costs fall, the scale of a UAV communication network can be significantly extended by introducing multiple UAVs to provide service [4]. Through flexible networking, UAVs can form an aerial swarm network that supplements the existing network architecture for wireless information transmission and allows the coverage area of the wireless network to be relocated rapidly.
As the number of users in 6G networks increases, it becomes very difficult to provide satisfactory service for all of them [5]. UAV communication networks will therefore have to face the challenges of dense deployment scenarios [6]. A typical scenario of ultra-dense UAV communications is illustrated in Figure 1, where ground base stations are either not deployed or unavailable in the area. This situation may arise when disasters destroy the base stations [7]. To satisfy the users’ communication demands, many UAVs carry communication devices to provide connection services to the ultra-dense users.
When dense communication links coexist, mutual interference is a problem that network operators must face. Achieving interference control in ultra-dense networks under restricted or complicated channel conditions remains an open research direction for academia and industry. Deploying each UAV at a suitable position can reduce the interference from users in other cells while maintaining good coverage for the users in its own cell.
In extreme cases, if the interference is too severe for the link to transmit information losslessly, the received information inevitably contains distortions. Conventional communication systems discard the error-corrupted information because they cannot make further use of a lossy recovery. Nonetheless, artificial intelligence (AI)-enabled 6G networks are able to exploit the useful information in lossy recoveries. In task-oriented communication scenarios such as IoT [8], a certain degree of distortion is acceptable provided that the final decision is still correct. Furthermore, 6G networks may intentionally perform lossy communications: semantic communications [9,10,11,12,13] emphasize that the reconstructed information should have the same “meaning” as the original information rather than the same “bit sequence”. Thus, lossy communications have a bright future in the 6G era.
Some research has already focused on UAV lossy communications [14,15,16]. The work in [14] proposes an optimization method for minimizing the age of information in UAV communications. The authors in [15] analyze the lossy communication performance of cooperative UAV networks. The work in [16] focuses on an adaptive communication protocol for transmitting critical video data with lossy compression. However, the scenarios in these studies contain a limited number of UAVs and users. Although [17] investigates data sharing in a large-scale UAV swarm in lossy communication environments, its objective is to exchange information reliably based on a consensus algorithm rather than to exploit the lossy information. Therefore, an ultra-dense UAV network adopting lossy communications remains to be investigated.
The key problem in optimizing an ultra-dense UAV network is the computational complexity of the algorithm. Especially in dynamic environments, UAVs face complex optimization constraints [18]. In ultra-dense networks, conventional optimization approaches must handle a huge number of parameters, which makes the computational cost prohibitive and makes it extremely hard to solve the optimization problem within a limited time. The mean field game (MFG) is a state-of-the-art tool that can significantly alleviate this curse of dimensionality. Mean field theory averages the effect of the environment on an object: the influences of the surrounding objects on the target are processed collectively, and the global average effect replaces the individual effects of a large number of objects. In an MFG, the game between a typical individual and all other individuals is thus simplified to a game between that individual and a mean field. This makes it possible to model and solve such a dense network in a distributed manner using game strategies.
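The averaging idea behind the mean field can be illustrated numerically. The sketch below is only a toy under stated assumptions: users are uniform on a unit square, and `gain` is a hypothetical distance-based gain that is not taken from this paper. It compares the exact sum over all users with the mean-field estimate, i.e., the number of users multiplied by the density-weighted average gain.

```python
import numpy as np

def gain(points, receiver):
    """Hypothetical distance-based gain 1 / (1 + squared distance); illustrative only."""
    d2 = np.sum((points - receiver) ** 2, axis=-1)
    return 1.0 / (1.0 + d2)

def exact_interference(users, receiver):
    """Sum the contribution of every individual user (cost grows with the number of users)."""
    return float(np.sum(gain(users, receiver)))

def mean_field_interference(n_users, receiver, grid=200):
    """Replace the individual terms by n_users times the density-weighted average gain.

    Users are assumed uniform on the unit square, so the density equals 1 and the
    average is a plain integral, evaluated here with the midpoint rule.
    """
    centers = (np.arange(grid) + 0.5) / grid
    xx, yy = np.meshgrid(centers, centers)
    cells = np.stack([xx, yy], axis=-1)
    return float(n_users * np.mean(gain(cells, receiver)))

rng = np.random.default_rng(0)
users = rng.random((100_000, 2))      # assumed user positions
receiver = np.array([0.3, 0.7])       # assumed receiver position
exact = exact_interference(users, receiver)
approx = mean_field_interference(len(users), receiver)
rel_err = abs(exact - approx) / exact
```

The mean-field estimate does not depend on the individual user positions, which is what removes the dependence of the optimization on the number of users.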
In recent years, MFG has gradually been applied to communication scenarios [19,20]. The authors in [21] discussed downlink interference management in dense UAV networks using MFG theory, modeling the interference control problem as an altitude control problem and using the MFG to obtain the optimal altitude control strategy. In [22], a joint channel access and power control optimization problem was solved by formulating a multiple MFG for large-scale UAV networks. In [23], the authors studied a prediction-based charging policy and interference mitigation approach in wireless powered IoT networks, modeling the interference mitigation problem as an MFG system in which the drone powers the sensors along an appropriate path. The authors in [24] combined MFG with multi-agent deep reinforcement learning for resource allocation in UAV-assisted multi-access edge computing networks.
In summary, the numbers of UAVs and users are quite limited in many of the scenarios studied so far, which hardly satisfies the needs of ultra-dense scenes. Although some works adopt the MFG in drone communications to deal with interference management and deployment, those systems are designed for lossless communications and therefore do not fit the AI-enabled scenario in 6G.
Motivated by the aforementioned facts, this paper considers a large number of UAVs that serve multiple users in lossy communication networks, and proposes a 3D distributed dynamic flight strategy. To minimize the mutual interference in the dense drone network, we study the system optimization problem of an ultra-dense network composed of drone base stations and a large number of users. The MFG framework is used to find the best 3D positions for the drones. The main idea of the proposed algorithm is to average the interference from the mass of all users as a mean field, which significantly reduces the computational complexity of solving the optimization problem.
The main contributions of this paper are summarized as follows:
An MFG framework for dynamic emergency communication networks: We propose an MFG framework for dynamic communication networks. The framework contains a small number of base stations and a large number of UAVs and users. Among them, the UAVs assist the base stations to provide services for the users.
Energy consumption and location problem formulation: With the help of the proposed MFG framework, we optimize the trajectory of each UAV and design the corresponding cost function. We formulate the problem as a cost minimization that accounts for the energy consumption and penalty terms.
Constructing a robust MFG: Considering the time-varying nature of the channel, a robust mean field framework is designed to solve the trajectory optimization problem.
Equilibrium solution of MFG: We obtain the equilibrium solution for ultra-dense uplink UAV lossy communications by alternately solving the Hamilton-Jacobi-Bellman (HJB) and Fokker-Planck-Kolmogorov (FPK) equations of the MFG.
The rest of the paper is organized as follows. In Section 2, we introduce the system model and the related assumptions, including the application scenarios, the channel model, the flight energy model and the cost function. In Section 3, the stochastic differential game and the MFG system framework are presented. In Section 4, the presented framework is derived and the iteration equations of UAV control and state are obtained. The simulation results and analyses are shown in Section 5. The conclusions are drawn in Section 6.
2. System Model
In this section, we introduce the system model of ultra-dense uplink UAV communications. Section 2.1 mathematically describes the basic scenario considered in this paper. Section 2.2 characterizes the channel model for UAV communications. Section 2.3 presents the cost function of each UAV based on the user interference and the flight energy consumption.
2.1. Basic Scenario
We consider the basic scenario illustrated in Figure 1, where the users stay in a large square region with a side length of R. To formulate the optimization problem, we make the following assumptions.
(1) The locations of the users are assumed to be independently and randomly distributed according to a Poisson point process (PPP) [25].
(2) We assume that the UAVs follow a uniform distribution in the horizontal plane, while all of the UAVs are at the same altitude. The density of users is denoted as , and hence, the number of users can be expressed as , where A is the area of the whole region. Moreover, the set of users is denoted as .
(3) We further assume that the positions of users are constant or changing slowly during the service time . Thus, the number of users in the responsible area of each UAV remains constant, and each UAV only serves the users within its responsible area.
In the considered scenario, N UAVs form a set and share the same time-frequency channel for receiving uplink data from the users assigned to them. The transmit power is denoted as , and the users request access continuously during the time interval. Under this circumstance, we have to consider the interference from other users when optimizing the UAV positions.
Figure 2 depicts the cell served by a UAV
. If we can guarantee the access quality of the farthest user, the access quality can also be satisfied for the other users. Without loss of generality, we assume that the user
j is the farthest user in the cell. Therefore, the user
j is located at the boundary of the optimal coverage area of the UAV
i. The UAV
i is located at the altitude
, and the radius of the coverage area is
. For simplicity, we neglect the altitude of the user and the antenna heights. Then, the distance between the UAV
i and the boundary of the coverage area is
, and the corresponding elevation angle is
. Furthermore, we denote the range of the UAV flight altitude and the maximum UAV speed as
and
, respectively.
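The geometry of a single cell follows from a right triangle whose legs are the UAV altitude and the coverage radius, since user altitude and antenna heights are neglected. The sketch below illustrates this relation; the function names and the numerical values are hypothetical and chosen only for illustration.

```python
import math

def link_geometry(altitude_m, radius_m):
    """Return (distance to the cell boundary in m, elevation angle in degrees).

    Assumes the user is on the ground (user altitude and antenna heights neglected).
    """
    distance = math.hypot(altitude_m, radius_m)
    elevation_deg = math.degrees(math.atan2(altitude_m, radius_m))
    return distance, elevation_deg

def coverage_radius(altitude_m, elevation_deg):
    """Coverage radius for which the boundary user sees the given elevation angle."""
    return altitude_m / math.tan(math.radians(elevation_deg))

# Illustrative values only: a UAV at 300 m serving a cell of radius 400 m.
distance, elevation = link_geometry(300.0, 400.0)
radius_at_45 = coverage_radius(1000.0, 45.0)
```

For a fixed elevation angle at the boundary, the coverage radius scales linearly with the altitude, which is why altitude control acts directly on the size of the cell.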
2.2. Channel Model
In this paper, the air-to-ground (A2G) channel is assumed to follow the probabilistic line-of-sight (LoS) model, in which each channel between a UAV and a user is either LoS or non-line-of-sight (NLoS), with occurrence probabilities determined by the elevation angle of the transmission link [26]. Moreover, the multiple reflected signals that cause multi-path fading [27] are also taken into consideration.
The path-loss between the UAV
i and a user
k for LoS and NLoS links is denoted as
and
, respectively. The path-loss in dB can be expressed by [
26]
where
is the distance between the UAV
i and the user
k.
and
represent the excess loss added to the free-space propagation loss for LoS and NLoS links, respectively, which depends on the environmental conditions.
represents the carrier frequency, and
c stands for the speed of light. Moreover, the probability of LoS links is given by [
27]
where
and
are the constants which depend on the environment, the density and height of buildings, and the elevation angle.
is the elevation angle from the user
k to the UAV
i. Hence, the probability of NLoS links is
. Then, the path-loss function can be represented as
By combining (
1)–(
4), the path-loss function can be rewritten as
which is a function of
and
. Specifically, for the user located at the boundary of the coverage area, the optimal angle
equals
,
,
, and
for the suburban, urban, dense urban and high-rise urban environments, respectively [
28]. Therefore, if the value of the path-loss is given, the optimal UAV altitude and coverage radius pair
can be obtained, and vice versa. Since MFG averages the mass of users’ interference, including the small-scale fading, this paper simplifies the small-scale fading by utilizing MFG.
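As a hedged illustration of the channel model, the sketch below implements the form in which the probabilistic LoS model is commonly written in the literature: a free-space term plus an excess loss averaged over the LoS and NLoS probabilities, with a sigmoid LoS probability in the elevation angle. Whether this matches the paper's equations term by term is an assumption. The function names are hypothetical, and the urban parameter values used in the example are typical literature values, not values confirmed by this paper.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s

def p_los(theta_deg, a, b):
    """LoS probability as a sigmoid of the elevation angle (degrees)."""
    return 1.0 / (1.0 + a * math.exp(-b * (theta_deg - a)))

def mean_path_loss_db(altitude_m, radius_m, fc_hz, a, b, eta_los_db, eta_nlos_db):
    """Average path-loss in dB between a UAV and a ground user at horizontal distance radius_m."""
    distance = math.hypot(altitude_m, radius_m)
    theta_deg = math.degrees(math.atan2(altitude_m, radius_m))
    free_space_db = 20.0 * math.log10(4.0 * math.pi * fc_hz * distance / C_LIGHT)
    p = p_los(theta_deg, a, b)
    # Excess loss weighted by the LoS and NLoS occurrence probabilities.
    return free_space_db + p * eta_los_db + (1.0 - p) * eta_nlos_db

# Assumed urban-type parameters (illustrative): a, b, eta_LoS, eta_NLoS.
URBAN = dict(a=9.61, b=0.16, eta_los_db=1.0, eta_nlos_db=20.0)
loss_db = mean_path_loss_db(1000.0, 1000.0, 2e9, **URBAN)
```

Because the average loss depends only on the altitude and the radius, fixing a target path-loss value determines a curve of feasible altitude and radius pairs, in line with the remark above.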
2.3. Cost Function
To achieve higher energy efficiency while guaranteeing successful access for each user, each UAV should adjust its position based jointly on the user interference and its flight energy consumption.
The quality of user access is characterized by the distortion of the recovered information. According to the rate-distortion theory [
29], the minimum rate for a Bern(0.5) source to satisfy the distortion requirement
for the
i-th UAV is given by
where
denotes the binary entropy function.
Based on Shannon’s lossy source-channel separation theorem, the rate is constrained by the signal-to-interference-plus-noise ratio (SINR) of the received signals as
where
is the end-to-end coding rate,
represents the SINR of the received signals, and
denotes the channel capacity with two-dimensional signalling. Therefore, the required SINR threshold
can be obtained as
with
denoting the inverse function of
. For simplicity, this paper considers the case that all distortion requirements are the same, and hence, the SINR thresholds are also the same for all links, i.e.,
.
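The chain from the distortion requirement to the SINR threshold can be sketched as follows. The rate-distortion function of a Bern(0.5) source under Hamming distortion is one minus the binary entropy of the distortion. The remaining steps are assumptions made for illustration: the separation constraint is taken as the coding rate times the source rate being no larger than the capacity, and the capacity with two-dimensional signalling is taken as log2(1 + SINR). All function names are hypothetical.

```python
import math

def binary_entropy(p):
    """Binary entropy function in bits, with the convention H_b(0) = H_b(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def min_source_rate(distortion):
    """R(D) = 1 - H_b(D) for a Bern(0.5) source with Hamming distortion, 0 <= D <= 0.5."""
    if not 0.0 <= distortion <= 0.5:
        raise ValueError("distortion must lie in [0, 0.5]")
    return 1.0 - binary_entropy(distortion)

def sinr_threshold(distortion, coding_rate):
    """Smallest linear SINR satisfying coding_rate * R(D) <= log2(1 + SINR) (assumed form)."""
    return 2.0 ** (coding_rate * min_source_rate(distortion)) - 1.0

# Illustrative values: a looser distortion requirement lowers the required SINR.
strict = sinr_threshold(0.01, 1.0)
relaxed = sinr_threshold(0.10, 1.0)
```

Under these assumptions, tolerating more distortion directly lowers the SINR that each link must reach, which is the margin that lossy communications exploit.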
At the moment
t, the external interference
for the UAV
i is caused by those users assigned to other UAVs, which can be formulated as
where
stands for the set of internal users served by the UAV
i.
denotes the complement set of
in
.
represents the geometric gain, which is given by
For the internal user
j located at the boundary of the coverage area, the internal interference
from the other users in the responsible area of the UAV
i is
Consequently, the SINR from the farthest internal user to the UAV
i is readily given by
where
is the power of the Gaussian white noise.
is the geometric gain from the boundary of the coverage area to the UAV
i.
and
are the weights distinguishing the internal and external interference, respectively.
By minimizing the gap between the required SINR threshold and the SINR of the signal received from the farthest internal user, the energy efficiency is optimized while guaranteeing the quality of user access.
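A minimal sketch of the SINR of the farthest internal user is given below. It assumes that all users transmit with the same power, that interference is the transmit power times the sum of the geometric gains of the interfering users, and that the weighted internal and external interference add to the noise in the denominator. The exact expression in the paper may differ, and all names and numbers are hypothetical.

```python
def aggregate_interference(tx_power, gains):
    """Interference power from a set of users with the given linear geometric gains."""
    return tx_power * sum(gains)

def farthest_user_sinr(tx_power, boundary_gain, internal_interf, external_interf,
                       noise_power, w_int=1.0, w_ext=1.0):
    """Linear SINR of the boundary user; w_int and w_ext weight the two interference terms."""
    signal = tx_power * boundary_gain
    return signal / (w_int * internal_interf + w_ext * external_interf + noise_power)

# Illustrative numbers only.
external = aggregate_interference(0.1, [1e-10, 2e-10, 5e-11])
sinr = farthest_user_sinr(tx_power=0.1, boundary_gain=1e-8,
                          internal_interf=0.0, external_interf=external,
                          noise_power=1e-10)
```

Moving a UAV changes both its boundary gain and the gains of users in other cells toward it, which is how the position control couples the numerator and the denominator.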
Meanwhile, it is necessary to consider the flight energy consumption when planning the optimal UAV trajectory. The accumulative flight energy consumption of the UAV
i at time
t is given by [
30]
where
stands for the flight distances.
,
, and
represent the rotor induced speed when hovering, the speed of the blade tip, and the flight speed of the UAV, respectively.
and
are the rotor solidity and the air density.
and
represent the fuselage drag ratio and the total area of rotary wing, respectively. The constant parameters
and
denote the induced power and the blade profile power, respectively. For simplicity, we assume that each time unit is short enough, and hence, the acceleration of UAV is neglected, which means
is a constant.
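The quantities listed above match the widely used rotary-wing propulsion power model, which combines a blade profile term, an induced term and a parasite term. The sketch below implements that standard model and accumulates energy over a flight distance at constant speed. Whether it coincides exactly with the paper's expression is an assumption, and the default parameter values are illustrative literature values rather than the settings of this paper.

```python
import math

def propulsion_power(v, p0=79.86, p_ind=88.63, u_tip=120.0, v0=4.03,
                     d0=0.6, rho=1.225, s=0.05, rotor_area=0.503):
    """Propulsion power in W at flight speed v (m/s) for a rotary-wing UAV.

    p0: blade profile power, p_ind: induced power (both in hover),
    u_tip: blade tip speed, v0: rotor induced speed in hover,
    d0: fuselage drag ratio, rho: air density, s: rotor solidity,
    rotor_area: total rotor area. All defaults are assumed values.
    """
    blade = p0 * (1.0 + 3.0 * v ** 2 / u_tip ** 2)
    induced = p_ind * math.sqrt(
        math.sqrt(1.0 + v ** 4 / (4.0 * v0 ** 4)) - v ** 2 / (2.0 * v0 ** 2))
    parasite = 0.5 * d0 * rho * s * rotor_area * v ** 3
    return blade + induced + parasite

def flight_energy(distance_m, v):
    """Energy in J to cover distance_m at constant speed v (acceleration neglected)."""
    if v <= 0.0:
        raise ValueError("speed must be positive; use propulsion_power(0.0) for hovering")
    return propulsion_power(v) * distance_m / v

hover_power = propulsion_power(0.0)
cruise_power = propulsion_power(10.0)
energy_100m = flight_energy(100.0, 10.0)
```

In this model hovering is not the cheapest state: moderate forward speed reduces the induced power, so the power first decreases with speed before the parasite term dominates.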
For enhancing the energy efficiency, the optimization problem can be formulated as
The challenge for solving
is that the received power at the target UAV and the interference caused to the other UAVs are coupled: increasing the received power and SINR of one UAV reduces the SINR of the other UAVs. To solve the problem,
is transformed to a problem of minimizing the cost of the system, i.e.,
where
and
are the weight coefficients for the quality of user access and the energy consumption, respectively.
Moreover, two penalty functions are formulated to balance the locations between the UAV and the mass consisting of all users. The first penalty function restrains the overlap and the energy consumption caused by the height variation, and is expressed as
where
is the coverage area of UAV
i at time
t. Clearly,
. The other penalty function accounts for the horizontal movement of the UAV. In order to satisfy the SINR requirement
, the UAV
i will try to increase the distance from the mass. Therefore, this penalty term
is utilized to restrain the drop-out of the mass, where
represents the distance between the UAV
i and the mass at time
t. Finally, the cost function of the UAV
i is given by
where
and
are the coefficients of the two penalty terms. The optimization problem is rewritten as
4. Energy-Efficient Flight Strategy
In this section, we obtain the energy-efficient flight strategy by solving the HJB and FPK Equations (
33) and (
35). These two equations are mutually coupled, and the MFE can be reached by solving them iteratively with the finite difference method [
33].
In this finite difference framework, the time space
and the 3D vector space representing the location space, including 2D vector space in the horizontal direction
,
and vector space in the vertical direction
, respectively, are discretized into
spaces. Then, we aim to find the optimal control policy in this four-dimensional discrete vector space including the time space and the location space. Hence, we define
which represent the iteration steps of the time, the transverse vector, the longitudinal vector, and the vertical vector, respectively.
Then, we use the Lax–Friedrichs scheme to solve the FPK equation in (
35). Let
n denote the time index,
j,
k and
l denote location coordinate indices in the discretized grid. Therefore, we have the iterative equation of mean field term as
where
,
, and
are given by
,
,
, and
denote the value of the mean field, the transverse control, the longitudinal control and the vertical control, respectively.
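To make the Lax–Friedrichs update concrete, the sketch below applies it to a one-dimensional, drift-only transport equation for the mean field, which is a deliberate simplification of the three-dimensional FPK equation used in this paper. Periodic boundaries are assumed for brevity, which differs from the boundary treatment described for Algorithm 1. The function name and all numerical values are hypothetical.

```python
import numpy as np

def lax_friedrichs_step(m, u, dt, dx):
    """One Lax-Friedrichs step for dm/dt + d(u*m)/dx = 0 with periodic boundaries.

    m: mean field values on the grid, u: control (drift) values on the same grid.
    """
    flux = u * m
    neighbour_avg = 0.5 * (np.roll(m, -1) + np.roll(m, 1))   # (m[j+1] + m[j-1]) / 2
    flux_diff = np.roll(flux, -1) - np.roll(flux, 1)         # f[j+1] - f[j-1]
    return neighbour_avg - dt / (2.0 * dx) * flux_diff

# Illustrative setup: a Gaussian bump transported to the right at constant speed.
dx, dt = 0.01, 0.01                     # |u| * dt / dx = 0.5 satisfies the CFL condition
x = np.arange(100) * dx
m = np.exp(-((x - 0.3) ** 2) / (2 * 0.05 ** 2))
m /= m.sum() * dx                       # normalize to unit mass
u = np.full_like(m, 0.5)
for _ in range(50):
    m = lax_friedrichs_step(m, u, dt, dx)
total_mass = m.sum() * dx
```

The neighbour averaging is what stabilizes the central flux difference. It also introduces numerical diffusion, so the grid should be fine enough for the resolution that the mean field requires.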
In order to solve the HJB equation, we have to consider the constraints of the forward equation and mean field. Therefore, we use the Lagrange relaxation to solve the HJB equation. The Lagrangian
is defined as
where
is the Lagrange multiplier. Here,
is defined as
We solve (
41) by using the finite difference method. Similar to the previous method of solving the FPK equation, we discretize the Lagrangian as
where
and
represent the value of the cost function and the Lagrange multiplier at time
n and location
on the discretized grid, respectively. Here,
is given by
In this model, the optimal decision variables include
,
and
. To begin with, we update the value of the Lagrange multiplier by calculating
. Therefore, we can obtain the iterative equation of variables
as
where
are arbitrary on this discretized grid. Then, we update the value of the control by calculating
,
and
, respectively. Therefore, the iterative equation of control
can be expressed as follows:
Similarly, we can obtain the iterative equations of
y and
z as
Finally, the MFE is solved by (
37) and (
46)–(
48) iteratively until they converge. The specific iteration step is displayed in Algorithm 1.
Algorithm 1 Obtaining the MFE
1: Initialization:
2:   initialize the mean-field distribution;
3:   initialize the Lagrangian parameters;
4:   initialize the control.
5: Repeat until the system obtains the MFE:
6:   Compute update for the mean field m:
7:   for all grid indices n, j, k, and l do
8:     Update the mean field using (37).
9:   end for
10:  Compute update for the Lagrangian parameters:
11:  for all grid indices n, j, k, and l do
12:    Update the Lagrange multiplier using (45).
13:  end for
14:  Compute update for the control u:
15:  for all grid indices n, j, k, and l do
16:    Update the transverse control using (46).
17:  end for
18:  for all grid indices n, j, k, and l do
19:    Update the longitudinal control using (47).
20:  end for
21:  for all grid indices n, j, k, and l do
22:    Update the vertical control using (48).
23:  end for
In Algorithm 1, we solve the FPK equation by iterating the mean field term
m. During the iteration, if
j equals 1 or
X, we assume
, and the term
in (
38) can be expressed as
.
k,
l are similar to
j. On the other hand, the HJB equation can be solved by reversely iterating the Lagrange multiplier
and the control
u. The iterations terminate when a convergence point is reached or when the maximum number of iteration steps is exceeded. We assume that the coordinates of the UAV in the area are positive. The values of control and the mean field term are positive for any time
n and any state
. Therefore, the reformulated problem with the constraints mentioned above is a convex optimization problem. Meanwhile, the conditions of this algorithm (iterative equations) satisfy the necessary and sufficient conditions of the convex optimization problem. In other words, this convergence point is the MFE with the cost function
.
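The control flow of Algorithm 1 can be summarized by the structural skeleton below. It is only a sketch: the three update callables are hypothetical placeholders standing in for the mean field, Lagrange multiplier and control updates, and the toy updates in the example are simple contractions chosen to show the stopping rule, not the update equations of this paper.

```python
import numpy as np

def solve_mfe(m0, lam0, u0, update_m, update_lam, update_u, tol=1e-6, max_iter=500):
    """Alternate the three updates until the largest change falls below tol.

    Returns (m, lam, u, iterations, converged). The callables are placeholders
    for the forward mean field update and the multiplier and control updates.
    """
    m, lam, u = m0, lam0, u0
    for iteration in range(1, max_iter + 1):
        m_new = update_m(m, u)                 # forward step (FPK side)
        lam_new = update_lam(lam, m_new, u)    # multiplier step (HJB side)
        u_new = update_u(u, m_new, lam_new)    # control step
        change = max(np.max(np.abs(m_new - m)),
                     np.max(np.abs(lam_new - lam)),
                     np.max(np.abs(u_new - u)))
        m, lam, u = m_new, lam_new, u_new
        if change < tol:
            return m, lam, u, iteration, True
    return m, lam, u, max_iter, False

# Toy contractions with fixed point m = 0.5, lam = 0, u = 0.5 (illustrative only).
m, lam, u, n_iter, converged = solve_mfe(
    np.zeros(4), np.ones(4), np.zeros(4),
    update_m=lambda m, u: 0.5 * m + 0.5 * u,
    update_lam=lambda lam, m, u: 0.5 * lam,
    update_u=lambda u, m, lam: 0.5 * u + 0.25,
)
```

Both exit paths correspond to the termination rule stated above: convergence of the iterates, or reaching the iteration limit without convergence.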
5. Numerical Results
In this section, we evaluate the system performance with the main simulation parameter settings listed in
Table 1. We assume the coordinate origin of the UAV location is located in the lower left corner of the desired area, so that the state of the UAV is positive. The side length of the large square geographical region (desired area) is set as
km, which means the maximum value of the horizontal area
km. In this model, the minimum and the maximum altitudes of UAVs are
km and
km, respectively, which ensures the complete coverage of users in each small area.
At the initial time, 100 UAVs are hovering at initial positions. The initial distributions of the UAVs and users are shown in
Figure 3. The red circles represent the UAVs and the blue stars represent the users. In the horizontal direction, these 100 UAVs populate the desired area and each UAV has its responsible area, which is the small rectangle in
Figure 3. The positions and number of users in each small rectangle are random. Meanwhile, all UAVs start at the same altitude
. We also illustrate the user and UAV distributions in
Figure 4, where we assume that the initial altitude of users ranges from 0 to 5 m.
To illustrate the evolution of the mean field state under the control obtained by Algorithm 1 over a predefined period of time
, we provide the distribution of the mass of UAVs at different times, as shown in
Figure 5. Here,
T is set as 15 s.
Figure 5 shows the changes in the distribution of the UAV at three times, namely,
t = 5 s, 10 s, and 15 s. In
Figure 5a–c, we show the 2D distribution of the UAVs (red circles) and users (blue stars) at those three moments. Meanwhile, we show the position changes of 100 UAVs in 3D space, which can be seen from
Figure 5d–f. Compared with the initial time
, the 100 UAVs find the optimal locations to ensure users’ access. The users’ access status is presented in
Figure 6.
To verify the users’ access situation under these flight conditions, we show the access situation of all users in the predefined period of time
in
Figure 6. The black curve shows the ratio of successful access over all users under our proposed algorithm. At the initial time, this ratio is low because the UAVs start at a low altitude and at fixed initial positions, which corresponds to a small coverage area and poorly matched positions. All UAVs then adjust their positions using the 3D optimal control, and the ratio of successful access rises until all users can access. The ratio reaches 1 at t = 14, which means that the users’ access target is satisfied once the system reaches equilibrium. For comparison, we consider a benchmark in which all UAVs follow the same flight mode (rising at a constant speed), as shown in
Figure 6. It can be observed that the ratio of user’s access is still at
and the growth is slow until the last time
t = 15. Clearly, there is an obvious gap between the proposed algorithm and the benchmark scheme for the ratio of successful access. This is because the solution derived from the HJB equation follows the Bellman principle of optimality, which selects the optimal actions for system control.
To further illustrate the user’s access rate, we show the average SINR from time 0 to
T, as shown in
Figure 7. The red straight line represents the SINR threshold. At the initial time, the average SINR is low because only a few users satisfy the access requirement. The average SINR reaches the preset threshold after
t = 6. Corresponding to
Figure 6, at
t = 6, the users’ access rate reaches 80%, which means that 20% of the users still have an SINR below the threshold, so the average SINR continues to rise. Moreover, the MFE is achieved once the optimal positions are reached, after which the average SINR remains essentially unchanged. In addition, the flight energy consumption, another important term of the cost function, is shown in
Figure 8.
In
Figure 8, we present the average flight energy consumption of the 100 UAVs. At the initial time, all UAVs hover at the same altitude (1000 m), and the initial (hovering) energy consumption is 60 J/s. Each UAV then flies under the optimal control, which leads to a higher average flight energy consumption at the beginning in order to meet the SINR requirements. Subsequently, the average energy consumption decreases because the UAVs are closer to their target positions, where more users reach the SINR threshold. The equilibrium emerges when all users reach the SINR threshold. At that time, each UAV keeps hovering, and the flight energy consumption returns to 60 J/s. The red straight line represents the average energy consumption of the 100 UAVs when rising at a constant speed, which is constant over time. Combining with
Figure 6, when the proposed algorithm reaches equilibrium, the total energy consumed by all UAVs is close to that consumed when all UAVs rise at a constant speed over the same duration, but the users’ access rate is higher.