1. Introduction
In recent years, with the popularity of smart devices and the continuous expansion of the Internet of Things, the volume of data has grown exponentially [1]. The analysis and processing of these data have enabled the rapid development of intelligent applications (face recognition, intelligent driving, voice recognition), but have also raised concerns about data security and privacy. In addition, as people pay more attention to privacy, more and more entities emphasize the ownership and usage rights of their data, which reduces data exchange between entities and gradually turns each entity into an “isolated data island” [2]. To solve these problems, make better use of the value of data and improve the performance of artificial intelligence algorithms, scholars have put forward a new paradigm, Federated Learning (FL), which can break down “isolated data islands” while protecting privacy. Federated learning was first proposed by H. Brendan McMahan et al. [3]; it coordinates joint learning among all parties without exposing any party’s data. Compared with traditional centralized learning, it not only improves the utilization of wireless resources but also protects user privacy. In the FL process, participants do not need to upload their raw local data; they only upload the parameters of the neural network model trained on those data. The master server then securely aggregates the model parameters and feeds the result back to the participants, who update the global model using their own datasets. This effectively ensures the security and privacy of each participant’s sensitive data. Although FL can effectively address the “isolated data island” and privacy problems, it also faces two major challenges: the participant problem and the communication problem. Regarding participants, the more participants join the FL process, the greater the data diversity and the stronger the learning ability of the trained model. Regarding communication, the model parameters need to be exchanged frequently among all parties, and the central party often has to wait for all users’ trained models to be uploaded before it can conduct secure aggregation or other data processing. If, for example, a user drops out or a communication link is blocked, the central party may wait indefinitely, which seriously degrades the efficiency of model aggregation and of FL as a whole.
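The aggregation step described above can be illustrated with a minimal sketch. It assumes plain FedAvg-style weighted averaging, in which each participant's parameters are weighted by its local dataset size; the function name `fedavg_aggregate`, the toy parameter vectors and the dataset sizes are hypothetical, and the secure aggregation protocol mentioned in the text is not modelled here.

```python
import numpy as np


def fedavg_aggregate(client_params, client_sizes):
    """Return the dataset-size-weighted average of client parameter vectors.

    client_params: list of 1-D arrays, one flattened model per participant.
    client_sizes:  number of local training samples held by each participant.
    """
    sizes = np.asarray(client_sizes, dtype=float)
    weights = sizes / sizes.sum()          # aggregation weights sum to 1
    stacked = np.stack(client_params)      # shape: (num_clients, num_params)
    return weights @ stacked               # shape: (num_params,)


# Hypothetical toy round: three participants upload two-parameter models.
uploaded = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
local_sizes = [100, 100, 200]

global_params = fedavg_aggregate(uploaded, local_sizes)
print(global_params)
```

Only the parameter vectors and sample counts reach the server in this sketch; the raw local data never leave the participants, which is the property the paragraph above relies on.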
Therefore, current FL research focuses on how to establish sound incentive and selection mechanisms that encourage more participants to join and select participants with high-quality data, how to ensure FL efficiency, and how to improve the quality and capacity of communication channels. On the one hand, the participant problems of FL over wireless networks have been studied in [4,5,6,7,8,9]. In [4], Nishio T et al. investigated user scheduling algorithms that maximize the number of participating users in each global round to improve FL test performance. Motivated by [4], Yoshida N et al. [5] proposed a client scheduling scheme based on multi-armed bandit (MAB) theory, which balances the selection of clients with uncertain resources against that of clients known to have large resources. In addition, Xu B et al. [6] proposed an online client scheduling scheme based on a greedy algorithm, which reduces the number of training rounds and the time interval of each global round. Based on age of update (AOU) measurements, Yang H. H et al. [7] proposed a scheduling strategy that jointly considers the staleness of received parameters and the instantaneous channel quality to improve FL operation efficiency. In addition, Ren J et al. [8] investigated the gradient averaging effect over participating clients in each global round, and proposed a scheduling strategy that selects high-quality clients by taking into account the diversity of channels and the “importance” of clients’ updates. An FL task-training model based on contract theory was proposed in [9], which minimizes the incentive budget subject to the individual rationality (IR) and incentive compatibility (IC) of users in each FL training round. However, the works in [4,5,6,7,8,9] only addressed how to select and motivate users and did not consider optimizing the clients’ energy consumption. When too many clients participate, because the amount of wireless resources is fixed, the clients’ energy consumption rises sharply, which is unfavorable to clients with limited energy storage. Moreover, if users suddenly quit the learning process, learning performance is also affected. On the other hand, the communication problems of FL over wireless networks have been investigated in [
10,11,12,13]. In [10], Yang Z et al. studied computation and communication resource allocation for FL over wireless networks and proposed an iterative algorithm that, subject to the constraints, derives closed-form solutions for the transmission time, transmission bandwidth, transmission power, computation frequency and local accuracy. Luo S et al. [11] introduced a Hierarchical Federated Edge Learning (HFEL) framework in which model aggregation is partially migrated from the cloud to the edge server, and formulated a joint wireless resource allocation and user-edge association problem to minimize energy and latency. Zeng Q et al. [12] proposed a new computation and communication resource framework, which uses multiple processors on each client to process data simultaneously and jointly controls bandwidth allocation, CPU-GPU workload allocation, the computation frequency of processors and the transmission time. In addition, Ruby R et al. [13] extended the study in [12] by solving the resource allocation problem in the HFEL structure within a given time budget. However, the authors of [10,11,12,13] were only concerned with optimizing client energy and did not design a suitable client scheduling scheme. When the number of users reaches a critical value, the data owned by these users may already contain all the data features of the overall dataset, so FL performance does not improve significantly as the number of users increases further. In addition, some users’ data may be of very poor quality, and the participation of such users degrades FL performance.
In summary, the proposals in [4,5,6,7,8,9,10,11,12,13] considered either the participant problem or the communication problem in FL in isolation, without jointly considering the effects of user selection and terminal energy consumption. To address both aspects, Thi Le et al. [14] put forward an auction game as an incentive mechanism between the base station (BS) and users, in which each user submits bids based on the minimal energy consumption needed to participate in FL, and introduced a primal–dual greedy auction mechanism to determine the winners and maximize social welfare. For the case in which uplink transmissions may be erroneous, Chen M et al. [15] formulated an optimization problem that minimizes the FL loss function and proposed a joint wireless resource allocation and user selection scheme. Under the HFEL framework, Wen W et al. [16] introduced a user selection and resource allocation strategy that takes into account the uncertainty of wireless channels and the importance of the weighted gradient. Moreover, for both independent and identically distributed (IID) and non-IID user data, Liu S et al. [17] studied the optimization of user association and resource allocation. In [18], Shi W et al. proposed a joint user selection and resource allocation strategy to maximize the FL test accuracy within a given total FL completion time. In particular, based on a novel technique called over-the-air computation, Yang K et al. [19] investigated client scheduling and beamforming design in FL. In addition, Al-Abiad M. S et al. [20] studied the optimization of multi-layer FL under non-orthogonal multiple access (NOMA) and proposed a resource allocation scheme with the total energy consumption of Internet of Things devices as the optimization objective. Moreover, Xiao H et al. [21] studied FL in the Internet of Vehicles scenario; considering the position and speed of vehicles, they formulated a min–max optimization problem over the computation frequency, communication power and local accuracy to achieve the minimum cost.
As aerial mobile base stations, UAVs can quickly establish a wireless network and provide emergency communication services. They play an important role in earthquakes, floods, fires and other emergencies, as well as in battlefield operations. Owing to their mobility, UAVs, unlike fixed base stations, can move to a suitable location to reduce users’ transmission energy consumption. Therefore, UAVs can be combined with FL to further improve FL performance. Tursunboev J et al. [22] proposed a novel and high-performing FL scheme to improve FL performance. Considering the unavailability of ground BSs and the limited energy of users, Pham Q.-V et al. [23] deployed a UAV equipped with edge computation and wireless power communications (WPC) capabilities to perform FL missions, and proposed an algorithm that jointly optimizes the UAV’s position, local accuracy and wireless resources to minimize the energy consumption of the UAV and users. On the premise that UAVs assist users in uploading models, Ng J. S et al. [24] formulated a coalition formation game to maximize the sum of the UAVs’ individual profits according to users’ preferences over heterogeneous UAVs, and proposed a joint auction–coalition scheme that achieves a stable partition of the UAV coalitions and uses an auction to allocate the UAV coalitions. Under constraints on learning accuracy and training delay, Jing Y et al. [25] jointly optimized UAV placement and resource allocation to reduce the energy consumption of users. In addition, based on reinforcement learning, Yang H et al. [26] proposed an asynchronous federated learning (AFL) framework that implements asynchronous distributed computation, together with an asynchronous advantage actor–critic (A3C) algorithm that jointly optimizes device selection, the UAVs’ positions and resource allocation to improve FL convergence speed and test accuracy. However, in [23,24,25,26], the altitude of the UAV is fixed. In fact, the UAV’s coverage area depends on its altitude. The higher the UAV hovers, the greater the probability of air-to-ground line-of-sight transmission and hence the larger the coverage radius; a larger service range lets more users access the UAV, so FL converges faster. However, a higher deployment altitude also increases the signal transmission distance, which results in greater path loss and higher energy consumption for the users participating in FL. Therefore, the UAV’s altitude needs to be optimized. Based on the above analysis, for the UAV-assisted federated learning wireless scenario, this paper takes into account the impact of the UAV’s altitude on the coverage area and, under a constraint on the total completion time, achieves a balanced optimization of users’ total energy consumption and FL performance by jointly optimizing the UAV’s position (altitude and horizontal position), local accuracy, and computation and communication resources. The main contributions of this paper are summarized as follows.
(1) For the UAV-assisted federated learning wireless network scenario, we consider the impact of the UAV’s altitude on the coverage area and define a system cost function composed of the total energy consumption and the reciprocal of the number of participating users. We then formulate a cost minimization problem within a given total FL completion time budget. By jointly optimizing the UAV’s position (horizontal placement and altitude), computation frequency, communication power, communication bandwidth, communication time and local accuracy, a balance between users’ total energy consumption and FL performance is achieved;
(2) Because the formulated problem is non-convex, we decompose it into three optimization subproblems: UAV horizontal placement, local accuracy, and user computation and communication resources. For subproblem 1 (UAV horizontal placement), we first introduce a relaxation variable and then use the first-order Taylor expansion to transform the subproblem into a convex problem. Finally, we use successive convex approximation (SCA) to obtain the optimal UAV placement;
(3) For subproblem 2 (local accuracy), which has a fractional form, we first convert the fractional objective into a non-fractional (subtractive) form and then use the Dinkelbach method to obtain the optimal local accuracy. For subproblem 3 (computation and communication resources), we further decompose it into two subproblems: subproblem 3-1, which optimizes the transmission power, computation frequency and transmission time, and subproblem 3-2, which optimizes the transmission bandwidth. For subproblem 3-1, we show that the optimal transmission time can be obtained by the bisection method, from which the optimal transmission power and computation frequency follow. Subproblem 3-2 is convex, so its optimal solution can be obtained from the KKT conditions.
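As a hedged illustration of the Dinkelbach method mentioned in contribution (3), the following sketch minimizes a generic ratio N(x)/D(x) with D(x) > 0 by repeatedly solving the subtractive problem min N(x) − λD(x) and updating λ. The toy objective, the grid-based inner solver and all names (`dinkelbach_min`, `num`, `den`) are assumptions made for illustration; they are not the local accuracy subproblem formulated in this paper.

```python
import numpy as np


def dinkelbach_min(num, den, grid, tol=1e-9, max_iter=100):
    """Minimize num(x) / den(x) over the points in `grid`, assuming den > 0.

    Each iteration solves the subtractive problem min_x num(x) - lam * den(x)
    by exhaustive search on the grid, then sets lam to the ratio at the
    minimizer. The loop stops when the subtractive optimum is (numerically) 0.
    """
    x = grid[0]
    lam = num(x) / den(x)
    for _ in range(max_iter):
        values = num(grid) - lam * den(grid)
        idx = np.argmin(values)
        x = grid[idx]
        if abs(values[idx]) < tol:     # F(lam) = 0 characterizes the optimal ratio
            break
        lam = num(x) / den(x)          # the ratio strictly decreases between iterations
    return x, lam


# Hypothetical toy fractional objective: (x^2 + 1) / (x + 1) on [0, 2].
num = lambda x: x ** 2 + 1.0
den = lambda x: x + 1.0
grid = np.linspace(0.0, 2.0, 20001)

x_opt, lam_opt = dinkelbach_min(num, den, grid)
print(x_opt, lam_opt)
```

For this toy objective the stationary point can be checked by hand: setting the derivative of the ratio to zero gives x = √2 − 1, with optimal ratio 2√2 − 2, which the iteration approaches to within the grid resolution.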
The rest of this paper is organized as follows: Section 2 presents the system model and problem formulation. Section 3 presents the joint optimization algorithm. Section 4 presents the simulation results and performance analysis. Conclusions are drawn in Section 5. For convenience, the key notations used in this paper are summarized in Table 1.
4. Simulation Results and Performance Analysis
In this paper, we used Python to simulate the proposed algorithm and PyTorch to build the convolutional neural network used to verify FL performance. In our simulations, we deployed users uniformly in a square area of size 250 m × 250 m. The data distribution among users is non-IID. We used the MNIST dataset, which consists of handwritten digits “0” through “9” with a total of 60,000 labeled training samples, to train the neural network. To distribute the data among users, we first sorted all data samples by digit label, divided them into 200 shards of size 300, and then assigned eight shards to each user. Each user therefore holds 2400 data samples covering no more than eight digit classes. Other simulation parameters are shown in Table 2.
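The shard-based non-IID partition described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the simulation code of this paper: the labels are a synthetic, perfectly balanced stand-in for the MNIST training labels (so no dataset download is needed), shards are assigned to users at random, and the user count of 25 is hypothetical (it is simply the largest number that 200 shards with eight shards per user allow). The function name `shard_partition` is also hypothetical.

```python
import numpy as np


def shard_partition(labels, num_users, num_shards=200, shards_per_user=8, seed=0):
    """Split sample indices into label-sorted shards and give each user a few shards."""
    if num_users * shards_per_user > num_shards:
        raise ValueError("not enough shards for the requested number of users")
    rng = np.random.default_rng(seed)
    order = np.argsort(labels, kind="stable")     # sample indices sorted by digit label
    shards = np.array_split(order, num_shards)    # 200 shards of 300 indices for 60,000 samples
    shard_ids = rng.permutation(num_shards)       # assumed: random shard-to-user assignment
    users = []
    for u in range(num_users):
        picked = shard_ids[u * shards_per_user:(u + 1) * shards_per_user]
        users.append(np.concatenate([shards[s] for s in picked]))
    return users


# Synthetic stand-in for the 60,000 MNIST training labels (balanced by assumption).
label_rng = np.random.default_rng(1)
labels = label_rng.permutation(np.repeat(np.arange(10), 6000))

user_indices = shard_partition(labels, num_users=25)
for u in range(3):
    print(u, len(user_indices[u]), np.unique(labels[user_indices[u]]))
```

Because each shard is cut from the label-sorted index list, a user holding eight shards sees at most eight digit classes in this balanced stand-in, which reproduces the skewed, non-IID label distribution described in the text.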
Next, we compared the algorithm proposed in this paper with three algorithms: (1) the algorithm in [23], in which the UAV altitude is fixed at its highest value so that all users are within the UAV’s coverage area; (2) a baseline algorithm that does not optimize the local accuracy, referred to as FLA in this paper, in which the local accuracy is fixed; (3) a baseline algorithm that does not optimize the transmission time, referred to as FTT in this paper, in which the transmission time is fixed.
Figure 3a,b show the system cost and the total energy consumption of users, respectively, when the required FL completion time increases from 100 to 200, where the model size is 50 kbit and the global accuracy is 0.001. As the required FL completion time increases, the cost and energy consumption of all four algorithms decrease, because users have more time to train and upload, which reduces resource competition among users. When the required completion time is small, the total system cost and the total energy consumption of users in [23] are very high because the UAV altitude is fixed rather than optimized. Because FLA does not optimize the local accuracy and FTT does not optimize the transmission time, their total system cost and total user energy consumption are higher than those of our proposed algorithm. In Figure 3b, as the required FL completion time increases, the total energy consumption of users decreases, and learning performance becomes the dominant term in the cost. The altitude of the UAV is therefore increased and the coverage area expanded to reduce the learning cost, so the number of users participating in learning increases. Thus, as the required FL completion time increases, the number of participating users increases while the total energy consumption of users decreases. Once the number of participants reaches a certain level, the diversity of the participating users’ datasets is sufficient. Adding more users beyond this point yields little improvement in learning, but occupies wireless resources and causes users’ energy consumption to rise sharply. From the above analysis, our proposed algorithm has the fewest participating users when the required completion time is 100, so it suffices to compare the learning performance of our proposed algorithm with that of [23] at this point.
When the required completion time is 100, we can obtain the numbers of global rounds and local update rounds of the two algorithms from the optimal local accuracy and the preset global accuracy. The time of each global round can then be obtained approximately. Since this experiment only verifies the diversity of the data covered by the two algorithms, the typical FedAvg algorithm in FL is used for training. As shown in
Figure 3c, because the algorithm in [23] covers all users, it converges faster than our proposed algorithm; however, the final convergence accuracy of the two algorithms differs only slightly. Moreover, our proposed algorithm greatly reduces the total energy consumption of users when the required FL completion time is short. This indicates that the user data covered by our proposed algorithm already contain essentially all the data features of the overall dataset; increasing the number of users further would have little effect on the final learning performance but would increase the total energy consumption of users. The above analysis shows that our proposed algorithm both guarantees training performance and reduces the total energy consumption of users as the total completion time varies.
Figure 4a,b show the system cost and the total energy consumption of users, respectively, when the global model accuracy increases from 0.001 to 0.1, where the model size is 50 kbit and the total completion time is 120. As Figure 4a,b show, when the required global model accuracy is relaxed (i.e., its value increases), fewer global rounds are needed. Consequently, the total cost and the total energy consumption of users decrease for all four algorithms. When the global model accuracy reaches 0.05, the gap between the algorithm in [23] and our proposed algorithm is very small. This is because, when the accuracy requirement is loose, users only need to train for a few rounds to reach the required accuracy, which reduces users’ total energy consumption. As a result, the UAV altitude optimized by our proposed algorithm is close to that of the algorithm in [23]. The total system cost and total energy consumption of FLA and FTT are also higher than those of our proposed algorithm. Based on the previous analysis, our proposed algorithm serves more users as the global model accuracy value increases, so it suffices to compare the learning performance of our proposed algorithm with that of [23] when the global model accuracy is 0.001. As shown in Figure 4c, our proposed algorithm covers sufficiently diverse user data, so the final test accuracy of the two algorithms differs only slightly. Moreover, our proposed algorithm reduces the total energy consumption of users when a stringent global accuracy is required. In addition, comparing Figure 4c with Figure 3c, because the total completion time in Figure 4c is 120, our proposed algorithm covers more users than in Figure 3c. Therefore, in the early stage of training, the convergence gap between our proposed algorithm and that in [23] is smaller than in Figure 3c, and the convergence rate of our proposed algorithm is very close to that of the algorithm in [23]. This also supports the conclusion that more users lead to faster convergence but have little influence on the final convergence accuracy within the allotted time. In conclusion, our proposed algorithm guarantees training performance while reducing the total energy consumption of users as the required global accuracy changes.
Figure 5a,b show the system cost and the total energy consumption of users, respectively, when the model size increases from 30 kbit to 80 kbit, where the global model accuracy is 0.001 and the total completion time is 120. As Figure 5a,b show, as the model size increases, users need more resources to transmit data, so the system cost and the total energy consumption of users increase for all four algorithms. Because the UAV altitude is not optimized in [23], competition for wireless resources among users becomes more intense as the model size increases, resulting in a sharp rise in system cost and total user energy consumption. In contrast, our proposed algorithm appropriately lowers the UAV altitude to reduce resource competition among users. Therefore, the total system cost and total energy consumption of [23] are higher than those of our proposed algorithm. Owing to the influence of the fixed local accuracy and transmission time, the system cost and total energy consumption of FLA and FTT are also higher than those of our proposed algorithm. When the model size is 80 kbit, our proposed algorithm has the fewest participating users, so it suffices to compare the learning performance at 80 kbit. As shown in Figure 5c, because more users participate in [23], it converges faster than our proposed algorithm. However, within the total completion time of 120, there is little difference in the final convergence accuracy of the two algorithms. Therefore, our proposed algorithm not only guarantees training performance but also reduces the total energy consumption of users as the model size changes, and it can adjust the number of participants according to the model size.
Figure 6 compares the effects of different total bandwidths on the system cost and the total energy consumption of users under our proposed algorithm. The total bandwidth is set to 1.5 MHz, 1 MHz and 0.5 MHz, respectively. As Figure 6 shows, both the total system cost and the total energy consumption of users decrease as the total bandwidth increases. This is because a larger total bandwidth gives users more wireless resources and reduces competition for them, which lowers the total energy consumption of users; this in turn allows more users to join the learning process, so the system cost also decreases.
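The bandwidth trend discussed above can be reproduced qualitatively with a simple link-level sketch. It assumes a Shannon-capacity uplink over an AWGN channel and an equal split of the total bandwidth among users, which is a simplification: the proposed algorithm optimizes the bandwidth allocation, and the exact channel model of this paper is given in its system model section, not here. Apart from the 50 kbit model size and the three total bandwidths, every value (user count, transmission time, channel gain, noise density) and the function name `upload_energy` are hypothetical.

```python
def upload_energy(model_bits, bandwidth_hz, tx_time_s, channel_gain, noise_psd):
    """Energy in joules needed to upload model_bits within tx_time_s.

    Inverts the Shannon rate b * log2(1 + p * g / (N0 * b)) to find the
    transmit power p that delivers model_bits in exactly tx_time_s seconds.
    """
    required_rate = model_bits / tx_time_s                    # bit/s
    snr_needed = 2.0 ** (required_rate / bandwidth_hz) - 1.0  # required receive SNR
    power_w = noise_psd * bandwidth_hz * snr_needed / channel_gain
    return power_w * tx_time_s


MODEL_BITS = 50e3                      # 50 kbit model size, as in the simulations
NUM_USERS = 10                         # hypothetical number of participating users
TX_TIME_S = 0.1                        # hypothetical per-round upload time
CHANNEL_GAIN = 1e-10                   # hypothetical channel power gain
NOISE_PSD = 10 ** (-174 / 10) * 1e-3   # assumed -174 dBm/Hz, converted to W/Hz

total_bandwidths_hz = [0.5e6, 1.0e6, 1.5e6]
energies = [
    upload_energy(MODEL_BITS, bw / NUM_USERS, TX_TIME_S, CHANNEL_GAIN, NOISE_PSD)
    for bw in total_bandwidths_hz
]
for bw, energy in zip(total_bandwidths_hz, energies):
    print(f"total bandwidth {bw / 1e6:.1f} MHz -> per-user upload energy {energy:.3e} J")
```

Under these assumptions the per-user upload energy falls as the total bandwidth grows, because the required SNR drops exponentially with the bandwidth available to each user. The same expression also gives lower energy when more transmission time is allowed, which is in line with the completion-time trend reported for Figure 3.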