Joint Client and Resource Optimization for Federated Learning in Wireless IoT Networks

Abstract: Federated learning (FL) is a promising technique for providing intelligent services in the internet of things (IoT). By transmitting model parameters instead of user data between the clients and the central server, FL greatly improves user privacy and reduces transmission latency. However, when FL is applied in wireless IoT networks, transmission outages caused by the fading of the wireless channel degrade the learning efficiency. To address this issue, we investigate the joint optimization of client selection and wireless resource allocation in FL-aided cellular IoT networks. Taking both the amount of training data and the wireless resource consumption into consideration, we formulate the problem as a mixed-integer non-linear program that maximizes the utility of the network. To solve the problem effectively, an alternative direction-based algorithm is proposed that decomposes the original problem into two subproblems. The simulation results indicate that the proposed algorithm substantially improves the FL learning performance and reduces the consumption of wireless resources compared with existing methods.


Introduction
With the rapid development of artificial intelligence (AI), massive numbers of intelligent devices have been deployed in the wireless internet of things (IoT), which makes the ability to provide intelligent services one of the most important evolutionary directions for future IoT. As a basic element of AI, machine learning (ML) [1] is widely utilized by various intelligent applications. However, centralized training in ML is becoming increasingly inefficient, and data transmission and privacy problems arise, because massive amounts of training data are generated and stored on the intelligent devices rather than on the central server. Federated learning (FL) [2,3] is a desirable solution to this mismatch. As a distributed form of ML, FL only requires the transmission of model parameters, rather than the data themselves, between the clients and the central server, which largely reduces the amount of data transmitted. Meanwhile, user privacy is also well protected, since the data are never transmitted to the central server.
Despite the advantages of FL, the communication issue should be addressed when deploying it in wireless IoT networks [4]. Specifically, because of the fading effects of the wireless channel, the transmission of model parameters may experience an outage, which degrades the performance of FL [5]. To address this issue, an admission control algorithm is proposed in [6], where the number of accessed devices is considered as the optimization target. Furthermore, in [7], the quality of local training and the channel state are utilized to decide the accessed devices. In addition to admission control, wireless resource optimization is also effective for FL-aided IoT networks. For example, in [8], a joint optimization algorithm is proposed to improve the communication reliability of FL-aided wireless networks, where client selection and power allocation are employed to reduce the loss of the trained model. In [9], split learning and FL are combined to handle the diversity of clients with different channel states and computational capabilities. In this algorithm, client selection is achieved by a multi-armed bandit scheme, which employs both the channel states and the local model in the optimization objective. In [10], a blockchain-based FL is proposed for wireless computing power networks, where client selection is achieved by an evolutionary game-based incentive scheme. The incentive function takes the resources and security into account; however, the amount of local training data is not involved. Also targeting wireless computing power networks, a resource-aware FL is proposed in [11] that aims to reduce the energy consumption. The algorithm adjusts the depth of the neural network and the total number of training rounds without involving power and wireless channel selection. To handle the dynamics of wireless channels and network resources, in [12], the global FL models received in previous training rounds are reused to replace erroneous local models. In this algorithm, client selection is performed by minimizing the accuracy loss on the training data, so it focuses on repair rather than on selecting superior wireless channels. All these previous works adopt the synchronous model; conversely, in [13], an asynchronous FL framework with client selection is proposed. In its optimization, client availability and long-term fairness are taken into consideration to minimize latency, and Lyapunov optimization is employed to tackle the asynchronous problem in an online manner. However, the amount of local training data is not considered in client selection. In [14], the training data of the local model and the wireless channel quality are jointly considered for asynchronous FL. The optimization objective is to reduce the variance and bias of the aggregated model updates, while the amount of local training data is again not involved. In [15], a joint optimization of bandwidth allocation and client scheduling is considered to achieve the ideal trade-off between training accuracy and latency. To solve the problem efficiently, reformulation and decoupling are adopted, and the optimal resource allocation is obtained using an online algorithm. Nevertheless, the amount of local training data is not considered in the client scheduling. Another joint optimization of client selection and resource allocation for wireless FL is studied in [16], where the target is to maximize the total average number of active clients and transmission time. Lyapunov optimization is employed to obtain an online solution. Similarly, in [17], a joint optimization of client scheduling and resource allocation for hierarchical FL is investigated. The formulation simultaneously captures the uncertainty of the wireless channel and the weight gradient. However, in these former works, the joint optimization mainly focuses on the wireless channel and resource allocation without considering the amount of local training data at each client. In [18], both bandwidth and power allocation are considered in the wireless resource optimization, and the objective is to maximize the number of accessed clients. However, in these existing works, the objective functions treat all nodes equally and do not involve the amount of training data at the client, which is worth considering because, in practice, the amount of data collected by each IoT node varies.
Motivated by these observations, we investigate the joint optimization of client selection and wireless resource allocation in FL-aided IoT networks, and the major contributions are as follows.

• We develop a joint optimization framework for FL in wireless IoT networks. Specifically, the framework supports client selection and wireless resource allocation, which includes the power and bandwidth allocation of the clients. In the framework, the objective function takes both the amount of training data and the wireless resource consumption into consideration.

• We solve the formulated problem using an alternative direction-based algorithm. In the algorithm, the primal problem is decomposed and transformed, and then, by combining and solving the constraints simultaneously, we derive the iteration equation for the optimal power and bandwidth. After that, the optimal client selection is obtained using a greedy algorithm.

• We conduct extensive simulations using the MNIST data set to examine the effectiveness of the proposed joint optimization in terms of network utility, accuracy rate and energy consumption.

The rest of the paper is organized as follows. Section 2 introduces the system model. Section 3 describes the problem formulation. The proposed algorithm is then presented in Section 4. The simulation results are discussed in Section 5. Finally, the conclusion is drawn in Section 6.

System Model
In this section, we describe the system model of FL-aided wireless IoT networks, which is divided into the network model, learning process and communication model.

Network Model
As depicted in Figure 1

Learning Process
In the considered network, an ML model of interest is trained in a distributed manner through the cooperation of the FL server and the IoT nodes. The goal of the training is to obtain the model parameter $w$ by minimizing the loss function $f(w)$ on the data set $S$. The minimization can be expressed as (1), where $f_i(w, S_l)$ is the local loss function of IoT node $i$ on sample $S_l$. We focus on the widely used federated averaging learning framework [2], where the training communication rounds are periodic. The $t$th communication round consists of the broadcasting phase, the local training phase and the aggregating phase. In the broadcasting phase, the FL server broadcasts the global model parameter $w^t$ to all the IoT nodes via the wireless downlink (DL) of the base station (BS). Then, in the local training phase, each IoT node $i$ calculates the gradient of the local loss function $\nabla f_i(w^t, S_i)$, and $E$ epochs of the gradient descent method are used to obtain $w_i^{t+1}$ as (2), where $\zeta_i$ is the learning rate of IoT node $i$. Finally, in the aggregating phase, each IoT node $i$ transmits its local model parameter $w_i^{t+1}$ to the BS via the wireless uplink (UL), while the FL server updates the global model parameter as (3). The learning process terminates when condition (4) holds, where $\Lambda$ is the learning termination threshold.
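The three phases of one communication round can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy quadratic loss, and the weighting of local models by data amount in the aggregation step (the usual federated averaging rule) are assumptions made here for illustration.

```python
from typing import Callable, List

Vector = List[float]


def local_update(w_global: Vector, grad_fn: Callable[[Vector], Vector],
                 lr: float, epochs: int) -> Vector:
    """Local training phase: run `epochs` gradient-descent steps from the global model."""
    w = list(w_global)
    for _ in range(epochs):
        g = grad_fn(w)
        w = [wj - lr * gj for wj, gj in zip(w, g)]
    return w


def aggregate(local_models: List[Vector], data_sizes: List[int]) -> Vector:
    """Aggregating phase: average local models, weighted by data amount (assumed rule)."""
    total = sum(data_sizes)
    dim = len(local_models[0])
    return [sum(n * m[j] for m, n in zip(local_models, data_sizes)) / total
            for j in range(dim)]


def make_grad(center: float) -> Callable[[Vector], Vector]:
    """Gradient of the toy local loss 0.5 * (w - center)^2 (hypothetical loss)."""
    return lambda w: [w[0] - center]


# Toy run: two nodes with different local optima and data amounts.
centers = [1.0, 3.0]
sizes = [1000, 3000]
w = [0.0]
for _ in range(50):
    # Broadcasting phase: every node starts from the same global model w.
    local_models = [local_update(w, make_grad(c), lr=0.5, epochs=1) for c in centers]
    w = aggregate(local_models, sizes)
```

In this toy run the global model settles at the data-weighted mean of the two local optima, which shows why nodes holding more data pull the aggregate more strongly.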

Communication Model
In the learning process, the aggregating phase is the bottleneck, because the wireless UL of the IoT nodes is resource-limited compared with the DL of the BS. Hence, we focus on the resource optimization of the wireless UL in FL-aided wireless IoT networks. We assume that each IoT node transmits its local model parameter through frequency division multiple access. The achievable rate of IoT node $i$ at the $t$th round can be written as (5), where $B_i^t$ is the frequency bandwidth of IoT node $i$, $P_i^t$ is the transmitting power of IoT node $i$, $h_i^t$ is the channel gain from IoT node $i$ to the BS, and $\sigma^2$ is the power spectral density of the background noise. For each IoT node $i$, the local model parameter $w_i^{t+1}$ is encapsulated into a packet of size $D$ for transmission. Thus, the transmission time $T_i^t$ from IoT node $i$ to the BS can be written as (6). With (6), the outage probability of IoT node $i$ can be expressed as (7), where $\theta$ is the maximum acceptable latency for the $t$th round in FL. For the convenience of readers, the parameters used in the formulation are listed in Table 1.
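The chain from rate to transmission time to outage can be sketched in Python as follows. This is a hedged illustration rather than the paper's exact expressions: it assumes a Shannon-type rate in which the noise power equals the noise power spectral density multiplied by the allocated bandwidth, and it treats outage as a deterministic per-round event (transmission time exceeding the latency limit) instead of a probability. All function names and numerical values are hypothetical.

```python
import math


def achievable_rate(bandwidth_hz: float, power_w: float, gain: float,
                    noise_psd_w_per_hz: float) -> float:
    """Shannon-type rate in bit/s; noise power = PSD * bandwidth (assumption)."""
    snr = power_w * gain / (noise_psd_w_per_hz * bandwidth_hz)
    return bandwidth_hz * math.log2(1.0 + snr)


def transmission_time(packet_bits: float, rate_bps: float) -> float:
    """Time needed to upload a packet of `packet_bits` bits at `rate_bps`."""
    return packet_bits / rate_bps


def is_outage(tx_time_s: float, max_latency_s: float) -> bool:
    """An upload that takes longer than the latency limit counts as an outage."""
    return tx_time_s > max_latency_s


# Toy numbers chosen so that the SNR is 1 and the rate is about 1 Mbit/s.
rate = achievable_rate(bandwidth_hz=1e6, power_w=0.01, gain=1e-10,
                       noise_psd_w_per_hz=1e-18)
tx_time = transmission_time(packet_bits=2e6, rate_bps=rate)
```

With these toy values the upload takes about two seconds, so a node with a one-second latency limit would be in outage while a node with a three-second limit would not.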

Problem Formulation
To improve the accuracy of the trained model, it is desirable to bring as much data as possible into the training process [19]. Meanwhile, it is also beneficial to reduce the consumption of wireless resources in parameter transmission. Hence, the utility function of the FL-aided wireless IoT network is formulated as (8), where $x_i^t$ is the node selection indicator: $x_i^t = 1$ represents that IoT node $i$ is selected to participate in the $t$th round of training; otherwise, $x_i^t = 0$. $b_i^t$ is the outage indicator: $b_i^t = 1$ indicates that the UL of node $i$ experiences an outage; otherwise, $b_i^t = 0$. $\tau$ denotes the cost coefficient. In (8), the first term captures the total amount of training data, while the second term represents the communication cost, which is formulated by the bandwidth-power product [20].
Our goal is to optimize the bandwidth, power and client selection by maximizing the utility function $U_{FL}$; therefore, the problem is formulated as follows.

P1: $\max_{x, B, P} U_{FL}$, s.t. C1, C2, C3, C4.

Constraint C1 is the binary integer constraint for the client selection indicator. Constraint C2 guarantees that the sum of the allocated bandwidth does not exceed the total bandwidth. Constraint C3 ensures that the transmitting power of each IoT node is nonnegative and does not exceed the maximal value. Constraint C4 denotes that the transmission avoids an outage only when the transmission time is smaller than the maximum acceptable latency.
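A small Python sketch may help make the utility and the constraints concrete. The exact form of the utility is an assumption here: the first term sums the data amounts of the selected nodes that do not suffer an outage, and the second term is the cost coefficient times the summed bandwidth-power product of the selected nodes. The feasibility check follows the verbal descriptions of C1 to C3, assuming that only selected nodes consume bandwidth. All names and numbers are hypothetical.

```python
from typing import List


def utility(x: List[int], b: List[int], data_sizes: List[int],
            bandwidth: List[float], power: List[float], tau: float) -> float:
    """Assumed utility: useful training data minus tau * bandwidth-power cost."""
    data_term = sum(xi * (1 - bi) * si for xi, bi, si in zip(x, b, data_sizes))
    cost_term = tau * sum(xi * Bi * Pi for xi, Bi, Pi in zip(x, bandwidth, power))
    return data_term - cost_term


def feasible(x: List[int], bandwidth: List[float], power: List[float],
             total_bandwidth: float, max_power: float) -> bool:
    """Check C1 (binary selection), C2 (bandwidth budget) and C3 (power range)."""
    c1 = all(xi in (0, 1) for xi in x)
    c2 = sum(xi * Bi for xi, Bi in zip(x, bandwidth)) <= total_bandwidth
    c3 = all(0.0 <= Pi <= max_power for Pi in power)
    return c1 and c2 and c3


# Toy instance with three nodes; bandwidth in MHz, power in W.
x = [1, 1, 0]            # nodes 0 and 1 are selected
b = [0, 1, 0]            # node 1 suffers an outage
sizes = [2000, 3000, 1500]
B = [2.0, 3.0, 1.0]
P = [0.01, 0.02, 0.03]
u = utility(x, b, sizes, B, P, tau=100.0)
ok = feasible(x, B, P, total_bandwidth=10.0, max_power=0.0316)
```

In this toy instance node 1 is selected but in outage, so it contributes cost without contributing data, which is exactly the situation the joint optimization tries to avoid.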

Algorithm Design
It can be observed that P1 is a mixed-integer non-linear programming (MINLP) problem; therefore, its complexity is high. To handle this intractable problem, we propose an alternative direction-based algorithm. In the proposed algorithm, $x$ is first fixed to 1 while $B$ and $P$ are solved from the simplified problem. After that, the obtained $B$ and $P$ are substituted into P1 to solve for $x$.

Solving B and P When x Is Given
When $x$ is fixed to 1, all the nodes are selected to join the learning process. In that case, P1 can be simplified to P2, which minimizes the objective function $f_{P2}$ subject to constraints including C3 and C4.
Comparing constraint C4 with the objective function $f_{P2}$, it can be derived that $T_i^t = \theta$ should be satisfied. Combining (5) and (6) under this condition yields relation (14) between $B_i^t$ and $P_i^t$. Substituting (14) into P2 gives the further simplified problem P3. Setting the derivative $\mathrm{d}f_{P3}/\mathrm{d}P_i = 0$, we utilize the iteration method in [21] and derive the iteration equation (17) of $P_i$, where $k$ is the iteration index. We denote $P_i^*$ as the optimal value of $P_i^t$ when the iteration (17) ends. Substituting $P_i^*$ into (14) then gives $B_i^*$, the optimal value of $B_i^t$.
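The role of the condition that the transmission time equals the latency limit can be illustrated numerically. The Python sketch below is not the iteration (17) of the paper. Under the assumption of a Shannon-type rate with noise power equal to the noise power spectral density times the bandwidth, it computes the power that exactly meets the deadline for a given bandwidth, and then scans a hypothetical list of candidate bandwidths for the smallest bandwidth-power product under the power cap. Function names and values are illustrative.

```python
import math
from typing import List, Optional, Tuple


def required_power(bandwidth_hz: float, packet_bits: float, deadline_s: float,
                   gain: float, noise_psd: float) -> float:
    """Power for which the upload time equals the deadline (assumed rate model)."""
    target_rate = packet_bits / deadline_s
    snr = 2.0 ** (target_rate / bandwidth_hz) - 1.0
    return snr * noise_psd * bandwidth_hz / gain


def best_bandwidth(candidates_hz: List[float], packet_bits: float, deadline_s: float,
                   gain: float, noise_psd: float,
                   max_power: float) -> Optional[Tuple[float, float]]:
    """Among candidates that respect the power cap, minimize the product B * P."""
    best = None
    for B in candidates_hz:
        P = required_power(B, packet_bits, deadline_s, gain, noise_psd)
        if P <= max_power and (best is None or B * P < best[0] * best[1]):
            best = (B, P)
    return best


# Toy instance: 1 Mbit packet, 1 s deadline, hypothetical gain and noise PSD.
best_B, best_P = best_bandwidth([0.2e6, 0.5e6, 1e6, 2e6], packet_bits=1e6,
                                deadline_s=1.0, gain=1e-10, noise_psd=4e-21,
                                max_power=0.0316)
```

The scan shows the trade-off behind the bandwidth-power cost: a very narrow band needs exponentially more power to meet the deadline, while a very wide band wastes spectrum, so the product is smallest at an intermediate bandwidth.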

Solving x with B and P
By substituting the obtained $B_i^*$ and $P_i^*$ into P1, it can be simplified to P4. Since P4 is a knapsack problem, we propose a greedy algorithm to solve it, given in Algorithm 1.
Algorithm 1. Greedy Algorithm for P4.
The block diagram of the proposed algorithm is presented in Figure 2. We propose an alternative direction-based algorithm to handle the high complexity of P1. First, $x$ is fixed to 1, and P1 is simplified to P2. Then, by combining and substituting C4, (5) and (6) into P2, it is further simplified to P3. Setting the derivative $\mathrm{d}f_{P3}/\mathrm{d}P_i = 0$ in P3, the iteration Equation (17) is derived. Through the iteration of (17), $P_i^*$ is obtained. Then, by substituting $P_i^*$ into (14), $B_i^*$ is also solved. Next, by substituting $B_i^*$ and $P_i^*$ into P1, it is transformed into P4, which is a knapsack problem. Hence, we adopt Algorithm 1 to solve $x$ from P4. After that, $x$, $P_i^*$ and $B_i^*$ are obtained and used for the client and resource optimization.
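A greedy rule for a knapsack-type selection problem such as P4 can be sketched in Python as follows. The steps of Algorithm 1 are not reproduced here, so this is a generic ratio-based greedy heuristic given as an assumption: each node's profit is taken as its data amount minus the cost coefficient times its bandwidth-power product, its weight is its allocated bandwidth, and the capacity is the total bandwidth. Names and numbers are hypothetical.

```python
from typing import List


def greedy_select(profits: List[float], weights: List[float],
                  capacity: float) -> List[int]:
    """Pick items in decreasing profit-per-weight order while they still fit."""
    order = sorted(range(len(profits)),
                   key=lambda i: profits[i] / weights[i], reverse=True)
    x = [0] * len(profits)
    remaining = capacity
    for i in order:
        # Skip nodes that would lower the utility or exceed the remaining budget.
        if profits[i] > 0 and weights[i] <= remaining:
            x[i] = 1
            remaining -= weights[i]
    return x


# Toy instance: four nodes, bandwidth in MHz, power in W, cost coefficient 100.
data = [3000, 1500, 2500, 2000]
B_opt = [4.0, 1.0, 3.0, 5.0]
P_opt = [0.02, 0.01, 0.03, 0.03]
tau = 100.0
profits = [s - tau * Bi * Pi for s, Bi, Pi in zip(data, B_opt, P_opt)]
x = greedy_select(profits, B_opt, capacity=8.0)
```

In the toy instance the node with the largest bandwidth demand and the lowest data-per-bandwidth ratio is left out once the 8 MHz budget is exhausted. A ratio-based greedy rule is fast but is not guaranteed to be optimal for the 0-1 knapsack problem in general.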


Simulation Results
In this section, the performance of the proposed alternative direction-based algorithm is examined by comparison with two existing benchmarks. Benchmark 1 is the vanilla FL scheme [2] over wireless IoT networks, in which the federated averaging algorithm is adopted. Benchmark 2 is the communication-efficient FL [18], in which the optimization objective is to maximize the number of clients participating in the FL process. We consider an FL-aided cellular IoT network within a 1 km × 1 km area, where the BS is located at the center while 20 IoT nodes are randomly distributed. The channel model in [22] is adopted; that is, $103.2 + 27.3 \log_{10}(d)$ is used between the BS and the IoT nodes. The total bandwidth of the network is $B_T = 10$ MHz, and the power of the background noise is $\sigma^2 = -109$ dBm. The maximum transmitting power is $P_M = 15$ dBm. In the simulation, the ML model of interest is a convolutional neural network (CNN) [23], which consists of 2 convolutional layers, 1 fully connected layer, and 1 softmax function-based output layer. Each convolutional layer uses a 5 × 5 kernel and is followed by 2 × 2 max-pooling. The fully connected layer has 500 units. The training data set is MNIST [24]; the original data from MNIST are randomly partitioned into $M$ pieces, and each IoT node is assigned one piece. Thus, the data are independent and identically distributed (i.i.d.). Here, the amount of training data $S_i$ follows a uniform distribution between 1500 and 3500 images.
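The network side of this setup can be reproduced with a few lines of Python. The sketch below is only an illustration under stated assumptions: the channel model expression is interpreted as a path loss in dB with the distance $d$ in kilometers (the unit is not stated in the text), a small minimum distance is enforced to keep the logarithm finite, and the function names and the random seed are hypothetical.

```python
import math
import random
from typing import Dict, List


def dbm_to_watt(dbm: float) -> float:
    """Convert a power level from dBm to watts."""
    return 10 ** (dbm / 10) / 1000


def channel_gain(distance_km: float) -> float:
    """Linear gain from the loss 103.2 + 27.3 log10(d) in dB, d in km (assumed unit)."""
    path_loss_db = 103.2 + 27.3 * math.log10(distance_km)
    return 10 ** (-path_loss_db / 10)


def draw_nodes(num_nodes: int, side_km: float = 1.0, seed: int = 0) -> List[Dict[str, float]]:
    """Place nodes uniformly in a square with the BS at the center; draw data amounts."""
    rng = random.Random(seed)
    nodes = []
    for _ in range(num_nodes):
        px, py = rng.uniform(0, side_km), rng.uniform(0, side_km)
        # Clamp the distance to 1 m so that log10 stays finite (assumption).
        d = max(math.hypot(px - side_km / 2, py - side_km / 2), 1e-3)
        nodes.append({"distance_km": d,
                      "gain": channel_gain(d),
                      "data": rng.randint(1500, 3500)})
    return nodes


max_power_w = dbm_to_watt(15)   # maximum transmitting power of 15 dBm
nodes = draw_nodes(20)          # 20 IoT nodes in a 1 km x 1 km area
```

Under these assumptions the 15 dBm power limit corresponds to roughly 31.6 mW, and a node at a distance of 1 km sees a loss of 103.2 dB.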
Figure 3 shows the comparisons of network utility as the number of transmission rounds increases. It is observed that as the number of transmission rounds grows, the proposed algorithm outperforms benchmark 2. In the optimization of benchmark 2, the objective function considers all the nodes equally, while in the proposed algorithm, the amount of training data collected by the IoT nodes is used as the profit weight in the objective function, which leads to more data being brought into the training under the same number of communication rounds. Meanwhile, the proposed algorithm and benchmark 2 are superior to benchmark 1. The reason is that in these two algorithms, the power and bandwidth are optimized to maintain the learning process, which efficiently reduces the consumption of communication resources.
Figure 4 depicts the loss of the trained model for the compared algorithms. First, as the number of transmission rounds increases, the loss of the trained model decreases for all three algorithms. The reason is that with more transmission rounds, more and more data are brought into training, which helps to improve the trained parameters. When the number of transmission rounds is larger than 1000, the proposed algorithm and benchmark 2 converge to almost the same loss level. This is because, with sufficient communication rounds, the two schemes can bring adequate data into the training. However, recalling the results in Figure 3, it can be deduced that the consumption of communication resources is different for the two algorithms, and the proposed algorithm is superior in power and bandwidth saving.
Figure 5 illustrates the comparisons of the accuracy rate as the number of transmission rounds increases. It is observed that as the number of transmission rounds grows, the accuracy rate of the trained model also increases for all compared algorithms. The proposed algorithm and benchmark 2 are superior to benchmark 1. The reason is that, compared to benchmark 1, the two schemes bring more data into training with limited communication resources. When the number of transmission rounds is greater than 1000, the performance of the three algorithms is close, because in that case, sufficient data have been brought into the training process. The proposed algorithm outperforms benchmark 2 because its objective function considers both the amount of training data and the consumption of communication resources. As a result, compared to benchmark 2, the proposed algorithm efficiently brings more data into the training process.
Figure 6 depicts the comparisons of energy consumption per client as the number of transmission rounds increases. It is observed that the energy consumption per client increases with the number of transmission rounds for all three compared algorithms; the energy consumption is roughly proportional to the number of transmissions. The proposed algorithm consumes less energy than the two benchmarks because the proposed optimization only selects clients with a superior wireless channel state and a large amount of local training data, which reduces the energy consumption in parameter transmission. In contrast, benchmark 1 consumes more energy than benchmark 2 and the proposed algorithm because its client selection is random and does not consider the wireless channel state. Additionally, benchmark 2 also consumes more energy than the proposed algorithm. The reason is that the optimization objective function of benchmark 2 does not take the amount of local training data into consideration. This result indicates that the proposed algorithm achieves a superior trade-off between the wireless channel state and the amount of local training data in parameter transmission.

Conclusions
In this paper, we have investigated the joint optimization of client selection and communication resource allocation in FL-aided wireless IoT networks. By taking both the amount of training data and the consumption of communication resources into consideration, an MINLP problem was formulated to maximize the utility of the network. Then, an alternative direction-based algorithm was proposed to solve the problem efficiently. The simulation results have confirmed that the proposed algorithm is effective in reducing communication resource consumption and improving learning performance. Furthermore, the results also revealed that distinguishing node weights based on the amount of collected data is beneficial for FL. In future work, we will extend the algorithm to the FL-enabled scenario with massive multiple-input multiple-output communications, where the higher dimensional signals make the joint optimization of client selection and resource allocation more challenging.

Figure 1. System model of FL-aided wireless IoT networks.

With Algorithm 1, the proposed client and resource optimization can be summarized as Algorithm 2.

Algorithm 2. Client and Resource Optimization for FL-aided IoT.

Figure 2. Block diagram of the proposed algorithm.

Figure 3. Comparisons of network utility when the number of transmission rounds increases.

Figure 4. The loss of the trained model for the compared algorithms.

Figure 5. Comparisons of accuracy rate when the number of transmission rounds increases.

Figure 6. Comparisons of energy consumption per client when the number of transmission rounds increases.

Table 1. Explanation of abbreviations.