More Efficient and Verifiable Privacy-Preserving Aggregation Scheme for Internet of Things-Based Federated Learning

Abstract: As Internet of Things (IoT) technology continues to advance at a rapid pace, smart devices have permeated daily life. Service providers actively collect copious amounts of user data, aiming to refine machine learning models and thereby elevate service quality and accuracy. However, this practice has sparked apprehension among users concerning the privacy and safety of their personal data. Federated learning emerges as an evolution of centralized machine learning, enabling multiple users to collectively train a machine learning model on their respective devices. Crucially, this is achieved without submitting data directly to a central server, thereby significantly mitigating the hazards associated with privacy infringement. Since the machine learning algorithms run locally in federated learning, with only the local model passed back to the central server, the users' data remain local. However, current research indicates that local models still contain components related to user data privacy. Moreover, current privacy-preserving secure aggregation schemes either offer insufficient accuracy or demand substantial computing resources for training. In this work, we propose an efficient and secure aggregation scheme for privacy-preserving federated learning with lower computational costs, which is suitable for resource-constrained IoT devices. The proposed scheme is robust and fault-tolerant, allowing some users to dynamically exit or join the system without restarting the federated learning process or triggering abnormal termination. In addition, the scheme provides result verification: should the servers return incorrect aggregation results, the users can detect this through verification. Extensive experimental evaluations, based on real-world datasets, have substantiated the high accuracy of our proposed scheme.
Moreover, in comparison to existing schemes, ours significantly reduces computational and communication costs by at least 85% and 47%, respectively.


Introduction
With the rapid development of IoT technology, IoT devices have provided ubiquitous sensing and computing power and are widely used in scenarios such as smart cities, healthcare, smart homes, and smart industries. According to a report by IoT Analytics [1], by 2027 there will be 29 billion IoT devices working concurrently around the world, compared to just 14.4 billion in 2022. The vast amount of data generated by this multitude of devices provides rich material for analysis by artificial intelligence technologies such as machine learning and deep learning. These technologies have demonstrated powerful analytical and mining capabilities in fields like image classification, speech recognition, and natural language processing, effectively leveraging the data collected by the Internet of Things to uncover hidden patterns and information. Traditional machine learning methods rely on centralized AI frameworks located in data centers or clouds for data analysis and model training, which requires collecting data dispersed across various locations and training models centrally. However, centralized machine learning also raises concerns about data privacy, as the data may contain sensitive information such as users' personal preferences, health status, and more. Owing to these privacy concerns, users are often reluctant to share the data they possess, leading to data silos that prevent effective centralization for training and analysis. Therefore, a new machine learning training method is needed to alleviate users' concerns about privacy.
In 2016, Google researchers introduced a technique known as federated learning [2], a form of distributed machine learning aimed at addressing the issue of machine learning across a range of mobile devices. Federated learning, as opposed to the more traditional centralized machine learning, does away with the requirement to centralize data within a data center. Under the coordination of a server, various IoT devices act as clients to collaboratively train a model. Initially, the server establishes the current global model. Subsequently, each client downloads this model, utilizes its own data to train a local model, and then uploads the outcome back to the server. The server then amalgamates these local models to form a new global model, which is subsequently downloaded by the clients to train their new local models. This cycle is repeated until a predefined training goal is met. By keeping data local, this approach preserves data privacy and security, potentially alleviating users' privacy concerns: the server only ever sees the models, never the training data themselves. As a result, federated learning has been the subject of extensive research and application in areas such as health care [3], financial services [4], and intelligent manufacturing [5], emerging as a highly promising research field within artificial intelligence.
However, existing research indicates that federated learning does not fully protect users' private data: multiple attack methods can still threaten user privacy in the federated setting. Through their research, Mothukuri et al. [6] discovered that the local models submitted by users contain residual information, and that sensitive information, such as users' training data, can be inferred by analyzing the output of these local models. Wang et al. [7] proposed a framework that integrates a GAN with a multi-task discriminator on a malicious FL server to reconstruct a user's private training data. Zhu et al. [8] suggested a method that constructs fake data and labels for training, iteratively updates them to minimize the difference between the resulting fake model and the user-submitted local model, and can ultimately recover the user's training data. To address these issues, the methods commonly used to enhance privacy preservation in federated learning fall primarily into homomorphic encryption [9-11], differential privacy [12,13], and secure multi-party computation [14-16].
In addition, it is important to ensure that the server's aggregation results are correct. A malicious federated learning server may return incorrect aggregation results; for example, it may intentionally return the aggregation results of a previous round during the current round in order to save its own computational cost. Therefore, it is of interest to design a federated learning protocol that can both protect user privacy and verify the correctness of the server's aggregation results. Recently, some researchers have proposed verifiable privacy-preserving aggregation schemes [17-21]. These schemes either require additional trusted hardware support or place high computational demands on the user side, rendering them inappropriate for IoT devices with limited processing capability. The properties satisfied by the related work are shown in Table 1.
In this paper, we propose a privacy-preserving federated learning data aggregation scheme that is both efficient and verifiable, specifically designed for IoT devices with lower computational and communication costs to accommodate their weaker performance capabilities.The specific contributions of this paper are as follows.

•
We propose a novel, efficiency-focused federated learning scheme that facilitates privacy-preserving data aggregation while notably reducing computational and communication costs. A distinguishing feature of our solution is that it allows users to perform low-cost validation of the aggregation results provided by the server to ensure their integrity and reliability. This compromises neither computational efficiency nor strict privacy guarantees, making it a secure and effective data aggregation solution.

•
We also devise a dynamic user management mechanism. This mechanism enables the seamless onboarding and departure of participants within the federated learning process, ensuring that federated learning remains robust against user attrition. When some users drop out, the remaining active participants experience minimal disruption in terms of both computational load and communication bandwidth. Despite these departures, the integrity of the aggregated encrypted gradients is preserved, maintaining the overall efficacy of the federated learning process.

•
We conduct comprehensive tests on several real-world datasets and thoroughly analyze accuracy and performance. The outcomes are strikingly positive: our scheme reduces computational and communication costs by at least 85% and 47%, respectively, compared to existing schemes. Furthermore, our scheme attains accuracy highly comparable to plaintext baselines, affirming its reliability and robustness. These findings not only highlight the advantages of our scheme but also underscore its potential for practical application in real-world scenarios, particularly where computational resources are limited.

Related Work
Federated learning, a decentralized machine learning technique, was pioneered by Google in 2016 as a solution for training predictive models on mobile devices without consolidating data [2]. Its fundamental principle is to enable users to contribute to a collective global model by transmitting their locally trained models to a central server, while keeping their datasets on their own devices. In the federated learning framework, the server initiates each training cycle by distributing the current global model parameters to participating clients. These clients then use their own localized datasets to refine a model tailored to their unique data. Upon completion of local training, the clients relay their updated model parameters back to the server. The server employs an aggregation mechanism to amalgamate these parameters from the various clients, resulting in a refined set of global model parameters. These enhanced global parameters are then made available for download by the clients, marking the commencement of the next training iteration. This cyclic process persists until the global model reaches a satisfactory level of performance or meets predefined criteria.
In recent years, federated learning has been gaining traction in the business world as a solution to privacy and security concerns, while academics continue to push its boundaries: innovative algorithms such as FedGRU [28], which predicts traffic flow for smart vehicles, and FedSem [29], which exploits unlabeled data in smart cities, are being developed, alongside efforts to train medical models without sharing patient data. All these developments highlight federated learning's significant promise for cross-institutional data collaboration while maintaining privacy.
However, some research indicates that within the context of federated learning, certain attack methods remain capable of threatening users' privacy. Mothukuri et al. [6] have identified several types of attacks, including membership inference attacks, unintentional data leaks, inference-based reconstructions, and GAN-based inference attacks. A membership inference attack is one in which an attacker examines the global model to uncover information about the training data of other users [30]; it allows the attacker to determine whether a certain type of dataset was used for model training [31-34]. The GAN-based inference attack makes use of the GAN's superior data reconstruction performance to learn the distribution of a user's private dataset from that user's local model and then rebuild the private dataset. It has been demonstrated experimentally that GANs can be used to reconstruct users' private data [35].
To address the aforementioned issues, federated learning commonly employs several privacy-enhancing techniques, including homomorphic encryption [22,24,36] and differential privacy [12,37-39]. These methods aim to achieve model training and information sharing among participants while preserving the privacy of user data.
In the field of verifiable privacy-preserving federated learning, researchers have devised diverse methods to ensure data confidentiality during collaborative model training. Hahn et al. [17] use a dual-mask protocol for gradient privacy, requiring servers to prove the accuracy of aggregated results. Xu et al. [18] substitute homomorphic hashing for dual masks in challenge creation. Ghodsi et al. [19] designed a specialized interactive proof protocol using arithmetic circuits to detect malicious servers; this method ensures that the server cannot tamper with users' data or models during the federated learning process. Tramer et al. [20] utilized trusted execution environments such as SGX (Software Guard Extensions) and TrustZone to achieve privacy preservation and execution verification; these environments ensure that the computation processes running on servers are secure and that users' private data remain confidential. Fu et al. [21] applied Lagrange interpolation and blinding techniques for enhanced privacy and gradient verification. Each approach contributes uniquely to the evolving landscape of privacy in federated learning, showcasing varied strategies for tackling privacy concerns in collaborative machine learning settings.
All of these studies and frameworks advance privacy-preserving federated learning, providing powerful methods and guarantees for protecting user data privacy. The introduction of homomorphic encryption and differential privacy allows individual users' private information to be better protected during federated learning and enables cooperative computation and model training across devices to be performed more reliably.

Preliminaries

Federated Learning
Federated learning is a distributed machine learning approach that allows model training and optimization without moving datasets from their sources to a centralized server. Users send only local models trained on local data to a centralized server for aggregation, thus jointly training a global model. The aggregation formula is w_g = Σ_{n=1}^{N} (k_n / K) · w_n, where K denotes the total number of samples, k_n denotes the number of samples of the n-th client, and N is the number of clients.
During each iteration, each client proceeds as follows: first, the current global model w_g^t is downloaded from the server. Next, the model is trained on the client's own local dataset using a stochastic gradient descent (SGD) optimizer. During the SGD process, the model parameters of the n-th client in the (t + 1)-th round of training are updated as w_n^{t+1} = w_g^t − η ∇L(w_g^t; β), where β denotes a batch randomly sampled from the local dataset, η denotes the learning rate, and L denotes the loss function.
After training, each client submits its local model w_n^{t+1} to the server. Finally, the server aggregates the local models w_n^{t+1} submitted by the clients to obtain a new global model w_g^{t+1}, which will be used in the next round of iteration.
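The per-round flow above can be sketched as follows. This is a minimal illustration: `local_update` stands in for the client's SGD step (its perturbation is faked so the sketch stays self-contained), and `fedavg` implements the weighted average w_g = Σ (k_n/K) · w_n. All names are illustrative, not from the paper.

```python
import random

def local_update(w_global, lr=0.1):
    # Placeholder for SGD on the client's private data: w <- w - eta * grad L.
    # The "gradient" here is random jitter, standing in for real training.
    return [wi - lr * random.uniform(-1, 1) for wi in w_global]

def fedavg(local_models, sample_counts):
    # w_g = sum_n (k_n / K) * w_n, with K the total number of samples.
    K = sum(sample_counts)
    dim = len(local_models[0])
    return [sum(k / K * w[j] for w, k in zip(local_models, sample_counts))
            for j in range(dim)]

# Three clients with different dataset sizes, a 4-parameter "model".
w_g = [0.0] * 4
sample_counts = [100, 300, 600]
for _round in range(3):
    locals_ = [local_update(w_g) for _ in sample_counts]
    w_g = fedavg(locals_, sample_counts)
assert len(w_g) == 4
```

Clients holding more samples pull the global model proportionally harder, which is exactly the k_n/K weighting in the aggregation formula.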

System and Threat Model
Figure 1 depicts the system model of our proposed scheme. We also discuss the threat model, security model, and design goals.

System Model
In this paper, we focus on how to aggregate a global model in federated learning while protecting users' private data, as well as enabling users to validate the server's aggregation results. Moreover, the global model can still be correctly aggregated when some users drop out. The system model in this paper involves three main parties: users, a fog node, and a server.
• Users. Each user trains a local model on its own private dataset, masks the resulting gradients, and submits the ciphertext to the fog node.
• Server. The fog node dispatches the ciphered gradients to the server, which then conducts the aggregation computation without breaching user confidentiality. The server's access is limited to the aggregate of the locally encrypted model gradients, precluding any direct exposure to users' unencrypted gradients. As an added measure, the server creates a proof of correctness for its aggregated results, subsequently transmitting this proof alongside the aggregated results to the respective users.
• Fog Node. The Fog Node (FN), equipped with robust computing, storage, and networking functionalities, is strategically positioned in proximity to the users. Its primary duty involves gathering data from all online users involved in the current round. Following this collection phase, the aggregated information is transmitted to the central aggregation server for further processing.

Threat Model
In this paper, we posit that every participant, including all users, the fog node, and the server, operates under a semi-honest assumption: they adhere accurately to the protocol's stipulations, yet they are curious about their peers' privacy and will endeavor to deduce others' confidential data from the local models submitted by other users. Beyond this semi-honest threat scenario, we also contemplate a more severe one in which, at any juncture, the server might be compromised by malicious actors. In such a case, the server could deviate from the protocol, deliver erroneous computations, and attempt to deceive users into accepting flawed global parameters. Notably, our proposed scheme tolerates collusion between any two of the aggregation server, a user, and the fog node without revealing a specific user's private information. However, it should be acknowledged that when the server and the fog node collude, it becomes feasible to fabricate aggregation outcomes and verification proofs; our scheme therefore does not allow the simultaneous collusion of all three parties.

Security Model
We assume that there is a risk of data interception by malicious parties during user-server communication. To counteract this, we build our strategy on the computational Diffie-Hellman (CDH) problem. We assume that the CDH problem is computationally hard, so the probability that any probabilistic polynomial-time attacker (PPTA) solves it is negligible [40].

Design Goal
For the federated learning scenario, our goal is to propose a robust, privacy-preserving framework for federated learning networks that enables data aggregation. The goals are as follows:

•

Privacy preservation: In our proposed scheme, the utmost importance is given to safeguarding users' privacy in federated learning. At no point can an attacker access any genuine information regarding a user's data.

•

Verifiability: Users can check whether the aggregated results that the server returns are correct by performing a simple calculation. This process is designed to be lightweight, ensuring that users can confirm the correctness of the server's aggregated results at little cost.

•
Fault tolerance: Even if some users drop out or fail to upload their local model gradients in time due to various reasons, the federated learning system is still capable of correctly aggregating the gradients from the remaining normally active users to form a global model gradient.

Methodology
In this section, we describe a verifiable privacy-preserving data aggregation scheme for IoT-based federated learning in detail.We give the system process in Figure 2.

System Initialization
In our system, we envision a setup with a total of n users, represented by the set U, where each user i is assigned a unique identifier ranging from 1 to n. The server begins by setting up the initial model parameter w_0 and choosing a large prime number p. We define G_1 and G_2 as two distinct multiplicative cyclic groups of the same prime order p, with g_1 and g_2 serving as their respective generators.
The server also picks a secret number κ ∈ Z_p^* and calculates h_1 = g_1^κ and h_2 = g_2^κ. Additionally, the server defines a bilinear map e : G_1 × G_2 → G_T and selects two hash functions H_0 : {0, 1}* → Z_{p^2}^* and H_1 : {0, 1}* → G_1. After these preparations, the server publishes the system parameters, which include p, w_0, H_0, H_1, G_1, G_2, e, h_1, h_2, g_1, and g_2. Furthermore, each user U_i independently selects a private key sk_i ∈ Z_p^*, the fog node (FN) chooses its own private key sk_FN ∈ Z_p^*, and the server selects a private key sk_CS ∈ Z_p^*.

Keys Broadcasting
In this phase, user U_i randomly selects a secret number r_i ∈ Z_p^* and computes y_i = g_1^{r_i}. The FN computes H_0(t)^{sk_FN} for epoch t. The server computes H_0(t)^{sk_CS} and h_2^{sk_CS}. After that, user U_i sends y_i to the FN, and the server sends H_0(t)^{sk_CS} and h_2^{sk_CS} to the FN. After the FN receives the information from all users, it denotes this group of users by U_1. Then, the FN sends ({y_i}_{i∈U_1}, H_0(t)^{sk_FN}, H_0(t)^{sk_CS}, h_2^{sk_CS}) to the users in U_1.

Local Training and Masking
In this phase, user U_i trains a local model w_{i,t} on their own private dataset for epoch t. The user then computes the blinding factor Y_i, generates the gradient ciphertext C_{i,t}, and computes three pieces of auxiliary information Au^1_{i,t}, Au^2_{i,t}, and Au^3_{i,t}. After that, user U_i sends (C_{i,t}, Au^1_{i,t}, Au^2_{i,t}, Au^3_{i,t}) to the FN.
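As a hedged illustration of how ring-based blinding factors can cancel under aggregation: the paper's exact formulas for Y_i and C_{i,t} are not reproduced above, so this sketch assumes the common construction Y_i = (y_{i+1}/y_{i-1})^{r_i}, which is consistent with the terms appearing in the collusion and fault-tolerance proofs. The group here is Z_p^* for a toy prime rather than the paper's G_1.

```python
import random

# Toy demo of ring-cancelling blinding factors over Z_p^* (p is the Mersenne
# prime 2^127 - 1; the paper uses a 512-bit prime and the group G_1).
p = (1 << 127) - 1
g = 3
n = 5

r = [random.randrange(1, p - 1) for _ in range(n)]   # users' secrets r_i
y = [pow(g, ri, p) for ri in r]                      # broadcast y_i = g^{r_i}

def blind(i):
    # Assumed form Y_i = (y_{i+1} / y_{i-1})^{r_i} mod p (illustrative).
    inv_prev = pow(y[(i - 1) % n], p - 2, p)         # Fermat inverse, p prime
    return pow(y[(i + 1) % n] * inv_prev % p, r[i], p)

w = [random.randrange(0, 1000) for _ in range(n)]    # quantized gradients
c = [pow(g, w[i], p) * blind(i) % p for i in range(n)]  # masked ciphertexts

agg = 1
for ci in c:
    agg = agg * ci % p

# The exponents r_i*r_{i+1} - r_i*r_{i-1} telescope to 0 around the ring,
# so the product of ciphertexts reveals only g^{sum of w_i}.
assert agg == pow(g, sum(w), p)
```

No single ciphertext can be unmasked without the neighbors' secrets, yet the product of all ciphertexts depends only on the sum of the gradients, which is what makes the aggregation privacy-preserving.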

Messages Collecting
When the FN receives (C_{i,t}, Au^1_{i,t}, Au^2_{i,t}, Au^3_{i,t}) from user U_i, it adds U_i to U_2. The FN then computes the unmasking key Au^FN_t and the auxiliary value Au^3_t. Afterwards, the FN sends ({C_{i,t}}_{i∈U_2}, {Au^2_{i,t}}_{i∈U_2}, Au^FN_t) to the server, and sends Au^3_t to the users in U_2.

Aggregating and Unmasking
After the server receives the message from the FN, it executes the aggregation protocol and generates the corresponding proof of correctness. The server first performs the aggregation computation and obtains the global gradient W_t; we give a full description in Section 6.1. Furthermore, the server computes the proof Au^2_t. Finally, the server sends the aggregation result and the corresponding proof of correctness (W_t, Au^2_t) to the users in U_2.

Verification
After receiving the result and the proof from the server, each user verifies the correctness of the aggregation result using the proof Au^2_t and the auxiliary information Au^3_t. If the verification equation holds, every user accepts the aggregated result and proceeds to update their local model; otherwise, they disregard the aggregated result and advance to the next round.

Correctness
Theorem 1. If the server faithfully executes the aggregation protocol, it can aggregate the global gradients.

Proof of Theorem 1. Given the encrypted gradients {C_i}_{i∈U_2}, the server can aggregate the global gradients W_t with the auxiliary information Au^FN_t sent by the FN: the server first performs the aggregation computation and obtains the global gradient W_t of the current epoch t. That is, if the server faithfully executes the aggregation protocol, the global gradients are correctly aggregated.

Theorem 2. If the server faithfully executes the aggregation protocol, the aggregated global gradients pass the users' verification.

Proof of Theorem 2. Given the encrypted gradients {C_i}_{i∈[1,n]}, the server can aggregate the global gradients W_t with the auxiliary information Au^FN_t sent by the FN. It can then compute the proof Au^2_t and send the global gradients W_t and the proof Au^2_t to the users, while the FN sends the auxiliary information Au^3_t to the users. Based on W_t, Au^2_t, and Au^3_t, each user can use the bilinear pairing to verify the correctness of W_t with Equation (11). If Equation (11) holds, the aggregated global gradients pass the users' verification.

Privacy Protection
Theorem 3. The server, the FN, and any curious user can learn nothing about the private information of user U_i.

Proof of Theorem 3. We assume that any probabilistic polynomial-time adversary (PPTA), such as the honest-but-curious server, the FN, or another user, can eavesdrop on any communication between user U_i, the FN, and the server. Moreover, the PPTA successfully obtains all messages transmitted between user U_i and the FN, including but not limited to (y_i, y_{i+1}, y_{i-1}, C_{i,t}, H_0(t)^{sk_FN}). It is impossible for any PPTA to obtain the private information w_{i,t} without knowing the blinding factor Y_i and H_0(t)^{sk_FN · sk_i}. However, computing Y_i and H_0(t)^{sk_FN · sk_i} from (y_i, y_{i+1}, y_{i-1}, H_0(t)^{sk_FN}) is a CDH problem. That is, for any PPTA, inferring the private value w_{i,t} during training under our proposed scheme is at least as difficult as the CDH problem in G_1. Thus, no PPTA can learn anything about the private information of user U_i.

Theorem 4. Even if users U_{i+1} and U_{i-1} and the FN collude, it is infeasible to deduce any private information of user U_i.
Proof of Theorem 4. We assume that users U_{i+1} and U_{i-1} and the FN collude. User U_i sends (C_{i,t}, Au^1_{i,t}, Au^2_{i,t}) to the FN. The FN randomly selects a number r_i' ∈ Z_p^* and sends y_i' to users U_{i-1} and U_{i+1}. U_{i-1} and U_{i+1} compute Y'_{i-1} = (y_i/y_i')^{r_{i-1}} and Y'_{i+1} = (y_i'/y_i)^{r_{i+1}}, respectively, and send them to the FN. The FN can then remove the blinding factor Y_i: it first computes Y_i' = (y_{i-1}/y_{i+1})^{r_i'} and strips Y_i from the ciphertext, leaving w_{i,t} still protected by H_0(t)^{sk_FN · sk_i}. Obtaining the private information w_{i,t} is impossible without knowing the blinding factor H_0(t)^{sk_FN · sk_i}, and computing H_0(t)^{sk_FN · sk_i} from H_0(t)^{sk_FN} is a CDH problem. That is, for any PPTA, inferring the private value w_{i,t} during training under our proposed scheme is at least as difficult as the CDH problem in G_1. Thus, even if users U_{i+1} and U_{i-1} and the FN collude, it is infeasible to deduce any private information of user U_i.

Theorem 5. Even if users U_{i+1} and U_{i-1} and the server collude, it is infeasible to deduce any private information of user U_i.
Proof of Theorem 5. We assume that users U_{i+1} and U_{i-1} and the server collude. The server obtains (C_{i,t}, H_0(t)^{sk_FN}). The server randomly selects a number r_i' ∈ Z_p^* and sends y_i' to users U_{i-1} and U_{i+1}, who compute Y'_{i-1} and Y'_{i+1}, respectively, and send them to the server. The server can then remove the blinding factor Y_i: it first computes Y_i' = (y_{i-1}/y_{i+1})^{r_i'} and strips Y_i from the ciphertext. Obtaining the private information w_{i,t} is still impossible without knowing the blinding factor H_0(t)^{sk_FN · sk_i}, and computing H_0(t)^{sk_FN · sk_i} from H_0(t)^{sk_FN} is a CDH problem. That is, for any PPTA, inferring the private value w_{i,t} during training under our proposed scheme is at least as difficult as the CDH problem in G_1. Thus, even if users U_{i+1} and U_{i-1} and the server collude, it is infeasible to deduce any private information of user U_i.

Verifiability
Theorem 6. Even if some users and the server collude, the scheme achieves aggregate unforgeability and verifiability.

Proof of Theorem 6. We assume that some users and the server collude and that the server obtains h_2^{sk_CS · Σ_{i=1}^n sk_i}. If the server wants to forge the proof of correctness of the aggregated results, it must obtain H_1(t)^{sk_CS · Σ_{i=1}^n sk_i}; in other words, the server must obtain Σ_{i=1}^n sk_i. However, recovering Σ_{i=1}^n sk_i from h_2^{sk_CS · Σ_{i=1}^n sk_i} is a discrete logarithm problem. That is, for the aggregation server, falsifying the proof of correctness of the aggregation result during training under our proposed scheme and convincing all honest users is at least as difficult as the discrete logarithm problem. Thus, even if some users and the server collude, the scheme achieves aggregate unforgeability and verifiability.

Fault Tolerance
Theorem 7. Even if some users drop out, the server is still able to aggregate the encrypted gradients of the remaining online users.
Proof of Theorem 7. Suppose that a user U_i fails to execute the protocol after the Keys Broadcasting phase, but U_{i+1} and U_{i-1} have already used U_i's parameter y_i = g_1^{r_i} to compute their blinding factors Y_{i+1} and Y_{i-1}. We therefore need to eliminate the effect of U_i's dropout, or the server will not be able to aggregate correctly. Two scenarios must be considered.

Scenario I: User U_i drops out accidentally and does not send a message to the FN; the FN issues an announcement to recruit a new user, and a new user U_{i'} is willing to join the system. The FN sends the system parameters and the required key material to user U_{i'}. User U_{i'} trains a local model w_{i',t} on their own private dataset, randomly selects a secret number r_{i'} ∈ Z_p^*, and sends its masked message to the FN. When the FN receives this message, it sends y_{i'} to users U_{i-1} and U_{i+1}, who compute their correction terms, respectively, and send them to the FN, which combines them into a replacement ciphertext C'_{i',t}.

Scenario II: User U_i drops out accidentally and does not send a message to the FN; the FN issues an announcement to recruit a new user, but no new user is willing to join the system. The FN randomly selects a number r_{i''} ∈ Z_p^* and sends y_{i''} to users U_{i-1} and U_{i+1}, who compute their correction terms, respectively, and send them to the FN. The FN first computes Y_{i''} = (y_{i+1}/y_{i-1})^{r_{i''}} and then computes a fake gradient ciphertext C_{i'',t}.

With C'_{i',t} and C_{i'',t}, the FN and the server can eliminate the adverse effects of user U_i dropping out, and the server can aggregate correctly.

Experimental Setup
The scheme proposed in this paper is mainly intended for IoT scenarios characterized by resource-limited clients and resource-rich servers. Therefore, this paper is mainly concerned with the computational and communication costs on the user side. To implement a prototype, we utilized a library written in Java [41]. Table 2 shows the details of the hardware and software configurations used for the experiments.
The large prime p is set to 512 bits in our experiments, and the elements of the cyclic group G_1 are 512 bits long. For model training, we utilized three widely recognized public datasets: MNIST, Fashion-MNIST, and CIFAR-10. Each dataset consists of a training set and a testing set. We split the training set of each dataset into twenty equal segments, allocating each segment to one of twenty users. In every training iteration, ten of these users were randomly chosen to participate.
We employed these three datasets to train three distinct CNN models. For both MNIST and Fashion-MNIST, the architecture of the models we constructed was identical, featuring two convolutional layers followed by two fully connected layers. However, the input size of the first fully connected layer differed between the two models: for MNIST it was 1024, whereas for Fashion-MNIST it was 33856. As for CIFAR-10, the model design incorporated two convolutional layers followed by three fully connected layers.

Hyperparameter Settings
During local model training, this paper uses the stochastic gradient descent (SGD) optimizer for all three datasets mentioned above. Table 3 lists the experimental settings for the different datasets. Typically, model parameters are represented as floating-point numbers, but to encrypt these parameters we need to convert them into positive integers. Following the method of [42], we transform floating-point numbers into positive integers as follows: each floating-point number w is scaled to a positive integer x = α × (w + L), where α = 10^k is a scaling factor, L is a sufficiently large positive integer, and k represents the precision level. The choice of k influences the model's accuracy.
Figures 3-5 highlight the discrepancy in model accuracy between the FedAVG [2] method using floating-point numbers and the variant that quantizes model parameters to positive integers. Experimental findings reveal that when k is set to 4, the FedAVG method employing floating-point numbers outperforms the quantized approach in terms of accuracy. However, as k increases, the gap in accuracy between the two methods markedly narrows. Specifically, when k reaches 5, the difference in accuracy falls within an acceptable range; by the time k hits 7, the accuracy levels of both methods are virtually indistinguishable.

Computational and Communication Costs

Complexity Analysis
We analyze the computational and communication complexity of the proposed scheme. It is easy to see that each user's computational complexity can be broken up as (1) O(n) in LocalTrainingAndMasking and (2) O(n) in AggregatingAndUnmasking. To summarize, the computational complexity and the communication complexity of each user are O(n) and O(n), respectively.

Computational Cost
The computational cost of each user in each epoch is shown in Table 4. The results show that the main computational overhead for each user lies in the Local Training and Masking phase, because in this phase each user must mask the trained local model gradients and compute three pieces of auxiliary information. The cost of this phase also increases with the number of gradients, whereas the costs of the other phases do not.

In addition, we evaluated the computational cost incurred by each user in scenarios involving user dropout. As depicted in Figure 6a, the computational cost per user remained relatively stable, showing no considerable variation as the ratio of users dropping out escalated. This is because, regardless of the user dropout rate, each user is only required to compute two auxiliary pieces of information in the worst case, a computation that takes less than one millisecond. Consequently, despite an increase in the proportion of users exiting the system, the computational cost on a single user remains relatively constant.

Furthermore, we evaluated the difference in computational cost between the settings with and without verification; the result is shown in Figure 7. This result demonstrates that the verification cost in our scheme is very low and stable, increasing with neither the number of gradients nor the number of users. It is also apparent that the computational cost of each user is affected only by the number of gradients, not by the number of users.

To comprehensively evaluate the performance of our proposed scheme, we benchmarked it against existing approaches, namely SA [16], Versa [17], and ESVFL [43]. Comparative results are illustrated in Figure 8. These results make it evident that our proposed scheme significantly reduces computational overhead on the user side compared to the alternative methods. Specifically, our scheme decreases computational costs by at least 85% and 89% relative to SA and Versa, respectively. This disparity in performance is largely due to the fact that both SA and Versa rely on an elliptic-curve Diffie-Hellman key exchange to construct n − 1 encryption keys for secure communication and compute n − 1 pseudorandom number seeds for generating the cryptographic parameters used to mask local model gradients. Moreover, Versa requires two masking procedures to enable verification. Notably, in our scheme, each user's computational load does not scale with the number of other participants. Hence, as the number of users grows, the superiority of our proposed scheme over the others becomes increasingly pronounced.
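Why per-user masking cost scales with the gradient count but not the user count can be seen in a toy additive-masking sketch. This is not the paper's actual protocol: the prime, the PRG stand-in, and the assumption that the aggregator can reconstruct the aggregate mask (which the real scheme derives from the users' auxiliary information) are all simplifications of ours.

```python
import random

P = 2 ** 61 - 1  # toy prime; the paper's experiments use a 512-bit prime

def mask_vector(seed, n):
    """Derive an n-entry one-time mask from a single per-user seed.
    Each user's cost is O(n) in the gradient count, independent of user count."""
    rng = random.Random(seed)
    return [rng.randrange(P) for _ in range(n)]

def mask_gradients(grads, seed):
    """Blind quantized gradients additively modulo P."""
    m = mask_vector(seed, len(grads))
    return [(g + r) % P for g, r in zip(grads, m)]

def aggregate_and_unmask(masked_lists, seeds, n):
    """Sum masked vectors, then strip the aggregate mask (here recomputed
    from the seeds; a stand-in for the scheme's unmasking step)."""
    total = [sum(col) % P for col in zip(*masked_lists)]
    agg_mask = [0] * n
    for s in seeds:
        for j, r in enumerate(mask_vector(s, n)):
            agg_mask[j] = (agg_mask[j] + r) % P
    return [(t - m) % P for t, m in zip(total, agg_mask)]
```

In contrast, SA-style pairwise masking requires each user to derive n − 1 shared secrets, so the per-user cost there grows with the number of participants.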

Communication Cost
We recorded and analyzed the communication performance under varying conditions and compared our proposed scheme against existing methods, including SA [16], Versa [17], and ESVFL [43]. The comparative results are depicted in Figure 9. These results clearly demonstrate that our proposed scheme has a significant advantage in terms of communication cost. Specifically, our scheme achieves reductions in communication cost of at least 47%, 69%, and 75% compared to SA, Versa, and ESVFL, respectively. This superior performance stems from the fact that in SA and Versa, each user must transmit a large volume of parameters to either the cloud server or other users, thereby increasing communication overhead. Similarly, in ESVFL, each user is required to send four encrypted gradients, which also contributes to a higher communication burden. Notably, our scheme demonstrates remarkable stability in communication cost, unaffected by an increase in the number of users, with variations primarily influenced by changes in the number of gradients.
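A back-of-envelope model makes the per-user upload concrete. Assuming (our simplification) that each masked gradient is transmitted as one 512-bit element, matching the configured length of elements in G_1:

```python
GROUP_ELEM_BITS = 512  # element length in G_1, as configured in the experiments

def upload_mb(num_gradients, extra_elems=0):
    """Per-user upload in MiB if each masked gradient (plus any auxiliary
    elements) is one 512-bit element. A rough cost model, not measured data."""
    bits = (num_gradients + extra_elems) * GROUP_ELEM_BITS
    return bits / 8 / 2**20

mb = upload_mb(10**4)  # roughly 0.61 MB for 10^4 gradients
```

Under the same assumption, the two worst-case auxiliary pieces of information add only a few hundred bytes, consistent with the sub-0.05 MB dropout overhead reported below.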
Similarly, we evaluated the communication cost per user when users drop out. As shown in Figure 6b, there was no significant fluctuation in the communication overhead per user as the proportion of dropout users increased. This stability is attributed to the fact that, irrespective of fluctuations in the user dropout rate, each active user is required to transfer merely two auxiliary pieces of information in the worst case, a process consuming less than 0.05 MB of communication resources. Consequently, despite a rise in the dropout user ratio, the communication cost on a single user does not experience a significant escalation.

Conclusions
In this paper, we propose an efficient, verifiable, and privacy-preserving data aggregation scheme for federated learning. The scheme features lower computational and communication costs, making it highly resistant to inference attacks and suitable for scenarios involving IoT devices with relatively weak computational and communication capabilities. To effectively achieve privacy protection and ensure the verifiability of aggregated results, we devised a lightweight encryption protocol and leveraged fog nodes to mitigate the computational and communication costs for users. In addition, we designed a user management mechanism that makes the proposed solution fault-tolerant: even if some users drop out due to various issues, aggregation can still complete successfully, and users can dynamically join the federated learning process. In the worst case, the incremental computational cost incurred by users remains below 1 millisecond, and the additional communication cost is kept under 0.05 MB. The experiments were conducted on three datasets commonly used in academia and compared with other existing schemes. The experimental results show that the proposed scheme achieves high accuracy while protecting the privacy of users' data. In the future, we will consider how to support the detection and prevention of malicious users.

• Users: Data ownership and local execution of the training algorithm fall under the purview of the users. Upon finishing a training round, users encrypt the local gradient parameters and transmit them to the fog node. Moreover, users receive the encrypted global gradient parameters that have been aggregated by the aggregation server. They then update their local models and proceed to conduct the next round of model training with their respective datasets. This iterative process continues until the model's training is deemed successful. Users also possess the capability to verify the correctness of the aggregation results produced by the server.

Figure 6 .
Computational and communication costs of each user with different dropout rates, |U| = 100, |G| = 1 × 10^4. (a) Computational cost with different dropout rates. (b) Communication cost with different dropout rates.

Figure 7 .
Computational cost comparison between verification and non-verification for each user. (a) |U| = 100, with different numbers of gradients. (b) |G| = 1 × 10^5, with different numbers of users.

Figure 9 .
Communication cost comparison of each user. (a) |U| = 100, communication cost comparison with different numbers of gradients. (b) |G| = 1 × 10^4, communication cost comparison with different numbers of users.

Table 1 .
Advantages and disadvantages comparison among the related schemes. "L" denotes lightweight cryptography. "F" denotes fault tolerance against users dropping out. "I" denotes individual keys without trusted third parties. "H" denotes high accuracy of the global model.

Table 2 .
Software and hardware configurations.

Table 3 .
Experimental settings on different datasets.
computing the auxiliary information Au^2_{i,t} and encrypting the local model w_{i,t} in LocalTrainingAndMasking, which takes O(n) time in total; (3) when some users drop out accidentally, computing a parameter Y′ in MessagesCollecting, which takes O(1) time. During each training round, across its four distinct phases, the communication cost incurred by each participating user amounts to O(1) in SystemInitialization, O(1) in KeyBroadcasting,

Table 4 .
Computation cost of users at each epoch (ms). "Phase I" denotes system initialization. "Phase II" denotes keys broadcasting. "Phase III" denotes local training and masking. "Phase IV" denotes message collecting. "Phase V" denotes verification.