A Truthful Reverse Auction Mechanism for Federated Learning Utility Maximization Resource Allocation in Edge–Cloud Collaboration

Abstract: Federated learning is a promising technique in cloud and edge computing environments, and designing a reasonable resource allocation scheme for federated learning is particularly important. In this paper, we propose an auction mechanism for federated learning resource allocation in the edge–cloud collaborative environment, which can motivate data owners to participate in federated learning and effectively utilize the resources and computing power of edge servers, thereby reducing the pressure on cloud services. Specifically, we formulate the federated learning platform's data value maximization problem as an integer programming model with multiple constraints, develop a resource allocation algorithm based on a monotone submodular value function, devise a payment algorithm based on critical price theory and demonstrate that the mechanism satisfies truthfulness and individual rationality.


Introduction
Traditional centralized machine learning methods require users' source data to be aggregated at a central server for model training. However, this approach risks leaking user privacy, which makes users unwilling to contribute their data for model training and, thus, hinders the development of artificial intelligence technology. To address the user data privacy issue in machine learning, Google introduced the concept of federated learning (FL), a distributed machine learning method. Unlike the traditional centralized approach, data owners train models on their local data and then send the models to the cloud [1,2], iterating this step until the model converges. A key point of FL is how to select users to participate in training, and designing a reasonable mechanism for user selection and payment calculation is a challenge for FL [3].
To exploit cloud computing and edge computing resources and to address the resource allocation and pricing issue of federated learning in cloud–edge collaboration scenarios, we propose a multi-agent federated learning framework, as shown in Figure 1: (1) the FL platform publishes training tasks; (2) each data owner evaluates the value of its data and sends its resource request, data value, bandwidth request and bid to the FL platform for decision making; (3) the FL platform decides the winners by an auction mechanism; (4) the winning users start participating in FL training; (5) the FL platform pays the corresponding rewards to the winning users. In summary, multiple edge servers are deployed in different regions; each aggregates the local models of the Internet of Vehicles (IoV) nodes it collects and then transmits the edge model to the FL platform for aggregation, iterating this process until a high-quality model is obtained. Finally, the FL platform pays rewards to all winning users. In this process, only the trained local model is transmitted, which greatly protects the privacy of users' local data.

Motivation
In most federated learning mechanism design studies, data owners are directly connected to the cloud to transmit model data [15], which not only under-utilizes the computing resources of edge servers but also imposes considerable pressure on the resources and computing power of the cloud [16]. Edge computing can address this issue well. In a mechanism design study on the network flow model, ref. [9] proposed a strategy-proof mechanism to address the utility maximization problem of unmanned aerial vehicle base station network transmissions. Similarly, ref. [10] proposed an auction mechanism to allocate edge server resources. Auctions are an effective strategy to encourage users to participate in market activities. In federated learning, the FL platform procures the use of the data collected by users; therefore, reverse auctions are the natural fit. In addition, in terms of data privacy, after the local model is trained, only the model needs to be transmitted to the cloud after aggregation through the edge server, which protects the privacy of the user's local data.
Furthermore, the value of the data greatly affects the quality of the local models and, in turn, the quality of the final federated learning model. Selecting the most valuable data is an important challenge of FL. The FL platform is willing to pay higher fees to recruit users with high-value data, and data owners (also known as users) also evaluate their local data and bid, which poses winner determination and data value pricing problems. For example, in [17], the data value function takes into account information such as data size and weather conditions.
Moreover, another challenge in building an auction mechanism is how to decide whether a user should be selected. The usual practice is to design an operator in the selection algorithm, calculate the operator value for each user and, finally, select according to the ranking. However, constructing the operator has potential drawbacks, such as inappropriate parameter settings, and the subjective intention of the designer has a significant impact on the operator's performance. Because the federated learning utility maximization problem exhibits diminishing marginal utility, it can instead be formulated and addressed as a submodular optimization problem. The authors of [18] proposed a computationally efficient approximate mechanism under a submodular function model. Subsequently, the authors of [19] used a sequential submodular function model for the mechanism design of mobile crowdsourcing tasks.
Furthermore, users participating in federated learning training need to use part of the bandwidth to receive and transmit models, and the bandwidth size not only affects whether users can communicate stably and wirelessly with edge servers but also directly affects the quality of the final model. However, existing incentive mechanisms do not consider the bandwidth limitations of federated learning in cloud–edge collaborative environments. Therefore, designing appropriate resource allocation strategies for federated learning in edge computing environments is of great practical significance.

Main Contribution
The main contributions of this paper can be summarized as follows: (1) Unlike existing research, we consider both resource constraints and bandwidth constraints at the same time. We transform the data value maximization problem of federated learning in the cloud–edge collaborative environment into an integer programming model with multiple constraints and submodular features. (2) We design a reverse auction mechanism to solve the problem. Specifically, we design a resource allocation algorithm based on the monotone submodular value function and a payment algorithm based on critical price theory, and we prove that the mechanism satisfies truthfulness and individual rationality. In the experiments, we compare the proposed algorithm with the greedy method and the optimal method in terms of system total utility, resource utilization and other metrics and demonstrate its effectiveness.
The remainder of this paper is organized as follows. In Section 2, we propose the FL auction mechanism, formulate the system total utility maximization problem, design the model quality function and data quality function and define the service costs of the FL platform and users. In Section 3, we present the FLRA algorithm and explain the key insights behind it. In addition, we provide an example for illustration and prove the properties of the mechanism. In Section 4, we conduct experiments with a large amount of real data and compare the FLRA algorithm with the greedy algorithm and OPT-FLRA in terms of system total utility, payment, execution time, etc.

Federated Learning Data Value Maximization Problem
To address the challenges of selecting the most valuable data and allocating users, in this section we introduce the relevant concepts of federated learning and design a data quality function to evaluate the quality of user data and a model quality function to reflect the contribution of user data to the federated learning model. Together with the service cost function, these form the total utility function of the system, which we use to solve the problem of maximizing data value in federated learning.

Parameters
Assume that the number of users of the FL platform is N, denoted by the set N = {1, ..., N}. The FL platform deploys a total of M edge servers in the regions where data need to be collected, denoted by the set M = {1, ..., M}. Each edge server has R types of resources, denoted by the set R = {1, ..., R}. Each type of resource in an edge server is limited, and the resource capacities are denoted by the matrix C = [c_1, c_2, ..., c_M], where c_j = (c_j1, c_j2, ..., c_jR), j = 1, ..., M, is the resource capacity vector of edge server j. Therefore, we use c_jk to denote the capacity of resource k of edge server j. The edge server collects the local models of the IoV nodes and forwards them to the FL platform after aggregation. The FL platform aggregates the models from each edge server, sends the updated global model to the IoV nodes through the edge servers and then starts the next round of training. In this paper, we use THP^edge_j to denote the processing-forwarding rate of edge server j and THP^cloud to denote the processing speed of the FL platform. The transmission rates between users and edge servers are denoted by the matrix BW = [bw_1, bw_2, ..., bw_N], where bw_i = (bw_i1, bw_i2, ..., bw_iM), i = 1, ..., N, is the transmission rate vector of user i for each edge server; we use bw_ij to denote the bandwidth between user i and edge server j. The requests of all users are θ = (θ_1, θ_2, ..., θ_N), and the task request submitted by user i to the system is θ_i = (b_i, s_i, v_i, d_i), where b_i is the user's bid and the vector s_i = (s_i1, s_i2, ..., s_iR, s_ibw), i = 1, ..., N, is the user's request for each type of resource and for transmission rate. Therefore, we use s_ik to denote the request of user i for resource k, k = 1, ..., R, and s_ibw to denote the bandwidth request of user i.
d_i is the data size. We use the set W_j, W_j ⊆ N, j ∈ M, to denote the set of workers selected by each edge server and let W = ∪_{j∈M} W_j denote the set of all selected workers.
Here, we give a simple example with 5 users, 3 edge servers and 1 resource type, which we explore in more detail in Section 3.3. The resource capacity of each edge server is 5, i.e., C = [5, 5, 5]. Suppose bw_2 = (0, 2, 1): the transmission rate between user 2 and edge server 2 is 2 M/s (bw_22 = 2), the transmission rate between user 2 and edge server 3 is 1 M/s (bw_23 = 1), and a transmission rate of 0 indicates that user 2 is too far from edge server 1 to transmit (bw_21 = 0).
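The request tuple θ_i = (b_i, s_i, v_i, d_i) and the reachability encoded by bw_i can be made concrete with a small sketch (a minimal illustration; the field and variable names are ours, not the paper's):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class UserRequest:
    """Request theta_i = (b_i, s_i, v_i, d_i) submitted by user i.

    Field names are illustrative; the paper only fixes the tuple's meaning.
    """
    bid: float              # b_i: the user's bid
    resources: List[float]  # s_i: request per resource type, with s_ibw appended
    value: float            # v_i: evaluated data value
    data_size: float        # d_i: size of the local data

# The toy instance from the text: user 2 with transmission-rate vector
# bw_2 = (0, 2, 1), i.e. user 2 cannot reach edge server 1 and reaches
# servers 2 and 3 at 2 M/s and 1 M/s respectively.
bw_2 = (0, 2, 1)
reachable = [j for j, rate in enumerate(bw_2, start=1) if rate > 0]
print(reachable)  # → [2, 3]
```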

The Service Cost of the FL System
The service costs of the FL system include the service costs of the IoV nodes, the edge servers and the FL platform. The service cost c_i^user of IoV node i consists of the cost of collecting data, c_i^col, and the cost of computing, c_i^comp. The user's transmission cost is included in the edge server's transmission cost.
The size of the user's local data is denoted by d_i, generally referring to the size of the driving video. α > 0 is the unit data cost, which is used to calculate the cost of the data collected by the user.
For the edge server, the unit bandwidth cost is denoted by α, the unit computation cost by β and the unit storage cost by γ. The edge server aggregates the collected models and forwards them to the cloud.
The FL platform service cost c^cloud is mainly the bandwidth cost, where α is the unit bandwidth cost of the cloud.

Data Quality Function and Model Quality Function
Before participating in a task, the IoV node needs to evaluate its local data [17]. To quantify the potential contribution of an IoV node to the FL task, we consider two aspects of the local data: the data size and the data distribution. The data quality function is given by Formula (9). The data values of all users are denoted by v = {v_1, ..., v_N}, where k_i is the number of entity classes in the data, l_i is the sum of all category objects, and y_i is the weather condition. In Formula (9), because the bandwidth and resource capacity of the platform are limited, transmitting large amounts of data consumes a significant amount of bandwidth and resources, which is not conducive to being selected by the platform. Generally speaking, the platform prefers a short section of driving records with more vehicles and pedestrians in urban areas rather than a long section of driving records on empty roads.
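Formula (9) itself does not survive in this excerpt. As an illustration only, a hypothetical quality function with the stated qualitative behaviour (rewarding many entity classes k_i and objects l_i, scaled by the weather factor y_i, discounted for large data sizes d_i) might look like:

```python
import math

def data_quality(k_i, l_i, y_i, d_i):
    """Hypothetical stand-in for the paper's Formula (9), which is not
    reproduced in the text. It only mirrors the stated qualitative behaviour:
    quality grows with the number of entity classes k_i and detected objects
    l_i, is scaled by a weather factor y_i, and is discounted for large data
    sizes d_i (large transfers consume scarce bandwidth and resources)."""
    return y_i * k_i * math.log(1 + l_i) / (1 + d_i)

# A short urban clip (many objects, small file) scores above a long clip
# of empty road, as the text describes.
urban = data_quality(k_i=8, l_i=120, y_i=1.0, d_i=2.0)
empty_road = data_quality(k_i=2, l_i=5, y_i=1.0, d_i=20.0)
print(urban > empty_road)  # → True
```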
The model quality ϕ can be expressed as a set function of the set of winners W. The purpose of this paper is to maximize utility; therefore, it is necessary to define the total utility function of the system. The total system utility function consists of the FL platform utility and the data owner utility. The FL platform utility includes the model quality, the edge server costs, the FL cloud costs, the user costs and the payments to data owners. The utility of user i is defined as the difference between the payment p_i paid by the FL platform and the bid b_i; if user i does not win, its utility is 0 (Formula (12)). We denote the total utility of the system by V and formulate the utility maximization problem as the nonlinear programming problem given by Formula (13), where Formula (13a) ensures that a user's bandwidth request cannot exceed the transmission speed between the user and the edge node to which it is connected; Formula (13b) ensures that the sum of the bandwidth received by each edge server cannot exceed its processing-forwarding speed; Formula (13c) ensures that the resource allocation of each edge server cannot exceed the capacity of each type of resource; Formula (13d) ensures that the sum of the bandwidth of all selected edge nodes does not exceed the processing speed of the cloud; and Formula (13e) ensures that the payment given by the FL platform to a user is greater than its bid.
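As described above, the user utility of Formula (12) can be written compactly as:

```latex
u_i =
\begin{cases}
p_i - b_i, & \text{if } i \in W, \\
0, & \text{otherwise,}
\end{cases}
```

and the platform's objective in Formula (13) is to maximize the total system utility V subject to constraints (13a)–(13e).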

Preliminary Mechanism Design
Formula (13) describes an ideal model; however, in practice, users are selfish and may submit untruthful bids to gain greater benefits. Therefore, to encourage users to participate in the auction, the mechanism must ensure individual rationality. To prevent users from submitting untruthful bids, the mechanism must ensure truthfulness. In addition, to quickly obtain the allocation and payment solutions, the mechanism must achieve computational efficiency. For ease of reading, we summarize the frequently used notations in Table 1.
We use θ_i = (b_i, s_i, v_i, d_i) to denote the real request of user i and θ'_i = (b'_i, s'_i, v'_i, d'_i) to denote the declared request of user i. In addition, we assume that users may misrepresent their offers such that b'_i ≠ b_i. We do not discuss the case where users misrepresent the quality of their data v_i, because the quality of the local data is not self-reported by the users but is evaluated by the platform; this approach reduces the risk of user dishonesty compared with existing studies [20,21]. We use θ = {θ_1, ..., θ_N} and θ_{-i} = {θ_1, ..., θ_{i-1}, θ_{i+1}, ..., θ_N} to denote the declared requests of the users submitted to the platform, so that θ = {θ_{-i}, θ_i}.
User utility is an important indicator of the value of the auction, and users always want to maximize their own utility. In this paper, user utility is given by Formula (12). If a user lies about its bid b_i, it is very likely to fail in the competition, and its utility is 0. Based on the above description, we can propose an auction mechanism that satisfies individual rationality and truthfulness.

Definition 1 (Individual rationality). A mechanism ensures individual rationality if it satisfies the following condition: when a user submits its real request θ_i = (b_i, s_i, v_i, d_i), its utility is greater than or equal to 0, that is, u_i ≥ 0. In simple terms, as long as the user participates in the auction and submits its real request, it will never suffer a loss [22].

Definition 2 (Monotonicity). An allocation algorithm ensures monotonicity if the following holds: if user i wins with θ_i = (b_i, s_i, v_i, d_i) and finally pays p_i, then for any bid b'_i ≤ b_i, it also wins with θ'_i = (b'_i, s_i, v_i, d_i) and still pays p_i. In reverse auctions, the lower the user's bid, the higher the probability of winning.

Definition 3 (Critical value). If i ∈ W, then there must exist a critical value cv_i. If the bid of user i satisfies b_i < cv_i, then the user wins; otherwise, user i fails.

Definition 4 (Truthfulness).
A mechanism is truthful if, for each user, given its truthful request θ_i and the requests of the other users θ_{-i}, we have u_i(θ_{-i}, θ_i) ≥ u_i(θ_{-i}, θ'_i), which is equivalent to u_i(θ) ≥ u_i(θ'). Thus, submitting a truthful request is the dominant strategy for each user [23]. The literature [24] states that a mechanism is truthful if its allocation function satisfies monotonicity and its payment function satisfies critical value theory.

Definition 5 (Monotone submodular function). Let W be a finite set. A set function f defined on the subsets of W is monotone submodular if, for any A ⊆ B ⊆ W, f(A) ≤ f(B) (monotonicity) and, for any A ⊆ B ⊆ W and any i ∈ W \ B, f(A ∪ {i}) − f(A) ≥ f(B ∪ {i}) − f(B) (diminishing marginal utility).

Definition 6 (Computational efficiency). An algorithm is computationally efficient if it can be executed in polynomial time. This matters because obtaining an optimal solution to a general submodular maximization problem may take exponential time.

Table 1. Frequently used notations.

N — the set of users
M — the set of edge servers
R — the types of resources for edge servers
C — resource capacity
W — the set of users generated by the allocation stage
W' — the set of users generated by the payment stage
V(W) — the utility of the winner set W
V_{i|W} — the utility of user i joining set W in the allocation stage
V'_{i|W} — the utility of user i joining set W after changing bid b_i
In_j — the in-degree of edge server j in the allocation stage
(a, b) — append vector b to the end of vector a
J — the set of preallocated edge servers for the user
θ' — the user request after changing the bid of a winner
p_i — payment the FL platform finally pays to winner i

Optimal Federated Learning Reverse Auction Mechanism Design
We propose the OPT-FLRA mechanism based on the principles of VCG; it is an auction mechanism that achieves the optimal allocation solution. The optimal allocation can be solved with dynamic programming or column generation theory; in our experiments, we use the IBM CPLEX solver to find the optimal allocation.
Here, p_i is the payment from the FL platform to user i, V(W*) is the maximum utility of the system, and W* is the optimal allocation solution (the optimal set of winners). V(W*_{-i}) is the maximum system utility without user i, and W*_{-i} is the optimal allocation solution in that case. It is easy to see that V(W*) ≥ V(W*_{-i}).
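The payment expression itself does not survive in this excerpt; given the quantities defined here and the VCG principle the section invokes, a payment of the standard form would be consistent with the description (a reconstruction, not a quote of the paper):

```latex
p_i = b_i + \bigl( V(W^*) - V(W^*_{-i}) \bigr),
```

so that V(W*) ≥ V(W*_{-i}) immediately gives p_i ≥ b_i, i.e., individual rationality.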

Multimachine Reverse Auction Mechanism Design for Federated Learning Resource Allocation
This section proposes the auction mechanism for the problem. We first describe the training framework for multimachine federated learning and the design ideas of the allocation algorithm and payment algorithm. Second, we present the pseudocode of the FLRA algorithm. Third, we give an example to illustrate the FLRA algorithm. Finally, we prove that the FLRA algorithm satisfies truthfulness and individual rationality.

Server and User Matching Policy
When deciding which edge server to assign a user to, we choose the edge server with the smallest in-degree among the edge servers to which the user can connect. In other words, if user 1, who is to be assigned, can connect to edge servers 1, 2 and 3, we count how many unassigned users can connect to each of edge servers 1, 2 and 3. User 1 is then assigned to the edge server with the smallest such in-degree.
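The matching policy above can be sketched in a few lines (a minimal sketch; the function and parameter names are ours, and `can_connect(u, j)` stands in for the bandwidth-feasibility check):

```python
def pick_edge_server(user, candidate_servers, unassigned_users, can_connect):
    """Smallest-in-degree matching policy.

    can_connect(u, j) -> bool says whether user u can reach edge server j.
    Among the servers the user can reach, pick the one whose in-degree
    (number of still-unassigned users that can reach it) is smallest,
    spreading users across lightly contended servers."""
    reachable = [j for j in candidate_servers if can_connect(user, j)]
    indegree = {j: sum(1 for u in unassigned_users if can_connect(u, j))
                for j in reachable}
    return min(indegree, key=indegree.get)

# Toy instance echoing the example in Section 3.3: users 1 and 5 reach e1;
# users 1, 2, 3 and 5 reach e2; only user 5 reaches e3.
links = {(1, 1), (5, 1), (1, 2), (2, 2), (3, 2), (5, 2), (5, 3)}
connect = lambda u, j: (u, j) in links
server = pick_edge_server(5, [1, 2, 3], unassigned_users=[1, 2, 3, 5],
                          can_connect=connect)
print(server)  # → 3: e3 has in-degree 1 (only user 5), so user 5 goes to e3
```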

Monotone Submodular Allocation Strategy
In the user selection phase, we use a monotone submodular function to select the user with the highest current gain. The basic principle is to iterate over the candidate users, tentatively add each to the winner set and calculate the resulting gain. We select the user that maximizes the gain of the submodular function as the preselected user. This allocates the limited resources to efficient users, preventing inefficient users from wasting resources and crowding out efficient ones.
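The selection loop above can be sketched as follows (a sketch only; `marginal_value` and `feasible` are our stand-ins for the paper's value function V_{i|W} and its resource/bandwidth checks):

```python
def greedy_select(candidates, marginal_value, feasible):
    """Greedy loop over a monotone submodular value function.

    marginal_value(i, W) plays the role of V(W ∪ {i}) − V(W); feasible(i, W)
    stands in for the resource and bandwidth checks. In each round the
    candidate with the largest positive marginal gain is chosen; the loop
    stops once the best remaining gain is non-positive, as FLRA does."""
    winners = []
    remaining = list(candidates)
    while remaining:
        best = max(remaining, key=lambda i: marginal_value(i, winners))
        if marginal_value(best, winners) <= 0:
            break  # negative marginal utility: stop allocating
        remaining.remove(best)
        if feasible(best, winners):
            winners.append(best)  # infeasible users are simply dropped
    return winners

# Toy submodular value: set coverage, which has diminishing returns.
cover = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c"}}
value = lambda W: len(set().union(*(cover[i] for i in W))) if W else 0
mv = lambda i, W: value(W + [i]) - value(W)
print(greedy_select([1, 2, 3], mv, feasible=lambda i, W: True))  # → [1, 2]
```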

Payment Strategies for Critical Prices
After selecting the users who will participate in the training, payments need to be calculated for each winner. In this paper, we use the binary search method to calculate the payment. This proceeds as follows: first, the user's bid is repeatedly multiplied by 2 until the user is no longer allocated, which determines the upper and lower bounds of the payment interval. We then iterate the following steps in this interval: (1) use the midpoint as the user's bid; (2) run the allocation algorithm to determine whether the user is allocated; (3) adjust the interval accordingly, until the upper and lower bounds are sufficiently close. This gives the payment paid to that winner.
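The doubling-plus-bisection search can be sketched as follows (a sketch; `wins_with_bid` is our stand-in for rerunning the allocation algorithm with the user's bid replaced):

```python
def critical_payment(bid, wins_with_bid, eps=1e-6):
    """Critical-price search by doubling and bisection.

    wins_with_bid(b) reruns the allocation with the user's bid replaced by b
    and reports whether the user is still selected. Doubling a (positive) bid
    finds a losing upper bound; bisection then narrows in on the critical
    value, which is the payment."""
    lo, hi = bid, bid
    while wins_with_bid(hi):      # double until the user no longer wins
        lo, hi = hi, hi * 2
    while hi - lo > eps:          # bisect the bracket [lo, hi]
        mid = (lo + hi) / 2
        if wins_with_bid(mid):
            lo = mid
        else:
            hi = mid
    return lo

# Toy allocation rule: the user wins whenever the bid is below 31.14
# (the critical value of winner 4 in the truthfulness experiment).
p = critical_payment(17.67, wins_with_bid=lambda b: b < 31.14)
print(round(p, 2))  # converges to the critical value
```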

Federated Learning Reverse Auction Mechanism Design Algorithm
In this part, we first propose a truthful auction mechanism, called federated learning reverse auction (FLRA), to achieve the system total utility maximization defined in Section 3.4. FLRA is shown in Algorithm 2, which consists of the allocation algorithm FLRA_ALLOC and the payment algorithm FLRA_PAY.
Algorithm 2 FLRA
1: W, V ← FLRA_ALLOC(C, BW, θ, N)
2: P ← FLRA_PAY(C, BW, θ, N, W)
3: return W, V, P

In the allocation phase, we define V_{i|W} = V(W ∪ {i}) − V(W) as the marginal value of adding user i to the set W. For simplicity, we write V_i instead of V_{i|W} when there is no ambiguity. The main idea of the allocation algorithm is as follows: in each round, we find the user who maximizes the marginal utility in the candidate user set N and then check whether at least one of the edge servers to which the user can connect satisfies the user's resource and bandwidth requirements. If multiple edge servers meet the requirements, we select the edge server with the smallest in-degree. Lines 5–9 obtain the set of edge servers that meet the current user's bandwidth requirements, denoted J, which is the preallocated edge server set; lines 12–14 assign the current user to the edge server with the smallest in-degree; lines 17–18 delete from J any edge server with the smallest in-degree but insufficient resources; lines 20–22 delete the current user from the user set N when none of the edge servers in J can meet the current user's requirements.
In the payment phase, lines 4–12 determine the upper and lower bounds of the payment interval by multiplying the winner's bid by 2 until the user is no longer allocated, and lines 13–22 use the binary search method to calculate the user's payment. We use Tables 2–4 to demonstrate how the mechanism is implemented. For simplicity, we have 5 users, denoted u_1, u_2, u_3, u_4, u_5; 3 edge servers, denoted e_1, e_2, e_3; and 1 edge server resource type. The resource capacity of each edge server is c = (5), and the processing speed of the edge servers is THP^edge = 5.

Example of FLRA Execution
From Table 4, we can see that in the first allocation, W = ∅ and V_{5|W} = 1234.08. The bandwidth requirement of u_5 is s_5bw = 1 and bw_5 = (1, 2, 3); therefore, all three edge servers can meet the bandwidth requirement. u_1 and u_5 can connect to e_1; u_1, u_2, u_3 and u_5 can connect to e_2; u_5 can connect to e_3; each meets its own bandwidth requirement, so the in-degrees of the edge servers are 2, 4 and 1, respectively. This allocation therefore assigns u_5 to the edge server with the smallest in-degree, e_3, and updates (c_3, THP^edge_3) = (3, 4). In the second allocation, W = {u_5} and V_{1|W} = 652.50. u_1 can connect to e_1 and e_2, with in-degrees of 1 and 3, respectively, so this allocation assigns u_1 to the edge server with the smallest in-degree, e_1, and updates (c_1, THP^edge_1) = (4, 3). In the third allocation, W = {u_5, u_1} and V_{2|W} = 499.87. u_2 can connect to e_2, with an in-degree of 2; therefore, this allocation assigns u_2 to e_2 and updates (c_2, THP^edge_2) = (2, 3). In the fourth allocation, W = {u_5, u_1, u_2} and V_{3|W} = 410.60. u_3 can connect to e_2, with an in-degree of 1, so u_3 would be assigned to e_2, but the allocation fails because the remaining resources of e_2 are insufficient. In the fifth allocation, W = {u_5, u_1, u_2} and V_{4|W} = −0.80; the marginal value is negative, and the allocation fails.

Lemma 1. FLRA satisfies monotonicity.

Proof. If user i wins with requirement θ_i = (b_i, s_i, v_i, d_i), then it will still win if it reduces its bid. That is, if the other users do not change their requirements and user i changes its requirement to θ'_i = (b'_i, s_i, v_i, d_i) with b'_i ≤ b_i, then V'_{i|W} ≥ V_{i|W} and user i still wins. Hence, FLRA satisfies monotonicity.

Lemma 2. FLRA satisfies critical value theory.
Proof. When the other users do not change their requirements, the final payment p_i of winner i is the upper bound of its bid b_i: when b_i ≤ p_i, then i ∈ W', and when b_i > p_i, then i ∉ W'. Here, W' is the winner set obtained when only the bid of user i is changed.

Theorem 2. FLRA satisfies individual rationality.

Proof. Assume that user i is a winner. We need to prove that if user i submits its true requirement, its utility is non-negative. We use the binary search method as the payment pricing algorithm in FLRA_PAY. In line 4 of the payment algorithm, we repeatedly multiply the user's bid by 2; therefore, the lower bound of the payment interval is b_i. The final payment satisfies p_i ≥ b_i; therefore, the utility of winner i is u_i = p_i − b_i ≥ 0, and FLRA satisfies individual rationality.

Theorem 3. FLRA satisfies computational efficiency.
Proof. According to Algorithms 3 and 4, the time complexity of FLRA_ALLOC is O(mn), and the time complexity of FLRA_PAY is O(mn²). Therefore, FLRA is a polynomial-time algorithm.

Experiment

Experimental Setting
We conduct experiments on the ApolloScape dataset, which contains 10 km areas around three sites in three cities; each area is scanned repeatedly under different weather conditions. From the dataset, we extract the information needed for the experiments, such as object category, number of categories, video size and weather condition.
The hardware configuration of the experimental platform includes a 6-core Intel Core i5 CPU (Intel Corporation, Santa Clara, CA, USA), 8 GB memory and 500 GB storage. The experimental settings are as follows:

• We constructed a virtual map (Figure 3b) based on the real map (Figure 3a) and randomly selected edge server coordinates and user coordinates in it.
• In the user bid setting, we define the user's bid as the user's data value multiplied by a random value.
• For each experiment, we randomly generate 100 samples and use the average values of these sample attributes to eliminate the impact of randomization.
• We use IBM CPLEX to implement the optimal allocation algorithm and the optimal payment algorithm.
• We use the Python programming language to implement the FLRA algorithm and the comparison greedy algorithm.
• For the edge server parameters, unless otherwise specified, we assume that edge servers have memory as a resource and express the resource capacity and processing-forwarding speed together as (c_1, THP^edge_j) = (32, 30), j ∈ M.

In this section, we conduct experiments on real-world data and present numerical results to evaluate the performance of our proposed reverse auction mechanism.

The Impact of User Scale on the Algorithm
In this experiment, we analyze the performance of the FLRA algorithm, the optimal algorithm and the greedy algorithm with small-scale user participation, with the number of users gradually increasing from 10 to 80. The number of edge servers is 20 and the capacity is (c_1, THP^edge_j) = (32, 30), j ∈ M. As shown in Figure 4a, as the number of users increases, the total utility of all algorithms increases, and for 10, 20 and 40 users, FLRA obtains the same total utility as the optimal solution. It can also be observed that in all cases, the FLRA algorithm outperforms the greedy algorithm; this indicates that the allocation algorithm based on the submodular function has advantages over the operator-sorting algorithm. Figure 4b shows the payment costs of the three algorithms: FLRA's payment cost is close to that of the optimal algorithm and becomes gradually lower than it as the number of users increases. From the perspective of the FL platform, obtaining higher value at a lower payment is a very good result, which is one of the advantages of the FLRA algorithm. It is worth noting that the total payment of the three algorithms gradually increases because the quality of the final model trained with small-scale user participation still has room to improve; before the model reaches its quality upper limit, the more users participate in the auction, the more winners there are and the higher the total payment. The resource utilization rates of the different algorithms are shown in Figure 4c,d. Because FLRA selects users under the principle that the marginal utility cannot be negative, it does not allocate resources to users who cannot generate benefits merely because resources are plentiful; its resource utilization rate is therefore usually lower than that of the greedy method and close to that of the optimal algorithm. The execution times of the different algorithms are shown in Figure 4e. Compared with the optimal algorithm, the execution times of FLRA and the greedy algorithm are very short; both are polynomial-time algorithms.

The Impact of Large-Scale Users on the FLRA Algorithm and Greedy Algorithm
Because it is very difficult for the optimal algorithm to obtain a result when the user scale reaches 80, in this experiment we only analyze the performance of the FLRA algorithm and the greedy algorithm with large-scale user participation.
As shown in Figure 5a, as the number of users increases, the total utility of the FLRA algorithm increases steadily, while the total utility of the greedy algorithm decreases. This is because the greedy algorithm selects many inefficient users whose value is less than their own consumption, which reduces the total utility. Figure 5b shows the total payment of the two algorithms; we find that the total payment tends to stabilize. This is because, under the current setting of 20 edge servers with an edge server capacity of (c_1, THP_j^edge) = (32, 30), j ∈ M, the model has reached its quality limit, and adding more users to the auction does not yield higher model quality. This also implies that bandwidth capacity affects model quality, which is verified in Section 4.4. The memory utilization rates in Figure 5c and the bandwidth utilization rates in Figure 5d likewise tend to stabilize: once the users reach a certain scale, their local data are sufficient to support the data volume required for high-quality model training, so the resource utilization rate no longer increases.

Figure 5e shows the execution times of the two algorithms. The execution times of the FLRA algorithm under the different user scales are 159,693; 264,002; 383,145; and 595,053 ms. This is easily explained by the pseudocode in Section 3.2: as the number of users grows, FLRA_ALLOC performs more loop iterations to select the user with the largest current gain, and in the payment phase, FLRA_PAY calls FLRA_ALLOC multiple times for each winner, which further increases the execution time. The execution times of the greedy algorithm under the different user scales are 1853, 2042, 2470 and 2747 ms. In the allocation phase, the greedy algorithm selects users according to the operator sorting, and the increase in users prolongs the time needed to compute the operator for each user. In the payment phase, the greedy algorithm, like FLRA_PAY, uses a binary search to calculate payments, so its execution time also grows with the user scale.
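The payment computation described above, in which the allocation routine is re-run with trial bids to locate each winner's critical value, can be sketched as follows. This is a minimal illustration of critical-price payments under a simplified value-per-bid greedy allocation; the functions `allocate` and `critical_payment` and all numbers are illustrative stand-ins, not the paper's actual FLRA_ALLOC/FLRA_PAY pseudocode.

```python
from typing import Dict, List

def allocate(bids: Dict[str, float], values: Dict[str, float],
             capacity: int) -> List[str]:
    """Simplified allocation: repeatedly pick the user with the best
    value-to-bid ratio until capacity is exhausted or no user with
    positive gain (value >= bid) remains."""
    winners, remaining = [], dict(bids)
    while remaining and len(winners) < capacity:
        best = max(remaining, key=lambda u: values[u] / remaining[u])
        if values[best] < remaining[best]:  # no positive gain left
            break
        winners.append(best)
        del remaining[best]
    return winners

def critical_payment(user: str, bids: Dict[str, float],
                     values: Dict[str, float], capacity: int) -> float:
    """Binary-search the highest bid at which `user` still wins; that
    critical value is the payment, as in a critical-price mechanism."""
    lo, hi = bids[user], max(values.values()) * 10
    for _ in range(60):  # fixed-iteration binary search
        mid = (lo + hi) / 2
        trial = dict(bids, **{user: mid})  # everyone else bids as before
        if user in allocate(trial, values, capacity):
            lo = mid
        else:
            hi = mid
    return lo

bids = {"u1": 10.0, "u2": 14.0, "u3": 20.0}
values = {"u1": 30.0, "u2": 25.0, "u3": 22.0}
winners = allocate(bids, values, capacity=2)
payments = {u: critical_payment(u, bids, values, 2) for u in winners}
```

Note that each payment requires a full binary search, each iteration of which re-runs the allocation; this is exactly why the payment phase dominates the execution times reported above.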

The Impact of Bandwidth Capacity of Edge Servers on Algorithms
This experiment shows the influence of different edge server bandwidth capacities on the three algorithms. As shown in Figure 6a, with increasing bandwidth capacity, the total utility of the three algorithms increases slowly and then stabilizes. This is because a larger bandwidth capacity accommodates more users in training: users who previously failed due to high bandwidth requirements but low gains now succeed, producing more utility. In Figure 6b, the total payments of the FLRA, OPT-FLRA and greedy algorithms do not change significantly with the bandwidth. This is because the total payment is driven by two opposing factors: the number of winning users and the individual payment. The more winning users there are, the lower the individual payment, and vice versa: the fewer winning users there are, the higher the individual payment. Figure 6d shows that when the numbers of users and edge servers remain unchanged, the bandwidth utilization rate gradually decreases as the edge server bandwidth capacity increases. The decrease in the memory utilization rate in Figure 6c occurs because, with more edge server bandwidth, the high-quality users who were previously eliminated due to insufficient bandwidth can now be selected. In Figure 6e, it is clear that the bandwidth capacity does not affect the execution time of the algorithms, because Theorem 3 in Section 3.4 has already shown that the time complexity depends only on the numbers of users and edge servers.

Truthfulness Verification of FLRA
This experiment verifies the truthfulness of FLRA from two perspectives: (1) changing the bids of winning users and observing the utility changes and (2) changing the bids of losing users and observing the utility changes. In this experiment, the number of users is 10, the number of edge servers is 5, and the number of resource types is 1. The experimental results are shown in Figure 7. Figure 7a shows the case of winner 4. Their true bid is 17.67; when they win, the FL platform pays them 31.14, so their utility is 13.47. Varying their bid continuously, as long as the bid is lower than 31.14, the user still wins, and their utility remains unchanged at 13.47: a winning user's bid does not affect the critical value, so the payment, and hence the utility, never changes. When the bid exceeds 31.14, the allocation fails, and the utility is 0. Figure 7b shows the case of losing user 2. Their true bid is 40.15, the final payment is 0, and the utility is 0. When the bid is lower than 30.96, the user wins in the allocation phase, but the utility is 30.96 − 40.15 = −9.19. When the bid exceeds 30.96, the user loses the allocation, and the payment and utility remain 0. These two examples show that users cannot obtain greater utility by misreporting their bids, which verifies the truthfulness of FLRA.
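The two cases above reduce to a simple utility calculation. The following minimal sketch assumes, as critical-price theory guarantees, that a user's critical value (31.14 for winner 4, 30.96 for user 2 in the experiment) does not depend on that user's own bid; the function `utility` is an illustrative helper, not part of the FLRA pseudocode.

```python
def utility(bid: float, true_cost: float, critical: float) -> float:
    """Utility under a critical-price payment rule: a user wins iff
    their bid is at most the critical value, and the winner is paid
    exactly that critical value, independent of the bid itself."""
    if bid <= critical:
        return critical - true_cost  # payment minus cost; bid drops out
    return 0.0                       # losers pay and receive nothing

# Winner 4: true cost 17.67, critical value 31.14.
# Any winning bid yields the same utility 31.14 - 17.67 = 13.47.
for b in (5.0, 17.67, 25.0, 31.0):
    assert abs(utility(b, 17.67, 31.14) - 13.47) < 1e-9
assert utility(32.0, 17.67, 31.14) == 0.0  # overbidding: lose, utility 0

# Loser 2: true cost 40.15, critical value 30.96. Underbidding to win
# yields negative utility 30.96 - 40.15 = -9.19; truthful bidding yields 0.
assert abs(utility(30.0, 40.15, 30.96) - (-9.19)) < 1e-9
assert utility(40.15, 40.15, 30.96) == 0.0
```

The assertions mirror Figure 7: no deviation from the true bid ever increases utility, which is precisely the truthfulness property.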

Conclusions
This paper transforms the federated learning platform data value maximization problem into a multi-constrained integer programming model, proposes an auction mechanism, designs a resource allocation algorithm based on monotone submodular theory and designs a payment algorithm based on critical price theory. The mechanism is proven to satisfy truthfulness and individual rationality. Based on the experimental results, the total utility of the proposed FLRA algorithm reaches 98% of that of the optimal algorithm while greatly shortening the execution time, which demonstrates the good performance of the whole system. This work also has some limitations. For example, setting a budget upper limit for the FL platform would be more realistic and would help the FL platform control the cost of training the model. Similarly, frequency-division-multiplexed channel resources could be modeled more accurately, and the framework studied here could be extended to more scenarios, such as blockchain and the metaverse. Extending the submodular functions in this paper to ordered submodular functions, and exploring scenarios in which participants may have asymmetric information or may collude in an auction, will be our main future research.
δ_l is the number of local training rounds, δ_g is the number of global training rounds, and β is the cost per unit of computation. The structures of the global model and the local model are the same; therefore, we use m to denote the size of the model, and the user needs to train the local model in each training round. The total service cost c_j^MEC of edge server j ∈ M consists of the bandwidth cost c_j^bw, the computation cost c_j^comp and the storage cost c_j^memo:

c_j^MEC = c_j^bw + c_j^comp + c_j^memo, j ∈ M.
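The cost decomposition above can be captured by a small helper. This is a sketch under assumed unit prices: the `EdgeServerCost` class and the `flops_per_round` parameter are illustrative, and only the three cost components c^bw, c^comp, c^memo and the parameters δ_l, δ_g, β come from the text.

```python
from dataclasses import dataclass

@dataclass
class EdgeServerCost:
    """Service cost of edge server j, split into the three components
    named in the text: bandwidth, computation and storage."""
    c_bw: float    # bandwidth cost c_j^bw
    c_comp: float  # computation cost c_j^comp
    c_memo: float  # storage cost c_j^memo

    def total(self) -> float:
        # c_j^MEC = c_j^bw + c_j^comp + c_j^memo
        return self.c_bw + self.c_comp + self.c_memo

def user_training_cost(delta_l: int, delta_g: int, beta: float,
                       flops_per_round: float) -> float:
    """Illustrative user-side computation cost: delta_l local rounds
    inside each of delta_g global rounds, at beta per unit of
    computation (flops_per_round is an assumed workload parameter)."""
    return delta_l * delta_g * beta * flops_per_round

cost = EdgeServerCost(c_bw=3.0, c_comp=5.5, c_memo=1.5).total()  # 10.0
```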

3.1.1. Multimachine Federated Learning Training Framework
Federated learning in vehicular networks is the application scenario of the auction market in this paper, so we first recall some basic concepts of federated learning. We use the federated averaging algorithm, as shown in Algorithm 1. Lines 2-9 are the federated platform training process. In each round of training, each participating user updates the local model w_{u,i}^{k+1} with local data, and each edge server aggregates the local models and uploads w_{e,j}^{k+1} to the federated learning platform for updating the global model w_g^{k+1}, where in line 7, d_j = ∑_{i∈W_j} d_i is the sum of the data sizes of all the winners on edge server j, and in line 8, d = ∑_{j∈M} d_j is the sum of the data sizes of all the winning users. Lines 11-16 are the user's local model update process [25].

Algorithm 1: Multi-FL framework
Input: learning rate η, local minibatch size δ_b, number of global epochs δ_g, number of local epochs δ_l.
Output: global model w_g.
1: FL platform executes:
2: initialize w_g^0
3: for each global epoch k from 1 to δ_g do
4:   for each winning user i in parallel do
5:     w_{u,i}^{k+1} ← ClientUpdate(i, w_g^k)
6:   for each edge server j ∈ M do
7:     w_{e,j}^{k+1} ← ∑_{i∈W_j} (d_i / d_j) w_{u,i}^{k+1}
8:   w_g^{k+1} ← ∑_{j∈M} (d_j / d) w_{e,j}^{k+1}
9: end for
10: ClientUpdate(i, w): // executed on client i
11: for each local epoch from 1 to δ_l do
12:   batches ← (data d_i split into batches of size δ_b)
13:   for batch b in batches do
14:     w ← w − η∇l(w; b)
15:   end for
16: return w to the edge server
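A runnable counterpart of the two-level aggregation in Algorithm 1 may make the weighting concrete. The following is a minimal NumPy sketch for a linear least-squares model, assuming noiseless synthetic data and data-size-weighted averaging at both the edge and global levels as the text describes; the model, loss and data are illustrative, not the paper's experimental setup.

```python
import numpy as np

def client_update(w, X, y, eta=0.1, delta_l=5, batch_size=4):
    """ClientUpdate: delta_l local epochs of minibatch SGD on the
    squared loss l(w; b) = ||X_b w - y_b||^2 / (2|b|)."""
    w = w.copy()
    n = len(y)
    for _ in range(delta_l):
        idx = np.random.permutation(n)          # shuffle, then batch
        for start in range(0, n, batch_size):
            b = idx[start:start + batch_size]
            grad = X[b].T @ (X[b] @ w - y[b]) / len(b)
            w -= eta * grad                     # w <- w - eta * grad l(w; b)
    return w

def federated_averaging(clients, edges, delta_g=20, dim=2):
    """Multi-FL loop: clients -> edge servers -> FL platform.
    `edges` maps edge server j to the list of winners W_j on that edge."""
    w_g = np.zeros(dim)
    sizes = {i: len(y) for i, (X, y) in clients.items()}
    d = sum(sizes.values())                     # d = sum_j d_j
    for _ in range(delta_g):
        w_edge = {}
        for j, W_j in edges.items():
            d_j = sum(sizes[i] for i in W_j)    # d_j = sum_{i in W_j} d_i
            w_edge[j] = sum(sizes[i] / d_j *
                            client_update(w_g, *clients[i]) for i in W_j)
        w_g = sum(sum(sizes[i] for i in W_j) / d * w_edge[j]
                  for j, W_j in edges.items())  # global weighted average
    return w_g

# Four clients with noiseless data from a common ground-truth model,
# split across two edge servers.
rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])
clients = {}
for i in range(4):
    X = rng.normal(size=(30, 2))
    clients[i] = (X, X @ w_true)
edges = {0: [0, 1], 1: [2, 3]}
w = federated_averaging(clients, edges)  # converges toward w_true
```

Only the models w are exchanged in this loop, never the raw (X, y) data, which is the privacy property the framework relies on.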

Figure 2 illustrates a simple example in which we use the information in Tables 2-4 to demonstrate how the mechanism is implemented. For simplicity, there are 5 users, denoted by u_1, u_2, u_3, u_4, u_5; the edge server scale is 3, denoted by e_1, e_2, e_3; and there is 1 edge server resource type. The resource capacity of the edge servers is c = (5), and the processing speed of the edge servers is THP^edge = 5.

Figure 2. Parameters in a simple example.

Figure 3. Real map and virtual map.

Figure 4. The impact of user scale on the algorithm.

Figure 5. The impact of large-scale users on the FLRA algorithm and greedy algorithm.

Figure 6. The impact of the bandwidth capacity of edge servers on algorithms.

Funding:
This work was supported in part by the National Natural Science Foundation of China (Nos. 62062065, 12071417 and 61962061) and the Program for Excellent Young Talents, Yunnan, China.

Table 2. The user requirement sample.

Table 3. The bandwidth between the user and each edge server.

Table 4. The allocation result.