In the above scenario, we construct a cooperative game model for participants in federated learning based on Nash bargaining theory. Nash bargaining maximizes the product of each participant's utility gain over their disagreement outcome, thereby maximizing the joint surplus. In the context of federated learning, this means determining a reward allocation scheme that enables the data requester to train a better global model while ensuring that data providers receive higher rewards. In the following, we model each party involved in federated learning, formulate the corresponding optimization problem, and present the design of our proposed incentive mechanism.
  3.3.1. Revenue Modeling for Data Providers
If data provider $m$ decides to participate in round $t$ of federated learning, then from the perspective of $m$, their utility is the reward $p_m^t$ obtained from participating in this round. Clearly, $p_m^t \ge 0$. At the same time, the provider incurs a cost $c_m^t$, where $c_m^{\mathrm{cmp}}$ denotes the computation cost and $c_m^{\mathrm{com}}$ denotes the communication cost.
We define a binary variable $x_m^t$ to indicate whether data provider $m$ participates in round $t$ of federated learning: if they participate, then $x_m^t = 1$; otherwise, $x_m^t = 0$. For any client $m$ in round $t$, the utility can be expressed as the difference between the reward and the incurred cost, as shown in Equation (3).
Here, $x_m^t$ is the decision made by data provider $m$, while the reward $p_m^t$ is determined by the data requester $R$. The possible combinations of these decisions by $R$ and data provider $m$ can be interpreted as whether an agreement is reached for $m$ to participate in the federated task in this round. Once decisions are made, the provider's utility can be computed accordingly.
The total cost of a client can be explicitly modeled as a function of its dataset size $d_m$ [26]. Specifically, the cost consists of three parts: (i) data cost $c_m^{\mathrm{data}} = \alpha_m d_m$, where $\alpha_m$ is the unit data processing cost; (ii) computation cost $c_m^{\mathrm{cmp}} = \gamma_m M I_l I_g d_m$, where $M$ is the model dimension, $I_l$ and $I_g$ are the numbers of local and global iterations, and $\gamma_m$ is the unit computation cost; and (iii) communication cost $c_m^{\mathrm{com}}$, which depends on bandwidth, channel gain, and rate constraints but is independent of $d_m$. Therefore, the total cost is as shown in Equation (4).
This affine form shows that costs grow linearly with dataset size, which plays a key role in ensuring truthfulness in our mechanism.
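To make the affine structure concrete, the following sketch evaluates the total cost for a given dataset size. The unit costs and iteration counts are hypothetical placeholders, and the names mirror the symbols defined above rather than any implementation from the paper.

```python
def provider_cost(d_m, alpha_m, gamma_m, M, I_l, I_g, c_com):
    """Total cost of provider m, affine in the dataset size d_m:
    data cost alpha_m*d_m, plus computation cost gamma_m*M*I_l*I_g*d_m,
    plus a communication cost c_com that does not depend on d_m."""
    return (alpha_m + gamma_m * M * I_l * I_g) * d_m + c_com

# With hypothetical unit costs, enlarging the dataset scales only the
# variable part of the cost; the communication term stays fixed.
c1 = provider_cost(1000, alpha_m=0.01, gamma_m=1e-6, M=10_000, I_l=5, I_g=2, c_com=2.0)
c2 = provider_cost(2000, alpha_m=0.01, gamma_m=1e-6, M=10_000, I_l=5, I_g=2, c_com=2.0)
```

This affine growth in $d_m$ is exactly what the truthfulness argument of Section 3.3.3 relies on: any exaggeration of the dataset size raises the incurred cost linearly.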
If data provider $m$ decides to participate in the federated learning task for round $t$, they will sparsify their local gradient update according to the method in Section 3.2, upload it to the shared data system, and wait for the aggregated gradient to be used in reward calculation.
  3.3.2. Revenue Modeling for Data Requesters
When data requester $R$ engages in bargaining with multiple data providers $m \in \mathcal{N}$ during round $t$ of a federated learning task, it must first determine which type of bargaining protocol to adopt. Existing one-to-many bargaining protocols include sequential bargaining [27] and parallel bargaining [19]. In sequential bargaining, the requester negotiates with each data provider in a predetermined order, which in the worst case incurs prohibitive time complexity [28], making it impractical in real-world data-sharing scenarios.
Therefore, this paper adopts a parallel bargaining framework for incentive mechanism design. Inspired by the study by Tang [28], we define the utility of the data requester in round $t$ as the global model's accuracy improvement function [29,30,31].
As mentioned above, when data requester $R$ receives a gradient update from data provider $m$ containing $s_m^t$ non-zero elements, the total number of received sparse gradient parameters $s^t$ increases, leading to an increase in the model's overall accuracy $\phi(s^t)$ [29,30,31]. If no gradient is received (i.e., zero parameters), the model's accuracy remains $\phi(0)$. Hence, the accuracy gain during this round of bargaining is defined in Equation (5).
Here, $\lambda$ is the amplification coefficient that reflects the data requester's sensitivity to accuracy improvements.
For data requester $R$, the incurred cost includes both communication cost and payment cost. It is assumed here that the data requester has sufficient communication resources and bears a fixed communication cost $c_0$ per data provider it communicates with, so the total communication cost for the data requester is $\sum_{m} x_m^t c_0$. For each data provider $m$, the requester provides a reward $p_m^t$ for participating in round $t$ of federated learning. Obviously, if $x_m^t = 0$, then $p_m^t = 0$, meaning the requester does not pay providers who do not participate.
For simplicity, we define the participation vector $\boldsymbol{x}^t = (x_1^t, \ldots, x_N^t)$ and the payment vector $\boldsymbol{p}^t = (p_1^t, \ldots, p_N^t)$. Based on these definitions, the requester's revenue is given by Equation (6).
For any data provider $m$, if they do not participate in the current round (i.e., $x_m^t = 0$), they will receive no reward (i.e., $p_m^t = 0$) and incur no cost (i.e., $c_m^t = 0$). In this case, their utility is zero (i.e., $u_m^t = 0$). Similarly, if no data provider participates in this round, the data requester's utility is also zero (i.e., $u_R^t = 0$). Therefore, the worst-case (disagreement) utility in this bargaining process is 0.
After the requester and the providers agree on $\boldsymbol{x}^t$ and $\boldsymbol{p}^t$, their revenues are $u_R^t$ and $u_m^t$, respectively. By Nash bargaining, the negotiation solves the optimization problem in Equation (7), which is equivalently transformed into the convex form of Equation (8) via a log transform under the same constraints.
The objective in Equation (8) maximizes the joint surplus of the requester and providers above their disagreement points (here $u_R^t = 0$ and $u_m^t = 0$), which is the classical Nash bargaining criterion (log-sum form after a log transform). In Equation (8), (i) the budget/value constraint ensures that the total payment to providers does not exceed the requester's available benefit; (ii) the provider cost constraints $p_m^t \ge c_m^t$ guarantee individual rationality (no provider is paid below its incurred cost); and (iii) the non-negativity conditions ensure all parties obtain non-negative utility. Together, these conditions make the outcome both fair and feasible: improving one party's utility cannot come entirely at the expense of another, and the surplus is split in a way that balances all sides.
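For concreteness, one plausible way to write the bargaining problem and its log transform in the notation of this section, where $\Delta\phi^t$ denotes the accuracy gain of Equation (5); this is a sketch of the structure, not a transcription of Equations (7) and (8):

```latex
% Nash bargaining over participation x^t and payments p^t, disagreement point 0:
\max_{\mathbf{x}^t,\,\mathbf{p}^t}\; u_R^t \prod_{m:\,x_m^t = 1} u_m^t
\qquad \text{s.t.} \quad
\sum_m x_m^t p_m^t \le \lambda\,\Delta\phi^t, \quad
p_m^t \ge c_m^t\, x_m^t, \quad
x_m^t \in \{0,1\}, \quad u_R^t,\, u_m^t \ge 0.

% Taking logarithms yields the concave (convex-programming) form of Equation (8):
\max_{\mathbf{x}^t,\,\mathbf{p}^t}\; \ln u_R^t + \sum_{m:\,x_m^t = 1} \ln u_m^t
\quad \text{under the same constraints.}
```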
Unfortunately, this is a Mixed-Integer Convex Programming (MICP) problem, which is NP-hard in general. It is difficult to find a globally optimal solution, and it is also challenging to design an algorithm with theoretical approximation guarantees, because the global utility function and the communication costs are determined only after the participation decisions are made. To address this issue, we propose a heuristic algorithm that derives an approximate Nash bargaining solution in polynomial time. The proposed algorithm is based on two key processes: client selection and bonus payment.
  3.3.3. Client Selection Strategy
We adopt a non-uniform probabilistic sampling distribution to design the client selection strategy. The proposed strategy is based on the practical observation that the more parameters the data requester receives, the greater the potential utility gain, i.e., the probability that the accuracy $\phi$ increases is higher [29,30,31]. Since $\phi(s^t)$ is a non-decreasing function of $s^t$, this method assigns a non-zero selection probability to each data provider $m$ based on the size of their local dataset $d_m$ using the Softmax function, thereby enabling the mechanism to better adapt to non-IID data scenarios [32] and ensuring the global model's convergence [32]. The probability that data provider $m$ is selected in the $t$-th round of federated learning is calculated as shown in Equation (9).
As provider $m$ increases its reported cost, its selection probability decreases, while underbidding lowers its reward; together these effects incentivize truthful reporting. Since costs correlate with dataset size $d_m$ [26], misreporting the data size has similar effects, again encouraging truthfulness.
Although Equation (9) suggests that a client might inflate its reported $d_m$ to gain a higher selection probability, the affine cost function above guarantees that such misreporting cannot improve net utility. If a client would already be selected when reporting truthfully, exaggerating $d_m$ does not change the allocation or payment but increases the incurred cost $c_m(d_m)$, reducing utility. If a client would not be selected when reporting truthfully, inflating $d_m$ may cross the selection threshold, but the payment is determined by the critical type (threshold), not by the inflated report. Since the cost strictly increases with $d_m$, the utility under misreporting cannot exceed that under truthful reporting. Therefore, under a monotone allocation with threshold-based payments, truth-telling is a dominant strategy, and our client selection strategy satisfies the truthfulness property.
Based on this, we design a probabilistic client selection procedure: At round $t$, set $x_m^t = 0$ for all $m$. If the number of providers $N$ does not exceed the budget $B$, select all of them; otherwise, compute the selection probabilities via Equation (9) and randomly sample $B$ clients to form the set $S$. Return the participation vector $\boldsymbol{x}^t$. Algorithm 1 runs in polynomial time.
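The selection step can be sketched as follows. We use the Gumbel top-$k$ trick as one standard way to draw $B$ distinct clients from a Softmax distribution without replacement; weighting directly by dataset size stands in for Equation (9), whose exact form may differ.

```python
import math
import random

def select_clients(dataset_sizes, budget, rng=random):
    """Probabilistic client selection (sketch of Algorithm 1).

    Softmax over dataset sizes, sampled without replacement via the
    Gumbel top-k trick: key_i = log(weight_i) + Gumbel noise, and the
    `budget` largest keys form an exact Softmax sample of that size.
    Returns a 0/1 participation vector of length N.
    """
    n = len(dataset_sizes)
    if n <= budget:                        # fewer providers than budget: take all
        return [1] * n
    keys = []
    for i, d in enumerate(dataset_sizes):
        u = rng.random() or 1e-12          # guard against log(0)
        gumbel = -math.log(-math.log(u))
        keys.append((d + gumbel, i))       # log(exp(d)) = d is the log-weight
    chosen = {i for _, i in sorted(keys, reverse=True)[:budget]}
    return [1 if i in chosen else 0 for i in range(n)]

x = select_clients([30, 10, 20, 40], budget=2, rng=random.Random(0))
```

Computing the keys and taking the top $B$ of $N$ is a single pass plus a sort, consistent with the polynomial-time claim above.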
  3.3.4. Bonus Payment Strategy
For simplicity, we define $A$ as the data requester's net benefit in round $t$ and $c_m^{\min}$ as provider $m$'s incurred cost. Substituting these into Equation (8), we derive the equivalent optimization in Equation (10) and, by reorganizing the constraints, express the problem equivalently as Equation (11).
Given the convex nature of Equation (11), we apply the Karush–Kuhn–Tucker (KKT) conditions to characterize its optimal solution. Introducing Lagrange multipliers $\mu$ for the budget constraint and $\nu_m$ for the provider cost constraints, we derive the KKT conditions as in Equation (12). Solving them yields the closed-form solution in Equation (13), from which the optimal payment $p_m^{t*}$ for each selected provider can be computed (Equation (14)).
From these conditions, the complementary slackness relations imply that the optimal solution must satisfy $p_m^t \ge c_m^{\min}$ for each selected provider and that the total payment cannot exceed the requester's budget or task value. Rearranging the first-order condition in Equation (12) yields a system of linear equations in the payment variables, where each $p_m^t$ depends on the requester's budget $A$, the minimum cost $c_m^{\min}$, and the payments of the other selected providers. Solving this system leads to the recursive form shown in Equation (13).
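To make the step from the first-order condition to the closed form easier to follow, here is the stationarity computation for the case in which all cost constraints are slack (our reconstruction, with $S$ the selected set and symbols as defined above):

```latex
% Log-form objective over payments; the requester keeps A minus total payments:
\frac{\partial}{\partial p_m^t}\Big[\ln\big(A - \textstyle\sum_{k \in S} p_k^t\big)
  + \sum_{k \in S} \ln\big(p_k^t - c_k^{\min}\big)\Big]
  = -\frac{1}{A - \sum_{k \in S} p_k^t} + \frac{1}{p_m^t - c_m^{\min}} = 0.

% Hence every provider's surplus equals the requester's residual:
p_m^t - c_m^{\min} \;=\; A - \sum_{k \in S} p_k^t
  \;=\; \frac{A - \sum_{k \in S} c_k^{\min}}{|S| + 1},
  \qquad m \in S.
```

That is, each of the $|S|$ providers and the requester ends up with an equal share of the surplus, which is the symmetric case discussed next.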
Finally, under the mild symmetry assumption that all selected providers are treated homogeneously in equilibrium, the surplus can be evenly divided among the $|S|$ selected providers plus the requester. This simplification yields the closed-form expression in Equation (14), in which each selected provider receives a payment proportional to its minimum cost and the requester's budget. This step makes the symmetric-equilibrium assumption explicit and explains the transition from Equation (13) to Equation (14).
Equation (14) distributes the cooperative surplus between the requester and the $|S|$ selected providers in a Nash-consistent manner: each provider's payment increases with its minimum cost $c_m^{\min}$ (ensuring individual rationality), while the requester's share is implicitly reflected through the denominator $|S| + 1$. In general this is a differentiated allocation, not an equal split; it reduces to equal sharing only under symmetry.
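Under the symmetric reading, the payment rule can be sketched as follows. This is our illustrative rendering of the structure of Equation (14), assuming each selected provider recovers its minimum cost plus an equal $1/(|S|+1)$ share of the surplus, with the requester keeping the remaining share; the paper's exact expression may differ.

```python
def payments(A, min_costs):
    """Nash-consistent split (sketch): provider m is paid c_m_min plus an
    equal share of the surplus A - sum(min_costs); the requester keeps one
    share. Requires A >= sum(min_costs) for individual rationality."""
    surplus = A - sum(min_costs)
    share = surplus / (len(min_costs) + 1)
    return [c + share for c in min_costs]

p = payments(A=100.0, min_costs=[10.0, 20.0, 30.0])
# Payments differ across providers (differentiated allocation), yet each
# provider's surplus p_m - c_m_min equals the requester's residual share.
```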
The overall social welfare generated by federated learning is shared between the requester and the providers, ensuring aligned interests and encouraging cooperation. As shown in Section 3.3.3, our incentive mechanism prevents cost or data misreporting and stops the requester from undervaluing benefits to cut costs. Providers can verify accuracy gains locally each round, ensuring transparency. As discussed in Section 3.4, smart contracts provide an additional layer of assurance. The full process is detailed in Algorithm 2.
          
Algorithm 2 Incentive mechanism.
Input: number of rounds $T$, data providers $\mathcal{N}$, selection budget $B$, payment budget $A$
Output: global model $w^T$
1: Initialization: global model $w^0$, initial accuracy $\phi(0)$
2: for $t = 1$ to $T$ do
3:   Set $x_m^t \leftarrow 0$ for all $m \in \mathcal{N}$
4:   Select data providers $S^t$ for round $t$ based on budget $B$ (Algorithm 1)
5:   Send the latest global model $w^{t-1}$ to each data provider in $S^t$
6:   Each data provider $m \in S^t$ trains a local model and obtains local gradient $g_m^t$
7:   Each data provider $m \in S^t$ sparsifies $g_m^t$ using Equation (2) and uploads it
8:   Aggregate the received sparse gradients
9:   Update the global model $w^t$ and evaluate the accuracy gain via Equation (5)
10:  Compute the payment amount for each data provider in the current round according to Equation (14)
11: end for
12: return global model $w^T$
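The per-round flow of Algorithm 2 can be sketched end to end as follows. Training, sparsification, and aggregation are reduced to toy stand-ins (a scalar "model" nudged by random updates), and the selection and payment rules are simplified placeholders, so this illustrates the control flow only, not the paper's implementation.

```python
import random

def run_round(model, data_sizes, costs, budget, A, rng):
    """One round of the incentive mechanism (toy skeleton of Algorithm 2)."""
    n = len(data_sizes)
    # Client selection (uniform stand-in for the Equation (9) sampler).
    selected = list(range(n)) if n <= budget else rng.sample(range(n), budget)
    # Local training and sparsified upload, stubbed as small random updates.
    updates = {m: rng.gauss(0.0, 0.1) for m in selected}
    # Aggregate the updates into the global model.
    model = model + sum(updates.values()) / len(updates)
    # Payments via the symmetric split (stand-in for Equation (14)).
    share = (A - sum(costs[m] for m in selected)) / (len(selected) + 1)
    pay = {m: costs[m] + share for m in selected}
    return model, pay

rng = random.Random(0)
model, pay = run_round(0.0, data_sizes=[10, 20, 30], costs=[1.0, 2.0, 3.0],
                       budget=2, A=12.0, rng=rng)
```

Only the selected providers appear in the payment dictionary, matching the rule that non-participants are never paid.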