A Game-Theoretic Rent-Seeking Framework for Improving Multipath TCP Performance †

: There is no well-deﬁned utility function for existing multipath TCP algorithms. Therefore, network utility maximization (NUM) for MPTCP is a complex undertaking. To resolve this, we develop a novel condition under which Kelly’s NUM mechanism may be used to explicitly compute the equilibrium. We accomplish this by deﬁning a new utility function for MPTCP by employing Tullock’s rent-seeking paradigm from game theory. We investigate the convergence of no-regret learning in the underlying network games with continuous actions. Based on our understanding of the design space, we propose an original MPTCP algorithm that generalizes existing algorithms and strikes a good balance among the important properties. We implemented this algorithm in the Linux kernel, and we evaluated its performance experimentally.


Introduction
The ongoing development of 5G and gigabit WiFi solutions will considerably improve throughput and delay performance on future wireless networks. These developments offer new possibilities for enhancing existing applications by using multipath TCP (MPTCP) [1] to exploit multiple available network interfaces. Multipath TCP seamlessly aggregates the capacities of multiple access technologies, and it is now an integral part of ATSSS [2] (access traffic steering, switching, and splitting) for 5G [3].
A challenging design issue that arises is that of best using WiFi and non-WiFi 5G seamlessly [4] across a variety of applications and platforms. This is a challenge, since different applications impose different requirements and constraints for successful MPTCP usage.
One example application is video streaming. This seems like an ideal use case for MPTCP, since it has been difficult to scale existing services to meet the growing traffic demands of mobile users. For some existing applications, such as online video conferencing, high-resolution video still suffers from jitter, unwanted glitches, and frozen screens when used over mobile networks.
Augmented reality (AR) is another emerging application that faces similar challenges due to bandwidth demands. AR allows users with nomadic wireless devices to seamlessly access information by overlaying graphics and digital information upon their perception of the physical world. This enables immersive AR/VR applications, but requires dependable, low-latency, and high-bandwidth data communication.
Vehicular networking is yet another application that could benefit from MPTCP [5]. For example, autonomous vehicles may need to send 3D views of their environment to remote monitoring servers under extremely tight deadlines to ensure safe operation. Similarly, unmanned aerial vehicles, such as delivery drones, have stringent communication requirements for navigating their dynamically changing environment in a cooperative fashion. One example of this would be using live video from multiple drones to anticipate obstacles ahead and avoid risky blind spots while on the move.
These aforementioned applications have three key similarities. First, they are throughputintensive. Second, they require low end-to-end delay. Third, they need reliable data packet transport.
Although the mobile/wireless networking standards are evolving to support higher capacity requirements, existing transport-layer protocols (e.g., TCP, UDP, QUIC, and MPTCP) rarely provide adequate support for low-latency and high-throughput applications, especially in wireless networks. One oft-cited reason for this is the tendency of TCP congestion control to induce excessive buffering delay at the network bottleneck. Further compounding this problem are the vagaries of wireless channels, which often exhibit non-deterministic behavior in signal quality.
Aside from the application aspects (5G/WiFi) mentioned above, the main motivation of this research is to advance the theoretical knowledge in the overhaul of MPTCP, which requires a major departure from the existing design. These theoretical challenges are discussed in the following.

Motivation and Contributions
Kelly et al. [6,7] designed a distributed congestion control framework for obtaining a network equilibrium with an objective of achieving proportional fairness. This model holds as long as the TCP sources react to the sum of the congestion prices in their paths. Moreover, when the TCP sources react to the maximum of the prices in the network paths, the system instead achieves max-min fairness.
Our main insight is that Kelly's approach of "transforming the fairness problem into a game between TCP users" shares several features with the Tullock rent-seeking game [8]. The main differences will be discussed later.
In coupled MPTCP algorithms, it is known that, if one path fails to meet the loss and delay constraints, then the subflows on other paths suffer [9][10][11]. Therefore, the competition between MPTCP subflows can be considered as a game with common coupled constraints. In such games, when a player fails to achieve the constraints common to other players, all other players will also be penalized. Such games can, therefore, be viewed as cooperative in the objectives of meeting common constraints and non-cooperative in the context of utilities. We exploit the rent-seeking game framework for studying the contests between subflows, thus providing insights into the design of cooperative MPTCP algorithms.
Masson et al. [12] used a similar framework to model contests over time for maximizing visibility in social networks. We observe that the rent-seeking game has the potential to model MPTCP transmissions over multiple network paths. Each MPTCP subflow may bid its proposed packet-sending rate λ i before transmitting. Then, each MPTCP subflow will pay a price p i based on its bid. Based on this price, they eventually obtain the throughput θ i proportional to their bid.
In a realistic network setting, since the achievable bandwidth across the network paths is finite, this approach will lead to (infinitely) many equilibria. Unfortunately, the difficulty in designing a mechanism for selecting a desired equilibrium point and, thus, maximizing the utility is a serious drawback with this approach. One possible solution to this is to use the Karush-Kuhn-Tucker (KKT) conditions to solve a similar (relaxed) version of the game, rather than the exact constrained game. This relaxed game has the same equilibria as the original game, but is easier to solve.
Using Kelly's theory, the Lagrange multipliers can be interpreted as pseudo-prices associated with the network path, and the equilibrium will satisfy the (desired) coupled constraints. However, the KKT conditions alone are not sufficient for our needs, since these do not guarantee that the pseudo-price per unit bandwidth is the same for all contending subflows. Rather, this approach increases the already large state space of our MPTCP system model and makes it unscalable. More specifically, it becomes intractable, since once pseudo-prices (Lagrange multipliers) are obtained, they depend not only on the policy adopted by an MPTCP subflow, but also on the corresponding strategies of other coupled subflows.
Our idea is to find a pseudo-price that is globally constant per unit resource across all network paths. Recall that such an equilibrium with a global fixed pseudo-price is the desired normalized equilibrium for our MPTCP dynamics over the multipath network. Therefore, the Tullock game framework is used to redesign MPTCP and analyze the existence and uniqueness of the equilibrium (λ , p ). Moreover, we prove that, starting from any initial point (λ(0), p(0)), the trajectory (λ(t), p(t)) generated by the cooperative MPTCP algorithm over multiple paths converges to a unique solution with a globally stable equilibrium (λ , p ) [13].
To summarize, the main contributions of our work are the following: • We design a utility function for MPTCP based on Tullock's rent-seeking game. • We prove that this utility function implies a unique and stable solution to the MPTCP optimization problem. • We examine the convergence of no-regret learning in the underlying network games. • We provide extensive experimental results showing the performance of our MPTCP algorithm.
The rest of this paper is organized as follows. Section 2 provides an overview of our MPTCP modeling approach and situates it in the context of prior work. Section 3 discusses the details of our approach, including the game theory framework and the equilibrium properties of our system. Section 4 extends our model to consider multiple MPTCP flows and no-regret learning. Section 5 presents the experimental results of the evaluation of the performance of our MPTCP solution using network emulation experiments. Section 7 concludes the paper.

Model Overview
This section provides the background context for our work by summarizing prior work on network optimization and MPTCP design. The section concludes with a conceptual overview of our MPTCP modeling approach using Tullock's rent-seeking framework (RSF).

Background and Related Work
Of particular relevance to this work are the core foundational works on charging and rate control for elastic traffic and their analyses reported in [6,7,[13][14][15][16][17][18][19][20][21]. Kelly's approach in [7] presented price-taking users to attain system optimality or efficiency. However, when the users are price-anticipating (or strategic), the users exhibit rent-seeking behavior. This problem then falls within Tullock's rent-seeking framework [13], and there is an efficiency loss [17,18]. For a finite-capacity single-path TCP user's case where the path is shared by all TCP users/flows, there are a few important works, viz. Hajek and Gopalakrishnan [16], Johari and Tsitsiklis [17], and Altman et al. [13,14], among others.
The analyses in [16,17] deal with the so-called 'posted prices' setting in which all users perceive the same price on the path. The works [13,14] deal with differential pricing. For a discussion on the advantage of differential pricing over posted prices, see the implementation-theoretical perspective explained in [19].To this end, the existence and uniqueness of the Nash equilibrium were proved independently in [16,17] and by using Rosen [18] in [13] or variations of Rosen [14]. The authors of [17] also dealt with network settings where agents can use multiple paths.
Kelly's theory on network utility maximization (NUM) [6] provides a rigorous foundation for the design of efficient TCP algorithms. It has proven useful for understanding the structural properties of networks under end-to-end congestion control and for the design of optimal TCP algorithms. For path selection in multipath congestion control, Key et al. [22] developed another novel approach in order to study the potential benefits of coordinated congestion control. This enables users to select (or reselect) routes by adopting the notion of a Nash equilibrium.
There are two distinct steps in most existing optimization frameworks for TCP congestion control design. First, one develops a mathematical model for the resource allocation problem (e.g., fluid and/or queueing model [23]). Second, one designs an algorithm to solve the model (e.g., convex optimization based on its properties) [23,24].
A limitation of such NUM-based approaches is that they often assume a stable and steady-state network. As a result, they may lead to TCP algorithms that struggle in the presence of highly fluctuating and time-varying characteristics, such as those in wireless networks (e.g., 5G and next-generation WiFi). Specifically, the rapid variation in the capacity of these wireless channels may cause their state to change before the TCP algorithm converges on its optimal solution.
In NUM and current TCP design, utility is typically a function of the instantaneous packet-sending rate (i.e., throughput). This approach leads to bandwidth-sharing networks [6], which are suitable for bulk data transfer flows over slowly varying channels. In addition, all of the TCP algorithms proposed (and deployed) have strictly concave utility functions, implying a unique stable equilibrium.
However, the case for MPTCP is much more complex. This is because the existence of an underlying utility function depends on the design of the coupling function between the subflows (see Table 1 in [23]). For many well-known MPTCP algorithms, such as LIA [4], OLIA [9], and BALIA [23], a utility function does not even exist. Therefore, utility maximization for MPTCP is highly challenging.
Existing MPTCP congestion control algorithms [4,9,10,23] have used common coupled constraints (e.g., loss/delay) to represent the channel characteristics and congestion conditions of the network paths (see Table 1 for a concise summary of the TCP window update operations for LIA, OLIA, and BALIA; as mentioned earlier, if one path fails to meet the constraints, the subflows on other paths may also suffer [9][10][11]).

MPTCP Algorithms
Increase on Path i Decrease on Path i LIA [4] For each acknowledgment, For each packet loss,

as regular TCP
OLIA [9] For each acknowledgment, For each packet loss,

as regular TCP
BALIA [23] For each acknowledgment, For each packet loss, Moreover, the throughput achieved by a subflow depends in subtle ways upon the throughputs of all other subflows belonging to the same MPTCP connection [4,10,15]. When packets are sent via diverse wireless paths, they can arrive at the receiver out of order or may be lost due to channel impairments. Lost packets must be retransmitted and out-of-order packets must be buffered for resequencing so that they can be delivered to the application in proper order. Both of these artifacts affect subflow throughput because of buffering delays and the need for retransmissions at either the data-link layer or the MPTCP level [15].
At a conceptual level, almost all coupled MPTCP congestion control designs focus on minimizing reordering delay and head-of-line (HOL) blocking. They provide both load balancing and congestion control by coupling the delays and losses across all subflows. Based on our understanding of the design space, we propose a new MPTCP algorithm that generalizes existing algorithms and strikes a good balance among these properties. We implemented this algorithm in the Linux kernel, and we evaluated its performance experimentally by using our prototype.

Our Proposed Scheme
We now summarize our proposed MPTCP algorithm. The key idea is to schedule the packets via multiple paths at different rates in such a way that they balance the delay across all paths and are, therefore, more likely to arrive at the receiver in order. More specifically, an MPTCP congestion control mechanism with heterogeneous network paths will perceive different utilities on each path.
In essence, we strive to harmonize the performance achieved across multiple paths by opportunistically exploiting path dynamics to provide seamless MPTCP performance. The primary advantage of our method is that it implicitly addresses all known issues that limit MPTCP performance (i.e., HOL blocking, reordering latency, route heterogeneity). Furthermore, it enables the pooling of multipath resources in a way that emulates the performance of a single harmonized path.
One way to achieve utility maximization is to formulate a Tullock rent-seeking game [8,25] and use this to equalize the delays across all paths in order to ameliorate reordering delay. A plausible approach is to set equal delay for all subflows (across all paths) and determine the corresponding sending rates that jointly satisfy the delay constraints and load-balancing objectives.
DELAY-BALANCED MPTCP (RSF-DB): In RSF-DB, each source has a set of paths and a predefined target delay d > 0, which is fixed and equal for all paths. The source maintains a congestion window W i for each path i and measures its forward delay q i and roundtrip time τ i for each path. Here, q i is the queueing plus service delay from the source to the sink on path i, where K is the number of paths. Then, the window adaptation in the congestion avoidance phase is: This algorithm is designed to adjust the window size dynamically so that q i converges toward the target delay d in the steady state. Note that q i is not necessarily strictly bounded by d, so the "window increase" could, in some cases, be negative if the situation warrants.
A concise summary of the window update operations for three well-known MPTCP algorithms (LIA, OLIA, and BALIA) is shown in Table 1.
Similarly, we designed a complementary solution called LOSS-BALANCED MPTCP (RSF-LB), in which each MPTCP source has a set of paths and a predefined target loss rate. The key idea here is to balance the loss rate (i.e., lost packets per second) on each path to ameliorate the potential effects of heterogeneity in the packet loss probabilities, l i , on different paths. For RSF-LB, we replace (1) with so that the instantaneous λ i l i converges toward the target loss rate µ packets/sec in the steady state. We extend Kelly's framework [6] by exploiting the well-known Tullock rent-seeking game [8] to design a cooperative MPTCP with common coupled constraints. We consider a non-linear dynamical system model for an MPTCP connection over K network paths, as seen in Equation (3):ẏ where f (y) = 0 has a unique solution. In this equation, the λ i values denote the packet-sending rates, and denote the congestion prices (based on the delay q i or loss l i ) for the respective paths. Then, motivated by the generalized MPTCP analysis in [23], we specify the following fluid model: for analyzing MPTCP (Equations (3) and (4) in [23]). Here, γ i > 0 is a positive gain, k i (λ) is a vector of positive gains determining the dynamic properties of the system, and φ i (λ) determines the equilibrium properties. Our objective is to find a pseudo-price that is globally fixed per unit of resource across all network paths. The equilibrium in Equation (4), along with this special pseudo-price, is known as a normalized equilibrium for MPTCP dynamics over multiple networks (Section 2 in [13]). Therefore, a successful idea developed recently by Altman et al. [13] can be used to analyze the existence and uniqueness of the equilibrium (λ , p ). Moreover, starting from any initial point (λ(0), p(0)), the trajectory (λ(t), p(t)) generated by the cooperative MPTCP algorithm converges to a unique and globally stable equilibrium (λ , p ) (see Lemma 2 later in the paper).
In summary, the approach in our model is as follows. We use an existing fluid modeling approach for MPTCP to quantify its interactions over the network paths as a dynamic system [23]. We then implement the Lagrangian method to solve the dynamic system. However, for the Lagrangian approach to succeed, we require the existence of a well-defined utility function with certain properties to guarantee the stability. Therefore, we establish a utility function that satisfies these properties by adopting the idea from Tullock's rent-seeking game [8,13], which we describe in the next section.

Modeling Framework
The details of the background of the Tullock rent-seeking game and its equilibrium properties were explained with the proposed MPTCP in our preliminary work [1]. In this section, we discuss the essential remaining details for completeness.

Rent-Seeking Game Framework
Rent-seeking is a concept from game theory in economics. Unlike profit-seeking, which tries to increase total wealth, rent-seeking tries to increase one's own share of existing wealth without increasing the total wealth. Specifically, it is an attempt to obtain a reward R (rent) by manipulating the environment in which the activities occur, rather than by creating new wealth. One example is lobbying and/or bribery of government officials, in which the efforts (e.g., time and/or money) spent may result in favorable outcomes (e.g., tax laws, corporate rebates), albeit with potentially reduced economic efficiency. Other examples include taxi-licensing fees, bridge tolls, and bidding in loss-load curves [26], all of which result in reallocation of resources.
Strategies for rent-seeking games were studied by Gordon Tullock in 1967 and form a good basis for studying MPTCP flows. We use this rent-seeking approach to design our utility function below.
We consider an MPTCP connection in a network setting with K paths and one subflow on each path. Let λ i = W i /τ i (strictly positive λ i > 0) represent the sending rate for the ith subflow (1 ≤ i ≤ K), and let p i represent the corresponding price for that subflow. Then, assuming that the sending rates on each path are the actions to be chosen and that the connection throughput is concave, the individual throughput (payoff) obtained from path i can be represented as [13]: where V(.) is a concave function. Observe that the throughput θ i comes at the cost ofpλ i (wherep is a constant). With this formulation, the utility for the MPTCP subflow over path i is structurally similar to that of a Tullock rent-seeking game [13], for which: Lemma 1 (Existence and Uniqueness). The utility U i (λ) of an MPTCP subflow i is concave in its action λ i and is continuous in the corresponding actions of other subflows. Further, for p > 0 (strictly positive pricing), the trajectory (λ(t), p(t)) generated by the cooperative MPTCP algorithm has a unique Nash equilibrium in pure policies.
The proof of the concavity and continuity of U i (λ) follows from direct calculation. The existence and uniqueness can be established along the lines of Rosen [18] and Altman [13], respectively.
In contrast to Kelly's approach [6], the subflows of an MPTCP connection can have an action on one path (such as an increase or decrease in λ i ) that is not dependent on the actions over other paths. Further, similarly to [6], the capacity of the system is limited in real communication networks. Therefore, the aggregate network bandwidth can be bounded by a finite constant, say B. As a result, However, with this additional capacity constraint, the uniqueness of the Nash equilibrium given by Lemma 1 collapses, and there can be (infinitely) many equilibria [27].

Equilibrium Properties
We now discuss the application of our new utility function from Equation (6) to the solution of our dynamic system defined in Equation (4).
Let (λ , p ) be the equilibrium of the aforementioned dynamical system and let λ (−i) be the action vectors of all of the MPTCP subflows, except for the one along path i. Now, by using KKT, since U i (λ) is concave for each λ i , there exists a Lagrange multiplier L i λ (−i) such that λ i maximizes the Lagrangian: and Equation (9) is obtained by using the complementary property. In addition, observe in Equation (8) that the Lagrangian Q i (.) replaces the utility U i (.).
The Lagrange multipliers are interpreted as pseudo-prices of the paths. If a price is set on path i while other players are at equilibrium, then the subflow on path i pays λ i L i λ (−i) for its use of the bandwidth. In this case, λ is an equilibrium of our non-linear dynamical system with the capacity constraints. However, the pricing is not tractable, since it may vary from one subflow to another (for similar use of the network resources). In addition, it depends on the equilibrium selected.
Therefore, our main objective is to find a constant Lagrange multiplier L that is independent of i. Furthermore, L should be independent of the actions taken by any of the MPTCP subflows, along with an associated equilibrium λ for our new relaxed game. If we can find such a constant Lagrange multiplier L, the trajectory (λ(t), p(t)) generated by our cooperative MPTCP algorithm converges to a unique and globally stable normalized equilibrium (λ , p ). Moreover, our interest is to establish the existence and uniqueness of the normalized equilibrium. Therefore, we investigate the diagonally strict concave utility functions of our dynamical system (given by Equation (1)), since it is known that these lead to a unique global equilibrium (Theorem 4 in [18]).

Diagonal Strict Concavity
Consider a game with K subflows of an MPTCP connection. The strategy space is S ⊂ R, where S is a bounded set, and subflow i's utility function is U i : S K → R. Rosen's condition [18] for uniqueness of the Nash equilibrium in a K-player game states that the equilibrium is unique when: (i) U i (λ) is concave in subflow i s own action; (ii) There exists a vector z of non-negative numbers such that the function ρ(λ, z) To define the concept of diagonal strict concavity, we first compute the pseudo-gradient of ρ(.): where CONDITION C 0 : A sufficient condition for the function ρ(.) to be diagonally strictly concave is that the matrix [G (λ, z) + G(λ, z))] is negative definite for z ∈ S: here G(λ, z) is the Jacobian of the pseudo-gradient g(λ, z) with respect to λ.
CONDITION C 1 : A sufficient condition for the existence of a unique global equilibrium is that ρ(.) is diagonally strictly concave for some z (i.e., C 0 holds).
Note that CONDITION C 1 is essential for the stability of RSF MPTCP. Observe that C 1 is obtained directly from Theorem 4 in Rosen (Theorem 4 in [18]). We acknowledge that both prior works [13,18] nicely established the existence, uniqueness, and stability of the underlying dynamical system for the Tullock rent-seeking game. We implement these ideas in our work to design a cooperative framework for MPTCP. Furthermore, we consider a practical operational model, in which the existence of the utility function, and its interaction with multipath network constraints is one of the building blocks.
Lemma 2 (Stability). Assume that C 0 and C 1 both hold, the utility function U(.) exists, and the existence and uniqueness of the equilibrium (λ , p ) is guaranteed. Then, starting from any initial point (λ(0), p(0)), the trajectory (λ(t), p(t)) generated by the cooperative MPTCP algorithm converges to a unique solution, and Equation (1) has a globally stable solution given by (λ , p ).
First, we provide a proof that a Tullock rent-seeking game with a finite capacity constraint has a diagonally strictly concave utility function and, thus, a unique normalized equilibrium.

Proof. We have
; and Next, we take a column vector Therefore, for all z i = 1, where F is given by Observe in Equation (15) that F is positive for λ > 0; therefore, the matrix [G (λ, z) + G(λ, z))] is negative definite. As a result, U(λ) is diagonally strictly concave. See [13] for more background.
In the following, we provide further details on stability and our interpretation of utility maximization. Given the normalized equilibrium and the concave utility, the remainder of the proof of Lemma 2 is provided in Section 3.3.3.

Lyapunov Stability Theory
We apply Lyapunov's theorem to prove the stability of the nonlinear dynamic system represented by Equation (3). We make use of the following Lyapunov stability theorem to test the stability of the system.
LYAPUNOV'S THEOREM FOR STABILITY: If there exists a Lyapunov function L(λ) of a system, then λ = 0 is a stable equilibrium point in the sense of Lyapunov: (i) If dL(λ)/dt < 0, 0 < ||λ|| < r for some r, i.e., if dL(λ)/dt is locally negative definite, then λ = 0 is an asymptotically stable equilibrium point; (ii) if L(λ) is positive definite on the entire state space and has the property that |L(λ)| → ∞ as ||λ|| → ∞, and the derivative dL(λ)/dt is negative definite on the entire state space (except at the equilibrium point), then the equilibrium point is globally asymptotically stable.
A nonlinear system is asymptotically stable around its equilibrium if it satisfies the aforementioned Lyapunov theorem. The first condition of the theorem requires that the state trajectory can be confined to an arbitrarily small 'ball' of radius r centered at the equilibrium. This is called stability in the sense of Lyaponov.

Utility Maximization Interpretation
The well-known approach for an MPTCP-based dynamic system is to associate a utility function U i (λ i ) with each user i and interpret the sources and network to be executing distributed algorithms to maximize the aggregate utility of the users. The primary idea is to connect the equilibrium points of the dynamical systems with the solution of some optimization problem. Therefore, the nonlinear dynamical system can be viewed as the one that solves the utility maximization problem in a distributed manner. For example, (λ * i , p * i ) is an equilibrium of the dynamical system of MPTCP users if and only if λ * i is optimal for where p * i is the Lagrange multiplier or pseudo-price associated with the link at the optimal λ * i . Here, y i ≤ B i means that the aggregate traffic (y i ) at each link is less than the capacity B i of the link.
In fact, all of the existing TCP algorithms proposed in the literature have concave utility functions implying a unique stable equilibrium [23]. Furthermore, we conclude that, for our MPTCP, given p i , we have a unique λ i given by (see Equation (6))

Stability of the Dynamical System
We make use of a Lyapunov function for Lasalle's invariance principle to prove the global asymptotic stability of the dynamical system. Let ∂p := p − p and ∂λ := λ − λ . The function is a Lyapunov function. To prove Lemma 2, we extend the ideas from [23], and therefore, we include the essential steps only.

Multiple MPTCP Connections
Now, we extend our solution to the case of N MPTCP sources sharing K paths. Each subflow i of an MPTCP connection n over the K paths has a sending rate λ n i , and the following constraint should hold: The throughput obtained by the nth MPTCP connection is the sum of the throughputs for all of its subflows and is given by: where λ n is the vector [λ n 1 , λ n 2 , . . . , λ n K ] and (5)).
Therefore, the utility of the nth MPTCP connection is: Observe that the net utility in Equation (26) is computed as the sum of the utilities of each subflow. This decoupling approximation is motivated by the findings in realistic network settings [10], as each network path i has its own capacity (bandwidth) and prices (independent of each other); therefore, let C be the vector with the ith element being a finite constant C i . Then, B = ∑ C i , and for each i, we have the following constraint: Due to this constraint, the target loss and delay of RSF MPTCP are attainable iff the aggregate achievable loss/delay is lower than the target.
Observe that when applying the KKT conditions to the optimal response at the normalized equilibrium, there are K Lagrange multipliers. Clearly, our objective is to find a vector of K Lagrange multipliers such that the Nash equilibrium for the relaxed game will be an equilibrium for the primary game, where the constraints will be satisfied along with the conditions for the complementary equation. If diagonal strict concavity holds for each subflow and/or for each path separately, then Theorem 4 in [18] guarantees the existence and uniqueness of the equilibrium.

Learning for Continuous Actions
For learning continuous actions of MPTCP, we propose the adoption of the generally practiced dual-averaging method of Nesterov [28] in our new game-theoretic setting. Intuitively, the main idea of the learning process is as follows. At each stage of the process, the window adaptation process of path i generates a new estimate of the individual update of the throughput function at the current action profile, which is possibly subject to loss, channel impairments, and delay uncertainties. With this estimate, one may move into the dual space (for gradients) and translate the performance back into the primal space to select the next action, as shown in Figure 1. Starting with some arbitrary estimate, our approach can be captured using recurrence relations as follows (using Equation (5)): W i+1,n,m = W i,n,m + γ mθi,n,m+1 where m represents the step of the process,θ i,n,m+1 is the estimate of the individual throughput gradient of the ith subflow of the nth MPTCP. To this end, the core components of dual averaging for learning continuous actions are the gradient estimates and the mapping, which determines the actions of the MPTCPs as players. For the MPTCPs' gradient obser-vations, we assume that each MPTCP subflow i has an underlying feedback mechanism that responds with a prediction of the throughput gradients at its current profile. However, such information can be imperfect for several reasons: (i) Predictions are susceptible to errors; (ii) predictions may suffer from channel impairments; (iii) throughput functions may be the stochastic expectations of some random variable (θ i (W i ) = E[θ(W i ; p i )] [29]), which the MPTCP source can only observe via the realized sample path of gradients. Observe that Equations (17) and (18) hold for both RSF-LB and RSF-DB, which were proposed earlier (see Equations (1) and (2)). Cognizant of such issues, we follow along the lines of [30] and focus on the erroneous feedback mechanism of the form: Observe that the noise process ξ m = (ξ i,n,m ) i∈N is an L 2 -bounded martingale difference (with respect to filtration) from the history (F m ) ∞ m=1 of W n,m such that ξ m+1 is not F m measurable (however, ξ m is F m measurable). Therefore, ξ m satisfies the following zero-mean and finite mean-square error properties: With Equations (31) and (32), the gradient estimates of each path are unbiased and bounded in the mean square conditionally. Specifically, Such an interpretation is generic in the incorporation of a broad range of erroneous processes in our dynamical system of MPTCP data flows, including all compactly supported log-normal, Gaussian, and exponential distributions. Furthermore, both of the above theories can also be relaxed for finite-order moments considering a small bias.
Based on the foregoing discussion, an appropriate choice for V n is y n ← arg max θ i (y i , θ i ). However, such an approach that outputs actions that are closely aligned with y n has two main problems: • Due to uncertainty in the channel dynamics, the arg max action choice can be too aggressive. • Convergence of the dual-averaging (DA) approach is challenging, since the output will always be an extreme point.
Therefore, we suggest to soften the approach by using a regularized mapping: where the regularizing function h : C → R is a penalty function on a compact convex subset C of a finite-dimensional norm if: (i) h is strictly convex; (ii) the limit of h(θ) as θ approaches B equals h(B) (i.e., h is continuous). Observe that these conditions are also comparable to the MPTCP conditions reported in [23]. Furthermore, the mapping induced by h can be interpreted as [30]: In what follows, we assume that each MPTCP subflow has its own individual h i value and focus on the interplay between dual and primal parameters (i.e., the outputs θ i and the actions W i , respectively). More precisely, from an MPTCP source perspective, it can be viewed as an aggregate penalty function h(θ) = ∑ i h i (θ i ). As shown in Algorithm 1, the learning process can be observed as a multi-agent version of gradient descent with (lazy) projections. The mapping 'V n • θ i ' is most often defined as smooth and the regularized best response in a game-theoretical context; see [30,31] for detailed discussions (McKelvey et al. [27] referred to V n as a 'quantal response function' in finite-player games).

Algorithm 1 Abstract View of the Dual-Averaging Algorithm
Step size (γ m ), initial windows (W i ) 3: end procedure 4: procedure UPDATE 5: while n ≤ N do 6: for each MPTCP subflow i do 7: Choose action: use Equation (17) 8: Estimate gradient: use Equation (19) 9: Take gradient step: use Equation (18) 10: end for 11: end while 12: end procedure One main advantage of dual averaging in concave games is that this leads to no regret as long as the steps of Algorithm 1 are correctly picked [32]. While certain, within DA, the average payoff of each subflow corresponds to the most effective reverse case (note that this ignores changes in the behavior of other subflows owing to an alteration in the action selected by one subflow). With relevant insights from these observations and our ongoing foundational works [33,34], we plan to improve on this worst-case guarantee and obtain appropriate convergence results for the actual play series triggered by DA in future work.

Evaluation
In this section, we present the performance evaluation of our new MPTCP algorithm in different network scenarios. We describe the network settings used in the experiments and then present our experimental results for each scenario.

Experimental Setup
We evaluated our cooperative MPTCP algorithm based on the rent-seeking framework (RSF) by comparing it with three other well-known MPTCP algorithms: LIA [4], OLIA [9], and BALIA [23]. The latter three algorithms are all available in an experimental Linux implementation (C. Paasch et al., "Multipath TCP in the Linux Kernel", available from https://www.multipath-tcp.org (released: 2 November 2017)) of MPTCP v0.93 by Paasch et al. We used this Linux implementation as the basis for our experiments and added implementations for our RSF MPTCP, including both delay-balanced (RSF-DB) and lossbalanced (RSF-LB) versions.
The test environment in our lab consisted of three laptops, all running Ubuntu Linux 16.04. As shown in Figure 2, one of these was set up as an HTTP server, while the other two were clients. The laptops were connected to an Ethernet switch, each via two 1 Gbps interfaces, thus creating two separate paths for our experiments. In particular, each of our MPTCP connections had two subflows in our experiments (for simplicity [23]). We used the NetEm (NetEm: https://wiki.linuxfoundation.org/networking/netem, accessed on 13 May 2022) network emulator to control the bandwidth, delay, and packet loss on the Gigabit Ethernet links in order to emulate the characteristics of wireless paths. We used d = 100 ms and µ = 4 packets/s in all of our experiments.
The performance evaluation of the MPTCP algorithms was carried out with the following settings. We used five MPTCP connections (each with two subflows) between the server and the two clients. The server ran a single MPTCP implementation at a time for each experiment. The packet traffic was generated by a large file transfer from the HTTP server, where the file size ranged from 2 to 10 MB. Packet traces were captured using tcpdump (https://www.tcpdump.org, accessed on 13 May 2022) and analyzed with wireshark (https://www.wireshark.org, accessed on 13 May 2022).  Table 2 provides a concise summary of our results, which are presented and explained in the following subsections. We check for the target loss (µ) and delay (d) in all of our experiments.

Baseline Results
For the verification and validation of our experimental environment, we started with two baseline scenarios.

Observation 0
Scenario 0 used a single-path network, for which all TCP versions should perform the same. The first row of results in Table 2 confirms this observation, which is summarized in the rightmost column of the table.

Observation 1
Scenario 1 used two homogeneous paths with no packet loss at all. In this scenario, our RSF MPTCP should perform the same as LIA. The next row of results in Table 2 confirms this observation, as summarized in the table. In fact, we observe the exact same results for both RSF-DB and RSF-LB MPTCP.

Homogeneous Paths
The next set of experiments considered homogeneous paths, but with non-zero packet loss. In these scenarios, RSF MPTCP started to differentiate itself from the other MPTCP algorithms.
For Scenario 2, we set the capacity (10 Mbps), delay (150 ms), and Bernoulli packet loss probability (2%) for the links to be the same for both paths. We then studied the impacts on the RSF-DB throughput when downloading files of different sizes (2 MB ≤ size ≤ 8 MB) with different MPTCP algorithms, each operating under the same settings. Figures 3 and 4 illustrate the throughput and fairness results, respectively. Figure 3 shows that, when the size of the file is 4 MB, our proposed RSF MPTCP achieves almost double the throughput of LIA, OLIA, and BALIA. Figure 4 shows that the corresponding Jain fairness index stays quite close to unity. Hence, the throughput is highly improved by using our design without any adverse impact on fairness.  We also considered Scenario 3 with a much higher Bernoulli packet loss probability (10%) on both paths. While the throughput declined for all MPTCP algorithms, our RSF MPTCP still provided much higher throughput than the others.

Observation 2/3
The results from Scenario 2 and Scenario 3 lead to the following observation. The proposed RSF MPTCP outperformed all other MPTCP algorithms in terms of throughput and achieved similar fairness.

Heterogeneous Paths
The next set of experiments considered heterogeneous network paths. Scenario 4 had heterogeneous paths that differed in their bandwidth. We set the delay (150 ms) and random packet loss probability (4%) to be the same for both paths. We set the bandwidth to a fixed value of 20 Mbps for one path while varying the capacity of the other one from 10 to 80 Mbps. We then measured the RSF-DB performance when downloading a 10 MB file using different MPTCP algorithms. Figure 5 shows the impact on performance. In general, the throughputs achieved by the existing MPTCP algorithms were fairly low and remained almost constant, even with the increasing bandwidth on the second path. That is, they did not exploit the additional bandwidth well. In contrast, the RSF-DB algorithm showed consistent improvement in throughput under the same network settings. When packet losses were frequent, the existing MPTCP algorithms suffered a lot, but our algorithm was able to discover heterogeneous paths quickly and improved the load balancing across paths. The results from Scenario 4 are summarized as follows. The proposed RSF MPTCP outperformed the other MPTCP algorithms when the paths had heterogeneous bandwidths.
Next, Scenario 5 considered paths with heterogeneous delay characteristics. We set equal bandwidths for the two paths (20 Mbps) and the same random packet loss probability (3%) for both paths. We fixed the delay for one of the paths at 20 ms and then varied the delay of the other path from 20 to 100 ms. As before, we observed the RSF-DB throughput when downloading a 10 MB file. Figure 6 illustrates the overall impact on throughput, which can be summarized as follows. The throughput of all MPTCP algorithms (including RSF) decreased with increasing delay, as expected. However, the decline was quite gradual for RSF MPTCP and quite pronounced for the other MPTCP algorithms. This result was mainly due to the delay required for re-transmission of lost packets, as well as the buffering delay when re-sequencing out-of-order packets received at the client. In contrast, the balanced pricing strategy adopted in our design made RSF MPTCP superior to the others in an environment with heterogeneous delay and non-negligible packet losses.

Observation 5
Scenario 6 examined heterogeneous paths with different loss characteristics. We set the packet loss probability to a fixed value of 3% for one path while varying the packet loss probability for the other path. Both paths had equal bandwidth (20 Mbps) and delay (150 ms). Figure 7 depicts the throughput performance when downloading a file with a size of 10 MB. The results show that, with increasing packet loss, the throughput achieved by the existing MPTCP algorithms decreased. In contrast, the RSF-LB algorithm outperformed them and provided robust performance, even with increasing channel errors. This was because most existing algorithms significantly reduced congestion windows after detecting packet losses. However, the proposed RSF MPTCP could dynamically determine the best way to adjust the congestion windows using the balanced pricing approach.

Observation 6
This experiment can be summarized as follows. The proposed MPTCP algorithm was robust and outperformed the others under adverse channel conditions.
More importantly, we observed that, in the case of heterogeneous paths, the target loss and delay were mostly satisfied by the proposed RSF MPTCP where possible (i.e., bounded by Equation (27)).

Additional Experiments
The remaining experiments considered different networking scenarios to evaluate the responsiveness and robustness of RSF MPTCP.
In Scenario 7, we used iPerf3 to study the performance of short-lived MPTCP flows. For the network setup, we used RSF-DB and set equal bandwidth (20 Mbps), delay (150 ms), and loss (2%) for both paths. For the traffic, we generated short-lived flows with randomly generated durations between 1 and 30 s. These flows arrived in the network according to a Poisson arrival process (arrival rate = 8) and departed at random times, as well.  The results in Figure 8 illustrate RSF MPTCP's ability to handle the frequent establishment and termination of flows. This feature reflects the increased responsiveness of our MPTCP algorithm to network changes. The two main enhancements in our design, namely, balanced pricing and adaptive congestion control, enabled this responsiveness.
In Scenario 8, we investigated a case in which one of the two paths suddenly became unavailable (e.g., WiFi path disappeared, but the LTE connection was still fine). The results are shown in Figure 9, where we compare the throughput obtained by the intermittent subflows of different MPTCPs.

Observation 8
RSF MPTCP performed well in the face of a dynamically changing number of subflows, and it outperformed the other MPTCP algorithms. When the number of subflows changed over time in the realistic network scenario, we observed a sharp drop in performance at 50 s with all algorithms (including RSF MPTCP). However, our RSF-LB algorithm detected the failed path very quickly. This mechanism made the RSF MPTCP algorithm highly responsive to changes in network paths. The other existing MPTCP algorithms were all quite sluggish in detecting and abandoning the failed path.

Observation 9
Scenario 9 investigated fairness between TCP and MPTCP. Specifically, we considered a scenario of five MPTCP connections when one of the paths was shared with five regular (single-path) TCP connections.
The results obtained by using Scenario 9 are shown in Figure 10. Figure 10 illustrates the throughput perceived by both TCP and MPTCP connections, which demonstrated the following: RSF MPTCP was quite friendly towards TCP while providing better throughput than the other MPTCP algorithms.   Figure 10 shows that when RSF-DB MPTCP was used, the throughput achieved by regular TCP flows was comparable to that obtained while competing with the other MPTCP algorithms. However, the corresponding MPTCP connection had much higher throughput. More specifically, when compared to the other MPTCP algorithms, the performance of TCP when competing with cooperative MPTCP flows was slightly lower, but the throughput gain for MPTCP was very high. The throughput gain rose because of RSF MPTCP's ability to exploit the capacity of both available paths. The reason for the slightly reduced TCP throughput needs further investigation, and this remains for future work. The observation with RSF-LB was also similar.

WiFi-Based Experiment
We performed another experiment in a real WiFi testbed with settings similar to those of Scenario 5, but using two WiFi network interfaces. For both paths, the bandwidths were 54 Mbps and the delays were 8 ms. Packet losses were determined empirically based on the characteristics of the WiFi channel. We again used five MPTCP connections and measured the average throughput of the individual connections. The results are shown in Figure 11.  In this experiment, both the delay and the packet loss probability were very low, so the throughput advantages of RSF-DB MPTCP were small (but still measurable) compared to the other MPTCP algorithms. Note that our RSF MPTCP coincided with LIA under deterministic path conditions and zero packet loss. However, these properties seldom hold on real wireless networks. Furthermore, we observed that the target loss and delay were often always met with RSF MPTCP where attainable (aggregate achievable loss/delay < target).

Potential Usage and Applications
The outcomes of this research paper laid the foundation for improving the responsiveness of the future mobile internet. The potential benefits, usage, and applications can be observed from the following three different perspectives.

Benefits to Service Providers, the Environment, and Society
Internet giants, such as Amazon, found that every 100 ms of latency cost them 1% in sales; Google found that an extra 0.5 s in search page generation time dropped traffic by 20%. The gaming industry, where latencies larger than even 80 ms can hurt gameplay, has even tougher latency requirements. The expectations of today's NOW customers continue to grow, and the amount of data generated and accessed is mind-boggling. CISCO has described how 3.2 quintillion bytes of data are generated every day and has stated, that over the last two years alone, 95% of the data in the world were generated. It is clear that the need for speed and scale is escalating, and we need to understand how we can support future networks to remain competitive from all aspects: optimized operations, maximized network utility, and enhanced user experience. The benefits are not purely for large corporations. Reduced latency improves the quality of gaming and video telephony, improving the lives of isolated rural and under-connected global populations. Moreover, the applicability of the outcomes to the internet of vehicles will also help combat traffic congestion, bringing benefits to both the environment and society.

Potential for Impact in 6G Specifications
The 3GPP specification put forward ATSSS for multi-connectivity of WiFi and cellular over the 6G core [2,3]. This is also reflected in the ongoing development by Google of BBR (bottleneck bandwidth and roundtrip time) and the redesign of QUIC (a clean-sheet design overlaid on UDP, standardized by IETF in 2021) at the transport layer. Important trends include: (i) multi-connectivity with ATSSS, especially the concurrent use of both 3GPP and non-3GPP interfaces on mobile devices [34], (ii) use of forward error correction (FEC) to reduce latency, (iii) improving operation over lossy wireless links, and (iv) support for incremental rollout by avoiding the need for changes to the network fabric. However, many knowledge gaps need to be resolved for these to occur and for them to be applied for emerging technologies, which range from semantic communication [35] to federated learning over wireless vehicle-to-grid networks [36].

Applicability of the Proposed RSF Framework
We employed the RSF framework [8] in this paper to guarantee network utility maximization in a contest between many data subflows for future internet applications. The RSF framework can be enhanced and applied to simulate the interplay of rival entities in the field of distributed computing networks [13]. To this end, the competition among social media users for exposure over the timeline was modeled using RSF [12]. In a different context, an RSF-based incentive mechanism for crowd-sourcing was developed [37] and the contention between the miners was framed as RSF [38] in order to examine the multi-cryptocurrency blockchain. Recently, RSF has been employed for strategic resource management to operationalize network slicing [39]. Therefore, the developed RSF framework and the findings in this paper are not only timely, but have the potential for high impact in the field.

Conclusions
We developed a novel framework for designing MPTCP algorithms by using a rentseeking game-theoretic framework. This approach guarantees the existence of a utility function with strict diagonal concavity. Another advantage of the diagonally strict concave property of the utility function is that it is possible to develop dynamic congestion controls that converge to the normalized equilibrium while guaranteeing stability. Our network emulation experiments have demonstrated higher throughput for our RSF MPTCP implementations without compromising fairness. Furthermore, our RSF MPTCP approach is highly robust to the characteristics of heterogeneous network paths and provides better responsiveness than other existing MPTCP algorithms.