Finding Equilibria in the Traffic Assignment Problem with Primal-Dual Gradient Methods for Stable Dynamics Model and Beckmann Model

In this paper we consider the application of several gradient methods to the traffic assignment problem: we search equilibria in the stable dynamics model (Nesterov and De Palma, 2003) and the Beckmann model. Unlike the celebrated Frank--Wolfe algorithm widely used for the Beckmann model, these gradients methods solve the dual problem and then reconstruct a solution to the primal one. We deal with the universal gradient method, the universal method of similar triangles, and the method of weighted dual averages, and estimate their complexity for the problem. Due to the primal-dual nature of these methods, we use a duality gap in a stopping criterion. In particular, we present a novel way to reconstruct admissible flows (i.e.,\ meeting the capacity constraints and induced by flows on the paths) in the stable dynamics model, which provides us with a computable duality gap.


Introduction
The Beckmann model for searching static traffic equilibria in road networks is among the most widely used models by transportation planners (Beckmann et al., 1956;Patriksson, 2015). The equilibria found are practical for evaluating the network efficiency and distribution of business centers and residential areas, and establishing urban development plans, etc. This model introduces a cost function on every link of a transportation network, which defines a dependence of the travel cost on the flow along the link. In practice the BPR functions are usually employed (US Bureau of Public Roads, 1964): wheret e are free flow times, andf e are road capacities of a given network's link e. We take these functions with parameters ρ = 0.15 and µ = 0.25. Nesterov and De Palma (2003) proposed an alternative model called the stable dynamics model, which takes an intermediate place between static and dynamic network assignment models. Namely, its equilibrium can be interpreted as the stationary regime of some dynamic process. Its key assumption is that we no longer introduce a complex dependence of the travel cost on the flow (as in the standard static models), but only pose capacity constraints, i.e. the flow value on each link imposes the feasible set of travel times Unlike in the Beckmann model, there is no one-to-one correspondence between equilibrium travel times and flows on the links of the network. We can illustrate the difference on a simple example of two parallel routes (Figure 1).

Figure 1: Parallel routes
Let the input flow take values 1000, 2000, and 3000 veh/h. For the stable dynamics model, in the first and second cases, all drivers choose the upper route; the equilibrium travel time simply equals the upper route's free flow time (0.5 h) in the first case and varies from 0.5 h to 1 h (according to the model) in the second. In the third case, the input flow exceeds the upper route's capacity, so the upper route's flow is 2000 veh/h, the lower one's is 1000 veh/h, and the equilibrium travel time is 1 h. All these equilibria can be interpreted as stationary regimes of some dynamic processes, e.g. the last case can be viewed as the result of the queue at the beginning of the upper route (since this route's capacity is smaller than the input flow) created by drivers who wanted to take this route until the waiting time plus the route's travel time reached the lower route's travel time (Nesterov and De Palma, 2003). In the Beckmann model equilibria are as follows: for all three cases only the upper route is used, and the equilibrium travel times are approximately 0.5 h, 0.6 h, and 0.9 h, respectively. Chudak et al. (2007) conducted a detailed comparison for large and small networks of equilibria in these two models.
In the Beckmann model, searching equilibria reduces to minimization of a potential function. One of the most popular and effective approaches to solve this problem numerically is the famous Frank-Wolfe method (Frank and Wolfe, 1956;Jaggi, 2013) as well as its numerous modifications (Fukushima, 1984;LeBlanc et al., 1985;Arezki and Van Vliet, 1990;Chen et al., 2002).
In the case of the stable dynamics model, one cannot directly apply the Frank-Wolfe method. However, an equilibrium can be found as a solution of a pair of primal and dual optimization problems. The same holds also for the Beckmann model, so in both cases we can apply primal-dual (sub)gradient methods.
In this work, we compare several primal-dual gradient methods for searching equilibria in both the Beckmann and the stable dynamics models, namely, the universal gradient method (UGM) (Nesterov, 2015), the universal method of similar triangles (UMST) (Gasnikov and Nesterov, 2018), and the method of weighted dual averages (WDA) (Nesterov, 2009). The main advantage of the above universal methods is an automatic adjustment to a local (Hölder) smoothness of a minimized function, which is especially important since the dual problems we are dealing with are essentially non-smooth. Due to the primal-dual nature of these methods, one can use an adaptive stopping criterion guaranteeing required accuracy.
The main contributions of this paper include the following: • We propose a novel way to reconstruct admissible flows (i.e., meeting the capacity constraints and induced by flows on the paths) in the stable dynamics model and a novel computable duality gap, which can be used in a stopping criterion.
• We provide theoretical upper bounds on the complexity of searching equilibria by the considered algorithms: UMST, UGD, and WDA.
• We conducted numerical experiments comparing these algorithms on the Anaheim transportation network the source code is available for use and can be found in Kubentayeva (2021).
The paper is organized as follows. In Section 2 we give a problem statement, define equilibria in the Beckmann and the stable dynamics models and corresponding optimization problems. Section 3 is devoted to the complexity analysis of UGM, UMST, and WDA. We show that the number of iterations required to obtain an ε-solution of primal and dual problems is O(1/ε 2 ) for UGM and UMST. In Section 4 results of experiments on Anaheim transportation network are presented. Finally, some conclusions are drawn in Section 5.

Problem statement
Let the urban road network be represented by a directed graph G = (V, E), where vertices V correspond to intersections or centroids (Sheffi, 1985) and edges E correspond to roads, respectively. Suppose we are given the travel demands: namely, let d w (veh/h) be a trip rate for an origin-destination pair w from the set OD ⊆ {w = (i, j) : i ∈ O, j ∈ D}. Here O ⊆ V is the set of all possible origins of trips, and D ⊆ V is the set of destination nodes. For OD pair w = (i, j) denote by P w the set of all simple paths from i to j. Respectively, P = w∈OD P w is the set of all possible routes for all OD pairs. Agents traveling from node i to node j are distributed among paths from P w , i.e. for any p ∈ P w there is a flow x p ∈ R + along the path p, and p∈Pw x p = d w . Flows from vertices from the set O to vertices from the set D create the traffic in the entire network G, which can be represented by an element of Note that the dimension of X can be extremely large: e.g. for n×n Manhattan network log |P | = Ω(n).
To describe a state of the network we do not need to know an entire vector x, but only flows on arcs: where δ ep = 1{e ∈ p}. Let us introduce a matrix Θ such that Θ e,p = δ ep for e ∈ E, p ∈ P , so in vector notation we have f = Θx. To describe an equilibrium we use both path-and link-based notations (x, t) or (f, t).
Beckmann model. One of the key ideas behind the Beckmann model is that the cost (e.g. travel time, gas expenses, etc.) of passing a link e is the same for all agents and depends solely on the flow f e along it. In what follows, we denote this cost for a given flow f e by t e = τ e (f e ). Another essential point is a behavioral assumption on agents called the first Wardrop's principle: we suppose that each of them knows the state of the whole network and chooses a path p minimizing the total cost The cost functions are supposed to be continuous, non-decreasing, and non-negative. Then (x * , t * ), where t * = (t * e ) e∈E , is an equilibrium state, i.e. it satisfies conditions if and only if x * is a minimum of the potential function: and t * e = τ e (f * e ) (Beckmann et al., 1956). Another way to find an equilibrium numerically is by solving a dual problem. According to Theo-rem 4 from Nesterov and De Palma (2003), we can construct it in the following way: Finally, we obtain the dual problem, the solution of which is t * : When we search for the solution to this problem numerically, on every step of an applied method we can reconstruct primal variable f from the current dual variable t: f ∈ ∂ w∈OD d w T w (t) (see Subsection 3.1). Then we can use the duality gap which is always nonnegative for the estimation of the method's accuracy: It vanishes only at the equilibrium (f * , t * ). Stable dynamics model (Nesterov and De Palma, 2003). An equilibrium state (x * , t * ) of the stable dynamics model satisfies the next conditions: , where τ (f ) is defined earlier by (2). The above formula can be reformulated in terms of an optimization problem: x * = arg min x∈X w∈OD p∈Pw Here, we add underlined constant terms to show that the pair (f * , t * ) is an equilibrium if and only if it is a solution of the saddle-point problem where its primal problem is and its dual problem is In contrast with the Beckmann model, the equilibrium state in the stable dynamics model is defined by pair (f * , t * ) (in particular, it differs from the system optimum (f * ,t) in the model only by the time value).

Numerical methods
We have the following objective functions • The stable dynamics model: • The Beckmann model: .

In both cases it has form
The optimization problem (3) is convex, non-smooth and composite. We use all these properties to identify the best optimization method to solve the considered problem.

Subgradient
In our research, we consider first-order methods, i.e. they require a subgradient of Φ(t), the properties and effective computation of which we discuss in this section.
To get the subdifferential ∂ Φ(t) let us re-write Φ(t) in the following way: where the vector a p = (δ ep ) e∈E encodes a path p. Obviously, the shortest path may not be unique.
Using the rules of subgradient calculus (Rockafellar, 2015) we get the following expression: i.e. the subdifferential ∂ Φ(t) is a sum of convex hulls of binary vectors that encode the shortest length paths. An important consequence is that for any t 1 , t 2 ∈ R |E| + and ∇Φ(t 1 ) ∈ ∂ Φ(t 1 ), ∇Φ(t 2 ) ∈ ∂ Φ(t 2 ) the following bound holds: where H is the diameter of the graph G.
Note that any element from the set ∂ Φ(t) has form ∇Φ(t) = −f , where f = Θx is a flow distribution on links induced by x ∈ X concentrated on the shortest paths for given times t (and vice versa: any such f corresponds to a subgradient of Φ(t)).
In practice, the calculation of flows f is the most expensive part, since we have to find the shortest paths for all pairs w ∈ OD. We use the following algorithm 1. We use Dijkstra's algorithm (Dijkstra et al., 1959) to find the shortest paths in line 3, which runs in O(|E| + |V | log |V |) time; finding the traversal order with topological sort (Section 22.4 in Cormen et al. (2009)) and further flows aggregation have linear performance O(|V |). Hence, the total complexity of Algorithm 1 is O |O|(|E|+|V | log |V |) . When the transportation network is an (almost) planar graph or another sparse graph, |E| = O(|V |) and the complexity is O(|O| · |V | log |V |). Moreover, flows reconstruction for every source o ∈ O can be computed in parallel. And Dijkstra's algorithm also can be parallelized and has efficient implementations (Crauser et al., 1998;Chao and Hongxia, 2010).
for v in traversal_order do 8: 12: end for 13: end for 14: return flows f

Reconstruction of admissible flows in SD model
For given times t considered Algorithm 1 reconstructs feasible flows f , i.e. f = Θx for some x ∈ X. These flows meet all the constraints in the Beckmann model, but they can violate the capacity constraints in the stable dynamics model. In the latter case, an additional step is required to obtain admissible flows from f . Note that we could instead find flows that meet capacity constraints first (Theorem 8 from Nesterov and De Palma (2003)), but to reconstruct feasible flows from them is a more complex problem.
Suppose we are given some flows g = Θx such that Then for any f = Θx we can construct admissible flows π(f ) in the following way: let η = max e∈E f e /f e − 1, then ξf +ηg ξ+η , η > 0. In practice, we propose the following procedure to find admissible flows g: run some optimization method (e.g. UGM) for a small number of iterations for the same problem but with decreased capacities: 1 2f instead off ; if obtained feasible flowsf N satisfyf N ≤ 3 4f , then take g =f N ; otherwise, run it again with capacities 3 4f and checkf N ≤ 7 8f , etc.
Stopping criterion. The stopping criterion we use for the stable dynamics model is based on a duality gap wheref N ∈ {Θx : x ∈ X},t N ≥t are estimates of an equilibrium (f * , t * ) after N iterations of the applied method. Note that here the duality gap withf N is not applicable.

Universal gradient method
The method for solving non-smooth problems with smooth techniques was proposed by Nesterov (2015) and was called the universal gradient method. The pseudocode of UGM for the considered problem (3) is provided in Algorithm 2. Here the euclidean prox-structure is used. Note that we did not specify the stopping criterion as it can be different for different models.

Algorithm 2 Universal gradient method
Input: L 0 > 0, accuracy ε > 0 1: Set t 0 :=t, k := 0 2: repeat 3: while true do 5: k := k + 1 13: until Stopping criterion is fulfilled Now let us definef where L k are the estimates of the local Lipschitz constant in UGM and UMST methods. Convergence of the UGM was proved in Nesterov (2015) and is summarized in the following lemma and theorem.
Lemma 3.1. After N iterations of UGM for the stable dynamics model it holds that wheref N ,t N , and S N are defined by (7), and R = t * −t 2 is the distance from the starting point to a solution.
Theorem 3.2. Let L 0 ≤ M 2 ε , where M comes from (4). Then after at most iterations of UGM for the stable dynamics model it holds that Q(t N ) − Q(t * ) ≤ ε. Moreover, the stopping criterion (6) is fulfilled after at most iterations, where ξ comes from (5).
Now we provide results on the rate of convergence for the Beckmann model. The stopping criterion in this case is the following: Lemma 3.3. After N iterations of UGM for the Beckmann model it holds that wheref N ,t N , S N are defined by (7), and R = t * −t 2 Theorem 3.4. Let L 0 ≤ M 2 ε , where M comes from (4). Then after at most iterations of UGM for the Beckmann model it holds that Q(t N ) − Q(t * ) ≤ ε. Moreover, the stopping criterion (13) is fulfilled after at most

Universal Method of Similar Triangles
Let us introduce the following notations: Flows are reconstructed in the following way: Lemma 3.5. After N iterations of UMST for the stable dynamics model it holds that wheref N is defined by (17) and R = t * −t 2 is the distance from the starting point to a solution.
Theorem 3.7. Let L 0 ≤ 4M 2 ε , where M comes from (4). Then after at most iterations of UMST for the Beckmann model it holds that Q(t N ) − Q(t * ) ≤ ε. Moreover, the stopping criterion (13) witht N = t N is fulfilled after at most iterations, whereR is defined by (16).

Method of Weighted Dual Averages
Convergence of WDA-method was proved in Nesterov (2009) and is summarized in the following theorem.
Theorem 3.8. Non-composite WDA-method satisfies the following bounds • for the stable dynamics model: • for the Beckmann model if µ ≤ 1:

Numerical experiments
This section presents numerical results for the algorithms described above, namely, composite variants of UMST and UGM, both composite and non-composite WDA-method, on the Anaheim network (Transportation Networks for Research Core Team, 2021; Chudak et al., 2007). The network consists of 38 zones, 416 nodes, and 916 links. Experiments and the source code in Python 3(Van Rossum and Drake, 2009) can be found in Kubentayeva (2021). We used Dijkstra's algorithm for finding the shortest paths in the network from the graph-tool library (Peixoto, 2014), where it is implemented in C++. We also used the Numpy library (Harris et al., 2020) for all vector operations.
Stable dynamics model. Parameters of the network are adjusted to the Beckmann model, so we have to increase the capacities to ensure the existence of an equilibrium for the stable dynamics model. In our experiments, the capacities are multiplied by 2.5. In Figure 2, we plot the number of (inner) iterations of the algorithms required to fulfill the stopping criterion (6) against 1/ε. We consider the number of inner iterations for Alg. 3 and Alg. 2 since the complexity of an inner iteration in this case is similar to the complexity of an iteration of the other algorithms. Note that according to (Nesterov, 2015, formula (2.23)) the number N (k) of inner iterations of UGM or UMST at step k is bounded as so asymptotic rates from Theorems 3.2, 3.4, 3.6, and 3.7 are still valid. As we can see, the best results are shown by UMST, followed by UGM having similar performance. Both composite and non-composite WDA-method are much slower. k := k + 1 6: until Stopping criterion is fulfilled Figure 3 shows the convergence rates of the methods for the Beckmann model. The Frank-Wolfe method demonstrates the best results and is followed by UMST. Unlike the stable dynamics case, composite WDA-method is faster than UGM. However, the non-composite WDA-method has the worst performance again. Figure 3: Convergence rates of UMST, UGM, composite and non-composite WDA-methods, and the Frank-Wolfe method for the Beckmann model with the stopping criterion (13). Hereε is the relative accuracy ε/∆ 0 , where ∆ 0 is the duality gap at the start point.

Conclusion
We considered several primal-dual subgradient methods for finding equilibria in the stable dynamics and the Beckmann models. We suggested a way to reconstruct admissible flows in the stable dynamics model, which provides us with a novel computable duality gap. Complexity bounds for UMST and UGM were presented in terms of the iterations number required to achieve a desired accuracy in the dual function value or the duality gap. Finally, we conducted numerical experiments comparing convergence of the considered algorithms on the Anaheim transportation network: UMST is the best one for optimization of the dual problems in both models. Also, using the duality gap as a stopping criterion, we compared these methods with the Frank-Wolfe algorithm for the Beckmann model which, as expected, remains the most suitable approach in this case (but it is not applicable for the stable dynamics model).
The reader may be interested in another related topic, searching stochastic traffic equilibria. In Gasnikov and Kubentayeva (2018); Baimurzina et al. (2019) we with our colleagues studied the application of the UMST for finding Nash-Wardrop stochastic equilibria in the Beckmann model. In this case, a driver selects a route randomly according to Gibbs' distribution taking into account current time costs on the links of the network. It leads to iteration complexity O( 1 √ γε ), where γ > 0 is a stochasticity parameter (when γ → 0 the model boils down to the ordinary Beckmann model). However, the great decrease in the number of iterations comes along with a more expensive calculation of the objective function's gradient. Nesterov, Y. and De Palma, A. (2003). Stationary dynamic solutions in congested transportation networks: summary and perspectives, Networks and Spatial Economics 3(3): 371-395. Patriksson, M. (2015). The traffic assignment problem: models and methods, Courier Dover Publications.
Here we used that Φ(0) = − w∈OD d w T w (0) = 0. Therefore, Now notice that, since the flowf N is induced by some traffic distribution x ∈ X, we have This yields and thus Proof of Theorem 3.2. Theorem 1 in Nesterov (2015) ensures that L k ≤ M 2 ε for all k ≥ 0, thus S N ≥ εN M 2 . Then the first bound (11) follows immediately from (8). Now let us prove the second bound. First, supposef N e ≤f e for all e ∈ E. Then π(f N ) =f N , thus by (9) Otherwise, iff N e ≤f e , one has π(f N ) = ξf N +ηg ξ+η , where η = max e∈Ef N e /f e − 1, hence (8) and (9) yield Finally, according to (10) Combining all bounds together we obtain and substituting N = N stop , we conclude that the stopping criterion (6) is fulfilled. and maximum is attained at point t = ∇Ψ(f N ) = τ (f N ). As in the proof of Theorem 3.2, the inequality (25) holds in Beckmann's model case. Then the first term in the r.h.s. can be estimated as follows and we finally get an upper bound on the duality gap: In the same time, substituting t = t * one obtains Proof of Theorem 3.4. By construction,t N e ≤ w∈OD d w for all e ∈ E, thus τ (f N ) −t 2 ≤R. According to Theorem 1 in Nesterov (2015) S N ≥ εN M 2 , thus the statement follows immediately from Lemma 3.3.

Proofs for UMST
Proof of Lemma 3.5. According to the inequality (30) in Gasnikov and Nesterov (2018) Note that the above inequality has the same form as (25), if one replaces S N with A N , 1 L k+1 with α k , y k with t k , and t−t 0 2 2 S N with t−t 0 2 2 2A N . Then the claim follows by the same reasoning as in the proof of Lemma 3.1.
From Young's inequality we get that If L k+1 ≥ A k+1 M 2 α k+1 ε , then the stopping condition for inner iterations is fulfilled. Therefore, at the end of the k-th iteration either L k+1 < 2A k+1 M 2 α k+1 ε or L k+1 = L k 2 . Now we are going to prove by induction that α k ≥ ε 2M 2 , which is equivalent to L k ≤ 2M 2 ε + 4M 4 ε 2 A k , for all k ≥ 1. For k = 1 it follows from A 1 = α 1 and L 0 ≤ 4M 2 ε . In case where L k+1 < 2A k+1 M 2 α k+1 ε equation A k+1 = L k+1 α 2 k+1 immediately yields α k+1 ≥ ε 2M 2 . If L k+1 = L k 2 , then by the induction hypothesis and monotonicity of the sequence {A k } k∈N we obtain Therefore, Arguing in the same way as in the proof of Theorem 3.2 we obtain that After substitution N = N Q or N = N stop the claim follows.
Proof of Theorem 3.7. Repeating the proof of Theorem 3.4 we obtain that Then we conclude applying (27).
In case of the stable dynamics model ∇h(t) =f , thus we can take L = M + f 2 . For the Beckmann model ∂h(t) ∂t e =f e t e −t ē t e ρ µ .