Optimal Congestion Pricing with Day-to-Day Evolutionary Flow Dynamics: A Mean–Variance Optimization Approach

: This paper investigates the optimal congestion pricing problem that considers day-to-day evolutionary ﬂow dynamics. Under the circumstance that trafﬁc ﬂows evolve from day to day and the system might be in a non-equilibrium state during a certain period of days after implementing (or adjusting) a congestion toll scheme, it is questionable to use an equilibrium-based index under steady state as the objective to measure the performance of a congestion toll scheme. To this end, this paper proposes a mean–variance-based congestion pricing scheme, which is a robust optimization model, to consider the evolution process of trafﬁc ﬂow dynamics in the optimal toll design problem. More speciﬁcally, in the mean–variance-based toll scheme, travelers aim to minimize the variance of expected total travel costs (ETTCs) on different days to reduce risk in daily travels, while the average ETTC over the whole planning period is restricted to being no larger than a predetermined target value set by the authorities. A metaheuristic approach based on the whale optimization algorithm is designed to solve the proposed mean–variance-based day-to-day dynamic congestion pricing problem. Finally, a numerical experiment is conducted to validate the effectiveness of the proposed model and solution algorithm. Results show that the used 9-node network can reach a steady state within 18 days after implementing the mean–variance-based congestion pricing, and the optimal toll scheme can be also obtained with this toll strategy.


Introduction
Congestion exists in many network systems when demand temporarily and spatially exceeds capacity. Many regional planning organizations and transportation management authorities face enormous challenges in mitigating heavy traffic congestion to enable a high level of services for citizens. According to the Federal Highway Administration (FHWA) of the US [1], the budget request for fiscal year 2019 for transportation infrastructure capacity construction reached USD 40.41 billion; in 2018, traffic congestion cost Americans approximately USD 1348 per driver or USD 87 billion in total [2]. Similarly, many emerging mega-regions in Asia have invested substantially into road and transit backbone networks [3][4][5][6]. For example, the related budget in Beijing reached up to CNY 43.10 billion in 2018, but it is still very challenging to manage and spread the 36% of trips during the morning and afternoon peak hours (07:00~09:00 and 17:00~19:00); the current average speed of privately owned vehicles is 20 km/h [7]. It is widely recognized that only increasing infrastructure construction cannot fundamentally solve the traffic congestion problem, e.g., Braess's paradox shows that an extension of the road network may deteriorate the overall network performance with the same traffic demand [8,9]. entire planning period (which is three months in their work), and the network performance of each day is taken into consideration in robust optimization with the objective of minimizing the maximum regret on each day. However, as shown in its objective function, one needs to first solve the optimal toll for each day and take the corresponding total travel cost as a benchmark for each day. It is obvious that calculating the optimal toll for each day over the whole planning period is very time-consuming in the first stage of the minimax model, and this paper proposes a new toll scheme to overcome the computational issues in the robust optimization minimax model proposed by [19] with DTD dynamics.
In this paper, we aim to design and solve the DTD-DCP problem in the mean-variancebased framework. Mean-variance analysis is widely used in financial economics to help investors to spread risk in the portfolio [36,37]. In mean-variance analysis, investors first measure risk, which is expressed by variance, and then compare the risk of an asset with its likely return. The aim of mean-variance optimization is to maximize the reward of an investment with respect to its risk. In the congestion pricing problem with DTD dynamics, the total travel cost for each day over the whole planning period changes from day to day, and there is not one particular toll scheme that can result in minimal total travel cost for each day. In the mean-variance-based toll scheme proposed in this paper, travelers aim to minimize the variance of expected total travel costs (ETTCs) on different days to reduce risk in daily travels, while the average ETTC over the whole planning period is restricted to being no larger than a predetermined target value set by the authorities to maximize the reward over the whole planning period. To solve the proposed mean-variance-based DTD-DCP problem, this paper designs a metaheuristic approach based on the whale optimization algorithm (WOA) [38].
The rest of this paper is organized as follows: Section 2 defines the notations used in this paper and describes the distance-based congestion pricing problem, which is a usage-oriented toll scheme and widely used in the literature (e.g., [19,39]; just to name a few). Section 3 describes the discrete route-based DTD dynamics model to capture the flow and cost evolution process. In the DTD dynamics model, the information provided by advanced traveler information systems (ATIS) is also taken into consideration. The mean-variance-based congestion pricing model to determine the robust optimal toll rate in the context of DTD dynamics is also demonstrated in Section 3 and followed by the WOAbased metaheuristic solution approach in Section 4. Finally, the numerical experiment and conclusions are outlined in Sections 5 and 6, respectively.

Preliminaries
Consider a directed traffic network G = (N, A), where N and A are the set of nodes and links in the network, and n and a are the elements in N and A, respectively. Denote the set of origin-destination (OD) pairs by W, and w ∈ W. The number of OD pairs is denoted by |W|. r ∈ R w is the path connecting OD pair w ∈ W, R w is its set, and |R w | is the number of paths connecting OD pair w ∈ W. q w is the traffic demand between OD pair w ∈ W. The flow and travel time on link a ∈ A are denoted by x a and t a . The flow on path r ∈ R w is expressed by f w,r . The actual (i.e., experienced) path travel time for r ∈ R w is denoted by c w,r . There are two types of forecasted travel time for path r ∈ R w , including the path travel time forecasted by travelers h w,r and the path travel time forecasted by ATIS g w,r . The generalized path travel cost for r ∈ R w is expressed by C w,r . d represents the evolutionary calendar days for the DTD dynamics model, and D is the total planning period of interest. The notations used in this paper are summarized in Table 1. The total planning period, which is 90 days in this paper τ The congestion toll scheme ETTC(τ, d) The expected total travel cost on day d with the toll scheme τ

ETTC
The predetermined target of the average expected total travel cost during the total planning period q w Travel demand between OD pair w ∈ W f d w,r The flow on route r ∈ R w on day d h d w,r The traveler's predicted route travel cost for r ∈ R w on day d g d w,r The ATIS's predicted route travel cost for r ∈ R w on day d C d w,r The actual (i.e., experienced) route travel cost for r ∈ R w on day d The relationship between path travel time and link travel time can be expressed by where δ w,r,a = 1 if link a is on path r, which connects the OD pair w ∈ W, and δ w,r,a = 0 otherwise. Taking congestion charging into consideration, we can calculate the generalized path travel cost by where τ w,r denotes the toll for path r ∈ R w , and κ is the value of time (VOT), which is assumed to be a constant in this paper. It is worth noting that this constant VOT can be easily extended to a continuously distributed VOT [39], which is out of scope in this paper. The congestion toll is assumed to be a distance-based scheme, which has recently attracted a lot of attention both theoretically [19,[39][40][41] and practically [42]. Suppose that there is a charging cordon, which is predetermined by the government, in the transport network. For a particular path r ∈ R w in the network connecting an OD pair w ∈ W, a portion of the path is located inside the charging cordon, with the length of η w,r . The toll function for using tolled roads inside the cordon is denoted by φ(η w,r ). Followed by [19,40], we use a piecewise linear toll function to approximate the nonlinear distance-based toll.
As illustrated in Figure 1a, the purple path connects a particular origin and destination pair. The charging cordon is shown in the dashed area, and the travel distance inside the cordon is calculated as 16.2, which is the sum of all link lengths belonging to the path inside the cordon. For a piecewise linear approximated toll scheme, we need to first define the minimal and maximal travel distances inside the cordon, which are denoted by η 0 and η Z , with Z equal intervals between them. The toll values corresponding to η z are expressed by y z . As shown in Figure 1b, the toll with a travel distance of η w,r inside the cordon is calculated by With a determined toll scheme at the beginning of the planning period, the network flow will evolve dynamically from day to day. In the next section, we introduce a discrete DTD dynamics model to describe the DTD evolutionary flow dynamics.  With a determined toll scheme at the beginning of the planning period, the network flow will evolve dynamically from day to day. In the next section, we introduce a discrete DTD dynamics model to describe the DTD evolutionary flow dynamics.

A Discrete DTD Dynamics Model
After implementing a toll scheme, network flows will change since travelers' route choice decisions will change due to the toll. Therefore, a toll levied in the charging cordon results in a new flow evolution trajectory. To evaluate the new toll scheme prior to implementation, one needs to make a prediction on the DTD flow evolution trajectory if this new toll scheme is implemented. In the DTD dynamics model, travelers make predictions on travel costs for the next day at the end of each day and then make decisions on the route choice behaviors in terms of the predicted route costs and previously experienced route costs.
In this paper, the ATIS is taken into consideration in travelers' cost predictions [43]. In other words, at the end of day 1 d  , the ATIS makes a prediction and broadcasts the predicted information (i.e., the route travel costs) through learning from the actual route travel costs and predicted route travel costs on day 1 d  . In addition to the ATIS's prediction, travelers also have their predictions on travel costs. With the information published by the ATIS, travelers predict the route travel costs by learning from the ATIS's information and their predictions on the previous day. Finally, travelers make decisions in terms of their predictions. Mathematically, the DTD dynamics model can be expressed by the following three processes, i.e., the flow updating, cost updating, and information broadcasting processes: [Flow Updating Process] [Information Broadcasting Process] Figure 1. Illustration of the distance-based toll scheme and its piecewise linear approximation [19,40].

A Discrete DTD Dynamics Model
After implementing a toll scheme, network flows will change since travelers' route choice decisions will change due to the toll. Therefore, a toll levied in the charging cordon results in a new flow evolution trajectory. To evaluate the new toll scheme prior to implementation, one needs to make a prediction on the DTD flow evolution trajectory if this new toll scheme is implemented. In the DTD dynamics model, travelers make predictions on travel costs for the next day at the end of each day and then make decisions on the route choice behaviors in terms of the predicted route costs and previously experienced route costs.
In this paper, the ATIS is taken into consideration in travelers' cost predictions [43]. In other words, at the end of day d − 1, the ATIS makes a prediction and broadcasts the predicted information (i.e., the route travel costs) through learning from the actual route travel costs and predicted route travel costs on day d − 1. In addition to the ATIS's prediction, travelers also have their predictions on travel costs. With the information published by the ATIS, travelers predict the route travel costs by learning from the ATIS's information and their predictions on the previous day. Finally, travelers make decisions in terms of their predictions. Mathematically, the DTD dynamics model can be expressed by the following three processes, i.e., the flow updating, cost updating, and information broadcasting processes: [Flow Updating Process] [Cost Updating Process] [Information Broadcasting Process] where C d w,r , g d w,r , and h d w,r are the actual (i.e., experienced) route travel cost, the ATIS's predicted route travel cost, and the traveler's predicted route travel cost for r ∈ R w on day d, respectively. f d w,r is the flow on route r ∈ R w on day d, and p h d w,r is the route choice probability function with respect to the predicted route travel cost h d w,r based on the random utility theory. α, β, and γ are the flow updating parameter, traveler's predicting parameter, and ATIS's predicting parameter, respectively. In this paper, it is assumed that the route choice probability function follows the logit model, giving where θ is a perception-error-related parameter, which describes the dispersion in travelers' route choices. It is worth noting that when γ = 1, we have g d w,r = c d−1 w,r , which means that ATIS's prediction on the route travel cost for day d at the end of day d − 1 equals the actual travel cost experienced by travelers on day d − 1. In other words, the ATIS releases the actual travel cost at the end of each day when γ = 1, and this is widely used in the DTD dynamics model [19,44,45].

Mean-Variance-Based Robust Optimization Model
Mean-variance analysis is one of the most important aspects of modern portfolio theory [46], and it is a process to weigh risk (which is expressed by the variance) against the expected return. Investors weigh how much risk they can take on in return for rewards. It is assumed that investors will rationally make decisions on investment if complete information is provided to them. In mean-variance analysis, investors seek to find the maximal reward under a given level of risk (or the minimal risk under a given level of reward).
As for the congestion pricing problem with DTD dynamics, the system-level performance measure with the total travel cost is fluctuant due to evolutionary traffic flow dynamics. It would be unlikely for one particular toll scheme to exist which results in a minimal total travel cost for each day over the whole planning period. Similar to the mean-variance analysis in the portfolio theory which minimizes the risk at a given level of reward, in the mean-variance-based toll scheme proposed in this paper, travelers aim to minimize the variance of total travel costs on different days, while the mean of the total travel costs over the whole planning period is restricted to being no larger than a predetermined target value set by the authorities. Additionally, since travelers experience perception errors in route choices, it is more reasonable to use the expected total travel costs (ETTCs) than the total travel cost in the objective function [19,41]. Mathematically, the expected total travel cost for each day under the toll scheme τ can be formulated as [19,41]: where f(τ, d) and c(τ, f, d) are the column vectors of route flows and costs on day d under toll scheme τ, respectively. Therefore, in order to minimize risk, travelers seek to minimize the variance of expected total travel costs on different days with congestion pricing considering DTD traffic flow dynamics. Mathematically, this is expressed as and the DTD dynamic traffic flows introduced in Section 3. ETTC(τ, d) is the expected total travel cost on day d with toll scheme τ. ETTC is the predetermined target of the mean of expected total travel costs during the total planning period. Objective (9) aims to minimize risk, while constraint (10) indicates that the reward should be no less than a given level (or, more strictly speaking, the negative reward should be no larger than a given level, since the expected total travel cost can be seen as the negative reward). Comparing the mean-variance-based toll scheme defined in Equations (9) and (10) with the minimax regret model in [19], we can see that it does not need to calculate the minimal ETTC for each day in the mean-variance-based toll scheme, which greatly reduces the computational burden in the solution procedure.

Solution Algorithms
With a particular toll rate, one needs to calculate the flow and cost evolutionary trajectories to obtain the objective value defined in Equation (9); obviously, this is difficult since there is no analytical formula for the objective function, and traditional gradientbased algorithms cannot be used to solve the complex model. In this paper, we propose a metaheuristic approach based on the whale optimization algorithm to solve the meanvariance-based congestion pricing problem under DTD dynamics.
The whale optimization algorithm is a nature-inspired metaheuristic algorithm which imitates the hunting behavior of humpback whales, including three operators of search for prey (i.e., exploration phase), encircling prey, and bubble-net foraging (i.e., exploitation phase). It was first proposed by [38]. Since then, it has attracted extensive attention and been widely used in many engineering optimization problems, e.g., classification, clustering, image processing, robot routing, and task scheduling; see a comprehensive review in [47]. As shown in [38], the whale optimization algorithm was found to be very competitive compared to other state-of-the-art metaheuristic algorithms with 29 mathematical benchmark functions and 6 structural engineering problems.
While it is popular in many engineering optimization problems, only a few studies [48,49] have solved traffic and transportation problems with the whale optimization algorithm. In this study, the whale optimization algorithm is used to solve the meanvariance-based congestion toll problem with DTD dynamics. One of the reasons that we chose the whale optimization algorithm is that it has an excellent ability to escape from local minima to find the global minima due to cooperation between the exploration and exploitation phases. The primary algorithm procedures are summarized as follows, while the details on the algorithm can be found in [38].

Encircling Prey
In the whale optimization algorithm, it is assumed that the current solution (i.e., the best search agent) is the target prey or close to the target prey, and other search agents update their locations toward the best search agent. Mathematically, this behavior can be formulated as: where t is the iteration number, X * is the current best solution, → X is the vector of position, → A and → C are vectors of coefficients, and · is an element-by-element multiplication. During the iteration process, X * will be updated continuously in each iteration if a better solution is obtained. The vectors of coefficients → A and → C can be calculated by: where → a decreases linearly from two to zero, and → r is a random vector between zero and one.

Bubble-Net Attacking
A bubble-net strategy is used by whales to attack their prey. There are two approaches in the bubble-net strategy, the shrinking encircling mechanism and spiral updating position.
(1) Shrinking encircling mechanism The location of the whale group is updated in terms of Equation (12). The values of → A are randomly set at [−1, 1], and the updated location of the search agent can be defined between its original location and the location of the current best solution.
(2) Spiral updating position The whale moves along the spiral route in a shrinking circle, and this process can be mathematically formulated by: is the distance from individual whales to the whale with the best location so far. b is a constant which determines the shape of the logarithmic spiral, and l is a random value in [−1, 1]. The shrinking encircling and spiral updating strategies are performed simultaneously. In practical applications, we usually assume that there is a probability of 50% to select one of the shrinking encircling or spiral updating approaches to update the locations of whales [38]. This process can be summarized as follows: where p is a randomly generated value of [0, 1].

Search for Prey
Humpback whales hunt randomly for prey. In the whale optimization algorithm, the behavior of randomly searching for prey is based on the variation of → A. More specifically, the randomly chosen search agent will move far away from a reference whale when → A ≥ 1 and move toward the whale with the best location so far when → A < 1. Such a mechanism with a randomly chosen search agent (rather than the agent with the best solution) allows the whale optimization algorithm to perform a global search. Mathematically, the search for prey can be modeled as follows: where → X rand is the random location (i.e., a random selected whale) chosen from the present population.
The pseudo code of the whale optimization algorithm can be summarized as follows [38]: Initialize the population X i (i = 1, 2, 3, · · · , n) Calculate the distance-based toll for each path with Equation (3) Conduct the day-to-day dynamics evolution process with Equations (4)-(7) Evaluate the fitness for all the agents in the population with Equation (9), and set X * as the best search agent while (t < maxIterations): for each search agent  (3) for each search agent Conduct the day-to-day dynamics evolution process with Equations (4)-(7) for each search agent end for Revise the search agent if it is out of the search space Evaluate the fitness for all the agents with Equation (9), and update X * if a better solution is found Update t = t + 1 end while Output X *

Numerical Experiment
A numerical experiment was conducted to validate the effectiveness of the proposed mean-variance toll scheme in the DTD dynamics framework. A simple network shown in Figure 2 is used in the numerical experiment. This network was designed by [41] and has been used in recent studies on the congestion pricing problem [19]. It contains 2 origindestination (OD) pairs, 9 nodes, and 13 links. Tables 2-5 summarize the travel demand information, link data, link-route incidence relationship, and internal routes which go through the charging cordon, respectively. The classical BPR (acronym for the Bureau of Public Roads) function is used to calculate the link travel times in this study, and the general form of the BPR function is expressed as t(v a ) = t 0 a · (1 + ξ · (x a /H a ) ρ ), where free flow travel time (FFTT) t 0 a , capacity H a , coefficient ξ, and exponent ρ can be found in Table 3.
Public Roads) function is used to calculate the link travel times in this study, and the general form of the BPR function is expressed as  Table 3.   1, 2, 3, 5, 6, 9 7 1, 2, 3, 5, 7, 8, 9 8 1, 2, 5, 6, 9   1, 2, 3, 5, 6, 9 7 1, 2, 3, 5, 7, 8, 9 8 1, 2, 5, 6, 9 9 1, 2, 5, 7, 8, 9 10 1, 2, 7, 8, 9 11 1, 8, 9 As can be seen in Table 5, there are 9 different routes going through the charging cordon, and travel distances inside the cordon range from 9 to 15. Similar to [19], it is assumed that the nonlinear distance-based toll is piecewise-linearly approximated by six intervals, which can be uniquely determined by seven vertexes. The length for each interval is one. Other parameters used are summarized as follows:  Figure 3 shows the results of the optimal distance-based toll rate in this numerical experiment. As we can see, the optimal distance-based toll rate is indeed a nonlinear function with respect to travel distances. However, the optimal toll rate is different to the one obtained in [19]. The reasons are threefold: (1) The input demands are different between this study and [19]; (2) the DTD dynamics model in this paper takes into consideration the information broadcasting process (see Equation (6)), while the one in [19] does not consider this process; (3) the optimization objectives are different since this study aims to minimize variance over different days, while the study in [19] aims to minimize the maximal regret over different days. 9 1, 2, 5, 7, 8, 9 10 1, 2, 7, 8, 9 11 1, 8, 9 As can be seen in Table 5, there are 9 different routes going through the charging cordon, and travel distances inside the cordon range from 9 to 15. Similar to [19], it is assumed that the nonlinear distance-based toll is piecewise-linearly approximated by six intervals, which can be uniquely determined by seven vertexes. The length for each interval is one. Other parameters used are summarized as follows:  Figure 3 shows the results of the optimal distance-based toll rate in this numerical experiment. As we can see, the optimal distance-based toll rate is indeed a nonlinear function with respect to travel distances. However, the optimal toll rate is different to the one obtained in [19]. The reasons are threefold: (1) The input demands are different between this study and [19]; (2) the DTD dynamics model in this paper takes into consideration the information broadcasting process (see Equation (6)), while the one in [19] does not consider this process; (3) the optimization objectives are different since this study aims to minimize variance over different days, while the study in [19] aims to minimize the maximal regret over different days.  To validate the convergence of the proposed DTD dynamics model, Figure 4 shows the evolutionary processes of the flow and cost over the planning period. It is clear that the simple network can reach a steady state within 18 days. When increasing the network size, the number of days to reach a steady state may be much larger. Figure 5 shows the flow and cost evolution processes when an optimal toll is levied in the cordon. Comparing it with Figure 4, we can see that levying the congestion toll does not increase the convergence speed in this network with the corresponding parameters set in this experiment, and this is because the objective in the optimization model is to reduce risk and increase reward, rather than increasing the convergence speed. Figure 6 describes the expected total travel cost for each day. The ETTC keeps steady after 18 days since the system reaches the steady state, while it is fluctuant during the first 18 days.
To validate the convergence of the proposed DTD dynamics model, Figure 4 shows the evolutionary processes of the flow and cost over the planning period. It is clear that the simple network can reach a steady state within 18 days. When increasing the network size, the number of days to reach a steady state may be much larger. Figure 5 shows the flow and cost evolution processes when an optimal toll is levied in the cordon. Comparing it with Figure 4, we can see that levying the congestion toll does not increase the convergence speed in this network with the corresponding parameters set in this experiment, and this is because the objective in the optimization model is to reduce risk and increase reward, rather than increasing the convergence speed. Figure 6 describes the expected total travel cost for each day. The ETTC keeps steady after 18 days since the system reaches the steady state, while it is fluctuant during the first 18 days.

Conclusions
This paper solves the optimal congestion pricing problem with DTD dynamics through the mean-variance optimization approach. Under the circumstance that traffic flows evolve from day to day and the system might be in a non-equilibrium state during a certain period of days after implementing a new congestion toll scheme, it is questionable to use an equilibrium-based index under a steady state as the objective to measure the performance of a congestion toll scheme. Consequently, a mean-variance-based toll

Conclusions
This paper solves the optimal congestion pricing problem with DTD dynamics through the mean-variance optimization approach. Under the circumstance that traffic flows evolve from day to day and the system might be in a non-equilibrium state during a certain period of days after implementing a new congestion toll scheme, it is questionable to use an equilibrium-based index under a steady state as the objective to measure the performance of a congestion toll scheme. Consequently, a mean-variance-based toll scheme which takes into consideration the evolutionary flow dynamics from day to day is proposed to robustly optimize network system performance over the whole planning period. In the meanvariance-based optimization model, travelers aim to minimize the variance of ETTCs on different days to reduce risk the daily trips from day to day, while the mean of ETTCs over the whole planning period is restricted to being no larger than a predetermined target value set by the authorities. The proposed mean-variance-based DTD-DCP problem is solved by a metaheuristic approach based on the whale optimization algorithm, and the results from a numerical study validate the effectiveness of the proposed model and solution algorithm. As for future research, some extensions are recommended to further improve congestion pricing with DTD dynamics. On the one hand, identifying how to design an exact solution algorithm, rather than a metaheuristic method, is still a challenge for the DTD-DCP problem. On the other hand, it is still unclear how the DTD dynamics model influences the optimal toll rate in the DCP problem. More studies should be conducted to investigate these issues in future.