Model Predictive Control of a Road Junction

Abstract: This paper presents a model predictive control (MPC) approach for optimally managing the traffic light (TL) signals at a road junction. The objective is to improve queue balancing compared to traditional control strategies where TL signals are periodic. The resulting MPC optimization problem is of quadratic mixed-integer nature. The proposed approach is validated via simulations based on a real scenario.


Introduction
Improving traffic light (TL) control can have a significant impact in terms of reducing traffic congestion in cities, reducing emissions, improving travel experience, etc. Intelligent TL control is an intensively studied area of research. The classic control approach, still widely used today, is to have fixed TL timings, optimized based on the statistical properties of the traffic flows to be controlled. This strategy has the merit of being extremely simple to implement, and it requires no particular sensing or computing equipment to be installed. Several improved TL control strategies have been proposed in the literature. A review of them can be found, e.g., in [1][2][3]. Advanced techniques used recently include deep reinforcement learning [4], game theory [5], evolutionary algorithms [6], fuzzy control [7], model predictive control (MPC) [8] and others. This work proposes an MPC control strategy based on quadratic mixed-integer optimization. Relevant related works in the literature include [9,10], in which the authors propose an MPC framework for intelligent traffic signal control in a road network. The objective is to reduce the total number of vehicles in the network (i.e., the total traffic). The MPC algorithm in [10] is based on mixed-integer linear programming, which results in feasible solving times. Linear programming is applied in [10] since in that study the objective is to minimize the sum of the queues in the network. The aim of the present work is instead to both minimize and balance the queues, which is achieved with a quadratic objective function. Furthermore, compared to [10], the present paper proposes a more accurate, nonlinear queue equation, in which the outflow from a queue is a function of the outflow rate (which in turn depends on the traffic light signal and the queue length), rather than simply a fraction of the queue (compare (10) with (3) in [10]).
In more recent work [11], the authors propose a model-free adaptive predictive control for phase splits of urban traffic networks. The goal is to minimize the number of vehicles in the network. The authors derive the nonlinear traffic network dynamical model and then linearize it to derive a simple MPC-based signal split strategy. In [11], dynamic linearization is achieved by computing at each control step the Jacobian of the nonlinear traffic network model. In the present work, instead, an effort is made to derive accurate, nonlinear queue equations which can be exactly linearized through the inclusion of additional auxiliary Boolean and continuous variables (e.g., see (10) to (15)). Reference [12] proposes a similar control strategy, also incorporating relevant rules from the standards of the US National Electrical Manufacturers Association (NEMA). Reference [13] extends the above works by proposing a distributed MPC framework (based on the dual decomposition method) for urban traffic network control. The authors show that the approach significantly reduces computation times and queue lengths. A similar approach will be pursued in future work to derive a decentralized version of the present work, for application to traffic network control scenarios. Furthermore, compared to [13], the present work provides a more accurate modelling of the queue balance equation, for the same reasons discussed above. In addition, in [13] the change of the queue length depends on the cycle time (i.e., the sum of the durations of the traffic light phases), while in the present work the cycle time is variable and adjusted by the controller in order to best match the traffic conditions. Reference [14] proposes a multi-objective optimal MPC of signals in urban traffic networks. The optimization goals considered are: maximizing system throughput, minimizing traveling delays, maximizing intersection crossing volume, avoiding spillbacks, and enhancing traffic safety.
In [14], the optimal solutions are found by resorting to genetic algorithms and Pareto multi-objective optimization theory. Reference [15] proposes a distributed event-triggered MPC strategy for traffic signal control in urban traffic networks. The main goal is to minimize the total time spent by the cars in the network. Contrary to traditional MPC, in which re-optimization is periodic and time-driven, the controller proposed in [15] performs re-optimization (i.e., it computes a new control solution) only when specific conditions are met, which avoids redundant computation and communication efforts. A similar approach could be embedded in the algorithm presented here in future work.
The main goal of the present paper is to show how the proposed control algorithm can improve the performance of the road junction, as captured by a set of key performance indicators (KPIs), such as queue balancing (the focus of the present work), minimization of the total vehicles' waiting time, throughput of vehicles (i.e., vehicles processed over time), etc. Compared to the above-mentioned works and the general existing literature, the present work is more focused on the control of the single road junction, for which a more fine-grained mathematical model is introduced. This enables more accurate control and optimization of different KPIs, both driver-related (e.g., time spent at the road junction) and network-related (e.g., throughput, balancing of the queues, etc.). In particular, as discussed above, this study introduces a different and more accurate nonlinear model of the queue length (which is the most important equation in this application area). Although the proposed queue model is nonlinear, it is also shown how it can be exactly linearized by introducing additional auxiliary variables. This allows keeping a linear formulation of the problem constraints, which reduces the computational complexity of the MPC problem. In addition, other detailed equations needed to realistically model the traffic light functioning logic are presented as well, such as the dynamics of the duration of the traffic light signs (8), the phase transition constraints (2)-(4) and the conflict avoiding constraints (17).
The rest of the paper is organized as follows. Section 2 presents the MPC control approach adopted in the paper. Section 3 presents the optimization problem at the basis of the proposed MPC approach: the optimization problem includes constraints to model the road junction, and an objective function to capture the optimization goals. Section 4 presents simulation results based on a real scenario. Finally, Section 5 presents conclusions and future work.

MPC Control Logic
The problem is set up in discrete time, with $T$ denoting the sampling time in seconds. $L$ denotes the number of traffic lights and $j \in \{1, 2, \ldots, L\}$ the generic traffic light. According to the MPC control method, at the generic current time $k = 1, 2, \ldots$, the TL control signals $g_{k,j}$, $y_{k,j}$, $r_{k,j}$ (respectively, green, yellow and red) to be actuated are found by solving a constrained optimization problem defined over a time window of $N$ time steps into the future, i.e., from time $k$ to time $k + N - 1$ (the time interval $[k, k + N - 1]$ is the so-called prediction horizon). The optimal control problem to be solved at the generic time $k$ is built based on the boundary conditions of the road junction, as acquired from the feedback of sensors deployed in the field or from the vehicles (considering cooperative scenarios).
In particular, the main information that the MPC controller needs to measure or estimate at every $k$ is the length of the queues. The MPC control loop is summarised in Algorithm 1.
In the next section, the mathematical formulation of the optimal control problem that is built and solved in Step 2 of Algorithm 1 is presented. The goal of the control problem is to derive the optimal TL timing, based on specific optimization criteria, and subject to a number of constraints that model the TL junction and that enforce a correct timing of the TLs.
Algorithm 1: MPC control of a road junction.
Step 1: Measure or estimate the current state (i.e., at time $kT$) of the road junction (in the present case, the number of vehicles in every queue).
Step 2: Based on the current state of the road junction, build and solve an optimal control problem to determine the optimal TL signals $g_{k,j}$, $y_{k,j}$, $r_{k,j}$. The optimal control problem optimizes the TL control over a time window going from the current time step $k$ to the future time step $k + N - 1$ (i.e., a window of $N$ time steps).
Step 3: Actuate the computed TL control signals $g_{k,j}$, $y_{k,j}$, $r_{k,j}$ over the time interval $[kT, (k + 1)T]$, and discard the remaining control sequence.
Step 4: Wait for the next time step (i.e., $k \leftarrow k + 1$) and go to Step 1.
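The receding-horizon structure of Algorithm 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the callables `measure`, `solve` and `actuate`, and the toy rule `longest_queue_first` that stands in for the optimizer, are hypothetical and are not part of the paper.

```python
from typing import Callable, List

State = List[float]      # one vehicle count per TL (hypothetical representation)
Signals = List[str]      # one of "g", "y", "r" per TL
Plan = List[Signals]     # one Signals entry per step of the prediction horizon


def run_mpc(
    measure: Callable[[int], State],
    solve: Callable[[State, int], Plan],
    actuate: Callable[[int, Signals], None],
    horizon: int,
    steps: int,
) -> List[Signals]:
    """Run `steps` iterations of the receding-horizon loop."""
    applied = []
    for k in range(1, steps + 1):
        state = measure(k)              # Step 1: measure or estimate the queues
        plan = solve(state, horizon)    # Step 2: optimize over [k, k + N - 1]
        first = plan[0]                 # Step 3: keep only the first move,
        actuate(k, first)               #         the rest of the plan is discarded
        applied.append(first)
        # Step 4: the loop index plays the role of k <- k + 1
    return applied


def longest_queue_first(state: State, horizon: int) -> Plan:
    """Toy stand-in for the optimizer: green for the longest queue, red elsewhere."""
    best = max(range(len(state)), key=lambda j: state[j])
    signals = ["g" if j == best else "r" for j in range(len(state))]
    return [signals] * horizon


log = []
applied = run_mpc(
    measure=lambda k: [3.0, 7.0, 1.0],
    solve=longest_queue_first,
    actuate=lambda k, s: log.append((k, s)),
    horizon=15,
    steps=2,
)
```

Only the first element of each plan is actuated, so a new plan is computed at every sampling instant from fresh measurements.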

MPC Road Junction Optimization Problem
We derive in the following a linear mathematical model for the road junction. The model captures all the constraints that need to be respected to enforce a correct functioning of the road junction and the TLs. The model is defined over the prediction horizon [k, k + N − 1].
In the following, the index $j \in \{1, 2, \ldots, L\}$ denotes a generic TL of the road junction, while the index $i$ denotes a generic time step in the prediction horizon (i.e., $i \in [k, k + N - 1]$).

Model of a TL
The state of a TL $j$ at time $i$ is given by the triple of Boolean variables $\{g_{i,j}, y_{i,j}, r_{i,j}\}$, denoting the green, yellow and red status, respectively. Exactly one of the three must be active:
$$g_{i,j} + y_{i,j} + r_{i,j} = 1. \qquad (1)$$
The right order of the transition of the lights from green to yellow to red is enforced by the following three constraints, which forbid, respectively, the transition from green to red, from red to yellow, and from yellow to green:
$$g_{i,j} + r_{i+1,j} \le 1, \qquad (2)$$
$$r_{i,j} + y_{i+1,j} \le 1, \qquad (3)$$
$$y_{i,j} + g_{i+1,j} \le 1. \qquad (4)$$
For example, it is seen from (2) that if $g_{i,j} = 1$, then it must be $r_{i+1,j} = 0$.
Next, for safety reasons, a minimum duration time for the yellow is set. To this end, an auxiliary variable $t^{y}_{i,j}$ is introduced to indicate the duration (in number of time steps) of the current yellow cycle up to time $i$ (the duration is zero if the sign is not yellow, greater than zero otherwise). Its dynamics is given by (5), where $y^{\uparrow}_{i,j}$ is a Boolean variable equal to one if and only if the state transits from green to yellow. This behaviour of $y^{\uparrow}_{i,j}$ is achieved via the constraints (6), which involve an arbitrarily small positive constant $\varepsilon$. In fact, when there is a transition from green to yellow, i.e., when $y_{i+1,j} - y_{i,j} = 1$, (6) forces $y^{\uparrow}_{i+1,j}$ to one. On the other hand, when $y_{i+1,j} - y_{i,j} = 0$ (i.e., when the yellow sign does not vary) or when $y_{i+1,j} - y_{i,j} = -1$ (i.e., the sign goes from yellow to red), (6) forces $y^{\uparrow}_{i+1,j}$ to zero, which is exactly the behaviour sought for $y^{\uparrow}_{i+1,j}$.
As explained, e.g., in [16], the nonlinear term $t^{y}_{i,j} y^{\uparrow}_{i,j}$ in (5), which is the product of a continuous variable and a Boolean variable, can be equivalently written in linear form by including an auxiliary variable $\tilde{t}^{y}_{i,j}$ and the auxiliary constraints (7). It can be verified that, with these constraints, $\tilde{t}^{y}_{i,j} = t^{y}_{i,j} y^{\uparrow}_{i,j}$. Hence (5) is rewritten in the linear form (8), from which it is seen that, when $y_{i,j} = 1$, the duration of the current yellow cycle is increased by one, and when $y^{\uparrow}_{i,j} = 1$ (i.e., the sign switches from green to yellow) $t^{y}_{i,j}$ is reset to zero.
Finally, the constraints (9) ensure that a minimum duration of the yellow sign is respected, where $t^{y,\min}_{i,j}$ denotes the minimum duration of the yellow sign. In fact, as long as $t^{y}_{i,j} \le t^{y,\min}_{i,j}$, the variable $y_{i,j}$ is forced by (9) to one (notice also from (5) that, since $t^{y}_{i,j}$ is reset only when the sign switches from green to yellow, it is always $t^{y}_{i,j} \ge t^{y,\min}_{i,j}$ when the sign is not yellow).
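The exact linearization of a product between a bounded continuous variable and a Boolean variable can be checked numerically. The sketch below uses the standard big-M inequalities, which is an assumption: the paper's own auxiliary constraints may be written differently, and the upper bound `t_max` and the function name are illustrative only.

```python
from typing import Tuple


def product_bounds(t: float, b: int, t_max: float) -> Tuple[float, float]:
    """Interval of z allowed by the four big-M inequalities, for 0 <= t <= t_max:

        z >= 0,   z <= t_max * b,   z <= t,   z >= t - t_max * (1 - b)
    """
    lower = max(0.0, t - t_max * (1 - b))
    upper = min(t, t_max * b)
    return lower, upper


def linearization_is_exact(t_max: float = 10.0) -> bool:
    """Check that the inequalities leave z = t * b as the only feasible value."""
    for b in (0, 1):
        for t in (0.0, 1.0, 2.5, t_max):
            lower, upper = product_bounds(t, b, t_max)
            if not (lower == upper == t * b):
                return False
    return True
```

With `b = 0` the inequalities pin `z` to zero, and with `b = 1` they pin `z` to `t`, so the product never has to appear explicitly in the constraints.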

Number of Vehicles in Queue
Let $n_{i,j} \ge 0$ denote the number of vehicles in queue at time $i$ at the TL $j$. Let $v^{in}_{i,j}$ and $v^{out}_{i,j}$ denote, respectively, the rate of vehicles (vehicles per second) entering the queue and exiting the queue at time $i$. The approximate dynamics of the queue is given by (10). In practice, $v^{out}_{i,j}$ and $v^{in}_{i,j}$ are time varying (they depend on the traffic conditions) and need to be estimated based on measurements of the real-time traffic flow (e.g., by using conventional vehicle detectors for signal operation, vehicle-to-infrastructure technology, etc.; see, e.g., [17,18]). Note that the two rates are equal when there is no queue and the TL is green, i.e., $v^{out}_{i,j} = v^{in}_{i,j}$ in that case. On the other hand, when $n_{i,j} > 0$ and the TL is green, $v^{out}_{i,j}$ will be equal to an "escape rate" which depends on the characteristics of the vehicles, the road (e.g., it will be higher for straight roads than for tight curves), etc. This rate can also be estimated based on measurements.
To model the above behaviour of $v^{out}_{i,j}$, a Boolean variable, $\tilde{n}_{i,j}$, is introduced to indicate whether $n_{i,j}$ is zero (in which case $\tilde{n}_{i,j} = 0$ as well) or greater than zero (in which case $\tilde{n}_{i,j} = 1$). This is achieved by introducing constraints that involve a large positive number $M$ (larger than the maximum value that $n_{i,j}$ can take). Therefore, when $\tilde{n}_{i,j} = 0$ (i.e., there is no queue) and $g_{i,j} = 1$ it is $v^{out}_{i,j} = v^{in}_{i,j}$, while when $\tilde{n}_{i,j} = 1$ and $g_{i,j} = 1$ it is $v^{out}_{i,j} = v^{esc}_{i,j}$, where $v^{esc}_{i,j}$ is the above-introduced escape rate of vehicles at the TL. Next, two auxiliary continuous variables are introduced to capture the two above "and" products: $\hat{g}_{i,j} := \tilde{n}_{i,j} g_{i,j}$ and $\check{g}_{i,j} := (1 - \tilde{n}_{i,j}) g_{i,j}$. It is easily verified that the desired behaviour for $\check{g}_{i,j}$ and $\hat{g}_{i,j}$ is obtained with a set of linear constraints (see, e.g., [19]). Hence (10) can be rewritten in the linear form (15).
Finally, it is necessary to add a last continuous and non-negative auxiliary variable $s_{i,j}$ to (15), which gives (16), in order to prevent $n_{i,j}$ from taking negative values. In other terms, $s_{i,j}$ acts as a feasibility slack variable that prevents infeasibility from arising due to $n_{i+1,j}$ becoming negative in specific conditions, e.g., when the TL is green, $n_{i,j} = 1$, $v^{in}_{i,j} - v^{esc}_{i,j} < -1$ vehicles/second and $T \ge 1$ second: this would result in infeasibility, because $n_{i+1,j}$ as computed by (15) would be negative. A dedicated term will be included in the objective function to ensure that $s_{i,j}$ is greater than zero only when needed (i.e., only to prevent infeasibility). More sophisticated modelling choices for $v^{out}_{i,j}$ are possible. For example, $v^{out}_{i,j}$ could be taken as an increasing function of the current duration of the green sign.
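A one-step simulation makes the roles of the two "and" products and of the slack variable concrete. The update used below, $n_{i+1,j} = n_{i,j} + T(v^{in}_{i,j} - v^{in}_{i,j}\check{g}_{i,j} - v^{esc}_{i,j}\hat{g}_{i,j}) + s_{i,j}$, is inferred from the definitions in the text and should be read as an assumption rather than as the paper's exact equations; the numerical values are hypothetical.

```python
from typing import Tuple


def queue_step(n: float, v_in: float, v_esc: float, green: int, T: float
               ) -> Tuple[float, float, float]:
    """Return (next queue, slack used, vehicles processed) for one time step."""
    n_tilde = 1 if n > 0 else 0          # Boolean indicator: is there a queue?
    g_hat = n_tilde * green              # green AND queue: outflow at v_esc
    g_check = (1 - n_tilde) * green      # green AND no queue: outflow at v_in
    processed = T * (v_in * g_check + v_esc * g_hat)
    raw = n + T * v_in - processed       # queue before the slack is applied
    slack = max(0.0, -raw)               # smallest s >= 0 that keeps the queue >= 0
    return raw + slack, slack, processed


# Red light: the queue grows by the arrivals only.
red = queue_step(3.0, 0.25, 1.5, 0, 4.0)
# Green light with one vehicle left: the raw update would be negative,
# so the slack absorbs the difference.
drained = queue_step(1.0, 0.25, 1.5, 1, 4.0)
# Green light and no queue: arrivals pass straight through.
free_flow = queue_step(0.0, 0.25, 1.5, 1, 4.0)
```

In the optimization problem the slack is a decision variable penalized in the objective; here it is simply set to the smallest value that keeps the queue non-negative.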
Starting from $n_{i,j}$, other interesting variables can be derived. For example, the approximate length of the queues (in meters) is $l_{i,j} = n_{i,j} l$, where $l$ denotes the average space occupied in queue by a vehicle. Including this variable allows, e.g., adding constraints on the maximum admissible length of a given queue, for example to minimize the possibility that a queue stretches to, and blocks, another traffic junction. Another quantity that can be derived from $n_{i,j}$ is an estimate of the waiting time $d_{i,j}$ for a vehicle entering queue $j$ at time $i$, which can be approximated via Little's law [20] as $d_{i,j} = n_{i,j} / v^{out}_{i,j}$, where $v^{out}_{i,j}$ should be the average exit rate. Little's law states that the waiting time in a queue is given by the number of items in the queue (vehicles) over the average exit rate (vehicles/second). The average vehicle exit rate can be a moving average of the observed exit rate. Future work will focus on a more accurate estimation of the waiting time for every vehicle in the queue. Finally, another interesting variable is the total number of vehicles going through the TL in the generic time interval $i$, which can be computed as $p_{i,j} = T(v^{in}_{i,j} \check{g}_{i,j} + v^{esc}_{i,j} \hat{g}_{i,j})$. This can be seen as the number of vehicles "processed" by the algorithm at lane $j$ at time $i$.
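As a small worked example of the derived quantities, the sketch below computes a queue length in meters and a Little's-law waiting time from a queue count. The numerical values (space per vehicle, observed exit rates, window size) are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def queue_length_m(n: float, space_per_vehicle_m: float) -> float:
    """Approximate queue length in meters: l_{i,j} = n_{i,j} * l."""
    return n * space_per_vehicle_m


def moving_average(samples: Sequence[float], window: int) -> float:
    """Average of the last `window` observed exit rates (vehicles/second)."""
    recent = list(samples)[-window:]
    return sum(recent) / len(recent)


def waiting_time_s(n: float, avg_exit_rate: float) -> float:
    """Little's law estimate: vehicles in queue over the average exit rate."""
    if avg_exit_rate <= 0:
        raise ValueError("average exit rate must be positive")
    return n / avg_exit_rate


observed_exit_rates = [0.25, 0.5, 0.75]             # hypothetical measurements
avg_rate = moving_average(observed_exit_rates, 2)   # uses the last two samples
length = queue_length_m(6, 5.0)                     # 6 vehicles, 5 m each
delay = waiting_time_s(5, avg_rate)
```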

Conflict Avoiding Constraints
A set of lanes $C_i$ with $|C_i| > 1$ is called "conflicting" if any two lanes in the set intersect (i.e., vehicles flowing in any two lanes could potentially collide). For example, there are three conflicting sets in the junction in Figure 1: $C_1 = \{1, 3\}$, $C_2 = \{1, 2, 4\}$ and $C_3 = \{2, 5\}$. Denote with $\mathcal{C}$ the set of all conflicting sets. The following constraint states that at most one TL per conflicting set can be green or yellow at any time:
$$\sum_{j \in C} \left( g_{i,j} + y_{i,j} \right) \le 1, \quad \forall C \in \mathcal{C}, \ \forall i. \qquad (17)$$
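The conflict rule can be illustrated with a short feasibility check over the three conflicting sets listed above. The dictionary-based signal representation and the function name are assumptions made for this sketch.

```python
from typing import Dict

# Conflicting sets of the example junction, as listed in the text.
CONFLICT_SETS = [{1, 3}, {1, 2, 4}, {2, 5}]


def is_conflict_free(signals: Dict[int, str]) -> bool:
    """True if at most one TL per conflicting set is green ("g") or yellow ("y").

    `signals` maps each TL index (1..5) to "g", "y" or "r".
    """
    for conflict in CONFLICT_SETS:
        active = sum(1 for j in conflict if signals[j] in ("g", "y"))
        if active > 1:
            return False
    return True


ok_a = is_conflict_free({1: "g", 2: "r", 3: "r", 4: "r", 5: "g"})
ok_b = is_conflict_free({1: "r", 2: "r", 3: "g", 4: "g", 5: "g"})
bad = is_conflict_free({1: "g", 2: "r", 3: "y", 4: "r", 5: "r"})
```

TLs 3, 4 and 5 never share a conflicting set, so they may be green together, whereas TL 1 green with TL 3 yellow violates the rule for $C_1$.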

Objective Function
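Optimization focuses on balancing the queues at the TLs. The objective function to be minimized is:
$$J = \sum_{i} \sum_{j=1}^{L} \left( \alpha_j n_{i,j}^{2} + M s_{i,j}^{2} \right),$$
where the outer sum runs over the time steps $i$ of the prediction horizon, $\alpha_j$ is a positive weight and $M$ a large positive number. The term $\alpha_j n_{i,j}^{2}$ serves to minimize and balance the length of the queues at the road junction. The term $M s_{i,j}^{2}$ serves to minimize the slack variable of constraint (16).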
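A two-queue example shows why a quadratic cost balances the queues while a linear one does not. The queue values and unit weights below are hypothetical.

```python
from typing import Optional, Sequence


def linear_cost(queues: Sequence[float]) -> float:
    """Sum of the queues: insensitive to how vehicles are distributed."""
    return sum(queues)


def quadratic_cost(queues: Sequence[float],
                   weights: Optional[Sequence[float]] = None) -> float:
    """Weighted sum of squared queues: penalizes long individual queues."""
    if weights is None:
        weights = [1.0] * len(queues)
    return sum(a * n * n for a, n in zip(weights, queues))


balanced = [5.0, 5.0]      # ten vehicles split evenly
unbalanced = [10.0, 0.0]   # the same ten vehicles in one queue

same_linear = linear_cost(balanced) == linear_cost(unbalanced)
costs = (quadratic_cost(balanced), quadratic_cost(unbalanced))
```

Both configurations hold ten vehicles, so the linear cost cannot tell them apart, while the quadratic cost is half as large for the balanced one.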

Overall MPC Optimization Problem
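The overall optimization problem (i.e., the one built and solved at Step 2 of Algorithm 1) is the minimization of the objective function above, subject to the constraints of the TL model, of the queue model and of conflict avoidance introduced in the previous subsections, including
$$g_{i,j} + y_{i,j} + r_{i,j} = 1, \quad \forall i, j,$$
and to the variable domains
$$\check{g}_{i,j}, \hat{g}_{i,j}, n_{i,j} \ge 0, \quad s_{i,j} \ge 0, \quad \tilde{t}^{y}_{i,j}, t^{y}_{i,j} \in \mathbb{R}, \quad \forall i, j.$$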
The above is a mixed-integer quadratic programming problem (quadratic objective and linear constraints), for which nowadays efficient solution algorithms and solver packages (both commercial and free) are available (see, e.g., [21,22]).
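To give a feel for what one MPC iteration computes, the sketch below replaces the mixed-integer quadratic solver with brute-force enumeration on a deliberately simplified junction. Everything here is an assumption made for illustration: there is no yellow phase and no minimum duration, exactly one lane is green at each step, and the horizon is kept very short so that all plans can be enumerated. It is not the formulation solved in the paper.

```python
from itertools import product
from typing import List, Sequence, Tuple


def step(n: float, v_in: float, v_esc: float, green: bool, T: float) -> float:
    """Queue update: outflow at v_esc if a queue exists, at v_in otherwise."""
    if green:
        v_out = v_esc if n > 0 else v_in
    else:
        v_out = 0.0
    return max(0.0, n + T * (v_in - v_out))   # max() plays the role of the slack


def best_plan(queues: Sequence[float], v_in: Sequence[float],
              v_esc: Sequence[float], T: float, horizon: int
              ) -> Tuple[Tuple[int, ...], float]:
    """Enumerate which lane is green at each step and keep the cheapest plan."""
    lanes = range(len(queues))
    best_cost, best = float("inf"), ()
    for plan in product(lanes, repeat=horizon):
        n: List[float] = list(queues)
        cost = 0.0
        for green_lane in plan:
            n = [step(n[j], v_in[j], v_esc[j], j == green_lane, T) for j in lanes]
            cost += sum(x * x for x in n)     # quadratic queue cost, unit weights
        if cost < best_cost:
            best_cost, best = cost, plan
    return best, best_cost


# Two equal queues, no arrivals: the cheapest plan alternates the green.
plan, cost = best_plan([4.0, 4.0], [0.0, 0.0], [1.0, 1.0], T=2.0, horizon=2)
```

The enumeration grows exponentially with the horizon, which is why a dedicated mixed-integer solver is needed for the full problem with five TLs and a longer horizon.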

MPC Optimality and Stability
As in all MPC applications, the greater N is (i.e., the used MPC prediction horizon), the closer the MPC performance is to the performance that would be achieved via solving an infinite-horizon optimization problem (which however cannot be solved in practice, as it includes an infinite number of variables, and also assumes in the present the knowledge of all the future relevant parameters and signals entering the problem, such as future arrival rates, etc.). In other terms, large values of N could improve performance, but they would also increase the complexity of each MPC iteration (because the number of variables increases) and introduce more uncertainty in the problem. In this paper, N = 15 has been chosen, as a value which results in small computation times and very good performance.
Regarding stability, whether the number of vehicles in the queues remains bounded over time (i.e., queue stability) depends mainly on the arrival and departure rates. For example, above certain arrival rates, queues might become unstable no matter the control strategy applied, simply because too many vehicles arrive and the traffic junction cannot process enough of them to prevent the queues from growing unbounded. In the simulations presented below, realistic traffic rates have been used, in line with the real observed ones. These rates do not result in stability problems either with the traditional control approach (now implemented at the traffic junction) or with the proposed MPC approach. Future work includes the simulation of the proposed approach in a much more accurate simulation environment (e.g., the Anylogic and SUMO tools). This will allow studying when and how instability phenomena arise (i.e., the limit arrival rates beyond which instability arises). Another interesting direction for future work is the theoretical investigation of stability when the system is controlled with MPC. This is not an easy task, as the MPC problem includes mixed real and integer (Boolean) variables. The analysis could be carried out starting from recent theoretical contributions like [23].

Validation by Simulation
In this section, a first validation of the proposed TL control system is given on a real case study. The aim is to demonstrate the performance improvement that the algorithm can bring compared to the currently adopted traditional control strategy, in which the TLs' timings are fixed and periodic. In the following, we demonstrate the ability of the controller to balance the length of the queue at the TLs.

Case Study, Simulation Scenario and Setup
We consider a real road junction in Rome, Italy (Figure 1). The TL control currently implemented at the road junction follows a fixed, periodic cycle. We simulate three hours of traffic, with low, medium and high vehicle arrival rates, respectively. The vehicle arrivals follow a Poisson distribution. The average arrival rates $\lambda_i$ for the different TLs are reported in Table 1 ($\lambda_i$ is the expected number of vehicles arriving at the 5 TLs every $T$ seconds). They reflect a realistic distribution of traffic load on the 5 roads constituting the junction. Simulations have been performed using the Julia programming language (https://julialang.org/), version 1.3.1 [24]. The quadratic mixed-integer optimization problem constituting the MPC iteration has been modeled using the Julia library JuMP [25] and solved with the Gurobi solver (http://www.gurobi.com/). The simulations have been performed on a 64-bit Windows 10 machine, equipped with an Intel i7-5500U CPU at 2.40 GHz and 8 GB of RAM. A sampling time $T$ of 5 seconds and a value of $N = 15$ have been chosen.
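A Poisson arrival sequence of the kind described above can be generated with the Python standard library alone. The sketch below is illustrative only: the rates in `RATES` are hypothetical placeholders, not the values of Table 1, and the sampler uses Knuth's multiplication method rather than whatever routine was used in the original Julia simulations.

```python
import math
import random
from typing import List, Sequence

T = 5                          # sampling time in seconds, as in the paper
STEPS_PER_HOUR = 3600 // T     # 720 steps per simulated hour

# Hypothetical rates (vehicles per T seconds): one row per hour
# (low, medium, high), one column per TL. Not the values of Table 1.
RATES = [
    [0.2, 0.1, 0.2, 0.1, 0.2],
    [0.5, 0.3, 0.5, 0.3, 0.5],
    [1.0, 0.6, 1.0, 0.6, 1.0],
]


def poisson(lam: float, rng: random.Random) -> int:
    """Knuth's method: count uniform draws until their product drops to e^-lam."""
    threshold = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > threshold:
        count += 1
        prod *= rng.random()
    return count


def generate_arrivals(rates: Sequence[Sequence[float]], steps_per_hour: int,
                      seed: int = 0) -> List[List[int]]:
    """One row per time step, one column per TL."""
    rng = random.Random(seed)
    return [[poisson(lam, rng) for lam in hour_rates]
            for hour_rates in rates
            for _ in range(steps_per_hour)]


arrivals = generate_arrivals(RATES, STEPS_PER_HOUR)
```

Fixing the seed makes the sequence reproducible, so the same arrivals can be fed to both the fixed-time controller and the MPC controller for a fair comparison.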

Queue Balancing
A sequence of vehicle arrivals in time at the different TLs is generated according to the arrival rates reported in Table 1. The sequence of vehicle arrivals is shown in Figure 2.
The results of the simulations are reported in Figures 3-5. The figures report, respectively, the comparison of the queue lengths at the five TLs, resulting from traditional control (gray line) and MPC control (black line), for the three hours of simulations.
The average queue lengths at the different TLs for the 3 h of test are reported in Table 2 (in brackets the values resulting from traditional control).
By analysing the figures and the table, it can be seen that the proposed MPC controller brings significant improvements under medium and high arrival rates, while the algorithm is not effective under low rates, where simpler reactive strategies are preferable. Notice from Figure 5 how the proposed algorithm manages to reduce and flatten the highest peaks of the queue curves, especially for TLs 1, 3 and 4.
Finally, the solving times for the 2160 MPC iterations (i.e., 3 h of simulation with a sampling time of 5 s) are reported in Figure 6. Solving times are compatible with the sampling time $T$ used.

Conclusions
This paper has presented a model predictive control (MPC) approach to the management of a road junction. The MPC algorithm controls the timing of the traffic lights (TLs) governing the road junction. The resulting optimization problem is a quadratic mixed-integer one and the resulting solving times are compatible with real implementation. The results have shown the effectiveness of the algorithm in balancing the queues at the road junction, compared to a classic control strategy where the timings of the TLs are fixed.
In future work, the proposed algorithm will be extended to account more accurately for the evolution of the cumulative and single vehicles' waiting times at the TLs, so that waiting times can be better traded off with the balancing of the queue lengths. Another line of research regards the derivation of distributed variants of the proposed MPC algorithm, to tackle the case of multi-junction road networks. Finally, the resulting algorithms will be validated using more advanced and realistic simulation tools (such as Anylogic [26] or SUMO ("Simulation of Urban MObility") [27]).
Funding: This research received no external funding.