Multi-Agent-Based Data-Driven Distributed Adaptive Cooperative Control in Urban Traffic Signal Timing

Zhang, Haibo; Liu, Xiaoming; Ji, Honghai; Hou, Zhongsheng; Fan, Lingling

doi:10.3390/en12071402

by Haibo Zhang ¹, Xiaoming Liu ¹, Honghai Ji ¹,*, Zhongsheng Hou ² and Lingling Fan ³

¹ School of Electrical & Control Engineering, North China University of Technology, Beijing 100144, China
² School of Automation, Qingdao University, Qingdao 266071, China
³ School of Automation, Beijing Information Science & Technology University, Beijing 100192, China
* Author to whom correspondence should be addressed.

Energies 2019, 12(7), 1402; https://doi.org/10.3390/en12071402

Submission received: 25 March 2019 / Revised: 30 March 2019 / Accepted: 5 April 2019 / Published: 11 April 2019

(This article belongs to the Special Issue Energy Efficiency and Data-Driven Control)

Abstract:

Data-driven intelligent transportation systems (D²ITSs) have drawn significant attention lately. This work investigates a novel multi-agent-based data-driven distributed adaptive cooperative control (MA-DD-DACC) method for multi-direction queuing strength balance with changeable cycle in urban traffic signal timing. Compared with the conventional signal control strategies, the proposed MA-DD-DACC method combined with an online parameter learning law can be applied for traffic signal control in a distributed manner by merely utilizing the collected I/O traffic queueing length data and network topology of multi-direction signal controllers at a single intersection. A Lyapunov-based stability analysis shows that the proposed approach guarantees uniform ultimate boundedness of the distributed consensus coordinated errors of queuing strength. The numerical and experimental comparison simulations are performed on a VISSIM-VB-MATLAB joint simulation platform to verify the effectiveness of the proposed approach.

Keywords:

D²ITS; data-driven control; multi-agent systems; adaptive cooperative control; queuing strength balance; urban traffic signal timing

1. Introduction

Conventional technology-driven ITS is on the brink of a revolution in the age of big data. Data-driven intelligent transportation system (D²ITS) is trending to become a more powerful way of improving traffic efficiency, enhancing traffic security, saving energy and providing more possibilities for different stakeholders [1].

In urban transportation, it is costly and unsustainable to add additional infrastructure to accommodate the increased number of vehicles due to the limited road resources. A more socially feasible option is the optimization of traffic signal timing in a data-driven and learning-based manner. Since urban traffic systems are becoming more and more complex, establishing an exact mechanistic model of road networks or even an intersection is difficult or impossible due to its higher order, strong nonlinearities, non-stationary nature and complicated structures. Furthermore, acquiring the traffic data regarding vehicle count, queue, occupancy and flow has become much easier, and a huge amount of online/offline data is collected every day from auxiliary heterogeneous traffic sensors, e.g., inductive-loop detectors, microwave detectors, video surveillance and RFID. Thereby the spatial–temporal relationship between traffic data should be considered when performing a D²ITS.

One promising way is to develop data-driven control (D²C) [2] for traffic signal timing. In the last decade, research on D²C has received much more attention, such as model free adaptive control (MFAC) [3,4], iterative learning control (ILC) [5], lazy learning control (LL) [6], virtual reference feedback tuning (VRFT) [7], and iterative feedback tuning (IFT) [8]. Without using the explicit information from the mathematical model, D²C is designed by utilizing I/O data directly obtained from the controlled system or data processing [9]. From an application point of view, D²C of urban traffic systems will play a key role in D²ITS due to the availability, reliability and integrity of traffic data and is closer to real-life traffic signal control problems than many model-based approaches [10].

A renowned fixed-timing approach is Webster’s signal plan, which calculates the green time and the cycle from offline traffic data [11]. By contrast, adaptive mechanisms such as SCOOT [12], SCAT [13] and GLIDE [14] have been successfully implemented in real urban traffic systems because they tune the signal timing dynamically. In recent years, several D²C strategies have been investigated for the performance optimization of complex dynamic traffic systems. Adaptive dynamic programming (ADP) and reinforcement learning (RL) methods [15] have emerged for solving Markov decision processes under uncertain conditions. In [16], multiple reinforcement learning agents were developed for a series of successive signalized intersections to solve the problem of streetcar bunching. A Q-learning algorithm was introduced in [17] for optimal control of heavily congested traffic. To achieve the global optimization of urban traffic control, Salkham et al. [18] studied a collaborative reinforcement learning approach.

Although ADP-based learning control can achieve good results, its slow convergence rate, poor stability, curse of dimensionality and difficulty in selecting reward functions may severely hinder its application. Therefore, some real-time, rapid, recursive D²C designs using MFAC or ILC techniques have also been proposed for freeway traffic control systems with high nonlinearity and repetitive operations [19,20,21]. Unfortunately, those protocols are still insufficient for realizing traffic signal D²C.

Moreover, some prominent features, such as multi-functionality, multi-sourcing and multi-users, have recently become a positive force for change in traffic control. As a rapidly developing field of artificial intelligence, multi-agent-based distributed control strategies for urban traffic have therefore attracted significant attention because they provide a highly flexible and modular structure [22]. Rather than solving a complex traffic control problem as a whole, they offer distributed protocols that divide the original problem into many sub-problems to obtain an optimal solution, in particular for large-scale traffic network systems.

By using a hybrid neural network, a simultaneous perturbation stochastic approximation-based neural network (SPSA-NN) [23] was developed for each intersection. The total mean delay and the total mean stoppage time of each vehicle were substantially reduced. However, for each traffic pattern, a large number of training samples is needed to implement an SPSA-NN model, and the number of decision-making modules and the slow convergence rate reduce the system performance. Another distributed multi-agent-based control method is Type-2 fuzzy with dynamic reasoning (T2DR) [24], in which a weighted type-2 fuzzy inference engine and a separate belief model are proposed to calculate the weights for the inputs. The problem in this case is that T2DR strongly depends on the accuracy of the trained belief model, which limits its parameter learning capability. Similar to T2DR, the geometric fuzzy multi-agent system (GFMAS) modifies the belief model to reduce the number of training samples needed and uses geometric defuzzification to further reduce the computational requirements [25]. A large number of samples is, however, still needed to ensure good accuracy. In addition, an unconstrained fixed sequence clearing strategy (UNCFSCS) [26] has recently been adopted to completely clear the queued vehicles in a fixed sequence from the viewpoint of multi-phase coordination. Its application is very restricted because it requires unsaturated intersection conditions. Therefore, a sample-independent distributed multi-agent-based control method under constraints is indispensable as a supplement.

It was pointed out by Choy et al. [27] that a three layered hierarchical multi-agent architecture, intersection controller agents (ICA), zone controller agents (ZCA) and regional controller agents (RCA), is appropriate to analyze the real-time traffic signal control. In terms of ICA, the traffic data of queuing length, vehicle arrival rate and saturation flow rate collected from different sensors placed in the lanes is taken into account as each ICA input. Based on the obtained data, green time for a specific phase or direction in progress and the cooperation factor of multi-agents in intersections are computed to adjust the signaling and traffic routing pattern for alleviating traffic jams, reducing the environmental pollution and the consumption of energy.

In this paper, we focus on the signal timing problem of the lowest layer, the ICA. A novel multi-agent-based data-driven distributed adaptive cooperative control (MA-DD-DACC) method is investigated for multi-direction queuing strength balance with changeable cycles at single intersections in urban traffic signal control. The main contributions of this work can be summarized as follows:

(1): A rapid and recursive distributed adaptive cooperative control strategy with an online parameter learning law is proposed by using the queuing length and the network topology of multi-direction signal controllers at a single intersection;
(2): Traffic signal direction-sequence control is adopted here to avoid phase conflicts;
(3): A flexible changeable cycle signal timing strategy is analyzed satisfying maximum and minimum green time constraints;
(4): Both the undersaturation and supersaturation traffic flow conditions can be handled by the proposed MA-DD-DACC without distinction.

The rest of this paper is organized as follows: Section 2 briefly introduces the traditional store-and-forward model and the multi-directional queuing strength model. Section 3 introduces the data-driven cooperative control technology for queuing strength balance in a single intersection network. Section 4 presents the Lyapunov-based stability analysis of MA-DD-DACC. Section 5 gives the simulation results obtained by a joint programming method, showing the effectiveness of the algorithm. Finally, Section 6 summarizes the paper.

2. Preliminaries

2.1. Store-and-Forward Model

As shown in Figure 1, branch $m$ connects intersections $i$ and $j$, and vehicles move from intersection $i$ to intersection $j$.

The traffic flow dynamics of branch $m$ are expressed as follows:

x_{m} (k + 1) = x_{m} (k) + T [q_{m} (k) - u_{m} (k) + I_{m} (k) - O_{m} (k)]

(1)

where $T$ is the sampling interval, $x_{m}(k)$ denotes the number of vehicles on branch $m$, and $q_{m}(k)$ and $u_{m}(k)$ are the input and output flows of branch $m$ in $[kT, (k+1)T]$, respectively. $I_{m}(k)$ and $O_{m}(k)$ represent the traffic flow demand and dissipation of branch $m$, respectively.

The dissipative flow $O_{m}(k)$ satisfies the following equation:

O_{m} (k) = t_{m, 0} q_{m} (k)

(2)

where $t_{m,0}$ is the dissipation rate, which is a known constant.

The input flow $q_{m}(k)$ satisfies the following equation:

q_{m} (k) = \sum_{n \in I_{i}} t_{n, m} u_{n} (k)

(3)

where $t_{n,m}$ is the steering ratio of vehicles on branch $n$ entering branch $m$ through intersection $i$. Assuming that the traffic demand of branch $m$ is sufficient, the output flow $u_{m}(k)$ satisfies:

u_{m} (k) = (\frac{g_{m} (k)}{C}) O_{m}

(4)

where $C$ is the signal cycle and $g_{m}(k)$ denotes the green time, which satisfies:

g_{m} (k) = \sum_{l \in v_{m}} g_{j, l} (k)

(5)

If the sampling interval $T$ equals the cycle $C$, then $u_{m}(k)$ equals the mean traffic flow over the whole cycle, whereas the actual traffic flow is $s_{m}$ during the green period and zero during the red period. Hence, as long as the traffic demand is sufficient, there is a continuous and stable traffic flow in the road network.
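As a minimal sketch of Equations (1), (2) and (4) for a single branch, the snippet below advances the vehicle count by one sampling interval. The function names, the parameter name `green_flow` (standing in for the flow factor written as $O_{m}$ in Equation (4)) and all numerical values are illustrative assumptions, not values from the paper.

```python
def mean_outflow(green_time, cycle, green_flow):
    """Eq. (4): mean outflow over one cycle, u_m(k) = (g_m(k) / C) * O_m."""
    return green_time / cycle * green_flow


def store_and_forward_step(x, q, u, demand, dissipation_rate, T):
    """Eqs. (1)-(2): vehicle count on the branch after one sampling interval."""
    dissipation = dissipation_rate * q  # Eq. (2): O_m(k) = t_{m,0} q_m(k)
    return x + T * (q - u + demand - dissipation)


# Hypothetical values: flows in vehicles per second, times in seconds.
u = mean_outflow(green_time=30.0, cycle=60.0, green_flow=0.5)
x_next = store_and_forward_step(x=20.0, q=0.2, u=u, demand=0.05,
                                dissipation_rate=0.1, T=60.0)
print(u, x_next)
```

With these assumed numbers the outflow slightly exceeds the net inflow, so the vehicle count on the branch decreases over the interval.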

By substituting Equations (2)–(5) into (1), it can be obtained that:

$$x_{m}(k+1) = x_{m}(k) + T \left[ (1 - t_{m,0}) \sum_{n \in I_{i}} t_{n,m} \frac{O_{n} \sum_{l \in v_{n}} g_{i,l}(k)}{C} + I_{m}(k) - \frac{O_{m} \sum_{l \in v_{m}} g_{j,l}(k)}{C} \right]$$

(6)

The state equation of arbitrary dimension, topological structure and traffic flow characteristics can be rewritten by using Equation (6):

x (k + 1) = x (k) + B Δ I (k) - D Δ g (k)

(7)

where the state vector $x(k)$ denotes the number of vehicles on each road in the road network during the $k$-th sampling period, and $B$ and $D$ are constant matrices reflecting the road network characteristics. Denote $Δg(k) = g(k) - g^{N}$ and $ΔI(k) = I(k) - I^{N}$, where $g(k)$ is the green time vector of all intersections in the road network, $g^{N}$ is the corresponding initial (nominal) green time vector, and $I(k)$ and $I^{N}$ are the traffic demand and the initial steady flow, respectively.

2.2. Multi-Directional Queuing Strength Model with Changeable Cycle

The multi-directional queuing strength balance problem for traffic signal timing is shown in Figure 2. Based on the store-and-forward model, the multi-directional queuing length model with changeable cycle at single intersection is as follows:

l_{i} (k + 1) = l_{i} (k) + C (k) q_{i} (k) - S_{i} g_{i} (k)

(8)

where $i = 1, \dots, N$ indexes the directions of the intersection and $N = 4$ is the number of directions; $C(k)$ denotes the $k$-th cycle, which is changeable; $l_{i}(k)$ denotes the number of queuing vehicles of the $k$-th cycle in the $i$-th direction; $q_{i}(k)$ denotes the vehicle arrival rate of the $k$-th cycle in the $i$-th direction; $S_{i}$ denotes the saturation flow rate in the $i$-th direction; and $g_{i}(k)$ denotes the green time of the $k$-th cycle in the $i$-th direction.

The green time and traffic signal cycle in all directions at intersections satisfy the following constraints:

C (k) = \sum_{i = 1}^{4} g_{i} (k) + t_{L}

(9)

where $t_{L}$ denotes the total lost time at the intersection.

The queuing strength of an entrance line at an intersection is defined as the ratio of the queuing length of the entrance line (the number of queuing vehicles) to the length of its link (the storage capacity of vehicles):

x_{i} (k) = \frac{l_{i} (k)}{l_{i, \max}}

(10)

where $x_{i}(k)$ denotes the queuing strength of the $i$-th entrance line in the $k$-th cycle, and $l_{i,\max}$ denotes the vehicle storage capacity of the $i$-th entrance line.

Therefore, the multi-directional queuing strength model is given below:

x_{i} (k + 1) = x_{i} (k) + \frac{C (k) q_{i} (k)}{l_{i, \max}} - \frac{S_{i} g_{i} (k)}{l_{i, \max}}

(11)

The degree of saturation of the k-th cycle in the i-th entrance line of the intersection is defined as:

d_{i} (k) = \frac{C (k) q_{i} (k)}{S_{i} g_{i} (k)}

(12)

According to the definition of the degree of saturation, the traffic state of the $k$-th cycle in the $i$-th direction can be judged as follows: (1) $d_{i}(k) < 1$ indicates undersaturation; (2) $d_{i}(k) = 1$ indicates critical saturation; (3) $d_{i}(k) > 1$ indicates supersaturation.
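The queue update (8), the queuing strength (10) and the saturation test (12) can be sketched as follows. The function names, the tolerance used to detect critical saturation and the numbers in the example are assumptions made for illustration only.

```python
def queue_step(l, q, S, g, C):
    """Eq. (8): l_i(k+1) = l_i(k) + C(k) q_i(k) - S_i g_i(k)."""
    return l + C * q - S * g


def queuing_strength(l, l_max):
    """Eq. (10): queue length relative to the storage capacity of the link."""
    return l / l_max


def degree_of_saturation(q, S, g, C):
    """Eq. (12): arrivals in one cycle divided by the discharge capacity."""
    return C * q / (S * g)


def traffic_state(d, tol=1e-9):
    """Classify the direction by its degree of saturation."""
    if d < 1.0 - tol:
        return "undersaturation"
    if d > 1.0 + tol:
        return "supersaturation"
    return "critical saturation"


# Hypothetical direction: 80 s cycle, 20 s green, saturation flow 0.5 veh/s.
d_high = degree_of_saturation(q=0.2, S=0.5, g=20.0, C=80.0)
d_low = degree_of_saturation(q=0.1, S=0.5, g=20.0, C=80.0)
print(traffic_state(d_high), traffic_state(d_low))

# With a short queue and undersaturated demand, Eq. (8) can go negative.
print(queue_step(l=1.0, q=0.1, S=0.5, g=20.0, C=80.0))
```

The last call illustrates why the unconstrained model needs care in the undersaturated case: the computed queue length drops below zero.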

When the traffic state of the intersection is undersaturated, i.e., $d_{i}(k) < 1$, Equation (11) implies that $x_{i}(k+1)$ decreases continuously and may eventually become negative. Because traffic system variables are non-negative, the store-and-forward model loses its physical and control significance in the undersaturated state. For this reason, traditional store-and-forward analysis usually considers signal timing only under supersaturated traffic conditions. If more green time than necessary is allocated in an undersaturated situation, green time is wasted, so only the smaller amount of green time needed to ensure capacity should be assigned. However, since most traffic signal control systems use a fixed cycle and must satisfy maximum and minimum green time constraints, signal timing is often limited and cannot be perfectly compatible with undersaturated, critically saturated and supersaturated conditions. In order to avoid wasting green time, it is necessary to introduce a time-varying signal cycle $C(k)$, which leads to new green time constraints that ensure the queuing strength satisfies $x_{i}(k+1) \geq 0$. Therefore, the green time and signal cycle under constraints become:

$$\bar{g}_{i}(k) \triangleq \mathrm{sat}(g_{i}(k)) = \begin{cases} \min\{[l_{i}(k) + C(k) q_{i}(k)] / S_{i},\ g_{\max}\}, & g_{i}(k) > \bar{g} \\ g_{i}(k), & \underline{g} \leq g_{i}(k) \leq \bar{g} \\ g_{\min}, & g_{i}(k) < \underline{g} \end{cases}$$

(13)

C (k) = \sum_{i = 1}^{4} {\bar{g}}_{i} (k) + t_{L}

(14)

Accordingly, the constrained multi-direction queuing strength model (11) with changeable cycle at single intersection would be rewritten as:

$$\begin{cases} x_{i}(k+1) = x_{i}(k) + f_{i}(k) - \bar{u}_{i}(k) \\ f_{i}(k) = \frac{C(k) q_{i}(k)}{l_{i,\max}} \\ \bar{u}_{i}(k) = \frac{S_{i} \bar{g}_{i}(k)}{l_{i,\max}} \end{cases}$$

(15)

The proposed queueing strength model (15) can formulate three traffic flow conditions: undersaturation, critical saturation and supersaturation, simultaneously.
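A minimal sketch of the constrained timing in Equations (13) to (15) is given below. It assumes that the bounds $\underline{g}$ and $\bar{g}$ coincide with $g_{\min}$ and $g_{\max}$, and that the cycle length appearing inside Equation (13) is taken from the previous cycle; both are assumptions of this sketch, as are the function names and the numbers.

```python
def saturated_green(g, l, q, S, C, g_min, g_max):
    """Eq. (13), reading the bounds as g_min <= g <= g_max (assumption)."""
    if g > g_max:
        # Cap the green time at what clears the queue plus the new arrivals.
        return min((l + C * q) / S, g_max)
    if g < g_min:
        return g_min
    return g


def constrained_cycle(greens, lost_time):
    """Eq. (14): cycle length from the constrained green times."""
    return sum(greens) + lost_time


def strength_step(x, q, S, g_bar, C, l_max):
    """Eq. (15): x_i(k+1) = x_i(k) + f_i(k) - u_bar_i(k)."""
    return x + C * q / l_max - S * g_bar / l_max


# Hypothetical four-direction intersection.
raw_green = [50.0, 20.0, 5.0, 30.0]
queue = [30.0, 8.0, 2.0, 12.0]
arrival = [0.2, 0.1, 0.05, 0.1]
C_prev = 100.0  # previous cycle length, used inside Eq. (13) (assumption)

greens = [saturated_green(g, l, q, 0.5, C_prev, 10.0, 40.0)
          for g, l, q in zip(raw_green, queue, arrival)]
cycle = constrained_cycle(greens, lost_time=12.0)
x_next = strength_step(x=0.5, q=0.2, S=0.5, g_bar=greens[0], C=cycle, l_max=60.0)
print(greens, cycle, x_next)
```

In this example the first direction is clipped to the maximum green time and the third is raised to the minimum, after which the cycle length follows from Equation (14).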

3. Problem Formulation and Controller Design

3.1. Cooperative Control Problem Description

When the queuing strength balance is achieved at single intersection, the desired queueing strength can be expressed as:

$$\begin{cases} x_{r}(k+1) = x_{r}(k) + f_{r}(k) - u_{r}(k) \\ x_{r}(k) = \frac{1}{4} \sum_{h=1}^{4} x_{h}(k) \\ f_{r}(k) = \frac{1}{4} C(k) \sum_{h=1}^{4} \frac{q_{h}(k)}{l_{h,\max}} \\ u_{r}(k) = \frac{1}{4} \sum_{h=1}^{4} \frac{S_{h} \bar{g}_{h}(k)}{l_{h,\max}} \end{cases}$$

(16)

where $x_{r}(k) \in R$ is the average queuing strength of the four directions in the $k$-th cycle, and $f_{r}(k) \in R$ and $u_{r}(k) \in R$ are the average increase and decrease of queuing strength in the $k$-th signal cycle.

Considering each direction queuing strength as an agent, distributed consensus coordinated error of queuing strength in the i-th direction of one intersection is given below:

e_{i} (k) = \sum_{j \in N_{i}}^{} a_{i j} (x_{j} (k) - x_{i} (k)) + b_{i} (x_{r} (k) - x_{i} (k))

(17)

where $A = [a_{ij}]$ denotes the adjacency matrix of the network topology of the signal controllers at a single intersection, and $b_{i}$ denotes the connection coefficient between the queuing strength of the $i$-th direction and the desired one. Moreover, define the in-degree of the ICA as $d_{i} = \sum_{j=1}^{N} a_{ij}$ and the in-degree matrix as $D = \mathrm{diag}\{d_{i}\} \in R^{N \times N}$. Then the Laplacian matrix of the multi-direction signal controller graph is $L = D - A$.
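To make the graph quantities concrete, the sketch below builds $A$, $D$, $L$ and $B$ for a hypothetical ring topology in which each direction exchanges data with its two neighbours and only the first direction is connected to the desired strength. It then evaluates the local error of Equation (17) and compares it with the compact form $-(L + B)(x - \underline{1} x_{r})$ of Equation (22). The topology and the queuing strengths are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical ring topology over the four directions.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
b = np.array([1.0, 0.0, 0.0, 0.0])  # only direction 1 sees the desired strength

D = np.diag(A.sum(axis=1))  # in-degree matrix
L = D - A                   # graph Laplacian
B = np.diag(b)


def local_error(i, x, x_r):
    """Eq. (17): consensus coordinated error of direction i."""
    neighbours = sum(A[i, j] * (x[j] - x[i]) for j in range(len(x)))
    return neighbours + b[i] * (x_r - x[i])


x = np.array([0.8, 0.4, 0.6, 0.2])  # assumed queuing strengths
x_r = x.mean()                      # desired strength: average of the directions

e_local = np.array([local_error(i, x, x_r) for i in range(4)])
e_global = -(L + B) @ (x - x_r)     # compact form, Eq. (22)
print(e_local, e_global)
```

Both computations give the same vector because every row of $L$ sums to zero, so the term $L \underline{1} x_{r}$ vanishes.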

The unknown nonlinear parameterized functions

f_{i} (k)

and

f_{r} (k)

satisfy the relation:

f_{i} (k) - f_{r} (k) = \frac{C (k) q_{i} (k)}{l_{i, \max}} - \frac{1}{4} C (k) \sum_{h = 1}^{4} \frac{q_{h} (k)}{l_{h, \max}} = \frac{1}{4} C (k) \sum_{h = 1, h \neq i}^{4} (\frac{q_{i} (k)}{l_{i, \max}} - \frac{q_{h} (k)}{l_{h, \max}})

(18)

and consequently, satisfy the following inequality conditions by incorporating the parameter separation technique [28]:

| f_{i} (k) - f_{r} (k) | \leq | φ_{i} (k) | | θ_{i} (k) | | e_{i} (k) | \leq \bar{φ} | θ_{i} (k) | | e_{i} (k) |

(19)

where $θ_{i}(k)$ denotes an unknown function related to the vehicle arrival rate $q_{i}(k)$, and $φ_{i}(k)$ denotes a known function related to the signal cycle $C(k)$ that satisfies $0 < | φ_{i}(k) | \leq \bar{φ}$. According to Equation (18), we choose $φ_{i}(k) = \frac{1}{4} C(k) < \frac{1}{4} \bar{C} = \bar{φ}$.

The global multi-direction and desired queuing strength as well as consensus coordinated error can be described as follows, respectively:

x (k + 1) = x (k) + f (k) - \bar{u} (k)

(20)

x_{r} (k + 1) = x_{r} (k) + f_{r} (k) - u_{r} (k)

(21)

e (k) = - (L + B) (x (k) - x_{r} (k)) = - (L + B) \tilde{x} (k)

(22)

where $x = [x_{1}, \dots, x_{N}]^{T} \in R^{N}$, $x_{r} = \underline{1} x_{r} \in R^{N}$, $f = [f_{1}, \dots, f_{N}]^{T} \in R^{N}$, $f_{r} = \underline{1} f_{r} \in R^{N}$, $u_{r} = \underline{1} u_{r} \in R^{N}$, $\bar{u} = [\bar{u}_{1}, \dots, \bar{u}_{N}]^{T} \in R^{N}$, $e = [e_{1}, \dots, e_{N}]^{T} \in R^{N}$ and $\tilde{x} = x - x_{r} \in R^{N}$; $\underline{1} = [1, \dots, 1]^{T} \in R^{N}$ is the $N$-dimensional vector whose elements are all 1, and $B = \mathrm{diag}(b_{i}) \in R^{N \times N}$ denotes a diagonal matrix.

Moreover, the difference dynamics of global consensus coordinated error is as follows:

e (k + 1) = e (k) - (L + B) (\tilde{x} (k + 1) - \tilde{x} (k)) = e (k) - (L + B) (f (k) - f_{r} (k) + u_{r} (k) - \bar{u} (k))

(23)

Remark 1.

$\tilde{x} = x - x_{r}$ is the centralized global coordinated error of queuing strength. We assume that at least one directional signal controller can obtain the centralized global information $\tilde{x}$. The global coordinated error $e$, rather than the centralized global information, is more suitable for Lyapunov-based distributed data-driven adaptive controller design.

Remark 2.

Suppose that the digraph of the direction queuing strengths at a single intersection is strongly connected, so that it contains a spanning tree over all directions, and that there is at least one $b_{i} \neq 0$, i.e., at least one signal controller is connected to the desired queuing strength. Then the matrix $(L + B)$ is an irreducibly diagonally dominant $M$-matrix and is therefore nonsingular. All the eigenvalues of this $M$-matrix lie in the open right half plane.

Remark 3.

Denote the eigenvalues of a matrix $M$ by $σ(M)$. Then $\bar{σ}(M)$ and $\underline{σ}(M)$ represent the maximum and minimum eigenvalues, respectively. The Frobenius norm is defined as $‖ M ‖_{F} = \sqrt{\mathrm{tr}\{M^{T} M\}}$, where $\mathrm{tr}\{\cdot\}$ represents the trace of a matrix. The Frobenius inner product of two matrices is defined as $\langle M_{1}, M_{2} \rangle_{F} = \mathrm{tr}\{M_{1}^{T} M_{2}\}$.
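For reference, the trace-based quantities of Remark 3 can be evaluated with NumPy as in the short sketch below. The inner product is implemented in its standard form $\mathrm{tr}\{M_{1}^{T} M_{2}\}$, so that the norm is the square root of the inner product of a matrix with itself; the test matrices are arbitrary.

```python
import numpy as np


def frobenius_norm(M):
    """Frobenius norm via the trace: sqrt(tr(M^T M))."""
    return np.sqrt(np.trace(M.T @ M))


def frobenius_inner(M1, M2):
    """Standard Frobenius inner product: tr(M1^T M2)."""
    return np.trace(M1.T @ M2)


M1 = np.array([[1.0, 2.0], [3.0, 4.0]])
M2 = np.array([[0.0, 1.0], [1.0, 0.0]])
print(frobenius_norm(M1), frobenius_inner(M1, M2))
```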

For the convenience of the following discussion, we first state two lemmas on multi-agent systems and three properties of saturation functions in terms of the queuing strength model.

Lemma 1.

If the digraph is strongly connected and $B = \mathrm{diag}\{b_{i}\} \neq 0$, i.e., $L$ is irreducible and $B$ has at least one positive diagonal element $b_{i} > 0$, then the matrix $(L + B)$ is a nonsingular $M$-matrix. Moreover:

$$‖ \tilde{x} ‖ \leq ‖ e ‖ / \underline{σ}(L + B)$$

(24)

where $\underline{σ}(L + B)$ denotes the minimum eigenvalue of the matrix $(L + B)$, and $e = 0$ if and only if the queuing strength in all directions reaches balance.
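The bound (24) can be checked numerically on a small example. The sketch below uses a hypothetical ring topology with one pinned direction and evaluates $\underline{σ}(L + B)$ as the smallest singular value, which is the form in which this bound holds for a general non-symmetric $(L + B)$ and coincides with the smallest eigenvalue when the matrix is symmetric positive definite. Treating $\underline{σ}$ this way is an assumption of the sketch.

```python
import numpy as np

# Hypothetical ring topology, direction 1 pinned to the desired strength.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
B = np.diag([1.0, 0.0, 0.0, 0.0])
L = np.diag(A.sum(axis=1)) - A
M = L + B

x = np.array([0.8, 0.4, 0.6, 0.2])  # assumed queuing strengths
x_tilde = x - x.mean()              # centralized error
e = -M @ x_tilde                    # distributed error, Eq. (22)

sigma_min = np.linalg.svd(M, compute_uv=False).min()
lhs = np.linalg.norm(x_tilde)
rhs = np.linalg.norm(e) / sigma_min
print(lhs, rhs)
```

A small distributed error $e$ therefore forces a small centralized error $\tilde{x}$, which is what allows the controller to work with $e$ alone.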

Lemma 2.

If the Laplacian matrix $L$ of the digraph is irreducible and there is at least one positive diagonal element $b_{i} > 0$ in matrix $B$, then the matrix $(L + B)$ is a nonsingular $M$-matrix. Define:

$$q = [q_{1}, \dots, q_{N}]^{T} = (L + B)^{-1} \underline{1}$$

(25)

$$P = \mathrm{diag}\{p_{i}\} \equiv \mathrm{diag}\{1 / q_{i}\}$$

(26)

then $P > 0$. The matrix $Q$ is defined as:

$$Q = (L + B)^{T} P (L + B)$$

(27)

then $Q > 0$.
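The construction in Lemma 2 is easy to reproduce numerically. The following sketch computes $q$, $P$ and $Q$ from Equations (25) to (27) for a hypothetical ring topology with one pinned direction and checks that $P$ and $Q$ are positive definite; the topology is an assumption chosen for illustration.

```python
import numpy as np

# Hypothetical ring topology, direction 1 pinned to the desired strength.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
B = np.diag([1.0, 0.0, 0.0, 0.0])
L = np.diag(A.sum(axis=1)) - A
M = L + B

q = np.linalg.solve(M, np.ones(4))  # Eq. (25): q = (L + B)^{-1} 1
P = np.diag(1.0 / q)                # Eq. (26): P = diag{1 / q_i}
Q = M.T @ P @ M                     # Eq. (27)

# Q is symmetric here, so eigvalsh returns its real eigenvalues in ascending order.
q_eigs = np.linalg.eigvalsh(Q)
print(q, np.diag(P), q_eigs)
```

The extreme eigenvalues of $P$ and $Q$ obtained this way are the quantities $\underline{σ}(\cdot)$ and $\bar{σ}(\cdot)$ that enter the gain conditions of Theorem 1.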

For $g, h^{*} \in R$ satisfying $| g | \leq h^{*}$, the following properties hold.

Property 1 [5].

$$[g - \mathrm{sat}(h, h^{*})]^{2} \leq [g - h]^{2}$$

(28)

Property 2 [5]. For $h = \mathrm{sat}(g, h^{*}) + d$,

$$| \mathrm{sat}(h, h^{*}) - h | \leq | d |$$

(29)

Property 3 [29].

$$[(γ + 1) g - (γ h + \mathrm{sat}(h, h^{*}))] [h - \mathrm{sat}(h, h^{*})] \leq 0$$

(30)

where $γ \geq 0$.
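The three saturation properties can be spot-checked numerically. The sketch below assumes that $\mathrm{sat}(h, h^{*})$ is the symmetric clipping of $h$ to $[-h^{*}, h^{*}]$, which the text does not state explicitly, and tests inequalities (28) to (30) on random samples. It is a sanity check under that assumption, not a proof.

```python
import numpy as np


def sat(h, h_star):
    """Symmetric saturation: clip h to [-h_star, h_star] (assumed form)."""
    return np.clip(h, -h_star, h_star)


rng = np.random.default_rng(0)
h_star = 1.0
g = rng.uniform(-h_star, h_star, size=1000)  # samples with |g| <= h*
h = rng.uniform(-3.0, 3.0, size=1000)
d = rng.uniform(-2.0, 2.0, size=1000)
gamma = 0.7                                  # any gamma >= 0
tol = 1e-12

# Property 1, Eq. (28)
prop1 = (g - sat(h, h_star)) ** 2 <= (g - h) ** 2 + tol

# Property 2, Eq. (29), with h2 = sat(g, h*) + d
h2 = sat(g, h_star) + d
prop2 = np.abs(sat(h2, h_star) - h2) <= np.abs(d) + tol

# Property 3, Eq. (30)
prop3 = ((gamma + 1) * g - (gamma * h + sat(h, h_star))) * (h - sat(h, h_star)) <= tol

print(prop1.all(), prop2.all(), prop3.all())
```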

3.2. Data-Driven Distributed Adaptive Cooperative Control Design

The following standard definitions are given for the distributed cooperative control problem of multi-direction queuing strength balance.

Definition 1.

The global consensus coordinated error $e(k) \in R^{N}$ is uniformly ultimately bounded (UUB) if there exists a compact set $Ω \subset R^{N}$ such that, for all $e(k_{0}) \in Ω$, there exist a bound $B$ and a time $k_{t}(B, e(k_{0}))$, both independent of $k_{0} \geq 0$, such that $‖ e(k) ‖ \leq B$ holds for all $k \geq k_{0} + k_{t}$.

Definition 2.

Given the desired queuing strength model (16) with state $x_{r}(k)$, the queuing strengths are cooperatively uniformly ultimately bounded if there exists a compact set $Ω \subset R^{N}$ such that, for all $(x_{i}(k_{0}) - x_{r}(k_{0})) \in Ω$, there exist a bound $B$ and a time $k_{t}(B, (x_{i}(k_{0}) - x_{r}(k_{0})))$, both independent of $k_{0} \geq 0$, such that $‖ x_{i}(k) - x_{r}(k) ‖ \leq B$ holds for all $i$ and all $k \geq k_{0} + k_{t}$.

Considering the queuing strength model (15), the following data-driven queuing strength cooperative control strategy for a multi-agent single intersection is constructed:

$$\bar{u}_{i}(k) = \frac{S_{i} \bar{g}_{i}(k)}{l_{i,\max}} = \frac{p_{i} S_{i}}{(p_{i} + μ) l_{i,\max}} [\bar{g}_{i}(k-1) - v'_{i}(k)] = \frac{p_{i}}{p_{i} + μ} [\bar{u}_{i}(k-1) - v_{i}(k)]$$

(31)

where $v_{i}(k) = \frac{S_{i}}{l_{i,\max}} v'_{i}(k)$ represents a control signal to be designed later by Lyapunov techniques. The global cooperative control strategy can be written as follows:

$$\bar{u}(k) = (I + μ P^{-1})^{-1} (\bar{u}(k-1) - v(k))$$

(32)

where $μ > 0$ denotes a learning gain and $v = [v_{1}, \dots, v_{N}]^{T} \in R^{N}$.
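The per-direction update (31) and the global form (32), written with the identity matrix $I$, describe the same operation because $P$ is diagonal, so that $p_{i} / (p_{i} + μ) = 1 / (1 + μ / p_{i})$. The sketch below confirms this numerically; the values of $p_{i}$, $μ$, $\bar{u}(k-1)$ and $v(k)$ are arbitrary assumptions.

```python
import numpy as np

p = np.array([0.25, 0.2, 0.15, 0.2])      # assumed diagonal entries of P
P = np.diag(p)
mu = 0.5                                  # assumed learning gain
u_prev = np.array([0.30, 0.25, 0.20, 0.35])
v = np.array([0.05, -0.02, 0.01, 0.03])

# Eq. (31): scalar update applied to each direction.
u_elementwise = p / (p + mu) * (u_prev - v)

# Eq. (32): global form, solved as a linear system instead of inverting.
u_global = np.linalg.solve(np.eye(4) + mu * np.linalg.inv(P), u_prev - v)

print(u_elementwise, u_global)
```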

Theorem 1.

Consider the multi-agent-based queuing strength system (15) and (16) at a single intersection, and suppose that the digraph of the network topology of the multi-direction signal controllers is strongly connected with at least one $b_{i} \neq 0$. According to the distributed consensus coordinated error (17), the distributed adaptive cooperative control term $v_{i}(k)$ is selected as follows:

$$v_{i}(k) = c \sum_{j \in N_{i}} a_{ij} (e_{i} - e_{j}) + (c b_{i} + \bar{φ} \hat{θ}_{i}(k)) e_{i}(k)$$

(33)

where $c > 0$ denotes the error learning gain.

The data-driven distributed adaptive cooperative control strategy for signal timing $\bar{u}_{i}(k)$ at a single intersection is designed as follows:

$$\bar{u}_{i}(k) = \frac{p_{i}}{p_{i} + μ} \left[ \bar{u}_{i}(k-1) - c \sum_{j \in N_{i}} a_{ij} (e_{i} - e_{j}) - (c b_{i} + \bar{φ} \hat{θ}_{i}(k)) e_{i}(k) \right]$$

(34)

where $μ > 0$ denotes the cooperative control gain and $p_{i} > 0$. The parameter learning law is designed as follows:

$$\hat{θ}_{i}(k) = \frac{1}{1 + κ F_{i}} \left[ \hat{θ}_{i}(k-1) + \bar{φ} p_{i} F_{i} e_{i} \left( \sum_{j \in N_{i}} a_{ij} (e_{i} - e_{j}) + b_{i} e_{i} \right) \right]$$

(35)

where $F_{i} = Π_{i} > 0$ and $κ > 0$ denotes the parameter learning gain.

The control and parameter learning gains satisfy the following conditions:

c > \frac{{\bar{φ}}^{2} {\bar{Θ}}^{2} \bar{σ} (Q)}{\underline{σ} (Q)} > 0

(36)

κ > \frac{1}{2} c \underline{σ} (Q) + \frac{1}{2 c} [{\bar{φ}}^{2} {\bar{Θ}}^{2} \underline{σ} (P) + \bar{φ} \bar{Θ} (\bar{σ} (P) + \bar{σ} (Q))] - \frac{1}{2} {\bar{φ}}^{2} {\bar{Θ}}^{2} \bar{σ} (Q) > 0

(37)

μ = \frac{1}{2} (c^{2} \underline{σ} (Q) + c (1 - {\bar{φ}}^{2} {\bar{Θ}}^{2}) \bar{σ} (Q) + {\bar{φ}}^{2} {\bar{Θ}}^{2} \underline{σ} (P)) > \frac{c}{2} \bar{σ} (Q) > 0

(38)

where $P = P^{T} \in R^{N \times N} > 0$ and $Q = Q^{T} \in R^{N \times N} > 0$ are the positive definite matrices defined in Lemma 2.

Then the distributed consensus coordinated error of queuing strength $e_{i}(k)$ of all directions is uniformly ultimately bounded (UUB) and converges to the desired queuing strength at a single intersection. The upper bound of the consensus coordinated error can be reduced by increasing the error learning gain $c$.
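One update of the control law (34) together with the learning law (35) can be sketched in vectorized form as below. The sketch reads the previous estimate in (35) as $\hat{θ}_{i}(k-1)$, uses the identity $\sum_{j} a_{ij} (e_{i} - e_{j}) = (L e)_{i}$, and leaves out the conversion of $\bar{u}_{i}(k)$ back to a green time and its saturation. The function name `ma_dd_dacc_step`, the ring topology and every gain value are hypothetical; the gains are not checked against conditions (36) to (38).

```python
import numpy as np


def ma_dd_dacc_step(e, u_prev, theta_prev, A, b, p, F, c, mu, kappa, phi_bar):
    """One cycle of Eqs. (34)-(35) for all directions at once (a sketch)."""
    L = np.diag(A.sum(axis=1)) - A
    neighbour_term = L @ e               # sum_j a_ij (e_i - e_j)
    coupling = neighbour_term + b * e    # bracket inside Eq. (35)
    # Eq. (35): parameter learning law with leakage gain kappa.
    theta = (theta_prev + phi_bar * p * F * e * coupling) / (1.0 + kappa * F)
    # Eq. (33): distributed adaptive cooperative control term.
    v = c * neighbour_term + (c * b + phi_bar * theta) * e
    # Eq. (34): recursive update of the normalized control input.
    u = p / (p + mu) * (u_prev - v)
    return u, theta


# Hypothetical data for a four-direction intersection.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
b = np.array([1.0, 0.0, 0.0, 0.0])
p = np.array([0.25, 0.2, 0.15, 0.2])
F = np.ones(4)
e = np.array([-0.3, 0.1, -0.1, 0.3])
u_prev = np.array([0.30, 0.25, 0.20, 0.35])
theta_prev = np.array([0.1, 0.1, 0.1, 0.1])
c, mu, kappa, phi_bar = 0.2, 0.5, 0.1, 30.0

u, theta = ma_dd_dacc_step(e, u_prev, theta_prev, A, b, p, F,
                           c, mu, kappa, phi_bar)
print(u, theta)
```

Only the errors $e_{i}(k)$, the previous input and the topology data $a_{ij}$, $b_{i}$ enter the update, which reflects the data-driven character of the scheme; when $e = 0$ the input simply contracts by the factor $p_{i} / (p_{i} + μ)$.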

Remark 4.

It is worth noting that the proposed queuing strength balance control method is almost model-free. The traffic model parameters in model (8), such as the saturation flow rate and the vehicle arrival rate, are used neither in the distributed adaptive cooperative control (34) nor in the parameter learning law (35). Only the queuing length $l(k)$ as output data and the network topology $L$ and $B$ of the multi-direction signal controllers at a single intersection are used to adjust the green time as input data. In particular, the network topology information is easily obtained from the adjacency relationships of the traffic infrastructure and is independent of the real traffic flow model. Therefore, the proposed MA-DD-DACC is data-driven.

4. Stability Analysis

Proof of Theorem 1.

Define a Lyapunov function candidate as follows:

$$V(k) = \frac{1}{2} e^{T}(k) P e(k) + \frac{1}{2} \mathrm{tr}\{\tilde{Θ}^{T}(k-1) F^{-1} \tilde{Θ}(k-1)\} + \frac{1}{2c} \tilde{\bar{u}}^{T}(k-1) P \tilde{\bar{u}}(k-1)$$

(39)

where $F = \mathrm{diag}(F_{i})$ and $F^{-1} \in R^{N \times N} > 0$ is a diagonal matrix.

The difference of (39) is:

Δ V (k + 1) = V (k + 1) - V (k) = Δ V_{1} (k + 1) + Δ V_{2} (k) + Δ V_{3} (k)

(40)

where:

Δ V_{1} (k + 1) = \frac{1}{2} e^{T} (k + 1) P e (k + 1) - \frac{1}{2} e^{T} (k) P e (k)

(41)

$$ΔV_{2}(k) = \frac{1}{2} \mathrm{tr}\{\tilde{Θ}^{T}(k) F^{-1} \tilde{Θ}(k)\} - \frac{1}{2} \mathrm{tr}\{\tilde{Θ}^{T}(k-1) F^{-1} \tilde{Θ}(k-1)\}$$

(42)

Δ V_{3} (k) = \frac{1}{2 c} {\tilde{\bar{u}}}^{T} (k) P \tilde{\bar{u}} (k) - \frac{1}{2 c} {\tilde{\bar{u}}}^{T} (k - 1) P \tilde{\bar{u}} (k - 1)

(43)

Equation (41) equals:

\begin{matrix} Δ V_{1} (k + 1) & = - e^{T} (k) {(L + B)}^{T} P (f (k) - f_{r} (k) + u_{r} (k) - \bar{u} (k)) \\ + \frac{1}{2} {(f (k) - f_{r} (k) + u_{r} (k) - \bar{u} (k))}^{T} Q (f (k) - f_{r} (k) + u_{r} (k) - \bar{u} (k)) \end{matrix}

(44)

where $Q = (L + B)^{T} P (L + B)$.

According to (19), one has:

- e^{T} (k) {(L + B)}^{T} P (f (k) - f_{r} (k)) \leq \bar{φ} e^{T} (k) {(L + B)}^{T} P Θ (k) e (k)

(45)

where $Θ(k) = \mathrm{diag}(| θ_{i}(k) |)$.

Based on Equations (32), (33) and (45), the first term on the right-hand side of (44) satisfies:

\begin{array}{l} - e^{T} (k) {(L + B)}^{T} P (f (k) - f_{r} (k) + u_{r} (k) - \bar{u} (k)) \\ \leq - c e^{T} (k) Q e (k) + \bar{φ} e^{T} (k) {(L + B)}^{T} P \tilde{Θ} (k) e (k) - e^{T} (k) {(L + B)}^{T} P (μ P^{- 1} \bar{u} (k) - \bar{u} (k - 1) + u_{r} (k)) \\ = \bar{φ} e^{T} (k) {(L + B)}^{T} P \hat{Θ} (k) e (k) + \bar{φ} e^{T} (k) {(L + B)}^{T} P \tilde{Θ} (k) e (k) + e^{T} (k) {(L + B)}^{T} P \tilde{\bar{u}} (k) \end{array}

(46)

where $\tilde{Θ}(k) = \mathrm{diag}(\tilde{θ}_{i}(k)) = \mathrm{diag}(| θ_{i}(k) | - \hat{θ}_{i}(k))$ and $\tilde{\bar{u}}(k) = \bar{u}(k) - u_{r}(k)$.

Substituting (46) into (44):

\begin{matrix} Δ V_{1} (k + 1) & \leq \bar{φ} e^{T} (k) {(L + B)}^{T} P \hat{Θ} (k) e (k) + \bar{φ} e^{T} (k) {(L + B)}^{T} P \tilde{Θ} (k) e (k) \\ + e^{T} (k) {(L + B)}^{T} P \tilde{\bar{u}} (k) + \frac{1}{2} {(f (k) - f_{r} (k) - \tilde{\bar{u}} (k))}^{T} Q (f (k) - f_{r} (k) - \tilde{\bar{u}} (k)) \end{matrix}

(47)

Equation (42) can be rewritten as:

$$\begin{aligned} ΔV_{2}(k) &= - \mathrm{tr}\{\tilde{Θ}^{T} F^{-1} [\hat{Θ}(k) - \hat{Θ}(k-1)]\} - \frac{1}{2} \mathrm{tr}\{[\hat{Θ}(k) - \hat{Θ}(k-1)]^{T} F^{-1} [\hat{Θ}(k) - \hat{Θ}(k-1)]\} \\ &\leq - \mathrm{tr}\{\tilde{Θ}^{T} F^{-1} [\hat{Θ}(k) - \hat{Θ}(k-1)]\} \end{aligned}$$

(48)

Given the parameter learning law as follows:

$$\hat{θ}_{i}(k) = \frac{1}{1 + κ F_{i}} \left[ \hat{θ}_{i}(k-1) + \bar{φ} p_{i} F_{i} e_{i} \left( \sum_{j \in N_{i}} a_{ij} (e_{i} - e_{j}) + b_{i} e_{i} \right) \right]$$

(49)

with global form:

\hat{Θ} (k) = {(I + κ F)}^{- 1} [\hat{Θ} (k - 1) + F \bar{φ} e^{T} (k) {(L + B)}^{T} P e (k)]

(50)

It can be easily derived that

\hat{Θ} (k) - \hat{Θ} (k - 1) = - κ F \hat{Θ} (k) + F \bar{φ} e^{T} (k) {(L + B)}^{T} P e (k)

(51)

where $κ > 0$ is the parameter learning gain.

Then we have:

\begin{matrix} Δ V_{1} (k + 1) + Δ V_{2} (k) & \leq \bar{φ} e^{T} (k) {(L + B)}^{T} P \hat{Θ} (k) e (k) + e^{T} (k) {(L + B)}^{T} P \tilde{\bar{u}} (k) \\ + t r {κ {\tilde{Θ}}^{T} (k) (Θ (k) - \tilde{Θ} (k))} + \frac{1}{2} {(f (k) - f_{r} (k) - \tilde{\bar{u}} (k))}^{T} Q (f (k) - f_{r} (k) - \tilde{\bar{u}} (k)) \end{matrix}

(52)

Equation (43) can be further described as:

\begin{matrix} Δ V_{3} (k) & = - \frac{1}{c} {(\bar{u} (k - 1) - \bar{u} (k))}^{T} P (\bar{u} (k) - u_{r}) - \frac{1}{2 c} {(\bar{u} (k) - \bar{u} (k - 1))}^{T} P (\bar{u} (k) - \bar{u} (k - 1)) \\ \leq - \frac{1}{c} {(\bar{u} (k - 1) - \bar{u} (k))}^{T} P \tilde{\bar{u}} (k) - \frac{1}{2 c} {[μ P^{- 1} \bar{u} (k) + (c (L + B) + \bar{φ} \hat{Θ} (k)) e (k)]}^{T} P \\ \times [μ P^{- 1} \bar{u} (k) + (c (L + B) + \bar{φ} \hat{Θ} (k)) e (k)] \\ \leq - \frac{1}{c} {(\bar{u} (k - 1) - \bar{u} (k))}^{T} P \tilde{\bar{u}} (k) - \frac{μ}{c} {\bar{u}}^{T} (k) (c (L + B) + \bar{φ} \hat{Θ} (k)) e (k) - \frac{1}{2 c} e^{T} (k) Q^{'} e (k) \end{matrix}

(53)

where

Q^{'} = {(c (L + B) + \bar{φ} \hat{Θ} (k))}^{T} P (c (L + B) + \bar{φ} \hat{Θ} (k))

.

Combining Equations (52) and (53):

\begin{matrix} Δ V (k + 1) & \leq \bar{φ} e^{T} (k) {(L + B)}^{T} P \hat{Θ} (k) e (k) + \mathrm{tr} \{ κ {\tilde{Θ}}^{T} (k) (Θ (k) - \tilde{Θ} (k)) \} \\ + \frac{1}{2} {(f (k) - f_{r} (k) - \tilde{\bar{u}} (k))}^{T} Q (f (k) - f_{r} (k) - \tilde{\bar{u}} (k)) + e^{T} (k) {(L + B)}^{T} P \tilde{\bar{u}} (k) \\ - \frac{1}{c} {(\bar{u} (k - 1) - \bar{u} (k))}^{T} P \tilde{\bar{u}} (k) - \frac{μ}{c} {\bar{u}}^{T} (k) (c (L + B) + \bar{φ} \hat{Θ} (k)) e (k) - \frac{1}{2 c} e^{T} (k) Q^{'} e (k) \end{matrix}

(54)

Rearranging terms in (54):

\begin{matrix} Δ V (k + 1) & \leq - \frac{1}{2} e^{T} (k) (\frac{1}{c} Q^{'} - 2 \bar{φ} {(L + B)}^{T} P \hat{Θ} (k)) e (k) + \mathrm{tr} \{ κ {\tilde{Θ}}^{T} (k) (Θ (k) - \tilde{Θ} (k)) \} \\ + \frac{1}{2} {(f (k) - f_{r} (k) - \tilde{\bar{u}} (k))}^{T} Q (f (k) - f_{r} (k) - \tilde{\bar{u}} (k)) \\ - \frac{1}{c} \bar{φ} e^{T} (k) \hat{Θ} (k) P \tilde{\bar{u}} (k) - \frac{μ}{c} {(\tilde{\bar{u}} (k) + u_{r} (k))}^{T} \tilde{\bar{u}} (k) \end{matrix}

(55)

According to (19) again, one has:

\frac{1}{2} {(f (k) - f_{r} (k))}^{T} Q (f (k) - f_{r} (k)) \leq \frac{1}{2} {\bar{φ}}^{2} e^{T} (k) Θ^{T} (k) Q Θ (k) e (k)

(56)

and:

{(f (k) - f_{r} (k))}^{T} Q (u_{r} (k) - \bar{u} (k)) \leq \bar{φ} \bar{σ} (Q) ‖ Θ (k) ‖ ‖ e (k) ‖ ‖ \tilde{\bar{u}} (k) ‖ \leq \bar{φ} \bar{σ} (Q) \bar{Θ} ‖ e (k) ‖ ‖ \tilde{\bar{u}} (k) ‖

(57)

where

\bar{Θ} \geq ‖ Θ (k) ‖

.

The third term on the right-hand side of (55) is bounded as:

\begin{array}{l} \frac{1}{2} {(f (k) - f_{r} (k) - \tilde{\bar{u}} (k))}^{T} Q (f (k) - f_{r} (k) - \tilde{\bar{u}} (k)) \\ \leq \frac{1}{2} {\bar{φ}}^{2} e^{T} (k) Θ^{T} (k) Q Θ (k) e (k) + \bar{φ} \bar{σ} (Q) \bar{Θ} ‖ e (k) ‖ ‖ \tilde{\bar{u}} (k) ‖ + \frac{1}{2} {\tilde{\bar{u}}}^{T} (k) Q \tilde{\bar{u}} (k) \end{array}

(58)

Then Equation (55) becomes:

\begin{matrix} Δ V (k + 1) & \leq - \frac{1}{2} e^{T} (k) (\frac{1}{c} Q^{'} - 2 \bar{φ} {(L + B)}^{T} P \hat{Θ} (k)) e (k) + \mathrm{tr} \{ κ {\tilde{Θ}}^{T} (k) (Θ (k) - \tilde{Θ} (k)) \} \\ - \frac{1}{c} \bar{φ} e^{T} (k) \hat{Θ} (k) P \tilde{\bar{u}} (k) - \frac{μ}{c} {(\tilde{\bar{u}} (k) + u_{r} (k))}^{T} \tilde{\bar{u}} (k) \\ + \frac{1}{2} {\bar{φ}}^{2} e^{T} (k) Θ^{T} (k) Q Θ (k) e (k) + \bar{φ} \bar{σ} (Q) \bar{Θ} ‖ e (k) ‖ ‖ \tilde{\bar{u}} (k) ‖ + \frac{1}{2} {\tilde{\bar{u}}}^{T} (k) Q \tilde{\bar{u}} (k) \end{matrix}

(59)

By using

‖ \tilde{Θ} (k) ‖ \leq {‖ \tilde{Θ} (k) ‖}_{F}

,

‖ Θ (k) ‖ \leq ‖ \hat{Θ} (k) ‖ \leq \bar{Θ}

, (59) becomes:

\begin{matrix} Δ V (k + 1) & \leq - \frac{1}{2} (c \underline{σ} (Q) + \frac{1}{c} {\bar{φ}}^{2} {\bar{Θ}}^{2} \underline{σ} (P) - {\bar{φ}}^{2} {\bar{Θ}}^{2} \bar{σ} (Q)) {‖ e (k) ‖}^{2} \\ + κ \bar{Θ} {‖ \tilde{Θ} (k) ‖}_{F} - κ {‖ \tilde{Θ} (k) ‖}_{F}^{2} + \frac{1}{c} \bar{φ} \bar{Θ} ‖ e (k) ‖ (\bar{σ} (P) + \bar{σ} (Q)) ‖ \tilde{\bar{u}} (k) ‖ \\ - (\frac{μ}{c} - \frac{1}{2} \bar{σ} (Q)) {‖ \tilde{\bar{u}} (k) ‖}^{2} + μ {\bar{u}}_{r} ‖ \tilde{\bar{u}} (k) ‖ \end{matrix}

(60)

which could be rewritten as:

\begin{array}{l} Δ V (k + 1) \leq - [\begin{matrix} ‖ e (k) ‖ & {‖ \tilde{Θ} (k) ‖}_{F} & ‖ \tilde{\bar{u}} (k) ‖ \end{matrix}] \\ [\begin{matrix} \frac{1}{2} (c \underline{σ} (Q) + \frac{1}{c} {\bar{φ}}^{2} {\bar{Θ}}^{2} \underline{σ} (P) - {\bar{φ}}^{2} {\bar{Θ}}^{2} \bar{σ} (Q)) & 0 & - \frac{1}{2 c} \bar{φ} \bar{Θ} (\bar{σ} (P) + \bar{σ} (Q)) \\ 0 & κ & 0 \\ - \frac{1}{2 c} \bar{φ} \bar{Θ} (\bar{σ} (P) + \bar{σ} (Q)) & 0 & (\frac{μ}{c} - \frac{1}{2} \bar{σ} (Q)) \end{matrix}] [\begin{matrix} ‖ e (k) ‖ \\ {‖ \tilde{Θ} (k) ‖}_{F} \\ ‖ \tilde{\bar{u}} (k) ‖ \end{matrix}] \\ + [\begin{matrix} 0 & κ \bar{Θ} & μ {\bar{u}}_{r} \end{matrix}] [\begin{matrix} ‖ e (k) ‖ \\ {‖ \tilde{Θ} (k) ‖}_{F} \\ ‖ \tilde{\bar{u}} (k) ‖ \end{matrix}] = - z^{T} R z + r^{T} z \end{array}

(61)

If

R

is a positive definite matrix, and:

‖ z ‖ > \frac{‖ r ‖}{\underline{σ} (R)}

(62)

then

Δ V (k + 1) \leq 0

.
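The decrease condition follows from −zᵀRz + rᵀz ≤ −σ(R)‖z‖² + ‖r‖‖z‖, which is negative once ‖z‖ exceeds ‖r‖/σ(R). A small numerical sketch of this step is given below. The entries of R and r are hypothetical values with the sparsity pattern of (61), and the smallest eigenvalue is computed from the closed form available for that pattern.

```python
import math
import random


def min_eig_R(a1, b1, c1, a3):
    """Smallest eigenvalue of R = [[a1, 0, a3], [0, b1, 0], [a3, 0, c1]]."""
    lam_small = 0.5 * (a1 + c1) - math.sqrt(0.25 * (a1 - c1) ** 2 + a3 ** 2)
    return min(b1, lam_small)


def delta_v_bound(z, R, r):
    """Right-hand side of (61): -z^T R z + r^T z."""
    quad = sum(z[i] * R[i][j] * z[j] for i in range(3) for j in range(3))
    lin = sum(r[i] * z[i] for i in range(3))
    return -quad + lin


# Hypothetical entries; R is positive definite for these values.
a1, b1, c1, a3 = 2.0, 1.5, 2.0, -0.5
R = [[a1, 0.0, a3], [0.0, b1, 0.0], [a3, 0.0, c1]]
r = [0.0, 0.6, 0.8]

threshold = math.sqrt(sum(x * x for x in r)) / min_eig_R(a1, b1, c1, a3)

random.seed(0)
worst = -math.inf
for _ in range(1000):
    # Entries of z are norms, so sample a nonnegative direction.
    direction = [random.random() + 1e-9 for _ in range(3)]
    norm = math.sqrt(sum(d * d for d in direction))
    # Scale so that ||z|| lies strictly outside the threshold of (62).
    scale = threshold * (1.001 + random.random()) / norm
    z = [scale * d for d in direction]
    worst = max(worst, delta_v_bound(z, R, r))

print(threshold, worst)
```

Every sampled z outside the ball of radius ‖r‖/σ(R) gives a nonpositive bound, which is the behaviour that the UUB argument relies on.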

Based on (39), we have:

\frac{1}{2} \underline{σ} (P) {‖ e ‖}^{2} + \frac{1}{2 Π_{\max}} {‖ \tilde{Θ} (k) ‖}_{F}^{2} + \frac{1}{2} \underline{σ} (P) {‖ \tilde{u} ‖}^{2} \leq V \leq \frac{1}{2} \bar{σ} (P) {‖ e ‖}^{2} + \frac{1}{2 Π_{\min}} {‖ \tilde{Θ} (k) ‖}_{F}^{2} + \frac{1}{2} \bar{σ} (P) {‖ \tilde{u} ‖}^{2}

(63)

then:

\begin{array}{l} \frac{1}{2} [\begin{matrix} ‖ e (k) ‖ & {‖ \tilde{Θ} (k) ‖}_{F} & ‖ \tilde{u} (k) ‖ \end{matrix}] [\begin{matrix} \underline{σ} (P) & 0 & 0 \\ * & \frac{1}{Π_{\max}} & 0 \\ * & * & \underline{σ} (P) \end{matrix}] [\begin{matrix} ‖ e ‖ \\ {‖ \tilde{Θ} (k) ‖}_{F} \\ ‖ \tilde{u} ‖ \end{matrix}] \leq V \\ \leq \frac{1}{2} [\begin{matrix} ‖ e (k) ‖ & {‖ \tilde{Θ} (k) ‖}_{F} & ‖ \tilde{u} (k) ‖ \end{matrix}] [\begin{matrix} \bar{σ} (P) & 0 & 0 \\ * & \frac{1}{Π_{\min}} & 0 \\ * & * & \bar{σ} (P) \end{matrix}] [\begin{matrix} ‖ e (k) ‖ \\ {‖ \tilde{Θ} (k) ‖}_{F} \\ ‖ \tilde{u} (k) ‖ \end{matrix}] \end{array}

(64)

Since:

\frac{1}{2} z^{T} \underline{S} z \leq V \leq \frac{1}{2} z^{T} \bar{S} z

(65)

where

\underline{S} = [\begin{matrix} \underline{σ} (P) & 0 & 0 \\ * & \frac{1}{Π_{\max}} & 0 \\ * & * & \underline{σ} (P) \end{matrix}]

,

\bar{S} = [\begin{matrix} \bar{σ} (P) & 0 & 0 \\ * & \frac{1}{Π_{\min}} & 0 \\ * & * & \bar{σ} (P) \end{matrix}]

, it follows that:

\frac{1}{2} \underline{σ} (\underline{S}) {‖ z ‖}^{2} \leq V \leq \frac{1}{2} \bar{σ} (\bar{S}) {‖ z ‖}^{2}

(66)

It can be obtained that:

V > \frac{1}{2} \frac{\bar{σ} (\bar{S}) {‖ r ‖}^{2}}{{\underline{σ}}^{2} (R)}

(67)

which, by the upper bound in (66), leads to Equation (62).

In Equation (61), we define the matrix

R

as follows:

R = [\begin{matrix} a_{1} & 0 & a_{3} \\ 0 & b_{1} & 0 \\ a_{3} & 0 & c_{1} \end{matrix}]

(68)

where

a_{1} = \frac{1}{2} (c \underline{σ} (Q) + \frac{1}{c} {\bar{φ}}^{2} {\bar{Θ}}^{2} \underline{σ} (P) - {\bar{φ}}^{2} {\bar{Θ}}^{2} \bar{σ} (Q))

,

a_{3} = - \frac{1}{2 c} \bar{φ} \bar{Θ} (\bar{σ} (P) + \bar{σ} (Q))

,

b_{1} = κ

,

c_{1} = (\frac{μ}{c} - \frac{1}{2} \bar{σ} (Q))

.

First, select the gains to satisfy:

c \underline{σ} (Q) + \frac{1}{c} {\bar{φ}}^{2} {\bar{Θ}}^{2} \underline{σ} (P) - {\bar{φ}}^{2} {\bar{Θ}}^{2} \bar{σ} (Q) > 0

(69)

κ > 0

(70)

μ > \frac{c}{2} \bar{σ} (Q)

(71)

such that

R > 0

.

Then the characteristic polynomial of matrix

R

is given as:

(λ - a_{1}) (λ - b_{1}) (λ - c_{1}) - a_{3}^{2} (λ - b_{1}) = 0

(72)

Solving (72), we obtain the eigenvalues:

λ_{1} = b_{1} = κ

(73)

λ_{2} = \frac{(a_{1} + c_{1}) + \sqrt{{(a_{1} - c_{1})}^{2} + 4 a_{3}^{2}}}{2}

(74)

λ_{3} = \frac{(a_{1} + c_{1}) - \sqrt{{(a_{1} - c_{1})}^{2} + 4 a_{3}^{2}}}{2}

(75)
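The closed-form eigenvalues (73)–(75) can be verified against the characteristic polynomial (72). The sketch below uses hypothetical values for a₁, b₁, c₁, and a₃ and checks that each eigenvalue is a root of (72) and that the eigenvalues sum to the trace of R.

```python
import math


def eigenvalues_R(a1, b1, c1, a3):
    """Eigenvalues (73)-(75) of R = [[a1, 0, a3], [0, b1, 0], [a3, 0, c1]]."""
    root = math.sqrt((a1 - c1) ** 2 + 4.0 * a3 ** 2)
    lam1 = b1
    lam2 = 0.5 * ((a1 + c1) + root)
    lam3 = 0.5 * ((a1 + c1) - root)
    return lam1, lam2, lam3


def char_poly(lam, a1, b1, c1, a3):
    """Left-hand side of the characteristic equation (72)."""
    return (lam - a1) * (lam - b1) * (lam - c1) - a3 ** 2 * (lam - b1)


# Hypothetical entries, for illustration only.
a1, b1, c1, a3 = 3.0, 1.2, 2.0, -0.7
lams = eigenvalues_R(a1, b1, c1, a3)

poly_values = [char_poly(lam, a1, b1, c1, a3) for lam in lams]
trace_gap = sum(lams) - (a1 + b1 + c1)
print(lams)
print(poly_values, trace_gap)
```

Because R decouples into the scalar b₁ and a 2×2 block in (a₁, a₃, c₁), only the sign of the square-root term distinguishes λ₂ from λ₃.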

Let

μ = \frac{1}{2} (c^{2} \underline{σ} (Q) + c (1 - {\bar{φ}}^{2} {\bar{Θ}}^{2}) \bar{σ} (Q) + {\bar{φ}}^{2} {\bar{Θ}}^{2} \underline{σ} (P))

, then:

a_{1} - c_{1} = \frac{1}{2} (c \underline{σ} (Q) + \frac{1}{c} {\bar{φ}}^{2} {\bar{Θ}}^{2} \underline{σ} (P) - {\bar{φ}}^{2} {\bar{Θ}}^{2} \bar{σ} (Q)) - (\frac{μ}{c} - \frac{1}{2} \bar{σ} (Q)) = 0

(76)

Now the third eigenvalue is:

\begin{matrix} λ_{3} & = \frac{a_{1} + c_{1} - 2 a_{3}}{2} = a_{1} - a_{3} \\ = \frac{1}{2} c \underline{σ} (Q) + \frac{1}{2 c} [{\bar{φ}}^{2} {\bar{Θ}}^{2} \underline{σ} (P) + \bar{φ} \bar{Θ} (\bar{σ} (P) + \bar{σ} (Q))] - \frac{1}{2} {\bar{φ}}^{2} {\bar{Θ}}^{2} \bar{σ} (Q) \end{matrix}

(77)

Select an appropriate

κ

, such that:

κ > \frac{1}{2} c \underline{σ} (Q) + \frac{1}{2 c} [{\bar{φ}}^{2} {\bar{Θ}}^{2} \underline{σ} (P) + \bar{φ} \bar{Θ} (\bar{σ} (P) + \bar{σ} (Q))] - \frac{1}{2} {\bar{φ}}^{2} {\bar{Θ}}^{2} \bar{σ} (Q) > 0

(78)

We have:

\underline{σ} (R) = λ_{3} = \frac{1}{2} c \underline{σ} (Q) + \frac{1}{2 c} [{\bar{φ}}^{2} {\bar{Θ}}^{2} \underline{σ} (P) + \bar{φ} \bar{Θ} (\bar{σ} (P) + \bar{σ} (Q))] - \frac{1}{2} {\bar{φ}}^{2} {\bar{Θ}}^{2} \bar{σ} (Q)

(79)

and the error and parameter learning gains satisfy:

c > \frac{{\bar{φ}}^{2} {\bar{Θ}}^{2} \bar{σ} (Q)}{\underline{σ} (Q)} > 0

(80)

κ > \frac{1}{2} c \underline{σ} (Q) + \frac{1}{2 c} [{\bar{φ}}^{2} {\bar{Θ}}^{2} \underline{σ} (P) + \bar{φ} \bar{Θ} (\bar{σ} (P) + \bar{σ} (Q))] - \frac{1}{2} {\bar{φ}}^{2} {\bar{Θ}}^{2} \bar{σ} (Q) > 0

(81)

μ = \frac{1}{2} (c^{2} \underline{σ} (Q) + c (1 - {\bar{φ}}^{2} {\bar{Θ}}^{2}) \bar{σ} (Q) + {\bar{φ}}^{2} {\bar{Θ}}^{2} \underline{σ} (P)) > \frac{c}{2} \bar{σ} (Q)

(82)

As a result, z(k) is uniformly ultimately bounded (UUB) [30]. □
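The gain conditions (80)–(82) can be evaluated directly once the bounds are known. The sketch below is one possible way to do so, not the authors' tuning procedure: the function name, the safety margins, and all numerical bounds are assumptions, and φ̄Θ̄ is assumed to be strictly positive so that the lower bound on c is nonzero.

```python
def select_gains(phi_bar, theta_bar, sig_min_P, sig_max_P, sig_min_Q, sig_max_Q,
                 c_margin=1.5, kappa_margin=1.1):
    """Pick c, kappa, mu from (80)-(82); the margins are hypothetical."""
    s = (phi_bar * theta_bar) ** 2
    # (80): c must exceed phi_bar^2 theta_bar^2 sigma_max(Q) / sigma_min(Q)
    c = c_margin * s * sig_max_Q / sig_min_Q
    # (82): mu is fixed by c so that a1 equals c1, see (76)
    mu = 0.5 * (c ** 2 * sig_min_Q + c * (1.0 - s) * sig_max_Q + s * sig_min_P)
    # (81): lower bound on kappa, written exactly as stated
    kappa_lower = (0.5 * c * sig_min_Q
                   + (s * sig_min_P + phi_bar * theta_bar * (sig_max_P + sig_max_Q)) / (2.0 * c)
                   - 0.5 * s * sig_max_Q)
    kappa = kappa_margin * kappa_lower
    return c, kappa, mu, kappa_lower


def diagonal_entries(c, mu, phi_bar, theta_bar, sig_min_P, sig_min_Q, sig_max_Q):
    """Entries a1 and c1 of R as defined after (68)."""
    s = (phi_bar * theta_bar) ** 2
    a1 = 0.5 * (c * sig_min_Q + s * sig_min_P / c - s * sig_max_Q)
    c1 = mu / c - 0.5 * sig_max_Q
    return a1, c1


# Hypothetical bounds, for illustration only.
phi_bar, theta_bar = 0.5, 1.0
sig_min_P, sig_max_P = 1.0, 2.0
sig_min_Q, sig_max_Q = 0.8, 1.6

c, kappa, mu, kappa_lower = select_gains(
    phi_bar, theta_bar, sig_min_P, sig_max_P, sig_min_Q, sig_max_Q)
a1, c1 = diagonal_entries(c, mu, phi_bar, theta_bar, sig_min_P, sig_min_Q, sig_max_Q)

print(c, kappa, mu)
print(a1 > 0.0, mu > 0.5 * c * sig_max_Q, abs(a1 - c1))
```

The sketch evaluates only the stated inequalities. Positive definiteness of R for a given set of bounds should be confirmed separately, for example by computing its eigenvalues from (73)–(75).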

5. Simulation

In this paper, a VISSIM-VB-MATLAB joint simulation platform is used to build a more realistic experiment. VISSIM is a professional traffic simulation package that simulates all modes of traffic and analyses their interactions. It consists of a traffic simulator and a signal state generator, which exchange detector data and signal state information through an interface. It can generate visualized traffic conditions online and also output various statistical data offline, such as travel time and queue length. The simulation data obtained by VISSIM are transferred to MATLAB through VB. In MATLAB, the MA-DD-DACC signal timing algorithm realizes the cooperative control of queuing strength in four directions at a single intersection. The single intersection is built with the traffic evaluation module provided by VISSIM, which outputs the traffic evaluation data, as shown in Figure 3.

The simulation is conducted using the multi-directional queuing strength model with changeable cycle at an intersection, Equation (15). The initial parameters are listed in Table 1.
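For readers who wish to reproduce the setup, the Table 1 settings can be collected in a small data structure. The following sketch is illustrative only: the class name, field names, and helper methods are hypothetical, and the flow ratio q/s is a standard traffic-engineering quantity that is not computed in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirectionParams:
    arrival_flow: float     # q, pcu/h
    saturation_flow: float  # s, pcu/h
    initial_queue: float    # Lm(0), pcu
    max_queue: float        # lmax, pcu
    initial_green: float    # g, s
    lost_time: float        # tl, s
    g_min: float            # minimum green time, s
    g_max: float            # maximum green time, s

    def flow_ratio(self):
        """Arrival flow divided by saturation flow (dimensionless)."""
        return self.arrival_flow / self.saturation_flow

    def clamp_green(self, g):
        """Project a candidate green time onto [g_min, g_max]."""
        return min(self.g_max, max(self.g_min, g))


INITIAL_CYCLE_S = 128.0  # initial cycle from Table 1, s


def make_direction(initial_queue, max_queue):
    """Values shared by all four directions in Table 1."""
    return DirectionParams(
        arrival_flow=250.0, saturation_flow=1200.0,
        initial_queue=initial_queue, max_queue=max_queue,
        initial_green=10.0, lost_time=2.0, g_min=10.0, g_max=70.0,
    )


TABLE_1 = {
    "east": make_direction(10.0, 100.0),
    "north": make_direction(20.0, 200.0),
    "west": make_direction(30.0, 300.0),
    "south": make_direction(40.0, 400.0),
}

mean_initial_queue = sum(d.initial_queue for d in TABLE_1.values()) / len(TABLE_1)
print(mean_initial_queue)
print({name: round(d.flow_ratio(), 4) for name, d in TABLE_1.items()})
print(TABLE_1["south"].clamp_green(85.0))
```

The clamp mirrors the green time constraints listed in Table 1. How the controller enforces them inside the MA-DD-DACC law is not shown here.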

Two experiments are carried out. The first uses the UNCFSCS [26], which is designed to release vehicles in a fixed sequence at unsaturated intersections and to keep all phases completely empty. The second adopts the MA-DD-DACC signal timing strategy presented in this paper. The total queuing delays in all directions are recorded at the end of each cycle during the simulation run, as shown in Figure 4. Under UNCFSCS, the queuing delays in all directions eventually decrease to zero and all phases are emptied, provided that the intersection is undersaturated. The proposed cooperative control scheme, by contrast, adjusts the cycle and green ratio adaptively for undersaturated, saturated, and supersaturated traffic conditions. It should be noted that the total delay under UNCFSCS is almost six times that under MA-DD-DACC.

From Figure 5, the queuing strength of the south direction under UNCFSCS exceeds 80, whereas the queuing strength of every direction under MA-DD-DACC stays below 50. By calculation, the mean queuing strength over the four directions under MA-DD-DACC drops from 25 to 10 within 2190 s, while under UNCFSCS it drops from 35 to 10 within 2731 s. The queuing strength balance is thus achieved, which implies that the proposed scheme has the merits of a high convergence rate and low energy consumption.

The VISSIM delay evaluation index is obtained from the two simulations. The green time of each direction is effectively adjusted by MA-DD-DACC such that the average vehicle delay is balanced. The average delay under the proposed control is much lower than that under the UNCFSCS control, as shown in Figure 6. Furthermore, the fuel consumption under MA-DD-DACC is less than that under UNCFSCS, as displayed in Figure 7. The experimental results verify the validity of the proposed MA-DD-DACC.

6. Conclusions

This work proposed a novel MA-DD-DACC method for the urban traffic signal timing problem that accounts for phase conflict, green time constraints, changeable cycle, and the network topology of multi-direction signal controllers. The method achieved fairly good performance for queuing strength balance under different kinds of traffic conditions, as shown on a VISSIM-VB-MATLAB joint simulation platform. Although much future research remains to be done, it can be expected that data-driven cooperative control methodology in a distributed manner for urban traffic multi-agent systems will provide a powerful tool for realizing D²ITS with the goal of reducing traffic congestion and energy consumption.

Author Contributions

Conceptualization, H.J. and H.Z.; methodology, H.J. and Z.H.; software, H.Z. and H.J.; validation, H.Z., L.F.; formal analysis, L.F. and X.L.; investigation, H.Z. and X.L.; resources, Z.H.; writing—original draft preparation, H.J. and L.F.; writing—review and editing, H.J. and L.F.; visualization, X.L. and L.F.; supervision, Z.H. and X.L.; project administration, H.J.; funding acquisition, H.Z. and Z.H.

Funding

This work was funded by the North China University of Technology Scientific Research Foundation, the National Natural Science Foundation of China (No. 61803036), the Scientific Research Common Program of Beijing Municipal Commission of Education (Grant No: KM201911232015), the Supplementary and Supportive Project for Teachers at Beijing Information Science and Technology University (2018–2020) under Grant 5029011103(5111911129), and the Service Ability Construction of Science and Technology Innovation - Construction of High-end Disciplines (PXM2019_014212_000020).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Zhang, J.; Wang, F.Y.; Wang, K.; Lin, W.H.; Xu, X.; Chen, C. Data-Driven Intelligent Transportation Systems: A Survey. IEEE Trans. Intell. Transp. Syst. 2011, 12, 1624–1639.
2. Hou, Z.S.; Jin, S.T. Model Free Adaptive Control: Theory and Applications; CRC Press: Boca Raton, FL, USA, 2013.
3. Xu, D.; Jiang, B.; Shi, P. A novel model-free adaptive control design for multivariable industrial processes. IEEE Trans. Ind. Electron. 2014, 61, 6391–6398.
4. Hou, Z.S.; Huang, W. The model-free learning adaptive control of a class of SISO nonlinear systems. In Proceedings of the 1997 American Control Conference, Albuquerque, NM, USA, 6 June 1997; Volume 1, pp. 343–344.
5. Xu, J.X.; Tan, Y.; Lee, T.H. Iterative learning control design based on composite energy function with input saturation. Automatica 2004, 40, 1371–1377.
6. Hou, Z.S.; Liu, S.D.; Tian, T.T. Lazy-learning-based data-driven model-free adaptive predictive control for a class of discrete-time nonlinear systems. IEEE Trans. Neural Netw. Learn. Syst. 2017, 28, 1914–1928.
7. Sala, A. Integrating virtual reference feedback tuning into a unified closed-loop identification framework. Automatica 2007, 43, 178–183.
8. Hjalmarsson, H. From experiment design to closed-loop control. Automatica 2005, 41, 393–438.
9. Hou, Z.; Chi, R.; Gao, H. An overview of dynamic-linearization-based data-driven control and applications. IEEE Trans. Ind. Electron. 2017, 64, 4076–4090.
10. Markovsky, I. A missing data approach to data-driven filtering and control. IEEE Trans. Autom. Control 2017, 62, 1972–1978.
11. Webster, F.V. Traffic Signal Settings. Road Research Paper No. 39; Her Majesty’s Stationery Office: London, UK, 1958.
12. Robertson, D.I.; Bretherton, R.D. Optimizing networks of traffic signals in real time—The scoot method. IEEE Trans. Veh. Technol. 1991, 40, 11–15.
13. Sims, A.G.; Dobinson, K.W. The Sydney Coordinated Adaptive Traffic (SCAT) system philosophy and benefits. IEEE Trans. Veh. Technol. 1980, 29, 130–137.
14. Keong, C.K. The GLIDE system—Singapore’s urban traffic control system. Transp. Rev. 1993, 13, 295–305.
15. Wang, F.; Zhang, H.; Liu, D. Adaptive dynamic programming: An introduction. IEEE Comput. Intell. Mag. 2009, 4, 39–47.
16. Ling, K.; Shalaby, A.S. A reinforcement learning approach to streetcar bunching control. J. Intell. Transp. Syst. Technol. Plan. Oper. 2005, 9, 59–68.
17. Abdulhai, B.; Pringle, R.; Karakoulas, G.J. Reinforcement learning for the true adaptive traffic signal control. J. Transp. Eng. 2003, 129, 278–285.
18. Salkham, A.; Cunningham, R.; Garg, A.; Cahill, V. A collaborative reinforcement learning approach to urban traffic control optimization. In Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, Sydney, Australia, 9–12 December 2009; Volume 2, pp. 560–566.
19. Hou, Z.S.; Xu, J.X.; Yan, J.W. An iterative learning approach for density control of freeway traffic flow via ramp metering. Transp. Res. C Emerg. Technol. 2008, 16, 71–97.
20. Hou, Z.S.; Xu, J.X.; Zhong, H.W. Freeway traffic control using iterative learning control based ramp metering and speed signaling. IEEE Trans. Veh. Technol. 2007, 56, 466–477.
21. Cheng, Z.; Hou, Z.; Jin, S. MFAC-based balance control for freeway and auxiliary road system with multi-intersections. In Proceedings of the 2015 10th Asian Control Conference (ASCC), Kota Kinabalu, Malaysia, 31 May–3 June 2015; pp. 1–6.
22. Balaji, P.G.; Srinivasan, D. Multi-agent system in urban traffic signal control. IEEE Comput. Intell. Mag. 2010, 5, 43–51.
23. Choy, M.C.; Srinivasan, D.; Cheu, R.L. Neural networks for continuous online learning and control. IEEE Trans. Neural Netw. 2006, 17, 1511–1531.
24. Gokulan, B.P.; Srinivasan, D. Distributed multi-agent type-2 fuzzy architecture for urban traffic signal control. In Proceedings of the IEEE International Conference on Fuzzy Systems, Jeju Island, South Korea, 20–24 August 2009; pp. 1624–1632.
25. Coupland, S.; John, R. New geometric inference techniques for type-2 fuzzy sets. Int. J. Approx. Reason. 2008, 49, 198–211.
26. He, Z.H.; Chen, Y.Z.; Shi, J.J.; Han, X.G.; Wu, X. Steady-state control for signalized intersections modeled as switched server system. In Proceedings of the 2013 American Control Conference, Washington, DC, USA, 17–19 June 2013; pp. 842–847.
27. Choy, M.C.; Srinivasan, D.; Cheu, R.L. Cooperative, hybrid agent architecture for real-time traffic signal control. IEEE Trans. Syst. Man Cybern. A 2003, 33, 597–607.
28. Lin, W.; Qian, C.J. Adaptive control of nonlinearly parameterized systems: A nonsmooth feedback framework. IEEE Trans. Autom. Control 2002, 47, 757–774.
29. Sun, M.X.; Ge, S.S. Adaptive repetitive control for a class of nonlinearly parameterized systems. IEEE Trans. Autom. Control 2006, 51, 1684–1688.
30. Ji, H.H.; Lewis, F.L.; Hou, Z.S.; Mikulski, D. Distributed information-weighted Kalman consensus filter for sensor networks. Automatica 2017, 77, 18–30.

Figure 1. Traffic flow in a store-and-forward model.

Figure 2. Multi-direction queuing strength balance with changeable cycle at a single intersection.

Figure 3. Single intersection traffic modeling in VISSIM.

Figure 4. (a) Queuing delays in UNCFSCS; (b) Queuing delays in MA-DD-DACC.

Figure 5. (a) Queuing Strength in UNCFSCS; (b) Queuing Strength in MA-DD-DACC.

Figure 6. Average delay comparison between MA-DD-DACC and the UNCFSCS control.

Figure 7. Fuel consumption comparison between MA-DD-DACC and the UNCFSCS control.

Table 1. Simulation parameters for queuing strength cooperative control at a single intersection.

Parameter	East	North	West	South
Arrival flow, q (pcu·h⁻¹)	250	250	250	250
Initial cycle, Co (s)	128
Initial queuing strength, Lm(0) (pcu)	10	20	30	40
Saturation flow rate, s (pcu·h⁻¹)	1200	1200	1200	1200
Initial green time, g (s)	10	10	10	10
Lost time, tl (s)	2	2	2	2
Maximum queue value, lmax (pcu)	100	200	300	400
Maximum green time, gmax (s)	70	70	70	70
Minimum green time, gmin (s)	10	10	10	10
Coefficient (α)	0.99	0.99	0.99	0.99

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
