1. Introduction
The continuous growth of cities, especially in developing countries, has considerably increased the number of vehicles circulating on roads that were not originally designed to accommodate such heavy traffic loads, particularly during rush hours. According to the INRIX Global Traffic Scorecard [
1] and the TomTom Traffic Index [
2], people spent more than 157 h in rush-hour traffic in cities like Bogotá and Lima in 2023. This problem has historically been addressed through limited measures, such as adding lanes, restricting circulation according to the last digit of the license plate, and banning heavy vehicles on weekends in some city zones or even country regions, which only partially tackle the problem. In this sense, it is crucial to optimize the use of the available traffic control infrastructure, such as traffic lights, cameras, and general signaling [
3], so that green times at intersections are decided considering street lengths, input and output flows, and the vehicle storage capacity of streets, among other traffic parameters. Researchers often focus on traffic signal timing (TST) and traffic signal control (TSC) optimization to develop mechanisms that assure the right of way (ROW) for multiple road users under two types of strategy: one based on 
fixed time, whose scheme defines the traffic light plan according to historical data, and another that follows an 
adaptive schedule, which considers traffic variability to assign dynamic times to traffic lights [
4].
Fixed-time strategies [
5], in spite of their limited applicability in environments with high traffic variability, continue to be widely employed in situations where technological infrastructure upgrades are not feasible. However, modern techniques have emerged to address this challenge, including nonlinear combinatorial programming models for pedestrian and vehicular signal plans [
6], and the implementation of platoon size control based on lane dimensions for connected and automated vehicles (CAVs) [
7]. Recently, fixed-time strategies have primarily served as benchmarks for comparing performance with novel adaptive approaches. These approaches leverage machine learning algorithms, IoT-based traffic control focused on CAVs [
8], semi-actuated networks [
8], and multi-agent methods [
9,
10], as well as many game theory-based controllers that consider not only vehicles but also pedestrian dynamics [
11].
In terms of adaptive strategies, multiple solutions have been proposed to integrate multi-agent and game theory concepts [
12,
13,
14,
15,
16,
17]. These strategies aim to improve real-time control over green light cycles and tackle vehicular congestion challenges in contexts involving conventional car drivers, pedestrians, and autonomous vehicles (AVs). In this regard, in [
18], the authors propose a game theory-based multi-agent algorithm called GAMEPLAN that determines the optimal action for each agent based on its driving style (passive or aggressive), considering AVs and drivers as agents with hidden intentions. In [
19], a mixed traffic environment including CAVs and human-driven vehicles (HDVs) is studied using a model involving adaptive dynamic programming (ADP) techniques, alongside quadratic zero-sum games to minimize external disturbances that cause stop-and-go waves in vehicular platoons. Another mixed study, this time involving pedestrians and vehicles, is proposed in [
16]. In this analysis, the authors employ a Nash bargaining solution to determine the optimal green time that reduces pedestrian and vehicle delays. Bargaining games, supported by predictive control techniques, have also been employed in [
20] to establish an operating agreement among vehicles within a platoon, demonstrating acceptable performance in vehicular network coordination. In general, game theory and multi-agent systems have provided approaches for vehicular time delay reduction [
17,
21,
22,
23], space allocation control in vehicular platoons [
24], AV driving strategies based on velocity control considering road networks’ structure [
25], and traffic signal time modeling in intersections, where cooperating players or agents represent phase sequences as in [
26], or where agents that model intersections work in non-cooperative schemes based on reinforcement learning, or cooperatively using a Pareto-optimal strategy [
27].
From the perspective of traffic system modeling, simulation tools offer relevant capabilities for analyzing a wide range of traffic conditions. These tools allow users to set control parameters, such as cycle time and phase sequence, enabling detailed examination of various scenarios. Traditionally, simulation tools have been classified into three types according to the level of detail with which they represent traffic actors: microscopic, macroscopic, and mesoscopic. Microscopic models delve into individual interactions between drivers and pedestrians. Among the most popular platforms of this type are AIMSUN, CORSIM, MATSim, Paramics, SUMO, and VISSIM, with the latter being the most widely accepted in the research community [
4]. Macroscopic models, exemplified by platforms such as VISUM, TRANSYT-7F, and FREFLO, among others, provide a general view of the entire traffic network. Mesoscopic models, on the other hand, combine the two approaches. Most of these modeling approaches focus on traffic parameters for vehicular and pedestrian congestion control without incorporating additional techniques, such as game theory, which is of primary interest in this work. Additionally, none of the works reviewed above that focus on game theory and multi-agent algorithms has modeled the traffic system as a hybrid system that represents vehicle storage as a continuous system and the transitions between network modes caused by traffic light changes as a discrete system. Furthermore, these studies have primarily analyzed delay times, without considering other important factors of the traffic network, such as the number of stops, total travel time, average speed, and total distance traveled, which are crucial for determining the relevance of an algorithm.
In this sense, the main contributions of this work, which, to the best of our knowledge, have not been considered in previous research, are described below.
- First, we implement a traffic simulator that integrates the microscopic simulation tool VISSIM with MATLAB to evaluate the dynamics of various evolutionary game-theoretic techniques within a traffic network. This network employs a hybrid signaling system, combining continuous dynamics—modeling vehicular storage changes—with discrete dynamics—representing transitions between network modes driven by traffic light changes. This hybrid approach, combined with control strategies based on evolutionary game theory that consider queue lengths and traffic light times as utility functions both in fixed-time and adaptive traffic control schemes, allows for a more realistic representation of traffic behavior, which, to the best of our knowledge, has not been addressed in the literature. 
- Second, beyond traditional metrics like delay, we comprehensively analyze a range of performance parameters that are often overlooked in prior studies involving game theory and multi-agent algorithms, such as the number of stops, the total travel time, average speed, total distance traveled, and average stop time per vehicle. By incorporating these metrics, we provide a more holistic evaluation of traffic control strategies, capturing their impact on efficiency. 
- Third, we implement and compare multiple traffic control strategies based on population dynamics and payoff functions that consider queue lengths and green times of network links. These strategies include the following:
  - Fictitious play, implemented from a fixed-time perspective due to its offline nature, and
  - Adaptive strategies such as Smith, Replicator, Logit, and Brown–Von Neumann–Nash (BNN), which leverage their ability to be continuously updated in real time.
This approach allows us to contrast the performance of offline and adaptive control methods, highlighting their respective strengths and limitations in managing traffic flow under varying conditions.
The remainder of this paper is structured as follows. First, in 
Section 3, we provide a mathematical description of a vehicular traffic network using graph theory, modeling intersections as nodes and streets as links. In 
Section 3.1, we describe the mathematical details of the continuous traffic model implemented in the proposed traffic simulator. In 
Section 4, we show the different population dynamics implemented within the traffic simulator, along with the fitness function definitions of each case, which were obtained considering network parameters such as the queue length and number of vehicles in links. In 
Section 6, we test the game theory-based model and the traffic software simulator in a nine-intersection network in order to evaluate traffic parameters such as number of stops (NS), total travel time (TTT), total distance traveled (TDT), average speed (AS), total stop time (TST) and average stop time per vehicle (ASTV). Finally, we evaluate the performance of each game theory-based control strategy in 
Section 7.
  2. Notations
In this section, we describe the main notations used in the mathematical model description of 
Section 3.
- : Directed graph representing a vehicular traffic network. 
- : Set of network nodes. They represent the traffic network intersections. 
- : Set of network links. They represent the links between traffic network intersections (streets). 
- : Input and output nodes to link , respectively. 
- : Incidence function that matches each edge  to an ordered pair . 
- : Sets of input and output links from node p, respectively. 
- : Idle time for node p. Includes red–yellow transition and yellow light time. 
- : Green time for link . 
- : Cycle time for node p. Includes the idle and green times. 
- : Output flow capacity of link , given in . 
- : Storing capacity of vehicles in link . 
- : Flow rate of vehicles from link i to j, . 
- : Traffic light state of link , . 
- : Network mode given by the traffic light states. 
- : Traffic light state in a link i and mode , . 
- : Queue length. Number of vehicles in a given link. 
- : Queue input flow, given in . 
- : Queue output flow, given in . 
- : Set of stages for node p. One stage is the time at which an input link to a node p has the right of way (ROW). 
  3. Mathematical Description of an Urban Traffic Network
According to [
28], a vehicular traffic network can be represented as a directed graph 
, where 
 and 
 are the sets of 
m nodes and 
n edges of the network, respectively. 
 is the incidence function that matches each edge 
 to an ordered pair 
,
Each network node 
p has a set 
 of input edges given by
      with 
 as the input link to 
p from node 
l.
The set 
 of output edges for node 
p is given by
      with 
 as the output link from node 
p to node 
r.
To illustrate this, consider the vehicular traffic network shown in 
Figure 1a, whose graph representation is shown in 
Figure 1b, and traffic light behavior described in 
Figure 1c, where (1,2) represents the link between streets 1 and 2, whereas (3,2) represents the link between streets 3 and 2. The traffic light scheme shows two idle periods in (1,2), one that represents the red–yellow transition, and one that represents the yellow light after the green one. The total idle time is denoted as 
. During the period composed of the idle and green light times in (1,2), the traffic light in (3,2) must be in red. The same scheme applies for (3,2) in the second half of the cycle time 
. Considering the above definitions, the following constraint applies to node 
p:
      where 
 is the green light time for link 
i.
One stage is defined as the time at which an input link to a node p has the right of way (ROW), whereas the set of stages for a node p is denoted as . The saturation flow , with , represents the output vehicular flow capacity for the link i while the traffic light is green. The set of values  is normally known as . The turn rate on each node, denoted by , with  and , represents the distribution of vehicles turning from link i to j. In other words,  is the turn rate from link i to j, with   and .
The traffic light state in a link , denoted as , can be equal to 1 when the traffic light is green (ROW) and 0 in a different case (yellow or red). In general, the set of network states is , .
The expressions  and , with , represent the external vehicular input flow and the vehicular storing capacity of i, respectively. In general terms, , , and  represent the vehicular input flows and storing capacity of all the network links.
Network Restrictions. The geometry restriction on each link is given by , and the control input restriction is represented as , where  and  are the current and average queue lengths of link i, respectively, and  is the minimum value in seconds for the green light, which depends on the system properties (e.g., it can be equal to zero when there are no vehicles in link i). Finally, the term  represents the addition of all green times in intersection p and is given by . Observe that this value must not exceed the difference between the cycle time and the total idle time.
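The last restriction above (the sum of green times at an intersection must not exceed the cycle time minus the total idle time) can be checked numerically. The following is a minimal sketch in Python; the function names `green_times_feasible` and `scale_to_budget`, and the proportional rescaling rule, are illustrative assumptions and are not taken from the simulator described in this work.

```python
def green_times_feasible(green_times, cycle_time, idle_time, min_green=0.0):
    """Check that every green time is at least min_green and that the sum of
    green times fits in the cycle once the total idle time is removed."""
    if any(g < min_green for g in green_times):
        return False
    return sum(green_times) <= cycle_time - idle_time


def scale_to_budget(green_times, cycle_time, idle_time):
    """Shrink green times proportionally when they exceed the green budget.
    The proportional rule is an assumption made for illustration only."""
    budget = cycle_time - idle_time
    total = sum(green_times)
    if total <= budget:
        return list(green_times)
    return [g * budget / total for g in green_times]
```

For example, with a cycle time of 90 s and an idle time of 10 s, green times of 30 s and 40 s are feasible, whereas 50 s and 40 s are not and would be rescaled to sum to 80 s.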
  3.1. Continuous Modeling of an Urban Traffic Network
The vehicular storing change in a link 
 can be modeled as a mass balance, given by
        where 
 and 
 represent the queue output flow and queue length changes, respectively. The input flow change 
 is given by
        with 
 being the turn rate from the input link 
 to node 
 direct towards the output link 
. Using (
6) and (
5), we have
The output traffic flow change of link 
i is modeled as
        i.e., the traffic flow change is given by the difference between the output vehicular flow capacity for link i while the traffic light is green (saturation flow 
), and the queue output flow 
.
Additionally, the green time variation 
 is expressed as
        which is constant when the traffic light state 
 is equal to 1 (green time), since the green time 
 increases linearly from its minimum value 
 to its maximum value 
. The second term in (
9) indicates that there is no green time variation if the traffic light state 
 is equal to 0 (yellow or red).
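To make the mass-balance idea concrete, the sketch below integrates the queue of a single link with a forward Euler step. It is an illustration under simplifying assumptions, not the model implemented in the simulator: the inflow is constant, the outflow equals the saturation flow while the light is green (limited by the vehicles available), and the names `simulate_link`, `inflow`, and `saturation_flow` are hypothetical.

```python
def simulate_link(q0, inflow, saturation_flow, signal, dt=1.0):
    """Forward-Euler integration of a single-link queue.

    q0              -- initial queue length (vehicles)
    inflow          -- constant input flow (vehicles per second), assumed
    saturation_flow -- output capacity while green (vehicles per second)
    signal          -- sequence of light states, 1 = green, 0 = yellow/red
    Returns a list of (queue length, accumulated green time) per step.
    """
    q = q0
    green_elapsed = 0.0
    history = []
    for u in signal:
        if u == 1:
            # Outflow is capped by capacity and by the vehicles available.
            outflow = min(saturation_flow, q / dt + inflow)
            green_elapsed += dt  # green time grows linearly while green
        else:
            outflow = 0.0        # no discharge and no green-time variation
        q = q + dt * (inflow - outflow)  # mass balance: input minus output
        history.append((q, green_elapsed))
    return history
```

With an initial queue of 10 vehicles, an inflow of 0.5 veh/s, a saturation flow of 2 veh/s, and two red steps followed by two green steps, the queue grows to 11 vehicles and then discharges to 8.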
Finally, considering (
7)–(
9) and a 
 link network, we can represent the system as the differential-algebraic system of equations (DAE) 
, with the discontinuous coefficients matrix form
        where 
 is the network configuration in a given mode 
. 
 is a diagonal matrix whose elements are the 
. 
, and 
. 
, and 
. The matrix 
 is built by the proper selection of the turn rates.
Rewriting (
10) as a 
homogeneous system, we have
        whose matrix form is
  3.2. Hybrid Urban Traffic System
The urban traffic network integrates the continuous scheme described in 
Section 3.1 with the discrete behavior resulting from transitions between different network modes. This integration allows us to model the network as a hybrid system. To illustrate this, consider the network depicted in 
Figure 1a and its representation by the automaton in 
Figure 2. In this case, we have four traffic lights, each one with two possible states 
, i.e., we have 
 possible network configurations. However, since 
 and 
, this number can be reduced to 
, which is the number of ovals in the automaton. The set of modes is denoted as 
. The transitions between states are governed by a threshold condition based on the difference between the green time of individual links (
) and the total green time at each node (
). For instance, the transition from mode 
 to 
 is triggered when 
, i.e., when the green time in link 4 is greater than or equal to the sum of green times of links 3 and 4 (
). In this sense, the switching scheme between modes is dynamic and depends on the variation of green times of links, which changes according to the fitness function used in the control strategies described in 
Section 4.
In general, the state vector for this system is
        where 
 is a finite discrete set of modes. The DAE hybrid system representation is
        where
        and
 and 
 represent the consistency sets and the projectors, respectively [
29]. 
, and 
 are subsets 
 that define the system evolution according to 
F and 
G. In the transitions between modes, the changes of 
 are defined by the map 
, where
Additionally, the changes of modes in the network are determined by
  4. Population Game-Based Model for Urban Traffic Control
In the urban traffic system, each node 
 is represented as a population of agents that share the effective green time 
. In this sense, agents form a mass 
m distributed between the set of pure strategies 
 of the population. The set 
S corresponds to the set of stages 
 described in 
Section 3, i.e., the proportion of 
m that the input links of node p have for the right of way (ROW). Thus, each link 
 is assigned a proportion 
 of 
m, which is a pure strategy of the population.
The whole behavior of the agents defines the population state 
x, which belongs to the simplex
The set of agents assuming a strategy  allows us to obtain a payoff provided by the fitness function , in a determined population state x. The average fitness function is defined as , and the excess payoff of j is .
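As a small numerical illustration of these two quantities, the sketch below assumes the usual population-games convention: the average fitness is the mass-weighted mean of the strategy fitness values, and the excess payoff of a strategy is its fitness minus that average. The function names are hypothetical.

```python
def average_fitness(x, fitness):
    """Mass-weighted mean fitness of the population state x."""
    mass = sum(x)
    return sum(xj * fj for xj, fj in zip(x, fitness)) / mass


def excess_payoffs(x, fitness):
    """Fitness of each strategy minus the average fitness."""
    avg = average_fitness(x, fitness)
    return [fj - avg for fj in fitness]
```

For a state x = (0.25, 0.75) with fitness values (4, 0), the average fitness is 1 and the excess payoffs are (3, -1).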
In this approach, the fitness function for a given link j is the relation between its queue length and its assigned proportion of m. This leads us to Definition 1.
Definition 1.  The fitness function for the adaptive control strategies based on population games depends on the relation between the link queue capacity and its green time assignment. This is given by
        which must be minimized.
  4.1. Mean Dynamics and Revision Protocols
In 
mean dynamics, each of the 
n agents receives revision opportunities at the same rate R, which means that agents playing strategy 
j have an expected number of opportunities given by 
 in a time interval 
 to observe their opponents’ behavior in order to make decisions. On the other hand, the expected number of opportunities that the agents have to change from strategy 
j to 
k is 
, where 
 represents the conditional switch rate to change from 
j to 
k. In this way, the expected change of agents using strategy 
j is given by
        from which we define an expected-value-based dynamic known as the 
mean dynamic, given by
The above expression determines how the proportion of agents playing strategy 
j varies [
30].
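The mean dynamic can be read as an inflow-minus-outflow balance for each strategy: agents arriving at strategy j from every other strategy, minus agents leaving j. The sketch below assumes this standard textbook form; the matrix `rho` of conditional switch rates and the function name `mean_dynamic` are illustrative names, not identifiers from the simulator.

```python
def mean_dynamic(x, rho):
    """Rate of change of each strategy share under a revision protocol.

    x   -- population state, x[j] is the mass playing strategy j
    rho -- rho[j][k] is the conditional switch rate from strategy j to k
    """
    n = len(x)
    derivative = []
    for j in range(n):
        # Agents switching into j from every strategy k.
        inflow = sum(x[k] * rho[k][j] for k in range(n))
        # Agents currently playing j that switch to some strategy k.
        outflow = x[j] * sum(rho[j][k] for k in range(n))
        derivative.append(inflow - outflow)
    return derivative
```

If half of the population plays each of two strategies and only switches from the first to the second occur at unit rate, the first share decreases at rate 0.5 and the second increases at the same rate, so the total mass is conserved.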
The conditional switch rate 
 also defines the revision protocols shown in 
Table 1. These revision protocols determine the population dynamic used by the agent population. We describe them in detail in 
Section 4.2, with a particular focus on the fictitious play case, since it uses historical data to define agent strategies, demonstrating its viability within a fixed-time traffic framework.
  4.2. Population Dynamics
In this section, we describe the population dynamics employed in this approach to optimize the green times for each node in the traffic network. First, we describe the dynamics that, due to their online nature, are used for adaptive control strategies. These dynamics are replicator, Brown–Von Neumann–Nash, Logit and Smith, which use (
13) as a fitness function. Subsequently, we describe fictitious play as a fixed-time control strategy, since it uses historical data for agent decision making. In this case, the payoff function is given by (
22).
  4.2.1. Replicator Dynamics
The replicator dynamic (RD) describes how the frequencies of different strategies change over time based on the fitness associated with each one, i.e., strategies having higher payoffs tend to increase in frequency within the population, while less successful strategies decrease. An agent using this type of dynamics chooses an opponent and applies the imitation protocol: the agent adopts the opponent’s strategy when its utility is better than the agent’s current one. This dynamic is described in terms of the difference between the fitness of the strategy 
j and the average fitness, as follows:
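In the population-games literature, the replicator dynamic is commonly written as the share of strategy j multiplied by the difference between its fitness and the average fitness. The sketch below assumes that standard form and integrates it with a forward Euler step. The fitness used in the example, queue length divided by assigned green share, is only one possible reading of the verbal description in Definition 1 and is an assumption, as are all function and variable names.

```python
def replicator_step(x, fitness, dt):
    """One Euler step of the replicator dynamic x_j' = x_j (F_j - F_avg)."""
    mass = sum(x)
    avg = sum(xj * fj for xj, fj in zip(x, fitness)) / mass
    return [xj + dt * xj * (fj - avg) for xj, fj in zip(x, fitness)]


def ratio_fitness(queues, x):
    """Hypothetical fitness: queue length over assigned green share."""
    return [q / xj for q, xj in zip(queues, x)]


# Two input links competing for the green share of one node.
x = [0.5, 0.5]            # initial green shares
queues = [30.0, 10.0]     # queue lengths, held constant for illustration
for _ in range(2000):
    x = replicator_step(x, ratio_fitness(queues, x), dt=0.001)
```

Under these assumptions the shares settle where both fitness values are equal, that is, with green shares proportional to the queue lengths (0.75 and 0.25 here), while the total share remains equal to 1.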
  4.2.2. Brown–Von Neumann–Nash
The Brown–Von Neumann–Nash (BNN) dynamic relies on the 
excess payoff dynamic, since it compares the 
excess payoff of strategy 
j with the 
excess payoff caused by the switch from strategy 
j to 
k. It can be expressed as
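For reference, the BNN dynamic is usually stated in terms of the positive part of the excess payoff: strategies with above-average fitness gain mass in proportion to their positive excess payoff, and every strategy loses mass in proportion to the total positive excess payoff. The sketch below assumes this standard form; the function name is hypothetical.

```python
def bnn_derivative(x, fitness):
    """Brown-von Neumann-Nash dynamic in its standard form:
    x_j' = m * [e_j]_+ - x_j * sum_k [e_k]_+, with e the excess payoff."""
    mass = sum(x)
    avg = sum(xj * fj for xj, fj in zip(x, fitness)) / mass
    positive_excess = [max(0.0, fj - avg) for fj in fitness]
    total = sum(positive_excess)
    return [mass * e - xj * total for e, xj in zip(positive_excess, x)]
```

For x = (0.25, 0.75) and fitness values (4, 0), the average fitness is 1, only the first strategy has positive excess payoff (3), and the resulting rates of change are (2.25, -2.25).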
  4.2.3. Logit Dynamic
The Logit dynamic adopts the Gibbs–Boltzmann form of (
18), in which the rationality parameter 
 is included to increase the probability of strategies having better payoffs or to assign them the same probability to be chosen. In the first case, when 
, agents play following the best response rule, and in the second case, when 
, agents play under a uniform distribution scheme to select strategies.
Rationality 
 can be interpreted as the inverse of the noise 
, and it is included in the 
Logit selection protocol shown in 
Table 1.
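The two limiting behaviors described above can be reproduced with a short sketch of the Logit choice rule, assuming the standard Gibbs–Boltzmann (softmax) form. The name `eta` for the rationality parameter and both function names are assumptions introduced for this illustration.

```python
import math


def logit_choice(fitness, eta):
    """Gibbs-Boltzmann choice probabilities with rationality eta.
    Subtracting the maximum keeps the exponentials numerically stable."""
    top = max(fitness)
    weights = [math.exp(eta * (f - top)) for f in fitness]
    total = sum(weights)
    return [w / total for w in weights]


def logit_derivative(x, fitness, eta):
    """Logit dynamic: move the state toward the Logit choice distribution."""
    mass = sum(x)
    return [mass * p - xj for p, xj in zip(logit_choice(fitness, eta), x)]
```

With `eta = 0` every strategy is chosen with the same probability, while a large value such as `eta = 50` concentrates almost all probability on the strategy with the highest fitness, which approximates the best-response rule.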
  4.2.4. Smith Dynamic
In Smith dynamics, agents randomly select a strategy and evaluate its payoff against the current strategy. If the new payoff surpasses the current one, the agent switches to the new strategy with a probability proportional to the difference between the payoffs. Unlike the BNN dynamic, which carries out a comparison against the average payoff, or RD, which compares against an opponent strategy, the Smith dynamic involves a direct comparison between two strategies, which prevents some strategies from eventually being discarded. This dynamic is described by
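The pairwise comparison described above is commonly written as a gain term (mass arriving from strategies with lower fitness) minus a loss term (mass leaving toward strategies with higher fitness). The sketch below assumes this standard form of the Smith dynamic; the function name is hypothetical.

```python
def smith_derivative(x, fitness):
    """Smith dynamic in its standard pairwise-comparison form:
    x_j' = sum_k x_k [F_j - F_k]_+  -  x_j sum_k [F_k - F_j]_+ ."""
    n = len(x)
    derivative = []
    for j in range(n):
        # Mass attracted from strategies whose fitness is lower than F_j.
        gain = sum(x[k] * max(0.0, fitness[j] - fitness[k]) for k in range(n))
        # Mass lost toward strategies whose fitness is higher than F_j.
        loss = x[j] * sum(max(0.0, fitness[k] - fitness[j]) for k in range(n))
        derivative.append(gain - loss)
    return derivative
```

Because switches depend only on pairwise differences, a strategy with zero mass can still attract agents, which is consistent with the remark that strategies are not permanently discarded. For x = (0.25, 0.75) and fitness values (4, 0), the rates of change are (3, -3).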
  4.3. Fictitious Play
In fictitious play, a player 
i plays a pure best response (BR) action 
 based on the joint distribution of the selected actions of its opponents. This can be expressed by
        where 
 represents the action performed by agent 
i at time 
t, and 
 is the joint probability distribution of actions performed by all players except 
i. The marginal probability for each player 
i up to time 
t is calculated using the expression
        where 
 is an 
S-dimensional vector that counts the times that player 
i has played each action. According to the above expressions, it is noticeable that in fictitious play, each agent requires historical data in order to make decisions. Consequently, in our context, this control strategy is assumed to follow a fixed-time approach.
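The two ingredients of fictitious play, the empirical distribution built from action counts and the best response to it, can be sketched as follows. This is a generic two-player illustration in which the payoff is maximized; a cost such as a queue length would enter with the opposite sign. The payoff matrix and all function names are hypothetical.

```python
def empirical_distribution(counts):
    """Relative frequency of each action in the observed history."""
    total = sum(counts)
    return [c / total for c in counts]


def best_response(payoff, opponent_distribution):
    """Index of the action with the highest expected payoff.
    payoff[a][b] is the payoff of playing a when the opponent plays b;
    ties are broken in favor of the lowest index."""
    expected = [
        sum(p * u for p, u in zip(opponent_distribution, row))
        for row in payoff
    ]
    return max(range(len(expected)), key=expected.__getitem__)


def fictitious_play_step(payoff, opponent_counts):
    """Best response to the opponent's empirical play so far."""
    return best_response(payoff, empirical_distribution(opponent_counts))
```

For instance, if the opponent has played its first action three times and its second action once, the empirical distribution is (0.75, 0.25), and in a coordination game with payoff matrix [[1, 0], [0, 1]] the best response is the first action.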
The payoff for an agent i is a function that depends on its current action and the action set  played by the agents other than i. In our case, this payoff depends on the queue length of links converging to a node. This is presented in Definition 2.
Definition 2.  The payoff function in a fixed-time scheme modeled as a fictitious play-based control strategy depends on the cumulative queue length of the incident links. This is given by
        which must be minimized.
According to Definitions 1 and 2, fitness and payoff functions determine the control strategies’ dynamics in adaptive and fixed-time schemes, respectively. For the adaptive control case, given by the online population dynamics described in 
Section 4.2, the green time is continuously updated according to the fitness function variation, which impacts the transition between the modes of the hybrid system described in 
Section 3.2. In the fixed-time case, the payoff function depends exclusively on the queue lengths of incident links to a specific node, which means that the mode transitions are determined by the current network configuration and the modes that have provided the best queue length reduction historically.
Both functions are strongly related to the traffic network dynamics for which they were designed, i.e., they associate good fitness or payoffs with links that have low queue lengths and acceptable green times. The results of the dynamic changes in the functions’ values are described through the performance parameters for different scenarios, as shown in 
Section 7.
  5. Traffic Simulator
The implemented traffic simulator is shown in 
Figure 3. This integrates, through a COM server, the VISSIM platform, in which the traffic network is implemented, and MATLAB, where the control strategies described in 
Section 4.1 and the hybrid model described in 
Section 3.2 are implemented.
The following are the main steps in the information flow between both simulator sections:
- First of all, the vehicular traffic network is implemented in VISSIM. Subsequently, MATLAB obtains all its properties, such as number of controllers, traffic lights, queue counters, number of links, cycle time, duration of yellow and red lights (intermediate times), and simulation time, among others. 
- The runtime environment, which is implemented in an object-oriented scheme, allows for the creation of class instances that represent the traffic network (implemented in VISSIM) and the virtual traffic controller associated with each node that is modeled by means of the hybrid dynamics described in Section 3.2. 
- The initial green times are loaded through class instances that represent the control strategies described in Section 4.1. These times, along with the cycle time and the intermediate times (red and yellow), are used to generate the signaling plan for each controller, which is sent to VISSIM. 
- The update of the signaling plan, as well as the reading of the queue counters, are made in sample intervals . After this time, the queue lengths calculated in VISSIM are sent to the control strategy instance in order to recalculate the green times. This process is repeated until the simulation time expires. 
- Finally, the communication between both modules ends and the information related to performance parameters is stored in VISSIM. 
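The information flow listed above can be summarized as a sampled control loop. The following Python sketch only mirrors the sequence of steps: `StubNetwork` is a hypothetical stand-in for the simulator side and does not use or reproduce the VISSIM COM interface, and `proportional_strategy` is a placeholder rule, not one of the control strategies of Section 4.

```python
class StubNetwork:
    """Hypothetical stand-in for the traffic simulator (not the VISSIM API)."""

    def __init__(self, queues):
        self.queues = list(queues)
        self.plans = []  # signal plans received so far

    def apply_signal_plan(self, green_times):
        self.plans.append(list(green_times))

    def read_queue_lengths(self):
        return list(self.queues)


def proportional_strategy(queues, green_budget):
    """Placeholder rule: split the green budget in proportion to the queues."""
    total = sum(queues)
    if total == 0:
        return [green_budget / len(queues)] * len(queues)
    return [green_budget * q / total for q in queues]


def run_control_loop(network, strategy, n_links, cycle_time, idle_time,
                     sim_time, sample_interval):
    """Send a plan, wait one sample interval, read queues, recompute greens."""
    budget = cycle_time - idle_time
    green = [budget / n_links] * n_links  # initial green times
    t = 0
    while t < sim_time:
        network.apply_signal_plan(green)       # signaling plan to simulator
        t += sample_interval                   # simulator advances
        queues = network.read_queue_lengths()  # queue counters are read
        green = strategy(queues, budget)       # green times are recalculated
    return green
```

For example, `run_control_loop(StubNetwork([30, 10]), proportional_strategy, 2, 90, 10, 300, 100)` sends three signal plans and returns green times of 60 s and 20 s for the two links.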
VISSIM includes a COM programming module that allows for an interaction with external programming languages and platforms such as MATLAB in a stable form. In our case, the traffic simulator architecture has an object-oriented scheme implemented in MATLAB, whose main classes are the following:
- COMInterfaceVISSIM: Builds the VISSIM and MATLAB COM connection. It also contains methods to send information to VISSIM (cycle time, simulation time, performance parameters to be measured, etc.). 
- HybridController: Each instance of this class creates a “virtual controller” that interprets the signal plans and sends the instructions that the VISSIM controllers must follow. 
- ControllerUserDefined: Used to define the control strategy to be used. It also generates the green times for each virtual controller generated. 
- SignalPlan: Creates the signal plans from the information provided in the graphic interface and the green times given by ControllerUserDefined. 
- MainSimulation: Main class in which the instances of the previously described classes are created. 
  6. Case Study
Figure 4 shows the geometry and movement scheme for the traffic network used as a case study, which is located in the urban zone of Barranquilla city in Colombia (between 82 and 84 streets, and Kra. 50 to Kra. 52). The current traffic plans, which are based on a fixed-time strategy that considers only the queue length and historical data, were provided by the city traffic controller company (IMATIC). The results shown in 
Section 7 compare their performance with that of the proposed game theory-based controllers for multiple network parameters. In particular, this network has short distances between nodes, which prevents large numbers of vehicles from accumulating within links. Each node has two input links, which corresponds to the number of strategies that each agent can assume at that node. In total, 16 links are modeled in this approach. Since the network is continuously changing, and with the aim of testing the different control strategies provided by the population dynamics described in 
Section 4.2, we propose the five scenarios described in 
Table 2.
   7. Discussion
The implemented control strategies based on the population dynamics described in 
Section 4.1 were used on each one of the scenarios described in 
Table 2, during a simulation time of 
, with sample intervals 
, cycle time 
 and idle time 
.
The performance of each control strategy is evaluated according to the parameters shown in 
Table 3 and compared in 
Table 4 with the current fixed-time strategy used by the traffic controller company of Barranquilla city. Observe how most of the proposed control strategies outperform the current one, achieving reductions in the number of stops (NS) of up to 90% and increasing the average speed (AS) by up to 79% and 80% in scenario 4 with RD and Smith dynamics, respectively. There is no notable improvement for scenario 1 due to the low and constant flow of vehicles. However, under saturation and variable conditions, the proposed strategies improve the results for all the evaluated parameters except the total distance traveled (TDT), which increased by up to 17% in scenario 4 for almost all the proposed strategies. This can be attributed to the change of routes inherent to the dynamics of the strategies.
Another important comparison provided by 
Table 4 encompasses game theory-based control strategies. In general, both Smith and replicator dynamics demonstrate superior performance in saturated and non-saturated scenarios due to their decentralized mechanisms. In replicator dynamics (RD), agents adjust their strategies based on pairwise fitness comparisons, whereas in Smith dynamics, they compare the expected payoffs of different strategies. This localized decision-making approach eliminates the need for complete information—unlike BNN or Logit dynamics—enhancing adaptability to unpredictable environments and, consequently, traffic volatility.
In scenario 1, the control strategy with the best performance was Smith dynamics. As shown in 
Table 4, all of its parameters show the most favorable values in comparison with the other controllers. On the other hand, the fixed-time control strategy, fictitious play, also proves to be a good option, with values that do not deviate significantly from the Smith case; Logit (
) yields the least favorable outcomes.
In scenario 2, all the control strategies demonstrate a notable improvement in comparison with scenario 1, especially Smith and replicator dynamics. However, neither of the two dominates in every parameter: Smith dynamics surpasses replicator dynamics in ASTV and TST, whereas the replicator dynamic outperforms the Smith dynamic in TTT.
In scenario 3, the fixed-time strategy, fictitious play, yields superior outcomes compared to the adaptive strategies, replicator and Smith, in certain parameters such as TTT and AS. However, in terms of TST and ASTV, fictitious play is surpassed by them. Once more, the Smith and replicator dynamics demonstrate superior performance overall, with the replicator dynamic outperforming Smith in NS and TTT, while Smith dynamics slightly surpasses replicator dynamics in TST.
In scenario 4, saturated network conditions are maintained, but this time with a reduction in flows in relation to scenario 3. The adaptive strategy, Smith, exhibits the best performance among all strategies, followed by replicator dynamics. The fixed-time strategy based on fictitious play presents better results compared to the BNN and Logit cases.
The last scenario, scenario 5, is intended to test the adaptive strategies’ performance. Again, the best results are obtained for Smith and replicator dynamics, in that order. Since this scenario is completely variable, the fixed-time strategy, fictitious play, exhibits a performance reduction in comparison with BNN and Logit, in relation to scenario 4.
  8. Conclusions
In this work, we implemented a traffic simulator that combines the micro-simulation platform VISSIM with MATLAB in order to apply population game-based control strategies that define green times in an eight-intersection traffic network. Due to its determinant role in network congestion, the queue length of links is an important parameter for fitness and payoff functions’ definitions, both in the fixed case, which is modeled using fictitious play dynamics, and in the adaptive case, modeled using the control strategies based on RD, BNN, Logit and Smith dynamics. Within the traffic simulator, we included a hybrid system to model both the continuous and discrete behavior of the network, covering both the vehicular storing change in links and the transitions between modes caused by the traffic light changes in nodes. The results, obtained in five different traffic scenarios, demonstrated acceptable performance for the fixed-time scheme modeled with fictitious play in a traffic network without saturation conditions. In conditions considering saturation and variability in traffic flows, fictitious play exhibited good performance in many of the analyzed parameters, specifically in NS, TTT, AS, TST and ASTV, in which it outperformed BNN and Logit dynamics. On the other hand, Smith dynamics emerged as the most effective control strategy across all studied scenarios, consistently outperforming the others, particularly in situations where increased congestion led to a decline in fitness values. Replicator dynamics also demonstrated strong performance, though slightly behind Smith dynamics. This superior performance can be attributed to the decentralized nature of Smith dynamics and RD. In the case of RD, agents adjust their strategies based on pairwise comparisons of fitness values, while in Smith dynamics, agents compare the expected payoffs of different strategies. 
This localized decision-making process enhances their adaptability in unpredictable environments, such as traffic conditions, as they do not rely on a complete information scheme, which is a requirement that other dynamics, such as BNN or Logit, depend on.
The proposed model, unlike other related works, considers five distinct levels of saturation or traffic characteristics, ranging from constant flows to variable flows in both vertical and horizontal links. Under these scenarios, the performance of various control strategies based on game theory was evaluated across multiple parameters, including the number of stops (NS), total travel time (TTT), total distance traveled (TDT), average speed (AS), total stop time (TST), and average stop time per vehicle (ASTV). Although the density of traffic flows can be adjusted to further validate the results, this study demonstrates that certain control strategies outperform others across most performance metrics, particularly under high saturation levels. The findings provide clear evidence of which game theory-based control strategies are best-suited for fixed or adaptive schemes. Furthermore, the traffic network is modeled as a hybrid system, a novel approach not explored in previous studies. This hybrid modeling provides a more realistic representation of traffic behavior, capturing both the continuous and discrete dynamics inherent in real-world traffic systems.
  Limitations and Future Directions
Although VISSIM and MATLAB are widely used for traffic analysis, the proposed combination operates in a simulated environment that accounts for various congestion conditions and performance metrics. However, it does not incorporate real-world data, which would enhance the accuracy and practical applicability of the results. Therefore, future research should focus on implementing this approach using actual traffic datasets and more complex traffic networks. Additionally, it is important to consider how the proposed control strategies could improve the performance in terms of the TDT parameter, which was the only one presenting negative behavior in saturated scenarios.