Applied Sciences
  • Article
  • Open Access

27 December 2024

Integrating Visible Light Communication and AI for Adaptive Traffic Management: A Focus on Reward Functions and Rerouting Coordination

1 DEETC-ISEL/IPL, R. Conselheiro Emídio Navarro, 1949-014 Lisboa, Portugal
2 UNINOVA-CTS and LASI, Quinta da Torre, Monte da Caparica, 2829-516 Caparica, Portugal
3 NOVA School of Science and Technology, Quinta da Torre, Monte da Caparica, 2829-516 Caparica, Portugal
4 INESC INOV, R. Alves Redol, 9, 1000-029 Lisboa, Portugal
This article belongs to the Special Issue Novel Advances in Internet of Vehicles

Abstract

This study combines Visible Light Communication (VLC) and Artificial Intelligence (AI) to optimize traffic signal control, reduce congestion, and enhance safety. Utilizing existing road infrastructure, VLC technology transmits real-time data on vehicle and pedestrian positions, speeds, and queues. AI agents, powered by Deep Reinforcement Learning (DRL), process these data to manage traffic flows dynamically, applying anti-bottlenecking and rerouting techniques. A global agent coordinates local agents, enabling indirect communication and a unified DRL model that adjusts traffic light phases in real time using a queue/request/response system. A key focus of this work is the design of reward functions for standard and rerouting scenarios. In standard scenarios, the reward function prioritizes wide green bands for vehicles while penalizing pedestrian rule violations, balancing efficiency and safety. In rerouting scenarios, it dynamically prevents queuing spillovers at neighboring intersections, mitigating cascading congestion and ensuring safe, timely pedestrian crossings. Simulation experiments in the SUMO urban mobility simulator and real-world trials validate the system across diverse intersection types, including four-way crossings, T-intersections, and roundabouts. Results show significant reductions in vehicle and pedestrian waiting times, particularly in rerouting scenarios, demonstrating the system’s scalability and adaptability. By integrating VLC technology and AI-driven adaptive control, this approach achieves efficient, safe, and flexible traffic management. The proposed system addresses urban mobility challenges effectively, offering a robust solution to modern traffic demands while improving the travel experience for all road users.

1. Introduction

Urban traffic management presents a significant challenge for modern cities, as the growing volume of vehicles and pedestrians contributes to congestion, delays, and safety risks []. Given the spatial constraints in urban areas, expanding road infrastructure is no longer a viable solution. Instead, optimizing traffic flow at intersections has become critical, with adaptive traffic signal control emerging as one of the most effective strategies.
Adaptive systems can significantly reduce congestion and improve intersection efficiency by using real-time data from traffic networks, such as traffic flow, waiting times, and vehicle queues. Visible Light Communication (VLC) offers a versatile and efficient solution for modern traffic management, seamlessly integrating into existing infrastructure, such as vehicles, streetlights, and traffic signals [,]. Beyond illumination and communication, VLC enables dynamic and adaptive traffic signal control [,], particularly when combined with AI. This paper introduces a novel approach to traffic management based on DRL that emphasizes the design and impact of reward functions tailored for standard and rerouting scenarios, enhancing both vehicular efficiency and pedestrian safety.
Intersections are key bottlenecks in road networks, making intelligent signal control crucial for improving traffic flow. Recent advances in Deep Reinforcement Learning (DRL) have shown promise in dynamically managing traffic signals for both vehicles and pedestrians [,]. However, optimizing traffic flow across multiple intersections is challenging due to varying traffic conditions and the need for information sharing []. The rise of connected vehicles (CVs) adds further potential to traffic management. Through advanced communication technologies, CVs can exchange real-time traffic and safety information with each other and with the infrastructure, improving road safety, comfort, and efficiency. This connected traffic environment enables better optimization of both vehicular and pedestrian flows. VLC can complement CV technologies: by leveraging the LEDs already present in road infrastructure and vehicles, it provides an additional channel for the data that adaptive signal control relies on.
This work investigates the integration of VLC and AI, particularly DRL, to design reward functions that optimize traffic signal control in both standard and rerouting scenarios. This research focuses on applying DRL in vehicular communication to enhance traffic flow, improve intersection efficiency, and balance the needs of vehicles and pedestrians in real-world urban environments. By leveraging emerging technologies, such as VLC and connected vehicles (CVs), the study aims to demonstrate their potential to address modern traffic management challenges effectively.
The core contributions are as follows:
  • Demonstrating the feasibility of VLC in outdoor environments as a complementary technology to existing vehicular communication systems.
  • Proposing an integrated framework where VLC is utilized to collect real-time traffic data, which are then processed and analyzed using DRL models.
  • Showcasing how the proposed approach can effectively optimize vehicle flow and traffic signal control in urban traffic scenarios, enhancing intersection efficiency and addressing modern traffic management challenges.
The paper is organized as follows. The introduction outlines the challenges in urban traffic management and explores the potential of integrating connected vehicles (CVs) with Visible Light Communication (VLC) to optimize traffic signal control and rerouting. Section 2 discusses the advantages of using VLC for communication in traffic systems, highlights key issues, such as congestion and safety, examines VLC’s role in real-time traffic management, and introduces Deep Reinforcement Learning (DRL) for traffic signal optimization. Section 3 describes the system architecture, the role of rewards in DRL, coordination strategies between agents, and the simulation used to validate the approach. Section 4 covers network training and testing, global agent decision making, and the impact of VLC and CV integration on traffic rerouting, along with associated challenges. Finally, Section 5 concludes the paper by summarizing the key findings, addressing the study’s limitations, and suggesting potential future research directions.

3. Proposed Approach and Methodology

This study presents a novel approach to urban traffic management by integrating VLC with AI, specifically leveraging DRL. The proposed system is designed to optimize traffic signal control in both standard and rerouting scenarios, ensuring efficient vehicular flow and pedestrian safety.

3.1. System Architecture

The system comprises three main components:
  • VLC-Enabled Infrastructure: Traffic lights and roadside units equipped with VLC modules to enable real-time data transmission between vehicles, infrastructure, and pedestrians.
  • AI Agents: Independent DRL-based agents are deployed at each intersection, with a centralized global agent coordinating the local ones. This structure ensures scalability and real-time adaptability to traffic dynamics.
  • Data Integration Platform: A platform that collects, processes, and integrates real-time data from VLC-enabled devices, including vehicle positions, speeds, and pedestrian requests.

3.1.1. VLC-Enabled Infrastructure

Arterial traffic signal control refers to managing intersections formed by crossing two or more main roads that are either radial or circular in design. The layout and spacing between intersections vary depending on traffic volume, road capacity, and network design. Each approach at an intersection comprises multiple lanes to accommodate different vehicle movements, such as left turns, right turns, and through-traffic. These intersections are governed by standard traffic rules, with priority movements determined by the traffic signals in place.
The scenario was idealized to reflect a futuristic urban environment tailored for autonomous vehicles. Key aspects include the following:
  • Multiple lanes on each arm of the junction to reduce queuing and enhance traffic flow.
  • Exclusive pedestrian phases to ensure complete pedestrian safety by segregating vehicle and pedestrian movements.
  • The scenario assumes that autonomous vehicles pre-plan their routes, enabling seamless navigation within designated lanes.
  • These assumptions align with a forward-looking vision of urban traffic management, optimizing safety and efficiency while accommodating future technological advancements.
The traffic scenario analyzed in this study is depicted in Figure 2. It includes three uniform four-arm junctions, spaced at varying distances. The lane between junctions C0 and C1 is 400 m long, while the lane between C1 and C2 is only 200 m long. Figure 4 presents the simulated environment for each four-legged intersection, showcasing the optical infrastructure (Xij), the generated footprints (#1–#9), and the interactions between connected vehicles and pedestrians. The streetlights along the lanes, denoted as Xij, are identified by integers that run opposite to the direction of traffic (N, S, E, W), starting from streetlight 0 at the signalized intersection in question and extending, in line, towards streetlights K, L, M, and N at the neighboring junction.
Figure 4. Simulated scenario for each junction: four-legged intersection and environment with the optical infrastructure (Xij), the generated footprints (#1–#9), and the connected cars and pedestrians. Dashed lines show the sidewalks.
Each arm of the junction has two lanes: one for left turns and another for vehicles to continue straight or turn right. This design is tailored for CAVs. Using VLC (Figure 1), the vehicle gathers information about the junction’s footprint regions, lanes, and surroundings to select the appropriate lane. Once positioned, the vehicle communicates its intention to cross to the Intersection Manager (IM) via V2I technology. The environment also includes sidewalks with designated pedestrian waiting zones at each intersection, ensuring safe waiting areas until the crossing phase is activated for pedestrians to use the zebra crossings.
The traffic signal phase timing is determined by factors like traffic demand, intersection layout, and traffic management objectives (e.g., minimizing delays, maximizing traffic flow). Traffic signals typically operate in phases, with green lights allowing movement in one direction while red lights restrict conflicting movements. In this study, we design eight vehicle signal phases and one exclusive pedestrian phase for each intersection, as illustrated in Figure 5a. For vehicles, there is a north–south phase (P1) where they can either proceed straight or turn right, followed by a phase where vehicles coming from the north can cross in all directions (P2), and another phase where vehicles from the south can do the same (P3). Additionally, there is a phase where vehicles traveling from both the north and the south can only turn left (P4). The same phase structure applies to the east–west direction (P5, P6, P7, P8). Pedestrians have an exclusive phase during which all vehicle traffic lights are red, allowing them to cross safely without interference from vehicles. This exclusive phase ensures pedestrian safety by preventing any crossover between pedestrians and vehicles at the crossings.
Figure 5. (a) Phasing diagram. (b) Schematic diagram of one junction with coded lanes (L/0–7) and traffic lights (TL/0–15). Arrows show the traffic directions.
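The phase plan described above can be written down as a small lookup table. The following is a minimal sketch under stated assumptions: the dictionary layout, the movement labels, the helper names, and the assignment of P6 and P7 to the east and west approaches are illustrative choices, not the encoding used in the study.

```python
# Hypothetical encoding of the phase plan described in the text. Each phase maps
# an approach (N, S, E, W) to the movements that receive a green signal.
ALL_MOVES = frozenset({"left", "straight", "right"})
THROUGH_RIGHT = frozenset({"straight", "right"})
LEFT_ONLY = frozenset({"left"})

PHASES = {
    "P1": {"N": THROUGH_RIGHT, "S": THROUGH_RIGHT},
    "P2": {"N": ALL_MOVES},
    "P3": {"S": ALL_MOVES},
    "P4": {"N": LEFT_ONLY, "S": LEFT_ONLY},
    # East-west phases mirror P1-P4; which of P6/P7 serves east or west
    # is an assumption of this sketch.
    "P5": {"E": THROUGH_RIGHT, "W": THROUGH_RIGHT},
    "P6": {"E": ALL_MOVES},
    "P7": {"W": ALL_MOVES},
    "P8": {"E": LEFT_ONLY, "W": LEFT_ONLY},
    "PED": {},  # exclusive pedestrian phase: all vehicle signals are red
}


def has_green(phase, approach, movement):
    """Return True if the movement from the given approach is served by the phase."""
    return movement in PHASES[phase].get(approach, frozenset())


def pedestrians_may_cross(phase):
    """Pedestrians cross only in a phase where no vehicle movement has green."""
    return len(PHASES[phase]) == 0
```

For example, `has_green("P1", "N", "left")` is false because left turns are served separately in P4, and `pedestrians_may_cross` is true only for the exclusive pedestrian phase.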
A traffic control system consisting of sixteen traffic lights (TL) has been implemented to manage the flow of vehicles approaching the intersections. These traffic lights are numbered (TL 0–15), as shown in Figure 5b, which also displays the numbering of the lanes (L 0–7), consistent across all three junctions. These sixteen traffic lights enable the implementation of traffic phases to regulate the flow of vehicles and pedestrians.
Due to the dynamic nature of traffic flow, significant fluctuations occur, including variations in the number of stops and acceleration/deceleration events both on arterial roads and at individual intersections. These fluctuations are considered when forecasting potential traffic conflicts in the design.
The main challenge in controlling traffic across multiple intersections is the coordination required to manage the varying traffic conditions between intersections. The target roads between intersections vary in distance, making it crucial to synchronize traffic signals based on traffic volume, vehicle movements, and road capacity. Without proper coordination, localized traffic signal control can lead to inefficiencies like congestion or increased waiting times, especially when multiple intersections interact closely in an arterial network.

3.1.2. Multi-Agent Reinforcement Learning

Figure 6 illustrates the centralized control algorithm, where there is no direct communication between agents. Our research has explored various junction types, starting with a single junction [] and progressing to two [] and now three, focusing on the behavior and dynamics of different traffic scenarios in these environments. Because the junctions are homogeneous, the similar experiences observed by each agent allow for training of a single neural network, which acts as a global agent to make decisions and determine the best actions. This approach has proven effective in managing both pedestrian and vehicle traffic at these intersections. However, large occupancy peaks in the queues can sometimes occur, leading to congestion and inefficient traffic flow.
Figure 6. A schematic of the algorithm employed using centralized MARL.
The global agent can dynamically direct local agents to reroute traffic, avoiding congestion zones and ensuring smoother flow. By monitoring traffic density and flow patterns, it adjusts signal phases across intersections to create synchronized green waves that enhance throughput and reduce stop-and-go behavior. For instance, it may shorten green time at one intersection to prevent overwhelming a downstream intersection [,,].
From our study across various scenarios, we gained insights into traffic queue behavior and identified lane capacity limits. To address the lack of direct communication between agents, we integrated this knowledge into the network by setting threshold values for queues. This allows the global system to manage traffic flow in critical sections by evaluating the volume that can be accommodated in each direction.
Although the neural network is trained centrally, traffic signal agents at each intersection locally implement these rerouting strategies. Congestion thresholds are set to adapt and optimize traffic flow, ensuring efficient management even in high-demand conditions.

3.1.3. Data Integration Platform

The protocol is designed for a platform that collects, processes, and integrates real-time data from VLC-enabled devices, including vehicle positions, speeds, and pedestrian requests. The communication protocol defines the structure and rules governing the exchange of information, outlining the synchronization, identification, and payload portions of the transmitted frame. The communication protocol is detailed in Table 1.
Table 1. Message protocol defined for each of the V-VLC communications.
The frame structure begins with a 5-bit synchronization block (Sync) with the pattern [10101], used to synchronize the receivers and mark the start of a new frame (SOF). Following this, the TIME block encodes the current time with a 12-bit sequence (6 + 6 + 6), representing the hour, minute, and second. A flag with the pattern [111] indicates that specific ID blocks will follow.
Each ID block consists of 4 bits, starting with the communication type (COM) that specifies the communication between streetlights (L), vehicles (V), pedestrians (P), and infrastructure (I). The next block provides the localization of transmitters, defined by x and y coordinates. Depending on the communication type, additional details include the occupied lane (Lane 0–7), requested traffic light signals (TL 0–15), the number of vehicles behind the leader (Veic. nr), the ID assigned by the Intersection Manager (IM) to acknowledge vehicle messages, the cardinal direction (Direct.), and the active phase (Phase), provided by the infrastructure in a “request” or “response” message.
For traffic-related messages, the frame includes vehicle information, such as coordinates, the position of vehicles behind the leader (CarIDx, CarIDy), and traffic-related data (payload), such as road conditions, average waiting times, and weather information. The frame ends with a 4-bit End of Frame (EoF) block, represented by the pattern [0000], signaling the conclusion of the frame.
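The frame layout described above can be illustrated with a short encoder and decoder. This is a simplified sketch, not the protocol of Table 1: the Sync, flag, and EoF patterns and the 4-bit ID blocks come from the text, while the TIME field is taken here as three 6-bit values following the (6 + 6 + 6) breakdown, which is an assumption of the sketch. Coordinates and payload are omitted, and the function names are hypothetical.

```python
SYNC = "10101"        # 5-bit start-of-frame pattern (from the text)
FLAG = "111"          # announces that ID blocks follow (from the text)
EOF = "0000"          # 4-bit end-of-frame pattern (from the text)
TIME_FIELD_BITS = 6   # assumption: hour, minute, and second use 6 bits each
ID_BLOCK_BITS = 4     # each ID block is 4 bits (from the text)


def to_bits(value, width):
    """Encode a non-negative integer as a fixed-width binary string."""
    if not 0 <= value < 2 ** width:
        raise ValueError(f"{value} does not fit in {width} bits")
    return format(value, f"0{width}b")


def build_frame(hour, minute, second, id_blocks):
    """Assemble Sync + TIME + flag + ID blocks + EoF as a bit string."""
    time_bits = "".join(to_bits(v, TIME_FIELD_BITS) for v in (hour, minute, second))
    id_bits = "".join(to_bits(v, ID_BLOCK_BITS) for v in id_blocks)
    return SYNC + time_bits + FLAG + id_bits + EOF


def parse_frame(frame, n_id_blocks):
    """Recover the time triple and the ID block values from a frame."""
    expected = (len(SYNC) + 3 * TIME_FIELD_BITS + len(FLAG)
                + n_id_blocks * ID_BLOCK_BITS + len(EOF))
    if len(frame) != expected:
        raise ValueError("unexpected frame length")
    if not frame.startswith(SYNC) or not frame.endswith(EOF):
        raise ValueError("missing Sync or EoF pattern")
    pos = len(SYNC)
    time_fields = []
    for _ in range(3):
        time_fields.append(int(frame[pos:pos + TIME_FIELD_BITS], 2))
        pos += TIME_FIELD_BITS
    if frame[pos:pos + len(FLAG)] != FLAG:
        raise ValueError("missing flag before ID blocks")
    pos += len(FLAG)
    ids = []
    for _ in range(n_id_blocks):
        ids.append(int(frame[pos:pos + ID_BLOCK_BITS], 2))
        pos += ID_BLOCK_BITS
    return tuple(time_fields), ids


# Illustrative V2V message: COM = 2, Lane = 2, no vehicles behind, sent at 11:18:30.
frame = build_frame(11, 18, 30, [2, 2, 0])
decoded = parse_frame(frame, 3)
```

Fixed-width fields keep the decoder trivial: once the receiver locks onto the Sync pattern, every block sits at a known offset.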
In Figure 7, the moments of communication for both vehicles are illustrated. A highlighted car is coming from the north, heading towards C0, and it is waiting for its phase to be activated to turn left. Various VLC communications, V2V, V2I, and I2V, are being studied during this phase. At C0, pedestrians in the designated waiting area are awaiting the pedestrian phase activation. During this time, P2I communications are initiated, where the pedestrian makes a crossing request, and I2P serves as the IM’s response.
Figure 7. Simulated VLC scenario. Three junctions (C0, C1, and C2) with the RGBV ID transmitters. (a) C0. (b) C1. (c) C2.
Figure 8a,b demonstrate the MUX signal and the decoded messages between the analyzed vehicles and the traffic lights toward C0 and their movement after crossing C0 toward C1, respectively.
Figure 8. Normalized MUX signal and the decoded messages between the analyzed vehicles and the traffic lights toward C0 (a) and their movement after crossing C0 toward C1 (b).
In Figure 8a, for V2V communication (COM: 2), the vehicle behind the target car communicates with the one in front, providing its position as G5,1, R5,10, V4,0, the lane it is in (Lane: 2), the number of vehicles following it (in this case, none), and the time of communication (11:18:30). Because there are no cars behind it, it transmits nothing in the corresponding blocks to the car in front. After receiving this communication, the leader makes a request to the IM through V2I. It provides its position as G5,1, R5,10, V4,0, the traffic light to which the request is being made (TL: 2), the number of vehicles following it (Veic. (nr): 1), the time of the communication (TIME: 11:18:31), the identifier of the car behind it (y, x: G5,1), and the number of cars behind that follower, which is currently 0. Next, the I2V communication occurs: the leader receives a response with the same information at 11:18:32, indicating that the active phase is 4 (W > E).
In Figure 8b, after the N–S left phase is activated, the cars move toward the C1 intersection, where they are currently queued. Here, L2V and V2V communications are presented for the vehicle under study. In the first L2V communication, information is transmitted to the vehicles, such as their positions on the road (R3,3; B4,8; G3,2) and the time the communication is established (TIME: 11:23:15). In the V2V communication for the last vehicle in the queue, the transmitted information includes its position (R3,3; B4,8; G3,2), the road it is on (L: 0), the number of vehicles behind it (currently 0), and the time the communication is made (TIME: 11:23:16). Because there are no cars behind it, this vehicle does not pass any information forward in these blocks.
For the next vehicle in the queue, V2V communication transmits information like its position (R3,3; B4,8; G3,2), the road it is on (L: 0), the number of vehicles following it (in this case, 1), and the time the communication is established (TIME: 11:23:17), and it informs the vehicle ahead that there is one car behind it, providing its position (R3,3). After passing through C1, the vehicles arrive at C2, where they queue again, awaiting their phase. V2V and L2V communications are re-established. For the L2V communication, information is transmitted to the first vehicle under study, including its position on the road (R3,1; B4,6; G3,2) and the time the communication is established (11:26:10). In the V2V communication, the first car under study has three vehicles behind it on the same road. It then communicates its position (R3,1; B4,6; G3,2), the road it is on (lane 0), the number of vehicles following it (3), the communication time (11:26:11), the identifiers of the vehicles behind it (G3,2; R3,3; G3,4), as well as the number of vehicles following each of those cars. Figure 7 illustrates the different communication moments.
  • L2V Communication: Vehicles receive their positions in the environment.
  • V2V and V2I Communication: Vehicles receive their positions and communicate both with one another and with the infrastructure to relay positioning, traffic light phases, and vehicle status. This data exchange helps vehicles align their movements with the active traffic phases, ensuring a coordinated flow.
  • P2I and I2P Communication: Pedestrians also participate in a similar communication cycle, requesting to cross intersections and receiving confirmations. The infrastructure responds based on the active traffic phase, managing pedestrian crossings in harmony with vehicular phases to improve safety and efficiency.
  • Traffic Flow and Phase Coordination: Each intersection has specific phases (Figure 5) to control pedestrian and vehicle movements. By synchronizing these phases across multiple intersections (C0, C1, C2), the system can handle complex flows and enhance safety, preventing conflicts between vehicles and pedestrians.
Overall, the V-VLC protocols provide a structured way for traffic management systems to coordinate and streamline both pedestrian and vehicle movement, which can be particularly beneficial in densely populated or high-traffic areas. The approach leverages real-time data to reduce delays, improve safety, and optimize intersection efficiency.

3.2. Role of Rewards in Standard and Rerouting Scenarios

The design of the reward function and the integration of inter-agent communication significantly affect the system’s ability to handle traffic, because different rewards produce different strategies suited to the scenario in question (here, the standard and rerouting scenarios). The chosen reward shapes the policy that the neural network learns during training and, through it, the resulting traffic control strategy. While isolated optimization may suffice in simple environments, interconnected traffic systems require coordinated, adaptive strategies to achieve efficient and balanced flows. Two types of scenarios were considered:
  • Standard Scenarios: Rewards typically aim to minimize queue lengths, waiting times, or delays at a single intersection. With this reward, the traffic strategy developed during network training involves each agent considering only the intersection it controls, without contemplating neighboring intersections. This encourages each agent to optimize locally without necessarily considering downstream impacts. While this approach is effective in isolated or less congested intersections, it may lead to suboptimal results when intersections are interconnected. Static reward structures often fail to adapt to dynamic conditions, leading to peaks in occupancy.
  • Rerouting Scenarios: In scenarios where traffic flow from one intersection affects neighboring ones, the reward design needs to account for global traffic metrics. For example, this might involve penalizing congestion caused by vehicles heading toward already congested intersections or rewarding actions that reduce overall network pressure, even if local delays increase temporarily. Introducing a hierarchical reward structure, where global objectives (e.g., reducing network-wide congestion) take precedence over local ones, leads to a new policy in the training, resulting in a new strategy that fits this type of connected intersection scenario, improving overall performance. Alternatively, dynamically adjusting weights in the reward function based on the observed environment (e.g., during rush hours or peak flow) can help address evolving traffic patterns.
The necessity for inter-agent communication becomes evident in dense and complex networks. Without communication, the agents optimize independently, which can lead to localized bottlenecks or oscillatory behavior. Communication based on sharing lane occupancy or flow rates allows agents to make informed decisions, aligning local actions with network-wide goals. For example, an agent may delay activating a green phase for incoming traffic if the downstream intersection is nearing capacity, or it can also prioritize outgoing traffic to alleviate pressure on upstream intersections. When vehicles travel between intersections, the system must consider how decisions at one intersection propagate throughout the network. This requires the following:
  • Real-time data exchange to synchronize decisions.
  • Adaptive strategies that consider not only local queue lengths but also expected arrivals from neighboring intersections.
When designing the reward function (Equation (2), Figure 6), agents controlling intersections in proximity must adopt strategies to redistribute traffic dynamically. Localized reward components ensure that intersections minimize immediate queues and waiting times.
If an intersection’s queue length exceeds a threshold, neighboring agents might adjust their phase timings to divert or delay incoming traffic. Coordination can also enable smoother transitions for vehicles heading to subsequent intersections, preventing cascading congestion. Globalized reward components encourage behavior that aligns with network-wide goals, such as reducing total system delay or balancing traffic flow across intersections.
Three scenarios were analyzed, each addressing different traffic conditions. In the standard scenario, most vehicles go straight; in the symmetrical rerouting scenario, traffic is redistributed via turns; and in the asymmetrical rerouting scenario, one direction is given precedence. We began with a typical arterial scenario, the “standard” scenario, where 75% of vehicles travel straight and 25% turn. However, when traffic demand exceeds capacity, the system activates the “symmetrical” rerouting scenario, where 75% of vehicles are redirected to turn, helping balance the traffic load. An “asymmetrical” rerouting scenario was also explored, where one direction is prioritized over the other, allowing for greater flow in one direction and reducing traffic in the opposite direction. Rerouting is applied exclusively to the prioritized direction.
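The turning shares of the three scenarios can be expressed as a small demand-generation helper. The sketch below is illustrative only: the 25% and 75% shares and the demand-exceeds-capacity trigger come from the text, whereas the function names, the random sampling, and the share used for the prioritized direction in the asymmetrical case are assumptions.

```python
import random

# Share of vehicles that turn in each scenario (values from the text).
TURN_SHARE = {"standard": 0.25, "symmetrical": 0.75}


def select_scenario(demand, capacity):
    """Switch to symmetrical rerouting once demand exceeds capacity."""
    return "symmetrical" if demand > capacity else "standard"


def turn_share(scenario, direction=None, prioritized=None):
    """Return the fraction of vehicles that turn for a given direction."""
    if scenario == "asymmetrical":
        # Assumption: only the prioritized direction is rerouted, using the
        # symmetrical share; all other directions keep the standard share.
        if direction == prioritized:
            return TURN_SHARE["symmetrical"]
        return TURN_SHARE["standard"]
    return TURN_SHARE[scenario]


def assign_movements(n_vehicles, share, seed=0):
    """Draw a movement ("turn" or "straight") for each generated vehicle."""
    rng = random.Random(seed)
    return ["turn" if rng.random() < share else "straight"
            for _ in range(n_vehicles)]


scenario = select_scenario(demand=120, capacity=100)
movements = assign_movements(10000, turn_share(scenario))
```

Keeping the shares in one table makes it easy to generate comparable demand for each scenario when training and testing the agents.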
To further optimize traffic in rerouting scenarios, upstream anti-bottlenecking and smart rerouting techniques are employed, adjusting intersection control in real time based on congestion levels and dynamically assigning priority to alternative routes. To integrate the three traffic scenarios (“standard”, “symmetrical rerouting”, and “asymmetrical rerouting”) into the reward equation (Equation (2)), the reward was adjusted dynamically to account for the specific goals and constraints of each scenario. This was achieved by introducing scenario-specific weights, ω, into the reward function. These weights adjust the priority of the reward according to the current traffic scenario, ensuring that the policy aligns with the objectives of each traffic condition. Below is a modified version of the reward equation:
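$$r_t = \omega\, p_{\mathrm{veh}} \left( \mathrm{atwt}_{\mathrm{veh},\,t-1} - \mathrm{atwt}_{\mathrm{veh},\,t} \right) + \omega\, p_{\mathrm{ped}} \left( \mathrm{atwt}_{\mathrm{ped},\,t-1} - \mathrm{atwt}_{\mathrm{ped},\,t} \right)$$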
To accommodate varying traffic conditions, the reward function incorporates scenario-specific weights (ω) and parameters ($p_{\mathrm{veh}}$ and $p_{\mathrm{ped}}$). The weights are dynamically adjusted based on the current traffic scenario:
  • Standard Scenario: ω = 1 serves as the baseline, ensuring a balanced prioritization of vehicles ($p_{\mathrm{veh}}$) and pedestrians ($p_{\mathrm{ped}}$).
  • Symmetrical Rerouting: ω > 1 increases responsiveness to congestion, with $p_{\mathrm{veh}}$ adjusted to prioritize turning movements, reducing oversaturation of straight-through lanes.
  • Asymmetrical Rerouting: ω varies across directions to reflect priorities, e.g., ω = 1.5 for the prioritized direction and ω = 0.8 for the less-prioritized direction. In this scenario, $p_{\mathrm{veh}}$ is weighted more heavily in the prioritized direction.
These weights and parameters were selected through a combination of empirical testing and domain-specific traffic management knowledge, building on our previous work [,], to ensure the system’s adaptability to real-world traffic challenges.
The inclusion of ω allows the reward function to adapt to the needs of the traffic scenario dynamically. By varying $p_{\mathrm{veh}}$ and $p_{\mathrm{ped}}$, the system can optimize for different priorities, such as faster vehicle movement in high-demand scenarios or equitable pedestrian crossings in normal conditions. This approach ensures that the reinforcement learning agent learns policies tailored to the specific challenges and goals of each traffic scenario, improving overall system performance.
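A scenario-weighted reward of this kind can be sketched in a few lines. The example assumes that the reward is the weighted decrease of the vehicle and pedestrian atwt terms between two consecutive steps, scaled by ω. The ω values for the standard and asymmetrical cases are those quoted above; the symmetrical value of 1.2 and all $p_{\mathrm{veh}}$ and $p_{\mathrm{ped}}$ values are hypothetical placeholders, as are the class and function names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardParams:
    omega: float  # scenario-specific weight
    p_veh: float  # vehicle priority parameter
    p_ped: float  # pedestrian priority parameter


# omega = 1, 1.5, and 0.8 are quoted in the text; every other value is a placeholder.
SCENARIOS = {
    "standard": RewardParams(omega=1.0, p_veh=1.0, p_ped=1.0),
    "symmetrical": RewardParams(omega=1.2, p_veh=1.0, p_ped=1.0),
    "asymmetrical_prioritized": RewardParams(omega=1.5, p_veh=1.0, p_ped=1.0),
    "asymmetrical_other": RewardParams(omega=0.8, p_veh=1.0, p_ped=1.0),
}


def reward(params, atwt_veh_prev, atwt_veh_now, atwt_ped_prev, atwt_ped_now):
    """Positive when the waiting-time terms decrease from step t-1 to step t."""
    veh_term = params.p_veh * (atwt_veh_prev - atwt_veh_now)
    ped_term = params.p_ped * (atwt_ped_prev - atwt_ped_now)
    return params.omega * (veh_term + ped_term)


# Vehicle term drops from 100 to 80 while the pedestrian term rises from 30 to 35.
r_standard = reward(SCENARIOS["standard"], 100.0, 80.0, 30.0, 35.0)
r_priority = reward(SCENARIOS["asymmetrical_prioritized"], 100.0, 80.0, 30.0, 35.0)
```

With identical waiting-time changes, the prioritized direction receives a reward 1.5 times larger than in the standard scenario, which is how the weight steers the learned policy toward clearing that direction first.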

3.3. Coordination Strategy

The proposed system uses a distributed DRL approach where local agents manage individual intersections while the global agent facilitates indirect communication and coordination. This ensures synchronization of traffic signals across intersections, addressing scalability and congestion issues.
Different DRL methods can be used depending on the traffic scenario at hand. For multiple intersections, existing methods fall into two categories: centralized control and decentralized control. In centralized control, a global agent is trained to control the traffic signals of the entire network. Each local agent observes one intersection and saves its experience, which is used to train the global agent; the global agent then controls the environment and indicates which action should be taken at each intersection. Decentralized control formulates multi-intersection signal control as a multi-agent system, in which each agent is trained to control a single intersection and observes only part of the traffic environment.
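In the centralized variant, the experiences saved by the local agents end up in one pool from which the global model is trained. A minimal sketch of such a shared memory is given below; the class name, the tuple layout, and the capacity are illustrative assumptions rather than details of the study's implementation.

```python
import random
from collections import deque


class SharedReplayMemory:
    """Experience pool filled by all local agents and sampled to train one global model."""

    def __init__(self, capacity):
        # A bounded deque drops the oldest experiences once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def add(self, junction, state, action, reward, next_state):
        """Store one transition observed at the given junction."""
        self.buffer.append((junction, state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        """Draw a random training batch across all junctions."""
        batch_size = min(batch_size, len(self.buffer))
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


memory = SharedReplayMemory(capacity=1000)
for junction in ("C0", "C1", "C2"):
    memory.add(junction, state=[0, 1], action=3, reward=-2.0, next_state=[1, 1])
batch = memory.sample(2, random.Random(0))
```

Because the junctions are homogeneous, transitions from C0, C1, and C2 can be mixed in the same batch without any junction-specific preprocessing.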
A global agent coordinating local agents offers significant advantages for achieving network-wide optimization, especially in complex urban scenarios. Some advantages include the following:
  • Inter-Agent Communication and Coordination: The global agent has access to aggregated data from all local agents. This allows it to identify bottlenecks, optimize traffic signal phase timings for high-priority routes, and balance traffic loads. Local agents might have conflicting goals (e.g., prioritizing their intersection’s flow versus preventing downstream congestion). The global agent mediates these conflicts by aligning local actions with global objectives.
  • Dynamic and Context-Aware Strategies: By observing traffic density and flow patterns, the global agent adjusts signal phases across intersections, creating synchronized green waves that improve throughput and reduce stop-and-go behavior. During events like accidents, road closures, or high pedestrian density (e.g., near stadiums or schools), the global agent coordinates local responses to adapt quickly to changing conditions.
  • Global Reward Integration: While local agents optimize using their localized reward functions, the global agent enforces a composite reward that balances local and network-wide priorities by penalizing actions that cause downstream congestion or pedestrian risks and rewarding synchronized actions that optimize overall traffic flow.
  • Incident Management: The global agent detects anomalies, such as sudden congestion or accidents, and coordinates local agents to minimize the impact.
Without a global agent, each local agent optimizes only its own intersection. An upstream intersection may then release vehicles without considering downstream congestion, causing queue spillovers. With a global agent monitoring the entire corridor, the upstream intersection can be instructed to hold traffic temporarily so that downstream intersections can clear. This prevents cascading congestion and ensures smoother flow.
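The hold-upstream behaviour described above can be illustrated with a minimal sketch. It assumes, hypothetically, that each local agent reports its proposed phase together with the queue on the link that phase would feed, and that the global agent substitutes a fallback phase when that link is saturated. The names `LocalReport`, `validate`, and `LINK_THRESHOLD` and the phase labels are illustrative, not taken from the implemented system.

```python
from dataclasses import dataclass

# Hypothetical saturation threshold (vehicles) for a link between junctions.
LINK_THRESHOLD = 25


@dataclass
class LocalReport:
    """Illustrative message sent by a local agent to the global agent."""
    junction: str
    proposed_phase: str         # e.g. "W>E", "E>W", "N>S", "PED"
    downstream_link_queue: int  # vehicles on the link this phase feeds


def validate(report: LocalReport, fallback_phase: str = "N>S") -> str:
    """Return the phase the global agent allows the junction to activate.

    A corridor phase is held back (replaced by the fallback phase) when
    the downstream link already holds more vehicles than the threshold.
    """
    feeds_corridor = report.proposed_phase in ("W>E", "E>W")
    if feeds_corridor and report.downstream_link_queue > LINK_THRESHOLD:
        return fallback_phase
    return report.proposed_phase
```

With this rule, a junction proposing the W>E phase while 30 vehicles occupy the downstream link would be switched to the fallback phase, whereas the same proposal with 10 vehicles downstream would be accepted unchanged.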

3.4. Simulation and Validation

The methodology is validated through simulations using the SUMO urban mobility simulator. The system’s performance is assessed under various traffic conditions and intersection types, with metrics including vehicle delays, pedestrian waiting times, and safety incidents.
Traffic demand data were synthesized based on expected vehicle movement patterns across different times of the day, simulating realistic urban scenarios. We incorporated historical traffic data, where available, to ensure the representation of typical flow distributions between origin and destination points.
By integrating VLC, DRL, and connected vehicle technologies, this approach provides a scalable, adaptive, and efficient solution for modern urban traffic challenges.
For the simulation environment, a scenario of three connected four-arm intersections with two lanes in each direction is considered (Figure 4). Each intersection is controlled by an agent that maps the environment around it into cells, acquiring information about the vehicles and pedestrians travelling through the intersection (Figure 8). The state of each of the three intersections is divided into 3 layers totalling 164 cells. The first layer is made up of 80 cells, 10 for each lane routing vehicles to the junction, and indicates vehicle presence: if a vehicle is inside a cell, the cell is filled with ‘1’; otherwise, it is filled with ‘0’. The second layer, made up of the same number of cells, holds the normalized speed of the cars in each cell, if any are present. The third layer, made up of just 4 cells, represents the waiting zones and holds the number of pedestrians standing still while waiting for their phase to become active. This state representation helps the agent map the environment around the intersection and is very similar to the states observed via VLC, illustrated in Figure 9. In this case, the vehicles are identified over time by the lane they are in and by the traffic light they are communicating with. Pedestrians, on the other hand, are identified over time by the waiting zone they are in, as well as the traffic light they are communicating with.
Figure 9. A schematic of the state representation for each junction.
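As a rough illustration of the three-layer state described above, the following sketch flattens presence, normalized speed, and waiting-zone counts into a single 164-element vector (80 + 80 + 4). The function name, the input format, and the normalization speed of 13.9 m/s (about 50 km/h) are assumptions made for the example, not values reported in this work.

```python
CELLS_PER_LANE = 10
INCOMING_LANES = 8   # 80 presence cells / 10 cells per lane
WAITING_ZONES = 4


def encode_state(vehicles, pedestrians_waiting, max_speed=13.9):
    """Build the flat 164-element state of one intersection.

    vehicles: iterable of (lane_index, cell_index, speed_m_s) tuples.
    pedestrians_waiting: four counts, one per waiting zone.
    max_speed: assumed normalization speed in m/s (hypothetical value).
    """
    if len(pedestrians_waiting) != WAITING_ZONES:
        raise ValueError("expected one count per waiting zone")
    n_cells = CELLS_PER_LANE * INCOMING_LANES
    presence = [0.0] * n_cells   # layer 1: 1 if a vehicle is in the cell
    speed = [0.0] * n_cells      # layer 2: normalized speed in the cell
    for lane, cell, v in vehicles:
        idx = lane * CELLS_PER_LANE + cell
        presence[idx] = 1.0
        # If several vehicles share a cell, the last one seen sets the speed.
        speed[idx] = min(v / max_speed, 1.0)
    # Layer 3: raw pedestrian counts per waiting zone.
    return presence + speed + [float(c) for c in pedestrians_waiting]
```

The resulting vector has the same length as the network input layer, so it could be passed to the agent without further reshaping.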

4. Results and Discussion

4.1. Network Training

The number of vehicles and pedestrians, along with their respective speed values, were derived from previous traffic studies and experiments conducted in the city of Lisbon. Using these empirical data as a foundation, we simulated 2600 vehicles and 2000 pedestrians over the course of one hour to represent a typical peak-hour scenario in an urban environment. The learning rate was selected to balance convergence speed and stability, following guidelines from foundational work on reinforcement learning and optimization. Preliminary experiments were conducted to identify the optimal range, ensuring that the model avoided divergence or excessively slow training [,].
This approach ensures that the experimental conditions closely align with real-world traffic patterns while providing a controlled environment for our analysis. It also aligns our parameter tuning with practices in mixed traffic flow analyses and provides insights into dynamic parameter optimization in traffic signal control [,]. To compare the scenarios, three neural networks were trained with a 50/50 reward function, these being the weights for vehicles and pedestrians, respectively. The environment simulated 2600 vehicles and 2000 pedestrians over 300 episodes, each lasting 3600 s. The batch size was tuned to balance computational efficiency and the stability of gradient updates. Smaller batch sizes provided better generalization for our traffic scenarios, while larger sizes caused slower response to changing dynamics.
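To make the 50/50 weighting concrete, the sketch below shows one simple way such a weighted negative reward could be computed. It is only an illustration: it assumes the penalized quantities are the numbers of halted vehicles and halted pedestrians at each step, which is a simplification and not the reward function defined for the system. The function names are hypothetical.

```python
def step_reward(halted_vehicles, halted_pedestrians, w_veh=0.5, w_ped=0.5):
    """Weighted negative reward for one simulation step (illustrative).

    w_veh and w_ped are the vehicle and pedestrian weights; 0.5/0.5
    corresponds to the 50/50 setting used for training.
    """
    if abs(w_veh + w_ped - 1.0) > 1e-9:
        raise ValueError("weights are expected to sum to 1")
    return -(w_veh * halted_vehicles + w_ped * halted_pedestrians)


def cumulative_negative_reward(steps, w_veh=0.5, w_ped=0.5):
    """Sum the step rewards over an episode.

    steps: iterable of (halted_vehicles, halted_pedestrians) pairs.
    """
    return sum(step_reward(v, p, w_veh, w_ped) for v, p in steps)
```

Under this assumption, an episode with shorter queues and fewer waiting pedestrians yields a cumulative reward closer to zero, which is the sense in which rewards become less negative as training improves.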
Training parameters are given in Table 2.
Table 2. Training parameters.
The three neural networks utilized in the comparison are based on a Feed Forward Neural Network (FFNN) architecture, structured as follows:
  • Input Layer: 164 neurons, representing the current state of the environment.
  • Hidden Layers: Five fully connected hidden layers, each utilizing the ReLU activation function for non-linearity.
  • Output Layer: 9 neurons, each corresponding to a discrete action that the agent can take in the environment.
This choice of architecture was made to balance computational efficiency with sufficient model complexity to capture the dynamics of the traffic environment.
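The layer structure listed above can be sketched as a plain forward pass. The hidden-layer width (`HIDDEN_WIDTH = 32`) and the weight initialization are assumptions chosen for the example, since only the number of layers, the activation, and the input and output sizes are stated here; a real implementation would normally use a deep learning framework rather than pure Python lists.

```python
import random

STATE_SIZE = 164    # input neurons (state cells)
N_ACTIONS = 9       # output neurons (one per phase)
N_HIDDEN = 5        # fully connected hidden layers
HIDDEN_WIDTH = 32   # assumed width, not taken from the training parameters


def init_network(seed=0):
    """Create (weights, biases) pairs for every layer of the network."""
    rng = random.Random(seed)
    sizes = [STATE_SIZE] + [HIDDEN_WIDTH] * N_HIDDEN + [N_ACTIONS]
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        scale = (2.0 / n_in) ** 0.5  # He-style scaling, suited to ReLU
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(layers, state):
    """Return one score per action for the given 164-element state."""
    x = list(state)
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * xi for w, xi in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:      # ReLU on hidden layers only
            x = [max(0.0, v) for v in x]
    return x
```

The agent would then pick the action with the highest output score, subject to any constraints imposed by the coordination logic.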
To analyze the results obtained, Figure 10a presents the cumulative negative reward graph, where significant differences can be observed among the curves for the three training scenarios. In the arterial scenario, the system faces a higher volume of bidirectional traffic flow, resulting in a lower reward compared to the other scenarios. This is attributed to the increased traffic volume creating substantial pressure at each junction, leading to longer waiting times and larger queues.
Figure 10. Network training for both scenarios. (a) Cumulative negative reward. (b) Average queue size.
In contrast, the symmetrical rerouting scenario shows an improvement in reward. By implementing micro-control at each junction, the system dynamically suggests better routes for vehicles, reducing queue sizes and alleviating pressure at critical points. This optimization leads to a higher cumulative reward as congestion is mitigated.
In the asymmetrical rerouting scenario, where traffic flow is adjusted to prioritize vehicles moving from west to east, the system adapts by prioritizing phases that facilitate movement in this direction. Combined with rerouting at critical junctions, this approach significantly reduces pressure across all intersections, leading to a notable increase in rewards compared to the other scenarios.
Overall, the results demonstrate that the networks were effectively trained, as rewards became progressively less negative over time. The symmetric rerouting and asymmetric rerouting scenarios showed faster convergence towards optimal strategies, greater stability, and robust performance, as evidenced by their more consistent reward distributions compared to the standard arterial scenario.
Figure 10b illustrates the average queue size during training for all three scenarios, aligning with the reward analysis. In the arterial scenario, the bidirectional traffic flow exerts higher pressure on the environment, resulting in larger queues. In the rerouting scenario, micro-control at the junctions ensures better traffic phase activation, accounting for lane capacity limits. This prevents excessive pressure, even when the flow is shifted to a 25/75 distribution.
Finally, the asymmetric rerouting scenario exhibits the lowest average queue sizes. By significantly reducing traffic volume in the east-to-west direction—one of the most critical traffic flows—the system minimizes congestion more effectively than in the other scenarios. This leads to better queue management and overall improved traffic performance.

4.2. Network Testing

To evaluate the trained networks, simulations were conducted for 3600 s involving 2600 vehicles and 2000 pedestrians. The number of cars and pedestrians in the simulation was derived from a detailed analysis in which traffic phase cycles were dynamically defined [,].
  • The environment scenario was optimized based on observed traffic patterns, and the number of vehicles and pedestrians was estimated under the condition of the longest traffic signal phase.
  • The intelligent system, powered by neural networks, was implemented and demonstrated that these levels of traffic flow could be efficiently managed.
The evaluation metrics validate the optimized performance of the proposed algorithm in terms of traffic safety and efficiency. Additional analyses were performed to assess the effects of optimization on halting vehicles and pedestrians in both standard and rerouting scenarios. The results confirmed that these traffic densities were feasible when optimal decisions were made by the neural-network-based intelligent system. This systematic approach allowed for realistic simulation conditions that closely mirror peak-hour scenarios.
Figure 11a–c depict the halting vehicles at junctions C0, C1, and C2, respectively.
Figure 11. Comparison of trends over time for vehicle halting sessions at intersections in standard versus rerouting symmetrical and asymmetrical scenarios: (a) Intersection C0, (b) Intersection C1, and (c) Intersection C2.
In the scenario with three horizontally arranged junctions, C1 is identified as the critical junction. Situated between C0 and C2, it receives traffic from both directions, subjecting it to significant pressure and increasing vehicle queues. By incorporating queue length thresholds, the system dynamically responds to congestion levels on key roads, such as those connecting C0 to C1 and C1 to C2.
For example, vehicles traveling from west to east at junction C0 are informed of road conditions between C0 and C1 upon entering the system. If the vehicle count exceeds a set threshold, such as 25 cars (despite the road’s 400 m capacity to hold more), vehicles are rerouted to turn right instead. This rerouting prevents additional congestion on the road between C0 and C1, as illustrated in Figure 11a for the rerouting and asymmetric rerouting scenarios. Vehicles that reroute exit the system sooner, alleviating pressure at subsequent junctions. This leaves space for new vehicles to enter and prevents scenarios where cars are stalled at green lights due to blocked lanes ahead.
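The threshold rule in this example can be written as a small decision function. The threshold of 25 vehicles comes from the text; the function name and the movement labels are hypothetical, and the sketch ignores how the vehicle count is actually obtained via VLC.

```python
REROUTE_THRESHOLD = 25  # vehicles allowed on the C0-C1 link in the example


def choose_movement(link_vehicle_count, threshold=REROUTE_THRESHOLD):
    """Decide the movement of a west-to-east vehicle entering at C0.

    The vehicle continues straight while the link ahead is at or below
    the threshold and is rerouted to a right turn once it is exceeded.
    """
    return "right" if link_vehicle_count > threshold else "straight"


def split_arrivals(link_counts):
    """Count straight and rerouted vehicles for a series of link counts,
    one count observed per arriving vehicle."""
    decisions = [choose_movement(c) for c in link_counts]
    return decisions.count("straight"), decisions.count("right")
```

Keeping the threshold below the physical capacity of the 400 m road, as the example does, leaves headroom so that vehicles released at a green light are not blocked by a full lane ahead.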
Additionally, the system manages congestion by prioritizing traffic phases that direct fewer cars into critical lanes. For instance, instead of activating the west-to-east phase, which would funnel more cars into C1’s critical lane, the system may activate a north-to-south phase or a pedestrian phase, allowing C1 to clear its critical lane. As shown in Figure 11b, during the peak period between 800 and 2200 s, the standard scenario exhibits significantly higher congestion at C1 compared to the rerouting scenarios. C1, being the pivotal junction between C0 and C2, connects two critical roads in the network. Strict micro-control is necessary, particularly due to the disparity in road lengths: the road between C0 and C1 is 400 m long, while the road between C1 and C2 is only 200 m. This difference in capacity requires careful traffic management to prevent overwhelming C2. For example, vehicles traveling from C2 to C1 and heading toward C0 move more easily due to transitioning from the shorter 200 m road to the longer 400 m road. However, even in this case, the system ensures the 25-vehicle limit is respected to maintain traffic flow.
In Figure 11c, the halting vehicles at C2 are shown. The rerouting scenarios consistently reduce the number of stopped vehicles compared to the standard scenario, aligning with earlier observations. While congestion management at C2 is generally effective, it requires greater precision due to the interplay of route changes and phase activations.
The asymmetric rerouting scenario focuses on managing the higher traffic flow in the west-to-east direction, deprioritizing the east-to-west flow where traffic volumes are lower. By reducing the flow from east to west, a critical region in the system, the system handles peak traffic volumes (between 500 and 2000 s) effectively. During these peak periods, the number of stopped vehicles across the three junctions averages around 50, demonstrating the system’s ability to maintain stability even under unbalanced flow conditions.
Compared to the high queue levels observed in the standard scenario at junctions C0, C1, and C2, the rerouting scenario reduces average traffic pressure by 66%, 50%, and 75%, respectively. These results highlight the significant efficiency gains provided by rerouting, making arterial roads more effective and improving overall traffic flow.
Careful management of the 200 m road connecting C2 to C1 is critical, as its limited capacity can exacerbate congestion at C1. By implementing dynamic rerouting and optimized phase activations, the system ensures balanced traffic distribution, minimizing pressure on all intersections.
Figure 12 presents the number of halted pedestrians at each intersection for both the standard and rerouting scenarios. A comparison of pedestrian traffic between these scenarios reveals that the rerouting scenario generally results in fewer pedestrians waiting in designated areas.
Figure 12. Comparison of trends over time for pedestrian halting sessions at intersections in standard versus rerouting symmetrical and asymmetrical scenarios: (a) intersection C0, (b) intersection C1, and (c) intersection C2.
The most significant differences between scenarios occur at intersections C0 and C2, where the rerouting scenario more effectively utilizes the pedestrian phase, enabling smoother pedestrian flow. At C1, the critical junction with the highest overall pedestrian volume, there are more halted pedestrians than at C0 and C2. However, even at C1, the rerouting scenario consistently shows fewer waiting pedestrians compared to the standard scenario.
While the rerouting scenario does not explicitly prioritize pedestrian flow, it indirectly benefits pedestrians by reducing vehicle queues. With fewer vehicles clogging the intersections, the system can more frequently activate pedestrian phases, facilitating safer and more efficient crossings.
The pedestrian phase serves a dual purpose: it not only manages pedestrian flow but also helps regulate vehicle traffic on critical roads connecting intersections. As such, pedestrians indirectly benefit from improved traffic flow in the rerouting scenario. However, the overall reduction in halted pedestrian numbers is modest. This is because activating a pedestrian phase does not guarantee that large numbers of pedestrians will cross, as the availability of pedestrians in waiting zones varies. Factors like smaller groups, fewer pedestrians arriving during active phases, or inconsistent arrival patterns influence this variability.
Peaks in pedestrian halting sessions are closely tied to crossing periods, with the size of these peaks reflecting congestion levels and pedestrian reactions to connected vehicle traffic. The rerouting scenario exhibits smaller peaks compared to the standard scenario, where peaks are more pronounced, indicating higher stress levels. Between 500 and 1000 s, the rerouting scenario reduces average pedestrian pressure by 25%, effectively alleviating congestion during peak traffic times when pedestrian and vehicle wait times overlap. This reduction in pressure significantly minimizes the risk of pedestrian run-overs, enhancing safety performance and underscoring the importance of rerouting in traffic management.
The asymmetric rerouting scenario shows slightly higher pedestrian waiting numbers at C0 and C2 compared to the rerouting scenario. This difference arises from the redistribution of vehicles: instead of reducing vehicle numbers from the east, they are rerouted to other entry points across the network. As a result, intersections like C0 and C2 experience a slight increase in pedestrian presence, as priority is given to managing the high flow of west-to-east vehicles.
At C1, the critical junction, pedestrians experience the longest wait times due to the increased vehicle flow originating from the west. Consequently, the number of pedestrians in waiting zones rises as a result of the higher traffic pressure at this intersection.
In conclusion, the rerouting scenario demonstrates superior performance in reducing pedestrian halts and enhancing safety, especially during peak periods. Despite some minor increases in pedestrian pressure under the asymmetric rerouting scenario, both rerouting approaches improve overall traffic flow and safety for all road users.

4.3. Global Agent Decisions

Figure 13 shows a comparison of trends over time for all of the active phases (agent actions) validated by the global agent at C0, C1, and C2 intersections in standard (left) versus rerouting symmetrical (middle) and asymmetrical (right) scenarios. At the top, the nine actions that the agent can take in the environment are displayed. These mini-figures highlight the differences in traffic management strategies across the scenarios under consideration. They represent how the neural network adapted its strategy during training, reflecting the learned traffic signal control patterns tailored to the observed traffic dynamics in each environment.
Figure 13. Comparison of trends over time for the active phases (agent actions) at C0, C1, and C2 intersections in standard (left) versus rerouting symmetrical (middle) and asymmetrical (right) scenarios. The nine possible phases are indicated at the top.
Results show that, as is characteristic of dynamic traffic control systems, the system departs from a fixed phase sequence and continuously adapts to real-time traffic conditions. Crucially, pedestrian phases are only triggered upon pedestrian request, optimizing phase usage by prioritizing vehicular movement unless a pedestrian need arises.
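One common way to enforce a rule such as "pedestrian phase only on request" is to mask the corresponding action before choosing the best-scoring phase. The sketch below illustrates that idea; whether the system uses masking or another mechanism is an assumption, and `select_phase` is a hypothetical helper. Phase 9 is the pedestrian phase, so its zero-based index is 8.

```python
PEDESTRIAN_PHASE = 8  # zero-based index of Phase 9 (pedestrians)


def select_phase(scores, pedestrian_request):
    """Return the index of the best-scoring phase that is allowed.

    scores: nine action scores, one per phase.
    pedestrian_request: True if a pedestrian has requested a crossing.
    """
    allowed = [a for a in range(len(scores))
               if a != PEDESTRIAN_PHASE or pedestrian_request]
    return max(allowed, key=lambda a: scores[a])
```

Because no fixed order is imposed on the remaining phases, the selected phase can change from step to step according to the current scores.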
When comparing the three scenarios, the allocated green times and phase sequences across intersections exhibit notable variations over time. During the first 1500 s, green times for Phase 1 (N > S) and Phase 5 (E > W) at intersections C0 and C1 in the standard scenario differ significantly from those at C2. In contrast, the rerouting scenarios show distinct adjustments: green times for Phase 1 at C0 and C1 nearly double, while green times for Phase 5 at C2 increase significantly in both rerouting configurations.
Furthermore, the symmetric and asymmetric rerouting scenarios prioritize Phase 9 (pedestrians) and Phase 8 (left turns) during the initial period. This emphasis reflects a strategic focus on smoother rerouting and effective congestion management by prioritizing pedestrian and turning movements to support traffic flow redistribution.
This adaptive system efficiently handles both pedestrian and vehicle flows under varying traffic conditions.
  • Standard Scenario: The system aims to maximize green band efficiency for vehicles while minimizing pedestrian rule violations. Green lights are coordinated to reduce vehicle stops, while pedestrian crossing times are scheduled to prevent excessive wait times.
  • Rerouting Scenarios: When arterial traffic demand exceeds system capacity due to incidents or severe congestion, the system activates rerouting mechanisms. Traffic light coordination is dynamically reconfigured to redistribute traffic flow. For example, in these scenarios, 25% of vehicles are directed along the more congested straight paths, while the remaining 75% are rerouted to turning movements. This adjustment alleviates congestion on main arteries and enhances overall flow efficiency.
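The 25/75 split mentioned for the rerouting scenarios can be reproduced when generating demand for a simulation. The sketch below assigns each vehicle a movement at random with the stated probabilities; the function name, the labels, and the use of a seeded random generator are assumptions for illustration and do not describe how the simulated routes were actually produced.

```python
import random


def assign_movements(n_vehicles, straight_share=0.25, seed=0):
    """Assign 'straight' or 'turn' to each vehicle.

    straight_share: probability of keeping the straight (arterial) path;
    the remaining vehicles are rerouted to a turning movement.
    """
    rng = random.Random(seed)
    return ["straight" if rng.random() < straight_share else "turn"
            for _ in range(n_vehicles)]
```

With 2600 vehicles, the realized straight share fluctuates only slightly around 25%, so the split behaves almost deterministically at the scale of one episode.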
In the symmetric rerouting scenario, equal priority is given to both traffic directions, ensuring balanced flow across the network. Green times and phase sequences adapt dynamically to accommodate redistributed traffic. In the asymmetric rerouting scenario, the system assigns priority based on traffic demand. For instance, when one direction experiences higher traffic volumes, green times for corresponding high-traffic lanes are extended. This asymmetry effectively reduces pressure on critical lanes while maintaining manageable pedestrian waiting zones.
Table 3 presents the overall percentages of green times for intersections C0, C1, and C2 over a training segment for both scenarios. These values are also visualized in Figure 14 for comparison.
Table 3. Global percentages of green times for C0, C1, and C2 for both standard and rerouting scenarios.
Figure 14. Comparison of green time trends across all active phases at intersections C0, C1, and C2. Active phases are indicated at the top for clarity. (a) Standard scenario. (b) Symmetric rerouting scenario. (c) Asymmetric rerouting scenario.
As expected, the system prioritizes critical phases (P1, P5, P6, and P9) with green bands that depend on the scenario, adapting its strategy to reduce wait times and improve traffic flow.
Figure 14 provides a comparison of green time trends across all active phases at intersections C0, C1, and C2 for the analyzed scenarios. Active phases are indicated at the top of the figure for clarity.
  • Standard Scenario: Green times are predominantly allocated to the arterial direction, particularly the west–east phase (P5). However, this allocation sharply decreases from 37% at C0 to 20% at C2, creating controlled bottlenecks along specific road sections or chains of sections. This imbalance leads to queue build-ups at junctions C0, C1, and C2, as shown in Figure 11a–c. While prioritizing the arterial direction, the irregular green time distribution contributes to congestion, especially at downstream intersections.
  • Symmetrical Rerouting Scenario: Green times are redistributed more evenly, with P5 green times decreasing gradually from 33% to 28%. This balanced allocation ensures smoother and more consistent traffic flow across the network, significantly reducing congestion and preventing bottlenecks observed in the standard scenario.
  • Asymmetrical Rerouting Scenario: Green time allocation is further adapted by increasing the duration of P6 to prioritize the west-to-east (W > E) direction, addressing the higher vehicle flow in this direction. To achieve this prioritization, the system reduces the activation of less critical north–south phases (P2, P3, and P4), ensuring optimal resource utilization for the prioritized W > E traffic.
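Green-time percentages of the kind compared above can be derived from a log of phase activations. The following sketch assumes a hypothetical log format of (phase label, duration in seconds) pairs for one intersection; it is not the tool used to produce the reported table.

```python
from collections import defaultdict


def green_time_shares(phase_log):
    """Return the percentage of total green time per phase.

    phase_log: iterable of (phase_label, duration_s) pairs recorded at
    one intersection, in any order.
    """
    totals = defaultdict(float)
    for phase, duration in phase_log:
        totals[phase] += duration
    total = sum(totals.values())
    if total == 0:
        return {}
    return {phase: 100.0 * t / total for phase, t in totals.items()}
```

Computing the shares per intersection makes trends such as a falling P5 share from C0 to C2 directly comparable across scenarios.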
These results demonstrate how the dynamic adjustment of ω, pveh, pped, and phase-specific green times enables the system to adapt effectively to varying traffic scenarios, improving overall traffic performance. These examples highlight the system’s ability to manage diverse challenges and achieve scenario-specific goals.
Pedestrian phases also exhibit notable differences between the scenarios. In the standard scenario, the pedestrian phase (P9) at C1 nearly triples, significantly enhancing safe crossing opportunities. This adjustment helps reduce queues at all intersections, as vehicles in the standard scenario are rerouted to exit the environment via right turns, alleviating pressure downstream. In the asymmetric rerouting scenario, pedestrian phases (P9) maintain similar durations at C0 and C2 but decrease at C1. This reduction is necessary to allocate more green time to vehicle phases supporting the W > E direction, ensuring stable vehicle flows without causing excessive pressure on critical junctions.
The adaptive nature of the system allows it to effectively manage both vehicle and pedestrian traffic. For vehicles, it minimizes critical queues by dynamically redistributing green times, thereby reducing congestion and relieving pressure at intersections. For pedestrians, consistently activated crossing phases ensure safe and efficient movement through intersections, except when temporary reductions are required to manage critical vehicle flows. This dynamic flexibility extends to traffic phase activation, which occurs without a fixed sequence, allowing the system to respond in real time to evolving traffic patterns at each intersection.
The system demonstrates its adaptability by accommodating 2600 vehicles and 2000 pedestrians under diverse traffic conditions. The standard scenario, while prioritizing arterial traffic, results in bottlenecks and longer vehicle queues but compensates by enhancing pedestrian safety through extended crossing times (P9). The symmetric rerouting scenario achieves a more balanced green time allocation, reducing congestion and distributing traffic loads more evenly. Finally, the asymmetric rerouting scenario prioritizes W > E traffic through increased green times for P6, effectively reducing congestion in critical directions while temporarily reducing pedestrian crossing times at C1 to maintain overall flow stability.
Overall, the system dynamically adapts to real-time conditions, ensuring efficient traffic flow and safety for all road users. Each scenario demonstrates specific trade-offs and benefits, highlighting the importance of flexible, adaptive traffic management strategies in complex urban environments.

4.4. Impact of VLC and CV Integration on Traffic Rerouting

The integration of Visible Light Communication (VLC) and connected vehicle (CV) systems within traffic management demonstrates significant potential to improve traffic flow and safety, particularly through rerouting mechanisms. In the rerouting scenarios, these systems show notable benefits in reducing congestion and enhancing road safety, underscoring their value in advanced traffic management strategies.
The rerouting scenarios dynamically adjust traffic light phases to reduce congestion while maintaining pedestrian safety. Adaptive prioritization of green times keeps pedestrian waiting zones within manageable limits, optimizing both vehicle throughput and pedestrian flow. Signal adjustments are further tailored to localized traffic demands, providing a more targeted approach to congestion management.
To evaluate the impact of the reward function across scenarios, we have conducted experiments in the past [,] with varying traffic densities for one and two arterial intersections. Results have shown that in high-traffic conditions, prioritizing vehicles (ω > 1, high pveh) improved vehicle throughput but slightly increased pedestrian wait times. In low-traffic conditions, using balanced weights (ω = 1, equal pveh and pped) optimized the overall system performance.
By integrating VLC and CV technologies with rerouting systems, traffic management becomes more responsive, more efficient, and safer for all road users. These technologies enable real-time communication between intersections, allowing traffic signals to synchronize and adjust dynamically based on live data. This reduces stop-and-go driving, minimizes congestion, and facilitates smoother traffic flow across the network.
Real-time data sharing between intersections allows rerouting systems to anticipate vehicle queues and proactively adjust signal phases to minimize waiting times. This capability benefits vehicles and pedestrians by reducing delays and ensuring safer, more timely pedestrian crossings while avoiding conflicts with vehicular traffic.
Rerouting plays a crucial role in congestion management by redirecting vehicles away from heavily congested areas and preventing critical intersections from becoming overwhelmed. Additionally, these systems can address unexpected issues, such as malfunctioning signals or sudden congestion spikes, preventing minor problems from escalating into major traffic disruptions.
During blockages or heavy congestion, rerouting enhances network efficiency by dynamically assigning priority to alternative routes, redistributing traffic loads, and reducing bottlenecks at key intersections. This adaptive control ensures real-time adjustments to traffic flows, optimizing both arterial and local road conditions.

Author Contributions

Conceptualization, M.V.(Manuela Vieira); Formal analysis, M.A.V. and M.V. (Manuela Vieira); Investigation, M.V. (Mário Vestias), P.L., and P.V.; Methodology, M.V. (Manuela Vieira), G.G., and M.A.V.; Software, G.G.; Validation, G.G., M.A.V., M.V. (Mário Vestias), P.L., and P.V.; Writing—original draft, G.G.; Writing—review and editing, M.V. (Manuela Vieira). All authors have read and agreed to the published version of the manuscript.

Funding

This research received support from FCT—Fundação para a Ciência e a Tecnologia, through the Research Unit CTS—Center of Technology and Systems, with references UIDB/00066/2020 and IPL/IDICA/2024/INUTRAM_ISEL.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Acknowledgments

The authors acknowledge CTS-ISEL and IPL.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

  1. Siegel, J.E.; Erb, D.C.; Sarma, S.E. A survey of the connected vehicle landscape—Architectures, enabling technologies, applications, and development areas. IEEE Trans. Intell. Transp. Syst. 2018, 19, 2391–2406. [Google Scholar] [CrossRef]
  2. O’Brien, D.; Le Minh, H.; Zeng, L.; Faulkner, G.; Lee, K.; Jung, D.; Oh, Y.; Won, E.T. Indoor visible light communications: Challenges and prospects. Proc. SPIE—Int. Soc. Opt. Eng. 2008, 7091, 60–68. [Google Scholar]
  3. Parth, H.; Pathak, X.; Pengfei, H.; Prasant, M. Visible Light Communication, Networking and Sensing: Potential and Challenges. IEEE Commun. Surv. Tutor. 2015, 17, 2047–2077. [Google Scholar]
  4. Caputo, S.; Mucchi, L.; Cataliotti, F.; Seminara, M.; Nawaz, T.; Catani, J. Measurement-based VLC channel characterization for I2V communications in a real urban scenario. Veh. Commun. 2021, 28, 100305. [Google Scholar] [CrossRef]
  5. Vieira, M.A.; Vieira, M.; Vieira, P.; Louro, P. Optical signal processing for a smart vehicle lighting system using a-SiCH technology. In Proceedings of the SPIE Optics+Electronics, Prague, Czech Republic, 24–27 April 2017; Volume 10231. [Google Scholar]
  6. Ge, H.; Song, Y.; Wu, C.; Ren, J.; Tan, G. Cooperative Deep Q-Learning With Q-Value Transfer for Multi-Intersection Signal Control. IEEE Access 2019, 7, 40797–40809. [Google Scholar] [CrossRef]
  7. Vidali, A.; Crociani, L.; Vizzari, G.; Bandini, S. A Deep Reinforcement Learning Approach to Adaptive Traffic Lights Management. In Proceedings of the 20th Workshop “From Objects to Agents”, WOA 2019, Parma, Italy, 26–28 June 2019; pp. 42–50. [Google Scholar]
  8. Oroojlooy, A.; Hajinezhad, D. A review of cooperative multi-agent deep reinforcement learning. Appl. Intell. 2023, 53, 13677–13722. [Google Scholar] [CrossRef]
  9. Girao, P.S.; Alegria, F.; Viegas, J.M.; Lu, B.; Vieira, J. Wireless System for Traffic Control and Law Enforcement. In Proceedings of the 2006 IEEE International Conference on Industrial Technology, Mumbai, India, 15–17 December 2006; pp. 1768–1770. [Google Scholar] [CrossRef]
  10. Khanna, A.; Goyal, R.; Verma, M.; Joshi, D. Intelligent traffic management system for smart cities: First International Conference, FTNCT 2018, Solan, India, 9–10 February 2018, Revised Selected Papers. In Futuristic Trends in Network and Communication Technologies; Springer: Singapore, 2018; pp. 152–164. [Google Scholar]
  11. Zambrano-Martinez, J.L.; Calafate, C.T.; Soler, D.; Lemus-Zúñiga, L.-G.; Cano, J.-C.; Manzoni, P.; Gayraud, T. A Centralized Route-Management Solution for Autonomous Vehicles in Urban Areas. Electronics 2019, 8, 722. [Google Scholar] [CrossRef]
  12. Oskarbski, J.; Guminska, L.; Miszewski, M.; Oskarbska, I. Analysis of Signalized Intersections in the Context of Pedestrian Traffic. Transp. Res. Procedia 2016, 14, 2138–2147. [Google Scholar] [CrossRef]
  13. Pribyl, O.; Pribyl, P.; Lom, M.; Svitek, M. Modeling of Smart Cities Based on ITS Architecture. IEEE Intell. Transp. Syst. Mag. 2018, 11, 28–36. [Google Scholar] [CrossRef]
  14. Miucic, R. Connected Vehicles: Intelligent Transportation Systems; Springer: Cham, Switzerland, 2019. [Google Scholar]
  15. Kodi, J.H. Evaluating the Mobility and Safety Benefits of Adaptive Signal Control Technology (ASCT). Master’s Thesis, University of North Florida, Jacksonville, FL, USA, 2019. Available online: https://digitalcommons.unf.edu/etd/930 (accessed on 7 April 2024).
  16. Bilal, J.M.; Jacob, D. Intelligent Traffic Control System. In Proceedings of the 2007 IEEE International Conference on Signal Processing and Communications, Dubai, United Arab Emirates, 24–27 November 2007; pp. 496–499. [Google Scholar] [CrossRef]
  17. Yousefi, S.; Altman, E.; El-Azouzi, R.; Fathy, M. Analytical Model for Connectivity in Vehicular Ad Hoc Networks. IEEE Trans. Veh. Technol. 2008, 57, 3341–3356. [Google Scholar] [CrossRef]
  18. Zhang, J.; Wang, F.-Y.; Wang, K.; Lin, W.-H.; Xu, X.; Chen, C. Data-Driven Intelligent Transportation Systems: A Survey. IEEE Trans. Intell. Transp. Syst. 2011, 12, 1624–1639. [Google Scholar] [CrossRef]
  19. Shen, W.; Tsai, H. Testing vehicle-to-vehicle visible light communications in real-world driving scenarios. In Proceedings of the 2017 IEEE Vehicular Networking Conference (VNC), Torino, Italy, 27–29 November 2017; pp. 187–194. [Google Scholar]
  20. Liang, X.; Du, X.; Wang, G.; Han, Z. A Deep Reinforcement Learning Network for Traffic Light Cycle Control. IEEE Trans. Veh. Technol. 2019, 68, 1243–1253. [Google Scholar] [CrossRef]
  21. Lopez, P.A.; Behrisch, M.; Bieker-Walz, L.; Erdmann, J.; Flötteröd, Y.P.; Hilbrich, R.; Lücken, L.; Rummel, J.; Wagner, P.; Wiessner, E. Microscopic Traffic Simulation using SUMO. In Proceedings of the 21st IEEE International Conference on Intelligent Transportation Systems, Maui, HI, USA, 4–7 November 2018; pp. 2575–2582. [Google Scholar]
  22. Vieira, M.; Vieira, M.A.; Galvão, G.; Louro, P.; Véstias, M.; Vieira, P. Enhancing Urban Intersection Efficiency: Utilizing Visible Light Communication and Learning-Driven Control for Improved Traffic Signal Performance. Vehicles 2024, 6, 666–692. [Google Scholar] [CrossRef]
  23. Yousefpour, A.; Fung, C.; Nguyen, T.; Kadiyala, K.; Jalali, F.; Niakanlahiji, A.; Kong, J.; Jue, J.P. All one needs to know about fog computing and related edge computing paradigms: A complete survey. J. Syst. Archit. 2019, 98, 289–330. [Google Scholar] [CrossRef]
  24. Vieira, M.A.; Vieira, P.; Fernandes, R.; Louro, P. Dynamic Vehicular Visible Light Communication for Traffic Management. In Next-Generation Optical Communication: Components, Sub-Systems, and Systems XII; Li, G., Nakajima, K., Srivastava, A.K., Eds.; SPIE: Bellingham, WA, USA, 2023; p. 124290O. [Google Scholar] [CrossRef]
  25. Fernandes, R.; Vieira, M.A.; Vieira, P.; Louro, P.A.; Véstias, M. Using visible light communication to implement intelligent traffic signals and cooperative trajectories at urban intersections. In Light-Emitting Devices, Materials, and Applications XXVII; Kim, J.K., Krames, M.R., Strassburg, M., Eds.; SPIE: Bellingham, WA, USA, 2023; p. 124410G. [Google Scholar] [CrossRef]
  26. Vieira, M.A.; Vieira, M.; Louro, P.; Vieira, P. Cooperative vehicular communication systems based on visible light communication. Opt. Eng. 2018, 57, 076101. [Google Scholar] [CrossRef]
  27. Tang, F.; Kawamoto, Y.; Kato, N.; Liu, J. Future Intelligent and Secure Vehicular Network Toward 6G: Machine-Learning Approaches. Proc. IEEE 2019, 108, 292–307. [Google Scholar] [CrossRef]
  28. Luong, N.C.; Hoang, D.T.; Gong, S.; Niyato, D.; Wang, P.; Liang, Y.-C.; Kim, D.I. Applications of Deep Reinforcement Learning in Communications and Networking: A Survey. IEEE Commun. Surv. Tutor. 2019, 21, 3133–3174. [Google Scholar] [CrossRef]
  29. Ye, H.; Li, G.Y.; Juang, B.-H.F. Deep Reinforcement Learning Based Resource Allocation for V2V Communications. IEEE Trans. Veh. Technol. 2019, 68, 3163–3173. [Google Scholar] [CrossRef]
  30. Wu, S.; Sun, D.J.; Qiu, G. Emission analysis based on mixed traffic flow and license plate recognition model. Transp. Res. Part D Transp. Environ. 2024, 134, 104331. [Google Scholar] [CrossRef]
  31. Chen, S.; Sun, D.J. An Improved Adaptive Signal Control Method for Isolated Signalized Intersection Based on Dynamic Programming. IEEE Intell. Transp. Syst. Mag. 2016, 8, 4–14. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
