1. Introduction
Urban intersections represent one of the most challenging and accident-prone environments for Autonomous Driving Systems (ADS) due to the dynamic and often unpredictable interactions between traffic signals, vehicles, and vulnerable road users [
1]. Ensuring the safety of ADS in these complex settings requires not only robust onboard perception and decision-making but also an enhanced awareness of the surrounding environment.
In this context, connected perception enabled by Vehicle-to-Everything (V2X) communications plays a key role. In this work, connected perception is realized through Infrastructure-to-Vehicle (I2V) communication, where intersection infrastructure provides traffic light information and perception information to the vehicle. This approach extends the ADS’s situational awareness beyond onboard sensor limitations by delivering real-time information on traffic light states and other vehicles or objects on the road.
However, validating the safety of such connected systems presents a significant challenge. Traditional validation methods, such as large-scale field operational tests, are often impractical due to cost, time, and safety risks. While simulation-based methods offer scalability, they must be carefully correlated with real-world behavior to be credible. A structured, comprehensive framework is therefore essential to guide the validation process, ensuring that all critical scenarios are tested and that system performance is evaluated against rigorous safety and comfort metrics. This study addresses this need by applying the SUNRISE Safety Assurance Framework (SAF) to validate an ADS equipped with connected perception capabilities in urban intersections.
1.1. The SUNRISE Safety Assurance Framework
The SUNRISE SAF [
2], illustrated in
Figure 1, provides a harmonized and structured methodology to determine whether a Cooperative, Connected, and Automated Mobility (CCAM) system meets predefined safety levels. The SAF builds upon established standards, including ISO 26262 [
3], ISO/PAS 21448 (SOTIF) [
4], and the UNECE’s New Assessment/Test Method (NATM) [
5], while integrating ASAM standards (OpenODD [
6], OpenSCENARIO [
7], etc.) for consistent scenario modeling and simulation.
At its core, the SAF consists of three main blocks that form a comprehensive safety performance assurance process:
Scenario Block: Manages the entire lifecycle of test scenarios, from their creation using data-driven or knowledge-based approaches to their formatting in interoperable standards (e.g., ASAM OpenSCENARIO) and storage in centralized scenario databases (SCDBs).
Environment Block: Operationalizes scenarios by querying and concretizing them for specific tests, allocating them to appropriate test environments (virtual, hybrid, or physical), and executing the tests to collect performance data.
Safety Argument Block: Evaluates the test results to build a compelling safety case. This includes assessing test coverage, evaluating individual test outcomes against pass/fail criteria, compiling evidence, and making a final safety decision (pass/fail) based on the residual risk.
1.2. Contributions
This paper makes the following key contributions to the field of ADS safety validation:
1. Application of the SUNRISE SAF: We apply the SUNRISE SAF to validate an ADS equipped with connected perception in an urban intersection scenario.
2. Hybrid Validation Methodology: We present a comprehensive hybrid testing framework that combines virtual simulations in MATLAB/Simulink (2014b) with real-world components (autonomous vehicle, physical traffic lights, V2X units) using OMNeT++ (5.7.1)/Veins (5.2)/SUMO (1.12.0), enabling robust and scalable validation.
3. V2X Communication Performance Analysis: We quantify the impact of V2X communication delays on ADS performance and passenger comfort, providing concrete metrics (e.g., 100% CAM delivery with a 142 ms average delay) to inform system optimization.
4. Identification of Critical Edge Cases: We identify and analyze critical failure scenarios, such as a collision with a non-priority obstacle, highlighting specific challenges in ADS decision-making that require further development.
5. Simulation-Reality Consistency: We demonstrate strong alignment between virtual and hybrid testing outcomes, with minimal deviations in ego vehicle speed (<2 km/h) and trigger distance (<3 m), validating the reliability of the simulation models.
1.3. Paper Structure
The remainder of this paper is organized as follows.
Section 2 reviews related work on ADS validation, connected perception, and safety frameworks.
Section 3 details the methodology, including the test scenarios. The virtual and hybrid testing frameworks are described in
Section 4. The hybrid testing executions are presented in
Section 5.
Section 6 presents the results of the validation, comparing KPIs across the different test environments.
Section 7 discusses the findings, addresses the limitations of the study, and outlines the implications for future ADS development. Finally,
Section 8 concludes the paper with a summary of key takeaways and directions for future research.
2. Related Work
The safety validation of ADS in Cooperative, Connected, and Automated Mobility (CCAM) environments remains a complex and open challenge. Existing approaches span real-world testing, simulation-based validation, formal verification, and hybrid testing frameworks. Each addresses specific aspects of system performance, yet none alone provides a complete solution for validating connected ADS operating under infrastructure-supported perception. This section reviews the most relevant validation approaches and positions the proposed work within the current state of the art.
2.1. ADS Validation Paradigms
In this subsection, we discuss and compare the different ADS validation paradigms in the literature.
2.1.1. Field Operational Testing and Reduction Approaches
Field Operational Testing (FOT) relies on large-scale real-world data collection to estimate safety performance. While representative, the mileage required to expose rare and critical edge cases makes FOT alone impractical for comprehensive ADS validation. To mitigate this limitation, reduction approaches (RAs) have been proposed to focus testing on safety-critical situations. Betschinske et al. [
8], for example, evaluate RA techniques for Automatic Emergency Braking systems in alignment with ISO 21448. Although effective in reducing testing effort, such approaches still depend on real-world exposure and provide limited insight into connected perception or hybrid validation.
An alternative to physical FOT leverages high-fidelity driving simulators with human drivers. Zhao et al. [
9] demonstrate controlled experiments to evaluate connected vehicle functionalities. These approaches improve efficiency and reproducibility but primarily address human interaction and do not validate ADS perception or decision-making under realistic V2X conditions.
2.1.2. Regulatory Frameworks and Guidelines
High-level regulatory frameworks, such as UNECE WP.29 guidelines [
10], define safety objectives, functional requirements, and recommended validation principles for ADS. These frameworks establish a necessary regulatory baseline but intentionally remain technology-agnostic. As a result, they do not prescribe concrete test execution workflows or address hybrid testing, V2X-based perception, or simulation-to-reality consistency, which are critical for CCAM systems.
2.1.3. Formal Modeling and Verification
Formal methods provide mathematically rigorous guarantees of system correctness. Chen and Li [
11] propose scenario modeling and verification using UPPAAL-SMC. While effective for logical validation, such approaches struggle to incorporate real-world uncertainties such as communication latency, sensor timing, and control actuation delays, limiting their applicability to end-to-end ADS validation in CCAM environments.
2.1.4. Simulation-Based and Agent-Based Models
Model-based and agent-based simulations are widely used to generate diverse traffic behaviors and interaction patterns. Examples include MPC-based driver models [
12] and agent-based driver representations [
13]. These approaches offer scalability and behavioral richness, but typically focus on single-vehicle or nominal scenarios and rarely integrate V2X communication or hardware-in-the-loop components required for realistic connected ADS validation.
2.1.5. High-Fidelity and Cooperative Simulation Frameworks
Large-scale simulation frameworks combining traffic microsimulation and high-fidelity vehicle dynamics enable evaluation of cooperative and connected driving algorithms. Xu et al. [
14] demonstrate fleet-level safety assessment using CARLA (0.9.15), while OpenCDA [
15] provides an open-source cooperative driving automation framework integrating perception, planning, control, and V2X communication. These platforms are well suited for algorithm development and comparative evaluation but typically emphasize functional performance rather than structured safety assurance, passenger comfort, or simulation-to-reality consistency.
2.2. Hybrid and CCAM-Oriented Validation Approaches
Recent work has explored hybrid testing paradigms that combine real vehicles with simulated environments. Vehicle-in-the-Loop (ViL) frameworks embed physical vehicles into virtual CCAM scenarios to test ADAS functions under realistic V2X communication conditions [
16]. Digital-twin-based hybrid reality systems further synchronize real vehicle states with virtual environments to expand scenario coverage while reducing risk [
17]. High-fidelity driving simulators such as DriSMi emphasize human behavior, comfort, and acceptance in CCAM contexts [
18]. While these approaches improve realism and testing efficiency, they primarily focus on controller performance or human interaction and do not explicitly address the fidelity of infrastructure-supported perception or its impact on ADS decision-making.
Fail-safe decision architectures have also been proposed to maintain safe operation under sensing or positioning failures in CCAM systems [
19]. These works focus on architectural redundancy and fallback strategies, whereas the present study addresses the operational impact and validation of infrastructure-provided perception under nominal and extreme conditions.
2.3. I2V Communication and Connected Perception in CCAM
Infrastructure-to-Vehicle (I2V) communication is a cornerstone of CCAM, enabling connected perception through multiple standardized message types. Traffic signal and intersection awareness is supported through Signal Phase and Timing Extended Messages (SPATEM), which provide real-time information about traffic light states, intersection geometry, and maneuver permissions. These messages are fundamental for ADS decision-making at signalized intersections and form a key component of connected perception for traffic control and priority management.
Beyond traffic signal awareness, Collective Perception Services (CPS) extend connected perception to dynamic objects through infrastructure-based sensing. ETSI TR 103 562 analyzes CPS use cases and highlights how infrastructure-supported perception can extend vehicle situational awareness in complex or occluded environments [
20]. ETSI TS 103 324 formalizes the Collective Perception Message (CPM), defining standardized object-level information exchange to enable interoperable cooperative perception between infrastructure and vehicles [
21].
Several studies have explored the use of I2V communication in safety assessment and ADAS design. Van de Sluis et al. [
22] incorporate V2V and I2V communication into scenario-based and hardware-in-the-loop safety testing for truck platooning, explicitly modeling communication effects. Frigerio et al. [
23] investigate V2X-assisted collision avoidance at intersections using simulation-based evaluation, analyzing the influence of communication delays and infrastructure sensing. These works demonstrate the relevance of I2V communication for safety but do not address the end-to-end fidelity of connected perception chains nor their consistency across simulation and real-vehicle testing.
At a higher level, prior academic studies have shown that communication-based situational awareness can significantly influence automated vehicle behavior and traffic performance [
24]. However, these works typically assume idealized communication or focus on macroscopic effects, without experimentally validating ADS decision-making under realistic hybrid testing conditions.
2.4. Positioning of the Present Work
In contrast to existing approaches, the present study focuses on the experimental validation of connected perception based on I2V communication, encompassing both traffic signal awareness via SPATEM and dynamic object perception via CPM, within a structured SUNRISE SAF. Rather than proposing new perception algorithms or communication protocols, this work evaluates how infrastructure-provided perception can replace onboard perception and support consistent ADS decision-making across simulation and hybrid tests.
By quantifying safety, control, communication, and passenger comfort KPIs across multiple scenarios—including a deliberately extreme intersection conflict—the proposed approach bridges the gap between ETSI I2V communication standards and their practical impact on ADS behavior, with particular emphasis on simulation-to-reality consistency. This positions the work as a complementary contribution to existing CCAM validation literature, addressing a critical but underexplored dimension of connected ADS safety assurance.
3. Methodology
This section details the methodology employed to validate an ADS with connected perception in urban intersection scenarios. The validation process is structured according to the SUNRISE SAF, which provides a systematic approach for defining, executing, and evaluating safety-critical tests. We first outline the validation workflow based on the SAF, followed by a detailed description of the selected test scenarios designed to challenge the ADS. Finally, we present the virtual and hybrid testing frameworks developed to support this validation.
3.1. Validation Workflow Based on the SUNRISE SAF
Our validation workflow systematically implements the three core blocks of the SUNRISE SAF: Scenario, Environment, and Safety Argument. A brief summary of how we apply each block is presented below.
3.1.1. Scenario Block
The validation begins with the Scenario block, where a set of safety-critical urban intersection scenarios was internally defined based on expert knowledge. These scenarios were curated to cover a representative range of operational conditions, from routine maneuvers to high-risk edge cases. The process started with measurements of the real-world road network of the test track, which were then used to create calibrated road network models for both virtual and hybrid testing. The details of the test scenarios are described in
Section 3.2.
3.1.2. Environment Block
In the Environment block, the defined scenarios were concretized with specific parameters (e.g., vehicle speeds, distances, communication delays). Each concretized scenario was then allocated to an appropriate test environment. To ensure robustness and scalability, we employed a dual-environment strategy: a virtual testing framework described in
Section 4.1 for extensive parameter sweeps and a hybrid testing framework described in
Section 4.2 for validation with real-world hardware. Test executions, presented in detail in
Section 5, were carried out in both testing frameworks to gather comprehensive performance data.
3.1.3. Safety Argument Block
Following test execution, the Safety Argument block was initiated. We performed a coverage analysis to confirm that testing concentrated on the relevant region of each scenario’s parameter space. All relevant data, including vehicle states, decisions, and communication logs, were collected for in-depth analysis. The results were then used to construct a structured safety case, evaluating the ADS against predefined KPIs and supporting the final validation decision, which is presented in
Section 6.
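As a rough illustration of such a coverage check, the sketch below compares a set of planned concrete test instances against the instances actually executed. The function name `coverage_report` and the example grid are hypothetical and do not reproduce the tooling or the test matrix used in this study.

```python
from itertools import product


def coverage_report(planned, executed):
    """Return (ratio, missing) for executed vs. planned parameter tuples."""
    planned_set = set(planned)
    if not planned_set:
        raise ValueError("planned must not be empty")
    done = planned_set & set(executed)
    missing = sorted(planned_set - done)
    return len(done) / len(planned_set), missing


# Hypothetical example: a 2 x 3 grid of (distance in m, speed in km/h).
planned = list(product([10, 25], [10, 30, 50]))
executed = [(10, 10), (10, 30), (10, 50), (25, 10), (25, 30)]

ratio, missing = coverage_report(planned, executed)
print(f"coverage = {ratio:.0%}, missing = {missing}")
```

Reporting the missing tuples alongside the ratio makes it explicit which parts of the parameter space still lack evidence.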
3.2. Test Scenarios
This section presents the validation of the Scenario block. Five representative test scenarios were defined. These scenarios evaluate ADS behavior at urban intersections under two configurations: traffic-light signalized intersections and unsignalized intersections relying on connected perception. The Operational Design Domain (ODD) considered in this study corresponds to urban intersections with full I2V connectivity. Note that the ADS operates using its locally stored map and makes vehicle control decisions exclusively based on information received via I2V communication.
3.2.1. Traffic-Light Signalized Urban Intersection Scenarios
These scenarios assess the ADS’s ability to handle conventional signalized intersection operations safely and comfortably. The considered intersection is managed by an intersection controller operating traffic lights and equipped with a Roadside Unit (RSU) enabling V2X communication. The controller broadcasts ETSI-compliant SPATEM messages, providing Signal Phase and Timing (SPaT) information to approaching vehicles via I2V communication [
25].
Scenario 111—Nominal Stop: As illustrated in
Figure 2, the ADS approaches a signalized urban intersection. The intersection controller broadcasts SPATEM messages via I2V communication, indicating a red traffic light phase. Upon receiving this information, the ADS correctly interprets the signal state and performs a smooth and controlled stop before the stop line.
Objective: To evaluate the accuracy, safety, and passenger comfort of the ADS response to traffic light phase transitions (green to amber to red).
Parameters: Initial vehicle speeds ranged from 5 to 50 km/h. The distance to the stop line at the onset of the amber phase was set to 0, 20, 45, 70, and m, the latter representing a vehicle that has already crossed the stop line.
Scenario 121—Resume on Green:
Figure 3 depicts a scenario in which the ADS approaches a signalized intersection and initiates braking in response to a red traffic light received through SPATEM messages. Before the vehicle comes to a complete stop, the traffic signal switches to green. Upon receiving the updated SPATEM information, the ADS aborts the braking maneuver and smoothly resumes motion to proceed through the intersection.
Objective: To assess the responsiveness and smoothness of ADS behavior when a braking maneuver is interrupted by a green-light transition.
Parameters: The initial speed was fixed at 50 km/h. The green-light transition occurred at ego vehicle speeds of 0, 10, 20, 30, and 40 km/h.
Scenario 134—Delayed Traffic Light Communication: In this scenario, shown in
Figure 4, the ADS approaches a signalized intersection while SPATEM messages transmitted via I2V communication are intentionally delayed. This setup allows the evaluation of ADS robustness to degraded communication conditions.
Objective: To quantify the impact of delayed traffic light information on braking performance, safety margins, and passenger comfort.
Parameters: The initial speed was set to 30 km/h, with the red-light activation point fixed at 30 m. Communication delays of 0, 0.1, 0.3, 0.5, 1, and 2 s were applied.
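To make the expected effect of the delay concrete, the following sketch estimates the constant deceleration needed to stop at the stop line when braking starts only after the delay has elapsed. It is a simplified kinematic approximation, not the ADS braking logic: we assume the 30 m activation point is the distance to the stop line when the light turns red, that the vehicle holds its speed during the delay, and that deceleration is constant. The function name `required_deceleration` is hypothetical.

```python
def required_deceleration(speed_kmh, distance_m, delay_s):
    """Constant deceleration (m/s^2) needed to stop within distance_m
    when braking starts only after delay_s at unchanged speed."""
    v = speed_kmh / 3.6                   # km/h -> m/s
    remaining = distance_m - v * delay_s  # distance left when braking starts
    if remaining <= 0:
        return float("inf")               # stop line reached before braking
    return v ** 2 / (2.0 * remaining)


# Scenario 134 values from the text: 30 km/h, 30 m, six delay settings.
for delay in (0, 0.1, 0.3, 0.5, 1, 2):
    a = required_deceleration(30, 30, delay)
    print(f"delay {delay:>3} s -> {a:.2f} m/s^2")
```

Under these assumptions the required deceleration rises from about 1.2 m/s² with no delay to about 2.6 m/s² with a 2 s delay, which is consistent with the harsher braking this scenario is designed to expose.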
3.2.2. Unsignalized Urban Intersection Scenarios
These scenarios address unsignalized urban intersections where safety depends on connected perception rather than traffic lights. The intersection manager is equipped with perception sensors and disseminates perceived object information to approaching vehicles using Collective Perception Messages (CPMs) [
20].
Scenario 211—Stop for Priority Obstacle: As illustrated in
Figure 5, this scenario considers an unsignalized intersection where a priority vehicle approaches from the right of the ADS. The obstacle appears abruptly and at a relatively late stage. The intersection manager detects the obstacle and broadcasts its state via CPMs from the RSU. Based on the connected perception information received through its On-Board Unit (OBU), the ADS evaluates the obstacle’s priority and executes a safe stop to yield the right-of-way.
Objective: To assess the ADS’s ability to make correct decisions and apply effective braking when reacting to a late-appearing priority obstacle.
Parameters: The ego vehicle’s initial distance (10 and 25 m) and speed (10, 20, 30, 40, and 50 km/h) were varied. The obstacle vehicle’s speed was fixed at 30 km/h.
Scenario 221—Urgent Stop for Non-Priority Obstacle (Collision Case):
Figure 6 illustrates an unsignalized intersection scenario involving a non-priority obstacle vehicle. The obstacle is initially stationary and respects the right-of-way. Its state and position are continuously perceived by the intersection manager and transmitted to the ADS via CPMs. The ADS, which has priority, proceeds without stopping while monitoring the obstacle. At a critical moment, the obstacle suddenly accelerates and forces the right-of-way by entering the intersection. This late maneuver introduces ambiguity, as it may correspond either to a conflicting trajectory or a non-conflicting turn.
Based on the connected perception information, the ADS must evaluate collision risk in real time and decide whether to initiate emergency braking. Depending on reaction time and available braking distance, a collision may occur.
Objective: To assess ADS decision-making and control behavior in highly critical and ambiguous right-of-way violation situations.
Parameters: Ego vehicle distances of 5, 10, and 25 m and speeds of 10, 20, 30, 40, and 50 km/h were evaluated.
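The dependence of the outcome on reaction time and available braking distance can be illustrated with a simple stopping-distance check over the parameter grid above. This is a sketch only: the reaction time (0.2 s) and maximum deceleration (6 m/s²) are assumed values, the listed distance is interpreted as the ego distance to the conflict point when the obstacle starts moving, and `stopping_distance` and `can_stop` are hypothetical helpers rather than the ADS decision logic.

```python
def stopping_distance(speed_kmh, reaction_s=0.2, decel=6.0):
    """Distance (m) covered during the reaction time plus constant braking.

    reaction_s and decel are assumed values, not measured ones.
    """
    v = speed_kmh / 3.6  # km/h -> m/s
    return v * reaction_s + v ** 2 / (2.0 * decel)


def can_stop(speed_kmh, distance_m, reaction_s=0.2, decel=6.0):
    """True if the ego vehicle can stop before the conflict point."""
    return stopping_distance(speed_kmh, reaction_s, decel) <= distance_m


# Parameter grid of Scenario 221 as listed in the text.
for distance in (5, 10, 25):
    feasible = [v for v in (10, 20, 30, 40, 50) if can_stop(v, distance)]
    print(f"{distance:>2} m: stop feasible at {feasible} km/h")
```

With these assumed values, the shortest distances leave no feasible stop at the higher speeds, which is why the scenario is labeled a collision case.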
In Scenarios 211 and 221, the CPM transmission frequency is fixed at 25 Hz, in accordance with ADS input requirements. Each CPM contains the obstacle vehicle identifier, position vector, velocity vector, and the occupancy status of the intersection conflict zone. A systematic evaluation of alternative CPM contents or transmission rates is beyond the scope of this work, which focuses on ADS decision-making and control performance.
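For illustration, the message content listed above can be represented as a small record type. The sketch below is a simplified stand-in and not the ETSI CPM structure: the class name, field names, the added timestamp field, and the staleness rule are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Tuple

CPM_RATE_HZ = 25                   # transmission frequency stated in the text
CPM_PERIOD_S = 1.0 / CPM_RATE_HZ   # 0.04 s between consecutive messages


@dataclass(frozen=True)
class SimplifiedCpm:
    """Hypothetical container for the four fields named in the text."""
    obstacle_id: int
    position_m: Tuple[float, float]
    velocity_mps: Tuple[float, float]
    conflict_zone_occupied: bool
    timestamp_s: float  # assumption: not listed in the text


def is_stale(msg, now_s, max_missed=3):
    """Treat a message as stale after max_missed periods (assumed rule)."""
    return now_s - msg.timestamp_s > max_missed * CPM_PERIOD_S


msg = SimplifiedCpm(1, (12.0, -3.5), (0.0, 8.3), False, timestamp_s=0.0)
print(is_stale(msg, now_s=0.10), is_stale(msg, now_s=0.20))
```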
Scenario Block Validation
The scenarios defined above were created, formatted, and stored to support the virtual and hybrid testing frameworks described in
Section 4. Each scenario was specified with explicit parameters, actors, and infrastructure elements, and formatted to ensure consistent execution in both simulation and XiL-based hybrid tests. The scenarios were stored locally, thereby validating the Scenario block in terms of creation, formatting, and storage. This validation process follows the scenario definition and parameterization requirements specified in SUNRISE Deliverable D3.2 [
26].
The logical scenarios defined above were concretized by assigning explicit parameter values (e.g., speeds, distances, communication delays), enabling their execution as concrete test instances. This validates the concretization step of the Environment block of the SAF.
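A minimal sketch of this concretization step is given below, using the parameter values listed for Scenarios 134, 211, and 221. It assumes a full-factorial combination of the listed values, which the text does not state explicitly; the dictionary layout, key names, and the `concretize` function are hypothetical.

```python
from itertools import product

# Parameter ranges taken from the scenario descriptions in the text.
LOGICAL_SCENARIOS = {
    "134": {
        "ego_speed_kmh": [30],
        "red_light_distance_m": [30],
        "delay_s": [0, 0.1, 0.3, 0.5, 1, 2],
    },
    "211": {
        "ego_distance_m": [10, 25],
        "ego_speed_kmh": [10, 20, 30, 40, 50],
        "obstacle_speed_kmh": [30],
    },
    "221": {
        "ego_distance_m": [5, 10, 25],
        "ego_speed_kmh": [10, 20, 30, 40, 50],
    },
}


def concretize(scenario_id):
    """Expand a logical scenario into concrete instances (full factorial)."""
    ranges = LOGICAL_SCENARIOS[scenario_id]
    names = list(ranges)
    instances = []
    for values in product(*(ranges[name] for name in names)):
        instance = dict(zip(names, values))
        instance["scenario"] = scenario_id
        instances.append(instance)
    return instances


for sid in LOGICAL_SCENARIOS:
    print(sid, len(concretize(sid)), "concrete instances")
```

Keeping fixed parameters as single-valued lists lets the same expansion handle both swept and constant values.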
4. Testing Frameworks
This section validates the allocation step of the Environment block of the SAF. To execute the above scenarios, two complementary testing frameworks were developed: a fully virtual environment enabling scalable and repeatable experimentation, and a hybrid setup integrating real-world hardware for validation under realistic operating conditions.
4.1. Virtual Testing Framework
The virtual testing framework, developed in MATLAB/Simulink (2014b), provides a high-fidelity environment for validating and tuning the ADS algorithms before deployment on physical platforms. As illustrated in
Figure 7, it serves as a digital twin of the real system, enabling the evaluation of planning, control, and mapping algorithms under controlled and repeatable conditions.
4.1.1. Architecture and Components
The simulator’s architecture is composed of four key subsystems that replicate a cooperative driving environment:
Real-Time Visualization: Provides multiple viewpoints (Ego Vehicle, Intersection Manager, and Bird’s-Eye) for monitoring and debugging the simulation.
Scenario Generation: Dynamically generates static and dynamic obstacles, controls traffic light states, and models traffic participants’ behavior within scenarios executed by a Level-4 autonomous vehicle prototype. The scenarios explicitly account for the driver–ADS interaction model, in which the human driver is responsible only for activating autonomous mode when the vehicle indicates operational readiness. Once engaged, the ADS assumes full driving control, with no continuous driver supervision required. Autonomous operation is restricted to pre-mapped operational design domains, reflecting the vehicle’s onboard mapping and deployment constraints.
Intersection Manager (IM): Implemented as a compiled Dynamic-Link Library (DLL) deployed on the target execution platform. The IM represents infrastructure-side intelligence, performing intersection-level coordination to support connected perception and to assess and mitigate collision risks among connected traffic participants.
The Intersection Manager builds a consolidated scene representation and performs temporal projections to identify potential risk situations. It transmits to the ego vehicle only relevant obstacles, along with their estimated kinematic states and localized collision-risk zones derived from predictive analysis. In this configuration, the RSU acts solely as a communication relay and does not perform analysis or issue alerts.
Ego Vehicle Model (MODEL ZOE2): A detailed model that simulates the vehicle’s physical dynamics and hosts the actual ADS algorithms.
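The idea of temporal projection used by the Intersection Manager to select relevant obstacles can be sketched as follows. This is not the IM implementation, which is a compiled library whose internals are not described here: the sketch assumes constant-velocity motion, a rectangular conflict zone, and a fixed prediction horizon, and all names are hypothetical.

```python
def enters_zone(position, velocity, zone, horizon_s=3.0, step_s=0.1):
    """Constant-velocity projection: does the object reach the zone in time?

    zone is an axis-aligned rectangle (xmin, ymin, xmax, ymax) in metres.
    """
    x0, y0 = position
    vx, vy = velocity
    xmin, ymin, xmax, ymax = zone
    steps = int(round(horizon_s / step_s))
    for k in range(steps + 1):
        t = k * step_s
        x, y = x0 + vx * t, y0 + vy * t
        if xmin <= x <= xmax and ymin <= y <= ymax:
            return True
    return False


def relevant_obstacles(obstacles, zone, horizon_s=3.0):
    """Keep only obstacles predicted to enter the conflict zone."""
    return [o for o in obstacles
            if enters_zone(o["position"], o["velocity"], zone, horizon_s)]


zone = (-3.0, -3.0, 3.0, 3.0)  # hypothetical conflict zone around the origin
obstacles = [
    {"id": "A", "position": (20.0, 0.0), "velocity": (-8.33, 0.0)},  # approaching
    {"id": "B", "position": (20.0, 0.0), "velocity": (8.0, 0.0)},    # leaving
    {"id": "C", "position": (50.0, 50.0), "velocity": (0.0, 0.0)},   # parked
]
print([o["id"] for o in relevant_obstacles(obstacles, zone)])
```

Filtering on predicted zone entry mirrors the described behavior of transmitting only relevant obstacles to the ego vehicle.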
4.1.2. Ego Vehicle Modeling as a Digital Twin
The Ego Vehicle model is a critical component, designed to be a high-fidelity digital twin of the physical test vehicle. It ensures that the behavior observed in simulation is a reliable predictor of real-world performance.
1. Sensor Models: The ADS vehicle used in this study is not equipped with any onboard perception sensors (e.g., cameras or LiDAR). Instead, it relies exclusively on a locally stored topological map (including traffic light identifiers and positions), GNSS-based localization, vehicle trajectory information, and real-time Infrastructure-to-Vehicle (I2V) communication for decision-making and control.
The simulated sensor configuration strictly mirrors the real vehicle hardware and includes:
(a) A GNSS RTK/IMU unit for high-precision localization.
(b) A CAN bus interface providing vehicle state measurements (e.g., wheel speeds, yaw rate).
2. Autonomous Driving (AD) Stack: The model integrates an autonomous driving software stack instantiated from the same codebase as the real vehicle and compiled as DLLs for virtual test execution. The stack comprises core ADS functions, including topological map handling, map-matching of V2X information, and a high-level supervisory module responsible for decision-making and vehicle control.
While the SUNRISE project does not define a single explicit end-to-end AD stack in its deliverables [27], the proposed model is aligned with the SUNRISE architecture and assumptions. In particular, Deliverable D4.1 [28] specifies the functional decomposition of ADS components, and Deliverable D4.4 [29] describes the interactions between these components at the simulation level. The implemented stack follows these functional boundaries and interaction principles, ensuring consistency with the publicly available SUNRISE specifications.
3. Vehicle Dynamics: A detailed dynamic model of the Renault Zoé platform simulates actuator limits and system latencies, providing a realistic response to control commands. A manual driver model is also included to test mode transitions.
Crucially, the use of identical AD algorithms in both the virtual simulator and the real vehicle is the cornerstone of our validation approach, allowing for a direct and meaningful comparison between virtual and hybrid test results.
4.2. Hybrid Testing Framework
The hybrid testing framework bridges the gap between pure simulation and real-world testing by integrating physical hardware into the simulation loop. This setup, depicted in
Figure 8, enables the validation of the ADS under realistic conditions, including real sensor noise, actuator dynamics, and V2X communication latencies.
4.2.1. XiL Co-Simulator
The core of the hybrid framework is the XiL (X-in-the-Loop) simulator, a co-simulation environment built on OMNeT++ (5.7.1) [
30], Veins (5.2) [
31], and SUMO (1.12.0) [
32].
SUMO models the microscopic traffic flow and intersection dynamics.
Veins connects SUMO to OMNeT++, enabling the simulation of V2X communication protocols (e.g., CAM, SPATEM).
OMNeT++ acts as the discrete-event network simulator, managing all communication between entities.
An MQTT broker facilitates the exchange of data between the simulated environment and the physical components.
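The data exchanged through the broker can be pictured as small serialized state messages. The sketch below shows one possible encoding of the ego vehicle state. The topic name, the JSON format, and the field names are assumptions, since the actual topics and payloads are not specified here, and the call that hands the bytes to an MQTT client is deliberately omitted.

```python
import json

EGO_STATE_TOPIC = "xil/ego/state"  # hypothetical topic name
REQUIRED_FIELDS = {"timestamp_s", "x_m", "y_m", "heading_deg", "speed_mps"}


def encode_ego_state(timestamp_s, x_m, y_m, heading_deg, speed_mps):
    """Serialize the ego state to UTF-8 JSON bytes (assumed payload format)."""
    payload = {
        "timestamp_s": timestamp_s,
        "x_m": x_m,
        "y_m": y_m,
        "heading_deg": heading_deg,
        "speed_mps": speed_mps,
    }
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def decode_ego_state(raw):
    """Parse a payload and reject messages with missing fields."""
    data = json.loads(raw.decode("utf-8"))
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return data


raw = encode_ego_state(12.5, 104.2, -7.9, 90.0, 8.3)
print(EGO_STATE_TOPIC, decode_ego_state(raw))
```

Validating required fields on receipt keeps a malformed message from silently corrupting the mirrored vehicle state.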
4.2.2. System Integration and Information Flow
The hybrid framework operates through a tightly controlled loop of information and action between the virtual and physical worlds:
1. The XiL simulator runs the urban intersection scenario, modeling all non-subject vehicles and pedestrians.
2. The physical ego vehicle, equipped with the real ADS stack, drives on a test track that is digitally mirrored in the simulator. Its position is fed back to the simulator via MQTT.
3. A physical Intersection Management System (IMS) receives traffic light state information from the XiL simulator and controls a physical traffic light on the test track.
4. The IMS also receives simulated perception data (obstacles) from the XiL simulator and combines it with its own assessment.
5. The IMS transmits cooperative perception data and traffic light information (via SPATEM) to a physical Road-Side Unit (RSU).
6. The RSU broadcasts this information over the air to the ADS vehicle’s On-Board Unit (OBU) using VEDECOM’s V2X stack.
7. The ADS vehicle exploits I2V information for decision-making, which is subsequently enacted through the vehicle’s physical control actuators. Perception objects previously available to the ADS are continuously updated and superseded by the most recent objects received via I2V communication. In parallel, the ADS broadcasts its current state through Cooperative Awareness Messages (CAMs) to both the RSU and the XiL simulator, thereby closing the loop.
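The rule in the final step, where the most recent I2V objects supersede earlier ones, can be sketched as a small keyed store. This is an illustrative data structure and not the ADS implementation: the class name, the rejection of out-of-order messages, and the 0.5 s expiry are assumptions.

```python
class PerceptionObjectStore:
    """Keeps the latest known state per object ID (hypothetical sketch)."""

    def __init__(self, max_age_s=0.5):
        self.max_age_s = max_age_s  # assumed expiry, not from the text
        self._objects = {}          # obj_id -> (timestamp_s, state)

    def update(self, obj_id, state, timestamp_s):
        """Store the state unless a newer one is already held."""
        current = self._objects.get(obj_id)
        if current is None or timestamp_s >= current[0]:
            self._objects[obj_id] = (timestamp_s, state)

    def snapshot(self, now_s):
        """Drop expired objects and return the current id -> state view."""
        self._objects = {
            obj_id: entry
            for obj_id, entry in self._objects.items()
            if now_s - entry[0] <= self.max_age_s
        }
        return {obj_id: entry[1] for obj_id, entry in self._objects.items()}


store = PerceptionObjectStore()
store.update(7, "old", 1.00)
store.update(7, "new", 1.04)
store.update(7, "late", 1.02)  # out-of-order message is ignored
print(store.snapshot(now_s=1.10))
```

Expiring old entries prevents the vehicle from acting on an obstacle that the infrastructure has stopped reporting.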
This integrated framework allows for the safe and repeatable testing of dangerous edge cases, such as Scenario 221, while capturing the complex interactions between the ADS, the environment, and the communication network.
The concrete scenario test instances were allocated to the testing frameworks described in this section. In the fully virtual framework, all scenario elements were simulated, including the road network, intersections, the ADS ego vehicle, and obstacle vehicles. In the hybrid framework, static elements were realized on a real test track (road geometry, intersections, and markings), while dynamic elements included a real ADS vehicle with V2X capabilities; obstacle vehicles were simulated and reproduced within the XiL environment.
This allocation follows the principles of SUNRISE Deliverable D3.3 [
33], which defines the systematic mapping of scenario instances to test platforms according to test objectives and required fidelity.
Virtual tests were executed using the virtual testing framework, and the corresponding results are presented in
Section 6.1. The execution of the hybrid tests is described in the following section.
5. Hybrid Testing and Scenario Execution
This section validates the execution step of the Environment block of the SAF. Using the hybrid test environment described in
Section 4.2, we performed a series of validation runs. The physical ADS vehicle operated on a test track while its actions and the surrounding environment were mirrored in real-time within the XiL simulator. This synchronization, achieved via MQTT, allowed for the safe execution of critical scenarios that would be too dangerous to test in uncontrolled traffic. The following paragraphs detail the execution of the five selected scenarios, with visual snapshots provided in
Figure 9 and
Figure 10.
5.1. Execution of Traffic-Light Signalized Urban Intersection Scenarios
In these scenarios (111, 121, 134), the ADS’s interaction with traffic signals was tested. The physical traffic light on the track was controlled by the IMS, which received its state from the XiL simulator. In all the hybrid tests, the received connected perception information takes precedence over and replaces the local topological map information.
In Scenario 111 (
Figure 9a), the ADS approached a green light that turned red. The IMS transmitted a SPATEM message to the vehicle’s OBU, prompting a controlled stop that was precisely mirrored in the simulator.
For Scenario 121 (
Figure 9b), the ADS began braking for a red light. When the simulator changed the light to green, an updated SPATEM was sent, and the ADS smoothly resumed acceleration, demonstrating its ability to handle dynamic signal changes.
Scenario 134 (
Figure 9c) evaluated the impact of connectivity delays on the ego vehicle’s braking behavior and passenger comfort. During the execution, the ADS approached a green traffic light, but the SPATEM information from the intersection manager was intentionally delayed. As a result, the ADS received the red-light information later than usual, causing the vehicle to brake more abruptly, which reduced passenger comfort.
5.2. Execution of Unsignalized Urban Intersection Scenarios
These scenarios (211, 221) evaluated the ADS’s ability to detect and react to other vehicles using V2X-based connected perception, bypassing the limitations of onboard sensors.
In Scenario 211 (
Figure 10a), the ego vehicle approached an unsignalized intersection when a simulated priority vehicle suddenly appeared from the right. The XiL simulator detected the obstacle and transmitted this information to the intersection manager, which then broadcast CPMs via its RSU to the ego vehicle’s OBU. Using the connected perception data carried in the CPMs, the ADS correctly identified the priority vehicle and executed a controlled stop to yield the right-of-way.
Scenario 221 (
Figure 10b) evaluated the system under extreme, time-critical conditions. During execution, the non-priority vehicle was present from the start but remained stationary at the intersection entrance, initially respecting the right-of-way. At the moment defined by the scenario, the obstacle suddenly accelerated from the left, forcing its way into the intersection and creating a high-risk situation.
The ADS received the obstacle’s perception data and the collision risk zone in CPMs sent by the intersection manager’s RSU to the ego vehicle’s OBU, and it executed an emergency braking maneuver. Although a minor collision occurred, a major collision was avoided, demonstrating the ADS’s ability to mitigate severe outcomes.
Scenario 221 was intentionally designed to study an extreme and near-inevitable collision case under an I2V-only perception paradigm. Both simulation and hybrid tests were performed under identical assumptions, without onboard perception. The obstacle exhibits late and unpredictable behavior, making early risk identification impossible.
6. Experimental Evaluation and Results
This section validates the safety argument SAF block. It presents the analysis of the results from validating the ADS in urban intersection scenarios using the SUNRISE SAF. We begin by detailing the process of identifying critical test cases from extensive virtual tests, which were then used to focus the hybrid testing. Subsequently, we present a comparative analysis of KPIs from virtual and hybrid environments, providing deep insights into the ADS’s decision-making, control performance, V2X communication effectiveness, and passenger comfort across all five scenarios. Passenger comfort is evaluated using longitudinal acceleration-based KPIs, namely, average and peak acceleration/deceleration. These metrics are commonly used in the literature to assess ride comfort, as excessive acceleration or deceleration levels are directly perceived by passengers and may lead to discomfort [
34].
6.1. Virtual Testing Results and Identification of Critical Scenarios
To maximize both the safety and the effectiveness of the hybrid testing phase, extensive virtual simulations were first conducted in MATLAB/Simulink to identify the most challenging and informative test cases. The selection process focused on scenarios that stress the ADS decision-making, control performance, and passenger comfort limits. In particular, scenario parameter configurations resulting in the highest jerk values were identified as critical cases and selected for further evaluation.
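The selection criterion described above can be illustrated with a minimal sketch. It assumes that jerk is approximated by a finite difference of the sampled longitudinal acceleration and that traces are sampled at the 40 ms control cycle; the study itself used MATLAB/Simulink, and the function names and sample traces below are hypothetical.

```python
from typing import Dict, List, Tuple


def peak_jerk(accel: List[float], dt: float) -> float:
    """Largest absolute finite-difference jerk (m/s^3) of an acceleration trace (m/s^2)."""
    if len(accel) < 2:
        return 0.0
    return max(abs(b - a) / dt for a, b in zip(accel, accel[1:]))


def most_critical(cases: Dict[Tuple[float, float], List[float]], dt: float) -> Tuple[float, float]:
    """Return the (speed km/h, distance m) configuration with the highest peak jerk."""
    return max(cases, key=lambda cfg: peak_jerk(cases[cfg], dt))


# Hypothetical acceleration traces, keyed by (initial speed, distance to the light).
cases = {
    (10.0, 0.0): [0.0, -1.0, -2.0, -2.0],
    (30.0, 20.0): [0.0, -0.5, -1.0, -1.5],
}
critical = most_critical(cases, dt=0.04)
```

With these illustrative traces, the low-speed, zero-distance configuration is selected because its acceleration changes fastest, even though its deceleration level is not extreme.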
6.1.1. Scenario 111: Nominal Stop
Virtual simulation results illustrating the ego vehicle’s stop–go decisions for different combinations of initial speed
and distance to the traffic light are presented in
Figure 11. Blue markers correspond to test cases in which the ADS correctly decides to stop at the red traffic light. Orange markers indicate cases where the ADS proceeds through the intersection during the orange phase. Red markers represent cases in which the ADS proceeds through the intersection despite a red signal, constituting a red-light violation.
Figure 11a reveals a critical observation. With a 3 s orange phase, which is typical of dense urban environments, the ADS decision logic leads to a single red-light violation when the initial speed is 50 km/h and the distance to the traffic light is 45 m. This behavior arises because the ADS decision algorithm is calibrated for a 5 s orange phase, which is more common in peri-urban scenarios. This result highlights the sensitivity of the decision-making logic to regional traffic signal timing configurations. In contrast,
Figure 11b shows no red-light violations when a 5 s orange phase is applied.
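A simple kinematic check is consistent with this observation. The sketch below is not the ADS decision algorithm; it only assumes that the vehicle keeps a constant speed and that the light turns orange at the moment the distance is measured. The function names are illustrative.

```python
def time_to_stop_line(speed_kmh: float, distance_m: float) -> float:
    """Travel time (s) to the stop line at constant speed."""
    return distance_m / (speed_kmh / 3.6)


def reaches_line_after_orange(speed_kmh: float, distance_m: float, orange_s: float) -> bool:
    """True if a vehicle that does not brake reaches the line after the orange phase ends."""
    return time_to_stop_line(speed_kmh, distance_m) > orange_s


t = time_to_stop_line(50.0, 45.0)                        # about 3.24 s
late_3s = reaches_line_after_orange(50.0, 45.0, 3.0)     # True: light is already red
late_5s = reaches_line_after_orange(50.0, 45.0, 5.0)     # False: still orange
```

At 50 km/h, 45 m takes about 3.24 s to cover, which exceeds a 3 s orange phase but not a 5 s one.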
Since the primary objective of Scenario 111 is to evaluate the ADS stopping behavior at red traffic lights, the subsequent analysis focuses on the blue-marked cases. These scenarios are used to further assess passenger comfort-related KPIs and to identify critical operating conditions.
Figure 12 presents the comfort-related KPIs obtained from virtual tests of Scenario 111, expressed in terms of peak deceleration and peak jerk experienced by the ADS passenger. The results show that the maximum values of deceleration and jerk do not necessarily occur for the same test configurations. This observation indicates that the most uncomfortable driving condition, as reflected by peak jerk, is not systematically associated with the highest deceleration level.
In particular, the highest jerk is observed in the test case where the ADS vehicle is located at 0 m from the traffic light and travels at a speed of 10 km/h. In this configuration, the traffic light switches to red precisely as the vehicle reaches the stop line, forcing the ADS to execute an abrupt stop. Although the resulting deceleration remains moderate, the rapid change in acceleration leads to the highest jerk value, representing a significant passenger comfort challenge and stressing the system’s ability to handle last-moment stopping maneuvers smoothly.
This configuration is therefore identified as a critical scenario. It is selected for further investigation in the hybrid testing campaign with the real ADS-equipped vehicle, with the objective of verifying whether the comfort-related effects observed in virtual simulation are consistently reproduced under real-world conditions.
6.1.2. Scenario 121: Resume on Green
For reacceleration comfort,
Figure 13 shows that while average acceleration is stable, jerk varies significantly. The critical case selected for hybrid testing involves reacceleration from a low speed (below 20 km/h), which produces the highest jerk. This choice allows us to evaluate the ADS’s ability to provide a smooth and comfortable experience during a common, yet potentially jarring, maneuver.
6.1.3. Scenario 134: Delayed Communication
Figure 14 illustrates the direct impact of communication latency on passenger comfort, as reflected by acceleration and jerk. In this scenario, the latency is introduced by the intersection manager when transmitting SPATEM messages to the ADS. Both acceleration and jerk increase as the delay grows from 0 to 1 s. The 1 s delay case was selected for hybrid testing because it produces the highest jerk while remaining realistic and still allowing the vehicle to stop safely. The 2 s delay case was excluded, as it resulted in a red-light violation and therefore falls outside the scope of the comfort analysis.
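The trend can be reproduced qualitatively with a constant-deceleration estimate. The sketch below assumes that the vehicle keeps its speed until the delayed SPATEM arrives and then brakes uniformly to the stop line; the 30 km/h speed and 20 m distance are hypothetical values chosen for illustration, not parameters taken from the test campaign.

```python
from typing import Optional


def required_deceleration(speed_kmh: float, distance_m: float, delay_s: float) -> Optional[float]:
    """Constant deceleration (m/s^2) needed to stop at the line after a message delay.

    Returns None when the vehicle has already reached the line before the message arrives.
    """
    v = speed_kmh / 3.6
    remaining = distance_m - v * delay_s  # distance left once the red state is known
    if remaining <= 0.0:
        return None
    return v * v / (2.0 * remaining)


# Hypothetical approach: 30 km/h, red state issued 20 m before the stop line.
a0 = required_deceleration(30.0, 20.0, 0.0)   # about 1.74 m/s^2
a1 = required_deceleration(30.0, 20.0, 1.0)   # about 2.98 m/s^2
a2 = required_deceleration(30.0, 20.0, 2.0)   # about 10.4 m/s^2
```

Under these assumptions the required deceleration grows nonlinearly with delay, which matches the reported pattern: a 1 s delay still allows a safe but less comfortable stop, whereas a 2 s delay leaves too little distance.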
6.1.4. Scenario 211: Stop for Priority Obstacle
As shown in
Figure 15a,b, the most critical case occurs at 10 m and 50 km/h. However, due to the practical limitations of our test track, we selected the 10 m and 40 km/h case for hybrid testing. The deceleration (−5.75 vs. −5.82 m/s²) and jerk (7.06 vs. 7.14 m/s³) values are nearly identical, ensuring that our real-world test remains highly representative of the most challenging scenario.
6.1.5. Scenario 221: Urgent Stop for Non-Priority Obstacle
For this high-stress scenario, we intentionally chose a collision-involving case for hybrid testing to safely analyze the system’s behavior at its absolute limits.
Figure 16a,b show the results, with the high jerk values (calculated over very short intervals) reflecting the abruptness of the emergency maneuver. The selected case—ego vehicle at 20 km/h with the obstacle at 10 m—produced the highest jerk, providing critical data on the system’s response to an unforeseen, high-risk event.
Figure 17 provides a visual summary of all critical cases identified through virtual simulations and subsequently implemented in our hybrid testing framework. This systematic approach ensures that our real-world validation is focused, efficient, and targeted at the most challenging aspects of the ADS’s performance.
6.2. Comparison of Virtual and Hybrid Testing KPIs
A primary goal of this study was to validate the fidelity of our virtual simulator by comparing its outputs with those from the hybrid framework. The strong alignment observed confirms the simulator’s reliability as a tool for ADS development.
Figure 18 demonstrates a strong correlation between virtual and hybrid environments. Ego vehicle speeds at trigger points deviated by less than 2 km/h, and trigger distances by less than 3 m. The minor discrepancy in Scenario 211’s distance is attributed to a premature trigger in the simulator, highlighting a specific area for model refinement.
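The agreement criterion used here can be written as a small check. This is a sketch under the assumption that each run is summarized by its trigger speed and trigger distance; the tolerances are the 2 km/h and 3 m bounds quoted above, while the sample values and names are hypothetical.

```python
from typing import Tuple

SPEED_TOL_KMH = 2.0  # bound on trigger-speed deviation quoted in the text
DIST_TOL_M = 3.0     # bound on trigger-distance deviation quoted in the text


def within_tolerance(sim: Tuple[float, float], hyb: Tuple[float, float]) -> bool:
    """Compare (trigger speed km/h, trigger distance m) of a virtual and a hybrid run."""
    speed_ok = abs(sim[0] - hyb[0]) < SPEED_TOL_KMH
    dist_ok = abs(sim[1] - hyb[1]) < DIST_TOL_M
    return speed_ok and dist_ok


# Hypothetical run summaries, for illustration only.
aligned = within_tolerance((40.0, 10.0), (39.2, 11.5))
misaligned = within_tolerance((40.0, 10.0), (37.5, 10.0))
```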
6.2.1. Decision-Making Performance
The ADS demonstrates consistent and robust decision-making across all scenarios.
Figure 19 summarizes the decision outcomes, with virtual test results shown as a dotted blue line (SIMU) and hybrid test results as a dotted orange line (XP).
Speed adaptation behavior is presented in
Figure 19a. A speed reduction from 40 to 30 km/h occurs only in the Priority Obstacle scenario (211), where adaptation is required for safe intersection crossing. In the obstacle scenario (221), the lower initial speed (20 km/h) already ensures safety, and no adjustment is made. Likewise, in the traffic light scenarios (111, 121, and 134), all executed below 30 km/h, the ADS maintains its speed. Speed adaptation decisions are identical in simulation and hybrid testing.
Stop decision-making results from virtual and hybrid tests are presented in
Figure 19b. At most one stop decision is made per scenario, except in scenario 121, where the ADS correctly reaccelerates instead of stopping. Decision behavior is fully aligned between simulation and real-world tests.
Figure 20a confirms decision consistency across testing modalities. The robustness of the decision-making process is quantified by the number of changes (0, 1, 2, 3) in the decision to stop.
For all scenarios, both in virtual tests and in hybrid tests, the decision is made only once, demonstrating robust decision-making across the different scenarios tested and similar behavior in both simulation and real-world conditions.
In the restart scenario 121, this single decision is to abandon the stop at the traffic light once the green light is received and to accelerate instead.
Decision-making delays were negligible in hybrid tests (27–32 ms), well within the 40 ms control cycle (
Figure 20b), indicating that real-world hardware has no adverse impact on the system’s reaction time. The virtual tests result in almost zero delay in decision making by the ADS algorithm.
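Both decision KPIs can be computed from logged data with a few lines of code. The sketch below assumes that the stop decision is logged as one boolean per control cycle and that the decision delay is available in milliseconds; these log formats and the function names are assumptions, not the format used in the experiments.

```python
from typing import List

CONTROL_CYCLE_MS = 40.0  # control cycle stated in the text


def count_decision_changes(stop_flags: List[bool]) -> int:
    """Number of times the logged stop decision flips between consecutive cycles."""
    return sum(1 for prev, cur in zip(stop_flags, stop_flags[1:]) if prev != cur)


def within_control_cycle(delay_ms: float) -> bool:
    """True if a decision delay fits inside one control cycle."""
    return delay_ms < CONTROL_CYCLE_MS


# Hypothetical logs: a single, stable stop decision versus an oscillating one.
robust = count_decision_changes([False, False, True, True, True])
unstable = count_decision_changes([False, True, False, True])
```

A count of one corresponds to the robust behavior reported for all scenarios, and the measured hybrid delays of 27 to 32 ms pass the cycle check.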
The ADS control performance is evaluated using three metrics: control accuracy, control delay, and collision occurrence. These aspects are analyzed in the following subsections.
6.2.2. Analysis of ADS Vehicle Control Accuracy
The ADS vehicle control accuracy, shown in
Figure 21, is quantified as the residual distance between a reference point (e.g., the stop line or the entrance to a collision zone) and the front bumper of the ADS (ego) vehicle. Positive distance values in
Figure 21 indicate that the ADS is still approaching the reference point, and negative values indicate that it has passed the reference point. In safety-critical scenarios, larger residual errors are expected due to aggressive braking and increased system dynamics. The only exceptions are the traffic light delay scenario (134), which represents nominal braking, with a control error of
m in simulation and
m in real-world tests, and the traffic light resume scenario (121), in which the vehicle does not come to a complete stop.
The analysis focuses on differences between simulated and real-world results. For the traffic light scenarios (111, 121, and 134), the control accuracy values are closely aligned, indicating that the simulation faithfully represents real vehicle behavior. In contrast, larger discrepancies are observed in obstacle scenarios, with differences of m for the priority obstacle scenario (211) and m for the non-priority obstacle scenario (221).
These discrepancies primarily arise from two factors:
1. Braking dynamics mismatch: Obstacle scenarios (211 and 221) require significantly higher deceleration than traffic light scenarios (111, 121, and 134). The simulated vehicle model relies on simplified assumptions and cannot fully capture real-world effects under high dynamic loads, including tire–road adhesion variability, elastomer deformation, tire micro-slip, and vehicle body motion.
2. Obstacle triggering timing differences: In real-world tests, the XiL simulator triggered obstacle events in scenarios (211 and 221) slightly earlier than in simulation. Consequently, the vehicle initiated braking sooner and stopped at a greater distance from the obstacle. In addition, the obstacle crossed the ego vehicle’s path earlier while the vehicle still exhibited significant longitudinal dynamics, leading to higher braking to preserve safety margins.
Overall, the observed differences in longitudinal control accuracy between simulation and real-world experiments are mainly due to limitations of the simulated vehicle dynamics model and minor mismatches in obstacle-trigger timing.
6.2.3. Analysis of ADS Vehicle Control Delay
The ADS vehicle control delay is quantified as the time interval between the moment the control system detects an event requiring a stop (e.g., red light or potential collision) and the issuance of a corresponding actuator command (braking or acceleration torque). It does not include the communication delay.
As shown in
Figure 22, the observed delays in both the simulation and real-world tests are inherent to the control algorithms themselves. Consequently, the same delay values are found in both cases.
Distinct delays can be observed between traffic light management and intersection management scenarios. For the traffic light cases (111, 121, 134), a delay of one computation cycle (40 ms) occurs before the braking torque command begins to react. This delay is systematic and appears only at the transition moment. It results from the control strategy used to generate the target speed, which always initializes with the measured speed value at the instant of transition, ensuring smooth continuity in the velocity profile.
In contrast, for the intersection scenarios (211, 221), an additional delay of one computation cycle (40 ms) is introduced, bringing the total delay to 80 ms. This extra delay corresponds to a confirmation strategy for validating the stop decision. Experiments performed on the XiL simulator demonstrated that this additional safety step was not strictly necessary, given the high robustness of the decision-making process.
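The two delay values follow directly from the number of computation cycles involved. The sketch below assumes the 40 ms cycle stated above and adds an illustrative estimate of the distance travelled during the delay at constant speed; the function names are hypothetical.

```python
CYCLE_S = 0.040  # one computation cycle, as stated in the text


def control_delay_s(confirmation_cycles: int = 0) -> float:
    """One cycle for target-speed initialization plus optional confirmation cycles."""
    return (1 + confirmation_cycles) * CYCLE_S


def distance_during_delay_m(speed_kmh: float, delay_s: float) -> float:
    """Distance covered at constant speed before the actuator command reacts."""
    return speed_kmh / 3.6 * delay_s


traffic_light_delay = control_delay_s(0)   # 0.04 s for scenarios 111, 121, 134
intersection_delay = control_delay_s(1)    # 0.08 s for scenarios 211, 221
travelled = distance_during_delay_m(40.0, intersection_delay)  # about 0.89 m at 40 km/h
```

At 40 km/h the extra confirmation cycle costs roughly 0.44 m of travel, which gives a sense of what removing it would recover.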
6.2.4. Analysis of ADS Vehicle Control in Terms of Collisions
The experimental results presented in
Figure 23 highlight a single case in which a collision occurred between the ADS vehicle and an obstacle at an intersection. This corresponds to Scenario 221, where a non-priority obstacle intrudes abruptly in front of the ADS vehicle with a high acceleration of approximately 2 m/s².
This scenario is of particular importance, as it represents one of the first instances where such a critical situation could be tested under real-world conditions, enabling the validation of the longitudinal control system’s behavior in extreme edge cases.
In Scenario 221, the obstacle cuts in very close to the ego vehicle, violating its right of way by crossing the first lane. The situation is further complicated because, at the beginning of its acceleration, the obstacle could potentially turn right, cross the ADS lane, or turn left. This uncertainty makes collision risk assessment at the intersection especially challenging. In both the simulation and real-world tests, a collision ultimately occurred, confirming that the event represents an extreme limit case for the current control strategy.
6.3. V2X Communication Performance
The V2X communication system exhibits high reliability, achieving an almost 100% CAM delivery ratio across all scenarios in hybrid testing, as shown in
Figure 24a. Within the hybrid framework, CAMs are generated by the real ADS and transmitted by real OBUs over the Internet to an MQTT server. The XiL simulator subscribes to these CAMs and uses them to update the virtual ADS vehicle dynamics. All transmitted CAMs were successfully received by the XiL simulator, resulting in a near-perfect delivery ratio.
Because CAM transmission relies on a real wireless communication chain, environmental conditions, channel effects, and the effective distance between transmitter and receiver are inherently captured in the measurements.
Figure 24b presents the average end-to-end latency of CAM transmission from the ADS to the XiL simulator. The maximum observed average latency is approximately 142 ms in scenario 221. This value remains below the maximum acceptable delay of approximately 200 ms for the deployed ADS vehicle control, beyond which control oscillations may occur. The measured communication latency contributes to a realistic end-to-end sensing delay, while still constituting a performance bottleneck that impacts comfort and safety margins, as observed in scenario 134. Moreover, these results highlight the potential for further optimization of the XiL-based hybrid testing setup.
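The two communication KPIs can be derived from message logs as sketched below. The sketch assumes that each CAM carries a unique identifier, that transmit and receive timestamps are logged in seconds, and that the clocks of the OBU and the XiL simulator are synchronized; the log structure, names, and sample values are hypothetical.

```python
from typing import Dict

MAX_ACCEPTABLE_LATENCY_MS = 200.0  # limit quoted in the text for the deployed control


def delivery_ratio(sent: Dict[int, float], received: Dict[int, float]) -> float:
    """Fraction of transmitted messages that appear in the receiver log."""
    if not sent:
        return 0.0
    return sum(1 for msg_id in sent if msg_id in received) / len(sent)


def mean_latency_ms(sent: Dict[int, float], received: Dict[int, float]) -> float:
    """Average end-to-end latency (ms) over the delivered messages."""
    delays = [(received[m] - sent[m]) * 1000.0 for m in sent if m in received]
    return sum(delays) / len(delays) if delays else float("nan")


# Hypothetical logs: message id -> timestamp in seconds.
sent = {1: 0.00, 2: 0.10, 3: 0.20}
received = {1: 0.12, 2: 0.24, 3: 0.36}
ratio = delivery_ratio(sent, received)
latency = mean_latency_ms(sent, received)
acceptable = latency < MAX_ACCEPTABLE_LATENCY_MS
```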
6.4. Passenger Comfort
Figure 25a presents the average acceleration and deceleration measured in both virtual and hybrid tests across all scenarios. In
Figure 25, positive values correspond to acceleration, while negative values indicate deceleration. This metric characterizes the overall control behavior during braking phases and during re-acceleration after a stop, as in the traffic light restart scenario 121. The results show closely matched behavior between simulation and real-world tests for all traffic light scenarios (111, 121, and 134) and for the priority obstacle scenario 211.
A noticeable deviation, of approximately 1 m/s², is observed only in the non-priority obstacle scenario 221. This discrepancy is consistent with previously identified differences in the initial obstacle detection distance between simulation and track tests.
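The average and peak acceleration and deceleration KPIs can be extracted from a longitudinal acceleration trace as in the following sketch. It assumes the sign convention used here (positive for acceleration, negative for deceleration) and averages each phase separately; the exact averaging window used in the study is not specified, and the sample trace and names are hypothetical.

```python
from typing import Dict, List


def comfort_kpis(accel: List[float]) -> Dict[str, float]:
    """Average and peak acceleration/deceleration (m/s^2) of a longitudinal trace."""
    acc = [a for a in accel if a > 0.0]
    dec = [a for a in accel if a < 0.0]
    return {
        "avg_accel": sum(acc) / len(acc) if acc else 0.0,
        "peak_accel": max(acc) if acc else 0.0,
        "avg_decel": sum(dec) / len(dec) if dec else 0.0,
        "peak_decel": min(dec) if dec else 0.0,  # most negative sample
    }


# Hypothetical trace: brief acceleration followed by braking.
kpis = comfort_kpis([0.5, 1.0, -2.0, -4.0, 0.0])
```

Comparing the resulting dictionaries for a virtual and a hybrid run gives the per-scenario deviations discussed in this section.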
Figure 25b shows the peak acceleration and deceleration values obtained in the virtual and hybrid experiments. Among the traffic light scenarios (111, 121, and 134), scenarios 121 and 134 exhibit very close agreement between simulation and real tests. In scenario 121, vehicle dynamics are minimal, leading to nearly identical behavior of the simulated and real vehicles. In scenario 134, higher dynamics are involved (approximately 4 m/s² at an average speed of 30 km/h), and the simulated model slightly underestimates the peak deceleration.
Scenario 111 (late stop at a traffic light) involves moderate dynamics at low speed (<10 km/h). In this case, the simulated vehicle underestimates deceleration by approximately 1 m/s², resulting in a longer stopping distance in simulation than in the real test. These traffic light scenarios highlight a difference between the simulated vehicle model and the real vehicle during moderate to strong braking at very low speeds. Nevertheless, the simulation remains representative of real-world behavior.
Larger discrepancies are observed in the obstacle scenarios (211 and 221). In scenario 211, real-world tests exhibit lower peak deceleration due to earlier obstacle triggering (approximately 3 m), which provides the ego vehicle with a longer stopping horizon and allows smoother braking, thereby improving passenger comfort. As a result, a direct comparison with the simulation is not meaningful for this scenario due to differing initial conditions.
In scenario 221, the non-priority obstacle appears very late while the vehicle travels at an average speed of 20 km/h. This scenario requires the strongest braking of all cases (exceeding 5 m/s²). As the vehicle decelerates below 10 km/h, the deceleration remains high (around 5 m/s²), a regime in which the simulated vehicle model again underestimates the real vehicle response by approximately 1 m/s².
Overall, the obstacle scenarios confirm the observed limitations of the simulated vehicle model during strong braking at low speeds, while still demonstrating consistent qualitative behavior between virtual and hybrid tests.
6.5. Summary and Validation of ADS Performance
Figure 26 provides a consolidated overview of ADS performance across key decision-making and longitudinal control KPIs. The acceptance criteria for stop/restart and speed adaptation decisions are defined by the expected outcomes of each scenario, while robustness and decision-making delay thresholds follow the specifications of the deployed ADS stack.
As shown in
Figure 19a, a single speed adaptation occurs in Scenario 211, which is the expected behavior. The corresponding KPI criterion is therefore satisfied. In all other scenarios, no speed adaptation is required, and the absence of such decisions is consistent with the KPI definition. Consequently, all speed adaptation decision KPIs are met, as summarized in
Figure 26.
Similarly,
Figure 19b shows that stop decisions are correctly taken in all scenarios where stopping is required, except in Scenario 121, where the ADS is expected to reaccelerate instead of stopping. This behavior is consistent with the scenario definition, and the stop decision KPI is therefore satisfied, as reflected in
Figure 26.
Decision-making robustness, measured by the number of ADS state transitions, is also achieved. In all scenarios, the ADS exhibits at most one state change corresponding to either a stop/go decision or a speed adaptation decision. Decision-making delay requirements are met across all tested scenarios.
Figure 26 also summarizes longitudinal control performance in terms of vehicle control accuracy and control delay. Control accuracy thresholds are scenario-dependent, while control delay thresholds are defined by ADS stack requirements. High control accuracy is observed when sufficient anticipation time is available, as in Scenario 134, where the ADS performs an immediate stop and halts within
m of the intersection line.
In Scenarios 111, 211, and 221, strict stopping accuracy is not prioritized, as the ADS emphasizes passenger comfort and safety through smooth, bounded deceleration within the available safe distance. During stopping phases, the speed planner targets an appropriate stopping point rather than enforcing an exact stop-line position. For instance, in Scenario 111, when the traffic light changes at very low speed near the intersection, attempting to stop exactly at the stop line would require abrupt braking. Instead, the ADS allows a limited overshoot within a permissible distance, selecting a stopping target that ensures comfortable and safe deceleration. In Scenario 121, stopping accuracy is not evaluated because the ADS correctly aborts the stop and resumes motion.
Regarding control delay, the measured vehicle response time is approximately 40 ms for traffic light scenarios (111, 121, and 134) and 80 ms for scenarios involving connected perception (211 and 221).
Figure 27 shows the comfort KPIs obtained from the hybrid tests. The indicated comfort KPI thresholds for longitudinal control are taken from
Figure 2 in [
34]. A clear and significant trade-off emerges between safety and comfort: while traffic light scenarios (111, 121, 134) produced comfort levels ranging from normal to moderately aggressive, obstacle scenarios (211, 221) elicited aggressive to urgent discomfort ratings. The collision in Scenario 221, classified as “urgent,” underscores the sharp decline in perceived comfort under unpredictable conditions. This finding is pivotal, revealing that unexpected events can rapidly degrade passenger experience.
Joint analysis of the longitudinal vehicle control accuracy in
Figure 26 and comfort KPIs in
Figure 27 reveals a deliberate decision-making pattern within the ADS. Although the system is capable of executing more abrupt braking maneuvers—for example, in Scenario 111, it could have stopped within 1 m at 10 km/h—it intentionally utilized the full available stopping distance. This behavior reflects a design philosophy prioritizing predictability and stability over sheer precision. By smoothing the deceleration curve, the ADS minimizes the risk of startling following drivers, thereby reducing the likelihood of secondary collisions.
This approach demonstrates that the ADS actively balances physical capability with situational awareness to enhance both comfort and safety. Rather than opting for the harshest possible braking, it executes a controlled, optimized deceleration that mitigates panic-inducing jolts. Although this results in an “aggressive” or “urgent” comfort classification, it represents a calculated compromise aimed at preserving composure during safety-critical events. The system’s effectiveness is further supported by the extreme Scenario 221, where it limited the outcome to a minor collision and avoided a major one. These outcomes confirm that the ADS’s controlled deceleration strategy, which favors stability and predictability over raw stopping power, is not only intentional but crucial to maintaining safety and user acceptance at the limits of autonomous operation.
Overall, the results summarized in
Figure 26 indicate that all defined KPI acceptance criteria are satisfied across the tested scenarios. In addition, the passenger comfort KPIs shown in
Figure 27 remain within acceptable limits as defined in
Figure 2 of [
34], with the exception of Scenario 221. In this scenario, a collision is unavoidable; however, the ADS decision-making and control actions result in a minor collision, thereby mitigating the severity and preventing a more critical outcome. This behavior is considered acceptable within the scope of the study. Consequently, all scenarios are regarded as successfully validated.
7. Discussion
The results of this study demonstrate that the SUNRISE SAF provides an effective and practical structure for the validation of connected ADS in complex urban intersection scenarios. By following the overall SAF methodology, we establish a systematic validation process that supports scenario-based evaluation and safety reasoning. The strong consistency observed between virtual and hybrid testing results (speed deviations below 2 km/h and distance differences under 3 m) highlights the fidelity of the proposed simulation and hybrid testing approach. These findings support the use of virtual and hybrid testing as primary tools for ADS development and validation, substantially reducing reliance on extensive and costly real-world testing while maintaining confidence in safety performance.
The analysis of V2X communication performance highlights both strengths and limitations. On the positive side, the near-perfect CAM delivery ratio demonstrates the reliability of the communication stack and confirms the benefits of connected perception for extending situational awareness. However, the maximum average CAM latency of approximately 142 ms (Scenario 221) emerged as a significant bottleneck in hybrid testing using the XiL simulator. In particular, the delay in the intersection manager’s transmission of SPATEM messages directly affected passenger comfort in Scenario 134, requiring more abrupt braking.
These observations provide a clear quantitative target for future improvements: reducing V2X latency to the 40–80 ms range is critical to enhance both safety margins and passenger experience in connected mobility applications. Ideally, in hybrid tests, the simulator’s connectivity latency should be lower than the latency between real devices; otherwise, the results risk being skewed and less representative of real-world performance.
Furthermore, our results highlight a fundamental trade-off between safety and comfort. Based on comfort criteria from
Figure 2 in [
34], in nominal traffic light scenarios (111, 121, 134), the ADS provided a comfortable ride. However, in obstacle scenarios (211, 221), the system’s reactive safety logic resulted in aggressive to urgent levels of braking. The collision in Scenario 221 was minor and unavoidable, but the ADS successfully prevented a major collision. This scenario provided invaluable insights into the system’s behavior at the absolute limits of its operational envelope, highlighting the challenges of responding to unpredictable non-priority violations. Such trade-offs are critical for the public acceptance of ADS, as passenger comfort is as important as safety for achieving widespread adoption.
8. Conclusions
This study validates an ADS equipped with connected perception in urban intersection scenarios by following the overall structure, methodology, and guidelines of the SUNRISE SAF. By combining virtual simulations with a hybrid testing environment integrating real-world components, we demonstrate a structured and scalable approach for assessing ADS performance across a range of operating conditions, from nominal driving situations to safety-critical edge cases.
The results confirm the ADS’s ability to make consistent and reliable decisions, the high reliability of V2X communication, and the ability of the ADS to provide a comfortable driving experience under nominal conditions. At the same time, the analysis highlights important challenges, notably the influence of V2X communication latency on comfort and safety margins, as well as elevated deceleration levels during reactive obstacle avoidance. These findings emphasize the need for further improvements to achieve smoother and more human-like autonomous driving behavior.
Overall, this study leads to the following conclusions: (1) A structured application of the SUNRISE Safety Assurance Framework effectively supports the validation of a connected ADS when applied in detail where necessary. (2) The analysis of representative and safety-critical urban intersection scenarios demonstrates the capability of the proposed approach to capture both nominal and edge-case behaviors. (3) A hybrid testing framework integrating a real ADS with an XiL simulator successfully bridges the gap between purely virtual simulations and real-world testing. (4) The quantitative evaluation confirms that V2X communication delays have a measurable impact on ADS performance. (5) The comparison between virtual and hybrid testing shows strong overall agreement in ADS decision-making and control behavior.
Together, these conclusions support the use of combined virtual and hybrid testing as a reliable methodology for validating ADS with connected perception in complex urban environments.
Although the ADS configuration used in this study relies exclusively on V2I-based perception, future ADS deployments may combine onboard and infrastructure-supported perception. Managing potential conflicts between these information sources—particularly for dynamic obstacles—remains an open research challenge. Conflict handling for topological information such as traffic lights and intersections has been addressed in the present architecture, while ongoing work at VEDECOM focuses on multi-sensor data fusion and arbitration strategies for obstacle perception, which will be integrated into future experimental vehicles once sufficient maturity is reached.
Author Contributions
Conceptualization, M.S.A., A.W. and P.M.; methodology, M.S.A. and A.W.; software, M.S.A., A.M., W.J. and A.W.; validation, M.S.A., A.M., W.J. and A.W.; formal analysis, M.S.A. and A.W.; investigation, M.S.A., A.M. and A.W.; resources, P.M. and M.-C.R.; data curation, A.M. and A.W.; writing—original draft preparation, M.S.A.; writing—review and editing, M.S.A., A.W., P.M. and M.-C.R.; visualization, A.W.; supervision, P.M. and M.-C.R.; project administration, P.M. and M.-C.R.; funding acquisition, M.-C.R. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the European Union’s Horizon Europe programme under the SUNRISE project (Grant Agreement No. 101069573) and supported by the French State through the VEDECOM Institute of Energy Transition under the Programme d’Investissements d’Avenir (France 2030), including publication costs via the HY5 project.
Data Availability Statement
No new data were created or analyzed in this study.
Acknowledgments
The authors would like to thank Benoît Lusetti for his valuable contributions to the preparation and execution of the experiments.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- World Health Organization. Global Status Report on Road Safety 2018. Available online: https://www.who.int/publications/i/item/9789241565684 (accessed on 29 January 2026).
- SUNRISE Consortium. D2.3 Final SUNRISE Safety Assurance Framework. Deliverable, SUNRISE Project. 2025. Available online: https://ccam-sunrise-project.eu/deliverable/d2-3-final-sunrise-safety-assurance-framework/ (accessed on 29 January 2026).
- ISO 26262-1:2018; Road Vehicles—Functional Safety—Part 1: Vocabulary. ISO: Geneva, Switzerland, 2018. Available online: https://www.iso.org/standard/68383.html (accessed on 29 January 2026).
- ISO 21448:2022; Road Vehicles—Safety of the Intended Functionality. ISO: Geneva, Switzerland, 2022. Available online: https://www.iso.org/standard/77490.html (accessed on 29 January 2026).
- United Nations Economic Commission for Europe (UNECE). New Assessment/Test Method for Automated Driving (NATM)—Master Document; UNECE: Geneva, Switzerland, 2021; Available online: https://unece.org/sites/default/files/2021-01/GRVA-09-07e.pdf (accessed on 29 January 2026).
- ASAM OpenODD®. Operational Design Domain Standard (OpenODD). Available online: https://www.asam.net/standards/detail/openodd/ (accessed on 29 January 2026).
- ASAM OpenSCENARIO® XML. Standard for Description of Dynamic Driving Scenarios. Available online: https://www.asam.net/standards/detail/openscenario-xml/ (accessed on 29 January 2026).
- Betschinske, D.; Schrimpf, M.; Peters, S.; Klonecki, K.; Karch, J.P.; Lippert, M. Towards more efficient quantitative safety validation of residual risk for assisted and automated driving. arXiv 2025, arXiv:2506.10363.
- Zhao, X.; Chen, H.; Li, H.; Li, X.; Chang, X.; Feng, X.; Chen, Y. Development and application of connected vehicle technology test platform based on driving simulator: Case study. Accid. Anal. Prev. 2021, 161, 106330.
- UNECE. World Forum for Harmonization of Vehicle Regulations (WP.29). In Guidelines and Recommendations for Automated Driving System Safety Requirements, Assessments and Test Methods to Inform Regulatory Development; United Nations Economic Commission for Europe (UNECE): Geneva, Switzerland, 2024; Available online: https://unece.org/sites/default/files/2024-11/ECE-TRANS-WP.29-2024-39e.pdf (accessed on 29 January 2026).
- Chen, B.; Li, T. Formal Modeling and Verification of Autonomous Driving Scenario. In Proceedings of the 2021 IEEE International Conference on Information Communication and Software Engineering (ICICSE), Chengdu, China, 19–21 March 2021; pp. 313–318.
- Cho, K.; Park, C.; Lee, H. A Study on Longitudinal Motion Scenario Design for Verification of Advanced Driver Assistance Systems and Autonomous Driving Systems. Appl. Sci. 2023, 13, 716.
- Queiroz, R.; Sharma, D.; Caldas, R.; Czarnecki, K.; Garcia, S.; Berger, T.; Pelliccione, P. A Driver-Vehicle Model for ADS Scenario-based Testing. IEEE Trans. Intell. Transp. Syst. 2024, 25, 8641–8654.
- Xu, Z.; Wang, X.; Wang, X.; Zheng, N. Safety validation for connected autonomous vehicles using large-scale testing tracks in high-fidelity simulation environment. Accid. Anal. Prev. 2025, 215, 108011.
- Xu, R.; Guo, Y.; Han, X.; Xia, X.; Xiang, H.; Ma, J. OpenCDA: An open cooperative driving automation framework integrated with co-simulation. In 2021 IEEE International Intelligent Transportation Systems Conference (ITSC); IEEE: New York, NY, USA, 2021; pp. 1155–1162.
- Coppola, A.; Mungiello, A.; Pane, G.; Petrillo, A.; Santini, S. On the Virtual Testing of ADAS in CCAM environment via Vehicle-in-the-Loop framework. IFAC-PapersOnLine 2025, 59, 151–156.
- Shoukat, M.U.; Yan, L.; Yan, Y.; Zhang, F.; Zhai, Y.; Han, P.; Nawaz, S.A.; Raza, M.A.; Akbar, M.W.; Hussain, A. Autonomous driving test system under hybrid reality: The role of digital twin technology. Internet Things 2024, 27, 101301.
- Cheli, F.; Gobbi, M.; Mastinu, G.; Melzi, S.; Previati, G.; Sabbioni, E.; Cappiello, A.; Luè, A. A New Generation Cable-Driven Dynamic Driving Simulator for the Assessment of CCAM Deployment. In Transport Research Arena Conference; Springer Nature: Cham, Switzerland, 2024; pp. 667–673.
- Rodríguez-Arozamena, M.; Matute, J.; Pérez, J.; Ozbay, B.; Tezcan, D.; Begecarslan, E.; Mutlukaya, I.; Buquerin, K.G.; Volkersdorfer, T.; Hof, H.J. A Fail-Safe Decision Architecture for CCAM Applications. In Transport Research Arena Conference; Springer Nature: Cham, Switzerland, 2024; pp. 731–737.
- ETSI. ETSI TR 103 562: Intelligent Transport Systems (ITS); Vehicular Communications; Basic Set of Applications; Analysis of the Collective Perception Service (CPS); Release 2; Technical Report TR 103 562 V2.1.1; European Telecommunications Standards Institute (ETSI): Sophia Antipolis, France, 2019.
- European Telecommunications Standards Institute (ETSI). ETSI TS 103 324: Intelligent Transport Systems (ITS); Vehicular Communications; Basic Set of Applications; Collective Perception Service; Release 2; European Telecommunications Standards Institute (ETSI): Sophia Antipolis, France, 2023.
- Sluis, J.v.d.; Op den Camp, O.; Broos, J.; Yalcinkaya, I.; de Gelder, E. Describing I2V Communication in Scenarios for Simulation-Based Safety Assessment of Truck Platooning. Electronics 2021, 10, 2362.
- Frigerio, E. Design of an ADAS for Blind Intersections Leveraging I2V Communication. Master’s Thesis, Politecnico di Milano, Milan, Italy, 2024.
- Talebpour, A.; Mahmassani, H.S. Influence of connected and autonomous vehicles on traffic flow stability and throughput. Transp. Res. Part C Emerg. Technol. 2016, 71, 143–163.
- European Telecommunications Standards Institute (ETSI). Intelligent Transport Systems (ITS); Vehicular Communications; Basic Set of Applications; Facilities Layer Protocols and Communication Requirements for Infrastructure Services; ETSI TS 103 301 V2.2.1; SPATEM (Signal Phase and Timing Extended Message) Defined as Part of TLM Service; ETSI: Sophia Antipolis, France, 2024.
- SUNRISE Consortium. D3.2 Report on Requirements on Scenario Concepts Parameters and Attributes. Deliverable, SUNRISE Project. 2025. Available online: https://ccam-sunrise-project.eu/deliverable/d3-2-report-on-requirements-on-scenario-concepts-parameters-and-attributes/ (accessed on 29 January 2026).
- SUNRISE Consortium. SUNRISE Project Public Deliverables. 2023. Available online: https://ccam-sunrise-project.eu/deliverables/ (accessed on 29 January 2026).
- SUNRISE Consortium. D4.1 Report on Relevant Subsystems to Validate CCAM Systems. Deliverable, SUNRISE Project. 2025. Available online: https://ccam-sunrise-project.eu/deliverable/d4-1-report-on-relevant-subsystems-to-validate-ccam-systems/ (accessed on 29 January 2026).
- SUNRISE Consortium. D4.4 Report on the Harmonised V&V Simulation Framework. Deliverable, SUNRISE Project. 2024. Available online: https://ccam-sunrise-project.eu/deliverable/d4-4-report-on-the-harmonised-vv-simulation-framework/ (accessed on 29 January 2026).
- Varga, A. Discrete event simulation system. In Proceedings of the European Simulation Multiconference (ESM’2001), Prague, Czech Republic, 6–9 June 2001; Volume 17.
- Sommer, C.; German, R.; Dressler, F. Bidirectionally Coupled Network and Road Traffic Simulation for Improved IVC Analysis. IEEE Trans. Mob. Comput. 2011, 10, 3–15.
- Lopez, P.A.; Behrisch, M.; Bieker-Walz, L.; Erdmann, J.; Flötteröd, Y.P.; Hilbrich, R.; Lücken, L.; Rummel, J.; Wagner, P.; Wießner, E. Microscopic Traffic Simulation using SUMO. In 21st IEEE International Conference on Intelligent Transportation Systems; IEEE: New York, NY, USA, 2018.
- SUNRISE Consortium. D3.3 Report on the Initial Allocation of Scenarios to Test Instances. Deliverable, SUNRISE Project. 2024. Available online: https://ccam-sunrise-project.eu/deliverable/d3-3-report-on-the-initial-allocation-of-scenarios-to-test-instances/ (accessed on 29 January 2026).
- Bae, I.; Moon, J.; Seo, J. Toward a comfortable driving experience for a self-driving shuttle bus. Electronics 2019, 8, 943.
Figure 1.
The SUNRISE SAF [2] provides a structured methodology for validating the safety of CCAM systems.
Figure 2.
Scenario 111: Nominal stop at a red traffic light.
Figure 3.
Scenario 121: Resuming motion upon a green signal.
Figure 4.
Scenario 134: Stopping behavior under delayed I2V communication.
Figure 5.
Scenario 211: Yielding to a late-appearing priority obstacle from the right.
Figure 6.
Scenario 221: Emergency braking for a non-priority obstacle forcing the right-of-way.
Figure 7.
Architecture of the virtual testing simulator in MATLAB/Simulink.
Figure 8.
Architecture of the hybrid testing framework, integrating the XiL simulator with real-world components.
Figure 9.
Execution snapshots of traffic light scenarios in the hybrid testing environment. (a) Scenario 111: Nominal stop. (b) Scenario 121: Resume on green. (c) Scenario 134: Delayed communication.
Figure 10.
Execution snapshots of connected perception scenarios in the hybrid testing environment. (a) Scenario 211: Stop for priority obstacle. (b) Scenario 221: Emergency stop for non-priority obstacle.
Figure 11.
Traffic light violation analysis for Scenario 111, revealing a critical case with a 3 s orange phase where a red-light violation occurs. (a) Stop/go decisions with a 3 s orange phase. (b) Stop/go decisions with a 5 s orange phase.
Figure 12.
Comfort-related KPIs for Scenario 111, showing the most uncomfortable case is identified by the peak jerk, not by the peak deceleration (highlighted with red circles). (a) Deceleration. (b) Jerk.
Figure 13.
Analysis of jerk during re-acceleration in Scenario 121. The most uncomfortable point occurs during the low-speed transition and corresponds to the peak jerk, highlighted by the red circle.
Figure 14.
Scenario 134: jerk as a function of V2I communication delay, showing a clear trend of increasing discomfort with latency. The peak jerk is marked by the red circle.
Figure 15.
Scenario 211: Deceleration and jerk metrics, showing that the 40 km/h case (red circles) is representative of the more extreme 50 km/h case. (a) Deceleration for priority obstacle. (b) Jerk for priority obstacle.
Figure 16.
Scenario 221: Deceleration and jerk metrics for the collision case, with instantaneous jerk peaks highlighting the violence of the event. Peak deceleration and peak jerk are marked by red circles. (a) Deceleration for non-priority obstacle. (b) Jerk for non-priority obstacle.
Figure 17.
A summary map of all critical scenarios selected for hybrid testing based on virtual simulation analysis.
Figure 18.
Comparison of key dynamic parameters between virtual and hybrid tests, showing high fidelity with deviations under 2 km/h and 3 m. (a) Comparison of ego vehicle speeds at trigger points. (b) Comparison of trigger distances to the intersection.
Figure 19.
Analysis of decision-making outcomes, showing consistent behavior across scenarios. (a) Speed adaptation decisions. (b) Stop decisions taken.
Figure 20.
Robustness and latency of the ADS’s decision-making process, showing negligible delays in hybrid tests. (a) Decision-making consistency. (b) Decision-making delays.
Figure 21.
ADS control accuracy.
Figure 22.
ADS control delay.
Figure 23.
Analysis of the collision events in the virtual and hybrid tests.
Figure 24.
V2X communication performance, showing near-perfect CAM delivery ratio (reliability) with non-negligible end-to-end latency. (a) CAM delivery ratio by scenario. (b) CAM average latency by scenario.
Figure 25.
Comparison of comfort KPIs in terms of average and peak acceleration/deceleration between virtual and hybrid tests. (a) Average acceleration/deceleration. (b) Peak acceleration/deceleration.
Figure 26.
Summary of decision-making and control performance KPIs, reflecting overall ADS efficacy.
Figure 27.
Detailed analysis of comfort KPIs obtained from hybrid tests, highlighting the trade-off between safety and comfort in obstacle scenarios.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.