1. Introduction
With the rapid development of modern industry, the number of vehicles worldwide continues to rise, and frequent traffic accidents have had a profound impact on the economy and society, making traffic accident prevention research increasingly urgent [1]. Accidents are especially frequent at unsignalized intersections, owing to obstructed driver visibility, high vehicle speeds, frequent crossings by non-motorized vehicles and pedestrians, and trajectory uncertainty. To address the intense conflicts between vehicles, non-motorized vehicles, and pedestrians at unsignalized intersections, it is necessary to explore a new method for intersection conflict warning that reduces accident rates and improves traffic safety.
Intersection Collision Warning (ICW) technology is an important component of vehicle active safety technology. Early collision warning systems primarily relied on radar and cameras to perceive the surrounding environment and issue collision warnings to the driver [2]. However, the limitations of traditional sensors (such as obstruction by obstacles) have restricted the effectiveness and further development of collision warning systems [3]. In recent years, the rise of technologies such as Vehicle-to-Vehicle Communication (V2V) and Vehicle-to-Infrastructure Communication (V2I) has enabled traffic participants to achieve comprehensive and real-time information sharing, which has become an active topic in collision risk identification and warning research.
The collision risk at unsignalized intersections is essentially an issue of motion conflicts between traffic participants. Commonly used conflict risk identification indicators are mainly divided into two categories: time-space conflict indicators and probability conflict indicators. Time-space conflict-based warning methods predict potential collision risks by evaluating the time and spatial relationships between vehicles. Among them, Time to Collision (TTC) is a commonly used evaluation indicator. For example, Miller et al. [4] proposed an intersection collision detection method based on TTC, performing risk assessment by calculating the TTC of vehicles. Lan et al. [5], addressing the conflict between right-turning vehicles and pedestrians, used TTC, Post Encroachment Time (PET), and Deceleration to Safety Time (DST) as evaluation indicators. They found that TTC and PET were negatively correlated with the severity of conflict, while DST was positively correlated. Guo et al. [6,7,8,9] proposed an improved TTC calculation method and optimized the application of rectangular and elliptical models in conflict detection based on geometric models, enhancing calculation accuracy and real-time performance. In addition, safety distance warning methods assess risk by analyzing the distance change in vehicle movement. Han et al. [10] proposed a longitudinal and lateral TTC difference model for collision scenarios between cars and electric two-wheelers, optimizing the trigger strategy of the Autonomous Emergency Braking (AEB) system, improving collision avoidance rates, and reducing collision injuries. Li et al. [11] proposed a location fusion algorithm based on unscented Kalman filtering, improving positioning accuracy and vehicle safety when positioning fails. Zhao et al. [12], based on V2I communication, proposed a warning method combining Dynamic Detection of Collision (DDTC) and Collision Avoidance with Circular Trajectory (CATC), which reduced missed and false alarms.
Although time-space conflict-based warning methods have the advantages of low computational cost and ease of operation, they rely on a single indicator (such as TTC or PET), making it difficult to address complex traffic scenarios. In contrast, probability conflict-based warning methods introduce uncertainty probability models to quantify the collision risk of vehicles or pedestrians. For example, Laugier et al. [13] used the Hidden Markov Model (HMM) and Gaussian processes to identify vehicle behavior and predict future collision risks, and validated the approach in uncertain environments. Joerer et al. [14] evaluated the collision risk of all possible vehicle trajectories using acceleration probability distributions, further improving detection accuracy. Berthelot et al. [15] proposed a TTC probability distribution calculation method based on uncertain inputs, enhancing the system’s reliability by handling input noise. For pedestrian collision risks, Peng et al. [16] proposed a stochastic geometry model for pedestrian collision zones based on V2I communication technology, considering factors such as communication delay and positioning errors, and established a pedestrian-vehicle collision probability model, which was validated through simulation experiments in complex scenarios. Yang et al. [17] constructed a real-time collision risk warning model for pedestrians and vehicles based on catastrophe theory, evaluating it using binary logistic and catastrophe degree methods. Jin et al. [18] proposed a warning method based on intersection vehicle collision probabilities, predicting the vehicle’s future behavior and calculating collision probabilities, which was validated through simulation.
In summary, pedestrian behavior at unsignalized intersections, which are characterized as high-risk scenarios, exhibits considerable uncertainty. Furthermore, traditional collision warning methods suffer from limited capability in modeling trajectory diversity and rely on single risk assessment mechanisms, resulting in inadequate adaptability and warning accuracy in complex mixed traffic environments. To address these challenges, this paper focuses on potential conflicts between vehicles and pedestrians and proposes a multi-fusion collision warning method under the Vehicle-to-Everything (V2X) communication environment, aiming to enhance the traffic safety protection of vulnerable road users (VRUs) in typical unsignalized scenarios. Specifically, the proposed method employs the Monte Carlo (MC) algorithm as the core framework, integrating the social generative adversarial network (Social GAN) and a soft attention mechanism to achieve robust pedestrian trajectory prediction in complex traffic interaction scenarios. Trajectory samples are generated by adding Gaussian noise to the mean trajectory to support the warning algorithm. Moreover, an elliptical buffer zone algorithm is employed to dynamically construct the risk detection module, achieving a collision detection mechanism that balances accuracy and computational efficiency. This study systematically advances from behavior modeling and risk assessment to warning decision-making, with the objective of improving pedestrian risk identification and collision warning capabilities at unsignalized intersections, thereby providing theoretical foundations and technical solutions for enhancing road traffic safety.
The remainder of this paper is organized as follows: Section 2 describes the data sources and preprocessing procedures; Section 3 details the design of each algorithmic module, including vehicle motion modeling, pedestrian trajectory prediction, and risk detection methods; Section 4 develops the integrated warning algorithm framework and introduces the collision probability evaluation mechanism; Section 5 presents simulation experiments and comparative analyses to validate the effectiveness of the proposed method under typical scenarios; finally, Section 6 concludes the paper and discusses future research directions.
3. Modular Algorithms
To develop a collision warning model suitable for unsignalized intersection scenarios, this study constructed a vehicle trajectory motion module, a pedestrian trajectory prediction module, and a risk detection module. Specifically, considering that motor vehicles exhibit relatively stable behavioral patterns and adhere to traffic rules during travel, this study employed a kinematics-based modeling approach for trajectory fitting. In contrast, given the high motion uncertainty and frequent path changes of pedestrians, this study further introduced a generative adversarial network (GAN) to model pedestrian trajectories, aiming to enhance the ability to predict pedestrians’ future movement trends. To address the insufficient spatial conflict recognition accuracy of traditional collision detection methods in dynamic interaction scenarios, this study designed a collision detection algorithm based on the elliptical buffer region in order to improve the spatial collision judgment between pedestrians and motor vehicles at unsignalized intersections.
3.1. Vehicle Trajectory Motion Module
This study adopts a simplified two-dimensional planar kinematics model (see Figure 3) as the vehicle trajectory motion module in the warning algorithm to describe the motion characteristics of vehicles at unsignalized intersections, including state parameters such as vehicle position, velocity, and acceleration. Based on real-time vehicle state data provided by V2X communication, a vehicle state vector is constructed to achieve dynamic updating of the vehicle’s position. On this basis, the future trajectory of the vehicle on the plane is further predicted, providing necessary trajectory information support for the subsequent collision risk warning algorithm, as detailed in Equations (1) and (2).
In the equations, $x_t$ and $y_t$ represent the position coordinates of the vehicle on the two-dimensional plane at time $t$, while $x_{t-1}$ and $y_{t-1}$ represent the vehicle’s position at the previous time step. The vehicle’s velocity components at time $t$ are denoted by $v_{x,t}$ and $v_{y,t}$, corresponding to the velocities in the $x$ and $y$ directions, respectively.
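The position update described above can be illustrated with a minimal Python sketch. It assumes the simplest form consistent with the definitions, in which the position advances by velocity times a fixed time step; the names `VehicleState`, `step`, and `predict_trajectory` are hypothetical and are not taken from the paper, and acceleration is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class VehicleState:
    x: float   # position along x (m)
    y: float   # position along y (m)
    vx: float  # velocity along x (m/s)
    vy: float  # velocity along y (m/s)


def step(state: VehicleState, dt: float) -> VehicleState:
    """Advance the position by one time step, holding velocity constant (assumption)."""
    return VehicleState(state.x + state.vx * dt,
                        state.y + state.vy * dt,
                        state.vx, state.vy)


def predict_trajectory(state: VehicleState, dt: float, n_steps: int):
    """Roll the update forward and return the predicted (x, y) points."""
    points = []
    for _ in range(n_steps):
        state = step(state, dt)
        points.append((state.x, state.y))
    return points
```

In a V2X setting, the initial `VehicleState` would be refreshed whenever a new state message arrives, and the rollout would be repeated from the latest state.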
3.2. Pedestrian Trajectory Prediction Module
Based on the Social GAN model [19], this paper proposes an improved Attention-based Social GAN (AS-GAN) model for developing the pedestrian trajectory prediction module in the warning algorithm. The overall model structure is shown in Figure 4, and the model mainly consists of the input layer, pooling layer, interactive attention mechanism, and the generative adversarial network, which is further divided into the generator and discriminator. In this model, the target’s historical trajectory information and surrounding interaction data are used as inputs, and the generator, with the help of the interactive attention mechanism module, generates the trajectory sequence of the target over a specific future time span based on the input data. Subsequently, the discriminator performs authenticity discrimination on the generated trajectory sequence. Finally, the model outputs an individual predicted trajectory sequence that highly matches the distribution of real-world scenarios. This sequence, as the mean trajectory, will be introduced into the MC sampling stage of the subsequent collision risk warning model.
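As a rough illustration of how the predicted mean trajectory could feed the MC sampling stage, the following Python sketch perturbs each point of the mean trajectory with independent Gaussian noise and estimates a collision probability as the fraction of samples flagged by a collision test. The function names, the isotropic noise level `sigma`, and the `collides` callback are assumptions made for illustration; the paper's joint probability function, which also accounts for reaction time and braking distance, is not reproduced here.

```python
import random


def sample_trajectories(mean_traj, sigma, n_samples, seed=0):
    """Draw n_samples noisy copies of mean_traj, a list of (x, y) points."""
    rng = random.Random(seed)
    return [[(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
             for x, y in mean_traj]
            for _ in range(n_samples)]


def collision_probability(samples, collides):
    """Fraction of sampled trajectories with at least one colliding time step.

    collides(t, point) is a user-supplied test (hypothetical here), for
    example a check against the vehicle's buffer zone at time step t.
    """
    hits = sum(1 for traj in samples
               if any(collides(t, p) for t, p in enumerate(traj)))
    return hits / len(samples)
```

A warning threshold on the returned probability would then decide whether an alert is issued; the threshold value is a design choice that this sketch does not fix.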
3.2.1. Model Input
In the AS-GAN model, the first 8 time steps of historical trajectory data are input into the interactive attention layer, where the attention mechanism captures the pedestrian’s movement trends and potential intentions. The Long Short-Term Memory (LSTM) network then generates trajectory predictions for the next 12 time steps, with a time step interval of 0.4 s. To capture the dynamic changes in the trajectory, the model simultaneously considers the status information of the target pedestrian and their surrounding interacting objects. This status information is described by dynamic features such as position, velocity, and acceleration and is represented as a state vector.
In the equation, the state of the predicted pedestrian at each time step comprises the lateral coordinate, longitudinal coordinate, velocity, and acceleration. The state of the interacting vehicle target at the same time step comprises the identifier of the interacting target, its position, velocity, and acceleration, and its relative distance to the pedestrian.
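To make the input format concrete, the following Python sketch builds per-step pedestrian states from raw positions and splits a track into the 8 observed and 12 predicted steps at 0.4 s mentioned above. Deriving velocity and acceleration by finite differences is an assumption of this sketch, as are all function names; the paper does not state how these quantities are obtained.

```python
import math

OBS_LEN, PRED_LEN, DT = 8, 12, 0.4  # window lengths and step size stated in the text


def finite_diff(values, dt):
    """Backward differences; the first entry repeats the first difference."""
    diffs = [(b - a) / dt for a, b in zip(values, values[1:])]
    return [diffs[0]] + diffs


def pedestrian_states(xs, ys, dt=DT):
    """Return per-step tuples (x, y, speed, acceleration) for one pedestrian."""
    vx, vy = finite_diff(xs, dt), finite_diff(ys, dt)
    speed = [math.hypot(a, b) for a, b in zip(vx, vy)]
    accel = finite_diff(speed, dt)
    return list(zip(xs, ys, speed, accel))


def relative_distance(ped_xy, veh_xy):
    """Euclidean distance between a pedestrian and a vehicle position."""
    return math.hypot(ped_xy[0] - veh_xy[0], ped_xy[1] - veh_xy[1])


def split_window(states):
    """Split a track into the observed history and the prediction target."""
    return states[:OBS_LEN], states[OBS_LEN:OBS_LEN + PRED_LEN]
```

A track therefore needs at least 20 samples (8 s of data at 0.4 s spacing) to yield one complete training example under these assumptions.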
3.2.2. Encoder and Decoder
The encoder and decoder together form the foundation of the AS-GAN model. In the encoding stage, the model encodes the pedestrian’s historical states through an LSTM network, thereby capturing the pedestrian’s movement features. In the decoding stage, the model combines historical information with interaction features to generate the pedestrian’s future trajectory. This structure enables the model to predict future movement states based on historical trajectories and enhances the robustness of the prediction by incorporating interaction features from the traffic environment, as detailed in Equations (4) and (5).
In the equations, the encoder output is the historical encoded vector of the target pedestrian at the current time step, computed from the input state information. The decoder output is the predicted trajectory of the pedestrian at a future time step, and the decoder additionally receives a context vector as input.
3.2.3. Interaction Feature Extraction and Attention Module
The interaction pooling layer and the interaction attention mechanism jointly constitute the interaction layer of the AS-GAN model. The interaction pooling layer extracts temporal interaction features based on the interaction information between the pedestrian and surrounding interactive objects, providing input for subsequent tasks such as trajectory prediction. Meanwhile, this study introduces the Soft Attention Mechanism from reference [14] as the interaction attention mechanism module, assigning weights to different interactive objects to enable the model to focus on factors influencing pedestrian trajectories, thereby improving the accuracy and robustness of subsequent trajectory prediction, as detailed in Equations (6) and (7).
In the equations, the interaction feature vector is built from the position, velocity, and acceleration of the pedestrians and vehicles, together with the relative distance between them. The attention weight determines the importance of each interactive object in pedestrian trajectory prediction; it is obtained from an attention scoring function that takes the encoded vector of the pedestrian’s historical states as input and is parameterized by two weight matrices and a learnable vector.
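One common way to realize such a scoring function is additive soft attention, sketched below in plain Python. This is an assumed form chosen because it matches the listed ingredients (an encoded history vector, two weight matrices, and a learnable vector); the paper's exact Equations (6) and (7) may differ, and the names `W_h`, `W_f`, and `v` are hypothetical.

```python
import math


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def attention_weights(h, features, W_h, W_f, v):
    """Additive soft attention: score_j = v . tanh(W_h h + W_f f_j), then softmax."""
    proj_h = matvec(W_h, h)
    scores = []
    for f in features:
        proj_f = matvec(W_f, f)
        hidden = [math.tanh(a + b) for a, b in zip(proj_h, proj_f)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context_vector(weights, features):
    """Attention-weighted sum of the interaction feature vectors."""
    dim = len(features[0])
    return [sum(w * f[k] for w, f in zip(weights, features)) for k in range(dim)]
```

The weights sum to one, so the context vector is a convex combination of the interaction features, with larger weights on the objects that score higher against the pedestrian's history.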
3.2.4. Dual Loss Function
To further improve the accuracy of trajectory prediction, this paper introduces a displacement loss function in addition to the traditional adversarial loss, thereby constructing a dual-loss mechanism. This mechanism not only maintains constraints on the authenticity of the generated trajectories but also guides the model to learn a trajectory distribution closer to reality by quantifying the displacement difference between the generated and real trajectories, as detailed in Equations (8)–(10).
In the AS-GAN model, trajectory prediction is achieved through adversarial optimization between the generator and the discriminator. The discriminator aims to maximize the adversarial loss to distinguish between real and generated trajectories, while the generator attempts to minimize it to enhance the authenticity of the generated trajectories. The adversarial loss function uses cross-entropy to measure the difference between the generated trajectory distribution and the real trajectory distribution. Meanwhile, to improve the spatial fitting accuracy of the trajectory, this paper introduces a displacement loss function to measure the Euclidean distance difference between the predicted trajectory and the real trajectory in both spatial and temporal dimensions. A weighting coefficient balances the adversarial loss and the displacement loss.
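The dual-loss idea can be sketched in plain Python as follows. The sketch assumes the standard cross-entropy GAN objective and a mean Euclidean displacement term, combined through a weight `lam`; these forms, the function names, and the default `lam=1.0` are assumptions for illustration and may not match Equations (8)–(10) exactly.

```python
import math


def adversarial_loss(d_real, d_fake, eps=1e-12):
    """Cross-entropy GAN objective: mean log D(real) + mean log(1 - D(fake)).

    d_real and d_fake are discriminator outputs in (0, 1).
    """
    real_term = sum(math.log(p + eps) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def displacement_loss(pred, truth):
    """Mean Euclidean distance between predicted and ground-truth points."""
    return sum(math.dist(p, q) for p, q in zip(pred, truth)) / len(pred)


def generator_loss(d_fake, pred, truth, lam=1.0, eps=1e-12):
    """Generator's share of the adversarial term plus lam times the displacement term."""
    adv = sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return adv + lam * displacement_loss(pred, truth)
```

A larger `lam` pulls the generator toward the ground-truth path, while a smaller one leaves more room for the adversarial term to shape the diversity of the generated trajectories.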
3.2.5. Comparative Experiments
To validate the performance of the AS-GAN model proposed in this paper, several classic pedestrian trajectory prediction algorithms are used as benchmarks for comparison, including the Constant Velocity Model (ConstVel) [20], Social Long Short-Term Memory (Social LSTM) [21], and Social GAN. The evaluation metrics are Average Displacement Error (ADE) and Final Displacement Error (FDE), with units in meters. A detailed comparison of the models can be found in Table 1.
The results show that the AS-GAN model proposed in this paper outperforms the other comparison models in both prediction accuracy and robustness. The ConstVel model only uses historical trajectories for linear extrapolation, which cannot capture the nonlinear behavioral characteristics of pedestrians. The Social LSTM model introduces an LSTM network to model individual dynamics, but its social modeling ability is limited, as it only considers local interactions. The Social GAN model enhances interaction modeling capabilities through max pooling, but it fails to effectively incorporate the influence of vehicles, resulting in limited adaptability in mixed traffic environments. In contrast, the AS-GAN model proposed in this paper combines generative adversarial networks and a soft attention mechanism, enabling effective modeling of the dynamic relationships between pedestrians and surrounding multiple traffic participants. It improves trajectory prediction accuracy through the joint optimization of adversarial and displacement losses, providing more reliable trajectory support for collision risk warning.
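For reference, the two evaluation metrics named above can be computed as in the following Python sketch, using their usual definitions: ADE is the mean Euclidean error over all predicted time steps, and FDE is the Euclidean error at the final predicted step. The function names are illustrative, and trajectories are assumed to be equal-length lists of (x, y) points in meters.

```python
import math


def ade(pred, truth):
    """Average Displacement Error: mean point-wise Euclidean distance (m)."""
    return sum(math.dist(p, q) for p, q in zip(pred, truth)) / len(pred)


def fde(pred, truth):
    """Final Displacement Error: Euclidean distance at the last step (m)."""
    return math.dist(pred[-1], truth[-1])
```

For a prediction that drifts steadily away from the ground truth, FDE exceeds ADE, which is why the two metrics are typically reported together.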
3.3. Risk Detection Module
To improve the collision risk identification accuracy in unsignalized intersection scenarios, this paper designs an ellipse-buffer-based risk detection algorithm as the risk detection module in the warning algorithm, aimed at implementing spatial-level collision determination. The risk detection module takes the outputs of the vehicle and pedestrian trajectory modules as input and constructs a dynamically adjustable elliptical buffer zone, combining the motion states and physical boundary information of traffic participants. The major and minor axes of the buffer zone are determined by the vehicle speed, driver reaction time, vehicle dimensions (length and width), and safety distance, and are adaptively adjusted using a tuning coefficient to ensure good risk coverage and geometric discrimination accuracy under varying traffic conditions. Additionally, by expanding the detection boundaries and enhancing adaptability, this mechanism improves the robustness and accuracy of the warning system in scenarios with multiple types of traffic participants, providing a stable and reliable spatial criterion for the calculation of joint collision probabilities. As shown in Figure 5, the yellow elliptical area represents the dynamically expanded vehicle safety domain, and the blue rectangle represents pedestrians or other vulnerable road users.
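The dependence of the buffer size on these quantities can be illustrated with a small Python sketch. The specific formula below is an assumption, not the paper's: the semi-major axis is taken as half the vehicle length plus the distance covered during the reaction time plus the safety distance, the semi-minor axis as half the vehicle width plus the safety distance, and both are scaled by the tuning coefficient `k`.

```python
def buffer_axes(speed, reaction_time, length, width, safe_dist, k=1.0):
    """Return (semi_major, semi_minor) of the buffer ellipse in meters.

    Assumed form for illustration only: the buffer grows along the
    direction of travel with the reaction distance speed * reaction_time.
    """
    semi_major = k * (length / 2.0 + speed * reaction_time + safe_dist)
    semi_minor = k * (width / 2.0 + safe_dist)
    return semi_major, semi_minor
```

Under this assumed form, a 4 m by 2 m vehicle at 10 m/s with a 1 s reaction time and a 1 m safety distance gets a 13 m by 2 m buffer (semi-axes), and the buffer lengthens as speed rises.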
The ellipse-buffer algorithm determines whether a potential spatial collision occurs by substituting the vertices of the pedestrian’s rectangular boundary into the elliptical geometric equation and checking for intersection points. If the two regions overlap, a potential collision event is declared, and its occurrence time and distance are recorded. The specific calculation process is shown in Equations (11)–(15).
In the equations, $i$ denotes the index of the four edges of the rectangle, where the top and bottom boundaries, as well as the left and right boundaries, each correspond to a straight line. $k_i$ is the slope of the $i$-th edge, and $b_i$ is the intercept of the $i$-th edge. $(x_0, y_0)$ are the center coordinates of the ellipse, $a$ and $b$ are the semi-major and semi-minor axes, respectively, and $\theta$ is the rotation angle of the ellipse.
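The vertex-substitution step can be sketched in Python as follows. The sketch rotates each rectangle vertex into the ellipse's own frame and evaluates the standard ellipse inequality; it assumes an axis-aligned pedestrian rectangle, and the function names and tuple layouts are hypothetical.

```python
import math


def point_in_ellipse(px, py, cx, cy, a, b, theta):
    """True if (px, py) lies inside or on the ellipse centered at (cx, cy)
    with semi-axes a, b and rotation angle theta (radians)."""
    dx, dy = px - cx, py - cy
    # Rotate the offset into the ellipse's own coordinate frame.
    u = dx * math.cos(theta) + dy * math.sin(theta)
    w = -dx * math.sin(theta) + dy * math.cos(theta)
    return (u / a) ** 2 + (w / b) ** 2 <= 1.0


def rect_vertices(cx, cy, length, width):
    """Corner points of an axis-aligned rectangle centered at (cx, cy)."""
    hl, hw = length / 2.0, width / 2.0
    return [(cx - hl, cy - hw), (cx + hl, cy - hw),
            (cx + hl, cy + hw), (cx - hl, cy + hw)]


def buffer_hit(ellipse, rect):
    """True if any vertex of rect = (cx, cy, length, width) falls inside
    ellipse = (cx, cy, a, b, theta)."""
    return any(point_in_ellipse(x, y, *ellipse)
               for x, y in rect_vertices(*rect))
```

A vertex-only test can miss the case where the ellipse crosses a rectangle edge without enclosing a corner. The slope and intercept description of the edges suggests that the full method also checks edge intersections, which this sketch does not reproduce.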
6. Conclusions
This study addresses the vehicle–pedestrian conflict problem at unsignalized intersections and proposes a multi-fusion collision warning method based on V2X communication technology. The method adopts the MC algorithm as the core computational framework, incorporates the AS-GAN model for pedestrian trajectory prediction, and introduces an elliptical buffer zone algorithm for spatial collision detection.
In the trajectory prediction module, an improved AS-GAN model is proposed to model future pedestrian trajectories. This model integrates a GAN architecture with a soft attention mechanism, enabling the effective extraction of temporal evolution patterns and interaction features of trajectories. By designing a dual loss function composed of adversarial loss and displacement loss, the model significantly enhances trajectory prediction accuracy and reduces false and missed alarms caused by prediction deviations.
For collision detection, a spatial discrimination method based on elliptical buffer zones is proposed to address the limitations of traditional TTC models in handling lateral movement, target expansion, and dynamic trajectory evaluation. This method allows for a more accurate representation of interaction zones and improves the precision of collision point identification. Meanwhile, the system leverages V2X communication to obtain real-time motion state information of vehicles and pedestrians, providing crucial data support for risk analysis in dynamic scenarios.
In the collision risk assessment stage, the MC sampling algorithm serves as the core framework to introduce Gaussian perturbations into the predicted trajectories, generating multiple trajectory samples for spatial collision judgment. The system further incorporates driver reaction time and braking distance windows to construct a joint collision probability function, which serves as the basis for issuing collision warnings. This strategy not only enhances the model’s ability to represent trajectory uncertainty but also significantly improves the stability and reliability of the warning outcomes. Comparative experiments using real-world data and conventional algorithms confirm the superiority of the proposed method in predicting and warning of potential collision risks.
Nevertheless, this study has certain limitations. The current method has been validated primarily in relatively simple intersection scenarios with low traffic density, and its applicability in more complex environments, such as high-density traffic flows, multi-lane intersections, and weak communication conditions, has yet to be thoroughly evaluated. In addition, system deployment imposes demands on computational resources and real-time response capabilities, which require further balancing in future research. Subsequent work will focus on optimizing the model for more complex scenarios, improving the robustness of V2X communication, and refining strategies for system integration and deployment, with the ultimate goal of facilitating the practical implementation of the proposed method in real-world traffic safety systems.