Article

Research on Lane Changing Game and Behavioral Decision Making Based on Driving Styles and Micro-Interaction Behaviors

1 Key Laboratory of Advanced Manufacturing Technology for Automobile Parts, Ministry of Education, Chongqing University of Technology, Chongqing 400054, China
2 State Key Laboratory of Mechanical Transmission, College of Mechanical and Vehicle Engineering, Chongqing University, Chongqing 400044, China
* Author to whom correspondence should be addressed.
Sensors 2022, 22(18), 6729; https://doi.org/10.3390/s22186729
Submission received: 25 July 2022 / Revised: 25 August 2022 / Accepted: 3 September 2022 / Published: 6 September 2022

Abstract
Autonomous driving technology plays an essential role in reducing road traffic accidents and ensuring more convenience while driving, so it has been widely studied in industrial and academic communities. The lane-changing decision-making process is challenging but critical for ensuring autonomous vehicles’ (AVs) safe and smooth maneuvering. This paper presents a closed-loop lane-changing behavioral decision-making framework suitable for AVs in fully autonomous driving environments to achieve both safety and high efficiency. The framework is based on complete information non-cooperative game theory. Moreover, we introduce human driver-specific driving styles (reflected by aggressiveness types) and micro-interaction behaviors for both sides of the game in this model, enabling users to understand, adapt to, and utilize intelligent lane-changing techniques. Additionally, a model predictive control (MPC) controller based on the host-vehicle (HV) driving risk field (DRF) is proposed. The controller uses its optimizer to find the optimal path with the lowest driving risk while simultaneously adjusting its control variables to track that path. The method synchronizes path planning and motion control and provides real-time vehicle state feedback to the decision-making module. Simulations in several typical traffic scenarios demonstrate the effectiveness of the proposed method.

1. Introduction

1.1. Introduction and Related Work

As an essential component of autonomous driving technology, the automatic lane-changing system (ALCS) has attracted the interest of numerous scholars worldwide [1]. In ALCS, the decision-making module is responsible for generating the driving behaviors of the vehicle, connecting environmental perception, trajectory planning, and motion control; it is the most fundamental and critical component of AV intelligence [2].
Many scholars around the world have conducted extensive and in-depth research on decision-making models for autonomous driving in recent years. A primary bottleneck of autonomous driving decision-making algorithms has always been enabling vehicles to make reasonable and correct decisions in complex environments full of uncertainty. Furda and Vlacic use a multi-criteria decision-making algorithm to select the optimal driving strategy from a candidate action set [3]. The Markov decision process (MDP) is a common model that treats decision-making probabilistically and can handle decision-making uncertainty [4]. In addition, technological innovation in machine learning has driven the rapid development of data-driven decision-making algorithms. Ulbrich established a lane-changing state assessment and decision-judgment model based on a dynamic Bayesian network to achieve efficient lane changing [5]. Li et al. [6] developed a data-driven deep neural network (DNN) driving algorithm that adapts to road conditions in real-world traffic scenarios, but DNN training and testing require a considerable number of high-quality data samples. Based on the reinforcement learning (RL) method, the authors of [7] integrate the RL reward mechanism into the interaction between a stochastic MDP and the traffic environment, solve the sequential decision-making problem of the host vehicle, and effectively address the lack of training data. In addition to the learning-based lane-changing models mentioned above, there are also expert system-based, fuzzy logic-based, and rule-based lane-changing decision-making methods [8,9,10]. Even though these traditional lane-changing models have been extensively studied and have yielded significant results, modeling and logical-design challenges arise from interactions in complex traffic environments. Moreover, because these methods make lane-change decisions statically, the resulting models are inflexible and cannot adapt to different driving styles.
Efforts have been made to study interaction-aware models in order to capture the interdependence between vehicles. Coupled hidden Markov models (CHMMs) are used to model pairwise dependencies between multiple moving entities [11]. While this technique can predict when a lane change will occur, it does not describe how the maneuver unfolds. The study of decision-making problems involving interactions between vehicles can also be approached using game theory. In [12], a Stackelberg game-based lane-changing model is proposed by evaluating the aggressiveness of surrounding vehicles and incorporating it into the design of the payoff function; it can simulate the interaction behaviors between autonomous vehicles and human drivers with different aggressiveness types in mixed-traffic scenarios. Wang et al. [13] proposed a differential game-based car-following and lane-change control model to prevent future situations that are detrimental to the payoff of the host vehicle. In [14], the authors used Stackelberg game theory to model the driver of an AV and then used the model to simulate the driver’s reasoning process, finally realizing interaction behavior similar to that of a human driver. Arbis and Dixit established a game theory-based lane-changing model that represented the game between lane-changing drivers and the drivers of surrounding interacting vehicles through turn signals and continuous lateral displacement [15]. Aiming at the interaction of multiple vehicles in the lane-changing process, a lane-changing model including cooperative and game behaviors is established in [16], which introduces and quantifies the interaction between multiple vehicles.
Although the game theory-based decision-making research above can effectively handle vehicle micro-interaction behaviors under complex traffic conditions, one important problem has not been well considered: if the design of the decision-making algorithm lacks real-time feedback from vehicle planning and motion control, the planned path may be infeasible. An unsuitable path, or driving beyond the control constraints, may result in frequent braking and sharp steering maneuvers along the planned path, dramatically deteriorating driving safety and comfort.
Field-based methods have lower computational requirements, allowing local path planning, maneuver planning, and motion control in dynamic environments. The driving risk field is a quantitative description of the driving risk faced by the vehicle, built with a mathematical model analogous to a physical field from the dynamic traffic information obtained around the vehicle. This line of research developed from the trajectory planning method for mobile robots based on the artificial potential field concept proposed by Khatib [17]. With the advancement of intelligent transportation, using field theory to construct a driving risk field for driving safety assessment has also been extensively discussed. The authors of [18] developed a unified DRF model of the human-vehicle-road closed-loop system based on different object attributes. Ji and Khajepour developed a three-dimensional virtual risk field, modeled as a trigonometric function of the road and an exponential function of the obstacles, and used gradient descent to generate the vehicle’s desired collision-avoidance trajectory; however, trajectory tracking control and the tendency to fall into local minima are not well considered in [19]. In [20], researchers built a risk field with a vehicle tire model that considered obstacles and road constraints.
These studies usually assume that surrounding vehicles have constant velocity or acceleration, and the impact of vehicle interaction on the real-time transition of the host vehicle’s state is not adequately considered. However, these assumptions are not always true in the real world, which can lead to uncomfortable and impractical lane changes. In terms of improving the maneuverability, safety, and driving stability of AVs, it is imperative to develop a path planning and control algorithm that can respond in real-time to vehicle decisions and account for interactions between vehicles.
Whether the host vehicle eventually decides to change lanes depends not only on the fundamental parameters of the participating vehicles on the road (e.g., velocity, acceleration, position) but also on the reactions of different types of drivers in the same situation. In this paper, building on existing research, the aggressiveness parameter (β) is introduced to capture this consideration effectively. The aggressiveness type plays a crucial role in the decision-making process of participating vehicles in a dynamic game. A highly aggressive competing vehicle (CV) (large β) prefers maintaining its velocity and positional advantage by not yielding to the host vehicle in a lane-changing scenario [21], whereas a less-aggressive CV (small β) tends to yield to the HV to ensure its own driving safety [22]. This article incorporates aggressiveness into the design of the decision-making and path-planning algorithms. This consideration has the potential to promote the self-adaptation of AVs to different kinds of drivers and passengers, as well as the advancement of personalized driving technology, hence shifting the status quo from “people adapting to vehicles” to “vehicles adapting to people.”

1.2. Contribution

The main contributions of this paper are as follows. Firstly, based on complete information non-cooperative game theory, a closed-loop lane-changing behavior decision model for AVs is established that takes into account the aggressiveness types and micro-interaction behaviors of both game participants. Then, starting from field theory, a unified risk description model for the driving environment is created, and the field-strength distribution of the DRF is incorporated into the MPC controller’s cost function. On this basis, a novel DRF-based MPC controller coupling path planning and motion control is proposed. Finally, the feasibility and effectiveness of the algorithm are tested in several typical lane-changing scenarios.

1.3. Paper Organization

The remainder of this article is organized as follows. In Section 2, the system architecture of the closed-loop decision algorithm is outlined. In Section 3, a lane-changing decision model based on complete information non-cooperative game theory is established. In Section 4, an MPC controller incorporating the host-vehicle driving risk field is designed. In Section 5, we design two typical cases to verify the proposed algorithm and analyze and discuss the test results. Finally, Section 6 concludes the paper.

2. Problem Formulation and System Construction

In a typical freeway lane-change scenario, the proposed lane-changing behavior decision framework is shown in Figure 1. The model comprises three mutually coupled functional modules: decision-making, path planning, and motion control. In the decision-making module, cost functions considering driving safety, travel efficiency, and driving comfort are established. Then, by solving the Stackelberg game, the optimal strategy, including the optimal lane-changing command and the optimal longitudinal acceleration, is obtained and transmitted to the path planning and control module. Note that this paper combines the longitudinal planning of the HV with the lane-changing decision. Next, this paper performs unified risk perception of the road environment based on the established DRF model. The road environment is composed of traffic entities, including surrounding obstacle vehicles (OVs), road boundaries, and lane lines. The future states of the HV can be predicted using MPC. Taking the risk field strength of the predicted states as a key component of its cost function, the MPC uses its optimizer to find the optimal driving path with the lowest risk while simultaneously tracking the path by adjusting the control output. Note that this controller not only avoids the local minima that the conventional artificial potential field method is prone to, but can also tackle multiple-input multiple-output (MIMO) control problems. Finally, the module feeds the planned path parameters and vehicle states back to the decision-making module to support decision-making at the next time step.
This paper defines the aggressiveness parameter as the degree of preference of different driving groups for travel efficiency, in an effort to make the decision-making of autonomous vehicles smarter and more humanized. That is, vehicles with higher aggressiveness values have higher velocity expectations during lane changes and strive for larger driving spaces while being capable of withstanding higher levels of road risk. The aggressiveness types of drivers are generally divided into three categories (conservative, moderate, and aggressive), which differ considerably in aggressiveness value and travel-efficiency preference. When the aggressiveness value is high, for example, the AV should behave like an aggressive driver, preferring higher expected velocity and acceleration, shorter following distances, and less concern for driving safety when performing lane changes; its assessment of risk in the driving environment also becomes bolder. Aggressiveness is incorporated into the design of the decision cost function and the DRF model in this paper.
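As a concrete illustration, the style-to-β mapping used throughout the paper can be written down in a few lines of Python. This is a minimal sketch under our own naming; the values 0.2 and 0.6 appear as the conservative and moderate settings in the tests of Section 5, while the aggressive value of 0.9 is an illustrative placeholder.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DrivingStyle:
    """Maps a driving-style label to the aggressiveness parameter beta."""
    name: str
    beta: float  # weight on travel efficiency in the decision cost, Eq. (1)

# 0.2 and 0.6 are the conservative/moderate values used in Section 5;
# 0.9 for the aggressive type is a hypothetical placeholder.
CONSERVATIVE = DrivingStyle("conservative", beta=0.2)
MODERATE = DrivingStyle("moderate", beta=0.6)
AGGRESSIVE = DrivingStyle("aggressive", beta=0.9)
```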
In the method proposed in this paper, the following assumptions are made:
  • It is assumed that the vehicles studied in this paper are all AVs equipped with complete onboard sensors and wireless communication modules (i.e., V2V and V2I technologies) to obtain rich information about the surrounding vehicles’ motion status and the road environment.
  • Only the acceleration and deceleration behaviors of surrounding vehicles are considered, and their lane-changing behaviors are not considered.
  • The vehicles studied are all cars, excluding other types such as trucks and motorcycles.

3. Mathematical Modeling of Lane-Changing Decision

Game theory is an effective approach to investigating the interactions of decision makers. A game is a mathematical object with well-defined elements, including players, strategy space, and payoffs [23]. This article examines the payoffs from the perspective of decision-making cost, and a complete information non-cooperative game theory approach is used to handle the lane-changing decision problem of AVs.

3.1. Game Formulation

As illustrated in Figure 2, the entire process mimics a human driver’s conventional lane-change decision and the corresponding longitudinal velocity adjustment strategy. When the preceding vehicle PV2 is driving too slowly, the HV may consider changing lanes. Before doing so, the HV evaluates the target lane’s traffic conditions using environmental perception information. If sufficient space is available, the HV interacts with the following vehicle (FV) by activating the turn signal or making a small lateral move to observe the FV’s reaction. It is presumed that, within a few seconds, the FV reacts to the HV’s actions through its acceleration. Note that if the aggressiveness types of the two sides of the game change, the final strategies and decision-making costs of the players will also differ.
Figure 2 presents the decision costs for four interaction scenarios between the HV and the competing vehicle FV. Of the two players, the HV is the leader, which can choose either to change lanes or to keep its lane. The FV is the follower; it has the option of yielding to the HV based on the leader’s decision. Stackelberg game theory is used in this paper to model this multi-agent sequential decision-making problem. When the HV chooses to change lanes, the FV prefers to yield to the HV, according to the illustrative cost matrix in Figure 2. Likewise, when the HV selects lane keeping, the FV may accelerate appropriately based on the cost. In conclusion, both sides of the game evaluate and search the decision costs corresponding to the strategy pairs to determine the optimal strategy pair (e.g., (accelerate, stay)). Therefore, we model lane-change decision-making as a two-player equilibrium game with different decision cost functions, in which an equilibrium exists at all times.
The overall structure of the Stackelberg game is shown in Table 1, in which $\alpha$ denotes a player’s longitudinal acceleration and $U$ represents the cost of a strategy combination. The HV decides not only whether to change lanes but also the optimal longitudinal acceleration required at the present time step. Since the acceleration is continuous, there are infinitely many strategy combinations. Likewise, the FV has an infinite number of policy combinations; it can select any value within the constraint range, i.e., $(\alpha_{\min}, \alpha_{\max})$. Throughout the game, the players try to minimize their costs. Consequently, the decision cost function plays a crucial part in the generation of reasonable decisions.

3.2. Definition of Cost Function

The decision cost considers a combination of three factors in order to generate a reasonable decision logic for the host vehicle and the competing vehicle.
The first part, $U_{ds}^{HV}$, quantifies the level of safety associated with lane-changing and lane-keeping. The second factor is $U_{te}^{HV}$, which evaluates the vehicle’s velocity and travel-space benefits. The final component is $U_{dc}^{HV}$, which ensures the vehicle’s comfort during the game. The total decision cost function of the HV can therefore be expressed as
$$U^{HV} = \beta U_{te}^{HV} + (1-\beta)\,U_{ds}^{HV} + k_{Acc}U_{dc}^{HV} \quad (1)$$
where $U_{ds}^{HV}$, $U_{te}^{HV}$, and $U_{dc}^{HV}$, respectively, represent the driving safety, travel efficiency, and driving comfort costs when the HV adopts a specific strategy; $\beta$ is the aggressiveness parameter used to represent a specific driving style; and $k_{Acc}$ is the driving comfort factor.
The HV’s driving safety cost differs between the two situations of lane-keeping and lane-changing. When selecting a lane-keeping strategy, the safety cost is primarily related to the relative velocity and relative distance between the HV and the preceding vehicle $PV_\kappa$ in the current lane. When the HV changes lanes, the lateral safety cost between the HV and the target-lane FV is primarily considered [24]. The safety cost can be uniformly expressed as
$$U_{ds}^{HV} = (1-|\gamma|)\,U_{ds\text{-}lk}^{HV} + |\gamma|\,U_{ds\text{-}lc}^{HV} \quad (2)$$
where $U_{ds\text{-}lc}^{HV}$ and $U_{ds\text{-}lk}^{HV}$ are the driving safety costs of HV lane changing and lane keeping, respectively, and $\gamma$ denotes the result of HV decision-making, i.e., $\gamma \in \{-1, 0, 1\} :=$ {change to left lane, lane keeping, change to right lane}.
$U_{ds\text{-}lk}^{HV}$ is related to the relative distance and relative velocity of the HV and $PV_\kappa$, which is expressed as
$$U_{ds\text{-}lk}^{HV} = \begin{cases} \psi_{v\text{-}lk}^{HV}\,\mathrm{sgn}\!\left[\Delta V_{lk}^{PV_{\kappa}HV}\right]\left(\Delta V_{lk}^{PV_{\kappa}HV}\right)^{2} + \psi_{s\text{-}lk}^{HV}\Big/\left[\Delta S_{lk}^{PV_{\kappa}HV} + \varsigma\right], & \text{if } PV_{\kappa}\ \text{exists} \\ \varpi_{lk}, & \text{otherwise} \end{cases} \quad (3)$$
where $PV_\kappa$ denotes the vehicle ahead in the current lane and $\kappa \in \{1,2,3\}$ indicates the lane ID of the HV. $\Delta V_{lk}^{PV_\kappa HV}$ and $\Delta S_{lk}^{PV_\kappa HV}$ are the relative velocity and relative distance between the HV and $PV_\kappa$. $\psi_{v\text{-}lk}^{HV}$ and $\psi_{s\text{-}lk}^{HV}$ are the respective weight coefficients for the velocity and distance terms. $\varpi_{lk}$ is a small number that represents the cost of selecting a lane-keeping strategy when the HV’s driving lane lacks a $PV_\kappa$, and $\varsigma$ is a very small value that prevents the denominator from being 0.
$U_{ds\text{-}lc}^{HV}$ is related to the relative distance and relative velocity of the HV and FV as follows:
$$U_{ds\text{-}lc}^{HV} = \begin{cases} \psi_{v\text{-}lc}^{HV}\,\mathrm{sgn}\!\left[\Delta V_{lc}^{HVFV}\right]\left(\Delta V_{lc}^{HVFV}\right)^{2} + \psi_{s\text{-}lc}^{HV}\Big/\left[\Delta S_{lc}^{HVFV} + \varsigma\right], & \text{if } FV\ \text{exists} \\ \varpi_{lc}, & \text{otherwise} \end{cases} \quad (4)$$
where FV represents the competing vehicle in the target lane, $\Delta V_{lc}^{HVFV}$ and $\Delta S_{lc}^{HVFV}$ are the relative velocity and relative distance between the HV and FV, $\psi_{v\text{-}lc}^{HV}$ and $\psi_{s\text{-}lc}^{HV}$ are the respective weight coefficients for the velocity and distance terms, and $\varpi_{lc}$ is a small number that represents the cost of selecting a lane-changing strategy when the target lane lacks an FV.
Driving comfort is mainly related to acceleration, which is defined by
$$U_{dc}^{HV} = \psi_{acc}^{HV}\left(\alpha_{x}^{HV}\right)^{2} \quad (5)$$
where $\alpha_x^{HV}$ denotes the longitudinal acceleration of the HV, and $\psi_{acc}^{HV}$ is the weight coefficient.
The travel efficiency of the HV is largely related to the longitudinal velocity. In this article, based on the research of Hang et al. [24] on travel efficiency, the following further improvements are made so that the HV has more initiative and flexibility when competing with other vehicles, significantly improving its travel velocity and space. $U_{te}^{HV}$ can be given by
$$U_{te}^{HV} = \begin{cases} \left(\Delta V_{v_{x\max}}^{HV}\right)^{2}, & \Delta s \ge d_{m} \\ \left(\Delta V_{PV_{\kappa}}^{HV}\right)^{2}, & \Delta s < d_{m} \end{cases} \quad (6)$$
where $v_{x\max}$ represents the maximum velocity allowed on the road, $d_m$ is the safe distance threshold at which the HV may accelerate without restriction, and $\Delta s$ is the relative distance between the host vehicle and the preceding vehicle. Under normal conditions, in response to the lane-changing behavior of the HV, the competing vehicle will not accelerate to its maximum velocity to compete with the HV, for the sake of driver safety. This paper therefore presumes that the maximum velocity the FV can reach during the game is $0.8\,v_{x\max}$.
The cost function of the FV has a structure similar to that of the HV, which is not repeated here. The driving safety cost is defined differently: because the FV does not take lane-changing behavior into account, its lateral driving risk mainly comes from the lane-changing behavior of the HV. The lateral safety cost of the FV is therefore assumed to be equal to the lane-changing safety cost of the HV, i.e., $U_{ds\text{-}lc}^{HV} = U_{ds\text{-}lat}^{FV}$.
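To make the cost structure concrete, the following sketch assembles Equations (1)–(6) for the HV in Python. Function and variable names are ours; the weights default to the Table 2 values, the existence checks for $PV_\kappa$ and FV are simplified to None tests, and the fallback cost varpi is a hypothetical placeholder.

```python
import numpy as np

def safety_term(dv, ds, psi_v, psi_s, varsigma=1e-5):
    # sgn(dV)*(dV)^2 velocity term plus inverse-distance term, cf. Eqs. (3)-(4)
    return psi_v * np.sign(dv) * dv**2 + psi_s / (ds + varsigma)

def hv_cost(gamma, ax, v_hv, pv=None, fv=None, beta=0.6,
            psi_v=0.2, psi_s=8e3, psi_acc=0.4, k_acc=1.0,
            vx_max=30.0, d_m=20.0, varpi=0.1):
    # Driving safety, Eq. (2): blend lane-keeping/lane-changing costs by |gamma|
    u_lk = safety_term(v_hv - pv["v"], pv["gap"], psi_v, psi_s) if pv else varpi
    u_lc = safety_term(v_hv - fv["v"], fv["gap"], psi_v, psi_s) if fv else varpi
    u_ds = (1 - abs(gamma)) * u_lk + abs(gamma) * u_lc
    # Travel efficiency, Eq. (6): chase vx_max when the gap ahead is large enough
    if pv is None or pv["gap"] >= d_m:
        u_te = (v_hv - vx_max)**2
    else:
        u_te = (v_hv - pv["v"])**2
    u_dc = psi_acc * ax**2                                  # comfort, Eq. (5)
    return beta * u_te + (1 - beta) * u_ds + k_acc * u_dc   # total, Eq. (1)
```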

3.3. Solution of the Game

The Stackelberg game was introduced by Heinrich Freiherr von Stackelberg [25]. The total cost functions of the HV and FV are introduced into a Stackelberg game-based model, and the optimal strategy pairs are derived by solving the model. The HV and FV play a two-player Stackelberg game during the interaction. In contrast to the Nash game [26,27], the leader HV and the follower FV are not independent; they influence each other’s decisions. This is a typical bi-level optimization problem, which can be written as
$$\left(\alpha_{x}^{HV*}, \gamma^{*}\right) = \arg\min_{\alpha_{x}^{HV},\,\gamma}\left(\max_{\alpha_{x}^{FV}\in\gamma_{2}(\alpha_{x}^{HV},\gamma)} U^{HV}\!\left(\alpha_{x}^{HV}, \gamma, \alpha_{x}^{FV}, \beta\right)\right) \quad (7)$$

$$\gamma_{2}\!\left(\alpha_{x}^{HV}, \gamma\right) = \left\{\sigma \in \gamma_{2} : U^{FV}\!\left(\alpha_{x}^{HV}, \gamma, \sigma, \beta\right) \le U^{FV}\!\left(\alpha_{x}^{HV}, \gamma, \alpha_{x}^{FV}, \beta\right),\ \forall\,\alpha_{x}^{FV}\in\gamma_{2}\right\} \quad (8)$$

$$\text{s.t.}\ \ \gamma\in\{-1,0,1\},\ \ \gamma(\gamma+1)(\gamma-1)=0,\ \ \alpha_{x}^{HV}\in[\alpha_{x\min},\alpha_{x\max}],\ \ \alpha_{x}^{FV}\in[\alpha_{x\min},\alpha_{x\max}],\ \ v_{x}^{HV}\in[v_{x\min},v_{x\max}],\ \ v_{x}^{FV}\in[v_{x\min},v_{x\max}] \quad (9)$$
where $U^{HV}$ and $U^{FV}$ denote the total costs of the HV and FV; $\alpha_x^{HV*}$ is the optimal longitudinal acceleration of the HV; $\alpha_x^{FV}$ is the longitudinal acceleration of the FV; $\gamma^{*}$ is the optimal lane-changing command of the HV; $\alpha_{x\min}$ and $\alpha_{x\max}$ are the minimum and maximum accelerations a vehicle can reach; $v_{x\min}$ and $v_{x\max}$ are the minimum and maximum constraints on longitudinal velocity; $\gamma_2$ denotes the action candidates of the FV; $\gamma_2(\alpha_x^{HV}, \gamma)$ represents the optimal action candidates of the FV given the actions of the HV; $\beta$ is the aggressiveness; and $v_x^{HV}$ and $v_x^{FV}$ are the velocities of the HV and FV.
For the bi-level optimization problem of the leader–follower game above, if a high-precision equilibrium solution is not required, the problem can be solved by exhaustively searching a discretized cost matrix. The cost matrix is finite, so an equilibrium is easier to find. If the accuracy of the policy solution is very important, this bi-level optimization problem can also be solved by a bilevel evolutionary algorithm based on quadratic approximations (BLEAQ) [28]. It should be noted that, in order to satisfy the real-time requirements of decision-making, the optimal equilibrium of the game is computed at every time step.
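A coarse version of this cost-matrix search can be sketched as follows: discretize both players’ acceleration ranges, compute the follower’s best response for every leader action, and keep the leader action with the lowest worst-case cost over tied responses. The cost functions and grid resolution are assumptions in the style of the sketches above.

```python
import numpy as np

def solve_stackelberg(cost_hv, cost_fv, a_grid, gammas=(-1, 0, 1)):
    """Discrete search for the Stackelberg equilibrium of Eqs. (7)-(9).
    cost_hv(a_hv, gamma, a_fv) and cost_fv(a_hv, gamma, a_fv) return scalars."""
    best = None
    for gamma in gammas:                     # lane-change command of the HV
        for a_hv in a_grid:                  # leader's acceleration
            fv_costs = np.array([cost_fv(a_hv, gamma, a) for a in a_grid])
            responses = a_grid[fv_costs <= fv_costs.min() + 1e-9]
            # the leader guards against the worst case over tied responses
            u_hv = max(cost_hv(a_hv, gamma, a) for a in responses)
            if best is None or u_hv < best[0]:
                best = (u_hv, a_hv, gamma)
    return best  # (leader cost, alpha_x_HV*, gamma*)

a_grid = np.linspace(-3.0, 5.0, 33)  # acceleration bounds from Table 2
```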

4. Controller Design Based on Driving Risk Field

4.1. Vehicle Kinematic Modeling

The motion planning of an AV includes longitudinal planning and lateral planning, and longitudinal planning refers to the velocity or acceleration planning of the AV in the longitudinal direction. In this paper, the lane-changing decision algorithm and the longitudinal planning of the vehicle are coupled into the decision-making module.
Lateral planning generally refers to path planning. To perform path planning in real time, an accurate model is required on the one hand, and on the other hand the complexity of the model must be limited to reduce the computational load. To meet these two requirements, the following simplified vehicle kinematics model [29] is used in this paper.
$$\dot{\chi} = \left[\dot{v}_{x}\ \ \dot{\varphi}\ \ \dot{X}\ \ \dot{Y}\right]^{T} = \left[\alpha_{x}/v_{x},\ \tan\beta/l_{r},\ \cos\theta/\cos\beta,\ \sin\theta/\cos\beta\right]^{T} v_{x} \quad (10)$$

$$\beta = \arctan\!\left(\frac{l_{r}}{l_{r}+l_{f}}\tan\delta_{f}\right) \quad (11)$$

$$\theta = \varphi + \beta \quad (12)$$

where the state vector is $\chi = [v_x, \varphi, X, Y]^T$ and the control vector is $u = \delta_f$. $\varphi$ is the yaw angle; $\beta$ and $\theta$ are the slip angle and heading angle, respectively; $\delta_f$ and $(X, Y)$ are the front-wheel steering angle and position coordinates; $\alpha_x$ and $v_x$ are the longitudinal acceleration (input by the decision module) and velocity; and $l_f$ and $l_r$ are the distances from the center of gravity to the front and rear axles, respectively.
To obtain the discretized system and to constrain the control increment in the model predictive controller designed hereinafter, Equation (10) is transformed as follows:
$$\xi(k+1) = \begin{bmatrix} v_{x}(k+1)\\ \varphi(k+1)\\ X(k+1)\\ Y(k+1)\\ \delta_{f}(k+1) \end{bmatrix} = \begin{bmatrix} v_{x}(k) + \Delta t\,\alpha_{x}(k)\\ \varphi(k) + \Delta t\,v_{x}(k)\tan\beta(k)/l_{r}\\ X(k) + \Delta t\,v_{x}(k)\cos\theta(k)/\cos\beta(k)\\ Y(k) + \Delta t\,v_{x}(k)\sin\theta(k)/\cos\beta(k)\\ \delta_{f}(k) + \Delta\delta_{f}(k+1) \end{bmatrix} \quad (13)$$
where $\Delta t$ is the discrete time step and $\xi(k+1)$ represents the state of the discrete system at step $k+1$. The control input is the increment of the front-wheel steering angle, $\Delta u(k) = u(k) - u(k-1) = \Delta\delta_f(k)$.
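For reference, a direct transcription of Equations (10)–(13) reads as below; the axle distances lf and lr are illustrative values, not parameters from the paper.

```python
import numpy as np

def step_kinematics(xi, d_delta_f, ax, dt, lf=1.2, lr=1.4):
    """One step of the discrete kinematic model, Equation (13).
    xi = [vx, phi, X, Y, delta_f]; d_delta_f is the steering increment."""
    vx, phi, X, Y, delta_f = xi
    delta_f = delta_f + d_delta_f
    beta = np.arctan(lr / (lr + lf) * np.tan(delta_f))   # slip angle, Eq. (11)
    theta = phi + beta                                   # heading angle, Eq. (12)
    return np.array([
        vx + dt * ax,
        phi + dt * vx * np.tan(beta) / lr,
        X + dt * vx * np.cos(theta) / np.cos(beta),
        Y + dt * vx * np.sin(theta) / np.cos(beta),
        delta_f,
    ])
```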

4.2. Driving Risk Field Modeling

To quickly quantify the driving risk level of vehicles in the road environment, this paper establishes, based on field theory, driving risk field models of the necessary traffic entities, including obstacle vehicles, road boundaries, and lane lines. Among them, the paper considers the shape and size of the OVs as well as their different aggressiveness types. The risk level of these traffic entities is determined in detail by their own attributes and by traffic regulations. Based on the vehicle kinematics model, MPC predicts the vehicle states and takes advantage of time-efficient calculation and receding-horizon optimization. In this section, the risk level corresponding to the predicted position of the HV is combined with the cost function of the MPC to plan a collision-free lane-changing path for the HV in real time, and the vehicle states and planned path parameters are fed back to the decision-making module.

4.2.1. Risk Field of an Obstacle Vehicle

Considering a position $(X, Y)$, the risk field model of a surrounding obstacle vehicle can be given by [30]
$$P_{ov}(X,Y) = A_{ov}\exp\!\left(-\left(\frac{(X - X_{ov})^{\varepsilon}}{(k_{x}L_{ov})^{2}} + \frac{(Y - Y_{ov})^{\varepsilon}}{(k_{y}W_{ov})^{2}}\right)^{\rho}\right) \quad (14)$$
where $A_{ov}$ is the maximum risk field value of the OV, $(X_{ov}, Y_{ov})$ are the center-of-gravity coordinates of the OV, $\varepsilon$ is a coefficient that adjusts the peak shape of the OV risk field, $L_{ov}$ and $W_{ov}$ are the length and width of the OV, $k_x$ and $k_y$ are adjustment coefficients for the OV’s shape and size, and $\rho$ is the higher-order coefficient. Hereinafter, the strength of the risk field is represented by the character $E$.
The schematic diagram of the obstacle risk field is shown in Figure 3a. The smaller the relative distance between the HV and the OV, the greater the DRF field strength at the HV’s position. Additionally, the risk field of the OV has a larger influence range in the longitudinal direction, while its lateral influence range is smaller and limited to the current driving lane. Under normal circumstances, an OV occupies its own traffic lane but should not affect a vehicle in an adjacent lane that has no relative displacement to it. According to Figure 3b, we set the optimal risk field convergence coefficient in the lateral direction of the OV to $k_y = 0.75$.
Furthermore, different aggressiveness values have a considerable impact on the risk field distribution of OVs, as shown in Figure 4. The figure shows that, at the same velocity, the higher the aggressiveness value of the OV, the greater the influence range of the corresponding risk field in the driving direction, reflected by different values of $k_x$.
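The OV field can be evaluated on a grid with a few lines of Python. This is a sketch of Equation (14) under our reconstruction of the garbled source formula, so the exact exponent normalization is an assumption; $k_y = 0.75$ follows Figure 3b, $k_x$ grows with aggressiveness per Figure 4, and the other defaults come from Table 2.

```python
import numpy as np

def p_ov(X, Y, X_ov, Y_ov, A_ov=15.0, L_ov=4.4, W_ov=1.8,
         kx=1.0, ky=0.75, eps=2.0, rho=2.0):
    """Obstacle-vehicle risk field, a sketch of Equation (14).
    eps (peak-shape exponent) defaults to 2.0 as an assumption."""
    ex = np.abs(X - X_ov) ** eps / (kx * L_ov) ** 2
    ey = np.abs(Y - Y_ov) ** eps / (ky * W_ov) ** 2
    return A_ov * np.exp(-(ex + ey) ** rho)
```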

4.2.2. Risk Field of the Road

The motion of the AVs should also be constrained by the road, including lane lines and road boundaries [31]. The risk field model of the road can be expressed as follows
$$P_{Road}(X,Y) = P_{r1} + P_{r2},\qquad P_{r1} = A_{r1}\exp\!\left(-\left(d_{r1} - d_{c} - W_{oc}/2\right)h_{1}\right),\qquad P_{r2} = A_{r2}\exp\!\left(-d_{r2}^{2}\Big/\left(\varepsilon\,\sigma_{line}^{2}\right)\right) \quad (15)$$
where $P_{r1}$ and $P_{r2}$ are the field strengths of the road boundary and lane lines at position $(X, Y)$, respectively; $d_{r1}$ and $d_{r2}$ are the minimum distances from the point $(X, Y)$ to the road boundary and lane lines; $d_c$ is the safety threshold; $W_{oc}$ denotes the vehicle’s width; and $\sigma_{line}$, $\varepsilon$, and $h_1$ are risk field adjustment coefficients.
The three-lane road risk field is shown in Figure 5, where the left panel is a 3D map of the road risk field and the right panel is the corresponding cross-sectional view along the Y-axis. The figure shows that the field strength near the road boundary is large, while the field strength in the free space of the road is very small. Moreover, the convergence coefficient $\varepsilon$ has a significant impact on the road risk field, as shown in Figure 6.
As discussed above, the DRF model needs to reflect various factors and build up a unified risk description of the surrounding environment of the host vehicle. The resulting DRF model is a combination of the obstacle vehicles and the road line marks, which are given as:
$$P_{env} = \sum_{i=1}^{m} P_{ov}(i) + \sum_{j=1}^{n} P_{r1}(j) + \sum_{e=1}^{z} P_{r2}(e) \quad (16)$$
where $m$ is the number of obstacle vehicles, $n$ and $z$ are the numbers of road boundaries and lane lines, respectively, and $P_{env}$ is the total risk value at a position.
An example scenario with an HV and OVs of different aggressiveness types, together with the corresponding DRF contour maps, is shown in Figure 7. The risk level at each location on the road can be visualized from the graph, so the HV can efficiently calculate the safety level of each position on the road in complex traffic scenarios. On this basis, the HV can avoid high-risk areas and try to drive in low-risk areas, making behavioral decisions and planning paths safely and efficiently while improving travel efficiency. Notably, the figure also reflects the different levels of risk that vehicles of different aggressiveness types can tolerate while driving.
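The superposition of Equation (16) then reduces to summing the individual fields over a grid. The helper below assumes the p_ov sketch above and some p_road(X, Y) implementing the road terms of Equation (15); the grid extents are illustrative.

```python
import numpy as np

def p_env(X, Y, ovs, p_road):
    """Total risk at (X, Y): road terms plus every obstacle vehicle's field."""
    risk = p_road(X, Y)
    for ov in ovs:              # each ov: dict of keyword arguments for p_ov
        risk = risk + p_ov(X, Y, **ov)
    return risk

# Example: risk map over a 120 m road stretch, 16 m wide
Xg, Yg = np.meshgrid(np.linspace(0.0, 120.0, 241), np.linspace(-8.0, 8.0, 81))
# risk_map = p_env(Xg, Yg, ovs=[dict(X_ov=60.0, Y_ov=-2.0)], p_road=my_p_road)
```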

4.3. DRF-Based MPC Controller

MPC is used in this paper to predict the motion state of the HV over multiple future steps. The DRF strength of the MPC’s predicted states is taken into account as a critical component of the MPC cost function, so that trajectory planning and motion control are performed synchronously. The field strength at the HV’s $k$th prediction step is defined as follows:
$$\Theta_{DRF}(k) = \Gamma\left[\chi(k), u(k)\right] = P_{env}\left(X(k), Y(k)\right) \quad (17)$$
The MPC cost function should reflect not only the trajectory with the lowest driving risk but also the performance of adjusting the control increments for trajectory tracking and following the center line of the target lane. Furthermore, to ensure the vehicle’s stability and comfort requirements, the front-wheel steering angle should be constrained, so the control increments are also weighted as part of the cost function.
$$\min J(k) = \sum_{i=1}^{N_{p}}\left\|\Theta_{DRF}(i)\right\|_{Q_{1}}^{2} + \sum_{i=1}^{N_{p}}\left\|\Delta\varphi(i)\right\|_{Q_{2}}^{2} + \sum_{i=1}^{N_{p}}\left\|\Delta Y(i)\right\|_{Q_{3}}^{2} + \sum_{i=0}^{N_{c}-1}\left\|\Delta u(k+i\,|\,k)\right\|_{Q_{4}}^{2} \quad (18)$$

$$\text{s.t.}\ \ 0 \le v \le v_{\max},\quad u_{\min} \le u(k+i\,|\,k) \le u_{\max},\quad \Theta_{DRF}(k+i-1) = \Gamma\left[\chi(k+i-1\,|\,k),\ u(k+i-1\,|\,k)\right]$$
where $\Theta_{DRF}(i)$ denotes the field strength at each step of the HV prediction horizon, calculated from $P_{env}$ as defined by Equation (16); $\Delta Y(i)$ and $\Delta\varphi(i)$ are the lateral distance error and yaw angle error between the HV’s predicted position and the center line of lane $\kappa$; $\Delta u(k+i)$ represents the control increment to be minimized; $Q_1$, $Q_2$, $Q_3$, and $Q_4$ are the weighting factors of the MPC controller’s optimization; and $N_p$ and $N_c$ represent the prediction horizon and the control horizon, respectively, with $N_p \ge N_c$.
Equation (18) can be solved by a typical optimization algorithm [32], and the sequence of optimal control increments in the kth step can be obtained.
$$\Delta u^{*}(k) = \left[\Delta u^{*}(k\,|\,k),\ \Delta u^{*}(k+1\,|\,k),\ \ldots,\ \Delta u^{*}(k+N_{c}-1\,|\,k)\right] \quad (19)$$
Then, the control output at the time step k can be calculated.
$$u(k\,|\,k) = u(k-1\,|\,k-1) + \Delta u^{*}(k\,|\,k) \quad (20)$$
Based on Equation (13), the state vector of the HV can be updated, and at the next time step $k+1$ a new round of optimization is performed, until the HV exits the game or successfully changes to the target lane.
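One receding-horizon step of this controller can be sketched as follows. A general-purpose numerical optimizer stands in for the "typical optimization algorithm" of [32]; the weights, horizons, and steering-increment bound are illustrative, and step_kinematics and the risk callable refer to the earlier sketches.

```python
import numpy as np
from scipy.optimize import minimize

def mpc_step(xi0, ax, y_ref, risk, dt=0.1, Np=10, Nc=5,
             Q=(1.0, 0.5, 0.5, 0.1), du_max=0.05):
    """One receding-horizon step of the DRF-based MPC cost of Eq. (18)."""
    def cost(du_seq):
        xi, J = np.asarray(xi0, dtype=float), 0.0
        for i in range(Np):
            du = du_seq[i] if i < Nc else 0.0     # hold steering beyond Nc
            xi = step_kinematics(xi, du, ax, dt)  # predict with Eq. (13)
            _, phi, X, Y, _ = xi
            # risk term + yaw error + lateral error to target-lane center
            J += Q[0] * risk(X, Y) + Q[1] * phi**2 + Q[2] * (Y - y_ref)**2
        return J + Q[3] * float(np.sum(np.square(du_seq)))

    bounds = [(-du_max, du_max)] * Nc             # steering-increment bounds
    res = minimize(cost, np.zeros(Nc), bounds=bounds, method="SLSQP")
    return res.x[0]   # apply only the first increment, cf. Eqs. (19)-(20)
```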

5. Testing and Results Analysis

In this section, two typical highway lane-changing scenarios are designed to test the algorithm’s reliability and effectiveness. All algorithms and scenarios are implemented using MATLAB programs. Test scenario A is a two-lane highway scenario, which is designed to test the impact of the difference in the aggressiveness value of the two sides of the game on decision-making. Scenario B is an extension of the first scenario, and it attempts to test whether the same HV can make an appropriate lane-changing decision in a multi-vehicle game with various aggressive types of obstacle vehicles. This can be interpreted as reflecting the intelligence of the decision-making algorithm in this paper.

5.1. Parameters Setting

Table 2 shows the parameter settings of the decision model and the DRF model.

5.2. Test Scenario A

Scenario A examines a classic two-lane lane-changing scenario on an expressway, as depicted in Figure 8. Because of PV2’s slow velocity, the HV must decide whether to change lanes or slow down to maintain a safe following distance from PV2 in the current traffic environment. To better reveal how aggressiveness types shape the players’ decision-making, scenario A includes two cases. In case 1, FV1’s aggressiveness value is set to 0.6 (moderate type) to analyze the lane-changing decisions and path planning that result when HVs with different aggressiveness levels play against FV1. In case 2, the aggressiveness value of FV1 is changed to 0.2 (conservative type), while the other settings remain unchanged. At the initial moment of the test, the position coordinates of HV, FV1, PV2, and PV1 are set to (8, −2), (6, 2), (60, −2), and (100, 2), respectively, and the corresponding longitudinal velocities are 18 m/s, 10 m/s, 14 m/s, and 25 m/s. To simplify the test scenarios, the velocities of PV2 and PV1 are assumed to remain unchanged in both scenario A and scenario B.
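For reproducibility, the Scenario A initial conditions quoted above can be bundled into a configuration; the dictionary layout is ours, while the numbers come directly from the text.

```python
# Scenario A initial conditions (positions in m, longitudinal velocities in m/s)
scenario_a = {
    "HV":  {"pos": (8, -2),  "vx": 18, "beta": 0.2},  # HV beta varied per case
    "FV1": {"pos": (6, 2),   "vx": 10, "beta": 0.6},  # 0.2 in case 2
    "PV2": {"pos": (60, -2), "vx": 14},               # velocity held constant
    "PV1": {"pos": (100, 2), "vx": 25},               # velocity held constant
}
```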
The detailed test results of the two cases in Scenario A are shown in Figure 9 and Figure 10. When the aggressiveness type of FV1 is fixed, changing the HV’s aggressiveness value has a considerable impact on decision-making and path planning. For instance, when the HV’s aggressiveness is 0.2 and FV1’s aggressiveness is 0.6, the HV prioritizes driving safety when changing lanes and has lower velocity expectations. As a consequence, the HV was unable to change lanes successfully and finally chose the lane-keeping strategy; additionally, FV1 decelerates to keep a safe distance when approaching PV2. An increase in the HV’s aggressiveness value leads to a larger weight on the travel-efficiency cost, so the HV’s evaluation of driving safety becomes bolder. The driving style of the HV also shifts from conservative to aggressive, which directly raises the velocity and acceleration of the HV throughout the interaction. Another critical point is that the aggressiveness value of the HV is inversely related to the time required to make a lane-change decision: the more aggressive the HV, the shorter the time it takes to make a lane-changing decision and complete the lane-changing process.
Similarly, under the premise of the same HV aggressiveness, the aggressiveness type of FV1 also influences HV decision-making to a certain degree. The reduction in aggressiveness value causes FV1 to suppress its velocity expectation to ensure driving safety, which to some extent strengthens the dominant position of the HV in the game. This also reflects that a conservative FV1 hinders the HV’s lane-changing behavior to a lesser degree than an aggressive one. Therefore, the time required for the HV to make a lane-changing decision is shorter, and its acceleration and driving velocity are also higher. The test results in Figure 11 further validate the above analysis.

5.3. Test Scenario B

Scenario B is an upgrade of scenario A, as shown in Figure 12. Both the HV and PV2 are driving in the middle lane, and PV2 is driving slowly. In this situation, the HV must choose whether to slow down or change lanes. If the HV decides to change lanes, it must take into account the specific conditions of the competing vehicles in the target lanes, such as aggressiveness, vehicle position, velocity, and gap. The HV then determines which target lane is less costly and less risky. In scenario B, the multi-vehicle game between the HV and FVi (i = 1, 2) must be considered. The positions of HV, FV1, FV2, PV2, and PV1 at the initial time are set to (8, −2), (6, 2), (5, −6), (60, −2), and (95, 2), and the corresponding longitudinal velocities are 18 m/s, 10 m/s, 12 m/s, 14 m/s, and 26 m/s. Scenario B comprises four cases that test whether the HV can make rational and correct lane-changing decisions in the presence of diverse types of FVi (i = 1, 2). To simplify scenario B and highlight the key points, this paper assumes that when the HV decides to change lanes to the left, the game with FV2 ends immediately; the velocity of FV2 then remains unchanged while the game with FV1 continues until the end of the lane change. Changing lanes to the right follows the same assumption.
The detailed test results of Scenario B are shown in Figure 13 and Figure 14. In summary, because the positions and velocities of FV1 and FV2 do not differ significantly, the difference in their aggressiveness types has a great impact on the HV’s decision-making. For example, when the aggressiveness values of FV1 and FV2 are both 0.6, they take larger accelerations to prevent the HV from encroaching on their drivable space. In this situation, the HV has to stay in its lane to ensure driving safety and accelerate slowly while waiting for the next opportunity to change lanes; when approaching PV2, it slows down and follows it. If the aggressiveness types of FV1 and FV2 are not identical, the HV chooses the lane occupied by the less aggressive obstacle vehicle when changing lanes. From the test results illustrated in Figure 15, it can be seen that the relative velocity and distance between the HV and FVi (i = 1, 2), computed as HV minus FVi, increase as the FV’s aggressiveness rises. This demonstrates that the advantages in velocity and drivable space markedly improve the HV’s initiative and safety when changing lanes while reducing the cost of making a decision. A final meaningful point is that if FV1 and FV2 have the same level of aggressiveness, they hinder the HV’s lane-changing behavior to the same degree; in this condition, the initial positions and velocities of FVi (i = 1, 2) have an obvious influence on the HV’s decision-making. Case 4 of Scenario B demonstrates this point.

6. Discussion

In this paper, the behavior decision algorithm is modeled using Stackelberg game theory; path planning and motion control are then performed in real time based on the decision result, with the vehicle states and path parameters fed back to the decision-making module. As a result, a closed-loop lane-changing decision-making framework that takes into account vehicle interaction and aggressiveness is designed. To improve the adaptability of the decision model to different driving groups, this paper incorporates the characteristics of different aggressiveness types into the design of the decision cost function and the DRF model, and the effect of changes in aggressiveness value on decision-making and path planning is explored. Moreover, the DRF-based MPC controller developed in this paper can respond in real time to decision-making commands and then plan a collision-free trajectory: the controller’s optimizer determines the optimal path, which is then followed by simultaneously adjusting the control output. Another advantage of this controller is that it reduces the computational load and improves real-time computing capability, which is critical for future AV applications.
Two typical highway scenarios are established in the simulation test to validate the algorithm in this paper. The results show that the closed-loop decision-making model can handle interactive behavior in complex traffic scenarios and make reliable lane-changing decisions intelligently. The algorithm has good application prospects.
Future work will further improve the game theory-based decision-making algorithm, primarily in terms of adaptability to more complex and heterogeneous traffic scenarios and dynamic recognition of vehicle aggressiveness types. In terms of algorithm validation, we will consider combining experiments such as driver-in-the-loop simulation and real-world vehicle verification to more accurately evaluate and give feedback on the performance of the algorithm in this paper.

Author Contributions

Conceptualization and methodology, M.Y. and P.L.; formal analysis and writing—original draft preparation, P.L. and M.Y.; Model building and analysis, Z.Y. and P.L.; supervision and writing—review and editing, M.Y., Y.L. and Z.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by State Key Laboratory of Vehicle NVH and Safety Technology (Grant No. NVHHSKL-202111), and supported by the Chongqing Technology Innovation and Application Development Project (Grant No. cstc2021jscx-gksbX0043).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Wang, J.; Zheng, H.; Zong, C. Longitudinal and lateral dynamics control of automatic lane change system. Trans. Inst. Meas. Control 2019, 41, 4322–4338. [Google Scholar] [CrossRef]
  2. González, D.; Pérez, J.; Milanés, V.; Nashashibi, F. A review of motion planning techniques for automated vehicles. IEEE Trans. Intell. Transp. 2015, 17, 1135–1145. [Google Scholar] [CrossRef]
  3. Furda, A.; Vlacic, L. Enabling safe autonomous driving in real-world city traffic using multiple criteria decision making. IEEE Intel. Transp. Syst. 2011, 3, 4–17. [Google Scholar] [CrossRef]
  4. Bai, H.; Cai, S.; Ye, N.; Hsu, D.; Lee, W.S. Intention-aware online POMDP planning for autonomous driving in a crowd. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA, 26–30 May 2015; pp. 454–460. [Google Scholar]
  5. Ulbrich, S.; Maurer, M. Situation assessment in tactical lane change behavior planning for automated vehicles. In Proceedings of the 2015 IEEE 18th International Conference on Intelligent Transportation Systems, Gran Canaria, Spain, 15–18 September 2015; pp. 975–981. [Google Scholar]
  6. Li, L.; Ota, K.; Dong, M. Humanlike driving: Empirical decision-making system for autonomous vehicles. IEEE Trans. Veh. Technol. 2018, 67, 6814–6823. [Google Scholar] [CrossRef]
  7. You, C.; Lu, J.; Filev, D.; Tsiotras, P. Highway traffic modeling and decision making for autonomous vehicle using reinforcement learning. In Proceedings of the IEEE Intelligent Vehicles Symposium (IV), Changshu, China, 26–30 June 2018; pp. 1227–1232. [Google Scholar]
  8. Naranjo, J.E.; Gonzalez, C.; Garcia, R.; De Pedro, T. Lane-change fuzzy control in autonomous vehicles for the overtaking maneuver. IEEE Trans. Intell. Transp. 2008, 9, 438–450. [Google Scholar] [CrossRef]
  9. Vacek, S.; Gindele, T.; Zollner, J.M.; Dillmann, R. Using case-based reasoning for autonomous vehicle guidance. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, San Diego, CA, USA, 29 October–2 November 2007; pp. 4271–4276. [Google Scholar]
  10. Yang, Q.; Koutsopoulos, H.N. A microscopic traffic simulator for evaluation of dynamic traffic management systems. Transport. Res. C-EMER 1996, 4, 113–129. [Google Scholar] [CrossRef]
  11. Brand, M.; Oliver, N.; Pentland, A. Coupled hidden markov models for complex action recognition. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition(CVPR), San Juan, PR, USA, 17–19 June 1997; pp. 994–999. [Google Scholar]
  12. Yu, H.; Tseng, H.E.; Langari, R. A human-like game theory-based controller for automatic lane changing. Transport. Res. C-EMER 2018, 88, 140–158. [Google Scholar] [CrossRef]
  13. Wang, M.; Hoogendoorn, S.P.; Daamen, W.; van Arem, B.; Happee, R. Game theoretic approach for predictive lane-changing and car-following control. Transport. Res. C-EMER 2015, 58, 73–92. [Google Scholar] [CrossRef]
  14. Yoo, J.H.; Langari, R. Stackelberg game based model of highway driving, Dynamic Systems and Control Conference. In Proceedings of the ASME Dynamic Systems and Control Conference joint with JSME Motion and Vibration Conference, Fort Lauderdale, FL, USA, 17–19 October 2012; pp. 499–508. [Google Scholar]
  15. Arbis, D.D.; Dixit, V.V. Game theoretic model for lane changing: Incorporating conflict risks. Accid. Anal. Prev. 2019, 125, 158–164. [Google Scholar] [CrossRef]
  16. Sun, D.J.; Kondyli, A. Modeling Vehicle Interactions during Lane-Changing Behavior on Arterial Streets. Comput. Civ. Infrastruct. Eng. 2010, 25, 557–571. [Google Scholar] [CrossRef]
  17. Khatib, O. Real-time obstacle avoidance for manipulators and mobile robots. Int. J. Robot. Res. 1986, 5, 90–98. [Google Scholar]
  18. Lin-heng, L.; Jing, G.; Xu, Q.; Pei-pei, M.; Bin, R. Car-following model based on safety potential field theory under connected and automated vehicle environment. China J. Highw. Transp. 2019, 32, 76. [Google Scholar]
  19. Ji, J.; Khajepour, A.; Melek, W.W.; Huang, Y. Path planning and tracking for vehicle collision avoidance based on model predictive control with multiconstraints. IEEE Trans. Veh. Technol. 2016, 66, 952–964. [Google Scholar] [CrossRef]
  20. Rasekhipour, Y.; Khajepour, A.; Chen, S.-K.; Litkouhi, B. A potential field-based model predictive path-planning controller for autonomous road vehicles. IEEE Trans. Intell. Transp. 2016, 18, 1255–1267. [Google Scholar] [CrossRef]
  21. Vanlaar, W.; Simpson, H.; Mayhew, D.; Robertson, R. Aggressive driving: A survey of attitudes, opinions and behaviors. J. Saf. Res. 2008, 39, 375–381. [Google Scholar] [CrossRef] [PubMed]
  22. Zhang, Q.; Filev, D.; Tseng, H.E.; Szwabowski, S.; Langari, R. Addressing mandatory lane change problem with game theoretic model predictive control and fuzzy Markov chain. In Proceedings of the Annual American Control Conference (ACC), Milwaukee, WI, USA, 27–29 June 2018; pp. 4764–4771. [Google Scholar]
  23. Nash, J. Non-cooperative games. Ann. Math. 1951, 54, 286–295. [Google Scholar] [CrossRef]
  24. Hang, P.; Huang, C.; Hu, Z.; Lv, C. Driving Conflict Resolution of Autonomous Vehicles at Unsignalized Intersections: A Differential Game Approach. arXiv 2022, arXiv:2201.01424. [Google Scholar] [CrossRef]
  25. Von Stackelberg, H. Market Structure and Equilibrium; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2010. [Google Scholar]
  26. Myerson, R.B. Nash equilibrium and the history of economic theory. J. Econ. Lit. 1999, 37, 1067–1082. [Google Scholar] [CrossRef]
  27. Papavassilopoulos, G.; Cruz, J. Nonclassical control problems and Stackelberg games. IEEE Trans. Auto. Contr. 1979, 24, 155–166. [Google Scholar] [CrossRef]
  28. Sinha, A.; Malo, P.; Deb, K.J. Efficient evolutionary algorithm for single-objective bilevel optimization. arXiv 2013, arXiv:1303.3901. [Google Scholar]
  29. Yang, S.; Wang, Z.; Zhang, H. Kinematic model based real-time path planning method with guide line for autonomous vehicle. In Proceedings of the 2017 36th Chinese Control Conference (CCC), Dalian, China, 26–28 July 2017; pp. 990–994. [Google Scholar]
  30. Tu, Q.; Chen, H.; Li, J. A potential field based lateral planning method for autonomous vehicles. SAE Int. J. Passeng. Cars—Electron. Electr. Syst. 2017, 10, 24–35. [Google Scholar] [CrossRef]
  31. Li, H.; Wu, C.; Chu, D.; Lu, L.; Cheng, K. Combined trajectory planning and tracking for autonomous vehicle considering driving styles. IEEE Access 2021, 9, 9453–9463. [Google Scholar] [CrossRef]
  32. Li, B.-B.; Wang, L.; Liu, B. An effective PSO-based hybrid algorithm for multiobjective permutation flow shop scheduling. IEEE Trans. Syst. Man Cybern. A Syst. Hum. 2008, 38, 818–831. [Google Scholar] [CrossRef]
Figure 1. The lane-changing behavior decision-making framework for AVs.
Figure 2. Stackelberg game theory.
Figure 3. OV’s risk field: (a) 3D map of risk field; (b) Different coefficients of convergence of an OV’s lateral risk fields exerting on the HV.
Figure 4. Risk field distribution for OVs with different aggressiveness values along the X-axis.
Figure 5. Road risk field: (a) 3D map of risk field; (b) A cross-sectional view along the Y-axis.
Figure 6. Different coefficients of convergence of the traffic line mark’s risk fields on the HV.
Figure 7. Contour distribution of HV’s driving risk field in the example scenario.
Figure 8. Testing Scenario A.
Figure 9. Test results for decision making and path planning in Scenario A: (a) FV1_β = 0.6; (b) FV1_β = 0.2.
Figure 10. Velocity test results of Scenario A: (a) FV1_β = 0.6; (b) FV1_β = 0.2.
Figure 11. Box-plots for $V_x$, $A_x$, and $\Delta S$ in Scenario A.
Figure 12. Testing Scenario B.
Figure 13. Test results for decision-making and path planning in Scenario B.
Figure 14. Velocity test results of Scenario B.
Figure 15. Test results of relative distance and relative velocity of Scenario B.
Table 1. Game formulation.

| | HV: Change Lanes $(\alpha_{lc\text{-}\min}^{HV}, \alpha_{lc\text{-}\max}^{HV})$ | HV: Stay $(\alpha_{lk\text{-}\min}^{HV}, \alpha_{lk\text{-}\max}^{HV})$ |
| --- | --- | --- |
| FV $(\alpha_{\min}^{FV}, \alpha_{\max}^{FV})$ | $(U_{lc}^{HV}, U_{0}^{FV})$ | $(U_{lk}^{HV}, U_{1}^{FV})$ |
Table 2. Model parameters (left two parameter–value column pairs: Stackelberg game; right two: DRF).

| Parameter | Value | Parameter | Value | Parameter | Value | Parameter | Value |
| --- | --- | --- | --- | --- | --- | --- | --- |
| $\psi_{v\text{-}lk}^{HV}$ | 0.2 | $\varsigma$ | 1 × 10⁻⁵ | $\rho$ | 2 | $A_{r2}$ | 1.7 |
| $\psi_{s\text{-}lk}^{HV}$ | 8 × 10³ | $d_m$ | 20 | $A_{obs}$ | 15 | $L_{ov}$ | 4.4 |
| $\psi_{v\text{-}lc}^{HV}$ | 0.2 | $\alpha_x^{HV}$ | [−3, 5] | $A_{r1}$ | 1 | $h_1$ | 2.2 |
| $\psi_{s\text{-}lc}^{HV}$ | 8 × 10³ | $v_{x\max}$ | 30 | $W_{ov}$ | 1.8 | $d_c$ | 0.5 |
| $\psi_{acc}^{HV}$ | 0.4 | – | – | $\sigma_{line}$ | 10 | $\varepsilon$ | 10 |

