Article

A Game-Theoretic Intention Planning Method for Autonomous Vehicles

National Key Laboratory of Automotive Chassis Integration and Bionics, Jilin University, Changchun 130025, China
*
Author to whom correspondence should be addressed.
Electronics 2026, 15(5), 1124; https://doi.org/10.3390/electronics15051124
Submission received: 22 January 2026 / Revised: 15 February 2026 / Accepted: 18 February 2026 / Published: 9 March 2026

Abstract

Autonomous vehicles (AVs) must make predictable and socially compliant behavioral decisions to ensure safe and efficient interactions with other road users. To address this challenge, this paper proposes a game-theoretic behavioral decision-making model integrated with spatial motion planning to capture the interactive intentions between the ego vehicle (EV) and target vehicle (TV) in pairwise scenarios. First, the study defines an intention representation method that characterizes intentions using spatial area boundaries, feasible speed ranges, and a set of goal points (speed goal points, position-orientation goal points). Second, a spatial motion planning approach is adopted to evaluate the intention, which optimizes the driving scheme using a multi-objective cost function (incorporating pursuit precision, comfort, energy efficiency, and travel efficiency). Finally, the game-theoretic decision-making model is constructed. The Social Value Orientation (SVO) is introduced to quantify drivers’ social preferences, and the payoff function, which integrates safety rewards (based on inter-vehicle distance) and performance rewards (based on motion planning indices), is established. Simulation results verify that the proposed model can effectively address the interactive intention decision-making problem between the AV and other road users and handle different scenarios.

1. Introduction

With the advancement of autonomous driving technology and the progress of its commercialization, certain vehicles equipped with high-level autonomous driving functions are currently operating in urban road environments. It is foreseeable that during the forthcoming long-term transition period, the road traffic system will consist of human-driven vehicles and autonomous vehicles. In this context, autonomous vehicles must be capable of making predictable and socially compliant behavior decisions to ensure safe and efficient interaction with other road users.
Rule-based methods are among the most popular decision-making approaches for autonomous vehicles. Ref. [1] proposed a rule-based algorithm for unprotected left-turn decisions in mixed traffic scenarios involving AVs and human-driven vehicles. In Reference [2], an event-triggered overtaking decision-making method for highway autonomous vehicles was proposed, where a distance threshold is determined from the speed difference between the host vehicle and the preceding vehicle, and the overtaking decision is triggered when the distance exceeds this threshold. Rule-based decision-making methods guarantee transparency and comprehensibility; however, they are difficult to transfer to complex scenarios, especially when several traffic rules and driver behavior variability must be considered.
With rapid advancements in machine learning technologies, data-driven approaches have gained popularity in formulating decision-making solutions for autonomous vehicles [3,4,5,6,7,8,9,10,11]. In [3], a support vector machine (SVM) with Bayesian parameter optimization was employed in a lane-changing scenario. In [4], a deep neural network solution was devised, taking into account general traffic scenarios and real-world road conditions. In [5], a deep reinforcement learning algorithm was developed for highway driving. In [6], a Q-learning algorithm was utilized to tackle an overtaking scenario. Reference [7] presented an unsignalized intersection decision-making approach combining level-k game theory and deep reinforcement learning. In [8], a Bayesian reinforcement learning method for intersection decisions was proposed. Reference [9] constructed a transfer deep reinforcement learning framework for intersection decisions, which improves online implementation and learning efficiency by transforming one driving task into two tasks. Reference [10] combined KAN and DQN to overcome the deficiencies of traditional reinforcement learning methods (e.g., unstable training, poor generalization capability, and missing security mechanisms). Reference [11] introduced a risk-attention mechanism, a balanced reward function, and a collision-supervised mechanism into DRL to improve the safety and efficiency of AVs in highway interactive driving scenarios. These learning-based approaches are capable of capturing relevant information; nevertheless, they demand high-quality training samples and lack interpretability.
Game-theoretic approaches have demonstrated their efficacy in capturing the interactions among agents with distinct objectives and have been successfully applied to lane changes, merge scenarios, intersection crossings, roundabout traversals, and overtaking [12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32]. In a game formulation, the objective of each vehicle is to optimize its payoff while taking into account the possible actions of other vehicles. In [12,13], Stackelberg games were, respectively, utilized to solve lane-changing and roundabout-crossing scenarios. In [14,15], normal-form games and leader–follower games were, respectively, developed to address intersection-crossing scenarios. In [16], a game-theoretic level-k reasoning method was employed to model the behaviors of human drivers during highway driving scenarios. In [17], the Nash equilibrium game and the Stackelberg game were proposed and compared to model autonomous racing. In [18], a multi-player game was developed to handle the coordination of vehicles on highways. Recent research in this domain has aimed to employ more sophisticated methodologies for payoff formulation [16,17,18,19,20,21,22,23,24,25,26,27,28]. These approaches directly model the trajectory interactions among multiple agents. By constructing payoff functions that consider motion performance, these methods enable autonomous systems to evaluate the potential outcomes of diverse motion strategies. However, directly modeling the interactions between trajectories incurs high computational complexity. Some researchers therefore focus on intention interaction planning. In [29,30], game-theoretic intention planning methods were proposed; these studies skillfully apply game theory to upper-level decision making for autonomous driving. However, the upper-level decisions made by game-theoretic models are not consistently well integrated with lower-level vehicle motion planning.
The aforementioned game-theoretic approaches were all developed for specific traffic scenarios, such as lane-changing or intersection-crossing, and lack the ability to generalize to diverse traffic scenarios due to the characteristics of certain specific game formulations.
To overcome the above challenge, this work proposes an intention planning method applicable to pairwise interactions. In contrast to previous studies, this research makes the following contributions:
(1)
A unified intention planning method: This work constructs a unified game-theoretic intention decision model, which can be used to handle different traffic scenarios by representing driving intention as a motion planning feasible region. The intention decision-making problem is formulated as a motion planning feasible region selection problem.
(2)
Spatial motion planning: This paper evaluates the intention by using an optimal driving scheme that complies with the intention constraints (i.e., a locally optimal driving scheme). In contrast to other work, this paper proposes a spatial motion planning method that represents constraints and determines the optimal driving scheme in the spatial domain rather than the temporal domain.
(3)
Integrating intention planning and motion planning: Through the aforementioned representation and evaluation methods, the intention planning problem is formulated as a game of selecting a feasible region and its locally optimal driving scheme; thus, intention planning is integrated with motion planning.

2. Problem Formulation

In this section, we describe the overall decision-making processes of the model proposed by this study.
In this study, we develop a game-theoretic behavior decision-making model to capture the interactions between the ego vehicle’s intention and other vehicles’ intentions. The common method used in multi-vehicle games is decomposing the problem into pair-wise subproblems, for which the best strategy can be found from the intersection of optimal subproblem solution sets [27]; thus, in this study, we consider a simplified system with an ego vehicle (EV) and one target vehicle (TV). In this simplified game, only the local interaction between the EV and the TV is considered. To overcome this limitation, further research is required to generalize the pairwise game to multi-vehicle interactions. We leave this for future work. Figure 1 shows the overall decision-making framework.
Firstly, we analyze the intention areas of the ego vehicle (EV) and the traffic vehicle (TV), without taking into account the interaction with other traffic participants. Based on this, the topological region intersection method is employed to ascertain whether an intersection area exists. If the intention area of the EV intersects with that of the TV, the arrival time is used to judge whether a conflict occurs between the autonomous vehicle (AV) and the TV. When a conflict is present, the game-theoretic intention decision-making method is applied to determine the intention of each vehicle (either entering first or maintaining sufficient space to let the other vehicle cross first). By formulating the game in all scenarios as a problem of determining and selecting the feasible region, various traffic scenarios are unified, and intention planning is integrated with motion planning. In this research, we concentrate solely on the game-theoretic decision-making method; the details are elaborated in subsequent sections. We assume that the ego vehicle can acquire state information of surrounding vehicles. These states can be obtained through onboard sensor fusion integrating high-definition maps, localization systems, and perception technologies. If such information could additionally be acquired via connectivity with road infrastructure or other vehicles, sensing quality would be further enhanced, enabling more rational decision-making processes for autonomous vehicles.

3. Intention Representation and Determination Methods

To unify the spatial and temporal characteristics of intentions, we represent an intention by its area boundaries, feasible speed range, and a series of goal points (speed goal points, and position and orientation goal points). Since the area boundaries and the feasible speed range vary along the road, we use spatial point sequences to represent them. Spatial area boundaries are described by a point sequence $[[s_0, t_0], [s_1, t_1], \dots, [s_N, t_N]]$, where $s_i$ represents the spatial station and $t_i$ denotes the lateral offset in the Frenet coordinate frame. The feasible speed range is expressed as $[s, V_{\max}, V_{\min}]$, where $V_{\max}$ and $V_{\min}$ are the maximum and minimum feasible speeds at station $s$, respectively. Speed goal points are represented by $[s, V]$, where $V$ is the target speed. Position and orientation goal points are represented by $[s, t, \theta]$, where $t$ and $\theta$ are the target lateral offset and target orientation at station $s$, respectively. The spatial representation depends on the selected coordinate frame; in this study, we use the centerline of the lane (or virtual lane) to establish the Frenet frame in which intentions are represented.
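This representation can be sketched as a simple data structure. The following is a minimal illustration; all field names and the example values are ours, not taken from the paper:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Intention:
    """Spatial-domain intention representation in a Frenet frame built on the
    (virtual) lane centerline: s is the station, t the lateral offset."""
    # area boundary: sequence of [s_i, t_i] points
    boundary: List[Tuple[float, float]] = field(default_factory=list)
    # feasible speed range: (s, V_max, V_min) samples
    speed_range: List[Tuple[float, float, float]] = field(default_factory=list)
    # speed goal points: (s, V)
    speed_goals: List[Tuple[float, float]] = field(default_factory=list)
    # position-orientation goal points: (s, t, theta)
    pose_goals: List[Tuple[float, float, float]] = field(default_factory=list)

# hypothetical lane-keeping intention over a 50 m preview
intent = Intention(
    boundary=[(0.0, -1.75), (50.0, -1.75), (50.0, 1.75), (0.0, 1.75)],
    speed_range=[(0.0, 16.7, 0.0), (50.0, 16.7, 0.0)],
    speed_goals=[(50.0, 13.9)],
    pose_goals=[(50.0, 0.0, 0.0)],
)
```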

3.1. Intention Area Boundaries Calculation

The intention area is the spatial area that the vehicle intends to occupy, determined by three factors: the driver’s intended maneuver, vehicle maneuverability, and interactions with other traffic participants.

3.1.1. Intended Maneuver

We consider three types of intended maneuvers: lane keeping, lane changing, and intersection crossing.
As depicted in Figure 2, when the driver intends to execute a lane-keeping maneuver, the lane boundaries are directly used as the area boundaries. The length of the area is $V T_p + l$, where $V$ denotes the current speed of the vehicle, $T_p$ represents the preview time, and $l$ signifies the fixed area length (which accounts for the vehicle length and a fixed minimum length).
As depicted in Figure 3, when the driver intends to execute a lane-change maneuver, the corresponding area is determined by the projection points of four key points: the start point (the current spatial position of the vehicle), two lane-change points (intermediate spatial positions during the lane-changing process), and the end point. The lane-change points and end point are determined by the vehicle's speed and the speed of the leading vehicle. The lateral boundary is represented by a smooth curve connecting the projection points of these points. The lane-change points are determined by the preparation time $t_{\text{pre}}$ and the lane-change duration $t_{\text{lanechange}}$ (i.e., the first lane-change point is at $V t_{\text{pre}}$, and the second lane-change point is at $V t_{\text{pre}} + V t_{\text{lanechange}}$).
When a driver intends to traverse an intersection, to uniformly address intersections with diverse geometric and topological structures, rather than defining abstract maneuvers such as left turns, right turns, and straight ahead movements, the area boundaries are determined based on the entry lane, exit lane, and virtual lane (which connects the entry lane and the exit lane), as depicted in Figure 4. The boundaries of the virtual lane are generated via a fifth-degree polynomial.

3.1.2. Vehicle Maneuverability

The reachable area is the location set that can be reached from the initial state when applying all feasible steering angle inputs. The reachable area is limited by the vehicle’s maneuverability. As shown in Figure 5, we compute the vehicle’s reachable area by considering its minimum turning radius and forbidding reverse driving in the longitudinal direction.

3.1.3. Interaction with Other Traffic Participants

When navigating within the urban traffic environment, it is occasionally necessary to yield to other traffic participants (i.e., entry into the area occupied by these traffic participants is prohibited). Geometric subtraction is utilized to compute the intention area (see Figure 6).
In summary, the intention area is calculated as follows:
$$A = \left( A_{\text{intent}} \cap A_{\text{maneuverability}} \right) \setminus A_{\text{occupied}}$$
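On a discretized Frenet grid, the area composition above reduces to plain set operations. A minimal sketch, with grid resolution and the example cell sets chosen by us for illustration:

```python
# Each area is a set of occupied grid cells (s_index, t_index) in the Frenet frame.
def intention_area(a_intent, a_maneuver, a_occupied):
    """A = (A_intent ∩ A_maneuverability) \\ A_occupied on a cell grid."""
    return (a_intent & a_maneuver) - a_occupied

a_intent = {(s, t) for s in range(10) for t in range(-1, 2)}    # intended maneuver area
a_maneuver = {(s, t) for s in range(8) for t in range(-2, 2)}   # reachable area
a_occupied = {(s, 0) for s in range(5, 8)}                      # cells claimed by another road user
area = intention_area(a_intent, a_maneuver, a_occupied)
```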

3.2. Feasible Speed Range Calculation

When driving in urban traffic environments, the feasible speed range is mainly governed by speed limits (explicit and implicit) and vehicle maneuverability. Explicit speed limits are specified by traffic rules. Implicit speed limits are induced by the curvature of the road and by interaction constraints. In this study, we consider all of the abovementioned factors to compute the feasible speed range.

3.2.1. Speed Limit Constraints

Regarding the legal constraints, we directly restrict the speed to be lower than the maximum legal speed and higher than the minimum legal speed, as follows:
$$V_{\text{lg},\max}(s) = \min\left( V_{\text{def}}(s),\, V_{\text{upper}}(s) \right)$$
$$V_{\text{lg},\min}(s) = \max\left( 0,\, V_{\text{lower}}(s) \right)$$
where $V_{\text{def}}(s)$ is the default speed, $V_{\text{upper}}(s)$ is the maximum legal speed, and $V_{\text{lower}}(s)$ is the minimum legal speed.
The road curvature has a significant impact on the vehicle’s maximum allowable speed, as excessive speed when navigating curves can lead to loss of control due to centrifugal force. To account for this, we calculate the maximum speed allowed by road curvature using the following formula:
$$V_{\text{cur},\max}(s) = \min\left( V_{\text{def}}(s),\, \sqrt{\frac{a_{y,\max}}{\kappa(s)}} \right)$$
where $a_{y,\max}$ is the maximum acceptable centripetal acceleration and $\kappa(s)$ is the road curvature.
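As a numerical sketch, the curvature cap follows from the lateral-acceleration relation $a_y = V^2 \kappa$; the function and parameter names below are ours:

```python
import math

def v_cur_max(v_def, a_y_max, kappa):
    """Maximum speed allowed by road curvature: min(V_def, sqrt(a_y_max / |kappa|))."""
    if abs(kappa) < 1e-9:          # straight segment: only the default limit applies
        return v_def
    return min(v_def, math.sqrt(a_y_max / abs(kappa)))

# e.g. a 50 m radius curve (kappa = 0.02 1/m) with a_y_max = 2 m/s^2 caps speed near 10 m/s
v = v_cur_max(v_def=33.3, a_y_max=2.0, kappa=0.02)
```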

3.2.2. Interaction Constraints with Other Traffic Participants

When yielding to other road users is required, the vehicle needs to maintain sufficient space to let them pass. Motion prediction results are spatial–temporal (which time the other vehicle arrives at which location). To express intentions uniformly in the spatial domain, we use time as an intermediate variable: $V_{\max}(s)$ is used to calculate $t(s)$ via Formula (5); $t(s)$ is then used to calculate the leading vehicle position $s_l(t(s))$ and its projected speed $\dot{s}_l(t(s))$; combining Formulas (6) and (7) then yields a new $V_{\max}(s)$. The interaction constraints are thus formulated by iteratively calculating $t(s)$ and $V_{\max}(s)$, as shown in Figure 7 (similar in spirit to the EM algorithm). After every calculation step, $V_{\max}(s)$ is clipped to be no larger than $\min\left( V_{\text{lg},\max}(s),\, V_{\text{cur},\max}(s) \right)$. The calculation formulas are
$$t(s) = \int_{0}^{s} \frac{1}{V_{\max}(\sigma)}\, d\sigma$$
$$d_{\text{des}}\left( V_{\max}(s),\, \dot{s}_l(t(s)) \right) = s_l(t(s)) - s$$
$$d_{\text{des}}(V, \dot{s}_l) = V t + \frac{V^{2}}{2 a_{b,\max}} - \frac{\dot{s}_l^{2}}{2 a_{b,\max}}$$
where $t$ in the last formula is the reaction time and $s$ is the station measured from the start point.
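The EM-like loop of Formulas (5)–(7) can be sketched as follows. For simplicity we assume the leading vehicle is predicted to move at constant speed; all function and variable names are ours, not the paper's:

```python
import math

def tighten_vmax_for_yield(s_grid, v_max, lead_s0, lead_v, a_b_max, t_react, n_iter=5):
    """Iteratively couple t(s) (Formula (5)) with the yield-distance
    constraint d_des <= s_l(t(s)) - s (Formulas (6) and (7))."""
    v = list(v_max)
    for _ in range(n_iter):
        # step 1: arrival times implied by the current speed profile
        t = [0.0]
        for i in range(1, len(s_grid)):
            ds = s_grid[i] - s_grid[i - 1]
            v_avg = max(0.5 * (v[i] + v[i - 1]), 0.1)   # avoid division by zero
            t.append(t[-1] + ds / v_avg)
        # step 2: largest V with V*t_react + V^2/(2a) - lead_v^2/(2a) <= gap
        for i, s in enumerate(s_grid):
            gap = (lead_s0 + lead_v * t[i]) - s
            c = gap + lead_v ** 2 / (2.0 * a_b_max)
            if c <= 0.0:
                v_allow = 0.0
            else:
                v_allow = -a_b_max * t_react + math.sqrt(
                    (a_b_max * t_react) ** 2 + 2.0 * a_b_max * c)
            v[i] = min(v[i], max(v_allow, 0.0))
    return v

profile = tighten_vmax_for_yield(
    s_grid=[0.0, 10.0, 20.0], v_max=[20.0, 20.0, 20.0],
    lead_s0=30.0, lead_v=10.0, a_b_max=4.0, t_react=1.0)
```

The closed-form `v_allow` comes from solving Formula (7) as a quadratic in $V$ at each station.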

3.2.3. Vehicle Maneuverability Constraints

In general form, we assume that maximum speeds are specified over space as $V_{\max}(s)$; its generation for the concrete traffic situation is detailed in the subsections above. As $V_{\max}(s)$ may not respect the acceleration constraints $[a_{\min}, a_{\max}]$, the constraint profile is first shaped by two integrations: once backwards from the horizon $S$ to $0$ with $a(s) = a_{\min}(V(s))$ and boundary condition $V(S) = V_{\max}(S)$, and once forwards with $a(s) = a_{\max}(V(s))$ and boundary condition $V(0) = V$, where $V$ is the current vehicle speed, yielding the profiles $V_{\text{bwd}}$ and $V_{\text{fwd}}$. After every integration step, $V(s)$ and $a(s)$ are limited to
$$V(s) = \min\left( V(s),\, V_{\max}(s) \right)$$
$$a(s) = \mathrm{clip}\left( a_{\min}(V(s)),\, a_{\max}(V(s)),\, a(s) \right)$$
Applying the minimum operator then yields a profile that respects the maneuverability limits:
$$V_{\max}(s) = \min\left( V_{\text{fwd}}(s),\, V_{\text{bwd}}(s) \right)$$
$V_{\max}(s)$ is subsequently used to compute the initial value and is incorporated into the optimization as an upper limit for feasible speed profiles.
The minimum speed $V_{\min}(s)$ is calculated analogously, except that $a_{\text{fwd}}(s) = a_{\min}(V(s))$, $a_{\text{bwd}}(s) = a_{\max}(V(s))$, and the maximum operator is used to limit $V(s)$ and $V_{\min}(s)$ as follows:
$$V(s) = \max\left( V(s),\, V_{\min}(s) \right)$$
$$V_{\min}(s) = \max\left( V_{\text{fwd}}(s),\, V_{\text{bwd}}(s) \right)$$
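A discrete sketch of the two-pass shaping for $V_{\max}(s)$, assuming constant acceleration bounds for brevity (in the text, $a_{\min}$ and $a_{\max}$ may depend on $V$); names are ours:

```python
import math

def shape_vmax(s_grid, v_max, a_max, a_min, v_now):
    """Backward pass with a_min, forward pass with a_max, then the minimum
    operator, so the returned profile respects the acceleration limits."""
    n = len(s_grid)
    bwd = list(v_max)
    for i in range(n - 2, -1, -1):            # backwards from horizon S to 0
        ds = s_grid[i + 1] - s_grid[i]
        # braking feasibility: v_i^2 <= v_{i+1}^2 + 2*|a_min|*ds
        bwd[i] = min(v_max[i], math.sqrt(bwd[i + 1] ** 2 + 2.0 * abs(a_min) * ds))
    fwd = [min(v_now, bwd[0])]
    for i in range(1, n):                     # forwards from the current speed
        ds = s_grid[i] - s_grid[i - 1]
        fwd.append(min(v_max[i], math.sqrt(fwd[-1] ** 2 + 2.0 * a_max * ds)))
    return [min(f, b) for f, b in zip(fwd, bwd)]

# vehicle at rest facing a 5 m/s restriction at s = 20 m
prof = shape_vmax([0.0, 10.0, 20.0, 30.0], [20.0, 20.0, 5.0, 20.0],
                  a_max=2.0, a_min=-4.0, v_now=0.0)
```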

3.3. Goal Points Determination

3.3.1. Position and Orientation Goal Points Selection

The central points where the road boundaries experience sudden narrowing or widening are chosen as the position and orientation target points. This selection approach for key target points guarantees that the vehicle strictly drives within the area. When an autonomous vehicle confronts an impending sudden narrowing or widening of the road boundary, it is necessary to adjust the path to prevent scraping the boundary or to fully exploit the expanded driving space. As shown in Figure 8, there are four goal points where the area width is narrowing or widening.

3.3.2. Speed Goal Points Selection

The spatial speed points characterized by acceleration transitions are selected as spatial speed target points. These points are generated by the aforementioned types of constraints, specifically, road geometric constraints (e.g., sudden changes in curve radius), traffic regulation constraints (e.g., speed limit adjustments), and vehicle dynamic constraints (e.g., maximum allowable acceleration); for instance, a sharp increase in road curvature will force the vehicle to reduce speed, leading to a transition in acceleration and, thus, creating a speed goal point. These constraint-derived breakpoints are critical for ensuring that spatial speed target points align with real-world operational limits. As shown in Figure 9, the blue speed points are selected as target points as they correspond to the motion trends’ alteration.

4. Intention Evaluation Method

In real-world driving scenarios, drivers plan the vehicle's driving scheme, i.e., the pose profile and speed profile, based on their distinct driving intentions. Given the direct and tight correlation between a driver's motion planning behavior and their driving intention, we leverage the optimal driving scheme that achieves the best motion performance to assess the driver's underlying intention (i.e., we evaluate intentions by analyzing the optimal driving scheme that drivers can achieve in the feasible region formed by each intention, rather than using simple discrete accelerations or fixed motion plans). To this end, we propose a spatial motion planning approach, which is detailed in the following subsections.

4.1. Motion Model

The vehicle's state vector in the global coordinate system is defined as $x = [X, Y, \theta, V]$, where $X, Y \in \mathbb{R}$ are the horizontal and vertical coordinates, $\theta$ is the heading angle, and $V$ is the longitudinal speed. The control input vector is $u = [M_{\text{lon}}, M_{\text{lat}}]$, where
$$M_{\text{lon}} = \left[ [a_1, L_{a,1}], [a_2, L_{a,2}], \dots, [a_{n_a}, L_{a,n_a}] \right]$$
$$M_{\text{lat}} = \left[ [\kappa_1, L_{\kappa,1}], [\kappa_2, L_{\kappa,2}], \dots, [\kappa_{n_\kappa}, L_{\kappa,n_\kappa}] \right]$$

4.2. Motion Primitives

We use constant acceleration motion as longitudinal motion primitive and constant curvature motion as lateral motion primitive.
For constant acceleration motion, the motion equation is
$$V = \sqrt{V_0^{2} + 2 a L_a}$$
where V 0 is the initial speed, and a and L a are the acceleration and distance of the motion, respectively.
For constant curvature motion, the motion equations are derived from the vehicle kinematic model:
$$X = X_0 + \frac{1}{\kappa} \sin(L_\kappa \kappa) \cos\theta_0 - \frac{1}{\kappa} \left( 1 - \cos(L_\kappa \kappa) \right) \sin\theta_0$$
$$Y = Y_0 + \frac{1}{\kappa} \sin(L_\kappa \kappa) \sin\theta_0 + \frac{1}{\kappa} \left( 1 - \cos(L_\kappa \kappa) \right) \cos\theta_0$$
$$\theta = \theta_0 + L_\kappa \kappa$$
where [ X 0 , Y 0 , θ 0 ] is the initial pose, and κ and L κ are the curvature and distance of the motion, respectively.
In the case of $\kappa = 0$, $\frac{1}{\kappa}$ is unbounded, and we use the following equations to model this special case:
$$X = X_0 + L_\kappa \cos\theta_0$$
$$Y = Y_0 + L_\kappa \sin\theta_0$$
$$\theta = \theta_0$$
The driving scheme is obtained by recursively applying the above equations.
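The recursion can be sketched as a single-step propagation function. For compactness we take $L_a = L_\kappa = L$; the function name and example values are ours:

```python
import math

def step_primitive(x, y, theta, v, a, kappa, L):
    """Propagate one constant-acceleration, constant-curvature primitive of length L."""
    v_next = math.sqrt(max(v * v + 2.0 * a * L, 0.0))     # V = sqrt(V0^2 + 2 a L_a)
    if abs(kappa) < 1e-9:                                 # straight-line special case
        return x + L * math.cos(theta), y + L * math.sin(theta), theta, v_next
    dphi = L * kappa                                      # swept arc angle
    x_next = x + (math.sin(dphi) * math.cos(theta)
                  - (1.0 - math.cos(dphi)) * math.sin(theta)) / kappa
    y_next = y + (math.sin(dphi) * math.sin(theta)
                  + (1.0 - math.cos(dphi)) * math.cos(theta)) / kappa
    return x_next, y_next, theta + dphi, v_next

# quarter circle of radius 10 m starting from the origin, heading along +X
x, y, th, v = step_primitive(0.0, 0.0, 0.0, 10.0, 0.0, kappa=0.1, L=0.5 * math.pi * 10.0)
```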

4.3. Driving Scheme Optimization Strategies

With preset legality and safety constraints, the vehicle ensures safe and legal driving. To plan the driving scheme, we also consider pursuit precision, comfort, energy consumption, and efficiency indices for optimization. This comprehensive consideration ensures that the selected driving scheme provides comfortable, energy-saving, and efficient driving while maintaining safety and legality.
Thus, the total cost function, incorporating all the abovementioned indices, is
$$J = \omega_{\text{pursuit}} J_{\text{pursuit}} + \omega_{\text{comfort}} J_{\text{comfort}} + \omega_{\text{eco}} J_{\text{eco}} + \omega_{\text{efficiency}} J_{\text{efficiency}}$$
where $\omega_{\text{pursuit}}$, $\omega_{\text{comfort}}$, $\omega_{\text{eco}}$, and $\omega_{\text{efficiency}}$ are the weight coefficients for the pursuit precision cost, comfort cost, energy consumption cost, and efficiency cost, respectively.
The pursuit precision index quantifies the degree of deviation of the planned driving scheme from the goal points, with pursuit precision cost function formulated as follows:
$$J_{\text{pursuit}} = \sum_{i} \left( \omega_{t}\, \Delta t_i^{2} + \omega_{\theta}\, \Delta\theta_i^{2} \right) + \sum_{i} \omega_{V}\, \Delta V_i^{2}$$
where $\omega_t$, $\omega_\theta$, and $\omega_V$ are the weight coefficients for the lateral deviation cost, heading angle deviation cost, and velocity deviation cost, respectively, and $\Delta t_i$, $\Delta\theta_i$, and $\Delta V_i$ are the lateral offset, heading angle, and velocity deviations at the $i$-th goal point.
Using frequency weighted acceleration to formulate the comfort costs, the comfort cost function is defined as
$$J_{\text{comfort}} = k_{x}\, a_{wx}^{2} + k_{y}\, a_{wy}^{2}$$
where $k_x$ and $k_y$ denote weighting factors for the longitudinal and lateral directions, and $a_{wx}$ and $a_{wy}$ are the frequency-weighted longitudinal and lateral accelerations, respectively.
The energy-saving cost is evaluated by the total energy consumed over the driving scheme:
$$J_{\text{eco}} = \left( \int_{0}^{s_{\text{end}}} E\left( a(s), V(s) \right)\, ds \right)^{2}$$
The efficiency of the planned trajectory is reflected in the total time consumed by the driving scheme, and the efficiency function is
$$J_{\text{efficiency}} = t^{2}$$
where $t$ represents the total time consumed.
The driving scheme optimization model is
$$\begin{aligned}
\min\quad & J \\
\text{s.t.}\quad & t_{\text{right}} \le t \le t_{\text{left}} \\
& V_{\min} \le V \le V_{\max} \\
& a_{x,\min} \le a_{x} \le a_{x,\max} \\
& \left| a_{y} \right| \le a_{y,\max} \\
& \left| s_{\text{end}} - s_{\text{end}}^{\text{des}} \right| \le s_{\text{tol}} \\
& \left| t_{\text{end}} - t_{\text{end}}^{\text{des}} \right| \le t_{\text{tol}} \\
& \left| \theta_{\text{end}} - \theta_{\text{end}}^{\text{des}} \right| \le \theta_{\text{tol}} \\
& \left| V_{\text{end}} - V_{\text{end}}^{\text{des}} \right| \le V_{\text{tol}}
\end{aligned}$$
where $s_{\text{tol}}$, $t_{\text{tol}}$, $\theta_{\text{tol}}$, and $V_{\text{tol}}$ represent the tolerances of the terminal station, lateral offset, heading angle, and speed deviations, respectively.
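As a numeric sketch of how the weighted cost $J$ is assembled from its terms; all weights and deviation values below are illustrative, not tuned values from the paper:

```python
def pursuit_cost(pose_devs, speed_devs, w_t, w_theta, w_v):
    """Sum of squared deviations from position-orientation and speed goal points."""
    j = sum(w_t * dt ** 2 + w_theta * dth ** 2 for dt, dth in pose_devs)
    return j + sum(w_v * dv ** 2 for dv in speed_devs)

def total_cost(terms, weights):
    """J = sum_k w_k * J_k over the pursuit / comfort / eco / efficiency terms."""
    return sum(weights[k] * terms[k] for k in terms)

# one pose goal missed by 0.5 m laterally and 0.1 rad in heading; one speed goal by 2 m/s
j_pursuit = pursuit_cost([(0.5, 0.1)], [2.0], w_t=1.0, w_theta=2.0, w_v=0.5)
J = total_cost({"pursuit": j_pursuit, "comfort": 1.2, "eco": 0.8, "efficiency": 2.5},
               {"pursuit": 1.0, "comfort": 0.5, "eco": 0.2, "efficiency": 0.3})
```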

5. Game Theoretic Decision Making Model

This study contemplates a game that encapsulates the intention interaction between the ego vehicle and traffic vehicles to ascertain which vehicle crosses the intersection area first. In this game, each player endeavors to maximize their payoff. The payoff rules for each player are identical, taking into account safety and motion performance when executing the intention. Each player makes a decision based on the instantaneous state at the current moment and the prediction of the future to maximize the payoff. The game is defined as
$$G = (N, A, U)$$
where $N$ is the set of players, $A$ is the set of actions, and $U$ is the set of payoffs. In this game, the player set is $N = \{\text{Ego Vehicle},\ \text{Traffic Vehicle}\}$, and the action set of each player is $A = \{\text{yield},\ \text{cross}\}$, where "cross" represents the action "enter the intersection area first" and "yield" represents "keep sufficient space for the other vehicle to cross first".

5.1. Pay-Off Functions Formulation

The pay-off matrix established in this study is shown in Table 1. In this game, the factors taken into account by the players are their own rewards and those of other participants. Referring to [31], we employ the Social Value Orientation (SVO) to quantify the drivers’ social preferences for distributing payoffs between themselves and others. The payoff function is the mathematical form of the considered factors:
$$U_{ij}^{E} = \cos(\varphi^{E})\, r_{ij}^{E} + \sin(\varphi^{E})\, r_{ij}^{T}$$
$$U_{ij}^{T} = \cos(\varphi^{T})\, r_{ij}^{T} + \sin(\varphi^{T})\, r_{ij}^{E}$$
where $\varphi^{E}$ and $\varphi^{T}$ are the SVO angles of the ego vehicle and the traffic vehicle, respectively, and $r_{ij}^{E}$ and $r_{ij}^{T}$ represent the ego vehicle reward and the traffic vehicle reward, respectively.
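A minimal numeric sketch of the SVO-weighted utility; the reward values are illustrative:

```python
import math

def svo_utility(phi, r_self, r_other):
    """U = cos(phi) * r_self + sin(phi) * r_other   (SVO-weighted payoff)."""
    return math.cos(phi) * r_self + math.sin(phi) * r_other

# an egoistic driver (phi = 0) ignores the other's reward;
# a prosocial driver (phi = pi/4) weighs both rewards equally
u_egoistic = svo_utility(0.0, 1.0, -2.0)
u_prosocial = svo_utility(math.pi / 4.0, 1.0, -2.0)
```

With these values the prosocial utility turns negative, so a prosocial driver would avoid an action that helps itself slightly but hurts the other vehicle strongly.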
Safety is regarded as a constraint; that is, a collision must be avoided in any case. The motion performance when executing the evaluated intention is another crucial factor. We formulate the reward function as follows:
$$r = \omega_{\text{safe}}\, r_{\text{safe}} + \omega_{\text{performance}}\, r_{\text{performance}}$$
where $\omega_{\text{safe}}$ and $\omega_{\text{performance}}$ are the weight coefficients for the safety reward $r_{\text{safe}}$ and the performance reward $r_{\text{performance}}$, respectively.
The reward calculation method is shown in Figure 10.
The safety reward quantifies the safety level as each vehicle executes its intention. We use the distance between the vehicle and surrounding traffic participants to construct an evaluation function as follows:
$$r_{\text{safe}} = \begin{cases} \left( 1 - \dfrac{d_{\text{des}}}{d} \right)^{\alpha}, & \dfrac{d_{\text{des}}}{d} < 1 \\[6pt] -\theta \left( \dfrac{d_{\text{des}}}{d} - 1 \right)^{\beta}, & \dfrac{d_{\text{des}}}{d} \ge 1 \end{cases}$$
where $d_{\text{des}}$ represents the desired yield distance and $d$ represents the distance between the two vehicles. The remaining parameters follow reference [32]: $\alpha = 0.88$, $\beta = 0.88$, $\theta = 2.25$.
The distance between two vehicles, d, is calculated as shown in Figure 11.
The desired yield distance $d_{\text{des}}$ is calculated considering the reaction time $t$, maximum deceleration $a_{b,\max}$, following vehicle speed $V_{\text{follow}}$, and leading vehicle speed $V_{\text{lead}}$, as follows:
$$d_{\text{des}} = V_{\text{follow}}\, t + \frac{V_{\text{follow}}^{2}}{2 a_{b,\max}} - \frac{V_{\text{lead}}^{2}}{2 a_{b,\max}}$$
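A direct numeric check of the yield-distance formula (speeds in m/s; the example values are ours):

```python
def d_desired(v_follow, v_lead, t_react, a_b_max):
    """Reaction-time gap plus the difference of the two braking distances."""
    return (v_follow * t_react
            + v_follow ** 2 / (2.0 * a_b_max)
            - v_lead ** 2 / (2.0 * a_b_max))

# follower at 20 m/s, leader at 15 m/s, 1 s reaction, 4 m/s^2 maximum braking
d = d_desired(20.0, 15.0, 1.0, 4.0)
```

Note that at equal speeds only the reaction-time term remains, which matches the intuition that no extra braking margin is needed.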
The performance reward quantifies the motion performance as each vehicle executes its intention. We use the previously established motion performance indices to conduct the evaluation as follows:
$$r_{\text{performance}} = \begin{cases} \left( J - J_{I} \right)^{\alpha}, & J - J_{I} > 0 \\[4pt] -\theta \left( J_{I} - J \right)^{\beta}, & J - J_{I} \le 0 \end{cases}$$
where $J$ represents the optimal motion performance without considering the interaction with other vehicles, and $J_{I}$ denotes the optimal motion performance when executing the evaluated intention.

5.2. Decision-Making Method

Assuming that the probability of the other vehicle choosing the “yield” strategy is P, the probability of it choosing the “cross” strategy is 1 − P. We formulate the strategy of the ego vehicle as follows:
$$I^{*} = \arg\max_{i} E(U_i)$$
$$E(U_i) = P\, U_{i1} + (1 - P)\, U_{i2}$$
where $E(U_i)$ denotes the expected utility of intention $i$, and $U_{i1}$ and $U_{i2}$ denote the utilities of intention $i$ when the other vehicle yields and crosses, respectively.
As indicated by the aforementioned formula, to calculate the expected utility, it is crucial to compute the probability of each intention of the other vehicle. To compute the probability of each intention, we formulate the probability calculation process as shown in Figure 12.
We iteratively calculate P ( I E ) and P ( I T ) . In each iteration, the probabilities are calculated based on last iteration probabilities and the intended probabilities as follows:
$$P^{K}(I_i) = (1 - \gamma)\, P^{K-1}(I_i) + \gamma\, P_{\text{intend}}^{K}(I_i)$$
where $P^{K}(I_i)$ is the $K$-th iteration probability of intention $I_i$, and $P_{\text{intend}}$ represents the intended probability, which is calculated from the expected utility of each intention as follows:
$$P_{\text{intend}}(I_i) = \frac{e^{E(U_i)}}{\sum_{j=1}^{N} e^{E(U_j)}}$$
The initial probabilities of each intention are calculated based on the distance between the observed acceleration and the expected acceleration when the corresponding intention is executed, as follows:
$$P^{0}(\text{cross}) = P(\text{cross} \mid O) = \begin{cases} 1, & a \ge a_{\text{cross}} \\[4pt] \dfrac{f(a \mid I_{\text{cross}})}{\sum_{i=1}^{N} f(a \mid I_i)}, & a_{\text{yield}} < a < a_{\text{cross}} \\[4pt] 0, & a \le a_{\text{yield}} \end{cases}$$
$$P^{0}(\text{yield}) = 1 - P^{0}(\text{cross})$$
where a is the observed acceleration, a cross is the expected acceleration when the driver executes cross intention, and a yield is the expected acceleration when the driver executes yield intention.
$$f(a \mid I_i) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(a - a_i)^{2}}{2\sigma^{2}}}$$
where $a$ is the observed acceleration, $a_i$ is the expected acceleration when intention $I_i$ is executed, and $\sigma$ is the standard deviation.
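The probability machinery (softmax intended probability, exponential smoothing, and Gaussian-likelihood initialization) can be sketched as follows; since the Gaussian normalization constant cancels in the ratio, it is omitted. Function names are ours:

```python
import math

def intended_probs(expected_utils):
    """P_intend(I_i) = exp(E(U_i)) / sum_j exp(E(U_j))   (softmax)."""
    exps = [math.exp(u) for u in expected_utils]
    z = sum(exps)
    return [e / z for e in exps]

def update_prob(p_prev, p_intend, gamma):
    """P^K = (1 - gamma) * P^(K-1) + gamma * P_intend^K."""
    return (1.0 - gamma) * p_prev + gamma * p_intend

def p0_cross(a, a_yield, a_cross, sigma):
    """Initial cross probability from the observed acceleration a."""
    if a >= a_cross:
        return 1.0
    if a <= a_yield:
        return 0.0
    lik = lambda mu: math.exp(-(a - mu) ** 2 / (2.0 * sigma ** 2))
    return lik(a_cross) / (lik(a_cross) + lik(a_yield))
```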

6. Experiments

The trajectory tracking method is constructed to follow the motion output by the intention decision-making model, and the simulation is verified in VTD for different scenarios. Both the EV and the TV are controlled by the method proposed in this paper.

6.1. Integrated Longitudinal and Lateral Motion Control Method

The simulation runs on two computers: an Intel Core i7-11800H CPU @ 1.80 GHz/2.30 GHz with 16 GB RAM, which runs the proposed decision model and the other modules (e.g., motion control), and an Intel Core i7-6700U CPU @ 3.40 GHz with 8 GB RAM, which provides the VTD simulation environment. The integrated longitudinal and lateral motion control method is shown in Figure 13.
Here, the superscript $*$ denotes a desired value; $a_x^{*}$ and $a_y^{*}$ denote the desired longitudinal and lateral accelerations, respectively. $a_x^{*}$ is determined by the decision-making outcome and the vehicle's station $s$; $a_y^{*}$ is calculated from the decision-making outcome, the station $s$, and the vehicle's speed as follows:
$$a_x^{*} = a(s)$$
$$a_y^{*} = V^{2} \kappa(s)$$
where f r represents “resistance force ratio”, and is calculated as follows:
f r   =   F r M
F r is the resistance force that the vehicle needs to overcome, and M is the vehicle’s weight.
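As a sketch, the two feed-forward quantities reduce to one-line computations (assuming the planned acceleration profile $a(s)$ and curvature profile $\kappa(s)$ are supplied by the planner as callables; the names are illustrative):

```python
def desired_accelerations(a_of_s, kappa_of_s, s, v):
    """Desired accelerations from the planned profiles:
    a_x* = a(s) and a_y* = v^2 * kappa(s)."""
    return a_of_s(s), v ** 2 * kappa_of_s(s)

def resistance_force_ratio(F_r, M):
    """f_r = F_r / M: resistance force normalized by vehicle mass,
    used as a feed-forward term in the longitudinal controller."""
    return F_r / M
```

For example, at 20 m/s on a path with curvature 0.01 1/m, the desired lateral acceleration is 4 m/s².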

6.2. Cut-In Scenario Test

In this test scenario, the target vehicle performs a lane change, cutting into the lane in which the ego vehicle is traveling. The ego vehicle and the target vehicle drive at speeds of 100 km/h and 80 km/h, respectively; the ego vehicle's position along the lane is $s_e = 0$ m and the target vehicle's is $s_t = 25$ m. $V_{\text{default}}$ is set to 120 km/h, and the initial accelerations are set to 0 m/s². If neither vehicle alters its motion state, a collision will occur at the conflict area. The algorithm designed in this paper is deployed in both the ego vehicle and the target vehicle. The SVO angles of the ego vehicle and the target vehicle in each test case are set to (a) $\frac{\pi}{4}$ and 0, (b) 0 and $\frac{\pi}{4}$, (c) $-\frac{\pi}{4}$ and 0, (d) 0 and $-\frac{\pi}{4}$. The test scenario is illustrated in Figure 14.
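The SVO angle is commonly mapped to a payoff weighting in which an agent's utility blends its own reward with the other agent's. Assuming the standard cosine–sine form (not restated in this section), a minimal sketch is:

```python
import math

def svo_utility(r_self, r_other, phi):
    """Blend one's own and the other's reward by the SVO angle phi:
    phi = pi/4 -> prosocial (equal weight on both rewards),
    phi = 0    -> egoistic (only own reward counts),
    phi = -pi/4 -> competitive (the other's gain counts negatively)."""
    return math.cos(phi) * r_self + math.sin(phi) * r_other
```

With equal unit rewards, a prosocial agent values the joint outcome most, an egoistic agent sees only its own reward, and a competitive agent discounts the outcome entirely because the opponent gains as much as it does.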

6.2.1. Case A

In this case, the EV's driving style is set to prosocial and the TV's to egoistic. As shown in Figure 15, both vehicles tend to decelerate at the beginning: the TV decelerates to avoid a collision, and the EV decelerates to give way to the TV. At t = 0.8 s, the TV observes that the EV is decelerating and, because the EV is prosocial and thus inclined to give way, predicts that it can cross first when it arrives at the conflict area. Based on this observation and prediction, the TV accelerates to cross the area first. At t = 2.0 s, there is sufficient space, and the EV accelerates to follow the TV.

6.2.2. Case B

In this case, the EV's driving style is set to egoistic and the TV's to prosocial. As shown in Figure 16, both vehicles tend to decelerate at the beginning: the EV decelerates to avoid a collision, and the TV decelerates to give way to the EV. At t = 0.8 s, the EV observes that the TV is decelerating and, because the TV is prosocial and thus inclined to maximize the joint reward, predicts that sufficient space will be available when it arrives at the conflict area. The EV therefore accelerates to cross the area first. At t = 1.9 s, there is sufficient space, and the TV accelerates to follow the EV.

6.2.3. Case C

In this test, the EV adopts a competitive driving style, while the TV operates with an egoistic one. Figure 17 illustrates the results: the EV selects maximum acceleration to force the TV to decelerate so that it can traverse the conflict area first. The TV decelerates at the beginning to avoid a collision; then, observing that the EV maintains its acceleration, it keeps decelerating, allowing the EV to cross first. At t = 0.8 s, the space is sufficient, and the TV accelerates to follow the EV.

6.2.4. Case D

Figure 18 shows the displacement and speed of each vehicle, as well as the distance between the EV and the TV. The TV, configured with a competitive driving style, initially selects maximum acceleration to force the EV to decelerate so that it can traverse the conflict area first. The EV initially decelerates to avoid a collision; observing that the TV continues to accelerate, it maintains its deceleration, allowing the TV to cross first. At t = 1.2 s, sufficient space is available, and the EV begins accelerating.

6.3. Intersection Scenario Test

In this test scenario, the ego vehicle and the target vehicle both perform a go-straight maneuver. Both drive at 30 km/h, and the distances between their current locations and the nearest side of the conflict area are both 20 m. $V_{\text{default}}$ is set to 60 km/h, and the initial accelerations are set to 0 m/s². If neither vehicle alters its motion state, a collision will occur at the conflict area. The algorithm designed in this paper is deployed in both vehicles. The SVO angles of the ego vehicle and the target vehicle in each test case are set to (a) 0 and $\frac{\pi}{4}$, (b) 0 and $-\frac{\pi}{4}$. The test scenario is illustrated in Figure 19.

6.3.1. Case A

Figure 20 shows the simulation result of intersection case A, including the longitudinal positions of the ego vehicle and the target vehicle in their respective road Frenet coordinate systems, the speeds of the two vehicles, and the inter-vehicle distance measured in the road coordinate system of the vehicle that enters the conflict area later (the target vehicle in this case).

In this case, the target vehicle is set to a prosocial preference, while the ego vehicle is set to an egoistic one. The results in Figure 20 show that, because of its prosocial preference, the target vehicle decelerates to provide sufficient space, allowing the ego vehicle to pass through the conflict area first. At the initial moment, the target vehicle's acceleration is 0 m/s², so the ego vehicle is highly uncertain about whether the target vehicle is preparing to yield. Because the ego vehicle is egoistic, it considers only its own interests when making decisions; to ensure its own safety, it decelerates in the initial stage to avoid a collision. Subsequently, the ego vehicle observes that the target vehicle maintains its deceleration and, combined with the target vehicle's prosocial preference, infers that it is inclined to yield. The ego vehicle then switches from yielding to accelerating to cross the conflict area first.

6.3.2. Case B

Figure 21 shows the simulation result of intersection case B. Because the target vehicle is set to a competitive driving style, it accelerates to force the ego vehicle to decelerate and yield, so that it can pass through the conflict area first. At the initial moment, the target vehicle's acceleration is 0 m/s², and the ego vehicle is highly uncertain about whether it will yield. Because the ego vehicle is egoistic, it makes decisions solely based on its own interests; to ensure its own safety, it decelerates at the beginning to avoid a collision. Subsequently, the ego vehicle observes that the target vehicle continues to accelerate and, combined with the competitive driving style, infers that the target vehicle is unlikely to yield. The ego vehicle therefore continues decelerating to ensure its own safety.

Under identical initial conditions, a more aggressive vehicle acquires greater priority through successive game stages and is therefore more likely to accelerate and cross the conflict area first, whereas cautious vehicles tend to be constrained by other agents and let them cross first. Across the driving-style settings, a vehicle with an egoistic style tends to decelerate at the beginning to avoid a collision and later decides whether to cross first or give way based on the other vehicle's style; a prosocial vehicle tends to maintain sufficient space to let the other vehicle cross first; and a competitive vehicle accelerates to force the other vehicle to give way. These results are consistent with real-world driving behavior.
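The pairwise yield/cross interactions analyzed above correspond to the 2 × 2 pay-off matrix of Table 1. As an illustrative sketch (not the paper's iterative solution procedure), the pure-strategy equilibria of such a matrix can be enumerated by checking unilateral deviations:

```python
def pure_nash_equilibria(U_E, U_T):
    """Pure-strategy Nash equilibria of the 2x2 yield/cross game.

    U_E[i][j], U_T[i][j]: ego / target payoffs when the ego plays
    intention i and the target plays intention j (0 = yield, 1 = cross).
    """
    equilibria = []
    for i in range(2):
        for j in range(2):
            # The ego cannot gain by unilaterally switching intention...
            ego_ok = U_E[i][j] >= U_E[1 - i][j]
            # ...and neither can the target.
            tv_ok = U_T[i][j] >= U_T[i][1 - j]
            if ego_ok and tv_ok:
                equilibria.append((i, j))
    return equilibria
```

For a chicken-like payoff structure, two equilibria emerge, one vehicle yielding in each; this mirrors the observed dependence of the crossing order on the two agents' styles.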

6.4. Sensitivity Test

Different hyperparameter settings may lead to different driving styles. Here, three sets of hyperparameter settings were adopted, as shown in Table 2.
We conduct 100 experiments with random initial conditions in each of the two scenarios above. In the cut-in scenario, the initial speed of the EV follows the uniform distribution U(90, 110) km/h, the initial speed of the TV follows U(70, 90) km/h, and the initial location of the TV follows U(20, 40) m. The hyperparameter settings of the EV and the TV are each drawn uniformly from the three types. The random test results are shown in Table 3.
In the intersection scenario, the initial speeds of the EV and the TV both follow U(25, 35) km/h, and the initial distance between each vehicle's location and the nearest side of the conflict area follows U(15, 25) m. The hyperparameter settings of the EV and the TV are each drawn uniformly from the three types. The random test results of the EV are shown in Table 4.
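The random cut-in initial conditions described above can be sketched as a sampler (the field names are illustrative, not from the implemented test harness):

```python
import random

def sample_cut_in_condition(rng):
    """Draw one random cut-in initial condition matching the reported
    distributions: speeds in km/h, TV location in m, and a driving-style
    setting drawn uniformly from the three types of Table 2."""
    styles = ["Conservative", "Normal", "Aggressive"]
    return {
        "v_ev": rng.uniform(90.0, 110.0),   # EV initial speed ~ U(90, 110) km/h
        "v_tv": rng.uniform(70.0, 90.0),    # TV initial speed ~ U(70, 90) km/h
        "s_tv": rng.uniform(20.0, 40.0),    # TV initial location ~ U(20, 40) m
        "style_ev": rng.choice(styles),
        "style_tv": rng.choice(styles),
    }
```

Passing an explicit `random.Random` instance keeps the 100-run experiment reproducible under a fixed seed.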
The comparisons in Table 3 and Table 4 indicate that the aggressive style yields the shortest arrival time, which meets expectations: aggressive drivers choose larger accelerations to cross the conflict area as quickly as possible, whereas the conservative style results in the longest time.
To show the time efficiency of the proposed method, the average computation time is measured: 84 ms in the cut-in scenario and 65 ms in the intersection scenario. The average times for the above scenarios are shown in Figure 22.
The most time-consuming part of the calculation is the trajectory optimization. The computation time scales linearly with the number of discrete intention candidates (regions) generated, not the complexity of the trajectory itself, due to the spatial motion planning decoupling.

6.5. Discussion: Simulation vs. Practical Implementation

Due to current traffic regulations, our vehicle platform cannot operate in real traffic, so the verification in this paper is based on a simulation environment. We establish representative scenarios, commonly used in related research, to verify the proposed method. To a certain extent, these scenarios represent situations encountered in real environments. However, there remain differences between the simulation and the real world, as follows:
Perception uncertainty and state estimation: In our simulation, we assume relatively accurate acquisition of the target vehicle’s state (position, velocity, acceleration). In real-world deployment, sensor data is subject to noise, occlusion, and latency.
Simplified interaction: This study simplifies the environment to pairwise (dyadic) interactions between the ego vehicle and one target vehicle. Real traffic often involves simultaneous, coupled interactions among multiple agents (e.g., a chain of braking vehicles) and diverse road users like pedestrians and cyclists. The pairwise decomposition may lead to suboptimal or conflicting decisions in dense traffic. Future practical systems must extend the game formulation to graph-based or multi-player models to capture these higher-order dependencies.
Fixed hyperparameters: The current model uses fixed hyperparameters (SVO angles of ±π/4) to represent distinct driving styles. In reality, a driver's social preference is a continuous, time-varying parameter influenced by traffic density, road conditions, and psychological state. A practical implementation requires an online parameter estimation module that dynamically infers and updates the opponent's SVO weights in real time, rather than relying on static presets.
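Such an online estimation module could be sketched as a discrete Bayesian filter over candidate SVO angles (an illustrative sketch, not part of the implemented system; the Gaussian likelihood width `sigma` and all names are assumptions):

```python
import math

def update_svo_belief(prior, observed_a, predicted_a, sigma=1.0):
    """One Bayesian update of a discrete belief over candidate SVO angles.

    prior: dict mapping SVO angle -> probability.
    predicted_a: dict mapping SVO angle -> the acceleration the opponent
    would have chosen under that angle (from the planner). The likelihood
    is Gaussian around each prediction.
    """
    posterior = {}
    for phi, p in prior.items():
        err = observed_a - predicted_a[phi]
        likelihood = math.exp(-err ** 2 / (2 * sigma ** 2))
        posterior[phi] = p * likelihood
    z = sum(posterior.values())  # normalize so the belief sums to 1
    return {phi: v / z for phi, v in posterior.items()}
```

Repeating this update at each game stage would concentrate the belief on the SVO angle whose predicted behavior best matches the observed accelerations.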

7. Conclusions

This paper presents a game-theoretic intention planning method for autonomous vehicles capable of capturing vehicle intention interactions. A unified intention representation method is proposed; a key advantage of this method is its applicability to diverse scenarios without modification. Furthermore, an intention evaluation method is introduced, which assesses the quality of an intention using the optimal motion scheme generated during its execution. Finally, a game-theoretic decision-making framework is proposed, employing Social Value Orientation (SVO) to model the social preferences of traffic participants. Simulation results demonstrate that the proposed method ensures safety and efficiency during interactions with other vehicles, effectively handles traffic participants exhibiting diverse driving styles, and allows the autonomous vehicle itself to be configured with different driving styles.
The current work assumes interactions involving only two vehicles; future research should extend this to multi-vehicle scenarios. Additionally, while this study focuses exclusively on vehicle–vehicle interactions, future work must address interactions involving diverse traffic participant types, such as pedestrians and trucks. Validation of the algorithm in real-world environments is also required to enhance its adaptability to complex traffic conditions. This work focuses on constructing a unified model applicable to different scenarios without adjustment, so the hyperparameter settings are fixed and discrete, whereas real drivers' parameters are adaptive and continuous. Self-adaptive parameter modeling and estimation of the other vehicle's parameters need to be studied in future work (e.g., by comparing the target vehicle's observed acceleration against the optimal trajectories generated for various hypothetical SVO angles, the system can iteratively update the probability distribution of the opponent's social preference).
In this work, the states of all agents are assumed to be perfectly perceived; future research on methods that can handle sensing noise is needed. It is also important to note that this study focuses on pairwise interactions. While sufficient for many sparse urban scenarios, pairwise decomposition may encounter limitations in dense traffic where chain-reaction interactions occur. Future work will explore multi-vehicle interaction methods to address this coupling.

Author Contributions

Conceptualization, S.L.; methodology, S.L.; software, S.L.; validation, S.L.; formal analysis, S.L.; investigation, S.L.; resources, S.L.; data curation, S.L.; writing—original draft preparation, S.L.; writing—review and editing, S.L.; visualization, S.L.; supervision, H.G.; project administration, H.G. and X.J.; funding acquisition, H.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key R&D Program of China, grant number 2023YFB2504500, and the Natural Science Foundation of Jilin Province, grant number SKL202302014.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. The framework of the game-theoretic intention decision-making model.
Figure 2. Illustration of lane keeping intent area.
Figure 3. Illustration of lane-change intent area.
Figure 4. Illustration of intersection intent area.
Figure 5. Illustration of maneuverability area border.
Figure 6. Illustration of geometric subtraction.
Figure 7. Illustration of iterative calculation process.
Figure 8. Illustration of position-orientation goal points.
Figure 9. Illustration of speed goal points.
Figure 10. Illustration of reward calculation method.
Figure 11. Illustration of distance.
Figure 12. Illustration of intention probability iterative calculation process.
Figure 13. Framework of acceleration tracking method. (a) Framework of longitudinal acceleration tracking method. (b) Framework of lateral acceleration tracking method.
Figure 14. Illustration of cut-in scenario.
Figure 15. Result of lane-change test case A. (a) Longitudinal displacement, (b) speed, (c) distance between TV and EV.
Figure 16. Result of lane-change test case B. (a) Longitudinal displacement, (b) speed, (c) distance between TV and EV.
Figure 17. Result of lane-change test case C. (a) Longitudinal displacement, (b) speed, (c) distance between TV and EV.
Figure 18. Result of lane-change test case D. (a) Longitudinal displacement, (b) speed, (c) distance between TV and EV.
Figure 19. Illustration of intersection scenario.
Figure 20. Result of intersection test case A. (a) Longitudinal displacement, (b) speed, (c) distance between TV and EV.
Figure 21. Result of intersection test case B. (a) Longitudinal displacement, (b) speed, (c) distance between TV and EV.
Figure 22. Time efficiency. (a) Cut-in scenario average time consumed, (b) intersection scenario average time consumed.
Table 1. Pay-off matrix.

Ego Vehicle \ Target Vehicle    yield                    cross
yield                           U_11^E, U_11^T           U_12^E, U_12^T
cross                           U_21^E, U_21^T           U_22^E, U_22^T
Table 2. Different settings of hyperparameters.

Driving Style    φ       ω_safe   ω_performance   ω_pursuit   ω_comfort   ω_eco   ω_sport
Conservative     π/4     0.7      0.3             0.1         0.6         0.1     0.2
Normal           0       0.5      0.5             0.1         0.4         0.1     0.4
Aggressive       −π/4    0.3      0.7             0.1         0.2         0.1     0.6
Table 3. Simulation results in cut-in scenario.

Driving Style    t_arrive (s)
Conservative     6.23
Normal           5.17
Aggressive       4.62

t_arrive is the time at which the EV arrives at the conflict area.
Table 4. Simulation results in intersection scenario.

Driving Style    t_arrive (s)
Conservative     4.19
Normal           3.05
Aggressive       2.34
Li, S.; Guan, H.; Jia, X. A Game-Theoretic Intention Planning Method for Autonomous Vehicles. Electronics 2026, 15, 1124. https://doi.org/10.3390/electronics15051124