Characterization of the Driving Style by State–Action Semantic Plane Based on the Bayesian Nonparametric Approach

Qiao, Xuqiang; Zheng, Ling; Li, Yinong; Ren, Yuqing; Zhang, Zhida; Zhang, Ziwei; Qiu, Lihong

doi:10.3390/app11177857

by Xuqiang Qiao ¹, Ling Zheng ¹, Yinong Li ¹,*, Yuqing Ren ¹, Zhida Zhang ¹, Ziwei Zhang ¹ and Lihong Qiu ²

¹ College of Mechanical and Vehicle Engineering, Chongqing University, Chongqing 400030, China

² Changan Auto Software Technology Co., Ltd., Chongqing 400030, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2021, 11(17), 7857; https://doi.org/10.3390/app11177857

Submission received: 3 August 2021 / Revised: 15 August 2021 / Accepted: 18 August 2021 / Published: 26 August 2021

(This article belongs to the Section Transportation and Future Mobility)

Abstract:

The quantification and estimation of driving style are crucial to improving road safety and drivers’ acceptance of level 2–level 3 (L2–L3) intelligent vehicles. Previous studies have focused on identifying the difference in driving style between categories, without further consideration of the driving behavior frequency, duration proportion properties, and the transition properties between driving style and behaviors. In this paper, a novel methodology to characterize the driving style is proposed by using the State–Action semantic plane based on the Bayesian nonparametric approach, i.e., the hierarchical Dirichlet process–hidden semi–Markov model (HDP–HSMM). This method segments the time series driving data into fragment clusters with similar characteristics and constructs the State–Action semantic plane based on the statistical characteristics of the state and action layers to label and interpret the fragment clusters. This intuitively and simply visualizes the driving performance of individual drivers, while the risk index of individual drivers can also be obtained through the semantic plane. In addition, according to the joint mutual information maximization (JMIM) approach, seven transition probabilities of driving behaviors are extracted from the semantic plane and applied to identify the driving styles of drivers. We found that aggressive drivers prefer high–risk driving behaviors, and the total duration and frequency of their high–risk behaviors are greater than those of cautious and normal drivers. The transition probabilities among high–risk driving behaviors are also greater than those among low–risk behaviors. Moreover, the transition probabilities provide rich information about driving styles and can effectively improve the classification accuracy of driving styles. Our study has practical significance for the regulation of driving behavior, the improvement of road safety, and the development of advanced driver assistance systems (ADAS).

Keywords:

driving style; driving behaviors; intelligent vehicle; State–Action semantic plane; Bayesian nonparametric approach

1. Introduction

A better understanding of the variability in individual driving styles would be especially useful for understanding driver preferences and mechanisms for vehicle control and path planning, and for developing more realistic traffic simulations [1,2], so as to improve road safety and drivers’ acceptance of L2–L3 intelligent vehicles [3,4]. Previous studies have shown that drivers with an aggressive style are prone to risky behaviors such as rapid acceleration, rapid deceleration, near following, and frequent lane changing. Regulating and warning of such behaviors is conducive to promoting driving safety [5]. In addition, drivers have personalized requirements for advanced driver assistance systems (ADAS) due to their different driving styles. For example, a personalized adaptive cruise control (PACC) was designed to satisfy the needs of drivers with different driving styles [6]. Yang et al. [7] collected lane changing characteristics of various styles and designed a personalized lane change strategy to meet individual lane changing requirements. It is therefore necessary to explore driving styles and their application.

Figure 1 summarizes the framework of driving style analysis [8,9]. Driving style refers to all activities (layers) performed by a driver, including perception, strategic decision, state adjustment, vehicle operation (action), as well as maintaining situation awareness and engaging in secondary tasks. Existing studies have been performed on the above activities. Studies on the perception aspects of driving style focus on visual characteristics before and after lane changes, such as the different focuses in scanning and critical areas [10]. Studies of the decision aspects of driving style consider time–saving or short–distance routes [11]. Studies of the state aspects of driving style consider maneuver preferences such as close following, far following, and frequently changing lanes, etc. [12]. Studies of the operation (action) aspects of driving style include preferences for rapid acceleration, hard braking, etc. [13].

Previous studies showed that driving style mainly concerns the state and operational aspects [13]. These studies categorized driving behavior into driving maneuvers (e.g., following, deceleration with respect to a moving target, lane changing, etc.) [14]. They focused on identifying the difference in driving style between categories based on statistical features of multiple driving layers, without further consideration of the frequency and duration proportion of driving behaviors. This paper focuses on quantifying driving style and revealing the correlation between driving style and driving behavior.

The key to driving style analysis is to segment the time series driving data into fragments and extract effective indicators to characterize driving behaviors. Usually, the vehicle states and driver actions are recorded as time series data and can be decomposed into fragments to characterize the driving style. Schwarzer et al. [15] proposed a novel methodology to generate a stochastic driving cycle by segmenting highly simplified acceleration and deceleration. However, because of its synthetic character, and because numerous real driving situations, such as transition areas from city to country driving, were not considered, it does not reflect actual driving behavior. Higgs and Abbas [16] developed a two–step algorithm to segment drivers’ behaviors in car–following. Eight predefined variables (longitudinal acceleration, lateral acceleration, yaw rate, vehicle speed, lane offset, yaw angle, range, and range rate) were used to obtain 30 state–action clusters, based on which a car–following model related to the driving style was established. Schockenhoff [14] and Zähringer [17] presented a new two–stage segmentation approach. This two–stage classification procedure enables the robust and unambiguous assignment of sequences to four global driving states (acceleration, deceleration, cruising, and idling) with fixed criteria. The results showed that more than 95% of all driving points can be assigned to one of the four global driving states. Taniguchi et al. [18,19] proposed the double articulation analyzer with temporal prediction (DAA–TP) model on the basis of the double articulation analyzer (DAA) model. It was applied to ADAS to predict the driving scene and driving behaviors in the near future. In order to obtain a more precise solution for log time series data, Hyunki et al. [20] proposed a memetic algorithm for multivariate time–series segmentation that calculates the score of a point using regularized covariance. Experiments demonstrated that the proposed method was superior to conventional segmentation methods. Bargi et al. [21] presented an online time series segmentation and behavior recognition model using HDP–HMM (hierarchical Dirichlet process–hidden Markov model). The above approaches can decompose time series data effectively. However, they require prior knowledge about the number of states or clusters. Setting prior information artificially may lead to model overfitting or underfitting. Meanwhile, the residence time distribution of each state is not considered, which may result in extremely short durations for some data fragments.

Although many studies have been carried out to characterize driving styles, most current studies use statistical features or the frequency and duration proportion of driving maneuvers, separately or in combination, to quantify driving styles. In order to obtain better speed control performance, Xu et al. [22] collected driving data under different scenarios through a real vehicle platform and divided drivers into three categories (aggressive, moderate, and mild) according to the statistical characteristics of the data (the mean/standard deviation of brake pressure, throttle position, and vehicle speed). Their analysis showed that the aggressive drivers had the highest values for all throttle position indices, while the mild drivers had the lowest. The hidden Markov model (HMM) has been widely applied to model and predict the driver state and driving behavior; researchers in [23,24] applied an HMM to identify the underlying relationship between observations and driver state. To deal with driver behavior uncertainty in driving style recognition, Han et al. [25] developed a statistical recognition method, based on Bayesian theory, to classify drivers into two groups, i.e., aggressive and normal (typical), using vehicle speed and throttle position. Xue et al. [26] presented a rapid driving style recognition method for car–following scenes based on trajectory features (acceleration, relative speed, and relative distance). These methods make it easy to describe driver characteristics from a statistical perspective. However, under actual traffic conditions, people’s driving behaviors are random, and statistical metrics alone are not sufficient to describe driving styles. Considering user comfort, Bellem [13,27] classified driving style based on objective variables (longitudinal acceleration and jerk) selected according to their frequency of occurrence in real traffic. These variables allow the driving style to be classified on a comfort–oriented scale. However, the maximum acceleration or maximum speed was limited during driving behavior construction. Li et al. [8] presented a method to estimate driving style in highway traffic using the transition probabilities between 12 maneuvers. The results demonstrated that high–risk drivers were more likely to be involved in approaching, near following, and constrained left and right lane changes. The above studies focused on identifying the differences in driving style between groups but did not create a model to quantify individual risk indices.

Driving style plays an important role in improving the safety and ride comfort of autonomous vehicles, although it is difficult to perceive and describe accurately. Decomposing complex driver behaviors into simpler, smaller behaviors can facilitate identifying and analyzing driving styles. In this paper, a novel framework to identify driving style with a quantitative method is proposed (Figure 2). The main contributions of this paper are as follows. (I) The Bayesian nonparametric method, i.e., HDP–HSMM, is applied to segment the time series driving data. It can effectively decompose the time series driving data into fragment clusters with similar characteristics. (II) A novel State–Action semantic plane is proposed to analyze and quantify the driving style, expressing driving preference simply and intuitively. (III) Transition probabilities are extracted from the semantic plane to reveal interrelationships among driving behaviors, and are used to improve the identification accuracy of driving styles. The benefit of the proposed method is further verified by a comparison with the conventional statistical feature method.

The remainder of this paper is organized as follows: Section 2 introduces the simulator platform in detail, including the data collection, participants, and data analysis. Section 3 presents the framework of the driving style analysis and the basic methods. Section 4 demonstrates the segmentation results obtained with HDP–HSMM and the State–Action semantic plane. Section 5 discusses the results in detail, together with possible applications. Lastly, the conclusions are summarized in Section 6.

2. Data Acquisition and Pre–Processing

The driving data acquisition platform is developed on the basis of a simulator, as shown in Figure 2. The vehicle state variables, driver’s operation information, and physiological signals are collected. The impact of the cognitive load on driving safety and physiological characteristics for the cognitive load has been investigated previously [28]. This paper only focuses on the quantitative analysis of the driving style. Therefore, the impact of the cognitive load on driving style and driving behavior is not considered in detail.

2.1. Participants

A total of 33 volunteers (10 females and 23 males) with rich driving experience were recruited. Their ages ranged from 19 to 41 years, with a mean of 26.21 years and a standard deviation (SD) of 5.06 years. The volunteers included students, teachers, taxi drivers, bus drivers, engineers, and others, and their educational background ranged from high school to Ph.D. The participants had an average of 3.5 years of driving experience, ranging from 1 to 9 years. Their average annual mileage was about 4500 km.

2.2. Test Procedure

The highway driving scene was designed to collect driving data in a car–following situation. All participants gave their informed consent and signed the test information book before taking part in the test. Before the formal testing, each driver practiced for 30 min to become familiar with the test procedures and equipment operation. In order to eliminate random errors, the experiments were repeated three times. The testing procedure is shown in Figure 3. The collected data included the driver style questionnaire (DSQ) (6–level Likert scale) [29], the risk perception questionnaire (RPQ), the state of the ego vehicle, and the state of the preceding vehicle. The physiological signals of the drivers were also collected. All collected information is listed in Table 1.

The primary task was following the preceding vehicle, while the secondary task was answering N–Back questions. While performing the primary driving task, the driver heard a broadcast series of random numbers from 0 to 9, with an interval of 2.5 s between consecutive numbers. For instance, when performing the 0–Back task, if the driver hears that two adjacent numbers are the same, he/she needs to answer “Yes”. Figure 4 is a schematic diagram of the N–Back task. The red boxes mark the positions at which the driver should say “Yes”.

2.3. Data Extraction and Pre–Processing

The car–following events were extracted using several simple rules: the ego vehicle and the preceding vehicle were in the same lane; the relative distance L between the ego vehicle and the preceding vehicle was no more than 120 m; the ego vehicle speed was greater than 10 km/h; and each following event lasted no less than 30 s, to obtain sufficient data. Finally, 1104 following events were obtained, with an average of 33.45 events per driver and an average duration of 45.7 s per event.

The three–sigma criterion was used to eliminate abnormal data caused by drivers’ irregular operation and equipment instability. A sample x_i is regarded as abnormal and removed when

|x_{i} - \bar{x}| > 3 σ

(1)

where \bar{x} and σ represent the mean and standard deviation of the data, respectively.

In addition, a Z–score standardization method was used to standardize the selected variables of each event.

{\bar{y}}_{n}^{(m)} = \frac{y_{n}^{(m)} - u_{n}}{σ_{n}}, m = 1, 2, \dots, M, n = 1, 2, \dots, N

(2)

where y = [THW, a_e], m is the event index, n is the driver index, N = 33 is the total number of drivers, {\bar{y}}_{n}^{(m)} is the standardized variable of the m^th event of the n^th driver, and u_n and σ_n are the mean and standard deviation of all events for the n^th driver.
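The two pre–processing steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors’ implementation: the function names are hypothetical, the population standard deviation is assumed (the text does not say which estimator was used), and each variable (THW or a_e) is processed separately.

```python
from statistics import mean, pstdev


def three_sigma_filter(samples):
    """Keep only samples that satisfy |x_i - mean| <= 3 * sigma (Equation (1))."""
    x_bar = mean(samples)
    sigma = pstdev(samples)  # assumption: population standard deviation
    return [x for x in samples if abs(x - x_bar) <= 3 * sigma]


def z_score(events):
    """Standardize one driver's events with that driver's pooled mean and SD
    (Equation (2)). `events` is a list of events, each a list of samples."""
    pooled = [v for event in events for v in event]
    u_n = mean(pooled)
    sigma_n = pstdev(pooled)
    return [[(v - u_n) / sigma_n for v in event] for event in events]


# Hypothetical THW samples (s) with one obvious outlier.
cleaned = three_sigma_filter([2.0] * 20 + [50.0])
standardized = z_score([[1.0, 2.0], [3.0, 4.0]])
```

Note that the outlier itself inflates the mean and standard deviation, so the three–sigma rule can only flag a point when the sample is reasonably large.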

2.4. Data Fundamental Analysis

2.4.1. Subjective Data Analysis

The drivers were divided into three categories (aggressive, normal, and cautious) according to each driver’s subjective questionnaire score. In order to test the reliability of the questionnaire results, Cronbach’s alpha reliability analysis was conducted on the scores of the DSQ and RPQ. The reliability of the two questionnaires was 0.841 and 0.815, respectively. Generally, a reliability of 0.70 is acceptable, and the range between 0.70 and 0.98 indicates high reliability. The analysis results show that the designed questionnaire contents were reasonable and reliable. The K–means clustering method was then applied to the comprehensive scores of the 33 drivers. The higher the score, the more aggressive the driving style and the greater the driving risk. Finally, 16 normal drivers, seven cautious drivers, and 10 aggressive drivers were obtained, as shown in Table 2.
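For readers unfamiliar with the reliability statistic, Cronbach’s alpha for a questionnaire with k items is α = k/(k − 1) · (1 − Σ s_i² / s_T²), where s_i² is the variance of item i and s_T² is the variance of the total score. The sketch below computes it with the Python standard library; the function name and the toy answers are hypothetical and are not the study’s data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for `responses`, a list of respondents,
    each given as a list of k item scores."""
    k = len(responses[0])
    # Variance of each item across respondents.
    item_vars = [variance([r[i] for r in responses]) for i in range(k)]
    # Variance of each respondent's total score.
    total_var = variance([sum(r) for r in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy example: three respondents answering three perfectly consistent items.
alpha = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
```

In this toy case the items move together perfectly, so alpha reaches its upper bound of 1; real questionnaires such as the DSQ and RPQ give lower values.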

2.4.2. Variable Threshold Definition

In this paper, THW and longitudinal acceleration were selected as the state index and action index [30], respectively. In order to provide a semantic explanation for driving behaviors, we classified each variable into different levels based on its statistical features. We fitted different distributions to the data and determined the thresholds of each variable from a statistical perspective. Figure 5a shows the fitting results of acceleration (a_e) and THW using two distributions, i.e., the normal distribution (N) and Student’s t distribution (t). For both acceleration and THW, the t–distribution achieved a better fit than the normal distribution. We therefore selected the thresholds as percentiles of the fitted t–distribution, as illustrated in Figure 5b. Generally, dangerous events are small–probability events. Therefore, values below the 5th percentile or above the 95th percentile were taken as small–probability events, and the corresponding THW and a_e values were calculated with the inverse of the cumulative distribution function (CDF) [31].

t_{T H W}^{\min} = F_{t}^{- 1} (p_{0.05} | υ_{T H W})

(3)

t_{T H W}^{\max} = F_{t}^{- 1} (p_{0.95} | υ_{T H W})

(4)

a_{e}^{\min} = F_{t}^{- 1} (p_{0.05} | υ_{a_{e}})

(5)

a_{e}^{\max} = F_{t}^{- 1} (p_{0.95} | υ_{a_{e}})

(6)

Figure 5b shows the statistical results of driving data and the fitting results of the cumulative distribution function. It can be seen in Figure 5b that the a_e values were −1.42 m/s² and 1.59 m/s², corresponding to probabilities of 5% and 95%, respectively. The THW values were 1.19 s and 2.98 s, corresponding to probabilities of 5% and 95%, respectively. In order to facilitate the calculation, the thresholds were rounded, as shown in Table 3.

According to the results of the statistical analysis and the literature [32], the semantics of the states and actions corresponding to each threshold range were defined. The driving state was divided into three categories based on THW, i.e., near following (NF), middle following (MF), and far following (FF). The driving action was divided into four categories based on acceleration, i.e., aggressive acceleration (AA), normal acceleration (NA), normal deceleration (ND), and aggressive deceleration (AD). In total, 3 × 4 = 12 driving behaviors were analyzed, each with a semantic explanation. For example, when acceleration a_e > 1.6 m/s² and THW < 1.2 s, the behavior was defined as near following with aggressive acceleration (NFAA). Similar semantic explanations were obtained for the other threshold ranges. It is worth noting that these thresholds can be adjusted to reflect changes in the actual driving scenarios or application fields.
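The mapping from a (THW, a_e) pair to one of the 12 semantic labels can be sketched as follows. Only the 1.6 m/s² and 1.2 s thresholds are stated explicitly in the text; the values −1.4 m/s² and 3.0 s are assumed here by rounding the reported 5%/95% values (−1.42 m/s² and 2.98 s), and splitting normal acceleration from normal deceleration at 0 m/s², as well as the handling of boundary values, are also assumptions. The function names are hypothetical.

```python
# Assumed rounded thresholds (see lead-in); Table 3 remains the authoritative source.
A_MIN, A_MAX = -1.4, 1.6      # m/s^2
THW_MIN, THW_MAX = 1.2, 3.0   # s


def state_label(thw):
    """Map time headway (s) to NF / MF / FF."""
    if thw < THW_MIN:
        return "NF"
    if thw <= THW_MAX:
        return "MF"
    return "FF"


def action_label(a_e):
    """Map longitudinal acceleration (m/s^2) to AA / NA / ND / AD."""
    if a_e > A_MAX:
        return "AA"
    if a_e >= 0.0:   # assumption: zero separates acceleration from deceleration
        return "NA"
    if a_e >= A_MIN:
        return "ND"
    return "AD"


def behavior_label(thw, a_e):
    """Combine the state and action labels, e.g. 'NF' + 'AA' -> 'NFAA'."""
    return state_label(thw) + action_label(a_e)


label = behavior_label(1.0, 2.0)
```

With these assumed thresholds, a point with THW = 1.0 s and a_e = 2.0 m/s² falls into the NFAA unit, in line with the example in the text.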

3. Methods

The proposed approach is composed of several steps (Figure 6). First, the driving data were collected. Next, the hierarchical Dirichlet process–hidden semi–Markov model (HDP–HSMM) was introduced to separate the time series data into segments with similar characteristics. The fragment clusters were then labeled and described semantically by the State–Action plane. Finally, the driving preferences of the different styles were displayed intuitively.

3.1. Description of the Driving Process Based on HSMM

The HMM has been widely used to describe the dynamic characteristics of a driver’s behavior [33]. However, the HMM has two significant disadvantages. One is that the number of hidden states must be set in advance. The other is that the HMM does not take the state duration into consideration. To overcome these shortcomings, the hierarchical Dirichlet process–hidden semi–Markov model (HDP–HSMM) is introduced. Owing to its excellent clustering characteristics and hierarchical sharing property, HDP can provide the prior and the number of hidden states for the HMM. The hidden semi–Markov model (HSMM) is an extension of the HMM that allows each state to have a variable duration through a semi–Markov chain. Therefore, HDP–HSMM has automatic clustering capacity and can describe the stochastic characteristics of the driving process [34,35].

The driving process consists of two layers: a hidden state layer and observation state layer, as shown in Figure 7.

In Figure 7, the shaded nodes represent the observable variables, y_t, the unshaded nodes represent the driving behavior, z_s, and D_s denotes the duration of a behavior. HSMM can be expressed by

π_{0} = \frac{1}{S}

(7)

π_{i j} = p (z_{t + 1} = j | z_{t} = i, z_{t + 1} \neq i)

(8)

D_{s} \sim g (ω_{s})

(9)

y_{t} | z_{s}, D_{s} \sim F (θ_{z_{s}}, D_{s})

(10)

π_{i i} = 0, \sum_{j} π_{i j} = 1

(11)

where π₀ is the prior (initial) probability distribution, π_ij is the transition probability from driving behavior i to driving behavior j, F (·) is the observation distribution of the current hidden state with model parameter θ, g (ω_s) is the state–specific distribution over the state duration, and ω_s is its parameter. In this paper, g (·) is the Poisson distribution. Because the state duration is modeled explicitly, the HSMM describes the actual driving process more closely than the HMM.
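To make the generative reading of Equations (7)–(11) concrete, the sketch below samples a short synthetic sequence from an HSMM with a fixed number of states. It is an illustration only: the function names and all numerical parameters are hypothetical, observations are scalar Gaussians rather than the multivariate observations used in the paper, and durations are drawn as 1 + Poisson so that every segment has at least one sample (an assumption about the support of the duration distribution).

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def sample_hsmm(pi, lam, means, sds, n_segments, seed=0):
    """Sample per-time-step state labels and observations from an HSMM.

    pi[i][j] is the transition probability i -> j, with pi[i][i] = 0 (Eq. (11));
    lam[i] is the Poisson duration parameter of state i (Eq. (9))."""
    rng = random.Random(seed)
    n_states = len(pi)
    z = rng.randrange(n_states)  # uniform initial distribution, Eq. (7)
    labels, observations = [], []
    for _ in range(n_segments):
        duration = 1 + sample_poisson(lam[z], rng)  # assumed shifted Poisson
        labels += [z] * duration
        observations += [rng.gauss(means[z], sds[z]) for _ in range(duration)]
        z = rng.choices(range(n_states), weights=pi[z])[0]  # Eq. (8)
    return labels, observations


# Hypothetical three-state example.
PI = [[0.0, 0.7, 0.3],
      [0.5, 0.0, 0.5],
      [0.4, 0.6, 0.0]]
labels, observations = sample_hsmm(PI, [4.0, 2.0, 6.0], [1.0, 2.0, 3.5],
                                   [0.1, 0.2, 0.3], n_segments=5)
```

Because the diagonal of the transition matrix is zero, two consecutive segments never share a state; how long a behavior lasts is controlled entirely by the duration distribution, which is the main difference from an ordinary HMM.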

3.1.1. Construction of HDP–HSMM

The Dirichlet process (DP) is a stochastic process that can be regarded as a distribution over discrete distributions with infinitely many categories. It can cluster data and estimate distribution parameters [36]. HDP is a multi–layer extension of DP, comprising at least two layers of DP, which supports complex state inference and Bayesian mixture modeling. HDP can provide the number of states and the prior model parameters for the HSMM. DP can be defined as follows:

Let Θ be a measurable space with a probability measure H on it, and let γ be a positive real number, called the concentration parameter. DP (γ, H) is defined as the distribution of a random probability measure G₀ over Θ such that, for any finite measurable partition (A₁, A₂, …, A_K) of Θ, the random vector (G₀ (A₁), G₀ (A₂), …, G₀ (A_K)) follows a finite–dimensional Dirichlet distribution with parameters (γH (A₁), γH (A₂), …, γH (A_K)) [37],

(G_{0} (A_{1}), G_{0} (A_{2}), \dots, G_{0} (A_{K})) \sim Dir (γ H (A_{1}), γ H (A_{2}), \dots, γ H (A_{K}))

(12)

Equation (12) can be written as

G_{0} \sim D P (γ, H)

(13)

G_{0} = \sum_{k = 1}^{K} β_{k} δ_{θ_{k}}

(14)

θ_{k} \sim H, β \sim GEM (γ)

(15)

where θ_k is drawn from the base distribution H, and β ~ GEM (γ) describes the construction of the weight coefficients (GEM stands for Griffiths, Engen, and McCloskey [34] and refers to the stick–breaking process). δ_θ is the Dirac function, satisfying

δ (θ) = \begin{cases} 1, & θ_{k} \in A_{k} \\ 0, & \text{otherwise} \end{cases}

(16)
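The stick–breaking construction behind β ~ GEM (γ) can be illustrated with a truncated sampler: a unit–length stick is broken repeatedly, and each piece becomes one weight β_k. The sketch below is a generic illustration with a hypothetical function name; the finite truncation level is an assumption made so that the code terminates, whereas the process itself has infinitely many weights.

```python
import random


def stick_breaking(gamma, truncation, seed=0):
    """Draw `truncation` weights from a truncated GEM(gamma) stick-breaking process."""
    rng = random.Random(seed)
    remaining = 1.0  # length of the stick that has not been assigned yet
    weights = []
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, gamma)  # fraction broken off the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # assign the leftover mass to the last atom
    return weights


beta = stick_breaking(gamma=2.0, truncation=20)
```

A small γ concentrates the mass on a few weights (few clusters), while a large γ spreads it over many weights.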

HDP used in this paper consists of two layers of DP, expressed by

G_{j} \sim H D P (γ, α, H)

(17)

G_{0} \sim D P (γ, H)

(18)

G_{j} = \sum_{k = 1}^{K} π_{j k} δ_{θ_{k}}, K \to \infty

(19)

θ_{k} \sim H, β \sim GEM (γ), π_{j} \sim D P (α, β)

(20)

where γ and α are the concentration parameters of the first layer DP and the second layer DP, respectively. G₀ is sampled from the first layer DP, and G_j is a variation of a global discrete measure G₀ and represents the prior transition probability of HMM.

According to the above discussion, the HSMM can express the driving process, while HDP can cluster data adaptively and provide prior knowledge for the HSMM.

The resulting HDP–HSMM, shown in Figure 8, can be expressed by

β | γ \sim GEM (γ)

(21)

π_{i} | α, β \sim DP (α, β), i = 1, 2, \dots, \infty

(22)

θ_{i} \sim H (λ), i = 1, 2, \dots, \infty

(23)

ω_{i} \sim Ω

(24)

z_{s} \sim {\bar{π}}_{z_{s - 1}}, s = 1, 2, \dots, S

(25)

D_{s} \sim g (ω_{z_{s}}), s = 1, 2, \dots, S

(26)

x_{t_{s}^{1} : t_{s}^{D_{s} + 1}} = z_{s}

(27)

y_{t_{s}^{1} : t_{s}^{D_{s} + 1}} \overset{\text{iid}}{\sim} F (θ_{x_{t}})

(28)

where π_i is the distribution parameter of the hidden state sequence z_s, with i ranging over infinitely many values, which means that HDP provides an infinite number of states for the HSMM. D_s is the duration of state z_s, drawn from the duration distribution with parameter ω, and y_ts is the observation sequence, drawn from the observation distribution with parameter θ_i.

3.1.2. Parameter Sampling and Inference

In Bayesian nonparametric models, the Gibbs sampling algorithm is widely used for the inference of model parameters. After the model structure is determined, the weak–limit Gibbs sampler (WLGS) is utilized to sample and infer the model parameters. The weak limit approximation transforms the infinite–dimensional hidden state into a finite–dimensional form, so that the hidden state chain can be updated according to the observation data. For convenience of description, when “\” appears in the superscript or subscript of a variable, it means that the corresponding variable is removed from the collection. In order to simplify the derivation and facilitate the integral solution, it is assumed that the base distribution H (·) and the observation distribution F (·) are conjugate, the state duration distribution g (·) is a Poisson distribution, and the state duration distribution and the observation distribution are independent. The sampling process is as follows.

Step 1, sampling weight coefficient β,

β | γ \sim Dir (γ / S, \dots, γ / S)

(29)

Step 2, sampling the state sequence distribution parameter π_i,

π_{i} | α, β \sim Dir (α β_{1}, \dots, α β_{S}), i = 1, \dots, S

(30)

Step 3, sampling the observation distribution parameters θ_i and the state duration distribution parameter ω_i according to the observation data. It is assumed that the observed data obey a multivariate Gaussian distribution; therefore, the model parameters θ_i = (μ_i, Σ_i) obey the Normal–Inverse–Wishart (NIW) distribution

NIW (μ, Σ | ν_{0}, Δ_{0}, μ_{0}, S_{0}) ≜ N (μ | μ_{0}, S_{0}) \cdot IW (Σ | ν_{0}, Δ_{0})

(31)

where φ = {μ₀, S₀, ν₀, Δ₀} are the prior parameters, μ₀ and S₀ are the prior mean and covariance matrix, respectively, and ν₀ and Δ₀ are the degrees of freedom and scale of the NIW distribution, respectively.

In addition, the state duration distribution is a Poisson distribution, and parameter ω_i follows a Beta distribution.

ω \sim Beta (η_{0}, σ_{0})

(32)

Step 4, updating parameters according to the observation data. (Please refer to [38,39] for further detailed update processes.)
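Steps 1 and 2 can be illustrated with a short sketch that draws the weak–limit prior of Equations (29) and (30) for a truncation level S. It covers only the prior draws, not the posterior updates of Steps 3 and 4, and it is not the authors’ implementation. The function names are hypothetical, the Dirichlet draw is built from Gamma variates, and zeroing the diagonal and renormalizing each row is one reading of the no–self–transition constraint of Equation (11), stated here as an assumption.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw from Dir(alphas) by normalizing independent Gamma(alpha_k, 1) variates."""
    draws = [rng.gammavariate(max(a, 1e-12), 1.0) for a in alphas]  # guard against a == 0
    total = sum(draws)
    return [d / total for d in draws]


def sample_weak_limit_prior(gamma, alpha, n_states, seed=0):
    """Draw beta (Eq. (29)) and the no-self-transition rows (Eqs. (30) and (11))."""
    rng = random.Random(seed)
    beta = sample_dirichlet([gamma / n_states] * n_states, rng)
    rows = []
    for i in range(n_states):
        row = sample_dirichlet([alpha * b for b in beta], rng)
        row[i] = 0.0  # assumption: remove self-transitions, then renormalize
        norm = sum(row)
        rows.append([p / norm for p in row])
    return beta, rows


beta, pi_bar = sample_weak_limit_prior(gamma=6.0, alpha=6.0, n_states=4)
```

All rows share the same global weights β, which is how the hierarchical prior ties the transition distributions of different states together.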

3.2. Construction of the State–Action Semantic Plane

In this subsection, the State–Action semantic plane is constructed to characterize driving styles.

The State–Action semantic plane was constructed based on the variable thresholds in Table 3, where the driving behaviors were divided into 12 units and a risk index was defined for each behavior (Figure 9). Figure 9a is the driving risk evaluation model, also called the driving style evaluation model. The color represents the magnitude of the risk: the warmer the color, the higher the risk index. The most dangerous driving behavior is NFAA, that is, the driver is near following the preceding vehicle with aggressive acceleration, with a risk index of 10. The minimum–risk driving behavior is FFAD, that is, the driver is far following the preceding vehicle with aggressive deceleration, with a risk index of 3. The driving behaviors in the other units can be described similarly. Moreover, we consider safety to be relative, as even experienced drivers may have traffic accidents. Therefore, the minimum risk index is set to 3 instead of 0. Figure 9b is the State–Action semantic plane, where i–j–k indicate different driving behaviors. A driver may transfer from one behavior to another, and these transitions reflect the fluctuation of the driving style.

3.3. Quantification of the Driving Style Method

The driving style can be quantitatively analyzed by calculating the distribution of each driving behavior on the semantic plane over a period of time or a specific mileage. Unlike previous studies that used only the behavior frequency to characterize driving styles, this paper considers the frequency and duration proportion of driving behaviors together, because a driving behavior that appears at a high frequency does not necessarily last for a long time.

The driving behavior frequency and duration proportion are calculated by Equations (33) and (34), respectively.

f_{(i, j)} = \frac{N_{(i, j)}}{\sum N_{(i, j)}}

(33)

g_{(i, j)} = \frac{T_{(i, j)}}{\sum T_{(i, j)}}

(34)

where ΣΣg_(i,j) = 1, ΣΣf_(i,j) = 1, i ∈ [AA, NA, ND, AD], j ∈ [NF, MF, FF], N_(i,j) is the number of events of the driving behavior, and T_(i,j) is the corresponding duration.

The final total score can be expressed as

Score = \sum \sum (f_{(i, j)} S_{score}^{(i, j)} ω_{1} + g_{(i, j)} S_{score}^{(i, j)} ω_{2})

(35)

where S_{score}^{(i, j)} is the score criterion for each behavior (see Figure 9) and ω is the weight coefficient, which satisfies ω₁ + ω₂ = 1; in this paper, ω₁ = ω₂ = 0.5.
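Equations (33)–(35) can be evaluated directly from a list of labeled fragments, as sketched below. The function name and the input format (one label and one duration per fragment) are hypothetical. Only the unit scores for NFAA (10) and FFAD (3) are taken from the text; scores for the remaining units would have to be read from Figure 9.

```python
def risk_score(fragments, unit_scores, w1=0.5, w2=0.5):
    """Risk score of Equation (35).

    fragments: list of (behavior_label, duration_in_seconds) tuples.
    unit_scores: dict mapping each behavior label to its score criterion."""
    n_total = len(fragments)
    t_total = sum(duration for _, duration in fragments)
    counts, durations = {}, {}
    for label, duration in fragments:
        counts[label] = counts.get(label, 0) + 1
        durations[label] = durations.get(label, 0.0) + duration
    score = 0.0
    for label in counts:
        f = counts[label] / n_total      # frequency proportion, Eq. (33)
        g = durations[label] / t_total   # duration proportion, Eq. (34)
        score += unit_scores[label] * (w1 * f + w2 * g)
    return score


# Toy example: one short NFAA fragment and one long FFAD fragment.
score = risk_score([("NFAA", 2.0), ("FFAD", 6.0)], {"NFAA": 10, "FFAD": 3})
```

In the toy example both behaviors occur once (f = 0.5 each), but FFAD lasts three times as long (g = 0.75 versus 0.25), so the score of 5.625 is pulled toward the low–risk unit. This is the effect of weighting frequency and duration together.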

4. Results

4.1. Model Training Results

The driving data were used to train HDP–HSMM, and the model parameters were obtained with WLGS. Figure 10 shows the change in the log–likelihood value of each driver during the training process. It can be seen in Figure 10 that the log–likelihood became stable after about 75 sampling iterations, indicating that the model parameters had stabilized.

4.2. Driving Data Fragment Results

For clarity and conciseness, only the segmentation results of the driving behavior of driver #5 (randomly selected) are demonstrated, as shown in Figure 11.

Figure 11a shows the sequence clusters of the driving behavior. The same color or the same number represents the same driving behavior. For example, the number 0 and the number 19 represent two different driving behaviors. In Figure 11b,c, it can be seen that THW and acceleration have similar characteristics within each driving behavior. For driving behavior “0”, the corresponding THW is near 2 s, and the acceleration is about 0.5 m/s². For driving behavior “19”, THW is near 2.1 s, and the acceleration is about 0.8 m/s². Additionally, the duration of each driving behavior is different, which shows that HDP–HSMM can identify the driving behavior from time series data according to the data characteristics without subjective intervention. Furthermore, data fragments with similar features are automatically grouped into one class.

From the above analysis, it is concluded that HDP–HSMM can effectively divide the driving process into different fragments. However, the fragments are still clusters of time series data (Figure 12), from which it is not easy to quantify driver risk indices. In fact, quantifying a driver’s risk is more important than classification when the safety of a driver is evaluated. Herein, an analysis method for the driving style is proposed based on the driving behavior semantic plane in order to evaluate the safety of a driver accurately.

4.3. Fragment Sequence Cluster Labelling Results

Based on the State–Action semantic plane, the fragment results were labeled with a semantic interpretation. To make the fragment sequence clusters easier to label, we reduced each fragment to a single point (its centroid) using the K–means clustering algorithm with the clustering parameter set to K = 1. Figure 13 shows the labeled results of each fragment for driver #5. Most of the fragment clusters are labeled as MFNA and MFND. It was mentioned in Section 4.2 that the THW of driving behavior “16” was larger than that of “0”, which indicated that behavior “16” was safer than behavior “0”. However, the acceleration of behavior “16” was also larger than that of behavior “0”, which indicated that behavior “16” was more dangerous than behavior “0”. This seemed to be a contradiction. On the semantic plane, both driving behavior “16” and driving behavior “0” belong to MFNA, which resolves this apparent contradiction. Since each unit has its own risk scoring criterion, the semantic plane in Figure 9 can be used to evaluate the driving style efficiently and simply. In addition, because some semantic plane units contain no driving behavior, as shown in Figure 13 (e.g., NFAA, NFAD, and FFAD), the score of these units is zero.
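The labelling step can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it takes the thresholds from Table 3, uses the fact that K–means with K = 1 reduces a fragment to its mean, and assumes (since Table 3 lists 0 in both the NA and ND intervals) that an acceleration of exactly 0 is treated as NA. The function names and sample values are hypothetical.

```python
from statistics import mean


def state_label(thw):
    """Map a time headway (s) to a following state using the Table 3 thresholds."""
    if thw < 1.2:
        return "NF"
    if thw <= 3.0:
        return "MF"
    return "FF"


def action_label(acc):
    """Map an acceleration (m/s^2) to an action using the Table 3 thresholds."""
    if acc > 1.6:
        return "AA"
    if acc >= 0.0:  # assumption: exactly 0 is counted as normal acceleration
        return "NA"
    if acc >= -1.4:
        return "ND"
    return "AD"


def label_fragment(samples):
    """Label one fragment given as a list of (thw, acc) samples.

    K-means with K = 1 returns the centroid, i.e. the per-variable mean.
    """
    thw_c = mean(s[0] for s in samples)
    acc_c = mean(s[1] for s in samples)
    return state_label(thw_c) + action_label(acc_c)


# Hypothetical fragments resembling the two behaviors discussed above
# (THW near 2 s / 0.5 m/s^2 and THW near 2.1 s / 0.8 m/s^2).
fragment_a = [(1.9, 0.4), (2.0, 0.5), (2.1, 0.6)]
fragment_b = [(2.0, 0.7), (2.1, 0.8), (2.2, 0.9)]
print(label_fragment(fragment_a), label_fragment(fragment_b))
```

Both hypothetical fragments fall into the same unit of the plane, which is how two fragments that differ slightly in THW and acceleration can receive the same semantic label.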

4.4. Correlation Analysis of the Subjective Score and Objective Risk Score

The objective risk indices of the drivers were obtained from Equations (33)–(35). To verify the rationality of the proposed method, a Pearson correlation analysis was conducted between the subjective evaluation score and the objective risk coefficient, with the significance level set to 0.05. The analysis results are shown in Figure 14. The subjective score is generally higher than the objective risk coefficient score; a possible reason is that drivers are confident in their driving skills. The correlation coefficient between the subjective score and the objective risk coefficient score is 0.81, and the p value is reported as 0 (i.e., well below 0.05), which indicates that the two variables follow a consistent overall trend and are significantly positively correlated. In addition, the K–means algorithm was applied to the objective risk coefficient. Compared with the labels derived from the subjective score, the result contained five labeling errors, so the labeling accuracy reached 84.8% (28 of the 33 drivers). These results support the feasibility and accuracy of the proposed quantization method for driving styles.
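For reference, the two quantities used in this comparison can be computed as in the sketch below. The score lists are hypothetical placeholders (the per-driver scores are not reproduced in the text); only the counts of 33 drivers and five labeling errors come from the paper.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical subjective scores and objective risk scores for six drivers.
subjective = [6.5, 7.0, 8.4, 8.9, 9.3, 9.6]
objective = [5.9, 6.8, 7.5, 8.6, 8.4, 9.1]
r = pearson_r(subjective, objective)

# Labeling accuracy from the counts reported in the text.
n_drivers, n_errors = 33, 5
accuracy = (n_drivers - n_errors) / n_drivers
print(f"r = {r:.3f}, accuracy = {100 * accuracy:.1f}%")
```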

5. Discussion

The above analysis shows that HDP–HSMM can provide reasonable segmentation for driving data in time series, and the State–Action semantic plane allows us to interpret fragment clusters intuitively and to evaluate their risk coefficient easily. In the following section, the driving style will be quantified and discussed using the semantic plane.

5.1. Driving Style Discussion

5.1.1. Frequency and Duration Proportion of Driving Behavior

In this paper, the frequency and duration proportion of driving behavior were used to characterize the driving style so as to show the driving style intuitively.

Figure 15 presents the normalized probability distribution of driving behaviors and the duration proportion for drivers with different driving styles. Warm colors represent a high frequency or long duration of a driving behavior, while cool colors represent a low frequency or short duration. The figure clearly shows the driving preferences of drivers with different styles. For example, aggressive drivers prefer NFNA and NFND driving behaviors: the combined probability of these two behaviors reached 79.7%, and their duration proportion reached 75.4%. This type of driver shows very little far–following behavior (FF), whose probability is close to 0. Normal drivers prefer MFNA and MFND, with a combined probability of 72.52% and a duration proportion of 71.25%; part of their driving behavior is FFNA, with a probability of 14.12%. Cautious drivers prefer FFNA and FFND, with a combined probability of 66.67% and a duration proportion of 61.04%; part of their driving behavior is MFND, and the probability of near–following behavior (NF) is close to 0. Furthermore, Figure 15 shows that the aggressive actions AA and AD are rare under highway conditions, which is consistent with the actual situation and indirectly supports the thresholds listed in Table 3.
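The two statistics plotted in Figure 15 can be sketched as follows, assuming each labeled fragment is available as a (label, duration) pair. The function name and the sample fragments are hypothetical.

```python
from collections import Counter, defaultdict


def frequency_and_duration(fragments):
    """Return (frequency, duration proportion) per behavior label.

    fragments: list of (label, duration_in_seconds) pairs for one driver.
    """
    counts = Counter(label for label, _ in fragments)
    durations = defaultdict(float)
    for label, seconds in fragments:
        durations[label] += seconds
    n_fragments = len(fragments)
    total_time = sum(durations.values())
    freq = {label: c / n_fragments for label, c in counts.items()}
    dur = {label: d / total_time for label, d in durations.items()}
    return freq, dur


# Hypothetical fragments of one driver.
fragments = [("NFNA", 4.0), ("NFND", 6.0), ("NFNA", 5.0), ("MFNA", 5.0)]
freq, dur = frequency_and_duration(fragments)
print(freq)
print(dur)
```

Frequency counts how often a behavior occurs, while the duration proportion weights each occurrence by how long it lasts, so the two values can differ for the same behavior.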

5.1.2. The Transition Probabilities of Driving Behavior

The previous subsection analyzed the frequency and duration proportion of driving style, intuitively demonstrating the driving preference of different driving styles. This section will further discuss the transfer characteristics of driving behavior and reveal the relationship between driving style and transition probability of the driving behavior. The driving behavior transition probability is defined as follows:

a_{i}, a_{j} \in \{a_{1}, a_{2}, \ldots, a_{12}\}, \qquad \{a_{1}, \ldots, a_{12}\} \leftrightarrow \{\mathrm{FF}, \mathrm{MF}, \mathrm{NF}\} \times \{\mathrm{AD}, \mathrm{ND}, \mathrm{NA}, \mathrm{AA}\},

(36)

where a_i and a_j represent the current and next time driving behavior, respectively. The probability of driving behavior transferring from a_i to a_j is expressed as

a_{i j} = p (b_{t + 1} | b_{t}) = \frac{N (b_{t + 1} = a_{j} | b_{t} = a_{i})}{\sum_{j = 1}^{m} N (b_{t + 1} = a_{j} | b_{t} = a_{i})}

(37)

where N(·) is the number of times driving behavior a_i transfers to a_j, and m is the number of driving behaviors to which a_i can transfer; in this paper m = 12, and each row of the transition matrix satisfies

\sum_{j = 1}^{m} a_{i j} = 1

.

Because there are 12 driving behaviors, the transition probability matrix has dimension 12 × 12, so each driver has 144 transition probability features; the diagonal entries are zero. Because some driving behaviors never occur for a given driver, some entries satisfy a_ij = 0 or are very small.
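Equation (37) amounts to counting consecutive label pairs and normalizing each row. The sketch below illustrates this under two assumptions that are not taken from the paper: the ordering of the 12 behaviors in `BEHAVIORS` is arbitrary, and rows with no observed outgoing transition are left as all zeros.

```python
# Assumed ordering of the 12 State-Action behaviors (illustrative only).
BEHAVIORS = [s + a for s in ("FF", "MF", "NF") for a in ("AD", "ND", "NA", "AA")]


def transition_matrix(sequence, behaviors=BEHAVIORS):
    """Estimate a_ij = N(a_i -> a_j) / sum_j N(a_i -> a_j) from a label sequence."""
    index = {b: k for k, b in enumerate(behaviors)}
    m = len(behaviors)
    counts = [[0] * m for _ in range(m)]
    # Count each pair of consecutive fragment labels.
    for current, following in zip(sequence, sequence[1:]):
        counts[index[current]][index[following]] += 1
    probs = []
    for row in counts:
        total = sum(row)
        # Rows without any outgoing transition stay all-zero.
        probs.append([c / total if total else 0.0 for c in row])
    return probs


# Hypothetical fragment label sequence; consecutive fragments differ,
# so the diagonal of the estimated matrix is zero.
sequence = ["MFND", "MFNA", "MFND", "MFNA", "NFND", "MFND"]
P = transition_matrix(sequence)
i = BEHAVIORS.index
print(P[i("MFND")][i("MFNA")], P[i("MFNA")][i("NFND")])
```

Flattening the 12 × 12 matrix row by row gives the 144 transition features per driver mentioned above.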

In order to select the feature set with the strongest correlation with driving style from 144 transition features, the joint mutual information maximization (JMIM) algorithm was used to remove redundant features [40]. This method combines feature correlation and redundancy concepts together to select the optimized subset by the forward greedy search algorithm. It is expressed by

f_{JMIM} = \arg \max_{f_{i} \in F - S} \left( \min_{f_{s} \in S} \left( I (f_{i}, f_{s}; C) \right) \right),

(38)

I (f_{i}, f_{s}; C) = [- \sum_{c \in C} p (c) \log (p (c))] - [\sum_{c \in C} \sum_{f_{i} \in F - S} \sum_{f_{s} \in S} \log (\frac{p (f_{i} f_{s}, c | f_{s})}{p (f_{i} | f_{s}) p (c | f_{s})})],

(39)

where F represents the set of all candidate features; S represents the selected feature set; f_i is a candidate feature, satisfying f_i ∈ F–S; f_s is a selected feature, satisfying f_s ∈ S; and C is a discrete class variable, C ∈ {1,2,3}, where 1, 2, and 3 represent the cautious, normal, and aggressive driver groups, respectively. Equation (38) employs joint mutual information and the ‘maximum of the minimum’ approach: for a candidate feature f_i, if its minimum JMI with the selected features is larger than that of all other candidate features f_j, where f_j ∈ F–S (j ≠ i), then f_i is the most relevant feature to the class label C in the context of the subset S. For more details, please refer to [40].
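The greedy search in Equation (38) can be sketched as below. This is an illustration under stated assumptions, not the authors' code: features are assumed to be already discretized (transition probabilities are continuous, so some binning would be needed first), and the joint mutual information I(f_i, f_s; C) is estimated with the standard count-based definition of mutual information between the pair (f_i, f_s) and C rather than by evaluating Equation (39) term by term. The first feature is chosen by its individual mutual information with C. All names and the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, cs):
    """Count-based estimate of I(X; C) in nats for two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, cs))
    px, pc = Counter(xs), Counter(cs)
    return sum((nxc / n) * math.log(nxc * n / (px[x] * pc[c]))
               for (x, c), nxc in joint.items())


def jmim_select(features, labels, k):
    """Greedy JMIM selection of k feature names.

    features: dict mapping a feature name to a list of discrete values.
    """
    remaining = set(features)
    # Seed with the single most informative feature.
    first = max(sorted(remaining),
                key=lambda f: mutual_information(features[f], labels))
    selected = [first]
    remaining.remove(first)
    while remaining and len(selected) < k:
        def score(f):
            # Worst-case joint information with any already selected feature.
            return min(
                mutual_information(list(zip(features[f], features[s])), labels)
                for s in selected)
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: f1 and f1_copy carry the same information, f2 is complementary.
labels = [1, 1, 2, 2, 3, 3]
features = {
    "f1": [0, 0, 1, 1, 1, 1],
    "f1_copy": [0, 0, 1, 1, 1, 1],
    "f2": [0, 0, 0, 0, 1, 1],
}
print(jmim_select(features, labels, 2))
```

In the toy data, the redundant copy adds no joint information, so the search prefers the complementary feature; this is the behavior the ‘maximum of the minimum’ criterion is designed to produce.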

The data of the drivers whose subjective score labels and objective risk index labels agreed were selected. The data from each driver were divided randomly into five segments, giving 5 × 28 = 140 groups of data. The JMIM method was used to select an optimized sub–feature set from the 144 transition probabilities. Figure 16 shows the top ten transition probabilities with the greatest joint mutual information. Finally, the first seven features with the greatest mutual information were selected, which were (1) from middle following with normal deceleration to middle following with normal acceleration, (2) from near following with aggressive deceleration to near following with normal deceleration, (3) from near following with normal deceleration to near following with normal acceleration, (4) from near following with normal deceleration to near following with aggressive deceleration, (5) from far following with normal deceleration to far following with normal acceleration, (6) from middle following with normal deceleration to far following with normal acceleration, and (7) from far following with normal acceleration to far following with normal deceleration. These seven features will be used for driving style analysis and driving style recognition in the following subsection.

The significance analysis of the seven features is shown in Table 4. The p values of all seven features were less than 0.05, indicating that each feature differed significantly among the three types of drivers; the selected seven features can therefore be used for driving style classification. In addition, the average probability of aggressive drivers transferring from low–risk driving behavior to high–risk driving behavior was higher than that of normal and cautious drivers (bold font in Table 4), which means that aggressive drivers are more likely to be involved in a rear–end collision. Cautious drivers are more likely to transfer among low–risk driving behaviors, for example, from FFND to FFNA and from FFNA to FFND, with transition probabilities of 0.8106 and 0.7363, respectively.

Figure 17a–c display the transition probability matrices of the aggressive, normal, and cautious driving styles, respectively. The color indicates the transition probability value: the warmer the color, the greater the probability that the transition will occur. The probability matrix of driving behavior can intuitively reflect the internal relationship between driving behaviors and driving preference. The transition probabilities of aggressive drivers and cautious drivers are more concentrated than those of normal drivers. Moreover, aggressive drivers prefer to switch between high–risk driving behaviors, such as from NFND to NFNA and from NFNA to NFAD. In contrast, cautious drivers prefer to switch between low–risk driving behaviors, such as from FFND to FFNA and from FFNA to FFND.

5.2. Application of Transition Probabilities

Previous studies selected statistical indexes (e.g., the mean, standard deviation, maximum, and minimum) of velocity, acceleration, and THW as input features for driving style recognition [41]. These indexes do not differ significantly between drivers under highway driving conditions and may often result in low classification accuracy. This paper takes the driving behavior transition probabilities as the input features to establish the driving style evaluation model. The results of the proposed model are then compared with those of a classification model using statistical indexes. To keep the comparison controlled, the number of statistical indexes is the same as the number of transition probabilities. In total, seven statistical indexes were selected, namely, the mean and standard deviation of velocity, the mean and standard deviation of acceleration, the mean and standard deviation of relative distance, and the mean of THW.

Random forest (RF), support vector machine (SVM), and K–nearest neighbor (KNN) classifiers were built on the selected seven transition probabilities. The radial basis function (RBF) kernel was adopted in the SVM, the number of neighbors in KNN was set to 5, and the number of decision trees in the random forest was set to 25. The leave–one–out method was applied for cross validation. Table 5 lists the classification accuracy of these three classification algorithms under different feature inputs.
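The evaluation protocol can be illustrated with a small standard-library sketch of leave–one–out cross validation for a KNN classifier with five neighbors. Only the KNN case is shown; the RF and SVM classifiers would follow the same protocol with a different predictor. The two-dimensional feature vectors are hypothetical stand-ins for the seven transition probabilities and are not data from the study.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=5):
    """Predict a label by majority vote among the k nearest training points."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(xs, ys, k=5):
    """Hold out each sample once, train on the rest, and average the hits."""
    hits = 0
    for idx in range(len(xs)):
        train_x = xs[:idx] + xs[idx + 1:]
        train_y = ys[:idx] + ys[idx + 1:]
        hits += knn_predict(train_x, train_y, xs[idx], k) == ys[idx]
    return hits / len(xs)


# Hypothetical samples: (p(FFND->FFNA), p(NFAD->NFND)) per data group.
cautious = [(0.80 + 0.01 * n, 0.05) for n in range(6)]
normal = [(0.50 + 0.01 * n, 0.40) for n in range(6)]
aggressive = [(0.30 + 0.01 * n, 0.75) for n in range(6)]
xs = cautious + normal + aggressive
ys = ["cautious"] * 6 + ["normal"] * 6 + ["aggressive"] * 6
print(leave_one_out_accuracy(xs, ys, k=5))
```

With 140 data groups, leave–one–out means 140 train/test rounds, each testing on a single held-out group.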

Based on Table 5, the recognition accuracy for the aggressive and cautious driving styles was better than that for the normal driving style, irrespective of whether the transition probabilities or the statistical features were used as input. The RF classifier had the highest classification accuracy. With transition probabilities as the input features, the classification accuracy was higher than with the traditional statistical features in every case except the SVM for the aggressive style (83.53% versus 84.46%). The recognition rates of the RF classifier for the aggressive and cautious driving styles were 91.35% and 92.22%, respectively, and its average recognition rate over the three styles was higher than 90%. Since the classifiers have the same design for both feature sets, the differences in accuracy are attributable to the different input features.

A correlation analysis was carried out within each of the two types of features. The results for the seven driving behavior transition probabilities and for the seven statistical indexes are presented in Table 6 and Table 7, respectively. Table 6 indicates that the transition probability NFAD–NFND had the strongest correlation with the other transition probabilities, with a maximum correlation coefficient of 0.6491. Table 7 shows that vehicle speed had a strong correlation with the other statistical indexes, especially with relative distance and THW. In addition, the mean and standard deviation of relative distance had strong correlations with THW, with correlation coefficients of 0.9674 and 0.8225, which means that either the THW feature or the relative distance features could be removed. A comparative analysis of the two kinds of features showed that the correlations among the driving behavior transition probability features were lower than those among the traditional statistical features, indicating lower information redundancy. The proposed approach can therefore express the driving style to a greater extent and improve the accuracy of driving style classification. This also supports the effectiveness of the joint mutual information maximization algorithm.
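One way to summarize the comparison between Table 6 and Table 7 is the mean absolute off-diagonal correlation of each table. The sketch below uses the coefficients printed in the two tables (upper triangles, row by row); choosing the mean absolute value as a redundancy summary is an assumption made for illustration and is not the authors' stated procedure.

```python
# Upper-triangle coefficients (diagonal excluded) copied from Table 6.
TABLE6 = [
    0.1494, 0.0687, 0.0679, -0.2143, -0.1598, 0.0857,
    0.5621, 0.6491, -0.4193, -0.4520, 0.4578,
    0.2014, -0.2554, -0.1145, 0.6391,
    -0.2336, -0.3098, 0.3264,
    0.5029, -0.1657,
    -0.1336,
]

# Upper-triangle coefficients (diagonal excluded) copied from Table 7.
TABLE7 = [
    0.6025, -0.5068, -0.3547, 0.1605, 0.3657, -0.6764,
    -0.2731, -0.2084, 0.0464, 0.4818, -0.3277,
    0.8666, -0.0749, -0.4843, 0.9674,
    0.1078, -0.4450, 0.8225,
    -0.0480, -0.1429,
    -0.4115,
]


def mean_abs(values):
    """Mean absolute value of a list of correlation coefficients."""
    return sum(abs(v) for v in values) / len(values)


def max_abs(values):
    """Largest absolute correlation coefficient in the list."""
    return max(abs(v) for v in values)


print(f"transition features:  mean |r| = {mean_abs(TABLE6):.3f}, "
      f"max |r| = {max_abs(TABLE6):.4f}")
print(f"statistical features: mean |r| = {mean_abs(TABLE7):.3f}, "
      f"max |r| = {max_abs(TABLE7):.4f}")
```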

6. Conclusions

Aiming at the quantitative evaluation of driving style and revealing transition properties between driving style and behavior, a novel evaluation method based on the State–Action semantic plane was proposed. Through comparison with subjective and conventional approaches, the validity and reliability of this approach were verified, and the classification accuracy of the driving style was thus improved. Conclusions are summarized as follows.

(1): The HDP–HSMM algorithm combines the advantages of infinite clustering and adaptive updating in the HDP algorithm with the description of the dynamic random process in HSMM. It can decompose time series driving data into fragment clusters with similar characteristics. This algorithm can be further used for the characteristic extraction of a large amount of naturalistic driving data.
(2): The driving behavior semantic plane was developed. It can interpret the fragment clusters, quantify the drivers’ risk indices by determining the probability and duration proportion of each behavior, and intuitively express the driving preferences of drivers with different styles. Aggressive drivers prefer the high–risk driving behaviors NFNA and NFND, whose combined probability reaches 79.7% and whose duration proportion reaches 75.4%. Cautious drivers prefer low–risk driving behaviors, such as FFNA and FFND, with a combined probability of 66.67% and a duration proportion of 61.04%.
(3): Additionally, the action behavior of aggressive acceleration (AA) and aggressive deceleration (AD) is lower under highway conditions, which is consistent with the actual situation.
(4): Transition probability can reveal the internal relationship among driving behaviors. The joint mutual information maximization (JMIM) algorithm can select an optimized subset effectively by combining feature correlation and redundancy concepts. The seven highest ranking features were selected to evaluate driving styles: (1) MFND–MFNA, (2) NFAD–NFND, (3) NFND–NFNA, (4) NFND–NFAD, (5) FFND–FFNA, (6) MFND–FFNA, and (7) FFNA–FFND. The results showed that transition probabilities used as classification features can provide rich information about driving style and improve the classification accuracy of the driving style.

These efforts extend previous studies that focus in more detail on identifying driving style among categories and provide a novel methodology quantifying personal risk indices and visualizing driving performance simply. The limitation of this study is that the driving data were collected from a simulator platform, which results in the observables being idealistic compared with a real vehicle testing platform. Only 33 drivers were selected, and the sampling framework was biased (23 males and 10 females, average age: 26.21 ± 5.06 years). The drivers were too young to represent the characteristics of drivers of other ages. Additionally, only the car–following scenario was designed, and the influence of the traffic flow on driving style was rarely considered. In future work, data collected under more complex natural conditions will be considered, and the proposed approach will be further investigated and verified by naturalistic driving data.

Author Contributions

X.Q.: Methodology, Investigation, Validation, Formal analysis, Resources, Writing—original draft. L.Z.: Conceptualization, Methodology, Writing—review & editing. Y.L.: Conceptualization, Methodology, Writing—review & editing. Y.R.: Data collection, Software, Validation, Formal analysis. Z.Z. (Zhida Zhang): Validation, Formal analysis. Z.Z. (Ziwei Zhang): Data collection, Formal analysis. L.Q.: Review & Formal analysis. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 51875061, and the Technological Innovation and Application Development of Chongqing, grant number cstc2019jscx–zdztzxX0032.

Institutional Review Board Statement

The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the Ethics Committee of CHONGQING UNIVERSITY CANCER HOSPITAL (CZLS2021215–A).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Acknowledgments

All individuals included in this section have consented to the acknowledgement.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Li, M.; Song, X.; Cao, H.; Wang, J.; Huang, Y.; Hu, C.; Wang, H. Shared Control with a Novel Dynamic Authority Allocation Strategy Based on Game theory and Driving Safety Field. Mech. Syst. Signal Process. 2019, 124, 199–216.
2. Itkonen, T.; Lehtonen, E.; Selpi, S. Characterisation of Motorway Driving Style Using Naturalistic Driving Data. Transp. Res. Part F Traffic Psychol. Behav. 2020, 69, 72–79.
3. Cao, H.; Zhao, S.; Song, X.; Bao, S.; Li, M.; Huang, Z.; Hu, C. An optimal hierarchical framework of the trajectory following by convex optimisation for highly automated driving vehicles. Veh. Syst. Dyn. 2019, 57, 1287–1317.
4. Martinez, C.; Heucke, M.; Wang, F.; Gao, B.; Cao, D. Driving Style Recognition for Intelligent Vehicle Control and Advanced Driver Assistance: A Survey. IEEE Trans. Intell. Transp. Syst. 2020, 2018, 666–676.
5. Moller, M.; Haustein, S. Keep on Cruising: Changes in Lifestyle and Driving Style Among Male Drivers Between the Age of 18 and 23. Transp. Res. Part F Psychol. Behav. 2013, 20, 59–69.
6. Zhu, B.; Jiang, Y.; Zhao, J.; He, R.; Bian, N.; Deng, W. Typical-driving-style-oriented Personalized Adaptive Cruise Control Design Based on Human Driving Data. Transp. Res. Part C Emerg. Technol. 2019, 100, 274–288.
7. Yang, W.; Zheng, L.; Li, Y.; Ren, Y.; Zhou, X. Automated Highway Driving Decision Considering Driver Characteristics. IEEE Trans. Intell. Transp. Syst. 2020, 21, 2350–2359.
8. Li, G.; Li, S.E.; Cheng, B.; Green, P. Estimation of Driving Style in Naturalistic Highway Traffic Using Maneuver Transition Probabilities. Transp. Res. Part C Emerg. Technol. 2017, 74, 113–125.
9. Toledo, T.; Musicant, O.; Lotan, T. In-Vehicle Data Recorders for Monitoring and Feedback on Drivers’ Behavior. Transp. Res. Part C Emerg. Technol. 2008, 16, 320–331.
10. Ma, Y.; Fu, R. Research and Development of Drivers Visual Behavior and Driving Safety. China J. Highw. Transp. 2015, 28, 82–94.
11. Bazzan, A.; Klugl, F. A Review on Agent-Based Technology for Traffic and Transportation. Knowl. Eng. Rev. 2014, 29, 375–403.
12. Ehsani, J.; Li, K.; Simons, B.; Mcgrath, C.; Perlus, J.; O’Brien, F.; Klauer, S. Conscientious Personality and Young Drivers’ Crash Risk. J. Saf. Res. 2015, 54, 83.e29–87.e29.
13. Bellem, H.; Schoenenberg, T.; Krems, J.; Schrauf, M. Objective Metrics of Comfort: Developing A Driving Style for Highly Automated Vehicles. Transp. Res. Part F Traffic Psychol. Behav. 2016, 41 Pt A, 45–54.
14. Schockenhoff, F.; Nehse, H.; Lienkamp, M. Maneuver-Based Objectification of User Comfort Affecting Aspects of Driving Style of Autonomous Vehicle Concepts. Appl. Sci. 2020, 10, 3946.
15. Schwarzer, V.; Ghorbani, R. Drive Cycle Generation for Design Optimization of Electric Vehicles. IEEE Trans. Veh. Technol. 2012, 62, 89–97.
16. Higgs, B.; Abbas, M. Segmentation and Clustering of Car-Following Behavior: Recognition of Driving Patterns. IEEE Trans. Intell. Transp. Syst. 2014, 16, 81–90.
17. Zähringer, M.; Kalt, S.; Lienkamp, M. Compressed Driving Cycles Using Markov Chains for Vehicle Powertrain Design. World Electr. Veh. J. 2020, 11, 52.
18. Taniguchi, T.; Nagasaka, S.; Hitomi, K.; Takenaka, K.; Bando, T. Unsupervised Hierarchical Modeling of Driving Behavior and Prediction of Contextual Changing Points. IEEE Trans. Intell. Transp. Syst. 2015, 16, 1746–1760.
19. Taniguchi, T.; Nagasaka, S.; Hitomi, K.; Chandrasiri, N.; Bando, T.; Takenaka, K. Sequence Prediction of Driving Behavior Using Double Articulation Analyzer. IEEE Trans. Syst. Man Cybern. Syst. 2016, 46, 1300–1313.
20. Lim, H.; Choi, H.; Choi, Y.; Kim, I.-J. Memetic algorithm for multivariate time-series segmentation. Pattern Recognit. Lett. 2020, 138, 60–67.
21. Bargi, A.; Xu, R.; Piccardi, M. An Adaptive Online HDP-HMM for Segmentation and Classification of Sequential Data. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 3953–3968.
22. Xu, L.; Hu, J.; Jiang, H.; Meng, W. Establishing Style-Oriented Driver Models by Imitating Human Driving Behaviors. IEEE Trans. Intell. Transp. Syst. 2015, 16, 2522–2530.
23. Tadesse, E.; Sheng, W.; Liu, M. Driver drowsiness detection through HMM based dynamic modelling. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Hong Kong, China, 31 May–7 June 2014; pp. 4003–4008.
24. Gadepally, V.; Krishnamurthy, A.; Özgüner, Ü. A Framework for Estimating Driver Decisions Near Intersections. IEEE Trans. Intell. Transp. Syst. 2013, 15, 637–646.
25. Han, W.; Wang, W.; Li, X.; Xi, J. Statistical-based Approach for Driving Style Recognition Using Bayesian Probability with Kernel Density Estimation. IET Intell. Transp. Syst. 2018, 13, 22–30.
26. Xue, Q.; Wang, K.; Lu, J.J.; Liu, Y. Rapid Driving Style Recognition in Car-Following Using Machine Learning and Vehicle Trajectory Data. J. Adv. Transp. 2019, 2019 Pt 1, 9085238.
27. Bellem, H.; Thiel, B.; Schrauf, M.; Krems, J. Comfort in automated driving: An analysis of preferences for different automated driving styles and their dependence on personality traits. Transp. Res. Part F Traffic Psychol. Behav. 2018, 55, 90–100.
28. Zheng, L.; Qiao, X.; Ni, T.; Yang, W.; Li, Y. Driver Cognitive Load Based on Multi-Dimensional Information Feature Analysis. China J. Highw. Transp. 2021, 34, 240–250.
29. Wong, C.; Mao, Y.; Peng, K.; Shi, J. Differences between odd number and even number response formats: Evidence from mainland Chinese respondents. Asia Pac. J. Manag. 2011, 28, 379–399.
30. Kondoh, T.; Yamamura, T.; Kitazaki, S.; Kuge, N.; Boer, E.R. Identification of Visual Cues and Quantification of Drivers’ Perception of Proximity Risk to the Lead Vehicle in Car-Following Situations. J. Mech. Syst. Transp. Logist. 2008, 1, 170–180.
31. Nilsson, J.; Falcone, P.; Vinter, J. Safe Transitions from Automated to Manual Driving Using Driver Controllability Estimation. IEEE Trans. Intell. Transp. Syst. 2015, 16, 1806–1816.
32. Kusano, K.D.; Chen, R.; Montgomery, J.; Gabler, H.C. Population Distributions of Time to Collision at Brake Application During Car Following from Naturalistic Driving Data. J. Saf. Res. 2015, 54, 95.e29–104.e29.
33. Wang, W.; Zhao, D.; Xi, J.; Han, W. A Learning-Based Approach for Lane Departure Warning Systems with A Personalized Driver Model. IEEE Trans. Veh. Technol. 2018, 67, 9145–9157.
34. Zhou, J.; Wang, F.; Zeng, D. Hierarchical Dirichlet Processes and Their Applications: A Survey. Acta Autom. Sin. 2011, 4, 389–407.
35. Wang, W.; Xi, J.; Ding, Z. Driving Style Analysis Using Primitive Driving Patterns with Bayesian Nonparametric Approaches. IEEE Trans. Intell. Transp. Syst. 2018, 20, 2986–2998.
36. Teh, Y.W.; Jordan, M.I.; Beal, M.J.; Blei, D.M. Hierarchical Dirichlet Processes. J. Am. Stat. Assoc. 2006, 101, 1566–1581.
37. Sethuraman, J. A Constructive Definition of Dirichlet Priors. Stat. Sin. 1994, 4, 639–650.
38. Johnson, M.; Willsky, A. The Hierarchical Dirichlet Process Hidden Semi-Markov Model. In Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI2010), Catalina Island, CA, USA, 8–11 July 2010; Association for Uncertainty in Artificial Intelligence: Catalina Island, CA, USA, 2012; pp. 252–259.
39. Fox, E. Bayesian Nonparametric Learning of Complex Dynamical Phenomena. Ph.D. Thesis, MIT, Cambridge, MA, USA, 2009.
40. Bennasar, M.; Hicks, Y.; Setchi, R. Feature Selection Using Joint Mutual Information Maximisation. Expert Syst. Appl. 2015, 42, 8520–8532.
41. Gao, B.; Cai, K.; Qu, T.; Hu, Y.; Chen, H. Personalized Adaptive Cruise Control Based on Online Driving Style Recognition Technology and Model Predictive Control. IEEE Trans. Veh. Technol. 2020, 69, 12482–12496.

Figure 1. A framework of driving style analysis.

Figure 2. Data collection system architecture.

Figure 3. Experimental Process.

Figure 4. N–Back auditory feedback task diagram.

Figure 5. Histograms of the data collected from drivers and the fitting results of threshold values for two variables, (a) statistical results of driving data and fitting results of acceleration (a_e) and THW, using normal and t distributions, (b) statistical results of driving data and the fitting results of the cumulative distribution function.

Figure 6. The proposed framework to analyze driving styles based on the State–Action semantic plane.

Figure 7. A graphical model of HSMM.

Figure 8. The HDP–HSMM graphical model.

Figure 9. The driving behavior semantic plane. (a) driving risk evaluation model, (b) the State–Action semantic plane.

Figure 10. The log–likelihood of modeling the training data for all drivers.

Figure 11. The segmentation results of the driving behavior of driver #5 in one following event, (a) the sequence clusters of the driving behavior, (b) the fragments result of THW, (c) the fragments result of acceleration.

Figure 12. The scatter map of driver #5 in one following event.

Figure 13. Example of clustering results for driver #5 using the K–means clustering method based on the HDP–HSMM with K = 1.

Figure 14. Correlation analysis of the subjective score and objective risk coefficient of driving styles.

Figure 15. Frequency distribution and duration proportion distribution of driving behaviors of drivers with different styles.

Figure 16. The top ten transition probabilities with the greatest joint mutual information.

Figure 17. Transition probabilities between the behaviors involved in the selected seven features for (a) aggressive, (b) normal, and (c) cautious driving styles, respectively.

Table 1. Data acquisition information.

Types of Data	Information
driver operation	brake/accelerator pedal position, turn signal, steering angle
ego vehicle states	speed, acceleration, location information, yaw angle speed, engine speed
preceding vehicle states	speed, acceleration, location information
subjective score	DSQ, RPQ
physiological signal	ECG, GSR, EEG

Table 2. Drivers information statistic.

Driving Styles	Sample Number	Cluster Center
Cautious	7	6.49
Normal	16	8.42
Aggressive	10	9.30

Table 3. Variable thresholds.

Variables	States/Actions	Thresholds
THW (s)	near following (NF)	<1.2
	middle following (MF)	[1.2,3.0]
	far following (FF)	>3.0
Acceleration (m/s²)	aggressive acceleration (AA)	>1.6
	normal acceleration (NA)	[0,1.6]
	normal deceleration (ND)	[−1.4,0]
	aggressive deceleration (AD)	<−1.4

Table 4. Significance analysis of driving behavior transition probability of different driving styles.

Score Transition	Behavior Transition	Aggressive Mean	Aggressive Std	Normal Mean	Normal Std	Cautious Mean	Cautious Std	Significance Level (p)
6–7	MFND–MFNA	0.6281	0.1917	0.6453	0.1395	0.2787	0.2346	0 < 0.05
7–8	NFAD–NFND	0.7382	0.3201	0.4494	0.2271	0.0444	0.1721	0 < 0.05
7–9	NFAD–NFNA	0.4408	0.2353	0.3369	0.2812	0.1111	0.2999	0.0016 < 0.05
8–7	NFND–NFAD	0.2492	0.1625	0.1119	0.1396	0	0	0 < 0.05
4–5	FFND–FFNA	0.3347	0.3499	0.5005	0.2803	0.8106	0.0941	0 < 0.05
5–4	FFNA–FFND	0.3871	0.2168	0.5751	0.2287	0.7363	0.1842	0 < 0.05
7–8	MFNA–NFND	0.1051	0.0873	0.0575	0.0581	0.0321	0.0649	0.0021 < 0.05

Table 5. Classification accuracy of the three classification algorithms under different feature inputs.

Input Features	Classifiers	Aggressive	Normal	Cautious
Transition probability	RF	91.35%	87.26%	92.22%
	SVM	83.53%	81.64%	84.91%
	KNN	88.92%	84.43%	87.43%
Statistical features	RF	88.49%	83.26%	86.32%
	SVM	84.46%	79.27%	82.08%
	KNN	88.13%	82.64%	82.40%

Table 6. Pearson correlation coefficients between the seven transition probabilities.

Correlation	MFND–MFNA	NFAD–NFND	NFAD–NFNA	NFND–NFAD	FFND–FFNA	FFNA–FFND	MFNA–NFND
MFND–MFNA	1	0.1494	0.0687	0.0679	−0.2143	−0.1598	0.0857
NFAD–NFND	0.1494	1	0.5621	0.6491	−0.4193	−0.4520	0.4578
NFAD–NFNA	0.0687	0.5621	1	0.2014	−0.2554	−0.1145	0.6391
NFND–NFAD	0.0679	0.6491	0.2014	1	−0.2336	−0.3098	0.3264
FFND–FFNA	−0.2143	−0.4193	−0.2554	−0.2336	1	0.5029	−0.1657
FFNA–FFND	−0.1598	−0.4520	−0.1145	−0.3098	0.5029	1	−0.1336
MFNA–NFND	0.0857	0.4578	0.6391	0.3264	−0.1657	−0.1336	1

Note: Significant at the 0.01 level.

Table 7. Pearson correlation coefficients between the seven traditional statistical features.

Correlation	Ve_mean	Ve_std	De_mean	De_std	Ae_mean	Ae_std	THW_mean
Ve_mean	1	0.6025	−0.5068	−0.3547	0.1605	0.3657	−0.6764
Ve_std	0.6025	1	−0.2731	−0.2084	0.0464	0.4818	−0.3277
De_mean	−0.5068	−0.2731	1	0.8666	−0.0749	−0.4843	0.9674
De_std	−0.3547	−0.2084	0.8666	1	0.1078	−0.4450	0.8225
Ae_mean	0.1605	0.0464	−0.0749	0.1078	1	−0.0480	−0.1429
Ae_std	0.3657	0.4818	−0.4843	−0.4450	−0.0480	1	−0.4115
THW_mean	−0.6764	−0.3277	0.9674	0.8225	−0.1429	−0.4115	1

Note: Significant at the 0.01 level.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
