Article

Driving Risk Assessment Using Near-Miss Events Based on Panel Poisson Regression and Panel Negative Binomial Regression

1 Key Laboratory of Transport Industry of Big Data Application Technologies for Comprehensive Transport, School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100044, China
2 Department of Econometrics, Riskcenter-IREA, Universitat de Barcelona, 08034 Barcelona, Spain
* Authors to whom correspondence should be addressed.
Entropy 2021, 23(7), 829; https://doi.org/10.3390/e23070829
Submission received: 10 May 2021 / Revised: 17 June 2021 / Accepted: 24 June 2021 / Published: 29 June 2021

Abstract

This study proposes a method for identifying and evaluating driving risk as a first step towards calculating premiums in the newly emerging context of usage-based insurance. Telematics data gathered by the Internet of Vehicles (IoV) contain a large number of near-miss events, which can be regarded as an alternative to claims or accidents for estimating a driving risk score for a particular vehicle and its driver. Poisson regression and negative binomial regression are applied to a summary data set of 182 vehicles with one record per vehicle and to a panel data set of daily vehicle data containing four near-miss events, i.e., counts of excess speed, high speed brake, harsh acceleration and harsh deceleration, together with additional driving behavior parameters that do not result in accidents. Negative binomial regression ($\mathrm{AIC}_{overspeed}$ = 997.0, $\mathrm{BIC}_{overspeed}$ = 1022.7) is seen to perform better than Poisson regression ($\mathrm{AIC}_{overspeed}$ = 7051.8, $\mathrm{BIC}_{overspeed}$ = 7074.3). Vehicles are separately classified into five driving risk levels with a driving risk score computed from the individual effects of the corresponding panel model. This study provides a research basis for actuarial insurance premium calculations, even if no accident information is available, and enables a precise supervision of dangerous driving behaviors based on driving risk scores.

1. Introduction

Near-miss events are incidents that denote the existence of danger, even if no accident occurs. Reporting of near-miss events is an established error reduction technique that has been used by many industries to manage risk and reduce accidents. In the auto insurance industry, insurers traditionally calculate premiums by analyzing past claims reported by the insured policy holders, and reward those drivers that do not report accidents with a no-claims bonus. However, this may be a rather incorrect approach to the assessment of accident risk, especially when the insured has suffered accidents but chooses not to make a claim so as not to lose the no-claims bonus. Fortunately, the advent of the Internet of Vehicles (IoV) offers a better solution to this problem, using near-miss events to identify driving risk. Near-miss events ultimately provide information that can lead to actuarial premium calculations in the auto insurance industry [1,2].
This study explores how to evaluate driving risks, in the short term, and to score drivers without claims and accidents based on information on near-miss counts over a short period of time. One of the main novelties of this approach, in the absence of claims, is to use telematics sensors for observation of drivers over a given period. The model obtained in this study offers an important alternative for driving risk identification. Not only can the model reflect risk factors that influence each near-miss event but it can also help to evaluate drivers’ risks, and fixed-effects panel count data models can be used to rank drivers according to their individual effects. The modeling method and results are invaluable for insurance companies for developing usage-based insurance (UBI) to personalize premiums. They are also of interest to traffic regulatory authorities for promoting safe driving and the prevention of accidents.
Near-miss events are incidents that need to be defined and extracted from the original raw data files for further processing and analysis. By dealing only with near-miss events, and excluding claims or accidents, this study aims to specifically identify driving patterns. This study is carried out both on a per driver summary data set and on a panel data set where a daily summary is shown for each driver. Our data contain counts of the four types of near-miss events in our study. Speeding, high speed braking, harsh acceleration and harsh deceleration have been defined based on actual driving conditions and local laws and regulations. Other high-risk events, e.g., sharp turning, dangerous lane changing and unexpected maneuvers, proved by previous studies to be related to driving risk, are not included in this study due to the dimension and precision limitations of the original data set.
Our interest is to model the frequency of near-miss events given the drivers’ characteristics. The simplest statistical model that links a count dependent variable with explanatory factors is the Poisson model. Essentially, the Poisson model is similar to linear regression, where a response depends on other inputs. Here we assume that distance driven and mean speed, among others, influence the expected frequency of near-miss events. A Poisson model, also known as a Poisson regression model, is easily interpretable and provides a way to identify significant effects on the conditional expected frequency. Poisson models are constrained by the fact that the conditional expectation and the conditional variance are equal. Negative binomial regression models are a natural extension that overcomes this restriction. More details on the models are provided in the Methods section below.
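The equality of mean and variance can be checked informally before any model is fitted. The following is a minimal sketch, assuming a plain list of counts; the numbers are made-up illustrative values, not data from this study, and `dispersion_index` is a hypothetical helper name.

```python
from statistics import mean, pvariance

def dispersion_index(counts):
    """Return variance / mean; values well above 1 suggest overdispersion."""
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return pvariance(counts) / m

# Illustrative counts only (not taken from the study's data set).
counts = [0, 0, 1, 0, 2, 0, 15, 0, 3, 40]
print(dispersion_index(counts))
```

A ratio far above 1, as in this toy sample, is the situation in which a negative binomial model is usually preferred over a Poisson model.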
Since the extracted frequency of near-miss events is an unbounded non-negative integer, Poisson regression and negative binomial regression are both suitable for modeling. Poisson regression, negative binomial regression, zero-inflated Poisson regression and zero-inflated negative binomial regression are each applied to the summary data set. Average speed, brake times, accelerator pedal position, engine fuel rate, etc., are selected as independent variables. Either mileage or fuel consumption can be chosen as the exposure variable to offset the model. In order to reach a clear understanding of the risk factors of different near-miss events, each near-miss event is individually used as a dependent variable. However, regardless of which one is selected as the dependent variable, negative binomial regression is shown to provide the best fit in the summary data in this study.
Negative binomial regression also performs better than Poisson regression on the panel data sets. Individual effects and time effects are estimated using panel Poisson regression and panel negative binomial regression on a short panel data set of six days in length. The regression results confirm the existence of individual effects and time effects, and also enable the driving risk of each vehicle to be ranked. The driving risk level of vehicles can then be classified by converting the individual effects into scores, thus providing an important reference for further accurate calculation of premiums.
The rest of this article is organized as follows. The development of UBI and previous efforts on driving risk assessment are summarized in Section 2. Section 3 describes the data and introduces the key parameters used in modeling. Section 4 presents the model expression of Poisson regression and negative binomial regression used in the study. The results of negative binomial regression using the summary data set and the panel data set are reported and analyzed in Section 5. The results are discussed and the conclusions are presented in Section 6.

2. Literature Review

The auto insurance industry is continuously pursuing new ways to calculate more accurate actuarial premiums. However, traditional auto insurance calculations are limited by the difficulty of obtaining information on policy holders, so classical ratemaking uses simple information on drivers (age, gender), vehicles (type of car, model and brand) and driving sections [3]. With current advances in information technology, a new type of insurance business, UBI, based on multi-source data and personalized premium calculation, is becoming mainstream. The pay-as-you-drive (PAYD) mode of charging premiums is based on mileage or fuel consumption, on the premise that mileage or fuel consumption correlates with the probability of suffering an accident [4]. PAYD has evolved into a newer scheme, called the pay-how-you-drive (PHYD) ratemaking mode, which is based on multiple sources of data, including driving behavior data [5]. Following the development of 5G communication technology, it may now be possible to implement an even more sophisticated monitoring and pricing strategy, known as the manage-how-you-drive (MHYD) principle, i.e., real-time calculation of premiums based on multi-source data, with real-time feedback to drivers to discourage bad driving behavior [3,6]. However, due to technological, regulatory and other issues regarding privacy [7], there is still no mature PHYD product on the market at present [8,9] and, in terms of MHYD, further research on driving risk is necessary to produce products that better reflect the driver profile [10].
Traffic accidents all over the world result in a large number of casualties every year, and high-risk driving is one of the main factors behind these incidents [3]. Consequently, research on driving risk has been a topic of interest over recent decades. Simulation experiments to evaluate driving risk have been designed in the laboratory setting to identify driving risk factors [11,12,13,14] as well as experiments using actual vehicles on the road [15,16,17,18,19]. Questionnaire surveys for driving risk assessment have also been studied [20,21]. In fact, the naturalistic type of driving data collected by the IoV or smart phones, known as telematics data, can effectively reduce the influence of subjective factors and unreasonable assumptions in producing effective risk-mitigating actions [22,23,24,25,26].
In research related to driving risk assessment in the auto insurance industry, machine learning and generalized linear models feature equally. Machine learning, with its strong ability to process big data efficiently, is increasingly gaining ground in its application in the auto insurance business. Logistic regression [27], cluster analysis [28], decision tree [5], support vector machine [29], neural network [30] and other machine learning models [31,32,33] have been widely studied in the field of driving risk assessment, and the results have shown machine learning to be a powerful tool [34]. However, since most machine learning procedures, being black box algorithms, do not offer a high degree of interpretability, they cannot completely replace the conventional generalized linear models implemented for decades in the auto insurance industry [8].
Conventional generalized linear models discern the correlation between influencing factors and claims or accidents in frequency and severity models [9,24,25,35]. However, the study of near-miss events when information on claims and accidents is lacking should not be ignored [2,15]; on the contrary, since near-misses are more frequent than accidents and are positively associated with them, they can be considered a good alternative for risk modeling in driving risk assessment [1]. Compared with previous studies, this study not only conducts regression on the summary data set to model and analyze the factors associated with near-miss events, but also conducts panel data regression on the panel data set to consider individual effects and time effects. The regression results both support more careful inference about risk factors and make risk scoring possible.

3. Data Description

The telematics data used in this study are collected from an IoV information service provider in China. While we cannot obtain more data due to the commercial privacy of the data, the limited data also contains valuable driving risk information, which is worth studying. The original data set contains 182 data files, representing sensor data for 182 vehicles observed from 3–8 July 2018 [10]. Each data file contains 62 different measurements but, after data processing [36], less than one-third of them can be used due to recording errors and inconsistencies. The original data are transformed for modeling into a summary data set with information on each driver (see details in Table 1).
The variables overspeed, highspeedbrake, harshacceleration and harshdeceleration are individually filtered by combining the rules of traffic law and driving code. Previous studies have confirmed that speeding is a dangerous driving behavior which is likely to cause traffic accidents [3]. In China, traffic safety regulations stipulate a maximum speed for each type of vehicle on all types of roads. The maximum speed limit for the vehicles in this study is 90 km/h; exceeding this by 10% is not deemed to be a traffic offense. Therefore, 100 km/h is taken as the threshold value of the overspeed near-miss event. Another high risk near-miss event that deserves attention is that of emergency braking; at high speed (>90 km/h), if the brake is not used correctly or is subjected to lateral force, the car is prone to side-slip or even roll over. Lastly, both harsh acceleration and harsh deceleration are near-miss events that compromise driving safety and fuel economy. Based on previous research experience [1,2,37] and a box plot analysis of the extreme values of this data set, 6 m/s² is determined as the filtering threshold value of harsh acceleration and harsh deceleration. Figure 1 shows that near-miss events are all non-negative integers. Combined with the relationship between expectation and variance shown in Table 1, the four near-miss events are shown to be suitable as dependent variables of a Poisson regression or a negative binomial regression.
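The thresholds above can be turned into event counts along the following lines. This is only a sketch under stated assumptions: the tuple layout of a sample, the helper names, and the rule that one uninterrupted run above a threshold counts as a single event are all hypothetical choices, since the text does not specify how consecutive samples are grouped.

```python
OVERSPEED_KMH = 100.0      # 90 km/h limit plus the 10% tolerance described above
HIGH_SPEED_KMH = 90.0      # braking above this speed counts as a high speed brake
HARSH_ACCEL_MS2 = 6.0      # threshold for harsh acceleration and deceleration

def count_episodes(flags):
    """Count runs of consecutive True values (assumption: one event per run)."""
    events, previous = 0, False
    for flag in flags:
        if flag and not previous:
            events += 1
        previous = flag
    return events

def near_miss_counts(samples):
    """samples: list of (speed_kmh, accel_ms2, brake_on) tuples in time order."""
    return {
        "overspeed": count_episodes(s > OVERSPEED_KMH for s, _, _ in samples),
        "highspeedbrake": count_episodes(b and s > HIGH_SPEED_KMH for s, _, b in samples),
        "harshacceleration": count_episodes(a >= HARSH_ACCEL_MS2 for _, a, _ in samples),
        "harshdeceleration": count_episodes(a <= -HARSH_ACCEL_MS2 for _, a, _ in samples),
    }

# Hypothetical trace: (speed in km/h, acceleration in m/s^2, brake pedal pressed)
trace = [(95, 1.0, False), (101, 6.5, False), (104, 0.5, False),
         (98, -7.0, True), (85, -6.2, True), (80, 0.0, False)]
print(near_miss_counts(trace))
```

Counting runs rather than individual samples avoids inflating the count when a single manoeuvre spans several consecutive readings; a per-sample count would be an equally defensible convention.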
The panel data set has one summary per day for each driver. The statistics of the panel data set are shown in Table 2.

4. Methods

Poisson regression is a generalized linear model. Negative binomial regression can be considered as a generalization of Poisson regression that allows for overdispersion of the dependent variable $Y_i$, where subindex $i$ refers to the $i$-th observation in the data set. The probability mass function of the Poisson distribution is:
$$P(Y_i = y_i \mid x_i) = \frac{e^{-\lambda_i}\,\lambda_i^{y_i}}{y_i!}$$
where $\lambda_i$ is the Poisson arrival rate, which is determined by the explanatory variable $x_i$ in Poisson regression and represents the average number of events. It is equal to both the expectation and the variance of the explained variable, $E(Y_i \mid x_i) = V(Y_i \mid x_i) = \lambda_i$.
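The equality of mean and variance can be verified numerically. The sketch below evaluates the Poisson probabilities in log space and sums over a truncated support; the rate 3.5 is an arbitrary illustrative value and the truncation point is an assumption that only needs to be large relative to the rate.

```python
import math

def poisson_pmf(y, lam):
    """P(Y = y) for a Poisson distribution with rate lam, computed in log space."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

lam = 3.5  # illustrative rate, not an estimate from the study
support = range(200)  # truncation; the tail mass beyond 200 is negligible here
mean = sum(y * poisson_pmf(y, lam) for y in support)
var = sum((y - mean) ** 2 * poisson_pmf(y, lam) for y in support)
print(round(mean, 6), round(var, 6))
```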
The negative binomial distribution is a mixture of a Poisson($\lambda$) and a Gamma($a$, $b$) distribution. The probability mass function of the negative binomial distribution is:
$$f(y \mid a, b) = \int_0^{\infty} f(y \mid \lambda)\, g(\lambda \mid a, b)\, d\lambda = \frac{\Gamma(y + a)}{\Gamma(y + 1)\,\Gamma(a)} \left(\frac{b}{1 + b}\right)^{a} \left(\frac{1}{1 + b}\right)^{y}$$
where $\lambda$ is the mean and variance of the Poisson distribution, $a$ is the shape parameter of the Gamma distribution, $b$ is the inverse scale parameter of the Gamma distribution, $E(y) = \frac{a}{b} = \bar{\lambda}$ and $V(y) = \frac{a}{b}\left(1 + \frac{1}{b}\right) = \bar{\lambda}\left(1 + \frac{\bar{\lambda}}{a}\right)$.
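As a numerical check of these moment formulas, the following sketch evaluates the mixture probabilities for arbitrary illustrative parameters ($a = 2$, $b = 0.5$, which are not estimates from this study) and compares the resulting mean and variance with $a/b$ and $(a/b)(1 + 1/b)$.

```python
import math

def negbin_pmf(y, a, b):
    """Poisson-Gamma mixture pmf with Gamma shape a and inverse scale (rate) b."""
    log_p = (math.lgamma(y + a) - math.lgamma(y + 1) - math.lgamma(a)
             + a * math.log(b / (1 + b)) - y * math.log(1 + b))
    return math.exp(log_p)

a, b = 2.0, 0.5  # illustrative parameters only
support = range(2000)  # truncation; the remaining tail mass is negligible
nb_mean = sum(y * negbin_pmf(y, a, b) for y in support)
nb_var = sum((y - nb_mean) ** 2 * negbin_pmf(y, a, b) for y in support)
print(nb_mean, a / b)                   # mean versus a / b
print(nb_var, (a / b) * (1 + 1 / b))    # variance versus (a / b)(1 + 1 / b)
```

The variance exceeds the mean by the factor $1 + 1/b$, which is the overdispersion that the Poisson model cannot represent.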
The zero-inflated model is applicable when the count data contain a large number of zero values. Theoretically, it is a two-stage decision. First, it is decided whether the observation is an extra (structural) zero; if it is not, the count, which may itself be zero, is drawn from a count distribution. Therefore, the probability distribution of $Y_i$ is a mixed distribution:
$$\Pr(Y_i = y_i \mid x_i) = \begin{cases} \theta + (1 - \theta)\, P(K_i = y_i \mid x_i), & y_i = 0 \\ (1 - \theta)\, P(K_i = y_i \mid x_i), & y_i > 0 \end{cases}$$
where $\theta$ is the probability of an extra zero value and $K_i$ can follow a Poisson distribution or a negative binomial distribution, depending on the characteristics of the dependent variable.
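A minimal sketch of the mixed distribution for the Poisson case is given below. The values of $\theta$ and the rate are illustrative assumptions, and `zip_pmf` is a hypothetical helper name.

```python
import math

def poisson_pmf(y, lam):
    """Poisson probability of observing y events at rate lam."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

def zip_pmf(y, theta, lam):
    """Zero-inflated Poisson: extra zeros with probability theta, else Poisson(lam)."""
    base = (1 - theta) * poisson_pmf(y, lam)
    return theta + base if y == 0 else base

theta, lam = 0.3, 2.0  # illustrative values only
total = sum(zip_pmf(y, theta, lam) for y in range(100))
zi_mean = sum(y * zip_pmf(y, theta, lam) for y in range(100))
print(total, zi_mean)
```

The probabilities still sum to one, while the mean shrinks to $(1 - \theta)\lambda$ because a share $\theta$ of the observations is forced to zero.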
The conditional expectation function of a negative binomial regression model depends on a vector of explanatory variables $x_i$ and, similar to Poisson, is usually defined by a log-link as:
$$E(y_i \mid x_i) = \lambda_i = T_i \times \exp(\alpha + \beta_1 x_{1i} + \cdots + \beta_k x_{ki})$$
where $i$ is the number of the observation, $k$ depends on the number of independent variables, $T_i$ denotes the offset variable (so, in our application, $\mathit{kilo}_i$ or $\mathit{fuel}_i$ is the exposure variable), $x_{1i}, \ldots, x_{ki}$ represent the independent variables such as $\mathit{brakes}_i$, $\mathit{range}_i$, $\mathit{speed}_i$, $\mathit{rpm}_i$, $\mathit{acceleratorpedalposition}_i$ and $\mathit{enginefuelrate}_i$, and $\alpha$ and $\beta_1, \ldots, \beta_k$ are unknown parameters that need to be estimated.
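The role of the offset and of the coefficients in this log-link can be illustrated with a short sketch. The coefficient values and covariates below are hypothetical and chosen only to show two properties: the expected count scales proportionally with exposure, and a one-unit increase in a covariate multiplies the expected count by $\exp(\beta)$.

```python
import math

def expected_count(exposure, alpha, betas, x):
    """lambda_i = T_i * exp(alpha + sum_j beta_j * x_ji)."""
    linear = alpha + sum(bj * xj for bj, xj in zip(betas, x))
    return exposure * math.exp(linear)

# Hypothetical coefficients and standardized covariates, for illustration only.
alpha, betas = -2.0, [0.3, -0.7]
x = [1.0, 0.5]

lam_100km = expected_count(100.0, alpha, betas, x)
lam_200km = expected_count(200.0, alpha, betas, x)
print(lam_200km / lam_100km)  # doubling the exposure doubles the expected count

x_plus = [x[0] + 1.0, x[1]]   # raise the first covariate by one unit
rate_ratio = expected_count(100.0, alpha, betas, x_plus) / lam_100km
print(rate_ratio, math.exp(betas[0]))
```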
The two-way fixed effect model of panel Poisson regression and panel negative binomial regression is specified as:
$$E(y_{it} \mid x_{it}) = \lambda_{it} = T_{it} \times \exp(\alpha + \beta_1 x_{1it} + \cdots + \beta_k x_{kit} + d_i + p_t)$$
where $i$ is the number of the observation, $t$ is the time reference, $k$ depends on the number of independent variables, $T_{it}$ is the offset and equals $\mathit{kilo}_{it}$ or $\mathit{fuel}_{it}$ as the exposure variable of the $i$-th observation at time $t$, $x_{1it}, \ldots, x_{kit}$ represent the independent variables of the $i$-th observation at time $t$ such as $\mathit{brakes}_{it}$, $\mathit{range}_{it}$, $\mathit{speed}_{it}$, $\mathit{rpm}_{it}$, $\mathit{acceleratorpedalposition}_{it}$ and $\mathit{enginefuelrate}_{it}$, $\alpha$ and $\beta_1, \ldots, \beta_k$ are unknown parameters that need to be estimated, $d_i$ represents the individual effect and $p_t$ represents the time effect. To avoid identification problems in the model specification, $d_1 = p_1 = 0$.
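One consequence of this specification can be written out explicitly. As a worked step under the assumption that two vehicles $i$ and $j$ have the same exposure and the same covariate values on day $t$, the ratio of their expected counts depends only on their individual effects:

```latex
\frac{\lambda_{it}}{\lambda_{jt}}
  = \frac{T_{it}\,\exp\!\left(\alpha + \sum_{m=1}^{k}\beta_m x_{mit} + d_i + p_t\right)}
         {T_{jt}\,\exp\!\left(\alpha + \sum_{m=1}^{k}\beta_m x_{mjt} + d_j + p_t\right)}
  = \exp(d_i - d_j)
  \qquad \text{when } T_{it} = T_{jt},\; x_{mit} = x_{mjt}\ \text{for all } m .
```

With the normalization $d_1 = 0$, $\exp(d_i)$ can therefore be read as the rate of near-miss events of vehicle $i$ relative to the reference vehicle, holding exposure, covariates and day fixed.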
The methodology of this study involves data preparation, modeling, risk scoring of driving risk, etc. The whole technical process is shown in Figure 2.
In the data preparation stage, the original data need to be preprocessed, including multi-source data fusion, data cleaning, handling of missing values, etc. Then the summary data set and panel data set required in this study are obtained through statistical calculation. In the modeling phase, multiple count data models are used on two data sets for regression analysis, which follows certain premises. Our observed drivers can be considered independent of each other. Even if they drive in a similar area, they do not have any apparent relationship with each other. When one driver is observed over time, temporal correlation is handled by the panel model, which accounts for each individual being observed repeatedly, here once per day. In the scoring stage, the regression results obtained from the regression model most suitable for the data in this study can be used for causal analysis of near-miss events and for driving risk scoring and rating. In the research field of telematics data application, the results of this study show that this application has potential in, for example, driving behavior supervision and personalized premium calculation. The work at this stage has yet to be completed. Data processing in the preparation stage, and Poisson regression and negative binomial regression on different data sets in the modeling stage, can be implemented with data tools such as Stata, Python, R, etc.

5. Results

Before regression, multicollinearity tests are carried out on all explanatory variables to eliminate the influence of multicollinearity on the model. As shown in Table 3, the variance inflation factors (VIF) of all selected independent variables are less than 5, while the correlation coefficients are generally less than 0.7. This indicates that the multicollinearity among variables is weak, so all of them can be included in the regression equation and robust estimates can be made.
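For readers who want to reproduce this kind of check, the sketch below computes variance inflation factors from the inverse of the correlation matrix of the predictors, using the identity that $\mathrm{VIF}_j$ is the $j$-th diagonal entry of that inverse. The predictor columns are hypothetical values, not the study's data, and the helper names are illustrative.

```python
from statistics import mean, pstdev

def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long numeric columns."""
    n = len(columns[0])
    z = [[(v - mean(c)) / pstdev(c) for v in c] for c in columns]
    return [[sum(p * q for p, q in zip(zi, zj)) / n for zj in z] for zi in z]

def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    k = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(k)] for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def vif(columns):
    """VIF_j equals the j-th diagonal entry of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]

# Hypothetical predictor columns (not the study's data).
speed = [40.0, 55.0, 60.0, 72.0, 80.0, 65.0]
rpm = [1200.0, 1500.0, 1400.0, 1900.0, 2100.0, 1600.0]
brakes = [30.0, 12.0, 25.0, 8.0, 15.0, 20.0]
print([round(v, 2) for v in vif([speed, rpm, brakes])])
```

In this toy example speed and rpm move closely together, so their VIF values are large; the threshold of 5 used above would flag them.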
Both Poisson regression and negative binomial regression are applicable to this study, and the zero-inflated model is taken as a consideration for the large number of zero values of dependent variables. In order to determine the regression model which is most suitable for this study, the performance of the two models on different dependent variables is compared. All the estimated results are obtained by regression after standardization of the original values.

5.1. Results of the Summary Data Set

In the summary data set, four near-miss events are respectively treated as dependent variables while the independent variables are brakes, speed, rpm, accelerator pedal position and engine fuel rate, where kilo is chosen as the exposure variable or offset. Poisson regression, zero-inflated Poisson regression, negative binomial regression and zero-inflated negative binomial regression are estimated (see Table 4). Regardless of which near-miss event is the dependent variable, negative binomial regression has maximum log-likelihood value, and minimum AIC value and BIC value. That is, negative binomial regression has the best performance in this data set.
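The comparison criteria follow from the log-likelihood through the standard definitions $\mathrm{AIC} = 2k - 2\ln L$ and $\mathrm{BIC} = k \ln n - 2\ln L$, where $k$ is the number of estimated parameters and $n$ the number of observations. The sketch below applies them to two made-up fits; the log-likelihoods and parameter counts are hypothetical and are not the values reported in Table 4.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical fits on 182 vehicles; the log-likelihoods are made up for illustration.
n_obs = 182
fits = {"poisson": (-900.0, 6), "negative_binomial": (-450.0, 7)}
for name, (ll, k) in fits.items():
    print(name, round(aic(ll, k), 1), round(bic(ll, k, n_obs), 1))

# Lower AIC indicates the preferred model among the candidates.
best = min(fits, key=lambda m: aic(*fits[m]))
print(best)
```

Both criteria penalize the extra dispersion parameter of the negative binomial model, so it is preferred only when the gain in log-likelihood outweighs that penalty.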
According to the results of negative binomial regression for the different dependent variables (see Table 5 and Figure 3a), different near-miss events are affected by different driving risk factors with different influences. Overall, the average speed has the most obvious influence on near-miss events, with a significant negative effect on harsh acceleration (−0.776) and harsh deceleration (−0.658). The impact of the number of braking events on near-miss events is also significantly positive. The higher the number of braking events, the more high speed braking (0.272), harsh acceleration (0.189) and harsh deceleration (0.180) occur. In addition, average RPM is positively correlated with harsh acceleration (0.178), and average accelerator pedal position is positively correlated with harsh acceleration (0.152) and harsh deceleration (0.235). Interestingly, some influencing factors have opposite effects on different dependent variables. Range of driving has a positive effect on high speed braking (0.272) but a negative effect on harsh deceleration (−0.153), while average engine fuel rate has a significant positive effect on high speed braking (0.705) but a negative effect on harsh deceleration (−0.157). Furthermore, the significance of the constant term indicates that, in addition to the factors considered in this study, there are other factors that also influence near-miss events. The results of the other three regression models on the summary data set are shown in Table A1, Table A2 and Table A3, and discussed in the Discussion section.

5.2. Results of the Panel Data Set

As shown in Table 6, negative binomial regression has lower AIC and BIC values than Poisson regression for each dependent variable, which corresponds to a higher log-likelihood. Therefore, negative binomial regression fits the panel data better than Poisson regression.
The panel negative binomial regression is used to estimate the two-way fixed effect model, considering both individual effect and time effect on four dependent variables. The influencing factors reflected by this (see Table A4 and Figure 3b) differ from those shown in the results of the summary data. For example, harsh acceleration and harsh deceleration are positively affected by the number of brakes (0.246 and 0.253) and average accelerator pedal position (0.249 and 0.270) but negatively affected by the average speed (−0.645 and −0.586) and average engine fuel rate (−0.188 and −0.229). However, RPM, which is not significant in the summary data, is significantly positive for overspeed (1.683) and high speed braking (1.287). The brakes (0.0505) and engine fuel rate (0.295), which had a significant positive effect on the summary data, become insignificant.
The advantage of panel data over summary data is that fixed effects can be estimated and thus individual effects and time effects can be interpreted. The time effect is significant in most cases for high speed braking, harsh acceleration and harsh deceleration, which indicates that these three near-miss events are greatly influenced by time. The time effect on the overspeed event is significant for only one day, suggesting that it is less influenced by time. More importantly, the individual effects of the four near-miss events can be used to score each observation. It should be noted that the first observation has been omitted in the regression to avoid complete multicollinearity, and its value is expected to be zero in the subsequent driving risk score.

6. Discussion

The regression results of Poisson regression (see Table A1), zero-inflated Poisson regression (see Table A2), negative binomial regression (see Table 5) and zero-inflated negative binomial regression (see Table A3) on the summary data set show the importance of driving behavior variables in driving risk. The high significance of two variables, braking times and average speed, in the four regression models indicates that these two factors have a very important impact on the generation of near-miss events. Moreover, the significance of specific independent variables in the regression models of specific dependent variables indicates that near-miss events are affected by a variety of driving behavior factors and that the formation mechanism of each near-miss event is different. For example, the positive effect of RPM on harsh acceleration events, the positive effect of accelerator pedal position on harsh deceleration events and the positive effect of engine fuel rate on high speed braking events are shown in Table 5, Table A1, Table A2 and Table A3.
The results obtained by panel regression are more reliable than those obtained by pooled regression. Table 5 and Table A4 and Figure 3 show that some coefficients that are not significant in the pooled negative binomial regression become significant in the panel negative binomial regression, while some significant parameters in the pooled negative binomial regression are not significant in the panel negative binomial regression. This means that the dependent variables are affected by individual effects and time effects. In the panel negative binomial regression, most of the individual and time coefficients are significant, which indicates the suitability of this type of regression analysis.
Driving risks can be evaluated by the regression coefficients of negative binomial models on panel data. The value of the individual coefficients within a regression indicates the individual’s deviation from the level of the expected occurrence of a particular near-miss event, given the information on all the other explanatory variables. In other words, the individual effect coefficient can be understood as the contribution of each vehicle to the occurrence of the corresponding near-miss event. Geometrically, the effect coefficient of each individual is a change in the intercept.
Four near-miss events are used as dependent variables to obtain four sets of regression coefficients. Given that the influencing factors and generating mechanisms of different near-miss events are different, combining the four groups of regression coefficients into one group is not recommended. However, harsh acceleration and harsh deceleration show very similar characteristics in terms of data description before regression (Table 1 and Table 5), after regression (Figure 3 and Table A4) and in distribution of driving risk score (Table A5). Even so, it is not recommended to combine them into a single near-miss event for study, because the occurrence conditions and coping operations of them are different, and it is the most appropriate choice to study each near-miss event separately.
In order to transform individual effect estimates of near-miss models into a driving risk grading, several steps need to be followed. Firstly, winsorization avoids the influence of possibly spurious outliers (the double tail was winsorized with the threshold 0.01 in this study). Secondly, the regression coefficient can be compressed to the interval of [0,1] through normalization. Each group of coefficients is then mapped into an interval of [0,5] (see Table A5), and each observation then is given a driving risk level from 1 to 5, i.e., excellent, good, medium, bad and terrible (see Figure 4). The values of exactly 0 and 5 are included because the corresponding observations are the minimum and the maximum values in their group and are Min-Max scaled. In the overspeed and highspeedbrake groups, two types of observations with high risk or low risk can be clearly seen. This indicates that these two near-miss events are more sensitive to driving behavior than harshacceleration and harshdeceleration and can be considered as a higher priority and weight in subsequent studies. Note that the same observation (id125) has different risk levels for different near-miss events, which also explains why multiple near-miss events cannot be analyzed together. Ultimately, the premium would be charged individually according to the driving risk level of the insured person.
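The scoring steps described above can be sketched as follows. Several details are assumptions rather than the authors' stated procedure: the nearest-rank rule used to find the winsorization quantiles, the equal-width bins that turn a score in [0,5] into a level from 1 to 5, and the individual effects themselves, which are hypothetical numbers.

```python
import math

LEVELS = ["excellent", "good", "medium", "bad", "terrible"]

def winsorize(values, tail=0.01):
    """Clamp values to the empirical [tail, 1 - tail] quantiles (nearest-rank rule)."""
    ordered = sorted(values)
    n = len(ordered)
    lo = ordered[min(n - 1, math.floor(tail * n))]
    hi = ordered[max(0, math.ceil((1 - tail) * n) - 1)]
    return [min(max(v, lo), hi) for v in values]

def risk_scores(effects, tail=0.01):
    """Winsorize, min-max scale to [0, 1], then stretch to [0, 5]."""
    clipped = winsorize(effects, tail)
    low, high = min(clipped), max(clipped)
    if high == low:
        return [0.0 for _ in clipped]
    return [5 * ((v - low) / (high - low)) for v in clipped]

def risk_level(score):
    """Assumed equal-width bins on [0, 5]; a score of exactly 5 falls in level 5."""
    return min(5, int(score) + 1)

# Hypothetical individual effects d_i; the first (reference) vehicle is fixed at 0.
effects = [0.0, -1.2, 0.4, 2.9, 0.9, -0.3, 1.7, 5.0]
for d, s in zip(effects, risk_scores(effects)):
    print(d, round(s, 2), LEVELS[risk_level(s) - 1])
```

With only eight values the 0.01 tails clip nothing, so the smallest and largest effects receive scores of exactly 0 and 5, which mirrors the remark about Min-Max scaling in the text.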

7. Conclusions

The number and type of dependent variables and independent variables selected in this study are limited by the size and quality of the original data. With the promotion and innovation of IoV and of new energy vehicles, the amount and dimension of data will be greatly increased. Therefore, application of near-miss events as dependent variables could be easily increased or decreased, according to needs. For example, sharp turn should be included, if possible, as a near-miss event because sharp turn is a highly studied and accident-proven pattern of high driving risk. For the same reason, more driving behavior indicators, such as steering wheel angle speed, brake pedal position, and so on, could be used as independent variables in the regression model. In addition, traditional auto insurance factors, such as driver information, vehicle information, road information, environment information and the health status of batteries (of new energy vehicles) should be considered to provide more optional independent variables for the model.
In practical applications, near-miss events can be combined with claims and accidents to evaluate driving risks more accurately. This study shows that near-miss events can be used to derive driving risk scores when there are no claims or accidents. However, when claims or accidents exist, the driving risk score obtained from claims or accidents can be used as the basis for premium calculation, while the driving risk rating obtained from near-miss events can be used to remind and warn drivers to reduce the corresponding dangerous driving habits.
In this study, the best performing negative binomial regression (see Table 4) was selected as the main method for modeling on our data set. The model is suitable for similar causal analysis of similar data sets. However, in case of risk event prediction or analysis on other data sets, it is necessary to reevaluate the goodness of fit of various models, and even machine learning methods with good prediction performance should be taken into consideration. The optimal method is not fixed, but depends on the data, conditions and purposes.
Econometrics and machine learning complement each other. The generalized linear model established in this study reveals the relationship between driving behavior factors and near-miss events, and gives a driving risk score for each observation. This model has strong explanatory power, but its generalization degree and robustness need to be further tested, especially on larger data volume and data dimension. The successful application of machine learning methods in many fields shows that they are often effective in dealing with big data problems but that their results cannot always be easily interpreted, and this interpretation is exactly what the insurance field values. Therefore, telematics data application offers a new way to help find a balance between econometrics and machine learning so as to have good explainability, good generalization ability, quick response ability, and so on [38,39].
In general, near-miss events can provide insurers with effective risk information in the absence of claims and accident data. In our case study with real data, negative binomial regression is the most suitable method for modeling near-miss events as dependent variables. This study provides a technical reference for the promotion and development of PHYD ratemaking schemes.

Author Contributions

Conceptualization, S.S. and M.G.; methodology, M.G.; software, S.S.; validation, J.B., M.G. and A.M.P.-M.; formal analysis, S.S.; investigation, S.S.; resources, M.G. and A.M.P.-M.; data curation, J.B., S.S. and M.G.; writing—original draft preparation, S.S.; writing—review and editing, S.S. and M.G.; visualization, S.S.; supervision, J.B. and M.G.; project administration, J.B.; funding acquisition, J.B. and S.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Fundamental Research Funds for the Central Universities, grant number 2019YJS091.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

This study is based on the original telematics data from China Satellite Navigation and Communications Co., Ltd., which cannot be made public due to confidentiality agreements.

Acknowledgments

M.G. thanks Fundación BBVA and ICREA Academia for providing research funds. S.S. thanks the China Scholarship Council for funding a research visit to the second institution. All the authors thank China Satellite Navigation and Communications Co., Ltd. for providing the original telematics data set.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
UBI: usage-based insurance
IoV: Internet of vehicles
PAYD: pay as you drive
PHYD: pay how you drive
MHYD: manage how you drive
VIF: variance inflation factor
POS: Poisson
ZIP: Zero-inflated Poisson
NB: Negative binomial
ZINB: Zero-inflated negative binomial
XTPOS: Panel Poisson
XTNB: Panel negative binomial
AIC: Akaike information criterion
BIC: Bayesian information criterion

Appendix A

Table A1. The results of Poisson regression for four near-miss events in the summary data set of drivers.
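| Variable | Overspeed Coefficient | Overspeed z | Highspeedbrake Coefficient | Highspeedbrake z | Harshacceleration Coefficient | Harshacceleration z | Harshdeceleration Coefficient | Harshdeceleration z |
|---|---|---|---|---|---|---|---|---|
| constant | −5.191 *** | −21.45 | −5.194 *** | −29.98 | −2.612 *** | −40.34 | −2.591 *** | −40.05 |
| brakes | 0.279 | 1.82 | 0.349 *** | 5.93 | 0.191 *** | 3.66 | 0.186 *** | 3.78 |
| range | 0.0437 | 0.21 | 0.0741 | 0.78 | −0.157 | −1.65 | −0.208 * | −2.04 |
| speed | −0.175 | −0.92 | 0.489 ** | 3.22 | −0.717 *** | −8.57 | −0.601 *** | −6.51 |
| rpm | 0.514 | 1.59 | 0.202 | 0.87 | 0.272 ** | 2.94 | 0.183 * | 1.98 |
| acceleratorpedalposition | 0.0467 | 0.18 | −0.0337 | −0.23 | 0.169 | 1.91 | 0.238 ** | 2.67 |
| enginefuelrate | 0.540 * | 2.24 | 0.755 *** | 4.32 | −0.0499 | −0.59 | −0.119 | −1.48 |
| log-likelihood | −3518.9 | | −2830.7 | | −5857.3 | | −6269.5 | |
| AIC | 7051.8 | | 5675.5 | | 11,728.5 | | 12,552.9 | |
| BIC | 7074.3 | | 5697.9 | | 11,750.9 | | 12,575.4 | |
| Observation | 182 | | 182 | | 182 | | 182 | |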
*** p < 0.01, ** p < 0.05, * p < 0.1.
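Coefficients of a Poisson regression are usually read on the log scale. As a hedged illustration, assuming the standard log link (the link function is not restated in this appendix), exponentiating a coefficient gives an incidence rate ratio, that is, the multiplicative change in the expected event count for a one-unit increase in the covariate. The helper name `rate_ratio` is hypothetical, and the units or scaling of the covariates are not given here.

```python
import math

def rate_ratio(coefficient: float) -> float:
    """Incidence rate ratio implied by a log-link count regression coefficient."""
    return math.exp(coefficient)

# Coefficients of `brakes` taken from Table A1 (Poisson regression).
brakes_coefficients = {
    "Overspeed": 0.279,
    "Highspeedbrake": 0.349,
    "Harshacceleration": 0.191,
    "Harshdeceleration": 0.186,
}

ratios = {event: rate_ratio(b) for event, b in brakes_coefficients.items()}
for event, ratio in ratios.items():
    # A ratio above 1 means more expected events as `brakes` increases.
    print(f"{event}: {ratio:.3f}")
```

All four ratios are above 1, which matches the positive sign of the `brakes` coefficients; whether an individual effect is statistically distinguishable from zero still has to be judged from the z statistics in the table.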
Table A2. The results of zero-inflated Poisson regression for four near-miss events in the summary data set of drivers.
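| Variable | Overspeed Coefficient | Overspeed z | Highspeedbrake Coefficient | Highspeedbrake z | Harshacceleration Coefficient | Harshacceleration z | Harshdeceleration Coefficient | Harshdeceleration z |
|---|---|---|---|---|---|---|---|---|
| constant | −4.388 *** | −18.08 | −5.006 *** | −24.96 | −2.612 *** | −40.34 | −2.591 *** | −40.05 |
| brakes | 0.167 | 1.08 | 0.339 *** | 5.34 | 0.191 *** | 3.66 | 0.186 *** | 3.78 |
| range | 0.0365 | 0.20 | 0.0755 | 0.81 | −0.157 | −1.65 | −0.208 * | −2.04 |
| speed | −0.391 * | −2.12 | 0.408 * | 2.48 | −0.717 *** | −8.57 | −0.601 *** | −6.51 |
| rpm | 0.607 * | 2.18 | 0.274 | 1.11 | 0.272 ** | 2.94 | 0.183 * | 1.98 |
| acceleratorpedalposition | −0.117 | −0.52 | −0.0563 | −0.38 | 0.169 | 1.91 | 0.238 ** | 2.67 |
| enginefuelrate | 0.346 | 1.59 | 0.700 *** | 4.01 | −0.0499 | −0.59 | −0.119 | −1.48 |
| inflate-constant | 0.101 | 0.66 | −1.183 *** | −4.72 | −27.29 *** | −295.09 | −27.00 *** | −363.25 |
| log-likelihood | −2369.8 | | −2667.0 | | −5857.3 | | −6269.5 | |
| AIC | 4755.6 | | 5350.0 | | 11,730.5 | | 12,554.9 | |
| BIC | 4781.3 | | 5375.7 | | 11,756.1 | | 12,580.6 | |
| Observation | 182 | | 182 | | 182 | | 182 | |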
*** p < 0.01, ** p < 0.05, * p < 0.1.
Table A3. The results of zero-inflated negative binomial regression for four near-miss events in the summary data set of drivers.
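| Variable | Overspeed Coefficient | Overspeed z | Highspeedbrake Coefficient | Highspeedbrake z | Harshacceleration Coefficient | Harshacceleration z | Harshdeceleration Coefficient | Harshdeceleration z |
|---|---|---|---|---|---|---|---|---|
| constant | −5.153 *** | −7.37 | −5.114 *** | −35.36 | −2.548 *** | −43.39 | −2.525 *** | −42.99 |
| brakes | 0.263 | 1.25 | 0.272 ** | 2.60 | 0.189 *** | 3.38 | 0.180 *** | 3.45 |
| range | 0.180 | 0.73 | 0.272 * | 2.05 | −0.100 | −1.29 | −0.153 | −1.94 |
| speed | −0.114 | −0.20 | 0.249 | 1.28 | −0.776 *** | −8.81 | −0.658 *** | −7.20 |
| rpm | 0.130 | 0.41 | −0.0241 | −0.11 | 0.178 | 1.90 | 0.0969 | 1.07 |
| acceleratorpedalposition | 0.284 | 0.98 | 0.171 | 1.03 | 0.152 | 1.87 | 0.235 ** | 2.82 |
| enginefuelrate | 0.227 | 0.99 | 0.705 *** | 4.49 | −0.0883 | −1.12 | −0.157 * | −2.07 |
| inflate-constant | −3.793 | −0.17 | −14.62 *** | −6.14 | −25.29 *** | −294.22 | −23.27 *** | −313.10 |
| log-likelihood | −490.5 | | −627.4 | | −1032.8 | | −1037.1 | |
| AIC | 999.0 | | 1272.8 | | 2083.6 | | 2092.3 | |
| BIC | 1027.9 | | 1301.7 | | 2112.5 | | 2121.1 | |
| Observation | 182 | | 182 | | 182 | | 182 | |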
*** p < 0.01, ** p < 0.05, * p < 0.1.
Table A4. Panel negative binomial regression results for four near-miss events.
| Variable | Overspeed Coefficient | Overspeed z | Highspeedbrake Coefficient | Highspeedbrake z | Harshacceleration Coefficient | Harshacceleration z | Harshdeceleration Coefficient | Harshdeceleration z |
|---|---|---|---|---|---|---|---|---|
| constant | −3.768 *** | (−8.68) | −4.356 *** | (−17.39) | −2.284 *** | (−25.13) | −2.269 *** | (−20.81) |
| brakes | −0.0400 | (−0.31) | 0.0505 | (0.73) | 0.246 *** | (6.93) | 0.253 *** | (7.23) |
| range | −0.0637 | (−0.36) | −0.0108 | (−0.12) | −0.00410 | (−0.07) | −0.0595 | (−1.09) |
| speed | −0.0405 | (−0.08) | −0.0965 | (−0.39) | −0.645 *** | (−6.09) | −0.586 *** | (−5.24) |
| rpm | 1.683 * | (2.34) | 1.287 ** | (3.00) | 0.143 | (0.91) | 0.145 | (0.93) |
| acceleratorpedalposition | 0.391 | (0.83) | 0.175 | (0.79) | 0.249 * | (2.12) | 0.270 * | (2.52) |
| enginefuelrate | 0.113 | (0.22) | 0.295 | (1.18) | −0.188 | (−1.58) | −0.229 * | (−2.04) |
| 2018-07-04 | 0.273 | (1.23) | 0.216 | (1.91) | −0.111 * | (−2.12) | −0.216 *** | (−4.33) |
| 2018-07-05 | −0.168 | (−0.73) | −0.0572 | (−0.52) | −0.206 *** | (−4.34) | −0.317 *** | (−6.72) |
| 2018-07-06 | −0.00716 | (−0.03) | −0.228 * | (−2.08) | −0.257 *** | (−4.84) | −0.370 *** | (−7.19) |
| 2018-07-07 | −0.477 * | (−2.11) | −0.200 | (−1.68) | −0.485 *** | (−7.41) | −0.600 *** | (−9.27) |
| 2018-07-08 | 0.206 | (0.90) | 0.117 | (0.95) | −0.694 *** | (−8.63) | −0.784 *** | (−9.58) |
| id2 | −28.81 *** | (−29.17) | −2.001 * | (−2.25) | 1.266 *** | (5.24) | 1.342 *** | (4.80) |
| id3 | −19.05 *** | (−14.86) | −18.13 *** | (−16.37) | 2.004 * | (2.31) | 1.740 *** | (4.75) |
| id4 | −18.29 *** | (−15.94) | −17.91 *** | (−20.04) | 1.891 *** | (8.66) | 1.960 *** | (8.58) |
| id5 | −29.62 *** | (−41.07) | −4.956 *** | (−7.51) | −1.193 *** | (−3.77) | −1.072 *** | (−3.39) |
| id6 | −1.478 * | (−2.40) | −0.554 | (−1.79) | 1.067 *** | (3.31) | 0.935 *** | (4.27) |
| id7 | −3.236 *** | (−4.11) | −0.645 | (−1.58) | 0.656 ** | (3.15) | 0.835 ** | (3.05) |
| id8 | −20.79 *** | (−24.48) | −2.368 *** | (−4.01) | −0.190 | (−0.61) | 0.124 | (0.35) |
| id9 | −1.156 | (−1.10) | −0.0678 | (−0.11) | −0.251 | (−0.79) | −0.109 | (−0.28) |
| id10 | −3.110 *** | (−5.93) | −1.527 *** | (−4.20) | −0.345 * | (−2.17) | −0.256 | (−1.63) |
| id11 | −2.026 * | (−2.42) | −1.163 *** | (−3.45) | −0.162 | (−0.88) | −0.272 | (−1.22) |
| id12 | −1.342 * | (−2.51) | −0.772 * | (−2.45) | 0.0781 | (0.34) | 0.0981 | (0.50) |
| id13 | −2.344 *** | (−5.13) | −0.808 * | (−2.21) | −0.138 | (−0.52) | −0.129 | (−0.66) |
| id14 | −3.178 *** | (−5.87) | 0.442 | (1.43) | −0.629 *** | (−3.66) | −0.365 * | (−2.07) |
| id15 | −1.254 * | (−2.31) | 0.167 | (0.46) | −0.0894 | (−0.34) | 0.0270 | (0.10) |
| id16 | −22.40 *** | (−37.49) | −21.63 *** | (−41.20) | 0.271 | (1.35) | 0.439 * | (2.13) |
| id17 | −21.77 *** | (−25.73) | −2.102 *** | (−4.06) | −0.200 | (−0.87) | 0.0983 | (0.38) |
| id18 | −20.99 *** | (−19.61) | −0.805 | (−1.48) | −1.124 *** | (−6.78) | −1.267 *** | (−5.69) |
| id19 | −0.998 | (−1.58) | 0.380 | (1.06) | 0.587 ** | (2.72) | 0.586 *** | (3.89) |
| id20 | −23.99 *** | (−38.51) | −3.749 ** | (−3.22) | 0.292 | (1.19) | 0.0926 | (0.40) |
| id21 | −21.79 *** | (−35.89) | −2.577 ** | (−2.75) | 0.322 | (1.34) | 0.458 ** | (2.61) |
| id22 | −2.642 *** | (−3.94) | −0.229 | (−0.63) | 0.496 ** | (3.00) | 0.538 ** | (2.89) |
| id23 | −0.792 | (−1.10) | 0.00108 | (0.00) | −0.474 | (−1.61) | −0.409 | (−1.71) |
| id24 | −23.37 *** | (−34.65) | −22.27 *** | (−40.12) | −0.329 | (−1.36) | −0.103 | (−0.32) |
| id25 | −21.06 *** | (−30.30) | −20.67 *** | (−32.16) | −0.882 *** | (−3.45) | −0.731 * | (−2.56) |
| id26 | −2.739 *** | (−3.82) | −1.000 ** | (−2.61) | −0.440 | (−1.77) | −0.667 *** | (−3.68) |
| id27 | −23.10 *** | (−41.71) | −22.25 *** | (−45.16) | −0.0464 | (−0.15) | 0.0656 | (0.22) |
| id28 | −17.86 *** | (−17.18) | −17.85 *** | (−18.78) | 0.0432 | (0.13) | 0.309 | (1.41) |
| id29 | −1.136 | (−1.59) | −0.872 * | (−2.33) | 0.591 *** | (3.74) | 0.625 *** | (4.09) |
| id30 | −20.56 *** | (−29.51) | −19.84 *** | (−31.89) | −0.223 | (−0.53) | −0.102 | (−0.25) |
| id31 | −0.407 | (−0.91) | −0.633 * | (−2.03) | −1.148 *** | (−3.69) | −0.949 ** | (−3.21) |
| id32 | −3.255 ** | (−3.24) | −2.923 * | (−2.38) | −0.110 | (−0.36) | 0.143 | (0.49) |
| id33 | −19.05 *** | (−19.59) | −19.50 *** | (−21.06) | −0.177 | (−0.37) | −0.153 | (−0.32) |
| id34 | −2.431 ** | (−3.26) | −1.547 *** | (−3.51) | −0.00573 | (−0.02) | −0.0439 | (−0.14) |
| id35 | −3.832 *** | (−3.98) | −1.041 * | (−2.18) | −0.607 *** | (−3.58) | −0.552 ** | (−3.14) |
| id36 | −4.135 *** | (−4.74) | −2.412 *** | (−4.36) | −0.285 | (−1.37) | −0.343 | (−1.82) |
| id37 | −38.70 *** | (−32.02) | −1.232 | (−1.72) | −0.480 | (−1.64) | −0.218 | (−0.75) |
| id38 | −20.26 *** | (−29.14) | −1.364 * | (−2.11) | −1.484 *** | (−5.08) | −1.121 *** | (−4.31) |
| id39 | −38.61 *** | (−41.67) | 10.89 *** | (14.67) | 11.65 *** | (21.85) | 11.77 *** | (21.58) |
| id40 | −1.326 | (−1.57) | −0.416 | (−0.81) | −0.278 | (−1.10) | 0.0791 | (0.30) |
| id41 | −2.443 ** | (−3.19) | −1.020 * | (−2.02) | 0.180 | (0.82) | 0.155 | (0.61) |
| id42 | −0.467 | (−0.57) | 0.442 | (0.93) | 0.607 | (1.63) | 0.398 | (1.45) |
| id43 | −2.164 * | (−2.45) | 0.219 | (0.39) | −0.0359 | (−0.13) | 0.0900 | (0.35) |
| id44 | −2.465 *** | (−3.52) | −0.156 | (−0.35) | 0.336 | (1.33) | 0.468 | (1.85) |
| id45 | −2.110 ** | (−3.26) | −1.315 ** | (−3.20) | 0.105 | (0.64) | 0.282 | (1.18) |
| id46 | 0.132 | (0.14) | −0.480 | (−1.01) | −0.312 ** | (−2.77) | −0.235 | (−1.65) |
| id47 | −2.957 *** | (−5.23) | −0.975 | (−1.32) | −0.853 *** | (−4.77) | −0.656 *** | (−4.01) |
| id48 | 0.486 | (0.78) | 1.381 *** | (3.72) | 0.829 *** | (5.36) | 0.787 *** | (4.63) |
| id49 | −25.28 *** | (−34.70) | −1.575 ** | (−3.27) | −0.568 ** | (−2.60) | −0.353 | (−1.67) |
| id50 | −2.556 *** | (−3.94) | −1.907 *** | (−3.87) | −0.413 * | (−2.43) | −0.331 | (−1.80) |
| id51 | −20.62 *** | (−20.07) | −20.01 *** | (−27.73) | 1.123 *** | (3.56) | 1.140 *** | (3.69) |
| id52 | −21.73 *** | (−16.76) | −20.90 *** | (−19.35) | −0.354 | (−1.10) | −0.952 *** | (−3.75) |
| id53 | −20.65 *** | (−21.34) | −20.00 *** | (−31.74) | −0.133 | (−0.72) | 0.200 | (1.01) |
| id54 | −4.881 *** | (−5.39) | −1.082 ** | (−2.66) | −0.686 *** | (−3.36) | −0.639 ** | (−3.00) |
| id55 | −4.290 *** | (−4.44) | −1.731 *** | (−3.90) | 0.472 | (1.80) | 0.476 | (1.37) |
| id56 | −2.462 *** | (−3.72) | −0.0866 | (−0.20) | 0.119 | (0.40) | 0.377 | (0.77) |
| id57 | −21.96 *** | (−22.99) | −0.700 | (−1.51) | 0.110 | (0.36) | 0.719 * | (2.27) |
| id58 | −1.877 | (−1.66) | −0.692 | (−0.83) | −0.344 | (−0.77) | 0.0660 | (0.16) |
| id59 | −38.77 *** | (−43.77) | −0.0709 | (−0.10) | −0.726 * | (−2.11) | −0.587 | (−1.68) |
| id60 | −3.117 ** | (−2.65) | −3.815 *** | (−4.00) | −0.711 * | (−2.07) | −0.565 | (−1.69) |
| id61 | 0.821 | (0.83) | 1.078 | (1.78) | −1.288 ** | (−2.87) | −1.076 * | (−2.42) |
| id62 | −0.465 | (−0.61) | 0.546 | (1.46) | −0.670 | (−1.58) | −0.473 | (−1.15) |
| id63 | −21.36 *** | (−28.39) | −20.68 *** | (−33.84) | 1.393 *** | (10.47) | 1.513 *** | (10.22) |
| id64 | −2.529 | (−1.39) | −1.707 | (−1.18) | 1.334 *** | (7.83) | 1.339 *** | (6.13) |
| id65 | −21.50 *** | (−34.71) | −20.76 *** | (−42.35) | −1.923 *** | (−5.19) | −1.288 *** | (−5.26) |
| id66 | −1.389 | (−1.49) | −1.510 *** | (−3.74) | 0.504 ** | (2.64) | 0.971 *** | (5.69) |
| id67 | −25.65 *** | (−34.83) | −3.400 *** | (−3.49) | −0.371 * | (−1.98) | −0.304 | (−1.70) |
| id68 | −19.09 *** | (−27.85) | −18.66 *** | (−35.37) | −1.286 ** | (−2.97) | −1.660 *** | (−7.16) |
| id69 | −24.40 *** | (−27.41) | −21.86 *** | (−33.37) | −0.589 ** | (−2.94) | −0.625 * | (−2.48) |
| id70 | −21.29 *** | (−22.38) | −3.693 *** | (−4.50) | −1.489 *** | (−3.60) | −1.501 *** | (−6.37) |
| id71 | −31.22 *** | (−36.99) | −29.36 *** | (−53.20) | 0.587 *** | (3.95) | 1.212 *** | (7.82) |
| id72 | −5.534 *** | (−5.68) | −1.058 | (−1.96) | −0.516 | (−1.41) | −0.643 | (−1.84) |
| id73 | −4.323 *** | (−4.06) | −2.863 *** | (−5.25) | −1.527 *** | (−6.22) | −1.523 *** | (−5.97) |
| id74 | −30.92 *** | (−35.40) | −29.09 *** | (−51.97) | 0.299 | (1.03) | 0.765 ** | (3.02) |
| id75 | −2.868 *** | (−3.45) | −1.677 *** | (−4.49) | −0.267 | (−0.90) | −0.0911 | (−0.31) |
| id76 | −21.21 *** | (−22.83) | −21.18 *** | (−31.54) | −1.646 *** | (−3.80) | −1.903 *** | (−4.47) |
| id77 | −19.83 *** | (−15.23) | −19.23 *** | (−21.15) | 0.835 ** | (2.84) | 0.729 ** | (2.71) |
| id78 | −24.02 *** | (−33.62) | −3.260 *** | (−3.47) | −2.855 *** | (−10.67) | −2.759 *** | (−8.49) |
| id79 | −3.449 ** | (−2.81) | −0.618 | (−1.43) | −0.232 | (−1.23) | −0.110 | (−0.57) |
| id80 | −21.71 *** | (−15.28) | −20.69 *** | (−20.65) | −0.0149 | (−0.05) | 0.0509 | (0.16) |
| id81 | −33.98 *** | (−60.17) | −1.132 * | (−2.44) | −0.341 * | (−2.34) | −0.336 * | (−2.41) |
| id82 | −1.391 | (−1.64) | −0.541 | (−0.76) | −0.312 | (−1.38) | −0.326 | (−1.55) |
| id83 | −1.516 ** | (−2.58) | 0.157 | (0.53) | −0.123 | (−0.64) | −0.242 | (−1.37) |
| id84 | −23.95 *** | (−25.81) | −1.866 * | (−2.04) | −0.750 ** | (−3.19) | −0.855 *** | (−4.50) |
| id85 | −22.55 *** | (−23.68) | −3.844 *** | (−4.49) | −1.430 *** | (−5.83) | −1.318 *** | (−5.43) |
| id86 | −29.06 *** | (−33.35) | −2.036 *** | (−3.87) | −1.272 *** | (−3.40) | −1.111 ** | (−2.99) |
| id87 | −1.851 * | (−2.47) | 1.034 ** | (2.59) | 0.196 | (1.03) | 0.425 * | (2.03) |
| id88 | −20.13 *** | (−31.73) | −19.83 *** | (−38.92) | −0.208 | (−1.54) | −0.165 | (−1.04) |
| id89 | −25.42 *** | (−13.08) | −23.44 *** | (−19.03) | 1.100 * | (2.18) | 1.135 * | (2.49) |
| id90 | −4.008 *** | (−4.24) | −0.841 | (−1.18) | −0.972 *** | (−4.48) | −0.982 ** | (−3.17) |
| id91 | −19.51 *** | (−26.36) | −19.33 *** | (−31.18) | 0.676 *** | (5.52) | 0.818 *** | (4.54) |
| id92 | −26.04 *** | (−14.88) | −24.17 *** | (−22.56) | 0.848 * | (2.54) | 0.663 * | (2.00) |
| id93 | −23.79 *** | (−23.97) | −22.53 *** | (−27.69) | −0.290 | (−1.41) | −0.300 | (−1.26) |
| id94 | −2.684 *** | (−3.53) | −1.034 ** | (−3.05) | −0.157 | (−0.99) | 0.0139 | (0.06) |
| id95 | −24.75 *** | (−12.35) | −22.86 *** | (−19.16) | −0.503 | (−0.78) | −0.670 | (−1.26) |
| id96 | −22.66 *** | (−28.80) | −21.77 *** | (−35.98) | 1.374 *** | (7.81) | 1.343 *** | (6.60) |
| id97 | −20.83 *** | (−29.33) | −20.42 *** | (−32.43) | −0.464 * | (−2.30) | −0.282 | (−1.24) |
| id98 | −18.59 *** | (−24.60) | −18.56 *** | (−27.54) | −1.405 *** | (−4.98) | −0.887 * | (−2.38) |
| id99 | −18.19 *** | (−27.16) | −18.21 *** | (−34.28) | −1.774 *** | (−5.28) | −1.369 *** | (−4.34) |
| id100 | −4.226 ** | (−3.13) | −21.52 *** | (−26.23) | 0.802 * | (2.30) | 0.824 * | (2.37) |
| id101 | −22.60 *** | (−13.70) | −21.29 *** | (−22.32) | 0.955 * | (2.21) | 0.814 | (1.96) |
| id102 | −24.81 *** | (−34.96) | −23.71 *** | (−42.27) | 0.0308 | (0.14) | −0.0294 | (−0.12) |
| id103 | −17.82 *** | (−22.56) | −17.67 *** | (−27.25) | 0.542 * | (2.27) | 0.606 *** | (3.70) |
| id104 | −20.05 *** | (−32.16) | −19.72 *** | (−37.06) | 0.131 | (0.73) | 0.262 * | (1.98) |
| id105 | −3.426 *** | (−4.29) | −0.430 | (−0.94) | −0.464 | (−0.90) | −0.925 * | (−2.01) |
| id106 | −24.96 *** | (−26.36) | −23.73 *** | (−35.39) | 0.317 | (1.84) | 0.252 | (1.33) |
| id107 | −21.01 *** | (−29.69) | −20.53 *** | (−36.62) | 0.0144 | (0.09) | 0.147 | (0.63) |
| id108 | −23.28 *** | (−27.56) | −2.647 *** | (−5.35) | −0.532 * | (−2.40) | −0.635 *** | (−3.35) |
| id109 | −20.89 *** | (−19.32) | −20.46 *** | (−21.38) | −0.347 ** | (−2.97) | −0.782 *** | (−5.90) |
| id110 | −20.70 *** | (−17.96) | −20.72 *** | (−20.15) | −1.801 *** | (−5.72) | −1.044 *** | (−5.99) |
| id111 | −3.405 *** | (−4.19) | −1.278 *** | (−3.61) | 0.173 | (1.27) | 0.198 | (1.40) |
| id112 | −19.65 *** | (−24.23) | −19.29 *** | (−25.89) | −1.453 *** | (−8.75) | −0.831 *** | (−7.19) |
| id113 | −29.63 *** | (−42.75) | −2.998 *** | (−3.76) | −1.703 *** | (−6.85) | −1.296 *** | (−6.74) |
| id114 | −23.43 *** | (−20.14) | −22.26 *** | (−23.54) | 0.637 ** | (2.80) | 0.537 ** | (3.22) |
| id115 | −22.18 *** | (−29.89) | −21.49 *** | (−33.89) | −0.0179 | (−0.14) | −0.109 | (−0.69) |
| id116 | −21.78 *** | (−24.48) | −3.753 *** | (−7.41) | −1.349 *** | (−4.54) | −1.135 ** | (−2.79) |
| id117 | −20.71 *** | (−26.29) | −20.08 *** | (−32.28) | −0.156 | (−1.35) | −0.273 * | (−2.17) |
| id118 | −18.97 *** | (−27.81) | −0.705 | (−0.91) | 0.116 | (0.88) | 0.00337 | (0.02) |
| id119 | −20.31 *** | (−30.19) | −19.89 *** | (−36.33) | −0.145 | (−0.79) | −0.143 | (−0.92) |
| id120 | −27.27 *** | (−33.83) | −25.62 *** | (−41.44) | −0.0170 | (−0.06) | 0.133 | (0.42) |
| id121 | −28.62 *** | (−36.09) | −0.687 | (−1.76) | 0.239 | (1.20) | 0.387 * | (2.01) |
| id122 | −21.69 *** | (−15.37) | −2.515 * | (−2.40) | 0.623 * | (2.36) | 0.653 * | (2.48) |
| id123 | −23.18 *** | (−12.54) | −3.892 *** | (−4.57) | 0.698 | (1.51) | 0.886 * | (2.13) |
| id124 | −4.268 * | (−2.34) | −2.612 * | (−2.50) | 0.698 ** | (2.66) | 0.361 | (1.22) |
| id125 | −3.828 * | (−2.35) | −22.52 *** | (−19.24) | 0.296 | (0.67) | 0.619 | (1.69) |
| id126 | −2.023 | (−1.24) | −2.183 * | (−2.54) | 0.576 | (1.84) | 0.539 | (1.73) |
| id127 | −22.03 *** | (−12.96) | −20.58 *** | (−19.98) | 1.158 *** | (4.23) | 1.010 *** | (3.82) |
| id128 | −20.90 *** | (−14.97) | −20.01 *** | (−22.10) | 0.762 ** | (2.84) | 0.618 * | (2.18) |
| id129 | −1.540 * | (−2.37) | 0.776 * | (2.07) | 0.0280 | (0.17) | 0.165 | (0.76) |
| id130 | −24.70 *** | (−30.90) | −23.31 *** | (−35.06) | −1.578 *** | (−5.17) | −1.635 *** | (−4.61) |
| id131 | −1.659 * | (−2.10) | −0.403 | (−1.01) | −0.980 *** | (−3.71) | −0.794 *** | (−4.13) |
| id132 | −19.37 *** | (−26.07) | −18.83 *** | (−32.28) | −0.863 *** | (−3.40) | −0.435 | (−1.85) |
| id133 | −26.03 *** | (−21.40) | −2.904 *** | (−8.09) | −0.622 *** | (−4.18) | −0.691 *** | (−4.38) |
| id134 | −31.36 *** | (−34.83) | −2.618 *** | (−5.23) | 0.488 * | (2.40) | 1.176 *** | (6.93) |
| id135 | −23.37 *** | (−24.64) | −22.02 *** | (−35.69) | 0.930 *** | (4.58) | 1.350 *** | (7.18) |
| id136 | 3.358 *** | (3.79) | 4.212 *** | (6.07) | 2.661 *** | (8.38) | 2.709 *** | (11.14) |
| id137 | −23.49 *** | (−24.93) | −2.508 ** | (−2.65) | 0.0440 | (0.26) | 0.804 *** | (4.16) |
| id138 | −19.02 *** | (−29.14) | −18.74 *** | (−37.30) | −0.827 *** | (−3.86) | −0.890 *** | (−3.54) |
| id139 | −4.105 *** | (−4.05) | −1.187 * | (−2.54) | −0.922 ** | (−3.06) | −0.677 | (−1.61) |
| id140 | −2.970 ** | (−3.06) | −0.615 | (−1.10) | −1.276 *** | (−3.79) | −1.035 *** | (−3.61) |
| id141 | −24.65 *** | (−33.01) | −23.40 *** | (−44.51) | −1.071 *** | (−4.15) | −1.100 *** | (−4.52) |
| id142 | −37.04 *** | (−41.13) | −0.873 | (−1.69) | 0.0500 | (0.22) | 0.175 | (0.76) |
| id143 | −37.41 *** | (−40.91) | −0.397 | (−0.82) | −0.368 | (−1.15) | −0.157 | (−0.52) |
| id144 | −0.585 | (−0.65) | 0.551 | (1.04) | 0.0261 | (0.09) | 0.167 | (0.58) |
| id145 | −2.485 * | (−2.36) | −1.273 * | (−2.44) | −0.750 | (−1.56) | −0.631 | (−1.32) |
| id146 | −22.93 *** | (−26.06) | −1.250 ** | (−2.87) | −1.130 ** | (−2.83) | −0.758 * | (−1.97) |
| id147 | −2.851 *** | (−3.75) | −0.0796 | (−0.17) | −1.021 ** | (−2.90) | −0.896 ** | (−2.59) |
| id148 | −3.737 *** | (−4.05) | 1.617 *** | (3.83) | −0.0483 | (−0.14) | −0.0520 | (−0.16) |
| id149 | −3.202 *** | (−3.53) | −1.184 * | (−2.01) | −0.554 ** | (−2.59) | −0.343 | (−1.54) |
| id150 | −3.616 ** | (−3.24) | −0.905 * | (−2.24) | −1.167 *** | (−4.30) | −0.905 *** | (−3.46) |
| id151 | −0.362 | (−0.51) | −1.167 * | (−2.07) | −1.654 *** | (−5.15) | −1.677 *** | (−4.32) |
| id152 | −32.89 *** | (−33.69) | −3.751 *** | (−6.31) | −1.421 *** | (−3.68) | −1.382 *** | (−4.40) |
| id153 | −1.598 * | (−2.14) | −0.169 | (−0.46) | −2.936 *** | (−5.34) | −3.067 *** | (−5.01) |
| id154 | −21.84 *** | (−20.17) | −2.716 *** | (−3.85) | −1.703 *** | (−4.38) | −1.483 *** | (−4.26) |
| id155 | −4.238 *** | (−3.67) | −2.441 * | (−2.46) | −0.759 ** | (−3.12) | −0.814 ** | (−2.63) |
| id156 | −43.11 *** | (−52.10) | −1.456 ** | (−3.27) | −0.590 ** | (−2.64) | −0.429 * | (−2.03) |
| id157 | −1.868 * | (−2.19) | 0.337 | (0.62) | −0.753 ** | (−2.82) | −0.502 * | (−2.35) |
| id158 | −19.28 *** | (−28.14) | −18.96 *** | (−35.46) | 0.678 *** | (4.39) | 0.744 *** | (3.87) |
| id159 | −19.26 *** | (−28.40) | −19.02 *** | (−33.84) | 0.827 *** | (4.09) | 0.715 ** | (3.07) |
| id160 | −3.790 *** | (−4.60) | 0.550 | (1.25) | 0.148 | (0.72) | 0.337 | (1.90) |
| id161 | −22.11 *** | (−30.73) | −21.31 *** | (−34.53) | 0.608 *** | (4.58) | 0.494 *** | (3.29) |
| id162 | −20.15 *** | (−18.45) | −19.53 *** | (−23.73) | 0.431 | (1.86) | 0.176 | (0.52) |
| id163 | −22.31 *** | (−22.00) | −21.43 *** | (−29.98) | 1.844 *** | (9.58) | 1.656 *** | (8.12) |
| id164 | −2.923 ** | (−2.69) | −2.557 *** | (−3.68) | −0.245 | (−1.66) | −0.301 | (−1.85) |
| id165 | −20.82 *** | (−28.93) | −20.41 *** | (−32.12) | 1.341 *** | (11.50) | 1.447 *** | (10.42) |
| id166 | −25.44 *** | (−20.51) | −2.439 *** | (−4.42) | −0.158 | (−0.49) | −0.0534 | (−0.15) |
| id167 | −2.696 * | (−2.28) | −4.124 *** | (−4.21) | 1.119 *** | (3.43) | 1.089 *** | (3.53) |
| id168 | −5.731 *** | (−6.49) | −1.940 *** | (−3.42) | 0.0447 | (0.23) | 0.0115 | (0.06) |
| id169 | −26.04 *** | (−24.96) | −24.71 *** | (−35.44) | 0.473 * | (2.06) | 0.422 | (1.82) |
| id170 | −15.15 *** | (−15.82) | −15.11 *** | (−19.48) | −17.48 *** | (−24.79) | −0.684 | (−0.99) |
| id171 | −3.650 *** | (−3.49) | −1.497 ** | (−2.86) | −0.344 | (−1.62) | −0.313 | (−1.47) |
| id172 | −3.659 *** | (−3.93) | −1.951 *** | (−4.53) | −0.427 | (−1.84) | −0.367 | (−1.34) |
| id173 | −3.036 ** | (−2.98) | −3.500 *** | (−4.94) | −0.874 *** | (−3.64) | −0.888 *** | (−3.41) |
| id174 | 1.453 | (1.39) | 0.361 | (0.38) | −1.484 *** | (−6.68) | −1.288 *** | (−4.64) |
| id175 | −0.688 | (−0.99) | 1.615 *** | (3.66) | 0.114 | (0.57) | 0.333 | (1.64) |
| id176 | −1.666 | (−1.82) | −0.313 | (−0.81) | 0.530 | (0.98) | −0.0614 | (−0.13) |
| id177 | −2.576 ** | (−3.13) | −1.675 *** | (−3.66) | −0.245 | (−1.10) | 0.187 | (0.75) |
| id178 | −0.823 | (−0.51) | 0.510 | (1.04) | 0.213 | (1.00) | 0.0436 | (0.18) |
| id179 | −19.54 *** | (−27.56) | −1.071 | (−1.13) | −1.386 *** | (−4.24) | −1.021 | (−1.91) |
| id180 | −4.457 *** | (−3.85) | −2.934 *** | (−3.90) | −0.402 * | (−2.17) | −0.277 | (−1.44) |
| id181 | −1.850 * | (−2.18) | −0.909 * | (−2.21) | −0.573 | (−1.78) | −0.354 | (−1.41) |
| id182 | −4.754 ** | (−3.03) | −2.082 ** | (−2.75) | 0.387 | (1.35) | 0.409 | (1.60) |
| log-likelihood | −952.2391 | | −1519.954 | | −3479.969 | | −3488.38 | |
| AIC | 2292.478 | | 3427.908 | | 7347.937 | | 7364.76 | |
| BIC | 3261.657 | | 4397.086 | | 8317.116 | | 8333.939 | |
| Observation | 1092 | | 1092 | | 1092 | | 1092 | |
*** p < 0.01, ** p < 0.05, * p < 0.1.
Table A5. Driving risk scores for four near-miss events after winsorizing and Min-Max scaling on regression coefficients.
| Variable | Overspeed | Highspeedbrake | Harshacceleration | Harshdeceleration |
|---|---|---|---|---|
| id1 | 4.824741 | 4.344986 | 2.622834 | 2.52286 |
| id2 | 1.242371 | 4.033808 | 3.753797 | 3.75 |
| id3 | 2.476298 | 1.628204 | 4.413078 | 4.113936 |
| id4 | 2.578824 | 1.749502 | 4.312131 | 4.315106 |
| id5 | 1.133814 | 3.574272 | 1.557084 | 1.542612 |
| id6 | 4.646467 | 4.258833 | 3.576023 | 3.377835 |
| id7 | 4.434299 | 4.244682 | 3.208862 | 3.286394 |
| id8 | 2.244711 | 3.976736 | 2.4531 | 2.636247 |
| id9 | 4.685306 | 4.334427 | 2.398606 | 2.423189 |
| id10 | 4.449618 | 4.107521 | 2.314633 | 2.288771 |
| id11 | 4.580368 | 4.164127 | 2.478113 | 2.27414 |
| id12 | 4.662871 | 4.224932 | 2.692603 | 2.612564 |
| id13 | 4.542011 | 4.219333 | 2.499553 | 2.404901 |
| id14 | 4.441416 | 4.413722 | 2.060925 | 2.1891 |
| id15 | 4.673486 | 4.370957 | 2.542969 | 2.547549 |
| id16 | 2.050515 | 1.186551 | 2.864928 | 2.924287 |
| id17 | 2.12168 | 4.018102 | 2.444167 | 2.612747 |
| id18 | 2.218175 | 4.2198 | 1.618724 | 1.364301 |
| id19 | 4.704364 | 4.404081 | 3.147222 | 3.058705 |
| id20 | 1.835814 | 3.761974 | 2.883688 | 2.607535 |
| id21 | 2.124092 | 3.944234 | 2.910488 | 2.941661 |
| id22 | 4.506067 | 4.309374 | 3.065928 | 3.014813 |
| id23 | 4.729211 | 4.345159 | 2.199393 | 2.148866 |
| id24 | 1.923866 | 1.063697 | 2.328926 | 2.428676 |
| id25 | 2.207319 | 1.317181 | 1.834912 | 1.854426 |
| id26 | 4.494367 | 4.189475 | 2.229766 | 1.912948 |
| id27 | 1.957639 | 1.080804 | 2.581383 | 2.582846 |
| id28 | 2.621041 | 1.695073 | 2.661426 | 2.805413 |
| id29 | 4.687598 | 4.20938 | 3.150795 | 3.094367 |
| id30 | 2.274866 | 1.419818 | 2.42362 | 2.42959 |
| id31 | 4.77565 | 4.246703 | 1.597284 | 1.655084 |
| id32 | 4.432128 | 3.890427 | 2.524567 | 2.653621 |
| id33 | 2.476298 | 1.503794 | 2.464713 | 2.382955 |
| id34 | 4.531518 | 4.10441 | 2.617715 | 2.482718 |
| id35 | 4.362531 | 4.183099 | 2.080579 | 2.018105 |
| id36 | 4.325984 | 3.970049 | 2.368233 | 2.209217 |
| id37 | 0.021711 | 4.153396 | 2.194033 | 2.323519 |
| id38 | 2.317082 | 4.132869 | 1.297123 | 1.497805 |
| id39 | 0.024124 | 5 | 5 | 5 |
| id40 | 4.664922 | 4.280294 | 2.374486 | 2.59519 |
| id41 | 4.53007 | 4.186365 | 2.783634 | 2.664594 |
| id42 | 4.768412 | 4.413722 | 3.165088 | 2.886796 |
| id43 | 4.563723 | 4.379043 | 2.590763 | 2.605157 |
| id44 | 4.527417 | 4.320727 | 2.922994 | 2.950805 |
| id45 | 4.570236 | 4.140489 | 2.716634 | 2.780724 |
| id46 | 4.840663 | 4.270341 | 2.344113 | 2.307974 |
| id47 | 4.468072 | 4.193363 | 1.860818 | 1.923007 |
| id48 | 4.883362 | 4.559747 | 3.363409 | 3.242502 |
| id49 | 1.672979 | 4.100056 | 2.115419 | 2.200073 |
| id50 | 4.51644 | 4.048426 | 2.253886 | 2.22019 |
| id51 | 2.268835 | 1.384051 | 3.62605 | 3.565289 |
| id52 | 2.192845 | 1.124347 | 2.306593 | 1.652341 |
| id53 | 2.260391 | 1.348283 | 2.50402 | 2.705743 |
| id54 | 4.236002 | 4.176723 | 2.010005 | 1.938552 |
| id55 | 4.307288 | 4.075796 | 3.044488 | 2.95812 |
| id56 | 4.527778 | 4.331519 | 2.729141 | 2.867593 |
| id57 | 2.18802 | 4.236128 | 2.721101 | 3.180322 |
| id58 | 4.59834 | 4.237372 | 2.315526 | 2.583211 |
| id59 | 0 | 4.333961 | 1.974272 | 1.986101 |
| id60 | 4.448773 | 3.752022 | 1.987672 | 2.006218 |
| id61 | 4.923769 | 4.512628 | 1.472217 | 1.538954 |
| id62 | 4.768654 | 4.429895 | 2.024299 | 2.090344 |
| id63 | 2.165103 | 1.309405 | 3.86725 | 3.906364 |
| id64 | 4.519697 | 4.079528 | 3.814544 | 3.747257 |
| id65 | 2.171134 | 1.334287 | 0.904949 | 1.345099 |
| id66 | 4.657202 | 4.110164 | 3.073075 | 3.410753 |
| id67 | 1.641618 | 3.816248 | 2.291406 | 2.244879 |
| id68 | 2.459412 | 1.607987 | 1.474004 | 1.004938 |
| id69 | 1.78636 | 1.116571 | 2.096659 | 1.951353 |
| id70 | 2.194051 | 3.770683 | 1.292657 | 1.150329 |
| id71 | 0.937206 | 0 | 3.147222 | 3.631127 |
| id72 | 4.157238 | 4.180455 | 2.161872 | 1.934894 |
| id73 | 4.303307 | 3.899757 | 1.25871 | 1.130212 |
| id74 | 0.979422 | 0 | 2.889941 | 3.222385 |
| id75 | 4.478807 | 4.084194 | 2.384313 | 2.439557 |
| id76 | 2.198876 | 0.898855 | 1.152403 | 0.782736 |
| id77 | 2.366536 | 1.433814 | 3.368769 | 3.189466 |
| id78 | 1.845464 | 3.838019 | 0.07236 | 0 |
| id79 | 4.408728 | 4.24888 | 2.41558 | 2.422275 |
| id80 | 2.130123 | 1.197437 | 2.609523 | 2.569404 |
| id81 | 0.595856 | 4.168947 | 2.318206 | 2.215618 |
| id82 | 4.656961 | 4.260855 | 2.344113 | 2.224762 |
| id83 | 4.641884 | 4.369402 | 2.512953 | 2.301573 |
| id84 | 1.823752 | 4.054802 | 1.952832 | 1.741039 |
| id85 | 2.061371 | 3.747356 | 1.345364 | 1.317666 |
| id86 | 1.21101 | 4.028365 | 1.486511 | 1.50695 |
| id87 | 4.601476 | 4.505785 | 2.797927 | 2.911485 |
| id88 | 2.341206 | 1.475802 | 2.43702 | 2.371982 |
| id89 | 1.641618 | 0.869308 | 3.605503 | 3.560717 |
| id90 | 4.341302 | 4.214201 | 1.754511 | 1.624909 |
| id91 | 2.40634 | 1.387161 | 3.226729 | 3.270849 |
| id92 | 1.571659 | 0.774446 | 3.380382 | 3.129115 |
| id93 | 1.873206 | 1.03415 | 2.363766 | 2.248537 |
| id94 | 4.501001 | 4.184188 | 2.48258 | 2.535571 |
| id95 | 1.736907 | 0.925292 | 2.173486 | 1.910205 |
| id96 | 1.999855 | 1.163225 | 3.850277 | 3.750914 |
| id97 | 2.23868 | 1.320291 | 2.208326 | 2.264996 |
| id98 | 2.526958 | 1.600211 | 1.367697 | 1.711778 |
| id99 | 2.583649 | 1.687298 | 1.038056 | 1.271031 |
| id100 | 4.315007 | 1.200547 | 3.339289 | 3.276335 |
| id101 | 2.014329 | 1.197437 | 3.475969 | 3.267191 |
| id102 | 1.72967 | 0.847537 | 2.650348 | 2.495977 |
| id103 | 2.619835 | 1.71218 | 3.107022 | 3.076993 |
| id104 | 2.34 | 1.460251 | 2.739861 | 2.762436 |
| id105 | 4.411502 | 4.278116 | 2.208326 | 1.67703 |
| id106 | 1.711577 | 0.852202 | 2.906021 | 2.753292 |
| id107 | 2.215762 | 1.324956 | 2.635698 | 2.657279 |
| id108 | 1.917835 | 3.933348 | 2.147579 | 1.942209 |
| id109 | 2.236268 | 1.290744 | 2.312846 | 1.807791 |
| id110 | 2.288134 | 1.175666 | 1.013936 | 1.568215 |
| id111 | 4.414035 | 4.146398 | 2.777381 | 2.703914 |
| id112 | 2.390659 | 1.522456 | 1.324817 | 1.762985 |
| id113 | 1.147082 | 3.878919 | 1.101483 | 1.337783 |
| id114 | 1.90336 | 1.068363 | 3.191889 | 3.013899 |
| id115 | 2.062577 | 1.197437 | 2.606843 | 2.423189 |
| id116 | 2.119268 | 3.761352 | 1.417724 | 1.485004 |
| id117 | 2.253154 | 1.382496 | 2.483473 | 2.273226 |
| id118 | 2.469061 | 4.235351 | 2.726461 | 2.525942 |
| id119 | 2.30502 | 1.421373 | 2.4933 | 2.392099 |
| id120 | 1.435361 | 0.55051 | 2.607647 | 2.644477 |
| id121 | 1.255639 | 4.23815 | 2.836341 | 2.876737 |
| id122 | 2.133742 | 3.953875 | 3.179382 | 3.119971 |
| id123 | 1.987793 | 3.739736 | 3.246382 | 3.333029 |
| id124 | 4.309941 | 3.938791 | 3.246382 | 2.852963 |
| id125 | 4.363014 | 1.066808 | 2.887261 | 3.088881 |
| id126 | 4.58073 | 4.005505 | 3.137395 | 3.015728 |
| id127 | 2.083082 | 1.265862 | 3.657316 | 3.446416 |
| id128 | 2.225412 | 1.262752 | 3.303555 | 3.087966 |
| id129 | 4.638989 | 4.465819 | 2.647847 | 2.673738 |
| id130 | 1.741732 | 0.892635 | 1.21315 | 1.027798 |
| id131 | 4.624635 | 4.282315 | 1.747365 | 1.796818 |
| id132 | 2.42202 | 1.567554 | 1.851885 | 2.125091 |
| id133 | 1.568041 | 3.893381 | 2.067179 | 1.891002 |
| id134 | 0.916701 | 3.937858 | 3.058781 | 3.598208 |
| id135 | 1.906979 | 1.087024 | 3.453636 | 3.757315 |
| id136 | 5 | 5 | 5 | 5 |
| id137 | 1.893711 | 3.954964 | 2.66214 | 3.258047 |
| id138 | 2.485948 | 1.635979 | 1.884045 | 1.709034 |
| id139 | 4.329602 | 4.160394 | 1.799178 | 1.903804 |
| id140 | 4.466504 | 4.249347 | 1.482937 | 1.576445 |
| id141 | 1.759824 | 0.892635 | 1.666071 | 1.517008 |
| id142 | 0.219526 | 4.209225 | 2.6675 | 2.682882 |
| id143 | 0.171278 | 4.283248 | 2.294086 | 2.379298 |
| id144 | 4.754179 | 4.430673 | 2.64615 | 2.675567 |
| id145 | 4.525004 | 4.14702 | 1.952832 | 1.945867 |
| id146 | 1.947989 | 4.150597 | 1.613364 | 1.829737 |
| id147 | 4.480858 | 4.332608 | 1.710738 | 1.703548 |
| id148 | 4.37399 | 4.596448 | 2.579686 | 2.475311 |
| id149 | 4.438521 | 4.160861 | 2.127926 | 2.209217 |
| id150 | 4.388585 | 4.204249 | 1.580311 | 1.695318 |
| id151 | 4.781077 | 4.163505 | 1.145256 | 0.989393 |
| id152 | 0.724917 | 3.761663 | 1.353404 | 1.259144 |
| id153 | 4.631993 | 4.318705 | 0 | 0 |
| id154 | 2.120474 | 3.922618 | 1.101483 | 1.166789 |
| id155 | 4.31356 | 3.965383 | 1.944792 | 1.77853 |
| id156 | 0 | 4.118562 | 2.095766 | 2.130578 |
| id157 | 4.599426 | 4.397394 | 1.950152 | 2.063826 |
| id158 | 2.434082 | 1.548893 | 3.228515 | 3.203182 |
| id159 | 2.434082 | 1.480468 | 3.361622 | 3.176664 |
| id160 | 4.367597 | 4.430518 | 2.755047 | 2.831017 |
| id161 | 2.072226 | 1.250311 | 3.165982 | 2.974579 |
| id162 | 2.325525 | 1.450921 | 3.007861 | 2.683797 |
| id163 | 2.043278 | 1.16478 | 4.270145 | 4.037125 |
| id164 | 4.472173 | 3.947344 | 2.403966 | 2.247623 |
| id165 | 2.233855 | 1.31096 | 3.820797 | 3.846013 |
| id166 | 1.624732 | 3.965694 | 2.481687 | 2.474031 |
| id167 | 4.499554 | 3.703658 | 3.622476 | 3.518654 |
| id168 | 4.133476 | 4.043294 | 2.662766 | 2.533376 |
| id169 | 1.57769 | 0.707577 | 3.045381 | 2.908742 |
| id170 | 2.962391 | 2.184934 | 0 | 1.897403 |
| id171 | 4.384484 | 4.112186 | 2.315526 | 2.23665 |
| id172 | 4.383398 | 4.041584 | 2.241379 | 2.187271 |
| id173 | 4.458543 | 3.800697 | 1.842058 | 1.710863 |
| id174 | 5 | 4.401126 | 1.297123 | 1.345099 |
| id175 | 4.741756 | 4.596137 | 2.724674 | 2.827359 |
| id176 | 4.623791 | 4.296311 | 3.096302 | 2.466715 |
| id177 | 4.514028 | 4.084505 | 2.403966 | 2.693855 |
| id178 | 4.725472 | 4.424297 | 2.813114 | 2.562729 |
| id179 | 2.40634 | 4.178434 | 1.38467 | 1.589247 |
| id180 | 4.287144 | 3.888716 | 2.263713 | 2.269568 |
| id181 | 4.601597 | 4.203627 | 2.110952 | 2.199159 |
| id182 | 4.2512 | 4.021212 | 2.968555 | 2.896854 |
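The scores in Table A5 are described as the result of winsorizing and Min-Max scaling the regression coefficients. A minimal sketch of that two-step transformation is given below. It rests on several assumptions that are not stated in this appendix: `winsorize` and `min_max_scale` are hypothetical helpers, one value is clipped at each tail (`k=1`), scores are scaled to the range 0 to 5, and the reference vehicle id1, which has no row in Table A4, is given a coefficient of 0. The input is a small subset of the harsh deceleration individual effects from Table A4, chosen to include the extremes.

```python
def winsorize(values, k=1):
    """Clip the k smallest and k largest values to their nearest remaining neighbours."""
    ordered = sorted(values)
    low, high = ordered[k], ordered[-k - 1]
    return [min(max(v, low), high) for v in values]

def min_max_scale(values, top=5.0):
    """Rescale values linearly so that the minimum maps to 0 and the maximum to `top`."""
    low, high = min(values), max(values)
    return [top * (v - low) / (high - low) for v in values]

# Harsh deceleration individual effects (subset of Table A4).
# id1 = 0.0 is an assumption: it is treated as the omitted reference vehicle.
effects = {
    "id1": 0.0,
    "id2": 1.342,
    "id39": 11.77,
    "id78": -2.759,
    "id136": 2.709,
    "id153": -3.067,
}

scaled = min_max_scale(winsorize(list(effects.values()), k=1))
scores = dict(zip(effects, scaled))

for vehicle, score in scores.items():
    print(f"{vehicle}: {score:.4f}")
```

On this subset the sketch gives about 2.52 for id1 and 3.75 for id2, in line with the harsh deceleration column of Table A5. That agreement does not confirm that the same clipping limits were used for the other three events.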

References

  1. Guillen, M.; Nielsen, J.P.; Pérez-Marín, A.M. Near-miss telematics in motor insurance. J. Risk Insur. 2021, 1–21.
  2. Guillen, M.; Nielsen, J.P.; Pérez-Marín, A.M.; Elpidorou, V. Can automobile insurance telematics predict the risk of near-miss events? N. Am. Actuar. J. 2020, 24, 141–152.
  3. Litman, T. Distance-Based Vehicle Insurance Feasibility, Costs and Benefits; Comprehensive Technical Report; Victoria Transport Policy Institute: Victoria, BC, Canada, 2011.
  4. Tselentis, D.I.; Yannis, G.; Vlahogianni, E.I. Innovative insurance schemes: Pay as/how you drive. Transp. Res. Procedia 2016, 14, 362–371.
  5. Paefgen, J.; Staake, T.; Thiesse, F. Evaluation and aggregation of pay-as-you-drive insurance rate factors: A classification analysis approach. Decis. Support Syst. 2013, 56, 192–201.
  6. Tselentis, D.I.; Yannis, G.; Vlahogianni, E.I. Innovative motor insurance schemes: A review of current practices and emerging challenges. Accid. Anal. Prev. 2017, 98, 139–148.
  7. Troncoso, C.; Danezis, G.; Kosta, E.; Balasch, J.; Preneel, B. Pripayd: Privacy-friendly pay-as-you-drive insurance. IEEE Trans. Dependable Secur. Comput. 2010, 8, 742–755.
  8. Pesantez-Narvaez, J.; Guillen, M.; Alcañiz, M. Predicting Motor Insurance Claims Using Telematics Data—XGBoost versus Logistic Regression. Risks 2019, 7, 70.
  9. Guillen, M.; Nielsen, J.P.; Ayuso, M.; Pérez-Marín, A.M. The use of telematics devices to improve automobile insurance rates. Risk Anal. 2019, 39, 662–672.
  10. Sun, S.; Bi, J.; Guillen, M.; Pérez-Marín, A.M. Assessing driving risk using internet of vehicles data: An analysis based on generalized linear models. Sensors 2020, 20, 2712.
  11. De Diego, I.M.; Siordia, O.S.; Crespo, R.; Conde, C.; Cabello, E. Analysis of hands activity for automatic driving risk detection. Transp. Res. Part C Emerg. Technol. 2013, 26, 380–395.
  12. Siordia, O.S.; de Diego, I.M.; Conde, C.; Cabello, E. Subjective traffic safety experts’ knowledge for driving-risk definition. IEEE Trans. Intell. Transp. Syst. 2014, 15, 1823–1834.
  13. Charlton, S.G.; Starkey, N.J.; Perrone, J.A.; Isler, R.B. What’s the risk? A comparison of actual and perceived driving risk. Transp. Res. Part F Traffic Psychol. Behav. 2014, 25, 50–64.
  14. Peng, J.; Shao, Y. Intelligent method for identifying driving risk based on V2V multisource big data. Complexity 2018, 2018.
  15. Wang, J.; Zheng, Y.; Li, X.; Yu, C.; Kodaka, K.; Li, K. Driving risk assessment using near-crash database through data mining of tree-based model. Accid. Anal. Prev. 2015, 84, 54–64.
  16. Yan, L.; Zhang, Y.; He, Y.; Gao, S.; Zhu, D.; Ran, B.; Wu, Q. Hazardous traffic event detection using Markov Blanket and sequential minimal optimization (MB-SMO). Sensors 2016, 16, 1084.
  17. Liao, Y.; Wang, M.; Duan, L.; Chen, F. Cross-regional driver–vehicle interaction design: An interview study on driving risk perceptions, decisions, and ADAS function preferences. IET Intell. Transp. Syst. 2018, 12, 801–808.
  18. Jiang, K.; Yang, D.; Xie, S.; Xiao, Z.; Victorino, A.C.; Charara, A. Real-time estimation and prediction of tire forces using digital map for driving risk assessment. Transp. Res. Part C Emerg. Technol. 2019, 107, 463–489.
  19. Yan, Y.; Dai, Y.; Li, X.; Tang, J.; Guo, Z. Driving risk assessment using driving behavior data under continuous tunnel environment. Traffic Inj. Prev. 2019, 20, 807–812.
  20. Lu, J.; Xie, X.; Zhang, R. Focusing on appraisals: How and why anger and fear influence driving risk perception. J. Saf. Res. 2013, 45, 65–73.
  21. Wang, J.; Huang, H.; Li, Y.; Zhou, H.; Liu, J.; Xu, Q. Driving risk assessment based on naturalistic driving study and driver attitude questionnaire analysis. Accid. Anal. Prev. 2020, 145, 105680.
  22. Handel, P.; Skog, I.; Wahlstrom, J.; Bonawiede, F.; Welch, R.; Ohlsson, J.; Ohlsson, M. Insurance telematics: Opportunities and challenges with the smartphone solution. IEEE Intell. Transp. Syst. Mag. 2014, 6, 57–70.
  23. Joubert, J.W.; De Beer, D.; De Koker, N. Combining accelerometer data and contextual variables to evaluate the risk of driver behaviour. Transp. Res. Part F Traffic Psychol. Behav. 2016, 41, 80–96.
  24. Verbelen, R.; Antonio, K.; Claeskens, G. Unravelling the predictive power of telematics data in car insurance pricing. J. R. Stat. Soc. Ser. C Appl. Stat. 2018, 67, 1275–1304.
  25. Ma, Y.L.; Zhu, X.; Hu, X.; Chiu, Y.C. The use of context-sensitive insurance telematics data in auto insurance rate making. Transp. Res. Part A Policy Pract. 2018, 113, 243–258.
  26. Jiang, Y.; Zhang, J.; Wang, Y.; Wang, W. Drivers’ behavioral responses to driving risk diagnosis and real-time warning information provision on expressways: A smartphone app–based driving experiment. J. Transp. Saf. Secur. 2020, 12, 329–357.
  27. Jin, W.; Deng, Y.; Jiang, H.; Xie, Q.; Shen, W.; Han, W. Latent class analysis of accident risks in usage-based insurance: Evidence from Beijing. Accid. Anal. Prev. 2018, 115, 79–88.
  28. Carfora, M.F.; Martinelli, F.; Mercaldo, F.; Nardone, V.; Orlando, A.; Santone, A.; Vaglini, G. A “pay-how-you-drive” car insurance approach through cluster analysis. Soft Comput. 2019, 23, 2863–2875.
  29. Burton, A.; Parikh, T.; Mascarenhas, S.; Zhang, J.; Voris, J.; Artan, N.S.; Li, W. Driver identification and authentication with active behavior modeling. In Proceedings of the 2016 12th International Conference on Network and Service Management (CNSM), Montreal, QC, Canada, 31 October–4 November 2016; pp. 388–393.
  30. Baecke, P.; Bocca, L. The value of vehicle telematics data in insurance risk selection processes. Decis. Support Syst. 2017, 98, 69–79.
  31. Guelman, L. Gradient boosting trees for auto insurance loss cost modeling and prediction. Expert Syst. Appl. 2012, 39, 3659–3667.
  32. Bian, Y.; Yang, C.; Zhao, J.L.; Liang, L. Good drivers pay less: A study of usage-based vehicle insurance models. Transp. Res. Part A Policy Pract. 2018, 107, 20–34.
  33. Jafarnejad, S.; Castignani, G.; Engel, T. Towards a real-time driver identification mechanism based on driving sensing data. In Proceedings of the 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC), Yokohama, Japan, 16–19 October 2017; pp. 1–7.
  34. Paefgen, J.; Staake, T.; Fleisch, E. Multivariate exposure modeling of accident risk: Insights from Pay-as-you-drive insurance data. Transp. Res. Part A Policy Pract. 2014, 61, 27–40.
  35. Boucher, J.P.; Pérez-Marín, A.M.; Santolino, M. Pay-as-you-drive insurance: The effect of the kilometers on the risk of accident. In Anales del Instituto de Actuarios Españoles; Instituto de Actuarios Españoles: Madrid, Spain, 2013; Volume 19, pp. 135–154.
  36. Sun, S.; Bi, J.; Ding, C. Cleaning and Processing on the Electric Vehicle Telematics Data. In Proceedings of the INFORMS International Conference on Service Science, Nanjing, China, 27–29 June 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 1–6.
  37. Gao, G.; Wüthrich, M.V.; Yang, H. Evaluation of driving risk at different speeds. Insur. Math. Econ. 2019, 88, 108–119.
  38. Gao, G.; Wang, H.; Wüthrich, M.V. Boosting Poisson regression models with telematics car driving data. Mach. Learn. 2021, 1–30.
  39. So, B.; Boucher, J.P.; Valdez, E.A. Synthetic Dataset Generation of Driver Telematics. Risks 2021, 9, 58.
Figure 1. Histogram of frequency distribution of four near-miss events: (a) Over speed; (b) High speed brake; (c) Harsh acceleration; (d) Harsh deceleration.
Figure 2. Technical flow chart.
Figure 3. Partial coefficient estimation results of (a) negative binomial regression; (b) Panel negative binomial regression.
Figure 4. Driving risk ranking of four near-miss events.
Table 1. Descriptive statistics of the summary data set for 182 drivers observed from 3–8 July 2018.

| Variable | Mean | Standard Deviation | Minimum | Median | Maximum | Definition |
|---|---|---|---|---|---|---|
| overspeed | 19.19 | 45.37 | 0 | 0 | 330 | Frequency of driving speed greater than 100 km/h |
| highspeedbrake | 44.23 | 108.3 | 0 | 4 | 942 | Frequency of braking when the driving speed is greater than 90 km/h |
| harshacceleration | 139.0 | 134.7 | 0 | 101 | 899 | Frequency of cases when acceleration is greater than 6 m/s² |
| harshdeceleration | 141.9 | 137.8 | 1 | 105 | 913 | Frequency of cases when acceleration is less than −6 m/s² |
| kilo | 2223 | 1674 | 3.73 | 1832.175 | 7164 | Total driving distance (km) |
| fuel | 621.7 | 470.9 | 10.25 | 487.295 | 2018 | Total fuel consumption (L) |
| brakes | 1588 | 1426 | 6 | 1138.5 | 9243 | Total number of brakes |
| range | 5.201 | 5.021 | 0.027 | 3.399 | 26.78 | Range of driving (geographical units) |
| speed | 36.88 | 16.37 | 0.297 | 36.657 | 67.84 | Mean of speed (km/h) |
| rpm | 1028 | 188.3 | 233.1 | 1009.301 | 1620 | Mean of revolutions per minute (r/min) |
| acceleratorpedalposition | 21.05 | 7.110 | 0.187 | 21.26 | 39.29 | Mean of accelerator pedal position (%) |
| enginefuelrate | 11.52 | 4.464 | 1.868 | 11.203 | 22.01 | Mean of engine fuel rate (%) |

Each variable has 182 observations.
Table 2. Descriptive statistics of a panel data set for 182 drivers observed over six days (total cases 1092).

| Variable | N | Mean | Standard Deviation | Minimum | Median | Maximum |
|---|---|---|---|---|---|---|
| overspeed | 1092 | 3.199 | 14.37 | 0 | 0 | 315 |
| highspeedbrake | 1092 | 7.435 | 21.74 | 0 | 0 | 215 |
| harshacceleration | 1092 | 23.37 | 29.78 | 0 | 14 | 223 |
| harshdeceleration | 1092 | 23.86 | 30.16 | 0 | 13.5 | 233 |
| kilo | 1092 | 372.6 | 373.2 | 0 | 263.24 | 1739 |
| fuel | 1092 | 104.1 | 105.7 | 0 | 72.15 | 565.8 |
| brakes | 1092 | 264.7 | 291.0 | 0 | 178 | 1940 |
| range | 1092 | 2.406 | 2.963 | 0 | 1.243 | 14.07 |
| speed | 1092 | 31.96 | 21.58 | 0 | 31.514 | 77.74 |
| rpm | 1092 | 894.3 | 346.9 | 0 | 973.714 | 1731 |
| acceleratorpedalposition | 1092 | 17.51 | 10.19 | 0 | 18.613 | 45.74 |
| enginefuelrate | 1092 | 9.794 | 5.835 | 0 | 10.018 | 26.18 |
Table 3. Variance inflation factor and correlation of explanatory variables.

| Variable | VIF | Brakes | Range | Speed | rpm | Accelerator Pedal Position |
|---|---|---|---|---|---|---|
| brakes | 3.07 | | | | | |
| range | 2.65 | 0.1213 | | | | |
| speed | 2.30 | 0.0536 | 0.6262 | | | |
| rpm | 2.13 | −0.0254 | −0.0203 | 0.1804 | | |
| accelerator pedal position | 2.03 | 0.0154 | 0.1174 | 0.3458 | 0.7695 | |
| engine fuel rate | 1.04 | 0.1687 | 0.6313 | 0.6490 | 0.1075 | 0.3529 |
Table 4. Model performances of Poisson (POS), zero-inflated Poisson (ZIP), negative binomial (NB) and zero-inflated negative binomial (ZINB) in summary data set.

| Variable | Model | N | Log-Likelihood | df | AIC | BIC |
|---|---|---|---|---|---|---|
| overspeed | POS | 182 | −3518.92 | 7 | 7051.846 | 7074.274 |
| | ZIP | 182 | −2369.82 | 8 | 4755.64 | 4781.272 |
| | NB | 182 | −490.517 | 8 | 997.0338 | 1022.666 |
| | ZINB | 182 | −490.516 | 9 | 999.0315 | 1027.868 |
| highspeedbrake | POS | 182 | −2830.75 | 7 | 5675.498 | 5697.926 |
| | ZIP | 182 | −2667.02 | 8 | 5350.034 | 5375.666 |
| | NB | 182 | −627.422 | 8 | 1270.843 | 1296.476 |
| | ZINB | 182 | −627.422 | 9 | 1272.843 | 1301.68 |
| harshacceleration | POS | 182 | −5857.26 | 7 | 11,728.51 | 11,750.94 |
| | ZIP | 182 | −5857.26 | 8 | 11,730.51 | 11,756.14 |
| | NB | 182 | −1032.81 | 8 | 2081.623 | 2107.255 |
| | ZINB | 182 | −1032.81 | 9 | 2083.623 | 2112.459 |
| harshdeceleration | POS | 182 | −6269.47 | 7 | 12,552.93 | 12,575.36 |
| | ZIP | 182 | −6269.47 | 8 | 12,554.93 | 12,580.56 |
| | NB | 182 | −1037.14 | 8 | 2090.285 | 2115.917 |
| | ZINB | 182 | −1037.14 | 9 | 2092.285 | 2121.121 |
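The AIC and BIC values in Table 4 can be checked against the reported log-likelihoods and degrees of freedom, assuming the standard definitions AIC = 2k − 2 ln L and BIC = k ln N − 2 ln L. The following is a minimal sketch of that check for the two overspeed models quoted in the abstract; the function and variable names are illustrative and do not come from the paper, and small discrepancies are expected because the tabulated log-likelihoods are rounded.

```python
import math

def aic(log_lik: float, df: int) -> float:
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * df - 2 * log_lik

def bic(log_lik: float, df: int, n: int) -> float:
    """Bayesian information criterion: k ln N - 2 ln L."""
    return df * math.log(n) - 2 * log_lik

# (log-likelihood, df) for the overspeed models, copied from Table 4
overspeed = {"POS": (-3518.92, 7), "NB": (-490.517, 8)}
N = 182  # vehicles in the summary data set

for model, (ll, k) in overspeed.items():
    print(model, round(aic(ll, k), 1), round(bic(ll, k, N), 1))
```

The recomputed values agree with the tabulated AIC and BIC to within rounding, and the lower values for NB reflect its much higher log-likelihood at the cost of only one extra parameter.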
Table 5. The results of negative binomial regression for four near-miss events in the summary data set of drivers.

| Variable | Overspeed Coefficient | Overspeed z | Highspeedbrake Coefficient | Highspeedbrake z | Harshacceleration Coefficient | Harshacceleration z | Harshdeceleration Coefficient | Harshdeceleration z |
|---|---|---|---|---|---|---|---|---|
| constant | −5.175 *** | −15.87 | −5.114 *** | −35.36 | −2.548 *** | −43.39 | −2.525 *** | −42.99 |
| brakes | 0.264 | 1.29 | 0.272 ** | 2.60 | 0.189 *** | 3.38 | 0.180 *** | 3.45 |
| range | 0.185 | 0.79 | 0.272 * | 2.05 | −0.100 | −1.29 | −0.153 | −1.94 |
| speed | −0.113 | −0.20 | 0.249 | 1.28 | −0.776 *** | −8.81 | −0.658 *** | −7.20 |
| rpm | 0.125 | 0.43 | −0.0241 | −0.11 | 0.178 | 1.90 | 0.0969 | 1.07 |
| acceleratorpedalposition | 0.290 | 1.13 | 0.171 | 1.03 | 0.152 | 1.87 | 0.235 ** | 2.82 |
| enginefuelrate | 0.227 | 0.99 | 0.705 *** | 4.49 | −0.0883 | −1.12 | −0.157 * | −2.07 |
| log-likelihood | −490.5 | | −627.4 | | −1032.8 | | −1037.1 | |
| AIC | 997.0 | | 1270.8 | | 2081.6 | | 2090.3 | |
| BIC | 1022.7 | | 1296.5 | | 2107.3 | | 2115.9 | |
| Observations | 182 | | 182 | | 182 | | 182 | |
*** p < 0.01, ** p < 0.05, * p < 0.1.
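To read the coefficients in Table 5, it can help to convert them to rate ratios. Assuming the usual log link for negative binomial regression, a coefficient β multiplies the expected event count by exp(β) for each one-unit increase in the covariate, on whatever scale the covariate entered the model. The sketch below applies this to two of the significant coefficients; the helper name and the dictionary layout are illustrative only.

```python
import math

def rate_ratio(beta: float) -> float:
    """Multiplicative effect on the expected count under a log link."""
    return math.exp(beta)

# (outcome, covariate) -> coefficient, copied from Table 5
coefficients = {
    ("highspeedbrake", "enginefuelrate"): 0.705,
    ("harshacceleration", "speed"): -0.776,
}

for (outcome, covariate), beta in coefficients.items():
    print(f"{outcome} ~ {covariate}: rate ratio = {rate_ratio(beta):.2f}")
```

A ratio above 1 indicates that the expected number of near-miss events rises with the covariate, and a ratio below 1 indicates that it falls.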
Table 6. Model performances of panel Poisson (XTPOS) and panel negative binomial (XTNB) in the panel data set of drivers with six observations per driver.

| Variable | Model | N | Log-Likelihood | df | AIC | BIC |
|---|---|---|---|---|---|---|
| overspeed | XTPOS | 1092 | −1926.78 | 188 | 4229.559 | 5168.763 |
| | XTNB | 1092 | −957.497 | 189 | 2292.993 | 3237.193 |
| highspeedbrake | XTPOS | 1092 | −2594.37 | 188 | 5564.733 | 6503.937 |
| | XTNB | 1092 | −1527.05 | 189 | 3432.105 | 4376.305 |
| harshacceleration | XTPOS | 1092 | −6117.44 | 188 | 12,610.89 | 13,550.09 |
| | XTNB | 1092 | −3526.09 | 189 | 7430.186 | 8374.386 |
| harshdeceleration | XTPOS | 1092 | −6042.02 | 188 | 12,460.03 | 13,399.24 |
| | XTNB | 1092 | −3547.66 | 189 | 7473.311 | 8417.51 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Sun, S.; Bi, J.; Guillen, M.; Pérez-Marín, A.M. Driving Risk Assessment Using Near-Miss Events Based on Panel Poisson Regression and Panel Negative Binomial Regression. Entropy 2021, 23, 829. https://doi.org/10.3390/e23070829
