# Correlation Analysis of Real-Time Warning Factors for Construction Heavy Trucks Based on Electrified Supervision System

## Abstract

## 1. Introduction

## 2. Literature Review

## 3. Materials and Methods

#### 3.1. Data Preprocessing

#### 3.2. Multiple Linear Regression

- The dependent variable is linearly related to the explanatory (independent) variables.
- The independent variables are not highly correlated with each other.
- The variance of the residuals is constant.
- The observations are independent of one another.
- The residuals are normally distributed (multivariate normality).

$$y = \beta_{0} + x_{1}\beta_{1} + x_{2}\beta_{2} + \dots + x_{k}\beta_{k} + \varepsilon$$
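As a minimal sketch of how the coefficients of the linear model above can be estimated, the following fits an ordinary least-squares model with NumPy. The function name `fit_ols` and the synthetic data are assumptions made for illustration; they are not the authors' code or data.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares estimate of (beta_0, beta_1, ..., beta_k)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept beta_0.
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

# Synthetic example (hypothetical data): y = 2 + 3*x1 - 1*x2 + small noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 2.0 + 3.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

beta = fit_ols(X, y)
print(np.round(beta, 2))  # estimates of beta_0, beta_1, beta_2
```

The recovered coefficients should lie close to the values used to generate the data, which is a quick sanity check before applying the same routine to real trip records.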

#### 3.3. Variable Selection

- Generating a candidate set of feature subsets;
- Defining an evaluation function to assess the performance of each feature subset;
- Setting a threshold and stopping the search when the evaluation function value reaches it;
- Verifying the validity of the optimal feature subset.

#### 3.3.1. Best Subset Selection

- Step 1: Assuming that there are k features, start from the null model M_{0}, which contains only the intercept term.
- Step 2: Fit the model with every combination of features. For each fixed number of variables from 1 to k, identify the best combination of that size, which yields the models M_{1}, M_{2}, …, M_{k}. In this paper, the best model is the one with the largest adjusted coefficient of determination (${\mathrm{R}}_{\mathrm{a}}^{2}$).
- Step 3: Generally, the cross-validation error, the Bayesian information criterion (BIC), Cp [35], or adjusted R^{2} is used to select the optimal model among the k + 1 models obtained in Steps 1 and 2. In this paper, ${\mathrm{R}}_{\mathrm{a}}^{2}$ was used to determine the optimal model. The formula for ${\mathrm{R}}_{\mathrm{a}}^{2}$ is shown in Equation (3).
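The three steps can be sketched as an exhaustive search scored by adjusted R². This is an illustrative sketch, not the authors' implementation: the helper names `adjusted_r2` and `best_subset` and the synthetic data are assumptions, and the score uses the standard definition 1 − (RSS/(n − p − 1))/(TSS/(n − 1)), which is assumed to correspond to Equation (3).

```python
from itertools import combinations
import numpy as np

def adjusted_r2(X, y):
    """Adjusted R^2 of an OLS fit with intercept (standard definition)."""
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    tss = float(np.sum((y - y.mean()) ** 2))
    return 1.0 - (rss / (n - p - 1)) / (tss / (n - 1))

def best_subset(X, y, names):
    """Return the feature names and score of the best model among M_0..M_k."""
    k = X.shape[1]
    # Step 1: the null model M_0 (intercept only) has adjusted R^2 equal to 0.
    best_per_size = {0: (0.0, ())}
    # Step 2: for each subset size, keep the best-scoring combination.
    for size in range(1, k + 1):
        scored = [(adjusted_r2(X[:, list(cols)], y), cols)
                  for cols in combinations(range(k), size)]
        best_per_size[size] = max(scored)
    # Step 3: choose among the k + 1 candidates by adjusted R^2.
    score, cols = max(best_per_size.values())
    return [names[i] for i in cols], score

# Hypothetical data: only features "a" and "c" carry signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(120, 4))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 2] + rng.normal(scale=0.5, size=120)

selected, score = best_subset(X, y, ["a", "b", "c", "d"])
print(selected, round(score, 3))
```

The search fits every one of the 2^k subsets, so the cost grows quickly with the number of candidate variables; this is the usual motivation for comparing it with a shrinkage method such as the lasso.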

#### 3.3.2. Lasso Regression

#### 3.4. Multicollinearity Test

## 4. Results

#### 4.1. Variable Selection

#### 4.1.1. Best Subset Selection

With adjusted R^{2} greater than 0.8 as the criterion, only t3 and t4 could be used to construct effective multiple linear models.

#### 4.1.2. Lasso Regression

The adjusted R^{2} value is shown in the last row of Table 6. The optimal set of independent variables was obtained at λ = 0.053 for t3 and at λ = 0.055 for t4. Compared with the result of the best subset selection method, the optimal variable set of t4 contained one additional variable, b1, whereas the optimal variable set of t3 contained three additional variables: b4, b5, and b7.
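To illustrate how the penalty λ drives coefficients to exactly zero, here is a minimal coordinate-descent sketch of the lasso. It assumes the objective (1/(2n))·RSS + λ·Σ|β_j| with an unpenalized intercept and non-constant predictors; the function names and the data are hypothetical, and the λ values in the text depend on the authors' software and data scaling, so they are not directly comparable with the ones used here.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t; values with |z| <= t become exactly zero."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Lasso by cyclic coordinate descent (assumed objective: RSS/(2n) + lam*|beta|_1)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean          # centering removes the intercept
    col_sq = (Xc ** 2).sum(axis=0) / n       # assumes no constant column
    beta = np.zeros(k)
    for _ in range(n_iter):
        for j in range(k):
            # Partial residual that leaves out the contribution of feature j.
            r_j = yc - Xc @ beta + Xc[:, j] * beta[j]
            rho = Xc[:, j] @ r_j / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
    intercept = y_mean - x_mean @ beta
    return intercept, beta

# Hypothetical data: the second feature has no effect on y.
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 3))
y = 1.0 + 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.5, size=300)

b0_small, beta_small = lasso_cd(X, y, lam=0.1)    # mild shrinkage
b0_big, beta_big = lasso_cd(X, y, lam=100.0)      # every coefficient removed
print(np.round(beta_small, 3), beta_big)
```

In practice λ is chosen by cross-validation, as in panel (b) of the lasso figures: the model is refitted over a grid of λ values and the value with the lowest held-out MSE is kept.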

#### 4.2. Results of Regression

## 5. Discussion

#### 5.1. Too Close Distance

#### 5.2. Lane Change across Solid Line

## 6. Conclusions

**Figure 2.** (**a**) Display of alarm trigger points on the road network; (**b**) the density of alarm trigger points in space (red is the densest).

**Figure 4.** Lasso regression (y = t3): (**a**) the change in the coefficient of each variable and in the residual degrees of freedom of the model as log(λ) increases, 1: Trip, 2: Aves, 3: Maxs, 4: t1, 5: t2, 6: t4, 7: b1, 8: b2, 9: b3, 10: b4, 11: b5, 12: b6, 13: b7, 14: b8; (**b**) cross-validation, the change in the MSE of each model and in the residual degrees of freedom of the model as log(λ) increases.

**Figure 5.** Lasso regression (y = t4): (**a**) variation with log(λ) and residual degrees of freedom, 1: Trip, 2: Aves, 3: Maxs, 4: t1, 5: t2, 6: t3, 7: b1, 8: b2, 9: b3, 10: b4, 11: b5, 12: b6, 13: b7, 14: b8; (**b**) cross-validation.

**Figure 6.** Actual vs. predicted plots of t3 and t4: (**a**) model of t3, with the resulting multiple regression model given in Equation (7); (**b**) model of t4, with the resulting multiple regression model given in Equation (8).

No. | Warning Records | Format | Trip Records | Format |
---|---|---|---|---|
1 | Association | varchar | Association | varchar |
2 | License plate number | varchar | License plate number | varchar |
3 | License plate color | varchar | License plate color | varchar |
4 | SIM number | varchar | Date | date |
5 | Warning type | varchar | Departure time | datetime |
6 | Warning level | varchar | Closing time | datetime |
7 | Warning time | datetime | Average speed | float |
8 | Speed | float | Maximum speed | float |
9 | Longitude | double | Is there a warning | varchar |
10 | Latitude | double | Path area | varchar |
11 | Mileage | float | Start and end places | varchar |
12 | State | varchar | Total mileage | float |
13 | Altitude | float | | |
14 | Terminal type | varchar | | |
15 | Car type | varchar | | |
16 | False warning or not | varchar | | |
17 | Remarks | varchar | | |
18 | Risk level | varchar | | |
19 | Risk duration | time | | |
20 | Risk value | float | | |

No. | Type | Count ^{1} | Count without False Warning ^{2} | Keep or Not | Classification |
---|---|---|---|---|---|
1 | Forward collision | 6944 ^{3} | 6944 | Yes | Trajectory abnormality |
2 | Lane departure | 2268 | 2268 | Yes | |
3 | Too close distance | 19,229 | 19,229 | Yes | |
4 | Pedestrian collision | 3127 | 3127 | No | |
5 | Frequent lane changes | 0 | 0 | No | |
6 | Road sign overrun | 0 | 0 | No | |
7 | Obstacle | 0 | 0 | No | |
8 | Assisted-driving fails | 0 | 0 | No | |
9 | Lane change across solid line | 41,788 | 41,788 | Yes | |
10 | Pedestrian detection in carriageway | 0 | 0 | No | Driving behavior abnormality |
11 | Driver mismatch (platform) | 0 | 0 | No | |
12 | Fatigue driving | 769 | 718 | Yes | |
13 | Answering calls | 2023 | 1982 | Yes | |
14 | Smoking | 5681 | 5616 | Yes | |
15 | Not looking ahead | 4047 | 3843 | Yes | |
16 | Driver abnormal | 23,438 | 23,438 | No | |
17 | Probe occlusion | 737 | 737 | Yes | |
18 | Driver behavior monitoring function fails | 1873 | 1873 | No | |
19 | Overtime driving | 0 | 0 | No | |
20 | Not wearing seat belt | 337 | 337 | Yes | |
21 | Infrared-blocking sunglasses fail | 1 | 1 | No | |
22 | Hands-off driving | 3400 | 3387 | Yes | |
23 | Playing phone | 1992 | 1992 | Yes | |
24 | Right rear approach | 162,144 | 162,144 | No | |

^{1} Count of raw records.

^{2} Count after deleting the records that the system identified as false warnings.

^{3} The counts in the table are numbers of occurrences over all trips.

Variables | Rename | Count ^{1} | Mean | SD ^{2} | CV ^{3} | Max | Min | Median |
---|---|---|---|---|---|---|---|---|
Tripkm ^{4} | Trip | - | 135.610 | 6890.924 | 2.5828 | 440.5 | 1.6 | 120.700 |
Average speed ^{5} | Aves | - | 19.030 | 82.994 | 0.2834 | 42.2 | 1.3 | 17.300 |
Max speed ^{5} | Maxs | - | 75.872 | 108.648 | 0.3243 | 106.3 | 34.4 | 76.900 |
Forward collision | t1 | 4009 | 3.88 | 94.902 | 0.303 | 82 | 0 | 0.00 |
Lane departure | t2 | 1051 | 1.02 | 16.327 | 0.126 | 52 | 0 | 0.00 |
Too close distance | t3 | 9791 | 9.48 | 320.816 | 0.557 | 123 | 0 | 2.00 |
Lane change across solid line | t4 | 18,345 | 17.76 | 1519.865 | 1.213 | 262 | 0 | 0.00 |
Fatigue driving | b1 | 513 | 0.50 | 4.930 | 0.069 | 30 | 0 | 0.00 |
Answering calls | b2 | 946 | 0.92 | 5.916 | 0.076 | 29 | 0 | 0.00 |
Smoking | b3 | 3376 | 3.27 | 71.262 | 0.263 | 71 | 0 | 0.00 |
Not looking ahead | b4 | 1734 | 1.68 | 42.406 | 0.203 | 54 | 0 | 0.00 |
Probe occlusion | b5 | 656 | 0.64 | 2.620 | 0.050 | 12 | 0 | 0.00 |
Not wearing seat belt | b6 | 127 | 0.12 | 0.271 | 0.016 | 5 | 0 | 0.00 |
Hands-off driving | b7 | 775 | 0.75 | 49.864 | 0.220 | 144 | 0 | 0.00 |
Playing with phone | b8 | 544 | 0.53 | 1.259 | 0.035 | 6 | 0 | 0.00 |

^{1} Number of occurrences of each warning behavior in each trip; the unit is times/trip.

^{2} Standard deviation.

^{3} Coefficient of variation (CV), the ratio of the standard deviation to the mean.

^{4} The unit is kilometers/trip.

^{5} The unit is kilometers/hour.

Coef ^{1} | Mile | Aves | Maxs | t1 | t2 | t3 | t4 | b1 | b2 | b3 | b4 | b5 | b6 | b7 | b8 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Mile | 1 | | | | | | | | | | | | | | |
Aves | 0.535 ** | 1 | | | | | | | | | | | | | |
Maxs | 0.596 ** | 0.521 ** | 1 | | | | | | | | | | | | |
t1 | 0.249 ** | 0.011 | 0.075 * | 1 | | | | | | | | | | | |
t2 | −0.336 ** | −0.355 ** | −0.234 ** | 0.087 ** | 1 | | | | | | | | | | |
t3 | 0.149 ** | 0.039 | 0.006 | 0.752 ** | −0.054 | 1 | | | | | | | | | |
t4 | −0.051 | 0.022 | −0.141 ** | 0.305 ** | −0.281 ** | 0.689 ** | 1 | | | | | | | | |
b1 | 0.154 ** | −0.037 | 0.042 | 0.299 ** | 0.260 ** | 0.185 ** | −0.195 ** | 1 | | | | | | | |
b2 | 0.143 ** | −0.112 ** | 0.118 ** | 0.039 | −0.067 * | 0.092 ** | 0.116 ** | 0.155 ** | 1 | | | | | | |
b3 | 0.041 | −0.209 ** | 0.019 | −0.065 * | −0.182 ** | 0.097 ** | 0.202 ** | 0.044 | 0.525 ** | 1 | | | | | |
b4 | 0.153 ** | −0.033 | 0.032 | 0.408 ** | 0.063 * | 0.278 ** | −0.192 ** | 0.430 ** | 0.037 | 0.023 | 1 | | | | |
b5 | 0.316 ** | 0.237 ** | 0.195 ** | 0.023 | −0.131 ** | −0.091 ** | −0.198 ** | −0.014 | −0.234 ** | −0.289 ** | 0.003 | 1 | | | |
b6 | −0.016 | −0.006 | −0.065 * | −0.122 ** | −0.064 * | −0.066 * | 0.006 | −0.035 | 0.170 ** | 0.293 ** | 0.006 | −0.118 ** | 1 | | |
b7 | 0.081 ** | 0.011 | 0.093 ** | 0.007 | −0.069 * | −0.004 | 0.050 | −0.074 * | 0.006 | −0.013 | −0.069 * | 0.019 | −0.008 | 1 | |
b8 | 0.191 ** | 0.159 ** | 0.109 ** | −0.198 ** | −0.203 ** | −0.291 ** | −0.129 ** | −0.171 ** | −0.082 ** | −0.090 ** | −0.248 ** | 0.179 ** | −0.162 ** | 0.058 | 1 |

^{1}Coefficient; ** the correlation is significant at the 0.01 level (two-tailed); * the correlation is significant at the 0.05 level (two-tailed).

y | Equation | Adjusted R^{2} |
---|---|---|
t1 | y~Trip + Aves + t2 + t3 + t4 + b1 + b2 + b3 + b4 + b5 + b8 | 0.707 |
t2 | y~Aves + t1 + b1 + b3 + b4 + b8 | 0.110 |
t3 | y~Trip + Aves + Maxs + t1 + t2 + t4 + b1 + b2 + b3 + b6 | 0.895 |
t4 | y~Trip + Aves + Maxs + t1 + t2 + t3 + b2 + b3 + b4 + b5 + b6 + b7 + b8 | 0.901 |
b1 | y~t1 + t2 + t3 + t4 + b2 + b4 + b5 | 0.376 |
b2 | y~Trip + Aves + Maxs + t1 + t3 + t4 + b1 + b3 + b5 + b6 + b7 | 0.291 |
b3 | y~Trip + Aves + t1 + t2 + t3 + t4 + b2 + b6 + b7 | 0.472 |
b4 | y~Trip + Maxs + t1 + t2 + t4 + b1 + b5 + b8 | 0.481 |
b5 | y~Trip + t1 + t4 + b1 + b2 + b6 + b8 | 0.236 |
b6 | y~Aves + t3 + b2 + b3 + b8 | 0.094 |
b7 | y~t4 + b2 + b3 | 0.079 |
b8 | y~Trip + Aves + t1 + t2 + t4 + b4 + b5 + b6 | 0.181 |

Variables (y = t3) | λ = 0.053 | λ = 0.716 | Variables (y = t4) | λ = 0.055 | λ = 1.71 |
---|---|---|---|---|---|
Intercept | −7.031 | −4.595 | Intercept | 15.464 | 3.498 |
Trip | 0.005 | 0.002 | Trip | −0.002 | - |
Aves | −0.043 | - | Aves | 0.236 | - |
Maxs | 0.095 | 0.065 | Maxs | −0.223 | −0.027 |
t1 | 0.594 | 0.529 | t1 | −1.100 | −0.954 |
t2 | 0.071 | - | t2 | −0.242 | - |
t4 | 0.426 | 0.393 | t3 | 1.907 | 1.779 |
b1 | −0.188 | - | b1 | 0.149 | - |
b2 | 0.756 | 0.520 | b2 | −0.939 | - |
b3 | −0.350 | −0.177 | b3 | 1.094 | 0.966 |
b4 | 0.008 | - | b4 | −0.135 | - |
b5 | 0.076 | - | b5 | −0.772 | −0.006 |
b6 | −0.491 | - | b6 | −0.861 | - |
b7 | 0.012 | - | b7 | 0.111 | - |
b8 | - | - | b8 | −1.089 | - |
Adjusted R^{2} | 0.895 | 0.894 | Adjusted R^{2} | 0.901 | 0.894 |

Variable (y = t3) | Estimate | S.E. ^{1} | \|t\| | p Value ^{2} | VIF |
---|---|---|---|---|---|
Intercept | −7.403 | 1.509 | 4.906 | <0.0001 | |
Trip | 0.006 | 0.003 | 1.853 | 0.0641 | 2.202 |
Aves | −0.058 | 0.026 | 2.254 | 0.0244 | 1.678 |
Maxs | 0.102 | 0.023 | 4.407 | <0.0001 | 1.775 |
t1 | 0.597 | 0.029 | 20.800 | <0.0001 | 2.397 |
t2 | 0.090 | 0.047 | 1.903 | 0.0573 | 1.118 |
t4 | 0.429 | 0.006 | 69.800 | <0.0001 | 1.758 |
b1 | −0.250 | 0.103 | 2.420 | 0.0157 | 1.615 |
b2 | 0.770 | 0.085 | 9.010 | <0.0001 | 1.324 |
b3 | −0.365 | 0.027 | 13.350 | <0.0001 | 1.633 |
b4 | 0.022 | 0.039 | 0.570 | 0.5691 | 1.94 |
b5 | 0.100 | 0.128 | 0.783 | 0.4336 | 1.308 |
b6 | −0.547 | 0.362 | 1.512 | 0.1308 | 1.088 |
b7 | 0.015 | 0.027 | 0.545 | 0.586 | 1.1 |
Adjusted R^{2} | 0.8948 | | | | |

^{1} Standard error of the coefficient.

^{2} The confidence level is 95%.

Variable (y = t4) | Estimate | S.E. ^{1} | \|t\| | p Value ^{2} | VIF |
---|---|---|---|---|---|
Intercept | 15.930 | 3.189 | 4.995 | <0.0001 | |
Trip | −0.003 | 0.007 | 0.399 | 0.6897 | 2.335 |
Aves | 0.249 | 0.054 | 4.616 | <0.0001 | 1.665 |
Maxs | −0.231 | 0.049 | 4.736 | <0.0001 | 1.771 |
t1 | −1.110 | 0.064 | 17.330 | <0.0001 | 2.672 |
t2 | −0.263 | 0.100 | 2.621 | 0.0089 | 1.131 |
t3 | 1.911 | 0.028 | 67.980 | <0.0001 | 1.743 |
b1 | 0.225 | 0.219 | 1.027 | 0.3045 | 1.623 |
b2 | −0.979 | 0.185 | 5.288 | <0.0001 | 1.393 |
b3 | 1.100 | 0.052 | 21.040 | <0.0001 | 1.337 |
b4 | −0.148 | 0.082 | 1.806 | 0.0712 | 1.945 |
b5 | −0.790 | 0.270 | 2.921 | 0.0036 | 1.316 |
b6 | −0.980 | 0.775 | 1.265 | 0.2063 | 1.117 |
b7 | 0.119 | 0.057 | 2.104 | 0.0356 | 1.096 |
b8 | −1.155 | 0.376 | 3.067 | 0.0022 | 1.226 |
Adjusted R^{2} | 0.9009 | | | | |

^{1} Standard error of the coefficient.

^{2} The confidence level is 95%.
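The VIF column in the two coefficient tables above is the usual multicollinearity diagnostic: each predictor is regressed on all the others, and VIF_j = 1/(1 − R_j²). The sketch below shows that computation under stated assumptions; the function name `vif` and the synthetic data are illustrative only, and the threshold of 10 mentioned afterwards is a common rule of thumb rather than a value taken from the paper.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of every column of X."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    factors = []
    for j in range(k):
        target = X[:, j]
        # Regress column j on the remaining columns plus an intercept.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        rss = np.sum((target - others @ beta) ** 2)
        tss = np.sum((target - target.mean()) ** 2)
        # 1 / (1 - R^2) simplifies to TSS / RSS.
        factors.append(tss / rss)
    return np.array(factors)

# Hypothetical data: x2 is almost a copy of x0, while x1 is independent.
rng = np.random.default_rng(3)
x0 = rng.normal(size=500)
x1 = rng.normal(size=500)
x2 = x0 + 0.1 * rng.normal(size=500)

values = vif(np.column_stack([x0, x1, x2]))
print(np.round(values, 2))
```

Values close to 1 indicate a predictor that the others cannot explain; values above roughly 10 are commonly read as serious multicollinearity. All VIF values reported in the two tables are below 3.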

Index | Equation (5), No Training Set | Equation (5), 10-Fold Cross-Validation | Equation (6), No Training Set | Equation (6), 10-Fold Cross-Validation |
---|---|---|---|---|
MSE | 40.244 | 36.613 | 213.057 | 157.997 |
RMSE | 6.34 | 6.051 | 14.596 | 12.570 |
MAE | 3.20 | 3.243 | 7.345 | 7.028 |
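For reference, the three error indices in the table above can be computed as in the following sketch. The function name `error_metrics` and the four sample values are hypothetical; only the final check uses a number from the table.

```python
import math

def error_metrics(actual, predicted):
    """Mean squared error, its square root, and mean absolute error."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae}

# Hypothetical warning counts per trip: observed vs. model prediction.
metrics = error_metrics([3, 0, 12, 5], [2, 1, 10, 5])
print(metrics)  # MSE 1.5, RMSE about 1.225, MAE 1.0

# RMSE is the square root of MSE, which the reported values satisfy.
print(round(math.sqrt(40.244), 2))  # 6.34
```

Under 10-fold cross-validation the same indices are computed on each held-out fold and then averaged, so they estimate the error on trips the model has not seen.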

