1. Introduction
Pavement anti-slip performance plays a critical role in road traffic safety, as inadequate friction can markedly elevate the risk of accidents [1]. It is typically quantified by the tire-pavement friction coefficient, which is affected by factors such as vehicle load, speed, slip ratio, rubber properties, ambient and pavement temperatures, and the thickness of surface water films [2]. Alongside these factors, pavement texture is considered a critical determinant, as it enables tires to displace surface water and penetrate water films, thereby maintaining effective contact with the road surface [3,4]. Therefore, regularly monitoring pavement texture and accurately predicting anti-slip performance are crucial for traffic safety management.
Fractal dimension has been considered an important indicator in pavement surface image segmentation, classification, and characterization [5,6]. M.M. Villani et al. [7] improved the anti-slip performance of asphalt mixtures by considering fractal indicators in road material design. Lin Li et al. [8] and Mehran Motamedi et al. [9] used fractal parameters and the correlation function to describe pavement surface texture roughness when determining tire/pavement elastic contact behavior. Jiale Lu et al. [10] simulated pavement surface polishing by combining a fractal parameter with conventional statistical parameters. Ke Zhong et al. [11] used fractal theory together with the Fourier transform to develop a dynamic anti-sliding risk early-warning model for airport pavement. Cheng Liu and You Zhan et al. [12] demonstrated that the fractal dimension correlated with the BPN (British pendulum number) with a coefficient of 0.6 when the vertical height ratio was appropriately selected. Kaifeng Wang et al. [13] reported a correlation coefficient of 0.95074 between the fractal dimension and the BPN for asphalt mixtures sprayed with anti-slip particles in a laboratory study. Overall, the fractal dimension remains an important indicator of pavement anti-slip performance. For example, values close to 2.0 typically indicate smoother surfaces, which are less favorable for skid resistance, whereas values approaching 2.5–2.8 represent more complex and irregular textures that are usually associated with improved anti-slip performance. However, existing studies still fall short in accurately correlating directional fractal dimension with in-situ pavement friction, especially under varying speeds and environmental conditions.
With advances in noncontact three-dimensional (3D) measurement technologies and high-performance computing, wavelet analysis, the Hilbert-Huang transform, fractal analysis, power spectral density, and Persson’s model have been used to characterize pavement macrotexture attributes and correlate them with friction performance [14,15,16,17]. Besides traditional texture parameters such as the mean profile depth (MPD), Li et al. [18] selected an array of 3D areal texture parameters to predict surface friction at various speeds. Researchers have also built relational models that link anti-slip performance and pavement roughness using two-dimensional images of pavement [19,20]. However, two-dimensional data lack information on the elevation of the road surface and therefore cannot describe the texture features of the road surface completely. Hartikainen et al. [21] analyzed the connection between root mean square (RMS) roughness and BPN by layering the road texture in the elevation direction. Kanafi et al. [22] layered the pavement structure using the projected area and established an ideal anti-slip performance evaluation model. Pavement texture characteristics are described using fractal theory because of the self-similarity of pavement micromorphology [23]. Hanyu Zhang et al. [24] collected texture geometry from asphalt mixture specimens in the laboratory and demonstrated a good correlation between the fractal dimension of pavement texture and pendulum friction tester data (BPN). Li et al. [25] analyzed the correlation between road texture features, including fractal characteristics, and grip tester friction data based on on-site road texture data with 1 mm accuracy. Ding and Zhan et al. [26] collected high-precision pavement texture data, described texture features at different heights through the difference-box fractal dimension and effective contact area, and established a correlation model between the difference-box fractal dimension and pendulum friction tester data. Methods for gathering road texture information have thus evolved from low-precision indoor testing to high-precision on-site collection, which is now the main basis for evaluating road surface anti-slip performance. However, previous research only discussed the connection between texture fractal dimension and road friction at low speeds, and the model correlation could be enhanced.
Recently, the application of machine learning has opened new avenues for predicting pavement friction. Deep learning models such as convolutional neural networks and residual networks have been proposed to extract pavement texture features from images and reconstruct 3D models to predict pavement anti-slip performance [27,28,29,30]. Liu et al. [31] extracted features from single-view road images with a deep neural network encoder and built a 3D model of the road macrotexture to evaluate the anti-slip performance of the road. Yang et al. [32] established a convolutional neural network prediction model for pavement anti-sliding performance based on field pavement texture data with 1 mm precision. Zhan et al. [33] proposed a residual network prediction model of pavement friction depth suitable for surface finish data sets. Although these neural network-based approaches have demonstrated promising predictive capabilities, most of them require large training datasets and substantial computational resources, and they lack interpretability. In parallel, recent studies [34,35] have explored alternative strategies, such as partial least squares regression with non-standard texture parameters and binocular vision combined with deep learning for texture characterization. These developments underscore the growing diversity of machine learning techniques that can be leveraged to evaluate pavement anti-slip performance. Moreover, beyond pavement engineering, artificial intelligence, machine learning, and computer-aided design have already proven effective in various scientific and industrial domains, including manufacturing optimization, material processing, and engineering design [36,37,38]. These cross-disciplinary applications further highlight the potential of AI-driven methodologies for advancing pavement engineering research.
In contrast, tree-based machine learning algorithms, such as Random Forest, Gradient Boosting Decision Trees (GBDT), and XGBoost, offer several advantages [39,40], including the ability to handle small datasets, faster training, higher accuracy, and interpretability. For example, Yang et al. [41] established a pavement anti-slip performance model using a random forest algorithm and high-precision field pavement texture data, effectively explaining the impact of various pavement texture parameters on pavement anti-slip performance. Zhan et al. [42] developed multi-scale evaluation metrics for pavement anti-slip performance using real-world pavement data and created a GBDT-based perception model for pavement anti-slip performance. Additionally, Zhan et al. developed an XGBoost-based framework integrating fast Fourier transform (FFT) analysis to establish correlations between texture features and pavement friction values (BPN). Both studies found that road surface temperature has a significant impact on friction coefficient prediction. However, these studies still face limitations such as relatively low correlation levels and a lack of high-speed friction prediction, given that the BPN primarily reflects friction at a speed of approximately 5 km/h [43]. Moreover, limited attention has been paid to multi-directional texture characteristics under varying vehicle speeds, where dynamic effects are more significant.
Therefore, this study integrates pavement texture features from multiple spatial perspectives within a novel multi-view fractal framework and employs a tree-based machine learning model, specifically the XGBoost algorithm, to predict pavement anti-slip performance at vehicle speeds of 10 km/h and 70 km/h, representing low- and high-speed conditions, respectively. The main objectives of this research are as follows:
(1) To characterize multi-view fractal dimensions by analyzing the pavement texture in surface, cross-sectional, and depth perspectives;
(2) To prepare 221 datasets of high-resolution pavement texture data using an LS-40 portable 3D laser surface analyzer;
(3) To develop and optimize the XGBoost model through hyperparameter tuning based on the multi-view fractal dimension values of the texture and pavement friction data;
(4) To evaluate the performance of the XGBoost model by comparing it with other classical machine learning algorithms.
4. The XGBoost Evaluation Model for Road Friction
4.1. The XGBoost Algorithm
XGBoost (Extreme Gradient Boosting) is an ensemble learning algorithm based on gradient boosting decision trees (GBDT) [47]. It performs well on small-sample datasets, which makes it suitable for scenarios where high-resolution pavement texture data are expensive or difficult to obtain. This study uses decision trees as the base learners for XGBoost gradient boosting. The basic idea is to add new decision trees one at a time, with each tree fitting the residual of the predictions accumulated so far in order to reduce model bias. After the final iteration, the sum of the outputs of all trees is the prediction for the sample. The predicted value $\hat{y}_i^{(t)}$ is defined as:
$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i)$$
where $\hat{y}_i^{(t)}$ is the predicted result of sample $i$ after the $t$-th iteration; $\hat{y}_i^{(t-1)}$ represents the prediction results of the first $t-1$ trees; $f_t(x_i)$ denotes the model of the $t$-th tree.
The objective function of XGBoost is:
$$Obj^{(t)} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i^{(t)}\right) + \sum_{k=1}^{t} \Omega(f_k)$$
where $n$ denotes the total number of samples; $l$ is the loss function, which measures the error between the predicted value $\hat{y}_i^{(t)}$ and the true value $y_i$; the regularization term $\Omega$ controls the complexity of the model and helps avoid overfitting. For the $t$-th tree it is expressed as:
$$\Omega(f_t) = \gamma T + \frac{1}{2}\lambda \sum_{j=1}^{T} w_j^2$$
where $\gamma$ denotes the penalty coefficient on the leaf nodes of the decision tree; $\lambda$ is the L2 regularization coefficient; $T$ and $w$ denote the number and weights of the leaf nodes in the $t$-th tree; $w_j$ is the weight of the $j$-th leaf node.
As shown in Figure 12, XGBoost applies a second-order Taylor expansion of the loss function so that the objective is approximated more accurately. The model also uses multithreading to search in parallel for the optimal split point of each feature, which significantly improves the training speed.
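The role of the second-order Taylor expansion can be illustrated with the closed-form leaf weight and split gain from the standard XGBoost derivation. The sketch below is a minimal illustration under stated assumptions: a squared-error loss $l = \frac{1}{2}(y - \hat{y})^2$, so that the first- and second-order gradients are $g_i = \hat{y}_i - y_i$ and $h_i = 1$, and no L1 term. The function names are illustrative and are not part of the XGBoost library API.

```python
def leaf_weight(grads, hess, reg_lambda=1.0):
    """Optimal leaf weight w* = -G / (H + lambda), with G and H the summed gradients."""
    G, H = sum(grads), sum(hess)
    return -G / (H + reg_lambda)


def leaf_score(grads, hess, reg_lambda=1.0):
    """Structure score G^2 / (H + lambda) of a single leaf."""
    G, H = sum(grads), sum(hess)
    return G * G / (H + reg_lambda)


def split_gain(g_left, h_left, g_right, h_right, reg_lambda=1.0, gamma=0.0):
    """Gain of splitting one leaf into two; gamma is the per-leaf penalty."""
    parent = leaf_score(g_left + g_right, h_left + h_right, reg_lambda)
    children = (leaf_score(g_left, h_left, reg_lambda)
                + leaf_score(g_right, h_right, reg_lambda))
    return 0.5 * (children - parent) - gamma


# Two samples with true values 1.0 and 3.0 and current prediction 0.0:
# g = prediction - truth, h = 1 for squared-error loss.
grads, hess = [-1.0, -3.0], [1.0, 1.0]
print(leaf_weight(grads, hess, reg_lambda=0.0))  # mean residual when lambda = 0
print(split_gain([-1.0], [1.0], [-3.0], [1.0], reg_lambda=0.0))
```

With $\lambda = 0$ the leaf weight reduces to the mean residual, which matches the residual-fitting description above; a positive $\lambda$ shrinks the weight toward zero.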
4.2. The XGBoost Hyperparameter Tuning
In this study, the multi-view fractal dimensions of the road texture (F3D, F2D, FSur, F10%~F90%) and the road surface temperature are taken as the input features, and DFT10 and DFT70 are taken as the labels. A random 70% of the data is selected as the training set and the remaining 30% as the test set, and a regression model for road anti-slip performance is established based on the XGBoost algorithm.
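A random 70/30 partition of this kind can be sketched with the standard library alone. The function name, the fixed seed, and the rounding of the training-set size are assumptions made for illustration; the text does not state how the split was implemented.

```python
import random


def split_train_test(rows, train_fraction=0.7, seed=0):
    """Shuffle row indices and cut them into a training part and a test part."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    n_train = round(train_fraction * len(rows))  # assumed rounding rule
    train = [rows[i] for i in indices[:n_train]]
    test = [rows[i] for i in indices[n_train:]]
    return train, test


# 221 placeholder records standing in for the 221 texture datasets.
records = list(range(221))
train_set, test_set = split_train_test(records)
print(len(train_set), len(test_set))
```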
In machine learning, hyperparameters refer to parameters that must be specified prior to the commencement of the training process. These parameters critically influence the model’s final performance.
Table 1 summarizes the key hyperparameters of the XGBoost algorithm that require tuning.
This study aimed to create a reliable XGBoost-based predictive model for pavement anti-slip performance. To achieve this, a double-layer grid search method was used to optimize the relevant parameters. The method involves two search cycles. The first cycle uses a wide search range and a large step size to identify the approximate location of the optimal parameters without an extended search time. The second cycle uses a narrower search range around the first-cycle result and a smaller step size, which allows a more precise determination of the optimal parameter combination.
The double-layer grid search used the following settings. In the first cycle, n-estimators was searched over 20–200 with a step size of 20; max-depth over 1–100 with a step size of 5; and the learning rate, reg-alpha, and reg-lambda each over 0–1 with a step size of 0.1. Every parameter combination was evaluated by nested loops, and the results were used to set the second cycle: n-estimators over 80–100 with a step size of 1; max-depth over 1–6 with a step size of 1; the learning rate and reg-alpha each over 0.1–0.2 with a step size of 0.001; and reg-lambda over 0.2–0.3 with a step size of 0.001.
Finally, it was determined that the model had the optimal results when n-estimators were 90, max-depth was 5, the learning rate was 0.125, reg-alpha was 0.121, and reg-lambda was 0.28. To mitigate potential overfitting, L1 and L2 regularization were applied, tree depth was restricted, and a five-fold cross-validation was conducted.
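The coarse-to-fine logic of the double-layer grid search can be sketched as follows. This is a minimal illustration, not the study's implementation: `toy_score` is a made-up stand-in for the cross-validated model score, only two of the five hyperparameters are searched, and the way the fine grid is centered on the coarse optimum (`make_fine_grid`) is an assumption.

```python
import itertools


def frange(start, stop, step):
    """Inclusive float range, rounded to limit floating-point drift."""
    n = int(round((stop - start) / step))
    return [round(start + k * step, 10) for k in range(n + 1)]


def grid_search(score, grid):
    """Evaluate every parameter combination and keep the best-scoring one."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for combo in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        value = score(params)
        if value > best_score:
            best_params, best_score = params, value
    return best_params, best_score


def two_stage_search(score, coarse_grid, make_fine_grid):
    """First cycle on a coarse grid, second cycle on a fine grid around its optimum."""
    coarse_best, _ = grid_search(score, coarse_grid)
    return grid_search(score, make_fine_grid(coarse_best))


def toy_score(p):
    # Hypothetical smooth score with a known optimum; replace with a CV score.
    return -((p["n_estimators"] - 130) / 100) ** 2 - (p["learning_rate"] - 0.37) ** 2


coarse = {
    "n_estimators": list(range(20, 201, 20)),
    "learning_rate": frange(0.1, 1.0, 0.1),
}


def make_fine_grid(best):
    n, lr = best["n_estimators"], best["learning_rate"]
    return {
        "n_estimators": list(range(max(1, n - 20), n + 21)),
        "learning_rate": frange(max(0.001, lr - 0.1), lr + 0.1, 0.001),
    }


best_params, best_score = two_stage_search(toy_score, coarse, make_fine_grid)
print(best_params)
```

The two-cycle design keeps the number of evaluations manageable: a single grid with the fine step sizes over the full coarse ranges would require orders of magnitude more model fits.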
An example decision tree in the XGBoost model after hyperparameter tuning is shown in
Figure 13.
4.3. The Classical Prediction Model Selection for Comparison
To further verify the prediction performance of the XGBoost model established in this paper, five machine learning models were built on the same input features: linear regression (LR), decision tree (DT), support vector machine (SVM), random forest (RF), and BP neural network (BPNN).
4.3.1. Linear Regression Model
Linear regression is one of the most basic machine learning algorithms. It is a supervised learning algorithm used to predict continuous values. The multiple linear regression model describes the relationship between the dependent variable $Y$ and the independent variables $X_1, X_2, X_3, \ldots, X_n$:
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n + \varepsilon$$
where $\beta_0$ is a constant term, $\beta_i$ ($i$ = 1, 2, 3, …, $n$) are the regression coefficients, and $\varepsilon$ is the error term.
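As an illustration of how the constant term and regression coefficients of a multiple linear regression can be estimated, the sketch below solves the least-squares normal equations with Gauss-Jordan elimination. Ordinary least squares is assumed here because the text does not name the fitting method; the data are made up, and the function name is illustrative.

```python
def fit_linear_regression(X, y):
    """Return [intercept, coef_1, ..., coef_n] minimizing the squared error."""
    rows = [[1.0] + list(r) for r in X]  # prepend 1 for the constant term
    p = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    b = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * c for a, c in zip(A[r], A[col])]
                b[r] -= factor * b[col]
    return [b[i] / A[i][i] for i in range(p)]


# Made-up data generated from Y = 1 + 2*X1 + 3*X2 (no noise).
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [1, 3, 4, 6, 8]
print(fit_linear_regression(X, y))
```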
4.3.2. Decision Tree Model
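The decision tree is one of the most commonly used machine learning algorithms. Building a regression decision tree is essentially a recursive process that repeatedly partitions the space of the independent variables. The feature data are divided into $N$ units $R_1, R_2, \ldots, R_N$, and a specific output value $C_n$ is assigned to each unit $R_n$. The regression decision tree model [48] is represented as:
$$f(x) = \sum_{n=1}^{N} C_n I(x \in R_n)$$
where $I$ is the indicator function. To segment the feature space, the $i$-th variable $x_i$ and a value $s$ are chosen, and the feature space is divided into two regions:
$$R_1(i, s) = \{x \mid x_i \le s\}, \quad R_2(i, s) = \{x \mid x_i > s\}$$
The following problem is then solved to obtain the optimal segmentation point $s$ corresponding to the variable $i$:
$$\min_{i,s}\left[\min_{C_1}\sum_{x^{(k)} \in R_1(i,s)} \left(y^{(k)} - C_1\right)^2 + \min_{C_2}\sum_{x^{(k)} \in R_2(i,s)} \left(y^{(k)} - C_2\right)^2\right]$$
where $(x^{(k)}, y^{(k)})$ denotes the $k$-th sample. The above process is repeated for each region until the stopping conditions are met, which completes the regression decision tree.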
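The search for the optimal segmentation variable and point can be sketched as an exhaustive scan that minimizes the summed squared error of the two resulting regions. This is a minimal sketch of a single split under the usual CART convention (each region predicts the mean of its targets); the data are made up and the function names are illustrative.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty region)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(X, y):
    """Return (loss, feature index i, threshold s) of the best single split."""
    best = None
    for i in range(len(X[0])):
        # Candidate thresholds: every observed value except the largest,
        # so that both regions are non-empty.
        for s in sorted({row[i] for row in X})[:-1]:
            left = [t for row, t in zip(X, y) if row[i] <= s]
            right = [t for row, t in zip(X, y) if row[i] > s]
            loss = sse(left) + sse(right)
            if best is None or loss < best[0]:
                best = (loss, i, s)
    return best


# Made-up one-feature example with an obvious break between 2 and 3.
X = [[1], [2], [3], [4]]
y = [1.0, 1.0, 5.0, 5.0]
print(best_split(X, y))
```

Applying `best_split` recursively to each resulting region, until a stopping condition such as a maximum depth is reached, yields the full regression tree.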
The hyperparameters selected for the regression decision tree model of pavement anti-slip performance established in this study are shown in Table 2:
4.3.3. Random Forest Model
Random forest is an ensemble learning algorithm that contains multiple decision trees. It uses the Bootstrap sampling method to randomly draw samples from the original data set to build a training sample set, and repeats the process several times to create different training sample sets. For each training sample set, a CART regression tree is constructed with random node splitting under the minimum mean square deviation principle. For any split point $s$ of feature $A$, the data are divided into datasets $D_1$ and $D_2$, and the feature and split point are chosen that minimize the squared deviation within $D_1$ and within $D_2$ as well as the sum of the two. The specific expression [49] is as follows:
$$\min_{A,s}\left[\min_{C_1}\sum_{x_i \in D_1(A,s)} (y_i - C_1)^2 + \min_{C_2}\sum_{x_i \in D_2(A,s)} (y_i - C_2)^2\right]$$
where $C_1$ is the sample output mean of the $D_1$ dataset, and $C_2$ is the sample output mean of the $D_2$ dataset.
Finally, the mean value of the regression results of all regression trees is the final regression result of the random forest:
$$F = \frac{1}{n}\sum_{i=1}^{n} f_i$$
where $F$ is the final regression result of the random forest, $f_i$ is the result of the $i$-th regression tree, and $n$ is the number of regression trees.
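The two ingredients described above, Bootstrap resampling and averaging over trees, can be sketched as follows. To keep the example short, each "tree" is replaced by a constant predictor that returns the mean of its Bootstrap sample; this stand-in, the made-up target values, and the function names are all assumptions for illustration only.

```python
import random


def bootstrap_sample(rows, rng):
    """Draw len(rows) items with replacement (one Bootstrap training set)."""
    return [rng.choice(rows) for _ in rows]


def forest_predict(trees, x):
    """Average the outputs of all regression trees: F = (1/n) * sum(f_i)."""
    return sum(tree(x) for tree in trees) / len(trees)


rng = random.Random(0)
targets = [0.42, 0.55, 0.61, 0.48, 0.52]  # hypothetical friction values

trees = []
for _ in range(10):
    sample = bootstrap_sample(targets, rng)
    sample_mean = sum(sample) / len(sample)
    # Stand-in for a fitted CART tree: always predicts its sample mean.
    trees.append(lambda x, m=sample_mean: m)

print(forest_predict(trees, x=None))
```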
The hyperparameters selected for the random forest model of pavement anti-slip performance established in this study are shown in Table 3:
4.3.4. SVM Model
SVM (Support Vector Machine) [50] is a classical machine learning algorithm that performs data regression through supervised learning. The algorithm maps the original data into a high-dimensional space through a nonlinear mapping $\varphi$ and finds the hyperplane with the smallest error. The regression function established in the high-dimensional space is shown in Equation (1).
$$f(x) = \omega^{T}\varphi(x) + b \quad (1)$$
Then, the Lagrange function is introduced to solve for $\omega$ and $b$ and obtain the regression equation. The quantities involved are the Lagrange multipliers, the number of support vectors $N$, the penalty coefficient $c$, the kernel function, and the error term.
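For reference, the textbook $\varepsilon$-insensitive support vector regression formulation is written out below. It is given on the assumption that it corresponds to the model used here. The symbols $\alpha_i$, $\alpha_i^*$ (Lagrange multipliers), $\xi_i$, $\xi_i^*$ (error terms), $\varepsilon$ (tube width), $K$ (kernel function), and $n$ (number of training samples) follow conventional notation and are not taken from the text; $N$ and $c$ are used as defined above.

```latex
% Primal problem: flatness of the function plus penalized errors
\min_{\omega,\, b,\, \xi,\, \xi^*} \;
  \frac{1}{2}\lVert \omega \rVert^{2} + c \sum_{i=1}^{n} \left( \xi_i + \xi_i^{*} \right)
\quad \text{s.t.} \quad
\begin{cases}
  y_i - \omega^{T}\varphi(x_i) - b \le \varepsilon + \xi_i, \\
  \omega^{T}\varphi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \\
  \xi_i \ge 0, \quad \xi_i^{*} \ge 0 .
\end{cases}

% Regression equation obtained from the Lagrangian dual,
% summed over the N support vectors
f(x) = \sum_{i=1}^{N} \left( \alpha_i - \alpha_i^{*} \right) K(x_i, x) + b,
\qquad K(x_i, x) = \varphi(x_i)^{T} \varphi(x)
```

The penalty coefficient $c$ trades off the flatness of the regression function against the size of the errors that exceed the $\varepsilon$ tube.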
The hyperparameters selected for the support vector machine model of pavement anti-slip performance established in this study are shown in Table 4:
4.3.5. BPNN Model
BPNN is a multi-layer feedforward network that combines a large number of neurons to solve complex nonlinear problems using a steepest gradient descent learning strategy. During forward propagation, the output value of each node is obtained from the output values of all nodes in the previous layer, the weights, the threshold, and the activation function. The specific formula [51] is as follows:
where m is the number of nodes, ω is the weight value, b is the threshold value, and f is the activation function.
During backpropagation, the weights and thresholds of the network are adjusted continuously along the steepest descent direction of the sum of squared relative errors, which reduces the error between the actual output and the expected output of the system. The specific formula is as follows:
where E is the error function, d is the output layer result, and δ is the learning signal.
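The forward and backward passes can be sketched for a single output layer. The sketch rests on several assumptions that the text does not confirm: a sigmoid activation, the threshold b added to the weighted sum as a bias, an error function $E = \frac{1}{2}\sum (d - o)^2$ in which d is taken as the expected output and o as the actual output, and a learning signal $\delta = (d - o)\,o\,(1 - o)$. Function and variable names are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs, weights, biases, activation=sigmoid):
    """Output of each node: activation(sum_i w_i * x_i + b)."""
    return [
        activation(sum(w * x for w, x in zip(w_row, inputs)) + b)
        for w_row, b in zip(weights, biases)
    ]


def output_layer_step(inputs, weights, biases, target, lr):
    """One steepest-descent update of a sigmoid output layer on E = 1/2 * sum((d - o)^2)."""
    outputs = layer_forward(inputs, weights, biases)
    # Learning signal: delta = (d - o) * f'(net), with f'(net) = o * (1 - o) for sigmoid
    deltas = [(d - o) * o * (1.0 - o) for d, o in zip(target, outputs)]
    new_weights = [
        [w + lr * delta * x for w, x in zip(w_row, inputs)]
        for w_row, delta in zip(weights, deltas)
    ]
    new_biases = [b + lr * delta for b, delta in zip(biases, deltas)]
    return new_weights, new_biases


x = [1.0, 0.5]
w, b = [[0.0, 0.0]], [0.0]
before = layer_forward(x, w, b)[0]
w, b = output_layer_step(x, w, b, target=[1.0], lr=0.5)
after = layer_forward(x, w, b)[0]
print(before, after)  # the output moves toward the target of 1.0
```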
The hyperparameters selected for the BPNN model of pavement anti-slip performance established in this study are shown in Table 5:
6. Conclusions
(1) In this study, an LS-40 portable three-dimensional surface analyzer with an accuracy of 0.05 mm was used to collect road texture data, and the complexity of road texture was comprehensively characterized by multi-view fractal dimension. The fractal feature indicators of F3D, F2D1, F2D2, FSur, and F10%~F90% texture were proposed from the space, cross-section, and depth directions, respectively, improving the fractal characterization method of road texture under various spatial perspectives.
(2) For evaluating road friction, the dynamic friction coefficients measured at 10 km/h and 70 km/h were employed to represent low- and high-speed conditions, respectively. Analysis of the multi-view fractal characteristics, road surface temperature, and dynamic friction data revealed no direct linear correlation among these variables. Nevertheless, each feature independently contributes to describing the complexity of the pavement texture.
(3) By integrating multi-view fractal dimension features, which capture pavement texture complexity across spatial, cross-sectional, and depth dimensions, with road surface temperature, we developed an XGBoost-based regression model for friction prediction. A two-stage grid search was used to optimize the model hyperparameters, yielding a model capable of learning from high-dimensional, nonlinear feature spaces.
(4) The model achieved R² values of 0.80 and 0.82 at speeds of 10 km/h and 70 km/h, respectively. These results support the effectiveness of the proposed framework, which combines multi-view fractal features with XGBoost, in characterizing texture-friction relationships. Unlike prior approaches that use either standard texture indices or single-view fractal parameters, our method provides a more granular, scale-aware assessment of anti-slip performance and enables a potential transition from contact-based testing to efficient, non-contact friction evaluation.
(5) Through analysis of the XGBoost model for pavement anti-slip performance evaluation, it was determined that multiple factors significantly influence anti-slip behavior. These factors encompass the spatial morphology of texture, cross-sectional texture properties, fractal characteristics of profiles at varying texture depths within the multi-view fractal dimension framework, as well as road surface temperature. Furthermore, the relative importance of these factors was found to vary depending on the vehicle speed regime.
In addition, although 221 datasets were used in this study, which is relatively large for high-resolution pavement texture research, the sample size is still limited compared to typical machine learning applications. The higher training R² values compared to testing results suggest possible mild overfitting, which we have mitigated through cross-validation and regularization. However, several limitations remain: (i) the dataset was collected from only two locations in Oklahoma, which may restrict the generalizability of the findings; (ii) only surface temperature was considered as an environmental factor, while humidity, rainfall, and seasonal cycles were not included; and (iii) only two representative speeds were analyzed. Therefore, future studies should expand to larger geographic regions, employ larger-scale datasets, incorporate multiple environmental variables, and investigate a wider range of vehicle speeds to further validate and generalize the proposed framework.