1. Introduction
The average highway mileage of the three countries with the largest total highway networks reached 6 million kilometres in 2024 [1]. With highway mileage continuously increasing, maintenance is playing a more critical role than ever. Some countries, such as the US and Denmark, have already allocated a significant part of their road-related budgets to maintenance [2].
Considering pavement maintenance, the evaluation of pavement condition is a critical component of infrastructure management, necessitating robust assessment techniques. Two primary methodologies are utilised: destructive testing, such as coring, and non-destructive testing (NDT), exemplified by the Falling Weight Deflectometer (FWD) or the Traffic Speed Deflectometer (TSD) [3]. Coring involves the extraction of cylindrical pavement samples to directly characterise material properties, including compressive strength and asphalt content, as well as structural attributes, such as layer thickness and interlayer bonding. Despite its precision, this method causes permanent structural damage, requiring subsequent repairs that elevate costs and disrupt traffic flow. Conversely, NDT assesses pavement condition without compromising structural integrity, offering greater efficiency and the ability to cover larger areas than destructive methods. The FWD, a widely adopted NDT technique, generates a dynamic load pulse by dropping a calibrated weight onto a 300 mm diameter circular load plate, inducing vertical deformations that form a deflection basin (Figure 1) [4]. These deformations are measured using geophones positioned at the load centre and at multiple radial distances, enabling rapid data acquisition over extensive pavement sections. The collected deflection data facilitate the prediction of pavement layer moduli and thicknesses through back-calculation techniques [5].
Back-calculation employs an iterative numerical optimisation process to determine pavement properties. This process begins with an initial estimate of pavement properties, followed by a forward calculation to simulate the FWD test and compute the predicted deflection response. The computed response is then compared to the actual FWD deflection data, yielding a value of an objective function that quantifies the agreement between estimated properties and observed measurements. Through an optimisation procedure, the estimated properties are iteratively adjusted to minimise the objective function, refining the accuracy of the pavement property predictions [6]. Compared to coring, which is labour-intensive and limited to discrete locations, the FWD enables efficient, large-scale assessments with minimal disruption. While coring provides precise, localised material data for detailed forensic analysis, NDT methods like the FWD preserve pavement integrity, making them ideal for routine monitoring and broad structural evaluations. The selection of an appropriate method depends on the project objectives, balancing the need for detailed material characterisation against the advantages of rapid, non-invasive assessment for effective pavement management.
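For illustration, the sketch below pairs a toy surrogate forward model (a stand-in for a layered-elastic solver such as JPav or BISAR; its decay constants and moduli are purely illustrative assumptions, not the study's method) with a SciPy optimiser that iteratively minimises the RMS misfit between computed and measured deflections, mirroring the loop described above.

```python
import numpy as np
from scipy.optimize import minimize

OFFSETS = np.array([0.0, 0.2, 0.3, 0.45, 0.6, 0.9, 1.2, 1.5, 1.8, 2.1])  # geophone offsets, m

def forward_deflections(moduli, offsets=OFFSETS):
    """Toy stand-in for a layered-elastic solver (JPav, BISAR, ...).

    Each term mimics one layer's contribution: stiffer layers deflect less,
    and deeper layers dominate the outer geophones. Purely illustrative.
    """
    e_as, e_gr, e_su = moduli  # layer moduli, MPa
    return (30.0 / e_as * np.exp(-offsets / 0.3)
            + 100.0 / e_gr * np.exp(-offsets / 0.9)
            + 500.0 / e_su * np.exp(-offsets / 2.0))

# Synthetic "FWD record" generated from assumed true moduli.
measured = forward_deflections(np.array([4000.0, 250.0, 80.0]))

def objective(log_moduli):
    # Optimising log-moduli keeps the estimates positive and well scaled.
    predicted = forward_deflections(np.exp(log_moduli))
    return np.sqrt(np.mean((predicted - measured) ** 2))  # RMS misfit

x0 = np.log([2000.0, 400.0, 120.0])  # initial estimate of (Eas, Egr, Esu)
result = minimize(objective, x0, method="Nelder-Mead")
print("Back-calculated moduli [MPa]:", np.exp(result.x).round(1))
```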
Various programmes are available to analyse pavement structures, such as ELSYM5, WESLEA, and BISAR, as well as back-calculation tools like ELMOD. In static analysis, the peak load and deflection are used to calculate the thickness and moduli of pavement layers. In contrast, dynamic analyses such as DBALM, FEM, and 3D-Move [7,8,9] use the recorded force–time functions directly to predict the required results. However, based on research by Tarefder [10], the results from different back-calculation software packages differ, and the final output can deviate from laboratory results because of the underlying calculation algorithm.
Unlike traditional back-calculation analytical algorithms, machine learning (ML) is a popular tool for analysing similar tasks [11,12,13,14]. The Artificial Neural Network (ANN) is one of the most popular analysis tools, allowing reasonable predictions without reference to the physical phenomena underlying the analysed problems [15,16,17]. Pure ANN and hybrid models, e.g., ANN combined with a Genetic Algorithm, have been utilised to predict pavement moduli. The results showed that ANN models improved prediction compared to traditional back-calculation software, and the hybrid ANN model showed stronger generalisation ability than the traditional ANN [18,19,20].
Ensemble models also include popular ML models, e.g., RF (Random Forest) and GBM (Gradient Boosting Machines), which are commonly used to predict pavement properties. Sudyka et al. [21] used RF, ANN, and BT (Bagged Trees) models to predict asphalt layer temperature based on data obtained from FWD and TSD. The results showed that all models had good prediction accuracy, with R² values exceeding 0.8. Worthey et al. [22] predicted the dynamic modulus of asphalt mixtures with a Bagged Trees ensemble model, which exhibited significantly better prediction accuracy for this property than some ANN models.
One of the ensemble models, XGBoost, performs better than other tree boosting models in practice [23]. Despite this advantage, there is a lack of literature on back-calculation in specific FWD applications, although its performance has been proven in many other pavement analysis applications. Wang et al. [24] found that its performance in predicting the International Roughness Index of rigid pavements surpassed that of other ensemble models. In addition, Ali et al. [25] utilised XGBoost to predict the dynamic modulus of asphalt concrete mixtures, with results that significantly outperformed some well-known regression models, including Witczak, Hirsch, and Al-Khateeb. Ahmed [26] built both RF and XGBoost models to predict pavement structural conditions based on data derived from the Long-Term Pavement Performance (LTPP) programme. The results demonstrated that XGBoost outperformed RF and had practical advantages over empirical equations. Zhu [27] compared ANN, SVM, KNN, and a combined RF–XGBoost model to propose a pavement maintenance decision model, and found that the combined RF–XGBoost model achieves a classification accuracy of 93.1%, surpassing the other ML models.
In short, ML models, e.g., ANN and ensemble models, are increasingly utilised for analysing pavement properties, offering advantages over traditional methods. One of the ensemble models, XGBoost, delivers better performance than other tree boosting models and, in some cases, outperforms the ANN. However, there is little research on FWD back-calculation using XGBoost. Therefore, this study focuses on FWD back-calculation predictions with XGBoost models.
3. Database Generation and Its Analyses
In order to generate a database, i.e., sets of input data (all parameters of the layered structure, such as layer thicknesses and their material properties) and output data in the form of deflections at 10 distinguished points (as in the FWD test), a boundary value problem was formulated as in Figure 1. This is an axisymmetric problem of an infinite layered half-space symmetrically loaded on a circular area of radius a, with a load of 40 kN, corresponding to half of the 80 kN standard axle load.
A database for training the machine learning models was generated using the JPav programme, developed by one of the authors. JPav is static analytical software that calculates the deflection at a given point from the properties of the pavement layers.
The programme allows the stresses/strains of a multi-layered pavement with linear-elastic, isotropic behaviour to be calculated by applying the biharmonic function proposed by Burmister [28]. The stresses, strains, and displacements are calculated by substituting the biharmonic function into the equations of elasticity theory. The programme reproduces the results provided by BISAR and other primary programmes developed to calculate stresses, strains, and displacements in multi-layered pavements. The programme was developed to calculate the vertical displacement on the theoretical basis presented in Appendix A; see also [29]. In the interpretation of FWD measurements, it is considered valid to employ static layered-elastic analysis and to perform a back-calculation of pavement layer moduli, provided that the resulting parameters remain consistent with the mechanistic–empirical design framework [30].
The pavement layer properties comprise the thickness (Has) and modulus (Eas) of the asphalt layer, the thickness (Hgr) and modulus (Egr) of the granular layer, the thickness (Hsu) and modulus (Esu) of the subgrade layer, and the modulus (Eil) of the infinite subgrade layer. The deflections at ten measuring points, d1 to d10, are the output of the software; the distances of the measuring points from the load centre are 0, 0.2 m, 0.3 m, 0.45 m, 0.6 m, 0.9 m, 1.2 m, 1.5 m, 1.8 m, and 2.1 m, respectively.
JPav requires at least 11 input parameters to obtain results. The four constant inputs are the number of loads, the magnitude of the load (kN), the radius of the circular contact area (m), and the Poisson's ratio of each layer, set at 1, 40, 0.15, and 0.35, respectively. To obtain a continuous database for later analysis, the remaining inputs, i.e., the properties of each layer, were generated using the random uniform function, and the number of decimal places for each value was determined by the corresponding interval value, as shown in Table 1. In total, 65,824 datasets were randomly generated (see the flowchart in Figure 2), consisting of the input data (geometrical and mechanical parameters of the structure) and output data (deflection values at points d1–d10).
This large dataset allows for the effective training and validation of the developed models. The limits adopted for thickness and stiffness are typical values found in pavements. In the field, any combination of these thickness and stiffness values of the pavement layers can occur; therefore, the cases included in the database are valid for characterising a pavement.
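A minimal sketch of this sampling step is given below; the bounds shown are assumed for illustration (the actual intervals are those of Table 1, except the Hsu range, which is quoted in Section 5.4), and numpy's uniform generator plays the role of the random uniform function.

```python
import numpy as np

rng = np.random.default_rng(42)
N = 65_824  # number of generated structures

ranges = {                 # (low, high) bounds; values assumed for illustration
    "Has": (0.05, 0.40),   # asphalt thickness, m
    "Hgr": (0.10, 0.60),   # granular thickness, m
    "Hsu": (0.25, 10.0),   # subgrade thickness, m (range quoted in Section 5.4)
    "Eas": (1500, 20000),  # asphalt modulus, MPa
    "Egr": (100, 800),     # granular modulus, MPa
    "Esu": (20, 400),      # subgrade modulus, MPa
    "Eil": (20, 400),      # infinite-layer modulus, MPa
}
inputs = {k: rng.uniform(lo, hi, N) for k, (lo, hi) in ranges.items()}
# Each sampled row would then be passed to JPav to compute deflections d1-d10.
```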
Considering a continuous and even distribution of the thickness and Young's modulus of each layer within the above ranges, the database was generated using JPav. The distribution of the database's input parameters is presented in Figure 3.
It should be noted that the randomly chosen geometric parameters (i.e., layer thicknesses), as well as the mechanical parameters (layer stiffness moduli), together define the pavement structure as having a specific global stiffness (compliance). To evaluate the generated database, the compliance of individual pavement layers and the total compliance of the layers (excluding the native soil layer of infinite thickness) were assessed. The compliance was assessed assuming a uniaxial stress state (compression) in a layer of isotropic elastic material, following the formulas resulting from the theory of elasticity [31,32].
where i = as, gr, su, il; Hi represents the thickness of a single layer; Ei is the Young's modulus of that layer; and ν is Poisson's ratio, herein ν = 0.35. The summary compliance can be calculated using the following sum:

SC = Σi Ci  (2)
Assuming that, at some depth H = 15 m − Has − Hgr − Hsu, there is a rigid layer top surface, the total compliance may be determined based on a displacement boundary condition. In this type of compliance estimation, the local nature of the load in the FWD test is not taken into account. Therefore, an estimation of the (bending) compliance based on Kirchhoff's thin plate theory was also proposed [33] to determine the compliance of each layer according to the following formula:
The summary bending compliance (SBC) can be determined analogously to the axial compliance (see Equation (2) and the assumptions presented below this equation). The actual compliance in the FWD problem lies somewhere between SC and SBC.
Following the equations above for the analysed database, the resulting compliance distributions are shown in Figure 4. The dataset shows a predominance of layers with lower compliance, indicating higher stiffness in most simulated structures.
The compliance of individual layers contributes to the overall compliance of the pavement structure.
Figure 5 presents histograms of the total compliances, SBC and SC. These histograms do not conform to normal distributions but resemble gamma distributions. The normality of the distributions was checked using the Anderson–Darling criterion, and the fit to the gamma distribution was also assessed (see also Appendix B). Comparing the distributions of total compliance with those of individual layers reveals a shift in their characteristics. The database predominantly contains cases with medium compliance, with fewer cases exhibiting low compliance. Cases with high compliance are also numerous.
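The checks can be reproduced with SciPy along the following lines; the gamma-distributed stand-in data and the Kolmogorov–Smirnov goodness-of-fit step are illustrative assumptions, not the study's exact procedure.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
total_compliance = rng.gamma(shape=2.5, scale=1.0, size=60_000)  # stand-in data

# Anderson-Darling test for normality: reject normality when the
# statistic exceeds the critical value at the chosen significance level.
ad = stats.anderson(total_compliance, dist="norm")
print(f"A^2 = {ad.statistic:.2f}, 5% critical value = {ad.critical_values[2]:.2f}")

# Maximum-likelihood gamma fit, then a goodness-of-fit check.
shape, loc, scale = stats.gamma.fit(total_compliance)
ks = stats.kstest(total_compliance, "gamma", args=(shape, loc, scale))
print(f"Gamma fit: shape={shape:.2f}, scale={scale:.2f}, KS p-value={ks.pvalue:.3f}")
```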
The section above presents the analysis of the input data for the ML algorithm. The data were processed using JPav to generate the output dataset, comprising ten deflection points (d1–d10) that define the deflection basin. The output data underwent statistical evaluation, with the results presented in Table 2. Additionally, multicollinearity assessment is essential for high-dimensional datasets. Elevated multicollinearity within a dataset complicates the isolation of the individual effects of correlated features on the target variable, potentially compromising the interpretability and reliability of predictive models [34]. A Pearson correlation matrix was computed, with the results visualised as a heatmap in Figure 6. To facilitate analysis across varying distances, each distance (r) was normalised by the radius of the loading plate (a), as described below:

r̄ = r/a
where r̄ is the normalised distance, a is the radius of loading (herein a = 0.15 m), and r is the radial distance of the measuring point from the axis of symmetry, in m.
Analysis of the correlation matrix heatmap (Figure 6) reveals a strong correlation between neighbouring measurement points within the range d1 to d5. Beyond this range, deflections correlate strongly not only with adjacent points but also with two or three subsequent points. Consequently, reducing the number of measurement points in the range d6 to d10 (for r̄ ≥ 6) is unlikely to significantly affect the accuracy of the data used for back-analysis in interpreting FWD test results.
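A compact sketch of this computation, using pandas and matplotlib with stand-in data in place of the generated deflections, is shown below.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Stand-in frame; in the study, `df` holds the deflection columns d1..d10.
df = pd.DataFrame(np.random.default_rng(1).random((1000, 10)),
                  columns=[f"d{i}" for i in range(1, 11)])

corr = df.corr(method="pearson")           # Pearson correlation matrix
fig, ax = plt.subplots()
im = ax.imshow(corr.values, vmin=-1, vmax=1, cmap="coolwarm")
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(corr.columns)
ax.set_yticks(range(len(corr.columns)))
ax.set_yticklabels(corr.columns)
fig.colorbar(im, ax=ax)
plt.show()
```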
Table 2 summarises the average, maximum, and minimum deflection values, which characterise a typical deflection basin. All skewness values are positive, indicating right-skewed distributions. Kurtosis values are less than 3 (the kurtosis of a normal distribution), indicating platykurtic distributions with fewer outliers. This is confirmed by the absence of outliers within the homogeneous interval [Q1 − 1.5·IQR; Q3 + 1.5·IQR]. Approximately 3% of the data were identified as outliers and removed from the dataset prior to training the machine learning models. After removing these outliers, the dataset contained 60,687 data points.
As can be seen from the generated database, cases identified as outliers were removed. At this stage, it is difficult to assess the precise impact of this removal on the predictive performance of the trained machine-learning models. Accordingly, after the models were developed, a subsequent verification was carried out, as presented in Appendix C, to evaluate the effect of outlier removal. This was achieved by comparing the performance of models trained on the full dataset (including outliers) with those trained on the cleaned dataset.
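A minimal sketch of such an IQR-based screen is given below; the column-wise rule and the function name are illustrative, not the exact implementation used in the study.

```python
import pandas as pd

def remove_iqr_outliers(df: pd.DataFrame, cols) -> pd.DataFrame:
    """Drop rows falling outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] in any column."""
    mask = pd.Series(True, index=df.index)
    for c in cols:
        q1, q3 = df[c].quantile([0.25, 0.75])
        iqr = q3 - q1
        mask &= df[c].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return df[mask]

# clean = remove_iqr_outliers(data, [f"d{i}" for i in range(1, 11)])
# In the study, this kind of screen removed ~3% of rows, leaving 60,687 samples.
```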
5. ML Models Evaluation
5.1. Model Training with the Whole Dataset
Based on the preparation in the previous section, models were trained using XGBoost (version 2.1.4) [42]. Their performance was evaluated using the coefficient of determination (R²), the root mean square error (RMSE), and the Residual Variance (RV), with the results presented in Table 5.
The results indicate that the models predicting Has, Hsu, and Esu achieved strong performance, with R² values exceeding 0.8. However, the models for Hgr and Eil exhibited poor performance, with R² values below 0.1.
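A minimal sketch of the per-target training and evaluation loop is shown below; the train/test split and hyperparameter values are illustrative assumptions rather than the tuned settings of the study.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_squared_error
from xgboost import XGBRegressor

def train_one_target(X, y):
    """Train one XGBoost regressor and report R2, RMSE, and residual variance."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    model = XGBRegressor(n_estimators=500, max_depth=6, learning_rate=0.05)
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    r2 = r2_score(y_te, pred)
    rmse = np.sqrt(mean_squared_error(y_te, pred))
    rv = np.var(np.asarray(y_te) - pred)   # residual variance (RV)
    return model, r2, rmse, rv

# One model per target property:
# results = {t: train_one_target(X, targets[t])
#            for t in ["Has", "Hgr", "Hsu", "Eas", "Egr", "Esu", "Eil"]}
```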
To interpret these models, feature importance (FI) and SHAP violin plots were employed, as detailed below. FI quantifies the influence of each input feature on the model's predictions [43], as presented in Figure 7.
Figure 7 illustrates the feature importance calculated using the Gain metric for each target. The results reveal that deflections at positions d1, d6, and d10 exert the most significant influence on the prediction of Has, Esu, and Hsu, respectively, with importance values exceeding 0.5. Notably, d10 also demonstrates a substantial impact on the prediction of Eil. For the prediction of Eas, deflections at d1 and d2 exhibit greater influence compared to other features, with importance values of approximately 0.2. In contrast, the remaining features show relatively minor influence on predictive performance, with importance values generally remaining below 0.2.
To elucidate the contribution of each feature to the model, SHAP (SHapley Additive exPlanations) violin plots were employed. These plots visualise the SHAP values, which quantify the contribution of each predictor to the model's output. The violin plots illustrate the distribution of these contributions, highlighting the magnitude and direction (positive or negative) of each feature's impact on predictions. The SHAP violin plots are presented in Figure 8.
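A sketch of the SHAP computation is given below, reusing the model and test-split names from the previous sketch (assumptions carried over); TreeExplainer is the standard SHAP choice for tree ensembles such as XGBoost.

```python
import shap

# `model` and `X_te` are assumed from the training sketch above.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_te)          # per-sample contributions, shape (n, 10)
shap.summary_plot(shap_values, X_te, plot_type="violin")
```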
Analysis of the SHAP violin plots reveals distinct dominant features for the various target variables. Specifically, for the targets Has, Hgr, Hsu, Eas, Egr, Esu, and Eil, the corresponding dominant features are the deflections at positions d1, d10, d10, d1 and d2, all positions, d6 and d10, and d10, respectively. Examination of Figure 7 and Figure 8 indicates that models exhibiting high predictive performance, such as those for Has, Hsu, and Eas, are characterised by a limited number of dominant features. For instance, in the violin plot for Has (Figure 8a), the deflection at d1 emerges as the dominant feature, demonstrating a consistent influence on predictions, as evidenced by its broader span along the x-axis compared to other features. Additionally, certain features exhibit minimal importance in the predictive tasks, prompting consideration of feature reduction techniques, which are discussed in subsequent sections. In Figure 9, the residual plots for each target feature are presented.
5.2. Model Training with PCA
Principal Component Analysis (PCA) is a widely utilised dimensionality reduction technique that identifies a reduced set of orthogonal features, or principal components, capable of representing the original dataset in a lower-dimensional subspace while minimising information loss [44]. The correlation analysis detailed in Section 3, coupled with the extensive dataset comprising 60,687 samples, indicates a highly correlated and voluminous database, rendering it well-suited for PCA.
During the data pre-processing phase, the methodology closely followed the procedures outlined previously, with the sole distinction occurring after data scaling. At this stage, the PCA transformation was applied to the entire dataset. The explained variance ratio of each principal component is illustrated in Figure 10. To ensure that 99% of the variance is accounted for, three principal components (PC1, PC2, and PC3) were selected for model training. These components were subsequently utilised to train the predictive models. The results of the trained models are presented in Figure 11.
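A sketch of this pre-processing step is shown below; passing n_components=0.99 to scikit-learn's PCA retains the smallest number of leading components explaining 99% of the variance (three here, per Figure 10).

```python
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Standardise the ten deflections, then project onto the leading components.
pca_pipeline = make_pipeline(StandardScaler(), PCA(n_components=0.99))
X_pca = pca_pipeline.fit_transform(X)   # X: (n_samples, 10) deflection matrix
print(pca_pipeline.named_steps["pca"].explained_variance_ratio_)
```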
Subsequently, XGBoost regression models were trained on the whole dataset using the same hyperparameters as those applied to the PCA-processed dataset. The performance of the models for each target property was evaluated by R², RMSE, and RV. The results are presented in Table 6 (for the entire dataset) and Figure 11 (for the reduced database after applying the PCA approach).
Based on the results presented in Table 6 and the accompanying figures, the trained models for Has, Hsu, and Esu exhibit satisfactory predictive performance, achieving R² values exceeding 0.8. In contrast, the predictive performance of the remaining models is suboptimal. A comparison with the models trained in Section 5.1 reveals that the models exhibiting both high and poor predictive performance are consistent across datasets pre-processed with and without PCA. However, the predictive capabilities of the models for most properties show a slight decline, as indicated by reductions in R² and increases in RMSE, with the notable exception of the Has model, which maintains nearly equivalent predictive performance. Considering the dimensionality reduction from 10 features to 3, there is a significant speed-up in model training compared to training on the entire feature set; see the results presented in Table 7. All values in the table are means over all models (Has, Hgr, etc.), so the unit of RMSE is omitted.
The table above demonstrates that PCA yields superior computational performance compared to the raw database in terms of optimisation time, fitting time, total time, and model size. Notably, PCA required only approximately 40% of the training time relative to the full dataset, while reducing the model size to 40%.
5.3. Model Training with FDM-like Approach
The Feature Difference Method (FDM) can be used to generate a new database comprising deflection differences between adjacent points, as described, among other sources, in [45]. In the present study, the FDM was not applied directly. Instead, drawing on the available literature and established experience in the interpretation of FWD measurements, a set of features was selected, consisting of differences between the deflections recorded by individual geophones.
Based on previous studies [46,47,48], four variables were employed (SCI, BDI, BCI, and CI), of which SCI (Surface Curvature Index), BDI (Base Damage Index), and BCI (Base Curvature Index) are deflection basin parameters, and the variable d7 − d8 (referred to here as CI) was introduced by Chen [47]. These variables are defined as follows:
SCI = d1 − d3, BDI = d3 − d5, BCI = d5 − d6, CI = d7 − d8

The variables SCI, BDI, BCI, and CI represent the deflection differences between adjacent points, while d1, d3, d5, d6, d7, and d8 correspond to the deflections measured at the 1st, 3rd, 5th, 6th, 7th, and 8th points, respectively.
The use of specific deflections, and particularly of the differences between them, allows the deflection basin to be defined straightforwardly while reducing the amount of data. Typically, the FWD test produces a deflection basin with up to nine points, each representing the deflection measured by a specific sensor. This representation allows the pavement layer moduli to be calculated through back-analysis. If only the condition of the pavement, in terms of the asphalt layer, granular layers, and subgrade, is required, the amount of data defining the deflection basin can be reduced, typically by using the deflection basin parameters.
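With the deflection columns d1–d10 available, the four indices can be computed directly, following the definitions as reconstructed above (a sketch; column names assumed):

```python
import pandas as pd

def basin_indices(df: pd.DataFrame) -> pd.DataFrame:
    """Derive the four basin indices from deflection columns d1..d10."""
    return pd.DataFrame({
        "SCI": df["d1"] - df["d3"],  # Surface Curvature Index
        "BDI": df["d3"] - df["d5"],  # Base Damage Index
        "BCI": df["d5"] - df["d6"],  # Base Curvature Index
        "CI":  df["d7"] - df["d8"],  # Chen's index
    })
```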
The model training process closely mirrored the methodology described previously, with the sole distinction being the substitution of the training database with the variables SCI, BDI, BCI, and CI during the data pre-processing stage. Following model training and prediction, the performance was assessed using R², RMSE, and RV. The results are presented in Table 8 and visualised in Figure 12.
Based on the results presented in Table 8 and the accompanying figures, the predictive performance of the models varies significantly across target properties. Only the model for Has (Figure 12a) achieves satisfactory performance, with an R² value exceeding 0.8. The models for Eas (Figure 12d) and Esu (Figure 12f) approach this threshold, with R² values close to 0.8. However, the remaining models exhibit poor predictive performance. Notably, the models for Hgr (Figure 12b) and Eil (Figure 12g) yield R² values close to zero, indicating an extremely poor fit to the dataset compared to the other trained models.
5.4. Model Assessment, Heteroscedasticity, and Sensitivity Analyses
To facilitate a direct comparison of the predictive models trained using the different pre-processing methods, the R² and RV values are compiled in Table 9.
In summary, a comparative analysis of the models trained in this study reveals that the models utilising the full deflection data from 10 measuring points outperform both the PCA-pre-processed and FDM-pre-processed models. Notably, the models for Hsu and Esu achieve R² values exceeding 0.9, indicating exceptional predictive capability. In contrast, models trained with PCA-pre-processed data exhibit moderate predictive performance, with nearly all models failing to surpass the predictive accuracy of those trained on the full deflection data. However, these PCA-based models display a similar trend in predictive performance, excelling in predictions for Has, Hsu, and Esu while performing poorly for the other properties. Given that the PCA-pre-processed dataset is less than half the size of the full dataset, this approach significantly reduces computational resource demands, particularly for large databases.
Models trained with FDM-pre-processed data demonstrate superior performance in predicting Has, achieving an R² of approximately 0.919, which surpasses all other models, as well as notable performance for Eas. However, their predictive capabilities for the remaining properties are markedly inferior to those of both the whole-dataset and PCA-pre-processed models.
The observed-versus-predicted plots (Figure 9, Figure 10, Figure 11 and Figure 12) reveal that the residual scatter is not consistent across the range of each target variable. In nearly all models, regardless of whether training was performed on raw deflections, PCA components, or FDM indices, the vertical spread of the residuals visibly increases with the predicted values, resulting in a characteristic funnel or fan-out shape. This pattern is particularly pronounced for the best-performing targets (Has, Hsu, Esu), where the highest densities of thick or stiff layers coincide with the largest prediction errors. The systematic increase in residual variance with the magnitude of the predicted parameter clearly indicates the presence of heteroscedasticity. Although XGBoost, a Gradient Boosting Decision Tree (GBDT) algorithm, does not assume homoscedasticity and is generally robust to its presence [49], a formal evaluation remains appropriate. Even after removing all identified outliers, the database comprises approximately 60,000 samples, which can render visual inspections of residual plots sensitive to local point density. Accordingly, a quantitative Decile Variance Check was employed to rigorously assess heteroscedasticity. The results of this test are presented in the tables below.
The decile variance check performed on the trained models (Table 10, Table 11 and Table 12) reveals clear differences in residual behaviour depending on the input representation.
When the models are trained on the full set of ten deflection points, the residual variance remains remarkably stable for most targets: Has, Hgr, Eas, Egr, Esu, and Eil exhibit variance ratios between the lowest and highest deciles of only 1.0–1.3, indicating essentially homoscedastic behaviour. The sole exception is the subgrade thickness Hsu, which exhibits a pronounced heteroscedastic pattern (variance ratio ≈ 4.8), a natural consequence of its very wide range of generation (0.25–10 m), over which absolute prediction errors inevitably increase with the magnitude of the target.
Applying Principal Component Analysis markedly alters this picture. Although the first three components still capture 99% of the variance, the dimensionality reduction introduces or amplifies heteroscedasticity for several parameters. While Has and Hgr remain reasonably stable, Eas, Egr, Esu, and especially Hsu now display variance ratios of 3 to 7, demonstrating that the information discarded in the lower-variance components is particularly important for maintaining uniform predictive precision across stiff or thick structures.
The strongest heteroscedasticity appears with the basin indexes (SCI, BDI, BCI, CI). Even the best-performing target, Has, exhibits a variance ratio exceeding 3, while Eas, Esu, and Hsu reach ratios of 6 to 10. The aggressive compression of the deflection basin into just four engineered parameters causes small variations in extreme or noisy basins to be magnified into substantially larger relative errors, particularly for thicker and stiffer pavement configurations.
Overall, the complete ten-point deflection dataset produces the most homoscedastic residuals and therefore the most stable predictions across the entire range of realistic pavement structures. Both dimensionality-reduction approaches, and especially the use of traditional deflection-basin indexes, systematically increase heteroscedasticity, meaning that any gain in computational speed or simplicity comes at the cost of reduced prediction reliability for thick or stiff layers.
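A minimal sketch of the check is given below: test samples are binned into deciles of the predicted value, the residual variance is computed per bin, and the max/min variance ratio quoted in the tables follows directly (function name illustrative).

```python
import numpy as np
import pandas as pd

def decile_variance_check(y_true, y_pred):
    """Residual variance per decile of the predicted value, plus max/min ratio."""
    resid = pd.Series(np.asarray(y_true) - np.asarray(y_pred))
    deciles = pd.qcut(np.asarray(y_pred), q=10, labels=False, duplicates="drop")
    var_by_decile = resid.groupby(deciles).var()
    ratio = var_by_decile.max() / var_by_decile.min()
    return var_by_decile, ratio  # ratio near 1 suggests homoscedastic residuals
```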
To evaluate the influence of individual input parameters on the predicted targets, a One-at-a-Time (OAT) sensitivity analysis was conducted [50]. A baseline input vector was established using the arithmetic mean of the test dataset. Afterwards, each feature was individually perturbed by increasing and decreasing its raw value by 20% while holding the other variables constant. Crucially, these perturbed inputs were processed through the full Standard Scaler and PCA transformation pipeline before inference, ensuring that the XGBoost model evaluated the physical changes within the correct feature space. The resulting sensitivity was quantified by the "Total Range", defined as the absolute difference between the predictions of the high- and low-perturbation scenarios, and subsequently normalised as a percentage of the total accumulated range to rank the relative importance of each component.
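A sketch of this procedure is given below; `pipeline` denotes the fitted scaler-plus-PCA transformer from the earlier sketch, and the function name and arguments are illustrative assumptions.

```python
import numpy as np

def oat_total_range(model, pipeline, X_test):
    """Normalised OAT sensitivity: +/-20% perturbation of each raw feature."""
    baseline = X_test.mean(axis=0)          # arithmetic mean of the test set
    ranges = {}
    for name in X_test.columns:
        hi, lo = baseline.copy(), baseline.copy()
        hi[name] *= 1.2                     # +20% perturbation
        lo[name] *= 0.8                     # -20% perturbation
        # Push both raw scenarios through the scaler+PCA pipeline before inference.
        scenarios = pipeline.transform(np.vstack([hi.values, lo.values]))
        p_hi, p_lo = model.predict(scenarios)
        ranges[name] = abs(p_hi - p_lo)     # "Total Range" for this feature
    total = sum(ranges.values())
    return {k: 100.0 * v / total for k, v in ranges.items()}  # % of accumulated range
```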
Models trained on the full deflection dataset (d1–d10) and on PCA-transformed data showed consistent sensitivity patterns: the predicted values of Has, Hsu, Esu, and Eas were most sensitive to changes in the dominant deflection sensors identified by feature importance and SHAP (primarily d1 for asphalt-layer parameters, d6–d10 for subgrade and deeper-layer parameters). Perturbations of the key sensors produced relative response changes typically between 15% and 35%, whereas perturbations of less important sensors induced changes below 5%.
For poorly predicted targets (Hgr and Eil), the total sensitivity range remained extremely low (<6%, even for the most influential sensors), confirming that realistic variations in deflections within the generated database contain almost no information about granular-layer thickness and infinite-subgrade stiffness. This insensitivity, rather than model deficiency, explains the persistently low R² values.
Models trained on the basin indexes (SCI, BDI, BCI, CI) exhibited markedly higher sensitivity for the well-predicted parameters (Has and Eas), with perturbation ranges reaching up to 45% for SCI and BDI, reflecting the strong concentration of predictive power in these engineered basin parameters. Conversely, sensitivity to BCI and CI was negligible for most targets, and overall sensitivity for Hgr, Egr, and Eil remained near zero, reinforcing the conclusion that these four indices alone cannot resolve granular and deep-subgrade properties.
In summary, one-at-a-time sensitivity analysis corroborated the feature importance and SHAP findings, quantitatively demonstrating that high-performing predictions are driven by a small subset of highly informative deflection measurements (or derived indices).