1. Introduction
Pavement networks are an integral component of the critical infrastructure of any country, and they must maintain functionality at a reasonable cost. Effective Pavement Management System (PMS) programs include monitoring pavement distress, e.g., cracks, potholes, and faulting, and considering the numerous environmental and traffic loading factors [1,2]. An adequate PMS ensures the safety and functionality of pavement networks [3,4,5]. PMSs are used to select suitable intervention levels and maintenance plans, considering grip, bearing, and roughness levels [6,7,8,9,10]. Replacing a damaged pavement is expensive and often causes serious traffic delays [5,11,12]. A proactive approach based on continuous, limited pavement maintenance is less invasive and far more efficient: it ensures long-term performance and reduces road congestion while eliminating the safety concerns associated with total rehabilitation (replacement) [5,13]. However, a proactive approach, as an integral part of a modern PMS, necessitates information gathering and examination at several levels [14,15].
Growing attention has been paid to the applications of PMS in urban regions, often driven by the limited economic resources available to transportation agencies [16,17,18,19]. PMSs utilize several pavement performance indices encompassing visual and automatic inspections [10], the most popular of which is the Pavement Condition Index (PCI) [9,20]. PCI is a numerical value that indicates the distress level of the asphalt pavement surface and thus offers a measure of the current state of the pavement [21,22]. The U.S. Army Corps of Engineers proposed using the PCI technique in 1997 [23]. However, PCI is not a direct measure of structural capacity, skid resistance, or road roughness; rather, it is an objective tool for assessing the maintenance and rehabilitation (M&R) needs of a roadway section in the network.
This research aims to develop simpler, faster, and more economical tools for forecasting PCI from pavement distress. As such, this study attempts to build accurate predictive models for PCI assessment. The data were collected through a field survey. Machine learning algorithms, namely random forest (RF), support vector machine (SVM), and decision tree (DT), and a deep learning algorithm, the artificial neural network (ANN), are developed for this purpose, rather than relying on traditional regression or classical software packages. MicroPAVER software version 5.2 is a commonly used tool to estimate the PCI; this traditional method, however, is based on manual data entry, which is time-consuming and cumbersome. The AI-based models developed in this study eliminate this manual effort and save time.
2. Prediction Models of PCI
Several studies are available to model or estimate the PCI using collected pavement distresses. Galehouse et al. (2003) [21,24] identified a number of advantages of using PCI for pavement M&R: it helps improve road systems and supports preventive repair approaches [21,24]. Hajj et al. (2011) [25] explained that the PCI score of a roadway only reflects the surface distress collected from the field and is not a direct measure of structural capacity, skid resistance, or pavement roughness. As such, it is an impartial measure for evaluating the M&R requirements of roadways. In 2015, Alwan conducted site and laboratory experiments to evaluate pavement distress for a main road using the standard PCI technique [26].
Recent studies utilize newer modeling techniques, such as neural networks. Issa et al. (2022a) [27] utilized six pavement distresses and developed an optimized hybrid model to calculate PCI using the Long-Term Pavement Performance (LTPP) database. Their model consists of a cascade architecture with three traditional ML models, in addition to an ANN model. The model predicted PCI with excellent accuracy, with R2 values of 0.998, 0.997, and 0.997 for training, testing, and cross-validation, respectively [27]. Another study by Issa et al. (2022b) [2] explained how the PCI could be predicted through the development of an artificial intelligence (AI) approach. Furthermore, the use of ANN allows incorporating local variables, for instance, the presence of manholes in pavement sections. The results showed that the ANN model outperformed the other models in predicting the PCI with a great level of robustness, with R2 values of 0.997, 0.998, and 0.996 for training, testing, and validation, respectively. The regression slope between measured and predicted PCIs varies between 0.996 and 0.997 [2]. To this end, AI techniques can yield excellent PCI predictions (as illustrated in Figure 1).
The LTPP database can be a useful source for this purpose. Badr et al. (2022) [29] used the LTPP database for nine states of the United States of America to calculate the PCI of flexible pavement. Pavement segments were grouped into two sets. The first set of segments (codes SPS-1, SPS-3, and SPS-8) includes pavement sections with no strengthening overlays. The second set of segments (code SPS-5) comprises pavement sections overlain by a protection layer. The model indicated an excellent prediction of the PCI, with R2 values above 0.8 for almost all segments [29]. Jalal et al. (2017) [28] proposed an enhanced ANN model, selected among different ANN architectures, to estimate PCI. The model was developed using 173 datasets collected by Texas A&M University. All distresses were recognized, assessed, and measured from 2014 to 2016. The R2 values calculated for the training, testing, and validation subsets, as well as all data, were 0.978, 0.965, 0.973, and 0.974, respectively. The high values support the competence of the model [28].
3. Originality
This study proposes a new approach for predicting PCI using AI, particularly ML and deep learning algorithms. The suggested method provides reliable PCI estimates and can be integrated into PMS using widely available spreadsheet software. Unlike traditional methods relying on specialized software like MicroPAVER, this approach offers greater flexibility and accessibility. It enables seamless data import and export, eliminates the need for data re-entry, and provides a faster alternative for PCI calculation, making it particularly useful for regions where specialized software is scarce. Although this study was conducted on urban agricultural roads in Egypt, its approach is applicable to roads under similar climatic and operational conditions. This study uniquely offers a detailed visual and quantitative comparison between the proposed models. It is important to note that similar reported studies are generally limited in area and data size. For instance, the study area of Jalal et al. (2017) [28] is only 22 km2. Also, Issa et al. (2022b) [2] used merely 10 different roads, which yielded a simple ANN model; they also relied on the LTPP database for this purpose.
4. Objective and Methodology
This study attempts to develop a pavement performance model for flexible pavement. To this end, the methodology comprises four steps. As shown in Figure 2, the first step involves collecting data through visual inspection. This is followed by data analysis, including PCI calculation. The data are then used to develop four AI-based models using the following approaches: RF, SVM, DT, and ANN. The fourth step examines the models' performance via validation and error analysis and compares the four developed models. Finally, the statistical performance of the developed models is used to select the most accurate model based on error analysis.
5. Study Area and Data Collection
The data used in this research are based on selected urban flexible road segments in the governorate of Beni Suif, Egypt. The study area is located in an arid, non-freezing region. Approximately 15,000 pavement segments, each 100 m long, were surveyed over 3 years in regions that share the same environmental conditions (e.g., temperature, rainfall) and structural layer thicknesses and materials. The data were collected by integrating a desk study with new field observations. Office data consist of available maps, types of pavement layers and thicknesses, cross-section elements, traffic volumes, costs, and environment-related data provided by the Directorate of Roads and Transport in Beni Suif. The field data consist of geometry-related data and physical distress. The common pavement distresses were alligator cracking, longitudinal cracking, bleeding, rutting, and weathering. The PCI, based on visual examination, accounts for every type of distress, each categorized into three severity levels according to its effect on pavement functionality, structural performance, and ride quality: low (L), moderate (M), and high (H) [30,31].
Data collection of distress (i.e., defective areas) was carried out using devices that facilitate real-time data transfer and downloading. In this study, handheld computers and global positioning system (GPS) technology were used to capture pavement conditions and locate distressed areas of the pavement. The distresses were defined based on the 19 distresses listed in the PAVER system [32]. The PCI values for the 15,000 pavement segments were calculated using MicroPAVER software. The calculated values and observed distresses were used to develop the proposed forecasting models. The classification of the distresses and respective severities considered in this study is included as Supplementary Data (Table S1).
A sample of the statistical characteristics of the training and testing subsets is summarized in Table 1. The table summarizes the statistics of the surveyed distresses, including the number of occurrences (count), standard deviation, and minimum and maximum values. There is a wide range in the values of bleeding, with relatively low means and high maximum values. Upon close inspection, however, these high values are not indicative of outliers or anomalies; rather, they reflect genuine variability, naturally expected in large-scale urban pavement networks. Hence, observing a few segments with considerably larger areas of distress than the rest is not unusual given the data size.
Figure 3 illustrates the distribution of measured PCI values across the 15,000 pavement segments. The distribution exhibits a notable skewness toward higher PCI ranges (70–100), signaling satisfactory to good condition for most of the examined network [33]. Such skewness can influence the performance of predictive models, particularly those sensitive to data distribution, e.g., SVR (using the RBF kernel) and ANN. Generally, unbalanced data distributions can cause models to become biased, providing excellent predictions within the dominant range but potentially underperforming for infrequent or extreme cases (lower PCI values in this study).
6. Development of AI Models
Developing and evaluating the forecasting models for PCI prediction begins with preprocessing the dataset and preparing it for AI analysis. The dataset consists of pavement distresses collected from various pavement segments, each characterized by a set of distress indicators and an associated PCI value. The feature matrix, which contains 51 features (3 severity levels × 17 distress types) and one target variable (PCI), was extracted. Next, the dataset was split into training and testing sets to facilitate model training and evaluation. A standard train-test split strategy was employed, allocating 80% of the data for training and 20% for testing [34]. The adopted split ratio aligns with common practice and widely recommended guidelines in ML. Split ratios of 80–20 or 70–30 allow sufficient training data while reserving an adequately sized independent dataset for model validation and performance evaluation. This ratio also ensures a good balance between model stability and reliable performance for generalization [35].
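A minimal sketch of this preprocessing step is shown below, assuming the survey data have been exported to a CSV file with one column per distress–severity combination and a PCI column; the file name and column labels are illustrative, not those used in the study.

```python
# Sketch of the data preparation and 80/20 split described above.
# Assumes a hypothetical "distress_survey.csv" with 51 distress-severity
# columns (e.g., "alligator_cracking_L", ..., "weathering_H") and a "PCI" column.
import pandas as pd
from sklearn.model_selection import train_test_split

data = pd.read_csv("distress_survey.csv")

X = data.drop(columns=["PCI"])   # 51 distress features (3 severities x 17 types)
y = data["PCI"]                  # continuous target (0-100)

# 80% training / 20% testing, with a fixed seed for reproducibility
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=101
)
print(X_train.shape, X_test.shape)
```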
The PCI was treated as a continuous variable to preserve the full resolution of the pavement performance data. It is common in some studies, however, to convert PCI into ordinal or discrete classes (e.g., “Good”, “Fair”, “Poor”) for simplification or rule-based decision-making. This transformation inevitably leads to a loss of information by discretizing inherently continuous measurements. Modeling PCI as a continuous outcome offers several key advantages: it enables learning algorithms to capture subtle variations in pavement conditions, leading to more precise predictions. Ordinal classification, in contrast, flattens this granularity and may ignore near-boundary effects that are critical in practice. Regression models also provide continuous outputs that can later be mapped to any decision threshold or management category (e.g., for maintenance planning or budget allocation), making them adaptable across agencies with different classification schemes. Continuous PCI predictions can also be directly compared against national or agency-specific performance targets, which allows monitoring deterioration trends without being limited to fixed condition bands.
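To illustrate how continuous predictions can later be mapped onto condition bands, the short helper below applies illustrative thresholds; the band names and cut-off values are assumptions for demonstration only, not bands prescribed by this study.

```python
# Illustrative post-hoc mapping of continuous PCI predictions to condition bands.
# The thresholds below are assumed for demonstration; agencies may use different cut-offs.
def pci_to_band(pci: float) -> str:
    if pci >= 70:
        return "Good"
    elif pci >= 40:
        return "Fair"
    else:
        return "Poor"

predicted_pci = [85.2, 63.7, 31.4]
print([pci_to_band(p) for p in predicted_pci])  # ['Good', 'Fair', 'Poor']
```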
6.1. Support Vector Regression (SVR)
SVR is a supervised learning algorithm that extends the principles of SVMs, proposed by Vapnik, to the regression setting [36]. The SVM is a machine learning technique that can solve a wide range of problems involving sample grouping, non-linearities, and high-dimensional statistics [37,38]. Its principle is based on the Vapnik–Chervonenkis (VC) theory of statistical learning and structural risk minimization, and the optimal solution is sought by constructing an ideal hyperplane [39]. Generally, the dimensionality of the samples is reduced to simplify a problem, whereas the SVM method works the other way: it uses kernel functions to map the samples into a high-dimensional (potentially infinite-dimensional) feature space in which the problem becomes linear [40].
Unlike traditional regression methods, SVR aims to find a hyperplane in a high-dimensional feature space that has the maximum margin with respect to the training data points. This hyperplane serves as the regression function, and SVR seeks to minimize the prediction error while keeping it within a specified margin of tolerance. One of the key advantages of SVR is its ability to handle non-linear relationships between input features and target variables using kernel functions. The radial basis function (RBF) kernel, employed in this study, is particularly well-suited for capturing complex, non-linear relationships between pavement distress indicators and the PCI. Unlike linear or polynomial kernels, the RBF kernel offers flexibility and enhanced performance for datasets exhibiting intricate non-linear patterns, such as pavement distress data. In this study, careful hyperparameter tuning was conducted to select optimal values that maximize the prediction accuracy.
Specifically, for the SVR model, the hyperparameters were systematically optimized using a grid-search methodology combined with 5-fold cross-validation. First, the SVR model was initialized with an RBF kernel for the reasons mentioned earlier. The regularization parameter (C) was tested at 0.1, 1, 4, 10, 50, and 100; the value C = 4 was ultimately selected, as it provided the best balance between model complexity and generalization [40]. Epsilon (ε) was set to 0.1, aligning with the commonly used default value recommended in the literature and balancing prediction accuracy and model robustness. Additionally, several values of gamma (γ) were tested for the RBF kernel [‘scale’, ‘auto’, 0.001, 0.01, 0.1, 1, 10], with the best-performing option identified as ‘scale’, which adjusts automatically based on the dataset’s characteristics. The selected hyperparameters yielded the lowest RMSE and highest R2. This systematic tuning ensured the SVR model achieved robust predictive performance.
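A minimal sketch of this tuning procedure is shown below, using scikit-learn's GridSearchCV with the grids reported above; the feature scaling step and the RMSE-based scoring choice are assumptions, since the study does not state them explicitly.

```python
# Sketch of the SVR grid search with 5-fold cross-validation described above.
# Feature standardization is assumed here; SVR with an RBF kernel is scale-sensitive.
from sklearn.svm import SVR
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV

pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("svr", SVR(kernel="rbf", epsilon=0.1)),
])

param_grid = {
    "svr__C": [0.1, 1, 4, 10, 50, 100],
    "svr__gamma": ["scale", "auto", 0.001, 0.01, 0.1, 1, 10],
}

search = GridSearchCV(pipe, param_grid, cv=5, scoring="neg_root_mean_squared_error")
search.fit(X_train, y_train)   # X_train, y_train from the 80/20 split sketched earlier
print(search.best_params_)     # expected near {'svr__C': 4, 'svr__gamma': 'scale'}
```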
In this trial, separate SVR models were trained. The model fitted on the training dataset was evaluated on unseen testing data to assess performance and generalization. The results for all data and the three groups (i.e., training, validation, and testing datasets) are shown in Figure 4. The scatter plot (left) and kernel density (K-D) plot (right) are indicative of the model performance. The graphs provide visual insight into the model behavior that cannot be fully captured by statistics alone. The high scatter around the line of equal values indicates the model's weakness; the outliers deviate significantly from the diagonal and may represent inaccurate predictions. This may also suggest overfitting or underfitting, as the R2 value is below 0.90. The K-D plot also shows a mismatch between the measured and predicted PCI curves. The error values of this model are high compared to those of the later models, thus indicating low predictability.
6.2. Decision Tree (DT)
DTs are among the most common classification algorithms. A reason for their popularity is their intelligibility and ease of interpretation [35]. DT regression is a non-parametric supervised learning method used for both classification and regression tasks. It works by recursively partitioning the input space into smaller regions and fitting a simple model (e.g., a constant value) within each region. In the case of regression, the predicted value for a given instance is the average (or another appropriate summary statistic) of the target values in the region to which that instance belongs. As their name implies, these algorithms create a classifier tree built on the trends in the dataset. Early versions of DTs, for instance, ID3 and CLS, could only learn from discrete data [41], while later versions (e.g., C4.5) can learn from continuous and discrete variables together [42].
In this study, DT regression was employed to predict PCI based on pavement distresses. The DT model can capture complex non-linear relationships between distress indicators and PCI values, making it well-suited for this predictive task. This study employed the CART (Classification and Regression Tree) algorithm, which utilizes variance reduction (based on MSE) as its splitting criterion. Commonly known criteria such as the ‘Gini Index’ and ‘Entropy’ are specifically associated with classification tasks and do not apply to regression trees; for regression trees, the optimal splits are determined by minimizing the variance (MSE) within each subset after splitting. The DT model was initialized with certain hyperparameters to control its complexity and generalization. The maximum depth of the tree was set to 10 to limit complexity, and the minimum number of samples required to split an internal node was set to 10, ensuring robust splits. Additionally, cost complexity pruning was incorporated by setting the cost complexity parameter (ccp_alpha) to 1 to provide additional pruning. These hyperparameters were selected systematically to optimize predictive performance, prevent overfitting, and enhance generalization.
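The configuration described above could be expressed in scikit-learn roughly as follows; this is a sketch of the reported settings, with the random seed and the evaluation call added only for illustration.

```python
# Sketch of the CART regression tree with the hyperparameters reported above.
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_absolute_error

dt_model = DecisionTreeRegressor(
    criterion="squared_error",  # variance reduction (MSE) split criterion
    max_depth=10,               # limit tree complexity
    min_samples_split=10,       # require enough samples for a robust split
    ccp_alpha=1.0,              # cost-complexity pruning
    random_state=101,           # seed assumed here for reproducibility
)

dt_model.fit(X_train, y_train)
print("Test MAE:", mean_absolute_error(y_test, dt_model.predict(X_test)))
```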
The DT model was trained using the training dataset to learn the underlying patterns between distress indicators and PCI values. The results for all data and the three groups (i.e., training, validation, and testing datasets) are shown in Figure 5. The R2 values of the datasets increase (close to 0.84) while the MAE and RMSE values decrease; the data in the scatter plot lie closer to the line of equal values than for the SVR model, and the K-D curves almost overlap, although some deviation still exists. Thus, the predictions of the decision tree algorithm are more accurate than those of the previous model.
6.3. Random Forest (RF)
RF is an ensemble learning method that combines the predictions of multiple individual DTs to improve predictive performance and robustness. The procedure was proposed by Breiman in 2001 [43,44]. In RF regression, each tree in the ensemble is trained on a random subset of the training data and a random subset of features, resulting in a diverse set of trees that collectively provide more accurate predictions. Ensemble approaches have stood the test of time and have proven to be extremely precise prediction and classification systems [45].
The RF algorithm offers several advantages, including the ability to handle high-dimensional datasets with a large number of features and the capability to capture complex non-linear relationships [46]. The RF model was initialized with hyperparameters tailored to the predictive task through experimentation and cross-validation; hence, the hyperparameters were systematically tuned to ensure optimal performance and robust generalization. Specifically, the maximum depth of each tree was set to 20 to control tree complexity and mitigate overfitting. The minimum number of samples required to split an internal node was set to 60 to ensure each decision split is supported by sufficient data, improving model robustness. Additionally, the maximum number of features considered for splitting at each node was set to the square root of the total number of features, in line with widely accepted practice for regression using RFs. An ensemble of 400 trees was constructed to provide robust and stable predictions by averaging outputs from multiple regression trees. The random state was fixed at 101, ensuring the reproducibility of the results.
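These settings correspond roughly to the scikit-learn sketch below; the parallelization flag and the fit/predict calls are illustrative additions, not details reported by the study.

```python
# Sketch of the random forest regressor with the hyperparameters reported above.
from sklearn.ensemble import RandomForestRegressor

rf_model = RandomForestRegressor(
    n_estimators=400,        # ensemble of 400 regression trees
    max_depth=20,            # control tree complexity
    min_samples_split=60,    # require sufficient data per split
    max_features="sqrt",     # sqrt of total features considered at each split
    random_state=101,        # fixed seed for reproducibility
    n_jobs=-1,               # use all available cores (assumption, for speed)
)

rf_model.fit(X_train, y_train)
pci_pred = rf_model.predict(X_test)
```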
The RF model was trained using the training dataset to learn the underlying patterns between distress indicators and PCI values. The results for all data and the three groups (i.e., training, validation, and testing datasets) are shown in Figure 6. A noticeable improvement over the previous models can be observed, as the R2 value is close or equal to 0.90. This is also confirmed by the data's closer alignment to the neutral line in the scatter plot, as well as by the K-D plot, where the deviation between the curves decreases, showing the difference in accuracy compared to the two previous models.
6.4. Artificial Neural Network (ANN)
ANNs are a class of deep learning models inspired by the structure and function of the human brain. ANNs are capable of learning complex non-linear relationships between input features and target variables. In this study, an ANN model was employed to predict PCI based on pavement distress. ANNs offer the flexibility to capture intricate patterns and relationships in the data, making them suitable for the regression task at hand. A feedforward backpropagation ANN is usually formed from three layers, as shown in Figure 7 [47,48,49,50]. The first layer is the input layer, which is a vector that fully characterizes the input variables. Commonly, the inputs are normalized before being fed into the input layer; this normalization ensures that the ANN is unbiased, as all inputs share a similar range once normalized. In this case, the input variables are the pavement distresses, so the input layer serves as the entry point for the pavement distress data into the neural network. It consists of a number of neurons equal to the number of distress indicators, allowing the model to receive and process information about the condition of the pavement segments [2].
The hidden part of the network comprises two hidden layers, each containing a specific number of neurons responsible for learning and abstracting features from the input data. The first hidden layer comprises 64 neurons, each connected to every neuron in the input layer; this layer captures the primary patterns and relationships present in the distress indicators. The second hidden layer consists of 32 neurons, providing additional capacity for the model to learn complex representations of the input data. Each neuron in this layer integrates information from the previous layer to refine the learned features. The number of neurons in the hidden layers is a tunable hyperparameter. Lastly, the third layer of the ANN is the output layer, which accumulates all the signals transmitted from the hidden layers and performs a series of operations on those signals to produce the output vector. In this case, the output layer consists of a single neuron responsible for predicting the PCI for each pavement segment. As PCI prediction is a regression task, the output neuron produces continuous values without the application of an activation function [2]. The sigmoid activation function in the hidden layers was employed to introduce non-linearity into the model. This activation function enables the model to capture non-linear relationships within the data and improve its predictive performance. The specifications of the ANN architecture are presented in Table 2.
The model parameters are optimized using the Adam optimizer, a stochastic optimization algorithm widely used in neural network training. The MSE loss function is utilized to measure the discrepancy between predicted and measured PCI values and thus drives prediction accuracy. The network was trained for 50 epochs, using the default batch size (equal to the number of samples unless otherwise specified), with no dropout or batch normalization. This architecture was determined empirically using manual tuning, guided by performance on the validation set. Preliminary experiments explored various numbers of hidden layers (ranging from 1 to 3), neurons per layer (16, 32, 64, 128), and activation functions (ReLU, Sigmoid, Tanh). The two-layer, sigmoid-activation configuration showed the best balance between model complexity and performance. While automated tuning methods such as grid search or Bayesian optimization were considered, manual tuning was deemed sufficient for the current dataset size and problem scale.
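The described 64–32 architecture can be sketched roughly as follows; the use of Keras, the absence of input scaling, and the validation-split fraction are assumptions, since the study does not name the framework or these details.

```python
# Sketch of the two-hidden-layer ANN (64 and 32 sigmoid neurons, linear output)
# trained with Adam and an MSE loss for 50 epochs, as described above.
# Keras is assumed here; the original framework is not stated in the text.
from tensorflow import keras

n_features = X_train.shape[1]  # 51 distress features

ann_model = keras.Sequential([
    keras.layers.Input(shape=(n_features,)),
    keras.layers.Dense(64, activation="sigmoid"),  # first hidden layer
    keras.layers.Dense(32, activation="sigmoid"),  # second hidden layer
    keras.layers.Dense(1),                         # linear output: continuous PCI
])

ann_model.compile(optimizer="adam", loss="mse", metrics=["mae"])

history = ann_model.fit(
    X_train, y_train,
    epochs=50,
    validation_split=0.1,   # validation fraction assumed for illustration
    verbose=0,
)
```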
The results for all data and the three groups (i.e., training, validation, and testing datasets) are shown in Figure 8. The results show that the R2 value reached its maximum (approx. 0.93), while the MAE (approx. 3) and RMSE (approx. 7.0) values were the minimum among the models; the plots clearly show the robustness of the ANN model. The points cluster along the line of equal values in both plots, indicating acceptable accuracy on unseen data as well as good generalization. The curves in the K-D plot overlap strongly, and the deviation between them is smaller, indicating an accurate prediction of PCI. These observations confirm that the ANN is also the most precise among the models considered.
7. Statistical Performance of the Models
Statistical indicators are used to evaluate the performance of the models and the quality of their predictions, namely, MAE, RMSE, and R2.
Table 3 summarizes the statistical performance for each data subset. The ANN model outperforms all other models, as captured by its high R2 value and low MAE and RMSE values. The least-performing model was SVR, with R2 = 0.8062, RMSE = 12.69%, and MAE = 7.74%. The similar R2 values for the training and testing datasets rule out data overfit, which promotes model generalization; this was observed for all models. Other error indices were used to test the performance of the models: the average bias (B) (Equation (1)), which is the ratio between predicted and measured PCI values, and the Willmott index of agreement (d) (Equation (2)), which relates the MSE to the potential error (PE). As indicated in Table 4, the ANN model has the highest d (0.993) and the lowest B (1.04). To this end, the ANN model has proven to be the most accurate among the tested models, supporting the previous findings.
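Equations (1) and (2) are not reproduced in this excerpt; the forms below are the commonly used definitions of the average bias and Willmott's index of agreement, stated here as assumptions consistent with the description above, where P_i and O_i denote predicted and measured PCI values and O-bar is the mean of the measured values.

```latex
% Assumed standard forms of the two indices (not copied from the source).
% Average bias between predicted and measured PCI values:
B = \frac{\sum_{i=1}^{n} P_i}{\sum_{i=1}^{n} O_i} \qquad \text{(1)}

% Willmott index of agreement, relating the squared-error term to the
% potential error (PE) in the denominator:
d = 1 - \frac{\sum_{i=1}^{n} \left( P_i - O_i \right)^2}
             {\sum_{i=1}^{n} \left( \lvert P_i - \bar{O} \rvert + \lvert O_i - \bar{O} \rvert \right)^2}
\qquad \text{(2)}
```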
Figure 9a depicts the R2 surface of the SVR model. The surface reveals a steady improvement in model performance with increasing values of “C”, stabilizing at values of 4 to 5. Similarly, the R2 values improved gradually as “epsilon” increases, with insignificant gains beyond an “epsilon” of 0.8. The selected configuration (“C” = 4, “epsilon” = 0.1) lies in the optimal region of the plot, where R2 values are maximum. The visualization confirms that this point falls within a stable and high-performing plateau, validating the robustness and suitability of the final hyperparameter settings.
Figure 9b illustrates that the DT model's performance improved significantly as the tree depth increased up to a value of approximately 10 and plateaued thereafter. Similarly, the R2 score showed a positive correlation with increasing “ccp alpha” up to a value of approximately 1.0, beyond which further gains were negligible. The highest performance was achieved in the region bounded by tree depth values between 10 and 16 and “ccp alpha” values between 0.8 and 1.0, indicating a zone of optimal model complexity.
Figure 9c illustrates the impact of “max depth” and “min samples split” on the RF's R2 value, with “n estimators” fixed at 400. As shown in the plot, R2 performance improves substantially as the tree depth increases, particularly between values of 5 and 20. The model performance becomes more stable at higher “min samples split” values (15 to 25), indicating reduced variance and improved generalization. The chosen combination of a “max depth” of 20 and a “min samples split” of 60 lies in the high-performance region of the surface, targeting the optimal zone. This is supported visually by the R2 reaching its maximum value and plateauing afterward. Figure 9d shows the corresponding surface for the ANN: the number of neurons in the first and second hidden layers was varied, holding all other parameters constant. The surface reveals a distinct performance peak around the selected 64–32 architecture, and minimal gain in R2 was observed with a further increase in the number of neurons. The visual evidence confirms that the selected architecture performed well and yielded stable, generalizable results.
The ANN achieved the highest predictive accuracy, as indicated by error metrics. This superior performance is attributed to the ANN’s ability to capture complex, non-linear relationships and interactions among the distress types. The ANN architecture provides a powerful framework capable of capturing these intricate relationships effectively. Its hierarchical learning approach allows the model to learn simple and complex patterns, which simpler models (like DTs and SVR) may struggle to represent adequately. Furthermore, the large and diverse dataset (15,000 pavement segments) used in this study provided sufficient variability and scale for the ANN model to generalize the predictions. Concurrently, this minimizes the risk of overfitting, thereby achieving superior predictive performance compared to the other algorithms tested.
8. Conclusions and Future Directions
PCI values were predicted in this study based on the pavement distresses collected in the study area, such as alligator cracks, bleeding, depression, and corrugation. The data were gathered through visual inspection of pavement sections on urban roads. Four different approaches were employed to build the forecasting models and estimate the PCI value: three machine learning approaches, SVR, DT, and RF, and a deep learning model, ANN. Eighty percent (80%) of the dataset was used for training the models, whereas the remaining 20% was used for testing. The results suggest that the ANN is the optimum model and can accurately estimate the PCI from several pavement distresses. The measured and predicted PCI values show a strong linear correlation, indicating an accurate and dependable prediction model. A close investigation of the four models supports the use of AI techniques to build a correlation between surface pavement distress and PCI. The ANN model can also be applied to large-scale pavement maintenance estimation, producing a more effective and accurate pavement condition assessment. A statistical evaluation of the models was made using different error norms. The R2 value was the highest (0.924) for the ANN among all models; similarly, its RMSE (7.93%) and MAE (3.25%) were the lowest.
Compared to the proposed ANN model, MicroPAVER software is time-consuming, less accessible, costly, and tedious. As such, the findings of this study are instrumental for countries with similar environmental and operational conditions. The proposed models can save effort and time if adequate pavement performance data are available. Unmanned aerial vehicles (UAVs) can be used to collect pavement distress data, enabling cheaper data collection and potentially more accurate predictions. In addition to pavement distress, PCI could also be correlated with other structural and environmental variables, e.g., pavement age, precipitation, or temperature.