1. Introduction
With the progressive extension of service life and the sustained growth of traffic volumes, particularly the high-frequency passage of heavy freight vehicles, the service performance of highway pavements deteriorates incrementally. Distresses of varying severity, including cracking, rutting, and subsidence, have become increasingly prevalent, directly compromising traffic safety and ride comfort. Consequently, the mileage of highways requiring maintenance intervention continues to expand. A large number of highways have now entered a concentrated maintenance peak period, underscoring the urgent need to strengthen maintenance management, enhance maintenance quality, and improve service efficiency. The present study takes asphalt pavements on ordinary national and provincial highways in Linfen City as the research object and investigates asphalt pavement performance prediction from the perspective of multi-factor influence analysis. This research is of considerable practical significance for enabling precise maintenance operations, improving maintenance efficiency, and enhancing ride comfort.
The practical significance of pavement performance prediction extends beyond identifying which roads are in better condition. Quantitative performance forecasting enables highway agencies to: (1) determine optimal maintenance timing, as preventive interventions at Pavement Condition Index (PCI) levels of 70–80 yield the highest life-cycle cost efficiency; (2) prioritize maintenance investments across competing road segments under constrained budgets; and (3) transition from reactive to proactive maintenance management, thereby reducing long-term costs and extending pavement service life.
Internationally, pavement performance prediction has undergone a paradigm evolution from empirical regression to intelligent modeling. The early theoretical foundation was laid by empirical approaches, most notably the World Bank HDM-4 deterioration model and the American Association of State Highway and Transportation Officials (AASHTO) serviceability equation, followed by the progressive introduction of probabilistic and statistical methods. Abaza [1] improved the estimation of transition probabilities within Markov deterioration models through a back-calculation procedure, thereby enhancing prediction accuracy. Kelvin et al. [2] developed an International Roughness Index (IRI) prediction equation based on grey relational analysis. Abed et al. [3] employed Monte Carlo simulation to incorporate pavement parameter uncertainties into the prediction of permanent deformation and fatigue cracking. In recent years, machine learning has emerged as the dominant research paradigm: Marcelino et al. [4] proposed a generalized random-forest-based modeling framework capable of forecasting IRI over 5–10-year horizons using historical data; Zeiada et al. [5] combined Artificial Neural Networks (ANNs) with feature selection to reveal the environment-dominated mechanism governing pavement performance in warm climatic regions; Li et al. and Guo Runhua et al. [6,7] developed a GA (Genetic Algorithm)-optimized hybrid neural network that improved prediction accuracy by approximately 35% relative to conventional ANNs; Ali et al. [8] demonstrated the superiority of ANN models over multiple linear regression for IRI prediction across different climatic zones; Wang et al. [9] integrated ANNs with the Ford–Fulkerson algorithm to achieve network-level pavement performance prediction with R² exceeding 0.9; and Roberts et al. [10] further confirmed that quadratic-function neural networks outperform conventional dot-product architectures in roughness prediction. Collectively, these studies demonstrate the pronounced advantages of intelligent algorithms in the domain of pavement performance prediction.
Although domestic research in China commenced relatively late, it has progressed rapidly and yielded prolific outcomes. In the domain of decay models, Sun et al. [11] classified pavement performance deterioration into four archetypal patterns and formulated corresponding generalized equations. He [12] subsequently constructed PCI and RDI prediction models adapted to Fujian expressway pavements based on this framework. Guo [13] established a decay model with surface layer thickness and traffic loading as core explanatory variables. Ding [14] proposed a dual-parameter adaptive tracking method enabling real-time parameter correction. Regarding grey prediction methods, Zhang and Zhu Yunsheng et al. [15,16] developed an improved GM(1,1) grey model with enhanced prediction accuracy; Wu et al. [17] constructed a multi-indicator performance deterioration model based on the GM(1,1) framework; Zhao et al. [18] proposed an equal-dimension grey-number renewal model for dynamic updating; Wang et al. [19] optimized grey prediction weight allocation using a hierarchical variable-weight method; and Jiang et al. [20] further proposed an entropy-weighted grey–Markov composite model to address data fluctuation challenges. In the machine learning domain, Pei et al. [21] developed a Bayesian-optimized Light Gradient Boosting Machine (LightGBM) model that reduced MAE to 1.902; Zhang et al. [22] identified the polynomial-kernel Support Vector Machine (SVM) as the most accurate method for small-sample nonlinear PCI prediction; Li et al. [23] constructed a combined Rough Set (RS) and Principal Component Analysis-Adaptive Particle Swarm Optimization-Support Vector Machine (PCA-APSO-SVM) model and subsequently [24] further optimized SVM parameters using an improved firefly algorithm; and Zhao et al. [25] proposed a Grey Relational Analysis-Support Vector Regression (GRA-SVR) model that enhances prediction reliability through grey-relational-analysis-based dimensionality reduction. In the neural network domain, Zhou et al. [26] and Guo [27] successively introduced genetic algorithms into neural network optimization, effectively reducing prediction errors, while Li et al. [28] proposed a Particle Swarm Optimization-Back Propagation Neural Network (PSO-BPNN) model that significantly outperformed conventional methods across multiple evaluation metrics. Luchuan Chen et al. [29,30] proposed an improved LSTM-based model for pavement performance prediction. In summary, domestic research has formed a multi-pathway collaborative development framework encompassing decay models, grey prediction, machine learning, and neural networks; however, model generalizability and multi-source data fusion remain areas warranting further investigation.
To date, substantial research achievements have been accumulated in the field of pavement performance prediction. The commonly employed prediction methodologies are generally categorized into four principal classes: deterministic methods, probabilistic methods, bio-inspired methods, and other methods. A detailed classification and comparative analysis of these approaches are presented in Table 1.
Despite the substantial body of literature on pavement performance prediction, the most widely adopted approach remains empirical regression, which generally captures only the temporal evolution of pavement condition while neglecting the relationships between performance indicators and their underlying influencing factors, thereby introducing considerable uncertainty into prediction models. To address this limitation, numerous researchers have proposed bio-inspired and metaheuristic methods to improve prediction accuracy; however, these approaches tend to attenuate the temporal sequential characteristics inherent in pavement deterioration data. Recent advances in deep learning have expanded the methodological toolkit for pavement performance prediction. Transformer-based architectures and ensemble learning algorithms (e.g., XGBoost, LightGBM) have demonstrated competitive performance in various infrastructure prediction tasks. However, for time-series pavement deterioration data characterized by sequential dependencies and multi-factor coupling, recurrent architectures such as LSTM remain particularly well-suited due to their inherent ability to capture long-term temporal patterns through gating mechanisms.
Given the considerable volume and heterogeneity of the pavement performance data collected from ordinary national and provincial highways in Linfen City, the models associated with deterministic, probabilistic, and dynamic methods are inadequate for effective adaptation to such datasets. Only bio-inspired modeling approaches are capable of capturing the deterioration patterns of pavement service performance with sufficient accuracy. Compared with support vector machines, decision trees, and other conventional machine learning models, artificial neural networks exhibit superior adaptability in handling large-scale data [7,8,9]; however, they suffer from insufficient consideration of temporal sequential features and relatively low learning efficiency. To address these limitations, the present study adopts the Long Short-Term Memory (LSTM) network, a deep learning architecture specifically designed for sequential data modeling, to construct the pavement performance prediction model. Furthermore, Particle Swarm Optimization (PSO) is introduced to optimize the model hyperparameters, thereby mitigating the problem of convergence to local optima during the training process and ultimately achieving dynamic prediction of pavement service performance, providing scientific support and a data-driven basis for the formulation of rational pavement maintenance strategies.
3. Establishment of the Prediction Model
3.1. Dataset Construction
(1) Model input and output variables.
Both the PSO-LSTM model and the baseline LSTM model adopt the following input variables: historical pavement performance values from preceding years, maintenance year, Annual Average Daily Equivalent Traffic (AADT), surface layer thickness, mean annual temperature, and mean annual precipitation. The Pavement Condition Index (PCI) and Ride Quality Index (RQI) for the target prediction year serve as the output variables, thereby constituting a complete dataset for pavement performance prediction.
The dataset was collected from 9 representative highway routes in Linfen City (national highways G209, G309, G341, G342 and provincial highways S234, S235, S326, S330, S331), encompassing approximately 480 one-kilometer evaluation units. Annual pavement inspection data spanning 2018–2024 (7 years) were collected for each unit. By applying the 5-year sliding window technique, 3 time-series samples were generated per evaluation unit, yielding a total of approximately 1440 samples. The dataset was randomly partitioned into a training set (80%, approximately 1152 samples) and a test set (20%, approximately 288 samples).
(2) Data normalization.
Given that considerable magnitude disparities exist among certain continuous variables, direct input into the neural network may result in slow convergence of gradient descent or entrapment in local optima. In this study, the Min-Max normalization method is employed to scale all feature values to the [0, 1] interval:
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
where
x′ denotes the normalized value;
x represents the original data value;
x_max is the maximum value of the continuous variable; and
x_min is the minimum value of the continuous variable.
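The scaling rule above can be illustrated with a minimal Python sketch. The function name `min_max_scale`, the treatment of a constant feature, and the sample traffic values are assumptions made for illustration; none of them is taken from the study.

```python
def min_max_scale(values):
    """Scale numbers to [0, 1] using x' = (x - x_min) / (x_max - x_min)."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        # A constant feature makes the formula undefined; mapping it to 0.0
        # is a choice made for this sketch only.
        return [0.0 for _ in values]
    return [(x - x_min) / span for x in values]


# Hypothetical AADT values (vehicles/day), for illustration only.
aadt = [4200, 6100, 8800, 12500]
scaled = min_max_scale(aadt)
print(scaled)
```

A common convention is to take the minimum and maximum from the training set only and reuse them for the test set, so that test-set statistics do not influence training; the text does not state which convention was followed here.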
(3) Data augmentation and time-series sample construction.
A sliding window technique is employed for dataset construction and time-series sample augmentation. The temporal step length is set to 5, meaning the model utilizes five consecutive years of data as a single sliding window sample. Specifically, the historical PCI/RQI values from years t − 4 through t − 1, along with the concurrent environmental and traffic features for the target year t (including AADT, surface layer thickness, mean annual temperature, and mean annual precipitation), are used to predict the PCI/RQI value in year t. The target variable (PCI or RQI at year t) is strictly used as the model output and is never included in the input features, thereby ensuring no temporal data leakage occurs.
The selection of a 5-year window balances the need for sufficient temporal context to capture deterioration trends and the constraint of available data spanning 2018–2024. This 5-year horizon also aligns with the medium-term maintenance planning cycle commonly adopted by highway agencies. The data samples for each road segment comprise input variable records spanning from 2018 to 2024. By applying the sliding window technique, these records are partitioned into three time-series samples: the input sequences covering 2018–2022, 2019–2023, and 2020–2024, which yield the predicted pavement performance values for 2022, 2023, and 2024, respectively.
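The window construction described above can be sketched as follows. This is an illustrative reading of the text, not the authors' code: the function `build_windows`, the dictionary layout of a sample, and all numeric values are hypothetical, and the way history and covariates are arranged into LSTM time steps is not specified in the source.

```python
def build_windows(years, index_by_year, features_by_year, window=5):
    """Build (target_year, inputs, target) samples for one evaluation unit."""
    samples = []
    for end in range(window - 1, len(years)):
        target_year = years[end]
        history_years = years[end - window + 1:end]  # years t-4 .. t-1
        inputs = {
            # Past index values only; the value for year t is never an input.
            "history": [index_by_year[y] for y in history_years],
            # Covariates for the target year t.
            "features": features_by_year[target_year],
        }
        samples.append((target_year, inputs, index_by_year[target_year]))
    return samples


years = list(range(2018, 2025))  # seven annual inspections, 2018..2024
# Hypothetical PCI values and covariates (AADT, thickness, temperature, precipitation).
pci = {y: 95.0 - 2.0 * (y - 2018) for y in years}
covariates = {y: (8000 + 300 * (y - 2018), 9.0, 12.5, 500.0) for y in years}

samples = build_windows(years, pci, covariates)
target_years = [s[0] for s in samples]

# Sample counts quoted in the text: 480 units, 3 windows each, 80/20 split.
n_units = 480
n_samples = n_units * len(samples)
n_train = n_samples * 8 // 10
n_test = n_samples - n_train
print(target_years, n_samples, n_train, n_test)
```

With seven years of data and a five-year window, each unit yields three samples with target years 2022, 2023, and 2024, which reproduces the approximate totals of 1440, 1152, and 288 stated for the dataset.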
3.2. Experimental Environment and Model Parameter Determination
3.2.1. Experimental Environment
The prediction model was developed and implemented on a platform running the Windows 11 operating system. The detailed configuration parameters of the experimental environment are presented in Table 2.
3.2.2. Model Parameter Configuration
Prior to model training, a series of core parameters must be pre-configured, including the number of hidden layers, the number of neurons per hidden layer, and the number of training iterations. The specific parameter settings adopted in this experiment are as follows: the number of hidden layers is set to 1, containing 50 neurons; the maximum number of training epochs is set to 50, with a mini-batch size of 32. The Adam optimization algorithm is employed throughout the training process; it iteratively updates the network weights from the computed gradients, thereby driving the model toward convergence.
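To give a sense of model size under this configuration, the sketch below counts trainable parameters for a single LSTM layer. It assumes the standard LSTM formulation with one bias vector per gate, six input variables per time step, and a single fully connected output neuron; the output layer and both function names are assumptions, since the source does not describe the layers that follow the LSTM.

```python
def lstm_parameter_count(n_inputs, n_hidden):
    """Trainable weights of one standard LSTM layer (one bias vector per gate)."""
    # Four gates (input, forget, cell candidate, output), each with input
    # weights, recurrent weights, and a bias vector.
    return 4 * (n_hidden * (n_inputs + n_hidden) + n_hidden)


def dense_parameter_count(n_hidden, n_outputs=1):
    """Weights and biases of a fully connected output layer (assumed here)."""
    return n_hidden * n_outputs + n_outputs


n_features = 6  # input variables per time step, as listed for the dataset
baseline_total = lstm_parameter_count(n_features, 50) + dense_parameter_count(50)
print(baseline_total)  # 11451
```

Under the same assumptions, 10 hidden neurons give 680 LSTM parameters and 100 give 42,800, which illustrates how quickly capacity grows relative to a training set of roughly 1152 samples.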
During the initial construction of the LSTM model, the key hyperparameters are pre-defined as follows: the learning rate is initialized at 0.001, and the Dropout rate is set to 0.5. For the PSO-LSTM model incorporating Particle Swarm Optimization, the hyperparameter search space is further constrained: the Dropout probability (DP) search range of [0, 0.5] ensures regularization control, while the learning rate (η) bound of [0.0001, 0.01] covers the range typically effective for Adam-optimized LSTM models. The neuron search range of [10, 100] was determined based on the dataset size (~1152 training samples with 6 features per time step): a minimum of 10 neurons ensures basic representational capacity, while the upper bound of 100 prevents excessive parameterization relative to training sample size, mitigating overfitting risk alongside the Dropout regularization mechanism. On this basis, separate LSTM prediction models are constructed for the PCI and RQI training sets, respectively. The initial parameter configuration of the model and the parameter search ranges of the PSO algorithm are detailed in Table 3, Table 4 and Table 5.
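The bounded search described above can be sketched with a basic global-best PSO. This is a generic illustration rather than the configuration used in the study: the swarm size, iteration count, inertia weight `w`, and acceleration coefficients `c1` and `c2` are assumed values, and `surrogate_fitness` is a hypothetical stand-in for "train the LSTM and return its MSE", which would be far more expensive to evaluate.

```python
import random

# Search bounds quoted in the text: neurons, learning rate, Dropout probability.
BOUNDS = [(10, 100), (0.0001, 0.01), (0.0, 0.5)]


def surrogate_fitness(position):
    """Hypothetical stand-in for training an LSTM and returning its MSE."""
    neurons, lr, dropout = position
    # Arbitrary smooth bowl; its minimum is unrelated to the reported optima.
    return (((neurons - 60) / 90) ** 2
            + ((lr - 0.005) / 0.0099) ** 2
            + ((dropout - 0.1) / 0.5) ** 2)


def clamp(value, low, high):
    return max(low, min(high, value))


def pso(fitness, bounds, n_particles=10, n_iterations=30,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    positions = [[rng.uniform(lo, hi) for lo, hi in bounds]
                 for _ in range(n_particles)]
    velocities = [[0.0] * len(bounds) for _ in range(n_particles)]
    pbest = [p[:] for p in positions]
    pbest_fit = [fitness(p) for p in positions]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]  # best fitness after each iteration
    for _ in range(n_iterations):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (w * velocities[i][d]
                                    + c1 * r1 * (pbest[i][d] - positions[i][d])
                                    + c2 * r2 * (gbest[d] - positions[i][d]))
                # Keep every particle inside the search box.
                positions[i][d] = clamp(positions[i][d] + velocities[i][d], lo, hi)
            fit = fitness(positions[i])
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = positions[i][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = positions[i][:], fit
        history.append(gbest_fit)
    return gbest, gbest_fit, history


best, best_fit, history = pso(surrogate_fitness, BOUNDS)
best_neurons = int(round(best[0]))  # the neuron count must be an integer
print(best_neurons, best[1], best[2], best_fit)
```

The `history` list plays the role of a fitness curve: it never increases, because the global best is replaced only when a particle finds a lower fitness value.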
3.2.3. Determination of Optimal Learning Parameters
(1) PCI prediction model optimization
The key hyperparameters of the LSTM-based PCI prediction model were tuned by introducing the Particle Swarm Optimization algorithm. The dynamic variation in the fitness function (MSE) during the optimization process is visualized in Figure 2.
As illustrated in Figure 2, during the initial iterations, the fitness value decreases rapidly from an initial value of approximately 0.00295 to approximately 0.00266, reflecting the strong global exploration capability of the algorithm in quickly locating hyperparameter regions that yield lower LSTM prediction errors. From the 2nd to the 11th iteration, the rate of decline in the fitness value decelerates, gradually decreasing from 0.00266 to approximately 0.00258, indicating that the algorithm has transitioned from global exploration to local exploitation, continuously refining the hyperparameter combination through the interaction of individual and swarm best information. Upon reaching the 12th iteration, the curve enters a stable convergence phase, with the fitness value ultimately converging to 0.00242 without appreciable fluctuations, demonstrating that the PSO algorithm has identified the optimal hyperparameter combination that minimizes the LSTM prediction error, and the optimization process has reached a steady state.
The dynamic variation in the number of neurons, learning rate, and Dropout parameter during the iterative search and optimization process is presented in Figure 3.
As shown in Figure 3a, the number of hidden-layer neurons exhibits an initial decline followed by a subsequent rise prior to the 12th iteration, after which it stabilizes at an optimal value of 91. This represents a substantial increase relative to the baseline setting of 50, suggesting that the model requires greater complexity to adequately fit the feature space. The learning rate evolution depicted in Figure 3b reveals an oscillatory trajectory characterized by an initial increase, a subsequent decrease, and a further rise. After the 12th PSO iteration, the learning rate progressively converges and remains stable at an optimal value of 0.0095. Figure 3c indicates that the Dropout deactivation probability likewise enters a stable state after the 12th iteration, confirming that the relevant parameters have been effectively optimized, with the optimal Dropout rate determined as 0.033.
(2) RQI prediction model optimization
Similarly, the Particle Swarm Optimization algorithm is introduced to tune the key hyperparameters of the LSTM-based RQI prediction model. The optimization process is illustrated in Figure 4.
Optimal parameter combinations for the two prediction models are determined as follows. For the PCI prediction model, the optimal parameters are: 91 hidden-layer neurons, a learning rate of 0.0095, and a Dropout rate of 0.033. For the RQI prediction model, the optimal parameters are: 65 hidden-layer neurons, a learning rate of 0.01, and a Dropout rate of 0.0008.
4. Model Verification and Analysis
4.1. PCI Prediction Model Validation
4.1.1. Training Process Analysis
A comparative analysis of the training convergence behavior between the LSTM model and the PSO-LSTM model is conducted, with the loss function descent curves plotted in Figure 5 and Figure 6.
The PSO-LSTM model demonstrates exceptionally rapid convergence from the early stages of training, with its loss value decreasing to 0.004043 by the 5th epoch, compared with a loss of 0.008126 for the baseline LSTM model at the same epoch. Owing to the optimal initial learning rate of 0.0095 identified by the PSO algorithm, the PSO-LSTM model achieves a markedly faster rate of error reduction than the baseline LSTM during the initial training phase, converging more efficiently toward the optimal solution region. After 50 training epochs, the final loss of the baseline LSTM model converges to 0.002503, while that of the PSO-LSTM model is further reduced to 0.002097. The final training error of the PSO-LSTM model is approximately 16.2% lower than that of the baseline LSTM, indicating that the optimized model parameters possess superior feature-fitting capability, enabling a more thorough learning of the nonlinear variation patterns embedded within the training samples.
4.1.2. Prediction Results Analysis
Based on the prediction models established above, prediction analyses were performed on the PCI test dataset, and the model outputs were compared against the corresponding actual values. The scatter distributions are presented in Figure 7. The reference line shown in the figure represents the ideal condition where predicted values perfectly coincide with actual values. The closer the scatter points are distributed to this reference line, the greater the consistency between the model predictions and the true values, indicating higher prediction accuracy. Conversely, greater deviation of the scatter points from the reference line signifies larger discrepancies between predicted and actual values, reflecting less satisfactory predictive performance.
As shown in Figure 7, the prediction points of the LSTM model are generally distributed in the vicinity of the reference line; however, certain scatter points exhibit a degree of dispersion on both sides, particularly within the 65–85 range, where considerable deviations between predicted and actual values persist, indicating that the model’s fitting accuracy in this local interval requires further improvement. In comparison, the scatter points of the PSO-LSTM model are distributed more tightly around the reference line, with a notably enhanced degree of overall clustering and significantly reduced dispersion, indicating that the discrepancies between predicted and actual values are further diminished. This result demonstrates that, following global optimization of the key LSTM hyperparameters via the PSO algorithm, both the fitting capability and stability of the model on the test set are improved, yielding predictive performance that is markedly superior to that of the unoptimized baseline LSTM model. To further quantify the model performance, the relevant evaluation metrics are computed, and the MAE, MSE, RMSE, and R² results for the two models are presented in Table 6.
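For reference, the four evaluation metrics named above can be computed as in the following sketch, which uses their standard definitions. The function name `regression_metrics` and the four sample values are hypothetical and are unrelated to the study's test set.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, MSE, RMSE, and R^2 for paired actual/predicted values."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                    # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Hypothetical PCI values on the 0-100 scale, for illustration only.
actual = [90.0, 80.0, 70.0, 60.0]
predicted = [88.0, 82.0, 71.0, 60.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

For these sample values the errors are −2, 2, 1, and 0, giving MAE = 1.25, MSE = 2.25, RMSE = 1.5, and R² = 0.982.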
To facilitate engineering interpretation, the prediction errors are discussed in the original PCI scale (0–100). The PSO-LSTM model achieves an MAE of 1.86, indicating that predicted PCI values deviate from actual measurements by less than 2 points on average. The RMSE of 2.72 suggests that, if the errors are approximately normally distributed with zero mean, about 95% of predictions fall within roughly ±5.5 points (two RMSEs) of actual values. Given that the Chinese Highway Performance Assessment Standards (JTG 5210-2018) [31] classify pavement condition at 10-point grade intervals (Excellent: ≥90; Good: 80–90; Fair: 70–80; Poor: 60–70; Very Poor: <60), an MAE of approximately 2 points is well within a single grade interval, enabling reliable identification of pavement condition grades for maintenance decision-making.
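The grade intervals quoted above can be expressed as a small lookup, shown here as a sketch. Assigning each boundary value to the higher grade (for example, exactly 80 counts as Good) is an assumption, as are the function name `pci_grade` and the two example measurements.

```python
def pci_grade(value):
    """Map a PCI value (0-100) to the grade intervals quoted in the text."""
    # Boundary values are assigned to the higher grade; this is an assumption.
    if value >= 90:
        return "Excellent"
    if value >= 80:
        return "Good"
    if value >= 70:
        return "Fair"
    if value >= 60:
        return "Poor"
    return "Very Poor"


mae = 1.86   # PSO-LSTM MAE for PCI reported in the text
rmse = 2.72  # PSO-LSTM RMSE for PCI reported in the text
two_rmse_band = 2 * rmse  # about 5.44, the "roughly ±5.5 points" band

# Hypothetical measurements: one mid-interval, one just above a grade boundary.
mid_interval = (pci_grade(85.0), pci_grade(85.0 - mae))
near_boundary = (pci_grade(80.5), pci_grade(80.5 - mae))
print(two_rmse_band, mid_interval, near_boundary)
```

The second example shows that an error of typical size can still move a segment across a grade boundary when the measured value lies close to it, so grade assignments near 60, 70, 80, or 90 merit extra care.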
As shown in Table 6, compared with the unoptimized LSTM model, the LSTM prediction model with hyperparameters tuned via the Particle Swarm Optimization algorithm demonstrates superior performance: the coefficient of determination (R²) is improved by 2.36%, while the Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Square Error (RMSE) are reduced by 15.55%, 17.39%, and 9.11%, respectively. These results indicate that the optimized model more adequately meets the prediction requirements for the PCI indicator, achieving an effective enhancement in overall predictive performance.
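The reported MSE and RMSE reductions can be cross-checked against each other, since RMSE is the square root of MSE. A short derivation using only the 17.39% figure quoted above:

```latex
\[
\mathrm{RMSE} = \sqrt{\mathrm{MSE}}
\quad\Longrightarrow\quad
\frac{\mathrm{RMSE}_{\mathrm{LSTM}} - \mathrm{RMSE}_{\mathrm{PSO\text{-}LSTM}}}{\mathrm{RMSE}_{\mathrm{LSTM}}}
= 1 - \sqrt{\frac{\mathrm{MSE}_{\mathrm{PSO\text{-}LSTM}}}{\mathrm{MSE}_{\mathrm{LSTM}}}}
= 1 - \sqrt{1 - 0.1739}
\approx 1 - 0.9089
= 0.0911
\]
```

The result, about 9.11%, matches the RMSE reduction stated in the text, so the two figures are mutually consistent.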
4.2. RQI Prediction Model Validation
4.2.1. Training Process Analysis
Similarly, for the RQI training set, a comparative analysis of the training convergence behavior between the LSTM model and the PSO-LSTM model is conducted, with the loss function descent curves plotted in Figure 8 and Figure 9.
As shown in Figure 8 and Figure 9, the overall training behavior on the RQI dataset follows a similar pattern to that observed on the PCI dataset. The PSO-LSTM model demonstrates exceptionally rapid convergence from the early stages of training, with its loss value decreasing to 0.004208 by the 5th epoch, compared with a loss of 0.009510 for the baseline LSTM model at the same epoch. Owing to the optimal initial learning rate identified by the PSO algorithm, the PSO-LSTM model achieves a markedly faster rate of error reduction than the baseline LSTM during the initial training phase, converging more rapidly toward the optimal solution region. After 50 training epochs, the final loss of the baseline LSTM model converges to 0.003055, while that of the PSO-LSTM model is further reduced to 0.002523. The final training error of the PSO-LSTM model is approximately 17.4% lower than that of the baseline LSTM, indicating that the optimized model parameters possess superior feature-fitting capability, enabling a more thorough learning of the nonlinear variation patterns embedded within the training samples.
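The percentage differences quoted for the training losses follow from a simple relative-reduction calculation. The sketch below recomputes them from the loss values reported in the text for both the PCI and RQI models; the function name `relative_reduction` is introduced here for illustration.

```python
def relative_reduction(baseline, optimized):
    """Fractional reduction of `optimized` relative to `baseline`."""
    return (baseline - optimized) / baseline


# Final training losses after 50 epochs, as reported in the text.
pci_final = relative_reduction(0.002503, 0.002097)
rqi_final = relative_reduction(0.003055, 0.002523)

# Losses at the 5th epoch, as reported in the text.
pci_epoch5 = relative_reduction(0.008126, 0.004043)
rqi_epoch5 = relative_reduction(0.009510, 0.004208)

print(f"PCI final: {pci_final:.1%}, RQI final: {rqi_final:.1%}")
print(f"PCI epoch 5: {pci_epoch5:.1%}, RQI epoch 5: {rqi_epoch5:.1%}")
```

The final-epoch figures reproduce the 16.2% and 17.4% values stated for PCI and RQI. The gap at epoch 5 is much larger, at roughly half the baseline loss in both cases, which is consistent with the faster early convergence described.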
4.2.2. Prediction Results Analysis
The prediction methodology and performance analysis framework for RQI are consistent with those adopted for PCI. The corresponding prediction results are presented in Figure 10. The PSO-LSTM model achieves an MAE of 2.13 for RQI, meaning predicted ride quality values deviate from actual measurements by approximately 2 points on the 0–100 scale. This level of accuracy is sufficient for maintenance grade classification and planning purposes.
Regarding the distribution of PCI and RQI values in the dataset, the majority of records fall within the 70–100 range, with fewer samples in the “Very Poor” (<60) category. This distribution reflects the actual condition of the studied highway network. The MSE loss function treats all prediction errors equally regardless of the PCI/RQI level, and the inclusion of routes with varying deterioration levels ensures representation of deteriorated conditions in the training data.
As shown in Figure 10, the scatter points within the 80–100 value range in the RQI test dataset are relatively concentrated, whereas those below 80 are comparatively sparse, a phenomenon closely associated with the inherent variation characteristics of the ride quality index. Examining the correspondence between the RQI predicted and measured values, it can be observed that the prediction results of both the pre-optimization and post-optimization LSTM models are predominantly distributed on both sides of the reference line. However, compared with the conventional LSTM model, the scatter distribution of the PSO-LSTM model is noticeably more compact. This result indicates that the PSO-optimized LSTM model demonstrates superior modeling performance for RQI prediction, confirming that the Particle Swarm Optimization algorithm plays a constructive role in the hyperparameter optimization process. Employing the same evaluation methodology as adopted for the PCI prediction model, the R², MAE, and RMSE metrics of the RQI model are computed, with the results summarized in Table 7.
As indicated by the results presented in Table 7, the overall prediction performance of the PSO-LSTM model on the RQI prediction task is significantly improved compared with the conventional LSTM model. Specifically, MSE decreases by 20.5%, MAE is reduced by 5.8%, RMSE is lowered by 10.8%, and the goodness-of-fit (R²) is increased by 3.4%, demonstrating that the optimized model exhibits notable advantages in both prediction accuracy and stability.
4.3. Model Verification
A typical road segment significantly affected by heavy traffic loading is selected as the research subject, with historical inspection data of its PCI and RQI indicators adopted as samples. To further validate the prediction performance of the proposed PSO-LSTM model, comparative prediction analyses are conducted using the PSO-LSTM model, the baseline LSTM model, and the Recurrent Neural Network (RNN) model, respectively. The comparison of prediction accuracy across the three models is presented in Figure 11.
Based on the comparative prediction accuracy results presented in Figure 11, the following observations can be drawn regarding the prediction of PCI and RQI indicators for the typical road segment subject to heavy traffic loading. Among the three models, the RNN exhibits the poorest overall predictive performance, with corresponding coefficients of determination of 0.7216 and 0.7273, respectively.
The prediction results of the LSTM model reveal that the predictive performance for both PCI and RQI is significantly improved relative to the conventional RNN model. Specifically, MSE decreases by 19.92% and 29.22%, RMSE is reduced by 10.51% and 15.86%, MAE is lowered by 11.38% and 16.41%, and the coefficient of determination (R²) is increased by 7.69% and 10.95%, respectively. These results demonstrate that the LSTM model is markedly superior to the conventional RNN model in terms of both prediction accuracy and fitting capability, rendering it more suitable for prediction problems characterized by multi-factor inputs and single-indicator outputs.
Compared with both the conventional RNN model and the unoptimized baseline LSTM model, the PSO-LSTM model achieves the best performance across all error metrics, with MSE, RMSE, and MAE all attaining their minimum values, while the fitting performance is substantially enhanced, with coefficients of determination reaching 0.845 and 0.869, respectively. The results indicate that, following optimization of the core LSTM hyperparameters via the PSO algorithm, the model prediction accuracy is effectively improved. Relative to the baseline LSTM, MSE is reduced by 30.47% and 32.12%, RMSE decreases by 16.61% and 17.60%, MAE is lowered by 16.95% and 18.42%, and the coefficient of determination (R²) is increased by 8.73% and 7.68%, respectively. These findings confirm that the integration of Particle Swarm Optimization with the LSTM model plays a constructive role in enhancing predictive performance.
In summary, the predictive performance of the RNN model is inferior to that of both the LSTM and PSO-LSTM models, while the LSTM model is in turn outperformed by the PSO-LSTM model. Consequently, the PSO-LSTM model demonstrates superior applicability and reliability for pavement performance prediction, and can serve as an effective reference tool for pavement maintenance planning and management decision-making.
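The coefficients of determination reported in this comparison can be chained together as a consistency check. The sketch below starts from the RNN values and applies the stated percentage gains in sequence; the variable names are illustrative, and the intermediate LSTM values are derived here rather than quoted from the source.

```python
# R^2 of the RNN model for PCI and RQI, as reported in the text.
rnn_r2 = {"PCI": 0.7216, "RQI": 0.7273}

# Reported relative R^2 gains: LSTM over RNN, then PSO-LSTM over LSTM.
lstm_gain = {"PCI": 0.0769, "RQI": 0.1095}
pso_gain = {"PCI": 0.0873, "RQI": 0.0768}

# Derived (not quoted) baseline LSTM values.
lstm_r2 = {k: rnn_r2[k] * (1 + lstm_gain[k]) for k in rnn_r2}
# Applying the second gain to the LSTM values gives the PSO-LSTM estimates.
pso_r2 = {k: lstm_r2[k] * (1 + pso_gain[k]) for k in rnn_r2}

for key in rnn_r2:
    print(key, round(lstm_r2[key], 4), round(pso_r2[key], 3))
```

The chained values round to 0.845 for PCI and 0.869 for RQI, matching the PSO-LSTM coefficients stated above. This indicates that the second set of percentage improvements is measured against the baseline LSTM rather than against the RNN.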
4.4. Model Prediction
To investigate the differences in pavement performance between roads that have undergone maintenance and those that have not since being put into service, typical road segments under different traffic loading levels are selected as research subjects. Under heavy traffic loading conditions, the National Highway G309 (K978+000–K983+000) segment and Provincial Highway S331 (K221+000–K226+000) segment are selected as representative sections; the former has received maintenance during its service life, while the latter has not undergone any maintenance works since opening to traffic. Under moderate traffic loading conditions, the Provincial Highway S234 (K9+000–K14+000) segment and National Highway G341 (K1008+000–K1013+000) segment are selected; the former has not received any maintenance, whereas the latter has undergone maintenance works. All four representative segments are classified as secondary highways. The detailed road information per unit mileage for these segments is summarized in Table 8.
Following the PCI and RQI prediction model construction procedure proposed in the preceding sections, the models are established accordingly. Based on the road condition parameters listed in Table 8, the corresponding independent variables for PCI and RQI are input into the respective models. On this basis, prediction analyses of PCI and RQI indicators are conducted for the selected representative road segments over a four-year forecast horizon. The traffic volume projections are constructed using a Grey–Markov model. The final model outputs comprise the predicted PCI and RQI values per kilometer for each study segment. The detailed prediction results are presented in Table 9.
As indicated by the analytical results in Table 9, regardless of whether the traffic loading condition is heavy or moderate, the typical road segments that have undergone maintenance interventions exhibit an overall level of pavement service performance superior to that of the unmaintained segments, with a comparatively more gradual declining trend over time. To more intuitively illustrate this variation pattern, the PCI and RQI prediction results in Table 9 are visualized by averaging the predicted PCI and RQI values over the selected 5 km segments and plotting the corresponding trend lines. The results are presented in Figure 12 and Figure 13.
As shown in Figure 12 and Figure 13, over the forthcoming four-year period, the pavement performance indicators PCI and RQI for typical road segments under both heavy and moderate traffic loading conditions exhibit a general declining trend with increasing time. Concurrently, segments that have received maintenance interventions consistently maintain higher PCI and RQI levels across all prediction years compared with unmaintained segments, demonstrating the positive role of maintenance measures in retarding pavement performance deterioration.
Under heavy traffic loading conditions, the maintained typical segments show relatively moderate inter-annual degradation, with annual PCI reductions of 0.89–2.52 and annual RQI reductions of 1.39–2.37. In contrast, the unmaintained segments exhibit markedly more pronounced deterioration, with annual PCI reductions of 3.86–7.97 and annual RQI reductions of 1.89–6.24. Under moderate traffic loading conditions, the maintained typical segments similarly display modest annual degradation, with PCI reductions of 1.43–2.30 and RQI reductions of 0.97–2.07, whereas the unmaintained segments undergo more substantial decay, with annual PCI reductions of 1.63–3.88 and annual RQI reductions of 1.44–3.57. These deterioration patterns are consistent with the well-established principle that asphalt pavement service performance progressively declines with increasing service duration.
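Ranges of annual reduction such as those above can be derived from a segment's yearly indicator values by differencing consecutive years and taking the smallest and largest drop. A minimal sketch, with a hypothetical function name and illustrative values that are not taken from Table 9:

```python
def annual_decline_range(yearly_values):
    """Return (smallest, largest) year-over-year reduction of an indicator."""
    if len(yearly_values) < 2:
        raise ValueError("at least two yearly values are required")
    # A positive drop means the indicator decreased from one year to the next.
    drops = [prev - cur for prev, cur in zip(yearly_values, yearly_values[1:])]
    return min(drops), max(drops)

# Hypothetical segment-average PCI values for four consecutive years.
smallest_drop, largest_drop = annual_decline_range([90.0, 88.0, 85.0, 84.0])
```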
In summary, based on the established PCI and RQI prediction models for asphalt pavement service performance, a reasonable forecast of pavement technical conditions over an approximate four-year horizon can be achieved. The prediction results are capable of providing quantitative references for the formulation of short- to medium-term pavement maintenance plans in Linfen City, thereby offering robust support for the advancement of scientifically informed and refined maintenance decision-making. Based on the prediction results and in accordance with Chinese Highway Maintenance Standards (JTG 5421-2018) [32], road segments predicted to reach PCI values below 80 within the four-year forecast horizon should be prioritized for preventive maintenance interventions. Engineering experience and the literature suggest that timely preventive maintenance at PCI levels of 70–80 yields the highest life-cycle cost efficiency compared to delayed reactive repairs. For the studied segments, the prediction results indicate that several unmaintained sections on routes S331 and S234 are projected to fall below the PCI = 70 threshold within 2–3 years, warranting urgent maintenance scheduling.
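The screening rule described above can be sketched as a simple triage over a segment's forecast PCI values. The thresholds of 80 and 70 follow the text; the function names and the category labels "urgent", "preventive", and "monitor" are hypothetical, and the sample forecasts are illustrative rather than values from Table 9.

```python
def first_year_below(forecast, threshold):
    """Return the first forecast year (1-based) with a value below `threshold`."""
    for year, value in enumerate(forecast, start=1):
        if value < threshold:
            return year
    return None  # the threshold is never crossed within the horizon

def triage(pci_forecast):
    """Assign a hypothetical maintenance category from a PCI forecast."""
    if first_year_below(pci_forecast, 70.0) is not None:
        return "urgent"        # projected to fall below PCI = 70
    if first_year_below(pci_forecast, 80.0) is not None:
        return "preventive"    # enters the 70-80 preventive window
    return "monitor"           # stays at or above 80 over the horizon

# Hypothetical four-year PCI forecasts for three segments.
labels = [triage(f) for f in ([78.0, 73.0, 69.0, 64.0],
                              [86.0, 83.0, 79.5, 77.0],
                              [92.0, 90.5, 89.0, 87.5])]
```

Reporting the crossing year alongside the category lets a planner order segments within the same category by how soon intervention is needed.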
5. Conclusions
(1) A comparative discussion of different prediction methodologies was first conducted, elucidating the advantages of deep learning models in pavement performance prediction. A PSO-LSTM pavement performance prediction model incorporating a sliding window mechanism was subsequently established. Based on historical pavement performance inspection data from Linfen City, Shanxi Province, prediction analyses were performed for both the PCI and RQI indicators. The reliability and validity of the PSO-LSTM model were verified through a multi-model prediction accuracy comparison.
(2) Through a systematic comparison with the conventional RNN model and the unoptimized LSTM model, the optimized PSO-LSTM model demonstrates superior performance in terms of loss function convergence speed, test set error metrics, and coefficient of determination. The coefficients of determination (R²) for PCI and RQI prediction on typical road segments under heavy traffic loading reach 0.845 and 0.869, respectively. In the original PCI/RQI scale (0–100), the MAE values of 1.86 and 2.13 are well within a single condition grade interval (10 points), confirming the model’s practical utility for maintenance decision-making.
(3) Through the prediction of pavement service performance over a four-year forecast horizon for typical road segments, it is observed that segments with maintenance interventions exhibit substantially smaller performance degradation compared with unmaintained segments. The findings suggest that the implementation of maintenance measures at appropriate intervals can effectively retard the deterioration of pavement service performance. Road segments predicted to reach PCI below 80 should be prioritized for preventive maintenance to achieve optimal life-cycle cost efficiency.
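For reference, the two accuracy metrics cited in conclusion (2) can be computed as in the sketch below, which uses their standard definitions. The function names and sample values are hypothetical and do not reproduce the study's data.

```python
def mean_absolute_error(actual, predicted):
    """MAE: average absolute deviation, in the indicator's own units."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed and predicted PCI values on the 0-100 scale.
observed = [80.0, 85.0, 90.0]
modelled = [82.0, 85.0, 88.0]
mae = mean_absolute_error(observed, modelled)
r2 = r_squared(observed, modelled)
# An MAE below 10 points stays within one condition grade interval.
within_one_grade = mae < 10.0
```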
6. Prospects
(1) The present study is confined to selected national and provincial highways within Linfen City, exhibiting a certain degree of regional specificity. The PSO-LSTM framework itself is transferable to other regions, provided that local pavement inspection data, traffic volume records, and climate data are available for model re-training and calibration. In future research, the sample scope may be expanded to incorporate data from a broader range of geographical regions, additional routes, and extended temporal spans, thereby enabling further refinement and calibration of the proposed model. Transfer learning techniques, such as pre-training on international pavement databases (e.g., LTPP) followed by fine-tuning with local data, may be explored to enhance cross-regional applicability.
(2) Pavement performance evolution is governed by the combined influence of multiple factors. Accordingly, further extensions in both the variable system and model architecture remain warranted. Future model enhancements should consider incorporating extreme climate indicators (ETCCDI variables) such as heatwave frequency, rainfall intensity, and freeze–thaw cycle counts to better capture non-linear environmental effects. Additional indicators capable of characterizing structural condition and environmental effects should be incorporated into the model inputs. Moreover, the introduction of attention mechanisms, hybrid deep learning architectures, or multi-task learning methodologies may be explored to further enhance the model’s capacity for identifying complex nonlinear relationships and long-term temporal dependency features. Spatial modeling techniques such as Graph Neural Networks (GNNs) or hybrid CNN-LSTM architectures should also be explored to capture spatial correlations between adjacent pavement sections.
(3) Future studies should incorporate model interpretability techniques such as SHAP (SHapley Additive exPlanations) analysis to quantify the contribution of individual input variables to prediction outcomes, thereby bridging the gap between data-driven predictions and physical understanding of pavement deterioration mechanisms. Additionally, more comprehensive comparisons with other machine learning approaches such as LightGBM, SVM, Random Forest, and XGBoost should be conducted, and time-series cross-validation should be implemented to provide a more rigorous evaluation framework. Sensitivity analysis of the sliding window length should be performed when extended temporal datasets become available.
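The evaluation ideas in item (3) can be sketched as follows: sliding-window samples are built for a chosen window length, and walk-forward splits ensure that each test fold lies strictly after its training data. This is an illustrative sketch under assumed names (`sliding_windows`, `walk_forward_splits`) and toy data, not the procedure used in this study.

```python
def sliding_windows(series, window):
    """Build (input window, next value) samples from a time series."""
    if window < 1 or window >= len(series):
        raise ValueError("window must be at least 1 and shorter than the series")
    return [(series[i:i + window], series[i + window])
            for i in range(len(series) - window)]

def walk_forward_splits(n_samples, n_folds, test_size=1):
    """Expanding-window splits: train on the past, test on the next block."""
    splits = []
    for fold in range(n_folds):
        test_start = n_samples - (n_folds - fold) * test_size
        if test_start <= 0:
            raise ValueError("not enough samples for the requested folds")
        train_idx = list(range(test_start))
        test_idx = list(range(test_start, test_start + test_size))
        splits.append((train_idx, test_idx))
    return splits

# Hypothetical yearly PCI record; a sensitivity study would repeat the
# evaluation for several window lengths and compare the resulting errors.
record = [95.0, 93.0, 90.0, 88.0, 85.0, 83.0, 80.0, 78.0]
for window in (2, 3, 4):
    samples = sliding_windows(record, window)
    folds = walk_forward_splits(len(samples), n_folds=2)
```

Unlike shuffled k-fold validation, this scheme never trains on observations that postdate the test fold, which avoids leaking future pavement condition into the model.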