Article

Physics-Aware Informer: A Hybrid Framework for Accurate Pavement IRI Prediction in Diverse Climates

China Highway Engineering Consulting Corporation, No. 17 Changyun Palace, West Third Ring North Road, Haidian District, Beijing 100089, China
*
Author to whom correspondence should be addressed.
Infrastructures 2025, 10(10), 278; https://doi.org/10.3390/infrastructures10100278
Submission received: 20 August 2025 / Revised: 22 September 2025 / Accepted: 16 October 2025 / Published: 18 October 2025

Abstract

Accurate prediction of the International Roughness Index (IRI) is critical for road safety and maintenance decisions. In this study, we propose a novel Physics-Aware Informer (PA-Informer) model that integrates the efficiency of the Informer structure with physics constraints derived from partial differential equations (PDEs). The model addresses two key challenges: (1) performance degradation in short-sequence scenarios, and (2) the lack of physics constraints in conventional data-driven models. By embedding residual PDEs to link IRI with influencing factors such as temperature, precipitation, and joint displacement, and introducing a dynamic weighting strategy for balancing data-driven and physics-informed losses, the PA-Informer achieves robust and accurate predictions. Experimental results, based on data from four climatic regions in China, demonstrate its superior performance. The model achieves a Mean Squared Error (MSE) of 0.0165 and R² of 0.962 with an input window length of 30 weeks, and an MSE of 0.0152 and R² of 0.985 with an input window length of 120 weeks. It is more accurate than the comparison models and far more stable when the input window length changes. Sensitivity analysis highlights joint displacement and internal stress as the most influential features, with stable sensitivity coefficients (Sp ≈ 0.89 and Sp ≈ 0.81). These findings validate the PA-Informer as a reliable and scalable tool for predicting pavement performance under diverse conditions, offering significant improvements over other IRI prediction models.

1. Introduction

Time-series forecasting is critical in the field of transportation infrastructure, as it enables the prediction of essential indicators such as the International Roughness Index (IRI), which measures pavement surface smoothness and serving performance. Accurate IRI prediction is fundamental for decision-making in road maintenance and rehabilitation planning, ensuring road safety, driving comfort, and effective resource allocation [1,2]. Despite its importance, IRI prediction presents significant challenges due to the dynamic and complex interactions between various influencing factors, including traffic load, climate conditions, and pavement construction quality [3,4].
Traditionally, multiple regression (MR) models have been used for IRI prediction. However, these models fail to capture the nonlinear relationships between IRI and its predictors, such as temperature, precipitation, and vehicle load [5]. The advent of machine learning techniques, including support vector regression (SVR), random forests (RFs), and gradient boosting methods, has improved prediction accuracy by addressing nonlinearity [6,7,8]. Marcelino et al. demonstrated that machine learning models could provide superior performance in predicting pavement performance compared to traditional approaches [6]. Similarly, Qin et al. proposed a dual-stage attention-based recurrent neural network for time-series prediction, which effectively modeled long-term dependencies [7]. However, these models often suffer from overfitting or fail to incorporate the time-dependent characteristics of IRI, which evolve due to cumulative effects over time [9].
Recurrent neural networks (RNNs) and their variants, particularly long short-term memory (LSTM) networks, have been increasingly used for time-series forecasting tasks. These models are explicitly designed to handle sequential data, making them suitable for capturing temporal dependencies in pavement performance [10]. For example, LSTM-based models have demonstrated superior performance in predicting pavement deterioration by effectively modeling historical influences. However, traditional LSTM models are not without limitations. They may struggle with long-term dependencies when the input sequence is lengthy, and their prediction accuracy can degrade in cases of sparse or noisy datasets [11,12]. Transformer-based models, such as Informer, have recently emerged as state-of-the-art solutions for long-sequence time-series forecasting. Informer introduces a ProbSparse self-attention mechanism to reduce computational complexity, making it highly efficient in capturing long-range dependencies in large-scale datasets [13]. Its ability to distill self-attention and utilize generative-style decoding has made it a powerful tool for handling long-sequence forecasting tasks. However, Informer models often struggle with short-sequence time series, where limited historical data and noisy inputs can hinder performance. These limitations restrict their application to scenarios where data is sparse or irregular, such as IRI prediction in transportation systems [14].
Recent studies have also explored the integration of physics-based constraints into predictive models to enhance their generalization and interpretability, particularly in resource-limited scenarios. For example, physics-informed neural networks (PINNs) embed residual equations derived from physical laws into the learning process, ensuring that predictions adhere to governing physical principles [15,16]. Tartakovsky et al. successfully applied PINNs to solve partial differential equations (PDEs) in subsurface flow problems, demonstrating their robustness in resource-limited datasets [17,18]. In the field of pavement performance prediction, Lu et al. incorporated residual physics-based information into time-series forecasting models, providing a robust approach to ensure compliance with underlying physical laws [19].
In addition to the above, recent works have identified critical gaps in the application of machine learning to IRI prediction, particularly for flexible pavements. For instance, Li et al. provided a comprehensive review of machine learning for IRI prediction, highlighting challenges such as data scarcity, model interpretability, and adaptability to diverse road types [20]. Similarly, Deng et al. demonstrated that numerical modeling techniques could effectively predict rutting in asphalt pavements with semi-rigid bases, emphasizing the importance of incorporating mechanistic knowledge in machine learning frameworks [8]. These findings underscore the need for hybrid approaches that combine data-driven methods with domain knowledge to improve prediction accuracy and robustness in diverse environments [21].
To address these challenges, this study proposes a novel Physics-Aware Informer (PA-Informer) framework which integrates the efficiency of the Informer architecture with physics constraints derived from governing PDEs. This approach not only improves the accuracy of long-sequence forecasts but also addresses the limitations of traditional data-driven models in short-sequence scenarios by embedding physics-based constraints into the learning process.
The key contributions of this work include the following:
(1) A partial differential equation (PDE) relating IRI to the input feature indicators is derived and used as the residual equation. It captures the relationship between IRI and its influencing factors, embeds physics constraints in the model, and supports accurate short-sequence time-series prediction by the PA-Informer model.
(2) A self-adaptive weighting strategy is proposed to dynamically balance the data-driven loss and the residual PDE loss, so that the physics constraint is enforced more strictly in short-sequence prediction, which in turn improves short-sequence accuracy.
(3) A sensitivity analysis is conducted to evaluate the robustness and adaptability of the proposed framework under diverse climatic and structural conditions. Based on the sensitivity changes in different characteristic indicators, the reasons for the stability and robustness of the model are explained.
By bridging the gap between data-driven and physics-informed approaches, the PA-Informer model provides a robust, interpretable, and scalable solution for IRI prediction. It not only improves the prediction accuracy but also enhances the practical usability of the model for real-world applications in transportation infrastructure management.

2. Methodology

2.1. Overview of the Physics-Aware Informer Model

The Physics-Aware Informer (PA-Informer) model is a hybrid framework that integrates the Informer model’s efficient long-sequence time-series forecasting (LSTF) capabilities with physics constraints derived from governing partial differential equations (PDEs). This framework is designed to overcome two major challenges: (1) the performance degradation of data-driven models, including the Informer model, when input time-series data are sparse or short, and (2) the lack of physics constraints in conventional deep learning models, which limits their generalization under varying environmental and structural conditions [22]. By embedding the residual PDE of International Roughness Index (IRI) into the Informer architecture, the PA-Informer achieves physically consistent and accurate predictions, compensating for insufficient historical data in short-sequence scenarios while leveraging Informer’s efficiency for long-sequence forecasting.
The framework introduces three key innovations. First, the Informer model’s mechanisms are enhanced to efficiently process long sequences, allowing the model to extract global dependencies and capture long-term IRI trends. Second, the residual PDE for IRI is derived, linking IRI to physical and environmental factors such as joint width, height difference, material stress, temperature, humidity, solar radiation, and rainfall. Third, a novel fusion mechanism is designed to combine the temporal features extracted by Informer with physically consistent features generated from the PDE, ensuring that the model predictions adhere to governing physical laws [23]. This combination improves the robustness and accuracy of the PA-Informer model, even for complex pavement systems and variable environments. The overall structure of the PA-Informer model is shown in Figure 1.

2.2. Informer Model for Long-Sequence Forecasting

The Informer model serves as the backbone of the PA-Informer framework, providing a highly efficient structure for long-sequence time-series forecasting. It addresses the computational inefficiencies of traditional Transformers while maintaining high accuracy for long input sequences through three core mechanisms: ProbSparse self-attention, self-attention distilling, and a generative-style decoder.
The ProbSparse mechanism reduces the computational complexity of traditional self-attention from O(L²) to O(L·log L) by leveraging the sparsity assumption, which significantly improves memory efficiency while retaining the ability to capture long-range dependencies [24].
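The query-selection idea behind ProbSparse attention can be sketched as follows. This is a minimal illustration based on the published Informer design, not the configuration used in this study: each query is scored by the maximum minus the mean of its scaled dot products with the keys, and only the top u queries, with u proportional to ln L, receive full attention. The function names and the factor `c` are hypothetical, and the key sampling that Informer uses to compute the score cheaply is omitted.

```python
import math

def sparsity_scores(queries, keys):
    """Max-minus-mean score of each query's scaled dot products with all keys."""
    d = len(queries[0])
    scores = []
    for q in queries:
        dots = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        scores.append(max(dots) - sum(dots) / len(dots))
    return scores

def select_active_queries(queries, keys, c=1.0):
    """Indices of the top-u queries, u = ceil(c * ln(L_Q)); c is an assumed factor."""
    n = len(queries)
    u = min(n, max(1, math.ceil(c * math.log(n))))
    scores = sparsity_scores(queries, keys)
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:u])

# Toy example: query 2 aligns strongly with one key, the others are nearly flat.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
queries = [[0.1, 0.1], [0.0, 0.0], [4.0, -4.0], [0.2, 0.2]]
active = select_active_queries(queries, keys)
```

Because only about ln L queries are kept, the number of full query-key rows evaluated grows as L·log L rather than L².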
To enhance efficiency, Informer reduces sequence length after each attention layer by retaining dominant features through convolution and max-pooling. This operation halves the sequence length at each layer, minimizing memory usage to O((2 − ε)L·log L) while preserving global dependencies. Additionally, stacking multiple encoder layers with reduced sequence lengths enables the model to extract multi-scale features and robust long-term patterns [25].
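The halving step can be illustrated with a small sketch. It shows only the max-pooling part of the distilling operation (kernel 3, stride 2, padding 1, as in the published Informer design); the preceding convolution and activation are omitted, and the number of distilling steps applied here is an assumption for illustration.

```python
def max_pool_stride2(sequence, kernel=3, stride=2, padding=1):
    """1-D max-pooling over time for a list of feature vectors (padded with -inf)."""
    d = len(sequence[0])
    pad = [[float("-inf")] * d for _ in range(padding)]
    padded = pad + sequence + pad
    pooled = []
    for start in range(0, len(padded) - kernel + 1, stride):
        window = padded[start:start + kernel]
        pooled.append([max(step[j] for step in window) for j in range(d)])
    return pooled

seq = [[float(t)] for t in range(1, 9)]  # 8 time steps, 1 feature
halved = max_pool_stride2(seq)           # 4 time steps remain

# Sequence length after two successive distilling steps on a 120-week window.
lengths = [120]
x = [[0.0] for _ in range(120)]
for _ in range(2):
    x = max_pool_stride2(x)
    lengths.append(len(x))
```

Each application keeps the locally dominant value in every window of three steps, so the time axis shrinks by half while the feature dimension is unchanged.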
Instead of generating predictions step by step, the Informer decoder handles the entire target sequence in a single forward pass, avoiding cumulative error propagation. Using a start token and placeholders for the target sequence, it generates predictions directly, ensuring faster inference and high accuracy for long-sequence outputs. This makes it particularly suited for long-sequence time-series forecasting (LSTF) tasks [26].
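A minimal sketch of how such a decoder input might be assembled is given below. The names `label_len` and `pred_len` are hypothetical (they follow common Informer implementations, not this study), and the start token is assumed to be the most recent slice of the observed sequence.

```python
def build_decoder_input(history, label_len, pred_len):
    """Concatenate the last `label_len` observed steps (start token) with zero placeholders."""
    if label_len > len(history):
        raise ValueError("label_len cannot exceed the history length")
    d = len(history[0])
    start_token = [list(step) for step in history[-label_len:]]
    placeholders = [[0.0] * d for _ in range(pred_len)]
    return start_token + placeholders

# 30 observed weeks with 2 features; predict 4 weeks in one forward pass.
history = [[float(t), float(-t)] for t in range(30)]
decoder_input = build_decoder_input(history, label_len=10, pred_len=4)
```

The decoder fills all placeholder positions at once, so no predicted value is fed back as input to a later step.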

2.2.3. Derivation of the Unified IRI Residual PDE

To ensure physics constraints in the IRI prediction task, the PA-Informer model incorporates a residual PDE that establishes the relationship between IRI, joint geometry, material properties, and environmental conditions. This PDE ensures that the predictions adhere to the physical laws governing the evolution of pavement roughness.
IRI is an indicator used to measure the smoothness of a road surface. It is typically defined as the ratio of the cumulative vertical motion generated by a vehicle’s suspension system during travel to the corresponding horizontal movement. Its formula is expressed as follows [27]:
$$\mathrm{IRI} = \frac{\int_{0}^{L} \left| z_r'(x) \right| \, dx}{L} \tag{1}$$
where zr’(x) represents the vehicle’s vertical velocity, and L is the road length. Taking into account the influence of joint geometry on vehicle vibration, the commonly used empirical formula for IRI can be expressed as follows [28]:
$$\mathrm{IRI} = \frac{1}{L} \int_{0}^{L} \left( k_1 \, \Delta h \, \omega + k_2 \, \Delta h^{2} \right) dx \tag{2}$$
where k1 and k2 are fixed values related to vehicle characteristics, ω represents the joint width, and Δh represents the joint height difference. ω and Δh can be associated with the internal stress field σ of the cement concrete slab through the mechanical equilibrium equation shown in Equation (3).
$$\sigma = \frac{1}{D} \left( \varepsilon - \alpha \, \Delta T_c \, I - \beta \, \Delta H_c \, I \right) \tag{3}$$
α and β represent the thermal expansion coefficient and hygroscopic expansion coefficient, respectively, while the strain tensor ɛ can be derived from the displacement field as follows:
$$\varepsilon = \frac{1}{2} \left[ \nabla (u + \Delta h) + \left( \nabla (u + \Delta h) \right)^{T} \right] \tag{4}$$
The relationship between the stress field σ and the internal temperature field and humidity field of the cement concrete slab can be expressed by Equations (5) to (8):
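$$\rho c_p \frac{\partial T_c}{\partial t} = \nabla \cdot \left( \kappa_T \nabla T_c \right) \tag{5}$$
$$\frac{\partial H_c}{\partial t} = \nabla \cdot \left( D_H \nabla H_c \right) \tag{6}$$
$$\frac{\partial T_c}{\partial t} - \alpha \nabla^{2} T_c = SR \tag{7}$$
$$\frac{\partial H_c}{\partial t} - \beta \nabla^{2} H_c = P \tag{8}$$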
where the thermal conductivity (κT) and the moisture diffusion coefficient (DH) are calculated from data obtained from temperature and humidity sensors embedded within the cement concrete slabs. The calibrated values are 1.42 W/(m·K) for κT and 0.068 mm²/s for DH.
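To make the notion of a PDE residual concrete, the sketch below evaluates the residual of a one-dimensional diffusion equation with a source term, read here as ∂T/∂t − α ∂²T/∂z² = SR, using finite differences on a depth grid. The grid, the coefficient value, and all function names are illustrative assumptions; the study's own discretization is not described in the text.

```python
import math

def heat_residual(T_prev, T_curr, T_next, dt, dz, alpha, source):
    """Residual dT/dt - alpha * d2T/dz2 - source at the interior nodes of T_curr."""
    residual = []
    for i in range(1, len(T_curr) - 1):
        dT_dt = (T_next[i] - T_prev[i]) / (2.0 * dt)  # central difference in time
        d2T_dz2 = (T_curr[i + 1] - 2.0 * T_curr[i] + T_curr[i - 1]) / dz ** 2  # central in depth
        residual.append(dT_dt - alpha * d2T_dz2 - source[i])
    return residual

def physics_loss(residual):
    """Mean squared residual, usable as a physics penalty term."""
    return sum(r * r for r in residual) / len(residual)

# Toy check with an exact solution of the source-free equation:
# T(z, t) = exp(-alpha * pi^2 * t) * sin(pi * z)
alpha, dz, dt, t0 = 0.1, 0.05, 0.01, 0.5
z = [i * dz for i in range(21)]

def exact(t):
    return [math.exp(-alpha * math.pi ** 2 * t) * math.sin(math.pi * zi) for zi in z]

zero_source = [0.0] * len(z)
res_exact = heat_residual(exact(t0 - dt), exact(t0), exact(t0 + dt), dt, dz, alpha, zero_source)

# A field that violates the equation produces a much larger penalty.
bad_next = [v + 0.1 for v in exact(t0 + dt)]
res_bad = heat_residual(exact(t0 - dt), exact(t0), bad_next, dt, dz, alpha, zero_source)
```

A field that satisfies the equation leaves only discretization error in the residual, whereas a physically inconsistent field is penalized heavily, which is the behavior a physics-informed loss relies on.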
By combining Equations (3) to (8), the residual expression of IRI can be derived and is simplified as Equation (9).
$$L_{Physics} = \left\| F_{IRI}\left( \sigma, u, \Delta h, T_c, H_c, SR, P, t \right) - IRI_{true} \right\| \tag{9}$$
The specific meanings of the physical quantities are summarized in Table 1.

2.2.4. Integration of Informer with Physical Constraints

To achieve accurate predictions that are both informed by temporal patterns and consistent with physical laws, the PA-Informer model integrates the Informer architecture with the residual PDE for IRI. This integration is achieved by combining the temporal features extracted by the Informer encoder with the physically consistent features generated from the residual PDE. These features are fused through a dimension adjustment module to ensure compatibility and effective integration. The decoder then generates predictions based on this unified representation, capturing both long-term temporal dependencies and physical dynamics.
The total loss function for the PA-Informer model is defined as follows:
$$L_{total} = \lambda_{Physics} L_{Physics} + \lambda_{data} L_{data} \tag{10}$$
where Ldata is the data-driven loss capturing the temporal patterns in the input time series, and LPhysics is the residual PDE loss ensuring physics constraints. To dynamically balance the contributions of these two losses during training, a self-adaptive weighting strategy based on loss gradient scales is employed. This strategy ensures that the gradient contributions from each loss are balanced, preventing one loss from dominating the optimization process.
The weights λdata and λPhysics are dynamically adjusted during training based on the gradient magnitudes of the respective losses. Specifically, the gradient norms for each loss with respect to the model parameters are computed as follows:
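$$g_{Physics} = \left\| \nabla_{\theta} L_{Physics} \right\| = \sqrt{\sum_{i} \left( \frac{\partial L_{Physics}}{\partial \theta_i} \right)^{2}} \tag{11}$$
$$g_{data} = \left\| \nabla_{\theta} L_{data} \right\| = \sqrt{\sum_{i} \left( \frac{\partial L_{data}}{\partial \theta_i} \right)^{2}} \tag{12}$$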
where ||·|| denotes the gradient norm, and θ represents the model parameters [29,30,31]. The weights are then updated to inversely scale with the gradient magnitudes, ensuring that the loss with a higher gradient norm receives a smaller weight. The weight formulas are defined as follows:
$$\lambda_{Physics} = \frac{g_{data}}{g_{Physics} + g_{data}} \tag{13}$$
$$\lambda_{data} = \frac{g_{Physics}}{g_{Physics} + g_{data}} \tag{14}$$
This strategy balances the optimization contributions from the data-driven and physics-informed losses, allowing the model to adapt to varying task requirements or input window lengths. During training, the total loss Ltotal is minimized using gradient-based optimization, and the weights applied to Ldata and LPhysics are updated in real time from their respective gradient scales. This adaptive strategy is particularly effective when the scales or sensitivities of the two losses differ significantly or vary across training iterations.
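The weight update described above can be sketched in a few lines. The gradients are passed in as flat lists for simplicity; in a real training loop they would come from the framework's automatic differentiation. The small `eps` term is an added assumption to avoid division by zero and is not part of the stated formulas.

```python
import math

def grad_norm(grads):
    """Euclidean norm of a flattened gradient vector."""
    return math.sqrt(sum(g * g for g in grads))

def adaptive_weights(grads_physics, grads_data, eps=1e-12):
    """Return (lambda_physics, lambda_data); each weight scales inversely with its own gradient norm."""
    g_physics = grad_norm(grads_physics)
    g_data = grad_norm(grads_data)
    total = g_physics + g_data + eps  # eps is an assumed safeguard
    return g_data / total, g_physics / total

# Physics gradient norm is 5, data gradient norm is 1.
lam_physics, lam_data = adaptive_weights([3.0, 4.0], [0.6, 0.8])
```

In this example the physics loss has the larger gradient norm, so it receives the smaller weight (about 1/6), and the two weights sum to one.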

3. Experiments

3.1. Data Collection

The data utilized in this study were collected from sites monitoring the long-term performance of cement concrete pavements located in various typical climatic regions across China. These monitoring sites are distributed across four representative environmental zones: arid desert region, humid and rainy region, lightly frozen region, and the Qinghai–Tibet alpine region. The primary focus of these sites is to monitor pavement performance under different climatic conditions over extended periods [30].
As shown in Table 2, four monitoring sites were established in Xinjiang, Guangxi, Beijing, and Tibet, each corresponding to one of the aforementioned climatic zones. The pavement structures and monitoring systems at each site were specifically designed to accommodate the geographic and climatic characteristics of its region. To date, the monitoring and data transmission systems at all sites have operated stably. Between June 2022 and July 2025 (164 weeks in total), data collection yielded over 12.60 million records, providing a robust dataset for analyzing long-term pavement performance under diverse environmental conditions.
At each monitoring site, an integrated data acquisition system was installed to monitor the long-term performance of cement concrete pavements. The monitored parameters include the following:
(1) Meteorological Observations: Precipitation levels and solar radiation intensity are recorded by weather stations (Figure 2).
(2) Internal Temperature and Humidity Fields: Temperature and humidity distribution within the pavement structure is monitored using temperature and humidity fiber-optic sensors (Figure 3).
(3) Structural Response Indicators: Strain and stress sensors are used to capture the structural response metrics of the pavement (Figure 4 and Figure 5).
The layout of sensors embedded within the pavement structure is illustrated in Figure 6. All sensors collect data at a frequency of once every 5 s.
After the completion of pavement construction, weekly surface condition measurements are conducted at each monitoring site. Specifically, every Thursday, a road performance testing vehicle equipped with a LiDAR system (Figure 7) scans the pavement surface to obtain point cloud data. The point cloud data are then processed using ProVAL (RoadRuf 4.0) software, which extracts the pavement surface elevation data. From these, the pavement centerline IRI is calculated and used as one of the key feature indicators input into the prediction model.

3.2. Experiment Details

After the data was collected, all input data was preprocessed using zero-mean normalization, and input window lengths of 30, 60, 90, and 120 weeks were tested to explore the impact of sequence lengths on prediction performance. The data from Beijing, Guangxi, and Xinjiang spanning June 2022 to December 2024 were used as the training set, while data from January 2025 to August 2025 served as the validation set. Additionally, to evaluate the model’s generalization ability and robustness under extreme climatic conditions, data from Tibet during the same validation period were used as the test set.
Given that the input features are collected every 5 s, while the target variable, IRI, is measured on a weekly basis, it is necessary to align the temporal resolution of the target variable with that of the input features. For this, a Gaussian interpolation method is employed to interpolate the IRI values, thereby matching the high temporal resolution of the input features. This approach generates a continuous and smooth IRI time series, ensuring consistency with the dimensions of other feature variables. Furthermore, it preserves the cumulative effects of certain variables without compromising their significance. By applying this feature-specific processing strategy, all input features were unified to a weekly temporal resolution that matched the IRI measurements. The processed weekly features were then concatenated into input windows of varying lengths (30, 60, 90, and 120 weeks) to explore the impact of temporal sequence lengths on model prediction performance.
To prepare the data for the encoder, the processed features are structured into an input matrix with a shape of (T, d), where T represents the length of the input window, and d represents the number of features after preprocessing. Additionally, time stamps are incorporated into the input matrix to encode temporal information explicitly. The input matrix, augmented with temporal information, is then fed into the encoder. The encoder extracts high-level temporal patterns and dependencies from the input data, producing a context vector that serves as the foundation for subsequent prediction tasks.
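The preprocessing and windowing steps can be sketched as follows. Zero-mean normalization is interpreted here as a z-score (subtract the mean, divide by the standard deviation), which is an assumption; the data are synthetic, the time-stamp encoding is omitted, and all function names are illustrative.

```python
import math

def fit_zscore(rows):
    """Per-feature mean and standard deviation of the rows passed in."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    std = [math.sqrt(sum((r[j] - mean[j]) ** 2 for r in rows) / n) for j in range(d)]
    std = [s if s > 0 else 1.0 for s in std]  # guard against constant features
    return mean, std

def apply_zscore(rows, mean, std):
    """Normalize each feature with previously fitted statistics."""
    return [[(r[j] - mean[j]) / std[j] for j in range(len(r))] for r in rows]

def make_windows(rows, window):
    """All contiguous windows of `window` weekly rows; each window has shape (T, d)."""
    return [rows[i:i + window] for i in range(len(rows) - window + 1)]

# Synthetic weekly features: 164 weeks, d = 3 (the last feature is constant).
weekly = [[float(w), 2.0 * w + 1.0, 5.0] for w in range(164)]
mean, std = fit_zscore(weekly)
normalized = apply_zscore(weekly, mean, std)
windows = {T: make_windows(normalized, T) for T in (30, 60, 90, 120)}
```

In practice the statistics would typically be fitted on the training rows only and then reused for the validation and test rows, so that no information leaks from the later periods.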
To validate the predictive accuracy and generalizability of the model, three evaluation metrics were selected: mean squared error (MSE), coefficient of determination (R²), and the sensitivity coefficient (Sp). The mathematical formulations of these metrics are presented in Table 3.
Mean squared error (MSE) quantifies the average squared difference between the predicted and true values, providing a direct measurement of the model’s accuracy, where lower MSE values indicate better performance as they reflect smaller deviations from the ground truth [35]. Similarly, the coefficient of determination (R²) measures the proportion of variance in the target variable explained by the model’s predictions, evaluating the model’s goodness of fit, with values closer to 1 demonstrating higher explanatory power [36]. Finally, the sensitivity coefficient (Sp) quantifies the influence of an input parameter (p) on the predicted IRI values by perturbing the input variable by Δp and observing the change in IRI, where larger Sp values indicate a more significant driving effect of the parameter on IRI variations [37]. Together, these metrics provide a comprehensive evaluation of the model’s accuracy, explanatory power, and sensitivity to input parameters.
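A minimal sketch of the three metrics follows. MSE and R² use their standard definitions. The exact form of Sp is given in Table 3, which is not reproduced here, so the sketch assumes one common normalized form, the relative change in the prediction divided by the relative change in the input; the function names and the 5% perturbation are likewise assumptions.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def sensitivity(predict, x, index, rel_step=0.05):
    """Assumed form: |relative change in prediction| / |relative change in input `index`|.
    Requires a nonzero baseline prediction and a nonzero input value."""
    base = predict(x)
    perturbed = list(x)
    perturbed[index] = x[index] * (1.0 + rel_step)
    return abs((predict(perturbed) - base) / base) / rel_step

# Toy model: the prediction depends on feature 0 only.
toy_model = lambda x: 1.0 + 0.5 * x[0] + 0.0 * x[1]
sp_feature0 = sensitivity(toy_model, [2.0, 3.0], index=0)
sp_feature1 = sensitivity(toy_model, [2.0, 3.0], index=1)
```

With this form, a feature the model ignores has a sensitivity of zero, and a larger value means that a given relative perturbation of the input moves the predicted IRI proportionally more.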

4. Results and Discussion

4.1. Hyperparameter Tuning Process

The initial hyperparameters of the model are determined based on prior knowledge of the Informer architecture and the characteristics of the dataset. Studies have shown that Transformer models can effectively capture long-term dependencies in time-series data through multi-head self-attention mechanisms and feed-forward neural networks [4,5]. By introducing the ProbSparse attention mechanism, Informer reduces the computational complexity of self-attention for long input sequences. Thus, the encoder and decoder are configured with three and two layers, respectively, to ensure the capacity for modeling long-term dependencies while maintaining computational efficiency.
The token embedding dimension was set to 5 to provide a compact representation of input features, according to the recommended lightweight embedding dimensions for time-series tasks [4]. The hidden layer dimension of the feed-forward neural network (FFNN) was set to 20, which is four times the token embedding dimension, to balance model complexity and generalization performance. The initial learning rate was set to 0.0001 with a decay factor of 0.5 to stabilize the training process as the model converges. The number of training epochs was set to 100 to ensure sufficient iterations for learning complex temporal patterns, while the batch size was configured to 32 to balance computational efficiency and gradient stability. The weights for the physics-constrained loss (LPhysics) and data-driven loss (LData) were initialized to 0.1 and 0.9, respectively.
To optimize the model performance, a systematic hyperparameter tuning process was carried out, combining grid search, residual analysis, and dynamic loss weight adjustment. The process mainly adjusted the learning rate, batch size, number of training epochs, and loss weight balance. To preserve the Informer model’s original long-sequence performance, particular attention was paid to the relationship between the residual weights and the length of the time series. The learning rate (k) was fine-tuned within the range [0.0001, 0.01] to determine the optimal convergence rate [5]. A larger learning rate leads to unstable training dynamics, while a smaller one can result in excessively slow convergence. Consequently, a learning rate of 0.001 and a decay factor of 0.8 were selected to balance convergence speed and training stability. The batch size was varied between 16 and 64 to evaluate its impact on memory usage and training efficiency [38]; a batch size of 32 achieved the best balance between gradient stability and computational efficiency. The model architecture was tuned by varying the number of encoder and decoder layers to balance model capacity and computational feasibility. Experimental results show that an encoder with more than four layers provides limited improvement in prediction accuracy, while reducing the encoder to two layers results in a significant drop in performance. Ultimately, a configuration of three encoder layers and two decoder layers was retained.
As described in Section 2.2.4, the PA-Informer model uses a dynamic weight adjustment mechanism to adapt the weights of the physics-constrained loss (λPhysics) and the data-driven loss (λdata) to the length of the time series. Through model training and fine-tuning, the residual weights for time-series lengths of 30, 60, 90, and 120 weeks were determined to be (λPhysics, λdata) = (0.46, 0.54), (0.39, 0.61), (0.33, 0.67), and (0.18, 0.82), respectively. This balance promotes both physical consistency and prediction accuracy, which significantly improves the model’s generalization ability across different time-series lengths.
After determining other hyperparameters, the number of epochs was adjusted by analyzing the convergence process of the residual curves. Residual analysis was conducted to study the convergence speed and process of the PA-Informer and Informer models on datasets from Beijing, Guangxi, Xinjiang, and Tibet. As shown in Figure 8, compared to the Informer model, the PA-Informer model demonstrated significantly faster convergence speeds and a more stable residual reduction process across all regional datasets. When the input window is 30, the PA-Informer model had the slowest convergence speed compared to input windows of 60, 90, and 120. However, even in this case, the PA-Informer model’s residuals can still converge to below 0.05 within 20 training epochs, whereas the Informer model exhibits a much slower convergence speed and pronounced oscillations, particularly on the Xinjiang and Tibet datasets. Such results indicate that the PA-Informer model’s faster and smoother convergence process enables it to better adapt to the diverse dynamic characteristics of regional datasets, achieving higher prediction accuracy and robustness. For the PA-Informer model, validation loss stabilizes after approximately 15 training epochs, suggesting that the initially set 100 training epochs may not be optimal. To improve training efficiency and prevent overfitting, an early stopping mechanism is introduced, terminating training when the validation loss shows no improvement for 20 epochs [39,40,41]. In conclusion, the final hyperparameter settings are shown in Table 4. The Informer baseline model used for comparison adopts the exact same architecture and hyperparameter settings as the PA-Informer model.
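The early stopping rule can be sketched as below, using the stated patience of 20 epochs. The `min_delta` parameter and the synthetic loss curve are illustrative assumptions; the curve is built to flatten after 15 epochs, mirroring the behavior described above.

```python
def early_stopping_epoch(val_losses, patience=20, min_delta=0.0):
    """Return the 1-based epoch at which training stops, or None if it runs to the end."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None

# Synthetic validation loss: improves for 15 epochs, then stays flat up to epoch 100.
val_losses = [1.0 / e for e in range(1, 16)] + [1.0 / 15] * 85
stop_epoch = early_stopping_epoch(val_losses, patience=20)
```

With a loss that stops improving at epoch 15, training ends at epoch 35 instead of running the full 100 epochs.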

4.2. Prediction Accuracy Evaluation

To quantitatively analyze the prediction accuracy, the MSE and R² metrics on both the training and test sets were calculated and are shown in Figure 9 and Figure 10.
From Figure 9 and Figure 10a, a horizontal comparison of the MSE and R2 values across different models reveals that, except for the MR model, the prediction accuracy of all other models improves as the time-series length increases. The proposed PA-Informer model achieves the best performance, with a maximum MSE of 0.0165 (input window length of 30 weeks) and a minimum MSE of 0.0152 (input window length of 120 weeks) across the datasets. Notably, the MSE variation is less than 8.55%, indicating consistent performance across different time-series lengths.
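The 8.55% figure appears to be the spread between the two reported MSE values taken relative to the smaller one. Assuming that base, the arithmetic is:

$$\frac{\mathrm{MSE}_{30} - \mathrm{MSE}_{120}}{\mathrm{MSE}_{120}} = \frac{0.0165 - 0.0152}{0.0152} = \frac{0.0013}{0.0152} \approx 0.0855 = 8.55\%$$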
While ensuring the smallest MSE (0.0152) and the highest R2 (0.985) on the testing set, the PA-Informer model also exhibits the smallest fluctuation range. This suggests that reducing the input window length (e.g., from 120 to 30 weeks) has minimal impact on the performance of the PA-Informer model. Furthermore, even for shorter input windows, the model maintains reliable prediction accuracy.
Figure 9b illustrates the R2 values for the Tibet testing dataset, where the high consistency observed between actual data and predicted results demonstrates the model’s ability to capture underlying patterns. These results validate the reliability and generalizability of the PA-Informer model in complex environments.

4.3. Sensitivity Analysis

Based on the prediction accuracy results in Section 4.2, sensitivity analysis was conducted to further evaluate the performance of the PA-Informer model compared to other models. Only the Informer and LSTM-MA models, whose performance is closest to the PA-Informer model, were selected for comparison. Both models were tested using an input window length of 120 weeks, as this configuration yielded their best prediction performance as shown in Figure 11.
The PA-Informer model demonstrates Sp value variations of less than 5% for different input feature indicators across all datasets, suggesting that it can accurately capture the impact of various indicators on IRI changes with minimal influence from input window length. Additionally, the PA-Informer model identifies joint displacement as the most sensitive feature (Sp ≈ 0.9) across all regions, followed by internal stress (Sp ≈ 0.8), both of which remain relatively stable regardless of climatic conditions.
The sensitivity of other features, such as solar radiation (SR) and precipitation (P), varies significantly across regions. For instance, in the arid desert region of Xinjiang, frequent thermal expansion and contraction cycles caused by large diurnal temperature variations strongly influence joint displacement (Sp ≈ 0.91) and amplify internal stress (Sp ≈ 0.85). In contrast, in the humid and rainy region of Guangxi, precipitation becomes the dominant factor (Sp ≈ 0.67), surpassing the sensitivity of SR (Sp ≈ 0.52) due to excessive water infiltration into concrete pavements. In the high-altitude cold region of Tibet, freeze–thaw cycles drive significant joint displacement (Sp ≈ 0.95), while solar radiation (Sp ≈ 0.65) and precipitation (Sp ≈ 0.71) jointly contribute to rapid IRI growth. In lightly frozen regions like Beijing, where climatic conditions are relatively moderate, joint displacement (Sp ≈ 0.82) and internal stress (Sp ≈ 0.78) remain the primary influencing factors, while SR and P show lower sensitivity values (both Sp ≈ 0.41).

5. Discussion

5.1. Advantages of the PA-Informer Model in Diverse Conditions

The analysis highlights the limitations of traditional models, such as Informer, LSTM-MA, and MR, when applied to complex and highly variable regional datasets. The instability observed in these models may stem from significant differences in regional meteorological conditions, such as precipitation, sunlight, temperature, and humidity, which affect cement concrete in different ways. For example, in Xinjiang, the dry climate and intense sunlight primarily influence IRI through temperature stress, whereas in Guangxi, IRI variations are predominantly driven by precipitation and humidity.
Under these complex climatic conditions, traditional time-series models struggle to maintain accuracy for shorter input windows due to insufficient dynamic information. In contrast, the proposed PA-Informer model incorporates a Physics-Aware approach by embedding partial differential equations (PDEs) as residual equations. This mechanism ensures the model leverages physics constraints to address the performance limitations of the original Informer model, particularly for shorter input windows.
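The idea of using a PDE residual as an additional training penalty can be sketched as follows. This is a minimal illustration, not the governing equations of the PA-Informer: it assumes a one-dimensional diffusion equation ∂T/∂t = a ∂²T/∂x² as a stand-in PDE, evaluates its residual on a predicted field with finite differences, and adds the mean squared residual to the data MSE with fixed weights. The function names, the grid, the diffusivity `a`, and the weights 0.9 and 0.1 (the initial λdata and λPhysics listed in Table 4) are illustrative assumptions.

```python
import math
from typing import List, Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def diffusion_residuals(field: List[List[float]], dx: float, dt: float, a: float) -> List[float]:
    """Finite-difference residual of dT/dt - a * d2T/dx2 on field[time][position]."""
    residuals = []
    for k in range(len(field) - 1):
        for i in range(1, len(field[k]) - 1):
            dT_dt = (field[k + 1][i] - field[k][i]) / dt
            d2T_dx2 = (field[k][i + 1] - 2.0 * field[k][i] + field[k][i - 1]) / dx ** 2
            residuals.append(dT_dt - a * d2T_dx2)
    return residuals


def composite_loss(pred, target, field, dx, dt, a, lam_data=0.9, lam_physics=0.1):
    """Weighted sum of the data MSE and the mean squared PDE residual."""
    residuals = diffusion_residuals(field, dx, dt, a)
    physics_term = sum(r * r for r in residuals) / len(residuals)
    return lam_data * mse(pred, target) + lam_physics * physics_term


# Assumed toy setup: an analytic solution of the stand-in PDE on a small grid.
a, dx, dt = 0.1, 0.1, 0.01
exact = [[math.exp(-a * math.pi ** 2 * k * dt) * math.sin(math.pi * i * dx)
          for i in range(11)] for k in range(5)]
violating = [row[:] for row in exact]
violating[2][5] += 0.1  # break physical consistency at one grid point

iri_true = [1.0, 1.2, 1.5]
iri_pred = [1.0, 1.2, 1.5]  # identical data fit in both cases
loss_consistent = composite_loss(iri_pred, iri_true, exact, dx, dt, a)
loss_violating = composite_loss(iri_pred, iri_true, violating, dx, dt, a)
print(loss_consistent < loss_violating)
```

Because the data term is identical in both cases, any difference between the two losses comes from the physics term alone. This is the mechanism by which a residual penalty can still guide training when a short input window carries little dynamic information.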
The PA-Informer model demonstrates superior predictive accuracy and stability across various input window lengths and diverse climatic conditions. Its ability to integrate physics-based constraints allows it to adapt to the cumulative effects of meteorological factors, making it more robust and reliable in challenging datasets.

5.2. Interpretation of Sensitivity Analysis

The results of the sensitivity analysis demonstrate the superior adaptability and robustness of the PA-Informer model compared to traditional models like Informer and LSTM-MA. The enhanced Physics-Aware mechanism enables the PA-Informer model to better capture the relationships between input features and IRI changes, even under shorter input window conditions. This capability allows it to maintain stable sensitivity to different feature indicators regardless of input window length, which can serve as a baseline for future model comparisons.
The regional variations in Sp values highlight the significant influence of climatic conditions on IRI changes. For example, in Xinjiang, the large diurnal temperature variations in the arid desert amplify thermal expansion and contraction cycles, leading to higher sensitivity of joint displacement (Sp ≈ 0.91) and internal stress (Sp ≈ 0.85). Similarly, in Guangxi, excessive precipitation accelerates subgrade erosion, making precipitation (Sp ≈ 0.67) the dominant climatic factor. In Tibet, freeze–thaw cycles combined with intense solar radiation exacerbate joint displacement (Sp ≈ 0.95) and stress accumulation, leading to faster IRI growth [42]. The compound effects of features, such as prolonged precipitation combined with temperature fluctuations in humid regions or freeze–thaw cycles coupled with solar radiation in high-altitude regions, further amplify their impact on pavement deterioration.
In contrast, the Informer and LSTM-MA models fail to establish consistent patterns in sensitivity changes across regions. Although they can identify certain dominant features (e.g., joint displacement and internal stress) in relatively stable climatic conditions, such as Beijing or Guangxi, these models struggle to capture the nuanced interactions between features under more complex climatic conditions, such as in Xinjiang or Tibet. This limitation underscores the advantage of the PA-Informer model, which integrates PDEs as residual equations to address the performance shortcomings of traditional models.
Overall, the PA-Informer model demonstrates not only stable sensitivity to input feature indicators but also the flexibility to adapt to regional variations in feature sensitivity under diverse climatic conditions. This highlights its enhanced robustness and adaptability, making it a reliable choice for predicting IRI trends across various environments.

6. Conclusions

In this study, we introduce the Physics-Aware Informer (PA-Informer) model, a hybrid framework that combines the Informer structure with physics constraints to achieve accurate and robust IRI predictions for monitoring pavement performance. By embedding residual PDEs into the model, the PA-Informer ensures that predictions align with the physical laws governing the evolution of pavement roughness. The dynamic weighting strategy enhances the model’s adaptability to different input window lengths and environmental conditions, showcasing its versatility for diverse applications.
Experimental results based on four representative climatic zones in China (arid desert, humid and rainy, lightly frozen, and high-altitude cold regions) validate the effectiveness of the model. The PA-Informer achieves an MSE of 0.0165 and R2 of 0.962 for a 30-week input sequence and an MSE of 0.0152 and R2 of 0.985 for a 120-week input sequence. Sensitivity analysis revealed that joint displacement (Sp ≈ 0.89) and internal stress (Sp ≈ 0.81) are the most significant factors affecting IRI. The PA-Informer outperformed other models in maintaining feature sensitivity and capturing dependent features under complex meteorological conditions, demonstrating robustness across diverse climates. The PA-Informer also offers favorable computational efficiency, thanks to the ProbSparse self-attention mechanism in the Informer architecture, making it scalable to large-scale and real-time applications. However, lightweight model variants or hybrid frameworks could further improve deployment feasibility in resource-constrained environments. Integration with existing pavement monitoring systems should prioritize compatibility and real-time data processing.
Despite the promising results achieved by the PA-Informer model, certain limitations need to be addressed in future research. First, the model’s performance under data scarcity requires further exploration, particularly in regions where sensor data collection is limited or inconsistent. In such scenarios, the model may benefit from techniques such as transfer learning or data augmentation to enhance its adaptability. Additionally, integrating external datasets from international pavement performance monitoring programs, such as the Long-Term Pavement Performance (LTPP) program, could further validate and refine the model’s generalization capabilities across different climatic zones and road types.
From an engineering perspective, the feasibility of deploying the PA-Informer model in diverse regions or on various road types should also be carefully considered. While this study focuses on cement concrete pavements in China, other road materials, such as asphalt pavements, may exhibit different deterioration mechanisms that require adjustments to the model’s physics constraints. Furthermore, the implementation of the model in real-world pavement management systems will depend on its computational efficiency and ease of integration with existing infrastructure monitoring tools. Future work should explore lightweight model variants or hybrid architectures that balance predictive performance with practical deployment requirements.
By addressing these limitations and broadening the scope of application, the PA-Informer framework has the potential to become a universally applicable tool for pavement performance prediction, contributing to more cost-effective and adaptive road maintenance strategies worldwide.

Author Contributions

Conceptualization, F.Y.; methodology, Z.Z.; software, X.C. and Z.Z.; validation, X.C. and Z.Z.; formal analysis, X.C.; investigation, X.C. and F.Y.; resources, F.Y.; data curation, X.C.; writing—original draft preparation, X.C.; writing—review and editing, F.Y.; visualization, Z.Z.; supervision, F.Y.; project administration, F.Y. and Z.Z.; funding acquisition, F.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Ministry of Transport of China, grant number 2022FY101400.

Data Availability Statement

Data is still unavailable due to privacy or ethical restrictions.

Conflicts of Interest

The project is a long-term field observation study aimed at improving the understanding of pavement performance under diverse climatic conditions. This research involves no commercial interests, and no individual or organization has gained or will gain financial benefits from the outcomes of this study. Additionally, the research is purely academic, with no associated patents or related engineering projects. Therefore, the authors declare that there are no conflicts of interest regarding the publication of this paper.

References

  1. Zhang, T.; Smith, A.; Zhai, H.; Lu, Y. LSTM+MA: A Time-Series Model for Predicting Pavement IRI. Infrastructures 2025, 10, 10.
  2. Dalla Rosa, F.; Liu, L.; Gharaibeh, N.G. IRI prediction model for use in network-level pavement management systems. J. Transp. Eng. Part B Pavements 2017, 143, 04017001.
  3. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
  4. Zhou, H.; Zhang, S.; Peng, J.; Zhang, S.; Li, J.; Xiong, H.; Zhang, W. Informer: Beyond efficient transformer for long sequence time-series forecasting. Proc. AAAI Conf. Artif. Intell. 2021, 35, 10606–10615.
  5. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 1–11.
  6. Marcelino, P.; de Lurdes Antunes, M.; Gomes, M.C. Machine learning approach for pavement performance prediction. Int. J. Pavement Eng. 2021, 22, 341–354.
  7. Qin, Y.; Song, D.; Cheng, H.; Cheng, W. A dual-stage attention-based recurrent neural network for time series prediction. arXiv 2017, arXiv:1704.02971.
  8. Deng, F.; Tang, X.; Wu, T. Particle swarm optimization-augmented feedforward networks for pavement rutting prediction. J. Comput. Civ. Eng. 2015, 29, 04014097.
  9. Wang, K.; Huang, Y.; Zhang, Z. Hybrid gray relation analysis and support vector regression for pavement performance prediction. Int. J. Pavement Res. Technol. 2018, 11, 287–294.
  10. Gong, W.; Chen, H.; Zhang, J. Random forest regression for flexible pavement IRI modeling. Transp. Res. Rec. 2019, 2673, 299–309.
  11. Hoang, H.-G.T.; Nguyen, H.-L.; Nguyen, T.-A.; Ly, H.-B. Hybrid machine learning approach for prediction and design optimization of marshall stability in graphene oxide-modified asphalt concrete. Environ. Res. 2025, 285, 122646.
  12. Damirchilo, B.; Yazdani, R.; Khavarian, M. XGBoost-based pavement performance prediction and handling missing values. Int. J. Pavement Eng. 2020, 21, 456–466.
  13. Song, Y.; Wang, L.; Yin, J. ThunderGBM-based ensemble learning for asphalt pavement IRI prediction. Constr. Build. Mater. 2021, 313, 125421.
  14. Zhou, X.; Li, J.; Fang, S. RNN-based models for asphalt pavement performance prediction using LTPP data. Int. J. Pavement Res. Technol. 2018, 11, 559–567.
  15. Tamagusko, T.; Ferreira, A. Pavement Performance Prediction using Machine Learning: Supervised Learning with Tree-Based Algorithms. Transp. Res. Procedia 2025, 82, 2521–2531.
  16. Han, Z.; Liu, F.; Zhang, X. Modified RNN for falling weight deflectometer back-calculation. J. Comput. Civ. Eng. 2020, 34, 04020023.
  17. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving PDEs. J. Comput. Phys. 2019, 378, 686–707.
  18. Karniadakis, G.E.; Kevrekidis, I.G.; Lu, L.; Perdikaris, P.; Wang, S.; Yang, L. Physics-informed machine learning. Nat. Rev. Phys. 2021, 3, 422–440.
  19. Toba, A.-L.; Kulkarni, S.; Khallouli, W.; Pennington, T. Long-Term Traffic Prediction Using Deep Learning Long Short-Term Memory. Smart Cities 2025, 8, 126.
  20. Tartakovsky, A.M.; Marrero, C.O.; Perdikaris, P.; Tartakovsky, G.D.; Barajas-Solano, D. Physics-informed deep neural networks for learning parameters and constitutive relationships in subsurface flow problems. Water Resour. Res. 2020, 56, e2019WR026731.
  21. Lu, L.; Meng, X.; Mao, Z.; Karniadakis, G.E. DeepXDE: A deep learning library for solving differential equations. SIAM Rev. 2021, 63, 208–228.
  22. Dai, H.; Liu, Z.; Dai, J.; Liu, Y. An adaptive spatio-temporal attention mechanism for traffic prediction. IEEE Internet Things J. 2021, 8, 11915–11924.
  23. Xu, Z.; Feng, C.; Zhang, L. Physics-enhanced neural networks for multiscale spatiotemporal traffic prediction. Transp. Res. Part C: Emerg. Technol. 2022, 137, 103597.
  24. Sun, L.; Wang, Z.; Zhang, H. Physics-informed deep learning for time-series forecasting in engineering applications. Mech. Syst. Signal Process. 2021, 152, 107377.
  25. Tian, G.; Zhang, C.; Shi, Y.; Li, X. Multi-Wave-Net: A long time series forecasting framework based on multi-scale analysis and multi-channel feature fusion. Expert Syst. Appl. 2024, 251, 124088.
  26. Child, R.; Gray, S.; Radford, A.; Sutskever, I. Generating long sequences with sparse transformers. arXiv 2019, arXiv:1904.10509.
  27. Beltagy, I.; Peters, M.E.; Cohan, A. Longformer: The long-document transformer. arXiv 2020, arXiv:2004.05150.
  28. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Transformer-XL: Attentive language models beyond a fixed-length context. Proc. Annu. Meet. Assoc. Comput. Linguist. 2019, 2978–2988.
  29. Sayers, M.W.; Gillespie, T.D.; Paterson, W.D.O. Guidelines for conducting and calibrating road roughness measurements. World Bank Tech. Pap. 1986. Available online: https://hdl.handle.net/2027.42/3133 (accessed on 15 October 2025).
  30. Sayers, M.W.; Karamihas, S.M. The Little Book of Profiling: Basic Information about Measuring and Interpreting Road Profiles; University of Michigan, Transportation Research Institute: Ann Arbor, MI, USA, 1998.
  31. Chen, Z.; Badrinarayanan, V.; Lee, C.-Y.; Rabinovich, A. GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks. Proc. Mach. Learn. Res. 2018, 80, 807–815.
  32. Sener, O.; Koltun, V. Multi-task learning as multi-objective optimization. Adv. Neural Inf. Process. Syst. (NeurIPS) 2019, 31, 527–538.
  33. Wang, S.; Teng, Y.; Perdikaris, P.; Karniadakis, G.E. Learning physics-informed neural networks without stacked back-propagation. arXiv 2021, arXiv:2109.13852.
  34. Chatterjee, S.; Singh, B. Durability of concrete in an arid environment: Challenges and remedies. Constr. Build. Mater. 2019, 223, 72–80.
  35. Mehta, P.K.; Monteiro, P.J.M. Concrete: Microstructure, Properties, and Materials; McGraw-Hill Education: Berkshire, UK, 2014.
  36. Vaitkus, A.; Čygas, D.; Laurinavičius, A.; Perveneckas, Z. Analysis and evaluation of asphalt pavement structure damages caused by frost. Balt. J. Road Bridge Eng. 2009, 4, 196–202.
  37. Willmott, C.J.; Matsuura, K. Advantages of the mean absolute error (MAE) over the root mean square error (RMSE) in assessing average model performance. Clim. Res. 2005, 30, 79–82.
  38. Steel, R.G.D.; Torrie, J.H.; Dickey, D.A. Principles and Procedures of Statistics: A Biometrical Approach; McGraw-Hill Education: Berkshire, UK, 1997.
  39. Saltelli, A.; Ratto, M.; Andres, T.; Campolongo, F.; Cariboni, J.; Gatelli, D.; Saisana, M.; Tarantola, S. Global Sensitivity Analysis: The Primer; John Wiley & Sons: Hoboken, NJ, USA, 2008.
  40. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016.
  41. Prechelt, L. Early stopping—But when? In Neural Networks: Tricks of the Trade; Springer: Berlin/Heidelberg, Germany, 1998; pp. 55–69.
  42. Li, N.; Zhang, J.; Peng, Z. Influence of environmental factors on the deterioration of concrete pavements in cold regions. Cold Reg. Sci. Technol. 2012, 83–84, 59–66.
Figure 1. The structure of Physics-Aware Informer framework.
Figure 2. Weather station.
Figure 3. Temperature and humidity sensors.
Figure 4. Strain and stress sensors.
Figure 5. Joint fiber-optic displacement sensor.
Figure 6. Overall layout diagram of the sensors.
Figure 7. Road performance testing vehicle.
Figure 8. Residual curve graphs of PA-Informer and Informer models with different time-series lengths (BJ: Beijing; GX: Guangxi; XJ: Xinjiang; TB: Tibet).
Figure 9. The MSE curves of the selected models.
Figure 10. Performance comparison of different models with an input window length of 120 weeks.
Figure 11. Sensitivity analysis for different feature indicators (u, σ, SR, and P).
Table 1. The parameters of environment and materials.
| Symbol | Physical Meaning |
|---|---|
| x, y, z | Coordinate |
| t | Time stamp |
| σ | Internal stress |
| u | Horizontal displacement at joints |
| Δh | Height difference at the joint |
| L | Length of the road (in this study, L is fixed at 100 m) |
| C | Fourth-order elastic stiffness tensor, defined by E and v |
| I | Cubic tensor |
| T | Atmospheric temperature |
| T_c | Internal temperature of cement concrete |
| H_c | Internal humidity of cement concrete |
| α | Thermal expansion coefficient |
| β | Hygroscopic expansion coefficient |
| SR | Solar radiation |
| P | Precipitation |
| κ_T | Thermal conductivity, 1.42 W/m·K |
| D_H | Moisture diffusion coefficient, 0.068 mm²/s |
Table 2. Characteristics of typical environmental conditions of observation points.
| Dataset Types | Typical Environments | Characteristics of Natural Conditions of the Pavements | Provinces |
|---|---|---|---|
| Training set | Arid desert | In the arid desert region, cement concrete pavements face challenges from thermal expansion and contraction due to extreme temperature fluctuations, leading to frequent cracking and joint damage. Sand erosion can abrade the pavement surface, while dust accumulation may reduce skid resistance [32]. | Xinjiang |
| Training set | Hot and humid | In humid and rainy regions, excessive moisture infiltrates concrete pavements, causing joint spalling, surface scaling, and subgrade erosion. Persistent water exposure can also lead to damage at joints and cracks, accelerating structural deterioration [33]. | Guangxi |
| Training set | Lightly frozen | Concrete pavements in light ice regions are affected by freeze–thaw cycles, which cause frost heave, cracking, and surface scaling. De-icing chemicals exacerbate surface deterioration and may lead to joint damage, weakening the cement concrete pavement over time [34]. | Beijing |
| Test set | High-altitude cold region | In Qinghai–Tibet Plateau, the extreme cold and presence of permafrost cause significant frost heave and thaw settlement, leading to uneven surfaces and cracking in concrete pavements. The harsh environment accelerates damage to joints and surface layers, reducing durability [32]. | Tibet |
Table 3. The corresponding formulas of the indicators.
| Indicators | Formulas |
|---|---|
| MSE | $\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$ |
| R² | $R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$ |
| Sp | $S_p = \frac{\left[y_{IRI}(p + \Delta p) - y_{IRI}(p)\right] / y_{IRI}(p)}{\Delta p / p}$ |
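The two accuracy indicators in Table 3 can be computed directly from paired observations and predictions. The short Python sketch below uses the standard residual-over-total form of R²; the function names and the four sample values are assumptions made for illustration and are not data from this study.

```python
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """MSE = (1/n) * sum((y_i - yhat_i)^2)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """R^2 = 1 - SS_res / SS_tot, with SS_tot taken about the mean of y_true."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observations are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical IRI observations and predictions (m/km), for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
mse_value = mean_squared_error(observed, predicted)
r2_value = r_squared(observed, predicted)
print(mse_value, r2_value)
```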
Table 4. Value of hyperparameters.
| Hyperparameters | Initial Value | Final Value |
|---|---|---|
| Encoder layers | 3 | 3 |
| Decoder layers | 2 | 2 |
| Token embedding dimension | 5 | 5 |
| Dimension of the hidden layer of feed-forward neural network | 20 | 20 |
| Learning rate | 0.0001 | 0.001 |
| Learning rate decay | 0.5 | 0.8 |
| Epoch | 100 | 20 |
| Batch size | 32 | 32 |
| λPhysics and λdata | 0.1 and 0.9 | 0.46 and 0.54; 0.39 and 0.61; 0.33 and 0.67; 0.21 and 0.79 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Cao, X.; Zeng, Z.; Yi, F. Physics-Aware Informer: A Hybrid Framework for Accurate Pavement IRI Prediction in Diverse Climates. Infrastructures 2025, 10, 278. https://doi.org/10.3390/infrastructures10100278
