Article

State-Dependent Asphalt Pavement Deterioration Modeling via Noise-Filtered Reaction Signatures: A Data-Driven Framework Using Korea Highway Pavement Management System (K-HPMS) Data

1 Department of Transportation Engineering, Myongji University, Yongin-si 17058, Gyeonggi-do, Republic of Korea
2 Korea Expressway Corporation Research Institute, Dongtan-myeon, Hwaseong-si 18489, Gyeonggi-do, Republic of Korea
3 Department of Smart-Mobility Engineering, Myongji University, Yongin-si 17058, Gyeonggi-do, Republic of Korea
* Author to whom correspondence should be addressed.
Infrastructures 2026, 11(1), 15; https://doi.org/10.3390/infrastructures11010015
Submission received: 24 November 2025 / Revised: 31 December 2025 / Accepted: 4 January 2026 / Published: 6 January 2026

Abstract

Conventional pavement management systems (PMSs) often rely on static age-based assumptions, which can fail to capture nonlinear, state-dependent deterioration and improvement-like responses observed in long-term monitoring data. This study addresses these limitations by proposing a reaction-oriented analytical framework using eight years of Korea Highway PMS data (2015–2022). We construct a Δ–State Vector by combining the previous-year condition grade with noise-filtered annual changes in the International Roughness Index (IRI) and Rut Depth (RD). Measurement noise is separated from structural signals via MAD-based noise bands (ΔIRI: ±0.089 m/km; ΔRD: ±0.993 mm), with a global MAD floor (minimum-threshold constraint) to avoid degenerate zero-band cases under sparse or near-constant transitions. The resulting vectors are embedded into a low-dimensional Reaction Space using UMAP and clustered with HDBSCAN. To validate interpretability, a rule-based Trend × Mode Reaction Signature taxonomy is used to assess the semantic consistency of unsupervised clusters. Five dominant reaction regimes are identified, showing strong agreement with signature-based labels (weighted purity = 0.927; coverage for purity ≥ 0.60 = 0.911). Overall, the results indicate that deterioration dynamics are governed by lane–segment heterogeneity and prior-state dependence rather than chronological age, providing a reproducible foundation for future event-sensitive, dynamic age reset frameworks.

1. Introduction

The rapid aging of highway pavement infrastructure has become a critical challenge in many countries, and Korea is no exception. Large portions of the national expressway network constructed between the 1970s and 1990s are now reaching or exceeding their intended service life, resulting in increasing frequencies of structural and functional distresses such as cracking, rutting, and potholes. According to the Ministry of Land, Infrastructure and Transport [1], approximately 15% of the national road network had already surpassed 30 years of service life as of 2019, and this proportion is projected to rise rapidly to 46% and 87% in 10 and 20 years, respectively. As deterioration accelerates with aging, the effectiveness of pavement maintenance strategies and, more fundamentally, the accuracy of deterioration modeling have become central concerns for both policy and engineering practice. Numerous studies have emphasized that reliable deterioration prediction is essential for sustainable pavement asset management and long-term budget planning [2]. In addition, the mechanisms of pavement response and damage are closely linked to vehicle–road interaction and traffic loading characteristics, which motivates the need to interpret performance changes in the context of loading-induced effects [3].
Korea’s expressway operator has used a Pavement Management System (PMS) since the 1990s, employing performance indices, such as the International Roughness Index (IRI), Rut Depth (RD), and Surface Distress (SD), to evaluate the pavement’s functional condition. These indices are monitored annually at approximately 0.1 km intervals, producing a dense time-series structure covering route, direction, lane, and segment levels. The Korea Highway Pavement Management System (hereafter, K-HPMS) provides a comprehensive dataset for characterizing pavement behavior; however, its decision-making utility is constrained by long-standing limitations in both data processing and model formulation.
Recent domestic and international studies have advanced data-driven and probabilistic approaches for pavement performance prediction and have also addressed practical data challenges in PMS (e.g., Bayesian estimation, ANN-based prediction, LTPP data utilization, and missing-data handling) [4,5,6,7].
First, aggregation bias inherently arises when pavement data are analyzed at higher hierarchical levels—such as the route or roadway direction—where micro-level deterioration signals at the lane or segment level are diluted by averaging effects. Prior studies have shown that aggregating short survey sections into longer “homogeneous” sections can distort condition statistics and obscure local extremes and dispersion, whereas finer sectioning better preserves localized variability relevant for project-level decisions [8,9]. This implies that deterioration analysis and modeling must focus on Lane–Segment-level data to avoid misleading interpretations of the pavement condition.
Second, traditional PMS deterioration models often parameterize deterioration primarily as a function of static age—either construction age or time since first measurement. However, actual pavement lifecycle behavior can involve irregular and nonlinear transitions driven by traffic loading, climatic effects, and maintenance interventions. When maintenance actions such as patching, milling, resurfacing, or localized repairs occur, the pavement condition may exhibit immediate improvement followed by renewed deterioration, and incorporating treatment effects is therefore important for reliable performance forecasting and planning [10,11]. In practice, when maintenance history is not explicitly modeled, improvement-like changes may be inadequately represented within purely age-driven deterioration structures, which can bias inferred deterioration rates and mask slope changes associated with recovery.
Third, pavement performance evolution is widely recognized as nonlinear, and long-term monitoring databases (e.g., LTPP) have enabled the development and validation of data-driven nonlinear prediction models, including deep neural networks for rutting progression [6,12]. Despite this, linear or quasi-linear models remain widely used in current PMS practice for convenience, and such models cannot reflect acceleration phenomena or inflection points. Because pavement performance is influenced by a wide range of heterogeneous factors, including geometry, traffic, temperature, moisture, and structural configuration, deterioration trajectories are inherently non-stationary and vary across segments.
Fourth, although absolute values of IRI, RD, and SD describe the present level of deterioration, year-to-year changes provide direct insight into the direction and magnitude of performance transitions and can reveal atypical responses that are not apparent in static summaries. Yet, change-based analysis has remained underutilized in PMS research. A key difficulty arises from measurement noise stemming from seasonal influences, sensor variability, and survey conditions, such that Δ values often mix structural signals with stochastic fluctuations and thus complicate interpretation. While several studies have developed data-driven prediction models for IRI using machine learning approaches [11], there is still no established framework for systematically quantifying noise, suppressing non-structural variability, and encoding the remaining structural responses into a coherent multivariate representation suitable for state-dependent analysis.
Finally, although state-based modeling approaches such as Markov chains have been widely adopted in PMS, they inherently rely on discretizing continuous performance indices into ordinal states, leading to information loss and limited ability to represent abrupt or nonlinear changes [13,14]. Moreover, transitions between states are typically modeled using empirical probabilities that do not account for maintenance-driven resets or multimodal responses. As a result, existing state models do not capture dynamic deterioration behaviors emerging from heterogeneous structural conditions and irregular maintenance.
These limitations collectively highlight the need for a new deterioration modeling paradigm that (1) reflects Lane–Segment-level heterogeneity, (2) incorporates noise-filtered year-to-year changes, (3) recognizes nonlinear and state-dependent deterioration structures, and (4) integrates maintenance-driven recovery through dynamic age adjustment.
To address these gaps, this study proposes a Δ–State Vector and Reaction Signature framework for asphalt pavement deterioration analysis using eight years of K-HPMS segment-level data (2015–2022). The contributions of this research are fourfold.
First, we quantify hierarchical variance across K-HPMS levels and demonstrate that deterioration behavior is governed by segment-level heterogeneity, empirically verifying the presence of aggregation bias in network-level analyses.
Second, we compute robust Noise Bands for ΔIRI and ΔRD, enabling the separation of structural deterioration from measurement noise—a process essential for constructing reliable change-based models.
Third, by integrating previous-year condition grades with noise-filtered changes, we develop a Δ–State Vector that embeds state dependency and directional deterioration information in a compact multivariate representation.
Fourth, we apply UMAP and HDBSCAN to identify five dominant reaction regimes (UMAP–HDBSCAN clusters) and evaluate their consistency with the rule-based Trend × Mode Reaction Signature taxonomy.
Through these contributions, this study provides a data-driven foundation for a future event-based Dynamic Age Reset framework, in which pavement age is redefined according to the actual condition response rather than chronological time. The results offer practical implications for PMS optimization, deterioration prediction, and maintenance prioritization. Hierarchical tests (L0–L4) are used only to diagnose aggregation distortion, whereas reaction modeling and clustering are conducted at the Lane–Segment resolution (L5).

2. Literature Review

2.1. Time-Series Characteristics of Pavement Performance and the Need for Noise Treatment

Pavement performance evolves over time under the combined influence of traffic loading, environmental cycles, and long-term material degradation. Numerous studies have described pavement performance indicators (e.g., roughness/IRI and distress-related measures) as continuous-valued time-series signals reflecting functional and structural condition [5,11,15]. In the K-HPMS lane–segment data analyzed in this study, SD is strongly zero-inflated with sparse event-like spikes, and its year-to-year variation can behave differently from smoother functional indices (e.g., the IRI), motivating separate handling. The classical interpretation of pavement deterioration is commonly represented in PMS practice and agency-level condition modeling using a performance curve with an initial stable stage, followed by gradual deterioration and accelerated decline [13,16].
However, real-world PMS datasets seldom present smooth performance-curve trajectories. Yearly performance measurements include substantial measurement noise arising from seasonal variation, sensor calibration uncertainty, and survey conditions. This is especially pronounced in HPMS-type datasets with annual observation intervals, where small environmental fluctuations are frequently misinterpreted as structural deterioration. Prior research has highlighted the challenges of accurately predicting pavement performance indicators (e.g., IRI) from field data, motivating the use of data-driven modeling approaches that can better capture nonlinear and time-dependent patterns [11].
Recent overview studies emphasize that, as pavement monitoring and PMS databases continue to grow, extracting reliable deterioration information increasingly depends on data collection and preprocessing (e.g., fusion and cleaning) and on selecting appropriate data-analysis models and their combinations [6,7,15,17]. Moreover, when field measurements exhibit nonstationary variance and high volatility, robust data interpretation and forecasting frameworks become necessary to avoid overfitting to noise and to maintain reliability under uncertain conditions [2]. Despite this recognition, the literature still lacks a systematic procedure for estimating noise thresholds using robust statistics (e.g., IQR/2 and MAD) and formally distinguishing noise-dominated changes from structural responses.

2.2. State-Based Modeling in PMS and Its Structural Limitations

State-based performance modeling—particularly Markov chain approaches—has a long history in pavement management. The core idea is to discretize continuous performance indicators into a finite set of ordinal states and predict transitions between them using a transition probability matrix (TPM) [14,18].
$$\pi_t = [\pi_t^1, \pi_t^2, \ldots, \pi_t^n], \qquad \sum_{i=1}^{n} \pi_t^i = 1$$
Let $P$ denote the transition probability matrix, where $P_{ij} = P(S_{t+1} = j \mid S_t = i)$, and let $\pi_t$ be the state probability vector at time $t$. Under the Markov assumption, the evolution of pavement condition can be expressed as $\pi_{t+1} = \pi_t P$. This TPM-based framework has been widely adopted in both research and practice to represent probabilistic transitions among discretized pavement condition states. Moreover, prior studies have extended the basic formulation to better reflect nonstationary deterioration under evolving conditions, including time-varying (non-homogeneous) and staged-homogeneous Markov structures [14,19].
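As a minimal sketch of the recursion $\pi_{t+1} = \pi_t P$, the following Python snippet propagates a state probability vector through a small transition matrix. The three-state matrix and its values are hypothetical and chosen only for illustration; they are not estimated from K-HPMS data.

```python
import numpy as np

# Hypothetical 3-state transition probability matrix (each row sums to 1).
# Illustrative values only, not estimated from any PMS dataset.
P = np.array([
    [0.8, 0.2, 0.0],
    [0.0, 0.7, 0.3],
    [0.0, 0.0, 1.0],
])

def propagate(pi0, P, steps):
    """Apply pi_{t+1} = pi_t P for the given number of steps."""
    pi = np.asarray(pi0, dtype=float)
    for _ in range(steps):
        pi = pi @ P
    return pi

pi0 = np.array([1.0, 0.0, 0.0])  # all sections start in the best state
pi2 = propagate(pi0, P, steps=2)
print(pi2)
```

Because every row of the matrix sums to one, the propagated vector remains a valid probability distribution at each step.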
Nevertheless, several structural limitations have been consistently documented.
First, discretization of continuous variables (e.g., IRI and RD) into coarse states introduces information loss and reduces sensitivity to abrupt or non-monotonic changes [13].
Second, Markov models often assume monotonic deterioration and therefore do not explicitly capture sudden improvements associated with maintenance activities, unless treatment effects are separately modeled or transition matrices are stratified by treatment [14].
Third, transition probabilities are commonly estimated from aggregated or network-level data, which can fail to reflect the highly localized deterioration behavior observed at the Lane–Segment scale [8,9].
While state models provide a statistically interpretable framework, they are inherently insensitive to nonlinear transitions, especially those arising from maintenance-induced recovery or accelerated deterioration. This gap underscores the need for a state representation that remains continuous, maintenance-aware, and responsive to year-to-year changes—leading directly to the Δ–State Vector concept proposed in this study.

2.3. Multivariate and High-Dimensional Feature Analysis for Pavement Data

Pavement performance data are inherently multivariate and influenced by numerous interacting factors—including traffic loading, geometric design, climate variability, material characteristics, and accumulated damage. Recent studies have applied dimension-reduction and neural-network-based approaches (e.g., PCA-integrated neural networks) to model complex, multivariate relationships in asphalt material properties, improving prediction performance under high-dimensional and correlated inputs [20]. Classical multivariate approaches, such as Principal Component Analysis (PCA), have also been employed to extract dominant modes of variation and construct composite representations from multiple pavement condition indicators [21]. While effective for summarizing linear correlations, PCA assumes global linearity and fails to capture the nonlinear, multimodal deterioration patterns typically present in HPMS-type datasets—such as transitions between stable, mixed, and accelerated deterioration regimes. Because reaction data are highly imbalanced (dominant near-zero responses with sparse extreme events), as also observed in our pooled ΔIRI/ΔRD distributions (Section 4.2), variable-density clustering is essential to avoid forcing rare but critical regimes into broad clusters.
To overcome these limitations, research has turned toward nonlinear manifold learning. UMAP (Uniform Manifold Approximation and Projection) has gained attention for preserving both local and global structure while effectively handling high-dimensional, non-uniform data distributions [22]. In pavement-related studies, UMAP has been used to project and visualize high-dimensional features for pavement condition/roughness recognition, revealing clearer separability among condition classes in the embedded space [23].
Complementing nonlinear projection, density-based clustering methods such as HDBSCAN offer advantages over traditional K-means for heterogeneous deterioration data. K-means implicitly assumes clusters with relatively simple geometry and comparable variance, and it can be sensitive to outliers; these assumptions are often violated in heavy-tailed deterioration signals, motivating density-based alternatives [24]. In contrast, HDBSCAN naturally accommodates variable-density clusters, identifies noise points, and provides stable cluster extraction across multimodal distributions [24,25]. Despite these advantages, the integration of UMAP + HDBSCAN has not been fully explored within PMS frameworks—particularly for modeling deterioration reactions rather than static conditions—representing a significant opportunity for advancement.

2.4. Change and Reaction-Based Modeling: Toward Dynamic Age Reset

Traditional PMS research primarily focuses on absolute performance values, yet growing evidence shows that yearly changes—ΔIRI, ΔRD, and ΔSD—offer superior insight into underlying deterioration processes, maintenance effects, and abnormal responses. Durango-Cohen and Madanat [26] proposed an adaptive-control formulation for deriving optimal maintenance and repair policies under uncertainty in facility deterioration rates, emphasizing the need to update deterioration characterization using observed condition information over the planning horizon. However, no standardized framework has emerged that combines noise-filtered changes, previous-year condition states, and nonlinear clustering into a unified deterioration modeling approach.
A Bayesian network (BN) is a probabilistic graphical model that represents conditional dependencies among variables and supports inference under uncertainty; in practical monitoring applications, its uncertainty propagation can be quantified via sampling-based approaches (e.g., Monte Carlo) [27]. Recent reviews further document how BNs have been deployed across civil infrastructure assessment, spanning Structural Health Monitoring (SHM)-driven diagnosis and reliability evaluation, and emphasize their key advantages in uncertainty handling, multi-source data fusion, and posterior updating as new evidence becomes available [27,28]. In parallel, the adoption of data-driven models in pavement engineering has increased, with demonstrated capability to learn distress development patterns and improve condition prediction performance relative to traditional approaches, thereby supporting more sustainable network-level management [17].
Furthermore, recent studies have highlighted the need for event-sensitive “effective age” (age reset) mechanisms that adjust pavement age after maintenance and rehabilitation, rather than continuously accumulating chronological time [5,6]. While such event-based lifecycle concepts are increasingly discussed, practical data-driven implementations remain limited, largely because maintenance records are often incomplete, inconsistent, or unavailable in routine PMS databases. As a result, purely age-driven models may inadequately represent improvement-like responses and can obscure slope changes associated with recovery, leading to biased inference of deterioration rates.
Taken together, these findings suggest that future PMS frameworks should adopt reaction-driven, state-dependent, and event-sensitive deterioration structures. In this context, the present study provides empirical building blocks—noise-filtered annual reactions and interpretable reaction regimes—that can be integrated into future data-driven age reset models when explicit M&R logs become available.

3. Methodology

This study develops a reaction-based analytical framework grounded in the Δ–State Vector, designed to jointly interpret pavement performance from the perspectives of absolute state and year-to-year structural change. This framework enables the detection of non-stationary deterioration behaviors, mixed responses, and improvement-like recovery patterns that conventional linear PMS models cannot capture, as illustrated in Figure 1.

3.1. Data Source and Study Scope

This study utilizes eight consecutive years (2015–2022) of pavement performance data from the K-HPMS. Only asphalt pavement (ACP) segments were included to ensure material homogeneity in deterioration behavior. Each record is structured according to a strict hierarchy consisting of Year → Route → Direction → Lane → Segment (0.1 km), with performance indices—IRI, RD, and SD—measured annually using standardized automated survey equipment. Because PMS deterioration behavior is governed by localized heterogeneity, the primary analytical resolution of this study is the Lane–Segment level (Level 5; 0.1 km), unless otherwise noted. Hierarchical-level analyses (L0–L4) are used only to diagnose aggregation distortion and distributional bias, whereas all reaction modeling, clustering, and hotspot analyses are conducted at the Lane–Segment resolution (L5) (Table 1).

3.2. Hierarchical Variability and Aggregation Bias Assessment

To determine the appropriate modeling scale, this study evaluates how the statistical properties of IRI, RD, and SD vary across the K-HPMS hierarchical levels. Although Level 5 provides Lane–Segment (0.1 km) resolution, formal statistical tests were conducted only across Levels 0–4 (L0–L4) because Level 5 groups are essentially singleton cells (≈1 observation per segment–year group), which makes within-group variance undefined and ANOVA-type comparisons ill-posed. Differences in distributional properties were therefore assessed by comparing groups defined at each aggregation level, and the L0–L4 comparisons should be interpreted as a diagnostic of aggregation-induced bias, not as a substitute for segment-level inference. Three complementary analyses were conducted, as outlined below.

3.2.1. Brown-Forsythe Test

This test is used to examine variance heterogeneity among hierarchical groups by transforming each observation into the absolute deviation from the group median.
$$z_{gi} = \left| y_{gi} - \tilde{y}_g \right|$$
A significant $p$-value ($p < 0.05$) indicates that variances differ among groups within the given aggregation level, implying aggregation-dependent variability.
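The transformation above can be sketched in Python as follows. The snippet assumes SciPy is available and uses synthetic groups (not K-HPMS data). It computes the Brown-Forsythe statistic in two ways: as a one-way ANOVA on the absolute deviations from each group median, and through `scipy.stats.levene` with `center="median"`, which is the same test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic groups with equal means; the third has a larger spread.
groups = [
    rng.normal(2.0, 0.3, 200),
    rng.normal(2.0, 0.3, 200),
    rng.normal(2.0, 1.2, 200),
]

# z_gi = |y_gi - median_g|: absolute deviation from the group median.
z = [np.abs(g - np.median(g)) for g in groups]

# Brown-Forsythe statistic as a one-way ANOVA on the transformed values.
f_manual, p_manual = stats.f_oneway(*z)

# The same test through SciPy's Levene routine with median centering.
bf_stat, bf_p = stats.levene(*groups, center="median")

print(f_manual, bf_stat, bf_p)
```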

3.2.2. Welch ANOVA

This test is applied when heteroscedasticity is detected, providing robust comparisons of mean values among groups without assuming equal variances. Welch ANOVA uses unequal-variance weights,
$$w_g = \frac{n_g}{s_g^2}$$
to compute an approximate $F$ statistic with Satterthwaite-type degrees-of-freedom adjustment, thereby reducing bias under variance heterogeneity. A significant $p$-value ($p < 0.05$) indicates that group means differ at the corresponding aggregation level.
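A minimal sketch of Welch's one-way ANOVA using these weights is given below. It follows the standard textbook form of Welch's statistic, which is assumed here rather than taken from the study's own code; the function name `welch_anova` and the synthetic groups are hypothetical.

```python
import numpy as np
from scipy import stats

def welch_anova(groups):
    """Welch's one-way ANOVA: returns (F, df1, df2, p)."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    means = np.array([np.mean(g) for g in groups])
    variances = np.array([np.var(g, ddof=1) for g in groups])

    w = n / variances                      # w_g = n_g / s_g^2
    w_sum = w.sum()
    weighted_mean = (w * means).sum() / w_sum

    # Term shared by the F denominator and the adjusted df.
    lam = (((1.0 - w / w_sum) ** 2) / (n - 1.0)).sum()

    numerator = (w * (means - weighted_mean) ** 2).sum() / (k - 1)
    denominator = 1.0 + 2.0 * (k - 2) / (k**2 - 1) * lam
    f_stat = numerator / denominator

    df1 = k - 1
    df2 = (k**2 - 1) / (3.0 * lam)         # Satterthwaite-type adjustment
    p_value = stats.f.sf(f_stat, df1, df2)
    return f_stat, df1, df2, p_value

rng = np.random.default_rng(1)
a = rng.normal(0.0, 1.0, 30)   # synthetic group, small variance
b = rng.normal(0.5, 3.0, 50)   # synthetic group, large variance
f_stat, df1, df2, p_value = welch_anova([a, b])
print(f_stat, df1, df2, p_value)
```

With exactly two groups, the statistic reduces to the square of Welch's unequal-variance t statistic, which offers a convenient sanity check against `scipy.stats.ttest_ind(a, b, equal_var=False)`.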

3.2.3. Effect Size ($\eta^2$)

The effect size quantifies the proportion of total variance explained by the grouping structure at each hierarchical level:
$$\eta^2 = \frac{SS_{\mathrm{between}}}{SS_{\mathrm{total}}}$$
Interpretation follows [29]:
  • $\eta^2 < 0.01$: negligible;
  • $0.01 \le \eta^2 < 0.06$: small;
  • $0.06 \le \eta^2 < 0.14$: moderate;
  • $\eta^2 \ge 0.14$: large.
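The ratio and its interpretation bands can be sketched as below. The function names and the toy groups are hypothetical, and the boundaries are treated as half-open intervals, which is an assumption about how the thresholds are applied.

```python
import numpy as np

def eta_squared(groups):
    """eta^2 = SS_between / SS_total for a one-way grouping."""
    all_values = np.concatenate([np.asarray(g, dtype=float) for g in groups])
    grand_mean = all_values.mean()
    ss_total = ((all_values - grand_mean) ** 2).sum()
    ss_between = sum(len(g) * (np.mean(g) - grand_mean) ** 2 for g in groups)
    return ss_between / ss_total

def effect_label(eta2):
    """Map eta^2 to the interpretation bands listed above."""
    if eta2 < 0.01:
        return "negligible"
    if eta2 < 0.06:
        return "small"
    if eta2 < 0.14:
        return "moderate"
    return "large"

toy_groups = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # toy data, not PMS values
eta2 = eta_squared(toy_groups)
print(eta2, effect_label(eta2))
```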

3.3. Annual Change (Δ) Computation and Noise Band (ε) Estimation

For each segment $i$ and performance index $Y \in \{\mathrm{IRI}, \mathrm{RD}, \mathrm{SD}\}$, the year-to-year change is defined as
$$\Delta Y_{i,t} = Y_{i,t} - Y_{i,t-1}$$
Because Δ values mix structural signals with measurement noise, the study estimates a Noise Band ($\varepsilon_Y$) using the following robust statistics:
  • Interquartile Range: IQR/2;
  • Median Absolute Deviation (MAD);
  • The 5th–95th percentile envelope.
In this study, the global Noise Band for each indicator is defined from the pooled $\Delta Y$ distribution using the MAD as a conservative scale:
$$\varepsilon_{\mathrm{global},Y} \equiv \mathrm{MAD}(\Delta Y)$$
To accommodate transition-specific irregularity in the fixed panel, a transition-wise MAD scale $\varepsilon_{t,Y}$ is computed for each performance indicator $Y$. To prevent unrealistically small thresholds (e.g., when $\varepsilon_{t,Y}$ becomes near-zero), the filtering threshold was defined by enforcing a minimum-threshold (floor) constraint using the global scale:
$$\varepsilon_{t,Y}^{*} = \max\left(\varepsilon_{\mathrm{global},Y},\ \varepsilon_{t,Y}\right)$$
Here, $\varepsilon_{t,Y}^{*}$ denotes the effective Noise Band and is applied symmetrically as a two-sided threshold, i.e., $|\Delta Y_{i,t}| \le \varepsilon_{t,Y}^{*}$ is treated as noise-level variability and $|\Delta Y_{i,t}| > \varepsilon_{t,Y}^{*}$ is treated as a meaningful structural reaction.
Values within the noise band ($|\Delta Y_{i,t}| \le \varepsilon_{t,Y}^{*}$) were treated as noise, and the noise-filtered change was defined as
$$\Delta Y_{i,t}^{\mathrm{clean}} = \begin{cases} 0, & |\Delta Y_{i,t}| \le \varepsilon_{t,Y}^{*} \\ \Delta Y_{i,t}, & |\Delta Y_{i,t}| > \varepsilon_{t,Y}^{*} \end{cases}$$
Each annual change is classified as
  • Stable: $\Delta Y_{i,t}^{\mathrm{clean}} = 0$;
  • Worsen: $\Delta Y_{i,t}^{\mathrm{clean}} > 0$;
  • Improve: $\Delta Y_{i,t}^{\mathrm{clean}} < 0$.
This hard thresholding is intentionally adopted to suppress survey-condition variability and to preserve only changes that exceed the empirically derived noise scale. Because the objective of this study is identification of reaction patterns rather than continuous change estimation, retaining sub-threshold fluctuations would inflate “micro-reactions” and degrade the stability of the reaction manifold.
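The noise-band procedure of this section can be sketched in Python as follows. This is an illustration under stated assumptions, not the study's implementation: the MAD is taken as the unscaled median absolute deviation about the median, the function names are hypothetical, and the eight Δ values and the transition labels "A" and "B" are toy data. Transition "A" is deliberately near-constant so that its own MAD collapses to zero and the global floor takes over.

```python
import numpy as np

def mad(x):
    """Unscaled median absolute deviation about the median (assumed form)."""
    x = np.asarray(x, dtype=float)
    return np.median(np.abs(x - np.median(x)))

def effective_noise_bands(delta, transition):
    """eps*_t = max(eps_global, eps_t) for each transition label."""
    delta = np.asarray(delta, dtype=float)
    transition = np.asarray(transition)
    eps_global = mad(delta)
    return {
        str(t): max(eps_global, mad(delta[transition == t]))
        for t in np.unique(transition)
    }

def hard_threshold(delta, transition, eps_star):
    """Set changes inside the two-sided band to zero, keep the rest."""
    delta = np.asarray(delta, dtype=float)
    eps = np.array([eps_star[str(t)] for t in np.asarray(transition)])
    return np.where(np.abs(delta) <= eps, 0.0, delta)

def classify(delta_clean):
    """Stable / Worsen / Improve from the sign of the filtered change."""
    return np.where(delta_clean > 0, "Worsen",
                    np.where(delta_clean < 0, "Improve", "Stable"))

# Toy annual changes for two transitions; "A" is near-constant.
delta = np.array([0.0, 0.0, 0.0, 0.5, -0.1, 0.2, -0.4, 0.05])
transition = np.array(["A"] * 4 + ["B"] * 4)

eps_star = effective_noise_bands(delta, transition)
delta_clean = hard_threshold(delta, transition, eps_star)
labels = classify(delta_clean)
print(eps_star, delta_clean, labels)
```

In this toy case the transition-wise MAD of "A" is zero, so without the floor every nonzero change in that transition would be flagged as a structural reaction.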

3.4. Construction of the Δ–State Vector

To jointly represent the absolute structural condition of a pavement segment and its year-to-year performance response, this study establishes a four-dimensional Δ–State Vector. This formulation integrates (1) the previous-year condition state and (2) the current-year noise-filtered structural changes (i.e., $\Delta \mathrm{IRI}_{i,t}^{\mathrm{clean}}$ and $\Delta \mathrm{RD}_{i,t}^{\mathrm{clean}}$; Section 3.3), forming the core representation of the proposed reaction-based deterioration framework.

3.4.1. Previous-Year Condition (State) Vector

For each segment $i$, the previous-year structural condition is expressed as a two-dimensional State Vector based on the absolute values of IRI and RD:
$$S_{i,t-1} = \left(\mathrm{IRI}_{i,t-1},\ \mathrm{RD}_{i,t-1}\right)$$
This state vector represents the baseline condition from which the subsequent annual reaction is evaluated.
To ensure consistency with the maintenance grading scheme used in the K-HPMS, the condition values were additionally coded into a seven-grade classification system following the Korea Expressway Corporation Research Institute (KECRI)’s reference maintenance grading criteria [30]:
$$S_{i,t-1}^{\mathrm{grade}} = \left(\mathrm{Grade}_{i,t-1}^{\mathrm{IRI}},\ \mathrm{Grade}_{i,t-1}^{\mathrm{RD}}\right)$$
The grade definitions used in this study are summarized in Table 2. Using grade-coded states facilitates stable comparisons across segments with different deterioration histories and supports robust clustering in the reaction space. Although the grades are ordinal, they provide a standardized state representation aligned with operational PMS decision thresholds. Importantly, the grade-coding is only used to stabilize the state component for clustering and interpretation; the underlying continuous IRI/RD values remain unchanged and are retained for descriptive and validation analyses. To prevent spurious boundary-driven grade transitions under no-reaction conditions, grade-coded states were stabilized by retaining the previous-year grade when $\Delta \mathrm{IRI}_{i,t}^{\mathrm{clean}} = 0$ and/or $\Delta \mathrm{RD}_{i,t}^{\mathrm{clean}} = 0$. This stabilization does not modify the underlying continuous measurements.

3.4.2. Four-Dimensional Δ–State Vector

The final four-dimensional Δ–State Vector is defined by concatenating the previous-year state with the current-year noise-filtered changes (Section 3.3):
$$X_{i,t} = \left(S_{i,t-1},\ \Delta \mathrm{IRI}_{i,t}^{\mathrm{clean}},\ \Delta \mathrm{RD}_{i,t}^{\mathrm{clean}}\right)$$
For consistency with the maintenance grading scheme used in the K-HPMS (Table 2), a grade-coded form is additionally defined as
$$X_{i,t}^{\mathrm{grade}} = \left(S_{i,t-1}^{\mathrm{grade}},\ \Delta \mathrm{IRI}_{i,t}^{\mathrm{clean}},\ \Delta \mathrm{RD}_{i,t}^{\mathrm{clean}}\right)$$
where $S_{i,t-1}^{\mathrm{grade}}$ uses the stabilized grade under no-reaction conditions as described in Section 3.4.1. This representation captures both the absolute condition level (state) and the direction/magnitude of annual reactions (change), enabling state-dependent analysis of deterioration dynamics and mixed/transitional behaviors.
The Δ–State Vector deliberately combines an ordinal state descriptor with continuous change descriptors to capture state dependence without discarding the directional reaction magnitude. To mitigate scale incompatibility, subsequent embedding uses Z-score standardization across all components (Section 3.5), so that the state and reaction terms contribute comparably in the neighborhood graph construction. Surface distress (SD) is not included in the Δ–State Vector to avoid distorting distance-based embedding; its distributional characteristics are reported in Section 4.2.3 (see also Figure 2).
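As a rough sketch, the grade-coded Δ–State Vector can be assembled as shown below. The grade boundaries are hypothetical placeholders, since the Table 2 thresholds are not reproduced in this section, and the function names are illustrative. Grade stabilization under no-reaction conditions is omitted for brevity.

```python
import numpy as np

# HYPOTHETICAL grade boundaries (six edges give seven grades).
# These are placeholders, NOT the Table 2 thresholds.
IRI_EDGES = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5])    # m/km
RD_EDGES = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 12.0])   # mm

def to_grade(values, edges):
    """Map continuous values to ordinal grades 1..7."""
    return np.digitize(values, edges) + 1

def delta_state_vector(iri_prev, rd_prev, d_iri_clean, d_rd_clean):
    """Stack (Grade_IRI, Grade_RD, dIRI_clean, dRD_clean) row-wise."""
    return np.column_stack([
        to_grade(iri_prev, IRI_EDGES),
        to_grade(rd_prev, RD_EDGES),
        d_iri_clean,
        d_rd_clean,
    ])

# Toy previous-year conditions and noise-filtered changes.
iri_prev = np.array([0.8, 2.2, 4.0])
rd_prev = np.array([1.0, 5.0, 13.0])
d_iri_clean = np.array([0.0, 0.3, -0.5])
d_rd_clean = np.array([0.0, 0.0, -2.0])

X_grade = delta_state_vector(iri_prev, rd_prev, d_iri_clean, d_rd_clean)
print(X_grade)
```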

3.5. UMAP-Based Dimensionality Reduction Procedure

The Δ–State Vector constitutes a four-dimensional reaction descriptor (Section 3.4). Because this feature space is nonlinear and heterogeneously distributed, a nonlinear dimensionality reduction step was employed to obtain a stable low-dimensional reaction representation.
UMAP was adopted because it embeds high-dimensional observations onto a low-dimensional manifold while largely preserving local neighborhood relations [22]. This property is suitable for constructing interpretable reaction maps that highlight heterogeneous response modes [23]. The UMAP procedure in this study consists of the following steps:
  • Step 1. Input data
UMAP was applied to the grade-coded Δ–State Vector $X_{i,t}^{\mathrm{grade}}$ defined in Section 3.4.2 (Equation (12)), which concatenates the stabilized previous-year grades with the noise-filtered annual changes.
  • Step 2. Metric and Standardization
All four components were standardized using Z-score normalization. Euclidean distance was used in the standardized space. Using Euclidean distance in the standardized space yields a transparent and reproducible similarity measure for neighborhood preservation, while avoiding additional tuning associated with alternative mixed-type metrics.
  • Step 3. Hyperparameter Settings
UMAP was configured with n_neighbors = 20, min_dist = 0.1, and n_components = 2.
  • Step 4. Output Definition
For each segment–year observation $(i, t)$, UMAP produces a two-dimensional embedding:
$$Z_{i,t} = \mathrm{UMAP}(X_{i,t}), \qquad Z_{i,t} \in \mathbb{R}^2$$
This embedding defines the Reaction Space for subsequent clustering and alignment evaluation.
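The steps above can be sketched as follows, assuming the `umap-learn` package is installed. The hyperparameters match the stated settings; the `random_state` seed, the population-standard-deviation convention, and the function names are assumptions introduced for this illustration. The UMAP import is deferred to the function body so that the standardization helper can be used on its own.

```python
import numpy as np

def zscore(X):
    """Column-wise Z-score standardization (population std, assumed)."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=0)
    sd = X.std(axis=0)
    sd[sd == 0] = 1.0  # guard against constant columns
    return (X - mu) / sd

def embed_reaction_space(X_grade, seed=42):
    """Standardize the 4D vectors and embed them into 2D with UMAP."""
    import umap  # provided by the umap-learn package

    reducer = umap.UMAP(
        n_neighbors=20,
        min_dist=0.1,
        n_components=2,
        metric="euclidean",
        random_state=seed,  # assumed; no seed is reported in the text
    )
    return reducer.fit_transform(zscore(X_grade))

# Standardization check on a toy matrix with one constant column.
X_demo = np.array([[1.0, 0.0], [3.0, 0.0], [5.0, 0.0]])
Z_demo = zscore(X_demo)
print(Z_demo)
```

Calling `embed_reaction_space(X_grade)` on an array of shape (n, 4) would return an (n, 2) array of Reaction Space coordinates.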

3.6. HDBSCAN Clustering Procedure

Although UMAP provides a meaningful 2D projection of the Δ–State Vector, the resulting Reaction Space remains heterogeneous with variable-density regions and nonlinear boundaries. Therefore, this study employed HDBSCAN (Hierarchical Density-Based Spatial Clustering of Applications with Noise), which identifies clusters based on density hierarchy while separating noise points [24,25].
  • Step 1. Input data
HDBSCAN was applied to the UMAP coordinates $Z_{i,t} \in \mathbb{R}^{2}$ (Equation (13)).
  • Step 2. Hyperparameter Settings
Euclidean distance was used in the 2D UMAP space, and the clustering was configured with min_samples = 10. To avoid interpreting unstable micro-clusters, clusters with fewer than 30 observations were excluded by post-filtering and treated as noise. Observations not assigned to any retained cluster were labeled as noise; they were retained for visualization only and excluded from cluster-level alignment metrics.
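The clustering and post-filtering steps can be sketched as below. This is a hedged illustration that assumes the standalone `hdbscan` package; the helper names are hypothetical, and `min_cluster_size` is left at the library default because the text reports only min_samples and the 30-observation post-filter.

```python
import numpy as np


def drop_small_clusters(labels: np.ndarray, min_size: int = 30) -> np.ndarray:
    """Relabel clusters with fewer than `min_size` members as noise (-1)."""
    labels = np.asarray(labels).copy()
    ids, counts = np.unique(labels[labels >= 0], return_counts=True)
    small = ids[counts < min_size]
    labels[np.isin(labels, small)] = -1
    return labels


def cluster_reaction_space(Z: np.ndarray, min_size: int = 30) -> np.ndarray:
    """Run HDBSCAN on 2D UMAP coordinates, then post-filter small clusters."""
    import hdbscan  # assumption: the standalone hdbscan package is installed

    raw = hdbscan.HDBSCAN(min_samples=10, metric="euclidean").fit_predict(Z)
    return drop_small_clusters(raw, min_size)
```

Keeping the post-filter separate from the clustering call makes the 30-observation rule easy to vary in a sensitivity check.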

3.7. Rule-Based Reaction Signature Definition and Cluster–Signature Alignment Metrics

To provide an interpretable taxonomy of annual reactions and to evaluate whether unsupervised clusters reflect these interpretable patterns, rule-based Reaction Signatures are defined for each Lane–Segment transition observation $(i, t)$, and their alignment with HDBSCAN clusters is quantified. The rule-based Reaction Signature is an interpretation layer that assigns each observation to a physically readable category, whereas HDBSCAN discovers density-driven groupings in the embedded manifold. The alignment metrics quantify whether the data-driven clusters correspond to the intended, interpretable reaction taxonomy rather than to arbitrary binning.
First, noise-filtered changes were normalized by the transition-specific effective noise scales (Section 3.3; Equation (7)):
$$v_{i,t}^{IRI} = \frac{\Delta IRI_{i,t}^{clean}}{\varepsilon_{t,IRI}^{*}}, \quad v_{i,t}^{RD} = \frac{\Delta RD_{i,t}^{clean}}{\varepsilon_{t,RD}^{*}} \tag{14}$$
A reaction magnitude was defined as
$$m_{i,t} = \sqrt{\left(v_{i,t}^{IRI}\right)^{2} + \left(v_{i,t}^{RD}\right)^{2}} \tag{15}$$
The Trend (Stable/Improve/Worsen) was determined by the direction of the combined normalized change, with Stable assigned when both noise-filtered components are zero. The Mode (Zero/Clear/Mixed/Full) captures the strength and consistency of the response, using reaction magnitude and sign consistency, as described in Section 3.8.
To compare rule-based signatures with HDBSCAN clusters, a cluster–signature contingency table was constructed, and cluster purity was defined as
$$\mathrm{purity}(c) = \max_{s} \left( \frac{n_{c,s}}{n_{c}} \right)$$
where $n_{c,s}$ is the number of observations in cluster $c$ assigned to signature $s$, and $n_c$ is the size of non-noise cluster $c$ (i.e., excluding HDBSCAN noise points). Two summary alignment metrics were reported: (i) weighted purity, defined as $\sum_c n_c \, \mathrm{purity}(c) / \sum_c n_c$ and computed over non-noise clusters, and (ii) coverage (purity ≥ 0.60), defined as the fraction of all observations that fall into clusters with $\mathrm{purity}(c) \ge 0.60$ (noise points are counted as not covered). These metrics quantify how strongly unsupervised clustering aligns with the interpretable signature taxonomy.
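The three alignment quantities can be computed directly from paired cluster and signature labels. The sketch below is illustrative only; the function name `alignment_metrics` and the use of -1 as the noise label are assumptions (the latter follows the usual HDBSCAN convention).

```python
from collections import Counter


def alignment_metrics(clusters, signatures, threshold=0.60):
    """Return (purity per cluster, weighted purity, coverage)."""
    by_cluster = {}
    for c, s in zip(clusters, signatures):
        if c != -1:  # -1 marks noise and is excluded from purity
            by_cluster.setdefault(c, Counter())[s] += 1

    sizes = {c: sum(cnt.values()) for c, cnt in by_cluster.items()}
    purity = {c: max(cnt.values()) / sizes[c] for c, cnt in by_cluster.items()}

    n_clustered = sum(sizes.values())
    weighted = (
        sum(sizes[c] * purity[c] for c in purity) / n_clustered
        if n_clustered
        else float("nan")
    )
    covered = sum(sizes[c] for c in purity if purity[c] >= threshold)
    coverage = covered / len(clusters)  # noise points stay in the denominator
    return purity, weighted, coverage
```

For example, with two clusters of four points each (purities 0.75 and 0.50) and two noise points, weighted purity is 0.625 and coverage is 4/10 = 0.40, because only the first cluster passes the 0.60 threshold.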

3.8. Reaction-Signature Analysis

Reaction Signatures summarize the annual response pattern of each Lane–Segment observation $(i, t)$ by combining Trend (directionality) and Mode (strength/consistency). Using the normalized changes and reaction magnitude defined in Section 3.7 (Equations (14) and (15)), each observation is assigned to a Trend × Mode category according to the rules below.

3.8.1. Trend Classification

Trend represents the net direction of the transition-level response as follows:
  • Stable: both noise-filtered components are zero ($\Delta IRI_{i,t}^{clean} = 0$ and $\Delta RD_{i,t}^{clean} = 0$);
  • Worsen: the combined normalized direction is positive;
  • Improve: the combined normalized direction is negative.
For non-stable observations, the combined normalized direction is defined as $d_{i,t} = v_{i,t}^{IRI} + v_{i,t}^{RD}$; Worsen is assigned if $d_{i,t} > 0$ and Improve is assigned if $d_{i,t} < 0$ (ties are negligible in continuous data).

3.8.2. Mode Classification

Mode captures the strength and consistency of the response within each Trend:
  • Zero: assigned only under Stable Trend (no reaction after noise filtering);
  • Mixed: assigned when the two normalized components indicate mixed-direction behavior (opposite signs) or when the response does not meet Clear/Full criteria;
  • Clear/Full: Full denotes upper-tail reactions (e.g., $m_{i,t} \ge Q_{0.90}$), and Clear denotes moderate reactions (e.g., $Q_{0.50} \le m_{i,t} < Q_{0.90}$).
Operationally, Mode is assigned as follows: (i) if Trend is Stable, Mode = Zero; (ii) if $v_{i,t}^{IRI} \, v_{i,t}^{RD} < 0$, Mode = Mixed (sign-inconsistent); and (iii) otherwise, Mode is determined by the reaction magnitude quantiles computed from the empirical distribution of $m_{i,t}$: Full if $m_{i,t} \ge Q_{0.90}$, Clear if $Q_{0.50} \le m_{i,t} < Q_{0.90}$, and Mixed otherwise.
In particular, sign-inconsistent responses ($v_{i,t}^{IRI} \, v_{i,t}^{RD} < 0$) are classified as Mixed regardless of magnitude, because they reflect opposing movements between indicators.
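The Trend and Mode rules can be expressed as a short vectorized routine. The sketch below makes several assumptions that the text does not settle: the function name is hypothetical, the magnitude is taken as the Euclidean norm of the two normalized changes, the quantiles are computed over non-stable observations only, and a non-stable tie ($d_{i,t} = 0$) falls into Improve by default.

```python
import numpy as np


def reaction_signatures(d_iri, d_rd, eps_iri, eps_rd):
    """Assign Trend and Mode labels from noise-filtered annual changes."""
    d_iri = np.asarray(d_iri, dtype=float)
    d_rd = np.asarray(d_rd, dtype=float)
    v_iri = d_iri / eps_iri          # normalized changes
    v_rd = d_rd / eps_rd
    m = np.hypot(v_iri, v_rd)        # reaction magnitude (Euclidean norm)

    stable = (d_iri == 0) & (d_rd == 0)
    d = v_iri + v_rd                 # combined normalized direction
    # A non-stable tie (d == 0) is labeled Improve here by default.
    trend = np.where(stable, "Stable", np.where(d > 0, "Worsen", "Improve"))

    mode = np.full(d_iri.shape, "Mixed", dtype=object)
    active = ~stable
    if active.any():
        # Assumption: quantiles are taken over non-stable observations only.
        q50, q90 = np.quantile(m[active], [0.50, 0.90])
        consistent = active & (v_iri * v_rd >= 0)
        mode[consistent & (m >= q50) & (m < q90)] = "Clear"
        mode[consistent & (m >= q90)] = "Full"
    mode[stable] = "Zero"
    return trend, mode
```

Because the Clear and Full thresholds are quantiles of the magnitude, any monotone transform of $m_{i,t}$ (for instance, its square) yields the same labels.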

3.8.3. Reaction Signature (Trend × Mode)

Combining the Trend and Mode yields the Reaction Signature taxonomy used throughout the subsequent analyses (Table 3). This categorical structure enables consistent interpretation of heterogeneous annual responses and supports clustering and reaction-space construction for pattern identification.

4. Results

4.1. Distributional Characteristics and Statistical Significance Across Hierarchical Levels

The HPMS dataset was structured according to a six-level hierarchy—ranging from Year (Level 0) to Route–Direction–Lane–Segment (Level 5)—to evaluate how the spatial aggregation level influences the distribution of pavement performance indicators (IRI, RD, and SD). Figure 3, Figure 4 and Figure 5 illustrate the distributional changes across Levels 0–5 for each indicator.
At Level 0 (Year), where all segments are aggregated within a given year, IRI values are concentrated within a narrow band around 1.3 m/km, masking heterogeneity across segments. At the Lane-Segment level (Level 5), however, the distribution widens substantially, spanning roughly 0.5–6.5 m/km, revealing pronounced localized roughness. This confirms that network-wide, year-aggregated IRI values obscure severe but spatially confined roughness conditions.
RD exhibited relatively stable typical values around 4–6 mm at the upper aggregation level (L0–L2), whereas the distribution widened markedly as the hierarchy became more detailed. At the Lane-Segment level (L5), extreme values exceeding 10 mm were frequently observed, with outliers reaching or exceeding 20 mm, indicating highly localized rutting conditions. These patterns suggest that severe rutting is muted under higher-level aggregation and becomes evident only when data are examined at finer spatial resolution.
SD values were strongly zero-inflated and were therefore analyzed on a logarithmic scale. At Level 0 (Year), SD values were tightly clustered around 10⁻²–10⁻¹ m², suggesting minimal apparent distress under year-level aggregation. As the hierarchy became more detailed, the distribution developed a pronounced long tail; at the Lane-Segment level (Level 5), SD spanned several orders of magnitude, with extreme observations reaching the 10¹–10² m² range, indicating highly localized surface distress hotspots.
These results suggest that spatial averaging can conceal localized distress hotspots and that SD behaves primarily as an event-driven indicator rather than a gradual deterioration metric. Consequently, SD was excluded from Δ-based modeling and was reported only descriptively in this study.
To quantitatively assess whether hierarchical levels influence performance distributions, Brown–Forsythe and Welch ANOVA tests were conducted for all indices across Levels 0–4 (Table 4).
  • Brown-Forsythe tests indicated significant heteroscedasticity across hierarchical levels, confirming that aggregation materially alters the dispersion structure of IRI, RD, and SD.
  • Where Welch ANOVA was evaluable, it further confirmed significant differences in mean values across levels ($p$ < 0.05).
  • Effect size ($\eta^{2}$) increased sharply as spatial resolution became more detailed, indicating that finer grouping explains substantially more variance.
  • Level 0 showed negligible explanatory power ($\eta^{2}$ = 0.002–0.050), whereas Level 4 showed large effects ($\eta^{2}$ = 0.146–0.358), indicating that approximately 14.6–35.8% of total variability is explained by Lane-level stratification.
Across all three indicators, the variance, interquartile range, and extreme-value prevalence increased markedly as the analysis moved from Level 0 to Level 4, demonstrating that pavement performance is fundamentally governed by localized spatial heterogeneity that can be distorted or lost under coarser aggregation. Accordingly, these L0–L4 results should be interpreted as a diagnostic of aggregation-induced bias rather than as Segment–level inference. The diagnostics indicate that meaningful inference requires at least Lane–level resolution (L4); therefore, all reaction modeling in this study was performed at the Lane–Segment level (L5).
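As a reading aid for the effect sizes above, the sketch below computes a one-way $\eta^{2}$ as the between-group sum of squares divided by the total sum of squares. This standard definition is assumed here rather than taken from the study, and the function name is hypothetical.

```python
import numpy as np


def eta_squared(values, groups):
    """One-way eta squared: SS_between / SS_total."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    grand_mean = values.mean()
    ss_total = ((values - grand_mean) ** 2).sum()
    ss_between = 0.0
    for g in np.unique(groups):
        v = values[groups == g]
        ss_between += v.size * (v.mean() - grand_mean) ** 2
    return ss_between / ss_total
```

An $\eta^{2}$ of 0.358 under this definition means that group membership accounts for 35.8% of the total variance. For the dispersion test, `scipy.stats.levene` with `center="median"` gives the Brown–Forsythe variant.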

4.2. Distribution of Annual Changes and Noise Band Validation

In this section, annual changes (Δ) in pavement performance indicators (IRI, RD, and SD) are quantified and Noise Bands (ε) are established using robust statistics. The Noise Bands provide a practical criterion for separating meaningful structural change from measurement- and environment-driven variability, serving as the basis for constructing noise-filtered Δ values for the Δ–State Vector and subsequent reaction classification.
All Δ statistics were computed from a fixed panel of 4302 asphalt segments continuously measured for eight years (2015–2022). Year-to-year changes were evaluated over seven transitions, yielding n = 30,114 pooled Δ observations per indicator.
Throughout this section, the “year” label $t$ in the year-wise summaries corresponds to the transition $(t-1 \to t)$. Noise Bands are first estimated globally from pooled Δ distributions and then examined through year-/transition-wise and spatial-group screening to determine an appropriate filtering rule.
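The panel arithmetic can be checked with a few lines of NumPy. The sketch assumes a hypothetical array layout with one row per segment and one column per survey year, which the study does not specify.

```python
import numpy as np


def annual_changes(panel):
    """Year-to-year differences for a (n_segments, n_years) panel."""
    return np.diff(np.asarray(panel, dtype=float), axis=1)


# Shape check with a placeholder panel: 4302 segments x 8 survey years.
delta = annual_changes(np.zeros((4302, 8)))
n_pooled = delta.size  # 4302 segments x 7 transitions
```

Column `k` of the result corresponds to the transition from survey year `k` to year `k + 1`, so eight survey years give seven transitions.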

4.2.1. Natural Variability in ΔIRI ($\varepsilon_{IRI}$)

The pooled ΔIRI distribution is sharply centered around zero (Figure 6). Robust statistics were as follows:
  • 90% range (p05–p95): −0.29 to +0.35 m/km;
  • IQR/2: 0.060 m/km, MAD: 0.089 m/km.
To define a conservative global Noise Band for ΔIRI, the MAD-based scale was adopted in this study.
$$\varepsilon_{IRI} = \pm 0.089 \ \mathrm{m/km}$$
Figure 6. Empirical distribution of ΔIRI and derived Noise Band (±0.089 m/km).
Values within $\varepsilon_{IRI}$ are treated as noise-level variability; values exceeding the band indicate meaningful structural change. Year-wise validation results are summarized in Table 5.

4.2.2. Natural Variability in ΔRD ($\varepsilon_{RD}$)

ΔRD exhibits a wider dispersion than ΔIRI (Figure 7). Robust statistics of pooled ΔRD were as follows:
  • 90% range (p05–p95): −2.02 to +2.60 mm;
  • IQR/2: 0.660 mm, MAD: 0.993 mm.
Consistent with Section 4.2.1, the global Noise Band is defined using the MAD-based estimate:
$$\varepsilon_{RD} = \pm 0.993 \ \mathrm{mm}$$
Figure 7. Empirical distribution of ΔRD and derived Noise Band (±0.993 mm).
Changes exceeding $\varepsilon_{RD}$ are classified as meaningful increases (rutting progression) or decreases (apparent recovery), whereas values within the band are treated as natural variability. Year-wise dispersion statistics and noise-level shares are summarized in Table 6.

4.2.3. Natural Variability in ΔSD ($\varepsilon_{SD}$)

SD exhibits strong zero inflation and event-driven spikes (Figure 8). The pooled ΔSD distribution shows near-zero central dispersion with occasional abrupt changes. Robust statistics were as follows:
  • 90% range (p05–p95): −2.73 to +4.87 m²;
  • IQR/2: 0.020 m², MAD: 0.0148 m².
Unlike IRI and RD, ΔSD is not used to construct the Δ–State Vector or the reaction manifold because its changes are dominated by sparse event-like spikes rather than gradual deterioration signals. Therefore, we do not apply the hard-thresholding filter in Equation (8) to obtain $\Delta SD^{clean}$. Instead, SD is reported only descriptively in this study.
For completeness, we summarize the transition-wise robust dispersion scale of ΔSD using the median absolute deviation:
$$\varepsilon_{t,SD} = \mathrm{MAD}_{t}(\Delta SD)$$
where the “year” index $t$ corresponds to the transition $(t-1 \to t)$. Table 7 reports $\varepsilon_{t,SD}$ and two reference “near-zero” shares: (i) the strict zero share ($\Delta SD = 0$), reflecting zero inflation, and (ii) the small-change share ($|\Delta SD| \le \varepsilon_{t,SD}$), reported only to characterize the distribution and not used for model input filtering.
Figure 8. Empirical distribution of ΔSD showing zero-inflation and event-driven spikes.
Although the MAD of ΔSD is small, it remains nonzero and varies across year-to-year transitions (Table 7). Here, $\varepsilon_{t,SD}$ is reported only as a descriptive dispersion scale under strong zero inflation and sparse event-driven spikes, and it is not used as a Noise Band for cleaning or as an input to the Δ–State Vector. Accordingly, SD is excluded from the reaction manifold construction and discussed only descriptively in this study.

4.2.4. Spatial Variability Check and Adoption of Transition-Specific Noise Bands

Figure 9 summarizes MAD-based dispersion scales of ΔIRI and ΔRD across spatial hierarchy levels (route; route–direction; route–direction–lane). The distributions are broadly comparable across these spatial groupings, suggesting limited practical benefit from introducing multiple spatially stratified Noise Bands. Spatial differences may still reflect various factors (e.g., heterogeneous structural or traffic conditions), but they are not used to parameterize ε in this study.
In contrast, transition-wise summaries in Table 5 and Table 6 show that the MAD-based dispersion scale varies across transitions. Notably, the first transition (2015–2016) yields MAD = 0 for both ΔIRI and ΔRD, which would result in $\varepsilon_{y} = 0$ under the MAD rule. To avoid applying an unrealistically small threshold in such cases, Δ filtering follows the floor (minimum-threshold) constraint defined in Equation (7). More generally, the observed transition-wise variability may be influenced by differences in survey conditions, including potential changes in measurement devices or processing performance over time; however, this study does not attribute causality and uses transition-specific ε only as a practical filtering adjustment. ΔSD is treated separately as an event-driven indicator and is not included in the Δ–State Vector.
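The interaction between transition-specific bands, the floor, and hard thresholding can be sketched as follows. Equations (7) and (8) are not reproduced in this section, so the forms used here are assumptions: the effective band is taken as the larger of the transition MAD and a supplied floor, and changes within the band are set to zero. The 1.4826 scaling of the MAD is also an assumption; it is the usual normal-consistency constant and is consistent with the reported MAD/(IQR/2) ratio for ΔIRI (0.089/0.060). Function names are hypothetical.

```python
import numpy as np


def scaled_mad(x, scale=1.4826):
    """Median absolute deviation, scaled for consistency with a normal SD."""
    x = np.asarray(x, dtype=float)
    return scale * np.median(np.abs(x - np.median(x)))


def filter_changes(delta, floor):
    """Apply per-transition noise bands with a minimum threshold.

    delta: (n_segments, n_transitions) array of annual changes.
    Returns the noise-filtered changes and the effective band per transition.
    """
    delta = np.asarray(delta, dtype=float)
    eps = np.array(
        [max(scaled_mad(delta[:, t]), floor) for t in range(delta.shape[1])]
    )
    # Values inside the band (|delta| <= eps) are treated as noise.
    clean = np.where(np.abs(delta) > eps, delta, 0.0)
    return clean, eps
```

A natural choice for `floor` would be the pooled band, for example `scaled_mad(delta.ravel())`. With a transition whose MAD is zero, the floor takes over and prevents every nonzero change from being counted as structural.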

4.3. Analysis of Deterioration Characteristics Using the Δ–State Vector

The Δ–State Vector was introduced to address the limitations of conventional single-indicator linear deterioration assessments by jointly considering (i) the previous-year condition state and (ii) noise-filtered annual structural changes. In this study, the state component is represented by the previous-year IRI grade and RD grade, and the change component is represented by the corresponding noise-filtered changes $\Delta IRI_{i,t}^{clean}$ and $\Delta RD_{i,t}^{clean}$.
Noise filtering was performed as outlined in Section 4.2: global Noise Bands provide baseline thresholds, while the effective threshold is applied at the transition/year level with a conservative lower bound (Equation (7)). Using these noise-filtered changes, this section examines state-dependent deterioration dynamics within the Δ–State framework (Section 3.4).
Table 8 summarizes the state-stratified annual reactions of IRI and RD using the noise-filtered Δ values. The results reveal a clear state-dependent shift in both the net direction (mean Δ) and the directional composition (worsen vs. improve ratios) as the pavement condition worsens.
  • The mean ΔIRI decreased monotonically from +0.038 m/km/year (Grade 1) to −0.419 m/km/year (Grade 7), crossing zero at approximately Grade 3 (−0.012).
  • The ΔIRI Worsen Ratio increased from 18.7% (Grade 1) to roughly 32% and then plateaued across Grades 4–7 (≈30.7–32.5%).
  • The ΔIRI Improve Ratio increased steadily from 6.18% (Grade 1) to 45.3% (Grade 7).
  • For ΔRD, the mean shifted from +0.458 mm/year (Grade 1) to negative values in poorer grades (e.g., −0.107 at Grade 3, −0.775 at Grade 4, −3.52 at Grade 6).
  • The ΔRD Improve Ratio increased from 3.46% (Grade 1) to 41.0% (Grade 6), while the ΔRD Worsen Ratio remained substantial in lower-to-mid grades (≈26–28%) but dropped at Grade 6 (16.7%).
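A state-stratified summary of this kind can be produced with a simple group-by over the previous-year grade. In the sketch below, the Worsen and Improve ratios are assumed to be the shares of strictly positive and strictly negative noise-filtered changes within each grade; the exact definitions behind Table 8 are not given in this section, and the function name is hypothetical.

```python
import numpy as np


def summarize_by_grade(prev_grade, delta_clean):
    """Mean change and directional shares per previous-year grade."""
    prev_grade = np.asarray(prev_grade)
    delta_clean = np.asarray(delta_clean, dtype=float)
    summary = {}
    for g in np.unique(prev_grade):
        d = delta_clean[prev_grade == g]
        summary[int(g)] = {
            "n": int(d.size),
            "mean": float(d.mean()),
            "worsen_ratio": float((d > 0).mean()),   # share with positive change
            "improve_ratio": float((d < 0).mean()),  # share with negative change
        }
    return summary
```

Because filtered values are exactly zero inside the noise band, the two ratios need not sum to one; the remainder is the noise-level share.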
Figure 10 illustrates state-dependent changes in the dispersion and direction of annual reactions. As the previous-year condition grade worsens, both $\Delta IRI^{clean}$ and $\Delta RD^{clean}$ exhibit a wider spread, indicating increased instability under degraded structural states. In particular, negative Δ values (improvement-like responses) become both more frequent and larger in magnitude at higher grades. Because explicit M&R records are unavailable, these signals are not attributed to specific interventions in this study; instead, the subsequent signature transition analysis examines whether these improvement-like reactions exhibit short-term persistence (recurrence) across consecutive years, which would be more consistent with event-driven changes than with purely noise-level variability.
Overall, these results suggest that pavement performance is more strongly associated with its current condition state than with elapsed time, indicating a distinctly state-dependent response pattern. In higher-grade segments (Grades 6–7), improvement-like reactions (negative Δ) become more frequent while worsening reactions remain present, implying a more variable and bidirectional response regime under degraded conditions. This state dependence prompts the use of reaction-driven frameworks—rather than purely age-based approaches—to better characterize heterogeneous deterioration trajectories. In the subsequent analyses, we examine whether these improvement-like signals exhibit temporal persistence and spatial coherence, which would be consistent with event-driven changes.

4.4. Reaction Signature Classification

To categorize the multidimensional deterioration behaviors captured by the Δ–State Vector, this study defines rule-based Reaction Signatures using the directionality (Trend) and the magnitude/consistency (Mode) of the noise-filtered annual changes. The resulting Trend × Mode signatures provide an interpretable summary of heterogeneous Lane–Segment transition responses and serve as a reference taxonomy for comparison with the unsupervised clustering results presented in the next section.
The distribution of Reaction Signatures is summarized in Table 9 for the non-noise observation set used in the cluster–signature alignment analysis (i.e., after excluding post-filtered HDBSCAN noise; see Section 3.6). Within this evaluated set, Stable–Zero accounts for the largest share of observations (60.4%), indicating that many lane–segment year-to-year transitions remain within the noise band after filtering. Among non-stable reactions, Worsen–Mixed (14.4%) and Worsen–Clear (10.5%) are most prevalent, suggesting that a substantial subset of transitions exhibit net deterioration with varying consistency. Improvement-like reactions are less frequent but still non-negligible—Improve–Mixed (5.31%), Improve–Clear (4.78%), and Improve–Full (3.02%)—reflecting increasingly strong recovery patterns. Finally, Worsen–Full (1.60%) represents high-severity, strongly directional deterioration events.
Overall, the Reaction Signature framework provides a compact yet interpretable representation of annual pavement responses by encoding both direction and intensity/stability. These signatures form the basis for subsequent Reaction Space mapping and for evaluating the alignment between rule-based signatures and unsupervised UMAP–HDBSCAN clusters.

4.5. Reaction Space Mapping and Cluster–Signature Alignment

To visualize high-dimensional Δ–State responses, a two-dimensional Reaction Space was constructed using UMAP. Although the UMAP axes are unitless, nearby points in the embedding correspond to similar 4D Δ–State vectors (previous-year condition grades combined with noise-filtered annual changes). When colored by the rule-based Trend labels, the Reaction Space exhibits a clear directional organization among Stable, Improve, and Worsen regimes (Figure 11), suggesting that the local neighborhood structure of Δ–State responses is well preserved in the low-dimensional manifold.
To test whether the interpretable rule-based taxonomy reflects intrinsic structure in the data, HDBSCAN was applied in the UMAP space and compared against the rule-based Trend × Mode signatures. Following the clustering procedure (Section 3.6), clusters with fewer than 30 observations were post-filtered and treated as noise; noise points were retained only for visualization and excluded from cluster-level alignment metrics. In this study, the five dominant reaction regimes refer to the five largest non-noise HDBSCAN clusters (by sample size) shown in Figure 12. The cluster–signature agreement is strong (weighted purity = 0.927; coverage at purity ≥ 0.60 = 0.911), indicating that the rule-based signatures closely match the dominant density-defined regimes in the embedding rather than acting as arbitrary partitions (Figure 12). For consistency with this alignment evaluation, Table 9 reports signature shares over the same non-noise observation set, providing a baseline distribution for interpreting cluster–signature correspondence.

4.6. Signature Transition Analysis

This transition analysis uses all signature-labeled consecutive-year observations and is independent of HDBSCAN noise labels, because it is defined in the signature space rather than the cluster space.
To assess short-term predictability and recurrence in annual reactions, a one-step signature transition probability matrix was constructed from consecutive-year observations. For each Lane–Segment with adjacent-year measurements, the current-year signature $S_{t}$ was paired with the next-year signature $S_{t+1}$, and empirical transition probabilities $P(S_{t+1} \mid S_{t})$ were estimated by row-normalizing transition counts across the seven Trend × Mode categories.
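The row-normalized estimate can be sketched as below, assuming each Lane–Segment is represented by a hypothetical list of its yearly signature labels in chronological order.

```python
import numpy as np


def transition_matrix(sequences, states):
    """Empirical one-step transition probabilities P(next | current)."""
    index = {s: k for k, s in enumerate(states)}
    counts = np.zeros((len(states), len(states)))
    for seq in sequences:
        for current, nxt in zip(seq[:-1], seq[1:]):
            counts[index[current], index[nxt]] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no observed transitions are left as zeros.
    return np.divide(
        counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0
    )
```

Each nonempty row sums to one, so entry (j, k) reads as the share of observations with signature j that show signature k in the following year.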
Figure 13 shows that Stable–Zero exhibits the highest self-transition probability (0.68), indicating strong persistence of noise-level stability once segments enter the zero-reaction regime. Several non-stable signatures also revert to Stable–Zero with relatively high probability (e.g., Worsen–Mixed → Stable–Zero: 0.59), suggesting that a substantial portion of annual reactions are transient rather than persistently directional. At the same time, worsening signatures display meaningful branching, including non-negligible persistence within the same regime and transitions to other worsening regimes, consistent with heterogeneous evolution pathways.
Overall, the transition matrix demonstrates that the proposed signatures encode empirically observable persistence and recurrence patterns in next-year responses and provide a practical basis for risk-oriented interpretation, such as estimating the conditional likelihood of shifting into high-severity reactions (e.g., Worsen–Full) given the current signature.

5. Discussion

This study provides an empirical and methodological reassessment of how pavement deterioration should be interpreted and modeled within PMS frameworks. The findings collectively reveal that deterioration behavior in asphalt pavements is fundamentally state-dependent, nonlinear, and event-sensitive, challenging the long-standing assumption that performance declines smoothly as a function of chronological age.
First, the hierarchical variability analysis confirmed substantial aggregation bias in HPMS data, where segment-level heterogeneity is progressively masked at upper levels (Year, Route, and Direction). Large effect sizes already emerge at the lane-aggregation level (Level 4), supporting the need to avoid over-aggregated indicators when diagnosing deterioration dynamics; reaction modeling is therefore performed at the Lane–Segment resolution (Level 5).
Second, the robust Noise Bands derived from pooled annual change distributions (ΔIRI: ±0.089 m/km; ΔRD: ±0.993 mm, MAD-based) enabled separation of structural changes from natural measurement variability. The resulting clean Δ values highlight that most annual changes are near-zero and that meaningful deterioration or improvement occurs only when these thresholds are exceeded, implying that static deterioration curves can misinterpret noise-induced variations as structural trends.
Third, the Δ–State Vector analysis demonstrated that deterioration magnitude is governed not by chronological age but by the previous-year condition state. In higher grades (6–7), dispersion and structural-signal ratios increase sharply, consistent with entry into an accelerated deterioration zone that is difficult to operationalize using static age-based models.
Fourth, the UMAP–HDBSCAN framework identified five dominant reaction regimes (clusters) in the embedded Reaction Space, while the rule-based Trend × Mode Reaction Signature taxonomy provided an interpretable labeling layer for comparing and explaining these regimes. The presence of transitional patterns (e.g., Improve–Mixed and Worsen–Mixed) suggests regime shifts that contradict linear or strictly monotonic assumptions, and small dense Worsen–Full groups may indicate localized structural vulnerability.
Finally, while this study demonstrates that improvement-like and worsening reactions exhibit a systematic structure in both state and transition analyses, linking recurrent high-severity signatures to persistent spatial hotspots requires an explicit spatio-temporal mapping module and independent validation (e.g., M&R logs or corroborating distress surveys), which remains a key next step for future PMS integration.
Because oxidative ageing and associated microstructural evolution (e.g., asphaltene nano-aggregation) can weaken cracking resistance and alter damage accumulation under repeated loading and environmental exposure, identical service times may still yield different deterioration reactions across segments, underscoring the need to interpret performance changes through a state-dependent lens [31].

6. Conclusions

This study proposed a Δ–State Vector and Reaction Signature framework to diagnose and model deterioration behavior in asphalt pavements using eight years of Korean HPMS data (2015–2022). The main conclusions are as follows:
  1. Pavement deterioration is state-dependent rather than purely age-dependent. Year-to-year changes in IRI and RD increase nonlinearly with previous-year grades, indicating that deterioration accelerates once condition thresholds are exceeded.
  2. Robust Noise Bands separate structural signals from natural variability. MAD-based global bands (ΔIRI: ±0.089 m/km; ΔRD: ±0.993 mm) enable noise-filtered annual changes and stabilize downstream reaction embedding and classification.
  3. The Δ–State Vector provides an interpretable reaction representation. By combining previous-year state with noise-filtered annual changes, the proposed vector captures nonlinear, mixed, and event-like responses that static linear or monotonic age-based models often miss.
  4. UMAP–HDBSCAN reveals five dominant reaction regimes (clusters). The embedded Reaction Space exhibits a clear structure, and the discovered regimes align strongly with the rule-based Trend × Mode Reaction Signature taxonomy, supporting both data-driven discovery and human-interpretable labeling.
  5. The framework supports event-aware screening and provides building blocks for future Dynamic Age Reset PMS models. High-severity worsening and improvement-like signatures can serve as candidate indicators for Lane–Segment-level event screening, but confirming persistent hotspots and quantifying treatment effects requires independent validation with explicit M&R records and is left for future work.

Author Contributions

Conceptualization, I.K. and S.H.; methodology, S.H.; software, S.H. and K.Y.; formal analysis, S.H.; investigation, D.S.; resources, D.S.; data curation, S.H. and J.C.; writing—original draft preparation, S.H.; writing—review and editing, S.H. and I.K.; visualization, J.C. and K.Y.; funding acquisition, I.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Research Foundation of Korea (NRF), funded by the Korean government (MSIT) (Grant No. RS-2025-00573072). The APC was not funded.

Data Availability Statement

The data used in this study were obtained from the Korea Expressway Corporation and are subject to institutional restrictions. The data are not publicly available due to confidentiality and data-use agreements but may be made available from the corresponding author upon reasonable request and with permission of the data provider.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ministry of Land, Infrastructure and Transport (MOLIT); Korea Agency for Infrastructure Technology Advancement (KAIA). Survey Equipment for Integrated Road Pavement Maintenance and Management Digital Based Information System Development; (RS-2022-00155837); MOLIT/KAIA: Sejong, Republic of Korea, 2023.
  2. Lu, A.-W.; Ni, Y.-Q.; Wang, J.-F.; Ma, Z.-G. Data Interpretation and Forecasting of SHM Heteroscedastic Measurements under Typhoon Conditions Enabled by an Enhanced Hierarchical Sparse Bayesian Learning Model with High Robustness. Measurement 2024, 232, 114841.
  3. Zhao, J.; Wang, H.; Lu, P. Impact Analysis of Traffic Loading on Pavement Performance Using Support Vector Regression Model. Int. J. Pavement Eng. 2022, 23, 3716–3728.
  4. Park, E.-J.; Cheon, Y.-J.; Lee, J.-K. Hierarchical Bayesian Intelligence Framework for Uncertainty Quantification and Reliability Assessment of Solid Oxide Fuel Cells. IEEE Access 2025, 13, 188084–188101.
  5. Yasarer, H.; Najjar, Y. Use of Artificial Neural Networks (ANNs) to Predict Pavement Management Data Attributes: Development of Pavement Performance Models for MDOT: A Neural Network Approach; Report No. FHWA/MDOT-RD-22-298; University of Mississippi: Oxford, MS, USA, 2024.
  6. Federal Highway Administration (FHWA). Long-Term Pavement Performance Information Management System (IMS) User Guide; FHWA: Washington, DC, USA, 2018.
  7. Farhan, J.; Fwa, T.F. Managing Missing Pavement Performance Data in Pavement Management Systems. Transp. Res. Rec. 2015, 2523, 15–24.
  8. Mrawira, D.; Haas, R. Aggregation of Condition Survey Data in Pavement Management: Shortcomings of a Homogeneous Sections Approach and How to Avoid Them. Transp. Res. Rec. 2020, 2674, 394–405.
  9. Jannat, G.E.; Henning, T.F.P.; Zhang, C.; Tighe, S.L.; Ningyuan, L. Road Section Length Variability on Pavement Management Decision Making for Ontario, Canada, Highway Systems. Transp. Res. Rec. 2016, 2589, 87–96.
  10. Karlaftis, A.G.; Badr, A. Predicting Asphalt Pavement Crack Initiation Following Rehabilitation Treatments. Transp. Res. Part C 2015, 55, 510–517.
  11. Zhang, T.; Smith, A.; Zhai, H.; Lu, Y. LSTM+MA: A Time-Series Model for Predicting Pavement IRI. Infrastructures 2025, 10, 10.
  12. Haddad, A.J.; Chehab, G.R.; Saad, G.A. The Use of Deep Neural Networks for Developing Generic Pavement Rutting Predictive Models. Int. J. Pavement Eng. 2022, 23, 4260–4276.
  13. Fani, A.; Golroo, A.; Mirhassani, S.A.; Gandomi, A.H. Pavement Maintenance and Rehabilitation Planning Optimisation under Budget and Pavement Deterioration Uncertainty. Int. J. Pavement Eng. 2022, 23, 414–424.
  14. Yamany, M.S.; Abraham, D.M.; Nantung, T.E.; Labi, S. Review of Probabilistic Modeling of Pavement Performance Using Markov Chains. Preprints 2024, 2024071863.
  15. Dong, Q.; Chen, X.; Dong, S.; Ni, F. Data Analysis in Pavement Engineering: An Overview. IEEE Trans. Intell. Transp. Syst. 2022, 23, 22020–22033.
  16. Al-Omari, B.; Darter, M.I. Effect of Pavement Deterioration Types on IRI and Rehabilitation. Transp. Res. Rec. 1995, 1505, 57–65.
  17. Kuruvachalil, L.; Karim, F.; Masoud, A.R.; Hasan, U.; Ali, L.; Bin Sulaiman, F.; Alosaimi, F.; AlJassmi, H. Advancing Pavement Management: A Comprehensive Review of Smart Models for Better Decisions. Transp. Res. Interdiscip. Perspect. 2025, 34, 101711.
  18. Khan, T.; Ahmed, A.; Khan, M.S. Developing Pavement Distress Deterioration Models for Pavement Management System Using Markovian Probabilistic Process. Int. J. Pavement Eng. 2017, 18, 409–420.
  19. Alonso-Solorzano, A.; Pérez-Acebo, H.; Findley, D.J.; Gonzalo-Orden, H. Transition Probability Matrices for Pavement Deterioration Modelling with Variable Duty Cycle Times. Road Mater. Pavement Des. 2023, 24, 937–958.
  20. Ghasemi, P.; Aslani, M.; Rollins, D.K.; Williams, R.C. Principal Component Neural Networks for Modeling, Prediction, and Optimization of Hot Mix Asphalt Dynamic Modulus. Infrastructures 2019, 4, 53.
  21. Sun, L.; Zhang, Z.; Wang, F. An Innovative Evaluation Method for Performance of In-Service Asphalt Pavement with Semi-Rigid Base. Constr. Build. Mater. 2020, 247, 118520.
  22. McInnes, L.; Healy, J.; Melville, J. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv 2018, arXiv:1802.03426.
  23. Li, Y.; Wang, J.; Ma, X.; Liu, Y. Pavement Roughness Grade Recognition Based on One-Dimensional Residual Convolutional Neural Network. Sensors 2023, 23, 1482. [Google Scholar] [CrossRef]
  24. Campello, R.J.G.B.; Moulavi, D.; Sander, J. Density-Based Clustering Based on Hierarchical Density Estimates. In Advances in Knowledge Discovery and Data Mining; Pei, J., Tseng, V.S., Cao, L., Motoda, H., Xu, G., Eds.; Springer: Berlin/Heidelberg, Germany, 2013; pp. 160–172. [Google Scholar] [CrossRef]
  25. McInnes, L.; Healy, J.; Astels, S. hdbscan: Hierarchical Density Based Clustering. J. Open Source Softw. 2017, 2, 205. [Google Scholar] [CrossRef]
  26. Durango-Cohen, P.L.; Madanat, S.M. Optimal Maintenance and Repair Policies in Infrastructure Management under Uncertain Facility Deterioration Rates: An Adaptive Control Approach. Transp. Res. Part A Policy Pract. 2002, 36, 763–778. [Google Scholar] [CrossRef]
  27. Wang, Q.-A.; Lu, A.-W.; Ni, Y.-Q.; Wang, J.-F.; Ma, Z.-G. Bayesian Network in Structural Health Monitoring: Theoretical Background and Applications Review. Sensors 2025, 25, 3577. [Google Scholar] [CrossRef]
  28. Wang, Q.-A.; Chen, J.; Ni, Y.-Q.; Xiao, Y.; Liu, N.; Liu, S.-K.; Feng, W. Application of Bayesian Networks in Reliability Assessment: A Systematic Literature Review. Structures 2025, 71, 108098. [Google Scholar] [CrossRef]
  29. Cohen, J. Statistical Power Analysis for the Behavioral Sciences, 2nd ed.; Lawrence Erlbaum Associates: Hillsdale, NJ, USA, 1988. [Google Scholar]
  30. Korea Expressway Corporation, Road and Transportation Research Institute. A Study on Development of the Next-Generation Platforms for the Introduction of an Intelligent Pavement Management System; Final Report (Control No. OTKCRK240331); Korea Expressway Corporation, Road and Transportation Research Institute: Hwaseong-si, Republic of Korea, 2023. [Google Scholar]
  31. Hu, Y.; Yin, Y.; Sreeram, A.; Liu, J.; Si, W.; Tang, D.; Airey, G.D. Nano-Aggregation of Asphaltenes and Its Influence on the Multiscale Properties of Bitumen Recycled through Multiple Ageing and Rejuvenation Cycles. Chem. Eng. J. 2025, 512, 162348. [Google Scholar] [CrossRef]
Figure 1. Workflow of the proposed Δ–State Vector and Reaction Signature framework.
Figure 2. Construction of the 4D Δ–State Vector from previous-year condition grades and noise-filtered annual changes, with SD excluded due to its zero-inflated, event-driven characteristics.
Figure 3. IRI distributions across HPMS hierarchical levels (L0–L5).
Figure 4. RD distributions across hierarchical levels (L0–L5).
Figure 5. Log-scaled SD distributions across hierarchical levels (L0–L5).
Figure 9. Spatial-group dispersion scales (MAD) across hierarchical groupings for ΔIRI and ΔRD.
Figure 10. Boxplots of ΔIRI_clean across previous-year IRI grades (left) and ΔRD_clean across previous-year RD grades (right).
Figure 11. UMAP Reaction Space of Δ–State vectors, colored by rule-based Trend (Stable/Improve/Worsen).
Figure 12. Alignment between HDBSCAN clusters and rule-based Trend × Mode signatures for the five largest non-noise clusters (by sample size). Cell values indicate within-cluster signature shares; weighted purity and coverage summarize overall agreement.
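The agreement summaries named in the Figure 12 caption can be illustrated with a short sketch. The definitions used here are assumptions, since this section does not spell them out: the purity of a cluster is taken as the share of its most frequent signature, weighted purity as the size-weighted mean of cluster purities, and coverage as the share of observations that fall in clusters whose purity reaches the threshold (0.60 in the abstract). Function names and the toy labels are hypothetical.

```python
from collections import Counter


def cluster_purity(labels):
    """Share of the most frequent signature within one cluster."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def weighted_purity(clusters):
    """Size-weighted mean purity over all clusters (assumed definition)."""
    total = sum(len(labels) for labels in clusters.values())
    return sum(len(labels) * cluster_purity(labels) for labels in clusters.values()) / total


def coverage(clusters, min_purity=0.60):
    """Share of observations in clusters with purity >= min_purity (assumed definition)."""
    total = sum(len(labels) for labels in clusters.values())
    covered = sum(
        len(labels) for labels in clusters.values() if cluster_purity(labels) >= min_purity
    )
    return covered / total


# Hypothetical toy data: cluster id -> rule-based signature of each member.
clusters = {
    0: ["Stable-Zero"] * 9 + ["Worsen-Mixed"],
    1: ["Worsen-Clear"] * 3 + ["Worsen-Mixed"] * 3,
}
print(weighted_purity(clusters), coverage(clusters))
```

In this toy case the first cluster is 90% pure and the second only 50%, so the second cluster lowers the weighted purity and is excluded from coverage at the 0.60 threshold.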
Figure 13. Transition probability matrix of Trend × Mode signatures, P(S_{t+1} | S_t). Rows denote the current-year signature and columns denote the next-year signature; probabilities are row-normalized.
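A row-normalized transition matrix of the kind described in the Figure 13 caption can be estimated by counting consecutive-year signature pairs and dividing each row by its total. The sketch below is illustrative only: the function name and the three example sequences are hypothetical, and signature names are written with ASCII hyphens.

```python
from collections import Counter, defaultdict


def transition_matrix(sequences):
    """Estimate P(S_{t+1} | S_t) from per-segment yearly signature sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Pair each year's signature with the following year's signature.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    matrix = {}
    for current, row in counts.items():
        total = sum(row.values())
        # Row normalization: each row sums to 1.
        matrix[current] = {nxt: n / total for nxt, n in row.items()}
    return matrix


# Hypothetical yearly signatures for three lane segments.
sequences = [
    ["Stable-Zero", "Stable-Zero", "Worsen-Clear"],
    ["Stable-Zero", "Worsen-Clear", "Improve-Full"],
    ["Worsen-Clear", "Worsen-Clear", "Stable-Zero"],
]
P = transition_matrix(sequences)
for current, row in P.items():
    print(current, row)
```

Signatures that never appear as a current-year state have no row, and unobserved transitions are simply absent from a row rather than stored as zeros.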
Table 1. K-HPMS hierarchical data structure (Levels 0–5).

| Level | Data Structure | Description |
|---|---|---|
| L0 | Year | Entire dataset for a given year |
| L1 | Year × Pavement | Aggregation by pavement type within each year |
| L2 | Year × Route × Pavement | Incorporates route-level spatial structure |
| L3 | Year × Route × Direction × Pavement | Reflects directional segmentation |
| L4 | Year × Route × Direction × Lane × Pavement | Lane-level segmentation (lane-specific aggregation) |
| L5 | Year × Route × Direction × Lane × Segment (0.1 km) × Pavement | Lane–Segment-level segmentation at 0.1 km resolution (for segment-level analysis) |
Table 2. K-HPMS reference maintenance grades and M&R actions [30].

| Grade | IRI (m/km) | RD (mm) | SD (m²) | Recommended M&R |
|---|---|---|---|---|
| 1 | 1.5 or less | 4 or less | 0 | Do nothing |
| 2 | 1.5~2.0 | 4~7 | 1.0 or less | Preventive maintenance |
| 3 | 2.0~2.5 | 7~10 | 1.0~18 | Required repair/selective maintenance |
| 4 | 2.5~3.0 | 10~13 | 18~36 | Repair required |
| 5 | 3.0~3.5 | 13~16 | 36~52 | Required repair/resurfacing |
| 6 | 3.5~4.0 | 16~20 | 52~72 | Resurfacing |
| 7 | 4.0 or higher | 20 or higher | 72 or higher | Priority resurfacing |
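The IRI and RD grade bands in Table 2 can be applied with a simple lookup. The sketch below is a minimal illustration, not the K-HPMS implementation: the function name `condition_grade` is hypothetical, and because the table does not say which grade a value exactly on a shared boundary (for example IRI = 2.0 m/km) belongs to, the code assumes that every upper bound is inclusive.

```python
from bisect import bisect_left

# Upper bounds of grades 1-6 taken from Table 2; values above the last bound are grade 7.
IRI_BOUNDS = [1.5, 2.0, 2.5, 3.0, 3.5, 4.0]      # m/km
RD_BOUNDS = [4.0, 7.0, 10.0, 13.0, 16.0, 20.0]   # mm


def condition_grade(value, bounds):
    """Return the condition grade (1-7) for one measurement.

    Assumption: upper bounds are inclusive, so a value equal to a bound
    is assigned to the lower-numbered (better) grade.
    """
    return bisect_left(bounds, value) + 1


print(condition_grade(1.2, IRI_BOUNDS))   # IRI of 1.2 m/km
print(condition_grade(2.3, IRI_BOUNDS))   # IRI of 2.3 m/km
print(condition_grade(21.0, RD_BOUNDS))   # RD of 21 mm
```

Under the inclusive-upper-bound assumption an IRI of exactly 4.0 m/km is assigned to grade 6; if the operating rule treats "4.0 or higher" as grade 7, `bisect_right` would be the matching choice.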
Table 3. Reaction Signature categories (Trend × Mode).

| Trend | Mode | Interpretation |
|---|---|---|
| Stable | Zero * | Normal, noise-level stability (ΔIRI_clean = 0 and ΔRD_clean = 0) |
| Improve | Clear | Strong performance recovery |
| Improve | Mixed | Improvement dominated with intermittent setbacks |
| Improve | Full | Major recovery/event-like improvement |
| Worsen | Clear | Consistent structural deterioration |
| Worsen | Mixed | Deterioration dominated with intermittent improvements |
| Worsen | Full | Accelerated deterioration or severe structural weakening |

* Under the hard-thresholding rule (Equation (8)), the Stable regime corresponds exclusively to Zero reactions (ΔIRI_clean = 0 and ΔRD_clean = 0); therefore, Stable–Clear/Mixed/Full are not defined in this study.
Table 4. Statistical test results across hierarchical levels (BF, Welch, and η²).

| Level | Metric | BF_F | Welch_F | p-Value | Eta_sq (η²) | Significance |
|---|---|---|---|---|---|---|
| L0 | IRI | 1.98 | 12.55 | <0.001 | 0.002 | *** |
| L0 | RD | 34.10 | 271.18 | <0.001 | 0.050 | *** |
| L0 | SD | 81.73 | 61.21 | <0.001 | 0.013 | *** |
| L1 | IRI | 75.61 | 24.21 | <0.001 | 0.012 | *** |
| L1 | RD | 31.37 | 134.48 | <0.001 | 0.052 | *** |
| L1 | SD | 40.36 | 31.87 | <0.001 | 0.014 | *** |
| L2 | IRI | 28.46 | 108.75 | <0.001 | 0.138 | *** |
| L2 | RD | 22.91 | 132.64 | <0.001 | 0.153 | *** |
| L2 | SD | 34.93 | 42.55 | <0.001 | 0.050 | *** |
| L3 | IRI | 20.04 | 115.81 | <0.001 | 0.237 | *** |
| L3 | RD | 18.74 | NaN | NaN | 0.169 | NA |
| L3 | SD | 28.44 | NaN | NaN | 0.077 | NA |
| L4 | IRI | 8.14 | 123.40 | <0.001 | 0.263 | *** |
| L4 | RD | 21.67 | 94.26 | <0.001 | 0.358 | *** |
| L4 | SD | 19.47 | NaN | NaN | 0.146 | NA |

Note: *** indicates p < 0.001. NA denotes not available/not evaluated. NaN indicates that the Welch ANOVA statistic is undefined due to degenerate (near-zero) within-group variance and/or insufficient effective degrees of freedom after stratification. In such cases, Welch inference was not evaluated; dispersion shifts were supported using the Brown–Forsythe test and η² patterns across levels.
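The note above explains why some Welch entries are NaN. The sketch below shows, under stated assumptions, how η² and a textbook Welch F statistic can be computed and why a group with zero variance leaves Welch's statistic undefined: its weight n/s² would require division by zero. This is not the authors' code; the function names and toy groups are hypothetical, only exact zero variance is guarded (not the degrees-of-freedom condition the note also mentions), and each group is assumed to contain at least two observations.

```python
import math
import statistics


def eta_squared(groups):
    """Effect size: between-group sum of squares divided by total sum of squares."""
    values = [x for g in groups for x in g]
    grand = statistics.fmean(values)
    ss_total = sum((x - grand) ** 2 for x in values)
    ss_between = sum(len(g) * (statistics.fmean(g) - grand) ** 2 for g in groups)
    return ss_between / ss_total if ss_total > 0 else math.nan


def welch_f(groups):
    """Welch's one-way ANOVA F statistic; NaN when any group variance is zero."""
    k = len(groups)
    variances = [statistics.variance(g) for g in groups]
    if k < 2 or any(v == 0 for v in variances):
        return math.nan  # weights n / s^2 are undefined
    weights = [len(g) / v for g, v in zip(groups, variances)]
    w_sum = sum(weights)
    means = [statistics.fmean(g) for g in groups]
    mean_w = sum(w * m for w, m in zip(weights, means)) / w_sum
    numerator = sum(w * (m - mean_w) ** 2 for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / w_sum) ** 2 / (len(g) - 1) for w, g in zip(weights, groups))
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam
    return numerator / denominator


groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]        # hypothetical data
degenerate = [[1, 1, 1], [2, 3, 4]]               # first group has zero variance
print(eta_squared(groups), welch_f(groups), welch_f(degenerate))
```

η² remains computable in the degenerate case because it depends only on sums of squares, which is consistent with the table reporting η² even where Welch_F is NaN.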
Table 5. Year-wise robust dispersion statistics and noise-level shares for ΔIRI.

| Year | 5th | 95th | IQR/2 | MAD | ε_IRI(y) | Noise Share (Global: \|ΔIRI\| ≤ 0.089) | Noise Share (Year: \|ΔIRI\| ≤ ε_IRI(y)) |
|---|---|---|---|---|---|---|---|
| 2016 | −0.140 | 0.220 | 0.000 | 0.000 | 0.000 | 81.5% | 63.2% |
| 2017 | −0.430 | 0.330 | 0.065 | 0.089 | 0.089 | 58.5% | 58.5% |
| 2018 | −0.370 | 0.380 | 0.075 | 0.104 | 0.104 | 55.1% | 61.4% |
| 2019 | −0.309 | 0.390 | 0.075 | 0.119 | 0.119 | 52.3% | 62.5% |
| 2020 | −0.270 | 0.420 | 0.075 | 0.104 | 0.104 | 55.5% | 62.4% |
| 2021 | −0.310 | 0.390 | 0.075 | 0.104 | 0.104 | 51.7% | 59.2% |
| 2022 | −0.170 | 0.290 | 0.050 | 0.074 | 0.074 | 67.5% | 63.6% |

Note: In the year-wise summaries, ε_{t,Y} denotes the raw transition-wise MAD scale. The filtering threshold actually used to construct ΔY_clean is the effective band ε*_{t,Y} = max(ε_{global,Y}, ε_{t,Y}) (Equation (7)). This avoids unrealistically small thresholds in transitions where the MAD becomes zero (e.g., 2015–2016).
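The effective band in the note can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the MAD is scaled by the conventional normal-consistency factor 1.4826 (an assumption, since this section does not state the scaling), and the hard-thresholding step is assumed to set changes inside the band to zero while leaving larger changes unchanged. The band values come from Table 5; the four ΔIRI values are made up.

```python
import statistics


def scaled_mad(values, scale=1.4826):
    """Median absolute deviation around the median, times an assumed scale factor."""
    values = list(values)
    med = statistics.median(values)
    return scale * statistics.median([abs(v - med) for v in values])


def effective_band(eps_global, eps_year):
    """Effective band: the year-wise scale, floored at the global scale."""
    return max(eps_global, eps_year)


def hard_threshold(delta, band):
    """Assumed hard-thresholding rule: zero inside the band, unchanged outside."""
    return 0.0 if abs(delta) <= band else delta


def noise_share(deltas, band):
    """Fraction of changes whose magnitude lies within the band."""
    deltas = list(deltas)
    return sum(abs(d) <= band for d in deltas) / len(deltas)


# Table 5: global ΔIRI band 0.089; raw year-wise MAD 0.000 (2016) and 0.104 (2018).
band_2016 = effective_band(0.089, 0.000)   # the global floor applies
band_2018 = effective_band(0.089, 0.104)   # the year-wise scale applies
deltas = [0.02, -0.05, 0.30, -0.12]        # hypothetical ΔIRI values (m/km)
clean = [hard_threshold(d, band_2016) for d in deltas]
print(band_2016, band_2018, clean, noise_share(deltas, band_2016))
```

Without the floor, the 2016 band would be zero and every nonzero change, however small, would pass the filter as a structural signal.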
Table 6. Year-wise robust dispersion statistics and noise-level shares for ΔRD.

| Year | 5th | 95th | IQR/2 | MAD | ε_RD(y) | Noise Share (Global: \|ΔRD\| ≤ 0.993) | Noise Share (Year: \|ΔRD\| ≤ ε_RD(y)) |
|---|---|---|---|---|---|---|---|
| 2016 | −1.24 | 2.74 | 0.075 | 0.000 | 0.000 | 78.0% | 61.6% |
| 2017 | −3.05 | 2.16 | 0.634 | 0.934 | 0.934 | 59.7% | 56.7% |
| 2018 | −2.44 | 2.71 | 0.834 | 1.230 | 1.230 | 53.4% | 64.1% |
| 2019 | −1.90 | 2.25 | 0.595 | 0.890 | 0.890 | 66.7% | 61.7% |
| 2020 | −2.67 | 2.01 | 0.600 | 0.875 | 0.875 | 69.2% | 64.3% |
| 2021 | −1.65 | 1.96 | 0.595 | 0.875 | 0.875 | 71.4% | 66.4% |
| 2022 | −0.49 | 3.66 | 0.610 | 0.904 | 0.904 | 30.3% | 26.3% |

Note: In the year-wise summaries, ε_{t,Y} denotes the raw transition-wise MAD scale. The filtering threshold actually used to construct ΔY_clean is the effective band ε*_{t,Y} = max(ε_{global,Y}, ε_{t,Y}) (Equation (7)). This avoids unrealistically small thresholds in transitions where the MAD becomes zero (e.g., 2015–2016).
Table 7. Year-wise robust dispersion statistics and noise-level shares for ΔSD.

| Year | 5th | 95th | IQR/2 | MAD | ε_SD(y) | Noise Share (Global: ΔSD = 0) | Noise Share (Year: \|ΔSD\| ≤ ε_SD(y)) |
|---|---|---|---|---|---|---|---|
| 2016 | −1.30 | 2.06 | 0.000 | 0.000 | 0.000 | 72.2% | 72.2% |
| 2017 | −2.49 | 2.53 | 0.000 | 0.000 | 0.000 | 58.5% | 58.5% |
| 2018 | −3.15 | 2.70 | 0.010 | 0.015 | 0.015 | 49.2% | 50.9% |
| 2019 | −1.31 | 5.84 | 0.040 | 0.030 | 0.030 | 46.6% | 51.9% |
| 2020 | −3.67 | 5.04 | 0.025 | 0.030 | 0.030 | 41.4% | 51.1% |
| 2021 | −6.62 | 0.54 | 0.040 | 0.015 | 0.015 | 44.3% | 52.7% |
| 2022 | −1.13 | 13.0 | 1.10 | 0.302 | 0.302 | 21.9% | 51.8% |

Note: ε_{t,SD} is a descriptive transition-wise MAD scale reported to summarize dispersion under zero-inflation and event-driven spikes; ΔSD is not filtered into ΔSD_clean and is not used as an input to the Δ–State Vector.
Table 8. Summary statistics of ΔIRI and ΔRD across condition grades.

| Grade | ΔIRI Mean (m/km/year) | ΔIRI Worsen Ratio (%) | ΔIRI Improve Ratio (%) | ΔRD Mean (mm/year) | ΔRD Worsen Ratio (%) | ΔRD Improve Ratio (%) |
|---|---|---|---|---|---|---|
| 1 | 0.038 | 18.7 | 6.18 | 0.458 | 26.4 | 3.46 |
| 2 | 0.022 | 25.9 | 16.4 | 0.341 | 27.1 | 10.3 |
| 3 | −0.012 | 30.7 | 22.2 | −0.107 | 27.1 | 21.1 |
| 4 | −0.083 | 32.2 | 29.7 | −0.775 | 26.7 | 32.0 |
| 5 | −0.122 | 32.0 | 33.6 | −1.81 | 28.4 | 38.2 |
| 6 | −0.283 | 30.7 | 39.5 | −3.52 | 16.7 | 41.0 |
| 7 | −0.419 | 32.5 | 45.3 | NA | NA | NA |

Note: RD Grade 7 was not observed (or was extremely sparse) in the fixed panel used for analysis; therefore, summary statistics for that grade are not reported (NA).
Table 9. Distribution of rule-based Reaction Signatures (Trend × Mode) for the non-noise observations used in cluster–signature alignment (post-filtered HDBSCAN noise excluded; Section 3.6).

| Signature | Sample Size | Percent (%) |
|---|---|---|
| Stable–Zero | 9448 | 60.4% |
| Worsen–Mixed | 2249 | 14.4% |
| Worsen–Clear | 1644 | 10.5% |
| Worsen–Full | 250 | 1.60% |
| Improve–Mixed | 830 | 5.31% |
| Improve–Clear | 747 | 4.78% |
| Improve–Full | 472 | 3.02% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Hong, S.; Cho, J.; Yu, K.; Sohn, D.; Kim, I. State-Dependent Asphalt Pavement Deterioration Modeling via Noise-Filtered Reaction Signatures: A Data-Driven Framework Using Korea Highway Pavement Management System (K-HPMS) Data. Infrastructures 2026, 11, 15. https://doi.org/10.3390/infrastructures11010015

