Machine Learning for Prediction of the International Roughness Index on Flexible Pavements: A Review, Challenges, and Future Directions

Abstract: Timely maintenance of road pavements is crucial to ensure optimal performance. The accurate prediction of trends in pavement defects enables more efficient allocation of funds, leading to a safer, higher-quality road network. This article systematically reviews machine learning (ML) models for predicting the international roughness index (IRI), specifically focusing on flexible pavements, offering a comprehensive synthesis of the state-of-the-art. The study's objective was to assess the effectiveness of various ML techniques in predicting IRI for flexible pavements. Among the evaluated ML models, tree ensembles and boosted trees are identified as the most effective, particularly in managing data related to traffic, pavement structure, and climatic conditions, which are essential for training these models. Our analysis reveals that traffic data are present in 89% of the studies, while pavement structure and climatic factors are featured in 78%. However, maintenance and rehabilitation history appears less frequently, included in 33% of the studies. This research underscores the need for high-quality, standardized datasets, and highlights the importance of model interpretability and computational efficiency. Addressing data consistency, model interpretability, and replicability across studies is crucial for leveraging ML's full potential in fine-tuning IRI predictions. Future research directions include developing more interpretable, computationally efficient, and less complex models to maximize the impact of this research field in road infrastructure management.


Introduction
Roads are crucial in transportation and society [1]. To maintain the quality of the road network, regular maintenance activities are necessary, and these require decisions that are both technically and economically feasible [1]. The cost of these activities often constitutes a significant portion of government infrastructure budgets, limiting investment and negatively impacting the road network [2-4]. As such, optimizing funding is essential to ensure the quality of the road network [4].
Correspondingly, the transportation sector is experiencing transformative shifts due to breakthroughs in artificial intelligence (AI), big data, autonomous vehicles, and decarbonization. Nonetheless, these advancements introduce technical challenges that demand a reevaluation of traditional paradigms [5,6].
Likewise, the deterioration process of road pavement is nonlinear; it typically starts slowly, keeping the pavement in good condition during the early stages. However, once the deterioration sets in, it progresses quickly [12]. Additionally, inadequate maintenance and rehabilitation (M&R) strategies throughout the pavement's life cycle typically lead to structural failures, necessitating major rehabilitation or reconstruction, thus escalating the overall cost of road maintenance [2,13]. The international roughness index (IRI) is a traditional metric for evaluating road pavement quality [14]. Hence, numerous pavement management agencies use it as an indicator for M&R tasks.
Artificial intelligence has made inroads into various fields, with machine learning (ML) being one of its most prominent branches [15,16]. ML has been fueled by algorithm advancements, data availability, and reduced computing costs [15]. Essentially, ML is the ability of computers to learn how to perform tasks such as prediction, classification, clustering, and pattern recognition [17]; this learning occurs without explicit programming [15].
ML has become popular for pavement performance prediction in recent years due to its ability to model complex relationships between inputs and outputs [10,18]. ML algorithms can predict various aspects of pavement performance, including roughness, cracking, and rutting [19]. By training on large datasets of pavement data, these algorithms can identify patterns and relationships that may not be easily recognizable to the human eye. This leads to improved accuracy in pavement performance predictions and better road infrastructure management [10,18].
However, some challenges must be addressed to make ML a more reliable and effective tool for pavement performance prediction. One challenge is the availability of sufficient high-quality data for training models, as pavement performance data are often collected manually, a process that can be time-consuming, expensive, and prone to errors [20-22].
Correspondingly, the absence of a benchmark dataset also affects the ability to compare different models. Thus, comparing the results of algorithms trained in different contexts and with different data sources is not feasible.
Additionally, the interpretability of ML models remains challenging, making it difficult to understand the underlying relationships between input variables and pavement performance [10]. The variability in pavement performance due to environmental factors such as temperature, precipitation, and traffic loads is another challenge [23], as these factors significantly impact pavement performance but are challenging to quantify and incorporate into ML models [24,25].
Moreover, open science must be emphasized for rapid and significant scientific progress. The limited availability of information, data, and models in published articles presents a barrier to replicating studies, hindering the advancement of the field.
This study conducts a literature review, adhering to the preferred reporting items for systematic reviews and meta-analyses (PRISMA) guidelines [26], to examine the recent advancements in ML applications for predicting the IRI of flexible pavements. Recognizing the critical role of accurate IRI predictions in effective road infrastructure management, this review synthesizes state-of-the-art methods and highlights challenges, thereby providing a comprehensive understanding of the current research landscape in this domain.
The main objective of this review is to elucidate the state-of-the-art ML techniques employed in predicting the IRI for flexible pavements. Accordingly, four research questions (RQ) were outlined:
RQ 1: What ML algorithms are prevalently utilized in predicting IRI for flexible pavements?
RQ 2: Which data repositories are most frequently employed for training and evaluating ML models within the scope of IRI prediction, and what is their significance in this context?
RQ 3: What are the essential input parameters that significantly influence the training efficacy of ML algorithms in accurately predicting IRI?
RQ 4: Which ML models show the greatest potential in accurately predicting IRI for flexible pavements, and what attributes contribute to their effectiveness?
Accordingly, the research gap addressed in this study is the need for an updated, comprehensive review that encapsulates recent advancements in ML techniques for IRI prediction, specifically for flexible pavements. This study endeavors to answer the aforementioned research questions, focusing on the latest developments in this field.
The paper is structured into five sections. Section 1 provides an overview of the paper, while Section 2 discusses the background, with particular attention to the pavement performance model, IRI, and the main ML methods applied to pavements. Section 3 explores the methodology adopted to select and analyze articles for review. Section 4 presents the state-of-the-art techniques for pavement performance prediction using ML. Finally, the conclusions are in Section 5.

Pavement Performance Models
Pavement performance models (PPMs) are essential tools for predicting the performance of road pavements over time [18,27]. Pavement engineers have developed these models to tackle the challenge of predicting the behavior of complex structures made of various materials that respond differently to traffic and environmental conditions [3]. The accuracy, scope, and data requirements of PPMs may vary, and high-quality data are important to maximize their effectiveness.
PPMs are divided into three main categories: mechanistic, empirical, and mechanistic-empirical [18]. Mechanistic models mathematically model the physics of pavements; that is, the model calculates the pavement's reaction to traffic loads. On the other hand, empirical models use regression analysis to identify factors such as traffic, weather, and pavement age that impact pavement performance. These models usually use observed data to establish correlations between inputs and outputs. Therefore, ML-based PPMs are empirical.
Mechanistic-empirical models determine pavement stress and strain responses through mechanistic analysis and then relate them to pavement performance or deterioration through regression analysis.
Predicting pavement performance is a crucial aspect of pavement engineering as it provides insight into how road pavements will hold up under different conditions. This information is essential for designing, constructing, and maintaining cost-effective, durable roads equipped to withstand traffic and weather changes. Therefore, the pavement performance prediction module is the core of pavement management systems (PMS) [10].
PMS are essential for effectively managing road networks, given the limited funding available and the need to allocate resources effectively [2,27]. These frameworks provide decision-making tools and strategies for maintaining the quality of road pavements throughout their lifecycle, from planning to assessment [1]. The concept of PMS gained popularity in the 1960s and has since evolved to become the best way to ensure effective M&R strategies [2].
These tools comprise several modules and manage the entire pavement life cycle, including data collection and management, pavement condition evaluation, economic analysis, PPMs, prioritization of M&R activities, and optimization of activities and investments [1]. The most common components of a PMS include PPMs. These models are usually based on pavement conditions, traffic, historical data, and environmental conditions. Also, pavements are typically classified based on their current condition using pavement quality indexes.
Accordingly, a systematic approach is necessary to preserve existing road networks, starting with pavement condition assessment, performance modeling, strategic planning, and optimization of M&R activities. This article explores state-of-the-art ML techniques for IRI prediction and identifies the challenges encountered in the field. By understanding these challenges and the best practices for IRI prediction, the aim is to provide guidance for future research.
To ensure that the roads are well-maintained and safe for the users, it is crucial to assess road pavement performance. There are several methods for evaluating the performance of road pavements; some of the most common are the IRI, the pavement condition index (PCI), and the present serviceability index (PSI).
The PCI varies on a numerical scale between 0 and 100, with higher values indicating better performance. During the initial assessment, the pavement is given a score of 100, and values are deducted for each type of distress based on its extent and condition [14].
The PSI is a method for evaluating the current condition of a road based on visual observations and ranges from 0 (impassable) to 5 (excellent). The PSI considers slope variance and can be related to roughness performance metrics such as the IRI [28]. The PSI reflects the overall functional condition of the pavement, with higher values indicating better performance. It is calculated using a combination of visual inspections, surface distress measurements, and other data collected by pavement engineers.

International Roughness Index (IRI)
The IRI mathematically represents a pavement's longitudinal profile and is rooted in the World Bank's 1982 Brazil experiment [29]. Serving as a widely accepted metric, the IRI quantifies pavement roughness from the longitudinal profile, reflecting the surface variations that cause vehicle vibrations [29]. It is computed from the hypothetical response of a quarter-car moving at 80 km/h [14].
By quantifying a road's roughness, the IRI provides essential data in a format easy to compare: meters per kilometer, millimeters per meter, or inches per mile. Figure 1 delineates the relationship between IRI, road pavement quality, and usage. Transportation agencies have traditionally used IRI as a threshold for road maintenance decisions [30]. Regular monitoring of IRI values empowers these agencies to pinpoint declining roads, enabling efficient resource distribution for M&R. As technology progresses, the methods for collecting IRI data have become increasingly sophisticated and efficient, leading to abundant available data. This proliferation of information accentuates the importance of having robust and precise models to predict IRI based on these data.
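To make the quarter-car definition above concrete, the following Python sketch integrates a two-mass quarter-car model over a sampled elevation profile at 80 km/h and accumulates the absolute suspension velocity per unit of distance travelled. It is a simplified illustration rather than the reference IRI procedure: the normalized "golden car" parameter values are the commonly quoted ones and are stated here as assumptions (they do not appear in the text above), the profile pre-smoothing and state initialization of the standard algorithm are omitted, and all function names are hypothetical.

```python
import math

# Commonly quoted "golden car" parameters, normalized by the sprung mass.
# Assumption: these values are not given in the text above.
K1 = 653.0   # tire stiffness / sprung mass, 1/s^2
K2 = 63.3    # suspension stiffness / sprung mass, 1/s^2
C = 6.0      # suspension damping / sprung mass, 1/s
MU = 0.15    # unsprung mass / sprung mass
SPEED = 80.0 / 3.6  # 80 km/h expressed in m/s


def _derivatives(state, y):
    """Time derivatives of (zs, vs, zu, vu) for road elevation y."""
    zs, vs, zu, vu = state
    susp = C * (vs - vu) + K2 * (zs - zu)  # suspension force per unit sprung mass
    return (vs, -susp, vu, (susp - K1 * (zu - y)) / MU)


def iri_m_per_km(elevations, dx, substeps=20):
    """Approximate IRI in m/km for elevations (m) sampled every dx metres."""
    if len(elevations) < 2:
        raise ValueError("need at least two profile points")
    dt = dx / SPEED / substeps
    state = (elevations[0], 0.0, elevations[0], 0.0)  # car at rest on the road
    accumulated = 0.0  # integral of |vs - vu| dt, in metres
    for i in range(len(elevations) - 1):
        y0, y1 = elevations[i], elevations[i + 1]
        for s in range(substeps):
            # Road elevation at the start, middle, and end of the substep
            # (linear interpolation between profile samples).
            ya = y0 + (y1 - y0) * s / substeps
            yb = y0 + (y1 - y0) * (s + 0.5) / substeps
            yc = y0 + (y1 - y0) * (s + 1) / substeps
            # Classical fourth-order Runge-Kutta step.
            k1 = _derivatives(state, ya)
            k2 = _derivatives(tuple(x + 0.5 * dt * k for x, k in zip(state, k1)), yb)
            k3 = _derivatives(tuple(x + 0.5 * dt * k for x, k in zip(state, k2)), yb)
            k4 = _derivatives(tuple(x + dt * k for x, k in zip(state, k3)), yc)
            new_state = tuple(
                x + dt * (a + 2 * b + 2 * c + d) / 6
                for x, a, b, c, d in zip(state, k1, k2, k3, k4)
            )
            # Trapezoidal rule for the accumulated suspension motion.
            accumulated += 0.5 * dt * (
                abs(state[1] - state[3]) + abs(new_state[1] - new_state[3])
            )
            state = new_state
    length = dx * (len(elevations) - 1)
    return 1000.0 * accumulated / length  # m/m -> m/km


def m_per_km_to_in_per_mi(value):
    """Convert m/km to in/mi (1 mile = 63,360 inches)."""
    return value * 63.36


# Hypothetical 100 m profile: a 5 mm sine wave with a 10 m wavelength.
profile = [0.005 * math.sin(2 * math.pi * (i * 0.25) / 10.0) for i in range(401)]
iri = iri_m_per_km(profile, 0.25)
print(f"IRI = {iri:.2f} m/km = {m_per_km_to_in_per_mi(iri):.0f} in/mi")
```

A perfectly flat profile yields an IRI of zero in this sketch. Because 1 m/km is numerically identical to 1 mm/m, only the conversion to inches per mile needs a factor.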

Road Pavement Databases
Effective management of road pavements through a PMS depends significantly on available historical data, which include a variety of factors such as pavement structure, M&R history, climate data, traffic, and performance metrics. Highway agencies often maintain proprietary databases with varying degrees of comprehensiveness.
The critical role of data quality in these databases must be recognized, especially when developing ML models to predict pavement quality. Focusing on a data-centric approach is essential in ML applications, as the data used to train models are a critical determinant of their predictive accuracy [32,33].
The most significant database with highway pavement data is the Long-Term Pavement Performance (LTPP) program, initiated as part of the Strategic Highway Research Program (SHRP) in 1987. Managed by the Federal Highway Administration (FHWA), LTPP is the world's largest and most comprehensive pavement performance database. It includes more than 2500 pavement sections across North America, with the goal of studying how pavement performance is influenced by design factors, environmental conditions, traffic loads, material characteristics, construction quality, and maintenance practices [34].
The LTPP program encompasses two integral components: general pavement studies (GPS) and specific pavement studies (SPS). GPS focuses on the overall performance of various pavement types using in-service pavement sections, while SPS investigates the impact of specific factors such as drainage, layer thickness, and maintenance treatments on pavement performance. This program has facilitated research to understand different M&R strategies, adapt performance models to local conditions, and optimize maintenance decision-making processes [34].

Fundamentals of Machine Learning
Artificial intelligence, specifically ML, is rapidly advancing, propelled by algorithm innovations, data availability, and heightened computing power [15]. ML is noteworthy for its capability to predict outcomes without explicit programming [15], offering transformative potential in pavement performance prediction. Nevertheless, this transition poses new challenges [5,6].
Machine learning serves as a robust alternative to traditional methods in pavement performance prediction, promising improved accuracy and data-driven decision-making. Furthermore, ML techniques are categorized as supervised, unsupervised, or reinforcement learning [35], as illustrated in Figure 2. Supervised learning, a key facet of ML, focuses on generating predictions from labeled data. This involves training models with known input-output pairs, allowing the algorithm to make predictions for unseen data. These tasks predominantly involve regression, for predicting continuous values, and classification, for identifying distinct classes [36]. Accordingly, these algorithms are inherently task-driven.
Conversely, unsupervised learning, another facet of ML, seeks to uncover hidden patterns in unlabeled data, focusing on pattern recognition and data clustering [37]. Hence, such algorithms are data-driven.
Reinforcement learning is distinct, involving an agent that performs actions in an environment to achieve maximum rewards. It relies on a feedback loop where the agent adjusts its actions through positive or negative reinforcement to attain optimal results [38]. Consequently, such agents are designed to learn from mistakes through trial and error.
Supervised learning is particularly effective for tasks requiring predictions or decisions based on historical data. It is therefore the branch of ML most used in PPMs. In analyzing pavement quality through the IRI, algorithms utilize labeled data, associating pavement conditions (input) with specific target IRIs (output) to predict the IRI for unseen pavement conditions.
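The supervised workflow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the (pavement age, IRI) pairs are invented for the example, an ordinary least-squares line stands in for the ML model, and the function names are hypothetical. The fitted model is scored on held-out pairs with RMSE and R², two error metrics commonly reported for IRI models.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical labeled data: (pavement age in years, measured IRI in m/km).
train = [(1, 1.0), (3, 1.3), (5, 1.5), (7, 1.9), (9, 2.2)]
test = [(2, 1.1), (6, 1.7), (10, 2.4)]

# Training: learn the input-output mapping from known pairs.
slope, intercept = fit_line([age for age, _ in train], [iri for _, iri in train])

# Evaluation: predict IRI for sections the model has not seen.
predicted = [slope * age + intercept for age, _ in test]
observed = [iri for _, iri in test]
print(f"RMSE = {rmse(observed, predicted):.3f} m/km, R2 = {r2(observed, predicted):.3f}")
```

Keeping the test pairs out of the fitting step is what makes the reported scores an estimate of performance on unseen pavement sections.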

Popular Algorithms in Pavement Analysis
In pavement management, regression analyses are frequently employed for forecasting. When incorporating ML, supervised learning algorithms are the usual choice. This section discusses prevalent algorithms for predicting the IRI over time.
The support vector machine (SVM) [39] is a versatile, supervised ML method used for classification and regression tasks, determining optimal boundaries to segregate different classes with precision and reliability [39,40].
Decision trees (DT) [49] classify data and make decisions through a hierarchical structure, where nodes represent features, branches represent possible values, and leaf nodes signify outcomes.
Progressing from traditional decision trees, ensemble methods [50] in ML combine multiple models to create a single, more accurate, and reliable predictive model [50]. Specifically, tree ensemble models consolidate predictions from numerous decision trees, improving precision and stability by balancing out individual errors from each tree. Boosted trees enhance prediction quality by amalgamating outputs from multiple trees and correcting preceding trees' errors, providing comprehensive and accurate predictions for pavement IRI [51].
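The idea that each boosted tree corrects the errors of the trees before it can be illustrated with a small pure-Python sketch. It assumes squared-error loss, one-split trees (stumps) on a single feature, and invented (age, IRI) values; the function names are hypothetical, and production libraries such as those cited in the reviewed studies are far more elaborate.

```python
def fit_stump(xs, residuals):
    """Best one-split regression tree on a single feature (squared error)."""
    best = None
    cuts = sorted(set(xs))
    for lo, hi in zip(cuts, cuts[1:]):
        threshold = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left value, right value)


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_boosted(xs, ys, n_rounds=50, learning_rate=0.3):
    """Each round fits a stump to the residuals left by the previous rounds."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x) for x, p in zip(xs, preds)]
    return base, learning_rate, stumps


def predict_boosted(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


# Hypothetical data: pavement age (years) and IRI (m/km) with nonlinear growth.
ages = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
iri = [1.0, 1.0, 1.1, 1.1, 1.2, 1.4, 1.7, 2.1, 2.6, 3.2]


def mse(model):
    return sum((y - predict_boosted(model, x)) ** 2 for x, y in zip(ages, iri)) / len(ages)


one_round = fit_boosted(ages, iri, n_rounds=1)
fifty_rounds = fit_boosted(ages, iri, n_rounds=50)
print(f"training MSE after 1 round: {mse(one_round):.4f}")
print(f"training MSE after 50 rounds: {mse(fifty_rounds):.4f}")
```

The training error shrinks as rounds are added because every new stump targets what the current ensemble still gets wrong; the learning rate damps each correction, which in practice helps limit overfitting.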
In conclusion, the union of traditional engineering knowledge with advanced ML models presents an opportunity to increase the quality of road infrastructure, translating into cost reduction and a better user experience.

Methodology
This study's methodology involved a literature review, focusing on the integration of pavement engineering and ML within state-of-the-art research. It specifically sought out articles that explored the use of ML techniques for modeling and predicting pavement performance, with a particular focus on forecasting the IRI in flexible pavements. Inclusion criteria were limited to peer-reviewed articles published within the last five years and written in English. The specific selection criteria are delineated in Table 2. The articles were searched in the Scopus database, and the query used was:
TITLE-ABS-KEY ((pavement*) AND (predict* OR model* OR perform*) AND ("machine learning" OR "artificial intelligence" OR "deep learning" OR "neural network*") AND ("international roughness index" OR IRI) AND (flexible OR asphalt)) AND PUBYEAR > 2017 AND (LIMIT-TO (DOCTYPE, "ar")) AND (LIMIT-TO (LANGUAGE, English))
Consequently, the search encompassed the title, abstract, and keywords of each record for the terms in the query above. It should be noted that the asterisk (*) functions as a wildcard character in the query, indicating that all variations stemming from the root that precedes this symbol are included in the search results.
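The wildcard behavior described above can be illustrated with Python's standard fnmatch module. This is only an analogy for how a trailing asterisk expands a root term; it is not how the Scopus search engine is implemented, and the helper name is hypothetical.

```python
import fnmatch

# Root terms with trailing wildcards, taken from the query above.
patterns = ["pavement*", "predict*", "model*", "perform*", "neural network*"]


def matches(term, pattern):
    """True if the term starts with the root and continues with any characters."""
    return fnmatch.fnmatchcase(term.lower(), pattern.lower())


for term in ["pavements", "prediction", "modelling", "performance", "neural networks"]:
    hits = [p for p in patterns if matches(term, p)]
    print(f"{term!r} is matched by {hits}")
```

The asterisk also matches zero characters, so the bare root (for example "pavement") is covered as well.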
The outcomes of this query, along with the proposed framework for article selection based on the PRISMA guidelines [26], are presented in Figure 3.
After the initial search in Scopus yielded 27 articles, we expanded our search through a snowballing method, adding 41 articles. By examining the reference lists of the selected articles and any papers that cited them, we identified additional studies that met our inclusion criteria. Each new paper was subject to the same selection process. Likewise, papers deemed of low relevance during the eligibility assessment were discarded for not training an ML model or for displaying results that significantly differ from those of their counterparts. In the end, 18 articles were selected for review.
Despite the filters for the previous five years, most selected articles were published from 2020 onwards, with just one article from 2018 and 2019 meeting the inclusion criteria. Figure 4 represents the publication years for the articles considered in this review, highlighting the focus of state-of-the-art research in predicting the IRI of flexible pavements. Nonetheless, it is important to acknowledge that this outcome is a consequence of the specific methodology and query employed in this study rather than an indication of a trend.
For the chosen articles, relevant information was extracted to address the previously outlined research questions, namely:
RQ 1: Which ML algorithms are most commonly used in predicting the IRI for flexible pavements?
RQ 2: What databases are used to train and test ML models in IRI prediction?
RQ 3: What key input parameters are essential for training ML models to predict IRI?
RQ 4: Among the various models, which exhibit the highest potential for accurately predicting the IRI of flexible pavements?
In addition to answering these questions, the study thoroughly analyzed each article's individual contributions and limitations.
Over the years, several applications using artificial neural network (ANN) architectures have been introduced to predict IRI. Hossain et al. [54] used ANN to predict the IRI. Further, Abdelaziz et al. [30] developed an IRI prediction model for flexible pavements using ANN and multiple linear regression (MLR) analysis. Moreover, Zeiada et al. [55] report that ANN was the most accurate in predicting pavement performance in warm climate regions compared with conventional regression methods.
Likewise, Gharieb et al. [58] developed two ANN models for double bituminous surface treatment (DBST) and asphalt concrete (AC) pavement sections within the National Road Network (NRN), using the Laos PMS database to predict IRI from pavement age and traffic load alone; the models surpassed traditional MLR methods. Furthermore, Abdelaziz et al. [30] developed ANN models that accurately predict the IRI by analyzing the effects of pavement distress across two climate regions in North America.
Applications of the random forest (RF) algorithm for predicting the performance of pavements have demonstrated promising results. Gong et al. [52] suggested using RF to predict IRI values and found it more accurate and precise than linear regression. Additionally, Marcelino et al. [59] proposed a systematic approach to developing prediction models in PMS by evaluating different versions of the RF algorithm and prioritizing generalization performance. Later, Naseri et al. [60] combined RF with the whale optimization algorithm (WOA), achieving enhanced accuracy in predicting the IRI and more efficient, cost-effective pavement maintenance optimization than traditional models.
Some authors compared different models for pavement performance prediction. Sharma et al. [66] compared five models: gradient boosting decision trees (GBDT), ANN, extremely random trees (XRT) [72], generalized linear model (GLM) [73], and RF. Their findings indicated that GBDT outperformed the other models. The study emphasized the crucial role of weather factors in predicting pavement performance. Further, Zeiada et al. [55] studied asphalt pavement in warm climates, pinpointing seven key design factors. They compared ML techniques, including DT, SVM, ensemble boosted trees (EBT), Gaussian process regression (GPR), and ANN, with traditional regression. ANN was the most accurate, with different environmental factors influencing performance in warm versus cold regions.
Likewise, Luo et al. [61] compared four models (GBDT, XGBoost, SVM, and MLR) to determine the best PPM, finding that GBDT was superior. Sandamal et al. [63] employed five ML models, namely k-nearest neighbor (kNN) [74], SVM, DT, RF, and XGBoost, to predict the IRI of pavements on Sri Lankan arterial roads. Focusing on pavement age and cumulative traffic volume as the only predictors, they found that these models outperformed traditional techniques. Notably, RF emerged as the most effective. The study also integrated Shapley Additive exPlanations (SHAP) [75] to explain the feature importance.
Naseri et al. [65] examined four algorithms (DT, SVM, RF, and ANN) for IRI prediction. The study also introduced a hybrid feature-selection technique using the arithmetic optimization algorithm and stochastic gradient descent regression (AOA-SGDR) to streamline the initial set of 58 variables.
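As a rough illustration of the Shapley-value idea that underlies SHAP, mentioned above for explaining feature importance, the sketch below computes exact Shapley attributions for a tiny model by averaging each feature's marginal contribution over all feature orderings. It is not the SHAP library and not the procedure of any reviewed study: the model `toy_iri`, its coefficients, the feature names, and the all-zero baseline are hypothetical, and exact enumeration is only practical for a handful of features.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline) per feature."""
    names = list(x)
    totals = {name: 0.0 for name in names}
    orders = list(permutations(names))
    for order in orders:
        current = dict(baseline)      # start from the reference input
        previous = predict(current)
        for name in order:
            current[name] = x[name]   # switch one feature to its actual value
            value = predict(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orders) for name, total in totals.items()}


def toy_iri(f):
    """Hypothetical IRI model with an age-traffic interaction term."""
    return (1.0 + 0.05 * f["age"] + 0.02 * f["traffic"]
            + 0.01 * f["age"] * f["traffic"] + 0.001 * f["precipitation"])


section = {"age": 10.0, "traffic": 5.0, "precipitation": 100.0}
reference = {"age": 0.0, "traffic": 0.0, "precipitation": 0.0}

attributions = shapley_values(toy_iri, section, reference)
for name, value in attributions.items():
    print(f"{name}: {value:+.3f} m/km")
```

The attributions add up to the difference between the prediction for the section and the prediction for the reference input, and the interaction term is shared equally between age and traffic.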
Marcelino et al. [9] proposed a transfer learning approach using the AdaBoost algorithm (TrAdaBoost) [76] to enhance the accuracy of pavement performance prediction models in scenarios with limited data. Likewise, Wang et al. [53] used AdaBoost to outperform the linear regression approach of the mechanistic-empirical pavement design guide (MEPDG) in predicting road roughness. Subsequently, Guo et al. [57] proposed an ensemble learning model utilizing LightGBM to predict IRI and rut depth (RD). Their findings illustrated that LightGBM was more effective than ANN and RF.
In a related study, Zhang et al. [67] employed GBDT to predict IRI, rutting, fatigue cracking, transverse cracking, and longitudinal cracking, while also identifying critical factors for overlay performance, with pre-overlay rutting and transverse cracking emerging as key determinants of overlay durability. Damirchilo et al. [56] explored predicting IRI for asphalt pavements through ML techniques and determined that eXtreme Gradient Boosting (XGBoost) [77] was the best-performing model.
Song et al. [62] proposed an ensemble learning model based on Thunder Gradient Boosting Machines (ThunderGBM) [78] to predict the IRI of flexible pavements. They improved feature interpretation by using the SHAP method. The findings indicated that their proposed model outperformed the MEPDG, ANN, and RF models.
To summarize the studies, the content of the analyzed articles is presented in Table 3.
Table 3. Summary of studies on IRI prediction using machine learning.

| Author | Title | Contributions |
|---|---|---|
| Gong et al. [52] | Use of random forests regression for predicting IRI of asphalt pavements. | Recommends using RF to predict IRI values and shows its accuracy with high R² and low RMSE scores compared to LR. Highlights the initial IRI as the critical factor. |
| Marcelino et al. [9] | Transfer learning for pavement performance prediction. | Proposes a transfer learning method with the AdaBoost algorithm for pavement performance prediction with scarce data. |
| Wang et al. [53] | Adaboost algorithm in artificial intelligence for optimizing the IRI prediction accuracy of asphalt concrete pavement. | Developed an AdaBoost model to improve IRI predictions, surpassing the MEPDG's linear regression approach. |
| Hossain et al. [54] | International roughness index prediction of flexible pavements using neural networks. | Introduces an ANN model for IRI prediction using climate and traffic data. Results demonstrate low RMSE and accurate prediction in various United States climates. |
| Abdelaziz et al. [30] | International roughness index prediction model for flexible pavements. | Introduces an improved IRI prediction model for flexible pavements using regression analysis and neural networks. |
| Zeiada et al. [55] | Machine learning for pavement performance modelling in warm climate regions. | Demonstrates ANN modeling's superior accuracy over other ML methods and traditional regression, emphasizing distinct environmental impacts between warm and cold regions. |
| Damirchilo et al. [56] | Machine learning approach to predict international roughness index using long-term pavement performance data. | Introduces an XGBoost-based approach to predict IRI, whose performance was superior to SVR and RF. The study used LTPP data and found key factors affecting predictions, such as No.-200-passing, hydraulic conductivity, and KESAL. |
| Zhang et al. [67] | Analysis of critical factors to asphalt overlay performance using gradient boosted models. | Identified the critical variables for the evolution of overlay performance using GBDT. |
| Guo et al. [57] | An ensemble learning model for asphalt pavement performance prediction based on gradient boosting decision tree. | Introduces an ensemble learning model using LightGBM to predict two functional indices, IRI and RD. This model performs better than ANN and RF. |
| Gharieb et al. [58] | Modeling of pavement roughness utilizing artificial neural network approach for Laos national road network. | Presents two ANN models that accurately forecast the IRI for DBST and AC pavements. |
| Marcelino et al. [59] | Machine learning approach for pavement performance prediction. | Presents an ML method for pavement performance prediction, focusing on making the model applicable in different situations. It includes a case study using RF to predict 5-10 years of IRI using data from the LTPP. |
| Naseri et al. [60] | A newly developed hybrid method on pavement maintenance and rehabilitation optimization applying whale optimization algorithm and random forest regression. | Presents a novel hybrid method for optimizing pavement maintenance using RF, WOA, and GA, significantly outperforming traditional models in accuracy and cost-efficiency. |
| Luo et al. [61] | Prediction of IRI based on stacking fusion model. | Suggests that a stacking fusion model improves pavement performance prediction. The model combines GBDT and XGBoost with bagging as meta-learners. |
| Song et al. [62] | An efficient and explainable ensemble learning model for asphalt pavement condition prediction based on LTPP dataset. | Introduces a model to predict the IRI of asphalt pavements. It uses ThunderGBM and SHAP to achieve higher accuracy and better feature interpretation. |
| Sandamal et al. [63] | Pavement roughness prediction using explainable and supervised machine learning technique for long-term performance. | RF offered the most accurate predictions compared to kNN, SVM, DT, and XGBoost. Furthermore, these authors introduced SHAP to explain the feature importance. |
| Abdualaziz et al. [64] | Application of artificial neural network technique for prediction of pavement roughness as a performance indicator. | Developed ANN models to predict IRI by analyzing pavement distress effects across two climate regions (wet freeze and wet freeze) in North America. |
| Naseri et al. [65] | Novel soft-computing approach to better predict flexible pavement roughness. | Introduced an AOA-SGDR method for feature selection from 58 initial variables. |
| Sharma et al. [66] | | GBDT performs the best. The paper also highlights the importance of weather factors. |

In the reviewed studies, ANN models were found to use fewer training features than other ML models, which is a notable aspect of ANN in pavement performance prediction. Additionally, ANN are less interpretable; they function as 'black boxes' that process and learn from data to produce predictions or classifications. Unlike more transparent models such as decision trees or ensemble methods, it is harder to decipher how ANN use specific input characteristics to make predictions.
Key findings in Table 3 include identifying crucial factors affecting IRI, such as traffic, climate, pavement structure, and specific characteristics like hydraulic conductivity. Also, some studies focus on particular contexts, like warm climates or different pavement types, revealing the adaptability of ML models. Several studies employ ensemble learning, combining different models for enhanced prediction accuracy. Overall, these studies highlight the evolving landscape of ML applications in pavement performance prediction, showcasing advancements in accuracy, interpretability, and efficiency.
Table 4 summarizes the formulation of the models, including the algorithms, data source, and training features. The table groups the training features into four categories: M&R history, traffic, structure, and climate. These categories consolidate information about the pavement, including records of maintenance and rehabilitation activities, traffic conditions, structural capacity, and exposure to various environmental conditions.

| Author | Year | Models | Data source | Feature groups marked (M&R history, traffic, structure, climate) |
|---|---|---|---|---|
| Marcelino et al. [9] | 2019 | AdaBoost | LTPP | X X X |
| Wang et al. [53] | 2021 | AdaBoost | LTPP | X X X |
| Hossain et al. [54] | 2019 | ANN | LTPP | X X |
| Abdelaziz et al. [30] | 2020 | ANN | LTPP | X X |
| Zeiada et al. [55] | 2020 | DT, SVM, EBT, GPR, ANN | LTPP | X X X |
| Damirchilo et al. [56] | 2020 | XGBoost | LTPP | X X X X |
| Zhang et al. [67] | 2020 | GBDT | LTPP | X X X |
| Guo et al. [57] | 2021 | LightGBM | LTPP | X X X |
| Gharieb et al. [58] | 2021 | ANN | NRN | X |
| Marcelino et al. [59] | 2021 | RF | LTPP | X X X X |
| Naseri et al. [60] | 2022 | RF | LTPP | X X X X |
| Luo et al. [61] | 2022 | GBDT, XGBoost, SVM | LTPP | X X X |
| Song et al. [62] | 2022 | ThunderGBM | LTPP | X X X X |
| Sandamal et al. [63] | 2023 | kNN, SVM, DT, RF, XGBoost | Proprietary 1 | X |
| Abdualaziz et al. [64] | 2023 | ANN | LTPP | |
| Naseri et al. [65] | 2023 | DT, SVM, RF, ANN | LTPP | X X X |
| Sharma et al. [66] | 2023 | GBDT, ANN, XRT, GLM, RF | LTPP | X X X |

Most of the analyzed models use traffic data, which 89% of the studies incorporate, while climatic factors and pavement structure are considered in 78%. This wide usage reflects a holistic approach to integrating diverse yet influential factors, underscoring the collective recognition of their importance in accurately predicting pavement conditions. Conversely, historical M&R data appear less frequently, in only 33% of the studies, suggesting a relatively lower prevalence in current research models.
Figure 5 displays a boxplot illustrating the R² results of the models. Correspondingly, some of the key points regarding Figure 5 are:
• It includes only test sample results;
• Each study considered may have different contexts and scopes;
• Models may have diverse training data;
• Only the best results are represented, as specifically cited in the studies;
• The sample of studies analyzed is relatively small (17 articles).
Due to the differences in training data and hyperparameters among models, creating a direct and equitable comparison between them is challenging. Therefore, the representation aims to highlight current trends and provide insight into each model's results in a generalized manner. Also, in this article, only the results of models qualified as best by the authors of the reviewed documents are analyzed; preliminary findings or variations are not included.
Likewise, Figure 6 presents the algorithms used in the literature reviewed by year. Models such as random forest, XGBoost, LightGBM, AdaBoost, GBDT, EBT, and ThunderGBM are grouped under 'Ensemble and Boosted Trees', and models appearing only once are grouped and labeled 'Others'.
The data indicate that the use of boosting tree models for pavement performance prediction has become state-of-the-art (Figure 6). For instance, Damirchilo et al. [56] determined that boosting tree models are the most effective in predicting IRI for flexible pavements, a conclusion also supported by [61,66].
Figure 5 also illustrates the observed trend, indicating that a plateau appears imminent, with R² values levelling off close to 0.95. Achieving such accuracy implies that subsequent enhancements are likely to be incremental. Nonetheless, opportunities remain for innovation, particularly in improving the usability and interpretability of models.
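For reference, the R² values discussed above follow the usual definition of the coefficient of determination, and RMSE is the metric several reviewed studies report alongside it. The following is a minimal sketch of both calculations; the IRI values are hypothetical and are not taken from any reviewed study.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared error, in the same unit as the target (e.g., m/km)."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical test-set IRI values in m/km (illustration only).
observed = [1.0, 1.5, 2.0, 2.5, 3.0]
predicted = [1.1, 1.4, 2.1, 2.4, 3.1]

print(f"R2 = {r_squared(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.3f} m/km")
```

Because R² is normalized by the variance of the observed values, the same RMSE can correspond to different R² values on datasets with different IRI ranges, which is one reason cross-study comparisons need care.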
According to Shwartz-Ziv and Armon [79], boosting tree models like XGBoost are often recommended for prediction problems involving structured data, such as tables, because they handle this type of data effectively and often outperform ANN in these instances.
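To make the idea of boosted trees concrete, the sketch below implements gradient boosting for squared error with one-split trees (stumps) in plain Python. It is a simplified illustration of the general technique, not the implementation used by XGBoost, LightGBM, or any reviewed study; the feature names, data values, and hyperparameters are hypothetical.

```python
def fit_stump(X, residuals):
    """Find the single split (feature, threshold) that best fits the residuals."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lmean, rmean)
    return best[1:]


def predict_stump(stump, row):
    j, thr, left_value, right_value = stump
    return left_value if row[j] <= thr else right_value


def fit_boosted(X, y, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return base, stumps, learning_rate


def predict(model, row):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


def mse(predictions, targets):
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Hypothetical tabular data: [pavement age (years), traffic (kESAL/year)] -> IRI (m/km).
X = [[1, 100], [3, 150], [5, 300], [8, 200], [10, 400], [12, 350], [15, 500], [18, 450]]
y = [1.0, 1.2, 1.6, 1.8, 2.4, 2.5, 3.1, 3.3]

model = fit_boosted(X, y)
baseline_mse = mse([sum(y) / len(y)] * len(y), y)
boosted_mse = mse([predict(model, row) for row in X], y)
print(f"training MSE: mean baseline {baseline_mse:.4f}, boosted {boosted_mse:.4f}")
```

Each round corrects the errors left by the previous rounds, which is why such models can capture thresholds and interactions in tabular inputs without manual feature engineering. Production libraries add regularization, deeper trees, and efficient split finding on top of this basic loop.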
Machine learning models have outperformed traditional models, such as MLR, in pavement performance prediction.Bashar and Torres-Machi [24] found that ML models, on average, captured 15.6% more variability than traditional methods.Although ANN has been applied with excellent results for predicting pavement performance, studies suggest that tree ensemble models are often better for structured data [71,77,79,80].
In addition, explainability is one of the major challenges in using neural networks.The complex architecture of ANN makes it difficult to interpret their predictions and understand how they arrived at a particular conclusion.
Authors like Song et al. [62], Yao et al. [10], and Sandamal et al. [63] have used the SHAP method to improve their models' interpretability and better understand the factors driving their predictions. Using SHAP, they identified the most important features and how these contributed to the predictions. Meanwhile, simple models are often better for prediction problems where explainability is essential.
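SHAP builds on Shapley values from cooperative game theory. As an illustration of the underlying idea, the sketch below computes exact Shapley values by enumerating all feature coalitions for a toy model, replacing "absent" features with a baseline value. This is feasible only for a handful of features and is not how the SHAP library computes its estimates; the model, feature names, and numbers are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values; features outside a coalition take their baseline value."""
    names = list(instance)
    n = len(names)

    def value(coalition):
        x = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return model(x)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_iri_model(x):
    # Hypothetical IRI model with an age-traffic interaction (illustration only).
    return 0.8 + 1.0 * x["initial_iri"] + 0.05 * x["age"] + 0.002 * x["age"] * x["kesal"]


instance = {"initial_iri": 1.2, "age": 10, "kesal": 200}
baseline = {"initial_iri": 1.0, "age": 5, "kesal": 100}

phi = shapley_values(toy_iri_model, instance, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, which is the property that makes such attributions easy to communicate to road agencies.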
Lastly, the outcomes showcased by the models highlight the significant potential of utilizing machine learning techniques for accurately predicting the IRI of flexible pavements.

Challenges in IRI Prediction with Machine Learning
Despite the promising results of using ML to predict pavement performance through IRI, several challenges must be addressed. This subsection discusses the challenges associated with using ML for IRI prediction on flexible pavements.
A major challenge is obtaining high-quality data for model training. Pavement performance data are often collected manually or semi-automatically, which makes collection time-consuming and costly and can yield inaccurate results [20][21][22]. Therefore, to create ML models that work well for a broad range of situations, it is crucial to standardize the collection, handling, storage, and accessibility of the data.
Another obstacle to accurate pavement performance prediction using ML is the variability that arises from external factors such as temperature, precipitation, and traffic loads [23]. These factors significantly affect pavement performance, but they are challenging to quantify and include in ML models [24,25]. Moreover, the complexity of the relationships between inputs and outputs makes it challenging to create models that are effective across regions and contexts.
A further challenge lies in the interpretability of ML models. Understanding the relationship between the input variables and pavement performance is hard, particularly for ANN, which reinforces the option of bypassing them. Interpretability is important for road agencies because they need to understand the factors that affect pavement performance to decide how to allocate their resources. Additionally, the lack of interpretability limits the transparency of ML models, making it difficult to assess their reliability and validity [81][82][83].
Furthermore, the computational cost of training and using ML models must be addressed. This cost is especially problematic for large datasets or for models with many inputs and outputs. In addition, the large number of parameters to be optimized often leads to overfitting, where the model becomes too complex and cannot generalize well to new data.
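The overfitting problem can be illustrated with a small synthetic experiment. The sketch below uses a nearest-neighbour regressor as a stand-in for an overly flexible model: with k = 1 it memorizes the training data, so its training error is zero, while its error on unseen data is not. The data-generating process (IRI growing linearly with age plus noise) is an assumption made purely for illustration.

```python
import math
import random


def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def rmse(train, data, k):
    errors = [(knn_predict(train, x, k) - y) ** 2 for x, y in data]
    return math.sqrt(sum(errors) / len(errors))


rng = random.Random(0)  # fixed seed so the experiment is repeatable


def sample(n):
    # Hypothetical process: IRI (m/km) = 1.0 + 0.1 * age (years) + noise.
    points = []
    for _ in range(n):
        age = rng.uniform(0, 20)
        points.append((age, 1.0 + 0.1 * age + rng.gauss(0, 0.3)))
    return points


train, test = sample(40), sample(40)

for k in (1, 7):
    print(f"k={k}: train RMSE {rmse(train, train, k):.3f}, "
          f"test RMSE {rmse(train, test, k):.3f}")
```

A large gap between training and test error, as with k = 1 here, is the usual symptom of overfitting; reporting both figures makes it easier to judge how well a model is likely to generalize.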
In addition, a pressing need in ML for pavement performance prediction is a standard dataset. Its absence makes it difficult to benchmark different models, even when they use the same data source, such as the LTPP database, because different studies use different sections, years, and features, which makes comparing the models almost impossible.
Replicability is a crucial aspect of scientific research, as confirmed by Zwaan et al. [84]. In line with this principle, independent verification of data, models, and methods is essential for scientific advancement [85]. Unfortunately, many of the analyzed articles provide insufficient information on their methods, mentioning only the models and some key parameters. Moreover, most do not provide simple access to their data, opting to state that it is "available upon request". Few make their models available, making replication of results, confirmation of findings, and continuation of research extremely difficult.
In conclusion, while ML holds substantial promise for pavement performance prediction, several challenges must be addressed to optimize its reliability and effectiveness, ensuring the advancements effectively reach transportation agencies.

Future Research
Based on the literature review, the areas that hold the most promise for research into employing ML for predicting pavement performance of flexible pavements, specifically focusing on the IRI, include:

• High-quality and standardized datasets: One of the major challenges in using ML for pavement performance prediction is the availability of high-quality data for benchmarking models. Future research should focus on developing an extensive, standardized, high-quality database.

• Interpretable models: Research should focus on developing interpretable models that provide insight into the relationships between the inputs and the outputs.

• Variability of pavement performance: The variability from diverse environmental factors presents a significant challenge. Future research should focus on developing models proficient at managing and adapting to the complexity and variability inherent in pavement performance.

• Computational efficiency: Research should focus on developing computationally efficient models that can handle large datasets and consume fewer resources.

• Complexity: Different stakeholders have shown interest in using AI. However, complexity is a limiting factor. Future research should focus on simplifying the use of the models and improving their explainability.
In conclusion, employing machine learning for pavement performance prediction improves road infrastructure management.To fully realize the potential of this research field, focused and coordinated efforts are essential for maximizing its impact.

Conclusions
This research evaluated the application of ML techniques for predicting road pavement quality using the IRI for flexible pavements. The study found that the data used to train ML models for IRI prediction fall into four groups: M&R history, traffic, pavement structure, and climatic conditions. Traffic data were the most prominent, used in 89% of the studies, while pavement structure and climatic factors featured in 78%. In contrast, M&R history was less commonly used, appearing in only 33% of the articles.
Recent progress in predicting IRI for flexible pavements highlights the effectiveness of ML models, particularly ensemble and boosted tree models. These models gained prominence due to their accuracy and are the state of the art in IRI prediction. Their popularity stems from their ability to manage complex pavement performance data efficiently and accurately.
Addressing challenges in using ML for pavement performance prediction requires focusing on developing high-quality, standardized datasets. The LTPP database is the most utilized source in the reviewed studies. However, the absence of specific benchmarks within this database highlights the need for refined and standardized data frameworks to enhance model evaluation and evolution. Establishing benchmarks, akin to those in computer vision, could drive progress in pavement management, allowing for adequate model comparisons. An annual benchmark using data from the LTPP program could serve as a standard for evaluating and developing models.
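One ingredient of such a benchmark would be a train/test partition that every study can reproduce exactly. The sketch below shows one possible way to do this, by hashing a section identifier so that the assignment does not depend on data ordering or random seeds. The identifiers, the salt string, and the 20% test fraction are hypothetical choices for illustration; they are not defined by the LTPP program or by any reviewed study.

```python
import hashlib


def assign_split(section_id, test_fraction=0.2, salt="iri-benchmark-v1"):
    """Deterministically assign a pavement section to 'train' or 'test'."""
    digest = hashlib.sha256(f"{salt}:{section_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # uniform value in [0, 1)
    return "test" if bucket < test_fraction else "train"


# Hypothetical section identifiers (illustration only).
section_ids = [f"section-{i:04d}" for i in range(1000)]
splits = {sid: assign_split(sid) for sid in section_ids}

n_test = sum(1 for split in splits.values() if split == "test")
print(f"{n_test} of {len(section_ids)} sections assigned to the test set")
```

Splitting by section rather than by individual observation keeps all measurements of a given section on the same side of the partition, which avoids leaking information from the test set into training.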
The importance of replication for scientific advancement is undeniable, but pavement performance prediction faces obstacles due to restricted data sharing and a lack of transparency in methodologies. The limited availability of public data and models poses significant challenges for replicating studies. Embracing open science principles can significantly bolster cumulative research efforts, thereby reducing the need for each new researcher to start from scratch.
ML provides a transformative approach for IRI prediction, significantly enhancing the management of road infrastructure. Looking ahead, research should focus on developing high-quality, standardized datasets and interpretable models. Addressing the variability in pavement performance caused by environmental factors, enhancing computational efficiency, and simplifying model complexity will make these tools more accessible and useful across a broader range of applications.
In conclusion, using ML effectively in predicting pavement performance can significantly improve road infrastructure management.Focused and coordinated efforts are vital to maximize the impact of this research field.

Figure 3. Protocol adopted for selection of reviewed articles.

Figure 4. Number of articles published per year selected for literature review.


Figure 5. Boxplot distribution comparing the performance of models evaluated by R-squared: median, interquartile range, and outliers in IRI prediction.


Infrastructures 2023, 8
Key points regarding Figure 5:
• It includes only test sample results;
• Each study considered may have different contexts and scopes;
• Models may have diverse training data;
• Only the best results are represented, as specifically cited in the studies.

Figure 6. Algorithm usage in the reviewed literature per year.


Table 2. Criteria applied to select articles in the review.

Table 4. Overview of machine learning models in the reviewed literature.