How Prediction Accuracy Can Affect the Decision-Making Process in Pavement Management System

One of the most important components of pavement management systems is predicting the deterioration of the network through performance models. The accuracy of the prediction model is important for prioritizing maintenance actions. This paper describes how the accuracy of prediction models affects the decision-making process in terms of the cost of maintenance and rehabilitation activities. The process simulates the propagation of the error between the actual and predicted values of pavement performance indicators. Different rates of error (10%, 30%, 50%, 70%, and 90%) were added to the results of the prediction models. The results showed a strong correlation between the prediction models' accuracy and the cost of maintenance and rehabilitation activities. The cost of treatment (in millions of dollars) over 20 years for the five scenarios increased from $54.07 to $92.95, from $53.89 to $155.48, and from $74.41 to $107.77 for asphalt, composite, and concrete pavement types, respectively. Increasing the rate of error contributed to the prediction model also resulted in a higher benefit reduction rate.


Introduction
With an ageing transportation network, highway agencies are finding it challenging to maintain their deteriorating assets in good condition. Given the limited budget to maintain the network, departments of transportation (DOTs) need to efficiently manage their assets to satisfy network-level goals.
The Transportation Asset Management Plan (TAMP) is "the strategic and systematic process of operating, maintaining, upgrading, and expanding physical assets effectively throughout their life cycle" [1]. In 2012, the US Congress passed the Moving Ahead for Progress in the 21st Century Act (MAP-21), which requires each state DOT to present a risk-based asset management plan to maintain and improve its infrastructure condition [2]. MAP-21 requires evaluating the pavement condition of highways through its infrastructure condition criteria. Another piece of legislation, known as Fixing America's Surface Transportation (FAST), was passed in 2015 to support performance-based asset management methods. Pavement and bridge assets are prioritized in both acts, and highway agencies spend the largest portion of their budget on maintaining and preserving these two assets every year.
Although DOTs are actively maintaining their transportation assets, the 2017 infrastructure report card shows that US roads are in fair to poor condition, with a grade of D [3]. The main reason US roads remain in fair to poor condition is that the US highway system has been underfunded for many years. As a result of this shortfall, the US has an $836 billion backlog of highway and bridge capital needs. The largest portion of this backlog ($420 billion) is for repairing the highway system. A systematic way to optimize this limited funding is therefore needed to maintain and preserve the highway system.
To achieve the TAMP goals, the Pavement Management System (PMS) serves as a support tool for deriving objective decisions that keep pavements in an acceptable condition at a minimum cost [4]. Significant savings and improvements in network condition have been observed since the 1970s, when DOTs established and implemented their own PMSs to match their needs [5,6]. Arizona DOT uses its PMS to select maintenance actions for a 7400-mile highway network and to identify the minimum funding required to implement the maintenance program [7]. It also saved $101 million in the first four years of PMS implementation [8]. A comprehensive PMS involves collecting data, inspecting the road network, predicting network deterioration through performance models, and optimizing maintenance and rehabilitation activities over the planning horizon.
The decision levels in PMS are categorized into the project level, network level, and administrative level. Figure 1 shows the hierarchical decision level in the pavement management system [9].
Funds are allocated to different transportation asset categories at the administrative level. The goals of network-level management are normally related to the budget process. These goals include identifying the maintenance, rehabilitation, and reconstruction needs, determining the funding needs, forecasting the impact of various funding options in the future, and prioritizing the maintenance activities for the selected funding option. In the case of a limited budget, network-level management selects sections based on criteria such as the least cost first, the worst section first, and the highest benefit-cost ratio. At the project level, detailed maintenance and rehabilitation treatments and the best strategy for maintenance actions are identified. Pavement condition evaluation is necessary for making an efficient decision at each of these decision levels.
Evaluating and modeling pavement conditions is a major part of all PMSs. Nowadays, almost all state DOTs use automated surveying tools to evaluate pavement conditions. The data collection covers pavement distress data such as transverse cracking, longitudinal cracking, alligator cracking, wheel-path cracking, patching, and surface friction [10][11][12]. To summarize the pavement condition, the U.S. Army Corps of Engineers developed the Pavement Condition Index (PCI) in the 1970s. The index represents a weighted average of sub-indices, reflecting the severity levels of different distress types [13]. Since then, PCI has been widely used to represent the pavement condition [14]. PCI has a numerical value between 0 and 100, where 0 defines the worst and 100 defines the best condition for pavement segments. Based on the value of PCI, decision-makers can evaluate the functionality of the pavement network, predict the best time for any maintenance and rehabilitation activities, and estimate future funding needs [15]. It is worth mentioning that the PCI used in this study is the one developed by the Iowa Department of Transportation, not the ASTM D6433 PCI.

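Figure 1. Hierarchical decision levels in the pavement management system (administrative, network, and project levels).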
These activities need to be prioritized in order to minimize the cost of maintenance activities and maximize the life cycle of the network [16]. To reach this goal, a robust and accurate deterioration model is needed. Maintenance optimization is sensitive to deterioration models that describe the change in pavement condition over time [17]. By reducing the error in deterioration models, agencies can obtain significant budget savings through timely intervention and accurate planning [18]. A pavement management system can succeed when an accurate deterioration model is used to optimize the maintenance and rehabilitation strategies over the pavement service life. Deterioration models can also help agencies identify what maintenance activities are needed [19]. The long-term and short-term planning that becomes possible with deterioration models is even more critical when highway agencies face a funding shortfall [20].
Different types of deterioration models are used in pavement management systems to help decision-makers predict the future condition of pavement sections. Wolters and Zimmerman categorized these models into probabilistic, deterministic, knowledge-based, and neural network models [21]. A deterministic model is a system in which no randomness is involved in the development of the future states of the system. Structural performance, functional performance, primary response, and damage models are all deterministic models. Most deterministic models are based on regression. Deterministic models can also be broken down into empirical, mechanistic, and mechanistic-empirical models [22].
Probabilistic models predict the future condition of pavements by means of a transition matrix that gives the probability with which a pavement would fall into a particular condition state, describing the possible pavement conditions as a random process. Neural Network (NN) models have received more attention among researchers in the past few years because of their capability to interconnect neurons between layers [23][24][25][26]. NN applications can solve complex problems more efficiently than traditional methods [26][27][28][29]. These problems span different categories of pavement engineering, based on research conducted by Ceylan in 2014 [30]. Deterioration models attempt to fit time series data with low observation frequency and high levels of variability, which can be properly captured using Recurrent Neural Networks (RNN). In the past few years, researchers have developed many different RNN algorithms, including the Long Short-Term Memory (LSTM), which was introduced to allow for modeling and forecasting long-term data series.
All these deterioration models are designed to predict the future condition of pavement sections so that maintenance and rehabilitation activities can be planned. Each activity is suitable for a specific distress, and decision-makers cannot apply one treatment to all types of distress. Since each pavement section can have more than one distress type and each distress type has its own treatment solution, state DOTs have defined their own decision trees for applying specific treatments to specific road sections with specific conditions. Nevertheless, all state DOTs share some common factors for selecting these treatments, such as the traffic condition, environmental factors, and pavement type. The treatment selection process differs from state to state depending on the pavement condition evaluation process. Some state DOTs use optimization routines for treatment assignment, and others use threshold values for assigning the treatment strategies [31].
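The transition-matrix idea behind probabilistic deterioration models can be illustrated with a minimal sketch. The three condition states and the transition probabilities below are hypothetical values chosen for illustration only; they are not taken from this study or from any agency.

```python
# Hypothetical 3-state Markov deterioration model (good, fair, poor).
# The probabilities are illustrative assumptions, not values from the study.
STATES = ["good", "fair", "poor"]
P = [
    [0.85, 0.15, 0.00],  # transitions from "good"
    [0.00, 0.80, 0.20],  # transitions from "fair"
    [0.00, 0.00, 1.00],  # "poor" stays poor when no treatment is applied
]

def step(dist, matrix):
    """Advance a state-probability vector by one period (row vector times matrix)."""
    n = len(matrix)
    return [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]

def predict(dist, matrix, years):
    """Apply the one-period transition repeatedly for the given number of years."""
    for _ in range(years):
        dist = step(dist, matrix)
    return dist

# A section that starts in "good" condition, projected two years ahead.
after_two_years = predict([1.0, 0.0, 0.0], P, 2)
```

Each row of the matrix sums to 1, so the projected vector remains a valid probability distribution over the condition states.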
In order to maximize the effectiveness of treatments, treatment effectiveness needs to be defined. There are different definitions available, such as extending the life of the pavement by treatments, improving the pavement deterioration curve by treatment, and the service life of the treatments. In general, however, treatment effectiveness is how well a treatment works during the pavement age so that the need for another treatment is eliminated. The right treatments can not only improve the pavement condition but also decrease the rate of deterioration of the pavement sections [32].
The uncertainties in pavement performance prediction produce errors, which are classified as random or systematic. These errors can be due to human involvement (such as errors during data entry, data preprocessing, and visual rating) or to technology (such as instrument errors). Random errors are the "result of irregular causes in which laws of action are unknown or too complex to be investigated. However, systematic errors are constant or may vary in some regular way" [33]. Saliminejad and Gharaibeh showed that even acceptable ranges of systematic and random errors could have an impact on the output of the PMS and the average annual budget. Based on a study by Haider and Chatti, unbiased sampling can reduce the rate of systematic errors, whereas increasing the sample size can reduce the rate of random errors [34]. In PMS, positive error in condition data (overestimating the condition index and underestimating distress) has less impact than negative error (underestimating the condition index and overestimating distress).
Different sources of error might be introduced in pavement performance data and, consequently, in pavement performance prediction. A composite condition index (for instance, the pavement condition index (PCI)) includes the measurement of roughness, distress, rutting, and faulting. The instrumental error might increase because different types of instruments are used to measure these condition indicators. Another source of error is the subjectivity in determining the severity and type of distresses. Field and operator conditions may introduce further error. Since the maintenance actions in each state DOT are directly related to pavement prediction models, the effect of errors is not negligible. However, little work has been done to investigate the impact of errors on the decision-making process.
In this research, the result of a pavement prediction model (an LSTM model) developed in a previous study is used [35,36]. LSTM is used for time-dependent prediction of the pavement condition index. The goal of this study is to investigate the effect of prediction accuracy on the decision-making process in terms of the cost of maintenance and rehabilitation activities for different pavement types. Historical pavement condition data from the Iowa DOT Pavement Management Information System (PMIS) from 1998 to 2018 were used for developing the prediction model. Different scenarios are investigated by adding different rates of error to the predicted values. Iowa DOT decision trees are used to check the effect of the prediction model accuracy on the cost of treatments for different pavement types. The results of the different scenarios were compared with the base scenario to check whether decreasing or increasing the accuracy of the prediction model affects the cost of maintenance and rehabilitation. Figure 2 represents the steps involved in completing this research study. Each individual step is described in detail in the following subsections.

Data
Information regarding the highway system, including construction history, section identification, maintenance history, pavement age, traffic loading, and pavement distresses, is available in the Iowa DOT PMIS database and was used to develop the prediction model in the previous study [35]. The condition data of pavement sections from 1998 through 2018 were used for model development purposes. The data collection covered pavement distress data such as transverse cracking, longitudinal cracking, alligator cracking, wheel-path cracking, patching, and surface friction. Three severity levels are assigned to distress data for all pavement types: low, medium, and high. Rutting depth for asphalt and composite pavements and faulting for concrete pavements have also been collected. The international roughness index (IRI) was used to characterize ride quality for all pavement types. The Iowa DOT spends about $1 million annually on collecting pavement condition data [15].

Preprocessing
After the data collection process, condition indices were estimated using the reported condition data. In the current database, different units are used for each distress type. Since the PCI is based on a scale of 100, individual indices and sub-indices were also estimated on a scale of 100 to make comparison easier. In this study, four individual indexes (cracking, riding, rutting, and faulting) are used across asphalt pavements (AC), composite pavements (COM), and concrete pavements (PCC). The overall PCI is the combination of the riding, rutting, and cracking indices for AC and COM pavements and of the riding, cracking, and faulting indices for PCC pavements. The weights for the individual indexes were determined in a previous study for Iowa DOT by Bektas and Smadi [15]. Moreover, all indexes are derived using the approach proposed in the same study.

Cracking Index
Four different sub-indexes were used to calculate the cracking index in AC and COM pavements, based on transverse cracking, longitudinal cracking, alligator cracking, and longitudinal wheel-path cracking. For PCC pavements, transverse cracking and longitudinal cracking were established as sub-indexes. Three severity levels were evaluated for pavement distresses by the Iowa DOT: low, medium, and high. Coefficients of 1, 1.5, and 2 are applied to the low, medium, and high severities, respectively, to aggregate all severity levels into an equivalent low severity [15]. A maximum value (threshold), corresponding to a deduction of 100 points and therefore to a cracking sub-index of 0, was determined for each crack type within each pavement type. Table 1 describes the threshold values for each sub-index in different pavement types. The cracking index is the weighted combination of the sub-indexes. These weights are determined based on expert opinion at the Iowa DOT. Table 2 shows the weight of each sub-index for calculating the cracking index.
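The aggregation described above can be sketched as follows. The severity coefficients (1, 1.5, and 2) come from the text; the threshold, the sub-index weights, and the linear deduction between 0 and the threshold are assumptions made for illustration, because the values of Tables 1 and 2 are not reproduced here.

```python
# Severity coefficients stated in the text.
SEVERITY_COEFF = {"low": 1.0, "medium": 1.5, "high": 2.0}

def cracking_sub_index(extent_by_severity, threshold):
    """Convert all severities to a low-severity equivalent, then deduct points.

    Assumption: the deduction grows linearly from 0 to 100 points as the
    equivalent extent goes from 0 to the threshold.
    """
    equivalent = sum(SEVERITY_COEFF[s] * x for s, x in extent_by_severity.items())
    deduction = min(equivalent / threshold, 1.0) * 100.0
    return 100.0 - deduction

def cracking_index(sub_indexes, weights):
    """Weighted combination of sub-indexes (weights are expected to sum to 1)."""
    return sum(weights[name] * value for name, value in sub_indexes.items())

# Hypothetical extents and threshold (same units), for illustration only.
transverse = cracking_sub_index({"low": 10.0, "medium": 4.0, "high": 2.0}, threshold=80.0)
# Hypothetical equal weights for two sub-indexes.
combined = cracking_index(
    {"transverse": transverse, "longitudinal": 100.0},
    {"transverse": 0.5, "longitudinal": 0.5},
)
```

In this example the equivalent low-severity extent is 10 + 1.5 × 4 + 2 × 2 = 20, which is a quarter of the assumed threshold, so 25 points are deducted.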

Riding Index
The International Roughness Index (IRI) is the roughness index most commonly obtained from measured longitudinal road profiles. The riding index in this study is based on the IRI measurements, as expressed on a scale of 100. IRI values below 0.5 m/km are taken as a perfect 100, whereas the values above 4.0 m/km are 0 on the index scale. Any other value between 0.5 and 4 m/km was calculated with interpolation [15].
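A minimal sketch of this conversion is shown below. The 0.5 m/km and 4.0 m/km limits come from the text; linear interpolation between them is an assumption, since the text only states that intermediate values were interpolated.

```python
def riding_index(iri_m_per_km, best=0.5, worst=4.0):
    """Map an IRI value (m/km) to a 0-100 riding index.

    Values at or below `best` score 100, values at or above `worst` score 0,
    and values in between are linearly interpolated (assumed).
    """
    if iri_m_per_km <= best:
        return 100.0
    if iri_m_per_km >= worst:
        return 0.0
    return 100.0 * (worst - iri_m_per_km) / (worst - best)

mid_range = riding_index(2.25)  # halfway between the two limits
```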

Rutting Index
Rutting is the accumulation of permanent deformation or consolidation in an asphalt pavement surface over time. Rutting occurs because the aggregate and binder in asphalt roads can move. A threshold value of 12 mm was set to 0 on a rutting index scale of 100, and values below 12 mm were applied as corresponding deductions based on previous research [15].

Faulting Index
Faulting is a difference in elevation across a joint or crack. Usually, the approach slab is higher than the leave slab due to pumping. Similar to the rutting index in AC pavements, a threshold value of 12 mm was set to 0 on the index scale of 100 based on previous research [15].
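The rutting and faulting indexes share the same 12 mm threshold, so one helper can sketch both. The proportional (linear) deduction below is an assumption; the text states only that 12 mm corresponds to an index of 0 and that smaller values are applied as deductions.

```python
def depth_index(depth_mm, threshold_mm=12.0):
    """Map a rut depth or fault height (mm) to a 0-100 index.

    Assumption: the deduction is proportional to depth, reaching 100 points
    (index 0) at the threshold.
    """
    clipped = min(max(depth_mm, 0.0), threshold_mm)
    return 100.0 * (1.0 - clipped / threshold_mm)

rutting_index = depth_index(3.0)    # 3 mm rut depth on an AC section
faulting_index = depth_index(15.0)  # 15 mm faulting exceeds the threshold
```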

Pavement Condition Index (PCI)
After calculating the cracking, riding, rutting, and faulting indexes, the Iowa DOT combines them, using formulas obtained from regression analysis, into a pavement condition index that describes the current condition of the pavements. The current formulas for calculating the PCI for AC and PCC pavements are given in [15].
Based on the PCI values, Iowa DOT classifies pavement condition for the interstate highway system as good, where PCI is between 76 and 100; fair, where PCI is between 51 and 75; and poor, where PCI is between 0 and 50. Based on the Iowa DOT classification, approximately 91% of the interstate highway system and 79% of the non-interstate highway system in the state of Iowa were categorized as being in good condition as of the end of 2017 [37].
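The interstate classification can be expressed as a small helper. The class boundaries come from the text; how fractional PCI values between the stated integer ranges (for example 75.5) are handled is an assumption of this sketch.

```python
def classify_pci(pci):
    """Classify a PCI value using the interstate ranges stated in the text.

    Assumption: fractional values above 75 count as good and fractional
    values above 50 count as fair.
    """
    if not 0 <= pci <= 100:
        raise ValueError("PCI must be between 0 and 100")
    if pci > 75:
        return "good"
    if pci > 50:
        return "fair"
    return "poor"

labels = [classify_pci(v) for v in (92, 75, 50)]
```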

Condition Description
After gathering and processing all the information from the last step, the performance indicators needed for decision making were defined. Each pavement type (AC, COM, and PCC) has its own performance indicators. In this study, the cracking index, riding index, rutting index, and PCI are used as performance indicators for AC and COM pavements, whereas the cracking index, riding index, faulting index, and PCI are used for PCC pavements. Highway agencies use these performance measures to select maintenance activities that extend the life of pavements and improve pavement conditions.

Prediction with LSTM
The LSTM, which is an RNN algorithm, was used to predict the future condition of individual pavement sections of the three different pavement types. The LSTM algorithm in this study was previously developed by the author in another study [34]. The database was divided into a training dataset and a validation dataset. The training dataset was used for the learning process and developing the model. The validation dataset was used to validate that the model works well. Since the AC pavement type had fewer records than the other two pavement types, 80% of the records were used for training the model and 20% for validating it. For the PCC and COM pavement types, 70% of the records were used for training and 30% for validating the model.
Model validation confirmed that the output of the statistical model was acceptable with respect to the collected data. To evaluate any machine learning model, it is necessary to test it on data that were not used in the training process. The train-test split approach was used for cross-validation (CV), a validation technique that checks the effectiveness of the machine learning model. After the model was trained on the training dataset, the validation dataset was used as a test sample to validate the model performance. The prediction for all three pavement types was conducted for 20 years with the developed model. For AC, PCC, and COM pavement types, 50, 80, and 80 sample sections were used for prediction purposes, respectively.
Five different scenarios were assumed from the minimum error rate to the maximum error rate to investigate the effect of increasing the error on the decision-making process: [−20, 20], [−25, 25] in scenarios 1 to 5, respectively. Figures 4-6 show the distribution of the performance indicators for each pavement type (PCC, AC, and COM, respectively) for the base and five different error scenarios. Figure 7 shows the resulting PCI distribution for the three pavement types.
As can be seen from Figure 7, the PCI distribution remains almost the same for different error rates in all three pavement types, since the errors are applied to the individual performance indicators and the PCI is calculated from these new values. Figures 8-10 show the PCI values for the base and five error scenarios for the three pavement types (PCC, AC, and COM).
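The error-injection step can be sketched as below. The text does not state how the error was sampled, so this sketch assumes a uniformly distributed relative error within a symmetric percentage range, applied to each predicted indicator and clipped to the 0-100 index scale. The function name and the sample values are hypothetical.

```python
import random

def perturb(values, max_error_pct, rng):
    """Apply a random relative error to each predicted index value.

    Assumption: the error is drawn uniformly from
    [-max_error_pct, +max_error_pct] percent; results are clipped to 0-100.
    """
    noisy = []
    for value in values:
        error = rng.uniform(-max_error_pct, max_error_pct) / 100.0
        noisy.append(min(100.0, max(0.0, value * (1.0 + error))))
    return noisy

rng = random.Random(42)               # fixed seed so the sketch is repeatable
predicted = [92.0, 78.5, 64.0, 40.0]  # hypothetical predicted index values
noisy = perturb(predicted, 20, rng)   # a +/-20% error range
```

Because the perturbation is applied to the individual indicators, the PCI would then be recomputed from the perturbed values, as described in the text.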

Decision Tree and Maintenance Assignment
Each state DOT has its own decision tree to assign treatment actions. If the condition of the pavement is acceptable, then no action is needed; otherwise, treatment is assigned based on the decision trees. For this study, this is achieved by adopting existing decision trees and matrices developed by the Iowa DOT. Since each pavement type has its own performance indicator, different decision trees are available based on pavement types. Table 3 shows the decision tree for AC and COM pavements and Table 4 shows the decision trees for PCC pavements.

[Table 4. Decision tree for PCC pavements. Columns: PCI, Cracking Index, Riding Index, Faulting Index, Treatment. Final row: otherwise, do nothing.]
It is worth mentioning that the PCI is not the only factor for assigning the treatment actions, as can be seen from the decision trees. Sections with similar PCI values can receive different treatment assignments when the other performance indicators (cracking, riding, rutting, and faulting indices) differ.
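The point that sections with the same PCI can receive different treatments can be illustrated with a threshold-based rule set. Every threshold and treatment name below is hypothetical and serves only to show the mechanism; this is not the Iowa DOT decision tree.

```python
def assign_treatment_pcc(pci, cracking, riding, faulting):
    """Hypothetical threshold rules for a PCC section (NOT the Iowa DOT tree)."""
    if pci < 40:
        return "reconstruction"
    if faulting < 50:
        return "diamond grinding"
    if cracking < 60:
        return "patching"
    if riding < 60:
        return "overlay"
    return "do nothing"

# Two hypothetical sections with the same PCI but different indicators.
section_a = assign_treatment_pcc(pci=70, cracking=55, riding=80, faulting=85)
section_b = assign_treatment_pcc(pci=70, cracking=80, riding=75, faulting=45)
```

Here `section_a` is driven by its low cracking index and `section_b` by its low faulting index, even though both have a PCI of 70.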

Cost Calculation
Based on the condition of the pavement section and the treatment assignment from the decision tree, the cost of maintenance can be calculated. Iowa DOT has its own unit cost for each treatment action. Table 5 shows the unit cost of each treatment action per lane-mile.
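The cost calculation reduces to multiplying a unit cost by the treated lane-miles. The unit costs and treatment names below are hypothetical placeholders, since the values of Table 5 are not reproduced here.

```python
# Hypothetical unit costs in dollars per lane-mile (NOT the Iowa DOT values).
UNIT_COST_PER_LANE_MILE = {
    "do nothing": 0.0,
    "patching": 50_000.0,
    "diamond grinding": 120_000.0,
}

def treatment_cost(treatment, length_miles, lanes):
    """Cost of a treatment = unit cost per lane-mile x section length x lanes."""
    return UNIT_COST_PER_LANE_MILE[treatment] * length_miles * lanes

# A hypothetical 2.5-mile, two-lane section assigned patching.
cost = treatment_cost("patching", length_miles=2.5, lanes=2)
```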

Optimization
The selection process in this study maximizes the total benefit obtained from the treatments applied to the sections under a limited budget. Several definitions of benefit can be found in the literature; one of the most widely used is the area between the deterioration curve without treatment and the expected deterioration curve after treatment, as depicted in Figures 11-13. Based on the decision trees and the performance indicators affected by the different error rates, the treatment activities were identified for each test section. The cost of treatment for each section was then calculated from the Iowa DOT unit costs per lane-mile. Each treatment activity can extend the life of a pavement by increasing its PCI. Tables 6 and 7 show the proposed reset values on performance curves for the different pavement types as recommended by the Iowa DOT. Reset values represent the increase in PCI attributed to each treatment.
After increasing the PCI for each section based on the reset values, the total benefit for each section was calculated. To determine the total benefit, the area under the deterioration curve without treatment is calculated first, as shown in Equation (6), where f(x) denotes the deterioration curve without treatment. Figure 11 illustrates the pavement deterioration curve without treatment.
$$\text{Area}_1 = \int f(x)\,dx \qquad (6)$$
Figure 11. Deterioration curve without treatment.
In the next step, the areas under both the with-treatment and the without-treatment curves are calculated, as shown in Equation (7). Figure 12 illustrates the area under both deterioration curves. The total benefit is the difference between the areas resulting from Equations (6) and (7), as given in Equation (8) and illustrated in Figure 13.
Optimization is done to maximize the total benefit when the budget is limited and less than the actual total cost. This analysis showed the effect of increasing the error on the benefit. The optimization was conducted in Microsoft Excel Solver in the following steps:
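The benefit definition used here, the area between the with-treatment and without-treatment deterioration curves, can be approximated numerically. This sketch assumes yearly PCI values and the trapezoidal rule in place of the analytical integral; the two curves are hypothetical.

```python
def area_under(years, pci_values):
    """Approximate the area under a PCI curve with the trapezoidal rule."""
    return sum(
        (years[i + 1] - years[i]) * (pci_values[i] + pci_values[i + 1]) / 2.0
        for i in range(len(years) - 1)
    )

def treatment_benefit(years, without_treatment, with_treatment):
    """Benefit = area under the treated curve minus area under the untreated curve."""
    return area_under(years, with_treatment) - area_under(years, without_treatment)

# Hypothetical curves: a treatment at year 1 raises the PCI by a reset value of 20.
years = [0, 1, 2, 3]
untreated = [80.0, 70.0, 60.0, 50.0]
treated = [80.0, 90.0, 80.0, 70.0]
benefit = treatment_benefit(years, untreated, treated)
```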


1. As mentioned earlier, five different scenarios, based on different amounts of error contributed to the prediction model, were investigated to see how an increase in the error rate can change the decision-making process.
2. The total benefit for each scenario was calculated for each pavement type and each test section, as described above.
3. The total cost of treatments for each scenario was calculated for each test section based on the decision trees and the unit costs.
4. A limited budget, 15% less than the total cost, is assumed as the available budget.
5. As the error contribution increased, the total cost (need) for maintenance actions increased while the available budget stayed constant, and Solver optimized the selection under these conditions to maximize the total benefit.
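The budget-constrained selection can be sketched as a small 0/1 selection problem. The study used Solver; the exhaustive search below is a substitute technique chosen for clarity and is practical only for a handful of sections. Section names, costs, and benefits are hypothetical.

```python
from itertools import combinations

def select_sections(sections, budget):
    """Pick the subset of sections with maximum total benefit within the budget.

    `sections` is a list of (name, cost, benefit) tuples. Exhaustive search
    over all subsets; a solver would be needed for realistic network sizes.
    """
    best_benefit, best_set = 0.0, ()
    for size in range(len(sections) + 1):
        for combo in combinations(sections, size):
            cost = sum(c for _, c, _ in combo)
            benefit = sum(b for _, _, b in combo)
            if cost <= budget and benefit > best_benefit:
                best_benefit, best_set = benefit, combo
    return [name for name, _, _ in best_set], best_benefit

# Hypothetical sections: (name, treatment cost, benefit).
sections = [("A", 40.0, 120.0), ("B", 30.0, 60.0), ("C", 20.0, 70.0), ("D", 10.0, 15.0)]
total_cost = sum(cost for _, cost, _ in sections)
budget = 0.85 * total_cost  # available budget is 15% less than the total need
chosen, total_benefit = select_sections(sections, budget)
```

With these values the full program costs 100 and the budget is 85, so at least one section must be dropped; the search keeps the combination that gives up the least benefit.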

Comparison
Comparison between different scenarios with different error rates will be discussed in the following section. The outcomes from the optimization part and the effect of increasing error contribution on the prediction model in terms of cost and benefit will also be discussed.

Results and Discussion
This section describes the results of simulating the contribution of error in the prediction model developed with LSTM. The overall outcomes for the different error rates are presented in Tables 8 and 9. Five different scenarios were analyzed to show the impact of an increase in error on the cost and benefit. All scenarios were compared with the base scenario, in which no error is applied to the prediction model. Based on the reported results, the base scenario has the minimum maintenance cost for all three pavement types. The results showed that the higher the error rate, the more money was needed to maintain the pavement network. This is because, when the prediction model cannot predict properly, some sections receive unnecessary maintenance, while maintenance of some sections in urgent need is delayed. As a result, the treatment action changes, and more expensive treatments are needed for these sections. Table 8, which shows the needed cost for the different scenarios, is based on the predicted values of 50 AC, 80 COM, and 80 PCC sections of different lengths over 20 years.
Results also showed that an increase in the error rate could reduce the benefit when agencies face a budget reduction or limitation. As a result of a higher benefit reduction rate, the overall pavement network condition could be worse. In all pavement types for the first scenario, where minimum error contribution was applied to the predicted value, a minimum rate of benefit reduction was observed. The more error added to the predicted values, the higher the percentage of benefit reduction.

Conclusions
The results of the pavement prediction model developed with LSTM were used in this study. To investigate the effect of increasing the error on the decision-making process, five different scenarios were assumed from the minimum error rate to the maximum error rate. The scenarios were investigated by adding different rates of error (10%, 30%, 50%, 70%, and 90%) to the predicted values of the performance indicators. The PCI was calculated based on the modified performance indicators with different error rates. The Iowa DOT decision trees were used to check the effect of the prediction model accuracy on the cost of treatments for different pavement types.
The results from the different scenarios were compared to check whether decreasing or increasing the accuracy of the prediction model affects the cost of maintenance. All five scenarios were compared, in terms of cost and benefit, with the original output of the prediction model as a base scenario. Based on the reported results, the rate of error has a significant correlation with the cost of maintenance activities, and agencies need to improve the prediction accuracy of their current models to prevent unnecessary spending. The more error was added to the prediction model, the higher the cost needed for maintaining the pavement network. The base scenario has the minimum cost compared to the other five scenarios. Increasing the rate of error in the prediction model can also increase the rate of benefit reduction and consequently worsen the pavement network condition.