Adaptive Network Based Fuzzy Inference System with Meta-Heuristic Optimizations for International Roughness Index Prediction

Abstract: The International Roughness Index (IRI) is one of the most important indexes for quantifying road surface roughness. In this paper, we propose a new hybrid approach that combines the adaptive network based fuzzy inference system (ANFIS) with several meta-heuristic optimizations, namely the genetic algorithm (GA), particle swarm optimization (PSO), and the firefly algorithm (FA), to develop three hybrid models for the prediction of the IRI: GA based ANFIS (GANFIS), PSO based ANFIS (PSOANFIS), and FA based ANFIS (FAANFIS). An artificial neural network (ANN) was used as a benchmark model for comparison with the hybrid models. A total of 2811 samples from a case study in the north of Vietnam (Northwest region, Northeast region, and the Red River Delta Area), within the scope of management of the DRM-I Department, were used to validate the models in terms of criteria such as the coefficient of determination (R) and the root mean square error (RMSE). Experimental results affirmed the potential and effectiveness of the proposed prediction models: PSOANFIS (RMSE = 0.145 and R = 0.888) performed better than GANFIS (RMSE = 0.155 and R = 0.872), FAANFIS (RMSE = 0.170 and R = 0.849), and ANN (RMSE = 0.186 and R = 0.804). The results of this study are helpful for accurate prediction of the IRI when evaluating the quality of road surfaces.


Introduction
The International Roughness Index (IRI) is a standard index for quantifying road surface roughness from measured longitudinal road profiles [1]. IRI is linearly proportional to roughness [2]; thus, as the IRI value increases, the roughness of the pavement increases, and an IRI value of zero means that the pavement is smooth [3]. Some researchers reported that the pavement fails, and an overlay is needed, when IRI values reach 3 m/km [3]. IRI is a standardized roughness measurement that represents the reaction of a single tire on a vehicle suspension to roughness in the pavement surface in a quarter-car simulation traveling at 80 km/h [4]. In the literature, IRI can be determined from profiles obtained with high-speed inertial profiling systems [5]. Nonetheless, one problem associated with IRI is that its determination in laboratories is time-consuming and complicated; thus, accurate prediction of the IRI is essential and useful.
Machine learning (ML) methods have been widely applied to the prediction of the IRI. Chen, et al. [6] proposed the grey forecast, genetic programming, and multiple regression for predicting the deterioration of IRI. Hossain, Gopisetti and Miah [3] used an artificial neural network (ANN) to predict the IRI for jointed plain concrete pavement sections. Lin, et al. [7] studied the relationships between IRI and pavement distress using a back-propagation neural network. Yousefzadeh, et al. [8] discussed road profile estimation using ANN. Mactutis, et al. [9] analyzed the relationship between the IRI and cracking and rutting over 317 observations. These studies suggest that better models and approaches should be developed in order to obtain higher prediction accuracy for the IRI.
The main aim of this study is to propose a new approach that combines the adaptive network based fuzzy inference system (ANFIS) with several meta-heuristic optimizations, namely the genetic algorithm (GA), particle swarm optimization (PSO), and the firefly algorithm (FA), to develop three hybrid models for the prediction of the IRI: GA based ANFIS (GANFIS), PSO based ANFIS (PSOANFIS), and FA based ANFIS (FAANFIS). An ANN was used as a benchmark model for comparison with the hybrid models. For this purpose, a total of 2811 samples from a case study in the north of Vietnam (Northwest region, Northeast region, Red River Delta Area), within the scope of management of the DRM-I Department, were used to validate the proposed and applied models in terms of criteria such as the coefficient of determination (R) and the root mean square error (RMSE).

Case Study and Database
In this study, the road surface status data were collected by the Vietnam Road Administration in coordination with the Japanese International Cooperation Agency (JICA) from April 2012 to March 2013 for each 100 m of lane over the entire national road network. The study area is located in the north of Vietnam (Northwest region, Northeast region, Red River Delta Area) (Figure 1). The survey was conducted for two types of pavement: cement and asphalt concrete pavements. For the asphalt concrete pavement, it was carried out on the expressway and national highway (including newly operated and operated highways) named National Highway N1 (Figure 1). The content of the survey consisted of the scale, survey period, diagram of the lane layout and survey direction, and survey parameters. The survey parameters comprised the type of survey vehicle, vehicle weight, tire air pressure, surface temperature, testing apparatus, and testing method. This project was particularly important because of the need to survey and collect road surface status data, especially for expressways, in a way that ensures comprehensiveness, accuracy, speed, efficiency, economy, and no obstruction of current traffic in Vietnam. Therefore, Hawkeye technology was a good method for road management, maintenance, and operation units to meet new requirements on surveying and managing data at all levels (https://arrbsystems.com/hawkeye-1000-series/). Since then, effective plans and solutions for road maintenance and operation have been put in place.
Data include three basic characteristics of the road surface, namely the International Roughness Index (IRI), the depth of the wheel track (rut), and the depth of road surface cracking. In addition, the GPS coordinates of points on the road and images of the road were recorded automatically and continuously at a recording step of 5 m. A total of 2811 samples were collected and tested to generate the dataset, which includes road length (m), analysis area (m²), summed cracks, maximum depth of ruts (mm), average depth of ruts (mm), and IRI (mm/m) (Table 1). In modeling, IRI was taken as the output variable (Y), while the other parameters (road length, analysis area, summed cracks, maximum depth of rut, and average depth of rut) were taken as the independent variables X1, X2, X3, X4, and X5, respectively. The original data were divided into two parts. The first part, the training dataset, comprised 70% of the data, and the second part, the validating dataset, comprised the remaining 30%. The training dataset was used to fit the models and the validating dataset was used to test them. In addition, principal component analysis (PCA) [10] was used to reduce the dimensionality of the data. Thus, only the features that carry important information were selected, and features irrelevant for prediction were eliminated. Savitzky-Golay filtering [11] was also applied in order to reduce extreme values in the distribution of the data. This noise reduction technique was necessary in order to track the nonlinear relationship between inputs and outputs without changing its tendency. In this paper, a third-degree polynomial and a window size of 15 were used when applying the Savitzky-Golay filter. Finally, in order to deal with measured values on different scales, all data were normalized into the [−1, 1] range.
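The scaling and holdout steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used in the study: the function names `normalise_columns` and `holdout_split`, the random shuffling, and the seed are hypothetical, and the PCA and Savitzky-Golay steps are not included.

```python
import random


def normalise_columns(rows, low=-1.0, high=1.0):
    """Min-max scale every column of `rows` into [low, high]."""
    columns = list(zip(*rows))
    mins = [min(col) for col in columns]
    maxs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            # A constant column carries no information; map it to 0.0 (assumption).
            low + (high - low) * (v - mn) / (mx - mn) if mx > mn else 0.0
            for v, mn, mx in zip(row, mins, maxs)
        ])
    return scaled


def holdout_split(rows, train_fraction=0.7, seed=0):
    """Shuffle the samples and split them into training and validating parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * len(rows)))
    train = [rows[i] for i in indices[:cut]]
    test = [rows[i] for i in indices[cut:]]
    return train, test
```

Whether the scaling bounds should be computed on the whole dataset or on the training part only is not stated in the text; the sketch uses whatever rows it is given.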

Adaptive Neural Fuzzy Inference System (ANFIS)
ANFIS is a powerful artificial intelligence prediction system that combines the ML technique of neural networks with a fuzzy logic system [12][13][14]. The ANFIS structure has five layers, as follows [15] (Figure 2):

Layer 1: This was the fuzzification layer. It consisted of the membership functions defined for the input variables. Its output was a degree of membership calculated from a Gaussian membership function, where $c_i$ and $\delta_i$ are the parameters of the membership function.
Appl. Sci. 2019, 9, 4715
Layer 2: This layer executed the fuzzy AND of the antecedent part of the fuzzy rules.
Layer 3: This was the normalization layer. It normalized the firing strengths of the rules obtained from the membership functions.
Layer 4: This was the defuzzification layer. It executed the consequent part of the fuzzy rules, where $p_i$, $q_i$, $r_i$ are linear parameters.
Layer 5: This was the output layer. It combined the outputs of the previous layer by summing them.
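The five layers can be illustrated with a short forward pass. This is a sketch of a generic first-order Sugeno ANFIS, assuming the common Gaussian form exp(−(x − c)² / (2δ²)) and a product for the fuzzy AND; the function names and the data layout (`premise`, `consequent`) are hypothetical and not taken from the study.

```python
import math


def gaussian_mf(x, c, delta):
    """Layer 1: Gaussian membership degree with centre c and width delta."""
    return math.exp(-((x - c) ** 2) / (2.0 * delta ** 2))


def anfis_forward(x, premise, consequent):
    """Forward pass of a first-order Sugeno ANFIS (illustrative).

    x          : list of input values, one per input variable
    premise    : premise[r][j] = (c, delta) of rule r for input j
    consequent : consequent[r] = [a_1, ..., a_m, bias], the linear parameters
                 of rule r (p_i, q_i, r_i in the two-input case)
    """
    # Layers 1 and 2: membership degrees combined with a product (fuzzy AND).
    firing = []
    for rule in premise:
        strength = 1.0
        for xj, (c, delta) in zip(x, rule):
            strength *= gaussian_mf(xj, c, delta)
        firing.append(strength)

    # Layer 3: normalise the firing strengths so that they sum to one.
    total = sum(firing)
    normalised = [w / total for w in firing]

    # Layer 4: each rule's linear consequent, weighted by its normalised strength.
    weighted = []
    for w_bar, coeffs in zip(normalised, consequent):
        linear = sum(a * xj for a, xj in zip(coeffs[:-1], x)) + coeffs[-1]
        weighted.append(w_bar * linear)

    # Layer 5: the overall output is the sum of the weighted rule outputs.
    return sum(weighted), normalised
```

In a hybrid model such as PSOANFIS, the optimizer would search over the values stored in `premise` and `consequent`; the sketch only evaluates them.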

Particle Swarm Optimization (PSO)
PSO is inspired by the behavior of biological communities, such as a swarm of insects, searching for the best solution in a given space [16]. The original PSO algorithm was proposed by Kennedy and Eberhart [17]. It works with a population of candidate solutions, called particles. PSO starts from a random cluster of individuals and then searches for the optimal solution by updating the generations [18]. In each generation, two values, Pbest and Gbest, are used to update each individual [19]. The individuals are updated through a velocity update followed by a position update, in which $x_i^k$ is the particle position, $v_i^k$ is the particle velocity, $p_i^k$ is the best location found by individual i, $p_g^k$ is the best location found by all individuals in the swarm, $c_1$ and $c_2$ are the cognitive and social parameters, and $r_1$ and $r_2$ are random numbers in the range from 0 to 1.
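The update of velocity and position can be sketched in code. The version below is a generic PSO with an inertia weight `w`, which is an assumption of this sketch (it is a common variant of the original update); the test function `sphere`, the search bounds, and the seed are hypothetical, and the default values of `w`, `c1`, and `c2` are only illustrative.

```python
import random


def sphere(position):
    """Toy cost function with its minimum (zero) at the origin."""
    return sum(v * v for v in position)


def pso(cost, dim, n_particles=30, n_iter=200, w=0.4, c1=1.0, c2=2.0,
        lower=-5.0, upper=5.0, seed=0):
    """Minimise `cost` with a basic particle swarm; returns (Gbest, cost)."""
    rng = random.Random(seed)
    x = [[rng.uniform(lower, upper) for _ in range(dim)]
         for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in x]                 # Pbest: best position per particle
    pbest_cost = [cost(p) for p in x]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]   # Gbest: best of the swarm

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + cognitive pull + social pull.
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                # Position update.
                x[i][d] += v[i][d]
            c = cost(x[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = x[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = x[i][:], c
    return gbest, gbest_cost
```

For a hybrid model, `cost` would be the RMSE of an ANFIS whose parameters are encoded in the particle position.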

Genetic Algorithm (GA)
GA is based on Darwin's theory of evolution [20]. GA uses the principles of genetics, mutation, natural selection, and crossover. It involves a number of genetic terms such as chromosomes, populations, and genes. Chromosomes are made up of genes. Each gene carries several characteristics and is located in the chromosome. Each chromosome represents a solution to the problem [21].
The algorithm was carried out in the following steps [22]:
Step 1: Population initiation: A population of n individuals was generated randomly (each individual is a candidate solution to the problem).
Step 2: Evaluation: The adaptation (fitness) value of each individual was estimated.
Step 3: Stop condition: The condition to terminate the algorithm was checked.
Step 4: Selection: Two parents were selected from the old population according to their adaptation.
Step 5: Crossover: With a selected probability, a new individual was created by crossing the two parents.
Step 6: Mutation: With a selected mutation probability, the new individual was mutated.
Step 7: Select the result: If the stop condition was satisfied, the algorithm terminated and the best solution in the current population was chosen.
GA had two basic stop conditions:
Condition 1. Based on the chromosome structure, the number of converged genes was monitored. If the number of converged genes reached or exceeded a given threshold, the algorithm terminated.
Condition 2. Based on the meaning of the chromosome, the change in the solution after each generation was measured. If this change was less than a fixed constant, the algorithm terminated.
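The steps above can be sketched as a small real-coded GA. This is an illustrative variant under stated assumptions, not the implementation used in the study: binary tournament selection, arithmetic crossover, Gaussian mutation, and a fixed number of generations as the stop condition are all choices made for this sketch, and the parameter values and names are hypothetical.

```python
import random


def sphere(individual):
    """Toy cost function with its minimum (zero) at the origin."""
    return sum(g * g for g in individual)


def genetic_algorithm(cost, dim, pop_size=40, n_gen=100, p_cross=0.7,
                      p_mut=0.15, lower=-5.0, upper=5.0, seed=0):
    """Minimise `cost`; returns (best, best_cost, initial_best_cost)."""
    rng = random.Random(seed)
    # Step 1: random initial population of candidate solutions.
    pop = [[rng.uniform(lower, upper) for _ in range(dim)]
           for _ in range(pop_size)]
    best = min(pop, key=cost)[:]
    best_cost = cost(best)
    initial_best_cost = best_cost

    # Step 3: the stop condition here is simply a fixed number of generations.
    for _ in range(n_gen):
        # Step 2: evaluate every individual (lower cost = better adapted).
        costs = [cost(ind) for ind in pop]

        def select():
            # Step 4: binary tournament between two random individuals.
            a, b = rng.randrange(pop_size), rng.randrange(pop_size)
            return pop[a] if costs[a] <= costs[b] else pop[b]

        children = []
        while len(children) < pop_size:
            p1, p2 = select(), select()
            # Step 5: arithmetic crossover with probability p_cross.
            if rng.random() < p_cross:
                a = rng.random()
                child = [a * g1 + (1.0 - a) * g2 for g1, g2 in zip(p1, p2)]
            else:
                child = p1[:]
            # Step 6: Gaussian mutation of each gene with probability p_mut.
            for d in range(dim):
                if rng.random() < p_mut:
                    step = rng.gauss(0.0, 0.1 * (upper - lower))
                    child[d] = min(upper, max(lower, child[d] + step))
            children.append(child)
        pop = children

        # Step 7: remember the best solution seen so far.
        gen_best = min(pop, key=cost)
        if cost(gen_best) < best_cost:
            best, best_cost = gen_best[:], cost(gen_best)
    return best, best_cost, initial_best_cost
```

Because the best solution seen so far is tracked separately, the returned cost can never be worse than that of the initial population.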

Firefly Algorithm (FA)
FA addresses hard optimization problems based on the flashing patterns and behavior of fireflies. The original FA algorithm was developed by Yang [23]. In general, FA used the following three rules:
(1) The fireflies were unisex, so that a firefly was attracted to all other fireflies regardless of their sex.
(2) The attractiveness of a firefly was proportional to its brightness. A firefly moved randomly if there was no brighter firefly nearby.
(3) The brightness of a firefly was determined by the value of its objective function.
The movement of a firefly was governed by its attractiveness and a position update, where β is the attractiveness of a firefly at a distance r, β0 is the attractiveness of a firefly at a distance r = 0, γ is the light absorption coefficient, and Rand() is a random number drawn from the uniform distribution on (0, 1).
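In the standard formulation of the firefly algorithm, the attractiveness and the movement of a firefly i towards a brighter firefly j are commonly written as below. This is a sketch of the usual form, not necessarily the exact notation used in this study; the randomization step size α and the iteration index t are symbols assumed here.

```latex
\beta = \beta_0 \, e^{-\gamma r^{2}}

x_i^{t+1} = x_i^{t} + \beta \left( x_j^{t} - x_i^{t} \right)
          + \alpha \left( \mathrm{Rand}() - \tfrac{1}{2} \right)
```

The first term of the update pulls firefly i towards the brighter firefly j with a strength that decays with distance, and the second term adds a random perturbation centred on zero.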

Artificial Neural Network (ANN)
ANN is inspired by the way biological nervous systems, such as the human brain, process information [24]. A basic artificial neuron applies three simple operations: multiplication, summation, and activation. At the entry, each input is multiplied by its corresponding weight [24]. Neural networks use the weights, which express the strength of the interconnections, to address the problem. Then, all weighted inputs are summed inside the computing unit [24]. The last part of the artificial neuron is the transfer function, in which the sum is passed through an activation function such as the sigmoid, tanh, or rectified linear unit (ReLU) [24,25].
Combining two or more basic artificial neurons (nodes) gives an artificial neural network. A typical ANN has different layers: an input layer, hidden layers, and an output layer [26] (Figure 3). Nodes at each layer have different functions. The values at the input nodes are simply weighted and summed. Hidden nodes have no direct connection to the incoming data or to the ultimate output. They receive input from the previous hidden nodes or from the initial input and then produce output for the next hidden nodes or for the ultimate output. Output nodes are responsible for the computations and for transferring information to the outside.
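The three operations of a neuron and the layered structure can be sketched as follows. This is a minimal illustration with one hidden layer, a sigmoid activation in the hidden nodes, and a linear output; the function names and the way weights are passed are assumptions of the sketch.

```python
import math


def sigmoid(z):
    """Sigmoid activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation=sigmoid):
    """Multiply each input by its weight, sum, then apply the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


def mlp_forward(x, hidden_layer, output_layer):
    """Forward pass through one hidden layer and a linear output layer.

    hidden_layer and output_layer are lists of (weights, bias) tuples,
    one tuple per node.
    """
    hidden = [neuron(x, w, b) for w, b in hidden_layer]
    return [neuron(hidden, w, b, activation=lambda z: z)
            for w, b in output_layer]
```

Training, for example with the scaled conjugate gradient algorithm, would adjust the weights and biases; the sketch only evaluates a network with given values.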

Accuracy Validation
Accuracy of the proposed and applied models was validated by the coefficient of determination (R) and the root mean square error (RMSE) [27][28][29][30]. RMSE is the square root of the average squared difference between outputs and targets [31]. It is always non-negative, and its units match the units of the response. Lower RMSE values are better [32,33]. The R values measure the correlation between outputs and targets [34][35][36][37]. While R can be more easily interpreted, RMSE indicates how much the predictions deviate, on average, from the actual values in the dataset [31]. The formulas were as follows [38][39][40][41]:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(pr_i - y_i\right)^{2}}$$

$$R = \frac{\sum_{i=1}^{n}\left(pr_i - \overline{pr}\right)\left(y_i - \overline{y}\right)}{\sqrt{\sum_{i=1}^{n}\left(pr_i - \overline{pr}\right)^{2}\,\sum_{i=1}^{n}\left(y_i - \overline{y}\right)^{2}}}$$

where $pr_i$ and $\overline{pr}$ are the predicted values from the models and their mean, $y_i$ and $\overline{y}$ are the actual and mean values of IRI, respectively, and n is the number of samples.
In addition to the above validation criteria, a non-parametric test, the Wilcoxon signed-rank test, was used in this study to evaluate whether the differences in accuracy among the models are statistically significant [42]. The test was based on the null hypothesis that there was no difference between the models at the significance level of α = 0.05, and statistical values such as the z-value and p-value were calculated [42]. If the z-value was outside the range from −1.96 to +1.96 and the p-value was smaller than 0.05, the null hypothesis was rejected, which means that the difference between the models was statistically significant, and vice versa [43].
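The two criteria can be computed with a few lines of code. The sketch below assumes that R is the Pearson correlation between predicted and actual values, as suggested by the symbols defined above; the function names are hypothetical.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between predictions and actual values."""
    n = len(actual)
    return math.sqrt(sum((p - y) ** 2 for p, y in zip(predicted, actual)) / n)


def pearson_r(predicted, actual):
    """Pearson correlation coefficient between predictions and actual values."""
    n = len(actual)
    mean_p = sum(predicted) / n
    mean_y = sum(actual) / n
    cov = sum((p - mean_p) * (y - mean_y) for p, y in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_y = sum((y - mean_y) ** 2 for y in actual)
    return cov / math.sqrt(var_p * var_y)
```

Note that `pearson_r` is undefined when either series is constant, because the denominator is then zero.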

Methodology Framework
The modeling procedure for the prediction of IRI was carried out in five main steps: (i) data collection, (ii) data pre-processing, (iii) data preparation, (iv) training the models, and (v) validation of the models. Each step is briefly described below (Figure 4):
(i) Data collection: in this step, the data generated from laboratory experiments were collected and summarized into two groups. The first group contains all input parameters (road length, analysis area, summed cracks, maximum depth of rut, and average depth of rut) and the second contains the output parameter (IRI).
(ii) Data pre-processing: in this step, PCA and the Savitzky-Golay filter were used to reduce the dimensionality of the data and to reduce extreme values in the distribution of the data.
(iii) Data preparation: in this study, the holdout validation method was used for training and validating the models, as it is a popular and effective method for generating the datasets for training and testing the models [24,44–61]. The collected data were therefore divided into two parts. The first part, 70% of the data, was used to train the models, whereas the second part, the remaining 30%, was used to validate the models; the 70/30 ratio for dividing the training and testing datasets is commonly used when applying ML models [29,62–71].
(iv) Training the models: the models were created using the 70% training dataset. PSOANFIS was created by combining PSO and ANFIS, GANFIS by combining GA and ANFIS, and FAANFIS by combining FA and ANFIS. PSO, GA, and FA were used to optimize the antecedent and consequent parameters so as to obtain the best ANFIS. An artificial neural network (ANN) was created using a sigmoid activation function.
(v) Validation of the models: after the optimization of the antecedent and consequent parameters and the training of the models, the models were validated on the testing dataset using two criteria, namely RMSE and R.

Results
Four ML models, namely PSOANFIS, GANFIS, FAANFIS, and ANN, were constructed and trained for the prediction of IRI. For PSOANFIS, GANFIS, and FAANFIS, an initial ANFIS structure was generated based on a number of membership functions. In training PSOANFIS, the inertia weight, population size, personal learning coefficient, and global learning coefficient were initially set to 0.4, 100, 1, and 2, respectively, which achieved the best RMSE. In training GANFIS, the crossover percentage, population size, mutation percentage, gamma, and mutation rate were initially set to 0.4, 100, 0.7, 0.7, and 0.15, respectively. In training FAANFIS, the initial parameters, namely the light absorption coefficient, the attraction coefficient base value, and the mutation coefficient, were set to 1, 2, and 0.2, respectively. These hybrid models used the RMSE criterion to validate the models over 200 iterations. In training the ANN, we constructed a network with 10 hidden neurons, and the scaled conjugate gradient algorithm was used to find the bias and weight values of the network. Figure 5 shows the cost functions in terms of RMSE and R when using the PSO, GA, and FA techniques to optimize the ANFIS parameters. It can be seen that 200 iterations were sufficient for the cost functions to converge.

RMSE and errors were used to validate and compare the models, and the results are illustrated in Figure 6. Summary information on the RMSE and on the mean and standard deviation of the errors is given in Table 3. The RMSE values of the models varied from 0.122 to 0.200 for the training dataset. In particular, the RMSE of PSOANFIS was the lowest at 0.122, compared with 0.146, 0.124, and 0.200 for the FAANFIS, GANFIS, and ANN models, respectively. Thus, the PSOANFIS model had the best goodness of fit with the training dataset in comparison with the other models. Similarly, the RMSE values of GANFIS, PSOANFIS, FAANFIS, and ANN for the testing dataset were 0.155, 0.145, 0.170, and 0.186, respectively, which also indicated that PSOANFIS had good performance on both the training dataset and the testing dataset compared with the other models.
Validation of the models using the R criterion is shown in Figures 7 and 8 as regression graphs between actual and predicted IRI. It can be observed from Figures 7 and 8 that the R values of the four models varied from 0.806 to 0.933 for the training dataset. With R = 0.933, PSOANFIS has the highest goodness of fit, followed by GANFIS, FAANFIS, and ANN with 0.930, 0.901, and 0.806, respectively. For the testing dataset, the R values are 0.888 (PSOANFIS), 0.872 (GANFIS), 0.849 (FAANFIS), and 0.804 (ANN).
Thus, in this study, PSOANFIS has a good prediction capability on both the training dataset and the testing dataset compared with the other models.
The statistical difference between the models was validated using the Wilcoxon signed-rank test, as shown in Tables 4 and 5. For the training dataset, the p-value of every pair of models is smaller than 0.05, and the z-value of every pair is outside the range from −1.96 to +1.96, except for the pair GANFIS vs. FAANFIS (Table 4). For the testing dataset, the p-values of all pairs of models are less than 0.05 and the z-values of all pairs are outside the range from −1.96 to +1.96 (Table 5). Therefore, it can be stated that, in the testing phase, the differences between all models are statistically significant, whereas for the training dataset the differences between the models are statistically significant except for GANFIS vs. FAANFIS.

Discussion
The use of IRI has received much attention as a crucial tool for pavement performance prediction and life cycle cost analysis, because IRI is considered a quality assurance parameter in the construction procedure [72] and reflects the ride quality and comfort level of passengers. In addition, roughness-based indices have been used as a practical and reliable method for predicting pavement performance and analyzing maintenance and rehabilitation [73]. The health status of the pavement can be assessed by observing present distresses, checking the material properties, and estimating the quality of the construction [74]. As assessing the performance of the road pavement structure based on IRI prediction is a very important and necessary task for efficient management of the road infrastructure, there are several studies on the relationship between IRI and pavement distress [75][76][77]. Determination of IRI is an important task for quantifying road surface roughness. However, the traditional methods for measuring the IRI are time-consuming and costly. Thus, prediction of the IRI using advanced ML techniques is an effective solution that not only provides a quick determination but also significantly reduces the cost of measurement. In the present study, we recommended potential ML methods, namely PSOANFIS, GANFIS, FAANFIS, and ANN, which have a good prediction capability for the IRI.
It can be observed from the validation results that the PSOANFIS, GANFIS, and FAANFIS models have acceptable performance, whereas ANN has a slightly poorer capability for the prediction of the IRI. PSOANFIS has stable performance and the best prediction capability, followed by GANFIS, FAANFIS, and ANN (Figures 5-8 and Tables 3-5). This is because GANFIS, PSOANFIS, and FAANFIS used GA, PSO, and FA to optimize the ANFIS parameters, thereby reducing the prediction error. These methods are simple to compute, have strong optimization ability, and are easy to implement. Although ML methods such as PSOANFIS, GANFIS, FAANFIS, and ANN have great potential for prediction problems, their predictive capability depends substantially on the quality of the dataset. In prediction problems, the way the variables are determined, from various experiments on various samples, can affect the predictive capability of the models. One of many ways to improve the capability of the ML models is to provide more data. In this study, the four ML techniques have low average error rates, so they might also be used to predict other properties of the road surface.
When applying ML models, it is essential to carry out a sensitivity analysis to evaluate the importance of the input parameters used for training and validating the models. In the present study, a sensitivity analysis was therefore carried out to estimate the degree of importance of each input variable for the prediction of IRI. The analysis used the PSOANFIS model, as it showed the best prediction capability. For this purpose, quantile values at 21 points (from 0 to 1 with a resolution of 0.05) were extracted for each input and served as a new dataset for calculating IRI with the PSOANFIS model. To explore the sensitivity, each input was varied from its smallest (quantile 0) to its largest (quantile 1) value while all other variables remained at their medians (quantile 0.5). The normalized change in the IRI output could then be used to quantify the degree of importance of the input variables. The normalized change φ in the IRI output was computed with the following equation:

$$\varphi_i^j = \frac{\mathrm{IRI}_i^j - \mathrm{IRI}_{\text{all-inputs quantile-0.5}}}{\mathrm{IRI}_{\text{all-inputs quantile-0.5}}} \quad (12)$$

where $\mathrm{IRI}_{\text{all-inputs quantile-0.5}}$ is the configuration "zero", in which all inputs remain at their medians, and $\mathrm{IRI}_i^j$ is the IRI output obtained with the jth input at its ith quantile level. In this work, the quantile level varies from 0 to 1 with a resolution of 0.05 (i = 1, . . ., 21). Finally, the degree of importance of each input was calculated from these normalized changes. Varying the inputs across their statistical range in this way allows their influence on the prediction of IRI to be computed efficiently. The results of the sensitivity analysis were scaled to the range [0, 100%] and are plotted in Figure 9 in descending order. All inputs had an impact on the prediction of IRI through the PSOANFIS model. However, the most influential variables were maximum depth of rut, summed cracks, and average depth of rut, which accounted for 40%, 34%, and 17% of the degree of importance, respectively. The other two inputs, analysis area and road length, exhibited less than 9% of the degree of importance. This information is relevant in practice and could help engineers save time and cost. The results of this sensitivity analysis might also be helpful in selecting suitable input parameters for better performance of ML models for predicting IRI and other properties of the road surface.
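The quantile-based procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: `predict` is a hypothetical callable standing in for the trained PSOANFIS model, the toy linear model and synthetic inputs are invented for the demonstration, and the final aggregation step (summing the absolute normalized changes of each input and scaling the totals to 100%) is an assumed choice, since the exact formula for the degree of importance is not reproduced in the text.

```python
import numpy as np

def quantile_sensitivity(predict, X, n_levels=21):
    """One-at-a-time sensitivity of `predict` to each column of X.

    predict : callable mapping an (n, d) array to n predictions
              (hypothetical stand-in for the trained model).
    X       : (n_samples, d) array of input variables.
    Returns (phi, importance): phi has shape (d, n_levels) and holds the
    normalized change for input j at quantile level i; importance has
    shape (d,) and sums to 100.
    """
    X = np.asarray(X, dtype=float)
    levels = np.linspace(0.0, 1.0, n_levels)       # 0, 0.05, ..., 1 for 21 levels
    q = np.quantile(X, levels, axis=0)             # shape (n_levels, d)
    median = np.quantile(X, 0.5, axis=0)           # shape (d,)

    # Configuration "zero": every input at its median.
    base = float(np.asarray(predict(median[None, :])).ravel()[0])

    d = X.shape[1]
    phi = np.empty((d, n_levels))
    for j in range(d):
        probe = np.tile(median, (n_levels, 1))     # all inputs held at median
        probe[:, j] = q[:, j]                      # sweep only input j
        y = np.asarray(predict(probe)).ravel()
        phi[j] = (y - base) / base                 # normalized change

    # ASSUMPTION: importance = summed absolute change, scaled to 100 %.
    total = np.abs(phi).sum(axis=1)
    importance = 100.0 * total / total.sum()
    return phi, importance

# Toy demonstration with an invented linear model and synthetic inputs.
rng = np.random.default_rng(0)
X_demo = rng.uniform(1.0, 2.0, size=(500, 3))
weights = np.array([3.0, 1.0, 0.0])                # third input has no effect

def toy_model(A):
    return 1.0 + A @ weights

phi, importance = quantile_sensitivity(toy_model, X_demo)
print(np.round(importance, 1))
```

The sketch requires a nonzero baseline prediction, which holds for IRI values of real pavements; other aggregation rules (for example, the range of the normalized change) would fit the same loop.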

Conclusions
Prediction of IRI is an important task in quantifying road surface roughness. In the present study, we proposed an efficient approach using the hybrid ML models PSOANFIS, GANFIS, and FAANFIS, which were compared with a benchmark ML model, ANN. The data came from a survey area in the north of Vietnam. PSOANFIS, GANFIS, and FAANFIS are hybrid models in which ANFIS is optimized by one of three meta-heuristic optimization algorithms: PSO, GA, and FA, respectively. The experimental results showed that PSOANFIS had the best prediction capability among the applied models. The sensitivity analysis showed that the most important parameters, namely maximum depth of rut, summed cracks, and average depth of rut, should be taken into account when predicting IRI with ML models. The results of this study can help engineers and researchers predict the IRI quickly and accurately for evaluating and managing road systems, and can reduce cost by reducing the need for laboratory experiments. However, this study used holdout validation; cross-validation is also an effective way to validate the performance of ML models and can be investigated in further work. In addition, other less biased ML techniques such as random forest should be compared with the models proposed and applied in this study in future work.
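As a pointer for the cross-validation mentioned as future work, the sketch below shows one common form, k-fold cross-validation scored by RMSE. It is an illustration only: the helper names (`k_fold_rmse`, `fit_ols`, `predict_ols`), the choice of k = 5, the ordinary least squares stand-in model, and the synthetic data are all assumptions and do not come from this study.

```python
import numpy as np

def k_fold_rmse(fit, predict, X, y, k=5, seed=0):
    """Return the test RMSE of each of k folds.

    fit(X_train, y_train) -> model; predict(model, X_test) -> predictions.
    Every sample is used for testing exactly once, unlike a single holdout split.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)   # shuffled index folds
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[m] for m in range(k) if m != i])
        model = fit(X[train], y[train])
        err = y[test] - predict(model, X[test])
        scores.append(float(np.sqrt(np.mean(err ** 2))))
    return scores

# Stand-in model (assumed): ordinary least squares with an intercept.
def fit_ols(X, y):
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_ols(coef, X):
    return np.column_stack([X, np.ones(len(X))]) @ coef

# Synthetic data (assumed) with small Gaussian noise.
rng = np.random.default_rng(0)
X_cv = rng.normal(size=(100, 3))
y_cv = X_cv @ np.array([1.0, -2.0, 0.5]) + 0.3 + rng.normal(scale=0.1, size=100)

scores = k_fold_rmse(fit_ols, predict_ols, X_cv, y_cv, k=5)
print("per-fold RMSE:", np.round(scores, 3), "mean:", round(float(np.mean(scores)), 3))
```

Reporting the mean and spread of the per-fold scores gives a less split-dependent estimate of performance than a single holdout evaluation.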


Figure 1. Location and size of the Management Department I.

Figure 2. Basic structure of the adaptive network based fuzzy inference system (ANFIS).

Figure 4. Methodological flow chart of this study.

Figure 6. Error analysis of the models using the training dataset: (a) probability distribution and (b) cumulative distribution; using the testing dataset: (c) probability distribution and (d) cumulative distribution.


Table 2 describes the statistical measures of dataset features.


Table 2. Statistical Measures of Dataset Features.

Table 3. Summary of prediction capability of four artificial intelligence (AI) models.


Table 4. Non-parametric test using training dataset.

Table 5. Non-parametric test using testing dataset.
