GEP Tree-Based Prediction Model for Interfacial Bond Strength of Externally Bonded FRP Laminates on Grooves with Concrete Prism

Reinforced concrete structures are subjected to frequent maintenance and repairs due to steel reinforcement corrosion. Fiber-reinforced polymer (FRP) laminates are widely used for retrofitting beams, columns, joints, and slabs. This study investigated the non-linear capability of artificial intelligence (AI)-based gene expression programming (GEP) modelling to develop a mathematical relationship for estimating the interfacial bond strength (IBS) of FRP laminates on a concrete prism with grooves. The model was based on five input parameters, namely axial stiffness (Eftf), width of FRP plate (bf), concrete compressive strength (fc′), width of groove (bg), and depth of the groove (hg), and IBS was considered the target variable. Ten trials were conducted based on varying genetic parameters, namely the number of chromosomes, head size, and number of genes. The performance of the models was evaluated using the correlation coefficient (R), mean absolute error (MAE), and root mean square error (RMSE). The genetic variation revealed that optimum performance was obtained for 30 chromosomes, a head size of 11, and 4 genes. The values of R, MAE, and RMSE were observed as 0.967, 0.782 kN, and 1.049 kN for training and 0.961, 1.027 kN, and 1.354 kN for validation, respectively. The developed model reflected close agreement between experimental and predicted results. This implies that the developed mathematical equation was reliable in estimating IBS based on the available properties of FRPs. The sensitivity and parametric analysis showed that the axial stiffness and width of FRP are the most influential parameters contributing to IBS.


Introduction
Reinforced concrete (RC) structures are subjected to frequent maintenance and repair due to the corrosion of conventional steel reinforcement [1]. Therefore, strengthening existing structures is considered an emerging construction activity to cope with the strength requirements and upgraded code designs [2]. Fiber-reinforced polymer (FRP) laminates are widely used for retrofitting and enhancing the existing structural capacity of beams [3–5], columns [6,7], and beam-column joints [8–10] owing to their superior performance [11].
Hoang [40] investigated the non-linear capabilities of the least squares support vector machine to predict the punching shear capacity of FRP-reinforced concrete beams, achieving a coefficient of determination (R²) of 0.99. Hoang [41] used an artificial neural network (ANN) to predict the punching shear capacity of steel-fiber-reinforced concrete slabs. Abuodeh et al. [42] investigated the behavior of RC beams in terms of shear capacity using a neural interpretation diagram (NID) and a recursive feature elimination (RFE) algorithm.
In summary, AI models based on available experimental results are needed to predict the IBS of FRP plates on a concrete prism in order to enhance the cost-effectiveness of engineering projects. Su et al. [43] employed three AI models, namely multilinear regression, support vector machine, and ANN, to predict the IBS of FRP laminates bonded to the concrete prism. An accuracy of R² equal to 0.81 and 0.91 was observed for the training and validation data, respectively. The present authors consider that the accuracy of these models can be further improved. In addition, gene expression programming (GEP) is a robust technique used to establish the relationship between input and output attributes in the form of a simple mathematical equation [44]. It is noteworthy that the developed ANN models take the form of a black box: they provide no mathematical equation showing how the attributes are related to each other. Moreover, parametric analysis is essential for investigating the effect of the input attributes on IBS, since this may help decide which type of strengthening technique is more effective and economical. Therefore, the GEP model was employed to establish the mathematical relationship between the attributes and the IBS of FRP laminates bonded to the concrete prism [45,46]. This research explored the capability of the GEP model in estimating the IBS of FRP laminates externally bonded to the concrete prism on grooves using 133 experimental single-lap shear test (SST) results (anchorage made on one end of the FRP to the concrete prism, as shown in Figure 1b). Tested samples with FRP plates parallel to the groove direction were used in the analysis. A parametric analysis is also presented to show the contribution of the input variables to IBS.

Methodology
This section discusses the detailed methodology adopted for the estimation of the IBS of FRP laminates externally bonded to the concrete prism through grooves. An experimental database is explained, followed by an overview of the GEP model and its modeling procedure. The evaluation criteria for the developed models are also presented herein.

Figure 2 illustrates the distribution of the input and target variables used in the study. The input variables were the elastic modulus of the FRP times the thickness of the fiber (Eftf, GPa-mm), which is also termed axial stiffness, the width of the FRP plate (bf, mm), the concrete compressive strength (fc′, MPa), the width of the groove (bg, mm), and the depth of the groove (hg, mm), whereas the ultimate capacity (P, kN) was considered the target variable. The database comprised 133 experimental results of single-lap shear tests (SSTs) taken from a previous study [47], as reported by Su et al. [43]. The data were evenly distributed between the extremes. Eftf ranged from 12.90 to 78.90 GPa-mm with a skewness of 0.58.
The experiments underlying the database were conducted by Moghaddas et al. [47]. Four different widths of FRP sheets (bf, as shown in Figure 1b), equal to 30, 40, 50, and 60 mm, were used in the investigation. Four groove sizes (5 × 5, 5 × 10, 10 × 10, and 10 × 15 mm) were considered. Three different mix designs with concrete strengths of 25, 35, and 45 MPa were used to vary the strength of the concrete. The SSTs were conducted on FRP plates bonded on one side to a concrete prism (150 × 150 × 350 mm), as shown in Figure 1b. It is worth mentioning that the FRP surface roughness may affect IBS; however, this study was based on experimental tests conducted on FRP sheets made of the Sika wrap-200C, Sika wrap-300C, and Sika wrap-430G types of carbon and glass fibers bonded with epoxy Sikadur 330 adhesive material, as reported by Moghaddas et al. [47].
Other statistics of the employed database are listed in Table 1.

GEP Modelling
The GEP model, based on Darwinian principles, was inspired by the recombination of genetic material in living organisms. GEP is an AI-based evolutionary algorithm whose candidate solutions are expressed as trees called expression trees (ETs). The shape and size of these ETs are adjusted as the GEP model learns. The GEP modelling was carried out using GeneXproTools Version 5. Initially, the 133 data points were fed into the modelling environment, and the variables were assigned as inputs and target. The data were partitioned into 70% training and 30% validation data using random partitioning. Subsequently, the setting parameters were varied in order to yield a high-performance model. RMSE was selected as the fitness function, and the number of chromosomes, head size, and number of genes were varied. In addition, genetic parameters such as the probabilities of mutation, RIS transposition, IS transposition, and the recombination operators were set according to the previous literature [38]. The functions within the ETs were assigned as +, −, /, sqrt, and x², whereas the linking function between ETs was assigned as addition (+). The model was executed and allowed to train until the best fitness was achieved. The term "best fitness" here means that training continued until no further improvement in the correlation and error indices was observed. At the same time, the performance on the validation data was monitored in order to avoid overfitting. Training was stopped once the best performance was achieved, and the mathematical equation was then generated. The schematics of the GEP modelling are shown in Figure 3.
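The random 70/30 partitioning described above can be illustrated with a minimal Python sketch. This is not the GeneXproTools procedure itself: the function name `split_train_validation`, the fixed seed, and the rounding of the training count are assumptions made only for illustration, and the study does not report the exact number of points in each subset.

```python
import random


def split_train_validation(n_points, train_fraction=0.7, seed=0):
    """Randomly partition the indices 0..n_points-1 into two disjoint sets."""
    indices = list(range(n_points))
    rng = random.Random(seed)  # fixed seed is an assumption, used for repeatability
    rng.shuffle(indices)
    n_train = round(train_fraction * n_points)  # rounding rule is an assumption
    return sorted(indices[:n_train]), sorted(indices[n_train:])


# 133 is the database size stated in the text.
train_idx, val_idx = split_train_validation(133)
print(len(train_idx), len(val_idx))
```

Keeping the validation indices separate from the training indices is what allows the validation error to be monitored during training as a check against overfitting.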

Evaluation Criteria
The performance of the developed GEP models was evaluated using three statistical indices, namely the coefficient of correlation (R), root mean square error (RMSE), and mean absolute error (MAE), which are commonly used for the evaluation of AI models in the previous literature [37,48–51].
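The three indices named above can be computed as in the following Python sketch, which assumes their standard textbook definitions (Pearson correlation for R, and the usual forms of MAE and RMSE). The function names and the four sample values are hypothetical and are not taken from the database.

```python
import math


def correlation_coefficient(experimental, predicted):
    """Pearson correlation coefficient R between two equal-length sequences."""
    n = len(experimental)
    mean_e = sum(experimental) / n
    mean_p = sum(predicted) / n
    cov = sum((e - mean_e) * (p - mean_p) for e, p in zip(experimental, predicted))
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_e * var_p)


def mean_absolute_error(experimental, predicted):
    """Average absolute difference, in the units of the target (kN here)."""
    return sum(abs(e - p) for e, p in zip(experimental, predicted)) / len(experimental)


def root_mean_square_error(experimental, predicted):
    """Square root of the mean squared difference, in the units of the target."""
    squared = sum((e - p) ** 2 for e, p in zip(experimental, predicted))
    return math.sqrt(squared / len(experimental))


# Hypothetical values in kN, for illustration only.
experimental = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 11.0, 15.0, 15.0]
r_value = correlation_coefficient(experimental, predicted)
mae_value = mean_absolute_error(experimental, predicted)
rmse_value = root_mean_square_error(experimental, predicted)
print(r_value, mae_value, rmse_value)
```

RMSE penalises large individual errors more heavily than MAE, so RMSE is never smaller than MAE for the same data.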
Polymers 2022, 14, x FOR PEER REVIEW 6 of 21

Results and Discussion
This section describes the results achieved from this study. The effect of changing the genetic variables on the performance of the developed models is explained in detail, followed by the performance of the developed models. Finally, the parametric and sensitivity analysis (SA) is discussed to assess the relative impact of the contributing variables on interfacial bond strength.
The results of the trials are summarised in Table 2. The values of MAE and RMSE increased when the number of chromosomes was changed from 30 to 50. Further increases in chromosomes from 50 to 100 and 200 did not improve the performance of the models. Among the four numbers of chromosomes considered, the best overall performance was attained at 30. Therefore, in the subsequent trials, the number of chromosomes was retained at 30, and the head size was changed from 8 to 9, 10, 11, and 12. The performance of the models increased when the head size was changed from 8 to 9 and then decreased at 10, whereas the most optimized results were obtained at a head size of 11. In this way, two parameters, i.e., the number of chromosomes and the head size, were fixed at 30 and 11, respectively, for the next trials, which were executed with a variable number of genes. It has been observed that an increase in the number of genes increases the complexity of the output equation and affects the performance of the model; however, increasing the number of genes beyond five complicates the output equation to a greater extent [38]. Figure 4 shows that four genes yielded the optimized performance of the models. In this study, 30 chromosomes, a head size of 11, and 4 genes yielded the best performance. Previous work has shown that the optimized performance is achieved at different setting parameters for different types of problems [52]. Since the setting parameters in GEP modelling generally depend on trial and error, they must be determined through a rigorous exercise of varying the genetic parameters.
Figure 5 shows that the best trial observed was trial number 9, for which the overall R and MAE were recorded as 0.964 and 0.9045 kN, respectively.
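The one-parameter-at-a-time tuning described above can be sketched in Python as follows. This is only an illustration of the search pattern: `toy_error` is a stand-in error function invented so that the example runs, not a result from the study, and the candidate gene counts and the starting values for head size and genes are assumptions, since the text lists only the chromosome and head-size values.

```python
def one_at_a_time_search(score, grid, start):
    """Tune parameters sequentially: vary one, keep its best value, move on."""
    best = dict(start)
    for name, candidates in grid.items():
        trial_scores = {}
        for value in candidates:
            trial = dict(best)
            trial[name] = value
            trial_scores[value] = score(trial)
        # Lower error is better, so keep the value with the smallest score.
        best[name] = min(trial_scores, key=trial_scores.get)
    return best


def toy_error(params):
    """Stand-in error function (hypothetical, NOT results from the study)."""
    return (abs(params["chromosomes"] - 30) / 30
            + abs(params["head_size"] - 11)
            + abs(params["genes"] - 4))


grid = {
    "chromosomes": [30, 50, 100, 200],  # values named in the text
    "head_size": [8, 9, 10, 11, 12],    # values named in the text
    "genes": [3, 4, 5],                 # assumed candidates, not stated in the text
}
start = {"chromosomes": 30, "head_size": 8, "genes": 3}  # assumed starting point
best = one_at_a_time_search(toy_error, grid, start)
print(best)
```

A sequential search of this kind needs far fewer runs than a full grid, but it can miss interactions between parameters, which is one reason the optimum settings differ from problem to problem.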

Performance of the Developed Models
The performance of the models is presented in the form of a statistical evaluation of the training and validation data, followed by the slope of the regression line plotted between experimental and predicted observations. In addition, the predicted/experimental ratio is presented to assess the performance of the models. Table 2 summarises the statistical evaluation of all the trials in terms of R, R², MAE, and RMSE. The minimum value of R for the training data was 0.948, and the minimum value of R for the validation data was 0.860, recorded for trial 6.

Statistical Evaluation
The maximum MAE was 1.004 kN for the training data (trial 6) and 1.257 kN for the validation data (trial 10). The minimum MAE was 0.782 and 1.027 kN for the training and validation data of trial 9, respectively, and the values of R for this best trial were 0.967 and 0.961, respectively. This made the average MAE equal to 6.48% and 8.52% for the training and validation data, respectively. The values of RMSE were 1.049 and 1.354 kN for the training and validation data, respectively. The statistical evaluation of all the trials showed a close agreement between experimental and predicted results; however, trial 9 performed best. The model from trial 9 can therefore be used more reliably for future prediction of IBS.

Comparison of Regression Slopes
The slope of the regression line obtained by plotting experimental results on the X-axis and predicted results on the Y-axis was investigated in this section as a measure of the performance of the developed models (Figure 6). A similar type of analysis for evaluating AI models has been previously practised by numerous researchers [36–38]. While exploring the non-linear capabilities of ANN for the compressive strength of polyethylene-terephthalate-incorporated cementitious grouts, Khan et al. [36] found this slope equal to 1.01 and 0.90 for the training and testing data, respectively. In previous studies, a slope greater than 0.80 was taken to indicate agreement between the experimental and predicted results [39,51]. From Figure 6, it can be observed that the slopes for the training and validation data for trial 9 were 0.99 and 0.96, respectively. The regression slopes were greater than 0.8; therefore, the models reflected a good correlation between experimental and predicted results. Error analysis showed that the trend lines of the residuals for the training and validation data passed almost through zero. In addition, most of the residuals (experimental minus predicted) lie between −1 and 1 kN.
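The slope check and the residuals discussed above can be reproduced with a short Python sketch. It assumes an ordinary least-squares line with an intercept, which the text does not state explicitly, and the five data points are hypothetical values used only to make the example runnable.

```python
def regression_line(experimental, predicted):
    """Least-squares slope and intercept of predicted (Y) against experimental (X)."""
    n = len(experimental)
    mean_x = sum(experimental) / n
    mean_y = sum(predicted) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(experimental, predicted))
    sxx = sum((x - mean_x) ** 2 for x in experimental)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(experimental, predicted):
    """Residuals defined as experimental minus predicted, as in the text."""
    return [x - y for x, y in zip(experimental, predicted)]


# Hypothetical values in kN, for illustration only.
experimental = [8.0, 10.0, 12.0, 14.0, 16.0]
predicted = [8.5, 9.5, 12.5, 13.5, 16.5]
slope, intercept = regression_line(experimental, predicted)
errors = residuals(experimental, predicted)
slope_is_acceptable = slope > 0.8  # threshold quoted from the previous studies
print(slope, intercept, errors)
```

A slope close to 1 together with residuals scattered around zero indicates that the model neither systematically over-predicts nor under-predicts across the range of the data.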

Predicted to Experimental Ratio
As discussed in Sections 3.1 and 3.2.1, the model obtained in trial 9 was the most accurate among the trials investigated herein. Therefore, the results of trial 9 are plotted in the form of predicted/experimental ratios to show the accuracy in more detail. Feng et al. [53] found this ratio to lie within ±20% while studying the XGBoost model for predicting the shear strength of squat reinforced concrete walls. When supplemented with other statistical evaluations, their model was judged to be more accurate than other empirical models. In our study, Figure 7 and Table 3 show that almost 90% of the data points lie between 0.9 and 1.1, which means that the prediction errors obtained in trial 9 are mostly within ±10%, thus reflecting the strong robustness of the developed model. This evaluation further supports the use of the model for predicting the IBS of FRP laminates bonded on grooves to a concrete prism.
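The share of points falling inside a given predicted/experimental band can be computed as in the sketch below. The function name `fraction_within_band` and the four sample values are hypothetical; only the ±10% band comes from the text.

```python
def fraction_within_band(experimental, predicted, tolerance=0.10):
    """Fraction of points whose predicted/experimental ratio lies in [1 - tol, 1 + tol]."""
    ratios = [p / e for e, p in zip(experimental, predicted)]
    inside = sum(1 for r in ratios if 1.0 - tolerance <= r <= 1.0 + tolerance)
    return inside / len(ratios)


# Hypothetical values in kN, for illustration only.
experimental = [10.0, 10.0, 10.0, 10.0]
predicted = [9.5, 10.5, 12.0, 10.0]
share_10 = fraction_within_band(experimental, predicted, tolerance=0.10)
share_20 = fraction_within_band(experimental, predicted, tolerance=0.20)
print(share_10, share_20)
```

Reporting the share for more than one band, for example ±10% and ±20%, makes the comparison with other studies straightforward.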

GEP Formulations
The MATLAB model obtained from the GEP analysis was employed to extract a simple mathematical equation for the prediction of the IBS of FRP laminates bonded on grooves to the concrete prism. Figure 8 shows the ETs extracted from the GEP model, which were used to furnish the prediction equation expressed as Equation (1). Trial 9 was executed with 4 genes; therefore, the ET contains 4 sub-ETs. The symbols denoted as c1, c2, c3, etc., in each sub-ET are constants whose values are given in Figure 8.

$$z = 6.15 + \left(-0.56 + \frac{E_f t_f}{h_g + \left((7.29 + b_g) + 6.45\right) + \left(-0.159\, f_c'\right)}\right)$$

Sensitivity and Parametric Analysis
It is important to evaluate the developed models with several assessments on unseen data to ensure that the prediction model is robust and can forecast new data in line with the physical phenomenon involved in the process. Sensitivity and parametric tests demonstrate this robustness [54,55]. The SA, performed on a simulated dataset based on the descriptive statistics of the entire database, determines how susceptible a constructed model is to changes in the variables under consideration [56,57]. The relative contributions of the input factors (Eftf, bf, fc′, bg, and hg) to the forecast IBS of FRP laminates bonded to concrete with the help of grooves were considered here. The simulated dataset was created such that the first variable was varied between its extremes while the other variables were maintained at their average values. Subsequently, the second variable was varied, and so on. The predictions were made using the trained model. For the parametric analysis, the change in the value of IBS was plotted against the changing variable. For SA, Equations (2) and (3) were used, in which fmax(Si) and fmin(Si) denote, respectively, the maximum and minimum of the forecast IBS over the ith input domain, whereas the rest of the input variables remain constant at their means.
From Figure 9, it was observed that axial stiffness, termed Eftf herein, had a considerable influence on IBS. It accounted for more than 50% of the contribution to IBS, followed by the width of the FRP laminates (bf), which contributed 37.18%. The other factors, i.e., concrete compressive strength and the width and depth of the groove, contributed 8.72% altogether. This reflects that the adhesion of the FRP laminates with the groove did not considerably increase the IBS. The parametric analysis in Figure 10 shows that the ultimate bond capacity increased linearly with a rise in the axial stiffness and width of the FRP laminate. The other parameters showed no considerable change. It is important to mention that a continuous increase in these parameters would not raise the ultimate capacity indefinitely; however, the parametric analysis (Figure 10) was conducted to verify that the model had been trained reliably. To find the optimum magnitude of these parameters, a detailed study based on a wide range of experiments is needed.
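The one-at-a-time sweep and the resulting percentage contributions can be sketched in Python as follows. Because Equations (2) and (3) are not reproduced in the text, the sketch assumes the range-based form commonly used with GEP models: the span for input i is fmax(Si) − fmin(Si), and its contribution is that span divided by the sum of all spans, times 100. The model `toy_model`, its coefficients, and the input names x1 to x3 are hypothetical stand-ins, not the trained GEP equation.

```python
def sensitivity_analysis(model, ranges, means, steps=20):
    """Percentage contribution of each input from a one-at-a-time sweep.

    Each input is varied between its extremes while the others stay at their
    means; the span of the model output is then normalised over all inputs.
    """
    spans = {}
    for name, (low, high) in ranges.items():
        outputs = []
        for k in range(steps + 1):
            point = dict(means)
            point[name] = low + (high - low) * k / steps
            outputs.append(model(**point))
        spans[name] = max(outputs) - min(outputs)  # assumed form of f_max - f_min
    total = sum(spans.values())
    return {name: 100.0 * span / total for name, span in spans.items()}


def toy_model(x1, x2, x3):
    """Hypothetical stand-in for a trained model (NOT the GEP equation)."""
    return 3.0 * x1 + 1.0 * x2 + 0.0 * x3


ranges = {"x1": (0.0, 1.0), "x2": (0.0, 1.0), "x3": (0.0, 1.0)}
means = {"x1": 0.5, "x2": 0.5, "x3": 0.5}
contributions = sensitivity_analysis(toy_model, ranges, means)
print(contributions)
```

The list of outputs collected for each input inside the sweep is also what a parametric plot would show, so a single sweep can serve both the parametric analysis and the SA.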

Conclusions
Due to the wide application of FRP laminates for the retrofitting of RC elements, especially beams, columns, joints, and slabs, it is important to evaluate their bond strength with a concrete prism. For this purpose, an interfacial shear strength test was conducted on FRP laminates on a concrete prism with or without grooves to determine the interfacial bond strength (IBS). This study evaluated in detail the important factors influencing bond strength and developed a prediction model using the non-linear capabilities of the GEP model. The following conclusions were drawn from this study:

1. To obtain a more robust model, ten different trials were conducted on the basis of changes in the number of chromosomes, head size, and number of genes. Increasing the number of chromosomes from 30 to 200 slightly reduced the performance, whereas a head size of 11 and 4 genes yielded the most accurate model (trial 9). This exercise suggests that GEP modelling requires a detailed trial-and-error method in order to find the optimum genetic parameters.

2. The models were evaluated using statistical indices such as R, RMSE, and MAE for both the training and validation data. The values of R, MAE, and RMSE equalled 0.967, 0.782 kN, and 1.049 kN for training and 0.961, 1.027 kN, and 1.354 kN for validation, respectively. The slope of the regression line was obtained as 0.97 and 0.96 for the training and validation data, respectively. This reflects a strong agreement between the experimental and predicted values. A mathematical equation based on this model has been developed to predict the interfacial bond strength of FRP laminates.

3. The sensitivity and parametric analysis showed that the axial stiffness and width of FRP were the most critical parameters contributing to IBS. Other parameters, such as the concrete compressive strength and the width and depth of the groove, had no considerable influence on IBS.

Informed Consent Statement: Not applicable.

Data Availability Statement:
The data used for the development of models have been reported in the paper.