Investigating the Bond Strength of FRP Laminates with Concrete Using LIGHT GBM and SHAPASH Analysis

The corrosion of steel reinforcement necessitates regular maintenance and repair of a variety of reinforced concrete structures. Retrofitting of beams, joints, columns, and slabs frequently involves the use of fiber-reinforced polymer (FRP) laminates. To develop simple prediction models for calculating the interfacial bond strength (IBS) of FRP laminates on a concrete prism containing grooves, this research evaluated the nonlinear capabilities of three machine learning (ML) ensemble methods: random forest (RF) regression, extreme gradient boosting (XGBoost), and the Light Gradient Boosting Machine (LIGHT GBM). The IBS was the target variable, while the model comprised five input parameters: elastic modulus × thickness of FRP (EfTf), width of the FRP plate (bf), concrete compressive strength (fc′), width of the groove (bg), and depth of the groove (hg). The optimal parameters for each ensemble model were selected by trial and error. The models were trained on 70% of the dataset, while the remaining 30% was used for validation. The evaluation was conducted on the basis of reliable accuracy indices. The minimum coefficient of determination (R2 = 0.82) was observed for the testing data of the RF regression model, whereas the highest (R2 = 0.942) was obtained with LIGHT GBM for the training data. Overall, the three models showed robust performance in terms of correlation and error evaluation; however, the accuracy ranked as follows: LIGHT GBM > XGBoost > RF regression. Owing to its superior performance, LIGHT GBM may be considered a reliable ML technique for computing the bond strength between FRP laminates and concrete prisms.
The performance of the models was further assessed by comparing the slopes of the regression lines between the observed and predicted values, along with error analysis (i.e., mean absolute error (MAE) and root-mean-square error (RMSE)), the predicted-to-experimental ratio, and Taylor diagrams. Moreover, the SHAPASH analysis revealed that the elastic modulus × thickness of FRP and the width of the FRP plate are the factors most responsible for the IBS of FRP.


Introduction
Repairing and reinforcing structures has traditionally been a dynamic and complex aspect of building work. The use of fiber-reinforced polymer (FRP) bars, sheets, and strips to enhance RC or even steel structural components is one of the prevalent approaches for such repairs, owing to the corrosion of traditional steel reinforcement. In a variety of engineering challenges, machine learning (ML) or artificial intelligence (AI) is commonly utilized to discover the best solution to regression and classification problems. These ML models are not only trained with a large series of experimental findings but are also verified using unseen data. Furthermore, they have seen a wide range of applications in composite constructions, particularly over the past couple of years. This study considers three AI methods, i.e., the Light Gradient Boosting Machine (LIGHT GBM), extreme gradient boosting (XGBoost), and random forest (RF) regression. Liang et al. [28] predicted the creep performance of concrete by utilizing the LIGHT GBM, XGBoost, and RF algorithms. Note that a data-driven model performs prediction on the basis of specific input data to (a) develop an understanding of model decisions, (b) determine complex hidden nonlinear relationships, and (c) evaluate the implications of a model's analysis and evaluation. Mangalathu et al. [29] used a wide database to analyze the feature importance for the failure mode of RC structural elements, i.e., columns and shear walls. An RF model formulated for the training set achieved an accuracy of 84% and 86% on the unseen data of the two types of RC elements, respectively. Milad et al. [30] collected a dataset comprising 729 experimental values to predict the FRP strain, with governing input factors of material geometry, strength characteristics, strain characteristics, FRP characteristics, and confinement characteristics. They deployed the XGBoost, RF, and multivariate adaptive regression splines (MARS) algorithms and found that the latter model exhibited the highest prediction accuracy. Xu et al. [31] concluded that the XGBoost model yielded the best performance and outperformed the empirical models, as well as the RF, decision tree (DT), and artificial neural network (ANN) algorithms. Kim et al. [32] presented four ensemble ML approaches (i.e., CatBoost, histogram gradient boosting, XGBoost, and RF) for the estimation of FRP-concrete interfacial bond strength (IBS) using an extensive dataset of 855 single-lap shear tests (SSTs); they found that the CatBoost algorithm outperformed all of the other ensemble techniques (R2 = 0.96, with lower error metrics as well). Su et al. [24] used multilinear regression (MLR), support-vector machine (SVM), and ANN models to estimate the IBS of FRP laminates to a concrete prism, achieving an R2 of 0.85 for the overall dataset.
In yet another study, on the estimation of the seismic performance of RC walls, Zhang et al. [33] revealed that the XGBoost and gradient boosting (GB) algorithms were efficacious, achieving an accuracy of almost 97%, whereas the GB and RF regression methods performed best in forecasting the lateral strength and ultimate drift ratio of the RC walls. Liu [34] utilized the XGBoost, RF, and support-vector regression (SVR) algorithms for the strength prediction of high-performance concrete. With the help of data preprocessing and parameter optimization, these three techniques yielded good predictions (R2 > 0.9 in all cases) and a good model-fitting effect, with XGBoost possessing the highest prediction accuracy. While predicting the creep performance of concrete, Liang et al. [28] modelled the creep data in the Northwestern University (NU) database using the LIGHT GBM, XGBoost, and RF techniques; SHapley Additive exPlanations (SHAP) were then computed to interpret the predicted values on the basis of cooperative game theory [35,36]. The LIGHT GBM approach was found to attain higher accuracy with a substantially shorter calculation time. Moreover, this game-theory-based framework (i.e., SHAP) has been efficacious in explaining various supervised learning models [37].
To summarize, to reduce the cost of civil engineering projects, AI models based on known experimental findings are required to estimate the IBS of FRP plates on a concrete prism. Due to the highly nonlinear correlations between bond strength and a multitude of contributing parameters, typical prediction models for the FRP-concrete coupling need further investigation [38]. The authors of the present study are of the viewpoint that the previously formulated models can be further improved in terms of accuracy. In addition, parametric analysis is needed to investigate the impact of the input variables on the IBS, because it ultimately determines which type of strengthening method is most efficacious and economical. Therefore, the present study investigated the ability of the LIGHT GBM, XGBoost, and RF regression models to predict the interfacial bond strength (IBS) of FRP laminates externally bonded to the grooves of a concrete prism by utilizing 136 experimental SST results (anchorage made on one end of the FRP to the concrete prism, as shown in Figure 1b). Tested samples with FRP plates parallel to the groove direction were used in the analysis.

Overview of LIGHT GBM
LIGHT GBM is an open-source gradient boosting machine learning framework from Microsoft that employs decision trees as its base learners [39]. LIGHT GBM reportedly surpasses existing gradient boosting techniques, including the gradient boosting decision tree (GBDT) and extreme gradient boosting (XGBoost), in learning and training speed as well as prediction accuracy, because it employs two novel techniques: exclusive feature bundling (EFB), which is designed to manage many data features while avoiding overfitting issues; and gradient-based one-side sampling (GOSS), which is used for managing huge datasets.
Consider an input dataset with n instances S = {(x1, y1), (x2, y2), ..., (xn, yn)}, where {x1, x2, ..., xn} are the independent variables and {y1, y2, ..., yn} are the dependent variables. Here, the dependent variable is the ultimate capacity (P), and the independent variables are the concrete compressive strength (fc′), width of the groove (bg), depth of the groove (hg), width of the FRP plate (bf), and elastic modulus × thickness of FRP (EfTf). The GBDT estimate f(x) is the summation of the outputs of a set of decision tree models h_t(x):

f(x) = \sum_{t=1}^{T} h_t(x)    (1)

where T represents the number of trees. Fitting a GBDT model amounts to finding the approximation function \hat{f} that minimizes the expected loss L(y, f(x)), as shown in Equation (2):

\hat{f} = \arg\min_f \, E_{y,S}[L(y, f(x))]    (2)

In addition to adopting GOSS for sampling, LIGHT GBM leverages EFB to accelerate the training procedure without compromising precision. Many applications include attributes that are mutually exclusive, such as high-dimensional and sparse inputs; EFB aggregates such attributes into a single attribute bundle, so that an attribute-scanning method can compile the statistics of the bundles rather than of the individual attributes. In brief, LIGHT GBM is an ML method that employs GOSS for internal node splits based on variance gain and EFB to reduce the dimensionality of the input attributes. As a decision-tree-based approach, LIGHT GBM has the significant benefit of being insensitive to multicollinearity [39]; therefore, the incorporation of correlated predictors or independent attributes, which is highly prevalent in concrete data, is not problematic in the LIGHT GBM model.
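As a sketch of this additive scheme (not the paper's implementation), the stagewise least-squares boosting loop below builds f(x) as a sum of shallow regression trees on synthetic data; all names, values, and settings are illustrative assumptions.

```python
# Minimal stagewise GBDT sketch: f(x) = sum of T shallow trees, each fitted
# to the current residuals (the negative gradient of the squared loss).
# Data and settings are synthetic/illustrative, not the paper's.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 5))        # stand-ins for EfTf, bf, fc', bg, hg
y = 3 * X[:, 0] + 2 * X[:, 1] + rng.normal(0, 0.05, 200)  # synthetic capacity

T, lr = 50, 0.1
pred = np.full_like(y, y.mean())            # constant initial model
trees = []
for _ in range(T):
    residual = y - pred                     # negative gradient of 0.5*(y - f)^2
    h = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, residual)
    trees.append(h)
    pred += lr * h.predict(X)               # f <- f + lr * h_t

mse = float(np.mean((y - pred) ** 2))
print(round(mse, 4))                        # training MSE shrinks as trees accumulate
```

A new point would be scored as `y.mean() + lr * sum(h.predict(x) for h in trees)`, mirroring the summation over h_t(x) above.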

Model Development
The optimization of hyperparameters is a crucial step in training machine learning (ML) techniques, as it can improve the generalization and prediction robustness of ML models, prevent overfitting and underfitting, and minimize model complexity. Accordingly, grid search was used to optimize the hyperparameters of LIGHT GBM, XGBoost, and random forest (RF) in order to achieve improved performance, efficiency, and precision. This technique evaluates every combination of the specified hyperparameters over their associated value ranges and then selects the optimal hyperparameter values. Moreover, a portion of the data samples was kept entirely masked from the models and used only as the "testing set" to validate the ML models and detect underfitting and overfitting. The optimized hyperparameters indicated improved results for predicting the ultimate capacity of FRP laminates bonded to concrete. The hyperparameters for RF regression, LIGHT GBM, and XGBoost are listed in Tables 1-3, respectively.

Experimental Database
The descriptive statistics of the inputs and the target variable employed in the investigation are depicted in Table 4. The ultimate capacity (P, kN) of FRP laminates bonded to a concrete prism was treated as the target variable, and the input variables included the FRP's elastic modulus × fiber thickness (EfTf, GPa-mm), the width of the FRP laminate (bf, mm), the concrete compressive strength (fc′, MPa), the width of the groove (bg, mm), and the depth of the groove (hg, mm). The database contained 136 single-lap shear test (SST) specimens obtained from a previous work [40], as already reported in [24]. Between the two extremes, the data were evenly dispersed; EfTf had a skewness of 0.58 and a range of 12.90 to 78.90. The database used in this study was originally created by Moghaddas et al. [40]. The original experiments on FRP laminates bonded to concrete prisms used four different widths of FRP plate, i.e., 30, 40, 50, and 60 mm (bf, as shown in Figure 1b). The effects of four different groove sizes (i.e., 5 × 5, 5 × 10, 10 × 10, and 10 × 15 mm2) were also investigated. To capture variations in concrete strength, three distinct mix designs with concrete strengths of 25, 35, or 45 MPa were used in accordance with ACI 211.1-91. The gradation and quality of the coarse and fine aggregates complied with ASTM C33/C33M. As shown in Figure 1b, the SSTs were performed on FRP plates coupled to a concrete prism (150 × 150 × 350 mm) on one side. It is important to note that this study was based on experimental tests on FRP sheets made of the SikaWrap-200C, SikaWrap-300C, and SikaWrap-430G types of carbon and glass fibers, bonded with Sikadur 330 epoxy adhesive, as reported by Moghaddas et al. [40].
Single-lap shear tests were performed with a specially designed machine; a hydraulic jack applied a uniform tensile force under controlled displacement at a rate of 2 mm/min. Moreover, the tensile force exerted on the sample was accurately measured using an S-type load cell.
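As a sketch, the layout of such a database and its Table 4-style descriptive statistics could be reproduced as follows; the values are synthetic placeholders drawn from the ranges reported above, not the actual 136-specimen data.

```python
# Illustrative reconstruction of a Table 4-style summary: synthetic placeholder
# values drawn from the reported ranges, not the actual 136 specimens.
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
db = pd.DataFrame({
    "EfTf": rng.uniform(12.90, 78.90, 136),      # GPa-mm, reported range
    "bf":   rng.choice([30, 40, 50, 60], 136),   # mm, the four plate widths
    "fc":   rng.choice([25, 35, 45], 136),       # MPa, the three mix designs
    "bg":   rng.choice([5, 10], 136),            # mm, groove width
    "hg":   rng.choice([5, 10, 15], 136),        # mm, groove depth
})
stats = db.agg(["min", "max", "mean", "std", "skew"]).round(2)
print(stats)
```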

Performance of the Developed Models
Correlations between the variables in the datasets used in this study were examined using Pearson's linear correlation. Figure 2 shows the correlation between the input and output variables. The results showed little or no linear correlation between the input and target variables, revealing the existence of a nonlinear relationship between these variables. In addition, relationships existed between hg and bg, P and EfTf, and P and bf. Only bg elicited a slight correlation with fc′. Using the training and validation data, we determined how well the model performed by plotting the slope of the regression line between the experimental and predicted observations. Furthermore, the predicted/experimental ratio proved to be useful for assessing the models' performance.
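The point that a near-zero Pearson coefficient does not preclude a strong nonlinear dependence can be illustrated numerically; the data below are synthetic, with a deliberately quadratic response.

```python
# Numerical illustration (synthetic data): a near-zero Pearson coefficient does
# not rule out a strong nonlinear dependence, as with the quadratic y_nonlin.
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(-1, 1, 500)
y_linear = 2 * x + rng.normal(0, 0.1, 500)       # strong linear relation
y_nonlin = x ** 2 + rng.normal(0, 0.1, 500)      # strong relation, ~zero correlation

r_lin = float(np.corrcoef(x, y_linear)[0, 1])
r_non = float(np.corrcoef(x, y_nonlin)[0, 1])
print(round(r_lin, 3), round(r_non, 3))          # r_lin near 1, r_non near 0
```

This is exactly why weak entries in Figure 2 still leave room for the nonlinear ensemble models to perform well.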

Statistical Analysis
To predict the interfacial bond strength of FRP laminates with a concrete prism and its grooves, we evaluated the robustness and effectiveness of the RF, XGBoost, and LIGHT GBM models, along with a relative analysis. For a model to be considered robust and strongly correlated, the regression line of the data points should have a slope higher than 0.8 [44], minimal error indices (MAE, NSE, RSE, RMSE, and RRMSE) [45], an R2 greater than 0.8 [46], and a performance index close to zero [47]. Table 5 summarizes the statistical evaluation results of the three adopted ML techniques using the following performance and error metrics: R2, RMSE, MAE, RAE, NSE, and ρ. In terms of R2, the minimal values for the training and testing data (0.899 and 0.820, respectively) were recorded with XGBoost and RF regression, respectively, while LIGHT GBM recorded the maximal values for both the training and testing datasets. For the other metrics measuring errors in the predicted values, the minimal values were shared between the RF regression and LIGHT GBM models. The RF regression recorded minimal errors in terms of RMSE, RRMSE, and NSE, while MAE and RSE were minimal with LIGHT GBM, for both the training and testing datasets. The performance index (ρ) revealed that LIGHT GBM and RF regression had the best performance in the training and testing phases, respectively. The results of the statistical analysis revealed a close agreement between the experimental and predicted values amongst the three models. However, LIGHT GBM performed the best overall, with the highest recorded R2 and error parameters very close to zero, making it a reliable predictor of the interfacial bond strength between FRP laminates and concrete, in close agreement with previous findings [44]. Figure 3 reveals the cross-plots between the predicted results of the three proposed models (RF, XGBoost, and LIGHT GBM) and the experimental observations.
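For reference, the accuracy indices of Table 5 can be computed as follows; the observed and predicted arrays are illustrative placeholders, not the study's data.

```python
# Minimal implementations of the Table 5-style accuracy indices; the observed
# and predicted arrays are illustrative placeholders, not the study's data.
import numpy as np

obs = np.array([10.0, 14.0, 18.0, 22.0, 26.0])    # experimental IBS (kN)
pred = np.array([10.5, 13.2, 18.4, 21.0, 27.1])   # model predictions (kN)

ss_res = np.sum((obs - pred) ** 2)
ss_tot = np.sum((obs - obs.mean()) ** 2)

r2 = float(np.corrcoef(obs, pred)[0, 1] ** 2)     # coefficient of determination
rmse = float(np.sqrt(np.mean((obs - pred) ** 2))) # root-mean-square error
mae = float(np.mean(np.abs(obs - pred)))          # mean absolute error
nse = float(1 - ss_res / ss_tot)                  # Nash-Sutcliffe efficiency

print(round(r2, 3), round(rmse, 3), round(mae, 3), round(nse, 3))
```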
By closely observing the slope of the regression line for the training dataset, the intensity of correlation between the results of the proposed models and the experimental data increased from XGBoost (0.8988) to RF regression (0.9007), and then to LIGHT GBM, which was closest to the slope of the ideal 1:1 regression line. For the validation dataset, the trend was similar to that observed for the training dataset, except that the RF regression and XGBoost models correlated similarly, with LIGHT GBM giving the best fit (regression-line slope of 0.865, compared to 0.82 for RF and 0.8247 for XGBoost).

(ii) Error analysis
We also analyzed the errors of the model predictions, plotting the results for the training and testing datasets in Figure 4a,c,e. These plots show the residual errors between the predicted and experimentally observed values, in addition to the range of these errors. The accompanying frequency histograms (Figure 4b,d,f) show the value counts and the bin widths (i.e., ranges of errors) for each of the proposed models. The histograms reveal that the model with most of its dataset within the smallest-error bins is LIGHT GBM (Figure 4f), compared with RF regression (Figure 4b) and XGBoost (Figure 4d). For all three models, most of the errors between the predicted and observed values are concentrated near zero; however, the errors closest to zero were recorded with LIGHT GBM.
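The residual-histogram reading of Figure 4 can be sketched with numpy alone; the residuals below are synthetic, not the paper's actual prediction errors.

```python
# Residual-histogram sketch in the spirit of Figure 4, using numpy only;
# the residuals are synthetic, not the paper's actual prediction errors.
import numpy as np

rng = np.random.default_rng(4)
residuals = rng.normal(0, 1.5, 136)            # predicted minus observed (kN)

counts, edges = np.histogram(residuals, bins=8)
for c, lo, hi in zip(counts, edges[:-1], edges[1:]):
    print(f"[{lo:6.2f}, {hi:6.2f}) : {'#' * int(c)}")

near_zero = float(np.mean(np.abs(residuals) < 1.5))
print(round(near_zero, 2))                     # fraction of errors within +/-1.5 kN
```

A well-behaved model concentrates its counts in the bins nearest zero, which is what distinguishes LIGHT GBM in Figure 4f.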

(iii) Predicted-to-experimental ratio analysis
As part of the statistical evaluation of the models' performance, the ratio of the predicted to the experimental values was also used to highlight model accuracy in more detail. Some researchers [1] predicted the shear strength of squat reinforced concrete walls within ±20% of the predicted/experimental ratio using the XGBoost model; in conjunction with other statistical evaluations, this model yielded a higher accuracy than other empirical models. In this study, Figure 5 and Table 6 show the percentage errors in the predictions of each of the proposed models (RF, XGBoost, and LIGHT GBM). As for the RF model, Table 6
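The ±20% predicted-to-experimental band check described above can be sketched as follows; the data are synthetic, with an assumed ~8% relative scatter between prediction and experiment.

```python
# Sketch of the +/-20% predicted-to-experimental band check; data are synthetic,
# with an assumed ~8% relative scatter between prediction and experiment.
import numpy as np

rng = np.random.default_rng(5)
experimental = rng.uniform(8.0, 30.0, 136)              # observed IBS (kN)
predicted = experimental * rng.normal(1.0, 0.08, 136)   # illustrative predictions

ratio = predicted / experimental
within_band = float(np.mean((ratio >= 0.8) & (ratio <= 1.2)))
print(round(within_band * 100, 1))                      # % of specimens in the band
```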


(iv) Taylor Diagrams
In Figure 6, the dashed radial lines (blue) denote the standard deviation (SD), the dashed straight lines (black) denote the correlation coefficient (CC), and the continuous radial lines (red) indicate the centered root-mean-square deviation (CRMSD) between the training/testing datasets and the experimental dataset. The Taylor diagrams (shown in Figure 6) for the training and testing datasets provide further visualization of the accuracy of all of the proposed models via the correlations and errors between the predicted and experimental values. These diagrams statistically summarize how well the observed and estimated values correspond in terms of root-mean-square error, standard deviation, and Pearson's correlation coefficient [2]. The Taylor diagrams provide a visual summary of the predictive abilities of the proposed models in one image, illustrating how close the experimental and predicted results are in terms of correlation and bias ratio [3]. The reference model is indicated by the white circular dot, with a measured SD of 4.
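The three statistics a Taylor diagram condenses are linked by a law-of-cosines identity, CRMSD² = SD_o² + SD_p² − 2·SD_o·SD_p·CC, which can be verified numerically on illustrative arrays.

```python
# The three Taylor-diagram statistics (SD, Pearson CC, centered RMSD) satisfy
# CRMSD^2 = SD_o^2 + SD_p^2 - 2*SD_o*SD_p*CC; checked on illustrative arrays.
import numpy as np

rng = np.random.default_rng(6)
obs = rng.uniform(8.0, 30.0, 100)
pred = obs + rng.normal(0, 2.0, 100)

sd_o, sd_p = float(obs.std()), float(pred.std())
cc = float(np.corrcoef(obs, pred)[0, 1])
crmsd = float(np.sqrt(np.mean(((pred - pred.mean()) - (obs - obs.mean())) ** 2)))

lhs = crmsd ** 2
rhs = sd_o ** 2 + sd_p ** 2 - 2 * sd_o * sd_p * cc
print(abs(lhs - rhs) < 1e-9)   # True: the law-of-cosines identity holds
```

This identity is why a single point on the diagram encodes all three metrics at once.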

SHAPASH Analysis
The Python "SHAPASH" package was used to determine the relative importance, direction of influence, and nature of influence of the predictors on the target variable. It can be observed that EfTf is the most significant variable, followed by bf, fc′, hg, and bg (Figure 7). The feature-importance observations concur with the results obtained in a previous study [2]. It is evident from Figure 8 that increases in the value of EfTf contribute positively to the prediction. The lowest prediction of the ultimate IBS capacity, 8 kN, was observed for an EfTf value of 12.9 GPa-mm, whereas the highest prediction was obtained at 78.2 GPa-mm. Similarly, the highest prediction of the IBS was obtained at high compressive strength (Figure 9). Increasing the depth of the groove beyond 15 mm predicted an IBS in the range of 13-20 kN (Figure 10). The specimens with a narrower groove yielded better IBS results than those with a wider groove (Figure 11).
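SHAPASH wraps a trained model in an interactive explainer, which is hard to reproduce in a few lines; as a library-light sanity check of the same importance-ranking idea, the sketch below applies scikit-learn's permutation_importance to synthetic data in which the stand-in for EfTf dominates the response by construction.

```python
# Hedged stand-in for a SHAPASH-style importance ranking: scikit-learn's
# permutation_importance on synthetic data where "EfTf" dominates by design.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(7)
X = rng.uniform(0, 1, size=(300, 5))           # EfTf, bf, fc', bg, hg stand-ins
y = 5 * X[:, 0] + 2 * X[:, 1] + rng.normal(0, 0.1, 300)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
imp = permutation_importance(model, X, y, n_repeats=10, random_state=0)

names = ["EfTf", "bf", "fc'", "bg", "hg"]
ranking = [names[i] for i in np.argsort(imp.importances_mean)[::-1]]
print(ranking)                                 # EfTf first, then bf, as in Figure 7
```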


Conclusions
FRP laminates are widely utilized to retrofit a variety of reinforced concrete elements (i.e., beams, columns, joints, and slabs); therefore, it is crucial to assess their bond strength with concrete structural members. Single-lap shear strength tests, performed on the FRP laminates bonded to a concrete prism and its grooves, were used to develop three machine learning (ML) ensemble models, namely, random forest (RF) regression, extreme gradient boosting (XGBoost), and the Light Gradient Boosting Machine (LIGHT GBM). It is notable that the developed models are applicable only within the extreme values of the input variables used in the present study. It is also worth mentioning that the tested specimens comprised single-lap-sheared samples bonded to the concrete using Sikadur 330 epoxy as the adhesive material. The following conclusions can be drawn from this research:

• While investigating the optimization of the formulated models, the learning rate (0.1), maximal depth (7), and number of trees (90) were found to govern the final RF predictions. The same optimal learning rate was obtained for the other two ML methods (i.e., XGBoost and LIGHT GBM) as well. In contrast to the RF regression, the maximal tree depth was found to be 3 for the other two models.

• The sensitivity analysis via SHAPASH indicated that EfTf is the most prominent input attribute, followed by the width of the FRP laminates. This is in good agreement with the Pearson's linear correlations between these two parameters and the ultimate capacity of the FRP laminates, suggesting validation of the formulated models and consistency in variable importance across different statistical evaluation methods.
• Moreover, all of the models showed reliable performance in terms of correlation and error evaluation; however, LIGHT GBM outclassed the other two models. For LIGHT GBM, the values of R2, RMSE, and MAE were 0.942, 3.40, and 0.80 for the training data, respectively, and 0.865, 3.56, and 1.3 for the testing data, respectively. For the training and validation datasets, the slopes of the regression lines were 0.9348 and 0.7678, respectively, showing that the experimental and predicted values were in close agreement.
The LIGHT GBM framework exhibits a high training speed, greater efficacy, and higher accuracy, and it is capable of handling large-scale data. It is also highly applicable to binary and multi-class classification problems. However, the associated drawbacks include potential overfitting and dataset-compatibility issues. In addition, the SHAPASH analysis is highly versatile (it can plot interactions between the considered variables alongside a trained model that can be used for future predictions) and works with classification as well as regression problems.