Machine Learning-Based Models for Shear Strength Prediction of UHPFRC Beams

Abstract: Estimating shear strength is a crucial aspect of beam design. The goal of this research is to develop a shear strength calculation technique for ultra-high performance fiber reinforced concrete (UHPFRC) beams. To begin, a shear test database of 200 UHPFRC beam specimens is established. Then, random forest (RF) is used to evaluate the importance of the factors that influence the shear strength of UHPFRC beams. Subsequently, three machine learning (ML)-based models, namely artificial neural network (ANN), support vector regression (SVR), and eXtreme-gradient boosting (XGBoost), are proposed to compute shear strength. Results demonstrate that the area of longitudinal reinforcement has the greatest influence on the shear capacity of UHPFRC beams, and ten parameters with high importance (e.g., the area of longitudinal reinforcement, the stirrup strength, the cross-section area, the shear span ratio, and the fiber volume fraction) are selected as input parameters. The ANN, SVR, and XGBoost models have similar accuracy, with $R^2$ values of 0.8825, 0.9016, and 0.8839, respectively, which are much larger than those of existing theoretical models. In addition, the average ratios of the values predicted by the ANN, SVR, and XGBoost models to the experimental results are 1.08, 1.02, and 1.10, respectively, and the coefficients of variation are 0.28, 0.21, and 0.28, respectively. The SVR model has the best accuracy and reliability. The accuracy and reliability of the ML-based models are much better than those of existing models for calculating the shear strength of UHPFRC beams.


Introduction
High-performance materials offer significant advantages in improving the performance of structures, reducing self-weight, and saving materials, and they have led to the development of high-performance structures [1]. Ultra-high-performance fiber reinforced concrete (UHPFRC), a newly developed high-performance material in civil engineering, offers remarkable features such as ultra-high strength, high durability, and micro-crack self-healing capacity, which may help structures last longer and require less maintenance. The application of UHPFRC in concrete structures can significantly improve their mechanical performance and durability and reduce the size of the cross-section, saving concrete. Research on the mechanical properties of UHPFRC members is the basis for applying UHPFRC in concrete structures, and shear strength is an essential index for the evaluation and design of UHPFRC beams, so in-depth research on the shear strength of UHPFRC beams is necessary.
The shear performance of UHPFRC beams has been experimentally investigated in depth. These shear tests have examined the effects of fiber volume fraction and type, shear-span ratio, stirrup configuration, longitudinal reinforcement configuration, and other parameters on the flexure and shear properties of UHPFRC beams. The experimental results show that the inclined cracks of UHPFRC beams under shear force are dense and fine [2-5]. The bearing capacity and ductility of UHPFRC beams with fibers may be significantly increased compared to UHPC beams without fibers [3,6-10]. Brittle shear failure can be converted to ductile flexure failure thanks to the fibers [11-15]. The shear and flexural performance of UHPFRC beams may be significantly improved by increasing the fiber volume fraction [4,6,11,15-18]. The performance of UHPFRC beams is also influenced by the type of fiber. As with reinforced concrete beams, lowering the shear span ratio can improve the shear capacity of UHPFRC beams [5,12,17,19,20]. Shear experiments on UHPFRC beams reveal that both fibers and stirrups enhance beam shear strength, and that fibers can reduce the required stirrup ratio [6,8,21,22]. As a result, numerous researchers have investigated the use of fibers to reduce the minimum stirrup ratio of UHPFRC beams [2,6,9,23-25]. The shear strength of UHPFRC beams without stirrups rises as concrete strength and fiber volume fraction increase, and the fibers aid in the shift from the shear tension to the shear compression failure mode [18,26-28].
However, there are limited theoretical studies on strength calculation techniques for UHPFRC beams. In general, the complex tensile stress distribution of UHPFRC in the tensile zone is idealized as a rectangular distribution with a constant stress of $k f_t$, where $f_t$ is the tensile strength of UHPFRC and k is a constant stress reduction coefficient [29]. The existing theoretical models for the flexural strength of UHPFRC beams are based on the plane-section assumption and take into account the tensile stress of the UHPFRC section after cracking. Qi et al. [30] proposed a method for calculating the shear strength of UHPFRC beams that considers the contributions of fibers, compressive UHPFRC, and stirrups. Shear tests of T-shaped UHPFRC beams confirmed the accuracy and dependability of the method. Ahmad et al. [31] combined UHPFRC with high-strength steel bars in beam members and conducted shear loading tests on ten beam specimens. A formula for the shear capacity of UHPFRC beams was then fitted to the test data of these ten beams.
In recent years, machine learning methods have been widely used in the mechanical analysis of civil structures. These methods can comprehensively consider the influence of various factors and have high prediction accuracy. Roya et al. [32] collected UHPFRC beam test data from previous literature and used a data-driven machine learning (ML) framework to predict the failure mode and strength of UHPFRC beams; the methods included support vector machine (SVM), artificial neural networks (ANN), k-nearest neighbor (k-NN), and genetic programming (GP). The prediction accuracy was examined using the collected test data, and the results showed that the proposed method has reasonable accuracy. Kim et al. [33] used the ML methods of CatBoost, eXtreme-gradient boosting, histogram gradient boosting, and random forest to predict the interfacial bonding strength of FRP-concrete, and the results indicate that the proposed models have high prediction accuracy. Yaseen et al. [34] used a new support vector regression algorithm coupled with particle swarm optimization to establish a method for the shear strength of fiber reinforced concrete beams; the comparative analysis shows that this prediction method has reasonable accuracy and can provide a reference for the shear strength design of fiber reinforced concrete beams. Mangalathu et al. [35] collected the test data of existing beam-column joint specimens, used these data to train machine learning techniques, and established a prediction model for the failure modes and shear strength of beam-column joints; the prediction accuracy of the method was examined using the test data. The results show that this prediction method has good calculation efficiency and reasonable accuracy. Payam et al. [36] proposed three innovative ML-based models to calculate the shear strength of reinforced concrete (RC) walls, in which support vector regression is combined with meta-heuristic optimization algorithms such as teaching-learning-based optimization (TLBO), particle swarm optimization (PSO), and Harris Hawks optimization (HHO). The results indicate that the proposed ML-based models have better accuracy than the methods proposed by design codes and other researchers. Chen et al. [37] utilized a hybrid intelligence algorithm combining an artificial neural network and particle swarm optimization (ANN-PSO) to propose an ML-based model for predicting the shear strength of squat RC walls. A total of 139 test results of squat walls were collected and used to train and test the hybrid ANN-PSO model. The results show that the proposed ML-based model predicts the shear strength of such RC walls with good accuracy. Keshtegar et al. [38] combined an artificial neural network (ANN) with the adaptive harmony search (AHS) optimization algorithm to establish an ML-based model for the shear strength of RC walls, and the results indicate that the proposed ANN-AHS model has excellent prediction accuracy in modeling the shear strength of RC shear walls. Gondia et al. [39] utilized genetic programming (GP) to develop a compact shear strength prediction expression using a dataset of 254 squat reinforced concrete shear walls with boundary constraints. The results show that the proposed expression predicts the shear strength of RC walls better than other shear strength prediction methods in design codes and the literature. Wu et al. [40] used a back-propagation (BP) neural network algorithm to predict the shear strength of discontinuities with different joint wall compressive strength (DDJCS), with the joint wall strength combination, normal stress, and joint roughness as input parameters. The results indicate that the prediction accuracy of the developed ML-based model is better than that of the multivariate regression model. Nguyen et al. [41] utilized an artificial neural network (ANN) to establish an ML-based model for predicting the shear strength of squat flanged RC walls, and the test data of squat flanged RC wall specimens were collected to train and test the ANN model. The results show that the developed ANN model has better prediction accuracy than the existing equations.
The above studies indicate that ML-based models have high prediction accuracy for estimating the shear strength of RC members. However, they currently focus on shear strength models of RC beams and walls, and little related research has been conducted on UHPFRC beams. Many parameters have a critical influence on the shear strength of RC members, and each parameter has a different importance coefficient for the shear strength. The input parameters of the above ML-based models are the empirical critical parameters commonly used in theoretical models, such as stirrup strength, stirrup ratio, and cross-section size, while other critical parameters may not be considered. In ML-based models, selecting the parameters with high importance as the input parameters can ensure the accuracy of the model prediction and improve computational efficiency. Therefore, it is necessary to evaluate the importance coefficient of each parameter for the shear capacity of RC members, which has not been considered in previous ML-based models. The fibers in UHPFRC are distributed randomly in the beam specimens, and the direction and position of the fibers can affect the performance of the beam, but the effect of each individual fiber cannot be considered owing to the large number of fibers in the beams. In many theoretical models, an effective coefficient is used to consider the effect of fibers on average. This coefficient is mostly a fixed, empirical value obtained from tests, and there is currently no uniform value for it. In ML-based models, the complex mapping relationship between fibers and beam shear strength can be comprehensively established through learning, so the effect of randomly distributed individual fibers on beam shear capacity can be represented in an equivalent way.
A method for calculating shear strength is essential for the design of UHPFRC beams, yet research on the shear bearing capacity of UHPFRC beams is relatively insufficient, so in-depth research on the estimation of shear capacity is necessary to promote the application of UHPFRC in concrete structures. The existing theoretical methods involve many assumptions and simplifications and consider only a limited set of parameters, so their prediction accuracy may not be high. The ML-based method can consider the influence of each parameter on the prediction target, and it can address the gaps in the theoretical methods. To this end, this paper collects the test data of 200 UHPFRC beam specimens and establishes three prediction models of shear strength using the machine learning (ML) methods ANN, SVR, and XGBoost. The accuracy and applicability of the proposed models are evaluated by comparing them to the existing theoretical methods.

Experimental Database
The shear tests of UHPFRC beam specimens in the previous literature are summarized, and the corresponding database is established to provide experimental verification for the subsequent research on the shear capacity of UHPFRC beams in this paper, as well as test data support for relevant theoretical and finite element research on UHPFRC beams. At present, the loading setups for shear tests of UHPFRC beams can be divided into four-point loading and three-point loading (Figure 1a,b), where a in the figure is the shear span and p is the peak shear force of the beam specimens.
The experimental database contains 200 UHPFRC beam specimens failing in shear, as listed in Table A1 in Appendix A, which summarizes 24 parameters of the beam specimens, including dimension and reinforcement details. In Table A1, $A_s$ is the longitudinal reinforcement area, $f_{yw}$ is the stirrup strength, $A_{so}$ is the beam cross-section area, $A$ is the total section area, $\lambda$ is the shear span ratio, $A_s'$ is the area of compression longitudinal reinforcement, $a$ is the shear span, $V_f$ is the fiber volume fraction, $\rho_w$ is the stirrup ratio, $t_w$ is the web thickness, $f_y$ is the yield strength of the longitudinal reinforcement, UHPC-$f_c$ and $f_t$ are, respectively, the compressive and tensile strength of UHPFRC, $\rho_s$ is the ratio of longitudinal reinforcement, $I$ is the section moment of inertia, $b_f$ is the compressive flange width, $h_o$ is the effective height of the cross-section, $\rho_s'$ is the ratio of compressive longitudinal reinforcement, $h$ is the height of the cross-section, $s$ is the stirrup spacing, $t_f$ is the thickness of the top flange, and $t_b$ is the thickness of the bottom flange.
The purpose of these shear tests was to determine how various parameters affect the shear performance of UHPFRC beams. According to the findings, the effects of the shear span ratio, longitudinal reinforcement ratio, and stirrup ratio on the shear performance of UHPFRC beams are similar to those for reinforced concrete beams. Fibers in UHPFRC can ensure the integrity of the beams after the full development of inclined cracks, which enhances the ductility of the members. The bridging effect of fibers across inclined cracks after UHPFRC cracking carries tensile stress, effectively increasing beam shear performance. Figure 2 shows the distributions of the key parameters of the collected UHPFRC beam specimens. The cross-section forms of these beam specimens include rectangular (123 specimens), I-shaped (67 specimens), and T-shaped (10 specimens). The shear span ratios of the specimens range from 0.79 to 8.46, and most of them lie between 1.38 and 3.74. The compressive strength of the UHPFRC used in the specimens is between 78 and 222 MPa; most of the UHPFRC with a strength between 78 and 96 MPa has no fibers or a low fiber volume fraction, which explains its relatively low compressive strength. Where the literature does not report the tensile strength of the UHPFRC material, it is calculated as 0.6 $f_c$. The fiber volume fractions of the UHPFRC used in the beam specimens are between 0 and 5%, and the yield strength of the longitudinal bars is between 365 and 1835 MPa. The collected UHPFRC beam specimens include some specimens without stirrups. The stirrup strength is between 284 and 568 MPa, and the stirrup ratio is between 0 and 1.7%. In conclusion, the dimensions of these beam specimens vary, and the specimens cover the influence of all critical parameters on the shear strength, so the research parameters are sufficient. Moreover, the variation range of the parameters is wide, and the parameter values of different beam specimens differ markedly. These features of the collected data allow the ML-based model for the shear strength of UHPFRC beams to be effectively verified.

Parameter Evaluation Method
The random forest (RF) algorithm is used to evaluate the importance of the given parameters. RF combines the ensemble learning theory of bagging with the random subspace method and introduces random sampling of both samples and features to achieve more accurate classification than a single algorithm. RF therefore consists of bootstrap resampling, decision tree generation, and random forest formation, and the classification result is as follows:

$$h(x) = \arg\max_{Y} \sum_{i=1}^{k} I\left(h_i(x, \theta_i) = Y\right)$$

where $h(x)$ represents the classification result determined by the vote of multiple decision trees; $h_i$ is the ith decision tree classification model; $Y$ is a candidate class; $x$ is the vector of characteristic parameters to be identified; $\theta_i$ is the bootstrap training set used to train the ith decision tree; $k$ is the number of decision trees; and $I(\cdot)$ is an indicator function.
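The two ingredients named above, bootstrap resampling and majority voting, can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the function names, the class labels, and the toy votes are assumptions made only for illustration.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items with replacement (one bootstrap training set)."""
    return [rng.choice(data) for _ in data]


def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


rng = random.Random(0)
sample = bootstrap_sample(list(range(10)), rng)

# Hypothetical votes of five decision trees for one specimen.
votes = ["shear", "flexure", "shear", "shear", "flexure"]
winner = majority_vote(votes)
```

Because each tree sees a different bootstrap sample, the trees disagree on some inputs, and the vote averages out their individual errors.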
The Gini coefficient (GC) is commonly used to split nodes during the generation of decision trees, and the parameter importance is obtained by calculating the average change in the Gini coefficient caused by feature $f_i$ [42]. The GC can be defined as:

$$GC(p) = \sum_{m=1}^{M} p_m (1 - p_m) = 1 - \sum_{m=1}^{M} p_m^2$$

where $p_m$ is the probability of a sample belonging to class m, and there is a total of M classes. Analogously, the GC of dataset D can be defined as:

$$GC(D) = 1 - \sum_{m=1}^{M} \left(\frac{|C_m|}{|D|}\right)^2$$

where $C_m$ is the subset of the samples of dataset D in class m. On node n, feature $f_i$ divides dataset D into two parts, $D_1$ and $D_2$, so the change in the Gini coefficient can be obtained by:

$$\Delta GC_n(f_i) = GC(D) - \frac{|D_1|}{|D|} GC(D_1) - \frac{|D_2|}{|D|} GC(D_2)$$

So, the importance of $f_i$ in the kth decision tree can be expressed as:

$$I_k(f_i) = \sum_{n=1}^{N_i} \Delta GC_n(f_i)$$

where $N_i$ is the number of nodes split by feature $f_i$. Therefore, the importance of $f_i$ can be obtained by

$$I(f_i) = \frac{\sum_{k=1}^{K} I_k(f_i)}{\sum_{j=1}^{d} \sum_{k=1}^{K} I_k(f_j)}$$

where d is the number of features, and K is the number of decision trees in the RF model.
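A small worked example may make the Gini-based importance concrete. The Python sketch below computes the Gini coefficient of a label set, the decrease produced by one split, and a normalization of raw importances so that they sum to 1. The function names and the toy labels are assumptions for illustration, and the normalization step is one plausible reading of the importance definition rather than a confirmed detail of this study.

```python
from collections import Counter


def gini(labels):
    """Gini coefficient of a list of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Change in the Gini coefficient when `parent` is split into `left` and `right`."""
    n = len(parent)
    return (gini(parent)
            - len(left) / n * gini(left)
            - len(right) / n * gini(right))


def normalize(raw_importance):
    """Scale raw per-feature importances so that they sum to 1."""
    total = sum(raw_importance.values())
    return {name: value / total for name, value in raw_importance.items()}


# Toy node with two classes, split perfectly by some feature.
parent = ["a", "a", "b", "b"]
decrease = gini_decrease(parent, ["a", "a"], ["b", "b"])  # 0.5 - 0 - 0

# Hypothetical raw importances accumulated over all trees.
shares = normalize({"f1": 3.0, "f2": 1.0})
```

A perfect split removes all impurity, so the decrease equals the parent's Gini coefficient; a split that leaves both children as mixed as the parent contributes nothing to the feature's importance.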

Results and Discussions
Figure 3 shows the importance of the various parameters for the shear strength of UHPFRC beams obtained by the above RF method. It can be seen that the sectional area of longitudinal reinforcement has the greatest influence on the shear capacity of UHPFRC beams, with an importance coefficient of 0.14, and the stirrup strength, the cross-section area, the shear span ratio, etc., all have a significant influence on the shear strength of UHPFRC beams. According to the existing shear tests on UHPFRC beams, these parameters have a significant impact on the shear capacity, indicating that the parameter evaluation is reasonable. The sum of the importance coefficients of the first 12 parameters reaches 0.96. To simplify the calculation while maintaining adequate accuracy, the first 10 parameters are selected as input parameters.
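The selection step can be sketched in a few lines of Python: rank the parameters by importance, keep the top k, and report how much of the total importance they cover. Apart from the value 0.14 for the longitudinal reinforcement area, the importance values below are invented placeholders, and the function name is hypothetical.

```python
def select_top_features(importances, k):
    """Return the k most important feature names and their summed importance."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    selected = ranked[:k]
    coverage = sum(value for _, value in selected)
    return [name for name, _ in selected], coverage


# Placeholder importances; only A_s = 0.14 is taken from the text.
importances = {"A_s": 0.14, "f_yw": 0.12, "A": 0.10, "lambda": 0.09, "t_w": 0.02}
names, coverage = select_top_features(importances, 3)
```

Tracking the cumulative coverage makes the trade-off explicit: each additional parameter adds less importance while increasing the size of the model input.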

ML Methods
Three commonly used ML algorithms were used to build the prediction models: ANN, SVR, and XGBoost. ML algorithms combine data and mathematical algorithms to find the statistical relationships between the input parameters and the targets. A model usually achieves high accuracy only after appropriate parameters have been selected; that is, ML methods are highly sensitive to parameter selection.

Prediction Steps
According to the workflow shown in Figure 4, the prediction models of the shear capacity of UHPFRC beams based on ANN, SVR, and XGBoost are established. The detailed prediction steps are as follows:
Step 1. Dividing the dataset. The dataset is divided into a training dataset and a testing dataset in the ratio of 8:2.
Step 2. Model training. ANN, SVR, and XGBoost are used to establish the prediction models based on the training dataset. Cross-validation and grid search strategies were used to optimize the hyperparameters, and MSE (mean square error) was used as the loss function. Finally, the prediction model was obtained. During the training process, dropout and L2 regularization were used to avoid overfitting of the ANN models. Dropout randomly deactivates neurons with a probability of p during each epoch of training, whereas all neurons are used in the final model. To maintain the scale of the predicted value, the weights of the neurons in the final model are multiplied by the retention probability 1 − p. In this study, p = 0.2. L2 regularization uses a penalty coefficient to prevent the network from becoming too complex. SVR relies on its regularized risk function to avoid overfitting. In the XGBoost models, the maximum depth and gamma were used to avoid overfitting. By and large, the smaller the maximum depth and the larger the gamma, the less prone the models are to overfitting. Moreover, overfitting and underfitting can be detected by monitoring the MSE of both the training set and the testing set.
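Steps 1 and 2 can be sketched as follows. The example is a simplified stand-in, not the training code of this study: it splits 200 sample indices 8:2, then runs a grid search with k-fold cross-validation and MSE as the score. The one-parameter ridge model, the grid values, the number of folds, and all function names are assumptions chosen only to keep the sketch self-contained.

```python
import random


def split_indices(n, test_fraction, seed):
    """Shuffle indices 0..n-1 and split them into training and testing lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_folds(indices, k):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    for fold in range(k):
        validation = indices[fold::k]
        held_out = set(validation)
        yield [i for i in indices if i not in held_out], validation


def mse(y_true, y_pred):
    """Mean square error between two equally long sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def ridge_fit_predict(alpha, x, y, train, validation):
    """Fit y = slope * x with an L2 penalty `alpha` on the training indices."""
    slope = (sum(x[i] * y[i] for i in train)
             / (sum(x[i] ** 2 for i in train) + alpha))
    return [slope * x[i] for i in validation]


def grid_search(grid, x, y, indices, k):
    """Return the hyperparameter value with the lowest mean validation MSE."""
    best_alpha, best_score = None, float("inf")
    for alpha in grid:
        scores = [mse([y[i] for i in val],
                      ridge_fit_predict(alpha, x, y, train, val))
                  for train, val in k_folds(indices, k)]
        score = sum(scores) / len(scores)
        if score < best_score:
            best_alpha, best_score = alpha, score
    return best_alpha, best_score


x = [float(i) for i in range(200)]
y = [2.0 * xi for xi in x]                      # noise-free toy target
train_idx, test_idx = split_indices(len(x), 0.2, seed=0)
best_alpha, best_score = grid_search([0.0, 1e3, 1e6], x, y, train_idx, k=5)
```

Only the training indices enter the grid search, so the testing set stays untouched until the final evaluation in Step 3.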
Step 3. Prediction accuracy evaluation. The model obtained in Step 2 is evaluated on the testing dataset, that is, on samples not used for training, and the prediction is assessed by the goodness of fit $R^2$. $R^2$ represents the degree to which the regression results fit the measured values; the closer it is to 1, the better the fit. The calculation formula is as follows:

$$R^2 = 1 - \frac{\sum_{i} (y_i - p_i)^2}{\sum_{i} (y_i - y_m)^2}$$

where $y_i$ and $p_i$ are, respectively, the measured and predicted results, and $y_m$ is the mean value of the measured results.
Meanwhile, the ratios of predicted to measured values, $p_i / y_i$, are used to evaluate the accuracy of the prediction model, and the coefficient of variation (CoV) of these ratios is used to evaluate the reliability of the ML-based models. The CoV is the standard deviation of the ratios $p_i / y_i$ divided by σ, where σ is the mean value of $p_i / y_i$.
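Both evaluation measures are easy to compute directly. The sketch below is a minimal illustration with made-up strength values; it assumes the population form of the standard deviation (division by the number of samples), which the text does not specify, and the function names are hypothetical.

```python
import math


def r_squared(measured, predicted):
    """Goodness of fit: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(measured) / len(measured)
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot


def ratio_mean_and_cov(measured, predicted):
    """Mean and coefficient of variation of the predicted/measured ratios."""
    ratios = [p / y for y, p in zip(measured, predicted)]
    mean = sum(ratios) / len(ratios)
    # Assumption: population variance (divide by n, not n - 1).
    variance = sum((r - mean) ** 2 for r in ratios) / len(ratios)
    return mean, math.sqrt(variance) / mean


# Made-up shear strengths in kN, for illustration only.
measured = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

r2 = r_squared(measured, predicted)
mean_ratio, cov = ratio_mean_and_cov(measured, predicted)
```

A mean ratio above 1 indicates that the model overestimates on average, while a small CoV indicates that the error is consistent from specimen to specimen.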

Artificial Neural Network (ANN)
An ANN has three parts: the input layer, the hidden layer, and the output layer. Neurons in each layer receive signals from all neurons in the previous layer and pass their output to the next layer, so the signal propagates one way from the input layer to the output layer. Each neuron in the first layer receives the input training data, weighted by the connection weight w and shifted by the bias b. After processing by the activation function, the output value of this layer is generated and transferred to the next layer. After that, the output value of the neural network is finally obtained. In this way, a neural network resembles a composite function nested through layers of simple functions. In the training of the neural network model, the loss function to be optimized should be determined first; in most cases, the mean square error function is used as the loss function, while classification problems can also adopt the cross entropy function. The loss function is denoted J, and the model is trained by a gradient descent algorithm. The gradient of the jth node at the Nth (last) layer of the neural network is easy to obtain and can be expressed as:

$$\frac{\partial J}{\partial u_j^N} = \frac{\partial J}{\partial y_j^N} \frac{\partial y_j^N}{\partial u_j^N} \tag{9}$$

where $y_j^N$ is the output value of the corresponding node. When the mean square error function is chosen, J can be expressed as:

$$J = \frac{1}{2} \sum_{j} \left(y_j^N - t_j\right)^2 \tag{10}$$

where $t_j$ is the target value of the jth output node. $u_j^N$ is the input value of the activation function of the node:

$$u_j^N = w_j^N x_j^N + b_j^N \tag{11}$$

where $w_j^N$ and $b_j^N$ are, respectively, the weight and bias of the corresponding node, and $x_j^N$ is the input value of the node. When the node adopts the sigmoid function as the activation function, the gradient can be obtained as follows:

$$\frac{\partial J}{\partial u_j^N} = \left(y_j^N - t_j\right) y_j^N \left(1 - y_j^N\right) \tag{12}$$

After the gradient is obtained, the weights of this layer are updated according to the set step size, and the gradient of layer N − 1 can be further obtained as:

$$\frac{\partial J}{\partial u_i^{N-1}} = \sum_{j} \frac{\partial J}{\partial u_j^N} \, w_j^N \, \frac{\partial y_i^{N-1}}{\partial u_i^{N-1}} \tag{13}$$

It can be seen that the gradient of layer N − 1 is related to all nodes connected to layer N. Equation (13) can be regarded as the product of three parts. The first part is the gradient of layer N, which is given in Equation (12). The middle part reflects the connection between the two layers, and the result of the derivative is the weight $w_j^N$. The last part is the derivative of the activation of the current layer, which follows Equation (9). The above process is repeated until the weights of all layers are updated, which completes one training step; this procedure is the back propagation algorithm.
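The output-layer gradient for a sigmoid node with a squared-error loss can be checked numerically. The sketch below assumes the loss $J = \frac{1}{2}(y - t)^2$ for a single output node with target t; the weights, inputs, step size, and function names are arbitrary choices for illustration. It compares the analytic gradient with a central finite difference and then takes one gradient descent step.

```python
import math


def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))


def forward(w, b, x):
    """Output of one sigmoid node: y = sigmoid(w . x + b)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def loss(w, b, x, t):
    """Assumed loss for one node: J = 0.5 * (y - t)^2."""
    return 0.5 * (forward(w, b, x) - t) ** 2


def analytic_gradient(w, b, x, t):
    """dJ/du = (y - t) * y * (1 - y); then dJ/dw_i = dJ/du * x_i and dJ/db = dJ/du."""
    y = forward(w, b, x)
    delta = (y - t) * y * (1.0 - y)
    return [delta * xi for xi in x], delta


def numeric_gradient(w, b, x, t, eps=1e-6):
    """Central finite-difference approximation of the same gradient."""
    grad_w = []
    for i in range(len(w)):
        w_plus, w_minus = list(w), list(w)
        w_plus[i] += eps
        w_minus[i] -= eps
        grad_w.append((loss(w_plus, b, x, t) - loss(w_minus, b, x, t)) / (2 * eps))
    grad_b = (loss(w, b + eps, x, t) - loss(w, b - eps, x, t)) / (2 * eps)
    return grad_w, grad_b


w, b, x, t = [0.3, -0.2], 0.1, [1.5, 2.0], 0.9
grad_w, grad_b = analytic_gradient(w, b, x, t)
num_w, num_b = numeric_gradient(w, b, x, t)

# One gradient descent step with an arbitrary step size.
step = 0.5
w_new = [wi - step * gi for wi, gi in zip(w, grad_w)]
b_new = b - step * grad_b
```

Agreement between the two gradients is a quick sanity check that the chain rule has been applied in the right order before the same pattern is repeated for the hidden layers.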

Support Vector Regression (SVR)
SVR is based on statistical learning theory and the principle of structural risk minimization; it generalizes well and avoids a high dependence on the sample data. The training set is given as:

$$T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$$

The regression function is assumed as:

$$f(x) = w \cdot x + b$$

To improve the generalization ability, it is necessary to widen the ε-tube around the regression function. This minimizes the possibility of an unknown point falling outside the region. However, when the training set is nonlinear, the generalization performance of the regression function obtained is very poor even after the optimization is completed. Therefore, the kernel function $K(x_i \cdot x_j)$ is introduced to transform the low-dimensional nonlinear problem into a high-dimensional linear problem, and finally, the regression problem is transformed into the following optimization problem:

$$\min_{\alpha, \alpha^*} \; \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} (\alpha_i^* - \alpha_i)(\alpha_j^* - \alpha_j) K(x_i \cdot x_j) + \varepsilon \sum_{i=1}^{n} (\alpha_i^* + \alpha_i) - \sum_{i=1}^{n} y_i (\alpha_i^* - \alpha_i)$$

$$\text{s.t.} \quad \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) = 0, \qquad 0 \le \alpha_i, \alpha_i^* \le C$$

where $\alpha_i$ and $\alpha_i^*$ are the undetermined coefficients and C is the penalty coefficient. So we can get the optimal nonlinear regression function:

$$f(x) = \sum_{i=1}^{n} (\alpha_i^* - \alpha_i) K(x_i \cdot x) + b$$

The radial basis function is selected as the kernel function [43], which is expressed as:

$$K(x_i \cdot x_j) = \exp\left(-\gamma \lVert x_i - x_j \rVert^2\right)$$

eXtreme Gradient Boosting (XGBoost)
XGBoost is a boosting algorithm proposed by Chen et al. [44] of the University of Washington. A boosting algorithm trains a large number of weak learners, for example shallow decision tree models, through certain strategies, and then combines the prediction results of these weak learners to greatly improve the prediction performance. XGBoost uses a shallow regression tree as the weak learner. The first shallow regression tree model obtained by training is represented as $F_0(t)$, where t is the instance vector in the feature space, and the prediction obtained in this first step is denoted $\hat{y}^{(0)}$. On this basis, the shallow regression tree models $F_m(t)$, m = 1, 2, 3, ..., M, are trained. XGBoost first calculates the first and second derivatives $g_i$ and $h_i$ of the loss function with respect to the prediction obtained in the previous step, namely step m − 1; the objective function of $F_m(t)$ can then be obtained from the second-order Taylor expansion:

$$obj^{(m)} = \sum_{i} \left[ g_i F_m(t_i) + \frac{1}{2} h_i F_m^2(t_i) \right] + \Omega(F_m) \tag{17}$$

$\Omega(F_m)$ is the regularization term, which prevents the algorithm from blindly increasing the model complexity to improve the accuracy and thus overfitting. $\Omega(F_m)$ can be expressed as:

$$\Omega(F_m) = \gamma T + \frac{1}{2} \lambda \sum_{j=1}^{T} w_j^2$$

where γ and λ are both penalty coefficients, T is the number of leaf nodes of the regression tree, and $w_j^2$ expresses the influence of the weights of the leaf nodes on model complexity. It can be seen from Equation (17) that the fitting target in each iteration of XGBoost is the residual between the predicted value and the observed value of the sample. The training process minimizes $obj^{(m)}$, and the regression tree node splitting can adopt the mean square error to select the splitting feature. Finally, a new shallow tree model $F_m(t)$ is obtained and the prediction is updated as:

$$\hat{y}^{(m)} = \hat{y}^{(m-1)} + F_m(t)$$

Figure 5 represents the XGBoost regression mechanism.
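The boosting mechanism can be illustrated with a stripped-down sketch that uses one-split trees (stumps) on a single feature. It is not the XGBoost library and not the model of this study. It assumes the loss $\frac{1}{2}(\hat{y} - y)^2$, so that the first derivative is the residual and the second derivative is 1; it uses the standard closed-form leaf weight $w = -G/(H + \lambda)$ that minimizes the second-order objective for a leaf with gradient sum G and Hessian sum H; and it ignores γ, which is constant when every tree has exactly two leaves. The data and function names are invented.

```python
def leaf_weight(g_sum, h_sum, lam):
    """Leaf weight minimizing G*w + 0.5*(H + lam)*w^2, i.e. w = -G / (H + lam)."""
    return -g_sum / (h_sum + lam)


def fit_stump(x, g, h, lam):
    """Choose the threshold on one feature that most reduces the objective."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [i for i, xi in enumerate(x) if xi <= threshold]
        right = [i for i, xi in enumerate(x) if xi > threshold]
        g_left, h_left = sum(g[i] for i in left), sum(h[i] for i in left)
        g_right, h_right = sum(g[i] for i in right), sum(h[i] for i in right)
        gain = g_left ** 2 / (h_left + lam) + g_right ** 2 / (h_right + lam)
        if best is None or gain > best[0]:
            best = (gain, threshold,
                    leaf_weight(g_left, h_left, lam),
                    leaf_weight(g_right, h_right, lam))
    _, threshold, w_left, w_right = best
    return lambda xi: w_left if xi <= threshold else w_right


def boost(x, y, rounds, lam=1.0):
    """Additive training: each new stump is added to the previous prediction."""
    prediction = [0.0] * len(y)
    history = []
    for _ in range(rounds):
        g = [p - yi for p, yi in zip(prediction, y)]  # first derivative of the loss
        h = [1.0] * len(y)                            # second derivative of the loss
        tree = fit_stump(x, g, h, lam)
        prediction = [p + tree(xi) for p, xi in zip(prediction, x)]
        history.append(sum((p - yi) ** 2 for p, yi in zip(prediction, y)) / len(y))
    return prediction, history


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]
prediction, history = boost(x, y, rounds=10)
```

The penalty λ shrinks every leaf weight below the mean residual of the leaf, so each tree corrects only part of the remaining error; this is how the regularization term trades fitting speed for robustness.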

Parameter Settings
(1) The number of hidden layers and the number of neurons n were optimized by using the grid search method and cross-validation strategy. Figure 6 shows the variation in the goodness of fit with the number of hidden layers and neurons. As mentioned in Alavi et al.'s work [45], different metrics have different preferences. In this study, we used $R^2$ to perform a grid search and find the optimal values of the hyperparameters, since traditional models usually use $R^2$ to describe performance. Moreover, we used the mean absolute error as a second index to examine the result of parameter tuning. Results showed that the hyperparameters obtained by using $R^2$ also gave models with lower mean absolute errors, i.e., in this study, the choice of metric does not make a big difference. Finally, one hidden layer containing 20 units was selected as the final ANN model structure. The dropout rate was 0.2, and the L2 regularization parameter was 0.001.
(2) SVR uses the kernel function to nonlinearly map low-dimensional data to a high-dimensional feature space and then obtains the regression function in that space. The same grid search method and cross-validation strategy were used to optimize the hyperparameters in SVR: the penalty coefficient C and the kernel function parameter γ. Finally, the hyperparameters C = 3500 and γ = 0.8 were selected.
(3) In XGBoost, at each boosting iteration, the first- and second-order gradients of the "squared error" objective function were calculated for each training case. The model was built using XGBoost's scikit-learn-compatible interface. The best results were achieved using tree-based learners in XGBoost, and the parameters are listed in Table 1.

Existing Theoretical Models
(1) Qi et al. model for UHPFRC beams [30]
Qi et al. [30] proposed a calculation model for the shear strength of UHPFRC beams. The shear strength of UHPFRC beams is made up of three components: the shear capacity ($V_c$) provided by the shear-compression zone of the cross-section, the shear capacity supplied by the stirrups ($V_s$), and the shear strength ($V_{fi}$) provided by the fibers. The expression of the shear strength is given in Equations (29) and (30). $V_s$ is calculated using the truss model. The angle of the inclined crack is assumed to be 45°, and the shear strength provided by the fibers is calculated using the Mesoscale Fiber-Matrix Discrete (MFMD) Model.
In Equations (29) and (30), c is the height of the compression zone of the cross-section, and $\tau_b$ is the bond strength between a single fiber and the matrix.
(2) Ahmad et al. model for UHPFRC beams [31]
Based on the shear test data of UHPFRC beams, Ahmad et al. [31] fitted a formula for shear strength, which is expressed as Equation (31).
In Equation (31), α is a bond factor (0.5 for straight steel fibers), a is the shear span, ρ is the ratio of longitudinal reinforcement, and h is the overall depth of the beam.
(3) The method for shear strength of fiber reinforced concrete (FRC) beams in China Association for Engineering Construction Standardization (CECS) 38:2004 [46]
This formula (Equation (32)) is suitable for calculating the shear strength of FRC beams. In this paper, the above test data are used to explore its applicability to the shear capacity of UHPFRC beams.
In Equation (32), $\beta_v$ is the fiber shape coefficient (0.7 for straight fibers and 0.5 for irregular fibers), and $\lambda_f$ is the fiber characteristic value.
(4) Sharma et al. model for shear strength of FRC beams [47]
Sharma et al. [47] developed an equation for shear strength, which is given in Equation (33). The aforesaid test data are used in this research to investigate its applicability to the shear strength of UHPFRC beams.

Comparison and Analysis
The shear strength of each UHPFRC beam specimen is calculated using the above ML models and compared to the corresponding experimental data, as shown in Figure 7, to examine the correctness and reliability of the aforementioned methods. Figure 7 compares the experimental values with the values calculated by the models proposed in this paper, and the two are in good agreement. The ANN, SVR, and XGBoost models have similar goodness of fit: their $R^2$ values are, respectively, 0.8825, 0.9016, and 0.8839, with a mean of 0.8893. Figure 7d compares the experimental shear strength with that calculated by the model proposed by Qi et al. [30]; the $R^2$ is 0.6427, which is much smaller than those obtained by the ML-based models. The model may underestimate the shear strength of UHPFRC beams. For the model of Ahmad et al. [31], the $R^2$ between the experimental and estimated shear strengths of the UHPFRC specimens is 0.7026. The actual and predicted shear strengths of UHPFRC specimens obtained using the shear strength equations for FRC beams are shown in Figure 7f,g, and the scatter is rather large. The goodness of fit of the ML models is much higher than that of the existing models for calculating the shear strength of UHPFRC beams, so the ML models can better predict the shear strength of UHPFRC beams.
The prediction accuracy and reliability of the above models are further evaluated by comparing the ratios (V_pre/V_exp) of the calculated shear capacity (V_pre) to the experimental value (V_exp). The V_pre/V_exp ratios of each model are shown in Figure 8. The coefficient of variation (CoV) of these ratios is used to evaluate the reliability of a model: the lower the value, the higher the reliability. This value is very important for the engineering application of theoretical models. The average values of the ratio obtained by the three ML models in this paper (ANN, SVR, and XGBoost) are 1.08, 1.02, and 1.10, respectively, and the CoVs are 0.28, 0.21, and 0.28, respectively, so the SVR prediction model shows the best accuracy and reliability. Figure 8c shows the distribution of the ratio obtained by the model proposed by Qi et al. [30]. The average value is 0.72 and the CoV is 0.36. This model underestimates the shear strength of UHPFRC beams and shows reasonable reliability. Figure 8e presents the distribution of the ratios obtained by the method proposed by Ahmad et al. [31]. The average value is 1.10 and the CoV is 0.41. The fitted formula is fairly accurate, and its accuracy and reliability may be improved further by increasing the amount of fitted data. Figure 8f,g shows the distributions of the ratios obtained by the methods for the shear strength of FRC beams; these methods show large scatter and need to be improved for estimating the shear strength of UHPFRC beams. The CoV obtained by the SVR prediction model is much smaller than those obtained by the existing theoretical models, and the average ratio obtained by the SVR prediction model is also closer to 1, so the SVR prediction model has better prediction accuracy and reliability.

The R², maximum, minimum, mean, and CoV of the ratios (V_pre/V_exp) obtained by the above models are all summarized in Table 2.
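The ratio statistics discussed above (mean, CoV, minimum, and maximum of V_pre/V_exp) can be computed as in the following sketch. It assumes the CoV is the sample standard deviation (n - 1 denominator) divided by the mean; the text does not say whether the sample or population form is used. The function name `ratio_statistics` and the input values are hypothetical.

```python
from statistics import fmean, stdev


def ratio_statistics(v_pre, v_exp):
    """Summarize predicted-to-experimental shear strength ratios."""
    if len(v_pre) != len(v_exp) or len(v_pre) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    if any(e == 0 for e in v_exp):
        raise ValueError("experimental values must be non-zero")
    ratios = [p / e for p, e in zip(v_pre, v_exp)]
    mean_ratio = fmean(ratios)
    # Assumption: CoV = sample standard deviation / mean.
    cov = stdev(ratios) / mean_ratio
    return {
        "mean": mean_ratio,
        "cov": cov,
        "min": min(ratios),
        "max": max(ratios),
    }


# Hypothetical values in kN (illustrative only, not from the database).
stats = ratio_statistics([90.0, 200.0, 440.0], [100.0, 200.0, 400.0])
print(f"mean = {stats['mean']:.2f}, CoV = {stats['cov']:.2f}")
```

A mean ratio below 1 indicates conservative (under-predicted) strengths on average, while the CoV captures the scatter around that mean, so the two quantities are read together when judging a model.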

Conclusions
In this study, ML-based models (ANN, SVR, and XGBoost) for the shear strength of UHPFRC beams are developed, and their accuracy is compared with that of existing analytical models. The accuracy of the proposed models is assessed using shear test data of UHPFRC beam specimens. The main conclusions are as follows:
(1) Random forest (RF) is used to evaluate the importance of the given parameters for the shear strength of UHPFRC beams. The results show that the area of longitudinal reinforcement has the greatest influence on the shear capacity of UHPFRC beams, with an importance coefficient of 0.14. To simplify the calculation while maintaining adequate accuracy, the first 12 parameters, including the area of longitudinal reinforcement, the stirrup strength, the cross-section area, the shear span ratio, and the fiber volume fraction, are selected as input parameters.
(2) The proposed approach is evaluated for accuracy and reliability using a shear test database and is compared with existing methods for the shear strength of UHPFRC and FRC beams. The ANN, SVR, and XGBoost models have similar goodness of fit: their R² values are 0.8825, 0.9016, and 0.8839, respectively, with a mean of 0.8893, which is much larger than the values obtained by the existing models for calculating the shear strength of UHPFRC beams. The ratios of computed to experimental shear strength obtained by the proposed ML models (ANN, SVR, and XGBoost) have average values of 1.08, 1.02, and 1.10, respectively, and coefficients of variation of 0.28, 0.21, and 0.28, respectively, so the SVR prediction model has the best accuracy and reliability. The accuracy and reliability of the ML-based models are much better than those of existing models for calculating the shear strength of UHPFRC beams, and the existing analytical methods for determining the shear strength of FRC beams are not applicable to UHPFRC beams.
Appendix A

Figure 3. The importance of various parameters for the shear strength of UHPFRC beams.

Figure 4. Flowchart of developing the ML model.

Figure 6. Goodness of fit with different numbers of hidden layers and neurons.

Table 2. Accuracy evaluation details of the above prediction methods.

Table A1. Test data of UHPFRC beam specimens.