Soil-Cement Mixtures Reinforced with Fibers: A Data-Driven Approach for Mechanical Properties Prediction

Abstract: The reinforcement of stabilized soils with fibers is a promising technique to overcome the two main limitations of stabilized soils: their weak tensile/flexural strength and their more brittle behavior. These mixtures require extensive laboratory characterization because a great number of parameters must be studied, which consumes time and resources. This work therefore presents an alternative approach, based on Machine Learning (ML), to predict the unconfined compressive strength (UCS) and the tensile strength of soil-binder-water mixtures reinforced with short fibers. Four ML algorithms (Artificial Neural Networks, Support Vector Machines, Random Forest and Multiple Regression) are explored. The proposed models are supported on representative databases with approximately 100 records for each type of test (UCS and splitting tensile strength tests) and consider sixteen properties of the composite material (soil, fibers and binder). The predictive models provide an accurate estimation (R² higher than 0.95 for the Artificial Neural Networks algorithm) of the compressive and tensile strength of the soil-water-binder-fiber mixtures. Additionally, the results of the proposed models are in line with the main experimental findings, i.e., the great effect of the binder content on the compressive and tensile strength, and the significant effect of the fiber type and properties on the tensile strength.
Author Contributions: J.T., A.A.S.C. and P.J.V.O.; Software, J.T.; Investigation, J.T., A.A.S.C. and P.J.V.O.; Methodology, J.T., A.A.S.C. and P.J.V.O.; Validation, J.T., A.A.S.C. and P.J.V.O.; Writing (original draft), J.T.; Writing (review and editing), J.T., A.A.S.C. and P.J.V.O.; administration, P.J.V.O.


Introduction
In the last two decades, soil stabilization using chemical binders has spread rapidly around the world. This technique is used to improve the properties of problematic soils, mainly soils whose low shear strength and high compressibility prevent them from safely supporting the loads applied by works such as foundations of buildings and/or embankments, slope reinforcement, deep retaining walls [1,2] and stabilization of contaminated soils [3], among others. The main constraint of this technique is the weak tensile strength of the stabilized soil, which restricts its use in works where a non-negligible tensile strength is required, namely structures subject to horizontal vibrations (e.g., induced by heavy machinery, traffic, wind, sea waves, explosives and earthquakes) or horizontal loading/displacement (e.g., deep mixing columns used in slope stabilization or installed at the sides of embankments, retaining walls [4]). The tensile/flexural strength of soil-binder-water mixtures can be increased through the inclusion of short fibers [5,6] or by the installation of steel H-beams inside deep mixing columns. In fact, the inclusion of fibers to improve the mechanical behavior of mixtures has been adopted in other similar industries [7-9].
The reinforcement of soil-binder-water mixtures with short fibers, addressed in several works, increases the ductility, post-peak strength and tensile/flexural strength [6,10-18]. However, the experimental results also show that the impact of the reinforcement changes with the type of soil, the type and content of fiber, the amount of binder and the mechanism induced by the test used to characterize the tensile strength [6,10,11]. In fact, reinforcing soil-binder-water mixtures with synthetic fibers increases the compressive strength when the binder content is lower than 10% [16-18], while a higher amount of binder produces the opposite tendency [6,10,19]. Moreover, the effect of the fiber reinforcement on the tensile strength depends on the strain level imposed at failure by each type of test [10]. When the test produces a small strain at failure (as in direct tensile strength tests), which is insufficient to mobilize the tensile strength of the fibers, the effect of the reinforcement is less pronounced or even detrimental. On the other hand, when failure is associated with a deformation high enough to mobilize the tensile strength of the fibers (as in flexural strength and split tensile strength tests), the tensile strength increases with the fiber reinforcement.
As previously described, the mechanical characteristics of soil-fiber-binder-water mixtures depend on a great number of factors, and specific tests are required for each of the desired properties. Additionally, the specimens should be prepared under conditions that replicate the field conditions as closely as possible, mainly the soil and water content, which increases the costs, especially when dealing with natural materials as heterogeneous as soils. Thus, tools to predict the mechanical characteristics of soil-fiber-binder-water mixtures can be very useful, particularly in the pre-design stage of a work, where they help minimize the associated costs. Keeping this in mind, this work followed a data-driven approach by exploring the capabilities of four Machine Learning (ML) algorithms. In particular, Artificial Neural Networks (ANNs) [20], Support Vector Machines (SVMs) [21] and Random Forest (RF) [22] have been explored for predicting the mechanical properties of soil-binder-water mixtures reinforced with fibers. As a baseline comparison, a Multiple Regression (MR) was also implemented. These advanced algorithms have been widely applied in different knowledge domains [23,24] with very promising results, benefiting from consolidated experience. In the field of Civil Engineering, several successful applications of these tools can be found [25-27], including the solution of complex geotechnical problems related to slope stability assessment [28,29]. These algorithms have also been applied to the study of the mechanical properties of soil-binder-water mixtures, as reported by Tinoco et al. [30], who underline the non-linear learning capabilities of these algorithms.
Thus, building on past applications of these algorithms to the estimation of the unconfined compressive strength of non-reinforced soil-water-cement mixtures [30,31], the focus and main novelty of this work is the prediction of the unconfined compressive strength and, mainly, the tensile strength of stabilized soils reinforced with different types of short fibers.

Modeling
A data-driven approach was adopted to predict both mechanical properties of soil-binder-water mixtures reinforced with fibers. Four different ML algorithms were fitted to each of the two previously compiled and prepared databases, which contain the results of unconfined compression strength tests and indirect tensile strength tests on laboratory mixtures, as well as a set of input variables describing the soil, binder and fiber characteristics of those mixtures. In particular, Artificial Neural Networks (ANNs) [20,32,33], Support Vector Machines (SVMs) [21,34-37] and Random Forests (RF) [22,27,38-40] were trained to estimate the Unconfined Compressive Strength (UCS) and the Indirect Tensile Strength (ITS) of soil-binder-water mixtures reinforced with fibers. In addition, a Multiple Regression (MR) [41] algorithm was implemented as a baseline comparison.
For a detailed overview of each one of the adopted ML algorithms, the readers are advised to check the literature, namely the above-indicated references. Concerning the definitions and hyperparameters of each algorithm, Figure 1 summarizes the adopted parameters.
All experiments were conducted using the R statistical environment [42] with the rminer package [43], which facilitates the implementation of several data mining (DM) algorithms, including ANNs, SVMs and RF, as well as different validation schemes such as the cross-validation adopted in this work.

Models Evaluation
Accuracy and interpretability are two important aspects for a deeper understanding and assessment of the proposed models.
Concerning the comparison of the models and the measurement of their accuracy, three distinct metrics were calculated [44]: Mean Absolute Error (MAE), Root Mean Square Error (RMSE) and coefficient of determination (R²). For a perfect model, MAE and RMSE are zero and R² is equal to one. Although similar, MAE and RMSE assess a model from distinct and complementary perspectives. Compared with MAE, RMSE penalizes more heavily a model that produces high errors in a few cases, since it uses the square of the distance between the real and predicted values [26,31]. In addition, the Regression Error Characteristic (REC) curve proposed by Bi and Bennett [45] was also adopted. A REC curve plots the error tolerance on the x-axis versus the percentage of points predicted within the tolerance on the y-axis, allowing a quick and easy comparison of different regression models.
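As an illustration of the three metrics, a minimal Python sketch is given below. It is not the rminer code used in this work: the function names and the sample values are hypothetical, and R² is computed here as 1 - SSE/SST, which is an assumption, since other variants (e.g., the squared Pearson correlation) are also in use.

```python
import math

def mae(measured, predicted):
    """Mean Absolute Error: average magnitude of the errors."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)

def rmse(measured, predicted):
    """Root Mean Square Error: squaring gives large errors more weight."""
    squared = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    return math.sqrt(squared / len(measured))

def r_squared(measured, predicted):
    """R2 computed as 1 - SSE/SST (assumed variant, see lead-in)."""
    mean = sum(measured) / len(measured)
    sse = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    sst = sum((m - mean) ** 2 for m in measured)
    return 1.0 - sse / sst

# Hypothetical strengths in kPa, for illustration only.
measured = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 320.0, 380.0]
print(mae(measured, predicted), rmse(measured, predicted), r_squared(measured, predicted))
```

For these values the errors are 10, 10, 20 and 20 kPa, so MAE is 15 kPa while RMSE is about 15.8 kPa. The gap between the two grows as the errors become more uneven.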
Generalization capacity is also a key point in the assessment of a model. For this purpose, five runs of a 10-fold cross-validation [44] were implemented in this work. A k-fold cross-validation divides the data into k folds (or subsets, where k is a positive integer) and trains the model k times, each time leaving a different fold out of the training data and using it instead as a validation set, so that every record is used for validation exactly once. In the end, the performance metric is averaged across all k tests. Lastly, once the best parameter combination has been found, the model is retrained on the full data.
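The validation scheme can be sketched as follows. This is a simplified illustration under stated assumptions, not the rminer procedure used in this work: a one-variable least-squares line stands in for the ML algorithms, the data are synthetic, and all function names are hypothetical.

```python
import random
import statistics

def k_fold_indices(n, k, rng):
    """Shuffle the record indices and deal them into k disjoint folds."""
    indices = list(range(n))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]

def fit_line(xs, ys):
    """Least-squares line y = a + b*x (stand-in for any learner)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b

def repeated_cv_mae(xs, ys, k=10, runs=5, seed=0):
    """Return one cross-validated MAE per run (each run reshuffles the folds)."""
    rng = random.Random(seed)
    run_scores = []
    for _ in range(runs):
        fold_scores = []
        for held_out in k_fold_indices(len(xs), k, rng):
            held = set(held_out)
            train = [i for i in range(len(xs)) if i not in held]
            a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
            errors = [abs(ys[i] - (a + b * xs[i])) for i in held_out]
            fold_scores.append(statistics.fmean(errors))
        run_scores.append(statistics.fmean(fold_scores))
    return run_scores

# Synthetic data: a linear trend plus a repeating deterministic disturbance.
xs = [75.0 + 5.0 * i for i in range(50)]
ys = [100.0 + 4.0 * x + 20.0 * ((i % 5) - 2) for i, x in enumerate(xs)]
scores = repeated_cv_mae(xs, ys)
print(scores)
```

Each of the five values is the average over ten held-out folds. Reporting their mean together with a confidence interval summarizes how stable the estimate is across different random partitions.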
From an engineering point of view, the interpretability of a model is a key aspect to take into account. Because most ML algorithms are highly complex, namely SVMs and ANNs, which rely on complex statistical analysis and are frequently referred to as "black boxes", it is fundamental to find a way to "open" such models in order to understand what they have learnt. With this purpose, Cortez and Embrechts [46] proposed a visualization approach based on sensitivity analysis (SA), which is used in this work. SA is a simple method that is applied after the training phase and measures the model responses when a given input is changed, allowing the quantification of the relative importance of each attribute as well as its average effect on the target variable. In particular, the Global Sensitivity Analysis (GSA) method [46] was applied, which is able to detect interactions among input variables. This is achieved by varying F inputs simultaneously. Each of these inputs is varied through its range with L levels, while the remaining inputs are fixed to a given baseline value. In this work, the average value of each input variable was adopted as the baseline and L was set to 12, which provides a good level of detail with a reasonable computational effort.
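A minimal sketch of how such sensitivity responses can be generated is given below for the simplest case, in which a single input is varied at a time (F = 1); the simultaneous variation of several inputs is not covered. The function `sensitivity_responses`, the model `toy_model` and the records are hypothetical and serve only as an illustration.

```python
def sensitivity_responses(predict, records, levels=12):
    """Sweep each input over `levels` evenly spaced values across its
    observed range, holding the other inputs at their mean (baseline).
    Returns one list of model responses per input."""
    columns = list(zip(*records))
    baseline = [sum(c) / len(c) for c in columns]
    responses = []
    for a, column in enumerate(columns):
        low, high = min(column), max(column)
        sweep = []
        for j in range(levels):
            x = list(baseline)  # fresh copy, so only input a is changed
            x[a] = low + (high - low) * j / (levels - 1)
            sweep.append(predict(x))
        responses.append(sweep)
    return responses

def toy_model(x):
    """Hypothetical fitted model: strength driven mostly by the first input."""
    return 10.0 * x[0] + 2.0 * x[1]

# Hypothetical records: [binder content (kg/m3), fiber length (mm)].
records = [[75.0, 12.0], [200.0, 20.0], [500.0, 30.0]]
responses = sensitivity_responses(toy_model, records, levels=12)
print(responses[0][0], responses[0][-1])
```

The spread of each response list indicates how strongly the model reacts to the corresponding input when everything else is kept at its average value.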
With the sensitivity responses of the GSA, different visualization techniques can be computed. In this work, the input importance bar plot is used, which shows the relative influence (R_a) of each input variable in the model (from 0 to 100%). The rationale of GSA is that the higher the changes produced in the output, the more important the input. To measure this effect, the gradient metric (g_a) was first calculated for all inputs. After that, the relative influence was computed according to the following equation:
$$R_a = \frac{g_a}{\sum_{i=1}^{I} g_i} \times 100\ (\%), \qquad g_a = \sum_{j=2}^{L} \frac{\left|\hat{y}_{a,j} - \hat{y}_{a,j-1}\right|}{L-1}$$
where a denotes the input variable under analysis, I is the number of input variables, and ŷ_{a,j} is the sensitivity response for x_{a,j}.
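The computation of the relative influence can be illustrated with a short Python sketch. It assumes that the gradient metric is the mean absolute change between consecutive sensitivity responses and that the relative influence is the share of each gradient in the sum over all inputs; the function names and the response values are hypothetical.

```python
def gradient(responses):
    """Mean absolute change between consecutive sensitivity responses."""
    levels = len(responses)
    steps = sum(abs(responses[j] - responses[j - 1]) for j in range(1, levels))
    return steps / (levels - 1)

def relative_influence(responses_by_input):
    """Share (0 to 100%) of each input in the total gradient."""
    gradients = {name: gradient(r) for name, r in responses_by_input.items()}
    total = sum(gradients.values())
    if total == 0:
        # The model does not react to any input: no influence to distribute.
        return {name: 0.0 for name in gradients}
    return {name: 100.0 * g / total for name, g in gradients.items()}

# Hypothetical responses (kPa) for three inputs swept over four levels.
responses_by_input = {
    "binder": [0.0, 30.0, 60.0, 90.0],   # steady increase, gradient 30
    "fiber": [10.0, 20.0, 10.0, 20.0],   # oscillation, gradient 10
    "clay": [5.0, 5.0, 5.0, 5.0],        # flat, gradient 0
}
influence = relative_influence(responses_by_input)
print(influence)
```

Because absolute differences are used, an input whose effect oscillates still counts as influential, while an input that leaves the response flat receives 0%.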

Database
For training and testing the models, two independent databases were compiled for the UCS and ITS studies, containing 121 and 94 records, respectively. All samples were prepared under a controlled environment in the framework of a laboratory testing program developed at the University of Coimbra. This program aimed to characterize the compressive and tensile behavior of soil-binder-water mixtures reinforced with fibers through unconfined compressive strength tests and indirect tensile strength tests (the latter also called split tensile strength tests). The parameters considered in the study were the soil characteristics (grain size composition, organic matter content, water content, Atterberg limits), the binder content, the curing time and the fiber characteristics (origin, length, content and mechanical properties) [4,6,10-12,47,48].
The soils used in the preparation of the laboratory samples comprise natural soils (collected in the Mondego river lower valley area and in a gravel-silty pit) and laboratory-made soils (natural soils in which a specific property was varied, e.g., organic matter content or sand content), ranging from cohesive to cohesionless and from organic to non-organic soils, with different geotechnical properties. In all cases, the soils were chemically stabilized with Portland cement, the most widely used binder in soil stabilization [49], applied in amounts ranging from 75 to 500 kg/m³. Concerning the fibers, four distinct types were used in an attempt to encompass the types of fibers usually applied in soil stabilization: a natural fiber (sisal) and three artificial fibers, namely a synthetic one (polypropylene) and two metallic ones (Dramix and Wiremix, which differ in the fiber anchorage conditions), characterized by different mechanical properties, namely stiffness and tensile strength. The fiber length ranged from 12 to 30 mm, and the fibers were applied in amounts ranging from 2 to 150 kg/m³. A detailed description of all materials may be found in [4,6,10-12,47,48].
A set of 16 variables was selected as model inputs. Among all variables available in the framework of the study, these 16 features are identified in the literature as influencing the mechanical properties [30,50-53]. Moreover, from a statistical point of view, they were also identified as relevant, as shown in the correlation matrix depicted in Figure 2, which relates to the UCS study. Since the formulations prepared for both studies (UCS and ITS) are similar, the equivalent representation for ITS is also similar and, for that reason, was not included in the paper. In addition, the selection of the variables was also supported by a trial-and-error procedure using the evaluation metrics described above. The 16 input variables considered for the prediction of both mechanical properties of soil-binder-water mixtures reinforced with fibers include the ratio between water and cement contents and the deformability modulus of the fiber (GPa), E_fiber.
Table 1 summarizes the main statistics of all 16 input variables, as well as of the output variables (UCS and ITS), showing the wide range of binder and fiber contents.

Results and Discussion
This section summarizes the main achievements of the study. The results concerning UCS prediction are presented and discussed in Section 3.1, followed by the ITS results in Section 3.2. In both sections, after an overall comparison of the four trained ML algorithms, a more in-depth analysis is presented for the ANN and RF algorithms, which achieved an overall superior performance. For simplicity, the models are named with the ML algorithm (ANN, SVM, RF or MR) followed by a dot and the prediction type (UCS or ITS). For example, ANN.UCS refers to the model developed for UCS prediction based on the ANN algorithm.

The average hyperparameters and fitting times (with the respective 95% confidence intervals according to Student's t-distribution) of the four ML algorithms trained to predict both mechanical properties of soil-binder-water mixtures reinforced with fibers (i.e., ANN, SVM, RF and MR) are summarized in Table 2. The slowest one is RF in UCS modelling, which takes an average of 6 s over the five runs. Excluding MR, SVM was the fastest, taking on average around 2 s over the five runs, followed by ANN with more than 4.7 s. As expected, MR was very fast to model UCS and ITS, taking less than 0.50 s. It should be noted that these computational times refer to the time that each algorithm took to fit the training data. When the proposed models (namely the ANN and RF models) are applied to predict new cases, the time required is very close to zero (the computation is almost instantaneous). In terms of hyperparameters, and particularly for the ANN, the optimized number of neurons in the hidden layer was 6 and 5 for UCS and ITS prediction, respectively.
Table 3 compares the performance of the four ML algorithms in both UCS and ITS prediction of soil-binder-water mixtures reinforced with fibers based on the MAE, RMSE and R² metrics (mean value and respective 95% confidence intervals according to Student's t-distribution). Apart from MR, the other three algorithms present a particularly good and similar performance in the prediction of both mechanical properties. Taking R² as a reference, all three algorithms (ANN, SVM and RF) achieved, on average, a value close to 0.95. A detailed analysis shows that ANN achieved an overall superior performance in the prediction of both mechanical properties (best values in bold in Table 3 as described in Section 2.2), followed by RF and SVM.
As expected, the lowest performance is observed for MR, which evidenced clear difficulties in modelling UCS and ITS. This can be explained by the characteristic non-linear behavior of soil-binder-water mixtures reinforced with fibers, which a linear model cannot capture.
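For reference, the way a mean value and its 95% confidence interval can be obtained from five runs is sketched below in Python. The R² values are hypothetical and are not taken from Table 3; the critical values of Student's t-distribution are the standard two-sided 95% values, hard-coded for a few sample sizes.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom.
T_975 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 9: 2.262}

def t_interval_95(values):
    """Return (mean, half-width) of the 95% confidence interval."""
    n = len(values)
    mean = statistics.fmean(values)
    # Sample standard deviation (n - 1 in the denominator).
    half_width = T_975[n - 1] * statistics.stdev(values) / math.sqrt(n)
    return mean, half_width

# Hypothetical R2 values from five cross-validation runs.
r2_runs = [0.94, 0.95, 0.96, 0.95, 0.95]
mean, half_width = t_interval_95(r2_runs)
print(f"{mean:.3f} +/- {half_width:.3f}")
```

With only five runs there are four degrees of freedom, so the critical value (2.776) is noticeably larger than the 1.96 of the normal distribution, which widens the interval accordingly.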

Unconfined Compressive Strength
Concerning the UCS study, Figure 3 compares the REC curves of all four ML algorithms, confirming the lower performance of MR and the superior response of ANN. In a REC representation, a high performance corresponds to an accuracy of one (y-axis) achieved for an absolute deviation (x-axis) as low as possible. Taking ANN.UCS as a reference, one can observe that it achieved an accuracy close to one for an absolute deviation of 750 kPa. In contrast, for the same absolute deviation, the MR.UCS accuracy is around 25% lower. SVM.UCS and RF.UCS have similar performances, although the former shows a better response for lower absolute deviations.
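The reading of a REC curve can be made concrete with a small Python sketch that computes the accuracy for a set of absolute deviation tolerances. The measured and predicted values are hypothetical and are not taken from the UCS database; the function names are illustrative.

```python
def rec_accuracy(measured, predicted, tolerance):
    """Fraction of records predicted within the given absolute deviation."""
    within = sum(1 for m, p in zip(measured, predicted) if abs(m - p) <= tolerance)
    return within / len(measured)

def rec_curve(measured, predicted, tolerances):
    """Pairs (tolerance, accuracy), i.e., the points of a REC curve."""
    return [(tol, rec_accuracy(measured, predicted, tol)) for tol in tolerances]

# Hypothetical UCS values in kPa, for illustration only.
measured = [500.0, 1200.0, 2500.0, 4000.0]
predicted = [550.0, 1100.0, 2600.0, 3200.0]
curve = rec_curve(measured, predicted, [0.0, 100.0, 750.0, 1000.0])
print(curve)
```

Here three of the four records lie within 100 kPa, so the curve reaches 0.75 early and only climbs to one when the tolerance exceeds the single 800 kPa error. A better model reaches an accuracy of one at a lower tolerance.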

As important as the accuracy of a model is its interpretability, particularly from an engineering point of view. Accordingly, a detailed sensitivity analysis was applied in this study to measure the relative importance of each model attribute and thereby understand what has been learnt by the algorithms and compare it with the empirical knowledge. Figure 5 plots the relative importance of each of the sixteen attributes considered in the UCS prediction of soil-binder-water mixtures reinforced with fibers, according to the four ML algorithms implemented in this study. Taking the ANN.UCS model as a reference, since it achieved the best overall performance as shown above, the four most important variables are the binder dosage (D_Kg/m3 = 13.5%), soil characteristics (ω0 = 12.8%, %Clay = 8.5%) and fiber type (T_fiber = 8.0%). These variables are indeed some of the most important parameters controlling the behavior of soil-binder-water mixtures reinforced with fibers, as observed in some experimental studies [4-9,14,16,19,54-56]. A similar distribution is observed for the SVM.UCS model. Concerning the RF.UCS model, although it achieved the second-best overall performance in UCS prediction, the relative importance it assigns to ω0/aw seems too high (40%). However, it should be noted that previous studies [30] on soil-cement mixtures identified this ratio as one of the most influential variables in the development of mechanical properties.
Appl. Sci. 2021, 11

Indirect Tensile Strength
Following the same procedure adopted in the UCS study, the performance of all four algorithms in ITS prediction was compared based on REC curves, as depicted in Figure 6. As previously discussed and shown in Table 3, the superior performance of the ANN algorithm in ITS prediction and the weak response of the linear approach (MR.ITS model) are also clear here. RF.ITS and SVM.ITS present a very similar response in ITS prediction.
Looking in detail at the ANN.ITS model, around 96% of all records are predicted with an absolute deviation lower than 100 kPa. Moreover, even for a tighter tolerance, such as an absolute deviation of around 50 kPa, ANN.ITS presents an accuracy higher than 85%, showing its good performance.

Figure 7 validates the high performance of both the ANN.ITS (Figure 7a) and RF.ITS (Figure 7b) models in ITS prediction. As shown, particularly for the ANN.ITS model, all predictions are close to the experimental values (diagonal line).
Concerning the interpretability of the models, Figure 8 compares the relative importance of each model attribute. As in the UCS study, with the ANN.ITS model taken as a reference, the binder dosage (D_Kg/m3) was identified as the most relevant variable in ITS prediction, with a relative influence close to 16%. A higher influence of the fibers is also observed, which the ANN.ITS model captured through E_fiber (7.2%), F_Kg/m3 (7.1%) and T_fiber (7.0%), all of which rank among the five most relevant variables. This higher influence of the fibers in ITS prediction, when compared to the UCS study, is in agreement with some empirical studies [10,14,16]. In fact, when the composite material is subjected to indirect tension through a splitting failure mechanism, there is an effective mobilization of the tensile strength of the fibers that cross the vertical failure plane imposed by the ITS test and, consequently, the tensile strength is directly related to the characteristics of the fibers. According to RF.ITS, once again, an influence above 40% is observed for ω0/aw, which demonstrates the coherence of the algorithm.

Conclusions
This work explored four Machine Learning (ML) algorithms to predict the mechanical properties of soil-binder-water mixtures reinforced with fibers. Artificial Neural Networks (ANNs), Support Vector Machines (SVMs), Random Forest (RF) and Multiple Regression (MR), the latter used as a baseline comparison, were implemented to predict the development of the Unconfined Compressive Strength (UCS) and the Indirect Tensile Strength (ITS). The proposed models, supported on representative databases comprising around 100 records each, were able to capture the behavior of both mechanical properties with a promising performance (R² higher than 0.95), particularly those based on ANNs. For that, sixteen variables covering information about the three main components involved in these types of mixtures (i.e., soil, fibers and binder) were considered.
Through a global sensitivity analysis, a deeper understanding of the proposed models was obtained, showing that the binder content is one of the most influential variables in both UCS and ITS prediction. Moreover, it was observed that the type and characteristics of the fibers are more relevant in the ITS study than in the UCS study, which corroborates some experimental findings.
In conclusion, the proposed models can be used as an important tool for design purposes, allowing an accurate estimation of the final properties of soil-binder-water mixtures reinforced with fibers from the information already available, without preparing or testing any sample. Moreover, the advantages of a data-driven approach for exploring complex geotechnical problems were shown once again.
