1. Introduction
The practical applications of concrete depend on its rheological, mechanical, and durability properties, which in turn are affected by multiple factors, including cementitious materials, chemical admixtures, aggregate type and grading, water-to-binder ratio, fibers and other inclusions, and curing conditions (temperature and relative humidity) [1, 2, 3]. Ultra-high-performance concrete (UHPC) has been developed to achieve very high compressive strength along with superior ductility and durability. Its mechanical properties are extremely sensitive to the particle packing density, mixture components, and curing conditions [2, 3, 4, 5]. To produce UHPC with very high compressive strength, high cement content, a low water-to-binder (w/b) ratio, fine powders (quartz, silica fume, etc.), well-graded aggregates, and high-range water-reducing admixtures are deployed to achieve superior particle packing density and the lowest porosity, while assuring adequate flow and consolidation.
Several researchers in recent decades have explored the mechanical properties of UHPC made with diverse ingredients and mixture proportions [1, 2, 3, 4, 5, 6]. In particular, the inclusion of eco-efficient supplementary cementitious materials (SCMs), such as fly ash (FA) and ground granulated blast furnace slag (GGBFS), has attracted extensive attention among researchers and engineers [7, 8, 9, 10]. Despite this abundant research, the effect of the inclusion of SCMs on the compressive strength of UHPC has not yet been analyzed systematically. For instance, Alsalman et al. [11] and Wu et al. [12] observed that the replacement of cement by FA led to increased compressive strength of UHPC, whereas contradictory results were reported by Randl et al. [7]. Moreover, various studies have indicated that partial replacement of cement by GGBFS reduced the compressive strength of UHPC. For instance, Wang et al. [4] reported that the compressive strength of mixtures incorporating GGBFS as partial replacement for Portland cement was reduced by up to 20%. Randl et al. [7], Zhang et al. [8], and Yang et al. [13] also reported reductions in UHPC compressive strength upon using GGBFS as a partial replacement for cement.
Plain UHPC displays an undesirable brittle behavior, which can hamper its use in many engineering applications [1, 14, 15]. Thus, various types of fibers, such as steel and synthetic fibers, have been widely used to improve the ductility and impact resistance of UHPC, among which steel microfibers achieved the most promising performance, increasing flexural and tensile strength and enhancing toughness and impact resistance. Several researchers found that fibers had an insignificant effect on the compressive strength of UHPC, while the degree of cement hydration and particle packing density of the matrix play a more important role in the strength development of UHPC [11, 15, 16, 17]. Such findings underscore the lack of consistent knowledge for predicting the behavior of UHPC incorporating various mixture ingredients, despite the extensive experimental studies in the literature.
Artificial intelligence has proven to be a powerful tool for solving complex engineering problems in various fields. Machine learning (ML) algorithms can predict an output target after being trained on a given dataset. For instance, various engineering properties of composite materials have been modeled using powerful ML models, including artificial neural networks (ANNs), support vector machines (SVMs), tree-based ensembles, deep learning (DL), etc. Ben Chaabene et al. [18] conducted an in-depth review of the application of such ML techniques for predicting the mechanical properties of concrete. Moreover, numerous studies have aimed at predicting the mechanical properties of various types of modern concretes, such as recycled aggregate concrete (RCA) [19, 20, 21, 22], high-performance and ultra-high-performance concrete (HPC and UHPC, respectively) [23, 24, 25, 26, 27], phase change materials-integrated concrete [28], self-healing concrete [29], etc. For instance, Han et al. [24] used an improved random forest algorithm to predict the compressive strength of HPC. They deployed a dataset that included 1030 compressive strength test observations for HPC made of normal cement and cured under normal conditions. Water, cement, GGBFS, FA, fine aggregates, coarse aggregates, and age were the basic input parameters of the dataset, along with five combined variables appended to predict the compressive strength. These combined variables included ratios of w/b, GGBFS-to-water, FA-to-water, coarse aggregate-to-binder, and coarse aggregate-to-fine aggregate. The developed model had a promising performance in predicting HPC compressive strength. It was recommended to use the absolute mass of mixture components as input features for developing predictive models.
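The five combined variables described for the HPC dataset can be illustrated with a minimal sketch. The dictionary keys, the example dosages, and the definition of binder as cement + GGBFS + FA are assumptions made here for illustration; they are not taken from the cited study.

```python
def add_combined_ratios(mix):
    """Return a copy of `mix` with five ratio features appended.

    `mix` maps component names to dosages in kg/m^3. The key names and the
    binder definition (cement + GGBFS + FA) are assumptions of this sketch.
    """
    binder = mix["cement"] + mix["ggbfs"] + mix["fly_ash"]
    out = dict(mix)
    out["w_b"] = mix["water"] / binder                    # water-to-binder
    out["ggbfs_w"] = mix["ggbfs"] / mix["water"]          # GGBFS-to-water
    out["fa_w"] = mix["fly_ash"] / mix["water"]           # FA-to-water
    out["ca_b"] = mix["coarse_agg"] / binder              # coarse aggregate-to-binder
    out["ca_fa"] = mix["coarse_agg"] / mix["fine_agg"]    # coarse-to-fine aggregate
    return out


# Hypothetical mixture, for illustration only.
example = {
    "cement": 400.0, "ggbfs": 100.0, "fly_ash": 0.0, "water": 160.0,
    "coarse_agg": 1000.0, "fine_agg": 800.0, "age": 28,
}
features = add_combined_ratios(example)
```

The original seven inputs are kept alongside the ratios, so a model can be trained on the absolute masses, the ratios, or both.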
The compressive strength of UHPC was modeled using ANN in a recent study by Abuodeh et al. [30]. They used sequential feature selection and neural interpretation diagram techniques to distinguish those mixture components affecting the performance of the ANN model. Accordingly, they compiled a dataset of 110 UHPC mixture designs to predict the 28-day compressive strength. Although they achieved high predictive accuracy, the small size of their dataset, alongside the limited number of mixture components, warrants further effort to collect a more comprehensive dataset to extend the model robustness and generalization capability. The importance of extensive datasets in developing powerful ML models capable of adapting to new, previously unseen data is widely highlighted in the literature. For instance, Marani and Nehdi [28] developed ML models to predict the compressive strength of concrete incorporating phase change materials using 154 data examples. Despite achieving high accuracy, they posited that expanding the dataset should improve the model generalization capability and provide better insights into the materials science aspects of the problem. Therefore, the collection of pertinent and comprehensive experimental data is of great importance in developing ML predictive tools to better understand the non-linear relationship between different mixture components of UHPC and its compressive strength. Moreover, the inclusion of the curing regime, including temperature, relative humidity (RH), and time, can provide valuable insight into the strength development of UHPC over time and under various curing conditions.
Considering various UHPC mixture components and the diverse existing experimental data available in the open literature, developing robust predictive tools for modeling the mechanical properties of UHPC and understanding the complex relationships between its mixture components are desirable. The present study creates novel ML models to predict the compressive strength of UHPC based on an extensive dataset of wide-ranging experimental data retrieved from reliable resources in the open literature. Furthermore, a state-of-the-art data generating technique was deployed, for the very first time, to generate UHPC compressive strength synthetic data points for training the ML models. Synthetic data generation can mitigate the problems associated with the limited availability of pertinent experimental data for in-depth and comprehensive analysis of UHPC mixture design. Accordingly, tabular generative adversarial networks (TGAN) were able to generate plausible data for training robust tree-based ensembles including random forest (RF), extra trees (ET), and gradient boosting (GB) for the estimation of UHPC compressive strength. Subsequent sections elaborate on the data collection, fundamentals of the applied ML models, performance evaluation metrics, and discussion of the results. Fundamentals of the applied methods along with the model development steps are further explained in Section 3. A comprehensive parametric study was also carried out to gain profound insights into the influence of UHPC mixture ingredients on its compressive strength.
2. Data Collection
Creating a comprehensive and reliable dataset is a vital step in developing ML predictive models. For this purpose, an extensive literature review was performed to retrieve data from published research papers. Diverse supplementary cementitious materials (SCMs), fine and ultra-fine aggregates, types of fibers, etc., have been incorporated in UHPC to improve its mechanical and durability properties. Therefore, there are many input features that could be considered for an ML model to forecast the compressive strength of UHPC. Considering the numerous experimental studies that used such materials in UHPC mixture designs, along with several curing regimes, a large dataset comprising various mixture components was initially collected. However, to consolidate the proposed predictive framework, the dataset was narrowed down to UHPC mixtures incorporating the most frequently used ingredients. Additionally, only the temperature (T) and relative humidity (RH) were considered as curing conditions. Thus, a dataset consisting of 912 test observations was constructed to estimate the compressive strength of UHPC. This dataset was further preprocessed to eliminate outliers and data examples with missing input values. After preprocessing, 810 test observations and 15 input features were assigned as the final dataset. All the data were collected from research published in respected forums [4, 5, 6, 7, 8, 11, 12, 13, 16, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49].
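The preprocessing step (removing data examples with missing input values, then removing outliers) can be sketched as follows. The text does not state which outlier criterion was used, so the interquartile-range rule below is an assumption, as are the field names (`C`, `W`, `fc`) and the toy records.

```python
import statistics


def drop_incomplete(records, features):
    """Keep only records in which every listed feature has a value."""
    return [r for r in records if all(r.get(f) is not None for f in features)]


def drop_target_outliers(records, target, k=1.5):
    """Remove records whose target lies outside [Q1 - k*IQR, Q3 + k*IQR].

    The IQR rule is an assumed stand-in; the source does not name its
    outlier detection method.
    """
    values = [r[target] for r in records]
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [r for r in records if low <= r[target] <= high]


# Toy records (hypothetical values): cement C and water W in kg/m^3, strength fc in MPa.
raw = [
    {"C": 800, "W": 160, "fc": 150.0},
    {"C": 750, "W": 150, "fc": 155.0},
    {"C": 900, "W": 170, "fc": 160.0},
    {"C": 850, "W": None, "fc": 158.0},  # missing input value
    {"C": 820, "W": 165, "fc": 152.0},
    {"C": 780, "W": 158, "fc": 157.0},
    {"C": 810, "W": 162, "fc": 400.0},   # implausible strength
]
complete = drop_incomplete(raw, ["C", "W"])
clean = drop_target_outliers(complete, "fc")
```

Incomplete rows are dropped first so that the quartiles are computed only over records that could actually be used for training.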
The following assumptions were made in collecting the data: (i) The dosage (absolute mass) of the mixture components for a unit volume of UHPC was collected; (ii) the physical properties of mixture components such as the density and particle size distribution were not included in the final dataset; (iii) only steel fibers were considered in the data collection, and other types of fibers were discarded; the physical and mechanical properties of steel fibers were not included; and (iv) the curing temperature (T) and relative humidity (RH) were considered as the curing conditions.
The selection of input features was performed considering findings in pertinent experimental studies or previous ML modeling of cementitious composites. For instance, the physical properties of steel fibers such as diameter and length were not included as input features due to their confirmed insignificant effect on compressive strength [14, 15]. Likewise, Abuodeh et al. [30] used the absolute mass of steel fibers alone in their predictive model.
Table 1 presents the variables of the dataset along with their designations. The developed dataset is among the largest available on UHPC mixture designs. Abuodeh et al. [30] used 110 data samples for their ML modeling of UHPC. Qu et al. [50] used 162 data examples on the compressive strength of UHPC. Abellán-García [51] collected 717 data points from the literature along with 210 experimental data points from laboratory testing to construct a dataset with 927 observations. After outlier detection, their final number of data points used for training and testing was reduced to 827. The final dataset used for the model development in the present study is presented in Tables S1 and S2 of the supplementary materials. Table S1 presents the input variables of the dataset as well as their designation and units, while Table S2 reports the final dataset used in this study.
5. Parametric Analysis
The robust predictive performance of the developed ML models, along with the numerous credible data points generated by the TGAN model, motivates a comprehensive parametric analysis to better understand the effects of mixture components on the compressive strength of UHPC. Investigating the effects of various dosages of different mixture ingredients in laboratory experiments is laborious, costly, time-consuming, and associated with a negative environmental footprint. Thus, using robust and well-trained ML models can alleviate such problems and broaden the outlook of UHPC materials science.
Accordingly, several case studies for parametric analysis were designed with respect to the UHPC research trends in most recent years. The replacement of cement with eco-efficient SCMs such as slag (S) and fly ash (FA) has attracted vast attention. Using such SCMs can mitigate the carbon footprint of UHPC production, whilst offering satisfactory mechanical properties. Hence the effect of the replacement of cement with S or FA at mass percentages varying from 0 to 50% was assessed. For this purpose, two control mixture designs along with two case studies were considered, as outlined in Table 6. Each case study was applied on both control mixtures. For each control mixture design, the cement content was taken as 750 kg/m³, and only silica fume was used as the SCM. Moreover, the analysis explored the effect of three different SF contents along with five water-to-cement ratios (W/C) on the compressive strength of UHPC. The main constraint considered in the design of the parametric analysis was having a unit volume for all mixture designs.
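A grid of mixtures for such a parametric study, subject to the unit-volume constraint, might be generated as in the sketch below. Everything numeric here is an assumption for illustration: the densities are typical textbook-style values, the SF contents and W/C ratios are placeholders (the text states only that three and five levels were used), water is fixed relative to the control cement content, and sand is taken as the component that absorbs the remaining volume. Air, fibers, and admixtures are ignored.

```python
from itertools import product

# Assumed particle densities in kg/m^3 (not taken from the paper).
DENSITY = {"cement": 3150.0, "slag": 2900.0, "silica_fume": 2200.0,
           "water": 1000.0, "sand": 2650.0}


def slag_mix(replacement, sf, w_c, cement_control=750.0):
    """Build one mixture (kg/m^3) with part of the cement replaced by slag.

    Replacement is by mass of the control cement content; sand fills the
    volume left over so that the mixture occupies exactly 1 m^3.
    """
    mix = {
        "cement": cement_control * (1.0 - replacement),
        "slag": cement_control * replacement,
        "silica_fume": sf,
        "water": w_c * cement_control,
    }
    used_volume = sum(mass / DENSITY[name] for name, mass in mix.items())
    mix["sand"] = (1.0 - used_volume) * DENSITY["sand"]
    return mix


def total_volume(mix):
    """Absolute volume of a mixture in m^3."""
    return sum(mass / DENSITY[name] for name, mass in mix.items())


# Hypothetical grid: 6 replacement levels x 3 SF contents x 5 W/C ratios.
replacements = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
sf_contents = [100.0, 150.0, 200.0]
w_c_ratios = [0.18, 0.20, 0.22, 0.24, 0.26]
grid = [slag_mix(r, sf, wc)
        for r, sf, wc in product(replacements, sf_contents, w_c_ratios)]
```

Each dictionary in `grid` could then be converted to the feature order expected by a trained model; the same function applies to fly ash by swapping the component name and its density.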
Since all developed ML models demonstrated satisfactory performance, a voting regressor was adopted to predict the compressive strength of UHPC by aggregating predictions of the RFR, ETR, and GBR models. A voting model is an ensemble meta-estimator that fits several base regression models, each on the entire training dataset, which in the present study was the TGAN-generated synthetic data. It then averages the individual estimates to yield a final predicted target [64]. Ultimately, the 28-day compressive strength of the mixtures hypothetically cured under a standard condition (T = 23 °C and RH = 100%) was predicted using the voting regressor.
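The voting step can be sketched with scikit-learn's `VotingRegressor`, assuming that library is the one intended by the text. The data below are random placeholders rather than UHPC or TGAN-generated data, and the hyperparameters are arbitrary; only the aggregation mechanism is illustrated.

```python
import numpy as np
from sklearn.ensemble import (
    ExtraTreesRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
    VotingRegressor,
)

# Placeholder training data (NOT the study's dataset).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 4))
y = 100 + 50 * X[:, 0] - 30 * X[:, 1] + rng.normal(0, 2, 200)

# Three tree-based base models, each fitted on the full training set.
voter = VotingRegressor(estimators=[
    ("rfr", RandomForestRegressor(n_estimators=50, random_state=0)),
    ("etr", ExtraTreesRegressor(n_estimators=50, random_state=0)),
    ("gbr", GradientBoostingRegressor(random_state=0)),
])
voter.fit(X, y)

# The ensemble prediction is the unweighted mean of the base predictions.
base_preds = np.column_stack([est.predict(X[:5]) for est in voter.estimators_])
vote_pred = voter.predict(X[:5])
```

With no `weights` argument the three models contribute equally, which matches the plain averaging described above.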
5.1. Replacing Cement with Slag
Figure 7 illustrates the influence of different levels of slag partial replacement for cement on the compressive strength of UHPC. In UHPC mixtures with no steel fibers, increasing the slag content slightly decreased the 28-day compressive strength, such that when the slag inclusion was 350 kg/m³, the compressive strength reduction was less than 10%. A similar trend was observed for different SF contents, as well as different W/C ratios considered in this study. Lower W/C ratios and higher SF contents resulted in higher compressive strengths, as expected. On the other hand, when the UHPC mixtures incorporated 2% by volume of steel fibers (equivalent to 156 kg/m³), the compressive strength was generally higher compared to that of mixtures with no steel fibers. Moreover, the replacement of cement with slag at lower dosages (up to 150 kg/m³) slightly improved the compressive strength, while at higher dosages (350 kg/m³) the compressive strength was decreased by less than 10%. In other words, the reduction of compressive strength in mixtures with steel fiber was less than that for mixtures without steel fibers. A similar trend was evidenced regarding various SF contents and W/C ratios, as shown in Figure 7. Overall, the results suggested that the partial replacement of cement with slag maintained desired compressive strength of UHPC mixtures.
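The dosages quoted above can be cross-checked with a short calculation. The steel density of 7800 kg/m³ is not stated in the text; it is the value implied by 156 kg/m³ at 2% by volume and is treated here as an assumption. The replacement percentage is computed against a 750 kg/m³ control cement content.

```python
STEEL_DENSITY = 7800.0  # kg/m^3; assumed, implied by 156 kg/m^3 at 2 vol%


def fiber_mass_dosage(volume_fraction, density=STEEL_DENSITY):
    """Mass of fibers per cubic meter of concrete for a given volume fraction."""
    return volume_fraction * density


def replacement_percent(scm_dosage, control_cement):
    """SCM dosage expressed as a mass percentage of the control cement content."""
    return 100.0 * scm_dosage / control_cement


fibers = fiber_mass_dosage(0.02)               # 2% by volume
slag_share = replacement_percent(350.0, 750.0)  # highest slag dosage discussed
```

Under these assumptions, the 350 kg/m³ slag dosage corresponds to roughly 47% of a 750 kg/m³ control cement content, which lies within the 0 to 50% replacement range considered in the analysis.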
5.2. Replacing Cement with Fly Ash
The effect of FA inclusion as partial replacement for cement on the compressive strength of UHPC mixtures with and without steel fibers is illustrated in Figure 8. The replacement of cement with FA led to insignificant reduction in compressive strength, like the trend observed for slag. However, when using FA with higher SF contents and W/C ratios, the reduction of compressive strength was slightly larger compared to that at lower SF content and W/C ratios. Moreover, in UHPC mixtures incorporating steel fibers, FA partial cement replacement at dosages of up to 200–250 kg/m³ marginally enhanced the compressive strength, whereas FA levels beyond this threshold decreased the compressive strength. Like mixtures without steel fibers, the reduction of compressive strength due to replacement of cement with FA was more evident at higher SF content and W/C ratios. Yet, a high compressive strength of UHPC mixtures was still achievable using high FA dosages. Such findings are in agreement with experimental findings reported in the literature [7, 8, 13, 45]. Thus, performing comprehensive parametric analyses using robust and generalized ML models can be a powerful tool for identifying combined effects of parameters on the compressive strength of UHPC. Owing to the inclusion of the age of specimens at the testing time, the effect of time on the strength development of UHPC mixtures could be simulated as well. For instance, the strength development of UHPC mixtures beyond 90 days was depicted for two control mixture designs having cement contents of 750 kg/m³ and 1000 kg/m³ in Figure 9. It was observed that the models captured the strength development of UHPC mixtures having various silica fume contents over time.
6. Limitations of the Model
Concrete is a highly heterogeneous material, characterized by brittle fracture. Developing predictive models for its mechanical properties based on its fracture process requires thorough understanding of its behavior over a wide range of scales, and quantitative evaluation of multiple parameters governing its micro- and macro-cracking [73]. Several attempts have been made to model the fracture process of concrete using linear-elastic fracture mechanics with the fracture zone surrounded by an elastic region characterized by stress intensity factors (linear) or J integrals (nonlinear). This approach was, however, unable to predict the actual fracture behavior of concrete [74]. Generally, it was found that defining unique critical stress intensity factors or J integrals and R curves was not successful for cementitious materials [74]. Various schemes have thus been developed to model the fracture process zone in concrete using nonlinear fracture models.
For instance, Kurumatani et al. [75] proposed an isotropic damage model for quasi-brittle materials such as concrete. This damage model was claimed to simulate the strain-softening behavior of concrete without mesh-size dependency. While the application of fracture mechanics to concrete garnered great interest, it has not led to reliable and practical models that can be implemented in design codes and industry applications. The common current practice is rather to rely on empirical models based on regression analysis of existing experimental data.
Moreover, several continuum or discrete models have been proposed to simulate the fracture mechanism of concrete, such as the extended finite element method (XFEM), lattice model, etc. [76, 77, 78, 79, 80, 81]. For instance, Smith et al. [78] simulated the behavior of UHPC with a lattice discrete particle model whose parameters were identified from various quasi-static tests, such as single pull-out, uniaxial compression and strain, triaxial compression, etc. Their findings indicated that the micro-splitting failure due to the hooks at fiber ends with the brittleness of the cement matrix should be taken into account in failure mechanism analysis of UHPC [78]. Such findings suggest the viability of machine learning modeling of the fracture mechanism of concrete using extensive experimental data in future work. Furthermore, fracture mechanics models have been mostly applied to simulate tensile or flexural strength of concrete, along with its ductility and impact behavior [82, 83]. Data-driven methods can further complement the findings in such studies considering the wide-ranging experimental data in the literature. It is noteworthy that few studies have investigated the fracture mechanism numerically for UHPC incorporating various supplementary cementitious materials and fibers. Thus, more comprehensive research is needed to bridge the knowledge gap found in pertinent experimental data.
More recently, there has been growing interest in using data driven artificial intelligence models to predict the mechanical properties of concrete. Such methods do not impose a model on the data. The model is rather created through learning algorithms from the structure of the data itself. The more comprehensive the data set, the more successful could be the training of the data driven model, and the more accurate would be the model predictions. Another advantage is that while traditional regression analysis models fail to capture the highly complex and nonlinear relations between the mixture ingredients of materials such as UHPC and its mechanical strength, data driven machine learning algorithms can excel in capturing such a behavior.
Therefore, it should be understood that the model proposed in this study is not a substitute for the meso-scale materials science understanding of concrete, nor does it try to capture the fracture behavior of the material. The model simply learns the relationship between the concrete mixture ingredients and its mechanical strength from existing data examples. When the learning is effective, the model can generalize its predictions to new data examples never presented to the model before. Such a performance is demonstrated in this paper on a large set of experimental data examples. However, if the new data example is outside the scope of the training of the model, it will likely not yield accurate prediction. Moreover, the dataset used in this study does not include data specific to the ductility/brittleness of UHPC mixtures. The compressive strength of UHPC was the only experimental parameter modeled. The analysis of the tensile and flexural strengths along with the ductility of UHPC with respect to its mixture ingredients can be the objective of future work.
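One simple way to flag a new mixture that falls outside the scope of the training data is a per-feature range check, sketched below. This is an illustrative guard rather than part of the proposed model; the feature names and values are hypothetical.

```python
def feature_ranges(training_rows):
    """Return {feature: (min, max)} over a list of training records."""
    keys = training_rows[0].keys()
    return {k: (min(r[k] for r in training_rows),
                max(r[k] for r in training_rows)) for k in keys}


def out_of_range_features(sample, ranges):
    """List the features of `sample` that lie outside the training range."""
    return [k for k, (low, high) in ranges.items()
            if not low <= sample[k] <= high]


# Hypothetical training records: cement C, water W, silica fume SF in kg/m^3.
train = [
    {"C": 700, "W": 150, "SF": 100},
    {"C": 1000, "W": 200, "SF": 250},
    {"C": 850, "W": 170, "SF": 0},
]
ranges = feature_ranges(train)
inside = out_of_range_features({"C": 800, "W": 160, "SF": 50}, ranges)
outside = out_of_range_features({"C": 1200, "W": 160, "SF": 50}, ranges)
```

A check of this kind only catches extrapolation in individual features. A mixture whose values are each within range but whose combination never occurred in training would pass it, so an empty result does not by itself guarantee an accurate prediction.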