Article

Study on Influence of Range of Data in Concrete Compressive Strength with Respect to the Accuracy of Machine Learning with Linear Regression

1 Department of Architectural Engineering, Kyonggi University, Suwon 16227, Korea
2 Department of Architectural Engineering & Urban Engineering, Jeonbuk National University, Jeonju 54896, Korea
* Author to whom correspondence should be addressed.
Appl. Sci. 2021, 11(9), 3866; https://doi.org/10.3390/app11093866
Submission received: 6 April 2021 / Revised: 17 April 2021 / Accepted: 20 April 2021 / Published: 24 April 2021
(This article belongs to the Special Issue Sustainability and Performance of Advanced Construction Materials)

Abstract

This study aims to predict the compressive strength of concrete using a machine-learning algorithm based on linear regression analysis and to evaluate its accuracy. The open-source software library TensorFlow was used to develop the machine-learning algorithm. In the machine-learning algorithm, a total of seven variables were set: water, cement, fly ash, blast furnace slag, sand, coarse aggregate, and coarse aggregate size. A total of 4279 concrete mixtures with measured compressive strengths were employed to train and test the machine-learning algorithm. Of these, 70% were used for training, and 30% were utilized for verification. For verification, three cases were examined: the algorithm was trained using all the data (Case-1), the algorithm was trained with the same number of training data in each strength range (Case-2), and the algorithm was trained separately on a subcase for each strength range (Case-3). The results indicated that the error percentages of Case-1 and Case-2 did not differ significantly, whereas the error percentage of Case-3 was far smaller than those of Case-1 and Case-2. Therefore, it was concluded that the range of the training dataset of the concrete compressive strength is as important as the amount of training data for accurately predicting the concrete compressive strength using the machine-learning algorithm.

1. Introduction

Concrete is an artificial composite of various materials, including water, cement, sand, and coarse aggregates, and its mechanical properties depend on the amounts of these materials. Among the mechanical properties of concrete, the most important is its compressive strength, and numerous studies have investigated the relationship between the mixing amounts of the materials and the compressive strength. However, accurate prediction of the compressive strength remains difficult, all the more so because various chemical and mineral admixtures have been proposed in recent years for improving the performance of concrete. The properties of the materials mixed into the concrete differ depending on the production area and production method, which affects the final compressive strength of the concrete. In addition to the mixed ingredients, the moisture content of the aggregate and the curing conditions affect the concrete compressive strength. Therefore, it is difficult to accurately predict the compressive strength of concrete, and concrete mixtures are designed according to experience.
In many cases, the compressive design strength resulting from the mixture design of concrete based on experience exhibits a high error relative to the measured concrete compressive strength. Therefore, in recent years, researchers have attempted to predict the compressive strength according to the mixture using a machine-learning algorithm (hereafter, MLA).
Ahmad et al. [1] utilized individual and ensemble machine-learning algorithms to predict the compressive strength of concrete containing fly ash. Among the ensemble algorithms, the bagging method was used. An accurate prediction was achieved using the bagging method with 20 submodels and a decision tree. Chopra et al. [2] predicted the concrete compressive strength at 28, 56, and 91 days. An artificial neural network (ANN) model based on a small amount of data, i.e., a total of 76 data points, was developed, and the concrete compressive strength was predicted through Levenberg–Marquardt training. Feng et al. [3] predicted the compressive strength of concrete using the boosting method, a machine-learning technique that combines weak learners, each with limited prediction accuracy, into a strong learner with good prediction accuracy. The machine learning was conducted with 1030 data points, and the algorithm was verified with 103 data points.
Nguyen et al. [4] predicted the compressive strength of high-strength concrete using four prediction algorithms: support vector regression (SVR), multilayer perceptron (MLP), gradient boosting regressor (GBR), and extreme gradient boosting (XGBoost). A total of 1133 data points for the concrete compressive strength were used for the machine learning, and hyperparameter tuning was conducted to increase the accuracy of the algorithms. DeRousseau et al. [5] predicted the compressive strength through various machine-learning techniques, including a support vector machine (SVM), a decision tree-based model, linear regression, multivariate polynomial regression, kernelized regression methods, and a regression tree, based on 1681 field and laboratory concrete data points, and performed a comparative study on the techniques based on the predicted values.
Kandiri et al. [6] established an ANN model using the multiobjective salp swarm algorithm (MOSSA) and the M5P model tree algorithm based on 624 data points and predicted the compressive strength of concrete containing blast furnace slag. This model exhibited a small error percentage, with mean absolute percentage errors (MAPEs) of 12.5% and 7.25%. Mohammed et al. [7] established five machine-learning models using linear regression, nonlinear regression, multi-logistic regression (MLR), an M5P tree, and an ANN based on 450 data points for predicting the compressive strength of concrete containing a high volume of fly ash (HVFA) and performed a comparative study on the techniques based on the predicted values. Golafshani et al. [8] established an artificial intelligence (AI) model that grafted the gray wolf optimizer (GWO) and classical optimization algorithms (COAs) onto an ANN and an adaptive neuro-fuzzy inference system (ANFIS). To predict the compressive strength of normal concrete and high-strength concrete, 2817 data points were utilized. Ahmadi-Nedushan [9] predicted the compressive strength of high-strength concrete using the k-nearest neighbor algorithm trained with 104 data points. This model was compared with the results of regression neural network, stepwise regression, and modular neural network models. Behnood et al. [10] predicted the compressive strength of normal concrete and high-strength concrete using the M5P model tree algorithm trained with 1912 data points. This algorithm was compared with the results of other machine-learning techniques, such as ANNs, classification and regression trees, and ANFISs.
Mohammad et al. [11] studied the important factors for the strength, stiffness, and drift ratio of steel plate shear walls and reinforced concrete shear walls utilizing meta-models developed with an ANN trained with 4300 data points. Roshani et al. [12] predicted two-phase flows independent of the oil pipeline's scale layer thickness based on 162 cases. Regime identification was performed using a support vector machine (SVM), and the void fraction was predicted using a multilayer perceptron with the Levenberg–Marquardt algorithm (MLP-LM). Roshani et al. [13] determined the type and amount of four different petroleum by-products using the gamma attenuation technique combined with an ANN. Fuqua et al. [14] performed control chart pattern recognition (CCPR) employing a cost-sensitive convolutional neural network (CSCNN) trained with 7194 data points. Roshani et al. [15] predicted the gas–oil–water volume fractions of a three-phase flow using the group method of data handling (GMDH), a neural network trained with 108 data points. Anyaoha et al. [16] predicted the compressive strength of concrete using boosting smooth transition regression trees (BooST) based on 2456 data points. Compared with other techniques (multilayer perceptron, support vector machine, etc.), BooST exhibited good performance in complex model analysis. Al-Shamiri et al. [17] predicted the compressive strength of high-strength concrete using an extreme learning machine (ELM), a new method for artificial neural networks (ANNs), trained with 324 data points. Ganguly et al. [18] introduced a convolutional neural network (CNN) topology using wavelet kernels to detect and identify single or multiple partial discharges (PDs).
This study aims to predict the compressive strength of low-to-high-strength concrete using an MLA based on linear regression and to evaluate the accuracy of the MLA when it is trained with different ranges and/or amounts of data. The open-source library TensorFlow, a representative machine-learning library, was used to develop an algorithm for predicting the compressive strength of concrete. For training and testing the MLA, 4279 data points were prepared, which is more than in the previous studies on concrete compressive strength reviewed above. Among them, 2991 data points were employed for model training, and 1288 data points were used to test the algorithm; the measured compressive strength in the data ranged from 7 to 100 MPa. First, the errors of the predicted values obtained from the MLA trained with all the training data (2991 ea.) were examined (Case-1). Second, it was investigated how the predicted values were affected when the number of training data points in each compressive-strength range was the same (1080 = 180 × 6 ranges) (Case-2). Finally, the 2991 data points were divided into six subcases according to the compressive strength of concrete, and the predicted results of the MLA trained with each subcase were investigated (Case-3).

2. Machine Learning Algorithm

2.1. Open-Source AI Development Framework TensorFlow

Representative open-source AI frameworks include PyTorch, Theano, TensorFlow, and Keras. Among them, TensorFlow is widely used in the AI field owing to its various advantages. One advantage of TensorFlow is that it can use not only the CPU, which processes data sequentially, but also the GPU, which processes many operations in parallel; hence, its processing speed is high. Moreover, TensorFlow is a Python-based library and can be used together with other Python libraries such as NumPy, SciPy, and Requests, allowing easy data extraction and arrangement. Furthermore, because TensorFlow provides various functions, including tf.matmul, tf.split, and tf.tile, there is no need to pay attention to details such as the process of reentering the output of a node in the algorithm implementation. Owing to these advantages, the machine-learning model in this study was developed using TensorFlow.

2.2. Model Composition

Machine learning is an AI technique that learns from a related training dataset to obtain the desired results. Among the various learning methods of machine learning, this study selected the method that identifies the association or regularity between the variables and the results of the training dataset and then predicts a specific result when arbitrary values of the variables are entered.
Linear regression is the most basic method for determining a result. It seeks the most reasonable straight line by reducing the error of a hypothetical straight line fitted to numerous variables. To find the optimal straight line, the gradient descent algorithm is generally used (Figure 1). In the gradient descent method, the parameters of the hypothetical line are moved in the direction in which the absolute value of the slope of the cost becomes smaller. The slope at the current value is calculated, and the value is moved to the left if the slope is positive and to the right if the slope is negative; this calculation is repeated so that the slope approaches 0. Representative Python modules used to build linear regression models with the gradient descent method are TensorFlow, NumPy, and Pandas; TensorFlow was selected for this study. The linear regression model built using TensorFlow is outlined in Equations (1)–(4). Equation (1) is the linear regression model, a linear equation in which y is the dependent variable, a represents the weight, x is the independent variable, and b represents the bias. Equation (2) quantifies the difference between the y value obtained from Equation (1) and the measured value and is used to decide whether to re-train the linear regression model. As the value of Equation (2) converges toward 0, the accuracy increases. When it is decided based on Equation (2) to repeat the learning, the a and b values must be updated. The updates are determined by Equations (3) and (4), respectively. Therefore, Equations (1)–(4) are applied repeatedly until the value of Equation (2) converges to 0. Alternatively, users can specify the number of repetitions rather than a convergence value.
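$$y = a_i x_i + b, \tag{1}$$
$$\mathrm{Cost}(a, b) = \frac{1}{n} \sum_{i}^{n} \left( a_i x_i + b - w_i \right)^2, \tag{2}$$
$$a\ \mathrm{Gradient} = \frac{\partial\, \mathrm{Cost}(a, b)}{\partial a}, \tag{3}$$
$$b\ \mathrm{Gradient} = \frac{\partial\, \mathrm{Cost}(a, b)}{\partial b}, \tag{4}$$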
where y is the dependent variable, a is the weight, b is the bias, x is the independent variable, and w is the actual (measured) value.
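The update loop described by Equations (1)–(4) can be sketched in a few lines. The following is a minimal pure-Python illustration, not the authors' TensorFlow implementation. It assumes that Equation (1) is read as a weighted sum over all input variables, and the learning rate, the number of steps, and the toy data are hypothetical choices made only for this example.

```python
from typing import List, Sequence, Tuple


def predict(a: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Equation (1), read here as a weighted sum over the input variables."""
    return sum(a_j * x_j for a_j, x_j in zip(a, x)) + b


def cost(a: Sequence[float], b: float,
         xs: Sequence[Sequence[float]], ws: Sequence[float]) -> float:
    """Equation (2): mean squared difference between prediction and actual value w."""
    return sum((predict(a, b, x) - w) ** 2 for x, w in zip(xs, ws)) / len(xs)


def gradient_descent(xs: Sequence[Sequence[float]], ws: Sequence[float],
                     learning_rate: float = 0.05,
                     steps: int = 20000) -> Tuple[List[float], float]:
    """Repeat the updates of Equations (3) and (4) for a fixed number of steps."""
    n, k = len(xs), len(xs[0])
    a = [0.0] * k
    b = 0.0
    for _ in range(steps):
        residuals = [predict(a, b, x) - w for x, w in zip(xs, ws)]
        # Equation (3): partial derivative of the cost with respect to each weight.
        grad_a = [2.0 / n * sum(r * x[j] for r, x in zip(residuals, xs))
                  for j in range(k)]
        # Equation (4): partial derivative of the cost with respect to the bias.
        grad_b = 2.0 / n * sum(residuals)
        # Move against the slope: left if it is positive, right if it is negative.
        a = [a_j - learning_rate * g for a_j, g in zip(a, grad_a)]
        b -= learning_rate * grad_b
    return a, b


# Hypothetical toy data generated from w = 2*x1 + 3*x2 + 1 (not concrete data).
xs = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
ws = [1.0, 3.0, 4.0, 6.0, 8.0]
a, b = gradient_descent(xs, ws)
print(a, b, cost(a, b, xs, ws))
```

With real mixture data, whose variables have very different magnitudes (for example, kg/m³ versus mm), the inputs would normally be scaled before a fixed learning rate is used; the text does not state whether this was done.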

2.3. Application

A database of concrete mixtures and the measured compressive strength of concrete (fc,meas) was constructed, which corresponds to the input stage of Figure 2. Concrete mixtures are normally designed with many variables. The database then passed through the feature-extraction stage, in which the data were classified by variable: water, cement, sand, coarse aggregate, size of coarse aggregate, fly ash, and blast furnace slag (GGBS). Accordingly, xi in Equation (1) was designated as a total of seven variables: x1 represents water, x2 represents cement, x3 represents sand, x4 represents coarse aggregate, x5 represents the size of the coarse aggregate, x6 represents fly ash, and x7 represents GGBS. Moreover, the w value of Equation (2) is fc,meas. Next, the learning stage was conducted, i.e., the linear regression model for predicting the compressive strength was constructed. Finally, repetitive machine learning with the linear algorithm was conducted to obtain the optimal result through the gradient descent method.

2.4. Database of Concrete Mixtures

For training and testing the MLA, concrete mixtures and experimental data for the concrete compressive strength were needed. In this study, 4279 data points suitable for training and testing the algorithm were selected from the data presented by Yang et al. [19]. The fc,meas of the data ranged from 7 MPa to 100 MPa, and the data were classified into the following ranges: 7–20 MPa, 20–30 MPa, 30–40 MPa, 40–60 MPa, 60–80 MPa, and 80–100 MPa. Furthermore, they were classified according to the binder type: ordinary Portland cement (OPC), OPC + fly ash (FA), OPC + blast furnace slag (GGBS), and OPC + FA + GGBS. The type of binder, the compressive-strength ranges, and the maximum and minimum values of each ingredient are presented in Table 1. Of the classified data, 70% were used as the training dataset, and the other 30% were utilized for verifying the accuracy of the MLA.
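The classification into strength ranges and the 70/30 split can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: the text does not say how boundary values such as exactly 20 MPa were assigned or whether the split was made within each range, so the lower-inclusive boundaries, the per-range split, the record layout (a dictionary with an `fc_meas` key), and the random seed are all hypothetical.

```python
import random
from typing import Dict, List, Optional, Tuple

# Strength ranges in MPa as listed in the text.
RANGES: List[Tuple[int, int]] = [(7, 20), (20, 30), (30, 40),
                                 (40, 60), (60, 80), (80, 100)]


def strength_range(fc_meas: float) -> Optional[Tuple[int, int]]:
    """Return the range containing fc_meas, or None outside 7-100 MPa.

    Assumption: a boundary value belongs to the upper range, except that
    100 MPa itself is kept in the last range.
    """
    for low, high in RANGES:
        upper_ok = fc_meas <= high if high == RANGES[-1][1] else fc_meas < high
        if low <= fc_meas and upper_ok:
            return (low, high)
    return None


def split_70_30(records: List[Dict[str, float]],
                seed: int = 0) -> Tuple[List[Dict[str, float]], List[Dict[str, float]]]:
    """Shuffle the records of each range and keep 70% of them for training."""
    rng = random.Random(seed)
    by_range: Dict[Tuple[int, int], List[Dict[str, float]]] = {}
    for rec in records:
        key = strength_range(rec["fc_meas"])
        if key is not None:
            by_range.setdefault(key, []).append(rec)
    train, test = [], []
    for key in RANGES:
        group = by_range.get(key, [])
        rng.shuffle(group)
        cut = (7 * len(group) + 5) // 10  # 70%, rounded to the nearest integer
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test


# Hypothetical records: ten evenly spaced strengths in each range.
records = [{"fc_meas": low + (high - low) * i / 10}
           for low, high in RANGES for i in range(10)]
train, test = split_70_30(records)
print(len(train), len(test))
```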

3. Results

3.1. Evaluation Method

To evaluate the agreement between the predicted value obtained through the MLA and the measured value, along with the MLA error, the coefficient of variation (CV), root-mean-square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) were used. The CV is obtained by dividing the standard deviation by the mean and allows datasets with different units of measure to be compared. The RMSE is an objective error index for the difference between the model-predicted value and the measured value. The MAE, i.e., the mean of the absolute differences between the predicted and measured values, indicates the accuracy (reliability) of the model. The MAPE supplements the MAE by indicating how large the error is relative to the measured value.
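$$\mathrm{CV} = \frac{\sigma}{m}, \tag{5}$$
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \left( f_{c,pred} - f_{c,meas} \right)^2}{n}}, \tag{6}$$
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| f_{c,pred} - f_{c,meas} \right|, \tag{7}$$
$$\mathrm{MAPE}\,(\%) = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{f_{c,pred} - f_{c,meas}}{f_{c,meas}} \right| \times 100, \tag{8}$$
where σ is the standard deviation, m is the mean, n is the number of data points, and fc,meas and fc,pred are the measured and predicted compressive strengths of concrete, respectively.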
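As a worked illustration of the four indices, the sketch below computes them for three made-up pairs of measured and predicted strengths. The numbers are hypothetical and are not taken from the database, and the use of the population standard deviation for the CV is an assumption, since the text does not say which form was used.

```python
import math
from statistics import mean, pstdev
from typing import Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """CV: standard deviation divided by the mean (population form assumed)."""
    return pstdev(values) / mean(values)


def rmse(pred: Sequence[float], meas: Sequence[float]) -> float:
    """Root-mean-square error between predicted and measured values."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(pred, meas)) / len(meas))


def mae(pred: Sequence[float], meas: Sequence[float]) -> float:
    """Mean absolute error between predicted and measured values."""
    return sum(abs(p - q) for p, q in zip(pred, meas)) / len(meas)


def mape(pred: Sequence[float], meas: Sequence[float]) -> float:
    """Mean absolute percentage error, relative to the measured values."""
    return sum(abs((p - q) / q) for p, q in zip(pred, meas)) / len(meas) * 100.0


# Hypothetical strengths in MPa, chosen only to make the arithmetic easy to follow.
fc_meas = [20.0, 40.0, 60.0]
fc_pred = [25.0, 40.0, 54.0]

# Ratio of measured to predicted strength, as used for the mean and CV in Section 3.
gamma = [q / p for p, q in zip(fc_pred, fc_meas)]

print(rmse(fc_pred, fc_meas), mae(fc_pred, fc_meas), mape(fc_pred, fc_meas))
print(mean(gamma), coefficient_of_variation(gamma))
```

Here the absolute errors are 5, 0, and 6 MPa, so the MAE is 11/3 MPa, while the MAPE weights the 5 MPa error at 20 MPa (25%) more heavily than the 6 MPa error at 60 MPa (10%).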

3.2. Test of MLA Trained with All Training Dataset (Case-1)

After training the MLA using the 2991 training data points, the algorithm was tested with the 1288 testing data points (Case-1). The verification results were summarized using the evaluation indices introduced in Section 3.1 and are presented in Table 2. Figure 3 shows the relationship between the ratio of the measured value to the predicted value (fc,meas/fc,pred, hereinafter γ) and the measured value. The mean (m) and CV of γ for all the data were 1.00 and 0.28, respectively. However, as shown in the graph, there was a linear relationship in which γ increased with fc,meas. To analyze this tendency in detail, the results were classified into the different compressive-strength ranges, and the m, CV, RMSE, MAE, and MAPE of each range are presented in Table 2. The m and CV of γ and the RMSE, MAE, and MAPE in the range of 7–20 MPa were 0.70, 0.22, 10.23 MPa, 8.76 MPa, and 51.0%, respectively. m increased with the fc,meas range, and the m of γ at 30–40 MPa was 0.96, which was the closest to 1, followed by the m of γ in the 40–60 MPa range (m = 1.09). The RMSE, MAE, and MAPE of the 30–40 MPa range were 8.86 MPa, 7.58 MPa, and 20.96%, respectively. Among the different strength ranges, this range had the smallest RMSE and MAE. The RMSE, MAE, and MAPE of the 40–60 MPa range were 9.54 MPa, 7.99 MPa, and 16.52%, respectively; this range had the best MAPE among the different fc,meas ranges. These two ranges had the highest accuracy because 51% of the training data fell within them. Based only on the analysis of Case-1, the results indicate that, in the regression analysis, the algorithm estimates the weights with priority given to the range having the largest number of training data.
Figure 4 presents the normal distribution of γ based on the m and σ calculated for the MLA trained with all the training data (Case-1). As shown, the frequency increased as γ approached 1, indicating that there were many cases in which the error between the measured and predicted values was small. The γ-value range of the 95% confidence interval was 0.45–1.55. This suggests that if the MLA is trained using all the training data, 95% of the results will contain predicted values with an error rate of up to approximately 55%. The γ-value range of the 90% confidence interval was 0.53–1.47, corresponding to an error rate of approximately 47%. The γ-value range of the 80% confidence interval was 0.63–1.37, corresponding to an error rate of approximately 37%. Therefore, if the MLA is trained using training data covering a wide range, the accuracy and reliability of the prediction can be reduced.
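The intervals quoted above follow from the normal-distribution assumption as m ± zσ. The short sketch below reproduces that arithmetic from the rounded Case-1 values m = 1.00 and σ = 0.28; the function name `gamma_interval` is introduced here only for illustration.

```python
from statistics import NormalDist
from typing import Tuple


def gamma_interval(m: float, sigma: float, confidence: float) -> Tuple[float, float]:
    """Two-sided interval m ± z·sigma for a normally distributed ratio γ."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return m - z * sigma, m + z * sigma


for confidence in (0.95, 0.90, 0.80):
    low, high = gamma_interval(1.00, 0.28, confidence)
    # The half-width of the interval is what the text calls the error rate.
    print(f"{confidence:.0%}: {low:.2f}-{high:.2f}, error rate ~{(high - low) / 2:.0%}")
```

For the 95% level this gives 0.45–1.55, as in the text. For the 90% and 80% levels the rounded inputs give bounds that differ from the quoted ones by about 0.01, which is presumably a rounding effect in m and σ.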

3.3. Test of MLA Trained with the Same Number of Data in Each fc,meas Range (Case-2)

When all the training data were used, the data in the 30–60 MPa range accounted for 51% of the total and were thus considerably concentrated. Therefore, it was examined whether having a large amount of training data in a specific compressive-strength range affected the accuracy of the MLA. For comparison with Case-1, the error rate of the MLA was investigated when the number of training data for each fc,meas range was the same (Case-2). For this, 230 data points were randomly selected for each range of fc,meas, giving a total of 1380 (=230 × 6) data points. Among them, 1080 (=180 × 6) were used for training, and 300 (=50 × 6) were used for validating the accuracy. The verification results of Case-2 are presented in Table 3 and Figure 5. The CV, RMSE, MAE, and MAPE of Case-2 were 0.34, 14.41 MPa, 11.42 MPa, and 26.85%, respectively; i.e., the error indices increased slightly compared with Case-1. Regarding the results for each fc,meas range, the CV of Case-2 was larger than that of Case-1 for all the ranges; i.e., the error was larger. The other indices of the different fc,meas ranges were also larger than those of Case-1 in most cases. The γ-value range of the 95% confidence interval was 0.35–1.80, corresponding to an error rate of approximately 72%. The γ-value range of the 90% confidence interval was 0.41–1.59, corresponding to an error rate of approximately 59%. The γ-value range of the 80% confidence interval was 0.52–1.48, corresponding to an error rate of approximately 48%. The range of the concrete compressive strength data was as wide as in Case-1, but the number of data points was relatively small; hence, it is presumed that this is why the error rate of Case-2 was higher than that of Case-1.
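The Case-2 sampling scheme (230 randomly selected data points per range, of which 180 are used for training and 50 for validation) can be sketched as follows. The data layout, the function name `balanced_sample`, and the random seed are hypothetical; the per-range totals used in the usage example are the sums of the "Data" column of Table 1.

```python
import random
from typing import Dict, List, Tuple

Range = Tuple[int, int]


def balanced_sample(by_range: Dict[Range, List[int]], per_range: int = 230,
                    n_train: int = 180, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Draw the same number of items from every range, then split train/test."""
    rng = random.Random(seed)
    train: List[int] = []
    test: List[int] = []
    for key, group in by_range.items():
        if len(group) < per_range:
            raise ValueError(f"range {key} has only {len(group)} items")
        chosen = rng.sample(group, per_range)  # sampling without replacement
        train.extend(chosen[:n_train])
        test.extend(chosen[n_train:])
    return train, test


# Number of data points per strength range, summed over binder types in Table 1.
counts = {(7, 20): 275, (20, 30): 972, (30, 40): 962,
          (40, 60): 1190, (60, 80): 603, (80, 100): 277}

# Stand-in record identifiers: consecutive integers, grouped by range.
by_range: Dict[Range, List[int]] = {}
next_id = 0
for key, count in counts.items():
    by_range[key] = list(range(next_id, next_id + count))
    next_id += count

train_ids, test_ids = balanced_sample(by_range)
print(len(train_ids), len(test_ids))
```

Because the smallest ranges hold only 275 and 277 data points, 230 per range is close to the largest balanced sample the database allows.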

3.4. Test of MLA Trained with Each Range of fc,meas (Case-3)

Because it was suspected that the wide range of fc,meas in the database affected the MLA, a separate MLA was generated for each fc,meas range (a total of six subcases), and the training and verification were conducted independently for each subcase (Case-3). Table 4 presents the evaluation indices obtained using the evaluation method described in Section 3.1, and Figure 6 presents the relationship between γ and fc,meas in each subcase. The m of the subcases was 0.99–1.04, and the σ was 0.08–0.14. The average values of the CV, RMSE, MAE, and MAPE of the subcases were 0.11, 4.56 MPa, 3.73 MPa, and 8.42%, respectively, which were superior to those of Case-1. The widest range of γ values included in the 90% confidence interval was 0.76–1.24 in Case-3-2 (20–30 MPa), and the narrowest was 0.87–1.13 in Case-3-6 (80–100 MPa). This suggests that if the MLA is trained with a training dataset divided by strength range, 90% of the results will contain predicted values with an error rate of at most 24% in the least accurate subcase and at most 13% in the most accurate subcase. Therefore, if the MLA is trained using a training dataset with a specific fc,meas range related to the desired result, the prediction accuracy and reliability can be enhanced.
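The mechanics of Case-3, one linear model per strength range, can be illustrated with a small synthetic example. Everything in this sketch is an assumption made for illustration: it uses a single hypothetical input variable x, a made-up curved relationship fc = x², and a closed-form least-squares line instead of the seven-variable gradient-descent model, and it evaluates the error on the fitting data themselves. It therefore does not reproduce the values in Table 4; it only shows why range-specific lines can follow a nonlinear trend more closely than one global line.

```python
from typing import Dict, List, Tuple

Range = Tuple[int, int]
Sample = Tuple[float, float]  # (x, fc_meas) with one hypothetical input variable x
Line = Tuple[float, float]    # (slope a, intercept b)

RANGES: List[Range] = [(7, 20), (20, 30), (30, 40), (40, 60), (60, 80), (80, 100)]


def fit_line(samples: List[Sample]) -> Line:
    """Closed-form least-squares line fc = a*x + b for one input variable."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def in_range(fc_meas: float, rng: Range) -> bool:
    return rng[0] <= fc_meas < rng[1]


def fit_subcases(samples: List[Sample]) -> Dict[Range, Line]:
    """Fit one line per strength range, as in the six subcases of Case-3."""
    models: Dict[Range, Line] = {}
    for rng in RANGES:
        subset = [s for s in samples if in_range(s[1], rng)]
        if len(subset) >= 2:
            models[rng] = fit_line(subset)
    return models


def mae_single(samples: List[Sample], line: Line) -> float:
    a, b = line
    return sum(abs(a * x + b - y) for x, y in samples) / len(samples)


def mae_subcases(samples: List[Sample], models: Dict[Range, Line]) -> float:
    """Each sample is evaluated with the model of its own measured range."""
    total = 0.0
    for x, y in samples:
        a, b = models[next(r for r in RANGES if in_range(y, r))]
        total += abs(a * x + b - y)
    return total / len(samples)


# Synthetic data: fc = x**2 for x = 3.0 ... 9.9, i.e. 9 to about 98 MPa.
samples = [(3.0 + 0.1 * i, (3.0 + 0.1 * i) ** 2) for i in range(70)]
models = fit_subcases(samples)
mae_all = mae_single(samples, fit_line(samples))
mae_sub = mae_subcases(samples, models)
print(mae_all, mae_sub)
```

Note that selecting a subcase requires knowing the strength range in advance, which matches the text's framing of choosing the training range according to the desired result.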

4. Conclusions

The concrete compressive strength was predicted through an MLA based on a linear regression model constructed using the open-source library TensorFlow. The influence of the fc,meas range of the dataset on the accuracy of the MLA was analyzed. Of the 4279 data points, 70% were used as training data and 30% as testing data, and the MLA was trained with seven mixture variables (water, cement, coarse aggregate, sand, fly ash, blast furnace slag, and aggregate size). The results of verifying the model with the testing data were as follows:
  • The m-values of both Case-1 and Case-3 were close to 1. However, there were differences in the CV, RMSE, MAE, and MAPE, which indicate the error between the measured and predicted values. For the range of 30–40 MPa, the CV, RMSE, MAE, and MAPE of Case-1 were 0.23, 8.86 MPa, 7.58 MPa, and 20.96%, respectively. In contrast, those of Case-3-3 (30–40 MPa) were 0.11, 3.39 MPa, 2.56 MPa, and 7.28%, respectively, and a similar trend was observed in all the strength ranges. These results indicate that the reliability and accuracy of the MLA increase when the MLA is trained with a training dataset in a specific fc,meas range related to the desired result.
  • The linear regression evaluation indices (RMSE, MAE, and MAPE) were large in Case-1 and Case-2, and the m-value of each fc,meas range tended to be far from 1. The CV, RMSE, MAE, and MAPE of Case-1 had maximum values of 0.23, 25 MPa, 22.45 MPa, and 51%, respectively, and those of Case-2 had maximum values of 0.45, 20.3 MPa, 17.83 MPa, and 40.68%, respectively. Based on the normal distribution, the 90% confidence intervals of Case-1 and Case-2 were 0.53–1.47 and 0.41–1.59, respectively. The accuracy of Case-1 was thus better than that of Case-2. This means that, for training datasets covering the same wide range, the accuracy of the MLA was affected by the number of training data rather than by how evenly the data were distributed over the range;
  • For Case-1, Case-2, and Case-3, the graph of γ against fc,meas tended to exhibit a linear increase regardless of the case. The reason for this linear shape is that the linear regression technique is a method for finding a mean value; hence, the weight and bias of the linear regression equation are highly correlated with the mean value, and the predicted values for testing data far from the mean were overestimated or underestimated.
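The last point can be made concrete with a standard property of least-squares fitting. The following is a sketch that assumes the fitted model is the ordinary least-squares solution with a bias term, evaluated on its own fitting data; the symbols $\hat{f}$ (predicted strength), $f$ (measured strength), $\bar{f}$ (mean measured strength), and $R^2$ (coefficient of determination) are introduced here and do not appear in the original analysis.

```latex
% For a least-squares fit with a bias term, the predictions satisfy
\operatorname{Cov}(\hat{f}, f) = \operatorname{Var}(\hat{f}) = R^{2}\,\operatorname{Var}(f),
% so the best linear description of the prediction as a function of the measurement is
\hat{f} \approx \bar{f} + R^{2}\,(f - \bar{f}), \qquad 0 \le R^{2} \le 1 .
% The ratio of measured to predicted strength then behaves as
\gamma = \frac{f}{\hat{f}} \approx \frac{f}{\bar{f} + R^{2}\,(f - \bar{f})}
\;\begin{cases}
< 1, & f < \bar{f} \quad \text{(low strengths are overestimated)},\\
> 1, & f > \bar{f} \quad \text{(high strengths are underestimated)},
\end{cases}
\qquad \text{for } R^{2} < 1 .
```

Under these assumptions, predictions are compressed toward the mean whenever the fit is imperfect, which is consistent with γ rising from below 1 to above 1 as fc,meas increases.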

Author Contributions

Conceptualization, J.-R.P. and S.K.; formal analysis, J.-K.K. and S.K.; funding acquisition, K.-H.Y.; investigation, S.K.; methodology, J.-R.P.; project administration, S.K.; supervision, S.K.; validation, S.K.; writing—original draft, H.-J.L. and S.K.; writing—review and editing, J.-K.K. and K.-H.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the GRRC program of Gyeonggi province (GRRC KGU 2020-B01, Research on Intelligent Industrial Data Analytics).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ahmad, A.; Farooq, F.; Niewiadomski, P.; Ostrowski, K.; Akbar, A.; Aslam, F.; Alyousef, R. Prediction of Compressive Strength of Fly Ash Based Concrete Using Individual and Ensemble Algorithm. Materials 2021, 14, 794.
  2. Chopra, P.; Sharma, R.K.; Kumar, M. Prediction of Compressive Strength of Concrete Using Artificial Neural Network and Genetic Programming. Adv. Mater. Sci. Eng. 2016, 2016, 7648467.
  3. Feng, D.C.; Liu, Z.T.; Wang, X.D.; Chen, Y.; Chang, J.Q.; Wei, D.F.; Jiang, Z.M. Machine Learning-Based Compressive Strength Prediction for Concrete: An Adaptive Boosting Approach. Constr. Build. Mater. 2020, 230, 117000.
  4. Nguyen, H.; Vu, T.; Vo, T.; Thai, H.T. Efficient Machine Learning Models for Prediction of Concrete Strengths. Constr. Build. Mater. 2021, 266, 120950.
  5. DeRousseau, M.A.; Laftchiev, E.; Kasprzyk, J.R.; Rajagopalan, B.; Srubar, W.V., III. A Comparison of Machine Learning Methods for Predicting the Compressive Strength of Field-Placed Concrete. Constr. Build. Mater. 2019, 228, 116661.
  6. Kandiri, A.; Golafshani, E.M.; Behnood, A. Estimation of the Compressive Strength of Concrete Containing Ground Granulated Blast Furnace Slag Using Hybridized Multi-Objective ANN and Salp Swarm Algorithm. Constr. Build. Mater. 2020, 248, 118676.
  7. Mohammed, A.; Rafiq, S.; Sihag, P.; Kurda, R.; Mahmood, W. Soft Computing Techniques: Systematic Multiscale Models to Predict the Compressive Strength of HVFA Concrete Based on Mix Proportions and Curing Times. J. Build. Eng. 2021, 33, 101851.
  8. Golafshani, E.M.; Behnood, A.; Arashpour, M. Predicting the Compressive Strength of Normal and High-Performance Concrete Using ANN and ANFIS Hybridized with Grey Wolf Optimizer. Constr. Build. Mater. 2020, 232, 117266.
  9. Ahmadi-Nedushan, B. An Optimized Instance Based-Learning Algorithm for Estimation of Compressive Strength of Concrete. Eng. Appl. Artif. Intell. 2012, 25, 1073–1081.
  10. Behnood, A.; Behnood, V.; Gharehveran, M.M.; Alyamac, K.E. Prediction of the Compressive Strength of Normal and High-Performance Concretes Using M5P Model Tree Algorithm. Constr. Build. Mater. 2017, 142, 199–207.
  11. Mohammad, J.M.; Mohammad, A.H.A. Developing a Library of Shear Walls Database and the Neural Network Based Predictive Meta-Model. Appl. Sci. 2019, 9, 2562.
  12. Roshani, M.; Phan, G.T.T.; Ali, P.J.M.; Roshani, G.H.; Hanus, R.; Duong, T.; Corniani, E.; Nazemi, E.; Kalmoun, E.M. Evaluation of Flow Pattern Recognition and Void Fraction Measurement in Two Phase Flow Independent of Oil Pipeline’s Scale Layer Thickness. Alex. Eng. J. 2021, 60, 1955–1966.
  13. Roshani, M.; Phan, G.; Faraj, R.H.; Phan, N.H.; Roshani, G.H.; Nazemi, B.; Corniani, E.; Nazemi, E. Proposing a Gamma Radiation Based Intelligent System for Simultaneous Analyzing and Detecting Type and Amount of Petroleum By-Products. Nucl. Eng. Technol. 2021, 53, 1277–1283.
  14. Fuqua, D.; Razzaghi, T. A Cost-Sensitive Convolution Neural Network Learning for Control Chart Pattern Recognition. Expert Syst. Appl. 2020, 150, 113275.
  15. Roshani, M.; Phan, G.; Roshani, G.H.; Hanus, R.; Nazemi, B.; Corniani, E.; Nazemi, E. Combination of X-ray Tube and GMDH Neural Network as a Nondestructive and Potential Technique for Measuring Characteristics of Gas-Oil-Water Three Phase Flows. Measurement 2021, 168, 108427.
  16. Anyaoha, U.; Zaji, A.; Liu, Z. Soft Computing in Estimating the Compressive Strength for High-Performance Concrete Via Concrete Composition Appraisal. Constr. Build. Mater. 2020, 257, 119472.
  17. Al-Shamiri, A.K.; Kim, J.H.; Yuan, T.F.; Yoon, Y.S. Modeling the Compressive Strength of High-Strength Concrete: An Extreme Learning Approach. Constr. Build. Mater. 2019, 208, 204–219.
  18. Ganguly, B.; Chaudhuri, S.; Biswas, S.; Dey, D.; Munshi, S.; Chatterjee, B.; Dalai, S.; Chakravorti, S. Wavelet Kernel-Based Convolutional Neural Network for Localization of Partial Discharge Sources within a Power Apparatus. IEEE Trans. Ind. Inform. 2021, 17, 1831–1841.
  19. Yang, K.H.; Tae, S.H.; Choi, D.U. Mixture Proportioning Approach for Low-CO2 Concrete Using Supplementary Cementitious Materials. ACI Mater. J. 2016, 113, 533–542.
Figure 1. Gradient descent.
Figure 2. Interpretation of machine learning with concrete mixtures.
Figure 3. Relationship between γ obtained from Case-1 and fc,meas.
Figure 4. The normal distribution curve of γ.
Figure 5. Relationship between γ obtained from Case-2 and fc,meas.
Figure 6. Relationship between γ obtained from each subcase and fc,meas.
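Table 1. Concrete mixtures and measured compressive strength.
| Type of Binder | Range of fc,meas (MPa) | Data (ea) | W (kg/m³) | C (kg/m³) | W/B (%) | S (kg/m³) | G (kg/m³) | Max. size of G (mm) | FA (kg/m³) | GGBS (kg/m³) |
|---|---|---|---|---|---|---|---|---|---|---|
| OPC | 7 to 20 | 122 | 90–216 | 150–444 | 30–89 | 592–1039 | 452–1503 | 10–40 | - | - |
| | 20 to 30 | 488 | 135–247 | 251.67–630 | 30–80 | 166–1073 | 452–1260 | 10–40 | - | - |
| | 30 to 40 | 671 | 69–247 | 272.31–720 | 14–67 | 165–1186 | 32–1599 | 10–30 | - | - |
| | 40 to 60 | 919 | 108–280 | 294–900 | 20–60 | 162–1731 | 0–1567 | 10–25 | - | - |
| | 60 to 80 | 438 | 108–232 | 292–900 | 20–50 | 346–2022 | 0–1416 | 13–25 | - | - |
| | 80 to 100 | 224 | 97–200 | 396.51–847.62 | 20–40 | 465–1122 | 554–1416 | 15–25 | - | - |
| OPC + FA | 7 to 20 | 111 | 144–221 | 135–371 | 50–133 | 508–980 | 842–1230 | 13–25 | 18–247.5 | - |
| | 20 to 30 | 375 | 142–383 | 135–425 | 40–107 | 496–940 | 28–1299 | 13–25 | 17–270 | - |
| | 30 to 40 | 226 | 126–220 | 200–581 | 32–80 | 37–950 | 60–1230 | 13–80 | 13–380 | - |
| | 40 to 60 | 181 | 126–220 | 163–680 | 25–100 | 168–876 | 105–1422 | 13–25 | 27–437 | - |
| | 60 to 80 | 97 | 148–180 | 298–650 | 25–60 | 391–856 | 751–1393 | 13–25 | 32.4–420 | - |
| | 80 to 100 | 8 | 157–165 | 385–597 | 26–43 | 587–651 | 977–1058 | 19–20 | 62.8–192 | - |
| OPC + GGBS | 7 to 20 | 13 | 175–182 | 73–293 | 64–249 | 655–943 | 899–1223 | 20 | - | 23–234 |
| | 20 to 30 | 50 | 150–220 | 110–312 | 50–164 | 625–885 | 864–1223 | 19–25 | - | 35–234 |
| | 30 to 40 | 65 | 150–220 | 150–495 | 5–125 | 605–864 | 743–1111 | 19–25 | - | 8–330 |
| | 40 to 60 | 73 | 150–220 | 110–583.33 | 5–164 | 272–864 | 743–1062 | 19–25 | - | 32.2–408.33 |
| | 60 to 80 | 68 | 120–175 | 140–567 | 20–118 | 272–803 | 889–1099 | 20–25 | - | 43.25–420 |
| | 80 to 100 | 45 | 135.2–175 | 192–777.78 | 23–83 | 263–1146 | 667–1114 | 20 | - | 100–448 |
| OPC + FA + GGBS | 7 to 20 | 29 | 108–180 | 162–227 | 54–110 | 834–982 | 885–993 | 20 | 25–45 | 33–113 |
| | 20 to 30 | 59 | 105–182 | 158–296 | 45–108 | 776–950 | 884–1134 | 20 | 17–96 | 22.8–184 |
| | 30 to 40 | - | - | - | - | - | - | - | - | - |
| | 40 to 60 | 17 | 157–177 | 140–515 | 31–114 | 701–874 | 850–957 | 20–25 | 31–114 | 19–180 |
| | 60 to 80 | - | - | - | - | - | - | - | - | - |
| | 80 to 100 | - | - | - | - | - | - | - | - | - |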
Table 2. Analysis accuracy of ML trained with fc,meas in all ranges.
| Range of fc,meas | All data (4279) | 7–20 MPa | 20–30 MPa | 30–40 MPa | 40–60 MPa | 60–80 MPa | 80–100 MPa |
|---|---|---|---|---|---|---|---|
| Training dataset | 2991 | 189 (6.3%) | 675 (22.6%) | 680 (22.7%) | 842 (28.2%) | 420 (14.0%) | 185 (6.2%) |
| Test dataset | 1288 | 86 (6.7%) | 297 (23.1%) | 282 (21.9%) | 348 (27.0%) | 183 (14.2%) | 92 (7.1%) |
| Mean (m) | 1.00 | 0.70 | 0.77 | 0.96 | 1.09 | 1.21 | 1.36 |
| σ | 0.28 | 0.15 | 0.16 | 0.23 | 0.22 | 0.23 | 0.22 |
| CV | 0.28 | 0.22 | 0.21 | 0.23 | 0.20 | 0.17 | 0.16 |
| RMSE (MPa) | 12.30 | 10.23 | 10.32 | 8.86 | 9.54 | 15.32 | 25.00 |
| MAE (MPa) | 9.98 | 8.76 | 9.28 | 7.58 | 7.99 | 12.95 | 22.45 |
| MAPE (%) | 25.2 | 51.00 | 36.6 | 20.96 | 16.52 | 18.43 | 24.98 |
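The statistics reported in this table can be reproduced from paired measured and predicted strengths. The following is a minimal sketch, assuming that γ is the ratio of measured to predicted strength and that σ is the sample standard deviation; neither convention is stated in this part of the article. The function name `accuracy_metrics` and the input values are hypothetical.

```python
import math
import statistics


def accuracy_metrics(measured, predicted):
    """Summarise paired measured/predicted compressive strengths (MPa).

    ASSUMPTION: gamma = measured / predicted. Swap the ratio if the
    opposite convention is intended.
    """
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("need two equally long sequences with >= 2 values")

    gamma = [m / p for m, p in zip(measured, predicted)]
    errors = [p - m for m, p in zip(measured, predicted)]
    n = len(errors)

    mean_gamma = statistics.fmean(gamma)
    sigma = statistics.stdev(gamma)  # ASSUMPTION: sample standard deviation

    return {
        "mean": mean_gamma,
        "sigma": sigma,
        "cv": sigma / mean_gamma,  # coefficient of variation of gamma
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "mae": sum(abs(e) for e in errors) / n,
        "mape": 100.0 * sum(abs(e) / m for e, m in zip(errors, measured)) / n,
    }


# Hypothetical values for illustration; not taken from the study's dataset.
measured = [20.0, 40.0, 80.0]
predicted = [25.0, 40.0, 64.0]
metrics = accuracy_metrics(measured, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because MAPE divides each error by the measured strength, a similar absolute error in MPa weighs more heavily in the low-strength ranges, which is consistent with the larger MAPE values tabulated for the lowest range.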
Table 3. Analysis accuracy of ML trained with 1080 data points.
| Range of fc,meas | All data (1380) | 0–20 MPa | 20–30 MPa | 30–40 MPa | 40–60 MPa | 60–80 MPa | 80–100 MPa |
|---|---|---|---|---|---|---|---|
| Training dataset | 1080 | 180 | 180 | 180 | 180 | 180 | 180 |
| Testing dataset | 300 | 50 | 50 | 50 | 50 | 50 | 50 |
| Mean (m) | 1.08 | 0.78 | 1.07 | 0.99 | 1.08 | 1.28 | 1.26 |
| σ | 0.36 | 0.2 | 0.48 | 0.39 | 0.3 | 0.28 | 0.20 |
| CV | 0.34 | 0.26 | 0.45 | 0.39 | 0.27 | 0.22 | 0.16 |
| RMSE (MPa) | 14.41 | 10.24 | 7.54 | 12.34 | 14.43 | 17.51 | 20.30 |
| MAE (MPa) | 11.42 | 7.06 | 5.86 | 10.32 | 12.23 | 14.98 | 17.83 |
| MAPE (%) | 26.85 | 40.68 | 22.88 | 29.98 | 25.75 | 21.53 | 20.13 |
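The equal-count split used for this case (180 training and 50 testing mixtures per strength range) can be illustrated as follows. This is only a sketch under stated assumptions: the article does not describe how the mixtures were drawn, so uniform random sampling without replacement is assumed, the ranges are treated as half-open intervals, and `balanced_split` and the synthetic strengths are hypothetical.

```python
import random

# Strength ranges (MPa) as listed in the table header.
# ASSUMPTION: each range is treated as half-open, [lo, hi).
RANGES = [(0, 20), (20, 30), (30, 40), (40, 60), (60, 80), (80, 100)]


def balanced_split(strengths, n_train=180, n_test=50, seed=0):
    """Draw the same number of training and test indices from every range."""
    rng = random.Random(seed)
    train, test = [], []
    for lo, hi in RANGES:
        idx = [i for i, fc in enumerate(strengths) if lo <= fc < hi]
        if len(idx) < n_train + n_test:
            raise ValueError(f"range {lo}-{hi} MPa has only {len(idx)} mixtures")
        # Sampling without replacement keeps training and test sets disjoint.
        picked = rng.sample(idx, n_train + n_test)
        train.extend(picked[:n_train])
        test.extend(picked[n_train:])
    return train, test


# Synthetic strengths for illustration only (not the study's data).
demo_rng = random.Random(1)
strengths = [demo_rng.uniform(lo, hi) for lo, hi in RANGES for _ in range(300)]
train_idx, test_idx = balanced_split(strengths)
print(len(train_idx), len(test_idx))
```

With six ranges this yields 6 × 180 = 1080 training and 6 × 50 = 300 testing mixtures, matching the totals in the table.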
Table 4. Analysis accuracy of ML trained with data in each range.
| Subcase | Case-3-1 | Case-3-2 | Case-3-3 | Case-3-4 | Case-3-5 | Case-3-6 | Average |
|---|---|---|---|---|---|---|---|
| Range of fc,meas | 7–20 MPa | 20–30 MPa | 30–40 MPa | 40–60 MPa | 60–80 MPa | 80–100 MPa | |
| Training dataset | 189 | 675 | 680 | 842 | 420 | 185 | |
| Testing dataset | 86 | 297 | 282 | 348 | 183 | 92 | |
| Mean (m) | 1.04 | 1.03 | 1.01 | 0.99 | 1.01 | 1.01 | 1.02 |
| σ | 0.12 | 0.14 | 0.11 | 0.11 | 0.09 | 0.08 | 0.11 |
| CV | 0.11 | 0.14 | 0.11 | 0.11 | 0.086 | 0.080 | 0.11 |
| RMSE (MPa) | 2.00 | 3.36 | 3.39 | 5.41 | 5.96 | 7.24 | 4.56 |
| MAE (MPa) | 1.49 | 2.78 | 2.56 | 4.55 | 5.01 | 5.99 | 3.73 |
| MAPE (%) | 9.03 | 10.96 | 7.28 | 9.42 | 7.21 | 6.63 | 8.42 |
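The "Average" column appears to be the unweighted mean of the six subcase values, which the short check below reproduces for the error rows. As a sketch only, it also shows two alternatives that the article does not report: a mean weighted by the number of test mixtures, and a pooled RMSE over all test mixtures. The helper names are hypothetical.

```python
import math

# Per-subcase test-set values as tabulated (Case-3-1 to Case-3-6).
rmse = [2.00, 3.36, 3.39, 5.41, 5.96, 7.24]
mae = [1.49, 2.78, 2.56, 4.55, 5.01, 5.99]
mape = [9.03, 10.96, 7.28, 9.42, 7.21, 6.63]
n_test = [86, 297, 282, 348, 183, 92]


def simple_mean(values):
    """Unweighted mean: every subcase counts equally."""
    return sum(values) / len(values)


def weighted_mean(values, weights):
    """Mean weighted by the number of test mixtures in each subcase."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def pooled_rmse(rmse_values, counts):
    """RMSE over all test mixtures, recovered from per-subcase RMSE values."""
    total_sq = sum(c * r * r for r, c in zip(rmse_values, counts))
    return math.sqrt(total_sq / sum(counts))


print(f"RMSE unweighted: {simple_mean(rmse):.2f}")
print(f"MAE  unweighted: {simple_mean(mae):.2f}")
print(f"MAPE unweighted: {simple_mean(mape):.2f}")
print(f"MAE  weighted by test count: {weighted_mean(mae, n_test):.2f}")
print(f"RMSE pooled over all test mixtures: {pooled_rmse(rmse, n_test):.2f}")
```

The unweighted mean treats the small 7–20 MPa and 80–100 MPa subcases the same as the much larger 40–60 MPa subcase, so it is best read as a per-subcase summary rather than as the error over all 1288 test mixtures.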
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

MDPI and ACS Style

Park, J.-R.; Lee, H.-J.; Yang, K.-H.; Kook, J.-K.; Kim, S. Study on Influence of Range of Data in Concrete Compressive Strength with Respect to the Accuracy of Machine Learning with Linear Regression. Appl. Sci. 2021, 11, 3866. https://doi.org/10.3390/app11093866
