The Use of Fuzzy Linear Regression and ANFIS Methods to Predict the Compressive Strength of Cement

Abstract: In this paper, the prediction of compressive cement strength using the fuzzy linear regression (FLR) and adaptive neuro-fuzzy inference system (ANFIS) methods was studied. Specifically, an accurate prediction method is needed as the modeling of cement strength is a difficult task, which is based on its composite nature. However, many approaches are widely implemented in strength-predicting problems, such as the artificial neural network (ANN), Mamdani fuzzy rules in MATLAB, FLR and ANFIS models. Applying these methods and comparing the results with the corresponding observed ones, we concluded that the ANFIS method successfully decreased the level of uncertainty in predicting cement strength, as the average percentage error level was extremely low. Although the FLR method had the highest average percentage error level compared with the other methods, it provides a standard equation to estimate the output values by using symmetric triangular fuzzy numbers and determines the most important factor in increasing compressive strength, in contrast to ANFIS and ANN, which are black box models, and to the fuzzy method, which uses rules without providing the specific way by which the results come out. Thus, ANFIS and FLR are appropriate methods for dealing with engineering mathematical models by using fuzzy logic.


Introduction
Concrete, which consists of cement, water and aggregates, is recognized as the most broadly used material in the construction of buildings, subway projects and other civil structures. Sometimes, powdered or liquid additives are also included in the mixture in order to improve specific properties, such as the durability of the concrete, or to control setting or hardening. Concrete is widely used because its raw materials are available worldwide, and its properties depend on the quality of these materials. One of its most important properties is its high compressive strength at 28 days after casting. Therefore, it is necessary to test the concrete's compressive strength in order to determine the quality and the strength of the construction.
Cement is a binder that is widely used in the production of construction materials, as it acquires strong stability and durability after hardening. In this research, CEM I 42.5R cement according to the European standard EN 197-1 [1] was used, containing 95-100% clinker and 0-5% additional constituents. Cement hydration is a complex process resulting from the reaction of water with cement. Many physical and chemical parameters affect cement hydration and, therefore, its compressive strength. One of them is the Portland cement clinker composition [2], which mainly consists of tricalcium silicate (C3S), which is responsible for the early strength and early characteristics of cement; dicalcium silicate (C2S), which contributes to the cement's long-term compressive strength; tricalcium aluminate (C3A), which in a low amount provides resistance to sulfates; tetracalcium aluminoferrite (C4AF), which works as a flux material; and sometimes sulfur trioxide (SO3) [3], which comes from the gypsum that is added to the clinker after its cooling for minimizing the amount of grinding energy needed. Depending on its application, cement contains 50-70% C3S, 15-30% C2S, 5-10% C3A, 5-15% C4AF [4] and less than 4% SO3 [3]. Cement hydration also depends on the Blaine fineness of the cement particles, which ranges between 2000 and 5000 cm²/g, and on the alkali content of the cement composition.
Symmetry 2020, 12, 1295
Portland cement production is not an environmentally friendly process, as it releases a massive amount of carbon dioxide (CO2) into the atmosphere, contributing to greenhouse gas emissions and climate change. For this reason, non-conventional cementitious materials [5-9] are being developed in order to reduce environmental pollution and replace ordinary Portland cement with sustainable materials like self-healing concretes or alternative geopolymer-based concretes, which seem to be promising in concrete construction. Non-conventional cementitious materials also improve the characteristics of the cement and increase its compressive strength. The compressive strength of ordinary cement can also be improved by incorporating nanoparticles [10] in its composition to increase the density of the concrete. The prediction of the compressive strength of these types of concrete is a very important process, as it provides an option to modify the mix proportion in circumstances where the mandatory design strength is not attained, in order to avoid construction failures and successfully substitute the stability offered by Portland cement. As indicated, analytical models that include the effects of each of these factors on the compressive strength may be very complicated. Consequently, the use of fuzzy regression models, adaptive neuro-fuzzy inference systems (ANFIS) [11-13], as well as artificial neural networks (ANNs) [11,14,15] and fuzzy logic, seems to be a promising approach to the strength prediction problem, providing useful tools in the concrete industry.
In a previous study [16], the 28-day compressive strength was predicted with the use of ANNs and Mamdani fuzzy rules in MATLAB. The ANN method [17] was based on an artificial neural network with three layers. The first (input) layer and the second (hidden) layer each had four neurons, and the last layer had one output parameter, the compressive strength. The inputs, which were C3S, SO3, Blaine fineness and alkali, were first normalized to the range from 0.1 to 0.9, using the equation

$$X_i^{\mathrm{norm}} = a + (b - a)\,\frac{X_i - X_{\min,i}}{X_{\max,i} - X_{\min,i}}$$

where $a$ and $b$ are the lowest and highest values of the normalization range, respectively, $X_{\min,i}$ and $X_{\max,i}$ are the lowest and highest values of each input, respectively, and $X_i$ is the value of the input at the ith node. Then, those values were multiplied by the weight factors in order to train the model, with the use of the equation

$$\mathrm{net}_j = \sum_i x_i\, v_{ij}$$

where $x_i$ is the input parameter from the ith neuron and $v_{ij}$ is the weight from neuron i of the previous layer to neuron j of the next layer. Afterwards, the aforementioned values were transferred to the hidden layer. For determining the activation level, the sigmoid transfer function was used with the following form:

$$f(x) = \frac{1}{1 + e^{-x}}$$

Furthermore, a backpropagation algorithm [18] was used to optimize the weight vector by minimizing the error between the calculated output values $y_i$ and the target outputs $t_i$, accumulated over the P training patterns and the p output neurons. The network weights were modified iteratively with a learning rate δ equal to 0.01. The model was trained for 20,000 epochs. More details on the ANN method can be retrieved from [19].
The fuzzy logic algorithm was based on Mamdani rules and consisted of four components. Fuzzification was the first one, where each variable was represented by degrees of membership between the values 0 and 1.
The variables were characterized as very low, low, medium and high. Then, rules were defined in order to comprise all possible fuzzy combinations between the processing parameters. Those rules were expressed as If-Then conditional statements. In the fuzzy inference engine, all the fuzzy rules were taken into account and the inputs were converted to the corresponding outputs. The product method (prod) was used as the inference operator. Finally, in the last component, defuzzification [20], the outputs from the previous step were converted into a number using the centroid method

$$x^{*} = \frac{\sum_i \mu(x_i)\, x_i}{\sum_i \mu(x_i)}$$

where $x_i$ is the value of the output and $\mu(x_i)$ is its membership value in the membership function.
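The preprocessing and inference steps described above can be illustrated with a minimal Python sketch. It is not the implementation used in [16]: the function names, the absence of bias terms and the use of a sigmoid at the output neuron are assumptions made only for illustration, and no training step is shown.

```python
import math

def normalize(x, x_min, x_max, a=0.1, b=0.9):
    """Min-max scale x from [x_min, x_max] to the range [a, b]."""
    return a + (b - a) * (x - x_min) / (x_max - x_min)

def sigmoid(z):
    """Sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(inputs, v_hidden, v_out):
    """Forward pass of a small feed-forward network without bias terms.

    v_hidden[i][j] is the weight from input i to hidden neuron j and
    v_out[j] is the weight from hidden neuron j to the single output.
    Using a sigmoid at the output as well is an assumption of this sketch.
    """
    hidden = [
        sigmoid(sum(x * v_hidden[i][j] for i, x in enumerate(inputs)))
        for j in range(len(v_out))
    ]
    return sigmoid(sum(h * w for h, w in zip(hidden, v_out)))

def centroid(values, memberships):
    """Centroid defuzzification over discrete output values."""
    total = sum(memberships)
    return sum(x * m for x, m in zip(values, memberships)) / total
```

For example, `normalize(3500.0, 2000.0, 5000.0)` maps a Blaine fineness in the middle of the 2000-5000 cm²/g range to the middle of the normalized range; the limits used here are illustrative and are not the limits of the data set.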
In this study, the same inputs were applied in order to develop more effective fuzzy models to predict the compressive strength of cement at 28 days. The fuzzy linear regression and ANFIS methods were used to predict the cement strength for 50 data sets, and their results were compared with those of the ANN and fuzzy logic methods.

Fuzzy Linear Regression (FLR)
Linear regression [21] provides a crisp relationship between a dependent output Y and independent inputs X_i, in which Y is a linear function of the inputs plus a disturbance term u that contains the deviations between the observed and the predicted values. Fuzzy linear regression is an alternative to this probabilistic specification, expressed as

$$Y = A_0 + \sum_i A_i X_i \qquad (8)$$

where $A_i$ are symmetric fuzzy numbers, which express the inability to determine an exact association between the dependent and independent parameters. They are written as $A_i = (r_i, c_i)_L$, where $r_i$ is the center, at which the membership function is equal to 1, and $c_i$ is the spread of the values. Thus, the membership function [22] is defined as

$$\mu_{A_i}(x) = L\!\left(\frac{x - r_i}{c_i}\right), \quad c_i > 0$$

where L(x) is the symmetric reference function of a fuzzy number, which satisfies the constraints below:

$$L(x) = L(-x), \qquad L(0) = 1, \qquad L(x) \text{ is non-increasing on } [0, \infty)$$

Therefore, the membership function for the linear possibility Equation (8) is formed as

$$\mu_{Y_j}(y) = L\!\left(\frac{y - \sum_i r_i x_{ij}}{\sum_i c_i |x_{ij}|}\right)$$

where i indexes the inputs and j indexes the data sets, with $x_{0j} = 1$ for the constant term. Then, the degree h is determined in order to include the data in the estimated output Y, which is

$$\mu_{Y_j}(y_j) \ge h$$

In this study, the reference function L(x) had the following form:

$$L(x) = \max(0,\, 1 - |x|)$$

where L(x) is decreasing in (0,1). As a result, the coefficients of the system (8) were symmetric triangular fuzzy numbers and the following linear programming problem was obtained:

$$\min J = \sum_j \sum_i c_i |x_{ij}| \qquad (16)$$

$$y_j \ge \sum_i r_i x_{ij} - (1 - h) \sum_i c_i |x_{ij}| \qquad (17)$$

$$y_j \le \sum_i r_i x_{ij} + (1 - h) \sum_i c_i |x_{ij}| \qquad (18)$$

$$c_i \ge 0 \qquad (19)$$

where the relations (17)-(19) are the constraints under which the objective function (16) is minimized. The simplex method was used to solve this linear programming problem. The resulting triangular fuzzy numbers are presented in Table 1, and the equation of the fuzzy linear regression was formed from them.
The membership function depends on the distance of the observed output values from the center of the regression. In particular, the closer the observed 28-day compressive strength is to the center of the regression, the higher its membership grade is.
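To make the role of the centers and spreads concrete, the following Python sketch evaluates a fuzzy linear regression with symmetric triangular coefficients once those coefficients are known. It does not solve the linear programming problem, and the function names and the numerical coefficients in the usage note are hypothetical, not the values of Table 1.

```python
def flr_predict(x, centers, spreads):
    """Return (left, center, right) of the fuzzy output for one data set.

    x holds the input values without the constant term; centers[0] and
    spreads[0] are assumed to belong to the constant coefficient A_0.
    """
    xs = [1.0] + list(x)
    center = sum(r * xi for r, xi in zip(centers, xs))
    spread = sum(c * abs(xi) for c, xi in zip(spreads, xs))
    return center - spread, center, center + spread

def membership(y, center, spread):
    """Triangular membership grade of an observed value y.

    Uses the reference function L(t) = max(0, 1 - |t|).
    """
    if spread == 0:
        return 1.0 if y == center else 0.0
    return max(0.0, 1.0 - abs(y - center) / spread)
```

With the hypothetical coefficients `centers = [20.0, 0.5, 1.0]` and `spreads = [1.0, 0.0, 0.5]` and the inputs `(60.0, 3.0)`, the center is 53.0 and the spread is 2.5, so an observed value equal to the center has a membership grade of 1 and the grade falls linearly to 0 at the boundaries.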
The values of the root mean square error (RMSE) and the mean absolute percentage error (MAPE) were calculated as

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n} (A_t - F_t)^2}$$

$$\mathrm{MAPE} = \frac{100}{n}\sum_{t=1}^{n} \left|\frac{A_t - F_t}{A_t}\right|$$

where $A_t$ and $F_t$ are the observed and computed values, respectively, and n is the number of sets. Theil's inequality coefficient, a measure of the accuracy of the regression, was also calculated, and its value indicated that the fuzzy linear regression had a successful predictive capacity. The degree of fuzziness [21] of this model is determined by the spreads $c_i$, which contribute to the fuzziness of the system. In order to find the effect of the independent inputs on the calculation of the dependent variable Y, the degree of fuzziness was calculated separately for four different models. Each model comprised three input variables instead of four, with a different variable omitted each time. The results are summarized in Table 2. Taking those results into account, it was concluded that C3S was the most important factor in increasing the compressive strength at 28 days, as it had the highest effect on the system's degree of fuzziness. It also constitutes 50-70% of the cement's composition, which makes it the main component of the cement. However, all variables were important for developing a successful fuzzy model to predict the 28-day compressive strength of cement. More details on the fuzzy linear regression method can be retrieved from [23,24].
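The error measures can be computed as in the following Python sketch. The RMSE and MAPE follow the definitions in terms of A_t, F_t and n given above; for Theil's inequality coefficient the text does not state which variant was used, so the bounded U1 form below is an assumption, and the function names are illustrative.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and computed values."""
    n = len(observed)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(observed, predicted)) / n)

def mape(observed, predicted):
    """Mean absolute percentage error, in percent."""
    n = len(observed)
    return 100.0 / n * sum(abs((a - f) / a) for a, f in zip(observed, predicted))

def theil_u(observed, predicted):
    """Theil's inequality coefficient in its U1 form (assumed variant).

    The value lies between 0 and 1, and 0 means a perfect fit.
    """
    n = len(observed)
    root_mean_sq_obs = math.sqrt(sum(a * a for a in observed) / n)
    root_mean_sq_pred = math.sqrt(sum(f * f for f in predicted) / n)
    return rmse(observed, predicted) / (root_mean_sq_obs + root_mean_sq_pred)
```

For the illustrative values `observed = [50.0, 40.0]` and `predicted = [48.0, 42.0]`, the RMSE is 2.0 and the MAPE is 4.5%.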

Adaptive Neuro-Fuzzy Modeling (ANFIS)
ANFIS is a fuzzy model used as a method of constructing rule systems, by entering the data of the prediction model to derive more efficient membership functions. The membership function parameters are tuned with the use of methods based on ANNs, as ANFIS is a type of adaptive network that functions as a fuzzy system. The structure of this model is shown in Figure 1 and it is composed of four inputs, hidden layers and one output datum. The nodes of each level were connected to each other through links that indicated the direction of the information of the neural network.
In the first layer, the membership grades $\mu_{A_i}(x_1)$, $\mu_{B_i}(x_2)$, $\mu_{C_i}(x_3)$ and $\mu_{D_i}(x_4)$ were generated for each input, where $\mu_{A_i}$, $\mu_{B_i}$, $\mu_{C_i}$ and $\mu_{D_i}$ were the trapezoidal membership functions in this study and i = 1, 2, 3. In the second layer, the incoming signals at each node were combined and this information was sent to the output, using the min or prod operator. The prod method was used in this study with the following form:

$$w_i = \mu_{A_i}(x_1)\,\mu_{B_i}(x_2)\,\mu_{C_i}(x_3)\,\mu_{D_i}(x_4)$$

where $w_i$ is the weight parameter from the ith neuron. In the third layer, the normalized degree of membership of a rule was obtained at each node, using the equation

$$\bar{w}_i = \frac{w_i}{w_1 + w_2 + w_3 + w_4}$$

where $(w_1 + w_2 + w_3 + w_4)$ is the total of the weights extracted from the previous layer. Then, in the fourth layer, each node computed the product $\bar{w}_i f_i$, where $\bar{w}_i$ is the parameter from the previous layer and $f_i$ is the linear output function of rule i, whose coefficients $\{p_i, q_i, r_i\}$ are the consequent parameters of the rule. Thus, every node in this layer was specified as an adaptive node. In the final layer, all of the incoming signals were used to calculate the output by using

$$y = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i}$$

The ANFIS algorithm used Takagi-Sugeno-Kang rules and the output membership function was a linear function derived from the input values, in contrast to the Mamdani system, where the output of every rule was a fuzzy set.
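The layer-by-layer computation can be sketched in Python as a single forward pass of a Takagi-Sugeno-Kang system with trapezoidal membership functions and the prod operator. This is an illustration under stated assumptions, not the trained model of this study: the rule format, the function names and all parameter values are hypothetical, and the hybrid learning step is not shown.

```python
def trapmf(x, a, b, c, d):
    """Trapezoidal membership with feet a, d and shoulders b, c (a <= b <= c <= d)."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0

def tsk_forward(x, rules):
    """Forward pass of a first-order TSK fuzzy system.

    Each rule is a pair (mfs, coeffs): mfs holds one (a, b, c, d) tuple per
    input, and coeffs holds one linear coefficient per input followed by a
    constant term.
    """
    strengths = []
    outputs = []
    for mfs, coeffs in rules:
        w = 1.0
        for xi, params in zip(x, mfs):
            w *= trapmf(xi, *params)  # membership grades combined with prod
        f = sum(k * xi for k, xi in zip(coeffs, x)) + coeffs[-1]  # linear consequent
        strengths.append(w)
        outputs.append(f)
    total = sum(strengths)
    if total == 0:
        raise ValueError("no rule fires for this input")
    # Normalized firing strengths weight the rule outputs.
    return sum(w / total * f for w, f in zip(strengths, outputs))
```

In a two-input toy case with the rules `([(0, 0, 1, 2), (0, 0, 1, 2)], (1.0, 1.0, 0.0))` and `([(1, 2, 3, 3), (1, 2, 3, 3)], (0.0, 0.0, 10.0))`, the input `(1.5, 1.5)` fires both rules equally, so the output is the average of the two rule outputs.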
The adjustment of the weights of each parameter minimized the error level between the observed and predicted values of the output. A hybrid learning algorithm that combines the least squares method and the backpropagation gradient descent method was used in order to train the Sugeno-type parameters and calculate the input and output membership function parameters.
The model was trained for 20,000 epochs. The predicted and the observed outputs for testing data are presented in Figure 2. The checkpoints coincided with the data, as the error level was extremely low (MAPE = 0.0270%). Thus, the model responded satisfactorily to the observed data.

Model Application
The 28-day compressive strength of cement was estimated for 50 data sets by using two fuzzy methods: fuzzy linear regression and ANFIS. The results of these methods are reported in Table 3 in order to compare the two fuzzy models. According to the results, the fuzzy linear regression and ANFIS methods successfully predicted the compressive strength of cement with small deviations. The data, which were obtained from a previous study [16], had a limited range of values and led to reliable predictions of the observed cement strength only within that range. However, the purpose of the model is to apply and control the input parameters in more favorable experimental conditions in order to provide the cement compressive strength with higher accuracy.
In Table 3, the values of C3S, SO3, Blaine fineness and alkali are the inputs, strength is the output, FLR (left), FLR (center) and FLR (right) are the values of the fuzzy linear regression at the boundaries and at the center of the regression, µ_A(y_i) is the membership grade of the FLR method and ANFIS presents the results obtained from the ANFIS algorithm.
For a better classification, the root mean square error (RMSE) and the mean absolute percentage error (MAPE) were determined for every method and are presented in Table 4, where the values of the ANN and fuzzy model methods were calculated in a previous study [16]. In Figure 3, the fuzzy linear regression is shown, where the boundaries and the center of the regression, as well as the observed values, are represented.
As indicated, all methods had satisfactory results, with small deviations from the observed values of strength. Although the fuzzy linear regression method had a higher RMSE than the ANN, ANFIS and fuzzy models, it provided a reliable way to estimate the compressive strength of cement. This method uses a specific equation to compute the predicted values, which makes explicit the coefficients that weight the input variables in obtaining the result. In addition, the FLR method is an effective way to define the degree of fuzziness in order to calculate the effect of the independent inputs on the dependent variable and determine the major factor in increasing the compressive strength of cement at 28 days.

Table 4. RMSE and MAPE of the ANN, fuzzy, fuzzy linear regression and ANFIS methods.
However, the ANFIS algorithm provided the most accurate approach to calculating the compressive strength. The value of the mean square error (0.04) was lower compared with that of all of the other methods, which made the results of the ANFIS method quite satisfactory. The MAPE was computed as 0.0270%, which showed that the values of the observed and calculated outputs were very close. Although ANFIS is a black box model, as there is no access to the rules that have been created, the valid combination of verbal rules, as well as the successful choice of membership function, make it a successful fuzzy model.

Conclusions
Estimating the compressive strength of cement is not an easy process, as it depends on highly complex factors. In this study, fuzzy linear regression and ANFIS methods were developed in order to create more effective fuzzy models to estimate the 28-day compressive strength of cement. In particular, four inputs were applied in order to determine the output, which was the compressive strength of the cement. After comparing these methods with the ANN and fuzzy logic models, which were both studied in a previous publication [16], and evaluating the results, we concluded that the ANFIS algorithm provided the most accurate values of the cement's compressive strength, as the value of the mean square error was extremely low (0.04).
It was also demonstrated that fuzzy linear regression was a more valid method, as it provided a standard equation to estimate the output values, although it yielded higher error values than the other fuzzy models. This is in contrast to the black box methods of the ANFIS algorithm and the ANN model, as well as fuzzy logic, which used rules without providing the specific way by which the results come out. Another advantage of the FLR method was the determination of the degree of fuzziness, which quantified the effect of the independent variables on the compressive strength of cement.
In conclusion, all methods led to reliable results with relatively small deviations. Even though the ANFIS algorithm provided the smallest deviations from the observed values, the most valid method for predicting the compressive strength of cement was the fuzzy linear regression method, which gave the most reliable estimation of the cement's strength by providing a standard equation. This demonstrates its capability of dealing with the perceptual uncertainties that are involved in strength prediction problems and leads to successful predictions of cement strength values, improving the design and providing a useful modeling tool in the field of engineering.
Author Contributions: The present paper was written by the PhD Candidate F.G. under the supervision of B.P., Professor at the Democritus University of Thrace, Department of Civil Engineering. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.