Development of Crack Width Prediction Models for RC Beam-Column Joint Subjected to Lateral Cyclic Loading Using Machine Learning

In recent years, researchers have investigated artificial neural networks (ANN) and finite element models (FEM) for predicting crack propagation in reinforced concrete (RC) members. However, most of the prediction models developed so far have been limited to isolated RC members and do not consider the interaction of members in a structure subjected to hazard loads such as earthquake and wind. This research develops models to predict the evolution of cracks in the RC beam-column joint (BCJ) region under lateral cyclic loading. Four machine learning models are developed using Rapidminer to predict the crack width experienced by seven RC beam-column joints. The design parameters of the RC beam-column joints and the lateral cyclic loading, expressed as drift ratio, are used as inputs. Several prediction models are developed, and the highest-performing model is selected, refined, and optimized using various data split ratios, numbers of inputs, and performance indices. The error in predicting the experimental crack width is used as the performance index.


Introduction
The unpredictable nature of crack formation and propagation in reinforced concrete structures may seriously affect the stability and strength of structures, and thus, has been a subject of many studies in recent years [1][2][3][4][5]. In general, cracks initiate as narrow and elongated openings that consist of a width less than 0.5 mm, and are often not visible to the naked eye [6][7][8]. Although design codes impose limitations on crack widths based on empirical formulae, there is often uncertainty associated with determining crack width propagation, due to cyclic/seismic loads [9][10][11]. The propagation in crack width can reduce the structure's service life by accelerating the corrosion of steel reinforcement through the penetration of moisture, vapor, saltwater, and chemical gasses to the structural members [2,12,13]. The crack width initiation and propagation in reinforced concrete members could be estimated using classical theories by assuming the distribution of the bond stress as a member is subjected to tension with constant bending moment [14,15]. Base and Murray proposed applying numerical analysis on restrained members to predict the crack response of concrete structures using the finite difference methodology [16,17]. In addition, Gilbert implemented basic principles of equilibrium and compatibility to derive a series of expressions from calculating the stresses in concrete and steel members, the number of cracks, and the average crack width [17,18]. In recent years, several analytical methods and neural network models have been developed for estimating the propagations of cracks. In the past decade, Artificial Neural Network (ANN) and Finite Element Modeling (FEM) have been extensively used to analyze and predict the formation and propagation of cracks [19][20][21][22][23]. 
Theriault and Mehdi developed a theoretical model for predicting crack width based on the effect of Fiber Reinforced Polymer (FRP) numbers and thickness against the reinforcement ratio in concrete beams and prisms [24,25]. In other research, the mechanism of softened truss theory and bonding deterioration was proposed and developed using ANN and numerical modeling to estimate the crack width among reinforced concrete elements [26,27]. Such theoretical models have been able to predict features of cracks within the surface of reinforced concrete structures with reasonable accuracy.
A recent survey of existing research showed that models are needed to determine the crack widening process in reinforced concrete structures exposed to seismic loads [27]. It was noted that existing models did not predict the cracking behavior of reinforced concrete beam-column joints subjected to lateral cyclic loading. Therefore, engineering practice needs a reliable approach for estimating crack width in members with complex structures. In this research, a machine learning model was developed to predict the cracking behavior and estimate the crack width. Four different types of prediction models were developed using the Rapidminer machine learning software. Design parameters of the RC beam-column joints, such as the number of shear links and the anchorage length, together with the lateral cyclic loading in terms of drift ratio, were used as inputs for the models. The best-performing model was selected based on the least error in predicting crack width. Further optimization was performed by varying the data split, the number of input parameters, and the performance indices.

Experimental Specimen Test
In this research, seven exterior reinforced concrete (RC) beam-column joints were tested, each with a column dimension of 2000 mm × 200 mm × 200 mm and a beam dimension of 1250 mm × 200 mm × 250 mm. Type I ordinary Portland cement (OPC) obtained from Tasek Corporation Berhad, Malaysia, was used in the concrete mix design. The cement quality conforms to EN 196. The concrete compressive strength was evaluated in accordance with the BS EN 12390-3:2009 standard [28]. Three cube specimens were tested for each RC beam-column joint. The average compressive strength of the cubes ranged from 41.1 MPa to 47.7 MPa. The steel reinforcement had an average tensile strength of 614.15 MPa, meeting the requirements of the BS EN 10002-1:2001 standard [29]. Table 1 summarizes the reinforcement details for each RC beam-column joint. Figure 1 illustrates the reinforcement detail for each RC beam-column joint investigated. The additional shear links and increased anchorage length in specimens BCJ-2 to BCJ-7 were arranged based on the ductility class medium (DCM) for low to moderate seismicity [30][31][32]. Figure 2 illustrates the schematic drawing of the experimental test setup. The ends of the RC beam-column joint were partially fixed. A 500 kN horizontal hydraulic actuator was attached to the top portion of the column. The bottom portion of the column was braced to the surface of a strong floor. An axial circular steel pin was attached to the strong floor and the end of the beam. The specimens were tested in displacement control mode, with earthquake loading simulated by lateral cyclic loading [35]. The cyclic loading history is presented in Figure 3 and is in accordance with the ACI 374.2R-13 standard [36]. Each specimen was initially subjected to cyclic loading with a drift ratio (∆y) of 0.25%. The drift ratio (∆y) was then increased by 0.25% in each step until a drift ratio (∆y) of 3.00% was reached or specimen failure was observed.
At each drift ratio (∆y) level, the specimens were subjected to three cycles. The drift ratio (∆) was defined as lx/H, where lx is the lateral movement and H is the column height [37]. A Dinolite microscope camera was used to measure and analyze the crack width at every drift ratio (DR) level during testing, as recommended in earlier studies [38][39][40][41]. Table 2 lists the parameters recorded before and after the experimental testing that were used as potential input parameters for the machine learning models to predict the cracking behavior, while the crack width of each specimen is the main output of the prediction models.
Table 2. List of data adopted as potential input parameters for modeling purposes.

Num    Parameters Obtained before and after Experimental Testing    Abbrev.       Unit
2.     Shear link spacing at beam                                   SLB           mm
3.     Shear link spacing at column                                 SLC           mm
4.     Shear span for additional shear links at beam                SSLB          mm
5.     Shear span for additional shear links at column              SSLC          mm
6.     Anchorage length at joint                                    AL            mm
7.     Concrete compression strength                                Cc            MPa
8.     Concrete tensile strength                                    Ct            MPa
9.     Tensile bar strength                                         Tt            MPa
       Maximum negative load-carrying capacity                      Qmax(−ve)     kN

Figure 4 shows the load-drift hysteretic relationships. Qmax(+ve) and Qmax(−ve) represent the maximum positive and negative lateral load-carrying capacities of each specimen, respectively. All specimens had a greater negative lateral load-carrying capacity, Qmax(−ve), than positive lateral load-carrying capacity, Qmax(+ve). For Qmax(+ve), the peak load reached the maximum load-drift response (i.e., deformation capacity) at a drift ratio of 2.25% for BCJ-1, 2.50% for BCJ-6, 2.75% for BCJ-2 and BCJ-5, and 3.00% for BCJ-3, BCJ-4, and BCJ-7. The applied displacement load for the control specimen indicated that BCJ-1 fell under tension failure mode, after which the applied displacement load was forced to pull it. As a result, the maximum negative loading obtained for BCJ-1 was similar to its maximum positive loading. This indicates that only a small load is needed to pull BCJ-1 back to the origin, owing to its low reinforcement area and the absence of any seismic strengthening provision. However, in BCJ-2 to BCJ-7, the maximum negative loadings obtained were approximately twice the maximum positive loadings. The additional shear links and the increased anchorage length enhanced the strength of specimens BCJ-2 to BCJ-7.
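
The cyclic loading protocol described in this section (drift ratio increased in 0.25% steps up to 3.00%, three cycles per level, drift ratio defined as lx/H) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and taking H as the full 2000 mm column length is an assumption, since the effective height used in the tests is not stated here.

```python
def drift_levels(step=0.25, max_drift=3.00):
    """Target drift ratios in percent: 0.25, 0.50, ..., 3.00."""
    n_levels = round(max_drift / step)
    return [step * k for k in range(1, n_levels + 1)]


def peak_displacement_mm(drift_percent, column_height_mm=2000.0):
    """Lateral movement l_x that gives the requested drift ratio l_x / H.

    column_height_mm = 2000 is an assumption (full column length).
    """
    return drift_percent * column_height_mm / 100.0


def loading_history(cycles_per_level=3, column_height_mm=2000.0):
    """List of (drift %, cycle number, peak displacement in mm) tuples."""
    history = []
    for drift in drift_levels():
        peak = peak_displacement_mm(drift, column_height_mm)
        for cycle in range(1, cycles_per_level + 1):
            history.append((drift, cycle, peak))
    return history
```

Under these assumptions the protocol has 12 drift levels and 36 cycles in total, and the final level corresponds to a peak lateral movement of 60 mm.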

Crack Formation of Specimens
The crack pattern and width of the specimens were monitored during the lateral cyclic loading test to assess the effect of the increased number of shear links and anchorage length. Figure 5 illustrates the crack formation in the selected joint area for BCJ-1 (control), BCJ-2 (strengthened according to Eurocode 8), BCJ-4 (representing additional shear links), and BCJ-5 (representing anchorage length) at drift ratio levels of 0.25%, 1.25%, and 2.25%, and at failure. In the control specimen BCJ-1, the primary flexural crack initially formed at a drift ratio level of 0.25% from the top corner of the joint area and then propagated downwards to create two new secondary cracks at the intersection between the beam and the joint panel. Moreover, numerous hairline cracks at 45° appeared at the joint interface, extending from the primary and secondary cracks in specimen BCJ-1. For specimens BCJ-2 and BCJ-4, the primary/flexural crack was initiated at a drift ratio level of 0.25%, similar to specimen BCJ-1. However, the width and length of the crack were smaller than those of the control specimen. For BCJ-5, two primary cracks formed at the early stage of the loading protocol. At the drift ratio level of 1.25%, a shear crack formed at 60° and connected to the primary crack. With repeated loading cycles, new cracks formed and tertiary cracks extended in the joint panel zone of all specimens. The test was terminated after a drift ratio level of 3.00% for BCJ-1, due to severe crushing of concrete in the beam-column joint area. For BCJ-2 and BCJ-5, concrete cover spalling was observed in the area of the primary crack beyond the drift ratio level of 3.00%. However, in BCJ-4, small concrete surface spalling was seen near the joint interfaces before failure, due to shear action beyond the drift ratio level of 3.25%. The maximum crack widths were recorded at the primary/main cracks in the critical zone (joint region).
In general, the results indicated that the additional shear links and increased anchorage length had a significant effect on strength and on reducing the crack width.

The Neural Network Configuration
Artificial neural network (ANN) is an artificial intelligence modeling technique that imitates the functioning of the human nervous system. The main processing of the human nervous system consists of the brain nerve cells as the basic unit of information processing. In ANN, the basic information processing units are called neurons. Neurons manage details concurrently and immediately [42][43][44]. Implementing ANNs requires specialized building blocks, including multidimensional arrays, activation functions, and autonomous gradient computation [45]. There are many forms of neural networks, from fairly simple to very complex, just as there are many hypotheses about how biological neural processing is carried out [46,47].
Developing a structural quality model with adequate prediction accuracy can be challenging, especially when modeling the crack width in the joint area of RC beam-column joints subjected to lateral cyclic loading at different drift ratio levels. In this research, four prediction models were used: deep learning (DL) max-out, deep learning (DL) rectifier, support vector machine (SVM) dot, and support vector machine (SVM) neural. These methods provide an optimized framework for machine learning, data mining, and text mining and are fast and easy to use [48][49][50].

Application of Deep Learning (DL) and Support Vector Machine (SVM) in Engineering Practice
Deep learning (DL) and support vector machine (SVM) are commonly used in machine learning. DL and SVM both have advantages and have performed excellently in engineering, weather prediction, stock market forecasting, and medical diagnosis [51][52][53][54][55]. DL is related to the field of machine-based learning algorithms and imitate brain neurons. DL has two main features: The ability to learn how to perform complex functions once properly trained, and the ability to generalize and establish a reasonable solution for unattended data. SVM is a computational algorithm that can learn how to allocate labels to objects from experience and examples. SVM's fundamental function is to separate binary labeled data based on a line that achieves the maximum distance between the labeled data [56][57][58][59]. In engineering practice, DL has been applied in the fields of structural and materials engineering. Javier and co-researchers have developed a DL model for structural one-way slab design optimization [60]. The goal of this model was to reduce the environmental impact of energy consumption and CO 2 emissions from construction industry operations. Improvements in slab design were explored in the model, which was able to calculate thousands of solutions in real-time based on the requirements of the designer. The authors found that the decision support system (DSS) in the model was accurate and presented multi-criteria solutions that significantly reduced emissions without affecting the cost. Another related DL model was built to measure the local health of structures by establishing the Structural Health Index (SHI) [53]. Ambient noises have been applied to the SHI model to replicate the structure's damage rate. A comparison of vibration records from a 1:20-scale residential 42-story high-rise concrete building in Hong Kong with the degree and magnitude of damages obtained from SHI modeling verified the capability of the model. 
The authors proposed that this model could also be extended to systems for informed and warning maintenance decision-making for both local and global real-time health monitoring. Research on the development of instance-level identification and quantification of concrete surface bug-holes, based on DL was carried out [61]. The authors posited that this model was necessary to avoid the conventional, time-consuming, and inefficient methods of measurement performed by manual inspection. Hence, a total of 428 raw images with the appropriate bug-hole resolution were chosen to create the datasets, and the result showed that the model achieved an average accuracy of more than 90% of the quantification defects with real concrete surface bug-hole specimens compared to traditional CNN models. The authors recommended this model for future time-saving inspection, while avoiding applying the traditional CNN model that was inefficient in the accurate location of defect boundaries that led to difficulty in quantifying defects.
For SVM, regression models have been developed to predict the compressive strength and the autogenous shrinkage of concrete, respectively [62,63]. One distinguishing feature of SVM is its limited number of parameters compared to other types of prediction models, which need a design that integrates the network structure with powerful optimization algorithms to deliver satisfactory results [64]. In these studies, the authors [62,63] applied five and nine input parameters, respectively, to simulate and correlate the predicted outcome with the experimental results. In both cases, the SVM predictions were in relatively good agreement with the observed findings. In addition, the accuracies of the proposed SVM models were compared with predictive models based on ANN. The ANN-based models showed a higher R² value than the SVM models, but the results suggested that SVM's predictive efficiency is comparable to that of ANN. Nevertheless, the authors indicated that ANN models require a large number of optimization control parameters and a relatively large training database, while SVM requires few control parameters and relies little on the size of the training dataset. This benefit makes SVM a viable alternative to other ANN methods. In addition, an SVM-based classification was established to detect horizontal subsurface cracks in pavement, which shorten the lifespan of roadways [65]. The root-mean-square error (RMSE) performance index was added to the model to avoid over-fitting in positive identification. The predicted classification results were then compared with the standard reference method, the Amplitude Ratio Test (ART). The comparison indicated that SVM was efficient in detecting de-bonding within pavement structures. A sensitivity analysis using various parameters was also carried out to assess the robustness of SVM.
It was found that SVM distinguished the composite structures implemented in pavement structure de-bonding with greater accuracy. Hence, the robustness of the SVM method had increased the potential in the detection process. Amin and Farhad showed that SVM-based reliability analysis of concrete dams can predict flood assessment of gravity dams and optimal earthquake intensity. They compared their results to those obtained using finite element method analysis (FEM) [66]. The authors determined that SVM has a lower computational cost when compared to FEM-based probabilistic simulations. Furthermore, each of the nonlinear transient time history analyses and its post-processing (in the second example) takes 10-25 h (depending on the period of the ground movement and the degree of damage) for FEM. Hence, a complete set of 100 analyses would require a computation time of about two months compared to the runtime of a few hours of an SVM model. Consequently, the authors stated that SVM is a useful and effective method for the classification, prediction of response, and reliability analysis of concrete dams.

Framework Model of Deep Learning (DL) and Support Vector Machine (SVM)
The basic framework of ANN requires three layers (input, hidden, and output) [67], as shown in Figure 6. The input layer is the first layer, where data/features are received and standardization techniques are used to restrict the inputs to a certain range. Standardized inputs make the neural network easier to train, resulting in better accuracy. Depending on the network application, the hidden (intermediate or invisible) layer may be a collection of layers. These layers are responsible for identifying the pattern of a process or device, and most operations of the neural network are carried out in them. The output layer contains neurons representing the final outputs of the network, generated from the processing in the previous layers. The composition of an ANN is defined by how its neurons are interconnected and how its layers are formed. Accordingly, ANN architectures may be categorized into four types: feed-forward single-layer, mesh, recurrent, and feed-forward multi-layer networks.
DL is a class of machine learning algorithms that progressively extract higher-level features from the raw input using multiple layers. For example, in image processing, lower layers can identify edges, whereas higher layers can identify human-relevant concepts, such as digits, letters, or faces [56]. The term 'DL' was introduced by Rina Dechter in 1986 [68,69], and was later used by Igor Aizenberg and his colleagues in 2000 in the context of Boolean threshold neurons [70,71]. The hidden layers (see Figure 7) perform complex mathematical computations based on the number of inputs. The disadvantages of the DL model include the need for a large dataset and large amounts of computational power in both training and testing [72].
The concept of SVM was originated and developed by Cortes and Vapnik [75] to introduce a classifier derived from statistical learning theory. It has since been shown to be very robust, and has also been used as an intuitive model representation to detect outliers. SVM is a supervised machine learning technique used for classification as well as regression problems. Even with few examples, SVM performs well and has good accuracy, which distinguishes it from other machine learning techniques [75][76][77]. The principles of SVM are briefly described as follows [78]. SVM's goal is to find a function f that deviates by at most ε from the actual targets y_i for all the training data x_i, and is as flat as possible. For the training data, the function f can be represented by the following equation:
$$f(x) = \langle \omega, x \rangle + b \qquad (1)$$
where ω ∈ X and b ∈ R; ⟨·,·⟩ refers to the dot product in X; ω is the weight vector; b is the scalar threshold. SVM determines the regression function in accordance with statistical theory by minimizing an objective function.
The regression function parameters ω and b are calculated by minimizing the regularized risk function as follows:
$$\min_{\omega,\, b,\, \xi,\, \xi^*} \ \frac{1}{2}\lVert \omega \rVert^2 + C \sum_{i=1}^{n} \left( \xi_i + \xi_i^* \right) \qquad (2)$$
$$\text{subject to} \quad y_i - \langle \omega, x_i \rangle - b \le \varepsilon + \xi_i, \quad \langle \omega, x_i \rangle + b - y_i \le \varepsilon + \xi_i^*, \quad \xi_i,\ \xi_i^* \ge 0 \qquad (3)$$
where C is the pre-specified SVM tolerance parameter, and ξ_i and ξ_i^* are slack variables that determine the degree to which data points are penalized if the error is larger than the precision parameter ε, which defines the insensitive loss function. The SVM training algorithm relies on the data in the high-dimensional space only via dot products. In addition, the kernel function generated in Rapidminer can be used to evaluate the dot products in the high-dimensional feature space, as in Equation (4) and in Figure 8:
$$K(x, x') = \langle \phi(x), \phi(x') \rangle \qquad (4)$$
Only K needs to be used in the SVM training algorithm, without explicitly treating the feature space or obtaining the specific form of the mapping φ. The kernel parameters must be carefully selected because they define the high-dimensional space and control the complexity of the final solution. Because SVMs are primarily defined by the types of their kernel functions, it is important to choose the correct kernel function and kernel parameters for each application problem to ensure satisfactory results. A trial-and-error technique was used to pick the kernel-specific parameters.
In this research, the framework model illustrated in Figure 9 was used. The regression performance of all models was evaluated on a labeled dataset, in which the experimental crack width data were assigned the label role and the model outputs were stored in an attribute with the prediction role. The label attribute stores the real observed values, while the prediction attribute stores the label values predicted by the regression models. The models are generated by training on a collection of sample data called the training set. The trained models were then given the test set to assess the accuracy of the predicted crack widths.
For the simulation, nine potential input vectors were selected, and a split data ratio of 75:25 (training set to test set) was applied for all prediction models. The prediction models are described further in Section 5.1.
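
As a rough illustration of the regularized risk that SVM regression minimizes, the sketch below evaluates it for a given linear function. It is not the Rapidminer implementation: the function names are hypothetical, and it relies on the standard fact that, for fixed weights and threshold, the smallest feasible slack for each sample equals the ε-insensitive loss max(0, |residual| − ε).

```python
def linear_predict(weights, bias, x):
    """f(x) = <w, x> + b for one input vector x."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def epsilon_insensitive(residual, epsilon):
    """Zero inside the epsilon tube, linear outside it."""
    return max(0.0, abs(residual) - epsilon)


def regularized_risk(weights, bias, samples, targets, c, epsilon):
    """0.5 * ||w||^2 + C * sum of epsilon-insensitive losses."""
    flatness = 0.5 * sum(w * w for w in weights)
    penalty = sum(
        epsilon_insensitive(y - linear_predict(weights, bias, x), epsilon)
        for x, y in zip(samples, targets)
    )
    return flatness + c * penalty
```

For example, with weights [1.0], zero bias, ε = 0.1 and C = 10, a sample predicted within 0.05 of its target adds nothing to the penalty, while a sample that is off by 1.0 adds 10 × 0.9.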

Comparison of Prediction Models from Rapidminer
A sufficiently large dataset (≤1000 data) of crack widths from the seven RC beam-column joints was collected at every drift ratio level. The data were used for training and testing in the Rapidminer software based on the recommendations from literature as described in Section 4.1 [66]. Four different prediction models were developed and proposed, as defined in Table 3, with potential input parameters.
Table 3. The details of four prediction models.

No    Model                           Prediction Method
1     Deep learning (DL)              Max-out: Based on the maximum coordinate of the input vector.
2     Deep learning (DL)              Rectifier: Rectified Linear Unit (ReLU), which chooses the maximum of (0, x), where x is the input value.
3     Support Vector Machine (SVM)    Dot: The dot kernel is defined by k(x, y) = x · y, i.e., it is the inner product of x and y.
4     Support Vector Machine (SVM)    Neural: The neural kernel is defined by a two-layered neural net tanh(a x · y + b), where a is alpha and b is the intercept constant. These parameters can be adjusted using the kernel a and kernel b parameters. A common value for alpha is 1/N, where N is the data dimension.

Figure 10 illustrates regression analysis plots for the distribution of predicted crack width. The figure includes the equity line as a guide, which represents equal predicted and measured crack widths. The analysis shows that the SVM model with the dot prediction method gave the best prediction with the smallest error: its predicted points were distributed below (27% error) and above (30% error) the reference line, compared to DL max-out (48% and 56%), DL rectifier (55% and 44%), and SVM-neural (29% and 48%). For DL max-out and DL rectifier, almost all predicted points were scattered around the measured crack widths with the largest errors, due to over-fitting. These errors might be attributed to insufficient data, as DL models require a large dataset for complex calculations and simulations [61,72]. The crack widths predicted by SVM-dot followed the equity line consistently, demonstrating that the regression technique captured the input-output relationships. This is consistent with the regression techniques developed for predicting the compressive strength and autogenous shrinkage of concrete [62,63], and with reports that SVM is among the most effective learning methodologies for classification and regression tasks [79][80][81]. Although SVM-neural differed from SVM-dot by only 2%-18%, its predictions for crack widths above 1.5 mm formed a uniform band lying over the measured line, showing uncorrelated and over-predicted values relative to the actual crack widths.
Therefore, the SVM-dot model was selected for further analysis of optimization using data splitting, numbers of input parameters, and performance index and is described in the following sub-section.
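
The four prediction methods in Table 3 reduce to short formulas, sketched below in plain Python following the definitions given in the table. The function names are hypothetical, and the default alpha of 1/N is simply the common value mentioned in the table; Rapidminer's internal implementation may differ.

```python
import math


def maxout(values):
    """Max-out as described in Table 3: the maximum coordinate of the input vector."""
    return max(values)


def rectifier(x):
    """Rectifier: the maximum of (0, x)."""
    return max(0.0, x)


def dot_kernel(x, y):
    """Dot kernel: the inner product of x and y."""
    return sum(xi * yi for xi, yi in zip(x, y))


def neural_kernel(x, y, a=None, b=0.0):
    """Neural kernel: tanh(a * <x, y> + b).

    If a is not given, alpha defaults to 1/N, with N the data dimension.
    """
    if a is None:
        a = 1.0 / len(x)
    return math.tanh(a * dot_kernel(x, y) + b)
```

Note that the dot kernel is unbounded and linear in its arguments, whereas the neural kernel saturates between −1 and 1 for large inner products.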

Data Splitting Ratio Analysis in SVM-Dot Model
One of the most important decisions when developing a prediction model is how to make full use of the existing experimental data. The most common technique is to divide the data into two sets, typically referred to as the training and test sets. The training set is used to build the models and serves as the underlying feature set for estimating parameters, comparing models, and all other activities necessary to develop a complete model. The test set is used only at the end of these activities to provide a final, objective evaluation. Before this point, it is important that the test set is not used: looking at the test-set results earlier would bias them, because the test data would then have been part of the model creation process. Improper splitting of the dataset distorts the estimated performance of the model, and various sophisticated sampling methods have been used to address this problem. In addition, an important aspect of data splitting is how well the training, testing, and validation datasets describe the feature space. When the number of points in the whole dataset is large, any division ratio would work, but when the dataset is limited, the division ratio may play a crucial role [82][83][84]. The SVM-dot model from Section 5.1 is selected for further analysis. Because a specific test set was not available, the split validation operator was used to estimate the model's fit to a hypothetical test set. The split validation operator can also train on one set of data and test on another, specific set of test data. The purpose of splitting the data into two sets in this prediction model was to avoid over- and under-fitting and to avoid optimizing only the training accuracy. The aim is a model that performs well on data that it has never seen (test data), which is called generalization.
In this case, the training data and test data are drawn randomly from the seven specimens (not from a single specimen), which differ in detailing, load-carrying capacity, maximum drift ratio, etc. The test data and training data from each specimen are independent of each other because the specimens have different designs. This procedure ensures that the pattern of input-output data used during testing differs from the pattern used during training. Figure 11 shows three different results, based on linear sampling, for training-to-testing split ratios of 70:30, 75:25, and 80:20.
The results show that the 70:30 split ratio gave the lowest error in predicted crack widths (25%), with points lying near the reference line. It is, thus, a better split ratio for the SVM-dot model than 75:25 (30%) and 80:20 (48%). Most of the predicted points for the 80:20 split ratio fell far below the equity/measured line, due to unbalanced data from the smaller test set, which gives a high variance in prediction and can significantly change the testing accuracy. In other words, significant under-fitting in the 80:20 split ratio may cause redundancy in experimental output data. Past studies have shown that, for smaller datasets (≤1000 data), the proportion typically chosen is 70% for the training set and 30% for the test set. The idea is that more training data is preferred because it improves the classification and regression models, whilst more test data makes the error estimate more accurate [85][86][87]. For this dataset, the 70:30 split is within this range and is a reasonable choice. The trade-off is simple: the less testing data, the greater the variance of the estimated model performance; the more training data, the smaller the variance in the parameter estimates. Starting from the 70:30 split ratio, further optimization based on the number of input parameters is analyzed and discussed in the following sub-section.
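
To make the three split ratios concrete, the sketch below divides an ordered list of records into training and test sets. It assumes that linear sampling means cutting the dataset at a single point without reordering the records, which is our reading of the term rather than something stated in the text; the function name and the placeholder records are hypothetical.

```python
def linear_split(records, train_percent):
    """Split records in their given order into (training, test) lists.

    train_percent is an integer such as 70, 75 or 80.
    """
    if not 0 < train_percent < 100:
        raise ValueError("train_percent must be between 1 and 99")
    cut = len(records) * train_percent // 100
    return records[:cut], records[cut:]


# Placeholder data: 100 dummy record indices, not experimental values.
records = list(range(100))
sizes = {
    percent: tuple(len(part) for part in linear_split(records, percent))
    for percent in (70, 75, 80)
}
```

With 100 placeholder records, the three ratios give training/test sizes of 70/30, 75/25, and 80/20, so the 80:20 split leaves the fewest points for estimating the test error.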

Analysis on Input Parameters Selection for SVM-Dot Model
The choice of input variables is central to defining the optimal functional form of a regression model. Selecting input variables is a task common to the creation of most regression models and depends on discovering relationships within the available data to identify suitable predictors of the model output. In the case of parametric or semi-parametric empirical models, the difficulty of the input variable selection task is somewhat alleviated by the a priori assumption of the model's functional form, which is based on some physical interpretation of the underlying system or process being modeled. In the case of ANN, however, no such assumption is made about the model's structure. The input variables are instead chosen from the available data, and the model is subsequently developed [88]. Consequently, three different input vectors are used, as seen in Table 4, to integrate flexibility and prevent duplication, providing a more reliable prediction model. The results for the different input vectors, in terms of the correlation and consistency between predicted and measured crack widths, are presented in Figure 12.
Table 4. Input variable vectors (7, 9, and 11 inputs).

Inputs Vector (Z 1 -Z )
Crack width (C.W.)

Omitting several important parameters in the seven-input vector resulted in over-fitting (52-58%), reflected in the discrepancy between predicted and measured crack widths. A small sample combined with a small number of inputs does not effectively cover the behavior observed over the broader domain. This can be seen in Figure 12a, where the regression model based on the smaller input set shows a large variance. As seen in Figure 12b, the load parameters (Q max(+ve) and Q max(−ve)) played a significant role, which shows how strongly these parameters are related, in engineering terms, to the experimental crack width data. The prediction model with nine inputs was sufficient (21-25% difference between measured and predicted values). However, in contrast to previous studies [63,65], Figure 12c shows that adding the concrete tensile strength (C t) and reinforcement tensile strength (T t) to this small dataset, and thus increasing the number of inter-related input parameters, reduced the error to less than 20%, with the predicted points distributed evenly above and below the reference line. Several studies found it difficult to select input variables because of the number of variables available; associations between candidate input variables create redundancy, and some variables have little or no predictive power [89,90]. The degree of redundancy depends on the kind of ANN model used and on the number of performance operators it requires, which is greater for complex ANN analyses than for SVM. The SVM-dot model applied in this study uses fewer operators, as in Figure 10, and a straightforward analysis that is highly suitable for small datasets, thus yielding a more robust and efficient regression model.
Further analysis and refinement of the predictive model in Figure 12c, in which an additional operator, the performance index, is applied, are described in the following sub-section.
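The nested input vectors and the dot kernel discussed in this sub-section can be illustrated as follows. This is a sketch under stated assumptions: the field names are hypothetical (the seven base inputs are written as placeholders `z1` to `z7`), and the grouping of the load parameters into the nine-input vector and the tensile strengths into the eleven-input vector is inferred from the discussion of Figure 12b and Figure 12c rather than taken from Table 4.

```python
# Hypothetical field names; the actual entries of Table 4 are not reproduced.
BASE_INPUTS = ["z1", "z2", "z3", "z4", "z5", "z6", "z7"]
LOAD_INPUTS = ["q_max_pos", "q_max_neg"]  # Q max(+ve), Q max(-ve)
STRENGTH_INPUTS = ["c_t", "t_t"]          # concrete / reinforcement tensile strength

# Assumed nesting of the three input vectors (7, 9, and 11 inputs).
INPUT_SETS = {
    7: BASE_INPUTS,
    9: BASE_INPUTS + LOAD_INPUTS,
    11: BASE_INPUTS + LOAD_INPUTS + STRENGTH_INPUTS,
}


def input_vector(record, n_inputs):
    """Extract the n_inputs-long feature vector from one record (a dict)."""
    return [float(record[name]) for name in INPUT_SETS[n_inputs]]


def dot_kernel(x, y):
    """Dot (linear) kernel: k(x, y) = sum of x_i * y_i."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))


if __name__ == "__main__":
    # Placeholder record with every field set to 1.0 (not experimental data).
    record = {name: 1.0 for name in INPUT_SETS[11]}
    for n in (7, 9, 11):
        x = input_vector(record, n)
        print(n, "inputs, k(x, x) =", dot_kernel(x, x))
```

Because the dot kernel has no shape parameters of its own, changing the input vector is the main way the similarity between two specimens changes, which is consistent with the emphasis placed on input selection here.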

Analysis on Performance Indexes (Regression) Supported in SVM-Dot Model
Performance index operators were used to evaluate the statistical performance of the regression task and to provide a list of its accuracy criteria values. The regression performance operator was selected because it determines the type of learning task and automatically computes the most common criteria for that category. Regression is also a method of numerical analysis: a statistical measure that evaluates the strength of the relationship between a dependent variable (label attribute) and a set of other changing variables known as independent variables (regular attributes). This type of operator has been widely used in previous studies to evaluate specific factors, such as commodity prices, interest rates, and particular industries, that affect the price movement of an asset [91][92][93]. To evaluate the statistical performance of the regression model, the dataset must be labeled: it must have an attribute with the label role and an attribute with the prediction role. The label attribute stores the actual observed values, and the prediction attribute stores the label values predicted by the regression model under discussion. For this research, three regression performance indices (Table 5) were used to refine the model and reduce the gap between measured and predicted crack widths; the results are shown in Figure 13.

3. Root mean square error (RMSE)
RMSE is a quadratic scoring rule that measures the average magnitude of the error. It is the square root of the mean of the squared differences between predicted and observed values.

From Figure 13, RMSE can be seen to exhibit a more consistent standard deviation of residuals (prediction errors), up to 5%, approaching the actual crack widths, compared to absolute error (AE) (26-37%) and prediction average (PA) (50%). Relative to the reference lines, the inconsistent predictions showed over-fitting in the case of AE and under-fitting in the case of PA.
The extent of error in the prediction models associated with AE and PA may be considered from the point of view of accuracy and precision. The precision of a measurement reflects its deviation from the average of a large number of measurements of the same quantity, whereas the accuracy of a measured value expresses its deviation from the true value of the quantity. Error is viewed in terms of accuracy when the true value is known; precision must be used instead when the true value of a quantity is not known. Accuracy cannot be achieved if precision cannot be reached; however, precision does not guarantee accuracy. Residuals measure how far the data points are from the regression line, and RMSE measures how these residuals are spread out. In other words, it shows how tightly the data cluster around the line of best fit, which supports the finding of previous studies that RMSE was able to avoid over-fitting in positive identification [62,65]. The RMSE index is, therefore, a simple, quick, and reliable metric compared to the other indices, and is widely used in climate analysis, forecasting, and regression to validate experimental results [50,64,94].
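The three indices compared above can be written out as a short sketch. The function names and sample values are illustrative assumptions, not data from the experiments. The prediction average is taken here as the plain mean of the predicted values, which is an assumption about how the performance operator defines it.

```python
import math


def absolute_error(measured, predicted):
    """Mean absolute deviation between predicted and measured values."""
    return sum(abs(p - m) for m, p in zip(measured, predicted)) / len(measured)


def rmse(measured, predicted):
    """Root mean square error: square root of the mean squared residual."""
    squared = [(p - m) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def prediction_average(predicted):
    """Mean of the predicted values (assumed definition of PA)."""
    return sum(predicted) / len(predicted)


if __name__ == "__main__":
    # Illustrative crack widths in mm; not experimental data.
    measured = [0.10, 0.20, 0.40]
    predicted = [0.12, 0.18, 0.43]
    print("AE  :", absolute_error(measured, predicted))
    print("RMSE:", rmse(measured, predicted))
    print("PA  :", prediction_average(predicted))
```

Because the residuals are squared before averaging, RMSE is never smaller than the mean absolute error and penalizes large individual misses more heavily, which is one reason it is sensitive to how the residuals are spread around the line of best fit.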

Conclusions
In this research, prediction models were developed to analyze and predict crack width in the reinforced concrete beam-column joint region subjected to lateral cyclic loading. The prediction models were developed using Rapidminer machine learning tools, considering the cracks observed at each drift ratio level together with the number of additional shear links and the anchorage length within the joint region. The results have shown that the support vector machine (SVM) model can provide accurate and precise crack width predictions. The following conclusions are drawn from this study:

1.

2. For the data splitting ratio, a 70:30 split of training and testing data for this small dataset was within the range used in previous studies, and was thus a reasonable choice in this research.

3. The largest input vector (eleven inputs) processed by the SVM-dot model is recommended. For the small dataset in this research, increasing the number of inter-related input parameters gave errors of less than 20%, with predicted points distributed consistently along the reference lines, compared to the over-fitting found with seven inputs (52-58%) and the smaller under-fit found with nine inputs (21-25%).

4. Finally, based on conclusions (2) and (3), further optimization was made by adopting the root mean square error (RMSE) performance index, with which the SVM-dot prediction model reduced the error between measured and predicted crack widths to at most 5%, compared to 37% for absolute error (AE) and 50% for prediction average (PA).
The proposed SVM-dot model with RMSE optimization has been shown to provide reliable accuracy for the current application. However, it is applicable only to lateral cyclic loading, which simulates earthquake loading, as this is the main scope of the study. For different types of RC material or loading programs, a new dataset is required to improve the model, which may be considered for future work.