Prediction of Concrete Fragments Amount and Travel Distance under Impact Loading Using Deep Neural Network and Gradient Boosting Method

In the present study, the amount of fragments generated and their travel distances due to vehicle collision with concrete median barrier (CMB) was analyzed and predicted. In this regard, machine learning was applied to the results of numerical analysis, which were developed by comparing with field test. The numerical model was developed using smoothed particle hydrodynamics (SPH). SPH is a mesh-free method that can be used to predict the amount of fragments and their travel distances from concrete structures under impact loading. In addition, deep neural network (DNN) and gradient boosting machine (GBM) were also employed as machine learning methods. In this study, the results of DNN, GBM, and numerical analysis were then compared with the conducted field test. Such comparisons revealed that numerical analysis generated lower error than both DNN and GBM. When prediction results of both the amount of fragments and their travel distances were considered, the result of DNN showed smaller errors than that of GBM. Therefore, in studies where machine learning is used to predict the amount of fragments and their travel distances, careful selection of an appropriate method from the various available machine learning methods such as DNN, GBM, and random forest is absolutely important.


Introduction
In Korea, to ensure passenger protection, the structural performance of concrete median barriers on expressways are evaluated periodically through certain tests [1]. In the passenger protection test, the theoretical head impact velocity (THIV) and postimpact head deceleration (PHD) are limited to 33 km/h and 30 G (gravity acceleration), respectively. In the structural performance test, the weight and travel distance of the fragments of concrete median barriers should not exceed 2 kg and 2 m, respectively. Furthermore, the structural performance and occupant protection performance are affected by uncertainty factors, such as strain rate effect, size effect, characteristics of the impactor, and heterogeneous material. Figure 1 shows fragments generated from a secondary accident caused by a damaged fender installed on top of a concrete median barrier (CMB) due to collision of a truck with the CMB. It is already known that due to certain uncertainties, prediction of the amount of fragments generated and their travel distances is not always accurate. For example, Kim et al. [2] reported that under the same impact energy, the amount of fragments and their travel distances can vary depending on the velocity and mass of the impactors. According to Chopra et al., inaccurate input values can also make prediction difficult [3]. In such scenarios where there are uncertainties and inaccurate information, the movement of fragments after a collision can be predicted by machine learning effectively.
However, a large amount of data is required in machine learning for training, and it is time-consuming to gather such a large amount of date through actual experiments. To this end, finite element method (FEM) has been employed to increase the training data [2,4,5]. Through FEM, any cracks in a concrete structure are simulated by deleting elements, as reported by Lee and Kim et al. [6][7][8][9]. However, such element deletion is not applicable in the current study, as it would affect the amount of fragments. Instead, smoothed particle hydrodynamics (SPH), a mesh-free method developed by Rabczuk and Eibl [10], would be a suitable technique to predict the amount of fragments and their travel distances without any loss of concrete weight.
Furthermore, there has been no parametric study reported to-date on the prediction of concrete fragments and their travel distance. Therefore, in the present study, we developed an analytical model using SPH to predict structural performance of median barriers, which depends upon certain major variables viz. concrete thickness, compressive strength, reinforcement ratio, and collision conditions. The newly developed model was also evaluated by comparing the model results with that of a field test. Furthermore, prediction studies using machine learning have been conducted in various areas [11][12][13][14][15]. However, there has been no parametric study on the prediction of concrete fragments and their travel distance. Therefore, using the analytical results as training data for machine learning, it was possible to accurately predict the amount of fragments generated after a collision and their travel distances.
The study was conducted in 3 steps. The first step was numerical model development and verification (Sections 3.1-3.3). The second step was construction of DNN and LightGBM to predict concrete fragments and their travel distance based on the developed numerical model (Sections 4.1-4.4). The third and final step was a comparative study of the results of DNN and LightGBM with that of field test (Section 4.5).

Smoothed Particle Hydrodynamics Model to Predict the Amount of Fragments and Travel Distance
In this section, theoretical backgrounds of SPH and a concrete material model are discussed. The concrete material model used here is called the continuous surface cap (CSC) model [16,17]. This roadside safety material model was developed by the Federal Highway Administration (FHWA) of the United States to analyze collisions between vehicles and roadside facilities.

Theoretical Background of SPH
The smoothed particle method was developed to solve astrophysical problems. The SPH is a mesh-free method based on Lagrangian formulation. Therefore, SPH does not involve locking of elements or negative volume and can simulate large deformations. For this reason, the SPH method has found wide applications, e.g., in fluid mechanics and soil mechanics. Benz and Asphaug [18] simulated fractures in brittle solids under impact loadings using the SPH method. Liu et al. [19,20] performed computer simulation of large explosions using SPH. The SPH method typically considers interactions of each particle as field functions and their differential form, and the approximations made in this method can be expressed as follows: where f(x) are field functions, ∇·f(x) is differential form, i is particle, m j is mass, ρ j is density, h is smoothing length, W x i − x j , h is the value of the kernel function, and N is the total number of particles. ∇W in the above equation is the gradient, which is evaluated at particle j. A detailed explanation of the SPH method can be found in Liu and Liu [20].

Theoretical Background of Continuous Surface Cap (CSC) Model
The CSC model has several capabilities, such as multiaxial strength, stiffness degradation and dilation, and strain rate effects. The CSC model can simulate continuous intersection between the failure surface and hardening cap. The shear failure surface of this model has a shape of affine-exponential spine and is expressed as where, α, β, λ, and θ are values selected by fitting the model surface to strength measurements from a triaxial compression test of plain concrete cylinders. Fracture energy is an important key parameter to determine generation of cracks due to tension. In the present study, the fracture energy was estimated based on fib Model code 2010 [21], as shown in Equations (4) and (5): where f ctm is tensile strength, f ck is characteristic compressive strength, f cm is mean compressive strength, and G F is the fracture energy of concrete.

Introduction of the Developed Local Impact Model
According to Kim et al. [2], the maximum amount of fragments is generated when top of a median barrier and corner of cargo of a vehicle contact during a collision. The vehicle model of National Crash Analysis Center (NCAC) based on the European standard EN-1317 [22] is generally used in vehicle collision analysis. However, this vehicle model does not simulate local collision between the lower corner zone of a cargo compartment and the upper zone of a concrete median barrier. Therefore, a detailed numerical model was developed in the current work to simulate a local collision. Figure 2 shows our newly developed numerical analysis model for verification. In this model, the concrete for numerical model was SPH and the wire-mesh for numerical model was beam element. The concrete material model used here is the CSC model [16,17], while the wire-mesh material model used is piecewise linear plasticity, which is based on elastoplastic material model.

The Developed Numerical Analysis Model
The key parameters selected for the CSC model were "tension softening parameter", "compression softening parameter", "fracture energy", "repow", and "erode". To estimate the concrete fracture energy, equations from the fib Model code 2020 [21] were used. The "erode" determines the deletion of elements with plastic strain. If an element lacks the material capacity given by its yield surface, the element should be removed from the simulation of the behavior of concrete. Thai et al., El-tawil et al., and Murray selected 1.4, 1.4, and 1.1 for "erode", respectively [16,23,24]. However, in the current study, 1.0 was selected for "erode" following Kim et al. [2].
Another key parameter, "repow", is for rate effect parameter. This parameter is used to increase fracture energy based on the rate effect parameter [16,17]. In the present study, the default value of "repow" was selected as 1.0. The selected parameters for tension softening, compression softening, and erode were 0.1, 100, and 1.0, respectively. A detailed discussion of the parameter can be found in Kim et al. [2].
To consider the strain rate effect of wire-mesh, Cowper-Symonds' equation was used with C and p coefficients. The values of the selected C and p coefficients were 1.05 × 10 7 and 8.3 according to Chung et al. [25].
The new model was developed by comparing our results with that of Xiao et al. [26]. The original shape of a concrete median barrier is thicker toward the lower end. However, in the new model, a slab-type concrete structure was selected since the lower part of the concrete median barrier does not affect the amount of fragments after collision. The size of the concrete structure selected was 3000 mm × 1270 mm × depth parameter.
The crack patterns and damaged area obtained from the developed numerical model were compared with the experimental results. Such comparison indicated that the developed numerical model could predict crack patterns and damaged area correctly. A detailed discussion of the numerical model can be found in Kim et al. [2].

Research Scopes of CMB
To predict the amount of fragments and travel distance, the scopes of the key parameters need to be established. With limited statistical data of CMB-vehicle collision, research scopes were selected for concrete thickness, compressive strength, reinforcement ratio, impact location, and impact energy (impact velocity and impact mass). However, the impact mass could not be predicted from the results of field test; therefore, we conducted reverse analysis with various masses using the numerical model to estimate the impact mass relevant to local impact of a truck with a CMB, as shown in Figure 3. Furthermore, impact locations and impact velocities were obtained from recorded video data obtained from the Korea Expressway Corporation [27,28]. The damaged zones of the CMB in both the front and rear side faces were compared with the field test data [6][7][8]29].
A detailed discussion on model verification can be found in Kim et al. [2]. Table 1 shows the scope of the key parameters for numerical analysis.

Prediction and Verification of the Travel Distance
The travel distance was predicted using the initial velocity of the fragments. The prediction equation for fall time is expressed as where a is gravitational acceleration (9810 mm/s), t is fall time (s), V yi is the initial velocity in the direction of gravity (mm/s), and 1270 mm is the height of the concrete median barrier. The prediction equation can be expressed as where, V xi is the initial lateral velocity (mm/s), D L is the prediction travel distance value (mm), and t is the obtained time from Equation (6).

Application of Machine Learning to Evaluate Structural Performance under Impact
To predict the amount of concrete fragments, Kim et al. [2] conducted multiple linear regression analysis (MRA) considering the numerical analysis results. However, the coefficient of determination (R 2 ) was not high since the MRA defines the relationship between independent and dependent variables. Therefore, ANN was employed to improve the relationship between the independent and dependent variables. There are various algorithms based on ANN such as Deep Neural Network, Recurrent Neural Network (RNN), Gated Recurrent Unit (GPU), and Long Short-Term Memory (LSTM). With a computer having high processing speed and sufficient amount of training data available, DNN can be applied very rapidly and effectively in various fields. On the other hand, LightGBM has the advantages of excellent accuracy and fast training processing speed using the leafwise method. Therefore, to predict the amount of fragments and travel distance of concrete under impact loadings, both DNN and LightGBM were constructed based on the results of SPH analysis, taking the uncertainties into account.

Development of ANN
Deep learning was perceived and developed in the way the human brain works. Deep learning progresses typically via three layers, viz., the input layer, hidden layer, and output layer. The results from the input layer are combined in the hidden layer using a combination function. The results are then weighted, calculated, and sent to the output layer via an activation function. In the current work, to minimize error, a backpropagation technique was followed in which the errors found in the output layer were calculated backward (see Figure 4). There are many activation functions, such as sigmoid, rectified linear unit (ReLU), and hyperbolic tangent function. Among these, the ReLU is used most frequently and we used this in the current study.
Abbas and Jang performed future data prediction using ReLU and tanh as activation functions, with Adam and RMSprop as optimizers [11]. Results indicated that ReLU and Adam showed better performances than tanh. Therefore, ReLu and Adam were selected as the activation function and optimizer, respectively, in this study.
For learning rate, we selected the values that would not generate underfitting or overfitting by considering combinations with other parameters using the values, 0.01, 0.001, and 0.0001. Generally, large epoch values are needed for small training rates, and small epoch values for large training rates. The caveat here is that optimum weight values are not determined when large training rates are used, which results in inaccurate prediction results.
In the present work, the DNN was optimized considering the number of layers, the number of nodes, and epoch so that the DNN does not generate overfitting or underfitting. Table 2 shows the selected parameters.

Development of Gradient Boosting Method
The boosting method is one of the ensemble methods that utilizes decision trees, as shown in Figure 5. It continuously improves an error from a single model. It generally exhibits high accuracy because it ultimately applies all training models considered so far. There can be outliers, which lie outside the normal distribution, in the distribution of each variable. Furthermore, to note, where the boosting method assigns one weight to all the models, the gradient boosting machine (GBM) assigns different weights to an individual model. In GBM, the weights are applied in the form of differentiated errors. The errors in GBM are calculated using the down gradient method, in which the differentiated value becomes the minimum. The Loss function of GBM can be represented as where y i are the observed values and f (x i ) are the predicted values. The gradient can be represented as shown below.
where the most optimum decision tree is made to predict the minimized residual via repeated iteration. Training is carried out in such a way so that it complements the error of the previous tree. To minimize error, a second model is made to predict the error of the first model, and a third model is made to predict the error of the second model. This iteration is repeated to create a final model that reduces the error and can be used to make predictions. The advantage of GBM is that the training and prediction times are very fast because it can create an asymmetric tree shape. The accuracy of this method is also known to be very good.
In the present study, we carried out GBM by considering the learning rate, the number of estimators, and the number of iterations using LightGBM developed by Microsoft in 2016. The boosting algorithm horizontally expands the tree by level with levelwise method. On the other hand, LightGBM is an algorithm that vertically expands the tree based on leaves with max delta loss through leafwise analysis. Assuming that both algorithms generate the same leaves, we obtained a classification model that is faster and less lossy than the time-consuming levelwise analysis algorithm by scaling symmetrically regardless of the loss rate.

Results of DNN and GBM
In Figure 6, the prediction and numerical analyses results of the amount of fragments and their travel distances using DNN are compared. The coefficients of determination (R 2 ) for the amount of fragments and their travel distances are 0.9406 and 0.9165. This indicates that the prediction results of the amount of fragments are better than the analysis results. The prediction results of the travel distances were obtained using the initial velocity and not from the direct analysis results, which we believe is the reason why the prediction results were not accurate. Nevertheless, the coefficient of determination for the prediction of travel distances was as good as 0.9165, and the analysis result is properly reflected. In Figure 7, the prediction results of the amount of fragments using GBM and the analysis results are compared. The coefficient of determination for the amount of fragments was 0.9795, while the coefficient for their travel distances was 0.9207. Similar to the prediction results obtained using DNN, the prediction results of the amount of fragments by GBM are better than the analysis results. Furthermore, the coefficients of determination obtained using GBM are higher than those obtained using DNN. We can thus confirm that the prediction results of GBM are better than the prediction results of the analysis. The degrees of importance of the variables in predicting the amount of fragments and their travel distances using GBM are shown in Figures 8 and 9. It was observed that the amount of fragments was affected mostly by concrete thickness and the travel distance was most affected by collision speed. On the other hand, the amount of fragments and their travel distances were affected least by compressive strength of the concrete and the reinforcement ratio. Accordingly, the compressive strength in the range 25.5-34.5 MPa used in this study did not affect the amount of fragments and their travel distances. Figure 9 shows the decision tree structure for predicting the amount of concrete fragments considering various parameters. The selected parameters were learning rate 0.01, number of iterations 20,000, and number of estimators 1000.   Table 2 shows the final mean absolute error (MAE) and R 2 after the training using DNN and GBM. MAE was smaller and R 2 was larger in GBM than in DNN. The error between the prediction and the analysis results was smaller in DNN, while performance of the regression model was better in GBM. Generally, high MAE values tend to show low R 2 values. However, the reason for high R 2 values at such low MAE values is because the same training data were not used. Only 80% of the training data were randomly selected to carry out the training.

Results of the DNN and Gradient Boosting Machines
In Figure 10, the prediction results from the test data that were not used in learning are compared with those from the numerical analysis. For the prediction on the amount of fragments, R 2 for DNN was 0.8759 and R 2 for GBM was 0.5853, whereas for the prediction on the travel distance, R 2 for DNN was 0.5308 and R 2 for GBM was 0.0866. It can be observed that DNN reflected better predictions than the analysis results. When compared with the training data, the R 2 value was higher in GBM than in DNN. However, for the prediction using the test data, GBM showed a relatively smaller R 2 value than DNN. Therefore, the GBM models showed the best for the training data, whereas the DNN models showed improved results for both training data and test data. This means that DNN gave better prediction than GBM. The studies by Nawar and Mouazen [30] also showed similar trends. This points out that depending on the training data set, one method may outperform other methods. Therefore, while conducting machine learning studies of prediction, one should evaluate various machine learning techniques, such as DNN, GBM, and Random Forest, and then select a proper technique.

Comparison of the Prediction Results with Experimental Test
A field test was performed as shown in Figure 11. The amount of fragments and travel distances predicted from DNN, GBM, and numerical analyses with the same dimensions and conditions were compared with the field test results. Table 3 shows the MAE and R 2 of DNN and LightGBM. Table 4 exhibits both the test results and prediction values, whereas Figure 12 shows the relative accuracies with respect to the test results. For machine learning, the amount of fragments predicted by DNN and GBM were 20 kg and 14 kg, respectively. The errors with respect to the real test were 23.1% and 46.2%, respectively. By contrast, the amount of fragments predicted by numerical analysis was 33 kg, and the difference from the real test data was 26.9%. Therefore, it can be seen that DNN exhibited the smallest error. The values of travel distance obtained from DNN, GBM, and numerical analysis were 456 mm (30.9%), 297 mm (55.0%), and 815 mm (23.5%). When the amount of fragments and their travel distances were considered jointly, the average error for numerical analysis, DNN, and GBM were 25.2%, 27.0%, and 50.6%, respectively. Therefore, for both fragmentation amount and travel distance, numerical analysis generated results closer to the field test ones than DNN and GBM. In machine learning, the DNN showed smaller errors than GBM with respect to the field test results. Therefore, it can be inferred that DNN was more efficient than GBM in predicting the amount of fragments and their travel distances after collision with concrete structures and various input variables, nonlinearities, and uncertainties were inherent in the analysis.   To note, in South Korea, if the travel distance and amount of fragments are more than 2 m and 2 kg, respectively, it is considered a failed test. The CMB in such a case does not meet the real impact test guideline for vehicle safety guard [1] and is considered to have poor performance. When these regulations were applied, it was found that all the methodologies employed in this study, i.e., the numerical analysis, DNN, and GBMpassed the test and met those guidelines. Furthermore, the field test results also showed that all the guidelines were fulfilled (no more than 2 kg and 2 m). Therefore, if machine learning is learned based on pass/fail, it is expected that the accuracy of prediction will improve.

Conclusions and Future Study
In the current study, to predict the amount of fragments and their travel distances, a collision was analyzed and the results were predicted using SPH. The analytical model closely simulated the configurations of the fracture and the scope of damage to the concrete after collision.
Using the newly developed analytical model, various variables were analyzed and the results were utilized as training data for DNN and GBM. The results of the training showed that MAE was smaller in DNN, and R 2 was larger in GBM. Therefore, the error in the analysis results was smaller in DNN, and the performance of the regression model was higher in GBM. When the field test results were compared with the results of the numerical analysis, DNN, and GBM, the average errors for the amount of fragments and their travel distances were 25.2%, 27%, and 50.6% for the numerical analysis, DNN, and GBM, respectively.
From the results of numerical analysis, DNN, and GBM prediction, it was found that no fragments and travel distance values obtained were more than 2 kg and 2 m, respectively. In the field test, neither the fragments amount nor the travel distance were more than 2 kg and 2 m, respectively. The developed numerical model, DNN, and GBM were found to predict the field test results very well following the real impact test guidelines for vehicle safety guard [1].
In the study of fragments amount and travel distance prediction using machine learning, the prediction results showed different errors depending on the method of machine learning. Therefore, from the various machine learning methods, such as DNN, GBM, and random forest, it is important to select an appropriate one.