Rock Fragmentation Prediction Using an Artificial Neural Network and Support Vector Regression Hybrid Approach

Abstract: While empirical rock fragmentation models are easy to parameterize for blast design, they are usually prone to errors, resulting in less accurate fragment size prediction. Among other shortfalls, these models may be unable to accurately account for the nonlinear relationship that exists between fragmentation input and output parameters. Machine learning (ML) algorithms are potentially able to better account for the nonlinear relationship. To this end, we assess the potential of the multilayered artificial neural network (ANN) and support vector regression (SVR) ML techniques in rock fragmentation prediction. Using geometric, explosives, and rock parameters, we build ANN and SVR models to predict mean rock fragment size. Both models yield satisfactory results and show higher performance when compared with the conventional Kuznetsov model. We further demonstrate an automated means of analyzing a varied number of hidden layers for an ANN using Bayesian optimization in the Keras Python library.


Introduction
Rock fragmentation is the process by which rock is broken down into smaller fragments by mechanical tools or by blasting. The resulting fragment size distribution may be characterized by a histogram showing the percentage of particles in each size class, or as a cumulative size distribution curve [1]. The primary means of rock fragmentation in mining is blasting. A good blast produces a size distribution that is well suited to the mining system it feeds, maximizes saleable fractions, and enhances the value of saleable material [2]. Efficient blasting saves significant amounts of money that would otherwise be spent on secondary blasting [3]. It also yields significant savings on the costs of downstream comminution processes, i.e., crushing and grinding.
The results of a blast depend on several parameters, which are broadly categorized as controllable and uncontrollable [4,5]. Controllable parameters can be varied by the blasting engineer to adjust the outcome of blasting operations. They can be grouped into geometric, explosives, and time parameters. Geometric parameters include drill hole diameter, hole depth, charge length, spacing, burden, and stemming height. Explosives parameters include the type of explosive, explosive strength and energy, powder factor, and priming systems. Time parameters include delay timing and initiation sequence. A blasting engineer's ability to change these controllable parameters dynamically in response to as-drilled information is critical to achieving good fragmentation [3]. The uncontrollable parameters constitute the geological and geotechnical properties of the rock mass. These parameters are inherent, and thus cannot be varied to adjust blasting outcomes. They include rock strength, rock specific gravity, joint spacing and condition, presence and depth of water, and compressional stress wave velocity [6]. Though these parameters cannot be varied by the blasting engineer, adequately accounting for them in a blast design helps to achieve good fragmentation. Figure 1 is a bench blast profile showing a variety of design parameters.
Mining 2022, 2, FOR PEER REVIEW
Several studies have sought to predict fragment size distribution based on the parameters used in blast design. Accurate prediction gives blasting engineers control over the outcome of blasting operations. Consequently, engineers will know which controllable parameters to modify, and to what extent.
An accurate prediction model leads to good post-blast results, which bring enhanced loader and excavator productivity along with numerous downstream benefits. However, the prediction exercise proves to be challenging because numerous parameters influence fragmentation. Additionally, the rock mass may be heterogeneous and/or anisotropic in its structures of weakness. For these reasons, it is impossible to develop a predictive tool based solely on theoretical and mechanistic reasoning [5]. Researchers have thus mostly resorted to empirical techniques in predicting the outcome of fragmentation, with the Kuz-Ram model being the most widely used. The empirical models are favored in daily blasting operations because they are easily parameterized. A major shortfall of the empirical methods, however, is that certain significant parameters are not accounted for, and this leads to less accurate results. Cunningham [2] notes that essential parameters omitted by empirical techniques include rock properties and structure, e.g., joint spacing and condition, detonation behavior, and mode of decking. Other parameters include blast dimensions and edge effects from the borders of the blast. Over the years, researchers have modified existing models and formulated new ones in an attempt to improve prediction accuracy. While this has contributed to significant improvement, none of the ensuing models incorporates all the important parameters, and accuracy is still of concern. In some instances, highly simplified or inappropriate procedures were used for estimating the properties of structural weakness in the rock mass [5]. Furthermore, the relationship between fragmentation input and output parameters is highly nonlinear, and empirical models may not be well suited for such modeling.
To this end, researchers have, in recent years, sought to implement machine learning (ML) techniques for fragmentation prediction. The objective was to capture as much of the inherent nonlinearity as possible using limited input parameters, and thereby improve accuracy. Kulatilake et al. [5] and Shi et al. [9] have respectively exploited the potential of artificial neural network (ANN) and support vector regression (SVR) for this purpose, and have achieved satisfactory results. ANN and SVR are machine learning techniques that have proven capable of recognizing highly nonlinear relationships. However, ANN models in the rock fragmentation literature have been limited to only one hidden layer, and do not exploit the potential of the multilayered network (ANN with more than one hidden layer), which could lead to higher accuracy. In this research, we implement SVR and a variety of multilayered ANNs for predicting mean fragment size.
Machine learning (ML) is a branch of artificial intelligence (AI) that allows computer systems to improve their performance at a task through experience (learning) for the purpose of predicting future outcomes [7,8]. It is a multidisciplinary field that relies significantly on specialized subject areas such as probability and statistics, and control theory. ML techniques are broadly classified as supervised and unsupervised learning. Supervised learning is concerned with predicting an outcome given a set of input data. It does so by making use of the relationship established between representative sets of input and output data during model training. Unsupervised learning is concerned with data segmentation based on pattern recognition. Unsupervised ML techniques can infer patterns from data without reference to known outcomes. They are useful for discovering the underlying structure of a given data set. The rock fragmentation problem is a regression problem that is suited to tools of supervised machine learning such as multivariate regression analysis, artificial neural network (ANN), and support vector regression (SVR). The last two comprise algorithms that are more robust to nonlinear relationships between input and output data [5,9]. They are thus considered in this study, since rock fragmentation input and output parameters are nonlinearly related.

Preliminary Background
We provide a fundamental explanation of the machine learning techniques used in this study. The section describes the architecture of the artificial neural network and support vector regression.

Artificial Neural Network (ANN)
Artificial neural network (ANN) is a machine learning technique that is inspired by the way the biological neural system works, such as how the brain processes information [7,8,10]. Information processing in ANN involves many highly interconnected processing elements known as neurons that work together to solve specific problems. The learning process involves adjustments to the synaptic connections existing between the neurons [7,11]. In the biological neural system, a neuron consists of a cell body, known as the soma, an axon, and dendrites. The axon sends signals, and the dendrites receive these signals. A synapse connects an axon to a dendrite. Depending on the signal it receives, a synapse might increase or decrease electrical potential. An ANN consists of a number of neurons similar to human biological neurons. These neurons are known as units and are connected by weighted links that transmit signals from one neuron to the other [7,12]. The output signal is transmitted through the neuron's outgoing connection, which is analogous to the axon in the biological neuron. The outgoing connection splits into a number of branches that transmit the same signal. The outgoing branches terminate at the incoming connections (analogous to dendrites) of other neurons in the network [7].
An ANN has three types of neurons, known as input, hidden, and output neurons. They are stacked in layers, receive input from preceding neurons or external sources, and use this input to compute an output signal using an activation function. The activation function is a mathematical formula for determining the output of a neuron based on the neuron's weighted inputs. The output signal is then propagated to succeeding neurons. During training, the ANN adjusts its weights to minimize the error between its predicted output and the target output variable(s) [13]. The complexity of the ANN architecture makes it well suited for solving both linear and nonlinear problems [10]. Advancement in computational power has enhanced its use in the fields of engineering, industrial process control, medicine, risk management, marketing, finance, communication, and transportation.
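To make the layered computation concrete, the following is a minimal sketch of a forward pass through a small fully connected network. The layer sizes, weights, biases, and the choice of a ReLU activation are illustrative assumptions and are not taken from this study.

```python
def relu(z):
    """Rectified linear activation: passes positive values, zeroes the rest."""
    return max(0.0, z)


def identity(z):
    """Linear activation, commonly used for a regression output neuron."""
    return z


def dense(inputs, weights, biases, activation):
    """One layer: each neuron applies the activation to its weighted input sum."""
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Propagate the input signal through the layers in order."""
    signal = inputs
    for weights, biases, activation in layers:
        signal = dense(signal, weights, biases, activation)
    return signal


# Hypothetical network: 2 inputs -> 2 hidden neurons (ReLU) -> 1 linear output.
layers = [
    ([[0.5, -0.2], [0.3, 0.8]], [0.1, -0.1], relu),
    ([[1.0, -1.0]], [0.05], identity),
]
output = forward([1.0, 2.0], layers)
```

Training would then adjust the entries of `weights` and `biases` to reduce the prediction error; that step is omitted here.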

Support Vector Regression (SVR)
Support vector regression (SVR) is a type of supervised machine learning that is based on statistical learning theory [14]. Just like the ANN, SVR is efficient at modeling nonlinearly related variables, and the support vector framework on which it is built handles both classification and regression problems. It works by nonlinearly mapping, i.e., transforming, a given data set into a higher dimensional feature space, and then solving a linear regression problem in this feature space [9,15]. That is, it seeks to predict a single output variable ($\hat{y}$) as a function of $n$ input variables ($x$) using a function $f(x)$ that has at most $\varepsilon$ deviation from the actual values ($y$) for all the training data [16]. Equation (1) expresses this function in its simplest form as a linear relationship [9]:

$$\hat{y} = f(x) = w^{T}\phi(x) + b \tag{1}$$

In Equation (1), the function $\phi(x)$ denotes the mapping into the high dimensional kernel-induced feature space. Kernel refers to the mathematical function used in the data transformation process. Different kernels are available for use in SVR analysis. They include the linear, polynomial, radial basis function (rbf), and sigmoid kernels. Parameter $w$ in Equation (1) is a weight vector, and $b$ is a bias term. Both $w$ and $b$ are calculated by minimizing a regularized cost function. Figure 2 is a graphical representation of the SVR concept. The $\pm\varepsilon$ deviation from the actual values ($y$) can be described as a tube that contains the sample data within a certain limit $\varepsilon$ [16]. This implies that the function $f(x)$ is constrained by the $\pm\varepsilon$ limits to form a tube that represents the data set with the expected deviations.
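The tube described above is usually formalized through the ε-insensitive loss, under which a prediction that falls within ε of the actual value incurs no penalty. The sketch below illustrates this idea only; the sample values are hypothetical, and the function name is ours rather than part of any library.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Per-sample loss: zero inside the tube, linear in the excess outside it."""
    return [
        max(0.0, abs(actual - predicted) - epsilon)
        for actual, predicted in zip(y_true, y_pred)
    ]


# Hypothetical actual and predicted values with a tube half-width of 0.04.
losses = epsilon_insensitive_loss(
    y_true=[0.30, 0.50, 0.80],
    y_pred=[0.32, 0.60, 0.79],
    epsilon=0.04,
)
```

Only the second sample lies outside the tube here, so it is the only one that contributes to the cost being minimized.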

Literature Review
The ability to accurately predict fragment size distribution from a given blast design gives blasting engineers control over the outcome of blasting operations. Engineers will be able to identify which controllable parameters to modify, and to what extent. To this end, several studies have sought to predict fragment size distribution based on the parameters used in blast design. These studies have resulted in empirical prediction models, with the Kuz-Ram being the most common model in use. Others include the crush zone model (CZM), two-component model (TCM), Kuznetsov-Cunningham-Ouchterlony (KCO), SveDeFo, and Larson's equation [4,18]. The reliance on empirical models stems from the complexity that comes with the attempt to develop explicit theoretical and mechanistic equations to predict the outcome of fragmentation [2,4,5]. This complexity is primarily attributed to the large number of parameters that affect a blast, coupled with geological heterogeneity [5,9].

The Kuz-Ram model is essentially a three-part model consisting of a modified version of the Kuznetsov equation, the Rosin-Rammler equation, and the Cunningham uniformity index. The parameters defined by these equations constitute the output of the prediction model [4]. The Kuznetsov equation predicts the mean fragment size ($X_{50}$), and the original version is given by Kuznetsov [19] as:

$$X_{50} = A \left(\frac{V}{Q}\right)^{0.8} Q^{1/6} \tag{2}$$

In Equation (2), $X_{50}$ is the mean fragment size (cm); $A$ is the rock factor (7 for medium hard rocks, 10 for hard but highly fissured rocks, 13 for very hard, weakly fissured rocks); $V$ is the rock volume (m$^3$); and $Q$ is the weight of TNT (kg) equivalent in energy to the explosive charge in one borehole. A shortfall of the equation is that the rock mass categories it defines are very wide, and thus need more precision [5]. Cunningham [20,21] provides a modified version of the equation as follows:

$$X_{50} = A K^{-0.8} Q^{1/6} \left(\frac{115}{RWS}\right)^{19/20} \tag{3}$$

In Equation (3), $A$ is the rock factor, and varies between 0.8 and 22 depending on hardness and structure; $K$ is the powder factor, defined as the weight of explosive, in kg, per cubic meter of rock; $Q$ is the mass, in kg, of the explosive in the hole; and $RWS$ is the weight strength relative to ANFO (115 is the RWS of TNT).
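As a worked illustration, the modified Kuznetsov relation can be evaluated directly. The sketch below assumes the commonly cited form $X_{50} = A K^{-0.8} Q^{1/6} (115/RWS)^{19/20}$; the function name and the input values are hypothetical and do not come from the data set used in this study.

```python
def modified_kuznetsov_x50(rock_factor, powder_factor, charge_mass, rws):
    """Mean fragment size X50 (cm) from the modified Kuznetsov equation.

    rock_factor   -- A, dimensionless
    powder_factor -- K, kg of explosive per cubic meter of rock
    charge_mass   -- Q, kg of explosive in the hole
    rws           -- weight strength relative to ANFO (ANFO = 100, TNT = 115)
    """
    return (
        rock_factor
        * powder_factor ** -0.8
        * charge_mass ** (1.0 / 6.0)
        * (115.0 / rws) ** (19.0 / 20.0)
    )


# Hypothetical blast: medium hard rock, 0.5 kg/m^3, 100 kg of ANFO per hole.
x50_cm = modified_kuznetsov_x50(
    rock_factor=7.0, powder_factor=0.5, charge_mass=100.0, rws=100.0
)

# Raising the powder factor, all else equal, gives a finer predicted mean size.
x50_finer_cm = modified_kuznetsov_x50(
    rock_factor=7.0, powder_factor=0.8, charge_mass=100.0, rws=100.0
)
```

For these assumed inputs the predicted mean size is roughly 30 cm, and it decreases as the powder factor increases.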
The role of the Rosin-Rammler equation is to estimate the complete fragmentation distribution. For a given mesh size or screen opening, $X$, this equation estimates the proportion of fragments retained. It is given as [22]:

$$R_x = \exp\left[-\left(\frac{X}{X_c}\right)^{n}\right] \tag{4}$$

where $R_x$ is the proportion of fragments larger than the mesh size $X$ (cm), $X_c$ is the characteristic fragment size (cm), and $n$ is the uniformity index. The characteristic size is one through which 63.2% of the materials pass. If the characteristic size and the uniformity index are known, a size distribution curve can be plotted for the rock fragments [18]. The curve is plotted as percentage passing vs. mesh size. The former is obtained by subtracting $R_x$ from one. Equation (4) can be rewritten to make direct use of the mean fragment size, $X_{50}$, as follows [20,21]:

$$R_x = \exp\left[-0.693\left(\frac{X}{X_{50}}\right)^{n}\right] \tag{5}$$

From Equations (4) and (5), the characteristic size can be deduced as:

$$X_c = \frac{X_{50}}{0.693^{1/n}} \tag{6}$$

The third part of the Kuz-Ram model is the uniformity index, developed by Cunningham through several investigations which involved consideration of the effects of blast geometry, hole diameter, burden, spacing, hole length, and drilling accuracy [4]. This equation is given as [20,21]:

$$n = \left(2.2 - 14\frac{B}{d}\right) \sqrt{\frac{1 + S/B}{2}} \left(1 - \frac{W}{B}\right) \left(\frac{|BCL - CCL|}{L} + 0.1\right)^{0.1} \frac{L}{H} \tag{7}$$

where $B$ is the burden (m); $S$ is the spacing (m); $d$ is the hole diameter (mm); $W$ is the standard deviation of drilling precision (m); $L$ is the charge length (m); $BCL$ is the bottom charge length (m); $CCL$ is the column charge length (m); and $H$ is the bench height (m). Equation (7) is multiplied by 1.1 when using a staggered pattern. The value of $n$ is essential in determining the shape of the size distribution curve, and is usually between 0.7 and 2.
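The Rosin-Rammler relations can be checked numerically. The sketch below assumes the standard forms $R_x = \exp[-(X/X_c)^n]$ and $X_c = X_{50}/0.693^{1/n}$; the function names and the example values of $X_{50}$ and $n$ are hypothetical.

```python
import math


def characteristic_size(x50, n):
    """Characteristic size X_c from the mean size X50 and uniformity index n."""
    return x50 / (0.693 ** (1.0 / n))


def fraction_retained(x, x_c, n):
    """Proportion of fragments larger than mesh size x (Rosin-Rammler)."""
    return math.exp(-((x / x_c) ** n))


def fraction_passing(x, x_c, n):
    """Proportion of fragments passing mesh size x."""
    return 1.0 - fraction_retained(x, x_c, n)


# Hypothetical blast result: X50 = 30 cm with a uniformity index of 1.2.
x50, n = 30.0, 1.2
x_c = characteristic_size(x50, n)

passing_at_x50 = fraction_passing(x50, x_c, n)  # close to one half
passing_at_xc = fraction_passing(x_c, x_c, n)   # close to 63.2%

# Points for a percentage-passing curve over a range of mesh sizes (cm).
curve = [(x, 100.0 * fraction_passing(x, x_c, n)) for x in (5, 10, 20, 40, 80)]
```

The two check values recover the defining properties of the distribution: about half of the material passes $X_{50}$, and about 63.2% passes $X_c$.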
High values of $n$ indicate uniform sizing, while low values indicate a wide range of sizes, including both oversize and fines [18,23]. Equations (3), (5), and (7) are what constitute the typical Kuz-Ram model. Cunningham [2] modified the model twenty years later, mainly as a result of the introduction of electronic delay detonators. This led to what is now known in the literature as the modified Kuz-Ram model. The adjustments by Cunningham incorporate the effects of inter-hole delay and timing scatter. The changes also incorporate correction factors for the rock factor and uniformity index. These changes lead to the modification of Equations (3) and (7) as follows [2]:

$$X_{50} = A\, A_T\, K^{-0.8} Q^{1/6} \left(\frac{115}{RWS}\right)^{19/20} C(A) \tag{8}$$

$$n = n_s \sqrt{2 - 30\frac{B}{d}} \sqrt{\frac{1 + S/B}{2}} \left(1 - \frac{W}{B}\right) \left(\frac{L}{H}\right)^{0.3} C(n) \tag{9}$$

where $A_T$ is a timing factor for the effect of inter-hole delay, $C(A)$ is a correction factor for the rock factor, $n_s$ is the uniformity factor for the effect of timing scatter, and $C(n)$ is a correction factor for the uniformity index. Thus, the modified Kuz-Ram model comprises Equations (5), (8), and (9). A major shortfall of the Kuz-Ram model is the underestimation of fines. Extensions to the model have thus emerged with the objective of improving the prediction of fines. The CZM and TCM are such models [18]. Kanchibotla, Valery, and Morrell [24] address the issue of fines via the CZM model, which provides fragment distribution based on the coarse and fine parts of the muck pile. The authors note that during blasting, two different mechanisms control rock fragmentation, i.e., tensile fracturing and compressive-shear fracturing. Tensile fracturing produces coarse fragments, while compressive-shear fracturing produces the fines. The model predicts the coarser part of the size distribution using the Kuz-Ram model. The size distribution of the finer part is predicted by modifying the values of $n$ and $X_c$ in the Rosin-Rammler equation. Djordjevic [25] develops a two-component model (TCM) based on the same mechanisms of failure captured by Kanchibotla et al. [24] in their work. The model utilizes experimentally determined parameters from small-scale blasting, and parameters of the Kuz-Ram model, to obtain an improved prediction of fragment size distribution.
Ouchterlony [26] develops the KCO model, which ties in the Kuz-Ram, CZM, and TCM models. The KCO model replaces the original Rosin-Rammler equation with the Swebrec function to predict rock fragment size distribution. The replacement stems from the author's recognition that the Rosin-Rammler curve has limited ability to follow the various distributions from blasting. The Swebrec function proves to be more adaptable and is able to predict fines better. The model is given by Equations (10) and (11) as follows [26]:

$$P(x) = \frac{1}{1 + \left[\dfrac{\ln(X_{max}/X)}{\ln(X_{max}/X_{50})}\right]^{b}} \tag{10}$$

$$b = \left[2 \ln 2 \cdot \ln\left(\frac{X_{max}}{X_{50}}\right)\right] n \tag{11}$$

where $P(x)$ is the percentage of fragments passing a given mesh size, $X$; $X_{max}$ is the upper limit of fragment size; $X_{50}$ is the mean fragment size; and $b$ is the curve undulation parameter. Just like the Rosin-Rammler model, the Swebrec function has the mean fragment size ($X_{50}$) as its central parameter, but it introduces an upper limit to fragment size ($X_{max}$).
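For illustration, the basic Swebrec function can be evaluated as below. This is a sketch under the assumption that the function takes its commonly written form, $P(x) = 1/\{1 + [\ln(X_{max}/X)/\ln(X_{max}/X_{50})]^{b}\}$; the function name and the values of $X_{50}$, $X_{max}$, and $b$ are hypothetical.

```python
import math


def swebrec_passing(x, x50, x_max, b):
    """Fraction of fragments passing mesh size x under the Swebrec function.

    Requires 0 < x and x50 < x_max; sizes at or above x_max pass completely.
    """
    if x >= x_max:
        return 1.0
    ratio = math.log(x_max / x) / math.log(x_max / x50)
    return 1.0 / (1.0 + ratio ** b)


# Hypothetical muck pile: X50 = 30 cm, X_max = 150 cm, undulation b = 2.
x50, x_max, b = 30.0, 150.0, 2.0

passing_at_x50 = swebrec_passing(x50, x50, x_max, b)
passing_at_max = swebrec_passing(x_max, x50, x_max, b)
passing_fines = swebrec_passing(1.0, x50, x_max, b)
```

By construction, exactly half of the material passes $X_{50}$ and all of it passes $X_{max}$, which is the upper limit that the Rosin-Rammler curve lacks.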
While the aforementioned extensions to the Kuz-Ram model improve the prediction of fines, they introduce yet another factor into a predictive model that is already somewhat extended [2].
With the advancement in computational power, attention is being drawn to the use of machine learning (ML) in rock fragmentation prediction. Over the last decade, researchers have used multivariate regression (MVR) analysis, artificial neural network (ANN), and support vector regression (SVR) to predict fragment size distribution. In their work, Hudaverdi, Kulatilake, and Kuzu [27] use MVR analysis to develop prediction equations for the estimation of the mean particle size of muck piles. They develop two different equations based on rock stiffness. The equations incorporate blast design parameters (i.e., burden, spacing, bench height, stemming, and hole diameter) expressed as ratios, explosives parameters (i.e., powder factor), and rock mass properties (i.e., elastic modulus and in situ block size). Comparative analysis involving the results of the prediction equations, the Kuznetsov empirical equation, and the actual values demonstrates that the proposed models offer satisfactory results. The authors make use of a diverse database (the largest ever used in research at the time) representing blasts conducted in different parts of the world. This makes their prediction models robust to a wide range of blast design parameters and rock conditions.
Building upon the work of Hudaverdi et al. [27], Kulatilake et al. [5] develop MVR and ANN models for the same set of data used in the former authors' work. The authors train a single hidden layer neural network model to predict the mean particle size for each of two groups of data, as distinguished by the rock stiffness. The authors perform extensive analysis to determine the optimum number of neurons for the hidden layer. Comparative analysis reveals that the MVR and ANN models perform better than the conventional Kuznetsov model. Shi et al. [9] build upon the work of Kulatilake et al. [5] by exploiting the potential of support vector regression (SVR) for predicting rock fragmentation. Using the same data set as the previous authors, Shi et al. [9] develop an SVR model for predicting mean fragment size. They compare the results of the SVR model with those of ANN, MVR, Kuznetsov, and the actual values. The comparison shows that SVR is capable of providing acceptable prediction accuracy.
The effectiveness of prediction models is assessed via comparative analysis involving post-blast measurement. Post-blast measurement techniques have been developed over the years for determining the true fragment size after a blast is completed. An accurate predictive model will record insignificant deviation from the true fragment size. The available techniques for measuring fragmentation output can be classified as direct and indirect [3]. The direct methods include sieve analysis, boulder count, and direct measuring of fragments. The most accurate method of determining fragmentation is to sieve the whole muck pile. However, because muck piles are large, the use of sieving and the other direct methods can be tedious, time-consuming, and costly [5]. Thus, they are not practicable for measuring the fragment distribution of a full muck pile. They can, however, be used for smaller amounts of fragment materials, and for very special purposes [3].
The indirect methods of fragment size measurement include digital image processing, and measurement of parameters that can be correlated to the degree of fragmentation [3]. Digital image processing involves the use of sophisticated software and hardware for measuring fragment size. It is the latest fragmentation analysis tool and has largely replaced the conventional methods. The use of this tool comprises the following steps: image capturing of the muck pile, image scaling, image filtering, image segmentation, binary image manipulation, measurement, and stereometric interpretation [5]. Though quick and cost-effective, this tool has some challenges. Non-uniform lighting, shadows, and a large range of fragment sizes can make fragment delineation very difficult. Another challenge is the overestimation of fines, since the computer treats all undigitized voids between the fragments as fines. Thus, to obtain accurate estimation, a correction must be applied. Additionally, the wide variations in size may require different scales of calibration [5,28].

Data and Methodology
This section discusses the data and methods employed in this study. The data set comprises 102 blasts. Using this data set, we develop multilayered artificial neural network and support vector regression models that satisfactorily predict mean rock fragment size.

Data Source and Description
The data set used in this work is obtained from the blast database compiled by Hudaverdi et al. [27], and subsequently used by Kulatilake et al. [5] and Shi et al. [9]. The compilation consists of blast data from various mines around the world. The data, therefore, represents a diverse range of blast design parameters and rock formations. Having such a diverse range of data is good for the purpose of this study, i.e., training machine learning models for prediction. The implication here is that the predictive ability of the ensuing models would span a wide variety of rock formations. The compilation by Hudaverdi et al. [27] represents one of the largest and most diverse blast data collections in the literature, and thus fits the purpose of this study.
Table 1 shows a sample of the data. A summary of the individual research projects from which Hudaverdi et al. [27] compiled the data is provided hereafter. Blasts with labels "Rc", "En", and "Ru" are from research by Hamdi, Du Mouza, and Fleurisson [29], and Aler, Du Mouza, and Arnould [30] at the Enusa and Reocin mines in Spain. The Enusa Mine is an open-pit uranium mine in a moderately to heavily folded schistose formation. The Reocin Mine is an open-pit and underground zinc mine. Blasts designated "Mg" are from a study by Hudaverdi [31] at the Murgul Copper Mine, an open-pit mine in northeastern Turkey. Those designated "Mr" are from a study by Ouchterlony et al. [28] at the Mrica Quarry in Indonesia. The rock formation is mainly andesite. Blasts with the "Sm" label are from an open-pit coal mine in Soma Basin, in Western Turkey [32]. Blasts labeled "Db" are from the Dongri-Buzurg open-pit manganese mine in Central India. The rock formation is generally micaceous schist and muscovite schist [33]. Blasts labeled "Ad" and "Oz" are, respectively, from the Akdaglar and Ozmert quarries of the Cendere basin in northern Istanbul. Rock formation at both quarries is sandstone [27].
The data set features blast design parameters that can be categorized as geometric, explosives, and rock parameters. The geometric parameters include burden, B (m), spacing, S (m), stemming, T (m), hole depth, H (m), and hole diameter, D (m). These are represented in the data set as ratios: hole depth to burden (H/B), spacing to burden (S/B), burden to hole diameter (B/D), and stemming to burden (T/B). The powder factor, Pf (kg/m³), represents the explosives parameter and shows the distribution of explosives in the rock. The elastic modulus, E (GPa), and the in situ block size, $X_b$ (m), represent the rock parameters. Specifically, in situ block size represents the rock mass structure, while the elastic modulus represents the intact rock properties [27]. In effect, a total of seven rock fragment size prediction parameters are in the data set, and these constitute the input parameters (independent variables) for the SVR and ANN models. The data set also features a post-blast parameter, i.e., $X_{50}$ (m), which is the actual mean fragment size. This is the output parameter (dependent variable) to be predicted by the models. Table 2 shows the summary statistics of the seven input parameters and the mean fragment size for the entire data set.
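The four geometric ratios can be derived from raw design values as sketched below. The function name and the example design are hypothetical; they only illustrate how one blast record would be converted into the ratio form used as model input.

```python
def blast_design_ratios(burden, spacing, stemming, hole_depth, hole_diameter):
    """Return the four geometric ratios; all arguments are in meters."""
    return {
        "H/B": hole_depth / burden,
        "S/B": spacing / burden,
        "B/D": burden / hole_diameter,
        "T/B": stemming / burden,
    }


# Hypothetical bench blast design (not a record from the data set).
ratios = blast_design_ratios(
    burden=3.0, spacing=3.6, stemming=2.4, hole_depth=9.0, hole_diameter=0.1
)

# Full input vector: four ratios plus powder factor, elastic modulus, block size.
powder_factor, elastic_modulus, block_size = 0.5, 45.0, 1.2  # assumed values
features = [*ratios.values(), powder_factor, elastic_modulus, block_size]
```

Expressing the geometry as ratios makes the inputs dimensionless, so blasts of different scale can be compared on the same footing.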

Model Development
Support vector regression (SVR) and artificial neural network (ANN) models are built for a total of 102 blasts. We split the data into training and test sets comprising 90 and 12 blasts, respectively. For each blast in the test set, a Kuznetsov prediction is available alongside the actual fragment size, which allows a comparative assessment of results. The data set is scaled to the range 0-1 since the parameters have different orders of magnitude. The scaling is performed using the MinMaxScaler function of the Scikit-learn Python library [34]. The SVR and ANN models are built using the Scikit-learn and Keras Python libraries, respectively [34,35].
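The split and scaling steps can be sketched in plain Python as follows. The data here are synthetic, and the random shuffle is only a placeholder for whatever selection the study used for the 12 test blasts. Taking the scaling bounds from the training rows only is also an assumption, since the text does not state whether scaling was applied before or after the split. The formula (x - min) / (max - min) is the one MinMaxScaler applies with its default 0-1 range.

```python
import random


def fit_min_max(rows):
    """Return per-column (min, max) pairs computed from `rows`."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def apply_min_max(rows, bounds):
    """Scale each column with (x - min) / (max - min)."""
    scaled = []
    for row in rows:
        scaled.append([
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, (lo, hi) in zip(row, bounds)
        ])
    return scaled


# Synthetic stand-in for the 102 blasts: 7 inputs plus X50 per row.
rng = random.Random(0)
data = [[rng.uniform(0.1, 30.0) for _ in range(8)] for _ in range(102)]

# Placeholder selection: shuffle, then hold out 12 blasts for testing.
rng.shuffle(data)
train, test = data[:90], data[90:]

# Assumption: bounds come from the training rows only.
bounds = fit_min_max(train)
train_scaled = apply_min_max(train, bounds)
test_scaled = apply_min_max(test, bounds)

print(len(train_scaled), len(test_scaled))
```

With bounds taken from the training rows, a test value outside the training range scales to slightly below 0 or above 1, which is expected and keeps test information out of the fitting step.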

SVR Modeling
Using Scikit-learn, we develop and train a support vector regression model for prediction. The modeling process involves iterating over several combinations of the following support vector hyper-parameters: regularization (C), epsilon (ε), and kernel (k). Four kernels are considered for modeling, i.e., radial basis function (rbf), polynomial (poly), sigmoid, and linear. Twenty-five different values of C are considered in the interval [1, 10], and twenty-seven different values of ε are considered in the interval [1 × 10⁻⁶, 0.3]. This yields a total of 4 × 25 × 27 = 2700 combinations of hyper-parameters, each representing a unique SVR model. The process of searching for the optimal combination of these hyper-parameters (adjustable parameters which control the support vector) is known as hyper-parameter tuning. To aid with this process, the GridSearchCV function in Scikit-learn is used [34]. It involves building SVR models using each of these hyper-parameter combinations and subsequently using cross-validation to assess model performance. We adopt the five-fold cross-validation technique. This means that for each hyper-parameter combination, the training data are split into five folds. The hyper-parameter combination undergoes five runs of model training, and during each run, a distinct fold (one-fifth of the training data) is set aside for validation purposes. The final score assigned to the hyper-parameter combination is the average validation score from the five runs. This process is repeated for all other hyper-parameter combinations. We retrieve the best-performing combination of hyper-parameters, which is C = 5.25, ε = 0.04, and kernel = rbf. The final SVR model is thus built using these hyper-parameters.
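The size of the search and the shape of the five folds can be checked with a short standard-library sketch. Evenly spaced candidate values are an assumption, since the text gives only the counts and the intervals, and the helper names `linspace` and `k_fold_indices` are introduced here for illustration.

```python
from itertools import product


def linspace(start, stop, num):
    """Return `num` evenly spaced values from start to stop inclusive."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        val_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, val_idx
        start = stop


kernels = ["rbf", "poly", "sigmoid", "linear"]
# Assumption: evenly spaced values within the stated intervals.
c_values = linspace(1.0, 10.0, 25)
eps_values = linspace(1e-6, 0.3, 27)

grid = list(product(kernels, c_values, eps_values))
folds = list(k_fold_indices(90, 5))

print(len(grid))                       # number of candidate SVR models
print([len(val) for _, val in folds])  # validation size per fold
```

With 90 training blasts, each fold validates on 18 blasts and trains on the remaining 72, so a full search fits 2700 × 5 models.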
In this study, retrieval of the best-performing combination is based on the mean squared error (MSE) scoring metric. The MSE is a statistical metric that provides a means of assessing performance between two or more models. For each model, the MSE measures the average squared difference between the actual and predicted values. A perfect model would yield an MSE of zero, signifying that the actual values are perfectly predicted by the model, i.e., there is no error in prediction. In machine learning, the best-performing model among alternatives will be the one with MSE closest to zero. We show the MSE values for selected hyper-parameter combinations for the training and test data in Figure 3. From the figure, we observe that models with rbf kernels have better generalization abilities in respect of unseen, real-world data, i.e., data not included in the training process. This is represented by the test data. The best-performing model retrieved from the hyper-parameter tuning is of the rbf kernel type. It yields the lowest MSE value for the test data.
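A minimal sketch of this tuning step with GridSearchCV is given below. It runs on synthetic data and uses a reduced grid so that it finishes quickly. The function name `tune_svr`, the synthetic target, and the grid spacing are assumptions made for illustration and do not reproduce the study's exact configuration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR


def tune_svr(X, y, c_values, eps_values, kernels):
    """Grid-search SVR hyper-parameters with five-fold CV scored by MSE."""
    param_grid = {
        "C": list(c_values),
        "epsilon": list(eps_values),
        "kernel": list(kernels),
    }
    search = GridSearchCV(
        SVR(),
        param_grid,
        scoring="neg_mean_squared_error",  # scorers are maximized, so MSE is negated
        cv=5,
    )
    search.fit(X, y)
    return search


# Synthetic stand-in for 90 scaled training blasts with 7 inputs each.
rng = np.random.default_rng(0)
X_train = rng.random((90, 7))
y_train = (0.3 * np.sin(3.0 * X_train[:, 0]) + 0.4 * X_train[:, 1]
           + 0.05 * rng.standard_normal(90))

# Reduced grid (4 x 3 x 4 = 48 candidates) instead of the full 2700.
search = tune_svr(
    X_train, y_train,
    c_values=np.linspace(1.0, 10.0, 4),
    eps_values=np.linspace(1e-6, 0.3, 3),
    kernels=["rbf", "poly", "sigmoid", "linear"],
)
best_cv_mse = -search.best_score_
print(search.best_params_, best_cv_mse)
```

Because Scikit-learn scorers follow a higher-is-better convention, the cross-validated MSE is recovered by negating `best_score_`. By default the search also refits the best combination on all training data, and that refitted model is available as `search.best_estimator_`.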

ANN Modeling
Using Keras, we develop a variety of multilayered ANNs with up to four hidden layers for prediction. In each instance, hyper-parameter tuning is performed to obtain an optimal number of neurons (units) for the hidden layers under consideration. In all cases, the input and output layers have fixed neurons, being seven and one, respectively. These represent the seven input parameters and the output parameter (X50), which we seek to predict. Figure 4 is a schematic representing the general architecture of the ANNs used in this study.
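To give a feel for how network size grows with depth, the sketch below counts the trainable weights and biases of a feed-forward network with seven inputs and one output. It assumes fully connected layers, and the hidden unit counts are hypothetical; they are not the tuned values.

```python
def dense_parameter_count(n_inputs, hidden_units, n_outputs=1):
    """Count weights and biases in a fully connected feed-forward network."""
    layer_sizes = [n_inputs, *hidden_units, n_outputs]
    total = 0
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += fan_in * fan_out + fan_out  # weights plus one bias per unit
    return total


# Hypothetical unit counts for one to four hidden layers.
for hidden_units in ([16], [16, 8], [16, 8, 8], [16, 8, 8, 4]):
    print(len(hidden_units), hidden_units,
          dense_parameter_count(7, hidden_units))
```

Even these small configurations have more parameters than the 90 training blasts, which is one reason validation scores and regularization matter when comparing architectures.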
For each instance of hidden layers, hyper-parameter tuning is performed using the Bayesian optimization object in Keras [35]. The process involves iterating over several combinations of neurons for a given instance of hidden layers and returning the combination that yields the best performance. This process can be very cumbersome and time-consuming when carried out manually. The use of Bayesian optimization saves time by automating the search process for the best combination of neurons for a given number of hidden layers. During the search process, 20% of the training data is set aside for validation purposes using the MSE scoring metric. The remaining data are used for training, and this involves running 1500 epochs to yield an acceptable reduction in prediction error.
Table 3 shows the results for the various hidden layers considered. For each instance of hidden layers, the table shows the optimal number of neurons returned via hyper-parameter tuning. The neural network with four hidden layers is selected as the final ANN model. This is based on the test scores, which represent the ability of the models to generalize to unseen, real-world data. The four-hidden-layer architecture has the lowest test score. In the second configuration of hidden layers, the batch normalization (BN) technique serves to control model overfitting, so as to improve model generalization in respect of unseen, real-world data. Batch normalization applies a transformation that maintains the mean output close to zero and the output standard deviation close to 1, thereby standardizing the inputs to a given layer [35]. We show the performance of selected hyper-parameter combinations for the various hidden layer instances in Figure 5.
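The effect of batch normalization on a single feature can be illustrated with a small worked example. This is a sketch of the training-time transformation only, using the batch mean and variance with a scale `gamma` and shift `beta`. The activation values are hypothetical, and `eps=1e-3` is assumed as the stabilizing constant.

```python
from statistics import fmean, pstdev


def batch_normalize(values, gamma=1.0, beta=0.0, eps=1e-3):
    """Standardize one feature across a batch, then scale and shift."""
    mean = fmean(values)
    variance = pstdev(values) ** 2  # population variance of the batch
    return [gamma * (v - mean) / (variance + eps) ** 0.5 + beta
            for v in values]


# Hypothetical activations of one hidden unit for a batch of six blasts.
activations = [0.2, 1.5, 3.1, 0.9, 2.4, 4.0]
normalized = batch_normalize(activations)

print(round(fmean(normalized), 6), round(pstdev(normalized), 3))
```

The normalized batch has a mean of approximately zero and a standard deviation just under one (the small `eps` term keeps it from reaching exactly one). At inference time, the layer uses moving averages of the mean and variance accumulated during training rather than the statistics of the current batch.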

Results and Discussion
Through hyper-parameter tuning, we obtain the final SVR and ANN models. For the purpose of assessing model generalization, we subject these models to testing. The test data set comprises 12 blasts; these are not used for training. The performance of a model on these data shows how well it will perform when deployed in the real world. Table 4 shows the performance of the final models on the training and test sets using the mean squared error (MSE) as a scoring metric. For the purpose of comparative assessment, the Kuznetsov empirical technique, i.e., Equation (3), is used to predict the mean rock fragment size for the test data. Test results obtained for the ANN and SVR models are compared with those for the Kuznetsov technique and the actual values. Table 5 and Figure 6 show the results for all three modeling techniques. It is observed that the ANN model records the least error while the Kuznetsov model records the highest error. The coefficient of determination (r²) measures the proportion of the variation in the dependent variable (mean fragment size) that is accounted for by its relationship with the independent variables. It ranges between zero and one. A model with r² closer to one is said to be reliable in predicting the dependent variable. The foregoing indicates that the ANN and SVR models are better able to model the relationship between the dependent and independent variables than the Kuznetsov empirical model. They show superior performance to the Kuznetsov model as a result of their inherent ability to model complex, nonlinear relationships, such as exist between rock fragment size and blast design parameters.
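The two comparison metrics can be computed as in the sketch below. The X50 values and the model names are hypothetical and are not taken from Table 5. The r² formula used here, 1 - SS_res / SS_tot, is an assumption, as the text does not state which definition it applies; under this definition a model that predicts worse than the mean of the actual values scores below zero.

```python
def mean_squared_error(actual, predicted):
    """Average squared difference between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination computed as 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical mean fragment sizes in metres for four test blasts.
actual = [0.20, 0.35, 0.50, 0.15]
predictions = {
    "model_a": [0.22, 0.33, 0.47, 0.18],
    "model_b": [0.30, 0.25, 0.60, 0.25],
}
for name, predicted in predictions.items():
    print(name,
          mean_squared_error(actual, predicted),
          r_squared(actual, predicted))
```

In this toy comparison, `model_a` has both the lower MSE and the r² closer to one, which is the pattern the text reports for the ANN and SVR models relative to the Kuznetsov model.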

Conclusions and Future Work
The paper successfully demonstrates the potential of achieving higher accuracy in mean rock fragment size prediction using multilayered artificial neural network (ANN) and support vector regression (SVR). Using varied blast data sets from different parts of the world, we obtain training and test sets comprising 90 and 12 blasts, respectively, for building multilayered ANN and SVR models. Both models perform satisfactorily and better than the conventional Kuznetsov empirical model. The paper further demonstrates the possibility to analyze a varied number of hidden layers for a neural network in a less cumbersome way using Keras. Keras makes it less time-consuming to consider the performance of a wide variety of hidden layers and neurons via the Bayesian optimization feature. Thus, multilayered ANN analysis of rock fragmentation, which is typically time-consuming, can be carried out in a relatively shorter time. The end goal here is that blasting engineers would be able to fully exploit the potential of the multilayered ANN architecture for improved performance without having to do manual hyper-parameter tuning. The trained ANN and SVR models could be incorporated into existing fragmentation analysis software to give blasting engineers more accurate options for mean rock fragment size estimation. This incorporation would make it possible for blasting engineers to have access to results from both empirical and machine learning techniques. Blasting engineers would then be able to conduct post-blast analysis to verify the improved accuracy offered by the machine learning techniques. Commercial fragmentation software providers could adopt this integrated approach to gradually build client confidence in the use of machine learning techniques with time.
In the future, we seek to improve model performance via data augmentation. We intend to do this using the variational autoencoding (VAE) technique. VAE is a deep learning technique that fits a probability distribution to a given data set, and then samples from the distribution to create new unseen samples. Thus, the VAE offers a means of augmenting the data set used in this study to improve model training, and thus enhance pattern recognition and prediction. We also seek to build additional rock fragmentation models using other machine learning techniques. The final phase of this project will involve developing robust machine learning-based fragmentation software that will predict not only the mean fragment size but also the entire fragment size distribution.

Figure 4. ANN architecture for rock fragmentation prediction.

Figure 5 shows how the final ANN model (M8) compares with other models from the hyper-parameter tuning exercise. Model M5 has the worst generalization performance, while model M8 has the best.

Figure 6. MSE plot for test data.

Table 3. Optimal neurons for hidden layers.


Table 5. Results for test data.