1. Introduction
Earthmoving and loading/unloading works are essential parts of construction processes and require the use of heavy equipment. Although energy use and emissions are critical when selecting the equipment [1], cost is usually the crucial factor in the industry. In mining, loading is one of the most important operations, influencing mine preparation, design, and production, and it accounts for a significant share of capital and operational expenditures. In mine planning, the loading system largely determines which other pieces of equipment and which mode of operation will be used, and it is a key to low-cost production [2]. A shovel, a machine for digging, lifting, and moving bulk materials, is the most common piece of equipment used to load rock material in surface mines. It can handle hard, dense, abrasive, and highly fragmented ground, and it can spot accurately for loading into dump trucks, rail cars, loading hoppers, etc. [2]. Shovels are usually grouped into four classes [3]:
small shovels (0.5–2 m³ bucket size);
medium shovels (2–5 m³ bucket size);
large-size shovels (5–25 m³ bucket size);
very large-size shovels (with a bucket larger than 25 m³).
The best shovel can be selected from a pool of existing alternatives, based on the considered criteria, by means of feasibility studies that analyse the technical, economic, and operational aspects involved. The most common technological parameters of shovel operations have been analysed in the literature. One example related to mathematical simulation is a study presenting the automated design of a swing circuit for a hydraulic shovel [4]. The effectiveness of mining equipment, namely trucks and shovels, in terms of its useful employment without time losses has also been analysed [5]. For the equipment selection problem considered here, these numerous aspects of different nature can be investigated effectively by using a multiple criteria decision making (MCDM) tool. MCDM methodology makes it possible to combine several important issues (criteria), to estimate the relative importance of the analysed criteria, and to compare potential alternative equipment and select the one best suited to the analysed situation [6].
Besides equipment selection based on technical requirements, the production rate and cost-benefit analysis make up the main parts of feasibility studies. Cost estimation for various purposes plays a significant role at different stages and processes in mining and related industries. During the planning stage, machine specifications, such as technical and operating features, and, accordingly, the costs are not available in the mineral field, unlike in other areas [7,8]. Therefore, developing an up-to-date model with sufficient accuracy is essential. A number of models exist for estimating mineral industry costs, such as cost estimation for small underground mines [9]; the evaluation of mining and mineral processing equipment costs and other capital expenditures related to mining and processing operations [10,11]; cost estimation adapted to the peculiarities of the Australian mining industry [12]; a system of cost estimation for mining metallic and nonmetallic minerals in two countries, Canada and the United States [13]; a cost model for a preliminary feasibility study and a different cost model for a detailed feasibility study, including exponential regression and multivariable linear regression, applied to an Iranian copper deposit [14]; a South African case study on calculating the capital costs of setting up a coal mine [15]; and hard-rock LHD cost estimation [7]. A methodology for evaluating ore bodies and a guide for its practical application have also been suggested [16,17,18,19].
The separate question of feasibility evaluations that apply simplified cost models was analysed in [20]. A guide for the mining sector, covering mining and energy valuation and aimed at Australian investors and managers, was prepared [21]. In addition, the energy costs of equipment, a significant part of the costs of mining activity, were evaluated, including not only processing costs but also the transportation and exploitation costs of machinery [22].
The models mentioned above often take into account only one parameter as an independent variable, while the constructed models are merely based on the regression analysis. Therefore, it is essential to develop a more accurate and efficient model for cost estimation of equipment.
Different techniques can be applied to optimization problems. During the last decade, significant research has been performed on fuzzy logic control of nonlinear systems [23,24]. Artificial neural networks represent another direction among these techniques.
The Artificial Neural Network (ANN), which is one of the artificial intelligence methods, as well as multivariate regression (as one of the statistical models), are powerful tools for pattern recognition and modelling. These two methods are used by various researchers.
An ANN-based model for estimating a distillation process, using the Levenberg–Marquardt approach, was developed by Singh et al. [25]. Yamamura [26] predicted pharmacokinetic parameters by ANN modelling. An ANN prediction model for determining the failure depth of coal seam floors was developed by Lian-guo et al. [27]. They compared the calculated results with a real case study and stated that the results predicted by the suggested model agreed well with practical measurements.
An ANN model for an industrial gas turbine was developed by Fast et al. [28]. They used operational data with a multilayer feed-forward network to construct the model. Analysing the results, they concluded that some of the functional and performance parameters of the gas turbine, including a critical parameter for identifying the anti-icing mode, can be accurately predicted in a changing modelling environment. Lee et al. [29] suggested a way to improve the reliability of the Bridge Management System, using the ANN-based Backwards Prediction Model.
Jalali-Heravi et al. [30] developed the shuffling multivariate adaptive regression splines and the adaptive neuro-fuzzy inference system as tools for studying a quantitative structure-activity relationship (QSAR) of severe acute respiratory syndrome (SARS) inhibitors.
Verlinden et al. [31] presented a case study and estimated the cost of sheet metal parts, using a combination of multiple regression and artificial neural networks. Mesroghli et al. [32] used regression and artificial neural networks for estimating gross calorific value based on coal analysis. Sahoo et al. [33] developed models for predicting stream water temperature, using three techniques, namely regression analysis, an artificial neural network, and chaotic non-linear dynamic models. These studies show that the ANN and regression are both capable of modelling practical engineering problems.
An application of neural networks for improving weighing precision, with the aim of optimising the loading of trucks as well as the production efficiency of electric shovels, was presented by Gu et al. [34]. A combination with fuzzy logic was suggested to decrease uncertainties in the process.
In recent years, artificial neural networks (ANNs) have been applied to different aspects of civil engineering problems. Suspended-dome model updating was performed with a back propagation network approach to evaluate the discrepancy between the actual structure and the corresponding numerical approximation [35]. The ANN approach was applied to the estimation of the axial bearing capacity of rectangular concrete-filled tubular columns [36]. A stochastic conceptual cost evaluation of highway projects was performed by generating an empirical distribution of the estimated cost range, without additional initial assumptions [37]. This empirical distribution was constructed by applying ANN techniques and bootstrap sampling. ANN techniques were also applied to study construction labour productivity [38]. In that study, different ANN activation and transfer functions were applied to estimate the factors that most influence the modelling of construction labour productivity. Parameter sensitivity analysis for civil engineering problems was performed by implementing ANN algorithms [39]. That paper dealt with a parameter sensitivity analysis paradigm in which the essential element is a neural network ensemble.
Structural health monitoring of engineering systems usually relies on fixed or hand-crafted features, which significantly reduces the reliability of such monitoring systems. A structural damage detection system constructed with one-dimensional (1D) convolutional neural networks has been proposed that extracts damage-sensitive features from raw acceleration information [40].
The problem of predicting the capacity characteristics of pile structures has been modelled with ANN and principal component analysis [41]. The issue of claim management in construction projects was addressed with a neural network approach [42]. The proposed approach makes it possible not only to classify and rank emerging claims, but also to predict the claim frequency in construction projects.
Two different information flows, the spatial planning of buildings and the territorial planning system, were integrated by applying an ANN technique [43]. The influence of uncertainty on the estimated cost of construction projects was evaluated by implementing different models for the training and evaluation phases [44]. For the training phase, a fuzzy adaptive learning control network and the fast messy genetic algorithm were applied. For the evaluation phase, component ratio, regression, and multi-factor evaluation sub-models were used.
In this paper, multivariate regression and ANN models are combined into a hybrid model that is more accurate and precise than the traditional multivariate regression and ANN models used individually. In the proposed approach, a model is considered as a function of linear and nonlinear components, so that the multivariate regression model is first employed to recognise the existing linear pattern in the data. Then, the ANN is applied as a nonlinear function to model the preprocessed data. Finally, the main performance criteria of the model are calculated, including the coefficient of determination (R²), Normalized Mean Square Error (NMSE), and Mean Absolute Percentage Error (MAPE), and the best model is identified according to the calculation results.
3. An Artificial Neural Network
The ANN technique, a branch of artificial intelligence methods, is a reliable and useful tool for modelling linear and non-linear patterns. Bourquin et al. [46] showed that, for data sets with non-linear relationships, ANN methodology is clearly superior to classical methods as a modelling technique, in both data fitting and prediction. The technique is widely used for many scientific and engineering problems, such as data processing, classification, and pattern recognition. It has some unique features that distinguish it from other data processing systems: it can work successfully even when partly damaged, it supports parallel processing and generalisation, and it shows low vulnerability to errors in the dataset [47]. An ANN model employs the mechanism that is applied in the human brain to extract the patterns and behaviours of data [48].
The ANN technique has been developed based on the structure of biological neural networks, in which neurons are the backbone of the structure. An artificial neuron is inspired by a biological neuron: it collects signals from other neurons and, when fired, transmits a signal to all of the connected neurons [49].
Figure 1 graphically shows a typical artificial neuron.
In Figure 1, the transfer function is denoted by $f$; the activation threshold of neuron $j$ is denoted by $\theta_j$; the connection weight between the $i$th and $j$th neurons is denoted by $w_{ij}$; the input signals from $n$ other neurons to neuron $j$ are denoted by $x_i$ ($i = 1, 2, \ldots, n$); and the output of neuron $j$ is denoted by $y_j$, which can be computed by Equation (4):
$$y_j = f\left(\sum_{i=1}^{n} w_{ij} x_i - \theta_j\right) \qquad (4)$$
where $f$ is usually a hyperbolic tangent sigmoid or a linear transfer function.
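The single-neuron computation can be illustrated with a minimal sketch. It assumes the common form $y_j = f(\sum_i w_{ij} x_i - \theta_j)$, with the threshold subtracted from the weighted sum; the function names and all numerical values below are hypothetical and chosen only for illustration.

```python
import math


def linear(z):
    """Linear (identity) transfer function."""
    return z


def neuron_output(inputs, weights, threshold, transfer=math.tanh):
    """Return y_j = f(sum_i w_ij * x_i - theta_j) for a single neuron j."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Weighted sum of the incoming signals minus the activation threshold.
    net = sum(w * x for w, x in zip(weights, inputs)) - threshold
    return transfer(net)


# Hypothetical signals from n = 3 upstream neurons and made-up weights.
x = [0.5, -1.0, 2.0]
w = [0.4, 0.3, 0.1]
theta = 0.2

y_tanh = neuron_output(x, w, theta)                     # hyperbolic tangent sigmoid
y_linear = neuron_output(x, w, theta, transfer=linear)  # linear transfer function
print(y_tanh, y_linear)
```

With the hyperbolic tangent sigmoid the output is bounded between -1 and 1, whereas the linear transfer function passes the net input through unchanged, which is why the latter is typically reserved for output neurons in regression problems.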
A typical neural network comprises three primary layers: input, intermediate, and output layers. Based on the basic concepts of machine learning, the number of hidden (intermediate) layers does not have a theoretical limit [50]. A typical ANN structure is depicted in Figure 2.
A multilayer feedforward perceptron (MLP), based on a backpropagation learning function, is applied for the estimation of the shovel capital cost. During training, the model output is compared with the actual output in order to adjust the coefficients. An ANN model follows a five-step procedure to obtain a relationship between input(s) and output(s). Firstly, the training dataset, a vector of input–output pairs, is randomly selected. Secondly, the network structure is constructed. Next, the model output vector is calculated for the input vector. Then, the connection weights are adjusted by using the model performance measure. Finally, the process of improving the weights is repeated until the model performance is satisfactory.
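The five-step procedure can be sketched as follows. This is an illustrative toy only: it uses plain stochastic gradient-descent backpropagation on synthetic data, which is not necessarily the training algorithm or data used in this study, and the helper names (`init_network`, `forward`, `train`) and the 2-4-1 structure are assumptions made for the example.

```python
import math
import random


def init_network(n_in, n_hidden, rng):
    """Step 2: build an n_in - n_hidden - 1 network with small random weights."""
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    """Step 3: tanh hidden layer followed by a linear output neuron."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(net["w1"], net["b1"])]
    y = sum(w * hj for w, hj in zip(net["w2"], h)) + net["b2"]
    return h, y


def mse(net, data):
    """Performance measure: mean squared error over the data set."""
    return sum((forward(net, x)[1] - t) ** 2 for x, t in data) / len(data)


def train(net, data, lr=0.05, epochs=300):
    """Steps 4-5: adjust the weights from the output error and repeat."""
    for _ in range(epochs):
        for x, t in data:
            h, y = forward(net, x)
            err = y - t
            for j, hj in enumerate(h):
                # Backpropagated error at hidden neuron j (uses the old w2[j]).
                grad_h = err * net["w2"][j] * (1.0 - hj * hj)
                net["w2"][j] -= lr * err * hj
                net["b1"][j] -= lr * grad_h
                for i, xi in enumerate(x):
                    net["w1"][j][i] -= lr * grad_h * xi
            net["b2"] -= lr * err
    return net


rng = random.Random(0)
# Step 1: randomly generated input-output pairs (synthetic, for illustration).
samples = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(40)]
data = [(x, math.sin(x[0]) + 0.5 * x[1]) for x in samples]

net = init_network(n_in=2, n_hidden=4, rng=rng)
loss_before = mse(net, data)
train(net, data)
loss_after = mse(net, data)
print(loss_before, loss_after)
```

In practice the loop in step 5 would stop on a validation criterion rather than after a fixed number of epochs, and the inputs would be scaled to a comparable range before training.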
4. The Hybrid Methodology
Both ANN and MVR models have been successful in modelling nonlinear and linear patterns, respectively. However, neither of them is a universal model that is appropriate for all situations. The MVR model ignores the complex nonlinear patterns involved in the data. On the other hand, using an ANN model to model a linear problem can lead to mixed results [51]. Since the behaviour of the data cannot be fully known in advance, a combined approach with both nonlinear and linear modelling capability is a robust strategy for modelling a complicated problem [52].
In this study, a new hybrid model is proposed, in which the MVR model and ANNs are integrated to obtain a robust and efficient method for the cost estimation of shovel machines. Since the MVR is a linear approach that is unable to recognise nonlinear patterns in data, the ANN is applied to model the residuals and capture the nonlinear behaviour, and the result of the ANN is added to the final output. Therefore, the MVR model is responsible for the linear part, while the ANN is responsible for the nonlinear part. A schematic diagram of the model is shown in Figure 3.
With this in mind, we assume that the shovel cost is composed of a nonlinear and a linear part:
$$y_t = L_t + NL_t$$
where $y_t$ denotes the shovel cost, and $NL_t$ and $L_t$ indicate the nonlinear and linear components, respectively. Therefore, the main idea is to employ the MVR model first and then to apply the ANN to model the residuals of the linear structure.
Since the MVR model cannot approximate the nonlinear patterns involved in the dataset, the residuals of the linear model may contain nonlinear behaviour that a linear pattern cannot capture. The combined model draws on the advantages of both the MVR and ANN models, allowing it to recognise various patterns. Therefore, it is beneficial to model the nonlinear and linear patterns separately by employing different models and then to integrate the results in order to improve the outcomes and the model efficiency.
8. The Comparison of the Developed Model with Other Models
In this section, the predictive capabilities of the proposed model are compared with those of the MVR and ANN models. The ANN model was built with the selected variables that were found through the MVR model and was trained with the Levenberg–Marquardt algorithm. The neural network model used is composed of five input, eight hidden, and one output neurons (N5−8−1). The results of the different models for the testing data are presented in Table 2.
The performance measures of the proposed and other models for the testing data set are presented in Table 3. The NMSE value for the proposed model is 0.0035, which is smaller than the values obtained with MVR and ANN, 0.0059 and 0.0076, respectively.
The MAPE value for the proposed model is 9.59%, which is also considerably smaller than the values obtained with ANN and MVR, 17.44% and 20%, respectively. The R² value for the proposed model is 0.9965, which is larger than the values yielded by MVR and ANN, 0.9941 and 0.9924, respectively. The experimental results presented in Table 2 show that the hybrid model is more accurate. This is because the hybrid model integrates linear and nonlinear information for prediction, whereas each individual model uses only linear or only nonlinear information.
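For readers who wish to reproduce this kind of comparison, the three indices can be computed as in the sketch below. The definitions are assumptions, since the formulas are not restated in this section: NMSE is taken as the squared error normalised by the total variation of the actual values, R² as 1 − NMSE (which is consistent with the values reported above), and MAPE as the mean absolute relative error in percent. The data are made up.

```python
def nmse(actual, predicted):
    """Squared error normalised by the total variation of the actual values."""
    mean = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sst = sum((a - mean) ** 2 for a in actual)
    return sse / sst


def r_squared(actual, predicted):
    """Coefficient of determination, assumed here to equal 1 - NMSE."""
    return 1.0 - nmse(actual, predicted)


def mape(actual, predicted):
    """Mean absolute percentage error; actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical actual and predicted costs (arbitrary units).
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

print(nmse(actual, predicted), r_squared(actual, predicted), mape(actual, predicted))
```

Lower NMSE and MAPE and a higher R² indicate a better model, which is the ordering used in the comparison above.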
The comparison of the actual values and the values predicted by the ANN, MVR, and hybrid models is presented in Figure 5. The graphs show that the estimates yielded by the ANN, MVR, and hybrid models closely follow the actual values. It can also be seen that the predictive ability of the hybrid model exceeds that of the other models.
The obtained result is also in good agreement with previous studies [8], which state that, in most cases, the accuracy of estimation by hybrid procedures is better than that of pure statistical methods because of the essential property of costs, which are cumulative. To facilitate cost estimation, it is also possible to apply convex optimization algorithms [61,62] and to compare the results in future work.
9. Sensitivity Analysis
Sensitivity analysis is a useful tool for determining the relationship between the considered parameters [63]. The factors to which the shovel capital cost is most sensitive are analysed by the cosine amplitude method (CAM).
Based on the concepts of the CAM, the sensitivity for each independent component can be determined by establishing the degree of the relationship ($r_{ij}$) between the shovel capital cost and the considered independent component [56]. The larger the CAM value, the higher the impact of the component on the capital cost. If the shovel capital cost is not related to the independent variable, the CAM value is zero. The independent variable plays a positive role in the shovel capital cost where the CAM value is positive, and a negative role where the CAM value is negative.
Let $n$ be the number of independent variables, represented as an array $X = \{x_1, x_2, \ldots, x_n\}$, where each element $x_i$ of the data array $X$ is itself a vector of length $m$ and can be expressed as:
$$x_i = \{x_{i1}, x_{i2}, \ldots, x_{im}\}$$
Thus, each of the data pairs can be viewed as a point in $m$-dimensional space, where each point requires $m$ coordinates for a complete description [64]. Each element of the relation, $r_{ij}$, results from a pairwise comparison of two data samples. The strength of the relationship between the data samples $x_i$ and $x_j$ is given by the membership value:
$$r_{ij} = \frac{\sum_{k=1}^{m} x_{ik} x_{jk}}{\sqrt{\sum_{k=1}^{m} x_{ik}^{2} \, \sum_{k=1}^{m} x_{jk}^{2}}}$$
The strengths of the relations ($r_{ij}$ values) between the shovel capital cost and the input parameters are shown in Figure 6. As shown in Figure 6, the most influential parameters for the capital cost of hydraulic and cable shovels are horsepower, bucket capacity, and weight, in that order.
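The CAM calculation can be sketched in a few lines. The sketch assumes the standard cosine amplitude form, $r_{ij} = \sum_k x_{ik} x_{jk} / \sqrt{\sum_k x_{ik}^2 \sum_k x_{jk}^2}$; the function name `cam_strength` and the machine data are hypothetical and do not come from the data set of this study.

```python
import math


def cam_strength(xi, xj):
    """Cosine amplitude strength of relation between two data vectors."""
    if len(xi) != len(xj):
        raise ValueError("both vectors must have the same length m")
    numerator = sum(a * b for a, b in zip(xi, xj))
    denominator = math.sqrt(sum(a * a for a in xi) * sum(b * b for b in xj))
    # An all-zero vector carries no information; treat it as unrelated.
    return numerator / denominator if denominator else 0.0


# Made-up observations for m = 4 machines (illustration only).
inputs = {
    "horsepower": [450.0, 800.0, 1200.0, 2500.0],
    "bucket_capacity": [3.0, 6.0, 10.0, 22.0],
    "weight": [60.0, 110.0, 190.0, 400.0],
}
capital_cost = [0.9, 1.6, 2.7, 5.5]

strengths = {name: cam_strength(values, capital_cost)
             for name, values in inputs.items()}

# Rank the input parameters from the strongest to the weakest relation.
for name, r in sorted(strengths.items(), key=lambda item: item[1], reverse=True):
    print(name, round(r, 4))
```

For non-negative data such as these, every strength lies between 0 and 1, and a value close to 1 indicates that the input varies almost proportionally with the capital cost.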
10. Conclusions
Although many different prediction models exist, improving prediction accuracy is still an acute problem facing decision makers in many areas. Multivariable regression (MVR) models are among the most popular linear models for prediction. Although various techniques have been widely used with the aim of constructing more accurate MVR models, such models cannot recognise nonlinear patterns in the existing data. On the other hand, artificial neural networks (ANNs) are well known as useful tools for pattern recognition, clustering, and, above all, prediction with a high degree of accuracy, but it is hardly reasonable to use ANNs blindly to model linear problems. Hybrid methods, which decompose a problem into its linear and nonlinear constituent parts, are among the most efficient models.
In this paper, the hybridisation of the MVR and ANN models is proposed to overcome the limitations mentioned above and to yield a more accurate predictive model than those generated by the individual methods. In the proposed model, the capability of the MVR model in linear modelling is used to recognise the existing linear structure in the data, and an ANN is then applied to model the nonlinear forms, using the preprocessed data. The results obtained demonstrate that the proposed model is superior to the individual models with respect to three indices and can yield more accurate estimates. Moreover, MVR provides a good initial approximation of the error due to the nonlinear part, which serves as the initial data for the ANN. This makes it possible to reduce the computation time of the ANN.
It should be noted that the proposed methodology is suitable only for complex systems; that is, it is appropriate for a dataset that involves at least one nonlinear pattern.