Comparison between Simulation and Analytical Methods in Reliability Data Analysis: A Case Study on Face Drilling Rigs

Collecting failure data and performing reliability analysis in an underground mining operation are challenging because of the harsh environment and high production pressure. Therefore, achieving an accurate, fast, and applicable analysis for a fleet of underground equipment is usually difficult and time consuming. This paper discusses the main reliability analysis challenges in mining machinery by comparing three approaches: two analytical methods (white-box and black-box modeling) and a simulation approach. For this purpose, maintenance data from a fleet of face drilling rigs in a Swedish underground metal mine were extracted from the MAXIMO system over a period of two years and used for the analysis. The investigations reveal that the approaches differ in how they rank the studied machines and in the reliability they estimate for them. All of the methods provide broadly similar outputs but, in general, the simulation estimates the reliability of the studied machines at a higher level. The simulation and white-box methods sometimes provide exactly the same results, which is caused by their similar structure of analysis. On average, 9% of the data are missed in the white-box analysis due to a lack of sufficient data in some of the subsystems of the studied rigs.


Introduction
During the last three decades, different drilling techniques, from pneumatic to electro-hydraulic, have been developed. Nowadays, drilling rigs have high capacities and are equipped with advanced monitoring and control systems. The focus of development in this machinery has not just been on speed, but also on the quality of drilling, operation costs, and safety [1]. Face drilling rigs are powerful machines, manufactured with a wide range of capabilities, and are applied either in mines or in construction projects. They are important machines with a critical role in the mine production rate and in the duration of a construction project. Therefore, their reliability and availability are crucial for the whole excavation operation. Collecting a wide range of data, obtaining in-depth information from the machine, and a well-established maintenance process are required to achieve a reliable drilling operation. Drilling machines are complex and compactly designed, making their maintenance challenging, which suggests the need for an in-depth study of their failure behavior [2]. Reliability studies of underground mining machinery have been carried out by many researchers and in different types of mines, varying from coal to hard rock metal mines. A review of the related literature has been presented by Hoseinie et al. [3]. It has been emphasized in the past that reliability studies in mines, specifically in underground mining, are particularly difficult due to their harsh operation and maintenance environment, as well as work pressures [4]. Considering the importance of mobile underground machinery for mine production, the complexity of the machinery, and the harsh mining environment, a reliability analysis of the drilling rig must meet rigorous requirements and produce clear results.
In this paper, the reliability of a fleet of face drilling rigs in a Swedish mine is analyzed using three different approaches: two analytical methods (white-box and black-box modeling) and simulation. Finally, the results of each approach are discussed and compared.

Face Drilling Rigs: A Case Study
Face drilling rigs are used for drilling blasting holes in construction and mining tunnels/galleries. Many international companies manufacture drilling rigs that are composed of similar operational units and have similar structures; however, they differ in their technical characteristics, especially in capacity and power. This study examined a fleet of three rigs, called Machine A, B, and C, in a Swedish underground metal mine. All of the rigs in this study are the same model and are manufactured by the Atlas Copco Company. Each rig has four retractable stabilizer legs and an articulated four-wheel drive chassis. They are operated electrically, with a maximum power capacity of 158 kW. Figure 1 shows the schematic view of the studied rigs and their different parts. According to the operation manuals, field observations, and maintenance reports from the case study mine, the drilling rig is considered a system that consists of 12 operational subsystems working simultaneously to achieve the desired function and connected in a series configuration, as shown in Figure 2.

Data Collection
The required data for this study were collected from available operation and maintenance reports extracted from a computerized maintenance management system (CMMS) called "MAXIMO" and from some field observations over a period of two years. After refining and filtering the data, the failures of each subsystem were assigned to its data log and the failure frequency was analyzed. As can be seen in Figure 3, the Pareto analysis shows that hoses, rock drills, and feeders are the top three subsystems/units in terms of failure frequency. On average, 63.3% of all recorded failures in this fleet are related to these subsystems. Not all of the defined subsystems for the face drilling rigs appear in the Pareto analysis of every machine. This means that some subsystems had no failures during the study period or had fewer than five failures and were not analyzed (the figure's "others" column represents these subsystems).
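The Pareto ranking described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `pareto_table`, the subsystem names, and the failure counts are hypothetical and are not taken from the MAXIMO data. Subsystems with fewer than five failures are pooled into an "others" row, following the description in the text.

```python
def pareto_table(failure_counts, min_failures=5):
    """Rank subsystems by failure count and pool sparse ones into 'others'.

    Returns rows of (name, count, percent of total, cumulative percent).
    """
    kept = {k: v for k, v in failure_counts.items() if v >= min_failures}
    others = sum(v for v in failure_counts.values() if v < min_failures)
    ranked = sorted(kept.items(), key=lambda kv: kv[1], reverse=True)
    if others:
        ranked.append(("others", others))

    total = sum(failure_counts.values())
    rows, cumulative = [], 0
    for name, count in ranked:
        cumulative += count
        rows.append((name, count, 100.0 * count / total, 100.0 * cumulative / total))
    return rows


# Hypothetical failure counts for one rig (illustration only).
counts = {"hoses": 40, "rock drill": 30, "feeder": 20, "boom": 6, "cable": 3, "water": 1}
for row in pareto_table(counts):
    print(row)
```

With these made-up counts, the top three subsystems account for 90% of the failures; the 63.3% reported for the studied fleet comes from the real data, not from this example.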

Reliability Analysis
Collecting data, analyzing it, and making a decision are time-consuming processes that must be carried out during any reliability study. In general, there are three main approaches for the reliability analysis of systems [7,8]:
(a) White-box modeling: the white-box (or structural) approach explicitly takes the structure of the system into account [9,10]. In other words, in this method, the state of the system is modeled in terms of the states of the various components [8]. In order to model the reliability of the system, the reliability of each subsystem is calculated and the results are combined based on the reliability network and overall system configuration.
(b) Black-box modeling: black-box analysis is a system-based analytical method, referring to the technique of testing a system with no knowledge of its internal workings [11]. When the system is treated as a black box, there is no concern about how the system "looks inside" [12]. In this approach, the system is described either in terms of two states (working/failed) or more than two states (allowing for one or more partially failed states) without explicitly linking them to components of the system [8].
(c) Simulation: stochastic simulation is a suitable technique to assess the reliability of a system and can be applied in two ways [8,9,12]: (1) Sequential approach: by examining each basic interval of the simulated period in chronological order; and (2) Random approach: by examining randomly chosen basic intervals of the system lifetime. The random approach, usually known as the "Monte Carlo" method, is a numerical method that solves mathematical and technical problems by means of a system of probabilistic models and the simulation of random variables. In this method, the stochastic failure occurrence of the system is analyzed and the probabilities of failure and success of the system operation are calculated. The failure behavior of each subsystem is considered in this method.
Each of the mentioned reliability analysis approaches has its own inputs, advantages, disadvantages, limitations, and outputs. When the white-box approach is applied, there are sometimes statistical restrictions. For example, it is very difficult to fit a failure density function to the available data in subsystems that have fewer than five failures [13]. Therefore, all of these failures are missed in the calculations and analysis, although they actually exist. In the black-box approach, all of the failure data are put in one set. The main disadvantage of this method is that the failure behavior of each subsystem cannot be studied, and detailed information about the failure modes of the system is missed. As for its advantages, the calculation time is very low and the complete failure data set is used. The simulation method is in line with white-box modeling; nevertheless, the final reliability analysis of the whole system is performed stochastically. The main disadvantages of this method are its high calculation time and the loss of failure data during distribution fitting.
In this article, the available failure data are analyzed using the above-mentioned methods, and the resulting reliability plots and the applications of each approach are compared and discussed.
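To make the difference in input data between the white-box and black-box approaches concrete, the following minimal sketch builds both data sets from the same failure log. The helper names, the log layout (cumulative operating hours at each failure, per subsystem), and the numbers are assumptions made for illustration; they are not the study's data. The first interval is measured from time zero, which is also an assumption.

```python
def tbf(times):
    """Time between failures from cumulative failure times (operating hours)."""
    ordered = sorted(times)
    return [b - a for a, b in zip([0.0] + ordered[:-1], ordered)]


def white_box_sets(log, min_failures=5):
    """One TBF set per subsystem; subsystems with too few failures are dropped."""
    return {name: tbf(times) for name, times in log.items() if len(times) >= min_failures}


def black_box_set(log):
    """A single TBF set per machine, pooling the failures of all subsystems."""
    pooled = [t for times in log.values() for t in times]
    return tbf(pooled)


def missed_percent(log, min_failures=5):
    """Share of recorded failures that the white-box analysis cannot use."""
    total = sum(len(times) for times in log.values())
    used = sum(len(times) for times in log.values() if len(times) >= min_failures)
    return 100.0 * (total - used) / total


# Hypothetical failure log for one rig (illustration only).
log = {
    "hoses": [10, 25, 40, 70, 90, 120],
    "feeder": [15, 45, 60, 100, 130],
    "cable": [55, 110],  # fewer than five failures: excluded from white-box
}
print(sorted(white_box_sets(log)))   # subsystems kept by the white-box analysis
print(len(black_box_set(log)))       # number of intervals in the pooled set
print(round(missed_percent(log), 1)) # percent of failures missed by white-box
```

In this toy log, the two cable failures are used by the black-box set but dropped by the white-box sets, which is the mechanism behind the missed data discussed in the text.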

Reliability Analysis Using the White-Box Modeling Approach
As discussed, in the white-box method the reliability of the whole machine is calculated based on the reliability of its subsystems. Therefore, all of the data analysis is performed on the time between failures (TBF) of the subsystems. Since the subsystems of a face drilling rig are connected in a series configuration, the reliability of the whole machine at time t is calculated by multiplying the reliabilities of all subsystems at time t.
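For a series configuration of n subsystems, the rule above reads R_s(t) = R_1(t) × R_2(t) × ... × R_n(t). The sketch below evaluates this product under the assumption that each subsystem follows a two-parameter Weibull model, R(t) = exp(-(t/scale)^shape). The Weibull choice and the parameter values are illustrative only; the distributions actually fitted in the study are those listed in Table 1.

```python
import math


def weibull_reliability(t, shape, scale):
    """Reliability of a two-parameter Weibull model at time t."""
    return math.exp(-((t / scale) ** shape))


def series_reliability(t, subsystems):
    """System reliability of a series configuration: product over subsystems.

    `subsystems` is a list of (shape, scale) pairs, one per subsystem.
    """
    r = 1.0
    for shape, scale in subsystems:
        r *= weibull_reliability(t, shape, scale)
    return r


# Hypothetical (shape, scale in hours) parameters for three subsystems.
rig = [(1.2, 60.0), (0.9, 45.0), (1.0, 150.0)]
for t in (5.0, 15.0, 35.0):
    print(t, round(series_reliability(t, rig), 3))
```

Because every factor is at most 1, the system reliability can never exceed that of its weakest subsystem, which is why the least reliable subsystems dominate the machine-level curve.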
In the CMMS, the failure data are recorded based on calendar time. Since drilling is not a continuous process, the TBFs were estimated by considering the utilization of each rig. Reliability and maintainability data analysis is usually done based on the assumption that the TBF and time to repair (TTR) data are independent and identically distributed (iid) in the time domain. It is critical to formally verify this assumption; otherwise, completely wrong conclusions could be drawn [14,15]. The failure data were tested for trends with the Laplace trend test, which is used to determine whether a data set is identically distributed or not [14]. If a trend is observed, classical statistical techniques for reliability analysis may not be appropriate, and a non-stationary model such as the non-homogeneous Poisson process (NHPP) must be fitted [15,16]. If no trend is observed, the serial correlation test can be used to test the dependence of the failure data. A dependence test determines whether successive failures are dependent in data without a long-term trend [15]. If dependence is not observed, the iid assumption is valid. In this study, the statistical analysis found no trends in the failure and repair data.
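As an illustration of the trend check, the sketch below computes the Laplace statistic in its time-truncated form, U = (mean(t_i) - T/2) / (T * sqrt(1/(12 n))), where t_i are the cumulative failure times and T is the end of the observation window. Which variant of the test the study used is not stated, so the time-truncated form, the function names, and the sample data are assumptions. Under the no-trend hypothesis U is approximately standard normal, so |U| > 1.96 indicates a trend at the 0.05 significance level.

```python
import math


def laplace_trend_statistic(failure_times, observation_end):
    """Laplace statistic for failures observed over the window (0, observation_end]."""
    n = len(failure_times)
    if n == 0:
        raise ValueError("at least one failure time is required")
    mean_time = sum(failure_times) / n
    return (mean_time - observation_end / 2.0) / (
        observation_end * math.sqrt(1.0 / (12.0 * n))
    )


def has_trend(u, critical=1.96):
    """Two-sided decision at the 5% level (standard normal critical value)."""
    return abs(u) > critical


# Hypothetical cumulative failure times in operating hours (illustration only).
evenly_spread = [10, 20, 30, 40, 50, 60, 70, 80, 90]
late_cluster = [80, 85, 90, 95, 99]
for times in (evenly_spread, late_cluster):
    u = laplace_trend_statistic(times, observation_end=100.0)
    print(round(u, 2), has_trend(u))
```

A large positive U means failures are concentrated late in the window (deterioration), while a large negative U means they are concentrated early (improvement).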
In this study, after testing and confirming the validity of the iid assumption, different types of statistical distributions were fitted to the data and their parameters were estimated using Easyfit Professional software version 5.6 [17] and Minitab software [18]. The goodness of fit of the distributions was tested using the Kolmogorov-Smirnov (K-S) test. All of the statistical tests used a significance level of 0.05. The best-fitted failure density functions are presented in Table 1.
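The K-S statistic itself is simple to compute: it is the largest vertical distance between the empirical distribution function of the sample and the candidate cumulative distribution function. The sketch below is a plain-Python illustration and is not the procedure implemented in Easyfit or Minitab. The critical value 1.36/sqrt(n) is the usual large-sample approximation for a 0.05 significance level and is an assumption here; for small samples, tabulated values should be used instead.

```python
import math


def ks_statistic(sample, cdf):
    """Largest distance between the empirical CDF and a candidate CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def ks_critical_value_5pct(n):
    """Large-sample approximation of the 5% critical value (assumption)."""
    return 1.36 / math.sqrt(n)


def exponential_cdf(mean):
    """Candidate model: exponential distribution with the given mean."""
    return lambda t: 1.0 - math.exp(-t / mean)


# Hypothetical TBF sample in hours (illustration only).
tbf_hours = [2.1, 4.7, 5.3, 8.8, 9.4, 12.0, 15.6, 21.3, 27.9, 40.2]
d = ks_statistic(tbf_hours, exponential_cdf(mean=sum(tbf_hours) / len(tbf_hours)))
print(round(d, 3), d < ks_critical_value_5pct(len(tbf_hours)))
```

Note that when the distribution parameters are estimated from the same sample, as in this example, the standard critical values tend to be conservative.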
The reliability plots of the subsystems of the studied machines and the reliability of the whole machines based on the white-box modeling are presented in Figures 4-6. As can be seen in these figures, the accumulator, cable, water, and generator subsystems have the highest reliability, while the feeder and hose subsystems have the lowest reliability. The figures show that the reliability behaviors of all subsystems are more or less similar across the studied machines, possibly because they are all in the same working environment, are the same model, come from the same manufacturer, have the same maintenance crew, and are used by operators with similar skills. Figure 7 shows the reliability plots of all three machines using the white-box method in one plot. As the figure indicates, Machine C has the highest reliability among the studied machines, while Machine A has the lowest. The maximum difference is 20% at almost 15 h. These plots show that the studied machines are almost equal in reliability during the period of high reliability operation (from time 0 to 5 h) and also during the period of very low reliability (after 35 h). The reliability of all machines decreases to almost zero after 50 h of operation.

Reliability Analysis Using Black-Box Modelling Approach
In this approach, no subsystems were considered for the machines, and all of the failure data were analyzed as one statistical population per machine. The iid testing procedures and data analysis were performed on the whole failure data of each machine. Reliability models and plots were derived from these data. The results of the data analysis using the black-box method are presented in Table 2. Using the parameters of the best-fit functions given in Table 2, the reliability plots of the machines are drawn in Figure 8. As shown in this figure, the reliability of all the machines reaches almost zero after 50 h of operation. The reliability of Machine A is the lowest. Overall, the black-box method gives very close values for Machines B and C.

Reliability Analysis Using Simulation Approach
Monte Carlo simulation is an approach for the reliability analysis of large-scale complex networks that has been developed in several stages, and different application guidelines have been recommended so far. During the simulation process, subsystems may have random failure and repair distributions, and the failure data of the subsystems are sometimes insufficient and small in sample size [19]. These two issues cause some challenges in the simulation; nevertheless, they have not restricted its applications or reduced its usefulness.
According to the available literature, the Monte Carlo reliability simulation is carried out by different algorithms that are mainly built on the Kamat and Raily (K-R) [20] algorithm. It is considered the most general reliability simulation method, and other methods, such as Rice and Moore (R-M) [21], Chao and Huang (C-H) [22], Lin et al. (L-D-L) [23], and Lin and Donagh (L-D) [24], are known as modifications or specializations of this method [25]. Therefore, in this paper the K-R method is applied for the reliability simulation of face drilling rigs.
In the Kamat and Raily method, the random failure times for each subsystem are generated based on defined failure distribution functions, which are then applied to assess the success or failure of the system. Figure 9 presents the flowchart of the K-R simulation method. More details of this method are presented by Wang and Pham [19] and Hoseinie et al. [25,26]. Based on the algorithm presented in Figure 9, a computer code was developed in MATLAB. For each machine, the program was run for different operation times with 10,000 iterations. The simulated reliability plots of the studied machines are presented in Figure 10. As this figure shows, Machine A has the lowest reliability, as it does in the other two methods. Also, the difference between the reliability of Machines B and C is at most 10%.
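The core idea of the simulation can be sketched as follows. This is a simplified illustration in Python, not the authors' MATLAB code and not a full implementation of the K-R algorithm in Figure 9: it assumes a non-repairable series system, Weibull failure distributions, and hypothetical parameters. In each iteration a random failure time is drawn for every subsystem, the system is taken to fail at the earliest of these times, and the reliability at a given operation time is estimated as the fraction of iterations in which the system survives beyond it.

```python
import math
import random


def simulate_series_reliability(mission_time, subsystems, iterations=10_000, seed=0):
    """Monte Carlo estimate of series-system reliability at `mission_time`.

    `subsystems` is a list of (shape, scale) Weibull parameters (assumed model).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    successes = 0
    for _ in range(iterations):
        # A series system fails as soon as its first subsystem fails.
        system_failure_time = min(
            rng.weibullvariate(scale, shape) for shape, scale in subsystems
        )
        if system_failure_time > mission_time:
            successes += 1
    return successes / iterations


# Hypothetical (shape, scale in hours) parameters; shape = 1 gives exponential
# subsystems, for which the exact answer is exp(-t/10 - t/20).
rig = [(1.0, 10.0), (1.0, 20.0)]
for t in (1.0, 5.0, 15.0):
    estimate = simulate_series_reliability(t, rig)
    exact = math.exp(-t / 10.0 - t / 20.0)
    print(t, round(estimate, 3), round(exact, 3))
```

With 10,000 iterations the sampling error of each estimate is at most about 0.005 (one standard error), so small differences between simulated and analytical curves can arise from sampling alone.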

Comparison and Discussion
As discussed earlier, the reliabilities of the face drilling rigs were analyzed using three methods, and the resulting reliability plots for each machine are presented in Figures 11-13. As seen in these figures, the resulting reliability plots are very similar and sometimes have the same values for a given machine.
In Machine A (Figure 11), during the initial eight hours, the results of all methods are very similar to each other. After this period, the reliability plots of the white-box and black-box modeling become almost the same, but the simulation plot lies above them. After 30 h of operation, the reliability plots of all the methods show almost the same values. Finally, all reach zero after 40 h. In Machine B (Figure 12), the reliability plot of the simulation has almost the same values as the black-box modeling. Both of them are higher than the white-box approach. Overall, the reliability of this machine reaches almost zero after 50 h, which is 10 h more than Machine A. In Machine C (Figure 13), the results are more complicated. Before 5 h, the simulation plot is close to the black-box method and is higher than the white-box. After 5 h, the simulation result separates from the black-box and is closer to the white-box. After 25 h, all plots converge and show the same values.
In general, the simulation method yields reliability values that are either higher than those of the other methods or very similar to those of the white-box method. This is caused by the simulation process, which is based on the failure distribution functions of the machines' subsystems.
Over more or less the whole time period in all of the studied machines, the black-box and white-box methods show a noticeable difference in values (except after 10 h in Machine A). This is because, when the white-box method is used, the failure frequencies of some subsystems are below five, so these subsystems and their failures are eliminated from the analysis. In the black-box approach, by contrast, all of the failures are counted and the analysis is performed using a complete set of failure data. In detail, according to Tables 1 and 2, the number of missing data can be summarized as shown in Table 3. On average, 9.3% of the data are missed in the white-box analysis, which accounts for the differences in the resulting values of this method. However, even without considering the missed data, the combination of different failure modes and failure distributions in the black-box method results in a different description of system reliability, which presents a holistic and general view of system reliability. From another point of view, according to Figures 7, 8, and 10, the ranking of the studied machines in the white-box method is different from that of the other two methods. In the white-box method (Figure 7), the reliability of Machine C is clearly higher than that of Machines B and A, whereas in the black-box and simulation approaches the reliabilities of Machines C and B are almost the same. The reliability of Machine A is the lowest in all methods.
Finally, according to the presented results, it can be concluded that, even though the results of the three reliability analysis methods differ, the difference is sometimes negligible. Therefore, in fleet-level analysis, considering the time required for analysis, the black-box method can be helpful in finding the reliability of each machine or in detecting the weakest machine (lowest reliability) in the fleet. In further analysis, the white-box method can then be used, but only on the weakest machine, to carry out the subsystem- and component-level analysis.
Since the working environment in mining is so harsh, the machinery must be analyzed and monitored from a reliability viewpoint at regular intervals. Therefore, considering the time-consuming nature of the simulation method, it is not recommended for the routine analysis of mining machinery.

Conclusions
In this paper, the reliability of face drilling machines in a Swedish underground mine was analyzed using white-box, black-box, and simulation methods. The overall findings can be listed as follows:

1. The applied reliability analysis methods reveal different results, with the difference varying from almost zero to 20 percent. It is recommended to apply the black-box method in fleet-level analysis, the white-box method in machine-level analysis, and simulation only in complex systems or in the case of a lack of available failure data.

2. Comparative analysis shows that the applied reliability analysis approaches present different rankings of machines within the fleet; nevertheless, they present the same result in finding the last-ranked machine.

3. According to all findings of this study, when the aim is to analyze the reliability of the machine itself and to investigate production stoppages and production reliability, the black-box method is the best method of modeling. All failure data are included in this method, and it is the shortest and easiest way when compared to the other methods.

Figure 2. The defined reliability block diagram for the face drilling rig.

Figure 3. Pareto analysis of the studied face drilling rigs.

Figure 4. Reliability plots of the subsystems of Machine A using the white-box approach.

Figure 5. Reliability plots of the subsystems of Machine B using the white-box approach.

Figure 6. Reliability plots of the subsystems of Machine C using the white-box approach.

Figure 7. Reliability plots of all machines using white-box modeling.

Figure 8. Reliability plots of all studied machines using black-box modeling.

Figure 9. The flowchart of the K-R Monte Carlo reliability simulation method (adopted from [25]).

Figure 10. Reliability plots of all three studied machines obtained from the simulation method.

Figure 11. Reliability plots of Machine A resulting from the different modeling approaches.

Figure 12. Reliability plots of Machine B resulting from the different modeling approaches.

Figure 13. Reliability plots of Machine C resulting from the different modeling approaches.

Table 1. Failure data analysis of the subsystems of the studied machines.

Table 2. Parameters of the best-fitted distributions on failure data using the black-box approach.

Table 3. Number of analyzed and missed data for the different modeling approaches.