Statistical Reliability Assessment for Small Sample of Failure Data of Dumper Diesel Engines Based on Power Law Process and Maximum Likelihood Estimation

Dumpers, or dump trucks, are used all over the world to move overburden in opencast mines. Diesel engines are the main driving force behind these trucks, and damage due to diesel engine failure is very frequent. Efforts are therefore necessary to analyze failures and reduce downtime, which requires a detailed analysis of engine failure at the subsystem level. Reliability analysis and maintenance planning remain the norm in this regard. The obstacle faced while analysing the reliability of dumpers was the lack of a sufficiently large number of failure data. In this paper, this issue is addressed by using the Common Beta Hypothesis (CBH) test and the Meta-analysis test. The engine is divided into five subsystems. The results show that all five subsystems pass the CBH test and the Meta-analysis test. Accordingly, the failure data are grouped. The trend test of the grouped failure data shows that the failure data of two subsystems are independent and identically distributed (iid), while those of the remaining three are not. The reliability is estimated for all five subsystems. The fuel supply subsystem shows the highest reliability, while the self-starting subsystem shows the lowest.


Introduction
The main drive units used in dump trucks are diesel engines. Dump trucks, or dumpers, are used to transport heavy materials around the world. Breakdowns caused by diesel engine failure are frequent. A vital concern in the engine system's performance under given operational conditions is to guarantee satisfactory, uninterrupted operation of the equipment [1][2][3][4][5][6][7]. However, component failure is inevitable and results from the ongoing wear and tear of the working parts of the system. This deterioration can lead to unexpected system failures, whose repair costs are significantly higher than those of scheduled maintenance or repair. To control the cost impact, it is necessary to evaluate the reliability of the equipment and its components. Such a study is useful for making maintenance decisions and incorporating adaptive changes into maintenance policies. The main hurdle is the lack of a large amount of failure data [6][7][8][9]. A small sample is generally not representative of the underlying data, and any statistical treatment based on a small number of failures may be misleading. The present study provides a roadmap that makes reliability analysis possible for any machinery with a small amount of failure data.
To this day, research articles on analytical methods for small data sets remain scarce. D.H. Olwell et al. [10] supplemented limited data with advance information using the Weibull probability distribution. The paper conducted a firing analysis of 2000 motors used in missiles in field conditions using the Maximum Likelihood Estimation (MLE) method and the Bayesian method [10][11][12]. R.M. Mayer et al. [13] pooled the data from multiple data sets to obtain a large amount of failure data for statistical interpretation. The paper emphasizes that grouping of failure data is valid only when the data are collected with sufficient reliability. G. Wang et al. [14] used Failure Mode, Effects and Criticality Analysis (FMECA) to analyze a small sample of failure data of diesel engines. L. Qin et al. [15] analyzed the reliability of bearings based on performance attenuation data. E.J. Ahn et al. [16] described the methods used in systematic reviews and meta-analyses in medical sciences. W. Dai et al. [17] developed a method for reliability assessment using signal features of the machining process. W. Si et al. [18] suggested a reliability model for repairable systems having incomplete failure time data. X. Xintao et al. [19] proposed an improved maximum entropy probability distribution model for estimating the reliability of bearings. F.V. Garcia et al. [20] discussed methods to improve the failure data used for a high-speed marine diesel engine using FMECA. L. Zhang et al. [21] used the Bayesian method for reliability evaluation with very few failure data, performing the reliability analysis on a wet friction plate used in hydraulic control. S. Darmanto et al. [22] analyzed the reliability of diesel engines used as drivers for fire water pumps and determined the reliability and the failure rate of the diesel engine [22][23][24][25][26][27][28].
Recently, Y. He [25] suggested using a combined forecasting model to increase the number of fault data samples. The enlarged data set is used for the reliability analysis of sanitation vehicles with small sample data.
From the above literature review, it is evident that few studies have so far been conducted using scarce failure data. The methods used for reliability analysis of small data sets are mostly the Bayesian approach, FMECA and the Monte Carlo method. Studies on reliability analysis with a very small sample of failure data on engine subsystems have been carried out. The present study uses the CBH test, which has not so far been applied to the statistical treatment of small failure data sets. Additionally, the Meta-analysis test used in this paper has so far been used only in the medical field and not in the industrial field. Using these methods, the small failure data sets of any machine or system (in this case a diesel engine) can be grouped and used for further reliability analysis.
Maintenance philosophies involve performing maintenance at given time intervals, typically after a fixed number of running hours for an engine. In spite of scheduled maintenance, engine failure is inevitable, which decreases the availability of dumpers and reduces production. Reliability analysis of engine subsystems is essential for formulating maintenance strategies that reduce engine downtime and enhance availability. The main obstacle was the lack of adequate data for appropriate statistical analysis. A data set containing only a small sample of failure data limits the possibility of precise decision-making. The current study gives specific guidelines for using the CBH and meta-analysis tests, which allow the failure data to be pooled in order to predict reliability and the mean time between failures (MTBF). The reliability assessment is performed on the grouped time between failures (TBF) data, from which suitable maintenance strategies can be formed. The study provides a roadmap for the reliability analysis of any machinery with few failure data.

Research Methodology
An engine is made up of components, each of which is vital to the operation of the entire engine. Certain major failures can be prevented by replacing particular engine parts at the work site itself. High oil consumption, which is commonly caused by a burst or leaking hose pipe, can be prevented by replacing the hose pipe in time. The presence of metallic pieces in the lube oil filter can heavily damage the engine. Hence, the lube oil should be replaced regularly, along with the bearing oil filters. If the lubrication oil is not changed in time, its viscosity will increase, leading to overheating. Overheating causes expansion of the piston liners, which ultimately leads to engine seizure. Hence, timely replacement of lube oil can prevent engine seizure. Overheating is the most common problem occurring in the engine. It may also occur due to insufficient working of the cooling fan and radiator, so proper and timely maintenance of the radiator helps prevent it. Reliability analysis is desirable to prevent any catastrophic failure, which may be fatal. Flowchart 1 shows all the steps used in this paper for reliability analysis. The following methodology is used to perform the reliability analysis of an appreciably small amount of failure data.

Appl. Sci. 2021, 11, 5387

Flowchart 1. Steps followed for reliability analysis.
The TBF data of the three engines are collected from the management log book of the surface mine. All three engines are of the same type. For statistical analysis, the engine is divided into five main subsystems: air supply, lubrication, self-starting, fuel supply and cooling. The number of TBF data collected was found to be low, so the TBF data are pooled to increase the number of failure data. Grouping the failure data of the three engines enlarges the sample size for each subsystem. Before aggregating the TBF data, a CBH test and a meta-analysis test are applied to the TBF data of all five subsystems to examine the differences between the failure data of the individual engines. In the CBH test, the consistency of the inter-arrival failure rate of each subsystem is evaluated [6]. If the failure rate of a subsystem is consistent across the three engines, the failure data can be grouped. The meta-analysis test is used to combine the findings (in this case the failure data) of independent studies (in this case the three engines). This analysis checks the level of heterogeneity, i.e., the variation among the failure data of the three engines for each subsystem. Next, the iid characteristics of the TBF data of all five subsystems are tested. For the trend test, the relationship between cumulative time and cumulative number of failures is examined using the grouped TBF data of all five subsystems. For the serial correlation test, the i-th TBF is plotted against the (i − 1)-th TBF. Based on the results of the trend test, the reliability and MTBF are determined for all five subsystems using either the MLE method or the Power Law Process (PLP) model. The PLP model is a popular infinite non-homogeneous Poisson process (NHPP) model used to determine the reliability of repairable systems on the basis of the observed failure data [29,30].

Collection of Field Data
The engines under study are turbocharged, 12-cylinder, V-type compression ignition (C.I.) engines with a maximum power rating of 900 HP at 2100 rpm. In C.I. engines, air is compressed in the combustion chamber so that the injected liquid fuel ignites easily and burns progressively for power generation. Figure A1 shows a view of the dumper engine under study (see Appendix A). The TBF data of each subsystem are collected for a period of three years from the mechanical register book of the mechanical open pit mine. The number of failure data in Table 1 was found to be less than 7 for each subsystem. The occurrence of failures has been calculated and is shown in Table 2. Pie charts have been drawn to depict the frequency of failure: Figures 1-3 show the pie charts for all five subsystems of the three engines.

Common Beta Hypothesis Test
The TBF data collected over three years for the subsystems of the three dumper engines are represented pictorially in the form of pie charts [27][28][29]. To increase the number of TBF data, the TBF data of the three engines of the same type are grouped together for each subsystem. The grouping of data is validated by using the Common Beta Hypothesis (CBH) test [2].

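In the CBH test, all three engines are under test. The intensity function of each subsystem is given by Equation (1):

$$u_q(t) = \lambda_q \beta_q t^{\beta_q - 1}, \quad q = 1, \ldots, K \tag{1}$$

where $q = 1, 2, 3$ indexes the engines, $\lambda_q$ is the scale parameter and $\beta_q$ is the shape parameter of the subsystem on the $q$-th engine. The intensity functions of the engines are compared by comparing the $\beta_q$ of each engine. Let $\tilde{\beta}_q$ denote the conditional maximum likelihood estimate of $\beta_q$, which is given by [31][32][33]:

$$\tilde{\beta}_q = \frac{M_q}{\sum_{i=1}^{M_q} \ln\left(\dfrac{T_q}{X_{iq}}\right)} \tag{2}$$

where $K = 3$ is the number of engines, $M_q$ is the number of subsystem failures of each engine, $T_q$ is the total working hours of each engine, and $X_{iq}$ is the $i$-th time-to-failure on the $q$-th engine. The average value of the shape parameter, $\beta^*$, is given by

$$\beta^* = \frac{M}{\sum_{q=1}^{K} \dfrac{M_q}{\tilde{\beta}_q}} \tag{3}$$

where

$$M = \sum_{q=1}^{K} M_q \tag{4}$$

For the calculation of the statistic D, the following quantities are required:

$$L = \sum_{q=1}^{K} M_q \ln\left(\tilde{\beta}_q\right) - M \ln\left(\beta^*\right) \tag{5}$$

$$a = 1 + \frac{1}{6(K-1)}\left[\sum_{q=1}^{K} \frac{1}{M_q} - \frac{1}{M}\right] \tag{6}$$

The statistic D is then calculated such that:

$$D = \frac{2L}{a} \tag{7}$$

The statistic D, estimated using Equation (7), is approximately distributed as a chi-squared random variable with $(3 - 1) = 2$ degrees of freedom. The chi-squared tables are consulted to find the critical points.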
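As an illustration of how the statistic D of Equation (7) could be computed for one subsystem, the following is a minimal sketch. It assumes that M_q is simply the number of recorded failures of the subsystem on engine q, as defined in the text; the function name `cbh_statistic` and the failure times in `example_engines` are hypothetical and are not taken from Table 1 or Table 3.

```python
import math

def cbh_statistic(engines):
    """Return the CBH statistic D for one subsystem.

    engines: list of (cumulative_failure_times, total_working_hours),
    one entry per engine. M_q is taken as the number of failures.
    """
    k = len(engines)
    m_q, beta_q = [], []
    for times, total in engines:
        m = len(times)
        # Conditional MLE of the shape parameter for this engine.
        beta = m / sum(math.log(total / x) for x in times)
        m_q.append(m)
        beta_q.append(beta)
    m_total = sum(m_q)
    # Pooled (average) shape parameter beta*.
    beta_star = m_total / sum(m / b for m, b in zip(m_q, beta_q))
    # Likelihood-ratio quantity L and correction factor a.
    l_stat = sum(m * math.log(b) for m, b in zip(m_q, beta_q))
    l_stat -= m_total * math.log(beta_star)
    a = 1 + (sum(1 / m for m in m_q) - 1 / m_total) / (6 * (k - 1))
    return 2 * l_stat / a

# Hypothetical cumulative failure times (hours) and engine life (hours).
example_engines = [
    ([900.0, 2100.0, 3500.0], 4000.0),
    ([1200.0, 2500.0, 3900.0], 4200.0),
    ([700.0, 1900.0, 3300.0, 4100.0], 4500.0),
]
d_value = cbh_statistic(example_engines)
print(f"D = {d_value:.4f}")
```

When all engines share the same estimated shape parameter, L is zero and so is D; larger differences between the per-engine estimates give a larger D.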

CBH Test of Engine Subsystems
The data used to calculate the chi-squared value D for the CBH test are given in Table 3. "Start" refers to the time at which the engine was first put into service, which is 0. The cumulative time between failures, in hours, is calculated for all subsystems of the individual engines from the values given in Table 1. For a given engine, the maximum cumulative time of its subsystem failures (across all five subsystems) is taken as the life of the engine during data collection. This is shown in Table 3 under the "End" event. The "Failures" entries list the cumulative TBF of each subsystem, taken from Table 1.
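The construction of the "Failures" and "End" entries described above can be sketched as follows. The TBF values are hypothetical placeholders rather than the values of Table 1, and the helper names are illustrative only.

```python
from itertools import accumulate

# Hypothetical TBF values (hours) for one engine, per subsystem.
tbf_by_subsystem = {
    "air supply": [1200.0, 900.0, 1500.0],
    "lubrication": [800.0, 1600.0],
    "self-starting": [600.0, 700.0, 1100.0, 900.0],
    "fuel supply": [2000.0, 1500.0],
    "cooling": [1000.0, 1300.0, 1200.0],
}

def cumulative_failure_times(tbf):
    """Running sum of TBF values: the cumulative failure times."""
    return list(accumulate(tbf))

def engine_end_time(tbf_map):
    """Engine life: the largest cumulative failure time over all subsystems."""
    return max(sum(tbf) for tbf in tbf_map.values() if tbf)

for name, tbf in tbf_by_subsystem.items():
    print(name, cumulative_failure_times(tbf))
print("End:", engine_end_time(tbf_by_subsystem))
```

The cumulative times and the end time obtained this way are the inputs a CBH calculation needs for each engine.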

Meta-Analysis Test Steps
The meta-analysis test is used to check the level of heterogeneity. It is a statistical technique for combining findings from independent studies. In the present study, the variability of the failure data among the three engines is tested for each subsystem using Meta-analysis. Variability means differences between the statistical results obtained from the individual failure data and those obtained from the pooled failure data for a particular subsystem [31][32][33].
In Table 4, the column "downtime hours" gives the total downtime hours of a particular engine due to problems related to the subsystem named at the top of the table. The column "total run of the engine" gives the total time in hours for which the engine has worked.
The rate of outcome is obtained from the outcome (the effect size, es) of each engine:

$$\text{Rate of outcome} = \text{Outcome} \times 100$$

The failure data of each engine for each subsystem are weighted (w) against their variance, i.e., by the inverse of the variance of the effect size:

$$w = \frac{1}{\operatorname{Var}(es)}$$

The weighted effect size of each engine is the product of the effect size and the study weight, i.e., (w × es). Another quantity, w × es², is also calculated for each engine, as it is required for the Q statistic. The Q statistic measures the heterogeneity among the studies. It is calculated as the weighted sum of the squared differences between the effect of each engine's failure data and the pooled effect across all the failure data, using Equation (13):

$$Q = \sum \left(w \times es^2\right) - \frac{\left[\sum \left(w \times es\right)\right]^2}{\sum w} \tag{13}$$

Finally, the level of heterogeneity, $I^2$, is calculated using Equation (14). $I^2$ is expressed as a percentage of the total variability between the failure data:

$$I^2 = \frac{Q - df}{Q} \times 100\% \tag{14}$$

where "df" is the degrees of freedom, equal to n − 1, and n is the number of engines under study (in this case, df = 3 − 1 = 2).
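The Q statistic of Equation (13) and the level of heterogeneity of Equation (14) can be sketched numerically as below. This assumes inverse-variance weights and takes the effect sizes and their variances as given inputs, since the way they are derived from Table 4 is not reproduced here. The function name and the input values are hypothetical.

```python
def heterogeneity(effect_sizes, variances):
    """Return (Q, raw I^2 in %, I^2 truncated at zero in %).

    Weights are assumed to be inverse variances: w = 1 / variance.
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    sum_w_es = sum(w * es for w, es in zip(weights, effect_sizes))
    sum_w_es2 = sum(w * es * es for w, es in zip(weights, effect_sizes))
    # Computational form of the Q statistic.
    q = sum_w_es2 - sum_w_es ** 2 / sum_w
    df = len(effect_sizes) - 1
    # I^2 is undefined for Q <= 0; treat that case as "no heterogeneity".
    i2_raw = (q - df) / q * 100.0 if q > 0 else float("-inf")
    # Negative values are treated as zero, as done in the text.
    i2 = max(0.0, i2_raw)
    return q, i2_raw, i2

# Hypothetical effect sizes and variances for three engines.
q_stat, i2_raw, i2 = heterogeneity([0.10, 0.20, 0.40], [0.01, 0.01, 0.01])
print(f"Q = {q_stat:.4f}, I^2 = {i2:.2f}%")
```

A raw value below zero simply means that Q is smaller than its degrees of freedom, which is why it is reported as zero heterogeneity.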

Results and Discussions
A review of recent studies on reliability analysis with small failure data sets shows that the CBH test and the Meta-analysis test have not been considered as a possible solution for small failure data sets. Although Meta-analysis has been used in medical studies, it has not been applied to machines. This paper uses the CBH test, which has not so far been considered for the statistical treatment of a small amount of failure data, together with Meta-analysis, which has so far been used only in the medical field and not in the industrial sector.

CBH Test
The values of D of the five subsystems were calculated using the CBH test for the three engines together and are shown in Table 4. Both the manually calculated values and the software values are shown in the table. The test values for all five subsystems fall between the lower (0.10) and upper (5.99) critical values. Hence, the TBF data of each subsystem of the three engines pass the CBH test, which supports pooling the TBF data for further analysis.
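The critical values quoted above can be checked without tables, because a chi-squared variable with 2 degrees of freedom has the closed-form distribution function 1 − exp(−x/2). The sketch below assumes that 0.10 and 5.99 are the 5% and 95% quantiles, which is an inference from their numerical values rather than a statement made in the text; the function names are hypothetical.

```python
import math

def chi2_quantile_df2(p):
    """Quantile of the chi-squared distribution with 2 degrees of freedom.

    Valid only for df = 2, where CDF(x) = 1 - exp(-x / 2).
    """
    return -2.0 * math.log(1.0 - p)

lower = chi2_quantile_df2(0.05)
upper = chi2_quantile_df2(0.95)

def passes_cbh(d):
    """True if D lies between the lower and upper critical values."""
    return lower <= d <= upper

print(f"lower = {lower:.2f}, upper = {upper:.2f}")
```

For other degrees of freedom the closed form does not apply, and chi-squared tables or a statistics library would be needed instead.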

Meta-Analysis Test
It can be observed from Table 5 that the level of heterogeneity was negative for the self-starting, fuel supply, lubrication and cooling subsystems. Negative values of the level of heterogeneity can be treated as zero [11]. The level of heterogeneity for the air supply subsystem is 2.23%, which is very low [12]. The zero value for four subsystems and the low value for one subsystem indicate that there is negligible variability among the failure data of the three engines for all five subsystems. This suggests that all the samples came from the same underlying distribution, thereby supporting the result of the CBH test, which allows the pooling of the failure data of the three engines for each subsystem. The pooled data for each subsystem are shown in Table 6. These pooled data can be used for further reliability analysis.

Trend Test and Serial Correlation Test
For all five subsystems of the engine, the cumulative number of failures is plotted against the cumulative time between successive failures using Grapher software. Linearity of this graph indicates that the collected data have no trend and are independent and identically distributed. Next, the i-th TBF is plotted against the (i − 1)-th TBF for all five subsystems. A scattered plot indicates that no serial correlation exists in the data [7]. The grouped TBF data of Table 6 are used for plotting the graphs. The trend test plots of the five subsystems are shown in Figure 4a-e.

The plots in Figure 5a-e show the serial correlation tests of all five subsystems. No trend is observed in the air supply and lubrication subsystems, as the plotted points lie on a straight line. A trend is seen for the self-starting, fuel supply and cooling subsystems. No serial correlation is found for any of the five subsystems, owing to the scattered nature of the graphs (Figure 5a-e). Hence, the self-starting, fuel supply and cooling subsystems do not follow iid characteristics, whereas the air supply and lubrication subsystems do.
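The points behind the two graphical tests can be generated as in the sketch below. The lag-1 Pearson correlation is added here only as a rough numerical companion to the scatter plot and is not part of the graphical procedure described in the text. The TBF values and function names are hypothetical.

```python
import math
from itertools import accumulate

def trend_points(tbf):
    """(cumulative time, cumulative number of failures) pairs."""
    return [(t, i) for i, t in enumerate(accumulate(tbf), start=1)]

def lag_pairs(tbf):
    """((i-1)-th TBF, i-th TBF) pairs for the serial correlation plot."""
    return list(zip(tbf[:-1], tbf[1:]))

def pearson_r(xs, ys):
    """Pearson correlation coefficient; NaN if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical grouped TBF data (hours) for one subsystem.
example_tbf = [1400.0, 900.0, 1700.0, 1100.0, 1600.0, 800.0, 1500.0]
pairs = lag_pairs(example_tbf)
lag1_r = pearson_r([p[0] for p in pairs], [p[1] for p in pairs])
print(trend_points(example_tbf))
print(f"lag-1 correlation = {lag1_r:.3f}")
```

A lag-1 correlation close to zero is consistent with a scattered plot, while a curved cumulative plot points to a trend in the failure data.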

Reliability Analysis
The grouped TBF data of the self-starting, fuel supply and cooling subsystems are identified as not independent and identically distributed. The TBF data of the air supply and lubrication subsystems are independent and identically distributed. The MLE method is used to estimate the reliability and MTBF of the subsystems with iid data, and the PLP model is used for the reliability estimation of the subsystems with non-iid data. The reliability is estimated at an arbitrary time of 1000 h (for comparison), and the Mean Time Between Failures (MTBF) is also calculated. Table A1 shows the values of reliability and MTBF for all five subsystems (see Appendix A). Figures 6 and 7 show the values of reliability and MTBF for all five subsystems. The reliability is highest for the fuel supply subsystem and lowest for the self-starting subsystem. The lowest MTBF is that of the self-starting subsystem, 1186.47 h, and the highest is that of the air supply subsystem, 1525.50 h.
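The two estimation routes can be illustrated with the sketch below. For non-iid data it uses the standard time-terminated maximum likelihood estimates of the PLP parameters. For iid data it assumes an exponential distribution purely for illustration, since the distribution fitted to the iid subsystems is not stated here. All function names and data values are hypothetical.

```python
import math

def plp_mle(failure_times, total_time):
    """Time-terminated MLE of the PLP parameters (lambda, beta).

    failure_times: cumulative failure times; total_time: observation end.
    """
    n = len(failure_times)
    beta = n / sum(math.log(total_time / t) for t in failure_times)
    lam = n / total_time ** beta
    return lam, beta

def plp_reliability(lam, beta, duration, start=0.0):
    """Probability of no failure in (start, start + duration] under the PLP."""
    expected = lam * (start + duration) ** beta - lam * start ** beta
    return math.exp(-expected)

def plp_instantaneous_mtbf(lam, beta, t):
    """Reciprocal of the PLP intensity at time t."""
    return 1.0 / (lam * beta * t ** (beta - 1.0))

def exponential_fit(tbf, mission_time):
    """Assumed exponential model for iid data: MTBF is the mean TBF."""
    mtbf = sum(tbf) / len(tbf)
    return mtbf, math.exp(-mission_time / mtbf)

# Hypothetical cumulative failure times (hours) and observation end.
example_times = [400.0, 1100.0, 1500.0, 2300.0, 2600.0, 3400.0, 3700.0]
example_total = 4000.0
lam_hat, beta_hat = plp_mle(example_times, example_total)
print(f"beta = {beta_hat:.3f}")
print(f"R(1000 h) = {plp_reliability(lam_hat, beta_hat, 1000.0):.3f}")
```

A fitted beta above 1 indicates a deteriorating subsystem and a beta below 1 an improving one, which is why the PLP model suits data that show a trend.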

Conclusions
The problem of performing reliability analysis with a very small number of failure data has been addressed in this paper. This research work provides a guide that can be used for the reliability analysis of any repairable system and its subsystems when only a very small sample of failure data is available. Using the CBH test, the consistency of the failure data of the system can be checked. To support the CBH test results, the level of heterogeneity of systems and subsystems can be determined using meta-analysis. After passing these two tests, the small failure data sets can be pooled. The pooled TBF data can then be tested for trend.
Reliability analysis can then be carried out using the MLE method and the PLP model, and the values of reliability and MTBF are estimated. The value of MTBF can be used in scheduling the maintenance of the engine. Additionally, the subsystem with the lowest reliability, i.e., the self-starting subsystem, should be given extra care during maintenance.
The test values for all five subsystems fall between the lower (0.10) and upper (5.99) critical values. Hence, the TBF data of each subsystem of the three engines pass the CBH test, which supports pooling the TBF data for further analysis.
The zero value for four subsystems and the low level of heterogeneity for one subsystem indicate that there is negligible variability among the failure data of the three engines for all five subsystems. This suggests that all the samples came from the same underlying distribution, thereby supporting the result of the CBH test, which allows the pooling of the failure data of the three engines for each subsystem.
The failure data of all five subsystems of the engine showed consistency by passing the CBH test, and the meta-analysis test supports this result. A trend is seen for the self-starting, fuel supply and cooling subsystems. No serial correlation is found for any of the five subsystems. Thus, the self-starting, fuel supply and cooling subsystems do not follow iid characteristics, whereas the air supply and lubrication subsystems do.
The reliability is highest for the fuel supply subsystem and lowest for the self-starting subsystem. The lowest MTBF is that of the self-starting subsystem, 1186.47 h, and the highest is that of the air supply subsystem, 1525.50 h.
Through reliability analysis and a reliability-based maintenance schedule, the downtime and catastrophic failure of dumpers can be reduced.