A Statistical Evaluation Method Based on Fuzzy Failure Data for Multi-State Equipment Reliability

Abstract: For complex equipment, estimating the reliability level only through failure probability easily over-evaluates the impact of failure on production. To remedy this problem, this paper proposes a statistical evaluation method based on fuzzy failure data that considers the multi-state characteristics of equipment failures. In this method, a new reliability-evaluation scheme is first presented based on the traditional statistical analysis method using the Weibull distribution function. For this scheme, the failure-grade index is defined


Introduction
In existing studies, reliability evaluation is mainly performed based on failure data or simulation. The evaluation method based on failure data is the earliest and, at present, the most commonly used [1][2][3]. Failure data are the records generated during development, testing, or use, including the failure mode, failure type, cause of failure, impact on production, and occurrence time, which describe the key features of fault events. Experience shows that the time between failures or the service life of electromechanical equipment such as computer numerical control (CNC) machine tools follows certain types of probability distribution. Typical distribution types include the exponential distribution [4], Weibull distribution [5,6], Rayleigh distribution [7], normal distribution [8], and extreme-value distribution [9]. For example, under the complete maintenance assumption of "repair as new", the exponential and Weibull distributions are often used for fault-data modeling. Common reliability indicators include reliability, failure rate, mean time between failures (MTBF), and mean time to repair (MTTR). In early studies, researchers evaluated the reliability of machine tools mostly based on the two-state hypothesis, which assumes that the equipment has only two states: normal and faulty. For example, the exponential and Weibull distributions are often used to establish the reliability model of the machine tool based on fault data [10][11][12].
However, the numerical control machine tool is a complex piece of mechanical and electrical equipment with diversified failure modes, difficult data collection, and few occasional failures. Therefore, system failures exhibit obvious and complex polymorphism. In order to evaluate the reliability level more accurately, multi-state system reliability has become a current research hot spot. The existing reliability-modeling methods for multi-state systems include the multi-valued model extended from the Boolean model for the two-state system [13], the stochastic process model [14], the universal generating function method [15], and Monte Carlo simulation [16], which are analyzed and summarized in [17]. In these studies, scholars used reliability block diagrams, fault trees, and Petri net models to describe the fault logic relationship between the system and its component units, and they proposed reliability-evaluation methods for complex systems by analyzing the probability of underlying events and the importance of components. For example, Wang et al. proposed a reliability-evaluation method for manufacturing systems based on dynamic adaptive fuzzy-reasoning Petri nets, considering the multi-state performance of each machine of the system [18]. He et al. proposed a task reliability-evaluation method for manufacturing systems based on extended stochastic flow nets, focusing on the fuzzy multi-state characteristics of machines affected by human factors and working conditions [19]. Sun et al. established a reliability model based on the universal generating function, which considered the stochastic uncertainty and cognitive uncertainty of manufacturing systems, and they carried out structural reliability modeling and an analysis of the fatigue strength of machine-tool milling shafts [20].

Due to the difficulty of failure data collection in practical applications, a small sample size is always the main problem for reliability evaluations based on failure data, and it can be better handled using advanced statistical methods [21]. For example, Xu et al. introduced the objective Bayesian method to analyze degradation data with small sample sizes and used a rejection sampling-embedded Monte Carlo algorithm to obtain Bayesian parameters [22]. Zhou et al. proposed a re-parameterized gamma process with random effects to improve the calculation efficiency and estimation accuracy of the product degradation process [23]. Amalnerkar et al. used the bootstrap information criterion to present a unique and efficient simulation scheme, aiming to solve the uncertainty problem of reliability analysis [24]. Moreover, the small sample size arises because reliability experiments have several drawbacks, such as a long time period and high cost; thus, simulation-based reliability evaluation was developed, which helps to realize the collaborative design of reliability and performance. The reliability-simulation model is established based on performance simulation, also considering equipment fault behaviors and mechanisms [25].
Moreover, since different failure modes have different impacts on the system or task, failure severity has been considered in many existing studies. For example, Kaidis et al. classified failure events of wind turbines into three severity levels according to the required repair time (failures that only need manual restarts, failures that need minor repairs, and failures that need major repairs) and proposed reliability statistical processing methods for failures at different severity levels [26]. Zhang et al. introduced failure severity into the failure mode effects and criticality analysis (FMECA) method considering failure mode correlation, which referenced the definition of failure severity (the severity of the most serious consequences of product failure) in the standard (the detailed information can be found in the reference [27]). They divided failure severity into four levels according to the influence of failures on people, systems, economic loss, or task efficiency: minor level, medium level, lethal level, and disaster level [28]. Zhang et al. considered the severity level of failures in the reliability analysis of machine tools based on FMECA and fuzzy evaluation, which divided failures into five levels according to occurrence rate and detection difficulty [29]. In summary, many researchers have considered failure severity to improve the accuracy of the reliability analysis, but they have used different classification means according to the requirements of their proposed methods.
From the above introduction, the method based on failure data is the most used in recent studies for complex equipment; for example, some distribution functions have been widely applied to establish the reliability model for mechanical or control systems, such as machine tools [30], robots [31], and control systems [32]. Failure severity has also been considered to analyze the reliability of equipment more accurately. However, besides failure severity, maintenance time and expense also have a major impact on the task-executing efficiency and cost of equipment. Moreover, given the definition of reliability (the ability of the equipment to complete the specified function under specified conditions and within the specified time), the traditional method cannot quantitatively evaluate the equipment-reliability level accurately, since it only considers the failure occurrence probability and ignores the degree to which a failure influences equipment performance or functioning. Therefore, in this paper, we introduce a new index, the failure-grade index, to characterize the failure state more comprehensively, considering the severity, maintenance time, and expense of the failure. The purpose of this index is to obtain a more accurate reliability-evaluation model for equipment, which considers not only the occurrence probability but also the influence of failure on equipment performance or function, as well as on the efficiency and cost of the production task.

Based on the above definition, this paper proposes a reliability fuzzy-evaluation method based on failure-state characterization. It mainly includes the following three contributions: (1) The failure-grade fuzzy-evaluation method is proposed to characterize the failure state considering fault severity, failure maintenance time, and expense; (2) The modified adaptive small-sample-expansion method is proposed based on error judgment and correlation coefficient judgment for the time between failures and the failure-grade evaluation index, respectively, aiming to solve the problem of a small sample size; and (3) A novel reliability-evaluation model is established to estimate the reliability level of equipment more accurately by considering the failure grade and membership degree.

The remainder of this paper is organized as follows: Section 2 presents the reliability-evaluation scheme considering multiple states of failure, which is proposed based on the traditional Weibull-distribution-based reliability modeling framework; Section 3 presents the failure-grade fuzzy-evaluation method and an example analysis; Section 4 outlines the modified adaptive small-sample-expansion method and an example analysis; Section 5 presents the novel reliability-evaluation model and an example analysis; and Section 6 offers the main conclusions and future recommendations based on this work.

Reliability-Evaluation Scheme Considering Multi-State Characteristic of Failures
According to existing studies, the Weibull distribution is an absolutely continuous probability distribution whose power-function form can be adjusted through its shape and scale parameters; because of its wide coverage, it has been widely applied in the reliability analysis of equipment [30][31][32]. Based on the Weibull distribution, the probability density function and the cumulative distribution function (also known as the failure distribution function) can be expressed by [33]

$$f(t) = \frac{\beta}{\alpha}\left(\frac{t-\gamma}{\alpha}\right)^{\beta-1}\exp\left[-\left(\frac{t-\gamma}{\alpha}\right)^{\beta}\right] \tag{1}$$

$$F(t) = 1 - \exp\left[-\left(\frac{t-\gamma}{\alpha}\right)^{\beta}\right] \tag{2}$$

wherein $\alpha$ and $\beta$ are the scale parameter and shape parameter, respectively ($\alpha, \beta > 0$), and $\gamma$ is the position parameter ($\gamma \ge 0$). Practice shows that the failure rate of repairable equipment generally takes the shape of a "bathtub curve" over time, with three periods: the early failure period ($\beta < 1$), the occasional failure period ($\beta = 1$), and the wear-out failure period ($\beta > 1$). In real applications, it is assumed that failures can occur from $t = 0$ (that is, $\gamma = 0$); the reliability function based on the two-parameter Weibull distribution is then widely applied and can be expressed by [33]

$$R(t) = 1 - F(t) = \exp\left[-\left(\frac{t}{\alpha}\right)^{\beta}\right] \tag{3}$$
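As a minimal sketch of the Weibull quantities just described, the functions below evaluate the density, the failure distribution function, the two-parameter reliability function, and the failure rate. The function names and the argument names `alpha` (scale), `beta` (shape), and `gamma` (position) are chosen here for illustration and are not taken from the paper.

```python
import math


def weibull_pdf(t, alpha, beta, gamma=0.0):
    """Three-parameter Weibull density; taken as zero at or before gamma."""
    if t <= gamma:
        return 0.0
    z = (t - gamma) / alpha
    return (beta / alpha) * z ** (beta - 1) * math.exp(-(z ** beta))


def weibull_cdf(t, alpha, beta, gamma=0.0):
    """Failure distribution function F(t)."""
    if t <= gamma:
        return 0.0
    return 1.0 - math.exp(-(((t - gamma) / alpha) ** beta))


def reliability(t, alpha, beta):
    """Two-parameter reliability R(t) = exp(-(t / alpha) ** beta), for t >= 0."""
    return math.exp(-((t / alpha) ** beta))


def failure_rate(t, alpha, beta):
    """Hazard rate f(t) / R(t) of the two-parameter model, for t > 0."""
    return (beta / alpha) * (t / alpha) ** (beta - 1)
```

With `beta < 1` the failure rate falls over time, with `beta = 1` it is constant, and with `beta > 1` it rises, which corresponds to the three periods of the bathtub curve.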
Moreover, in order to analyze equipment reliability over the whole life cycle, researchers proposed a reliability-modeling method based on the mixed Weibull distribution, which is expressed as a three-stage segmented function in Equation (4). A linear regression analysis was performed for each segment of the function based on the failure data in the corresponding failure stage, and function-continuity processing was carried out to obtain the final function parameters [34,35].
where three weight parameters of the model are used to adjust the continuity of the three curves, and each of the three failure periods has its own Weibull scale parameter and shape parameter. In the parameter estimation of the distribution function, the classical probability statistical method is generally used to analyze the equipment failure data, and the estimation accuracy depends on sufficient data. However, in practical engineering, for most equipment or manufacturing systems, it is difficult to obtain sufficient failure data due to long experiment periods and huge experiment costs. A small sample dataset is one of the key problems faced by reliability research on complex equipment or manufacturing systems. A small sample refers to a size less than or equal to 30 [36].

Small-sample expansion is an effective way to solve the problem of insufficient sample size. The methods mainly include the regression conversion method, virtual augmented samples, and the bootstrap method. Through a comparative study of these methods, the authors of [36] found that the bootstrap method has an obvious advantage in the small-sample expansion of time between failures. The classical bootstrap method extends the original fault data of size $n$ through the following steps:

(1) take a random value $r$ in the interval [0,1];
(2) take $\eta = (n-1)r$ and $i = \mathrm{floor}(\eta) + 1$, wherein $\mathrm{floor}(\eta)$ represents the largest integer not greater than $\eta$; and
(3) obtain regenerated data by $x = x^*_{(i)} + (\eta - i + 1)\left(x^*_{(i+1)} - x^*_{(i)}\right)$, wherein $x^*_{(i)}$ is the $i$th data point after the original data are sorted in descending order.

Based on the above description, the pseudo-code algorithm of the classical bootstrap method is determined and can be found in Table 1, wherein CBExpansion($X_0$) means the expansion of the original data $X_0$ based on the classical bootstrap method and $X_1$ denotes the newly generated data; $N$ is the required sample size of the expansion; and the function rand(1,1) is used to generate the random value in the interval [0,1]. Using the above method, the original fault data can be extended, which provides a sufficient fault-data basis for the estimation of distribution parameters. The traditional equipment-reliability modeling procedure based on distribution functions is shown in Figure 1.

With the above method as the basis, in order to consider the influence of failure on equipment performance or function, the failure-grade evaluation index is established to characterize the failure state. Then, taking the values of the time between failures and the failure-evaluation index as input data, failure-grade fuzzy evaluation can be performed. Finally, for graded failures, the reliability model for each grade of failures can be established based on the Weibull distribution and the estimation of its parameters, and it is related to the failure grade and its membership degree. After failure grading, the small-sample problem is more likely to occur in the reliability modeling of each failure grade, so the modified adaptive small-sample-expansion method is proposed based on the bootstrap method for failure data composed of the time between failures and the failure-grade index. The proposed new scheme based on the above steps is shown in Figure 2. This scheme considers both the failure probability and the degree to which failure influences the running of the system, so the result can evaluate the reliability level of equipment more accurately.
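The three bootstrap steps described in this section can be sketched in Python as follows. This is an illustrative reading rather than the paper's Table 1: the name `cb_expansion` and its arguments are hypothetical, the data are sorted in ascending order (the interpolation between neighbouring points yields the same kind of values for either sort direction), and `random.random()` draws from [0, 1), which keeps the index `i + 1` inside the sample.

```python
import math
import random


def cb_expansion(original, size, rng=random):
    """Generate `size` new points by interpolating between neighbouring
    order statistics of `original` (needs at least two data points)."""
    xs = sorted(original)            # x*_(1) <= ... <= x*_(n)
    n = len(xs)
    new_data = []
    for _ in range(size):
        r = rng.random()             # step (1): random value in [0, 1)
        eta = (n - 1) * r            # step (2): scaled position
        i = math.floor(eta) + 1      # 1-based index, between 1 and n - 1
        lower, upper = xs[i - 1], xs[i]
        # step (3): linear interpolation between x*_(i) and x*_(i+1)
        new_data.append(lower + (eta - i + 1) * (upper - lower))
    return new_data
```

A call such as `cb_expansion(times, 50, random.Random(1))` returns 50 regenerated values that all lie between the smallest and largest original observations.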

Failure-Grade Fuzzy-Evaluation Method and Example Analysis
From the proposed scheme, the equipment-reliability evaluation needs to solve three problems, including the failure-grade fuzzy-evaluation problem based on index definition, the modified small-sample-expansion problem for the failure data, and the new reliability-evaluation modeling problem based on failure-state characterization.The proposed methods are presented in the following contents.

Failure-Grade Fuzzy-Evaluation Method
In this subsection, the failure-grade index is established to characterize the failure state through the analysis of influence factors, and then, the failure-grade fuzzy-evaluation method is proposed to provide the premise of failure-data processing for the final reliability evaluation.The detailed steps are presented below.
(1) Influence analysis. In practical applications, equipment performance is often multifaceted. For a piece of equipment, people may be concerned not only about whether it can continuously and successfully complete specified operations but also about its productivity, accuracy, and many other factors, and different modes of failure often have different degrees of impact on equipment performance. Taking the processing machine tool as an example, a failure of the lubrication system caused by a lack of oil or oil leakage will affect the machining accuracy and efficiency, whereas a failure of the lubrication system caused by the oil valve, oil pump motor, etc., will directly lead to the failure of the processing task. Therefore, the failure severity is one factor that determines the failure grade. In this paper, we use a method similar to [28] to determine the failure severity levels, which considers the influence of failures on equipment performance or function and gives a detailed description of each severity level: Level-1 severity can result in a decrease in accuracy or efficiency but has no impact on the production task; Level-2 severity can result in a decrease in accuracy or efficiency and has a general impact on the production task; Level-3 severity can result in a serious decline in accuracy or efficiency and has a significant impact on the production task; and Level-4 severity can cause the system to shut down, making it unable to perform the production task. In addition, the maintenance time and expense required by different failures are often different; that is, the impact of failures on the actual production efficiency and cost is different. From this perspective, the maintenance time and expense of the failure are considered as two further influencing factors for failure-grade evaluation.

(2) Failure-grade index definition. The failure-grade index is defined to characterize the failure state, comprehensively considering the severity, maintenance time, and expense of the failure, and is written as follows:

$$g_i = w_s s_i + w_t k_{t,i} + w_c k_{c,i} \tag{5}$$

where $s_i$ is the severity coefficient of the $i$th failure, and its values for the Level-1, Level-2, Level-3, and Level-4 failures are taken as 0.25, 0.5, 0.75, and 1, respectively. That is, the closer the value is to zero, the smaller the influence of the failure on the production task. $w_s$, $w_t$, and $w_c$ are the weights of the severity, maintenance-time, and expense coefficients in the comprehensive index, and they can be assigned according to the actual application by weighing functional requirements, time, cost, and other resources. $k_{t,i}$ and $k_{c,i}$ are, respectively, the time proportion coefficient and expense proportion coefficient, which characterize the maintenance time $t_{m,i}$ and the maintenance expense $c_{m,i}$ of the $i$th failure. These are related to the thresholds $t_{th}$ and $c_{th}$ of time and expense, respectively, which can be set according to the actual production situation. $k_{t,i}$ and $k_{c,i}$ can be calculated by the following formulas:

$$k_{t,i} = \begin{cases} t_{m,i}/t_{th}, & t_{m,i} < t_{th} \\ 1, & t_{m,i} \ge t_{th} \end{cases} \tag{6}$$

$$k_{c,i} = \begin{cases} c_{m,i}/c_{th}, & c_{m,i} < c_{th} \\ 1, & c_{m,i} \ge c_{th} \end{cases} \tag{7}$$

According to the above formulas, when the maintenance time or maintenance expense is larger than the threshold value, the proportion coefficient is 1, and when it is less than the threshold value, the proportion coefficient is the ratio of the maintenance time or expense to the corresponding threshold value; that is, the greater the time or expense, the closer the ratio is to 1.

(3) Failure-grade fuzzy evaluation. For the fuzzy evaluation, the failure grade is divided into five grades I~V, and the quantitative assignment of each grade is shown in Table 2. The failure-grade judgment matrix of the index value $g_i$ can be written as

$$R_i = \left[r_{i,1}, r_{i,2}, r_{i,3}, r_{i,4}, r_{i,5}\right] \tag{8}$$

where $r_{i,j}$ represents the membership degree of the index value $g_i$ to the $j$th failure grade. The membership degree is determined by the trapezoidal fuzzy distribution function, wherein the first grade uses the small type, the second to fourth grades use the intermediate type, and the fifth grade uses the large type. The membership degree function of each grade is defined piecewise over the index value.
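As a rough illustration of steps (2) and (3), the sketch below computes a failure-grade index and a five-element membership vector. Only the severity coefficients (0.25, 0.5, 0.75, 1) and the capped-ratio rule come from the text; the weights `(0.4, 0.3, 0.3)`, the default thresholds, and every trapezoid breakpoint in `GRADE_SHAPES` are hypothetical values chosen for the example, since the grade assignments of Table 2 are not reproduced here.

```python
SEVERITY = {1: 0.25, 2: 0.5, 3: 0.75, 4: 1.0}   # severity coefficients from the text

# Hypothetical trapezoid breakpoints (a, b, c, d) for grades I..V.
GRADE_SHAPES = [
    (0.00, 0.00, 0.15, 0.25),   # grade I   (small type)
    (0.15, 0.25, 0.35, 0.45),   # grade II  (intermediate type)
    (0.35, 0.45, 0.55, 0.65),   # grade III (intermediate type)
    (0.55, 0.65, 0.75, 0.85),   # grade IV  (intermediate type)
    (0.75, 0.85, 1.00, 1.00),   # grade V   (large type)
]


def proportion(value, threshold):
    """Ratio to the threshold, capped at 1."""
    return min(value / threshold, 1.0)


def grade_index(level, minutes, expense, weights=(0.4, 0.3, 0.3),
                t_th=120.0, c_th=1000.0):
    """Weighted sum of severity, time, and expense coefficients (assumed weights)."""
    w_s, w_t, w_c = weights
    return (w_s * SEVERITY[level]
            + w_t * proportion(minutes, t_th)
            + w_c * proportion(expense, c_th))


def trapezoid(x, a, b, c, d):
    """Membership rising on (a, b), equal to 1 on [b, c], falling on (c, d)."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0


def memberships(index_value):
    """Membership degree of one index value to each of the five grades."""
    return [trapezoid(index_value, *shape) for shape in GRADE_SHAPES]
```

With these assumed breakpoints, an index value that falls between two plateaus belongs partly to both neighbouring grades, which is how one failure can contribute to more than one grade.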

Example Analysis
In order to verify the feasibility of the above fuzzy failure-grade evaluation method, we took the practical data of one machine tool and one machine-tool cooling system listed in the literature [36] as the research objects and carried out a numerical analysis. In the analysis, based on the failure-grade fuzzy-evaluation method, the severity of each failure is classified according to the severity classification principle, and the maintenance expense is estimated for each failure based on practical experience. For component replacement, the purchase price is taken as the reference, and the expense of cleaning, disassembly, and other actions is mainly reflected in the maintenance time. The labor cost is ignored in this case; that is, the corresponding maintenance expense is taken as zero. We set the time threshold to 120 min and the expense threshold to 1000 RMB, and the failure data of the machine tool and the machine-tool cooling system based on fuzzy evaluation are listed in Tables A1 and A2 of Appendix A, respectively. In these tables, $t$ denotes the time between failures, and F01~F13 and FX01~FX67 represent the failure descriptions for the two systems, respectively, which can be found in the literature [36].
Based on the proposed failure-grading method, the failure-grade index reflects the contributions of severity, maintenance time, and expense to the grade evaluation, and each contribution is expressed by the coefficient of the corresponding factor. By plotting the values of these coefficients and the corresponding grade index of the machine tool (Figure 3), we can observe that the results of the proposed method are consistent with objective reality. For example, the severity of failure 1 and failure 10 is level 4, but their maintenance time and expense are low, which results in lower grade index values of 0.4750 and 0.5503, respectively. For failure 5 and failure 6, the severity level, maintenance time, and expense are all low, which results in the lowest grade index values in the table: 0.3375 and 0.3500, respectively. Where the severity level, maintenance time, and expense are all high, the grade index value is the highest in the table: 0.9000. From the partial failure data of the cooling system shown in Figure 4, we can draw the same conclusion; for example, the severity level, maintenance time, and expense of failures 2, 4, 5, and 14 are high, so the grade index values for these failures are all above 0.9, whereas the severity of failure 1 and failure 15 is level 4 but their maintenance time and expense are low, so the grade index values are about 0.45. The above analysis indicates the rationality and effectiveness of the proposed failure-grading method.

Modified Adaptive Small-Sample-Expansion Method
The modified method is based on the classical bootstrap method and applies to failure data that include the time between failures and the failure-grade index, assuming that the two data sets are independent of each other. Moreover, in order to limit the deviation of the newborn sample from the real distribution caused by the randomness of bootstrap expansion, the random number expansion is optimized based on error judgment for the time between failures and correlation coefficient judgment for the failure-grade index. Detailed steps for the modified sample expansion are as follows.
(1) Take the original data for the time between failures $T_0$ and the failure-grade index $G_0$, each of size $n$, and two random values $r_1$ and $r_2$ in the interval [0,1];

(2) Take $\eta_1 = (n-1)r_1$ and $\eta_2 = (n-1)r_2$, and set $i = \mathrm{floor}(\eta_1) + 1$ and $j = \mathrm{floor}(\eta_2) + 1$;

(3) Obtain the newborn data by $t = t^*_{(i)} + (\eta_1 - i + 1)\left(t^*_{(i+1)} - t^*_{(i)}\right)$ and $g = g^*_{(j)} + (\eta_2 - j + 1)\left(g^*_{(j+1)} - g^*_{(j)}\right)$, wherein $t^*_{(i)}$ and $g^*_{(j)}$ are the $i$th value of the time between failures and the $j$th value of the failure-grade index in the original data sorted from smallest to largest;

(4) For the data optimization of the time between failures, taking the original data and the newborn data of each iteration as failure data, respectively, the Weibull distribution functions can be determined as $F_0(t)$ and $F_1(t)$ according to Equation (2). Then, the largest error between the two distribution curves can be calculated by $\Delta = \max_t \left|F_0(t) - F_1(t)\right|$. Finally, with the error threshold set as $\Delta_{th}$, the newborn data set is regenerated by the above steps until the threshold condition $\Delta \le \Delta_{th}$ is satisfied. This embedded optimization ensures that the distribution of the time between failures is preserved.

(5) The distribution feature of the failure-grade index is first analyzed based on the calculation implemented in the example cases of the above section. Figure 5a,b shows the interval distributions of the failure-grade index for the machine tool and the cooling system, respectively. The two figures show that the distribution has a bell shape, roughly increasing first and then decreasing; thus, it can be considered that the grade index approximately follows a positively skewed distribution. Therefore, for the failure-grade index, the ratio of the variances of the original data and the newborn data is calculated to define the coefficient of correlation of the two data sets, $\rho = \sigma^2_{\min} / \sigma^2_{\max}$, wherein $\sigma^2_{\min}$ and $\sigma^2_{\max}$ are the smaller variance and the larger variance, respectively, and $0 < \rho \le 1$. The closer the coefficient is to one, the more correlated the two sets of data are. The threshold of the coefficient is set as $\rho_{th}$. When the newborn data generated by steps (1) to (3) cannot satisfy the threshold requirement ($\rho \ge \rho_{th}$), random numbers are re-taken and newborn data are regenerated until the threshold requirement is satisfied, to ensure that the new data do not deviate from the real distribution.

Based on the above description, the pseudo-code algorithm of the modified sample-expansion method is listed in Table 3, wherein MSExpansion($T_0$, $G_0$) means the expansion of the original data $T_0$ and $G_0$ based on the modified sample-expansion method; $\sigma_1^2$ and $\sigma_0^2$ denote the variances of the newborn and original data used in the coefficient of correlation, and $N$ and $n$ are their sample sizes, respectively; the function mean(•) is used to calculate the average value of the sample; and the functions min(•) and max(•) are used to take the smallest value and the largest value from several data.
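The modified expansion can be sketched as two accept-or-retry loops, one per data set. The code below is an assumption-laden illustration, not the paper's Table 3: the names `ms_expansion`, `fit_weibull`, `max_cdf_error`, and `variance_ratio` are hypothetical, the Weibull fit uses median ranks with least squares on the linearised distribution function, the curve error is evaluated on a finite time grid up to twice the largest original observation, and a retry cap (`max_iter`) is added so that the loops always terminate.

```python
import math
import random
from statistics import pvariance


def fit_weibull(times):
    """Two-parameter Weibull fit by least squares on ln(-ln(1 - F)) versus ln t,
    with median ranks (k - 0.3) / (n + 0.4) as the empirical F (an assumption)."""
    ts = sorted(times)
    n = len(ts)
    xs = [math.log(t) for t in ts]
    ys = [math.log(-math.log(1.0 - (k - 0.3) / (n + 0.4))) for k in range(1, n + 1)]
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    beta = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    alpha = math.exp(mx - my / beta)      # intercept equals -beta * ln(alpha)
    return alpha, beta


def weibull_cdf(t, alpha, beta):
    return 1.0 - math.exp(-((t / alpha) ** beta))


def resample(data, size, rng):
    """Steps (1)-(3): interpolate between neighbouring sorted values."""
    xs = sorted(data)
    n = len(xs)
    out = []
    for _ in range(size):
        eta = (n - 1) * rng.random()
        i = math.floor(eta) + 1
        out.append(xs[i - 1] + (eta - i + 1) * (xs[i] - xs[i - 1]))
    return out


def max_cdf_error(params0, params1, t_max, grid=200):
    """Largest gap between two fitted distribution curves on a time grid."""
    points = (t_max * k / grid for k in range(1, grid + 1))
    return max(abs(weibull_cdf(t, *params0) - weibull_cdf(t, *params1)) for t in points)


def variance_ratio(a, b):
    """Smaller variance divided by larger variance, so the result is in (0, 1]."""
    va, vb = pvariance(a), pvariance(b)
    return min(va, vb) / max(va, vb)


def ms_expansion(times, grades, size, err_th=0.01, rho_th=0.95,
                 max_iter=10000, seed=0):
    rng = random.Random(seed)
    params0 = fit_weibull(times)
    t_max = 2.0 * max(times)
    for _ in range(max_iter):             # step (4): error judgment
        new_t = resample(times, size, rng)
        err = max_cdf_error(params0, fit_weibull(new_t), t_max)
        if err <= err_th:
            break
    else:
        raise RuntimeError("error threshold not met within max_iter")
    for _ in range(max_iter):             # step (5): variance-ratio judgment
        new_g = resample(grades, size, rng)
        rho = variance_ratio(grades, new_g)
        if rho >= rho_th:
            break
    else:
        raise RuntimeError("correlation threshold not met within max_iter")
    return new_t, new_g, err, rho
```

The two loops are kept separate because the text treats the two data sets as independent; the number of retries needed depends on the thresholds, the original sample size, and the fitting choices assumed here.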

Example Analysis
In this subsection, the original failure data of the machine tool and the cooling system, as listed in Tables A1 and A2, are expanded to verify the effectiveness of the proposed sample-expansion method; that is, to verify that the newborn data for the time between failures and the failure-grade index remain consistent with the distribution regularity of the original data.
(1) Distribution verification of time between failures. In the simulation, the expansion capacities for the machine tool and the cooling system are set as 50 and 200, respectively. For the time between failures, taking the error threshold between the failure distribution curves of the original data and the newborn data as a constraint ($\Delta \le 0.01$), the optimal newborn data for the two systems are obtained through eight and three iterations, respectively. Figure 6a,b shows the failure distribution curves of the original data and the newborn data for the two systems. The failure distribution errors for the machine tool and the cooling system are 0.0037 and 0.0057, respectively, which both satisfy the threshold requirement; the two function curves basically coincide, which ensures that the newborn data of the time between failures do not deviate from the original distribution regularity. Moreover, the smaller the size of the original data, the more iterations are required to produce a result that satisfies the constraint. The iteration numbers and the maximum errors under different values of expansion capacity are listed in Table 4, where $N_1$ and $N_2$ are the expansion capacities of the machine tool and the cooling system, respectively. From this table, the errors in all cases satisfy the threshold constraint $\Delta \le 0.01$, and they are basically unrelated to the value of the expansion capacity; this is because the sample is expanded based on random array generation. However, the iteration number is clearly related to the expansion capacity and the original sample size. For example, for the sample expansion of the machine tool, the iteration number increases with the value of $N_1$, but for the cooling system, the iteration number for each expansion capacity is very small, which indicates that the larger the original sample, the easier it is to obtain expanded data that meet the threshold requirement.

(2) Distribution verification of failure-grade index. In the simulation, the expansion capacities of the machine tool and the cooling system are set as 50 and 200. For the failure-grade index, taking the threshold of the coefficient of correlation between the original data and the newborn data as the constraint $\rho \ge 0.95$, the optimal newborn data for the two systems are obtained through 23 and 10 iterations, respectively, and the optimal correlation coefficients are 0.9682 and 0.9815, which both satisfy the given threshold condition. Figure 7a,b shows failure frequency comparisons of the original data and the newborn data of the failure-grade index for the two systems. From the comparison, the failure frequency distributions of the original data and the newborn data are generally the same, which verifies the effectiveness of the proposed modified sample-expansion method; the results also indicate that when the original sample size is small, more iterations are required to find newborn data that meet the constraint. Moreover, the iteration numbers and the correlation coefficients under different values of expansion capacity are listed in Table 5. From this table, the coefficients in all cases satisfy the given threshold condition $\rho \ge 0.95$, and they are also basically unrelated to the value of the expansion capacity. The results also show that as the expansion capacity increases, the iteration number required by the sample expansion increases dramatically.

Novel Reliability-Evaluation Model
In the reliability-evaluation method, the Weibull distribution parameters for failures at each grade are estimated first. In the estimation, the membership degree is introduced to reflect the probability distribution under the corresponding grade. During the data fitting, the value of $y$ is determined for each $x = \ln t$. Then, the parameters $\alpha_j$ and $\beta_j$ of the Weibull distribution for grade-$j$ failures are calculated by linear fitting, and based on this, the failure distribution function $F_j(t)$ is defined considering the influence of the failure on the system function and performance; it depends on the failure grade $j$ and its membership degree $r_{i,j}$. In this distribution function, the term 0.2 represents the evaluation standard value of the failure grade. Finally, the reliability function of grade-$j$ failures is obtained by $R_j(t) = 1 - F_j(t)$, and the overall reliability-evaluation model is then expressed in terms of these grade-level functions.

In summary, the proposed failure-grade fuzzy-evaluation method is used to characterize the multiple states of failures, and the modified sample expansion is the basis of the failure data analysis and the reliability modeling. After the failure grading and sample expansion, the new failure data (including the original sample and the newborn sample) are further handled in the following ways to ensure the accuracy of the distribution-parameter estimation.
(1) Failures with a membership degree of zero at each grade are eliminated. Since the membership degree $r_{i,j}$ of the $i$th failure at the $j$th grade is less than or equal to one and may be zero, the failures with a membership degree of zero at a given grade are removed from that grade to simplify the reliability modeling.
(2) Small-sample cases that remain at some grades after expansion are re-handled, as in Figure 8. In detail, the failure-grade index approximately follows a positively skewed distribution; therefore, the failure samples at some edge grades may still be small samples even after expansion. In order to avoid the serious deviation caused by small-sample cases at some grades, in this work, the failures at a grade with small sample data (in previous works, a sample size K of less than 30 was regarded as the small-sample case) are automatically incorporated into the failure data at the neighboring lower or higher grade, starting at the lowest and highest grades, respectively, until the sample size at each grade satisfies the sample-size requirement.
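One possible reading of the re-handling in step (2) is sketched below: edge grades with fewer than 30 samples are folded into their inner neighbour, first from the lowest grade upward and then from the highest grade downward. The function name `merge_small_grades`, the list-of-lists input, and the decision to leave small interior grades untouched are assumptions of this sketch; the procedure of Figure 8 may differ in detail.

```python
def merge_small_grades(groups, min_size=30):
    """Fold undersized edge grades into the neighbouring grade.

    `groups` is a list of sample lists ordered from grade I to grade V.
    Returns a list of (grade_labels, samples) pairs after merging.
    """
    merged = [([k + 1], list(samples)) for k, samples in enumerate(groups)]

    # Sweep from the lowest grade: merge it upward while it is too small.
    while len(merged) > 1 and len(merged[0][1]) < min_size:
        labels, samples = merged.pop(0)
        merged[0] = (labels + merged[0][0], samples + merged[0][1])

    # Sweep from the highest grade: merge it downward while it is too small.
    while len(merged) > 1 and len(merged[-1][1]) < min_size:
        labels, samples = merged.pop()
        merged[-1] = (merged[-1][0] + labels, merged[-1][1] + samples)

    return merged
```

Each returned pair records which original grades were combined, so the membership degrees attached to the merged samples can still be traced back to their source grade.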

Example Analysis
The machine tool and its cooling system are used as the example cases of this analysis; their original failure data have different features. The original data of the machine tool contain only 13 failure records, so they constitute a small sample. Although the original data of the cooling system contain 67 failures, small-sample cases still exist at some grades after failure grading. Using two example cases makes it possible to verify both the feasibility of the proposed reliability-evaluation method and its adaptability to different sample sizes. Moreover, in each example case, the evaluation results obtained with and without failure grading are compared to show the necessity of considering multi-state failures, and the influences of the expansion randomness, the expansion capacity, and the grade-index values on the reliability-evaluation result are analyzed to verify the effectiveness of the proposed model.
(1) Example case 1 (taking the machine tool as the research object). Since the sample size of the machine tool is small, the sample-expansion capacity is set to 50. The newborn sample is obtained by expanding five times and is then used to generate five failure samples with different expansion capacities (K = 50, 100, 150, 200, 250). During the five expansions, the errors for the time between failures are 0.0022, 0.0043, 0.0065, 0.0093 and 0.0012, and the correlation coefficients for the failure-grade index are 0.9567, 0.9688, 0.9748, 0.9969 and 0.9529, respectively, which indicates that the newborn data do not deviate from the distribution rule of the original data. Finally, a failure sample of 263 failures is obtained by combining the 250 newborn failures with the original 13 failures. Based on the proposed methods, the failure samples at all grades are obtained; their sizes are 45, 175, 185, 87 and 32, all of which are larger than 30. Figure 9 compares the reliability curves obtained before and after considering failure grading (K = 250), that is, by the traditional and the novel reliability-evaluation methods. The difference between the two curves is also calculated and plotted. The comparison shows that the reliability-evaluation value becomes obviously larger after failure grading is considered, and that the difference first increases gradually and then remains basically unchanged over time. The largest difference is 0.3128. This result indicates that the proposed method can increase the accuracy of the reliability evaluation by considering the degree to which failures affect the running of the system.
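The traditional curve in such a comparison comes from a two-parameter Weibull model fitted to the times between failures. As a minimal sketch, the fit can be done by least squares on the linearised distribution function, ln(-ln(1 - F)) = shape * ln(t) - shape * ln(scale). The plotting positions below use Bernard's median-rank approximation, which is an assumption made here; the sketch is the standard unweighted fit and does not include the membership-degree weighting of the proposed method. The function names are illustrative.

```python
import math
from typing import Sequence, Tuple


def fit_weibull(times: Sequence[float]) -> Tuple[float, float]:
    """Return (shape, scale) of a two-parameter Weibull distribution.

    Ordinary least squares on x = ln(t), y = ln(-ln(1 - F)), with F taken from
    Bernard's approximation (i - 0.3) / (n + 0.4). Times must be positive and
    not all equal.
    """
    t_sorted = sorted(times)
    n = len(t_sorted)
    if n < 2 or t_sorted[0] <= 0.0 or t_sorted[0] == t_sorted[-1]:
        raise ValueError("need at least two distinct positive failure times")

    xs = [math.log(t) for t in t_sorted]
    ys = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]

    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))

    shape = sxy / sxx                      # slope of the fitted line
    intercept = y_mean - shape * x_mean    # equals -shape * ln(scale)
    scale = math.exp(-intercept / shape)
    return shape, scale


def weibull_reliability(t: float, shape: float, scale: float) -> float:
    """R(t) = exp(-(t / scale) ** shape) for t >= 0."""
    return math.exp(-((t / scale) ** shape))
```

Evaluating `weibull_reliability` on a common time grid for two fitted models and taking the largest pointwise gap gives a difference figure of the kind reported for the two curves.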

In order to reveal the influence of the modeling process on the evaluation result, the above simulation is repeated ten times, and the evaluation results of the ten simulations are compared. The maximum and minimum fluctuation values of the evaluation are used to show the stability of the algorithm; they are obtained by calculating the differences between the values at the same time from the ten simulations and taking the largest and smallest of these differences. In this case, they are 0.0801 and 0.0112, respectively.

Moreover, the evaluation results based on different expansion capacities (K = 50, 100, 150, 200, 250) are compared. The sample sizes at the five grades under the different expansion capacities are listed in Table 6. The comparison shows that small-sample cases occur more easily at some failure grades when the expansion capacity is small. The reliability curves based on the different expansion capacities (K = 50, 100, 150, 200, 250) are compared in Figure 10. From this figure, the following conclusions can be drawn: (1) from the reliability comparison based on the failure data with and without incorporating the small sample data at some grades (as shown in Figure 6, and corresponding to the two K = 200 cases), the reliability value under one of the two K = 200 cases is obviously smaller than that under the other, which means that although small-sample incorporation is necessary when small samples exist, the way in which they are handled can strongly influence the evaluation accuracy of the reliability level; (2) the reliability value of the machine tool decreases as the expansion capacity increases, which indicates that the capacity has a major influence on the evaluation result, and the larger the sample size, the closer the evaluation result is to the real reliability level (for example, the results under one of the K = 200 cases and under K = 250 are very close); (3) in summary, although increasing the expansion capacity affects the calculation efficiency, the capacity should be large enough that the sample data at each grade are sufficient, so that the result does not deviate from the real level in practical applications.

(2) Example case 2 (taking the cooling system as the research object). Compared with example case 1, the original sample size is larger; therefore, the sample-expansion capacity is set to 200, and the expansion is performed four times. Each expansion ensures that the error and the correlation coefficient satisfy the given threshold constraints. First, four failure samples with a capacity of 200 are obtained. Second, through the generation method shown in Figure 12, six failure samples with a capacity of 400 are obtained; the small sample at Grade-1 arises because the original failures at Grade-1 are few. The sample sizes at all grades of the different failure samples are listed in Table 7.
From the table, we observe that the sample sizes at all grades when K = 400 satisfy the large-sample requirement and that the distribution of the number of failures across grades is generally the same. Figure 13 compares the reliability curves based on different failure samples; the curves are concentrated, and their maximum difference is 0.0787. This comparison demonstrates the usability of the proposed methods for failure grading and reliability evaluation. Figure 14 compares the reliability curves based on different values of the failure-grade index for sample 1. From the comparison, the following conclusions can be drawn: (1) the failure-grade index has a major influence on the evaluation result, and the larger the failure-grade index, the smaller the reliability-evaluation value, which indicates the effectiveness of the proposed method; (2) in all cases, the reliability-evaluation value decreases rapidly with time at the beginning and then flattens out after 1000 h; and (3) compared with example case 2, the reliability curve in example case 1 declines more gently, mainly because the mean time between failures (MTBF) of the cooling system is smaller than that of the machine tool, based on the original sample data of the two systems listed in Tables 2 and 3. The observed value of the MTBF can be calculated by MTBF = (∑ t_i)/n, where t_i is the ith observed time between failures and n is the number of observations. The calculated MTBF values of the machine tool and the cooling system are 619.41 h and 330.63 h, respectively. Therefore, the decline rate of the reliability curve of the cooling system is higher than that of the machine tool, which means that the tendency of the reliability curve is consistent with practical experience. Moreover, when the time is less than 1000 h, the larger the failure-grade index, the higher the decline rate of the reliability over time; when the time is larger than 1000 h, the curves tend to flatten.
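The two summary quantities used in this comparison are simple to compute. The following minimal sketch assumes that the observed MTBF is the arithmetic mean of the recorded times between failures and that the maximum difference between curves is the largest pointwise gap on a shared time grid; the function names and the input layout are assumptions made for illustration.

```python
from typing import Sequence


def observed_mtbf(tbf_hours: Sequence[float]) -> float:
    """Observed MTBF: the sum of the times between failures divided by their count."""
    if not tbf_hours:
        raise ValueError("at least one time between failures is required")
    return sum(tbf_hours) / len(tbf_hours)


def max_pointwise_spread(curves: Sequence[Sequence[float]]) -> float:
    """Largest gap between the highest and lowest curve at any shared time point.

    Each curve is a sequence of reliability values sampled on the same time grid.
    """
    if not curves or len({len(c) for c in curves}) != 1:
        raise ValueError("curves must be non-empty and equally long")
    return max(max(column) - min(column) for column in zip(*curves))


if __name__ == "__main__":
    # Hypothetical inputs, not the failure records of either system.
    print(observed_mtbf([100.0, 200.0, 300.0]))  # 200.0
```

A smaller spread across repeated samples indicates that the evaluation is less sensitive to the randomness of the sample generation.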

Conclusions and Future Works
This work proposed a new reliability-evaluation scheme for equipment considering multi-state failures. The methods included in this scheme were applied, in three sections, to a machine tool and its cooling system to verify their effectiveness. The main conclusions are as follows:
(1) In the failure-grade fuzzy-evaluation method, the failure severity, maintenance time, and expense are considered to establish the failure-grade index model. The simulation cases show that this method is effective for characterizing the state of a failure based on its influence on the system performance or function.
(2) In the modified adaptive small-sample-expansion method, the data of the time between failures and the failure-grade index are expanded through random-number optimization based on error judgement and correlation-coefficient judgement, respectively. The results of the simulation cases indicate that the distributions of the original sample and its corresponding newborn sample are basically the same, which verifies the effectiveness of the proposed sample-expansion method.
(3) In the novel reliability-evaluation modeling method, the distribution parameters are estimated by introducing the failure-grade membership degree; then, the equipment-failure distribution function at each grade is established, and the equipment reliability is finally modeled. Based on the simulation cases, the novel method can more accurately evaluate the reliability level of equipment from sufficient failure data after effective sample expansion, which avoids the over-evaluation of equipment unreliability caused by ignoring multi-state failures. Moreover, the randomness of the sample-expansion process has a minor impact on the reliability-evaluation result; for example, the maximum fluctuation value of the reliability curves of the machine tool is 0.0801 and that of the cooling system is 0.0787, which shows the stability of the proposed reliability-evaluation scheme.
The execution of the above steps depends mainly on the integrity of the original failure data, which include not only the time between failures but also the severity, maintenance time, and expense of each failure. In the failure-grade index definition, equipment differences are considered by setting weight parameters and threshold parameters for the maintenance time and the expense of the failure, which improves the universality of the proposed method across different equipment. Moreover, the adaptive sample-expansion method can effectively solve the general small-sample problem. In summary, as long as the above failure data are fully provided, the proposed method is basically available for the reliability evaluation of any equipment, for example, automotive bearings or engines.
In future work, the failure data of more equipment will be collected during their service stages and used to demonstrate the practical universality of the proposed method and the significance of this work. Moreover, the calculation algorithm will be embedded in the equipment control system to realize real-time quantification and monitoring of the equipment-reliability level, which will also provide a theoretical basis for proposing reliability-improvement strategies.

Figure 1. Traditional equipment-reliability modeling procedure based on distribution functions.

Figure 2. New scheme for the equipment-reliability evaluation.

Figure 3. Plot of failure-grade index and coefficients of factors for machine tool.

Figure 4. Partial plot of failure-grade index and coefficients of factors for machine-tool cooling system.

Figure 5. Interval distribution analysis of the failure-grade index. (a) For the machine tool (13 faults). (b) For the machine-tool cooling system (67 faults).

Figure 6. Failure distribution curves of the original data and the newborn data for two systems. (a) For the machine tool. (b) For the machine-tool cooling system.

Figure 7. Failure frequency comparisons of the original data and the newborn data of failure-grade index for two systems. (a) For the machine tool. (b) For the machine-tool cooling system.

Figure 8. Handling method for small-sample cases after expansion.

Figure 9. Comparison of reliability curves based on the traditional and novel evaluation methods (K = 250).

Figure 11 shows the comparison of reliability curves based on different values of the failure-grade index (0.2, 0.4, 0.6, 0.8, 1.0) when K = 250, that is, the grade indexes of all failures are set to a single value. From the result, the following conclusions can be drawn: (1) the evaluation result is obviously affected by the failure-grade index and decreases as the index value increases, which indicates that the novel model is effective in reliability evaluation; (2) the larger the failure grade, the higher the decline rate of the reliability over time. For example, when the index is 0.2, the evaluation value at t = 2000 h is 16.33% lower than that at t = 100 h, whereas the decline percentages for index values of 0.4, 0.6, 0.8 and 1.0 are 33.45%, 51.42%, 70.28% and 90.13%, respectively.
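The decline percentages quoted above can be read as the relative drop of the reliability-evaluation value between two time points. A minimal sketch under that assumption is given below; the function name and the input values are hypothetical and are not taken from the reliability curves of this work.

```python
def decline_percentage(r_early: float, r_late: float) -> float:
    """Relative drop from r_early to r_late, in percent of r_early."""
    if r_early <= 0.0:
        raise ValueError("the earlier reliability value must be positive")
    return 100.0 * (r_early - r_late) / r_early


if __name__ == "__main__":
    # Hypothetical reliability values at an early and a late time point.
    print(round(decline_percentage(0.8, 0.6), 2))  # 25.0
```

Because the drop is expressed relative to the earlier value, curves that start at different levels can be compared on the same scale.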

Figure 11. Comparison of reliability curves based on different values of failure-grade index when K = 250.

Figure 12. Generation way of failure samples with capacity of 400.

Figure 13. Comparison of reliability curves based on different failure samples.

Figure 14. Comparison of reliability curves based on different values of the failure-grade index for sample 1.

Table 1. Pseudo-code algorithm of the classical bootstrap method.

Table 2. Standard values of failure-grade evaluation.

Table 3. Pseudo-code algorithm of the modified sample-expansion method.

Table 4. Iteration numbers and maximum errors under different values of expansion capacity.

Table 5. Iteration numbers and correlation coefficients under different values of expansion capacity.

Table 6. Sample sizes at all grades under different expansion capacities.

Table 7. Sample sizes at all grades for different failure samples.