Abstract
Many lifetime distribution models have successfully served as population models for risk analysis and reliability mechanisms. We propose a novel estimation procedure for the stress–strength reliability in the case of two independent unit-half-normal distributions with different shape parameters; this family can fit asymmetrical data with either positive or negative skew. We obtain the maximum likelihood estimator of the reliability, its asymptotic distribution, and exact and asymptotic confidence intervals. In addition, confidence intervals of the model parameters are constructed by using bootstrap techniques. We study the performance of the estimators based on Monte Carlo simulations, the mean squared error, average bias and length, and coverage probabilities. Finally, we apply the proposed reliability model to the analysis of burr measurements on iron sheets.
1. Introduction
Recently, ref. [1] introduced a new distribution defined on the unit interval, with one parameter and a simple structure based on the half-normal distribution, called the unit-half-normal distribution, as a good alternative to the Topp–Leone distribution [2], Kumaraswamy distribution [3], unit-logistic distribution [4], beta distribution of two parameters (or the Pearson type IV distribution) [5], unit-Birnbaum–Saunders distribution [6], and unit-Lindley distribution [7], among others. The probability density function (PDF) of the unit-half-normal distribution is as follows:

f(x; θ) = [2/(θ(1 − x)²)] φ(x/(θ(1 − x))), 0 < x < 1, (1)

where θ > 0 is a scale parameter and φ(·) is the PDF of the standard normal distribution. From now on, a random variable X with the PDF defined in (1) will be denoted by X ∼ UHN(θ). The corresponding cumulative distribution function (CDF) is
F(x; θ) = 2Φ(x/(θ(1 − x))) − 1, 0 < x < 1, (2)

where Φ(·) is the CDF of the standard normal distribution. Figure 1 and Figure 2 illustrate some of the possible shapes of the unit-half-normal distribution for selected values of the parameter θ. From these figures, we observe that the PDF shapes are unimodal and asymmetric (left- and right-skewed). As shown by [1], the unit-half-normal distribution belongs to the exponential family of probability distributions.
Figure 1.
Plot of the density function of the UHN distribution for selected values of θ.
Figure 2.
Plot of the density function of the UHN distribution for selected values of θ.
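Although the paper's numerical work uses R [10], the density and distribution function above can be sketched in Python for quick checks. The closed forms below are our reconstruction from the half-normal transform X = Z/(1 + Z) with Z half-normal, and the function names are ours.

```python
import math

def uhn_pdf(x, theta):
    # Density of UHN(theta) on (0, 1), reconstructed from the transform
    # X = Z/(1 + Z) with Z half-normal of scale theta (an assumption
    # consistent with ref. [1]).
    z = x / (theta * (1.0 - x))
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return 2.0 * phi / (theta * (1.0 - x) ** 2)

def uhn_cdf(x, theta):
    # F(x) = 2*Phi(x/(theta*(1 - x))) - 1, which equals erf(z/sqrt(2)).
    z = x / (theta * (1.0 - x))
    return math.erf(z / math.sqrt(2.0))

# Sanity check: the density should integrate to one over (0, 1).
n = 20000
total = sum(uhn_pdf((i + 0.5) / n, 0.7) / n for i in range(n))
```

The midpoint-rule check recovers the unit total mass, and the CDF is monotone on (0, 1), which is a quick way to validate any reimplementation.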
The literature demonstrates that estimation of the stress–strength model R = P(Y < X) has already been performed assuming that X and Y are independent random variables with positive support and different degrees of skewness and kurtosis, described by the same kind of probability distribution. We refer the reader to [8] for a review and for the references supporting this claim. Much less attention has been given to the case where X and Y take values in a limited range, such as proportions, percentages and fractions. The main goal of this work is to develop the inferential procedure for the stress–strength parameter R when X and Y are independent and distributed as UHN(η) and UHN(λ), respectively. The stress–strength parameter plays an important role in reliability analysis: letting Y denote the stress and X the strength, a single active system fails if the applied stress exceeds its strength.
The rest of the paper is structured as follows. The next section presents the entropy and mean residual life of a random variable with the UHN distribution. Then, we present an expression for the stress–strength reliability R, the MLE of R, its exact distribution and some properties, and three algorithms to simulate random variables from R. In the subsequent section, confidence intervals for R are developed by means of exact, asymptotic and bootstrap approaches. Next, computational simulations are presented to evaluate the performance of the MLE and of the exact, asymptotic and bootstrap confidence intervals, followed by a section containing an application in the context of burr measurements on iron sheets. Finally, some concluding remarks are presented in the last section.
2. Entropy and Mean Residual Life
In this section, we present the entropy and mean residual life for a random variable with the UHN(θ) distribution.
2.1. Entropy
The entropy of a random variable X with PDF (1) is a measure of variation of the uncertainty. A large value of the entropy indicates greater uncertainty in the data. Using numerical integration, it is possible to calculate the entropy as a function of the scale parameter θ. The Shannon entropy [9], defined by H(X) = −E[log f(X)], is equal to
Using a Taylor series expansion of the integrand around zero and interchanging the order of summation and integration, the final form of the entropy is given by
Figure 3 shows the value of (3) for true values of θ from zero to one. The exact value was computed by numerical integration; the software R [10] provides this option with the function integrate.
Figure 3.
Entropy values for a range of values of .
We can note that the second- and third-order approximations are not as good as the fourth-order one. However, all approximations are good over much of the considered range of θ.
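The entropy discussion above can be checked numerically. The sketch below computes H(X) = −∫ f(x) log f(x) dx by the midpoint rule, mirroring what R's integrate function does; the density is our reconstruction from the half-normal transform.

```python
import math

def uhn_pdf(x, theta):
    # UHN(theta) density via the half-normal transform (our reconstruction).
    z = x / (theta * (1.0 - x))
    return 2.0 * math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * theta * (1.0 - x) ** 2)

def shannon_entropy(theta, n=20000):
    # H(X) = -integral of f(x)*log f(x) over (0, 1), midpoint rule.
    h = 0.0
    for i in range(n):
        x = (i + 0.5) / n
        f = uhn_pdf(x, theta)
        if f > 0.0:  # f*log(f) -> 0 as f -> 0, so skip underflowed points
            h -= f * math.log(f) / n
    return h

entropies = [shannon_entropy(t) for t in (0.2, 0.5, 0.9)]
```

Under this reconstruction the entropy increases with θ over (0, 1), consistent with the behavior shown in Figure 3.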
2.2. Mean Residual Life
The mean residual life, or life expectancy, is an important characteristic of a model. It gives the expected additional lifetime given that a component has survived until time t. For a non-negative continuous random variable X, the mean residual life function is defined as

m(t) = E(X − t | X > t), (4)

where S(t) = 1 − F(t) > 0 denotes the survival function. The above conditional expectation is given by

m(t) = (1/S(t)) ∫ₜ¹ x f(x) dx − t.
Calculation of the numerator is done in the same way as the calculation of the mean. Thus
where .
Finally, Equation (4) can be written as
The integral in the numerator can be calculated with numerical methods. For more on the mean residual life, we refer our readers to [11], among others.
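The mean residual life can likewise be evaluated numerically. The sketch below assumes the reconstructed survival function and uses the equivalent form m(t) = (1/S(t)) ∫ₜ¹ S(x) dx; at t = 0 it returns the mean of X.

```python
import math

def uhn_sf(x, theta):
    # Survival function S(x) = 1 - F(x), with F(x) = erf(z/sqrt(2)),
    # z = x/(theta*(1 - x))  (our reconstruction of the CDF).
    z = x / (theta * (1.0 - x))
    return 1.0 - math.erf(z / math.sqrt(2.0))

def mean_residual_life(t, theta, n=20000):
    # m(t) = E[X - t | X > t] = (1/S(t)) * integral_t^1 S(x) dx (midpoint rule).
    width = (1.0 - t) / n
    integral = sum(uhn_sf(t + (i + 0.5) * width, theta) for i in range(n)) * width
    return integral / uhn_sf(t, theta)

theta = 0.6
m0 = mean_residual_life(0.0, theta)     # m(0) equals the mean E[X]
m_half = mean_residual_life(0.5, theta)
```

Since S is decreasing, m(t) is bounded above by 1 − t, which the sketch respects.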
3. Stress–Strength Reliability Model
An expression for the stress–strength reliability R is given by the following theorem.
Theorem 1.
Suppose X and Y are independent random variables distributed as UHN(η) and UHN(λ), respectively. Then, the reliability of the system with stress variable (Y) and strength variable (X) is given by
Since η > 0 and λ > 0, we have 0 < R < 1. We can note that R can be computed by Equation (5) when η and λ are known. We then focus on estimating η and λ.
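Equation (5) is not reproduced above. Under the half-normal reduction X/(1 − X) ∼ HN(η) (Lemma 1), the ratio of two independent half-normals is half-Cauchy, which suggests R = P(Y < X) = (2/π) arctan(η/λ). The sketch below is our derivation check against simulation, not the paper's code.

```python
import math, random

def r_closed_form(eta, lam):
    # Under the transform x -> x/(1-x), UHN variables reduce to half-normals;
    # the ratio of two independent half-normals is half-Cauchy, giving
    # R = P(Y < X) = (2/pi)*arctan(eta/lam)  (our derivation sketch).
    return (2.0 / math.pi) * math.atan(eta / lam)

def uhn_sample(theta, size, rng):
    # X = Z/(1 + Z) with Z = theta * |N(0, 1)|  (assumed representation).
    out = []
    for _ in range(size):
        z = theta * abs(rng.gauss(0.0, 1.0))
        out.append(z / (1.0 + z))
    return out

rng = random.Random(2022)
eta, lam = 0.8, 0.5
x = uhn_sample(eta, 200_000, rng)
y = uhn_sample(lam, 200_000, rng)
mc = sum(xi > yi for xi, yi in zip(x, y)) / len(x)  # Monte Carlo P(Y < X)
```

The Monte Carlo frequency and the closed form agree to within simulation noise, supporting the reduction.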
3.1. Maximum Likelihood Estimation of R
Before we move on to calculate the maximum likelihood estimation of R, some results are necessary.
Lemma 1.
If X ∼ UHN(θ), then X/(1 − X) ∼ HN(θ), where HN(θ) denotes the half-normal distribution with scale parameter θ.
Proof of Lemma 1.
See [1]. □
Corollary 1.
If X ∼ UHN(θ), then X²/(θ²(1 − X)²) ∼ χ²₁, where χ²₁ denotes the chi-squared distribution with 1 degree of freedom.
Proof of Corollary 1.
Let W = X²/(θ²(1 − X)²) and write its CDF in terms of the CDF of X given in (2).
The derivative of this CDF gives the PDF of the χ²₁ distribution. □
Corollary 2.
If X₁, …, Xₙ is a random sample from UHN(η) and Y₁, …, Y_m is a random sample from UHN(λ), with the Xᵢ independent of the Yⱼ, then
- 1.
- and
- 2.
- and
- 3.
- 4.
- .
Now, suppose x₁, …, xₙ is a random sample of size n from UHN(η) and y₁, …, y_m is an independent random sample of size m from UHN(λ). The log-likelihood is given by
The maximum likelihood estimators (MLEs) η̂ and λ̂ of η and λ, respectively, are the solutions of the following system of equations:
The solution to the system of equations is
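Since Equations (6) and (7) are not reproduced above, the sketch below implements what the closed-form solution amounts to under the half-normal reduction: the half-normal MLE applied to the transformed data zᵢ = xᵢ/(1 − xᵢ). The names and sample sizes are ours.

```python
import math, random

def uhn_mle(sample):
    # Closed-form MLE of the UHN scale parameter: the half-normal MLE
    # applied to z_i = x_i/(1 - x_i)  (our reconstruction; Equations
    # (6) and (7) are not reproduced in the text).
    n = len(sample)
    return math.sqrt(sum((x / (1.0 - x)) ** 2 for x in sample) / n)

def uhn_sample(theta, size, rng):
    # X = Z/(1 + Z) with Z = theta * |N(0, 1)|  (assumed representation).
    out = []
    for _ in range(size):
        z = theta * abs(rng.gauss(0.0, 1.0))
        out.append(z / (1.0 + z))
    return out

rng = random.Random(7)
eta_hat = uhn_mle(uhn_sample(0.8, 50_000, rng))
lam_hat = uhn_mle(uhn_sample(0.5, 50_000, rng))
```

With large samples the estimates land close to the true scales, consistent with the consistency of the MLE.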
Corollary 3.
If X₁, …, Xₙ ∼ UHN(η) and Y₁, …, Y_m ∼ UHN(λ), with the Xᵢ independent of the Yⱼ, then
- 1.
- , then and .
- 2.
- , then and ,
where χ_s denotes the chi distribution with s degrees of freedom.
Remark 1.
From Corollary 3, we have E(η̂) = η √(2/n) Γ((n + 1)/2)/Γ(n/2) and E(λ̂) = λ √(2/m) Γ((m + 1)/2)/Γ(m/2). Therefore, both η̂ and λ̂ are biased estimators of η and λ, respectively.
3.2. Confidence Intervals for η and λ
Let x₁, …, xₙ and y₁, …, y_m be random samples from UHN(η) and UHN(λ), respectively. In addition, let the two samples be independent. From Corollary 3, we have n η̂²/η² ∼ χ²_n and m λ̂²/λ² ∼ χ²_m. Taking these as two pivotal quantities, the confidence intervals for η and λ are given by (η̂ √(n/χ²_{1−α/2,n}), η̂ √(n/χ²_{α/2,n})) and (λ̂ √(m/χ²_{1−α/2,m}), λ̂ √(m/χ²_{α/2,m})), respectively, where χ²_{α/2,a} and χ²_{1−α/2,a} are the lower and upper (α/2)th percentiles of a chi-square distribution with a degrees of freedom.
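A sketch of these pivot-based intervals, using the Wilson–Hilferty approximation for chi-square quantiles so that no external statistics package is needed (an approximation of ours, not the paper's method):

```python
import math, statistics

def chi2_quantile(p, k):
    # Wilson-Hilferty approximation to the chi-square quantile; used here
    # to keep the sketch dependency-free (a stats package would normally
    # supply the exact quantile).
    z = statistics.NormalDist().inv_cdf(p)
    return k * (1.0 - 2.0 / (9.0 * k) + z * math.sqrt(2.0 / (9.0 * k))) ** 3

def scale_ci(theta_hat, n, alpha=0.05):
    # Invert the pivot n*theta_hat^2/theta^2 ~ chi2_n (our reading of Corollary 3).
    lo = theta_hat * math.sqrt(n / chi2_quantile(1.0 - alpha / 2.0, n))
    hi = theta_hat * math.sqrt(n / chi2_quantile(alpha / 2.0, n))
    return lo, hi

lo, hi = scale_ci(0.8, 30)
```

For n = 30 the Wilson–Hilferty values agree with tabulated chi-square percentiles to about two decimal places, so the interval is close to exact.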
3.3. Exact PDF for R
Lemma 2.
Let then the PDF of is given by
for , where , and .
Proof.
Using Corollary 2, we have
The derivative of gives the PDF in Equation (9). □
Remark 2.
Note that the random variable is a ratio of independent generalized gamma random variables, since each can be written in generalized gamma form. Following [13], the ratio of two independent generalized gamma random variables has a generalized F distribution.
Proposition 1.
A random variable R follows a distribution, denoted as , if its PDF is given by
for , where and .
Proof of Proposition 1.
Using Lemma 2 and Equation (8), we have the result. □
With a little algebraic manipulation and the help of Maple or Mathematica software, it can be proved that the expectation of R is given by
where ₂F₁ is the Gauss hypergeometric function, such that

₂F₁(a, b; c; z) = Σ_{k≥0} [(a)ₖ (b)ₖ / (c)ₖ] zᵏ/k!,

and (x)ₖ is the Pochhammer symbol, or ascending factorial. It is defined by (x)₀ = 1 and by (x)ₖ = x(x + 1)⋯(x + k − 1) for k ≥ 1. For all integers k ≥ 0, we can write simply (x)ₖ = Γ(x + k)/Γ(x).
We may generate values of R using different procedures. Next, we describe Algorithms 1–3.
Algorithm 1: Algorithm to generate observations from R.
Require: Initialize the algorithm by fixing η, λ, n and m.
Algorithm 2: Algorithm to generate observations from R.
Require: Initialize the algorithm by fixing η, λ, n and m.
Algorithm 3: Algorithm to generate observations from R.
Require: Initialize the algorithm by fixing η, λ, n and m.
4. Interval Estimation of R
In this section, we consider the interval estimation of R based on exact, asymptotic and bootstrap methods.
4.1. Exact Confidence Interval
Let us assume that X ∼ UHN(η) and Y ∼ UHN(λ), and that we have a sample x₁, …, xₙ from the distribution of X and a sample y₁, …, y_m from the distribution of Y. In addition, let the two samples be independent. From Corollary 1, the quantity W = (λ̂²/λ²)/(η̂²/η²) follows an F distribution with m and n degrees of freedom. Taking W as a pivotal quantity, a confidence interval for R is given by
where F_{α/2}(m, n) and F_{1−α/2}(m, n) denote, respectively, the lower and upper (α/2)th percentiles of the F distribution with m and n degrees of freedom.
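A sketch of the exact interval under our reconstruction of the pivot and of R = (2/π) arctan(η/λ); the F quantiles are obtained by Monte Carlo from chi-square draws so the example stays dependency-free.

```python
import math, random

def f_quantiles(n, m, probs, reps=100_000, seed=1):
    # Monte Carlo F(n, m) quantiles from chi-square draws (gammavariate with
    # shape k/2 and scale 2 is chi-square with k df); keeps the sketch
    # dependency-free at the cost of a little quantile noise.
    rng = random.Random(seed)
    draws = sorted((rng.gammavariate(n / 2.0, 2.0) / n) /
                   (rng.gammavariate(m / 2.0, 2.0) / m) for _ in range(reps))
    return [draws[int(p * (reps - 1))] for p in probs]

def exact_ci_r(eta_hat, lam_hat, n, m, alpha=0.05):
    # Invert the pivot (eta_hat^2/eta^2)/(lam_hat^2/lam^2) ~ F(n, m), then map
    # through R = (2/pi)*arctan(eta/lam)  (both are our reconstructions).
    f_lo, f_hi = f_quantiles(n, m, [alpha / 2.0, 1.0 - alpha / 2.0])
    ratio = eta_hat / lam_hat
    return ((2.0 / math.pi) * math.atan(ratio / math.sqrt(f_hi)),
            (2.0 / math.pi) * math.atan(ratio / math.sqrt(f_lo)))

lo, hi = exact_ci_r(0.8, 0.5, 30, 30)
r_hat = (2.0 / math.pi) * math.atan(0.8 / 0.5)
```

Because arctan is monotone, inverting the pivot on the ratio scale transfers directly to an interval for R.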
4.2. Asymptotic Distribution and Confidence Interval
In this subsection, we first compute the asymptotic distribution of the MLEs η̂ and λ̂, and then we study the asymptotic distribution of R̂. From the asymptotic distribution of R̂, we obtain the asymptotic confidence interval of R.
Let I(η, λ) be the expected Fisher’s information matrix of (η, λ). The elements of the expected Fisher’s information matrix are
Under some regularity conditions, we have
The point estimator of R is obtained by evaluating (5) at the MLEs η̂ and λ̂. We obtain the asymptotic confidence interval for R following the usual delta-method procedure (see [14])
This gives
Thus, we obtain the following result
Hence, the asymptotic confidence interval for R is given by
where z_{1−α/2} is the (1 − α/2)th percentile of the standard normal distribution and R̂ is given by (8).
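A sketch of the delta-method interval, assuming avar(η̂) ≈ η²/(2n) and avar(λ̂) ≈ λ²/(2m) from the chi-square pivots; Equation (12) itself is not reproduced here, so this is our reconstruction.

```python
import math, statistics

def asymptotic_ci_r(eta_hat, lam_hat, n, m, alpha=0.05):
    # Delta-method interval around R_hat = (2/pi)*arctan(eta_hat/lam_hat),
    # using avar(eta_hat) ~ eta^2/(2n) and avar(lam_hat) ~ lam^2/(2m)
    # (our reconstruction; Equation (12) is not reproduced in the text).
    r_hat = (2.0 / math.pi) * math.atan(eta_hat / lam_hat)
    denom = (eta_hat ** 2 + lam_hat ** 2) ** 2
    grad_sq = (2.0 / math.pi) ** 2 * (eta_hat * lam_hat) ** 2 / denom
    var = grad_sq * (1.0 / (2.0 * n) + 1.0 / (2.0 * m))
    z = statistics.NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half = z * math.sqrt(var)
    return max(0.0, r_hat - half), min(1.0, r_hat + half)

lo, hi = asymptotic_ci_r(0.8, 0.5, 30, 30)
r_hat = (2.0 / math.pi) * math.atan(0.8 / 0.5)
```

Truncating at 0 and 1 keeps the interval inside the parameter space, a common fix for delta-method intervals on bounded quantities.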
4.3. Bootstrap Confidence Intervals
Maximum likelihood estimation is a standard statistical method. However, in many practical situations the sample size is not large, so large-sample inference such as MLE-based asymptotic estimation may not be suitable and may even be misleading. In this subsection, parametric and nonparametric bootstrap confidence intervals are constructed for the unknown parameters; see [15,16] for details.
4.3.1. Parametric Bootstrap Sampling Algorithm
Next, to generate parametric bootstrap samples of η̂, λ̂ and R̂, as suggested by [16], from the given independent random samples x from UHN(η) and y from UHN(λ), we use the following method.
- Stage 1 Compute the MLEs of η and λ, say η̂ and λ̂, based on the data x and y.
- Stage 2 Based on η̂ and λ̂, generate samples x* from UHN(η̂) and y* from UHN(λ̂) by inverting the CDF at independent uniform observations of sample sizes n and m, respectively.
- Stage 3 Compute the MLEs of η and λ, say η̂* and λ̂*, based on the data x* and y*, respectively.
- Stage 4 Compute the MLE of R, say R̂*, based on η̂* and λ̂*.
- Stage 5 Repeat Stages 2 to 4 B times to generate B bootstrap estimates of η, λ and R.
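The five stages can be sketched as follows; the inverse-CDF step and the expression for R̂ are our reconstructions, and the synthetic data are hypothetical.

```python
import math, random, statistics

def uhn_mle(data):
    # Half-normal MLE on z_i = x_i/(1 - x_i) (our reconstruction of Eqs. (6)-(7)).
    return math.sqrt(sum((x / (1.0 - x)) ** 2 for x in data) / len(data))

def uhn_inv_cdf(u, theta):
    # Stage 2 inversion: with F(x) = 2*Phi(x/(theta*(1 - x))) - 1 (assumed CDF),
    # F^{-1}(u) = theta*q / (1 + theta*q), where q = Phi^{-1}((u + 1)/2).
    q = statistics.NormalDist().inv_cdf((u + 1.0) / 2.0)
    return theta * q / (1.0 + theta * q)

def parametric_bootstrap(x, y, B=500, seed=3):
    rng = random.Random(seed)
    eta_hat, lam_hat = uhn_mle(x), uhn_mle(y)                 # Stage 1
    reps = []
    for _ in range(B):                                        # Stage 5
        xb = [uhn_inv_cdf(rng.random(), eta_hat) for _ in x]  # Stage 2
        yb = [uhn_inv_cdf(rng.random(), lam_hat) for _ in y]
        eb, lb = uhn_mle(xb), uhn_mle(yb)                     # Stage 3
        # Stage 4: R = (2/pi)*arctan(eta/lam) is our reconstruction of Eq. (8).
        reps.append((eb, lb, (2.0 / math.pi) * math.atan(eb / lb)))
    return eta_hat, lam_hat, reps

rng0 = random.Random(11)
x = [uhn_inv_cdf(rng0.random(), 0.8) for _ in range(40)]  # hypothetical data
y = [uhn_inv_cdf(rng0.random(), 0.5) for _ in range(40)]
eta_hat, lam_hat, reps = parametric_bootstrap(x, y)
```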
4.3.2. Nonparametric Bootstrap Sampling Algorithm
Next, we describe the steps to obtain nonparametric bootstrap samples of η̂, λ̂ and R̂.
- Stage 1 Draw random samples with replacement x* and y* from the original data x and y, respectively.
- Stage 2 Compute the bootstrap estimates of η and λ, say η̂* and λ̂*, based on the data x* and y*, respectively.
- Stage 3 Using η̂* and λ̂* and Equation (8), compute the bootstrap estimate of R, say R̂*.
- Stage 4 Repeat Stages 1 to 3 B times to generate B bootstrap estimates of η, λ and R.
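The nonparametric stages can be sketched as follows, again with R̂ = (2/π) arctan(η̂/λ̂) as our reconstruction of Equation (8) and hypothetical data.

```python
import math, random, statistics

def uhn_mle(data):
    # Half-normal MLE on the transformed data (our reconstruction).
    return math.sqrt(sum((x / (1.0 - x)) ** 2 for x in data) / len(data))

def nonparametric_bootstrap(x, y, B=500, seed=5):
    rng = random.Random(seed)
    reps = []
    for _ in range(B):
        xb = rng.choices(x, k=len(x))      # Stage 1: resample with replacement
        yb = rng.choices(y, k=len(y))
        eb, lb = uhn_mle(xb), uhn_mle(yb)  # Stage 2
        # Stage 3: R = (2/pi)*arctan(eta/lam) (our reconstruction of Eq. (8)).
        reps.append((2.0 / math.pi) * math.atan(eb / lb))
    return reps

rng0 = random.Random(9)
x = [(z := 0.8 * abs(rng0.gauss(0.0, 1.0))) / (1.0 + z) for _ in range(40)]
y = [(z := 0.5 * abs(rng0.gauss(0.0, 1.0))) / (1.0 + z) for _ in range(40)]
reps = nonparametric_bootstrap(x, y)
boot_se = statistics.stdev(reps)           # Stage 4 output: bootstrap SE of R
```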
Now, we propose different types of bootstrap confidence intervals for the parameter R using the parametric and nonparametric bootstrap samples. The confidence intervals for η and λ are computed identically. We denote by {R̂*₁, …, R̂*_B} the set of bootstrap estimates of R, denote the MLE obtained from the original dataset by R̂, and assume that the confidence level is 1 − α.
Bootstrap-t confidence interval. The bootstrap-t confidence interval reproduces the way the standard t confidence interval is constructed. The t-like critical value and the standard error of R̂ are computed from the bootstrap estimates. We obtain the bootstrap standard error as follows:
To find the t-like critical value, denoted by t*, we standardize the bootstrap estimates by using
Then, we obtain t* from the bootstrap estimates:
Then, we obtain the bootstrap-t confidence interval:
Bootstrap percentile confidence interval. To obtain the bootstrap percentile confidence interval [17] of R, we simply find the (α/2)th and (1 − α/2)th percentiles of the set of bootstrap estimates of R, denoted by R̂*_(α/2) and R̂*_(1−α/2). The simple bootstrap percentile confidence interval is then (R̂*_(α/2), R̂*_(1−α/2)).
Bias-Corrected and Accelerated Bootstrap (BCa) Method. To overcome the coverage issues of percentile bootstrap CIs, the BCa method corrects for both bias and skewness of the bootstrap parameter estimates by incorporating a bias-correction factor ẑ₀ and an acceleration factor â (see [17,18]). The bias-correction factor is estimated from the proportion of bootstrap estimates less than the original parameter estimate R̂,
where Φ⁻¹ is the inverse CDF of the standard normal distribution. We can estimate the acceleration factor â through jackknife, or leave-one-out, resampling, which involves generating n replicates of the original sample, where n is the number of observations in the sample. The first jackknife replicate is obtained by leaving out the first case of the original sample, the second by leaving out the second case, and so on, until n samples of size n − 1 are obtained. For each of the jackknife resamples, the corresponding estimate of R is obtained. The average of these estimates is
Then, the acceleration factor is calculated as follows
With the values of ẑ₀ and â, the adjusted percentile levels a₁ and a₂ are calculated.
Here, z_q is the qth percentile point of the standard normal distribution. Then, a BCa confidence interval of R is given by the interval formed by the a₁th and a₂th percentiles of the bootstrap estimates. For more on different types of confidence intervals, see [19], among others.
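A sketch of the BCa construction, shown here for the scale parameter of a single sample; the quantile indexing is a simple approximation and the data are hypothetical.

```python
import math, random, statistics

def uhn_mle(data):
    # Half-normal MLE on the transformed data (our reconstruction).
    return math.sqrt(sum((v / (1.0 - v)) ** 2 for v in data) / len(data))

def bca_interval(boot, theta_hat, jackknife, alpha=0.05):
    # BCa interval from bootstrap estimates, the original estimate,
    # and jackknife (leave-one-out) estimates.
    nd = statistics.NormalDist()
    B = len(boot)
    prop = sum(b < theta_hat for b in boot) / B
    prop = min(max(prop, 0.5 / B), 1.0 - 0.5 / B)    # keep inv_cdf finite
    z0 = nd.inv_cdf(prop)                             # bias-correction factor
    jbar = statistics.mean(jackknife)
    num = sum((jbar - j) ** 3 for j in jackknife)
    den = 6.0 * sum((jbar - j) ** 2 for j in jackknife) ** 1.5
    a = 0.0 if den == 0.0 else num / den              # acceleration factor
    s = sorted(boot)
    bounds = []
    for z in (nd.inv_cdf(alpha / 2.0), nd.inv_cdf(1.0 - alpha / 2.0)):
        adj = nd.cdf(z0 + (z0 + z) / (1.0 - a * (z0 + z)))  # adjusted level
        bounds.append(s[min(B - 1, max(0, int(adj * B)))])
    return bounds[0], bounds[1]

rng = random.Random(4)
x = [(z := 0.7 * abs(rng.gauss(0.0, 1.0))) / (1.0 + z) for _ in range(40)]
theta_hat = uhn_mle(x)
boot = [uhn_mle(rng.choices(x, k=len(x))) for _ in range(1000)]
jack = [uhn_mle(x[:i] + x[i + 1:]) for i in range(len(x))]
lo, hi = bca_interval(boot, theta_hat, jack)
```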
5. Simulation Study
In this section, we present a small Monte Carlo simulation study to illustrate the behavior of the different estimates for different sample sizes. The simulation studies were conducted using 10,000 samples from UHN(η) and UHN(λ). The sample sizes were combinations of n and m, with n = 15, 20, 30, 50, 100 and m = 15, 20, 30, 50, 100. In all cases, we take fixed values of η and λ, which determine the true value of R. In [1], the reader can find a simulation study for different values of the shape parameter, including values greater than 1. From each sample, we estimate η and λ using the MLE Equations (6) and (7). Once η and λ are estimated, we compute the MLE of R using (8). We also report the average biases and mean squared errors (MSEs) of the MLE and the parametric and nonparametric bootstrap estimates (Par.Boot/Npar.Boot) of η, λ and R in Table 1, over the 10,000 replications. We obtain the 95% confidence intervals based on the exact and asymptotic distributions of the estimators of η, λ and R, and also based on the parametric and nonparametric bootstrap methods. The average confidence lengths and coverage probabilities of these 95% confidence intervals are reported in Table 2. For the bootstrap methods, estimates and confidence intervals were computed based on B bootstrap replications. Some points are quite clear from this simulation. Even for small sample sizes, the performance of the MLEs and bootstrap methods is quite satisfactory in terms of biases and MSEs. In addition, for all methods, the average biases and MSEs decrease as the sample sizes n and m increase, which verifies the consistency property of the MLEs of η, λ and R. It is observed that the bootstrap methods behave almost identically with respect to both biases and MSEs.
Table 1.
Average biases and MSE values (within brackets) for the parameters η, λ and R at the chosen true values.
Table 2.
Average confidence length and coverage probabilities of confidence intervals using exact, asymptotic and various parametric and non-parametric bootstrap methods.
We also compute the confidence intervals and the corresponding coverage probabilities by different methods. For R, the exact confidence interval was computed using (11) and the asymptotic confidence interval using (12). For η and λ, we compute the exact confidence intervals using the formulae of Section 3.2; for the asymptotic confidence intervals, we use the formulae in [1]. For the bootstrap methods, the confidence intervals are computed using the formulae of Section 4.3. In this case, all eight confidence intervals behave very similarly in terms of average confidence lengths and coverage probabilities.
The asymptotic confidence interval provides the shortest length in comparison with the exact confidence interval, whereas in terms of coverage probability, the exact confidence interval performs better than the asymptotic one. On the other hand, among the bootstrap methods considered here, the bootstrap-p method performed well compared with the bootstrap-t and BCa methods.
6. An Illustrative Example
Cutting processes are those where a large enough force is applied to a piece of raw metal, usually sheet metal, to cause the material to fail. One of the most common cutting processes is shearing, which is performed by applying a shearing force to metal sheets [20]. In this section, we apply our procedure to two real-life data sets to illustrate the implementation of our methods. The two data sets were first introduced and studied by [21] for burr measurements on iron sheets. The first data set consists of 50 observations of burr (in millimeters) with a hole diameter of 12 mm and a sheet thickness of 3.15 mm. We refer to it as data set 1; it is given in Table 3. The second data set consists of 50 observations with a hole diameter of 9 mm and a sheet thickness of 2 mm. We refer to it as data set 2; it is given in Table 4. Hole diameter readings are taken on jobs with respect to one hole, selected and fixed as per a predetermined orientation. The two data sets relate to two different machines under comparison [21]; see [21] for the technical details of the measurements.
Table 3.
Data set 1.
Table 4.
Data set 2.
First of all, we conduct a one-sample Kolmogorov–Smirnov (K-S) goodness-of-fit test of the UHN distribution on each of the two data sets. We report the MLEs, the estimates using parametric and nonparametric bootstrap methods, and their corresponding standard errors (S.E.) of the model parameters, as well as the p-value (pval) and the test statistic (D) of the K-S goodness-of-fit test for both data sets, in Table 5. The K-S statistics (based on the MLE of the parameter) and the corresponding p-values indicate that the unit-half-normal distribution fits both data sets reasonably well. Point estimates of η, λ and R are similar for all the methods considered, but the standard errors of the nonparametric bootstrap estimates are smaller than those of the MLEs and the parametric bootstrap estimates. The confidence intervals for η, λ and R at the 95% confidence level are reported in Table 6. Notice that the length of the exact confidence interval is larger than that of the asymptotic one, as expected. In addition, the confidence intervals of the parametric bootstrap methods are larger than those of the nonparametric bootstrap methods.
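The one-sample K-S statistic against a fitted UHN distribution can be sketched as follows; since Tables 3 and 4 are not reproduced in this extract, the example runs on a simulated burr-like sample with a hypothetical scale.

```python
import math, random

def uhn_cdf(x, theta):
    # F(x) = 2*Phi(x/(theta*(1 - x))) - 1 = erf(z/sqrt(2))  (assumed form).
    z = x / (theta * (1.0 - x))
    return math.erf(z / math.sqrt(2.0))

def uhn_mle(data):
    # Half-normal MLE on the transformed data (our reconstruction).
    return math.sqrt(sum((v / (1.0 - v)) ** 2 for v in data) / len(data))

def ks_statistic(data, theta):
    # One-sample Kolmogorov-Smirnov D against the fitted UHN CDF.
    n = len(data)
    d = 0.0
    for i, x in enumerate(sorted(data), start=1):
        f = uhn_cdf(x, theta)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# Simulated burr-like sample (n = 50, hypothetical scale 0.1).
rng = random.Random(12)
sample = [(z := 0.1 * abs(rng.gauss(0.0, 1.0))) / (1.0 + z) for _ in range(50)]
theta_hat = uhn_mle(sample)
d_stat = ks_statistic(sample, theta_hat)
```

Small values of D relative to the K-S critical value at n = 50 indicate an adequate fit, which is the conclusion the paper reaches for both data sets.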
Table 5.
Maximum likelihood (MLE), parametric (Par.Boot) and non-parametric bootstrap (Npar.Boot) estimates (with standard errors), statistics (D) and p-values (pval) of the K-S test of goodness-of-fit of the two fitted distributions in both data sets.
Table 6.
The exact, asymptotic and various parametric and non-parametric bootstrap confidence intervals of η, λ and R at the 95% confidence level.
7. Concluding Remarks
In this work, we studied different estimators of the stress–strength reliability R, considering that both random variables X and Y follow unit-half-normal distributions with different shape parameters. A procedure to obtain the MLEs of the unknown parameters is presented. Moreover, the MLE of R and its exact and asymptotic distributions are deduced. This allows us to compute the exact and asymptotic confidence intervals (CIs). Additionally, based on parametric and nonparametric bootstrap methods, we are able to compute estimates of R and their respective CIs. The simulation study shows that the performance of the MLEs, in terms of biases and MSEs, is quite satisfactory. We also observe a decrease in the average biases and MSEs as the sample size increases. From the point of view of biases and MSEs, we noticed similar performance using the MLEs and the bootstrap methods. We studied, using different methods, the CIs and the corresponding coverage probabilities. We observed similar performance in terms of average confidence lengths and coverage probabilities for all eight CIs considered in this work. Among the CIs of R developed in this work, our preference is for the nonparametric bootstrap method.
It was observed that the MLE of the shape parameter of the UHN distribution is biased. Although the MLE possesses a number of attractive limiting properties (asymptotic unbiasedness, consistency, and asymptotic normality), many of these properties require very large sample sizes. Such properties, in particular unbiasedness, may not hold for small or even moderate sample sizes (see [22]), which are more common in real data applications. Bias-corrected techniques for the MLEs are therefore desirable in practice, especially when the sample size is small; see, for example, refs. [23,24,25,26,27] and references therein. Bias correction is an important topic for the UHN distribution but is outside the scope of this article.
Author Contributions
Created and conceptualized the idea, R.d.l.C. and H.S.S.; data curation, R.d.l.C. and H.S.S.; formal analysis, R.d.l.C. and H.S.S.; methodology, R.d.l.C., H.S.S. and C.M.; software, R.d.l.C. and H.S.S.; supervision, R.d.l.C. and H.S.S.; validation, R.d.l.C., H.S.S. and C.M.; visualization, R.d.l.C. and H.S.S.; writing—original draft, R.d.l.C., H.S.S. and C.M.; writing—review and editing, R.d.l.C., H.S.S. and C.M. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by ANID/FONDECYT/1181662 and ANID/FONDECYT/1190801 (Chile).
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
All data used to support the findings of the study are available within the article.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Bakouch, H.S.; Nik, A.S.; Asgharzadeh, A.; Salinas, H.S. A flexible probability model for proportion data: Unit-half-normal distribution. Commun. Stat. Case Stud. Data Anal. Appl. 2021, 7, 271–288. [Google Scholar] [CrossRef]
- Topp, C.W.; Leone, F.C. A family of J-Shaped frequency functions. J. Am. Stat. Assoc. 1955, 50, 209–219. [Google Scholar] [CrossRef]
- Kumaraswamy, P. A generalized probability density function for double-bounded random processes. J. Hydrol. 1980, 46, 79–88. [Google Scholar] [CrossRef]
- Tadikamalla, P.R.; Johnson, M.L. Systems of frequency curves generated by transformations of Logistic variables. Biometrika 1982, 69, 461–465. [Google Scholar] [CrossRef]
- Johnson, N.L.; Kotz, S.; Balakrishnan, N. Continuous Univariate Distributions, 2nd ed.; Wiley: New York, NY, USA, 1994; Volume 1. [Google Scholar]
- Mazucheli, J.; Menezes, A.F.B.; Dey, S. The unit-Birnbaum-Saunders distribution with applications. Chil. J. Stat. 2018, 9, 47–57. [Google Scholar]
- Mazucheli, J.; Menezes, A.F.B.; Chakraborty, S. On the one parameter unit-Lindley distribution and its associated regression model for proportion data. J. Appl. Stat. 2019, 46, 700–714. [Google Scholar] [CrossRef] [Green Version]
- Kotz, S.; Lumelskii, Y.; Pensky, M. The Stress-Strength Model and Its Generalizations: Theory and Applications; World Scientific Publishing: Singapore, 2003. [Google Scholar]
- Shannon, C.E. A Mathematical Theory of Communication. Bell Syst. Tech. J. 1948, 27, 379–423. [Google Scholar] [CrossRef] [Green Version]
- R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2021; Available online: https://www.R-project.org/ (accessed on 10 January 2022).
- Ahsanullah, M.; Aazad, A.A.; Kibria, B.M.G. A Note on Mean Residual Life of the k out of n System. Bull. Malays. Math. Sci. Soc. 2013, 37, 83–91. [Google Scholar]
- Casella, G.; Berger, R. Statistical Inference; Duxbury Press: Belmont, CA, USA, 1990. [Google Scholar]
- Malik, H.J. Exact distributions of the quotient of independent generalized gamma variables. Can. Math. Bull. 1967, 10, 463–465. [Google Scholar] [CrossRef]
- Rao, C.R. Linear Statistical Inference and Its Applications; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2002. [Google Scholar]
- Davison, A.; Hinkley, D. Bootstrap Methods and Their Application; Cambridge University Press: Cambridge, UK, 1997. [Google Scholar]
- Efron, B.; Tibshirani, R.J. An Introduction to the Bootstrap; Chapman and Hall: New York, NY, USA, 1993. [Google Scholar]
- Efron, B. The Jackknife, the Bootstrap, and Other Resampling Plans; Society of Industrial and Applied Mathematics: Philadelphia, PA, USA, 1982. [Google Scholar]
- Efron, B. Better bootstrap confidence intervals. J. Am. Stat. Assoc. 1987, 82, 171–185. [Google Scholar] [CrossRef]
- Almonte, C.; Kibria, B.M.G. On some classical, bootstrap and transformation confidence intervals for estimating the mean of an asymmetrical population. Model Assist. Stat. Appl. 2009, 4, 91–104. [Google Scholar] [CrossRef]
- CustomPart.Net. Sheet Metal Cutting (Shearing). Available online: https://www.custompartnet.com/wu/sheet-metal-shearing (accessed on 3 January 2022).
- Dasgupta, R. On the distribution of Burr with applications. Sankhya B 2011, 73, 1–19. [Google Scholar] [CrossRef]
- Kay, S. Asymptotic maximum likelihood estimator performance for chaotic signals in noise. IEEE Trans. Signal Process. 1995, 43, 1009–1012. [Google Scholar] [CrossRef]
- Mazucheli, J.; Menezes, A.F.B.; Dey, S. Improved Maximum Likelihood Estimators for the Parameters of the Unit-Gamma Distribution. Commun. Stat. Theory Methods 2018, 47, 3767–3778. [Google Scholar] [CrossRef]
- Giles, D.E.; Feng, H.; Godwin, R.T. On the Bias of the Maximum Likelihood Estimator for the Two-Parameter Lomax Distribution. Commun. Stat. Theory Methods 2013, 42, 1934–1950. [Google Scholar] [CrossRef] [Green Version]
- Giles, D.E. Bias Reduction for the Maximum Likelihood Estimators of the Parameters in the Half-Logistic Distribution. Commun. Stat. Theory Methods 2012, 41, 212–222. [Google Scholar] [CrossRef]
- Lemonte, A.J. Improved point estimation for the Kumaraswamy distribution. J. Stat. Comput. Simul. 2011, 81, 1971–1982. [Google Scholar] [CrossRef]
- Firth, D. Bias reduction of maximum likelihood estimates. Biometrika 1993, 80, 27–38. [Google Scholar] [CrossRef]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).