Article

Inference for the Process Performance Index of Products on the Basis of Power-Normal Distribution

1
School of Management, Xiamen University, Xiamen 361005, China
2
Data Mining Research Center, Xiamen University, Xiamen 361005, China
3
School of Mathematics and Statistics, Northeast Petroleum University, Daqing 163318, China
4
Department of Statistics, Tamkang University, Tamsui District, New Taipei City 251301, Taiwan
*
Authors to whom correspondence should be addressed.
Mathematics 2022, 10(1), 35; https://doi.org/10.3390/math10010035
Submission received: 22 November 2021 / Revised: 20 December 2021 / Accepted: 21 December 2021 / Published: 23 December 2021

Abstract

The process performance index (PPI) is a simple metric that is directly linked to the conforming rate of products. The properties of the PPI have been well studied for the normal distribution and for other widely used lifetime distributions, such as the Weibull, gamma, and Pareto distributions. In this study, the quality characteristic of the product is assumed to follow a power-normal distribution, and statistical inference procedures for the PPI are established under this assumption. The maximum likelihood estimation method for the model parameters and PPI is investigated and the exact Fisher information matrix is derived. We discuss the drawbacks of using the exact Fisher information matrix to obtain confidence intervals of the model parameters. The parametric bootstrap percentile and bootstrap bias-corrected percentile methods are proposed to obtain approximate confidence intervals for the model parameters and PPI. Monte Carlo simulations are conducted to evaluate the performance of the proposed methods. An example concerning the flow width of the resist in the hard-bake process is used for illustration.

1. Introduction

Control chart methods have been widely used as an online tool for process monitoring. After confirming that the process is in control, practitioners can conduct a process capability analysis to determine how well the output of the process meets customer expectations, requirements, and specifications. Process capability analysis methods continuously monitor the quality of a process through capability indices, to assure that the products are consistent with the specifications and to supply information for product design and process quality improvement. Moreover, the results of process capability analysis can be the basis for reducing the cost due to product failures; see [1].
Among widely used process capability indices, the process performance index (PPI) provides a close connection to the conforming rate $P(X > L)$, where $X$ is the quality variable and $L$ is a given threshold for the lower quality limit. Ref. [2] recommended using the PPI to assess the quality of products under the normality assumption. Inference for the PPI under lifetime distributions with different censoring schemes has also been widely studied. Using a type-II censoring scheme, ref. [3] obtained the maximum likelihood estimate (MLE) of the PPI for the Pareto distribution and also proposed a hypothesis testing method for the PPI. Ref. [4] proposed an inference procedure to obtain the uniformly minimum variance unbiased estimate of the PPI and a hypothesis testing method for the two-parameter exponential distribution based on type-II censored samples. Ref. [5] proposed fourteen different estimates of the PPI for the two-parameter exponential distribution with a multiple type-II censoring scheme; three of these estimates were then selected to develop hypothesis testing procedures for the PPI. Using a type-II censoring scheme, ref. [6] studied the statistical inference procedure for the PPI when the lifetimes follow an exponential distribution. Ref. [7] proposed some optimal statistical inferential procedures for the PPI based on progressively type-II censored samples for the Burr type XII distribution. Ref. [8] used generalized order statistics to conduct statistical inferences for the PPI when the lifetimes follow an exponential distribution. Ref. [9] proposed Bayesian estimation methods to assess the PPI when progressive type-II censored samples are collected from the Rayleigh distribution. Ref. [10] proposed parameter estimation methods to estimate the PPI based on progressive first-failure censored samples from the Weibull distribution.
The power-normal family of distributions was first noted by [11], who investigated the statistical properties of the power-normal distribution and gave expressions for the moments in terms of infinite series. Ref. [12] studied the maximum likelihood estimation of the model parameters for the normalized power-normal distribution. Ref. [13] mentioned some applications of the power-normal distribution. Ref. [14] presented another form of the power-normal distribution based on the inverse Box–Cox power transformation and obtained expressions for the mean and variance of the power-normal distribution. Ref. [15] studied a truncated power-normal distribution with a truncation point of zero, including its probabilistic properties and the maximum likelihood and moment estimation methods. Ref. [16] used the delta method to develop several estimation methods for the confidence intervals of the percentiles of the power-normal distribution. Ref. [17] investigated the closeness of the power-normal and skew-normal distributions and proposed estimation procedures to characterize skewed data using the power-normal distribution. Ref. [18] evaluated the asymptotic influence of truncation on the estimation of the parameters of the power-normal distribution through an intensive simulation study.
The power-normal distribution is a generalized version of the normal distribution. Moreover, the power-normal distribution can be more competitive than the skew-normal distribution for characterizing real data if the skewness parameter in the skew-normal distribution is small. In this study, we aim to propose a reliable estimation process to evaluate the PPI in quality control applications. The proposed estimation process can be used to evaluate the PPI whether the data follow a symmetric or an asymmetric distribution. Moreover, the new estimation process includes the estimation method based on the normality assumption as a special case.
We investigate the maximum likelihood estimation method and the confidence interval method based on using the exact Fisher information matrix. Moreover, we also study the weakness of using an exact Fisher information matrix to obtain the confidence intervals of the power-normal distribution parameters. Two bootstrap methods are used to obtain the approximate confidence interval of PPI when the quality characteristic follows a power-normal distribution.

2. Motivation and Organization

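The probability density function (PDF) and cumulative distribution function (CDF) of the power-normal distribution are defined, respectively, by
$$f(y \mid \Theta) = \frac{\gamma}{\sigma} \left[ \Phi\!\left( \frac{y - \xi}{\sigma} \right) \right]^{\gamma - 1} \phi\!\left( \frac{y - \xi}{\sigma} \right), \quad y \in \mathbb{R},\ \xi \in \mathbb{R},\ \gamma > 0,\ \sigma > 0, \tag{1}$$
and
$$F(y \mid \Theta) = \left[ \Phi\!\left( \frac{y - \xi}{\sigma} \right) \right]^{\gamma}, \quad y \in \mathbb{R},\ \xi \in \mathbb{R},\ \gamma > 0,\ \sigma > 0, \tag{2}$$
where $\Theta = (\xi, \gamma, \sigma)$, $\phi$ and $\Phi$ are the standard normal PDF and CDF, $\xi$ is the location parameter, $\sigma$ is the scale parameter and $\gamma$ is the shape parameter. When $\gamma = 1$, the power-normal distribution reduces to the normal distribution with mean $\xi$ and standard deviation $\sigma$. The power-normal distribution is left-skewed if $0 < \gamma < 1$ and right-skewed if $\gamma > 1$. The power-normal distribution can be an alternative to the skew-normal distribution proposed by Azzalini [19].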
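The definitions in Equations (1) and (2) translate directly into code. The following is a minimal sketch using only the Python standard library; the function names (`pn_pdf`, `pn_cdf`, `pn_quantile`, `pn_sample`) are illustrative and do not come from the paper. The quantile function is obtained by inverting Equation (2), which gives $y = \xi + \sigma \Phi^{-1}(p^{1/\gamma})$, and the sampler uses inverse-transform sampling, which is an assumption about how variates might be generated rather than the authors' stated procedure.

```python
import random
from statistics import NormalDist

_STD = NormalDist()  # standard normal: provides pdf, cdf and inv_cdf


def pn_pdf(y, xi, gamma, sigma):
    """Density of Equation (1): (gamma/sigma) * Phi(x)**(gamma-1) * phi(x)."""
    x = (y - xi) / sigma
    big_phi = _STD.cdf(x)
    if big_phi <= 0.0:
        return 0.0  # far lower tail: avoid 0 ** negative when gamma < 1
    return gamma / sigma * big_phi ** (gamma - 1.0) * _STD.pdf(x)


def pn_cdf(y, xi, gamma, sigma):
    """Distribution function of Equation (2): Phi(x)**gamma."""
    return _STD.cdf((y - xi) / sigma) ** gamma


def pn_quantile(p, xi, gamma, sigma):
    """Invert Equation (2) for 0 < p < 1."""
    return xi + sigma * _STD.inv_cdf(p ** (1.0 / gamma))


def pn_sample(n, xi, gamma, sigma, rng=None):
    """Draw n variates by inverse-transform sampling."""
    rng = rng or random.Random()
    out = []
    while len(out) < n:
        q = rng.random() ** (1.0 / gamma)
        if 0.0 < q < 1.0:  # skip values that rounded to 0 or 1
            out.append(xi + sigma * _STD.inv_cdf(q))
    return out
```

With $\gamma = 1$ these functions coincide with the normal distribution, which gives a quick sanity check of an implementation.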
Process capability analysis methods are often used to evaluate whether the quality of products meets customers' requirements. Define the PPI by
$$C_L = \frac{\xi - L}{\sigma}.$$
If the quality variable $X$ follows a normal distribution, $\xi$ and $\sigma$ are respectively the mean and standard deviation of $X$. Throughout this article, a product is labeled as satisfactory if the condition $\{X > L\}$ is satisfied. The probability of a satisfactory product is known as the conforming rate and defined by
$$\eta_L = P(X \geq L).$$
There is a connection between the conforming rate $\eta_L$ and $C_L$ when $X$ follows a parametric distribution. This fact implies that statistical inference for the conforming rate may be used as well for $C_L$ and vice versa.
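For the power-normal model this connection can be written out explicitly. The short derivation below is not stated in this form in the text; it follows from the CDF of the power-normal distribution, $F(y \mid \Theta) = [\Phi((y-\xi)/\sigma)]^{\gamma}$, together with the identity $(L - \xi)/\sigma = -C_L$, and assumes that $\gamma$ is held fixed.

```latex
\begin{aligned}
\eta_L &= P(X \geq L) = 1 - F(L \mid \Theta)
        = 1 - \left[ \Phi\!\left( \frac{L - \xi}{\sigma} \right) \right]^{\gamma} \\
       &= 1 - \left[ \Phi(-C_L) \right]^{\gamma}.
\end{aligned}
```

For fixed $\gamma$, $\eta_L$ is a strictly increasing function of $C_L$, and for $\gamma = 1$ it reduces to the normal-theory relation $\eta_L = \Phi(C_L)$.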
In the real world, the distribution of a quality variable may not be symmetric. In that case, the existing inference methods for $C_L$ based on the normality assumption are no longer applicable. It is important to develop new inference methods for $C_L$ that cover a wide range of distribution shapes and that can also be applied to the normal distribution. The skew-normal distribution proposed by Azzalini [19] can be an option to characterize process data. However, the skew-normal distribution is not the best model for capturing high degrees of kurtosis, and the estimation results based on the skew-normal distribution are not stable if the skewness parameter is close to zero. To overcome these weaknesses of the skew-normal distribution for inference about $C_L$, we suggest replacing the skew-normal distribution with the power-normal distribution when developing new inference methods for $C_L$.
The power-normal distribution can characterize data whose distribution has a higher degree of kurtosis than the normal and skew-normal distributions but a narrow range of asymmetry. For these reasons, we aim to develop a maximum likelihood estimation procedure for the power-normal distribution and to investigate the weakness of using the exact Fisher information matrix to obtain confidence intervals of the model parameters and $C_L$. Moreover, two parametric bootstrap procedures are developed in this study to obtain approximate confidence intervals of the power-normal distribution parameters and $C_L$.
The Fisher information matrix and bootstrap methods are two extensively used approaches to obtain approximate confidence intervals of model parameters. If the exact Fisher information matrix can be obtained, it can be combined with the delta method to obtain an approximate confidence interval of a function of the model parameters. If the exact Fisher information matrix is not available, the observed Fisher information matrix can be used instead. Because a confidence interval obtained from the delta method and the Fisher information matrix relies on a large-sample normal approximation based on the Central Limit Theorem, its actual coverage probability can be above or below the nominal level when the sample size is not large.
The bootstrap is a computer-based method that can replace the delta method to obtain an approximate confidence interval of a function of model parameters. It is often used to estimate the standard error of, or to construct a confidence interval from, a complex estimator of the parameter of interest, and it has been extensively used for this purpose in quality control and reliability applications. Among bootstrap methods, the parametric bootstrap percentile (PBP) and bias-corrected percentile (BCP) methods are two popular choices. Comprehensive procedures to construct bootstrap confidence intervals can be found in [20]. Quality control and reliability applications based on bootstrap methods can be found in [21,22,23,24].
The rest of this paper is organized as follows. In Section 3, the maximum likelihood estimation procedure is proposed, the exact Fisher information matrix is derived, and the steps to implement the PBP and BCP methods are developed. We also study the drawbacks of using the exact Fisher information matrix to obtain an approximate confidence interval for the model parameters and PPI. In Section 4, an intensive Monte Carlo simulation study is conducted to evaluate the proposed maximum likelihood estimation method in terms of bias and mean squared error (MSE), and to evaluate the interval inference methods in terms of the coverage probability (CR) for the target parameter. In Section 5, the proposed estimation procedures are illustrated with one real example concerning the flow width of the resist in a hard-bake process. Finally, some concluding remarks are given in Section 6.

3. The Inference Methods

3.1. Maximum Likelihood Estimation Method

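Based on the power-normal distribution defined by Equations (1) and (2), the maximum likelihood estimation procedure is proposed in this section. Let $X = (Y - \xi)/\sigma$. The $m$th moment of $Y$ can be obtained by
$$E[Y^m] = \sum_{l=0}^{m} \binom{m}{l} \xi^{l} \sigma^{m-l} E[X^{m-l}],$$
where $X$ has the PDF
$$f(x \mid \gamma) = \gamma [\Phi(x)]^{\gamma - 1} \phi(x), \quad x \in \mathbb{R}.$$
The PDF of $y_i$ can be presented by
$$f(y_i \mid \Theta) = \frac{\gamma}{\sigma} \left[ \Phi\!\left( \frac{y_i - \xi}{\sigma} \right) \right]^{\gamma - 1} \phi\!\left( \frac{y_i - \xi}{\sigma} \right) = \frac{\gamma}{\sigma} (\Phi(x_i))^{\gamma - 1} \phi(x_i), \quad i = 1, 2, \ldots, n.$$
Let $\mathbf{y} = (y_1, y_2, \ldots, y_n)$ denote a random sample from the power-normal distribution. The likelihood function and log-likelihood function of $\mathbf{y}$ can be presented by
$$L(\Theta \mid \mathbf{y}) = \prod_{i=1}^{n} f(y_i \mid \Theta) = \frac{\gamma^{n}}{\sigma^{n}} \prod_{i=1}^{n} \Phi(x_i)^{\gamma - 1} \prod_{i=1}^{n} \phi(x_i),$$
where $x_i = (y_i - \xi)/\sigma$ and
$$\ell = \log(L(\Theta \mid \mathbf{y})) = n (\log(\gamma) - \log(\sigma)) + (\gamma - 1) \sum_{i=1}^{n} \log \Phi(x_i) + \sum_{i=1}^{n} \log \phi(x_i).$$
The first and second derivatives of $\ell$ with respect to $\xi$, $\sigma$ and $\gamma$ can be obtained by
$$\frac{\partial \ell}{\partial \xi} = \frac{1}{\sigma} \left[ \sum_{i=1}^{n} x_i - (\gamma - 1) \sum_{i=1}^{n} \frac{\phi(x_i)}{\Phi(x_i)} \right], \tag{8}$$
$$\frac{\partial \ell}{\partial \gamma} = \frac{n}{\gamma} + \sum_{i=1}^{n} \log \Phi(x_i), \tag{9}$$
$$\frac{\partial \ell}{\partial \sigma} = \frac{1}{\sigma} \left[ \sum_{i=1}^{n} x_i^{2} - n - (\gamma - 1) \sum_{i=1}^{n} \frac{x_i \phi(x_i)}{\Phi(x_i)} \right], \tag{10}$$
$$\frac{\partial^{2} \ell}{\partial \xi^{2}} = -\frac{1}{\sigma^{2}} \left\{ n + (\gamma - 1) \sum_{i=1}^{n} \left[ \frac{x_i \phi(x_i)}{\Phi(x_i)} + \frac{\phi^{2}(x_i)}{\Phi^{2}(x_i)} \right] \right\}, \tag{11}$$
$$\frac{\partial^{2} \ell}{\partial \gamma^{2}} = -\frac{n}{\gamma^{2}}, \tag{12}$$
$$\frac{\partial^{2} \ell}{\partial \sigma^{2}} = -\frac{1}{\sigma^{2}} \left\{ 3 \sum_{i=1}^{n} x_i^{2} - n - (\gamma - 1) \left[ \sum_{i=1}^{n} \frac{(2 x_i - x_i^{3}) \phi(x_i)}{\Phi(x_i)} - \sum_{i=1}^{n} \frac{x_i^{2} \phi^{2}(x_i)}{\Phi^{2}(x_i)} \right] \right\}, \tag{13}$$
$$\frac{\partial^{2} \ell}{\partial \xi \partial \gamma} = -\frac{1}{\sigma} \sum_{i=1}^{n} \frac{\phi(x_i)}{\Phi(x_i)}, \tag{14}$$
$$\frac{\partial^{2} \ell}{\partial \xi \partial \sigma} = -\frac{1}{\sigma^{2}} \left\{ 2 \sum_{i=1}^{n} x_i + (\gamma - 1) \left[ \sum_{i=1}^{n} \frac{(x_i^{2} - 1) \phi(x_i)}{\Phi(x_i)} + \sum_{i=1}^{n} \frac{x_i \phi^{2}(x_i)}{\Phi^{2}(x_i)} \right] \right\}, \tag{15}$$
and
$$\frac{\partial^{2} \ell}{\partial \gamma \partial \sigma} = -\frac{1}{\sigma} \sum_{i=1}^{n} \frac{x_i \phi(x_i)}{\Phi(x_i)}. \tag{16}$$
Setting $\partial \ell / \partial \xi = 0$, $\partial \ell / \partial \sigma = 0$, and $\partial \ell / \partial \gamma = 0$, we obtain the following three likelihood equations:
$$\sum_{i=1}^{n} x_i = (\gamma - 1) \sum_{i=1}^{n} \frac{\phi(x_i)}{\Phi(x_i)}, \tag{17}$$
$$\sum_{i=1}^{n} x_i^{2} = n + (\gamma - 1) \sum_{i=1}^{n} \frac{x_i \phi(x_i)}{\Phi(x_i)}, \tag{18}$$
$$\gamma = -\frac{n}{\sum_{i=1}^{n} \log \Phi(x_i)}. \tag{19}$$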
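The three likelihood equations have no closed-form joint solution, but the third one gives $\gamma$ explicitly once $\xi$ and $\sigma$ are fixed, so the search can be reduced to two dimensions. The sketch below illustrates this idea with the Python standard library only. It is an assumption-laden illustration, not the authors' algorithm: the paper does not state which optimizer was used, and the crude compass (pattern) search here is swapped in only to keep the example dependency-free. The names `profile_gamma`, `profile_loglik` and `fit_power_normal` are hypothetical. The starting point is the sample mean and standard deviation, corresponding to $\gamma_0 = 1$.

```python
import math
from statistics import NormalDist, fmean, pstdev

_STD = NormalDist()  # standard normal
_LOG_SQRT_2PI = 0.5 * math.log(2.0 * math.pi)


def profile_gamma(ys, xi, sigma):
    """Third likelihood equation: gamma = -n / sum(log Phi(x_i))."""
    s = sum(math.log(_STD.cdf((y - xi) / sigma)) for y in ys)
    return -len(ys) / s


def profile_loglik(ys, xi, sigma):
    """Log-likelihood with gamma replaced by its closed-form maximizer."""
    n = len(ys)
    xs = [(y - xi) / sigma for y in ys]
    cdfs = [_STD.cdf(x) for x in xs]
    if min(cdfs) <= 0.0:
        return -math.inf  # a point lies too far in the lower tail
    s = sum(math.log(c) for c in cdfs)
    if s >= 0.0:
        return -math.inf  # every Phi(x_i) rounded to 1, gamma undefined
    gamma = -n / s
    log_phi = sum(-0.5 * x * x - _LOG_SQRT_2PI for x in xs)
    return n * (math.log(gamma) - math.log(sigma)) + (gamma - 1.0) * s + log_phi


def fit_power_normal(ys, tol=1e-6, max_iter=2000):
    """Compass search over (xi, log sigma); returns (xi, gamma, sigma, loglik)."""
    m, s = fmean(ys), pstdev(ys)

    def obj(u, v):
        # u shifts the location in units of s; v rescales sigma on the log scale
        return profile_loglik(ys, m + s * u, s * math.exp(v))

    u, v, step = 0.0, 0.0, 0.5
    best = obj(u, v)
    for _ in range(max_iter):
        if step < tol:
            break
        moved = False
        for du, dv in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            val = obj(u + du, v + dv)
            if val > best:  # accept only strict improvements
                u, v, best, moved = u + du, v + dv, val, True
                break
        if not moved:
            step /= 2.0  # no neighbor improves: refine the mesh
    xi, sigma = m + s * u, s * math.exp(v)
    return xi, profile_gamma(ys, xi, sigma), sigma, best
```

Because the power-normal likelihood can be quite flat, a search of this kind may stop at a point that is only approximately optimal; in practice a dedicated numerical optimizer with several starting values would be a safer choice.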
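Let $\kappa(\Theta)$ be a differentiable function of $\Theta$. Denote the gradient of $\kappa(\Theta)$ by $\nabla \kappa(\Theta) = \left( \frac{\partial \kappa(\Theta)}{\partial \xi}, \frac{\partial \kappa(\Theta)}{\partial \gamma}, \frac{\partial \kappa(\Theta)}{\partial \sigma} \right)$, with the components ordered as in $\Theta = (\xi, \gamma, \sigma)$. By the invariance property of the MLE, the MLE of $\kappa(\Theta)$ is $\kappa(\hat{\Theta})$. For simplicity, let $\hat{\kappa} \equiv \kappa(\hat{\Theta})$ and let $\nabla \kappa(\hat{\Theta})$ be the plug-in version of $\nabla \kappa(\Theta)$. An approximate confidence interval of $\kappa(\Theta)$ can be obtained using Theorem 1.
Theorem 1. 
Let $I(\Theta)$ denote the exact Fisher information matrix. Based on Equations (8)–(16) and the delta method, the following asymptotic results hold.
(a) 
$$\hat{\kappa} \sim N\!\left( \kappa(\Theta),\ (\nabla \kappa(\hat{\Theta}))\, I(\hat{\Theta})^{-1} (\nabla \kappa(\hat{\Theta}))^{T} \right) \quad \text{as } n \to \infty.$$
(b) 
The asymptotic distribution of $\hat{C}_L$ is
$$\hat{C}_L \sim N\!\left( C_L,\ \left( \frac{1}{\hat{\sigma}}, 0, -\frac{\hat{\xi} - L}{\hat{\sigma}^{2}} \right) I(\hat{\Theta})^{-1} \left( \frac{1}{\hat{\sigma}}, 0, -\frac{\hat{\xi} - L}{\hat{\sigma}^{2}} \right)^{T} \right) \quad \text{as } n \to \infty,$$
where $C_L$ is the PPI of the power-normal distribution and defined by
$$C_L = \frac{\xi - L}{\sigma}.$$
The approximate $(1 - \alpha) \times 100\%$ confidence interval of $C_L$ can be obtained by
$$\hat{C}_L \pm z_{\alpha/2} \times \sqrt{ \left( \frac{1}{\hat{\sigma}}, 0, -\frac{\hat{\xi} - L}{\hat{\sigma}^{2}} \right) I(\hat{\Theta})^{-1} \left( \frac{1}{\hat{\sigma}}, 0, -\frac{\hat{\xi} - L}{\hat{\sigma}^{2}} \right)^{T} }.$$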
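Proof. 
(a)
The MLEs of $\xi$, $\sigma$ and $\gamma$ are the simultaneous solutions of Equations (17)–(19). Denote them by $\hat{\xi}$, $\hat{\sigma}$ and $\hat{\gamma}$, respectively, and let $\hat{\Theta} = (\hat{\xi}, \hat{\gamma}, \hat{\sigma})$. Let $\delta_{11} = -E\!\left( \frac{\partial^{2} \ell}{\partial \xi^{2}} \right)$, $\delta_{22} = -E\!\left( \frac{\partial^{2} \ell}{\partial \gamma^{2}} \right)$, $\delta_{33} = -E\!\left( \frac{\partial^{2} \ell}{\partial \sigma^{2}} \right)$, $\delta_{12} = \delta_{21} = -E\!\left( \frac{\partial^{2} \ell}{\partial \xi \partial \gamma} \right)$, $\delta_{13} = \delta_{31} = -E\!\left( \frac{\partial^{2} \ell}{\partial \xi \partial \sigma} \right)$ and $\delta_{23} = \delta_{32} = -E\!\left( \frac{\partial^{2} \ell}{\partial \gamma \partial \sigma} \right)$. We obtain the following results:
$$\delta_{11} = \frac{n}{\sigma^{2}} \left\{ 1 + (\gamma - 1) \left[ E\!\left( \frac{X \phi(X)}{\Phi(X)} \right) + E\!\left( \frac{\phi^{2}(X)}{\Phi^{2}(X)} \right) \right] \right\},$$
$$\delta_{22} = \frac{n}{\gamma^{2}},$$
$$\delta_{33} = \frac{n}{\sigma^{2}} \left\{ 3 E(X^{2}) - 1 - (\gamma - 1) \left[ E\!\left( \frac{(2X - X^{3}) \phi(X)}{\Phi(X)} \right) - E\!\left( \frac{X^{2} \phi^{2}(X)}{\Phi^{2}(X)} \right) \right] \right\},$$
$$\delta_{12} = \frac{n}{\sigma} E\!\left( \frac{\phi(X)}{\Phi(X)} \right),$$
$$\delta_{13} = \frac{n}{\sigma^{2}} \left\{ 2 E(X) + (\gamma - 1) \left[ E\!\left( \frac{(X^{2} - 1) \phi(X)}{\Phi(X)} \right) + E\!\left( \frac{X \phi^{2}(X)}{\Phi^{2}(X)} \right) \right] \right\},$$
and
$$\delta_{23} = \frac{n}{\sigma} E\!\left( \frac{X \phi(X)}{\Phi(X)} \right).$$
Hence, the exact Fisher information matrix can be presented by
$$I(\Theta) = \begin{pmatrix} \delta_{11} & \delta_{12} & \delta_{13} \\ \delta_{21} & \delta_{22} & \delta_{23} \\ \delta_{31} & \delta_{32} & \delta_{33} \end{pmatrix}.$$
Because $I(\Theta)$ contains unknown parameters, the plug-in version of $I(\Theta)$, denoted by $I(\hat{\Theta})$, can be used to find the asymptotic distribution of $\hat{\Theta}$. It can be shown that
$$\hat{\Theta} \sim N\!\left( \Theta,\ I(\hat{\Theta})^{-1} \right) \quad \text{as } n \to \infty.$$
The delta method indicates that the asymptotic mean and variance of $\hat{\kappa}$ are $\kappa(\Theta)$ and $\nabla \kappa(\hat{\Theta})\, I(\hat{\Theta})^{-1} (\nabla \kappa(\hat{\Theta}))^{T}$, respectively. Moreover, using the Central Limit Theorem, we can show that the asymptotic distribution of $\hat{\kappa}$ is normal. Hence,
$$\hat{\kappa} \sim N\!\left( \kappa(\Theta),\ (\nabla \kappa(\hat{\Theta}))\, I(\hat{\Theta})^{-1} (\nabla \kappa(\hat{\Theta}))^{T} \right) \quad \text{as } n \to \infty.$$
This proves Theorem 1(a).
(b)
Using the invariance property, the MLE of $C_L$ can be obtained by
$$\hat{C}_L = \frac{\hat{\xi} - L}{\hat{\sigma}}.$$
The gradient of $C_L$ is $\nabla C_L = \left( \frac{\partial C_L}{\partial \xi}, \frac{\partial C_L}{\partial \gamma}, \frac{\partial C_L}{\partial \sigma} \right) = \left( \frac{1}{\sigma}, 0, -\frac{\xi - L}{\sigma^{2}} \right)$. Moreover, we can show that the asymptotic variance of $\hat{C}_L$ is
$$\left( \frac{1}{\hat{\sigma}}, 0, -\frac{\hat{\xi} - L}{\hat{\sigma}^{2}} \right) I(\hat{\Theta})^{-1} \left( \frac{1}{\hat{\sigma}}, 0, -\frac{\hat{\xi} - L}{\hat{\sigma}^{2}} \right)^{T}$$
and the asymptotic distribution of $\hat{C}_L$ can be obtained by
$$\hat{C}_L \sim N\!\left( C_L,\ \left( \frac{1}{\hat{\sigma}}, 0, -\frac{\hat{\xi} - L}{\hat{\sigma}^{2}} \right) I(\hat{\Theta})^{-1} \left( \frac{1}{\hat{\sigma}}, 0, -\frac{\hat{\xi} - L}{\hat{\sigma}^{2}} \right)^{T} \right) \quad \text{as } n \to \infty.$$
Then, the approximate $(1 - \alpha) \times 100\%$ confidence interval of $C_L$ can be obtained by
$$\hat{C}_L \pm z_{\alpha/2} \times \sqrt{ \left( \frac{1}{\hat{\sigma}}, 0, -\frac{\hat{\xi} - L}{\hat{\sigma}^{2}} \right) I(\hat{\Theta})^{-1} \left( \frac{1}{\hat{\sigma}}, 0, -\frac{\hat{\xi} - L}{\hat{\sigma}^{2}} \right)^{T} }.$$
This proves Theorem 1(b). □
If $\gamma = 1$, $C_L$ reduces to the PPI of the normal distribution. Hence, $C_L$ is a generalized PPI that can be used as a performance metric under skewed distributions and that includes the PPI of the normal distribution as a special case.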
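The interval in Theorem 1(b) is a quadratic form in the gradient of $C_L$, which is simple to evaluate once an estimate of $I(\hat{\Theta})^{-1}$ is at hand. The sketch below assumes that this $3 \times 3$ matrix is supplied by the user, ordered as $(\xi, \gamma, \sigma)$; it does not compute the Fisher information itself. The function name `delta_ci_cl` and the covariance values in the usage line are hypothetical and serve only to show the arithmetic.

```python
import math
from statistics import NormalDist


def delta_ci_cl(xi_hat, sigma_hat, lower_limit, cov, alpha=0.05):
    """Delta-method interval for C_L = (xi - L) / sigma.

    cov is a 3x3 nested list holding the inverse Fisher information
    evaluated at the MLE, ordered as (xi, gamma, sigma).
    """
    c_hat = (xi_hat - lower_limit) / sigma_hat
    # Gradient of C_L with respect to (xi, gamma, sigma)
    grad = (1.0 / sigma_hat, 0.0, -(xi_hat - lower_limit) / sigma_hat ** 2)
    var = sum(grad[i] * cov[i][j] * grad[j] for i in range(3) for j in range(3))
    half_width = NormalDist().inv_cdf(1.0 - alpha / 2.0) * math.sqrt(var)
    return c_hat - half_width, c_hat + half_width


# Hypothetical inputs, for illustration only
example_cov = [[0.01, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.0025]]
lo, hi = delta_ci_cl(3.0, 1.0, 1.0, example_cov)
```

Because the second component of the gradient is zero, the variance of $\hat{\gamma}$ enters the interval only through its covariances with $\hat{\xi}$ and $\hat{\sigma}$ inside $I(\hat{\Theta})^{-1}$.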

3.2. Bootstrap Methods

Parametric bootstrap methods are another widely used approach to obtain an approximate $(1 - \alpha) \times 100\%$ confidence interval of $\kappa(\Theta)$. In this study, the parametric bootstrap methods of PBP and BCP are developed to obtain the approximate $(1 - \alpha) \times 100\%$ confidence interval of $\kappa(\Theta)$. Readers can see [20] for comprehensive introductions to the PBP and BCP methods. The procedures to implement the PBP and BCP methods are presented as follows:
  • The PBP method
The $(1 - \alpha) \times 100\%$ bootstrap confidence interval of $\kappa(\Theta)$ based on the PBP method can be obtained using the following steps:
Step 1:
Obtain the MLE of $\Theta$ based on a sample of size $n$ and denote the MLE by $\hat{\Theta}$.
Step 2:
Generate a bootstrap sample of size $n$ from the power-normal distribution with parameter $\Theta = \hat{\Theta}$.
Step 3:
Repeat Step 2 $B$ times, where $B$ is a large number, compute the MLE of $\Theta$ from each bootstrap sample, and denote the obtained MLEs by $\hat{\Theta}_j^{*}$, $j = 1, 2, \ldots, B$. Let $\hat{\kappa}_j^{*} \equiv \hat{\kappa}(\hat{\Theta}_j^{*})$, $j = 1, 2, \ldots, B$, be the MLEs of $\kappa(\Theta)$ based on the $B$ bootstrap samples.
Step 4:
Construct the empirical distribution of $\hat{\kappa}$ based on the bootstrap estimates $\{\hat{\kappa}_j^{*},\ j = 1, 2, \ldots, B\}$ and denote it by $\hat{G}_{\kappa}^{*}$. The $(1 - \alpha) \times 100\%$ bootstrap confidence interval of $\kappa(\Theta)$ can be obtained by $(g_{\alpha/2}, g_{1 - \alpha/2})$, where $g_p$ is the $p$th quantile of $\hat{G}_{\kappa}^{*}$ such that $\hat{G}_{\kappa}^{*}(g_p) = p$ for $0 < p < 1$.
  • The BCP method
The $(1 - \alpha) \times 100\%$ bootstrap confidence interval of $\kappa(\Theta)$ based on the BCP method can be obtained using the following steps:
Step 1:
Implement Step 1 to Step 3 of the PBP method to obtain the bootstrap estimates $\hat{\kappa}_j^{*} \equiv \hat{\kappa}(\hat{\Theta}_j^{*})$, $j = 1, 2, \ldots, B$.
Step 2:
The approximate $(1 - \alpha) \times 100\%$ confidence interval of $\kappa(\Theta)$ can be obtained by
$$\left( (\hat{G}_{\kappa}^{*})^{-1}\!\left( \Phi\!\left[ z_{\alpha/2} + 2 \Phi^{-1}(\hat{G}_{\kappa}^{*}(\hat{\kappa})) \right] \right),\ (\hat{G}_{\kappa}^{*})^{-1}\!\left( \Phi\!\left[ z_{1 - \alpha/2} + 2 \Phi^{-1}(\hat{G}_{\kappa}^{*}(\hat{\kappa})) \right] \right) \right).$$
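The PBP and BCP steps above can be sketched in a few functions. This is a minimal illustration with the Python standard library, not the authors' code: the callables `fit`, `simulate` and `kappa` are hypothetical placeholders that the user must supply (for example, a power-normal MLE routine, a power-normal sampler, and the map from $\hat{\Theta}$ to $\hat{C}_L$). The linearly interpolated empirical quantile and the clamping of $\hat{G}_{\kappa}^{*}(\hat{\kappa})$ away from 0 and 1 are implementation choices made here for numerical safety.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal


def _quantile(sorted_vals, p):
    """Linearly interpolated p-th quantile of an already sorted list."""
    h = p * (len(sorted_vals) - 1)
    lo = int(h)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (h - lo) * (sorted_vals[hi] - sorted_vals[lo])


def parametric_bootstrap(data, fit, simulate, kappa, B=1000):
    """PBP Steps 1 to 3: returns (kappa_hat, list of B bootstrap estimates)."""
    theta_hat = fit(data)  # Step 1: MLE from the observed sample
    boot = []
    for _ in range(B):  # Steps 2 and 3: simulate, refit, evaluate kappa
        sample = simulate(theta_hat, len(data))
        boot.append(kappa(fit(sample)))
    return kappa(theta_hat), boot


def pbp_interval(boot, alpha=0.05):
    """PBP Step 4: quantiles alpha/2 and 1 - alpha/2 of the bootstrap estimates."""
    s = sorted(boot)
    return _quantile(s, alpha / 2.0), _quantile(s, 1.0 - alpha / 2.0)


def bcp_interval(boot, kappa_hat, alpha=0.05):
    """BCP Step 2: percentile levels shifted by 2 * Phi^{-1}(G*(kappa_hat))."""
    s = sorted(boot)
    b = len(s)
    g = sum(1 for v in s if v <= kappa_hat) / b
    g = min(max(g, 0.5 / b), 1.0 - 0.5 / b)  # keep the normal quantile finite
    z0 = _STD.inv_cdf(g)
    p_lo = _STD.cdf(_STD.inv_cdf(alpha / 2.0) + 2.0 * z0)
    p_hi = _STD.cdf(_STD.inv_cdf(1.0 - alpha / 2.0) + 2.0 * z0)
    return _quantile(s, p_lo), _quantile(s, p_hi)
```

When $\hat{\kappa}$ sits at the median of the bootstrap estimates, the correction term vanishes and the BCP interval coincides with the PBP interval; the two differ only when the bootstrap distribution is not centered at $\hat{\kappa}$.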

3.3. Discussions

From Theorem 1, we find that the approximate confidence interval of $C_L$ in Theorem 1(b) is available only if the exact Fisher information matrix, $I(\Theta)$, in Theorem 1(a) exists. The existence of $I(\Theta)$ depends on $\gamma$ and, in practice, on the MLE of $\gamma$. For example, the component $E\!\left( X \phi^{2}(X) / \Phi^{2}(X) \right)$ in $\delta_{13}$ exists only if $\gamma \geq 3$. Because $\gamma$ is an unknown parameter in practical applications, this limitation makes the use of $I(\Theta)$ difficult. Because of sampling error in estimating the model parameters, we could obtain an MLE of $\gamma$ smaller than 3 even if the true value of $\gamma$ is larger than but close to 3. If $\gamma < 3$, $I(\Theta)$ cannot be obtained. This restriction on the existence of $I(\Theta)$ makes it difficult to use the approximate confidence interval obtained from Theorem 1. For these reasons, we suggest using the two proposed parametric bootstrap methods to obtain approximate confidence intervals for the model parameters and $C_L$ in practical applications.
The power-normal distribution is a generalized version of the normal distribution. In the literature, some extended versions of the power-normal distribution have been proposed, for example, the log-power-normal distribution and the truncated power-normal distribution. These extended versions have a more complicated functional form than the power-normal distribution and contain more parameters to characterize data. Their merit is the ability to model data with a complicated shape. However, parameter estimation can be a problem. If a distribution contains two or more shape (or skewness) parameters, model identification can be difficult in practice, because two distributions with different combinations of plug-in estimates can fit the same data set well.
In our experience, two such distributions could result in different tail percentiles, and percentiles are important for quality control and reliability applications. Hence, there is a trade-off between the complexity of the model form and the reliability of the parameter estimation results. To avoid over-fitting, we suggest modeling the data with a generalized version of the normal distribution that includes the normal distribution as a special case. For this purpose, the skew-normal distribution and the power-normal distribution are competitors. The power-normal distribution can characterize data whose distribution has a higher degree of kurtosis than the normal and skew-normal distributions but a narrow range of asymmetry. Moreover, the estimation results based on the skew-normal distribution are less reliable if its skewness parameter is small. That is, if the distribution of the data is slightly asymmetric, the estimation results based on the skew-normal distribution could be less reliable if the sample size is not large enough. In this study, we recommend using the power-normal distribution and the developed inference procedures to characterize data.

4. Monte Carlo Simulations

In this section, Monte Carlo simulations are conducted to evaluate the performance of the proposed maximum likelihood estimation method and the two parametric bootstrap methods. Because the power-normal distribution reduces to the normal distribution when $\gamma = 1$, the first goal of the simulation design is to study the estimation performance of the MLE of $\gamma$ and its impact on the estimation performance of the MLEs of the location and scale parameters in the power-normal distribution. For this purpose, the Monte Carlo simulations are conducted with the parameters $\xi = 3$, $\sigma = 1$ and $\gamma = 0.5$, 1, 2 and 5. The estimation performance is evaluated using scale-free metrics based on the bias and MSE. Define the bias and MSE of a parameter $\theta$ by $\mathrm{Bias} = \frac{1}{M} \sum_{i=1}^{M} (\hat{\theta}_i - \theta)$ and $\mathrm{MSE} = \frac{1}{M} \sum_{i=1}^{M} (\hat{\theta}_i - \theta)^{2}$, respectively, where $\hat{\theta}_i$ is the MLE of $\theta$ for the $i$th iteration, $i = 1, 2, \ldots, M$, and $M$ is a large positive integer. The relative bias and the relative square root of the MSE of $\hat{\theta}$ are defined respectively by
$$\mathrm{rBias} = \frac{1}{\theta} \times \mathrm{Bias}$$
and
$$\mathrm{rsqMSE} = \frac{1}{\theta} \times \sqrt{\mathrm{MSE}}.$$
In this study, $\theta$ can be $\xi$, $\gamma$, $\sigma$ and $C_L$. We use $M = 10{,}000$ iterations to evaluate the values of rBias and rsqMSE. Moreover, we count the number of simulation runs, out of 10,000, in which the absolute value of any of the MLEs of $\xi$, $\gamma$ and $\sigma$ is larger than ten times its true parameter value, and denote this count by $N_u$. All simulation results are reported in Table 1. $N_u$ is an indicator of how frequently poor-quality MLEs occur in the 10,000 simulation runs.
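The metrics defined above are straightforward to compute from a list of simulated estimates. The following sketch uses hypothetical function names (`relative_metrics`, `count_extreme_runs`) and assumes that each simulation run is stored as a tuple of estimates in the same order as the tuple of true values.

```python
import math


def relative_metrics(estimates, theta):
    """Return (rBias, rsqMSE) for estimates of a single parameter theta."""
    m = len(estimates)
    bias = sum(e - theta for e in estimates) / m
    mse = sum((e - theta) ** 2 for e in estimates) / m
    return bias / theta, math.sqrt(mse) / theta


def count_extreme_runs(runs, truth, factor=10.0):
    """N_u: number of runs in which any |MLE| exceeds factor times its true value."""
    return sum(
        1
        for est in runs
        if any(abs(e) > factor * abs(t) for e, t in zip(est, truth))
    )
```

For example, `relative_metrics(gamma_hats, 2.0)` would give the rBias and rsqMSE of $\hat{\gamma}$ for a setting with true $\gamma = 2$, where `gamma_hats` is a placeholder for the list of simulated MLEs.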
From Table 1 we obtain the following results:
  • When the value of $\gamma$ is small ($\gamma = 0.5$) or large ($\gamma = 5$), the rBias and rsqMSE are larger than those in the cells of $\gamma = 1$ and 2, even when the sample size increases to 500. These findings indicate that a bias-correction method may be considered to obtain a more reliable MLE of $\gamma$, which in turn would give more reliable MLEs of $\xi$ and $\sigma$. Bias-corrected maximum likelihood estimation is a separate issue and is left for future study.
  • Based on the values of $N_u$ in Table 1, we find that the performance of the maximum likelihood estimation method gets worse as the value of the shape parameter increases when the sample size is small. This finding implies that the maximum likelihood estimation method may not give reliable estimates of the model parameters if the power-normal distribution has a large shape parameter and the sample size is small. A large sample of 500 or more is required to obtain reliable MLEs of the model parameters if the power-normal distribution has a large shape parameter. For the power-normal distribution with a small to moderate shape parameter, a sample of 250 or 300 is enough to obtain reliable MLEs of the model parameters.
  • The MLE of $\sigma$ underestimates its true value. The bias of $\hat{\sigma}$ and $\hat{\gamma}$ is larger than the bias of $\hat{\xi}$. Because $C_L$ is a function of $\xi$ and $\sigma$, the bias of $\hat{C}_L$ could be inflated by the underestimation of $\sigma$. We also find that the bias of $\hat{\sigma}$ is not significantly reduced when the sample size increases. How to reduce the bias of the MLE of $\sigma$ is left for future study.
  • $\hat{\xi}$ is a good estimate of $\xi$ in terms of the rBias and rsqMSE in Table 1.
  • Some cells of rBias and rsqMSE for $\hat{\xi}$, $\hat{\gamma}$ and $\hat{\sigma}$ do not decrease as the sample size increases. On careful inspection, the values in these cells are close to one another. The slight differences are caused by random error in the simulation, so these cells can be treated as having the same level of rBias and rsqMSE.
  • The MLE $\hat{C}_L$ is a plug-in function of the MLEs of $\xi$ and $\sigma$. Hence, the estimation performance of $\hat{C}_L$ depends on the quality of $\hat{\xi}$ and $\hat{\sigma}$. $\hat{C}_L$ can be a good estimate of $C_L$ for the power-normal distribution if its shape parameter is small to moderate.
  • Because the power-normal distribution with a small to moderate shape parameter can characterize a wide range of real skewed data, the maximum likelihood method is a promising estimation method for obtaining reliable estimates of the model parameters.
The second goal of the simulation design is to evaluate the performance of the two proposed parametric bootstrap methods, PBP and BCP, based on the power-normal distribution with $\xi = 3$, $\sigma = 1$ and $\gamma = 2$. The PBP and BCP methods are run for 1000 iterations to evaluate their CR for the model parameters and $C_L$ at the nominal confidence level of $95\%$. In each iteration, $B = 10{,}000$ bootstrap estimates are obtained to establish the bootstrap empirical distribution, and then the bootstrap confidence intervals are constructed. All estimation results are summarized in Figure 1. All obtained CR estimates range from 0.94 to 0.963, and the numerical CRs are close to the nominal value for all parameters if the sample size reaches 100 or more.
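The CR reported here is the share of simulated intervals that contain the true parameter value. The sketch below shows that computation together with the usual binomial Monte Carlo standard error of an estimated coverage; the standard-error check is an addition made here for context and is not part of the paper's analysis. Function names are hypothetical.

```python
import math


def coverage_rate(intervals, true_value):
    """Fraction of (lower, upper) intervals that contain true_value."""
    hits = sum(1 for lo, hi in intervals if lo <= true_value <= hi)
    return hits / len(intervals)


def mc_standard_error(p, n_iter):
    """Binomial standard error of a coverage estimate based on n_iter intervals."""
    return math.sqrt(p * (1.0 - p) / n_iter)
```

With 1000 iterations and a nominal level of 0.95, `mc_standard_error(0.95, 1000)` is about 0.0069, so estimated coverages between roughly 0.936 and 0.964 lie within two standard errors of the nominal value.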

5. Example

A data set with 185 flow width measurements of the resist in a hard-bake process is used to demonstrate the utility of the proposed methods. In semiconductor manufacturing, hard-bake is used in conjunction with photolithography, and the flow width of the resist is an important quality variable for keeping the photolithography process at a good quality level. This data set was taken from the first 37 Phase I in-control samples in Chapter 6 of [2]. Ref. [2] used the normality assumption to establish Shewhart $\bar{x}$-$R$ control charts for monitoring the production process of integrated circuits. We remove the samples after sample 37 from this data set because the process appears to move gradually toward an out-of-control state from sample 38 onward. The histogram based on all 185 observations is displayed in Figure 2.
The design of the data analysis contains two stages. In Stage 1, the in-control Phase I sample with 185 observations of flow width is used for modeling. The maximum likelihood estimation method in Section 3.1 is used to obtain the MLEs of the model parameters. Moreover, the PBP and BCP methods in Section 3.2 are used to obtain approximate confidence intervals of $C_L$. In Stage 2, we investigate the impact of sample size on the reduction of sampling error. Hence, data are generated in Stage 2 from the model established in Stage 1 to study the lengths of the approximate bootstrap confidence intervals of $C_L$.
Stage 1:
Modeling: We apply the maximum likelihood estimation method in Section 3.1 with the initial values $\xi_0$, $\gamma_0 = 1$ and $\sigma_0$, where $\xi_0$ and $\sigma_0$ are the sample mean and standard deviation of the data set, respectively. Based on the simulation results in Table 1, a random sample of $n = 185$ should be adequate to obtain reliable MLEs of the model parameters when $\gamma$ is close to 1. We obtain the MLEs $\hat{\xi} = 1.5282$, $\hat{\gamma} = 0.8015$ and $\hat{\sigma} = 0.1197$ for the power-normal distribution. The dashed line in Figure 2 is the density curve of the power-normal distribution based on the obtained MLEs. We can see that the power-normal distribution fits this data set well.
The quantile-to-quantile plot based on the flow width data and the power-normal distribution is presented in Figure 3. All the dots lie around the straight line. Hence, the quantile-to-quantile plot indicates that the power-normal distribution is a suitable model to characterize the flow width data. The test statistic of the Kolmogorov-Smirnov test based on the flow width data and the power-normal distribution is 0.062 with a p-value of 0.2917. Based on the Kolmogorov-Smirnov test, we conclude that the power-normal distribution can be a good model to characterize the flow width data. Because the MLE of the shape parameter, $\hat{\gamma} = 0.8015$, is close to 1, this estimate indicates that the distribution of the flow width data is slightly skewed.
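Both diagnostics can be computed directly from the fitted CDF $[\Phi((y-\xi)/\sigma)]^{\gamma}$. The sketch below returns the one-sample Kolmogorov-Smirnov statistic and the coordinates for a quantile-to-quantile plot. It is illustrative only: the function names are hypothetical, the plotting positions $(i - 0.5)/n$ are an assumption (the paper does not state which positions were used), and no p-value is computed.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal


def pn_cdf(y, xi, gamma, sigma):
    """Power-normal CDF: Phi((y - xi) / sigma) ** gamma."""
    return _STD.cdf((y - xi) / sigma) ** gamma


def ks_statistic(data, xi, gamma, sigma):
    """Largest gap between the empirical CDF and the fitted CDF."""
    ys = sorted(data)
    n = len(ys)
    d = 0.0
    for i, y in enumerate(ys, start=1):
        f = pn_cdf(y, xi, gamma, sigma)
        # Compare against the empirical CDF just after and just before y
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def qq_points(data, xi, gamma, sigma):
    """Pairs (theoretical quantile, ordered observation) for a Q-Q plot."""
    ys = sorted(data)
    n = len(ys)
    return [
        (xi + sigma * _STD.inv_cdf(((i - 0.5) / n) ** (1.0 / gamma)), y)
        for i, y in enumerate(ys, start=1)
    ]
```

Points from `qq_points` that fall close to the 45-degree line correspond to the pattern described for Figure 3.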
Following the lower specification limit of $LSL = 1$ suggested by [2], we let $L = 1$. The MLE of $C_L$ is $\hat{C}_L = (1.5282 - 1)/0.1197 = 4.4127$. The high value of $\hat{C}_L$ indicates a good process capability for the flow width of the resist in the hard-bake process. Based on the original sample with 185 observations of flow width, the $95\%$ PBP and BCP confidence intervals of the model parameters and $C_L$ are obtained and reported in Table 2. The PBP and BCP intervals for each parameter are close to each other. Both bootstrap intervals of $C_L$ indicate a good quality level for the flow width of the resist in the hard-bake manufacturing process, because both lower limits of $C_L$ are well above 1.
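The arithmetic for $\hat{C}_L$ can be reproduced from the reported MLEs, and the same estimates give a plug-in conforming rate through the power-normal CDF, $P(X \geq L) = 1 - [\Phi((L - \xi)/\sigma)]^{\gamma}$. The sketch below is illustrative; the function names are hypothetical, and the conforming-rate value is a by-product computed here rather than a figure reported in the paper.

```python
from statistics import NormalDist


def ppi(xi, sigma, lower_limit):
    """C_L = (xi - L) / sigma."""
    return (xi - lower_limit) / sigma


def conforming_rate(xi, gamma, sigma, lower_limit):
    """P(X >= L) = 1 - Phi((L - xi) / sigma) ** gamma under the power-normal model."""
    return 1.0 - NormalDist().cdf((lower_limit - xi) / sigma) ** gamma


# MLEs and lower limit quoted in the text
c_hat = ppi(1.5282, 0.1197, 1.0)
eta_hat = conforming_rate(1.5282, 0.8015, 0.1197, 1.0)
```

Here `c_hat` agrees with the value 4.4127 given above to four decimal places, and `eta_hat` is the corresponding plug-in estimate of the conforming rate.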
Stage 2:
The impact of the sample size on the quality of the confidence interval of C L : In this stage, 315 observations are generated from the model obtained in Stage 1 to investigate the impact of the sample size on the quality of the PBP and BCP intervals of C L . We merge the original 185 observations with the generated observations. The merged sample has a size of 500. We are interested in studying how much the length of confidence interval can be reduced when the sample size increases. The PBP and BCP confidence intervals of C L are evaluated based on the first 300 and all 500 observations in the merged sample. All computation results are reported in Table 3.
From Table 3, we find that the lengths of the PBP and BCP confidence intervals decrease as the sample size increases. The PBP and BCP methods are competitive for estimating $\xi$, $\sigma$, and $C_L$. The PBP method outperforms the BCP method for estimating $\gamma$, giving a shorter confidence interval. Table 3 also indicates that the quality of the PBP and BCP methods improves significantly as the sample size increases. The BCP method significantly outperforms the PBP method for evaluating $C_L$ in terms of the length of the confidence interval.
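The length comparison can be reproduced directly from the limits reported in Table 3. The short script below only subtracts the reported lower limit from the upper limit; the dictionary layout and names are illustrative choices.

```python
# 95% limits copied from Table 3: {n: {parameter: {method: (lower, upper)}}}
TABLE3 = {
    300: {
        "xi": {"PBP": (1.4302, 1.6886), "BCP": (1.1832, 1.5994)},
        "gamma": {"PBP": (0.1089, 1.8209), "BCP": (0.3820, 5.4131)},
        "sigma": {"PBP": (0.0588, 0.1512), "BCP": (0.0925, 0.2001)},
        "C_L": {"PBP": (2.8738, 11.7485), "BCP": (0.9781, 6.3928)},
    },
    500: {
        "xi": {"PBP": (1.4595, 1.6634), "BCP": (1.2955, 1.5825)},
        "gamma": {"PBP": (0.1902, 1.5053), "BCP": (0.4523, 4.1516)},
        "sigma": {"PBP": (0.0734, 0.1453), "BCP": (0.0975, 0.1784)},
        "C_L": {"PBP": (3.1750, 9.0023), "BCP": (1.6642, 5.8747)},
    },
}


def interval_length(limits):
    """Length of a confidence interval given (lower, upper)."""
    lower, upper = limits
    return upper - lower


lengths = {
    (n, param, method): round(interval_length(limits), 4)
    for n, params in TABLE3.items()
    for param, methods in params.items()
    for method, limits in methods.items()
}

for (n, param, method), value in sorted(lengths.items()):
    print(f"n={n:3d}  {param:5s}  {method}  length={value:.4f}")
```

For $C_L$, the PBP length falls from 8.8747 at $n = 300$ to 5.8273 at $n = 500$, and the BCP length falls from 5.4147 to 4.2105. For $\gamma$ at $n = 300$, the PBP length is 1.7120 against 5.0311 for BCP.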

6. Concluding Remarks

In this study, we proposed a maximum likelihood estimation procedure to obtain the MLEs of the model parameters and PPI. Moreover, approximate confidence intervals of the model parameters and PPI were obtained using the parametric bootstrap methods of PBP and BCP. The exact Fisher information matrix was derived, and we studied the weakness of using it to obtain approximate confidence intervals for the model parameters and PPI. Monte Carlo simulations were conducted to evaluate the quality of the proposed methods. We find that a sample size of 250 or 300 is large enough to obtain reliable MLEs of the model parameters and PPI. An example about the flow width of the resist in a hard-bake process of integrated circuits was used to illustrate the applications of the proposed methods.
The maximum likelihood estimation procedure and the BCP bootstrap method require a large sample for the power-normal distribution to obtain reliable point and interval estimates. Bias-correction methods could help to improve the performance of the proposed maximum likelihood estimation procedure. For small samples, finding other reliable parameter estimation methods with good estimation performance is an important issue. Moreover, establishing inference methods under censoring schemes is also an important issue for the power-normal distribution. These topics will be studied in the near future.

Author Contributions

Data curation, C.Z.; Funding acquisition, J.Z. and T.-R.T.; Investigation, H.X. and T.-R.T.; Methodology, J.Z., H.X., C.Z., T.-R.T.; Project administration, T.-R.T.; Resources, H.X. and T.-R.T.; Software, C.Z. and T.-R.T.; Supervision, J.Z. and T.-R.T.; Writing—original draft, T.-R.T.; Writing—review & editing, J.Z., H.X., C.Z., T.-R.T. All authors contributed equally in writing this article. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Office for Philosophy and Social Sciences of China under Grant 20&ZD137 and by the Ministry of Science and Technology, Taiwan, under Grant MOST 110-2221-E-032-034-MY2.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The first 185 observations in the data set of the example can be found in Chapter 6 of Montgomery, D. C. Statistical Quality Control: A Modern Introduction 7th Edition. John Wiley & Sons, 2012.

Acknowledgments

This study is supported by the grant projects of National Office for Philosophy and Social Sciences of China under Grant 20&ZD137 and Ministry of Science and Technology, Taiwan MOST 110-2221-E-032-034-MY2.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Pan, J.; Wu, S. Process capability analysis for non-normal relay test data. Microelectron. Reliab. 1997, 37, 421–428.
  2. Montgomery, D.C. Statistical Quality Control: A Modern Introduction, 7th ed.; John Wiley & Sons: New York, NY, USA, 2012.
  3. Hong, C.-W.; Wu, J.-W.; Cheng, C.-H. A look at the Burr and related distribution. Appl. Math. Comput. 2007, 184, 336–350.
  4. Lee, H.-M.; Wu, J.-W.; Lei, C.-L.; Hung, W.L. Implementing lifetime performance index of products with two-parameter exponential distribution. Int. J. Syst. Sci. 2011, 42, 1305–1321.
  5. Wu, S.-F.; Chiu, C.-J. Computational testing algorithmic procedure of assessment for lifetime performance index of products with two-parameter exponential distribution based on the multiply type II censored sample. J. Stat. Comput. Simul. 2014, 84, 2016–2122.
  6. Wu, S.-F.; Lin, Y.-P. Computational testing algorithmic procedure of assessment for lifetime performance index of products with one-parameter exponential distribution under progressive type I interval censoring. Math. Comput. Simul. 2016, 120, 79–90.
  7. Lee, W.-C.; Wu, J.-W.; Hong, C.-W. Assessing the lifetime performance index of products from progressively type II right censored data using Burr XII model. Math. Comput. Simul. 2009, 79, 2167–2179.
  8. Ahmadi, M.V.; Doostparast, M.; Ahmadi, J. Statistical inference for the lifetime performance index based on generalised order statistics from exponential distribution. Int. J. Syst. Sci. 2009, 46, 1094–1107.
  9. Lee, W.-C.; Wu, J.-W.; Hong, M.-L. Assessing the lifetime performance index of Rayleigh products based on the Bayesian estimation under progressive type II right censored samples. J. Comput. Appl. Math. 2011, 235, 1676–1688.
  10. Ahmadi, M.V.; Doostparast, M.; Ahmadi, J. Estimating the lifetime performance index with Weibull distribution based on progressive first-failure censoring scheme. J. Comput. Appl. Math. 2013, 239, 93–102.
  11. Goto, M.; Inoue, T. Some properties of power-normal distribution. Jpn. J. Biom. 1980, 1, 28–54.
  12. Goto, M.; Inoue, T.; Tsuchiya, Y. On estimation of parameters in power-normal distribution. Bull. Inform. Cybern. 1984, 21, 41–53.
  13. Goto, M.; Matsubara, Y.; Tsuchiya, Y. Power-normal distribution and its applications. Rep. Stat. Appl. Res. 1983, 30, 8–28.
  14. Freeman, J.; Modarres, R. Inverse Box-Cox: The power-normal distribution. Stat. Probab. Lett. 2006, 76, 764–772.
  15. Castillo, N.O.; Gallardo, D.I.; Bolfarine, H.; Gómez, H. Truncated power-normal distribution with application to non-negative measurements. Entropy 2018, 20, 433.
  16. Maruo, K.; Goto, M. Percentile estimation based on the power-normal distribution. Comput. Stat. 2013, 28, 241–356.
  17. Gupta, R.D.; Gupta, R.C. Analyzing skewed data by power normal model. Test 2008, 17, 197–210.
  18. Maruo, K.; Shirahata, S.; Goto, M. Underlying assumptions of the power-normal distribution. Behaviormetrika 2011, 38, 85–95.
  19. Azzalini, A. A class of distributions which includes the normal ones. Scand. J. Stat. 1985, 12, 171–178.
  20. Efron, B.; Tibshirani, R.J. An Introduction to the Bootstrap; Chapman & Hall: New York, NY, USA, 1993.
  21. Lio, Y.L.; Tsai, T.-R.; Chiang, J.-Y. Estimation of the lower confidence limit of the breaking strength percentiles under progressive type-II censoring. J. Chin. Inst. Ind. Eng. 2012, 29, 16–29.
  22. Dey, S.; Saha, M.; Maiti, S.S.; Jun, C.-H. Bootstrap confidence intervals of generalized process capability index Cpyk for Lindley and power Lindley distributions. Commun. Stat.-Simul. Comput. 2018, 47, 249–262.
  23. Besseris, G.J. Evaluation of robust scale estimators for modified Weibull process capability indices and their bootstrap confidence intervals. Comput. Ind. Eng. 2019, 128, 135–149.
  24. Park, C.; Dey, S.; Ouyang, L.; Byun, J.-H.; Leeds, M. Improved bootstrap confidence intervals for the process capability index Cpk. Commun. Stat.-Simul. Comput. 2020, 49, 2583–2603.
Figure 1. The CRs for estimating $\xi$, $\gamma$, $\sigma$, and $C_L$ using the PBP and BCP methods. The dashed line is the nominal value.
Figure 2. The histogram of the flow width data in Example 1.
Figure 3. The quantile-quantile plot of the flow width data in Example 1 against the power-normal distribution.
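Table 1. The rBias and rsqMSE of the MLEs of $\xi$, $\gamma$, $\sigma$, and $C_L$.

| $\gamma$ | $n$ | rBias $\hat{\xi}$ | rBias $\hat{\gamma}$ | rBias $\hat{\sigma}$ | rBias $\hat{C}_L$ | rsqMSE $\hat{\xi}$ | rsqMSE $\hat{\gamma}$ | rsqMSE $\hat{\sigma}$ | rsqMSE $\hat{C}_L$ | $N_u$ |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.5 | 100 | 0.2096 | −0.5994 | −1.2269 | 1.1320 | 0.7258 | 33.0214 | 2.0278 | 2.2301 | 240 |
| 0.5 | 150 | 0.2390 | −2.6064 | −1.2264 | 0.8151 | 0.6571 | 13.5733 | 2.0054 | 1.4548 | 70 |
| 0.5 | 200 | 0.2434 | −2.9725 | −1.2176 | 0.6825 | 0.6370 | 4.8467 | 1.9865 | 1.1086 | 25 |
| 0.5 | 250 | 0.2475 | −3.1008 | −1.2152 | 0.6299 | 0.6251 | 4.5750 | 1.9792 | 0.9567 | 11 |
| 0.5 | 300 | 0.2462 | −3.1453 | −1.2109 | 0.5905 | 0.6134 | 4.5548 | 1.9772 | 0.8447 | 8 |
| 0.5 | 500 | 0.2544 | −3.2785 | −1.2104 | 0.5447 | 0.5987 | 4.5559 | 1.9692 | 0.7335 | 0 |
| 1 | 100 | 0.2712 | 3.0260 | −0.9345 | 0.8341 | 0.6417 | 26.4524 | 1.4044 | 2.2745 | 511 |
| 1 | 150 | 0.3016 | 0.4983 | −0.9383 | 0.5427 | 0.5659 | 14.5790 | 1.3798 | 1.3561 | 212 |
| 1 | 200 | 0.3241 | −0.2356 | −0.9473 | 0.4684 | 0.5425 | 4.6232 | 1.3725 | 0.9842 | 94 |
| 1 | 250 | 0.3276 | −0.4745 | −0.9443 | 0.4186 | 0.5235 | 3.5571 | 1.3619 | 0.8349 | 33 |
| 1 | 300 | 0.3337 | −0.6071 | −0.9467 | 0.3985 | 0.5155 | 1.8356 | 1.3588 | 0.7632 | 26 |
| 1 | 500 | 0.3420 | −0.7792 | −0.9463 | 0.3538 | 0.4969 | 1.4421 | 1.3504 | 0.6235 | 2 |
| 2 | 100 | 0.2551 | 7.9755 | −0.8761 | 0.2849 | 0.6422 | 45.1581 | 1.2072 | 2.4265 | 1126 |
| 2 | 150 | 0.2928 | 2.6819 | −0.8825 | 0.0267 | 0.5485 | 16.8547 | 1.1900 | 1.4013 | 652 |
| 2 | 200 | 0.3273 | 1.2177 | −0.9007 | −0.0075 | 0.5182 | 7.0036 | 1.1902 | 1.0722 | 402 |
| 2 | 250 | 0.3281 | 0.8255 | −0.8962 | −0.0736 | 0.4936 | 4.0130 | 1.1793 | 0.8994 | 261 |
| 2 | 300 | 0.3342 | 0.5714 | −0.8965 | −0.0970 | 0.4778 | 2.4949 | 1.1746 | 0.8284 | 159 |
| 2 | 500 | 0.3510 | 0.2852 | −0.9045 | −0.1194 | 0.4655 | 1.1307 | 1.1691 | 0.6901 | 39 |
| 5 | 100 | 0.0847 | 10.9240 | −1.3410 | −2.9791 | 0.8151 | 46.2667 | 2.2549 | 6.9958 | 2210 |
| 5 | 150 | 0.1213 | 5.6620 | −1.3496 | −3.3633 | 0.7422 | 26.0057 | 2.2591 | 5.8119 | 1808 |
| 5 | 200 | 0.1431 | 3.1266 | −1.3552 | −3.4987 | 0.7058 | 12.2067 | 2.2567 | 5.4787 | 1470 |
| 5 | 250 | 0.1549 | 2.4277 | −1.3586 | −3.5532 | 0.6930 | 10.3026 | 2.2525 | 5.4107 | 1265 |
| 5 | 300 | 0.1595 | 1.8131 | −1.3582 | −3.6083 | 0.6773 | 6.7543 | 2.2513 | 5.3588 | 1164 |
| 5 | 500 | 0.1811 | 1.0443 | −1.3662 | −3.6470 | 0.6558 | 2.8148 | 2.2494 | 5.2462 | 721 |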
Table 2. The 95% bootstrap confidence intervals based on the original sample with n = 185 observations.

| Parameter | PBP Lower Limit | PBP Upper Limit | BCP Lower Limit | BCP Upper Limit |
|---|---|---|---|---|
| $\xi$ | 1.2716 | 1.6886 | 1.2575 | 1.6842 |
| $\gamma$ | 0.0839 | 5.5656 | 0.0944 | 5.9333 |
| $\sigma$ | 0.0511 | 0.1932 | 0.0535 | 0.1965 |
| $C_L$ | 1.4040 | 13.4726 | 1.2975 | 12.7272 |
Table 3. The 95% bootstrap confidence intervals based on the merged sample with n = 300 and 500.

| $n$ | Parameter | PBP Lower Limit | PBP Upper Limit | BCP Lower Limit | BCP Upper Limit |
|---|---|---|---|---|---|
| 300 | $\xi$ | 1.4302 | 1.6886 | 1.1832 | 1.5994 |
| 300 | $\gamma$ | 0.1089 | 1.8209 | 0.3820 | 5.4131 |
| 300 | $\sigma$ | 0.0588 | 0.1512 | 0.0925 | 0.2001 |
| 300 | $C_L$ | 2.8738 | 11.7485 | 0.9781 | 6.3928 |
| 500 | $\xi$ | 1.4595 | 1.6634 | 1.2955 | 1.5825 |
| 500 | $\gamma$ | 0.1902 | 1.5053 | 0.4523 | 4.1516 |
| 500 | $\sigma$ | 0.0734 | 0.1453 | 0.0975 | 0.1784 |
| 500 | $C_L$ | 3.1750 | 9.0023 | 1.6642 | 5.8747 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Zhu, J.; Xin, H.; Zheng, C.; Tsai, T.-R. Inference for the Process Performance Index of Products on the Basis of Power-Normal Distribution. Mathematics 2022, 10, 35. https://doi.org/10.3390/math10010035

AMA Style

Zhu J, Xin H, Zheng C, Tsai T-R. Inference for the Process Performance Index of Products on the Basis of Power-Normal Distribution. Mathematics. 2022; 10(1):35. https://doi.org/10.3390/math10010035

Chicago/Turabian Style

Zhu, Jianping, Hua Xin, Chenlu Zheng, and Tzong-Ru Tsai. 2022. "Inference for the Process Performance Index of Products on the Basis of Power-Normal Distribution" Mathematics 10, no. 1: 35. https://doi.org/10.3390/math10010035
