Article

The Poisson–Lindley Distribution: Some Characteristics, with Its Application to SPC

by Waleed Ahmed Hassen Al-Nuaami, Ali Akbar Heydari * and Hossein Jabbari Khamnei *
Department of Statistics, Faculty of Mathematics, Statistics and Computer Science, University of Tabriz, Tabriz 51666-16471, Iran
* Authors to whom correspondence should be addressed.
Mathematics 2023, 11(11), 2428; https://doi.org/10.3390/math11112428
Submission received: 8 April 2023 / Revised: 19 May 2023 / Accepted: 22 May 2023 / Published: 24 May 2023

Abstract

Statistical process control (SPC) is a significant method to monitor processes and ensure quality. Control charts are the most important tools in SPC. As production processes and production parts become more complex, there is a need to design control charts using more complex distributions. One of the most important control charts to monitor the number of nonconformities in production processes is the C-chart, which uses the Poisson distribution as a quality characteristic distribution. However, to fit the Poisson distribution to the count data, equality of mean and variance should be satisfied. In some cases, such as biological and medical sciences, count data exhibit overdispersion, which means that the variance of data is greater than the mean. In such cases, we can use the Poisson–Lindley distribution instead of the Poisson distribution to model the count data. In this paper, we first discuss some important characteristics of the Poisson–Lindley distribution. Then, we present parametric and bootstrap control charts when the observations follow the Poisson–Lindley distribution and analyze their performance. Finally, we provide a simulated example and a real-world dataset to demonstrate the implementation of control charts. The results show the good performance of the proposed control charts.

1. Introduction

The quality of production is critical for achieving success in a business. There are different methods for ensuring the quality of production, and statistical process control (SPC) is one of them. SPC involves using statistical methods to monitor production processes and identify potential issues before they can affect quality. By implementing SPC, businesses can improve their quality control efforts and increase the likelihood of success. Some of the latest challenges and research areas in the field of SPC are:
  • Integration with industry technologies: As more and more factories become digitized and automated, there is a need for SPC methodologies to be integrated with these new technologies. For example, using sensors to collect data in real time and feeding that data into statistical models to identify process variations and anomalies.
  • Multivariate SPC: Traditional SPC techniques focus on monitoring single variables at a time, but many manufacturing processes involve multiple interconnected variables. Researchers are working on developing multivariate SPC methods that can handle these complex relationships.
  • Adaptive SPC: Another area of active research is the development of adaptive SPC methods that can automatically adjust control limits in response to changes in the process.
For more studies in these fields, you can refer to [1,2,3].
The statistical tool known as the control chart, which utilizes random sampling, is widely used in the SPC to assess and monitor quality characteristics of interest collected during manufacturing processes and detect deviations from expected levels of control. As production processes and production parts become more complex, there is a need to design control charts using more complex distributions.
One of the most important control charts used to monitor the number of nonconformities in production processes is the C-chart, which uses the Poisson distribution as the distribution of the quality characteristic. Situations where random events occur, such as the number of customers at a service point or the number of defects in a given material, can be modeled using the Poisson distribution. This distribution has practical applications in various fields, including telecommunications, traffic safety, and materials science. Other examples include the number of meteorite collisions with a test satellite during an orbit or the number of organisms in a fluid sample.
While the Poisson distribution is commonly used for modeling independent events, it may not be suitable when successive events are dependent. In such cases, the negative binomial distribution can be used as an alternative to the Poisson distribution; see [4,5,6]. Additionally, the Poisson distribution fits count data accurately only when the mean and variance of the data are equal. In some fields, such as the biological and medical sciences, this condition is not satisfied because the data are over-dispersed, i.e., the variance exceeds the mean. In such cases, we can use the Poisson–Lindley distribution (PLD) instead of the Poisson distribution. Empirical and theoretical reasoning support the use of the Poisson–Lindley distribution in characterizing data from the biological and medical sciences, due to its overdispersion property $(\mu < \sigma^{2})$.
The Lindley distribution is one of the important distributions that have great potential to represent different systems consisting of complex and heterogeneous samples with high flexibility. The pdf of this distribution is given by
$$f(x;\theta)=\frac{\theta^{2}}{1+\theta}\,(1+x)\,e^{-\theta x},\qquad x>0,\ \theta>0, \tag{1}$$
where θ is the shape parameter. In reliability and SPC, there is no need to consider the shift parameter.
The Poisson–Lindley distribution was proposed by [7] for modeling count data. It arises from the Poisson distribution when its parameter $\lambda$ follows the Lindley distribution [8] with the probability density function in Equation (1). That is, conditional on $\lambda$, $X$ has a Poisson probability mass function of the form:
$$P(X=x\mid\lambda)=\frac{e^{-\lambda}\lambda^{x}}{x!},\qquad \lambda>0,\ x=0,1,2,\ldots,$$
in which the parameter $\lambda$ has a Lindley probability density function of the form:
$$f(\lambda;\theta)=\frac{\theta^{2}}{\theta+1}\,(1+\lambda)\,e^{-\lambda\theta},\qquad \lambda>0,\ \theta>0.$$
The joint distribution of $X$ and $\lambda$ is
$$P(X=x,\lambda)=P(X=x\mid\lambda)\,f(\lambda),$$
so integrating out $\lambda$ gives the probability mass function (pmf) of the Poisson–Lindley distribution:
$$p(x)=P(X=x)=\int_{0}^{\infty}P(X=x\mid\lambda)\,f(\lambda)\,d\lambda=\int_{0}^{\infty}\frac{e^{-\lambda}\lambda^{x}}{x!}\,\frac{\theta^{2}}{\theta+1}\,(1+\lambda)\,e^{-\theta\lambda}\,d\lambda.$$
Therefore, the pmf of the PLD is equal to:
$$p(x)=\frac{\theta^{2}(\theta+x+2)}{(1+\theta)^{x+3}},\qquad x=0,1,2,\ldots,\ \theta>0. \tag{2}$$
The cumulative distribution function (cdf) of the Poisson–Lindley distribution is as follows:
$$F(x)=P(X\le x)=1-P(X\ge x+1)=1-\sum_{t=x+1}^{\infty}\int_{0}^{\infty}P(X=t\mid\lambda)\,f(\lambda)\,d\lambda=1-\sum_{t=x+1}^{\infty}\int_{0}^{\infty}\frac{e^{-\lambda}\lambda^{t}}{t!}\,\frac{\theta^{2}}{\theta+1}\,(1+\lambda)\,e^{-\theta\lambda}\,d\lambda,$$
$$F(x)=1-\frac{\theta^{2}+3\theta+1+\theta x}{(\theta+1)^{x+3}},\qquad x=0,1,2,\ldots,\ \theta>0.$$
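The pmf in Equation (2) and the closed-form cdf can be checked against each other numerically. The following is a minimal sketch; the function names `pld_pmf` and `pld_cdf` are illustrative and not taken from the paper.

```python
def pld_pmf(x: int, theta: float) -> float:
    """Poisson-Lindley pmf, Equation (2)."""
    return theta**2 * (theta + x + 2) / (1 + theta) ** (x + 3)


def pld_cdf(x: int, theta: float) -> float:
    """Closed-form Poisson-Lindley cdf."""
    return 1 - (theta**2 + 3 * theta + 1 + theta * x) / (theta + 1) ** (x + 3)


theta = 1.0
# P(X <= 5) obtained by summing the pmf directly
partial_sum = sum(pld_pmf(t, theta) for t in range(6))
print(partial_sum, pld_cdf(5, theta))
```

Summing the pmf over 0, ..., x should reproduce the closed-form cdf up to floating-point error.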
There exist several generalizations of the Poisson–Lindley distribution. The generalized Poisson–Lindley distribution was introduced by [9], while the two-parameter Poisson–Lindley distribution was introduced by [10] using a combination of a Poisson distribution with the two-parameter Lindley distribution. The complementary Poisson–Lindley class of distributions was introduced by [11], and the bivariate Poisson–Lindley distribution was introduced by [12]. Additionally, the size-biased two-parameter Poisson–Lindley distribution was introduced by [13], and the three-parameter Poisson–Lindley distribution was introduced by [14].
The maximum likelihood and method of moments estimators of the Poisson–Lindley parameter are presented in [7], and their consistency and asymptotic behavior are investigated in [15].
To track a quality characteristic $X$ at pre-defined targets $\mu_{0}$ and $\sigma_{0}$ in industrial processes, practitioners often employ Shewhart control charts with 3-sigma control limits. For a comprehensive overview of these charts, one may refer to [16].
Given the non-availability of an analytical expression for the MLE of the shape parameter in Poisson–Lindley processes, inferring its sampling distribution can be challenging. This limitation is overcome by using bootstrap control charts, which offer a reliable way of monitoring these processes. Details about the bootstrap methodology can be found in [17]. Examples of bootstrap control charts can be found in [18,19,20,21,22,23,24].
So far, no control chart has been presented for data that are over-dispersed. Therefore, considering the importance of such data, in this paper we present control charts for them using the Poisson–Lindley distribution. For this purpose, in Section 2, we review some of the characteristics of the PLD. Control charts for data to which the PLD is fitted are presented in Section 3. Section 4 includes numerical results and simulation for the implementation of the control charts. Finally, the discussion and conclusions are presented in Section 5.

2. Some of the Characteristics of the PLD

In this section, we first explain some characteristics of the distribution, such as the graph of the probability mass function, the mean, variance, moment-generating function, skewness and kurtosis. Then, we discuss generating random numbers from the PLD and estimating the parameter of the PLD.
Figure 1 illustrates the pmf of the Poisson–Lindley distribution in Equation (2), showing how it changes at different values of $\theta$.

2.1. Moments

Moments of the PLD are calculated using the following relation, valid for $k\ge 1$:
$$E(X^{k})=\sum_{x=0}^{\infty}x^{k}P(X=x)=\sum_{x=0}^{\infty}x^{k}\,\frac{\theta^{2}(\theta+x+2)}{(1+\theta)^{x+3}}$$
$$=\frac{\theta^{2}}{(1+\theta)^{3}}\left[\frac{\theta+2}{\theta^{k+1}}\sum_{i=1}^{k}\left\{(1+\theta)^{k+1-i}\sum_{j=0}^{i}(-1)^{j}\frac{(k+1)!\,(i-j)^{k}}{j!\,(k+1-j)!}\right\}+\frac{1}{\theta^{k+2}}\sum_{i=1}^{k+1}\left\{(1+\theta)^{k+2-i}\sum_{j=0}^{i}(-1)^{j}\frac{(k+2)!\,(i-j)^{k+1}}{j!\,(k+2-j)!}\right\}\right].$$
Using the above relation, we can obtain:
$$\mu=E(X)=\frac{\theta+2}{\theta(\theta+1)}, \tag{4}$$
$$E(X^{2})=\frac{\theta^{2}+4\theta+6}{(\theta+1)\,\theta^{2}},$$
which implies that
$$\sigma^{2}=Var(X)=\frac{\theta^{3}+4\theta^{2}+6\theta+2}{\theta^{2}(\theta+1)^{2}}. \tag{5}$$
Note that $\sigma^{2}=\mu\left[1+\dfrac{\theta^{2}+4\theta+2}{\theta(\theta+1)(\theta+2)}\right]>\mu$, indicating that the Poisson–Lindley distribution is over-dispersed.
Also, the third and fourth raw moments are:
$$E(X^{3})=\frac{(\theta^{2}+6\theta+12)(\theta+2)}{(\theta+1)\,\theta^{3}},$$
$$E(X^{4})=\frac{\theta^{4}+16\theta^{3}+78\theta^{2}+168\theta+120}{(\theta+1)\,\theta^{4}}.$$
So, the skewness and kurtosis of the PLD are given by the following relations, respectively:
$$\text{skewness}=\frac{2(\theta+1)^{4}(\theta+2)-\theta^{3}(\theta+2)(\theta+3)}{\left[2(\theta+1)^{3}-\theta^{2}(\theta+2)\right]^{3/2}}$$
and
$$\text{kurtosis}=3+\frac{2(\theta+1)^{5}\left[(\theta+3)^{2}-3\right]-\theta^{4}(\theta+2)\left[(\theta+4)^{2}-3\right]}{\left[2(\theta+1)^{3}-\theta^{2}(\theta+2)\right]^{2}}.$$
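As a sanity check, the closed-form mean, variance, skewness and kurtosis can be compared with values computed directly from the pmf. This is a minimal sketch; the truncation point `upper=500` is an assumption that is adequate for moderate values of $\theta$ such as 0.5 or 1, and all function names are illustrative.

```python
def pld_pmf(x, theta):
    # Negative exponent avoids float overflow for large x.
    return theta**2 * (theta + x + 2) * (1 + theta) ** (-(x + 3))


def pld_mean(theta):
    return (theta + 2) / (theta * (theta + 1))


def pld_var(theta):
    return (theta**3 + 4 * theta**2 + 6 * theta + 2) / (theta**2 * (theta + 1) ** 2)


def pld_skewness(theta):
    num = 2 * (theta + 1) ** 4 * (theta + 2) - theta**3 * (theta + 2) * (theta + 3)
    return num / (2 * (theta + 1) ** 3 - theta**2 * (theta + 2)) ** 1.5


def pld_kurtosis(theta):
    num = (2 * (theta + 1) ** 5 * ((theta + 3) ** 2 - 3)
           - theta**4 * (theta + 2) * ((theta + 4) ** 2 - 3))
    return 3 + num / (2 * (theta + 1) ** 3 - theta**2 * (theta + 2)) ** 2


def numeric_moments(theta, upper=500):
    """Mean, variance, skewness, kurtosis from the truncated pmf."""
    probs = [pld_pmf(x, theta) for x in range(upper)]
    mean = sum(x * p for x, p in enumerate(probs))
    var = sum((x - mean) ** 2 * p for x, p in enumerate(probs))
    m3 = sum((x - mean) ** 3 * p for x, p in enumerate(probs))
    m4 = sum((x - mean) ** 4 * p for x, p in enumerate(probs))
    return mean, var, m3 / var**1.5, m4 / var**2


print(numeric_moments(1.0))
print(pld_mean(1.0), pld_var(1.0), pld_skewness(1.0), pld_kurtosis(1.0))
```

For $\theta=1$ both lines should agree, with mean 1.5 and variance 3.25, which illustrates the overdispersion $\sigma^{2}>\mu$.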

2.2. Moment Generating Function and Probability Generating Function

The moment-generating function of a PLD is given by:
$$M_{X}(t)=E(e^{tX})=\frac{(\theta-e^{t}+2)\,\theta^{2}}{(\theta+1)(\theta+1-e^{t})^{2}}.$$
Also, the moment-generating function of the sample mean of $n$ independent PLD observations is equal to:
$$M_{\bar{X}}(t)=E(e^{t\bar{X}})=\prod_{i=1}^{n}M_{X_{i}}\!\left(\frac{t}{n}\right)=\left(M_{X}\!\left(\frac{t}{n}\right)\right)^{n}=\left(\frac{(\theta-e^{t/n}+2)\,\theta^{2}}{(\theta+1)(\theta+1-e^{t/n})^{2}}\right)^{n}.$$
The probability generating function of the PLD is given by:
$$P_{X}(t)=E(t^{X})=\frac{(\theta-t+2)\,\theta^{2}}{(\theta+1)(\theta+1-t)^{2}}.$$

2.3. Generating Random Numbers from PLD

In this section, we discuss how to generate random numbers from the Poisson–Lindley distribution. One reliable method for generating a random variable $x$ that follows the PLD is to generate the parameter $\lambda$ from the Lindley distribution (with $\theta$ as its parameter) and then generate $x$ from the Poisson distribution using $\lambda$ as its parameter. To generate $\lambda$, notice that the Lindley distribution can be expressed as a mixture of an exponential distribution and a gamma distribution as follows:
$$w(\lambda;\theta)=p\,w_{1}(\lambda;\theta)+(1-p)\,w_{2}(\lambda;\theta),\qquad \lambda>0,\ \theta>0,$$
where
$$p=\frac{\theta}{\theta+1},\qquad w_{1}(\lambda;\theta)=\theta e^{-\theta\lambda},\qquad w_{2}(\lambda;\theta)=\theta^{2}\lambda e^{-\theta\lambda},$$
and $w_{1}(\lambda;\theta)$ and $w_{2}(\lambda;\theta)$ are the probability density functions of the exponential distribution with parameter $\theta$ and the gamma distribution with parameters 2 and $\theta$, respectively.
Now we can generate a random number $x$ from the PLD using the following Algorithm 1.
Algorithm 1: Generating a random number from the PLD
Step 1: Generate $\lambda$ from a Lindley distribution with parameter $\theta$ as follows:
(i) Generate $u$ from a $Uniform(0,1)$ distribution.
(ii) If $u\le p=\theta/(\theta+1)$, let $\lambda=y_{1}$; otherwise, let $\lambda=y_{1}+y_{2}$, where $y_{1}$ and $y_{2}$ are random numbers generated from an $Exp(\theta)$ distribution.
Step 2: Generate $x$ from a Poisson distribution with parameter $\lambda$.
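Algorithm 1 can be sketched in a few lines using only the Python standard library. The Poisson step below uses Knuth's multiplication method, which is an assumption of this sketch (the paper does not say which Poisson generator it uses) and is adequate for the small $\lambda$ values that arise here. The names `rpois`, `rlindley` and `rpld` are illustrative.

```python
import math
import random


def rpois(lam, rng):
    """Poisson draw via Knuth's multiplication method (fine for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def rlindley(theta, rng):
    """Step 1: exponential with prob. theta/(theta+1), else sum of two exponentials."""
    y1 = rng.expovariate(theta)
    if rng.random() <= theta / (theta + 1):
        return y1
    return y1 + rng.expovariate(theta)


def rpld(theta, rng):
    """Step 2: Poisson draw whose rate is a Lindley draw."""
    return rpois(rlindley(theta, rng), rng)


rng = random.Random(2023)
sample = [rpld(1.0, rng) for _ in range(20000)]
sample_mean = sum(sample) / len(sample)
print(sample_mean)  # should be close to the theoretical mean 1.5 for theta = 1
```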

2.4. Estimation of the Parameter of the PLD

Let $x_{1},x_{2},\ldots,x_{n}$ be $n$ observations from a PLD with parameter $\theta$.

2.4.1. Method of Moment Estimation

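The method of moments (MoM) estimator of $\theta$, given by [7], is as follows:
$$\tilde{\theta}=\frac{-(\bar{x}-1)+\sqrt{(\bar{x}-1)^{2}+8\bar{x}}}{2\bar{x}}. \tag{6}$$
Ref. [15] has proven that
$$\sqrt{n}\,(\tilde{\theta}-\theta)\xrightarrow{d}N\!\left(0,\,v^{2}(\theta)\right),$$
where
$$v^{2}(\theta)=\frac{\theta^{2}(\theta+1)^{2}(\theta^{3}+4\theta^{2}+6\theta+2)}{(\theta^{2}+4\theta+2)^{2}}.$$
Based on this result, an asymptotic $100(1-\alpha)\%$ confidence interval for $\theta$ is as follows:
$$\tilde{\theta}\pm Z_{\alpha/2}\,\frac{v(\tilde{\theta})}{\sqrt{n}}.$$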
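The MoM estimator and its asymptotic confidence interval are closed-form, so they are easy to sketch. The values of `xbar` and `n` in the usage line are hypothetical, and `z=1.959964` is the standard normal quantile for a 95% interval.

```python
import math


def mom_estimate(xbar):
    """Method of moments estimate of theta, Equation (6)."""
    return (-(xbar - 1) + math.sqrt((xbar - 1) ** 2 + 8 * xbar)) / (2 * xbar)


def mom_asymptotic_sd(theta):
    """v(theta): asymptotic standard deviation of sqrt(n) * (estimate - theta)."""
    v2 = (theta**2 * (theta + 1) ** 2 * (theta**3 + 4 * theta**2 + 6 * theta + 2)
          / (theta**2 + 4 * theta + 2) ** 2)
    return math.sqrt(v2)


def mom_confidence_interval(xbar, n, z=1.959964):
    """Asymptotic interval: estimate +/- z * v(estimate) / sqrt(n)."""
    est = mom_estimate(xbar)
    half_width = z * mom_asymptotic_sd(est) / math.sqrt(n)
    return est - half_width, est + half_width


# Hypothetical summary statistics: sample mean 1.5 from n = 125 observations
print(mom_estimate(1.5), mom_confidence_interval(1.5, 125))
```

A sample mean of 1.5 maps back to $\tilde{\theta}=1$, consistent with $\mu=1.5$ at $\theta=1$ in Equation (4).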

2.4.2. Maximum Likelihood Estimation

The likelihood function of observations $x_{1},x_{2},\ldots,x_{n}$ from the pmf of Equation (2) is equal to:
$$L(\theta)=L(\theta;x_{1},x_{2},\ldots,x_{n})=\prod_{i=1}^{n}P(X_{i}=x_{i})=\prod_{i=1}^{n}\frac{\theta^{2}(\theta+x_{i}+2)}{(1+\theta)^{x_{i}+3}}=\left(\frac{\theta^{2}}{(1+\theta)^{3}}\right)^{n}\frac{\prod_{i=1}^{n}(\theta+x_{i}+2)}{\prod_{i=1}^{n}(1+\theta)^{x_{i}}}=\frac{\theta^{2n}}{(1+\theta)^{3n}}\,\frac{1}{(1+\theta)^{n\bar{x}}}\prod_{i=1}^{n}(\theta+x_{i}+2).$$
Therefore, $l=\log L(\theta)=2n\log\theta-3n\log(1+\theta)-n\bar{x}\log(1+\theta)+\sum_{i=1}^{n}\log(\theta+x_{i}+2)$, and the maximum likelihood (ML) estimate $\hat{\theta}$ of $\theta$ is the solution of the nonlinear equation:
$$\frac{\partial l}{\partial\theta}=\frac{2n}{\theta}-\frac{n(\bar{x}+3)}{\theta+1}+\sum_{i=1}^{n}\frac{1}{x_{i}+\theta+2}=0. \tag{7}$$
Equation (7) has a unique solution for all $n$, but it does not have a closed-form solution and must be solved numerically [15]. Additionally,
$$\sqrt{n}\,(\hat{\theta}-\theta)\xrightarrow{d}N\!\left(0,\,I^{-1}(\theta)\right),$$
where
$$I(\theta)=\frac{2}{\theta^{2}}-\frac{3\theta^{2}+4\theta+2}{\theta(\theta+1)^{3}}+\frac{\theta^{2}}{(\theta+1)^{2}}\int_{0}^{1}\frac{t^{\theta+1}}{\theta+1-t}\,dt$$
is Fisher's information about $\theta$. Based on this, an asymptotic $100(1-\alpha)\%$ confidence interval for $\theta$ is as follows:
$$\hat{\theta}\pm Z_{\alpha/2}\,\frac{I^{-1/2}(\hat{\theta})}{\sqrt{n}}.$$
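Because Equation (7) has no closed-form solution, a simple root finder is enough in practice. The sketch below uses bisection on the score function; the bracket `[1e-6, 1e3]`, the tolerance, and the data vector `xs` are assumptions chosen for illustration, and the method requires a sample mean greater than zero.

```python
def score(theta, xs):
    """Derivative of the log-likelihood, the left-hand side of Equation (7)."""
    n = len(xs)
    xbar = sum(xs) / n
    return (2 * n / theta - n * (xbar + 3) / (theta + 1)
            + sum(1 / (x + theta + 2) for x in xs))


def ml_estimate(xs, lo=1e-6, hi=1e3, tol=1e-10):
    """Bisection on the score: positive near zero, negative for large theta."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if score(mid, xs) > 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2


xs = [0, 1, 0, 3, 2, 0, 1, 5, 0, 2]  # hypothetical counts, not from the paper
theta_hat = ml_estimate(xs)
print(theta_hat, score(theta_hat, xs))
```

Bisection is slower than Newton-type methods but needs no derivative of the score and cannot leave the bracket, which suits a one-parameter problem like this one.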

3. Control Charts for Poisson–Lindley Processes

When observing nonconformities in an inspection unit for a particular type of product, note that the inspection unit is not necessarily a single unit of the product. It can also be a group of products for which records and information are easy to maintain. This group may consist of five, ten, or any other desired number of units of the product. In this section, we introduce control charts for nonconformities of inspection units based on the Poisson–Lindley distribution.

3.1. Control Charts When the Parameter Is Known

Assume that nonconformities occur in the inspection unit according to the Poisson–Lindley distribution with parameter $\theta$ and the pmf of Equation (2), where $x$ is the number of nonconformities. Then a Shewhart-type control chart for nonconformities in the inspection unit with three-sigma limits can be defined according to Equations (4) and (5), as follows:
$$\begin{aligned}
UCL&=\frac{\theta+2}{\theta(\theta+1)}+3\sqrt{\frac{\theta^{3}+4\theta^{2}+6\theta+2}{\theta^{2}(\theta+1)^{2}}},\\
CL&=\frac{\theta+2}{\theta(\theta+1)},\\
LCL&=\frac{\theta+2}{\theta(\theta+1)}-3\sqrt{\frac{\theta^{3}+4\theta^{2}+6\theta+2}{\theta^{2}(\theta+1)^{2}}}.
\end{aligned} \tag{8}$$
If the calculated value for $LCL$ is negative, it should be set equal to zero.
Additionally, considering that $E(\bar{X})=\mu$ and $Var(\bar{X})=\sigma^{2}/n$, the $\bar{X}$ control chart based on the PLD can be defined as follows:
$$\begin{aligned}
UCL&=\frac{\theta+2}{\theta(\theta+1)}+3\sqrt{\frac{\theta^{3}+4\theta^{2}+6\theta+2}{n\,\theta^{2}(\theta+1)^{2}}},\\
CL&=\frac{\theta+2}{\theta(\theta+1)},\\
LCL&=\frac{\theta+2}{\theta(\theta+1)}-3\sqrt{\frac{\theta^{3}+4\theta^{2}+6\theta+2}{n\,\theta^{2}(\theta+1)^{2}}}.
\end{aligned} \tag{9}$$
Again, if the calculated value for $LCL$ is negative, set it equal to zero.
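Equations (8) and (9) differ only in the factor $n$ under the square root, so a single helper can cover both. This is a minimal sketch with an illustrative function name; `n=1` gives the chart for individual inspection units.

```python
import math


def pld_shewhart_limits(theta, n=1):
    """Three-sigma limits from Equations (8) (n = 1) and (9) (subgroups of size n)."""
    mu = (theta + 2) / (theta * (theta + 1))
    var = (theta**3 + 4 * theta**2 + 6 * theta + 2) / (theta**2 * (theta + 1) ** 2)
    half_width = 3 * math.sqrt(var / n)
    lcl = max(0.0, mu - half_width)  # a negative LCL is set to zero
    return lcl, mu, mu + half_width


lcl, cl, ucl = pld_shewhart_limits(theta=1.0, n=5)
print(lcl, cl, ucl)
```

For $\theta=1$ and $n=5$ the center line is 1.5 and the upper limit is about 3.9187, with the lower limit truncated at zero.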

3.2. Control Charts When the Parameter Is Unknown

In many cases, the distribution parameter is unknown in practice. In such cases, the control chart for individual observations is defined as follows:
Let $x_{1},x_{2},\ldots,x_{m}$ be observations assumed to follow a PLD with parameter $\theta$, for which we want to define a control chart. Then, the $LCL$, $CL$, and $UCL$ values can be computed from Equation (8), where $\theta$ is replaced by its MoM or ML estimate, i.e., $\tilde{\theta}$ of Equation (6) or $\hat{\theta}$ of Equation (7), respectively. Here, the sample mean is $\bar{x}=\sum_{i=1}^{m}x_{i}/m$.
Suppose we have $m$ subgroups of size $n$ and want to draw an $\bar{X}$ control chart. In that case, let $x_{i1},x_{i2},\ldots,x_{in}$, $i=1,2,\ldots,m$, be the $i$th subgroup sample of size $n$ from a PLD with parameter $\theta$. Let $\bar{x}_{i}=\sum_{j=1}^{n}x_{ij}/n$ and $\bar{\bar{x}}=\sum_{i=1}^{m}\bar{x}_{i}/m$. Then, the $LCL$, $CL$, and $UCL$ values can be computed from Equation (9), where $\theta$ is replaced by its MoM estimate, $\tilde{\theta}$ of Equation (6), with $\bar{\bar{x}}$ used in place of $\bar{x}$ to compute $\tilde{\theta}$. Alternatively, the $LCL$, $CL$, and $UCL$ values can be computed from Equation (9), where $\theta$ is replaced by its ML estimate, $\hat{\theta}$ of Equation (7). In this case, one must use all $mn$ observations together to compute $\hat{\theta}$.
The main criterion for evaluating control charts is the average run length ($ARL$). To calculate this value for an $\bar{X}$ control chart, we need to know the distribution of $\bar{X}$ in both the in-control and out-of-control states. We attempted to obtain the distribution of $\bar{X}$ by assuming that $X_{1},X_{2},\ldots,X_{n}$ follow a Poisson–Lindley distribution with parameter $\theta$. However, even after calculating the moment-generating function of $\bar{X}$ in Section 2.2, we were unable to identify its distribution. Therefore, we resorted to the bootstrap technique for this purpose.

3.3. Bootstrap Control Charts

In Poisson–Lindley processes, the ML estimator of the parameter has no explicit formula, and closed-form expressions for the sampling distributions of key statistics such as the sample mean and standard deviation are not available, which makes these data sets challenging to analyze. Therefore, bootstrap control charts are very useful for monitoring Poisson–Lindley processes. As seen in Figure 1, the Poisson–Lindley distribution is a skewed distribution, especially for large values of $\theta$, where the amount of skewness is very high. In the case of skewed distributions, several studies indicate that bootstrap control charts generally outperform Shewhart control charts (see [25,26,27]). Therefore, in this section, we introduce bootstrap control charts for the Poisson–Lindley distribution.
To construct a bootstrap control chart, we rely solely on the sample data to estimate the sampling distribution of the parameter estimator and the appropriate control limits. Therefore, it is sufficient to satisfy the traditional assumptions associated with control charts, i.e., a stable process and independent, identically distributed subgroup observations. The following Algorithm 2, similar to the ones proposed in [19,20,21], can be used to implement bootstrap control charts for subgroup samples of size $n$, to monitor the process mean (M-chart) and the process standard deviation (S-chart) of a Poisson–Lindley distribution, respectively.
Algorithm 2: Computing the Bootstrap M and S control charts
Phase I: Estimation and computation of the control limits
1. When the process is in control and stable, observe $m$ (for example, 25 or 30) random samples of size $n$. The observations are assumed to be independent and identically distributed according to a Poisson–Lindley distribution with parameter $\theta$.
2. To obtain the maximum likelihood (ML) estimate of $\theta$, use Equation (7) and apply it to the pooled sample of size $m\times n$.
3. Using Algorithm 1, generate a parametric bootstrap sample of size $n$, $(x_{1}^{*},\ldots,x_{n}^{*})$, from a Poisson–Lindley distribution, with the ML estimate obtained in Step 2 used as the distribution parameter.
4. To monitor the mean value of the process, $\mu$ (M-chart), use the sample mean $\hat{\mu}^{*}=\bar{x}^{*}$ of the bootstrap subgroup obtained in Step 3. Similarly, to monitor the process standard deviation, $\sigma$ (S-chart), use the sample standard deviation $\hat{\sigma}^{*}=s^{*}$ of the bootstrap subgroup obtained in Step 3.
5. To obtain $B$ bootstrap estimates of the parameter of interest (either $\bar{x}^{*}$ or $s^{*}$), repeat Steps 3–4 a large number of times, such as $B=10{,}000$ iterations.
6. Let $\gamma$ denote the intended false alarm rate (FAR) for the chart. To create the bootstrap M-chart, use the $B$ bootstrap estimates obtained in Step 5 and set the lower control limit ($LCL$) to the $(\gamma/2)$th quantile of the distribution of $\hat{\mu}^{*}$ and the upper control limit ($UCL$) to the $(1-\gamma/2)$th quantile of the distribution of $\hat{\mu}^{*}$. For the bootstrap S-chart, set the $UCL$ to the $(1-\gamma)$th quantile of the distribution of $\hat{\sigma}^{*}$. Note that in statistical quality control, a lower standard deviation indicates better quality, and reducing the standard deviation does not cause the system to go out of control. Therefore, the $LCL$ for the S-chart is set equal to 0.
Phase II: Monitoring of process
7. Obtain subgroup samples of size $n$ from the process at fixed time intervals of $h$ hours (for example, once every hour or every half an hour). For each subgroup, compute $\bar{x}$ and $s$.
8. Decision: The M-chart signals that the process mean may be out of control when $\bar{x}$ falls below the $LCL$ or above the $UCL$. Conversely, if $\bar{x}$ falls between the $LCL$ and $UCL$, the mean of the process is deemed to be in control. Meanwhile, the S-chart indicates that the variance of the process could be out of control if $s$ falls above the $UCL$; otherwise, if $s$ falls below the $UCL$, the variance of the process is assumed to be in control.
Steps 1 to 6 of Algorithm 2 must be repeated numerous times (e.g., $k=30$ times) in order to obtain average values for the control limits ($UCL$ and $LCL$) and their associated variances. This approach provides insight into the robustness of the bootstrap control limits.
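Steps 3 to 6 of Algorithm 2 can be sketched as follows, assuming the ML estimate from Step 2 is already available as `theta_hat`. The sampler follows Algorithm 1 with Knuth's Poisson method, the empirical quantile uses the nearest order statistic, and `B=2000` is chosen only to keep the example fast; all of these are assumptions of the sketch rather than the paper's implementation.

```python
import math
import random
import statistics


def rpld(theta, rng):
    """One Poisson-Lindley draw (Lindley mixture step, then Knuth's Poisson method)."""
    lam = rng.expovariate(theta)
    if rng.random() > theta / (theta + 1):
        lam += rng.expovariate(theta)
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def quantile(sorted_vals, q):
    """Empirical quantile taken as the nearest order statistic."""
    idx = min(len(sorted_vals) - 1, max(0, math.ceil(q * len(sorted_vals)) - 1))
    return sorted_vals[idx]


def bootstrap_limits(theta_hat, n, B, gamma, rng):
    """Steps 3-6: bootstrap (LCL, UCL) pairs for the M-chart and the S-chart."""
    means, sds = [], []
    for _ in range(B):
        sub = [rpld(theta_hat, rng) for _ in range(n)]
        means.append(statistics.fmean(sub))
        sds.append(statistics.stdev(sub))
    means.sort()
    sds.sort()
    return {
        "M": (quantile(means, gamma / 2), quantile(means, 1 - gamma / 2)),
        "S": (0.0, quantile(sds, 1 - gamma)),  # one-sided chart: LCL fixed at 0
    }


limits = bootstrap_limits(theta_hat=1.0, n=5, B=2000, gamma=0.0027,
                          rng=random.Random(1))
print(limits)
```

Repeating the call with different seeds and averaging the limits corresponds to the $k$ repetitions described above.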
The present investigation employed Algorithm 2 to create bootstrap M-charts and S-charts for subgroups of size $n=5$. These control charts were used to monitor the process mean and standard deviation of a Poisson–Lindley process at target values of $\mu_{0}$ and $\sigma_{0}$, respectively. Without loss of generality, we assume $\theta=1$ (or equivalently $\mu_{0}=1.5$ and $\sigma_{0}^{2}=3.25$). Assuming a false alarm rate (FAR) of $\gamma=0.0027$, there is an expected in-control average run length ($ARL_{0}$) of approximately 370.4. In the implementation of the bootstrap control charts, we used $m=25$ subgroups, each containing $n=5$ observations, during Phase I.
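The link between the false alarm rate and $ARL_{0}$ follows from the run length being geometric, assuming successive subgroups are independent and the control limits are fixed. Under that assumption the mean run length is $1/p$ and its standard deviation is $\sqrt{1-p}/p$, where $p$ is the per-sample signal probability. A minimal sketch with an illustrative function name:

```python
import math


def run_length_summary(signal_prob):
    """ARL and SDRL of a geometric run length with per-sample signal probability p."""
    arl = 1 / signal_prob
    sdrl = math.sqrt(1 - signal_prob) / signal_prob
    return arl, sdrl


arl0, sdrl0 = run_length_summary(0.0027)
print(round(arl0, 1), round(sdrl0, 1))
```

With $\gamma=0.0027$ this gives an in-control ARL of about 370.4, and an SDRL of nearly the same size, which is typical for geometric run lengths with small $p$.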
The performance of the bootstrap M and S control charts in identifying changes in the process mean and standard deviation was evaluated based on the average run lengths ($ARL$s). Various levels of change were considered to assess the charts' efficacy. In scenarios where the process shifts from the in-control state to an out-of-control state, it is assumed that the mean of the process shifts from $\mu=\mu_{0}$ to $\mu=\mu_{0}+\delta\sigma_{0}$, and/or the standard deviation of the process changes from $\sigma_{0}$ to $\xi\sigma_{0}$. Since variance reduction is not a concern in SPC, it is assumed that $\xi>1$. Additionally, $\delta$ cannot be too negative: because $\sigma^{2}>\mu$ and $\mu>0$ in the PLD, a large negative shift makes the calculated $LCL$ negative, so it is set equal to 0. Since the data are counts, none of them can fall below an $LCL$ of 0. Consequently, the $ARL$ and the standard deviation of the run length ($SDRL$) cannot be calculated in this situation.
We set the $\delta$ values for the M-chart equal to 0 (no change in the mean of the process, to check the false alarm rate), 0.25, 0.5, 0.75 (to investigate small positive shifts in the mean of the process), 1, 1.25, 1.5 (to investigate moderate positive shifts in the mean), 2, 2.25, and 2.5 (to investigate large positive shifts in the mean). Additionally, we investigated negative shifts in the mean by setting $\delta$ equal to −0.2, −0.4, −0.6, and −0.8. We did not include any values less than −0.8 because the $LCL$ for the M-chart would be negative.
For the S-chart, we set the $\xi$ values equal to 1 (no change in the variance of the process, to check for false alarms), 1.1, 1.2, 1.3, and 1.4 (to check for small changes in variance), 1.5, 1.6, 1.7, 1.8, and 1.9 (to examine moderate changes in variance), and 2, 2.25, and 2.5 (to examine large changes in variance).
For each value of $\delta$ and $\xi$, we set $m=25$, $n=5$, $B=10{,}000$, and $\gamma=0.0027$. Then, we repeated Steps 1 to 6 of Algorithm 2 $k=30$ times. The results are presented in Table 1.
Table 1 presents the $ARL$ and the associated $SDRL$ values of the bootstrap M-chart and S-chart for different values of $\delta$ and $\xi$. The table indicates that both the bootstrap M and S control charts perform well, even for small changes. For example, if the mean of a Poisson–Lindley process with parameter $\theta=1$ increases by $1.5\sigma$ ($\delta=1.5$), the bootstrap M-chart will detect this shift on average after 2.34 samples with a standard deviation of 1.83. Additionally, if the standard deviation of a Poisson–Lindley process with parameter $\theta=1$ is doubled ($\xi=2$), the bootstrap S-chart will detect this change on average after 2.52 samples with a standard deviation of 2.23. Furthermore, for the M-chart, when the process is in control ($\delta=0$), the $ARL$ is equal to 370.61. This indicates that, on average, the M-chart gives one false alarm every 370.61 samples. Similarly, for the S-chart, when the process is in control ($\xi=1$), the $ARL$ is 370.71.
In general, in both the bootstrap M and S control charts, as the magnitude of the change increases, the $ARL$ and $SDRL$ values decrease rapidly.

3.4. Process Capability Analysis Using the Proposed Control Charts

The common way to express process capability is to use the process capability ratio ($PCR$ or $C_{p}$). This ratio is calculated in Shewhart-type control charts for a quality characteristic as follows:
$$C_{p}=PCR=\frac{USL-LSL}{6\sigma}, \tag{10}$$
where $USL$ and $LSL$ are the upper and lower specification limits, respectively, which must be specified by the process design engineers.
Since the control charts introduced in Section 3.1 and Section 3.2 are also of the Shewhart type, $C_{p}$ is calculated for them using Equation (10), where $\sigma$ is the square root of $\sigma^{2}$ in Equation (5). In the case of an unknown parameter, $\sigma$ can be estimated using either $\tilde{\theta}$ or $\hat{\theta}$ from Equations (6) and (7), respectively.
Another process capability ratio that takes process centering into account is $C_{pk}$, which is calculated for a quality characteristic in Shewhart-type control charts as follows:
$$C_{pk}=\min(C_{pu},C_{pl}), \tag{11}$$
where $C_{pu}=\dfrac{USL-\mu}{3\sigma}$ and $C_{pl}=\dfrac{\mu-LSL}{3\sigma}$.
For the control charts introduced in Section 3.1 and Section 3.2, $\mu$ is calculated using Equation (4), and $\sigma$ is the square root of $\sigma^{2}$ in Equation (5). In the case of an unknown parameter, it can be estimated using either $\tilde{\theta}$ or $\hat{\theta}$.
For bootstrap control charts, $C_{p}$ can be calculated from the following equation:
$$C_{p}=PCR=\frac{USL-LSL}{UCL-LCL},$$
where the process design engineers must again specify the $USL$ and $LSL$ values. For the bootstrap M-chart, the $LCL$ is calculated as the $(\gamma/2)$th quantile of the distribution of $\hat{\mu}^{*}$ and the $UCL$ is calculated as the $(1-\gamma/2)$th quantile of the distribution of $\hat{\mu}^{*}$. Additionally, for the bootstrap S-chart, the $UCL$ is calculated as the $(1-\gamma)$th quantile of the distribution of $\hat{\sigma}^{*}$, while the $LCL$ is set to zero. These values must be calculated using Algorithm 2.
For the bootstrap M-chart, $C_{pk}$ is calculated using Equation (11), where $C_{pu}=\dfrac{USL-mean(\hat{\mu}^{*})}{3\,sd(\hat{\mu}^{*})}$ and $C_{pl}=\dfrac{mean(\hat{\mu}^{*})-LSL}{3\,sd(\hat{\mu}^{*})}$. Here, $mean(\hat{\mu}^{*})$ and $sd(\hat{\mu}^{*})$ are the mean and standard deviation of the distribution of $\hat{\mu}^{*}$, respectively. Since the bootstrap S-chart is a one-limit control chart, its $C_{pk}$ value is equal to $C_{pu}=\dfrac{USL-mean(\hat{\sigma}^{*})}{3\,sd(\hat{\sigma}^{*})}$, where $mean(\hat{\sigma}^{*})$ and $sd(\hat{\sigma}^{*})$ are the mean and standard deviation of the distribution of $\hat{\sigma}^{*}$, respectively.
When evaluating process capability using $C_{p}$ and $C_{pk}$, a general rule of thumb is:
  • $C_{p}$ and $C_{pk}$ values less than 1 indicate that the process is not capable of meeting customer requirements.
  • $C_{p}$ values greater than 1 indicate that the process has the potential to meet customer requirements, but may not be doing so consistently.
  • $C_{pk}$ values greater than 1 indicate that the process is capable of meeting customer requirements with a low defect rate.
However, it is important to note that these are general guidelines, and the specific interpretation of $C_{p}$ and $C_{pk}$ values will depend on the context of the process being analyzed and the customer requirements [16].
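The capability ratios above reduce to a few arithmetic steps once $\mu$ and $\sigma$ are available. The sketch below evaluates Equations (10) and (11) for a PLD with known $\theta$; the specification limits `usl=8` and `lsl=0` are hypothetical values chosen only for illustration.

```python
import math


def pld_mean_sd(theta):
    """Mean and standard deviation from Equations (4) and (5)."""
    mu = (theta + 2) / (theta * (theta + 1))
    var = (theta**3 + 4 * theta**2 + 6 * theta + 2) / (theta**2 * (theta + 1) ** 2)
    return mu, math.sqrt(var)


def cp(usl, lsl, sigma):
    """Equation (10)."""
    return (usl - lsl) / (6 * sigma)


def cpk(usl, lsl, mu, sigma):
    """Equation (11): the smaller of the one-sided ratios."""
    return min((usl - mu) / (3 * sigma), (mu - lsl) / (3 * sigma))


def cp_bootstrap(usl, lsl, ucl, lcl):
    """Bootstrap variant: specification width over control-limit width."""
    return (usl - lsl) / (ucl - lcl)


mu, sigma = pld_mean_sd(1.0)
usl, lsl = 8.0, 0.0  # hypothetical specification limits
print(cp(usl, lsl, sigma), cpk(usl, lsl, mu, sigma))
```

Because the PLD is right-skewed with a mean close to zero, $C_{pl}$ is usually the binding term in $C_{pk}$ when $LSL=0$.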

4. Numerical Results and Simulation

In this section, we examine the implementation of the control charts introduced in the previous section for subgroups of the Poisson–Lindley distribution using a simulated example and a real dataset.

4.1. Simulated Example

To produce the simulated sample using the R software, we first randomly generated 25 samples of size 5 from the PLD with parameter $\theta=1$ using Algorithm 1 for Phase I. Then, using Equation (9), we calculated the values of $CL$, $LCL$, and $UCL$ for subgroups of size 5 from the PLD with parameter $\theta=1$. The calculated values are as follows: $CL=1.5$, $UCL=3.918678$, and $LCL=-0.91868$. As mentioned in the previous section, we set $LCL$ to 0.
Using Algorithm 2, we calculated the control limits of the bootstrap M- and S-charts for the 25 subgroups of Phase I. We considered a bootstrap iteration size of $B=10{,}000$ and repeated the bootstrap calculations $k=25$ times. The results are as follows: for the M-chart, $LCL=0$ and $UCL=4.71$; for the S-chart, $UCL=4.167$.
For the exploitation phase (Phase II), we first randomly generated 10 subgroups of size 5 from the PLD with parameter $\theta=1$, which correspond to samples numbered 26 to 35. Then, we randomly generated five subgroups of size 5 from the PLD with parameter $\theta=0.5$, which correspond to samples numbered 36 to 40. Figure 2, Figure 3 and Figure 4 display the control chart for the subgroups of the PLD based on Equation (9), as well as the M-chart and S-chart, for subgroups 1 to 40 of the simulation.
As depicted in Figure 2, samples 37 and 39 are out of control. Note that the mean of the process has shifted upwards from sample 36 onwards due to the change in the PLD parameter, and our control chart detected this shift at sample 37, i.e., after a run length of 2, which is an excellent outcome.
Based on Figure 3, sample 39 is out of control. Our bootstrap M-chart detected the change in the mean of the process after four samples, i.e., with a run length of 4, which indicates its good performance. Moreover, Figure 4 shows that sample 37 is out of control. Our bootstrap S-chart detected the change in the standard deviation of the process at sample 37, after only two samples, i.e., with a run length of 2, indicating its excellent performance.

4.2. A Real Data Set Example

To assess the implementation of the control charts introduced in Section 3 on a real dataset, we considered the “number of European red mites on apple leaves” dataset available in [28]. We fitted the Poisson and Poisson–Lindley distributions to this dataset using the “fitdistrplus” package in R software.
Table 2 displays the observed and expected frequencies resulting from fitting the distributions, along with estimates of the distribution parameters and goodness-of-fit statistics.
As shown in Table 2, we estimated the value of parameter θ for the Poisson distribution to be 1.146667 using the maximum likelihood (ML) method. The chi-square test statistic for fitting the Poisson distribution to the dataset is 49.15817 with a p-value nearly equal to zero (up to 9 decimal places), so the hypothesis that the data follow the Poisson distribution is rejected. For the Poisson–Lindley distribution, the ML estimate of parameter θ is 1.26016. The chi-square test statistic for this distribution is 1.251797 with a p-value of 0.7406099, indicating that the PLD fits the dataset well. Additionally, both the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) values for the PLD are lower than the corresponding values for the Poisson distribution, which supports the PLD over the Poisson distribution for these data.
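The information criteria can be recomputed from the observed frequencies in Table 2. The sketch below is only an illustration: it plugs in the reported ML estimates rather than re-estimating them, uses the standard Poisson–Lindley pmf θ²(x + θ + 2)/(θ + 1)^(x+3), and treats each model as having one free parameter. The function names are illustrative.

```python
import math

# Observed frequencies from Table 2 (mites per leaf -> number of leaves)
observed = {0: 70, 1: 38, 2: 17, 3: 10, 4: 9, 5: 3, 6: 2, 7: 1, 8: 0}

def poisson_logpmf(x, lam):
    return x * math.log(lam) - lam - math.lgamma(x + 1)

def pl_logpmf(x, theta):
    # log of theta^2 (x + theta + 2) / (theta + 1)^(x + 3)
    return 2 * math.log(theta) + math.log(x + theta + 2) - (x + 3) * math.log(theta + 1)

def information_criteria(logpmf, param, freq, k=1):
    n = sum(freq.values())
    loglik = sum(f * logpmf(x, param) for x, f in freq.items())
    return {"loglik": loglik, "AIC": 2 * k - 2 * loglik, "BIC": k * math.log(n) - 2 * loglik}

n_leaves = sum(observed.values())
lam_hat = sum(x * f for x, f in observed.items()) / n_leaves  # Poisson MLE = sample mean
theta_hat = 1.26016                                           # reported PLD estimate

poisson_ic = information_criteria(poisson_logpmf, lam_hat, observed)
pl_ic = information_criteria(pl_logpmf, theta_hat, observed)
print(poisson_ic)
print(pl_ic)
```

Under these assumptions the recomputed AIC and BIC values agree closely with those listed in Table 2, and both criteria are smaller for the PLD.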
We considered both an ordinary C-chart based on the Poisson distribution and a PLD-based control chart for comparison in analyzing this dataset. For the C-chart, we have CL = 1.146667 and UCL = 4.359143 (found by adding 3 times the square root of 1.146667 to the center line). The computed lower limit is negative, which is not meaningful for count data, so we set LCL = 0. For the PLD-based control chart, we substituted θ̂ = 1.26016 in Equation (8) and found CL = 1.146667 and UCL = 5.604811; the computed lower limit is again negative, so LCL = 0.
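Both sets of limits can be reproduced with a few lines. Equation (8) is not restated here, so the sketch below rests on an assumption: the PLD-based limits are taken as the reported center line plus or minus three Poisson–Lindley standard deviations evaluated at θ̂. The function names are illustrative.

```python
import math

def c_chart_limits(lam, k=3.0):
    # Classical C-chart: Poisson mean and variance are both lam
    half_width = k * math.sqrt(lam)
    return {"CL": lam, "UCL": lam + half_width, "LCL": max(0.0, lam - half_width)}

def pl_sd(theta):
    # Standard deviation of the Poisson-Lindley distribution
    variance = (theta**3 + 4 * theta**2 + 6 * theta + 2) / (theta**2 * (theta + 1) ** 2)
    return math.sqrt(variance)

def pld_chart_limits(center, theta, k=3.0):
    # Assumed form of Equation (8): center +/- k * sd, with sd from the fitted PLD
    half_width = k * pl_sd(theta)
    return {"CL": center, "UCL": center + half_width, "LCL": max(0.0, center - half_width)}

c_limits = c_chart_limits(1.146667)
pld_limits = pld_chart_limits(center=1.146667, theta=1.26016)
print(c_limits)
print(pld_limits)
```

The wider PLD-based limits reflect the larger variance that the Poisson–Lindley model assigns to the same mean count.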
Figure 5 and Figure 6 show the C-chart and the PLD-based control chart for this dataset, respectively.
Figure 5 illustrates that if one uses the C-chart without checking the fit of the Poisson distribution to the dataset, one will incorrectly identify six out-of-control samples. Given the results of the goodness-of-fit tests, the PLD should be used for these data. Figure 6 indicates that in this case there are only three out-of-control samples.
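These counts can be checked directly against the observed frequencies in Table 2, since an individual count signals whenever it exceeds the upper control limit. The following sketch uses the two upper limits reported in the text; the function name is illustrative.

```python
# Observed frequencies from Table 2 (mites per leaf -> number of leaves)
observed = {0: 70, 1: 38, 2: 17, 3: 10, 4: 9, 5: 3, 6: 2, 7: 1, 8: 0}

def count_above(freq, ucl):
    # Number of observations whose count exceeds the upper control limit
    return sum(f for x, f in freq.items() if x > ucl)

c_chart_signals = count_above(observed, 4.359143)    # Poisson-based C-chart UCL
pld_chart_signals = count_above(observed, 5.604811)  # PLD-based chart UCL
print(c_chart_signals, pld_chart_signals)
```

Leaves with 5, 6, or 7 mites exceed the C-chart limit, whereas only leaves with 6 or 7 mites exceed the PLD-based limit.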

5. Discussion and Conclusions

Statistical Process Control (SPC) commonly employs control charts as graphical and statistical tools to detect whether a process is under control. Shewhart control charts with 3-sigma control limits are widely used for monitoring industrial processes. One of the most important control charts for monitoring the number of nonconformities in production processes is the C-chart, which uses the Poisson distribution as a quality characteristic distribution. However, when the dataset is over-dispersed, the Poisson distribution may not be suitable, and the Poisson–Lindley distribution can serve as a suitable alternative.
To our knowledge, no control chart based on the Poisson–Lindley distribution has previously been designed for over-dispersed count data. In this article, we reviewed some important features of the Poisson–Lindley distribution and introduced parametric control charts for both individual observations and subgroups, for known and unknown parameter values. Additionally, we presented, in algorithmic form, non-parametric bootstrap charts to monitor the mean and standard deviation of a Poisson–Lindley process. We demonstrated how to implement the parametric and bootstrap control charts using a simulated example of randomly generated subgroups from the Poisson–Lindley distribution, as well as a real-world dataset. Furthermore, we evaluated the performance of the bootstrap control charts using the ARL and SDRL criteria. The results show excellent performance of the parametric charts and very good performance of the bootstrap control charts introduced in this paper.

Author Contributions

Conceptualization, A.A.H. and H.J.K.; Data curation, W.A.H.A.-N.; Formal analysis, H.J.K.; Investigation, W.A.H.A.-N.; Methodology, W.A.H.A.-N.; Software, W.A.H.A.-N. and A.A.H.; Supervision, A.A.H. and H.J.K.; Validation, W.A.H.A.-N.; Visualization, W.A.H.A.-N. and A.A.H.; Writing—original draft, W.A.H.A.-N.; Writing—review and editing, A.A.H. and H.J.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The authors confirm that the data supporting the findings of this study are available within the article.

Acknowledgments

The authors would like to thank the editor and referees for their suggestions that improved the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ramos, M.; Ascencio, J.; Hinojosa, M.V.; Vera, F.; Ruiz, O.; Jimenez-Feijoó, M.I.; Galindo, P. Multivariate statistical process control methods for batch production: A review focused on applications. Prod. Manuf. Res. 2021, 9, 33–55.
  2. Apsemidis, A.; Psarakis, S.; Moguerza, J.M. A review of machine learning kernel methods in statistical process monitoring. Comput. Ind. Eng. 2020, 142, 106376.
  3. Zhang, C.; Niu, S. Adaptive industrial control data analysis based on deep learning. Evol. Intell. 2023, 20, 1–9.
  4. Vellaisamy, P.; Upadhye, N.; Cekanavicius, V. On negative binomial approximation. Theory Probab. Its Appl. 2013, 57, 97–109.
  5. Sengar, A.S.; Upadhye, N.S. Subordinated compound Poisson processes of order k. Mod. Stoch. Theory Appl. 2020, 7, 395–413.
  6. Kumar, A.N.; Upadhye, N.S. On discrete Gibbs measure approximation to runs. Commun. Stat.-Theory Methods 2022, 51, 1488–1513.
  7. Sankaran, M. 275. Note: The discrete Poisson–Lindley distribution. Biometrics 1970, 26, 145–149.
  8. Lindley, D.V. Fiducial distributions and Bayes’ theorem. J. R. Stat. Soc. Ser. B 1958, 20, 102–107.
  9. Mahmoudi, E.; Zakerzadeh, H. Generalized Poisson–Lindley distribution. Commun. Stat.-Theory Methods 2010, 39, 1785–1798.
  10. Shanker, R.; Mishra, A. A two-parameter Poisson–Lindley distribution. Int. J. Stat. Syst. 2014, 9, 79–85.
  11. Hassan, A.S.; Assar, S.M.; Ali, K.A. The complementary Poisson–Lindley class of distributions. Int. J. Adv. Stat. Probab. 2015, 3, 146–160.
  12. Zamani, H.; Faroughi, P.; Ismail, N. Bivariate Poisson–Lindley distribution with application. J. Math. Stat. 2015, 11, 1.
  13. Shanker, R.; Mishra, A. On size-biased two parameter Poisson–Lindley distribution and its applications. Am. J. Math. Stat. 2017, 7, 99–107.
  14. Das, K.K.; Ahmed, I.; Bhattacharjee, S. A new three-parameter Poisson–Lindley distribution for modelling over-dispersed count data. Int. J. Appl. Eng. Res. 2018, 13, 16468–16477.
  15. Ghitany, M.; Al-Mutairi, D. Estimation methods for the discrete Poisson–Lindley distribution. J. Stat. Comput. Simul. 2009, 79, 1–9.
  16. Montgomery, D.C. Introduction to Statistical Quality Control; John Wiley & Sons: Hoboken, NJ, USA, 2020.
  17. Efron, B.; Tibshirani, R.J. An Introduction to the Bootstrap; CRC Press: Boca Raton, FL, USA, 1994.
  18. Bai, D.; Choi, I. X and R control charts for skewed populations. J. Qual. Technol. 1995, 27, 120–131.
  19. Nichols, M.D.; Padgett, W. A bootstrap control chart for Weibull percentiles. Qual. Reliab. Eng. Int. 2006, 22, 141–151.
  20. Lio, Y.; Park, C. A bootstrap control chart for inverse Gaussian percentiles. J. Stat. Comput. Simul. 2010, 80, 287–299.
  21. Lio, Y.L.; Park, C. A bootstrap control chart for Birnbaum–Saunders percentiles. Qual. Reliab. Eng. Int. 2008, 24, 585–600.
  22. Chiang, J.-Y.; Lio, Y.; Ng, H.; Tsai, T.-R.; Li, T. Robust bootstrap control charts for percentiles based on model selection approaches. Comput. Ind. Eng. 2018, 123, 119–133.
  23. Saeed, N.; Kamal, S.; Aslam, M. Percentile bootstrap control chart for monitoring process variability under non-normal processes. Sci. Iran. 2021, in press.
  24. Ma, Z.; Park, C.; Wang, M. A Robust Bootstrap Control Chart for the Log-Logistic Percentiles. J. Stat. Theory Pract. 2022, 16, 3.
  25. Seppala, T.; Moskowitz, H.; Plante, R.; Tang, J. Statistical process control via the subgroup bootstrap. J. Qual. Technol. 1995, 27, 139–153.
  26. Liu, R.Y.; Tang, J. Control charts for dependent and independent measurements based on bootstrap methods. J. Am. Stat. Assoc. 1996, 91, 1694–1700.
  27. Jones, L.A.; Woodall, W.H. The performance of bootstrap control charts. J. Qual. Technol. 1998, 30, 362–375.
  28. Bliss, C.I.; Fisher, R.A. Fitting the negative binomial distribution to biological data. Biometrics 1953, 9, 176–200.
Figure 1. The Poisson–Lindley distribution’s pmf at different θ values: (a) 0.25, (b) 0.5, (c) 1, and (d) 2.
Figure 2. Control chart for randomly generated subgroups of size 5 from the PLD.
Figure 3. Bootstrap control M-chart for randomly generated subgroups of size 5 from the PLD.
Figure 4. Bootstrap control S-chart for randomly generated subgroups of size 5 from the PLD.
Figure 5. C-chart for the number of European red mites on apple leaves dataset.
Figure 6. The PLD-based control chart for the number of European red mites on apple leaves dataset.
Table 1. ARL and SDRL of the bootstrap M- and S-charts for subgroups of size n = 5.

| δ (M-chart) | ARL (M-chart) | SDRL (M-chart) | ξ (S-chart) | ARL (S-chart) | SDRL (S-chart) |
|---|---|---|---|---|---|
| 0 | 370.61 | 371.72 | 1 | 370.71 | 369.83 |
| 0.25 | 182.43 | 185.64 | 1.1 | 156.78 | 162.49 |
| 0.5 | 51.31 | 58.42 | 1.2 | 52.36 | 55.74 |
| 0.75 | 18.67 | 22.96 | 1.3 | 26.35 | 28.47 |
| 1 | 8.23 | 7.32 | 1.4 | 14.18 | 14.93 |
| 1.25 | 4.15 | 3.97 | 1.5 | 8.32 | 7.64 |
| 1.5 | 2.34 | 1.83 | 1.6 | 5.92 | 5.76 |
| 1.75 | 1.76 | 1.28 | 1.7 | 4.56 | 3.97 |
| 2 | 1.22 | 0.63 | 1.8 | 3.47 | 3.16 |
| 2.25 | 1.14 | 0.42 | 1.9 | 2.95 | 2.48 |
| 2.5 | 1.03 | 0.24 | 2 | 2.52 | 2.23 |
| −0.2 | 132.4 | 136.3 | 2.25 | 2.18 | 1.34 |
| −0.4 | 43.6 | 44.7 | 2.5 | 1.50 | 1.12 |
| −0.6 | 8.8 | 8.2 | | | |
| −0.8 | 2.2 | 1.9 | | | |
Table 2. Goodness-of-fit statistics of Poisson and Poisson–Lindley distributions to the number of European red mites on apple leaves dataset.

| Number of European Red Mites per Leaf | Observed Frequency | Expected Frequency (Poisson) | Expected Frequency (Poisson–Lindley) |
|---|---|---|---|
| 0 | 70 | 47.6 | 67.2 |
| 1 | 38 | 54.6 | 38.9 |
| 2 | 17 | 31.3 | 21.2 |
| 3 | 10 | 11.9 | 11.1 |
| 4 | 9 | 3.4 | 5.7 |
| 5 | 3 | 0.8 | 2.8 |
| 6 | 2 | 0.2 | 1.4 |
| 7 | 1 | 0.1 | 0.9 |
| 8 | 0 | 0.1 | 0.8 |
| Total | 150 | 150 | 150 |
| ML Estimate | | θ̂ = 1.146667 | θ̂ = 1.26016 |
| Standard Error | | 0.08743245 | 0.1139965 |
| Chi-square Statistic | | 49.15817 | 1.251797 |
| Chi-square d.f. | | 3 | 3 |
| Chi-square p-value | | 1.207139 × 10^−10 | 0.7406099 |
| AIC | | 487.6199 | 447.0218 |
| BIC | | 490.6305 | 450.0324 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
