Outliers Detection Models in Shewhart Control Charts; An Application in Photolithography: A Semiconductor Manufacturing Industry

Abstract: Shewhart control charts with estimated control limits are widely used in practice. However, the estimated control limits are often affected by phase-I estimation errors. These estimation errors arise from variation in the practitioner's choice of sample size as well as from the presence of outlying errors in phase-I. The unnecessary variation due to outlying errors disturbs the control limits, implying a less efficient control chart in phase-II. In this study, we propose models based on the Tukey and median absolute deviation outlier detectors for detecting the errors in phase-I. These two outlier detection models are efficient, robust, and distribution free. Using the Monte Carlo simulation method, we study the estimation effect via the proposed outlier detection models on the Shewhart chart in normal as well as non-normal environments. Performance is evaluated through the run length properties, namely the average run length and the standard deviation of the run length. The findings show that the proposed design structures are more stable in the presence of outlier detectors and require fewer phase-I observations to stabilize the run-length properties. Finally, we implement the findings in the semiconductor manufacturing industry, where a real dataset is extracted from a photolithography process.


Introduction
The two salient tools of statistical process control (SPC) are memory and memory-less control charts. Memory-less control charts are most suitable for large shifts, while memory control charts are used to monitor moderate and small shifts. The prominent memory-less control chart for location monitoring is the Shewhart $\bar{X}$ control chart. In general, control charts, irrespective of the magnitude they measure, operate in two phases: phase-I, the prospective stage from which the control limits are obtained, and phase-II, where we monitor the process and correct the unnatural causes of variation whenever they occur (cf. [1]). In phase-I we estimate the control limits using the parameters of the process under study which, in reality, are seldom known. The amount of data employed in phase-I for estimating the process parameters varies from one practitioner to another. As a result, this variability affects the chart performance in the monitoring stage, i.e., phase-II (see, for example, [2][3][4][5][6]).

Methodology
In this section, we give details of the Shewhart control chart for normal and non-normal environments. The following subsections discuss the known and unknown parameter scenarios, the practitioner-to-practitioner variation in the estimation stage, the presence of outliers/extreme values in the estimation sample, and the incorporation of outlier detection models in the Shewhart chart.

Overview of the Shewhart Control Chart
Let $Y_{ij}$, $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots$, represent the $i$th observation from the $j$th sample of an ongoing (continuous) process. Further, $Y_{ij}$ follows a normal distribution with mean $\mu_0 + \delta\sigma_0$ and variance $\sigma_0^2$, i.e., $Y_{ij} \sim N(\mu_0 + \delta\sigma_0, \sigma_0^2)$. The process is said to be in the in-control (IC) state if $\delta = 0$, and out-of-control (OoC) otherwise. A default Shewhart set-up monitors a process by plotting the sample mean $\bar{Y}_j = \frac{1}{n}\sum_{i=1}^{n} Y_{ij}$ against the following control chart limits:
$$\mathrm{UCL} = \mu_0 + L\,\frac{\sigma_0}{\sqrt{n}}, \qquad \mathrm{LCL} = \mu_0 - L\,\frac{\sigma_0}{\sqrt{n}}, \tag{1}$$
where UCL and LCL denote the upper and lower control limits, respectively, and $L$ is the constant that determines the width of the limits. The limits in (1) are useful when the parameters ($\mu_0$ and $\sigma_0^2$) of the process are known. However, when they are unknown, their respective unbiased estimators from phase-I are used, and the resulting control chart structures will be in estimated form.
For phase-I, let $Y_{il}$ represent the $i$th observation from the $l$th random sample, for all $i = 1, 2, 3, \ldots, n$ and $l = 1, 2, 3, \ldots, m$, regarded as being in a statistically IC state. The choice of $m$ and $n$ varies from one practitioner to another. It therefore affects the accuracy of the control limits, which in turn influences the ARL in phase-II. The unbiased estimators $\hat{\mu}_0$ and $\hat{\sigma}_0$ of the parameters $\mu$ and $\sigma$ of an IC process replace $\mu_0$ and $\sigma_0$ in the limits, giving the estimated control limits in (3).
In phase-II, the sample means $\bar{Y}_l$ are plotted against the control limits in (3), and the chart is said to have given an OoC signal if any value of $\bar{Y}_l$ is plotted outside the limits. The sample number at which the statistic first falls outside the limits is recorded as the run length (RL). The RL is an important variable in measuring the performance of control charts in general, and the Shewhart chart is no exception. The most widely used property of the RL is the average run length (ARL), which is the average number of samples observed before the chart sends an OoC signal. Mathematically, $\mathrm{ARL} = \sum_{k=1}^{s} \mathrm{RL}_k / s$, where $s$ is the number of RLs recorded. In addition to the ARL, the standard deviation of the RL (SDRL) gives more information about the behavior of the RL variable in evaluating the performance of a control chart. Furthermore, the ARL is of two types: the IC ARL, denoted $\mathrm{ARL}_0$, and the OoC ARL, denoted $\mathrm{ARL}_1$. $\mathrm{ARL}_0$ is expected to be sufficiently large to avoid false alarms. On the other hand, $\mathrm{ARL}_1$ is expected to be sufficiently small so that the chart signals as soon as there is a shift in the process parameter(s).
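As a rough illustration of these run-length definitions, the following Python sketch (not the authors' R code) computes the ARL of a known-parameter chart both analytically and by simulation. It assumes limits of the form $\mu_0 \pm L\sigma_0/\sqrt{n}$ with $\mu_0 = 0$ and $\sigma_0 = 1$, and it draws each sample mean directly from its normal distribution rather than averaging $n$ observations. The function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, pstdev

def analytic_arl(delta, n=5, L=3.0):
    """ARL of a known-parameter X-bar chart for a mean shift of delta sigma units."""
    z = NormalDist()
    shift = delta * math.sqrt(n)          # shift of the sample mean in standard-error units
    p_signal = z.cdf(-L - shift) + 1.0 - z.cdf(L - shift)
    return 1.0 / p_signal                 # the run length is geometric with this probability

def simulate_run_lengths(delta, n=5, L=3.0, runs=5000, seed=1):
    """Simulate run lengths with mu0 = 0 and sigma0 = 1 (assumed for illustration)."""
    rng = random.Random(seed)
    se = 1.0 / math.sqrt(n)               # standard error of the sample mean
    ucl, lcl = L * se, -L * se
    run_lengths = []
    for _ in range(runs):
        rl = 1
        # Keep drawing sample means until one falls outside the limits.
        while lcl <= rng.gauss(delta, se) <= ucl:
            rl += 1
        run_lengths.append(rl)
    return run_lengths

rls = simulate_run_lengths(delta=1.0)
arl, sdrl = mean(rls), pstdev(rls)        # simulated ARL and SDRL for delta = 1
```

With `delta=0` and `L=3` the analytic value is close to the familiar target of 370; the simulated ARL and SDRL approach the analytic values as `runs` grows.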

Variability in the Shewhart Chart Performance
In this section, we use Monte Carlo simulation to explain the effect of practitioner-to-practitioner variability on the Shewhart chart under both normal and non-normal distributions. See [25][26][27][28][29] for more information about the effect of sample size and practitioners' variability. To this end, we develop an algorithm in the R programming language to simulate the Shewhart chart environment, using the standard Shewhart chart as our benchmark and reference point. The $\bar{X}$ chart has a control limit width determinant $L$ that influences the RL properties. We use the standard $L = 3$, which corresponds to $\mathrm{ARL}_0 = 370$ (see [1] for more details). Without loss of generality, we generate random samples from a standard normal distribution $N(\mu = 0, \sigma = 1)$, each of sample size $n = 5$, assuming the process parameters are known. For the non-normal case, we consider the t-distribution with degrees of freedom $v = 5$, 25, and 100. Since all three values of $v$ exhibit the same pattern, we report only the results for $v = 100$. In both environments, normal and t-distributions, we set up the chart limits as given in Equation (1) and plot the sample means against the UCL and LCL. As soon as a value of $\bar{Y}_j$ is plotted outside the limits, the RL is recorded and saved. The process is iterated $10^5$ times to obtain the ARL and SDRL.
For the unknown parameters, we estimate the parameters from phase-I. The number of samples employed for the estimation differs from one practitioner to another, and so does the accuracy of the charts in phase-II. To depict this, we estimate both $\mu_0$ and $\sigma_0$ from different numbers of in-control phase-I samples, i.e., $m = 25$, 50, 100, 250, 500 and 1000, each of sample size $n = 5$. The estimates $\hat{\mu}_0$ and $\hat{\sigma}_0$ from the phase-I IC stage are then used in the same algorithm in place of $\mu_0$ and $\sigma_0$, respectively. Consequently, the parameter $L$ changes with the number of phase-I samples. The values of $L$ corresponding to the different values of $m$ are $L = 2.962$, 2.983, 2.9925, 2.997, 2.999, and 3, respectively, for the normal distribution, and $L = 2.974$, 2.995, 3.005, 3.010, 3.012, and 3.012, respectively, for the t-distribution with $v = 100$. These values of $L$ are determined through simulation so as to obtain $\mathrm{ARL}_0 = 370$. We carry out the simulation with different levels of shift $\delta$ ranging from 0 to 5, i.e., $\delta \in (0, 0.5, 5)$, as shown in Tables 1 and 2.
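The practitioner-to-practitioner effect can be sketched in Python as follows. Each simulated practitioner estimates the parameters from their own $m$ phase-I samples, and the in-control ARL of the resulting chart is then computed conditional on those estimates. Two assumptions are made purely for illustration: the estimators are the grand mean and the square root of the average sample variance (the estimators used in the study may differ), and $L$ is held at 3 rather than adjusted for $m$. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev, variance

def estimate_phase1(m, n, rng):
    """Estimate mu and sigma from m in-control N(0, 1) samples of size n."""
    samples = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
    mu_hat = fmean(fmean(s) for s in samples)                    # grand mean
    sigma_hat = math.sqrt(fmean(variance(s) for s in samples))   # pooled SD (assumed estimator)
    return mu_hat, sigma_hat

def conditional_arl0(mu_hat, sigma_hat, n, L=3.0):
    """In-control ARL of the estimated chart when the true process is N(0, 1)."""
    z = NormalDist()
    se_true = 1.0 / math.sqrt(n)
    ucl = mu_hat + L * sigma_hat / math.sqrt(n)
    lcl = mu_hat - L * sigma_hat / math.sqrt(n)
    p_false_alarm = z.cdf(lcl / se_true) + 1.0 - z.cdf(ucl / se_true)
    return 1.0 / p_false_alarm

def arl0_across_practitioners(m, n=5, practitioners=20, seed=7):
    """Conditional ARL0 for several practitioners, each with their own phase-I data."""
    rng = random.Random(seed)
    return [conditional_arl0(*estimate_phase1(m, n, rng), n) for _ in range(practitioners)]

spread_small = pstdev(arl0_across_practitioners(m=25))    # few phase-I samples
spread_large = pstdev(arl0_across_practitioners(m=1000))  # many phase-I samples
```

Comparing `spread_small` with `spread_large` shows how much more the attained ARL0 varies between practitioners when only a few phase-I samples are available.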

Presence of Outliers in the Shewhart Chart with Estimated Parameters
The estimation of the unknown parameters from phase-I samples affects the efficiency of the control chart in phase-II, but the drop in chart performance is not due to estimation alone; it also stems from the presence of outlying/extreme values in the phase-I samples. In this section, we study the effect of outliers in the phase-I samples on the performance and accuracy of the Shewhart chart. Through Monte Carlo simulation, we generate the $m$ phase-I samples from a mixture distribution, i.e., $(1 - \alpha)100\%$ from the assumed (normal or t) distribution and the remaining $\alpha 100\%$ contaminated by a chi-square distribution with $n$ degrees of freedom, denoted $\chi^2_{(n)}$. Consequently, the parameters estimated from the $m$ samples carry an extreme-value effect into the control chart in phase-II. That is, each observation of the phase-I sample is generated from the following expression (written here for the normal case):
$$Y_{il} \sim (1 - \alpha)\,N(\mu_0, \sigma_0^2) + \alpha\left[N(\mu_0, \sigma_0^2) + w\,\chi^2_{(n)}\right], \tag{4}$$
where $\alpha > 0$ is the probability of having a multiple of $\chi^2_{(n)}$ added to the assumed distribution, serving as the outliers in the samples, and $w \geq 1$ is the magnitude of the outlier. We develop an algorithm in R, similar to that in Section 2.2, but with the samples drawn from the environment described in (4). We set $\mu = 0$, $\sigma^2 = 1$, $v = 100$, $w = 3$, and $\alpha \in [0, 0.01]$. We design the Shewhart chart using the same parameters $L$ and $m$ as in Section 2.2.
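A minimal Python sketch of this contaminated phase-I environment is given below, for the normal case only. It reads the model as "a normal draw, to which $w$ times a chi-square variate with $n$ degrees of freedom is added with probability $\alpha$", which is an interpretation of the description above rather than the authors' R code. The function names are hypothetical.

```python
import random
from statistics import fmean

def contaminated_observation(rng, alpha, w=3.0, n=5, mu=0.0, sigma=1.0):
    """One phase-I observation: a normal draw, plus w * chi-square(n) with probability alpha."""
    y = rng.gauss(mu, sigma)
    if rng.random() < alpha:
        # A chi-square variate with n degrees of freedom is Gamma(shape=n/2, scale=2).
        y += w * rng.gammavariate(n / 2.0, 2.0)
    return y

def contaminated_phase1(m, n=5, alpha=0.01, w=3.0, seed=11):
    """m phase-I samples of size n drawn from the contaminated model."""
    rng = random.Random(seed)
    return [[contaminated_observation(rng, alpha, w, n) for _ in range(n)] for _ in range(m)]

clean = contaminated_phase1(m=1000, alpha=0.0)    # no outliers
dirty = contaminated_phase1(m=1000, alpha=0.01)   # about 1% outlying observations
grand_mean_clean = fmean(fmean(s) for s in clean)
grand_mean_dirty = fmean(fmean(s) for s in dirty)
```

Because the chi-square contamination is always positive, it pulls the phase-I mean upward and inflates the phase-I standard deviation, which is what distorts the estimated limits.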
In general, the pattern exhibited by the RL properties implies the following:
• Increasing the number $m$ of phase-I samples in the presence of outliers brings the $\mathrm{ARL}_0$ values closer to the theoretical values.
• Reducing the value of $\alpha$, the percentage of outliers present in the $m$ samples, also brings the $\mathrm{ARL}_0$ values closer to the theoretical values.
Unfortunately, neither of the two suggested remedies is practicable in real life. Thus, we propose outlier-detecting structures based on the robust Tukey and MAD detection models.

Shewhart Chart with Outlier Detection Models
In this section, we propose two outlier-detecting models as a remedy to the issues raised in the preceding section. Under the Tukey model, if $|y_o - \tilde{Y}| > p \times \mathrm{IQR}$, then the observation $y_o$ is declared an outlier. Here $\mathrm{IQR} = Q_3 - Q_1$ is the inter-quartile range of the sample, and $Q_3$ and $Q_1$ are the third and first quartiles, respectively, of all $m \times n$ phase-I observations. The constant $p$ is the confidence factor of Tukey's detector, commonly chosen between 1.5 and 3.0. The confidence factor should be chosen carefully: not too small, to avoid over-detection, and not too large, to prevent under-detection [18]. In this study, we choose $p = 2.2$. Applying the same algorithm, parameters and limits employed in Section 2.2, we incorporate the Tukey outlier-detector model on the phase-I samples to screen out the extreme values present therein. We then compute the IC ARL and SDRL values for the Shewhart chart based on the Tukey model in phase-II, when the parameters are estimated.
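The screening step can be sketched in Python as follows. The sketch reads the criterion as "the distance of an observation from the median of the pooled phase-I observations exceeds $p \times \mathrm{IQR}$"; taking the median as the centre is an assumption, and the classical Tukey fences ($Q_1 - p \times \mathrm{IQR}$ and $Q_3 + p \times \mathrm{IQR}$) would be an alternative. The quartile convention of `statistics.quantiles` may also differ from the one used in the study, and the data values are invented for illustration.

```python
from statistics import median, quantiles

def tukey_screen(observations, p=2.2):
    """Split observations into (kept, flagged) using the rule |y - median| > p * IQR."""
    q1, _, q3 = quantiles(observations, n=4)   # quartiles of the pooled phase-I observations
    iqr = q3 - q1
    centre = median(observations)              # assumed centre of the rule
    kept = [y for y in observations if abs(y - centre) <= p * iqr]
    flagged = [y for y in observations if abs(y - centre) > p * iqr]
    return kept, flagged

# Hypothetical pooled phase-I observations with one extreme value.
pooled = [9.8, 10.1, 10.0, 9.9, 10.2, 10.3, 9.7, 10.0, 10.1, 25.0]
kept, flagged = tukey_screen(pooled)
```

The control limits would then be estimated from `kept` only, so that the extreme value no longer inflates the phase-I estimates.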

The Median Absolute Deviation (MAD) Shewhart Control Chart
We define the median absolute deviation (MAD) as the deviation of the dataset about its median $\tilde{Y}$: $\mathrm{MAD} = \mathrm{median}\,|Y_{il} - \tilde{Y}| / 0.6745$. It follows that any observation $y_o$ from the sample that falls outside the interval $\tilde{Y} \pm b \times \mathrm{MAD}$ is declared an outlier. Here $b$ is the outlier-detecting constant, chosen as 3.642 so that the percentage of screening by MAD is the same as that by Tukey. This keeps the comparison between the two outlier detectors valid [19].
Furthermore, it is worth distinguishing between outlying and OoC sample points. The former emerge from the $m$ phase-I samples, which are used to construct the control limits for the monitoring stage (phase-II), while the latter are the sample points that fall beyond the control limits in phase-II. The presence of outlying sample points in phase-I leads to wider control limits, rendering the control charts less effective. A flowchart summarizing the procedure is depicted in Figure 1.
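A minimal Python sketch of the MAD-based screening follows. The scaling constant is exposed as a parameter and defaults to 0.6745, the 0.75 quantile of the standard normal distribution, which is the usual normal-consistency value. The function name and the data values are hypothetical.

```python
from statistics import median

def mad_screen(observations, b=3.642, consistency=0.6745):
    """Split observations into (kept, flagged) using the interval median +/- b * MAD."""
    centre = median(observations)
    # Scaled median absolute deviation about the median.
    mad = median([abs(y - centre) for y in observations]) / consistency
    lower, upper = centre - b * mad, centre + b * mad
    kept = [y for y in observations if lower <= y <= upper]
    flagged = [y for y in observations if y < lower or y > upper]
    return kept, flagged

# Hypothetical pooled phase-I observations with one extreme value.
pooled = [9.8, 10.1, 10.0, 9.9, 10.2, 10.3, 9.7, 10.0, 10.1, 25.0]
kept, flagged = mad_screen(pooled)
```

Because both the centre and the spread are medians, a single extreme value has almost no influence on the screening interval itself.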

Results
In this section, we provide the results of the methodologies discussed in Section 2. These results are presented in three folds, so is the discussion in the next section.

Practitioners' Estimation Variability
Here, through the simulation results of the algorithm explained in Section 2.2, we observe the variability that appears in the Shewhart control chart due to different choices of the number of phase-I samples $m$ amongst practitioners. Tables 1 and 2 present the Shewhart chart whose parameters, both mean and variance, are estimated from $m$ phase-I samples for both normal and non-normal distributions. The effect of parameter estimation on the performance of the chart is evident from the results. When $\delta = 0$, the $\mathrm{ARL}_0$ values cluster around the target of 370 with their respective values of $L$. However, when $\delta \neq 0$, the smaller $m$ becomes, the less effective the Shewhart chart is. The $\mathrm{ARL}_1$ values are expected to be sufficiently small in order to detect any drift in the ongoing process, but as $m$ gets smaller, the $\mathrm{ARL}_1$ values get bigger. This implies that the chart is less sensitive in identifying shifts in the ongoing process early enough. Another noticeable effect of parameter estimation on the Shewhart chart is the decrease in the limit constant $L$ as $m$ reduces. This would count as an advantage only if the corresponding phase-II chart detected shifts earlier than when the parameters are known.

Effect of Outliers on the Shewhart Control Charts
In Tables 3 and 4, we present the simulation results for environment (4) discussed in Section 2.3. These results show that outliers in the phase-I samples have a severe impact on the performance of the Shewhart chart. Having seen the pattern of the IC and OoC RL properties in Tables 1 and 2, and in order to save space, we restrict the performance evaluation to the IC RL properties, that is, to the case $\delta = 0$ only. From Tables 3 and 4, when $\alpha = 0$, i.e., in the absence of outliers, the $\mathrm{ARL}_0$ values cluster around the target of 370, irrespective of the number of phase-I samples $m$. However, when $\alpha > 0$, the $\mathrm{ARL}_0$ values deviate markedly from the target. The smaller the number of phase-I samples $m$ and the larger the percentage of outliers $\alpha$ in the samples, the more the $\mathrm{ARL}_0$ values deviate from the target. The SDRL follows the same pattern, even more strongly.
The results for the Tukey and MAD outlier detection models are given in Tables 5-8. Tables 5 and 7 report the ARL results for the Tukey and MAD outlier detection models, respectively, and Tables 6 and 8 the corresponding SDRL results. The effect of these detection models is evident: the ARLs and SDRLs are as close to the target as in the absence of outliers, or even closer. For better visuals of the results, we depict the ARL results (Tables 3, 5

Discussion
We summarize the findings of the study under the following subsections: (a) parameter estimation effect on the Shewhart control chart, (b) effect of outliers on Shewhart $\bar{X}$ chart performance, and (c) improvement of outlier screening models on the Shewhart $\bar{X}$ chart performance. Throughout the discussion, we use the run length properties as a yardstick for measuring the performance of the charts.

Parameter Estimation Effect on the Shewhart X Control Chart
Theoretically, when the parameters of the Shewhart chart are known, the limit constant corresponding to the IC $\mathrm{ARL}_0 = 370$ is $L = 3$. When the parameters are estimated from phase-I samples, the first effect of the estimation is a change in $L$. The smaller the number of phase-I samples $m$, the farther $L$ moves from its theoretical value. This is noticeable in Tables 1 and 2, where $L$ changes with $m$. We compute the values of $L$ based on 100,000 iterations of simulation. Secondly, when shifts are introduced, making the process OoC, the RL properties under estimated parameters are larger than the theoretical values. This indicates that the chart with estimated parameters is slower in detecting shifts in the process than the chart with known parameters. For instance (cf. Tables 1 and 2), with $m = 1000$ and $\delta = 0.5$, the resulting $\mathrm{ARL}_1$ and $\mathrm{SDRL}_1$ are 156.42 and 158.84 for the normal distribution and 150.92 and 160.46 for the t-distribution, respectively. However, with $m = 25$ and $\delta = 0.5$, $\mathrm{ARL}_1$ and $\mathrm{SDRL}_1$ are 190.12 and 333.88 for the normal distribution and 194.06 and 340.19 for the t-distribution, respectively.

Effect of Outliers on Shewhart X Control Chart performance
Having noted the effect of parameter estimation on Shewhart chart performance, we turn to one major aggravating cause: the presence of outliers in the dataset. The results in Tables 3 and 4 show that extreme values in the sample severely degrade the performance of the chart. As discussed earlier, $\alpha = 0$ indicates the absence of outliers, and $\alpha > 0$ their presence. We observe jumps in the values of the IC ARL and SDRL in Tables 3 and 4. Across the different combinations of $\alpha$ and $m$, the bigger the value of $\alpha$ and the smaller the value of $m$, the greater the effect of the outliers on the chart. Take, for instance, the normal environment with just 1% of outliers ($\alpha = 0.01$): the ARL and SDRL are 592.55 and 612.00 when $m = 1000$, but jump to 996.3 and 4012.72 when $m = 25$. Likewise, for the t-distribution with 1% of outliers ($\alpha = 0.01$), the ARL and SDRL are 575.08 and 591.55 when $m = 1000$, against 953.61 and 3823.24 when $m = 25$.

Improvement of Outliers Screening Models on Shewhart Chart Performance
The proposed remedy for the effect of outliers on the Shewhart chart works well. Incorporating the Tukey and MAD outlier-screening models in the Shewhart chart neutralizes the outlier effects and restores the performance to the outlier-free level or better. To assess the effect of these two screening methods, we present Figures 2-5, which display the IC ARL values with $m = 25$, 50, 100, 250, 500 and 1000 and magnitude $w = 3$, without outlier screening, alongside the IC ARL values obtained when outliers are screened with the Tukey and MAD-based models. The IC ARL, which is supposed to be around the target of 370, increases by more than 250% due to the effect of outliers. However, with our proposed screening models, both Tukey and MAD-based, the IC ARL returns to within 5% of its target. The IC SDRL exhibits the same pattern; in fact, its improvement is more pronounced than that of the ARL.

Illustrative Example
In the manufacturing industry, semiconductor lithography (photolithography) refers to the formation of three-dimensional images on the substrate for subsequent transfer of the pattern to the substrate. A key aspect of this process is the bake process, both the pre (soft) bake and the post (hard) bake. In this section, we implement the Shewhart chart with the proposed outlier detection models on the flow width measurement of a hard-bake process. In the following subsections, we give a brief overview of the hard-bake process and then apply the Shewhart chart to the dataset extracted from such a process (the Basics of Microlithography n.d.).

The Post (Hard) Bake Process
A typical photolithography process consists of the following sequence of operations: substrate preparation, photoresist spin coat, pre-bake, exposure, post-exposure bake, development, and finally the post-bake. The hard-bake process, as the name implies, is used to harden the final resist image so that it will withstand the harsh environments of etching. This post-bake ensures complete removal of solvent, improving adhesion in wet etch processes and resistance to plasma etches. Practitioners use different temperatures depending on the material under study. However, the temperature should be carefully chosen and should not exceed 200 °C. A major characteristic of this process is the wafer. Recall that the word lithography is a combination of two Greek words: lithos, meaning stones, and graphia, meaning to write. Our stones in this case are silicon wafers, and the patterns are written with photoresists, which are sensitive polymers. Figures 6 and 7 depict a typical photolithography flowchart and the hard-bake process.

Figure 6. A flowchart of a photolithography process of semiconductor manufacturing industry.

Figure 7. Illustration of hard-bake process.

Application of Shewhart Control Charts with Outlier
In this section, we implement the findings of this study on a dataset from the hard-bake process in semiconductor manufacturing, which monitors the flow width measurement of wafers [1]. The variable of interest is the flow width measurement (in microns) for the hard-bake process. The data consist of 25 IC phase-I samples and 10 phase-II samples, each of sample size 5. The process mean and standard deviation of the phase-I samples are 16.7163 and 3.5167, respectively. We use these estimates to set up the Shewhart chart control limits for monitoring the phase-II samples. Figure 8 shows all phase-I sample points staying within the limits and 3 of the phase-II sample points falling below the LCL, making them OoC due to some assignable cause of variation.
Prior to setting the limits, we test the data for possible autocorrelation. The data are free of autocorrelation, as the Durbin-Watson (DW) test result shows. The value of the DW test statistic is $DW = 1.7564$, and the critical values at the 1% level of significance are $d_L = 1.19$ and $d_U = 1.31$. By the interpretation explained in Table 9, we fail to reject the null hypothesis and conclude that there is no evidence of autocorrelation in the data.
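For readers who wish to reproduce this kind of check, the sketch below computes the Durbin-Watson statistic from a sequence of residuals and applies the standard decision bands built from $d_L$ and $d_U$. How the residuals are obtained from the flow width data is not specified here, so the residual sequence is left as an input; the function names are hypothetical.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences over sum of squares."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator

def dw_decision(dw, d_lower, d_upper):
    """Classify a DW value using the usual bands around d_lower and d_upper."""
    if dw < d_lower:
        return "reject H0: positive autocorrelation"
    if dw > 4.0 - d_lower:
        return "reject H0: negative autocorrelation"
    if d_upper <= dw <= 4.0 - d_upper:
        return "fail to reject H0: no evidence of autocorrelation"
    return "inconclusive"

# Values reported in the text: DW = 1.7564, d_L = 1.19, d_U = 1.31.
decision = dw_decision(1.7564, 1.19, 1.31)
```

Since 1.7564 lies between $d_U = 1.31$ and $4 - d_U = 2.69$, the reported statistic falls in the band where the null hypothesis of no autocorrelation is not rejected.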
Mathematics 2020, 8, x; doi: FOR PEER REVIEW www.mdpi.com/journal/mathematics
Furthermore, we introduce 5% of outliers into the phase-I samples to illustrate the argument that the presence of outliers affects the performance of control charts. This increases the mean and standard deviation by 4% and 25%, respectively, resulting in an increased UCL and a decreased LCL. The change in the control limits implies a wider band between them. The resulting control chart is therefore less efficient than the previous one without outliers. Figure 9 depicts this.
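The widening of the limits can be checked with a few lines of Python. The sketch assumes limits of the form $\hat{\mu} \pm L\hat{\sigma}/\sqrt{n}$ with $L = 3$ and $n = 5$; the value of $L$ used for this dataset is not stated, so $L = 3$ is purely an assumption, and the 4% and 25% increases are the rounded figures quoted in the text.

```python
import math

def xbar_limits(mu_hat, sigma_hat, n=5, L=3.0):
    """Return (LCL, UCL) of an X-bar chart built from estimated parameters."""
    half_width = L * sigma_hat / math.sqrt(n)
    return mu_hat - half_width, mu_hat + half_width

# Phase-I estimates reported in the text (no outliers).
lcl_clean, ucl_clean = xbar_limits(16.7163, 3.5167)

# Same estimates inflated by the reported 4% (mean) and 25% (standard deviation).
lcl_out, ucl_out = xbar_limits(16.7163 * 1.04, 3.5167 * 1.25)

width_ratio = (ucl_out - lcl_out) / (ucl_clean - lcl_clean)
```

Under these assumptions the UCL rises and the LCL falls, and the distance between the limits grows in proportion to the standard deviation, so phase-II points that previously signalled may now fall inside the limits.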

Figure 9. Scatter plot of phase-I sample and the resulting Shewhart chart with estimated parameters and 5% of outliers with magnitude 3.

Application of Shewhart Outlier Detection Model
Having established the deficiency of the Shewhart chart with outliers on the dataset, we employ our proposed outlier detection models with the Shewhart chart, as explained in Section 2.4, to rectify this shortcoming. Figure 10 shows the application of the Shewhart Tukey-based model. It is evident therein that the chart not only restores the efficiency it had when there were no outliers, detecting the 3 OoC sample points, but also identifies the outliers in the phase-I sample points. Similarly, Figure 11 portrays the scenario in which the Shewhart MAD-based model is applied in the monitoring stage. Despite the presence of outliers in the dataset, the chart detects the OoC sample points just as it does when there are no outliers.

Figure 11. Scatter plot of phase-I sample and the resulting Shewhart chart with MAD-outlier detection screening.

Conclusions
In this article, we evaluate the performance of the Shewhart control chart for location monitoring with estimated parameters. The study substantiates the effect of estimation error and of the variability in practitioners' choice of phase-I samples on the chart, especially when the samples are prone to outliers. Increasing the phase-I sample size, although often not practicable, reduces the impact on the Shewhart chart to some extent. The results of this study further show that incorporating the non-parametric outlier screening models, Tukey and MAD, in the design of the Shewhart chart is more practicable, as it requires fewer phase-I samples and yields better results. Another advantage of the proposal lies in the simplicity of its design and its ease of use. The study concludes with an illustrative example using real photolithography data. A comparison of the two detection models, Tukey and MAD, reveals that the two are comparably efficient. The study is limited to the univariate setup; extending it to the multivariate setup would be valuable, and we plan a future study on that. Also, the proposed charts are memory-less, which implies that they are suitable for monitoring large shifts. However, the idea of the study is applicable not only to the multivariate Shewhart setup but also to other control charts, such as exponentially weighted moving average and cumulative sum charts, in both univariate and multivariate setups.