1. Introduction
Quality is one of the main requirements underpinning the goodwill attached to any organization's products and services. Control charts are the main tools of statistical process control (SPC) for monitoring product quality and improving process capability [1]. In the last few decades, different control charts have been introduced and implemented in various industries to monitor processes online. These include the Shewhart control chart [2], the cumulative sum (CUSUM) control chart [3], and the exponentially weighted moving average (EWMA) control chart [4]. The Shewhart-type control chart is memoryless and is applied to detect large shifts in the mean and standard deviation of the process, whereas the EWMA and CUSUM charts are memory-type structures used to monitor small changes in the process parameters.
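The contrast between memoryless and memory-type charts can be illustrated with a minimal Python sketch. The function names, the design constants (`L`, `lam`, `k`, `h`), and the toy data are illustrative assumptions rather than settings taken from the cited works; the textbook forms of the three statistics are used.

```python
def shewhart_signals(x, mu0, sigma, L=3.0):
    """Indices where a single observation falls outside mu0 +/- L*sigma."""
    return [i for i, xi in enumerate(x) if abs(xi - mu0) > L * sigma]


def ewma_signals(x, mu0, sigma, lam=0.2, L=3.0):
    """Indices where the EWMA statistic leaves its time-varying limits."""
    z = mu0
    signals = []
    for t, xt in enumerate(x, start=1):
        # The EWMA blends the new observation with the accumulated history.
        z = lam * xt + (1 - lam) * z
        width = L * sigma * (lam / (2 - lam) * (1 - (1 - lam) ** (2 * t))) ** 0.5
        if abs(z - mu0) > width:
            signals.append(t - 1)
    return signals


def cusum_signals(x, mu0, sigma, k=0.5, h=5.0):
    """Indices where the two-sided tabular CUSUM exceeds h*sigma."""
    c_plus = c_minus = 0.0
    signals = []
    for i, xi in enumerate(x):
        # Deviations beyond the allowance k*sigma accumulate over time.
        c_plus = max(0.0, c_plus + (xi - mu0) - k * sigma)
        c_minus = max(0.0, c_minus - (xi - mu0) - k * sigma)
        if c_plus > h * sigma or c_minus > h * sigma:
            signals.append(i)
    return signals


# Hypothetical data: a sustained shift of 1.5 sigma starting at index 5.
data = [0.0] * 5 + [1.5] * 15
shewhart_hits = shewhart_signals(data, mu0=0.0, sigma=1.0)
ewma_hits = ewma_signals(data, mu0=0.0, sigma=1.0)
cusum_hits = cusum_signals(data, mu0=0.0, sigma=1.0)
```

With this toy shift, no single observation crosses the Shewhart limits, while the two memory-type statistics accumulate the evidence and eventually signal.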
Control charts typically monitor a single quality characteristic of a process, which may be quantitative or qualitative. Sometimes, the quality characteristic depends upon one or more explanatory variables. When the quality characteristic follows a normal distribution and is linearly associated with the explanatory variable, the relationship is termed a simple linear profile. In linear profiling, one may want to monitor the intercept, slope, and error variance [5].
In profiling, if the normality assumption is violated, we move towards generalized linear model (GLM)-based profiling. The term GLM refers to a large class of models popularized by Nelder and Wedderburn [6]. In the literature, several monitoring studies have been designed based on the GLM approach, and such charts are termed model-based control charts. Skinner et al. [7] proposed a model-based control chart using deviance residuals for a response variable that follows a Poisson distribution, with the square root link function. Jearkpaporn et al. [8] proposed a model-based scheme based on the deviance residual for a system in which the response variable follows a Gamma distribution, with the log link function. In addition, Skinner et al. [9] studied the effectiveness of GLM-based control charts on semiconductor process data based on the deviance residuals. Koosha and Amiri [10] considered how autocorrelation between observations at different levels of the independent variable in a logistic regression profile affects the monitoring procedure (control chart) and proposed two remedies to account for the autocorrelation within logistic profiles. Shu et al. [11] reviewed the literature on regression control charts based on their importance and practical applications. Asgari et al. [12] proposed a GLM-based control chart to monitor a two-stage process under the Poisson distribution. Amiri et al. [13] investigated profiles with binomial and Poisson responses in Phase I and monitored them using three methods, namely Hotelling's $T^2$ statistic, the F method, and the likelihood ratio test (LRT). Amiri et al. [14] concentrated on Phase II monitoring and proposed procedures for monitoring multivariate linear and GLM regression profiles. Qi et al. [15] developed a control chart to monitor GLM profiles using weighted likelihood ratio tests. Moheghi et al. [16] studied the robust estimation and monitoring of parameters in GLM profiles in the presence of outliers. Recently, Kinat et al. [17] proposed Pearson and deviance residual-based control charts for the inverse Gaussian response. Further recent studies on GLM-based control charts and their applications can be found in the literature [18,19,20,21,22,23].
One of the most important members of the GLM family is the logistic regression profile, which is used when the quality characteristic follows the Bernoulli distribution. Different link functions can be used for logistic regression profiling, including the logit, probit, log-log, complementary log-log (c-log-log), and cauchit link functions. Koosha and Amiri [10] studied the effect of applying different link functions on the performance of the control chart in monitoring the parameters of logistic regression profiles, and they found that the logit link function is the best for logistic regression. Yu et al. [24] analyzed the performance of the LRProb chart under the assumption that only a small number of predictable abnormal patterns are available. Hakimi et al. [19] proposed some robust approaches to estimate the logistic regression profile parameters in order to decrease the effects of outliers on the performance of the control chart. Khosravi et al. [25] proposed three self-starting control charts to monitor a logistic regression profile that models the relationship between a binomial response variable and explanatory variables. Alevizakos [26] proposed two indices, cp and Spmk, for logistic regression profiles using different link functions: the logit, probit, and c-log-log. The value of each index is approximately the same regardless of the link function used. Jahani et al. [27] developed two control charts, based on the Wald test and the Rao score test (RST), to monitor nominal logistic regression profiles in Phase II.
The available literature shows that most researchers have focused on deviance residual-based control charts with the probit and logit link functions. In this study, we evaluate the performance of various link functions in logistic regression profiling based on Pearson and deviance residuals, and identify which link function performs better. The rest of the paper is organized as follows. Section 2 describes the methodology. Section 3 presents the structure of the proposed control charts based on Pearson and deviance residuals. Section 4 consists of numerical evaluations, which include the simulation study of the proposed control charts. Section 5 describes a real-life application, and finally, Section 6 gives the conclusions and future recommendations.
2. The GLM-Based Control Charts
The GLM-based control chart extends linear profile monitoring to cases where the variable of interest follows a distribution from the exponential family. In this study, GLM-based control charts are designed based on the deviance residuals (DR) and Pearson residuals (PR) of the logistic regression. Figure 1 shows all the steps of our proposed approach.
Suppose that our variable of interest ($y$) follows the Bernoulli distribution with probability mass function given by:
$$f(y;\pi)=\pi^{y}(1-\pi)^{1-y},\qquad y\in\{0,1\},\tag{1}$$
where $\pi$ is the probability of success. The logistic regression model is a special case of the binomial regression model. When the response variable ($y_i$) of a regression model follows the Bernoulli distribution, the logistic regression model is the most commonly used statistical model; i.e.,
$$y_i\sim\text{Bernoulli}(\pi_i),\qquad i=1,\dots,n,$$
where the probability $\pi_i$ is a function of $x_i$ and is defined by:
$$\pi_i=\frac{\exp(x_i^{\top}\beta)}{1+\exp(x_i^{\top}\beta)},$$
where $x_i$ is the $i$th row of $X$, the data matrix of explanatory variables with one row per observation, and $\beta$ is the conformable vector of regression coefficients. The mean and variance of the Bernoulli distribution are, respectively, given by $E(y_i)=\pi_i$ and $\operatorname{Var}(y_i)=\pi_i(1-\pi_i)$.
To convert Equation (1) into exponential-family form, we rewrite it as:
$$f(y;\pi)=\exp\left\{y\log\left(\frac{\pi}{1-\pi}\right)+\log(1-\pi)\right\}.$$
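The logit link function, denoted by $g(\pi_i)$, is suitable for linking $\pi_i$ to the linear predictor in logistic regression profiles and is defined by:
$$g(\pi_i)=\log\left(\frac{\pi_i}{1-\pi_i}\right)=x_i^{\top}\beta.$$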
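We also consider some other link functions, namely the probit, c-log-log, and cauchit, to fit the logistic regression model. These link functions are given below:
$$\text{probit: } g(\pi_i)=\Phi^{-1}(\pi_i),\qquad \text{c-log-log: } g(\pi_i)=\log\{-\log(1-\pi_i)\},\qquad \text{cauchit: } g(\pi_i)=F_{C}^{-1}(\pi_i),$$
where $\Phi$ and $F_{C}$ are the cumulative distribution functions of the standard normal and standard Cauchy distributions, respectively.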
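The four link functions and their inverses can be sketched in Python as follows. This is a minimal illustration using only the standard library; the function names and the `LINKS` dictionary are hypothetical conveniences, not part of any particular package.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used by the probit link


def logit(p):
    return math.log(p / (1 - p))


def probit(p):
    return _STD_NORMAL.inv_cdf(p)


def cloglog(p):
    return math.log(-math.log(1 - p))


def cauchit(p):
    # Quantile function of the standard Cauchy distribution.
    return math.tan(math.pi * (p - 0.5))


def inv_logit(eta):
    return 1 / (1 + math.exp(-eta))


def inv_probit(eta):
    return _STD_NORMAL.cdf(eta)


def inv_cloglog(eta):
    return 1 - math.exp(-math.exp(eta))


def inv_cauchit(eta):
    return 0.5 + math.atan(eta) / math.pi


# Each entry maps a link name to (link, inverse link).
LINKS = {
    "logit": (logit, inv_logit),
    "probit": (probit, inv_probit),
    "c-log-log": (cloglog, inv_cloglog),
    "cauchit": (cauchit, inv_cauchit),
}
```

The logit, probit, and cauchit links are symmetric about a probability of 0.5, whereas the c-log-log link is asymmetric, which is one reason the fitted probabilities, and hence the residuals, can differ across links.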
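The maximum likelihood estimator (MLE), computed with the iteratively reweighted least squares (IRLS) technique, is the most often used approach for estimating the parameter $\beta$, where the following log-likelihood should be maximized:
$$\ell(\beta)=\sum_{i=1}^{n}\log f(y_i;\pi_i).$$
Substituting the Bernoulli probability mass function of Equation (1) into the above expression, we have:
$$\ell(\beta)=\sum_{i=1}^{n}\left[y_i\log\pi_i+(1-y_i)\log(1-\pi_i)\right].$$
After some simplifications under the logit link, the result is as follows:
$$\ell(\beta)=\sum_{i=1}^{n}\left[y_i x_i^{\top}\beta-\log\{1+\exp(x_i^{\top}\beta)\}\right].$$
Differentiating this log-likelihood with respect to $\beta$ and equating to zero yields:
$$\sum_{i=1}^{n}(y_i-\pi_i)\,x_i=0.$$
This score equation is solved using the IRLS algorithm, which gives:
$$\hat{\beta}=(X^{\top}\hat{W}X)^{-1}X^{\top}\hat{W}\hat{z},$$
where $\hat{W}=\operatorname{diag}\{\hat{\pi}_i(1-\hat{\pi}_i)\}$ is the weight matrix and $\hat{z}$ is the adjusted response vector with elements $\hat{z}_i=x_i^{\top}\hat{\beta}+(y_i-\hat{\pi}_i)/\{\hat{\pi}_i(1-\hat{\pi}_i)\}$. The DR for the logistic regression model has the following general form:
$$d_i=\operatorname{sign}(y_i-\hat{\pi}_i)\sqrt{-2\left[y_i\log\hat{\pi}_i+(1-y_i)\log(1-\hat{\pi}_i)\right]}.$$
The PR for the logistic regression, in turn, is defined as:
$$r_i=\frac{y_i-\hat{\pi}_i}{\sqrt{\hat{\pi}_i(1-\hat{\pi}_i)}}.$$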
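The estimation and residual steps can be sketched for the simplest case of one explanatory variable with an intercept under the logit link. The sketch below is an illustration under stated assumptions: it uses the Newton-Raphson update, which coincides with IRLS for the canonical logit link, and the function names and the toy data are hypothetical.

```python
import math


def fit_logit_irls(x, y, n_iter=25, tol=1e-10):
    """Fit logit(pi_i) = b0 + b1 * x_i by Newton-Raphson / IRLS."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        # Score vector (s0, s1) and the 2x2 information matrix X'WX.
        s0 = s1 = a00 = a01 = a11 = 0.0
        for xi, yi in zip(x, y):
            pi = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = pi * (1.0 - pi)  # IRLS weight under the logit link
            s0 += yi - pi
            s1 += (yi - pi) * xi
            a00 += w
            a01 += w * xi
            a11 += w * xi * xi
        # Solve the 2x2 system (X'WX) * step = score in closed form.
        det = a00 * a11 - a01 * a01
        d0 = (a11 * s0 - a01 * s1) / det
        d1 = (a00 * s1 - a01 * s0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


def pearson_deviance_residuals(x, y, b0, b1):
    """Return the lists of Pearson and deviance residuals of the fitted model."""
    pr, dr = [], []
    for xi, yi in zip(x, y):
        pi = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        pr.append((yi - pi) / math.sqrt(pi * (1.0 - pi)))
        # For a binary response only one log term is non-zero.
        loglik = math.log(pi) if yi == 1 else math.log(1.0 - pi)
        dr.append(math.copysign(math.sqrt(-2.0 * loglik), yi - pi))
    return pr, dr


# Hypothetical toy data (not from the study): binary outcome vs. a score.
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0, 0, 1, 0, 1, 0, 1, 1]
b0_hat, b1_hat = fit_logit_irls(x, y)
pr, dr = pearson_deviance_residuals(x, y, b0_hat, b1_hat)
```

For non-canonical links such as the probit, c-log-log, or cauchit, the weights and the adjusted response also involve the derivative of the inverse link, so the update above would need to be modified accordingly; the two residual formulas keep the same form with the fitted probabilities from the chosen link.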
5. Application: COVID-19 Deaths Profile Monitoring
COVID-19 affects different people in different ways. Here, we monitor the mortality status of COVID-19 patients who were admitted to Benazir Bhutto Shaheed Hospital, Rawalpindi, Pakistan. These data were originally collected by Akhtar et al. [29] from three hospitals, and we consider the data set of one of these hospitals. In this application, our response variable $y$ is binary ($y=1$, if the COVID patient is discharged deceased; otherwise, $y=0$). Therefore, we consider this application for the evaluation of our proposed control charts. The data on the demographics and vital signs of all adult COVID-19 patients who were discharged from Benazir Bhutto Shaheed Hospital, Rawalpindi, during the first wave of COVID-19 (February to August 2020) were retrospectively collected. The National Early Warning Score (NEWS) was calculated following the Royal College of Physicians (RCP) (2012, page 14) and used as the independent variable in this study. The NEWS is a simple aggregate scoring system computed from vital signs (see Table 5 [30]).
We had data on a total of 916 COVID-19 patients, consisting of mortality status ($y$) and NEWS ($x$). Based on the available data, we fitted logistic regression models with the logit, probit, c-log-log, and cauchit link functions. The Pearson (PR) and deviance (DR) residuals were obtained for the estimated logistic regression models under each link function, and their means and standard deviations (SD) are reported in
Table 6. Further, we set
and obtained the limits of the PR- and DR-based control charts, which are also given in
Table 6. To make the OOC dataset, we have estimated a new response variable
by using
. Similarly, we estimated the logistic regression models again and obtained the shifted PRs and DRs. The PR- and DR-based control charts under the different link functions are presented in Figures 2, 3, 4 and 5. The points under the pink and white shaded areas belong to the IC and OOC situations, respectively. The blue points indicate the IC PRs and DRs, while the red points show the OOC points. The PR- and DR-based control charts under the logit function are presented in Figure 2, while the proposed control charts based on the probit function are portrayed in Figure 3. Figure 4 and Figure 5 show the PR- and DR-based control charts under the c-log-log and cauchit link functions, respectively. The out-of-control points are counted for each chart and reported in Table 6. The PR- and DR-based control charts under the cauchit link function captured the largest number of OOC signals, and the second largest number of OOC points was detected by the PR- and DR-based control charts under the c-log-log link function. Hence, the PR- and DR-based control charts under the cauchit and c-log-log link functions outperform their counterparts.
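The counting of OOC points described above can be sketched as follows. This is a minimal illustration that assumes symmetric limits of the form mean ± L × SD computed from the IC residuals; the multiplier `L`, the function names, and the toy residual values are assumptions for illustration and are not the settings or data of this study.

```python
from statistics import mean, stdev


def control_limits(ic_residuals, L=3.0):
    """Lower and upper limits as mean -/+ L * SD of the in-control residuals."""
    centre = mean(ic_residuals)
    spread = stdev(ic_residuals)
    return centre - L * spread, centre + L * spread


def count_ooc(residuals, lcl, ucl):
    """Number of residuals plotting outside the control limits."""
    return sum(1 for r in residuals if r < lcl or r > ucl)


# Hypothetical residuals: in-control values set the limits,
# shifted values are then checked against them.
ic_res = [-1.0, -0.5, 0.0, 0.5, 1.0]
shifted_res = [0.2, 2.5, -3.0, 1.0]
lcl, ucl = control_limits(ic_res)
n_ooc = count_ooc(shifted_res, lcl, ucl)
```

Applying the same two steps to the PRs and DRs obtained under each link function would give one OOC count per chart, which is the quantity compared across link functions.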