Article

Multivariate Control Chart Based on Kernel PCA for Monitoring Mixed Variable and Attribute Quality Characteristics

1 Department of Statistics, Institut Teknologi Sepuluh Nopember, Jawa Timur 60111, Indonesia
2 Department of Mathematical Sciences, Universiti Teknologi Malaysia, Johor Bahru 81310, Malaysia
* Author to whom correspondence should be addressed.
Symmetry 2020, 12(11), 1838; https://doi.org/10.3390/sym12111838
Submission received: 8 October 2020 / Revised: 25 October 2020 / Accepted: 4 November 2020 / Published: 6 November 2020

Abstract

In the era of Industry 4.0, there is a need for a control chart that can visualize and recognize the symmetric or asymmetric pattern of a monitored process with more than one type of quality characteristic. In the past, control charts were developed to monitor only one kind of quality characteristic. Several mixed control charts have been created to deal with this problem; however, these conventional mixed charts have some drawbacks. In this study, another approach is used to monitor mixed quality characteristics by applying the kernel principal component analysis (KPCA) method. Using Hotelling's T² statistic, the kernel PCA mix chart is proposed to simultaneously monitor variable and attribute quality characteristics. Because kernel density estimation (KDE) can estimate the asymmetric pattern of the mixed process, the control limits it produces for the proposed chart yield an ARL0 of about 370 for α = 0.00273. In several experiments that vary the proportion of the attribute characteristics and the kernel function, the proposed chart performs well in detecting outliers and shifts in the process. When it is applied to synthetic data, the proposed chart detects the shift accurately. Additionally, the proposed chart outperforms the conventional mixed chart based on PCA mix by producing fewer false alarms with more accurate detection of out-of-control processes.

1. Introduction

A control chart visualizes quality characteristics in graphical form and calculates its control limit based on the symmetric or asymmetric distribution of the monitored process. In statistical process control (SPC), two types of control chart have been developed according to the monitored quality characteristics, namely variable and attribute charts [1]. The variable control chart monitors metric quality characteristics (interval or ratio scale) such as length or height. The attribute chart, on the other hand, monitors nonmetric quality characteristics (categorical scale). Several works have developed variable and attribute charts, especially to monitor more than one characteristic (multivariable or multiattribute characteristics). The Shewhart, multivariate exponentially weighted moving average (MEWMA), and multivariate cumulative sum (MCUSUM) type charts were developed to accommodate multivariable characteristics [1].
Recent developments of the Shewhart type chart include the T²-PCA chart [2], the robust T² chart [3,4], the variable parameters (VP)-T² Hotelling chart [5], and the T² chart for short-run production [6]. Meanwhile, the latest developments of MEWMA and MCUSUM charts cover an adaptive MEWMA chart [7], one-sided and two one-sided MEWMA charts [8], a dual MCUSUM chart [9], a Max MCUSUM chart for autocorrelated data [10], and mixed multivariate memory control charts [11]. Recent multiattribute charts include the multiple dependent state repetitive sampling (MDSRS) chart [12] and the fuzzy bivariate chart [13] for the Poisson distribution, as well as the multinomial generalized likelihood ratio (MGLR) control chart for the multinomial distribution [14].
To improve product quality, a mixed monitoring procedure is necessary in the production process [15]. The quality characteristics of a product can be measured not only by variables or attributes separately but also by both together using a mixed scheme. To accommodate this need, some researchers have developed mixed characteristics charts. Aslam et al. [16] suggested a mixed chart that combines the X̄ and np charts to monitor quality processes. This mixed chart transforms the variable characteristics into attributes, which are then monitored simultaneously on one chart. The performance of the mixed chart of Aslam et al. [16] was compared with the hybrid exponentially weighted moving average (HEWMA) chart in [17]. Wang et al. [18] proposed a spatial-sign covariance matrix-based chart that integrates standardized ranks and spatial signs to calculate the mixed statistics. Finally, the T²-based principal component analysis (PCA) mix chart was proposed to monitor mixed characteristics processes [19] and to detect outliers [20] using a kernel-based control limit [21].
A drawback of the PCA mix chart was discovered when it was applied to extremely imbalanced categorical data or attribute characteristics. Ahsan et al. [19] found that the performance of the PCA mix chart decreases for an extremely imbalanced proportion of the attribute characteristics. Such imbalance is common in manufacturing, where a process may yield 95% good products and 5% defective products. To address this issue, the kernel PCA (KPCA) method proposed by Schölkopf [22] can be employed to accommodate the difference in data type. KPCA is a nonlinear version of conventional PCA that can model data from non-Gaussian distributions [23]. This approach efficiently calculates principal component scores (PCs) in a high-dimensional feature space using kernel functions [24]. The method has also been applied in control charts, where it succeeded in detecting outliers [25].
Based on the above considerations, this paper proposes a mixed multivariate control chart based on the KPCA method that can accommodate different types of quality characteristics, named the KPCA mix chart. In this approach, the attribute characteristics (categorical data) are transformed into dummy variables, that is, numeric variables coded as 0 and 1 that represent the categories. The kernel matrix is then formed from the dummy variables together with the variable characteristics (continuous data), and the PCs of the mixed characteristics are computed. The computed PCs are then transformed into T² statistics. To estimate the control limit of the T² statistics, this study uses kernel density estimation (KDE), the same approach used in Ahsan et al. [19], to capture the asymmetric or even unknown pattern of the mixed characteristics. Moreover, to show the appropriateness of the proposed chart, its performance is compared with that of the PCA mix chart. The rest of the paper is organized as follows: Section 2 describes the proposed KPCA mix chart. The KDE control limits for the different kernel functions are tabulated in Section 3. Section 4 presents the performance of the proposed chart in detecting outliers and shifts in the process. The application of the proposed chart to simulated and real data is shown in Section 5. The managerial implications are described in Section 6. The conclusions and possible developments of this work are laid out in Section 7.

2. Kernel PCA Mix Control Chart

2.1. Kernel PCA

PCA is a basis transformation that diagonalizes the estimated covariance matrix $C$ of the input data $x_j \in \mathbb{R}^p$, $j = 1, \ldots, n$, with $\sum_{j=1}^{n} x_j = 0$, defined as follows:
$$C = \frac{1}{n} \sum_{j=1}^{n} x_j x_j^{T}. \tag{1}$$
The new coordinates, the principal components, are calculated by projecting the input data onto the eigenvectors. PCA works under the assumption that the data have a linear relationship. However, in complex settings such as the chemical industry or biology, the relationship in the data is not always linear. As a consequence, conventional PCA performs poorly in such cases [26].
To overcome the nonlinearity problem, Schölkopf et al. [22] introduced the kernel PCA method. The basic idea of this method is to calculate the PCs in a feature space by conducting a nonlinear mapping $\Phi: \mathbb{R}^p \to F$, $x \mapsto X$ (see Figure 1). This can be done by using the kernel functions known from support vector machines (SVM) [27]. In other words, PCA can be executed in the feature space $F$ by employing a kernel function.
Assume that the centered input data are mapped to the feature space $F$ as $\Phi(x_1), \ldots, \Phi(x_n)$. The covariance matrix in the feature space can be written as:
$$C^{F} = \frac{1}{n} \sum_{j=1}^{n} \Phi(x_j) \Phi(x_j)^{T}. \tag{2}$$
The next step is finding the eigenvalues $\lambda \ge 0$ and eigenvectors $V \in F \setminus \{0\}$ that satisfy:
$$\lambda V = C^{F} V. \tag{3}$$
Substituting $C^{F}$ from Equation (2) into Equation (3) gives:
$$\lambda V = \left( \frac{1}{n} \sum_{j=1}^{n} \Phi(x_j) \Phi(x_j)^{T} \right) V = \frac{1}{n} \sum_{j=1}^{n} \langle \Phi(x_j), V \rangle \, \Phi(x_j), \tag{4}$$
where $\langle \Phi(x_j), V \rangle$ is the dot product between $\Phi(x_j)$ and $V$. As a consequence, all solutions $V$ with $\lambda \neq 0$ lie in the span of $\Phi(x_1), \ldots, \Phi(x_n)$. Therefore, $\lambda V = C^{F} V$ is equivalent to:
$$\lambda \langle \Phi(x_k), V \rangle = \langle \Phi(x_k), C^{F} V \rangle, \quad k = 1, \ldots, n, \tag{5}$$
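and there exist coefficients $\alpha_1, \ldots, \alpha_n$ such that:
$$V = \sum_{i=1}^{n} \alpha_i \Phi(x_i). \tag{6}$$
Combining Equations (5) and (6) gives:
$$\lambda \sum_{i=1}^{n} \alpha_i \langle \Phi(x_k), \Phi(x_i) \rangle = \frac{1}{n} \sum_{i=1}^{n} \alpha_i \left\langle \Phi(x_k), \sum_{j=1}^{n} \Phi(x_j) \langle \Phi(x_j), \Phi(x_i) \rangle \right\rangle. \tag{7}$$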
In general, the mapping $\Phi(\cdot)$ cannot always be calculated explicitly. To get around this, only the dot product of two vectors in the feature space needs to be calculated. Let the matrix $K$ of size $n \times n$ be defined by $K_{ij} = \langle \Phi(x_i), \Phi(x_j) \rangle$. Expressing the left-hand side of Equation (7) in terms of $K$ gives:
$$\lambda \sum_{i=1}^{n} \alpha_i \langle \Phi(x_k), \Phi(x_i) \rangle = \lambda \sum_{i=1}^{n} \alpha_i K_{ki}, \tag{8}$$
and the right-hand side of Equation (7) becomes:
$$\frac{1}{n} \sum_{i=1}^{n} \alpha_i \left\langle \Phi(x_k), \sum_{j=1}^{n} \Phi(x_j) \langle \Phi(x_j), \Phi(x_i) \rangle \right\rangle = \frac{1}{n} \sum_{i=1}^{n} \alpha_i \sum_{j=1}^{n} K_{kj} K_{ji}. \tag{9}$$
Combining Equations (8) and (9) yields the following expression:
$$\lambda \sum_{i=1}^{n} \alpha_i K_{ki} = \frac{1}{n} \sum_{i=1}^{n} \alpha_i \sum_{j=1}^{n} K_{kj} K_{ji}. \tag{10}$$
Writing Equation (10) in matrix form, with $\alpha = (\alpha_1, \ldots, \alpha_n)^{T}$ as a column vector, gives:
$$\lambda K \alpha = \frac{1}{n} K^{2} \alpha. \tag{11}$$
The solution of Equation (11) can be found by solving the eigenvalue problem:
$$n \lambda \alpha = K \alpha \tag{12}$$
for non-zero eigenvalues. In other words, conducting PCA in the feature space is equivalent to solving the eigenvalue problem in Equation (12). After solving the eigenvalue problem, the eigenvectors $\alpha^{1}, \alpha^{2}, \ldots, \alpha^{n}$ and the eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ can be determined.
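The step from Equation (11) to Equation (12) can be spelled out briefly. The following is a short sketch in matrix form with $\alpha$ as a column vector; it is the standard argument for kernel PCA rather than something stated explicitly in the text. Rearranging Equation (11) gives

$$K \left( n \lambda \alpha - K \alpha \right) = 0,$$

so every solution of $n \lambda \alpha = K \alpha$ also solves Equation (11). Any further solution of Equation (11) differs from one of these only by a vector in the null space of $K$, and adding such a vector leaves $(K\alpha)_k = \sum_{i=1}^{n} \alpha_i \langle \Phi(x_k), \Phi(x_i) \rangle$ unchanged for every $k$, so it does not alter the projections onto $V$.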
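Dimension reduction is conducted by taking the first $l$ eigenvectors. The vectors $\alpha^{1}, \alpha^{2}, \ldots, \alpha^{l}$ are then normalized so that $\langle V^{v}, V^{v} \rangle = 1$, $v = 1, 2, \ldots, l$. From Equation (6), we have:
$$V^{v} = \sum_{i=1}^{n} \alpha_i^{v} \Phi(x_i). \tag{13}$$
Thus, $\langle V^{v}, V^{v} \rangle = 1$ can be written as
$$\begin{aligned} 1 &= \left\langle \sum_{i=1}^{n} \alpha_i^{v} \Phi(x_i), \sum_{j=1}^{n} \alpha_j^{v} \Phi(x_j) \right\rangle \\ &= \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i^{v} \alpha_j^{v} \langle \Phi(x_i), \Phi(x_j) \rangle \\ &= \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i^{v} \alpha_j^{v} K_{ij} \\ &= \langle \alpha^{v}, K \alpha^{v} \rangle \\ &= \lambda_v \langle \alpha^{v}, \alpha^{v} \rangle, \end{aligned} \tag{14}$$
where the last equality uses $K \alpha^{v} = \lambda_v \alpha^{v}$, with $\lambda_v$ denoting the $v$-th eigenvalue of $K$ (the quantity $n\lambda$ in Equation (12)).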
The principal component score $t$ is calculated by projecting $\Phi(x)$ onto the eigenvectors $V^{v}$, $v = 1, 2, \ldots, l$, as follows:
$$t_v = \langle V^{v}, \Phi(x) \rangle = \sum_{i=1}^{n} \alpha_i^{v} \langle \Phi(x_i), \Phi(x) \rangle. \tag{15}$$
To solve the eigenvalue problem in Equation (12) and to calculate the principal components, the nonlinear mapping does not need to be carried out explicitly. Instead, a kernel function $K(x, y) = \langle \Phi(x), \Phi(y) \rangle$ can be used. In this work, the kernel matrix is centered before it is used in KPCA according to the following expression:
$$\tilde{K} = K - 1_n K - K 1_n + 1_n K 1_n, \tag{16}$$
where $1_n = \frac{1}{n} \begin{bmatrix} 1 & \cdots & 1 \\ \vdots & \ddots & \vdots \\ 1 & \cdots & 1 \end{bmatrix} \in \mathbb{R}^{n \times n}$.
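The centering, eigenvalue, and normalization steps of this subsection can be sketched numerically. The block below is a minimal illustration in Python using only the standard library, not the authors' implementation: the toy data `X`, the helper names, and the use of power iteration to obtain only the leading eigenpair are all assumptions made for the example. Here `lam` is the leading eigenvalue of the centered kernel matrix, and `alpha` is rescaled so that `lam * <alpha, alpha> = 1`.

```python
import math

def linear_kernel(x, y):
    """Dot product of two vectors (used here only as a simple example kernel)."""
    return sum(a * b for a, b in zip(x, y))

def gram_matrix(rows, kernel):
    """Kernel matrix with entries K[i][j] = kernel(rows[i], rows[j])."""
    return [[kernel(a, b) for b in rows] for a in rows]

def center_kernel(K):
    """Return K - 1n K - K 1n + 1n K 1n, where 1n has all entries equal to 1/n."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]                              # entries of K 1n
    col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]   # entries of 1n K
    total_mean = sum(row_mean) / n                                      # entries of 1n K 1n
    return [[K[i][j] - col_mean[j] - row_mean[i] + total_mean
             for j in range(n)] for i in range(n)]

def leading_eigenpair(K, iters=500):
    """Leading eigenvalue and unit eigenvector of a symmetric PSD matrix (power iteration)."""
    n = len(K)
    v = [1.0 + i for i in range(n)]  # arbitrary non-constant starting vector
    for _ in range(iters):
        w = [sum(K[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(K[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v

# Hypothetical toy data: five observations of two metric variables.
X = [[0.0, 1.0], [1.0, 0.5], [2.0, 2.5], [3.0, 2.0], [4.0, 4.5]]
K_tilde = center_kernel(gram_matrix(X, linear_kernel))
lam, alpha = leading_eigenpair(K_tilde)
alpha = [a / math.sqrt(lam) for a in alpha]  # normalization: lam * <alpha, alpha> = 1
# Scores of the training points on the first component: t_k = sum_i alpha_i * K_tilde[i][k]
scores = [sum(alpha[i] * K_tilde[i][k] for i in range(len(X))) for k in range(len(X))]
```

With the linear kernel, the centered kernel matrix shares its non-zero eigenvalues with the scatter matrix of the centered data, which gives a simple way to sanity-check the result. A full implementation would extract the first l eigenpairs with a proper symmetric eigensolver instead of power iteration.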

2.2. Kernel PCA Mix Control Chart Procedures

Having described the KPCA algorithm in the previous subsection, we now present the KPCA mix chart procedure. The main idea of the KPCA mix chart is to construct a matrix Z that represents both the metric and nonmetric variables. The procedure consists of two steps. First, the T² statistics are calculated from matrix Z. Second, the control limit of the proposed chart is calculated using the KDE approach. These procedures are illustrated by the flowchart in Figure 2 and are detailed as follows:
T² Statistics Calculation
  • Form the matrix $Z = [Z_1, Z_2]$ of size $n \times (p + m)$, where:
    • $Z_1$, of size $n \times p$, is the centered version of the matrix $X_1$, which contains the metric data.
    • $Z_2$, of size $n \times m$, is the centered version of the matrix $G$, which contains the binary coding of every level of the nonmetric data $X_2$. For example, suppose $X_2$ has three categories, “no defect”, “minor defect”, and “major defect”, represented as 1, 2, and 3, respectively. If
      $$X_2 = \begin{bmatrix} 1 \\ 2 \\ 1 \\ 3 \end{bmatrix}, \quad \text{then matrix } G \text{ can be written as} \quad G = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix},$$
      where the dummy variable for “no defect” (coded 1) is 1 0 0, the dummy variable for “minor defect” (coded 2) is 0 1 0, and the dummy variable for “major defect” (coded 3) is 0 0 1.
  • Calculate
    $$\tilde{Z} = N^{1/2} Z M^{1/2}.$$
  • Choose the kernel function.
  • Calculate the kernel matrix $K = K(\tilde{z}_i, \tilde{z}_j) = \langle \Phi(\tilde{z}_i), \Phi(\tilde{z}_j) \rangle$.
  • Calculate the principal component score $t$ as follows:
    $$t_v = \sum_{i=1}^{n} \tilde{\alpha}_{i,v} \langle \Phi(z_i), \Phi(z) \rangle = \sum_{i=1}^{n} \tilde{\alpha}_{i,v} \tilde{K}(z_i, z).$$
  • From the first $l$ principal component scores $t$, calculate the T² statistics using the following equation:
    $$\tilde{T}_k^{2} = \sum_{v=1}^{l} t_v \lambda_v^{-1} t_v^{T},$$
    where $v = 1, 2, \ldots, l$, and $\lambda_v$ is the eigenvalue that corresponds to the $v$-th PC.
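The dummy coding and the T² formula in the steps above can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' code: the function names are hypothetical, the scores and eigenvalues passed to `t2_statistic` are made-up numbers, and the weighting by N and M is not reproduced because those matrices are not defined in the text.

```python
def dummy_code(labels, levels):
    """Indicator (dummy) matrix G: one 0/1 column per category level."""
    return [[1 if label == level else 0 for level in levels] for label in labels]

def center_columns(matrix):
    """Subtract each column mean, as done to obtain Z1 from X1 and Z2 from G."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[x - m for x, m in zip(row, means)] for row in matrix]

def t2_statistic(scores, eigenvalues):
    """T^2 for one observation: sum over the first l PCs of t_v^2 / lambda_v."""
    return sum(t * t / lam for t, lam in zip(scores, eigenvalues))

# The worked example from the text: X2 = (1, 2, 1, 3) with three levels.
G = dummy_code([1, 2, 1, 3], levels=[1, 2, 3])
Z2 = center_columns(G)

# Hypothetical scores and eigenvalues for l = 2 retained components.
t2 = t2_statistic(scores=[2.0, 1.0], eigenvalues=[4.0, 0.5])
```

Here `G` reproduces the matrix shown in the example, and each column of `Z2` sums to zero after centering.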
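KDE Control Limit Calculation
1. Estimate the empirical density of the $\tilde{T}_k^{2}$ statistics using the following equation:
$$\hat{f}_h(\tilde{T}_k^{2}) = \frac{1}{n \hat{h}} \sum_{i=1}^{n} k\!\left( \frac{T^{2} - \tilde{T}_{k,i}^{2}}{\hat{h}} \right),$$
where $k(\cdot)$ is the kernel smoothing function and $\hat{h}$ is the estimated bandwidth.
2. Calculate the cumulative distribution $\hat{F}_h(\tilde{t}_k) = \int_{0}^{\tilde{t}_k} \hat{f}_h(\tilde{T}_k^{2}) \, d\tilde{T}_k^{2}$ using the trapezoid rule as follows:
$$\int_{\pi_{\min}}^{\pi_{\max}} \hat{f}_h(\tilde{T}_k^{2}) \, d\tilde{T}_k^{2} \approx \frac{\pi_{\max} - \pi_{\min}}{2n} \sum_{i=1}^{n} \left( \hat{f}_h(\tilde{T}_{k,i}^{2}) + \hat{f}_h(\tilde{T}_{k,(i+1)}^{2}) \right),$$
where $\pi_{\min}$ and $\pi_{\max}$ are the minimum and maximum values of $\tilde{T}_k^{2}$, respectively.
3. Calculate the KDE control limit using the following expression:
$$\widetilde{CL} = \hat{F}_h^{-1}(1 - \alpha),$$
that is, the value $\tilde{t}_k$ at which $\hat{F}_h(\tilde{t}_k) = 1 - \alpha$.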
In this paper, R statistical software was used to create the proposed KPCA mix chart and conduct the simulation studies. The Kernel-Based Machine Learning Lab (kernlab) package was used to perform the KPCA algorithm.
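The three KDE steps above can be sketched as follows. The block is written in Python with the standard library only, so it is not the authors' R code, and several choices are assumptions made for illustration: a Gaussian smoothing kernel, the normal-reference rule-of-thumb bandwidth, an evenly spaced integration grid that extends four bandwidths beyond the sample range, and the made-up `sample` of T²-like values.

```python
import math

def rule_of_thumb_bandwidth(sample):
    """Normal-reference bandwidth 1.06 * sd * n^(-1/5) (an assumed choice)."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in sample) / (n - 1))
    return 1.06 * sd * n ** (-0.2)

def gaussian_kde(sample, h):
    """Return the density estimate f(x) = 1/(n h) * sum_i k((x - s_i) / h)."""
    c = 1.0 / (len(sample) * h * math.sqrt(2.0 * math.pi))
    return lambda x: c * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in sample)

def kde_control_limit(sample, alpha=0.00273, grid_size=2000):
    """Smallest grid point whose estimated cumulative probability reaches 1 - alpha."""
    h = rule_of_thumb_bandwidth(sample)
    f = gaussian_kde(sample, h)
    lo, hi = min(sample) - 4.0 * h, max(sample) + 4.0 * h  # cover the kernel tails
    step = (hi - lo) / grid_size
    grid = [lo + i * step for i in range(grid_size + 1)]
    dens = [f(x) for x in grid]
    # Trapezoid rule: running integral of the density along the grid.
    cumulative = [0.0]
    for i in range(grid_size):
        cumulative.append(cumulative[-1] + 0.5 * (dens[i] + dens[i + 1]) * step)
    total = cumulative[-1]
    for x, c in zip(grid, cumulative):
        if c / total >= 1.0 - alpha:
            return x
    return hi

# Made-up in-control statistics, used only to exercise the functions.
sample = [(i % 17) * 0.3 + 0.1 * (i % 5) for i in range(200)]
limit = kde_control_limit(sample, alpha=0.00273)
```

A smaller α pushes the limit further into the upper tail, so `kde_control_limit(sample, alpha=0.05)` returns a lower value than the limit computed above.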

3. KDE Control Limit

In this section, the KDE control limit of the $\tilde{T}_k^{2}$ statistics is presented for various kernel functions. Three types of kernel function are used in this paper, namely:
  • Linear kernel: $K(x_i, x_j) = \langle x_i, x_j \rangle$.
  • Polynomial kernel: $K(x, y) = (\langle x, y \rangle + 1)^{d}$.
  • Radial basis function (RBF) kernel: $K(x_i, x_j) = \exp\left( -\sigma^{*} \lVert x_i - x_j \rVert^{2} \right)$.
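The three kernel functions listed above translate directly into code. The following is a minimal Python sketch with hypothetical function names; the default values `d=1` and `sigma=0.001` are simply the settings that the text later reports as giving an ARL0 close to the theoretical value.

```python
import math

def dot(x, y):
    """Inner product <x, y> of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def linear_kernel(x, y):
    """K(x, y) = <x, y>."""
    return dot(x, y)

def polynomial_kernel(x, y, d=1):
    """K(x, y) = (<x, y> + 1)^d."""
    return (dot(x, y) + 1.0) ** d

def rbf_kernel(x, y, sigma=0.001):
    """K(x, y) = exp(-sigma * ||x - y||^2)."""
    return math.exp(-sigma * sum((a - b) ** 2 for a, b in zip(x, y)))

# Small usage example with two-dimensional vectors.
k_lin = linear_kernel([1.0, 2.0], [3.0, 4.0])
k_poly = polynomial_kernel([1.0, 2.0], [3.0, 4.0], d=2)
k_rbf = rbf_kernel([0.0, 0.0], [3.0, 4.0], sigma=0.001)
```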
The continuous or metric quality characteristics $X_1$ are generated from the multivariate normal distribution. In this research, the number of metric quality characteristics $p$ is 5. Meanwhile, the nonmetric or categorical quality characteristics are generated from a multinomial distribution $X_2 \sim M(n, \theta_1, \theta_2, \theta_3)$ with three parameter settings as follows:
  • $\theta_1 = \theta_2 = 0.3$ and $\theta_3 = 0.4$ (balanced case),
  • $\theta_1 = \theta_2 = 0.1$ and $\theta_3 = 0.8$ (imbalanced case),
  • $\theta_1 = \theta_2 = 0.05$ and $\theta_3 = 0.9$ (extreme imbalanced case).
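The simulation design above can be mimicked with a small data generator. This is a hedged sketch in Python, not the authors' R code: the function name, the `SCENARIOS` dictionary, the optional `shift` argument, and the seed are assumptions, and the identity covariance is realized by drawing independent standard normal components.

```python
import random

# The three multinomial parameter settings listed above.
SCENARIOS = {
    "balanced": (0.3, 0.3, 0.4),
    "imbalanced": (0.1, 0.1, 0.8),
    "extreme imbalanced": (0.05, 0.05, 0.9),
}

def generate_mixed_sample(n, theta, p=5, shift=0.0, seed=None):
    """Draw n observations: p independent normal variables and one categorical label."""
    rng = random.Random(seed)
    levels = list(range(1, len(theta) + 1))  # categories coded 1, 2, 3
    metric = [[rng.gauss(shift, 1.0) for _ in range(p)] for _ in range(n)]
    nonmetric = rng.choices(levels, weights=theta, k=n)
    return metric, nonmetric

metric, nonmetric = generate_mixed_sample(2000, SCENARIOS["extreme imbalanced"], seed=1)
share_level_3 = nonmetric.count(3) / len(nonmetric)
```

In the extreme imbalanced setting, `share_level_3` should be close to 0.9, which shows how rarely the two minority categories appear in the data.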

3.1. Linear Kernel

Table 1 reports the KDE control limit for the linear kernel when the number of continuous characteristics p is 5 and the number of PCs l = 2, 3, and 5. The table shows that the KDE control limit produces a stable ARL0 of about 370 for α = 0.00273. Additionally, the larger the number of PCs l used, the larger the KDE control limit produced.
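The link between α and the in-control average run length can be made explicit with a short worked step. It assumes independent in-control observations, so that the run length follows a geometric distribution (an assumption that the text does not state):

$$ARL_0 = \frac{1}{\alpha} = \frac{1}{0.00273} \approx 366.3,$$

which is the theoretical value against which the simulated ARL0 of about 370 is compared.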

3.2. Polynomial Kernel

The KDE control limits of the polynomial kernel for various cases are reported in Table 2, Table 3 and Table 4. According to the results, the larger the degree d, the larger the ARL0 produced. In this case, the ARL0 closest to the theoretical value is achieved when the parameter of the polynomial kernel is 1 (d = 1). Moreover, as with the linear kernel, the KDE control limit increases with the number of principal components used.

3.3. RBF Kernel

Table 5, Table 6 and Table 7 present the KDE control limit of the proposed chart for p = 5 and various proportions of nonmetric data. The tables show that the smaller the hyperparameter σ*, the closer the ARL0 is to the theoretical value (370 in this case). In general, the ARL0 is close to the theoretical value when σ* = 0.001. Thus, for the same cases in this work, the hyperparameter σ* is set to 0.001.

4. Performance of the Proposed Chart

In this paper, the performance of the proposed chart in detecting outliers and in detecting a shift in the process is evaluated for several scenarios. As in the previous section, the variable quality characteristics are generated from the multivariate normal distribution and the attribute quality characteristics are generated from the multinomial distribution.

4.1. Detecting Outlier

4.1.1. Simulation Setup

In this part, the performance of the proposed chart in detecting the presence of outliers is presented. Using the same algorithm as in Ahsan et al. [20], the simulation was repeated 1000 times to calculate the hit rate, the FN (false negative) rate, and the FP (false positive) rate. The metric data $X_1$ are generated from the multivariate normal distribution $X_1 \sim N_p(0, I)$. Meanwhile, the nonmetric data are generated from the multinomial distribution $X_2 \sim M(n, \theta_1, \theta_2, \theta_3)$. The percentage of outliers ε added to the clean (in-control) data is set to 5%, 10%, 20%, 30%, 40%, and 50% of the total number of observations. Table 8 shows the scenarios used to assess the performance of the proposed chart.
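The three evaluation measures can be computed from the true outlier labels and the chart's signals as sketched below. The exact definitions are an assumption here, since the text defers to Ahsan et al. [20]: the hit rate is taken as the share of observations classified correctly, the FN rate as the share of true outliers that are missed, and the FP rate as the share of in-control observations that are flagged. These definitions are consistent with the values in Table A1 (for example, scenario i at ε = 5%), but the function name and the toy inputs are hypothetical.

```python
def detection_rates(is_outlier, flagged):
    """Hit, FN, and FP rates from true outlier labels and chart signals."""
    pairs = list(zip(is_outlier, flagged))
    tp = sum(1 for o, f in pairs if o and f)          # outliers that were flagged
    fn = sum(1 for o, f in pairs if o and not f)      # outliers that were missed
    fp = sum(1 for o, f in pairs if not o and f)      # in-control points flagged
    tn = sum(1 for o, f in pairs if not o and not f)  # in-control points not flagged
    return {
        "hit_rate": (tp + tn) / len(pairs),
        "fn_rate": fn / (tp + fn) if (tp + fn) else 0.0,
        "fp_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }

# Toy example: 2 true outliers followed by 8 in-control observations.
truth = [True, True] + [False] * 8
signal = [True, False, True] + [False] * 7
rates = detection_rates(truth, signal)
```

In this toy case one outlier is caught, one is missed, and one in-control point is flagged, so the hit rate is 0.8, the FN rate is 0.5, and the FP rate is 0.125.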

4.1.2. Simulation Results

Figure 3 reports the performance of the proposed chart with the linear kernel in detecting outliers (see Appendix A Table A1 for the detailed results). According to the results, increasing the proportion of outliers added to the clean data decreases performance, as shown by the lower hit rate. For this case, the linear kernel in the kernel PCA mix chart remains reasonable with 30% outliers added to the clean data, as shown by the high hit rate (around 0.85–0.9). The performance of the proposed chart with the polynomial kernel in detecting outliers is presented in Figure 4 (see Appendix A Table A2 for detailed results). In this case, the parameter of the polynomial kernel is 1 (based on the result from the previous section). As before, the larger the proportion of outliers added to the clean data, the smaller the hit rate. According to its hit rate, the polynomial kernel still performs well with 30% outliers added to the in-control data. Similar results are obtained for the RBF kernel (see Figure 5 and Appendix A Table A3). With the hyperparameter σ* = 0.001, the performance of the proposed chart remains good when fewer than 40% outliers are added. When more than 40% outliers are added to the in-control data, misdetection occurs because of the large number of false alarms: the proposed chart declares observations that are actually in control to be outliers. Thus, a new method that overcomes this issue is needed to improve the performance of the proposed chart in detecting outliers.

4.2. Detecting Shift in the Process

The performance of the proposed chart in detecting a shift in the process is evaluated using the average run length (ARL) criterion. The chart is evaluated under several scenarios based on the proportion of the nonmetric parameter and the kernel function. The control limits used in this simulation are taken from the previous section.
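The ARL criterion can be estimated by simulation as the average number of observations until the first signal. The sketch below is a toy illustration in Python and does not reproduce the KPCA mix chart: the monitored statistic is simply the sum of squares of two independent normal variables (an assumption chosen because its in-control exceedance probability is known in closed form), and the function names, seed, and shift size are hypothetical.

```python
import math
import random

def run_length(draw_statistic, control_limit, max_steps=100000):
    """Number of observations until the statistic first exceeds the control limit."""
    for t in range(1, max_steps + 1):
        if draw_statistic() > control_limit:
            return t
    return max_steps

def average_run_length(draw_statistic, control_limit, n_runs=2000):
    """Monte Carlo estimate of the ARL over n_runs independent runs."""
    return sum(run_length(draw_statistic, control_limit) for _ in range(n_runs)) / n_runs

rng = random.Random(0)

def make_statistic(shift):
    """Toy statistic: sum of squares of two normal variables with mean `shift`."""
    return lambda: sum(rng.gauss(shift, 1.0) ** 2 for _ in range(2))

# In control this statistic is chi-square with 2 degrees of freedom, so
# P(statistic > c) = exp(-c / 2); the limit below gives alpha = 0.05, ARL0 = 20.
limit = 2.0 * math.log(20.0)
arl0 = average_run_length(make_statistic(0.0), limit)
arl1 = average_run_length(make_statistic(1.0), limit)
```

A mean shift raises the probability of exceeding the limit, so `arl1` comes out well below `arl0`, which mirrors the pattern of decreasing ARL1 for larger shifts.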

4.2.1. Extreme Imbalanced

In this subsection, the performance of the proposed chart is evaluated when the variable characteristics are generated from the multivariate normal distribution $N_p(0, I)$ and the attribute characteristics are generated from a multinomial distribution with the extreme imbalanced parameter ($\theta_1 = \theta_2 = 0.05$ and $\theta_3 = 0.9$). For p = 5 and l = 2, the evaluation results for the various kernel functions are visualized in Figure 6a. The results show that ARL0 is around 370 for all cases. The proposed chart can also detect the shift in the process, as shown by the smaller ARL1 values for larger shifts. According to the figure, the ARL values for the RBF and linear kernels are not significantly different. The polynomial kernel, however, does not perform as well as the other two kernel functions in this case.
Figure 6b depicts the performance of the kernel PCA mix control chart for p = 5 and l = 3 with various kernel functions and an extreme imbalanced proportion of categorical quality characteristics. The figure shows that the proposed chart can detect a shift in the process, as indicated by the smaller ARL1 for larger shifts. For smaller shifts, the polynomial kernel performs better than the RBF kernel, as shown by its smaller ARL1. For larger shifts, on the other hand, the RBF kernel outperforms the polynomial kernel. In this case, the linear kernel does not perform well compared to the other kernel functions.
Figure 6c presents the ARLs of the proposed chart for $\theta_1 = \theta_2 = 0.05$, $\theta_3 = 0.9$, p = 5, and l = 4, and compares the kernel functions used. In general, for all kernel functions, the proposed chart can detect the shift in the process, as shown by the smaller ARL1 values for larger shifts. In this case, all kernel functions perform similarly. For small shifts, however, the linear kernel performs slightly better than the other kernels. The detailed results for this case can be found in Appendix A Table A4, Table A5 and Table A6.

4.2.2. Imbalanced

In this part, the performance of the proposed chart for the imbalanced parameter of the nonmetric characteristics with various kernel functions is presented. Figure 7a shows the ARLs of the proposed chart for $\theta_1 = \theta_2 = 0.1$, $\theta_3 = 0.8$, p = 5, and l = 2 with the various kernel functions. According to the figure, the proposed chart is able to detect a shift in the process, as indicated by the smaller ARL1 values as the process shift gets larger. For this case, the best performance is produced by the linear and polynomial kernels. The RBF kernel, on the other hand, does not perform well in this scenario.
The performance of the proposed chart in detecting the shift in the process for $\theta_1 = \theta_2 = 0.1$, $\theta_3 = 0.8$, p = 5, and l = 3 is presented in Figure 7b. In general, the proposed chart can detect the shift for all kernel functions used. According to the figure, the linear and polynomial kernels have similar performance, and both outperform the RBF kernel. Furthermore, Figure 7c reports the performance of the proposed chart with various kernel functions for $\theta_1 = \theta_2 = 0.1$, $\theta_3 = 0.8$, p = 5, and l = 4. According to the figure, the best performance in this case is achieved by the linear and polynomial kernels. The detailed results for this case can be found in Appendix A Table A7, Table A8 and Table A9.

4.2.3. Balanced

In this subsection, the performance of the proposed chart in detecting a shift in the process for the balanced proportion of the nonmetric data is presented. Several scenarios based on the kernel function and the number of PCs l are used to assess the performance of the proposed chart. For the balanced nonmetric data with p = 5 and l = 2, the proposed chart can detect the shift in the process according to its ARL1 values for all kernel functions (see Figure 8a). For this case, the linear and polynomial kernels outperform the RBF kernel, and the best performance is achieved by the polynomial kernel.
Figure 8b shows the ARLs of the proposed chart for a balanced proportion of the nonmetric parameter with p = 5 and l = 3. For this case, all kernel functions demonstrate good performance. For a small shift, the polynomial kernel performs best, as shown by its smaller ARL1. For a large shift, on the other hand, the RBF kernel outperforms the other kernel functions. Similar behavior is observed for p = 5 and l = 4, as shown in Figure 8c. According to the figure, the polynomial kernel performs slightly better for a small shift than the other kernel functions. The detailed results for this case can be found in Appendix A Table A10, Table A11 and Table A12.

4.3. Summary and Discussion

This section summarizes the simulation studies and discusses the performance of the proposed KPCA mix chart. The simulation studies were conducted to evaluate the performance of the proposed chart in detecting outliers and process shifts. In detecting outliers, the KPCA mix chart still performs well with up to 30% outliers added to the clean data. In general, when more than 30% outliers are added, misdetection is mainly caused by the high FP rate (see Appendix A Table A1, Table A2 and Table A3).
Table 9 summarizes the performance of the proposed KPCA mix chart in detecting process shifts for all scenarios. The sign ● marks the better performance for a small shift, while the sign ⁂ marks the better performance for a large shift. Based on the results, the polynomial kernel demonstrates good performance in the balanced and imbalanced cases for both small and large shifts in the process. On the other hand, for the extreme imbalanced parameter of the nonmetric data, the RBF and linear kernels perform better when they are used to monitor a small shift.
The simulation studies also reveal some limitations. First, the proposed KPCA mix chart produces more false alarms as the proportion of outliers added in the simulations increases. Second, no kernel function is superior in all cases. Third, executing the KPCA mix chart requires more computational time because of the complexity of the kernel function. To overcome these problems, new methods for calculating the control limit and a robust estimator are needed to reduce false alarms when more outliers are added. Additionally, discovering new kernel functions and using the Fast KPCA method could improve the accuracy and computational speed of the proposed chart.

5. Applications

In this section, the proposed chart is applied to simulated and real data. First, several data scenarios are used to examine the ability of the proposed chart to detect a mean shift. Second, the proposed chart is applied to monitor real data, and its monitoring result is compared with that of the PCA mix chart [19].

5.1. Simulated Data

Table 10 shows the application of the proposed chart to three data scenarios. The linear, polynomial, and RBF kernels are employed in this application. The first 70 metric observations are generated from the multivariate normal distribution with $\mu = 0$ and $\Sigma = I$. The remaining 30 shifted observations are generated from a multivariate normal distribution with $\mu_{shift} = 2$ and $\Sigma = I$. The nonmetric data are generated from the multinomial distribution with the parameters ($\theta_1$, $\theta_2$, and $\theta_3$) given in Table 10.
Figure 9, Figure 10 and Figure 11 illustrate the application of the proposed chart to the simulated data for the RBF, polynomial, and linear kernels, respectively. The results show that, for all kernel functions used, the proposed chart correctly detects the shift at the 71st observation. However, for the imbalanced proportions of nonmetric data (see scenarios 2 and 3), the shift is not as clearly visible as in the balanced case when the RBF kernel is used (see Figure 9). On the other hand, the polynomial kernel performs well for the imbalanced and extreme imbalanced cases, as depicted in Figure 10. Furthermore, compared to the polynomial kernel, the linear kernel performs better for the balanced and imbalanced cases, as presented in Figure 11.

5.2. Real Data

In this subsection, the proposed chart is applied to monitor the machine failure data used by Ahsan et al. [19] in evaluating the mixed chart based on PCA mix. The machine failure dataset has a balanced proportion of the categorical characteristics (see the complete description in [20]). Therefore, in this application, the RBF kernel is used. Table 11 presents the performance comparison between the proposed KPCA mix and PCA mix charts in monitoring the machine failure dataset. Based on the monitoring results, the proposed chart can detect all out-of-control observations. Moreover, the proposed KPCA mix chart produced fewer false alarms than the PCA mix chart.

6. Managerial Implication

In the Industry 4.0 era, monitoring products with control charts plays a crucial role in enhancing process quality. The main purpose of a control chart is to monitor and improve the process by reducing its variability. Traditional control charts are used to monitor one type of quality characteristic. For instance, numerical measurements such as length or weight are monitored using a variable type control chart. Categorical data such as defect, color, or softness, on the other hand, are monitored using an attribute control chart. Thus, a corporation that wants to monitor numerical and categorical data simultaneously needs to use the two types of chart (variable and attribute) individually, which is inefficient.
The findings in this paper are in line with the concept of continuous quality enhancement and the adaptive monitoring process. The mixed monitoring scheme proposed in this paper covers not only one type of quality characteristic but also mixed variable and attribute quality characteristics in one chart. In the simulation studies, this chart was shown to be effective in monitoring shifts in the mixed process. Because of the sensitivity of the mixed monitoring scheme, the administrator can take fast corrective action for any assignable cause. Additionally, the monitoring control limits need to be readjusted at certain time intervals. The historical in-control observations can be used to calculate new control limits by estimating their empirical distribution (asymmetric or even unknown) using the KDE method. The adjusted control limit will help the company adapt to new data production behavior in the future.

7. Conclusions and Future Works

In this paper, a new control chart based on kernel PCA for monitoring mixed variable (continuous data) and attribute (categorical data) quality characteristics was proposed. The principal component scores (PCs) were transformed into T2 statistics in constructing the proposed method. In calculating the accurate control limit, kernel density estimation (KDE) was employed. To evaluate the performance of the proposed chart, some scenarios with various kernel function such as linear, polynomial, and radial basis function kernels were used. For in-control condition, using the KDE control limit, the proposed chart produces ARL0 at about 370 ( α = 0.00273 ) for all scenarios. For the shifted process, the control chart was evaluated in monitoring the outlier in phase I and process shift in phase II. In monitoring outlier, the proposed chart was successful in detecting outliers mixed with clean data. In general, for this case, the proposed chart still has a good performance in detecting up to 30% outliers added in simulations. In monitoring the shift in the process, the proposed control chart based on kernel PCA demonstrated better performance. For this case, the different result was produced for different kernel function. The polynomial kernel showed a good performance for both small and large shifts with the balanced and imbalanced proportion of categorical data. This can be concluded from the high hit rate yielded by the polynomial kernel. On the other hand, for a small shift in the process, the linear and RBF kernels demonstrated good performance for an extreme imbalanced proportion of categorical data in term of accuracy detection. Furthermore, the proposed chart was applied to monitor the simulated and real data. The proposed chart shows great performance in monitoring the simulated data in terms of success detection of the out-of-control observations. 
Meanwhile, in monitoring the real data, the proposed chart outperforms the conventional PCA mix chart in detecting out-of-control observations: it detects all three of them, whereas the PCA mix chart misses one, at the cost of one false alarm among the in-control observations (Table 11). In future work, the bootstrap resampling method [28] could be employed to estimate the control limit of the proposed chart. Developing a mixed kernel function could also be a good alternative to the conventional kernels used in this study. Finally, the use of fast kernel PCA [29] could reduce the computation time.
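The step from kernel PCA scores to a T² statistic summarized above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the RBF parametrization exp(-σ‖x − y‖²), the indicator (dummy) coding of the categorical variable, the function names, and the simulated data are all hypothetical choices made for the example.

```python
import numpy as np


def rbf_kernel(a, b, sigma):
    """RBF kernel exp(-sigma * ||a_i - b_j||^2); this parametrization is an assumption."""
    sq = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-sigma * np.maximum(sq, 0.0))


def kpca_t2(x, n_components, sigma=0.001):
    """T^2 of each training row computed from its kernel PCA scores."""
    n = x.shape[0]
    k = rbf_kernel(x, x, sigma)
    ones = np.full((n, n), 1.0 / n)
    k_c = k - ones @ k - k @ ones + ones @ k @ ones   # centre the data in feature space
    eigval, eigvec = np.linalg.eigh(k_c)              # eigenvalues in ascending order
    idx = np.argsort(eigval)[::-1][:n_components]     # keep the largest ones
    lam, vec = eigval[idx], eigvec[:, idx]
    scores = vec * np.sqrt(lam)                       # training scores on each component
    var = lam / n                                     # variance of each score column
    return (scores ** 2 / var).sum(axis=1)            # sum of squared standardized scores


# Hypothetical mixed data: 5 continuous variables plus one 3-level categorical variable
rng = np.random.default_rng(0)
continuous = rng.normal(size=(200, 5))
category = rng.choice(3, size=200, p=[0.3, 0.3, 0.4])
x = np.hstack([continuous, np.eye(3)[category]])      # dummy coding (assumption)
t2 = kpca_t2(x, n_components=2)
```

With the scores standardized by their own variances, the T² values of the training data average to the number of retained components, which gives a quick sanity check on an implementation. Swapping `rbf_kernel` for a linear or polynomial kernel changes only the matrix `k`.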

Author Contributions

M.A.: Conceptual methodology, writing original draft, and data analysis. M.M.: Supervising and validating the results. W.: Performed the analysis and data visualization. H.K.: Software analysis tools. M.H.L.: Validating the results. All authors have read and agreed to the published version of the manuscript.

Funding

This research and the APC were funded by the Ministry of Research and Technology/National Research and Innovation Agency (Kemenristek/BRIN) of the Republic of Indonesia with grant number 1213/PKS/ITS/2020.

Conflicts of Interest

The authors declare no potential conflicts of interest concerning the research, authorship, and/or publication of this article.

Appendix A

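Table A1. Simulation results for linear kernel.
| Scenario | Hit Rate (ε = 5%) | FN Rate (ε = 5%) | FP Rate (ε = 5%) | Hit Rate (ε = 10%) | FN Rate (ε = 10%) | FP Rate (ε = 10%) |
|---|---|---|---|---|---|---|
| i | 0.9990 | 0.0082 | 0.0006 | 0.9968 | 0.0126 | 0.0022 |
| ii | 0.9989 | 0.0108 | 0.0005 | 0.9949 | 0.0072 | 0.0048 |
| iii | 0.9980 | 0.0034 | 0.0019 | 0.9972 | 0.0120 | 0.0018 |
| iv | 0.9985 | 0.0232 | 0.0004 | 0.9961 | 0.0197 | 0.0021 |
| v | 0.9985 | 0.0204 | 0.0005 | 0.9960 | 0.0295 | 0.0012 |
| vi | 0.9983 | 0.0078 | 0.0013 | 0.9961 | 0.0208 | 0.0020 |
| vii | 0.9981 | 0.0340 | 0.0002 | 0.9953 | 0.0286 | 0.0021 |
| viii | 0.9963 | 0.0722 | 0.0001 | 0.9932 | 0.0627 | 0.0006 |
| ix | 0.9978 | 0.0402 | 0.0002 | 0.9913 | 0.0834 | 0.0004 |

| Scenario | Hit Rate (ε = 20%) | FN Rate (ε = 20%) | FP Rate (ε = 20%) | Hit Rate (ε = 30%) | FN Rate (ε = 30%) | FP Rate (ε = 30%) |
|---|---|---|---|---|---|---|
| i | 0.9762 | 0.0316 | 0.0219 | 0.8892 | 0.0734 | 0.1269 |
| ii | 0.9784 | 0.0414 | 0.0166 | 0.9127 | 0.1515 | 0.0597 |
| iii | 0.9802 | 0.0679 | 0.0078 | 0.9105 | 0.1313 | 0.0716 |
| iv | 0.9761 | 0.0537 | 0.0164 | 0.8920 | 0.0974 | 0.1125 |
| v | 0.9733 | 0.0420 | 0.0229 | 0.9080 | 0.1916 | 0.0494 |
| vi | 0.9738 | 0.1040 | 0.0068 | 0.8993 | 0.1173 | 0.0936 |
| vii | 0.9521 | 0.2320 | 0.0019 | 0.9021 | 0.1719 | 0.0661 |
| viii | 0.9737 | 0.0873 | 0.0110 | 0.8993 | 0.1448 | 0.0818 |
| ix | 0.9705 | 0.1144 | 0.0083 | 0.8563 | 0.4560 | 0.0099 |

| Scenario | Hit Rate (ε = 40%) | FN Rate (ε = 40%) | FP Rate (ε = 40%) | Hit Rate (ε = 50%) | FN Rate (ε = 50%) | FP Rate (ε = 50%) |
|---|---|---|---|---|---|---|
| i | 0.7491 | 0.2988 | 0.2190 | 0.5001 | 0.2254 | 0.7745 |
| ii | 0.6728 | 0.1134 | 0.4698 | 0.5013 | 0.1903 | 0.8070 |
| iii | 0.6852 | 0.1269 | 0.4401 | 0.4985 | 0.1686 | 0.8345 |
| iv | 0.6569 | 0.1056 | 0.5015 | 0.5002 | 0.2320 | 0.7677 |
| v | 0.7377 | 0.5011 | 0.1032 | 0.5009 | 0.2608 | 0.7374 |
| vi | 0.7320 | 0.2549 | 0.2768 | 0.5012 | 0.3773 | 0.6203 |
| vii | 0.7293 | 0.2701 | 0.2712 | 0.4969 | 0.2847 | 0.7215 |
| viii | 0.7405 | 0.3587 | 0.1935 | 0.5000 | 0.4616 | 0.5385 |
| ix | 0.6984 | 0.1764 | 0.3851 | 0.4993 | 0.4956 | 0.5057 |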
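Table A2. Simulation results for polynomial kernel.
| Scenario | Hit Rate (ε = 5%) | FN Rate (ε = 5%) | FP Rate (ε = 5%) | Hit Rate (ε = 10%) | FN Rate (ε = 10%) | FP Rate (ε = 10%) |
|---|---|---|---|---|---|---|
| i | 0.9989 | 0.0168 | 0.0003 | 0.9970 | 0.0156 | 0.0016 |
| ii | 0.9991 | 0.0118 | 0.0004 | 0.9969 | 0.0131 | 0.0019 |
| iii | 0.9990 | 0.0094 | 0.0006 | 0.9964 | 0.0323 | 0.0005 |
| iv | 0.9985 | 0.0140 | 0.0008 | 0.9948 | 0.0471 | 0.0006 |
| v | 0.9987 | 0.0124 | 0.0008 | 0.9958 | 0.0329 | 0.0011 |
| vi | 0.9984 | 0.0184 | 0.0008 | 0.9953 | 0.0118 | 0.0039 |
| vii | 0.9964 | 0.0714 | 0.0001 | 0.9948 | 0.0374 | 0.0016 |
| viii | 0.9981 | 0.0250 | 0.0007 | 0.9933 | 0.0621 | 0.0005 |
| ix | 0.9968 | 0.0614 | 0.0001 | 0.9948 | 0.0388 | 0.0015 |

| Scenario | Hit Rate (ε = 20%) | FN Rate (ε = 20%) | FP Rate (ε = 20%) | Hit Rate (ε = 30%) | FN Rate (ε = 30%) | FP Rate (ε = 30%) |
|---|---|---|---|---|---|---|
| i | 0.9775 | 0.0338 | 0.0197 | 0.9109 | 0.1282 | 0.0724 |
| ii | 0.9805 | 0.0531 | 0.0110 | 0.9012 | 0.1002 | 0.0982 |
| iii | 0.9791 | 0.0405 | 0.0160 | 0.8878 | 0.0735 | 0.1287 |
| iv | 0.9728 | 0.1147 | 0.0053 | 0.9061 | 0.1607 | 0.0652 |
| v | 0.9751 | 0.0462 | 0.0195 | 0.9006 | 0.1217 | 0.0898 |
| vi | 0.9758 | 0.0459 | 0.0188 | 0.9002 | 0.2621 | 0.0302 |
| vii | 0.9718 | 0.0489 | 0.0230 | 0.8961 | 0.1312 | 0.0922 |
| viii | 0.9721 | 0.0539 | 0.0214 | 0.8999 | 0.2430 | 0.0389 |
| ix | 0.9719 | 0.1062 | 0.0086 | 0.8850 | 0.1042 | 0.1196 |

| Scenario | Hit Rate (ε = 40%) | FN Rate (ε = 40%) | FP Rate (ε = 40%) | Hit Rate (ε = 50%) | FN Rate (ε = 50%) | FP Rate (ε = 50%) |
|---|---|---|---|---|---|---|
| i | 0.7061 | 0.1532 | 0.3877 | 0.5009 | 0.2100 | 0.7882 |
| ii | 0.7352 | 0.2314 | 0.2870 | 0.4996 | 0.2009 | 0.8000 |
| iii | 0.7137 | 0.1764 | 0.3595 | 0.5001 | 0.2589 | 0.7410 |
| iv | 0.6933 | 0.1531 | 0.4091 | 0.4985 | 0.3272 | 0.6759 |
| v | 0.7368 | 0.2681 | 0.2598 | 0.4986 | 0.2994 | 0.7034 |
| vi | 0.7231 | 0.2207 | 0.3143 | 0.4988 | 0.3352 | 0.6671 |
| vii | 0.7159 | 0.2166 | 0.3291 | 0.4999 | 0.2743 | 0.7258 |
| viii | 0.7226 | 0.2461 | 0.2982 | 0.4999 | 0.2245 | 0.7757 |
| ix | 0.7330 | 0.2971 | 0.2469 | 0.4997 | 0.3320 | 0.6686 |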
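Table A3. Simulation results for RBF kernel.
| Scenario | Hit Rate (ε = 5%) | FN Rate (ε = 5%) | FP Rate (ε = 5%) | Hit Rate (ε = 10%) | FN Rate (ε = 10%) | FP Rate (ε = 10%) |
|---|---|---|---|---|---|---|
| i | 0.9991 | 0.0094 | 0.0005 | 0.9976 | 0.0152 | 0.0010 |
| ii | 0.9993 | 0.0086 | 0.0003 | 0.9977 | 0.0153 | 0.0009 |
| iii | 0.9989 | 0.0060 | 0.0008 | 0.9972 | 0.0211 | 0.0008 |
| iv | 0.9987 | 0.0123 | 0.0007 | 0.9959 | 0.0301 | 0.0012 |
| v | 0.9986 | 0.0109 | 0.0009 | 0.9956 | 0.0340 | 0.0011 |
| vi | 0.9987 | 0.0198 | 0.0004 | 0.9959 | 0.0275 | 0.0015 |
| vii | 0.9979 | 0.0124 | 0.0016 | 0.9952 | 0.0308 | 0.0019 |
| viii | 0.9975 | 0.0462 | 0.0001 | 0.9955 | 0.0317 | 0.0014 |
| ix | 0.9984 | 0.0184 | 0.0007 | 0.9949 | 0.0426 | 0.0009 |

| Scenario | Hit Rate (ε = 20%) | FN Rate (ε = 20%) | FP Rate (ε = 20%) | Hit Rate (ε = 30%) | FN Rate (ε = 30%) | FP Rate (ε = 30%) |
|---|---|---|---|---|---|---|
| i | 0.9801 | 0.0337 | 0.0165 | 0.8822 | 0.0590 | 0.1429 |
| ii | 0.9778 | 0.0291 | 0.0204 | 0.9154 | 0.1205 | 0.0692 |
| iii | 0.9824 | 0.0508 | 0.0093 | 0.9099 | 0.2463 | 0.0231 |
| iv | 0.9728 | 0.0387 | 0.0244 | 0.9007 | 0.1158 | 0.0923 |
| v | 0.9660 | 0.0265 | 0.0359 | 0.9045 | 0.1338 | 0.0791 |
| vi | 0.9745 | 0.0440 | 0.0209 | 0.8812 | 0.0776 | 0.1365 |
| vii | 0.9648 | 0.1655 | 0.0026 | 0.9017 | 0.1300 | 0.0848 |
| viii | 0.9750 | 0.0538 | 0.0177 | 0.9082 | 0.1758 | 0.0558 |
| ix | 0.9700 | 0.1341 | 0.0040 | 0.9072 | 0.1675 | 0.0608 |

| Scenario | Hit Rate (ε = 40%) | FN Rate (ε = 40%) | FP Rate (ε = 40%) | Hit Rate (ε = 50%) | FN Rate (ε = 50%) | FP Rate (ε = 50%) |
|---|---|---|---|---|---|---|
| i | 0.7361 | 0.2238 | 0.2906 | 0.4996 | 0.2591 | 0.7416 |
| ii | 0.7349 | 0.2098 | 0.3019 | 0.5006 | 0.4056 | 0.5933 |
| iii | 0.7350 | 0.2118 | 0.3004 | 0.4982 | 0.2819 | 0.7217 |
| iv | 0.7202 | 0.2077 | 0.3279 | 0.5014 | 0.2229 | 0.7744 |
| v | 0.7070 | 0.1784 | 0.3694 | 0.5005 | 0.6262 | 0.3728 |
| vi | 0.7000 | 0.1662 | 0.3892 | 0.5019 | 0.2490 | 0.7472 |
| vii | 0.7189 | 0.2161 | 0.3244 | 0.5001 | 0.3647 | 0.6351 |
| viii | 0.7359 | 0.2788 | 0.2544 | 0.4991 | 0.3117 | 0.6902 |
| ix | 0.7325 | 0.2540 | 0.2764 | 0.5004 | 0.2676 | 0.7316 |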
Table A4. ARLs for θ1, θ2 = 0.05, θ3 = 0.9, p = 5, and l = 2.
| Shift δμ | Shift δθ | RBF (0.001) | Poly (1) | Linear |
|---|---|---|---|---|
| 0 | 0 | 386.221 | 379.075 | 365.194 |
| 0.1 | 0.0025 | 353.105 | 358.055 | 341.855 |
| 0.2 | 0.0050 | 260.201 | 282.095 | 237.335 |
| 0.3 | 0.0075 | 236.995 | 249.421 | 159.401 |
| 0.4 | 0.0100 | 119.915 | 166.745 | 109.505 |
| 0.5 | 0.0125 | 88.045 | 117.945 | 65.1308 |
| 0.6 | 0.0150 | 54.982 | 74.185 | 42.411 |
| 0.7 | 0.0175 | 35.195 | 57.105 | 24.155 |
| 0.8 | 0.0200 | 25.005 | 42.295 | 17.541 |
| 0.9 | 0.0225 | 16.545 | 28.530 | 12.075 |
| 1.0 | 0.0250 | 10.751 | 20.582 | 9.195 |
| 1.1 | 0.0275 | 8.307 | 14.291 | 6.815 |
| 1.2 | 0.0300 | 5.895 | 10.425 | 4.335 |
| 1.3 | 0.0325 | 4.615 | 8.222 | 3.441 |
| 1.4 | 0.0350 | 4.164 | 6.685 | 2.593 |
| 1.5 | 0.0375 | 2.811 | 5.565 | 2.445 |
Table A5. ARLs for θ1, θ2 = 0.05 and θ3 = 0.9, p = 5, and l = 3.
| Shift δμ | Shift δθ | RBF (0.001) | Poly (1) | Linear |
|---|---|---|---|---|
| 0 | 0 | 362.455 | 357.21 | 368.175 |
| 0.1 | 0.0025 | 326.510 | 330.370 | 328.430 |
| 0.2 | 0.0050 | 314.055 | 253.810 | 305.860 |
| 0.3 | 0.0075 | 227.840 | 197.685 | 266.260 |
| 0.4 | 0.0100 | 153.770 | 124.415 | 238.035 |
| 0.5 | 0.0125 | 90.460 | 99.950 | 185.685 |
| 0.6 | 0.0150 | 66.235 | 75.310 | 136.395 |
| 0.7 | 0.0175 | 47.225 | 47.925 | 91.755 |
| 0.8 | 0.0200 | 30.040 | 36.930 | 64.465 |
| 0.9 | 0.0225 | 20.005 | 27.735 | 55.045 |
| 1.0 | 0.0250 | 16.620 | 22.370 | 40.715 |
| 1.1 | 0.0275 | 9.930 | 15.050 | 34.060 |
| 1.2 | 0.0300 | 7.560 | 12.555 | 25.985 |
| 1.3 | 0.0325 | 5.780 | 7.225 | 16.330 |
| 1.4 | 0.0350 | 4.480 | 6.640 | 14.010 |
| 1.5 | 0.0375 | 3.925 | 5.225 | 10.945 |
Table A6. ARLs for θ1, θ2 = 0.05 and θ3 = 0.9, p = 5, and l = 4.
| Shift δμ | Shift δθ | RBF (0.001) | Poly (1) | Linear |
|---|---|---|---|---|
| 0 | 0 | 364.425 | 359.040 | 351.510 |
| 0.1 | 0.0025 | 325.890 | 307.205 | 282.950 |
| 0.2 | 0.0050 | 270.730 | 229.495 | 240.455 |
| 0.3 | 0.0075 | 220.725 | 190.665 | 182.670 |
| 0.4 | 0.0100 | 157.060 | 141.245 | 107.775 |
| 0.5 | 0.0125 | 103.130 | 107.685 | 76.505 |
| 0.6 | 0.0150 | 71.660 | 66.605 | 37.040 |
| 0.7 | 0.0175 | 47.840 | 47.645 | 26.845 |
| 0.8 | 0.0200 | 31.330 | 34.700 | 17.955 |
| 0.9 | 0.0225 | 21.765 | 20.820 | 12.165 |
| 1.0 | 0.0250 | 16.580 | 15.775 | 8.985 |
| 1.1 | 0.0275 | 11.385 | 11.305 | 6.610 |
| 1.2 | 0.0300 | 8.025 | 7.890 | 5.160 |
| 1.3 | 0.0325 | 5.815 | 5.665 | 3.545 |
| 1.4 | 0.0350 | 4.875 | 4.840 | 2.995 |
| 1.5 | 0.0375 | 3.655 | 3.115 | 2.515 |
Table A7. ARLs for θ1, θ2 = 0.1 and θ3 = 0.8, p = 5, and l = 2.
| Shift δμ | Shift δθ | RBF (0.001) | Poly (1) | Linear |
|---|---|---|---|---|
| 0 | 0 | 384.025 | 362.975 | 378.955 |
| 0.1 | 0.0025 | 201.185 | 112.091 | 110.12 |
| 0.2 | 0.0050 | 105.521 | 51.482 | 48.485 |
| 0.3 | 0.0075 | 58.972 | 28.981 | 29.690 |
| 0.4 | 0.0100 | 46.851 | 21.675 | 17.275 |
| 0.5 | 0.0125 | 36.065 | 14.685 | 13.241 |
| 0.6 | 0.0150 | 25.685 | 12.135 | 11.015 |
| 0.7 | 0.0175 | 20.352 | 9.515 | 8.195 |
| 0.8 | 0.0200 | 16.151 | 8.755 | 7.760 |
| 0.9 | 0.0225 | 14.222 | 6.542 | 5.775 |
| 1.0 | 0.0250 | 12.701 | 6.163 | 5.631 |
| 1.1 | 0.0275 | 10.141 | 5.621 | 5.352 |
| 1.2 | 0.0300 | 10.111 | 5.210 | 4.605 |
| 1.3 | 0.0325 | 9.370 | 5.005 | 4.825 |
| 1.4 | 0.0350 | 8.001 | 4.655 | 3.362 |
| 1.5 | 0.0375 | 8.025 | 3.855 | 4.265 |
Table A8. ARLs for θ1, θ2 = 0.1 and θ3 = 0.8, p = 5, and l = 3.
| Shift δμ | Shift δθ | RBF (0.001) | Poly (1) | Linear |
|---|---|---|---|---|
| 0 | 0 | 369.025 | 380.04 | 365.145 |
| 0.1 | 0.0025 | 196.880 | 120.265 | 121.500 |
| 0.2 | 0.0050 | 104.580 | 54.185 | 64.130 |
| 0.3 | 0.0075 | 52.845 | 33.930 | 34.475 |
| 0.4 | 0.0100 | 40.210 | 20.265 | 23.995 |
| 0.5 | 0.0125 | 25.925 | 15.600 | 14.590 |
| 0.6 | 0.0150 | 20.940 | 12.755 | 14.920 |
| 0.7 | 0.0175 | 17.880 | 8.760 | 11.355 |
| 0.8 | 0.0200 | 13.400 | 8.225 | 9.655 |
| 0.9 | 0.0225 | 12.105 | 7.480 | 7.850 |
| 1.0 | 0.0250 | 10.355 | 5.605 | 7.360 |
| 1.1 | 0.0275 | 8.620 | 6.270 | 6.935 |
| 1.2 | 0.0300 | 7.725 | 5.050 | 5.860 |
| 1.3 | 0.0325 | 7.920 | 5.080 | 5.810 |
| 1.4 | 0.0350 | 6.870 | 4.870 | 4.960 |
| 1.5 | 0.0375 | 6.020 | 4.180 | 4.745 |
Table A9. ARLs for θ1, θ2 = 0.1 and θ3 = 0.8, p = 5, and l = 4.
| Shift δμ | Shift δθ | RBF (0.001) | Poly (1) | Linear |
|---|---|---|---|---|
| 0 | 0 | 366.170 | 356.350 | 360.710 |
| 0.1 | 0.0025 | 189.895 | 135.420 | 141.050 |
| 0.2 | 0.0050 | 82.905 | 58.890 | 62.395 |
| 0.3 | 0.0075 | 44.645 | 40.855 | 36.430 |
| 0.4 | 0.0100 | 31.450 | 21.180 | 24.260 |
| 0.5 | 0.0125 | 24.900 | 16.725 | 16.325 |
| 0.6 | 0.0150 | 17.610 | 13.455 | 12.685 |
| 0.7 | 0.0175 | 14.075 | 10.630 | 9.945 |
| 0.8 | 0.0200 | 10.665 | 7.650 | 9.210 |
| 0.9 | 0.0225 | 10.065 | 8.325 | 7.005 |
| 1.0 | 0.0250 | 9.725 | 6.200 | 7.590 |
| 1.1 | 0.0275 | 7.095 | 6.420 | 6.990 |
| 1.2 | 0.0300 | 7.165 | 5.600 | 6.210 |
| 1.3 | 0.0325 | 6.390 | 5.310 | 5.510 |
| 1.4 | 0.0350 | 6.035 | 4.740 | 4.360 |
| 1.5 | 0.0375 | 5.060 | 4.515 | 4.570 |
Table A10. ARLs for θ1, θ2 = 0.3 and θ3 = 0.4, p = 5, and l = 2.
| Shift δμ | Shift δθ | RBF (0.001) | Poly (1) | Linear |
|---|---|---|---|---|
| 0 | 0 | 380.765 | 388.43 | 388.155 |
| 0.1 | 0.0025 | 286.715 | 131.855 | 205.025 |
| 0.2 | 0.0050 | 258.211 | 66.763 | 104.961 |
| 0.3 | 0.0075 | 199.535 | 35.440 | 64.611 |
| 0.4 | 0.0100 | 174.015 | 27.561 | 50.075 |
| 0.5 | 0.0125 | 125.242 | 20.835 | 41.332 |
| 0.6 | 0.0150 | 98.741 | 13.985 | 25.425 |
| 0.7 | 0.0175 | 94.721 | 10.721 | 23.351 |
| 0.8 | 0.0200 | 72.552 | 10.385 | 17.281 |
| 0.9 | 0.0225 | 66.411 | 8.961 | 14.015 |
| 1.0 | 0.0250 | 64.092 | 6.990 | 13.272 |
| 1.1 | 0.0275 | 51.721 | 6.205 | 12.245 |
| 1.2 | 0.0300 | 44.495 | 6.695 | 11.131 |
| 1.3 | 0.0325 | 41.312 | 6.081 | 8.565 |
| 1.4 | 0.0350 | 35.025 | 5.622 | 8.465 |
| 1.5 | 0.0375 | 31.112 | 5.361 | 8.425 |
Table A11. ARLs for θ1, θ2 = 0.3 and θ3 = 0.4, p = 5, and l = 3.
| Shift δμ | Shift δθ | RBF (0.001) | Poly (1) | Linear |
|---|---|---|---|---|
| 0 | 0 | 360.92 | 364.295 | 366.095 |
| 0.1 | 0.0025 | 228.395 | 189.135 | 205.490 |
| 0.2 | 0.0050 | 141.870 | 121.995 | 102.540 |
| 0.3 | 0.0075 | 77.115 | 71.655 | 59.360 |
| 0.4 | 0.0100 | 61.580 | 46.030 | 45.345 |
| 0.5 | 0.0125 | 36.800 | 38.575 | 34.255 |
| 0.6 | 0.0150 | 29.860 | 26.030 | 25.575 |
| 0.7 | 0.0175 | 23.410 | 26.030 | 17.965 |
| 0.8 | 0.0200 | 19.785 | 19.700 | 16.910 |
| 0.9 | 0.0225 | 16.905 | 14.245 | 15.370 |
| 1.0 | 0.0250 | 13.835 | 12.510 | 13.815 |
| 1.1 | 0.0275 | 11.890 | 11.980 | 11.430 |
| 1.2 | 0.0300 | 11.705 | 10.165 | 8.795 |
| 1.3 | 0.0325 | 9.845 | 11.230 | 10.145 |
| 1.4 | 0.0350 | 9.350 | 8.155 | 8.335 |
| 1.5 | 0.0375 | 9.485 | 9.370 | 8.080 |
Table A12. ARLs for θ1, θ2 = 0.3 and θ3 = 0.4, p = 5, and l = 4.
| Shift δμ | Shift δθ | RBF (0.001) | Poly (1) | Linear |
|---|---|---|---|---|
| 0 | 0 | 367.040 | 388.005 | 363.535 |
| 0.1 | 0.0025 | 146.845 | 124.230 | 153.925 |
| 0.2 | 0.0050 | 61.075 | 56.070 | 78.945 |
| 0.3 | 0.0075 | 37.595 | 30.505 | 43.010 |
| 0.4 | 0.0100 | 24.050 | 19.800 | 29.545 |
| 0.5 | 0.0125 | 16.650 | 14.650 | 18.340 |
| 0.6 | 0.0150 | 12.310 | 10.640 | 15.205 |
| 0.7 | 0.0175 | 9.255 | 10.055 | 12.315 |
| 0.8 | 0.0200 | 8.790 | 9.055 | 9.130 |
| 0.9 | 0.0225 | 7.825 | 6.900 | 9.840 |
| 1.0 | 0.0250 | 7.550 | 6.095 | 6.970 |
| 1.1 | 0.0275 | 6.245 | 5.545 | 7.615 |
| 1.2 | 0.0300 | 5.815 | 5.490 | 5.315 |
| 1.3 | 0.0325 | 4.900 | 5.025 | 5.975 |
| 1.4 | 0.0350 | 4.920 | 4.720 | 4.730 |
| 1.5 | 0.0375 | 4.665 | 4.525 | 5.150 |
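
As a reading aid for the ARL tables above: if successive charting statistics are treated as independent with a constant signal probability (an assumption made only for this sketch), the run length is geometric and its mean is the reciprocal of that probability. For α = 0.00273 this gives 1/0.00273 ≈ 366.3, which is the "about 370" in-control target. The function names and the simulation settings below are hypothetical.

```python
import random


def simulate_run_length(signal_prob, rng):
    """Number of observations up to and including the first signal."""
    count = 1
    while rng.random() >= signal_prob:
        count += 1
    return count


def estimate_arl(signal_prob, n_runs=5000, seed=1):
    """Monte Carlo ARL: the mean of many simulated run lengths."""
    rng = random.Random(seed)
    total = sum(simulate_run_length(signal_prob, rng) for _ in range(n_runs))
    return total / n_runs


alpha = 0.00273
theoretical_arl0 = 1.0 / alpha        # mean of a geometric run length
simulated_arl0 = estimate_arl(alpha)  # should land close to the value above
```

The same reasoning explains the out-of-control rows: as the shift grows, the per-observation signal probability rises and the ARL falls toward 1.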

References

  1. Montgomery, D.C. Introduction to Statistical Quality Control; John Wiley & Sons: New York, NY, USA, 2009; ISBN 0470169923.
  2. Ahsan, M.; Mashuri, M.; Kuswanto, H.; Prastyo, D.D. Intrusion Detection System using Multivariate Control Chart Hotelling’s T² based on PCA. Int. J. Adv. Sci. Eng. Inf. Technol. 2018, 8, 1905–1911.
  3. Maleki, F.; Mehri, S.; Aghaie, A.; Shahriari, H. Robust T² control chart using median-based estimators. Qual. Reliab. Eng. Int. 2020, 36, 2187–2201.
  4. Ahsan, M.; Mashuri, M.; Lee, M.H.; Kuswanto, H.; Prastyo, D.D. Robust adaptive multivariate Hotelling’s T² control chart based on kernel density estimation for intrusion detection system. Expert Syst. Appl. 2020, 145, 113105.
  5. Salmasnia, A.; Kaveie, M.; Namdar, M. An integrated production and maintenance planning model under VP-T² Hotelling chart. Comput. Ind. Eng. 2018, 118, 89–103.
  6. Chong, N.L.; Khoo, M.B.C.; Haq, A.; Castagliola, P. Hotelling’s T² control charts with fixed and variable sample sizes for monitoring short production runs. Qual. Reliab. Eng. Int. 2019, 35, 14–29.
  7. Haq, A.; Khoo, M.B.C. An adaptive multivariate EWMA chart. Comput. Ind. Eng. 2019, 127, 549–557.
  8. Haq, A. One-sided and two one-sided MEWMA charts for monitoring process mean. J. Stat. Comput. Simul. 2020, 90, 699–718.
  9. Haq, A.; Munir, T.; Khoo, M.B.C. Dual multivariate CUSUM mean charts. Comput. Ind. Eng. 2019, 137, 106028.
  10. Khusna, H.; Mashuri, M.; Suhartono; Prastyo, D.D.; Lee, M.H.; Ahsan, M. Residual-based maximum MCUSUM control chart for joint monitoring the mean and variability of multivariate autocorrelated processes. Prod. Manuf. Res. 2019, 7, 364–394.
  11. Zaman, B.; Lee, M.H.; Riaz, M.; Abujiya, M.R. An improved process monitoring by mixed multivariate memory control charts: An application in wind turbine field. Comput. Ind. Eng. 2020, 142, 106343.
  12. Aldosari, M.S.; Aslam, M.; Srinivasa Rao, G.; Jun, C.-H. An attribute control chart for multivariate Poisson distribution using multiple dependent state repetitive sampling. Qual. Reliab. Eng. Int. 2019, 35, 627–643.
  13. Mashuri, M.; Wibawati; Purhadi; Irhamah. A Fuzzy Bivariate Poisson Control Chart. Symmetry 2020, 12, 573.
  14. Lee, J.; Peng, Y.; Wang, N.; Reynolds, M.R., Jr. A GLR control chart for monitoring a multinomial process. Qual. Reliab. Eng. Int. 2017, 33, 1773–1782.
  15. Pu, X.; Li, Y.; Xiang, D. Mixed variables-attributes test plans for single and double acceptance sampling under exponential distribution. Math. Probl. Eng. 2011, 2011, 1–15.
  16. Aslam, M.; Azam, M.; Khan, N.; Jun, C.H. A mixed control chart to monitor the process. Int. J. Prod. Res. 2015, 53, 4684–4693.
  17. Aslam, M.; Khan, N.; Aldosari, M.S.; Jun, C.H. Mixed Control Charts Using EWMA Statistics. IEEE Access 2016, 4, 8286–8293.
  18. Wang, J.; Su, Q.; Fang, Y.; Zhang, P. A multivariate sign chart for monitoring dependence among mixed-type data. Comput. Ind. Eng. 2018, 126, 625–636.
  19. Ahsan, M.; Mashuri, M.; Kuswanto, H.; Prastyo, D.D.; Khusna, H. Multivariate Control Chart based on PCA Mix for Variable and Attribute Quality Characteristics. Prod. Manuf. Res. 2018, 6, 364–384.
  20. Ahsan, M.; Mashuri, M.; Kuswanto, H.; Prastyo, D.D.; Khusna, H. Outlier detection using PCA mix based T² control chart for continuous and categorical data. Commun. Stat.-Simul. Comput. 2019, 1–28.
  21. Phaladiganon, P.; Kim, S.B.; Chen, V.C.P.; Jiang, W. Principal component analysis-based control charts for multivariate nonnormal distributions. Expert Syst. Appl. 2013, 40, 3044–3054.
  22. Schölkopf, B.; Smola, A.; Müller, K.-R. Kernel principal component analysis. Artif. Neural Netw.-ICANN 1997, 97, 583–588.
  23. Ma, X.; Zabaras, N. Kernel principal component analysis for stochastic input model generation. J. Comput. Phys. 2011, 230, 7311–7331.
  24. Lee, J.-M.; Yoo, C.; Choi, S.W.; Vanrolleghem, P.A.; Lee, I.-B. Nonlinear process monitoring using kernel principal component analysis. Chem. Eng. Sci. 2004, 59, 223–234.
  25. Stefatos, G.; Ben Hamza, A. Statistical process control using kernel PCA. In Proceedings of the 2007 Mediterranean Conference on Control & Automation, Athens, Greece, 27–29 June 2007; pp. 1–6.
  26. Dong, D.; McAvoy, T.J. Nonlinear principal component analysis—Based on principal curves and neural networks. Comput. Chem. Eng. 1996, 20, 65–78.
  27. Boser, B.E.; Guyon, I.M.; Vapnik, V.N. A Training Algorithm for Optimal Margin Classifiers. In Proceedings of the 5th Annual ACM Workshop on Computational Learning Theory, Pittsburgh, PA, USA, 27–29 July 1992; pp. 144–152.
  28. Khusna, H.; Mashuri, M.; Ahsan, M.; Suhartono, S.; Prastyo, D.D. Bootstrap Based Maximum Multivariate CUSUM Control Chart. Qual. Technol. Quant. Manag. 2018, 17, 52–74.
  29. Khediri, I.B.; Limam, M.; Weihs, C. Variable window adaptive Kernel Principal Component Analysis for nonlinear nonstationary process monitoring. Comput. Ind. Eng. 2011, 61, 437–446.
Figure 1. Illustration of KPCA [23].
Figure 2. KPCA mix chart procedures.
Figure 3. Visualization of the hit rate, FN rate, and FP rate for all scenarios with the linear kernel for: (a) p = 5, l = 2; (b) p = 5, l = 3; and (c) p = 5, l = 4.
Figure 4. Visualization of the hit rate, FN rate, and FP rate for all scenarios with the polynomial kernel for: (a) p = 5, l = 2; (b) p = 5, l = 3; and (c) p = 5, l = 4.
Figure 5. Visualization of the hit rate, FN rate, and FP rate for all scenarios with the RBF kernel for: (a) p = 5, l = 2; (b) p = 5, l = 3; and (c) p = 5, l = 4.
Figure 6. ARL comparison for various kernel functions with θ1, θ2 = 0.05 and θ3 = 0.9 for (a) p = 5 and l = 2, (b) p = 5 and l = 3, and (c) p = 5 and l = 4.
Figure 7. ARL comparison for various kernel functions with θ1, θ2 = 0.1 and θ3 = 0.8 for (a) p = 5 and l = 2, (b) p = 5 and l = 3, and (c) p = 5 and l = 4.
Figure 8. ARL comparison for various kernel functions with θ1, θ2 = 0.3 and θ3 = 0.4 for (a) p = 5 and l = 2, (b) p = 5 and l = 3, and (c) p = 5 and l = 4.
Figure 9. Application of the proposed $\tilde{T}_k^2$ chart with the RBF kernel (0.001) for (a) scenario 1, (b) scenario 2, and (c) scenario 3.
Figure 10. Application of the proposed $\tilde{T}_k^2$ chart with the polynomial kernel (1) for (a) scenario 1, (b) scenario 2, and (c) scenario 3.
Figure 11. Application of the proposed $\tilde{T}_k^2$ chart with the linear kernel for (a) scenario 1, (b) scenario 2, and (c) scenario 3.
Table 1. KDE control limit of linear kernel.
| Setting | Statistic | θ1, θ2 = 0.3 and θ3 = 0.4 | θ1, θ2 = 0.1 and θ3 = 0.8 | θ1, θ2 = 0.05 and θ3 = 0.9 |
|---|---|---|---|---|
| p = 5, l = 2 | Control limit | 10,170.30 | 11,267.65 | 11,217.93 |
| p = 5, l = 2 | ARL0 | 375.01 | 387.40 | 365.19 |
| p = 5, l = 3 | Control limit | 13,292.86 | 13,567.33 | 13,582.07 |
| p = 5, l = 3 | ARL0 | 376.94 | 385.82 | 379.60 |
| p = 5, l = 4 | Control limit | 16,007.85 | 15,845.09 | 15,942.24 |
| p = 5, l = 4 | ARL0 | 361.19 | 356.42 | 376.72 |

ARL0 target is 370.
Table 2. KDE control limit of polynomial kernel with various d for θ1, θ2 = 0.3 and θ3 = 0.4.
| Setting | Statistic | d = 1 | d = 2 | d = 3 |
|---|---|---|---|---|
| p = 5, l = 2 | Control limit | 10,115.98 | 32,409.33 | 71,318.15 |
| p = 5, l = 2 | ARL0 | 355.84 | 804.52 | 844.36 |
| p = 5, l = 3 | Control limit | 13,129.43 | 39,586.61 | 83,200.83 |
| p = 5, l = 3 | ARL0 | 358.20 | 774.80 | 815.38 |
| p = 5, l = 4 | Control limit | 15,708.87 | 48,741.28 | 90,609.87 |
| p = 5, l = 4 | ARL0 | 386.68 | 812.37 | 696.31 |

ARL0 target is 370.
Table 3. KDE control limit of polynomial kernel with various d for θ1, θ2 = 0.1 and θ3 = 0.8.
| Setting | Statistic | d = 1 | d = 2 | d = 3 |
|---|---|---|---|---|
| p = 5, l = 2 | Control limit | 10,665.10 | 19,383.63 | 57,755.22 |
| p = 5, l = 2 | ARL0 | 371.39 | 487.68 | 755.76 |
| p = 5, l = 3 | Control limit | 13,316.01 | 24,878.64 | 71,965.38 |
| p = 5, l = 3 | ARL0 | 385.64 | 717.66 | 699.74 |
| p = 5, l = 4 | Control limit | 15,864.00 | 29,099.03 | 81,079.33 |
| p = 5, l = 4 | ARL0 | 351.55 | 656.43 | 620.34 |

ARL0 target is 370.
Table 4. KDE control limit of polynomial kernel with various d for θ1, θ2 = 0.05 and θ3 = 0.9.
| Setting | Statistic | d = 1 | d = 2 | d = 3 |
|---|---|---|---|---|
| p = 5, l = 2 | Control limit | 11,293.83 | 16,354.36 | 48,299.77 |
| p = 5, l = 2 | ARL0 | 379.08 | 542.78 | 668.36 |
| p = 5, l = 3 | Control limit | 13,476.47 | 22,357.91 | 119,229.20 |
| p = 5, l = 3 | ARL0 | 354.44 | 787.47 | 957.12 |
| p = 5, l = 4 | Control limit | 15,820.29 | 26,376.10 | 77,170.29 |
| p = 5, l = 4 | ARL0 | 351.76 | 830.91 | 631.40 |

ARL0 target is 370.
Table 5. KDE control limit of RBF kernel with various σ* for θ1, θ2 = 0.3 and θ3 = 0.4.
| Setting | Statistic | σ* = 0.001 | σ* = 0.005 | σ* = 0.01 | σ* = 0.05 | σ* = 0.1 |
|---|---|---|---|---|---|---|
| p = 5, l = 2 | Control limit | 10,490.03 | 9974.92 | 9358.18 | 6338.11 | 5570.16 |
| p = 5, l = 2 | ARL0 | 388.28 | 381.70 | 418.82 | 404.65 | 512.10 |
| p = 5, l = 3 | Control limit | 12,854.18 | 12,165.33 | 11,727.51 | 7129.09 | 6302.07 |
| p = 5, l = 3 | ARL0 | 367.22 | 394.55 | 390.09 | 900.73 | 1000.00 |
| p = 5, l = 4 | Control limit | 16,197.98 | 14,669.40 | 13,256.48 | 7799.56 | 6556.82 |
| p = 5, l = 4 | ARL0 | 361.66 | 355.83 | 418.54 | 1000.00 | 904.12 |

ARL0 target is 370.
Table 6. KDE control limit of RBF kernel with various σ* for θ1, θ2 = 0.1 and θ3 = 0.8.
| Setting | Statistic | σ* = 0.001 | σ* = 0.005 | σ* = 0.01 | σ* = 0.05 | σ* = 0.1 |
|---|---|---|---|---|---|---|
| p = 5, l = 2 | Control limit | 10,464.28 | 9642.60 | 9939.38 | 6487.39 | 5731.99 |
| p = 5, l = 2 | ARL0 | 369.76 | 366.09 | 468.92 | 477.56 | 584.15 |
| p = 5, l = 3 | Control limit | 12,714.55 | 13,059.87 | 11,763.30 | 7276.46 | 6496.19 |
| p = 5, l = 3 | ARL0 | 360.03 | 385.56 | 444.63 | 962.98 | 1000.00 |
| p = 5, l = 4 | Control limit | 16,821.44 | 14,567.58 | 13,399.80 | 7999.69 | 6759.66 |
| p = 5, l = 4 | ARL0 | 386.13 | 359.66 | 406.92 | 1000.00 | 1000.00 |

ARL0 target is 370.
Table 7. KDE control limit of RBF kernel with various σ* for θ1, θ2 = 0.05 and θ3 = 0.9.
| Setting | Statistic | σ* = 0.001 | σ* = 0.005 | σ* = 0.01 | σ* = 0.05 | σ* = 0.1 |
|---|---|---|---|---|---|---|
| p = 5, l = 2 | Control limit | 11,379.40 | 10,474.76 | 9693.14 | 6445.41 | 5714.85 |
| p = 5, l = 2 | ARL0 | 386.22 | 385.89 | 358.27 | 463.50 | 687.76 |
| p = 5, l = 3 | Control limit | 13,524.98 | 12,872.44 | 11,690.72 | 7353.56 | 6392.95 |
| p = 5, l = 3 | ARL0 | 361.81 | 443.36 | 436.45 | 1000.00 | 1000.00 |
| p = 5, l = 4 | Control limit | 15,611.95 | 14,734.50 | 14,882.03 | 7874.67 | 6690.34 |
| p = 5, l = 4 | ARL0 | 354.36 | 397.19 | 437.90 | 1000.00 | 1000.00 |

ARL0 target is 370.
Table 8. Simulation scenarios for linear, polynomial, and RBF kernel to evaluate the performance of the proposed chart in detecting outliers.
| Scenario | Nonmetric Parameter | p | l |
|---|---|---|---|
| i | θ1, θ2 = 0.3 and θ3 = 0.4 | 5 | 2 |
| ii | θ1, θ2 = 0.1 and θ3 = 0.8 | 5 | 2 |
| iii | θ1, θ2 = 0.05 and θ3 = 0.9 | 5 | 2 |
| iv | θ1, θ2 = 0.3 and θ3 = 0.4 | 5 | 3 |
| v | θ1, θ2 = 0.1 and θ3 = 0.8 | 5 | 3 |
| vi | θ1, θ2 = 0.05 and θ3 = 0.9 | 5 | 3 |
| vii | θ1, θ2 = 0.3 and θ3 = 0.4 | 5 | 4 |
| viii | θ1, θ2 = 0.1 and θ3 = 0.8 | 5 | 4 |
| ix | θ1, θ2 = 0.05 and θ3 = 0.9 | 5 | 4 |
Table 9. Summary of the proposed chart performance in detecting shift in the process for various scenarios and kernel functions.
Parameter of Nonmetric DatalKernel Function
RBFPolynomialLinear
Balanced2 ⁂●
3 ⁂●⁂●
4⁂●⁂●
Imbalanced2 ⁂●
3 ⁂●⁂●
4 ⁂●⁂●
Extreme Imbalanced2
3
4
● represents better performance for a small shift. ⁂ represents better performance for the large shift.
Table 10. Scenarios of simulated data for proposed chart application.
| Scenario | θ1 | θ2 | θ3 | p | l | In-Control Process Mean μ | Shifted Process Mean μ_shift |
|---|---|---|---|---|---|---|---|
| 1 | 0.30 | 0.30 | 0.40 | 5 | 4 | 0 | 2 |
| 2 | 0.10 | 0.10 | 0.80 | 5 | 4 | 0 | 2 |
| 3 | 0.05 | 0.05 | 0.90 | 5 | 4 | 0 | 2 |
Table 11. Performance comparison between the proposed kernel PCA mix $\tilde{T}_k^2$ chart and PCA mix chart to monitor testing dataset in machine failure data.
| Criteria (Number of Observations) | PCA Mix Chart [19] | Proposed Chart with RBF Kernel [This Study] |
|---|---|---|
| In-control observations | 247 | 247 |
| Out-of-control observations | 3 | 3 |
| Success detection of in-control observations | 247 | 246 |
| Success detection of out-of-control observations | 2 | 3 |
| Misdetection of in-control observations | 0 | 1 |
| Misdetection of out-of-control observations | 1 | 0 |
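
The counts in Table 11 can be summarized with the same three measures reported in Tables A1 to A3. The sketch below assumes the following definitions, which are not restated in this section: the hit rate is the share of all observations classified correctly, the FN rate is the share of out-of-control observations that were missed, and the FP rate is the share of in-control observations flagged in error. The function name is hypothetical.

```python
def detection_rates(tn, fp, tp, fn):
    """Return (hit rate, FN rate, FP rate) from confusion counts."""
    total = tn + fp + tp + fn
    hit_rate = (tn + tp) / total
    fn_rate = fn / (tp + fn)   # out-of-control observations that were missed
    fp_rate = fp / (tn + fp)   # in-control observations flagged in error
    return hit_rate, fn_rate, fp_rate


# Counts read from Table 11 (testing dataset of the machine failure data)
pca_mix = detection_rates(tn=247, fp=0, tp=2, fn=1)
proposed = detection_rates(tn=246, fp=1, tp=3, fn=0)
```

Under these assumed definitions both charts classify 249 of the 250 observations correctly (hit rate 0.996); they differ in the type of error, with the PCA mix chart missing one out-of-control observation and the proposed chart raising one false alarm.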
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Ahsan, M.; Mashuri, M.; Wibawati; Khusna, H.; Lee, M.H. Multivariate Control Chart Based on Kernel PCA for Monitoring Mixed Variable and Attribute Quality Characteristics. Symmetry 2020, 12, 1838. https://doi.org/10.3390/sym12111838
