Article

Algorithm for Correlation Diagnosis in Multivariate Process Quality Based on the Optimal Typical Correlated Component Pair Group

Department of Product Design, Lanzhou Jiaotong University, Lanzhou 730070, China
*
Author to whom correspondence should be addressed.
Processes 2024, 12(4), 652; https://doi.org/10.3390/pr12040652
Submission received: 21 February 2024 / Revised: 19 March 2024 / Accepted: 20 March 2024 / Published: 25 March 2024
(This article belongs to the Special Issue Advances in Intelligent Manufacturing Systems and Process Control)

Abstract

Correlation diagnosis in multivariate process quality management is an important and challenging issue. In this paper, a new approach based on the optimal typical correlated component pair group (OTCCPG) is proposed. Firstly, the theorem of correlation decomposition is proved to decompose the correlation of all the quality components as serial correlations of component pairs, and then according to the transitivity of correlations of component pairs, the decomposition result is represented by a correlation set of typical correlated component pairs. Finally, an algorithm for OTCCPG based on the maximum correlation spanning tree (MCST) is proposed, and T2 control charts to monitor the correlations of component pairs in OTCCPG are established to form the correlation diagnostic system. Theoretical analysis and practice prove that the proposed method could reduce the space complexity of the diagnostic system greatly.

1. Introduction

With the development of the modern global market, the product’s quality has been one of the key factors that greatly influence the competitiveness of enterprises. In the whole formation of the product’s quality, process quality is the most basic aspect because the product’s quality will be influenced by every process’s quality, directly or indirectly, so process quality control is the essence of quality management in manufacturing.
The theory of statistical process control (SPC) and its tools, that is, the control chart proposed by Shewhart, achieved the goal of univariate process quality management [1,2,3,4]. However, in modern manufacturing, many processes contain more than one quality component. Due to the correlation of quality components, all components and their correlation need to be monitored simultaneously [5,6]. For example, in a bivariate process with quality y = (y1, y2)T, 20 quality data points are collected to plot the mean control charts of y1 and y2, as shown in Figure 1 and Figure 2. We can see that the variations of the two components are usually regular, i.e., with the increment of y1, there is a monotonic increment of y2. This regular variation of a monotonic increment within a component with the increment (or decrement) of another is defined as the correlation of y.
Although y1 and y2 are under control in Figure 1 and Figure 2, this does not mean that y is in a stable state. At the 10th point, y1 is decreasing while y2 is increasing; this is opposite to the general variation of the two components, so the correlation of y is abnormal. At the 20th point, although both y1 and y2 are decreasing, the decrement of y2 is so large that it exceeds the normal range; this also means that the correlation of y is abnormal. These inferences can be verified by plotting the T2 control chart for y, as illustrated in Figure 3.
The T2 statistic was initially proposed by Hotelling to describe the correlation between quality components. For a p-dimensional process quality y = (y1, y2, …, yp)T, the T2 statistic is defined as follows:
$$T^2 = (y - \mu)^T \Sigma^{-1} (y - \mu) \tag{1}$$
where μ is the mean vector, and Σ is the covariance matrix of y. When T2 > 0, it means that all the components in y are correlated.
It has been proven that the T2 statistic follows the $\chi^2$ distribution with p degrees of freedom. Suppose α is the false alarm probability; then the upper control limit (UCL) of the T2 statistic is $\chi^2_\alpha(p)$, and the lower control limit (LCL) is 0. Thus, the T2 control chart can be established to monitor the correlation shift of y. However, this control chart cannot specify the cause(s) when the correlation shift is out of control.
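The behaviour described for the 10th point can be illustrated numerically. The following is a minimal sketch for the bivariate case only, assuming standardized deviations and a hypothetical correlation coefficient of 0.9; the function names and the sample values are illustrative and do not come from the paper. For two components the chi-square quantile has the closed form −2 ln α, which avoids the need for a statistics library.

```python
import math

def bivariate_t2(y, mu, sigma1, sigma2, rho):
    """Hotelling T^2 of a 2-D observation, expanded without matrix algebra."""
    z1 = (y[0] - mu[0]) / sigma1   # standardized deviation of component 1
    z2 = (y[1] - mu[1]) / sigma2   # standardized deviation of component 2
    return (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / (1.0 - rho * rho)

def ucl_two_components(alpha):
    """Upper alpha-quantile of the chi-square distribution with 2 degrees of freedom."""
    return -2.0 * math.log(alpha)

# Hypothetical in-control parameters (assumed, not taken from the paper).
mu, s1, s2, rho = (0.0, 0.0), 1.0, 1.0, 0.9
ucl = ucl_two_components(0.0027)

# Both observations stay within 3 sigma on each individual chart.
t2_regular = bivariate_t2((2.0, 2.0), mu, s1, s2, rho)    # components move together
t2_opposed = bivariate_t2((2.0, -2.0), mu, s1, s2, rho)   # components move apart

print(t2_regular < ucl, t2_opposed > ucl)
```

In this sketch the observation whose components move in opposite directions exceeds the limit even though each component is individually within its own control limits.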

1.1. Diagnosis Method Based on Component Combinations

To diagnose the cause(s) of an out-of-control correlation shift, one approach is to consider all the possible combinations of quality components and construct a T2 control chart for each. The cause(s) can then be specified by examining all the T2 control charts when the correlation shift is out of control [7,8,9]. This approach is referred to as the component combination-based diagnosis (CCBD) method in this paper. While this approach is theoretically sound and appealing, it has inherent deficiencies. For a p-dimensional process quality y = (y1, y2, …, yp)T, the number of T2 control charts using the CCBD method is $N = C_p^2 + C_p^3 + \cdots + C_p^p = 2^p - p - 1$, where N is an exponential function of p and the space complexity is $O(2^p)$. When p increases, N increases sharply, leading to a significant expansion of the diagnostic system scale, so this method is difficult to apply in practice.
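The growth of the chart count can be checked directly. The sketch below sums the binomial coefficients for combinations of 2 to p components and compares the result with the closed form 2^p − p − 1; the function name `ccbd_chart_count` is a hypothetical helper introduced here for illustration.

```python
from math import comb

def ccbd_chart_count(p):
    """Number of T^2 charts when every combination of 2..p components gets one chart."""
    return sum(comb(p, k) for k in range(2, p + 1))

for p in (3, 5, 10, 20):
    n = ccbd_chart_count(p)
    assert n == 2 ** p - p - 1   # closed form for the same sum
    print(p, n)
```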
On the other hand, the defect of information redundancy in diagnosis results cannot be avoided in the CCBD method. For example, in a four-dimensional process quality y = (y1, y2, y3, y4)T, suppose the abnormal correlation shift between y1 and y2 is the only cause that causes the correlation of y to be out of control. Now in the CCBD method, besides the T2 control chart to monitor the correlation shift of (y1, y2) being out of control, the other combinations that contain y1 and y2, namely (y1, y2, y3) and (y1, y2, y4) are both out of control. This phenomenon is called the redundancy of diagnostic messages because the correlation shift of one component combination is out of control, and the correlations of other component combinations that contain the abnormal component combination are all out of control. The redundancy in diagnosis results is a disturbance for process quality adjustment.

1.2. Diagnosis Method Based on Principal Component Analysis

To reduce the scale of the correlation diagnostic system, scholars proposed using the principal component analysis (PCA) method [10,11,12,13]. The original process quality y = (y1, y2, …, yp)T is converted to p independent principal components sorted in order of decreasing variance, denoted as z = (z1, z2, …, zp)T. First, p Shewhart control charts are constructed to monitor the normality of each zi. Second, the first n (n < p) principal components, whose cumulative variance exceeds a specified critical value, are grouped as component pairs, and T2 control charts are constructed to monitor the normality of (zi, zj) (i, j ≤ n, i ≠ j). Finally, the normality of the remaining principal component group (zn+1, zn+2, …, zp) is monitored by a single T2 control chart.
The number of control charts based on the PCA method is $N = p + C_n^2 + 1 = p + n(n-1)/2 + 1$. In general, n increases as p increases, the space complexity is approximately $O(p^2)$, and the scale of the diagnostic system is still large. Furthermore, since zi generally has no engineering significance after conversion, the cause(s) of an out-of-control correlation shift of y can only be specified by a comprehensive analysis of all the control chart results and by consulting the mapping relationship between y and z, which increases the computation required for diagnosis. Meanwhile, the redundancy of diagnostic messages also cannot be avoided.

1.3. Diagnosis Method Based on Correlation and Orthogonal Decomposition

In 1995, Mason, Young, and Tracy [14,15,16] proposed to decompose the T2 statistic into conditional and unconditional terms by regression analysis, in which the conditional and unconditional terms have equal weight in the decomposition results and are orthogonally independent of each other. Then, according to the statistical distribution of the conditional and unconditional terms, the corresponding control limits are established, so as to diagnose the specific cause(s) when the manufacturing process is abnormal.
As an example, in bivariate process quality y = (y1, y2)T, the basic idea of the MYT orthogonal decomposition method [17,18,19,20] is to decompose the T2 statistic into the following form:
$$T^2 = T_1^2 + T_{2\cdot 1}^2 \tag{2}$$
where $T_1^2$, called the unconditional term, is related only to the quality component y1 and is used to measure the contribution of the shift in y1 to the T2 statistic, and $T_{2\cdot 1}^2$, called the conditional term, whose value is related to the conditional probability P(y2|y1), is used to measure the contribution of the correlation between y1 and y2 to the T2 statistic.
Similar to Equation (2), the T2 statistic can also be decomposed into another form as follows:
$$T^2 = T_2^2 + T_{1\cdot 2}^2 \tag{3}$$
where the unconditional term $T_2^2$ is related only to the quality component y2 and is used to measure the contribution of the shift in y2 to the T2 statistic; the conditional term $T_{1\cdot 2}^2$ depends on the conditional probability P(y1|y2) and is used to measure the contribution of the correlation between y2 and y1 to the T2 statistic.
When the quality components y1 and y2 are correlated, the conditional probabilities satisfy P(y2|y1) ≠ P(y1|y2), and hence the conditional terms satisfy $T_{2\cdot 1}^2 \neq T_{1\cdot 2}^2$. Therefore, Equations (2) and (3) represent two different decomposition results of the T2 statistic. In general, for a p-dimensional process quality y = (y1, y2, …, yp)T, there are p! decomposition results in total. As the number of quality components increases, analyzing every possible form of decomposition leads to a significant increase in calculations and a serious reduction in diagnostic efficiency. At the same time, the accuracy of the diagnostic results based on this method will be affected when there are obvious correlations between different quality components.
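The two decomposition orders can be compared on a small example. The sketch below uses the standard regression form of the conditional term (deviation from the regression prediction divided by the residual variance), which is an assumption about how the terms are computed since the paper does not spell it out; the covariance values and the observation are hypothetical.

```python
def t2_bivariate(d1, d2, s11, s12, s22):
    """T^2 of a 2-D deviation (d1, d2) for covariance [[s11, s12], [s12, s22]]."""
    det = s11 * s22 - s12 * s12
    return (s22 * d1 * d1 - 2.0 * s12 * d1 * d2 + s11 * d2 * d2) / det

def myt_terms(d_first, d_second, s_first, s_cross, s_second):
    """Return (unconditional term of the first component,
    conditional term of the second component given the first)."""
    unconditional = d_first ** 2 / s_first
    predicted = (s_cross / s_first) * d_first           # regression prediction
    residual_var = s_second - s_cross ** 2 / s_first    # conditional variance
    conditional = (d_second - predicted) ** 2 / residual_var
    return unconditional, conditional

# Hypothetical covariance (rho = 0.5) and deviation from the mean.
s11, s12, s22 = 4.0, 3.0, 9.0
d1, d2 = 2.0, 1.5

total = t2_bivariate(d1, d2, s11, s12, s22)
t1, t2_given_1 = myt_terms(d1, d2, s11, s12, s22)   # order of Equation (2)
t2, t1_given_2 = myt_terms(d2, d1, s22, s12, s11)   # order of Equation (3)

print(total, (t1, t2_given_1), (t2, t1_given_2))
```

Both orders sum to the same T2 value, while the individual terms differ between the two orders, which is why each ordering has to be examined separately.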

1.4. Intelligent Diagnosis Methods

In addition to the traditional diagnostic methods based on mathematical model analysis, in recent years, with the development of artificial intelligence technology, intelligent diagnostic methods have been applied to the field of multivariate process quality diagnosis, and the diagnostic methods based on artificial neural networks (ANNs) [21,22,23], Bayesian networks [24,25,26], support vector machines (SVMs) [27,28,29], etc. have been widely applied. Intelligent diagnostic methods can effectively reduce the scale of the diagnostic system and improve the diagnostic efficiency; however, these methods generally require a large amount of data to train the network’s parameters, and the constructed network is generally suitable for specific applications, so their generality will be greatly restricted. Therefore, establishing a general and efficient method for multivariate process quality correlation diagnosis is a major problem to be solved in the field of quality management.

1.5. Sketch of the Algorithm

In this paper, a new correlation diagnosis method based on the optimal typical correlated component pair group is proposed. For the multivariate process quality y = (y1, y2, …, yp)T, the correlation decomposition theorem is first proved by drawing on the idea of decomposing the T2 statistic in the MYT orthogonal decomposition method. The theorem decomposes the correlation of all the quality components into the correlations of all the component pairs, which reduces the space complexity of the diagnostic system to $O(p^2)$. Then the monotonicity of the correlation between two quality components is used to prove the transitivity of the correlations of different component pairs, thus proving that the set formed by the correlations of all the component pairs is an equivalence relation. Next, drawing on the graphical representation of the equivalence relation in set theory and the concept of the minimum weighted spanning tree in graph theory, a method for computing the optimal typical correlated component pair group is proposed. With reference to the component combination-based diagnostic algorithm and the way quality components are combined in principal component analysis, the correlations of all the quality component pairs are represented by the correlations in this group, which reduces the space complexity of the diagnostic system to O(p).

2. Covariance Matrix Properties of Multivariate Process Quality

In the manufacturing process, factors affecting the product’s quality can be attributed to five aspects: man, machines, materials, methods, and environment (4M1E). On this basis, ISO9000 supplemented another three factors: manufacturing software, auxiliary materials, and utilities. Among the many factors affecting the product’s quality, changes in any one of them will have an impact on the final quality of the product, so the product’s quality is fluctuating in manufacturing. Tolerance theory is a direct proof of the fluctuation of the product’s quality.
For the multivariate process quality y = (y1, y2, …, yp)T, the covariance matrix is an important parameter to describe its correlation. Combined with the fluctuation of the product’s quality in the manufacturing process, this paper first establishes three theorems describing the characteristics of the covariance matrix of multivariate process quality.
Theorem 1: 
In the covariance matrix Σ of the multivariate process quality y = (y1, y2, …, yp)T, every element is nonzero.
Proof. 
Suppose the mean vector of y is μ = (μ1, μ2, …, μp)T. According to the definition of the covariance matrix, it is known that:
$$\Sigma = \begin{bmatrix} E[(y_1-\mu_1)(y_1-\mu_1)] & E[(y_1-\mu_1)(y_2-\mu_2)] & \cdots & E[(y_1-\mu_1)(y_p-\mu_p)] \\ E[(y_2-\mu_2)(y_1-\mu_1)] & E[(y_2-\mu_2)(y_2-\mu_2)] & \cdots & E[(y_2-\mu_2)(y_p-\mu_p)] \\ \vdots & \vdots & \ddots & \vdots \\ E[(y_p-\mu_p)(y_1-\mu_1)] & E[(y_p-\mu_p)(y_2-\mu_2)] & \cdots & E[(y_p-\mu_p)(y_p-\mu_p)] \end{bmatrix} = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp} \end{bmatrix} \tag{4}$$
For any element σij = E[(yi − μi)(yj − μj)] in Σ, the sufficient and necessary condition for it to be 0 is as follows:
$$y_i = \mu_i \quad \text{or} \quad y_j = \mu_j \tag{5}$$
According to the properties of mathematical expectation, Equation (5) implies that the quality component yi or yj is a constant in the manufacturing process. Obviously, this contradicts the viewpoint of the fluctuation of the product’s quality, and therefore, Equation (5) does not hold, i.e., every element of Σ is nonzero. □
Theorem 2: 
The covariance matrix Σ of the multivariate process quality y = (y1, y2, …, yp)T is a real symmetric positive definite matrix.
Proof. 
According to Equation (4) on the definition of the covariance matrix:
$$\sigma_{ij} = E[(y_i - \mu_i)(y_j - \mu_j)] \tag{6}$$
$$\sigma_{ji} = E[(y_j - \mu_j)(y_i - \mu_i)] \tag{7}$$
Comparing Equations (6) and (7), it follows from the properties of mathematical expectation that:
$$\sigma_{ij} = \sigma_{ji} \tag{8}$$
Equation (8) shows that Σ is a symmetric matrix.
Let the p-dimensional vector c = (c1, c2, …, cp)T ≠ 0. Then:
$$c^T \Sigma c = (c_1, c_2, \ldots, c_p)\, \Sigma\, (c_1, c_2, \ldots, c_p)^T \tag{9}$$
Substituting Equation (4) into (9), after simplification and consolidation, we obtain the following:
$$c^T \Sigma c = E\left[\sum_{i=1}^{p} c_i (y_i - \mu_i) \sum_{k=1}^{p} (y_k - \mu_k) c_k\right] \tag{10}$$
Let the random variable $z = \sum_{i=1}^{p} c_i (y_i - \mu_i)$. Substituting this into Equation (10), we obtain the following:
$$c^T \Sigma c = E(z^2) \geq 0 \tag{11}$$
From the proof of Theorem 1, it is clear that, according to the viewpoint of the fluctuation of the product’s quality, z ≠ 0, i.e.,
$$c^T \Sigma c = E(z^2) > 0 \tag{12}$$
Equation (12) shows that the covariance matrix Σ of the multivariate process quality y = (y1, y2, …, yp)T is a real symmetric positive definite matrix. □
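The step from Equation (9) to Equation (10) can be written out term by term. The following is a sketch of that expansion, using only the definition of the covariance elements and the linearity of expectation; the double-sum layout is a presentational choice and is not taken from the original derivation.

```latex
\begin{aligned}
c^{T}\Sigma c
  &= \sum_{i=1}^{p}\sum_{k=1}^{p} c_i\,\sigma_{ik}\,c_k
   = \sum_{i=1}^{p}\sum_{k=1}^{p} c_i\,E\big[(y_i-\mu_i)(y_k-\mu_k)\big]\,c_k \\
  &= E\Big[\sum_{i=1}^{p} c_i (y_i-\mu_i)\,\sum_{k=1}^{p} (y_k-\mu_k)\,c_k\Big]
   = E\big(z^{2}\big),
  \qquad z=\sum_{i=1}^{p} c_i (y_i-\mu_i).
\end{aligned}
```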
Theorem 3: 
The inverse matrix Σ−1 of the covariance matrix Σ of the multivariate process quality y = (y1, y2, …, yp)T is a real symmetric positive definite matrix.
Proof. 
First, we prove the symmetry of Σ−1. It follows from the symmetry of Σ that:
$$\Sigma = \Sigma^T \tag{13}$$
Inverting both sides of Equation (13), we obtain the following:
$$\Sigma^{-1} = (\Sigma^T)^{-1} = (\Sigma^{-1})^T \tag{14}$$
The above equation shows that Σ−1 is a symmetric matrix.
Let the eigenvalues of Σ be λ1, λ2, …, λp. By the positive definiteness of Σ, λi > 0 (i ≤ p). According to the properties of the inverse matrix, the eigenvalues of Σ−1 are 1/λ1, 1/λ2, …, 1/λp, i.e., the eigenvalues of Σ−1 are all greater than 0, so Σ−1 is a positive definite matrix. □

3. Correlation Diagnosis Method Based on OTCCPG

In the CCBD method, the exponential dependence of N on p is the main reason this approach is difficult to apply. If the growth of N with p can be slowed by proper means, the expansion of the diagnostic system scale as p increases will be avoided to a certain extent, and the approach can then be applied in multivariate process quality management.

3.1. Correlation Decomposition

Theorem 4: 
In the multivariate process quality y = (y1, y2, …, yp)T, the sufficient and necessary condition for the correlation of all components to exist is that any two components yi and yj are correlated.
Proof. 
Firstly, the sufficiency of Theorem 4 is proved. Any two components yi and yj in y that are correlated show that σij ≠ 0. From Theorems 2 and 3, the covariance matrix Σ and its inverse matrix Σ−1 are real symmetric positive definite matrices. From the definition of the T2 statistic in Equation (1), it is clear that for any sample data, its T2 statistic is greater than 0, i.e., the correlation of all components exists.
The following proves the necessity of Theorem 4 by reductio ad absurdum. The existence of a correlation between all the components in y implies that for any sample data, its T2 statistic is greater than 0. From the definition of the T2 statistic in Equation (1), there exists an inverse matrix of the covariance matrix Σ of y, and the rank of Σ is p.
$$R(\Sigma) = p \tag{15}$$
Assume yk and yj in y are uncorrelated, i.e., σkj = 0. By the definition of covariance, we have:
$$\sigma_{kj} = E[(y_k - \mu_k)(y_j - \mu_j)] = 0 \tag{16}$$
The sufficient and necessary condition for Equation (16) to hold is yk = μk or yj = μj. Without loss of generality, let yk = μk. From the definition of covariance, for any component yi (i = 1, 2, …, p), we have:
$$\sigma_{ki} = E[(y_k - \mu_k)(y_i - \mu_i)] = 0 \tag{17}$$
Equation (17) shows that in the covariance matrix Σ of y, the kth row and kth column are both 0, i.e., R(Σ) ≤ p − 1. This contradicts Equation (15), so the assumption is not valid, and the necessity of Theorem 4 is proved. □
Theorem 4 means that the correlation of all the quality components can be represented as correlations of component pairs, so the correlation diagnostic system only needs to monitor the correlation shifts of all the component pairs. In addition, a T2 control chart to monitor the correlation shift of all the components should be added. The number of T2 control charts is $N = C_p^2 + 1 = p(p-1)/2 + 1$, where N is a quadratic function of p. The space complexity of the diagnostic system is lowered to $O(p^2)$. Compared to the CCBD method, the growth of N with p has decreased significantly. Meanwhile, because the component pair is the minimum combination of components, information redundancy in diagnosis results can be avoided.
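As a quick arithmetic check, the following worked values compare the CCBD chart count from Section 1.1 with the pairwise count given above. The values p = 5 and p = 10 are chosen only for illustration, and the labels N_CCBD and N_pair are introduced here for readability.

```latex
\begin{aligned}
p=5:  &\quad N_{\mathrm{CCBD}} = 2^{5}-5-1 = 26,    &\quad N_{\mathrm{pair}} &= \tfrac{5\cdot 4}{2}+1 = 11,\\
p=10: &\quad N_{\mathrm{CCBD}} = 2^{10}-10-1 = 1013, &\quad N_{\mathrm{pair}} &= \tfrac{10\cdot 9}{2}+1 = 46.
\end{aligned}
```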

3.2. Transitivity of Component Pairs’ Correlations

Although the functional relation between N and p is lowered to a power function by correlation decomposition, N will still increase rapidly while p is increasing, so further proper ways should be adopted to reduce the scale of the diagnostic system on the basis of the above analysis.
Definition 1: 
In the multivariate process quality y = (y1, y2, …, yp)T, Y = {(yi, yj)| i, j = 1, 2, …, p, i ≠ j} is the component pair set, and Y* is another set that meets the following two properties:
  • Y* ⊆ Y;
  • Any component in y should appear at least once in Y*.
Then Y* is defined as a typical correlated component pair group (TCCPG) of y.
Based on the above definition, there is Theorem 5 shown below.
Theorem 5: 
In the multivariate process quality y = (y1, y2, …, yp)T, suppose Y* is a TCCPG of y, then the correlation of any component pair (yi, yj) (i, j = 1, 2, …, p, i ≠ j) can be represented by the correlations of the typical correlated component pairs in Y*.
Proof. 
The proof of Theorem 5 is given below in two cases.
(1) (yi, yj) ∈ Y*.
It is obvious that Theorem 5 holds.
(2) (yi, yj) ∉ Y*.
The analysis of Figure 1, Figure 2 and Figure 3 shows that for any two components yj and yk, the correlation of them means that with an increment in yj, there is a proper and monotonically increasing increment in yk; therefore, the correlation between yj and yk can be described by a function yk = f(yj) that is monotonous in its domain. For any sample data, if the variations of yj and yk violate the monotonicity of this function or the magnitude of the change is beyond the reasonable range limited by this function, it means that the correlation between yj and yk is out of control.
When (yi, yj) ∉ Y*, from the second property in Definition 1, we know that there is a component pair chain (yi, yk), (yk, ym), …, (ys, yj) in Y*. Suppose the correlations of these component pairs are denoted as yi = fik(yk), yk = fkm(ym), …, ys = fsj(yj); then, from the properties of composite functions, the correlation between yi and yj can be written as yi = fik[fkm[…[fsj(yj)]…]], i.e., the correlation between yi and yj can be represented by the correlations of component pairs in Y*, so Theorem 5 holds. □
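The component pair chain used in case (2) can be found mechanically. The sketch below treats the pairs in Y* as undirected edges and runs a breadth-first search between two components; it assumes the group is connected (as a spanning tree is) and returns None when no chain exists. The function name `pair_chain` and the example group are hypothetical.

```python
from collections import deque

def pair_chain(pairs, start, goal):
    """Return the chain of component pairs linking start to goal, or None."""
    neighbours = {}
    for a, b in pairs:                      # pairs are undirected
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for nxt in neighbours.get(node, []):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    if goal not in parent:
        return None                         # no chain links the two components
    chain = []
    node = goal
    while parent[node] is not None:         # walk back from goal to start
        chain.append((parent[node], node))
        node = parent[node]
    return chain[::-1]

# Hypothetical group for a five-component process (numbers are component indices).
group = [(1, 2), (1, 3), (2, 4), (4, 5)]
chain_3_to_5 = pair_chain(group, 3, 5)
print(chain_3_to_5)
```

Composing the pairwise functions along the returned chain, in order, gives the representation of the correlation between the two end components used in the proof.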
Theorem 5 means that in multivariate process quality, the correlations of component pairs have the property of transitivity, i.e., any correlation between two different components can be represented by the correlations of component pairs in TCCPG. Corresponding to the manufacturing process, if all the correlations of component pairs in TCCPG are normal, then the rest of the correlations of component pairs are normal, too; the correlation of all the quality components is under control. When the correlation of all the quality components is out of control, one or more correlations of component pairs in TCCPG must be abnormal. If these abnormal correlations are adjusted to normal levels by proper means, it can be assured that the correlation of all the quality components should be recovered to normal levels.
According to the above analysis, the correlation control of all the component pairs can be converted to monitor the ones in TCCPG. In addition, a control chart to monitor the correlation of all the quality components should be added to form the correlation diagnostic system.

3.3. Maximum Correlated Spanning Tree

In the above analysis, when using Theorem 5 to establish the diagnostic model of multivariate process quality correlations, the key problem is to solve the TCCPG. Here, the method of solving the TCCPG is given on the basis of the principles in set theory and graph theory.
Theorem 6: 
In the multivariate process quality y = (y1, y2, …, yp)T, the set of binary correlations of all the quality component pairs is an equivalence relation.
Proof. 
In set theory, a binary relation with the properties of reflexivity, symmetry, and transitivity is defined as an equivalence relation. Obviously, any component yi in y is correlated with itself, and the correlation coefficient is 1, so reflexivity holds. If components yi and yj are correlated, then yj and yi are also correlated, and the quantitative descriptions of their correlation, i.e., the correlation coefficients, are equal, i.e., ρij = ρji, so symmetry holds. According to Theorem 5, the binary correlations of all the component pairs have the property of transitivity. In summary, Theorem 6 holds. □
The set of equivalence relations can be characterized graphically [30,31,32]. For the multivariate process quality y = (y1, y2, …, yp)T, the set of vertices V = {y1, y2, …, yp} is built with all the quality components. If yi and yj are correlated, an undirected edge eij is drawn to describe the binary correlation between yi and yj, and an undirected graph G representing the binary correlations between all the quality components can be obtained. From Theorem 1, any two quality components in y are correlated, and therefore, the undirected graph G contains p(p − 1)/2 edges. G is a complete undirected graph.
In the complete undirected graph G, the correlation between any two components yi and yj can be described by the connectivity between the vertices yi and yj in V. According to the basic properties of complete undirected graphs, a cycle L that contains all the vertices in V can be found in G. If only one edge eij in L is deleted, the vertices in V are still connected, and the correlation between yi and yj can be represented by the correlations of the other vertices in L − eij. From graph theory [33,34], we know that L − eij is a spanning tree of G.
From Definition 1, the component pair group consisting of two vertices of each edge in the spanning tree of G is a TCCPG of y. Therefore, the problem of solving the TCCPG of y can be converted into the problem of solving a spanning tree in the correlation graph G of y.
The spanning tree of G is not unique in general. To find the most typical correlated component pairs group, take the correlation coefficient matrix P = (ρij)p×p of y as a reference, set ρij as the weight of the edge eij in G, and then a weighted correlation graph of y is obtained, which is denoted as G*. Now the problem of finding the most typical correlated component pair group is converted to the one that calculates the maximum weighted spanning tree of G*. This spanning tree is named the maximum-correlated spanning tree (MCST).
Considering that G* is a dense graph with more edges, take the Prim algorithm [35,36,37] as the reference. The maximum correlated spanning tree of y can be calculated by the following steps:
(1) Let U = {y1}, V = {y1, y2, …, yp}, and E = Φ;
(2) Let eij denote the edge with the largest absolute weight among the edges whose vertices satisfy yi ∈ U and yj ∈ V − U;
(3) Add yj to U and eij to E;
(4) If U ≠ V, go to step (2); otherwise, the algorithm terminates.
When the algorithm is executed, the collection of vertex pairs of every edge in E is a TCCPG of y, and the binary correlations of all the quality component pairs can be represented to the maximum extent by this component pairs group, so it is called the optimal typical correlated component pairs group (OTCCPG).
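The four steps above can be sketched in a few lines. This is a minimal, unoptimized Prim-style search written for illustration, not the author's implementation; the function name is hypothetical and vertices are 0-based indices. The example matrix uses the correlation coefficients reported in the case study of Section 4 (upper-triangle values, mirrored to make the matrix symmetric).

```python
def max_correlation_spanning_tree(corr):
    """Prim-style search for the spanning tree built from the largest |rho| edges.

    corr is a symmetric p x p list of lists; returns a list of (i, j) edges.
    """
    p = len(corr)
    in_tree = {0}        # U in the text, seeded with the first component
    edges = []           # E in the text
    while len(in_tree) < p:
        best = None
        for i in in_tree:
            for j in range(p):
                if j in in_tree:
                    continue
                weight = abs(corr[i][j])
                if best is None or weight > best[0]:
                    best = (weight, i, j)
        _, i, j = best
        in_tree.add(j)           # add y_j to U
        edges.append((i, j))     # add e_ij to E
    return edges

# Correlation coefficients from the case study (Section 4), mirrored.
P = [
    [1.0,    0.9305, 0.9876, 0.3550, 0.2793],
    [0.9305, 1.0,    0.8926, 0.3857, 0.2712],
    [0.9876, 0.8926, 1.0,    0.3463, 0.3003],
    [0.3550, 0.3857, 0.3463, 1.0,    0.8959],
    [0.2793, 0.2712, 0.3003, 0.8959, 1.0],
]
tree = max_correlation_spanning_tree(P)
print([(i + 1, j + 1) for i, j in tree])   # 1-based component indices
```

The scan over all candidate edges in every iteration costs O(p^3) overall, which is acceptable for the small p typical of a process; a priority queue would be the usual refinement for larger graphs.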

3.4. Correlation Diagnosis Method

There are p − 1 edges in a spanning tree of a complete undirected graph with p vertices. Therefore, the number of component pairs in the OTCCPG is p − 1. T2 control charts are established for every component pair in the OTCCPG to form the correlation diagnostic system, whose space complexity is O(p).
The above analysis is founded on the condition that the mean vector μ, covariance matrix Σ, and correlation coefficient matrix P of the manufacturing process are given. However, in many applications, these parameters are generally unknown. In this case, the unbiased estimator of the manufacturing process parameters can be calculated from a set of sample data Si (i = 1, 2, …, n) collected while the process is in a stable state.
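$$\mu = (\mu_1, \mu_2, \ldots, \mu_p) = \frac{1}{n}\sum_{i=1}^{n} S_i \tag{18}$$
$$\Sigma = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp} \end{bmatrix} = \frac{1}{n-1}\sum_{i=1}^{n} (S_i - \mu)^T (S_i - \mu) \tag{19}$$
$$P = \begin{bmatrix} \rho_{11} & \rho_{12} & \cdots & \rho_{1p} \\ \rho_{21} & \rho_{22} & \cdots & \rho_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{p1} & \rho_{p2} & \cdots & \rho_{pp} \end{bmatrix} \tag{20}$$
$$\rho_{ij} = \frac{\sigma_{ij}}{\sqrt{\sigma_{ii}\,\sigma_{jj}}} \quad (i, j = 1, 2, \ldots, p) \tag{21}$$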
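The estimators in Equations (18)–(21) can be computed directly from the sample rows. The following sketch uses plain Python lists; the function name `estimate_parameters` and the three-row data set are hypothetical and serve only to show the calculation.

```python
import math

def estimate_parameters(samples):
    """Return (mu, sigma, rho) estimated from a list of equal-length sample rows."""
    n, p = len(samples), len(samples[0])
    # Equation (18): component-wise sample mean.
    mu = [sum(s[k] for s in samples) / n for k in range(p)]
    # Equation (19): sample covariance with the n - 1 divisor.
    sigma = [[sum((s[i] - mu[i]) * (s[j] - mu[j]) for s in samples) / (n - 1)
              for j in range(p)] for i in range(p)]
    # Equation (21): correlation coefficients from the covariance elements.
    rho = [[sigma[i][j] / math.sqrt(sigma[i][i] * sigma[j][j])
            for j in range(p)] for i in range(p)]
    return mu, sigma, rho

# Hypothetical in-control data: three samples of a two-component process.
data = [(1.0, 2.0), (2.0, 3.0), (3.0, 7.0)]
mu, sigma, rho = estimate_parameters(data)
print(mu, sigma[0][1], rho[0][1])
```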
Combining the above analyses, for the multivariate process quality y = (y1, y2, …, yp)T, the correlation diagnostic system can be formed by the following steps:
(1) Collect sufficient quality data Si (i = 1, 2, …, n) while the manufacturing process is in a stable state;
(2) Calculate the manufacturing parameters according to Equations (18)–(21);
(3) Take the coefficient matrix P as a reference and draw the weighted correlation graph G* of y;
(4) Find the MCST of G*;
(5) Obtain the OTCCPG of y according to the MCST;
(6) For every component pair (yi, yj) in the OTCCPG, establish its T2 control chart, Kij;
(7) Establish the T2 control chart K to monitor the correlation shift of all the quality components.
The number of T2 control charts in this diagnostic system is p. When K is normal, the correlation of all the quality components is under control. When an out-of-control signal is generated in K, the root cause(s) can be specified by examining all the rest of the T2 control charts.
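The examination of the pair charts Kij can be sketched as follows. This is an illustration under stated assumptions: the in-control parameters, the pair group, and the sample are hypothetical; each pair limit uses the chi-square quantile with 2 degrees of freedom, −2 ln α; and the overall chart K is not computed here because its limit needs a chi-square quantile with p degrees of freedom.

```python
import math

def pair_t2(sample, mu, sigma, i, j):
    """T^2 of the sub-vector (y_i, y_j), using the matching 2 x 2 block of sigma."""
    di, dj = sample[i] - mu[i], sample[j] - mu[j]
    sii, sij, sjj = sigma[i][i], sigma[i][j], sigma[j][j]
    det = sii * sjj - sij * sij
    return (sjj * di * di - 2.0 * sij * di * dj + sii * dj * dj) / det

def diagnose(sample, mu, sigma, pairs, alpha):
    """Return the pairs in the group whose T^2 exceeds the upper control limit."""
    ucl = -2.0 * math.log(alpha)   # chi-square quantile, 2 degrees of freedom
    return [(i, j) for i, j in pairs if pair_t2(sample, mu, sigma, i, j) > ucl]

# Hypothetical three-component process with 0-based component indices.
mu = [0.0, 0.0, 0.0]
sigma = [[1.0, 0.9, 0.5],
         [0.9, 1.0, 0.4],
         [0.5, 0.4, 1.0]]
pairs = [(0, 1), (0, 2)]          # assumed OTCCPG for this example
sample = [1.5, -1.5, 1.0]

abnormal = diagnose(sample, mu, sigma, pairs, alpha=0.025)
print(abnormal)
```

In this example only the first pair is flagged, because its two components move in opposite directions despite a strong positive correlation.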

4. A Case Study

The following case shows how the diagnostic system works in practice. The automotive purifier coating production line is mainly composed of three key steps: pre-treatment, coating, and heating and curing. Pre-treatment is the surface treatment of the parts to be coated to remove oil and rust; the coating process needs to ensure that the coating coverage is uniform and the thickness is consistent to avoid defects such as a coating that is too thin or too thick and leakage of the coating; the coating material is then cured by heating. In the coating process, in order to ensure the uniformity of the coating thickness, the pressure, mixing temperature, coating temperature, slurry mass, and pH value should be controlled; these are denoted as y = (y1, y2, y3, y4, y5)T. Moreover, 20 quality data points are collected, as shown in Table 1.
In order to analyze the reliability of the sample data, for each quality component, the probability of false alarm is taken as α = 0.0027 according to the 3σ principle of the Shewhart control chart, and then the probability of false alarm in the correlation shift is taken as αy = 0.025 with reference to Bonferroni’s inequality to make Shewhart control charts for the five quality components and T2 control charts to monitor the correlation shift of the five quality components, as shown in Figure 4, Figure 5, Figure 6, Figure 7, Figure 8 and Figure 9.
Figure 4, Figure 5, Figure 6, Figure 7, Figure 8 and Figure 9 show that the process is in a stable state; therefore, the data in Table 1 can be used to estimate the parameters of the coating process using Equations (18)–(21):
$$\mu = (4143,\ 41.8,\ 39.5,\ 9.2,\ 2.11)$$
$$\Sigma = \begin{bmatrix} 29795.6079 & 145.6247 & -154.6684 & -38.6679 & -5.8386 \\ 145.6247 & 0.8220 & -0.7342 & -0.2207 & -0.02986 \\ -154.6684 & -0.7342 & 0.8232 & 0.1983 & 0.0330 \\ -38.6679 & -0.2207 & 0.1983 & 0.3982 & 0.0685 \\ -5.8386 & -0.0298 & 0.0330 & 0.0685 & 0.0147 \end{bmatrix}$$
$$P = \begin{bmatrix} 1 & 0.9305 & 0.9876 & 0.3550 & 0.2793 \\ 0.9305 & 1 & 0.8926 & 0.3857 & 0.2712 \\ 0.9876 & 0.8926 & 1 & 0.3463 & 0.3003 \\ 0.3550 & 0.3857 & 0.3463 & 1 & 0.8959 \\ 0.2793 & 0.2712 & 0.3003 & 0.90 & 1 \end{bmatrix}$$
Take the coefficient matrix P as a reference; the weighted correlation graph G* is shown in Figure 10, and then the MCST of G* can be found, as shown in Figure 11.
According to Figure 11, the OTCCPG of y is Y* = {(y1, y2), (y1, y3), (y2, y4), (y4, y5)}. For every component pair in Y*, suppose the false alarm probability is α = 0.025. K12 denotes the T2 control chart to monitor the correlation shift between y1 and y2. K13 denotes the T2 control chart to monitor the correlation shift between y1 and y3. Similar symbols are applied to denote the rest of the T2 control charts. Finally, the T2 control chart K to monitor the correlation shift of all the quality components is established. These five T2 control charts constitute the correlation diagnostic system for y.
Now this diagnostic system can be used to monitor and diagnose the correlation shift of quality components. In subsequent manufacturing, four quality data points at different moments are collected, as shown in Table 2. The T2 control chart of these four quality data points is drawn in K, as shown in Figure 12. We can see that at the last three points, the correlation shift of all the components is out of control.
In order to diagnose the root cause(s) that cause the last three points to be out of control, check the rest of the T2 control chart (K12, K13, K24, K45), as shown in Figure 13, Figure 14, Figure 15 and Figure 16. From these charts, it can be concluded that the root causes that cause the second point to be out of control are the correlations of (y1, y2) and (y1, y3) being abnormal. The abnormal correlation of (y1, y3) causes the third point out of control, and the abnormal correlation of (y4, y5) leads to the last point out of control.
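The pair-wise check can be sketched numerically. The control limit used by the paper is not given in this section, so the sketch below makes two loudly stated assumptions: the parameters μ and Σ are treated as known, and the upper control limit is the chi-square quantile with 2 degrees of freedom at α = 0.025, which has the closed form −2 ln α. The names `t2_pair`, `UCL`, and `flags` are hypothetical.

```python
import math

mu = {1: 4143, 2: 41.8, 3: 39.5, 4: 9.2, 5: 2.11}
# Variances and the covariances of the four OTCCPG pairs, as printed in Σ.
var = {1: 29795.6079, 2: 0.8220, 3: 0.8232, 4: 0.3982, 5: 0.0147}
cov = {(1, 2): 145.6247, (1, 3): -154.6684, (2, 4): -0.2207, (4, 5): 0.0685}

# Test points from Table 2, in the order (y1, y2, y3, y4, y5).
table2 = [
    (4226, 42.2, 39.1, 9.98, 2.26),
    (4133, 40.2, 39.1, 8.64, 2.04),
    (3947, 40.7, 39.9, 8.19, 1.97),
    (4226, 42.2, 39.1, 9.54, 1.97),
]

ALPHA = 0.025
# Assumed limit: chi-square upper quantile with 2 degrees of freedom.
UCL = -2.0 * math.log(ALPHA)


def t2_pair(x, i, j):
    """Bivariate T2 = d' S^-1 d for components i and j (closed-form 2x2 inverse)."""
    d1, d2 = x[i - 1] - mu[i], x[j - 1] - mu[j]
    s11, s22, s12 = var[i], var[j], cov[(i, j)]
    det = s11 * s22 - s12 * s12
    return (s22 * d1 * d1 - 2.0 * s12 * d1 * d2 + s11 * d2 * d2) / det


# For each test point, list the pairs whose T2 exceeds the assumed limit.
flags = {
    n: [pair for pair in cov if t2_pair(x, *pair) > UCL]
    for n, x in enumerate(table2, start=1)
}
print(flags)  # {1: [], 2: [(1, 2), (1, 3)], 3: [(1, 3)], 4: [(4, 5)]}
```

Under these assumptions, the flagged pairs coincide with the diagnosis read from charts K12, K13, K24, and K45. A limit based on estimated parameters would be larger, so the borderline cases could change.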
To validate these conclusions, the 26 T2 control charts of the diagnostic system based on the CCBD method are also constructed. The diagnostic results of the two systems are shown in Table 3.
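The figure of 26 charts is consistent with one chart per combination of two or more of the five components, which is assumed here to be the structure of the CCBD system. A short sketch comparing the two chart counts:

```python
from itertools import combinations

components = ["y1", "y2", "y3", "y4", "y5"]

# Assumed CCBD structure: one chart per subset of two or more components.
ccbd_charts = [
    subset
    for size in range(2, len(components) + 1)
    for subset in combinations(components, size)
]

# OTCCPG system as described above: one chart per MCST edge plus the overall chart K.
otccpg_chart_count = (len(components) - 1) + 1

print(len(ccbd_charts), otccpg_chart_count)  # 26 5
```

For p components, the first count is 2^p − p − 1, while the second is p, which is where the gap between the two systems comes from.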
In Table 3, the two systems agree only at the first point, but the difference can be explained. At the second point, because the correlations of (y1, y2) and (y1, y3) are abnormal, the correlation of every component combination that contains (y1, y2) or (y1, y3) must also be abnormal, so those results are redundant. Once the redundant diagnostic results are removed, the conclusions of the two diagnostic systems are completely consistent at this point. A similar analysis explains the difference at the other points, as shown in Table 4.
This comparison shows that the OTCCPG-based correlation diagnosis method not only identifies the cause(s) of an out-of-control correlation shift but also avoids redundant diagnostic results, so it can provide more accurate diagnostic information for subsequent process quality adjustment.
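The removal of redundant results can be illustrated as follows: a flagged combination is treated as redundant if it strictly contains a smaller flagged combination. This is a sketch of the argument above, not code from the paper, and the function name `minimal_sets` is hypothetical.

```python
def minimal_sets(flagged):
    """Keep only flagged combinations that contain no smaller flagged combination."""
    sets = [frozenset(f) for f in flagged]
    return sorted(
        tuple(sorted(s))
        for s in sets
        if not any(other < s for other in sets)  # '<' is the proper-subset test
    )


# CCBD results from Table 3, written with component indices.
point2 = [(1, 2), (1, 3), (1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 4), (1, 3, 5),
          (1, 2, 3, 4), (1, 2, 3, 5), (1, 2, 4, 5), (1, 3, 4, 5)]
point4 = [(4, 5), (1, 4, 5), (2, 4, 5), (3, 4, 5),
          (1, 2, 4, 5), (1, 3, 4, 5), (2, 3, 4, 5)]

print(minimal_sets(point2))  # [(1, 2), (1, 3)]
print(minimal_sets(point4))  # [(4, 5)]
```

The surviving combinations are exactly the component pairs reported by the OTCCPG-based system for those points.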

5. Conclusions and Discussion

This paper proposes an OTCCPG-based correlation diagnosis method for multivariate process quality management. The method is developed in three steps. First, based on correlation decomposition, the correlation of all the quality components is decomposed into the correlations of the quality component pairs. Then, according to the transitivity of correlations of component pairs, the correlation of any component pair is represented by the correlations in the TCCPG. Finally, an MCST-based algorithm for the OTCCPG is proposed, and the correlation diagnostic system based on the OTCCPG is illustrated. Compared with the existing diagnostic methods, the method proposed in this paper has the following advantages:
(1) The diagnosis is more efficient. The space complexity of the OTCCPG-based multivariate process quality correlation diagnosis method is O(p), whereas the space complexities of the diagnostic algorithms based on the CCBD method, the principal component analysis method, and the orthogonal decomposition of the T2 statistic are O(2^p), O(p^2), and O(p!), respectively. Therefore, the method proposed in this paper has higher diagnostic efficiency.
(2) The diagnostic results are more accurate. The OTCCPG-based multivariate process quality correlation diagnosis method takes the correlation of component pairs as the diagnostic unit. Since a component pair is the smallest combination of quality components, the method avoids the redundant diagnostic information produced by diagnostic algorithms based on the CCBD method, the principal component analysis method, and the orthogonal decomposition of the T2 statistic, and so provides more accurate diagnostic results for manufacturing processes.
(3) The method has better generality. Unlike diagnostic methods based on artificial intelligence technology, whose network structures and parameters are oriented to specific applications, the diagnostic method proposed in this paper rests on strict mathematical analysis as its theoretical foundation. Therefore, the proposed method can be used as a general theoretical model for multivariate process quality correlation diagnosis.
The multivariate process quality correlation diagnostic method proposed in this paper assumes that the manufacturing process parameters are known or can be estimated accurately from sufficient sample data. For the manufacturing process of large-batch products, such estimates are feasible because sufficient sample data are available. However, for flexible manufacturing systems with variable varieties and small batches, the method cannot be applied directly because few samples are available. Considering the similarity of different batches of products in a flexible manufacturing system in terms of structure and process, the change of process parameters in flexible manufacturing should have a certain regularity. Therefore, studying how process parameters change in flexible manufacturing systems, and adapting the multivariate process quality correlation diagnosis method based on mathematical statistical theory to such systems, is a direction for further research in the field of quality management.

Author Contributions

Conceptualization, Q.N.; methodology, Q.N. and S.C.; resources, Z.Q.; writing—original draft, Q.N.; writing—review and editing, Q.N. and S.C.; supervision, Z.Q.; project administration, Q.N. and Z.Q. All authors have read and agreed to the published version of the manuscript.

Funding

The work described in this paper was supported by a research grant from the Natural Science Foundation of Gansu Province (22JR5RA342), and we hereby thank them for the financial aid.

Data Availability Statement

The data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Fernandes, F.H.; Ho, L.L.; Bourguignon, M. About Shewhart control charts to monitor the Weibull mean. Qual. Reliab. Eng. Int. 2019, 35, 2343–2357. [Google Scholar] [CrossRef]
  2. Linda, L.H.; Fidel, H.F.; Roberto, C.Q. Improving Shewhart control chart performance for monitoring the Weibull mean. Qual. Reliab. Eng. Int. 2021, 37, 984–996. [Google Scholar]
  3. Nguyen, H.D.; Tran, K.P.; Celano, G.; Maravelakis, P.E.; Castagliola, P. On the effect of the measurement error on Shewhart and EWMA control charts. Int. J. Adv. Manuf. Technol. 2020, 107, 4317–4332. [Google Scholar] [CrossRef]
  4. Malela-Majika, J.C.; Shongwe, S.C.; Castagliola, P.; Mutambayi, R.M. A novel single composite Shewhart-EWMA control chart for monitoring the process mean. Qual. Reliab. Eng. Int. 2022, 38, 1760–1789. [Google Scholar] [CrossRef]
  5. Mjimer, I.; Aoula, E.; Achouyab, E.H. Monitoring of overall equipment effectiveness by multivariate statistical process control. Int. J. Lean Six Sigma 2022, 13, 847–862. [Google Scholar] [CrossRef]
  6. Sun, Y.; Younis, I.; Zhang, Y.; Zhou, H. Optimizing the quality control of multivariate processes under an improved Mahalanobis-Taguchi system. Qual. Eng. 2023, 35, 413–429. [Google Scholar] [CrossRef]
  7. Bahrami, H.; Niaki, S.T.A.; Khedmati, M. Monitoring multivariate profiles in multistage processes. Commun. Stat. Simul. Comput. 2019, 50, 3436–3464. [Google Scholar] [CrossRef]
  8. Muhammad, I.; Hong, L.D.; Fatima, S.Z. Incorporating principal component analysis into Hotelling T2 control chart for compositional data monitoring. Comput. Ind. Eng. 2023, 186, 40–56. [Google Scholar]
  9. Ershadi, M.J.; Niaki, S.T.A.; Azizi, A.; Esfahani, A.A.; Abadi, R.E. Monitoring data quality using Hoteling multivariate control chart. Commun. Stat. Simul. Comput. 2023, 52, 1591–1606. [Google Scholar] [CrossRef]
  10. Huang, J.; Yan, X. Quality-driven principal component analysis combined with kernel least squares for multivariate statistical process monitoring. IEEE Trans. Control Syst. Technol. 2019, 27, 2688–2695. [Google Scholar] [CrossRef]
  11. Qi, L.; Yi, X.; Yao, L.; Fang, Y.; Ren, Y. Quality-related process monitoring based on improved kernel principal component regression. IEEE Access 2021, 9, 132733–132745. [Google Scholar] [CrossRef]
  12. Riaz, M.; Zaman, B.; Mehmood, R.; Abbas, N.; Abujiya, M. Advanced multivariate cumulative sum control charts based on principal component method with application. Qual. Reliab. Eng. Int. 2021, 37, 2760–2789. [Google Scholar] [CrossRef]
  13. Sun, C.; Hou, J. An improved principal component regression for quality-related process monitoring of industrial control systems. IEEE Access 2017, 5, 21723–21730. [Google Scholar] [CrossRef]
  14. Mason, R.L.; Tracy, N.D.; Young, J.C. Decomposition of T2 for multivariate control chart interpretation. J. Qual. Technol. 1995, 27, 109–119. [Google Scholar] [CrossRef]
  15. Mason, R.L.; Tracy, N.D.; Young, J.C. A practical approach for interpreting multivariate T2 control chart signals. J. Qual. Technol. 1997, 29, 396–406. [Google Scholar] [CrossRef]
  16. Mason, R.L.; Tracy, N.D.; Young, J.C. Improving the sensitivity of the T2 statistic in multivariate process control. J. Qual. Technol. 1999, 31, 155–165. [Google Scholar] [CrossRef]
  17. Akeem, A.A.; Yahaya, A.; Asiribo, O. Hotelling’s T2 decomposition: Approach for five process characteristics in a multivariate statistical process control. Am. J. Theor. Appl. Stat. 2015, 4, 432–437. [Google Scholar] [CrossRef]
  18. Huang, X.H.; Xu, J.K.; Zhou, Q. Multi-scale diagnosis of spatial point interaction via decomposition of the k function-based T2 statistic. J. Qual. Technol. 2017, 49, 213–227. [Google Scholar] [CrossRef]
  19. Li, X.L.; Liu, S.S. Fault separation and detection algorithm based on Mason Young Tracy decomposition and Gaussian mixture models. Int. J. Intell. Comput. 2020, 13, 81–101. [Google Scholar] [CrossRef]
  20. Ueda, R.M.; Souza, A.M. An effective approach to detect the source(s) of out-of-control signals in productive processes by vector error correction (VEC) residual and Hotelling’s T2 decomposition techniques. Expert. Syst. Appl. 2022, 187, 115979. [Google Scholar] [CrossRef]
  21. Yu, J.B.; Zhang, C.Y.; Wang, S.J. Sparse one-dimensional convolutional neural network-based feature learning for fault detection and diagnosis in multivariable manufacturing processes. Neural. Comput. Appl. 2022, 34, 4343–4366. [Google Scholar] [CrossRef]
  22. Samira, Z.; Moosa, A. Simultaneous fault diagnosis of wind turbine using multichannel convolutional neural networks. ISA Trans. 2021, 108, 230–239. [Google Scholar]
  23. Xu, Q.Q.; Dong, J.; Peng, K.X. A novel method of neural network model predictive control integrated process monitoring and applications to hot rolling process. Expert. Syst. Appl. 2023, 237, 121682. [Google Scholar] [CrossRef]
  24. Xian, X.C.; Li, J.; Liu, K.B. Causation-based monitoring and diagnosis for multivariate categorical processes with ordinal information. IEEE Trans. Autom. Sci. Eng. 2019, 16, 886–897. [Google Scholar] [CrossRef]
  25. Rezki, N.; Kazar, O.; Mouss, L.H. A hybrid approach for complex industrial process monitoring. J. Sci. Ind. Res. India 2017, 76, 608–613. [Google Scholar]
  26. Wang, Y.Z.; Liu, Y.; Khan, F. Semiparametric PCA and Bayesian network based process fault diagnosis technique. Can. J. Chem. Eng. 2017, 95, 1800–1816. [Google Scholar] [CrossRef]
  27. Liang, J.P.; Zhang, K. A new hybrid fault diagnosis method for wind energy converters. Electronics 2023, 12, 1263. [Google Scholar] [CrossRef]
  28. Zhang, H.Q.; Wang, J.C.; Wang, M. Integration of cuckoo search and fuzzy support vector machine for intelligent diagnosis of production process quality. J. Ind. Manag. Optim. 2022, 18, 195–217. [Google Scholar] [CrossRef]
  29. Tang, J.; Zhao, Q.N. Motor rolling bearing fault diagnosis based on MVMD energy entropy and GWO-SVM. J. Vibroeng. 2023, 25, 1096–1107. [Google Scholar] [CrossRef]
  30. Crone, L.; Fishman, L.; Jackson, S.C. Equivalence relations and determinacy. J. Math. Log. 2022, 22, 2250003. [Google Scholar] [CrossRef]
  31. Zhuchok, Y.; Toichkina, O. Endotypes of partial equivalence relations. Semigroup Forum 2021, 103, 966–975. [Google Scholar] [CrossRef]
  32. Frey, J. Categories of partial equivalence relations as localizations. J. Pure Appl. Algebra 2022, 227, 107115. [Google Scholar] [CrossRef]
  33. Chakraborty, M.; Chowdhury, S.; Pal, R.K. Two algorithms for computing all spanning trees of a simple, undirected, and connected graph: Once assuming a complete graph. IEEE Access 2018, 6, 56290–56300. [Google Scholar] [CrossRef]
  34. Alexander, V.S. Spanning tree of a multiple graph. J. Comb. Optim. 2022, 43, 850–869. [Google Scholar]
  35. Zeng, X.X.; Liu, Q.L.; Yao, S. An improved Prim algorithm for connection scheme of last train in urban mass transit network. Symmetry 2019, 11, 681. [Google Scholar] [CrossRef]
  36. Lukaszewski, A.; Nogal, L. Multi-sourced power system restoration strategy based on modified Prim’s algorithm. Bull. Pol. Acad. Sci. Tech. Sci. 2021, 69, e137942. [Google Scholar] [CrossRef]
  37. Łukaszewski, A.; Nogal, L.; Januszewski, M. The application of the modified Prim’s algorithm to restore the power system using renewable energy sources. Symmetry 2022, 14, 1012. [Google Scholar] [CrossRef]
Figure 1. Mean control chart of y1.
Figure 2. Mean control chart of y2.
Figure 3. T2 control chart of y.
Figure 4. Shewhart control chart of y1 to monitor the mean shift of y1 in Table 1.
Figure 5. Shewhart control chart of y2 to monitor the mean shift of y2 in Table 1.
Figure 6. Shewhart control chart of y3 to monitor the mean shift of y3 in Table 1.
Figure 7. Shewhart control chart of y4 to monitor the mean shift of y4 in Table 1.
Figure 8. Shewhart control chart of y5 to monitor the mean shift of y5 in Table 1.
Figure 9. T2 control chart of y to monitor the correlation shift of (y1, y2, y3, y4, y5) in Table 1.
Figure 10. Weighted correlation graph.
Figure 11. Maximum correlation spanning tree.
Figure 12. T2 control chart K to monitor the correlation shift of (y1, y2, y3, y4, y5) in Table 2.
Figure 13. T2 control chart K12 to monitor the correlation shift of (y1, y2) in Table 2.
Figure 14. T2 control chart K13 to monitor the correlation shift of (y1, y3) in Table 2.
Figure 15. T2 control chart K24 to monitor the correlation shift of (y2, y4) in Table 2.
Figure 16. T2 control chart K45 to monitor the correlation shift of (y4, y5) in Table 2.
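Table 1. Collected sample data.

| No. | Pressure y1 (Pa) | Mixing Temperature y2 (°C) | Coating Temperature y3 (°C) | Slurry Mass y4 (kg) | pH Value y5 |
|---|---|---|---|---|---|
| 1 | 3940 | 40.7 | 40.7 | 9.98 | 2.26 |
| 2 | 3947 | 40.7 | 40.4 | 9.54 | 2.11 |
| 3 | 3950 | 41.2 | 40.7 | 9.09 | 2.04 |
| 4 | 3955 | 40.7 | 40.4 | 9.09 | 2.11 |
| 5 | 3958 | 41.2 | 40.7 | 9.98 | 2.26 |
| 6 | 4035 | 41.2 | 39.9 | 9.54 | 2.11 |
| 7 | 4038 | 41.7 | 40.1 | 8.19 | 1.97 |
| 8 | 4040 | 41.2 | 39.9 | 9.09 | 2.04 |
| 9 | 4043 | 41.7 | 40.1 | 9.09 | 2.19 |
| 10 | 4133 | 41.2 | 39.3 | 9.98 | 2.19 |
| 11 | 4137 | 41.7 | 39.6 | 9.98 | 2.33 |
| 12 | 4139 | 41.2 | 39.3 | 8.19 | 1.9 |
| 13 | 4221 | 42.2 | 39.1 | 8.64 | 1.97 |
| 14 | 4223 | 42.2 | 39.1 | 9.54 | 2.11 |
| 15 | 4226 | 42.2 | 39.1 | 9.54 | 2.19 |
| 16 | 4229 | 41.7 | 39.1 | 9.09 | 2.19 |
| 17 | 4319 | 42.7 | 38.5 | 9.98 | 2.26 |
| 18 | 4412 | 43.8 | 38 | 8.64 | 2.04 |
| 19 | 4417 | 43.3 | 38.3 | 8.19 | 1.97 |
| 20 | 4505 | 43.3 | 37.7 | 8.64 | 1.97 |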
Table 2. Test data collected in subsequent manufacturing.

| No. | Pressure y1 (Pa) | Mixing Temperature y2 (°C) | Coating Temperature y3 (°C) | Slurry Mass y4 (kg) | pH Value y5 |
|---|---|---|---|---|---|
| 1 | 4226 | 42.2 | 39.1 | 9.98 | 2.26 |
| 2 | 4133 | 40.2 | 39.1 | 8.64 | 2.04 |
| 3 | 3947 | 40.7 | 39.9 | 8.19 | 1.97 |
| 4 | 4226 | 42.2 | 39.1 | 9.54 | 1.97 |
Table 3. Comparison of the two diagnosis results.

| No. | Diagnostic System Based on OTCCPG | Diagnostic System Using CCBD Method |
|---|---|---|
| 1 | Normal | Normal |
| 2 | (y1, y2), (y1, y3) are abnormal | (y1, y2), (y1, y3), (y1, y2, y3), (y1, y2, y4), (y1, y2, y5), (y1, y3, y4), (y1, y3, y5), (y1, y2, y3, y4), (y1, y2, y3, y5), (y1, y2, y4, y5), (y1, y3, y4, y5) are abnormal |
| 3 | (y1, y3) is abnormal | (y1, y3), (y1, y2, y3), (y1, y3, y4), (y1, y3, y5), (y1, y2, y3, y4), (y1, y2, y3, y5), (y1, y3, y4, y5) are abnormal |
| 4 | (y4, y5) is abnormal | (y4, y5), (y1, y4, y5), (y2, y4, y5), (y3, y4, y5), (y1, y2, y4, y5), (y1, y3, y4, y5), (y2, y3, y4, y5) are abnormal |
Table 4. Comparison of the two diagnostic systems after redundant diagnostic results were removed.

| No. | Diagnostic System Based on OTCCPG | Diagnostic System Using CCBD Method |
|---|---|---|
| 1 | Normal | Normal |
| 2 | (y1, y2), (y1, y3) are abnormal | (y1, y2), (y1, y3) are abnormal |
| 3 | (y1, y3) is abnormal | (y1, y3) is abnormal |
| 4 | (y4, y5) is abnormal | (y4, y5) is abnormal |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

MDPI and ACS Style

Niu, Q.; Cheng, S.; Qiu, Z. Algorithm for Correlation Diagnosis in Multivariate Process Quality Based on the Optimal Typical Correlated Component Pair Group. Processes 2024, 12, 652. https://doi.org/10.3390/pr12040652
