Algorithm for Correlation Diagnosis in Multivariate Process Quality Based on the Optimal Typical Correlated Component Pair Group

Abstract: Correlation diagnosis in multivariate process quality management is an important and challenging issue. In this paper, a new approach based on the optimal typical correlated component pair group (OTCCPG) is proposed. First, the theorem of correlation decomposition is proved, which decomposes the correlation of all the quality components into serial correlations of component pairs. Then, according to the transitivity of the correlations of component pairs, the decomposition result is represented by a correlation set of typical correlated component pairs. Finally, an algorithm for the OTCCPG based on the maximum correlation spanning tree (MCST) is proposed, and $T^2$ control charts that monitor the correlations of the component pairs in the OTCCPG are established to form the correlation diagnostic system. Theoretical analysis and practice show that the proposed method greatly reduces the space complexity of the diagnostic system.


Introduction
With the development of the modern global market, product quality has become one of the key factors that influence the competitiveness of enterprises. Throughout the formation of product quality, process quality is the most basic aspect, because product quality is influenced, directly or indirectly, by the quality of every process; process quality control is therefore the essence of quality management in manufacturing.
The theory of statistical process control (SPC) and its main tool, the control chart proposed by Shewhart, achieved the goal of univariate process quality management [1][2][3][4]. However, in modern manufacturing, many processes contain more than one quality component. Because the quality components are correlated, all components and their correlation need to be monitored simultaneously [5,6]. For example, in a bivariate process with quality $y = (y_1, y_2)^T$, 20 quality data points are collected to plot the mean control charts of $y_1$ and $y_2$, as shown in Figures 1 and 2. We can see that the variations of the two components are usually regular, i.e., as $y_1$ increases, $y_2$ increases monotonically. This regular variation, a monotonic increment of one component with the increment (or decrement) of another, is defined as the correlation of $y$.
Although $y_1$ and $y_2$ are under control in Figures 1 and 2, this does not mean that $y$ is in a stable state. At the 10th point, $y_1$ is decreasing while $y_2$ is increasing; this is opposite to the general variation of the two components, so the correlation of $y$ is abnormal. At the 20th point, although both $y_1$ and $y_2$ are decreasing, the decrement of $y_2$ is so large that it exceeds the normal range; this also means that the correlation of $y$ is abnormal. The above inferences can be verified by plotting the $T^2$ control chart for $y$, as illustrated in Figure 3. The $T^2$ statistic was initially proposed by Hotelling to describe the correlation between quality components. For a $p$-dimensional process quality $y = (y_1, y_2, \ldots, y_p)^T$, the $T^2$ statistic is defined as follows:

$$T^2 = (y - \mu)^T \Sigma^{-1} (y - \mu) \qquad (1)$$

where $\mu$ is the mean vector and $\Sigma$ is the covariance matrix of $y$. When $T^2 > 0$, it means that all the components in $y$ are correlated.
It has been proven that the $T^2$ statistic follows the $\chi^2$ distribution with $p$ degrees of freedom. Suppose $\alpha$ is the false alarm probability; then the upper control limit (UCL) of the $T^2$ statistic is $\chi^2_\alpha(p)$, and the lower control limit (LCL) is 0. Thus, the $T^2$ control chart can be established to monitor the correlation shift of $y$. However, this control chart cannot specify the cause(s) when the correlation shift is out of control.
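The situation described above, where each component stays within its own limits while the pair violates the usual co-variation, can be illustrated numerically. The following is a minimal sketch for the bivariate case only; the mean vector, covariance matrix, and observations are hypothetical values chosen for illustration and are not taken from the figures. For $p = 2$ the $\chi^2$ upper quantile has the closed form $-2\ln\alpha$, which avoids any statistical library.

```python
import math

def hotelling_t2_2d(y, mu, sigma):
    """T^2 = (y - mu)^T Sigma^{-1} (y - mu) for a 2-dimensional observation."""
    (s11, s12), (s21, s22) = sigma
    det = s11 * s22 - s12 * s21
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    d1, d2 = y[0] - mu[0], y[1] - mu[1]
    # Closed-form inverse of a 2x2 matrix applied to the deviation vector.
    return (s22 * d1 * d1 - (s12 + s21) * d1 * d2 + s11 * d2 * d2) / det

def chi2_ucl_2df(alpha):
    """Upper alpha quantile of the chi-square distribution with 2 degrees of freedom."""
    # With 2 degrees of freedom the survival function is exp(-x / 2),
    # so solving exp(-x / 2) = alpha gives x = -2 ln(alpha).
    return -2.0 * math.log(alpha)

# Hypothetical in-control parameters (illustrative only, not from the paper).
mu = (10.0, 20.0)
sigma = ((1.0, 0.8), (0.8, 1.0))
ucl = chi2_ucl_2df(0.025)

# Both components rise together: consistent with the positive correlation.
consistent = hotelling_t2_2d((11.0, 21.0), mu, sigma)
# y1 falls while y2 rises: each deviation is within 3 sigma, yet the pair is abnormal.
opposite = hotelling_t2_2d((9.0, 21.5), mu, sigma)

print(f"UCL = {ucl:.3f}, consistent T^2 = {consistent:.3f}, opposite T^2 = {opposite:.3f}")
```

In this sketch only the second observation exceeds the UCL, although neither of its components deviates from its mean by more than 1.5 standard deviations.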

Diagnosis Method Based on Component Combinations
In order to diagnose the cause(s) of an out-of-control correlation shift, one approach is to consider all the possible combinations of quality components when constructing their $T^2$ control charts. The cause(s) can then be specified by examining all the $T^2$ control charts when the correlation shift is out of control [7][8][9]. This approach is referred to as the component combination-based diagnosis (CCBD) method in this paper.
While this approach is theoretically sound and appealing, it has inherent deficiencies. For a $p$-dimensional process quality $y = (y_1, y_2, \ldots, y_p)^T$, the number of $T^2$ control charts using the CCBD method is

$$N = \sum_{k=2}^{p} C_p^k = 2^p - p - 1,$$

where $N$ is an exponential function of $p$ and the space complexity is $O(2^p)$. When $p$ increases, $N$ increases sharply, leading to a significant expansion of the scale of the diagnostic system, so this method is difficult to apply in practice.
On the other hand, information redundancy in the diagnosis results cannot be avoided in the CCBD method. For example, in a four-dimensional process quality $y = (y_1, y_2, y_3, y_4)^T$, suppose the abnormal correlation shift between $y_1$ and $y_2$ is the only cause of the correlation of $y$ being out of control. In the CCBD method, besides the $T^2$ control chart that monitors the correlation shift of $(y_1, y_2)$, the charts of the other combinations that contain $y_1$ and $y_2$, namely $(y_1, y_2, y_3)$ and $(y_1, y_2, y_4)$, are also out of control. This phenomenon is called the redundancy of diagnostic messages: when the correlation shift of one component combination is out of control, the correlations of all other component combinations that contain the abnormal combination are out of control as well. The redundancy in diagnosis results is a disturbance for process quality adjustment.
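Both deficiencies can be checked by direct enumeration. The sketch below is illustrative only: it assumes that the CCBD method builds one chart for every combination of at least two components, and the function names are hypothetical. It counts the charts for several values of $p$ and lists the combinations that signal redundantly when only $(y_1, y_2)$ is abnormal; the full combination $(y_1, y_2, y_3, y_4)$, which corresponds to the overall chart, also appears in that list.

```python
from itertools import combinations

def ccbd_combinations(p):
    """All component combinations of size >= 2 drawn from indices 1..p."""
    return [c for k in range(2, p + 1) for c in combinations(range(1, p + 1), k)]

def redundant_signals(p, abnormal_pair):
    """Combinations that strictly contain the abnormal pair."""
    pair = set(abnormal_pair)
    return [c for c in ccbd_combinations(p) if pair < set(c)]

# Chart counts grow as 2^p - p - 1.
counts = {p: len(ccbd_combinations(p)) for p in (2, 4, 5, 10)}
print(counts)

# Four-dimensional example: only (y1, y2) is truly abnormal.
extra = redundant_signals(4, (1, 2))
print(extra)
```

For $p = 5$ the enumeration gives 26 charts, which matches the size of the CCBD system used for comparison in the case study.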

Diagnosis Method Based on Principal Component Analysis
To reduce the scale of the correlation diagnostic system, scholars proposed using the principal component analysis (PCA) method [10][11][12][13]: the original process quality $y = (y_1, y_2, \ldots, y_p)^T$ is converted to $p$ independent principal components sorted in order of decreasing variance, denoted as $z = (z_1, z_2, \ldots, z_p)^T$. First, $p$ Shewhart control charts are constructed to monitor the normality of each $z_i$. Second, the first $n$ ($n < p$) principal components, whose cumulative variance exceeds a specified critical value, are grouped into component pairs, and $T^2$ control charts are constructed to monitor the normality of $(z_i, z_j)$ ($i, j \le n$, $i \ne j$). Finally, the normality of the remaining principal component group $(z_{n+1}, z_{n+2}, \ldots, z_p)$ is monitored by a $T^2$ control chart.
The number of control charts based on the PCA method is $N = p + C_n^2 + 1 = p + n(n-1)/2 + 1$. In general, $n$ increases as $p$ increases, so the space complexity is approximately $O(p^2)$, and the scale of the diagnostic system is still large. Furthermore, since $z_i$ generally has no engineering significance after conversion, the cause(s) of an out-of-control correlation shift of $y$ can only be specified by a comprehensive analysis of all the control chart results and by consulting the mapping relationship between $y$ and $z$, which increases the computation required for diagnosis. Meanwhile, the redundancy of diagnostic messages also cannot be avoided.

Diagnosis Method Based on Correlation and Orthogonal Decomposition
In 1995, Mason, Young, and Tracy [14][15][16] proposed to decompose the $T^2$ statistic into conditional and unconditional terms by regression analysis, in which the conditional and unconditional terms have equal weight in the decomposition results and are orthogonally independent of each other. Then, according to the statistical distribution of the conditional and unconditional terms, the corresponding control limits are established, so as to diagnose the specific cause(s) when the manufacturing process is abnormal.
As an example, in bivariate process quality $y = (y_1, y_2)^T$, the basic idea of the MYT orthogonal decomposition method [17][18][19][20] is to decompose the $T^2$ statistic into the following form:

$$T^2 = T_1^2 + T_{2\cdot 1}^2 \qquad (2)$$

where $T_1^2$, called the unconditional term, is related only to the quality component $y_1$ and is used to measure the contribution of the shift in $y_1$ to the $T^2$ statistic, and $T_{2\cdot 1}^2$, called the conditional term, whose value is related to the conditional probability $P(y_2 \mid y_1)$, is used to measure the contribution of the correlation between $y_1$ and $y_2$ to the $T^2$ statistic. Similar to Equation (2), the $T^2$ statistic can also be decomposed into another form as follows:

$$T^2 = T_2^2 + T_{1\cdot 2}^2 \qquad (3)$$

where the unconditional term $T_2^2$ is related only to the quality component $y_2$ and is used to measure the contribution of the shift in $y_2$ to the $T^2$ statistic; the conditional term $T_{1\cdot 2}^2$ depends on the conditional probability $P(y_1 \mid y_2)$ and is used to measure the contribution of the correlation between $y_2$ and $y_1$ to the $T^2$ statistic.
When the quality components $y_1$ and $y_2$ are correlated, the conditional probabilities satisfy $P(y_2 \mid y_1) \ne P(y_1 \mid y_2)$, and hence the conditional terms satisfy $T_{2\cdot 1}^2 \ne T_{1\cdot 2}^2$. Therefore, Equations (2) and (3) represent two different decomposition results of the $T^2$ statistic. In general, for a $p$-dimensional process quality $y = (y_1, y_2, \ldots, y_p)^T$, there are $p!$ decomposition results in total. As the number of quality components increases, analyzing every possible form of decomposition leads to a significant increase in calculations and a serious reduction in diagnostic efficiency. At the same time, the accuracy of the diagnostic results based on this method will be affected when there are obvious correlations between different quality components.
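The two bivariate decompositions can be verified numerically. The sketch below assumes the usual regression form of the MYT terms, in which the conditional term is the squared regression residual of one component on the other divided by the conditional variance; the parameters and the observation are hypothetical values chosen for illustration.

```python
def myt_terms(y, mu, sigma, first):
    """Return (unconditional, conditional) MYT terms for a bivariate observation.

    `first` is the index (0 or 1) of the component used for the unconditional term.
    """
    a, b = first, 1 - first
    da, db = y[a] - mu[a], y[b] - mu[b]
    s_aa, s_bb, s_ab = sigma[a][a], sigma[b][b], sigma[a][b]
    unconditional = da * da / s_aa
    # Regress component b on component a: residual and conditional variance.
    residual = db - (s_ab / s_aa) * da
    cond_var = s_bb - s_ab * s_ab / s_aa
    conditional = residual * residual / cond_var
    return unconditional, conditional

def t2_bivariate(y, mu, sigma):
    """Hotelling T^2 of a bivariate observation via the closed-form 2x2 inverse."""
    d1, d2 = y[0] - mu[0], y[1] - mu[1]
    det = sigma[0][0] * sigma[1][1] - sigma[0][1] * sigma[1][0]
    return (sigma[1][1] * d1 * d1 - 2 * sigma[0][1] * d1 * d2
            + sigma[0][0] * d2 * d2) / det

# Hypothetical parameters and observation (illustrative only).
mu = (10.0, 20.0)
sigma = ((1.0, 0.8), (0.8, 1.0))
y = (9.0, 21.5)

total = t2_bivariate(y, mu, sigma)
first_order = myt_terms(y, mu, sigma, 0)    # T_1^2 and T_{2.1}^2
second_order = myt_terms(y, mu, sigma, 1)   # T_2^2 and T_{1.2}^2
print(total, first_order, second_order)
```

Both orderings sum to the same $T^2$ value, while the two conditional terms differ, which is why every ordering has to be examined separately in this method.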

Intelligent Diagnosis Methods
In addition to the traditional diagnostic methods based on mathematical model analysis, intelligent diagnostic methods have in recent years been applied to multivariate process quality diagnosis as artificial intelligence technology has developed. Diagnostic methods based on artificial neural networks (ANNs) [21][22][23], Bayesian networks [24][25][26], support vector machines (SVMs) [27][28][29], etc. have been widely applied. Intelligent diagnostic methods can effectively reduce the scale of the diagnostic system and improve the diagnostic efficiency; however, these methods generally require a large amount of data to train the network's parameters, and the constructed network is generally suitable only for specific applications, so their generality is greatly restricted. Therefore, establishing a general and efficient method for multivariate process quality correlation diagnosis is a major problem to be solved in the field of quality management.

Sketch of the Algorithm
In this paper, a new correlation diagnosis method based on the optimal typical correlated component pair group is proposed. For the multivariate process quality $y = (y_1, y_2, \ldots, y_p)^T$, the correlation decomposition theorem is first proved by drawing on the idea of decomposing the $T^2$ statistic in the MYT orthogonal decomposition method; it decomposes the correlation of all the quality components into the correlations of all the component pairs, reducing the space complexity of the diagnostic system to $O(p^2)$. Then the monotonicity of the correlation between two quality components is used to prove the transitivity of the correlations of different component pairs, thus proving that the set formed by the correlations of all the component pairs is an equivalence relation. Next, drawing on the graphical representation of equivalence relations in set theory and the concept of the minimum weighted spanning tree in graph theory, a method for computing the optimal typical correlated component pair group is proposed. With reference to the component combination diagnostic algorithm and the idea of combining quality components in principal component analysis, the correlations of all the quality component pairs are represented by the correlations in this group, reducing the space complexity of the diagnostic system to $O(p)$.

Covariance Matrix Properties of Multivariate Process Quality
In the manufacturing process, the factors affecting product quality can be attributed to five aspects: man, machines, materials, methods, and environment (4M1E). On this basis, ISO 9000 supplemented another three factors: manufacturing software, auxiliary materials, and utilities. A change in any one of these factors has an impact on the final quality of the product, so product quality fluctuates in manufacturing. Tolerance theory is a direct proof of the fluctuation of product quality.
For the multivariate process quality $y = (y_1, y_2, \ldots, y_p)^T$, the covariance matrix is an important parameter to describe its correlation. Combined with the fluctuation of product quality in the manufacturing process, this paper first establishes three theorems describing the characteristics of the covariance matrix of multivariate process quality.
Theorem 1: In the covariance matrix $\Sigma$ of the multivariate process quality $y = (y_1, y_2, \ldots, y_p)^T$, every element is nonzero.
Proof. Suppose the mean vector of $y$ is $\mu = (\mu_1, \mu_2, \ldots, \mu_p)^T$. According to the definition of the covariance matrix, it is known that:

$$\Sigma = E[(y - \mu)(y - \mu)^T] = (\sigma_{ij})_{p \times p}, \quad \sigma_{ij} = E[(y_i - \mu_i)(y_j - \mu_j)] \qquad (4)$$

For any element in $\Sigma$, the sufficient and necessary condition for it to be 0 is as follows:

$$E[(y_i - \mu_i)(y_j - \mu_j)] = 0 \qquad (5)$$

According to the properties of mathematical expectation, Equation (5) implies that the quality component $y_i$ or $y_j$ is a constant in the manufacturing process. Obviously, this contradicts the viewpoint of the fluctuation of product quality; therefore, Equation (5) does not hold, i.e., every element in $\Sigma$ is nonzero. □

Theorem 2: The covariance matrix $\Sigma$ of the multivariate process quality $y = (y_1, y_2, \ldots, y_p)^T$ is a real symmetric positive definite matrix.
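Proof. According to Equation (4) on the definition of the covariance matrix:

$$\sigma_{ij} = E[(y_i - \mu_i)(y_j - \mu_j)] \qquad (6)$$

$$\sigma_{ji} = E[(y_j - \mu_j)(y_i - \mu_i)] \qquad (7)$$

Comparing Equations (6) and (7), the properties of mathematical expectation give the following:

$$\sigma_{ij} = \sigma_{ji} \qquad (8)$$

Equation (8) shows that $\Sigma$ is a symmetric matrix. Let $x = (x_1, x_2, \ldots, x_p)^T$ be any nonzero $p$-dimensional real vector and consider the quadratic form

$$x^T \Sigma x \qquad (9)$$

Bringing Equation (4) into (9), after simplification and consolidation, we obtain the following:

$$x^T \Sigma x = E\{[x^T (y - \mu)]^2\} \qquad (10)$$

Let $z = x^T (y - \mu)$. Bringing this into Equation (10), we obtain the following:

$$x^T \Sigma x = E(z^2) \qquad (11)$$

From the proof of Theorem 1, it is clear that, according to the viewpoint of the fluctuation of product quality, $z \ne 0$, i.e.,

$$x^T \Sigma x = E(z^2) > 0 \qquad (12)$$

Equation (12) shows that the covariance matrix $\Sigma$ of the multivariate process quality $y = (y_1, y_2, \ldots, y_p)^T$ is a real symmetric positive definite matrix. □

Theorem 3: The inverse matrix $\Sigma^{-1}$ of the covariance matrix $\Sigma$ of the multivariate process quality $y = (y_1, y_2, \ldots, y_p)^T$ is a real symmetric positive definite matrix.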
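Proof. First, prove the symmetry of $\Sigma^{-1}$. It follows from the symmetry of $\Sigma$:

$$\Sigma^T = \Sigma \qquad (13)$$

Inverting both ends of Equation (13), we obtain the following:

$$(\Sigma^{-1})^T = (\Sigma^T)^{-1} = \Sigma^{-1} \qquad (14)$$

The above equation shows that $\Sigma^{-1}$ is a symmetric matrix.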

Correlation Diagnosis Method Based on OTCCPG
In the CCBD method, the exponential relationship between $N$ and $p$ is the main reason this approach is difficult to apply. If the growth rate of $N$ with $p$ can be lowered by proper means, the great expansion of the diagnostic system scale as $p$ increases can be avoided to a certain extent, and the approach can then be applied in multivariate process quality management.

Correlation Decomposition
Theorem 4: In the multivariate process quality $y = (y_1, y_2, \ldots, y_p)^T$, the sufficient and necessary condition for the correlation of all components to exist is that any two components $y_i$ and $y_j$ are correlated.
Proof. First, the sufficiency of Theorem 4 is proved. If any two components $y_i$ and $y_j$ in $y$ are correlated, then $\sigma_{ij} \ne 0$. From Theorems 2 and 3, the covariance matrix $\Sigma$ and its inverse matrix $\Sigma^{-1}$ are real symmetric positive definite matrices. From the definition of the $T^2$ statistic in Equation (1), it is clear that for any sample data, its $T^2$ statistic is greater than 0, i.e., the correlation of all components exists.
The following proves the necessity of Theorem 4 by reductio ad absurdum. The existence of a correlation between all the components in $y$ implies that for any sample data, its $T^2$ statistic is greater than 0. From the definition of the $T^2$ statistic in Equation (1), there exists an inverse matrix of the covariance matrix $\Sigma$ of $y$, and the rank of $\Sigma$ is $p$:

$$R(\Sigma) = p \qquad (15)$$
Assume $y_k$ and $y_j$ in $y$ are uncorrelated, i.e., $\sigma_{kj} = 0$. By the definition of covariance:

$$\sigma_{kj} = E[(y_k - \mu_k)(y_j - \mu_j)] = 0 \qquad (16)$$

The sufficient and necessary condition for Equation (16) to hold is $y_k = \mu_k$ or $y_j = \mu_j$. Without loss of generality, let $y_k = \mu_k$. From the definition of covariance, we know that for any component $y_i$ ($i = 1, 2, \ldots, p$):

$$\sigma_{ki} = \sigma_{ik} = E[(y_k - \mu_k)(y_i - \mu_i)] = 0 \qquad (17)$$

Equation (17) shows that in the covariance matrix $\Sigma$ of $y$, the $k$th row and $k$th column are both 0, i.e., $R(\Sigma) \le p - 1$. This contradicts Equation (15), so the assumption is not valid, and the necessity of Theorem 4 is proved. □

Theorem 4 means that the correlation of all the quality components can be represented as correlations of component pairs, so the correlation diagnostic system only needs to monitor the correlation shifts of all the component pairs. In addition, a $T^2$ control chart to monitor the correlation shift of all the components should be added. The number of $T^2$ control charts is $N = C_p^2 + 1 = p(p-1)/2 + 1$, where $N$ is a power (quadratic) function of $p$. The space complexity of the diagnostic system is lowered to $O(p^2)$. Compared to the CCBD method, the growth rate of $N$ with $p$ has decreased significantly. Meanwhile, because the component pair is the minimum combination of components, information redundancy in diagnosis results can be avoided.

Transitivity of Component Pairs' Correlations
Although the functional relation between $N$ and $p$ is lowered to a power function by correlation decomposition, $N$ still increases rapidly as $p$ increases, so further measures should be adopted to reduce the scale of the diagnostic system on the basis of the above analysis.

Definition 1: In the multivariate process quality $y = (y_1, y_2, \ldots, y_p)^T$, $Y = \{(y_i, y_j) \mid i, j = 1, 2, \ldots, p,\ i \ne j\}$ is the component pair set, and $Y^*$ is another set that meets the following two properties: Any component in $y$ should appear at least once in $Y^*$.
Then $Y^*$ is defined as a typical correlated component pair group (TCCPG) of $y$. Based on the above definition, Theorem 5 is given below.
Theorem 5: In the multivariate process quality $y = (y_1, y_2, \ldots, y_p)^T$, suppose $Y^*$ is a TCCPG of $y$; then the correlation of any component pair $(y_i, y_j)$ ($i, j = 1, 2, \ldots, p$, $i \ne j$) can be represented by the correlations of the typical correlated component pairs in $Y^*$.
Proof. The proof of Theorem 5 is given below in two cases.
(1) $(y_i, y_j) \in Y^*$. It is obvious that Theorem 5 holds.
(2) $(y_i, y_j) \notin Y^*$. The analysis of Figures 1-3 shows that for any two components $y_j$ and $y_k$, their correlation means that with an increment in $y_j$, there is a proper and monotonically increasing increment in $y_k$; therefore, the correlation between $y_j$ and $y_k$ can be described by a function $y_k = f(y_j)$ that is monotonic in its domain. For any sample data, if the variations of $y_j$ and $y_k$ violate the monotonicity of this function or the magnitude of the change is beyond the reasonable range limited by this function, it means that the correlation between $y_j$ and $y_k$ is out of control.
When $(y_i, y_j) \notin Y^*$, from the second property in Definition 1, we know that there is a component pair chain $(y_i, y_k), (y_k, y_m), \ldots, (y_s, y_j)$ in $Y^*$. Suppose the correlations of these component pairs are denoted as $y_i = f_{ik}(y_k)$, $y_k = f_{km}(y_m)$, $\ldots$, $y_s = f_{sj}(y_j)$; then, from the properties of compound functions, the correlation between $y_i$ and $y_j$ can be written as $y_i = f_{ik}[f_{km}[\ldots f_{sj}(y_j) \ldots]]$, i.e., the correlation between $y_i$ and $y_j$ can be represented by the correlations of component pairs in $Y^*$, so Theorem 5 holds. □

Theorem 5 means that in multivariate process quality, the correlations of component pairs have the property of transitivity, i.e., any correlation between two different components can be represented by the correlations of component pairs in the TCCPG. In terms of the manufacturing process, if all the correlations of component pairs in the TCCPG are normal, then the rest of the correlations of component pairs are normal, too, and the correlation of all the quality components is under control. When the correlation of all the quality components is out of control, one or more correlations of component pairs in the TCCPG must be abnormal. If these abnormal correlations are adjusted to normal levels by proper means, the correlation of all the quality components is recovered to normal levels.
According to the above analysis, monitoring the correlations of all the component pairs can be reduced to monitoring those in the TCCPG. In addition, a control chart to monitor the correlation of all the quality components should be added to form the correlation diagnostic system.
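The component pair chain used in the proof can be found mechanically. The following is a minimal sketch that treats a pair group as an undirected graph and uses a breadth-first search to recover the chain linking two components; the function name is hypothetical, and the pair group is the one reported for the case study later in this paper, with components identified by their indices.

```python
from collections import deque

def pair_chain(pair_group, start, goal):
    """Return the chain of component pairs in `pair_group` linking start to goal."""
    neighbours = {}
    for a, b in pair_group:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for nxt in neighbours.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    if goal not in previous:
        return None  # the pair group does not connect the two components
    # Walk back from the goal to the start, then reverse the collected pairs.
    chain = []
    node = goal
    while previous[node] is not None:
        chain.append((previous[node], node))
        node = previous[node]
    return chain[::-1]

# Pair group Y* from the case study: (y1, y2), (y1, y3), (y2, y4), (y4, y5).
y_star = [(1, 2), (1, 3), (2, 4), (4, 5)]
chain = pair_chain(y_star, 3, 5)
print(chain)  # pairs whose correlations compose into that of (y3, y5)
```

Here the correlation of $(y_3, y_5)$, which is not in $Y^*$, is represented through the four pairs of the chain in the manner of the compound function above.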

Maximum Correlated Spanning Tree
In the above analysis, when using Theorem 5 to establish the diagnostic model of multivariate process quality correlations, the key problem is to find the TCCPG. Here, a method for finding the TCCPG is given on the basis of principles from set theory and graph theory.
Theorem 6: In the multivariate process quality $y = (y_1, y_2, \ldots, y_p)^T$, the set of binary correlations of all the quality component pairs is an equivalence relation.

Proof. In set theory, a binary relation with the properties of reflexivity, symmetry, and transitivity is defined as an equivalence relation. Obviously, any component $y_i$ in $y$ is correlated with itself, with a correlation coefficient of 1, so reflexivity holds. If components $y_i$ and $y_j$ are correlated, then $y_j$ and $y_i$ are also correlated, and the quantitative descriptions of their correlation, i.e., the correlation coefficients, are equal: $\rho_{ij} = \rho_{ji}$, so symmetry holds. According to Theorem 5, the binary correlations of all the component pairs have the property of transitivity. In summary, Theorem 6 holds. □

The set of equivalence relations can be characterized graphically [30][31][32]. For the multivariate process quality $y = (y_1, y_2, \ldots, y_p)^T$, the set of vertices $V = \{y_1, y_2, \ldots, y_p\}$ is built from all the quality components. If $y_i$ and $y_j$ are correlated, an undirected edge $e_{ij}$ is drawn to describe the binary correlation between $y_i$ and $y_j$, and an undirected graph $G$ representing the binary correlations between all the quality components is obtained. From Theorem 1, any two quality components in $y$ are correlated; therefore, the undirected graph $G$ contains $p(p-1)/2$ edges, i.e., $G$ is a complete undirected graph.
In the complete undirected graph $G$, the correlation between any two components $y_i$ and $y_j$ can be described by the connectivity between the vertices $y_i$ and $y_j$ in $V$. According to the basic properties of complete undirected graphs, a loop $L$ that contains all the vertices in $V$ can be found in $G$. If only one edge $e_{ij}$ in $L$ is deleted, the vertices in $V$ are still connected, and the correlation between $y_i$ and $y_j$ can be represented by the correlations of the remaining component pairs in $L - e_{ij}$. From graph theory [33,34], we know that $L - e_{ij}$ is a spanning tree of $G$.
From Definition 1, the component pair group consisting of two vertices of each edge in the spanning tree of G is a TCCPG of y.Therefore, the problem of solving the TCCPG of y can be converted into the problem of solving a spanning tree in the correlation graph G of y.
The spanning tree of $G$ is not unique in general. To find the most typical correlated component pair group, take the correlation coefficient matrix $P = (\rho_{ij})_{p \times p}$ of $y$ as a reference and set $\rho_{ij}$ as the weight of the edge $e_{ij}$ in $G$; a weighted correlation graph of $y$ is then obtained, which is denoted as $G^*$. The problem of finding the most typical correlated component pair group is thus converted into calculating the maximum weighted spanning tree of $G^*$. This spanning tree is named the maximum correlated spanning tree (MCST).
Considering that $G^*$ is a dense graph, the Prim algorithm [35][36][37] is taken as the reference. The maximum correlated spanning tree of $y$ can be calculated by the following steps:
(1) Let $U = \{y_1\}$, $V = \{y_1, y_2, \ldots, y_p\}$, and $E = \emptyset$;
(2) Let $e_{ij}$ denote the edge with the maximum absolute weight among the edges whose vertices satisfy $y_i \in U$ and $y_j \in V - U$;
(3) Add $y_j$ to $U$ and $e_{ij}$ to $E$;
(4) If $U \ne V$, go to step (2); otherwise, the algorithm ends.
When the algorithm has finished, the collection of vertex pairs of every edge in $E$ is a TCCPG of $y$, and the binary correlations of all the quality component pairs can be represented to the maximum extent by this component pair group, so it is called the optimal typical correlated component pair group (OTCCPG).
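The four steps above can be sketched as follows. This is a straightforward, unoptimized rendering of the Prim-style search (cubic in $p$ rather than the quadratic cost of an array-based Prim implementation), and the correlation matrix is a hypothetical example made up for illustration, not data from this paper.

```python
def max_correlation_spanning_tree(rho):
    """Prim-style search for the spanning tree maximizing |rho_ij| edge weights.

    `rho` is a symmetric p x p correlation matrix given as a list of lists.
    Returns the selected edges as 1-based component pairs (i, j).
    """
    p = len(rho)
    in_tree = {0}                      # step (1): U = {y_1}
    edges = []                         # step (1): E is empty
    while len(in_tree) < p:            # step (4): repeat until U = V
        # Step (2): strongest edge joining a vertex in U to a vertex in V - U.
        i, j = max(
            ((u, v) for u in in_tree for v in range(p) if v not in in_tree),
            key=lambda e: abs(rho[e[0]][e[1]]),
        )
        in_tree.add(j)                 # step (3): add y_j to U ...
        edges.append((i + 1, j + 1))   # ... and e_ij to E
    return edges

# Hypothetical correlation matrix of four components (illustrative values only).
rho = [
    [1.0, 0.9, -0.7, 0.2],
    [0.9, 1.0, 0.3, 0.6],
    [-0.7, 0.3, 1.0, 0.1],
    [0.2, 0.6, 0.1, 1.0],
]
otccpg = max_correlation_spanning_tree(rho)
print(otccpg)
```

Because step (2) compares absolute values, the strongly negative correlation between the first and third components is selected ahead of weaker positive ones.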

Correlation Diagnosis Method
There are $p - 1$ edges in a spanning tree of a complete undirected graph with $p$ vertices. Therefore, the number of component pairs in the OTCCPG is $p - 1$. $T^2$ control charts are established for every component pair in the OTCCPG to form the correlation diagnostic system; its space complexity is $O(p)$.
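The chart counts of the schemes discussed so far can be compared side by side. The sketch below simply evaluates the counting formulas; the CCBD count assumes one chart per combination of at least two components, the number of retained principal components $n$ is an input that has to be chosen by the user, and the function name is hypothetical.

```python
from math import comb

def chart_counts(p, n):
    """Number of control charts per diagnostic scheme for p components.

    `n` is the number of retained principal components (n < p) in the PCA scheme.
    """
    return {
        "CCBD": 2 ** p - p - 1,        # all combinations of size >= 2
        "PCA": p + comb(n, 2) + 1,     # p Shewhart charts, pair charts, remainder chart
        "pairwise": comb(p, 2) + 1,    # all pairs plus the overall chart
        "OTCCPG": (p - 1) + 1,         # spanning-tree pairs plus the overall chart
    }

for p, n in ((5, 3), (10, 5), (20, 8)):
    print(p, chart_counts(p, n))
```

For $p = 5$ this gives 26 charts for the CCBD method and 5 for the OTCCPG method, the two system sizes that appear in the case study.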
The above analysis assumes that the mean vector $\mu$, covariance matrix $\Sigma$, and correlation coefficient matrix $P$ of the manufacturing process are given. However, in many applications, these parameters are unknown. In this case, unbiased estimates of the manufacturing process parameters can be calculated from a set of sample data $S_i$ ($i = 1, 2, \ldots, n$) collected while the process is in a stable state.
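A minimal sketch of this estimation step is given below. It assumes the usual estimators, namely the sample mean, the sample covariance normalized by $n - 1$, and the Pearson correlation coefficient derived from it; the sample values are hypothetical and serve only to exercise the function.

```python
from math import sqrt

def estimate_parameters(samples):
    """Sample mean vector, unbiased covariance matrix, and correlation matrix."""
    n, p = len(samples), len(samples[0])
    if n < 2:
        raise ValueError("at least two samples are needed")
    mean = [sum(s[i] for s in samples) / n for i in range(p)]
    # Unbiased covariance estimate: divide by n - 1 rather than n.
    cov = [
        [sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]
    # Correlation coefficients obtained by normalizing the covariances.
    corr = [
        [cov[i][j] / sqrt(cov[i][i] * cov[j][j]) for j in range(p)]
        for i in range(p)
    ]
    return mean, cov, corr

# Hypothetical in-control samples of a bivariate process (illustrative only).
samples = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.0)]
mean, cov, corr = estimate_parameters(samples)
print(mean, cov, corr)
```

The resulting correlation matrix is what the spanning-tree step takes as edge weights, so the estimates should be computed only from data collected while the process is stable.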
Combining the above analyses, for the multivariate process quality $y = (y_1, y_2, \ldots, y_p)^T$, the correlation diagnostic system can be formed by the following steps:
(1) Collect sufficient quality data $S_i$ ($i = 1, 2, \ldots, n$) while the manufacturing process is in a stable state;
(2) Calculate the manufacturing parameters according to Equations (18)-(21);
(3) Take the coefficient matrix $P$ as a reference and draw the weighted correlation graph $G^*$ of $y$;
(4) Find the MCST of $G^*$;
(5) Obtain the OTCCPG of $y$ according to the MCST;
(6) For every component pair $(y_i, y_j)$ in the OTCCPG, establish its $T^2$ control chart, $K_{ij}$;
(7) Establish the $T^2$ control chart $K$ to monitor the correlation shift of all the quality components.
Processes 2024, 12, 652
The number of $T^2$ control charts in this diagnostic system is $p$. When $K$ is normal, the correlation of all the quality components is under control. When an out-of-control signal is generated in $K$, the root cause(s) can be specified by examining all the rest of the $T^2$ control charts.
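The examination of the pair charts $K_{ij}$ can be sketched as follows. This is an illustrative outline only: it covers the pair charts and omits the overall chart $K$, it uses the closed-form $\chi^2$ quantile $-2\ln\alpha$ that holds for two degrees of freedom, and the process parameters, pair group, and observations are hypothetical values rather than data from this paper.

```python
import math

def pair_t2(y, mu, cov, i, j):
    """T^2 statistic of the component pair (y_i, y_j), using 0-based indices."""
    di, dj = y[i] - mu[i], y[j] - mu[j]
    sii, sjj, sij = cov[i][i], cov[j][j], cov[i][j]
    det = sii * sjj - sij * sij
    return (sjj * di * di - 2 * sij * di * dj + sii * dj * dj) / det

def diagnose(y, mu, cov, pair_group, alpha=0.025):
    """Return the pairs in `pair_group` whose T^2 chart signals for observation y."""
    ucl = -2.0 * math.log(alpha)   # chi-square upper quantile for 2 degrees of freedom
    return [
        (i, j) for i, j in pair_group
        if pair_t2(y, mu, cov, i - 1, j - 1) > ucl
    ]

# Hypothetical three-component process and pair group (illustrative only).
mu = (0.0, 0.0, 0.0)
cov = [[1.0, 0.8, 0.6], [0.8, 1.0, 0.5], [0.6, 0.5, 1.0]]
pair_group = [(1, 2), (1, 3)]

abnormal = diagnose((1.0, -1.0, 0.5), mu, cov, pair_group)   # y1 and y2 move oppositely
normal = diagnose((0.5, 0.4, 0.3), mu, cov, pair_group)
print(abnormal, normal)
```

In practice the pair charts would be consulted only after the overall chart $K$ has signaled, as described above.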

A Case Study
The following case shows how the diagnostic system works in practice. The automotive purifier coating production line is mainly composed of three key steps: pre-treatment, coating, and heat curing. Pre-treatment is the surface treatment of the parts to be coated to remove oil and rust; the coating process needs to ensure that the coating coverage is uniform and the thickness is consistent, avoiding defects such as a coating that is too thin or too thick, or coating leakage; the coating material is then cured by heating. In the coating process, in order to ensure the uniformity of the coating thickness, the pressure, mixing temperature, coating temperature, slurry mass, and pH value should be controlled; these are denoted as $y = (y_1, y_2, y_3, y_4, y_5)^T$. Twenty quality data points are collected, as shown in Table 1. In order to analyze the reliability of the sample data, the false alarm probability for each quality component is taken as $\alpha = 0.0027$ according to the $3\sigma$ principle of the Shewhart control chart, and the false alarm probability for the correlation shift is taken as $\alpha_y = 0.025$ with reference to Bonferroni's inequality. Shewhart control charts for the five quality components and a $T^2$ control chart monitoring the correlation shift of the five quality components are then made, as shown in Figures 4-9.

Figure 7. Shewhart control chart of $y_4$ to monitor the mean shift of $y_4$ in Table 1.
Figure 8. Shewhart control chart of $y_5$ to monitor the mean shift of $y_5$ in Table 1.
Figure 9. $T^2$ control chart of $y$ to monitor the correlation shift of $(y_1, y_2, y_3, y_4, y_5)$ in Table 1.
Figures 4-9 show that the process is in a stable state; therefore, the data in Table 1 can be used to estimate the parameters of the coating process using Equations (18)-(21).
Take the coefficient matrix P as a reference; the weighted correlation graph G* is shown in Figure 10, and the MCST of G* can then be found, as shown in Figure 11. According to Figure 11, the OTCCPG of y is Y* = {(y1, y2), (y1, y3), (y2, y4), (y4, y5)}. For every component pair in Y*, suppose the false alarm probability is α = 0.025. K12 denotes the T^2 control chart that monitors the correlation shift between y1 and y2, and K13 denotes the T^2 control chart that monitors the correlation shift between y1 and y3; K24 and K45 are defined in the same way. Finally, the T^2 control chart K, which monitors the correlation shift of all the quality components, is established. These five T^2 control charts constitute the correlation diagnostic system for y.
This diagnostic system can now be used to monitor and diagnose the correlation shift of quality components. In subsequent manufacturing, four quality data points are collected at different moments, as shown in Table 2. These four points are plotted on chart K, as shown in Figure 12. At the last three points, the correlation of all the components is out of control.
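The MCST step used above to obtain Y* can be sketched in a few lines. This is a minimal illustration rather than the paper's implementation: it assumes the edge weights of G* are the absolute correlation coefficients and uses Kruskal's algorithm with the heaviest edges taken first. The function name and the matrix `P_demo` are hypothetical; its values are made up for illustration (they are not the estimated matrix P) and were chosen so that the resulting tree matches the Y* reported in the text.

```python
from itertools import combinations

def max_correlation_spanning_tree(corr):
    """Kruskal's algorithm on |corr[i][j]|, taking the heaviest edges first."""
    p = len(corr)
    parent = list(range(p))  # union-find forest over component indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    edges = sorted(combinations(range(p), 2),
                   key=lambda e: abs(corr[e[0]][e[1]]), reverse=True)
    tree = []
    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:              # edge joins two different subtrees: keep it
            parent[ri] = rj
            tree.append((i, j))
            if len(tree) == p - 1:
                break
    return sorted(tree)

# Hypothetical correlation coefficients (illustrative only, not the paper's P).
P_demo = [
    [1.00, 0.90, 0.85, 0.40, 0.30],
    [0.90, 1.00, 0.60, 0.80, 0.35],
    [0.85, 0.60, 1.00, 0.45, 0.25],
    [0.40, 0.80, 0.45, 1.00, 0.75],
    [0.30, 0.35, 0.25, 0.75, 1.00],
]
pairs = [(f"y{i + 1}", f"y{j + 1}")
         for i, j in max_correlation_spanning_tree(P_demo)]
print(pairs)
```

For p components the tree always has p - 1 edges, which is why the diagnostic system for five components needs only four pair charts in addition to K.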
To identify the root cause(s) of the last three out-of-control points, check the remaining T^2 control charts (K12, K13, K24, K45), as shown in Figures 13-16. These charts show that the second point is out of control because the correlations of (y1, y2) and (y1, y3) are both abnormal. The abnormal correlation of (y1, y3) causes the third point to be out of control, and the abnormal correlation of (y4, y5) causes the last point to be out of control.
Figure 14. T^2 control chart K13 to monitor the correlation shift of (y1, y3) in Table 2.
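How a single pair chart such as K13 can flag a correlation shift is illustrated by the sketch below. It rests on assumptions that are not taken from the paper: the process parameters are treated as known, so the pair statistic is compared against a chi-square limit with 2 degrees of freedom (for which the upper-α quantile has the closed form -2 ln α), and the mean, covariance, and test points are made-up values. The control limit derived in the paper's own equations may differ.

```python
import math

def t2_pair(x, mu, cov):
    """Quadratic form (x - mu)^T cov^{-1} (x - mu) for a two-component pair."""
    d0, d1 = x[0] - mu[0], x[1] - mu[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Closed-form 2x2 inverse applied to the deviation vector.
    return (d * d0 * d0 - (b + c) * d0 * d1 + a * d1 * d1) / det

def chi2_limit_2dof(alpha):
    """Upper-alpha quantile of a chi-square distribution with 2 degrees of freedom."""
    return -2.0 * math.log(alpha)

# Hypothetical standardized pair with correlation 0.8 (illustrative values).
mu_demo = (0.0, 0.0)
cov_demo = ((1.0, 0.8), (0.8, 1.0))
ucl = chi2_limit_2dof(0.025)  # alpha = 0.025 as in the text

for point in [(1.0, 1.0), (1.0, -1.0)]:
    t2 = t2_pair(point, mu_demo, cov_demo)
    status = "out of control" if t2 > ucl else "in control"
    print(point, round(t2, 3), status)
```

Both test points lie one standard deviation from the mean in each component, so univariate mean charts would not react to either; only the second point, which contradicts the positive correlation, exceeds the limit.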
To validate the above conclusions, the 26 T^2 control charts of the diagnostic system based on the CCBD method are constructed. The diagnostic results of the two systems are shown in Table 3. In Table 3, the conclusions differ at every point except the first, and this difference can be explained. At the second point, because the correlations of (y1, y2) and (y1, y3) are abnormal, the correlations of all other component combinations that contain (y1, y2) or (y1, y3) must also be abnormal, so they are redundant diagnostic results. If the redundant diagnostic results are removed, the conclusions of the two diagnostic systems are completely consistent at this point. A similar analysis explains the differences at the other points, as shown in Table 4.
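The redundancy argument for the second point can be made concrete by enumeration. The sketch below takes the text's premise at face value, namely that every component combination containing an abnormal pair is itself flagged, and counts how many of the 26 combinations that implies. It is not a reproduction of Table 3, and the function names are hypothetical.

```python
from itertools import combinations

def ccbd_combinations(p):
    """All component combinations of size >= 2 among components 1..p."""
    comps = range(1, p + 1)
    return [c for k in range(2, p + 1) for c in combinations(comps, k)]

def flagged_by_pairs(combos, abnormal_pairs):
    """Combinations that contain at least one abnormal component pair."""
    return [c for c in combos
            if any(set(pair) <= set(c) for pair in abnormal_pairs)]

combos = ccbd_combinations(5)
abnormal = [(1, 2), (1, 3)]  # pairs reported abnormal at the second test point
flagged = flagged_by_pairs(combos, abnormal)
redundant = [c for c in flagged if c not in abnormal]
print(len(combos), len(flagged), len(redundant))
```

Under this premise, two abnormal pairs would trigger twelve combination-level signals, ten of which add no diagnostic information beyond the pairs themselves.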
The above comparison shows that the OTCCPG-based correlation diagnosis method for multivariate process quality not only identifies the cause(s) of an out-of-control correlation shift but also avoids redundant diagnostic results, so it provides more accurate diagnostic information for subsequent process quality adjustment.

Conclusions and Discussion
This paper proposes an OTCCPG-based correlation diagnosis method for multivariate process quality management. The method was developed in several steps. First, based on correlation decomposition, the correlation of all the quality components is decomposed into the correlations of the quality component pairs. Then, according to the transitivity of the correlations of component pairs, the correlation of any component pair can be represented by those in the TCCPG. Finally, the algorithm for the OTCCPG based on the MCST is proposed, and the correlation diagnostic system based on the OTCCPG is illustrated. Compared with the existing diagnostic methods, the method proposed in this paper has the following advantages:
(1) The diagnosis is more efficient.
The space complexity of the multivariate process quality correlation diagnostic method based on OTCCPG is O(p), while the space complexities of the diagnostic algorithms based on the CCBD method, the principal component analysis method, and the orthogonal decomposition of the T^2 statistic are O(2^p), O(p^2), and O(p!), respectively. Therefore, the method proposed in this paper has higher diagnostic efficiency.
(2) The diagnostic results are more accurate.
The OTCCPG-based multivariate process quality correlation diagnosis method takes the correlation of a component pair as the diagnostic unit. Since a component pair is the smallest combination of quality components, the redundant diagnostic information produced by diagnostic algorithms based on the CCBD method, the principal component analysis method, and the orthogonal decomposition of the T^2 statistic is avoided, and more accurate diagnostic results are provided for manufacturing processes.
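The gap between the O(p) and O(2^p) space complexities in advantage (1) can be made tangible by counting control charts. The sketch below assumes that the OTCCPG system keeps one chart per tree edge plus the overall chart K, and that the CCBD system keeps one chart per combination of at least two components; both counting formulas are inferred from the five and 26 charts reported for p = 5 and are not stated in this form in the text.

```python
def otccpg_chart_count(p):
    # p - 1 pair charts (edges of a spanning tree on p components) plus chart K
    return (p - 1) + 1

def ccbd_chart_count(p):
    # one chart per combination of at least two components: 2^p - p - 1
    return 2 ** p - p - 1

for p in (5, 10, 20):
    print(p, otccpg_chart_count(p), ccbd_chart_count(p))
```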
Compared with diagnostic methods based on artificial intelligence technology, the diagnostic method proposed in this paper rests on rigorous mathematical analysis, which avoids a defect of intelligent diagnostic methods: their network structure and parameters are tailored to specific applications. Therefore, the proposed method can be used as a general theoretical model for multivariate process quality correlation diagnosis.
The multivariate process quality correlation diagnostic method proposed in this paper assumes that the manufacturing process parameters are known or can be estimated accurately from sufficient sample data. For the manufacturing of large-batch products, accurate estimates of the process parameters are feasible because sufficient sample data are available. However, for flexible manufacturing systems with many product varieties and small batches, the method proposed in this paper cannot be applied directly because few samples are available. Considering the similarity of different batches of products in a flexible manufacturing system in terms of structure and process, the change of process parameters in flexible manufacturing should follow certain regularities. Therefore, studying the changing laws of process parameters in the flexible manufacturing system and adapting the multivariate

Figure 3. T^2 control chart of y.


Figure 4. Shewhart control chart of y1 to monitor the mean shift of y1 in Table 1.
Figure 5. Shewhart control chart of y2 to monitor the mean shift of y2 in Table 1.
Figure 6. Shewhart control chart of y3 to monitor the mean shift of y3 in Table 1.
Figure 7. Shewhart control chart of y4 to monitor the mean shift of y4 in Table 1.

Parameters of the coating process estimated from the data in Table 1 using Equations (18)-(21): µ = (4143, 41.8, 39.5, 9.2, 2.11)

Table 1. Collected sample data.

Processes 2024, 12, x FOR PEER REVIEW 12 of 19

Table 2. Test data collected in subsequent manufacturing.

Figure 12. T^2 control chart K to monitor the correlation shift of (y1, y2, y3, y4, y5) in Table 2.

Table 3. Comparison of the two diagnosis results.

Table 4. Comparison of the two diagnostic systems after redundant diagnostic results were removed.
