A Deep Regression Model with Low-Dimensional Feature Extraction for Multi-Parameter Manufacturing Quality Prediction

Manufacturing quality prediction can be used to design better parameters at an earlier production stage. However, in complex manufacturing processes, prediction performance is affected by multi-parameter inputs. To address this issue, a deep regression framework based on manifold learning (MDRN) is proposed in this paper. The multi-parameter inputs (i.e., high-dimensional information) were first analyzed using manifold learning (ML), an effective nonlinear technique for low-dimensional feature extraction that can enhance the representation of multi-parameter inputs and reduce the computational burden. The features obtained through ML were then learned by a deep learning (DL) architecture, which can learn sufficient features of the pattern between manufacturing quality and the low-dimensional information in an unsupervised framework and has been proven effective in many fields. Finally, the learned features were input into the regression network, and manufacturing quality predictions were made. One type (two cases) of machinery parts manufacturing system was investigated to evaluate the performance of the proposed MDRN against three comparison models. The experiments showed that the MDRN outperformed all the peer methods in terms of mean absolute percentage error, root-mean-square error, and threshold statistics. Based on these results, we conclude that integrating the ML technique for dimension reduction and the DL technique for feature extraction can improve multi-parameter manufacturing quality predictions.


Introduction
High manufacturing quality grants manufacturers a competitive edge. Modern manufacturing processes exhibit new characteristics, such as multi-stage and multi-parameter inputs, both of which influence quality. High-quality products tend to stem from the design of appropriate manufacturing parameters. Hence, designing favorable parameters hinges on predicting manufacturing quality at an early production stage, which provides a reference for producing high-quality products at lower costs.
Several approaches and methods have been adopted to predict manufacturing quality. Statistical quality control [1], a traditional method, has been widely used to assess the quality and performance of manufacturing processes. Based on this method, other techniques have been developed, e.g., linear regression [2,3], nonlinear regression [4], inference learning [5], fuzzy theory [6], and graph theory [7]. These approaches have successfully been applied to manufacturing quality prediction, but only in situations in which the factors (e.g., materials, equipment, and technological parameters) maintain [...]
[...] x_i and x_j is closer than the radius ε, or x_i is one of the K-nearest neighbors of x_j, x_i and x_j can be regarded as neighbors.
Step 2. Compute the shortest distances. Define the graph, G, over all data points by connecting neighboring points with edges of length d(x_i, x_j). If x_i and x_j are neighbors, d_G(x_i, x_j) = d(x_i, x_j); otherwise, d_G(x_i, x_j) = ∞. Evaluate the geodesic distances, D_G, between all pairs of data points by computing their shortest path distances, d_G(x_i, x_j), using Floyd's algorithm [37].
Step 3. Construct the m-dimensional embedding. This step is realized through the classical multidimensional scaling method, as follows: Let λ_p be the p-th eigenvalue (in decreasing order) of the matrix τ(D_G) = −HSH/2 (S is the matrix of squared distances, S_ij = [d_G(x_i, x_j)]^2 [...]
Based on the description above, the radius, ε, or the number of neighbors, K, is the only parameter in the Isomap algorithm. In this paper, the K-Isomap was applied for dimension reduction. To our knowledge, there is no general theory for determining K (a commonly used approach is empirical selection). In addition, the intrinsic low dimension, m, was specified by the maximum likelihood estimator (MLE) [38].
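The neighborhood and shortest-path steps of the K-Isomap can be illustrated with a minimal sketch. It assumes Euclidean distance for d(x_i, x_j) and treats two points as neighbors when either one is among the other's K nearest; the function name knn_geodesic_distances is hypothetical, and the embedding step (classical multidimensional scaling) is not included.

```python
import math


def knn_geodesic_distances(points, k):
    """K-nearest-neighbor graph followed by Floyd's all-pairs shortest paths."""
    n = len(points)
    # Pairwise Euclidean distances d(x_i, x_j) (assumed metric).
    d = [[math.dist(p, q) for q in points] for p in points]

    # Neighborhood graph: d_G = d for neighbors, infinity otherwise.
    d_g = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        nearest = sorted(others, key=lambda j: d[i][j])[:k]
        for j in nearest:
            d_g[i][j] = d[i][j]
            d_g[j][i] = d[i][j]  # keep the graph symmetric

    # Floyd's algorithm: relax every pair (i, j) through each intermediate point.
    for via in range(n):
        for i in range(n):
            for j in range(n):
                through = d_g[i][via] + d_g[via][j]
                if through < d_g[i][j]:
                    d_g[i][j] = through
    return d_g


# Three points on a right angle: the geodesic from the first to the last
# point follows the two unit edges instead of the straight diagonal.
corner = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
geodesic = knn_geodesic_distances(corner, k=1)
```

In this example, geodesic[0][2] is 2.0, whereas the straight-line distance between the same two points is about 1.414, which is the distinction between geodesic and Euclidean distance that Isomap relies on.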

Deep Regression Network for Manufacturing Quality Prediction
After obtaining the low-dimensional information (Y), the DL technique was adopted to learn the essential features of the pattern between the low-dimensional information and the manufacturing quality.
As shown in Figure 1, the DNN is a stack of simple networks (an autoencoder (AE) in this paper [39], named AE-based DRN) built in the following three steps [40]: (i) from the lowest to the top layer (layer 1 to layer l, from left to right), generative unsupervised learning is performed layer-wise on the AE; (ii) from the top to the lowest layer (layer l to layer 1), fine-tuning by a supervised learning method (back propagation algorithm) is used to tweak the parameter sets (w, b); and (iii) from the hidden (top) layer to the output layer, a regression network is formed using the pre-trained parameter sets (w, b).
As Figure 1 illustrates, the AE model operates as follows. The purpose of the AE is to reconstruct the inputs h^{l−1} (h^0 = Y) into new representations, R = [r_1, r_2, …, r_m], with a minimum reconstruction error. To this end, the AE-based DRN is operated step by step through the encoder, f_e(·) (feature extraction function), and the decoder, f_d(·) (reconstruction function), until the optimal parameter sets (w, b) are achieved based on a minimal loss function (Equation (3)). Here, Sigm(·) denotes the sigmoid activation function used in the encoder and decoder.
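A single AE layer can be sketched as follows. This is only an illustration under stated assumptions: the encoder and decoder are taken to be affine maps followed by the sigmoid, the loss is taken to be half the squared reconstruction error, and the helper names (affine, init_layer) and the random initialization are hypothetical. Only the forward pass and the loss are shown, not the training loop.

```python
import math
import random


def sigm(z):
    """Sigmoid activation, Sigm(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, bias, x):
    """Compute w x + b for a weight matrix stored as a list of rows."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(weights, bias)]


def encode(w, b, h_prev):
    """Encoder f_e: map the layer input h^{l-1} to hidden features h^l."""
    return [sigm(z) for z in affine(w, b, h_prev)]


def decode(w_dec, b_dec, h):
    """Decoder f_d: reconstruct the layer input from the hidden features."""
    return [sigm(z) for z in affine(w_dec, b_dec, h)]


def reconstruction_loss(h_prev, r):
    """Assumed loss: half the squared error between input and reconstruction."""
    return 0.5 * sum((x - y) ** 2 for x, y in zip(h_prev, r))


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases (hypothetical initialization)."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out


rng = random.Random(0)
y = [0.2, 0.5, 0.9]                     # one low-dimensional sample, m = 3
w, b = init_layer(30, 3, rng)           # encoder: 3 inputs -> 30 hidden nodes
w_dec, b_dec = init_layer(3, 30, rng)   # decoder: 30 hidden nodes -> 3 outputs
h = encode(w, b, y)
r = decode(w_dec, b_dec, h)
loss = reconstruction_loss(y, r)
```

In layer-wise pre-training, the parameters of each layer would be adjusted to reduce this loss, and the hidden features h would then serve as the input to the next layer.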

Overview of the Proposed MDRN Approach
Having addressed the constituents separately, the present approach for the manufacturing quality prediction can be summarized as follows, and is illustrated by Figure 2.
Step 1. Collect multi-parameter X and the corresponding quality from each manufacturing batch.
Step 2. Perform the K-Isomap algorithm and get the low-dimensional information Y.
Step 3. Construct the DRN model and get the quality prediction. (a) Construct the AE-based deep framework for the feature learning of Y, and get the parameter sets (w, b).
(b) Initialize the regression network with the parameter sets (w, b) pre-trained by (a), and model the relationships between Y and the corresponding quality.

Dataset
The data were collected from two manufacturing lines for the same product (machinery parts). The data were available from a competition about manufacturing quality control, which is organized by the Alibaba Company [41]. The data have the same technique parameters (i.e., 19 process parameters, shown in Table 1) with different settings; thus, the quality index (one key-quality index with range [0, 1], shown in Figure 3) exhibits diversity in different batches. Case 1 includes 1000 batches, with the total sample being (19 + 1) × 1000, and Case 2 includes 2000 batches, with the total sample being (19 + 1) × 2000. These data were divided into two categories, 90% for training and 10% for testing. Note that all the data were desensitized.

Model Development
In this subsection, the ingredients of the proposed model, that is, the K-Isomap and the DRN for the real manufacturing system are defined in detail. Note that all of the data are normalized into [0, 1]. Then, the experimental method is applied to construct the proposed MDRN model. The details of the model developments are listed in Table 2.
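The normalization into [0, 1] can be illustrated with a short sketch. It assumes column-wise min-max scaling, which is a common choice but is not confirmed by the text; the function name min_max_normalize is hypothetical.

```python
def min_max_normalize(column):
    """Scale one parameter column into [0, 1] (assumed min-max scaling)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


scaled = min_max_normalize([2.0, 4.0, 6.0])
```

Here scaled is [0.0, 0.5, 1.0]. In practice, the minimum and maximum would be taken from the training batches and reused for the testing batches.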

Performance Criteria
Three criteria are employed to assess the forecasting performances, namely: mean absolute percentage error (MAPE; %), root-mean-square error (RMSE; dimensionless value), and threshold statistics (TS; %). The definitions of the three criteria are as follows:
$$\mathrm{MAPE} = \frac{1}{B}\sum_{i=1}^{B}\left|\frac{ob_i - pr_i}{ob_i}\right| \times 100\%$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{B}\sum_{i=1}^{B}\left(ob_i - pr_i\right)^2}$$
$$\mathrm{TS}_a = \frac{n_a}{B} \times 100\%$$
where B is the length of the prediction; ob_i and pr_i represent the i-th observation and prediction, respectively; and n_a is the number of predictions whose relative error is less than a%. In this paper, TS_a is calculated for three levels: 1%, 5%, and 10%.
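The three criteria can be computed as in the sketch below, which assumes the standard definitions: MAPE as the mean relative absolute error in percent, RMSE as the square root of the mean squared error, and TS_a as the percentage of predictions whose relative error is below a%. The function names are hypothetical, and observations are assumed to be nonzero.

```python
import math


def mape(ob, pr):
    """Mean absolute percentage error (%); observations must be nonzero."""
    return 100.0 * sum(abs(o - p) / abs(o) for o, p in zip(ob, pr)) / len(ob)


def rmse(ob, pr):
    """Root-mean-square error, in the units of the quality index."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(ob, pr)) / len(ob))


def threshold_statistic(ob, pr, a):
    """TS_a (%): share of predictions with relative error below a percent."""
    n_a = sum(1 for o, p in zip(ob, pr) if 100.0 * abs(o - p) / abs(o) < a)
    return 100.0 * n_a / len(ob)


# Toy data with relative errors of 0%, 2%, 8%, and 20%.
observed = [1.0, 2.0, 4.0, 5.0]
predicted = [1.0, 2.04, 4.32, 4.0]
ts_levels = {a: threshold_statistic(observed, predicted, a) for a in (1, 5, 10)}
```

For this toy data, TS_1, TS_5, and TS_10 are 25%, 50%, and 75%, respectively, and the MAPE is approximately 7.5%.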

Proposed MDRN Results
First, Case 1 is taken as an example to introduce the modeling process in detail. As mentioned in the Methodologies section, the intrinsic dimension of the multi-parameter inputs in the manufacturing system (Table 1) is first evaluated by the MLE, and the result is 3, i.e., m = 3. Then, the model development experiments are implemented based on Table 2, and the performance (MAPE) of the proposed MDRN model is shown in Figure 4.
For the Isomap algorithm, the statistics are summarized as follows: for K = 8, the MAPE ranges between 2.444% and 2.639% (mean value 2.560%); for K = 10, between 2.345% and 2.686% (mean value 2.513%); for K = 12, between 1.943% and 2.624% (mean value 2.365%); and for K = 14, between 2.183% and 2.671% (mean value 2.429%). These results indicate that the prediction performance improves as K increases up to 12, but they also suggest that a larger value of K may lead to performance degradation. Therefore, K = 12 is an appropriate selection in this paper.
For the DRN model shown in Figure 4, the prediction performance of the deep structures is better (slightly or significantly) than that of the "shallow" framework (one hidden layer), except for a few special samples. In addition, the best deep structure differs with K: when K = 8, the best deep structure is 50 × 50 with a MAPE of 2.444% (hidden layer l = 2, 50 nodes in each hidden layer); when K = 10, the best deep structure is 40 × 40 with 2.345% (hidden layer l = 2, 40 nodes in each hidden layer); when K = 12, the best deep structure is 30 × 30 with 1.943% (hidden layer l = 2, 30 nodes in each hidden layer); and when K = 14, the best deep structure is 10 × 10 with 2.183% (hidden layer l = 2, 10 nodes in each hidden layer). Therefore, the deep structure 30 × 30 with K = 12 was chosen as the optimal pattern for Case 1 prediction (marked as 3-30-30-1), and the results and residual analysis are plotted in Figure 5.
As Figure 5a shows, the prediction trends are, to some extent, in accordance with the measurement trends, but there are also some significant differences between the two. As shown in Figure 5b, the residual errors fluctuate within the range of [−0.1, 0.1] during testing. Only five predictions are flagged as outliers (marked with triangles), because the confidence intervals around their residual errors do not contain zero. These five residual errors, which fall outside the 95% confidence interval and are attributed to poor fitting, account for 5% of the testing data. Furthermore, the quantitative evaluation results, MAPE = 1.943% and RMSE = 0.022, also indicate a high accuracy. Hence, the testing process as a whole is successful, and the results are acceptable.
Following the modeling process of Case 1, the testing results of Case 2 are plotted in Figure 6. Note that the optimal MDRN model for Case 2 is constructed with m = 3 and K = 14, and has a deep structure of 25 × 25 (hidden layer l = 2, with 25 nodes in each hidden layer), i.e., 3-25-25-1. The results shown in Figure 6 can be summarized as follows: (1) The intrinsic dimension, m, is related to the data features, not the length of the data. The calculated result, m = 3, is the same as in Case 1, because the same type of manufacturing parameters (x_1 to x_19) is used. (2) Because of its deep feature learning capacity, the DRN with two hidden layers achieves better prediction results (MAPE = 1.861% and RMSE = 0.023) than the model with a single hidden layer. (3) There are eleven outliers (accounting for 5.5% of the testing data) in the residual error analysis. This is close to the 5% of outliers in the Case 1 prediction, which suggests that the MDRN approach has a stable ability for feature learning and regression.
Table 3 summarizes the quantitative results in the training and the testing processes for the robustness analysis. From Table 3, one can find that the performances in the training and testing processes are similar, indicating that the proposed model has generalization ability and robustness. This is attributed to dropout, which weakens the co-adaptation between neurons (that is, only part of the network's parameters is updated in each iteration, because different neurons are dropped each time).
Table 3. Quantitative results in training and testing process using the proposed model.
According to the quantitative and qualitative analysis above, the proposed MDRN model, which combines the Isomap for multi-parameter reduction and the DRN for learning the relationship between the low-dimensional information and manufacturing quality, predicts the manufacturing quality within an acceptable margin of error. In addition, the performance of the model with three hidden layers is not better than that of the model with two hidden layers. This differs from the common expectation in DL that a deeper network performs better; that is, the appropriate depth of the architecture is closely related to the internal features of the real data. Furthermore, the proposed model has generalization ability and robustness. Therefore, choosing a suitable deep architecture, rather than blindly pursuing more hidden layers, can avoid computational burdens and enhance the prediction performance.

Comparison Results
To evaluate the prediction performance of the proposed model, three models are applied for comparison using the same dataset, i.e., the AE-based DRN, the back-propagation neural network (BPNN), and least squares support vector regression (LSSVR). The BPNN with one hidden layer and the LSSVR with a kernel function (no hidden layer) are typical shallow learning frameworks in peer work. All of the models have the same input-output structure (19 inputs and 1 output), and the numbers of hidden layers and hidden nodes (except for the LSSVR) are specified by the experimental method.
For the AE-based DRN, the experimental design is the same as the "DRN for manufacturing quality prediction" shown in Table 2. The optimal model structure has two hidden layers (l = 2) with 25 nodes in each hidden layer for both Case 1 and Case 2.
For the BPNN, an empirical formula is applied to determine the number of hidden nodes, i.e., hidden nodes = sqrt(input nodes + output nodes) + b (b ∈ [0, 10]). Accordingly, the optimal number of hidden nodes for Case 1 and Case 2 is determined as 10 and 12, respectively, in terms of the MAPE. In addition, the other main computing parameters are set to a learning rate of 0.05, a training goal of 0.0001, and 200 iterations.
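The candidate hidden-node counts can be enumerated with a small sketch. It assumes the common form of the empirical rule, hidden nodes = sqrt(input nodes + output nodes) + b with integer b from 0 to 10, and rounds to the nearest integer; the rounding and the function name hidden_node_candidates are assumptions made for illustration.

```python
import math


def hidden_node_candidates(n_inputs, n_outputs, b_max=10):
    """Candidate hidden-node counts from the assumed square-root rule."""
    base = math.sqrt(n_inputs + n_outputs)
    return [round(base + b) for b in range(b_max + 1)]


candidates = hidden_node_candidates(19, 1)
```

With 19 inputs and 1 output, the candidates run from 4 to 14, so the reported optima of 10 and 12 hidden nodes lie inside the searched range. Each candidate would then be trained and compared in terms of the MAPE.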
For the LSSVR, there are two parameters, i.e., kernel width and penalty coefficient [9], that are determined using the ten-fold cross-validation method in this paper. Hence, the optimal parameters are set as (25.016 and 9.565) for Case 1 and (20.525 and 10.410) for Case 2.
The prediction results for the two cases using the three models are displayed in Figures 7 and 8, respectively. Figure 7 reveals the following: (1) There are some unsatisfactory prediction results in the three models for Case 1, which suggests that the performance depends on the input features rather than on the number of input parameters. Compared with Figure 5, the three comparison models, which consider all 19 parameters, exhibit a lower prediction performance. Specifically, there are six outliers (accounting for 6% of the test data) in the AE-based DRN, and seven outliers (accounting for 7%) in both the LSSVR and the BPNN. (2) Deep architectures are superior in feature learning. That is, the proposed model and the AE-based DRN perform better than the LSSVR and the BPNN in terms of the number of outliers. Overall, the MDRN model has the best performance for Case 1 among these models, which is attributed to multi-parameter reduction and deep feature representation, both of which reduce the interference of invalid information and improve the prediction capacity.
As Figure 8 illustrates, the fluctuations of the three models also differ, particularly around abrupt changes in the data. The residual analysis shows 10 outliers (accounting for 5%) in both the AE-based DRN and the LSSVR model, and 12 outliers (accounting for 6%) in the BPNN. Compared with Figure 6, the MDRN model ranks second, with 11 outliers. This result for Case 2 differs from that for Case 1. Therefore, to evaluate the performance of these models, quantitative evaluations are also investigated, and the results are summarized in Table 4.
As shown in Table 4, the statistical indexes of the two case applications demonstrate the following: (1) With the highest MAPE and RMSE, the shallow models (the BPNN with one hidden layer and the LSSVR without a hidden layer) have difficulty sufficiently capturing the features of the manufacturing parameters and their relationship to quality. In contrast, the deep networks (MDRN and AE-based DRN) demonstrate an increased capacity for feature learning and regression. (2) The proposed MDRN model has the highest TS: 95% of its predictions have errors below 5% and 100% below 10% for Case 1, and 95.5% have errors below 5% and 99.5% below 10% for Case 2. In contrast, the errors of the comparison models are scattered over wider ranges (4%-7% of the predictions have errors beyond the 10% threshold for Case 1, and 3%-5% for Case 2). (3) In terms of all of the criteria, the rankings for prediction performance, from best to worst, are as follows: MDRN, AE-based DRN, LSSVR, and BPNN. Therefore, according to the qualitative and quantitative analyses, the proposed MDRN model, which combines manifold dimension reduction and deep feature learning, is beneficial for exploring the sophisticated relationships between manufacturing multi-parameter inputs and quality, and displays a better prediction capacity for manufacturing quality.
Based on the Friedman test at a 5% level [42], the differences between the proposed MDRN model and the comparison models are statistically significant. Specifically, the asymptotic significance values are 0.029 (Case 1) and 0.001 (Case 2), i.e., less than 0.05 in both cases. Hence, the proposed MDRN model, which differs significantly from the other candidates, is an effective approach to multi-parameter manufacturing quality prediction.

Conclusions
To precisely predict manufacturing quality, we proposed a manifold learning-based deep regression network (MDRN), which integrates the Isomap algorithm for learning manufacturing multi-parameter information and the AE-based DRN model for extracting sophisticated patterns between manufacturing multi-parameter inputs and quality. The three steps of the MDRN are as follows: (1) the Isomap is operated to reduce the multi-parameter inputs from a high-dimensional to a low-dimensional space; (2) the DL technique, the first part of the AE-based DRN, is applied to extract the features in the low-dimensional space; and (3) the ANN, the second part of the AE-based DRN, is then used to construct the relationship between the low-dimensional features and the manufacturing quality, so as to achieve the manufacturing quality prediction. To investigate the prediction capacity of the proposed model, two cases with the same manufacturing process are considered, and comparisons with other methods are also studied. The results show that the proposed method, which has generalization ability and robustness, exhibits the best performance among all of the peer methods, and that the differences among the models are statistically significant.
The innovation of this work is the combination of the ML technique with the deep framework to capture sophisticated features of multi-parameter manufacturing quality. Based on this philosophy, specific constituents, i.e., the Isomap and the AE-based DRN, may be replaced with other techniques. Therefore, the present approach is general enough to be developed into a series of models for multi-parameter manufacturing quality prediction. In the future, we will focus on theoretical methods for defining the model's parameters and on practical applications in different manufacturing settings.