1. Introduction
In recent decades, robust design (RD) has been considered essential for improving product quality, as the primary purpose of RD is to seek a set of parameters that make a product insensitive to various sources of noise. In other words, RD attempts to minimize the variability of quality characteristics while ensuring that the process mean meets the target value. To solve the RD problem, Taguchi [
1] considered both the process mean and variance as a single performance measure and defined a number of signal-to-noise ratios to obtain the optimal factor settings. Unfortunately, the orthogonal arrays (OAs), statistical analysis, and signal-to-noise ratios associated with this technique were criticized by Box et al. [
2], Leon et al. [
3], Box [
4], and Nair [
5]. Therefore, Vining and Myers [
6] proposed the dual response (DR) approach based on response surface methodology (RSM), in which the process mean and variance are estimated separately as functions of control factors. The result is an RD optimization model in which the process mean is prioritized by setting it as a constraint and the process variability is set as an objective function. From this starting point, the three sequential steps of the RD procedure were generated: design of experiment (DoE), estimation, and optimization. The DoE step exploits information about the relationship between the input and output variables. In the second step, the functional form of this relationship is defined by estimating the model parameters. Ultimately, the optimal settings for the input factors are identified in the third step.
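The dual response idea, with the process mean as a constraint and the process variability as the objective, can be sketched numerically. The quadratic surfaces below are hypothetical stand-ins chosen for illustration, not fitted models from the cited work:

```python
# Sketch of the dual response (DR) approach: treat the estimated mean as an
# equality constraint and minimize the estimated standard deviation.
# The surfaces below are hypothetical, not models from the literature.
import numpy as np
from scipy.optimize import minimize

TARGET = 500.0

def mean_hat(x):
    # hypothetical fitted mean response surface
    return 500.0 + 50.0 * x[0] + 30.0 * x[1] + 10.0 * x[0] * x[1]

def std_hat(x):
    # hypothetical fitted standard-deviation surface
    return 30.0 + 5.0 * x[0] ** 2 + 8.0 * x[1] ** 2

res = minimize(
    std_hat,
    x0=np.array([0.2, 0.2]),
    bounds=[(-1.0, 1.0), (-1.0, 1.0)],  # coded experimental region
    constraints=[{"type": "eq", "fun": lambda x: mean_hat(x) - TARGET}],
)
```

Because constraints are present, `scipy` selects a sequential quadratic programming solver; the result keeps the mean on target while driving the dispersion surface to its constrained minimum.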
As several RD optimization models have been proposed and modified according to various criteria, the optimization step is relatively well developed. The priority criterion was used in the DR models proposed by Vining and Myers [
6], Copeland and Nelson [
7], and Del Castillo and Montgomery [
8], whereas the process mean and variance were considered simultaneously in the mean squares error (MSE) model developed by Lin and Tu [
9]. The weight criterion was used to consider the trade-off between the mean and variance in the weighted sum models reported by Cho et al. [
10], Ding et al. [
11], and Koksoy and Doganaksoy [
12], whereas Ames et al. [
13] used the weight criterion in a quality loss function model. Shin and Cho [
14,
15] extended the DR model by integrating the customized maximum value on the process bias and variance. Based on the MSE concept, Robinson et al. [
16] and Truong and Shin [
17] proposed generalized linear mixed models and inverse problem models. Using the goal programming approach, Kim and Cho [
18] and Tang and Xu [
19] introduced prioritized models, whereas Borror [
20] and Fogliatto [
21] both placed the output responses on a common scale using desirability functions. In an attempt to identify the Pareto-efficient solutions, Kim and Lin [
22] and Shin and Cho [
23] developed a fuzzy model and a lexicographical weighted Tchebycheff model for the RD multiple-objective optimization problem, respectively. Furthermore, Goethals and Cho [
24] integrated the economic factor in the economic time-oriented model to handle time-oriented dynamic characteristics, while Nha et al. [
25] introduced the lexicographical dynamic goal programming model.
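Among the criteria above, the MSE model of Lin and Tu admits a compact numerical sketch: minimizing squared bias plus variance can accept a small bias in exchange for lower variance. The one-factor surfaces below are hypothetical, chosen only to exhibit this trade-off:

```python
# Sketch of the MSE criterion (squared bias + variance) versus the
# constrained DR approach. The surfaces below are hypothetical.
from scipy.optimize import minimize

TARGET = 500.0

def mean_hat(x):   # hypothetical fitted mean surface
    return 490.0 + 20.0 * x[0]

def std_hat(x):    # hypothetical fitted std-dev surface
    return 30.0 + 25.0 * x[0] ** 2

def mse(x):
    bias = mean_hat(x) - TARGET
    return bias ** 2 + std_hat(x) ** 2

res = minimize(mse, x0=[0.0], bounds=[(-1.0, 1.0)])
# the constrained DR solution would sit at x = 0.5, where mean_hat == TARGET
```

Here the MSE optimum tolerates a few units of bias to avoid the variance inflation incurred at the exact on-target setting.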
The DoE is a systematic method that aims to identify the effects of controllable factors on the quality characteristics of interest. DoE has been developed since the 1920s. Montgomery [
26] reviewed the history of DoE in the published literature. Several DoE techniques have been developed and intensively researched as a means of conducting experiments in industrial applications; these include full factorial designs, fractional factorial designs (screening designs), mixture designs, Box-Behnken designs, central composite designs (CCD), Taguchi array designs, Latin square designs, and other non-conventional methods (e.g., D-optimal designs).
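Two of the designs listed above can be constructed in a few lines; the coded levels and the face-centered CCD variant below are illustrative choices, not prescriptions from the text:

```python
# Illustrative construction of two common designs: a 3-level full factorial
# and a face-centered central composite design (CCD) in coded units.
from itertools import product

def full_factorial(levels, k):
    # all combinations of the given levels across k factors
    return [list(p) for p in product(levels, repeat=k)]

def face_centered_ccd(k):
    # 2-level factorial corners + axial (face-centered) points + center point
    corners = full_factorial([-1, 1], k)
    axial = []
    for i in range(k):
        for a in (-1, 1):
            pt = [0] * k
            pt[i] = a
            axial.append(pt)
    return corners + axial + [[0] * k]

ff = full_factorial([-1, 0, 1], 3)   # 3^3 = 27 runs
ccd = face_centered_ccd(3)           # 8 corners + 6 axial + 1 center = 15 runs
```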
Although many techniques for the estimation stage of an RD process have been reported in the literature, there is room for improvement. Indeed, the accuracy and reliability of prediction and optimization depend directly on the estimation results. Additionally, most regression methods are RSM-based approaches, which commonly rely on assumptions such as normality and homogeneous variance of the response data. However, in practice, these assumptions may not hold.
Along with the development of RD, RSM has been widely applied in various fields of applied statistics. The most extensive applications of RSM are in the industrial world, particularly in situations where several input variables may influence some performance measure or quality characteristic of the product or process [
27]. RSM uses mathematical and statistical techniques to explore the functional relationship between input control variables and an output response variable of interest, with the unknown coefficients in this functional relationship typically estimated by the least-squares method (LSM). The usual assumptions behind LSM-based RSM are that the experimental data and error terms are normally distributed and that the error terms have constant variance and zero mean. When one of these assumptions is violated, the Gauss-Markov theorem no longer holds. Instead, alternative techniques can be applied, such as maximum likelihood estimation (MLE), weighted least squares (WLS), and Bayesian methods. From the viewpoint of MLE, the model parameters are regarded as fixed and unknown quantities, and the observed data are considered random variables [
28]. Truong and Shin [
17,
29] showed that LSM-based RSM does not always estimate the input-output functions effectively, so they developed a procedure to estimate these unknown coefficients using an inverse problem.
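The WLS alternative mentioned above can be sketched on synthetic heteroscedastic data (the data-generating model below is an assumption for illustration): each observation is weighted by the inverse of its noise level, which restores efficient estimation when the constant-variance assumption fails.

```python
# Illustrative weighted least squares (WLS) for heteroscedastic data:
# scale each row of the linear system by 1/sigma_i, then solve by least squares.
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = np.linspace(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x])     # design matrix: intercept + slope
beta_true = np.array([2.0, 3.0])
sigma = 0.1 + 2.0 * x                    # non-constant error standard deviation
y = X @ beta_true + rng.normal(0.0, sigma)

# OLS ignores the unequal variances
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# WLS: weight observations by 1/sigma_i
w = 1.0 / sigma
beta_wls, *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)
```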
In recent decades, neural networks (NNs) have become a hot topic of research; NNs are now widely used in various fields, including speech recognition, multi-objective optimization, function estimation, and classification. Owing to the generalization capacity of their activation functions, NNs can model linear and nonlinear relationships between inputs and outputs without distributional assumptions. NNs are universal function approximators, and Irie and Miyake [
30], Funahashi [
31], Cybenko [
32], Hornik et al. [
33], and Zainuddin and Pauline [
34] showed that NNs are capable of approximating any arbitrary nonlinear function to the desired accuracy without the knowledge of predetermined models, as NNs are data-driven and self-adaptive. Therefore, a NN provides a powerful regression method to model the functional relationship between RD input factors and output responses without making any assumptions. In RD settings, Rowlands et al. [
35] integrated an NN into RD by using the NN to conduct the DoE stage. Su and Hsieh [
36] applied two NNs to train the data to obtain the optimal parameter sets and predict the system response value. Cook et al. [
37] developed an NN model to forecast a set of critical process parameters and employed a genetic algorithm to train the NN model to achieve the desired level of efficiency. The integration of NNs into RD has also been discussed by Chow et al. [
38], Chang [
39], and Chang and Chen [
40]. Arungpadang and Kim [
41] developed a feed-forward NN-based RSM to model the functional relationship between input variables and output responses to improve the precision of estimation without increasing the number of experimental runs. Sabouri et al. [
42] proposed an NN-based method for function estimation and optimization by adjusting weights until the response reached the target conditions. With regard to input-output relationship modeling, Hong and Satriani [
43] also discussed a convolutional neural network (CNN) with an architecture determined by Taguchi’s orthogonal array to predict renewable power. Recently, Arungpadang et al. [
44] proposed a hybrid neural network-genetic algorithm to predict process parameters. Le et al. [
45] proposed an NN-based response function estimation (NRFE) approach that identifies a new screening procedure, based on a desirability function family, for obtaining the best transfer function in an NN structure while determining its associated weight parameters. Le and Shin [
46] proposed an NN-based estimation method as an RD modeling approach: a feedback NN modeling method was first integrated into the estimation of the RD response functions, two new feedback NN structures were then proposed, and the existing recurrent NNs (i.e., Jordan-type and Elman-type NNs) together with the proposed feedback NN structures were suggested as alternative RD modeling methods.
The primary motive of this research is to establish feed-forward NN structure-based estimation methods as alternative RD modeling approaches. First, an NN-based estimation framework is incorporated into the RD modeling procedure. Second, RD modeling methods based on the feed-forward back-propagation neural network (FFNN), cascade-forward back-propagation neural network (CFNN), and radial basis function network (RBFN) are proposed. These are applied to estimate the process mean and standard deviation response functions. Third, the efficiency of the proposed modeling methods is illustrated through simulation studies with a given real function. Fourth, the efficacy of the proposed modeling methods is illustrated through a printing case study. Finally, the results of comparative studies show that the proposed methods obtain better optimal solutions than conventional LSM-based RSM. The proposed estimation methods based on feed-forward NN structures are illustrated in
Figure 1. From the experimental data, the optimal numbers of hidden neurons in the FFNN and CFNN structures and the dispersion constant “spread” in the RBFN are identified to finalize the optimal structures of the corresponding NNs. The DR functions can be separately estimated using the proposed estimation methods from the optimal NN structures with their control factors and output responses.
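The role of the RBFN dispersion constant "spread" can be illustrated with a minimal Gaussian RBF network: centers placed at the training inputs, a single spread parameter, and linear output weights fitted by least squares. This is a sketch under those assumptions, not the paper's exact implementation:

```python
# Minimal Gaussian radial basis function network (RBFN): centers at the
# training inputs, one "spread" constant, linear output weights by least squares.
import numpy as np

def rbf_design(X, centers, spread):
    # pairwise squared distances -> Gaussian activations
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * spread ** 2))

def fit_rbfn(X, y, spread):
    Phi = rbf_design(X, X, spread)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return w

def predict_rbfn(X_train, w, spread, X_new):
    return rbf_design(X_new, X_train, spread) @ w

# toy 1-D data from a smooth function
X = np.linspace(-1.0, 1.0, 21)[:, None]
y = np.sin(np.pi * X[:, 0])
w = fit_rbfn(X, y, spread=0.2)
pred = predict_rbfn(X, w, 0.2, X)
```

A small spread yields near-interpolation of the training data; a large spread smooths the fit, which is why the spread must be tuned from the experimental data, as in the procedure described above.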
The statistical estimation method of RSM based on conventional LSM, as introduced by Box and Wilson [
47], generates the response surface approximation using linear, quadratic, and other functions, while the coefficients are estimated by minimizing the total error between the actual and estimated values. For a more comprehensive understanding of RSM, Myers [
48] and Khuri and Mukhopadhyay [
49] discuss the various development stages and future directions of RSM. When the exact functional relationship is unknown or very complicated, conventional LSM is typically used to estimate the input-output functional responses in RSM [
50,
51]. In general, the output response $y$ can be identified as a function of the input factors as follows:

$$y(\mathbf{x}) = f(\mathbf{x})^{T}\boldsymbol{\beta} + \varepsilon,$$

where $\mathbf{x} = [x_1, x_2, \ldots, x_k]^T$ is a vector of the control factors, $\boldsymbol{\beta}$ is a column vector of the estimated parameters, and $\varepsilon$ is the random error. The estimated second-order models for the process mean $\hat{\mu}(\mathbf{x})$ and standard deviation $\hat{\sigma}(\mathbf{x})$ are represented as

$$\hat{\mu}(\mathbf{x}) = \hat{\beta}_0 + \sum_{i=1}^{k}\hat{\beta}_i x_i + \sum_{i=1}^{k}\hat{\beta}_{ii} x_i^2 + \sum_{i<j}\hat{\beta}_{ij} x_i x_j,$$

$$\hat{\sigma}(\mathbf{x}) = \hat{\gamma}_0 + \sum_{i=1}^{k}\hat{\gamma}_i x_i + \sum_{i=1}^{k}\hat{\gamma}_{ii} x_i^2 + \sum_{i<j}\hat{\gamma}_{ij} x_i x_j,$$

where $\hat{\boldsymbol{\beta}}$ and $\hat{\boldsymbol{\gamma}}$ are the estimators of the unknown parameters in the mean and standard deviation functions, respectively. These coefficients are estimated using LSM as

$$\hat{\boldsymbol{\beta}} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\bar{\mathbf{y}}, \qquad \hat{\boldsymbol{\gamma}} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{s},$$

where $\bar{\mathbf{y}}$ and $\mathbf{s}$ are the vectors of average and standard deviation values for the experimental data, respectively.
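The LSM estimation of the mean and standard deviation surfaces can be sketched numerically; the replicated design and the data-generating model below are synthetic assumptions for illustration only:

```python
# Ordinary least-squares fit of second-order response surfaces to the
# per-treatment mean and standard deviation (illustrative synthetic data).
import numpy as np
from itertools import combinations

def second_order_design(X):
    n, k = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(k)]                       # linear terms
    cols += [X[:, i] ** 2 for i in range(k)]                  # pure quadratics
    cols += [X[:, i] * X[:, j] for i, j in combinations(range(k), 2)]
    return np.column_stack(cols)

rng = np.random.default_rng(1)
levels = [-1.0, 0.0, 1.0]
X = np.array([[a, b] for a in levels for b in levels])        # 3^2 design
true_mean = 10.0 + 4.0 * X[:, 0] + 3.0 * X[:, 1]
reps = np.stack([true_mean + rng.normal(0.0, 0.5, len(X)) for _ in range(3)],
                axis=1)                                       # 3 replicates

ybar = reps.mean(axis=1)       # per-treatment averages  (response for mu-hat)
s = reps.std(axis=1, ddof=1)   # per-treatment std devs  (response for sigma-hat)

D = second_order_design(X)
beta_mu, *_ = np.linalg.lstsq(D, ybar, rcond=None)
beta_sigma, *_ = np.linalg.lstsq(D, s, rcond=None)
```

Solving the two least-squares problems against the same design matrix mirrors the separate estimation of the mean and standard deviation functions described above.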
4. Case Study
The printing data example used by Vining and Myers [
6] and Lin and Tu [
9] was selected to demonstrate the application of the proposed methods. The printing experiment investigates the effects of speed ($x_1$), pressure ($x_2$), and distance ($x_3$) on a printing machine's ability to apply colored ink to a package ($y$). The case study has a three-level, three-factor factorial design, so the total number of experimental runs is $3^3 = 27$. To ensure the accuracy of the experiment, each treatment combination is replicated three times. In this case study, the target is $\tau = 500$. The experimental data are given in Vining and Myers [
. The experimental data are given in Vining and Myers [
6]. The estimated mean and standard deviation functions given by LSM-based RSM take the second-order form

$$\hat{\mu}(\mathbf{x}) = \hat{\beta}_0 + \sum_{i=1}^{3}\hat{\beta}_i x_i + \sum_{i=1}^{3}\hat{\beta}_{ii} x_i^2 + \sum_{i<j}\hat{\beta}_{ij} x_i x_j \quad \text{and} \quad \hat{\sigma}(\mathbf{x}) = \hat{\gamma}_0 + \sum_{i=1}^{3}\hat{\gamma}_i x_i + \sum_{i=1}^{3}\hat{\gamma}_{ii} x_i^2 + \sum_{i<j}\hat{\gamma}_{ij} x_i x_j,$$

where $\mathbf{x} = [x_1, x_2, x_3]^T$ and the coefficient estimates are obtained by LSM from the replicated experimental data.
The information used for the RD modeling methods after training with the associated transfer functions, training functions, NN architectures (number of inputs, number of hidden neurons, number of outputs), and number of epochs in the case study are summarized in
Table 10 and
Table 11. The specified weight and bias values of the proposed FFNN, DFNN, and RBFN used to approximate the process mean and standard deviation functions in the case study are shown in
Appendix C (
Table A9,
Table A10,
Table A11,
Table A12,
Table A13 and
Table A14).
The contour plots of the response functions for the process mean and standard deviation estimated by the LSM-based RSM, FFNN, CFNN, and RBFN robust design modeling methods are demonstrated in
Figure 17,
Figure 18,
Figure 19 and
Figure 20, respectively.
Table 12 presents the optimal factor settings, associated process mean, process bias, process variance, and EQL values obtained from the proposed NN-based modeling methods and LSM-based RSM in the case study. The proposed RD modeling methods produced significantly smaller EQL values than LSM-based RSM.
The process variance obtained from the proposed NN-based estimation methods is markedly lower than that of conventional RSM. Scatter plots of squared process bias versus process variance for the traditional LSM-based RSM model and the three proposed NN-based modeling methods are illustrated in
Figure 21. The optimal settings are marked with green stars in each figure.
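The bias-variance comparison behind Table 12 can be sketched with a simple loss computation. The candidate settings and their estimated moments below are hypothetical, and the unweighted loss form is a common convention that may differ from the paper's exact EQL definition:

```python
# Illustrative expected quality loss (EQL) comparison at two hypothetical
# candidate settings, using EQL = (mean - target)^2 + variance.
def eql(mean, std, target):
    # squared process bias plus process variance
    return (mean - target) ** 2 + std ** 2

target = 500.0
candidates = {
    "setting_A": (498.0, 40.0),   # small bias, large standard deviation
    "setting_B": (505.0, 25.0),   # larger bias, smaller standard deviation
}
scores = {name: eql(m, s, target) for name, (m, s) in candidates.items()}
best = min(scores, key=scores.get)   # setting with the smallest loss
```

As in the scatter plots, a setting with slightly larger bias can still dominate when its variance contribution is much smaller.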
5. Conclusions and Further Studies
This paper identified three NN-based modeling approaches (FFNN, CFNN, and RBFN) that obviate the need for the assumptions required when LSM-based RSM is used to approximate the mean and standard deviation functions. The feed-forward NN structure-based RD modeling methods are alternative options for identifying the functional relationship between the input factors and the process mean and standard deviation in RD. Compared with the conventional RD modeling method, the proposed approaches offer significant advantages in accuracy and efficiency, and they can easily be implemented using existing software such as MATLAB. The results of the two types of simulation studies and the case study show that the proposed RD modeling methods provide better optimal solutions than conventional LSM-based RSM. The main results are summarized as follows, with process variability, EQL, and R2 serving as the model validation criteria. (i) In simulation study 1, the proposed RD modeling methods showed, on average, 139 times lower process bias and 1.5 times lower process variance than LSM-based RSM, and about four times smaller EQL; among the three NN-based methods, FFNN attained the lowest EQL. (ii) In simulation study 2, the proposed RD modeling methods showed, on average, 11 times lower process bias and roughly equal process variance compared with LSM-based RSM, and about 1.2 times smaller EQL; FFNN again attained the lowest value. (iii) In the case study, the proposed RD modeling methods produced markedly better results than LSM-based RSM, with process bias on average 102 times lower, process variance 85 times lower, and EQL about 86 times smaller; CFNN and FFNN showed much lower values than RBFN.
(iv) Specifically, in the printing machine case study, the R2 values for the estimated standard deviation functions are 45.452%, 73.51%, 69.47%, and 99.99% when the conventional LSM-based RSM, FFNN-based, CFNN-based, and RBFN-based modeling methods are applied, respectively.
In future work, the proposed NN structure-based RD methods could be extended to multiple responses (the RD multi-objective optimization problem), time-series data, big data, and simulation data [
60]. However, the activation and transfer functions in the hidden and output layers should be carefully investigated and selected for each application. In addition, this study verified the proposed methodology on a classical case study; a future study will perform a field-based case study that identifies optimal process conditions for productivity improvement in a smart factory.