Article

A Study of the Response Surface Methodology Model with Regression Analysis in Three Fields of Engineering

1
Africa Industrial Research Center, National Chung Hsing University, Taichung 40227, Taiwan
2
Department of Bio-Industrial Mechatronics Engineering, National Chung Hsing University, Taichung 40227, Taiwan
*
Author to whom correspondence should be addressed.
Appl. Syst. Innov. 2025, 8(4), 99; https://doi.org/10.3390/asi8040099
Submission received: 16 May 2025 / Revised: 6 July 2025 / Accepted: 16 July 2025 / Published: 21 July 2025

Abstract

Researchers conduct experiments to discover factors influencing the experimental subjects, so the experimental design is essential. The response surface methodology (RSM) is a special experimental design used to evaluate factors significantly affecting a process and determine the optimal conditions for different factors. The relationship between response values and influencing factors is mainly established using regression analysis techniques. These equations are then used to generate contour and surface response plots to provide researchers with further insights. The impact of regression techniques on response surface methodology (RSM) model building has not been studied in detail. This study uses complete regression techniques to analyze sixteen datasets from the literature on semiconductor manufacturing, steel materials, and nanomaterials. Whether each variable significantly affected the response value was assessed using backward elimination and a t-test. The complete regression techniques used in this study included considering the significant influencing variables of the model, testing for normality and constant variance, using predictive performance criteria, and examining influential data points. The results of this study revealed some problems with model building in RSM studies in the literature from three engineering fields, including the direct use of complete equations without statistical testing, deletion of variables with p-values above a preset value without further examination, existence of non-normality and non-constant variance conditions of the dataset without testing, and presence of some influential data points without examination. Researchers should strengthen training in regression techniques to enhance the RSM model-building process.

1. Introduction

The results or responses of an engineering system or its process are often affected by many factors. Finding the optimal level of these factors is called optimization. In research evaluating the impact of multiple factors on one or more response values, the most common method to evaluate the most appropriate conditions of multiple factors is the response surface methodology (RSM) [1,2]. RSM is a special experimental design that can be used to establish process operations for new products, improve the utilization rate of operating processes, and reduce resource waste. Through the use of appropriate statistical techniques, it is possible to evaluate the factors that have a significant impact on the process and thus determine the optimal conditions for these factors [1,3].
The special feature of RSM in experimental design is that this method can test multiple factors with a small number of samples. After completing the data collection, regression analysis is used to establish the mathematical relationship between the system response value and its influencing factors. These models are then used to create graphs that illustrate the impact of various factors on the response. Researchers can use these linear or curvilinear distribution graphs to observe this relationship. Due to this unique graphical observation capability, the response surface methodology has been widely used in industry [1,2,3,4].
Two excellent textbooks introduce the RSM concept, two- and three-level factorial design, experimental design for fitting the response surface, and some special topics of RSM [1,2]. Bas and Boyaci [4] proposed an essential concept about RSM. Asoo et al. [3] introduced the historical background of RSM. These review papers related to engineering applications involved processing in machinability [5], manufacturing [6], biofuel [7,8], energy [9,10], agro-industrial processes [11], and polymers [12].
In an experimental design, an experimental system includes the response of this system and is combined with different levels of the influencing factor. These factors serve as the independent variables for the regression analysis. Experimental runs represent a series of tests for the experiment. The response or output of the experiments is used as the dependent variable for further analysis.
The relationship between the response (y) and the input factors (x1, x2, …, xk) of the RSM can be expressed as
$$y_i = f(x_1, x_2, \ldots, x_k) + \varepsilon_i \tag{1}$$
where y is the response of the RSM model and is called the dependent variable; x1, x2, …, and xk are the influencing variables and are called independent variables in regression; k is the number of factors; and εi is the model’s error.
The mathematical model of the RSM model includes linear, interaction, and quadratic terms effects. If the process system involves three process variables, x1, x2, and x3, the form of the RSM equation is
$$y = b_0 + b_1x_1 + b_2x_2 + b_3x_3 + b_{11}x_1^2 + b_{22}x_2^2 + b_{33}x_3^2 + b_{12}x_1x_2 + b_{13}x_1x_3 + b_{23}x_2x_3 + \varepsilon \tag{2}$$
If four factors are used, x1, x2, x3, and x4, the RSM equation is
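$$\begin{aligned}
y = {} & b_0 + b_1x_1 + b_2x_2 + b_3x_3 + b_4x_4 + b_{11}x_1^2 + b_{22}x_2^2 + b_{33}x_3^2 + b_{44}x_4^2 \\
& + b_{12}x_1x_2 + b_{13}x_1x_3 + b_{14}x_1x_4 + b_{23}x_2x_3 + b_{24}x_2x_4 + b_{34}x_3x_4 + \varepsilon
\end{aligned} \tag{3}$$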
If the RSM equation includes all variables, as in Equations (2) and (3), it is called a full RSM equation.
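The structure of a full RSM equation can be enumerated mechanically. The following is a minimal sketch (the function name `full_rsm_terms` and the text labels are illustrative assumptions, not part of the original study) that lists the intercept, linear, quadratic, and two-factor interaction terms for k factors.

```python
from itertools import combinations


def full_rsm_terms(k):
    """Return the term labels of a full second-order RSM model in k factors."""
    factors = [f"x{i}" for i in range(1, k + 1)]
    quadratic = [f"{x}^2" for x in factors]
    interaction = [f"{a}*{b}" for a, b in combinations(factors, 2)]
    # Intercept, then linear, quadratic, and two-factor interaction terms
    return ["1"] + factors + quadratic + interaction


for k in (3, 4):
    terms = full_rsm_terms(k)
    print(k, len(terms), terms)
```

The number of coefficients is (k + 1)(k + 2)/2, which gives 10 coefficients for three factors and 15 for four factors, matching the full equations for three and four factors given above.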
The relationship between the response (dependent) variable and the independent variables can be established through regression analysis techniques. This regression model is an empirical equation, not a theoretical equation.
The major experimental methodologies used for the response surface methodology are the full factorial design (FFD), Box–Behnken design (BBD), and central composite design (CCD) [1,6,13,14,15,16]. The number of studies with each experimental methodology, as determined using Google Scholar, is listed in Table 1.
Table 1 presents the number of papers employing the three statistical experimental techniques from 2019 to 2024, as determined by a Google Scholar search. The number of papers using central composite design is the largest, followed by full factorial design, while Box–Behnken design is the least frequent. The number of papers using central composite design is about 10 times that of papers using Box–Behnken design. However, the trend over time shows that the number of papers using central composite design and full factorial design has been decreasing year by year, whereas the number of papers using Box–Behnken design is increasing year by year.
The experimental matrices in the three methods for RSM design can be found in various publications and textbooks [1,2]. The number of required experiments for the experimental methods is 27, 22, and 13 or more for FFD, BBD, and CCD, respectively.
Many criteria have been proposed to evaluate the fitting agreement and predictive ability of regression models. The essential procedure in regression analysis is to find which variables have a significant effect on the response. The criteria for evaluating the fitting agreement of a proposed model include the F-value of ANOVA, the coefficient of determination R2, the adjusted R2 (R2adj), and the lack-of-fit test. The criterion for assessing whether one variable has a significant effect on the response is the t-value or p-value of the coefficient of this variable. The criteria for evaluating the predictive ability are the predicted residual error sum of squares (PRESS) and the predicted R-squared (R2pred) [17,18,19].
After the evaluation, the appropriate equation (RSM model) establishes the relationship between the response value and the influencing factors as a mathematical function. Among the graphical methods, contour plots and three-dimensional surface response plots are particularly popular among researchers [1,3,20]. The studied techniques are described in detail in this paper [21].
The accuracy and predictive power of the RSM model determine whether it can effectively identify the factors that actually affect the response value. However, engineers often pay little attention to the suitability of the regression equations they use, and many researchers calculate the RSM model directly with commercial software. Reza et al. [22] emphasize the importance of the validity and accuracy of the model in RSM applications.
Compared with other experimental design methods, the most prominent feature of the response surface method is that it graphically represents the relationship between the influencing factors (variables) and the response value. Contour maps present this relationship in two-dimensional space and illustrate the changes in the response value under various influencing factors. Three-dimensional surface response maps present the same information in three-dimensional space, with the response value as the z-coordinate and the two influencing factors as the x- and y-coordinates. If no quadratic relationship exists, the contours are drawn as straight lines, and the three-dimensional surface response plot appears as a flat plane. If the quadratic relationship is significant, both graphs are represented by curves. Researchers often use these contour maps to intuitively observe the levels of the influencing factors that lead to the optimal response [1,3,4].
The purpose of performing a response surface methodology (RSM) experiment is to find the optimal operating conditions. If the RSM model represents a linear relationship, the contour plot and the surface response plot indicate the direction in which the response value changes relative to the original experimental design conditions. If the RSM model is a quadratic equation, the two plots indicate a maximum, a minimum, or a saddle point. Establishing a suitable RSM equation therefore ensures that the research results lead to the correct optimization conditions, and the regression technique used to establish the RSM model needs careful consideration.
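The distinction between a maximum, a minimum, and a saddle point can be illustrated for a two-factor quadratic model y = b0 + b1x1 + b2x2 + b11x1^2 + b22x2^2 + b12x1x2. The sketch below is an illustration only: the function name `stationary_point` and the coefficient values are hypothetical and are not taken from any dataset in this study. It sets both partial derivatives to zero and classifies the result from the sign of the Hessian determinant.

```python
def stationary_point(b1, b2, b11, b22, b12):
    """Locate and classify the stationary point of a two-factor quadratic model.

    Solves 2*b11*x1 + b12*x2 = -b1 and b12*x1 + 2*b22*x2 = -b2 (Cramer's rule).
    """
    det = 4.0 * b11 * b22 - b12 ** 2  # determinant of the Hessian matrix
    if det == 0:
        return None, "no unique stationary point"
    x1 = (b12 * b2 - 2.0 * b22 * b1) / det
    x2 = (b12 * b1 - 2.0 * b11 * b2) / det
    if det < 0:
        kind = "saddle point"
    elif b11 > 0:
        kind = "minimum"
    else:
        kind = "maximum"
    return (x1, x2), kind


# Hypothetical coefficients: y = 10 + 2*x1 + 4*x2 - x1^2 - 2*x2^2
point, kind = stationary_point(b1=2.0, b2=4.0, b11=-1.0, b22=-2.0, b12=0.0)
print(point, kind)
```

If all quadratic and interaction coefficients are zero, the determinant is zero and no stationary point exists, which corresponds to the linear case in which the plots only indicate a direction of change.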
To the best of the authors’ knowledge, regression techniques have not been fully considered when evaluating the accuracy and predictive ability of RSM equations. This study compiled 16 datasets from three engineering fields, drawn from the literature and supplemented with experimental datasets. These datasets were used to evaluate the adequacy of the RSM equations. The complete regression technique was illustrated. The adequate RSM models were proposed to highlight the importance of comprehensive regression analysis. The effect of the RSM equations on optimization was further discussed. Some suggestions were proposed for the use of RSM in engineering fields.

2. Materials and Methods

2.1. Data Sources for the Equations of the Response Surface Methodology

Table 2 presents 16 datasets from three engineering fields, used to evaluate the adequacy of RSM equations. These published papers present the original datasets.
The databases included ScienceDirect, Scopus, IEEE Xplore, SpringerLink, Google Scholar, and J-STAGE. Keywords were “RSM”, “Regression analysis”, “Model evaluation”, and “Semiconductor”, or “Steel materials”, or “Nanomaterials”.
The potential publication bias in the selected literature is a limitation of the available data. Some studies did not list their experimental data and presented their RSM results directly. Much helpful information in other, non-selected literature was not considered. The publication years of the 16 datasets from the three engineering fields ranged from 2004 to 2024. The countries of the authors involved are Algeria, Brazil, China, Korea, Italy, Iran, Kazakhstan, Malaysia, Turkey, and the UK.
The parameters and criteria were estimated using SigmaPlot v.14.0 (SPSS Inc., Chicago, IL, USA).

2.2. Model Building for the Response Surface Methodology

A typical multiple regression model is expressed as
$$y_i = b_0 + b_1x_1 + \cdots + b_kx_k + \varepsilon_i \tag{4}$$
where εi is the model error.
The independent variables of the regression equation are x1, x2, …, xk, and their coefficients are b0, b1, b2, …, bk. The coefficients are usually estimated using the least squares method. Many commercial software programs can help researchers perform this calculation easily and quickly.
To establish the relationship between the variables and response of the RSM models, the significant effect of the variables on the response must be evaluated using statistical techniques. However, due to the convenience of using the commercial software programs, the evaluation of regression analysis is easily neglected.
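The least squares calculation mentioned above can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the data are made up (generated without noise from y = 1 + 2x1 + 3x2), and the helper names `solve` and `least_squares` are hypothetical. It solves the normal equations X'Xb = X'y directly, which is adequate for a small, well-conditioned design but is not necessarily how commercial software performs the fit.

```python
def solve(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def least_squares(X, y):
    """Coefficients that minimize the sum of squared errors (normal equations)."""
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


# Made-up coded runs: a 2x2 factorial plus one centre point
runs = [(-1, -1), (1, -1), (-1, 1), (1, 1), (0, 0)]
X = [[1.0, x1, x2] for x1, x2 in runs]      # first column is the intercept
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in runs]
b = least_squares(X, y)
print([round(v, 6) for v in b])
```

Because the data contain no noise, the fitted coefficients reproduce the generating equation; with real experimental data, the significance of each coefficient still has to be tested.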

2.3. Basic and Complete Regression Analyses

Regression analysis is a statistical technique used in experimental design to establish the relationship between response values and influencing factors. With the help of convenient commercial software programs, the estimated values of variable parameters can be easily calculated using the least squares method. These software programs also provide statistical tests to validate the regression models, such as analysis of variance tables. An analysis of variance table lists the t-value and p-value for each variable, facilitating further analysis. This process is called basic regression analysis.
Regression analysis does not simply accept all variables in the initially calculated equation. The next step is to screen out those variables that have no significant effect on the response. Because of interaction and multicollinearity among the variables, the variables that show no significant effect cannot all be removed at the same time [17,18,39,40,41,42,43].
Several procedures are employed, including the sequential variable selection procedure, the forward procedure, stepwise regression, and backward elimination. In the establishment of the RSM models, the backward elimination procedure is recommended. For the regression analysis, two criteria are used to evaluate the accuracy (fitting agreement) and prediction [17,18,39,40,41,42].
Besides calculating the estimated values of the parameters and screening the effective variables, the detailed technique includes checking for violations of the assumptions and performing influence diagnostics. The influential data points are checked with other criteria [17,18,39,40,41,42,43].
In this study, the complete regression analysis includes performing a sequential variable selection procedure, evaluating RSM model accuracy and prediction using different criteria, checking for violations of assumptions, and performing influence diagnostics to identify influential data points.

2.4. Assumptions Involved in the Regression Analysis

The regression model includes the following assumptions: εi is uncorrelated across observations, the mean of εi is zero, the variance of εi is constant, and the distribution of εi follows a normal distribution. Based on these assumptions, non-standard conditions involved in regression analysis include the balance between underfitting and overfitting, the non-normality of the dataset, the heterogeneity of the variance, and influential data points [17].
After completing the regression calculation using the least squares method, the first concern is to verify the significant effect of each parameter. Then, these assumptions must be verified, and any non-standard conditions must be addressed if they exist. Another concern of regression analysis is distinguishing between the model’s performance in terms of model fitting and predictive ability. Two criteria are used [17,44,45,46]. Modern concepts in regression modeling have been introduced by Myers [17] and Marinoiu [46]. Checking the significant effect of each parameter and selecting a regression model is called classical regression. The concept of complete regression involves a trade-off between bias and variance when selecting a limited number of important variables, comparing predictive ability, and verifying and addressing the assumptions of regression.

2.5. Establishment of the Model

Once the regression equation is established, contour and 3D surface plots can be generated using commercial software programs, allowing researchers to easily visualize the optimization. If the selected regression equation is inappropriate, these contour and 3D surface plots are also inappropriate, and the conclusions and suggestions regarding the effect of the independent variables (factors and levels) are meaningless.
Three methods for performing sequential variable selection are forward selection, stepwise regression, and backward elimination. These methods are illustrated in detail [17,18,39,40,41,42,43,47].
In the forward selection procedure, all variables are treated as candidate regressors, and the model starts with the constant term only. The variable that produces the largest R-squared value is selected first, and the resulting equation is referred to as the first equation. Each of the other variables is then considered as the second variable of the first equation, and the variable that produces the largest R-squared value in this second analysis is selected. The selection of variables continues in this way. At each step, the p-value of the selected variable is compared with the preset value. If the p-value of the selected variable is greater than the preset value, the procedure ends.
Stepwise regression is an improvement on forward selection. Variables that were deleted in a previous stage can re-enter the selection procedure. Otherwise, the selection procedure is the same as in forward selection.
In the backward elimination procedure, all variables are first fitted in the regression equation, and the t-value and p-value of each variable are determined. The variable with the lowest absolute t-value, and hence the highest p-value, is compared with a preselected significance level, usually 0.05. If its p-value exceeds the preselected level, the variable is removed. The equation is then recalculated with the remaining variables, and the variable with the lowest absolute t-value is again identified and its p-value compared with 0.05. This procedure is repeated until no further variable is dropped. The selected regression model consists of all remaining variables.
The problem with the forward and stepwise methods is that the critical t-values they use are not appropriate in the early stages [17]. Since there are few variables in the model in the early stages, the standard errors of the estimates are usually overestimated, and the p-values of the variables may be too large, thus preventing important variables from entering the model. Therefore, forward and stepwise procedures often produce underfitted models. Mendenhall and Sincich [42] recommend backward elimination because it includes all possible explanatory variables from the start and eliminates those that are not important in explaining the response variation. Backward elimination is also recommended because it considers the effects of all candidate variables [41,43,48].
In the backward elimination procedure, a t-test statistic for a variable is used to calculate its p-value, which is then used to assess the statistical interpretation of the variable in the regression model [14,48].
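The backward elimination loop described above can be sketched as follows. This is a simplified illustration, not the software procedure used in this study: the data are made up, the function names (`invert`, `two_sided_p`, `fit`, `backward_elimination`) are hypothetical, and the two-sided p-value of the t-statistic is approximated by Simpson integration of the Student t density so that the sketch needs only the standard library.

```python
import math


def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        d = m[c][c]
        m[c] = [v / d for v in m[c]]
        for r in range(n):
            if r != c:
                f = m[r][c]
                m[r] = [u - f * v for u, v in zip(m[r], m[c])]
    return [row[n:] for row in m]


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value of Student's t (Simpson integration of the density)."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    c /= math.sqrt(df * math.pi)

    def pdf(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    h = abs(t) / steps
    area = pdf(0.0) + pdf(abs(t))
    area += sum((4 if i % 2 else 2) * pdf(i * h) for i in range(1, steps))
    return max(0.0, 1 - 2 * area * h / 3)


def fit(X, y):
    """Least squares coefficients and the two-sided p-value of each one."""
    n, p = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    inv = invert(xtx)
    b = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    sse = sum((yi - sum(bj * xj for bj, xj in zip(b, r))) ** 2
              for r, yi in zip(X, y))
    s2 = sse / (n - p)  # residual mean square
    pvals = [two_sided_p(b[j] / math.sqrt(s2 * inv[j][j]), n - p)
             for j in range(p)]
    return b, pvals


def backward_elimination(columns, y, alpha=0.05):
    """Drop the least significant term until all remaining p-values <= alpha."""
    names = list(columns)
    while names:
        X = [[1.0] + [columns[k][i] for k in names] for i in range(len(y))]
        _, pvals = fit(X, y)
        worst = max(range(len(names)), key=lambda j: pvals[j + 1])
        if pvals[worst + 1] <= alpha:
            break
        names.pop(worst)  # remove one term, then refit with the rest
    return names


# Made-up data: y depends on x1 only, plus small noise
x1 = [-1, 1, -1, 1, -1, 1, -1, 1]
x2 = [-1, -1, 1, 1, -1, -1, 1, 1]
noise = [0.3, -0.2, 0.1, -0.4, -0.1, 0.2, -0.3, 0.4]
y = [10 + 5 * a + e for a, e in zip(x1, noise)]
kept = backward_elimination({"x1": x1, "x2": x2}, y)
print(kept)
```

Only one term is removed per pass and the model is refitted each time, which is the point of the procedure: the p-values of the remaining terms can change after each deletion. The sketch always keeps the intercept and does not enforce model hierarchy.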

2.6. Criteria for the Evaluation of RSM Equations

Ten criteria are used to evaluate the RSM equations after calculating the coefficients of the variables. They are listed in Table 3.
R2, the coefficient of determination, is affected by the number of variables, so it is not a reasonable criterion. The criteria of R2adj and s are considered measures of the effect of the number of variables; they serve as criteria for model accuracy. The PRESS value is used to compare the predictive performance [17,18,39,40,41,42].
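The way R2adj accounts for the number of variables, and the way PRESS measures prediction, can be made concrete with their usual textbook definitions. The sketch below is illustrative: the function names are hypothetical, and the check assumes n = 27 runs for the wafer-coating dataset of Section 3.1 (a three-level, three-factor full factorial), which is an assumption made here rather than a value stated in this section.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R2 for n runs and k model terms (intercept not counted)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def press(residuals, leverages):
    """PRESS from ordinary residuals e_i and hat-matrix diagonals h_ii."""
    return sum((e / (1 - h)) ** 2 for e, h in zip(residuals, leverages))


def predicted_r2(press_value, total_ss):
    """Predicted R2 from PRESS and the total sum of squares."""
    return 1 - press_value / total_ss


# Check against the y1 equations of Section 3.1, assuming n = 27 runs:
# full equation (9 terms, R2 = 0.927) and final equation (6 terms, R2 = 0.916)
print(round(adjusted_r2(0.927, 27, 9), 3))
print(round(adjusted_r2(0.916, 27, 6), 3))
```

Under that assumption, the two printed values agree with the R2adj values of 0.888 and 0.891 reported for these equations, and they show that dropping three non-significant terms raises R2adj even though R2 itself decreases.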
The normality test technique employed is the Kolmogorov–Smirnov method, with a cutoff value of p = 0.05. The constant variance test uses the Spearman rank correlation method; the cutoff value is p = 0.05.
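A constant variance check based on the Spearman rank correlation can be sketched as below. This is an assumption-laden illustration: it correlates the absolute residuals with the fitted values, which is one common form of the test, and the exact variant implemented in the software used in this study is not confirmed here. The data and function names are made up, and the conversion of the correlation to a p-value is omitted.

```python
import math


def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


# Made-up fitted values and residuals whose spread grows with the fitted value
fitted = [1.2, 2.5, 3.1, 4.8, 5.0]
residuals = [0.1, -0.2, 0.3, -0.4, 0.5]
rho = spearman(fitted, [abs(e) for e in residuals])
print(rho)
```

A correlation close to +1 or −1, as in this made-up example, suggests that the residual spread changes systematically with the fitted value, which is the non-constant variance condition the test is meant to detect.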
The externally studentized residuals, ti, and DFFITSi values are used to examine potentially influential data points. Both criteria are set at ±2.0. If a data point is identified as influential, it should not be removed from the dataset immediately. It may be due to experimental or instrumental error, or it may deviate from the trend predicted by the proposed model. Further observations should be made, and more relevant data should be collected under the same experimental conditions.
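The two influence criteria can be illustrated with a one-variable straight-line fit, for which the leverage has a simple closed form. The sketch below uses made-up data with one deliberately shifted observation; the function name `influence_diagnostics` is hypothetical, and the formulas are the standard textbook definitions of the externally studentized residual and DFFITS rather than output from the software used in this study.

```python
import math


def influence_diagnostics(x, y):
    """Externally studentized residuals and DFFITS for the fit y = b0 + b1*x."""
    n, p = len(x), 2  # p counts the intercept and the slope
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - b1 * mx
    e = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]   # ordinary residuals
    h = [1 / n + (xi - mx) ** 2 / sxx for xi in x]      # leverages
    sse = sum(ei ** 2 for ei in e)
    out = []
    for ei, hi in zip(e, h):
        # Residual variance estimated with observation i left out
        s2_del = (sse - ei ** 2 / (1 - hi)) / (n - p - 1)
        t = ei / math.sqrt(s2_del * (1 - hi))
        out.append((t, t * math.sqrt(hi / (1 - hi))))
    return out


# Made-up data: roughly y = 2x, with the 4th observation shifted upward
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 11.0, 9.8, 12.1, 13.9, 16.2]
diag = influence_diagnostics(x, y)
flagged = [i + 1 for i, (t, d) in enumerate(diag)
           if abs(t) > 2.0 or abs(d) > 2.0]
print(flagged)
```

With the ±2.0 cutoffs used in this study, only the shifted observation is flagged. As noted above, a flagged point is a prompt for further examination, not an automatic deletion.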
In this study, statistical analysis was performed using SigmaPlot V.14.0 (SPSS Inc., Chicago, IL, USA). The contour plots were produced by this software.

3. Results

3.1. Semiconductor Manufacturing

An experimental dataset was reported by Box and Draper [49] and was introduced by Myers et al. [1]. The process involved applying a coating material to a wafer. Several coating thicknesses at different locations on a wafer were measured. The mean y1 and standard deviation y2 of the thickness were calculated. The influencing variables were x1, speed; x2, pressure; and x3, distance. These datasets illustrate the backward elimination technique for an adequate equation and are listed in Appendix A.1.
For the y1 mean of thickness [49], the complete regression procedure is
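1. $$\begin{aligned}
y_1 = {} & 327.615 + \underset{(<0.001)}{177.011x_1} + \underset{(<0.001)}{109.422x_2} + \underset{(<0.001)}{131.472x_3} + \underset{(0.317)}{32.022x_1^2} - \underset{(0.481)}{22.378x_2^2} \\
& - \underset{(0.363)}{29.061x_3^2} + \underset{(0.008)}{66.033x_1x_2} + \underset{(0.003)}{75.458x_1x_3} + \underset{(0.064)}{43.583x_2x_3}
\end{aligned}$$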
R2 = 0.927, R2adj = 0.888, s = 76.111, PRESS = 337,737.94
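Delete $x_2^2$ and recalculate the equation.
2. $$\begin{aligned}
y_1 = {} & 312.696 + \underset{(<0.001)}{177.011x_1} + \underset{(<0.001)}{109.422x_2} + \underset{(<0.001)}{131.475x_3} + \underset{(0.310)}{32.022x_1^2} - \underset{(0.356)}{29.061x_3^2} \\
& + \underset{(0.007)}{66.033x_1x_2} + \underset{(0.003)}{75.458x_1x_3} + \underset{(0.006)}{45.583x_2x_3}
\end{aligned}$$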
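Delete $x_3^2$ and recalculate the equation.
3. $$\begin{aligned}
y_1 = {} & 293.322 + \underset{(<0.001)}{177.011x_1} + \underset{(<0.001)}{109.422x_2} + \underset{(<0.001)}{131.475x_3} + \underset{(0.308)}{32.022x_1^2} + \underset{(0.007)}{66.033x_1x_2} \\
& + \underset{(0.002)}{75.458x_1x_3} + \underset{(0.058)}{43.458x_2x_3}
\end{aligned}$$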
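Delete $x_1^2$ and recalculate the equation.
4. $$\begin{aligned}
y_1 = {} & 314.670 + \underset{(<0.001)}{177.011x_1} + \underset{(<0.001)}{109.422x_2} + \underset{(<0.001)}{131.473x_3} + \underset{(0.006)}{66.033x_1x_2} + \underset{(0.002)}{75.458x_1x_3} \\
& + \underset{(0.0058)}{47.583x_2x_3}
\end{aligned}$$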
R2 = 0.916, R2adj = 0.891, s = 75.068, PRESS = 288,457.9
The p-values of all variables are <0.05; this is an adequate equation.
The normality and constant variance tests are passed. Influential points include the 9th (ti = −3.316, DFFITSi = −2.858), 19th (ti = 2.255, DFFITSi = 2.056), and 25th (ti = −2.322).
The adequate equation involves x1, x2, x3, x1x2, x1x3, and x2x3. No quadratic terms exist in this equation. The contour plots of the response form a plateau, not a surface curve.
The full equation for the y2 standard deviation with the three variables is calculated as follows:
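$$\begin{aligned}
y_2 = {} & 34.904 + \underset{(0.280)}{11.522x_1} + \underset{(0.156)}{15.317x_2} + \underset{(0.012)}{29.183x_3} + \underset{(0.818)}{4.189x_1^2} - \underset{(0.942)}{1.328x_2^2} \\
& - \underset{(0.362)}{16.772x_3^2} + \underset{(0.550)}{7.717x_1x_2} + \underset{(0.691)}{5.117x_1x_3} + \underset{(0.281)}{14.075x_2x_3}
\end{aligned}$$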
R2 = 0.454, R2adj = 0.165, s = 43.817, PRESS = 93,044.256
After the execution of the backward elimination, the adequate equation for y2 is
$$y_2 = 47.993 + \underset{(0.007)}{29.183x_3}$$
R2 = 0.256, R2adj = 0.227, s = 42.171, PRESS = 52,441.283
The normality test is passed. The constant variance test fails (p < 0.001).
The results of the evaluation of adequate RSM models for semiconductor manufacturing in five studies are listed in Table 4.
Won et al. [23] employed the response surface method to optimize the final polishing of Si wafers. The experimental design was CCD, the response ySR was surface roughness, and the variables were x1 applied pressure, x2 platen speed, and x3 mixed slurry ratio. The RSM model was not reported in this study. Contour plot and response surface plot curves were produced by full models [23].
The full equation with these datasets is as follows:
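$$\begin{aligned}
y_{SR} = {} & 1.988 + \underset{(<0.001)}{0.501x_1} - \underset{(0.464)}{0.0440x_2} - \underset{(0.027)}{0.172x_3} + \underset{(0.084)}{0.235x_1^2} + \underset{(0.931)}{0.0101x_2^2} \\
& + \underset{(0.098)}{0.223x_3^2} + \underset{(0.302)}{0.0711x_1x_2} - \underset{(0.095)}{0.128x_1x_3} + \underset{(0.985)}{0.00125x_2x_3}
\end{aligned}$$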
R2 = 0.958, R2adj = 0.882, s = 0.172, PRESS = 1.299
The adequate equation was evaluated with the regression technique of backward elimination:
$$y_{SR} = 1.991 + \underset{(<0.001)}{0.501x_1} - \underset{(0.007)}{0.172x_3} + \underset{(0.030)}{0.238x_1^2} + \underset{(0.037)}{0.225x_3^2} - \underset{(0.044)}{0.128x_1x_3}$$
R2 = 0.941, R2adj = 0.909, s = 0.154, PRESS = 0.646
The normality and constant variance tests are passed. Two influential data points are found, the 7th and 15th data points.
In this study [23], the authors used the full equation to produce the curved surface plots. However, the adequate equation indicated that the x2 variable (platen speed) did not significantly affect the response, surface roughness. With the complete regression technique, the adequate RSM equation could help researchers to find the optimal condition.
Figure 1 shows the contour plots for the full and adequate equations. The difference between the two equations results in different contour curves in the two figures. The contour and response surface plots produced with the full equation were presented in the study [23]. An incorrect RSM equation can therefore lead to incorrect conclusions from the plots.
Lee et al. [24] investigated the polishing factors affecting surface roughness using a Box–Behnken design. The polishing factors were x1, pressure; x2, wheel speed; and x3, process time. The authors proposed the full equation as the best equation and found that the R2 was 0.974 for this equation. This full equation produced contour and response surface plots.
The full equation proposed by the authors is
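$$\begin{aligned}
y_{SR} = {} & 9.465 - \underset{(<0.001)}{24.850x_1} - \underset{(0.006)}{0.163x_2} - \underset{(0.011)}{0.278x_3} + \underset{(0.001)}{46.375x_1^2} + \underset{(0.008)}{0.00309x_2^2} \\
& + \underset{(0.038)}{0.00815x_3^2} - \underset{(0.661)}{0.0325x_1x_2} + \underset{(0.029)}{0.425x_1x_3} + \underset{(0.919)}{0.000150x_2x_3}
\end{aligned}$$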
R2 = 0.974, R2adj = 0.928, s = 0.140, PRESS = 1.559
However, the p-values of the variables x1x2 and x2x3 are greater than 0.05.
The adequate equation evaluated with the complete regression technique in this study is
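$$\begin{aligned}
y_{SR} = {} & 9.565 - \underset{(<0.001)}{25.501x_1} - \underset{(<0.001)}{0.168x_2} - \underset{(0.002)}{0.275x_3} + \underset{(<0.001)}{46.375x_1^2} + \underset{(0.002)}{0.00309x_2^2} \\
& + \underset{(0.014)}{0.00815x_3^2} + \underset{(0.010)}{0.425x_1x_3}
\end{aligned}$$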
R2 = 0.973, R2adj = 0.946, s = 0.121, PRESS = 0.603
The normality and constant variance tests are passed. Four influential data points are found (4th, 6th, 7th, 12th).
Comparing the full and adequate equations, the adequate equation has a higher R2adj value and lower s and PRESS values. This indicates that the adequate equation has better accuracy (R2adj, s) and better predictive performance (PRESS).
Zhang et al. [25] investigated the optimization of dispatching rules for wafer manufacturing systems. The affecting variables were x1, the criterion of bottleneck, and x2 and x3, which were two coefficients of work-in-progress (WIP) status. The responses included yCT, the cycle time (CT); yWIP, work-in-progress (WIP); and yTP, throughput (TP). The authors proposed the full models [25], and the contour and 3D response surface plots were produced as curved surface plots.
The results of the complete regression in our study are
$$y_{CT} = 932.466 - 10.392x_1 - 31.055x_2 + 0.0685x_1^2 + 3.849x_2^2$$
R2 = 0.813, R2adj = 0.763, s = 6.755, PRESS = 2307.945
The normality and constant variance tests were passed. Three influential data points were found.
$$y_{WIP} = 10514.817 - 166.362x_1 + 1878.219x_2 + 2653.558x_3 + 1.765x_1^2 - 164.955x_3^2 - 22.265x_1x_2$$
R2 = 0.922, R2adj = 0.886, s = 162.678, PRESS = 948,709.4
The normality test failed (p = 0.008), and the constant variance test was passed. One data point was influential (13th).
$$y_{TP} = -1920.381 + 40.347x_1 + 124.394x_2 + 248.409x_3 - 0.184x_1^2 - 15.781x_2^2 - 8.908x_3^2 - 1.204x_1x_3$$
R2 = 0.768, R2adj = 0.633, s = 15.875, PRESS = 13,369.7
The normality test failed (p = 0.008), and the constant variance test failed (p = 0.003). Three influential data points were found (5th, 15th, 18th). For the yCT response, $x_3^2$ did not significantly affect the response. For the yWIP response, $x_2^2$ was not included in the RSM equation. The authors presented surface curve plots using the full equation. However, this full equation was inappropriate. The yWIP response was under non-normality conditions, and both the normality and constant variance tests failed for the yTP response. An advanced regression technique needs to be performed to remedy these conditions.
Seo et al. [26] optimized a tungsten chemical mechanical planarization (CMP) slurry for semiconductor manufacturing. The CCD experimental design was employed. The responses yW and yOxide were the removal rates of the thickness of the W and oxide films. The full equations of the two responses were proposed [26] and used to produce the contour and response surface plots.
The adequate equations with datasets listed in the literature [26] were evaluated by complete regression analysis:
$$y_{W} = 25.601 + 843.010x_2$$
R2 = 0.721, R2adj = 0.698, s = 138.318, PRESS = 328.601
$$y_{Oxide} = 187.296 + 0.834x_1 + 10.682x_3$$
R2 = 0.688, R2adj = 0.636, s = 14.801, PRESS = 4.068
Both responses passed the tests of normality and constant variance. In this study, only the x2 variable had a significant effect on yW. The x1 and x3 variables had a linear relationship with yOxide. The surface plots were presented in the literature [26]. Comparing the adequate equations with the full equations proposed in the literature [26] shows how much the choice of equation affects the resulting contour and response surface plots.
Saleem and Soma [27] used the Box–Behnken design and RSM to study the optimization of MEMS devices. The influencing factors were x1, top electrode length (TEL); x2, top electrode width (TEW); x3, torsion spring length; and x4, torsion spring width (TSW). The response y was the pull-in voltage.
The full equation, which included x1, x2, x3, x4, $x_1^2$, $x_2^2$, $x_3^2$, $x_4^2$, x1x2, x1x3, x1x4, x2x3, x2x4, and x3x4, was proposed and used to produce the surface plots.
The full equation calculated with the datasets in literature [27] is
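$$\begin{aligned}
y_{PV} = {} & 27.200 - \underset{(<0.001)}{7.550x_1} - \underset{(<0.001)}{2.717x_2} - \underset{(<0.001)}{6.208x_3} + \underset{(<0.001)}{3.292x_4} + \underset{(0.842)}{0.154x_1^2} - \underset{(0.376)}{0.696x_2^2} \\
& + \underset{(0.147)}{1.192x_3^2} - \underset{(0.731)}{0.258x_4^2} + \underset{(0.364)}{0.825x_1x_2} + \underset{(0.062)}{1.800x_1x_3} - \underset{(0.311)}{0.925x_1x_4} + \underset{(0.635)}{0.425x_2x_3} \\
& - \underset{(0.655)}{0.400x_2x_4} + \underset{(0.148)}{1.350x_3x_4}
\end{aligned}$$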
R2 = 0.975, R2adj = 0.945, s = 1.748, PRESS = 211.08
The adequate equation calculated by backward elimination regression is
$$y_{PV} = 26.773 - 7.550x_1 - 2.717x_2 - 6.208x_3 + 3.292x_4 + 1.352x_3^2 + 1.801x_1x_3$$
R2 = 0.962, R2adj = 0.95, s = 1.659, PRESS = 100.206
Comparing the R2adj and s values, the adequate equation has better accuracy than the full equation. The adequate equation also exhibits better predictive performance, as indicated by the PRESS criterion. The normality test fails (p = 0.0046). Further analysis is needed.

3.2. Steel Materials

The results of evaluating adequate equations for steel materials are listed in Table 5.
Noordin et al. [28] investigated the performance of coated carbide tools using a CCD design. The influencing variables were x1, cutting speed; x2, feed; and x3, side cutting edge angle (SCEA), and the responses were yRa, surface roughness (Ra), and yFc, tangential force (Fc). The regression procedure was introduced in detail, and the backward elimination procedure was used. The model assessment criteria included R2, R2adj, PRESS, and lack of fit [28].
The response yRa and yFc calculations in this study are presented in Appendix A.2 and Appendix A.3. The complete regression analysis produces the appropriate model, which is consistent with the report by the authors [28]. In our study, the tests of normality and constant variance are performed. Both responses, yRa and yFc, pass. For the yRa response, one influential data point (7th) is identified. Two influential data points (12th, 13th) exist for the yFc response.
Bouacha et al. [29] studied the physical properties in hard turning with a cubic boron nitride (CBN) tool. The factors affecting the response were x1, cutting speed; x2, feed rate; and x3, depth of cut. The responses were the surface roughness measures yRa, arithmetic average of absolute roughness Ra; yRt, maximum height of the profile Rt; and yRz, average maximum height of the profile Rz.
The ANOVA tables of yRa, yRt, and yRz are presented in the literature [29]. The authors used p < 0.05 as the criterion and deleted, in a single step, every variable whose p-value was greater than 0.05. The final equations were proposed using the coefficients taken from this first calculation for the variables with p-values < 0.05. The interaction of these variables was not considered. The variables remaining in their proposed equations for the three responses are presented in the published table [29].
The adequate equations evaluated and checked with the regression technique are as follows:
$$
y_{Ra} = 0.285 - 0.00841x_1 + 14.410x_2 + 0.0000215x_1^2 - 33.681x_2^2 - 0.0128x_1x_2
$$
R² = 0.991, R²adj = 0.989, s = 0.018, PRESS = 0.0115
Both tests (normality and constant variance) are passed, and two influential data points are identified (6th and 13th).
$$
y_{Rt} = 2.221 - 0.0548x_1 + 86.001x_2 + 0.000136x_1^2 - 208.333x_2^2 - 0.068x_1x_2
$$
R² = 0.986, R²adj = 0.982, s = 0.139, PRESS = 0.676
Both tests (normality and constant variance) are passed, and two influential data points are found (1st and 4th).
yRz = 2.994 − 0.0409x1 + 40.071x2 + 1.575x3 + 0.000951x12 − 78.472x22 − 0.0.357x1x2  0.00541x1x3
R2 = 0.993, R2adj = 0.990, s = 0.079, PRESS = 0.220
The normality test fails (p = 0.007), whereas the constant variance test is passed. Two influential data points are found (5th and 6th).
The x3 variable did not significantly affect yRa and yRt. For the yRz response, the x3 variable only had the linear effect, and the quadratic term was insignificant. In this study [29], curve surface plots were produced using full equations rather than the appropriate models.
Figure 2 shows the contour plots for the complete and adequate equations. The difference in the equations induces a difference in the distribution of curves between the two figures. The contour and response surface plots produced with the complete equation are presented in the study [29]. An inadequate RSM equation can induce incorrect results.
Elbah et al. [30] performed a mixed ceramic tool performance test. The factors affecting performance were x1, depth of cut; x2, feed rate; and x3, cutting speed. The response factors were yFa, axial force; yFr, thrust force; yFt, tangential force; and yRs, surface roughness.
In the literature [30], four ANOVA tables with four responses were listed, along with the p-values of each variable and a notation indicating whether the variable was significant or not. However, this information was not used. The full equations of the four responses were proposed and used to produce the contour and response surface plots.
The forms of the four responses evaluated by the complete regression technique in our study are as follows:
yFa = f(x1, x2, x3, x1x3, x2x3)
yFr = f(x1, x2, x3, x1x2, x1x3)
yFt = f(x1, x2, x3, x1x2, x1x3, x2x3)
yRs = f(x1, x2, x1x2)
The quadratic terms had no significant effect on the responses. The curved surface plots were inappropriate and could easily induce incorrect results.
Campos et al. [31] observed the machining of hardened steels with CCD in an RSM study. The influencing factors were x1, cutting speed; x2, feed rate; and x3, cut depth. The responses were yTime, time; yRa, average surface roughness Ra; and yRt, maximum height of the profile surface roughness Rt.
The sequential model test and the ANOVA results of the three responses are presented in the literature [31]. The significant effects are the linear + square model for yTime, the linear + square model for yRa, and the linear + interaction model for yRt. In the three ANOVA tables, the p-values of some variables were >0.05. However, the full models of the three responses were proposed. The curved surfaces of both plots were presented.
The forms of the adequate equations in our study are
yTime = full equation.
yRa = f(x1, x2, x3, x1x2, x1x3, x2x3, x1², x3²).
yRt = f(x1, x1²).
The adequate equation for yTime is the full equation, yielding the same results as in the literature [31], and four influential data points are identified. For the Ra response yRa, the x2² term is not included in the adequate equation, and four influential data points are found. The curved surface plots were inappropriate for this response. For the Rt response yRt, only x1 and x1² have a significant effect.
Khalil et al. [32] reported optimizing the effect of machining factors on surface roughness for machining AISI D3 steel. The influencing factors were x1, cutting speed; x2, feed rate; and x3, cut depth. The response ySR was the surface roughness. The results of the ANOVA for the experiment are presented in the literature [32]. The p-values of variables x12 and x22 were >0.05. However, the full equation was still adopted, and curved response surface plots were produced.
The form of the adequate equation evaluated by the complete regression technique is
ySR = f(x1, x2, x3, x1x2, x1x3, x3²).
Among the quadratic terms, only the x3² term significantly affects the response.
Using the full equation involving x1² and x2² to produce the curved surface plots was inappropriate and could easily cause incorrect results.

3.3. Nanomaterials

The results of evaluating adequate RSM equations for nanomaterials are presented in Table 6.
Pakolpakcil et al. [37] investigated the effect of processing parameters on the aerosol filter of poly nanofiber mats. A three-factor BBD was used. The variables were x1, concentration; x2, rotation speed; and x3, needle size. The response yAfd was the average fiber diameter. The statistical results of the sequential models showed that the p-value of the quadratic term was <0.05. The ANOVA table in this study [37] indicated that the p-values of x1x2, x1x3, and x2² were >0.05. That is, these variables did not have a significant effect on the response. However, the full equation was proposed and used to produce the contour and surface response plots [37].
The calculation of the adequate equation with the datasets in the literature is presented in Appendix A.4. The form of this adequate equation is
$$
y_{Afd} = 277.462 + 60.375x_1 + 17.001x_2 + 15.875x_3 - 42.058x_1^2 + 34.442x_3^2 + 17.501x_2x_3
$$
The variable x2² is excluded. That is, there is no quadratic relationship between x2 and the response. The interaction terms x1x2 and x1x3 are also excluded, as shown in Appendix A.4.
An adequate equation has a similar accuracy performance to that of a full equation. However, it has a better predictive ability (PRESS = 675.548) than the full equation (PRESS = 1,113,520).
Figure 3 reveals the contour plots for the complete and adequate equations. The difference in the equations induces a difference in the distribution of curves between the two figures. The contour and response surface plots produced with the full equation are presented in the study [37]. An inadequate RSM equation can lead to incorrect observation results.
The normality test is passed. However, the constant variance test fails. One suspicious data point (12th) is found. Further analysis needs to be performed.
Pajaie and Taghizadeh [33] reported optimizing the catalytic performance of synthesized catalysts for the methanol-to-olefin reaction. The BBD experimental design was used. There were three variables: x1, the MW aging time; x2, the US aging time; and x3, the HT time. Two responses are the yield of ethylene and the yield of propylene.
Two ANOVA tables for yethylene and ypropylene are presented in the literature. For the ethylene yield, p-values of all variables are <0.05. The full equation is an adequate representation of the ethylene yield. The R2 value of ethylene is very close to 1.0 (R2 = 0.9995).
In the ANOVA table of the propylene yield ypropylene, the p-values of x2x3 and x3² were >0.05, so these two terms did not have a significant effect on ypropylene. However, the authors used the full equations for both responses to produce the contour and response surface plots [33]. This is inappropriate for the propylene yield, because the HT time x3 does not have a significant curved (quadratic) relationship with ypropylene.
There are six suspicious data points for yethylene and two suspicious data points for ypropylene. Further investigation needs to be performed.
Jourshabani et al. [34] investigated the factors influencing benzene hydroxylation to phenol using a V/SBA-16 nanoporous catalyst. The CCD was used. The variables were x1, reaction temperature; x2, H2O2 content; and x3, catalyst amount. The response was the yield of phenol.
In the ANOVA table for the response, the p-value of the x1x3 variable is >0.05. However, the authors proposed the full equation, which was used to produce the surface plots [34].
The adequate equation, evaluated using complete regression analysis, is yphenol yield = f(x1, x2, x3, x1², x2², x3²). The interaction terms (x1x2, x1x3, x2x3) are excluded from this equation. That is, the full equation is inappropriate.
Sheng et al. [35] investigated the optimization of deposition variables to synthesize upright ZnO rod arrays with large diameters. There were four influencing factors: x1, the concentration of Zn²⁺; x2, reaction temperature; x3, reaction time; and x4, the molar ratio of Zn²⁺. The response yD was the diameter. The authors applied a logarithmic transformation to the response. The form of their proposed equation is
Log (yD + 0.5) = f (x1, x2, x3, x4, x3², x1x2, x2x3, x2x4, x3x4).
In our study, the untransformed response yD passes both the normality and constant variance tests, so it is not necessary to transform it into logarithmic form. The adequate equation calculated by complete regression analysis in our study is
$$
\begin{aligned}
y_D ={}& 1.182 - 57.722x_1 - 0.0289x_2 + 0.288x_3 + 0.00265x_4 - 0.00822x_3^2 + 0.778x_1x_2 \\
&- 0.0000299x_2x_4 - 0.0000643x_3x_4
\end{aligned}
$$
R² = 0.86, R²adj = 0.797, s = 0.169, PRESS = 1.18
Rakhmanova et al. [36] reported the optimization of nanosized zinc oxide synthesis conditions using electrospinning. Three influencing factors were applied: x1, voltage; x2, distance; and x3, calcination temperature. The response, y, was zinc oxide.
The authors did not propose the empirical equation; instead, surface plots of the curve were presented [36]. The full equation evaluated by complete regression with the datasets in the literature is
$$
\begin{aligned}
y_{\text{zinc oxide}} ={}& 302.751 - 0.577x_1 + 1.625x_2 - 0.712x_3 + 0.000348x_1^2 - 0.00438x_2^2 \\
&- 0.432x_3^2 - 0.00871x_1x_2 + 0.0113x_1x_3 + 0.0897x_2x_3
\end{aligned}
$$
p-values: x1, 0.629; x2, 0.970; x3, 0.985; x1², 0.565; x2², 0.997; x3², 0.739; x1x2, 0.733; x1x3, 0.566; x2x3, 0.932.
R² = 0.768, R²adj = 0.351, s = 7.154, PRESS = 6124.736
In our study, the adequate equation evaluated by complete regression analysis is
$$
y_{\text{zinc oxide}} = 129.859 - 2.526x_2 - 3.663x_3
$$
R² = 0.651, R²adj = 0.592, s = 5.668, PRESS = 639.412
The normality and constant variance tests are passed, and two suspicious data points are found.
The PRESS values for the adequate and complete equation are 639.432 and 6124.736. The adequate equation offers a significant improvement in the predictive ability.
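The PRESS criterion used for these comparisons is the sum of squared leave-one-out prediction errors: each run is predicted from a model fitted to all the other runs. The following is a minimal standard-library sketch of that definition, not the software used in this study; the helper names and the toy data are illustrative assumptions.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

def press(X, y):
    """Sum of squared leave-one-out prediction errors."""
    total = 0.0
    for i in range(len(y)):
        b = ols(X[:i] + X[i + 1:], y[:i] + y[i + 1:])  # refit without run i
        pred = sum(bj * xj for bj, xj in zip(b, X[i]))
        total += (y[i] - pred) ** 2
    return total

# Toy data (hypothetical): compare a linear and a quadratic model.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.1, 0.9, 2.2, 2.9, 4.1]
X_lin = [[1.0, v] for v in x]
X_quad = [[1.0, v, v * v] for v in x]
print(press(X_lin, y), press(X_quad, y))
```

A model with superfluous terms can fit the observed runs closely yet predict held-out runs poorly, which is why a reduced equation may show a much smaller PRESS than the full equation.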
Sreekumar et al. [38] investigated the optimization of a photovoltaic/thermal system using a Mxene/water nanofluid as the heat transfer fluid via CCD. The four influencing variables were x1, nanofluid concentration; x2, nanofluid flow rate; x3, solar radiation; and x4, inlet temperature. The four responses were ynth, thermal efficiency; ynele, electrical efficiency; ynex,th, thermal exergy efficiency; and ynex,ele, electrical exergy efficiency.
The authors proposed full equations for the four responses and used them to produce contour plots and 3D response surface plots [38]. The coefficients of the variables and their corresponding p-values for the four responses (ynth, ynele, ynex,th, and ynex,ele) are presented in a table in the study [38]. In this table, many p-values of the parameters in the full equations are >0.05, indicating that these parameters did not significantly affect the responses. However, the complete equations were adopted [38].
The adequate equations of the four responses, evaluated with the complete regression technique, are as follows:
$$
y_{nth} = 51.078 + 29.749x_1 + 0.257x_2 - 0.631x_4
$$
R² = 0.814, R²adj = 0.789, s = 4.017, PRESS = 515.04
The normality and constant variance tests are passed. One suspicious data point is found.
$$
\begin{aligned}
y_{nele} ={}& 19.452 - 0.462x_1 + 0.00935x_2 - 0.00438x_3 - 0.0539x_4 + 0.0000238x_2x_3 \\
&- 0.000406x_2x_4
\end{aligned}
$$
R² = 0.984, R²adj = 0.979, s = 0.171, PRESS = 1.419
The normality and constant variance tests are passed. Three suspicious data points are found.
$$
\begin{aligned}
y_{nex,th} ={}& 0.601 - 3.364x_1 - 0.00775x_2 + 0.00353x_3 - 0.0271x_4 + 19.391x_1^2 \\
&- 0.0000171x_2x_3 + 0.000328x_2x_4 - 0.0000258x_3x_4
\end{aligned}
$$
R² = 0.979, R²adj = 0.970, s = 0.119, PRESS = 0.666
The normality and constant variance tests are passed. Two suspicious data points are found.
$$
\begin{aligned}
y_{nex,ele} ={}& 20.366 - 0.175x_1 + 0.0225x_2 - 0.00427x_3 - 0.0748x_4 - 0.000192x_2^2 \\
&- 0.00158x_1x_3 + 0.0000196x_2x_3
\end{aligned}
$$
R² = 0.998, R²adj = 0.997, s = 0.068, PRESS = 0.175
The normality test is passed. However, the constant variance test fails (p = 0.047). Two suspicious data points are found.
For ynth and ynele, no quadratic terms (x1², x2², or x4²) significantly affect the response, so the curved surface plots were inappropriate. For ynex,th, x1² is the only quadratic term in the adequate equation. For ynex,ele, x2² is the only quadratic term in the adequate equation. The constant variance test fails for ynex,ele. Advanced regression techniques should be applied to remedy this condition.

4. Discussion

This study collected 16 papers related to the application of RSM models in three engineering fields. This literature dataset was used to evaluate the adequacy of RSM equations. The evaluation of the adequate RSM model was completed through a complete regression analysis. This analysis calculated the coefficients for all variables, tested the significant effect of each variable, verified the assumptions, and identified influential data points in the regression analysis. It was found that only one paper reported an equation that could fully and correctly express the relationship between the response and the influencing factors [28].
The common issues with the application of RSM in three engineering fields, as identified in the study, are listed in Table 7.
Most papers adopted the full model and then used it to create contour plots and 3D response surface plots, allowing for the observation of the optimization of these variables. In the literature, ANOVA tables typically include the coefficient value of variable parameters, the t-value, and the p-value for each variable. However, this information was not used by some researchers [24,25,26,27,30,32,33,34,38] to evaluate the adequacy of equations.
One study did not report the RSM model [36]. The contour and response surface plots were produced with the full equation.
One study completely deleted unwanted variables after the first regression calculation [29]. When the ANOVA table of regression results indicated that the p-values of some variables were greater than 0.05, these variables were deleted simultaneously [29]. The equation was proposed using the coefficient values of the remaining variables. However, the full equations were still used to yield the contour and response surface plots for this study [29]. This method is unreasonable for model building.
Two studies employed sequential models to investigate the significant effects of linear, interaction, and quadratic terms on the response [31,37]. The p-values of some combinations (linear + interaction, linear + square terms) indicated that these combinations did not significantly affect the response. However, full equations were still used to produce contour and response surface plots.
If the adequate RSM model is not the full equation, the curved contour and 3D response surface plots produced using the full equation are inappropriate, and the optimal conditions read from these plots are not meaningful.
Some datasets failed the normality test [25,27,29], and others failed the constant variance test [25,37,38]. These datasets need to be transformed to align with the assumptions of regression analysis. Yang et al. [50] emphasized that departures from the homogeneity of variance assumption can induce seriously incorrect results and need to be remedied. Sheng et al. [35] used the logarithmic transformation of the response to perform the regression analysis. However, the response yD of their datasets did not violate the normality assumption, as determined by the Kolmogorov–Smirnov method in our study.
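As an illustration of the kind of normality check mentioned here, the sketch below computes the Kolmogorov–Smirnov distance between the empirical distribution of a set of residuals and a normal distribution fitted to them. It is a simplified stand-in, not the procedure used in this study: the function name and sample data are hypothetical, and no p-value is computed, because standard Kolmogorov–Smirnov critical values are not exact when the mean and standard deviation are estimated from the same sample.

```python
from statistics import NormalDist, mean, stdev

def ks_statistic_normal(residuals):
    """Largest gap between the empirical CDF and a fitted normal CDF."""
    n = len(residuals)
    dist = NormalDist(mean(residuals), stdev(residuals))
    d = 0.0
    for i, r in enumerate(sorted(residuals), start=1):
        cdf = dist.cdf(r)
        # Check the gap just before and just after each step of the empirical CDF.
        d = max(d, abs(i / n - cdf), abs(cdf - (i - 1) / n))
    return d

# Hypothetical samples: one close to normal, one dominated by a single extreme value.
near_normal = [NormalDist().inv_cdf((i - 0.5) / 20) for i in range(1, 21)]
skewed = [0.1 * i for i in range(19)] + [50.0]
print(ks_statistic_normal(near_normal), ks_statistic_normal(skewed))
```

A larger distance indicates a stronger departure from normality; the skewed sample gives a much larger value than the near-normal one.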
Remedial measures for heteroscedasticity in regression analysis include transforming the dependent variable (for example, the log transformation, log(y); the square root transformation, y^0.5; and the Box–Cox transformation), using weighted least squares (WLS), segmenting the data, and using generalized least squares (GLS) [51,52,53].
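The response transformations listed here belong to one family. A minimal sketch of the Box–Cox transform follows; the function name and example values are illustrative, and choosing the exponent (often by maximum likelihood) is outside the scope of the sketch.

```python
import math

def box_cox(y, lam):
    """Box-Cox transform of strictly positive responses; lam = 0 gives log(y)."""
    if any(v <= 0 for v in y):
        raise ValueError("Box-Cox requires strictly positive responses")
    if lam == 0:
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]

y = [1.0, 4.0, 9.0]            # hypothetical responses
log_y = box_cox(y, 0)          # log transformation
sqrt_like = box_cox(y, 0.5)    # shifted and scaled square-root transformation
print(log_y, sqrt_like)
```

With an exponent of 0.5 the transform is a linear rescaling of the square root, so it has the same effect on the variance structure as y^0.5. After any transformation, the normality and constant variance tests should be repeated on the new residuals.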
Influential data points were found in most of the datasets from the examined literature. These data points may result from experimental error or may indicate that other forms of RSM models need to be studied. Treating influential points in regression analysis involves identifying whether a point is due to a data entry error or is a valid data point with an extreme value [54,55].
If an influential point is due to a mistake (e.g., experimental error or sampling error), it can be corrected or removed. Valid data points with extreme values can be transformed to reduce their influence, or robust regression can be used [54,55].
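To make the influence diagnostics concrete, the sketch below computes the leverage, the externally studentized residual ti, and DFFITSi for a straight-line fit with one predictor. The data are hypothetical, and the cutoff 2·sqrt(p/n) is a common rule of thumb assumed here for illustration; it is not necessarily the criterion applied in this study.

```python
import math

def influence_simple(x, y):
    """Leverage, externally studentized residual and DFFITS for y = b0 + b1*x."""
    n, p = len(x), 2
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e * e for e in resid)
    rows = []
    for xi, e in zip(x, resid):
        h = 1.0 / n + (xi - xbar) ** 2 / sxx                 # leverage
        s2_del = (sse - e * e / (1.0 - h)) / (n - p - 1)     # variance without this run
        t = e / math.sqrt(s2_del * (1.0 - h))                # studentized residual
        rows.append((h, t, t * math.sqrt(h / (1.0 - h))))    # last entry is DFFITS
    return rows

# Hypothetical data: roughly y = 2x, with the last run shifted upward.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9, 14.2, 15.8, 18.1, 26.0]
rows = influence_simple(x, y)
cutoff = 2 * math.sqrt(2 / len(x))  # assumed rule of thumb
flagged = [i for i, (_, _, d) in enumerate(rows) if abs(d) > cutoff]
print(flagged)
```

Only the shifted run is flagged in this example. Whether such a run is corrected, removed, or kept with a robust fit depends on whether it can be traced to an error, as described above.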
Asoo et al. [3] reported the integration of computer technology in RSM. They described the high-performance computing used to calculate the models and visualize the results, which helped the researchers understand the relationships between influencing variables and responses. Graphical visualization helped the researchers interpret results and make decisions [3]. However, the effect of adequate RSM models on graphical visualization was not considered.
The challenges of RSM include limitations in modeling nonlinear systems, sensitivity to experimental error, model interpretability, and model validation [3]. The nonlinear systems of an RSM experiment can be evaluated with nonlinear regression. The criteria for evaluating linear regression can also be applied to a nonlinear system. Some advanced modeling techniques, such as machine learning, neural networks, and support vector machines, can be applied and incorporated into the regression analysis technique. The experimental error can be further studied and quantified by checking the criteria of influential data points. The model’s interpretability problems can be addressed by incorporating the criteria of regression analysis. The PRESS criterion introduced in this study can be used as the model validation criterion to assess the predictive ability of other RSM models.
Sample size is an essential criterion for ensuring the power of statistical techniques. Researchers have proposed simple equations to evaluate the required sample size (n) for multiple regression models. These criteria can be applied in the calculation of RSM models. Snee [56] proposed the equation n ≥ 2p + 20. Green’s equation is n ≥ 8p + 50 [57]. The equation used by Khamis and Kepler is n ≥ 5p + 20 [58]. In these equations, n is the required number of samples, and p is the number of parameters in the RSM model. By these equations, the number of runs is too small for the RSM equations in most of the studies examined. One solution is to increase the number of replicates at each run.
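The three rules can be compared directly for a given model size. In the sketch below, the function name is illustrative and p is taken as the number of model terms excluding the intercept, which is an assumption about how the parameters are counted.

```python
def required_runs(p: int) -> dict:
    """Minimum sample sizes from the three rules of thumb quoted above."""
    return {
        "Snee": 2 * p + 20,
        "Green": 8 * p + 50,
        "Khamis-Kepler": 5 * p + 20,
    }

# Full quadratic model in three factors: 3 linear + 3 quadratic + 3 interaction terms.
print(required_runs(9))
```

For nine terms the rules ask for 38, 122, and 65 observations, respectively, whereas a three-factor Box–Behnken or central composite design typically has about 15 to 20 runs. This gap is what motivates adding replicates.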
The study used sixteen published studies to evaluate the adequacy of the RSM equation. The 16 papers used in this study employed the following commercial regression analysis software: Design Expert (10 papers), Minitab (2 papers), and MOODE (1 paper). Three papers did not report the software used. The three commercial software packages provide detailed analyses of regression calculations, including regression coefficients, t-values, and ANOVA tables. However, most users struggle to utilize the calculation results of these programs. Ten papers, accounting for 62.5% of the total literature analyzed, utilized Design Expert software. However, only one study [28] reported the appropriate equation.
Based on the results of this study, several suggestions are proposed for the application of RSM in experimental design within engineering fields.
  • For engineers using RSM, receiving complete regression analysis training will help them in their research work. Engineers not only need to be able to use commercial software programs to calculate the estimated values of parameter coefficients but also need to be familiar with screening influencing variables, checking the conditions of regression analysis assumptions, and examining all possible influencing data points. Training in complete regression techniques can enhance researchers’ ability to establish appropriate RSM equations.
  • Ask a statistician for help with the experimental design and verify the validity of the regression calculation.
  • Many commercial software programs can calculate RSM models and create precise contour and response surface plots. The backward elimination technique is beneficial in finding an adequate equation. It is recommended to integrate the backward elimination method into commercial software to assist researchers in developing suitable RSM equations.
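The backward elimination procedure recommended here can be sketched in a few dozen lines: fit the model, find the term with the largest p-value from the t-test, delete it if that p-value exceeds the preset level, and refit. The code below is a minimal standard-library illustration, not the implementation used in this study. All function names and the toy data are assumptions, the p-value is obtained by numerical integration of the t density, and the sketch does not enforce the hierarchy rule used in Appendix A.2 (keeping a linear term when its quadratic or interaction term remains).

```python
import math

def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value of Student's t, by Simpson integration of the density."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda u: c * (1 + u * u / df) ** (-(df + 1) / 2)
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps
    area = pdf(0) + pdf(a)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 1.0 - 2.0 * area * h / 3.0)

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(a)
    m = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        d = m[col][col]
        m[col] = [v / d for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]

def fit(X, y):
    """OLS coefficients and two-sided p-values (assumes a non-zero residual variance)."""
    n, k = len(X), len(X[0])
    inv = invert([[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)])
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    sse = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(X, y))
    s2 = sse / (n - k)
    pvals = [t_two_sided_p(beta[i] / math.sqrt(s2 * inv[i][i]), n - k) for i in range(k)]
    return beta, pvals

def backward_eliminate(columns, y, alpha=0.05):
    """Drop the least significant term, one per pass, until all p-values <= alpha.

    columns maps a term name to its list of values; the intercept is always kept.
    """
    terms = list(columns)
    while True:
        X = [[1.0] + [columns[t][i] for t in terms] for i in range(len(y))]
        beta, pvals = fit(X, y)
        if not terms:
            return terms, beta
        worst = max(range(len(terms)), key=lambda j: pvals[j + 1])
        if pvals[worst + 1] <= alpha:
            return terms, beta
        terms.pop(worst)

# Toy replicated 2x2 factorial (hypothetical): y depends on x1 only.
x1 = [-1, 1, -1, 1, -1, 1, -1, 1]
x2 = [-1, -1, 1, 1, -1, -1, 1, 1]
noise = [0.1, -0.1, -0.1, 0.1, -0.1, 0.1, 0.1, -0.1]
y = [10 + 5 * a + e for a, e in zip(x1, noise)]
terms, beta = backward_eliminate({"x1": x1, "x2": x2}, y)
print(terms, beta)
```

Deleting one term per pass and refitting is the key difference from removing every term with p > 0.05 after a single calculation: the p-values of the remaining terms change once a term is dropped, so each deletion must be followed by a new fit.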

5. Conclusions

This study compiled sixteen datasets from the literature in three engineering fields to evaluate the adequacy of RSM equations with complete regression analysis. The results of this study raise some critical issues regarding the use of RSM models in engineering research, including the selection of the full equation without considering statistical validation, the removal of all variables with p-values above a preset value, the presence of non-normality and non-constant variance conditions in the data set, and the presence of influential data points.
These issues need to be considered in RSM modeling. The sample size should be increased to enhance statistical power. Some suggestions for engineering researchers include training them in the complete regression technique, seeking the assistance of a statistician in experimental design and data analysis, and incorporating the backward elimination technique into commercial software programs, especially RSM software.

Author Contributions

Conceptualization, H.-Y.C. and C.C.; methodology, H.-Y.C. and C.C.; software, C.C.; formal analysis, H.-Y.C.; investigation, H.-Y.C. and C.C.; data curation, H.-Y.C.; writing – original draft preparation, H.-Y.C. and C.C.; writing – review and editing, H.-Y.C. and C.C.; visualization, C.C.; supervision, C.C.; project administration, C.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data is unavailable because a statement is still required.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Appendix A.1. Data from the Coating Experiment [49]

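| Run | Speed | Pressure | Distance | Mean, y1 | Standard Deviation, y2 |
|-----|-------|----------|----------|----------|------------------------|
| 1 | −1 | −1 | −1 | 24 | 12.5 |
| 2 | 0 | −1 | −1 | 120.3 | 8.4 |
| 3 | +1 | −1 | −1 | 213.7 | 42.8 |
| 4 | −1 | 0 | −1 | 86 | 3.5 |
| 5 | 0 | 0 | −1 | 136.6 | 80.4 |
| 6 | +1 | 0 | −1 | 340.7 | 16.2 |
| 7 | −1 | +1 | −1 | 112.3 | 27.6 |
| 8 | 0 | +1 | −1 | 256.3 | 4.6 |
| 9 | +1 | +1 | −1 | 271.7 | 23.6 |
| 10 | −1 | −1 | 0 | 81 | 0.0 |
| 11 | 0 | −1 | 0 | 101.7 | 17.7 |
| 12 | +1 | −1 | 0 | 357.0 | 32.9 |
| 13 | −1 | 0 | 0 | 171.3 | 15.0 |
| 14 | 0 | 0 | 0 | 372.0 | 0.0 |
| 15 | +1 | 0 | 0 | 501.7 | 92.5 |
| 16 | −1 | +1 | 0 | 264.0 | 63.5 |
| 17 | 0 | +1 | 0 | 427.0 | 88.6 |
| 18 | +1 | +1 | 0 | 730.7 | 21.1 |
| 19 | −1 | −1 | +1 | 220.7 | 133.8 |
| 20 | 0 | −1 | +1 | 239.7 | 23.5 |
| 21 | +1 | −1 | +1 | 422.0 | 18.5 |
| 22 | −1 | 0 | +1 | 199.0 | 29.4 |
| 23 | 0 | 0 | +1 | 485.3 | 44.7 |
| 24 | +1 | 0 | +1 | 673.7 | 158.2 |
| 25 | −1 | +1 | +1 | 176.7 | 55.5 |
| 26 | 0 | +1 | +1 | 501.0 | 138.9 |
| 27 | +1 | +1 | +1 | 1010.0 | 142.4 |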

Appendix A.2. Evaluation of the RSM Models for yRa, Ra Surface Roughness [28]

1. $y_{Ra} = 0.210 - 0.0138x_1 + 15.133x_2 - 0.166x_3 + 0.0000308x_1^2 + 32.24x_2^2 + 0.419x_3^2 - 0.0268x_1x_2 + 0.000110x_1x_3 + 1.375x_2x_3$
p-values: x1, 0.403; x2, 0.490; x3, 0.541; x1², 0.245; x2², 0.468; x3², 0.059; x1x2, 0.192; x1x3, 0.773; x2x3, 0.031.
R² = 0.982, R²adj = 0.954, s = 0.174, PRESS = 2.470
Delete x1x3 and recalculate the equation.
2. $y_{Ra} = 0.297 - 0.141x_1 + 15.102x_2 - 0.0827x_3 + 0.0000307x_1^2 + 33.310x_2^2 + 0.0418x_3^2 - 0.0268x_1x_2 + 1.375x_2x_3$
p-values: x1, 0.358; x2, 0.457; x3, 0.559; x1², 0.210; x2², 0.433; x3², 0.041; x1x2, 0.159; x2x3, 0.020.
Delete x2² and recalculate the equation.
3. $y_{Ra} = -0.863 - 0.0177x_1 + 30.425x_2 - 0.0606x_3 + 0.0000366x_1^2 + 0.0463x_3^2 - 0.0268x_1x_2 + 1.375x_2x_3$
p-values: x1, 0.222; x2, <0.001; x3, 0.652; x1², 0.115; x3², 0.018; x1x2, 0.147; x2x3, 0.015.
Delete x1x2 and recalculate the equation.
4. $y_{Ra} = 1.022 - 0.0239x_1 + 22.228x_2 - 0.0599x_3 + 0.0000366x_1^2 + 0.463x_3^2 + 1.372x_2x_3$
p-values: x1, 0.120; x2, <0.001; x3, 0.680; x1², 0.137; x3², 0.023; x2x3, 0.020.
Delete x1² and recalculate the equation.
5. $y_{Ra} = -2.231 - 0.00125x_1 + 22.228x_2 + 0.00395x_3 + 0.0590x_3^2 + 1.372x_2x_3$
p-values: x1, 0.181; x2, 0.001; x3, 0.979; x3², 0.004; x2x3, 0.020.
Although the linear term x3 is insignificant (p > 0.05), the quadratic term x3² and the interaction term x2x3 are significant in the model. The linear term x3 is therefore retained in the equation to preserve model hierarchy.
Delete x1 and recalculate the equation.
6. $y_{Ra} = -2.714 + 22.288x_2 + 0.000289x_3 + 0.0583x_3^2 + 1.372x_2x_3$
p-values: x2, <0.001; x3, 0.0199; x3², 0.005; x2x3, 0.003.
R² = 0.958, R²adj = 0.942, s = 0.195, PRESS = 1.013
The normality and constant variance tests are passed. One suspicious data point is found, point 7 (ti = 2.350, DFFITSi = 2.058).

Appendix A.3. Evaluation of the RSM Models for yFc, Tangential Force [28]

1. $y_{Fc} = 264.240 - 0.294x_1 + 199.40x_2 - 7.094x_3 + 0.000473x_1^2 + 3750.468x_2^2 + 2.668x_3^2 - 0.0707x_1x_2 + 0.00244x_1x_3 + 62.058x_2x_3$
p-values: x1, 0.725; x2, 0.858; x3, 0.481; x1², 0.736; x2², 0.143; x3², 0.029; x1x2, 0.943; x1x3, 0.244; x2x3, 0.051.
R² = 0.994, R²adj = 0.985, s = 9.039, PRESS = 3864.088
Delete x1x2 and recalculate the equation.
2. $y_{Fc} = 269.233 - 0.311x_1 + 177.735x_2 - 7.092x_3 + 0.000473x_1^2 + 3750.468x_2^2 + 2.6688x_3^2 + 0.0277x_1x_3 + 62.050x_2x_3$
p-values: x1, 0.675; x2, 0.857; x3, 0.444; x1², 0.715; x2², 0.112; x3², 0.018; x1x3, 0.205; x2x3, 0.034.
Delete x1² and recalculate the equation.
3. $y_{Fc} = 241.953 - 0.0410x_1 + 62.922x_2 - 6.563x_3 + 4000.060x_2^2 + 2.722x_3^2 + 0.0243x_1x_3 + 62.050x_2x_3$
p-values: x1, 0.490; x2, 0.943; x3, 0.445; x2², 0.062; x3², 0.007; x1x3, 0.178; x2x3, 0.024.
Delete x1x3 and recalculate the equation.
4. $y_{Fc} = 261.851 - 0.104x_1 + 58.033x_2 + 0.842x_3 + 4010.659x_2^2 + 2.764x_3^2 + 62.050x_2x_3$
p-values: x1, 0.027; x2, 0.950; x3, 0.905; x2², 0.072; x3², 0.008; x2x3, 0.027.
Delete x2² and recalculate the equation.
5. $y_{Fc} = 52.737 - 0.103x_1 + 1902.950x_2 + 4.737x_3 + 3.543x_3^2 + 62.505x_2x_3$
p-values: x1, 0.046; x2, <0.001; x3, 0.544; x3², 0.002; x2x3, 0.046.
R² = 0.988, R²adj = 0.982, s = 9.654, PRESS = 2279.002
The normality and constant variance tests are passed. Two suspicious data points are found, the 12th (ti = 2.322) and the 13th (ti = −2.326).

Appendix A.4. Evaluation of the RSM Models for yAfd, Average Fiber Diameter [37]

1. $y_{Afd} = 274.667 + 60.375x_1 + 17.000x_2 + 15.875x_3 - 41.708x_1^2 + 4.542x_2^2 + 34.792x_3^2 + 6.000x_1x_2 + 9.750x_1x_3 - 17.501x_2x_3$
p-values: x1, <0.001; x2, <0.014; x3, 0.019; x1², 0.002; x2², 0.535; x3², 0.004; x1x2, 0.402; x1x3, 0.197; x2x3, 0.044.
R² = 0.982, R²adj = 0.95, s = 13.099, PRESS = 1,113,250
Delete x2² and recalculate the equation.
2. $y_{Afd} = 277.462 + 60.375x_1 + 17.000x_2 + 15.875x_3 - 42.058x_1^2 + 34.44x_3^2 + 6.001x_1x_2 + 9.750x_1x_3 - 17.501x_2x_3$
p-values: x1, <0.001; x2, <0.014; x3, 0.019; x1², 0.002; x3², 0.535; x1x2, 0.373; x1x3, 0.169; x2x3, 0.031.
Delete x1x2 and recalculate the equation.
3. $y_{Afd} = 277.462 + 60.375x_1 + 17.000x_2 + 15.875x_3 - 42.058x_1^2 + 34.442x_3^2 + 9.750x_1x_3 - 17.501x_2x_3$
p-values: x1, <0.001; x2, <0.006; x3, 0.009; x1², <0.001; x3², 0.001; x1x3, 0.160; x2x3, 0.026.
Delete x1x3 and recalculate the equation.
4. $y_{Afd} = 277.462 + 60.375x_1 + 17.001x_2 + 15.875x_3 - 42.058x_1^2 + 34.442x_3^2 + 17.501x_2x_3$
R² = 0.97, R²adj = 0.947, s = 13.502, PRESS = 675,548
The normality test is passed. The constant variance test fails (p = 0.038). One suspicious data point is found, point 12 (DFFITSi = 2.981).

References

  1. Myers, R.H.; Montgomery, D.C.; Anderson-Cook, C.M. Response Surface Methodology: Process and Product Optimization Using Designed Experiments; John Wiley & Sons: Hoboken, NJ, USA, 2016.
  2. Anderson, M.J.; Whitcomb, P.J. RSM Simplified: Optimizing Processes Using Response Surface Methods for Design of Experiments; Productivity Press: University Park, IL, USA, 2016.
  3. Asoo, H.R.; Alakali, J.S.; Ikya, J.K.; Yusufu, M.I. Historical background of RSM. In Response Surface Methods-Theory, Applications and Optimization Techniques; IntechOpen: London, UK, 2024.
  4. Baş, D.; Boyacı, İ.H. Modeling and optimization I: Usability of response surface methodology. J. Food Eng. 2007, 78, 836–845.
  5. Chelladurai, S.J.S.; Murugan, K.; Ray, A.P.; Upadhyaya, M.; Narasimharaj, V.; Gnanasekaran, S. Optimization of process parameters using response surface methodology: A review. Mater. Today Proc. 2021, 37, 1301–1304.
  6. De Oliveira, L.G.; de Paiva, A.P.; Balestrassi, P.P.; Ferreira, J.R.; da Costa, S.C.; da Silva Campos, P.H. Response surface methodology for advanced manufacturing technology optimization: Theoretical fundamentals, practical guidelines, and survey literature review. Int. J. Adv. Manuf. Technol. 2019, 104, 1785–1837.
  7. Veza, I.; Spraggon, M.; Fattah, I.R.; Idris, M. Response surface methodology (RSM) for optimizing engine performance and emissions fueled with biofuel: Review of RSM for sustainability energy transition. Results Eng. 2023, 18, 101213.
  8. Boshagh, F.; Rostami, K. A review of application of experimental design techniques related to dark fermentative hydrogen production. J. Renew. Energy Environ. 2020, 7, 27–42.
  9. Mäkelä, M. Experimental design and response surface methodology in energy applications: A tutorial review. Energy Convers. Manag. 2017, 151, 630–640.
  10. Mishra, P.; Mohapatra, T.; Sahoo, S.S.; Padhi, B.N.; Giri, N.C.; Emara, A.; AboRas, K.M. Experimental assessment and optimization of the performance of a biodiesel engine using response surface methodology. Energy Sustain. Soc. 2024, 14, 28.
  11. Pais-Chanfrau, J.M.; Núñez-Pérez, J.; del Carmen Espin-Valladares, R.; Lara-Fiallos, M.V.; Trujillo-Toledo, L.E. Uses of the response surface methodology for the optimization of agro-industrial processes. In Response Surface Methodology in Engineering Science; IntechOpen: London, UK, 2021.
  12. Boublia, A.; Lebouachera, S.E.I.; Haddaoui, N.; Guexxout, Z.; Ghriga, A.A.; Hasanzadeh, M.; Benguerba, Y.; Drouiche, N. State-of-the-art review on recent advances in polymer engineering: Modeling and optimization through response surface methodology approach. Polym. Bull. 2023, 80, 5999–6031.
  13. Bezerra, M.A.; Ferreira, S.L.C.; Novaes, C.G.; Dos Santos, A.M.P.; Valasques, G.S.; da Mata Cerqueira, U.M.F.; dos Santos Alves, J.P. Simultaneous optimization of multiple responses and its application in Analytical Chemistry: A review. Talanta 2019, 194, 941–959.
  14. Dejaegher, B.; Vander Heyden, Y. Experimental designs and their recent advances in set-up, data interpretation, and analytical applications. J. Pharm. Biomed. Anal. 2011, 56, 141–158.
  15. Szpisják-Gulyás, N.; Al-Tayawi, A.N.; Horváth, Z.H.; László, Z.; Kertész, S.; Hodúr, C. Methods for experimental design, central composite design and the Box–Behnken design, to optimise operational parameters: A review. Acta Aliment. 2023, 52, 521–537.
  16. Olabinjo, O.O. Response surface techniques as an inevitable tool in optimization process. In Response Surface Methods-Theory, Applications and Optimization Techniques; IntechOpen: London, UK, 2024.
  17. Myers, R.H. Classical and Modern Regression with Applications, 2nd ed.; Duxbury Press: Monterey, CA, USA, 1990.
  18. Berger, D.E. Introduction to Multiple Regression. Master’s Thesis, Claremont Graduate University, Claremont, CA, USA, 2008.
  19. Meloun, M.; Militký, J. Detection of single influential points in OLS regression model building. Anal. Chim. Acta. 2001, 439, 169–191.
  20. Bhattacharya, S. Central composite design for response surface methodology and its application in pharmacy. In Response Surface Methodology in Engineering Science; IntechOpen: London, UK, 2021.
  21. Rodrigues, A.C. Response surface analysis: A tutorial for examining linear and curvilinear effects. Rev. Adm. Contemp. 2021, 25, e200293.
  22. Reza, A.; Chen, L.; Mao, X. Response surface methodology for process optimization in livestock wastewater treatment: A review. Heliyon 2024, 10, e30326.
  23. Won, J.K.; Lee, J.H.; Lee, J.T.; Lee, E.S. The selection on the optimal condition of Si-wafer final polishing by combined Taguchi method and respond surface method. Trans. Korean Soc. Eng. A. 2008, 17, 21–28.
  24. Lee, E.S.; Hwang, S.C.; Lee, J.T.; Won, J.K. A study on the characteristic of parameters by the response surface method in final wafer polishing. Int. J. Precis. Eng. Manuf. 2009, 10, 25–30.
  25. Zhang, H.; Jiang, Z.; Guo, C. Simulation-based optimization of dispatching rules for semiconductor wafer fabrication system scheduling by the response surface methodology. Int. J. Adv. Manuf. Technol. 2009, 41, 110–121.
  26. Seo, J.; Kim, J.H.; Lee, M.; You, K.; Moon, J.; Lee, D.H.; Paik, U. Multi-objective optimization of tungsten CMP slurry for advanced semiconductor manufacturing using a response surface methodology. Mater. Des. 2017, 117, 131–138.
  27. Saleem, M.M.; Somá, A. Design of experiments based factorial design and response surface methodology for MEMS optimization. Microsyst. Technol. 2015, 21, 263–276. [Google Scholar] [CrossRef]
  28. Noordin, M.Y.; Venkatesh, V.C.; Sharif, S.; Elting, S.; Abdullah, A. Application of response surface methodology in describing the performance of coated carbide tools when turning AISI 1045 steel. J. Mater. Process. Technol. 2004, 145, 46–58. [Google Scholar] [CrossRef]
  29. Bouacha, K.; Yallese, M.A.; Mabrouki, T.; Rigal, J.F. Statistical analysis of surface roughness and cutting forces using response surface methodology in hard turning of AISI 52100 bearing steel with CBN tool. Int. J. Refract. Met. Hard Mater. 2010, 28, 349–361. [Google Scholar] [CrossRef]
  30. Elbah, M.; Aouici, H.; Meddour, I.; Yallese, M.A.; Boulanouar, L. Application of response surface methodology in describing the performance of mixed ceramic tool when turning AISI 4140 steel. Mech. Ind. 2016, 17, 309. [Google Scholar] [CrossRef]
  31. Campos, d.S.P.H.; de Carvalho Paes, V.; de Carvalho Gonçalves, E.D.; Ferreira, J.R.; Balestrassi, P.P.; Davim, J.P. Optimizing production in machining of hardened steels using response surface methodology. Acta Sci. Technol. 2019, 41, e38091. [Google Scholar] [CrossRef]
  32. Khalil, K.; Mohd, A.; Mohamad, C.O.C.; Faizul, Y.; Ariffin, S.Z. The optimization of machining parameters on surface roughness for AISI D3 steel. J. Phys. Conf. Ser. 2021, 1874, 012063. [Google Scholar] [CrossRef]
  33. Pajaie, H.S.; Taghizadeh, M. Optimization of nano-sized SAPO-34 synthesis in methanol-to-olefin reaction by response surface methodology. J. Indust. Eng. Chem. 2015, 24, 59–70. [Google Scholar] [CrossRef]
  34. Jourshabani, M.; Badiei, A.; Lashgari, N.; Mohammadi Ziarani, G. Application of response surface methodology as an efficient approach for optimization of operational variables in benzene hydroxylation to phenol by V/SBA-16 nanoporous catalyst. J. Nanostructures 2016, 6, 107–115. [Google Scholar]
  35. Sheng, X.; Cheng, Y.; Yao, Y.; Zhao, Z. Optimization of synthesizing upright ZnO rod arrays with large diameters through response surface methodology. Processes 2020, 8, 655. [Google Scholar] [CrossRef]
  36. Rakhmanova, A.; Kalybekkyzy, S.; Soltabayev, B.; Bissenbay, A.; Kassenova, N.; Bakenov, Z.; Mentbayeva, A. Application of response surface methodology for optimization of nanosized zinc oxide synthesis conditions by electrospinning technique. Nanomaterials 2022, 12, 1733. [Google Scholar] [CrossRef]
  37. Pakolpakçıl, A.; Kılıç, A.; Draczynski, Z. Optimization of the centrifugal spinning parameters to prepare poly (butylene succinate) nanofibers mats for aerosol filter applications. Nanomaterials 2023, 13, 3150. [Google Scholar] [CrossRef]
  38. Sreekumar, S.; Chakrabarti, S.; Hewitt, N.; Mondol, J.D.; Shah, N. Performance prediction and optimization of nanofluid-based PV/T using numerical simulation and response surface methodology. Nanomaterials 2024, 14, 774. [Google Scholar] [CrossRef]
  39. Rawlings, J.O.; Pantula, S.G.; Dickey, D. Applied Regression Analysis; Springer: New York, NY, USA, 1998. [Google Scholar]
  40. Allen, M.P. Understanding Regression Analysis; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2004. [Google Scholar]
  41. Dielman, T.E. Applied Regression Analysis for Business and Economics, 4th ed.; Duxbury/Thomson Learning: Pacific Grove, CA, USA, 2005. [Google Scholar]
  42. Mendenhall, W.; Sincich, T. Regression Analysis. A Second Course in Statistics, 12th ed.; Pearson: London, UK, 2012. [Google Scholar]
  43. Montgomery, D.C.; Peck, E.A.; Vining, G.G. Introduction to Linear Regression Analysis; John Wiley & Sons: Hoboken, NJ, USA, 2021. [Google Scholar]
  44. Ryan, T.P. Modern Regression Methods; John Wiley & Sons: Hoboken, NJ, USA, 2008. [Google Scholar]
  45. Wilcox, R.R.; Keselman, H.J. Modern regression methods that can substantially increase power and provide a more accurate understanding of associations. Eur. J. Pers. 2012, 26, 165–174. [Google Scholar] [CrossRef]
  46. Marinoiu, C. Classic and modern in regression modelling. Econom. Insights Trends Chall. 2017, 69, 41–50. [Google Scholar]
  47. Rowley, E.K. Comparison of Variable Selection Methods. Ph.D. Thesis, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA, 2019. [Google Scholar]
  48. Chowdhury, M.Z.I.; Turin, T.C. Variable selection strategies and its importance in clinical prediction modelling. Fam. Med. Community Health 2020, 8, e000262. [Google Scholar] [CrossRef]
  49. Box, G.E.P.; Draper, N.R. Response Surface, Mixtures, and Ridge Analysis; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2007. [Google Scholar]
  50. Yang, K.; Tu, J.; Chen, T. Homoscedasticity: An overlooked critical assumption for linear regression. Gen. Psychiatry 2019, 32, e100148. [Google Scholar] [CrossRef]
  51. Wang, G.C.; Akabay, C.K. Heteroscedasticity: How to handle in regression modeling. J. Bus. Forecast. 1994, 13, 11. [Google Scholar]
  52. Agunbiade, D.A.; Adeboye, N.O. Estimation of heteroscedasticity effects in a classical linear regression model of a cross-sectional data. J. Pro. Appl. Math. 2012, 4, 18–28. [Google Scholar]
  53. Kumar, N.K. Autocorrelation and heteroscedasticity in regression analysis. J. Business Soc. Sci. 2023, 5, 9–20. [Google Scholar] [CrossRef]
  54. Stevens, J.P. Outliers and influential data points in regression analysis. Psychol. Bull. 1984, 95, 334. [Google Scholar] [CrossRef]
  55. Chatterjee, S.; Hadi, A.S. Regression Analysis by Example; John Wiley & Sons: Hoboken, NJ, USA, 2015. [Google Scholar]
  56. Snee, R.D. Validation of regression models: Methods and examples. Technometrics 1977, 19, 415–428. [Google Scholar] [CrossRef]
  57. Green, S.B. How many subjects does it take to do a regression analysis? Multivar. Behav. Res. 1991, 26, 499–510. [Google Scholar] [CrossRef]
  58. Khamis, H.J.; Kepler, M. Sample size in multiple regression: 20 + 5k. J. Appl. Stat. Sci. 2010, 17, 505–517. [Google Scholar]
Figure 1. Contour plots of the complete and adequate equations for the final polished surface roughness response of a silicon wafer. The contours present the effect of using the RSM equation on the relationship between the response variables and the influencing variables. (a) Full equation. (b) Adequate equation.
Figure 2. Contour plots of the complete and adequate equations for the average absolute roughness of AISI 52100 bearing steel. The contours present the effect of using the RSM equation on the relationship between the response variables and the influencing variables. (a) Full equation. (b) Adequate equation.
Figure 3. Contour plots of the complete and adequate equations for the average fiber diameter response. The contours present the effect of using the RSM equation on the relationship between the response variables and the influencing variables. (a) Full equation. (b) Adequate equation.
Table 1. The number of studies using each experimental methodology, determined using Google Scholar.
| Year | Full Factorial Design (CFD) | Box–Behnken Design (BBD) | Central Composite Design (CCD) |
|---|---|---|---|
| 2019 | 145,000 | 7250 | 218,000 |
| 2020 | 151,000 | 8450 | 235,000 |
| 2021 | 147,000 | 9860 | 224,000 |
| 2022 | 132,000 | 12,400 | 207,000 |
| 2023 | 114,000 | 13,600 | 159,000 |
| 2024 | 88,300 | 16,500 | 113,000 |
Table 2. Published data in the literature on engineering for evaluating the adequate equation of the response surface methodology.
| Study | Targets | Number of Data Points and Experimental Design | Software | Model Evaluation | Criteria for Parameter Selection | Reported Model | Optimization |
|---|---|---|---|---|---|---|---|
| I. Semiconductor | | | | | | | |
| Won et al. [23]: ySR = surface roughness; x1 = applied pressure; x2 = platen speed; x3 = slurry ratio | Si-wafer polishing | 15, CCD | Not reported | Not reported | Not reported | Not reported | Contour plot, response surface plot |
| Lee et al. [24]: ySR = surface roughness; x1 = pressure; x2 = wheel speed; x3 = time | Final wafer polishing | 15, BBD | MINITAB | R² | Not reported | Full models | Contour plot, response surface plot |
| Zhang et al. [25]: yCT = CT; yWTP = WTP; yTP = TP; x1 = Ub; x2 = C1; x3 = C2 | Wafer fabricating | 20, CCD | Design Expert, version not mentioned | ANOVA lack of fit, PRESS, R², R²adj | Not reported | Full models | Contour plot, response surface plot |
| Seo et al. [26]: yw (WAPR); yoxide (Oxide MRR); x1 = Free concentration; x2 = H2O2; x3 = SiO2 | | 15, CCD | MINITAB | ANOVA, R², R²adj | t-value, p-value | Full models | Contour plot, response surface plot |
| Saleem and Soma [27]: yPV = pull-in voltage; x1 = TEL; x2 = TEN; x3 = TSL; x4 = TSW | MEMS | 27, BBD | Not reported | R², R²adj | F-value, p-value | Full models | Contour plot, response surface plot |
| II. Steel materials | | | | | | | |
| Noordin et al. [28]: yRa = surface roughness; yRc = tangential force; x1 = cutting speed; x2 = SCEA | Coated carbide tools, AISI 1045 steel | 16, CCD | Design Expert Ver. 6.0 | ANOVA, R², R²adj, lack of fit, PRESS | p-value, backward elimination | yRa = f(x2, x3, x2x3, x3²); yRc = f(x1, x2, x3, x2x3, x3²) | Contour plot, response surface plot |
| Bouacha et al. [29]: yRa = Ra; yRt = Rt; yRz = Rz; x1 = VC; x2 = f; x3 = ap | Surface roughness, cutting forces, AISI 52100 steel | 27, Taguchi orthogonal array | Not reported | ANOVA, R², R²adj | F-value, p-value; variables deleted at once | yRa = f(x1, x2, x3, x1x2); yRt = f(x1, x2, x1x2, x2²); yRz = f(x1, x2, x1x2, x2²) | Contour plot, response surface plot |
| Elbah et al. [30]: yFa = Fa; yFr = Fr; yFt = Ft; yRa = Ra; x1 = depth of cut; x2 = feed rate; x3 = cutting speed | Mixed ceramic tool, AISI 4140 steel | 27, CCD | Design Expert 8.0.7 | ANOVA, R², R²adj | F-value; some variables are not significant in ANOVA | yFa to yRa: full model | Contour plot, response surface plot |
| Campos et al. [31]: yTime = Time; yRa = Ra; yRt = Rt; x1 = VC; x2 = f; x3 = ap | Machining of hardened steel | 19, CCD | Design Expert, version not mentioned | ANOVA, R², R²adj | Sequential model for some terms | yTime to yRt: full model | Contour plot, response surface plot |
| Khalil et al. [32]: yRS = surface roughness; x1 = cutting speed; x2 = feed rate; x3 = depth of cut | Surface roughness, AISI D3 steel | 20, CCD | Design Expert, version not mentioned | Lack of fit | F-value, p-value (cutoff p < 0.1) | Full model | Contour plot, response surface plot |
| III. Nanomaterials | | | | | | | |
| Pajaie and Taghizadeh [33]: yEthylene = ethylene; yPropylene = propylene; x1 = MW aging time; x2 = US aging time; x3 = HT time | Yield | 15, BBD | Design Expert ver. 6 | R², R²adj | F-value, p-value, ANOVA table | Full model | Contour plot, response surface plot |
| Jourshabani et al. [34]: yphenol yield = phenol yield; x1 = temperature; x2 = H2O2 content; x3 = catalyst | Benzene hydroxylation | 20, CCD | Design Expert Ver. 7.1.3 | R², R²adj | F-value, p-value | Full model | Contour plot, response surface plot |
| Sheng et al. [35]: yTC002 = TC002; yAspect ratio = aspect ratio; yD = D; x1 = concentration; x2 = temperature; x3 = catalyst | | 27, CCD | MOODE Ver. 10 | ANOVA lack of fit | Not reported | yTC002: not reported; yAspect ratio: not reported; log(yD + 0.5) = f(x1, x2, x3, x4, x1x2, x2x3, x2x4, x3x4, x3²) | Contour plot, response surface plot (for each response) |
| Rakhmanova et al. [36]: yZinc oxide = zinc oxide synthesis; x1 = applied potential; x2 = distance; x3 = temperature | | 15, BBD | Design Expert ver. 8.0.7.1 | R², R²adj, lack of fit, PRESS | Not reported | Not reported | Contour plot, response surface plot |
| Pakolpakcil et al. [37]: yAfd = average fiber diameter; x1 = concentration; x2 = rotational speed; x3 = needle size | Poly nanofiber mats | 15, BBD | Design Expert ver. 13 | ANOVA sequential model, quadratic, and interaction | F-value, p-value | Full model | Contour plot, response surface plot |
| Sreekumar et al. [38]: yNth = Nth; ynele = nele; ynex,th = nex,th; ynex,ele = nex,ele; x1 = φ%; x2 = m; x3 = I; x4 = Ti | Nanofluid-based PV/T | 27, CCD | Design Expert, version not mentioned | ANOVA, R², R²adj | F-value, p-value | Full model | Contour plot, response surface plot |
Table 3. Criteria for the evaluation of the RSM equations.
| Criterion | Description | Cutoffs |
|---|---|---|
| R² | The coefficient of determination measures the proportion of the variation in the response that is explained by the independent variables. | R² value near 1.0 |
| Adjusted R² | This value takes into account the impact of the number of independent variables on the R² value. The closer the adjusted R² (R²adj) is to 1.0, the better the descriptive ability of the regression equation. | R²adj value closer to 1.0 |
| s | The standard error of the estimate represents the variability of the observed responses around the fitted equation and indicates the precision of the estimates. | A smaller s suggests a more precise estimate, meaning the estimated values are likely closer to the actual population values. |
| t-value | The t-value is used to test the null hypothesis that the coefficient of an independent variable is equal to zero. | A large absolute t-value for the independent variable indicates that the coefficient is statistically significant and not equal to zero. |
| p-value | The p-value of a variable coefficient is calculated from its t-value and is used to test the null hypothesis that the coefficient of the independent variable is equal to zero. | The p-value represents the probability of obtaining a t-value at least this extreme if the coefficient were actually zero. A smaller p-value gives stronger evidence that the variable is valid. |
| PRESS, predicted residual error sum of squares | This evaluates the predictive ability of the regression model. | The smaller, the better |
| Normality test | The normality test is used to evaluate whether the dataset is normally distributed. The normality test technique used in this study is the Kolmogorov–Smirnov method. | The p-value calculated with this method is compared with the preset value (p = 0.05). |
| Constant variance test | This assesses whether the variance of the dependent variable (response) is constant across the range of the data. The testing technique used in this study is the Spearman rank correlation method. | The cutoff value is p = 0.05. |
| ti, externally studentized residuals | This is the residual of a data point divided by its standard error, where the standard error is estimated from a model built without that data point. | Values of ±2.0 are usually used to indicate the possibility of an outlier. |
| DFFITSi | This evaluates the effect of a data point on its own prediction. It is the scaled change in the predicted value when the observed value is removed. | The cutoffs of DFFITSi are ±2.0. |
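The criteria in Table 3 can be illustrated with a short calculation. The following is a minimal sketch in plain Python, not the software or procedure used in the studies above; the function names (`invert`, `diagnostics`) and the six-point straight-line dataset are hypothetical and chosen only so that the last point behaves as an influential observation. It uses the standard leave-one-out identities for ordinary least squares, so no model is refitted.

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def diagnostics(X, y):
    """Fit y = X b by least squares and return the Table 3 style criteria.

    X is a list of rows; the first entry of each row is 1.0 for the intercept.
    """
    n, p = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in X]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    # Leverage h_i = x_i' (X'X)^-1 x_i, the diagonal of the hat matrix.
    lev = [sum(r[i] * inv[i][j] * r[j] for i in range(p) for j in range(p)) for r in X]
    sse = sum(e * e for e in resid)
    y_bar = sum(y) / n
    sst = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p)
    s = math.sqrt(sse / (n - p))
    # PRESS: sum of squared leave-one-out prediction errors e_i / (1 - h_i).
    press = sum((e / (1.0 - h)) ** 2 for e, h in zip(resid, lev))
    t_ext, dffits = [], []
    for e, h in zip(resid, lev):
        # Residual standard deviation with observation i deleted.
        s_del = math.sqrt((sse - e * e / (1.0 - h)) / (n - p - 1))
        t_i = e / (s_del * math.sqrt(1.0 - h))
        t_ext.append(t_i)
        dffits.append(t_i * math.sqrt(h / (1.0 - h)))
    return {"beta": beta, "r2": r2, "adj_r2": adj_r2, "s": s, "sse": sse,
            "press": press, "leverage": lev, "t_ext": t_ext, "dffits": dffits}

# Hypothetical data: a straight line with one unusual response at the last point.
X_demo = [[1.0, float(x)] for x in range(6)]
y_demo = [0.1, 0.9, 2.2, 2.8, 4.1, 9.0]
report = diagnostics(X_demo, y_demo)
flagged = [i for i, d in enumerate(report["dffits"]) if abs(d) > 2.0]
print("R2 =", round(report["r2"], 3), "PRESS =", round(report["press"], 3))
print("points with |DFFITS| > 2:", flagged)
```

The same function accepts a quadratic RSM design matrix (columns for linear, interaction, and squared terms). The ±2.0 cutoffs follow Table 3; other texts use cutoffs that depend on the number of parameters and observations.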
Table 4. Results of the evaluation of adequate RSM equations for semiconductor manufacturing.
| Source | Reported Equations | Contour and Surface Response Plot | Adequate Equations | Normality Test | Constant Variance Test | Influential Data |
|---|---|---|---|---|---|---|
| Won et al. [23] | Not reported | Curve surface | ySR = f(x1, x3, x1x3, x1², x3²) | Passed | Passed | 2, 15 |
| Lee et al. [24] | Full models | Curve surface | ySR = f(x1, x2, x3, x1x3, x2², x3²) | Passed | Passed | 4, 6, 7, 12 |
| Zhang et al. [25] | Full models | Curve surface | yCT = f(x1, x2, x1², x2²) | Passed | Passed | 5, 13, 17 |
| | | | yWTP = f(x1, x2, x3, x1x2, x2², x3²) | Failed | Passed | 13 |
| | | | yTP = f(x1, x2, x3, x1², x2², x3², x1x2) | Failed | Failed | 5, 14, 18 |
| Seo et al. [26] | yW = full model | Curve surface | yW = f(x2) | Passed | Passed | 12, 14 |
| | yOxide = full model | | yOxide = f(x1, x3) | Passed | Passed | None |
| Saleem and Soma [27] | Full model | Curve surface | yPV = f(x1, x2, x3, x4, x1x3, x3²) | Failed | Passed | 27 |
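The constant variance column above relies on a Spearman rank correlation test. The sketch below shows one common form of such a check, assuming (the article does not spell this out) that the absolute residuals are correlated with the fitted values and that the correlation is judged with the usual t approximation. The function names and both residual sets are hypothetical, and the critical value is supplied by the caller rather than computed.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def spearman_rho(a, b):
    """Spearman rank correlation, computed as the Pearson correlation of the ranks."""
    ra, rb = average_ranks(a), average_ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return cov / math.sqrt(var_a * var_b)

def constant_variance_check(fitted, residuals, t_crit):
    """Return (rho, t statistic, passed) for |residuals| against fitted values.

    t_crit is the two-sided critical t value for n - 2 degrees of freedom,
    looked up by the caller (for example 2.447 for n = 8 at the 5% level).
    """
    n = len(fitted)
    rho = spearman_rho(fitted, [abs(e) for e in residuals])
    if abs(rho) >= 1.0:
        t_stat = math.inf
    else:
        t_stat = abs(rho) * math.sqrt((n - 2) / (1.0 - rho * rho))
    return rho, t_stat, t_stat < t_crit

# Hypothetical residuals: the first set fans out as the fitted value grows,
# the second set shows no trend in spread.
fitted_demo = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
resid_fan = [0.1, -0.3, 0.2, -0.4, 0.6, -0.5, 0.7, -0.8]
resid_flat = [0.4, -0.8, 0.1, -0.5, 0.2, -0.7, 0.3, -0.6]
fan_result = constant_variance_check(fitted_demo, resid_fan, t_crit=2.447)
flat_result = constant_variance_check(fitted_demo, resid_flat, t_crit=2.447)
print("fan-shaped residuals pass:", fan_result[2])
print("flat residuals pass:", flat_result[2])
```

A failed check of this kind suggests transforming the response or using weighted regression before the equation is used for contour plots.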
Table 5. Results of the evaluation of adequate RSM equations for steel materials.
| Source | Reported Equations | Contour and Surface Response Plot | Adequate Equations | Normality Test | Constant Variance Test | Influential Data Points |
|---|---|---|---|---|---|---|
| Noordin et al. [28] | ySR = f(x2, x3, x2x3, x3²); yTF = f(x1, x2, x3, x2x3, x3²) | Curve surface | ySR = f(x2, x3, x1x2, x2x3, x3²); yTF = f(x1, x2, x3, x2x3, x3²) | Passed | Passed | 7, 12, 13 |
| Bouacha et al. [29] | yRa = f(x1, x2, x3, x1x2) | Curve surface | yRa = f(x1, x2, x1x2, x1², x2²) | Passed | Passed | 6, 13 |
| | yRt = f(x1, x2, x1x2, x2²) | Curve surface | yRt = f(x1, x2, x1x2, x1², x2²) | Passed | Passed | 1, 4 |
| | yRz = f(x1, x2, x1x2, x2²) | Curve surface | yRz = f(x1, x2, x3, x1x2, x1x3, x1², x2²) | Failed | Passed | 5, 6 |
| Elbah et al. [30] | Full models for yFa, yFr, yFt, yRa | Curve surface | yFa = f(x1, x2, x3, x1x3, x2x3) | Passed | Passed | 26 |
| | | | yFr = f(x1, x2, x3, x1x2, x1x3) | Passed | Passed | 24 |
| | | | yFt = f(x1, x2, x3, x1x2, x1x3, x2x3) | Passed | Passed | 27 |
| | | | yRa = f(x1, x2, x1x2) | Passed | Passed | None |
| Campos et al. [31] | Full models | Curve surface | yTime = full equation | Passed | Passed | 1, 8, 16, 17 |
| | | | yRa = f(x1, x2, x3, x1x2, x1x3, x2x3, x1², x3²) | Passed | Passed | 1, 3, 4, 6 |
| | | | yRt = f(x1, x1²) | Passed | Passed | 18 |
| Khalil et al. [32] | Full model | Curve surface | ySR = f(x1, x2, x3, x1x2, x1x3, x3²) | Passed | Passed | 1, 7, 17, 20 |
Table 6. Results of the evaluation of adequate RSM equations for nanomaterials.
| Source | Reported Equations | Contour and Surface Response Plot | Adequate Equations | Normality Test | Constant Variance Test | Influential Data |
|---|---|---|---|---|---|---|
| Pakolpakcil et al. [37] | Full model | Curve surface | yAfd = f(x1, x2, x3, x2x3, x1², x3²) | Passed | Failed | 12 |
| Pajaie and Taghizadeh [33] | Full model | Curve surface | yEthylene = full model; yPropylene = f(x1, x2, x3, x1x3, x1², x2²) | Passed | Passed | 3, 5, 8, 9, 11 |
| Jourshabani et al. [34] | Full model | Curve surface | yphenol yield = f(x1, x2, x3, x1², x2², x3²) | Passed | Passed | 9 |
| Sheng et al. [35] | log(yD + 0.5) = f(x1, x2, x3, x4, x1x2, x2x3, x2x4, x3x4, x3²) | Curve surface | yD = f(x1, x2, x3, x4, x1x2, x2x4, x3x4, x3²) | Passed | Passed | None |
| Rakhmanova et al. [36] | Not reported | Curve surface | yZinc oxide = f(x2, x3) | Passed | Passed | 6, 14 |
| Sreekumar et al. [38] | Full models for yNth to ynex,ele | Curve surface | yNth = f(x1, x2, x4) | Passed | Passed | 18 |
| | | Curve surface | ynele = f(x1, x2, x3, x4, x2x3, x2x4) | Passed | Passed | 7, 24, 25 |
| | | Curve surface | ynex,th = f(x1, x2, x3, x2x3, x2x4, x3x4) | Passed | Passed | 3, 25 |
| | | Curve surface | ynex,ele = f(x1, x2, x3, x4, x1x3, x2x3, x2²) | Passed | Failed | 3, 14 |
Table 7. Common issues with the applications of RSM in the literature.
| Issue | Literature |
|---|---|
| 1. The full model was used to generate contour plots and three-dimensional response surface plots, which were then used to optimize the variables. However, the coefficient values, t-values, and p-values for each variable in the ANOVA table were not used. | [24,25,26,27,30,32,33,34,38] |
| 2. Contour plots and three-dimensional response surface plots based on the full equation were presented, but the RSM model was not reported. | [36] |
| 3. All variables with p-values higher than the preselected value (usually p = 0.05) were deleted at once. | [29] |
| 4. The ANOVA table of the sequential model was used, and all variables in the linear or squared terms were included directly, without conducting significance testing for each variable. | [31,37] |
| 5. Datasets did not pass the normality test. | [25,27,29] |
| 6. Datasets did not pass the constant variance test. | [25,37,38] |
| 7. Influential data points were found. | [23,24,25,26,27,28,29,30,31,32,33,34,36,37,38] |
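Issue 3 in Table 7 concerns deleting every non-significant variable in one step rather than removing one variable at a time and refitting, as backward elimination does. The sketch below illustrates the stepwise version under stated assumptions: the dataset (a 2² factorial with three center points), the function names, and the use of tabulated 5% critical t values in place of computed p-values are all hypothetical simplifications, and the sketch does not enforce model hierarchy (keeping a main effect whenever its interaction or squared term stays).

```python
import math

# Two-sided 5% critical values of Student's t for 1 to 20 residual degrees of freedom.
T_CRIT_05 = [12.706, 4.303, 3.182, 2.776, 2.571, 2.447, 2.365, 2.306, 2.262, 2.228,
             2.201, 2.179, 2.160, 2.145, 2.131, 2.120, 2.110, 2.101, 2.093, 2.086]

def ols_t_values(X, y):
    """Return least-squares coefficients and their t-values for y = X b."""
    n, p = len(X), len(X[0])
    aug = [[sum(r[i] * r[j] for r in X) for j in range(p)] +
           [float(i == j) for j in range(p)] for i in range(p)]
    for col in range(p):  # Gauss-Jordan inversion of X'X with partial pivoting
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(p):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    inv = [row[p:] for row in aug]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    sse = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(X, y))
    s2 = sse / (n - p)
    t_values = [b / math.sqrt(s2 * inv[j][j]) for j, b in enumerate(beta)]
    return beta, t_values

def backward_eliminate(terms, y):
    """Drop the least significant term, refit, and repeat until all terms are significant.

    terms maps a term name to its column of values; the intercept is always kept.
    """
    kept = list(terms)
    while kept:
        X = [[1.0] + [terms[name][i] for name in kept] for i in range(len(y))]
        _, t_values = ols_t_values(X, y)
        df = len(y) - len(kept) - 1
        # Assumption: fall back to the normal value 1.96 beyond the tabulated range.
        t_crit = T_CRIT_05[df - 1] if df <= len(T_CRIT_05) else 1.96
        weakest = min(range(len(kept)), key=lambda j: abs(t_values[j + 1]))
        if abs(t_values[weakest + 1]) >= t_crit:
            break
        kept.pop(weakest)
    return kept

# Hypothetical coded design: four factorial points plus three center points.
demo_terms = {
    "x1":   [-1.0, 1.0, -1.0, 1.0, 0.0, 0.0, 0.0],
    "x2":   [-1.0, -1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    "x1x2": [1.0, -1.0, -1.0, 1.0, 0.0, 0.0, 0.0],
}
demo_y = [7.1, 12.9, 6.9, 13.1, 10.2, 9.8, 10.0]
selected = backward_eliminate(demo_terms, demo_y)
print("terms kept:", selected)
```

Because the residual degrees of freedom and the error estimate change after each deletion, a term that looks non-significant in the full model can become significant once another term is removed, which is why refitting at every step matters.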
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Chen, H.-Y.; Chen, C. A Study of the Response Surface Methodology Model with Regression Analysis in Three Fields of Engineering. Appl. Syst. Innov. 2025, 8, 99. https://doi.org/10.3390/asi8040099