Application of Machine Learning Techniques in Injection Molding Quality Prediction: Implications on Sustainable Manufacturing Industry

Abstract: With sustainable growth highlighted as a key to success in Industry 4.0, manufacturing companies attempt to optimize production efficiency. In this study, we investigated whether machine learning has explanatory power for quality prediction problems in the injection molding industry. One concern in the injection molding industry is how to predict the quality of molded products and which factors affect it. Although this is a major concern, prior studies have not yet examined it, especially with machine learning techniques. The objective of this article, therefore, is to apply several machine learning algorithms and compare their performance in quality prediction. Using tree-based algorithms, regression-based algorithms, and an autoencoder, we confirmed that machine learning models capture the complex relationship and that the autoencoder outperforms the other models in terms of accuracy, precision, recall, and F1-score. Feature importance tests also revealed that temperature and time are influential factors that affect quality. These findings have strong implications for enhancing sustainability in the injection molding industry. Sustainable management in Industry 4.0 requires adopting artificial intelligence techniques. In this manner, this article may be helpful for businesses that are considering the significance of machine learning algorithms in their manufacturing processes.


Introduction
Sustainable growth has become important for firms, especially in Industry 4.0. Integration among the physical and digital systems of production is the main concern of Industry 4.0 [1]. Industry 4.0 also enables continuous contact between machines, people, products, and even production materials. It is inextricably linked with the Internet of Things (IoT), Machine-to-Machine (M2M) technology, and Machine Learning (ML). Among them, the most effective solutions are machine learning and deep learning [2]. For example, results on a test renewable microgrid show that a machine learning-based structure can solve the problem with high accuracy [3]. One approach toward sustainable growth is to shift the overall manufacturing process to autonomous manufacturing, the core of which is information accessibility that enables the maintenance of manufacturing agility [4]. Specifically, automated data collection from machines and the application of machine learning techniques to the collected data for automated quality prediction or fault detection are two significant factors driving Industry 4.0. By combining novel techniques based on machine learning or deep learning with manufacturing processes, the performance of the systems is enhanced and can be monitored in real time through data-driven, continuous learning from a more varied range of data sources [5].
One concern of deploying machine learning or deep learning techniques in manufacturing fields is the selection of appropriate algorithms. That is, as these techniques are very sensitive to the type and size of the input data, appropriate techniques should be selected for a particular manufacturing type [6]. In other words, it is crucial to match specific types of algorithms to a given manufacturing business to fully enhance the manufacturing process. In this respect, this study attempted to employ machine learning and deep learning techniques in injection molding businesses. Specifically, we used prediction algorithms to verify whether they are suitable for quality prediction problems.
Injection molding is a complex production system. Injection molds are used as semifinal or final parts that can be used to produce the end products. Injection molds are frequently used in manufacturing plastic parts, and these parts are used in various businesses, such as automobile, shoe, and electronics manufacturing. As the injection molding industry supplies the base product for other manufacturing industries, it is considered a "root" industry, and the industry size is growing. With its importance growing every year, many studies have focused on designing the manufacturing process of injection molding and on how to efficiently test the molding [7]. While the literature on methods to effectively improve the manufacturing process of injection molding is growing, relatively little attention has been given to how to employ modern techniques, such as machine learning or deep learning models, to predict the quality of injection molding products. This is of particular importance as injection molds are semifinal parts, and the customers of the molds use the mold parts to produce the final products. If defective molds are delivered to customers, it is highly likely that they will be dissatisfied.
Using a private injection molding production and quality dataset from Hanguk Mold, an injection molding company in Ulsan, South Korea, we deployed several machine learning and deep learning models to empirically test which models are suitable for injection molding businesses. The company manufactures molding products for car manufacturers, and the data we obtained from the company were a large manufacturing dataset from injection machines on one specific item. The injection machine data included injection time (s), filling time (s), plasticizing time (s), cycle time (s), clamp close time (s), cushion position (mm), switch over position (mm), plasticizing position (mm), clamp open position (mm), max injection speed (mm/s), max screw RPM (RPM), average screw RPM (RPM), max injection pressure (MPa), max switch over pressure (MPa), max back pressure (MPa), average back pressure (MPa), barrel temperature (°C), and mold temperature (°C).
Of the numerous machine learning algorithms available, we deployed techniques that are frequently used in the manufacturing industries. Specifically, we used tree-based, regression-based, and autoencoder models. Tree-based algorithms included random forest, gradient boosting, XGBoost, LightGBM, and CatBoost. Regression-based algorithms included logistic regression and support vector machine. Finally, we also used an autoencoder model. Among these models, we found that the autoencoder model performs well in quality prediction problems in injection molding compared to regression-based and decision tree-based models. This is largely because of the complexity of the input variables in injection molding data. Autoencoder models generally have strengths in settings with complex input features. Furthermore, we calculated the feature importance of the predictive models to investigate which covariates are significant factors that determine the quality of the molding. We report that the models are generally in close agreement on the most influential predictors. Feature importance tests found that the molding temperature, hopper temperature, injection time, and cycle time factors were largely influential. Such common findings imply that the variation in the values of the abovementioned variables is a major cause of production defects; therefore, we highlight the importance of monitoring these variables in injection molding production.
This study aimed to apply modern machine learning and deep learning techniques to the injection molding business. Specifically, focusing on the quality prediction issue, we showed that autoencoder models are suitable for such businesses. This study contributes to the literature and to the future sustainable growth of injection molding businesses. We contribute to the literature by showing that autoencoder models have high explanatory power in explaining the quality of injection molds. To our knowledge, this is the first attempt to employ machine learning algorithms in plastic injection molding businesses and to compare the performance of the models head to head. Our extensive comparison among several models showed that an autoencoder-based model outperforms other machine learning models. Furthermore, we contribute to on-site manufacturing businesses by showing the key variables that influence product quality. With over 50 real-time variables collected during the injection molding process, it is difficult for humans to identify influential variables. However, with the help of modern techniques, we found that temperature and time are important features, and such findings may be used by other injection molding companies. The contributions of this research are similar to those of prior studies that apply modern statistical methods in practical businesses [8].
While injection molding businesses have long desired to figure out the main drivers that affect the quality of the product, this was not easy as the relationship between variables is rather complex. Complex relationships are not well captured in classical statistical models. In this regard, we deployed machine learning techniques to investigate the causes. This is possible as complexity is not a hurdle for machine learning models. The findings that molding temperature, hopper temperature, injection time, and cycle time are major factors are important for businesses' sustainable growth. Defective items are costly for both manufacturing companies and the environment. For manufacturers, since they cannot sell defective items, they waste resources. From an environmental perspective, such wasted defective items harm the environment. To reduce the manufacturing cost and environmental risk, it is critical for businesses to understand the main factors that cause defects. By monitoring the important features suggested in this article, firms may reduce the defect ratio, which would reduce manufacturing costs and environmental risks and would in the end increase their advantages in sustainable growth.
Furthermore, injection molding businesses are facing challenges because the cost of quality control is increasing due to rising wages. Ordinary injection molding firms station a couple of employees beside each injection machine to check the quality of the manufactured product and decide whether the product is defective or not. A medium-sized injection molding manufacturer that operates around a hundred injection machines has over two hundred employees who check the quality of products. This generates a huge cost, especially in high-income countries. Therefore, it has long been questioned in the manufacturing field whether this cost could be minimized by using recent machine learning algorithms. However, due to a lack of data, empiricists found it difficult to analyze injection molding data and report whether machine learning techniques have the potential to replace human labor, at least in quality monitoring. In this manner, using a large dataset generated from injection machines, we investigated whether recent machine learning-based classification algorithms can classify items well by their quality. We found that an autoencoder-based model outperforms other models and that the performance of the autoencoder model is suitable for application in real injection molding businesses. This also means that applying machine learning techniques at manufacturing sites may potentially reduce quality monitoring costs, which have been a big hurdle to sustainable growth.
We begin by presenting a literature review of both the injection molding industry and modern machine learning and deep learning techniques used in this study in Section 2. We describe the data and the methodologies used for quality prediction in injection molding in Section 3. The main results are provided in Section 4, where we also provide descriptive statistics, model performance comparisons, and feature importance. We then discuss the findings in Section 5 and, finally, the conclusions in Section 6.

Literature Review

Injection Molding
Enhancing production efficiency has long been a research question in the injection molding industry. Prior studies largely focused on methods to improve the cooling system of the injection molding process, enhance the energy consumption, alter cavity design, and improve the scheduling policy.
Temperature is a significant factor in determining the quality of the product. This is because the injection molding process involves melting the resin and subsequent cooling of the manufactured product. With its importance underscored, the literature attempted to enhance the layout design of the cooling system of injection molding. Searching on the Google Scholar data source, we were able to list related articles. For instance, a heuristic searching algorithm framework was used to develop the cooling circuits in the layout designs [9], and convex optimization models were further studied to improve the energy transfer efficiency [10]. Furthermore, another strand of literature deployed topology optimization to simplify the cooling process analysis [11]. K. J. Lee et al. (2020) [12] performed unsupervised probability matching between each instance and output based on injection molding data in the semiconductor industry to generate a training dataset with one-to-one relationships and then applied k-nearest neighbor (KNN). This approach performed better than simply applying supervised learning methods such as support vector machine, random forest, and KNN.
Another critical research topic in the injection molding industry is efficient energy usage. The guideline for characterizing the energy consumption around the injection molding process consists of five steps [13]. Under these guidelines, we can estimate the energy consumption of a variety of injection molding manufacturing processes and products by considering the theoretical minimum energy computed from part design and process planning. Thus, studies have largely focused on a variety of perspectives to enhance the efficiency and sustainability of injection molding manufacturing processes. Regarding cavities, which constitute a major part of injection molding [14], the literature focused on ways to save manufacturing time. One approach was to exploit an intelligent cavity layout design system to help injection molding designers in the cavity design steps [15]. Another study examined the parting surface and cavity blocks in a computer-aided injection molding design system [16].
Recent studies have also investigated how optimizing the scheduling of injection molding production may enhance manufacturing efficiency. For example, a deep Q-network was deployed to determine the scheduling policy that minimizes the total tardiness [4]. The authors found that the deep reinforcement learning method outperformed the dispatching rules that are popularly used for minimizing the total weighted tardiness. Another recent study applied transfer learning between different injection molding processes to reduce the amount of data needed for model training [17]. The authors used different approaches to ANN models; with 16 training samples, they obtained an average R² value of 0.88.
Ke, K.-C. et al. (2021) [18] filtered out outliers in the input data and converted the measured quality into a quality class used as output data. The prediction accuracy of the MLP model was improved, and the quality of finished parts was classified into various quality levels. The model classified parts as "qualified," "unqualified," or "to-be-confirmed" and applied additional quality assessments only to "to-be-confirmed" products, significantly reducing quality management costs.

Machine Learning
Studies on machine learning and its applications are proliferating. Focusing on its implications for solving issues in manufacturing businesses, research has focused on predicting failures [6]. Cinar et al. (2020) [14] and Binding, Dykeman, and Pang (2019) [19] forecasted the downtime of manufacturing machines using real-time prediction models. They utilized unstructured historical machine data to train machine learning classification algorithms, including random forest, XGBoost, and logistic regression, to predict machine failures [6]. Qi, X. et al. (2019) [20] conducted a study to apply neural network algorithms to complete additive manufacturing process chains from design to post-treatment. Yang, He, and Li (2020) [21] employed a machine learning-based approach to obtain an appropriate estimation model for the power consumption of the mask image projection stereolithography process. Among stepwise linear regression, shallow neural network, and stacked autoencoders, stacked autoencoders had the best performance. Reference [22] researched the quality control of continuous flow manufacturing. The authors labeled data with random forest-based pseudo-labeling and deployed recurrent neural network models.
Ruey-Shiang, G. et al. (2020) [23] proposed a random forest model to detect the mean shifts in multivariate control charts during production. The proposed model detected the moving average well and was able to identify the exact variables. M. Strano et al. (2006) [24] proposed logistic regression for the empirical determination of the locus of the principal planar strains where failure is most likely to occur. They directly derived the probability of failure as a function of different predictor variables through the model. Pal, M. (2005) [25] compared the classification accuracy of random forest and SVM for remote detection. Zhang, C. et al. (2019) [26] built a two-stage energy-efficient decision-making mechanism using random forest. The authors selected appropriate control strategies for different occasions in the manufacturing process. Alhamad, I. M. et al. (2019) [27] compared machine learning models that predict faults during the wafer fabrication process in the semiconductor industry. The authors combined feature selection methods with four models: k-nearest neighbor (KNN), random forest (RF), Naïve Bayes (NB), and decision tree (DT). They then compared recall, precision, F-measure, and false-positive rates. Jo, H. et al. (2019) [28] compared machine learning algorithms for predicting the endpoint temperature of molten steel in a converter in steel-making processes. Omairi, A. et al. (2021) [29] proposed machine learning algorithms to detect product defects in cyber-physical systems in additive manufacturing. The authors argued that the inclusion of AI frameworks in automated tasks could improve the efficiency of the manufacturing process.
A recent study evaluated multi-level quality control based on various machine learning and blockchain-based solutions [30]. By comparing the accuracy, precision, and recall of the XGBoost and KNN algorithms, the authors found that XGBoost performs well.
Regarding injection molding, a study compared linear and kernel support vector machine (SVM) classifiers on datasets corresponding to product defects in an industrial environment around a plastic injection molding machine [31]. Another study used images of injection molding products and applied deep learning algorithms [32]. The study found that long short-term memory (LSTM) fitted better than convolutional neural network (CNN) models in defect classification problems using image data. Although machine learning techniques based on image data are surging, not much research has been conducted on applying such methodologies using injection machine data. This research aims to apply several machine learning algorithms to the data gathered from injection machines.

Data
This study used a large injection machine dataset gathered from actual injection molding production at Hanguk Mold, a company in South Korea. Table 1 provides a description of all the variables that are available from the injection machine, and Figure 1 provides the process diagram of injection molding. There are over 50 available variables, and we selected the variables that are considered more important at the manufacturing sites.

Table 1. Description of the injection machine variables.

Variable (unit): Description
Injection_Time (s): The time it takes the screw to move from the injection start position to the transfer position.
Filling_Time (s): An indication of how fast the plastic is injected into the mold.
Plasticizing_Time (s): The time spent plasticizing the plastic.
Cycle_Time (s): The amount of time from the start to the end of an injection molding cycle.
Clamp_Close_Time (s): The time the mold is closed.
Cushion_Position (mm): The position of the cushion after the mold filling and pack stages of the injection process.
Switch_Over_Position (mm): The quality of the molded part is greatly influenced by the conditions under which it is processed.
Plasticizing_Position (mm): Plasticizing position; during the cooling time, the molding machine begins plasticizing material in the barrel to prepare for the next cycle.
Clamp_Open_Position (mm): Clamp position when clamping force is applied to a mold.
Max_Injection_Speed (mm/s): Maximum injection speed at which the screw pushes molten plastic resin into a mold cavity.
Max_Screw_RPM (RPM): Maximum screw rotation speed; the screw rotation speed is the speed at which the screw rotates to mix the pellets.
Average_Screw_RPM (RPM): Average screw rotation speed; the screw rotation speed is the speed at which the screw rotates to mix the pellets.
Max_Injection_Pressure (MPa): Maximum injection pressure with which the screw pushes molten plastic resin into a mold cavity.
Max_Switch_Over_Pressure (MPa): Maximum pressure applied at the switch over position.
Max_Back_Pressure (MPa): Maximum back pressure applied.
Average_Back_Pressure (MPa): Average back pressure applied.
Barrel_Temperature (°C): The barrel temperature, one of the temperatures that need to be controlled during the plastic injection molding process.
Mold_Temperature (°C): Temperature of the actual mold cavity after it has stabilized.

Table 2 provides the summary statistics of the data divided by the quality of the injection molding. As the defect ratio is relatively low, we oversampled the defect data using the synthetic minority oversampling technique (SMOTE). The summary statistics comparing the mean values of the injection machine variables for the original dataset are reported in Panel A, and those for the oversampled data are provided in Panel B of Table 2. The univariate comparison shows that, in general, there is a statistically significant difference in mold temperature-related measures, injection time, and plasticizing time for both the original and oversampled datasets. Given that interpreting results from univariate analyses raises several endogeneity issues, we further deployed machine learning techniques to capture how the variation in such features can explain the quality of the product.
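The oversampling idea behind SMOTE can be illustrated with a simplified sketch: each synthetic defect record is placed at a random point on the line segment between a real defect record and one of its nearest defect neighbors. This is not the implementation used in the study; the function name `smote`, its parameters, and the sample values are hypothetical and chosen only for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical defect records: [mold temperature (°C), injection time (s)].
defects = [[61.0, 1.20], [63.5, 1.35], [62.0, 1.28], [64.0, 1.40]]
new_points = smote(defects, n_new=6, k=2)
```

Because every synthetic point is a convex combination of two real defect records, it always stays inside the range spanned by the observed defects. In practice, a maintained library implementation would normally be preferred over a hand-written one.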

Logistic Regression
Logistic regression analysis is a representative method among linear classification algorithms. This algorithm is the basis of deep learning. A typical regression model estimates a linear regression equation by determining the distribution characteristics of the features.
The most important aspect of logistic regression is to model the probability of an event. Instead of y, the model targets the probability of belonging to category 1, p = P(Y = 1), which takes a numeric value between 0 and 1 [33].
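In standard notation, the linear regression model and the logistic model described above can be written as follows. The symbols for the coefficients and the m features are notation assumed here for illustration.

```latex
\begin{align*}
y &= \beta_0 + \beta_1 x_1 + \cdots + \beta_m x_m, \\
p = P(Y = 1 \mid x) &= \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \cdots + \beta_m x_m)}}, \\
\log\frac{p}{1 - p} &= \beta_0 + \beta_1 x_1 + \cdots + \beta_m x_m .
\end{align*}
```

The last line shows that the log-odds of the event are linear in the features, which is the "linearity in the logit" property of the model.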
Then, the probabilities are categorized using appropriate thresholds. In this study, the threshold was set to 0.5 to classify good and bad products. However, logistic regression has basic assumptions that must be met, such as linearity in the logit for continuous variables, independence of errors, and absence of multicollinearity [34]. In our setting, the linear predictor passed through the sigmoid function yields a non-linear probability curve, while the decision boundary remains a hyperplane. If we use sensor data to find the optimal hyperplane, we can explain which features are important because their coefficients define that hyperplane.
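The thresholding step can be sketched in a few lines. The coefficients, feature values, and function names below are hypothetical; only the sigmoid transformation and the 0.5 threshold come from the description above.

```python
import math


def predict_defect_probability(features, weights, bias):
    """Sigmoid of the linear score gives p = P(Y = 1 | x)."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))


def classify(features, weights, bias, threshold=0.5):
    """Return 1 when the predicted probability reaches the threshold, else 0."""
    p = predict_defect_probability(features, weights, bias)
    return 1 if p >= threshold else 0


# Hypothetical coefficients for two standardized features.
weights = [1.8, -0.6]
bias = -0.4
p = predict_defect_probability([1.0, 0.5], weights, bias)
label = classify([1.0, 0.5], weights, bias)
```

With a threshold of 0.5, the predicted class flips exactly where the linear score changes sign, so the sign and size of each coefficient indicate how a feature moves an item toward one class or the other.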

Support Vector Machine
A hyperplane is a decision boundary that classifies the data in high dimensions. Compared to logistic regression, the SVM can use hyperplanes to classify high-dimensional data that linear classification cannot separate. Applying a kernel function that maps the data into a higher-dimensional space allows for a non-linear classification of observations in the original data [35].
The support vectors are the data points closest to the decision boundary. The SVM uses the margin, the distance between these support vectors, to find the optimal decision boundary. It is very important to select the proper kernel function, as it determines the feature space in which the training data will be classified [36].

Tree-Based Model
Another popularly used model in the manufacturing process is the decision tree-based model. Among several models that use decision trees, we deployed the random forest, gradient boosting, XGBoost, LightGBM, and CatBoost algorithms, following prior studies that developed models for computer numerical control (CNC) machines [26]. A tree model consists of decision trees, and its advantage is that we can extract feature importance to figure out which features are important for quality prediction. Each of the five algorithms has its own method for extracting important features, and we compare the important features across all of them.

Random Forest
Random forest is an important machine learning algorithm for pattern recognition owing to its low cost. The main principle of the training strategy is bagging. This implies that the random forest is derived from an ensemble of trees, each trained on a sample drawn with replacement from the dataset [37]. The remaining data are called out-of-bag and are used to evaluate the model performance [13]. Most boosting or bagging algorithms are based on decision trees [38]. The initial state of the node creates other nodes that contain features directed upward. Consequently, many decision trees are used, each classifying its own sampled set of data. With this method, individual decision trees can have lower accuracy than a decision tree built using the total dataset. Hence, it is better to aggregate the results of all trees because trees trained on different samples complement each other [39].
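The sampling side of bagging can be sketched without any tree-fitting code. Bootstrap sampling draws as many indices as there are records, with replacement; the records that are never drawn form the out-of-bag set. The helper names and the sample size below are hypothetical.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; the rest are out-of-bag."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    chosen = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in chosen]
    return in_bag, out_of_bag


def majority_vote(votes):
    """Combine the class labels predicted by the individual trees."""
    return max(set(votes), key=votes.count)


rng = random.Random(42)
in_bag, out_of_bag = bootstrap_split(1000, rng)
oob_fraction = len(out_of_bag) / 1000  # tends toward 1/e (about 0.37) for large n
ensemble_label = majority_vote([1, 0, 1, 1, 0])
```

Each tree would be trained on its own `in_bag` indices and scored on its `out_of_bag` indices, and the forest prediction for an item is the majority vote over the trees.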

Gradient Boosting
Boosting is another ensemble method that gradually reduces the training error by using the residuals of the models. Gradient boosting fits each new model to the residual error, which for a squared-error loss is identical to the negative gradient of the loss [40]. Fitting to the residuals is therefore equivalent to moving the model in the direction opposite to the gradient of the loss. Hence, this algorithm is called gradient boosting [41].
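The link between residuals and gradients can be made explicit for the squared-error loss. The notation below (current model F, weak learner h at round t, learning rate η) is assumed here for illustration.

```latex
\begin{align*}
L\bigl(y, F(x)\bigr) &= \tfrac{1}{2}\bigl(y - F(x)\bigr)^2, \\
-\frac{\partial L}{\partial F(x)} &= y - F(x), \\
F_t(x) &= F_{t-1}(x) + \eta\, h_t(x), \qquad h_t(x) \approx y - F_{t-1}(x).
\end{align*}
```

The second line shows that the residual equals the negative gradient, so adding a learner fitted to the residuals is a gradient descent step in function space. For other loss functions the negative gradient replaces the plain residual.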

XGBoost
XGBoost is a helpful approach for optimizing the gradient boosting algorithm by handling missing values, mitigating overfitting, and using parallel processing. System optimization in XGBoost is achieved by implementing parallelization, tree pruning, and hardware optimization [42].

LightGBM
Although XGBoost, with its high parallelism, is faster than GBM, a method that further reduces the training time is required for large datasets [43]. Compared with XGBoost, LightGBM (LGBM) shows better training time and memory efficiency, as it offers parallel computing capabilities for large amounts of data and, more recently, GPU support. LGBM was developed to inherit the advantages and complement the disadvantages of XGBoost. However, it is prone to overfitting when applied to small datasets of fewer than 10,000 instances. GBM is more robust to overfitting because it uses the level-wise method, but it requires time to balance the tree.
LGBM uses the leaf-wise method [44]. Instead of balancing the tree, it continuously splits the leaf node with the maximum delta loss, expands the depth, and generates asymmetric trees [45]. As learning repeats, this method reduces the predictive error loss more than the balanced tree split method.

CatBoost
CatBoost can perform better than other GBM algorithms by adopting the ordering principle to solve the problem of prediction shift caused by data leakage and by improving the pre-processing of categorical variables with high cardinality [46]. The first advantage is the reduction in learning time due to improvements in the categorical variable handling methods. Most GBMs use decision trees as base predictors, but with categorical variables, training takes a long time. Another advantage is the use of ordered boosting to calculate leaf values, which addresses the prediction shift problem [47].

Autoencoder-Based Model
For predicting manufacturing quality, the amount of training data is important, and a deep framework can outperform other machine learning methods. This means that deep learning techniques can be applied to build accurate models in manufacturing fields. Similarly, deep feature learning is beneficial for exploring sophisticated relationships between multiple manufacturing features and quality [48].
An autoencoder consists of an encoder that maps the input to the hidden layer and a decoder that maps the encoded data back to the reconstruction [49]. First, it compresses the original input data to a vector of lower dimension and then decodes this vector to the original representation of the data [50]. A stacked autoencoder is an autoencoder with multiple hidden layers. As shown in Figure 2, the structure is symmetric with respect to the middle hidden layer, and the hidden layers have fewer nodes than the input and output layers. Autoencoder models learn a mapping from the high-dimensional input to a low-dimensional bottleneck by repeatedly compressing and decompressing the data. In this process, an information bottleneck is created, and the model automatically learns to distinguish between important and non-critical features for restoring input samples. However, the autoencoder model relies on normal data when the network is developed. If the input data are suitable, the results are often significant. If data projected to higher dimensions using a kernel are well classified by a particular hyperplane, a kernel-based machine learning model may be more appropriate. Therefore, we hypothesized that the autoencoder model would perform better because the data are correlated and the classification results in the high-dimensional kernel are not significant.

Time Complexity and Model Evaluation
The complexity usually depends on the size of the data. It is important to check the complexity because consuming fewer resources and less time matters in the real world. In other words, if the results of the models are similar, a model with less complexity is more efficient in terms of resources and time and should be applied in practice; this is also highly related to the symbiotic relationship between humans and robots [51]. Logistic regression, a type of linear regression, has the advantage of having no hyperparameters, but there is also no way to control the complexity of the model. An autoencoder is in effect a combination of multiple logistic regressions, which makes its complexity difficult to calculate. For the models whose complexity is computable, consider data consisting of n instances with m attributes. The time complexity of SVM is O(n²). The complexity of a decision tree, one of the basic methods of tree-based models, is O(mn²) [52]. The complexity of random forest is O(Mmn log n), where M is the number of trees. The different tree-based algorithms employ their own methods to reduce this complexity.
For the binary model evaluation, we use four elements to check the performance of the models: true positives (TPs), true negatives (TNs), false positives (FPs), and false negatives (FNs). True elements are those that the model classifies correctly, and false elements are those that the model classifies incorrectly. Accuracy is the most intuitive metric because it does not require statistical interpretation.
Generally, finding a defective item is more important than finding a good item. Therefore, we assign defective items to class "1" and good items to class "0" for proper model evaluation. The data are highly imbalanced, as the ratio of defective items is less than 1.5%. We therefore need additional metrics because, if a model cannot discriminate at all and labels every item as 0, its accuracy is still about 98.5%. For this reason, precision, recall, and F1-score are commonly used to check model performance in imbalanced binary classification. Precision is the ratio of items predicted as positive (here, defective) that are actually positive [37]. Recall is the ratio of actual positive items that the model correctly predicts as positive, that is, TP divided by the sum of TP and FN. Generally, precision and recall have a tradeoff, as they view model performance from different angles. The F1-score is the harmonic mean of precision and recall and complements them when the data are imbalanced [53].
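A short sketch shows why accuracy alone is misleading at this defect ratio. The batch below is hypothetical (1,000 items, 15 of them defective), as are the function names; the metric definitions are the standard ones based on TP, TN, FP, and FN with the defective class labeled 1.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN with class 1 (defective) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def scores(y_true, y_pred):
    """Accuracy, precision, recall, and F1; undefined ratios are reported as 0."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical batch: 15 defective items followed by 985 good items.
y_true = [1] * 15 + [0] * 985

# A model that labels everything as good: high accuracy, zero recall.
baseline = scores(y_true, [0] * 1000)

# A model that catches 12 of 15 defects and raises 8 false alarms.
detector = scores(y_true, [1] * 12 + [0] * 3 + [1] * 8 + [0] * 977)
```

The all-good baseline reaches 98.5% accuracy while detecting no defects, whereas the second model has similar accuracy but a precision of 0.6 and a recall of 0.8, which the F1-score summarizes in a single number.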

Main Results
We employed several machine learning algorithms to observe and compare the performance of the models. Specifically, we employed logistic regression, support vector machine, random forest, XGBoost, CatBoost, LightGBM, and autoencoder models. These models can be categorized as regression-based, tree-based, and autoencoder-based models.
The model results are presented in Table 3. Panel A reports the results for the regression-based models. Because logistic regression classifies the output of a linear regression through a sigmoid function, four statistical assumptions (linearity, homoscedasticity, independence, and normality) must be satisfied. However, in manufacturing data, the variables are highly multicollinear, and some features are not invariant because of their unique characteristics. Consequently, the statistical assumptions are difficult to satisfy, and the F1-score is remarkably poor. Unlike in the tree algorithms, recall was better in the regression models. This implies that the regression models detect 90% of defective items but also frequently misclassify good items as defective. This is a result of SMOTE: when we train the model on oversampled data, SMOTE sets a 50:50 ratio of defective to good items. Therefore, we observe almost 90% recall but less than 1% precision. The linear SVM behaves almost the same way because the poorly satisfied statistical assumptions make it challenging to find a hyperplane that separates good and defective items. Further, a good linear hyperplane is difficult to find when good and defective items are mixed in high dimensions.
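The idea behind this oversampling step can be sketched as follows. This is a simplified illustration of SMOTE-style interpolation, not the implementation used in the study or in any particular library. The function name `smote_like`, its parameters, and the toy data are hypothetical.

```python
import random

def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority rows by interpolating toward a near neighbor."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared distance to the chosen row.
        others = [r for r in minority if r is not base]
        others.sort(key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)))
        neighbor = rng.choice(others[:k])
        # Place the new row at a random point on the segment base -> neighbor.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbor)])
    return synthetic

# Hypothetical toy data: 197 good items and 3 defective items (1.5% of 200).
good = [[float(i % 10), float(i // 10)] for i in range(197)]
defective = [[20.0, 20.0], [21.0, 20.5], [20.5, 22.0]]

# Oversample the defective class until the two classes are balanced 50:50.
new_rows = smote_like(defective, n_new=len(good) - len(defective), k=2)
balanced_defective = defective + new_rows

print(len(good), len(balanced_defective))
```

Because every synthetic row lies between existing defective rows, the training set becomes balanced while the test set keeps its original imbalance, which is consistent with high recall appearing together with low precision.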
Panel B of Table 3 provides the results of the tree-based models. We found that random forest, a bagging method, outperforms the boosting-type methods. For non-parametric data, bootstrap sampling (with replacement) works better than updating the model based on residuals. It seems that, for manufacturing data, non-parametric methods are better than parametric methods. However, taking as a reference the human error rate of 5% used in image classification problems, this level of performance is not applicable to the actual industry.
The autoencoder model results are presented in Panel C of Table 3. The stacked autoencoder classified most of the products accurately, with few misclassifications. Of the 5617 quality data, 70% were used for training and 30% for testing (1605). The approach differs from the other machine learning algorithms in that the network is trained only on good items. When classifying the 1605 quality data and 125 defective data, the F1-score was 0.9727. The autoencoder performs better than the other models because it is non-parametric and is not significantly affected by the distribution between features. It detects every defective item; therefore, the recall is 1 and the precision is 0.9469, which is comparable to human error (i.e., 5%). Another advantage of this method is that only good items are used for training; thus, all defective items can be used for model evaluation. Because many manufacturing data are imbalanced, a sampling method is usually necessary to create a model. The autoencoder, however, can skip this step, and thus it is more accurate and easier to use for classification.
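As a consistency check on the reported values, the F1-score can be recomputed from the stated precision P = 0.9469 and recall R = 1 using the standard harmonic-mean definition:

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.9469 \times 1}{0.9469 + 1}
    = \frac{1.8938}{1.9469}
    \approx 0.9727
```

The result agrees with the F1-score of 0.9727 reported above.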

Discussion
This paper finds that autoencoder-based models outperform tree-based models. In the tree-based models, there are two main ways of developing models: bagging and boosting. Bagging focuses more on how to organize the data well before building the model, whereas boosting focuses more on the sensor values, developing the model by updating it on the residuals associated with each feature. Referring to Table 3, gradient boosting has the highest value, 0.7638; that is, the model identifies 76% of the defective products. However, only 55% of the items that the model determined to be defective were actually defective. Because of cost, it is important to find out exactly which products are defective. Therefore, we applied a stacked autoencoder. It is a method for anomaly detection that uses the difference between input and output data when a network learns to compress the original data to a lower dimension and then restore it. The advantage of this approach is that the model does not need any defective items. This is useful for keeping the model sustainable in low-cost injection molding, because the labeling cost is high in the process of obtaining data to build the model. Since the network is trained only on good items, labeling costs are reduced, and the results are very good, as shown in Panel C of Table 3. In other words, in injection molding, good products follow a stable pattern and defective products deviate from it, so the two can be classified well.
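The reconstruction-error idea can be sketched in a few lines. The example below is not the stacked autoencoder used in this study: as a stand-in, it fits a one-dimensional linear bottleneck (the first principal direction, found by power iteration) on good items only and flags an item when its reconstruction error exceeds the largest error seen in training. The three synthetic features, the threshold rule, and all function names are assumptions made for illustration.

```python
import math

def fit_bottleneck(rows, iters=100):
    """Fit a 1-D linear bottleneck (first principal direction) on good items only."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)] for i in range(d)]
    # Power iteration converges to the dominant eigenvector of the covariance matrix.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

def reconstruction_error(row, mean, v):
    """Encode a row to one number, decode it, and return the distance to the input."""
    c = [x - mu for x, mu in zip(row, mean)]
    code = sum(ci * vi for ci, vi in zip(c, v))
    recon = [code * vi for vi in v]
    return math.sqrt(sum((ci - ri) ** 2 for ci, ri in zip(c, recon)))

# Hypothetical good items: three correlated features with small deterministic noise.
good = [[1 + k / 100,
         2 * (1 + k / 100) + 0.05 * math.sin(7 * k),
         0.5 * (1 + k / 100) + 0.05 * math.cos(11 * k)] for k in range(200)]

mean, v = fit_bottleneck(good)
# Assumed rule: flag anything reconstructed worse than the worst training item.
threshold = max(reconstruction_error(r, mean, v) for r in good)

def is_defective(row):
    return reconstruction_error(row, mean, v) > threshold

new_good = [2.0, 4.0, 1.0]      # follows the stable pattern of good items
new_defect = [2.0, 5.5, 0.0]    # deviates from that pattern

print(is_defective(new_good), is_defective(new_defect))
```

No defective item is needed to fit the model or to set the threshold, which mirrors the labeling-cost argument above. A non-linear, multi-layer network would replace the linear bottleneck in practice.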
Furthermore, given the significant results of machine learning methods in predicting the quality of injection molding, the variables that drive these results are also important. That is, among the dozens of injection machine variables, which features mainly lead to quality problems in injection molding businesses? We employ feature importance tests for each model used in the analysis.
Figure 3 shows the combined feature importance graph. Regardless of the model used, we found that molding temperature, hopper temperature, injection time, and cycle time are important variables commonly selected by the machine learning techniques. These findings are useful for manufacturing sites. With over 50 control variables on injection machines, workers find it difficult to control each variable efficiently. Using the important features selected by machine learning algorithms may reduce the time workers spend controlling machines and consequently increase the production level.
This study has two limitations. The first is the limitation of the data. Threats to validity are an important category to be discussed in machine learning studies [54,55]. Among the several categories of threats to validity, this paper is mostly concerned with external validity; that is, it may not be possible to generalize the results beyond the experimental setting. As the data come from a manufacturing company known for plastic injection, they might be biased: they may contain plastic injection molding characteristics that are not applicable to other types of molding, and our findings may therefore differ in other molding businesses. The second limitation is insufficient knowledge of what caused the defects. In manufacturing, model results such as accuracy, recall, precision, and F1-score are important, but the explanation of the results is often more important. Finding the causes of the outcome is needed in the business, but the current study may not have done so sufficiently. Feature importance is a test that finds the variables that are important for the model's classification, and whether those variables are actually important is a separate question. Possible ways to address this problem include combining the model with an explainable model, changing the structure of the deep learning network to understand which activation functions' responses affect the results, or using an explainable model directly. In the future, another example applied to different injection machine data will be needed, along with a model structure that, through explainable models, focuses on the cause rather than on the outcome of the model.

Conclusions
Quality issues have long been a critical concern in injection molding businesses. Such technical issues have become more important for firms' sustainable growth, especially in the Industry 4.0 era. We believe that the important innovations that will keep manufacturing industries in leading roles in the market are those of adapting to the new environment. With many artificial intelligence models introduced every day, manufacturing industries should also try to be more innovative by applying such modern techniques in their current manufacturing processes. Furthermore, quality efficiency is an important concern for manufacturing businesses pursuing sustainable development, and it is closely related to issues of energy efficiency [56,57]. If enterprises want to reduce costs and find or retain clients, they should offer products of the highest quality at reasonable prices.
Hence, injection molding firms attempt to improve their production efficiency and enhance quality prediction by monitoring the key variables that drive the quality of injection molding. As the manufacturing environment becomes more dynamic with an increasing number of products, failing to respond to the environment with agility causes customer dissatisfaction and therefore has a negative influence on companies' competitiveness in the market [4]. Intelligent solutions that can solve such complex problems are therefore required. Many prior studies have examined the importance of Industry 4.0 for enterprises in a changeable and innovative environment [58–60].
Injection molding manufacturing consists of complex production systems because many parts are combined and the specifications of each mold are different. Moreover, mold products have different processes, and all these factors increase the complexity of the dynamic manufacturing environment. From the perspective of the data gathered during the process, this also implies non-linear and complex relationships among variables. Therefore, employing statistical methods based on linearity assumptions may not be effective. Using quality prediction as a testing ground, this study performed a comparative analysis of various machine learning methodologies. At the upper level, we demonstrated that machine learning methods can help improve the understanding of quality problems in the injection molding industry. Using a large real production dataset gathered from injection machines, we found that machine learning models are generally useful for quality prediction. Autoencoder and random forest are the best performing methods. Specifically, we showed that the autoencoder model outperforms the tree-based machine learning algorithms in terms of accuracy and F1-score.
We also traced the advantages of these machine learning algorithms to their ability to accommodate non-linear interactions that are often missed by classical methods. The injection molding process is a combination of numerous variables, such as temperature, pressure, and velocity, and the relationship between these variables is not linear. Thus, methods that have comparative advantages in handling non-linear relationships are necessary.
In addition to the prediction results of several machine learning methods, we tested which factors are key variables that influence the quality of injection molding products. We found that molding temperature, hopper temperature, injection time, and cycle time are important variables commonly selected by machine learning techniques.