The Effect of Grouping Output Parameters by Quality Characteristics on the Predictive Performance of Artificial Neural Networks in Injection Molding Process

Abstract: In this study, a multi-input, multi-output-based artificial neural network (ANN) was constructed by classifying output parameters into different groups, considering the physical meanings and characteristics of product quality factors in the injection molding process. Injection molding experiments were conducted for bowl products, and a dataset was established. Based on this dataset, an ANN model was developed to predict the quality of molded products. The input parameters included melt temperature, mold temperature, packing pressure, packing time, and cooling time. The output parameters included mass, diameter, and height of the molded product. The output parameters were divided into two cases. In one case, diameter and height, representing length, were grouped together, while mass was organized into a separate group. In the other case, mass, diameter, and height were separated individually and applied to the ANN. A multi-task learning method was used to group the output parameters. The performance of the two constructed multi-task learning-based ANNs was compared with that of the conventional ANN, where the output parameters were not separated and were applied to a single layer. The comparative results showed that the multi-task learning architecture, which grouped the output parameters considering the physical meaning and characteristics of the quality of molded products, improved prediction performance by about 32.8% based on the RMSE values.


Introduction
Injection molding is the process of injecting molten resin at high temperatures into a cavity in a mold at high speeds and pressures to form the final product. This process involves a complex interplay of various physical phenomena and material behaviors, such as rheology, flow dynamics, and heat transfer, at each stage of the process. Consequently, research has been actively pursued for many years to model the relationships between influencing factors, such as process conditions in the injection molding process, and the quality of the molded product, with the goal of predicting the process and optimizing product quality [1,2].
In recent years, with the advent of the Fourth Industrial Revolution, artificial intelligence (AI) technology has found applications in various fields such as data mining, image processing, engineering system modeling, and technical control. This integration has been made possible via the development of intelligent information technology and data computing, ushering in a new era of AI-driven solutions. Among these technologies, there is a growing industrial demand for artificial neural networks (ANNs), which have shown strong performance in unraveling complex nonlinear relationships, making them one of the most promising techniques in the field of artificial intelligence [3,4].
Appl. Sci. 2023, 13, 12876
In line with this paradigm shift, the injection molding industry has also embraced the application of artificial intelligence technology to overcome the limitations of existing techniques for predicting the relationship between process factors and product quality in injection molding processes. Models for predicting the quality of molded products using ANNs can be categorized into two types. One is the multi-input, single-output (MISO) structure, which predicts a single quality parameter for multiple process conditions, while the other is the multi-input, multi-output (MIMO) structure, which predicts multiple quality parameters for different process conditions. Yang et al. [5] used a MISO-ANN to predict the mass of injection-molded products based on 10 process conditions, including melt temperature and mold temperature, set as input parameters. Their research aimed to explore the optimal conditions for molding products with the desired mass. The performance evaluation of the established ANN showed that the deviation between the predicted values and the actual values of the mass was within 0.23 ± 0.02 g. Based on this, they used the ANN to derive the process conditions for a product with a target mass of 41.14 g. Injection molding experiments were conducted, and when the mass of the product was compared with the target mass, it showed a minimal deviation of 0.15 ± 0.07 g. This demonstrated a high degree of accuracy in deriving process conditions. They concluded that the ANN model for this product had a high degree of accuracy and reliability. Heinisch et al. [6] considered 2 mm thick flat samples and used simulations to create a dataset with three output variables: product mass, length, and width. This dataset was based on six input variables, including resin temperature, mold temperature, injection time, packing pressure level and time, and cooling time. Various experimental designs were used to construct the dataset. When comparing the prediction accuracy of ANNs for datasets generated using different experimental designs, the central composite design (CCD) showed the highest coefficient of determination at 0.930, indicating the most effective performance. However, in some cases, the D-optimal design and the L25 orthogonal array method also showed coefficients of determination similar to those of the CCD, indicating excellent performance. This indicates that ANNs can achieve a high level of accuracy and reliability.
However, in the case of MIMO models, multiple quality factors with different physical meanings and characteristics are evaluated using a single feed-forward neural network. This structure includes multiple input variables, input neurons, a certain number of hidden layers, and an appropriate number of output neurons responsible for predicting multiple desired variables. The disadvantage of this structure is that it is not flexible enough to evaluate all quality factors, because the output neurons must use the same features (the output of the last hidden layer) for all variables. If the input variables fundamentally affect each of the output variables in a different way, this structure may not produce acceptable results [7]. For example, when constructing a model using a MIMO structure, changing the weights and biases associated with the input variables to improve the prediction accuracy of one quality factor may result in a decrease in the prediction accuracy of other output values.
This study applied the data-based intelligent neural network algorithm developed for the injection molding of a light guide module with a fine pattern on a large-area mold core. This paper proposes an improved model for the correlation between injection molding process conditions and the multiple qualities of molded products using multi-task learning-based ANNs. This ANN establishes multiple task structures with separate branches for different sets of output parameters, all sharing common input parameters. Injection molding experiments were conducted for bowl products, and a dataset was established. Six process conditions, including melt temperature, mold temperature, injection speed, packing pressure, packing time, and cooling time, were set as input parameters. Output parameters included mass, diameter, and height. Based on this, two multi-task learning-based ANN architectures were constructed. One architecture groups the length parameters, represented by diameter and height, into a single category, while the remaining parameter, mass, is placed in a separate group. The other architecture constructs a multi-task learning-based ANN where mass, diameter, and height are all distinguished into separate groups. Then, the predictive performance of the two multi-task learning ANNs is compared with that of a conventional single-task structure. Based on this, structural guidelines for ANNs to predict multiple qualities of injection-molded products are presented.

Artificial Neural Networks
Artificial neural networks (ANNs) provide a powerful solution for handling complex, non-linear relationships in various industries where conventional methods struggle. They mimic the information processing structure of the human brain and are widely used in fields such as control engineering and robotics. In ANNs, artificial neurons process data by multiplying inputs with weights, adding a bias, and applying an activation function to generate an output. This process is represented by the simplest ANN, the perceptron in Figure 1, the data operations of which are shown in Equation (1). In Equation (1), x represents the input variables, while w and b denote the weights and biases that are updated during training. F represents the activation function in Figure 1.
$$y = F\left(\sum_{i=1}^{n} w_i x_i + b\right) \quad (1)$$
An ANN is a computational processing system, as shown in Figure 2, that consists of multiple interconnected perceptron structures [9]. Unlike perceptrons, the structure in Figure 2 contains several intermediate computational layers called hidden layers. These layers are called hidden because the processes taking place within them are not readily observable by users. Numerous nodes (neurons) are distributed within these layers, and this configuration is known as an ANN. There can be multiple hidden layers, and the term "deep learning" refers to the depth of these hidden layers. Equation (2) represents the process of calculating the output value, $y_i^{(l)}$, of the ith neuron in the lth layer. It extends the output form of the perceptron shown in Equation (1) to the neurons of a multi-layer ANN.
$$y_i^{(l)} = F\left(\sum_{j} w_{ij}^{(l)} y_j^{(l-1)} + b_i^{(l)}\right) \quad (2)$$
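The neuron and layer computations described above can be illustrated with a minimal sketch in plain Python. The sigmoid activation, the 2-3-1 layer sizes, and all weight values are assumptions chosen for illustration only; they are not taken from the study.

```python
import math

def sigmoid(z):
    """One possible activation function F (an assumption, not the paper's choice)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Perceptron form: weighted sum of the inputs plus a bias, passed through F."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

def layer_output(prev_outputs, weight_rows, biases, activation=sigmoid):
    """Multi-layer form: every neuron i of a layer sees the previous layer's outputs."""
    return [neuron_output(prev_outputs, w_i, b_i, activation)
            for w_i, b_i in zip(weight_rows, biases)]

def forward(x, layers):
    """Feed the input through each layer in turn."""
    y = x
    for weight_rows, biases in layers:
        y = layer_output(y, weight_rows, biases)
    return y

# Hypothetical 2-3-1 network with made-up weights and biases.
layers = [
    ([[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]], [0.0, 0.1, -0.1]),  # hidden layer
    ([[0.7, -0.5, 0.2]], [0.05]),                                # output layer
]
prediction = forward([0.3, 0.6], layers)
```

Each tuple in `layers` holds one weight row per neuron, so adding a hidden layer only means appending another tuple.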

Backpropagation
The backpropagation algorithm, a fundamental technique for training neural networks, takes its name from the way it handles errors by propagating them in the opposite direction of the network's forward flow. This method involves two key steps: the forward pass and the backward pass. In the forward pass, the network computes predictions by processing input data through its layers. In the backward pass, it uses the error between the predictions and the actual data to adjust the model's internal parameters, such as weights and biases. The gradients needed for these adjustments are obtained by applying the chain rule of derivatives, as shown in Figure 3 and Equation (3) [8]. This iterative process continues until the network converges.
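As a hedged illustration of the forward and backward passes, the sketch below trains a single sigmoid neuron with a squared-error loss. The loss function, learning rate, number of epochs, and data values are assumptions for illustration and do not reproduce the settings of the study.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w, b):
    """Forward pass: weighted sum plus bias, then activation."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

def loss(x, w, b, target):
    """Squared error E = 0.5 * (y - target)^2 (assumed loss)."""
    return 0.5 * (forward(x, w, b) - target) ** 2

def gradients(x, w, b, target):
    """Backward pass via the chain rule: dE/dw_i = dE/dy * dy/dz * dz/dw_i."""
    y = forward(x, w, b)
    dE_dy = y - target            # derivative of the squared error
    dy_dz = y * (1.0 - y)         # derivative of the sigmoid
    delta = dE_dy * dy_dz
    grad_w = [delta * xi for xi in x]  # dz/dw_i = x_i
    grad_b = delta                     # dz/db = 1
    return grad_w, grad_b

def train(x, target, w, b, lr=0.5, epochs=200):
    """Repeat forward and backward passes, moving against the gradient."""
    for _ in range(epochs):
        grad_w, grad_b = gradients(x, w, b, target)
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b = b - lr * grad_b
    return w, b

# Hypothetical single training sample and initial parameters.
x, target = [0.3, 0.6], 0.8
w0, b0 = [0.1, -0.2], 0.0
w1, b1 = train(x, target, w0, b0)
```

In a multi-layer network the same chain rule is applied layer by layer, with each layer's `delta` computed from the deltas of the layer after it.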

Hyperparameters
Hyperparameters are user-defined variables that are essential for training ANN models. These hyperparameters have a significant impact on the efficiency and performance of the model. Key hyperparameters include not only structural elements such as the number of neurons or hidden layers, but also various other factors that can affect the model's effectiveness.

The proper configuration of these hyperparameters is a critical stage in determining the efficiency and performance of the model. In this work, the hyperband technique of Li et al. [10] is used to identify hyperparameter settings that are suitable for the characteristics and structure of the data. The hyperband approach progressively selects and optimizes hyperparameter combinations that exhibit superior performance, rather than evaluating all possible combinations at once. This method is reported to provide faster optimization than traditional techniques such as grid search, random search, and Bayesian optimization, while achieving superior results. As a result, the hyperband method is widely used in practice.
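The idea of progressively keeping only the better-performing combinations can be sketched with successive halving, the subroutine that Hyperband runs repeatedly with different trade-offs between the number of configurations and the budget per configuration. This is a simplified sketch and not the full Hyperband algorithm; the `evaluate` function, its toy loss, the learning-rate search range, and `eta = 3` are all hypothetical.

```python
import math
import random

def evaluate(config, budget):
    """Hypothetical stand-in for 'train for `budget` epochs, return validation loss'."""
    # Toy loss: lowest near learning_rate = 0.01 and shrinking as the budget grows.
    return abs(math.log10(config["learning_rate"]) + 2.0) + 1.0 / budget

def successive_halving(configs, min_budget=1, eta=3):
    """Keep the best 1/eta of the configurations each round, with eta times more budget."""
    budget = min_budget
    survivors = list(configs)
    while len(survivors) > 1:
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        survivors = ranked[:max(1, len(ranked) // eta)]
        budget *= eta
    return survivors[0]

# 27 randomly sampled candidates: 27 -> 9 -> 3 -> 1 over three rounds.
rng = random.Random(0)
candidates = [{"learning_rate": 10 ** rng.uniform(-4, -1)} for _ in range(27)]
best = successive_halving(candidates)
```

Most of the training budget is spent on the few configurations that survive the early rounds, which is where the speed-up over evaluating every combination in full comes from.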

Multi-Task Learning
A neural network model that handles multiple input and output variables is known as a multi-input, multi-output (MIMO) model. The methods for building MIMO models can be divided into single-task learning and multi-task learning. In single-task learning, all output variables share a layer, which poses difficulties due to interdependencies. Multi-task learning, on the other hand, separates the variables into different layers within a single model, allowing tailored learning for each variable. This approach is more efficient and suitable for building MIMO models for multiple output predictions. Figure 4 shows the basic structure of multi-task learning.
The goal of multi-task learning can be briefly described as follows: "Multi-task learning is an approach to inductive transfer that improves generalization by using the domain information contained in the training signals of related tasks as an inductive bias. It does this by learning tasks in parallel while using a shared low dimensional representation; what is learned for each task can help other tasks be learned better" [13,14]. The fundamental assumption of multi-task learning is that the tasks being learned are all related, or at least some of them are, so jointly learning all tasks can lead to better learning outcomes than learning each task independently. The idea is that, by having tasks learn together, information shared between different tasks can improve overall performance. In processes like injection molding, the qualities of the molded product are not entirely independent functions of the process conditions; they are influenced by interactions, with some aspects affecting each other and contributing to the overall output (qualities). Multi-task learning is a suitable structure for such processes because certain aspects jointly influence tasks, while others affect them independently, creating an intertwined relationship among tasks.
Multi-task learning has commonly been employed to enhance pattern recognition accuracy in computer vision applications. This involves multiple tasks sharing the same input and being processed within a single neural network, a setup often referred to as multi-class learning [14]. More recently, multi-task learning techniques have been applied to various deep learning architectures such as CNNs and LSTMs, and to regression problems within artificial neural networks [11,14]. The adoption of deep learning techniques, particularly the deep neural network structure, has gained prominence in multi-task learning due to its ability to learn latent representations of data without requiring explicit hand-crafted formulations [11]. In various applications, the approach often involves either hard parameter sharing (where the hidden layers are shared among all tasks) or soft parameter sharing (where each task has its own model, represented by its own set of parameters, within the hidden layers) [11,12,14], as shown in Figure 4.

Materials and Molding Equipment
In this study, injection molding experiments were conducted to collect data for the development of an artificial neural network (ANN). A single-cavity mold was used to injection mold a bowl-shaped product with a diameter of 99.90 mm and a height of 50.80 mm, as shown in Figure 5. The material selected for injection molding was LUPOL GP1007F polypropylene (LG Chem, Seoul, Republic of Korea). The injection molding machine used in this study was the LGE-150 model (LSMtron, Anyang, Gyeonggi, Republic of Korea) equipped with a 32 mm diameter screw. This model has a clamping force of 150 tons, a maximum injection speed of 1000 mm/s, and a maximum injection pressure of 350 MPa.

Experimental Conditions
The recommended molding ranges for resin and mold temperatures were defined by considering the resin manufacturer's recommendations and the property database of LUPOL GP1007F within Moldflow Insight 2023 (Autodesk, San Rafael, CA, USA). These temperature ranges were categorized into three levels, as shown in Table 1. The packing pressure and time were determined based on preliminary experiments to establish suitable process ranges for the product, and these were also divided into three levels and applied in actual molding experiments, as shown in Table 1. The injection time and cooling time were obtained through CAE analysis using Moldflow Insight 2023 (Autodesk, USA), and these conditions were also divided into three levels, similarly to the other process parameters, for application in molding experiments. Based on the levels of process conditions in Table 1, 27 process conditions were generated using the L27 orthogonal array design. Additionally, 23 process conditions were created by randomly selecting values within the minimum and maximum ranges of the process conditions in Table 1, as shown in Table A1 (Appendix A). In total, injection molding experiments were conducted for 50 sets of process conditions, collecting data on the mass, diameter, and height of the injection-molded products to construct the dataset used for training the ANN.
The mass of the injection-molded product was measured using the CUX420H (CAS, Yangju-si, Gyeonggi-do, Republic of Korea) electronic scale. The mass measurements were conducted under ambient conditions, using a case cover during the process to minimize the influence of atmospheric movement. The measurements were taken to two decimal places, and the average of five measurements was used. The diameter of the product was evaluated using the average of the values measured at six points, as shown in Figure 6a. Diameter measurement was performed using the Datastar200 (RAM OPTICAL INSTRUMENT, Westlake, OH, USA), a non-contact optical measurement device. The molded part was placed on a properly leveled measuring table, and the outline of the part was measured with the device to obtain the diameter, as shown in Figure 6b. The height of the product was determined using the average of the values measured at the four points shown in Figure 7 with the digimatic height gauge (Mitutoyo, Kawasaki, Kanagawa, Japan). The height was measured by attaching a height gauge to a vertical rod and placing the product between the gauge and a leveled measuring table. Each point in Figure 7 was measured five times, and the average value was used.
Table 1. Process conditions and levels for the experiment.
Out of the dataset containing 50 different process conditions, 38 were designated as the training dataset, 6 as the validation dataset, and the remaining 6 as the test dataset, which was used to evaluate the performance of the model. To ensure that the influence of the parameter scale was consistent and to standardize the magnitudes and differences between the parameter values, all datasets underwent min-max normalization using Equation (4). In the early stages of this research, an artificial neural network was constructed using standard min-max normalization, which is commonly employed in statistics and scales values to the range between 0 and 1. However, challenges arose due to parameters being normalized to 0, leading to data saturation and difficulties in predicting accurate output values for certain cases. To resolve this issue, min-max normalization was implemented within the range of 0.1 to 0.9, as outlined in Equation (4):
$$x' = (0.9 - 0.1) \times \frac{x - x_{\min}}{x_{\max} - x_{\min}} + 0.1, \quad x' \in [0.1, 0.9] \quad (4)$$
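The min-max normalization to the range 0.1 to 0.9 can be sketched as follows, together with the inverse mapping needed to convert predictions back to physical units. The function names and the sample temperature values are hypothetical and are not taken from Table 1.

```python
def fit_min_max(values):
    """Return the minimum and maximum of one parameter."""
    return min(values), max(values)

def normalize(x, x_min, x_max, low=0.1, high=0.9):
    """Scale x linearly so that x_min maps to `low` and x_max maps to `high`."""
    return (high - low) * (x - x_min) / (x_max - x_min) + low

def denormalize(x_norm, x_min, x_max, low=0.1, high=0.9):
    """Invert normalize(): map a scaled value back to the original units."""
    return (x_norm - low) * (x_max - x_min) / (high - low) + x_min

# Hypothetical three-level melt temperatures in degrees Celsius.
melt_temps = [200.0, 220.0, 240.0]
t_min, t_max = fit_min_max(melt_temps)
scaled = [normalize(t, t_min, t_max) for t in melt_temps]
restored = [denormalize(s, t_min, t_max) for s in scaled]
```

A common practice, assumed here rather than stated in the text, is to compute the minimum and maximum on the training data only and reuse them for the validation and test data.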

Neural Network Architectures and Implementation
In this study, three multi-input, multi-output (MIMO) models were constructed with six process parameters (melt temperature, mold temperature, injection speed, packing pressure, packing time, and cooling time) as input parameters, and the mass, diameter, and height of the molded product as output parameters. One of the models is Network A, shown in Figure 8, a conventional artificial neural network (ANN) that combines all three output parameters in a single-task layer. The other two models were built using multi-task learning by grouping the output parameters based on their physical meanings and characteristics. One model, Network B (Figure 9), groups the length parameters, represented by diameter and height, and treats mass as a separate group, while the other model, Network C, classifies all output parameters into separate groups, as shown in Figure 10.
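The three output groupings can be sketched as a shared hidden layer followed by one branch per output group, which is a hard-parameter-sharing layout. This is only an illustrative sketch: the layer sizes, tanh activation, random initialization, and the names `GROUPINGS`, `build_network`, and `predict` are assumptions, and the networks are untrained.

```python
import math
import random

INPUTS = ["melt_temp", "mold_temp", "injection_speed",
          "packing_pressure", "packing_time", "cooling_time"]

# Output groupings as described in the text for Networks A, B, and C.
GROUPINGS = {
    "A": [["mass", "diameter", "height"]],      # all outputs in one task layer
    "B": [["mass"], ["diameter", "height"]],    # mass vs. length
    "C": [["mass"], ["diameter"], ["height"]],  # one branch per output
}

def dense(n_in, n_out, rng):
    """Create a randomly initialized fully connected layer (weights, biases)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def apply_dense(layer, x, activation):
    weights, biases = layer
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def build_network(groups, shared_units=8, branch_units=4, seed=0):
    """Shared hidden layer plus one (hidden, output) branch per output group."""
    rng = random.Random(seed)
    shared = dense(len(INPUTS), shared_units, rng)
    branches = [(dense(shared_units, branch_units, rng),
                 dense(branch_units, len(group), rng), group)
                for group in groups]
    return shared, branches

def predict(network, x):
    """Run the shared layer once, then each branch on the shared features."""
    shared, branches = network
    features = apply_dense(shared, x, math.tanh)
    outputs = {}
    for hidden, head, group in branches:
        branch_features = apply_dense(hidden, features, math.tanh)
        values = apply_dense(head, branch_features, lambda z: z)  # linear output
        outputs.update(zip(group, values))
    return outputs

network_b = build_network(GROUPINGS["B"])
example = predict(network_b, [0.5] * len(INPUTS))  # normalized inputs
```

With grouping "A" the layout reduces to an ordinary feed-forward network, so the three variants differ only in how the layers after the shared one are split.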
Table 2 shows the hyperparameters used to build the ANN models in this study and the search ranges used for optimization. In this study, the hyperparameter search included the seed number. Typically, when an artificial neural network is constructed, the seed number is fixed to a specific value while the remaining hyperparameters are explored. However, the seed number, like the batch size, interacts with the algorithmic characteristics of the device and program that train the neural network, and is a factor that must be considered carefully. In the initial stages of the research, the seed number was fixed to a specific value without exploration. However, it was observed that, even with the same network structure, the results could vary significantly depending on the seed number. Therefore, to prevent such variations and optimize the results, the seed number was treated as a hyperparameter and explored for optimal values.
The optimizer was fixed to the widely used Adam optimizer, with its parameters defined based on the work of Kingma et al. [15]. Initially, an artificial neural network was constructed by applying the commonly used default coefficient values to the Adam optimizer. However, because previous studies have reported that the performance of the Adam optimizer varies with its coefficients, and because the coefficients need to be explored for each dataset, this research adopted the coefficient ranges explored by Kingma et al. [15] in order to search for and apply the optimal coefficients. The activation function was set to the popular ELU function from the ReLU family, and the He normal initializer, which is known to perform well with the ReLU family, was chosen as the initializer. For the output layer, where the results of the neural network model are produced, a linear activation function was applied together with the Xavier normal initializer, which performs well with the linear function.
Other hyperparameters were explored according to the ranges shown in Table 2. However, to facilitate a comparison of the performance of Networks A, B, and C, the number of common hidden layers was set to 3, and the number of hidden layers associated with each output parameter was set to 1. In addition, the root mean square error (RMSE) was used as the metric to evaluate performance during the training of the ANNs.
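Treating the seed number as one more searchable hyperparameter can be sketched as follows. This is a hedged illustration, not the procedure used in the study: the text does not state the search strategy, so random search is an assumption, and every range in `SEARCH_SPACE` as well as the stand-in `toy_evaluate` function are hypothetical placeholders for the ranges of Table 2 and for a real train-and-validate step.

```python
import random

# Hypothetical search space; the actual ranges are those listed in Table 2.
SEARCH_SPACE = {
    "seed": range(0, 100),                # seed explored like any other hyperparameter
    "batch_size": [4, 8, 16],
    "learning_rate": [1e-4, 1e-3, 1e-2],  # Adam step size
    "beta_1": [0.0, 0.9],                 # Adam first-moment decay
    "beta_2": [0.99, 0.999, 0.9999],      # Adam second-moment decay
}


def sample_config(rng):
    """Draw one value per hyperparameter uniformly from its candidate list."""
    return {name: rng.choice(list(choices)) for name, choices in SEARCH_SPACE.items()}


def random_search(evaluate, n_trials=20, search_seed=0):
    """Return the configuration with the lowest score (e.g. validation RMSE)."""
    rng = random.Random(search_seed)
    best_config, best_score = None, float("inf")
    for _ in range(n_trials):
        config = sample_config(rng)
        score = evaluate(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_evaluate(config):
    """Stand-in for training a network and returning its validation RMSE."""
    return abs(config["learning_rate"] - 1e-3) + 1e-6 * config["seed"]


best_config, best_score = random_search(toy_evaluate, n_trials=50)
```

In a real run, `toy_evaluate` would be replaced by a function that seeds the framework with `config["seed"]`, trains the network, and returns the RMSE on the validation dataset.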

Results
Table 3 shows the results of hyperparameter exploration for Networks A, B, and C. Note that during hyperparameter exploration, the hidden layer structure was kept consistent across the three artificial neural network (ANN) architectures to allow a direct comparison. The prediction results for the test data not used in training (Experiments #28, 30, 31, 32, 36, and 45, as shown in Table A1), obtained with the networks described in Table 3, are shown in Table 4. To evaluate the performance, the root mean square error (RMSE) between the measured values and the predictions generated by the neural network was calculated for the normalized test data. As shown in Table 4, Networks B and C, which use the multi-task learning structure with grouped quality factors of the injection-molded products, outperformed the conventional single MIMO structure used in Network A. In particular, Network C exhibited the lowest RMSE value, an improvement of approximately 32.8% over the conventional structure of Network A on the total normalized test data.
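The RMSE metric and the relative improvement quoted here can be written out as a short Python sketch. The function names are hypothetical, and it is an assumption that the improvement percentage is the relative reduction in RMSE with respect to Network A; the text does not give the formula explicitly.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(measured) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def improvement_percent(baseline_rmse, candidate_rmse):
    """Relative RMSE reduction of a candidate network versus a baseline, in percent."""
    return (baseline_rmse - candidate_rmse) / baseline_rmse * 100.0


# Illustrative numbers only, not values from Table 4.
example_gain = improvement_percent(baseline_rmse=0.2, candidate_rmse=0.1)
```

Under this definition, a positive value means that the candidate network has a lower RMSE than the baseline.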
Table 5 compares the prediction performance for each individual quality factor of the molded product rather than for the entire test dataset, and Figure 11 illustrates these results graphically. Table 5 shows that Network A, the conventional MIMO neural network structure, exhibited the lowest prediction performance for all three factors: mass, diameter, and height. In contrast, Network C showed the best prediction performance for mass, diameter, and height. Its most significant improvement over Network A was in mass, at approximately 56.6% based on the RMSE values. Figure 11 confirms that the multi-task learning structures, grouped by the quality factors of the molded products, generally outperformed the conventional ANN.
To analyze the error deviation between the actual measurements of the injection-molded product's quality and the predictions of the networks, the mean and standard deviation of the squared errors were calculated, as shown in Table 6. The mean of the squared errors is the mean squared error (MSE). Figure 12 compares the standard deviations of the prediction errors calculated in Table 6 across the networks. Figure 12c shows that the difference in standard deviation between Networks A and B is almost negligible. However, when the overall results for mass, diameter, and height are considered, the standard deviations of the prediction errors for Networks B and C improved compared with that of the conventional single-task structure of Network A. Network C, which assigns a separate task to each of mass, diameter, and height, shows an improvement in standard deviation of up to approximately 83.9% compared with Network A.
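The two statistics reported in Table 6 can be sketched as below. The function name is hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the text does not say whether the population or sample form was used.

```python
import statistics


def squared_error_stats(measured, predicted):
    """Return (MSE, standard deviation) of the squared prediction errors."""
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return statistics.fmean(squared), statistics.pstdev(squared)


# Illustrative values only: squared errors are 1 and 9.
mse, spread = squared_error_stats([0.0, 0.0], [1.0, 3.0])
```

A small standard deviation indicates that the squared errors are similar across test cases, whereas a large one indicates that a few cases dominate the MSE.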
Figure 13 compares the performance of the networks against the dimensional quality specifications of ISO 20457:2018 (plastic-molded parts: tolerances and acceptance conditions) for diameter and height in millimeters [16], and against the quality specification for mass in percent [17], which is commonly applied to PP molded parts. The ISO 20457:2018 specification for the injection-molded parts used in this study is 0.09 mm for both diameter and height [16], and the quality specification for the mass of the molded parts is 0.5% [17]. Studies applying artificial neural networks to injection molding processes have primarily expressed performance using metrics such as error ratios or RMSEs. However, to assess whether the constructed artificial neural network is practically applicable to injection molding processes, it is essential to compare the results against the quality specifications used in the industry. Therefore, in this study, the final performance of the artificial neural network was evaluated using the actual quality specifications of the manufactured products as a benchmark.
Comparing the results with the quality specification for mass confirms that test datasets 1, 4, and 5 in Network A exceed the specification, as shown in Figure 13a. In contrast, both Network B and Network C meet the quality specification for mass, with Network C generally providing the predictions closest to the measured values. The comparative results for diameter in Figure 13b and height in Figure 13c show that the predictions of all networks meet the quality specifications. Furthermore, as in Figure 13a, Network C consistently produces the results closest to the actual measurements. Based on these results, it can be confirmed that, in the construction of ANNs for predicting the mass, diameter, and height of injection-molded products, the architectures of Networks B and C, which apply multi-task learning by grouping the quality factors according to their characteristics, outperform the traditional single-task MIMO neural network structure (Network A).
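The comparison against quality specifications can be expressed as a simple check in Python. This sketch rests on assumptions that the text does not spell out: that the specification is applied to the absolute deviation between the prediction and the measurement, that predictions have been converted back from normalized values to physical units, and that the mass tolerance is a percentage of the measured mass. The function names and sample values are hypothetical; only the limits of 0.09 mm and 0.5% come from the text.

```python
DIMENSION_TOL_MM = 0.09  # diameter and height limit stated in the text
MASS_TOL_PERCENT = 0.5   # mass limit stated in the text


def within_dimension_spec(measured_mm, predicted_mm, tol_mm=DIMENSION_TOL_MM):
    """True if the prediction deviates from the measurement by at most tol_mm."""
    return abs(predicted_mm - measured_mm) <= tol_mm


def within_mass_spec(measured_g, predicted_g, tol_percent=MASS_TOL_PERCENT):
    """True if the relative mass deviation is at most tol_percent."""
    return abs(predicted_g - measured_g) / measured_g * 100.0 <= tol_percent


# Illustrative values only, not data from the experiments.
mass_ok = within_mass_spec(measured_g=50.0, predicted_g=50.2)           # 0.4% deviation
mass_bad = within_mass_spec(measured_g=50.0, predicted_g=50.3)          # 0.6% deviation
height_ok = within_dimension_spec(measured_mm=60.00, predicted_mm=60.05)
```

Reporting pass or fail against a tolerance complements RMSE, because it shows whether each individual prediction would be acceptable in production.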

Discussion and Conclusions
In this study, artificial neural networks (ANNs) were built to predict the relationship between process conditions and product quality in injection molding. Injection molding experiments were conducted on bowl products, and data were collected to evaluate the predictive performance of a multi-task learning structure with the grouping of quality factors based on their physical meanings and characteristics.
Based on the collected dataset, three ANNs with different architectures were constructed. The first is Network A, the existing multi-input, multi-output (MIMO) structure, in which the output parameters for the mass, diameter, and height of the molded product are connected to a single task layer. The second is Network B, in which diameter and height are grouped and assigned to one task layer, and mass is placed in a separate group with its own task layer. The third is Network C, in which the output parameters mass, diameter, and height are each assigned to an individual task layer. Networks B and C, which applied multi-task learning according to output parameter groups, both showed superior performance in predicting product quality in all scenarios compared with the typical MIMO-ANN, Network A. In particular, the architecture of Network C, which assigns product mass, diameter, and height to separate task groups, showed excellent performance in predicting all three quantities. Compared with the general MIMO-ANN, Network A, the overall root mean square error (RMSE) of Network C on the entire test data improved by approximately 32.8%. For mass, diameter, and height, the respective improvements were 56.6%, 15.0%, and 44.3%, indicating that Network C exhibited superior predictive performance compared with the conventional MIMO neural network (Network A) based on RMSE. These results suggest that a multi-task learning architecture, which separates groups based on the characteristics of the quality factors of injection-molded products set as output parameters, may be a more suitable approach for the quality prediction of injection-molded products using ANNs.
The analysis of the specific dataset of the bowl product used in this study indicates that a multi-task learning architecture, which divides and assigns separate tasks based on the physical meanings and characteristics of the quality factors of injection molded products set as output parameters, may be a better choice for predicting the mass, diameter, and height of injection-molded products compared to the conventional MIMO structure of the ANN.The results of this study are expected to serve as valuable reference material for future research on the application of the ANN in the injection molding industry.

Figure 2. Process for artificial neural network.

Figure 6. (a) Measurement points of bowl product; (b) method for measurement of diameter.

Figure 7. Measurement points of bowl product: height.

Figure 8. Network A with the output parameters of mass, diameter, and height connected to the single-task layer.

Figure 9. Network B with diameter and height as one group and mass as the other group.

Figure 10. Network C with mass, diameter, and height all categorized into separate groups.

Figure 11. Root mean square errors (RMSEs) for each quality of the injection-molded part according to the network structure: (a) mass; (b) diameter; (c) height.

Figure 12. Standard deviation of squared errors for each quality of the injection-molded part according to the network structure: (a) mass; (b) diameter; (c) height.

Figure 13. Performances of the prediction models using test data according to networks in terms of (a) mass; (b) diameter; (c) height.

Table 1. Process conditions and levels for the experiment.

Table 2. Ranges of hyperparameters for networks.

Table 4. Root mean square errors (RMSEs) of total normalized property data for networks.

Table 5. Root mean square errors (RMSEs) of each normalized property data for networks.

Table 6. Mean square errors (MSEs) and standard deviations of each normalized property data for networks.