1. Introduction
The injection molding process is one of the most widely used manufacturing methods for the production of complex high-volume plastic components [
1,
2]. Optimizing the process parameters is crucial for achieving sustainability goals, since it minimizes energy and material input [
3]. Obtaining optimal process parameters is a major challenge due to the highly nonlinear and interdependent relationships between different process parameters, such as injection pressure, temperature, holding pressure, and cooling time [
4]. An improper set of parameters can result in several defects, such as warpage, sink marks, incomplete filling, and flash formation, as well as long cycle times and increased energy consumption [
5]. Systematic adjustment of melt temperature, mold temperature, and packing pressure can significantly reduce surface and dimensional defects, though precise control is required to avoid residual stresses that lead to warpage [
5]. Traditional parameter-tuning methods are based on expert knowledge and numerous trial-and-error steps that are time-consuming, costly, and difficult to scale. Manual tuning typically stops once acceptable part quality is reached with a workable parameter set, and further fine-tuning is rarely pursued due to time constraints. Consequently, there is growing research interest in alternative approaches for process optimization that enable automated and data-driven parameter adjustment [
6]. Moreover, recent studies emphasize that comprehensive analyses of parameter interactions, including melt temperature, mold temperature, and cooling time effects, are necessary to understand their collective influence on warpage and other critical defects [
7].
In industry, the standard is first to establish an initial good state, which is a baseline parameter configuration derived from the operator’s experience in previous production runs or mold setup sheets [
8]. However, converging to a good state can be challenging, especially when processing with a new mold, where there is no historical machine–mold fingerprint and operators must determine suitable process windows from scratch [
9]. Even when re-setting an existing mold, where technicians rely on documented “golden settings” or previously validated process parameters, stable operation is not always guaranteed due to mold-specific rheology, machine-induced variations, material batch fluctuations (i.e., viscosity, moisture content), and thermal history effects [
10].
These machine–mold–material interactions introduce dynamic variations in the resulting part quality. Compensating for these effects would require constant monitoring and, if needed, parameter correction by an operator. This makes manual parameter tuning, based on operator skill, insufficient to efficiently achieve consistent, globally optimal states in terms of quality, cycle time, and energy consumption. Time and resource constraints for the operator, who, in practice, often supervises multiple machines at the same time, prevent exhaustive monitoring and correction. This highlights the need for an adaptive autonomous approach that continuously adjusts parameters based on real-time feedback to maintain optimal production conditions despite these fluctuations.
Previous research on injection molding optimization has explored statistical and machine learning techniques such as Design of Experiments (DOE), response surface methodology (RSM), and Artificial Neural Networks (ANNs) to model and predict process behavior. In recent years, evolutionary and multi-objective optimization algorithms have been increasingly applied for the optimization of the injection molding process due to their ability to handle nonlinear, high-dimensional, and conflicting objectives. Genetic algorithm (GA)-based approaches remain prevalent, with several studies integrating surrogate models to reduce experimental cost. Early hybrid Artificial Neural Network–Genetic Algorithm (ANN-GA) approaches demonstrated the ability to capture nonlinear relationships between parameters of the injection molding process and part quality [
11]. More recent studies have extended these frameworks to optimize quality-related objectives such as warpage and shrinkage, demonstrating strong modeling capabilities [
12]. However, these methods typically rely on extensive offline datasets and simulation-driven evaluations, which limit their scalability and suitability for real-time, adaptive industrial deployment. Pascoschi et al. employed unsupervised learning techniques based on autoencoders and clustering to analyze energy consumption patterns in industrial injection molding, demonstrating the potential of machine learning to understand processes and improve energy efficiency [
13]. However, such approaches focus primarily on analysis rather than on closed-loop parameter optimization and require expert interpretation of the extracted patterns. Kariminejad et al. proposed a Bayesian adaptive design of experiments that reduces the number of required trials compared to classic NSGA-II and desirability methods; however, it still depends on structured sampling strategies and assumes surrogate models that may not generalize well in highly dynamic production environments [
6]. Hong et al. combined the response surface methodology (RSM) with a backpropagation (BP) neural network and NSGA-II to optimize process parameters via simulation, but the approach is largely validated using numerically generated data, which may not fully capture the complexities of real machine behavior or support online optimization [
14]. Narowski and Wilczyński developed a global injection molding model that incorporates the plasticizing system and mold flow using detailed numerical simulations and experimental validation [
15]. Such models improve simulation fidelity but remain computationally intensive and are not designed for real-time optimization or adaptive control.
Recently, multi-objective evolutionary algorithms (MOEAs) have gained attention due to their ability to handle nonlinear, high-dimensional, and conflicting objectives in injection molding. Among these, the Non-dominated Sorting Genetic Algorithm II (NSGA-II) [
16], Non-dominated Sorting Genetic Algorithm III (NSGA-III) [
17], Strength Pareto Evolutionary Algorithm 2 (SPEA2) [
18], and Multi-objective Evolutionary Algorithm based on Decomposition (MOEA/D) [
19] are widely used in engineering optimization problems. Other works have adopted MOEAs such as NSGA-II in combination with deep neural networks or response surface models to jointly optimize cycle time, energy consumption, and quality metrics [
20,
21,
22]. While these approaches achieve improved Pareto-optimal solutions in simulation environments, they typically focus on a single evolutionary strategy and lack systematic comparisons with alternative MOEAs. Experimental comparisons of classical MOEAs, such as NSGA-II, SPEA2, and MOEA/D, on benchmark problems have shown that each method exhibits distinct strengths and weaknesses with respect to convergence and solution diversity [
23]. Benchmark studies in the evolutionary optimization literature highlight that algorithm performance can vary significantly depending on problem structure, objective dimensionality, and constraint handling [
24]. This suggests that algorithm selection remains problem-dependent. Real-time optimization studies emphasize the importance of adaptive feedback and online learning but often avoid multi-objective trade-offs or evolutionary comparisons due to computational complexity [
6]. Gaspar-Cunha et al. provided a comprehensive survey of optimization strategies in injection molding, covering surrogate modeling, evolutionary algorithms, and multi-objective formulations [
25]. The authors highlight that despite methodological advances, most studies rely on offline simulations and lack a systematic experimental comparison of alternative MOEAs under industrial conditions.
Table 1 summarizes these works, highlighting the problems addressed, techniques used, limitations, and remarks regarding industrial and real-time applicability. Some recent works move closer to industrial deployment by leveraging real production data. Vega et al. used machine learning classifiers such as random forests and logistic regression to predict and classify injection molding process states with high accuracy using real machine data [
26]. Nevertheless, these studies focus on process monitoring and classification rather than multi-objective parameter optimization or adaptive decision making. While surrogate-based, hybrid, and MOEA approaches have provided valuable insights into parameter interactions, most methods rely on offline data, are computationally intensive, and lack the flexibility needed for adaptive, real-time optimization in industrial contexts. NSGA-II remains a practical choice due to robustness and relatively low computational overhead, but alternative algorithms may offer advantages in many-objective or structured problems, albeit with higher complexity.
To address these limitations, this study integrates expert knowledge into a hybrid data-driven optimization framework to answer the following research questions:
1. What system components are needed for real-time optimization, and how can they be integrated into the production process?
2. Can AI provide better settings than a human expert, and where do they differ?
3. How much data is needed for good quality predictions?
4. How should limitations arising from real-world production be handled?
The effectiveness of the proposed framework is then shown based on a case study of a real-world mold-machine use case.
Figure 1 shows the high-level overview of the proposed system and its interactions with the injection molding machine. The blue blocks highlight the main components required for the solution.
The objective is to enable data-driven process optimization in injection molding by combining digital process monitoring, smart data acquisition, and AI-supported decision systems. The goal is to optimize the process and help avoid production defects and cycle-dependent variability through the development of automated parameter recommendation systems and closed-loop optimization workflows. The optical quality inspection module was developed in [
27,
28]. This paper focuses on the remaining modules: the data-acquisition and preprocessing framework, the surrogate-modeling methodology, the multi-objective NSGA-II-based optimizer, and the rule-based feedback interface. The developed workflow is validated in an industrial injection-molding case study at a local production company (KHW [
29]), demonstrating the applicability of automated optimization in a real production environment. Changes in the recommended parameters as well as the input parameters to the workflow are visualized via Grafana [
30] for testing and validation.
In contrast to existing work on injection molding optimization, which focuses on process-parameter tuning using simulation-based surrogate models [
11,
14] or real-time machine-level adjustments without multi-objective trade-offs [
4,
6], this work introduces an integrated end-to-end optimization framework that directly connects industrial machine data, surrogate modeling, multi-objective optimization, and defect-based feedback. Several contributions address the first research question. These include (1) the development of a fully data-driven surrogate modeling pipeline based on industrial machine logs, rather than the simulation-derived datasets commonly used in prior studies. This work also (2) introduces a classifier-constrained NSGA-II algorithm that jointly optimizes cycle time and energy consumption while enforcing a learned quality constraint, an approach not found in the existing literature, where quality is usually modeled as an objective rather than a feasibility condition. Finally, it (3) integrates this optimization into a closed-loop workflow that incorporates inline defect detection from the visual inspection pipeline, allowing automated parameter suggestion under real production conditions. To the best of our knowledge, no existing work has described a workflow that combines real machine data, defect-aware surrogate modeling, multi-objective evolutionary optimization, and rule-based system feedback in a single operational framework suitable for deployment on industrial shop floors.
The rest of this paper is structured as follows: Section 2 introduces the use case for this research and the general working process of the machine. Section 3 describes the concept and implementation workflow, including data acquisition, preprocessing, and optimization architecture. Section 4 reports the experimental results. Section 5 discusses the benefits, compares the proposed approach with alternative methods, and highlights limitations. Finally, Section 6 concludes and describes future directions.
4. Results
4.1. Surrogate Model Performance
The surrogate modeling framework consists of a binary quality classifier to predict defect occurrence, a cycle-time regressor, and an energy-consumption regressor. All models were trained with CatBoost using historical process data collected from the industrial injection-molding machine. The data were collected during normal production, machine setup, and optimization trial runs as described in
Section 3.2. The complete dataset consists of 5545 injection-molding shots and was split into a training set of 4436 samples and a test set of 1109 samples. The test set also served as the evaluation set during training to enable early stopping. Early stopping was implemented by monitoring performance on the test set and terminating training when no improvement was observed for 50 consecutive iterations. The final model was rolled back to the iteration achieving the best performance on this test set.
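The split and the stopping rule described above can be illustrated with a minimal, library-free sketch. It is not the CatBoost implementation used in this study: the function names and the synthetic loss curve are hypothetical, and the 20% test fraction is inferred from the reported counts (1109 of 5545 shots).

```python
def split_sizes(n_samples, test_fraction=0.2):
    """Return (train, test) sizes for a simple hold-out split.

    The 0.2 fraction is an assumption inferred from the reported counts.
    """
    n_test = round(n_samples * test_fraction)
    return n_samples - n_test, n_test


def early_stopping(eval_losses, patience=50):
    """Return (best_iteration, stop_iteration) for a sequence of evaluation losses.

    Training stops once `patience` consecutive iterations bring no improvement;
    the model is then rolled back to `best_iteration`.
    """
    best_loss = float("inf")
    best_iter = -1
    for i, loss in enumerate(eval_losses):
        if loss < best_loss:
            best_loss, best_iter = loss, i
        elif i - best_iter >= patience:
            return best_iter, i
    return best_iter, len(eval_losses) - 1


train_size, test_size = split_sizes(5545)

# Synthetic loss curve (illustrative only): improves for three iterations, then stalls.
losses = [1.0, 0.5, 0.4] + [0.45] * 60
best_iter, stop_iter = early_stopping(losses, patience=50)
```

With this synthetic curve, the best iteration is the third one (index 2) and training stops 50 iterations later, which mirrors the rollback behavior described in the text.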
The classifier achieved an overall accuracy of 97% on the test set of 1109 shots. Class-wise performance was well balanced, with precision, recall, and F1-scores of 0.94, 1.00, and 0.97 for “good” shots and 1.00, 0.93, and 0.97 for “defective” shots. This shows a robust separation between feasible and infeasible regions of the parameter space. The classifier metrics correspond to the single best iteration on the evaluation set during this training run. This separation is critical for the optimization pipeline because NSGA-II evaluates thousands of candidate configurations: early rejection of infeasible solutions significantly accelerates convergence and prevents exploration of defect-prone zones. The performance metrics of the surrogate models are summarized in Table 3, and the confusion matrix of the classifier is shown in Figure 5. During offline evaluation, the surrogate models were integrated into the optimization and feedback loop, where the predictions were used to select the process parameters.
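The role of the classifier as a feasibility gate can be sketched as follows. This is a simplified illustration, not the NSGA-II implementation used in this study: it only shows how candidates predicted as defective are rejected before the non-dominated solutions are selected. The functions `predict_good` and `predict_objectives` are hypothetical stand-ins for the trained classifier and the two regressors, and all parameter names and values are invented for the example.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def feasible_pareto_front(candidates, predict_good, predict_objectives):
    """Reject candidates predicted as defective, then keep the non-dominated rest."""
    feasible = [c for c in candidates if predict_good(c)]
    scored = [(c, predict_objectives(c)) for c in feasible]
    return [
        (c, f) for c, f in scored
        if not any(dominates(g, f) for _, g in scored)
    ]


# Hypothetical stand-ins for the surrogate models (assumptions, not the real models).
def predict_good(c):
    return c["hold_time_s"] >= 2.0


def predict_objectives(c):
    return (c["cycle_s"], c["energy_wh"])


candidates = [
    {"name": "A", "hold_time_s": 3.0, "cycle_s": 54.0, "energy_wh": 60.0},
    {"name": "B", "hold_time_s": 3.0, "cycle_s": 56.0, "energy_wh": 55.0},
    {"name": "C", "hold_time_s": 3.0, "cycle_s": 57.0, "energy_wh": 62.0},  # dominated by A
    {"name": "D", "hold_time_s": 1.0, "cycle_s": 50.0, "energy_wh": 50.0},  # predicted defective
]

front = feasible_pareto_front(candidates, predict_good, predict_objectives)
front_names = [c["name"] for c, _ in front]
```

In this toy example, candidate D has the best objective values but never enters the front because the feasibility check rejects it first, which is the behavior that treating quality as a constraint rather than an objective is meant to achieve.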
The cycle-time regressor achieved an RMSE of 2.03 s across a target range of approximately 23–61 s; CatBoost early stopping reduced the model to 26 effective iterations. Relative to the upper end of the target range (61 s), this corresponds to a prediction error of roughly 3.3%, indicating that the deviation is small compared to the predicted value scale. Although the absolute error remains non-negligible due to limited training samples, the model exhibits consistent monotonic trends and stable generalization, enabling NSGA-II to reliably estimate relative improvements.
The energy regressor achieved an RMSE of 9.01 Wh in the target range of 26–106 Wh, converging after 46 iterations. Relative to the upper end of the range (106 Wh), this corresponds to an error of approximately 8.5%, reflecting moderate accuracy given the larger natural variability in energy consumption. Despite the small dataset of 5545 shots (4436 of which were used for training), the model captures sensitivity to parameter changes, particularly in the cooling and holding phases where most energy fluctuations occur, providing sufficiently descriptive behavior for guiding the multi-objective search. The data were obtained from 3–4 trial runs covering normal production, machine setup, and the test runs of the optimization. No formal DoE was performed; the system is therefore able to use normal machine activity for data collection and setup.
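The relative errors quoted above can be reproduced with a few lines of arithmetic. The sketch below assumes that the reported percentages normalize the RMSE by the upper end of the target range, which is an inference from the numbers rather than a stated definition. For comparison, it also computes the error relative to the span of the range, which is a stricter measure.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def relative_error_pct(rmse_value, reference):
    """RMSE expressed as a percentage of a reference magnitude."""
    return 100.0 * rmse_value / reference


# Reported RMSE values and target ranges from the text.
cycle_vs_max = relative_error_pct(2.03, 61)         # normalized by the upper end (assumption)
cycle_vs_span = relative_error_pct(2.03, 61 - 23)   # normalized by the span of the range
energy_vs_max = relative_error_pct(9.01, 106)
energy_vs_span = relative_error_pct(9.01, 106 - 26)

print(f"cycle time: {cycle_vs_max:.1f} % of max, {cycle_vs_span:.1f} % of span")
print(f"energy:     {energy_vs_max:.1f} % of max, {energy_vs_span:.1f} % of span")
```

Normalizing by the span gives larger values (about 5.3% and 11.3%), so the choice of reference should be stated whenever relative errors are compared across models.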
4.2. Multi-Objective Optimization Results (NSGA-II)
The NSGA-II run produced a smooth, concave Pareto front, shown in
Figure 6, between cycle time and energy consumption. The Pareto set shows a clear trade-off: reductions in cycle time incur small increases in energy use, and vice versa. The solutions were uniformly distributed along the front, without artificial clustering, providing a good set of alternatives for decision making.
To assess the effectiveness of the proposed optimization framework, NSGA-II was compared against other methods: NSGA-III, MOEA/D, and SPEA2. The performance of the algorithms was evaluated using the hypervolume (HV), inverted generational distance (IGD), and the number of Pareto-optimal solutions. The HV indicator measures the size of the portion of the objective space that is dominated by the obtained Pareto set with respect to a given reference point. Larger HV values indicate a better approximation of the true Pareto front, reflecting both convergence toward the optimal front and diversity along the trade-off surface. The IGD indicator evaluates how far the reference Pareto front is from the obtained solution set by computing the average distance from each point on the reference front to its nearest neighbor in the approximated set. Lower IGD values correspond to solutions that more closely approximate the reference front. The number of Pareto-optimal solutions indicates the richness of the solution set by counting distinct non-dominated solutions produced by an algorithm. A larger number of Pareto-optimal solutions provides greater flexibility for decision making by offering a broader set of trade-offs among objectives.
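For two minimized objectives, both indicators can be computed in a few lines. The following is a minimal sketch with illustrative toy values; the reference point, the reference front, and the function names are assumptions for the example and do not correspond to the settings used for Table 4.

```python
import math


def pareto_front(points):
    """Return the sorted non-dominated subset of 2D points (both objectives minimized)."""
    unique = sorted(set(points))
    return [
        p for p in unique
        if not any(q != p and q[0] <= p[0] and q[1] <= p[1] for q in unique)
    ]


def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded by the reference point `ref`."""
    front = [p for p in pareto_front(points) if p[0] < ref[0] and p[1] < ref[1]]
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in front:  # f1 ascending, so f2 is descending along the front
        hv += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return hv


def igd(reference_front, approx_set):
    """Mean distance from each reference point to its nearest approximated point."""
    return sum(
        min(math.dist(r, a) for a in approx_set) for r in reference_front
    ) / len(reference_front)


# Toy objective vectors: (cycle time in s, energy in Wh); values are illustrative only.
reference = [(53.0, 70.0), (54.0, 60.0), (56.0, 55.0)]
approx = [(53.5, 71.0), (54.0, 60.0), (57.0, 58.0), (58.0, 66.0)]

hv_approx = hypervolume_2d(approx, ref=(60.0, 80.0))
igd_approx = igd(reference, approx)
igd_self = igd(reference, reference)  # 0.0: every reference point is matched exactly
```

Note that the IGD is zero whenever every point of the reference front is contained in the approximated set, regardless of how many additional solutions that set contains. HV and IGD should therefore be read together with the number of Pareto-optimal solutions.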
The quantitative results are summarized in Table 4. NSGA-II achieved a high mean HV of 725.61 with low variance, indicating a strong balance between convergence and diversity of the Pareto front. In contrast, NSGA-III obtained a lower HV, and MOEA/D produced substantially inferior performance, with a lower mean HV and fewer Pareto-optimal solutions (see Table 4 for the individual values). This indicates reduced robustness under classifier-based feasibility filtering. SPEA2 achieved the highest HV value, but with a lack of variance. In terms of convergence accuracy, NSGA-II achieved an IGD of 0.143, indicating a close approximation to the reference Pareto front. NSGA-III and MOEA/D exhibited higher IGD values of 0.457 and 6.704, respectively, which shows inferior convergence under the same feasibility handling strategy. Although SPEA2 achieves an IGD of 0.000, indicating convergence to the best-known Pareto front, NSGA-II offers a favorable balance between convergence, diversity, and solution richness. This is evident from its high HV value and larger number of Pareto-optimal solutions.
Industrial expert knowledge (e.g., monotonic pressure decay, minimum holding-time windows) and rule-based filters removed approximately 80% of solutions, demonstrating the importance of combining a data-driven search with domain expertise. From the filtered set, operators selected the most promising settings, balancing a short cycle time with a low defect likelihood and practical implementability on the machine.
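A rule-based filter of this kind can be sketched as follows. The field names, the 2.0 s threshold, and the example candidates are hypothetical and chosen for illustration only. The sketch also assumes that monotonic pressure decay means a non-increasing sequence of holding-pressure points; it is not the rule set used at KHW.

```python
def is_monotonic_decay(pressures):
    """True if the holding-pressure points never increase from one point to the next."""
    return all(a >= b for a, b in zip(pressures, pressures[1:]))


def passes_rules(candidate, min_hold_time_s=2.0):
    """Apply two expert rules: monotonic pressure decay and a minimum holding time.

    The threshold value is a placeholder, not a validated machine limit.
    """
    return (
        is_monotonic_decay(candidate["hold_pressures_bar"])
        and candidate["hold_time_s"] >= min_hold_time_s
    )


def filter_candidates(candidates, min_hold_time_s=2.0):
    """Keep only the candidates that satisfy all rules."""
    return [c for c in candidates if passes_rules(c, min_hold_time_s)]


# Illustrative candidates with five holding-pressure points each.
candidates = [
    {"id": 1, "hold_pressures_bar": [60, 55, 50, 45, 40], "hold_time_s": 3.0},
    {"id": 2, "hold_pressures_bar": [60, 62, 50, 45, 40], "hold_time_s": 3.0},  # pressure rises
    {"id": 3, "hold_pressures_bar": [60, 55, 50, 45, 40], "hold_time_s": 1.2},  # hold too short
]

kept = filter_candidates(candidates)
removed_fraction = 1 - len(kept) / len(candidates)
```

Because the rules are plain predicates, operators can add or tighten them without retraining the surrogate models.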
4.3. Iterative Testing
To verify the practical applicability and robustness of the proposed framework, a series of iterative machine tests was conducted on the industrial injection molding machine at KHW.
Figure 7 illustrates the sequence of iterative machine trials performed during the validation of the optimized parameter sets. The plot shows the LastCycleTime with respect to the JobCycleCounter, which represents the unique ID of each cycle, highlighting how different events and parameter adjustments affected cycle behavior during the experiment. Each point corresponds to one of the candidate solutions selected from the filtered Pareto-optimal set.
The left region of Figure 7 shows initial test values, where the cycle time fluctuates around approximately 55–57 s. This range is close to the baseline working values used by KHW machine operators before optimization. A short period marked in red indicates a short-shot event, where the cycle time drops significantly due to incomplete filling. This confirms that certain optimized parameter sets can occasionally produce critical defects. The area marked as “Machine working values” corresponds to the baseline settings generally used by KHW operators during production.
Further into the sequence in Figure 7, an isolated spike above 65 s occurs, annotated as a “Spike in Dosing Time”. This event reflects an instantaneous disturbance in the plastification behavior, typical of the variability of the machine–material relationship observed during long production runs. To the right of this disturbance, further optimized parameter sets were evaluated iteratively. These sets were applied sequentially on the real machine during live production. The figure shows that the optimized settings initially increased the cycle time variance, which is expected because the hold-pressure and cooling-time profiles differ from the historically stable configuration of the machine. Highlighted in green are some of the best working sets of optimized parameters. Here, the cycle time consistently converges toward approximately 53.5 s, improving upon the original 56 s baseline. A brief red zone labeled Flash indicates an over-packing incident produced by one of the more aggressive parameter sets, and the large fluctuations in cycle time again illustrate the importance of the rule-based filtering steps, described in detail in Section 4.4.
Toward the end of the sequence in Figure 7, a green-highlighted region labeled Lower Cooling Time (19 s) demonstrates the best-performing optimized solution. The cooling time was modified only during testing, since it depends on the material type and product thickness. It is a high-risk adjustment because the available information is insufficient to tune this control parameter reliably. Overall, this figure demonstrates how iterative machine testing, guided by the surrogate model and NSGA-II optimization, validates and refines parameter candidates under real production conditions.
4.4. Closed-Loop Feedback Performance
The rule-based feedback module played a central role in stabilizing the optimization-to-machine workflow. The system incorporates the following:
Machine-specific pressure monotonicity constraints;
Minimum holding-time thresholds;
Operator-validated feasibility rules;
Digital defect detection feedback.
With these rules, the system automatically pruned infeasible or high-risk configurations. This reduced the number of invalid parameter combinations by approximately 80%, substantially decreasing the number of physical machine trials required. When the optical inspection system detects defects, the feedback logic reverts to the previously accepted solution, ensuring uninterrupted production.
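The revert-on-defect behavior can be expressed as a small state machine. The sketch below is a hypothetical illustration of the logic described above, not the deployed module: the class name, method names, and parameter dictionaries are assumptions made for the example.

```python
class FeedbackLoop:
    """Track the active parameter set and fall back to the last accepted one on defects."""

    def __init__(self, baseline):
        self.accepted = baseline  # last parameter set known to produce good parts
        self.active = baseline    # parameter set currently running on the machine

    def propose(self, candidate):
        """Activate a new candidate that already passed the rule-based filter."""
        self.active = candidate

    def report_inspection(self, defect_detected):
        """Accept the active set if the part is good, otherwise revert."""
        if defect_detected:
            self.active = self.accepted
        else:
            self.accepted = self.active
        return self.active


# Illustrative values only; the cooling times are not recommended settings.
loop = FeedbackLoop({"cooling_time_s": 21.0})

loop.propose({"cooling_time_s": 19.0})
after_good = loop.report_inspection(defect_detected=False)    # candidate is accepted

loop.propose({"cooling_time_s": 17.0})
after_defect = loop.report_inspection(defect_detected=True)   # reverts to the 19.0 s set
```

Keeping the last accepted set separate from the active set means a defective trial never overwrites a known-good configuration.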
4.5. Comparison Against Industrial Baseline
Following NSGA-II optimization and subsequent rule-based post-processing, three representative parameter sets were selected from the filtered Pareto front. These solutions were chosen because they satisfied the classifier-based quality constraint, exhibited monotonic pressure profiles, and demonstrated favorable trade-offs between cycle time and energy consumption.
Table 5 summarizes the baseline machine settings used in production and the three optimized configurations. The optimized sets show adjustments across control variables, most notably reshaping of the holding pressure levels (points 1–5) and modification of the corresponding timing parameters. These patterns reflect the optimizer’s ability to exploit regions of the search space associated with shorter cycle times or reduced energy usage while still maintaining feasibility as learned by the classifier. Compared to the expert’s baseline, these values also show the ability of the Artificial Intelligence (AI) system to explore further options, especially adjustments of fractions of a second in the timing parameters. Such fine-grained settings are typically not explored during human trials.
Table 5 shows the optimized configurations that correspond to the top three distinct trade-off solutions selected from the filtered Pareto front. From Table 5, the following conclusion is drawn:
Baseline cycle time: ∼56 s.
Optimized cycle time: ∼53.5 s.
Improvement: ≈4.5% reduction.
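The stated improvement follows directly from the baseline of about 56 s and the optimized cycle time of about 53.5 s reported for the best settings in the iterative tests. A short check, with a hypothetical helper name:

```python
def percent_reduction(baseline, optimized):
    """Relative reduction of `optimized` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - optimized) / baseline


cycle_time_reduction = percent_reduction(56.0, 53.5)
print(f"{cycle_time_reduction:.1f} %")  # prints "4.5 %"
```
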
For energy consumption, the surrogate-predicted improvements pointed in the same direction as the values measured during production trials. Although the surrogate cannot fully capture mold- and material-specific thermal behavior due to limited training data, predicted and measured trends matched closely. The first real-machine trial using NSGA-II settings produced a nearly optimal result, demonstrating the practical utility of surrogate-assisted optimization in an industrial injection-molding environment.
4.6. Real-Machine Validation and Defect Behavior
A total of 200 optimized parameter sets, obtained using the NSGA-II algorithm, were evaluated on the KHW injection molding machine. Only two parameter sets resulted in defects, both detected by the optical inspection system, confirming that the solution is repeatable and practically feasible. The surrogate classifier correctly identified feasible regions in most cases, validating its usefulness as a feasibility filter. The surrogate predictions did not show measurable drift, which is an important factor in data-scarce industrial environments. The measured cycle times and energy values closely matched the surrogate predictions, reinforcing the reliability of the model for real-world deployment. Overall, the end-to-end framework demonstrated a successful deployment of AI-assisted optimization in a real industrial setting.
5. Discussion
This work demonstrates that real-time optimization in injection molding relies on the integration of four core components: machine-level data acquisition, a preprocessing and feature-engineering pipeline, surrogate models with a feasibility classifier, and a multi-objective optimizer. Through the implementation and validation of this workflow, this study successfully addressed all research questions introduced in the article. The final implemented workflow shows the real-time optimization capability and is seamlessly integrated into the production environment with minimal operational impact. The iterative testing phase provides key information on the interaction between AI-generated recommendations and expert knowledge. The AI-based method efficiently explored the parameter space and proposed performance-enhancing settings, while human-defined rule-based filtering remains essential for enforcing production constraints and ensuring machine-safe operation. Despite the limited size of the training dataset, iterative machine feedback compensates for data sparsity and leads to stable surrogate model performance. In this use case, 60–80 representative cycles were sufficient for reliable predictions, with further refinement achieved by continuously comparing optimized predictions to real measured results. The best performing results emerged when AI-driven exploration was constrained by domain knowledge, indicating that AI serves as a complementary tool that enhances, rather than replaces, expert-driven process design.
Compared to the recent literature, the proposed framework explicitly incorporates classifier-based feasibility handling and real-machine validation, which are often omitted in simulation-driven studies. Although alternative optimization algorithms such as NSGA-III, MOEA/D, and SPEA2 are widely used, the comparative evaluation conducted in this study demonstrates that NSGA-II offers a more robust balance between convergence accuracy, solution diversity, and feasibility under industrial constraints. This positions the proposed approach as a practical and scalable optimization strategy for real-world injection molding processes.
Nevertheless, several limitations and challenges remain. The precision and generalization capability of the surrogate models and the feasibility classifier depend on the quality and representativeness of the collected production data. Abrupt changes in material properties, machine conditions, or mold configuration are not accounted for in the problem formulation. The classifier-based feasibility filtering may restrict the exploration of unexplored regions of the design space, potentially excluding unconventional but viable solutions. In addition, the computational overhead associated with surrogate training, classifier updates, and multi-objective optimization limits scalability, particularly because high-dimensional parameter spaces require retraining. The proposed framework has been validated on a specific industrial injection molding setup. Direct transferability to other machines, materials, or processes would require additional calibration and domain-specific adjustments. Furthermore, due to ongoing production constraints at KHW, only a limited number of real-machine validation trials could be conducted within the available testing time. These trials were sufficient to confirm the practical feasibility and performance trends of the proposed approach, but more extensive, long-term testing would be required to fully assess robustness under varying production conditions.
The scientific contribution of this work lies not in proposing a new optimization algorithm but in systematically evaluating and validating multi-objective optimization strategies under realistic industrial constraints. By combining classifier-based feasibility handling, surrogate-assisted optimization, and real-machine validation, this study identifies an optimization procedure that is both effective and practically deployable for injection molding processes. This approach advances current practice by bridging the gap between theoretical optimization methods and industrial applicability.
6. Conclusions and Future Work
This research demonstrates the successful implementation of a data-driven optimization framework for an industrial injection-molding process using a large mold at KHW. The complexity of the mold, characterized by extended flow paths, asymmetric cooling behavior, and high sensitivity to pressure and profile variations, makes manual parameter optimization challenging. The machine-implemented tests confirmed that optimized solutions lead to consistent improvements in cycle time and energy consumption, with only minimal defect occurrence.
NSGA-II-based optimization achieved a reduction in cycle time from approximately 56 s in the baseline configuration to 53.5 s for the best optimized setting, corresponding to an improvement of approximately 4.5%. The surrogate models also predicted a reduction in energy consumption, but this reduction was not confirmed in the production trials. Comparative evaluation against NSGA-III, MOEA/D, and SPEA2 demonstrated that NSGA-II provided the most robust balance between convergence accuracy, solution diversity, and feasibility handling, achieving a high mean hypervolume of 725.61 with low variance and an IGD of 0.143. Although SPEA2 achieved zero IGD, it exhibited reduced solution diversity and premature convergence, whereas NSGA-III and MOEA/D showed inferior convergence and significantly fewer Pareto-optimal solutions.
Although this project demonstrates the effectiveness of the proposed approach, several opportunities remain for future work. Feedback-based refinement can be expanded beyond rule-based filtering toward more expressive approaches such as regression-based adaptive weighting of parameters or large language model (LLM)-assisted rule generation, enabling dynamic interpretation of machine behavior and operator feedback. Additional experimental campaigns are planned using different molds and product geometries to systematically evaluate the transferability and robustness of the framework beyond the single industrial use case presented in this study. Expanding the dataset across multiple molds, materials, and machine types would support generalization and enable the development of mold-independent base models. Future iterations of the system will shift toward continuous learning paradigms, where the surrogate updates itself automatically based on each production cycle. The long-term goal is a fully automated closed-loop optimization architecture in which the optimizer directly communicates with the machine control system to autonomously adjust parameters, verify results, and refine models without human intervention. Such advances would bring industrial injection molding closer to self-optimizing, intelligent manufacturing systems capable of adapting to mold wear, material variability, and environmental changes in real time.