Article

Geometry-Aware Neural Network for Generalized Temperature Prediction in Microwave Heating of PET Preforms †

by Ahmad Alsheikh 1,2,* and Andreas Fischer 2,*
1 Krones AG, Böhmerwaldstr. 5, 93073 Neutraubling, Germany
2 Deggendorf Institute of Technology, Dieter-Görlitz-Platz 1, 94469 Deggendorf, Germany
* Authors to whom correspondence should be addressed.
† This manuscript is an extended version of a conference paper published at the Second Workshop on AI in Production (AIP2025): Alsheikh, A.; Fischer, A. Fusion-Based Neural Generalization for Predicting Temperature Fields in Industrial PET Preform Heating. arXiv 2025, arXiv:2510.05394.
J. Manuf. Mater. Process. 2026, 10(4), 138; https://doi.org/10.3390/jmmp10040138
Submission received: 24 March 2026 / Revised: 14 April 2026 / Accepted: 16 April 2026 / Published: 19 April 2026

Abstract

Accurate temperature prediction is essential for optimizing the microwave preheating of PET preforms prior to blow molding. A key challenge in this context is the strong dependence of electromagnetic field distributions and thermal responses on preform geometry, which varies substantially across product lines. Conventional neural network models trained on specific geometric configurations typically fail to generalize to unseen preform designs, requiring costly retraining for each new geometry. This work proposes a unified geometry-aware deep learning framework that predicts spatial temperature distributions across multiple preform designs using a single neural network model. The approach reformulates temperature prediction as a coordinate-level regression task conditioned on spatial location, geometric descriptors, process parameters, and structural region labels. A domain-bounded training strategy is introduced that constructs datasets from the extreme feasible preform geometries, thereby converting the prediction of any intermediate design into an interpolation task. The framework is evaluated on six distinct preform geometries, demonstrating that a single model can generalize reliably to new, unseen preform designs when their geometric parameters fall within the bounds of the training data. Since neural networks are inherently limited in their ability to extrapolate beyond the training domain, this formulation is essential for ensuring stable and accurate predictions across the full range of industrially relevant preform configurations. The proposed methodology provides a foundation for geometry-informed surrogate modeling in thermal process control and can be extended to other manufacturing systems characterized by strong geometric variability.

1. Introduction

Polyethylene terephthalate (PET) preforms are injection-molded components used in the production of plastic bottles through stretch blow molding [1]. Prior to molding, preforms must be heated to a controlled temperature range to ensure uniform deformation and consistent mechanical properties. Infrared (IR) heating is widely adopted in industry; however, its limited penetration depth, low energy efficiency, and restricted spatial controllability motivate the investigation of alternative heating technologies.
Microwave (MW) heating has emerged as a promising alternative due to its volumetric energy deposition, rapid heating rates, and potential for selective heating of dielectric materials [2]. Achieving a uniform temperature distribution along the preform surface is critical for controlling wall thickness, optical clarity, and residual stresses in the final container [3]. In contrast, nonuniform heating can lead to defects such as haze formation, thickness variations, and mechanical weakness (Figure 1), thereby compromising product quality and reliability.
A major challenge in microwave heating of PET preforms is the large variability in both material properties and geometric designs. Preforms differ in length, mass, wall thickness, and neck configuration depending on container specifications [1]. These variations significantly influence electromagnetic field distributions and thermal responses, making accurate temperature prediction a complex multiphysics problem that must account for strong geometric dependence. Reliable prediction of the temperature profile is therefore essential for process control and optimization, as it enables adjustment of heating parameters to ensure that preforms reach the target thermal state required for uniform deformation during the subsequent blow molding process.
Recent advances in deep learning have provided powerful tools for approximating nonlinear physical processes such as microwave-induced heating. Nevertheless, neural networks typically require large training datasets generated from high-fidelity simulations or experiments, which are computationally expensive and time-consuming.
In prior work [4], we proposed a data-efficient framework combining transfer learning and stacking-based model fusion to predict temperature distributions in microwave-heated PET preforms. A residual fully connected architecture with skip connections served as the base predictor, and variant-specific models were fused into a unified global predictor via a Design of Experiments (DOE) based experience extraction process. The approach was evaluated on two generalization tasks—material property variation (heat capacity) and geometric variation across preform sizes—and outperformed models trained from scratch under limited data conditions.
However, a closer examination revealed that the training and test preforms in the geometric case study were too similar, occupying a narrow region of the geometry feature space. When the model was evaluated on preforms with more extreme geometric characteristics, prediction accuracy deteriorated considerably. This exposed a fundamental extrapolation limitation: the fusion strategy improved data efficiency but did not ensure coverage of the geometry space.
The present work addresses this limitation by reformulating temperature prediction as a coordinate-level regression task conditioned on spatial location, explicit geometric descriptors (preform length, wall thickness, neck dimensions, dome radius), process parameters, and a categorical region label indicating whether the prediction point belongs to the neck, body, or dome section of the preform. This formulation enables a single model to generalize across diverse preform geometries without variant-specific retraining.
A central contribution of this work is a domain-bounded training strategy designed to ensure that prediction of unseen preform geometries remains an interpolation task. The key observation is that for a given microwave applicator, there exist smallest and largest preform configurations that can physically fit within the cavity. These extreme geometries define the lower and upper bounds of all relevant geometric parameters. By constructing training datasets from these boundary configurations, the geometric features of any intermediate preform are guaranteed to lie within the convex hull of the training data. This transforms the prediction problem for new preform designs from an extrapolation task, where neural networks are known to be unreliable, into an interpolation task, where accurate approximation can be expected. The strategy provides a principled mechanism for selecting representative training geometries based on physical feasibility rather than arbitrary or clustered sampling, and it enables systematic generalization across the full range of industrially relevant preform designs.
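The bounding idea described above can be illustrated with a minimal sketch. The feature names and numerical values below are hypothetical placeholders, not data from the study, and the check shown is a per-feature bounds test: it is a simple necessary condition for a design to lie inside the region spanned by the training geometries and does not by itself establish convex hull membership.

```python
# Hypothetical descriptor values (mm) for the two extreme feasible preforms.
# These numbers are placeholders for illustration only.
SMALLEST = {"body_length": 60.0, "body_wall": 2.0, "dome_radius": 8.0}
LARGEST = {"body_length": 140.0, "body_wall": 4.5, "dome_radius": 14.0}


def feature_bounds(*geometries):
    """Return the per-feature (min, max) over the given training geometries."""
    keys = geometries[0].keys()
    return {
        k: (min(g[k] for g in geometries), max(g[k] for g in geometries))
        for k in keys
    }


def out_of_bounds(candidate, bounds):
    """List the features of `candidate` that fall outside the training bounds.

    An empty list means every descriptor lies within the bounded range,
    so the query is not an extrapolation in any single feature.
    """
    return [k for k, (lo, hi) in bounds.items() if not lo <= candidate[k] <= hi]


bounds = feature_bounds(SMALLEST, LARGEST)
intermediate = {"body_length": 100.0, "body_wall": 3.0, "dome_radius": 10.0}
too_long = {"body_length": 150.0, "body_wall": 3.0, "dome_radius": 10.0}
print(out_of_bounds(intermediate, bounds))
print(out_of_bounds(too_long, bounds))
```

A check of this kind could be run before querying a trained model, so that designs with any descriptor outside the bounded range are flagged rather than silently extrapolated.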
The proposed framework occupies a deliberately pragmatic position within the surrogate modeling landscape. Rather than encoding geometric variability through latent representations learned by variational autoencoders or through mesh-based graph neural networks, the present work uses a small set of explicit, physically interpretable geometric descriptors as direct network inputs. This avoids the architectural complexity and implementation overhead of latent encoders or graph operators and is well suited to industrial settings where preforms vary parametrically within a fixed topology rather than across arbitrary shapes. Likewise, although physics-informed neural networks have been successful for problems with well-established governing equations, the strongly coupled electromagnetic–thermal physics inside a multi-mode resonant cavity with dielectric inserts does not lend itself to a tractable, closed-form partial differential equation (PDE) residual formulation suitable for direct embedding into a neural network loss. A detailed comparison with these alternative paradigms is provided in Section 2, where the trade-offs are discussed in the context of the present application.
Three case studies are conducted to systematically evaluate the proposed framework under increasing levels of complexity. Case Studies 1 and 2 are retained from the conference publication [4] and serve as foundational evaluations, while Case Study 3 is introduced in this work to validate the proposed domain-bounded training strategy:
  • Case Study 1: Generalization across variations in PET heat capacity, motivated by the incorporation of recycled materials.
  • Case Study 2: Generalization across multiple preform geometries using a conventional training strategy, serving as a baseline that exposes the extrapolation limitation.
  • Case Study 3: Geometry-bounded generalization using extreme feasible preform configurations, demonstrating that reliable prediction of intermediate geometries can be achieved through interpolation.
Compared to the conference publication [4], this work introduces the following extensions:
  • A geometry-aware and region-aware problem formulation conditioning temperature prediction on spatial coordinates, geometric descriptors, and structural region labels (Section 3.2).
  • A domain-bounded training strategy constructing datasets from extreme feasible geometries to convert geometric extrapolation into interpolation (Section 3.3).
  • An expanded experimental evaluation on six preform geometries with systematic hold-out testing, assessing both interpolation and extrapolation behavior (Section 4.5).
  • A geometric feature space and correlation analysis characterizing the distribution of preform designs and the nonlinear relationship between input features and the temperature field (Section 4.3).
  • A physical analysis of extrapolation difficulty revealing that spatial and structural features pose greater generalization challenges than wall thickness parameters (Section 4.7).
  • A substantially extended review of related work covering geometry-aware surrogate modeling and neural network extrapolation limitations (Section 2).
Together, these extensions shift the modeling paradigm from variant-specific fusion with limited data to a unified, geometry-conditioned predictor capable of generalizing across the full range of feasible preform designs.

2. Related Work

2.1. Physics-Based Modeling of Preform Heating

Modeling and control of PET preform heating have traditionally relied on physics-based simulation and heuristic control strategies. In the context of infrared (IR) reheating, prior work has studied radiative heating, ventilation effects, and the resulting temperature profiles that influence stretch blow molding outcomes [3,5,6]. While such models provide physical insight, they often require substantial computational effort and can be difficult to adapt to changing production conditions, especially when preform geometry varies significantly across product lines.
Microwave (MW) heating has been investigated as an alternative due to its volumetric energy deposition and the potential for faster and more energy-efficient heating. Early feasibility studies demonstrated the relevance of coupled electromagnetic–thermal modeling for PET preform conditioning under microwave excitation [2]. More recent work has focused on applicator design and field shaping, for example using dielectric structures to tailor heating patterns in single-mode cavities [7,8]. However, predicting temperature fields robustly across diverse preform designs remains challenging because electromagnetic field distributions and thermal dynamics are strongly geometry-dependent.

2.2. Machine Learning for Thermal Process Modeling

Machine-learning-based surrogate modeling has gained momentum as a way to accelerate expensive multi-physics simulations and enable real-time decision-making. Deep learning has been shown to act as a surrogate for PDE-based solvers in complex systems, including physics-informed surrogate modeling of field quantities [9]. Within industrial heating, machine learning has been explored for control and prediction tasks, such as deep reinforcement learning for temperature control in stretch blow molding [10], deep transfer learning for furnace temperature prediction [11], and neural metamodels with transfer learning for induction heating processes [12]. These studies demonstrate the potential of data-driven models for thermal processes, but they typically do not treat strong geometric variability as a first-class generalization problem.
Transfer learning and model fusion methods have been applied in data-rich domains such as vision and medical imaging to improve performance and reuse knowledge across tasks [13,14,15,16,17,18]. Adaptive model merging and modular fusion approaches have also been proposed to compose capabilities from multiple pretrained models [19,20]. While these techniques motivate the use of fusion and fine-tuning for robustness, the majority of this literature focuses on classification tasks and does not directly address regression-based surrogate modeling where outputs are continuous spatial fields.

2.3. Geometry-Aware Surrogate Modeling and Neural Network Extrapolation

A fundamental limitation of neural networks in physics-based regression is their reduced reliability under extrapolation beyond the training domain. Empirical studies across scientific modeling domains have shown that neural-network extrapolation can be unstable and significantly less trustworthy than interpolation [21]. This issue is particularly acute when geometry changes, because geometric parameters and shape features can move test cases outside the distribution spanned by training samples. Addressing this limitation has motivated a growing body of work on geometry-aware surrogate modeling, which can be broadly grouped into three families: parametric descriptor-based models, latent-encoded representations, and mesh- or graph-based models. A complementary direction, physics-informed neural networks, embeds governing equations directly into the training loss and is discussed separately below.

2.3.1. Parametric Descriptor-Based Surrogates

The most established approach represents each geometry by a small vector of explicit parameters and feeds this vector as an additional input to a regression model. Cao et al. [22] developed a non-parametric surrogate based on machine learning for design optimization, highlighting the importance of avoiding brittle dependence on narrow geometric parameterizations. Such formulations are appropriate when geometric variation can be captured by a manageable number of physically meaningful parameters and the underlying topology of the design remains fixed. Their main weakness is that they cannot natively handle arbitrary shape variation: if two designs differ in their topology rather than in the values of a fixed parameter set, no descriptor vector can express the difference. For PET preforms processed by a single applicator, however, all designs share the same neck–body–dome topology, and the relevant variation is fully captured by a small number of dimensional parameters. The descriptor-based formulation is therefore well matched to the present problem.

2.3.2. Latent-Encoded Geometry Representations

A more general alternative is to learn a latent representation of geometry from raw shape data. Oldenburg et al. [23] proposed a geometry-aware physics-informed neural network (GAPINN) that encodes irregular geometries into a latent vector using a variational autoencoder, enabling PDE-based surrogate modeling without explicit parameterization. Such approaches are powerful when geometric variation is high-dimensional or difficult to parameterize, but they introduce substantial overhead: an auxiliary encoder must be trained on a dedicated dataset of shapes, the latent space must be regularized to ensure smooth interpolation, and the learned representation is not directly interpretable. For applications in which a handful of parameters fully describe the geometric design space, the additional complexity of a latent encoder is difficult to justify.

2.3.3. Graph and Mesh-Based Surrogates

Graph neural networks (GNNs) have been explored as geometry-flexible surrogates that operate directly on mesh or point cloud representations of the simulation domain [24,25]. These methods are well suited to problems with topologically varying geometries and unstructured meshes, and they can in principle generalize across very different shapes without hand-crafted parameterization. Their drawbacks for industrial deployment are practical: they require mesh or graph preprocessing for every input geometry, the network architecture is more complex and harder to tune, training and inference are computationally heavier than dense feedforward networks, and integration into existing simulation-based engineering workflows is non-trivial. For a manufacturing process in which the same applicator is used across a parametric family of preforms, the flexibility of a graph-based surrogate is largely unused while its overhead is fully incurred.

2.3.4. Physics-Informed Neural Networks

A complementary line of research embeds physical laws directly into the training loss of a neural network, yielding physics-informed neural networks (PINNs) [9]. PINNs have demonstrated strong performance on problems with well-defined governing equations and clean boundary conditions, and they can reduce the amount of simulation data required for training. Their applicability to the present problem is limited, however, by the nature of microwave heating in a multi-mode resonant cavity. The temperature field is determined by the coupled solution of Maxwell’s equations and the heat equation, with electromagnetic boundary conditions imposed by the cavity walls, the coaxial excitation, the dielectric slabs, and the temperature- and frequency-dependent dielectric properties of PET. Casting this strongly coupled system as a tractable PDE residual that can be embedded into a neural network loss—and that converges reliably during training—is itself an open research problem. PINN approaches in microwave heating are therefore largely restricted to simplified configurations and have not been demonstrated for industrially relevant applicator geometries. In the absence of a tractable physics residual, the present work adopts a data-driven surrogate trained on high-fidelity simulations, where the underlying physics is captured implicitly through the simulation data rather than explicitly through a loss term.
Across all four paradigms, the existing literature demonstrates that encoding geometric information into the learning process is essential for robust surrogate modeling. However, the more flexible approaches (latent-encoded, graph-based, physics-informed) introduce architectural and implementation overhead that is difficult to justify when the geometric design space is parametric, low-dimensional, and topologically fixed—as is the case for PET preforms processed by a single microwave applicator.

2.4. Positioning of This Work

In the specific context of PET preform microwave heating, prior work has primarily emphasized either physics-based IR modeling or applicator-level MW optimization, while systematic machine-learning strategies for generalization across diverse preform geometries remain limited. In particular, existing approaches seldom distinguish between interpolation and extrapolation behavior in the geometry feature space, nor do they provide dataset design principles to enforce robust generalization across diverse industrial designs.
The present study addresses this gap by adopting the parametric descriptor-based formulation—which matches the structure of the problem—and combining it with two design choices that, to the best of our knowledge, have not been previously brought together for industrial thermal process modeling. First, the temperature prediction task is formulated as a coordinate-level regression conditioned on spatial location, explicit geometric descriptors, slab configuration, and a structural region indicator, allowing a single unified model to learn temperature responses across multiple preform geometries simultaneously. Second, training datasets are constructed using the extreme feasible preform geometries determined by the physical constraints of the applicator cavity, ensuring that the geometric features of any intermediate preform fall within the convex hull of the training data and that prediction therefore constitutes an interpolation task rather than an extrapolation task. The contribution of this work lies not in the choice of geometric representation, which is well established, but in the combination of this representation with a physically grounded dataset construction principle that converts geometric generalization from an open-ended extrapolation problem into a bounded interpolation problem.

3. Methodology

This section presents the proposed framework for geometry-aware temperature prediction in microwave heating of PET preforms. The overall framework is illustrated in Figure 2 and consists of four components. Section 3.1 describes the microwave applicator configuration and the simulation model used to generate training data. Section 3.2 introduces the geometry-aware and region-aware problem formulation that conditions temperature prediction on spatial coordinates, geometric descriptors, and structural region labels. Section 3.3 presents the domain-bounded training strategy based on extreme feasible geometries. Section 3.4 describes the neural network architecture. Finally, Section 3.5 outlines the training procedure and evaluation protocol.

3.1. Microwave Applicator Configuration and Simulation Model

The microwave applicator determines the electromagnetic field distribution within the cavity and, consequently, the resulting temperature profile of the PET preform. The applicator used in this work is a rectangular cavity measuring 250 mm × 190 mm × 150 mm (width × length × height), equipped with adjustable dielectric slabs for controlled field shaping.
A Type-N coaxial antenna, centrally mounted on the bottom wall and oriented along the z-axis, excites the cavity in the TE101 mode at 915 MHz [7], a standard operating frequency for industrial microwave heating. The electromagnetic field distribution and the resulting thermal response depend strongly on the preform geometry, including its wall thickness, neck dimensions, and overall shape, all of which affect the coupling between the microwave field and the dielectric material.
To enable systematic manipulation of the field distribution, the applicator incorporates two stacks of 16 PTFE (polytetrafluoroethylene) sheets serving as dielectric slabs. Each slab measures 25 mm × 190 mm × 5 mm and is oriented parallel to the y-axis. The two stacks are placed symmetrically on either side of the preform, with their positions adjustable along the x-axis. By varying the distance between the slabs and the preform, the near-field electromagnetic distribution can be shaped to promote more uniform heating. These slabs function as near-field focusing elements [8], manipulating wave propagation through reflection, refraction, and diffraction [7]. All simulations were performed under a commercial license using Ansys Electronics HFSS 2024 R1 (Ansys Inc., Canonsburg, PA, USA), a high-frequency electromagnetic simulation platform capable of computing coupled electromagnetic and thermal fields. The simulation model served both as the basis for applicator design evaluation and as the data source for training the proposed learning framework. This configuration, illustrated in Figure 3, provides a physically consistent and controllable environment for studying geometry-dependent heating behavior across diverse PET preform designs.

Reduced-Dimensional Simulation Model

To make the generation of the large training dataset computationally tractable, the simulation model is formulated as a two-dimensional cross-section of the applicator rather than a full three-dimensional cavity simulation. The cross-section corresponds to a vertical slice through the cavity that contains the central axis of the preform, the coaxial excitation, and both stacks of dielectric slabs, capturing the dominant features of the geometry that determine the field distribution near the preform. Within this slice, the preform is represented as a stationary cross-section, and the coupled electromagnetic–thermal solution is computed for the resulting two-dimensional configuration.
The validity of this reduction rests on an essential physical feature of the heating process that is sometimes overlooked in computational studies of microwave applicators: during operation, the preform rotates around its own central axis at a fixed rotational speed throughout its passage through the microwave applicator, both during entry and during transit through the cavity. This rotation is a deliberate design choice of the industrial machine and has a fundamental effect on the thermal field. At any single instant, the electric-field distribution along the preform surface is not azimuthally uniform: the rectangular cavity, the TE101 mode, and the two-sided dielectric slab arrangement together produce a field that varies around the preform circumference. Without rotation, this would lead to azimuthal hot and cold regions on the preform surface and would prevent uniform deformation during the subsequent blow molding step. The rotation eliminates this variation by exposing every point on the preform surface, at a given axial height, to the same time-averaged electromagnetic field over each full revolution. As a result, the absorbed microwave power—and therefore the temperature reached at the end of the heating phase—depends only on the axial position along the preform, not on the azimuthal angle.
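The averaging argument can be written compactly. As a sketch, let q(z, φ) denote the absorbed power density at axial position z and azimuthal angle φ in the cavity frame, let Ω be the constant rotation rate, and let τ = 2π/Ω be the rotation period; these symbols are introduced here for illustration and do not appear in the original formulation. A material point that starts at azimuth φ₀ then experiences, over one full revolution,

```latex
\bar{q}(z)
  = \frac{1}{\tau}\int_{0}^{\tau} q\bigl(z,\ \varphi_0 + \Omega t\bigr)\,\mathrm{d}t
  = \frac{1}{2\pi}\int_{0}^{2\pi} q(z,\varphi)\,\mathrm{d}\varphi ,
  \qquad \tau = \frac{2\pi}{\Omega}.
```

The second equality follows from the substitution φ = φ₀ + Ωt and the 2π-periodicity of q in φ, so the result no longer depends on φ₀. The identity is exact only over whole revolutions and assumes that the field pattern itself does not change during a revolution.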
Because the temperature distribution that is relevant for the blow molding process is intrinsically one-dimensional in the axial direction, the engineering quantity of interest is fully described by the axial profile and is independent of how the field varies around the preform circumference at any single instant. The two-dimensional cross-sectional simulation provides this axial profile directly, by computing the dielectric heating in the slice that contains the dominant features of the cavity field. The reduction from a full three-dimensional simulation to a two-dimensional cross-section is therefore not an approximation that discards relevant physics, but a reformulation that targets the same scalar quantity—the axial temperature profile of the rotating preform—that the production process itself produces and that is measured by the thermal imaging shown in Figure 3, which likewise records the surface temperature of the rotating preform as it exits the cavity. The simulation, the experimental observable, and the engineering quantity of interest therefore all describe the same axial profile, ensuring that the simulation data used to train the proposed neural network framework are consistent with the temperature field that would be encountered in production.

3.2. Geometry-Aware and Region-Aware Temperature Prediction Formulation

In conventional surrogate modeling of PET preform heating, a separate neural network is trained for each preform geometry, learning the mapping between process parameters and the resulting temperature field for that specific design. While this approach can achieve high accuracy within the geometry used for training, it does not transfer to new preform designs. When a model trained on one set of geometric proportions is applied to a preform with substantially different dimensions—for example, a significantly longer body, thicker walls, or a differently shaped dome—prediction accuracy deteriorates because the model is forced to extrapolate beyond the geometric conditions it has seen. This limitation is not merely a matter of insufficient data; it reflects a fundamental property of neural networks, which approximate functions reliably within the convex hull of their training data but exhibit degraded and often unstable performance outside this domain.
To overcome this limitation, the temperature prediction task is reformulated to explicitly incorporate spatial and geometric context into a unified learning framework. Rather than predicting the full temperature field for each preform as a single output vector, temperature is modeled as a continuous, conditional function of spatial location, geometric characteristics, and operating conditions. Specifically, the prediction problem is defined as:
$$T = f(x, g, s, r),$$
where x denotes the spatial coordinate along the preform surface, g represents a vector of geometric descriptors, s denotes the slab position defining the heating configuration, and r is a categorical variable indicating the structural region of the preform.
The geometric descriptor vector g encodes the physical dimensions that characterize each preform design. In this work, g includes six features: preform neck wall thickness, body wall thickness, dome wall thickness, body length, neck length, and dome radius. These parameters were selected because they capture the principal geometric variations across preform designs and directly influence the electromagnetic field coupling and thermal response within the microwave applicator. By providing this information explicitly as input features, the network is not required to infer geometric identity implicitly from the temperature patterns alone, which would limit its ability to generalize beyond the training geometries.
The inclusion of the region indicator r accounts for the distinct physical behavior observed in different structural sections of the preform, as illustrated in Figure 4. The neck, body, and dome regions are characterized by different wall thicknesses, curvatures, and proximity to cavity boundaries, all of which produce different electromagnetic field distributions and thermal dynamics. By partitioning the coordinate space into these three region-specific subdomains, the learning problem is decomposed into piecewise regimes with more homogeneous physical behavior. This decomposition reduces the functional complexity of the regression task and enables the network to learn localized heating patterns that would otherwise be difficult to capture with a single global mapping.
The thread region of the preform, visible at the left end of Figure 4, is excluded from the prediction domain. This region contains the mouthpiece geometry, which is not deformed during blow molding and is shielded from significant microwave absorption by its position outside the active cavity zone. Including it would introduce geometric features (thread pitch, sealing surface) unrelated to the thermal field of interest.
The boundaries between the three structural regions are defined by the geometric parameters of each preform design rather than by manual annotation. The neck–body transition is located at the point where the wall thickness changes from the thin-walled neck section to the constant-thickness cylindrical body, corresponding to the end of the tapered transition zone below the support ring. The body–dome transition is defined as the point where the cylindrical wall begins to curve inward, coinciding with the onset of the hemispherical end geometry. For each preform, these transition points are directly determined by the computer-aided design (CAD) geometry and are encoded through the spatial coordinate x and the region label r, ensuring that the assignment is reproducible and consistent across all preform designs without manual intervention.
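The region assignment described above can be sketched as a simple lookup on the surface coordinate. This is a minimal illustration only: the function name `assign_region`, the normalized transition values, and the assumption that x increases from the neck toward the dome are hypothetical and are not taken from the CAD data used in this work.

```python
from bisect import bisect_right

REGIONS = ("neck", "body", "dome")

def assign_region(x, neck_body_x, body_dome_x):
    """Return the structural region for surface coordinate x.

    neck_body_x and body_dome_x stand for the two transition coordinates
    that would be read from the CAD geometry (illustrative values below).
    """
    if not neck_body_x < body_dome_x:
        raise ValueError("transition points must satisfy neck-body < body-dome")
    # bisect_right counts how many transition points are <= x: 0, 1, or 2.
    return REGIONS[bisect_right([neck_body_x, body_dome_x], x)]

# Hypothetical normalized transition points for one preform design.
labels = [assign_region(x, 0.2, 0.8) for x in (0.05, 0.2, 0.5, 0.95)]
print(labels)  # ['neck', 'body', 'body', 'dome']
```

Because the label follows deterministically from x and the two transition points, the same rule can be applied to every preform design without manual annotation.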
A key advantage of this formulation is that it enables a single predictor model to learn temperature responses across multiple preform designs simultaneously. Predictions are conditioned on explicit geometric descriptors rather than encoded implicitly through separate preform-specific models. This transforms temperature prediction from a geometry-dependent mapping—where each preform requires its own model—into a generalized function that adapts continuously to variations in both geometry and operating conditions. As a result, the model can be queried for any combination of spatial coordinate, geometric parameters, slab configuration, and region label, producing a temperature estimate without requiring retraining for each new preform design.
This formulation provides the representational foundation for the proposed framework. However, the ability of the network to generalize across geometries depends not only on how the problem is formulated but also on how the training data are distributed in the geometry feature space. This is addressed by the domain-bounded training strategy described in the following subsection.

3.3. Domain-Bounded Geometry Training Strategy

The geometry-aware formulation introduced above enables conditioning temperature predictions on geometric descriptors, but this alone does not guarantee reliable generalization to unseen preform designs. If the training data cover only a narrow region of the geometry feature space, the model will still fail when confronted with preforms whose geometric parameters fall outside the range of the training samples. The distribution of training geometries within the feature space is therefore a critical design choice that directly determines whether prediction of a new geometry constitutes an interpolation or an extrapolation task.
An analysis of the PET preform designs used in this study reveals that the geometric descriptors—including total length, wall thickness, and the relative proportions of the neck, body, and dome regions—span a bounded range determined by the physical constraints of the microwave applicator. The preforms considered in this work were selected from the full range of industrially produced designs intended for use with this applicator type, covering the complete spectrum of geometric variation encountered in production. Crucially, the largest preform in the set reaches the maximum admissible length of the resonator cavity; any longer preform would physically not fit within the applicator. The smallest preform represents the lower bound of the production range, with the thinnest walls and shortest body among all designs manufactured for this cavity configuration. Together, these two extreme configurations establish the boundaries of the feasible geometry domain. Since all preforms processed by this applicator must physically fit within the cavity and fall within the production specifications of the system, no industrially relevant preform geometry can lie outside the range spanned by the training data. The specific preform designs and their geometric parameters are presented in Section 4.3.
When training data are drawn from a limited set of geometries that occupy only a narrow band within this domain—for example, preforms of similar size—the trained model captures relationships that are valid only locally. Predictions for preforms with substantially different geometric proportions then require extrapolation in the geometry feature space, leading to increased error and reduced stability. This is precisely the limitation observed in the conference publication [4], where the training and test preforms were geometrically similar, and the model’s apparent generalization did not hold when evaluated on more extreme configurations.
The domain-bounded training strategy addresses this problem directly. Training datasets are constructed using preforms that correspond to the minimum and maximum admissible geometries under the applicator constraints. By spanning the extreme bounds of the feasible design space, the training data define a convex hull in the geometry feature space that encompasses all intermediate preform designs. Any target preform whose geometric features lie between these extremes is guaranteed to fall within this convex hull, ensuring that its prediction is an interpolation task rather than an extrapolation task.
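One way to make the interpolation condition operational is a per-feature range check against the training geometries, sketched below. The numerical values are purely illustrative and are not the entries of Table 2; the helper names are hypothetical. Note that a per-feature check describes an axis-aligned box around the training points, which is a necessary condition for membership in their convex hull rather than an equivalent one.

```python
FEATURES = ("WT-Neck", "WT-Body", "WT-Dome", "H-Body", "H-Neck", "R-Dome")

def feature_bounds(training_geometries):
    """Per-feature (min, max) over the training geometries."""
    return {
        name: (min(g[name] for g in training_geometries),
               max(g[name] for g in training_geometries))
        for name in FEATURES
    }

def out_of_range_features(target, bounds):
    """Names of features for which the target lies outside the training range."""
    return [name for name in FEATURES
            if not bounds[name][0] <= target[name] <= bounds[name][1]]

# Illustrative values only; these are NOT the paper's geometric parameters.
smallest = dict(zip(FEATURES, (2.0, 2.5, 2.0, 40.0, 8.0, 8.0)))
largest = dict(zip(FEATURES, (4.0, 5.0, 4.5, 110.0, 17.0, 14.0)))
bounds = feature_bounds([smallest, largest])

intermediate = dict(zip(FEATURES, (3.0, 3.5, 3.0, 70.0, 12.0, 10.0)))
too_long = {**intermediate, "H-Body": 120.0}

print(out_of_range_features(intermediate, bounds))  # empty: every feature is bounded
print(out_of_range_features(too_long, bounds))      # H-Body exceeds the training range
```

An empty result indicates that the target is bounded along every descriptor; any listed feature marks a dimension along which the model would have to extrapolate.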
This distinction between interpolation and extrapolation is critical for neural network-based surrogate models. Within the convex hull of the training data, neural networks can leverage the learned relationships between inputs and outputs to produce stable and accurate predictions. Outside this hull, the network must rely on functional extrapolation, which is known to be unreliable because the behavior of the learned function beyond the training domain is not constrained by any observed data. By deliberately constructing training datasets that bound the geometry space, the proposed strategy eliminates this source of failure for any preform that can physically fit within the applicator.
The domain-bounded strategy also differs fundamentally from conventional dataset construction approaches in surrogate modeling. Standard practice typically involves sampling training geometries either randomly or by selecting designs that are representative of the expected operating range. Both approaches risk leaving gaps in the geometry feature space that expose the model to extrapolation. In contrast, the proposed strategy provides a principled selection criterion: include the geometric extremes, and all intermediate designs are automatically covered. This makes the approach systematic, reproducible, and directly tied to the physical constraints of the manufacturing system.
The combination of the geometry-aware problem formulation with domain-bounded training establishes the core of the proposed framework. The formulation ensures that the network can condition its predictions on geometry, while the training strategy ensures that the geometry space is adequately covered. The effectiveness of this combination is evaluated experimentally through dedicated case studies in which prediction performance for intermediate geometries is compared against models trained on narrowly distributed datasets, directly testing the interpolation-versus-extrapolation hypothesis.

3.4. Neural Network Architecture

The neural network architecture used in this work is a fully connected feedforward network with residual (skip) connections. In prior work [4], this residual architecture was compared against a standard multilayer perceptron (MLP) under identical configurations and was shown to substantially outperform it across all evaluation metrics. The residual architecture is therefore adopted throughout all experiments in this study.
The architecture is illustrated in Figure 5. The input layer receives four groups of features: 17 slab position parameters defining the heating configuration, one spatial coordinate along the preform surface, six geometric descriptors characterizing the preform design, and one categorical region indicator. All continuous input features are normalized to the range [0, 1] using Min–Max scaling prior to being fed into the network, ensuring numerical stability and preventing dominance of features with larger physical scales. Region labels are encoded as categorical variables and concatenated with the continuous features into a single input vector.
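The assembly of the input vector can be sketched as follows. The scaling ranges and the example values are hypothetical, and the region is assumed here to enter as a single integer code (0 for neck, 1 for body, 2 for dome); the text does not specify the exact categorical encoding, so this is only one possible choice.

```python
N_SLABS, N_GEOM = 17, 6

def min_max_scale(value, lo, hi):
    """Map value from [lo, hi] to [0, 1]."""
    return (value - lo) / (hi - lo)

def build_input(slabs, x, geom, region, slab_range, x_range, geom_ranges):
    """Concatenate the four feature groups into one input vector."""
    if len(slabs) != N_SLABS or len(geom) != N_GEOM:
        raise ValueError("expected 17 slab positions and 6 geometric descriptors")
    scaled = [min_max_scale(s, *slab_range) for s in slabs]
    scaled.append(min_max_scale(x, *x_range))
    scaled += [min_max_scale(g, lo, hi) for g, (lo, hi) in zip(geom, geom_ranges)]
    # Assumption: the region enters as an integer code, left unscaled.
    return scaled + [float(region)]

# Hypothetical ranges and values, for illustration only.
vec = build_input(
    slabs=[5.0] * 17, x=16.0, geom=[3.0, 3.5, 3.0, 70.0, 12.0, 10.0], region=1,
    slab_range=(0.0, 10.0), x_range=(0.0, 32.0),
    geom_ranges=[(2.0, 4.0), (2.5, 5.0), (2.0, 4.5),
                 (40.0, 110.0), (8.0, 17.0), (8.0, 14.0)],
)
print(len(vec))  # 25 = 17 + 1 + 6 + 1
```

In practice the scaling ranges would be fixed from the training data and reused unchanged at prediction time, so that an unseen preform is mapped with the same transformation.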
The network consists of several dense layers with nonlinear activation functions, organized in a residual learning structure. Additive skip connections combine intermediate feature representations at regular intervals, allowing the network to preserve low-level input information while learning higher-level abstractions. This design improves gradient flow during backpropagation and enhances training stability for deeper architectures [26]. The final output layer consists of a single neuron with linear activation, producing a continuous temperature prediction in degrees Celsius.
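A forward pass through a residual feedforward network of this kind can be sketched in a few lines. The layer width, the number of residual blocks, the ReLU activation, and the initialization are assumptions made for illustration; the text specifies only dense layers, nonlinear activations, additive skip connections, and a single linear output neuron.

```python
import random

def dense(x, weights, bias):
    """Affine layer; weights holds one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def init_layer(rng, n_in, n_out):
    scale = (2.0 / n_in) ** 0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def init_network(rng, n_in=25, width=32, n_blocks=3):
    # Width and depth are illustrative choices, not the paper's settings.
    return {
        "stem": init_layer(rng, n_in, width),
        "blocks": [(init_layer(rng, width, width), init_layer(rng, width, width))
                   for _ in range(n_blocks)],
        "head": init_layer(rng, width, 1),
    }

def forward(params, x):
    h = relu(dense(x, *params["stem"]))
    for first, second in params["blocks"]:
        update = dense(relu(dense(h, *first)), *second)
        # Additive skip connection: the block input is added back to its output.
        h = relu([a + b for a, b in zip(h, update)])
    # Single linear output neuron: the predicted temperature.
    return dense(h, *params["head"])[0]

params = init_network(random.Random(0))
prediction = forward(params, [0.5] * 25)
print(prediction)
```

If a block's update is zero, the block reduces to the identity on its non-negative input, which is the property that lets deeper stacks preserve earlier representations.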
Layer normalization was not employed in the current architecture. While normalization layers can improve training dynamics in certain settings, the combination of input-level Min–Max scaling and residual connections was found to provide sufficient numerical stability for the regression tasks considered in this work. The potential benefit of incorporating layer normalization remains an avenue for future investigation.
The key distinction from the conference publication is the expanded input representation. In [4], the network received only slab positions as inputs, with separate models trained for each preform variant. In the present work, geometric descriptors and region labels are provided as explicit inputs, enabling a single network to learn temperature responses across multiple preform geometries simultaneously without requiring variant-specific models.

3.5. Training and Evaluation Protocol

All models were trained using supervised learning with the mean squared error (MSE) loss function, the standard choice for continuous-valued regression [9]:
$$L = \frac{1}{N} \sum_{i=1}^{N} \left( T_i - \hat{T}_i \right)^2,$$
where $T_i$ denotes the simulated temperature at the i-th sample and $\hat{T}_i$ the corresponding prediction. Optimization was performed using gradient-based methods with a fixed learning rate and mini-batch updates.
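The optimization scheme named above (MSE loss, fixed learning rate, mini-batch updates) can be illustrated on a deliberately small problem. The sketch below fits a one-dimensional linear model as a stand-in for the network; the learning rate, batch size, epoch count, and data are illustrative assumptions and not the settings used in this work.

```python
import random

def sgd_fit(xs, ys, learning_rate=0.2, batch_size=16, epochs=300, seed=0):
    """Fit y = a*x + b by mini-batch gradient descent on the MSE loss."""
    rng = random.Random(seed)
    a, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # new mini-batch composition every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            residuals = [(a * xs[i] + b) - ys[i] for i in batch]
            # Gradients of the batch MSE with respect to a and b.
            grad_a = 2.0 * sum(r * xs[i] for r, i in zip(residuals, batch)) / len(batch)
            grad_b = 2.0 * sum(residuals) / len(batch)
            a -= learning_rate * grad_a  # fixed step size
            b -= learning_rate * grad_b
    return a, b

# Noise-free toy data generated from y = 2x + 1.
xs = [i / 63 for i in range(64)]
ys = [2.0 * x + 1.0 for x in xs]
a, b = sgd_fit(xs, ys)
print(a, b)
```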
To make data generation tractable, the simulations were performed on a two-dimensional cross-sectional model of the applicator rather than a full three-dimensional cavity model. The validity of this reduction, which exploits the physical rotation of the preform during industrial heating to ensure that the relevant temperature field is intrinsically axial, is discussed in detail in Section 3.1.
To ensure that evaluation performance reflects true geometric generalization rather than spatial interpolation within a known preform, training, validation, and test splits were constructed at the geometry level. No spatial samples from the same preform geometry appear in both the training and test sets. For the six-preform evaluation presented in Section 4.5, each preform is systematically held out for testing while the remaining five are used for training, providing a rigorous assessment of how well the model generalizes to unseen geometric configurations.
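The geometry-level hold-out protocol can be sketched as a generator over preform identifiers. The record layout (a dictionary with a `preform` key) is a hypothetical convention chosen for the example, and the validation split mentioned above is omitted for brevity.

```python
def leave_one_geometry_out(samples):
    """Yield (held_out_id, train, test) with splits made at the geometry level."""
    preform_ids = sorted({s["preform"] for s in samples})
    for held_out in preform_ids:
        train = [s for s in samples if s["preform"] != held_out]
        test = [s for s in samples if s["preform"] == held_out]
        yield held_out, train, test

# Toy dataset: six preforms (A to F) with four spatial samples each.
samples = [{"preform": p, "x": i} for p in "ABCDEF" for i in range(4)]
folds = list(leave_one_geometry_out(samples))

for held_out, train, test in folds:
    # No geometry may contribute samples to both sides of the split.
    assert {s["preform"] for s in train}.isdisjoint({s["preform"] for s in test})

print([(held_out, len(train), len(test)) for held_out, train, test in folds])
```

Splitting by preform identifier rather than by individual sample is what prevents spatial samples of the test geometry from leaking into training.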
Model performance is evaluated using standard regression metrics: the coefficient of determination ($R^2$), Root Mean Squared Error (RMSE), and Mean Absolute Error (MAE).
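For reference, the three metrics and the MSE loss can be computed as in the following sketch, which uses the standard textbook definitions and small made-up values. The function name is hypothetical.

```python
def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 compares the residual sum of squares with that of a constant-mean predictor.
    r2 = 1.0 - (mse * n) / ss_tot
    return {"mse": mse, "rmse": mse ** 0.5, "mae": mae, "r2": r2}

metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
constant = regression_metrics([1.0, 2.0, 3.0, 4.0], [2.5] * 4)
print(metrics)
print(constant["r2"])  # 0.0: predicting the mean everywhere
```

A constant-mean predictor scores exactly zero, so negative values of $R^2$ indicate predictions that are worse than ignoring the inputs altogether.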
For Case Studies 1 and 2, which are retained from the conference publication [4], a stacking-based model fusion strategy was employed. In this approach, variant-specific base models were first trained independently on simulation data from selected material or geometric configurations. A secondary Design of Experiments was then generated, and each base model was used to predict temperature values for this new sample set. The resulting predictions, together with their associated input features, were aggregated into a unified dataset on which a global predictor was trained. This fusion process, referred to as experience extraction, is described in detail in [4]. For Case Study 3 and the six-preform generalization evaluation, no fusion is employed; a single unified model is trained directly on the combined dataset from all training geometries using the geometry-aware formulation introduced in Section 3.2.

4. Experimental Design and Case Studies

This section presents the experimental evaluation of the proposed framework. The evaluation is organized into four parts with increasing levels of complexity. Case Studies 1 and 2 are retained from the conference publication [4] and serve as foundational evaluations addressing material property variation and conventional geometry generalization, respectively. Case Study 2, in particular, exposes the extrapolation limitation that motivates the central contribution of this work. To understand and address this limitation, a detailed analysis of the geometric feature space across six preform designs is conducted, leading to the domain-bounded training strategy evaluated in Case Study 3. Finally, a comprehensive generalization study is performed by systematically holding out each of the six preforms for testing, providing a rigorous assessment of the model’s interpolation and extrapolation behavior across the full range of feasible geometries.

4.1. Case Study 1: Material Property Generalization

The first case study evaluates the model’s ability to generalize across variations in PET material properties, specifically heat capacity. While virgin PET is commonly used in preform production, environmental and regulatory considerations have increased the adoption of recycled PET (rPET). The recycling process can introduce impurities, poor sorting, and thermal degradation [27], leading to deviations in thermophysical properties that directly affect the thermal response during microwave heating.
Heat capacity governs the material’s ability to absorb and store thermal energy, and is therefore a key parameter influencing the resulting temperature distribution. Virgin PET typically exhibits higher heat capacity due to fewer structural defects, whereas rPET tends to show reduced values as a result of crystallization disruption and contamination. Since experimental rPET property data are scarce, three plausible temperature-dependent heat capacity profiles were modeled and compared with a reference virgin PET curve. Figure 6 illustrates these profiles and their effect on the resulting temperature distribution along the preform surface, demonstrating that even moderate variations in heat capacity produce measurable changes in the thermal field.
Training and testing datasets were constructed using the low, medium, and high heat capacity categories summarized in Table 1. Following the fusion methodology described in [4], a base predictor was initially trained on the mid-range heat capacity dataset comprising 550 samples and subsequently fine-tuned on 450 samples from each of the low and high categories. After training, a DOE-based experience extraction process was used to generate 2000 synthetic predictions from each model, yielding a merged dataset of 6000 samples on which a global predictor was trained. The model was evaluated on an unseen heat capacity profile not included in any training phase.
Results demonstrated that the fusion-based global model generalized well to the unseen material variation and outperformed a baseline model trained from scratch on combined real data. This confirms the viability of the transfer learning and fusion approach for material property generalization. A detailed presentation of the training dynamics and numerical results for this case study can be found in [4].

4.2. Case Study 2: Geometry Generalization with Conventional Training

The second case study investigates model generalization across different PET preform geometries using a conventional training strategy. This case study, originally presented in [4], serves as a baseline that exposes the limitations of standard approaches to geometric generalization.
Four representative preform geometries were selected for this experiment, varying in total length, wall thickness and curvature. Three geometries were used for training through the same fusion pipeline described in Case Study 1: variant-specific models were trained independently and their predictions fused into a unified dataset for training a global predictor. The fourth geometry was reserved exclusively for testing.
Although geometric descriptors such as neck length were included as input features, the training geometries occupied only a narrow region of the geometry feature space. The three training preforms were relatively similar in their proportions, meaning that the model was exposed to only a limited range of geometric variation during training.
When evaluated on the fourth preform, which was geometrically more distinct from the training set, prediction accuracy deteriorated. The model produced unstable predictions in regions where the test geometry’s features fell outside the range of the training data. This behavior is characteristic of neural network extrapolation: the learned mapping, while accurate within the domain of the training geometries, does not reliably extend to configurations with substantially different geometric proportions.
This result reveals a fundamental limitation that cannot be resolved by simply adding geometric features as inputs to the network. If the training geometries do not adequately span the relevant feature space, the model will still be forced to extrapolate when confronted with a geometrically distinct test case. This observation directly motivates the domain-bounded training strategy proposed in this work, which is designed to ensure that geometric generalization remains an interpolation task. Before evaluating this strategy, however, a systematic analysis of the geometric feature space is necessary to understand the relationships between the available preform designs and to identify appropriate training and testing configurations.

4.3. Geometric Feature Space and Correlation Analysis

To inform the design of a robust training strategy and to understand the distribution of geometric variation across the available preform designs, six PET preform geometries (A through F) are analyzed in this section. Figure 7 shows the six preforms used in this study, illustrating the substantial variation in length, diameter, wall thickness, and neck configuration across the set. The geometric features extracted from each preform and used as inputs to the neural network are summarized in Table 2.

4.3.1. Aggregated Correlation Analysis

To characterize the relationship between input variables and the temperature field, a Pearson correlation analysis was performed across all input features and the output temperature values. Figure 8 presents the aggregated correlation coefficients grouped by feature category. Only the spatial coordinate exhibits a moderate linear correlation with temperature (≈0.45), reflecting its role as an indirect proxy for the structural region of the preform: a coordinate value in the neck is associated with fundamentally different electromagnetic field behavior, wall thickness, and boundary conditions than one in the body or dome. Among the remaining features, only slider_17—the single vertically oriented slab parameter—shows a weak but detectable correlation (≈0.12); the 16 horizontal slab parameters and all geometric descriptors fall below 0.05. As shown below, however, this aggregated view is misleading: the slab parameters exert strong, physically interpretable influence on temperature when analyzed at the appropriate spatial scale.

4.3.2. Spatially Resolved Slab–Temperature Correlation

Figure 9 shows the Pearson correlation between each of the 17 slab parameters and the temperature at each of the 32 coordinate points along the preform surface. The 16 horizontal slab parameters exhibit a clear diagonal pattern: each slab correlates most strongly with coordinate points in its physical neighborhood and has negligible influence on distant regions, with peak values reaching 0.46 for sliders 1–4 in the neck region and 0.26 for sliders 5–8 in the upper body. This spatial locality explains why the aggregated correlations in Figure 8 are low: the localized contribution of each slider is diluted by the many coordinates, out of 32, on which it has no effect.
Slider_17 behaves fundamentally differently. As the only vertically oriented slab parameter, it exhibits a strong positive correlation across the entire body region (0.42–0.72) and a strong negative correlation in the dome (reaching −0.78), indicating that it acts as a global control parameter redistributing electromagnetic energy between the body and dome rather than providing localized field adjustment. Its dual-signed influence explains its weak aggregated correlation of 0.12: the positive and negative contributions largely cancel when all coordinates are pooled.
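The two effects described above, dilution of a localized parameter and cancellation of a dual-signed one, can be reproduced on synthetic data. The sketch below is not based on the simulation data of this study: the gains, noise level, and coordinate ranges are invented solely to show how a pooled Pearson coefficient can be near zero while per-coordinate coefficients are large.

```python
import random

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

rng = random.Random(0)
N_SAMPLES, N_COORDS = 200, 32
local = [rng.random() for _ in range(N_SAMPLES)]  # acts on coordinates 0-3 only
glob = [rng.random() for _ in range(N_SAMPLES)]   # heats 4-17, cools 18-31

def gain_local(j):
    return 1.0 if j < 4 else 0.0

def gain_global(j):
    return 0.0 if j < 4 else (1.0 if j < 18 else -1.0)

# Synthetic temperature at each coordinate: parameter effects plus small noise.
temps = [[gain_local(j) * local[n] + gain_global(j) * glob[n] + rng.gauss(0.0, 0.05)
          for j in range(N_COORDS)] for n in range(N_SAMPLES)]

def resolved(slider):
    """Correlation of the parameter with temperature at each coordinate."""
    return [pearson(slider, [temps[n][j] for n in range(N_SAMPLES)])
            for j in range(N_COORDS)]

def pooled(slider):
    """Correlation after pooling all (sample, coordinate) pairs."""
    xs = [slider[n] for n in range(N_SAMPLES) for _ in range(N_COORDS)]
    ys = [temps[n][j] for n in range(N_SAMPLES) for j in range(N_COORDS)]
    return pearson(xs, ys)

resolved_local, resolved_global = resolved(local), resolved(glob)
print(resolved_local[0], pooled(local))                          # strong locally, weak pooled
print(resolved_global[10], resolved_global[25], pooled(glob))    # opposite signs cancel
```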

4.3.3. Grouped Slab Correlation by Structural Region

Figure 10 shows the Pearson correlation between spatially grouped slab averages and the mean temperature in each structural region. Grouping the neck sliders (1–4) raises their correlation with neck temperature from 0.36 (maximum individual value) to 0.49, and the upper body group (5–8) reaches 0.24 with body temperature, confirming that neighboring slabs act collectively to shape the local field. In contrast, averaging all 16 horizontal slabs together yields correlations near zero across all three regions (−0.04, +0.02, +0.003): the individual slab effects are spatially opposing and cancel under global aggregation. Slider_17 again stands out, with the largest grouped correlation magnitudes in both the body (+0.71) and dome (−0.81), reinforcing its role as a global energy redistribution parameter. These results confirm that global pairwise correlation is an unsuitable metric for assessing the importance of spatially distributed control parameters and that the prediction task is fundamentally nonlinear, which justifies the use of a deep neural network capable of capturing distributed, region-dependent effects rather than a simpler regression model.

4.4. Permutation Feature Importance

The Pearson correlation analysis presented above quantifies only linear, pairwise relationships between individual input features and temperature. Because the underlying physics of microwave heating is strongly nonlinear and the trained model is a deep neural network, a complementary analysis is required to assess how much the model actually relies on each input when producing predictions. To this end, a permutation feature importance analysis was performed on the trained unified model.
For each feature group, the values of all features in the group were randomly shuffled across samples, breaking the relationship between those features and the target while preserving their marginal distribution. The drop in the coefficient of determination $R^2$ on the evaluation set, relative to the unperturbed baseline, was used as the importance score. Each permutation was repeated 20 times with different random seeds, and the mean and standard deviation of the resulting drops are reported. A larger drop indicates that the model depends more strongly on the corresponding features.
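The procedure can be sketched as follows. The toy linear function stands in for the trained network, and the data are random; both are assumptions made for illustration. The sketch reports the population standard deviation over repeats, which is one possible convention, since the text does not state which form was used.

```python
import random
from statistics import mean, pstdev

def r2_score(y_true, y_pred):
    m = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - m) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def permutation_importance(model, X, y, group, n_repeats=20, seed=0):
    """Mean and standard deviation of the R^2 drop when `group` columns are shuffled."""
    baseline = r2_score(y, [model(row) for row in X])
    drops = []
    for repeat in range(n_repeats):
        order = list(range(len(X)))
        random.Random(seed + repeat).shuffle(order)
        # Columns in the group are moved together, so their joint
        # distribution is preserved while their link to y is broken.
        shuffled = [[X[src][c] if c in group else row[c] for c in range(len(row))]
                    for row, src in zip(X, order)]
        drops.append(baseline - r2_score(y, [model(row) for row in shuffled]))
    return mean(drops), pstdev(drops)

# Toy stand-in for the trained network: column 0 dominates, column 2 is ignored.
def model(row):
    return 3.0 * row[0] + 0.5 * row[1]

rng = random.Random(1)
X = [[rng.random(), rng.random(), rng.random()] for _ in range(300)]
y = [model(row) for row in X]

strong, _ = permutation_importance(model, X, y, group={0})
weak, _ = permutation_importance(model, X, y, group={1})
unused, _ = permutation_importance(model, X, y, group={2})
print(strong, weak, unused)
```

The drop for a dominant input can exceed 1, because predictions made from shuffled values of that input can be worse than a constant-mean predictor.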
Two evaluation settings are considered. The within-preform setting uses the held-out Preform C test set and quantifies the importance of features that vary within a single preform’s heating experiments—namely the slab positions and the spatial coordinate. The across-preform setting uses the full multi-preform dataset and additionally reveals the importance of the geometric descriptors, which vary only between preforms and therefore cannot be assessed within a single preform. Reporting both settings provides a more complete picture than either alone.

4.4.1. Group-Level Importance

Figure 11 shows the across-preform group-level importance scores. The spatial coordinate is by far the dominant input, with a drop of 1.83 ± 0.005 from a baseline of $R^2 = 0.992$. This leaves a permuted $R^2$ of roughly −0.84, well below the $R^2 = 0$ of a constant-mean predictor, and confirms the central role of the coordinate as the primary carrier of region-specific physical information. Among the remaining groups, the vertical slab (slider 17) shows the largest drop (0.235 ± 0.001), followed by the 16 horizontal slabs grouped together (0.166 ± 0.001), the three wall thickness features (0.153 ± 0.001), and the three spatial/structural geometric features (0.134 ± 0.001).
A notable observation is that slider 17 alone is more important than all 16 horizontal slabs combined, quantitatively confirming the qualitative finding from the correlation analysis that slider 17 acts as a global energy redistribution parameter while the horizontal slabs each exert localized influence. The relative ordering of the geometric groups suggests that wall thickness and spatial/structural features contribute comparably at the aggregated level, but as shown next, this aggregated view obscures a substantially more uneven distribution at the individual feature level.

4.4.2. Fine-Grained Geometric Importance

The group-level view obscures an important structural property of the geometric features: their individual contributions are highly unequal. Figure 12 shows the permutation importance of each geometric descriptor evaluated separately. The single most important geometric feature is H-Body (0.241 ± 0.001), which alone exceeds the combined importance of all three wall thickness features (0.157 summed). R-Dome (0.097) is the second most important geometric feature and is also a spatial/structural parameter. The wall thickness features (WT-Body 0.074, WT-Neck 0.051, WT-Dome 0.032) form an intermediate band, while H-Neck (0.025) is the smallest geometric contribution overall.
This fine-grained view confirms the physical interpretation developed in Section 4.7: features that determine the physical proportions of the preform—particularly body length (H-Body) and dome radius (R-Dome)—exert substantially stronger influence on the predicted temperature field than wall thickness parameters, because they control where the structural region boundaries lie and therefore the qualitative pattern of the electromagnetic field distribution rather than merely scaling the thermal response. The fact that this dominance is not visible at the group level—where H-Neck’s small contribution drags the spatial/structural group average down—exactly parallels the slab analysis in Section 4.3, where global aggregation also concealed the true importance of individual control parameters. In both cases, the appropriate analytical scale is the individual feature, not the aggregated group.

4.4.3. Implications

The permutation analysis complements the linear correlation analysis in three respects. First, it quantifies nonlinear feature importance, addressing a limitation of pairwise Pearson correlation in characterizing the behavior of a deep regression model. Second, it confirms that all analyzed input groups contribute meaningfully to prediction (every group exhibits a clearly non-zero drop in $R^2$), which justifies the choice to provide spatial coordinate, slab positions, and geometric descriptors as joint inputs to a single unified model. Third, the dominance of H-Body and R-Dome over the wall thickness features provides quantitative support for the physical extrapolation argument developed in Section 4.7: when a held-out preform requires extrapolation along these spatial/structural dimensions, the model is forced to predict in a regime where its most influential geometric inputs lie outside the training distribution, which explains the larger accuracy degradation observed for boundary cases such as Preforms B and F in Section 4.5.

4.5. Generalization Across Preforms

To provide a rigorous and comprehensive assessment of the proposed framework, a systematic evaluation was conducted in which each of the six preform geometries was held out for testing in turn, while the remaining five were used for training. This protocol produces six independent experiments, each testing the model’s ability to predict temperature distributions for a completely unseen preform geometry.
For each experiment, a single unified model was trained using the geometry-aware formulation without any model fusion, following the same protocol as Case Study 3. Table 3 summarizes the results across all six held-out experiments.
The training performance is consistently high across all experiments, with $R^2$ values exceeding 0.99 in every case. This confirms that the model has sufficient capacity to learn the relationship between process parameters, geometric features, and temperature distribution within the training domain.
The test performance, however, varies substantially depending on which preform is held out, and this variation directly reflects the interpolation-versus-extrapolation distinction established by the geometric feature space analysis. The strongest test performance is achieved when Preform C is held out ($R^2 = 0.9865$, MAE = 0.0183). As shown in Table 2, Preform C is the only geometry whose normalized features lie entirely within the interior of the feature range defined by the remaining preforms. Figure 13 illustrates predicted versus simulated temperature profiles for six slab configurations selected to span the range of heating conditions in the test set, including cases with low, moderate, and high overall temperature levels as well as varying spatial profile shapes. The model shows close agreement across all cases. This result provides direct empirical evidence that the model performs well when the prediction task is an interpolation problem.
Preform A also achieves relatively strong performance ($R^2 = 0.880$). While Preform A defines extreme values for three features (neck wall thickness, body wall thickness, and dome wall thickness), the majority of its spatial and structural features remain within the bounds of the remaining training preforms, which mitigates the extrapolation difficulty as discussed further below.
Preforms D ($R^2 = 0.8646$) and E ($R^2 = 0.8607$) exhibit very similar performance, both showing moderate degradation compared to Preform C. Both preforms have several geometric features approaching extreme values: Preform D defines the maximum for body wall thickness and neck length, while Preform E defines the maximum for dome wall thickness. The comparable performance of these two preforms, despite their different geometric characteristics, suggests that partial extrapolation along a small number of feature dimensions introduces a consistent and predictable level of prediction uncertainty.
In contrast, the lowest test performance is observed for Preforms B ($R^2 = 0.7733$) and F ($R^2 = 0.7498$). These preforms occupy positions at the boundaries of the geometric feature space. Preform B defines the minimum values for both body length (H-Body) and dome radius simultaneously, while Preform F defines the maximum values for body length and dome radius. When either preform is excluded from training, the remaining five preforms no longer span the full extent of the feature space in those dimensions, forcing the model to extrapolate. The reduced accuracy for these boundary cases is consistent with the known extrapolation limitation of neural networks and reinforces the importance of the domain-bounded training strategy.
Across all six experiments, the mean test coefficient of determination is $\overline{R^2} = 0.852$ with a standard deviation of $\sigma = 0.077$. Despite the variation between interpolation and extrapolation cases, all test $R^2$ values remain above 0.74, indicating that the geometry-aware formulation provides a meaningful degree of robustness even under unfavorable geometric conditions. The gap between the best ($R^2 = 0.9865$) and worst ($R^2 = 0.7498$) test cases quantifies the practical cost of extrapolation and provides a clear guideline for industrial deployment: when the training data span the geometric extremes of the production range, predictions for any intermediate preform can be expected to be highly accurate.
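The summary statistics can be recomputed from the six test scores quoted in this section, as in the short sketch below. The reported standard deviation of 0.077 is reproduced by the population form (`statistics.pstdev`); the sample form (`statistics.stdev`) gives a slightly larger value of about 0.085. The value for Preform A is quoted to three decimals only.

```python
from statistics import mean, pstdev, stdev

# Test R^2 per held-out preform, as quoted in the text of this section.
test_r2 = {"A": 0.880, "B": 0.7733, "C": 0.9865,
           "D": 0.8646, "E": 0.8607, "F": 0.7498}

values = list(test_r2.values())
mean_r2 = mean(values)
pop_std = pstdev(values)      # divides by N
sample_std = stdev(values)    # divides by N - 1
gap = max(values) - min(values)

print(round(mean_r2, 3), round(pop_std, 3), round(sample_std, 3), round(gap, 4))
```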

4.6. Validation on Additional Intermediate Preforms

To further validate the domain-bounded training strategy beyond the single interpolation case of Preform C, three additional PET preforms (G, H, and I) were selected for independent testing. These preforms, shown in Figure 14, were not included in any training or prior evaluation phase. Their geometric features, summarized in Table 4, were verified to lie within the minimum and maximum bounds defined by the training geometries (Preforms A, B, D, E, and F) across all six feature dimensions, ensuring that each prediction constitutes a pure interpolation task.
The three validation preforms represent diverse positions within the geometric feature space. Preform G has moderate values across all dimensions. Preform H has a neck length (H-Neck = 16.80) approaching the upper bound of the training range (17.00), representing a near-boundary interpolation case. Preform I has a relatively short neck (H-Neck = 9.62) and a long body (H-Body = 72.43), placing it in a different region of the feature space from G and H. Together, the three preforms sample distinct interior positions within the convex hull, providing a more comprehensive test of the interpolation hypothesis than a single case.
A single unified model, trained on the five boundary preforms using the geometry-aware formulation $T = f(x, g, s, r)$, was used to predict the temperature distributions for all three validation preforms without any retraining or fine-tuning. The results are summarized in Table 5.
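To make the coordinate-level formulation concrete, the following minimal sketch assembles the input rows for one preform and one slab configuration. It is an illustration under stated assumptions, not the study's data pipeline: the helper `build_inputs`, the one-hot encoding of the region indicator, the number of slab positions (four), and all numeric values are hypothetical. Only the count of six geometric descriptors follows the text.

```python
import numpy as np

REGIONS = ("neck", "body", "dome")

def build_inputs(x, g, s, region_names):
    """Stack per-point rows [x, g..., s..., one-hot r] for a single preform.

    x            : (n,) axial coordinates of the evaluation points
    g            : (n_g,) geometric descriptors, constant along the preform
    s            : (n_s,) slab positions, constant for one heating run
    region_names : length-n sequence of "neck" / "body" / "dome"
    """
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    n = x.shape[0]
    g_rows = np.tile(np.asarray(g, dtype=float), (n, 1))
    s_rows = np.tile(np.asarray(s, dtype=float), (n, 1))
    r_idx = np.array([REGIONS.index(name) for name in region_names])
    r_rows = np.eye(len(REGIONS))[r_idx]          # assumed one-hot encoding
    return np.hstack([x, g_rows, s_rows, r_rows])

# Illustrative values only (not taken from the study's dataset).
x = np.linspace(0.0, 1.0, 5)
g = [16.8, 70.0, 9.0, 2.5, 3.0, 2.8]      # six hypothetical descriptors
s = [0.0, 1.5, 3.0, 1.0]                  # four hypothetical slab positions
r = ["neck", "body", "body", "body", "dome"]

X = build_inputs(x, g, s, r)
print(X.shape)  # (5, 14): 1 coordinate + 6 geometry + 4 slab + 3 region columns
```

Each row corresponds to one spatial location, so a single simulated profile contributes many training samples that share the same $g$ and $s$ but differ in $x$ and $r$.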
All three validation preforms achieve $R^2$ values above 0.98, with Preform I reaching $R^2 = 0.9901$ and the lowest MAE of 0.0143. These results are comparable to or better than the performance observed for Preform C ($R^2 = 0.9865$) in the main evaluation, confirming that the strong interpolation performance is not an artifact of a single favorable test case but a consistent property of the framework when the interpolation condition is satisfied.
Figure 15 presents the predicted versus simulated temperature profiles for all three validation preforms across three representative slab configurations each. The model closely tracks the simulated profiles across all nine cases, capturing both the overall shape of the temperature field and localized features such as peaks in the body region and the sharp temperature drop in the dome. The varying temperature ranges and profile shapes across configurations demonstrate that the model adapts to different heating conditions while maintaining consistent accuracy.
Preform H, despite having its neck length within 1.2% of the training maximum, still achieves $R^2 = 0.9832$. This indicates that the model maintains high accuracy even for geometries approaching the boundary of the training domain, provided they remain strictly within the convex hull. The slight reduction in performance compared to Preforms G and I is consistent with the expectation that proximity to the feature space boundary introduces marginally higher prediction uncertainty.
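The interpolation condition and the margin quoted for Preform H can be checked with a few lines. The sketch below tests per-feature minimum and maximum bounds, which is how the validation preforms are described as having been verified. Note that this is an axis-aligned box check: with more than one feature it is a necessary but not a sufficient condition for membership in the convex hull of the training geometries. The function names are hypothetical, and only the H-Neck values (16.80 and 17.00) come from the text; the remaining numbers are placeholders.

```python
def bound_margins(candidate, lower, upper):
    """Return, per feature, the relative distance to the nearest bound.

    A negative value means the candidate lies outside [lower, upper] in that
    feature, i.e. the model would have to extrapolate along that dimension.
    The margin is normalised by the upper bound of the feature.
    """
    margins = {}
    for name, value in candidate.items():
        lo, hi = lower[name], upper[name]
        margins[name] = min(value - lo, hi - value) / hi
    return margins

def is_interpolation(candidate, lower, upper):
    """True if every feature lies within its training bounds."""
    return all(m >= 0.0 for m in bound_margins(candidate, lower, upper).values())

# H-Neck values for Preform H (16.80) and the training maximum (17.00) are
# taken from the text; every other number here is a hypothetical placeholder.
lower = {"H-Neck": 9.0, "H-Body": 50.0}
upper = {"H-Neck": 17.00, "H-Body": 80.0}
preform_h = {"H-Neck": 16.80, "H-Body": 65.0}

margins = bound_margins(preform_h, lower, upper)
print(is_interpolation(preform_h, lower, upper))   # True
print(f"{margins['H-Neck']:.1%}")                  # 1.2%
```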
These validation results provide strong additional evidence for the central claim of this work: when the training data span the geometric extremes of the feasible design space, the model can reliably predict temperature distributions for any intermediate preform geometry. In practical terms, this means that a manufacturer generating simulation data from the smallest and largest admissible preforms can deploy the trained model across the entire product portfolio without requiring geometry-specific retraining or additional simulation runs for each new design.

4.7. Discussion

The three case studies, the six-preform generalization evaluation, and the independent validation on three additional preforms provide complementary perspectives on the capabilities and limitations of the proposed framework.
Case Study 1 demonstrates that the fusion-based approach introduced in the conference publication [4] is effective for generalizing across material property variations. By combining transfer learning with stacking-based fusion, accurate temperature predictions can be achieved for unseen heat capacity profiles using limited training data. This result establishes that the neural network architecture and training pipeline are capable of capturing the influence of thermophysical parameters on the temperature field.
Case Study 2 exposes the critical limitation that motivated the present work. When the same fusion-based approach is applied to geometric generalization with conventionally selected training geometries, prediction accuracy degrades for the unseen test preform. The key insight from this case study is that the training preforms were too geometrically similar to pose a meaningful generalization challenge, and when the model was confronted with a more distinct geometry, it failed because extrapolation was required. This limitation cannot be overcome by increasing the training data volume for the existing geometries or by refining the fusion strategy—it is a structural problem arising from inadequate coverage of the geometry feature space.
Case Study 3 directly addresses this limitation through the domain-bounded training strategy. By training on preforms that span the geometric extremes, the prediction of Preform C becomes a pure interpolation task, and the model achieves strong accuracy without any model fusion. This result demonstrates that principled selection of training geometries is at least as important as the choice of learning algorithm for ensuring geometric generalization.
The six-preform generalization evaluation extends this analysis by testing the model under both interpolation and extrapolation conditions. The results establish a clear and quantifiable relationship between a test preform’s position in the geometric feature space and the achievable prediction accuracy. Interpolation cases yield high accuracy, partial extrapolation cases show moderate degradation, and full boundary cases exhibit the lowest performance.
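The hold-out protocol behind this evaluation, in which each preform is excluded in turn and the model is trained on the remaining five, can be sketched as a split over a per-sample preform label. The function name and the toy label column are assumptions made for illustration; the training and scoring steps are omitted.

```python
import numpy as np

def leave_one_preform_out(preform_ids):
    """Yield (held_out, train_idx, test_idx) for each distinct preform label."""
    preform_ids = np.asarray(preform_ids)
    for held_out in sorted(set(preform_ids.tolist())):
        test_mask = preform_ids == held_out
        yield held_out, np.flatnonzero(~test_mask), np.flatnonzero(test_mask)

# Toy label column: three sample rows per preform (illustrative sizes only).
preform_ids = np.repeat(list("ABCDEF"), 3)

splits = list(leave_one_preform_out(preform_ids))
for held_out, train_idx, test_idx in splits:
    print(held_out, len(train_idx), len(test_idx))
```

Splitting by preform rather than by individual sample matters here: a random row-level split would place points from the same geometry in both sets and would not test geometric generalization.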
A closer examination of the six-preform results reveals that not all geometric features contribute equally to extrapolation difficulty. The performance ranking across held-out preforms, C (0.9865) > A (0.880) > D (0.8646) > E (0.8607) > B (0.7733) > F (0.7498), cannot be explained solely by counting the number of extreme features for each preform. Preform A defines three extreme values (neck wall thickness, body wall thickness, and dome wall thickness) yet achieves $R^2 = 0.880$, while Preform B also defines three extremes (body length, neck length, and dome radius) but performs substantially worse at $R^2 = 0.7733$. This discrepancy is explained by the fundamentally different physical roles of these two groups of features.
Wall thickness parameters act primarily as scaling factors on the thermal response. A thicker wall requires more microwave energy to reach a given temperature, but the overall shape of the temperature profile along the preform does not change qualitatively. The neural network can learn this scaling relationship with relative ease, even when extrapolating moderately beyond the training range, because the underlying function remains smooth and predictable.
In contrast, the spatial and structural features (body length, neck length, and dome radius) define the physical proportions of the preform and determine where each measurement point lies relative to the boundaries between the neck, body, and dome regions. These features directly affect the electromagnetic field distribution within the cavity in a qualitative, not merely quantitative, manner. A short body means that the neck-to-dome transition occurs over a compressed spatial range, producing different standing wave patterns and thermal interactions between adjacent regions. A long body separates these transitions, creating a more uniform field in the central region. These are physically distinct regimes, not scaled versions of each other. When the model is forced to extrapolate along these dimensions, as occurs when Preform B or F is removed from the training set, it must predict temperature patterns for spatial configurations it has never encountered, where the underlying physics behaves differently. This explains the substantially lower accuracy observed for these boundary preforms.
To examine these two failure cases more closely, the worst-case predicted-versus-simulated temperature profiles for Preforms B and F are shown in Figure 16 and Figure 17. The two preforms fail in qualitatively different ways. For Preform B, the predicted profiles retain the correct overall shape—a rising neck, central body peak, and declining dome—but are shifted upward by an approximately constant offset across the full axial range, indicating that the error is distributed roughly uniformly along the preform rather than localized in any particular region. For Preform F, in contrast, the error is concentrated in the upper body: the model predicts a pronounced central peak where the simulated profile is flatter, while prediction and simulation agree more closely near the dome. These contrasting failure signatures are consistent with the dominance of H-Body identified in the permutation feature importance analysis of Section 4.4: Preform B defines the minimum value of H-Body among all training geometries and Preform F defines the maximum, so when either is held out the model is forced to predict in a regime where its most influential geometric input lies outside the training distribution. The precise mechanism linking H-Body extrapolation to these two specific failure modes cannot be determined from the available data alone and represents a natural direction for future investigation.
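One way to quantify the difference between a uniform shift and a localized mismatch is to split the prediction error into a constant offset and a remaining shape component. This decomposition is offered as a hedged illustration and is not an analysis performed in the study; the function name and the synthetic profiles below are assumptions.

```python
import numpy as np

def offset_and_shape_error(predicted, simulated):
    """Split the prediction error into a constant offset and a shape term.

    Returns (offset, shape_rmse), where offset is the mean signed error and
    shape_rmse is the RMSE left after subtracting that offset. They satisfy
    rmse**2 == offset**2 + shape_rmse**2.
    """
    error = np.asarray(predicted, dtype=float) - np.asarray(simulated, dtype=float)
    offset = error.mean()
    shape_rmse = np.sqrt(np.mean((error - offset) ** 2))
    return offset, shape_rmse

# Synthetic normalised profiles, for illustration only (not data from the study).
z = np.linspace(0.0, 1.0, 101)
simulated = np.sin(np.pi * z)

shifted = simulated + 0.1                                    # uniform upward shift
peaked = simulated + 0.3 * np.exp(-((z - 0.5) / 0.1) ** 2)   # spurious central peak

print(offset_and_shape_error(shifted, simulated))  # offset of 0.1, near-zero shape error
print(offset_and_shape_error(peaked, simulated))   # shape term larger than the offset
```

Under this view, a profile that fails like Preform B would show an error dominated by the offset, while one that fails like Preform F would show a larger shape term.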
This interpretation is further supported by the close performance of Preforms D ($R^2 = 0.8646$) and E ($R^2 = 0.8607$). Both preforms have extreme values distributed across a mix of wall thickness and spatial features, resulting in comparable moderate extrapolation difficulty. Neither preform represents a pure spatial extrapolation case like B or F, nor a pure wall thickness extrapolation case like A, placing them in an intermediate performance band that is consistent with their mixed feature profiles.
The independent validation on Preforms G, H, and I provides the strongest evidence for the domain-bounded strategy. All three preforms, despite occupying different positions within the geometric feature space and having no overlap with the training data, achieve $R^2$ values above 0.98. Notably, Preform I ($R^2 = 0.9901$) outperforms even Preform C from the main evaluation, and Preform H maintains high accuracy ($R^2 = 0.9832$) despite its neck length approaching the training boundary. These results demonstrate that the interpolation guarantee provided by the domain-bounded strategy is not sensitive to the specific location within the convex hull: strong performance is achieved regardless of whether the test geometry lies near the center or near the boundary of the feature space, provided it remains strictly within the training bounds.
This finding has direct practical implications for the domain-bounded training strategy. It suggests that ensuring coverage of the spatial and structural feature extremes is more critical than covering wall thickness extremes. When selecting training geometries for a new production scenario, priority should be given to including preforms that span the full range of body length, neck length, and dome proportions, as extrapolation along these dimensions carries the highest prediction risk.
An important practical implication of the overall framework is that the domain-bounded strategy reduces the burden of data collection. Rather than requiring simulation data from every preform variant in a production line, it is sufficient to generate training data from preforms that span the geometric extremes, particularly along the spatial and structural dimensions. All intermediate designs are then covered by interpolation, eliminating the need for geometry-specific retraining. The validation results on Preforms G, H, and I confirm this directly: a model trained on five boundary preforms achieves $R^2 > 0.98$ on three entirely new designs without any additional training.
The inclusion of region-aware inputs further contributes to prediction accuracy by decomposing the preform into physically distinct zones with different thermal characteristics. The neck, body, and dome regions experience different electromagnetic field intensities and boundary conditions, and providing this structural context as an explicit input allows the network to learn region-specific heating patterns rather than attempting to fit a single global function across the entire preform surface. This is consistent with the observation that spatial and structural features dominate the extrapolation difficulty: the region label provides the network with the qualitative context it needs to distinguish between fundamentally different physical regimes along the preform.
A central limitation of the current study is that all reported evaluations are based on simulation data generated by Ansys HFSS. While these simulations are high fidelity and capture the coupled electromagnetic–thermal physics of the system, several effects present in real production environments are necessarily idealized in the simulation model. Among these are temperature- and frequency-dependent variations in the dielectric properties of PET during heating, magnetron frequency drift and tuning behavior of the industrial microwave generator, mechanical tolerances in slab positioning, unit-to-unit variability between physically manufactured preforms, and convective and radiative heat losses during the brief interval between cavity exit and thermal imaging. Each of these effects can introduce small but systematic deviations between simulation and physical measurement, and an experimental validation campaign covering multiple preform geometries and slab configurations would be required to characterize the simulation-to-reality gap quantitatively. Such a campaign is planned as part of the next development phase, once the experimental infrastructure for systematic measurement under controlled conditions is fully operational. Until then, the present results should be interpreted as a validation of the proposed learning framework against the high-fidelity simulation model that defines the ground truth, rather than as a direct validation against the production process. Additionally, the current framework treats material properties as fixed within each case study; incorporating dynamic material behavior that changes during the heating process represents a further natural direction for future development.

5. Conclusions and Future Work

This study investigated the generalization capability of a unified neural network model for predicting temperature distributions in microwave preheating of PET preforms prior to blow molding. The work extends a previous conference publication [4], which employed variant-specific fine-tuning and model fusion, by introducing a geometry-aware problem formulation and a domain-bounded training strategy that enables a single model to generalize across diverse preform geometries.
The core finding of this work is that the distinction between interpolation and extrapolation in the geometric feature space is the primary determinant of prediction accuracy. Across six held-out experiments, the mean test $R^2$ is 0.852 with a standard deviation of 0.077. When the geometric parameters of a test preform lie within the bounds of the training data, the model achieves strong accuracy ($R^2 = 0.9865$ for Preform C), while performance degrades for boundary preforms that require extrapolation ($R^2 \approx 0.75$ for Preforms B and F). A further analysis reveals that extrapolation along spatial and structural features (body length, neck length, and dome radius) is substantially more damaging to prediction accuracy than extrapolation along wall thickness parameters because the former changes the qualitative character of the electromagnetic field distribution while the latter acts primarily as a scaling factor on the thermal response.
The domain-bounded training strategy was independently validated on three additional preforms (G, H, I) not included in any prior training or evaluation. All three achieved $R^2$ values above 0.98, with Preform I reaching $R^2 = 0.9901$. These results confirm that the interpolation guarantee provided by the domain-bounded strategy holds consistently across diverse positions within the geometric feature space, and is not an artifact of a single favorable test case.
The practical implication is a direct reduction in data generation effort. Rather than requiring simulation data for every preform variant in a production line, training data from preforms spanning the geometric extremes are sufficient to cover the entire feasible design space through interpolation. Compared to the conference approach, which required training and fusing three separate variant-specific models, the proposed method uses a single model trained once on a combined dataset—reducing both computational overhead and implementation complexity.
Several directions for future work are identified. First, the current framework treats material properties as fixed for each preform; incorporating temperature-dependent and processing-history-dependent material behavior would improve fidelity for real production conditions where material properties evolve during heating. Second, the analysis of interpolation and extrapolation behavior suggests that uncertainty-aware architectures could be valuable: by quantifying prediction confidence as a function of distance from the training domain in the geometry feature space, the model could automatically flag cases where extrapolation risk is high. Third, more expressive architectures such as physics-informed neural networks or attention-based mechanisms could be explored to improve extrapolation performance for boundary geometries, potentially relaxing the requirement that training data must strictly bound the test geometry. Fourth, alternative loss functions, such as region-weighted variants that assign different penalties to the neck, body, and dome sections based on their criticality for blow molding quality, could further improve prediction accuracy in physically important regions of the preform.
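As a rough illustration of the fourth direction, a region-weighted mean squared error could take the following form. This is a sketch of one possible loss, not a method evaluated in this work; the function name and the weight values are hypothetical.

```python
import numpy as np

def region_weighted_mse(y_true, y_pred, regions, weights):
    """Weighted mean of squared errors, with one weight per structural region."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    w = np.array([weights[r] for r in regions], dtype=float)
    return float(np.sum(w * (y_true - y_pred) ** 2) / np.sum(w))

# Hypothetical weights; the relative importance of each region is an assumption.
weights = {"neck": 1.0, "body": 2.0, "dome": 1.0}

regions = ["neck", "body", "body", "dome"]
y_true = [0.2, 0.8, 0.9, 0.4]
y_pred = [0.2, 0.7, 0.9, 0.4]

print(region_weighted_mse(y_true, y_pred, regions, weights))
```

With all weights equal, the expression reduces to the ordinary MSE, so the standard loss is recovered as a special case.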
Expanding the geometric design space to include a broader range of preform shapes and wall thickness profiles would strengthen the generality of the proposed approach. Most importantly, experimental validation using physical temperature measurements from an industrial microwave heating system is needed to complement the simulation-based results presented in this work and to assess the framework’s robustness under real manufacturing conditions.

Author Contributions

Conceptualization, A.A. and A.F.; methodology, A.A.; software, A.A.; validation, A.A.; formal analysis, A.A.; investigation, A.A.; resources, A.A.; data curation, A.A.; writing—original draft preparation, A.A.; writing—review and editing, A.A. and A.F.; visualization, A.A.; supervision, A.F.; project administration, A.A. and A.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Krones AG through the industrial PhD program in collaboration with Deggendorf Institute of Technology.

Data Availability Statement

The dataset used in this study is proprietary and was generated using Ansys Electronics HFSS 2024 R1 (Ansys Inc., Canonsburg, PA, USA) simulations with Latin Hypercube Sampling. Synthetic benchmark datasets and the implementation pseudo-code will be released in future work to facilitate reproducibility. Interested researchers may contact the corresponding author for further technical details.

Acknowledgments

The authors wish to thank Thomas Albrecht and Guenter Winkler for their support, fruitful discussions, and useful advice. During the preparation of this manuscript, the authors used ChatGPT (GPT-4) for the purposes of language editing and LaTeX formatting. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The first author is pursuing a PhD at Deggendorf Institute of Technology in collaboration with Krones AG, which funded the research presented in this article. The second author, a faculty member at Deggendorf Institute of Technology, contributed in an academic supervisory capacity. The authors declare that they have no other competing interests.

Abbreviations

The following abbreviations are used in this manuscript:
PET: Polyethylene Terephthalate
MW: Microwave
IR: Infrared
rPET: Recycled Polyethylene Terephthalate
DOE: Design of Experiments
LHS: Latin Hypercube Sampling
MLP: Multilayer Perceptron
MSE: Mean Squared Error
RMSE: Root Mean Squared Error
MAE: Mean Absolute Error
PTFE: Polytetrafluoroethylene
HFSS: High-Frequency Structure Simulator

References

  1. Wawrzyniak, P.; Karaszewski, W. A literature survey of the influence of preform reheating and stretch blow molding with hot mold process parameters on the properties of PET containers part ii. Polimery 2020, 65, 437–448.
  2. Estel, L.; Ph, L.; Ledoux, A.; Bonnet, C.; Delmotte, M. Microwave assisted blow molding of polyethylene-terephthalate (PET) bottles. In Proceedings of the AIChE Annual Meeting, Austin, TX, USA, 7–12 November 2004.
  3. Luo, Y.-M.; Chevalier, L.; Nguyen, T.T. Optimization of the temperature profile of PET preform via a 3D modelling of the infrared heating and ventilation. In Material Forming—ESAFORM 2024; Materials Research Forum: Lancaster County, PA, USA, 2024; Volume 41, pp. 2584–2594.
  4. Alsheikh, A.; Fischer, A. Fusion-based neural generalization for predicting temperature fields in industrial PET preform heating. arXiv 2025, arXiv:2510.05394.
  5. Yang, Z.; Naeem, W.; Menary, G.; Deng, J.; Li, K. Advanced modelling and optimization of infrared oven in injection stretch blow-moulding for energy saving. IFAC Proc. Vol. 2014, 47, 766–771.
  6. Monteix, S.; Schmidt, F.; Le Maoult, Y.; Denis, G.; Vigny, M. Recent issues in preform radiative heating modelling. In Proceedings of the International Conference of Polymer Processing Society (PPS), Montréal, QC, Canada, 21–24 May 2001; pp. 1–6.
  7. García-Baños, B.; Plaza-Gonzalez, P.; Sánchez, J.; Steger, S.; Feigl, A.; Penaranda-Foix, F.; Catalá-Civera, J. Focusing dielectric slabs for the optimization of heating patterns in single mode microwave applicators. Appl. Therm. Eng. 2022, 201, 117845.
  8. Baker-Jarvis, J.; Kim, S. The interaction of radio-frequency fields with dielectric materials at macroscopic to mesoscopic scales. J. Res. Natl. Inst. Stand. Technol. 2012, 117, 1.
  9. Sun, Y.; Sengupta, U.; Juniper, M. Physics-informed deep learning for simultaneous surrogate modeling and PDE-constrained optimization of an airfoil geometry. Comput. Methods Appl. Mech. Eng. 2023, 411, 116042.
  10. Hsieh, P. Intelligent temperature control of a stretch blow molding machine using deep reinforcement learning. Processes 2023, 11, 1872.
  11. Zhai, N.; Zhou, X. Temperature prediction of heating furnace based on deep transfer learning. Sensors 2020, 20, 4676.
  12. Barba, P.D.; Dughiero, F.; Forzan, M.; Lowther, D.; Marconi, A.; Mognaschi, M.; Sykulski, J. Neural metamodels and transfer learning for induction heating processes (TEAM 36 problem). Int. J. Appl. Electromagn. Mech. 2023, 73, 389–398.
  13. Liu, W.; Ouyang, H.; Liu, Q.; Cai, S.; Wang, C.; Xie, J.; Hu, W. Image recognition for garbage classification based on transfer learning and model fusion. Math. Probl. Eng. 2022, 2022, 4793555.
  14. Zhou, J.; Li, Z.; Zhi, W.; Liang, B.; Moses, D.; Dawes, L. Using convolutional neural networks and transfer learning for bone age classification. In Proceedings of the 2017 International Conference on Digital Image Computing: Techniques and Applications (DICTA), Sydney, NSW, Australia, 27–29 November 2017; pp. 1–6.
  15. Ghazi, M.; Yanikoglu, B.; Aptoula, E. Plant identification using deep neural networks via optimization of transfer learning parameters. Neurocomputing 2017, 235, 228–235.
  16. Chakraborty, S.; Mondal, R.; Singh, P.; Sarkar, R.; Bhattacharjee, D. Transfer learning with fine tuning for human action recognition from still images. Multimed. Tools Appl. 2021, 80, 20547–20578.
  17. Whitney, H.; Li, H.; Ji, Y.; Liu, P.; Giger, M. Comparison of breast MRI tumor classification using human-engineered radiomics, transfer learning from deep convolutional neural networks, and fusion methods. Proc. IEEE 2019, 108, 163–177.
  18. Korzh, O.; Joaristi, M.; Serra, E. Convolutional neural network ensemble fine-tuning for extended transfer learning. In Big Data—BigData 2018. Lecture Notes in Computer Science; Chin, F., Chen, C., Khan, L., Lee, K., Zhang, L.-J., Eds.; Springer: Cham, Switzerland, 2018; Volume 10968, pp. 110–123.
  19. Geyer, R.; Corinzia, L.; Wegmayr, V. Transfer learning by adaptive merging of multiple models. In Proceedings of the 2nd International Conference on Medical Imaging with Deep Learning (MIDL 2019), London, UK, 8–10 July 2019; Volume 102, pp. 185–196.
  20. Pfeiffer, J.; Kamath, A.; Rücklé, A.; Cho, K.; Gurevych, I. Adapterfusion: Non-destructive task composition for transfer learning. arXiv 2020, arXiv:2005.00247.
  21. Pastore, A. Extrapolating from neural network models: A cautionary tale. arXiv 2020, arXiv:2012.06605.
  22. Cao, J.; Li, Q.; Xu, L.; Yang, R.; Dai, Y. Non-parametric surrogate model method based on machine learning with application on low-pressure steam turbine exhaust system. J. Glob. Power Propuls. Soc. 2022, 6, 165–180.
  23. Oldenburg, J.; Borowski, F.; Öner, A.; Schmitz, K.-P.; Stiehm, M. Geometry aware physics informed neural network surrogate for solving Navier–Stokes equation (GAPINN). Adv. Model. Simul. Eng. Sci. 2022, 9, 8.
  24. Wong, J.C.; Ooi, C.C.; Chattoraj, J.; Lestandi, L.; Dong, G.; Kizhakkinan, U.; Rosen, D.W.; Jhon, M.H.; Dao, M.H. Graph neural network based surrogate model of physics simulations for geometry design. arXiv 2023, arXiv:2302.00557.
  25. Franco, N.R.; Fresca, S.; Tombari, F.; Manzoni, A. Deep learning-based surrogate models for parametrized PDEs: Handling geometric variability through graph neural networks. Chaos 2023, 33, 123121.
  26. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. arXiv 2015, arXiv:1512.03385.
  27. Venkatachalam, S.; Nayak, S.G.; Labde, J.V.; Gharal, P.R.; Rao, K.; Kelkar, A.K. Degradation and recyclability of poly(ethylene terephthalate). In Polyester; Saleh, H.E.-D.M., Ed.; IntechOpen: Rijeka, Croatia, 2012; pp. 75–98.
Figure 1. From left to right: a PET preform before blow molding; a properly molded bottle produced under uniform heating conditions; and two defective bottles exhibiting wall thickness variations, uneven deformation, and structural collapse caused by nonuniform temperature distribution during preheating.
Figure 2. Overview of the proposed framework. (1) The geometry-aware input formulation defines temperature as a conditional function $T = f(x, g, s, r)$ of spatial coordinate, geometric descriptors, slab configuration, and region indicator. (2) The domain-bounded training strategy is illustrated in the weight–body-length feature space, where black dots mark individual preform designs, the shaded polygon denotes the convex hull spanned by the extreme (minimum and maximum) feasible geometries, and the three preform icons in the legend indicate the minimum, maximum, and an example unseen intermediate preform. Any target preform whose features fall inside the shaded region is guaranteed to lie within the convex hull of the training data. (3) A unified predictor network with Min–Max scaled inputs and residual connections maps the combined feature vector (spatial coordinate $x$, geometric descriptors $g$, slab position $s$, region indicator $r$) to a scalar temperature output. (4) The trained model generalizes to simultaneous temperature prediction across multiple unseen preform geometries.
Figure 3. Microwave heating system for PET preforms. From left to right: a PET preform positioned for heating; the microwave cavity with stacked quartz slab aperture arrays arranged symmetrically on either side of the preform, with adjustable positions along the horizontal axis (blue arrows); a schematic cross-section showing the array-based control zones numbered along the z-axis, where each slab position can be independently adjusted to shape the electromagnetic field distribution; and the resulting thermal image of a heated preform, showing the temperature distribution along its surface, where warmer colors (yellow/white) indicate higher temperatures and cooler colors (red/dark) indicate lower temperatures.
Figure 4. Structural regions of a PET preform used as the basis for the region indicator r in the proposed formulation. The preform is partitioned into three sections—neck, body, and dome—each characterized by distinct wall thickness, curvature, and electromagnetic field behavior during microwave heating. The thread region (mouth-piece) is excluded from the prediction domain, as it is not subjected to deformation during blow molding and remains at ambient temperature during heating.
Figure 5. Neural network architecture used for temperature prediction. The input layer combines slab position parameters, spatial coordinate, geometric features, and region indicator. The network consists of five fully connected layers with widths 128, 128, 64, 32, and 1, using ReLU activations in all hidden layers and a linear output. Batch normalization is applied after the first two hidden layers, and dropout (rate 0.2) is applied after the second and third hidden layers as a regularization measure. Residual skip connections (Add blocks) are inserted between intermediate layers to enhance feature propagation and training stability. Training uses the Adam optimizer with an initial learning rate of $10^{-3}$, a batch size of 256, and a maximum of 500 epochs, with early stopping (patience 50, best-weight restoration) and learning-rate reduction on plateau (factor 0.5, patience 10, minimum $10^{-5}$). Fifteen percent of the training data is held out as a validation set during fitting.
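How the two training controls in this caption interact can be sketched in plain Python. This is a simplified simulation under stated assumptions, not the authors' training code and not the exact logic of any framework callback: the function `simulate_schedule` and the loss sequence are hypothetical, and only the hyperparameter values (initial rate $10^{-3}$, factor 0.5, patience 10, floor $10^{-5}$, early-stopping patience 50, at most 500 epochs) come from the caption.

```python
def simulate_schedule(val_losses, lr=1e-3, factor=0.5, lr_patience=10,
                      min_lr=1e-5, stop_patience=50, max_epochs=500):
    """Return (best_epoch, lr_history) for a sequence of validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    since_best = 0      # epochs without improvement, used for early stopping
    since_lr_event = 0  # epochs since the last improvement or LR reduction
    lr_history = []
    for epoch, loss in enumerate(val_losses[:max_epochs]):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            since_best = 0
            since_lr_event = 0
        else:
            since_best += 1
            since_lr_event += 1
            if since_lr_event >= lr_patience:
                lr = max(lr * factor, min_lr)  # halve, but never below the floor
                since_lr_event = 0
        lr_history.append(lr)
        if since_best >= stop_patience:
            break  # the weights from best_epoch would be restored at this point
    return best_epoch, lr_history


# Hypothetical loss curve: one good epoch, then no further improvement.
losses = [1.0] + [2.0] * 100
best_epoch, lr_history = simulate_schedule(losses)
print(best_epoch, len(lr_history), lr_history[-1])
```

With these settings the learning rate is halved several times before early stopping triggers, so a stalled run gets a few chances to recover at smaller step sizes before training ends and the best weights are kept.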
Figure 6. (a) Heat capacity vs. temperature for virgin and recycled PET. (b) Resulting temperature profiles along the preform length under different heat capacity assumptions.
Figure 7. The six PET preform geometries used in this study, arranged from left to right by increasing size and referred to throughout the manuscript as Preforms A through F (leftmost to rightmost). The preforms span a wide range of lengths and wall thicknesses, representing the full range of geometric variation encountered in industrial production for this applicator. The leftmost preform defines the lower geometric bound (shortest body, thinnest walls), while the rightmost preform defines the upper bound (longest body, reaching the maximum cavity length). The different colors of the preforms are incidental and reflect the physical color of the manufactured samples; they carry no scientific meaning.
Figure 8. Pearson correlation coefficients between all input features and temperature. Only the spatial coordinate (≈0.45) and slider_17 (≈0.12) exhibit non-trivial values; all remaining features fall below 0.05. As discussed in the text, this aggregated view obscures the true spatial structure of the slab–temperature relationship.
Figure 9. Pearson correlation between each slab position parameter and temperature at each of the 32 coordinate points along the preform surface (1 = neck, 32 = dome). The horizontal slabs exhibit a diagonal pattern reflecting their spatially local influence, while slider_17 shows strong positive correlation across the body and strong negative correlation in the dome.
Figure 10. Pearson correlation between grouped slab position parameters and mean temperature in each structural region. Spatial grouping amplifies the correlation of neighboring slabs with their corresponding region temperature, while averaging all 16 horizontal slabs together yields near-zero correlation, demonstrating that the individual slab effects cancel under global aggregation.
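The cancellation effect described in this caption can be reproduced on a toy example. The data below are synthetic and purely illustrative: two hypothetical slabs `a` and `b` are assumed to influence a region temperature with opposite signs, which is one possible mechanism for such cancellation and is not taken from the paper's dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Full-factorial toy design: both slab positions take the values 0..4.
a = [float(i) for i in range(5) for _ in range(5)]
b = [float(j) for _ in range(5) for j in range(5)]

# Assumed toy response: slab a raises the temperature, slab b lowers it.
temperature = [ai - bi for ai, bi in zip(a, b)]

# Aggregating the two slabs into one averaged feature hides both effects.
averaged = [(ai + bi) / 2 for ai, bi in zip(a, b)]

print(round(pearson(a, temperature), 3))
print(round(pearson(b, temperature), 3))
print(round(pearson(averaged, temperature), 3))
```

Each slab on its own correlates clearly with the temperature, while their average does not, which is why a low aggregate correlation should not be read as evidence that the slabs are unimportant.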
Figure 11. Permutation feature importance by group, evaluated on the full multi-preform dataset. Bars show the mean drop in $R^2$ from the baseline of 0.992 when each group is shuffled, with error bars indicating ±1 standard deviation over 20 random permutations. The spatial coordinate dominates all other inputs, while slider_17 alone is more important than all 16 horizontal slabs combined.
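A minimal sketch of grouped permutation importance, as summarized in this caption, is given below. It is an illustration under assumptions: the linear function `toy_predict` stands in for the trained network, the data are random, and the group names and column indices are hypothetical. Only the procedure (shuffle one group, measure the drop in $R^2$, repeat 20 times, report mean and standard deviation) follows the caption.

```python
import random
import statistics


def r2_score(y_true, y_pred):
    """Coefficient of determination."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def permutation_importance(predict, X, y, groups, n_repeats=20, seed=0):
    """Mean and std of the R^2 drop when each feature group is shuffled."""
    rng = random.Random(seed)
    baseline = r2_score(y, [predict(row) for row in X])
    result = {}
    for name, cols in groups.items():
        drops = []
        for _ in range(n_repeats):
            perm = list(range(len(X)))
            rng.shuffle(perm)
            shuffled = []
            for i, row in enumerate(X):
                new_row = list(row)
                for c in cols:  # all columns of a group move together
                    new_row[c] = X[perm[i]][c]
                shuffled.append(new_row)
            score = r2_score(y, [predict(row) for row in shuffled])
            drops.append(baseline - score)
        result[name] = (statistics.mean(drops), statistics.stdev(drops))
    return baseline, result


def toy_predict(row):
    """Stand-in model: strong coordinate effect, weak slab effect."""
    return 3.0 * row[0] + 0.5 * row[1]


data_rng = random.Random(42)
X = [[data_rng.random() for _ in range(3)] for _ in range(200)]
y = [toy_predict(row) for row in X]
groups = {"coordinate": [0], "slab": [1], "unused": [2]}

baseline, importance = permutation_importance(toy_predict, X, y, groups)
for name, (mean_drop, std_drop) in importance.items():
    print(f"{name}: {mean_drop:.3f} +/- {std_drop:.3f}")
```

Shuffling a whole group with one shared permutation keeps the within-group relationships intact, so the measured drop reflects the group's joint contribution.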
Figure 12. Permutation feature importance for the six individual geometric descriptors, evaluated on the full multi-preform dataset. Spatial/structural features are shown in orange and wall thickness features in blue. H-Body alone exceeds the combined importance of all three wall thickness features, confirming that spatial/structural parameters dominate the model’s geometric sensitivity.
Figure 13. Predicted versus simulated temperature profiles along the surface of the unseen Preform C for six representative slab configurations. Each subplot corresponds to a different heating condition defined by the dielectric slab positions. The solid blue line represents the simulated temperature obtained from Ansys HFSS, while the dashed orange line shows the prediction of the geometry-aware model. The profiles exhibit varying temperature ranges and spatial patterns depending on the slab configuration, yet the model captures the overall shape and magnitude of the thermal field across all cases. Minor deviations are observed primarily in the dome region (right end of each profile) and at localized temperature peaks, consistent with the higher physical complexity of these zones.
Figure 14. Three additional PET preforms used for independent validation, arranged from left to right and referred to throughout the manuscript as Preforms G, H, and I (leftmost to rightmost). All geometric parameters fall within the convex hull of the training data, confirming that their prediction represents an interpolation task. The different colors of the preforms are incidental and reflect the physical color of the manufactured samples; they carry no scientific meaning.
Figure 15. Predicted versus simulated temperature profiles for the three validation preforms (G, H, I) across three representative slab configurations each. Each row corresponds to one validation preform, and each column to a different heating configuration. The solid blue line represents the simulated temperature from Ansys HFSS, and the dashed orange line shows the geometry-aware model prediction. All nine cases demonstrate close agreement between simulation and prediction, confirming that the domain-bounded training strategy enables reliable interpolation across diverse unseen geometries and heating conditions.
Figure 16. Predicted versus simulated temperature profiles for the four worst-case samples of Preform B when held out from training. The predicted curve consistently sits above the simulated curve across the full axial range, indicating a near-uniform offset rather than a localized structural mismatch. Shaded background regions indicate the approximate neck (blue), body (green), and dome (orange) zones.
Figure 17. Predicted versus simulated temperature profiles for the four worst-case samples of Preform F when held out from training. The prediction error is concentrated in the upper axial region, where the model predicts a pronounced central peak absent from the simulated profile, while prediction and simulation agree more closely near the dome. Shaded background regions indicate the approximate neck (blue), body (green), and dome (orange) zones.
Table 1. Heat capacity and temperature array definitions for each dataset category used in Case Study 1.
| Heat Capacity Category | Heat Capacity Array [J/(kg·°C)] | Temperature Array [°C] | Dataset Size |
|---|---|---|---|
| Low Cp | [1000, 1050, 1100, 1350, 1450] | [80, 100, 120, 150, 250] | 550 |
| Mid Cp | [1100, 1150, 1200, 1500, 1600] | [80, 100, 120, 150, 250] | 450 |
| High Cp | [1250, 1300, 1650, 1750, 1800] | [80, 100, 120, 150, 250] | 450 |
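The paired arrays in Table 1 define each heat capacity curve at five temperatures. The sketch below evaluates such a curve between the tabulated points. Linear interpolation and clamping outside the 80 to 250 °C range are assumptions made for illustration; the table itself does not state how the simulation treats intermediate temperatures, and the function name `cp_at` is hypothetical.

```python
from bisect import bisect_right

TEMPS = [80, 100, 120, 150, 250]  # °C, shared by all three categories
CP = {  # J/(kg·°C), values from Table 1
    "low": [1000, 1050, 1100, 1350, 1450],
    "mid": [1100, 1150, 1200, 1500, 1600],
    "high": [1250, 1300, 1650, 1750, 1800],
}


def cp_at(category: str, temp_c: float) -> float:
    """Heat capacity at temp_c, assuming piecewise-linear interpolation."""
    cps = CP[category]
    if temp_c <= TEMPS[0]:
        return float(cps[0])   # assumed: hold the first value below the range
    if temp_c >= TEMPS[-1]:
        return float(cps[-1])  # assumed: hold the last value above the range
    i = bisect_right(TEMPS, temp_c) - 1  # index of the left bracketing point
    t0, t1 = TEMPS[i], TEMPS[i + 1]
    weight = (temp_c - t0) / (t1 - t0)
    return cps[i] + weight * (cps[i + 1] - cps[i])


print(cp_at("low", 110))   # midway between 1050 and 1100
print(cp_at("high", 135))  # midway between 1650 and 1750
```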
Table 2. Geometric features of the six preforms (in mm). All features are subsequently normalized to the range $[0, 1]$ using min–max scaling before being passed to the network.

| Preform | WT-Neck | WT-Body | WT-Dome | H-Body | H-Neck | R-Dome |
|---|---|---|---|---|---|---|
| A | 0.75 | 1.75 | 1.65 | 57.42 | 10.62 | 16.83 |
| B | 1.27 | 3.19 | 2.00 | 51.98 | 7.92 | 15.77 |
| C | 1.40 | 2.34 | 2.00 | 67.40 | 16.00 | 19.00 |
| D | 1.93 | 3.36 | 2.49 | 69.67 | 17.00 | 20.47 |
| E | 1.93 | 3.16 | 2.77 | 70.00 | 14.50 | 22.20 |
| F | 1.98 | 3.10 | 2.33 | 76.00 | 11.00 | 23.73 |
Table 3. Prediction performance of the proposed model evaluated by systematically holding out each preform for testing. The gray-shaded row marks Preform C, the only test case whose normalized geometric features lie entirely within the interior of the feature range spanned by the remaining preforms (a pure interpolation case). Bold values indicate the best test-set performance across all hold-out experiments.
| Test Preform | Test MSE | Test $R^2$ | Test MAE | Train MSE | Train $R^2$ | Train MAE |
|---|---|---|---|---|---|---|
| A | 0.00552 | 0.880 | 0.0580 | 0.00028 | 0.9922 | 0.0132 |
| B | 0.00963 | 0.7733 | 0.0808 | 0.00032 | 0.9917 | 0.0143 |
| C | **0.00055** | **0.9865** | **0.0183** | 0.00021 | 0.9948 | 0.0114 |
| D | 0.00437 | 0.8646 | 0.0486 | 0.00021 | 0.9948 | 0.0111 |
| E | 0.00509 | 0.8607 | 0.0528 | 0.00021 | 0.9947 | 0.0116 |
| F | 0.00622 | 0.7498 | 0.0660 | 0.00025 | 0.9943 | 0.0119 |
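The statement that Preform C is the only pure interpolation case can be checked directly against the geometric features in Table 2. The sketch below holds out each preform in turn and tests whether every feature lies strictly between the minimum and maximum of the remaining five. This per-feature range test is a simple stand-in chosen for illustration (the function name `is_interior` is hypothetical); it is not necessarily the exact criterion used by the authors.

```python
FEATURES = ["WT-Neck", "WT-Body", "WT-Dome", "H-Body", "H-Neck", "R-Dome"]

PREFORMS = {  # values in mm, from Table 2
    "A": [0.75, 1.75, 1.65, 57.42, 10.62, 16.83],
    "B": [1.27, 3.19, 2.00, 51.98, 7.92, 15.77],
    "C": [1.40, 2.34, 2.00, 67.40, 16.00, 19.00],
    "D": [1.93, 3.36, 2.49, 69.67, 17.00, 20.47],
    "E": [1.93, 3.16, 2.77, 70.00, 14.50, 22.20],
    "F": [1.98, 3.10, 2.33, 76.00, 11.00, 23.73],
}


def is_interior(name: str) -> bool:
    """True if every feature of `name` lies strictly inside the others' range."""
    target = PREFORMS[name]
    others = [row for key, row in PREFORMS.items() if key != name]
    for j, value in enumerate(target):
        column = [row[j] for row in others]
        if not (min(column) < value < max(column)):
            return False
    return True


interior = [name for name in PREFORMS if is_interior(name)]
print(interior)  # ['C']
```

Every other preform sets the minimum or maximum of at least one feature, so holding it out forces the model to extrapolate along that feature, which is consistent with the lower test-set $R^2$ values in those rows.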
Table 4. Geometric features of the three validation preforms (G–I). All values lie within the minimum and maximum bounds defined by the training preforms (A, B, D, E, F).
| Preform | WT-Neck | WT-Body | WT-Dome | H-Body | H-Neck | R-Dome |
|---|---|---|---|---|---|---|
| G | 1.45 | 2.53 | 2.00 | 65.40 | 16.00 | 17.67 |
| H | 1.72 | 2.74 | 1.92 | 68.32 | 16.80 | 16.12 |
| I | 1.35 | 2.26 | 2.24 | 72.43 | 9.62 | 19.91 |
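The bound condition stated in the caption of Table 4 can be verified by fitting min–max scaling on the five training preforms and applying it to G, H, and I: a validation preform stays inside the training bounds exactly when all of its scaled features fall in $[0, 1]$. The sketch below is illustrative and the helper name `scale` is hypothetical. Note that this per-feature test is only a necessary condition for lying inside the convex hull of the training data, not a sufficient one.

```python
TRAIN = {  # training preforms, values in mm from Table 2
    "A": [0.75, 1.75, 1.65, 57.42, 10.62, 16.83],
    "B": [1.27, 3.19, 2.00, 51.98, 7.92, 15.77],
    "D": [1.93, 3.36, 2.49, 69.67, 17.00, 20.47],
    "E": [1.93, 3.16, 2.77, 70.00, 14.50, 22.20],
    "F": [1.98, 3.10, 2.33, 76.00, 11.00, 23.73],
}
VALID = {  # validation preforms, values in mm from Table 4
    "G": [1.45, 2.53, 2.00, 65.40, 16.00, 17.67],
    "H": [1.72, 2.74, 1.92, 68.32, 16.80, 16.12],
    "I": [1.35, 2.26, 2.24, 72.43, 9.62, 19.91],
}

# Fit the scaling on the training preforms only.
columns = list(zip(*TRAIN.values()))
lower = [min(col) for col in columns]
upper = [max(col) for col in columns]


def scale(row):
    """Min-max scale one feature row using the training bounds."""
    return [(v - lo) / (hi - lo) for v, lo, hi in zip(row, lower, upper)]


within_bounds = {}
for name, row in VALID.items():
    scaled = scale(row)
    within_bounds[name] = all(0.0 <= s <= 1.0 for s in scaled)
    print(name, [round(s, 3) for s in scaled], within_bounds[name])
```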
Table 5. Prediction performance on three additional unseen preforms whose geometric features lie within the convex hull of the training data.
| Validation Preform | MSE | $R^2$ | MAE |
|---|---|---|---|
| G | 0.00054 | 0.9882 | 0.0178 |
| H | 0.00056 | 0.9832 | 0.0187 |
| I | 0.00051 | 0.9901 | 0.0143 |
Share and Cite

MDPI and ACS Style

Alsheikh, A.; Fischer, A. Geometry-Aware Neural Network for Generalized Temperature Prediction in Microwave Heating of PET Preforms. J. Manuf. Mater. Process. 2026, 10, 138. https://doi.org/10.3390/jmmp10040138