Abstract
Computational modeling of large thermoplastic deformation of plastic solids is critical for industrial applications such as non-invasive assessment of engineering components. While deep learning-based methods have emerged as promising alternatives to traditional numerical simulations, they often suffer from systematic errors caused by geometric mismatches between predicted and ground truth meshes. To overcome this limitation, we propose a novel boundary geometry-constrained neural framework that establishes direct point-wise mappings between spatial coordinates and full-field physical quantities within the deformed domain. The key contributions of this work are as follows: (1) a two-stage strategy that separates geometric prediction from physics-field resolution by constructing direct, point-wise mappings between coordinates and physical quantities, inherently avoiding errors from mesh misalignment; (2) a boundary-condition-aware encoding mechanism that ensures physical consistency under complex loading conditions; and (3) a fully mesh-free approach that operates on point clouds without structured discretization. Experimental results demonstrate that our method achieves a 36–98% improvement in prediction accuracy over deep learning baselines, offering an efficient alternative for high-fidelity simulation of large thermoplastic deformations.
1. Introduction
Computational modeling of large deformations of thermoplastic solids is essential in continuum mechanics, which is widely applied in various areas such as aerospace [], civil engineering [], and metal forming manufacturing []. Large deformations of solids arise when externally applied loads and contact constraints induce deformations sufficiently significant to violate the fundamental assumptions of infinitesimal strain theory []. The distribution of physical fields reveals critical material behaviors during deformation. For instance, localized stress concentrations develop at geometric discontinuities (e.g., holes, fillets, or corners), promoting void nucleation and initiating microcrack propagation []. These damage mechanisms progressively degrade mechanical properties, including strength and ductility, which significantly reduce the fatigue life of engineered components. While these subsurface defects critically impact product performance, their quantification traditionally requires destructive testing methods, such as cross-sectioning and metallographic etching, which irreversibly sacrifice production components []. Computational modeling for full-field physics distribution of large thermoplastic deformation thus emerges as an urgent technological imperative for non-invasive assessment of engineering components.
The large thermoplastic deformations of plastic solids are governed by nonlinear partial differential equations (PDEs) []. Traditional numerical methods, such as Finite Element Methods (FEMs), approximate solutions through incremental discretization, which subdivides the solution domain into elements via meshing techniques []. While standard FEM employs local linear approximations, the specialized rigid-plastic FEM (RPFEM) variant simplifies analysis by neglecting elastic recovery, which is particularly advantageous for situations where plastic strains dominate []. However, these approaches face an inherent accuracy-efficiency trade-off: refined meshes improve solution fidelity but exponentially increase computational load []. This complexity arises from the tens of thousands of degrees of freedom introduced during discretization, and it is exacerbated when modeling irregular geometries that require non-uniform elements [].
Recent advances in deep learning have enabled novel approaches for predicting physical field distributions in large deformation problems, offering transformative potential for computational mechanics applications [,,,,]. Lee et al. [] utilized Convolutional Neural Networks (CNNs) which interpolate FEM data onto fixed-size grid regions and convert them into structured image representations to reconstruct the geometry and strain field after deformation. Similarly, Park et al. [] and Kim et al. [] further develop CNN-based models to reconstruct grain size distributions for deformed metal workpieces. However, this mesh-to-pixel transformation, which converts continuous values of the physical field quantities into discrete representations (e.g., 0–255 grayscale), introduces systematic quantization errors, leading to the loss of high-frequency nodal information and fundamentally limiting prediction accuracy []. Lin et al. [] developed a 1D CNN-based sequential modeling approach for predicting surface temperature distributions in thermoplastic deformation processes. Uribe et al. [] introduced a point cloud-based neural network for predicting physical fields of large deformation. Petrik et al. [] proposed an autoencoder-based neural network architecture for reconstructing physical fields, including geometric deformation, temperature distribution, and stress evolution, in large deformation processes.
While these models have been successfully applied to the computation of physical fields in large deformations, their accuracy is fundamentally limited by the systematic neglect of mesh misalignment during optimization. The core issue lies in the misalignment between the predicted node positions and the ground truth node positions. Deep learning-based models typically establish a parameterized mapping to learn the transformation from the features of the initial mesh nodes to those of the deformed mesh nodes. The mapping is optimized by minimizing a global difference between the overall predicted features and the ground truth configurations over all time steps (Equation (1)).
However, there is inevitably some deviation in the predicted displacement of the mesh nodes. As visually emphasized in Figure 1, the predicted mesh (red, dashed lines) deviates from the ground truth mesh (black, solid lines). When the error between the predicted and ground truth node positions is non-negligible, the optimization strategy based on Equation (1) becomes invalid.
Figure 1.
Schematic of mesh mismatch and invalid optimization mode.
This critical oversight propagates physical field prediction errors to geometrically mismatched mesh points, where nodal position deviations corrupt gradient computations and compound solution inaccuracies.
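The effect of mesh misalignment on a node-wise loss can be illustrated with a one-dimensional toy problem. The sketch below is not the model discussed in this paper: the field `true_field`, the node positions, and the 0.1 offset are hypothetical values chosen only for illustration. It shows that even a perfect field, when evaluated at slightly shifted nodes and compared index by index with labels stored at the ground truth nodes, is charged a spurious error.

```python
def true_field(x):
    """Hypothetical smooth physical field, used only for illustration."""
    return x * x


def mean_abs_error(a, b):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


# Ground truth deformed node positions and a prediction with a small offset.
x_true = [0.0, 0.5, 1.0, 1.5, 2.0]
x_pred = [x + 0.1 for x in x_true]

# Labels live at the ground truth nodes; the "prediction" is the exact field,
# but evaluated at the (misaligned) predicted nodes.
u_label = [true_field(x) for x in x_true]
u_at_pred_nodes = [true_field(x) for x in x_pred]

# Index-by-index comparison penalizes the exact field for the geometric offset.
nodewise_error = mean_abs_error(u_at_pred_nodes, u_label)

# Querying the same field at the label coordinates removes the spurious error.
pointwise_error = mean_abs_error([true_field(x) for x in x_true], u_label)
```

In this toy setting the node-wise error is purely geometric in origin, which is the situation a coordinate-to-field mapping is meant to avoid.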
To overcome these limitations, we propose a novel point-wise full-field physics neural mapping framework for large deformation. The core of our approach formulates the physical field prediction as a two-stage computational process. First, a graph neural network constructs the mapping from initial to deformed boundary configurations under prescribed mechanical constraints, establishing the deformed geometry. Subsequently, a residual network architecture establishes direct coordinate-to-physics mappings within the deformed domain. This point-wise learning strategy inherently eliminates errors induced by mesh misalignment. Furthermore, we introduce a displacement-fused encoding layer that enhances field prediction accuracy by integrating nodal displacement information via a guiding branch. The proposed method is rigorously validated on a comprehensive metal forming dataset covering diverse deformation scenarios, demonstrating superior performance against existing deep learning baselines. The main contributions of this work are summarized as follows:
- An end-to-end deep learning framework for accurate full-field physics prediction is proposed for large deformation problems.
- A novel two-stage neural architecture that structures the large deformation computation problem into geometric prediction and physical resolution phases, inherently circumventing mesh-mismatch errors by establishing direct mappings from spatial coordinates to physical field values.
- A displacement-aware encoding mechanism that incorporates nodal displacement constraints through a guided learning branch is introduced, significantly enhancing prediction accuracy.
- Experimental results on an industrial-scale metal forming dataset show our method achieves 36–98% reduction in mean absolute error compared to existing deep learning baselines.
2. Methodology
In this section, we first introduce the numerical formulation of large thermoplastic deformation in Section 2.1. We then introduce the problem formulation of parameterized learning methods in Section 2.2. In Section 2.3, we introduce the proposed point-wise full-field physics neural mapping framework.
2.1. Numerical Formulation of Large Thermoplastic Deformations
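The governing equation is derived from the linear momentum balance in the reference configuration []

$$\nabla \cdot (\mathbf{F}\mathbf{S}) + \mathbf{b} = \mathbf{0}, \qquad (2)$$

where $\nabla \cdot$ is the divergence operator in the reference configuration, $\mathbf{F}$ is the deformation gradient, $\mathbf{S}$ is the second Piola–Kirchhoff stress tensor, and $\mathbf{b}$ is the body force per unit reference volume. The weak form is obtained by multiplying by a test function $\delta\mathbf{u}$ and integrating over the reference domain $\Omega_0$

$$\int_{\Omega_0} \mathbf{S} : \delta\mathbf{E}\,\mathrm{d}V = \int_{\Omega_0} \mathbf{b} \cdot \delta\mathbf{u}\,\mathrm{d}V + \int_{\Gamma_t} \bar{\mathbf{t}} \cdot \delta\mathbf{u}\,\mathrm{d}A, \qquad (3)$$

where $\delta\mathbf{E}$ is the variation of the Green–Lagrange strain tensor with $\mathbf{E} = \tfrac{1}{2}(\mathbf{F}^{\mathsf{T}}\mathbf{F} - \mathbf{I})$, and $\bar{\mathbf{t}}$ is the surface traction per unit reference area. The nonlinear finite element discretization of Equation (3) leads to the following incremental equilibrium equation:

$$\mathbf{R}(\mathbf{u}) = \mathbf{F}_{\mathrm{int}}(\mathbf{u}) - \mathbf{F}_{\mathrm{ext}} = \mathbf{0}, \qquad (4)$$

where $\mathbf{R}$ is the residual force vector, $\mathbf{F}_{\mathrm{int}}$ is the internal force vector, and $\mathbf{F}_{\mathrm{ext}}$ is the external force vector. The internal force vector is computed as

$$\mathbf{F}_{\mathrm{int}} = \int_{\Omega_0} \mathbf{B}^{\mathsf{T}} \mathbf{P}\,\mathrm{d}V, \qquad (5)$$

where $\mathbf{P}$ is the first Piola–Kirchhoff stress tensor, $\mathbf{P} = \mathbf{F}\mathbf{S}$, and $\mathbf{B}$ is the strain–displacement matrix, whose axisymmetric components can be expressed as

$$\mathbf{B} = [\mathbf{B}_1, \ldots, \mathbf{B}_L], \qquad \mathbf{B}_i = \begin{bmatrix} \partial N_i / \partial r & 0 \\ 0 & \partial N_i / \partial z \\ N_i / r & 0 \\ \partial N_i / \partial z & \partial N_i / \partial r \end{bmatrix}, \qquad (6)$$

where $N_i$ denotes the shape function of the i-th node, and L denotes the number of nodes.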
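The temperature evolution is governed by the heat conduction equation []

$$\rho c \frac{\partial T}{\partial t} = \nabla \cdot (k \nabla T) + \dot{q}, \qquad (7)$$

where T denotes temperature, $\rho$ (kg·m$^{-3}$) denotes the density, c (J·kg$^{-1}$·°C$^{-1}$) denotes the specific heat, k (W·m$^{-1}$·°C$^{-1}$) denotes the thermal conductivity, and $\dot{q}$ (W·m$^{-3}$) denotes the heat source term representing the thermal energy generated by plastic deformation work. $\dot{q}$ is determined by []

$$\dot{q} = \eta\,\bar{\sigma}\,\dot{\bar{\varepsilon}}, \qquad (8)$$

where $\bar{\sigma}$ (MPa) denotes the effective flow stress, $\dot{\bar{\varepsilon}}$ (s$^{-1}$) denotes the effective strain rate, and $\eta$ denotes the efficiency of conversion of deformation energy to heat. Generally, the value of $\eta$ is 0.9 []. The corresponding weak form is

$$\int_{\Omega} \rho c \frac{\partial T}{\partial t}\,\delta T\,\mathrm{d}\Omega + \int_{\Omega} k \nabla T \cdot \nabla \delta T\,\mathrm{d}\Omega = \int_{\Omega} \dot{q}\,\delta T\,\mathrm{d}\Omega + \int_{\Gamma_q} \bar{q}\,\delta T\,\mathrm{d}\Gamma, \qquad (9)$$

where $\delta T$ is the test function, $\bar{q}$ denotes the boundary heat flux, and $\Gamma_q$ denotes the heat flux boundary.
The finite element discretization yields the system

$$\mathbf{C}\dot{\mathbf{T}} + \mathbf{K}\mathbf{T} = \mathbf{Q}, \qquad (10)$$

with the matrices defined as

$$\mathbf{C} = \int_{\Omega} \rho c\,\mathbf{N}^{\mathsf{T}}\mathbf{N}\,\mathrm{d}\Omega, \qquad (11)$$

$$\mathbf{K} = \int_{\Omega} k\,(\nabla\mathbf{N})^{\mathsf{T}}\,\nabla\mathbf{N}\,\mathrm{d}\Omega, \qquad (12)$$

$$\mathbf{Q} = \int_{\Omega} \mathbf{N}^{\mathsf{T}}\dot{q}\,\mathrm{d}\Omega + \int_{\Gamma_q} \mathbf{N}^{\mathsf{T}}\bar{q}\,\mathrm{d}\Gamma, \qquad (13)$$

where $\mathbf{N}$ contains the element shape functions and $\mathrm{d}\Omega = 2\pi r\,\mathrm{d}r\,\mathrm{d}z$ for 2D axisymmetric problems.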
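As a small worked example of the heat source term in Section 2.1, the sketch below assumes the commonly used product form, in which the volumetric heat generation rate equals the conversion efficiency times the effective flow stress times the effective strain rate. The function name and the sample inputs (100 MPa, 0.5 s$^{-1}$) are hypothetical; only the efficiency value of 0.9 comes from the text. Note the unit conversion: 1 MPa·s$^{-1}$ corresponds to 10$^{6}$ W·m$^{-3}$.

```python
def plastic_heat_rate(flow_stress_mpa, strain_rate_per_s, efficiency=0.9):
    """Volumetric heat generation rate in W/m^3 (assumed product form).

    flow_stress_mpa:    effective flow stress in MPa
    strain_rate_per_s:  effective strain rate in 1/s
    efficiency:         fraction of deformation work converted to heat
    """
    pa_per_mpa = 1.0e6  # 1 MPa = 1e6 Pa, and 1 Pa/s = 1 W/m^3
    return efficiency * flow_stress_mpa * pa_per_mpa * strain_rate_per_s


# Hypothetical operating point: 100 MPa flow stress at a strain rate of 0.5 1/s.
q_dot = plastic_heat_rate(100.0, 0.5)
```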
2.2. Problem Formulation
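Although FEM provides accurate solutions to the equations in Section 2.1 through iterative numerical schemes, it is hampered by high computational cost and lacks a flexible input–output mechanism. The rapid advancement of data-driven approaches, particularly deep learning techniques, has enabled the successful implementation of neural networks that directly model the mapping from boundary conditions to physical fields using finite element data. Let $\mathcal{X}_0 = \{\mathbf{x}_0, \mathbf{s}_0, \mathbf{c}\}$ denote the initial state, where $\mathbf{x}_0$ denotes the initial coordinates, $\mathbf{s}_0$ the initial states, and $\mathbf{c}$ the boundary condition of the observation points. Let $\mathcal{X}_1 = \{\mathbf{x}_1, \mathbf{u}_1\}$ denote the state after deformation, where $\mathbf{x}_1$ denotes the coordinates and $\mathbf{u}_1$ the corresponding physical quantities after deformation. The model aims to learn a $\theta$-parameterized mapping

$$f_{\theta} : \mathcal{X}_0 \mapsto \mathcal{X}_1 \qquad (14)$$

by

$$\theta^{*} = \arg\min_{\theta} \mathcal{L}(\theta), \qquad (15)$$

where $\mathcal{L}$ denotes the loss function formulated by

$$\mathcal{L}(\theta) = \frac{1}{|\mathcal{D}|} \sum_{i=1}^{|\mathcal{D}|} \left\| f_{\theta}\big(\mathcal{X}_0^{(i)}\big) - \mathcal{X}_1^{(i)} \right\|_2^2, \qquad (16)$$

where $\mathcal{D}$ denotes the dataset, and $\mathcal{X}_0^{(i)}$ and $\mathcal{X}_1^{(i)}$ denote the initial state vector and the state vector after deformation of the i-th sample, respectively. The validity of Equation (16) is contingent upon strict equivalence between predicted and ground truth coordinates, i.e., $\hat{\mathbf{x}}_1 = \mathbf{x}_1$, where $\hat{\mathbf{x}}_1$ denotes the predicted coordinates. However, existing methods overlook this critical constraint, leading to erroneous propagation of prediction errors during model training and fundamentally limiting accuracy.
To circumvent prediction errors induced by spatial coordinate mismatches, we propose a novel paradigm that directly models the mapping between coordinates and physical quantities within the deformed fields. Let $\mathbf{q}$ denote the coordinates of the query points within the deformed domain $\Omega$. The mapping is formulated by

$$g_{\phi} : (\mathbf{q}, \mathbf{s}_0, \mathbf{c}) \mapsto \mathbf{u}_q, \qquad (17)$$

where $\mathbf{u}_q$ denotes the corresponding physical quantities of $\mathbf{q}$. The learning problem is formulated as

$$\phi^{*} = \arg\min_{\phi} \frac{1}{|\mathcal{D}|} \sum_{i=1}^{|\mathcal{D}|} \left\| g_{\phi}\big(\mathbf{q}^{(i)}, \mathbf{s}_0^{(i)}, \mathbf{c}^{(i)}\big) - \mathbf{u}_q^{(i)} \right\|_2^2. \qquad (18)$$

Note that during the training stage, $\mathbf{q}$ serves as predefined point coordinates for supervised learning. However, during inference, the model must first predict $\mathbf{q}$. $\mathbf{q}$ is determined by $\Omega$, which is defined by the boundary points. Let $\mathbf{x}_0^{b}$ and $\mathbf{x}_1^{b}$ denote the boundary point coordinates before and after deformation. We can learn a mapping

$$h_{\psi} : (\mathbf{x}_0^{b}, \mathbf{s}_0, \mathbf{c}) \mapsto \mathbf{x}_1^{b} \qquad (19)$$

to determine $\Omega$. The learning problem of $h_{\psi}$ can be formulated as

$$\psi^{*} = \arg\min_{\psi} \frac{1}{|\mathcal{D}|} \sum_{i=1}^{|\mathcal{D}|} \left\| h_{\psi}\big(\mathbf{x}_0^{b,(i)}, \mathbf{s}_0^{(i)}, \mathbf{c}^{(i)}\big) - \mathbf{x}_1^{b,(i)} \right\|_2^2. \qquad (20)$$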
2.3. The Proposed Method
The proposed framework is shown in Figure 2. The architecture integrates two functionally specialized modules: a boundary geometry prediction module and a physical field prediction module. First, the boundary geometry prediction module processes boundary coordinates, initial conditions, and boundary constraints to predict the deformed boundary point set. Then, this predicted geometry is discretized to produce sampling points. Finally, the sampled coordinates, together with the initial state and boundary condition, are fed into the physical field prediction module to compute the corresponding field distribution. The details are as follows.
Figure 2.
The proposed framework. (a) The schematic of the boundary geometry prediction module. (b) The framework of the physical field prediction module.
Boundary geometry prediction module. The boundary geometry module is designed to predict the deformed boundary node coordinates from the initial configuration under given boundary conditions. The module is implemented as a GNN operating on a graph representation of the boundary nodes. The design consists of the following key components. The graph is constructed with boundary nodes as vertices. The i-th node is associated with its initial coordinates (where for 2D problems), initial state (e.g., initial displacement or velocity), and boundary condition (e.g., prescribed displacement or traction).
The graph’s adjacency structure is constructed using Delaunay triangulation applied to the initial boundary nodal coordinates. This computational geometry approach generates a mesh of non-overlapping triangles while naturally establishing connections between nodes based on spatial proximity. For each node i, the set of neighbors is determined by natural geometric adjacency relationships established through this triangulation. The graph remains undirected, with edges formed exclusively between nodes connected by triangulation edges.
Boundary conditions are encoded directly into the node features. For Dirichlet conditions (prescribed displacements), represents the prescribed displacement vector. For Neumann conditions (tractions), it represents the traction vector. The encoding functions and are implemented as linear layers that project the input features to a hidden dimension . Specifically, the node embedding and edge embedding are calculated by
where denotes the Euclidean distance, and denotes the unit direction vector.
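The two geometric edge quantities named above, the Euclidean distance and the unit direction vector between connected nodes, can be computed as in the following minimal sketch. The function name and the direction convention (pointing from node i to node j) are assumptions made for illustration; the learned linear encoders that consume these quantities are not shown.

```python
import math


def edge_geometry(xi, xj):
    """Return (distance, unit direction) for the edge from node i to node j.

    xi, xj: coordinate tuples of equal length (2 entries for a 2D problem).
    """
    diff = [b - a for a, b in zip(xi, xj)]
    dist = math.sqrt(sum(c * c for c in diff))
    if dist == 0.0:
        # Coincident nodes have no well-defined direction.
        raise ValueError("coincident nodes have no direction")
    return dist, [c / dist for c in diff]


# Example with a 3-4-5 triangle.
dist, direction = edge_geometry((0.0, 0.0), (3.0, 4.0))
```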
The GNN performs $K$ steps of message passing (by default, $K = 5$). At each step $k$, the message from node j to i is computed as follows:
where here denotes the Sigmoid function, ensuring . This allows the network to dynamically balance node and edge information. The node feature is updated by aggregating messages from all neighbors:
where W is a linear transformation. This update rule combines neighborhood information with the node’s own state, preserving historical information. After message passing steps, the final node features are passed through a linear layer to predict the deformed coordinates
This module is trained end-to-end using mean squared error (MSE) loss between predicted and ground truth coordinates of the boundary nodes.
Discretization. This module performs domain discretization during the inference phase, implemented using the open-source library PyGmsh []. By inputting boundary point sets, it discretizes the solution domain into a specified number of discrete points.
Physical field prediction module. The physical field prediction module consists primarily of an encoding layer, a ResNet-based feature transformation layer, and a prediction layer. The encoding layer first projects input features to a target dimension via a linear layer, followed by an MLP with one hidden layer to perform feature encoding. The encoder layer follows
The encoded feature is then transformed by the ResNet layer with L residual blocks. The l-th block follows
where denotes the input of the residual block and is the input of the first layer.
Finally, the prediction of i-th query node is calculated by
where denotes the dimension of . During training, we use as the query coordinates to learn the functional mapping between coordinates and the corresponding physical quantities. The guiding branch incorporates displacement prediction loss into the network, enabling the fusion of displacement information. The loss function is formulated by
where the two loss terms are the MSE of the predicted physical quantities and the MSE of the guiding branch that predicts the nodal displacement, and the scale factor weights the guiding-branch term. We further study the effectiveness of the guiding branch in Section 4.5.
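The structure of this training objective can be sketched as follows, assuming the two terms are combined additively, with the scale factor multiplying the guiding-branch (displacement) term. The additive form, the function names, and the sample values are assumptions for illustration only.

```python
def mse(pred, target):
    """Mean squared error between two equally long sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def total_loss(field_pred, field_true, disp_pred, disp_true, scale=1.0):
    """Field MSE plus the scaled guiding-branch (displacement) MSE."""
    return mse(field_pred, field_true) + scale * mse(disp_pred, disp_true)


# Hypothetical values: field MSE = 2.0, displacement MSE = 1.0, scale = 0.5.
loss = total_loss([1.0, 2.0], [1.0, 4.0], [0.0, 0.0], [1.0, 1.0], scale=0.5)
```

Setting `scale=0.0` in this sketch removes the displacement term from the objective, which corresponds to training without the guiding branch.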
3. Experiment Setting
This section details the experimental setup. Section 3.1 describes the dataset used for method validation. Section 3.2 specifies the default settings of our proposed model, while Section 3.3 outlines the baseline models implemented for comparison. Finally, Section 3.4 presents the evaluation metrics employed to assess model performance.
3.1. Dataset
Metal thermoplastic forming represents a typical large deformation process. The workpiece undergoes geometric deformation due to the compression of the dies, accompanied by variations in the distributions of the internal physical fields. Figure 3a illustrates a typical example of cylindrical compression molding, along with the distribution of the x-direction strain component after the forming process.
Figure 3.
(a) Schematic diagram of metal thermoplastic forming and the field distribution (x-direction strain) after forming (1/4 of the 3D model). (b) Geometrical setup of the cross-section of the workpiece for FEM simulations. The cross-section is discretized into 1080 nodes (27 along the x-direction and 40 along the y-direction). The height and radius of the workpiece are 30 mm and 20 mm, respectively.
To obtain the physical field variations during the forming process, numerical simulations are performed using the DEFORM-3D software (version V11.0). Taking advantage of geometric symmetry, a 2D axisymmetric module is employed to efficiently simulate the deformation process. The geometry of the axially symmetric section of the workpiece is detailed in Figure 3b. The height and radius of the workpiece are set as 30 mm and 20 mm, respectively. The cross section is discretized into a structured grid of 1080 nodes (27 along the x-direction and 40 along the y-direction). The material behavior is modeled using the built-in constitutive model for 2014 aluminum alloy (2014Al). The initial temperature of the workpiece is . The dies are set as rigid bodies with an initial temperature . During the forming process, the bottom die remains stationary while the top die moves downward with a constant speed v. The maximum stroke is set as 15 mm. The contact friction between the mold and the workpiece is modeled as Coulomb friction, with a friction coefficient . The simulation comprises 30 time steps, i.e., the stroke of the top die between two adjacent time steps is 0.5 mm.
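The node count and step size quoted above follow from simple arithmetic, which the sketch below reproduces. It assumes uniform node spacing over the 20 mm by 30 mm cross-section, which the text does not state explicitly; the function and variable names are hypothetical.

```python
def structured_grid(radius_mm=20.0, height_mm=30.0, nx=27, ny=40):
    """Uniformly spaced (x, y) nodes over the axisymmetric cross-section.

    Uniform spacing is an assumption; only the node counts and the
    overall dimensions are taken from the simulation setup.
    """
    return [
        (radius_mm * i / (nx - 1), height_mm * j / (ny - 1))
        for j in range(ny)
        for i in range(nx)
    ]


nodes = structured_grid()

# Maximum stroke (15 mm) divided by the number of time steps (30).
stroke_per_step_mm = 15.0 / 30
```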
We establish a dataset that records the simulation results, including nodal displacement, temperature, stress with its components, and strain with its components, of all the 30 time steps. The details of the process parameters, including , are listed in Table 1. The total number of process parameters combinations is 7480, corresponding to the number of samples of the dataset. The details of the physical quantities in the dataset are shown in Table 2.
Table 1.
Details of the process parameters.
Table 2.
Details of the physical quantities.
3.2. Implementation
The dataset is split into training, validation, and test sets with a ratio of 7:1:2. The partitioning is designed to evaluate the model’s performance under extrapolation conditions, particularly for high-temperature regimes where material behavior exhibits strong nonlinearities. Specifically, the training set (70% of samples) encompasses the low-to-medium temperature range ( = 400–455 °C), where conventional plastic deformation mechanisms dominate. The validation set (10%) contains transitional temperature conditions ( = 460–465 °C) to monitor training progress and prevent overfitting. Most critically, the test set (20%) is exclusively composed of high-temperature scenarios ( = 470–480 °C) that approach the recrystallization threshold of 2014 aluminum alloy.
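The temperature-based partition described above can be expressed as a simple rule. The sketch below assumes each sample carries a single temperature value in °C that falls inside one of the three stated ranges; the function name is hypothetical, and values outside the stated ranges are rejected rather than guessed.

```python
def assign_split(temperature_c):
    """Map a sample's temperature (in degrees C) to its dataset split."""
    if 400 <= temperature_c <= 455:
        return "train"
    if 460 <= temperature_c <= 465:
        return "validation"
    if 470 <= temperature_c <= 480:
        return "test"
    raise ValueError(f"temperature {temperature_c} is outside the stated ranges")


splits = [assign_split(t) for t in (400, 455, 460, 465, 470, 480)]
```

Because the test range lies entirely above the training range, this rule yields an extrapolation test rather than a random hold-out.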
For the proposed model, we take as the nodal initial state , as the nodal boundary condition , where s denotes the displacement of the top die. By default, the hidden dimension is 128. The total number of message passing steps is 5. The number of residual blocks L is 3.
3.3. Baselines
3.4. Metrics
We evaluate the geometrical deformation prediction performance and the physical fields reconstruction performance by mean absolute error (MAE). To evaluate the physical fields reconstruction performance, we apply the Inverse Distance Weighting (IDW) algorithm [], which estimates unknown values at specific locations based on nearby points, to calculate the predictions on the test set nodal coordinates for CrystalMind and DeepForge. Given neighbors , the interpolated value at is:
where is calculated by
By default, .
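A minimal sketch of Inverse Distance Weighting in its standard form is given below: each of the k nearest known points contributes with a weight equal to the inverse of its distance raised to a power. The defaults `k=4` and `power=2.0` are placeholder assumptions and are not the settings used in this work; the function name is likewise hypothetical.

```python
import math


def idw_interpolate(query, points, values, k=4, power=2.0):
    """Estimate the value at `query` from its k nearest known points."""
    nearest = sorted(
        (math.dist(query, p), v) for p, v in zip(points, values)
    )[:k]
    # A query that coincides with a known point takes that point's value.
    if nearest[0][0] == 0.0:
        return nearest[0][1]
    weights = [1.0 / d ** power for d, _ in nearest]
    weighted_sum = sum(w * v for w, (_, v) in zip(weights, nearest))
    return weighted_sum / sum(weights)


# A query midway between two points receives the average of their values.
midpoint_value = idw_interpolate((1.0, 0.0), [(0.0, 0.0), (2.0, 0.0)], [1.0, 3.0], k=2)
```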
4. Results
This section evaluates the performance of the proposed method through comprehensive experiments. Section 4.1 assesses the boundary geometry prediction accuracy in comparison with baseline models. Section 4.2 extends the comparison to full-field reconstruction performance across all physical quantities. Further analysis includes a study of model parameter scaling in Section 4.3 and of computational efficiency in Section 4.4. Finally, Section 4.5 investigates the guiding branch's influence on prediction accuracy.
4.1. Boundary Geometry Prediction Result
The performance of the model is evaluated by the Euclidean distance between the predicted points and the ground truth boundary (denoted as d). Figure 4 shows the error map of the boundary node coordinate predictions of the proposed model on a test sample. The statistical result on the test set is shown in Figure 5. The mean values of d on the test set for our proposed model, CrystalMind, and DeepForge are 0.12 mm, 0.16 mm, and 0.19 mm, respectively. Notably, as illustrated in the figure, our model outperforms the alternatives in stability, exhibiting the smallest fluctuations (standard deviation), with CrystalMind ranking second. DeepForge, on the other hand, shows both the highest average and the most significant variation. This stability arises from our dedicated boundary geometry module, which avoids mesh-misalignment errors by focusing solely on coordinate prediction, whereas CrystalMind and DeepForge suffer from error propagation due to their end-to-end learning of coupled geometry-physics mappings. DeepForge's AutoEncoder architecture, which lacks explicit geometric inputs, further amplifies instability when reconstructing high-dimensional outputs.
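One way to compute the distance between a predicted point and the ground truth boundary is sketched below. It assumes the ground truth boundary is represented as a closed polyline through ordered boundary nodes and takes the minimum point-to-segment distance; the text does not specify the exact procedure, so this representation and the function names are assumptions.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment with endpoints a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_boundary(p, boundary):
    """Minimum distance from p to a closed polyline of ordered vertices."""
    n = len(boundary)
    return min(
        point_segment_distance(p, boundary[i], boundary[(i + 1) % n])
        for i in range(n)
    )


# Hypothetical rectangular boundary (mm) and a point 0.12 mm outside its right edge.
rectangle = [(0.0, 0.0), (20.0, 0.0), (20.0, 30.0), (0.0, 30.0)]
d = distance_to_boundary((20.12, 15.0), rectangle)
```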
Figure 4.
Error map of the boundary nodes position prediction result of the proposed model.
Figure 5.
Statistical results of d on the test set. The solid line represents the mean value of each point, the shaded area represents the standard deviation, and the dotted line represents the mean value of all the points.
4.2. Full-Field Physics Reconstruction Result
Figure 6 visualizes the full-field physics reconstruction result of our proposed model on a test sample. The result encompasses reconstruction for nine physical quantities, including temperature, strain components, and stress components, as well as their discrepancies compared to the ground truth distributions. For brevity, we present only the final time step’s prediction results. As demonstrated in the result, our approach can reliably reconstruct the field distributions for all relevant physical quantities.
Figure 6.
Visualization of the full-field physics reconstruction result of the proposed model.
The MAEs of all the nine physical quantities on the test set are illustrated in Figure 7. Our proposed model demonstrates overall lower MAE compared to the other two models, with only slightly higher values for (0.01 vs. 0.007) and (0.654 vs. 0.503) when compared to CrystalMind. DeepForge demonstrates significantly inferior performance compared to our model, exhibiting MAEs that are consistently 10 times higher across all evaluated physical quantities. This model exemplifies the intrinsic limitations of sequence-tracking-based methods, where the non-negligible nodal position prediction errors can significantly amplify inaccuracies in physical field predictions.
Figure 7.
MAEs of the physical field prediction results.
In contrast, our proposed model fundamentally transforms the prediction paradigm by directly establishing the mapping between the coordinates of arbitrary points within the deformed domain and their corresponding physical quantities. This architecture intrinsically ensures correspondence between 'keys' (coordinates) and 'values' (physical quantities), thereby achieving significant error reduction.
4.3. Analysis of Model Size Impact on Performance
We investigate the impact of model size on performance, with quantitative results presented in Figure 8. We scaled the hidden dimension across three settings (32, 128, and 512). This resulted in the boundary prediction module with 0.004 M, 0.052 M, and 0.8 M parameters, and the field distribution prediction module with 0.011 M, 0.168 M, and 2.636 M parameters, respectively. The results demonstrate a clear trend of decreasing MAE with increasing model parameters. Specifically, for the boundary prediction module, when the parameter count increased from 0.004 M to 0.8 M, the deviation in d decreased significantly from 0.21 to 0.05, representing a 75% reduction. Remarkably, compared to CrystalMind (28.98 M parameters with MAE 0.163), our model achieves approximately 60% higher accuracy while using only 3% of its parameter count (Note that the total number of parameters of CrystalMind is referenced because the nodal coordinates prediction and physical fields prediction are intrinsically coupled).
Figure 8.
Model performance vs. number of parameters.
On the other hand, scaling the physical field prediction module from 0.011 M to 2.636 M parameters yielded substantial accuracy gains, with improvement rates ranging from 68% to 85% across different physical quantities. Comparative analysis shows that our method achieves 36.2–79.8% higher accuracy in physical fields prediction with 89.7% parameter reduction compared to CrystalMind (2.636 M vs. 28.98 M parameters). Furthermore, our method achieves a 94–98% reduction in physical field prediction error compared to DeepForge (13.76 M parameters).
The proposed model establishes relationships between arbitrary point locations within the deformed field and their corresponding physical quantities, rather than modeling the mappings between initial states and the states after deformation. This fundamental innovation effectively circumvents prediction errors caused by mesh mismatches, not only significantly enhancing prediction accuracy but also dramatically reducing parameter requirements compared to existing methods.
4.4. Computation Efficiency
To quantify the computational efficiency, we evaluate the wall-clock training time as well as the model size, inference memory usage, and inference speed of the three models. The inference task requires simultaneous reconstruction of the geometrical deformation and the full-field distributions, including the temperature, strain, and stress fields, for all 30 time steps of the forming process. Both the proposed model and CrystalMind perform inference independently for each time step, which allows the full-process physical field reconstruction to be computed in parallel in batch mode. In contrast, DeepForge requires sequential execution during full-process inference, as each time step depends on the externally observed surface temperature as input (note that this temperature is treated as a known quantity during training).
We executed 500 independent inference runs for each model on both CPU (Intel Xeon Silver 4314; Intel Corporation, Santa Clara, CA, USA) and GPU (NVIDIA RTX 3090 24 GB; NVIDIA Corporation, Santa Clara, CA, USA) platforms, recording precise inference latencies using high-resolution timers (ns precision). The complete comparisons are presented in Table 3. The results indicate a distinct trade-off between inference speed and resource consumption across the different approaches. In terms of inference speed on a CPU, DeepForge demonstrates the shortest computation time (71.88 ± 1.85 ms), followed by CrystalMind (115.22 ± 7.65 ms), while our proposed method requires the longest processing time (167.53 ± 3.90 ms). However, a markedly different trend is observed for GPU acceleration. CrystalMind achieves a significantly faster inference speed (4.47 ± 0.74 ms) than both our method (95.86 ± 0.63 ms) and DeepForge (20.79 ± 3.05 ms) on the GPU. Regarding memory footprint, our method consumes a moderate amount of memory (1.56 GB), which is comparable to DeepForge (1.48 GB) and substantially lower than CrystalMind (2.70 GB).
Table 3.
Inference time and memory usage of the models.
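A latency measurement of the kind reported in Table 3 (mean ± standard deviation over repeated runs) can be sketched with a nanosecond-resolution timer as follows. The helper name and the stand-in workload are hypothetical; in practice `fn` would wrap a model's forward pass, and for GPU measurements the device would also need to be synchronized before each timestamp is read, which this CPU-only sketch does not cover.

```python
import statistics
import time


def time_inference(fn, runs=500):
    """Return (mean_ms, std_ms) of wall-clock latency over `runs` calls."""
    latencies_ms = []
    for _ in range(runs):
        start = time.perf_counter_ns()
        fn()
        latencies_ms.append((time.perf_counter_ns() - start) / 1e6)
    return statistics.mean(latencies_ms), statistics.stdev(latencies_ms)


def dummy_model():
    """Stand-in workload used in place of a real forward pass."""
    return sum(i * i for i in range(1000))


mean_ms, std_ms = time_inference(dummy_model, runs=50)
```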
The observed differences in computational efficiency can be attributed to the underlying architectural designs of the models. The superior GPU performance of CrystalMind likely stems from a highly optimized computational graph and a higher degree of parallelism inherent in its single-stage, end-to-end architecture. In contrast, the two-stage prediction pipeline of our method, which involves sequential processing by distinct geometric and physical field modules, introduces inherent computational overhead, resulting in longer CPU and GPU times. However, this architectural choice also allows for more compact intermediate representations, contributing to our method’s moderate memory usage. The low memory footprint of DeepForge is consistent with its autoencoder-based structure, which typically operates on a compressed latent representation. Ultimately, the higher computational cost of our method is the direct trade-off for its enhanced prediction accuracy, as demonstrated in the previous sections, making it particularly suitable for applications where precision is paramount over real-time speed.
4.5. Influence of the Guiding Branch
In this section, we analyze the influence of the guiding branch. We first study the performance with different . By default, . The results are demonstrated in Table 4. We find that as decreases from 10 to 0.1, shows a slight improving trend (decreasing from approximately 0.00312 to 0.00285), with the minimum value achieved at . Conversely, exhibits a consistent and significant deterioration (increasing from 0.00506 to 0.00647) as decreases. This relationship is expected, as a lower reduces the penalty associated with during training, allowing the model to prioritize the minimization of at the expense of . While , the model represents a balanced compromise between and .
Table 4.
and of the last training epoch with different .
Figure 9 compares the training loss with (w/) and without (w/o) the guiding branch, where the scale factor is set to 1 and 0, respectively. For the model without the guiding branch, the physical field loss and the displacement loss at the last training epoch are 0.0030 and 0.43, respectively. Experimental results confirm that activating the guiding branch significantly reduces the physical field prediction error, demonstrating its effectiveness in enhancing prediction accuracy through the integration of displacement-related information. Furthermore, even when the guiding branch is disabled, a consistent decrease of the displacement loss is observed throughout training, dropping from approximately 0.6 initially to 0.43 upon convergence. This trend clearly indicates a mutually reinforcing relationship between displacement and physical field prediction, where improvements in one aspect facilitate learning in the other.
Figure 9.
Training loss with/without the guiding branch.
Additionally, we conducted tests across different parameter scales, using the same parameterization schemes as in Section 4.3. The model with the guiding branch uses the default loss weight. As shown in Figure 10, activating the guiding branch consistently improves physical field prediction accuracy, with more pronounced benefits at larger model sizes. Specifically, for the temperature field at 2.636 M parameters, the prediction error is reduced by 33.8% (from 1.04 to 0.688) when the branch is enabled. The accuracy improvements for the other physical fields range from 8% to 38.5%. These results show that integrating the guiding branch’s displacement prediction loss improves physical field prediction, which underlines the critical role of geometric deformation awareness in this task.
Figure 10.
Performance with/without the guiding branch.
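The 33.8% figure quoted above follows from the usual relative-reduction formula, (baseline − improved) / baseline. The short sketch below reproduces that arithmetic from the two temperature-field errors given in the text; the function name is a hypothetical helper.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Return the fractional error reduction (baseline - improved) / baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline


if __name__ == "__main__":
    # Temperature-field errors at 2.636 M parameters: 1.04 (w/o) and 0.688 (w/).
    reduction = relative_reduction(1.04, 0.688)
    print(f"{reduction:.1%}")  # prints 33.8%
```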
5. Discussion
While the proposed method achieves notable accuracy in physical field prediction for large thermoplastic deformations, several limitations require further investigation. Firstly, the current framework does not employ explicit PDE-constrained optimization; instead, it incorporates an implicit physics regularization (nodal displacement) through the guiding branch. This design, inspired by the sequential geometry-physics solution paradigm of numerical methods, provides a flexible inductive bias towards physically consistent solutions. Nevertheless, for problems where the governing equations are fully known and well defined, future work could explore hybrid approaches that combine the proposed learning paradigm with explicit physical constraints. Secondly, the current validation has been conducted primarily on axisymmetric problems, so extending the method to general 3D non-axisymmetric geometries with complex mesh topologies is a necessary research direction. This extension would require addressing substantial challenges in spatial representation and computational efficiency, for which the proposed method offers a promising starting point: for physical field prediction tasks involving tens or even hundreds of thousands of mesh nodes with high-dimensional features, it is more effective to establish a functional mapping from nodal coordinates to the corresponding physical quantities than to model the temporal evolution of large-scale, high-dimensional nodal sequences. Lastly, while this study demonstrates the framework’s effectiveness on a specific aluminum alloy forming process, the methodology is inherently general.
The core innovation, a two-stage prediction framework that bypasses mesh-alignment errors, is not specific to a particular material model or process geometry. Future work will rigorously evaluate the transferability of this approach by extending validation to different material classes (e.g., steels, titanium alloys) exhibiting varied hardening and thermal responses, as well as to other forming operations (e.g., rolling, extrusion) with distinct boundary condition complexities. Such studies will further establish the framework’s potential as a general-purpose tool for high-fidelity deformation modeling.
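The idea of a functional mapping from nodal coordinates to physical quantities can be sketched as follows. This is a minimal illustration and not the network used in this work: the weights are random and untrained, the conditioning on process parameters and boundary geometry is omitted, and all names are hypothetical. It only shows that a point-wise map is evaluated independently at each coordinate, so the prediction at a point does not depend on how many other points are queried or in which order.

```python
import math
import random


def make_pointwise_map(in_dim: int, hidden_dim: int, out_dim: int, seed: int = 0):
    """Build a tiny random two-layer network that is applied to one point at a time."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(hidden_dim)]
    w2 = [[rng.uniform(-1, 1) for _ in range(hidden_dim)] for _ in range(out_dim)]

    def field_map(point):
        # Hidden layer with tanh activation, followed by a linear output layer.
        hidden = [math.tanh(sum(w * x for w, x in zip(row, point))) for row in w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    return field_map


def predict_fields(field_map, points):
    """Evaluate the map independently at every query coordinate."""
    return [field_map(p) for p in points]


if __name__ == "__main__":
    # 2 coordinates in, 9 physical quantities out (dimensions as in Appendix A).
    field_map = make_pointwise_map(in_dim=2, hidden_dim=8, out_dim=9)
    cloud_a = [(0.0, 0.0), (0.5, 0.2), (1.0, 0.4)]
    cloud_b = [(1.0, 0.4), (0.5, 0.2)]  # different size and ordering
    fields_a = predict_fields(field_map, cloud_a)
    fields_b = predict_fields(field_map, cloud_b)
    # The shared point (0.5, 0.2) receives the same prediction in both clouds.
    print(fields_a[1] == fields_b[1])
```

Because no fixed node count or node ordering enters the computation, the same map can in principle be queried on point clouds of any size, which is the property the discussion above relies on.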
6. Conclusions
In this paper, we propose a point-wise, full-field physics neural mapping framework constrained by boundary geometry for large thermoplastic deformation. We design a two-stage computational framework that first predicts the geometric deformation and then establishes point-wise mappings between coordinates and physical quantities within the deformed domain. By establishing functional mappings between coordinates and their corresponding physical quantities, our approach avoids the mesh-mismatch limitations inherent in existing deep learning-based methods. We validate our approach on a metal thermoplastic forming dataset and conduct comparative experiments with other deep learning models. Experimental results demonstrate that the proposed method achieves a 36–98% improvement in MAE compared to other deep learning baselines, confirming its effectiveness for large thermoplastic deformation computation.
Author Contributions
Conceptualization: J.W. and X.X.; Methodology: J.W., X.X. and C.Y.; Formal analysis and investigation: J.W. and X.X.; Software: J.W. and C.Y.; Validation: J.W. and C.Y.; Visualization: J.W.; Writing—original draft preparation: J.W.; Writing—review and editing: X.X. and W.H.; Funding acquisition: W.H. and X.X.; Resources: X.X. and W.H.; Supervision: X.X. and W.H. All authors have read and agreed to the published version of the manuscript.
Funding
This research is supported by the Key Program of the National Natural Science Foundation of China (No. 2024YFB4506002), the Young Scientists Fund of the National Natural Science Foundation of China (No. 62306030), and the Fundamental Research Funds for the Central Universities of the Ministry of Education of China (No. 00007895).
Institutional Review Board Statement
Not applicable.
Data Availability Statement
The data presented in this study are available upon request from the corresponding author.
Conflicts of Interest
The authors declare no conflicts of interest.
Appendix A
Table A1.
The architecture of CrystalMind. The process parameters vector is first embedded by a linear layer and then concatenated with the coordinates tensor processed by the Unit blocks (whose architecture is shown in Table A2). Finally, the physical quantities and coordinates of all 1080 nodes are predicted by Linear A and Linear B, respectively. A dropout layer with a probability of 0.1 follows the first Linear A and Linear B layers and their ReLU activations. bs denotes the batch size.
| Operation | Input | Output Shape | Activation |
|---|---|---|---|
| Linear | Process parameters vec | (bs, 20) | – |
| Unit | Coordinates tensor | (bs, 1080, 256) | – |
| Unit | Coordinates tensor | (bs, 1080, 512) | – |
| Unit | Coordinates tensor | (bs, 1080, 1024) | – |
| Mean | Coordinates tensor | (bs, 1024) | – |
| Concat | Process parameters vec, Coordinates tensor | (bs, 1044) | – |
| Linear A | Concat vec | (bs, 2046) | ReLU |
| Linear A | Concat vec | (bs, 1080 × 9) | Sigmoid |
| Linear B | Concat vec | (bs, 2048) | ReLU |
| Linear B | Concat vec | (bs, 1080 × 2) | Sigmoid |
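The shapes in Table A1 can be cross-checked with a few lines of arithmetic. The sketch below only does shape bookkeeping, not a model implementation; the constants are read from the table and caption, while the helper names are hypothetical.

```python
NUM_NODES = 1080       # nodes per sample (Table A1 caption)
NUM_QUANTITIES = 9     # physical quantities per node (Linear A output)
NUM_COORDS = 2         # coordinates per node (Linear B output)
PARAM_EMBED_DIM = 20   # width of the embedded process parameters vector
POOLED_DIM = 1024      # width after mean-pooling the last Unit output


def concat_dim(param_dim: int, pooled_dim: int) -> int:
    """Width of the vector obtained by concatenating both branches."""
    return param_dim + pooled_dim


def head_output_dim(num_nodes: int, per_node: int) -> int:
    """Flattened width of a prediction head that emits per-node values."""
    return num_nodes * per_node


if __name__ == "__main__":
    print(concat_dim(PARAM_EMBED_DIM, POOLED_DIM))        # 1044
    print(head_output_dim(NUM_NODES, NUM_QUANTITIES))     # 9720
    print(head_output_dim(NUM_NODES, NUM_COORDS))         # 2160
```

The concatenated width of 1044 matches the Concat row, and the two head widths correspond to the 1080 × 9 and 1080 × 2 entries for Linear A and Linear B.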
Table A2.
Architecture of Unit. The value of X depends on the level in the main framework shown in Table A1.
| Operation | Output Shape | Activation | Normalization |
|---|---|---|---|
| Linear | (bs, 128) | LeakyReLU | – |
| Linear | (bs, 128) | LeakyReLU | – |
| Linear | (bs, X) | – | BatchNorm1d |
Table A3.
The architecture of DeepForge. The surface temperature tensor consists of the values at the boundary nodes. Each operation is followed by a ReLU activation function, and a dropout layer with a probability of 0.1 is placed between the Linear layers. The Conv1D layers use a kernel size of 3, a padding of 1, and a stride of 2. The Gated Recurrent Unit (GRU) has a hidden dimension of 16 and a single layer. The output includes the coordinates and 9 physical quantities of all 1080 nodes.
| Operation | Input Type | Output Shape |
|---|---|---|
| Conv1D | Surface temperature tensor | (bs, 8, 1) |
| Conv1D | Surface temperature tensor | (bs, 16, 1) |
| Conv1D | Surface temperature tensor | (bs, 32, 1) |
| GRU | Surface temperature tensor | (bs, 32, 16) |
| Flatten | Surface temperature tensor | (bs, 512) |
| Concat | Surface temperature tensor, Process parameters vec | (bs, 516) |
| Linear | Concat vec | (bs, 1024) |
| Linear | Concat vec | (bs, 512) |
| Linear | Concat vec | (bs, 1024) |
| Linear | Concat vec | (bs, 1080 × 11) |
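The dimensions in Table A3 can be checked in the same way. The sketch below uses the standard output-length formula for a 1D convolution without dilation, floor((L + 2p − k) / s) + 1, with the kernel size, padding, and stride from the caption. The input length of 8 is a hypothetical example, and the process-parameter width of 4 is inferred from 516 − 512 rather than stated in the table.

```python
def conv1d_out_len(length: int, kernel: int = 3, padding: int = 1, stride: int = 2) -> int:
    """Output length of a 1D convolution without dilation."""
    return (length + 2 * padding - kernel) // stride + 1


def flatten_dim(steps: int, hidden: int) -> int:
    """Width after flattening a (steps, hidden) GRU output."""
    return steps * hidden


if __name__ == "__main__":
    # Hypothetical input length, used only to illustrate the formula.
    print(conv1d_out_len(8))                 # 4: stride 2 roughly halves the length

    flat = flatten_dim(32, 16)               # GRU output (bs, 32, 16) flattened
    print(flat)                              # 512
    inferred_param_dim = 516 - flat          # assumption: remaining width is the parameters vec
    print(inferred_param_dim)                # 4
    print(1080 * (2 + 9))                    # 11880 = 1080 x 11 (coordinates + quantities)
```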
References
- Phanden, R.K.; Sharma, P.; Dubey, A. A review on simulation in digital twin for aerospace, manufacturing and robotics. Mater. Today Proc. 2021, 38, 174–178.
- Shan, J.; Zhang, X.; Liu, Y.; Zhang, C.; Zhou, J. Deformation prediction of large-scale civil structures using spatiotemporal clustering and empirical mode decomposition-based long short-term memory network. Autom. Constr. 2024, 158, 105222.
- He, B.; Bai, K.-J. Digital twin-based sustainable intelligent manufacturing: A review. Adv. Manuf. 2021, 9, 1–21.
- Bathe, K.-J.; Ramm, E.; Wilson, E.L. Finite element formulations for large deformation dynamic analysis. Int. J. Numer. Methods Eng. 1975, 9, 353–386.
- Nti, I.K.; Adekoya, A.F.; Weyori, B.A.; Nyarko-Boateng, O. Applications of artificial intelligence in engineering and manufacturing: A systematic review. J. Intell. Manuf. 2021, 33, 1581–1601.
- Wang, B.; Tao, F.; Fang, X.; Liu, C.; Liu, Y.; Freiheit, T. Smart manufacturing and intelligent manufacturing: A comparative review. Engineering 2021, 7, 738–757.
- Pai, P.F.; Palazotto, A.N. Large-deformation analysis of flexible beams. Int. J. Solids Struct. 1996, 33, 1335–1353.
- Bruhns, O.T. Large deformation plasticity: From basic relations to finite deformation. Acta Mech. Sin. 2020, 36, 472–492.
- Jiang, Z.Y.; Tieu, A.K. A simulation of three-dimensional metal rolling processes by rigid–plastic finite element method. J. Mater. Process. Technol. 2001, 112, 144–151.
- Logg, A. Automating the finite element method. Arch. Comput. Methods Eng. 2007, 14, 93–138.
- Shetty, N.; Shahabaz, S.M.; Sharma, S.; Shetty, S. A review on finite element method for machining of composite materials. Compos. Struct. 2017, 176, 790–802.
- Odot, A.; Haferssas, R.; Cotin, S. Deepphysics: A physics aware deep learning framework for real-time simulation. Int. J. Numer. Methods Eng. 2022, 123, 2381–2398.
- Deshpande, S.; Lengiewicz, J.; Bordas, S.P.A. Probabilistic deep learning for real-time large deformation simulations. Comput. Methods Appl. Mech. Eng. 2022, 398, 115307.
- Liang, L.; Liu, M.; Martin, C.K.; Sun, W. A deep learning approach to estimate stress distribution: A fast and accurate surrogate of finite-element analysis. J. R. Soc. Interface 2018, 15, 20170844.
- Bolandi, H.; Li, X.; Salem, T.; Boddeti, V.N.; Lajnef, N. Bridging finite element and deep learning: High-resolution stress distribution prediction in structural components. Front. Struct. Civ. Eng. 2022, 16, 1365–1377.
- Zhang, Y.; Li, Q.-J.; Zhu, T.; Li, J. Learning constitutive relations of plasticity using neural networks and full-field data. Extrem. Mech. Lett. 2022, 52, 101645.
- Lee, S.; Kim, K.; Kim, N. A preform design approach for uniform strain distribution in forging processes based on convolutional neural network. J. Manuf. Sci. Eng. 2022, 144, 121004.
- Park, J.-H.; Han, B.; Choi, J.; Shin, S.; Kim, N. CNN-based preform design: Effect of training data configuration on strain distribution in forged products. Int. J. Adv. Manuf. Technol. 2024, 135, 4837–4854.
- Kim, K.; Han, B.; Kim, Y.; Kim, N. Detailed preform design procedure considering the effect of heat treatment in IN718 disk forging. J. Mater. Res. Technol. 2024, 30, 4625–4644.
- Cao, J.; Bambach, M.; Merklein, M.; Mozaffar, M.; Xue, T. Artificial intelligence in metal forming. CIRP Ann. 2024, 73, 561–587.
- Lin, Z.; Wang, R.; Hu, Z.; Hu, Z. Surface temperature field real-time reconstruction of hot forging die based on 1DCNN. Int. J. Therm. Sci. 2024, 204, 109206.
- Uribe, D.; Durand, C.; Baudouin, C.; Bigot, R. Accurate real-time modeling for multiple-blow forging. Int. J. Mater. Form. 2024, 17, 57.
- Petrik, J.; Bambach, M. Deepforge: Leveraging AI for microstructural control in metal forming via model predictive control. J. Manuf. Process. 2024, 121, 193–204.
- Castelló, W.B.; Flores, F.G. A triangular finite element with local remeshing for the large strain analysis of axisymmetric solids. Comput. Methods Appl. Mech. Eng. 2008, 198, 332–343.
- Zhang, S.H.; Zhang, G.L.; Liu, J.S.; Li, C.S.; Mei, R.B. A fast rigid-plastic finite element method for online application in strip rolling. Finite Elem. Anal. Des. 2010, 46, 1146–1154.
- Kim, Y.S.; Son, H.S.; Kim, C.I. Rigid–plastic finite element simulation for process design of impeller hub forming. J. Mater. Process. Technol. 2003, 143, 729–734.
- Han, J.; Cheng, Q.; Hu, P.; Xing, H.; Li, S.; Ge, S.; Wang, K. Finite Element Analysis of Large Plastic Deformation Process of Pure Molybdenum Plate during Hot Rolling. Metals 2023, 13, 101.
- A Python Frontend for Gmsh. Available online: https://github.com/nschloe/pygmsh (accessed on 10 February 2022).
- Lu, G.Y.; Wong, D.W. An adaptive inverse-distance weighting spatial interpolation technique. Comput. Geosci. 2008, 34, 1044–1055.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).