A Physics-Guided Machine Learning Model for Predicting Viscoelasticity of Solids at Large Deformation

Bao Qin; Zheng Zhong

doi:10.3390/polym16223222

and

¹

Research Institute of Interdisciplinary Science, School of Materials Science and Engineering, Dongguan University of Technology, Dongguan 523808, China

²

School of Science, Harbin Institute of Technology, Shenzhen 518055, China

^*

Author to whom correspondence should be addressed.

Polymers2024, 16(22), 3222;https://doi.org/10.3390/polym16223222

This article belongs to the Special Issue Mechanical Response Characteristics and Performance Evaluation of Polymer Materials

Version Notes

Order Reprints

Abstract

Physics-guided machine learning (PGML) methods are emerging as valuable tools for modelling the constitutive relations of solids due to their ability to integrate both data and physical knowledge. While various PGML approaches have successfully modeled time-independent elasticity and plasticity, viscoelasticity remains less addressed due to its dependence on both time and loading paths. Moreover, many existing methods require large datasets from experiments or physics-based simulations to effectively predict constitutive relations, and they may struggle to model viscoelasticity accurately when experimental data are scarce. This paper aims to develop a physics-guided recurrent neural network (RNN) model to predict the viscoelastic behavior of solids at large deformations with limited experimental data. The proposed model, based on a combination of gated recurrent units (GRU) and feedforward neural networks (FNN), utilizes both time and stretch (or strain) sequences as inputs, allowing it to predict stress dependent on time and loading paths. Additionally, the paper introduces a physics-guided initialization approach for GRU–FNN parameters, using numerical stress–stretch data from the generalized Maxwell model for viscoelastic VHB polymers. This initialization is performed prior to training with experimental data, helping to overcome challenges associated with data scarcity.

Keywords:

machine learning; viscoelasticity; large deformation; recurrent neural networks; polymers

1. Introduction

Machine learning has made significant strides in various fields of science and engineering, driven by the proliferation of digital data, increasing computing power, and advanced algorithms. It has achieved notable success in applications such as speech recognition [1], image classification [2], and cognitive science [3]. Recently, machine learning has emerged as a promising tool for predicting deformation patterns [4,5] by modeling the mechanical constitutive behavior of solids, particularly the relationship between mechanical stress and strain. However, modeling both time and history-dependent nonlinear constitutive relations such as viscoelasticity presents challenges for two reasons. First, many state-of-the-art machine learning techniques (e.g., feedforward/convolutional/recurrent neural networks, FNN/CNN/RNN) [6,7,8] require large datasets and often lack robustness, failing to guarantee convergence when only limited experimental stress–strain data are available from constitutive experiments, such as uniaxial tensile or bending tests. Second, modelling time and history-dependent nonlinear stress–strain relations can be difficult for many machine learning techniques, including the popular FNN, due to their limited capacity to handle sequential information. Therefore, advancing machine learning approaches to effectively model viscoelasticity in solids at large deformation is crucial for furthering the study of solid mechanics problems.

Physics-guided machine learning (PGML) methods, which integrate established physical principles with data-driven approaches, have proven beneficial in solid mechanics by simultaneously incorporating data information and physical knowledge [9,10,11]. These studies can be broadly classified into three categories: (i) discovering constitutive equations from data [12,13,14,15,16,17]; (ii) solving governing equations [18,19,20,21,22]; (iii) predicting data-driven constitutive relations [15,23,24,25,26,27,28]. Here, we focus on elaborating the third category. To predict data-driven constitutive relations, significant efforts have been made using deep neural networks, including FNN, temporal convolutional network (TCN), and RNN, among others. For example, Linka et al. [15] proposed a data-driven constitutive neural network model for predicting the mechanical constitutive behavior of hyperelastic materials. This model integrates physical laws and the symmetry of mechanical properties into the network structure, enabling the simultaneous use of information from three sources: stress–strain data, theoretical knowledge from materials science, and additional data such as microstructural or processing information; Danoun et al. [23] developed a thermodynamically consistent RNN model to simulate the mechanical response of elastoplastic materials under multi-axial and non-proportional loading conditions. Their work demonstrates that incorporating thermodynamic consistency can significantly enhance the predictive capabilities of such surrogate models; Wang et al. [24] developed a TCN model for materials with an ultra-long-history-dependent stress–strain relation. Their TCN model outperforms RNN models, including Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) models, in terms of both performance and speed for various sequential tasks. Abdolazizi et al. [28] proposed viscoelastic Constitutive Artificial Neural Networks (vCANNs), a novel physics-informed machine learning framework for anisotropic nonlinear viscoelasticity at finite strains. This approach is based on generalized Maxwell models, enhanced with neural networks to capture nonlinear strain- and strain rate-dependent properties. However, most of these models require a moderate to large amount of labeled data from experiments or physics-based simulations to accurately learn time-independent constitutive relations, such as those for path-independent elastic materials and path-dependent plastic materials [26,27]. They often face challenge in modelling the viscoelasticity of solids, which depends on both time and loading paths, especially when experimental data are scarce. Additionally, incorporating physical knowledge of viscoelasticity, such as viscous dissipation energy, directly into the training process is challenging. This difficulty arises because instantaneous viscous deformation and its conjugated driving force, two key components for calculating viscous dissipation energy, are difficult to obtain through experiments.

Therefore, this paper aims to develop a physics-guided machine learning model for predicting the viscoelasticity of solids at large deformation with scarce experimental data. First, we propose a surrogate model of viscoelasticity based on GRU and FNN (GRU–FNN). This model inputs both time and stretch (or strain) sequences, enabling it to predict stress based on time and loading path. Second, we implement a physics-guided initialization for the GRU–FNN parameters by training on numerical stress–stretch data derived from the generalized Maxwell model for viscoelastic solids under large deformations. This approach allows the model to predict the viscoelastic behavior of solids even with scarce experimental stress–stretch data. Unlike existing data-driven models that use residuals of the governing equations or variational forms as penalizing terms to restrict the solution space, our strategy effectively avoids the need for direct measurements of viscous deformation.

The remainder of this paper is organized as follows. In Section 2, a continuum theoretical framework for the viscoelasticity of solids at large deformation is given and constitutive equations including the state and evolving equations are derived. In Section 3, a GRU–FNN model for viscoelasticity is established. In Section 4, uniaxial viscoelasticity of commercially available dielectric polymers VHB4905 is modelled. Finally, conclusions are given in Section 5.

2. Thermodynamic Formulation of Constitutive Laws for Viscoelastic Solids at Large Deformation

Consider a solid

B_{0}

bounded by the surface

\partial B_{0}

, defined in a fixed reference configuration, which deforms into

B_{t}

with the surface

\partial B_{t}

. The deformation gradient is then given by

F = \nabla_{R} x

(1)

Here,

x = χ (X, t)

is a function that maps an arbitrary material point

X

inside

B_{0}

into a spatial point

x

inside

B_{t}

, and

\nabla_{R}

represents the gradient operator with respect to the coordinates

X

. To describe the viscoelasticity of solids under large deformation, a generalized Maxwell model [29,30,31,32,33] is employed, as shown in Figure 1. In this model, the solid is represented by an equilibrium spring, characterized by the deformation gradient

F

, along with

n

parallel Maxwell elements (each consisting of a spring and a dashpot in series), characterized by the elastic deformation gradient

F_{i}^{e}

and the viscous deformation gradient

F_{i}^{v}

, leading to the following relation:

F = F_{i}^{e} \cdot F_{i}^{v}, i = 1, \dots, n

(2)

Figure 1. The generalized Maxwell model for solids under large deformation. It is assumed to be equivalent to an equilibrium spring with a deformation gradient F, and n parallel Maxwell element, each characterized by an elastic deformation gradient

F_{i}^{e}

and a viscous deformation gradient

F_{i}^{v}

(1 ≤ i ≤ n).

Accordingly, the right Cauchy–Green deformation tensors,

C = F^{T} \cdot F

and

C_{i}^{e} = {(F_{i}^{e})}^{T} \cdot F_{i}^{e}

, where the superscript

T

denotes the transpose, are used to measure the deformation of the equilibrium spring and the elastic deformation of the Maxwell elements, respectively.

Neglecting inertial effects, the balance laws of force and moment in the current configuration are given by

σ \cdot \nabla + b = 0, σ = σ^{T}

(3)

where

σ

is the Cauchy stress tensor and

b

is the body force per unit volume in

B_{t}

, which can be expressed in the reference configuration as

P \cdot \nabla_{R} + b_{R} = 0, P \cdot F^{T} = F \cdot P^{T}

(4)

where

P

denotes the first P–K stress tensor and

b_{R}

denotes the body force per unit volume in

B_{0}

. Let

j

denote the determinant of

F

, and we have the relations

P = j σ \cdot {(F^{T})}^{- 1}

and

b_{R} = j b

.

Let

ε

and

q

represent the internal energy density and the heat source per unit volume in

B_{0}

, respectively, and

j^{q}

denote the heat flux per unit area on

\partial B_{t}

. The energy balance law in the current configuration is then expressed as

\dot{ε} = σ : \nabla \dot{χ} - \nabla \cdot j^{q} + q

(5)

whose corresponding form in the reference configuration is

{\dot{ε}}_{R} = P : \dot{F} - \nabla_{R} \cdot j_{R}^{q} + q_{R}

(6)

where

\frac{d}{d t}

or equivalently a superposed dot represents the material time derivative.

ε_{R}

and

q_{R}

denote the internal energy density and the heat source per unit volume in

B_{0}

, respectively, while

j_{R}^{q}

represents the heat flux per unit area on

\partial B_{0}

. The relationships

ε_{R} = j ε

,

q_{R} = j q

and

j_{R}^{q} = j j^{q} \cdot {(F^{T})}^{- 1}

hold for the above two equations. The first term on the right-hand side represents the mechanical work, while the second and third terms account for the energy from heat flow across the surface and the heat source inside the body, respectively.

Let

η

denote the entropy per unit volume in

B_{t}

and

ϑ

the absolute temperature. The entropy inequality in the current configuration is expressed locally as

\dot{η} \geq - \nabla \cdot \frac{j^{q}}{ϑ} + \frac{q}{ϑ}

(7)

which can also be written in the reference configuration as

{\dot{η}}_{R} \geq - \nabla_{R} \cdot \frac{j_{R}^{q}}{ϑ} + \frac{q_{R}}{ϑ}

(8)

where

η_{R}

is the entropy per unit volume in

B_{0}

, related to

η

by

η_{R} = j η

.

Introducing the Helmholtz free energy density

φ = ε - ϑ η

and considering the energy balance (5), the inequality (7) becomes

σ : \nabla \dot{χ} - η \dot{ϑ} - \dot{φ} - \frac{1}{ϑ} j^{q} \cdot \nabla ϑ \geq 0

(9)

Similarly, by introducing the Helmholtz free energy density

φ_{R} = ε_{R} - ϑ η_{R}

, where

φ_{R} = j φ

, and considering the energy balance (6), the inequality becomes

P : \dot{F} - η_{R} \dot{ϑ} - {\dot{φ}}_{R} - \frac{1}{ϑ} j_{R}^{q} \cdot \nabla_{R} ϑ \geq 0

(10)

This imposes the thermodynamic constraint on solids. For convenience, we will use the formulations provided in the reference configuration in the following sections.

Considering the thermo-viscoelastic effects in solids, the Helmholtz free energy density

φ_{R}

can be assumed to be a function of variables

(C, C_{i}^{e}, ϑ)

, i.e.,

φ_{R} = φ_{R} (C, C_{i}^{e}, ϑ)

(11)

whose material time derivative can be further written as

{\dot{φ}}_{R} = \frac{\partial φ_{R}}{\partial C} : \dot{C} + \sum_{i = 1}^{n} \frac{\partial φ_{R}}{\partial C_{i}^{e}} : {\dot{C}}_{i}^{e} + \frac{\partial φ_{R}}{\partial ϑ} \dot{ϑ}

(12)

The first and second terms on the right-hand side of Equation (12) can be rewritten as

\begin{array}{l} \frac{\partial φ_{R}}{\partial C} : \dot{C} = 2 F \cdot \frac{\partial φ_{R}}{\partial C} : \dot{F} \\ \frac{\partial φ_{R}}{\partial C_{i}^{e}} : {\dot{C}}_{i}^{e} = \frac{\partial φ_{R}}{\partial C_{i}^{e}} : \frac{\partial C_{i}^{e}}{\partial F} : \dot{F} + \frac{\partial φ_{R}}{\partial C_{i}^{e}} : \frac{\partial C_{i}^{e}}{\partial F_{i}^{v}} : {\dot{F}}_{i}^{v} = 2 F_{i}^{e} \cdot \frac{\partial φ_{R}}{\partial C_{i}^{e}} \cdot {(F_{i}^{v T})}^{- 1} : \dot{F} - 2 C_{i}^{e} \cdot \frac{\partial φ_{R}}{\partial C_{i}^{e}} : D_{i}^{v} \end{array}

(13)

with

D_{i}^{v} = \frac{1}{2} (L_{i}^{v} + L_{i}^{v T}), L_{i}^{v} = {\dot{F}}_{i}^{v} \cdot {(F_{i}^{v})}^{- 1}

(14)

where the superscript ‘

- 1

’ denotes the inverse of a tensor. Then, substitution of Equations (12) and (13) into Equation (10) yields

(P - 2 F \cdot \frac{\partial φ_{R}}{\partial C} - 2 F_{i}^{e} \cdot \frac{\partial φ_{R}}{\partial C_{i}^{e}} \cdot {(F_{i}^{v T})}^{- 1}) : \dot{F} - (η_{R} + \frac{\partial φ_{R}}{\partial ϑ}) \dot{ϑ} - \frac{1}{ϑ} j_{R}^{q} \cdot \nabla_{R} ϑ + \sum_{i = 1}^{n} 2 C_{i}^{e} \cdot \frac{\partial φ_{R}}{\partial C_{i}^{e}} : D_{i}^{v} \geq 0

(15)

In the case where

P

and

η_{R}

are independent of

\dot{F}

and

\dot{ϑ}

, the first two terms of the above inequality must vanish, leading to the following constitutive relations:

P = 2 F \cdot \frac{\partial φ_{R}}{\partial C} + 2 F_{i}^{e} \cdot \frac{\partial φ_{R}}{\partial C_{i}^{e}} \cdot {(F_{i}^{v T})}^{- 1}

(16)

η_{R} = - \frac{\partial φ_{R}}{\partial ϑ}

(17)

Thus, the inequality (15) reduces to

- \frac{1}{ϑ} j_{R}^{q} \cdot \nabla_{R} ϑ + \sum_{i = 1}^{n} M_{i}^{n e q} : D_{i}^{v} \geq 0

(18)

where

M_{i}^{n e q}

is the non-equilibrium Mandel stress tensor [34], defined as

M_{i}^{n e q} = 2 C_{i}^{e} \cdot \frac{\partial φ_{R}}{\partial C_{i}^{e}}

(19)

More specially, to satisfy the thermodynamic constraint imposed by inequality (18), the following constitutive equations are derived:

j_{R}^{q} = - Y \cdot \nabla_{R} ϑ

(20)

D_{i}^{v} = Q_{i} : M_{i}^{n e q}

(21)

where

Y

is a second order tensor and

Q_{i}

is a fourth order tensor, both of which are positive-definite. Here, Equation (20) represents the Fourier heat conduction law, and Equation (21) describes the rheological viscous flow rule [30,34].

The Helmholtz free energy density is assumed to be additively decomposed as follows:

φ_{R} = φ_{R}^{e q} + φ_{R}^{n e q}

(22)

where

φ_{R}^{e q}

and

φ_{R}^{n e q}

represent the equilibrium and non-equilibrium Helmholtz free energy densities, corresponding to the stretching of the single spring and the springs in Maxwell element, respectively. We adopt the Gent model [35] for both

φ_{R}^{e q}

and

φ_{R}^{n e q}

to account for the strain-stiffening effect, where solids may stiffen sharply as the stretch approaches their extension limit [36], as follows:

φ_{R}^{e q} = - \frac{G^{e q} L}{2} \ln (\frac{L - t r C + 3}{L})

(23)

φ_{R}^{n e q} = - \sum_{i = 1}^{n} \frac{G_{i}^{n e q} L_{i}}{2} \ln (\frac{L_{i} - t r C_{i}^{e} + 3}{L_{i}})

(24)

where the symbol ‘

t r

’ denotes the trace of a tensor,

G^{e q}

and

G_{i}^{n e q}

represent the equilibrium modulus of the single spring and the nonequilibrium modulus of the ith Maxwell element, respectively,

L

and

L_{i}

denote the extension limits of the single spring and the ith Maxwell element spring, respectively. Substituting Equations (22) into (16) and (19), we have

P = \frac{G^{e q} L}{L - t r C + 3} F + \sum_{i = 1}^{n} \frac{G_{i}^{n e q} L_{i}}{L_{i} - t r C_{i}^{e} + 3} F_{i}^{e} \cdot {(F_{i}^{v T})}^{- 1}

(25)

M_{i}^{n e q} = \frac{G_{i}^{n e q} L_{i}}{L_{i} - t r C_{i}^{e} + 3} C_{i}^{e}

(26)

Next, consider a case that the elastic deformations of springs are incompressible, which generally applies to polymers [32]. Lagrange multipliers

Π

and

Π_{i}

are introduced to enforce these constraint conditions, modifying the free energy density as

{\tilde{φ}}_{R} = φ_{R} - Π (j - 1) - \sum_{i = 1}^{n} Π_{i} (j_{i}^{e} - 1)

(27)

where

j_{i}^{e}

is the determinant of the elastic deformation gradient

F_{i}^{e}

. Replacing

φ_{R}

with

{\tilde{φ}}_{R}

in Equations (16) and (19), we can rewrite

P

and

M_{i}^{n e q}

as

P = \frac{G^{e q} L}{L - t r C + 3} F + \sum_{i = 1}^{n} \frac{G_{i}^{n e q} L_{i}}{L_{i} - t r C_{i}^{e} + 3} F_{i}^{e} \cdot {(F_{i}^{v T})}^{- 1} - j Π {(F^{T})}^{- 1} - \sum_{i = 1}^{n} j_{i}^{e} Π_{i} {(F^{T})}^{- 1}

(28)

M_{i}^{n e q} = \frac{G_{i}^{n e q} L_{i}}{L_{i} - t r C_{i}^{e} + 3} C_{i}^{e} - j_{i}^{e} Π_{i} I

(29)

where

I

is the second-order unit tensor. Furthermore, the Cauchy stress tensor

σ

can be obtained using the relations

σ = \frac{1}{j} P \cdot F^{T}

, and

j = j_{i}^{e} = 1

in Equation (28), as follows:

σ = \frac{G^{e q} L}{L - t r C + 3} F \cdot F^{T} + \sum_{i = 1}^{n} \frac{G_{i}^{n e q} L_{i}}{L_{i} - t r C_{i}^{e} + 3} F_{i}^{e} \cdot F_{i}^{e T} - (Π + \sum_{i = 1}^{n} Π_{i}) I

(30)

By employing

j = j_{i}^{e} = 1

and the multiplicative decomposition of

F

from Equation (2), the condition of viscous incompressibility,

t r D_{i}^{v} = 0

or

j_{i}^{v} = 1

, where

j_{i}^{v}

is the determinant of

F_{i}^{v}

, can be deduced. From this, the fourth-order tensor

Q_{i}

in Equation (21) can be expressed as [32]

Q_{i} = \frac{1}{2 v_{i}} (I_{4} - \frac{1}{3} I \otimes I)

(31)

where

I_{4}

is the fourth-order unit tensor and

v_{i}

is the viscosity of the

i th

subnetwork. The relaxation time for the

i th

Maxwell element is then defined as

τ_{i} = \frac{v_{i}}{G_{i}^{n e q}}

[31].

The corresponding deformation gradient in tensile tests, without shear deformation, can be expressed as

\begin{array}{l} F = [\begin{matrix} λ_{1} & 0 & 0 \\ 0 & λ_{2} & 0 \\ 0 & 0 & λ_{3} \end{matrix}], F_{i}^{e} = [\begin{matrix} λ_{i 1}^{e} & 0 & 0 \\ 0 & λ_{i 2}^{e} & 0 \\ 0 & 0 & λ_{i 3}^{e} \end{matrix}], F_{i}^{v} = [\begin{matrix} λ_{i 1}^{v} & 0 & 0 \\ 0 & λ_{i 2}^{v} & 0 \\ 0 & 0 & λ_{i 3}^{v} \end{matrix}] \\ D_{i}^{v} = [\begin{matrix} {\dot{λ}}_{i 1}^{v} / λ_{i 1}^{v} & 0 & 0 \\ 0 & {\dot{λ}}_{i 2}^{v} / λ_{i 2}^{v} & 0 \\ 0 & 0 & {\dot{λ}}_{i 3}^{v} / λ_{i 3}^{v} \end{matrix}] \end{array}

(32)

Here

λ_{1}

,

λ_{2}

,

λ_{3}

are the principal stretches of the deformation gradient

F

;

λ_{i 1}^{e}

,

λ_{i 2}^{e}

,

λ_{i 3}^{e}

are the principle stretches of the elastic deformation gradient

F_{i}^{e}

; and

λ_{i 1}^{v}

,

λ_{i 2}^{v}

,

λ_{i 3}^{v}

are the principal stretches of the viscous deformation gradient

F_{i}^{v}

. Additionally, considering the equal lateral stretches during the uniaxial tensile test and the incompressible deformation condition

j = j_{i}^{e} = j_{i}^{v} = 1

, we have

λ_{1} = λ, λ_{2} = λ_{3} = {(λ)}^{- \frac{1}{2}}

λ_{1} = λ_{i 1}^{e} λ_{i 1}^{v} = λ, λ_{i 2}^{e} λ_{i 2}^{v} = λ_{i 3}^{e} λ_{i 3}^{v} = {(λ)}^{- \frac{1}{2}}

λ_{i 1}^{e} = λ_{i}^{e}, λ_{i 2}^{e} = λ_{i 3}^{e} = {(λ_{i}^{e})}^{- \frac{1}{2}}, λ_{i 1}^{v} = λ_{i}^{v}, λ_{i 2}^{v} = λ_{i 3}^{v} = {(λ_{i}^{v})}^{- \frac{1}{2}}

(33)

with

λ

,

λ_{i}^{e}

,

λ_{i}^{v}

being the principal stretches of the deformation gradients

F

,

F_{i}^{e}

,

F_{i}^{v}

along the tensile direction, respectively.

Let

P_{1}

,

P_{2}

P_{3}

denote the components of

P

along the three principal directions, respectively. Substituting Equations (32) and (33) into Equation (28), we obtain

\begin{array}{l} P_{1} = \frac{G^{e q} L λ}{L - 2 λ^{- 1} - λ^{2} + 3} + \sum_{i = 1}^{n} \frac{G_{i}^{n e q} L_{i} λ {(λ_{i}^{v})}^{- 2}}{L_{i} - 2 λ^{- 1} λ_{i}^{v} - λ^{2} {(λ_{i}^{v})}^{- 2} + 3} - \frac{Π + \sum_{i = 1}^{n} Π_{i}}{λ} \\ P_{2} = P_{3} = \frac{G^{e q} L λ^{- \frac{1}{2}}}{L - 2 λ^{- 1} - λ^{2} + 3} + \sum_{i = 1}^{n} \frac{G_{i}^{n e q} L_{i} λ^{- \frac{1}{2}} λ_{i}^{v}}{L_{i} - 2 λ^{- 1} λ_{i}^{v} - λ^{2} {(λ_{i}^{v})}^{- 2} + 3} - (Π + \sum_{i = 1}^{n} Π_{i}) λ^{\frac{1}{2}} \end{array}

(34)

According to the boundary condition

P_{2} = P_{3} = 0

for uniaxial loading–unloading tests, we can further derive

Π + \sum_{i = 1}^{n} Π_{i} = \frac{G^{e q} L λ^{- 1}}{L - 2 λ^{- 1} - λ^{2} + 3} + \sum_{i = 1}^{n} \frac{G_{i}^{n e q} L_{i} λ^{- 1} λ_{i}^{v}}{L_{i} - 2 λ^{- 1} λ_{i}^{v} - λ^{2} {(λ_{i}^{v})}^{- 2} + 3}

(35)

and

P_{1} = \frac{G^{e q} L (λ - λ^{- 2})}{L - 2 λ^{- 1} - λ^{2} + 3} + \sum_{i = 1}^{n} \frac{G_{i}^{n e q} L_{i} [λ {(λ_{i}^{v})}^{- 2} - λ^{- 2} λ_{i}^{v}]}{L_{i} - 2 λ^{- 1} λ_{i}^{v} - λ^{2} {(λ_{i}^{v})}^{- 2} + 3}

(36)

Here, the effect of deformation incompressibility on the stress is considered, and the Lagrange multipliers are eliminated according to the boundary conditions.

Let

M_{i 1}^{n e q}

,

M_{i 2}^{n e q}

M_{i 3}^{n e q}

denote the components of

M_{i}^{n e q}

along the three principal directions, respectively. Similarly, substituting Equations (32) and (33) into Equation (29), we obtain

\begin{array}{l} M_{i 1}^{n e q} = \frac{G_{i}^{n e q} L_{i} λ^{2} {(λ_{i}^{v})}^{- 2}}{L_{i} - 2 λ^{- 1} λ_{i}^{v} - λ^{2} {(λ_{i}^{v})}^{- 2} + 3} - Π_{i} \\ M_{i 2}^{n e q} = M_{i 3}^{n e q} = \frac{G_{i}^{n e q} L_{i} λ^{- 1} λ_{i}^{v}}{L_{i} - 2 λ^{- 1} λ_{i}^{v} - λ^{2} {(λ_{i}^{v})}^{- 2} + 3} - Π_{i} \end{array}

(37)

Then, substitution of Equations (37), (32) and (31) into Equation (21) yields

\frac{{\dot{λ}}_{i}^{v}}{λ_{i}^{v}} = \frac{L_{i}}{3 τ_{i} [L_{i} - 2 λ^{- 1} λ_{i}^{v} - λ^{2} {(λ_{i}^{v})}^{- 2} + 3]} [λ^{2} {(λ_{i}^{v})}^{- 2} - λ^{- 1} λ_{i}^{v}]

(38)

This describes the viscous flow in solids subjected to uniaxial tension. It is important to note that the elastic incompressibility of the springs does not affect the viscous flow, as only the deviatoric stress, excluding the hydrostatic pressure

Π_{i}

, derives the viscous flow.

Under the initial conditions

{P_{1}|}_{t = 0} = 0

,

{λ|}_{t = 0} = {λ_{i}^{v}|}_{t = 0} = 1

, the coupled Equations (36) and (38) can be solved simultaneously to predict the mechanical behaviors based on the given material parameters. A finite difference approach is used to obtain the numerical results for the evolving stress

P_{1}

as a function of the stretch

λ

at ten different stretching rates. The model parameters are obtained by fitting Equations (36) and (38) with the experimental data at 296K as n = 3,

L = L_{i} = 155 (i = 1, 2, 3)

,

G^{e q} = 15.12 kPa

,

G_{1}^{n e q} = 16.83 kPa

,

G_{2}^{n e q} = 27.39 kPa

,

G_{3}^{n e q} = 41.04 kPa

,

τ_{1} = 413.32 s

,

τ_{2} = 5.43 s

,

τ_{3} = 1.65 s

. Great consistency between the experimental data and the theoretical model is shown in Figure 2. With these parameters determined, the theoretical data at different stretching rates are calculated as shown in Figure 3. The stress can be observed to increase with stretch during the loading phase and decrease with stretch during the unloading phase. Each loading–unloading cycle exhibits characteristic viscoelastic behavior, with the peak stress magnitude rising as the stretching rate increases.

Figure 2. The fitting stress–stretch curves between theoretical model and experimental data.

Figure 3. Theoretical stress–stretch curves at different stretching rates.

The initialization of parameters of the following machine learning model will be obtained by training these theoretical results, which is proven to be an effective strategy for predicting viscoelasticity of solids with scarce experimental data. This strategy avoids directly imposing physical constraints on the training procedure since the specific governing equations are usually uncertain and the physical quantities such as viscous deformation and its conjugated driving force are hard to obtain via experiments. In the next section, a machine learning model for predicting viscoelasticity will be introduced.

3. Machine Learning Method for Predicting Viscoelasticity

As mentioned, the FNN architecture is not well suited for handling history-dependent behaviors because there is no direct relationship between the network inputs at a current time step and the outputs from previous time steps. To address this issue, the RNN architecture is introduced as a reliable model for managing this history dependency. RNN training utilizes the Backpropagation Through Time (BPTT) algorithm, which applies backpropagation [37] to a time sequence. However, when dealing with long sequence inputs, RNN training may suffer from the common problems of vanishing and exploding gradients during backpropagation. This occurs because RNN builds very deep computational graphs by repeatedly applying the same operation at each time step of a long sequence. The LSTM and GRU are specialized recurrent neural network architectures designed to tackle these vanishing and exploding gradient issues. Memory cells in LSTM and GRU networks are equipped with gated units that dynamically control the flow of information, allowing the networks to “forget” old and unnecessary information and avoid the problems associated with multiplying large sequences of numbers during temporal backpropagation. GRU have the advantage of avoiding overfitting and reducing training time compared to LSTM, as GRU with two gates have significantly few parameters than LSTM with four gates.

To predict the time-dependent stress of viscoelastic materials, a surrogate model based on GRU and FNN (GRU–FNN) is employed, as illustrated in Figure 4a. At each time step, the GRU cell manages the information flow using a reset gate

r_{t}

, which regulates the integration of new input with the previous memory; and an update gate

Z_{t}

, which determines how much of the previous memory should be retained; and a hidden state

h_{t}

, which transfers information from the candidate hidden state

{\tilde{h}}_{t}

forward, as depicted in Figure 4b. These two gates control the information flow between the long-term hidden state and the predictions at each time step. FNN then processes the final hidden state of the GRU as unidirectional flow signals, moving from the input layer through hidden layers to the output layer, as shown in Figure 4c. A detailed introduction to GRU and FNN will follow.

Figure 4. (a) GRU–FNN architecture; (b) details of GRU; (c) details of FNN.

3.1. Gated Recurrent Unit

The reset gate output is [8,38]

r_{t} = σ (W^{r} x_{t} + U^{r} h_{t - 1} + b^{r})

(39)

Here, the weight matrix

W^{r}

connects the input at a specific time

t

to the current hidden layer within the reset gate, while the weight matrix

U^{r}

represents the recurrent connection between the previous and current hidden layers within the reset gate. The initial hidden state is initialized the null vector

h_{0} = 0

.

b^{r}

denotes the bias vector, and

σ

refers to the activation function.

The update gate output is

Z_{t} = σ (W^{z} x_{t} + U^{z} h_{t - 1} + b^{z})

(40)

where

W^{z}

,

U^{z}

,

b^{z}

are respectively two weights matrices and a bias vector regulating the update mechanism in this gate.

The candidate hidden state

{\tilde{h}}_{t}

is

{\tilde{h}}_{t} = \tanh [W^{h} x_{t} + U^{h} (r_{t} * h_{t - 1}) + b^{h}]

(41)

where

W^{h}

and

U^{h}

are two different weight matrices,

b^{h}

is a bias vector, the symbol

*

denote the Hadamard product of two vectors with identical dimensions,

t a n h

is the hyperbolic tangent function.

The GRU output current hidden state

h_{t}

is

h_{t} = (1 - Z_{t}) * h_{t - 1} + Z_{t} * {\tilde{h}}_{t}

(42)

where

1

is a unit vector, i.e., a vector filled with ones.

The total number of parameters

N_{P}

in a GRU cell is related to the dimension

N_{I}

of the input vector

x_{t}

and the number of neurons

N_{H}

in the hidden layers of the GRU cell, given by

N_{P} = 3 N_{H} (N_{I} + N_{H} + 2)

(43)

Note that the above bias vector is considered to consist of two different bias vectors related to two different weight matrices, respectively.

3.2. Feedforward Neural Network

The FNN can be described by the following compositional function [39]:

O (h_{t}, W, b) = σ^{Λ} \circ σ^{Λ - 1} \circ \dots \circ σ^{1}

(44)

where the symbol

\circ

denotes the composition operator, the superscript

Λ

represents the total number of the neural network, and

σ

is the activation function with the following form:

σ^{λ} (L^{λ - 1}) = σ^{λ} (W^{λ} L^{λ - 1} + b^{λ}), λ = 1, \dots, Λ

(45)

Here, the superscript

λ

denotes the layer number of the neural network,

L^{λ - 1}

the input vector from the

(λ - 1) th

layer (specially,

L^{0} = h_{t}

and

L^{Λ} = y_{t}

),

W^{λ}

and

b^{λ}

respectively the weight matrices and biases vector of

λ th

layer. Note that all hidden states of the last GRU are input into the same FNN in turn for training. The total parameters

N^{F}

in FNN can be calculated as

N^{F} = \sum_{λ = 1}^{Λ} N^{λ} (N^{λ - 1} + 1)

(46)

where

N^{λ}

is the number of neurons of the

λ th

layer.

3.3. Training Procedure

The loss function, defined as the mean absolute error (MAE), is expressed as

l (θ) = \frac{1}{N^{D}} \sum_{k = 1}^{m} |y_{k} - {\hat{y}}_{k}|

(47)

where

y_{k}

and

{\hat{y}}_{k}

denote the ground-truth and predicted outputs at the

i th

time sequence,

N^{D}

is the total number of datapoints in the training dataset, and

θ

represents the parameters of the neural networks. These parameters are updated by minimizing the loss function, as follows:

θ^{*} = \underset{θ}{a r g m i n} (l (x, θ))

(48)

where

θ^{*}

represents the values of

θ

that minimize

l (x, θ)

. Common optimization methods for minimizing the loss function include gradient decent techniques (such as Adam) and quai-Newton methods (such as L-BFGS). In this work, the Adam optimizer [40] is tentatively employed to update the hyperparameters. The Backpropagation (BP) technique and Backpropagation Through Time (BPTT) technique are used to compute the gradients of the loss function with respect to the parameters of FNN and GRU, respectively. In the following sections, the GRU–FNN model consists of three stacked GRU layers with

100

units each and a single time-distributed dense layer with

100

neurons. This configuration balances computational cost and error minimization through trial and error. The leaky rectified linear unit (LeakyReLU) is used as the activation function for the GRU, while the linear function is used for the FNN.

4. Results and Discussions

4.1. Initialization of GRU-FNN Parameters by Training Theoretical Data

The inputs and outputs are unrolled through

100

time steps, i.e.,

x = \{x_{1}, x_{2}, \cdot \cdot \cdot, x_{m}\}

and

y = \{y_{1}, y_{2}, \cdot \cdot \cdot, y_{m}\}

with m = 100. Let

t_{k}

,

λ_{k}

and

P_{k}

denote the loading time, the stretch and the first P-K stress at the

k th

time step, respectively.

t_{k}

is calculated by the following formula:

t_{k} = \{\begin{matrix} \frac{λ_{k} - 1}{|\dot{λ}|}, \dot{λ} > 0 \\ \frac{λ_{m a x} - 1}{|\dot{λ}|} + \frac{λ_{m a x} - λ_{k}}{|\dot{λ}|}, \dot{λ} < 0 \end{matrix}

(49)

where

λ_{m a x}

is the maximum of

λ

during loading tests. The inputs

x_{k} = (t_{k}, λ_{k})

and the outputs

y_{k} = (P_{k})

are used for the uniaxial loading-unloading viscoelasticity. Ten samples are generated by capturing 100 points per curve in Figure 2 and then split into training data (80% of total samples) and testing data (20% of total samples). The training data with a batch size of 2 are used during model’s training process, while the testing data are used for validation of the model’s generalization capabilities.

The GRU–FNN model is implemented in the software library Keras 2.10.0 and its information is listed in Table 1. In the output shape of layers, the first None refers to the batch, the second number the time steps and the third number the units (i.e., the dimension of the hidden state) of GRU or the dimension of the outputs of the dense layer. According to Equation (43), the numbers of parameters of GRU 1 and GRU 2 (GRU 3) layers are calculated respectively as 3 × 100 × (2 + 2 + 100) = 31,200 and 3 × 100 × (2 + 100 + 100) = 60,600. According to Equation (46), the number of parameters of the dense layer is calculated as 1 × (100 + 1) = 101.

Table 1. Information of the GRU-FNN model.

Figure 5 illustrates the evolution of loss with respect to epoch during training and testing. It is observed that the loss decreases dramatically at first and then stabilizes at a steady state. Additionally, the testing loss is slightly higher than the training loss. These observations suggest that the model does not suffer from overfitting or underfitting issues.

Figure 5. Evolution of the loss during training and testing with respect to epochs.

Figure 6 compares the stress–stretch curves from the training data with the model’s predictions at different stretching rates after 5000 epochs. It was observed that the loss value of the training data stabilizes around 0.50 after approximately 5000 epochs through trial and error. Therefore, we trained the model for 5000 epochs. To evaluate the model’s accuracy under various conditions, the root mean square error (RSME) is employed, defined as

RMSE = \frac{1}{m} \sum_{k = 1}^{m} \sqrt{{(y_{k} - {\hat{y}}_{k})}^{2}}

, where

y_{k}

and

{\hat{y}}_{k}

represents the ground truth and the predicted values. The RMSE quantifies the discrepancy between the predicted results and the training or testing data. RMSE values for all cases in this work are listed in Table 2. For the stretching rates of

|\dot{λ}|

= 0.01/s, 0.02/s, 0.06/s, 0.08/s, 0.12/s, 0.14/s, and 0.16/s, the corresponding RMSE values are 0.76 0.56, 0.53, 0.71, 0.85, 0.83, 0.73, and 0.63, respectively. The predictions from the GRU–FNN model are highly consistent with training data, and the hysteresis curves exhibit typical viscoelasticity.

Figure 6. The comparison of the stress–strain curves from the training data and the model’s prediction after 5000 epochs.

Figure 7 compares the stress–stretch curves from the testing data with the model’s predictions at two different stretching rates after 5000 epochs. For the stretching rates of

|\dot{λ}|

= 0.04/s and 0.18/s, the corresponding RMSE values are 3.41 and 1.52, respectively. The testing data, which was not used during the training process, is accurately predicted by the trained model, demonstrating the GRU–FNN model’s strong generalization capabilities in predicting viscoelastic behaviors. The parameter values (weights and biases) here after 5000 epochs are saved for use as initial parameters in training the GRU–FNN model on experimental data.

Figure 7. The comparison of the stress–strain curves from testing data and model’s prediction after 5000 epochs.

4.2. Initialization of GRU–FNN Parameters by Training Theoretical Data of Viscoelasticity of VHB4905 with Scarce Data from Experiments

In the thermo-viscoelastic experiments of Liao and Hossain [41], VHB4905 samples with dimensions of 100 m × 10 mm × 0.5 mm are used for cyclic loading–unloading tests to investigate the time- and stretching rate-dependent viscoelasticity of VHB polymers. Three datasets, selected from the experimental stress–stretch curves at 273 K and three different stretching rates, i.e., 0.03/s, 0.05/s, and 0.10/s, are split into two training datasets for learning the viscoelastic behaviors and one testing dataset for checking the GRU–FNN model’s capabilities in predicting viscoelasticity of materials with scarce training data.

First, the three datasets are used with the GRU–FNN model initialized with default parameter values (weights and biases), and the predicted stress is obtained after 5300 epochs. Figure 8 shows the evolution of the loss with respect to epochs during training and testing. It is observed that the loss for testing data are significantly higher than that for training data after several epochs and fluctuates at a high level, indicating model overfitting. Figure 9 compares the stress–stretch curves from the experimental data with the model’s predictions at 273 K after 5300 epochs. For the stretching rates of

|\dot{λ}|

= 0.03/s, 0.05/s, and 0.10/s, the corresponding RMSE values are 2.84, 3.07, and 62.49, respectively. The training data from cyclic experiments at

|\dot{λ}|

= 0.03/s and

|\dot{λ}|

= 0.05/s are predicted accurately. However, the testing data from cyclic experiments at

|\dot{λ}|

= 0.10/s show substantial divergence with the predicted results, further demonstrating the poor generalization capabilities of the model in predicting viscoelasticity of materials with scarce training data.

Figure 8. Evolution of loss with respect to epochs during training and testing at 273 K.

Figure 9. The comparison of the stress–stretch curves from the experimental data and the model’s prediction at 273 K after 5300 epochs.

To address the overfitting issues associated with scarce data, the parameter values from training the GRU–FNN model with theoretical data are used as the initialization for training on the experimental data. Figure 10 shows the evolution of the loss with respect to epochs during both training and testing. It is evident that the gap between the loss for training and testing data has significantly narrowed compared to Figure 8. Given that the theoretical data were trained for 5000 epochs, the experimental data were trained 300 epochs to maintain consistency in the total training duration (5300 epochs). Figure 11 compares the stress–stretch curves from the experimental data with the model’s predictions at 273 K after 300 epochs. It was also observed that the loss values of training data and testing data stabilize after approximately 300 epochs through trial and error. For the stretching rates of

|\dot{λ}|

= 0.03/s, 0.05/s, and 0.10/s, the corresponding RMSE values are

1.90

,

2.62

, and

15.70

, respectively. The high degree of consistency between the predicted and experimental stress–stretch curves indicates that our approach has effectively alleviated the overfitting issues.

Figure 10. Evolution of the loss with respect to epochs during training and testing at 273 K.

Figure 11. The comparison of the stress–strain curves from the experimental data and the model’s prediction at 273 K.

Finally, we split four datasets, selected from the stress–stretch curves at four different stretching rates, i.e., 0.025/s, 0.05/s, 0.10/s, and 0.20/s in the experiments of VHB4905 samples with dimensions of 130 mm × 10 mm × 0.5 mm [42], into three training datasets and one testing dataset to further assess the GRU–FNN model’s capabilities in predicting the viscoelasticity of materials across different temperatures. The saved parameter values from training with theoretical data are used for initialization. Figure 12 shows the evolution of the loss with respect to epochs during training and testing at 313 K. It can be seen that the gap of the loss during training and testing is smaller than that observed in Figure 9 due to the inclusion of an additional training dataset. Figure 13 compares the stress–stretch curves from the experimental data and the model’s predictions at 313 K after 300 epochs. For the stretching rates of

|\dot{λ}|

= 0.025/s, 0.05/s, 0.10/s, and 0.20/s at 313 K, the corresponding RMSE values are 0.81, 1.32, 1.72, and 4.55, respectively. Figure 14 shows the evolution of the loss with respect to epochs during training and testing at 333 K. The evolution trend is similar with that in Figure 12. Figure 15 compares the stress–stretch curves from the experimental data and the model’s predictions at 333 K after 300 epochs. At 333 K, for the same stretching rates, the RMSE values are 0.24, 0.31, 0.53, and 4.27, respectively. The model accurately predicts the training data and provides reasonable predictions for the testing data, demonstrating improved generalization capabilities.

Figure 12. Evolution of loss with respect to epochs during training and testing at 313 K.

Figure 13. The comparison of the stress–strain curves from the experimental data and the model’s prediction at 313 K.

Figure 14. Evolution of loss with respect to epochs during training and testing at 333 K.

Figure 15. The comparison of the stress–strain curves from the experimental data and the model’s prediction at 333 K.

Table 2. Statics of RMSE values.

Cases	Stretching Rates (/s)	RMSE Values (kPa)
Figure 6	(0.01,0.02,0.06,0.08,0.10,0.12,0.14,0.16)	(0.76,0.56,0.53,0.71,0.85,0.83,0.73,0.63)
Figure 7	(0.04,0.18)	(3.41,1.52)
Figure 9	(0.03,0.05,0.10)	(2.84,3.07,62.49)
Figure 11	(0.03,0.05,0.10)	(1.90,2.62,15.70)
Figure 13	(0.025,0.05,0.10,0.20)	(0.81,1.32,1.72,4.55)
Figure 15	(0.025,0.05,0.10,0.20)	(0.24,0.31,0.53,4.27)

4.3. Analyzing the Sensitivity of the GRU-FNN Model

To test the sensitivity of the GRU–FNN model to input data, a random number is generated from a standard normal distribution (mean = 0, standard deviation = 1). The generated noise value is then multiplied by 0.5% or 1.0% of the original stretch values, resulting in a noise adjustment value that is proportional to the magnitude of the original data. This value is added to the original data to simulate noise in the data. Since the time series can be calculated based on the stretching strain and stretching rate, we only apply noise processing to the stretching strain sequence here. The predicted results of the dataset at 273 K with added noise are shown in Figure 16. It can be observed that the model is resilient to small amounts of noise. However, as the noise increases, the prediction error also increases. When the noise exceeds the error between adjacent points, the prediction curve tends to become unsmooth and may even change direction, leading to a complete failure of the prediction. The same pattern was also observed in the predictions of the other two datasets at 313 K and 333 K, respectively, which are not shown here, but the corresponding RMSE values are listed in Table 3.

Figure 16. The comparison of the stress–strain curves from the model’s prediction at 273 K and the experimental data with (a) 0.5% noise and (b) 1.0% noise.

Table 3. Statics of RMSE values.

5. Conclusions

In this paper, a physics-guided GRU–FNN model for predicting the viscoelasticity of solids at large deformation is proposed and validated by the comparison of the stress–stretch curves from the model prediction and the uniaxial experiments of VHB4905. The major novelty of the present work lies in the following aspects. First, the time- and loading path- dependent stress can be predicted by utilizing both time and stretch/strain sequences as inputs of the proposed GRU–FNN model. Second, the saved values of the parameters after training theoretical data are employed as the parameter initialization of GRU–FNN model for learning the viscoelastic behavior of solids with scarce experimental data, which significantly alleviates the overfitting issues that arise in the case of scarce data. The results show that the GRU–FNN model can reduce the RMSE by several times, significantly improving its generalization ability in scarce data scenarios. The physics-guided initialization of parameters in this work provides a new idea on how to incorporate physical knowledge into a machine learning model when the governing equations describing physical phenomenon in one material cannot be specified. The model can be applied to predict multiaxial stress–strain relations of viscoelastic solids in a similar manner when the experimental data from multiaxial tests are available, which will be our future study subject. This study facilitates the rapid prediction of materials’ nonlinear viscoelastic properties and lays a theoretical foundation for future real-time monitoring and performance prediction of polymer-based devices such as electronic artificial muscles and dielectric elastomer actuators.

Author Contributions

Conceptualization, B.Q. and Z.Z.; Methodology, B.Q.; Writing—original draft, B.Q.; Writing—review & editing, Z.Z.; Supervision, Z.Z.; Funding acquisition, B.Q. and Z.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by Guangdong Basic and Applied Basic Research Foundation (Grant No. 2023A1515111166), the Development and Reform Commission of Shenzhen (No. XMHT20220103004), the Shenzhen Natural Science Fund (No. GXWD20231130100351002) and the National Natural Science Foundation of China (Grant No. 11932005).

Data Availability Statement

The original contributions presented in the study are included in the article, further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

Nassif, A.B.; Shahin, I.; Attili, I.; Azzeh, M.; Shaalan, K. Speech Recognition Using Deep Neural Networks: A Systematic Review. IEEE Access 2019, 7, 19143–19165. [Google Scholar] [CrossRef]
Zhu, Y.; Zhuang, F.; Wang, J.; Ke, G.; Chen, J.; Bian, J.; Xiong, H.; He, Q. Deep Subdomain Adaptation Network for Image Classification. IEEE Trans. Neural Networks Learn. Syst. 2021, 32, 1713–1722. [Google Scholar] [CrossRef] [PubMed]
Lake, B.M.; Salakhutdinov, R.; Tenenbaum, J.B. Human-level concept learning through probabilistic program induction. Science 2015, 350, 1332–1338. [Google Scholar] [CrossRef] [PubMed]
Wang, C.; Wang, Z.; Zhang, S.; Liu, X.; Tan, J. Reinforced quantum-behaved particle swarm-optimized neural network for cross-sectional distortion prediction of novel variable-diameter-die-formed metal bent tubes. J. Comput. Des. Eng. 2023, 10, 1060–1079. [Google Scholar] [CrossRef]
Yang, C.; Li, Z.; Xu, P.; Huang, H. Recognition and optimisation method of impact deformation patterns based on point cloud and deep clustering: Applied to thin-walled tubes. J. Ind. Inf. Integr. 2024, 40, 100607. [Google Scholar] [CrossRef]
Masi, F.; Stefanou, I.; Vannucci, P.; Maffi-Berthier, V. Thermodynamics-based Artificial Neural Networks for constitutive modeling. J. Mech. Phys. Solids 2021, 147, 104277. [Google Scholar] [CrossRef]
Zhang, P.; Yin, Z.-Y. A novel deep learning-based modelling strategy from image of particles to mechanical properties for granular materials with CNN and BiLSTM. Comput. Methods Appl. Mech. Eng. 2021, 382, 113858. [Google Scholar] [CrossRef]
Tancogne-Dejean, T.; Gorji, M.B.; Zhu, J.; Mohr, D. Recurrent neural network modeling of the large deformation of lithium-ion battery cells. Int. J. Plast. 2021, 146, 103072. [Google Scholar] [CrossRef]
Jiao, S.; Li, W.; Li, Z.; Gai, J.; Zou, L.; Su, Y. Hybrid physics-machine learning models for predicting rate of penetration in the Halahatang oil field, Tarim Basin. Sci. Rep. 2024, 14, 5957. [Google Scholar] [CrossRef]
Hu, H.; Qi, L.; Chao, X. Physics-informed Neural Networks (PINN) for computational solid mechanics: Numerical frameworks and applications. Thin-Walled Struct. 2024, 205, 112495. [Google Scholar] [CrossRef]
Herrmann, L.; Kollmannsberger, S. Deep learning in computational mechanics: A review. Comput. Mech. 2024, 74, 281–331. [Google Scholar] [CrossRef]
Linka, K.; Kuhl, E. A new family of Constitutive Artificial Neural Networks towards automated model discovery. Comput. Methods Appl. Mech. Eng. 2023, 403, 115731. [Google Scholar] [CrossRef]
Eggersmann, R.; Kirchdoerfer, T.; Reese, S.; Stainier, L.; Ortiz, M. Model-free data-driven inelasticity. Comput. Methods Appl. Mech. Eng. 2019, 350, 81–99. [Google Scholar] [CrossRef]
Linka, K.; Tepole, A.B.; Holzapfel, G.A.; Kuhl, E. Automated model discovery for skin: Discovering the best model, data, and experiment. Comput. Methods Appl. Mech. Eng. 2023, 410, 116007. [Google Scholar] [CrossRef]
Linka, K.; Hillgärtner, M.; Abdolazizi, K.P.; Aydin, R.C.; Itskov, M.; Cyron, C.J. Constitutive artificial neural networks: A fast and general approach to predictive data-driven constitutive modeling by deep learning. J. Comput. Phys. 2021, 429, 110010. [Google Scholar] [CrossRef]
Fuhg, J.N.; Bouklas, N. On physics-informed data-driven isotropic and anisotropic constitutive models through probabilistic machine learning and space-filling sampling. Comput. Methods Appl. Mech. Eng. 2022, 394, 114915. [Google Scholar] [CrossRef]
Eghtesad, A.; Tan, J.; Fuhg, J.N.; Bouklas, N. NN-EVP: A physics informed neural network-based elasto-viscoplastic framework for predictions of grain size-aware flow response. Int. J. Plast. 2024, 181, 104072. [Google Scholar] [CrossRef]
Abueidda, D.W.; Lu, Q.; Koric, S. Meshless physics-informed deep learning method for three-dimensional solid mechanics. Int. J. Numer. Methods Eng. 2021, 122, 7182–7201. [Google Scholar] [CrossRef]
Abueidda, D.W.; Koric, S.; Abu Al-Rub, R.; Parrott, C.M.; James, K.A.; Sobh, N.A. A deep learning energy method for hyperelasticity and viscoelasticity. Eur. J. Mech. A/Solids 2022, 95, 104639. [Google Scholar] [CrossRef]
Abueidda, D.W.; Koric, S.; Sobh, N.A.; Sehitoglu, H. Deep learning for plasticity and thermo-viscoplasticity. Int. J. Plast. 2021, 136, 102852. [Google Scholar] [CrossRef]
As’ad, F.; Avery, P.; Farhat, C. A mechanics-informed artificial neural network approach in data-driven constitutive modeling. Int. J. Numer. Methods Eng. 2022, 123, 2738–2759. [Google Scholar] [CrossRef]
Cen, J.; Zou, Q. Deep finite volume method for partial differential equations. J. Comput. Phys. 2024, 517, 113307. [Google Scholar] [CrossRef]
Danoun, A.; Prulière, E.; Chemisky, Y. Thermodynamically consistent Recurrent Neural Networks to predict non linear behaviors of dissipative materials subjected to non-proportional loading paths. Mech. Mater. 2022, 173, 104436. [Google Scholar] [CrossRef]
Wang, J.-J.; Wang, C.; Fan, J.-S.; Mo, Y. A deep learning framework for constitutive modeling based on temporal convolutional network. J. Comput. Phys. 2022, 449, 110784. [Google Scholar] [CrossRef]
Benabou, L. Development of LSTM networks for predicting viscoplasticity with effects of deformation, strain rate and temperature history. J. Appl. Mech. 2021, 88, 071008. [Google Scholar] [CrossRef]
Gorji, M.B.; Mozaffar, M.; Heidenreich, J.N.; Cao, J.; Mohr, D. On the potential of recurrent neural networks for modeling path dependent plasticity. J. Mech. Phys. Solids 2020, 143, 103972. [Google Scholar] [CrossRef]
Mozaffar, M.; Bostanabad, R.; Chen, W.; Ehmann, K.; Cao, J.; Bessa, M.A. Deep learning predicts path-dependent plasticity. Proc. Natl. Acad. Sci. USA 2019, 116, 26414–26420. [Google Scholar] [CrossRef]
Abdolazizi, K.P.; Linka, K.; Cyron, C.J. Viscoelastic constitutive artificial neural networks (vCANNs)–A framework for data-driven anisotropic nonlinear finite viscoelasticity. J. Comput. Phys. 2024, 499, 112704. [Google Scholar] [CrossRef]
Banks, H.T.; Hu, S.; Kenz, Z.R. A brief review of elasticity and viscoelasticity for solids. Adv. Appl. Math. Mech. 2015, 3, 1–51. [Google Scholar] [CrossRef]
Hong, W. Modeling viscoelastic dielectrics. J. Mech. Phys. Solids 2011, 59, 637–650. [Google Scholar] [CrossRef]
Reese, S.; Govindjee, S. A theory of finite viscoelasticity and numerical aspects. Int. J. Solids. Struct. 1998, 35, 3455–3482. [Google Scholar] [CrossRef]
Zhou, J.; Jiang, L.; Khayat, R.E. A micro–macro constitutive model for finite-deformation viscoelasticity of elastomers with nonlinear viscosity. J. Mech. Phys. Solids. 2018, 110, 137–154. [Google Scholar] [CrossRef]
Su, X.; Peng, X. A 3D finite strain viscoelastic constitutive model for thermally induced shape memory polymers based on energy decomposition. Int. J. Plast. 2018, 110, 166–182. [Google Scholar] [CrossRef]
Qin, B.; Zhong, Z.; Zhang, T.-Y. A thermodynamically consistent model for chemically induced viscoelasticity in covalent adaptive network polymers. Int. J. Solids. Struct. 2022, 256, 111953. [Google Scholar] [CrossRef]
Gent, A.N. A new constitutive relation for rubber. Rubbery Chem. Technol. 1996, 69, 59–61. [Google Scholar] [CrossRef]
Arruda, E.M.; Boyce, M.C. A three-dimensional constitutive model for the large stretch behaviour of rubber elastic materials. J. Mech. Phys. Solids. 1993, 41, 389–412. [Google Scholar] [CrossRef]
Lecun, Y. A Theoretical Framework for Back-Propagation. In Artificial Neural Networks; Mehra, P., Wah, B., Eds.; IEEE Computer Society Press: Piscataway, NJ, USA, 1992. [Google Scholar]
Cho, K.; Merrïenboer, B.V.; Gulcehre, C.; Bougares, F.; Schwenk, H.; Bahdanau, D.; Bengio, Y. Learning phrase representations using RNN encoder–decoder for statistical machine translation. arXiv 2014, arXiv:1406.1078v3. [Google Scholar]
Amini Niaki, S.; Haghighat, E.; Campbell, T.; Poursartip, A.; Vaziri, R. Physics-informed neural network for modelling the thermochemical curing process of composite-tool systems during manufacture. Comput. Methods Appl. Mech. Engrg 2021, 384, 113959. [Google Scholar] [CrossRef]
Kingma, D.P.; Ba, J.L. Adam: A method for stochastic optimization. arXiv 2015, arXiv:1412.6980v9. [Google Scholar]
Liao, Z.; Hossain, M.; Yao, X.; Mehnert, M.; Steinmann, P. On thermo-viscoelastic experimental characterization and numerical modelling of VHB polymer. Int. J. Non-Linear Mech. 2020, 118, 103263. [Google Scholar] [CrossRef]
Mehnert, M.; Hossain, M.; Steinmann, P. A complete thermo–electro–viscoelastic characterization of dielectric elastomers, Part I: Experimental investigations. J. Mech. Phys. Solids. 2021, 157, 104603. [Google Scholar] [CrossRef]

Figure 1. The generalized Maxwell model for solids under large deformation. It is assumed to be equivalent to an equilibrium spring with a deformation gradient F, and n parallel Maxwell element, each characterized by an elastic deformation gradient

F_{i}^{e}

and a viscous deformation gradient

F_{i}^{v}

(1 ≤ i ≤ n).

Figure 2. The fitting stress–stretch curves between theoretical model and experimental data.

Figure 3. Theoretical stress–stretch curves at different stretching rates.

Figure 4. (a) GRU–FNN architecture; (b) details of GRU; (c) details of FNN.

Figure 5. Evolution of the loss during training and testing with respect to epochs.

Figure 6. The comparison of the stress–strain curves from the training data and the model’s prediction after 5000 epochs.

Figure 7. The comparison of the stress–strain curves from testing data and model’s prediction after 5000 epochs.

Figure 8. Evolution of loss with respect to epochs during training and testing at 273 K.

Figure 9. The comparison of the stress–stretch curves from the experimental data and the model’s prediction at 273 K after 5300 epochs.

Figure 10. Evolution of the loss with respect to epochs during training and testing at 273 K.

Figure 11. The comparison of the stress–strain curves from the experimental data and the model’s prediction at 273 K.

Figure 12. Evolution of loss with respect to epochs during training and testing at 313 K.

Figure 13. The comparison of the stress–strain curves from the experimental data and the model’s prediction at 313 K.

Figure 14. Evolution of loss with respect to epochs during training and testing at 333 K.

Figure 15. The comparison of the stress–strain curves from the experimental data and the model’s prediction at 333 K.

Figure 16. The comparison of the stress–strain curves from the model’s prediction at 273 K and the experimental data with (a) 0.5% noise and (b) 1.0% noise.

Table 1. Information of the GRU-FNN model.

Layer	Output Shape	Parameters
GRU 1	(None, 100,100)	31,200
GRU 2	(None, 100,100)	60,600
GRU 3	(None, 100,100)	60,600
Dense	(None, 100,1)	101

Table 3. Statics of RMSE values.

Cases with Noise	Stretching Rates (/s)	RMSE Values (kPa)
273 K_0.5%	(0.03,0.05,0.10)	(1.93,2.17,8.28)
273 K_1.0%	(0.03,0.05,0.10)	(2.16,3.80,20.30)
313 K_0.5%	(0.025,0.05,0.10,0.20)	(0.45,0.59,0.84,4.55)
313 K_1.0%	(0.025,0.05,0.10,0.20)	(1.84,1.23,0.83,4.73)
333 K_0.5%	(0.025,0.05,0.10,0.20)	(1.37,0.84,0.66,5.27)
333 K_1.0%	(0.025,0.05,0.10,0.20)	(0.50,0.54,0.59,5.16)

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

A Physics-Guided Machine Learning Model for Predicting Viscoelasticity of Solids at Large Deformation

Abstract

1. Introduction

2. Thermodynamic Formulation of Constitutive Laws for Viscoelastic Solids at Large Deformation

3. Machine Learning Method for Predicting Viscoelasticity

3.1. Gated Recurrent Unit

3.2. Feedforward Neural Network

3.3. Training Procedure

4. Results and Discussions

4.1. Initialization of GRU-FNN Parameters by Training Theoretical Data

4.2. Initialization of GRU–FNN Parameters by Training Theoretical Data of Viscoelasticity of VHB4905 with Scarce Data from Experiments

4.3. Analyzing the Sensitivity of the GRU-FNN Model

5. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Article Metrics

Citations

Article Access Statistics