Article

A Neural Network Constitutive Model and Automatic Stiffness Evaluation for Multiscale Finite Elements

by Aliki D. Mouratidou * and Georgios E. Stavroulakis
Institute of Computational Mechanics and Optimization, School of Production Engineering and Management, Technical University of Crete, 731 00 Chania, Greece
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(7), 3697; https://doi.org/10.3390/app15073697
Submission received: 20 January 2025 / Revised: 17 March 2025 / Accepted: 17 March 2025 / Published: 27 March 2025
(This article belongs to the Special Issue Applications in Neural and Symbolic Artificial Intelligence)

Abstract

A neural network model for a constitutive law of nonlinear structures is proposed in this paper. The artificial neural network (ANN) model is constructed from a data set of responses of representative volume elements, calculated by finite elements, using an open-source scientific machine learning platform. The tangential stiffness matrix, which can be used within a multiscale finite element analysis, is calculated via automatic differentiation. Two types of constitutive neural networks are proposed. The first approach trains a residual model with three respective surrogate models for the stress components, which reduces the computational cost. The second approach considers a separate ANN for each stress component, which ensures a high rate of convergence. The numerical results are compared with the given data set as well as with the results obtained after applying a polynomial regression. The loss function is considered both without and including the Sobolev metrics. In addition, a physics-informed constitutive neural model, which enforces hyperelasticity principles, is also analyzed. The choice of the hyperparameters is discussed.

1. Introduction

Multiscale modeling is a simulation technique that describes a behavior at a given length scale based on the physics at a finer scale, which in turn is considered to be better known or easy to model. Structures made of composite materials or materials with microstructure constitute typical candidates for multiscale modeling. In fact, it is impossible to model a whole structure by taking into consideration all constituents and their possible nonlinear interactions within the model. Therefore, a compromise must be considered wherein a reduction in precision and an increase in uncertainty related to multiscale modeling should be accepted in order to avoid the complexity of a full-scale model. In engineering, homogenization is a representative example of applied multiscale modeling [1]. In linear problems, homogenization is usually based on analytical expressions that give the homogenized properties for a given microstructure or composite material [2,3]. This approach is powerful, despite certain theoretical difficulties for the calculation of homogenized properties. Numerical homogenization allows for the extension of classical analytical homogenization in order to describe more complicated problems. Homogenized properties are load dependent or even path dependent; therefore, analytical solutions are either difficult to calculate or do not exist. A path-independent problem is considered here as a first example.
ANNs have been applied in different areas of science and engineering where data sets are available for training and testing. In particular, ANNs have been employed in elastoplastic and contact problems in mechanics by using a minimization of energy. Hopfield and Tank neural networks have been proposed for such computations by Kortesis and Panagiotopoulos [4] and Avdelas et al. [5]. Feedforward NNs trained by the backpropagation algorithm have been used for the approximation of several problems in mechanics based on examples (supervised learning). The buckling loads in nonlinear problems for elastic plates have been calculated by neural networks in Muradova and Stavroulakis [6]. Inverse and parameter identification problems in mechanics have been solved using backpropagation neural networks in Stavroulakis [7]. A recent review of the classical usage of neural networks within computational mechanics has been published by Yagawa and Oishi [8].
Several neural network-based constitutive models have been proposed in recent works. The main directions are described in the review article by Dornheim et al. [9].
The work described here belongs to the more general topic of CANN (constitutive ANN). In fact, from input–output relations, an ANN is able to learn and replace a mathematically consistent constitutive material model. Adding additional information, like thermomechanical restrictions, enhances the effectiveness and allows for training with less data. Further information can be found in, among others, references [9,10,11].
Here, an efficient way to store the constitutive relation based on a database of responses from a representative volume element (RVE) is proposed through the construction of a neural network constitutive model. Subsequently, the usage of this model within a nonlinear, upper-level finite element model is straightforward.
The Sobolev metrics (e.g., [12,13]) are also tested in the minimization problem. The proposed neural model allows the material response to be predicted from the strain, based on the data set $\{\varepsilon_{i}, \sigma_{i}, (\partial\sigma/\partial\varepsilon)_{i},\ i = 1, 2, \dots, N\}$.
Two types of constitutive neural networks are proposed. The first approach involves the training of a residual model with three respective surrogate models for the stress components. The second approach considers a separate ANN for each stress component. The first neural model requires less computational cost, since only one residual model has to be trained. However, the second approach is preferred when the data set of the tangent stiffness tensor (the partial derivatives of the stress tensor) is fitted in the loss function. In this case, the desired accuracy can be reached with a smaller number of epochs. The proposed method has a wide range of applicability, e.g., to nonlinear structures and hyperelastic materials.
For the fitting of the backpropagation model, data from the representative volume element, calculated using finite elements, have been used. Thus, the target and test data have been collected from numerical experiments. Two splits of the RVE response data set are considered. First, the data are split into training (50%) and validation/test (50%) sets in Examples 1, 2 and 4; then, the data samples are split into 80% training and 20% (10% for validation and 10% for testing/prediction) in Example 3. The computed loss, validation and prediction errors are presented in tables. As the numerical experiments have shown, there is no significant difference between these two splits.
The results are also compared with a third-degree polynomial regression. The results of the comparison are presented in figures. Although the computational cost of the linear regression model is lower than that of training the neural network, there are advantages to using ANNs. Namely, the material stiffness matrix, i.e., the partial derivatives of the stress function with respect to the components of the strain tensor, is approximated more accurately, owing to the smooth tanh activation (basis) functions of the deep network. Fitting the stress data and their derivative data in the loss function provides stability and good accuracy.
The data should be carefully arranged in order to prevent overfitting. The developed code includes provisions to avoid this phenomenon using standard Python tools. The data are not shuffled before the training process, so that they remain smoothly increasing or decreasing. Numerous numerical experiments have shown that this guarantees the stability of the training process, allows for an optimal updating of the weights and biases in the backpropagation, and increases the rate of convergence to the original (target) data set. Moreover, it properly captures the nonlinear behavior of the considered materials.
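For illustration, an ordered split of this kind can be sketched as follows; the array names and dummy data are not from the paper, and epoch-wise shuffling is also disabled in the Keras fit call:

```python
import numpy as np

def ordered_split(eps_data, sig_data, train_fraction=0.5):
    """Split the (ordered) strain/stress samples without shuffling them."""
    n_train = int(train_fraction * len(eps_data))
    return (eps_data[:n_train], sig_data[:n_train],   # training part
            eps_data[n_train:], sig_data[n_train:])   # validation/test part

# Example with a smoothly increasing dummy strain path of 100 samples:
eps = np.linspace(-0.01, 0.01, 100).reshape(-1, 1).repeat(3, axis=1)
sig = 2.0 * eps                                       # placeholder stress values
eps_tr, sig_tr, eps_va, sig_va = ordered_split(eps, sig)

# During training, shuffling between epochs is switched off as well, e.g.:
# model.fit(eps_tr, sig_tr, validation_data=(eps_va, sig_va),
#           epochs=4000, batch_size=64, shuffle=False)
```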
The data set derived from the nonlinear polynomial constitutive model for hyperelastic materials has also been used to train the constructed neural model. Here, the loss function includes a constraint on the second derivative of the strain energy function, $\partial\sigma/\partial\varepsilon = \partial^{2}\Psi/\partial\varepsilon^{2} \geq 0$, namely a convexity condition that enforces realistic physical behavior and provides material stability during deformation. The numerical results obtained from the constructed neural network are compared with the exact values at the predicted points.
An ANN is used here together with the automatic differentiation of the TensorFlow scientific library 2.14.0 in Python 3.11 [14]. The neural model includes automatic differentiation [15] to calculate the partial derivatives of the ANN output. Automatic differentiation is performed through the chain rule applied to the neural stress models. In the residual model, in addition to fitting the stress tensor data set, the given tangent stiffness matrix obtained from the representative volume element is also fitted.

2. Background of the Constitutive Metamodel Based on Responses of Representative Volume Elements

Numerical homogenization uses an RVE with several materials (composite) or microstructure and calculates homogenized properties, which in turn are used for the study of a whole structure with a continuous model [16,17]. In nonlinear behavior, this step must be repeated for each level of stress response. This approach has been proposed to facilitate multiscale modeling and avoid the more accurate but expensive FE² method [18]. FE² requires the solution of the detailed model of the RVE for every different combination of loadings and points in the structure that appear during the incremental iterative solution of the homogenized finite element model.
Classical polynomial surrogates, neural networks and other metamodels have been proposed and tested in [1,19,20,21,22,23,24,25].
A feedforward neural network has the ability to correlate vector inputs with vector outputs provided that the parameters involved have been suitably calculated. This step is accomplished with the help of data (examples) within a step called training. The topology of the neural network (number of layers and neurons) as well as the activation functions (basis functions) at nodes in correlation with the number of examples define the ability of the network to approximate well an unknown target function. Trial and error or optimal design principles can be used toward a satisfactory result [26].
A modern approach for introducing physics into the neural network metamodel uses differential equations of mechanics in combination with the automatic differentiation of output of the neural network, in order to train the network, without using specific input–output sample examples. This is the so-called physics-informed neural networks approach ([14,25,27,28,29,30], etc.). Within the homogenization metamodel, PINN could provide thermomechanically consistent approximations that are more stable than those of the blind neural network.
ANNs are able to simulate classical mechanical problems if sufficient samples of input–output data are provided for training. The concept is shown schematically in Figure 1 and Figure 2 ([1]) for the stress–strain relation of the RVE. The RVE, of arbitrary complexity, is solved for different combinations of stresses, and the corresponding strains are calculated and used for training. More refined approaches introduce additional physical restrictions on the corresponding constitutive metamodel so that the thermomechanical principles are not violated [25]. Furthermore, loading sequences can be used in a path-dependent problem to train a corresponding neural network that will provide the response based on previous loading sequences at a given point.
The trained neural network can be integrated into a finite element program, where it replaces the required constitutive response, as shown in Figure 1 (see, among others, [1,2,31,32]). The material constitutive law is based on experimental or RVE data, i.e., either on an established mechanical material law or on a metamodel (interpolation). This approach is close to the classical usage of constitutive relations, and it can be integrated with existing commercial finite element packages [33,34,35,36].
The results used here correspond to a masonry structure with different stone and mortar materials subjected to various stress–strain levels. The representative volume element has been discretized by the finite element method. The results have been collected in a database that is used for the training of the neural network. A MATLAB v.2021a implementation for the multiscale analysis of two-dimensional problems, as well as the database, is available as Supplementary Material to reference [1]. The masonry RVE has been used for the creation of the database.
The plastic deformations calculated under a given strain loading are shown in Figure 3 ([1]). The method has been described in reference [37], where the reader can find more details.

3. Feedforward and Backpropagation in the Neural Model with a Computation of Tangential Stiffness Matrix

The computational approach initially proposed and presented in [1] uses two neural network metamodels, one for the approximation of the constitutive relation and the second for the approximation of its first derivative, that is, the calculation of the tangential stiffness matrix. Here, following ideas from the PINN community, we construct a neural network based on the scientific software TensorFlow v.2.14.0, which provides automatic differentiation for the calculation of the partial derivatives and also enforces material consistency.
Two types of neural networks are introduced and trained. The first neural network consists of one residual model and three surrogate models, which are intended to compute the components of the stress tensor $\sigma_{xx}$, $\sigma_{yy}$ and $\sigma_{xy}$ ($\sigma_{yx} = \sigma_{xy}$). Namely, there are three NNs with three inputs, the strain components $\varepsilon_{xx}$, $\varepsilon_{yy}$ and $\varepsilon_{xy}$ ($\varepsilon_{yx} = \varepsilon_{xy}$), and one output, which is a component of the stress tensor; that is, the first NN has output $\sigma_{xx}^{NN}$, the second one has output $\sigma_{yy}^{NN}$ and the third has output $\sigma_{xy}^{NN}$. The residuals of the outputs are treated simultaneously. The architecture of the proposed bundle of NNs is presented in Figure 4. In the second approach, there exist three separate neural networks, one for each component of the stress tensor (Figure 5). The loss error function for each output is minimized separately. This can be accomplished because the components of the stress tensor can be computed independently.
In the feedforward pass of the neural network, the output vector of each layer is calculated as follows:
$$z_{0} = (\varepsilon_{11}, \varepsilon_{22}, \varepsilon_{12})^{T},$$
$$z_{k} = f(W_{k}^{T} z_{k-1} + b_{k}), \quad k = 1, 2, \dots, N_{L} - 1, \qquad \sigma_{ij}^{NN} = z_{N_{L}} = W_{N_{L}}^{T} z_{N_{L}-1} + b_{N_{L}}, \quad i, j = 1, 2,$$
where $\varepsilon_{11} = \varepsilon_{xx}$, $\varepsilon_{12} = \varepsilon_{21} = \varepsilon_{xy} = \varepsilon_{yx}$, $\varepsilon_{22} = \varepsilon_{yy}$, $\sigma_{11} = \sigma_{xx}$, $\sigma_{12} = \sigma_{21} = \sigma_{xy} = \sigma_{yx}$, $\sigma_{22} = \sigma_{yy}$, and $W_{k}^{T}$ is the matrix of dimension $(n_{k-1} \times n_{k})$ containing the weights between the $(k-1)$-th and $k$-th hidden layers with $n_{k-1}$ and $n_{k}$ neurons, respectively. $W_{N_{L}}^{T}$ is the matrix that contains the weights between the last hidden layer and the output of the neural network, and the $b_{k}$ denote the biases. There is no activation function $f$ on the output layer in the proposed Keras model, but tf.keras.layers.Activation can be used to apply an activation function to the output if needed, for example, tf.keras.layers.Activation(tf.nn.relu).
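As a minimal sketch of one such surrogate network in Keras (layer sizes are illustrative and the function name is not from the paper), the mapping from the three strain components to one stress component with tanh hidden layers and a linear output layer could read:

```python
import tensorflow as tf

def build_stress_surrogate(hidden_layers=(15, 20, 15, 20)):
    """Surrogate NN: strain vector (eps_xx, eps_yy, eps_xy) -> one stress component."""
    inputs = tf.keras.Input(shape=(3,))              # z_0 = (eps_11, eps_22, eps_12)
    x = inputs
    for n in hidden_layers:                          # z_k = tanh(W_k^T z_{k-1} + b_k)
        x = tf.keras.layers.Dense(n, activation="tanh")(x)
    outputs = tf.keras.layers.Dense(1)(x)            # linear output layer, no activation
    return tf.keras.models.Model(inputs, outputs)

sigma_xx_net = build_stress_surrogate()
sigma_xx_net.compile(optimizer="adam", loss="mse")
```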
The residual parts are computed as
$$MSE = MSE_{\sigma} + MSE_{S},$$
where $MSE_{\sigma}$ is the loss error function for training on the strain and stress tensor data and $MSE_{S}$ is the Sobolev loss error function [12]. The residual $MSE_{\sigma}$ is written as
$$MSE_{\sigma} = |\sigma_{11}^{NN} - \sigma_{11}^{\mathrm{data}}| + |\sigma_{22}^{NN} - \sigma_{22}^{\mathrm{data}}| + |\sigma_{12}^{NN} - \sigma_{12}^{\mathrm{data}}|,$$
where $\sigma_{ij}^{NN} = \sigma_{ij}^{NN}(\varepsilon)$, $i, j = 1, 2$ ($\sigma_{12} = \sigma_{21}$), are the outputs of the neural network, namely the values of the stress tensor computed by the composed constitutive neural network from the data set of the variable $\varepsilon = (\varepsilon_{11}, \varepsilon_{22}, \varepsilon_{12})$, and $N$ is the number of values (samples) used for fitting the given stress tensor data set. If the residuals of the surrogate models $\sigma_{xx}^{NN}(\varepsilon)$, $\sigma_{yy}^{NN}(\varepsilon)$, $\sigma_{xy}^{NN}(\varepsilon)$ are treated separately, i.e., there are three neural networks (Figure 5), then, instead of (4), the following formula holds:
$$MSE_{\sigma_{ij}} = |\sigma_{ij}^{NN} - \sigma_{ij}^{\mathrm{data}}|,$$
and in Figure 5, there are three residuals, $MSE_{\sigma_{11}}$, $MSE_{\sigma_{22}}$ and $MSE_{\sigma_{12}}$, one for each surrogate neural network.
In the case of the Sobolev metrics in the neural network [12], derivative information can easily be incorporated into the training process of the neural model for $\sigma$ by making the derivatives of the neural network match those of $\sigma$. Suppose we have access to the partial derivatives of the stress tensor $\sigma$ with respect to the input $\varepsilon$ up to order $K$,
$$\{\varepsilon, \sigma(\varepsilon), D_{\varepsilon}^{1}(\sigma(\varepsilon)), \dots, D_{\varepsilon}^{K}(\sigma(\varepsilon))\},$$
where $D_{\varepsilon}^{k}(\sigma(\varepsilon)) = \partial^{k}\sigma/\partial\varepsilon^{k}$, $k = 1, 2, \dots, K$. Then, the Sobolev error loss function $MSE_{S}$ can be included in (4) as well, i.e.,
$$MSE = \sum_{ij}\left[\,|\sigma_{ij}^{NN} - \sigma_{ij}^{\mathrm{data}}| + \sum_{k=1}^{K} |D_{\varepsilon}^{k}(\sigma_{ij}^{NN}) - D_{\varepsilon}^{k}(\sigma_{ij}^{\mathrm{data}})|\,\right],$$
and for the neural architecture of Figure 5, in virtue of (5),
$$MSE_{\sigma_{ij}} = |\sigma_{ij}^{NN} - \sigma_{ij}^{\mathrm{data}}| + \sum_{k=1}^{K} |D_{\varepsilon}^{k}(\sigma_{ij}^{NN}(\varepsilon)) - D_{\varepsilon}^{k}(\sigma_{ij}^{\mathrm{data}})|.$$
The numerical experiments have shown that the results are improved by using the Sobolev function. Since numerical differentiation is not a stable process, many training iterations are needed to obtain good accuracy for the derivatives. With small perturbations (errors) in the output of the neural network (the components of the stress tensor), one can obtain quite a large error in the derivatives of the components. Sobolev training can overcome this disadvantage by matching the derivatives of the NN output. Then, the rate of convergence is higher, and good accuracy can be reached quickly.
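A sketch of such a Sobolev-type loss ($K = 1$) for one surrogate network, with the first derivatives obtained by automatic differentiation, is given below; the function and variable names are illustrative, and squared errors are used in both terms:

```python
import tensorflow as tf

def sobolev_loss(model, eps_batch, sig_batch, dsig_batch):
    """MSE on the stress output plus MSE on its strain derivatives (K = 1)."""
    with tf.GradientTape() as tape:
        tape.watch(eps_batch)
        sig_nn = model(eps_batch)                 # sigma^NN(eps), shape (N, 1)
    dsig_nn = tape.gradient(sig_nn, eps_batch)    # d sigma^NN / d eps via AD, shape (N, 3)
    loss_sigma = tf.reduce_mean(tf.square(sig_nn - sig_batch))
    loss_sobolev = tf.reduce_mean(tf.square(dsig_nn - dsig_batch))
    return loss_sigma + loss_sobolev
```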
In backpropagation, the chain rule from calculus is used to compute the gradients. The output of an NN is considered as a composite function. Therefore, its derivative is equal to the product of the derivatives of its components. In the context of backpropagation, this means that the gradients of the loss with respect to the weights of a given layer depend on the gradients of the loss with respect to the weights of the subsequent layer. Once the gradients have been computed, they are used to update the weights of the network using a suitable optimization algorithm (e.g., gradient descent). The optimization method adjusts the weights in the direction that minimizes the loss, allowing the network to improve its performance.
The target solution is represented through an expansion in basis functions (sigmoid, hyperbolic tangent, etc.). The network is trained iteratively on a set of input–output $(\varepsilon, \sigma)$ data, the weights are updated, and the process is repeated until the network has learned to produce accurate outputs for a wide range of inputs. One of the advantages of backpropagation is that it allows neural networks to learn from their mistakes and improve their performance over time. This is particularly useful in applications where the relationship between the inputs and outputs is nonlinear (e.g., in macrostructures). However, the procedure of backpropagation can be computationally expensive, particularly for large networks and when the network becomes too complex and starts to fit the training data too closely, leading to poor performance on new, unseen data.
The automatic differentiation (AD) method, built into the TensorFlow 2.14.0 library, computes exact gradients using the chain rule, resulting in more accurate derivatives than finite differences. In backpropagation, the gradients are computed with respect to the weights and the biases. The derivatives of the output function with respect to the input variables are computed using AD with the help of the chain rule. However, since the output of the neural network is only an approximation of the exact solution, i.e., the output data carry some perturbation, this perturbation (error) does not decrease under differentiation and more likely increases. The derivatives computed by AD can be compared with numerical differentiation, e.g., FD approximations (forward, backward and central) or other well-known numerical differentiation methods. For example, for the forward-difference approximation of the partial derivatives of $\sigma$, we have
$$D_{\varepsilon_{kl}}^{1}\sigma_{ij} = \frac{\Delta\sigma_{ij}}{\Delta\varepsilon_{kl}} + O(\Delta\varepsilon_{kl}), \quad i, j = 1, 2, \quad k, l = 1, 2.$$
If the output of the neural network has an error $\delta_{ij}^{NN} = \sigma_{ij}^{NN} - \sigma_{ij}^{\mathrm{data}}$, then
$$D_{\varepsilon_{kl}}^{1}\sigma_{ij}^{NN} - D_{\varepsilon_{kl}}^{1}\sigma_{ij} = \frac{\Delta\delta_{ij}^{NN}}{\Delta\varepsilon_{kl}} + O(\Delta\varepsilon_{kl}), \quad i, j = 1, 2.$$
Thus,
$$\|D_{\varepsilon}^{1}\sigma^{NN} - D_{\varepsilon}^{1}\sigma\| = O\!\left(\frac{\|\delta^{NN}\|}{\epsilon}\right) + O(\|\Delta\varepsilon\|),$$
where $\|D_{\varepsilon}^{1}\sigma^{NN} - D_{\varepsilon}^{1}\sigma\|$ is the Euclidean norm, $\|\delta^{NN}\| = \max_{ij}|\delta_{ij}^{NN}|$, $\epsilon = \min_{ij}|\Delta\varepsilon_{ij}|$, and $\|\Delta\varepsilon\| = \max_{ij}|\Delta\varepsilon_{ij}|$. The same estimate holds for the backward differences. Analogously, for the central differences, we obtain
$$\|D_{\varepsilon}^{1}\sigma^{NN} - D_{\varepsilon}^{1}\sigma\| = O\!\left(\frac{\|\delta^{NN}\|}{\epsilon}\right) + O(\|\Delta\varepsilon\|^{2}).$$
Hence, one concludes that the error of numerical differentiation increases for a large perturbation $\|\delta^{NN}\|$. The derivatives computed from the neural model with the use of AD can be compared with the approximate derivatives computed by FD.
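Such a comparison can be sketched in a few lines, placing the AD derivative from tf.GradientTape next to a central finite-difference quotient; the model, the strain point and the step size are illustrative and not taken from the paper:

```python
import tensorflow as tf

def compare_ad_with_fd(model, eps_point, component=0, h=1e-4):
    """Compare d(sigma^NN)/d(eps_component) from AD with a central FD quotient."""
    eps = tf.constant([eps_point], dtype=tf.float32)          # shape (1, 3)
    with tf.GradientTape() as tape:
        tape.watch(eps)
        sig = model(eps)
    d_ad = tape.gradient(sig, eps)[0, component]              # AD derivative

    delta = tf.one_hot(component, 3, dtype=tf.float32) * h    # perturb one strain component
    d_fd = (model(eps + delta) - model(eps - delta)) / (2.0 * h)
    return float(d_ad), float(d_fd[0, 0])
```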
According to the results presented in [38] for a common class of ANNs with one hidden layer, the mean integrated squared error between the estimated network and a target function $\sigma$ is bounded by
$$O\!\left(\frac{C_{\sigma}^{2}}{n}\right) + O\!\left(\frac{n d}{N}\log N\right),$$
where $n$ is the number of neurons in the hidden layer, $d$ is the input dimension, $N$ is the number of training observations (samples), and $C_{\sigma}$ is the first absolute moment of the Fourier magnitude distribution of the target function $\sigma$; i.e., $C_{\sigma}$ quantifies the regularity of the function via an integral involving the Fourier transform (see Formula (2) in [38]). There are two contributions to this total risk: the approximation error and the estimation error. The approximation error is the distance between the target function and the closest neural network function of the given architecture, and the estimation error is the distance between this ideal network function and the estimated network function. The constant $C_{\sigma}$ can be exponentially large in $d$ for sequences of functions $\sigma$ of increasing dimensionality. Networks with many hidden layers can improve accuracy in some cases.
From Formulas (10), (11) and the estimate (12), one concludes that the error of the neural network influences the total estimate of the error of computation of the derivatives of the output function. However, with AD, the computations are more accurate because the derivatives are computed exactly by the chain rule.
Since the tangent stiffness matrix is symmetric, only the components located in its lower or upper triangular part are used, which decreases the computation time. In the feedforward pass, the nonlinear tanh activation function is used here; it provides better approximation in comparison with the other activation functions. In backpropagation, the weights and biases are usually updated with a stochastic optimizer, such as Stochastic Gradient Descent (SGD) or Adam’s method [39]. The residual model updates the weights and biases of the surrogate networks using the residual it computes. Here, we use the Adam optimization algorithm.
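As an illustration of the AD-based evaluation of the tangent stiffness, the following sketch (assuming a hypothetical model that maps the three strain components to the three stress components) returns the full $3 \times 3$ matrix $\partial\sigma_{ij}/\partial\varepsilon_{kl}$ for every sample; when symmetry holds, only its lower or upper triangle needs to be stored:

```python
import tensorflow as tf

def tangent_stiffness(stress_model, eps_batch):
    """d sigma / d eps for each sample; stress_model maps (N, 3) strains to (N, 3) stresses."""
    eps = tf.convert_to_tensor(eps_batch, dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(eps)
        sig = stress_model(eps)                  # columns: sigma_xx, sigma_yy, sigma_xy
    return tape.batch_jacobian(sig, eps)         # shape (N, 3, 3), one tangent matrix per sample
```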

4. Computational Procedure

In this section, a computational algorithm for the construction of a neural network constitutive metamodel is presented. The procedure is implemented using the automatic differentiation of TensorFlow 2.14.0 and the Keras library of Python [40]. To optimize the weights and biases, Adam’s optimization algorithm is applied, and the training set is divided into batches. The flowchart summarizing the process of constructing the neural network is presented in Figure 6.
The steps of the implementation, together with some methods/functions of the Python programming code, are described below; a condensed code sketch of the main steps is given after the list. (The full programming code is available from the Supplementary Material of the article.)

Computational Algorithm

  • Import all necessary Python’s libraries:
    import tensorflow as tf, import keras, import matplotlib.pyplot as plt, and import csv.
  • Set up the input for the neural surrogate networks from the data set of the strain tensor. There are three inputs, $\varepsilon = [\varepsilon_{xx}, \varepsilon_{yy}, \varepsilon_{xy}]$, for each NN, and each input is a vector of values.
  • Set up the output, training and test samples for the neural networks as well as the data set of the stress tensor $\sigma = [\sigma_{xx}, \sigma_{yy}, \sigma_{xy}]$ for fitting while minimizing the error loss function.
  • Set up a number of training iterations, epochs and batches for the neural networks.
  • Read the data set, $\varepsilon$, $\sigma$ and $\partial\sigma/\partial\varepsilon$ if available, from txt/csv files with the np.genfromtxt function. A part of the data set is used for training, and the remaining data are used for testing.
  • Normalize $\varepsilon$ and $\sigma$, if necessary, with
    $$\hat{\varepsilon} = a + (b - a)\,\frac{\varepsilon - \underline{\varepsilon}}{\overline{\varepsilon} - \underline{\varepsilon}}, \qquad \hat{\sigma} = a + (b - a)\,\frac{\sigma - \underline{\sigma}}{\overline{\sigma} - \underline{\sigma}},$$
    where $[a, b]$ is the new segment, $\underline{\varepsilon} = \min_{ij}\varepsilon_{ij}$, $\overline{\varepsilon} = \max_{ij}\varepsilon_{ij}$, $\underline{\sigma} = \min_{ij}\sigma_{ij}$ and $\overline{\sigma} = \max_{ij}\sigma_{ij}$, and use the chain rule for normalizing and computing derivatives, i.e.,
    $$\frac{\partial\hat{\sigma}}{\partial\hat{\varepsilon}} = \frac{\partial\hat{\sigma}}{\partial\sigma}\,\frac{\partial\sigma}{\partial\varepsilon}\,\frac{\partial\varepsilon}{\partial\hat{\varepsilon}} = \frac{\overline{\varepsilon} - \underline{\varepsilon}}{\overline{\sigma} - \underline{\sigma}}\,\frac{\partial\sigma}{\partial\varepsilon}.$$
  • Create a class/function object in Python allowing automatic differentiation using tensorflow tf.GradientTape module.
  • Set up a number of neurons and layers for the NNs.
  • Group the layers (input, hidden and output) and neurons into an object with training/inference features for the surrogate net metamodels, with three inputs and one output, based on the Keras modules tf.keras.Input, tf.keras.layers.Dense, tf.keras.models.Model and tf.keras.layers.Input.
  • Call the class/function for Automatic Differentiation, which is defined in Step 7.
  • Define the input and the output for the NNs using the module tf.keras.models.Model, with the inputs and the training items in the list of outputs.
  • Create training and test data, including the training variables for the inputs $[\varepsilon_{xx}, \varepsilon_{yy}, \varepsilon_{xy}]$ and for the outputs $\sigma_{xx}$, $\sigma_{yy}$ and $\sigma_{xy}$. In the case of the Sobolev function, $D_{\varepsilon}^{1}(\sigma(\varepsilon)), \dots, D_{\varepsilon}^{K}(\sigma(\varepsilon))$ are also included if the corresponding data are available.
  • Give a formulation of the residuals (4), (5), (6) or (7) for fitting the NN outputs $[\sigma_{xx}^{NN}, \sigma_{yy}^{NN}, \sigma_{xy}^{NN}]_{i}$ to the data set, together with their partial derivatives if available.
  • Choose an activation function. Here, the tanh function is used.
  • Compile the residual neural models using the Keras module keras.models.Model.compile with the help of the built-in Adam optimizer and Mean Square Error modules.
  • Train the neural metamodels using keras.models.Model.fit with the training input and output data (backpropagation). If hyperelastic principles are enforced, include the inequality for the strain energy function,
    $$\frac{\partial\sigma}{\partial\varepsilon} = \frac{\partial^{2}\Psi}{\partial\varepsilon^{2}} \geq 0,$$
    in the loss function and use the module tf.nn.relu($-\partial\sigma/\partial\varepsilon$) in tf.keras.models.Model.
  • Stop the simulation if overfitting occurs (the validation loss starts to increase). In this case, add a penalty to the loss function for large weights, simplify the NN by reducing the number of layers or neurons, and check the data set, augmenting or correcting it if necessary. Then go back to the previous step.
  • Transform the normalized output results $\sigma$ and the derivatives back to the true (physical) values.
  • Plot graphs for the model accuracy, model loss and prediction results of the NN outputs with the plot and history() functions, which are readily available in Python, and save the obtained data with, e.g., np.savetxt() and the figures with plt.savefig().
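A condensed sketch of the main steps of this algorithm is given below; the file name, column layout and hyperparameters are illustrative assumptions, and the full code is provided in the Supplementary Material.

```python
import numpy as np
import tensorflow as tf

# Read the strain/stress samples (illustrative file name; three strain columns
# followed by three stress columns are assumed).
data = np.genfromtxt("rve_dataset.csv", delimiter=",")
eps, sig = data[:, :3].astype("float32"), data[:, 3:6].astype("float32")

# Min-max normalization to [a, b] = [0, 1].
def normalize(x):
    lo, hi = x.min(), x.max()
    return (x - lo) / (hi - lo), lo, hi

eps_n, eps_lo, eps_hi = normalize(eps)
sig_n, sig_lo, sig_hi = normalize(sig)

# Ordered split (no shuffling) into training and validation parts.
n_train = int(0.5 * len(eps_n))
eps_tr, sig_tr = eps_n[:n_train], sig_n[:n_train]
eps_va, sig_va = eps_n[n_train:], sig_n[n_train:]

# One surrogate network per stress component, tanh hidden layers, linear output.
def build_net(hidden=(15, 20, 15, 20)):
    inp = tf.keras.Input(shape=(3,))
    x = inp
    for n in hidden:
        x = tf.keras.layers.Dense(n, activation="tanh")(x)
    return tf.keras.models.Model(inp, tf.keras.layers.Dense(1)(x))

nets = [build_net() for _ in range(3)]           # sigma_xx, sigma_yy, sigma_xy

# Compile with Adam and MSE; train with early stopping to limit overfitting.
stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=200,
                                        restore_best_weights=True)
for i, net in enumerate(nets):
    net.compile(optimizer="adam", loss="mse")
    net.fit(eps_tr, sig_tr[:, i:i + 1], validation_data=(eps_va, sig_va[:, i:i + 1]),
            epochs=4000, batch_size=64, shuffle=False, verbose=0, callbacks=[stop])

# Map a prediction back to physical units (inverse of the min-max map).
sig_xx_pred = nets[0](eps_va).numpy() * (sig_hi - sig_lo) + sig_lo
```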

5. Numerical Results

5.1. Example 1

The proposed neural constitutive network with the architecture presented in Figure 5 has been trained with different numbers of epochs, layers and neurons. The data set for training and testing from the representative volume elements contains in total 9261 pairs of input–output values, i.e., of $\varepsilon$ and $\sigma$. For $\varepsilon$, we have
$$\varepsilon_{ij}^{m} = \varepsilon_{ij}^{m-1} + h, \quad h = 0.001, \quad \varepsilon_{ij}^{1} = 0.01, \quad m = 2, 3, \dots, M, \quad M = 21.$$
In this example, neural networks consisting of four hidden layers with [15, 20, 15, 20] neurons have been considered. The batch size is 64, and from the data set, M = 168 values of the stress tensor (84 for training and 84 for test validation) are used to construct the neural constitutive models with computation of the partial derivatives of the stress tensor. The batch size, an important hyperparameter in machine learning and deep learning, refers to the number of training samples used in one iteration of model training. It influences the training time, the efficiency and the model performance. The batch size is set when training the model, e.g., in the Keras library via the call keras.models.Model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=n_epochs, batch_size=64, verbose=2, callbacks=callbacks).
Here, Formula (5) is used for the loss function; i.e., each neural model is trained separately. The results with 4000 training iterations are shown in Figure 7.
Figure 7 shows a match between the data set and the NN stress components. In Table 1, the error loss function
$$MSE_{\sigma} = \max\{MSE_{\sigma_{xx}}, MSE_{\sigma_{yy}}, MSE_{\sigma_{xy}}\}$$
is given for different numbers of layers, neurons and epochs.
The time of computations can be decreased by using the three surrogate neural networks with the residual neural model, which includes the residual (4). The loss error function includes all three components of the stress function (Figure 4). This approach is less expensive, but it is effective only for computations of the stress tensor. Figure 8 shows the loss function (4) while training the neural network to calculate the components of the stress tensor. The number of epochs is 4000. The neural network as before has four hidden layers with [15, 20, 15, 20] neurons, respectively. The batch size is 64, and from the data set, we have taken the M = 82 (42 for the training and 42 for the test validation) values of the stress tensor.

5.2. Example 2

In this example, the data set ($M = 9261$) has been used to train a neural constitutive model in which the partial derivatives of the stress tensor components are obtained via automatic differentiation. In Figure 9, the components of the stress tensor are presented, and in Figure 10, Figure 11 and Figure 12, the corresponding partial derivatives are sketched. The Sobolev error function (6) ($K = 1$) is also used to train the neural network, with AD employed for the computation of the partial derivatives. Figure 10, Figure 11 and Figure 12 show that the accuracy is better when the Sobolev metrics are used, for the same number of training iterations. A neural network consisting of four hidden layers with [15, 20, 15, 20] neurons has been trained for 5000 epochs. The batch size is 64, and the data set contains 9261 values of the stress tensor components (of which 4631 are used for training and 4630 for test validation). The test data are used for the comparison and validation of the results. In Table 2, the results of numerical experiments with different numbers of layers, neurons and epochs are presented. It is noted that the accuracy increases not only with the number of epochs but also with the number of layers and neurons.
The numerical experiments show that with a larger number of epochs, hidden layers, and neurons, the results are more accurate. In the case when derivative data are not available or Sobolev’s error is not included in the process, more training iterations are required in order to obtain good accuracy for the derivatives.
From Figure 10, Figure 11 and Figure 12, we can see the decreasing error after using the Sobolev metrics in the loss function of the training process. Therefore, if the data for a stiffness matrix are available, the calculations of the components of the stress tensor and its derivatives are more accurate.

5.3. Example 3

Here, the neural network is trained for each stress tensor component separately (Figure 5). The data set ($M = 9261$) is split into 80% for training and 20% (10% for validation and 10% for test/prediction). The computed maxima of the training and validation loss errors, the mean square error (MSE) and the mean absolute error (MAE), with a batch size of 84, are given in Table 3 and Table 4 for the cases without and with the Sobolev metrics, respectively. The error results are presented for both the stress tensor and the stiffness matrix.
As the numerical experiments have shown, the hyperparameters have a significant impact on the results. Shallower networks can sometimes perform better for the computation of the partial derivatives. From the values of the training loss, validation loss, MSE and MAE, we can see that in the neural model with the Sobolev metrics, more epochs are needed to reach the desired accuracy for the stress components. The results are also compared with a third-degree polynomial regression. In Figure 13, Figure 14, Figure 15 and Figure 16, the line fitting for the computed stress components and the tangent stiffness, together with the polynomial regression, is sketched. Although the polynomial regression requires less computational effort, the ANN provides more accurate approximations. In Figure 13, the residual graphs for the stress components with [50,50] neurons and 5000 epochs are sketched, and in Figure 14, Figure 15 and Figure 16, the residual graphs for the derivatives with [30,40,30,40] neurons and 5000 epochs are illustrated.
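For reference, a third-degree polynomial regression of this kind can be sketched as follows; scikit-learn is used here purely for illustration, and the paper's own implementation may differ:

```python
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

def fit_third_degree_regression(eps, sig_component, degree=3):
    """Linear regression on third-degree polynomial features of the strain components."""
    poly = PolynomialFeatures(degree=degree, include_bias=True)
    X = poly.fit_transform(eps)       # monomials of (eps_xx, eps_yy, eps_xy) up to degree 3
    reg = LinearRegression().fit(X, sig_component)
    return reg, poly

# Usage sketch: fit on the training strains, predict at the prediction points.
# reg, poly = fit_third_degree_regression(eps_train, sig_train[:, 0])
# sig_xx_poly = reg.predict(poly.transform(eps_pred))
```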

5.4. Example 4

In this example, we consider uniaxial tension, which is stretching along a single direction (or axis). It is a common and useful method for the mechanical testing of materials. The experiments have been conducted on hyperelastic materials, using a polynomial model for the nonlinear elastic deformations, like the one used in polymers and rubbers (e.g., [41,42]). The stress–strain relation in tension can be written in a polynomial form for a nonlinear elastic material,
$$\sigma(\varepsilon) = \sum_{i=1}^{n} E_{i}\,\varepsilon^{i},$$
where the $E_{i}$ are elastic coefficients. The case $E_{2} = E_{3} = E_{4} = 0$ corresponds to Hooke’s law of linear elasticity, with $E_{1}$ the Young’s modulus. Tension has been tested for $E_{1} = 2.14663$ MPa, $E_{2} = 0.646588$ MPa, $E_{3} = 0.0697791$ MPa, $E_{4} = 0.566989$ MPa (Dastjerdi et al. [41]).
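Using the coefficients listed above, the polynomial law and its derivative can be evaluated with a few lines of code; the strain grid below is an illustrative assumption only:

```python
import numpy as np

# Polynomial stress-strain law sigma(eps) = sum_i E_i * eps**i, coefficients in MPa.
E = [2.14663, 0.646588, 0.0697791, 0.566989]

def sigma_poly(eps):
    return sum(E_i * eps ** (i + 1) for i, E_i in enumerate(E))

def dsigma_poly(eps):
    return sum((i + 1) * E_i * eps ** i for i, E_i in enumerate(E))

eps = np.linspace(0.0, 1.0, 50)          # illustrative uniaxial strain grid
sig, dsig = sigma_poly(eps), dsigma_poly(eps)
```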
In (13), we have taken $\varepsilon_{ij}^{1} = 0.6$, $h = 0.05$ and $N = 50$. The neural network, with four hidden layers of [15, 20, 15, 20] neurons, has been trained with a data set of $M = 90$ samples (45 for training and 45 for validation and testing). The number of epochs is 8000, and the batch size is 64. The stress function with respect to the strain and its derivative have been predicted from the data set $\{\varepsilon_{i}, i = 1, 2, \dots, P\}$. In the NN, the loss function incorporates the physical constraint
$$\frac{\partial\sigma}{\partial\varepsilon} = \frac{\partial^{2}\Psi}{\partial\varepsilon^{2}} \geq 0$$
and the Sobolev function. The physics-informed constitutive model enforces hyperelasticity principles, which ensures material stability and consistency.
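A sketch of how this convexity constraint can be added as a penalty to the loss (extending the Sobolev-type loss shown earlier; the penalty weight is an illustrative assumption) might read:

```python
import tensorflow as tf

def physics_informed_loss(model, eps_batch, sig_batch, dsig_batch, penalty_weight=1.0):
    """Data and Sobolev terms plus a penalty for violating d sigma / d eps >= 0."""
    with tf.GradientTape() as tape:
        tape.watch(eps_batch)
        sig_nn = model(eps_batch)
    dsig_nn = tape.gradient(sig_nn, eps_batch)            # d sigma^NN / d eps via AD
    data_term = tf.reduce_mean(tf.square(sig_nn - sig_batch))
    sobolev_term = tf.reduce_mean(tf.square(dsig_nn - dsig_batch))
    # relu(-d sigma/d eps) vanishes wherever the convexity condition holds.
    convexity_term = tf.reduce_mean(tf.nn.relu(-dsig_nn))
    return data_term + sobolev_term + penalty_weight * convexity_term
```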
The stress–strain curve and the derivative of the stress at the $P = 40$ prediction points, together with the exact values of the stress from Equation (14), are shown in Figure 17.
The loss error at the last epoch is $MSE = 5.7 \times 10^{-7}$, and the relative errors of the computed stress values and their derivatives with respect to the strain at the prediction points are
$$\delta_{1} = \frac{\|\sigma^{NN} - \sigma\|}{\|\sigma\|} = 0.000424, \qquad \delta_{2} = \left\|\left(\frac{\partial\sigma}{\partial\varepsilon}\right)^{NN} - \frac{\partial\sigma}{\partial\varepsilon}\right\| \Big/ \left\|\frac{\partial\sigma}{\partial\varepsilon}\right\| = 0.008323.$$
Since the chain rule is used in automatic differentiation, the approximation of the derivatives can be performed with high accuracy. As the numerical results have shown, even with a small number of epochs, a good rate of convergence of the neural network to the data values can be obtained if the material tangent stiffness tensor is included in the training of the neural network.

6. Conclusions

A constitutive nonlinear neural network metamodel has been proposed for the approximation of the constitutive relation in macrostructures and hyperelastic materials. The components of the stress and strain tensors and the partial derivatives of the stress tensor components have been used for neural network training. The data set has been obtained from the representative volume element using finite elements. The tangential stiffness matrix, i.e., the partial derivatives of the stress tensor components within a multiscale model, has been calculated via automatic differentiation.
A neural constitutive network model with a common residual MSE and three surrogate models for the stress components has been trained to predict the stress components and the tangent stiffness tensor with respect to the strain components. In addition, a second type of neural network with a separate ANN for each stress component has been trained as well. The loss function has been investigated both without and including the Sobolev metrics. The numerical results have been compared, at prediction points, with the given data set of stress–strain values and partial derivatives as well as with the results obtained after applying a polynomial regression. In addition, a physics-informed constitutive neural model, which enforces hyperelasticity principles, has also been discussed. The choice of the hyperparameters has been analyzed. The results can be integrated into a multiscale finite element analysis and provide a viable solution to nonlinear homogenization.
The NN has been developed based on the open scientific software machine learning platform Tensorflow 2.14.0 and its Keras library, and it was written in Python.
The residual model has been constructed with different loss functions, including the mean square error and the Sobolev function with the derivatives of the components of the stress tensor. The latter gives better results after training the neural network. The proposed model allows us to predict the stress tensor in macrostructures and in hyperelastic materials, where the relations between the stress and strain tensors are highly nonlinear. The technique has been tested on hyperelastic materials by incorporating constraints on the second derivative of the strain energy function in order to maintain material stability and consistency.
The proposed techniques can be applied to nonlinear structures and hyperelastic materials with large deflections. In these cases, the constitutive behavior can be simulated by the proposed trained neural model based on experimental or statistical data sets for various applications, such as biological tissue (e.g., the aorta, blood vessels), elastomers, rubbers and polymers.
Extensions of this work to take into account the path-dependent constitutive relation of the RVE and dynamic effects, as well as more complicated problems, are possible and constitute the subject of our current research.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/app15073697/s1.

Author Contributions

A.D.M.: investigation, software (TensorFlow 2.14.0, Python 3.11), and writing—original draft preparation; G.E.S.: resources, writing—review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

The research was partially funded by the Project Safe-Aorta, implemented in the framework of the Action “Flagship actions in interdisciplinary scientific fields with a special focus on the productive fabric”, through the National Recovery and Resilience Fund Greece 2.0 and funded by the European Union-NextGenerationEU (Project ID: TAEDR-0535983).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Training data used in this paper have been taken from [1] and are available through the web page of the publisher and from the Supplementary Materials. The codes are available from the Supplementary Materials.

Acknowledgments

The authors are grateful to Georgios A. Drosopoulos (International Hellenic University, Greece) for discussions and fruitful suggestions.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Drosopoulos, G.A.; Stavroulakis, G.E. Non-Linear Mechanics for Composite, Heterogeneous Structures; CRC Press, Taylor and Francis: Boca Raton, FL, USA, 2022. [Google Scholar]
  2. Urbański, A. The Unified, Finite Element Formulation of Homogenization of Structural Members with a Periodic Microstructure; Cracow University of Technology: Cracow, Poland, 2005. [Google Scholar]
  3. Geers, M.G.D.; Kouznetsova, V.G.; Brekelmans, W.A.M. Multi-scale computational homogenization: Trends and challenges. J. Comput. Appl. Math. 2010, 234, 2175–2182. [Google Scholar] [CrossRef]
  4. Kortesis, S.; Panagiotopoulos, P.D. Neural networks for computing in structural analysis: Methods and prospects of applications. Int. J. Num. Meth. Eng. 1993, 36, 2305–2318. [Google Scholar] [CrossRef]
  5. Avdelas, A.V.; Panagiotopoulos, P.D.; Kortesis, S. Neural networks for computing in the elastoplastic analysis of structures. Meccanica 1995, 30, 1–15. [Google Scholar] [CrossRef]
  6. Muradova, A.D.; Stavroulakis, G.E. The projective-iterative method and neural network estimation for buckling of elastic plates in nonlinear theory. Comm. Nonlin. Sci. Num. Sim. 2007, 12, 1068–1088. [Google Scholar] [CrossRef]
  7. Stavroulakis, G.E. Inverse and Identification Problems in Mechanics; Springer/Kluwer Academic: Berlin/Heidelberg, Germany, 2000. [Google Scholar]
  8. Yagawa, G.; Oishi, A. Computational Mechanics with Neural Networks; Springer: Berlin/Heidelberg, Germany, 2021. [Google Scholar]
  9. Dornheim, J.; Mor, L.; Nallani, H.J.; Helm, D. Neural Networks for Constitutive Modeling: From Universal Function Approximators to Advanced Models and the Integration of Physics. Arch. Comp. Meth. Eng. 2024, 31, 1097–1127. [Google Scholar] [CrossRef]
  10. Linka, K.; Hillgärtner, M.; Abdolazizi, K.P.; Aydin, R.C.; Itskov, M.; Cyron, C.J. Constitutive artificial neural networks: A fast and general approach to predictive data-driven constitutive modeling by deep learning. J. Comput. Phys. 2021, 429, 110010. [Google Scholar] [CrossRef]
  11. Linka, K.; Kuhl, E. A new family of Constitutive Artificial Neural Networks towards automated model discovery. Comput. Methods Appl. Mech. Eng. 2023, 403, 115731. [Google Scholar] [CrossRef]
  12. Czarnecki, W.M.; Osindero, S.; Swirszcz, M.J.G.; Pascanu, R. Sobolev Training for Neural Networks. arXiv 2017, arXiv:1706.04859. [Google Scholar]
  13. Cocola, J.; Hand, P. Global Convergence of Sobolev Training for Overparameterized Neural Networks. In Machine Learning, Optimization, and Data Science; Nicosia, G., Ojha, V., La Malfa, E., Jansen, G., Sciacca, V., Pardalos, P., Giuffrida, G., Umeton, R., Eds.; Lecture Notes in Computer Science, LOD 2020; Springer: Cham, Switzerland, 2020; Volume 12565. [Google Scholar]
  14. Haghighat, E.; Juanes, R. Sciann, A keras/tensorflow wrapper for scientific computations and physics-informed deep learning using artificial neural networks. Comput. Meth. Appl. Mech. Eng. 2021, 373, 113552. [Google Scholar] [CrossRef]
  15. Baydin, A.; Pearlmutter, B.A.; Radul, A.A.; Siskind, J.M. Automatic Differentiation in Machine Learning: A Survey. arXiv 2018, arXiv:1502.05767. [Google Scholar]
  16. Michel, J.-C.; Moulinec, H.; Suquet, P. Effective properties of composite materials with periodic microstructure: A computational approach. Comput. Meth. Appl. Mech. Eng. 1999, 172, 109–143. [Google Scholar] [CrossRef]
  17. Zohdi, T.I.; Wriggers, P. An Introduction to Computational Micromechanics; Springer: Berlin/Heidelberg, Germany, 2008. [Google Scholar]
  18. Tikarrouchine, E.; Benaarbia, A.; Chatzigeorgiou, G.; Meraghni, F. Non-linear FE2 multiscale simulation of damage, micro and macroscopic strains in polyamide 66-woven composite structures: Analysis and experimental validation. Compos. Struct. 2021, 255, 112926. [Google Scholar]
  19. Drosopoulos, G.A.; Giannis, K.; Stavroulaki, M.E.; Stavroulakis, G.E. Metamodeling-assisted numerical homogenization for masonry and cracked structures. ASCE J. Eng. Mech. 2018, 144, 04018072. [Google Scholar]
  20. Yvonnet, J.; He, Q.C.; Li, P. Reducing internal variables and improving efficiency in data-driven modelling of anisotropic damage from RVE simulations. Comput. Mech. 2023, 72, 37–55. [Google Scholar]
  21. Le, B.A.; Yvonnet, J.; He, Q.-C. Computational homogenization of nonlinear elastic materials using neural networks. Int. J. Num. Meth. Eng. 2015, 104, 1061–1084. [Google Scholar]
  22. Urbański, A.; Szymon, L.; Marcin, D. Multi-scale modeling of brick masonry using a numerical homogenization technique and an artificial neural network. Arch. Civil Eng. 2022, 68, 179–197. [Google Scholar]
  23. Eivazi, H.; Tröger, J.-A.; Wittek, S.; Hartmann, S.; Rausch, A. FE² Computations with Deep Neural Networks: Algorithmic Structure, Data Generation, and Implementation. Available online: https://ssrn.com/abstract=4485434 (accessed on 7 June 2023).
  24. Fish, J.; Yu, Y. Data-physics driven reduced order homogenization. Int. J. Numer. Methods Engrg. 2023, 124, 1620–1645. [Google Scholar] [CrossRef]
  25. As’ad, F.; Avery, P.; Farhat, C. A mechanics-informed artificial neural network approach in data-driven constitutive modeling. Int. J. Numer. Methods Eng. 2022, 123, 2738–2759. [Google Scholar]
  26. Protopapadakis, E.; Schauer, M.; Pierri, E.; Doulamis, A.D.; Stavroulakis, G.E.; Böhrnsen, J.-U.; Langer, S. A genetically optimized neural classifier applied to numerical pile integrity tests considering concrete piles. Comput. Struct. 2016, 162, 68–79. [Google Scholar]
  27. Muradova, A.D.; Stavroulakis, G.E. Physics-informed neural networks for elastic plate problems with bending and Winkler-type contact effects. J. Serbian Soc. Comput. Mech. 2021, 15, 45–54. [Google Scholar] [CrossRef]
  28. Mouratidou, A.D.; Drosopoulos, G.A.; Stavroulakis, G.E. Ensemble of physics-informed neural networks for solving plane elasticity problems with examples. Acta Mechanica 2024, 235, 6703–6722. [Google Scholar] [CrossRef]
  29. Karniadakis, G.E.; Kevrekidis, I.G.; Lu, L.; Perdikaris, P.; Wang, S.; Yang, L. Physics-informed machine learning. Nat. Rev. Phys. 2021, 3, 422–440. [Google Scholar] [CrossRef]
  30. Katsikis, D.; Muradova, A.D.; Stavroulakis, G.S. A Gentle Introduction to Physics-Informed Neural Networks, with Applications in Static Rod and Beam Problems. J. Adv. Appl. Comput. Math. 2022, 9, 103–128. [Google Scholar] [CrossRef]
  31. Lu, X.; Giovanis, D.G.; Yvonnet, J.; Papadopoulos, V.; Detrez, F.; Bai, J. A data-driven computational homogenization method based on neural networks for the nonlinear anisotropic electrical response of graphene/polymer nanocomposites. Comput. Mech. 2019, 64, 307–321. [Google Scholar] [CrossRef]
  32. Fish, J.; Wagner, G.J.; Keten, S. Mesoscopic and multiscale modelling in materials. Nat. Mater. 2021, 20, 774–786. [Google Scholar] [CrossRef] [PubMed]
  33. Tchalla, A.; Belouettar, S.; Makradi, A.; Zahrouni, H. An ABAQUS toolbox for multiscale finite element computation. Compos. Part B Eng. 2013, 52, 323–333. [Google Scholar] [CrossRef]
  34. Wei, H.; Wu, C.T.; Hu, W.; Su, T.H.; Oura, H.; Nishi, M.; Naito, T.; Chung, S.; Shen, L. LS-DYNA Machine Learning–Based Multiscale Method for Nonlinear Modeling of Short Fiber–Reinforced Composites. J. Eng. Mech. 2023, 149, 04023003. [Google Scholar] [CrossRef]
  35. Su, T.H.; Huang, S.J.; Jean, J.G.; Chen, C.S. Multiscale computational solid mechanics: Data and machine learning. J. Mech. 2022, 38, 568–585. [Google Scholar] [CrossRef]
  36. Fei, T.; Xin, L.; Haodong, D.; Wenbin, Y. Learning composite constitutive laws via coupling Abaqus and deep neural network. Compos. Struct. 2021, 272, 114137. [Google Scholar]
  37. Stavroulakis, G.E.; Giannis, K.; Drosopoulos, G.A.; Stavroulaki, M.E. Non-linear Computational Homogenization Experiments. In Proceedings of the COMSOL Conference, Rotterdam, The Netherlands, 23–25 October 2013; Available online: http://purl.tuc.gr/dl/dias/9D3019B4-7879-4E1F-9845-E53693BB1717 (accessed on 12 October 2015).
  38. Barron, A.R. Approximation and Estimation Bounds for Artificial Neural Networks. Mach. Learn. 1994, 14, 115–133. [Google Scholar] [CrossRef]
  39. Ruder, S. An overview of gradient descent optimization algorithms. arXiv 2017, arXiv:1609.04747. [Google Scholar]
  40. Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M.; et al. Tensorflow: A system for large-scale machine learning. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), USENIX, Savannah, GA, USA, 2–4 November 2016; pp. 265–283. Available online: https://www.usenix.org/conference/osdi16/technical-sessions/presentation/abadi (accessed on 16 March 2025).
  41. Dastjerdi, S.; Alibakhshi, A.; Akgöz, B.; Civalek, O. Novel Nonlinear Elasticity Approach for Analysis of Nonlinear and Hyperelastic Structures. Eng. Anal. Bound. Elem. 2022, 143, 219–236. [Google Scholar] [CrossRef]
  42. Marckmann, G.; Verron, E. Comparison of hyperelastic models for rubber-like materials. Rubber Chem. Tech. 2006, 79, 835–858. [Google Scholar] [CrossRef]
Figure 1. The concept of FEM multiscale modeling.
Figure 2. Replacing the $\sigma-\varepsilon$ constitutive relation of the RVE by an ANN.
Figure 3. Plastic deformation of the masonry RVE.
Figure 4. The architecture of the neural constitutive model $\varepsilon \rightarrow \sigma, \partial\sigma/\partial\varepsilon$ with training data $(\varepsilon^{\mathrm{data}}, \sigma^{\mathrm{data}}, \partial\sigma/\partial\varepsilon^{\mathrm{data}})$, with the common residual MSE and three surrogate neural models.
Figure 5. The architecture of the neural constitutive model $\varepsilon \rightarrow \sigma, \partial\sigma/\partial\varepsilon$ with training data $(\varepsilon^{\mathrm{data}}, \sigma^{\mathrm{data}}, \partial\sigma/\partial\varepsilon^{\mathrm{data}})$ and a separate residual MSE for each neural network.
Figure 6. The flowchart for the neural network model used to predict components of the stress tensor based on the data set from the representative volume elements.
Figure 7. The components of the stress tensor after training of the neural network (Figure 5) and the data set ($M = 82$) obtained from the representative volume elements.
Figure 8. The evolution of the model loss function (4) while training the network with three surrogate neural models and one residual model for computing the components of the stress tensor.
Figure 9. The components of the stress tensor after training of the neural network (Figure 5) and from the data set ($M = 9261$) obtained from the representative volume elements.
Figure 10. The partial derivatives of the stress tensor component $\sigma_{xx}$ after training of the neural network (Figure 5) and from the data set ($M = 9261$) obtained from representative volume elements.
Figure 11. The partial derivatives of the stress tensor component $\sigma_{yy}$ after training of the neural network (Figure 5) and from the data set ($M = 9261$) obtained from representative volume elements.
Figure 12. The partial derivatives of the stress tensor component $\sigma_{xy}$ after training of the neural network (Figure 5) and from the data set ($M = 9261$) obtained from representative volume elements.
Figure 13. The line fitting for the components of the stress tensor after training the neural network and using the polynomial regression at the prediction points of the data set ($M = 9261$) obtained from the representative volume element.
Figure 14. The line fitting for the partial derivatives of the stress tensor component $\sigma_{xx}$ after training the neural network and using the polynomial regression at the prediction points of the data set ($M = 9261$) obtained from the representative volume element.
Figure 15. The line fitting for the partial derivatives of the stress tensor component $\sigma_{yy}$ after training the neural network (Figure 5) and using the polynomial regression at the prediction points of the data set ($M = 9261$) obtained from the representative volume element.
Figure 16. The line fitting for the partial derivatives of the stress tensor component $\sigma_{xy}$ after training of the neural network and using the polynomial regression at the prediction points of the data set ($M = 9261$) obtained from the representative volume element.
Figure 17. The stress–strain curve and the derivative function of the stress of the hyperelastic material from the prediction of the neural network and the data values.
Table 1. The model loss error for different values of the training parameters.

| Number of Layers | Neurons | Epochs | Training Samples | MSE$_{\sigma}$ | Time (min.) |
|---|---|---|---|---|---|
| 2 | [40, 40] | 2000 | 42 | $1.4 \times 10^{-5}$ | 3 |
| 2 | [40, 40] | 4000 | 42 | $1.2 \times 10^{-6}$ | 6 |
| 3 | [15, 30, 40] | 2000 | 42 | $1.2 \times 10^{-5}$ | 5 |
| 4 | [15, 20, 15, 20] | 4000 | 84 | $4.7 \times 10^{-6}$ | 8 |
| 4 | [15, 20, 15, 20] | 8000 | 84 | $1.6 \times 10^{-6}$ | 15.5 |
Table 2. The model loss error for different values of the training parameters.

| Number of Layers | Neurons | Epochs | Batch Size | MSE$_{\sigma}$ | Time (min.) |
|---|---|---|---|---|---|
| 2 | [15, 15] | 1000 | 64 | $2.0 \times 10^{-4}$ | 13.64 |
| 2 | [15, 15] | 2000 | 64 | $1.8 \times 10^{-4}$ | 27.62 |
| 2 | [40, 40] | 2000 | 64 | $6.3 \times 10^{-5}$ | 27.30 |
| 2 | [40, 40] | 2000 | 84 | $7.7 \times 10^{-5}$ | 26.43 |
| 3 | [15, 30, 15] | 4000 | 84 | $4.7 \times 10^{-5}$ | 52.48 |
| 4 | [15, 20, 15, 20] | 2000 | 84 | $5.8 \times 10^{-5}$ | 26.31 |
| 4 | [30, 40, 30, 40] | 2000 | 84 | $3.8 \times 10^{-5}$ | 26.79 |
| 5 | [15, 20, 15, 20, 15] | 2000 | 84 | $5.7 \times 10^{-5}$ | 26.27 |
| 5 | [15, 20, 15, 20, 15] | 4000 | 84 | $3.6 \times 10^{-5}$ | 52.57 |
Table 3. The computed loss error, mean square error (MSE) and mean absolute error (MAE) of training of the NN for different hyperparameters with a batch size of 84.

| Neurons in Layers | Epochs | Train. Loss | Valid. Loss | MSE$_{\sigma}$ | MAE$_{\sigma}$ | MSE$_{\mathrm{stiff}}$ | MAE$_{\mathrm{stiff}}$ | Time (min.) |
|---|---|---|---|---|---|---|---|---|
| [50, 50] | 2000 | $1.7 \times 10^{-6}$ | $6.8 \times 10^{-7}$ | $6.8 \times 10^{-7}$ | 0.000661 | 0.0001885 | 0.006957 | 18.51 |
| [50, 50] | 5000 | $1.1 \times 10^{-6}$ | $3.3 \times 10^{-6}$ | $3.4 \times 10^{-6}$ | 0.001550 | 0.000164 | 0.006603 | 44.89 |
| [30, 40, 30, 40] | 2000 | $1.1 \times 10^{-6}$ | $9.45 \times 10^{-7}$ | $9.1 \times 10^{-7}$ | 0.000600 | 0.000740 | 0.013892 | 21.82 |
| [30, 40, 30, 40] | 5000 | $2.1 \times 10^{-6}$ | $1.5 \times 10^{-6}$ | $1.5 \times 10^{-6}$ | 0.000941 | 0.089927 | 0.135217 | 49.69 |
| [15, 30, 15, 30, 15] | 2000 | $1.8 \times 10^{-6}$ | $1.8 \times 10^{-6}$ | $1.8 \times 10^{-6}$ | 0.001138 | 0.000648 | 0.013876 | 24.63 |
| [15, 30, 15, 30, 15] | 5000 | $1.4 \times 10^{-6}$ | $8.0 \times 10^{-7}$ | $7.8 \times 10^{-7}$ | 0.000674 | 0.002472 | 0.026296 | 60.52 |
Table 4. The computed loss error, mean square error (MSE) and mean absolute error (MAE) of training of the NN with the Sobolev metrics for different hyperparameters with a batch size of 84.

| Neurons in Layers | Epochs | Train. Loss | Valid. Loss | MSE$_{\sigma}$ | MAE$_{\sigma}$ | MSE$_{\mathrm{stiff}}$ | MAE$_{\mathrm{stiff}}$ | Time (min.) |
|---|---|---|---|---|---|---|---|---|
| [50, 50] | 2000 | $4.2 \times 10^{-5}$ | $4.2 \times 10^{-5}$ | $8.4 \times 10^{-6}$ | 0.002398 | $3.0 \times 10^{-5}$ | 0.004027 | 27.04 |
| [50, 50] | 5000 | $4.5 \times 10^{-5}$ | $6.3 \times 10^{-6}$ | $5.1 \times 10^{-6}$ | 0.002168 | $5.1 \times 10^{-5}$ | 0.004628 | 66.73 |
| [30, 40, 30, 40] | 2000 | $5.2 \times 10^{-5}$ | $4.6 \times 10^{-5}$ | $4.5 \times 10^{-6}$ | 0.002791 | $2.7 \times 10^{-5}$ | 0.003114 | 26.94 |
| [30, 40, 30, 40] | 5000 | $3.1 \times 10^{-5}$ | $5.0 \times 10^{-5}$ | $6.4 \times 10^{-6}$ | 0.001990 | $5.8 \times 10^{-6}$ | 0.004675 | 66.50 |
| [15, 30, 15, 30, 15] | 2000 | $4.1 \times 10^{-5}$ | $5.1 \times 10^{-5}$ | $1.3 \times 10^{-5}$ | 0.002822 | $2.6 \times 10^{-5}$ | 0.003749 | 26.53 |
| [15, 30, 15, 30, 15] | 5000 | $2.3 \times 10^{-5}$ | $3.2 \times 10^{-5}$ | $2.9 \times 10^{-6}$ | 0.001506 | $2.8 \times 10^{-5}$ | 0.003678 | 66.68 |