# FE^{2} Computations with Deep Neural Networks: Algorithmic Structure, Data Generation, and Implementation


## Abstract


## 1. Introduction

## 2. Classical FE$^2$ Computations

#### 2.1. Spatial Discretization

#### 2.1.1. Macroscale

**Remark**

**1.**

#### 2.1.2. Microscale

#### 2.1.3. General System of Non-Linear Equations

**Remark**

**2.**

#### 2.2. Multilevel–Newton Algorithm

#### 2.2.1. Multilevel–Newton Algorithm for FE$^2$ Computations

Algorithm 1: Multilevel–Newton algorithm for FE^{2} computations with periodic displacement boundary conditions on the microscale.

#### 2.2.2. Newton Algorithm for FE$^2$ Computations with DNN Surrogate Models

Algorithm 2: Newton algorithm for FE^{2} computations following the DNN-FE^{2} approach.
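The essential difference from Algorithm 1 is that the nested microscale boundary-value problems are replaced by a trained surrogate that returns the homogenized stress and the consistent tangent directly. A minimal sketch of such a Newton loop, with a hypothetical scalar `surrogate` function standing in for the DNN (the toy response and all names are illustrative, not the paper's implementation):

```python
import numpy as np

# Hypothetical surrogate standing in for the trained DNN: for a macroscopic
# strain E it returns the homogenized stress T and the tangent C = dT/dE.
def surrogate(E):
    T = np.tanh(E) + 0.1 * E          # toy stress response
    C = 1.0 / np.cosh(E) ** 2 + 0.1   # its exact derivative
    return T, C

def newton(load, E0=0.0, tol=1e-10, max_iter=25):
    """Solve the scalar equilibrium T(E) - load = 0 by Newton's method."""
    E = E0
    for _ in range(max_iter):
        T, C = surrogate(E)
        r = T - load          # residual of the macroscopic equilibrium
        if abs(r) < tol:
            break
        E -= r / C            # Newton update using the surrogate tangent
    return E
```

Because the surrogate supplies stress and tangent in one evaluation, no micro-iterations are required inside each global Newton step, which is what makes the surrogate-based scheme cheaper than the full multilevel algorithm.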

## 3. Deep Neural Networks

Algorithm 3: Computing the Jacobian matrix $\mathbf{J}$ of a function f via reverse-mode AD in the TensorFlow and JAX frameworks for a batch of samples $\mathbf{x}$.

TensorFlow:

```python
def Jacobian(f, x):
    with tf.GradientTape() as tape:
        tape.watch(x)
        y = f(x)
    return tape.batch_jacobian(y, x)

J = Jacobian(f, x)
```

JAX:

```python
Jacobian = jax.vmap(jax.jacrev(f))
J = Jacobian(x)
```
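As a concrete check of the JAX variant in Algorithm 3, the batched reverse-mode Jacobian of a small vector-valued function (a stand-in for the DNN, chosen here for illustration) can be compared against its analytic derivatives:

```python
import jax
import jax.numpy as jnp

# Example function: maps a 2-vector to a 2-vector (stands in for the DNN).
def f(x):
    return jnp.stack([x[0] * x[1], jnp.sin(x[0])])

# Reverse-mode Jacobian of f, vectorized over the leading batch dimension.
Jacobian = jax.vmap(jax.jacrev(f))

x = jnp.array([[1.0, 2.0],
               [0.0, 3.0]])   # batch of 2 samples
J = Jacobian(x)               # shape (2, 2, 2): (batch, outputs, inputs)
```

Here `jax.jacrev` builds the per-sample Jacobian and `jax.vmap` maps it over the batch axis, mirroring `tape.batch_jacobian` on the TensorFlow side.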

#### 3.1. Deep Neural Networks as Surrogate Models for Local RVE Computations

#### 3.2. Training and Validation Datasets
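Training data for the surrogate are typically generated by sampling the space of admissible macroscopic strains; Latin hypercube sampling (LHS, cf. the abbreviation list) is the common choice because it stratifies every input dimension. A minimal NumPy sketch, under the assumption that each input component is sampled independently within prescribed bounds (function name and interface are illustrative):

```python
import numpy as np

def latin_hypercube(n_samples, bounds, rng=None):
    """Latin hypercube sample: one point per equal-probability stratum
    in every dimension, with strata randomly permuted per column.

    bounds: list of (low, high) tuples, one per input dimension
            (e.g. per strain component).
    """
    rng = np.random.default_rng(rng)
    d = len(bounds)
    samples = np.empty((n_samples, d))
    for j, (lo, hi) in enumerate(bounds):
        # stratified points in [0, 1): one per bin of width 1/n_samples
        strata = (np.arange(n_samples) + rng.random(n_samples)) / n_samples
        samples[:, j] = lo + (hi - lo) * rng.permutation(strata)
    return samples
```

Each column then contains exactly one sample per equal-width stratum, which covers the strain range more evenly than plain uniform random sampling for the same dataset size.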

#### 3.3. Architecture and Training Process

#### 3.3.1. Data Pre-Processing
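A typical pre-processing step for this kind of regression task is to standardize inputs and outputs to zero mean and unit variance using statistics of the training set only, so that validation data never leak into the scaling. A hedged sketch (the class and method names are illustrative, not the paper's code):

```python
import numpy as np

class StandardScaler:
    """Standardize features column-wise using training-set statistics."""

    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        self.std_ = X.std(axis=0)
        self.std_[self.std_ == 0.0] = 1.0  # guard against constant features
        return self

    def transform(self, X):
        return (X - self.mean_) / self.std_

    def inverse_transform(self, Z):
        # Map network predictions back to physical units (e.g. stresses).
        return Z * self.std_ + self.mean_
```

The inverse transform is needed at inference time, since the network predicts in the scaled space while the FE solver expects physical stress and tangent values.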

#### 3.3.2. Training

#### 3.3.3. Model Selection

#### 3.3.4. Hyperparameter Tuning

## 4. Numerical Experiments

#### 4.1. Problem Setup

#### 4.2. Investigation on the Size of Dataset

#### 4.3. Numerical Simulations

#### 4.3.1. L-Profile

#### 4.3.2. Cook’s Membrane

#### 4.4. Load-Step Size Behavior

## 5. Speed-Up with JAX and Just-in-Time Compilation

#### 5.1. Just-in-Time Compilation

#### 5.2. Speed-Up with JAX and JIT

## 6. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Data Availability Statement

## Conflicts of Interest

## Abbreviations

DNN | Deep neural network |

FE | Finite element |

RVE | Representative volume element |

TANN | Thermodynamics-based artificial neural network |

DMN | Deep material network |

NN | Neural network |

MLNA | Multilevel Newton algorithm |

DAE | Differential-algebraic equations |

AD | Automatic differentiation |

MPI | Message passing interface |

LHS | Latin hypercube sampling |

XLA | Accelerated linear algebra |

JIT | Just-in-time |

## Appendix A. Hyperparameter Tuning

**Table A1.** Summary of the results obtained for training and validation losses and the required time of training for different sizes of the NN–2–100 model. Results are reported for models with the swish activation function.

| $N_h \times N_n$ | $L_{\mathcal{T}}^{\mathrm{train}}$ | $L_{\mathcal{T}}^{\mathrm{val}}$ | $L_{\mathcal{C}}^{\mathrm{train}}$ | $L_{\mathcal{C}}^{\mathrm{val}}$ | $t_{\mathrm{rel,train}}$ |
|---|---|---|---|---|---|
| 64 × 2 | $1.06 \times 10^{-6}$ | $1.06 \times 10^{-6}$ | $1.35 \times 10^{-4}$ | $1.33 \times 10^{-4}$ | $6.97 \times 10^{-3}$ |
| 64 × 4 | $1.49 \times 10^{-7}$ | $1.51 \times 10^{-7}$ | $3.58 \times 10^{-6}$ | $3.68 \times 10^{-6}$ | $8.36 \times 10^{-3}$ |
| 64 × 8 | $1.02 \times 10^{-7}$ | $1.04 \times 10^{-7}$ | $9.27 \times 10^{-7}$ | $9.35 \times 10^{-7}$ | $1.17 \times 10^{-2}$ |
| 64 × 16 | $1.66 \times 10^{-7}$ | $1.70 \times 10^{-7}$ | $3.93 \times 10^{-7}$ | $4.29 \times 10^{-7}$ | $1.81 \times 10^{-2}$ |
| 128 × 2 | $1.13 \times 10^{-6}$ | $1.11 \times 10^{-6}$ | $1.71 \times 10^{-4}$ | $1.67 \times 10^{-4}$ | $7.80 \times 10^{-3}$ |
| 128 × 4 | $1.14 \times 10^{-7}$ | $1.15 \times 10^{-7}$ | $9.47 \times 10^{-7}$ | $1.03 \times 10^{-6}$ | $9.19 \times 10^{-3}$ |
| 128 × 8 | $5.05 \times 10^{-8}$ | $4.96 \times 10^{-8}$ | $2.58 \times 10^{-7}$ | $3.10 \times 10^{-7}$ | $1.45 \times 10^{-2}$ |
| 128 × 16 | $5.22 \times 10^{-8}$ | $5.30 \times 10^{-8}$ | $2.19 \times 10^{-7}$ | $3.35 \times 10^{-7}$ | $2.08 \times 10^{-2}$ |
| 256 × 2 | $1.55 \times 10^{-6}$ | $1.54 \times 10^{-6}$ | $1.60 \times 10^{-4}$ | $1.58 \times 10^{-4}$ | $7.90 \times 10^{-3}$ |
| 256 × 4 | $7.95 \times 10^{-8}$ | $8.02 \times 10^{-8}$ | $5.08 \times 10^{-7}$ | $5.85 \times 10^{-7}$ | $1.18 \times 10^{-2}$ |
| 256 × 8 | $7.10 \times 10^{-8}$ | $7.00 \times 10^{-8}$ | $1.01 \times 10^{-7}$ | $1.67 \times 10^{-7}$ | $1.75 \times 10^{-2}$ |
| 256 × 16 | $5.19 \times 10^{-8}$ | $5.18 \times 10^{-8}$ | $3.35 \times 10^{-7}$ | $1.76 \times 10^{-7}$ | $3.37 \times 10^{-2}$ |

**Figure A1.** Influence of the choice of activation function on the learning process; $L_{\mathcal{T}}^{\mathrm{val}}$ (**left**) and $L_{\mathcal{C}}^{\mathrm{val}}$ (**right**). Results are reported for models containing 8 hidden layers and 128 neurons per hidden layer.


**Figure 2.** Geometry of the RVE (dimensions in mm) used as the microstructure in the numerical experiments, with fibers (grey) and matrix material (blue).

**Figure 3.** Spatial discretization and boundary conditions for the macroscale test cases. (**A**) L-profile; (**B**) Cook’s membrane.

**Figure 5.** Reference data (**left**), results obtained from the DNN-FE$^2$ simulation with the NN–AD–100 model (**middle**), and the error measure (74) (**right**) for the components of the strain tensor $\overline{\mathbf{E}}$.

**Figure 6.** Reference data (**left**), results obtained from the DNN-FE$^2$ simulation with the NN–AD–100 model (**middle**), and the error measure (74) (**right**) for the components of the stress tensor $\overline{\mathbf{T}}$.

**Figure 7.** Histograms of the error (74) for the L-profile when applying the NN–AD–100 model. The top and bottom panels illustrate the error for the components of the stress and strain tensors, $\overline{\mathbf{T}}$ and $\overline{\mathbf{E}}$, respectively.

**Figure 8.** Reference data (**left**), results obtained from the DNN-FE$^2$ simulation with the NN–AD–100 model (**middle**), and the error measure (74) (**right**) for the components of the strain tensor $\overline{\mathbf{E}}$.

**Figure 9.** Reference data (**left**), results obtained from the DNN-FE$^2$ simulation with the NN–AD–100 model (**middle**), and the error measure (74) (**right**) for the components of the stress tensor $\overline{\mathbf{T}}$.

**Figure 10.** Histograms of the error (74) for the Cook’s membrane when applying the NN–AD–100 model. The top and bottom panels illustrate the error for the components of the stress and strain tensors, $\overline{\mathbf{T}}$ and $\overline{\mathbf{E}}$, respectively.

**Figure 11.** Step-size behavior of the DNN-FE$^2$ simulations (red) with the NN–AD–100 model and the FE$^2$ reference simulation (blue); only accepted time-step sizes are shown. (**A**) L-profile; (**B**) Cook’s membrane.

**Table 1.** Effect of the weighting coefficients $\alpha$ and $\beta$ for the components of the loss (72) on the performance of the NN–AD model.

| ($\alpha$, $\beta$) | $L_{\mathcal{T}}^{\mathrm{val}}$ | $L_{\mathcal{T}'}^{\mathrm{val}}$ |
|---|---|---|
| (1, 0.01) | $4.84 \times 10^{-8}$ | $1.35 \times 10^{-6}$ |
| (1, 1) | $2.20 \times 10^{-8}$ | $8.85 \times 10^{-8}$ |
| (1, 100) | $3.55 \times 10^{-8}$ | $2.97 \times 10^{-8}$ |

| $K_{\mathrm{f}}$ | $G_{\mathrm{f}}$ | $K_{\mathrm{m}}$ | $\alpha_1$ | $\alpha_2$ |
|---|---|---|---|---|
| N mm$^{-2}$ | N mm$^{-2}$ | N mm$^{-2}$ | N mm$^{-2}$ | – |
| $4.35 \times 10^{4}$ | $2.99 \times 10^{4}$ | $4.78 \times 10^{3}$ | $5.0 \times 10^{1}$ | $6.0 \times 10^{-2}$ |

**Table 3.** Results of the DNN-FE$^2$ simulation of the L-profile for different sizes of the training/validation dataset.

| Model | $N_{\mathbb{D}}$ | $\epsilon_{\mathrm{mean}}$ (%) | $\epsilon_{\mathrm{std}}$ (%) | Speed-Up | $N_{\mathrm{iter}}$ | $N_t$ |
|---|---|---|---|---|---|---|
| NN–2–1 | $1 \times 10^{3}$ | 8.79 | 10.8 | 232× | 104 | 32 |
| NN–AD–1 | $1 \times 10^{3}$ | 4.68 | 8.52 | 443× | 79 | 30 |
| NN–2–10 | $1 \times 10^{4}$ | 3.20 | 4.87 | 246× | 101 | 32 |
| NN–AD–10 | $1 \times 10^{4}$ | 0.42 | 0.69 | 400× | 86 | 30 |
| NN–2–100 | $1 \times 10^{5}$ | 2.48 | 3.68 | 254× | 97 | 32 |
| NN–AD–100 | $1 \times 10^{5}$ | 0.21 | 0.40 | 462× | 73 | 30 |
| NN–2–1000 | $1 \times 10^{6}$ | 1.59 | 2.62 | 251× | 97 | 31 |
| NN–AD–1000 | $1 \times 10^{6}$ | 0.20 | 0.37 | 452× | 76 | 30 |
| NN–2–4000 | $4 \times 10^{6}$ | 1.87 | 2.89 | 236× | 103 | 32 |
| NN–AD–4000 | $4 \times 10^{6}$ | 0.15 | 0.30 | 462× | 73 | 30 |

**Table 4.** Results of the DNN-FE$^2$ simulation of the Cook’s membrane for different sizes of the training/validation dataset.

| Model | $N_{\mathbb{D}}$ | $\epsilon_{\mathrm{mean}}$ (%) | $\epsilon_{\mathrm{std}}$ (%) | Speed-Up | $N_{\mathrm{iter}}$ | $N_t$ |
|---|---|---|---|---|---|---|
| NN–2–1 | $1 \times 10^{3}$ | 0.68 | 0.89 | 242× | 123 | 32 |
| NN–AD–1 | $1 \times 10^{3}$ | 0.60 | 0.71 | 527× | 85 | 30 |
| NN–2–10 | $1 \times 10^{4}$ | 0.31 | 0.44 | 292× | 104 | 32 |
| NN–AD–10 | $1 \times 10^{4}$ | 0.09 | 0.13 | 542× | 84 | 30 |
| NN–2–100 | $1 \times 10^{5}$ | 0.19 | 0.26 | 287× | 104 | 32 |
| NN–AD–100 | $1 \times 10^{5}$ | 0.02 | 0.02 | 554× | 82 | 30 |
| NN–2–1000 | $1 \times 10^{6}$ | 0.13 | 0.16 | 286× | 105 | 32 |
| NN–AD–1000 | $1 \times 10^{6}$ | 0.03 | 0.06 | 575× | 79 | 30 |
| NN–2–4000 | $4 \times 10^{6}$ | 0.12 | 0.17 | 291× | 103 | 32 |
| NN–AD–4000 | $4 \times 10^{6}$ | 0.01 | 0.01 | 611× | 73 | 30 |

**Table 5.** Comparison of the JAX and TensorFlow implementations of the surrogate model NN–AD–100 in terms of computational efficiency.

| Framework | $t_{\mathrm{rel,train}}$ | Speed-Up for L-Profile | Speed-Up for Cook’s Membrane |
|---|---|---|---|
| TensorFlow | $1.39 \times 10^{-2}$ | 462× | 554× |
| JAX | $9.49 \times 10^{-3}$ | 4629× | 5853× |
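The JAX speed-ups in Table 5 rest largely on `jax.jit`, which traces the surrogate once, compiles it with XLA, and reuses the compiled executable on every subsequent call with the same input shapes. A minimal sketch with an illustrative stand-in network (weights and names are hypothetical):

```python
import jax
import jax.numpy as jnp

def mlp(x):
    # Tiny stand-in for the surrogate network (illustrative fixed weights).
    W = jnp.full((4, 2), 0.1)
    return jnp.tanh(x @ W.T).sum(axis=-1)

mlp_jit = jax.jit(mlp)   # first call triggers tracing and XLA compilation
x = jnp.ones((8, 2))
y = mlp_jit(x)           # later calls with the same shapes reuse the binary
```

The compiled and uncompiled versions produce the same values; only the cost per call changes, which is why JIT compilation can be applied to the surrogate evaluation without altering the FE$^2$ results.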

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Eivazi, H.; Tröger, J.-A.; Wittek, S.; Hartmann, S.; Rausch, A.
FE^{2} Computations with Deep Neural Networks: Algorithmic Structure, Data Generation, and Implementation. *Math. Comput. Appl.* **2023**, *28*, 91.
https://doi.org/10.3390/mca28040091
