Feed-Forward Neural Networks for Failure Mechanics Problems

Abstract: This work addresses an efficient neural network (NN) representation for the phase-field modeling of isotropic brittle fracture. In recent years, data-driven approaches, such as neural networks, have become an active research field in mechanics. In this contribution, deep neural networks, in particular the feed-forward neural network (FFNN), are utilized directly for the development of the failure model. The verification and generalization of the trained models for elasticity as well as fracture behavior are investigated by several representative numerical examples under different loading conditions. As an outcome, promising results close to the exact solutions are produced.


Introduction
Various discretization schemes exist in the literature for solving different science and engineering problems, e.g., the finite element method (FEM) [1], isogeometric analysis (IGA) [2], and the virtual element method (VEM) [3]. These modern element technologies depend heavily on the availability of a "material model" that describes the nonlinear behavior of the material as well as the structural failure. Such analytical and physically motivated mathematical models can lead to a pronounced computational cost, so computationally less expensive approaches have long been sought. In this regard, one possibility introduced in the literature is the use of unconventional approaches, such as data-driven models, to reduce the computational cost [4][5][6][7][8][9]. Machine learning, deep learning, and neural networks are examples of such data-driven models that help to reduce model complexity and may surpass conventional constitutive modeling [4,7,8,10,11].

Motivation and State of the Art
The inspiration for applying machine learning approaches, more specifically artificial neural networks (ANNs), to a wide range of engineering disciplines comes from human and animal neural systems. These biologically inspired simulations, performed on the computer, carry out many tasks, e.g., clustering, classification, and pattern recognition. Recently, ANNs have been used successfully in voice/image recognition and robotics. In the field of mechanics, novel investigations have been proposed in the literature. To this end, the so-called self-learning finite element procedure introduced by Ghaboussi et al. [20] and developed by Shin and Pande [21] can be mentioned as a very promising technique for the ANN training process. Further reviews of possible applications of ANNs in mechanics can be found in [22][23][24], and their application to constitutive modeling in [25,26].
To incorporate data-driven models for solving computational mechanics problems by taking over classical constitutive modeling, numerous approaches have been proposed in the literature. This contribution represents the first steps toward deep learning (DL) as a subset of machine learning that extracts patterns from data, using neural networks for inelastic solids.

Deep Learning (DL) Architectures
DL offers many architectures. In mechanics, research has focused mainly on the feed-forward neural network (FFNN), the recurrent neural network (RNN), and the convolutional neural network (CNN). Applications of these types of neural networks in the field of mechanics can be found in the recent works [27][28][29][30][31][32][33][34][35][36][37][38][39][40][41][42][43]. These approaches are summarized as follows:
• FFNN is used in [44,45] to model the material behavior at the macroscale level, using strains (as inputs) and stresses (as outputs). One main advantage of such a model is that the training data required for the neural network can be acquired directly from experimental data. On this basis, and to build a neural network that approximates the non-linear behavior of history-dependent material models (e.g., plasticity, where the loading history is relevant), Ref. [46] proposed using the strain from the previous load step as an additional input (feature) of the neural network. Recently, a novel method called the proper orthogonal decomposition feed forward neural network (PODFNN) was proposed by [47] for predicting the stress sequences in the case of plasticity. It reduces the complexity of the model significantly by transforming the stress sequence into multiple independent coefficient sequences.
• RNN is another type of neural network that uses the previous outputs as inputs, i.e., a path-dependent scheme. Two different approaches, direct (black box) and graph-based (physically informed), were applied in [37] for modeling elastoplastic materials. In the former case (black box approach), the total stress was predicted purely from the total strain history. In the latter case (graph-based approach), the recurrent neural network predicts the path-dependent behavior, while a feed-forward neural network predicts the path-independent responses, which may lead to a more accurate prediction of stresses.
• CNN is a special type of deep neural network that has recently become a dominant method in computer vision. A CNN architecture consists of an input and an output layer, as well as multiple hidden layers. These hidden layers typically consist of convolutional layers, activation layers, pooling layers, and fully connected layers. CNNs have been used in image classification, video classification, face recognition, scene labeling, action recognition, image segmentation, and natural language translation, among others. In the work of [48], CNNs are used to quantitatively predict the mechanical properties (i.e., stiffness, strength, and toughness) of a 2D checkerboard composed of two different phases (brittle and ductile). Following this line, Ref. [49] introduced a graph convolutional deep neural network, incorporating non-Euclidean weighted graph data to predict the elastic response of materials with complex microstructures. For recent works on CNNs, we refer to [50][51][52] and the citations therein.
Within this work, a feed-forward neural network (FFNN) is employed to perform the regression task of predicting stresses for a linear elastic material, and also to predict the elasticity in the fracture phase-field modeling of brittle solids along with the crack phase field. The objective of this paper is to incorporate neural network models as constitutive models within finite element applications for elasticity and phase-field modeling of brittle fracture by combining the available tools. The efficiency of the trained neural network is studied within the finite element analysis. The raw numerical data acquired by FEM simulation are used to train the neural network model under investigation. As an advantage of this contribution, the proposed method has the potential to provide accurate and feasible approximations for various engineering applications. To this end, promising results close to the exact solution are achieved, provided that the training data set covers all the patterns that can occur in the target problem. Note that this paper represents an initial contribution to the use of neural networks for solving fracture mechanics problems. Hence, as a starting point, the model performance is evaluated through numerical results instead of real experimental data.
The paper is organized as follows: In Section 2 a brief overview of the neural network, in particular the feed-forward neural network, for path-independent predictions is presented. In Section 3, the trained network is applied to learn the constitutive behavior of elastic materials within the finite element application. Then, the trained model is used to predict the elasticity in the phase-field fracture simulation in Section 4. Thereafter, the data collection strategy for predicting phase-field fracture using a neural network model is developed and validated in Section 5. Section 6 presents a summary and the outlook of this work.

Theory of Neural Networks
A neural network (NN) algorithm can be described using a data set, a model, a loss function, and an optimization procedure. A data set refers to the total samples available for the training, validation, and testing of an algorithm. It is commonly split into three parts: training, validation, and test sets. The algorithm learns from the training set. Then, a validation set is used to evaluate and optimize the learning algorithm. Finally, the test set acts as unseen data and is used to test the trained algorithm. A model refers to the structure that holds all information describing the trained algorithm (the learned function). A loss function or objective function is a metric that must be minimized during training. The optimization procedure is used to find the optimal parameters of the model that minimize the loss function. A more detailed analysis of neural networks can be found, for instance, in [53].
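The three-way split described above can be sketched in a few lines of Python. This is only a minimal illustration and not the procedure used in this work: the function name `split_dataset`, the fixed random seed, and the default fractions (70% training, 15% validation, remainder for testing) are assumptions made for the example.

```python
import random


def split_dataset(samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle the samples and split them into training, validation, and test sets.

    The fractions and the seed are illustrative defaults (assumptions).
    Whatever is left after the training and validation parts forms the test set.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = int(round(train_frac * len(shuffled)))
    n_val = int(round(val_frac * len(shuffled)))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Usage with 100 placeholder samples (here simply the integers 0..99).
train_set, val_set, test_set = split_dataset(range(100))
```

Shuffling before splitting avoids a systematic ordering of the samples ending up in only one of the three subsets.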

Artificial Neural Network (ANN)
ANN is defined as a mathematical model divided into a series of interconnected elements classified in layers, whose geometry and functionality have been compared with those of the human brain. ANN is made up of neurons that have one scalar output and multiple inputs, as sketched in Figure 1.
An inventive but simple structure is composed of four parts: (i) input values, (ii) weights and bias, (iii) weighted total (sum), and (iv) activation function (threshold unit). A schematic diagram of an artificial neuron is illustrated in Figure 1, where $x_1$ to $x_3$ are the inputs, $w_1$ to $w_3$ are their corresponding weights, b is the bias, f is the activation function applied to the weighted sum of the inputs, and y is the output of the neuron.

Mathematically, the relation between the inputs and the output of the neuron can be written as follows:

$$ y = f\left( \sum_{i=1}^{N} w_i x_i + b \right), \tag{1} $$

where N is the number of inputs of the neuron. Equation (1) can be rewritten in the following compact form:

$$ y = f\left( \mathbf{w} \cdot \mathbf{x} + b \right). \tag{2} $$

The activation function f (also known as the transfer function) determines the output of the neurons in the neural network model, its computational efficiency, and its ability to train and converge after multiple iterations of training. It introduces non-linear properties to the network. Hence, its main purpose is to convert the input signal of a node in an ANN to an output signal. The output signal is then used as an input in the next layer of the neural network. Specifically, in an ANN, a sum of products is formed from the inputs x and the corresponding weights W. Next, the activation function f(x) is applied to obtain the output of that layer, which is fed as an input to the next layer. Activation functions can be linear or non-linear. A linear activation function (e.g., the Purelin activation function) has an unbounded output range and adds no non-linearity to the model. Non-linear activation functions (e.g., the Sigmoid and Tanh activation functions), on the other hand, introduce non-linearity in order to better learn the complex relationship between the input and output data.
Here, the most widely used classic activation functions with their first derivatives are presented as follows:
1. Sigmoid transfer function: $f(x) = \dfrac{1}{1 + e^{-x}}$, with $f'(x) = f(x)\,[1 - f(x)]$. (3)
2. Hyperbolic tangent (Tanh) function: $f(x) = \tanh(x)$, with $f'(x) = 1 - f^2(x)$. (4)
3. Purelin function: $f(x) = x$, with $f'(x) = 1$. (5)
A key aspect of the activation function is that it should be differentiable so that a backpropagation optimization strategy can be applied successfully. In this paper, we used the hyperbolic tangent function, due to its flexible range and general applicability in different engineering problems; see, for example, [53].
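The single neuron and the three activation functions discussed above can be illustrated with a short Python sketch. It is not taken from the original implementation: the function names and the numerical values of the inputs, weights, and bias are hypothetical and serve only to show how the weighted sum and the activation are combined.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, output range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_prime(x):
    """First derivative of the sigmoid: f(x) * (1 - f(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_prime(x):
    """First derivative of the hyperbolic tangent: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2


def purelin(x):
    """Linear (identity) transfer function."""
    return x


def purelin_prime(x):
    """First derivative of the linear transfer function."""
    return 1.0


def neuron(inputs, weights, bias, activation):
    """Output of one artificial neuron: y = f(sum_i w_i * x_i + b)."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(weighted_sum)


# Hypothetical values for the three inputs and weights of the neuron in Figure 1.
x = [0.5, -1.0, 2.0]
w = [0.1, 0.4, -0.3]
b = 0.2
y_tanh = neuron(x, w, b, math.tanh)  # non-linear activation
y_lin = neuron(x, w, b, purelin)     # linear activation returns the weighted sum itself
```

With the linear activation the neuron simply returns the weighted sum, which makes the effect of the non-linear activation easy to compare.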

Feed-Forward Neural Network (FFNN)
In an FFNN, the numbers of inputs and outputs are fixed and the knowledge of history variables is disregarded. Depending on the complexity of the training data, the architecture of the FFNN (more specifically, the number of hidden layers and the number of neurons in each layer) has to be determined. In this paper, the Machine Learning Toolbox (NNTRAINTOOL) is used to train the model under consideration. NNTRAINTOOL is a Matlab tool for training neural networks. It is divided into four parts: neural network architecture, algorithms, training progress, and plots. Interested readers are referred to [Neural network training tool-MATLAB nntraintool (mathworks.com accessed on 7 July 2021)].
The formulation of a fully connected feed-forward neural network with two hidden layers H, shown in Figure 3, is defined as follows:

$$ H_1 = f(W_1 x + b_1), \qquad H_2 = f(W_2 H_1 + b_2), \qquad \hat{y} = W_3 H_2 + b_3, \tag{6} $$

where the vector $\hat{y}$ is the output, the input vector x contains the features of a sample, $W_k$ is the weight matrix, and $b_k$ is the bias vector of the respective layer. The values of the hidden layers are collected in the vectors $H_1$ and $H_2$. In the current study, the hidden layers have the hyperbolic tangent activation function f, which is formulated as follows:

$$ f(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}. \tag{7} $$

This function is a rescaling of the logistic Sigmoid function, depicted in Figure 2, with an output range of [−1, 1]. Given that y is the true function, the feed-forward neural network can be written compactly as follows:

$$ \hat{y} = \mathrm{FFNN}(x, W, b) \approx y, \tag{8} $$

where the loss function L is minimized to find the optimized weights W and biases b of the trained neural network model. In this analysis, a feed-forward neural network with two hidden layers of 10 neurons each is created such that the strains ε are the features and the stresses σ are the outputs of the model. Different architectures with hidden layers of different sizes were also tried; however, the final result was not affected. Therefore, for simplicity, the architecture with two hidden layers is kept throughout this research. The mean-squared error (MSE) is taken as the loss function L, which is minimized during the training process:

$$ L = \frac{1}{N} \sum_{i=1}^{N} \left\| y_i - \hat{y}_i \right\|^2, \tag{9} $$

where N is the number of samples. Since the stress and strain values are on different scales of magnitude, it is important to scale the data set to a comparable range. Therefore, the input and target data are initially scaled to the range [−1, 1] by using the following formula:

$$ \bar{x} = 2\, \frac{x - x_{\min}}{x_{\max} - x_{\min}} - 1, \tag{10} $$

where $\bar{x}$ is the scaled or normalized value of x in the range [−1, 1], and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of x in the data set.
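To make the structure of such a network concrete, the following Python sketch evaluates a 3-10-10-3 feed-forward pass with hyperbolic tangent hidden layers, together with the [−1, 1] scaling and the mean-squared error. It is a sketch under stated assumptions rather than the trained model of this work: the weights are random placeholders (no training is performed), the output layer is assumed to be linear, the MSE is averaged over all output components, and the strain range of ±0.05 used for scaling is hypothetical.

```python
import math
import random


def scale(x, x_min, x_max):
    """Map x from [x_min, x_max] to the range [-1, 1]."""
    return 2.0 * (x - x_min) / (x_max - x_min) - 1.0


def unscale(x_scaled, x_min, x_max):
    """Inverse of scale(): map a value in [-1, 1] back to [x_min, x_max]."""
    return 0.5 * (x_scaled + 1.0) * (x_max - x_min) + x_min


def dense(W, b, x, activation=None):
    """One fully connected layer: activation(W x + b)."""
    out = [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
           for row, b_i in zip(W, b)]
    return [activation(v) for v in out] if activation else out


def init_layer(n_out, n_in, rng):
    """Random placeholder weights and zero biases (untrained, for illustration)."""
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out


def ffnn(x, layers):
    """Forward pass: tanh on the hidden layers, linear output layer (assumed)."""
    h = x
    for W, b in layers[:-1]:
        h = dense(W, b, h, math.tanh)
    W, b = layers[-1]
    return dense(W, b, h)


def mse(targets, predictions):
    """Mean-squared error, averaged over all samples and output components."""
    errors = [(t - p) ** 2
              for t_vec, p_vec in zip(targets, predictions)
              for t, p in zip(t_vec, p_vec)]
    return sum(errors) / len(errors)


rng = random.Random(0)
sizes = [3, 10, 10, 3]  # strains in, two hidden layers of 10 neurons, stresses out
layers = [init_layer(n_out, n_in, rng) for n_in, n_out in zip(sizes[:-1], sizes[1:])]

# Hypothetical strain sample, scaled with an assumed range of [-0.05, 0.05].
strain_scaled = [scale(e, -0.05, 0.05) for e in (0.01, -0.02, 0.005)]
stress_scaled = ffnn(strain_scaled, layers)  # scaled stress prediction (untrained)
```

In a real application the scaled prediction would be mapped back to physical stresses with `unscale`, using the minimum and maximum of the stress training data.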

Neural Network Training
Training a neural network model amounts to solving an optimization problem in which the weights of the model are updated repeatedly until their optimal values are reached. Hence, the optimization of the weights depends on the optimization algorithm (optimizer) that one chooses for modeling. In the presented contribution, Levenberg-Marquardt optimization [54] is employed, due to its memory efficiency. Furthermore, it is the fastest backpropagation algorithm, which updates the weights and biases with the following Newton-like update:

$$ W_{n+1} = W_n - \left[ J^T J + \chi\, \mathbf{1} \right]^{-1} J^T e, \tag{11} $$

where $W_{n+1}$ and $W_n$ are the weight vectors at iterations "n + 1" and "n", respectively. Furthermore, $\mathbf{1}$ is the identity matrix, χ is a parameter that adaptively controls the speed of convergence, and J is the Jacobian matrix that contains the derivatives of the network error vector e with respect to the weights W. It is defined by the following:

$$ J = \frac{\partial e}{\partial W}. \tag{12} $$

In the training phase, the weights are first initialized and then updated until one of the following predefined stopping criteria is satisfied:
1. The maximum number of epochs (iterations or repetitions) is reached.
2. The performance is minimized to the goal.
3. The validation performance has increased more than a prescribed number of times since the last time it decreased.
4. The maximum amount of time is exceeded.
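The damped Newton-like update and two of the stopping criteria listed above (maximum number of epochs and performance goal) can be illustrated on a deliberately small problem. The following Python sketch fits a single tanh neuron with two parameters; it is an assumption-laden toy example, not the toolbox implementation: the function names, the synthetic data, the initial value of χ, and the factors 0.1 and 10 used to adapt χ are all chosen for illustration. The validation and time criteria are omitted for brevity.

```python
import math


def residuals(params, xs, ys):
    """Error vector e = prediction - target for the model tanh(w * x + b)."""
    w, b = params
    return [math.tanh(w * x + b) - y for x, y in zip(xs, ys)]


def jacobian(params, xs):
    """Rows of J: derivatives of each error with respect to (w, b)."""
    w, b = params
    rows = []
    for x in xs:
        d = 1.0 - math.tanh(w * x + b) ** 2
        rows.append((d * x, d))
    return rows


def sse(params, xs, ys):
    """Sum of squared errors, used here as the performance measure."""
    return sum(e * e for e in residuals(params, xs, ys))


def lm_step(params, xs, ys, chi):
    """One update W_new = W - (J^T J + chi * 1)^(-1) J^T e for two parameters."""
    e = residuals(params, xs, ys)
    J = jacobian(params, xs)
    a11 = sum(r[0] * r[0] for r in J) + chi
    a12 = sum(r[0] * r[1] for r in J)
    a22 = sum(r[1] * r[1] for r in J) + chi
    g1 = sum(r[0] * ek for r, ek in zip(J, e))
    g2 = sum(r[1] * ek for r, ek in zip(J, e))
    det = a11 * a22 - a12 * a12  # the 2x2 system is solved by Cramer's rule
    dw = (a22 * g1 - a12 * g2) / det
    db = (a11 * g2 - a12 * g1) / det
    return (params[0] - dw, params[1] - db)


def train_lm(xs, ys, params=(0.0, 0.0), chi=1e-3, max_epochs=200, goal=1e-20):
    """Iterate until the performance goal or the maximum number of epochs is reached."""
    loss = sse(params, xs, ys)
    for _ in range(max_epochs):
        if loss <= goal:
            break
        trial = lm_step(params, xs, ys, chi)
        trial_loss = sse(trial, xs, ys)
        if trial_loss < loss:
            # Accept the step and reduce the damping (closer to Gauss-Newton).
            params, loss, chi = trial, trial_loss, chi * 0.1
        else:
            # Reject the step and increase the damping (closer to gradient descent).
            chi *= 10.0
    return params, loss


# Synthetic data generated from hypothetical target parameters w = 0.8, b = -0.3.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [math.tanh(0.8 * x - 0.3) for x in xs]
fitted, final_loss = train_lm(xs, ys)
```

For a network with many weights, the 2x2 solve would be replaced by a general linear solver, and the Jacobian would be obtained by backpropagation.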

Neural Network (NN) Based Elasticity
In this section, an NN-based small-strain elasticity model is developed using feed-forward neural networks (FFNNs), which is then embedded within the finite element formulation using the software tool AceGen [55].

Data Collection
Determining the input and output variables of the neural network is the first task in approximating elastic behavior by FFNNs for finite element applications. The strain-stress mapping can be approximated by the FFNNs without considering the loading history since, for small-strain elasticity (elastic deformation), the loading and unloading curves coincide.

Analytical Model
To verify the performance of the NN-based model, it is compared with an analytical model. The training data are collected solely from the analytical solution instead of using experimental data. To make sure that all the possible values of strains are covered in the training data, the inputs to the analytical model are generated by sampling the given range of the strain space evenly. For this purpose, Latin hypercube sampling (LHS) is used to generate the data; see Figure 4. As an example of elasticity, the linear elastic model for isotropic material is applied as the target model; a brief overview of this model is given below.
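As an illustration of the sampling strategy, the following Python sketch draws Latin hypercube samples from a three-dimensional strain space. It is a minimal hand-written version and not the sampling code used for Figure 4: the function name `latin_hypercube`, the number of samples, the seed, and the strain bounds of ±0.05 are hypothetical choices.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Latin hypercube sampling.

    Each dimension is divided into n_samples strata of equal width, one point
    is drawn at random inside every stratum, and the points are shuffled
    independently per dimension before being combined into samples.
    """
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        points = [low + (high - low) * (k + rng.random()) / n_samples
                  for k in range(n_samples)]
        rng.shuffle(points)
        columns.append(points)
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


# Hypothetical bounds for the three strain components of the 2D problem.
strain_bounds = [(-0.05, 0.05)] * 3
strain_samples = latin_hypercube(1000, strain_bounds)
```

Because every stratum of every strain component contains exactly one point, the whole range is covered even with a moderate number of samples.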

Isotropic Elasticity
For isotropic elastic material behavior, the Hookean strain energy is assumed to be a quadratic function as follows:

$$ \psi(\varepsilon) = \frac{\lambda}{2}\, \mathrm{tr}^2(\varepsilon) + \mu\, \mathrm{tr}(\varepsilon^2), \tag{13} $$

where λ > 0 and µ > 0 are the elastic Lamé constants, defined in terms of Young's modulus E, Poisson's ratio ν, and the shear modulus G as follows:

$$ \lambda = \frac{E\, \nu}{(1 + \nu)(1 - 2\nu)} \qquad \text{and} \qquad \mu = G = \frac{E}{2(1 + \nu)}. \tag{14} $$

Following the Coleman-Noll procedure, the stress tensor is obtained from the energy function ψ in (13) for isotropic material behavior as follows:

$$ \sigma = \frac{\partial \psi}{\partial \varepsilon} = \lambda\, \mathrm{tr}(\varepsilon)\, \mathbf{1} + 2 \mu\, \varepsilon, \tag{15} $$

where both the stresses σ and the strains ε are symmetric tensors. For the 2D case, the inputs of the model and their outputs are chosen as the strain and stress components, respectively.

Inputs: $$ \{ \varepsilon_{xx},\ \varepsilon_{yy},\ \varepsilon_{xy} \} \tag{16} $$

Outputs: $$ \{ \sigma_{xx},\ \sigma_{yy},\ \sigma_{xy} \} \tag{17} $$

The stresses as output data can be computed using (15) accordingly. The neural network is trained until a stopping criterion is met. After training, the model is saved. The NN-based elasticity model reads as follows:

$$ \sigma_{NN} = \mathrm{FFNN}(\varepsilon, W, b), \tag{18} $$

where $\sigma_{NN}$ is the stress predicted by the FFNN. As a representative example, we choose the following material parameters for the isotropic-elastic model in the training data collection: Young's modulus E = 21 kN/mm² and Poisson's ratio ν = 0.3. The data set is split into training (70%), validation (15%), and test (15%) subsets. A feed-forward neural network with an architecture of 3 − 10 − 10 − 3 is applied. It consists of three neurons each for the input and output layers, and 10 neurons for each of the two hidden layers. The Levenberg-Marquardt algorithm [54] is chosen as the training optimizer. Figure 5 depicts the performance of the neural network model throughout training, where the model performance reaches its optimum at epoch (iteration) 1400 over the scaled data set. This is based on the termination criteria introduced in Section 2. The mean squared error is decreased to 2.2 × 10⁻¹⁶, which takes a training time of 14 min 48 s (888 s); see Table 1.
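The analytical target model that generates the training outputs can be written down in a few lines. The Python sketch below evaluates the Lamé constants and the 2D Hookean stresses for a given strain sample, using the material parameters stated above (E = 21 kN/mm², ν = 0.3). The function names are hypothetical, and two assumptions are made explicit: the in-plane components are evaluated as in plane strain (no out-of-plane strain), and ε_xy denotes the tensorial shear strain, not the engineering shear strain.

```python
def lame_constants(E, nu):
    """Lamé constants (lambda, mu) from Young's modulus E and Poisson's ratio nu."""
    lam = E * nu / ((1.0 + nu) * (1.0 - 2.0 * nu))
    mu = E / (2.0 * (1.0 + nu))
    return lam, mu


def hooke_stress_2d(strain, E=21.0, nu=0.3):
    """In-plane stresses sigma = lam * tr(eps) * 1 + 2 * mu * eps.

    strain = (eps_xx, eps_yy, eps_xy) with the tensorial shear component;
    the out-of-plane strain is assumed to vanish (plane strain, an assumption).
    Returns (sigma_xx, sigma_yy, sigma_xy) in the units of E.
    """
    lam, mu = lame_constants(E, nu)
    e_xx, e_yy, e_xy = strain
    trace = e_xx + e_yy
    return (lam * trace + 2.0 * mu * e_xx,
            lam * trace + 2.0 * mu * e_yy,
            2.0 * mu * e_xy)


lam, mu = lame_constants(21.0, 0.3)
sigma = hooke_stress_2d((0.01, 0.0, 0.0))  # uniaxial strain sample (hypothetical)
```

Pairs of strain samples and the stresses returned by `hooke_stress_2d` would form the input and target data of the network.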

Representative Numerical Examples
In the following, the performance of the proposed machine learning based model is demonstrated through two representative numerical examples. The material parameters used for the isotropic-elastic model in the training data collection and in the finite element analysis with the software tools AceGen and AceFEM [55] are as follows: Young's modulus E = 21 kN/mm² and Poisson's ratio ν = 0.3. Here, too, an FFNN with the architecture 3 − 10 − 10 − 3 is applied, with 3 neurons each for the input and output layers, and 10 neurons for each hidden layer. Furthermore, the Levenberg-Marquardt algorithm is used as the training optimizer. To illustrate the computational methodology, representative tests under different loading conditions are presented.

Compression Test of a Plate
The first model problem is the uniaxial compression test of a rectangular plate. The geometric setup and the loading conditions of the specimen are depicted in Figure 6a. The plate is fixed at the bottom, and a prescribed displacement with an amplitude of u = −0.04 mm is imposed at the top surface of the plate with L = 1 mm. The geometric domain of the structure is discretized by 400 triangular T1 elements, leading to 231 nodes.
The load-deflection curve is depicted in Figure 6b. Next, the stresses computed with the neural network based model are compared with those of Hooke's model. It can be observed from the contour plots in Figure 7 that the FFNN predicts the stresses very accurately. This verifies the accuracy of the proposed neural network approach for elasticity problems.

A-Notched Bar in Tension
Next, a tensile test of the A-notched bar as depicted in Figure 8 is conducted. The bar is clamped at the left end and a prescribed displacement with an amplitude of u = 0.02 mm is imposed on the right end with L = 1 mm and r = 0.25 mm. The geometric domain of the structure is discretized by unstructured meshes with a total of 306 triangular T1 elements, leading to 189 nodes. Similarly, the A-notched bar stresses computed with the neural network model are compared with those of the purely finite element method. It can be seen from the contour plot of both FE and NN formulations that FFNN predicts the stresses very accurately, as shown in Figure 9.

Discussion
From the above-detailed studies, it can be concluded that the NN formulation works well for predicting linear elastic material behavior under different loading conditions and geometries. For a better understanding of the computational efficiency of the neural network model incorporated inside the finite element formulation, a comparison is made with the finite element analysis based on AceGen-generated C code; see Table 2. The evaluation time for the NN-embedded model is longer due to the size of the AceGen file (which includes the neural network formulation) and the functions necessary for the computation and normalization of the data required by the NN model. The positive aspect is that this has to be done only once; after a successful execution, the generated file can be used for finite element simulations. Similarly, Table 3 provides the simulation report obtained using AceFEM for the 2D plate (Section 3.2.1) and the A-notched bar (Section 3.2.2), respectively. Note that the computational effort depends heavily on the machine on which the simulations are running. Therefore, here, only the computation time is compared between the FEM and NN simulations. It can be concluded from the representative examples that although the NN accurately predicts the stress-strain relationship, it has no advantage in computational effort for problems in elasticity.

Neural Network (NN) Based Elasticity for Fracture Problems
The main objective of this section is to incorporate the NN-based elasticity model (developed in Section 3) within the finite element formulation of phase-field brittle fracture for the sole purpose of efficiently predicting the elasticity part of the phase field. For the sake of brevity, we omit the detailed description of the phase-field modeling of brittle fracture and summarize next the most important equations. For more details, the interested reader is referred to  and the citations therein.

Phase-Field Modeling of Brittle Fracture
In this section, we summarize the variational formulation for phase-field modeling of brittle fracture in elastic solids at small strains. The constitutive work density function (19) consists of the sum of a degrading elastic bulk energy, based on ψ in (13), and a contribution due to fracture, which represents the accumulated dissipative energy. Hereby, the crack phase-field d(x, t) = 0 represents the unbroken state of the solid and d(x, t) = 1 refers to the fully fractured state. The function $g(d) = (1 - d)^2$ models the degradation of the stored elastic energy of the solid due to fracture. The crack surface density function is defined as

$$ \gamma(d, \nabla d) = \frac{1}{2l}\, d^2 + \frac{l}{2}\, |\nabla d|^2 $$

in terms of the fracture length scale l that governs the regularization. The formulation (19) depends on two additional fracture parameters, namely, the critical fracture energy $\psi_c$ and ζ, which controls the post-critical range after crack initialization, as well documented in [61]. Based on the above-introduced work density function, we derive two governing equations for the coupled problem. The first equation is the stress equilibrium, i.e., the quasi-static form of the balance of linear momentum, defined as follows:

$$ \mathrm{Div}\, \sigma = 0, \tag{20} $$

in terms of the effective stress tensor σ and neglecting volume forces. Following [79], the evolution of the crack phase-field in the domain Ω represents the second governing equation (21), along with its homogeneous Neumann boundary condition ∇d · n = 0 on ∂Ω. Here, n is the outward normal on ∂Ω and η ≥ 0 is a material parameter that characterizes the artificial/numerical viscosity of the crack propagation. The crack driving force H is introduced as a local history variable that accounts for the irreversibility of the phase-field evolution by filtering out the maximum value of what is known as the crack driving state function D. This is achieved by introducing the Macaulay bracket $\langle x \rangle_+ = (x + |x|)/2$.
Note that only the tensile/positive part of the elastic energy in (13) is considered for computing the crack driving force. It is defined in terms of the positive strain tensor

$$ \varepsilon^+ = \sum_{a=1}^{\delta} \langle \varepsilon_a \rangle_+ \, N_a \otimes N_a. $$

Here, $\{\varepsilon_a\}_{a=1..\delta}$ are the principal elastic strains and $\{N_a\}_{a=1..\delta}$ are the principal strain directions; for further details on energy decomposition, see [56].
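Several of the ingredients named above are simple enough to be sketched directly. The following Python functions illustrate the Macaulay bracket, the degradation function g(d) = (1 − d)², the crack surface density function, the positive part of a 2D strain tensor obtained from its principal strains, and the history update that keeps the maximum of the crack driving state function. All function names are hypothetical, the spectral decomposition is restricted to 2D, and the crack driving state function D itself is not implemented here, so this is an illustration rather than the formulation used in the simulations.

```python
import math


def macaulay(x):
    """Macaulay bracket <x>_+ = (x + |x|) / 2."""
    return 0.5 * (x + abs(x))


def degradation(d):
    """Degradation function g(d) = (1 - d)^2."""
    return (1.0 - d) ** 2


def crack_surface_density(d, grad_d, l):
    """gamma(d, grad d) = d^2 / (2 l) + (l / 2) * |grad d|^2."""
    grad_sq = sum(g * g for g in grad_d)
    return d * d / (2.0 * l) + 0.5 * l * grad_sq


def positive_strain_2d(e_xx, e_yy, e_xy):
    """Positive part of a 2D strain tensor: sum_a <eps_a>_+ (N_a outer N_a).

    Uses the closed-form principal strains and directions of a symmetric
    2x2 tensor. Returns the components (xx, yy, xy).
    """
    mean = 0.5 * (e_xx + e_yy)
    radius = math.hypot(0.5 * (e_xx - e_yy), e_xy)
    e1, e2 = mean + radius, mean - radius
    theta = 0.5 * math.atan2(2.0 * e_xy, e_xx - e_yy)  # direction of e1
    c, s = math.cos(theta), math.sin(theta)
    p1, p2 = macaulay(e1), macaulay(e2)
    # N_1 = (c, s) and N_2 = (-s, c)
    return (p1 * c * c + p2 * s * s,
            p1 * s * s + p2 * c * c,
            (p1 - p2) * c * s)


def update_history(H_old, D_new):
    """History variable: keep the maximum crack driving state reached so far."""
    return max(H_old, D_new)


eps_plus_tension = positive_strain_2d(0.02, 0.01, 0.005)  # both principal strains > 0
eps_plus_compression = positive_strain_2d(-0.01, 0.0, 0.0)  # no tensile part
eps_plus_shear = positive_strain_2d(0.0, 0.0, 0.01)  # one tensile principal strain
```

For a strain state with only positive principal strains the positive part equals the full strain tensor, whereas a purely compressive state contributes nothing to the crack driving force.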

Neural Network Architecture
In this part, a feed-forward neural network with an architecture similar to that in Section 3 is applied. The mean-squared error (MSE) is used as the loss function, and the Levenberg-Marquardt algorithm is again chosen as the training optimizer. Since the stress σ and strain ε values are on different scales of magnitude, the data set must be normalized before training. Thus, the input and target data are initially scaled to the range [−1, 1]. The training data are collected from the analytical model, as for the NN-based elasticity; see Section 3. The material parameters used for the creation of the training data set are given in Table 4; among them are the fracture length scale l = 0.004 mm and the post-critical parameter ζ = 1.0 (dimensionless).
In the following, the performance of the proposed machine learning based model is examined for phase-field fracture simulations. Herein, the elasticity is predicted by the neural network. Substituting the above formulations, the ML-based model defines a function of the following format:

$$ \mathrm{FFNN}(\varepsilon, W, b), \tag{22} $$

where W is the weight matrix and b is the bias of the neural network.

Numerical Examples
To illustrate the computational methodology and verify the formulation, two benchmark problems are investigated.

Single-Edge-Notched Tension Test
The first benchmark test considers a square plate (L = 1 mm) with a horizontal notch placed at the middle height, as plotted in Figure 10 (left). The specimen is discretized using FEM with linear triangles. A mesh refinement in the expected fracture zone is applied. Furthermore, Figure 10 shows the contour plot of the crack phase-field d for different loading states up to final rupture.
Next, Figure 11 compares the FEM and neural network formulations in terms of the stress-strain relationship computed with both methods. The predicted stress closely follows the FE solution, which verifies the generalization of the NN model.

V-Notch Bar in a Tension Test
The second model problem is concerned with analyzing the brittle failure of a V-notch bar under tensile loading. The geometric setup and the loading conditions of the specimen are depicted in Figure 12a. The size of the specimen is chosen as follows: H = 1 mm, L = 0.35 mm, h = 0.1 mm, and V = 0.24 mm. The mesh size of the specimen is chosen to be $h_e$ = 0.004 mm in the expected fracture zone. The computation is performed by applying a displacement with an amplitude of u = 0.02 mm at the top edge, while the bottom edge is fixed in both directions x and y. The material parameters used in this simulation are similar to those of the single-edge-notched tension test, as shown in Table 4. The discretization uses a finite element (FEM) formulation with 3-noded linear triangular elements. The evolution of the crack phase-field d is reported in Figure 12b-d. The crack initiates at the notch tip and successively propagates horizontally from the notches inwards until the final rupture. To visualize the crack surface, deformed regions with a phase-field d ≈ 1 are plotted in Figure 12.
Next, Figure 13 compares the FEM and NN formulations in terms of the stress-strain relationship computed with both methods. As in the previous examples, the predicted stress follows the FE solution closely.

Discussion
It has been shown that the incorporation of the NN model within FE formulation for the fracture phase-field approach is successful. Herein, the elasticity is approximated using the neural network model rather than the classical methods. Hence, data-driven methods, such as neural networks, are promising in the solution of mechanical problems.
In this regard, Table 5 compares the AceGen-generated C code, while Table 6 provides a brief simulation report of the numerical examples to illustrate the computational efficiency of the proposed NN model. From these tables, it can be observed that the evaluation time is longer in the NN case, due to the larger size of the AceGen file. On the other hand, the AceFEM simulation report indicates that the total linear solver time, the total K and R time, and the CPU Mathematica time are smaller when using the NN method; see Table 6. It is worth mentioning that the computational time may vary on different machines; nevertheless, this again supports the applicability and efficiency of the neural network model.

Neural Network (NN)-Based Phase-Field Brittle Fracture
In this last section, an NN-based phase-field model is developed using feed-forward neural networks (FFNNs). The aim here is to embed a neural network-based model inside FEM that predicts the fracture phase-field d in such a way that it mimics the behavior of phase-field modeling of brittle fracture to its full potential. Toward this goal, the problem at hand is developed gradually in different steps. These steps demonstrate the scope of applicability and feasibility of this approach. Moreover, this stepwise procedure is a good strategy for confronting possible challenges throughout this study.

Data Collection
To verify the performance of the NN-based model by comparing its predictions with FE formulations, the training data are collected exclusively from the finite element analysis, i.e., experimental data are not considered. In this regard, both geometries in Figures 10-12 are employed to create the data set necessary for training the neural network model. Only data from a specific part of the geometry are considered, since the data of interest lie in the region where the crack is expected. For the elasticity part of the phase-field modeling of brittle fracture, the previously trained model is utilized. A second model is trained solely for predicting the fracture phase-field d. The accuracy of the predictions (approximations) depends on the complexity of the relationship between the input and output data, since the trained neural network approximates the mapping between the two. Unlike in the elasticity model, the relationship between the input and output data in the fracture process is highly nonlinear and complex. To understand these relationships, a sensitivity analysis provides better insight into which inputs are required to accurately predict the target output.
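The restriction of the training data to the expected crack region can be sketched as follows. This is a minimal illustration only: the function name `select_crack_zone`, the rectangular window, and the sample records are hypothetical and are not taken from the AceFEM workflow used in this work.

```python
import numpy as np

def select_crack_zone(coords, features, targets, x_range, y_range):
    """Keep only the samples whose coordinates lie inside a rectangular window.

    coords   : (n, 2) array of sample positions (e.g., Gauss points), assumed layout
    features : (n, k) array of NN inputs collected from the FE analysis
    targets  : (n,)   array of NN outputs (here: phase-field values)
    """
    coords = np.asarray(coords, dtype=float)
    inside = (
        (coords[:, 0] >= x_range[0]) & (coords[:, 0] <= x_range[1])
        & (coords[:, 1] >= y_range[0]) & (coords[:, 1] <= y_range[1])
    )
    return np.asarray(features)[inside], np.asarray(targets)[inside]

# Hypothetical records: four sample points with three features each.
coords = np.array([[0.10, 0.50], [0.40, 0.52], [0.80, 0.10], [0.60, 0.49]])
features = np.arange(12.0).reshape(4, 3)
targets = np.array([0.0, 0.3, 0.0, 0.9])

# Assumed window around the expected crack path.
X, y = select_crack_zone(coords, features, targets,
                         x_range=(0.3, 1.0), y_range=(0.45, 0.55))
```

In this toy setting only the second and fourth samples fall inside the window, so the training set shrinks from four records to two.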

Sensitivity Analysis
Sensitivity analysis is useful for specifying the effect of a particular input on an output under a set of assumptions. Different methods exist in the literature for conducting a thorough sensitivity analysis. For the scope of this study, however, a scatter-plot is sufficient. Such a representation is a qualitative method, which provides no sensitivity index or numerical value. For a full dependency analysis, as many plots as there are inputs are required.
As an example, Figure 14 illustrates the relationship of total stress due to the degradation of stored elastic energy and total strain. Hereby, the stresses decrease after a certain strain value related to the critical fracture energy ψ c introduced in (19).

Input and Output Relationship
From the sensitivity analysis, it can be observed that the input and output relationships are highly non-linear. Hence, not much can be learned about the input-output relationship from the scatter-plots of the outputs against their respective inputs. Therefore, the model is trained using different numbers of input features, and the architecture with the better performance is chosen to be embedded in the finite element formulations.
For the prediction of the fracture phase-field d at the current time-step, which is the output of the NN model, the strains, stresses, and driving force at the current time-step, along with the phase-field d at the previous time-step, are considered as the inputs to the NN model. Note that the stresses at the current time-step depend directly on the current fracture phase-field due to the degradation of the stored elastic energy. In this contribution, the stresses are degraded with the previous value of d. This results in a small increase in the global error, i.e., a slight loss in the performance of the neural network model; however, the error remains in an acceptable range and the model yields good results. The formulation of the NN-based phase-field model defines a function in the following format:

$$ d^{t}_{NN} = \mathrm{FFNN}(\varepsilon^{t}, \sigma^{t}_{NN}, H^{t}, d^{t-1}_{NN}, W, b), $$

where $d^{t}_{NN}$ is the fracture phase-field at the current time-step, $d^{t-1}_{NN}$ is its value at the previous time-step, $H^{t}$ is the current driving force, $\varepsilon^{t}$ is the current strain with $\varepsilon = [\varepsilon_{xx}, \varepsilon_{yy}, \varepsilon_{xy}]$, and $\sigma^{t}_{NN}$ is the NN-model trained a priori to approximate the stress as follows:

$$ \sigma^{t}_{NN} = \mathrm{FFNN}(\varepsilon^{t}, W, b), $$

for given strains $\varepsilon$, weight matrix W, and bias b of the neural network.
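The lagged evaluation described above, in which the stresses at the current step are degraded with the previous value of d and that previous value is also fed back as an input, can be sketched as a simple loop over load steps. This is a hedged illustration and not the AceGen implementation: the quadratic degradation function, the driving force taken as the running maximum of an energy measure, and all toy stand-in models below are assumptions introduced only for this example.

```python
import numpy as np

def run_phase_field_steps(strain_steps, stress_model, energy_fn, phase_field_model,
                          degrade=lambda d: (1.0 - d) ** 2):
    """Evaluate the phase-field step by step using the previous value of d.

    stress_model      : callable, strain (3,) -> undegraded stress (3,)
    energy_fn         : callable, strain (3,) -> scalar energy measure (assumption)
    phase_field_model : callable, input vector (8,) -> scalar phase-field
    degrade           : assumed degradation function, here (1 - d)^2
    """
    d_prev = 0.0   # intact material at the start
    H = 0.0        # driving force, assumed to be a running maximum
    d_history = []
    for eps in strain_steps:
        eps = np.asarray(eps, dtype=float)
        # Stresses are degraded with the PREVIOUS phase-field value.
        sigma = degrade(d_prev) * stress_model(eps)
        H = max(H, float(energy_fn(eps)))
        # Input order: 3 strains, 3 stresses, driving force, previous phase-field.
        x = np.concatenate([eps, sigma, [H, d_prev]])
        d_prev = float(phase_field_model(x))
        d_history.append(d_prev)
    return np.array(d_history)

# Toy stand-ins (hypothetical, not trained models).
toy_stress = lambda eps: 100.0 * eps
toy_energy = lambda eps: 50.0 * float(eps @ eps)
toy_phase_field = lambda x: x[6] / (1.0 + x[6])   # depends on H only

steps = [np.array([e, 0.0, 0.0]) for e in np.linspace(0.0, 0.1, 5)]
d_hist = run_phase_field_steps(steps, toy_stress, toy_energy, toy_phase_field)
```

In practice, `stress_model` and `phase_field_model` would be the two trained networks; the loop only shows where the previous phase-field value enters.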

Feed-Forward Neural Network
A feed-forward neural network with an 8-15-15-1 architecture is applied to learn the relationship between the input and output data sets. The input layer contains 8 neurons (3 strains, 3 stresses, 1 old phase-field, and 1 driving force), and the two hidden layers contain 15 neurons each. The output layer contains one neuron, which predicts the fracture phase-field d. The Levenberg-Marquardt algorithm [54] is applied as the optimizer for training the neural network. The rest of the NN structure follows the procedures and techniques described in Section 2. After training, a performance in the order of $10^{-5}$ is achieved, as demonstrated in Table 7.
Table 7 (recoverable entries): training duration, 1200 s; number of nodes per hidden layer, 15.
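The forward evaluation of an 8-15-15-1 network can be sketched in a few lines. The sketch below is illustrative only: the tanh hidden activation and linear output layer are assumptions (the actual choices follow Section 2), the weights are random placeholders rather than trained values, and the Levenberg-Marquardt training itself is not shown.

```python
import numpy as np

def init_ffnn(layer_sizes, seed=0):
    """Create placeholder (untrained) weights and zero biases for each layer."""
    rng = np.random.default_rng(seed)
    return [
        (0.1 * rng.standard_normal((n_out, n_in)), np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]

def ffnn_forward(x, params):
    """Forward pass: tanh on hidden layers (assumed), linear output layer (assumed)."""
    a = np.asarray(x, dtype=float)
    for W, b in params[:-1]:
        a = np.tanh(W @ a + b)
    W, b = params[-1]
    return W @ a + b

# 8 inputs -> 15 -> 15 -> 1 output (the phase-field d).
params = init_ffnn([8, 15, 15, 1])

# Total number of trainable parameters (weights plus biases).
n_params = sum(W.size + b.size for W, b in params)

# Evaluate on a zero input vector; with zero biases the output is zero.
d_pred = ffnn_forward(np.zeros(8), params)
```

With this layout the network has 8·15 + 15 = 135, 15·15 + 15 = 240, and 15·1 + 1 = 16 parameters per layer, i.e., 391 in total, which is small enough to be written out as an explicit function inside an element routine.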

Representative Numerical Examples
In the following, the performance of the proposed machine learning-based approach is further examined in the prediction of the fracture phase-field d. To this end, the finite element computation is performed in MATHEMATICA using symbolic-numeric programming with the ACEGEN and ACEFEM packages; see [55]. The weight matrices and biases of the trained model are incorporated through separate functions inside the finite element formulations. Therefore, the calculations can be done without depending on any other programs (e.g., the Matlab machine learning toolbox). After the incorporation of the neural network models within the finite element formulations, ACEFEM is used as the finite element environment for the solution of the multi-field problem. The neural network learns from the data; thus, the complexity of the NN model is significantly influenced by the data set. To show the contribution of the relevant input (namely, the previous NN-based phase-field value $d^{t-1}_{NN}$ at time t − 1) to the accuracy of the neural network model, two different cases are investigated.

Single-Edge-Notched Test
The same benchmark test described in Section 4.3.1 is further examined to verify the computational methodology by comparing the predicted results with those of the finite element analysis.
Case 1: The first case considers only the stress, strain, and driving force as the inputs to the neural network model. The formulation of the NN-based phase-field model defines a function in the following format:

$$ \mathrm{FFNN}(\varepsilon, \sigma, H, W, b), \tag{25} $$

where the stresses are calculated using the finite element method. Only one NN-model, which predicts the phase-field d, is incorporated within the FE analysis. The results are promising; however, there are some mismatches between the finite element solution and the NN-model, as shown in Figure 15.

Case 2:
The second case is similar to the first, with one additional input to the neural network. Here, the previous value of the fracture phase-field d is also considered as an input to the neural network model. The formulation has the following format:

$$ \mathrm{FFNN}(\varepsilon, \sigma, H, d^{t-1}_{NN}, W, b). $$

In terms of accuracy, this model predicts the relationship between the inputs and outputs much more accurately than the previous case; see Figure 16. Hence, for the next example, we employ the formulation of Case 2. In this example, different cases are considered to create the most efficient neural network model for the prediction of the fracture phase-field d, and they are compared in terms of both accuracy and efficiency. Table 8 reports the size of the AceGen-generated C code, while Table 9 compares the AceFEM solution times related to the generated code. It can be concluded that incorporating more than one model within the finite element formulation costs more evaluation time when generating the C code. On the other hand, in the AceFEM simulation, there is no significant difference in the total solve time.
Table 8 (recoverable entry): total size of C code, 30,257 and 55,070 bytes for the two compared cases.
Table 9. AceFEM-simulation report of Case 1 vs. Case 2 (phase-field).

V-Notch Bar in a Tension Test
Further verification of this methodology is illustrated using a V-notched bar in tension. The geometry and boundary conditions of the specimen are the same as before; see Figure 12.
For the analysis, the formulation introduced in Case 2 above is utilized, and the procedure of the previous example is followed. Qualitatively good results are obtained as depicted in Figure 17.

Conclusions
This work presented a neural network-based material modeling approach for elasticity and the phase-field modeling of brittle fracture. The approximation of the stresses and the fracture phase-field in an elastic domain has shown that NN-based approaches can learn both linear and highly non-linear relationships between the input and output data. A commonly used machine learning tool, the feed-forward neural network (FFNN), proved efficient for learning the complex input-output relationship, particularly when there is no history dependency between the input and output. Therefore, using NN-based models instead of conventional numerical procedures to obtain the stresses and the fracture phase-field can result in an efficient approach. The automatic symbolic differentiation tool AceGen provides a very convenient way of embedding the neural network formulations within the finite element formulations. The neural network-based models were verified by representative numerical examples. We have demonstrated that NN-based methods, particularly the FFNN, can provide accurate and feasible approximations and can accordingly be incorporated in finite element formulations. As our results suggest, the neural network predictions can be identical to the exact solution, provided that the training data set covers all the patterns that occur in the target problem. This was achieved for the approximation of stresses in the elastic domain. The question of building a universal NN-based model, which requires a universal training data set applicable to a wide range of boundary conditions and geometries, is still open and will be a topic of further research. In this regard, real experimental data of concrete failure under water (DFG Priority Program SPP 2020 Experimental-Virtual-Lab) will be used as training data for the ML approach in future work.