Inductive Determination of Rate-Reaction Equation Parameters for Dislocation Structure Formation Using Artificial Neural Network

Umeno, Yoshitaka; Kawai, Emi; Kubo, Atsushi; Shima, Hiroyuki; Sumigawa, Takashi

doi:10.3390/ma16052108


by Yoshitaka Umeno ¹,*, Emi Kawai ¹, Atsushi Kubo ¹, Hiroyuki Shima ² and Takashi Sumigawa ³

¹ Institute of Industrial Science, The University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 153-8505, Japan

² Department of Environmental Sciences, University of Yamanashi, 4-4-37, Takeda, Kofu, Yamanashi 400-8510, Japan

³ Department of Energy Conversion Science, Graduate School of Energy Science, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan

* Author to whom correspondence should be addressed.

Materials 2023, 16(5), 2108; https://doi.org/10.3390/ma16052108

Submission received: 29 December 2022 / Revised: 21 February 2023 / Accepted: 3 March 2023 / Published: 5 March 2023

(This article belongs to the Section Materials Simulation and Design)


Abstract

The reaction–diffusion equation approach, which solves differential equations for the development of the density distributions of mobile and immobile dislocations under mutual interactions, is widely used to model dislocation structure formation. A challenge in the approach is the difficulty of determining appropriate parameters in the governing equations, because deductive (bottom-up) determination for such a phenomenological model is problematic. To circumvent this problem, we propose an inductive approach that uses machine learning to search for a parameter set that produces simulation results consistent with experiments. Using a thin film model, we performed numerical simulations based on the reaction–diffusion equations for various sets of input parameters to obtain dislocation patterns. The resulting patterns are represented by the following two parameters: the number of dislocation walls ($p_{2}$) and the average width of the walls ($p_{3}$). Then, we constructed an artificial neural network (ANN) model to map between the input parameters and the output dislocation patterns. The constructed ANN model was found to be able to predict dislocation patterns; i.e., average errors in $p_{2}$ and $p_{3}$ for test data having 10% deviation from the training data were within 7% of the average magnitude of $p_{2}$ and $p_{3}$. The proposed scheme enables us to find appropriate constitutive laws that lead to reasonable simulation results, once realistic observations of the phenomenon in question are provided. This approach provides a new scheme to bridge models for different length scales in the hierarchical multiscale simulation framework.

Keywords:

reaction–diffusion model; dislocation structure; fatigue; artificial neural network; multiscale simulation; machine learning

1. Introduction

Fatigue is a common fracture mode in metals and accounts for a substantial fraction of failures in real industrial products, so a full understanding of the mechanism of fatigue fracture is required. In particular, the mechanism of fatigue crack formation under cyclic loading still leaves much room for investigation. It is widely understood that fatigue crack formation in macroscopic metal materials originates in the persistent slip band (PSB) formed as a result of the self-organization of dislocation structures [1]. Nevertheless, the PSB formation mechanisms proposed thus far need further examination and assessment, which calls for investigation by modeling and simulation. Moreover, recent experimental studies of fatigue in nanometer- or submicron-sized materials by Sumigawa et al. [2,3] indicate the possibility of as-yet-unrevealed mechanisms of fatigue at the nanometer and submicron scales, or “nano–micro fatigue”. Fatigue fracture was observed in a specimen smaller than the dimension of a PSB, which suggests that fatigue fracture can occur without the presence of a PSB; the variation of self-organized dislocation patterns due to the size effect should therefore play a key role in nano–micro fatigue [2,4]. This finding also calls for modeling and simulation to reveal the unknown, complicated mechanisms behind dislocation structure formation.

Walgraef and Aifantis proposed a phenomenological model of the dislocation pattern formation in metal under cyclic loading based on the rate-reaction (reaction–diffusion) theory [5,6,7,8,9,10,11,12,13,14]. The reaction–diffusion equation approach, which solves differential equations of the development of density distributions of mobile and immobile dislocations under mutual interactions, has been widely used to simulate the dislocation structure formation and was found to be useful to discuss its mechanisms. A challenge in such a phenomenological model is, however, the difficulty in the determination of appropriate parameters in the governing equations. Parameters are often fitted so that the simulation results become consistent with experimental observations of the phenomenon in question, but this can be daunting when the governing equations contain a number of parameters.

An alternative way would be bottom-up (deductive) determination based on a different physical model covering a smaller length scale. For example, dislocation mobility used in the discrete dislocation dynamics can be obtained by a molecular dynamics simulation of a single dislocation. Although this scheme may look straightforward, consistency with experimental facts is not guaranteed because the simulation at the lower scale may produce nontrivial deviation from reality due to technical constraints such as the limitation of spatial and temporal scales of the simulation setup. Moreover, the deductive approach must rely on a multi-story hierarchy of multiscale models if the phenomenological model is on the macroscopic side, which can make the parameter determination substantially prone to accumulated deviation.

In this study, we propose an inductive approach for the parameter determination utilizing machine-learning. Using a thin film model, we performed numerical simulations of dislocation structure formation based on the Walgraef–Aifantis (WA) model using various sets of input parameters. The results of dislocation structures (density distributions) were characterized with devised algorithms. Then, we constructed an artificial neural network (ANN) model that predicts the output dislocation patterns from the input parameters. The application of the proposed scheme for finding appropriate constitutive laws consistent with experiments was discussed. This new scheme paves the way for bridging models for different length scales in the hierarchical multiscale simulation framework.

2. Methodology

2.1. Reaction-Diffusion Equation by Walgraef-Aifantis

Walgraef and Aifantis proposed a reaction–diffusion model (hereafter called the WA model) to describe the temporal change in dislocation densities, which has a long history and is widely used for its convenience. In this model, dislocations are divided into two categories, i.e., mobile and immobile dislocations. The former are free to move in response to stress exerted in the slip plane, while the latter are trapped or move slowly. The mobile and immobile dislocation densities at position $x$ and time $t$ are described by $ρ_{m} (x, t)$ and $ρ_{i} (x, t)$, respectively. The temporal evolution of the dislocation density functions is obtained by solving the following coupled non-linear partial differential equations [14]:

$$\frac{\partial ρ_{i}}{\partial t} = D_{i} \frac{\partial^{2} ρ_{i}}{\partial x^{2}} + α (ρ_{0 i} - ρ_{i}) - β ρ_{i} + γ ρ_{m} ρ_{i}^{2} \tag{1}$$

$$\frac{\partial ρ_{m}}{\partial t} = D_{m} \frac{\partial^{2} ρ_{m}}{\partial x^{2}} + β ρ_{i} - γ ρ_{m} ρ_{i}^{2} \tag{2}$$

The first terms on the right-hand side of the two equations represent the diffusion-like behavior of the immobile and mobile dislocations, with $D_{i}$ and $D_{m}$ being constants for the strength of diffusivity. Pinning-up of newly produced dislocations during the formation of PSBs is expressed by $α (ρ_{0 i} - ρ_{i})$, where $α$ represents the annihilation rate and $ρ_{0 i}$ is a constant describing the source of immobile dislocations, which is assumed to exist uniformly in the system. Release of immobile dislocations from the dislocation forest is described by $β ρ_{i}$, where $β$ is the rate of release from the forest. Capture of mobile dislocations by immobile dipoles is represented by the nonlinear term $γ ρ_{m} ρ_{i}^{2}$, with $γ$ being the capture rate.

While the two coupled equations for the evolution of the dislocation density functions include five parameters (WA parameters; $D_{i}, D_{m}, α, β, γ$), not all of them are independent, meaning that the five parameters should not be determined arbitrarily. According to Schiller et al. [15], the following relations hold among the pinning-up rate, the capture rate and the diffusivity strengths:

$$α = \frac{D_{i}}{l_{i}^{2}} \tag{3}$$

$$γ = \frac{v_{m}^{2}}{2 ρ_{0 i}^{2} D_{m}} \tag{4}$$

where $l_{i}$ and $v_{m}$ are the mean free path of immobile dislocations and the effective velocity of mobile dislocations considering trapping by obstacles, respectively.

In addition, it is known that there are two bifurcations in the dislocation pattern depending on the value of $β$. If $β$ equals or exceeds a critical value $β_{H}$, the dislocation pattern oscillates with time, which is called the Hopf bifurcation. The other critical value is $β_{c}$ (assuming $β_{c} < β_{H}$), at which the Turing instability occurs; i.e., the dislocation pattern is formed if $β$ exceeds $β_{c}$. According to stability analysis, $β_{H}$ and $β_{c}$ are related to the other parameters by [14]

$$β_{c} = \left(\sqrt{α} + \sqrt{\frac{c D_{i}}{D_{m}}}\right)^{2} \tag{5}$$

$$β_{H} = α + c \tag{6}$$

$$c = γ ρ_{0 i}^{2} \tag{7}$$

The five WA parameters were varied as follows, taking the abovementioned traits into consideration:

$D_{i}$ was set to $10^{-4}$, $10^{-3.5}$ or $10^{-3}$ µm²/s.

$D_{m}$ was set such that $D_{i} / D_{m} = 0.2 \times 10^{-2}$, $0.5 \times 10^{-2}$ or $10^{-2}$ because $D_{i} / D_{m}$ should be at most $10^{-2}$ [15].

$β$ was set to $β_{1}$, $β_{2}$ or $β_{3}$, where $β_{1} = 0.9 β_{c}$, $β_{2} = β_{c} + \frac{β_{H} - β_{c}}{3}$ and $β_{3} = β_{2} + \frac{β_{H} - β_{c}}{3}$ (NB: $β_{1} < β_{c} < β_{2} < β_{3} < β_{H}$).

$α$ and $γ$ were determined according to Equations (3) and (4), with $l_{i} = 10^{-2}$ µm, $v_{m} = 10$ µm/s and $ρ_{0 i} = 0.5$ µm⁻² [15], the last of which is an average value of the initial distribution of $ρ_{i}$.

Therefore, 27 ($= 3 \times 3 \times 3$) parameter sets were used in total.
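The construction of the 27 parameter sets can be sketched in a few lines of Python. This is only an illustrative reading of Equations (3)–(7) and the sampling rules above, not the authors' code; the function name `wa_parameter_set`, the dictionary keys and the constant names are assumptions made for this example.

```python
import itertools
import math

# Constants quoted in Section 2.1; the Python names are illustrative.
L_I = 1e-2     # mean free path of immobile dislocations, l_i [um]
V_M = 10.0     # effective velocity of mobile dislocations, v_m [um/s]
RHO_0I = 0.5   # source density of immobile dislocations, rho_0i [um^-2]


def wa_parameter_set(D_i, ratio, beta_level):
    """Build one WA parameter set from D_i, ratio = D_i/D_m and a beta level (1, 2 or 3)."""
    D_m = D_i / ratio
    alpha = D_i / L_I**2                                         # Equation (3)
    gamma = V_M**2 / (2.0 * RHO_0I**2 * D_m)                     # Equation (4)
    c = gamma * RHO_0I**2                                        # Equation (7)
    beta_c = (math.sqrt(alpha) + math.sqrt(c * D_i / D_m)) ** 2  # Equation (5)
    beta_H = alpha + c                                           # Equation (6)
    third = (beta_H - beta_c) / 3.0
    beta = {1: 0.9 * beta_c,
            2: beta_c + third,
            3: beta_c + 2.0 * third}[beta_level]
    return {"D_i": D_i, "D_m": D_m, "alpha": alpha, "beta": beta,
            "gamma": gamma, "beta_c": beta_c, "beta_H": beta_H}


# Three values of D_i, three ratios D_i/D_m and three beta levels.
parameter_sets = [
    wa_parameter_set(D_i, ratio, level)
    for D_i, ratio, level in itertools.product(
        (1e-4, 10**-3.5, 1e-3), (0.2e-2, 0.5e-2, 1e-2), (1, 2, 3))
]
print(len(parameter_sets))
```

For the sampled values, this sketch gives $β_{c} < β_{H}$ in every set, which is the ordering assumed in the text.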

As described above, $ρ_{i}$ and $ρ_{m}$ are functions of the space coordinate $x$ and time $t$, meaning that we consider a one-dimensional distribution of dislocations. $x$ is in the range $0 \leq x \leq l$, where $l$ indicates the thickness of the space. In our simulation, we set $l = 1.0$ µm. At the boundaries of the space, the spatial derivatives of the dislocation densities are assumed to be zero; i.e., $\frac{\partial ρ_{i, m}}{\partial x} = 0$ at $x = 0, l$.

Here, we explain two schemes to construct the initial dislocation distributions from which the coupled reaction–diffusion calculations start. One is a simple way of using random fractional values $f$ ($0 < f < 1$) as the initial dislocation density (Scheme A). The other is a devised scheme that gives smoother but still random distributions (Scheme B). There, the initial dislocation distributions were generated by superposing sinusoidal waves with various (given) wavenumbers and random amplitudes, controlled to satisfy the given boundary condition and minimum/maximum values. The detailed procedure of Scheme B is explained in Appendix A. From these initial distributions, we solved the coupled partial differential equations (Equations (1) and (2)) numerically using the Euler method with a time step of $1.0 \times 10^{-6}$ s.
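As a rough illustration of this time integration, the following sketch advances Equations (1) and (2) by explicit Euler steps on a uniform grid, with the zero-gradient boundary condition imposed through mirrored ghost points. The grid size (101 points), the central-difference Laplacian and all function names are assumptions of this example; the text specifies only the Euler method, the time step and the boundary condition.

```python
import numpy as np


def laplacian_neumann(u, dx):
    """Central second difference with zero-gradient boundaries (mirrored ghost points)."""
    padded = np.concatenate(([u[1]], u, [u[-2]]))
    return (padded[2:] - 2.0 * padded[1:-1] + padded[:-2]) / dx**2


def wa_euler_step(rho_i, rho_m, dt, dx, D_i, D_m, alpha, beta, gamma, rho_0i):
    """One explicit Euler step of Equations (1) and (2)."""
    release = beta * rho_i                # release from the forest
    capture = gamma * rho_m * rho_i**2    # capture by immobile dipoles
    d_rho_i = (D_i * laplacian_neumann(rho_i, dx)
               + alpha * (rho_0i - rho_i) - release + capture)
    d_rho_m = D_m * laplacian_neumann(rho_m, dx) + release - capture
    return rho_i + dt * d_rho_i, rho_m + dt * d_rho_m


# Example run: one of the parameter sets of Section 2.1 and a Scheme A start.
D_i, D_m, rho_0i = 1e-4, 1e-2, 0.5           # [um^2/s], [um^2/s], [um^-2]
alpha = D_i / (1e-2) ** 2                    # Equation (3) with l_i = 1e-2 um
gamma = 10.0**2 / (2.0 * rho_0i**2 * D_m)    # Equation (4) with v_m = 10 um/s
c = gamma * rho_0i**2
beta_c = (np.sqrt(alpha) + np.sqrt(c * D_i / D_m)) ** 2
beta = beta_c + (alpha + c - beta_c) / 3.0   # beta_2

n_grid, length, dt = 101, 1.0, 1.0e-6        # grid size is an assumption
dx = length / (n_grid - 1)
rng = np.random.default_rng(0)
rho_i = rng.uniform(0.0, 1.0, n_grid)        # Scheme A: random fractional values
rho_m = rng.uniform(0.0, 1.0, n_grid)
for _ in range(100):                         # far fewer steps than a converged run
    rho_i, rho_m = wa_euler_step(rho_i, rho_m, dt, dx,
                                 D_i, D_m, alpha, beta, gamma, rho_0i)
```

A convenient sanity check for such a sketch is the homogeneous state $ρ_{i} = ρ_{0 i}$, $ρ_{m} = β / (γ ρ_{0 i})$, for which both right-hand sides vanish and a step should leave the densities unchanged.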

2.2. Characterization of Resulting Dislocation Structure

After a sufficient number of iterations of the numerical integration (time development) of Equations (1) and (2), we obtain converged dislocation densities ($ρ_{i}$ and $ρ_{m}$). Since the density distribution of immobile dislocations should represent the dislocation pattern formed as a result of the diffusion and reaction of mobile and immobile dislocations, we analyze the form of $ρ_{i}$. Figure 1 schematically shows two typical distribution patterns of immobile dislocations. Figure 1a depicts the case where we find peaks of dislocation density aligned with low-density areas lying in between, which can be regarded as the formation of the wall structure. Thus, the peaks in $ρ_{i}$ will be called “walls” hereafter in this paper. In contrast, no characteristic shape is found in some cases such as Figure 1b, indicating no formation of self-organized dislocation patterns.

Now, we need an algorithm to extract features of the distribution pattern from the function $ρ_{i} (x)$ ($0 \leq x \leq l$), namely: (a) the presence of a self-organized pattern; (b) the number of walls; and (c) the average width of walls. Our algorithm works as follows. A self-organized pattern is regarded as formed when more than 50% of $ρ_{i} (x)$ exceeds the threshold $ρ_{th} ≔ ρ_{\min} + 0.05 (ρ_{\max} - ρ_{\min})$, where $ρ_{\max}$ and $ρ_{\min}$ are the maximum and minimum values of $ρ_{i} (x)$. By detecting the points where the curves $y = ρ_{i} (x)$ and $y = ρ_{th}$ intersect each other, the number of walls can be counted. The width of a wall is defined as the width of a continuous region where $ρ_{i} (x) \geq ρ_{th}$.
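The characterization can be sketched as follows for a density profile sampled on a uniform grid. The sketch follows the wording above literally; the function name `characterize_walls`, the treatment of a wall as a run of consecutive samples at or above the threshold, and the unit conversions (walls per µm, widths in nm, with $x$ given in µm) are assumptions of this example.

```python
import numpy as np


def characterize_walls(rho_i, x, rel_threshold=0.05, min_fraction=0.5):
    """Return (pattern_formed, walls per um, average wall width in nm).

    rho_i and x are 1-D arrays on a uniform grid, with x in um.
    """
    rho_min, rho_max = rho_i.min(), rho_i.max()
    rho_th = rho_min + rel_threshold * (rho_max - rho_min)
    above = rho_i >= rho_th
    # (a) pattern flag: more than half of the samples lie above the threshold
    pattern_formed = bool(above.mean() > min_fraction)
    # (b) walls: contiguous runs of samples at or above the threshold,
    #     located from the rising and falling edges of the mask
    edges = np.diff(np.concatenate(([0], above.astype(int), [0])))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)        # exclusive end indices
    n_walls = len(starts)
    # (c) average width: number of samples in each run times the grid spacing
    dx = x[1] - x[0]
    widths_um = (ends - starts) * dx
    mean_width_nm = 1e3 * widths_um.mean() if n_walls else 0.0
    walls_per_um = n_walls / (x[-1] - x[0])
    return pattern_formed, walls_per_um, mean_width_nm


# Synthetic profile with three density minima inside a 1 um cell.
x = np.linspace(0.0, 1.0, 1001)
rho = 0.5 + 0.4 * np.cos(6.0 * np.pi * x)
p1, p2, p3 = characterize_walls(rho, x)
```

For this synthetic cosine profile, the three minima separate four regions above the threshold, including the two regions touching the boundaries.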

2.3. Mapping of WA Parameters and Resulting Dislocation Structure

2.3.1. Structure of Artificial Neural Network model

Our ANN model consists of five layers including the input and output layers. The number of nodes on each layer is 5 → 6 → 6 → 4 → 3. The input layer has five nodes corresponding to the WA parameters ($D_{i}$, $D_{m}$, $α$, $β$ and $γ$). The output layer has three nodes, which give quantities ($p_{1}$, $p_{2}$ and $p_{3}$) for characterization of the resulting dislocation pattern. $p_{1}$ is a Boolean value representing whether a dislocation wall structure is formed. $p_{2}$ and $p_{3}$ are the number of walls [µm⁻¹] and the average width [nm] of the formed walls, respectively.

Figure 2 shows a schematic illustration of the ANN architecture. The present ANN model is based on a typical feed-forward network consisting of five layers: one input layer (hereafter referred to as Layer 0), three internal layers (Layers 1, 2, and 3), and one output layer (Layer 4). Each layer consists of nodes. The numbers of nodes in Layer $n$, $N_{n}$, are set to 5, 6, 6, 4, and 3 for Layers 0, 1, 2, 3, and 4, respectively. The nodes in Layer 0 (input) and Layer 4 (output) correspond to the WA parameters ($D_{i}, D_{m}, α, β, γ$) and the characterization parameters ($p_{1}, p_{2}, p_{3}$), respectively. The state of each node is given by a real number, and hereafter we refer to the state of the $q$-th node in the $n$-th layer as $x_{q}^{n}$. The states of nodes in the input layer ($n = 0$), $x_{q}^{0}$ ($q = 1, 2, 3, 4, 5$), are given by the common logarithms of the WA parameters:

$$\begin{array}{l} x_{1}^{0} = \log_{10} D_{i}, \\ x_{2}^{0} = \log_{10} D_{m}, \\ x_{3}^{0} = \log_{10} α, \\ x_{4}^{0} = \log_{10} β, \\ x_{5}^{0} = \log_{10} γ . \end{array} \tag{8}$$

Note that we adopted the logarithms of the parameters instead of the parameters themselves because the parameters are expected to vary over a wide range (by several orders of magnitude). The states of the internal and output layers are determined by the previous layer as [16]

$$x_{q}^{n} = f_{n} \left(w_{q 0}^{n} + \sum_{r = 1}^{N_{n - 1}} w_{q r}^{n} x_{r}^{n - 1}\right), \tag{9}$$

where $w_{q r}^{n}$ denotes the weight parameter and $w_{q 0}^{n}$ is the bias parameter ($n = 1, \dots, 4$; $q = 1, \dots, N_{n}$; $r = 1, \dots, N_{n - 1}$), which are the internal parameters to be optimized by machine learning. Therefore, the total number of the parameters is 121 ($= 6 \times 6 + 7 \times 6 + 7 \times 4 + 5 \times 3$). The function $f_{n} (x)$ represents the activation functions defined as [16]

$$f_{n} (x) = \begin{cases} \frac{1}{1 + e^{- x}} & (n = 1, 2, 3) \\ x & (n = 4) \end{cases} \tag{10}$$

The node states of the output layer are interpreted as the predicted characterization parameters, i.e., $p_{q} = x_{q}^{4}$ ($q = 1, 2, 3$). Note that the Boolean parameter $p_{1}$ is treated as a real number in the ANN model for simplicity (if a wall structure is formed, then $p_{1} = 1$; otherwise $p_{1} = 0$). In this description, the value of $p_{1}$ can be interpreted as the probability of formation of a wall structure.
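The forward pass of Equations (8)–(10) is compact enough to sketch directly. The random initialization, the storage of each layer as a (weight matrix, bias vector) pair and the function names are assumptions of this example rather than details given in the text.

```python
import numpy as np

LAYER_SIZES = (5, 6, 6, 4, 3)   # N_0 ... N_4


def init_params(rng, scale=0.5):
    """Random weights and zero biases for each layer n = 1..4 (illustrative choice)."""
    return [(rng.normal(scale=scale, size=(n_out, n_in)), np.zeros(n_out))
            for n_in, n_out in zip(LAYER_SIZES[:-1], LAYER_SIZES[1:])]


def count_params(params):
    """Total number of weights and biases."""
    return sum(W.size + b.size for W, b in params)


def forward(params, wa_params):
    """Map (D_i, D_m, alpha, beta, gamma) to (p_1, p_2, p_3)."""
    x = np.log10(np.asarray(wa_params, dtype=float))        # Equation (8)
    last = len(params) - 1
    for n, (W, b) in enumerate(params):
        z = b + W @ x                                        # Equation (9)
        x = z if n == last else 1.0 / (1.0 + np.exp(-z))     # Equation (10)
    return x


params = init_params(np.random.default_rng(0))
p = forward(params, (1e-4, 1e-2, 1.0, 1.7e3, 2.0e4))
print(count_params(params), p.shape)
```

Counting the entries of the four weight matrices and bias vectors reproduces the 121 internal parameters quoted above.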

The numbers of intermediate layers and nodes on the layer are arbitrarily chosen, but should affect the performance of the ANN model. This will be discussed later in this paper.

2.3.2. Training of ANN

The reaction–diffusion equations were solved for the parameter sets and the initial structures described in Section 2.1 (i.e., $27 \times 10 = 270$ cases for each scheme). Among these cases, we found that the final $ρ_{m}$ had negative values for the cases with $D_{i} / D_{m} = 0.2 \times 10^{-2}$ and $D_{i} = 10^{-3}$ µm²/s. This was presumably because $D_{m}$ was relatively large, resulting in numerical errors in solving the partial differential equations with the Euler method. Excluding these parameter sets, we used 240 ($= 24 \times 10$) cases as training data of the ANN model for each scheme. Then, the ANN model was trained to map the input WA parameters to the resulting dislocation structure ($p_{1}$, $p_{2}$ and $p_{3}$).

The loss function, $L$, which represents the deviation of the ANN prediction from the actual results, is set to be

$$L = \sum_{k = 1}^{n} \sum_{i = 1}^{3} {(p_{i, k} - p_{i, k}^{0})}^{2} \tag{11}$$

where $p_{i, k}$ indicates $p_{i}$ of case $k$ out of the combinations of parameter sets and initial distributions by either Scheme A or B, $p_{i, k}^{0}$ is the corresponding reference value, and $n = 240$ (24 parameter sets and 10 initial distributions) is the total number of cases. The ANN model parameters (weights and biases) were optimized to reduce the loss function with the steepest descent method.
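To make the optimization step concrete, the sketch below minimizes a loss of the form of Equation (11) by steepest descent. For brevity it fits a toy linear map instead of the five-layer ANN, so that the gradient can be written in closed form; the synthetic data, the step-size rule and all names are assumptions of this example, and the actual training would require the gradient of the ANN (e.g., by backpropagation).

```python
import numpy as np


def sum_squared_loss(pred, ref):
    """Equation (11): squared deviation summed over outputs and cases."""
    return float(np.sum((pred - ref) ** 2))


def steepest_descent_linear(X, P_ref, n_iter=50):
    """Fit P = X @ W by steepest descent on the sum-of-squares loss (toy stand-in)."""
    W = np.zeros((X.shape[1], P_ref.shape[1]))
    # Step size 1/Lipschitz constant of the gradient, which is 2 * sigma_max(X)^2.
    learning_rate = 1.0 / (2.0 * np.linalg.norm(X, 2) ** 2)
    history = []
    for _ in range(n_iter):
        residual = X @ W - P_ref
        history.append(sum_squared_loss(X @ W, P_ref))
        gradient = 2.0 * X.T @ residual      # dL/dW for the linear model
        W -= learning_rate * gradient        # move against the gradient
    return W, history


rng = np.random.default_rng(0)
X = rng.normal(size=(240, 5))                # 240 cases, 5 inputs (synthetic)
W_true = rng.normal(size=(5, 3))
P_ref = X @ W_true                           # synthetic reference outputs
W_fit, history = steepest_descent_linear(X, P_ref)
```

The recorded `history` plays the role of the loss curve monitored during training.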

2.3.3. Test of ANN

To examine the predictability of the trained ANN, i.e., the reliability of the prediction when a parameter set deviates from the training datasets, we prepared test datasets (denoted with a hat, e.g., $\hat{D}_{i}$) using a predetermined value indicating the amount of deviation, $Δ_{i, m}$, as follows. First, $\hat{D}_{i}$ and $\hat{D}_{m}$ were determined as

$$\hat{D}_{i, m} = D_{i, m} + R_{i, m} Δ_{i, m} \tag{12}$$

with $R_{i}$ and $R_{m}$ randomly taking the integer values $-1$, $0$ or $1$. Here, all combinations of $R_{i}$ and $R_{m}$ excluding $R_{i} = R_{m} = 0$ (i.e., eight cases) were produced with the same probability. Next, using the determined $\hat{D}_{i}$ and $\hat{D}_{m}$, we obtained $\hat{α}$ and $\hat{γ}$ as follows:

$$\hat{α} = \frac{\hat{D}_{i}}{l_{i}^{2}} \tag{13}$$

$$\hat{γ} = \frac{v_{m}^{2}}{2 ρ_{0 i}^{2} \hat{D}_{m}} \tag{14}$$

where the values of $l_{i}$, $v_{m}$ and $ρ_{0 i}$ are the same as those given in Section 2.1. Finally, $\hat{β}$ was determined in the following way so that the magnitude relation among $β$, $β_{c}$ and $β_{H}$ was kept. After calculating $\hat{β}_{c}$ and $\hat{β}_{H}$ based on $\hat{D}_{i}$ and $\hat{D}_{m}$, $\hat{β}$ was obtained as

$$\hat{β} = \begin{cases} 0.9 \hat{β}_{c} + R^{'} Δ & : β < β_{c} \\ \hat{β}_{c} + \frac{β - β_{c}}{β_{H} - β_{c}} (\hat{β}_{H} - \hat{β}_{c}) + R^{'} Δ & : β_{c} < β < β_{H} \end{cases} \tag{15}$$

where $R^{'}$ randomly takes the integer values $-1$, $0$ or $1$ and $Δ$ represents the amount of deviation.

To examine the predictability of the ANN model according to the deviation magnitude of the test datasets from the training datasets, we set $Δ_{i, m}$ and $Δ$ to be 0.1, 1 or 10% of the corresponding parameter values.
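The generation of one deviated parameter set by Equations (12)–(15) might look as follows. The sketch assumes that $Δ_{i}$, $Δ_{m}$ and $Δ$ are the fractions `rel_dev` of $D_{i}$, $D_{m}$ and $β$, respectively, which is one reading of "the corresponding parameter values"; the function names are likewise assumptions.

```python
import math
import random

L_I, V_M, RHO_0I = 1e-2, 10.0, 0.5   # l_i [um], v_m [um/s], rho_0i [um^-2]


def derived_parameters(D_i, D_m):
    """Return (alpha, gamma, beta_c, beta_H) from Equations (3)-(7)."""
    alpha = D_i / L_I**2
    gamma = V_M**2 / (2.0 * RHO_0I**2 * D_m)
    c = gamma * RHO_0I**2
    beta_c = (math.sqrt(alpha) + math.sqrt(c * D_i / D_m)) ** 2
    return alpha, gamma, beta_c, alpha + c


def perturbed_set(D_i, D_m, beta, rel_dev, rng):
    """Draw one test parameter set deviating from a training set."""
    # Eight equally likely (R_i, R_m) pairs, excluding (0, 0)
    pairs = [(ri, rm) for ri in (-1, 0, 1) for rm in (-1, 0, 1) if (ri, rm) != (0, 0)]
    R_i, R_m = rng.choice(pairs)
    D_i_hat = D_i + R_i * rel_dev * D_i                    # Equation (12)
    D_m_hat = D_m + R_m * rel_dev * D_m
    _, _, beta_c, beta_H = derived_parameters(D_i, D_m)
    alpha_hat, gamma_hat, beta_c_hat, beta_H_hat = derived_parameters(D_i_hat, D_m_hat)
    shift = rng.choice((-1, 0, 1)) * rel_dev * beta        # R' * Delta
    if beta < beta_c:                                      # Equation (15), first case
        beta_hat = 0.9 * beta_c_hat + shift
    else:                                                  # Equation (15), second case
        frac = (beta - beta_c) / (beta_H - beta_c)
        beta_hat = beta_c_hat + frac * (beta_H_hat - beta_c_hat) + shift
    return {"D_i": D_i_hat, "D_m": D_m_hat, "alpha": alpha_hat,
            "beta": beta_hat, "gamma": gamma_hat}


rng = random.Random(0)
_, _, b_c, b_H = derived_parameters(1e-4, 1e-2)
beta_2 = b_c + (b_H - b_c) / 3.0
test_set = perturbed_set(1e-4, 1e-2, beta_2, 0.10, rng)    # 10% deviation
```

With `rel_dev = 0` the sketch returns the original parameter set, which is a simple way to check that the relative position of $β$ between $β_{c}$ and $β_{H}$ is preserved.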

3. Results and Discussion

3.1. Training of ANN

Figure 3 shows changes in the loss function during the steepest descent iterations. In both the cases with initial structures by Scheme A and Scheme B, the loss function was successfully reduced. This means that the training of the ANN model on the prepared datasets was achieved.

Figure 4 compares the dislocation structures ($p_{1}$, $p_{2}$ and $p_{3}$) predicted by the trained ANN with the actual WA results. Note that the predicted $p_{1}$ can take fractional (non-integer) values, which makes the deviation of the predicted values from the actual $p_{1}$ somewhat conspicuous (up to 0.28 and 0.29 for Schemes A and B, respectively). However, only a few points show a relatively large deviation, and the predicted and actual values are in good agreement overall.

3.2. Evaluation of ANN with Test Datasets

Figure 5 compares the dislocation structures ($p_{1}$, $p_{2}$ and $p_{3}$) predicted with the ANN and the actual WA solutions for the deviated parameter sets. To quantitatively assess the validity of the ANN, errors in the ANN prediction of $p_{2}$ and $p_{3}$ from the actual WA solutions were calculated and compared with the magnitude of $p_{2}$ and $p_{3}$, respectively. The average errors for the parameter sets with 10% deviation (test data) were found to be within 7% of the average magnitude of $p_{2}$ and $p_{3}$. Overall, the comparison demonstrates good performance of the ANN model, whose predictions are in good agreement with the actual WA results, with exceptions at relatively large average wall widths ($p_{3}$). These deviations are, however, reasonable because the points showing the large deviation are from the cases that were excluded from the training of the ANN (i.e., $D_{i} / D_{m} = 0.2 \times 10^{-2}$ and $D_{i} = 10^{-3}$ µm²/s). These cases have the largest value of $D_{m}$ ($= 0.5$ µm²/s) among the test datasets, which was presumably a reason behind the large deviation because machine learning is basically not suitable for extrapolation.

Although the trained ANN does not seem to work well for extrapolation as shown above, its predictability for 10% deviation from the training datasets demonstrates the good performance of the constructed ANN. It is presumably possible to make the ANN model more robust and reliable to cover a wider area in the parameter space by providing more training datasets. It is however not the objective of this study to construct such a robust ANN model to bypass the WA diffusion–reaction equation calculation. We rather aim to suggest the possibility of mapping between the input parameters of a simulation model and its results using machine learning.

3.3. Possibility of Inductive Construction of a Simulation Model

The successful demonstration of the input–output mapping paves the way for the inductive determination of the input parameters of a phenomenological simulation model. Once the mapping that links the input parameters to the results is obtained, it is possible to select a parameter set that produces a desired simulation result. Now, if we have experimental results of a specific phenomenon and a reliable simulation model (e.g., governing equations to model the phenomenon), we can conjecture what the simulation result should look like. Then, we can pick a parameter set that gives a simulation result consistent with the experimental observation. In other words, the parameters in the simulation model are found by solving an inverse problem. This inductive scheme, which can be regarded as a top-down determination of parameters, provides an alternative means of determining constitutive equation parameters, which is often challenging to carry out in a bottom-up (deductive) manner.
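One simple way to picture this selection step is a search over candidate parameter sets using the trained mapping as a surrogate. The sketch below is only a schematic of that idea: `toy_surrogate` is a made-up stand-in for the trained ANN, and the candidate grid, the scaled squared mismatch and the function names are all hypothetical.

```python
import numpy as np


def select_parameter_set(candidates, surrogate, target, scale):
    """Return the candidate whose predicted pattern is closest to the target."""
    predictions = np.array([surrogate(c) for c in candidates])
    # Scaled squared mismatch between predicted and desired characteristics
    mismatch = np.sum(((predictions - target) / scale) ** 2, axis=1)
    best = int(np.argmin(mismatch))
    return candidates[best], float(mismatch[best])


def toy_surrogate(candidate):
    """Made-up smooth map from two inputs to two outputs (NOT the trained ANN)."""
    a, b = candidate
    return np.array([2.0 * a + b, a - b])


candidates = [(a, b) for a in (1.0, 2.0, 3.0) for b in (1.0, 2.0, 3.0)]
target = np.array([7.0, -1.0])               # desired (observed) characteristics
best, best_mismatch = select_parameter_set(
    candidates, toy_surrogate, target, scale=np.array([1.0, 1.0]))
```

In practice, the surrogate would be the trained ANN, and the target would come from experimental observation of the wall number and width.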

The proposed scheme of inductive determination of simulation models can be regarded as one example of a physics-informed neural network (PINN) [17,18,19,20,21]. Raissi et al. proposed the following three types of PINN:

(A): Finding solutions of partial differential equations
The function forms and parameters of partial differential equations (PDEs) are known. Initial and boundary conditions are given at discrete sampling points. A neural network mimicking the solution of the PDEs is to be found.
(B): Finding parameters of partial differential equations
The function forms of PDEs are known while their parameters are unknown. The solution of the PDEs is given at discrete sampling points. A neural network as the solution of the PDEs is to be found, resulting in the determination of the PDE parameters.
(C): Finding latent physical quantities in observations
The function forms and parameters of PDEs are known. A physical quantity in the considered system is given by observation. A neural network giving the observed physical quantity and other (latent) physical quantities appearing in the PDEs is to be found so that the prediction of the quantities is consistent with the observation and the PDEs.

Among these PINN types, our approach may fall in (B) where parameters in PDEs are found by means of a neural network model.

It should be noted here, however, that what was represented by the ANN in our approach is not the mapping between the parameters and the PDE solution, but that between the parameters and the values quantifying the PDE solution. In other words, we presented an original scheme to quantify the dislocation density distribution as the WA solution while we adopted a simple ANN model that maps between the input and output sets of scalar values.

3.4. Integration of Deduction and Induction Approaches in Multiscale Modeling

The scheme of inductive determination of simulation parameters should not be limited to the WA model as demonstrated in this study, but can be applied to any other simulation models in general. A simulation model usually consists of governing equations that have some parameters, and its solution can be obtained once the parameters and the initial conditions are given. Thus, it is possible to apply the proposed scheme to any simulation model and obtain mapping between the parameters and the solution, making the inductive determination of the parameters possible once the desired (i.e., consistent with observed facts) simulation results are known.

This approach may be extended to realize a reasonable link between simulation models covering adjoining length scales, as schematically shown in Figure 6. Let us take, for example, a situation where we deal with a material behavior that requires models at two scales, the lower of which can be treated with an atomistic model (e.g., molecular dynamics) and the larger of which is described by a phenomenological model (e.g., phase field). Molecular dynamics can simulate material behaviors using interatomic potentials non-empirically constructed with first-principles calculations to obtain characteristic material properties such as diffusion coefficients, dislocation mobility, critical stress for crystal slips, etc., which is a so-called bottom-up (deductive) evaluation of material properties.

This deductive/bottom-up scheme is often applied for scale-bridging in hierarchical multiscale simulation models [22]. For example, material properties at the nanometer scale obtained with atomistic model calculations can be put into an upper-scale (mesoscopic) model such as dislocation dynamics to conduct a larger-scale (but coarser-resolution) simulation. Similarly, the mesoscopic model can evaluate material properties at the corresponding scale, which may be fed to a macroscopic model. In this way, one can build a hierarchical multiscale model where different length scales are interconnected in a bottom-up fashion. The problem is, however, that the evaluation of the material properties may contain some errors, and the errors can accumulate if the bottom-up scale-bridging is repeated, leading to substantial deviation from reality at the macroscopic scale.

Accepting that the properties evaluated by a simulation model inevitably contain some amount of error, we may adjust the obtained properties so that the models on the different stories of the multiscale hierarchy are mutually interconnected and consistent with experimental observations, which are usually given at the largest end of the hierarchy. The inductive determination of model parameters can be utilized for such objectives. The inductive scheme makes it possible to find parameter sets that produce results consistent with the experiment, i.e., the top-down determination of reliable parameters. When this is combined with the bottom-up evaluation of material properties, one can find constitutive laws of the material that are consistent with the experiment and also based on physics. The deduction–induction integration may be a promising new concept for hierarchical multiscale simulation because it can eliminate common problems in top-down (i.e., constitutive laws are not physics-based) and bottom-up (i.e., results can deviate substantially from experiment) scale-bridging.

4. Conclusions

An inductive approach for the determination of appropriate parameters in simulation models by means of machine learning is presented and demonstrated for a rate-reaction model of dislocation structure formation. Using the reaction–diffusion equations of mobile and immobile dislocation density distributions proposed by Walgraef and Aifantis, we calculated dislocation wall formation in a one-dimensional model with various predetermined parameter sets.

After obtaining the resulting dislocation wall structures for various parameter sets, an ANN model was constructed to reproduce the characterization of the dislocation wall structures as a function of the parameter set. The constructed ANN showed good predictability with test datasets; i.e., average errors for the parameter sets with 10% deviation from the training data were found to be within 7% of the average magnitude of the target data, although non-negligible deviation was found for extrapolation. The presented scheme of inductive determination of simulation model parameters can be regarded as a top-down approach to finding appropriate constitutive laws of materials. This approach provides a new scheme to bridge models for different length scales in the hierarchical multiscale simulation framework.

Author Contributions

Conceptualization, Y.U.; methodology, Y.U.; software, Y.U. and A.K.; validation, E.K., H.S. and T.S.; formal analysis, Y.U.; investigation, Y.U., E.K. and A.K.; resources, Y.U.; data curation, E.K. and A.K.; writing—original draft preparation, Y.U.; writing—review and editing, Y.U., E.K. and A.K.; visualization, Y.U., E.K. and A.K.; supervision, Y.U.; project administration, Y.U.; funding acquisition, Y.U. and T.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by JST (Japan Science and Technology Agency) through a CREST program “Nanomechanics” (Funding number: JPMJCR2092). Y.U. and T.S. also acknowledge financial support from JSPS KAKENHI Grants No. 19H02020 and 21H04534, respectively.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

Nomenclature

$c$	Auxiliary parameter for Equations (5)–(7)
$D_{i}$	Diffusion coefficient of immobile dislocation
$D_{m}$	Diffusion coefficient of mobile dislocation
$f$	Uniform random number
$f_{n} (x)$	Activation function
$L$	Loss function
$l$	Simulation cell size for WA equation
$l_{i}$	Mean free path of immobile dislocation
$N$	Number of test data
$p_{1, 2, 3}$	ANN output
$p_{i, k}$	Training data
$p_{i, k}^{0}$	Reference data
$R^{'}, R_{i}, R_{m}$	Uniform random number
$t$	Time
$v_{m}$	Effective velocity of mobile dislocation
$w_{q 0}^{n}$	Bias parameter
$w_{q r}^{n}$	Weight parameter
$x$	Position
$x_{q}^{n}$	State of ANN nodes
$α$	Walgraef–Aifantis parameter
$β$	Walgraef–Aifantis parameter
$β_{1, 2, 3}$	$β$ sampled for machine learning
$β_{c}$	Critical value of $β$ at Turing instability
$β_{H}$	Critical value of $β$ at Hopf bifurcation
$γ$	Walgraef–Aifantis parameter
$Δ, Δ_{i, m}$	Deviation for test dataset
$ρ_{i}$	Density of immobile dislocation
$ρ_{m}$	Density of mobile dislocation
$ρ_{\max}$	Maximum dislocation density
$ρ_{\min}$	Minimum dislocation density
$ρ_{th}$	Threshold dislocation density
$ρ_{0 i}$	Walgraef–Aifantis parameter

Appendix A

Random initial distributions of dislocation density, $ρ^{i} (x)$ and $ρ^{m} (x)$, were created in two ways, i.e., Schemes A and B in Section 2.1. In this section, the procedure in Scheme B (smooth random distribution) is explained in detail. Since the immobile and mobile dislocation densities, $ρ^{i} (x)$ and $ρ^{m} (x)$, are dealt with in the same way, we refer to them together as $ρ^{i, m} (x)$ hereafter. The procedure consists of the following three steps: (1) a basic density distribution profile is created by superposing sinusoidal functions with different wavenumbers; (2) the distribution profile is modulated in the spatial direction ($x$) to introduce high-frequency contributions; and (3) the distribution is rescaled to set the minimal and maximal values of the density distribution to predetermined values.

(1) The basic function profile ${\bar{ρ}}^{i, m} (x)$ is given by the following equation:

$${\bar{ρ}}^{i, m} (x) = \sum_{n = 1}^{O^{i, m}} A_{n}^{i, m} \cos \left( n π \frac{x}{L} \right),$$

where $O^{i, m}$ is the (predetermined) number of sinusoidal waves to be superposed, and the coefficient $A_{n}^{i, m}$ is a random number uniformly distributed in $- 1 \leq A_{n}^{i, m} < 1$. Note that ${\bar{ρ}}^{i, m} (x)$ satisfies $d {\bar{ρ}}^{i, m} / d x |_{x = 0} = d {\bar{ρ}}^{i, m} / d x |_{x = L} = 0$ (i.e., the Neumann boundary condition) for arbitrary $O^{i, m}$ and $A_{n}^{i, m}$.
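The boundary condition stated in step (1) can be checked by differentiating the cosine series term by term. The short derivation below uses only the definition of the basic profile given in step (1):

```latex
\frac{d \bar{\rho}^{i,m}}{dx}
  = -\sum_{n=1}^{O^{i,m}} A_{n}^{i,m} \, \frac{n\pi}{L} \,
    \sin\!\left( n\pi \frac{x}{L} \right),
\qquad
\sin(0) = \sin(n\pi) = 0 \quad \text{for every integer } n .
```

Every term of the sum therefore vanishes at both $x = 0$ and $x = L$, whatever the values of $O^{i, m}$ and $A_{n}^{i, m}$.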

(2) Spatial modulation is applied with the following function $ξ^{i, m} (x)$:

$$x \mapsto ξ^{i, m} (x) = x + C_{w}^{i, m} \frac{\sin \left( 2 N_{w}^{i, m} π \frac{x}{L} \right)}{2 N_{w}^{i, m} π / L},$$

where $C_{w}^{i, m}$ is a real number ($- 1 < C_{w}^{i, m} < 1$), and $N_{w}^{i, m}$ is a positive integer (both $C_{w}^{i, m}$ and $N_{w}^{i, m}$ are predetermined). The modulated density distribution is given as ${\bar{ρ}}^{i, m} (ξ^{i, m} (x))$. Note the following features of the $ξ^{i, m}$ function, which are useful for properly generating many random distributions: (i) $ξ^{i, m} (x)$ is a bijective (one-to-one) function; (ii) $ξ^{i, m} (0) = 0$ and $ξ^{i, m} (L) = L$; (iii) $d ξ^{i, m} / d x > 0$; and (iv) $ξ^{i, m} (x)$ does not affect the minimum and maximum values of ${\bar{ρ}}^{i, m}$.
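Features (i) to (iv) follow from the definition of the modulation function in step (2). A brief check, using only the stated constraints $- 1 < C_{w}^{i, m} < 1$ and integer $N_{w}^{i, m}$:

```latex
\frac{d \xi^{i,m}}{dx}
  = 1 + C_{w}^{i,m} \cos\!\left( 2 N_{w}^{i,m} \pi \frac{x}{L} \right)
  \;\geq\; 1 - \left| C_{w}^{i,m} \right| \;>\; 0 ,
\qquad
\xi^{i,m}(0) = 0 ,
\qquad
\xi^{i,m}(L) = L + C_{w}^{i,m} \,
  \frac{\sin\!\left( 2 N_{w}^{i,m} \pi \right)}{2 N_{w}^{i,m} \pi / L} = L .
```

Since the modulation function is continuous and strictly increasing with fixed end points, it maps the interval from $0$ to $L$ one-to-one onto itself. The modulated profile therefore takes exactly the same set of values as the unmodulated one, so its minimum and maximum are unchanged.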

(3) The modulated density distribution ${\bar{ρ}}^{i, m} (ξ^{i, m} (x))$ is rescaled by a linear transformation to adjust the maximal and minimal values:

$$\begin{array}{l} {\bar{ρ}}^{i, m} (x) \mapsto ρ^{i, m} (x) = S^{i, m} [- ρ_{\min}^{i, m} + {\bar{ρ}}^{i, m} (x)] + ρ_{\min}^{i, m}, \\ S^{i, m} := \frac{ρ_{\max}^{i, m} - ρ_{\min}^{i, m}}{\max ({\bar{ρ}}^{i, m}) - \min ({\bar{ρ}}^{i, m})}, \end{array}$$

where $ρ_{\max}^{i, m}$ and $ρ_{\min}^{i, m}$ are the predetermined maximum and minimum values of the dislocation density. The resultant function $ρ^{i, m} (x)$ is adopted as the initial dislocation density distribution.

In summary, a random smooth distribution can be obtained by giving the ten parameters $O^{i, m}$, $C_{w}^{i, m}$, $N_{w}^{i, m}$, $ρ_{\max}^{i, m}$, and $ρ_{\min}^{i, m}$. In this study, we prepared ten initial distributions by this method, using a single parameter set (listed in Table A1) with ten different random seeds.

Table A1. Controlling parameter sets.

	Immobile Dislocation	Mobile Dislocation
$O$	17	23
$C_{w}$	0.5	0.5
$N_{w}$	6	6
$ρ_{\max}$	1.0	1.0
$ρ_{\min}$	0.2	0.2
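
The three steps of Scheme B can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name `smooth_random_density`, the cell length `length = 1.0`, the grid resolution `n_grid`, and the random seed are all assumptions made here. The sketch also assumes that the rescaling in step (3) subtracts the minimum of the unscaled profile, so that the result attains exactly the prescribed bounds, and it takes the extrema on the discrete grid rather than on the continuum.

```python
import numpy as np


def smooth_random_density(n_waves, c_w, n_w, rho_min, rho_max, rng,
                          length=1.0, n_grid=501):
    """Sketch of Scheme B; returns the grid x and the rescaled density rho(x).

    `length` and `n_grid` are illustrative assumptions, not values from the paper.
    """
    x = np.linspace(0.0, length, n_grid)

    # Step (2): modulated coordinate xi(x) = x + C_w sin(k x) / k, k = 2 N_w pi / L
    k = 2.0 * n_w * np.pi / length
    xi = x + c_w * np.sin(k * x) / k

    # Step (1): superpose cosines with uniform random amplitudes in [-1, 1),
    # evaluated directly at the modulated coordinate xi
    amps = rng.uniform(-1.0, 1.0, size=n_waves)
    n = np.arange(1, n_waves + 1)
    rho_bar = np.cos(np.outer(xi, n) * np.pi / length) @ amps

    # Step (3): linear rescaling so that the grid extrema equal rho_min and rho_max
    # (assumed reading: the minimum of rho_bar is subtracted before scaling)
    scale = (rho_max - rho_min) / (rho_bar.max() - rho_bar.min())
    rho = scale * (rho_bar - rho_bar.min()) + rho_min
    return x, rho


# Parameter values taken from Table A1; the seed is arbitrary.
rng = np.random.default_rng(0)
x, rho_i = smooth_random_density(17, 0.5, 6, 0.2, 1.0, rng)  # immobile
x, rho_m = smooth_random_density(23, 0.5, 6, 0.2, 1.0, rng)  # mobile
```

Calling the function with ten different seeds would give ten initial distributions in the spirit of the procedure described above, although the actual seeds and discretization used in the study are not specified here.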

Figure 1. Schematics of typical dislocation density distributions ($ρ_{i}$). (a): Aligned peaks with low-density areas in between, indicating formation of wall structure. (b): No self-organized dislocation pattern.

Figure 2. Schematic of ANN architecture. Nodes in each layer are indicated as square, circle, diamond, and triangle for the input layer, internal layers, output layer, and bias, respectively. Not all connections between adjacent layers are drawn.

Figure 3. Loss function in (a) Scheme A and (b) Scheme B. The horizontal axis shows the value obtained by Step (max step is 10⁸) × Learning rate (10⁻⁷). The vertical axis shows the loss per training data of the ANN model (N = 240).

Figure 4. Comparison of predicted dislocation structures with the ANN and WA solutions for the training parameter sets of (a) Scheme A and (b) Scheme B. (i), (ii) and (iii) show Boolean values representing whether a dislocation wall structure is formed, p₁, the number of walls, p₂, and the average width of the formed walls, p₃, respectively. In (ii) and (iii), the red and faded red plots show the results after and before averaging with the same initial structure, respectively.

Figure 5. Comparison of predicted dislocation structures with the ANN and WA solutions for the deviated parameter sets of (a) Scheme A and (b) Scheme B. (i), (ii) and (iii) show Boolean values representing whether a dislocation wall structure is formed, p₁, the number of walls, p₂, and the average width of the formed walls, p₃, respectively. The square, diamond and triangle plots show the amount of deviation Δ = 0.1, 1, and 10%, respectively.

Figure 6. Schematic explaining the concept of suggested integration of deduction–induction approaches compared with conventional hierarchical multiscale models.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Umeno, Y.; Kawai, E.; Kubo, A.; Shima, H.; Sumigawa, T. Inductive Determination of Rate-Reaction Equation Parameters for Dislocation Structure Formation Using Artificial Neural Network. Materials 2023, 16, 2108. https://doi.org/10.3390/ma16052108

