Inductive Determination of Rate-Reaction Equation Parameters for Dislocation Structure Formation Using Artificial Neural Network

The reaction–diffusion equation approach, which solves differential equations of the development of density distributions of mobile and immobile dislocations under mutual interactions, is a method widely used to model the dislocation structure formation. A challenge in the approach is the difficulty in the determination of appropriate parameters in the governing equations because deductive (bottom-up) determination for such a phenomenological model is problematic. To circumvent this problem, we propose an inductive approach utilizing the machine-learning method to search a parameter set that produces simulation results consistent with experiments. Using a thin film model, we performed numerical simulations based on the reaction–diffusion equations for various sets of input parameters to obtain dislocation patterns. The resulting patterns are represented by the following two parameters; the number of dislocation walls (p2), and the average width of the walls (p3). Then, we constructed an artificial neural network (ANN) model to map between the input parameters and the output dislocation patterns. The constructed ANN model was found to be able to predict dislocation patterns; i.e., average errors in p2 and p3 for test data having 10% deviation from the training data were within 7% of the average magnitude of p2 and p3. The proposed scheme enables us to find appropriate constitutive laws that lead to reasonable simulation results, once realistic observations of the phenomenon in question are provided. This approach provides a new scheme to bridge models for different length scales in the hierarchical multiscale simulation framework.


Introduction
Fatigue is a common fracture mode in metal and accounts for a substantial fraction of failure cases in real industrial products. It is therefore demanded to fully understand the mechanism of fatigue fracture. In particular, there is still much room for investigations of the mechanism of fatigue crack formation under cyclic loading. It is widely understood that the fatigue crack formation in macroscopic metal materials originates in the persistent slip band (PSB) formed as a result of self-organization of dislocation structures [1]. Nevertheless, the PSB formation mechanisms proposed thus far have room for further examination and assessment, urging investigation by modeling and simulation. Moreover, recent experimental studies of fatigue in nanometer-or submicron-sized materials by Sumigawa et al. [2,3] indicate the possibility of unveiled mechanisms of fatigue at the nanometer and submicron scales, or "nano-micro fatigue". As fatigue fracture was observed in a specimen smaller than the dimension of PSB, which suggests that fatigue fracture can occur without the presence of PSB, the variation of self-organized dislocation patterns due to size effect should play a key role in nano-micro fatigue [2,4]. This finding also urges modeling and simulation to reveal unknown complicated mechanisms lying behind the dislocation structure formation.
Walgraef and Aifantis proposed a phenomenological model of the dislocation pattern formation in metal under cyclic loading based on the rate-reaction (reaction-diffusion) theory [5][6][7][8][9][10][11][12][13][14]. The reaction-diffusion equation approach, which solves differential equations of the development of density distributions of mobile and immobile dislocations under mutual interactions, has been widely used to simulate the dislocation structure formation and was found to be useful to discuss its mechanisms. A challenge in such a phenomenological model is, however, the difficulty in the determination of appropriate parameters in the governing equations. Parameters are often fitted so that the simulation results become consistent with experimental observations of the phenomenon in question, but this can be daunting when the governing equations contain a number of parameters.
An alternative way would be bottom-up (deductive) determination based on a different physical model covering a smaller length scale. For example, dislocation mobility used in the discrete dislocation dynamics can be obtained by a molecular dynamics simulation of a single dislocation. Although this scheme may look straightforward, consistency with experimental facts is not guaranteed because the simulation at the lower scale may produce nontrivial deviation from reality due to technical constraints such as the limitation of spatial and temporal scales of the simulation setup. Moreover, the deductive approach must rely on a multi-story hierarchy of multiscale models if the phenomenological model is on the macroscopic side, which can make the parameter determination substantially prone to accumulated deviation.
In this study, we propose an inductive approach for the parameter determination utilizing machine-learning. Using a thin film model, we performed numerical simulations of dislocation structure formation based on the Walgraef-Aifantis (WA) model using various sets of input parameters. The results of dislocation structures (density distributions) were characterized with devised algorithms. Then, we constructed an artificial neural network (ANN) model that predicts the output dislocation patterns from the input parameters. The application of the proposed scheme for finding appropriate constitutive laws consistent with experiments was discussed. This new scheme paves the way for bridging models for different length scales in the hierarchical multiscale simulation framework.

Reaction-Diffusion Equation by Walgraef-Aifantis
Walgraef and Aifantis proposed a reaction-diffusion model (which we call the WA model hereafter in this paper) to describe temporal change in dislocation densities, which has a long history and is widely used for its convenience. In this model, dislocations are divided into two categories; i.e., the mobile and immobile dislocations. The former is free to move as a response to stress exerted in the slip plane, while the latter is trapped or moves slowly. The mobile and immobile dislocation densities in space x and time t are, respectively, described by ρ m (x, t) and ρ i (x, t). Temporal evolution of the dislocation density functions is obtained by solving the following parallel non-linear partial differential equations [14]: The first terms of the right-hand side of the two equations represent diffusion-like behavior of the mobile and immobile dislocations, with D i and D m being constants for the strength of diffusivity. Pinning-up of newly produced dislocations during the formation of PSBs is expressed by α(ρ 0i − ρ i ), where α represents the annihilation rate and ρ 0i is a constant value describing the source of immobile dislocations, which is assumed to exist Materials 2023, 16, 2108 3 of 14 uniformly in the system. Release of immobile dislocations from the dislocation forest is described by βρ i , where β designates the dislocation release from the forest. Capture of mobile dislocations by immobile dipoles is represented by the nonlinear term of γρ m ρ 2 i , with γ being the capture rate.
While the two parallel equations of the evolution of the dislocation density functions include five parameters (WA parameters; D i , D m , α, β, γ), not all of them are independent, meaning that the five parameters should not be arbitrarily determined. According to Schiller et al. [15], the following relations hold among the pinning-up rate, the release rate and the diffusivity strengths: where l i and v m are the mean free path of immobile dislocations, and the effective velocity of mobile dislocations considering trapping by obstacles, respectively. In addition, it is known that there are two bifurcations in the dislocation pattern depending on the value of β. If β equals or exceeds a critical value of β H , the dislocation pattern oscillates with time, which is called the Hopf bifurcation. The other critical value is β c (assuming β c < β H ), at which the Turing instability occurs; i.e., the dislocation pattern is formed if β exceeds β c . According to stability analysis, β H and β c are in relation with some parameters as [14] The five WA parameters with the consideration of the abovementioned traits were changed as follows: D i was set to be 10 −4 , 10 −3.5 or 10 −3 µm 2 /s. D m was set such that D i /D m = 0.2 × 10 −2 , 0.5 × 10 −2 or 10 −2 because D i /D m should be at least 10 −2 [15].
β was set to be β 1 , β 2 or β 3 , where β 1 = 0.9 β c , . α and γ are determined according to Equations (3) and (4), with l i = 10 −2 µm, v m = 10 µm/s and ρ 0i = 0.5 µm −2 [15], which is an average value of the initial distribution of ρ i . Therefore, 27 parameter sets were used in total. As described above, ρ i and ρ m are functions of space coordinate x and time t, meaning that we consider a one-dimensional distribution of dislocations. x is in the range 0 ≤ x ≤ l, where l indicates the thickness of the space. In our simulation, we set l = 1.0 µm. At the boundaries of the space, the spatial derivatives of the dislocation densities are assumed to be zero; i.e., Here, we explain two schemes to construct initial dislocation distributions from which the parallel reaction-diffusion calculations start. One is a simple way to use random fractional values of f (0 < f < 1) as the initial dislocation density (Scheme A). The other is a devised scheme to give more smooth but random distributions (Scheme B). There, the initial dislocation distributions were generated by superposing sinusoidal waves with various (given) wavenumbers and random amplitudes, controlled to satisfy the given boundary condition and minimum/maximum values. The detailed procedure of Scheme B is explained in Appendix A. From these initial distributions, we solved the parallel partial differential equations (Equations (1) and (2)) numerically using the Euler method with a time step of 1.0 × 10 −6 s.

Characterization of Resulting Dislocation Structure
After a sufficient number of iterations of numerical integral (time development) of Equations (1) and (2), we obtain converged dislocation densities (ρ i and ρ m ). Since the density distribution of immobile dislocations should represent the dislocation pattern formed as a result of diffusion and reaction of mobile and immobile dislocations, we analyze the form of ρ i . Figure 1 schematically shows two typical distribution patterns of immobile dislocations. Figure 1a depicts the case where we find peaks of dislocation density aligned with low-density areas lying in between, which can be regarded as the formation of the wall structure. Thus, the peaks in ρ i will be called "walls" hereafter in this paper. In contrast, no characteristic shape is found in some cases such as Figure 1b After a sufficient number of iterations of numerical integral (time development) of Equations (1) and (2), we obtain converged dislocation densities ( and ). Since the density distribution of immobile dislocations should represent the dislocation pattern formed as a result of diffusion and reaction of mobile and immobile dislocations, we analyze the form of . Figure 1 schematically shows two typical distribution patterns of immobile dislocations. Figure 1a depicts the case where we find peaks of dislocation density aligned with low-density areas lying in between, which can be regarded as the formation of the wall structure. Thus, the peaks in will be called "walls" hereafter in this paper. In contrast, no characteristic shape is found in some cases such as Figure 1b, indicating no formation of self-organized dislocation patterns. Now, we need an algorithm to extract features of the distribution pattern from the function ( ) (0 ≤ ≤ ), namely; (a) the presence of a self-organized pattern; (b) the number of walls; and (c) the average width of walls. Our algorithm works as follows: It is regarded that a self-organized pattern is formed when more than 50% of ( ) exceeds the threshold ≔ + 0.05 ( − ), where and are the maximum and minimum values of ( ). By detecting points where the curve = ( ) and = intersect each other, the number of walls can be counted. The width of a wall is defined as the width of a continuous region where ( ) ≥ .
(a) Wall structure (b) No characteristic pattern

Structure of Artificial Neural Network model
Our ANN model consists of five layers including the input and output layers. The number of nodes on each layer is 5  6  6  4  3. The input layer has five nodes corresponding to WA parameters ( , , , and ), respectively. The output layer has Now, we need an algorithm to extract features of the distribution pattern from the function ρ i (x) (0 ≤ x ≤ l), namely; (a) the presence of a self-organized pattern; (b) the number of walls; and (c) the average width of walls. Our algorithm works as follows: It is regarded that a self-organized pattern is formed when more than 50% of ρ i (x) exceeds the threshold ρ th := ρ min + 0.05 (ρ max − ρ min ), where ρ max and ρ min are the maximum and minimum values of ρ i (x). By detecting points where the curve y = ρ i (x) and y = ρ th intersect each other, the number of walls can be counted. The width of a wall is defined as the width of a continuous region where ρ i (x) ≥ ρ th .

Structure of Artificial Neural Network model
Our ANN model consists of five layers including the input and output layers. The number of nodes on each layer is 5 → 6 → 6 → 4 → 3. The input layer has five nodes corresponding to WA parameters (D i , D m , α, β and γ), respectively. The output layer has three nodes, which give quantities (p 1 , p 2 and p 3 ) for characterization of the resulting dislocation pattern. p 1 is a Boolean value representing whether a dislocation wall structure is formed. p 2 and p 3 are the number of walls [µm −1 ] and the average width [nm] of the formed walls, respectively. Figure 2 shows a schematic illustration of the ANN architecture. The present ANN model is based on a typical feed-forward network, consisting of five layers; one input layer (hereafter, referred to as Layer 0), three internal layers (Layers 1, 2, and 3), and one output layer (Layer 4). Each layer consists of nodes. The numbers of nodes in Layer n, N n , are set to 5, 6, 6, 4, and 3 for Layers 0, 1, 2, 3, and 4, respectively. The nodes in Layer 0 (input) and Layer 4 (output) are corresponding to the WA parameters (D i , D m , α, β, γ) and the characterization parameters (p 1 , p 2 , p 3 ), respectively. The state of each node is given by a real number, and hereafter we refer to the state of the q-th node in the n-th layer as x n q . The states of nodes in the input layer (n = 0), x 0 q (q = 1, 2, 3, 4, 5), are given by the common logarithms of the WA parameters:

Training of ANN
The reaction-diffusion equations were solved for the parameter sets and the initial structures described in Section 2.1 (i.e., 27 × 10 = 270 cases for each scheme). Among these cases, we found that the final ρm had negative values for the cases with / = 0.2 × 10 and = 10 μm /s. This was presumably because was relatively large, resulting in numerical errors in solving the partial differential equations with the Euler method. Excluding these parameter sets, we used 240 (= 24 × 10) cases as training data of the ANN model for each scheme. Then, the ANN model was trained to map the input WA parameters to the resulting dislocation structure ( , and ).
The loss function, L, which represents deviation of the ANN prediction from actual results, is set to be where , indicates of the case out of the combinations of parameter sets and initial distributions by either Scheme A or B. = 240 (24 parameter sets and 10 initial distributions) is the total number of the cases. The WA model parameters were optimized to reduce the loss function with the steepest descent method.

Test of ANN
To examine the predictability of the trained ANN, i.e., the reliability of the prediction when a parameter set deviates from training datasets, we prepared test datasets (denoted with ) using a predetermined value indicating the amount of deviation, , , as follows: First, and were determined as with R and randomly taking integer values −1, 0 or 1. Here, all combinations of and excluding = = 0 (i.e., eight cases) were produced with the same probability. Next, using the determined and we obtained and as follows: Schematic of ANN architecture. Nodes in each layer are indicated as square, circle, diamond, and triangle for the input layer, internal layers, output layer, and bias, respectively. Not all connections between adjacent layers are drawn.
Note that we adopted a logarithm of the parameters instead of the parameters themselves because the parameters are expected to vary in a wide range (by several digits). The states of the internal and output layers are determined by the previous layer as [16] x n q = f n w n q0 + where w n qr denotes the weight parameter, and w n qr is the bias parameter (n = 1-4; q = 1, . . . , N n ; r = 1, . . . , N n−1 ), which are the internal parameters to be optimized by machine learning. Therefore, the total number of the parameters is 121. The function f n (x) represents the activation functions defined as [16] The node states of the output layer are interpreted as the predicted characterization parameters, i.e., p q = x 4 q (q = 1, 2, 3). Note that the Boolean parameter p 1 deals with a real number in the ANN model for simplicity (If a wall structure is formed, then p 1 = 1; otherwise p 1 = 0). In this description, the value of p 1 can be interpreted as the probability of formation of a wall structure.
The numbers of intermediate layers and nodes on the layer are arbitrarily chosen, but should affect the performance of the ANN model. This will be discussed later in this paper.

Training of ANN
The reaction-diffusion equations were solved for the parameter sets and the initial structures described in Section 2.1 (i.e., 27 × 10 = 270 cases for each scheme). Among these cases, we found that the final ρ m had negative values for the cases with D i /D m = 0.2 × 10 −2 and D i = 10 −3 µm 2 /s. This was presumably because D m was relatively large, resulting in numerical errors in solving the partial differential equations with the Euler method. Excluding these parameter sets, we used 240 (=24 × 10 ) cases as training data of the ANN model for each scheme. Then, the ANN model was trained to map the input WA parameters to the resulting dislocation structure ( p 1 , p 2 and p 3 ).
The loss function, L, which represents deviation of the ANN prediction from actual results, is set to be where p i,k indicates p i of the case k out of the combinations of parameter sets and initial distributions by either Scheme A or B. n = 240 (24 parameter sets and 10 initial distributions) is the total number of the cases. The WA model parameters were optimized to reduce the loss function with the steepest descent method.

Test of ANN
To examine the predictability of the trained ANN, i.e., the reliability of the prediction when a parameter set deviates from training datasets, we prepared test datasets (denoted withˆ) using a predetermined value indicating the amount of deviation, ∆ i,m , as follows: First,D i andD m were determined aŝ with R i and R m randomly taking integer values −1, 0 or 1. Here, all combinations of R i and R m excluding R i = R m = 0 (i.e., eight cases) were produced with the same probability. Next, using the determinedD i andD m we obtainedα andγ as follows: where the values of l i , v m and ρ 0i are the same values shown in Section 2.1. Finally,β was determined in the following way so that the magnitude correlation among β, β c and β H was kept: After calculatingβ c andβ H based onD i andD m ,β was obtained as where R randomly takes integer values −1, 0 or 1 and ∆ represents the amount of deviation.
To examine the predictability of the ANN model according to the deviation magnitude of test datasets from the training datasets, we set ∆ i,m and ∆ to be 0.1, 1 or 10 % of the corresponding parameter values. Figure 3 shows changes in the loss function during the steepest descent iterations. In both the cases with initial structures by Scheme A and Scheme B, the loss function was successfully reduced. This means that the training of the ANN model on the prepared datasets was achieved.  Figure 3 shows changes in the loss function during the steepest descent iterations. In both the cases with initial structures by Scheme A and Scheme B, the loss function was successfully reduced. This means that the training of the ANN model on the prepared datasets was achieved.   can take fractional (non-integer) values, which makes the deviation of the predicted values from the actual a little conspicuous (up to 0.28 and 0.29 for Schemes A and B, respectively). It is however only a few points that show a relatively large deviation. It is noticed therefore that the predicted and actual values are overall in good agreement.  Note that the predicted p 1 can take fractional (non-integer) values, which makes the deviation of the predicted values from the actual p 1 a little conspicuous (up to 0.28 and 0.29 for Schemes A and B, respectively). It is however only a few points that show a relatively large deviation. It is noticed therefore that the predicted and actual values are overall in good agreement. Figure 5 compares predicted dislocation structures (p 1 , p 2 and p 3 ) with the ANN and actual WA solutions for the deviated parameter sets. To quantitatively assess the validity of the ANN, errors in the ANN prediction of p 2 and p 3 from the actual WA solutions were calculated and compared with the magnitude of p 2 and p 3 , respectively. The average errors for the parameter sets with 10% deviation (test data) were found to be within 7% of the average magnitude of p 2 and p 3 . The comparison overall demonstrates a good performance of the ANN model giving predictions in a good agreement with the actual WA results, with exceptions at relatively large average wall widths (p 3 ). These deviations are, however, reasonable because the points showing the large deviation are from the cases that were excluded from the training of the ANN (i.e., D i /D m = 0.2 × 10 −2 and D i = 10 −3 µm 2 /s). These cases have the largest value of D m = 0.5 µm 2 /s among the test datasets, which was presumably a reason behind the large deviation because machine learning is basically not suitable for extrapolation.

Evaluation of ANN with Test Datasets
Although the trained ANN does not seem to work well for extrapolation as shown above, its predictability for 10% deviation from the training datasets demonstrates the good performance of the constructed ANN. It is presumably possible to make the ANN model more robust and reliable to cover a wider area in the parameter space by providing more training datasets. It is however not the objective of this study to construct such a robust ANN model to bypass the WA diffusion-reaction equation calculation. We rather aim to suggest the possibility of mapping between the input parameters of a simulation model and its results using machine learning.   good performance of the constructed ANN. It is presumably possible to make the ANN model more robust and reliable to cover a wider area in the parameter space by providing more training datasets. It is however not the objective of this study to construct such a robust ANN model to bypass the WA diffusion-reaction equation calculation. We rather aim to suggest the possibility of mapping between the input parameters of a simulation model and its results using machine learning.

Possibility of Inductive Construction of a Simulation Model
The successful demonstration of the input-output mapping paves the way for the inductive determination of input parameters of a phenomenological simulation model. Once the mapping that links between the input parameters and the results is achieved, it is possible to select a parameter set that produces a desired simulation result. Now, if we have experimental results of a specific phenomenon and a reliable simulation model (e.g., governing equations to model the phenomenon), we can conjecture what the simulation result should look like. Then, we can pick a parameter set that gives a simulation result

Possibility of Inductive Construction of a Simulation Model
The successful demonstration of the input-output mapping paves the way for the inductive determination of input parameters of a phenomenological simulation model.
Once the mapping that links between the input parameters and the results is achieved, it is possible to select a parameter set that produces a desired simulation result. Now, if we have experimental results of a specific phenomenon and a reliable simulation model (e.g., governing equations to model the phenomenon), we can conjecture what the simulation result should look like. Then, we can pick a parameter set that gives a simulation result consistent with the experimental observation. In other words, this way is to find parameters in the simulation model as a reverse problem. This inductive scheme, which can be regarded as a top-down determination of parameters, provides an alternative measure to determine constitutive equation parameters, which is often challenging to carry out in a bottom-up (deductive) manner.
The proposed scheme of inductive determination of simulation models can be regarded as one example of a physics-informed neural network (PINN) [17][18][19][20][21]. Raissi et al. proposed the following three types of PINN: (A) Finding solution of partial differential equations The function forms and parameters of partial differential equations (PDEs) are known. Initial and boundary conditions are given at discrete sampling points. A neural network mimicking the solution of the PDEs is to be found. (B) Finding parameter of partial differential equations The function forms of PDEs are known while their parameters are unknown. The solution of the PDEs is given at discrete sampling points. A neural network as the solution of the PDEs is to be found, resulting in the determination of the PDE parameters. (C) Finding latent physical quantities in observations The function forms and parameters of PDEs are known. A physical quantity in the considered system is given by observation. A neural network giving the observed physical quantity and other (latent) physical quantities appearing in the PDEs is to be found so that the prediction of the quantities is consistent with the observation and the PDEs.
Among these PINN types, our approach may fall in (B) where parameters in PDEs are found by means of a neural network model.
It should be noted here, however, that what was represented by the ANN in our approach is not the mapping between the parameters and the PDE solution, but that between the parameters and the values quantifying the PDE solution. In other words, we presented an original scheme to quantify the dislocation density distribution as the WA solution while we adopted a simple ANN model that maps between the input and output sets of scalar values.

Integration of Deduction and Induction Approaches in Multiscale Modeling
The scheme of inductive determination of simulation parameters should not be limited to the WA model as demonstrated in this study, but can be applied to any other simulation models in general. A simulation model usually consists of governing equations that have some parameters, and its solution can be obtained once the parameters and the initial conditions are given. Thus, it is possible to apply the proposed scheme to any simulation model and obtain mapping between the parameters and the solution, making the inductive determination of the parameters possible once the desired (i.e., consistent with observed facts) simulation results are known.
This approach may be extended to realize a reasonable link between simulation models covering adjoining length scales as schematically shown in Figure 6. Let us take a situation, for example, where we deal with a material behavior that requires two scale models, the lower of which can be treated with the atomistic model (e.g., molecular dynamics) and the larger is described by a phenomenological model (e.g., phase field). Molecular dynamics can simulate material behaviors using interatomic potentials non-empirically constructed with first-principles calculations to obtain characteristic material properties such as diffusion coefficients, dislocation mobility, critical stress for crystal slips, etc., which is a so-called a bottom-up (deductive) evaluation of material properties. uation, for example, where we deal with a material behavior that requires two scale models, the lower of which can be treated with the atomistic model (e.g., molecular dynamics) and the larger is described by a phenomenological model (e.g., phase field). Molecular dynamics can simulate material behaviors using interatomic potentials non-empirically constructed with first-principles calculations to obtain characteristic material properties such as diffusion coefficients, dislocation mobility, critical stress for crystal slips, etc., which is a so-called a bottom-up (deductive) evaluation of material properties.  This deductive/bottom-up scheme is often applied for scale-bridging in hierarchical multiscale simulation models [22]. For example, material properties at the nanometer scale obtained with atomistic model calculations can be put into an upper-scale (mesoscopic) model such as dislocation dynamics to conduct a larger scale (but coarser resolution) simulation. Similarly, the mesoscopic model can evaluate material properties at the corresponding scale, which may be fed to a macroscopic model. This way one can build a hierarchical multiscale model where different length scales are interconnected by the bottom-up fashion. The problem is, however, that the evaluation in the material properties may contain some errors, and the errors can accumulate if the bottom-up scale-bridging is repeated, leading to substantial deviation from reality at the macroscopic scale.
Accepting that the properties evaluated by a simulation model inevitably contain some amount of errors, we may adjust the obtained properties so that the models on the different stories of the multiscale hierarchy are mutually interconnected and consistent with experimental observations, which are usually given at the largest end of the hierarchy. The inductive determination of model parameters can be utilized for such objectives. The inductive scheme makes it possible to find possible parameter sets that produce results consistent with the experiment, i.e., the top-down determination of reliable parameters. When this is combined with the bottom-up evaluation of material properties, one can find constitutive laws of the material that are consistent with the experiment and also based on physics. The deduction-induction integration may be a promising new concept for the hierarchical multiscale simulation because it can eliminate common problems in the top-down (i.e., constitutive laws are not physics-based) and bottom-up (i.e., results can be deviated substantially from experiment) scale-bridging.

Conclusions
An inductive approach for the determination of appropriate parameters in simulation models by means of machine learning is presented and demonstrated for a rate-reaction model of dislocation structure formation. Using the reaction-diffusion equations of mobile and immobile dislocation density distributions proposed by Walgraef and Aifantis, we calculated dislocation wall formation in a one-dimensional model with various predetermined parameter sets.
After obtaining resulting dislocation wall structures with extensive different parameter sets, an ANN model was constructed to reproduce the characterization of the dislocation wall structures as a function of the parameter set. The constructed ANN presented a good predictability with test datasets; i.e., average errors for the parameter sets with 10% deviation from training data were found to be within 7% of the average magnitude of the target data, although non-negligible deviation was found for extrapolation. The presented scheme of inductive determination of simulation model parameters can be regarded as a top-down approach to find an appropriate constitutive laws of materials. This approach provides a new scheme to bridge models for different length scales in the hierarchical multiscale simulation framework. Funding: This research was funded by JST (Japan Science and Technology Agency) through a CREST program "Nanomechanics" (Funding number: JPMJCR2092). Y.U. and T.S. also acknowledge financial support from JSPS KAKENHI Grants No. 19H02020 and 21H04534, respectively.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.

Conflicts of Interest:
The authors declare no conflict of interest.

Nomenclature c
Auxiliary parameter for Equations (5)

Appendix A
Random initial distributions of dislocation density, ρ i (x) and ρ m (x), in the two ways, i.e., Schemes A and B in Section 2.1, were created. In this section, the procedure in Scheme B (smooth random distribution) is explained in detail. Since both immobile and mobile dislocation densities, ρ i (x) and ρ m (x), are dealt with in the same way, we refer to them together as ρ i,m (x) hereafter. The basic process is based on the three steps as follows: (1) A basic density distribution profile is created by superposing sinusoidal functions with different wavenumbers; (2) the distribution profile is modulated in the spatial direction (x) to introduce high frequency contributions; and (3) the distribution is rescaled to set the minimal and maximal values of density distribution to predetermined values.
(1) Basic function profile ρ i,m (x) is given by the following equation: where O i,m is the number of sinusoidal waves to be superposed (predetermined), and the coefficient A i,m n is a random number (uniform distribution in −1 ≤ A i,m n < 1). Note that ρ i,m (x) satisfies dρ i,m /dx | x=0 = dρ i,m /dx | x=L = 0 (i.e., Neumann boundary condition) for arbitrary O i,m and A i,m n . (2) Spatial modulation is applied with the following function ξ i,m (x): x → ξ i,m (x) = x + C i,m w sin 2N i,m w π x L 2N i,m w π/L , where C i,m w is a real number (−1 < C i,m w < 1), and N w is a positive integer (both C i,m w and N i,m w are predetermined). The modulated density distribution is given as ρ i,m ξ i,m (x) . Note the following features in the ξ i,m function, which are useful for properly generating many random distributions: (i) ξ i,m (x) is a bijective (one-to-one) function; (ii) ξ i,m (0) = 0 and ξ i,m (L) = L; (iii) dξ i,m /dx > 0, and (iv) ξ i,m (x) does not affect the minimum and maximum values of ρ i,m .
(3) The modulated density distribution ρ i,m ξ i,m (x) is rescaled by the linear transformation to adjust the maximal and minimal values: , where ρ i,m max and ρ i,m min are the predetermined maximum and minimum values of the dislocation density. The resultant function ρ i,m (x) is adopted as the initial dislocation density distribution.
In summary, a random smooth distribution can be obtained by giving the ten parameters O i,m , C i,m w , N i,m w , ρ i,m max , and ρ i,m min . In this study, we prepared ten initial distributions by this method, with a single parameter set (listed in Table A1) with ten different random seeds.