Acceleration and Parallelization of a Linear Equation Solver for Crack Growth Simulation Based on the Phase Field Model

Ishii, Gaku; Yamamoto, Yusaku; Takaishi, Takeshi

doi:10.3390/math9182248


by Gaku Ishii ¹, Yusaku Yamamoto ¹,* and Takeshi Takaishi ²

¹ Department of Communication Engineering and Informatics, The University of Electro-Communications, Tokyo 182-8585, Japan

² Department of Mathematical Engineering, Musashino University, Tokyo 135-8181, Japan

* Author to whom correspondence should be addressed.

Mathematics 2021, 9(18), 2248; https://doi.org/10.3390/math9182248

Submission received: 31 July 2021 / Revised: 2 September 2021 / Accepted: 10 September 2021 / Published: 13 September 2021

(This article belongs to the Section E: Applied Mathematics)


Abstract:

We aim to accelerate the linear equation solver for crack growth simulation based on the phase field model. As a first step, we analyze the properties of the coefficient matrices and prove that they are symmetric positive definite. This justifies the use of the conjugate gradient method with the efficient incomplete Cholesky preconditioner. We then parallelize this preconditioner using so-called block multi-color ordering and evaluate its performance on multicore processors. The experimental results show that our solver scales well and achieves an acceleration of several times over the original solver based on the diagonally scaled CG method.

Keywords:

crack growth simulation; phase field model; conjugate gradient method; incomplete Cholesky factorization; parallelization; block multi-color ordering; performance evaluation

MSC:

15A06; 35F61; 65F08; 65M60; 74R99; 74S05

1. Introduction

Crack growth is a ubiquitous phenomenon that affects the strength and functions of materials and structures. Since cracks can grow very quickly, simulation is a useful tool to study their growth process in detail. Simulation can also be used to predict the generation of cracks under given stresses or other conditions. In the conventional method of crack growth simulation, the finite element method (FEM) is used and the mesh is regenerated at every time step so that the mesh boundary conforms to the crack boundary. However, this incurs huge computational cost. Moreover, to determine the direction of crack growth, it is usually necessary to evaluate the total energy for various possible scenarios. This also adds to the computational cost.

To resolve these problems, a crack growth simulation method based on the phase field model has been proposed [1,2,3,4]. In this model, a new continuous dependent variable

z (x, t)

called the phase field [5,6] is introduced in addition to the displacement

u (x, t)

. This variable expresses the degree of the crack at each point: z≃0 if there is no crack and z≃1 if there is a crack. Moreover, a partial differential equation (PDE) describing the time evolution of

z (x, t)

is also derived along with that of

u (x, t)

. This makes it possible to determine the direction of crack growth without the total energy evaluations. Hence, the method is a promising candidate for real-time three-dimensional crack growth simulation. Takaishi et al. implemented a crack growth simulation program based on this method and showed that it works well in various examples. For evaluation purposes, a two-dimensional FreeFEM code based on the method is also available.

When implementing this method, the time-discretization of the PDEs both for

u (x, t)

and

z (x, t)

is usually done with the semi-implicit method to ensure numerical stability. In that case, the time taken to solve the resulting linear simultaneous equations is often dominant in the total computation time. Takaishi et al.’s simulation program uses the conjugate gradient (CG) method with a diagonal scaling preconditioner for both of the equations for

u (x, t)

and

z (x, t)

, which is a simple preconditioner applicable to a wide class of matrices but is not very efficient. The reason for using this preconditioner is that the properties of the coefficient matrices have not been fully understood yet.

In this paper, we aim to accelerate this linear equation solver by employing a more powerful preconditioner. This will help to speed up the linear equation solution that accounts for a large part of the computing time and open a way to solve larger scale and more realistic problems. To this end, we make the following contributions. First, we analyze the properties of the coefficient matrices obtained by applying semi-implicit time discretization and space discretization by FEM to the PDEs for

u (x, t)

and

z (x, t)

. We show that, under appropriate boundary conditions, both of these coefficient matrices become symmetric positive definite (SPD). This justifies the use of the incomplete Cholesky (IC) preconditioner, which is more powerful than diagonal scaling. Second, we show that the IC preconditioner can be parallelized efficiently using the block multi-color ordering proposed by Iwashita et al. [7,8]. In fact, our numerical experiments suggest that the number of CG iterations increases only slightly, if at all, by this parallelization method compared to the sequential case. Finally, we optimize several performance parameters such as the block division scheme and show that the resulting parallel solver is several times faster than the diagonally scaled CG solver on multicore processors. Our results will be applicable to crack growth simulation in a variety of fields in science and engineering, such as the prediction of cracking in buildings and bridges and the analysis of solder cracking in circuit boards [2].

The rest of this paper is structured as follows. In Section 2, we briefly describe the crack simulation method based on the phase field model and its space and time discretization. In Section 3, the properties of the coefficient matrices arising from the discretization are analyzed, and their symmetric positive definiteness is proved under certain conditions. Section 4 describes the parallelization of the IC preconditioner using block multi-color ordering. Numerical results are given in Section 5. Finally, Section 6 concludes the paper.

2. Crack Growth Simulation Based on the Phase Field Model

In this section, we briefly explain the crack growth simulation method based on the phase field model, which was proposed by Takaishi and Kimura [3]. We begin with the two-dimensional case and then proceed to the three-dimensional case.

2.1. The Two-Dimensional Case

Let us consider crack growth in a thin panel as shown in Figure 1. Here, we focus on the so-called mode 3 crack, in which the displacement of the panel is in the direction perpendicular to the panel. Thus, we treat the problem as two-dimensional and denote the displacement at a point

x = (x_{1}, x_{2})

by a scalar variable

u (x, t)

. We denote the region by

Ω

, its boundary by

Γ

, and the crack, which is modeled as a curve on

Ω

, by

Σ

. Hence,

u (x)

is discontinuous across

Σ

. In the example shown in Figure 1, the Dirichlet boundary condition is imposed on

Γ_{D} \subset Γ

, while the Neumann boundary condition is imposed on

Γ_{N} = Γ ∖ Γ_{D}

.

The basic idea of the crack growth simulation method to be described below goes back to Griffith [9]. He proposed the expression of the total energy of the system as a sum of the elasticity energy

E_{1}

and the surface energy

E_{2}

due to the existence of a crack. Both of these energies depend on the crack

Σ

, as well as on

u (x, t)

, so we denote them as

E_{1} [u, Σ]

and

E_{2} [u, Σ]

. Griffith assumed that crack growth occurs if the total energy

E [u, Σ] = E_{1} [u, Σ] + E_{2} [u, Σ]

decreases as a result of that growth. In the two-dimensional case,

E_{1}

and

E_{2}

can be written as follows [1]:

\begin{matrix} E_{1} [u, Σ] & = & \frac{μ}{2} \int_{Ω ∖ Σ} {| \nabla u |}^{2} d x \end{matrix}

(1)

\begin{matrix} E_{2} [u, Σ] & = & \int_{Σ} γ (x) d s, \end{matrix}

(2)

where

γ (x) > 0

is fracture toughness at

x

and

μ

is one of Lamé’s constants. Equation (1) means that the elasticity energy is expressed as an integral of

\frac{μ}{2} {| \nabla u |}^{2}

over the entire region

Ω

, excluding the crack

Σ

. This is because the difference of u across

Σ

does not contribute to the elasticity energy. On the other hand, the surface energy (2) is expressed as a line integral along the crack.

While in principle (1) and (2) can be used to study the development of crack

Σ

, they are not convenient for numerical computation because the regions of the integral depend on

Σ

and change from step to step. To resolve this problem, Bourdin et al. [10] introduced a phase field variable

z (x, t)

that expresses the degree of crack at

(x, t)

: z≃0 if there is no crack and z≃1 if there is a crack.

z (x, t)

is assumed to be a smooth function of

x

, and the transition between

z = 0

and

z = 1

is assumed to occur across a narrow region of width

≃ ϵ

, where

ϵ > 0

is a regularization parameter [11]. Under these assumptions, Bourdin et al. propose the use of the following regularized total energy functional instead of

E [u, Σ]

:

E [u, z; ϵ] = \frac{μ}{2} \int_{Ω} {(1 - z)}^{2} {| \nabla u |}^{2} d x + \frac{1}{2} \int_{Ω} γ (x) (ϵ {| \nabla z |}^{2} + \frac{1}{ϵ} z^{2}) d x .

(3)

In this formulation, the region of the integral is the entire region

Ω

for both the elasticity and surface energies, which greatly simplifies the numerical procedure.

For the efficient computation of

u (x, t)

and

z (x, t)

based on (3), Takaishi and Kimura [3] proposed the use of the gradient flow

\begin{matrix} \frac{\partial u}{\partial τ} = - α_{1} \frac{δ E}{δ u}, \frac{\partial z}{\partial τ} = - α_{2} \frac{δ E}{δ z}, \end{matrix}

(4)

where

τ

is a virtual time parameter and

α_{1}

and

α_{2}

are time constants. It can easily be shown that if

u (x, t)

and

z (x, t)

evolve according to (4), the total energy functional (3) decreases monotonically. Thus we can expect that the (local) minimum of

E [u, z; ϵ]

is reached as

τ \to \infty

. Furthermore, if

α_{1}

and

α_{2}

are sufficiently large,

u (x, t)

and

z (x, t)

are expected to reach the minimizer of

E [u, z; ϵ]

for given external conditions such as the boundary conditions and external forces (if any) very quickly. Thus, we can regard

u (x, t)

and

z (x, t)

determined by (4) as instantaneous reactions to the external conditions and (4) as approximately describing the time evolution of

u (x, t)

and

z (x, t)

. By computing

\frac{δ E}{δ u}

and

\frac{δ E}{δ z}

explicitly, we obtain the following set of PDEs:

\begin{matrix} \{\begin{matrix} α_{1} \frac{\partial u}{\partial t} & = & \nabla \cdot ({(1 - z)}^{2} \nabla u) & (x \in Ω), \\ α_{2} \frac{\partial z}{\partial t} & = & {(ϵ \nabla \cdot (γ (x) \nabla z) - \frac{γ (x)}{ϵ} z + {| \nabla u |}^{2} (1 - z))}_{+} & (x \in Ω), \\ u (x, t) & = & g (x, t) & (x \in Γ_{D}), \\ \frac{\partial u}{\partial n} & = & 0 & (x \in Γ_{N}), \\ \frac{\partial z}{\partial n} & = & 0 & (x \in Γ), \\ u (x, 0) & = & u_{0} (x) & (x \in Ω), \\ z (x, 0) & = & z_{0} (x) \in [0, 1] & (x \in Ω), \end{matrix} \end{matrix}

(5)

where we changed

τ

to t and set

μ = 1

for simplicity.

g (x, t)

denotes the Dirichlet boundary condition that causes the development of the crack. The symbol

{(\cdot)}_{+}

in the second equation means

max (\cdot, 0)

, which expresses the fact that a crack does not vanish once it is created [12]. See [13] for the treatment of partial differential equations with such terms.

\frac{\partial}{\partial n}

denotes the partial derivative in the direction of the outgoing normal vector.
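As a worked step, the variational derivatives behind (5) can be written out explicitly. The following is our own reconstruction from the energy functional (3), assuming the boundary terms vanish under the boundary conditions listed in (5); it is a sketch of the calculation rather than a quotation from the cited references.

```latex
\begin{align*}
\frac{\delta E}{\delta u} &= -\mu \, \nabla \cdot \left( (1 - z)^{2} \nabla u \right), \\
\frac{\delta E}{\delta z} &= -\epsilon \, \nabla \cdot \left( \gamma(x) \nabla z \right)
  + \frac{\gamma(x)}{\epsilon} \, z - \mu \, |\nabla u|^{2} (1 - z).
\end{align*}
```

With μ = 1, the negatives of these two expressions are the right-hand sides of the first two equations of (5), before the operation (·)_+ is applied to the second one.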

Crack growth simulation based on (5) has the following advantages:

The direction of crack growth is automatically determined by the PDEs. Hence, total energy evaluation under multiple possible scenarios, which is needed in simulation methods based directly on (3), is not necessary;
By introducing the phase field variable $z (x, t)$ and the regularization parameter $ϵ$ , the divergence of the stress at the tip of the crack is kept to a level manageable by numerical methods;
It is not necessary to regenerate the mesh at every time step to conform to the crack boundary.

Our theoretical analysis and numerical experiments in the two-dimensional case are based on these PDEs.

2.2. The Three-Dimensional Case

In the three-dimensional case, the displacement becomes a vector field variable

u (x, t) = {(u_{1} (x, t), u_{2} (x, t), u_{3} (x, t))}^{⊤}

, where

x = (x_{1}, x_{2}, x_{3})

. The phase field variable

z (x, t)

is still a scalar field variable. Using the same idea as in the previous subsection, we can derive the set of PDEs corresponding to (5). In the following, we assume isotropic materials for simplicity. First, let us define the strain tensor

ϵ_{i j} = \frac{1}{2} (\frac{\partial u_{i}}{\partial x_{j}} + \frac{\partial u_{j}}{\partial x_{i}})

and the stress tensor

σ_{i j}

. These tensors are connected by Hooke’s law, which has the following form in the case of isotropic materials:

σ_{i j} = λ δ_{i j} \nabla \cdot u + 2 μ ϵ_{i j} (i, j = 1, 2, 3),

(6)

where

λ

and

μ

are Lamé’s constants and

δ_{i j}

is Kronecker’s delta. We also write the stress tensor as

σ = (σ_{i j}) = (s_{1}, s_{2}, s_{3}) .

(7)

Using these tensors, the elasticity energy density

e (u)

, which corresponds to

\frac{μ}{2} {| \nabla u |}^{2}

in the two-dimensional case, can be defined as follows:

e (u) = \frac{1}{2} \sum_{i = 1}^{3} \sum_{j = 1}^{3} σ_{i j} ϵ_{i j} = \frac{1}{2} λ {(\nabla \cdot u)}^{2} + μ \sum_{i = 1}^{3} \sum_{j = 1}^{3} ϵ_{i j} ϵ_{i j} .

(8)

Using this, the regularized total energy functional is defined as

E [u, z; ϵ] = \int_{Ω} {(1 - z)}^{2} e (u) d x + \frac{1}{2} \int_{Ω} γ (x) (ϵ {| \nabla z |}^{2} + \frac{1}{ϵ} z^{2}) d x,

(9)

which has the same form as (3) except that

\frac{μ}{2} {| \nabla u |}^{2}

is replaced with

e (u)

. Note that now

Ω

is a three-dimensional region and

\int_{Ω} \cdot d x

denotes a volume integral. By considering the gradient flow as in (4), we obtain the following set of PDEs after some calculations [2]:

\{\begin{matrix} α_{1} \frac{\partial u}{\partial t} & = & \nabla ((λ + μ) {(1 - z)}^{2} (\nabla \cdot u)) + \nabla \cdot (μ {(1 - z)}^{2} \nabla u) & (x \in Ω), \\ α_{2} \frac{\partial z}{\partial t} & = & {(ϵ \nabla \cdot (γ (x) \nabla z) - \frac{γ (x)}{ϵ} z + 2 e (u) (1 - z))}_{+} & (x \in Ω), \\ u (x, t) & = & g (x, t) & (x \in Γ_{D}), \\ s_{j} \cdot n & = & 0 (j = 1, 2, 3) & (x \in Γ_{N}), \\ \frac{\partial z}{\partial n} & = & 0 & (x \in Γ), \\ u (x, 0) & = & u_{0} (x) & (x \in Ω), \\ z (x, 0) & = & z_{0} (x) \in [0, 1] & (x \in Ω) . \end{matrix}

(10)

Here,

\nabla u = (\begin{matrix} \nabla u_{1} (x, t) \\ \nabla u_{2} (x, t) \\ \nabla u_{3} (x, t) \end{matrix}) = (\begin{matrix} \frac{\partial u_{1}}{\partial x_{1}} & \frac{\partial u_{1}}{\partial x_{2}} & \frac{\partial u_{1}}{\partial x_{3}} \\ \frac{\partial u_{2}}{\partial x_{1}} & \frac{\partial u_{2}}{\partial x_{2}} & \frac{\partial u_{2}}{\partial x_{3}} \\ \frac{\partial u_{3}}{\partial x_{1}} & \frac{\partial u_{3}}{\partial x_{2}} & \frac{\partial u_{3}}{\partial x_{3}} \end{matrix})

(11)

and

\nabla \cdot (μ {(1 - z)}^{2} \nabla u) = (\begin{matrix} \nabla \cdot (μ {(1 - z)}^{2} \nabla u_{1}) \\ \nabla \cdot (μ {(1 - z)}^{2} \nabla u_{2}) \\ \nabla \cdot (μ {(1 - z)}^{2} \nabla u_{3}) \end{matrix}) .

(12)

where

n = {(n_{1}, n_{2}, n_{3})}^{⊤}

denotes the unit normal vector. Note that the first equation of (10) can also be written as

α_{1} \frac{\partial u_{j}}{\partial t} = \nabla \cdot ({(1 - z)}^{2} s_{j}) (j = 1, 2, 3, x \in Ω) .

(13)

This can be verified directly using the definitions of

ϵ_{i j}

,

σ_{i j}

and

s_{j}

. The fourth equation of (10) is often expressed as

σ \cdot n = 0 .

(14)

and is known as the stress-free boundary condition.

2.3. Temporal Discretization

For the temporal discretization of (5) and (10), we use a semi-implicit method. More specifically, to compute the solution

(u_{t + Δ t}, z_{t + Δ t})

at time

t + Δ t

from

(u_{t}, z_{t})

at time t, we replace u and z in the right-hand side of the equation for

\partial u / \partial t

by

u_{t + Δ t}

and

z_{t}

, respectively. Similarly, in the right-hand side of the equation for

\partial z / \partial t

, we replace u and z by

u_{t}

and

z_{t + Δ t}

, respectively. Thus, in the two-dimensional case, we obtain the following set of equations, which are linear both in

u_{t + Δ t}

and

z_{t + Δ t}

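:

α_{1} \frac{u_{t + Δ t} - u_{t}}{Δ t} = \nabla \cdot ({(1 - z_{t})}^{2} \nabla u_{t + Δ t}),

(15)

α_{2} \frac{{\tilde{z}}_{t + Δ t} - z_{t}}{Δ t} = ϵ \nabla \cdot (γ (x) \nabla {\tilde{z}}_{t + Δ t}) - \frac{γ (x)}{ϵ} {\tilde{z}}_{t + Δ t} + {| \nabla u_{t} |}^{2} (1 - {\tilde{z}}_{t + Δ t}),

(16)

z_{t + Δ t} = max ({\tilde{z}}_{t + Δ t}, z_{t}) .

(17)

Here, (17) corresponds to taking

{(\cdot)}_{+}

in the second equation of (5). In the numerical simulation, we set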

α_{1} = 0

, which corresponds to assuming that the displacement

u (x, t)

responds to the changes of

z (x, t)

and the boundary conditions instantaneously.
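The semi-implicit step described above can be illustrated with a small one-dimensional finite-difference sketch. This is not the authors' FEM code: the uniform grid, the dense solves, the constant fracture toughness `gamma`, the boundary values u(0) = 0 and u(1) = g, and all function and parameter names are assumptions made only for illustration. It solves the equation for u with α_1 = 0 using the old z, then the linear equation for the provisional phase field using the old u, and finally takes the pointwise maximum with the old z.

```python
import numpy as np

def semi_implicit_step(u, z, g, dt, h, eps=0.05, gamma=1.0, alpha2=1.0):
    """One semi-implicit step on a uniform 1D grid (illustrative sketch only)."""
    n = len(z)
    # Step 1 (alpha_1 = 0): solve d/dx((1 - z_t)^2 du/dx) = 0, u(0) = 0, u(1) = g.
    k = (1.0 - 0.5 * (z[:-1] + z[1:])) ** 2   # coefficient at cell midpoints
    A = np.zeros((n, n))
    b = np.zeros(n)
    A[0, 0] = A[-1, -1] = 1.0                 # Dirichlet rows
    b[-1] = g
    for i in range(1, n - 1):
        A[i, i - 1] = -k[i - 1]
        A[i, i] = k[i - 1] + k[i]
        A[i, i + 1] = -k[i]
    u_new = np.linalg.solve(A, b)

    # Step 2: linear equation for the provisional phase field, using the
    # gradient of the old displacement u_t.
    grad2 = np.gradient(u, h) ** 2
    B = np.zeros((n, n))
    for i in range(n):
        left = i - 1 if i > 0 else 1          # mirror nodes impose dz/dx = 0
        right = i + 1 if i < n - 1 else n - 2
        B[i, i] += alpha2 / dt + 2.0 * eps * gamma / h**2 + gamma / eps + grad2[i]
        B[i, left] -= eps * gamma / h**2
        B[i, right] -= eps * gamma / h**2
    rhs = alpha2 / dt * z + grad2
    z_tilde = np.linalg.solve(B, rhs)

    # Step 3: irreversibility, the discrete counterpart of (.)_+.
    z_new = np.maximum(z_tilde, z)
    return u_new, z_new
```

The sketch assumes z < 1 everywhere so that the system for u stays nonsingular. In a realistic setting both systems would be sparse and solved iteratively, which is the subject of the later sections.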

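The equations for the three-dimensional case are as follows:

α_{1} \frac{u_{t + Δ t} - u_{t}}{Δ t} = \nabla ((λ + μ) {(1 - z_{t})}^{2} (\nabla \cdot u_{t + Δ t})) + \nabla \cdot (μ {(1 - z_{t})}^{2} \nabla u_{t + Δ t}),

(18)

α_{2} \frac{{\tilde{z}}_{t + Δ t} - z_{t}}{Δ t} = ϵ \nabla \cdot (γ (x) \nabla {\tilde{z}}_{t + Δ t}) - \frac{γ (x)}{ϵ} {\tilde{z}}_{t + Δ t} + 2 e (u_{t}) (1 - {\tilde{z}}_{t + Δ t}),

(19)

z_{t + Δ t} = max ({\tilde{z}}_{t + Δ t}, z_{t}) .

(20)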

We set

α_{1} = 0

also in this case. Using expression (13), Equation (18) can also be written as

α_{1} \frac{u_{t + Δ t, j} - u_{t, j}}{Δ t} = \nabla \cdot ({(1 - z_{t})}^{2} s_{t + Δ t, j}) (j = 1, 2, 3),

(21)

where

s_{t + Δ t, j}

is the jth column of the stress tensor

σ

at time

t + Δ t

.

3. Properties of the Coefficient Matrices Arising from Phase Field-Based Crack Growth Simulation

In this section, we study the properties of the coefficient matrices arising from the finite element discretization of the basic equations for the phase field-based crack growth simulation. In particular, we prove that these matrices become symmetric positive definite under certain assumptions. This is important to be able to apply the efficient incomplete Cholesky preconditioner to these matrices.

3.1. The Two-Dimensional Case

The weak forms

We start with the time-discretized Equation (15) with

α_{1} = 0

. This is a Poisson equation for

u_{t + Δ t}

with variable coefficient

{(1 - z_{t})}^{2}

, and it is well known that its weak form can be written as follows:

\int_{Ω} {(1 - z_{t})}^{2} \nabla v \cdot \nabla u_{t + Δ t} d x = 0,

(22)

where

v (x)

is a test function with

v (x) = 0

on

Γ_{D}

. Finding

u_{t + Δ t}

satisfying (15) with

α_{1} = 0

is equivalent to finding

u_{t + Δ t}

satisfying (22) for arbitrary test functions

v (x)

.

Next, we derive the weak form for (16). By multiplying (16) by a test function

w (x)

, integrating over

Ω

, using a slight extension of Green’s identity (see Lemma 1),

\int_{Ω} ϕ \nabla \cdot (f \nabla ψ) d x = \int_{Γ} ϕ f \nabla ψ \cdot n d s - \int_{Ω} f \nabla ϕ \cdot \nabla ψ d x

(23)

with

ϕ = w

,

ψ = {\tilde{z}}_{t + Δ t}

and

f = γ (x)

, and noting that

\nabla {\tilde{z}}_{t + Δ t} \cdot n = 0

on

Γ_{N} = Γ

, we obtain

\begin{matrix} \int_{Ω} \{α_{2} w {\tilde{z}}_{t + Δ t} + Δ t (ϵ γ (x) \nabla w \cdot \nabla {\tilde{z}}_{t + Δ t} + w \frac{γ (x)}{ϵ} {\tilde{z}}_{t + Δ t} + w {| \nabla u_{t} |}^{2} {\tilde{z}}_{t + Δ t})\} d x \\ - \int_{Ω} w (α_{2} z_{t} + Δ t | \nabla u_{t} |^{2}) d x = 0 . \end{matrix}

(24)

Equations (17), (22) and (24) constitute the weak forms for the two-dimensional case.

Properties of the coefficient matrix for $u_{t + Δ t}$

Now, we consider the properties of the matrix obtained by applying the finite element discretization to the weak form (22). Let

ϕ_{0} (x)

be a function satisfying the boundary condition

ϕ_{0} (x, t) = g (x, t)

(

x \in Γ_{D}

) and

{ϕ_{j}}_{j = 1}^{m}

be basis functions that are zero on

Γ_{D}

. We approximate

u_{t + Δ t}

by

{\hat{u}}_{t + Δ t} (x)

, which is a linear combination of these functions:

{\hat{u}}_{t + Δ t} = ϕ_{0} + \sum_{j = 1}^{m} a_{j} ϕ_{j} .

(25)

Inserting this into (22) and choosing the test function as

v = ϕ_{i}

, we obtain

\sum_{j = 1}^{m} \{\int_{Ω} {(1 - z_{t})}^{2} \nabla ϕ_{i} \cdot \nabla ϕ_{j} d x\} a_{j} = - \int_{Ω} {(1 - z_{t})}^{2} \nabla ϕ_{i} \cdot \nabla ϕ_{0} d x .

(26)

By defining the matrix

C = (c_{i j}) \in R^{m \times m}

and the vector

d = (d_{i}) \in R^{m}

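by

c_{i j} = \int_{Ω} {(1 - z_{t})}^{2} \nabla ϕ_{i} \cdot \nabla ϕ_{j} d x,

(27)

d_{i} = - \int_{Ω} {(1 - z_{t})}^{2} \nabla ϕ_{i} \cdot \nabla ϕ_{0} d x

(28)

and letting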

a = {(a_{1}, a_{2}, \dots, a_{m})}^{⊤}

, (26) can be written as follows.

C a = d .

(29)

Since

c_{i j} = c_{j i}

from (27), it is clear that C is a symmetric matrix. Moreover, for an arbitrary nonzero vector

p = {(p_{1}, p_{2}, \dots, p_{m})}^{⊤}

, we have

\begin{matrix} p^{⊤} C p & = & \sum_{i = 1}^{m} \sum_{j = 1}^{m} p_{i} p_{j} \int_{Ω} {(1 - z_{t})}^{2} \nabla ϕ_{i} \cdot \nabla ϕ_{j} d x \\ & = & \int_{Ω} {(1 - z_{t})}^{2} (\sum_{i = 1}^{m} p_{i} \nabla ϕ_{i}) \cdot (\sum_{j = 1}^{m} p_{j} \nabla ϕ_{j}) d x \\ & = & \int_{Ω} {(1 - z_{t})}^{2} {|\nabla \sum_{i = 1}^{m} p_{i} ϕ_{i}|}^{2} d x . \end{matrix}

(30)

Since the functions $\{ϕ_{i}\}$ are linearly independent and vanish on $Γ_{D}$, the combination $\sum_{i = 1}^{m} p_{i} ϕ_{i}$ is neither identically zero nor a nonzero constant, so its gradient $\nabla \sum_{i = 1}^{m} p_{i} ϕ_{i}$ is not identically zero. The integral in the right-hand side is therefore positive as long as $0 \leq z_{t} (x) < 1$ for all $x$. Hence, C is positive definite, and we obtain the following theorem.

Theorem 1.

If

\forall x

,

0 \leq z_{t} (x) < 1

, then the coefficient matrix C of the equation for

u_{t + Δ t}

is symmetric positive definite.

Properties of the coefficient matrix for ${\tilde{z}}_{t + Δ t}$

For the phase field variable ${\tilde{z}}_{t + Δ t}$, the boundary condition consists only of the Neumann boundary condition. We therefore choose the basis functions $\{ψ_{j}\}_{j = 1}^{m}$ and approximate ${\tilde{z}}_{t + Δ t}$ by the following function:

{\hat{z}}_{t + Δ t} = \sum_{j = 1}^{m} b_{j} ψ_{j} .

(31)

Inserting this into (24) and choosing the test function as

w = ψ_{i}

gives

\begin{matrix} \sum_{j = 1}^{m} [\int_{Ω} \{α_{2} ψ_{i} ψ_{j} + Δ t (ϵ γ (x) \nabla ψ_{i} \cdot \nabla ψ_{j} + ψ_{i} \frac{γ (x)}{ϵ} ψ_{j} + ψ_{i} {| \nabla u_{t} |}^{2} ψ_{j})\} d x] b_{j} \\ = \int_{Ω} ψ_{i} (α_{2} z_{t} + Δ t {| \nabla u_{t} |}^{2}) d x . \end{matrix}

(32)

By defining the matrix

E = (e_{i j}) \in R^{m \times m}

and the vector

f = (f_{i}) \in R^{m}

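by

e_{i j} = \int_{Ω} \{α_{2} ψ_{i} ψ_{j} + Δ t (ϵ γ (x) \nabla ψ_{i} \cdot \nabla ψ_{j} + \frac{γ (x)}{ϵ} ψ_{i} ψ_{j} + {| \nabla u_{t} |}^{2} ψ_{i} ψ_{j})\} d x,

(33)

f_{i} = \int_{Ω} ψ_{i} (α_{2} z_{t} + Δ t {| \nabla u_{t} |}^{2}) d x

(34)

and letting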

b = {(b_{1}, b_{2}, \dots, b_{m})}^{⊤}

, we have the following linear simultaneous equation.

E b = f .

(35)

Since

e_{i j} = e_{j i}

from (33), E is symmetric. Furthermore, the matrix whose $(i, j)$ element is the first term of $e_{i j}$, namely $\int_{Ω} α_{2} ψ_{i} ψ_{j} d x$, is a Gram matrix scaled by $α_{2}$ and is therefore positive definite if $α_{2} > 0$ and $\{ψ_{i}\}$ is linearly independent. The matrices formed by the remaining terms of $e_{i j}$ are symmetric positive semidefinite, because $γ (x) > 0$ and ${| \nabla u_{t} |}^{2} \geq 0$. Thus, we arrive at the following theorem.

Theorem 2.

The coefficient matrix E of the equation for

{\tilde{z}}_{t + Δ t}

is symmetric positive definite.

Theorems 1 and 2 ensure that the CG method preconditioned by the incomplete Cholesky decomposition can be used to solve (29) and (35).
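Theorems 1 and 2 can be checked numerically on a toy problem. The sketch below assembles one-dimensional analogues of C and E with piecewise-linear (P1) elements on a uniform mesh. The one-dimensional setting, the element-wise constant coefficients, the Dirichlet condition at both end nodes, and the function name `assemble_1d` are all assumptions of this illustration and are not taken from the paper.

```python
import numpy as np

def assemble_1d(z, u, h, dt=0.01, eps=0.05, gamma=1.0, alpha2=1.0):
    """Assemble 1D analogues of C (for u) and E (for z) with P1 elements.

    z, u: nodal values of z_t and u_t on a uniform mesh of width h.
    Coefficients are treated as constant on each element (an assumption).
    """
    n = len(z)
    K1 = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h        # element stiffness
    M1 = np.array([[2.0, 1.0], [1.0, 2.0]]) * h / 6.0    # element mass
    C_full = np.zeros((n, n))
    E = np.zeros((n, n))
    for e in range(n - 1):
        idx = np.ix_([e, e + 1], [e, e + 1])
        zc = 0.5 * (z[e] + z[e + 1])           # element value of z_t
        g2 = ((u[e + 1] - u[e]) / h) ** 2      # |grad u_t|^2 on the element
        C_full[idx] += (1.0 - zc) ** 2 * K1
        E[idx] += (alpha2 + dt * (gamma / eps + g2)) * M1 + dt * eps * gamma * K1
    # Basis functions for u vanish on the Dirichlet boundary (both end nodes here),
    # so only the interior rows and columns belong to C.
    C = C_full[1:-1, 1:-1]
    return C, E

rng = np.random.default_rng(0)
z = rng.uniform(0.0, 0.9, 21)                  # satisfies 0 <= z_t < 1
u = rng.normal(size=21)
C, E = assemble_1d(z, u, h=0.05)
print(np.linalg.eigvalsh(C).min() > 0, np.linalg.eigvalsh(E).min() > 0)
```

If some entries of z are set to exactly 1 on adjacent elements, the smallest eigenvalue of C can drop to zero, which is why the condition on z_t appears in Theorem 1 but not in Theorem 2.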

3.2. The Three-Dimensional Case

We next consider the three-dimensional case described by (18) through (21). First, we prepare the following lemma.

Lemma 1.

Let Ω be a bounded region in the three-dimensional space, Γ its boundary, and

n

the outward normal vector on Γ. Additionally, let

ϕ (x)

,

ψ (x)

and

f (x)

be scalar fields and

w (x)

be a vector field defined in a region containing Ω. Then, the following equations hold:

\begin{matrix} \int_{Ω} ϕ \nabla \cdot (f w) d x & = & \int_{Γ} ϕ f w \cdot n d S - \int_{Ω} f \nabla ϕ \cdot w d x, \end{matrix}

(36)

\begin{matrix} \int_{Ω} ϕ \nabla \cdot (f \nabla ψ) d x & = & \int_{Γ} ϕ f \nabla ψ \cdot n d S - \int_{Ω} f \nabla ϕ \cdot \nabla ψ d x, \end{matrix}

(37)

where

\int_{Γ} \cdot d S

denotes the surface integral on Γ.

Proof.

First, we integrate both sides of the identity

\nabla \cdot (ϕ f w) = ϕ \nabla \cdot (f w) + f \nabla ϕ \cdot w

(38)

over

Ω

and apply Gauss’s theorem to the left-hand side to transform it to

\int_{Γ} ϕ f w \cdot n d S

. By moving the terms, we have (36). Then, we obtain (37) by letting

w = \nabla ψ

. □

The weak forms

We derive the weak form for

u_{t + Δ t}

by considering a vector test function

v (x)

that becomes

0

on

Γ_{D}

, computing its inner product with both sides of (21), assuming

α_{1} = 0

, and integrating the results over

Ω

(this is equivalent to deriving a weak form for each component of (18) by multiplying it by a scalar test function and integrating the result over

Ω

. This is because if we choose a special vector test function with its y and z components identical to zero, we obtain the same result as if we multiply the x component of (18) by a scalar test function). By letting

ϕ = v_{q}

,

f = {(1 - z_{t})}^{2}

and

w = s_{t + Δ t, q}

(

q = 1, 2, 3

) in (36) and summing both sides over q, we have

\begin{matrix} 0 & = & \sum_{q = 1}^{3} \int_{Ω} v_{q} \nabla \cdot ({(1 - z_{t})}^{2} s_{t + Δ t, q}) d x \\ = & \sum_{q = 1}^{3} \int_{Γ} v_{q} {(1 - z_{t})}^{2} s_{t + Δ t, q} \cdot n d S - \sum_{q = 1}^{3} \int_{Ω} {(1 - z_{t})}^{2} \nabla v_{q} \cdot s_{t + Δ t, q} d x \\ = & - \sum_{j = 1}^{3} \sum_{q = 1}^{3} \int_{Ω} {(1 - z_{t})}^{2} \frac{\partial v_{q}}{\partial x_{j}} {(σ_{t + Δ t})}_{j q} d x \\ = & - \sum_{j = 1}^{3} \sum_{q = 1}^{3} \int_{Ω} {(1 - z_{t})}^{2} \frac{\partial v_{q}}{\partial x_{j}} (λ δ_{j q} \nabla \cdot u_{t + Δ t} + 2 μ {(ϵ_{t + Δ t})}_{j q}) d x \\ = & - \int_{Ω} λ {(1 - z_{t})}^{2} (\nabla \cdot v) (\nabla \cdot u_{t + Δ t}) d x \\ - 2 \sum_{j = 1}^{3} \sum_{q = 1}^{3} \int_{Ω} μ {(1 - z_{t})}^{2} \frac{\partial v_{q}}{\partial x_{j}} {(ϵ_{t + Δ t})}_{j q} d x, \end{matrix}

(39)

where, in the second equality, we use the fact that

v_{q} = 0

on

Γ_{D}

and

s_{t + Δ t, q} \cdot n = 0

on

Γ_{N}

and therefore the surface integral on

Γ

vanishes. By equating the right-hand side of (39) to zero, we obtain the weak form for

u_{t + Δ t}

.

The weak form for

{\tilde{z}}_{t + Δ t}

is exactly the same as (24) for the two-dimensional case, except that

| \nabla u_{t} |^{2}

is replaced with

2 e (u_{t})

.

Properties of the coefficient matrix for $u_{t + Δ t}$

For finite element discretization of the weak form for $u_{t + Δ t}$, we approximate each component $u_{t + Δ t, j}$ ($j = 1, 2, 3$) of $u_{t + Δ t}$ as a linear combination of a function $ϕ_{j, 0}$ satisfying the boundary condition on $Γ_{D}$ and basis functions $\{ϕ_{j, ℓ}\}_{ℓ = 1}^{m}$ that become zero on $Γ_{D}$:

{\hat{u}}_{t + Δ t, j} = ϕ_{j, 0} + \sum_{ℓ = 1}^{m} a_{j, ℓ} ϕ_{j, ℓ} (j = 1, 2, 3) .

(40)

Now, we choose as the test function

v

a function whose ith element is

ϕ_{i, k}

and whose other elements are 0. Thus,

v_{q} = δ_{q i} ϕ_{i, k}

. Inserting this along with (40) and the definition of

ϵ_{i j}

into the weak form for

u_{t + Δ t}

gives

\begin{matrix} 0 & = & - \int_{Ω} λ {(1 - z_{t})}^{2} \frac{\partial ϕ_{i, k}}{\partial x_{i}} \sum_{j = 1}^{3} (\frac{\partial ϕ_{j, 0}}{\partial x_{j}} + \sum_{ℓ = 1}^{m} a_{j, ℓ} \frac{\partial ϕ_{j, ℓ}}{\partial x_{j}}) d x \\ - \sum_{j = 1}^{3} \int_{Ω} μ {(1 - z_{t})}^{2} \frac{\partial ϕ_{i, k}}{\partial x_{j}} (\frac{\partial ϕ_{j, 0}}{\partial x_{i}} + \sum_{ℓ = 1}^{m} a_{j, ℓ} \frac{\partial ϕ_{j, ℓ}}{\partial x_{i}} + \frac{\partial ϕ_{i, 0}}{\partial x_{j}} + \sum_{ℓ = 1}^{m} a_{i, ℓ} \frac{\partial ϕ_{i, ℓ}}{\partial x_{j}}) d x . \end{matrix}

By moving the terms containing the unknowns

{a_{j, ℓ}}

to the left-hand side and the other terms to the right-hand side, we have

\begin{matrix} \int_{Ω} λ {(1 - z_{t})}^{2} \frac{\partial ϕ_{i, k}}{\partial x_{i}} \sum_{j = 1}^{3} \sum_{ℓ = 1}^{m} a_{j, ℓ} \frac{\partial ϕ_{j, ℓ}}{\partial x_{j}} d x \\ + \sum_{j = 1}^{3} \int_{Ω} μ {(1 - z_{t})}^{2} \frac{\partial ϕ_{i, k}}{\partial x_{j}} (\sum_{ℓ = 1}^{m} a_{j, ℓ} \frac{\partial ϕ_{j, ℓ}}{\partial x_{i}} + \sum_{ℓ = 1}^{m} a_{i, ℓ} \frac{\partial ϕ_{i, ℓ}}{\partial x_{j}}) d x \\ = - \int_{Ω} λ {(1 - z_{t})}^{2} \frac{\partial ϕ_{i, k}}{\partial x_{i}} \sum_{j = 1}^{3} \frac{\partial ϕ_{j, 0}}{\partial x_{j}} d x \\ - \sum_{j = 1}^{3} \int_{Ω} μ {(1 - z_{t})}^{2} \frac{\partial ϕ_{i, k}}{\partial x_{j}} (\frac{\partial ϕ_{j, 0}}{\partial x_{i}} + \frac{\partial ϕ_{i, 0}}{\partial x_{j}}) d x \end{matrix}

(41)

Equations (41) for

i = 1, 2, 3

and

k = 1, 2, \dots, m

constitute linear simultaneous equations of order

3 m

in

{a_{j, ℓ}}

. Let us write this equation as

C a = d,

(42)

where C is a

3 m \times 3 m

coefficient matrix,

a

is a

3 m

-dimensional unknown vector and

d

is a

3 m

-dimensional right-hand side vector. To investigate the positive definiteness of C, we compute

p^{⊤} C p

, where

p

is a nonzero

3 m

-dimensional vector. To this end, we replace

a_{j, ℓ}

with

p_{j, ℓ}

in the left-hand side of (41), multiply the result with

p_{i, k}

and sum over i and k. Then, after some calculations, we obtain

\begin{matrix} p^{⊤} C p & = & \int_{Ω} λ {(1 - z_{t})}^{2} (\sum_{i = 1}^{3} \sum_{k = 1}^{m} p_{i, k} \frac{\partial ϕ_{i, k}}{\partial x_{i}}) (\sum_{j = 1}^{3} \sum_{ℓ = 1}^{m} p_{j, ℓ} \frac{\partial ϕ_{j, ℓ}}{\partial x_{j}}) d x \\ + \frac{1}{2} \sum_{i = 1}^{3} \sum_{j = 1}^{3} \int_{Ω} μ {(1 - z_{t})}^{2} (\sum_{k = 1}^{m} p_{i, k} \frac{\partial ϕ_{i, k}}{\partial x_{j}} + \sum_{k = 1}^{m} p_{j, k} \frac{\partial ϕ_{j, k}}{\partial x_{i}}) \\ \times (\sum_{ℓ = 1}^{m} p_{i, ℓ} \frac{\partial ϕ_{i, ℓ}}{\partial x_{j}} + \sum_{ℓ = 1}^{m} p_{j, ℓ} \frac{\partial ϕ_{j, ℓ}}{\partial x_{i}}) d x \\ = & \int_{Ω} λ {(1 - z_{t})}^{2} {(\sum_{i = 1}^{3} \sum_{k = 1}^{m} p_{i, k} \frac{\partial ϕ_{i, k}}{\partial x_{i}})}^{2} d x \\ + \frac{1}{2} \sum_{i = 1}^{3} \sum_{j = 1}^{3} \int_{Ω} μ {(1 - z_{t})}^{2} {(\sum_{k = 1}^{m} p_{i, k} \frac{\partial ϕ_{i, k}}{\partial x_{j}} + \sum_{k = 1}^{m} p_{j, k} \frac{\partial ϕ_{j, k}}{\partial x_{i}})}^{2} d x . \end{matrix}

(43)

It is clear from the expression in the middle that C is symmetric, since interchanging

(i, k)

and

(j, ℓ)

leaves it invariant. Furthermore, if

\forall x

,

0 \leq z_{t} (x) < 1

, we have

p^{⊤} C p \geq 0

from the last expression, so C is also positive semidefinite.

Finally, we investigate whether

p^{⊤} C p = 0

can occur for some

p \neq 0

. By writing

w_{i} = \sum_{k = 1}^{m} p_{i, k} ϕ_{i, k}

and

w = {(w_{1}, w_{2}, w_{3})}^{⊤}

, we can rewrite (43) as

p^{⊤} C p = \int_{Ω} λ {(1 - z_{t})}^{2} {(\nabla \cdot w)}^{2} d x + \frac{1}{2} \sum_{i = 1}^{3} \sum_{j = 1}^{3} \int_{Ω} μ {(1 - z_{t})}^{2} {(\frac{\partial w_{i}}{\partial x_{j}} + \frac{\partial w_{j}}{\partial x_{i}})}^{2} d x .

(44)

For the right-hand side to be zero, under the assumption that

\forall x

,

0 \leq z_{t} (x) < 1

,

\frac{\partial w_{i}}{\partial x_{j}} + \frac{\partial w_{j}}{\partial x_{i}}

must be identically zero for

i, j = 1, 2, 3

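. This means that the strain tensor computed from $w$ vanishes, so $w$ can only be a rigid-body motion, that is, a translation combined with an infinitesimal rotation. However, $w$ is zero on $Γ_{D}$, so if the target solid is fixed at three or more points that do not all lie on one line, no such motion is possible, and therefore the vector field $w$ must be identically zero. By the linear independence of the basis functions, this implies that $p = 0$. Thus, we arrive at the following theorem.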

Theorem 3.

If

\forall x

,

0 \leq z_{t} (x) < 1

, then the coefficient matrix C of the equation for

u_{t + Δ t}

is symmetric positive definite.
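The rigid-motion argument used before Theorem 3 can be made concrete with a short numerical check. The sketch below is an illustration under our own assumptions (the helper names `skew`, `strain`, and `pinned_rank` are hypothetical): it verifies that a field of the form w(x) = a + ω × x has zero strain, and that requiring w = 0 at three points removes all six rigid-motion degrees of freedom only when the points are not collinear.

```python
import numpy as np

def skew(v):
    """Matrix S such that S @ x equals the cross product v x x."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def strain(grad_w):
    """Small-strain tensor: symmetric part of the displacement gradient."""
    return 0.5 * (grad_w + grad_w.T)

def pinned_rank(points):
    """Rank of the linear constraints w(x_p) = a + omega x x_p = 0 in (a, omega)."""
    rows = [np.hstack([np.eye(3), -skew(p)]) for p in points]
    return np.linalg.matrix_rank(np.vstack(rows))

omega = np.array([0.3, -0.2, 0.5])     # arbitrary infinitesimal rotation
# The gradient of w(x) = a + omega x x is skew(omega), whose symmetric part is zero.
print(np.allclose(strain(skew(omega)), 0.0))

non_collinear = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
collinear = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
# Rank 6 means only a = 0, omega = 0 satisfies the constraints.
print(pinned_rank(non_collinear), pinned_rank(collinear))
```

With collinear points the rank is 5, since a rotation about the line through the points remains possible.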

Properties of the coefficient matrix for ${\tilde{z}}_{t + Δ t}$

The weak form for

{\tilde{z}}_{t + Δ t}

in the three-dimensional case is identical to that of the two-dimensional case except that

| \nabla u_{t} |^{2}

is replaced with

2 e (u_{t})

. Since

e (u_{t})

is nonnegative as well as

| \nabla u_{t} |^{2}

, by repeating the same argument that led to Theorem 2, we obtain the following theorem:

Theorem 4.

The coefficient matrix of the equation for

{\tilde{z}}_{t + Δ t}

is symmetric positive definite.

So far, we have assumed that both

λ

and

μ

are constants over

Ω

. However, a close examination of the derivation of Theorems 3 and 4 reveals that they are valid even when

λ

and

μ

are continuous functions of

x

. In some applications,

λ (x)

and

μ (x)

might have discontinuities. In such a case, we can approximate them with smooth functions of

x

, using a sufficiently fine mesh around the discontinuities.

4. Application of the Incomplete Cholesky Preconditioner and Its Parallelization

Now that we have established that the coefficient matrices are symmetric positive definite, we can apply the IC preconditioner, which is more effective than diagonal scaling. In this section, we first describe the IC preconditioner briefly and then explain how to parallelize it using the block multi-color ordering proposed by Iwashita et al.

4.1. The Incomplete Cholesky Preconditioner

Let

A = (a_{i j}) \in R^{n \times n}

be a sparse symmetric positive definite (SPD) matrix and consider solving the system of linear equations

A x = b

by the conjugate gradient (CG) method. In the preconditioned conjugate gradient method, one applies the CG method to the modified equation

(K^{- 1} A K^{- ⊤}) (K^{⊤} x) = K^{- 1} b

, whose coefficient matrix

K^{- 1} A K^{- ⊤}

is again SPD. Here,

K \in R^{n \times n}

is a nonsingular matrix designed so that

K^{- 1} A K^{- ⊤}

has a smaller condition number than A. Thus, the convergence of the CG method is accelerated. The diagonal scaling preconditioner uses

diag (\sqrt{a_{11}}, \sqrt{a_{22}}, \dots, \sqrt{a_{n n}})

as K. This preconditioner is simple and applicable to a wide class of matrices, but it is often not very effective in reducing the number of iterations of the CG method.
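The preconditioned CG iteration with diagonal scaling can be sketched as follows. This is an illustrative dense-matrix version in plain Python, not the C and OpenMP solver used in this study. It uses the standard formulation in which the preconditioner is applied as a solve with M = K K^⊤, which is mathematically equivalent to applying CG to the transformed equation above. The names `pcg` and `diagonal_scaling` are hypothetical; the stopping rule (relative residual at most 10^{-10}) follows the criterion used in Section 5.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    return [dot(row, x) for row in A]


def diagonal_scaling(A):
    """Return a function applying M^{-1} for M = K K^T = diag(a_11, ..., a_nn)."""
    diag = [A[i][i] for i in range(len(A))]
    return lambda r: [ri / di for ri, di in zip(r, diag)]


def pcg(A, b, apply_Minv, tol=1e-10, max_iter=1000):
    """Preconditioned CG; returns (x, number of iterations performed)."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                      # residual for the initial guess x = 0
    b_norm = math.sqrt(dot(b, b))
    if b_norm == 0.0:
        return x, 0
    z = apply_Minv(r)
    p = list(z)
    rz = dot(r, z)
    for it in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, it
        z = apply_Minv(r)
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter
```

Any other SPD preconditioner can be plugged in by passing a different `apply_Minv`, for example one built from forward and backward substitutions with an incomplete Cholesky factor.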

In this study, we use a more powerful preconditioner based on the incomplete Cholesky decomposition without fill-ins (the IC(0) decomposition). In this decomposition, we compute the Cholesky decomposition of A approximately by allowing the element

{\tilde{l}}_{i j}

of the approximate Cholesky factor

\tilde{L}

to be nonzero only when

a_{i j} \neq 0

. Thus, fill-ins in the Cholesky decomposition are suppressed, and the computational cost and the memory requirement are reduced. The algorithm of the IC(0) decomposition is shown as Algorithm 1. Here, in the sums

\sum_{k = 1}^{j - 1} {\tilde{l}}_{j k}^{2}

and

\sum_{k = 1}^{j - 1} {\tilde{l}}_{i k} {\tilde{l}}_{j k}

, zero terms are skipped to reduce the computational work.

Algorithm 1: IC(0) decomposition

1: for $j = 1$ to $n$ do
2:   ${\tilde{l}}_{j j} = {(a_{j j} - \sum_{k = 1}^{j - 1} {\tilde{l}}_{j k}^{2})}^{1 / 2}$
3:   for $i = j + 1$ to $n$ if $a_{i j} \neq 0$ do
4:     ${\tilde{l}}_{i j} = (a_{i j} - \sum_{k = 1}^{j - 1} {\tilde{l}}_{i k} {\tilde{l}}_{j k}) / {\tilde{l}}_{j j}$
5:   end for
6: end for

In the incomplete Cholesky preconditioner,

\tilde{L}

computed by Algorithm 1 is used as K. While

\tilde{L}

satisfies only an approximate relation

A ≃ \tilde{L} {\tilde{L}}^{⊤}

, it is often a sufficiently good approximation to the true Cholesky factor to make

K^{- 1} A K^{- ⊤}

much better conditioned than A.

For a sparse matrix A arising in the finite element method, a simplified IC(0) decomposition is sometimes used instead of Algorithm 1. In this variant, the off-diagonal elements are not updated and the fourth line of Algorithm 1 is replaced with

{\tilde{l}}_{i j} = a_{i j} / {\tilde{l}}_{j j}

. We use this variant in this study.
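For illustration, Algorithm 1 and its simplified variant can be written as a single routine. The sketch below uses dense storage for readability, whereas a practical implementation would use a sparse format and skip zero terms; the function name `ic0` and the `simplified` flag are hypothetical. Note that the IC(0) decomposition of a general SPD matrix can break down with a non-positive pivot, so the sketch raises an error in that case rather than silently continuing.

```python
import math


def ic0(A, simplified=False):
    """Incomplete Cholesky factor without fill-in (dense storage sketch).

    With simplified=True, line 4 of Algorithm 1 is replaced by
    l_ij = a_ij / l_jj, i.e., the off-diagonal elements are not updated.
    """
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        pivot = A[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if pivot <= 0.0:
            raise ValueError("IC(0) breakdown: non-positive pivot at column %d" % j)
        L[j][j] = math.sqrt(pivot)
        for i in range(j + 1, n):
            if A[i][j] == 0.0:
                continue  # keep the sparsity pattern of A: no fill-in
            if simplified:
                L[i][j] = A[i][j] / L[j][j]
            else:
                s = sum(L[i][k] * L[j][k] for k in range(j))
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L
```

For a matrix whose lower triangle is completely dense, the standard variant reproduces the exact Cholesky factor, while the simplified variant generally does not. Both variants keep the diagonal of the product of the factor with its transpose equal to the diagonal of A.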

4.2. Parallelization by the Block Multi-Color Ordering

The IC(0) decomposition algorithm inherits the sequential nature of the original Cholesky decomposition. Suppose that

i > j

and

a_{i j} \neq 0

. Then, since

{\tilde{l}}_{i i}

depends on

{\tilde{l}}_{i j}

by line 2 of Algorithm 1 and

{\tilde{l}}_{i j}

depends on

{\tilde{l}}_{j j}

by line 4,

{\tilde{l}}_{i i}

depends on

{\tilde{l}}_{j j}

. In the finite element method using triangular (or tetrahedral) elements and piecewise linear basis functions, the matrix element

a_{i j}

is nonzero if and only if the nodes i and j belong to the same element. The dependency thus caused gives rise to a difficulty in parallelizing the IC(0) decomposition.

One of the techniques to resolve this problem is multi-color ordering [14]. In this ordering strategy, we assign colors

1, 2, \dots, c

to the nodes in such a way that nodes belonging to the same element have different colors and the number of required colors is minimal. Then, we renumber the nodes so that the nodes with color 1 are numbered first, those with color 2 are numbered next, and so on. Then, if nodes i and j have the same color, they do not belong to the same element, and therefore

a_{i j} = 0

. Thus, the computation of

{\tilde{l}}_{i i}

and

{\tilde{l}}_{j j}

can be done in parallel.
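A node-wise multi-color ordering can be sketched with a greedy coloring, as follows. This is only an illustration under stated assumptions: the greedy rule guarantees that nodes sharing an element receive different colors, but it does not guarantee that the number of colors is minimal. The function names and the element list format (tuples of node indices) are hypothetical.

```python
def greedy_coloring(elements, n_nodes):
    """Assign colors 0, 1, ... so that nodes of the same element differ in color."""
    adjacent = [set() for _ in range(n_nodes)]
    for elem in elements:
        for a in elem:
            for b in elem:
                if a != b:
                    adjacent[a].add(b)
    color = [-1] * n_nodes
    for v in range(n_nodes):
        used = {color[u] for u in adjacent[v] if color[u] >= 0}
        c = 0
        while c in used:
            c += 1
        color[v] = c  # smallest color not used by an already colored neighbor
    return color


def multicolor_permutation(color):
    """Return order with order[new_index] = old_index, grouped by color."""
    return sorted(range(len(color)), key=lambda v: (color[v], v))
```

For a unit square split into the two triangles (0, 1, 2) and (1, 2, 3), nodes 0 and 3 do not share an element and therefore receive the same color, so their diagonal entries of the factor can be computed independently.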

However, it has been pointed out that this reordering can degrade the quality of the IC(0) preconditioner, thereby increasing the number of CG iterations. Consult [15] for more details about this phenomenon. As a remedy, Iwashita et al. proposed block multi-color ordering [7,8], which partitions the set of nodes into blocks and applies the multi-color ordering to the blocks rather than to the individual nodes. It is known that this modification is effective in retaining the quality of the IC(0) preconditioner, since the ordering within each block can be made the same as the natural ordering. The block multi-color ordering applied to a two-dimensional triangular mesh and the nonzero pattern of the resulting matrix are depicted in Figure 2 and Figure 3, respectively. Here,

c = 4

, and thus the matrix has a

4 \times 4

block structure. Each of the four diagonal blocks has a

2 \times 2

block diagonal structure, reflecting the fact that there are two blocks of each color. Thus, the IC(0) decomposition of these two (small) diagonal blocks can be performed in parallel. We use this ordering strategy in our numerical experiments.
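The block variant can be sketched for a structured grid as follows. The sketch assumes an nx × ny grid of nodes in which two nodes can be coupled only if their grid indices differ by at most one in each direction, rectangular blocks, and a 2 × 2 checkerboard assignment of the four colors to the blocks. These choices, and the function name, are illustrative assumptions rather than a description of the authors' code. Within each block the natural ordering is kept.

```python
def block_multicolor_order(nx, ny, nbx, nby):
    """Return a list of (color, block, old_index) sorted in the new ordering.

    Nodes are numbered naturally as old_index = iy * nx + ix and grouped
    into nbx * nby rectangular blocks. Block (bx, by) gets the color
    (bx % 2) + 2 * (by % 2), so blocks of the same color never touch.
    """
    if nbx > nx or nby > ny:
        raise ValueError("each block must contain at least one node")
    keyed = []
    for iy in range(ny):
        by = iy * nby // ny
        for ix in range(nx):
            bx = ix * nbx // nx
            color = (bx % 2) + 2 * (by % 2)
            block = by * nbx + bx
            keyed.append((color, block, iy * nx + ix))
    # Sort by color, then by block, then by natural order within the block.
    keyed.sort()
    return keyed
```

With `block_multicolor_order(8, 8, 4, 2)` there are eight blocks and two blocks per color, which corresponds to the situation described above: the two diagonal sub-blocks of each color are decoupled and can be factorized by different threads.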

5. Numerical Results

In this section, we apply the conjugate gradient method with the IC(0) preconditioner to the systems of linear equations to compute the time evolution of

u (x, t)

and

z (x, t)

in the phase field-based crack growth simulation. We parallelize the IC(0) preconditioner using the block multi-color ordering and evaluate its parallel performance, as well as the convergence acceleration effect compared with the diagonal scaling preconditioner. Both two- and three-dimensional problems are used as test problems.

5.1. The Two-Dimensional Case

We used the 2-D phase field-based crack growth simulation code developed by Takaishi et al. as a basis and replaced its linear equation solver, which uses the CG method with diagonal scaling, with our solver. Our solver is based on the CG method with IC(0) preconditioning and is thread-parallelized using block multi-color ordering. We used four colors, as in Figure 2, and set the number of blocks equal to four times the number of threads. The program was written in C with OpenMP, and all computations were performed in double-precision arithmetic. In the numerical experiments in this subsection, we used an Intel Xeon processor E5-2660 v2, which has 10 cores, and Intel C Compiler Ver. 16.0.0.109 with the -O3 option.

The computational region

Ω

used in the numerical experiments is as shown in Figure 1. It is a square region (panel) with the initial crack

Σ

running from the left edge to the center. The Dirichlet boundary conditions, which represent the forces to widen the crack, are applied to the upper and lower edges. More specifically, the problem is defined as follows.

Computational region: $Ω = [- 1, 1] \times [- 1, 1], Γ = Γ_{D} + Γ_{N}$ .
Dirichlet boundary: $Γ_{D} = \{(x_{1}, x_{2}) | x_{1} \in [- 1, 1], x_{2} = \pm 1\}$ .
Neumann boundary: $Γ_{N} = \{(x_{1}, x_{2}) | x_{1} = \pm 1, x_{2} \in [- 1, 1]\}$ .
Time step: $Δ t = 0.05$ .
Parameters: $α_{1} = 0, α_{2} = 10^{- 3}, γ = 0.5, ε = 10^{- 3}$ .
Initial conditions: $u (x, 0) = 0, z (x, 0) = ξ (x_{1} + 0.5, x_{2})$ , where $ξ (x_{1}, x_{2}) = exp (- {(x_{2} / δ)}^{2}) / (1 + exp (x_{1} / δ))$ .
Convergence criterion of the CG method: relative residual $\leq 10^{- 10}$ .
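The initial condition for z can be evaluated as in the following sketch. The value of δ is not given in this list, so it is left as an explicit argument; the value 0.05 used in the usage note is purely hypothetical. The logistic factor is evaluated in a form that avoids floating-point overflow when x_1/δ is large.

```python
import math


def xi(x1, x2, delta):
    """xi(x1, x2) = exp(-(x2/delta)^2) / (1 + exp(x1/delta))."""
    t = x1 / delta
    if t > 0.0:
        # 1 / (1 + e^t) = e^{-t} / (1 + e^{-t}), safe for large positive t
        e = math.exp(-t)
        step = e / (1.0 + e)
    else:
        step = 1.0 / (1.0 + math.exp(t))
    return math.exp(-(x2 / delta) ** 2) * step


def z_initial(x1, x2, delta):
    """Initial phase field z(x, 0) = xi(x1 + 0.5, x2)."""
    return xi(x1 + 0.5, x2, delta)
```

With the hypothetical choice δ = 0.05, `z_initial(-1.0, 0.0, 0.05)` is close to 1 (on the initial crack), `z_initial(-0.5, 0.0, 0.05)` equals 0.5, and the value decays rapidly away from the line x_2 = 0.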

We first checked the positive definiteness of the coefficient matrix C for

u (x, t)

(see (27)) using small problems with meshes ranging from

11 \times 11

to

50 \times 50

. Note that the matrix E for

z (x, t)

(see (33)) is obviously positive definite because it is an

O (Δ t)

perturbation of the positive definite Gram matrix. The numerical experiments confirmed that the smallest eigenvalue of C is always positive, and thus that C is positive definite. The smallest eigenvalue

λ_{min}

and the largest eigenvalue

λ_{max}

of C for each mesh before, during and after crack growth are shown in Table 1. Figure 4 shows how the condition number of C changes as the crack grows. The condition number jumps sharply around

t = 35

, when the crack grows rapidly. This is presumably because the region where

z (x, t) ≃ 1

widens, making C nearly singular, as can be inferred from (30).
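Since C is symmetric positive definite, its 2-norm condition number is the ratio of the largest to the smallest eigenvalue. The following sketch computes this ratio from the eigenvalues listed in Table 1. Because the tabulated values are rounded to three decimal places, the resulting condition numbers are only rough estimates; the dictionary layout and names are illustrative.

```python
# (lambda_min, lambda_max) of C, copied from Table 1.
TABLE1 = {
    "11x11": {"before": (1.000, 493.486), "during": (0.544, 424.596), "after": (0.376, 424.558)},
    "30x30": {"before": (0.182, 520.190), "during": (0.070, 428.934), "after": (0.065, 428.934)},
    "50x50": {"before": (0.063, 522.691), "during": (0.015, 504.204), "after": (0.014, 504.204)},
}


def condition_number(lambda_min, lambda_max):
    """2-norm condition number of an SPD matrix."""
    return lambda_max / lambda_min


KAPPA = {
    mesh: {stage: condition_number(*eigs) for stage, eigs in stages.items()}
    for mesh, stages in TABLE1.items()
}
```

For every mesh in Table 1, the estimated condition number increases from before to after crack growth, and it also increases as the mesh is refined. Both trends are consistent with the need for a stronger preconditioner than diagonal scaling.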

The time evolution of

z (x, t)

for this problem is shown in Figure 5. Here, the mesh size is

101 \times 101

and the period of simulation is from

t = 0

to

t = 34.5

. Until

t = 28.5

,

z (x, t)

does not change significantly and

z (x, t) ≃ 1

only along the line connecting

(- 1, 0)

and

(0, 0)

, showing that the crack exists only in this region. As time passes, the region of

z (x, t) ≃ 1

extends to the right edge of the region, meaning that the panel has broken into two pieces. The evolution of

z (x, t)

for other initial conditions, which was computed using a FreeFEM code, is shown in Appendix A.

We next evaluated the parallel acceleration of our linear equation solver by varying the number of threads from 1 to 10. The results for

101 \times 101

and

201 \times 201

meshes are shown in Figure 6 and Figure 7, respectively. The number of blocks in the

x_{1}

and

x_{2}

directions, which we denote by

n b x

and

n b y

, are shown in Table 2. These numbers were determined experimentally to achieve the best performance in each case, under the condition that

n b x \times n b y

equals four times the number of threads. It can be seen that the solution of the systems of linear equations for

u

and z achieves an acceleration of up to 5 and 4 times, respectively, using 10 threads.
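These speedups can be converted into parallel efficiencies, and, under the additional assumption that Amdahl's law describes the scaling (an assumption made only for this illustration), into an estimate of the non-parallelized fraction of the work. The function names are hypothetical, and the inputs are the rounded speedups quoted above.

```python
def parallel_efficiency(speedup, threads):
    """Speedup divided by the number of threads."""
    return speedup / threads


def amdahl_serial_fraction(speedup, threads):
    """Solve S = 1 / (f + (1 - f) / p) for the serial fraction f (p > 1)."""
    if threads <= 1:
        raise ValueError("at least two threads are needed")
    return (threads / speedup - 1.0) / (threads - 1.0)


# Rounded speedups with 10 threads for the equations for u and z.
EFF_U = parallel_efficiency(5.0, 10)
EFF_Z = parallel_efficiency(4.0, 10)
SERIAL_U = amdahl_serial_fraction(5.0, 10)
SERIAL_Z = amdahl_serial_fraction(4.0, 10)
```

The efficiencies are 0.5 and 0.4, and the implied serial fractions are roughly 0.11 and 0.17. These figures are indicative only, since memory bandwidth limits and synchronization costs are not captured by Amdahl's law.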

Finally, we compare the execution time of our solver with that of the original solver using the diagonal scaling preconditioner. The results are shown in Figure 8 and Figure 9. Compared with the original solver, our solver achieved an acceleration of 7.2 and 7.4 times for the

101 \times 101

and

201 \times 201

meshes, respectively.

5.2. The Three-Dimensional Case

For the three-dimensional case, we again used Takaishi et al.’s code and replaced its linear equation solver with our parallel CG solver with IC(0) preconditioning and multi-color ordering. We used a semi-structured mesh as shown in Figure 10, which is unstructured in the

(x_{1}, x_{2})

plane and is structured in the

x_{3}

direction, sliced it horizontally into 2× (number of threads) panels, and colored them in an alternating manner using two colors. Then, each pair of panels was allocated to one thread for parallel execution. In the numerical experiments in this subsection, we used an Intel Xeon Gold 6148 Processor with 20 cores (1 node of the Grand Chariot supercomputer at the Hokkaido University Information Initiative Center) and Intel C Compiler Ver. 18.0.3 with the -O3 option.
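The slicing and coloring scheme described above can be sketched as follows. The sketch assumes that the mesh consists of `n_layers` node layers in the x_3 direction, that couplings exist only within a layer and between adjacent layers, and that the two panels forming a pair are adjacent; the function name and the returned dictionary layout are hypothetical.

```python
def slice_into_panels(n_layers, n_threads):
    """Split the layers into 2 * n_threads panels with alternating colors.

    Panels 2t and 2t + 1 (colors 0 and 1) are assigned to thread t, so all
    panels of one color can be processed concurrently, one per thread.
    """
    n_panels = 2 * n_threads
    if n_layers < n_panels:
        raise ValueError("each panel needs at least one layer")
    panels = []
    for p in range(n_panels):
        start = p * n_layers // n_panels
        stop = (p + 1) * n_layers // n_panels
        panels.append({
            "panel": p,
            "layers": range(start, stop),
            "color": p % 2,
            "thread": p // 2,
        })
    return panels
```

Because two panels of the same color are always separated by a panel of the other color, no layer in one of them is adjacent to a layer in the other, so they can be treated independently within each color sweep.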

The test problem is crack growth in a rectangular parallelepiped region, defined as follows.

Computational region: $Ω = [- 1, 1] \times [- 1, 1] \times [- 0.5, 0.5], Γ = Γ_{D}^{(1)} + Γ_{D}^{(2)} + Γ_{N}$ ;
Dirichlet boundary: $Γ_{D}^{(1)} = \{(x_{1}, x_{2}, x_{3}) | x_{1} \in [- 1, 1], x_{2} \in [- 1, 1], x_{3} = \pm 0.5\}$ ;
Dirichlet boundary: $Γ_{D}^{(2)} = \{(x_{1}, x_{2}, x_{3}) | x_{1} \in [- 1, 1], x_{2} = \pm 1, x_{3} \in [- 0.5, 0.5]\}$ ;
Neumann boundary: $Γ_{N} = \{(x_{1}, x_{2}, x_{3}) | x_{1} = \pm 1, x_{2} \in [- 1, 1], x_{3} \in [- 0.5, 0.5]\}$ ;
Time step: $Δ t = 0.05$ ;
Parameters: $α_{1} = 0, α_{2} = 10^{- 3}, γ = 0.5, ε = 10^{- 3}$ ;
Initial conditions: $u (x, 0) = 0, z (x, 0) = ξ (x_{1} + 0.5, x_{2})$ , where $ξ (x_{1}, x_{2}) = exp (- {(x_{2} / δ)}^{2}) / (1 + exp (x_{1} / δ))$ ;
Convergence criterion of the CG method: relative residual $\leq 10^{- 10}$ .

The parallel acceleration of our solver for the

51 \times 51 \times 52

and

101 \times 101 \times 102

meshes is shown in Figure 11 and Figure 12, respectively. For the latter mesh, the solution of the linear system for

u

and z was accelerated by up to 6.7 and 4.4 times, respectively. We also show the acceleration of each component of our solver and the breakdown of the execution time in Figure 13, Figure 14, Figure 15 and Figure 16. Our solver consists of a matrix-vector product, forward and backward substitutions corresponding to the application of

{\tilde{L}}^{- 1}

and

{\tilde{L}}^{- ⊤}

, respectively, and other parts such as vector additions, dot products and computation of norms. Figure 13 and Figure 15 show that the matrix-vector product and the forward/backward substitutions are reasonably well accelerated, but Figure 14 and Figure 16 reveal that the other parts are not accelerated at all. The latter are difficult to parallelize efficiently because they are vector operations with small computational work and a relatively large amount of data transfer. If this part could be improved, our solver could achieve further acceleration.
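The forward and backward substitutions that apply the preconditioner can be sketched as follows, again with dense storage and hypothetical function names. The sequential loops over i are the part whose dependencies are relaxed by the multi-color ordering: rows belonging to different blocks of the same color do not reference each other and can be processed concurrently.

```python
def forward_substitution(L, r):
    """Solve L y = r for a lower triangular L."""
    n = len(r)
    y = [0.0] * n
    for i in range(n):
        s = sum(L[i][k] * y[k] for k in range(i) if L[i][k] != 0.0)
        y[i] = (r[i] - s) / L[i][i]
    return y


def backward_substitution(L, y):
    """Solve L^T z = y using the same lower triangular L."""
    n = len(y)
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(L[k][i] * z[k] for k in range(i + 1, n) if L[k][i] != 0.0)
        z[i] = (y[i] - s) / L[i][i]
    return z


def apply_ic_preconditioner(L, r):
    """Return z = (L L^T)^{-1} r by one forward and one backward sweep."""
    return backward_substitution(L, forward_substitution(L, r))
```

Each sweep touches every nonzero of the factor once, so its cost is comparable to that of a sparse matrix-vector product, whereas dot products and vector updates perform only a few floating-point operations per vector element loaded.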

Finally, we compare the execution time of our solver with that of the original solver in Figure 17 and Figure 18. Thanks to the use of the IC(0) preconditioner with multi-color ordering, our solver attains an acceleration of 6.1 and 8.3 times for the

51 \times 51 \times 52

and

101 \times 101 \times 102

meshes, respectively.

6. Conclusions

In this paper, we accelerated the linear equation solution part in phase field-based crack growth simulation using the conjugate gradient method with IC(0) preconditioning. To this end, we first analyzed the properties of the coefficient matrices both for the displacement

u (x, t)

and the phase field

z (x, t)

and proved that they are symmetric positive definite under certain conditions. Thus, the use of the IC(0) preconditioning is justified. Then, we parallelized the IC(0) preconditioner using the block multi-color ordering and evaluated its performance on multicore processors. The experimental results show that our solver scales well for both the two- and three-dimensional problems and achieves an acceleration of several times over the original solver based on the diagonally scaled CG method. Our future work will include the distributed-memory parallelization of our solver and its application to real-world crack growth problems.

Author Contributions

Conceptualization, T.T. and Y.Y.; theoretical analysis, G.I. and Y.Y.; coding, G.I. and T.T.; parallelization and optimization, G.I.; numerical experiments, G.I. All authors have read and agreed to the published version of the manuscript.

Funding

This study is partially supported by JSPS KAKENHI Grant Numbers 17H02828, 17K19966 and 19KK0255.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The parallel ICCG solver developed in this work, as well as the FreeFEM code used in Appendix A, is available from the authors upon request. Some of these programs are also downloadable from the following URL: https://github.com/yusakuyamamoto/Crack-growth-simulation, accessed on 20 August 2021.

Acknowledgments

The authors thank Takaharu Yaguchi of Kobe University and Shuhei Kudo of The University of Electro-Communications for valuable discussion. They are also grateful to Takeshi Fukaya of Hokkaido University for providing computational environments for the three-dimensional problem. Part of our numerical experiments were performed on the Grand Chariot supercomputer at Hokkaido University Information Initiative Center.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Two-Dimensional Crack Growth Simulation for Various Initial Conditions

Here, we present the results of the two-dimensional crack growth simulation for various initial conditions. The computational region is the same as that used in Section 5.1, and a FreeFEM [16] code was used for the simulation. The initial conditions used are as follows:

1. $z (x, 0) = ξ (- x_{1} + 0.5, x_{2} + 0.2) + ξ (x_{1} + 0.5, x_{2} - 0.2)$
2. $z (x, 0) = ξ (- x_{1} + 0.5, x_{2} + 0.8) + ξ (x_{1} + 0.5, x_{2} - 0.8)$
3. $z (x, 0) = ξ (- x_{1} + 0.5, x_{2} + 0.2) + ξ (x_{1} + 0.5, x_{2} - 0.2) + ξ (- x_{1} + 0.5, x_{2})$

Here, the function

ξ (x)

is as defined in Section 5.1 and the initial condition for u is the same as that used in Section 5.1. Cases 1 and 2 correspond to the case of two initial cracks: one at the left and another at the right. The vertical distance between the cracks is small for case 1 and large for case 2. Case 3 corresponds to the case of three initial cracks: two at the left and one at the right. The time evolution of

z (x, t)

for these cases is shown in Figure A1, Figure A2 and Figure A3 as contour maps. We used a

50 \times 50

mesh for cases 1 and 2. For case 3, it turned out that this mesh is too coarse, so we used a

100 \times 100

mesh. It can be seen that all of the simulations give physically plausible results.

Figure A1. Time evolution of the phase-field variable

z (x, t)

(Case 1).

Figure A2. Time evolution of the phase-field variable

z (x, t)

(Case 2).

Figure A3. Time evolution of the phase-field variable

z (x, t)

(Case 3).

References

1. Francfort, G.A.; Marigo, J.-J. Revisiting Brittle Fracture as an Energy Minimization Problem. J. Mech. Phys. Solids 1998, 46, 1319–1342.
2. Kimura, M.; Takaishi, T.; Alfat, S.; Nakano, T.; Tanaka, Y. Irreversible phase field models for crack growth in industrial applications: Thermal stress, viscoelasticity, hydrogen embrittlement. SN Appl. Sci. 2021, 3, 781.
3. Takaishi, T.; Kimura, M. Phase field model for mode III crack growth. Kybernetika 2009, 45, 605–614.
4. Takaishi, T. Numerical simulations of a phase field model for mode III crack growth. Trans. Jpn. Soc. Ind. Appl. Math. 2009, 19, 351–369. (In Japanese)
5. Kobayashi, R. Modeling and numerical simulations of dendritic crystal growth. Phys. D 1998, 63, 410–423.
6. Provatas, N.; Elder, K. Phase-Field Methods in Materials Science and Engineering; Wiley-VCH: Weinheim, Germany, 2010.
7. Iwashita, T.; Nakashima, H.; Takahashi, Y. Algebraic block multi-color ordering method for parallel multi-threaded sparse triangular solver in ICCG method. In Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium, Shanghai, China, 21 May 2012; pp. 474–483.
8. Iwashita, T.; Li, S.; Fukaya, T. Hierarchical block multi-color ordering: A new parallel ordering method for vectorization and parallelization of the sparse triangular solver in the ICCG method. CCF Trans. High. Perform. Comput. 2020, 2, 84–97.
9. Griffith, A.A. The Phenomena of Rupture and Flow in Solids. Phil. Trans. R. Soc. Lond. 1921, A221, 163–198.
10. Bourdin, B.; Francfort, G.A.; Marigo, J.-J. Numerical Experiments in Revisited Brittle Fracture. J. Mech. Phys. Solids 2000, 48, 797–826.
11. Ambrosio, L.; Tortorelli, V.M. On the approximation of free discontinuity problems. Boll. Un. Mat. Ital. 1992, 7, 105–123.
12. Akagi, G.; Kimura, M. Unidirectional evolution equations of diffusion type. J. Differ. Equ. 2019, 266, 1–43.
13. Visintin, A. Models of Phase Transitions; Birkhaeuser: Basle, Switzerland, 1996.
14. Saad, Y. Iterative Methods for Sparse Linear Systems; SIAM: Philadelphia, PA, USA, 2003.
15. Iwashita, T.; Nakanishi, Y.; Shimasaki, M. Comparison criteria for parallel orderings in ILU preconditioning. SIAM J. Sci. Comput. 2005, 26, 1234–1260.
16. FreeFEM. Available online: https://freefem.org/ (accessed on 20 August 2021).

Figure 1. A two-dimensional region

Ω

and crack

Σ

.

Figure 2. Block multi-color ordering for a two-dimensional triangular mesh.

Figure 3. The coefficient matrix ordered by block multi-color ordering.

Figure 4. Time evolution of the condition number of C.

Figure 5. Time evolution of the phase-field variable

z (x, t)

.

Figure 6. Parallel acceleration for the

101 \times 101

mesh.

Figure 7. Parallel acceleration for the

201 \times 201

mesh.

Figure 8. Execution time for the

101 \times 101

mesh.

Figure 9. Execution time for the

201 \times 201

mesh.

Figure 10. Semi-structured mesh for the 3-D simulation and its ordering.

Figure 11. Parallel acceleration for the

51 \times 51 \times 52

mesh.

Figure 12. Parallel acceleration for the

101 \times 101 \times 102

mesh.

Figure 13. Parallel acceleration of each component for the

51 \times 51 \times 52

mesh.

Figure 14. Breakdown of the execution time for the

51 \times 51 \times 52

mesh.

Figure 15. Parallel acceleration of each component for the

101 \times 101 \times 102

mesh.

Figure 16. Breakdown of the execution time for the

101 \times 101 \times 102

mesh.

Figure 17. Execution time for the

51 \times 51 \times 52

mesh.

Figure 18. Execution time for the

101 \times 101 \times 102

mesh.

Table 1. The smallest and the largest eigenvalues of C for each mesh.

Point	$11 \times 11$		$30 \times 30$		$50 \times 50$
	$λ_{min}$	$λ_{max}$	$λ_{min}$	$λ_{max}$	$λ_{min}$	$λ_{max}$
Before growth	1.000	493.486	0.182	520.190	0.063	522.691
During growth	0.544	424.596	0.070	428.934	0.015	504.204
After growth	0.376	424.558	0.065	428.934	0.014	504.204

Table 2. Optimal combinations of

n b x

and

n b y

.

Number of Threads	$101 \times 101$	$201 \times 201$
1	(1, 4)	(1, 4)
2	(2, 4)	(2, 4)
3	(2, 6)	(2, 6)
4	(2, 8)	(2, 8)
5	(2, 10)	(2, 10)
6	(4, 6)	(4, 6)
7	(14, 2)	(14, 2)
8	(4, 8)	(4, 8)
9	(6, 6)	(6, 6)
10	(4, 10)	(4, 10)

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Ishii, G.; Yamamoto, Y.; Takaishi, T. Acceleration and Parallelization of a Linear Equation Solver for Crack Growth Simulation Based on the Phase Field Model. Mathematics 2021, 9, 2248. https://doi.org/10.3390/math9182248
