Article

Physics-Informed Neural Networks for Bingham Fluid Flow Simulation Coupled with an Augmented Lagrange Method

Department of Mathematics, Western Washington University, Bellingham, WA 98225, USA
AppliedMath 2023, 3(3), 525-551; https://doi.org/10.3390/appliedmath3030028
Submission received: 21 May 2023 / Revised: 19 June 2023 / Accepted: 26 June 2023 / Published: 30 June 2023

Abstract

As a class of non-Newtonian fluids with yield stresses, Bingham fluids possess both solid and liquid phases separated by implicitly defined, non-physical yield surfaces, which makes standard numerical discretization challenging. The variational reformulation established by Duvaut and Lions, coupled with an augmented Lagrange method (ALM), leads to a finite element approach, but the inevitable local mesh refinement and the preconditioning of the resulting large-scale ill-conditioned linear systems can be demanding. Inspired by the mesh-free feature and architectural flexibility of physics-informed neural networks (PINNs), an ALM-PINN approach to steady-state Bingham fluid flow simulation, with dynamically adaptable weights, is developed and analyzed in this work. The PINN setting enables not only a pointwise ALM formulation but also the learning of families of (physical) parameter-dependent numerical solutions through one training process, and incorporating the ALM into a PINN induces a loss function that is more feasible for deep learning. Numerical results obtained via ALM-PINN training on one- and two-dimensional benchmark models are presented to validate the proposed scheme. The efficacy and limitations of the relevant loss formulation and optimization algorithms are also discussed to motivate directions for future research.

1. Introduction

Bingham fluids are complex fluids with yield stresses. They belong to a broader family of generalized Newtonian fluids, namely viscoplastic fluids [1]. Governed by discontinuous constitutive laws, viscoplastic fluids possess both solid and fluid phases, whereas the yield surfaces, the non-physical separating interfaces, are only implicitly defined. When carried over into the momentum equations, the viscous terms cannot be expressed explicitly, which makes standard numerical discretization rather challenging.
Various mesh-based methods for viscoplastic fluid flow simulation have been developed. Besides the successful ones studied in [2,3,4,5], the numerical approaches in the literature can be classified into two main categories: one constructs highly viscous Newtonian approximations of the constitutive laws [6,7,8,9,10,11,12,13,14,15,16], with Papanastasiou’s regularization (PR) [13] as one of the most popular approximation models; the other reformulates the governing equations as a variational problem [17,18] that can be solved via properly designed optimization schemes. The PR mollifies the discontinuous constitutive laws into a viscous (continuous) approximation so that regular finite difference (FD), finite element (FE) or finite volume (FV) solvers can be readily applied to the resulting boundary value problem. However, replacing the sharp solid–liquid phase transition in complex fluid flows by extremely slow viscous motions causes an inherent drawback in accurately capturing the original yield surfaces. To maintain the genuine viscoplastic features of the underlying fluids, this work follows the variational approach.
The variational reformulation and its application to Bingham fluid flows date back to the pioneering work of Duvaut and Lions [17], in which the flow motion is captured by minimizing an energy functional F that carries a non-differentiable yield stress term induced by the discontinuity of the constitutive laws. The augmented Lagrange method (ALM) [19,20] is invoked to relax the nonlinearity by introducing an auxiliary variable, paired with an augmented constraint (constraint II) in addition to the fluid incompressibility (constraint I). That is, solving the implicitly defined momentum equations (BVP) is reformulated into the augmented Lagrange minimization problem
$\min \; AL = F + \text{constraint I} + \text{constraint II}.$
Following a decomposition coordination process [21,22], minimization of the augmented Lagrangian is decoupled into a series of element-wise optimization tasks, each of which can be accomplished through standard function optimization. To achieve the desired resolution of yield surfaces, the ALM is conventionally implemented in the finite element setting accompanied by local mesh refinement strategies [22,23,24]. A complete loop of the traditional ALM implementation hence includes a linear system solve, an element-wise optimization, and updates of the pressure and the augmented Lagrange multiplier, following Uzawa-type iterations:
  • Step 1: Solve an elliptic problem for the velocity.
  • Step 2: Update the pressure based on the incompressibility constraint.
  • Step 3: Solve element-wise optimization problems for the rate of strain tensor.
  • Step 4: Update the Lagrange multiplier corresponding to the augmented constraint.
These steps repeat until all the physical quantities converge. Furthermore, an effective preconditioning scheme [25,26,27,28] is necessary for solving the resulting large-scale ill-conditioned linear system in Step 1. This is carried out by adding a regularization term to the energy functional, which considerably slows down the iteration process, especially when simulating high yield stress flows. As pointed out in [29], the ALM, although superior in capturing yield surfaces, was found to be about $10^3$–$10^4$ times slower than the PR method. Modified ALMs have therefore been investigated to reduce the computational cost [29,30,31,32,33], in which various accelerated optimization algorithms are implemented to enhance the convergence rate.
These mesh-based approaches are all commonly used in the numerical simulation of viscoplastic fluid flows. Table 1 outlines their pros and cons with respect to the following aspects:
(A) Computational cost;
(B) Rate of convergence;
(C) Accuracy in capturing yield surfaces;
(D) Flexibility in handling complex geometries;
(E) Higher-dimensional complexity.
Table 1. Pros and cons of commonly used mesh-based approaches.

Numerical Schemes        Pros        Cons
PR + FD                  A, B        C, D, E
PR + FV (or FE)          A, B, D     C, E
ALM + FE                 C, D        A, B, E
modified ALM + FE        B, C, D     A, E
The required consistency of a mesh-based method implies sufficiently fine meshes to fulfill the accuracy goals. Uniform mesh refinement dramatically increases the computational cost, particularly in higher dimensions, whereas effective local mesh refinement can be technical and has to be implemented iteratively. This motivates mesh-free approaches as another route to take, from which one can expect a globally (continuously) formulated solution instead of discrete numerical values merely on pre-generated grids. When solving linear partial differential equations (PDEs), one attempts to express the solution as a linear combination of properly chosen basis functions [34]; examples include spectral methods, the method of fundamental solutions (MFS) [35], and radial basis function (RBF) expansions [36], to name a few. While superpositions are no longer valid in constructing solutions to nonlinear equations, neural networks can serve as a powerful tool for learning more sophisticated solution patterns.
Physics-informed neural networks (PINNs) have recently made their way into numerical PDEs, inspired by numerous research works [37,38,39,40,41,42,43,44]. A PINN approach seeks a surrogate model for the numerical solution of the governing equations via a loss minimization. In contrast to traditional mesh-based methods, PINNs are mesh-free and hence well suited to simulations over complex geometries or in multiple dimensions, especially when solving free boundary problems or inverse problems. When it comes to PINN training, gradient descent or Gauss–Newton-type optimization strategies can be adopted and implemented systematically. These virtues of PINNs motivate the present work, in which a neural-network-based ALM for the numerical simulation of Bingham fluid flows is investigated.
The ALM-PINN is a PINN approach that performs the decomposition coordination process in a PINN framework to leverage the desired features of both variable decoupling and network flexibility. This approach is novel and has several key attributes. The PINN configuration supports pointwise evaluations and differentiations, hence allowing for a pointwise ALM formulation that is more manageable than the element-based variational form. The auxiliary variable in the ALM can be effortlessly learned due to the flexibility of the network architecture. In particular, in the simulation of two-dimensional incompressible fluid flows, one can opt for the stream function formulation without added computational complexity. Furthermore, setting the yield stress parameter, termed the Bingham number and denoted by B, as a network input variable enables the learning of a family of B-dependent solutions through a single training process. Likewise, time-dependent solutions to convection-dominated transient problems can also be rapidly learned without the usual stability concerns. From the implementation standpoint, incorporating the decomposition coordination process into a PINN yields a well-decoupled loss function that is feasible for network training. These key attributes of the proposed ALM-PINN are listed in Table 2.
In PINNs, the loss is generally formulated as a weighted sum of least-squared errors in order to fulfill the differential equations, the initial and boundary conditions, and sometimes experimental data, etc. A generic loss function takes the form
$\sum_{i=1}^{N} \lambda_i L_i(\mathbf{x}, \alpha),$
where $\lambda_i$ is the weight associated with the loss term $L_i(\mathbf{x}, \alpha)$, $\mathbf{x}$ is a vector whose components are all the input variables, and $\alpha$ is a vector holding all the network parameters to be optimized. What the optimal $\lambda_i$ values are remains an open modeling question, for which various weight adjustment strategies have been investigated in the literature, such as nonadaptive weighting via hyperparameters [45,46], adaptive re-sampling of collocation points [46], separate training over multiple subintervals [46], neural tangent kernel weighting [47] and self-adaptive weight training via a soft attention mechanism [48], to name a few. In this work, $L_i$ will be formulated to carry extra spatially varying weights to be adapted along the training, aiming at enhancing the learning speed by keeping track of high-loss locations in the computational domain. This new weight adaptivity method differs from the one proposed in [48] in that these weights are network-parameter-dependent and hence are treated as part of the network outputs rather than as the minimizer of a separate min–max problem. Analogous parameter adjustments are also involved in the mesh-based ALM iteration process, and those approaches remain heuristic.
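As a minimal sketch of this weighting idea (written in PyTorch; the function and variable names are hypothetical and not taken from the paper's implementation), a spatially varying weight can be treated as an additional network output that multiplies the pointwise residual inside one loss term, while an unweighted copy of the same residual is retained to prevent the weight from collapsing to zero:

```python
import torch

def weighted_residual_loss(residual: torch.Tensor,
                           beta: torch.Tensor,
                           mu: float = 1.0) -> torch.Tensor:
    """Combine an unweighted least-squares term with a beta-weighted copy.

    residual : pointwise PDE residual r(x_k) at the sample points, shape (N, 1)
    beta     : spatially varying weight produced by the same network, shape (N, 1)
    mu       : hyperparameter gauging the significance of the weighted copy
    """
    unweighted = (residual ** 2).mean()           # standard mean squared residual
    weighted = ((beta * residual) ** 2).mean()    # beta is optimized along with the network
    return unweighted + mu * weighted             # unweighted term keeps beta from vanishing
```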
To further illustrate how the proposed ALM-PINN potentially outperforms the other methods, Table 3 outlines a comprehensive comparison of it to the ALM, modified ALM, and regular PINNs. Additionally presented (in bold) in the same table is the difference between the new dynamically adaptable weights (DAW) method and the one proposed in [48].
The rest of the paper is organized as follows. In Section 2, we will briefly outline the finite-element-based ALM and the associated decomposition coordination process [22,24]. The Stokes equations for Bingham fluids in a square lid-driven cavity will be used as a generic model for algorithm illustration. The ALM-PINN approach will be presented in Section 3, including the detailed loss function formulation and network training procedures. Numerical results and discussions on both the one-dimensional Poiseuille flow profiles and the two-dimensional cavity flows, followed by convergence analysis of network training, will be provided in Section 4 to showcase the validity of the proposed numerical scheme. Finally, conclusions will be drawn to motivate some further research.

2. A Mesh-Based ALM for Bingham Fluid Flow Simulation

To make this paper self-contained, we start with a brief overview of the mesh-based ALM for Bingham fluid flow simulation in this section. More technical details can be found in [24].

2.1. Bingham Fluids and the Stokes Equations

In the following presentation, a matrix is denoted by $M_{ij}$ and its tensor product by $M_{ij}M_{ij}$, adopting Einstein's notation. The generalized Newtonian fluids of interest are a class of non-Newtonian fluids in which the rate of strain tensor $\dot{\gamma}_{ij}$ and the deviatoric stress tensor $\tau_{ij}$ are related through a constitutive equation of the form
$\tau_{ij} = \eta(\dot{\gamma})\,\dot{\gamma}_{ij} \quad \text{with} \quad \dot{\gamma} = \sqrt{\tfrac{1}{2}\,\dot{\gamma}_{ij}\dot{\gamma}_{ij}},$
where $\eta = \eta(\dot{\gamma})$ is termed the effective viscosity. Denote the second invariant of the deviatoric stress by $\tau = \sqrt{\tfrac{1}{2}\,\tau_{ij}\tau_{ij}}$. If $\tau(\dot{\gamma} \to 0^{+}) = \tau_Y > 0$, then the model is viscoplastic, with yield stress $\tau_Y$.
A typical viscoplastic model is the Herschel–Bulkley model with the following non-dimensionalized constitutive relation:
$\tau_{ij} = \left(\dot{\gamma}^{\,n-1} + \dfrac{B}{\dot{\gamma}}\right)\dot{\gamma}_{ij} \quad \text{if } \tau > B, \qquad \dot{\gamma} = 0 \quad \text{if } \tau \le B,$
where $n$ is the power-law index. This is an extension of the power-law model to a fluid with yield stress $\hat{\tau}_Y$. The dimensionless parameter $B = (\hat{\tau}_Y \hat{R})/(\hat{\mu}\hat{V})$, termed the Bingham number, denotes the ratio of yield stress to viscous stress. Here, $\hat{\mu}$ represents the plastic viscosity, and $\hat{R}$ and $\hat{V}$ are the reference spatial and velocity scales, respectively.
For the Herschel–Bulkley model, the effective viscosity is defined via $\tau = \eta(\dot{\gamma})\dot{\gamma}$. Setting $n = 1$ and $B = 0$ returns the Newtonian model, $\eta = 1$. Setting $n = 1$ and allowing $B$ to vary, we recover the popular Bingham model carrying a physical parameter $B$. Note that for the Herschel–Bulkley model, if $B > 0$, then $\eta \to \infty$ as $\dot{\gamma} \to 0$.
Let $\Omega \subset \mathbb{R}^2$ be open and bounded. Denote the velocity field and the pressure over $\bar{\Omega}$ by $\mathbf{u} = (u_1, u_2)$ and $p$, respectively. The following two-dimensional Stokes equations for viscoplastic fluids with Dirichlet boundary conditions will be used as a generic model to describe the implementation of both the mesh-based ALM and the proposed ALM-PINN.
$\dfrac{\partial p}{\partial x_i} = \dfrac{\partial \tau_{ij}}{\partial x_j} + f_i \quad \text{in } \Omega,$
$\nabla \cdot \mathbf{u} = 0 \quad \text{in } \Omega,$
$\mathbf{u} = \mathbf{u}_0 \quad \text{on } \partial\Omega,$
where $\partial p/\partial x_i$, for $i = 1, 2$, represent the two components of the gradient field of the pressure $p$, and the scaled body force is denoted by $\mathbf{f} = (f_1, f_2)$. With Einstein's notation, the viscous terms $\partial \tau_{ij}/\partial x_j$, for $i = 1, 2$, stand for the divergences of the two row vectors of the deviatoric stress tensor $\tau_{ij}$. Due to the unknown yield surfaces, these viscous terms are implicitly defined.

2.2. Variational Inequality and the Equivalent Variational Equality

Define the admissible set
$\mathcal{A} = \left\{\, \mathbf{v} \in H^1(\Omega)^2 : \nabla\cdot\mathbf{v} = 0 \text{ in } \Omega \text{ and } \mathbf{v} = \mathbf{u}_0 \text{ on } \partial\Omega \,\right\}.$
It is shown in [20] that the desired vector field u for the Stokes problem (2)–(4) is the one that satisfies the following constrained variational inequality:
$a(\mathbf{u}, \mathbf{v} - \mathbf{u}) + j(\mathbf{v}) - j(\mathbf{u}) \ge L(\mathbf{v} - \mathbf{u}), \quad \forall\, \mathbf{v} \in \mathcal{A},$
where
$a(\mathbf{u}, \mathbf{v}) = \dfrac{1}{2}\int_\Omega \dot{\gamma}^{\,n-1}(\mathbf{u})\,\dot{\gamma}_{ij}(\mathbf{u})\,\dot{\gamma}_{ij}(\mathbf{v}),$
$j(\mathbf{v}) = B\int_\Omega \dot{\gamma}(\mathbf{v}) \quad \text{and} \quad L(\mathbf{v}) = \int_\Omega f_i v_i.$
Note that $a(\cdot, \cdot)$, referred to as the viscous dissipation rate in some of the literature, is linear in its second argument $\mathbf{v}$ for general Herschel–Bulkley fluids and is bilinear in either of its arguments for Bingham fluids, i.e., when $n = 1$. The force term $L(\cdot)$ is linear in its argument, whereas the yield stress dissipation rate $j(\cdot)$ is nonlinear and non-differentiable in its argument.
Equivalently as shown in [20], the desired vector field u in (5) is the one that minimizes
$J(\mathbf{v}) = \dfrac{1}{n+1}\, a(\mathbf{v}, \mathbf{v}) + j(\mathbf{v}) - L(\mathbf{v})$
over the function space A . The mesh-based ALM to be presented next is based on this variational equality formulation.
For Bingham fluids, the existence and uniqueness of the solution to the variational inequality (5) or the minimizer to the functional (6) in A can be shown by directly applying Theorem 4.1 and Lemma 4.1 from Chapter 1 of [20].

2.3. Finite Element ALM Implementation

Issues caused by the nonlinear and non-differentiable yield stress term j ( · ) in (6) can be resolved by applying the augmented Lagrange method [20].
Let $W$ be the collection of symmetric $2 \times 2$ tensors with $L^2$ entries. The key idea of the ALM is to relax the nonlinear yield stress term $\dot{\gamma}_{ij}(\mathbf{u})$ to an auxiliary tensor $w_{ij} \in W$ so that (6) can be reformulated into a constrained minimization problem:
$\min_{\mathbf{u}\in\mathcal{A},\; w_{ij}\in W}\; \dfrac{1}{n+1}\int_\Omega \|w\|^{n+1} + B\int_\Omega \|w\| + \int_\Omega f_i u_i \quad \text{subject to} \quad \dot{\gamma}_{ij}(\mathbf{u}) - w_{ij} = 0,$
with the tensor norm defined as $\|w\| = \sqrt{\tfrac{1}{2}\, w_{ij} w_{ij}}$.
Besides the constraint imposed in (7), the incompressibility assumption on the vector fields in the admissible set $\mathcal{A}$ yields the second constraint $\nabla\cdot\mathbf{u} = 0$. Therefore, the augmented Lagrangian corresponding to the double-constrained problem (7) finally reads
$F(\mathbf{u}, p, w_{ij}, s_{ij}) = \dfrac{1}{n+1}\int_\Omega \|w\|^{n+1} + B\int_\Omega \|w\| - \int_\Omega p\,\nabla\cdot\mathbf{u} + \int_\Omega f_i u_i + \int_\Omega \left(\dot{\gamma}_{ij}(\mathbf{u}) - w_{ij}\right) s_{ij} + \dfrac{R}{2}\int_\Omega \left(\dot{\gamma}_{ij}(\mathbf{u}) - w_{ij}\right)\left(\dot{\gamma}_{ij}(\mathbf{u}) - w_{ij}\right),$
in which the pressure p appears as a Lagrange multiplier corresponding to the incompressibility constraint. The other Lagrange multiplier that corresponds to the yield stress constraint is denoted as s i j W . The last term in (8) is a regularization term for the yield stress constraint, with R > 0 being a numerical parameter.
Define the test function space $V = \{\mathbf{v} \in H_0^1(\Omega)^2\}$. As shown in [22], by taking the Fréchet derivatives of $F$ with respect to its corresponding arguments, the augmented problem (8) can be solved by iteratively implementing the following decomposition coordination steps until the solution converges.
Initialize $p$, $w_{ij}$ and $s_{ij}$.
  • Step 1: With fixed $p$, $w_{ij}$, and $s_{ij}$, solve for $\mathbf{u}$ so that
    $-\int_\Omega p\,\nabla\cdot\mathbf{v} + \int_\Omega f_i v_i + \int_\Omega \dot{\gamma}_{ij}(\mathbf{v})\, s_{ij} + R\int_\Omega \left(\dot{\gamma}_{ij}(\mathbf{u}) - w_{ij}\right)\dot{\gamma}_{ij}(\mathbf{v}) = 0, \quad \forall\, \mathbf{v} \in V.$
  • Step 2: With fixed $\mathbf{u}$ and $s_{ij}$, solve for $w_{ij}$ via
    $\min_{w_{ij}\in W}\; \dfrac{1}{n+1}\int_\Omega \|w\|^{n+1} + B\int_\Omega \|w\| - \int_\Omega w_{ij}\, s_{ij} - R\int_\Omega \dot{\gamma}_{ij}(\mathbf{u})\, w_{ij} + R\int_\Omega \|w\|^2.$
  • Step 3: With fixed $\mathbf{u}$, update $p$ based on the globally stabilized constraint
    $\int_\Omega q\,\nabla\cdot\mathbf{u} + c_b\int_\Omega \nabla p\cdot\nabla q = 0, \quad \forall\, q \in H_0^1(\Omega),$
    where the second term on the left-hand side is a Brezzi–Pitkäranta stabilization term [22,49] with a properly adjusted stabilization parameter $c_b > 0$ to ensure the desired convergence in the pressure update.
  • Step 4: With fixed $\mathbf{u}$ and $w_{ij}$, update $s_{ij} \in W$ based on the yield stress constraint
    $\int_\Omega \left(\dot{\gamma}_{ij}(\mathbf{u}) - w_{ij}\right) t_{ij} = 0, \quad \forall\, t_{ij} \in W.$
The user-adjusted numerical parameters R, c b and the updating rate of s i j in Step 4 all affect the rate of convergence.

3. The ALM-PINN Approach

Motivated by PINNs, the proposed ALM-PINN is designed to solve the implicitly defined boundary value problem (2)–(4) via implementing the ALM (8) in the neural network setting. This yields a mesh-free analogy to the decomposition coordination process (9)–(12).
Consider a fully connected feed-forward neural network. Let the spatial variable $\mathbf{x} \in \Omega$ be the network input. The network outputs include the physical quantities
$\phi(\mathbf{x}, \alpha), \quad p(\mathbf{x}, \alpha), \quad s_{ij}(\mathbf{x}, \alpha),$
as well as an additional weight vector
$\beta(\mathbf{x}, \alpha) = \left(\beta_1(\mathbf{x}, \alpha),\; \beta_2(\mathbf{x}, \alpha)\right)$
whose components are to be inserted into different loss terms. The output $\phi$ is a stream function that induces the divergence-free velocity field
$\mathbf{u} = \left(\dfrac{\partial \phi}{\partial y},\; -\dfrac{\partial \phi}{\partial x}\right).$
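As an illustration of how such a divergence-free velocity can be obtained in practice (a sketch in PyTorch; the network callable and variable names are hypothetical, not taken from the paper's code), the two velocity components are simply automatic derivatives of the stream-function output:

```python
import torch

def velocity_from_stream_function(phi_net, xy: torch.Tensor) -> torch.Tensor:
    """Return u = (d phi/dy, -d phi/dx), so that div(u) = 0 holds identically.

    phi_net : callable mapping (N, 2) coordinates to (N, 1) stream-function values
    xy      : sample points of shape (N, 2), columns ordered as (x, y)
    """
    xy = xy.clone().requires_grad_(True)
    phi = phi_net(xy)
    grad_phi, = torch.autograd.grad(phi, xy,
                                    grad_outputs=torch.ones_like(phi),
                                    create_graph=True)   # keep graph for higher derivatives
    u1 = grad_phi[:, 1:2]     # d phi / d y
    u2 = -grad_phi[:, 0:1]    # -d phi / d x
    return torch.cat([u1, u2], dim=1)
```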
For illustrative purposes, the loss function, which depends on ϕ , will be formulated in terms of u . Components of the vector α consist of all the network parameters to be optimized. The goal is to determine α via deep learning so that the loss function
$L(\alpha) = \lambda_u L_u(\alpha, \beta_1(\alpha)) + \lambda_w L_w(\alpha, \beta_2(\alpha)) + L_p(\alpha) + \lambda L_b(\alpha)$
is minimized. Here, λ u , λ w and λ are user specified hyperparameters to enhance the training performance.
Throughout this paper, the mean squared error between any two scalar functions $f$ and $g$ over sample points $\mathbf{x}_k \in D$, $1 \le k \le N_D$, is denoted by
$MSE_D(f, g) = \dfrac{1}{N_D}\sum_{k=1}^{N_D}\left(f(\mathbf{x}_k) - g(\mathbf{x}_k)\right)^2.$
With the Einstein’s notation, the loss terms involved in (13) are mean squared errors derived as follows.
  • The determination of $\mathbf{u}$ in Step 1 of the ALM requires solving a large-scale, ill-conditioned linear system. With the network-learned functions $\mathbf{u}$, $p$ and $s$, the ALM-PINN approach supports the following pointwise reformulation of the variational form (9):
    $\dfrac{\partial p}{\partial x_i} + f_i = \nabla\cdot\left(s_{ij} + R\,\dot{\gamma}_{ij}(\mathbf{u}) - R\, w_{ij}\right) := \nabla\cdot\left(\sigma_{ij} - R\, w_{ij}\right)$
    with $\sigma_{ij} = s_{ij} + R\,\dot{\gamma}_{ij}(\mathbf{u})$. This results in the first loss term
    $L_u(\alpha, \beta_1) = MSE_\Omega\!\left(\dfrac{\partial p}{\partial x_i} + f_i - \nabla\cdot\left(\sigma_{ij} - R\, w_{ij}\right),\; 0\right) + \mu_1\, MSE_\Omega\!\left(\beta_1^i\left[\dfrac{\partial p}{\partial x_i} + f_i - \nabla\cdot\left(\sigma_{ij} - R\, w_{ij}\right)\right],\; 0\right),$
    where $\beta_1 = (\beta_1^1, \beta_1^2)$ is a vector-valued, spatially dependent weight parameter whose $i$th component $\beta_1^i$ is injected into the loss from the $i$th equation of the system. The loss term is formulated as a linear combination of a $\beta_1$-weighted term and a regular unweighted term with a tunable hyperparameter $\mu_1 > 0$, and $\beta_1$ will be part of the network output. The inclusion of the unweighted term is necessary to avoid a vanishing $\beta_1$. Although $w_{ij}$ is not set as a direct network output, it can be represented in terms of the other outputs, as will become clear in the derivation of $L_w(\alpha, \beta_2)$ below.
  • The minimization problem in Step 2 of the ALM can be solved element-wise, and the preferred $w_{ij}$ is of the form $w_{ij} = \theta\,\sigma_{ij}$ for the $\theta > 0$ that solves [24]
    $\min_{\theta}\; q(\theta) = C_1\theta^{n+1} + C_2\theta^2 + C_3\theta$
    with $C_1 = \dfrac{\|\sigma\|^{n+1}}{n+1} > 0$, $C_2 = R\|\sigma\|^2 > 0$, and $C_3 = B\|\sigma\| - 2\|\sigma\|^2$. Since $q$ is a convex function for $\theta > 0$, a unique minimizer can be determined as follows:
    (i) If $B > 2\|\sigma\|$, then $\theta = 0$.
    (ii) If $B \le 2\|\sigma\|$, then $\theta$ is the solution to $q'(\theta) = 0$, which can be approximated using Newton's method. With $n = 1$ for Bingham fluids, we can directly calculate $\theta = \dfrac{2\|\sigma\|^2 - B\|\sigma\|}{(2R+1)\|\sigma\|^2}$. This results in the second loss term
    $L_w(\alpha, \beta_2) = MSE_\Omega\!\left(\theta\,\sigma_{ij},\; \dot{\gamma}_{ij}\right) + \mu_2\, MSE_\Omega\!\left(\beta_2\,\theta\,\sigma_{ij},\; \beta_2\,\dot{\gamma}_{ij}\right),$
    where $\beta_2$ again is a spatially dependent weight parameter to be optimized as a network output. For the same reason as above, the loss term is formulated as a linear combination of a $\beta_2$-weighted term and an unweighted term with a tunable hyperparameter $\mu_2 > 0$. Since $w_{ij} = \theta\,\sigma_{ij}$ can be substituted into $L_u(\alpha, \beta_1)$, it is not necessary to assign $w_{ij}$ as an additional network output. Meanwhile, Step 4 of the ALM is automatically imposed through minimizing $L_w(\alpha, \beta_2)$.
  • The incompressibility condition enforced in Step 3 of the ALM can be ruled out in the ALM-PINN approach when opting for the stream function formulation.
  • The output pressure $p(\mathbf{x}, \alpha)$ is rescaled so that $p(\mathbf{x}^*, \alpha) = 0$, where $\mathbf{x}^*$ is a fixed location in the computational domain. This contributes the loss term
    $L_p(\alpha) = p^2(\mathbf{x}^*, \alpha).$
  • The boundary conditions are imposed via the loss term
    $L_b(\alpha) = MSE_{\partial\Omega}\left(\mathbf{u}, \mathbf{u}_0\right).$
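To make the pointwise Step-2 analogue concrete, the following sketch (in PyTorch, for the Bingham case $n = 1$; the names and tensor shapes are assumptions made here for illustration, and $\theta$ is written in the equivalent simplified form $(2\|\sigma\| - B)/((2R+1)\|\sigma\|)$) computes $\theta$ in closed form from $\|\sigma\|$ and assembles $L_w$; the clamp to zero realizes case (i) above, where $\theta = 0$:

```python
import torch

def bingham_theta(sigma_norm: torch.Tensor, B: float, R: float = 1.0) -> torch.Tensor:
    """Closed-form minimizer theta for n = 1: theta = (2||sigma|| - B) / ((2R + 1)||sigma||),
    clamped to 0 wherever B >= 2||sigma|| (the unyielded branch).
    sigma_norm : ||sigma|| at each sample point, shape (N,)."""
    eps = 1e-12                                              # avoid division by zero
    theta = (2.0 * sigma_norm - B) / ((2.0 * R + 1.0) * sigma_norm + eps)
    return torch.clamp(theta, min=0.0)

def loss_w(sigma: torch.Tensor, gamma_dot: torch.Tensor,
           beta2: torch.Tensor, B: float, R: float = 1.0, mu2: float = 0.0) -> torch.Tensor:
    """L_w = MSE(theta*sigma_ij, gamma_dot_ij) + mu2 * MSE(beta2*theta*sigma_ij, beta2*gamma_dot_ij).
    sigma, gamma_dot : (N, 2, 2) tensors;  beta2 : (N, 1) spatially varying weight."""
    sigma_norm = torch.sqrt(0.5 * (sigma ** 2).sum(dim=(1, 2)))     # ||sigma|| = sqrt(sigma:sigma / 2)
    theta = bingham_theta(sigma_norm, B, R).view(-1, 1, 1)
    resid = theta * sigma - gamma_dot                               # pointwise constitutive residual
    w2 = beta2.view(-1, 1, 1)
    return (resid ** 2).mean() + mu2 * ((w2 * resid) ** 2).mean()
```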
The weights $\beta_1$ and $\beta_2$ will be dynamically optimized through the network training to balance out the spatially dependent decay rates in the corresponding loss terms and to accelerate the overall rate of convergence. They also serve as error intensity indicators: locations with lower magnitudes of $\beta_1$ or $\beta_2$ are where the corresponding unweighted error is larger. The user-specified hyperparameters $\mu_1$ and $\mu_2$ are introduced to gauge the significance of these weighted losses. They can be set to zero in less complex training models, which yields the unweighted ALM-PINN algorithm outlined in Algorithm 1. Inclusion of the adaptable weights can be viewed as a more flexible and general network analogy of the PAL (penalized augmented Lagrangian) method proposed in [29], a mesh-based variation of the ALM aiming to enhance the rate of convergence by scaling the regularization term in the ALM; the complete algorithm is outlined in Algorithm 2.
Algorithm 1 Unweighted ALM-PINN training process
  • 1. Generate random mesh points inside the computational domain, x , and on its boundaries, x 0 .
  • 2. Assign the location of zero pressure, x * .
  • 3. Initialize network parameters, components of α .
  • 4. Evaluate ϕ ( x , α ) , p ( x , α ) , s i j ( x , α ) , as well as ϕ ( x 0 , α ) and p ( x * , α ) , according to the network architecture. They are all user-defined composite functions specified by the network layers, nodes and activation functions.
  • 5. Set $\mu_1 = \mu_2 = 0$.
  • 6. Determine the loss $L(\alpha)$ as defined in (13).
  • while  L ( α ) > T O L  do
  •     % Update α using the Adam algorithm.
  •     Approximate α L over mini-batches using back propagations.
  •     Determine Δ α according to α L following the Adam algorithm.
  •      $\alpha \leftarrow \alpha + \Delta\alpha$
  • end while
  • 7. Implement the LBFGS algorithm to further minimize $L(\alpha)$ if necessary: $\alpha \leftarrow \mathrm{LBFGS}(\alpha)$.
Algorithm 2 ALM-PINN with DAW
  • 1∼4. Same as in Algorithm 1.
  • 5. Specify the hyperparameters μ 1 and μ 2 . Evaluate the adaptable weights β 1 ( x , α ) and β 2 ( x , α ) according to the network architecture.
  • % β 1 ( x , α ) and β 2 ( x , α ) are also user-defined composite functions specified by the network layers, nodes and activation functions.
  • 6∼7. Same as in Algorithm 1.
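The two-stage optimization in Algorithms 1 and 2 can be sketched as follows (a PyTorch sketch; the function and argument names are illustrative assumptions rather than the paper's code, and `loss_fn` is assumed to evaluate the loss $L(\alpha)$ of (13) on a batch of sample points): minibatch Adam updates first, followed by an optional full-batch LBFGS refinement.

```python
import torch

def train_alm_pinn(model, loss_fn, interior_pts, n_epochs=10_000,
                   batch_frac=0.1, lr=0.01, run_lbfgs=True):
    """Two-stage optimization sketch: minibatch Adam, then an optional full-batch LBFGS polish."""
    n = interior_pts.shape[0]
    batch_size = max(1, int(batch_frac * n))            # minibatch = 10% of the sample size
    adam = torch.optim.Adam(model.parameters(), lr=lr)

    for _ in range(n_epochs):
        perm = torch.randperm(n)
        for start in range(0, n, batch_size):
            batch = interior_pts[perm[start:start + batch_size]]
            adam.zero_grad()
            loss = loss_fn(model, batch)                 # evaluates the loss L(alpha) on the batch
            loss.backward()
            adam.step()

    if run_lbfgs:                                        # full-batch follow-up (step 7 of Algorithm 1)
        lbfgs = torch.optim.LBFGS(model.parameters(), max_iter=500,
                                  line_search_fn="strong_wolfe")

        def closure():
            lbfgs.zero_grad()
            loss = loss_fn(model, interior_pts)
            loss.backward()
            return loss

        lbfgs.step(closure)
    return model
```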

4. Numerical Results and Discussions

The ALM-PINN will be tested on one- and two-dimensional steady-state Bingham fluid models, namely Poiseuille flows and square lid-driven cavity flows. The regularization parameter $R$ in the ALM is always set to 1. The detailed network architecture, including the depth and width of the hidden layers, will be specified in the related subsections. $\phi(x) = \tanh(x)$ is numerically verified to be the most effective activation function and will be applied throughout. For Poiseuille flows, the loss optimization is accomplished by implementing the Adam algorithm alone, in which the effective initial learning rate and the learning rate adaptation factor are set to 0.01 and 0.001, respectively. For two-dimensional cavity flows, additional LBFGS updates will also follow.

4.1. A One-Dimensional Example: Poiseuille Flow

The Poiseuille flow of Bingham fluids on $I = (-1, 1)$ has a one-dimensional velocity profile $u(x)$ modeled by
$-\tau' = 1, \quad x \in I,$
with boundary conditions $u(-1) = u(1) = 0$. In particular, the constitutive relation for one-dimensional Bingham fluids reads
$\tau = \left(1 + \dfrac{B}{|u'|}\right) u' \quad \text{if } |\tau| > B, \qquad u' = 0 \quad \text{if } |\tau| \le B,$
so that the effective viscosity is $\eta(u') = 1 + B/|u'|$. This generalizes the Newtonian case $\eta(u') = 1$, whose velocity profile is quadratic.
Note that the Newtonian approximation approach [13] proposes
$\tau = \left(1 + \dfrac{B}{|u'|}\left(1 - e^{-q|u'|}\right)\right) u'$
in place of (15) to regularize the discontinuous constitutive relation, with a sufficiently large $q > 0$. This would technically enable (14) to be explicitly formulated and solved via a standard PINN approach. However, numerical experiments indicate that the resulting loss optimization does not converge properly under regular network training; the rapid exponential decay involved in the regularization may account for this. On the other hand, minimizing the well-decoupled ALM-PINN loss function is more straightforward.
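For reference, the regularized stress above is straightforward to evaluate pointwise; a minimal sketch (PyTorch, with hypothetical names and an arbitrary illustrative value of $q$) of the explicit constitutive evaluation that a standard PINN residual would rely on is:

```python
import torch

def papanastasiou_stress(du_dx: torch.Tensor, B: float, q: float = 500.0) -> torch.Tensor:
    """Regularized 1D Bingham stress tau = (1 + B*(1 - exp(-q|u'|))/|u'|) * u'.
    du_dx : u'(x) at the sample points; q is the (large) regularization parameter.
    The |u'| in the denominator is floored to avoid 0/0 near stagnation points."""
    abs_du = du_dx.abs().clamp_min(1e-12)
    eta = 1.0 + B * (1.0 - torch.exp(-q * abs_du)) / abs_du   # regularized effective viscosity
    return eta * du_dx
```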
Recall that the ALM for (14) seeks $u \in H_0^1(I)$, a piecewise constant auxiliary function $w$ and the augmented Lagrange multiplier $s$ via the augmented minimization
$\min_{u, w, s}\; \dfrac{1}{2}\int_I w^2 + B\int_I |w| + \int_I s\,(u' - w) + \dfrac{R}{2}\int_I (u' - w)^2 - \int_I u.$
In the network setting, let $x \in I$ be the input and $u(\alpha, x)$, $s(\alpha, x)$ and $\beta(\alpha, x) = (\beta_1(\alpha, x), \beta_2(\alpha, x))$ be the outputs. The neural network with $q_l$ layers ($q_l - 2$ hidden layers) and $q_n$ neurons in each hidden layer will be trained to construct the vector-valued composite function
$y_{q_l}(\alpha, x) = \left(u(\alpha, x),\; s(\alpha, x),\; \beta(\alpha, x)\right)$
nested via
$y_k = \phi\left(A_k y_{k-1} + b_k\right), \quad 1 \le k \le q_l - 1,$
$y_0 = x, \qquad y_{q_l} = A_{q_l} y_{q_l - 1} + b_{q_l}.$
Here, $A_1$ and $A_{q_l}$ are $1 \times q_n$ and $q_n \times 4$ matrices, respectively, whereas $A_k$ is $q_n \times q_n$ for $2 \le k \le q_l - 1$. The biases are $b_k \in \mathbb{R}^{q_n}$ for $1 \le k \le q_l - 1$ and $b_{q_l} \in \mathbb{R}^4$. In the one-dimensional simulation, $q_l = 5$ and $q_n = 25$ are verified to be reliable and cost-effective network dimensions.
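Following this nesting literally (a sketch in PyTorch; the class name and output ordering are assumptions made here, not the paper's code), such a network can be written as:

```python
import torch
import torch.nn as nn

class ALMPinn1D(nn.Module):
    """Fully connected tanh network with q_l affine layers, q_n neurons per hidden layer,
    scalar input x and four outputs (u, s, beta1, beta2)."""

    def __init__(self, q_l: int = 5, q_n: int = 25):
        super().__init__()
        layers = [nn.Linear(1, q_n), nn.Tanh()]            # y_1 = phi(A_1 y_0 + b_1)
        for _ in range(q_l - 2):                            # y_2, ..., y_{q_l - 1}
            layers += [nn.Linear(q_n, q_n), nn.Tanh()]
        layers.append(nn.Linear(q_n, 4))                    # linear output layer y_{q_l}
        self.net = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor):
        u, s, beta1, beta2 = self.net(x).split(1, dim=1)    # x of shape (N, 1)
        return u, s, beta1, beta2
```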
Following (13), the goal is to determine the network parameter $\alpha$ via deep learning so that the ALM-PINN loss function
$L(\alpha) = L_u(\alpha, \beta_1(\alpha)) + L_w(\alpha, \beta_2(\alpha)) + L_b(\alpha)$
is minimized; setting $\lambda_u = \lambda_w = \lambda = 1$ is sufficient in the one-dimensional case.
Denoting $\sigma = R\,u' + s$, the loss terms can be formulated as
$L_u(\alpha, \beta_1) = MSE_I\!\left(\left[(1 - R\theta)\sigma\right]',\; -1\right) + \mu_1\, MSE_I\!\left(\beta_1\left[(1 - R\theta)\sigma\right]',\; -\beta_1\right),$
$L_w(\alpha, \beta_2) = MSE_I\!\left(\theta\sigma,\; u'\right) + \mu_2\, MSE_I\!\left(\beta_2\,\theta\sigma,\; \beta_2\, u'\right),$
$L_b(\alpha) = \dfrac{1}{2}\left(u^2(-1, \alpha) + u^2(1, \alpha)\right).$
Numerical solutions from the mesh-based ALM with 200 uniform meshes are sufficiently accurate and will be used as test solutions in the following ALM-PINN implementation.
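For completeness, a reference solution of this kind can be generated with a few dozen lines of code; the following NumPy sketch (written here for illustration under the sign conventions of the one-dimensional augmented minimization above, with hypothetical function and variable names and the multiplier update rate taken equal to R) implements the Uzawa-type decomposition coordination on a uniform grid:

```python
import numpy as np

def alm_poiseuille_reference(B=0.5, R=1.0, n_cells=200, n_iter=2000):
    """Mesh-based ALM sketch for the 1D Poiseuille problem:
    minimize 1/2 int w^2 + B int |w| + int s (u' - w) + R/2 int (u' - w)^2 - int u
    over (u, w, s) with u(-1) = u(1) = 0, by decomposition coordination."""
    h = 2.0 / n_cells
    x = np.linspace(-1.0, 1.0, n_cells + 1)        # velocity nodes (ends pinned to zero)
    u = np.zeros(n_cells + 1)
    w = np.zeros(n_cells)                           # w, s, u' live at the cell midpoints
    s = np.zeros(n_cells)

    # Tridiagonal matrix for -R u'' at the interior nodes
    main = 2.0 * R / h**2 * np.ones(n_cells - 1)
    off = -R / h**2 * np.ones(n_cells - 2)
    A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

    for _ in range(n_iter):
        # Step 1: solve -R u'' = 1 + (s - R w)' with w and s frozen
        g = s - R * w
        rhs = 1.0 + (g[1:] - g[:-1]) / h
        u[1:-1] = np.linalg.solve(A, rhs)

        # Step 2: pointwise minimization for w (soft-thresholding of sigma = s + R u')
        du = (u[1:] - u[:-1]) / h
        sigma = s + R * du
        w = np.sign(sigma) * np.maximum(np.abs(sigma) - B, 0.0) / (1.0 + R)

        # Step 4: update the augmented Lagrange multiplier (rate R)
        s = s + R * (du - w)

    return x, u
```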

4.1.1. Role of Minibatch

The minibatch size is one of the key factors for the convergence of the Adam algorithm. Numerical experiments suggest that the minibatch influence is independent of $\beta$; thus, we set $\mu_1 = \mu_2 = 0$ to simplify the loss function, which suffices for reasonable training quality in the one-dimensional model.
Set $B = 0.5$ as an example. Implementing the ALM-PINN on 100 random sample points, Figure 1 shows the absolute errors between the ALM test solution and the ALM-PINN solutions obtained with minibatch sizes 10, 20, 50 and 100. In particular, the ALM-PINN solutions corresponding to minibatch sizes 10 and 50 are plotted in Figure 2 and Figure 3 to illustrate how the subtle behavior of $u$ in the unyielded region is better captured during the iteration process when the minibatch size is smaller. As a general observation, smaller minibatch sizes are favorable in the Adam implementation, and this tendency persists regardless of $B$ and the number of epochs $N$. In all the upcoming simulations, the minibatch size is always set to 10% of the random sample size. This ensures an adequate convergence rate of SGD without over-processing the dataset (as happens when the minibatch size is too small).
Finally, Figure 4 and Figure 5 showcase the validity of the ALM-PINN approach. The implementation is on 200 random sample points with minibatch size 20. The absolute error in $u$ reaches $O(10^{-3})$ after $N = 8000$ epochs, as shown in Figure 4. Meanwhile, the absolute error in $u'$ reaches $O(10^{-3})$ at all the interior grid points, as shown in Figure 5. At the end of the network training, the loss terms $L_u$, $L_w$ and $L_b$ drop to $O(10^{-5})$, $O(10^{-6})$ and $O(10^{-8})$, respectively.

4.1.2. Role of Self-Adaptive Weights

With 200 random sample points, can the self-adaptive $\beta$ help to reduce the losses in training and improve the ALM-PINN solutions? The effect of $\beta$ on the loss minimization is discussed in this section.
Numerical results indicate that $\beta_2$ helps to enhance the convergence much more notably than $\beta_1$. This is reasonable since $\beta_2$ correlates with the learning of the auxiliary variable in the ALM, which, as the fluid complexity indicator, has a considerable impact on the quality of the velocity profile. Raising $\mu_2$ (though not excessively) potentially speeds up the learning of $L_w$, as shown in Figure 6 and Figure 7, where the ALM-PINN solutions with $\mu_2 = 50$ and $\mu_2 = 100$ are compared to the ALM test solution. A similar comparison is also performed on $u'$, as shown in Figure 8 and Figure 9.
The evolution of the self-adaptive $|\beta_1|$ and $|\beta_2|$ along the training is shown in Figure 10. It turns out that the final profile of $|\beta_2|$ is independent of $\mu_2$, $\beta_1$ and $\mu_1$. The network-learned $|\beta_1|$ depends on $\mu_1$, $\beta_2$ and $\mu_2$, whereas $\mu_1|\beta_1|$ always maintains a fixed scale of $O(10^{-3})$, indicating that $L_u$ becomes more evenly distributed in the end, whereas $L_w$ becomes more dominant when moving from the yielded subintervals ($|x| > 0.5$) toward the center of the unyielded interval ($|x| \le 0.5$).

4.1.3. Role of Sample Point Collocation and Clustering

Mesh refinement in the ALM-PINN has a less prominent effect on the overall numerical accuracy than the self-adaptive weights, yet local improvement can be expected when the two are coupled. This is demonstrated in Figure 11, where $u$ is learned with 100 sample points on $[-1, 1]$ and an extra 100 sample points on $[-0.6, 0.6]$, as well as in Figure 12, where $u$ is learned with 50 sample points on $[-1, 1]$ and an extra 150 sample points on $[-0.6, 0.6]$. The self-adaptive $\beta_1$ and $\beta_2$ are incorporated the same way as in the previous section, with $\mu_1 = 1$ and $\mu_2 = 100$. The local refinement is performed over $[-0.6, 0.6]$ because it contains the entire unyielded region as well as the yielded–unyielded boundaries, where fluctuations in $u$ are most likely to occur. The local sample clustering enhances the accuracy of $u$ in the targeted region, as shown in Figure 13 and Figure 14.
Another attempt is to implement more concentrated sample clusters merely around $x = \pm 0.5$, but this turns out to be less effective even in the targeted regions. The resulting degraded accuracy indicates that the ALM-PINN approach favors the uniformity of random sampling, which promotes a more reliable direction search through minibatches in SGD.
A comprehensive comparison of the absolute errors in $u$ and $u'$ obtained via different approaches is presented in Figure 15. In the top figure, one can observe a pronounced overall error in $u$ obtained without self-adaptive weights or local sample clustering (termed "uniform" in the figure). The inclusion of self-adaptive weights brings down the overall error. It also shows that increasing $\mu_2$ from 50 to 100 helps to lower the maximum error and to further flatten out the fluctuations in the error distribution. Local sample clustering on top of self-adaptive weights results in additional solution enhancement, especially within the targeted region. In the bottom figure, error improvement in $u'$ can be observed outside the unyielded region $[-0.5, 0.5]$ when self-adaptive weights are adopted; this is because the higher weighting in the loss function at these locations triggers faster updates in the training. In the unyielded region, the errors in $u'$ are comparable, except that self-adaptive weights tend to even out the errors, and local sample clustering reinforces local error improvement.

4.1.4. Bingham Number-Dependent Solutions

The B-dependent version of u, s and β can be learned by assigning B as an additional network input.
The computational domain in the $(x, B)$-plane is now $U = (-1, 1) \times (0, 1)$. The number of neurons in the hidden layers is reset to $q_n = 30$. With 1000 interior random sample points, as well as 25 boundary points on each of $x = -1$ and $x = 1$, the ALM-PINN can be trained effectively with minibatch size 100 after $N = 10{,}000$ epochs.
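A sketch of how this parameter dependence enters the network (PyTorch; the layer sizes shown mirror the setup described above, while the variable names and uniform sampling calls are illustrative assumptions): the input is simply the concatenated pair $(x, B)$, so one forward pass evaluates the solution at any queried Bingham number in the training range.

```python
import torch

# Hypothetical two-input network: first column x in (-1, 1), second column B in (0, 1).
net = torch.nn.Sequential(
    torch.nn.Linear(2, 30), torch.nn.Tanh(),
    torch.nn.Linear(30, 30), torch.nn.Tanh(),
    torch.nn.Linear(30, 30), torch.nn.Tanh(),
    torch.nn.Linear(30, 4),            # outputs (u, s, beta1, beta2) at each (x, B) point
)

# 1000 interior samples drawn uniformly over the (x, B) rectangle
x = 2.0 * torch.rand(1000, 1) - 1.0
B = torch.rand(1000, 1)
outputs = net(torch.cat([x, B], dim=1))
```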
To showcase the benefit from the self-adaptive weights, an accuracy comparison of the ALM-PINN-determined yield surfaces with $\mu_2 = 0$ (uniform) and $\mu_2 = 100$ is presented in Figure 16. The exact B-dependent yield surfaces are known to be located at $x = \pm B$ (shown in red). When the contours of $|u'|$ between levels $5\times 10^{-3}$ and $10^{-2}$ (six contours at levels $(0.5{:}0.1{:}1)\times 10^{-2}$) are plotted, the regular ALM outputs a sharp boundary (shown in black) between the unyielded (shown in white, in which $|u'| < 5\times 10^{-3}$) and yielded (shown in yellow, in which $|u'| > 10^{-2}$) regions. The ALM-PINN-detected yield surfaces are slightly blurred, consisting of the contours of $|u'|$ spread out between levels $5\times 10^{-3}$ and $10^{-2}$, reflecting the randomness in the network-learned solutions. However, setting $\mu_2 = 100$ clearly helps with the yield surface deblurring, provided that $B$ is not too close to its upper bound $B = 1$, where the network auto-differentiation may be less accurate.

4.2. A Two-Dimensional Example

In this section, the ALM-PINN will be tested on the Stokes flow of Bingham fluids in a square lid-driven cavity governed by the momentum Equations (2) and (3). The computational domain is set to Ω = ( 0 , 1 ) × ( 0 , 1 ) . Dirichlet boundary conditions are imposed on the left, right and bottom walls, whereas on the lid, u 0 = ( 1 , 0 ) .

4.2.1. Network Architecture and Hyperparameters

The loss formulation is provided in Section 3, along with the specified network input and outputs. The number of hidden layers is set to 4, each with 40 neurons. The hyperparameters $\lambda_u = 20$ and $\lambda_w = 50$ are fixed for all $B$, whereas $\lambda$ increases with $B$ to properly handle the boundary loss; otherwise, the desired flow patterns would not be captured. In particular, $\lambda = 250$ for $B = 2$, $\lambda = 1000$ for $B = 5$, and $\lambda = 2500$ for $B = 20$ are used in the corresponding simulations.

4.2.2. ALM-PINN Performance

The convergence of network optimization is measured via the final scales of four loss terms L u , L w , L p and L b , demonstrating how well the governing equations, the constitutive laws, the pointwise pressure restriction and the boundary conditions are fulfilled, respectively. The results for B = 2 , 5 , 20 are presented in Table 4.
$L_p$ reduces to a small scale for all the $B$ values, indicating that the pointwise pressure restriction can be well established. The relatively lower accuracy for larger $B$ is a reasonable indicator of the increase in Bingham fluid complexity. Plots of the pressure at the lid for different $B$ values are also provided in Figure 17 (left). The overall scales and evolution tendency are consistent with the results in [15], except that relatively smaller pressure values are obtained around the boundary walls. Without preimposed symmetry, these curves are not perfectly symmetric about the origin, especially around the center, and learning errors are noticeable in the small-scale output.
$L_b$ drops to $O(10^{-3})$ or lower and decreases as $B$ increases, which is an effect of the heavier weighting (larger $\lambda$) on $L_b$. This boundary loss weighting turns out to be vital for balancing the distinct convergence rates of the different loss terms. The training cannot succeed without properly setting $\lambda$.
Unlike the pointwise losses, $L_u$ and $L_w$ do not decay to significantly small scales. The main cause is the existence of dominant approximation errors isolated locally in Bingham fluids, especially as $B$ grows. Referring to Figure 18, Figure 19 and Figure 20, these regions are located around the lower-level contours of $|\beta_1^1|$, $|\beta_1^2|$ and $|\beta_2|$, contributing the major losses in the governing equations and constitutive laws, respectively. They tend to be more concentrated around the upper yield surface and move toward the lid as $B$ increases.

4.2.3. Optimization Algorithm

In the two-dimensional simulation, the Adam algorithm and the LBFGS need to be implemented back-to-back. An example of the output comparison with and without the LBFGS is provided in Figure 21 to showcase the resolution enhancement through an LBFGS follow-up.
In the Adam process, the network is trained on 60,000 interior sample points and $4 \times 250$ boundary points, with a minibatch size of 6000 and number of epochs $N = 14{,}000$. The LBFGS step requires full-batch engagement but only a moderate sample size to fine-tune the noisy edges. In all the simulations presented, merely 8000 interior sample points are involved for the LBFGS to function well. Hence, this full-batch training step can be completed rapidly.

4.2.4. Unyielded Regions

Unyielded regions along with the streamlines for $B = 2, 5, 20$ are provided in Figure 21 and Figure 22. The bottom center of the upper yield surface and the top center of the lower yield surface, as well as the top and bottom tips of the upper unyielded regions, are marked with red dashed line segments. These critical locations and their evolution with $B$ are consistent with the existing numerical results in the literature, for instance [15], as well as with the corresponding vortex locations listed in Table 4 and the plots of $\tau$ at the lid shown in Figure 17 (right). On the other hand, nuances can also be observed. Isolated unyielded islands show up for small $B$ ($B = 2$) at the end of the top boundary and tend to partially merge into the bulk as $B$ increases. In the upper unyielded regions, the top boundary seems to grow faster and becomes wider as $B$ increases, compared to the lower boundary, which grows at a slower rate and hence appears relatively narrower. The bottom center of the upper yield surface and the top center of the lower yield surface are located slightly lower in all the cases, compared to the results in [15].

4.3. Convergence Analysis

The quality of the ALM-PINN solutions depends considerably on various inherent aspects of the network that affect the effectiveness of the corresponding training, as summarized below.

4.3.1. Network Architectures

The network solution is a composite function defined in a user-specified hypothesis function space. The effectiveness of this approximation relies on the underlying network architecture, including the network dimensions, the type of connectivity, and the choice of activation function. Fully connected networks are adopted in this work. In the Poiseuille flow simulation, the convergence is not sensitive to the activation function provided that it is continuous and bounded, whereas in the driven-cavity flow simulation, tanh is much more stable than other options. The influence of the network dimensions on the ALM-PINN training for both the one- and two-dimensional models is provided in Table 5 and Table 6. As observed, convergence is achieved with sufficiently many network layers $q_l$ and neurons $q_n$. Accuracy improvement can be expected as $q_l$ and $q_n$ increase, which is consistent with the universal approximation theorem for neural networks [50]. The network output solution can be viewed as a linear combination of nonlinear composite (basis) functions. Its complexity is determined by the network dimensions, where $q_l$ reflects the complexity of each basis element, and $q_n$ corresponds to the number of basis elements in the combination. Higher accuracy is hence obtained through increasing $q_l$ to capture more complex solution patterns, or through increasing $q_n$ to enlarge the degrees of freedom. Over-fitting may occur if $q_l$ and $q_n$ are too large; thus, there is always a balance between accuracy and network dimensions. For the one-dimensional model, a moderate choice is $q_l = 5$ and $q_n = 20$–$30$, in the sense that a further level of enhancement of the solution resolution is no longer worth the extra training cost. Similarly, we opt for $q_l = 6$ and $q_n = 30$–$40$ in the two-dimensional simulation. Another observation is that the boundary loss $L_b$ is less sensitive to the network dimensions.

4.3.2. Optimization Error

The ALM-PINN training is established via a two-stage optimization process consisting of a preliminary minimization by a stochastic gradient descent (SGD) algorithm, the Adam algorithm [51], and a follow-up enhancement using a Gauss–Newton-type optimization, the LBFGS [52]. Although the Adam algorithm is no longer a stand-alone optimization scheme as it is in data-driven learning, it can still output a valid initial guess that is crucial for the LBFGS to succeed. With appropriate initialization of the network parameters, a functional learning rate and a suitable minibatch size, the Adam search is faster and more stable than the LBFGS. In the one-dimensional simulation, the ALM-PINN can be effectively trained using the Adam algorithm alone, as confirmed by the numerical results shown in Table 7. The comparison of loss convergence in the cases with and without weight adaptivity showcases the effectiveness of the latter.
In the two-dimensional simulation, weight adaptivity is necessary to achieve reasonable convergence. In addition, the Adam algorithm alone does not achieve the desired resolution in the solution and can only provide a sufficiently close initial guess for the LBFGS when $B \le 20$. A more sophisticated network architecture will need to be developed in future work to better handle the simulation for a wider range of $B$. This Bingham-number-influenced convergence is presented in Table 8 and Table 9.
Unlike the uniform convergence in mesh-based methods, the network losses decay with fluctuations, which is caused by the SGD searches in the Adam algorithm.

4.3.3. Other Numerical Factors

The accuracy of the network solution depends on both the interior and the boundary sample sizes. As discussed above, uniformly distributed random samples promote more effective learning via the SGD-based Adam algorithm. This suggests that the number of boundary points $N_{\partial\Omega}$ and the number of interior points $N_\Omega$ be related as $N_{\partial\Omega} \sim \sqrt{N_\Omega}$.
The training convergence also depends on the initialization of the network parameters. In this work, the initial weights $W_0$ and biases $b_0$ are set to
$W_0 \sim \sqrt{\dfrac{2}{q_n}}\,\mathcal{N}(0, 1), \qquad b_0 \sim \mathcal{N}(0, 1),$
respectively, where $\mathcal{N}(0, 1)$ is the normal distribution with zero mean and unit standard deviation, and $q_n$ is the network width. This is one of the popular initialization options for fully connected feed-forward networks.
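A sketch of this initialization in PyTorch (the helper name is illustrative; the scaling follows the formula above, using the layer's input width as $q_n$, which is a choice made here for illustration since all hidden layers have width $q_n$ anyway):

```python
import math
import torch

def init_weights(module: torch.nn.Module) -> None:
    """W_0 ~ sqrt(2 / q_n) * N(0, 1) and b_0 ~ N(0, 1) for every linear layer."""
    if isinstance(module, torch.nn.Linear):
        q_n = module.weight.shape[1]                 # fan-in of the layer
        torch.nn.init.normal_(module.weight, mean=0.0, std=math.sqrt(2.0 / q_n))
        torch.nn.init.normal_(module.bias, mean=0.0, std=1.0)

# Usage: net.apply(init_weights) walks over every submodule and re-initializes the linear layers.
```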

5. Conclusions and Future Investigations

An ALM-PINN approach is proposed and implemented to simulate steady-state Poiseuille and plane lid-driven cavity flows of Bingham fluids. Although the effectiveness in the two-dimensional implementation is still limited to relatively small Bingham numbers, this work initiates the incorporation of deep learning techniques into the traditional ALM for complex fluid flow simulation. The mesh-free feature and network versatility make the new approach potentially powerful.

5.1. Numerical Achievements and Observations

With the analogous decomposition coordination process in the network setting, the ALM-PINN loss function is formulated as a linear combination of mean squared errors reflecting the governing equations, constitutive laws, pressure rescaling and boundary conditions, coupled with dynamically adaptable weights to balance out the diverse decay rates of the individual losses. The subsequent loss minimization can be successfully carried out via the Adam algorithm in the one-dimensional simulation, whereas the two-dimensional output needs to be sharpened with an LBFGS follow-up. The LBFGS search is highly sensitive to its initialization and fails when the initial estimate from the Adam step is not sufficiently close. This is the major cause of the training collapse in the case of large Bingham numbers.
For the Adam algorithm, numerical evidence suggests a minibatch size of 10%, which is effectively applied to all implementations. The influence of random sample clustering on the overall learning proficiency is also investigated numerically, with detailed discussions provided for the one-dimensional simulation. The SGD-based optimization algorithm does not favor heavily concentrated sample distributions. Successive local grid clustering, analogous to the iterative local mesh refinement in mesh-based methods, may break up the uniformity of sample randomness and have a negative impact on the Adam search in a training epoch. More severe accuracy issues can occur in higher-dimensional training. Hence, the two-dimensional simulation conducted here is only over uniformly distributed random samples.
For the one-dimensional simulation with a fixed Bingham number B, the network training is seemingly more expensive compared to the regular FEM approach. However, a high-resolution B-dependent solution can be learned via a single training with only 1000 sample points in the two-dimensional $(x, B)$-plane. This is where the ALM-PINN potentially outperforms the regular FEM, in which building B dependence into the numerical solution is equivalent to adding one more dimension to the computational domain, hence significantly enlarging the number of meshes. Technically, the ALM-PINN setting is advantageous for constructing families of B-dependent solutions, as verified in the Poiseuille flow simulation. Unfortunately, the convergence of the Adam step is hindered in the two-dimensional learning because of the involvement of the B-dependent hyperparameter $\lambda$ associated with the boundary loss, which needs to be manually tuned in advance. Even for the two-dimensional simulation with a fixed Bingham number B, the network training can be accomplished on $O(10^4)$ sample points through $O(10^4)$ epochs. To achieve comparable accuracy, FEM has to be coupled with iterative local refinements (usually requiring 10–15 iteration steps), implementing $O(10^3)$ decomposition coordination loops over around 50,000–80,000 non-uniform meshes. This results in the same order of complexity compared to the ALM-PINN, whereas the mesh-free feature of the latter offers various flexibilities in algorithm design.

5.2. Further Development

Inspired by the present work, along with the recent advancements in deep learning and numerical methods for complex physics problems, one can explore a set of novel approaches to unlock new opportunities in the related areas of research.
Regarding network architectures, the remedies for exceedingly non-Newtonian singularities (large B cases) may include the construction of multiple localized sub-networks, the involvement of spatially varying hyperparameters in the activation function, and the approximation of network solutions using customized basis functions that can properly characterize specific features of the underlying physical quantities. In a more general setting, innovating neural network architectures that can better incorporate physical principles and constraints, investigating the integration of advanced activation functions, regularization techniques, and adaptive learning algorithms to improve PINN performance, as well as developing hybrid architectures that combine PINNs with other machine learning methods, such as convolutional neural networks (CNNs) or recurrent neural networks (RNNs), can be considered to tackle subtle multi-scale structures and more diverse physics phenomena.
Another strategy is the integration of PINNs with data-based learning. It is the abrupt phase transition in fluids with large Bingham numbers that causes the major numerical challenges. On the other hand, the predominance of solid phases in these highly non-Newtonian fluid bodies implies a vanishing rate of strain at the corresponding locations in the computational domain. Despite the unknown yield surfaces, sampling points in some interior solid portions is completely manageable, since numerical results at these smooth locations are always of much higher accuracy. Using them as complementary input data to the ALM-PINN and minimizing the loss only over the other, problematic locations can significantly reduce the scale of the network training.
Eliminating the boundary loss by hard-coding the boundary conditions into the network solution will be essential for the ALM-PINN to capture B-dependent solutions. For Dirichlet boundary conditions, the network solution can be formulated as
$\mathbf{u} = P \cdot \mathbf{u}_N, \qquad P = 0 \;\text{ on the boundary},$
where P is a user-specified polynomial or some other function. Generalizing to arbitrary boundary conditions, let
$\mathbf{u} = P \cdot \mathbf{u}_N + \mathbf{u}_p, \qquad P = 0 \;\text{ on the boundary},$
where $P$ is the same as above, and $\mathbf{u}_p$ is another user-specified function satisfying the originally imposed boundary conditions. The determination of $\mathbf{u}_p$ may be a regular function construction or a generic boundary-value-problem solve in the case of complex boundary conditions. The flexibility in the choices of $P$ and $\mathbf{u}_p$ ensures the manageability of this approach, but identifying cost-effective options remains non-trivial. For example, in the Poiseuille flow simulation, setting $P$ as a quadratic polynomial and $\mathbf{u}_p$ as the solution to the corresponding Newtonian model works perfectly, yet the network fails to learn the solution effectively when $P$ is replaced by a sine function. In a higher-dimensional simulation, it is not straightforward to come up with a simple closed form for $\mathbf{u}_p$, whose construction may be resolved by coupling appropriate traditional numerical methods with the network.
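As a small sketch of this hard-constraint idea for the 1D Poiseuille case (PyTorch; the quadratic choice of $P$ follows the example above, while the function names are assumptions made here), the raw network output is multiplied by a function vanishing at $x = \pm 1$, so that $u(\pm 1) = 0$ holds exactly and $L_b$ can be dropped:

```python
import torch

def hard_bc_velocity(u_net, x: torch.Tensor) -> torch.Tensor:
    """u(x) = P(x) * u_N(x) with P(x) = 1 - x^2, so u(-1) = u(1) = 0 by construction.

    u_net : raw network output u_N evaluated at x (any callable (N, 1) -> (N, 1));
    for non-homogeneous Dirichlet data one would add a lifting term u_p satisfying the
    boundary values, as in u = P * u_N + u_p.
    """
    P = 1.0 - x ** 2                 # quadratic polynomial vanishing on the boundary
    return P * u_net(x)
```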

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Bird, R.B.; Dai, G.C.; Yarusso, B.J. The rheology and flow of viscoplastic materials. Rev. Chem. Eng. 1982, 1, 1–70. [Google Scholar] [CrossRef]
  2. Bercovier, M.; Engelman, M. A finite element for incompressible non-Newtonian flows. J. Comput. Phys. 1980, 36, 313–326. [Google Scholar] [CrossRef]
  3. Beris, A.N.; Tsamopoulos, J.A.; Armstrong, R.C.; Brown, R.A. Creeping motion of a sphere through a Bingham plastic. J. Fluid Mech. 1985, 158, 219–244. [Google Scholar] [CrossRef]
  4. Hughes, T.J.R.; Liu, W.K.; Brooks, A. Finite element analysis of incompressible viscous flows by the penalty function formulation. J. Comput. Phys. 1979, 30, 1–60. [Google Scholar] [CrossRef]
  5. Szabo, P.; Hassager, O. Flow of viscoplastic fluids in eccetric annular geometries. J. Non-Newtonian Fluid Mech. 1992, 45, 149–169. [Google Scholar] [CrossRef]
  6. Abdali, S.S.; Mitsoulis, E.; Markatos, N.C. Markatos: Entry and exit flows of Bingham fluids. J. Rheol. 1992, 36, 389–407. [Google Scholar] [CrossRef]
  7. Alexandrou, A.N.; McGilvreay, T.M.; Burgos, G.B. Steady Herschel–Bulkley fluid flow in three-dimensional expansions. J. Non-Newtonian Fluid Mech. 2001, 100, 77–96. [Google Scholar] [CrossRef]
  8. Beverly, C.R.; Tanne, R.I. Numerical analysis of extrude swell in viscoelastic materials with yield stress. J. Rheol. 1989, 33, 989–1009. [Google Scholar] [CrossRef]
  9. Matsoukas, A.; Mitsoulis, E. Geometry effects in squeeze flow of Bingham plastics. J. Non-Newtonian Fluid Mech. 2001, 109, 231–240. [Google Scholar] [CrossRef]
  10. Mitsoulis, E.; Tsamopoulos, J. Numerical simulations of complex yield-stress fluid flows. Rheol. Acta 2017, 56, 231–258. [Google Scholar] [CrossRef]
  11. Mitsoulis, E.; Zisis, T. Flow of Bingham plastics in a lid-driven square cavity. J. Non-Newtonian Fluid Mech. 2001, 101, 173–180. [Google Scholar] [CrossRef]
  12. O’Donovan, E.J.; Tanner, R.I. Numerical study of the Bingham Squeeze film problem. J. Non-Newtonian Fluid Mech. 1984, 15, 75–83. [Google Scholar] [CrossRef]
  13. Papanastasiou, T.C. Flows of materials with yield. J. Rheol. 1987, 31, 385–404. [Google Scholar] [CrossRef]
  14. Smyrnaios, D.N.; Tsamopoulos, J.A. Squeeze flow of Bingham plastics. J. Non-Newtonian Fluid Mech. 2001, 100, 165–190. [Google Scholar] [CrossRef]
  15. Syrakos, A.; Georgiou, G.C.; Alexandrou, A.N. Solution of the square lid-driven cavity flow of a Bingham plastic using the finite volume method. J. Non-Newtonian Fluid Mech. 2013, 195, 19–31. [Google Scholar] [CrossRef] [Green Version]
  16. Wilson, S.D.R. Squeezing flow of a Bingham naterial. J. Non-Newtonian Fluid Mech. 1993, 47, 211–219. [Google Scholar] [CrossRef]
  17. Duvaut, G.; Lions, J.L. Inequalities in Mechanics and Physics; Springer: Berlin/Heidelberg, Germany, 1976. [Google Scholar]
  18. Huilgol, R.R.; You, Z. Application of the augmented Lagrangian method to steady pipe flows of Bingham, Casson and Herschel–Bulkley fluids. J. Non-Newtonian Fluid Mech. 2005, 128, 126–143. [Google Scholar] [CrossRef]
  19. Fortin, M.; Glowinski, R. Augmented Lagrangian Methods: Applications to the Numerical Solution of Boundary-Value Problems; North-Holland Publishing Co.: Amsterdam, The Netherlands, 1983. [Google Scholar]
  20. Glowinski, R. Numerical Methods for Nonlinear Variational Problem; Springer: New York, NY, USA, 1984. [Google Scholar]
  21. Latché, J.-C.; Vola, D. Analysis of the Brezzi-Pitkäranta stabilized Galerkin scheme for creeping flows of Bingham fluids. SIAM J. Numer. Anal. 2004, 42, 1208–1225. [Google Scholar] [CrossRef] [Green Version]
  22. Vola, D.; Boscardin, L.; Latché, J.-C. Laminar unsteady flows of Bingham fluids: A numerical strategy and some benchmark results. J. Comput. Phys. 2003, 187, 441–456. [Google Scholar] [CrossRef]
  23. Saramito, P.; Roquet, N. An adaptive finite element method for viscoplastic fluid flows in pipes. Comput. Methods Appl. Mech. Eng. 2001, 190, 5391–5412. [Google Scholar] [CrossRef]
  24. Zhang, J. An augmented Lagrangian approach to Bingham fluid flows in a lid-driven square cavity with piecewise linear equal-order finite elements. Comput. Methods Appl. Mech. Eng. 2010, 199, 3051–3057. [Google Scholar] [CrossRef]
25. Le Borne, S. Preconditioned nullspace method for the two-dimensional Oseen problem. SIAM J. Sci. Comput. 2009, 31, 2494–2509. [Google Scholar] [CrossRef]
  26. Rehman, M.; Vuik, C.; Segal, G. SIMPLE-type preconditioners for the Oseen problem. Int. J. Numer. Meth. Fluids 2009, 61, 432–452. [Google Scholar] [CrossRef] [Green Version]
  27. Vuik, C.; Saghir, A.; Boerstoel, G.P. The Krylov accelerated SIMPLE(R) method for flow problems in industrial furnaces. Int. J. Numer. Meth. Fluids 2000, 33, 1027–1040. [Google Scholar] [CrossRef]
  28. Wathen, A.; Silvester, D. Fast iterative solution of stabilized Stokes systems. Part I: Using simple diagonal preconditioners. SIAM J. Numer. Anal. 1993, 30, 630–649. [Google Scholar] [CrossRef]
  29. Dimakopoulos, Y.; Makrigiorgos, G.; Georgiou, G.C.; Tsamopoulos, J. The PAL (Penalized Augmented Lagrangian) method for computing viscoplastic flows: A new fast converging scheme. J. Non-Newtonian Fluid Mech. 2018, 256, 23–42. [Google Scholar] [CrossRef]
  30. Alart, P.; Curnier, A. A mixed formulation for frictional contact problems prone to Newton like solution methods. Comput. Methods Appl. Mech. Eng. 1991, 92, 353–375. [Google Scholar] [CrossRef]
  31. Bleyer, J. Advances in the simulation of viscoplastic fluid flows using interior-point methods. Comput. Methods Appl. Mech. Eng. 2018, 330, 368–394. [Google Scholar] [CrossRef] [Green Version]
  32. Saramito, P. A damped Newton algorithm for computing viscoplastic fluid flows. J. Non-Newtonian Fluid Mech. 2016, 238, 6–15. [Google Scholar] [CrossRef] [Green Version]
33. Treskatis, T.; Moyers-Gonzalez, M.A.; Price, C.J. An accelerated dual proximal gradient method for applications in viscoplasticity. J. Non-Newtonian Fluid Mech. 2016, 238, 115–130. [Google Scholar] [CrossRef] [Green Version]
  34. Schaback, R.; Wendland, H. Kernel techniques: From machine learning to meshless methods. Acta Numer. 2006, 15, 543–639. [Google Scholar] [CrossRef] [Green Version]
  35. Young, D.L.; Jane, S.J.; Fan, C.M.; Murugesan, K.; Tsai, C.C. The method of fundamental solutions for 2D and 3D Stokes problems. J. Comput. Phys. 2006, 211, 1–8. [Google Scholar] [CrossRef]
  36. Fornberg, B.; Flyer, N. Solving PDEs with radial basis functions. Acta Numer. 2015, 24, 215–258. [Google Scholar] [CrossRef] [Green Version]
  37. Lagaris, I.E.; Likas, A.; Fotiadis, D.I. Artificial neural networks for solving ordinary and partial differential equations. IEEE Trans. Neural Netw. 1998, 9, 987–1000. [Google Scholar] [CrossRef] [Green Version]
38. Owhadi, H. Bayesian numerical homogenization. Multiscale Model. Simul. 2015, 13, 812–828. [Google Scholar] [CrossRef] [Green Version]
  39. Raissi, M.; Karniadakis, G.E. Hidden physics models: Machine learning of nonlinear partial differential equations. J. Comput. Phys. 2018, 357, 125–141. [Google Scholar] [CrossRef] [Green Version]
  40. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Inferring solutions of differential equations using noisy multi-fidelity data. J. Comput. Phys. 2017, 335, 736–746. [Google Scholar] [CrossRef] [Green Version]
  41. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Machine learning of linear differential equations using Gaussian processes. J. Comput. Phys. 2017, 348, 683–693. [Google Scholar] [CrossRef] [Green Version]
  42. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Numerical Gaussian processes for time-dependent and nonlinear partial differential equations. SIAM J. Sci. Comput. 2018, 40, A172–A198. [Google Scholar] [CrossRef] [Green Version]
43. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 2019, 378, 686–707. [Google Scholar] [CrossRef]
  44. Rao, C.; Sun, H.; Liu, Y. Physics-informed deep learning for incompressible laminar flows. Theor. Appl. Mech. Lett. 2020, 10, 207–212. [Google Scholar] [CrossRef]
  45. Wang, S.; Teng, Y.; Perdikaris, P. Understanding and mitigating gradient pathologies in physics-informed neural networks. SIAM J. Sci. Comput. 2021, 43, A3055–A3081. [Google Scholar] [CrossRef]
  46. Wight, C.L.; Zhao, J. Solving Allen-Cahn and Cahn-Hilliard Equations Using the Adaptive Physics Informed Neural Networks. Commun. Comput. Phys. 2021, 29, 930–954. [Google Scholar] [CrossRef]
  47. Wang, S.; Yu, X.; Perdikaris, P. When and why PINNs fail to train: A neural tangent kernel perspective. J. Comput. Phys. 2022, 449, 110768. [Google Scholar] [CrossRef]
  48. McClenny, L.; Braga-Neto, U. Self-adaptive physics-informed neural networks using a soft attention mechanism. arXiv 2020, arXiv:2009.04544. [Google Scholar]
  49. Malkus, D.S.; Hughes, T.J.R. Mixed finite element methods - reduced and selective integration techniques: A unification of concepts. Comput. Methods Appl. Mech. Eng. 1978, 15, 63–81. [Google Scholar] [CrossRef]
50. Hornik, K.; Stinchcombe, M.; White, H. Multilayer feedforward networks are universal approximators. Neural Netw. 1989, 2, 359–366. [Google Scholar] [CrossRef]
  51. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. In Proceedings of the 3rd International Conference on Learning Representations (ICLR), San Diego, CA, USA, 7–9 May 2015. [Google Scholar]
52. Liu, D.C.; Nocedal, J. On the limited memory BFGS method for large scale optimization. Math. Program. 1989, 45, 503–528. [Google Scholar] [CrossRef] [Green Version]
Figure 1. Poiseuille flow: absolute error in u (comparison between the ALM-PINN and ALM solutions) in the case of 100 sample points with minibatch sizes 10, 20, 50 and 100. Setting B = 0.5 and number of epochs N = 1000 .
Figure 2. Poiseuille flow: the ALM-PINN and ALM solution comparison (left), absolute error in u (right), in the case of 100 sample points with minibatch size 10. Setting B = 0.5 and number of epochs N = 1000 .
Figure 3. Poiseuille flow: the ALM-PINN and ALM solution comparison (left), absolute error in u (right), in the case of 100 sample points with minibatch size 50. Setting B = 0.5 and number of epochs N = 1000 .
Figure 4. Poiseuille flow: the ALM-PINN and ALM solution comparison (left), absolute error in u (right), in the case of 200 sample points with minibatch size 20. Setting B = 0.5 and number of epochs N = 8000 .
Figure 5. Poiseuille flow: the ALM-PINN and ALM solution derivative comparison (left), absolute error in u (right), in the case of 200 sample points with minibatch size 20. Setting B = 0.5 and number of epochs N = 8000 .
Figure 6. Poiseuille flow: the ALM-PINN and ALM solution comparison (left), absolute error in u (right), in the case of 200 sample points with minibatch size 20. Setting B = 0.5 , number of epochs N = 8000 , scaling factors μ 1 = 1 and μ 2 = 50 .
Figure 7. Poiseuille flow: the ALM-PINN and ALM solution comparison (left), absolute error in u (right), in the case of 200 sample points with minibatch size 20. Setting B = 0.5 , number of epochs N = 8000 , scaling factors μ 1 = 1 and μ 2 = 100 .
Figure 8. Poiseuille flow: the ALM-PINN and ALM solution derivative comparison (left), absolute error in u (right), in the case of 200 sample points with minibatch size 20. Setting B = 0.5 , number of epochs N = 8000 , scaling factors μ 1 = 1 and μ 2 = 50 .
Figure 9. Poiseuille flow: the ALM-PINN and ALM solution derivative comparison (left), absolute error in u (right), in the case of 200 sample points with minibatch size 20. Setting B = 0.5 , number of epochs N = 8000 , scaling factors μ 1 = 1 and μ 2 = 100 .
Figure 10. Evolution of network learned | β 1 | (left) and | β 2 | (right) after 10, 100, 1000 and 8000 iterations, in the case of 200 sample points with minibatch size 20. Setting B = 0.5 , number of epochs N = 8000 , scaling factors μ 1 = 1 and μ 2 = 100 .
Figure 11. Poiseuille flow: the ALM-PINN and ALM solution comparison (left), absolute error in u (right), in the case of 100 sample points on [−1, 1] and extra 100 sample points on [−0.6, 0.6] with minibatch size 20. Setting B = 0.5 and number of epochs N = 8000.
Figure 12. Poiseuille flow: the ALM-PINN and ALM solution comparison (left), absolute error in u (right), in the case of 50 sample points on [−1, 1] and extra 150 sample points on [−0.6, 0.6] with minibatch size 20. Setting B = 0.5 and number of epochs N = 8000.
Figure 13. Poiseuille flow: the ALM-PINN and ALM solution derivative comparison (left), absolute error in u (right), in the case of 100 sample points on [−1, 1] and extra 100 sample points on [−0.6, 0.6] with minibatch size 20. Setting B = 0.5 and number of epochs N = 8000.
Figure 14. Poiseuille flow: the ALM-PINN and ALM solution derivative comparison (left), absolute error in u (right), in the case of 50 sample points on [−1, 1] and extra 150 sample points on [−0.6, 0.6] with minibatch size 20. Setting B = 0.5 and number of epochs N = 8000.
Figure 15. Poiseuille flow: Solution error comparison (top) and derivative error comparison (bottom). Setting B = 0.5 and number of epochs N = 8000 .
Figure 16. Bingham number-dependent velocity profile of Poiseuille flow: Yield surfaces are captured by plotting contours of u. The ALM-PINN learning is performed over 1000 interior points and 25 + 25 boundary points, with minibatch size 100 and number of epochs N = 10,000. The red solid lines represent the exact yield surfaces. The black solid lines are contour plots of u at levels between 5 × 10^-3 and 10^-2.
Figure 17. Plots of pressure (left) and τ (right) at the lid for B = 2 , B = 5 and B = 20 .
Figure 18. Contour plots of network learned scaling factor | β 1 1 | for B = 5 (left) and B = 20 (right).
Figure 19. Contour plots of network learned scaling factor | β 1 2 | for B = 5 (left) and B = 20 (right).
Figure 20. Contour plots of network learned scaling factor | β 2 | for B = 5 (left) and B = 20 (right).
Figure 21. Streamline and unyielded region plots in the lid-driven cavity of Bingham fluid with B = 2 . (Left: solution from the Adam algorithm only. Right: solution after the LBFGS enhancement.)
Figure 22. Streamline and unyielded region plots in the lid-driven cavity of Bingham fluid with B = 5 (left) and B = 20 (right). Solutions are obtained by running the Adam algorithm followed by the LBFGS refinement.
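As the captions of Figures 21 and 22 note, the cavity solutions are produced by an Adam stage [51] followed by an LBFGS enhancement [52]. The snippet below is a minimal, generic sketch of that two-stage pattern in PyTorch; the names model and loss_fn, the iteration counts and the learning rate are placeholders and are not taken from the paper's implementation.

import torch

def train_two_stage(model, loss_fn, n_adam=10000, lr=1e-3):
    # Stage 1: stochastic first-order optimization with Adam.
    adam = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(n_adam):
        adam.zero_grad()
        loss = loss_fn()
        loss.backward()
        adam.step()
    # Stage 2: quasi-Newton refinement with LBFGS, which requires a closure
    # that re-evaluates the loss and its gradients.
    lbfgs = torch.optim.LBFGS(model.parameters(), max_iter=500,
                              line_search_fn="strong_wolfe")
    def closure():
        lbfgs.zero_grad()
        loss = loss_fn()
        loss.backward()
        return loss
    lbfgs.step(closure)
    return model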
Table 2. Key attributes of ALM-PINN.

Computational Aspects | Preferred Features of ALM-PINN
mesh requirements | mesh-free, flexibility of point sampling
formulation of solution | closed-form available, easy to evaluate
in higher dimensions | more cost-effective vs. mesh-based methods
time-dependent problems | one network training, no stability concerns
parameter-dependent solutions | one network training, cost-effective
ALM convergence | enhanced by adaptive weights
stream function formulation | divergence-free constraint eliminated
loss function | feasible for network training
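A brief remark on the "divergence-free constraint eliminated" entry in Table 2: in two dimensions, expressing the velocity through a scalar stream function makes incompressibility hold identically, so no separate divergence penalty is needed in the loss. In generic notation (a standard identity rather than this paper's specific symbols),

\[
u = \frac{\partial \psi}{\partial y}, \qquad v = -\frac{\partial \psi}{\partial x}
\quad\Longrightarrow\quad
\nabla \cdot \mathbf{u} = \frac{\partial^2 \psi}{\partial x\,\partial y} - \frac{\partial^2 \psi}{\partial y\,\partial x} = 0 .
\]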
Table 3. Comparison of the proposed ALM-PINN to other approaches.

Methods | Objective Functions | Handling Singularity
ALM | F + I + II | + regularization term (RT); RT of fixed scale; slow convergence; fine meshes required
modified ALM | F + I + II | + penalization term (PT); PT → 0 allowed; relatively faster convergence; fine meshes required
PINNs | Loss function (from BVP) | Bingham fluids (no explicit PDEs) ⇒ not applicable; mesh-free
weight adaptivity ([48]) | Loss function (from BVP or AL): Σ_{i=1}^{N} λ_i L_i(x, α) | network learned weights: λ_i constant
ALM-PINN (new DAW) | Loss function (F + II + BCs): Σ_{i=1}^{N} λ_i L_i(x, α, β(x, α)) | + RT scaled via DAW; convergence adjusted by network; built-in divergence free (stream function formulation); mesh-free; user specified hyperparameters: λ_i (fixed); network learned weights: β(x, α) scaling function
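To make the last row of Table 3 concrete, the sketch below shows one possible shape of a composite loss that combines a fixed user-specified boundary hyperparameter with a network-learned pointwise scaling β(x), written in PyTorch. The stand-in residual, the network sizes and every name here (u_net, beta_net, lam_bc) are illustrative assumptions, not the paper's actual implementation; the default lam_bc = 250 merely echoes the boundary hyperparameter reported for B = 2 in Table 4 below.

import torch
import torch.nn as nn

# Hypothetical solution network u(x) and adaptive scaling network beta(x).
u_net = nn.Sequential(nn.Linear(1, 20), nn.Tanh(),
                      nn.Linear(20, 20), nn.Tanh(),
                      nn.Linear(20, 1))
beta_net = nn.Sequential(nn.Linear(1, 20), nn.Tanh(), nn.Linear(20, 1))

def composite_loss(x_int, x_bnd, lam_bc=250.0):
    # Interior term: a placeholder ALM-type residual, scaled pointwise by beta(x);
    # the true Bingham residual is more involved than this stand-in.
    x_int = x_int.requires_grad_(True)
    u = u_net(x_int)
    du = torch.autograd.grad(u.sum(), x_int, create_graph=True)[0]
    residual = du - 1.0
    loss_int = (beta_net(x_int) ** 2 * residual ** 2).mean()
    # Boundary term: homogeneous Dirichlet data assumed, weighted by a fixed lambda.
    loss_bnd = (u_net(x_bnd) ** 2).mean()
    return loss_int + lam_bc * loss_bnd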
Table 4. The ALM-PINN performance comparison for different B values.

B | λ | Vortex Center | Loss L_u | Loss L_w | Loss L_p | Loss L_b
2 | 250 | (0.5, 0.8076) | 0.0482 | 0.0077 | 2.8844 × 10^-6 | 0.0017
5 | 1000 | (0.5, 0.8437) | 0.1431 | 0.0347 | 4.7370 × 10^-6 | 0.0013
20 | 2500 | (0.5, 0.9058) | 0.2506 | 0.0573 | 4.6986 × 10^-4 | 1.0243 × 10^-4
Table 5. Poiseuille flow (B-dependent solution): convergence of ALM-PINN with respect to network depth (number of layers q_l) and width (number of neurons q_n in each hidden layer). Activation function is tanh, with 1000 sample points, and after 10,000 epochs. No weight adaptivity.

q_l | q_n | Loss L_u | Loss L_w | Loss L_b
4 | 20 | 1.05 × 10^-3 | 2.23 × 10^-5 | 4.85 × 10^-6
4 | 30 | 1.27 × 10^-3 | 1.84 × 10^-5 | 2.37 × 10^-6
5 | 20 | 6.07 × 10^-4 | 1.57 × 10^-5 | 2.25 × 10^-6
5 | 30 | 1.12 × 10^-4 | 2.33 × 10^-5 | 5.36 × 10^-7
6 | 20 | 1.30 × 10^-4 | 8.21 × 10^-6 | 1.86 × 10^-7
6 | 30 | 4.00 × 10^-5 | 5.80 × 10^-7 | 6.23 × 10^-8
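Table 5 (and Table 6 below) sweeps the network depth q_l and width q_n with tanh activations. A hypothetical helper for building such networks and iterating over these sizes is sketched next; the function name make_mlp and the input/output dimensions are assumptions for illustration only, not the paper's code.

import torch.nn as nn

def make_mlp(q_l, q_n, d_in=1, d_out=1):
    # q_l hidden layers of q_n neurons each, tanh activations, linear output layer.
    sizes = [d_in] + [q_n] * q_l + [d_out]
    layers = []
    for m, n in zip(sizes[:-1], sizes[1:]):
        layers += [nn.Linear(m, n), nn.Tanh()]
    return nn.Sequential(*layers[:-1])  # drop the activation after the output layer

# Sweep the depth/width combinations of Table 5 and report trainable parameter counts.
for q_l in (4, 5, 6):
    for q_n in (20, 30):
        net = make_mlp(q_l, q_n)
        print(q_l, q_n, sum(p.numel() for p in net.parameters()))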
Table 6. Driven cavity (B = 5): convergence of ALM-PINN with respect to network depth (number of layers q_l) and width (number of neurons q_n in each hidden layer). Activation function is tanh, with 60,000 sample points, and after 14,000 epochs. Boundary hyperparameter λ = 1000, with weight adaptivity μ_1 = 20 and μ_2 = 50.

q_l | q_n | Loss L_u | Loss L_w | Loss L_p | Loss L_b
5 | 30 | 0.1866 | 0.0532 | 1.08 × 10^-5 | 9.06 × 10^-4
5 | 40 | 0.1611 | 0.0467 | 1.01 × 10^-5 | 9.36 × 10^-4
6 | 30 | 0.1629 | 0.0458 | 8.65 × 10^-6 | 8.83 × 10^-4
6 | 40 | 0.1431 | 0.0347 | 4.74 × 10^-6 | 1.30 × 10^-3
7 | 30 | 0.1357 | 0.0310 | 4.98 × 10^-6 | 7.22 × 10^-4
7 | 40 | 0.1001 | 0.0238 | 4.57 × 10^-6 | 7.36 × 10^-4
Table 7. Poiseuille flow (B-dependent solution): convergence of the Adam algorithm without weight adaptivity (left value in each cell) and with weight adaptivity (right value). Activation function is tanh, with 1000 sample points, and after 2000, 4000, 6000, 8000 epochs.

Steps | Loss L_u (× 10^-4) | Loss L_w (× 10^-5) | Loss L_b (× 10^-6)
2000 | 6.412; 3.979 | 6.125; 2.651 | 1.722; 1.599
4000 | 5.138; 1.234 | 3.120; 0.997 | 1.573; 1.091
6000 | 1.817; 1.011 | 1.047; 0.766 | 1.432; 1.584
8000 | 1.536; 0.652 | 1.072; 0.503 | 1.260; 1.013
Table 8. Driven cavity (B = 2): convergence of the Adam algorithm with weight adaptivity μ_1 = 20 and μ_2 = 50, boundary hyperparameter λ = 250. Activation function is tanh, with 60,000 sample points, and after 2000, 4000, 6000, 8000, 10,000 epochs.

Steps | Loss L_u | Loss L_w | Loss L_p | Loss L_b
2000 | 0.1439 | 0.0870 | 9.13 × 10^-6 | 0.0026
4000 | 0.0881 | 0.0383 | 6.48 × 10^-6 | 0.0018
6000 | 0.0797 | 0.0112 | 5.39 × 10^-6 | 0.0015
8000 | 0.0501 | 0.0085 | 4.03 × 10^-6 | 0.0017
10,000 | 0.0538 | 0.0090 | 3.13 × 10^-6 | 0.0018
Table 9. Driven cavity (B = 20): convergence of the Adam algorithm with weight adaptivity μ_1 = 20 and μ_2 = 50, boundary hyperparameter λ = 2500. Activation function is tanh, with 60,000 sample points, and after 2000, 4000, 6000, 8000, 10,000 epochs.

Steps | Loss L_u | Loss L_w | Loss L_p | Loss L_b
2000 | 0.7513 | 0.1595 | 3.44 × 10^-3 | 2.27 × 10^-3
4000 | 0.6800 | 0.0882 | 1.23 × 10^-3 | 1.10 × 10^-3
6000 | 0.4143 | 0.0954 | 1.06 × 10^-3 | 7.02 × 10^-4
8000 | 0.3366 | 0.0806 | 9.83 × 10^-4 | 3.54 × 10^-4
10,000 | 0.2527 | 0.0615 | 8.36 × 10^-4 | 1.20 × 10^-4
