A Mathematical Model of the Generalized Finite Strain Consolidation Process and Its Deep Galerkin Solution

Guang Yih Sheu

doi:10.3390/axioms14100733

Department of Accounting and Information Systems, Chang Jung Christian University, No.1, Changda Rd., Gueiren District, Tainan City 711301, Taiwan

Axioms 2025, 14(10), 733; https://doi.org/10.3390/axioms14100733

This article belongs to the Special Issue Mathematical Modeling, Simulations and Applications

Abstract

Classical three-dimensional consolidation theories rely on the small-strain assumption, which is inappropriate for the consolidation process of soft or very soft clay layers. This study instead derives a generalized mathematical model describing a three-dimensional finite-strain consolidation process and applies the deep Galerkin method to obtain its numerical solution. The mathematical model uses the Reynolds transport theorem to describe mass and momentum balances for the clay grain and pore water phases. The governing equation is the sum of the resulting mass and momentum balance equations. Next, the deep Galerkin method trains a deep neural network to minimize the loss function defined by the governing equation and the available initial and boundary conditions. The unknowns are the average velocity, effective stress, and pore water pressure. Consolidation settlements are predicted by updating the problem domain using the resulting average velocity. Two real-world examples demonstrate that, with the deep Galerkin method, the current mathematical model accurately predicts the consolidation settlements caused by the self-weight of two very soft clay layers. If the physics is under-determined, the deep Galerkin method helps resolve ill-posed problems by fitting a family of fields constrained by sampling/regularization rather than by the physics.

Keywords:

finite strain consolidation; Reynolds transport theorem; deep Galerkin method; mass and momentum balance equations; self-weight

MSC:

37N15; 65D25; 65D30; 65M99; 74H15; 74S99

1. Introduction

Consolidation represents the time-dependent process by which clay decreases its volume due to the expulsion of pore water under an external loading or its self-weight. A consolidation theory or a mathematical model describing the consolidation process is essential in many actual problems, such as estimating the consolidation settlements due to a construction loading (e.g., [1]), evaluating the results of a sand drain scheme (e.g., [2]), and scheduling a land reclamation project (e.g., [3]). Unexpected consolidation settlements may damage a building and delay a land reclamation project.

Nevertheless, classical mathematical models of consolidation processes rely on the small-strain assumption (e.g., [4]), in which consolidation settlements do not substantially change the thickness of a clay layer. This assumption is invalid for a soft or very soft clay layer, which may become significantly thinner as it settles. Such settlements may be caused by the layer's self-weight alone. A real example may be the Kansai airport, which may gradually sink into the sea due to excessive settlements of undersea clay layers. However, only one-dimensional [5] or quasi-multi-dimensional mathematical models [6] have been available to describe the consolidation process of a soft or very soft clay layer. These mathematical models assume that a soft or very soft clay layer deforms only in the vertical direction, which is inconsistent with the actual condition.

Therefore, the author’s thesis [7] developed a generalized mathematical model describing a generalized finite-strain consolidation process. The Reynolds transport theorem [8] is employed to represent mass and momentum balances. The thesis [7] further combined the resulting mass and momentum equations into a single governing equation. Nevertheless, that governing equation is long and complex, and applying it to a real problem is difficult because few numerical methods can solve it. To overcome this drawback, this study applies the deep Galerkin method (DGM) [9] to minimize the loss function defined by the sum of the resulting momentum equations and the initial and boundary conditions. The unknowns are the average velocity field, effective stress, and pore water pressure. Consolidation settlements are calculated by updating the problem domain according to the resulting average velocity field.

The DGM (e.g., [9]) is a meshless deep learning algorithm for solving high-dimensional partial differential equations. It is chosen because collecting the data needed to define the boundary conditions of a soft or very soft clay layer (for example, at a deep boundary) is difficult. With the DGM, a deep neural network is trained to provide a solution to a partial differential equation at random points in a problem domain. Defining complete boundary conditions is unnecessary.

The principal contributions of this study are as follows:

Compensate for the absence of a useful mathematical model for describing a generalized finite-strain consolidation process. It advances the simulation of the finite-strain consolidation process of soft or very soft clay layers. Compared with traditional small-strain consolidation theories (e.g., [4]), the proposed mathematical model has more applications, such as estimating settlements of dredge fill deposits.
Extend the application of the DGM to ill-posed problems. Specifying complete boundary conditions is unnecessary.

The remainder of this article is divided into five sections. Section 2 reviews published articles relevant to this study. Section 3 presents the development of a generalized finite strain consolidation theory and its DGM solution. Section 4 presents the numerical results generated using the current mathematical model and compares them with the observed settlements of the phosphatic waste clay and Osaka Bay mud. Based on this comparison, Section 5 presents a discussion. Section 6 presents the conclusions of this research.

2. Related Works

This study intends to develop a mathematical model to describe a generalized finite-strain consolidation process and apply the DGM to solve the unknowns. Therefore, relevant references are those articles presenting a two- or three-dimensional finite strain consolidation theory and applications of the DGM to solve various partial differential equations.

2.1. Finite Strain Consolidation Theory

When we intend to develop a generalized finite strain consolidation theory, the origin is usually the one-dimensional finite strain consolidation theory proposed by Gibson et al. [3,5]. Different from Biot [4], they used the void ratio as the main unknown of their mathematical model. Considering that consolidation settlements may substantially thin a soft clay layer, they defined a convective coordinate system to describe the changes in a clay layer’s thickness. Besides, Gibson et al. [3,5] accounted for the effects of a thick clay layer’s self-weight on its consolidation process.

After the one-dimensional finite strain consolidation theory [3,5], few articles presented a truly generalized finite strain consolidation theory. However, quasi-two- or three-dimensional theories are available. For example, Jeeravipoolvarn et al. [6] developed a quasi-three-dimensional finite strain consolidation theory, which assumed that pore water flows in any direction and pore pressure dissipation causes only vertical deformations. Huerta and Rodriguez (1992) [10] calculated finite-strain consolidation settlements of soft sediment fillings at high water levels. They considered one-dimensional deformations and two-dimensional fluxes. Liu et al. [11] assumed vertical strains. They evaluated the vertical drains combined with vacuum pressure using a quasi-axisymmetric finite strain consolidation theory.

The author’s Ph.D. thesis [7] presented a truly three-dimensional finite strain consolidation theory. This generalized finite strain consolidation theory extended the one-dimensional works provided by Gibson et al. [3,5]. The extension used the Reynolds transport theorem to model the mass and momentum balances for the clay grain and pore water phases. Limiting the deformation and pore water dissipation to the vertical direction is not required. Besides, the effect of the self-weight of a thick clay layer on its consolidation process is studied. However, the resulting mathematical model may be complex.

2.2. Deep Galerkin Method

Since this study chooses the DGM (e.g., [9]) to solve the proposed mathematical model, it is necessary to review the representative partial differential equations, which were solved using the DGM.

Kumar et al. [12] used the DGM to solve the one-dimensional Burgers–Huxley and Huxley equations. They are second-order partial differential equations with initial and boundary conditions. In the application of the DGM, estimating the time derivatives using a finite difference scheme was not required. Such finite difference approximations of time derivatives are common when the finite element method is applied to a time-dependent partial differential equation.

Sirignano and Spiliopoulos [9] applied the DGM to a free boundary equation. It is a one-dimensional and second-order partial differential equation. Its theoretical background is stock price dynamics. One of the boundaries is a time-dependent function. It enters into the derivation of the loss function defined to implement the DGM.

Masaharu [13] employed the DGM to solve a compressible Navier–Stokes equation. The Navier–Stokes equation contains convective accelerations. They are highly nonlinear differential terms. We may notice that Masaharu (2021) [13] studied supersonic flows around a blunt body without specifying full boundary conditions. This example is quite different from the application of a finite element method.

We may notice that only the third published study [13] applied the DGM to ill-posed problems. It encourages the current study to apply the DGM to solve finite strain consolidation problems.

3. Generalized Finite Strain Consolidation Theory

The following subsections frequently employ these symbols: t is the time, $Ω$ is the problem domain, V is its volume, S is its boundary, v is the velocity, $a$ denotes the Lagrangian coordinate, $x$ represents the convective coordinate, the subscript _s denotes the clay grain phase, the subscript _w denotes the pore water phase, the subscript ₀ represents the Lagrangian coordinate $a$, $Θ$ denotes the total amount of an extensive property (such as mass and momentum) within $Ω_{0}$, $θ$ is its density, $\frac{d}{d t}$ is the total derivative with respect to t, $v$ is the velocity field, n is the porosity, e is the void ratio, $n$ denotes a unit normal vector, $ρ$ is the density, $σ$ denotes the total stress tensor, $σ^{'}$ denotes the effective stress tensor, p is the pore water pressure, g is the gravitational acceleration, $Γ$ represents the boundary (for specifying boundary conditions), $β$ is the learning rate, $Loss$ denotes the loss function, ‖‖ represents the size, and $ϕ$ is the neural network’s parameter.

Based on the above symbols, Section 3.1 develops a mathematical model describing a finite strain consolidation process. Section 3.2 presents the DGM solution of this mathematical model, whereas Section 3.3 discusses the implementation of the resulting DGM solution.

3.1. Mathematical Model

Consider a homogeneous and saturated soil layer in which the clay grains and pore water are incompressible. The convective coordinate $x$ is defined to identify a point in a problem domain $Ω$.

The Reynolds transport theorem states that the conservation of the extensive property $Θ$ can be represented by [8]

\frac{d Θ}{d t} = \frac{d}{d t} (\int_{Ω} θ d V) + \int_{S} θ (v_{S} \cdot n_{S}) d S = 0

(1)

where the dot · denotes the dot product and $n_{S}$ is a unit vector normal to S.

3.1.1. Mass Balance for the Clay Grain Phase

Substituting $θ = (1 - n) ρ_{s}$ into Equation (1) results in

\frac{d}{d t} [\int_{Ω} (1 - n) ρ_{s} d V] + \int_{S} (1 - n) ρ_{s} (v_{s} \cdot n_{S}) d S = 0

(2)

Applying the definition of the total derivative $\frac{d f}{d t} = \frac{\partial f}{\partial t} + \nabla f \cdot v$, where f is a function, and a coordinate transform to simplify Equation (2) yields

\begin{matrix} \frac{d}{d t} [\int_{Ω} (1 - n) ρ_{s} d V] + \int_{S} (1 - n) ρ_{s} (v_{s} \cdot n_{S}) d S = \frac{d}{d t} [\int_{Ω_{0}} (1 - n_{0}) ρ_{s 0} J d V_{0}] \\ + \int_{S} (1 - n) ρ_{s} (v_{s} \cdot n_{S}) d S = \int_{Ω_{0}} \frac{d [(1 - n_{0}) ρ_{s 0} J]}{d t} d V_{0} + \int_{S} (1 - n) ρ_{s} (v_{s} \cdot n_{S}) d S \\ = \int_{Ω_{0}} \{\frac{\partial [(1 - n_{0}) ρ_{s 0} J]}{\partial t} + v_{s 0} \cdot \nabla [(1 - n_{0}) ρ_{s 0} J]\} d V_{0} + \int_{S} (1 - n) ρ_{s} (v_{s} \cdot n_{S}) d S = 0 \end{matrix}

(3)

in which ∇ is the gradient vector, $J = |\frac{\partial (x_{1}, x_{2}, x_{3})}{\partial (a_{1}, a_{2}, a_{3})}|$ is the Jacobian, $x_{i} (i = 1, 2, 3)$ are the components of the convective coordinate $x$, and $a_{i}$ are the components of the Lagrangian coordinate $a$.

Applying the Gauss theorem [14] ($\int_{S} π \cdot d S = \int_{Ω} \nabla \cdot π d V$, where $π$ denotes a continuously differentiable vector field) and a coordinate transform to simplify the final term of Equation (3) yields

\int_{S} (1 - n) ρ_{s} (v_{s} \cdot n_{S}) d S = \int_{Ω} (1 - n) ρ_{s} \nabla \cdot v_{s} d V = \int_{Ω_{0}} (1 - n_{0}) ρ_{s 0} \nabla \cdot v_{s 0} J d V_{0}

(4)

Combining the final term of Equation (4) with the eighth term $(\int_{Ω_{0}} v_{s 0} \cdot \nabla [(1 - n_{0}) ρ_{s 0} J] d V_{0})$ in Equation (3) yields $\int_{Ω_{0}} \nabla \cdot [(1 - n_{0}) ρ_{s 0} J v_{s 0}] d V_{0}$. Substituting the resulting expression into Equation (3) yields

\int_{Ω_{0}} \{\frac{\partial [(1 - n_{0}) ρ_{s 0} J]}{\partial t} + \nabla \cdot [(1 - n_{0}) ρ_{s 0} v_{s 0} J]\} d V_{0} = 0

(5)

Since Equation (5) holds for any problem domain $Ω_{0}$, we can reduce it by considering the localization:

\frac{\partial [(1 - n_{0}) ρ_{s 0} J]}{\partial t} + \nabla \cdot [(1 - n_{0}) ρ_{s 0} v_{s 0} J] = 0

(6)

Equation (6) is the mass balance equation for the clay grain phase. Furthermore, simplifying Equation (5) by considering that only the pore water dissipation from the $d V$ causes the consolidation settlements yields

\int_{Ω_{0}} \nabla \cdot [(1 - n_{0}) ρ_{s 0} J v_{s 0}] d V_{0} = 0 and \int_{Ω_{0}} \frac{\partial [(1 - n_{0}) ρ_{s 0} J]}{\partial t} d V_{0} = 0

(7)

Further simplifying the final expression of Equation (7) leads to

(1 - n_{0}) ρ_{s 0} J = C = [1 - n_{0} (t = 0)] ρ_{s 0} (t = 0)

(8)

where C is independent of time t. Since this study assumed that clay grains are incompressible, we can obtain $ρ_{s 0} = ρ_{s 0} (t = 0)$ and simplify Equation (8) to

J = \frac{1 - n_{0} (t = 0)}{1 - n_{0}}

(9)
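Equation (9) can be evaluated directly once the porosity is known. The following is a minimal sketch, assuming the standard saturated-soil relation n = e / (1 + e) between porosity and void ratio; the function names and the void ratio values are hypothetical and are not taken from the field data.

```python
def porosity_from_void_ratio(e: float) -> float:
    """Porosity n = e / (1 + e); standard relation, assumed here."""
    return e / (1.0 + e)


def jacobian_from_porosity(n_initial: float, n_current: float) -> float:
    """Equation (9): J = [1 - n0(t = 0)] / (1 - n0)."""
    if not (0.0 <= n_initial < 1.0 and 0.0 <= n_current < 1.0):
        raise ValueError("porosity must lie in [0, 1)")
    return (1.0 - n_initial) / (1.0 - n_current)


# Hypothetical void ratios chosen only for illustration.
e_initial, e_current = 1.0, 0.5
J = jacobian_from_porosity(
    porosity_from_void_ratio(e_initial),
    porosity_from_void_ratio(e_current),
)
print(J)  # volume ratio relative to the initial configuration
```

Under the assumed relation, the same value follows from (1 + e) / (1 + e(t = 0)), so J < 1 indicates that the material element has lost volume.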

3.1.2. Mass Balance for the Pore Water Phase

Substituting $θ = n ρ_{w}$ into Equation (1) yields

\frac{d}{d t} (\int_{Ω} n ρ_{w} d V) + \int_{S} n ρ_{w} (v_{w} \cdot n_{S}) d S = 0

(10)

Referring to Section 3.1.1 to simplify Equation (10) results in

\begin{matrix} \frac{d}{d t} (\int_{Ω} n ρ_{w} d V) + \int_{S} n ρ_{w} (v_{w} \cdot n_{S}) d S = \int_{Ω_{0}} [\frac{\partial (n_{0} ρ_{w 0} J)}{\partial t} + v_{w 0} \cdot \nabla (n_{0} ρ_{w 0} J)] d V_{0} \\ + \int_{Ω_{0}} n_{0} ρ_{w 0} (\nabla \cdot v_{w 0}) J d V_{0} = \int_{Ω_{0}} [\frac{\partial (n_{0} ρ_{w 0} J)}{\partial t} + \nabla \cdot (n_{0} ρ_{w 0} v_{w 0} J)] d V_{0} = 0 \end{matrix}

(11)

Since Equation (11) holds for any problem domain $Ω_{0}$, considering the localization to simplify Equation (11) yields

\frac{\partial (n_{0} ρ_{w 0} J)}{\partial t} + \nabla \cdot (n_{0} ρ_{w 0} v_{w 0} J) = 0

(12)

Equation (12) is the mass balance equation for the pore water phase.

3.1.3. Momentum Balance for the Clay Grain Phase

Substituting $θ = (1 - n) ρ_{s} v_{s}$ into Equation (1) yields

\frac{d}{d t} [\int_{Ω} (1 - n) ρ_{s} v_{s} d V] + \int_{S} (1 - n) ρ_{s} v_{s} (v_{s} \cdot n_{S}) d S = 0

(13)

Similar to the derivation of Equation (3), one can simplify Equation (13) to

\begin{matrix} \frac{d}{d t} [\int_{Ω} (1 - n) ρ_{s} v_{s} d V] + \int_{S} (1 - n) ρ_{s} v_{s} (v_{s} \cdot n_{S}) d S = \frac{d}{d t} [\int_{Ω_{0}} (1 - n_{0}) ρ_{s 0} v_{s 0} J d V_{0}] \\ + \int_{S} (1 - n) ρ_{s} v_{s} (v_{s} \cdot n_{S}) d S = \int_{Ω_{0}} \frac{d [(1 - n_{0}) ρ_{s 0} v_{s 0} J]}{d t} d V_{0} + \int_{S} (1 - n) ρ_{s} v_{s} (v_{s} \cdot n_{S}) d S \\ = \int_{Ω_{0}} ρ_{s 0} (1 - n_{0}) J \frac{d v_{s 0}}{d t} d V_{0} + \int_{Ω_{0}} v_{s 0} \frac{d [ρ_{s 0} (1 - n_{0}) J]}{d t} d V_{0} + \int_{S} (1 - n) ρ_{s} v_{s} (v_{s} \cdot n_{S}) d S \end{matrix}

(14)

By simplifying Equation (14) using the Gauss theorem and transforming the final term of the same equation using the coordinate transformation, we can reduce the equation to

\begin{matrix} \int_{Ω_{0}} ρ_{s 0} (1 - n_{0}) J \frac{d v_{s 0}}{d t} d V_{0} + \int_{Ω_{0}} v_{s 0} \frac{d [ρ_{s 0} (1 - n_{0}) J]}{d t} d V_{0} + \int_{S} (1 - n) ρ_{s} v_{s} (v_{s} \cdot n_{S}) d S \\ = \int_{Ω_{0}} ρ_{s 0} (1 - n_{0}) J \frac{d v_{s 0}}{d t} d V_{0} + \int_{Ω_{0}} v_{s 0} \frac{d [ρ_{s 0} (1 - n_{0}) J]}{d t} d V_{0} + \int_{Ω_{0}} (1 - n_{0}) ρ_{s 0} v_{s 0} J (\nabla \cdot v_{s 0}) d V_{0} \end{matrix}

(15)

Combining the final two terms of Equation (15) yields

\begin{matrix} \int_{Ω_{0}} v_{s 0} \frac{d [ρ_{s 0} (1 - n_{0}) J]}{d t} d V_{0} + \int_{Ω_{0}} (1 - n_{0}) ρ_{s 0} v_{s 0} J (\nabla \cdot v_{s 0}) d V_{0} \\ = \int_{Ω_{0}} v_{s 0} \{\frac{\partial [ρ_{s 0} (1 - n_{0}) J]}{\partial t} + \nabla [ρ_{s 0} (1 - n_{0}) J] \cdot v_{s 0} + (1 - n_{0}) ρ_{s 0} J (\nabla \cdot v_{s 0})\} d V_{0} \\ = \int_{Ω_{0}} v_{s 0} \{\frac{\partial [ρ_{s 0} (1 - n_{0}) J]}{\partial t} + \nabla \cdot [ρ_{s 0} (1 - n_{0}) J v_{s 0}]\} d V_{0} \end{matrix}

(16)

Based on Equation (6), Equation (16) is equal to 0. We then simplify Equation (13) to

\int_{Ω_{0}} ρ_{s 0} (1 - n_{0}) J \frac{d v_{s 0}}{d t} d V_{0} = 0

(17)

where the term $\frac{d v}{d t}$ is the acceleration field and $ρ_{s 0} (1 - n_{0})$ denotes the mass of clay grains (in $J d V_{0}$). Therefore, the left-hand side of this equation is equal to the forces exerted on clay grains. Thus, we can define

\int_{Ω_{0}} ρ_{s 0} (1 - n_{0}) J \frac{d v_{s 0}}{d t} d V_{0} = \int_{Ω_{0}} f_{s} (1 - n_{0}) d V_{0} + \int_{Ω_{0}} h^{s} d V_{0} + \int_{S_{0}} (σ_{0}^{'} \cdot n_{S_{0}}) d S_{0}

(18)

in which $h^{s}$ is the seepage force vector per unit volume arising from the frictional drag of pore water, and $f_{s}$ is the body force vector. This study considers that $f_{s}$ comes from the weights of clay grains. It and $σ_{0}^{'}$ can be further expressed as

σ_{0}^{'} = [\begin{matrix} {σ^{'}}_{11, 0} & {σ^{'}}_{12, 0} & {σ^{'}}_{13, 0} \\ {σ^{'}}_{21, 0} & {σ^{'}}_{22, 0} & {σ^{'}}_{23, 0} \\ {σ^{'}}_{31, 0} & {σ^{'}}_{32, 0} & {σ^{'}}_{33, 0} \end{matrix}] and f_{s} = \pm ρ_{s 0} g

(19)

where $g = {(0, 0, g)}^{T}$. The sign ± depends upon the direction of $a_{3}$. If the direction of $a_{3}$ is opposite to the direction of the gravitational force, the sign + is adopted. Meanwhile, applying the Gauss theorem to simplify the final term of Equation (18) yields $\int_{S_{0}} (σ_{0}^{'} \cdot n_{S_{0}}) d S_{0} = \int_{Ω_{0}} (\nabla \cdot σ_{0}^{'}) d V_{0}$. Substituting the resulting expression into Equation (18) and considering that the resulting integral equation holds for any problem domain $Ω_{0}$, we can further simplify the resulting integral equation by considering localization. The result is

ρ_{s 0} (1 - n_{0}) J \frac{d v_{s 0}}{d t} = \pm ρ_{s 0} (1 - n_{0}) g + h^{s} + \nabla \cdot σ_{0}^{'}

(20)

Equation (20) is the momentum balance equation for the clay grain phase.

3.1.4. Momentum Balance for the Pore Water Phase

Substituting $θ = n ρ_{w} v_{w}$ into Equation (1) yields

\frac{d}{d t} (\int_{Ω} n ρ_{w} v_{w} d V) + \int_{S} n ρ_{w} v_{w} (v_{w} \cdot n_{S}) d S = 0

(21)

Referring to the derivation of Equation (14) to simplify the first term of Equation (21) and applying the Gauss theorem and coordinate transform to the final term of the same equation, the resulting equation is

\begin{matrix} \frac{d}{d t} (\int_{Ω} n ρ_{w} v_{w} d V) + \int_{S} n ρ_{w} v_{w} (v_{w} \cdot n_{S}) d S = \frac{d}{d t} (\int_{Ω_{0}} n_{0} ρ_{w 0} v_{w 0} J d V_{0}) \\ + \int_{S} n ρ_{w} v_{w} (v_{w} \cdot n_{S}) d S = \int_{Ω_{0}} ρ_{w 0} n_{0} J \frac{d v_{w 0}}{d t} d V_{0} + \int_{Ω_{0}} v_{w 0} \frac{d (ρ_{w 0} n_{0} J)}{d t} d V_{0} \\ + \int_{Ω_{0}} n_{0} ρ_{w 0} v_{w 0} J (\nabla \cdot v_{w 0}) d V_{0} \end{matrix}

(22)

The final two terms of Equation (22) are combined and reduced to

\begin{matrix} \int_{Ω_{0}} v_{w 0} \frac{d (ρ_{w 0} n_{0} J)}{d t} d V_{0} + \int_{Ω_{0}} n_{0} ρ_{w 0} v_{w 0} J (\nabla \cdot v_{w 0}) d V_{0} \\ = \int_{Ω_{0}} v_{w 0} [\frac{\partial (ρ_{w 0} n_{0} J)}{\partial t} + \nabla \cdot (ρ_{w 0} n_{0} J v_{w 0})] d V_{0} \end{matrix}

(23)

Based on Equation (12), Equation (23) is equal to 0. Thus, we can reduce Equation (21) to

\int_{Ω_{0}} ρ_{w 0} n_{0} J \frac{d v_{w 0}}{d t} d V_{0} = 0

(24)

Similar to Equation (17), Equation (24) is equal to the forces exerted on the pore water phase. Thus [8],

\int_{Ω_{0}} ρ_{w 0} n_{0} J \frac{d v_{w 0}}{d t} d V_{0} = \int_{Ω_{0}} f_{w} n_{0} d V_{0} + \int_{Ω_{0}} h^{w} d V_{0} + \int_{S_{0}} (p_{0} \cdot n_{S_{0}}) d S_{0}

(25)

where $p$ denotes the pore water pressure tensor, $h^{w}$ is the reactive force vector per unit volume exerted by clay grains on the pore water phase as the pore water seeps through the clay, and $f_{w}$ is the body force vector. The $p_{0}$ and $f_{w}$ are further defined by

p_{0} = [\begin{matrix} p_{0} & 0 & 0 \\ 0 & p_{0} & 0 \\ 0 & 0 & p_{0} \end{matrix}] and f_{w} = \pm ρ_{w 0} g

(26)

in which setting the sign ± still depends upon the direction of $a_{3}$. On the other hand, applying the Gauss theorem to the final term of Equation (25) yields $\int_{S_{0}} (p_{0} \cdot n_{S_{0}}) d S_{0} = \int_{Ω_{0}} (\nabla \cdot p_{0}) d V_{0}$. Substituting the resulting expression into Equation (25) and considering that the resulting expression holds for any problem domain $Ω_{0}$, we can further simplify the resulting integral equation by considering localization. The result is

ρ_{w 0} n_{0} J \frac{d v_{w 0}}{d t} = \pm n_{0} ρ_{w 0} g + h^{w} + \nabla \cdot p_{0}

(27)

Equation (27) is the momentum balance equation for the pore water phase.

3.2. DGM Formulation

Summing Equations (20) and (27) results in

ρ_{t o t a l} J \frac{d v_{a v g}}{d t} = \pm ρ_{t o t a l} g + h^{w} + h^{s} + \nabla \cdot p_{0} + \nabla \cdot σ_{0}^{'}

(28)

where $v_{avg} = \frac{ρ_{s 0} (1 - n_{0}) v_{s 0} + ρ_{w 0} n_{0} v_{w 0}}{ρ_{total}}$ denotes the average velocity field, $\frac{d v_{avg}}{d t}$ represents the average acceleration field, and $ρ_{total} = ρ_{s 0} (1 - n_{0}) + ρ_{w 0} n_{0}$ is the total mass density. Different from the author’s thesis [7], this study does not try to combine Equations (9), (12), (20) and (27) into a governing equation. Thus, deriving a DGM solution of the current mathematical model is simpler.
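The definitions of the total mass density and the mass-weighted average velocity can be illustrated with a short sketch. The function names, the densities (2710 and 1000 kg/m³), and the phase velocities below are hypothetical values chosen only for illustration.

```python
from typing import Sequence, Tuple


def total_density(rho_s: float, rho_w: float, n: float) -> float:
    """rho_total = rho_s (1 - n) + rho_w n."""
    return rho_s * (1.0 - n) + rho_w * n


def average_velocity(
    rho_s: float,
    rho_w: float,
    n: float,
    v_s: Sequence[float],
    v_w: Sequence[float],
) -> Tuple[float, ...]:
    """Mass-weighted average of the grain and pore water velocities."""
    rho_total = total_density(rho_s, rho_w, n)
    return tuple(
        (rho_s * (1.0 - n) * vs + rho_w * n * vw) / rho_total
        for vs, vw in zip(v_s, v_w)
    )


# Hypothetical inputs: grain density, water density, porosity, velocities.
rho_total = total_density(2710.0, 1000.0, 0.5)
v_avg = average_velocity(2710.0, 1000.0, 0.5, (0.0, 0.0, 2.0), (0.0, 0.0, 0.0))
print(rho_total, v_avg)
```

When both phases move with the same velocity, the average velocity reduces to that common velocity, which is a quick consistency check on the weighting.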

Furthermore, focusing on the macro behaviors of clay grains and pore water, this study assumes $h^{w} + h^{s} = 0$ [7] or neglects the interaction between pore water and clay grains. Simplifying Equation (28) with this consideration yields

ρ_{t o t a l} J \frac{d v_{a v g}}{d t} = \pm ρ_{t o t a l} g + \nabla \cdot p_{0} + \nabla \cdot σ_{0}^{'}

(29)
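As a sanity check on Equation (29), consider a static state in which the average acceleration vanishes. This static assumption is introduced here only for illustration and is not part of the derivation above. Using $g = {(0, 0, g)}^{T}$ and the diagonal form of $p_{0}$ in Equation (26), the third component becomes a balance between the stress gradients and the self-weight:

```latex
\begin{aligned}
0 &= \pm ρ_{total}\, g + \nabla \cdot p_{0} + \nabla \cdot σ_{0}^{'} \\
\Rightarrow \quad
\frac{\partial p_{0}}{\partial a_{3}}
 + \sum_{j=1}^{3} \frac{\partial {σ^{'}}_{3 j, 0}}{\partial a_{j}}
 &= \mp ρ_{total}\, g
\end{aligned}
```

If, in addition, the shear components do not vary in the horizontal directions, only the $\partial {σ^{'}}_{33, 0} / \partial a_{3}$ term survives in the sum, so the combined vertical gradient of pore water pressure and vertical effective stress equals the unit weight of the saturated clay.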

A DGM solution of Equation (29) is next derived: Suppose the time t is between 0 and T. The initial and boundary conditions are

\frac{d v_{a v g} (t = 0)}{d t} = 0, p (t = 0) = p_{i n i}, and σ^{'} (t = 0) = σ_{i n i}^{'}

(30)

σ^{'} = σ_{d} on Γ_{f}

(31)

p = 0 on Γ_{D}

(32)

v_{a v g} = 0 on Γ_{F}

(33)

\frac{\partial p_{0}}{\partial n_{U j}} = 0 on Γ_{U}

(34)

where $Γ_{f}$, $Γ_{D}$, $Γ_{F}$, and $Γ_{U}$ are free, drained, fixed, and undrained boundaries, respectively, $p_{ini}$ is the initial pore water pressure tensor, $σ_{ini}^{'}$ represents the initial effective stress tensor, $σ_{d}$ denotes an imposed loading, and $n_{U j} (j = 1, 2, 3)$ is the component of a unit vector $n_{U}$ normal to the undrained boundary $Γ_{U}$. Note that Equations (31)–(34) do not mean that $Γ_{f}$, $Γ_{D}$, $Γ_{F}$, and $Γ_{U}$ exist separately in a particular problem domain $Ω_{0}$. Also, the computation of $σ_{ini}^{'}$ and $p_{ini}$ depends upon a particular problem domain $Ω_{0}$.

Based on Equations (29)–(34), this study applied the DGM algorithm to approximate the $v_{avg}$ using $α (a, t, ϕ)$, $p_{0}$ using $P (a, t, ϕ)$, and $σ_{0}^{'}$ using $S (a, t, ϕ)$, in which $ϕ$ is the neural network’s parameter. The DGM algorithm defines the loss function by (e.g., [9])

\begin{matrix} L o s s = {∥ρ_{t o t a l} J \frac{d α}{d t} - \nabla \cdot P - \nabla \cdot S∥}_{(0, T) \times Ω_{0}, ν_{1}}^{2} + {∥α (t = 0)∥}_{Ω_{0}, ν_{2}}^{2} \\ + {∥P (t = 0) - p_{i n i}∥}_{Ω_{0}, ν_{2}}^{2} + {∥σ^{'} (t = 0) - σ_{i n i}^{'}∥}_{Ω_{0}, ν_{2}}^{2} \\ + {∥P∥}_{(0, T) \times Γ_{D}, ν_{3}}^{2} + {∥S - σ_{d}∥}_{(0, T) \times Γ_{f}, ν_{4}}^{2} + {∥\frac{\partial P_{i j}}{\partial n_{U j}}∥}_{(0, T) \times Γ_{U}, ν_{5}}^{2} \end{matrix}

(35)

where $ν_{1} = \frac{1}{|Ω|}$, $ν_{2} = \frac{1}{T}$, $ν_{3} = \frac{1}{|Γ_{D}|}$, $ν_{4} = \frac{1}{|Γ_{f}|}$, $ν_{5} = \frac{1}{|Γ_{U}|}$, $|\cdot|$ denotes the size, $P_{i j} (i, j = 1, 2, 3)$ is the component of the $P$ tensor, and the symbol $∥\cdot∥$ represents the $L^{2}$ norm. The $Loss$ function measures how well the functions $α$, $P$, and $S$ satisfy Equations (29)–(34). If we can obtain $Loss = 0$, the $α$, $P$, and $S$ are the solutions of Equation (29).
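In practice, the weighted norms in Equation (35) are usually estimated by averaging squared residuals over randomly sampled points, which is what step 4 of Algorithm 1 evaluates. A sketch for the first and fifth terms follows, assuming uniform sampling; the sample counts $N_{1}$ and $N_{3}$ and the sample points $(a^{(i)}, t^{(i)})$ and $(\hat{a}^{(i)}, \hat{t}^{(i)})$ are notation introduced here.

```latex
\begin{aligned}
{∥ρ_{total} J \frac{d α}{d t} - \nabla \cdot P - \nabla \cdot S∥}_{(0, T) \times Ω_{0}, ν_{1}}^{2}
 &\approx \frac{1}{N_{1}} \sum_{i = 1}^{N_{1}}
 \left| ρ_{total} J \frac{d α}{d t} - \nabla \cdot P - \nabla \cdot S \right|^{2}_{(a^{(i)}, t^{(i)})},
 \quad (a^{(i)}, t^{(i)}) \in Ω_{0} \times (0, T) \\
{∥P∥}_{(0, T) \times Γ_{D}, ν_{3}}^{2}
 &\approx \frac{1}{N_{3}} \sum_{i = 1}^{N_{3}}
 \left| P (\hat{a}^{(i)}, \hat{t}^{(i)}, ϕ) \right|^{2},
 \quad (\hat{a}^{(i)}, \hat{t}^{(i)}) \in Γ_{D} \times (0, T)
\end{aligned}
```

The remaining terms can be estimated in the same way on their own sample sets, so no mesh of the problem domain is needed.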

The goal of defining Equation (35) is to find the $ϕ$ parameter with which the corresponding $α (a, t, ϕ)$, $P (a, t, ϕ)$, and $S (a, t, ϕ)$ minimize the $Loss$ function. Then, they are solutions of the $v_{avg}$, $p$, and $σ^{'}$. The best $ϕ$ value is estimated using an existing optimization method (for example, the Adam optimizer). Discretizing the problem domain $Ω$ into some elements is unnecessary.

Algorithm 1 [13] illustrates the proposed steps of implementing the DGM algorithm and three post-processing steps (steps 6–8), in which k denotes an iteration number, E denotes an error, $β$ represents the learning rate, $C$ is the compliance matrix, R denotes the void ratio-effective stress relationship, and $ϵ$ is the strain tensor. The sixth step provides the void ratio e. Optionally, the strain tensor $ϵ$ may be calculated. The seventh step outputs the numerical results for updating the Jacobian J and $ρ_{total}$. Since the initial displacement and velocity fields are zero, the eighth step is employed to update the coordinates $x_{j} (j = 1, 2, 3)$. Updating the coordinates $x_{j}$ also provides the consolidation settlements. Note that this study does not calculate consolidation settlements from the strain tensor $ϵ$ since collecting undisturbed samples of soft or very soft clay to construct its stress-strain and void ratio–stress relationships is difficult.

The learning rate $β_{k}$ in Algorithm 1 decreases with the iteration number k. The calculation $\nabla_{ϕ} E$ is an unbiased estimation of $\nabla_{ϕ} (Loss)$:

\mathbb{E} [\nabla_{ϕ} E (\cdot | ϕ_{k})] = \nabla_{ϕ} [L o s s (\cdot : ϕ_{k})]

(36)

where $\mathbb{E}$ denotes the mathematical expectation. Under technical conditions, the $ϕ$ parameter will converge to a critical point of the $Loss$ function as $k \to \infty$:

lim_{k \to \infty} ∥\nabla_{ϕ} L o s s (\cdot : ϕ_{k})∥ = 0

(37)
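The role of the unbiased gradient estimate and the decreasing learning rate can be illustrated on a toy problem. The sketch below does not use the consolidation loss: it minimizes the hypothetical scalar loss E[(ϕ − y)²] with y drawn from a normal distribution with mean 3, and the learning-rate schedule β_k = 0.5 / (k + 1) is an assumption chosen for illustration.

```python
import random
from typing import Tuple


def sgd_with_decaying_rate(num_steps: int, seed: int = 0) -> Tuple[float, float]:
    """Stochastic gradient descent on the toy loss E[(phi - y)^2]."""
    rng = random.Random(seed)
    phi = 0.0
    samples = []
    for k in range(num_steps):
        y = rng.gauss(3.0, 1.0)          # one random sample per iteration
        samples.append(y)
        grad_estimate = 2.0 * (phi - y)  # unbiased estimate of the loss gradient
        beta_k = 0.5 / (k + 1)           # learning rate decreasing with k
        phi -= beta_k * grad_estimate
    return phi, sum(samples) / len(samples)


phi_final, sample_mean = sgd_with_decaying_rate(5000)
print(phi_final, sample_mean)
```

With this particular schedule the update is a running average of the samples, so the parameter approaches the minimizer ϕ = 3 as k grows even though each individual gradient is noisy.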

Algorithm 1 DGM algorithm

Input: A problem domain $Ω$, a time interval (0, T), initial and boundary conditions, the neural network’s parameter $ϕ$, a learning rate $β$, a maximum iteration number, and a compliance matrix $C$ or a void ratio-effective stress relationship R.

Output: Solutions of the $v_{avg}$, $p$, and $σ^{'}$.

1: Generate random points from $Ω \times [0, T]$ and existing boundaries
2: while a maximum iteration number is not exceeded do
3:     Calculate the $σ_{i n i}^{'}$ and $p_{i n i}$ according to the initial problem domain $Ω$.
4:     Calculate: $\begin{matrix} E \leftarrow {[ρ_{t o t a l} J \frac{d α (a, t, ϕ_{k})}{d t} - \nabla \cdot P (a, t, ϕ_{k}) - \nabla \cdot S (a, t, ϕ_{k})]}_{Ω}^{2} + {[α (t = 0, ϕ_{k})]}_{Ω}^{2} \\ + {[P (t = 0, ϕ_{k}) - p_{i n i}]}_{Ω}^{2} + {[σ^{'} (t = 0, ϕ_{k}) - σ_{i n i}^{'}]}_{Ω}^{2} + {[P (a, t, ϕ_{k})]}_{Γ_{D}}^{2} \\ + {[S (a, t, ϕ_{k}) - σ_{d}]}_{Γ_{f}}^{2} + {[\frac{\partial P_{i j}}{\partial n_{U j}}]}_{Γ_{U}}^{2} \end{matrix}$
5:     Update $ϕ_{k + 1} \leftarrow ϕ_{k} - β_{k} \nabla_{ϕ} E$.
6:     Calculate $e \leftarrow R (σ^{'})$. If available, compute $ϵ \leftarrow C σ^{'}$.
7:     Update J and $ρ_{t o t a l}$.
8:     Update $x = x + \int v_{a v g} d t$.
9:     $k \leftarrow (k + 1)$.
10: end while
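Step 8 of Algorithm 1 can be sketched as a numerical time integration of the average velocity at one point. The trapezoidal rule, the function names, and the velocity history below are assumptions made for illustration; the sign of the velocity depends on the chosen direction of the vertical coordinate.

```python
from typing import Sequence


def integrate_displacement(times: Sequence[float], velocities: Sequence[float]) -> float:
    """Trapezoidal estimate of the time integral of one velocity component."""
    if len(times) != len(velocities) or len(times) < 2:
        raise ValueError("need matching time and velocity samples (at least two)")
    total = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        total += 0.5 * (velocities[i] + velocities[i - 1]) * dt
    return total


def update_coordinate(x_old: float, times: Sequence[float], velocities: Sequence[float]) -> float:
    """x <- x + integral of v_avg dt for a single coordinate of a single point."""
    return x_old + integrate_displacement(times, velocities)


# Hypothetical velocity history of a surface point (time in days, velocity in m/day).
times = [0.0, 1.0, 2.0, 3.0, 4.0]
v3 = [0.0, -0.02, -0.02, -0.01, 0.0]
x3_new = update_coordinate(5.0, times, v3)
settlement = x3_new - 5.0
print(x3_new, settlement)
```

The difference between the updated and the initial coordinate of a surface point is the consolidation settlement at that point.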

Figure 1 (e.g., [12,13]) shows an example of the neural network’s architecture for implementing the DGM. It approximates the $v_{avg}$ by $α (a, t, ϕ)$, $p$ by $P (a, t, ϕ)$, and $σ^{'}$ by $S (a, t, ϕ)$, in which $ϕ$ is the neural network’s parameter. In mathematical expressions, the deep neural network contains three types of layers:

Figure 1. An example of the neural network’s architecture for implementing the DGM (e.g., [9]).

Input layer: The neural network calculates

$Z^{(0)} = ψ (W_{0} X + b_{0})$

(38)

in which $X = {(a, t)}^{T}$ , $Z$ is the intermediate hidden feature vector, $ψ$ represents the activation function, $W_{0}$ denotes the input weight matrix, and $b_{0}$ represents the input bias vector.
Hidden layer: Suppose L hidden layers are generated. For each hidden layer, the neural network computes

$\begin{matrix} F^{(k)} = ψ [W_{f}^{(k)} Z^{(k - 1)} + b_{f}^{(k)}] \\ r^{(k)} = ψ [W_{r}^{(k)} Z^{(k - 1)} + {b_{r}}^{(k)}] \\ {\tilde{Z}}^{(k)} = tanh \{W_{h}^{(k)} [r^{(k)} ⊙ Z^{(k - 1)}] + b_{h}^{(k)}\} \\ Z^{(k)} = [1 - F^{(k)}] ⊙ Z^{(k - 1)} + F^{(k)} ⊙ {\tilde{Z}}^{(k)} \end{matrix}$

(39)

where $k = 1, 2, \dots, L$ denotes the k-th hidden layer, $F$ is the gate vector, the subscripts _f, _r, and _h denote the update gate, reset gate, and candidate hidden gate, respectively, $r$ represents the reset gate, $\tilde{Z}$ represents the candidate hidden state, ⊙ is the Hadamard product, and tanh is the hyperbolic tangent function. This tanh function serves as a nonlinear activation function.
Output layer:

$U = W_{o u t} Z^{(L)} + b_{o u t}$

(40)

where $U = {(v_{a v g}, p, σ^{'})}^{T}$ represents the output of the neural network, $W_{o u t}$ is the weight matrix of the output layer, and $b_{o u t}$ is the bias vector of the output layer.
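The hidden-layer update of Equation (39) can be sketched in plain Python for a single input vector. This is an illustrative sketch rather than the program used in this study: the helper names and the parameter dictionary are hypothetical, and the gate activation ψ is assumed to be tanh because the text does not fix it.

```python
import math
from typing import Dict, List

Vector = List[float]


def affine(W: List[Vector], z: Vector, b: Vector) -> Vector:
    """Return W z + b for a dense weight matrix stored as a list of rows."""
    return [sum(w * zj for w, zj in zip(row, z)) + bi for row, bi in zip(W, b)]


def tanh_vec(v: Vector) -> Vector:
    return [math.tanh(x) for x in v]


def hidden_layer(z_prev: Vector, params: Dict[str, list], psi=tanh_vec) -> Vector:
    """One hidden layer following Equation (39)."""
    F = psi(affine(params["W_f"], z_prev, params["b_f"]))   # update gate
    r = psi(affine(params["W_r"], z_prev, params["b_r"]))   # reset gate
    gated = [ri * zi for ri, zi in zip(r, z_prev)]          # r ⊙ Z^(k-1)
    z_tilde = tanh_vec(affine(params["W_h"], gated, params["b_h"]))  # candidate state
    return [(1.0 - f) * zp + f * zt for f, zp, zt in zip(F, z_prev, z_tilde)]


def zero_params(width: int) -> Dict[str, list]:
    """All-zero weights and biases, used only to check the gating logic."""
    def zeros_matrix() -> List[Vector]:
        return [[0.0] * width for _ in range(width)]
    return {
        "W_f": zeros_matrix(), "b_f": [0.0] * width,
        "W_r": zeros_matrix(), "b_r": [0.0] * width,
        "W_h": zeros_matrix(), "b_h": [0.0] * width,
    }


z0 = [0.3, -1.2, 0.7]
z1 = hidden_layer(z0, zero_params(3))
print(z1)
```

With all parameters set to zero and ψ = tanh, the update gate is zero, so the layer passes its input through unchanged. This shows how the gate F blends the previous state with the candidate state.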

3.3. Implementation of the DGM

This study uses PyTorch 2.8.0 to code a Python program implementing Algorithm 1. PyTorch provides automatic differentiation, which readily outputs the derivatives with respect to the independent variables. Backpropagation is adopted to train the deep neural network, and the training process is optimized with the Adam optimizer, which is based on the stochastic gradient descent method. An Apple Mac Pro with 8 MB of RAM, an M2 CPU, and 16 GPUs was employed to generate the numerical results.

4. Application

Given the availability of experimental data, this study uses the recorded consolidation settlements of the phosphatic waste clay [15] and Osaka Bay mud [16] to test the proposed mathematical model.

4.1. Phosphatic Waste Clay

The phosphate waste clay is the byproduct of the beneficiation process of the phosphate ore. This phosphate ore occurs in a gravelly, clayey sand and contains $\frac{1}{3}$ phosphate, $\frac{1}{3}$ granular materials (sand), and $\frac{1}{3}$ clays. Only the phosphate is collected, as the primary source of phosphorus in inorganic fertilizers. After the phosphate is extracted, the phosphate waste clay is pumped into large retention ponds and allowed to consolidate without any imposed loading. To design the storage capacity of the retention ponds, we must accurately predict the consolidation settlements of the phosphate waste clay caused by its self-weight.

The phosphate waste clay has a liquid limit between 100 and 200, a plasticity index between 70 and 150, and a specific gravity $G_{s}$ equal to 2.71 [15]. Some previous studies have constructed empirical relationships between the void ratio and effective stress. Equation (41) provides an example [17]:

$e = (e_{b} - e_{\infty}) \exp [- λ σ_{33}^{'}] + e_{\infty}$

(41)

where $σ_{33}^{'}$ is expressed in $\text{kg/cm}^{2}$, $e_{b}$ and $e_{\infty}$ are the anticipated void ratios before and after the consolidation process, and $λ$ is an empirical parameter. The word ‘anticipated’ reflects the difficulty of measuring the void ratio within the whole phosphate waste clay layer.
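To make Equation (41) concrete, the short sketch below evaluates it with the parameter values reported for the field test in Equation (44) ($e_{b}$ = 6.936, $e_{\infty}$ = 3.64, $λ$ = 3.53). The function name `void_ratio` and the sample stresses are illustrative assumptions, not part of the original program.

```python
import math

def void_ratio(sigma33_eff, e_b=6.936, e_inf=3.64, lam=3.53):
    """Equation (41): void ratio for a vertical effective stress given in kg/cm^2."""
    return (e_b - e_inf) * math.exp(-lam * sigma33_eff) + e_inf

# Illustrative stress levels (kg/cm^2); they are not taken from the field test.
for stress in (0.0, 0.1, 0.5, 1.0):
    print(f"sigma33' = {stress:3.1f} kg/cm^2 -> e = {void_ratio(stress):.3f}")
```

At zero effective stress the relationship returns $e_{b}$, and it decays monotonically toward $e_{\infty}$ as the effective stress grows.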

The performance of the current mathematical model is tested against the results of a field test [17]. In this field test, a retention pond settled due to the self-weight of poured phosphate waste clay. Cargill [17] derived a one-dimensional mathematical model to predict the consolidation settlements. Instead, this study applies the two-dimensional version of the proposed mathematical model to predict the consolidation settlements and compares them with the measured results. The initial problem domain is $0 \leq a_{j} \leq H$ ($j = 1, 3$), in which $H$ is the depth of the retention pond, the direction of $a_{3}$ is identical to the direction of the gravitational force, and the direction of $a_{1}$ is from right to left. Equation (35) is employed to fit $v_{avg}$, $p$, and $σ^{'}$ in this problem domain. If the physics is under-determined, Equation (35) fits $v_{avg}$, $p$, and $σ^{'}$ constrained by sampling/regularization rather than physics. An inset in Figure 2 also illustrates the initial problem domain. The two-dimensional version of Equation (29) is the governing equation in this subsection:

$\begin{matrix} ρ_{total} J \frac{d v_{avg,1}}{d t} = \frac{\partial p_{0}}{\partial a_{1}} + \frac{\partial σ_{11,0}^{'}}{\partial a_{1}} + \frac{\partial σ_{13,0}^{'}}{\partial a_{3}} \\ ρ_{total} J \frac{d v_{avg,3}}{d t} = - ρ_{total} g + \frac{\partial p_{0}}{\partial a_{3}} + \frac{\partial σ_{13,0}^{'}}{\partial a_{1}} + \frac{\partial σ_{33,0}^{'}}{\partial a_{3}} \end{matrix}$

(42)

Figure 2. Comparison of the predicted and observed elevations of the retention pond [17].

The initial and boundary conditions are

$\begin{matrix} v_{avg} (t = 0) = 0, p (t = 0) = γ_{w} a_{3}, \\ σ_{11}^{'} (t = 0) = σ_{13}^{'} (t = 0) = 0, \\ σ_{33}^{'} (t = 0) = \frac{γ_{w} (G_{s} - 1) a_{3}}{1 + e (t = 0)}, \\ σ^{'} (a_{3} = 0) = p (a_{3} = 0) = 0 \end{matrix}$

(43)

where $γ_{w}$ is the unit weight of pore water. Equation (43) means that $a_{3} = 0$ is a free boundary and also a drained one. Moreover, the previous study [17] provided the following data:

$T = 633 \text{ days}, λ = 3.53, H = 6.33 \text{ m}, e_{b} = 6.936, \text{ and } e_{\infty} = 3.64$

(44)

in which the high value of $e_{b}$ implies that the phosphate waste clay layer is very soft. Meanwhile, the deep neural network is trained using the Adam optimizer with a learning rate $β$ equal to $10^{-3}$ and $10^{5}$ epochs. This deep neural network has three hidden layers, and each hidden layer has 128 neurons. Discretizing the problem domain $Ω_{0}$ and $Γ_{f}$ uses $2^{22}$ random interior nodes and $2^{9}$ random boundary nodes.
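The initial conditions in Equation (43) can be evaluated directly from the data in Equation (44). The sketch below does so under two labeled assumptions: the unit weight of pore water is taken as 9.81 kN/m³, and the initial void ratio $e (t = 0)$ is taken as $e_{b}$ = 6.936; neither value is stated explicitly in the text. The function name `initial_state` is hypothetical.

```python
GAMMA_W = 9.81  # kN/m^3; assumed unit weight of pore water (not stated in the text)

def initial_state(a3, G_s=2.71, e0=6.936, gamma_w=GAMMA_W):
    """Initial fields of Equation (43) at depth a3 in metres; pressures in kPa."""
    return {
        "v_avg": (0.0, 0.0),                                  # both velocity components
        "p": gamma_w * a3,                                    # initial pore water pressure
        "sigma11": 0.0,
        "sigma13": 0.0,
        "sigma33": gamma_w * (G_s - 1.0) * a3 / (1.0 + e0),   # vertical effective stress
    }

H = 6.33  # m, depth of the retention pond from Equation (44)
depths = (0.0, 0.5 * H, H)
profile = [initial_state(a3) for a3 in depths]
for a3, state in zip(depths, profile):
    print(f"a3 = {a3:5.3f} m: p = {state['p']:7.3f} kPa, "
          f"sigma33' = {state['sigma33']:7.3f} kPa")
```

Both $p$ and $σ_{33}^{'}$ vanish at $a_{3} = 0$, which is consistent with the free, drained boundary prescribed there.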

Figure 2 compares the observed elevations of the retention pond with the predicted ones, which are obtained by subtracting the predicted consolidation settlements from the elevation of the retention pond. Figure 3a–d show, as a selected example, the predicted $σ_{11,0}^{'}$, $σ_{33,0}^{'}$, $σ_{13,0}^{'}$, and $p_{0}$ at the time t = 30 days. The unit of the predicted $σ_{11,0}^{'}$, $σ_{33,0}^{'}$, $σ_{13,0}^{'}$, and $p_{0}$ is kPa. Generating the data for plotting Figure 2 and Figure 3a–d took 1546.87 s of GPU time.

Figure 3. Predicted $σ_{11,0}^{'}$, $σ_{33,0}^{'}$, $σ_{13,0}^{'}$, and $p_{0}$ of a phosphate waste layer consolidated by its self-weight at the time t = 30 days ($σ_{11,0}^{'}$, $σ_{33,0}^{'}$, $σ_{13,0}^{'}$ in kPa).

Figure 2 shows the necessity of developing a general mathematical model for studying a finite strain consolidation process. By accounting for horizontal pore water dissipation, the current mathematical model predicts the elevations of the retention pond more accurately, especially during 10–100 days. The corresponding maximum error $(= \frac{|\text{predicted value} - \text{true value}|}{\text{true value}})$ is about 2.3%, whereas the maximum error of the previous one-dimensional model [17] is about 4.2%. Figure 3a,b show that $σ_{11,0}^{'}$ and $σ_{33,0}^{'}$ vary symmetrically about two skewed vertical axes. Besides, Figure 3c demonstrates that a nonuniform $σ_{13,0}^{'}$ acts on the retention pond. A one-dimensional mathematical model cannot output the variations of $σ_{11,0}^{'}$, $σ_{33,0}^{'}$, and $σ_{13,0}^{'}$ shown in Figure 3a–c. Meanwhile, Figure 3d demonstrates that the pore water pressure $p_{0}$ varies uniformly across the depth of the retention pond.
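The maximum error quoted above is the largest relative deviation between predicted and observed values. A minimal sketch of that calculation follows; the elevation values are hypothetical placeholders rather than the digitized field data, and the function name `max_relative_error` is an assumption. The absolute value in the denominator is added only to keep the measure non-negative.

```python
def max_relative_error(predicted, observed):
    """Largest |predicted - observed| / |observed| over paired samples."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    return max(abs(p - o) / abs(o) for p, o in zip(predicted, observed))

# Hypothetical elevations in metres, for illustration only.
observed = [6.00, 5.50, 5.00, 4.60]
predicted = [6.05, 5.45, 5.10, 4.55]
err = max_relative_error(predicted, observed)
print(f"maximum relative error = {100.0 * err:.1f}%")
```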

Figure 4 demonstrates the convergence of the $Loss$ value with respect to the number of epochs. This figure indicates that the $Loss$ value decreases to a low value once Algorithm 1 has run for a sufficient number of epochs.

Figure 4. Convergence of the $Loss$ value in predicting consolidation settlements of a phosphate waste clay layer.

If $Loss$ values below $10^{-4}$ are desired, Figure 4 indicates that merely 4000 to 5000 epochs are needed to generate such predictions. However, it remains of interest to investigate how the $Loss$ value varies with the learning rate $β$ and the number of interior or boundary nodes.
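Reading the required number of epochs off a convergence curve can be automated once the $Loss$ history is stored. The helper below is a sketch under the assumption that the history is a plain list with one value per epoch; the function name `first_epoch_below` and the geometrically decaying history are hypothetical and do not reproduce Figure 4.

```python
def first_epoch_below(loss_history, threshold=1e-4):
    """Return the 1-based epoch at which the loss first drops below threshold, else None."""
    for epoch, loss in enumerate(loss_history, start=1):
        if loss < threshold:
            return epoch
    return None

# Hypothetical loss history that decays geometrically; it is not the measured curve.
history = [0.998 ** k for k in range(10000)]
epoch = first_epoch_below(history)
print(epoch)
```

For a noisy history, a stricter criterion (for example, requiring the loss to stay below the threshold for a window of epochs) may be preferable.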

4.2. Osaka Bay Mud

Since Japan has limited land and a dense population, it has no choice but to build some offshore public facilities. Therefore, some published studies (e.g., [16]) investigated the consolidation behaviors of offshore clay layers near big cities such as Tokyo and Osaka. This study uses the results of a model test [16] whose goal is to monitor the consolidation process induced by the self-weight of an Osaka Bay mud layer to test the current mathematical model.

The previous study [16] reported that Osaka Bay mud has a liquid limit equal to 102.8, a plasticity index equal to 45.8, and a specific gravity $G_{s}$ equal to 2.59. Another published study [18] provided an empirical relationship between the void ratio and vertical effective stress $σ_{33}^{'}$:

$e = 1.35 - 0.45 \log (\frac{σ_{33}^{'}}{25 \text{ kPa}})$

(45)
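Equation (45) and its inverse can be sketched as follows. The sketch assumes that log denotes the base-10 logarithm, as is common for compression relationships in soil mechanics, although the text does not state the base; the function names are hypothetical.

```python
import math

def void_ratio_osaka(sigma33_kpa):
    """Equation (45) with an assumed base-10 logarithm; requires sigma33_kpa > 0."""
    return 1.35 - 0.45 * math.log10(sigma33_kpa / 25.0)

def effective_stress_osaka(e):
    """Inverse of Equation (45): vertical effective stress in kPa for a void ratio e."""
    return 25.0 * 10.0 ** ((1.35 - e) / 0.45)

# Illustrative stress levels in kPa.
for stress in (2.5, 25.0, 250.0):
    print(f"sigma33' = {stress:6.1f} kPa -> e = {void_ratio_osaka(stress):.3f}")
```

Under this reading, the void ratio equals 1.35 at the reference stress of 25 kPa and changes by 0.45 for every tenfold change in the vertical effective stress.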

As in Section 4.1, Equation (42) is the governing equation for predicting consolidation settlements of the Osaka Bay mud layer, with the initial problem domain set equal to $0 \leq a_{j} \leq H$ ($j = 1, 3$), where $H$ is the thickness of the Osaka Bay mud layer. The direction of $a_{3}$ is identical to the direction of the gravitational force, whereas the direction of $a_{1}$ is from left to right. Equation (35) is employed to fit $v_{avg}$, $p$, and $σ^{'}$ in this problem domain. If the physics is under-determined, Equation (35) fits $v_{avg}$, $p$, and $σ^{'}$ constrained by sampling/regularization rather than physics. The inset in Figure 5 further illustrates this problem domain $Ω_{0}$. The initial and boundary conditions are

$\begin{matrix} v_{avg} (t = 0) = 0, p (t = 0) = γ_{w} a_{3}, \\ σ_{33}^{'} (t = 0) = \frac{γ_{w} (G_{s} - 1) a_{3}}{1 + e (t = 0)}, \\ σ_{11}^{'} (t = 0) = σ_{13}^{'} (t = 0) = 0, \\ σ^{'} (a_{3} = 0) = 0, p (a_{3} = 0) = 0, \\ \frac{\partial p (a_{1} = 0)}{\partial a_{1}} = \frac{\partial p (a_{1} = H)}{\partial a_{1}} = v_{avg,1} (a_{1} = 0) = v_{avg,1} (a_{1} = H) = 0, \\ \frac{\partial p (a_{3} = H)}{\partial a_{3}} = v_{avg,3} (a_{3} = H) = 0 \end{matrix}$

(46)

where, according to Equation (46), the boundaries $a_{1} = 0$, $a_{1} = H$, and $a_{3} = H$ are fixed and also undrained. Other necessary data are [16]:

$T = 112.5 \text{ days}, H = 80 \text{ cm}, e (t = 0) = 7.849$

(47)

in which the initial void ratio $e (t = 0)$ indicates that the Osaka Bay mud layer is softer than the phosphate waste clay layer in Section 4.1. Meanwhile, the deep neural network is trained using the Adam optimizer with a learning rate $β$ equal to $10^{-3}$ and $10^{5}$ epochs. This deep neural network has three hidden layers, and each hidden layer has 128 neurons. Discretizing the problem domain $Ω_{0}$ and $Γ_{f}$ uses $2^{22}$ random interior nodes and $2^{9}$ random boundary nodes. Figure 5 compares the predicted and measured vertical settlements. Figure 6a–d show, as a selected example, the variations of the predicted $σ_{11,0}^{'}$, $σ_{33,0}^{'}$, $σ_{13,0}^{'}$, and $p_{0}$ at the time t = $1.55 \times 10^{4}$ min. The unit of $σ_{11,0}^{'}$, $σ_{33,0}^{'}$, $σ_{13,0}^{'}$, and $p_{0}$ is kPa. Generating the data for plotting Figure 5 and Figure 6a–d took 1299.66 s of GPU time.
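The random discretization described above can be sketched with NumPy as follows. The sketch assumes that the nodes are drawn uniformly in space and time and that the boundary set is the drained surface $a_{3} = 0$; the text does not spell out either detail, and the function name `sample_nodes` is hypothetical. A smaller interior sample ($2^{12}$ instead of $2^{22}$) is used so that the example runs quickly.

```python
import numpy as np

def sample_nodes(H, T, n_interior, n_boundary, seed=0):
    """Draw uniform random nodes (a1, a3, t) inside the domain and on a3 = 0."""
    rng = np.random.default_rng(seed)
    # Interior nodes: a1 and a3 in [0, H), t in [0, T).
    interior = rng.uniform(low=[0.0, 0.0, 0.0], high=[H, H, T], size=(n_interior, 3))
    # Boundary nodes on the assumed drained surface a3 = 0.
    a1_t = rng.uniform(low=[0.0, 0.0], high=[H, T], size=(n_boundary, 2))
    top = np.column_stack([a1_t[:, 0], np.zeros(n_boundary), a1_t[:, 1]])
    return interior, top

# H = 0.8 m and T = 112.5 days follow Equation (47).
interior, top = sample_nodes(H=0.8, T=112.5, n_interior=2**12, n_boundary=2**9)
print(interior.shape, top.shape)
```

The same routine could be extended with further boundary sets (for example $a_{1} = 0$, $a_{1} = H$, and $a_{3} = H$) by fixing the corresponding coordinate instead of $a_{3}$.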

Figure 5. Comparison of predicted and measured vertical settlements of an Osaka Bay mud layer at the time $t = 1.55 \times 10^{4}$ min [16].

Figure 6. Predicted $σ_{11,0}^{'}$, $σ_{33,0}^{'}$, $σ_{13,0}^{'}$, and $p_{0}$ of an Osaka Bay mud layer consolidated by its self-weight at the time $t = 1.55 \times 10^{4}$ min ($σ_{11,0}^{'}$, $σ_{33,0}^{'}$, $σ_{13,0}^{'}$ in kPa).

Figure 5 indicates that the current mathematical model provides accurate predictions of vertical settlements. Further inspection of the data used to plot this figure finds that the maximum error is below 5%. Although the Osaka Bay mud layer in this section is thinner than the phosphate waste clay layer in Section 4.1, Figure 6a still shows that a nonuniform $σ_{11,0}^{'}$ acts on the Osaka Bay mud layer, with larger $σ_{11,0}^{'}$ in its middle part. Similarly, Figure 6c demonstrates that the shear stress $σ_{13,0}^{'}$ varies nonuniformly. In contrast, Figure 6b,d show that the variations of $σ_{33,0}^{'}$ and $p_{0}$ are uniform.

Figure 7 visualizes the convergence of the $Loss$ value with respect to the number of epochs. As in Section 4.1, if the desired $Loss$ value is below $10^{-4}$, this figure shows that merely 2000 epochs are needed to obtain such $Loss$ values. The $Loss$ value converges even faster in this section than in Section 4.1, although the boundary conditions in this section are more complex.

Figure 7. Convergence of the $Loss$ value in predicting consolidation settlements of an Osaka Bay mud layer.

4.3. Ablation Study

In Section 3.2, Equation (32) defines the $Loss$ value. Several parameters can affect its convergence. Analogous to the error analysis performed when other numerical methods (for example, the finite element method) are applied to solve a partial differential equation, this study inspects the $Loss$ value with respect to different learning rates $β$ and numbers of interior and boundary nodes. Increasing the number of hidden layers or neurons can provide lower $Loss$ values; because this effect is expected, discussing the influence of the number of hidden layers or neurons on the $Loss$ value would not deliver new results.

Figure 8a,b compare the $Loss$ values with respect to the learning rates $β = 10^{-2}$ and $10^{-4}$. Except for the learning rate $β$, Figure 8a uses the data and settings used to plot Figure 2, Figure 3a–d and Figure 4, whereas Figure 8b uses the data and settings used to draw Figure 5, Figure 6a–d and Figure 7.

Figure 8. Comparison of the $Loss$ values with respect to different learning rates $β$: (a) the phosphate waste clay, (b) Osaka Bay mud.

Figure 8a,b show that training the deep neural network with the larger learning rate $β = 10^{-2}$ is not appropriate. The corresponding $Loss$ value does not converge to an acceptable value (below $10^{-4}$) in Figure 8a. In contrast, the $Loss$ value falls below $10^{-4}$ when the deep neural network is trained for more than 4000 to 5000 epochs with a learning rate $β = 10^{-4}$.

Figure 9a,b compare the $Loss$ values versus different numbers of interior and boundary nodes. Except for the number of interior or boundary nodes, Figure 9a uses the data and settings of Section 4.1, while Figure 9b is based on the data and settings employed to generate Figure 5, Figure 6a–d and Figure 7. To keep a particular nodal spacing, the numbers of interior and boundary nodes are increased simultaneously in creating Figure 9a,b.

Figure 9. Comparison of the $Loss$ values with respect to different numbers of interior and boundary nodes: (a) phosphate waste clay and (b) Osaka Bay mud.

Increasing the total number of interior and boundary nodes means a finer discretization of the problem domain. Since this study creates random interior and boundary nodes, Figure 9a,b indicate that generating a finer discretization of the problem domain does not noticeably improve the $Loss$ value. These two figures may show a distinguishing characteristic that does not exist in other numerical methods (for example, the finite element method), in which a finer discretization of the problem domain can noticeably improve the accuracy of numerical results.
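One possible reading of this behavior, offered here as an interpretation rather than a result of this study, is that a loss evaluated at random nodes is a Monte Carlo estimate of a mean-square residual. Such an estimate already lies close to its limiting value for a moderate number of nodes, so adding nodes changes it only slightly. The sketch below illustrates this with a hypothetical smooth residual field on the unit square; `toy_residual` and `mc_mean_square` are assumed names, and the field is not the residual of Equation (42).

```python
import math
import random

def toy_residual(a1, a3):
    """Hypothetical smooth residual field on the unit square (illustration only)."""
    return 1e-2 * math.sin(math.pi * a1) * math.cos(math.pi * a3)

def mc_mean_square(residual, n_nodes, seed=0):
    """Monte Carlo estimate of the mean-square residual at uniform random nodes."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_nodes):
        total += residual(rng.random(), rng.random()) ** 2
    return total / n_nodes

# The exact mean square of this toy field is 1e-4 * (1/2) * (1/2) = 2.5e-5.
estimates = {2 ** k: mc_mean_square(toy_residual, 2 ** k) for k in (9, 12, 15)}
for n_nodes, value in estimates.items():
    print(f"{n_nodes:6d} nodes -> mean-square residual = {value:.3e}")
```

In this toy setting, the sampling error of the estimate shrinks roughly as the inverse square root of the number of nodes, whereas the value being estimated does not depend on the number of nodes at all.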

5. Discussion

Section 4.1 and Section 4.2 reveal four difficulties of developing a mathematical model for an engineering problem; however, this study resolves these difficulties to a certain extent:

The first difficulty is the challenge of balancing the number of assumptions and the simplicity of the corresponding governing equations. This study eliminates the assumption that a clay layer consolidates only in the vertical direction in modeling a finite strain consolidation process; however, the current governing equation (Equation (29)) is not complex.
The second difficulty is the challenge of choosing a suitable numerical method for solving a real-world problem. The two real-world problems in this study are ill-posed, whereas boundary conditions are prerequisites for implementing existing numerical methods (for example, the finite element method). Section 4.1 provides an example: probably due to the lack of field measurements, some boundary conditions were unavailable in the previous study [17]. However, the DGM can resolve this difficulty since its goal is to minimize the $Loss$ value at random nodes.
The third difficulty arises from the fact that a finite strain consolidation problem is usually ill-posed. For example, limited boundary conditions are available, or the number of unknowns exceeds the number of equations. The DGM helps resolve an ill-posed problem. It fits a family of fields constrained by sampling/regularization rather than physics if the physics is under-determined. In Section 4.1 and Section 4.2, two governing equations are available, but six unknowns (pore water pressure, two average velocity components, and three effective stress components) exist. If this study does not adopt the DGM, modifying the problem to be well-posed must be implemented using available material properties (for example, a void ratio–stress relationship). The author’s Ph.D. thesis provided an example in which the void ratio is the unknown of a single and complex governing equation.
The fourth difficulty is the non-homogeneity of clay’s properties, which represents the limitation of this study. Natural clay’s properties are non-homogeneous. Although we can create a particular probability model to regress the distribution of a clay’s property, there must be enough clay samples to provide accurate regression points. However, gathering samples of a natural soft clay layer is not easy. For this reason, the accuracy of predicted consolidation settlements is unavoidably limited.

Meanwhile, Section 4.3 demonstrates that the DGM may be a promising numerical method for studying engineering problems. Even in a finite-strain engineering problem, we only need to adjust the learning rate $β$ in implementing a particular deep neural network.

6. Conclusions and Concluding Remarks

This study develops a mathematical model for describing a generalized finite strain consolidation process and its DGM solution. Developing this mathematical model adopts the Reynolds transport theorem to describe mass and momentum balances for clay grain and pore water phases. The governing equation is the sum of the resulting mass and momentum balance equations. Next, the DGM is applied to train a deep neural network that minimizes a loss function defined by the resulting governing equation and the available initial and boundary conditions. Two real-world problems show that the current mathematical model outperforms previous one-dimensional models in predicting the consolidation settlements caused by the self-weight of two natural soft clay layers. Besides, an additional ablation study finds that the learning rate has a noticeable effect on the value of the loss function.

Based on Section 4.1, Section 4.2 and Section 4.3, the conclusions of this study are

Deriving the current mathematical model advances the modeling of a finite-strain engineering problem. The current governing equation is simple but adapts to the changes in the problem domain. Based on this, we can make more accurate predictions about settlements.
The DGM helps resolve an ill-posed problem in which the number of unknowns exceeds the number of equations, or limited boundary conditions are available. If the physics is under-determined, it fits a family of fields constrained by sampling/regularization rather than physics.
To obtain the desired accuracy of numerical results provided by the DGM, adopting a lower learning rate in training a deep neural network is preferred.

Furthermore, the limitation of this study is the non-homogeneity of clay properties. Since implementing the DGM uses random interior and boundary nodes, constructing particular probability models to regress the distributions of clay properties and combining the resulting probability model with a DGM solution may be the main interest of future research.

Funding

This research received no external funding.

Data Availability Statement

The data used to implement Section 4.1 and Section 4.2 are digitized from two previous studies [16,17]. The data and runnable Python codes are available on https://github.com/xsheu/consolidation, accessed on 27 September 2025.

Conflicts of Interest

The author declares no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

DGM	Deep Galerkin Method

References

1. Loktev, K.A.; Ulanov, I.; Shishkina, I.; Savulidi, M.; Klekovkina, N.; Kuznetsov, A. Determination of settlement parameters of highway embankment and base consolidation time depending on soil characteristics. Transp. Res. Proc. 2022, 63, 946–955.
2. Kamash, W.E.; Hafez, K.; Zakaria, M.; Moubarak, A. Improvement of soft organic clay soil using vertical drains. KSCE J. Civ. Eng. 2021, 25, 429–441.
3. Cao, L.F.; Chang, M.-F.; Teh, C.I.; Na, Y.M. Back-calculation of consolidation parameters from field measurements at a reclamation site. Can. Geotech. J. 2001, 38, 755–769.
4. Biot, M.A. General theory of three-dimensional consolidation. J. Appl. Phys. 1941, 12, 155–164.
5. Gibson, R.E.; England, G.L.; Hussey, M.J.L. The theory of one-dimensional consolidation of saturated clays. Géotechnique 1967, 17, 261–273.
6. Jeeravipoolvarn, S.; Scott, J.D.; Chalaturnyk, R. Multi-dimensional finite strain consolidation theory: Modeling study. In Proceedings of the 61st Canadian Geotechnical Conference and the 9th Joint CGS/IAH-CNC Groundwater Conference, Edmonton, AB, Canada, 21–24 September 2008; pp. 167–175.
7. Sheu, G.Y. A General Finite Strain Consolidation Theory and Its Application. Ph.D. Thesis, National Cheng Kung University, Tainan, Taiwan, 1997.
8. Malvern, L.E. Introduction to the Mechanics of a Continuous Medium; Prentice-Hall: Englewood Cliffs, NJ, USA, 1977.
9. Sirignano, J.; Spiliopoulos, K. DGM: A deep learning algorithm for solving partial differential equations. J. Comput. Phys. 2018, 375, 1339–1364.
10. Huerta, A.; Rodriguez, A. Numerical analysis of nonlinear large-strain consolidation and filling. Comput. Struct. 1992, 44, 357–365.
11. Liu, S.J.; Sun, H.L.; Pan, X.D.; Shi, L.; Cai, Y.Q.; Geng, X.Y. Analytical solutions and simplified design method for large-strain radial consolidation. Comput. Geotech. 2021, 134, 103987.
12. Kumar, H.; Yadav, N.; Nagar, A.K. Numerical solution of Generalized Burger-Huxley & Huxley’s equation using deep Galerkin neural network method. Eng. Appl. Artif. Intell. 2022, 115, 105289.
13. Matsumoto, M. Application of deep Galerkin method to solve compressible Navier-Stokes equations. Trans. Jpn. Soc. Aeronaut. Space Sci. 2021, 64, 348–357.
14. Bali, N.P.; Goyal, M. A Textbook of Engineering Mathematics, 9th ed.; University Science Press: New Delhi, India, 2017.
15. McVay, M.; Townsend, F.; Bloomquist, D. Quiescent consolidation of phosphatic waste clays. J. Geotech. Eng. 1986, 112, 1033–1049.
16. Zen, K.; Umehara, Y. A new consolidation testing procedure and technique for very soft soils. In Consolidation of Soils: Testing and Evaluation; Yong, R.N., Townsend, F.C., Eds.; American Society for Testing and Materials: Philadelphia, PA, USA, 2019; pp. 405–432.
17. Cargill, K.W. Prediction of consolidation of very soft soil. J. Geotech. Eng. 1984, 110, 775–795.
18. Tanaka, H.; Locat, J. A microstructural investigation of Osaka Bay clay: The impact of microfossils on its mechanical behaviour. Can. Geotech. J. 1999, 36, 493–508.

Figure 1. An example of the neural network’s architecture for implementing the DGM (e.g., [9]).



© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
