Article

Modeling One-Dimensional Nonlinear Consolidation Problems by Physics-Informed Neural Network with Layer-Wise Locally Adaptive Activation Functions

1
Department of Civil Engineering, Shanghai University, Shanghai 200444, China
2
Department of Civil Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(15), 8341; https://doi.org/10.3390/app15158341
Submission received: 19 June 2025 / Revised: 16 July 2025 / Accepted: 18 July 2025 / Published: 26 July 2025

Abstract

The study of soil consolidation and settlement is of great importance in geotechnical engineering practice. Physics-informed neural networks (PINNs) are becoming increasingly popular for solving geotechnical engineering problems thanks to their meshless, physically constrained, and data-driven nature. Although there have been some successful applications to one-dimensional (1D) consolidation problems in saturated soils, their ability and stability in dealing with more complex boundary conditions remain to be tested. In this paper, the effects of the activation function and the random state on the PINN are investigated for solving two 1D consolidation problems in saturated soils, and the proposed method is also applied to the inverse modeling of the two 1D consolidation problems. The results show that a PINN with layer-wise locally adaptive activation functions improves the convergence speed and prediction accuracy of the PINN for solving 1D nonlinear soil consolidation problems and, at the same time, the robustness of the model to random states. Moreover, the proposed method also converges faster in the inverse modeling of 1D consolidation problems.

1. Introduction

Saturated soft soil will produce excess pore water pressure under loading, and the dissipation of excess pore water pressure causes internal pore shrinkage to produce soil settlement, i.e., consolidation of the soil body. In 1925, Terzaghi [1] established a 1D consolidation theory based on a series of idealized assumptions, such as coefficients of permeability and compression remaining unchanged in the consolidation process, etc. Subsequently, researchers continuously revised its basic assumptions to develop and improve the 1D consolidation theory so that it is closer to the natural properties.
Gibson et al. [2] derived the governing equation of 1D large deformation consolidation of soils and established the theory of 1D large deformation consolidation, but because the governing equation is highly nonlinear, an analytical solution could not be obtained directly. Xie et al. [3] obtained the analytical solution of 1D large deformation consolidation of saturated soil by assuming that the compression coefficient is constant and that the permeability coefficient is proportional to the square of the void ratio. Davis and Raymond [4] obtained a 1D nonlinear consolidation theory for soils based on a linear $e$-$\lg \sigma'$ relationship by assuming that the coefficient of permeability $k_v$ varies synchronously with the coefficient of volume compressibility $m_v$ during consolidation and that the self-weight stress remains constant along depth. Shi et al. [5] established a 1D nonlinear consolidation equation of soil based on a hyperbolic compression model and obtained the analytical solutions. Xie et al. [6] derived analytical solutions of 1D nonlinear consolidation of single-layer and two-layer soils under the conditions of unipolar loading on the basis of the assumptions made by Davis and Raymond. Poskitt [7] used the perturbation method to solve the problem of nonlinear consolidation of soil when the consolidation coefficient $C_v$ is variable. Xie et al. [8] obtained a semi-analytical solution to 1D nonlinear consolidation for layered soils under time-varying loading with $C_v$ by discretizing the soil thickness.
It is worth noting that it is difficult to obtain an analytical solution in nonlinear consolidation theory unless the consolidation coefficient $C_v$ is constant. Therefore, in most cases, nonlinear consolidation is solved by numerical methods such as the finite element method (FEM), the finite difference method (FDM), and the finite volume method (FVM) in order to obtain a high-accuracy numerical solution. With the increase in available computational resources, traditional numerical methods can effectively solve such highly nonlinear problems. However, these traditional numerical methods require careful spatio-temporal discretization.
In recent years, PINNs have become increasingly popular for solving partial differential equations (PDEs). Instead of solving the PDEs directly, they identify the nonlinear mapping between inputs and outputs through surrogate modeling [9]. Although ordinary deep neural networks (DNNs) can approximate any continuous function under certain conditions, they usually require a large amount of labeled data to maintain their accuracy [10]. To overcome this drawback, Raissi et al. [11] introduced the concept of the PINN and used it to solve benchmark PDEs such as the Navier-Stokes equations. A PINN embeds the PDEs and known physical knowledge, such as initial and boundary conditions, in a DNN and evaluates the model on a set of scattered spatio-temporal points called residual points. The development of PINNs has benefited from progress in deep learning, including libraries such as PyTorch [12] and TensorFlow [13], and many PINN code libraries have been developed, such as DeepXDE [14] and SciANN [15].
PINN has been proven to be a powerful tool for solving PDEs or determining model parameters in many fields. In fluid mechanics, PINN is used to solve forward and inverse problems such as incompressible flow [16] and turbulent flow [17]. In the field of solid mechanics, PINN is used to solve linear elastic problems [18], nonlinear elastoplastic problems [19], and elastic dynamic problems [20]. In the fields of materials science, PINN is applied to the reconstruction of particles [21] and the simulation of tiny bubbles [22]. In the field of electromagnetics, PINN is applied in the design of various electromagnetic metamaterials [23,24]. In geotechnical engineering, PINN has also been used to solve the Terzaghi consolidation [25,26] and seepage problems in unsaturated soils [27,28]. Lan et al. [29] applied hard constraints on the fixed solution conditions to PINN and solved two 1D nonlinear consolidation problems as well as the estimation of interface parameters.
Compared with traditional numerical methods, the advantages of PINN include being meshless, capable of handling complex boundaries, easy to integrate physical knowledge, easy to integrate observational data, etc. [14,30]. The disadvantages are slow training speed, sensitivity to hyperparameters, unstable convergence, high dependence on random seeds, etc. [28,31]. The activation functions in the aforementioned PINNs for solving the soil consolidation problems are hyperbolic tangent functions with fixed activation slopes. In this paper, we apply a locally adaptive activation function with a scalable parameter, add this parameter to the network for learning, and apply it to 1D nonlinear consolidation models. The paper is organized as follows. Firstly, the principle of PINN is introduced for solving 1D Terzaghi consolidation with continuous drainage boundaries, and the layer-wise locally adaptive activation functions (L-LAAF) are described. Secondly, the performance difference is demonstrated between the PINN with layer-wise locally adaptive activation functions (L-LAAF-PINN) and the baseline PINN for solving two 1D nonlinear consolidation problems. Finally, the advantages of the L-LAAF-PINN are summarized for solving 1D nonlinear consolidation problems.

2. Materials and Methods

2.1. 1D Consolidation by PINN with Layer-Wise Locally Adaptive Activation

2.1.1. 1D Terzaghi Consolidation with Continuous Drainage Boundaries

The traditional 1D Terzaghi consolidation equation is
$$\frac{\partial u}{\partial t} = C_v \frac{\partial^2 u}{\partial z^2}, \quad z \in [0, H],\ t > 0$$
where $t$ is the time, $z$ is the depth, $u$ is the excess pore water pressure, $H$ is the soil layer thickness, and $C_v$ is the consolidation coefficient, which is expressed as
$$C_v = \frac{k_v}{m_v \gamma_w}$$
where $k_v$ is the permeability coefficient, $m_v$ is the coefficient of volume compressibility, and $\gamma_w$ is the unit weight of water. The traditional 1D Terzaghi consolidation boundary conditions are either fully drained or fully undrained. In practice, however, the boundary conditions lie between fully permeable and fully impermeable, so all 1D consolidation models in this paper use a time-dependent continuously drained boundary [32]. Figure 1 shows the schematic diagram of the 1D consolidation of a soil layer with a continuously drained boundary under a uniform instantaneous vertical loading $q$.
The boundary conditions at the upper and lower interfaces are taken, respectively, as
$$u(t, 0) = q e^{-bt}$$
$$u(t, H) = q e^{-ct}$$
where $q$ is the load applied at the top, and $b$ and $c$ are the drainage parameters at the top and bottom boundaries. The initial condition is set to
$$u(0, z) = q$$
In order to speed up the neural network solution, the input and output data of the network need to be converted into dimensionless form in advance [33], and the following normalized parameters are defined:
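$$\bar{u} = \frac{u}{q}, \quad Z = \frac{z}{H}, \quad T_v = \frac{C_v t}{H^2}, \quad \alpha = \frac{b H^2}{C_v}, \quad \beta = \frac{c H^2}{C_v}$$
Thus, the governing equation of the soil consolidation and the initial and boundary conditions can be obtained in dimensionless form:
$$\frac{\partial \bar{u}}{\partial T_v} = \frac{\partial^2 \bar{u}}{\partial Z^2}$$
$$\bar{u}(Z, 0) = 1$$
$$\bar{u}(0, T_v) = e^{-\alpha T_v}$$
$$\bar{u}(1, T_v) = e^{-\beta T_v}$$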
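The normalization above can be sketched in a few lines of Python. This is only an illustration: the function names and the numerical values of `b`, `c`, `u`, `z`, and `t` are assumptions chosen for the example, not values taken from the paper.

```python
import math


def consolidation_coefficient(k_v, m_v, gamma_w):
    """C_v = k_v / (m_v * gamma_w); m^2/s for k_v in m/s, m_v in 1/kPa, gamma_w in kN/m^3."""
    return k_v / (m_v * gamma_w)


def to_dimensionless(u, z, t, q, H, C_v):
    """Map (u, z, t) to (u_bar, Z, T_v)."""
    return u / q, z / H, C_v * t / H**2


def drainage_parameters(b, c, H, C_v):
    """alpha = b H^2 / C_v and beta = c H^2 / C_v."""
    return b * H**2 / C_v, c * H**2 / C_v


# Illustrative inputs (assumed for this sketch only).
C_v = consolidation_coefficient(k_v=1e-9, m_v=4e-3, gamma_w=9.8)
alpha, beta = drainage_parameters(b=5e-10, c=1e-9, H=10.0, C_v=C_v)
u_bar, Z, T_v = to_dimensionless(u=50.0, z=5.0, t=3.0e8, q=100.0, H=10.0, C_v=C_v)

# The boundary value is unchanged by the change of variables: b * t == alpha * T_v.
top_dimensional = math.exp(-5e-10 * 3.0e8)
top_dimensionless = math.exp(-alpha * T_v)
```

Because $\alpha T_v = bt$ and $\beta T_v = ct$, the dimensionless boundary conditions carry the same decay as the dimensional ones, which is what the last two lines check.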

2.1.2. Feed-Forward Neural Network

The next step is to describe how to use a PINN to solve the 1D Terzaghi consolidation problem. The implementation of the baseline PINN begins with constructing a feed-forward neural network (FNN). Let $\mathcal{N}^L : \mathbb{R}^{D_{in}} \rightarrow \mathbb{R}^{D_{out}}$ be an FNN with $L + 1$ layers and $N_k$ neurons in the $k$th layer ($N_0 = D_{in}$, $N_L = D_{out}$). The FNN consists of an input layer, hidden layers, and an output layer. The weights and bias vector of the $k$th layer are denoted by $W^k \in \mathbb{R}^{N_k \times N_{k-1}}$ and $b^k \in \mathbb{R}^{N_k}$, the input vector by $x \in \mathbb{R}^{D_{in}}$, and the output vector of the $k$th layer by $\mathcal{N}^k(x)$. With a nonlinear activation function $\sigma$ applied layer-wise, the FNN can be recursively defined as:
  • input layer:
    $\mathcal{N}^0(x) = x$
  • the $k$th hidden layer ($1 \le k \le L - 1$):
    $\mathcal{N}^k(x) = \sigma\left(W^k \mathcal{N}^{k-1}(x) + b^k\right)$
  • output layer:
    $\mathcal{N}^L(x) = W^L \mathcal{N}^{L-1}(x) + b^L$
Therefore, the predicted value of the FNN is given by
$$\hat{u}(x; \lambda) = \mathcal{N}^L(x; \lambda)$$
where $\lambda = \{W^1, \ldots, W^L, b^1, \ldots, b^L\}$.
Constructing the above FNN requires first initializing the weight and bias parameters of the network, $\lambda$. The initialization of weights and biases can seriously affect the network performance. Glorot et al. [34] proposed the Xavier initialization method, which initializes the weights according to the number of input and output neurons in each network layer. This helps prevent exploding and vanishing gradients and thus improves the network performance. The PINN models in this paper all use the Xavier initialization method to initialize the weight parameters.
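As a minimal sketch of the recursive definition and the initialization just described, the following pure-Python code builds an FNN and evaluates it at one input. It assumes the uniform variant of Xavier initialization, $U(-r, r)$ with $r = \sqrt{6/(N_{k-1} + N_k)}$, and zero initial biases; the paper does not state which variant or bias initialization it uses, and the helper names are hypothetical.

```python
import math
import random


def xavier_uniform(n_in, n_out, rng):
    """Weight matrix (n_out x n_in) drawn from U(-r, r), r = sqrt(6 / (n_in + n_out))."""
    r = math.sqrt(6.0 / (n_in + n_out))
    return [[rng.uniform(-r, r) for _ in range(n_in)] for _ in range(n_out)]


def init_network(layers, seed=0):
    """Return a list of (W, b) pairs for the layer sizes in `layers` (biases assumed zero)."""
    rng = random.Random(seed)
    return [(xavier_uniform(n_in, n_out, rng), [0.0] * n_out)
            for n_in, n_out in zip(layers[:-1], layers[1:])]


def forward(params, x, sigma=math.tanh):
    """Evaluate N^L(x): sigma on the hidden layers, affine map on the output layer."""
    h = list(x)
    last = len(params) - 1
    for k, (W, b) in enumerate(params):
        z = [sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + b_i
             for row, b_i in zip(W, b)]
        h = z if k == last else [sigma(v) for v in z]
    return h


# Layer sizes follow the architecture reported later in the paper.
params = init_network([2, 50, 50, 50, 50, 50, 1], seed=0)
u_hat = forward(params, [0.5, 0.1])  # input x = (Z, T_v)
```

Changing `seed` changes the initial weights, which is one source of the random-state dependence discussed in this paper.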

2.1.3. PINN for 1D Terzaghi Consolidation Equation with Continuous Drainage Boundaries

The input vector of the FNN is $x = (Z, T_v)$, where $Z$ is the dimensionless depth and $T_v$ is the time factor. The output is the dimensionless excess pore pressure $\hat{\bar{u}}(x; \lambda)$, whose partial derivatives in the governing equation are computed with the automatic differentiation (AD) mechanism in PyTorch [12]. These partial derivatives are then used to compute the residual of the governing equation, $\hat{f}(x; \lambda)$:
$$\hat{f}(x; \lambda) := \frac{\partial \hat{\bar{u}}}{\partial T_v} - \frac{\partial^2 \hat{\bar{u}}}{\partial Z^2}$$
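The meaning of the residual can be illustrated without a neural network. The sketch below replaces automatic differentiation with central finite differences (an assumption made only to keep the example dependency-free) and evaluates the residual for a function that satisfies $\partial \bar{u} / \partial T_v = \partial^2 \bar{u} / \partial Z^2$ and for one that does not. The test functions are illustrative and do not satisfy the boundary conditions of the paper.

```python
import math


def u_exact(Z, T):
    """A smooth solution of u_T = u_ZZ (not the paper's boundary-value problem)."""
    return math.exp(-math.pi**2 * T) * math.sin(math.pi * Z)


def u_wrong(Z, T):
    """A function that violates the equation: u_T = 1 but u_ZZ = 2."""
    return Z**2 + T


def residual(u, Z, T, h=1e-3):
    """f = u_T - u_ZZ, with derivatives approximated by central differences."""
    u_T = (u(Z, T + h) - u(Z, T - h)) / (2 * h)
    u_ZZ = (u(Z + h, T) - 2 * u(Z, T) + u(Z - h, T)) / h**2
    return u_T - u_ZZ


f_exact = residual(u_exact, 0.3, 0.05)  # close to zero (finite-difference error only)
f_wrong = residual(u_wrong, 0.3, 0.05)  # close to 1 - 2 = -1
```

A PINN drives this residual toward zero at the residual points, so a small residual indicates that the prediction is consistent with the governing equation there.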
The input vector $x$ comes from the dataset $D$, which includes the upper boundary condition points $D_{ub}$, the lower boundary condition points $D_{lb}$, the initial condition points $D_{ic}$, the measured points $D_m$ obtained from experiments or high-precision numerical simulations, and the residual points $D_f$. The residual points are usually sampled randomly; in this paper, we use the Latin hypercube sampling proposed by Helton et al. [35].
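Latin hypercube sampling can be sketched as follows: each coordinate range is cut into $n$ equal strata and every stratum receives exactly one sample. This is a minimal stand-alone version with a hypothetical function name, not the sampler used by the authors, and the unit range for $T_v$ is an assumption.

```python
import random


def latin_hypercube(n, bounds, seed=0):
    """Draw n points so that, per dimension, each of the n equal strata is hit once."""
    rng = random.Random(seed)
    columns = []
    for lo, hi in bounds:
        # One uniform draw inside each of the n strata of [0, 1).
        strata = [(i + rng.random()) / n for i in range(n)]
        rng.shuffle(strata)  # decouple the pairing across dimensions
        columns.append([lo + (hi - lo) * s for s in strata])
    return list(zip(*columns))


# 10,000 residual points (Z, T_v); the T_v range [0, 1] is assumed here.
n_points = 10000
points = latin_hypercube(n_points, [(0.0, 1.0), (0.0, 1.0)], seed=42)
```

Compared with plain uniform sampling, this guarantees that no band of depth or time is left without residual points.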
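The next step is to define the loss function. For a PINN without a measured dataset, the loss function consists of two main components: (1) the mismatch between $\bar{u}(x)$ and $\hat{\bar{u}}(x; \lambda)$ for $x \in D_{ub}, D_{lb}, D_{ic}$; (2) the mismatch between zero and $\hat{f}(x; \lambda)$ for $x \in D_f$. Following the definition of Raissi et al. [11], the loss function of the PINN is the sum of the mean square errors (MSE) of the FNN on $D_{ub}$, $D_{lb}$, $D_{ic}$, and $D_f$:
$$\mathcal{L}(D; \lambda) = \mathcal{L}(D_{ub}; \lambda) + \mathcal{L}(D_{lb}; \lambda) + \mathcal{L}(D_{ic}; \lambda) + \mathcal{L}(D_f; \lambda)$$
$$\mathcal{L}(D_{ub}; \lambda) = \frac{1}{|D_{ub}|} \sum_{x \in D_{ub}} \left| u(x) - \hat{u}(x; \lambda) \right|^2$$
$$\mathcal{L}(D_{lb}; \lambda) = \frac{1}{|D_{lb}|} \sum_{x \in D_{lb}} \left| u(x) - \hat{u}(x; \lambda) \right|^2$$
$$\mathcal{L}(D_{ic}; \lambda) = \frac{1}{|D_{ic}|} \sum_{x \in D_{ic}} \left| u(x) - \hat{u}(x; \lambda) \right|^2$$
$$\mathcal{L}(D_f; \lambda) = \frac{1}{|D_f|} \sum_{x \in D_f} \left| \hat{f}(x; \lambda) \right|^2$$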
Finally, the network is trained by setting the number of network iterations, and the optimizer is used to minimize the loss function to iteratively update the network’s learnable parameters λ . In this paper, the Adam optimizer [36], which is based on the stochastic gradient descent (SGD) algorithm, is used to update the parameters, and the learning rate is set to 0.001. Figure 2 illustrates the framework of PINN solving the 1D Terzaghi consolidation equation.
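For readers unfamiliar with Adam, a single update step can be sketched as below. Only the learning rate of 0.001 comes from the text; the values `beta1 = 0.9`, `beta2 = 0.999`, and `eps = 1e-8` are the commonly used defaults and are assumed here, and the quadratic test function is purely illustrative.

```python
import math


def adam_step(theta, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a flat list of parameters; `state` holds t, m, and v."""
    state["t"] += 1
    t = state["t"]
    new_theta = []
    for i, (p, g) in enumerate(zip(theta, grad)):
        # Exponential moving averages of the gradient and its square.
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        # Bias correction for the zero-initialized averages.
        m_hat = state["m"][i] / (1 - beta1**t)
        v_hat = state["v"][i] / (1 - beta2**t)
        new_theta.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
    return new_theta


theta = [1.0, -2.0]
state = {"t": 0, "m": [0.0, 0.0], "v": [0.0, 0.0]}
grad = [2 * p for p in theta]  # gradient of sum(p^2)
theta_next = adam_step(theta, grad, state)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so each parameter moves by roughly the learning rate regardless of the gradient magnitude.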
The loss function of the above PINN contains several loss terms, and in order to minimize the loss function, each loss term needs to be weighed; such a process is known as multi-task learning [37]. Simply adding up all the loss terms may lead to an imbalance in the training rate across tasks, which seriously affects the PINN performance, a phenomenon known as gradient pathology [38]. Gradient pathology is usually addressed by specifying the weights of each loss term and weighting the summation [39]:
$$\mathcal{L}(D; \lambda) = \sum_i w_i \mathcal{L}(D_i; \lambda), \quad i \in \{ub, lb, ic, f\}$$
where $w_i$ is the weight of the loss term $\mathcal{L}(D_i; \lambda)$. Methods for determining the optimal weights include manual trial-and-error and adaptive dynamic adjustment. Xiang et al. [40] proposed LBPINN, which dynamically balances the losses, and Chen et al. [31] applied the principle loss function and gradient normalization method from computer vision to dynamically adjust the weights of the loss terms when solving the unsaturated soil infiltration problem with a PINN. In this paper, we do not seek the optimal weight configuration; instead, we investigate the possible advantages of the adaptive activation function under the influence of the random state of the PINN and of different initial loss weight configurations. In this study, we keep $w_{ub} = w_{lb} = w_{ic}$, so the loss function reduces to two terms:
$$\mathcal{L}(D; \lambda) = w_{train} \mathcal{L}(D_{train}; \lambda) + w_f \mathcal{L}(D_f; \lambda)$$
where $D_{train} = \{D_{ub}, D_{lb}, D_{ic}\}$.
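The two-term weighted loss can be written compactly. The sketch below assumes that the loss on $D_{train}$ is the sum of the three mean square errors on $D_{ub}$, $D_{lb}$, and $D_{ic}$ sharing one weight, which is how the reduction from the four-term loss reads; the function names and the sample numbers are hypothetical.

```python
def mse(pred, target):
    """Mean squared error over paired sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def pinn_loss(train_sets, f_pred, w_train=1.0, w_f=1.0):
    """w_train * (MSE_ub + MSE_lb + MSE_ic) + w_f * mean squared PDE residual.

    train_sets: iterable of (predicted, reference) pairs, one per condition set.
    f_pred: PDE residuals at the residual points (reference value is zero).
    """
    loss_train = sum(mse(pred, ref) for pred, ref in train_sets)
    loss_f = mse(f_pred, [0.0] * len(f_pred))
    return w_train * loss_train + w_f * loss_f


# Tiny made-up example: upper boundary, lower boundary, and initial condition.
train_sets = [([1.0, 0.9], [1.0, 1.0]), ([0.5], [0.4]), ([1.0], [1.0])]
loss = pinn_loss(train_sets, f_pred=[0.1, -0.1], w_train=10.0, w_f=1.0)
```

Raising `w_train` relative to `w_f` makes the optimizer prioritize the initial and boundary conditions over the PDE residual, which is the trade-off examined in the stability experiments.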
In addition, DNN-based models are subject to various stochasticities during training. Madhuastha et al. [41] found that models under a set of random seeds outperform other models, and PINNs also face this problem. To solve this problem, Bengio et al. [42] used random seeds as additional hyperparameters of the network dynamically tuned to find the model with the best performance.

2.2. Layer-Wise Locally Adaptive Activation Function

Since the optimized parameters depend on the derivatives of the activation function, the activation function $\sigma(x)$ is a basic building block of neural networks and plays an important role in the training process. Jagtap et al. [43] noted that the following nonlinear functions are commonly used as the activation function $\sigma(x)$ in PINNs.
Hyperbolic tangent function:
$$\sigma(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$
Sine function:
$$\sigma(x) = \sin(x)$$
Logistic sigmoid function:
$$\sigma(x) = \frac{1}{1 + e^{-x}}$$
The choice of the activation function $\sigma(x)$ depends on the problem to be solved. Jagtap et al. [35] proposed a PINN with a global adaptive activation function (GAAF-PINN) and introduced an additional scalable parameter $na$ in the network, where $n$ is the scaling factor and the hyperparameter $a$ is the slope of the activation function. The results showed that introducing the scalable parameter $na$ in the activation function not only accelerates the convergence of the network but also yields better prediction accuracy. Subsequently, Jagtap et al. [9] proposed two locally adaptive activation functions: layer-wise locally adaptive activation functions (L-LAAF) and neuron-wise locally adaptive activation functions (N-LAAF). The locally adaptive activation function is implemented by introducing the scalable parameter $na$ in each network layer or in each neuron. Jagtap et al. demonstrated that the locally adaptive activation function outperforms the global adaptive activation function in terms of training speed and accuracy. In addition, although the locally adaptive activation function adds extra parameters compared with the fixed activation function, the computational cost of the two remains comparable as the number of network layers and the number of neurons per layer increase. The procedure for solving 1D consolidation problems with the PINN with layer-wise locally adaptive activation functions (L-LAAF-PINN) is shown in Algorithm 1. It is worth noting that the method acts on the nonlinear layers of the neural network and can therefore be applied flexibly and quickly to more complex problems such as 2D and 3D consolidation. For instance, the method can be transferred by adjusting the input and output dimensions of the neural network and the construction of the loss function accordingly.
The PINNs in this paper all use a hyperbolic tangent function that introduces a scalable hyperparameter $a$, with the initial value set to $a = 0.01$. Figure 3 shows the hyperbolic tangent function for different activation slopes. As the activation slope $a$ changes, the hyperbolic tangent function is expressed as
$$\sigma(x) = \frac{e^{ax} - e^{-ax}}{e^{ax} + e^{-ax}}$$
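Because the slope $a$ is trained together with the weights, the optimizer needs the derivative of the activation with respect to $a$ as well as with respect to its input. The sketch below writes both derivatives of $\sigma(x) = \tanh(ax)$ explicitly; in an automatic differentiation framework they are obtained automatically, and the function names here are hypothetical.

```python
import math


def adaptive_tanh(x, a):
    """sigma(x) = tanh(a * x) with trainable slope a."""
    return math.tanh(a * x)


def d_sigma_dx(x, a):
    """Derivative with respect to the input: a * (1 - tanh(a x)^2)."""
    return a * (1.0 - math.tanh(a * x) ** 2)


def d_sigma_da(x, a):
    """Derivative with respect to the slope: x * (1 - tanh(a x)^2)."""
    return x * (1.0 - math.tanh(a * x) ** 2)


# A larger slope steepens the activation around x = 0.
values = [adaptive_tanh(0.5, a) for a in (0.5, 1.0, 2.0)]
```

The gradient with respect to $a$ scales with the pre-activation $x$, so layers whose inputs are far from zero adjust their slope more strongly.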
Algorithm 1 L-LAAF-PINN
Initialize network parameters $\lambda$ and dataset $D$
Initialize adaptive activation slope $a$
for $t = 0$ to max_train_epochs do
    Compute the predicted excess pore water pressure $\hat{\bar{u}}$
    Compute the PDE residual $\hat{f}$ using automatic differentiation
    Compute the loss function $\mathcal{L}(D; \lambda) = w_{train} \mathcal{L}(D_{train}; \lambda) + w_f \mathcal{L}(D_f; \lambda)$
    Compute the gradient of the loss function $\nabla_{\lambda} \mathcal{L}(D; \lambda)$
    Update the network parameters $\lambda_t \rightarrow \lambda_{t+1}$
    Update the adaptive activation slope $a_t \rightarrow a_{t+1}$
end for
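The joint update of weights and slope in the loop of Algorithm 1 can be illustrated with a deliberately tiny toy problem. The sketch below is not the authors' implementation: it uses a single neuron $\hat{u}(x) = \tanh(a w x)$, fits synthetic data instead of minimizing a PDE residual, and uses plain gradient descent with hand-written gradients instead of Adam and automatic differentiation. All names and values are assumptions.

```python
import math


def loss_and_grads(w, a, xs, ys):
    """MSE of u_hat = tanh(a * w * x) and its gradients w.r.t. w and a."""
    n = len(xs)
    loss = grad_w = grad_a = 0.0
    for x, y in zip(xs, ys):
        u_hat = math.tanh(a * w * x)
        err = u_hat - y
        common = 2.0 * err * (1.0 - u_hat**2) * x / n
        loss += err**2 / n
        grad_w += common * a  # d(u_hat)/dw = a * x * (1 - u_hat^2)
        grad_a += common * w  # d(u_hat)/da = w * x * (1 - u_hat^2)
    return loss, grad_w, grad_a


xs = [i / 10.0 - 1.0 for i in range(21)]  # 21 points on [-1, 1]
ys = [math.tanh(2.0 * x) for x in xs]     # synthetic target

w, a, lr = 0.5, 1.0, 0.1                  # initial weight, slope, step size
initial_loss, _, _ = loss_and_grads(w, a, xs, ys)
for epoch in range(2000):
    _, grad_w, grad_a = loss_and_grads(w, a, xs, ys)
    w -= lr * grad_w                      # update the network parameter
    a -= lr * grad_a                      # update the activation slope
final_loss, _, _ = loss_and_grads(w, a, xs, ys)
```

In this toy, the slope and the weight both move toward values whose product matches the target steepness, so the slope acts as an additional trainable degree of freedom in the same optimization loop.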

2.3. Experiment

We discuss the performance of the baseline PINN (PINN in brief) and the PINN with layer-wise locally adaptive activation functions (L-LAAF-PINN) through three experiments. Experiment 1 is the 1D nonlinear large strain consolidation, Experiment 2 is the 1D nonlinear consolidation, and Experiment 3 studies the performance of L-LAAF-PINN in the inverse modeling of 1D consolidation problems. The partial differential equations and parameter settings of these experiments are introduced next.

2.3.1. Experiment 1: 1D Nonlinear Large Strain Consolidation

The assumptions for $m_v$ and $k_v$ made by Xie et al. [3] are adopted:
(1) The volume compression coefficient $m_v$ remains constant during consolidation, i.e.,
$$m_v = -\frac{1}{1 + e} \frac{\mathrm{d}e}{\mathrm{d}\sigma'} = C$$
(2) The permeability coefficient $k_v$ varies with the void ratio as
$$\frac{k_v}{k_{v0}} = \left( \frac{1 + e}{1 + e_0} \right)^2$$
where $\sigma'$ is the effective stress, $e$ is the void ratio, $C$ is a constant, $k_{v0}$ is the initial permeability coefficient, and $e_0$ is the initial void ratio. Since large strain consolidation leads to large soil deformation, the Lagrangian coordinate is used for the equation governing the excess pore water pressure $u$:
$$C_{v0} \left[ \frac{\partial^2 u}{\partial a^2} + m_v \left( \frac{\partial u}{\partial a} \right)^2 \right] = \frac{\partial u}{\partial t}$$
where $C_{v0}$ is the initial coefficient of consolidation, and $a$ is the depth in the Lagrangian coordinate. The normalized parameters of Equation (6) are still used to rewrite the governing equation in dimensionless form:
$$\frac{\partial^2 \bar{u}}{\partial Z^2} + \lambda_p \left( \frac{\partial \bar{u}}{\partial Z} \right)^2 = \frac{\partial \bar{u}}{\partial T_v}$$
where $\lambda_p = m_v q$ is the normalized load parameter. In the computation, the following parameters are adopted: $H = 10\ \mathrm{m}$, $m_v = 4 \times 10^{-3}\ \mathrm{kPa^{-1}}$, $q = 100\ \mathrm{kPa}$, $k_{v0} = 10^{-9}\ \mathrm{m/s}$, $e_0 = 3.0$, $G_s = 2.75$, $\alpha = 2$, $\beta = 3$, $\gamma_w = 9.8\ \mathrm{kN/m^3}$.
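With the parameters listed above, the normalized load parameter is $\lambda_p = m_v q = 4 \times 10^{-3} \times 100 = 0.4$. The sketch below computes it and checks the dimensionless large strain equation on a manufactured solution. It relies on the observation that the substitution $w = e^{\lambda_p \bar{u}}$ turns the equation into the linear form $\partial w / \partial T_v = \partial^2 w / \partial Z^2$. The chosen $w$ is an illustrative assumption that does not satisfy the continuous drainage boundaries, and finite differences stand in for automatic differentiation.

```python
import math

m_v = 4e-3          # volume compression coefficient, 1/kPa
q = 100.0           # applied load, kPa
lambda_p = m_v * q  # normalized load parameter (dimensionless)


def u_bar(Z, T):
    """Manufactured solution u = ln(w) / lambda_p, where w solves w_T = w_ZZ."""
    w = 1.0 + 0.5 * math.exp(-math.pi**2 * T) * math.sin(math.pi * Z)
    return math.log(w) / lambda_p


def residual(Z, T, h=1e-3):
    """u_ZZ + lambda_p * u_Z^2 - u_T, approximated by central differences."""
    u_Z = (u_bar(Z + h, T) - u_bar(Z - h, T)) / (2 * h)
    u_ZZ = (u_bar(Z + h, T) - 2 * u_bar(Z, T) + u_bar(Z - h, T)) / h**2
    u_T = (u_bar(Z, T + h) - u_bar(Z, T - h)) / (2 * h)
    return u_ZZ + lambda_p * u_Z**2 - u_T


res = residual(0.3, 0.05)  # small: only finite-difference error remains
```

The same residual expression, with derivatives taken by automatic differentiation, is what a PINN would minimize at the residual points in Experiment 1.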
As for the PINN solver, we fixed the network architecture as [2, 50, 50, 50, 50, 50, 1]. The number of training epochs was set to 20,000. The training dataset contains 101 initial condition points, 1001 upper boundary condition points, 1001 lower boundary condition points, and 10,000 residual points; the validation dataset is a uniformly distributed set of 10,201 spatio-temporal points. The model performance is assessed by the relative $L_2$ error on the validation dataset, which is defined as follows:
$$\text{relative } L_2 \text{ error} = \frac{\sqrt{\sum_{x \in D_{val}} \left| u(x) - \hat{u}(x; \lambda) \right|^2}}{\sqrt{\sum_{x \in D_{val}} \left| u(x) \right|^2}}$$
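A minimal sketch of this metric is given below, assuming the usual square-root form of the relative $L_2$ error (the norm of the error divided by the norm of the reference). The function name and the sample values are hypothetical.

```python
import math


def relative_l2_error(u_true, u_pred):
    """||u_true - u_pred||_2 / ||u_true||_2 over the validation points."""
    num = math.sqrt(sum((t - p) ** 2 for t, p in zip(u_true, u_pred)))
    den = math.sqrt(sum(t ** 2 for t in u_true))
    return num / den


# Made-up reference and predicted values at two validation points.
err = relative_l2_error([3.0, 4.0], [3.0, 3.0])
```

Because the error is normalized by the magnitude of the reference solution, values from the two forward experiments can be compared even though their pressure fields differ.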

2.3.2. Experiment 2: 1D Nonlinear Consolidation

The empirical equations proposed by Mesri et al. [44] are adopted:
$$e - e_0 = c_c \lg \frac{\sigma'_0}{\sigma'}$$
$$e - e_0 = c_k \lg \frac{k_v}{k_{v0}}$$
Further adopting the assumption of Zong et al. [45] that $c_c / c_k = 1$, i.e., the decrease in permeability is proportional to the decrease in compressibility, the coefficient of consolidation $c_v$ is expressed as
$$c_v = \frac{k_v}{\gamma_w m_v} = \frac{k_{v0}}{\gamma_w m_{v0}} = c_{v0}$$
According to the principle of effective stress for soils:
$$\sigma' = q + \sigma'_0 - u = \sigma'_f - u$$
where $q$ is the constant load, $\sigma'_0$ is the initial effective stress, and $\sigma'_f$ is the ultimate effective stress. According to these equations, the governing equation for the 1D nonlinear consolidation can be described as follows:
$$\frac{\partial u}{\partial t} = c_v \left[ \frac{\partial^2 u}{\partial z^2} + \frac{1}{\sigma'} \left( \frac{\partial u}{\partial z} \right)^2 \right]$$
The normalized parameters of Equation (6) are still used to rewrite the governing equation in dimensionless form:
$$\frac{\partial \bar{u}}{\partial T_v} = \frac{\partial^2 \bar{u}}{\partial Z^2} + \frac{\sigma'_f - \sigma'_0}{\sigma'} \left( \frac{\partial \bar{u}}{\partial Z} \right)^2$$
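The nonlinear coefficient in the dimensionless equation depends on the solution itself through the effective stress, since $u = q \bar{u}$ gives $\sigma' = \sigma'_f - q \bar{u}$ with $q = \sigma'_f - \sigma'_0$. The sketch below evaluates this coefficient and the corresponding residual for given derivative values. The stress values are the ones listed for Experiment 2; the function names are hypothetical, and in a PINN the derivatives would come from automatic differentiation.

```python
sigma_0 = 50.0         # initial effective stress, kPa
sigma_f = 150.0        # ultimate effective stress, kPa
q = sigma_f - sigma_0  # constant load, kPa


def effective_stress(u_bar):
    """sigma' = sigma_f - u, with u = q * u_bar."""
    return sigma_f - q * u_bar


def nonlinear_coefficient(u_bar):
    """(sigma_f - sigma_0) / sigma', the factor multiplying (du_bar/dZ)^2."""
    return (sigma_f - sigma_0) / effective_stress(u_bar)


def residual_exp2(u_bar, u_T, u_Z, u_ZZ):
    """u_T - u_ZZ - coefficient * u_Z^2 for given values of the derivatives."""
    return u_T - u_ZZ - nonlinear_coefficient(u_bar) * u_Z**2


coeff_start = nonlinear_coefficient(1.0)  # undissipated pressure, sigma' = sigma_0
coeff_end = nonlinear_coefficient(0.0)    # fully dissipated, sigma' = sigma_f
```

The coefficient falls from 2 to 2/3 as the excess pore pressure dissipates, so the nonlinear term is strongest early in consolidation.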
In this computation, the following parameters are adopted: $H = 10\ \mathrm{m}$, $m_{v0} = 4 \times 10^{-3}\ \mathrm{kPa^{-1}}$, $\sigma'_0 = 50\ \mathrm{kPa}$, $\sigma'_f = 150\ \mathrm{kPa}$, $q = 100\ \mathrm{kPa}$, $k_{v0} = 10^{-9}\ \mathrm{m/s}$, $\alpha = 4$, $\beta = 2$, $\gamma_w = 9.8\ \mathrm{kN/m^3}$. As for the PINN solver, the network architecture is set as [2, 50, 50, 50, 50, 50, 1]. The number of training epochs was set to 20,000. The training dataset contains 101 initial condition points, 1001 upper boundary condition points, 1001 lower boundary condition points, and 10,000 residual points; the validation dataset is a uniformly distributed set of 10,201 spatio-temporal points.

2.3.3. Experiment 3: Inverse Modeling

To further study the application of the proposed L-LAAF-PINN to the inverse modeling of 1D nonlinear consolidation problems, the volume compression coefficient $m_v$ in Experiment 1 and the drainage parameters $\alpha$ and $\beta$ in Experiment 2 are, respectively, assumed to be unknown, while the remaining soil parameters are unchanged. The hyperparameters of the neural network are consistent with those of the forward model. The difference from the forward model is that measured data need to be added to the training set. Gaussian noise with a standard deviation of 0.05 is added to the analytical solution to serve as the measured data, and the spatio-temporal step size for sampling the measured data is set to $\mathrm{d}Z = \mathrm{d}T_v = 0.1$.

3. Results

3.1. 1D Nonlinear Large Strain Consolidation (Experiment 1)

3.1.1. Performance Comparisons: Best Model (Experiment 1)

Figure 4 shows the contour plot of the analytical solution and the contour plots of the absolute error of the best PINN and the best L-LAAF-PINN model relative to the analytical solution. The absolute error of the L-LAAF-PINN model is significantly smaller than that of the PINN model: its maximum absolute error is only 0.0026, compared with 0.011 for the PINN model, and the error is mainly concentrated near the upper and lower boundaries close to the initial time. The best PINN and L-LAAF-PINN models are listed in Table 1.
Figure 5 shows how the validation loss decreases with training for the best PINN and L-LAAF-PINN models. The validation loss of the L-LAAF-PINN model decreased faster, and its final $L_2$ error and relative $L_2$ error were smaller than those of the PINN model. As shown in Figure 6, when the training reaches the 500th epoch, the best L-LAAF-PINN can accurately predict the excess pore water pressure at different time factors ($T_v$ = 0.2, 0.4, 0.6, and 0.8), while the best PINN cannot. Figure 5 and Figure 6 together show that the best L-LAAF-PINN converges faster than the best PINN when solving the nonlinear large strain consolidation problem.
Figure 7 shows the evolution of the activation slope $na$ during training for the best L-LAAF-PINN model. At first, the activation slope of the second layer is the largest and rises the fastest, and then it starts to decline. After about 10,000 epochs, the activation slope of the first layer is the largest and rises continuously.

3.1.2. Performance Comparisons: Model Stability (Experiment 1)

Figure 8 shows the means and standard deviations of the relative $L_2$ error for the PINN and L-LAAF-PINN models using different random seeds and $w_{train}$. Similar to the previous example, the average relative $L_2$ error of the PINN model generally became larger with decreasing $w_{train}$. The mean relative $L_2$ errors of the L-LAAF-PINN model were smaller than those of the PINN model for all training loss term weights $w_{train}$, indicating that the layer-wise locally adaptive activation function improves the prediction accuracy of the PINN solver. Furthermore, the error band of the L-LAAF-PINN model was narrower than that of the PINN model, especially when the weight of the training loss terms, $w_{train}$, was small, which suggests that the layer-wise locally adaptive activation function improved the stability of the PINN prediction with respect to random states for the 1D large-strain consolidation problem.

3.2. 1D Nonlinear Consolidation (Experiment 2)

3.2.1. Performance Comparisons: Best Model (Experiment 2)

Figure 9 shows the contour plot of the analytical solution and the contour plots of the absolute error of the best PINN and the best L-LAAF-PINN model relative to the analytical solution. The absolute error of the L-LAAF-PINN model was significantly lower than that of the PINN model: its maximum absolute error was only 0.008, compared with 0.025 for the PINN model. Table 2 summarizes the details of the best PINN and L-LAAF-PINN models.
Figure 10 shows how the validation loss decreases with training for the best PINN and L-LAAF-PINN models. The validation loss of the L-LAAF-PINN model decreased faster, and its final $L_2$ error and relative $L_2$ error were smaller than those of the PINN model. As shown in Figure 11, when the training reaches the 500th epoch, L-LAAF-PINN can accurately predict the excess pore water pressure at different time factors ($T_v$ = 0.2, 0.4, 0.6, and 0.8), while the best PINN cannot. This means that L-LAAF-PINN converges faster than PINN, an advantage that was also observed in the previous nonlinear consolidation problem.
Figure 12 shows the evolution of the activation slope $na$ during training for the best L-LAAF-PINN model. The activation slope of the first layer kept rising and was the highest of all layers, whereas the activation slopes of the other layers rose only briefly and then declined.

3.2.2. Performance Comparisons: Model Stability (Experiment 2)

Figure 13 shows the means and standard deviations of the relative $L_2$ error for the PINN and L-LAAF-PINN models using different random seeds and $w_{train}$. Similar to the previous example, the average relative $L_2$ error of the PINN model generally became larger with decreasing $w_{train}$. The mean relative $L_2$ errors of the L-LAAF-PINN model were lower than those of the PINN model for all training loss term weights $w_{train}$, indicating that the layer-wise locally adaptive activation function improved the prediction accuracy of the PINN solver. Furthermore, the error band of the L-LAAF-PINN model was narrower than that of the PINN model, especially when the weight of the training loss terms, $w_{train}$, was small, which suggests that the layer-wise locally adaptive activation function improved the stability of the PINN prediction with respect to random states for the 1D nonlinear consolidation problem.

3.3. Inverse Modeling (Experiment 3)

3.3.1. 1D Nonlinear Large Strain Consolidation

Figure 14 shows how the inverted value of $m_v$ evolves during training for PINN and L-LAAF-PINN, respectively. The inversion results of both models converge stably to the true value, but L-LAAF-PINN converges faster than PINN, especially in the early stage of the training process. This indicates that the layer-wise locally adaptive activation functions speed up the solution of 1D nonlinear large strain consolidation inverse problems.

3.3.2. 1D Nonlinear Consolidation

Figure 15 shows how the inverted values of the drainage parameters $\alpha$ and $\beta$ evolve during training for PINN and L-LAAF-PINN, respectively. Both PINN and L-LAAF-PINN can accurately invert the drainage parameters at the upper and lower boundaries, and L-LAAF-PINN converges faster than PINN. This again indicates that the layer-wise locally adaptive activation function accelerates parameter inversion in 1D nonlinear consolidation problems.

4. Conclusions

In this study, we investigated how the layer-wise locally adaptive activation function affects the performance of PINN for solving 1D consolidation problems. For this purpose, two 1D consolidation problems were solved using L-LAAF-PINN and compared with the baseline PINN. Based on the results obtained, the following conclusions can be drawn:
  • For solving 1D nonlinear consolidation problems, the baseline PINN is seriously affected by the loss term weight settings and random states. Its accuracy decreases as the weights of the training loss terms decrease, and the random states also make its performance unstable.
  • Regardless of the loss term weight settings, the L-LAAF-PINN model solves the 1D consolidation problem with significantly better convergence speed and prediction accuracy than the baseline PINN model.
  • The L-LAAF-PINN model is more stable than the baseline PINN model with respect to random states in solving 1D consolidation problems, especially in the two nonlinear consolidation problems. However, the effect of the loss term weight settings on PINN is hardly improved.
  • In the inverse modeling of the two 1D nonlinear consolidation problems, L-LAAF-PINN converges faster, indicating that the layer-wise locally adaptive activation functions also accelerate parameter inversion in 1D nonlinear consolidation problems.

Author Contributions

Conceptualization, D.S.; methodology, J.Z.; software, J.Z.; validation, J.Z., D.S., and Y.C.; formal analysis, J.Z. and Y.C.; investigation, J.Z.; resources, J.Z.; writing—original draft preparation, J.Z.; writing—review and editing, Y.C.; visualization, J.Z.; supervision, Y.C. and D.S.; funding acquisition, D.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China (grant no. 52378354 and no. 52238007).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Terzaghi, K. Erdbaumechanik auf Bodenphysikalischer Grundlage; Deuticke: Leipzig, Germany, 1925. [Google Scholar]
  2. Gibson, R.E.; England, G.L.; Hussey, M.J.L. The theory of one-dimensional consolidation of saturated clays. Géotechnique 1967, 17, 261–273. [Google Scholar] [CrossRef]
  3. Xie, K.H.; Leo, C.J. Analytical solutions of one-dimensional large strain consolidation of saturated and homogeneous clays. Comput. Geotech. 2004, 31, 301–314. [Google Scholar] [CrossRef]
  4. Davis, E.H.; Raymond, G.P. A non-linear theory of consolidation. Géotechnique 1965, 15, 161–173. [Google Scholar] [CrossRef]
  5. Shi, J.; Yang, L.; Zhao, W.; Liu, Y. Research of one-dimensional consolidation theory considering nonlinear characteristics of soil. J. HOHAI Univ. 2001, 29, 5. [Google Scholar]
  6. Xie, K.-H.; Xie, X.-Y.; Jiang, W. A study on one-dimensional nonlinear consolidation of double-layered soil. Comput. Geotech. 2002, 29, 151–168. [Google Scholar] [CrossRef]
  7. Poskitt, T.J. The consolidation of saturated clay with variable permeability and compressibility. Géotechnique 1969, 19, 234–252. [Google Scholar] [CrossRef]
  8. Xie, K.-H.; Zheng, H.; Li, B.-H.; Liu, X.-W. Analysis of one dimensional nonlinear consolidation of layered soils under time-dependent loading. J. Zhejiang Univ. (Eng. Sci.) 2003, 37, 426–431. [Google Scholar]
  9. Jagtap, A.D.; Kawaguchi, K.; Em Karniadakis, G. Locally adaptive activation functions with slope recovery for deep and physics-informed neural networks. Proc. R. Soc. A Math. Phys. Eng. Sci. 2020, 476, 20200334. [Google Scholar] [CrossRef] [PubMed]
  10. Tang, H.; Liao, Y.; Yang, H.; Xie, L. A transfer learning-physics informed neural network (TL-PINN) for vortex-induced vibration. Ocean Eng. 2022, 266, 113101. [Google Scholar] [CrossRef]
  11. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 2019, 378, 686–707. [Google Scholar] [CrossRef]
  12. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. Pytorch: An imperative style, high-performance deep learning library. arXiv 2019, arXiv:1912.01703. [Google Scholar]
  13. Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M.; et al. Tensorflow: Large-scale machine learning on heterogeneous distributed systems. arXiv 2016, arXiv:1603.04467. [Google Scholar]
  14. Lu, L.; Meng, X.; Mao, Z.; Karniadakis, G.E. Deepxde: A deep learning library for solving differential equations. arXiv 2019, arXiv:1907.04502. [Google Scholar] [CrossRef]
  15. Haghighat, E.; Juanes, R. Sciann: A keras/tensorflow wrapper for scientific computations and physics-informed deep learning using artificial neural networks. Comput. Methods Appl. Mech. Eng. 2021, 373, 113552. [Google Scholar] [CrossRef]
  16. Jin, X.; Cai, S.; Li, H.; Karniadakis, G.E. Nsfnets (navier-stokes flow nets): Physics-informed neural networks for the incompressible navier-stokes equations. J. Comput. Phys. 2021, 426, 109951. [Google Scholar] [CrossRef]
  17. Wu, J.-L.; Xiao, H.; Paterson, E. Physics-informed machine learning approach for augmenting turbulence models: A comprehensive framework. Phys. Rev. Fluids 2018, 3, 074602. [Google Scholar] [CrossRef]
  18. Haghighat, E.; Raissi, M.; Moure, A.; Gomez, H.; Juanes, R. A physics-informed deep learning framework for inversion and surrogate modeling in solid mechanics. Comput. Methods Appl. Mech. Eng. 2021, 379, 113741. [Google Scholar] [CrossRef]
  19. Haghighat, E.; Bekar, A.C.; Madenci, E.; Juanes, R. A nonlocal physics-informed deep learning framework using the peridynamic differential operator. Comput. Methods Appl. Mech. Eng. 2021, 385, 114012. [Google Scholar] [CrossRef]
  20. Rao, C.; Sun, H.; Liu, Y. Physics-informed deep learning for computational elastodynamics without labeled data. J. Eng. Mech. 2021, 147, 04021043. [Google Scholar] [CrossRef]
  21. Stielow, T.; Scheel, S. Reconstruction of nanoscale particles from single-shot wide-angle free-electron-laser diffraction patterns with physics-informed neural networks. Phys. Rev. E 2021, 103, 053312. [Google Scholar] [CrossRef] [PubMed]
  22. Lin, C.; Li, Z.; Lu, L.; Cai, S.; Maxey, M.; Karniadakis, G.E. Operator learning for predicting multiscale bubble growth dynamics. J. Chem. Phys. 2021, 154, 104118. [Google Scholar] [CrossRef] [PubMed]
  23. Chen, Y.; Lu, L.; Karniadakis, G.E.; Dal Negro, L. Physics-informed neural networks for inverse problems in nano-optics and metamaterials. Opt. Express 2019, 28, 11618–11633. [Google Scholar] [CrossRef] [PubMed]
  24. Fang, Z.; Zhan, J. Deep physical informed neural networks for metamaterial design. IEEE Access 2020, 8, 24506–24513. [Google Scholar] [CrossRef]
  25. Bekele, Y.W. Physics-informed deep learning for one-dimensional consolidation. J. Rock Mech. Geotech. Eng. 2021, 13, 420–430. [Google Scholar] [CrossRef]
  26. Lu, Y.; Mei, G. A deep learning approach for predicting two-dimensional soil consolidation using physics-informed neural networks (pinn). Mathematics 2022, 10, 2949. [Google Scholar] [CrossRef]
  27. Tartakovsky, A.M.; Marrero, C.O.; Perdikaris, P.; Tartakovsky, G.D.; Barajas-Solano, D. Physics-informed deep neural networks for learning parameters and constitutive relationships in subsurface flow problems. Water Resour. Res. 2020, 56, e2019WR026731. [Google Scholar] [CrossRef]
  28. Bandai, T.; Ghezzehei, T.A. Forward and inverse modeling of water flow in unsaturated soils with discontinuous hydraulic conductivities using physics-informed neural networks with domain decomposition. Hydrol. Earth Syst. Sci. 2022, 26, 4469–4495. [Google Scholar] [CrossRef]
  29. Lan, P.; Su, J.-J.; Ma, X.-Y.; Zhang, S. Application of improved physics-informed neural networks for nonlinear consolidation problems with continuous drainage boundary conditions. Acta Geotech. 2024, 19, 495–508. [Google Scholar] [CrossRef]
  30. Cuomo, S.; Di Cola, V.S.; Giampaolo, F.; Rozza, G.; Raissi, M.; Piccialli, F. Scientific machine learning through physics–informed neural networks: Where we are and what’s next. J. Sci. Comput. 2022, 92, 88. [Google Scholar] [CrossRef]
  31. Chen, Y.; Xu, Y.; Wang, L.; Li, T. Modeling water flow in unsaturated soils through physics-informed neural network with principled loss function. Comput. Geotech. 2023, 161, 105546. [Google Scholar] [CrossRef]
  32. Mei, G.-X.; Xia, J.; Mei, L. Terzaghi’s one-dimensional consolidation equation and its solution based on asymmetric continuous drainage boundary. Chin. J. Geotech. Eng. 2011, 33, 28–31. [Google Scholar]
  33. Haghighat, E.; Amini, D.; Juanes, R. Physics-informed neural network simulation of multiphase poroelasticity using stress-split sequential training. Comput. Methods Appl. Mech. Eng. 2022, 397, 115141. [Google Scholar] [CrossRef]
  34. Glorot, X.; Bengio, Y. Understanding the difficulty of training deep feedforward neural networks. J. Mach. Learn. Res.—Proc. Track 2010, 9, 249–256. [Google Scholar]
  35. Helton, J.C.; Davis, F.J. Latin hypercube sampling and the propagation of uncertainty in analyses of complex systems. Reliab. Eng. Syst. Saf. 2003, 81, 23–69. [Google Scholar] [CrossRef]
  36. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
  37. Bischof, R.; Kraus, M. Multi-objective loss balancing for physics-informed deep learning. arXiv 2021, arXiv:2110.09813. [Google Scholar] [CrossRef]
  38. Wang, S.; Teng, Y.; Perdikaris, P. Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM J. Sci. Comput. 2021, 43, A3055–A3081. [Google Scholar] [CrossRef]
  39. Kendall, A.; Gal, Y.; Cipolla, R. Multi-task learning using uncertainty to weigh losses for scene geometry and semantics. In Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 7482–7491. [Google Scholar]
  40. Xiang, Z.; Peng, W.; Liu, X.; Yao, W. Self-adaptive loss balanced physics-informed neural networks. Neurocomputing 2022, 496, 11–34. [Google Scholar] [CrossRef]
  41. Madhyastha, P.S.; Batra, D. On model stability as a function of random seed. In Proceedings of the Conference on Computational Natural Language Learning, Hong Kong, China, 3–4 November 2019. [Google Scholar]
  42. Bengio, Y. Practical recommendations for gradient-based training of deep architectures. In Neural Networks; Springer: Berlin/Heidelberg, Germany, 2012. [Google Scholar]
  43. Jagtap, A.D.; Kharazmi, E.; Karniadakis, G.E. Conservative physics-informed neural networks on discrete domains for conservation laws: Applications to forward and inverse problems. Comput. Methods Appl. Mech. Eng. 2020, 365, 113028. [Google Scholar] [CrossRef]
  44. Mesri, G.; Rokhsar, A. Theory of consolidation for clays. J. Geotech. Eng. Div. 1974, 100, 889–904. [Google Scholar] [CrossRef]
  45. Zong, M.-F.; Wu, W.-B.; Mei, G.-X.; Liang, R.-Z.; Tian, Y. An analytical solution for one-dimensional nonlinear consolidation of soils with continuous drainage boundary. Chin. J. Rock Mech. Eng. 2018, 37, 2829–2838. [Google Scholar] [CrossRef]
Figure 1. Schematic diagram of 1D consolidation with continuous drainage boundary.
Figure 2. Schematic diagram of the PINN solving the 1D consolidation equation.
Figure 3. Hyperbolic tangent function with different activation slope ‘a’ values.
Figure 4. Cloud plot of u/q during 1D nonlinear large strain consolidation: (a) Analytical solution, (b) Absolute error for the best PINN, (c) Absolute error for the best L-LAAF-PINN models.
Figure 5. 1D nonlinear large strain consolidation: loss evolution over epochs for the best PINN and L-LAAF-PINN models. (a) the best PINN. (b) the best L-LAAF-PINN.
Figure 6. 1D nonlinear large strain consolidation: comparison of solutions by the best PINN and L-LAAF-PINN after the 500th epoch with the analytical solutions at $T_v$ = 0.2, 0.4, 0.6, and 0.8. (a) the best PINN. (b) the best L-LAAF-PINN.
Figure 7. 1D nonlinear large strain consolidation: $na$ evolution over epochs for the best L-LAAF-PINN model.
Figure 8. 1D nonlinear large strain consolidation: means and standard deviations of the relative $L_2$ error versus $w_{train}$ for the PINN and L-LAAF-PINN models with 10 different random seeds. The solid lines show the means, and the shaded areas are error bands whose widths are twice the standard deviations.
Figure 9. Cloud plot of u/p during 1D nonlinear consolidation: (a) Analytical solution, (b) Absolute error for the best PINN, (c) Absolute error for the best L-LAAF-PINN models.
Figure 10. 1D nonlinear consolidation: loss evolution over epochs for the best PINN and L-LAAF-PINN models. (a) best PINN. (b) best L-LAAF-PINN.
Figure 11. 1D nonlinear consolidation: comparison of solutions by the best PINN and L-LAAF-PINN after the 500th epoch with the analytical solutions at $T_v$ = 0.2, 0.4, 0.6, and 0.8. (a) the best PINN. (b) the best L-LAAF-PINN.
Figure 12. 1D nonlinear consolidation: $na$ evolution over epochs for the best L-LAAF-PINN model.
Figure 13. 1D nonlinear consolidation: means and standard deviations of the relative $L_2$ error versus $w_{train}$ for the PINN and L-LAAF-PINN models with 10 different random seeds. The solid lines show the means, and the shaded areas are error bands whose widths are twice the standard deviations.
Figure 14. 1D nonlinear large strain consolidation: mv evolution over epochs for the PINN and L-LAAF-PINN models.
Figure 15. 1D nonlinear consolidation: α and β evolution over epochs for the PINN and L-LAAF-PINN models. (a) α; (b) β.
Table 1. 1D nonlinear large strain consolidation: summary of the best PINN and L-LAAF-PINN models.

| 1D Nonlinear Large Strain Consolidation | PINN | L-LAAF-PINN |
|---|---|---|
| Relative $L_2$ error | $1.1 \times 10^{-3}$ | $2.15 \times 10^{-4}$ |
| Maximum absolute error | 0.011 | 0.0026 |
| $w_{train}$ | 0.5 | 0.9 |
| Random seed | 6 | 8 |
Table 2. 1D nonlinear consolidation: summary of the best PINN and L-LAAF-PINN models.

| 1D Nonlinear Consolidation | PINN | L-LAAF-PINN |
|---|---|---|
| Relative $L_2$ error | $1.09 \times 10^{-3}$ | $2.2 \times 10^{-4}$ |
| Maximum absolute error | 0.011 | 0.003 |
| $w_{train}$ | 0.8 | 0.9 |
| Random seed | 6 | 3 |
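The two accuracy measures reported in Tables 1 and 2 can be sketched as follows, assuming the usual definitions: the relative $L_2$ error is the Euclidean norm of the prediction error divided by the norm of the reference solution, and the maximum absolute error is the largest pointwise deviation. The sample values below are invented for illustration and are not results from this study.

```python
import math

def relative_l2_error(pred, true):
    """||pred - true||_2 / ||true||_2 over a set of evaluation points."""
    num = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)))
    den = math.sqrt(sum(t ** 2 for t in true))
    return num / den

def max_abs_error(pred, true):
    """Largest pointwise absolute deviation from the reference solution."""
    return max(abs(p - t) for p, t in zip(pred, true))

# Hypothetical normalized pore pressures at four depths (not data from the paper).
u_true = [1.0, 0.8, 0.5, 0.2]
u_pred = [0.99, 0.81, 0.5, 0.21]

rel_err = relative_l2_error(u_pred, u_true)
max_err = max_abs_error(u_pred, u_true)
```

Because the relative $L_2$ error is normalized by the reference solution, it can be compared across the two consolidation problems, whereas the maximum absolute error highlights the worst local mismatch.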
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhou, J.; Sun, D.; Chen, Y. Modeling One-Dimensional Nonlinear Consolidation Problems by Physics-Informed Neural Network with Layer-Wise Locally Adaptive Activation Functions. Appl. Sci. 2025, 15, 8341. https://doi.org/10.3390/app15158341
